Serverless
A Serverless PostgresML database can be created in less than 5 seconds. It provides immediate access to modern GPU acceleration, a predefined set of state-of-the-art large language models that should satisfy most use cases, and dozens of supervised learning algorithms like XGBoost, LightGBM, CatBoost, and everything in scikit-learn. We call this combination of tools an AI engine. With a Serverless engine, storage and compute resources adapt dynamically to your application's needs, so it can scale down during quiet periods and handle peak loads without overprovisioning.
Serverless engines are billed on a pay-per-use basis, and we offer $100 in free credits to get you started!
Create a Serverless engine
To create a Serverless engine, make sure you have an account on postgresml.org. If you don't, you can create one now.
Once logged in, select "New Engine" from the left menu and choose the Serverless Plan.
Create new database
Choose the Serverless plan
Serverless Pricing
Storage is billed per GB per month, and all requests are billed by the CPU or GPU milliseconds of compute required to perform them.
Vector & Relational Database
Name | Pricing |
---|---|
Tables & index storage | $0.25/GB per month |
Retrieval, filtering, ranking & other queries | $7.50 per hour |
Embeddings | Included w/ queries |
LLMs | Included w/ queries |
Fine tuning | Included w/ queries |
Machine learning | Included w/ queries |
Serverless Models
Serverless AI engines come with predefined models and a flexible pricing structure.
Embedding Models
Name | Parameters (M) | Max input tokens | Dimensions | Strengths |
---|---|---|---|---|
intfloat/e5-small-v2 | 33.4 | 512 | 384 | Good quality, low latency |
mixedbread-ai/mxbai-embed-large-v1 | 335 | 512 | 1024 | High quality, higher latency |
Alibaba-NLP/gte-base-en-v1.5 | 137 | 8192 | 768 | Supports up to 8,000 input tokens |
Alibaba-NLP/gte-large-en-v1.5 | 434 | 8192 | 1024 | Highest quality, 8,000 input tokens |
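Any of the embedding models above can be called directly from SQL with the `pgml.embed` function. A minimal sketch — the `documents` table and its `embedding` column are assumed to already exist and to have been populated with the same model:

```sql
-- Generate an embedding for a query string.
SELECT pgml.embed('intfloat/e5-small-v2', 'What is PostgresML?');

-- Hypothetical semantic search over a documents table,
-- ordered by cosine distance to the query embedding.
SELECT id, body
FROM documents
ORDER BY embedding <=> pgml.embed('intfloat/e5-small-v2', 'What is PostgresML?')::vector
LIMIT 5;
```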
Instruct Models
Name | Parameters (B) | Active Parameters (B) | Context size | Strengths |
---|---|---|---|---|
meta-llama/Llama-3.2-1B-Instruct | 1 | 1 | 128k | Lowest latency |
meta-llama/Llama-3.2-3B-Instruct | 3 | 3 | 128k | Low latency |
meta-llama/Meta-Llama-3.1-405B-Instruct | 405 | 405 | 128k | Highest quality |
meta-llama/Meta-Llama-3.1-70B-Instruct | 70 | 70 | 128k | High quality |
meta-llama/Meta-Llama-3.1-8B-Instruct | 8 | 8 | 128k | Low latency |
microsoft/Phi-3-mini-128k-instruct | 3.8 | 3.8 | 128k | Low latency |
mistralai/Mixtral-8x7B-Instruct-v0.1 | 56 | 12.9 | 32k | MoE, high quality |
mistralai/Mistral-7B-Instruct-v0.2 | 7 | 7 | 32k | Low latency |
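The instruct models above are served through the `pgml.transform` function. A sketch of a chat-style call — exact argument shapes may vary between PostgresML versions, so treat this as illustrative:

```sql
-- Text generation with one of the predefined instruct models.
SELECT pgml.transform(
    task => '{
        "task": "text-generation",
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
    }'::JSONB,
    inputs => ARRAY['{"role": "user", "content": "Explain what an AI engine is in one sentence."}'::JSONB],
    args => '{"max_new_tokens": 64}'::JSONB
);
```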
Summarization Models
Name | Parameters (M) | Max input tokens | Strengths |
---|---|---|---|
google/pegasus-xsum | 568 | 512 | Abstractive summarization |
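Summarization also runs through `pgml.transform`. A minimal sketch — the input text here is a placeholder:

```sql
-- Summarize a passage with the predefined summarization model.
SELECT pgml.transform(
    task => '{"task": "summarization", "model": "google/pegasus-xsum"}'::JSONB,
    inputs => ARRAY['Paste the long passage you want summarized here.']
);
```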