Serverless
A Serverless PostgresML database can be created in less than 5 seconds. It provides immediate access to modern GPU acceleration, a predefined set of state-of-the-art large language models that should satisfy most use cases, and dozens of supervised learning algorithms like XGBoost, LightGBM, CatBoost, and everything in scikit-learn. We call this combination of tools an AI engine. With a Serverless engine, storage and compute resources adapt dynamically to your application's needs, so it can scale down during quiet periods and handle peak loads without overprovisioning.
Serverless engines are billed on a pay-per-use basis, and we offer $100 in free credits to get you started!
Create a Serverless engine
To create a Serverless engine, make sure you have an account on postgresml.org. If you don't, you can create one now.
Once logged in, select "New Engine" from the left menu and choose the Serverless Plan.
Create new database
Choose the Serverless plan
Serverless Pricing
Storage is billed per GB per month, and all requests are billed by the CPU or GPU milliseconds of compute required to perform them.
Vector & Relational Database
Name | Pricing |
---|---|
Tables & index storage | $0.25/GB per month |
Retrieval, filtering, ranking & other queries | $7.50 per hour |
Embeddings | Included w/ queries |
LLMs | Included w/ queries |
Fine tuning | Included w/ queries |
Machine learning | Included w/ queries |
Serverless Models
Serverless AI engines come with predefined models and a flexible pricing structure.
Embedding Models
Name | Parameters (M) | Max input tokens | Dimensions | Strengths |
---|---|---|---|---|
intfloat/e5-small-v2 | 33.4 | 512 | 384 | Good quality, low latency |
mixedbread-ai/mxbai-embed-large-v1 | 335 | 512 | 1024 | High quality, higher latency |
Alibaba-NLP/gte-base-en-v1.5 | 137 | 8192 | 768 | Supports up to 8,000 input tokens |
Alibaba-NLP/gte-large-en-v1.5 | 434 | 8192 | 1024 | Highest quality, 8,000 input tokens |
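Any of the embedding models above can be called directly from SQL with the `pgml.embed` function. A minimal sketch — the `documents` table and its `embedding` column are assumed to already exist and to have been populated with the same model:

```sql
-- Generate an embedding for a query string.
SELECT pgml.embed('intfloat/e5-small-v2', 'What is PostgresML?');

-- Hypothetical semantic search over a documents table,
-- ordered by cosine distance to the query embedding.
SELECT id, body
FROM documents
ORDER BY embedding <=> pgml.embed('intfloat/e5-small-v2', 'What is PostgresML?')::vector
LIMIT 5;
```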
Instruct Models
Name | Parameters (B) | Active Parameters (B) | Context size | Strengths |
---|---|---|---|---|
meta-llama/Llama-3.2-1B-Instruct | 1 | 1 | 128k | Lowest latency |
meta-llama/Llama-3.2-3B-Instruct | 3 | 3 | 128k | Low latency |
meta-llama/Meta-Llama-3.1-405B-Instruct | 405 | 405 | 128k | Highest quality |
meta-llama/Meta-Llama-3.1-70B-Instruct | 70 | 70 | 128k | High quality |
meta-llama/Meta-Llama-3.1-8B-Instruct | 8 | 8 | 128k | Low latency |
microsoft/Phi-3-mini-128k-instruct | 3.8 | 3.8 | 128k | Low latency |
mistralai/Mixtral-8x7B-Instruct-v0.1 | 56 | 12.9 | 32k | MoE, high quality |
mistralai/Mistral-7B-Instruct-v0.2 | 7 | 7 | 32k | Low latency |
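The instruct models above are served through the `pgml.transform` function. A sketch of a chat-style call — exact argument shapes may vary between PostgresML versions, so treat this as illustrative:

```sql
-- Text generation with one of the predefined instruct models.
SELECT pgml.transform(
    task => '{
        "task": "text-generation",
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
    }'::JSONB,
    inputs => ARRAY['{"role": "user", "content": "Explain what an AI engine is in one sentence."}'::JSONB],
    args => '{"max_new_tokens": 64}'::JSONB
);
```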
Summarization Models
Name | Parameters (M) | Max input tokens | Strengths |
---|---|---|---|
google/pegasus-xsum | 568 | 512 | Abstractive summarization |
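Summarization also runs through `pgml.transform`. A minimal sketch — the input text here is a placeholder:

```sql
-- Summarize a passage with the predefined summarization model.
SELECT pgml.transform(
    task => '{"task": "summarization", "model": "google/pegasus-xsum"}'::JSONB,
    inputs => ARRAY['Paste the long passage you want summarized here.']
);
```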