Serverless

A Serverless PostgresML database can be created in under 5 seconds and provides immediate access to modern GPU acceleration, a predefined set of state-of-the-art large language models that should satisfy most use cases, and dozens of supervised learning algorithms like XGBoost, LightGBM, CatBoost, and everything in scikit-learn. We call this combination of tools an AI engine. With a Serverless engine, storage and compute resources adapt dynamically to your application's needs, so it can scale down during quiet periods or absorb peak loads without overprovisioning.
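As a quick sketch of what an AI engine exposes, the queries below call PostgresML's `pgml.embed()` and `pgml.train()` functions directly from SQL; the table name `my_training_data` and its `target` column are hypothetical placeholders, not part of any default schema:

```sql
-- Generate an embedding with one of the bundled models (runs inside the database)
SELECT pgml.embed('intfloat/e5-small-v2', 'PostgresML runs models next to your data');

-- Train a supervised model with XGBoost on a hypothetical table
SELECT * FROM pgml.train(
    project_name  => 'my_project',
    task          => 'regression',
    relation_name => 'my_training_data',
    y_column_name => 'target',
    algorithm     => 'xgboost'
);
```

Everything runs where the data lives, so there are no external API calls or data transfers to manage.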

Serverless engines are billed on a pay-per-use basis and we offer $100 in free credits to get you started!

Create a Serverless engine

To create a Serverless engine, make sure you have an account on postgresml.org. If you don't, you can create one now.

Once logged in, select "New Engine" from the left menu and choose the Serverless Plan.

Create new database

Choose the Serverless plan

Serverless Pricing

Storage is billed per GB per month, and every request is billed by the CPU or GPU milliseconds of compute required to serve it.

Vector & Relational Database

| Name | Pricing |
|------|---------|
| Tables & index storage | $0.25/GB per month |
| Retrieval, filtering, ranking & other queries | $7.50 per hour |
| Embeddings | Included w/ queries |
| LLMs | Included w/ queries |
| Fine tuning | Included w/ queries |
| Machine learning | Included w/ queries |

Serverless Models

Serverless AI engines come with a predefined set of models and a flexible pricing structure.

Embedding Models

| Name | Parameters (M) | Max input tokens | Dimensions | Strengths |
|------|----------------|------------------|------------|-----------|
| intfloat/e5-small-v2 | 33.4 | 512 | 384 | Good quality, low latency |
| mixedbread-ai/mxbai-embed-large-v1 | 335 | 512 | 1024 | High quality, higher latency |
| Alibaba-NLP/gte-base-en-v1.5 | 137 | 8192 | 768 | Supports up to 8,000 input tokens |
| Alibaba-NLP/gte-large-en-v1.5 | 434 | 8192 | 1024 | Highest quality, 8,000 input tokens |
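Any of these models can be invoked by name with `pgml.embed()`; a minimal sketch, with illustrative input text:

```sql
-- Embed with the small, low-latency model; returns a 384-dimension vector
SELECT pgml.embed('intfloat/e5-small-v2', 'The quick brown fox') AS embedding;

-- The large gte model accepts up to 8,192 input tokens per document
SELECT pgml.embed('Alibaba-NLP/gte-large-en-v1.5', 'A much longer document ...') AS embedding;
```

The returned vectors can be stored in a `pgvector` column and queried with the retrieval and ranking operations priced above.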

Instruct Models

| Name | Parameters (B) | Active Parameters (B) | Context size | Strengths |
|------|----------------|-----------------------|--------------|-----------|
| meta-llama/Llama-3.2-1B-Instruct | 1 | 1 | 128k | Lowest latency |
| meta-llama/Llama-3.2-3B-Instruct | 3 | 3 | 128k | Low latency |
| meta-llama/Meta-Llama-3.1-405B-Instruct | 405 | 405 | 128k | Highest quality |
| meta-llama/Meta-Llama-3.1-70B-Instruct | 70 | 70 | 128k | High quality |
| meta-llama/Meta-Llama-3.1-8B-Instruct | 8 | 8 | 128k | Low latency |
| microsoft/Phi-3-mini-128k-instruct | 3.8 | 3.8 | 128k | Low latency |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 56 | 12.9 | 32k | MoE, high quality |
| mistralai/Mistral-7B-Instruct-v0.2 | 7 | 7 | 32k | Low latency |
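The instruct models are served through `pgml.transform()`; a sketch of a text-generation call, with an illustrative prompt (check the `pgml.transform` reference for the exact input format your engine version expects):

```sql
SELECT pgml.transform(
    task   => '{"task": "text-generation",
                "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"}'::jsonb,
    inputs => ARRAY['AI is going to'],
    args   => '{"max_new_tokens": 100}'::jsonb
);
```

Because inference runs on the engine's GPUs, the compute is billed per millisecond like any other query.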

Summarization Models

| Name | Parameters (M) | Context size | Strengths |
|------|----------------|--------------|-----------|
| google/pegasus-xsum | 568 | 512 | 8k |
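The summarization model is likewise invoked through `pgml.transform()`; a minimal sketch, with an illustrative input passage:

```sql
SELECT pgml.transform(
    task   => '{"task": "summarization",
                "model": "google/pegasus-xsum"}'::jsonb,
    inputs => ARRAY['Paris is the capital and most populous city of France. ...']
);
```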