Plans and pricing

Start small, scale instantly.

Get $100 Free Usage Credits

Serverless

From $7.50 per query hour

The easiest way to start and scale your RAG app.

Pay-per-use

Burst GPU capacity

Access curated models

Support on Discord

Dedicated

From $0.60 per instance hour

For organizations with established workloads.

Committed use discounts

Dedicated hardware

Use any model on HuggingFace

Deploy on major cloud providers in any region

Dedicated support on private Slack or MS Teams

Enterprise

Custom pricing

Dedicated hardware for at-scale teams with advanced security needs.

Pay as you go or committed use pricing

VPC deployments on major cloud providers in any region

Multiple GPUs

Custom SLAs

Premium support and onboarding

Dedicated support on Slack or MS Teams

Priority feature requests

How does PostgresML pricing work?

Pay-per-use

Only use what you need, and pay as you go with no up-front costs.

Committed use discounts

Commit to certain levels of usage for a fixed monthly cost and get a discounted rate. Scale your configuration up or down at any time with the click of a button.

Serverless pricing

Storage is charged per GB per month, and all requests are charged by the CPU or GPU milliseconds of compute required to perform them.

Vector & Relational Database

Name	Pricing
Tables & index storage	$0.25/GB per month
Retrieval, filtering, ranking & other queries	$7.50 per hour
Embeddings	Included w/ queries
LLMs	Included w/ queries
Fine tuning	Included w/ queries
Machine learning	Included w/ queries
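
As a rough illustration of how the two rates in the table combine, the sketch below estimates a monthly serverless bill. The function name and the example workload (50 GB stored, 20 cumulative query hours) are hypothetical; only the $0.25/GB and $7.50/hour rates come from the table above, and any rounding applied by the actual billing system is not modeled.

```python
# Rates taken from the serverless pricing table.
STORAGE_USD_PER_GB_MONTH = 0.25   # tables & index storage
QUERY_USD_PER_HOUR = 7.50         # retrieval, filtering, ranking & other queries


def estimate_monthly_cost(storage_gb: float, query_hours: float) -> float:
    """Return an estimated monthly bill in USD (hypothetical helper)."""
    storage_cost = storage_gb * STORAGE_USD_PER_GB_MONTH
    query_cost = query_hours * QUERY_USD_PER_HOUR
    return storage_cost + query_cost


# Hypothetical workload: 50 GB stored, 20 hours of cumulative query time.
monthly = estimate_monthly_cost(storage_gb=50, query_hours=20)
print(f"${monthly:.2f} per month")  # 50 * 0.25 + 20 * 7.50 = 162.50
```

Embeddings, LLMs, fine tuning and machine learning do not appear as separate terms because the table lists them as included with queries.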

Serverless models

Serverless databases come with predefined models and a flexible pricing structure.

Embedding Models

Name	Parameters (M)	Max input tokens	Dimensions	Strengths
intfloat/e5-small-v2	33.4	512	384	Good quality, low latency
mixedbread-ai/mxbai-embed-large-v1	335	512	1024	High quality, higher latency
Alibaba-NLP/gte-base-en-v1.5	137	8192	768	Supports up to 8,192 input tokens
Alibaba-NLP/gte-large-en-v1.5	434	8192	1024	Highest quality, 8,192 input tokens

Instruct Models

Name	Parameters (B)	Active Parameters (B)	Context size	Strengths
meta-llama/Llama-3.2-1B-Instruct	1	1	128k	Lowest latency
meta-llama/Llama-3.2-3B-Instruct	3	3	128k	Low latency
meta-llama/Meta-Llama-3.1-405B-Instruct	405	405	128k	Highest quality
meta-llama/Meta-Llama-3.1-70B-Instruct	70	70	128k	High quality
meta-llama/Meta-Llama-3.1-8B-Instruct	8	8	128k	Low latency
microsoft/Phi-3-mini-128k-instruct	3.8	3.8	128k	Low latency
mistralai/Mixtral-8x7B-Instruct-v0.1	46.7	12.9	32k	MoE, high quality
mistralai/Mistral-7B-Instruct-v0.2	7	7	32k	Low latency

Summarization Models

Name	Parameters (M)	Context size	Strengths
google/pegasus-xsum	568	512	8k

Cost estimator

Import: records, bytes of metadata per record, vector dimensions, embedding tokens per record

Read Queries: per month

Write Vectors: per month

Text Generation: input tokens per request, queries per month, output tokens per request

Estimate: cost per month

Savings: vs Pinecone + OpenAI

Detailed estimate 🤌

Vector Database

Pinecone

	unit	total
import		-
storage		- /month
read queries		- /month
write queries		- /month
total		- /month

PostgresML

	unit	total
import		-
storage		- /month
read queries		- /month
write queries		- /month
total		- /month

Embeddings

OpenAI

	ADA-V2	total
import		-
read tokens		- /month
write tokens		- /month
total		- /month

PostgresML

	units	total
import	included	-
read tokens	included	- /month
write tokens	included	- /month
total		- /month

Text Generation

OpenAI

model		total
gpt-3.5-turbo-0125		- /month

PostgresML

model		total
mixtral-8x7B		- /month

All-in RAG


import		-
total		- /month


import		-
total		- /month

Frequently asked questions 💭

What does serverless mean on PostgresML?

On PostgresML you can build and scale Postgres without having to manage servers or GPUs. Your database will respond to your application’s demand automatically, and scale up or down as needed. Your charges will be based purely on your usage, and measured down to the millisecond.

Does PostgresML charge per token?

PostgresML does not charge per token. We charge by the amount of time a query runs. Queries that generate or process more tokens will often run longer, but queries that use smaller models will run more quickly. You’re only charged for the resources you use.
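
To make time-based billing concrete, here is a minimal sketch that converts a query's run time into a charge at the $7.50 per query hour serverless rate. The helper name and the two example durations are hypothetical, and the sketch assumes charges scale linearly per millisecond with no minimum or rounding.

```python
QUERY_USD_PER_HOUR = 7.50     # serverless rate from the pricing table
MS_PER_HOUR = 3_600_000       # 60 min * 60 s * 1000 ms


def query_cost_usd(duration_ms: float) -> float:
    """Cost of one query that runs for duration_ms (hypothetical helper)."""
    return duration_ms / MS_PER_HOUR * QUERY_USD_PER_HOUR


# Hypothetical durations: a short retrieval query and a longer generation.
for label, duration_ms in [("retrieval", 40), ("generation", 2_000)]:
    print(f"{label}: {duration_ms} ms -> ${query_cost_usd(duration_ms):.6f}")
```

Under these assumptions the token count never enters the calculation directly; it only matters to the extent that it makes the query run longer.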

Does PostgresML charge for storage?

PostgresML charges $0.25 per gigabyte per month for storage. This includes fault-tolerant RAID configurations for high availability as well as backups for disaster recovery.
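
For a sense of scale, the sketch below sizes a vector table and prices it at $0.25 per GB per month. It rests on several assumptions that are not stated on this page: vectors are stored as 4-byte floats, 1 GB is 10^9 bytes, and index and row overhead are ignored. The function name and the example workload are hypothetical.

```python
STORAGE_USD_PER_GB_MONTH = 0.25   # from the storage FAQ answer
BYTES_PER_FLOAT32 = 4             # assumption: vectors stored as 32-bit floats
BYTES_PER_GB = 1e9                # assumption: decimal gigabytes


def estimate_storage(records: int, dimensions: int, metadata_bytes: int):
    """Return (size_gb, monthly_cost_usd), ignoring index and row overhead."""
    bytes_per_record = dimensions * BYTES_PER_FLOAT32 + metadata_bytes
    size_gb = records * bytes_per_record / BYTES_PER_GB
    return size_gb, size_gb * STORAGE_USD_PER_GB_MONTH


# Hypothetical workload: 1M records, 384-dimension embeddings
# (the size listed for intfloat/e5-small-v2), 500 bytes of metadata each.
size_gb, cost = estimate_storage(1_000_000, 384, 500)
print(f"{size_gb:.3f} GB -> ${cost:.3f}/month")  # 2.036 GB -> $0.509/month
```

Real tables will be larger once indexes are built, so treat the result as a lower bound.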

How is PostgresML so inexpensive?

Our approach to GPU memory management is inherently more efficient: PostgresML moves full AI capability into the database rather than moving the data to the models.

How does the cost estimator work?

PostgresML estimates costs based on typical workloads and real-world benchmarks. Workload prediction is difficult, which can make future cost estimation even harder. Please contact our team if you would like help estimating the size of your workload and the associated costs.

What can I do with my free credits?

Anything you want with PostgresML. We’ll send you an email when your free credits expire as a reminder that you may start incurring charges in the future.
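
As a back-of-the-envelope guide, the sketch below shows how far $100 of credits would stretch at the serverless rates. It assumes credits are drawn down at the same rates as billed usage, which this page does not state explicitly; the helper name and the example usage are hypothetical.

```python
FREE_CREDITS_USD = 100.0
QUERY_USD_PER_HOUR = 7.50          # serverless query rate
STORAGE_USD_PER_GB_MONTH = 0.25    # serverless storage rate


def credits_remaining(storage_gb: float, query_hours: float) -> float:
    """Credits left after one month of usage (assumes credits offset usage 1:1)."""
    spent = storage_gb * STORAGE_USD_PER_GB_MONTH + query_hours * QUERY_USD_PER_HOUR
    return max(FREE_CREDITS_USD - spent, 0.0)


# Upper bounds if the credits were spent on only one resource.
max_query_hours = FREE_CREDITS_USD / QUERY_USD_PER_HOUR              # about 13.3 hours
max_storage_gb_months = FREE_CREDITS_USD / STORAGE_USD_PER_GB_MONTH  # 400 GB-months

# Hypothetical month: 10 GB stored and 5 query hours.
print(f"${credits_remaining(10, 5):.2f} left")  # 100 - (2.50 + 37.50) = 60.00
```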

How does billing work?

By default, you will be billed monthly based on your usage. You will receive an invoice with total charges three days before your elected payment method is automatically billed. If your utilization increases significantly before your normal billing date, we will notify you with an off-cycle invoice to help you control costs and maintain service.

Does PostgresML provide technical support?

Serverless plans have access to our community Discord. Dedicated plans offer a private Slack or MS Teams channel for direct communication with our team. PostgresML provides custom SLAs for Enterprise plans. Contact us for details.

Still have questions?

Get started with $100 in free credits
