Retrieval-augmented generation on PostgresML

A unified suite of tools for production-grade RAG applications.

Is your AI app making the most of your data or just making things up?

Harmful hallucinations

Your AI model generates more wrong answers than right.

Content cutoffs

Your users can't access anything newer than the data the model was trained on.

Noisy neighbors

Foundation models consider your content less relevant than other voices, if they consider it at all.


RAG is the answer. Deliver the most effective RAG apps on PostgresML.

    Deliver accurate information with sources cited.
    Generate real-time responses without endless training.
    Securely and easily give LLMs access to your data.

What makes RAG on PostgresML so special?

PostgresML uniquely unifies every component of the stack to deliver blazing fast RAG applications.

Relational and vector database

On PostgresML, vectors are just another data type that can be stored in regular tables and queried together with other columns. No additional vector database required.

    Store vector embeddings with the rest of your data
    Index vectors using HNSW or IVFFlat for fast retrieval
    Use vector search for exact (KNN) or approximate (ANN) nearest-neighbor queries
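
To make the idea of querying vectors together with other columns concrete, here is a minimal in-memory sketch in Python. It is not PostgresML code: the `ROWS` data, the `category` column, and the `knn` helper are all hypothetical, and the search is a brute-force exact KNN over cosine distance. An HNSW or IVFFlat index would answer the same kind of query approximately (ANN) without scanning every row.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical table: each row keeps its embedding next to ordinary columns.
ROWS = [
    {"id": 1, "category": "docs", "embedding": [1.0, 0.0, 0.0]},
    {"id": 2, "category": "blog", "embedding": [0.9, 0.1, 0.0]},
    {"id": 3, "category": "docs", "embedding": [0.0, 1.0, 0.0]},
    {"id": 4, "category": "docs", "embedding": [0.7, 0.7, 0.0]},
]


def knn(rows, query, k, category=None):
    """Exact k-nearest-neighbor search with an optional filter on a regular column."""
    candidates = [r for r in rows if category is None or r["category"] == category]
    ranked = sorted(candidates, key=lambda r: cosine_distance(r["embedding"], query))
    return [r["id"] for r in ranked[:k]]


# Filter on a regular column and rank by vector distance in one step.
print(knn(ROWS, [1.0, 0.0, 0.0], k=2, category="docs"))
```

In a relational database the filter and the ranking would be a `WHERE` clause and an `ORDER BY` on the same table, which is the point of keeping both kinds of data in one place.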

Embedding Generation

Generate embeddings without RPCs to external services, minimizing data movement and enabling faster processing and analysis. PostgresML supports dozens of popular embedding models, such as:

    intfloat/e5-large & intfloat/e5-large-v2
    And more...
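
As a rough sketch of what embedding generation inside the database can look like, the Python helper below builds a single SQL statement that embeds the user's question and ranks rows by distance in the same query. Several things here are assumptions rather than anything stated on this page: the `documents` table with `id`, `body`, and `embedding` columns is hypothetical, the statement assumes the `pgml.embed(model, text)` function and pgvector's `<=>` cosine distance operator are available, and the placeholders follow the `%s` style used by common Python Postgres drivers. The helper only builds the query; it does not connect to a database.

```python
def build_retrieval_query(question, model="intfloat/e5-large", k=5):
    """Return (sql, params) for a retrieval query that embeds the question in-database.

    Assumes a hypothetical `documents(id, body, embedding)` table whose
    `embedding` column was produced by the same model.
    """
    sql = (
        "SELECT id, body "
        "FROM documents "
        # The question is embedded by the database itself, so no embedding
        # vector has to travel between the application and another service.
        "ORDER BY embedding <=> pgml.embed(%s, %s)::vector "
        "LIMIT %s"
    )
    # E5 models are typically given a 'query: ' prefix for search queries.
    params = (model, "query: " + question, k)
    return sql, params


sql, params = build_retrieval_query("What is RAG?", k=3)
print(sql)
print(params)
```

Passing the question as a bound parameter, rather than formatting it into the SQL string, keeps user input out of the statement text.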

Large Language Models

Productionize the latest open-source large language models from HuggingFace with your own data. Browse all the available models to find the right fit for your task and dataset. PostgresML supports:

    Command R+
    And more...

Architecture makes or breaks your app.
PostgresML radically simplifies it

PGML Architecture Old Way and New Way

4x faster

than HuggingFace + Pinecone
for a RAG chatbot

10x faster

than OpenAI for embedding

Save 42%

on vector database cost
compared to Pinecone

Get the same ML/AI functionality in Python and JavaScript
