4x faster

than HuggingFace + Pinecone

26k+ models

Llama, Falcon, Mistral, etc.

1M QPS

queries per second on EC2
"I have a full proof of concept chatbot fully synced
to document changes, all done in 3 hours flat."
Keep it simple. Keep it fast. 

Today’s chatbot implementations rely on a patchwork of services, each of which adds latency. PostgresML combines and automates the entire chatbot workflow in one place. The result is less infrastructure overhead and a faster, better experience for users.

10x faster

than OpenAI for embedding

Save 42%

On vector database cost
compared to Pinecone

Deploy a next-gen chatbot with a CLI builder, vector search, retrieval-augmented generation (RAG) and the latest LLMs, all in your database.

Use a single database

Store your documents, chunks and text embeddings in one place for a simplified infrastructure footprint that requires fewer engineering resources.
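As a rough illustration of keeping documents, chunks and embeddings side by side, the sketch below uses Python's built-in sqlite3 module as a stand-in for Postgres. The table names, the `chunk_text` helper and the toy `fake_embed` function are all hypothetical; they are not PostgresML's schema or API.

```python
import json
import sqlite3


def chunk_text(text, size=40):
    """Split text into fixed-size character chunks (hypothetical helper)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def fake_embed(chunk):
    """Toy stand-in for a real embedding model: two crude numeric features."""
    return [float(len(chunk)), float(chunk.count(" "))]


def build_store(docs):
    """Create one in-memory database holding documents, chunks and embeddings."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE chunks (
            id INTEGER PRIMARY KEY,
            document_id INTEGER REFERENCES documents(id),
            body TEXT,
            embedding TEXT  -- JSON-encoded vector; an assumption for this sketch
        );
        """
    )
    for body in docs:
        doc_id = db.execute(
            "INSERT INTO documents (body) VALUES (?)", (body,)
        ).lastrowid
        for chunk in chunk_text(body):
            db.execute(
                "INSERT INTO chunks (document_id, body, embedding) VALUES (?, ?, ?)",
                (doc_id, chunk, json.dumps(fake_embed(chunk))),
            )
    db.commit()
    return db
```

Because each chunk row carries both its source document id and its embedding, one query against one database can return the text and the vector together.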

Generate text embeddings

Get the only platform that can generate text embeddings with state-of-the-art LLMs without an external inference service.

Manage vector indices

Get lightning-fast, accurate vector recall. Eliminate round-trip network calls for recall and querying to build the lowest-latency app.
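To make "vector recall" concrete, here is a minimal sketch of exact top-k recall by cosine similarity over an in-memory dictionary. It is a brute-force scan for illustration only; production vector indices typically use approximate nearest-neighbor structures, and nothing here reflects PostgresML's actual implementation. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = sorted(
        vectors.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in scored[:k]]
```

When the scan runs next to the stored vectors, the application makes no extra network hop between embedding the query and fetching the matching chunks.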

Monitor performance

Track chat history, prompts and prompt templates. Fine-tune the latest LLMs with chat history right in the database.
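A small sketch of what tracking chat history alongside a prompt template might look like. The template wording and the `build_prompt` and `record_turn` helpers are assumptions made for illustration; they are not part of PostgresML or pgml-chat.

```python
from string import Template

# Hypothetical prompt template; the wording is an assumption for this sketch.
PROMPT_TEMPLATE = Template(
    "Answer using only the context below.\n"
    "Context:\n$context\n"
    "History:\n$history\n"
    "Question: $question"
)


def record_turn(history, role, text):
    """Append one (role, text) turn to the running chat history."""
    history.append((role, text))
    return history


def build_prompt(question, chunks, history):
    """Fill the template with retrieved chunks and prior chat turns."""
    context = "\n".join(f"- {c}" for c in chunks)
    turns = "\n".join(f"{role}: {text}" for role, text in history)
    return PROMPT_TEMPLATE.substitute(
        context=context, history=turns, question=question
    )
```

Keeping the template, the retrieved chunks and the turn log as separate values makes it possible to store each one and inspect later which prompt produced which answer.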

Bring factual memory and lightning-speed responses to your website, Discord, Slack and more with a seamless integration to your preferred communication platform.

It’s simple with a seamless in-database MLOps platform.


pgml-chat --stage ingest

pgml-chat --stage chat --chat_interface cli

pgml-chat --stage chat --chat_interface slack

