SQL extension

pgml is a PostgreSQL extension that adds SQL functions to the database. These functions provide access to AI models downloaded from Hugging Face and to classical machine learning algorithms like XGBoost and LightGBM.

Our SQL API is stable and safe to use in your applications, while the models and algorithms we support continue to evolve and improve.

Common Tasks

See the API for the full list of functions provided by pgml.

Common tasks include:

Open-source LLMs

PostgresML defines four SQL functions which use 🤗 Hugging Face transformers and embeddings models, running directly in the database:

| Function | Description |
|----------|-------------|
| pgml.embed() | Generate embeddings using the latest sentence transformers from Hugging Face. |
| pgml.transform() | Generate text using LLMs like Llama, Mixtral, and many more, with models downloaded from Hugging Face. |
| pgml.transform_stream() | Streaming version of pgml.transform(), which fetches partial responses as the model generates them, substantially decreasing time to first token. |
| pgml.tune() | Fine-tune Hugging Face models using data stored in the database. |
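
As a rough illustration of the first three functions, the queries below sketch what typical calls might look like. The model names (`intfloat/e5-small-v2`, `meta-llama/Meta-Llama-3.1-8B-Instruct`), the input strings, and the `max_new_tokens` values are assumptions chosen for the example, not requirements of pgml. The named parameters follow the signatures as commonly documented and should be checked against the API reference for your installed version.

```sql
-- Generate an embedding for one piece of text.
-- The model name is an assumption; any supported embedding model could be used.
SELECT pgml.embed(
    'intfloat/e5-small-v2',
    'passage: PostgresML runs models inside the database.'
) AS embedding;

-- Generate text with an LLM. Model and arguments are illustrative assumptions.
SELECT pgml.transform(
    task   => '{
        "task": "text-generation",
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
    }'::JSONB,
    inputs => ARRAY['AI is going to'],
    args   => '{"max_new_tokens": 32}'::JSONB
) AS completion;

-- Streaming variant: returns partial responses as rows while the model generates.
-- Note the singular "input" parameter, assumed here from the documented signature.
SELECT pgml.transform_stream(
    task  => '{
        "task": "text-generation",
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
    }'::JSONB,
    input => 'AI is going to',
    args  => '{"max_new_tokens": 32}'::JSONB
);
```

The first call to a given model downloads it from Hugging Face, so it can take noticeably longer than later calls.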

Classical machine learning

PostgresML defines four SQL functions for loading data, training regression, classification, and clustering models on tabular data, and deploying and running those models:

| Function | Description |
|----------|-------------|
| pgml.train() | Train a model on PostgreSQL tables or views using any algorithm from Scikit-learn, with additional support for XGBoost, LightGBM, and CatBoost. |
| pgml.predict() | Run inference on live application data using a model trained with pgml.train(). |
| pgml.deploy() | Deploy a specific version of a model trained with pgml.train(), using your own accuracy metrics. |
| pgml.load_dataset() | Load any of the toy datasets from Scikit-learn or any dataset from Hugging Face. |
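
The four functions are usually combined into a load, train, predict, deploy workflow. The sketch below assumes the Scikit-learn `digits` toy dataset, which is expected to be loaded into a table named `pgml.digits` with `image` and `target` columns. The project name and the choice of `xgboost` are illustrative assumptions, and the parameter names should be verified against the API reference.

```sql
-- Load the Scikit-learn "digits" toy dataset (assumed to land in pgml.digits).
SELECT * FROM pgml.load_dataset('digits');

-- Train a classifier on the table. The project name is a hypothetical label.
SELECT * FROM pgml.train(
    project_name  => 'Handwritten Digit Image Classifier',
    task          => 'classification',
    relation_name => 'pgml.digits',
    y_column_name => 'target',
    algorithm     => 'xgboost'
);

-- Run inference with the project's currently deployed model.
SELECT
    target,
    pgml.predict('Handwritten Digit Image Classifier', image) AS prediction
FROM pgml.digits
LIMIT 10;

-- Explicitly deploy the best-scoring model trained so far for this project.
SELECT * FROM pgml.deploy(
    project_name => 'Handwritten Digit Image Classifier',
    strategy     => 'best_score'
);
```

Referring to models by project name rather than by a specific model ID means application queries that call `pgml.predict()` do not need to change when a newer model is deployed.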