End-to-end machine learning solution for everyone

Train and deploy models to make online predictions using only SQL, with an open source extension for Postgres. Manage your projects and visualize datasets using the built-in dashboard.

Demo

Pure SQL Solution

train.sql
SELECT pgml.train(
  'My project name', 
  task => 'regression',
  relation_name => 'my_table_with_data',
  y_column_name => 'my_column_with_labels',
  algorithm => 'xgboost' 
);

Learn more about Training

deploy.sql
SELECT pgml.deploy(
  'My project name', 
  strategy => 'most_recent',
  algorithm => 'xgboost'
);

Learn more about Deployments

predict.sql
SELECT *, pgml.predict(
  'My project name', 
  ARRAY[...] -- same features used in training
) AS prediction
FROM my_new_unlabeled_table
ORDER BY prediction DESC;

Learn more about Predictions

Get Started

What's in the box

All your favorite algorithms

Whether you need a simple linear regression or extreme gradient boosting, we've included support for all classification and regression algorithms in scikit-learn and XGBoost with no extra configuration.

Algorithms

Instant visualizations

Run standard analyses on your datasets to detect outliers, bimodal distributions, and feature correlations, alongside other common data visualizations. Everything is cataloged in the dashboard for easy reference.

Dashboard

Hyperparameter search

Use either grid or random search with cross-validation on your training set to discover which hyperparameters matter most for your favorite algorithm.

Hyperparameter Search
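
As a rough sketch, a grid search might be requested by passing extra arguments to the same pgml.train call shown in train.sql. The search and search_params argument names and the JSON shape of the grid are assumptions here, not taken from this page; the Hyperparameter Search documentation is the place to confirm them.

```sql
-- Sketch only: search and search_params are assumed argument names.
-- The project, table, and column names are the placeholders from train.sql.
SELECT pgml.train(
  'My project name',
  task => 'regression',
  relation_name => 'my_table_with_data',
  y_column_name => 'my_column_with_labels',
  algorithm => 'xgboost',
  search => 'grid',  -- assumed: 'grid' or 'random'
  search_params => '{"max_depth": [2, 4, 6], "n_estimators": [20, 40, 80]}'
);
```

Each combination in the grid would be trained and scored with cross-validation, so keeping the grid small keeps training time manageable.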

Online and offline support

Predictions are served via a standard Postgres connection to ensure that your core apps can always access both your data and your models in real time. Pure SQL workflows also enable batch predictions to cache results in native Postgres tables for lookup.

Predictions
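
One way to picture the batch workflow is to materialize predictions into an ordinary table and look them up later. This is a minimal sketch: my_cached_predictions, id, feature_1, and feature_2 are hypothetical names, and the feature array must match whatever was used in training.

```sql
-- Hypothetical names: my_cached_predictions, id, feature_1, feature_2.
-- Batch step: score every row once and store the result.
CREATE TABLE my_cached_predictions AS
SELECT
  id,
  pgml.predict(
    'My project name',
    ARRAY[feature_1, feature_2]  -- same features used in training
  ) AS prediction
FROM my_new_unlabeled_table;

-- Lookup step: read the cached value instead of re-running the model.
SELECT prediction
FROM my_cached_predictions
WHERE id = 42;
```

Online predictions skip the cache and call pgml.predict directly in the query, as in predict.sql.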

SQL native vector operations

Vector operations make working with learned embeddings a snap, for things like nearest neighbor searches or other similarity comparisons.

Vector Operations
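
A nearest neighbor search over stored embeddings could look roughly like the sketch below. It assumes the extension exposes a pgml.cosine_similarity function over real[] arrays, which this page does not confirm, and uses a hypothetical my_embeddings table with id and embedding columns; see the Vector Operations documentation for the actual function list.

```sql
-- Sketch only: pgml.cosine_similarity is an assumed function name.
-- Hypothetical table: my_embeddings(id, embedding real[]).
SELECT
  id,
  pgml.cosine_similarity(
    embedding,
    ARRAY[0.1, 0.2, 0.3]::real[]  -- query vector, same length as stored embeddings
  ) AS similarity
FROM my_embeddings
ORDER BY similarity DESC  -- most similar first
LIMIT 5;
```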

Managed model deployments

Models can be periodically retrained and automatically promoted to production depending on their key metric. Rollback capability ensures that you're always able to serve the highest-quality predictions, and historical logs of all deployments are kept for long-term study.

Deployments

The performance of Postgres

Since your data never leaves the database, you retain the speed, reliability and security you expect in your foundational stateful services. Leverage your existing infrastructure and the data distribution strategies native to PostgreSQL to deliver new capabilities.

Distributed Training

Open source

We're building on the shoulders of giants. These machine learning libraries and Postgres have received extensive academic and industry use, and we'll continue their tradition of building with the community.

MIT License