Installation
A typical PostgresML deployment consists of two parts: the PostgreSQL extension, and the dashboard web app. The extension provides all the machine learning functionality, and can be used independently. The dashboard provides a system overview for easier management, and notebooks for writing experiments.
Extension
The extension can be installed by compiling it from source or, if you're using Ubuntu 22.04, from our package repository.
macOS
If you're just looking to try PostgresML without installing it on your system, take a look at our Quick Start with Docker guide.
Get the source code
To get the source code for PostgresML, you can clone our GitHub repository:
git clone https://github.com/postgresml/postgresml
Install dependencies
We provide a Brewfile that will install all the necessary dependencies for compiling PostgresML from source:
cd pgml-extension && \
brew bundle
Rust
PostgresML is written in Rust, so you'll need to install the latest compiler from rust-lang.org. Additionally, we use the Rust PostgreSQL extension framework pgrx, which requires some initialization steps:
cargo install cargo-pgrx --version 0.11.2 && \
cargo pgrx init
This step will take a few minutes. Perfect opportunity to get a coffee while you wait.
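Before moving on to compilation, it can help to confirm that the toolchain is actually on your PATH. The following is a minimal sketch, not something PostgresML ships: `missing_tools` is a hypothetical helper name, and the check assumes `cargo install` placed the `cargo-pgrx` binary in a directory on your PATH (typically `~/.cargo/bin`).

```shell
# Print the name of every listed command that cannot be found on PATH.
# missing_tools is a hypothetical helper, not something PostgresML ships.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the tools this section relies on; an empty result means all were found.
missing=$(missing_tools git brew cargo cargo-pgrx)
if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing" >&2
fi
```

If anything is reported, install it (or fix your PATH) before running the compile step.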
Compile and install
With all the dependencies installed, you can compile and install the extension:
cargo pgrx install
This will compile all the necessary packages, including Rust bindings to XGBoost and LightGBM, together with Python support for Hugging Face transformers and Scikit-learn. The extension will be automatically installed into the PostgreSQL installation created by the postgresql@15 Homebrew formula.
Python dependencies
PostgresML uses Python packages to provide support for Hugging Face LLMs and Scikit-learn algorithms and models. To make this work on your system, you have two options: install those packages into a virtual environment (strongly recommended), or install them globally.
To install the necessary Python packages into a virtual environment, use the virtualenv tool installed previously by Homebrew:
virtualenv pgml-venv && \
source pgml-venv/bin/activate && \
pip install -r requirements.txt
Installing Python packages globally can cause issues with your system. If you wish to proceed nonetheless, you can do so:
pip3 install -r requirements.txt
Configuration
We have one last step remaining to get PostgresML running on your system: configuration.
PostgresML needs to be loaded into shared memory by PostgreSQL. To do so, you need to add it to shared_preload_libraries.
Additionally, if you've chosen to use a virtual environment for the Python packages, we need to tell PostgresML where to find it.
Both steps can be done by editing the PostgreSQL configuration file postgresql.conf using your favorite editor:
vim /opt/homebrew/var/postgresql@15/postgresql.conf
Both settings can be added to the config, like so:
shared_preload_libraries = 'pgml,pg_stat_statements'
pgml.venv = '/absolute/path/to/your/pgml-venv'
Save the configuration file and restart PostgreSQL:
brew services restart postgresql@15
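If you would rather script the configuration than edit the file by hand, the two settings can be appended from the shell. This is a sketch under stated assumptions: `abs_dir` and `set_conf` are hypothetical helper names, and the approach relies on PostgreSQL using the last occurrence of a parameter when it appears more than once in postgresql.conf. If you apply it after the restart above, restart PostgreSQL again so the change takes effect.

```shell
# Hypothetical helpers for illustration; they are not part of PostgresML.

# Print the absolute path of an existing directory (pgml.venv needs one).
abs_dir() {
    (cd "$1" && pwd)
}

# Append "key = 'value'" to a config file unless that exact line already exists.
# PostgreSQL honors the last occurrence of a setting, so appending is enough.
set_conf() {
    conf_file=$1
    conf_line="$2 = '$3'"
    grep -qxF -- "$conf_line" "$conf_file" 2>/dev/null ||
        printf '%s\n' "$conf_line" >> "$conf_file"
}

# Example usage (paths taken from this guide; adjust to your system):
#   conf=/opt/homebrew/var/postgresql@15/postgresql.conf
#   set_conf "$conf" shared_preload_libraries 'pgml,pg_stat_statements'
#   set_conf "$conf" pgml.venv "$(abs_dir pgml-venv)"
```

Because `set_conf` skips lines that are already present, running it twice does not duplicate the settings.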
Test your installation
You should be able to connect to PostgreSQL and use our extension now:
CREATE EXTENSION pgml;
SELECT pgml.version();
psql (15.3 (Homebrew))
Type "help" for help.
pgml_test=# CREATE EXTENSION pgml;
INFO: Python version: 3.11.4 (main, Jun 20 2023, 17:23:00) [Clang 14.0.3 (clang-1403.0.22.14.1)]
INFO: Scikit-learn 1.2.2, XGBoost 1.7.5, LightGBM 3.3.5, NumPy 1.25.1
CREATE EXTENSION
pgml_test=# SELECT pgml.version();
version
---------
2.7.4
(1 row)
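The same check can be run non-interactively, which is handy in setup scripts. A minimal sketch, assuming the extension has already been created in the target database; `pgml_version` is a hypothetical helper name, and `pgml_test` is the database name from the session above.

```shell
# Print the PostgresML version reported by a database.
# pgml_version is a hypothetical helper, not something PostgresML ships.
pgml_version() {
    # -A: unaligned output, -t: rows only (no header/footer), -c: run one command
    psql -d "$1" -Atc 'SELECT pgml.version();'
}

# Example usage, with the database name used in the session above:
#   pgml_version pgml_test
```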
pgvector
We like and use pgvector a lot, as documented in our blog posts and examples, to store and search embeddings. You can install pgvector from source pretty easily:
git clone --branch v0.5.0 https://github.com/pgvector/pgvector && \
cd pgvector && \
echo "trusted = true" >> vector.control && \
make && \
make install
Test pgvector installation
You can create the vector extension in any database:
CREATE EXTENSION vector;
psql (15.3 (Homebrew))
Type "help" for help.
pgml_test=# CREATE EXTENSION vector;
CREATE EXTENSION
Ubuntu
If you're looking to use PostgresML in production, try our cloud. We support serverless deployments with modern GPUs for startups of all sizes, and dedicated GPU hardware for larger teams that would like to tweak PostgresML to their needs.
For Ubuntu, we compile and ship packages that include everything needed to install and run the extension. At the moment, only Ubuntu 22.04 (Jammy) is supported.
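Since only one release is supported, a setup script can stop early on anything else. A small sketch: `is_supported_codename` is a hypothetical helper name, and the usage line assumes `lsb_release` is installed.

```shell
# Succeed only for the Ubuntu release this guide says is supported (jammy).
# is_supported_codename is a hypothetical helper, not a PostgresML command.
is_supported_codename() {
    [ "$1" = "jammy" ]
}

# Example usage:
#   is_supported_codename "$(lsb_release -cs)" || echo "Unsupported release" >&2
```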
Add our sources
Add our repository to your system sources:
echo "deb [trusted=yes] https://apt.postgresml.org $(lsb_release -cs) main" | \
sudo tee -a /etc/apt/sources.list
Install PostgresML
Update your package lists and install PostgresML:
export POSTGRES_VERSION=15
sudo apt update && \
sudo apt install postgresml-${POSTGRES_VERSION}
The postgresml-15 package includes all the necessary dependencies, including Python packages shipped inside a virtual environment. Your PostgreSQL server is configured automatically.
We support PostgreSQL versions 11 through 15, so you can install the one matching your currently installed PostgreSQL version.
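To pick the matching package automatically, you can derive the major version from an existing installation. This is a sketch, not an official tool: `pg_major` and `is_supported_pg` are hypothetical helper names, and the usage lines assume `pg_config` is installed and on your PATH.

```shell
# Extract the major version from `pg_config --version` output,
# e.g. "PostgreSQL 15.3" -> 15. pg_major is a hypothetical helper.
pg_major() {
    pg_ver=${1#PostgreSQL }
    printf '%s\n' "${pg_ver%%.*}"
}

# Succeed if the major version falls in the supported range stated above.
is_supported_pg() {
    [ "$1" -ge 11 ] 2>/dev/null && [ "$1" -le 15 ] 2>/dev/null
}

# Example usage:
#   major=$(pg_major "$(pg_config --version)")
#   is_supported_pg "$major" && export POSTGRES_VERSION=$major
```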
Installing just the extension
If you prefer to manage your own Python environment and dependencies, you can install just the extension:
export POSTGRES_VERSION=15
sudo apt install postgresql-pgml-${POSTGRES_VERSION}
Optimized pgvector
pgvector, the extension we use for storing and searching embeddings, needs to be installed separately for optimal performance. Your hardware may support vectorized operation instructions (like AVX-512), which pgvector can take advantage of to run faster.
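On Linux, you can see which AVX-512 feature flags your CPU reports before deciding whether a source build is worth it. A minimal sketch: `cpu_avx512_flags` is a hypothetical helper name, and it only lists what the kernel reports; it does not tell you whether a particular pgvector build will actually use those instructions.

```shell
# List the distinct AVX-512 feature flags found in a cpuinfo-style file.
# cpu_avx512_flags is a hypothetical helper; it defaults to /proc/cpuinfo,
# which exists on Linux only.
cpu_avx512_flags() {
    grep -o 'avx512[a-z0-9_]*' "${1:-/proc/cpuinfo}" | sort -u
}

# Example usage: an empty result means no AVX-512 flags were reported.
#   cpu_avx512_flags
```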
To install pgvector from source, run:
git clone --branch v0.4.4 https://github.com/pgvector/pgvector && \
cd pgvector && \
echo "trusted = true" >> vector.control && \
make && \
make install
Other Linux
PostgresML will compile and run on pretty much any modern Linux distribution. For a quick example, you can take a look at what we do to build the extension on Ubuntu, and modify those steps to work on your distribution.
Get the source code
To get the source code for PostgresML, you can clone our GitHub repo:
git clone https://github.com/postgresml/postgresml
Dependencies
You'll need the following packages installed first. The names are taken from Ubuntu (and other Debian-based distros), so you'll need to change them to fit your distribution:
export POSTGRES_VERSION=15
build-essential
clang
libopenblas-dev
libssl-dev
bison
flex
pkg-config
cmake
libreadline-dev
libz-dev
tzdata
sudo
libpq-dev
libclang-dev
postgresql-${POSTGRES_VERSION}
postgresql-server-dev-${POSTGRES_VERSION}
python3
python3-pip
libpython3
lld
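On a Debian-based system, the list above can be fed to the package manager in one go. This is a sketch only: `pgml_build_deps` is a hypothetical helper name, the package names are copied verbatim from the list above, and some of them may need adjusting for your release or distribution.

```shell
# Print the build dependencies listed above, one per line, for a given
# PostgreSQL major version. pgml_build_deps is a hypothetical helper.
pgml_build_deps() {
    pg_version=${1:-15}
    cat <<EOF
build-essential
clang
libopenblas-dev
libssl-dev
bison
flex
pkg-config
cmake
libreadline-dev
libz-dev
tzdata
sudo
libpq-dev
libclang-dev
postgresql-${pg_version}
postgresql-server-dev-${pg_version}
python3
python3-pip
libpython3
lld
EOF
}

# Example usage on Ubuntu or Debian (left unquoted on purpose so that each
# package name becomes its own argument):
#   sudo apt install -y $(pgml_build_deps 15)
```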
Rust
PostgresML is written in Rust, so you'll need to install the latest compiler version from rust-lang.org.
pgrx
We use the pgrx Postgres Rust extension framework, which comes with its own installation and configuration steps:
cd pgml-extension && \
cargo install cargo-pgrx --version 0.11.2 && \
cargo pgrx init
This step will take a few minutes since it has to download and compile multiple PostgreSQL versions used by pgrx for development.
Compile and install
Finally, you can compile and install the extension:
cargo pgrx install
Dashboard
The dashboard is a web app that can be run against any Postgres database which has the extension installed. There is a Dockerfile included with the source code if you wish to run it as a container.
Get the source code
To get our source code, you can clone our GitHub repo (if you haven't already):
git clone https://github.com/postgresml/postgresml && \
cd postgresml/pgml-dashboard
Configure your database
Use an existing database which has the pgml extension installed, or create a new one:
createdb pgml_dashboard && \
psql -d pgml_dashboard -c 'CREATE EXTENSION pgml;'
Configure the environment
Create a .env file with the necessary DATABASE_URL, for example:
DATABASE_URL=postgres:///pgml_dashboard
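If you also want DATABASE_URL available to other commands in your current shell, for example to test the connection with psql, you can load the file into the environment. A minimal sketch: `load_env` is a hypothetical helper name, and it only handles simple KEY=VALUE lines like the one above.

```shell
# Export every variable defined in a simple KEY=VALUE .env file into the
# current shell. load_env is a hypothetical helper, not a PostgresML command.
load_env() {
    set -a          # auto-export variables assigned while sourcing
    . "$1"
    set +a
}

# Example usage, from the pgml-dashboard directory:
#   load_env ./.env
#   psql "$DATABASE_URL" -c 'SELECT 1;'
```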
Get Rust
The dashboard is written in Rust and uses the SQLx crate to interact with Postgres. Make sure to install the latest Rust compiler from rust-lang.org.
Database setup
To set up the database, you'll need to install sqlx-cli and run the migrations:
cargo install sqlx-cli --version 0.6.3 && \
cargo sqlx database setup
Frontend dependencies
The dashboard frontend uses Sass, which requires Node and the Sass compiler. You can install Node from Homebrew, your package repository, or by using Node Version Manager.
If using nvm, you can install the latest stable Node version with:
nvm install stable
Once you have Node installed, you can install the Sass compiler globally:
npm install -g sass
Compile and run
Finally, you can compile and run the dashboard:
cargo run
Once compiled, the dashboard will be available on localhost:8000.
The dashboard can also be packaged for distribution. You'll need to copy the static files along with the target/release directory to your server.