PostgresML adds optimized vector operations that can be used inside SQL queries. Vector operations are particularly useful for working with embeddings generated by other machine learning algorithms, and they support tasks such as nearest neighbor search using various distance functions.
Embeddings can be a relatively efficient way to leverage the power of deep learning without the runtime inference costs. These functions are fast: even the most expensive distance functions compute upwards of 100k distance calculations per second for a memory-resident dataset on modern hardware.
The PostgreSQL planner will also automatically parallelize evaluation on larger datasets when it is configured to use multiple CPU cores.
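As a rough sketch of the configuration involved, the standard PostgreSQL parallel query settings can be inspected and raised for a session. The value 4 below is an arbitrary assumption; a sensible value depends on the available cores and on the server-wide `max_worker_processes` and `max_parallel_workers` limits.

```sql
-- Show the current per-query parallel worker limit (standard PostgreSQL setting).
SHOW max_parallel_workers_per_gather;

-- Allow up to 4 workers per Gather node for this session.
-- The value 4 is illustrative only, not a recommendation.
SET max_parallel_workers_per_gather = 4;
```

Running `EXPLAIN` on a query shows whether the planner actually chose a parallel plan for it.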
Vector operations are implemented in Rust using ndarray and BLAS, for maximum performance.
Element-wise Arithmetic with Constants
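A minimal sketch, assuming the extension is installed and exposes `pgml.add`, `pgml.subtract`, `pgml.multiply`, and `pgml.divide` in a form that takes a `real[]` vector and a `real` constant. The explicit casts are added here only to keep overload resolution unambiguous and may not be required.

```sql
-- Each call applies the constant to every element of the vector.
SELECT pgml.add(ARRAY[1.0, 2.0, 3.0]::real[], 3::real);       -- element + 3
SELECT pgml.subtract(ARRAY[1.0, 2.0, 3.0]::real[], 3::real);  -- element - 3
SELECT pgml.multiply(ARRAY[1.0, 2.0, 3.0]::real[], 3::real);  -- element * 3
SELECT pgml.divide(ARRAY[1.0, 2.0, 3.0]::real[], 3::real);    -- element / 3
```

Each query returns a vector of the same length as its input.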
Pairwise arithmetic with Vectors
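The same function names are assumed here to accept two vectors of equal length and combine them element by element. This is a sketch under that assumption, with explicit `real[]` casts added for clarity.

```sql
-- Element i of the result combines element i of each input vector.
SELECT pgml.add(ARRAY[1.0, 2.0, 3.0]::real[], ARRAY[4.0, 5.0, 6.0]::real[]);
SELECT pgml.subtract(ARRAY[1.0, 2.0, 3.0]::real[], ARRAY[4.0, 5.0, 6.0]::real[]);
SELECT pgml.multiply(ARRAY[1.0, 2.0, 3.0]::real[], ARRAY[4.0, 5.0, 6.0]::real[]);
SELECT pgml.divide(ARRAY[1.0, 2.0, 3.0]::real[], ARRAY[4.0, 5.0, 6.0]::real[]);
```

Note that `pgml.multiply` in this form is an element-wise product, not a dot product.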
Dimensions not at origin
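This is the L0 "norm": the count of non-zero elements. A minimal sketch, assuming the function is exposed as `pgml.norm_l0` and accepts a `real[]` vector:

```sql
-- Two of the four elements are non-zero, so by definition
-- the L0 norm of this vector is 2.
SELECT pgml.norm_l0(ARRAY[1.0, 0.0, 3.0, 0.0]::real[]);
```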
Manhattan distance from origin
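This is the L1 norm: the sum of the absolute values of the elements. A minimal sketch, assuming the function is exposed as `pgml.norm_l1`:

```sql
-- |1| + |-2| + |3| = 6, so the L1 norm of this vector should be 6.
SELECT pgml.norm_l1(ARRAY[1.0, -2.0, 3.0]::real[]);
```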
Euclidean distance from origin
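This is the L2 norm: the square root of the sum of squared elements. A minimal sketch, assuming the function is exposed as `pgml.norm_l2`:

```sql
-- sqrt(3^2 + 4^2) = 5, so the L2 norm of this vector should be 5.
SELECT pgml.norm_l2(ARRAY[3.0, 4.0]::real[]);
```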
Absolute value of largest element
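This is the max norm (also called the infinity norm). A minimal sketch, assuming the function is exposed as `pgml.norm_max` and follows the usual definition, under which the element with the largest magnitude determines the result regardless of sign:

```sql
-- Under the usual max-norm definition, -5 has the largest magnitude,
-- so the result should be 5.
SELECT pgml.norm_max(ARRAY[1.0, -5.0, 3.0]::real[]);
```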
Squared Unit Vector
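The sketch below assumes this heading refers to L2 normalization, exposed as `pgml.normalize_l2`, which divides each element by the vector's Euclidean norm so that the result has length 1. Both the function name and that mapping are assumptions, not something this page confirms.

```sql
-- The L2 norm of (3, 4) is 5, so L2 normalization should
-- yield approximately (0.6, 0.8).
SELECT pgml.normalize_l2(ARRAY[3.0, 4.0]::real[]);
```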
Nearest Neighbor Example
If we had precalculated the embeddings for a set of user and product data, we could find the 100 best products for a user with a similarity search.
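One way such a search could look is sketched below. The `users` and `products` tables, their columns, and the user id `123` are hypothetical, and the query assumes the extension exposes `pgml.cosine_similarity` for two `real[]` vectors.

```sql
-- Hypothetical schema: one precalculated embedding per user and per product.
CREATE TABLE users (id BIGINT PRIMARY KEY, embedding REAL[]);
CREATE TABLE products (id BIGINT PRIMARY KEY, embedding REAL[]);

-- Score every product against one user's embedding and keep the top 100.
SELECT
    products.id,
    pgml.cosine_similarity(users.embedding, products.embedding) AS similarity
FROM users
CROSS JOIN products
WHERE users.id = 123        -- hypothetical user id
ORDER BY similarity DESC    -- higher cosine similarity means a closer match
LIMIT 100;
```

With a distance function instead of a similarity function, the sort order would be ascending, since smaller distances indicate closer matches.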