pgml.transform()
PostgresML integrates 🤗 Hugging Face Transformers to bring state-of-the-art models into the data layer. Tens of thousands of pre-trained models, covering many published deep learning architectures, are available for download, each with a pipeline that turns raw inputs into useful results. Browse the available models to find the best fit for your dataset and task.
We'll demonstrate some of the tasks that are immediately available to users of your database upon installation: translation, sentiment analysis, summarization, question answering and text generation.
Examples
All of the tasks and models demonstrated here can be customized by passing additional arguments to the Pipeline initializer or call. You'll find additional links to documentation in the examples below.
The Hugging Face Pipeline API is exposed in Postgres via:
pgml.transform(
    task TEXT OR JSONB,     -- task name or full pipeline initializer arguments
    call JSONB,             -- additional call arguments alongside the inputs
    inputs TEXT[] OR BYTEA[] -- inputs for inference
)
This is roughly equivalent to the following Python:
import transformers

def transform(task, call, inputs):
    # task holds the pipeline initializer arguments, call the per-call arguments
    return transformers.pipeline(**task)(inputs, **call)
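The one-liner above assumes `task` is already a dictionary of initializer arguments. Since the SQL signature also accepts a bare task name, the following is a minimal sketch of how that normalization might look. The helper `normalize_task`, the `pipeline_factory` parameter, and the `fake_pipeline` stand-in are hypothetical names introduced here for illustration; they are not part of PostgresML or Transformers, and the extension's actual internals may differ.

```python
import json


def normalize_task(task):
    """Return pipeline keyword arguments for a task given as a name or a JSON object."""
    if isinstance(task, dict):
        return dict(task)
    text = task.strip()
    if text.startswith("{"):
        # A JSON object carries the full set of initializer arguments.
        return json.loads(text)
    # A bare name such as "summarization" becomes {"task": "summarization"}.
    return {"task": text}


def transform(task, call, inputs, pipeline_factory):
    """Build a pipeline from the task, then call it with the inputs and call arguments."""
    pipe = pipeline_factory(**normalize_task(task))
    return pipe(inputs, **(call or {}))


# Hypothetical stand-in for transformers.pipeline so the sketch runs
# without downloading a model; it only echoes what it was given.
def fake_pipeline(task, **init_kwargs):
    def run(inputs, **call_kwargs):
        return [{"task": task, "input": text, "call": call_kwargs} for text in inputs]
    return run


result = transform("translation_en_to_fr", {"max_length": 40}, ["Hello"], fake_pipeline)
print(result)
```

In real use, `transformers.pipeline` would presumably be passed as the `pipeline_factory` in place of the stand-in.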
Most pipelines operate on TEXT[] inputs, but some, such as audio classifiers, require binary BYTEA[] data. inputs can be SELECTed from tables in the database, or passed in directly with the query. The output of this call is a task-specific JSONB structure. See the Postgres JSON reference for ways to process this output dynamically.
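The output can also be processed on the client once it has been fetched. The sketch below pairs each input with its result in Python. The sample output is made up for illustration: its shape (one object with `label` and `score` per input) follows the Hugging Face text-classification pipeline, and other tasks return different structures. `pair_results` is a hypothetical helper, not a PostgresML function.

```python
import json

# Hypothetical sample output for a sentiment analysis call; the values are invented.
raw_output = '[{"label": "POSITIVE", "score": 0.98}, {"label": "NEGATIVE", "score": 0.95}]'
inputs = ["I love this database", "This query is too slow"]


def pair_results(inputs, raw_output):
    """Attach each input text to the result object at the same position."""
    results = json.loads(raw_output)
    if len(results) != len(inputs):
        raise ValueError("expected one result per input")
    return [{"input": text, **result} for text, result in zip(inputs, results)]


paired = pair_results(inputs, raw_output)
for row in paired:
    print(f'{row["label"]:>8} {row["score"]:.2f}  {row["input"]}')
```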
Models are downloaded on the first call and stored locally on disk. They are also cached in memory per connection to speed up repeated calls in a single session. To free that memory, you'll need to close your connection. For larger models with billions of parameters, you may want to establish dedicated credentials and connection pools via pgcat or pgbouncer. You may also pass {"cache": false} in the JSON call arguments to disable this caching.
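As an illustration of the cache flag, the following sketch assembles a parameterized query and its arguments from a Python client, adding `{"cache": false}` to the call arguments on request. It makes several assumptions: `build_transform_query` is a hypothetical helper, the `%s` placeholders follow the psycopg convention, the arguments are passed positionally in the order shown in the signature above, and the casts may need adjusting for your installation. Nothing is executed against a database here.

```python
import json


def build_transform_query(task, inputs, call=None, cache=True):
    """Return (sql, params) for a pgml.transform call, without executing it."""
    call_args = dict(call or {})
    if not cache:
        # Ask PostgresML not to keep the model cached on this connection.
        call_args["cache"] = False
    # Accept either a bare task name or a dict of initializer arguments.
    task_json = json.dumps(task if isinstance(task, dict) else {"task": task})
    sql = "SELECT pgml.transform(%s::JSONB, %s::JSONB, %s::TEXT[])"
    return sql, (task_json, json.dumps(call_args), list(inputs))


sql, params = build_transform_query(
    "summarization",
    ["some long text to summarize"],
    call={"max_length": 20},
    cache=False,
)
print(sql)
print(params)
```

With a driver such as psycopg, the pair would typically be handed to `cursor.execute(sql, params)`.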