The pgml.transform() function is the most powerful feature of PostgresML. It integrates open-source large language models, like Llama, Mixtral, and many more, allowing you to perform complex tasks on your data.
The models are downloaded from 🤗 Hugging Face which hosts tens of thousands of pre-trained and fine-tuned models for various tasks like text generation, question answering, summarization, text classification, and more.
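For example, analyzing the sentiment of a piece of text takes a single query. This is a minimal sketch using the task-based API described below; since no model is specified, the task's default Hugging Face model is used:

```postgresql
-- Minimal sketch: sentiment analysis with the task-based API.
-- No model is specified, so the default model for the
-- text-classification task is downloaded and used.
SELECT pgml.transform(
    task   => 'text-classification',
    inputs => ARRAY['PostgresML makes machine learning simple.']
);
```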
The pgml.transform() function comes in two flavors: task-based and model-based.
The task-based API automatically chooses a model based on the task:
pgml.transform(
task TEXT,
args JSONB,
inputs TEXT[]
)
| Argument | Description | Example | Required |
|----------|-------------|---------|----------|
| task | The name of a natural language processing task. | 'text-generation' | Required |
| args | Additional kwargs to pass to the pipeline. | '{"max_new_tokens": 50}'::JSONB | Optional |
| inputs | Array of prompts to pass to the model for inference. Each prompt is evaluated independently and a separate result is returned. | ARRAY['Once upon a time...'] | Required |
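For example, to generate text, letting the task pick its default model: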
SELECT *
FROM pgml.transform(
task => 'text-generation',
inputs => ARRAY['In a galaxy far far away']
);
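Or to translate from English to French: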
SELECT *
FROM pgml.transform(
task => 'translation_en_to_fr',
inputs => ARRAY['How do I say hello in French?']
);
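The optional args parameter forwards keyword arguments to the underlying pipeline. Here is a minimal sketch that caps the length of the generated text; max_new_tokens is assumed to be supported by the task's default text-generation model:

```postgresql
SELECT *
FROM pgml.transform(
    task   => 'text-generation',
    args   => '{"max_new_tokens": 50}'::JSONB,  -- keyword arguments for the pipeline
    inputs => ARRAY['Once upon a time...']
);
```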
The model-based API requires the name of the model and the task, passed as a JSON object. This allows it to be more generic and support more models:
pgml.transform(
model JSONB,
args JSONB,
inputs TEXT[]
)
| Argument | Description | Example |
|----------|-------------|---------|
| model | Model configuration, including name and task. | '{"task": "text-generation", "model": "mistralai/Mixtral-8x7B-v0.1"}'::JSONB |
| args | Additional kwargs to pass to the pipeline. | '{"max_new_tokens": 50}'::JSONB |
| inputs | Array of prompts to pass to the model for inference. Each prompt is evaluated independently. | ARRAY['Once upon a time...'] |
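For example, to generate text with a specific Llama model from the Hugging Face Hub: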
SELECT pgml.transform(
task => '{
"task": "text-generation",
"model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"model_type": "mistral",
"revision": "main",
"device_map": "auto"
}'::JSONB,
inputs => ARRAY['AI is going to'],
args => '{
"max_new_tokens": 100
}'::JSONB
);
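For comparison, this is roughly equivalent to the following Python code using the Hugging Face transformers library: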
import transformers

def transform(task, call, inputs):
    return transformers.pipeline(**task)(inputs, **call)

transform(
    {
        "task": "text-generation",
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "model_type": "mistral",
        "revision": "main",
    },
    {"max_new_tokens": 100},
    ['AI is going to change the world in the following ways:'],
)
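Because pgml.transform() is a regular SQL function, the inputs can also come straight from your tables. A hedged sketch, assuming a hypothetical reviews table with a TEXT column named body, that classifies every row in a single query:

```postgresql
-- Sketch only: "reviews" and its "body" column are hypothetical.
-- array_agg() collects the rows into the TEXT[] that pgml.transform() expects.
SELECT pgml.transform(
    task   => 'text-classification',
    inputs => array_agg(body)
) AS sentiment
FROM reviews;
```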
See also: LLM guides for more examples