pgml.transform()

The pgml.transform() function is the most powerful feature of PostgresML. It integrates open-source large language models, such as Llama and Mixtral, which allows you to perform complex tasks on your data.

The models are downloaded from 🤗 Hugging Face, which hosts tens of thousands of pre-trained and fine-tuned models for tasks like text generation, question answering, summarization, text classification, and more.

API

The pgml.transform() function comes in two flavors: task-based and model-based.

Task-based API

The task-based API automatically chooses a model based on the task:

pgml.transform(
    task TEXT,
    args JSONB,
    inputs TEXT[]
)
Argument | Description | Example | Required
task | The name of a natural language processing task. | 'text-generation' | Required
args | Additional kwargs to pass to the pipeline. | '{"max_new_tokens": 50}'::JSONB | Optional
inputs | Array of prompts to pass to the model for inference. Each prompt is evaluated independently and a separate result is returned. | ARRAY['Once upon a time...'] | Required

Examples

SELECT *
FROM pgml.transform(
    task => 'text-generation',
    inputs => ARRAY['In a galaxy far far away']
);

SELECT *
FROM pgml.transform(
    task => 'translation_en_to_fr',
    inputs => ARRAY['How do I say hello in French?']
);
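
The optional args parameter works with the task-based API as well; it forwards additional keyword arguments to the underlying pipeline. A minimal sketch limiting generation length, using the max_new_tokens value from the table above:

SELECT *
FROM pgml.transform(
    task => 'text-generation',
    inputs => ARRAY['In a galaxy far far away'],
    args => '{"max_new_tokens": 50}'::JSONB
);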

Model-based API

The model-based API requires the name of the model and the task, passed together as a JSON object, which makes it more generic and able to support more models:

pgml.transform(
    model JSONB,
    args JSONB,
    inputs TEXT[]
)
Argument | Description | Example
model | Model configuration, including name and task. | '{"task": "text-generation", "model": "mistralai/Mixtral-8x7B-v0.1"}'::JSONB
args | Additional kwargs to pass to the pipeline. | '{"max_new_tokens": 50}'::JSONB
inputs | Array of prompts to pass to the model for inference. Each prompt is evaluated independently. | ARRAY['Once upon a time...']

Example

SELECT pgml.transform(
    task => '{
        "task": "text-generation",
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "model_type": "llama",
        "revision": "main",
        "device_map": "auto"
    }'::JSONB,
    inputs => ARRAY['AI is going to'],
    args => '{
        "max_new_tokens": 100
    }'::JSONB
);
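
Because each prompt produces a separate result, the returned JSONB can be unnested with standard Postgres JSON functions. A minimal sketch, assuming the function returns a JSONB array with one element per input prompt (the exact shape of each element depends on the task and model):

SELECT jsonb_array_elements(
    pgml.transform(
        task => '{
            "task": "text-generation",
            "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
        }'::JSONB,
        inputs => ARRAY['AI is going to', 'Postgres is'],
        args => '{"max_new_tokens": 20}'::JSONB
    )
) AS result;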

For reference, the model-based SQL call above is roughly equivalent to the following Python code using the Hugging Face transformers library:
import transformers

def transform(task, call, inputs):
    # Build a Hugging Face pipeline from the task/model configuration,
    # then run it on the inputs with the additional generation kwargs.
    return transformers.pipeline(**task)(inputs, **call)

transform(
    {
        "task": "text-generation",
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "model_type": "llama",
        "revision": "main",
    },
    {"max_new_tokens": 100},
    ['AI is going to change the world in the following ways:']
)

Guides

See also: LLM guides for more examples