The pgml.transform() function is the most powerful feature of PostgresML. It integrates open-source large language models, like Llama, Mixtral, and many more, allowing you to perform complex tasks on your data.
The models are downloaded from 🤗 Hugging Face which hosts tens of thousands of pre-trained and fine-tuned models for various tasks like text generation, question answering, summarization, text classification, and more.
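For example, analyzing the sentiment of a piece of text takes a single query. This is a minimal sketch using the task-based API described below; since no model is specified, the task's default Hugging Face model is used:

```postgresql
-- Minimal sketch: sentiment analysis with the task-based API.
-- No model is specified, so the default model for the
-- text-classification task is downloaded and used.
SELECT pgml.transform(
    task   => 'text-classification',
    inputs => ARRAY['PostgresML makes machine learning simple.']
);
```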
The pgml.transform() function comes in two flavors: task-based and model-based.
The task-based API automatically chooses a model based on the task:
pgml.transform(
task TEXT,
args JSONB,
inputs TEXT[]
)
| Argument | Description | Example | Required |
|----------|-------------|---------|----------|
| task | The name of a natural language processing task. | 'text-generation' | Required |
| args | Additional kwargs to pass to the pipeline. | '{"max_new_tokens": 50}'::JSONB | Optional |
| inputs | Array of prompts to pass to the model for inference. Each prompt is evaluated independently and a separate result is returned. | ARRAY['Once upon a time...'] | Required |
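For example, to generate text, letting the task pick its default model: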
SELECT *
FROM pgml.transform(
task => 'text-generation',
inputs => ARRAY['In a galaxy far far away']
);
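Or to translate from English to French: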
SELECT *
FROM pgml.transform(
task => 'translation_en_to_fr',
inputs => ARRAY['How do I say hello in French?']
);
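The optional args parameter forwards keyword arguments to the underlying pipeline. Here is a minimal sketch that caps the length of the generated text; max_new_tokens is assumed to be supported by the task's default text-generation model:

```postgresql
SELECT *
FROM pgml.transform(
    task   => 'text-generation',
    args   => '{"max_new_tokens": 50}'::JSONB,  -- keyword arguments for the pipeline
    inputs => ARRAY['Once upon a time...']
);
```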
The model-based API requires the name of the model and the task, passed as a JSON object. This allows it to be more generic and support more models:
pgml.transform(
model JSONB,
args JSONB,
inputs TEXT[]
)
| Argument | Description | Example |
|----------|-------------|---------|
| model | Model configuration, including name and task. | '{"task": "text-generation", "model": "mistralai/Mixtral-8x7B-v0.1"}'::JSONB |
| args | Additional kwargs to pass to the pipeline. | '{"max_new_tokens": 50}'::JSONB |
| inputs | Array of prompts to pass to the model for inference. Each prompt is evaluated independently. | ARRAY['Once upon a time...'] |
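For example, to generate text with a specific Llama model from the Hugging Face Hub: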
SELECT pgml.transform(
task => '{
"task": "text-generation",
"model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"model_type": "mistral",
"revision": "main",
"device_map": "auto"
}'::JSONB,
inputs => ARRAY['AI is going to'],
args => '{
"max_new_tokens": 100
}'::JSONB
);
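For comparison, this is roughly equivalent to the following Python code using the Hugging Face transformers library: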
import transformers

def transform(task, call, inputs):
    return transformers.pipeline(**task)(inputs, **call)

transform(
    {
        "task": "text-generation",
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "model_type": "mistral",
        "revision": "main",
    },
    {"max_new_tokens": 100},
    ['AI is going to change the world in the following ways:'],
)
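Because pgml.transform() is a regular SQL function, the inputs can also come straight from your tables. A hedged sketch, assuming a hypothetical reviews table with a TEXT column named body, that classifies every row in a single query:

```postgresql
-- Sketch only: "reviews" and its "body" column are hypothetical.
-- array_agg() collects the rows into the TEXT[] that pgml.transform() expects.
SELECT pgml.transform(
    task   => 'text-classification',
    inputs => array_agg(body)
) AS sentiment
FROM reviews;
```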
See also: LLM guides for more examples