Text-to-Text Generation

Text-to-text generation methods, such as T5 (Text-to-Text Transfer Transformer), are neural network architectures designed to perform a variety of natural language processing tasks, including summarization, translation, and question answering. T5 is a transformer-based architecture pre-trained on a large corpus of text using a denoising autoencoding objective. Pre-training teaches the model general language patterns that transfer across tasks; the model can then be fine-tuned on a task-specific dataset to perform a particular downstream task.

Translation

SELECT pgml.transform(
    task => '{
        "task" : "text2text-generation"
    }'::JSONB,
    inputs => ARRAY[
        'translate from English to French: I''m very happy'
    ]
) AS answer;

Result

[
    {"generated_text": "Je suis très heureux"}
]
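
The translate prefix above is one of several task prefixes that T5-style models recognize. As a minimal sketch (assuming the default model is T5-style and therefore understands the summarize: prefix; the generated text will vary by model), the same call can perform summarization:

SELECT pgml.transform(
    task => '{
        "task" : "text2text-generation"
    }'::JSONB,
    inputs => ARRAY[
        -- T5-style prompting: the task prefix selects the behavior
        'summarize: T5 casts every NLP problem as text-to-text, so a single pre-trained model can translate, summarize, and answer questions once fine-tuned.'
    ]
) AS answer;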

As with other tasks, we can specify a model for text-to-text generation.

SELECT pgml.transform(
    task => '{
        "task" : "text2text-generation",
        "model" : "bigscience/T0"
    }'::JSONB,
    inputs => ARRAY[
        'Is the word ''table'' used in the same meaning in the two previous sentences? Sentence A: you can leave the books on the table over there. Sentence B: the tables in this book are very hard to read.'
    ]
) AS answer;
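
Generation can also be tuned per call. A minimal sketch, assuming pgml.transform's args parameter forwards keyword arguments to the underlying Hugging Face pipeline, as it does for other tasks; max_length here caps the length of the generated sequence:

SELECT pgml.transform(
    task => '{
        "task" : "text2text-generation"
    }'::JSONB,
    inputs => ARRAY[
        'translate from English to French: See you tomorrow'
    ],
    -- assumed: generation kwargs are passed through to the pipeline
    args => '{
        "max_length" : 32
    }'::JSONB
) AS answer;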