Extractive Question Answering

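This section walks through JavaScript and Python snippets that perform end-to-end extractive question answering: a vector search retrieves a context passage, and a question-answering model then extracts the answer from it.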

Imports and Setup

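Both snippets import the pgml SDK and dotenv for loading environment variables. The Python snippet also imports load_dataset from the datasets library to fetch SQuAD, and Builtins, which is used later to run the question-answering transform.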

const pgml = require("pgml");
require("dotenv").config();

from pgml import Collection, Model, Splitter, Pipeline, Builtins
from datasets import load_dataset
from dotenv import load_dotenv

Initialize Collection

A collection is created to hold context passages.

const collection = pgml.newCollection("my_javascript_eqa_collection");

collection = Collection("squad_collection")

Create Pipeline

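A pipeline pairs a splitter, which breaks documents into chunks, with a model, which embeds those chunks. It is created and then added to the collection.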

const pipeline = pgml.newPipeline(
  "my_javascript_eqa_pipeline",
  pgml.newModel(),
  pgml.newSplitter(),
);
await collection.add_pipeline(pipeline);

model = Model()
splitter = Splitter()
pipeline = Pipeline("squadv1", model, splitter)
await collection.add_pipeline(pipeline)
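
To give a feel for what the splitter half of a pipeline does before the model embeds each chunk, here is a minimal Python sketch that cuts a passage into overlapping character windows. It is a simplified stand-in and not the SDK's own splitting algorithm; the function name split_text and the chunk_size and overlap parameters are hypothetical.

```python
def split_text(text, chunk_size=200, overlap=50):
    """Split text into character windows that overlap by `overlap` characters.

    Illustrative only: a real splitter may cut on separators or tokens instead.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = split_text("abcdefghij", chunk_size=4, overlap=2)
# chunks == ["abcd", "cdef", "efgh", "ghij"]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.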

Upsert Documents

Context passages from SQuAD are upserted into the collection.

const documents = [
  {
    id: "...",
    text: "...",
  }
];
await collection.upsert_documents(documents);

# Request a single split so that iterating yields records, not split names.
data = load_dataset("squad", split="train")
documents = [
    {"id": ..., "text": ...}
    for r in data
]
await collection.upsert_documents(documents)
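
SQuAD attaches several questions to the same passage, so it can help to keep only one document per unique context before upserting. The following is a sketch of that preparation step, assuming each record carries id and context fields as rows of the Hugging Face squad dataset do. The helper name build_documents and the sample records are hypothetical, and the choice of which fields map to id and text is an assumption rather than something the snippet above specifies.

```python
def build_documents(records):
    """Turn SQuAD-style records into {"id", "text"} documents, one per unique context."""
    seen = set()
    documents = []
    for record in records:
        context = record["context"]
        if context in seen:
            continue  # several questions share the same passage
        seen.add(context)
        documents.append({"id": record["id"], "text": context})
    return documents


# Hypothetical sample records standing in for rows of the dataset.
sample = [
    {"id": "q1", "context": "Passage A.", "question": "First question?"},
    {"id": "q2", "context": "Passage A.", "question": "Second question?"},
    {"id": "q3", "context": "Passage B.", "question": "Third question?"},
]
documents = build_documents(sample)
# documents == [{"id": "q1", "text": "Passage A."}, {"id": "q3", "text": "Passage B."}]
```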

Query for Context

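A vector search over the pipeline's embeddings retrieves the passages most similar to the question. Their text becomes the context for the next step.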

const queryResults = await collection
  .query()
  .vector_recall(query, pipeline)
  .fetch_all();
const context = queryResults
  .map(result => result[1])
  .join("\n");

results = await (
    collection.query()
    .vector_recall(query, pipeline)
    .fetch_all()
)
# Each result holds the passage text at index 1; collapse its whitespace.
context = " ".join(results[0][1].split())
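
The two snippets assemble the context differently: the JavaScript version joins every result, while the Python version uses only the first. A small helper can make that choice explicit. This is a sketch that assumes each result is a sequence with the passage text at index 1, as both snippets imply; the (score, text, metadata) layout, the helper name build_context, and the max_passages parameter are assumptions for illustration.

```python
def build_context(results, max_passages=1):
    """Join the text of the top results into one whitespace-normalized string."""
    passages = []
    for result in results[:max_passages]:
        text = result[1]  # assumed layout: (score, text, metadata)
        passages.append(" ".join(text.split()))  # collapse runs of whitespace
    return "\n".join(passages)


# Hypothetical results standing in for what a vector search might return.
sample_results = [
    (0.91, "  First   passage\nwith odd spacing. ", {}),
    (0.75, "Second passage.", {}),
]
context = build_context(sample_results, max_passages=2)
# context == "First passage with odd spacing.\nSecond passage."
```

Passing more passages gives the question-answering model more text to search, but extractive models accept a limited input length, so a small max_passages is usually the safer default.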

Query for Answer

The question and the retrieved context are passed to a question-answering model, which extracts the answer from the context.

const builtins = pgml.newBuiltins();
// The question is held in the query variable used for vector recall.
const answer = await builtins.transform("question-answering", [
  JSON.stringify({ question: query, context }),
]);

import json

builtins = Builtins()
# Serialize the input to a JSON string, matching the JavaScript snippet.
answer = await builtins.transform(
    "question-answering",
    [json.dumps({"question": query, "context": context})]
)
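
The text above does not show what the transform returns. As a sketch, the helpers below build the JSON input and read the answer back, assuming the output follows the shape produced by Hugging Face question-answering pipelines (a dict with answer, score, start, and end keys, possibly wrapped in a list). That shape, the helper names, and the min_score threshold are all assumptions, so inspect the actual return value before relying on them.

```python
import json


def build_payload(question, context):
    """Serialize one question/context pair as a JSON string."""
    return json.dumps({"question": question, "context": context})


def extract_answer(output, min_score=0.0):
    """Return the answer text, or None if it is missing or scores below min_score.

    Assumes `output` is a dict like {"answer": ..., "score": ...} or a list of them.
    """
    if isinstance(output, list):
        if not output:
            return None
        output = output[0]
    if output.get("score", 0.0) < min_score:
        return None
    return output.get("answer")


payload = build_payload("What color is the sky?", "The sky is blue.")
# Hypothetical model output, written by hand for illustration.
sample_output = {"answer": "blue", "score": 0.9, "start": 11, "end": 15}
answer_text = extract_answer(sample_output, min_score=0.5)
# answer_text == "blue"
```

Filtering on the score gives the caller a way to report "no answer found" instead of returning a low-confidence span.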