Overview

Installation

JavaScript:
npm i pgml

Python:
pip install pgml

Example

Once the SDK is installed, you can use the following example to get started.

Create a collection

JavaScript:

const pgml = require("pgml");
const main = async () => { // Open the main function
  const collection = pgml.newCollection("sample_collection");

Python:

from pgml import Collection, Pipeline
import asyncio

async def main(): # Start of the main function
    collection = Collection("sample_collection")

Explanation:

  • The code imports the pgml module.
  • It creates an instance of the Collection class, to which we will add pipelines and documents.

Create a pipeline

Continuing with main

JavaScript:

  const pipeline = pgml.newPipeline("sample_pipeline", {
    text: {
      splitter: { model: "recursive_character" },
      semantic_search: {
        model: "intfloat/e5-small",
      },
    },
  });
  await collection.add_pipeline(pipeline);

Python:

    pipeline = Pipeline(
        "sample_pipeline",
        {
            "text": {
                "splitter": {"model": "recursive_character"},
                "semantic_search": {
                    "model": "intfloat/e5-small",
                },
            },
        },
    )
    await collection.add_pipeline(pipeline)

Explanation:

  • The code constructs a pipeline called "sample_pipeline" and adds it to the collection we initialized above. This pipeline automatically generates chunks and embeddings for the text key of every upserted document.
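For intuition, the "recursive_character" splitter breaks long text into chunks by trying coarse separators first (paragraphs, then lines, then words) and falling back to finer ones. Below is a simplified, self-contained sketch of that idea only; it is not the pgml implementation, and the size limit and separators are illustrative:

```python
def recursive_split(text, max_len=15, separators=("\n\n", "\n", " ")):
    """Toy recursive-character splitter: split on the coarsest separator
    present, then greedily pack pieces into chunks of at most max_len."""
    if len(text) <= max_len:
        return [text]
    for sep in separators:
        if sep in text:
            pieces = text.split(sep)
            break
    else:
        # No separator left: hard cut at max_len
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    chunks, current = [], ""
    for piece in pieces:
        candidate = f"{current}{sep}{piece}" if current else piece
        if len(candidate) <= max_len:
            current = candidate
        else:
            if current:
                chunks.append(current)
            if len(piece) <= max_len:
                current = piece
            else:
                # Piece is still too long: recurse with finer separators
                sub = recursive_split(piece, max_len, separators)
                chunks.extend(sub[:-1])
                current = sub[-1]
    if current:
        chunks.append(current)
    return chunks

print(recursive_split("the quick brown fox jumps over the lazy dog"))
# ['the quick brown', 'fox jumps over', 'the lazy dog']
```

Each chunk is then embedded by the model configured under semantic_search.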

Upsert documents

Continuing with main

JavaScript:

  const documents = [
    {
      id: "Document One",
      text: "document one contents...",
    },
    {
      id: "Document Two",
      text: "document two contents...",
    },
  ];
  await collection.upsert_documents(documents);

Python:

    documents = [
        {
            "id": "Document One",
            "text": "document one contents...",
        },
        {
            "id": "Document Two",
            "text": "document two contents...",
        },
    ]
    await collection.upsert_documents(documents)

Explanation:

  • This code creates and upserts some filler documents.
  • As mentioned above, the pipeline added earlier automatically runs and generates chunks and embeddings for each document.
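Upsert here means insert-or-update: a document with a new id is inserted, while a document whose id already exists replaces the stored version instead of creating a duplicate. A plain-Python sketch of that semantics (an illustration, not pgml code):

```python
# Toy in-memory store illustrating upsert semantics: records are keyed
# on "id", so a repeated id updates the record rather than duplicating it.
store = {}

def upsert_documents(docs):
    for doc in docs:
        store[doc["id"]] = doc  # insert a new id, or overwrite an existing one

upsert_documents([{"id": "Document One", "text": "first draft"}])
upsert_documents([
    {"id": "Document One", "text": "revised contents"},  # updates, not duplicates
    {"id": "Document Two", "text": "document two contents..."},
])
print(len(store))                     # 2
print(store["Document One"]["text"])  # revised contents
```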

Query documents

Continuing with main

JavaScript:

  const results = await collection.vector_search(
    {
      query: {
        fields: {
          text: {
            query: "Something about a document...",
          },
        },
      },
      limit: 2,
    },
    pipeline,
  );
  console.log(results);
  await collection.archive();
}; // Close the main function

Python:

    results = await collection.vector_search(
        {
            "query": {
                "fields": {
                    "text": {
                        "query": "Something about a document...",
                    },
                },
            },
            "limit": 2,
        },
        pipeline,
    )
    print(results)
    await collection.archive()
    # End of the main function

Explanation:

  • The vector_search method performs a vector-based search on the collection. The query string is "Something about a document...", and the top 2 results are requested.
  • The search results are printed to the screen.
  • Finally, the archive method is called to archive the collection.

Call the main function

JavaScript:

main().then(() => {
  console.log("Done with PostgresML demo");
});

Python:

if __name__ == "__main__":
    asyncio.run(main())

Running the Code

Open a terminal or command prompt and navigate to the directory where the file is saved.

Execute the following command:

JavaScript:
node vector_search.js

Python:
python3 vector_search.py

You should see the search results printed in the terminal.

[
  {
    "chunk": "document one contents...",
    "document": {"id": "Document One", "text": "document one contents..."},
    "score": 0.9034339189529419,
  },
  {
    "chunk": "document two contents...",
    "document": {"id": "Document Two", "text": "document two contents..."},
    "score": 0.8983734250068665,
  },
]
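Each entry pairs a matched chunk with its parent document and a similarity score (higher means more similar). A small sketch of post-processing such a result list, e.g. filtering by a score threshold; the results literal mirrors the sample output above, and the threshold value is arbitrary:

```python
# Sample results, copied from the output shown above
results = [
    {"chunk": "document one contents...",
     "document": {"id": "Document One", "text": "document one contents..."},
     "score": 0.9034339189529419},
    {"chunk": "document two contents...",
     "document": {"id": "Document Two", "text": "document two contents..."},
     "score": 0.8983734250068665},
]

# Keep only matches at or above the threshold, best first
threshold = 0.9
strong = sorted((r for r in results if r["score"] >= threshold),
                key=lambda r: r["score"], reverse=True)
for r in strong:
    print(f'{r["document"]["id"]}: {r["score"]:.3f}')  # Document One: 0.903
```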