OpenSourceAI is a drop-in replacement for OpenAI's chat completions endpoint.
Follow the installation section in getting-started. When done, set the KORVUS_DATABASE_URL environment variable to your PostgresML database URL:
export KORVUS_DATABASE_URL=postgres://user:pass@.db.cloud.postgresml.org:6432/pgml
Note that an alternative to setting the environment variable is passing the URL to the OpenSourceAI constructor:
const korvus = require("korvus");
const client = korvus.newOpenSourceAI(YOUR_DATABASE_URL);
import korvus
client = korvus.OpenSourceAI(YOUR_DATABASE_URL)
Our OpenSourceAI class provides four functions:
- chat_completions_create
- chat_completions_create_async
- chat_completions_create_stream
- chat_completions_create_stream_async
They all take the same arguments:
- model: a String or Object
- messages: an Array/List of Objects
- max_tokens: the maximum number of new tokens to produce. Default: none
- temperature: the temperature of the model. Default: 0.8
- n: the number of choices to create. Default: 1
- chat_template: a Jinja template to apply to the messages before tokenizing
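For reference, here is a minimal sketch combining the optional arguments in a single Python call. The max_tokens and n keyword names are assumed to mirror the argument list above (only temperature appears as a keyword in the examples below), and the return value is assumed to be a plain dict matching the JSON output shown later.
import korvus

client = korvus.OpenSourceAI()
results = client.chat_completions_create(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    [{"role": "user", "content": "Tell me a fun fact about Postgres."}],
    max_tokens=64,    # assumed keyword: stop after at most 64 new tokens
    temperature=0.5,  # lower temperature for more deterministic output
    n=2,              # assumed keyword: generate two independent choices
)
print(len(results["choices"]))  # 2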
The return types of the stream and non-stream variations match OpenAI's return types.
The following examples run through some common use cases.
Here is a simple example using meta-llama/Meta-Llama-3.1-8B-Instruct, one of the best 8-billion-parameter models at the time of writing.
const korvus = require("korvus");
const client = korvus.newOpenSourceAI();
const results = client.chat_completions_create(
  "meta-llama/Meta-Llama-3.1-8B-Instruct",
  [
    {
      role: "system",
      content: "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {
      role: "user",
      content: "How many helicopters can a human eat in one sitting?",
    },
  ],
);
console.log(results);
import korvus

client = korvus.OpenSourceAI()
results = client.chat_completions_create(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    [
        {
            "role": "system",
            "content": "You are a friendly chatbot who always responds in the style of a pirate",
        },
        {
            "role": "user",
            "content": "How many helicopters can a human eat in one sitting?",
        },
    ],
    temperature=0.85,
)
print(results)
{
  "choices": [
    {
      "index": 0,
      "message": {
        "content": "Ahoy, me hearty! As your friendly chatbot, I'd like to inform ye that a human cannot eat a helicopter in one sitting. Helicopters are not edible, as they are not food items. They are flying machines used for transportation, search and rescue operations, and other purposes. A human can only eat food items, such as fruits, vegetables, meat, and other edible items. I hope this helps, me hearties!",
        "role": "assistant"
      }
    }
  ],
  "created": 1701291672,
  "id": "abf042d2-9159-49cb-9fd3-eef16feb246c",
  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
  "object": "chat.completion",
  "system_fingerprint": "eecec9d4-c28b-5a27-f90b-66c3fb6cee46",
  "usage": {
    "completion_tokens": 0,
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
We don't charge per token, so OpenAI's “usage” metrics are not particularly relevant; the fields are returned as zeros. We'll be extending this data with more direct CPU/GPU resource utilization measurements for users who are interested, or who need to pass real usage-based pricing on to their own customers.
Notice there is a near one-to-one correspondence between the parameters and return type of OpenAI's chat.completions.create and our chat_completions_create.
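As an illustrative sketch of what migrating a call looks like (the OpenAI lines are shown only for comparison, and the model names are placeholders):
messages = [
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]

# Before, with OpenAI's Python client:
# from openai import OpenAI
# response = OpenAI().chat.completions.create(model="gpt-4o", messages=messages)

# After, with Korvus:
import korvus

client = korvus.OpenSourceAI()
response = client.chat_completions_create("meta-llama/Meta-Llama-3.1-8B-Instruct", messages)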
The best part of using open-source AI is the flexibility it gives you with models. Unlike OpenAI, we are not restricted to a few censored models; we have access to almost any model out there.
Here is an example of streaming with the popular meta-llama/Meta-Llama-3.1-8B-Instruct model.
const korvus = require("korvus");
const client = korvus.newOpenSourceAI();
const it = client.chat_completions_create_stream(
  "meta-llama/Meta-Llama-3.1-8B-Instruct",
  [
    {
      role: "system",
      content: "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {
      role: "user",
      content: "How many helicopters can a human eat in one sitting?",
    },
  ],
);
let result = it.next();
while (!result.done) {
  console.log(result.value);
  result = it.next();
}
import korvus

client = korvus.OpenSourceAI()
results = client.chat_completions_create_stream(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    [
        {
            "role": "system",
            "content": "You are a friendly chatbot who always responds in the style of a pirate",
        },
        {
            "role": "user",
            "content": "How many helicopters can a human eat in one sitting?",
        },
    ],
)
for c in results:
    print(c)
{
  "choices": [
    {
      "delta": {
        "content": "Y",
        "role": "assistant"
      },
      "index": 0
    }
  ],
  "created": 1701296792,
  "id": "62a817f5-549b-43e0-8f0c-a7cb204ab897",
  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
  "object": "chat.completion.chunk",
  "system_fingerprint": "f366d657-75f9-9c33-8e57-1e6be2cf62f3"
}
{
  "choices": [
    {
      "delta": {
        "content": "e",
        "role": "assistant"
      },
      "index": 0
    }
  ],
  "created": 1701296792,
  "id": "62a817f5-549b-43e0-8f0c-a7cb204ab897",
  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
  "object": "chat.completion.chunk",
  "system_fingerprint": "f366d657-75f9-9c33-8e57-1e6be2cf62f3"
}
We have truncated the output to two items.
Once again, notice there is a near one-to-one correspondence between the parameters and return type of OpenAI's chat.completions.create with the stream argument set to true and our chat_completions_create_stream.
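If you want the complete message rather than individual chunks, you can accumulate the deltas as they arrive. A minimal sketch, assuming each chunk is a plain dict matching the JSON shown above:
import korvus

client = korvus.OpenSourceAI()
results = client.chat_completions_create_stream(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    [{"role": "user", "content": "How many helicopters can a human eat in one sitting?"}],
)

# Concatenate each chunk's delta into the full response text.
full_response = ""
for chunk in results:
    full_response += chunk["choices"][0]["delta"]["content"]
print(full_response)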
We also have asynchronous versions of the chat_completions_create and chat_completions_create_stream functions:
const korvus = require("korvus");
const client = korvus.newOpenSourceAI();
const results = await client.chat_completions_create_async(
  "meta-llama/Meta-Llama-3.1-8B-Instruct",
  [
    {
      role: "system",
      content: "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {
      role: "user",
      content: "How many helicopters can a human eat in one sitting?",
    },
  ],
);
console.log(results);
import korvus

client = korvus.OpenSourceAI()
# await requires an async context (e.g. inside an async def or an async REPL).
results = await client.chat_completions_create_async(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    [
        {
            "role": "system",
            "content": "You are a friendly chatbot who always responds in the style of a pirate",
        },
        {
            "role": "user",
            "content": "How many helicopters can a human eat in one sitting?",
        },
    ],
)
print(results)
{
  "choices": [
    {
      "index": 0,
      "message": {
        "content": "Ahoy, me hearty! As your friendly chatbot, I'd like to inform ye that a human cannot eat a helicopter in one sitting. Helicopters are not edible, as they are not food items. They are flying machines used for transportation, search and rescue operations, and other purposes. A human can only eat food items, such as fruits, vegetables, meat, and other edible items. I hope this helps, me hearties!",
        "role": "assistant"
      }
    }
  ],
  "created": 1701291672,
  "id": "abf042d2-9159-49cb-9fd3-eef16feb246c",
  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
  "object": "chat.completion",
  "system_fingerprint": "eecec9d4-c28b-5a27-f90b-66c3fb6cee46",
  "usage": {
    "completion_tokens": 0,
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
Notice the return types for the sync and async variations are the same.
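Because the async variants return awaitables, they compose naturally with asyncio. A minimal sketch, assuming you want several completions to run concurrently:
import asyncio
import korvus

async def main():
    client = korvus.OpenSourceAI()
    messages = [{"role": "user", "content": "How many helicopters can a human eat in one sitting?"}]
    # Issue three requests concurrently and wait for all of them to finish.
    tasks = [
        client.chat_completions_create_async("meta-llama/Meta-Llama-3.1-8B-Instruct", messages)
        for _ in range(3)
    ]
    results = await asyncio.gather(*tasks)
    print(len(results))  # 3

asyncio.run(main())
The asynchronous streaming variant works the same way: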
const korvus = require("korvus");
const client = korvus.newOpenSourceAI();
const it = await client.chat_completions_create_stream_async(
  "meta-llama/Meta-Llama-3.1-8B-Instruct",
  [
    {
      role: "system",
      content: "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {
      role: "user",
      content: "How many helicopters can a human eat in one sitting?",
    },
  ],
);
let result = await it.next();
while (!result.done) {
  console.log(result.value);
  result = await it.next();
}
import korvus

client = korvus.OpenSourceAI()
# await requires an async context (e.g. inside an async def or an async REPL).
results = await client.chat_completions_create_stream_async(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    [
        {
            "role": "system",
            "content": "You are a friendly chatbot who always responds in the style of a pirate",
        },
        {
            "role": "user",
            "content": "How many helicopters can a human eat in one sitting?",
        },
    ],
)
async for c in results:
    print(c)
{
  "choices": [
    {
      "delta": {
        "content": "Y",
        "role": "assistant"
      },
      "index": 0
    }
  ],
  "created": 1701296792,
  "id": "62a817f5-549b-43e0-8f0c-a7cb204ab897",
  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
  "object": "chat.completion.chunk",
  "system_fingerprint": "f366d657-75f9-9c33-8e57-1e6be2cf62f3"
}
{
  "choices": [
    {
      "delta": {
        "content": "e",
        "role": "assistant"
      },
      "index": 0
    }
  ],
  "created": 1701296792,
  "id": "62a817f5-549b-43e0-8f0c-a7cb204ab897",
  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
  "object": "chat.completion.chunk",
  "system_fingerprint": "f366d657-75f9-9c33-8e57-1e6be2cf62f3"
}
We have truncated the output to two items.
We have tested the following models and verified that they work with OpenSourceAI:
- meta-llama/Meta-Llama-3.1-8B-Instruct
- meta-llama/Meta-Llama-3.1-70B-Instruct
- microsoft/Phi-3-mini-128k-instruct
- mistralai/Mixtral-8x7B-Instruct-v0.1
- mistralai/Mistral-7B-Instruct-v0.2
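Swapping between these is just a matter of changing the model string. For example, to use Mistral instead:
import korvus

client = korvus.OpenSourceAI()
results = client.chat_completions_create(
    "mistralai/Mistral-7B-Instruct-v0.2",  # any model from the list above
    [{"role": "user", "content": "How many helicopters can a human eat in one sitting?"}],
)
print(results)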