ENGINEERING

Implementing Semantic Search in Postgres in 15 Minutes

How to implement semantic search in Postgres with nothing but SQL.
By Silas Marvin
06/18/2024


Semantic search uses machine learning to understand the meaning of text by converting it into numerical vectors, allowing for more accurate and context-aware search results.

When users are unsure of the exact terms to search for, semantic search can uncover relevant information that traditional keyword searches might miss. This capability is particularly valuable for discovering content based on the intent and context of the search query, rather than relying solely on precise word matches.

Semantic search is not a replacement for keyword search. In many cases, keyword search can outperform it: if a user knows the exact keywords they want to match in a document, keyword search is faster and guaranteed to return the correct result, whereas semantic search is only likely to return it. The most robust search systems combine the two. This technique, called hybrid search, ultimately delivers the most accurate results and the best user experience.

Semantic search is not just for machine learning engineers. The system behind semantic search is relatively easy to implement, and thanks to new Postgres extensions like pgml and pgvector, it is readily available to SQL developers. Just as modern SQL developers are expected to be familiar with and capable of implementing keyword search, they will soon be expected to implement semantic search as well.
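If you want to follow along on your own Postgres instance rather than a managed PostgresML deployment (an assumption about your setup), both extensions need to be installed on the server and enabled in your database. A minimal setup sketch:

-- Enable the extensions used throughout this post.
-- Assumes the pgml and pgvector packages are already installed on the server;
-- on PostgresML's cloud they are available out of the box.
CREATE EXTENSION IF NOT EXISTS pgml;
CREATE EXTENSION IF NOT EXISTS vector;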

For more on hybrid search techniques, check out our blog post, How to Improve Search Results with Machine Learning.

Embeddings 101

Semantic search is powered by embeddings. To understand how semantic search works, we must have a basic understanding of embeddings.

Embeddings are vectors (arrays of floating point numbers). Given some text and an embedding model, we can convert that text to a vector:

SELECT pgml.embed('mixedbread-ai/mxbai-embed-large-v1', 'Generating embeddings in Postgres is fun!');
{-0.12269165,0.79433846,0.1909454,-0.8607215,-0.5526149,-0.48317516,0.48356333,0.40197256,0.6542712,0.20637313,0.68719935,-0.11798598,0.3924242,-0.3669872,-0.37829298,-0.57285887,-0.42399693,-0.57672346,-0.5584913,-0.25157344,-0.26103315,0.8435066,-1.3652948,-0.060239665,0.053472117,0.61965233,0.70429814,0.21168475,2.1243148,0.54657197,0.44898787,0.5141667,0.25056657,-0.7296713,-0.21511579,-0.26193422,0.18050511,0.42497447,0.10701023,-0.47321296,0.88108975,-0.23380123,0.097806804,-0.7617625,-1.7238936,0.0734859,0.5393925,0.08824284,0.6490631,-0.6999467,-0.04020539,0.34580526,-0.22457539,-0.1596002,0.30769205,0.10054478,-0.21030527,-0.6795052,-0.49133295,0.64051557,0.729387,-0.28649548,0.6304755,-1.2938358,0.18542609,-0.1447736,0.26269862,-0.7243509,-0.3743654,0.32034853,-0.033665977,-0.101480104,-0.40238166,-0.13823868,-0.08293891,0.18822464,0.614725,-0.51620704,-0.9493647,0.34618157,-0.045119785,0.5292574,0.24998534,0.50182945,-0.66819376,-0.69498116,1.0365546,0.7618454,0.22734495,-0.3371644,0.18830177,0.65933335,0.90198004,0.62203044,-0.18297921,0.80193377,-0.3250604,0.7243765,0.42883193,0.21042423,-0.01517533,0.5617572,-0.1593908,0.25845265,-0.07747603,0.4637758,0.3156056,-0.8067281,0.20704024,0.26316988,0.26273122,-0.32277155,0.16489738,-0.025123874,-0.8421937,0.42238364,-0.20360216,0.7395353,-0.28297424,-0.58514386,-1.1276962,-0.57587785,0.7367427,-1.183229,-0.17403314,-1.3642671,0.06204233,0.83101535,-0.8367251,0.4434241,0.13569412,-0.5018109,-0.24702606,0.2925449,-0.30402657,0.30018607,-0.8272239,0.7552851,0.71613544,-0.5800097,0.4300131,-0.3769249,0.15121885,1.4300121,-0.70190847,-0.014502372,1.1501042,-0.91252214,-1.299539,1.5988679,0.29511172,-0.3301541,0.10612632,0.48639655,-0.67100185,-0.18592787,-0.0610746,-0.40246755,0.34081936,0.26820442,-0.1269026,-0.02156586,0.10375944,0.6626627,-0.18523005,0.96837664,-0.5868682,0.081125714,-0.62061644,-1.010315,-0.18992952,-0.034805447,0.3482115,0.10850326,0.7015801,1.181063,0.51085556,-0.3421162,1.1605215,0.34367874,-0.45851547,-0.23464307,0.22397688,0.5295375,-0.067920305,0.38869885,-0.764097,0.08183036,-0.74270236,0.1314034,-0.09241337,0.7889378,-0.4487391,0.2671574,-0.057286393,0.23383318,-0.64422816,0.31305853,-0.5284081,-0.8764228,-1.0072867,0.7426642,0.20632008,0.19519271,-0.20781143,-0.55022776,-0.7449971,0.8095787,-1.1823708,-0.12114787,0.7764435,-0.4102213,-0.5614735,-1.151166,0.453138,-0.124295816,-0.7787184,0.8213192,0.19523725,-0.3429081,-0.5960741,0.05939262,0.6634549,-0.10354193,-0.16674386,0.23894079,0.5281129,0.4417929,-0.052335966,0.26073328,-0.5175538,0.43219882,0.42117482,0.9145017,0.62297195,0.5059562,1.0199716,0.33026397,0.10540544,1.4194826,0.2387192,-0.24473047,-0.12635238,0.38584706,0.06950318,0.13178644,0.4950382,0.58716995,-0.22241667,0.28335956,-1.4205463,-0.37189013,-0.006335424,0.674547,-0.35189858,-0.06895771,0.33660728,0.6581518,-0.5726849,0.20706958,-0.63431185,0.55616635,-0.3150213,0.18246625,0.6179018,0.3199304,0.1705371,0.40476194,-0.49592853,-0.00519022,-0.98531955,-0.8100823,-0.58652925,0.10230886,-0.7235388,-0.6156084,0.2809807,-0.2967379,-0.3508671,-1.1141659,-0.22769807,0.08822136,-0.23333925,0.6282077,1.0215682,0.38222972,-1.1630126,0.4021485,-0.064744614,1.0170162,-0.6086199,0.32332307,0.3160495,0.37213752,0.23822482,-0.24534902,-0.35759526,0.16281769,0.20119011,-0.7505329,-0.53170776,0.52023965,0.34757367,-0.3365119,-1.090554,0.74303913,0.7576997,0.1850476,0.38377324,0.6341742,0.0035892723,0.17847057,-0.52225345,0.4744198,-0.7825479,0.85714924,1.2160783,0.05176344,-0.34153363,-0.9228027,-0.457012
92,-0.31697652,0.18669243,-0.080539,-0.97618884,0.44975403,0.12266389,-1.5476696,0.10114262,0.2652986,-0.6647504,-0.11139665,0.09672374,0.3067969,0.124992974,-0.075039916,-0.945483,-0.08019136,0.33150327,0.79691124,0.32509813,-0.7345915,0.49151382,0.8019188,0.054724086,0.3824057,0.54616,-1.338427,-0.17915602,0.29255223,-0.1312647,0.17714119,0.9686431,0.5271556,-0.09237713,-0.14801571,-0.8311881,0.4603313,1.173417,-0.17329413,1.1544656,1.2609864,0.6680077,-0.7116551,-0.26211533,-0.6321865,-0.4512319,0.30350694,0.7740681,-1.0377058,0.5507171,0.08685625,-0.4665991,1.0912793,-0.4253514,-1.3324647,0.6247509,0.17459206,0.64427835,-0.1543753,-0.4854082,0.42142552,0.41042453,0.80998975,-0.025750212,0.8487763,0.29716644,-0.8283788,-0.702183,-0.15909031,-0.4065299,1.064912,-0.25737965,-0.22743805,-1.1570827,0.17145145,0.38430393,0.82506144,0.46196732,-0.101009764,0.7100557,0.37232363,0.2594003,0.19210479,0.36719602,0.75960565,-0.65713775,0.23913959,0.692282,-0.41791838,0.47484493,0.17821907,-0.60062724,0.29957938,-0.11593854,0.32937768,-0.45972684,0.01129646,0.18534593,0.62680054,-0.028435916,0.251009,-0.71900076,0.44056803,0.16914998,-1.0019057,-0.55680645,0.059508275,0.20963086,0.06784629,0.07168728,-0.93063635,-0.045650747,-0.007684426,-0.7944553,0.79666996,0.9232027,-0.0643565,0.6617379,-1.1071137,0.35533053,-0.5851006,0.7480103,0.18149409,0.42977095,0.28515843,-0.29686522,0.9553224,0.7197761,-0.6413751,-0.17099445,-0.544606,0.06221392,-0.24136083,-0.5460586,-0.40875596,-0.057024892,-0.31573594,-0.01389576,-0.010156465,0.5784532,-0.44803303,0.38007888,-0.38199085,-0.43404552,0.91768897,-0.09181415,-0.44456294,0.28143787,0.6168798,-0.34374133,0.43424013,0.39190337,-0.56925493,0.8975914,-0.27520975,0.82481575,-0.16046512,-0.21151508,0.013323051,-0.60130703,0.19633308,-0.07837379,-0.16391036,-0.80348927,-1.6232564,-0.123514965,-0.15926442,-0.9025081,0.47055957,-0.078078784,-0.30613127,1.0725194,-0.5127652,-0.26803625,0.2473333,-0.43352637,0.26197925,0.47239286,0.3917152,0.13200012,-0.021115797,-1.3560157,-0.15067065,-0.23412828,0.24189733,-0.7706759,-0.3094795,-0.17276037,0.11040486,-1.122779,-0.8549858,-0.8815358,0.36725566,0.4391438,0.14913401,-0.044919793,-0.90855205,-1.2868156,0.86806804,0.013447602,-1.3518908,-1.0878333,1.1056291,-0.6054898,0.8732615,0.090048715,0.3439396,-0.43436176,-1.4296948,0.21427931,-0.56683505,-0.7287918,-0.66875815,-1.2414092,0.14564492,0.14575684,1.6843026,-0.7691825,-0.8857156,-0.59383214,0.1526336,-0.40446484,-0.093765385,-0.57902026,0.7115043,-0.2987314,1.4434578,-0.7507225,-0.14864576,0.09993563,0.3642726,0.39022216,1.4126799,-0.39582014,-0.46609184,-0.119693935,-0.7797329,0.8846008,-0.008525363,-1.1169624,0.28791374,-0.64548826,-0.14354923,-0.9195319,0.5042809,-0.64800096,-0.566263,0.31473473,-1.3200041,0.066968784,-1.2279652,0.6596321,-0.22676139,0.05292237,-0.44841886,-0.14407255,-1.1879731,-0.9624812,0.3520917,-0.8199045,-0.23614404,0.057054248,0.2774532,0.56673276,-0.68772894,0.8464806,1.0946864,0.7181479,-0.08149687,-0.033113156,-0.45337513,0.6593971,0.040748913,0.25708768,0.2444611,-0.6291184,0.2154976,-1.0344702,-0.57461023,-0.22907877,0.20212884,1.5542895,-0.69493115,0.76096123,-0.27198875,-0.28636566,-0.80702794,-0.09504783,0.5880213,0.52442694,0.88963073,-0.113876544,0.44108576,0.5131936,-0.51199615,-0.5373556,-0.50712276,0.7119059,0.26809675,-0.624161,0.50190353,0.45905492,-0.7560234,-0.36166972,-0.11057704,-0.93385667,0.14702824,-0.5007164,0.062319282,0.14635088,-0.60926783,0.44830725,0.5508014,-0.18144712,0.8553549,0.4763656,-0.06791675,-0.7282673,0.5
312333,0.29696235,-0.32435995,0.11339427,-0.3156661,0.21376118,0.101174735,0.49239466,0.31915516,0.7523039,0.015413809,1.1970537,1.2595433,0.7877007,-0.77948576,-0.07308315,-0.005401653,-0.9297423,-0.6518283,-0.5235209,-0.08294889,-0.32686272,0.81800294,0.28346354,0.23243074,1.211297,0.5740814,-0.23115727,-1.0199192,-0.11423441,-1.2686234,-0.3610325,-0.13443044,-0.09186939,-0.46258482,-0.2746501,0.039179135,-0.6018465,-0.8123009,0.65863043,-1.4951158,0.04137505,-0.39956668,-0.21086998,-0.16921428,-0.12892427,-0.07058203,0.22937924,0.1872652,0.24946518,0.06469146,0.69964784,-0.14188632,0.57223684,0.26891342,-0.27864167,-0.5591145,-0.79737157,-1.0706135,-0.2231602,-1.108503,-0.34735858,-0.032272782,-0.38188872,0.32032675,0.6364613,-0.38768604,-1.1507906,-0.913829,0.36491016,0.25496644,-0.06781126,-0.84842575,0.0793298,0.0049917502,0.07099934,-0.5054571,-0.55416757,-0.4953387,0.47616813,0.13400371,1.3912268,0.30719018,-0.16337638,0.18637846,-0.19401097,0.71916217,-0.21031788,0.61066073,-0.43263736,-0.54376316,-0.36609605,0.30756727,0.3625213,0.30662173,-0.109407134,-0.26726124,-0.10782864,-0.5728887,0.35624364,0.23127197,1.0006613,-0.18430339,0.24659279,-0.1414664,-0.9362831,-0.14328903,-0.76222867,-1.6322204,-0.23277596,1.1940688,-0.5248364,0.6987823,0.36069974,-0.38930154,0.31739354,0.8688939,0.25019056,-0.45539424,0.5829257,-0.35556546,-0.23837212,-0.74019665,-0.49967116,0.20733729,0.18190496,-0.84233344,-0.9670267,0.29291785,0.18208896,0.26272357,0.076004505,0.16490388,0.23035681,-0.05491554,-0.35777965,-0.06495173,0.84074193,-0.06649489,0.5308439,-0.27389482,0.52712023,-0.70385605,1.582289,0.3533609,0.6537309,-0.11627128,1.1282475,-0.12714477,0.61138934,1.0615714,0.6239467,0.54578096,-0.56903726,-0.09996867,0.29148775,0.4719238,0.52982926,-0.122312695,-0.59448034,1.1922164,-0.102847695,0.015887707,-0.46900386,0.9373753,0.5174408,0.107704684,0.33192438,-0.73113894,-0.07725855,-0.21073207,-0.53892136,-0.41692436,0.04440565,-0.7362955,-0.18671799,-0.617404,0.11175289,-0.03757055,-0.9091465,-0.4772941,0.115955085,-0.109630615,0.27334505,-0.15329921,-0.40542892,0.6577188,-0.14270602,0.028438624,0.7158844,-0.04260146,0.14211391,0.36379516,-0.16956282,-0.32750866,0.7697329,-0.31624234,-0.81320703,-0.18005963,0.6081982,0.23052801,-0.20143141,0.24865282,-0.5117264,-0.64896625,-0.664304,0.4412688,-0.74262285,0.31758395,1.0110188,-0.0542792,-0.12961724,0.038787734,-0.019657299,0.3522628,0.88944745,0.7572078,0.4543937,0.31338966,2.1305785,0.11285806,0.9827753,0.4258123,0.46003717,0.01849649,-0.050423466,-0.7171815,-0.31475943,-0.48302308,-1.342478,0.017705658,0.3137204,0.43893284,-0.31969646,0.26008397,0.86090857,-0.9084142,0.47359383,1.2101759,0.25754166,0.071290456,-0.19756663,-0.07539108,-0.6719409,0.404817,-0.992041,0.48930237,0.83036274,-1.0315892,-0.06564829,0.00026013568,-0.43265438,-0.55953914,-0.06504767,-0.6801495,0.57494533,0.6398298,0.46862775,0.04649162,-0.70052904,-0.24009219,0.52453166,0.79875654,-0.09534484,0.82706153,0.96052814,0.1742728,0.057494655,-0.21722038,0.21895333,-0.15573184,0.5323167,-0.11215742,0.23329657,-0.566671,-0.7952302,0.31211463,0.40420142,0.32071197,-0.9692792,-0.27738753,0.35658348,-0.23604108,-0.5778135,-1.2452201,0.18487398,0.28343126,0.034852847,-0.42560938,-0.87293553,3.3916373,0.37104064,0.95921576,0.30020702,0.43176678,0.4746065,0.8066563,0.02344249,0.6768376,-1.243408,0.013419566,0.26038718,0.052325014,0.40021995,0.69684315,0.17993873,-0.6125471,0.39728552,0.1287264,-0.821042,-0.6356886,0.04368836,0.58837336,0.2951825,0.80620193,-0.55552566,-0.27555013,-0.
86757773,-0.33467183,0.07901353,0.20590094,0.095205106,0.5052767,-0.3156328,-0.054386012,0.29206502,-0.26267004,-1.1437016,0.037064184,0.5587826,-0.23018162,-0.9855164,0.007280944,-0.5550629,-0.46999946,0.58497715,-0.1522534,0.4508725,0.37664524,-0.72747505,-0.52117777,-0.8577786,0.77468944,-1.2249953,-0.85298705,-0.8583468,-0.5801342,-0.817326,0.16878682,1.3681034,-0.6309237,0.42270342,-0.11961653,0.36134583,0.459141,0.24535258,0.21466772,-0.45898587,-0.20054409,-0.92821646,-0.05238323,0.17994325,0.82358634,-1.1087554,0.55523217,-0.29262337,-0.7871331,0.7758087,-0.2988389,-0.14875472,-0.731297,-0.46911976,-0.5939936,0.39334157,-0.2833826,0.64205635,-0.21212497,0.31960186,0.25826675,0.94142056,-0.15007028,0.7186352,-0.13642757,0.4422678,-0.106289506}

We used the pgml.embed PostgresML function to generate an embedding of the sentence "Generating embeddings in Postgres is fun!" using the mixedbread-ai/mxbai-embed-large-v1 model from mixedbread.ai.

The output size of the vector varies by model; mxbai-embed-large-v1 outputs vectors with 1024 dimensions, meaning each vector contains 1024 floating point numbers.
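You can verify the dimensionality directly in SQL. pgml.embed returns an array, so a quick sanity check (a small sketch, not part of the original example) looks like this:

-- For mxbai-embed-large-v1 this returns 1024.
SELECT array_length(
    pgml.embed('mixedbread-ai/mxbai-embed-large-v1', 'Generating embeddings in Postgres is fun!'),
    1
) AS dimensions;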

The vector this model outputs is not random. It is designed to capture the semantic meaning of the text. In practice, this means that sentences which are closer together in meaning will be closer together in vector space.

Let's look at a simpler example. Assume we have a model called simple-embedding-model that outputs vectors with only 2 dimensions. Let's embed the following three phrases: "I like Postgres", "I like SQL" and "Rust is the best":

SELECT pgml.embed('simple-embedding-model', 'I like Postgres') AS embedding;
SELECT pgml.embed('simple-embedding-model', 'I like SQL') AS embedding;
SELECT pgml.embed('simple-embedding-model', 'Rust is the best') AS embedding;
embedding for 'I like Postgres'
---------
[0.1, 0.2]
embedding for 'I like SQL'
---------
[0.12, 0.25]
embedding for 'Rust is the best'
---------
[-0.8, -0.9]

You'll notice how much more similar the vectors produced by "I like Postgres" and "I like SQL" are to each other than to the vector for "Rust is the best". This is an artificial example, but the same idea holds true when translating to real models like mixedbread-ai/mxbai-embed-large-v1.

What does it mean to be "close"?

We can use the idea that text that is more similar in meaning will be closer together in the vector space to build our semantic search engine.

For instance let’s say that we have the following documents:

Document ID | Document text
------------+-----------------------------------------------------------------------------------------
          1 | The pgml.transform function is a PostgreSQL function for calling LLMs in the database.
          2 | I think tomatoes are incredible on burgers.

and a user is looking for the answer to the question: "What is the pgml.transform function?". If we embed the search query and all of the documents using a model like mixedbread-ai/mxbai-embed-large-v1, we can compare the query embedding to each document embedding and select the document whose embedding is closest in vector space, and therefore closest in meaning, to the query.

These embeddings have over a thousand dimensions, so we can't simply eyeball which one is closest. How do we actually measure the similarity (distance) between different vectors?

pgvector as of this writing supports four different measurements of vector similarity:

  • L2 distance
  • (negative) inner product
  • cosine distance
  • L1 distance

For most use cases, we recommend cosine distance, defined by the formula:

cosine_distance(A, B) = 1 - (A · B) / (‖A‖ ‖B‖)

where A and B are two vectors, A · B is their dot product, and ‖A‖ and ‖B‖ are their magnitudes.

This formula can look a little intimidating, but luckily pgvector provides the <=> operator, which computes the cosine distance for us:

SELECT '[1,2,3]'::vector <=> '[2,3,4]'::vector;
cosine_distance
----------------------
0.007416666029069763
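Applying the same operator to the toy two-dimensional embeddings from earlier makes "closeness" concrete. These vectors were made up for illustration, so only their relative distances matter:

-- "I like Postgres" vs. "I like SQL": nearly the same direction, so the distance is close to 0.
SELECT '[0.1, 0.2]'::vector <=> '[0.12, 0.25]'::vector AS similar_pair;
-- "I like Postgres" vs. "Rust is the best": nearly opposite directions, so the distance approaches 2.
SELECT '[0.1, 0.2]'::vector <=> '[-0.8, -0.9]'::vector AS dissimilar_pair;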

Other distance functions have similar formulas and provide convenient operators as well. It may be worth testing the other operators to see which performs better for your use case. For more information on the other distance functions, take a look at our Embeddings guide.
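For reference, each of the four distance measures listed above maps to its own pgvector operator. A quick sketch (the <+> L1 operator assumes pgvector 0.7 or newer):

SELECT
    '[1,2,3]'::vector <-> '[2,3,4]'::vector AS l2_distance,
    '[1,2,3]'::vector <#> '[2,3,4]'::vector AS negative_inner_product,
    '[1,2,3]'::vector <=> '[2,3,4]'::vector AS cosine_distance,
    '[1,2,3]'::vector <+> '[2,3,4]'::vector AS l1_distance;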

Going back to our search example, we can compute the cosine distance between our query embedding and our documents:

SELECT pgml.embed(
'mixedbread-ai/mxbai-embed-large-v1',
'What is the pgml.transform function?'
)::vector
<=>
pgml.embed(
'mixedbread-ai/mxbai-embed-large-v1',
'The pgml.transform function is a PostgreSQL function for calling LLMs in the database.'
)::vector AS cosine_distance;
SELECT pgml.embed(
'mixedbread-ai/mxbai-embed-large-v1',
'What is the pgml.transform function?'
)::vector
<=>
pgml.embed(
'mixedbread-ai/mxbai-embed-large-v1',
'I think tomatoes are incredible on burgers.'
)::vector AS cosine_distance;
cosine_distance
--------------------
0.1114425936213167
cosine_distance
--------------------
0.7328613577628744

You'll notice that the distance between "What is the pgml.transform function?" and "The pgml.transform function is a PostgreSQL function for calling LLMs in the database." is much smaller than the cosine distance between "What is the pgml.transform function?" and "I think tomatoes are incredible on burgers".

Making it fast!

It is inefficient to compute embeddings for all the documents every time we search the dataset as it takes a few milliseconds to generate an embedding. Instead, we should embed our documents once and search against precomputed embeddings.

pgvector provides us with the vector data type for storing embeddings in regular PostgreSQL tables:

CREATE TABLE text_and_embeddings (
    id SERIAL PRIMARY KEY,
    text TEXT,
    embedding vector(1024)
);
timer 12.547 ms

Let's add some data to our table:

INSERT INTO text_and_embeddings (text, embedding)
VALUES
(
'The pgml.transform function is a PostgreSQL function for calling LLMs in the database.',
pgml.embed(
'mixedbread-ai/mxbai-embed-large-v1',
'The pgml.transform function is a PostgreSQL function for calling LLMs in the database.'
)
),
(
'I think tomatoes are incredible on burgers.',
pgml.embed(
'mixedbread-ai/mxbai-embed-large-v1',
'I think tomatoes are incredible on burgers.'
)
);
timer 72.156 ms

Now that our table has some data, we can search over it using the following query. Note the third argument to pgml.embed: mxbai-embed-large-v1 recommends prefixing search queries (but not documents) with this retrieval prompt.

WITH query_embedding AS (
SELECT
pgml.embed(
'mixedbread-ai/mxbai-embed-large-v1',
'What is the pgml.transform function?',
'{"prompt": "Represent this sentence for searching relevant passages: "}'
)::vector embedding
)
SELECT
text,
(
SELECT
embedding
FROM query_embedding
) <=> text_and_embeddings.embedding cosine_distance
FROM
text_and_embeddings
ORDER BY cosine_distance
LIMIT 1;
timer 35.016 ms
text | cosine_distance
----------------------------------------------------------------------------------------+---------------------
The pgml.transform function is a PostgreSQL function for calling LLMs in the database. | 0.13467974993681486

This query is fast for now, but as we add more data to the table, it will slow down because we have not indexed the embedding column.

Let's demonstrate this by inserting 100,000 additional embeddings:

INSERT INTO text_and_embeddings (text, embedding)
SELECT
    md5(random()::text), -- random filler text
    pgml.embed(
        'mixedbread-ai/mxbai-embed-large-v1',
        md5(random()::text) -- another random string; we only need filler embeddings here
    )
FROM generate_series(1, 100000);
timer 3114242.499 ms

Now trying our search engine again:

WITH embedded_query AS (
    SELECT
        pgml.embed(
            'mixedbread-ai/mxbai-embed-large-v1',
            'What is the pgml.transform function?',
            '{"prompt": "Represent this sentence for searching relevant passages: "}'
        )::vector embedding
)
SELECT
    text,
    (
        SELECT
            embedding
        FROM embedded_query
    ) <=> text_and_embeddings.embedding cosine_distance
FROM
    text_and_embeddings
ORDER BY cosine_distance
LIMIT 1;
timer 138.252 ms
text | cosine_distance
----------------------------------------------------------------------------------------+---------------------
The pgml.transform function is a PostgreSQL function for calling LLMs in the database. | 0.13467974993681486
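If you want to confirm why the query slowed down, EXPLAIN is a quick check. With no index on the embedding column, we would expect Postgres to compute and sort the distance for every row in the table (the exact plan will vary by instance, so this is only a sketch):

EXPLAIN
WITH embedded_query AS (
    SELECT
        pgml.embed(
            'mixedbread-ai/mxbai-embed-large-v1',
            'What is the pgml.transform function?',
            '{"prompt": "Represent this sentence for searching relevant passages: "}'
        )::vector embedding
)
SELECT
    text,
    (SELECT embedding FROM embedded_query) <=> text_and_embeddings.embedding cosine_distance
FROM text_and_embeddings
ORDER BY cosine_distance
LIMIT 1;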

This somewhat less than ideal performance can be fixed by indexing the embedding column. There are two types of indexes available in pgvector: IVFFlat and HNSW.

An IVFFlat index clusters the table into sublists and, when searching, only scans a fixed number of those sublists. In our example, if we were to add an IVFFlat index with 10 lists:

CREATE INDEX ON text_and_embeddings
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 10);
timer 4989.398 ms

and search again, we would get much better performance:

WITH embedded_query AS (
    SELECT
        pgml.embed(
            'mixedbread-ai/mxbai-embed-large-v1',
            'What is the pgml.transform function?',
            '{"prompt": "Represent this sentence for searching relevant passages: "}'
        )::vector embedding
)
SELECT
    text,
    (
        SELECT
            embedding
        FROM embedded_query
    ) <=> text_and_embeddings.embedding cosine_distance
FROM
    text_and_embeddings
ORDER BY cosine_distance
LIMIT 1;
timer 44.508 ms
text | cosine_distance
----------------------------------------------------------------------------------------+---------------------
The pgml.transform function is a PostgreSQL function for calling LLMs in the database. | 0.13467974993681486

We can see a massive speedup because, with 10 lists and the default of probing a single list, we are only comparing our query against roughly 1/10th of the original vectors instead of all of them!
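The number of sublists scanned is tunable. pgvector probes a single list by default; raising ivfflat.probes trades some of that speed back for better recall. The value below is just an example:

-- Scan 3 of the 10 sublists per query instead of 1; higher values improve recall at the cost of speed.
SET ivfflat.probes = 3;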

HNSW (Hierarchical Navigable Small World) indexes are a bit more complicated. An HNSW index is essentially a layered graph whose edges connect vectors that are close together in vector space.

HNSW indexes typically offer better recall with faster queries, but they require more compute when building the index and adding new vectors. That being said, we recommend HNSW indexes for most use cases where writes are less frequent than reads.

DROP INDEX text_and_embeddings_embedding_idx;
CREATE INDEX ON text_and_embeddings
USING hnsw (embedding vector_cosine_ops);
timer 115564.303 ms
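HNSW indexes also expose tuning knobs. The values below illustrate the available parameters rather than recommend settings: m and ef_construction control how dense the graph is at build time, and hnsw.ef_search controls how much of the graph is explored per query.

-- Build-time parameters (shown here with pgvector's defaults, m = 16 and ef_construction = 64):
-- CREATE INDEX ON text_and_embeddings
-- USING hnsw (embedding vector_cosine_ops)
-- WITH (m = 16, ef_construction = 64);

-- Query-time parameter (default 40); higher values improve recall at the cost of speed.
SET hnsw.ef_search = 100;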

Now let's try searching again:

WITH embedded_query AS (
SELECT
pgml.embed(
'mixedbread-ai/mxbai-embed-large-v1',
'What is the pgml.transform function?',
'{"prompt": "Represent this sentence for searching relevant passages: "}'
)::vector embedding
)
SELECT
text,
(
SELECT
embedding
FROM embedded_query
) <=> text_and_embeddings.embedding cosine_distance
FROM
text_and_embeddings
ORDER BY cosine_distance
LIMIT 1;
timer 35.716 ms
text | cosine_distance
----------------------------------------------------------------------------------------+---------------------
The pgml.transform function is a PostgreSQL function for calling LLMs in the database. | 0.13467974993681486

That was even faster!

There is a lot more that can go into semantic search. Stay tuned for a follow up post on hybrid search and re-ranking.

If you have any questions, or just have an idea on how to make PostgresML better, we'd love to hear from you in our Discord. We’re open source, and welcome contributions from the community, especially when it comes to the rapidly evolving ML/AI landscape.

Closing thoughts / why PostgreSQL?

There are a host of benefits to performing machine learning tasks in your database. The hard part of AI & ML systems has always been managing data. Vastly more engineers have a full-time job managing data pipelines than models. Vastly more money is spent on data management systems than LLMs, and this will continue to be the case, because data is the bespoke differentiator.

Getting the data to the models in a timely manner often spans multiple teams and multiple disciplines collaborating for multiple quarters. When the landscape is changing as quickly as modern AI & ML, many applications are out of date before they launch, and unmaintainable long term.

Moving the models to the data, rather than constantly pulling the data to the models, reduces engineering overhead and the number of costly external network calls, while enhancing your ability to scale. Why not scale your data on a proven database handling millions of requests per second? That’s why we do machine learning in Postgres.

For more on the benefits of in-database AI/ML see our blog post, LLMs are Commoditized, Data is the Differentiator.

In this post we focused on SQL, but for those without SQL expertise, the benefits of in-database machine learning are still accessible. You can abstract away the SQL functions in JS, Python, Rust or C.