Hugging Face Inference Endpoint
This guide will walk you through an example integration of the Hugging Face Inference API with vecs. We will create embeddings using Hugging Face’s sentence-transformers/all-MiniLM-L6-v2
model, insert these embeddings into a PostgreSQL database using vecs, and then query vecs to find the most similar sentences to a given query sentence.
Create a Hugging Face Inference Endpoint
Head over to Hugging Face’s inference endpoints and select New Endpoint.
Configure your endpoint with your model and provider of choice. In this example we’ll use sentence-transformers/all-MiniLM-L6-v2 and AWS.
Under “Advanced Configuration”, select “Sentence Embeddings” as the “Task”, then click “Create Endpoint”.
Once the endpoint starts up, take note of the Endpoint URL.
!!! tip
    Don’t forget to pause or delete your Hugging Face Inference Endpoint when you’re not using it.
Finally, create and copy an API key we can use to authenticate with the inference endpoint.
Create an Environment
Next, set up your environment. You will need Python 3.7+ with the vecs and requests packages installed.
You’ll also need a Postgres database with the pgvector extension enabled.
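Both packages are available on PyPI and can be installed with pip:

```shell
pip install vecs requests
```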
Create Embeddings
We can use the Hugging Face endpoint to create embeddings for a set of sentences.
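A minimal sketch of this step might look like the following. The endpoint URL, API key, and sentences are all illustrative placeholders; substitute the Endpoint URL and API key you noted when creating the endpoint. The network call only runs once an endpoint has been configured.

```python
import os

import requests

# Placeholders -- substitute the Endpoint URL and API key from the steps above
ENDPOINT_URL = os.environ.get("HF_ENDPOINT_URL", "")
HF_API_KEY = os.environ.get("HF_API_KEY", "")


def create_embeddings(sentences):
    """Embed a list of sentences via the Sentence Embeddings endpoint."""
    response = requests.post(
        ENDPOINT_URL,
        headers={"Authorization": f"Bearer {HF_API_KEY}"},
        json={"inputs": sentences},
    )
    response.raise_for_status()
    # all-MiniLM-L6-v2 produces one 384-dimensional vector per input sentence
    return response.json()


# Illustrative sentences to embed
sentences = [
    "The cat sat on the mat",
    "The quick brown fox jumps over the lazy dog",
    "I love programming in Python",
]

if ENDPOINT_URL:  # only call the endpoint when one has been configured
    embeddings = create_embeddings(sentences)
```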
Store the Embeddings with vecs
Now that we have our embeddings, we can insert them into a PostgreSQL database using vecs, substituting in your DB_CONNECTION string.
Querying for Most Similar Sentences
Finally, we can query vecs to find the most similar sentences to a given query sentence. The query sentence is embedded using the same method as the sentences in the dataset, then we query the sentences collection with vecs.
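Putting those two steps together, the query might be sketched as follows. The query sentence is illustrative, and the snippet assumes the endpoint and a populated sentences collection from the earlier steps, so it only executes once the endpoint URL and DB_CONNECTION string are set.

```python
import os

import requests

ENDPOINT_URL = os.environ.get("HF_ENDPOINT_URL", "")
HF_API_KEY = os.environ.get("HF_API_KEY", "")
DB_CONNECTION = os.environ.get("DB_CONNECTION", "")

query_sentence = "A feline was resting on a rug"  # illustrative query

if ENDPOINT_URL and DB_CONNECTION:
    import vecs  # pip install vecs

    # Embed the query with the same endpoint used for the dataset
    response = requests.post(
        ENDPOINT_URL,
        headers={"Authorization": f"Bearer {HF_API_KEY}"},
        json={"inputs": [query_sentence]},
    )
    response.raise_for_status()
    query_embedding = response.json()[0]

    vx = vecs.create_client(DB_CONNECTION)
    docs = vx.get_or_create_collection(name="sentences", dimension=384)
    results = docs.query(
        data=query_embedding,  # the query vector
        limit=3,               # return the 3 nearest records
        include_value=True,    # include each record's distance to the query
    )
    for record_id, distance in results:
        print(record_id, distance)
```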
This returns the 3 most similar records and their distance to the query vector.