🌱 How to build image to text search using code

A guide on building text to image/image to text search with Vector AI

Assumed Knowledge: Vectors, Vector Search, Python (Basic level)
Target Audience: General Developer, Data Scientist, Python Developer
Reading Time: 5 minutes
Requirements: Python 3.6 or Python 3.7

To build text to image search, you will need a text to image model, such as OpenAI's CLIP. You can read more about the CLIP model here.

!pip install vectorhub[clip]
from vectorhub.bi_encoders.text_image.torch import Clip2Vec
model = Clip2Vec() # This will download and run the model

To encode the text, you can run the following in Python:

model.encode_text("This is a dog.")

To encode images, you can run the following in Python (if you are curious about the encoding process for images, you can read about it here):

model.encode_image("https://cdn.britannica.com/88/154388-050-11BCAE3C/CEO-Elon-Musk-SpaceX-car.jpg")
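
Both encoders map their inputs into the same vector space, which is what makes text to image search possible: a caption and an image that match end up close together. Below is a minimal sketch that compares the two embeddings with cosine similarity (assuming numpy is installed; the caption is just an illustrative example):

import numpy as np

text_vector = np.array(model.encode_text("A car in space."))
image_vector = np.array(model.encode_image("https://cdn.britannica.com/88/154388-050-11BCAE3C/CEO-Elon-Musk-SpaceX-car.jpg"))

# Cosine similarity: higher means the text and the image are closer in the shared space.
similarity = np.dot(text_vector, image_vector) / (np.linalg.norm(text_vector) * np.linalg.norm(image_vector))
print(similarity)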

Now that the model has been instantiated, we simply need a way to store our vectors in an index and then retrieve the relevant image or text. For the rest of the guide, we will use the Vector AI client, but if you are interested in a completely open-source indexing tool, we recommend our other article on using FAISS.
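
If you would rather go the open-source route, the sketch below shows the general idea using FAISS (assuming the faiss-cpu and numpy packages are installed; the flat inner-product index and the sample URLs are illustrative choices). The Vector AI client used in the rest of this guide handles the indexing and retrieval for you.

import faiss
import numpy as np

image_urls = [
    'https://cdn.britannica.com/88/154388-050-11BCAE3C/CEO-Elon-Musk-SpaceX-car.jpg',
    'http://cdn.cnn.com/cnnnext/dam/assets/180316113418-travel-with-a-dog-3.jpg',
]

# Encode the images and L2-normalise so that inner product equals cosine similarity.
image_vectors = np.array([model.encode_image(url) for url in image_urls]).astype('float32')
faiss.normalize_L2(image_vectors)

# Build a flat inner-product index and add the image vectors.
index = faiss.IndexFlatIP(image_vectors.shape[1])
index.add(image_vectors)

# Encode a text query into the same space and retrieve the closest image.
query_vector = np.array([model.encode_text("Good boy")]).astype('float32')
faiss.normalize_L2(query_vector)
distances, indices = index.search(query_vector, 2)
print(image_urls[indices[0][0]])  # URL of the best-matching image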

To add vectors to Vector AI's cloud index, simply run the following:

!pip install vectorai 
from vectorai import ViClient 
vi = ViClient(username, api_key)
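
Here, username and api_key are the credentials for your Vector AI account. A small sketch of one way to avoid hard-coding them, assuming you export them as environment variables (the variable names are just an example):

import os
from vectorai import ViClient

# Read credentials from the environment rather than committing them to source control.
vi = ViClient(os.environ['VI_USERNAME'], os.environ['VI_API_KEY'])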

From here, we want to insert documents into the Vector AI database. Vector AI uses document-based storage, which makes it easy to keep metadata such as labels alongside the vectors; you can read more about it here.

# Create your list of documents to insert
docs = [
    {'image_url': 'https://cdn.britannica.com/88/154388-050-11BCAE3C/CEO-Elon-Musk-SpaceX-car.jpg',
     'label': 'Elon Musk'},
    {'image_url': 'http://cdn.cnn.com/cnnnext/dam/assets/180316113418-travel-with-a-dog-3.jpg',
     'label': 'dog'},
    {'image_url': 'https://cdn.mos.cms.futurecdn.net/vEcELHdn998wRTcCzqV5m9.jpg',
     'label': 'laptop'}
]
collection_name = 'sample_image_text'
# Encode the image_url field on insert using the CLIP model's image encoder.
vi.insert_documents(collection_name, docs, models={'image_url': model.encode_image})

Now that you have inserted documents, you will want to search your collection. This can be done with the following:

query = "Good boy"
text_query_vector= model.encode(query)
vi.search(collection_name, vector=text_query_vector, vector_field='image_url_vector_')
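
The query text is encoded into the same space as the stored image vectors, so the dog photo should come back as the top match. The exact response format can differ between client versions; the loop below is a sketch that assumes the matching documents are returned under a 'results' key:

results = vi.search(collection_name, vector=text_query_vector, vector_field='image_url_vector_')
# The 'results' key and the returned field names are assumptions about the response shape.
for doc in results.get('results', []):
    print(doc.get('label'), doc.get('image_url'))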

Some users may be confused by the name of the vector field: where does image_url_vector_ come from? Vector AI automatically encodes the field and appends _vector_ to the encoded field's name. If you are ever unsure what the vector fields are called, simply refer to the collection schema to see what is stored.

vi.collection_schema(collection_name)
# Returns a dictionary of the fields in the collection and their respective values.

If you want to use this in applications that are not Python-based, you can run the same searches through the Vector AI API, which is documented here.
