💻How to turn data into Vectors (code)
A guide to turning data into vectors with VectorHub.
Assumed Knowledge: Vectors
Target Audience: Python developers, General developers
Reading Time: 3 minutes
The process of turning structured/unstructured data (in the form of Excel Spreadsheets, videos, images, word documents, PDFs) into vectors can involve quite complicated pipelines.
To help transform data into vectors, we open-sourced a library called VectorHub (you can explore the hub at hub.vctr.ai). For this, you will need to use Python 3 (tested on Python 3.6/Python 3.7).
The library can be installed via pip:
Once you install via pip, you can then use a model in Python. For example:
You can instantiate the model as follows:
Transforming your data into vectors is as simple as the following:
What is happening under the hood in VectorHub?
VectorHub abstracts away a few complexities to make encoding smooth. There is, however, still a lot of room for customisation.
Going over the diagram quickly: as data is passed through VectorHub, it is converted into a NumPy array. This array is fed through the model, the model's outputs are pooled together, and the result is then transformed into a native Python object.
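The steps above can be sketched in a few lines of Python. This is only an illustration of the flow (raw data, NumPy array, model, pooling, native Python object) and not VectorHub's actual implementation: the functions `to_array`, `toy_model` and `encode` are hypothetical names, and the "model" is a fixed toy projection standing in for a real neural network.

```python
import numpy as np


def to_array(text, max_len=8):
    """Toy preprocessing: turn text into a fixed-length NumPy array.

    Characters are mapped to their code points, then truncated or
    zero-padded to max_len. Real encoders use proper tokenisers.
    """
    codes = [ord(c) for c in text[:max_len]]
    codes += [0] * (max_len - len(codes))
    return np.array(codes, dtype=np.float64) / 255.0


def toy_model(array, dim=4):
    """Hypothetical stand-in for a model.

    Produces one dim-sized vector per input position, giving an
    output of shape (len(array), dim).
    """
    weights = np.linspace(0.1, 1.0, dim)
    return np.outer(array, weights)


def encode(text):
    """Sketch of the flow: data -> NumPy array -> model -> pooling -> list."""
    array = to_array(text)               # 1. convert data to a NumPy array
    outputs = toy_model(array)           # 2. feed the array through the model
    pooled = outputs.mean(axis=0)        # 3. pool per-position outputs (mean pooling)
    return pooled.tolist()               # 4. convert to a native Python list


vector = encode("hello")
print(type(vector), len(vector))
```

Mean pooling is used here only because it is simple; a real encoder may pool differently depending on the model. Returning a plain list rather than a NumPy array keeps the result easy to serialise, for example to JSON.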
For each data type, we have a section explaining how it is converted into vectors, so users can understand what occurs at each step of the process.
These pages can be found below if you are interested:
💻 How to turn images into Vectors
💻 How to turn audio into Vectors
💻 How to turn text into Vectors