Guide To Vectors
Evaluate Vector Bias

A guide to evaluating whether vectors are biased.

Last updated 4 years ago

Assumed Knowledge: Vectors
Target Audience: Data Scientists, Vector Enthusiasts, Python Developers
Reading Time: 3 minutes

To identify bias in the representation space, we want to know which direction the vectors lean towards. One way to see this is with Normed 2D Cosine Similarity Plots, which compare each vector against two anchor terms.
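The quantity behind such a plot can be sketched in a few lines. The example below is a minimal illustration using only the Python standard library, not the Vector AI implementation: the helper names (`cosine_similarity`, `anchor_similarities`) and the 3-dimensional toy vectors are assumptions made up for this sketch.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def anchor_similarities(vector, anchor_a, anchor_b):
    """Return the vector's cosine similarity to each of the two anchors."""
    return cosine_similarity(vector, anchor_a), cosine_similarity(vector, anchor_b)

# Hypothetical 3-dimensional toy embeddings, made up for illustration only.
anchor_female = [1.0, 0.0, 0.0]
anchor_male = [0.0, 1.0, 0.0]
word_vector = [3.0, 4.0, 0.0]

sim_female, sim_male = anchor_similarities(word_vector, anchor_female, anchor_male)
print(sim_female, sim_male)
```

With real embeddings, the two anchors would be the encoded vectors for the terms being compared, and each word would produce one pair of similarities.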

We will explore below how the model interprets certain terms and their bias. Let us consider the following male vs female comparison and then explore what these charts show.

Male Vs Female

In the male vs female comparison, the main takeaways from the chart are:

  • The purple bar suggests that a certain word is more biased towards "female", whereas the green bar suggests the word is more biased towards "male".

  • The words "princess", "skirt", "perfume" and "make-up" are all strongly tied to females.

  • Comparatively, the words "computers", "football", "machine", "beer" and "prince" are all strongly tied to males.

  • The magnitude of the cosine similarities is also interesting, as it indicates that "princess" is more strongly tied to "female" than "skirt", "perfume" or "make-up" are. Conversely, "prince" is more strongly tied to "male" than "beer" or "machine" are.
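One way to turn the takeaways above into a single number per word is to subtract one anchor similarity from the other. The sketch below is an assumption about how such a score could be computed, not the formula used by Vector AI; the function name `bias_scores` is hypothetical and the similarity values are made up, not real model output.

```python
def bias_scores(similarities):
    """Map each word to sim(first anchor) - sim(second anchor).

    Positive scores lean towards the first anchor, negative towards the second.
    """
    return {word: sim_a - sim_b for word, (sim_a, sim_b) in similarities.items()}

# Made-up (female, male) cosine similarities, NOT real model output.
toy_similarities = {
    "princess": (0.9, 0.2),
    "skirt": (0.6, 0.3),
    "prince": (0.2, 0.9),
    "beer": (0.3, 0.5),
}

scores = bias_scores(toy_similarities)
# Sort from most "female"-leaning to most "male"-leaning.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The sign of each score gives the direction of the bias and its absolute value gives the strength, which matches how the bar colours and bar lengths are read in the chart.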

Once we better understand the hidden bias in our models, we may want to fine-tune these vectors and models.

Analysing Between Groups

The use cases of vectors extend further. For example, when optimising a retail layout, we may want to decide where to place items and categories so that customers can intuitively go to a section and find what they need, which in turn improves conversion. The bias indicator can help determine which section a particular item should go in.

The graph above takes a similar look into the representation space, showing how different objects compare against two categories: technology and home gardening.

  • Telephones, televisions, tablets, PCs and phones are more biased to technology than home gardening.

  • Manure and garden hose are more biased to home gardening than technology.

  • Cultivator is only slightly more biased to home gardening than to technology. This may be because, while cultivators are useful for gardening (not necessarily just home gardening), they are more commonly used on farms and are a product of technology.
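The placement decision described in this section can be sketched as picking the category with the highest similarity, while flagging items such as "cultivator" where the top two categories are nearly tied. This is a hedged illustration only: `assign_section`, the `min_margin` threshold and the similarity values are all hypothetical and do not come from Vector AI.

```python
def assign_section(similarities, min_margin=0.05):
    """Pick the category with the highest similarity.

    Expects at least two categories. Returns (category, margin); category is
    None when the top two categories are closer than `min_margin`, i.e. the
    item is ambiguous.
    """
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_sim), (_, second_sim) = ranked[0], ranked[1]
    margin = best_sim - second_sim
    return (best if margin >= min_margin else None), margin

# Made-up similarities for illustration, NOT real model output.
toy_items = {
    "tablet": {"technology": 0.8, "home gardening": 0.2},
    "garden hose": {"technology": 0.3, "home gardening": 0.7},
    "cultivator": {"technology": 0.50, "home gardening": 0.52},
}

for item, sims in toy_items.items():
    section, margin = assign_section(sims)
    print(item, section, round(margin, 2))
```

Returning the margin alongside the category keeps the near-ties visible, so ambiguous items can be reviewed by hand instead of being placed silently.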

Using The Bias Indicator In VectorAI

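from vectorai import ViClient
# Assumes username and api_key hold your Vector AI credentials, and anchor_docs and docs are lists of already-encoded documents.
vi = ViClient(username, api_key)
vi.bias_indicator(anchor_docs, docs, metadata_field='word')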

This guide would not have been possible without the work of the following papers and articles by teams that have open-sourced their work for research purposes and for us to improve on.

@inproceedings{molino2019parallax,
  author = {Piero Molino and Yang Wang and Jiwei Zhang},
  booktitle = {ACL},
  title = {Parallax: Visualizing and Understanding the Semantics of Embedding Spaces via Algebraic Formulae},
  year = {2019},
}

A simple plot showing the bias of vectors between different groups.
Which objects are home-gardening or technology-related?