Reverse Dictionary
One of the more interesting by-products of LLMs is the embedding vector. A while back I read about others using embedding vectors to create reverse dictionaries.
I think it’s a great demonstration of the power and function of these embeddings, so I decided to try my own hand at it. Using Cloudflare’s vector database, I made my own “reverse dictionary”, or semantic thesaurus:
- I sourced some word/definition lists
- Pre-embedded each definition into a 1024-dimension vector
- Used D1 to store my words and definitions, and Vectorize as my vector database
- At my endpoint, I embed the input and retrieve the most cosine-similar results
- The vector DB runs the (approximate) nearest-neighbor search, which is non-trivial, and surprisingly fast
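The steps above can be sketched in miniature. This is a toy, self-contained illustration of the retrieval mechanics, not the actual Worker: the `toy_embed` function is a hypothetical stand-in for a real embedding model (in the real system each definition is embedded once, offline, into a 1024-dim vector), and plain dicts stand in for D1 and Vectorize.

```python
import math

# Hypothetical stand-in for a real embedding model: hashes characters
# into a small vector. A real model would produce a 1024-dim semantic vector.
def toy_embed(text: str, dim: int = 16) -> list[float]:
    vec = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        vec[(ord(ch) + i) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]  # normalize so dot product = cosine

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))  # vectors are pre-normalized

# Stand-ins for D1 (word -> definition) and Vectorize (word -> embedding).
DEFINITIONS = {
    "ephemeral": "lasting for a very short time",
    "gregarious": "fond of company; sociable",
    "serene": "calm, peaceful, and untroubled",
}
INDEX = {word: toy_embed(defn) for word, defn in DEFINITIONS.items()}

def reverse_lookup(query: str, k: int = 3) -> list[tuple[str, float, str]]:
    """Embed the query, rank every definition by cosine similarity, return top k."""
    q = toy_embed(query)
    scored = sorted(((cosine(q, v), w) for w, v in INDEX.items()), reverse=True)
    return [(w, round(s, 3), DEFINITIONS[w]) for s, w in scored[:k]]
```

The one real difference at scale: this loop scores every vector (exact search), while Vectorize uses an approximate nearest-neighbor index so it doesn't have to.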
You can input anything: a definition, a collection of feelings, a paragraph, a poem. The results will be the words that best match that semantic meaning.
| word | score | definition |
|---|---|---|