Questions

What would be an effective way to add caching to a tensor search engine?
What is the distribution of vectors produced by an embedding model, and how does their distribution/spread affect search latency?