Post by Elastic

540,561 followers

Searching by stored vectors just got 3-4x faster, and the change is a single API swap. If you use stored vectors for search-by-example (product recommendations, image similarity, document matching), you've probably written this pattern: GET the vector from your index Send it back as a _search query Two requests. Serialize the vector out, deserialize it in, pay the network cost twice. In multi-node clusters, the overhead compounds: inter-node traffic carries its own serialization cost, and running two separate requests works against Elasticsearch's local-node execution bias. In Elasticsearch 9.4, this collapses into a single request using query_vector_builder with a lookup source. Elasticsearch fetches the stored vector internally and runs the search in one pass. No client-side plumbing required. The benchmarks speak for themselves. Two nodes in the same GCP zone, same availability zone, 2M documents: - p50: 10.4ms → 3.1ms (3.3x faster) - p90: 25.4ms → 5.9ms (4.3x faster) - p99: 27.7ms → 8.1ms (3.4x faster) The latency reduction comes almost entirely from removing the unnecessary round trip and letting the cluster handle vector retrieval internally. For teams running recommendation systems or similarity search at scale, this means lower tail latency and simpler client code, with no application-level changes beyond swapping the query shape. Full walkthrough and benchmark methodology on the Elastic Search Labs blog: https://go.es.io/4uJnivj