Commit Graph

31 Commits

Author SHA1 Message Date
Nadav Har'El
c56361a6d7 vector_store_client: read and return similarity_scores
The vector store returns for every ANN search, in addition to the keys
of the matching items, two additional vectors - "distances" and
"similarity_scores". The "distances" are raw distance metrics - lower
scores are better matches, while "similarity_scores" are modified
such that higher scores are better matches.

Traditionally, search scores in systems like Cassandra and OpenSearch
use the "similarity scores" approach (higher is better, results are
returned in decreasing similarity order), so this is the more interesting
vector of the two.

But before this patch, our vector_store_client::ann() inspected
only "distances". But... then, it didn't return even that to the
caller :-)

So in this patch, we:

1. Ignore "distances" and instead look at "similarity scores",
   which is what users really want based on their experience with
   other vector and non-vector search engines.

2. Return the similarity score of each match together with the match.
   We already have this score (the vector store returns it) and we
   can add it to the existing primary_key structure of each result.
   So each result is a "struct primary_key" which has fields partition,
   clustering, and after this patch - similarity.

Existing callers in CQL and Alternator vector search will ignore this
"similarity" field in each result, and not notice it was added.
But in the next patch, we'll allow Alternator's vector search to
return this similarity in each result.

The existing unit tests for vector_store_client.cc mocked vector-store
responses with "distances", without "similarity_scores", so no longer
represent what we actually expect the vector store to do. So this patch
also contains modifications for these tests, to mock and to test
"similarity_scores" - not "distances". The more interesting tests, in
the next patch, use the real vector store and check that we really do
get a "similarity_scores" response from it.

This patch also handles a small corner case for DOT_PRODUCT, which is
the only unbounded similarity function. If the similarity overflows
the 32-bit float, the vector store returns a JSON "null" instead of
a JSON number (since JSON doesn't support infinite numbers). Our
existing vector-store client code errored out when it saw this "null",
which is wrong - the request should be allowed to proceed. So in this
patch when we see a "null" JSON for similarity, we return +Inf.
This is usually correct because the top results really have +Inf, not
-Inf, but if we ask for all items we can reach those with similarity
-Inf and incorrectly assign +Inf to them (we have a test for this case
in the next patch). But this problem won't happen when Limit is low,
and in any case it's better than aborting the request after it had
already succeeded.
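
A minimal sketch of the null handling described above, written in Python rather than the client's C++. The response shape (a `primary_keys` list next to `similarity_scores`) and the function name `parse_ann_response` are assumptions made for illustration; they are not taken from the actual vector store wire format.

```python
import json
import math


def parse_ann_response(body: str) -> list[tuple[dict, float]]:
    """Pair each returned key with its similarity score.

    Assumes a hypothetical response shape:
    {"primary_keys": [...], "similarity_scores": [...]}
    """
    doc = json.loads(body)
    keys = doc["primary_keys"]
    scores = doc["similarity_scores"]
    if len(keys) != len(scores):
        raise ValueError("primary_keys and similarity_scores differ in length")
    # JSON cannot encode infinity, so an overflowed score arrives as null
    # (None in Python). Map it to +Inf instead of failing the whole request.
    return [(k, math.inf if s is None else float(s)) for k, s in zip(keys, scores)]
```

As the message notes, mapping every null to +Inf is only an approximation: a null that really stood for -Inf would be reported with the wrong sign.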

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-05-13 14:19:17 +03:00
Karol Nowacki
9269ca9cf7 vector_search: add unreachable node detection time config
Add option `vector_store_unreachable_node_detection_time_in_ms` to
control parameters related to detecting unreachable vector store nodes.
This parameter is used to set the TCP connect timeout, keepalive
parameters, and TCP_USER_TIMEOUT. By configuring these parameters,
we can detect unreachable vector store nodes faster and trigger
failover mechanisms in a timely manner.
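
One way a single detection-time value could be spread over the three TCP mechanisms is sketched below in Python. The split between keepalive idle time, probe interval and probe count is an assumption made for illustration, as are the names `tcp_detection_params` and `apply_to_socket`; the commit does not say how the real client derives these values.

```python
import socket


def tcp_detection_params(detection_time_ms: int, probes: int = 3) -> dict:
    """Derive TCP settings from one detection-time budget (assumed split)."""
    if detection_time_ms <= 0 or probes <= 0:
        raise ValueError("detection time and probe count must be positive")
    # Keepalive options take whole seconds, so round the budget up.
    total_s = max(1, -(-detection_time_ms // 1000))
    # Aim for idle + probes * interval to be close to the budget.
    interval_s = max(1, total_s // (probes + 1))
    return {
        "connect_timeout_s": detection_time_ms / 1000,
        "keepalive_idle_s": interval_s,
        "keepalive_interval_s": interval_s,
        "keepalive_probes": probes,
        "user_timeout_ms": detection_time_ms,
    }


def apply_to_socket(sock: socket.socket, params: dict) -> None:
    """Apply the derived settings; Linux-only options are skipped elsewhere."""
    # settimeout() bounds connect() as well as later blocking calls.
    sock.settimeout(params["connect_timeout_s"])
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    options = (
        ("TCP_KEEPIDLE", params["keepalive_idle_s"]),
        ("TCP_KEEPINTVL", params["keepalive_interval_s"]),
        ("TCP_KEEPCNT", params["keepalive_probes"]),
        ("TCP_USER_TIMEOUT", params["user_timeout_ms"]),
    )
    for name, value in options:
        opt = getattr(socket, name, None)  # not defined on every platform
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
```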
2026-04-17 12:26:38 +03:00
Nadav Har'El
aea7b6a66b alternator: DescribeTable for vector index: add IndexStatus and Backfilling
Add to DescribeTable's output for VectorIndexes two fields - IndexStatus
and Backfilling - which are intended to exactly mirror these two fields
that exist for GlobalSecondaryIndexes:

When a vector index is added, IndexStatus is "CREATING" before the index
is usable, and "ACTIVE" when it is finally usable for a Query. During
the "CREATING" phase, "Backfilling" may be set to true when the index is
currently being backfilled (the table is scanned and an index is built).

A user is expected to call DescribeTable in a loop after creating a
vector index (via either CreateTable or UpdateTable) and only call
Query on the index after the IndexStatus is finally ACTIVE. Calling
Query earlier, while IndexStatus is still CREATING, will result in an
error.
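
The polling loop a user is expected to write might look like the Python sketch below. The `describe_table` callable stands in for whatever client call returns the DescribeTable response, and the `IndexName` key and the function name `wait_for_vector_index` are assumptions; only `VectorIndexes`, `IndexStatus` and `Backfilling` come from the description above.

```python
import time
from typing import Callable


def wait_for_vector_index(
    describe_table: Callable[[], dict],
    index_name: str,
    timeout_s: float = 300.0,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Block until the named vector index reports IndexStatus == "ACTIVE"."""
    deadline = clock() + timeout_s
    while True:
        table = describe_table()["Table"]
        for index in table.get("VectorIndexes", []):
            if index.get("IndexName") == index_name and index.get("IndexStatus") == "ACTIVE":
                return
        if clock() >= deadline:
            raise TimeoutError(f"vector index {index_name!r} did not become ACTIVE")
        # Each DescribeTable call contacts the vector store, so poll gently.
        sleep(poll_s)
```

Only after this returns is it safe to issue a Query against the index.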

In the current implementation, Alternator does not track the state of the
vector index, so it needs to contact the vector store to inquire about
the state of the index - using a new function introduced in this patch
that uses an existing vector-store API. This makes DescribeTable slower
on tables that have vector indexes, because the vector store is contacted
on every DescribeTable call.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-04-16 13:31:49 +03:00
Avi Kivity
0ae22a09d4 LICENSE: Update to version 1.1
Updated terms of non-commercial use (must be a never-customer).
2026-04-12 19:46:33 +03:00
Szymon Wasik
d27610f138 vector_store_client: Return HTTP error description, not just code
This simple patch adds support for storing the HTTP error description
that Vector Store client receives from vector store. Until now it was
just printed to the log but it was not returned. For this reason it
was not forwarded to the drivers which forced users to access ScyllaDB
server logs to understand what is wrong with Vector Store.

This patch also updates the formatter to print the message next to the
error code.

Fixes: VECTOR-189
2026-03-10 17:22:30 +01:00
Karol Nowacki
647172d4b8 vector_search: fix names of private members
According to coding style in Scylla,
member variables are prefixed with underscore.
2026-03-02 14:08:16 +01:00
Karol Nowacki
f2308b000f vector_search: remove unused global variable 2026-03-02 14:08:07 +01:00
Karol Nowacki
aef5ff7491 vector_search: test: Fix flaky cert rewrite test
The test is flaky most likely because when TLS certificate rewrite
happens simultaneously with an ANN request, the handshake can hang for a
long time (~60s). This leads to a timeout in the test case.

This change introduces a checkpoint in the test so that it will
wait for the certificate rewrite to happen before sending an ANN request,
which should prevent the handshake from hanging and make the test more reliable.

Fixes: #28012
2026-02-12 09:58:54 +01:00
Dawid Pawlik
2a38794b8e vector_search: cql: construct and use filter in ANN vector queries
Add `filter` option in `ann()` function to write the filter JSON
object as the POST request in ANN vector queries.

Adjust existing `vector_store_client_test` tests accordingly.
2026-01-16 11:18:23 +01:00
Karol Nowacki
c40b3ba4b3 vector_search: Add HTTPS support for vector store connections
This commit introduces TLS encryption support for vector store connections.
A new configuration option is added:
- vector_store_encryption_options.truststore: path to the trust store file

To enable secure connections, use the https:// scheme in the
vector_store_primary_uri/vector_store_secondary_uri configuration options.

Fixes: VECTOR-327
2025-11-22 08:18:45 +01:00
Karol Nowacki
104de44a8d vector_search: Add support for secondary vector store clients
This change adds support for secondary vector store clients, typically
located in different availability zones. Secondary clients serve as
fallback targets when all primary clients are unavailable.
A new configuration option allows specifying secondary client addresses
and ports.

Fixes: VECTOR-187

Closes scylladb/scylladb#26484
2025-11-20 08:37:18 +01:00
Karol Nowacki
5c30994bc5 vector_search: Move response_content_to_sstring to utils.hh
Move the response_content_to_sstring utility function from
vector_store_client.cc to utils.hh to enable reuse across
multiple files.

This refactoring prepares for the upcoming `client.cc` implementation
that will also need this functionality.
2025-11-17 06:21:31 +01:00
Karol Nowacki
1972fb315b vector_search: Set max backoff delay to 2x read request timeout
The maximum backoff delay for status checking now depends on the
`read_request_timeout_in_ms` configuration option. The delay is set
to twice the value of this parameter.
2025-11-14 08:05:21 +01:00
Karol Nowacki
097c0f9592 vector_search: Report status check exception via on_internal_error_noexcept
This exception should only occur due to internal errors, not client or external issues.
If triggered, it indicates an internal problem. Therefore, we notify about this exception
using on_internal_error_noexcept.
2025-11-14 08:05:21 +01:00
Karol Nowacki
940ed239b2 vector_search: Extract client management into dedicated class
Refactor client list management by moving it to separate files
(clients.cc/clients.hh) to improve code organization and modularity.
2025-11-14 08:05:21 +01:00
Karol Nowacki
009d3ea278 vector_search: Add backoff for failed clients
Introduces logic to mark clients that fail to answer an ANN request as
"down". Down clients are omitted from further requests until they
successfully respond to a health check.

Health checks for down clients are performed in the background using the
`status` endpoint, with an exponential backoff retry policy ranging
from 100ms to 20s.
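
A sketch of such a delay schedule in Python, for illustration only. The 100ms and 20s bounds come from the message above; the doubling factor and the name `backoff_delays_ms` are assumptions, since the commit does not state the growth rate.

```python
from itertools import islice
from typing import Iterator


def backoff_delays_ms(initial_ms: int = 100, max_ms: int = 20_000, factor: int = 2) -> Iterator[int]:
    """Yield health-check delays that grow geometrically up to a cap."""
    delay = initial_ms
    while True:
        yield delay
        # Grow the delay, but never wait longer than the cap.
        delay = min(delay * factor, max_ms)


# First ten delays with the defaults:
# 100, 200, 400, 800, 1600, 3200, 6400, 12800, 20000, 20000
first_ten = list(islice(backoff_delays_ms(), 10))
```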
2025-11-14 07:38:01 +01:00
Karol Nowacki
49a177b51e vector_search: Use std::expected for low-level client errors
To unify error handling, the low-level client methods now return
`std::expected` instead of throwing exceptions. This allows for
consistent and explicit error propagation from the client up to the
caller.

The relevant error types have been moved to a new `vector_search/error.hh`
header to centralize their definitions.
2025-11-14 07:23:40 +01:00
Karol Nowacki
62f8b26bd7 vector_search: Extract client class
This refactoring extracts low-level client logic into a new, dedicated
`client` class. The new class is responsible for connecting to the
server and serializing requests.

This change prepares for extending the `vector_store_client` to check
node status via the `api/v1/status` endpoint.

`/ann` Response deserialization remains in the `vector_store_client` as it
is schema-dependent.
2025-11-14 07:23:40 +01:00
Michał Hudobski
5c957e83cb vector_search: remove dependence on cql3
This patch removes the dependence of vector search module
on the cql3 module by moving the contents of cql3/type_json.hh
to types/json_utils.hh and removing the usage of cql3 primary_key
object in vector_store_client. We also make the needed adjustments
to files that were previously using the aforementioned type_json.hh
file.

This fixes the circular dependency cql3 <-> vector_search.

Closes scylladb/scylladb#26482
2025-10-21 17:41:55 +03:00
Michał Hudobski
fe4bfffca5 metrics, vector_search: add a dns refresh metric
This commit adds a dns refresh counting metric
to the vector_store service. We would like to
track it to make sure that the networking is working
correctly.
2025-09-29 12:28:52 +02:00
Michał Hudobski
74becdd04b vector_search: move the ann implementation to impl
The implementation of the ann function should have been placed
in the impl struct, not in the client itself. This commit fixes that.
2025-09-29 12:26:42 +02:00
Karol Nowacki
f8b1addfaf vector_store_client: Rename host_port struct to uri
The `host_port` struct represents the parsed components of the vector
store URI. Renaming it to `uri` more accurately reflects its purpose.
2025-09-27 09:04:46 +02:00
Karol Nowacki
27f6459766 vector_store_client: Add support for multiple URIs
The vector store client now supports a comma-separated list of URIs in
the `vector_store_primary_uri` configuration option.

It uses the vector store nodes from these URIs for load balancing and high
availability, querying the next node if the current one fails.
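
Splitting such an option value into individual endpoints could be sketched in Python as follows. The function name `parse_uri_list`, the accepted schemes, the requirement that each URI carries an explicit port, and the example host names and port are all assumptions for illustration, not the client's actual validation rules.

```python
from urllib.parse import urlsplit


def parse_uri_list(value: str) -> list[tuple[str, str, int]]:
    """Split a comma-separated URI list into (scheme, host, port) tuples."""
    uris = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas and surrounding whitespace
        parts = urlsplit(item)
        if parts.scheme not in ("http", "https") or not parts.hostname or parts.port is None:
            raise ValueError(f"invalid vector store URI: {item!r}")
        uris.append((parts.scheme, parts.hostname, parts.port))
    return uris


# Hypothetical hosts and port, used only as an example input.
nodes = parse_uri_list("http://vs1.example:6080, http://vs2.example:6080")
```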
2025-09-27 09:04:45 +02:00
Karol Nowacki
a310cb4c64 vector_store_client: Remove methods used only in tests
The `vector_store_client::port()` and `vector_store_client::host()` methods
were only used in the test code.

Moreover, these tests are no longer needed, as the proper parsing of the
URI is already tested in other tests that perform requests to the
vector store server mock.
2025-09-27 08:47:00 +02:00
Karol Nowacki
a0e62ef8de vector_store_client: Add support for load balancing
This change introduces a load balancing mechanism for the vector store client.
The client can now distribute requests across multiple vector store nodes.
The distribution mechanism performs random selection of nodes for each request.
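
Random per-request selection can be sketched in Python as below. Continuing with the remaining nodes when the chosen one fails is an assumption added here to make the example complete; the names `query_with_failover` and `send`, and the use of `ConnectionError` as the failure signal, are hypothetical.

```python
import random
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def query_with_failover(nodes: Sequence[str], send: Callable[[str], T], rng=random) -> T:
    """Pick a random starting node, then try the others in order on failure."""
    if not nodes:
        raise RuntimeError("no vector store nodes configured")
    start = rng.randrange(len(nodes))  # random choice spreads load across nodes
    last_error = None
    for offset in range(len(nodes)):
        node = nodes[(start + offset) % len(nodes)]
        try:
            return send(node)
        except ConnectionError as err:
            last_error = err  # remember the failure and move on to the next node
    raise RuntimeError("all vector store nodes failed") from last_error
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible in tests.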
2025-09-26 13:44:28 +02:00
Karol Nowacki
706eeee1bd vector_store_client: Rename HTTP_REQUEST_RETRIES to ANN_RETRIES
Rename `HTTP_REQUEST_RETRIES` to `ANN_RETRIES` in `vector_store_client`,
as it now applies to all vector store nodes, not just HTTP requests.

Also, remove an unused test setter function.
2025-09-24 10:51:43 +02:00
Karol Nowacki
8411a03f22 vector_store_client: Format with clang-format 2025-09-24 10:41:37 +02:00
Karol Nowacki
57d1b601a8 vector_store_client: Add support for multiple IPs in DNS responses
The DNS resolution logic now processes all IP addresses returned in a DNS
response, not just the primary one.

The client will iterate through the list of resolved IPs, attempting to
query the next one if a request fails. This improves high availability
by allowing the client to query other available nodes if one is down.
2025-09-24 10:41:37 +02:00
Karol Nowacki
27219b8b7c vector_store_client: Extract DNS logic into a dedicated class
The DNS resolution logic and its background task are moved out of the
`vector_store_client` and into a new, dedicated class `vector_search::dns`.

This refactoring is the first step towards supporting DNS hostnames
that resolve to multiple IP addresses.

Signed-off-by: Karol Nowacki <karol.nowacki@scylladb.com>
2025-09-22 08:01:53 +02:00
Karol Nowacki
7cc7b95681 vector_search: Apply clang-format
Run clang-format on the vector_search module to fix minor formatting
inconsistencies.
2025-09-22 08:01:50 +02:00
Karol Nowacki
eae71d3e91 vector_store_client: Move to vector_search module
Vector search related implementation moved to a new module vector_search.
As the vector search functionality is going to be extended, it is
better to keep it in a separate module.
2025-09-22 08:01:47 +02:00