Files
scylladb/docs
Nadav Har'El 384e394ff0 Merge 'Add similarity functions to calculate similarity of given vectors' from Dawid Pawlik
It should be possible to return the similarity of vectors in CQL statements following the [Cassandra compatible syntax](https://cassandra.apache.org/doc/latest/cassandra/getting-started/vector-search-quickstart.html#query-vector-data-with-cql):

```
SELECT comment, similarity_cosine(comment_vector, [0.1, 0.15, 0.3, 0.12, 0.05])
    FROM cycling.comments_vs;
```

Although the calculations are slow, and we already have calculated results returned via Vector Store API,
we need the functionality as it allows us to calculate similarity of vectors not stored in vector indexes.

It will be needed for [quantization and rescoring](https://scylladb.atlassian.net/wiki/spaces/RND/pages/195985800/Quantization+and+Rescoring).

The feature is also a nice-to-have in testing as requested many times by testing and CX teams.

The optimized version utilizing already calculated distances from Vector Store without a need of rescoring will be coming soon after via https://github.com/scylladb/scylladb/pull/27991.

---

The patch adds functions:
- `similarity_cosine(<vector>, <vector>)`,
- `similarity_euclidean(<vector>, <vector>)`,
- `similarity_dot_product(<vector>, <vector>)`

Where `<vector>` is either a column of type `VECTOR<FLOAT, N>` or a vector of floats literal.

These functions can be called with every `SELECT` query, not only ANN vector queries as opposed to https://github.com/scylladb/scylladb/pull/25993.

The similarity calculations are implemented inspired by [USearch's implementation](
a2f1759910/include/usearch/index_plugins.hpp (L1304-L1385)) and made compatible with [Cassandra's documentation](https://cassandra.apache.org/doc/5.0/cassandra/developing/cql/functions.html#vector-similarity-functions).
That would guarantee the results in ScyllaDB are calculated using the exact same algorithms as used in Vector Store indexes.

---

Fixes: SCYLLADB-88
Fixes: SCYLLADB-89

New feature, should land into 2026.1

Closes scylladb/scylladb#27524

* github.com:scylladb/scylladb:
  docs: add vector similarity functions documentation
  test/cqlpy: add similarity functions correctness tests
  test/cqlpy: add similarity functions invalid call tests
  cql3: introduce similarity functions syntax
  vector_similarity_fcts: introduce similarity functions
  vector_similarity_fcts: retrieve similarity function argument types
  vector_similarity_fcts: add calculating similarity between vectors
2026-01-05 18:28:10 +02:00
..
2025-12-14 11:48:48 +02:00
2026-01-05 10:44:58 +01:00
2025-02-20 11:24:34 +02:00
2025-12-16 19:41:03 +03:00
2025-12-16 19:41:03 +03:00

ScyllaDB Documentation

This repository contains the source files for ScyllaDB documentation.

  • The dev folder contains developer-oriented documentation related to the ScyllaDB code base. It is not published and is only available via GitHub.
  • All other folders and files contain user-oriented documentation related to ScyllaDB and are sources for docs.scylladb.com/manual.

To report a documentation bug or suggest an improvement, open an issue in GitHub issues for this project.

To contribute to the documentation, open a GitHub pull request.

Key Guidelines for Contributors

To prevent the build from failing:

  • If you add a new file, ensure it's added to an appropriate toctree, for example:

     .. toctree::
        :maxdepth: 2
        :hidden:
    
        Page X </folder1/article1>
        Page Y </folder1/article2>
        Your New Page </folder1/your-new-article>
    
  • Make sure the link syntax is correct. See the guidelines on creating links

  • Make sure the section headings are correct. See the guidelines on creating headings Note that the markup must be at least as long as the text in the heading. For example:

    ----------------------
    Prerequisites
    ----------------------
    

Building User Documentation

Prerequisites

  • Python
  • poetry
  • make

See the ScyllaDB Sphinx Theme prerequisites to check which versions of the above are currently required.

Mac OS X

You must have a working Homebrew in order to install the needed tools.

You also need the standard utility make.

Check if you have these two items with the following commands:

brew help
make -h

Linux Distributions

Building the user docs should work out of the box on most Linux distributions.

Windows

Use "Bash on Ubuntu on Windows" for the same tools and capabilities as on Linux distributions.

Building the Docs

  1. Run make preview in the docs/ directory to build the documentation.
  2. Preview the built documentation locally at http://127.0.0.1:5500/.

Cleanup

You can clean up all the build products and auto-installed Python stuff with:

make pristine

Information for Contributors

If you are interested in contributing to Scylla docs, please read the Scylla open source page at http://www.scylladb.com/opensource/ and complete a Scylla contributor agreement if needed. We can only accept documentation pull requests if we have a contributor agreement on file for you.

Third-party Documentation

  • Do any copying as a separate commit. Always commit an unmodified version first and then do any editing in a separate commit.

  • We already have a copy of the Apache license in our tree, so you do not need to commit a copy of the license.

  • Include the copyright header from the source file in the edited version. If you are copying an Apache Cassandra document with no copyright header, use:

This document includes material from Apache Cassandra.
Apache Cassandra is Copyright 2009-2014 The Apache Software Foundation.