Commit Graph

13 Commits

Author SHA1 Message Date
Lakshmi Narayanan Sreethar
5f5a8cf54c types/comparable_bytes: add testcase to verify compatibility with cassandra 2025-07-01 22:19:08 +05:30
Lakshmi Narayanan Sreethar
6c1853a830 types/comparable_bytes: support variable-length natively byte-ordered data types
The following cql3 data types - ascii, blob, duration, inet, and text -
are natively byte-ordered in their serialized forms. To encode them into
a byte-comparable format, zeros are escaped, and since these types have
variable lengths, the encoded form is terminated in an escaped state to
mark its end.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-07-01 22:19:08 +05:30
Lakshmi Narayanan Sreethar
5c77d17834 types/comparable_bytes: support decimal cql3 types
The decimal cql3 type is internally stored as a scale and an unscaled
integer. To convert them into a byte comparable format, they are first
normalized into a base-100 exponent and a mantissa that lies in [0.01, 1)
and then encoded into a byte sequence that preserves the numerical order.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-07-01 22:19:08 +05:30
Lakshmi Narayanan Sreethar
832236d044 types/comparable_bytes: introduce count_digits() method
Implemented a method `count_digits()` to return the number of significant
digits in a given boost::multiprecision:cpp_int. This is required to
convert big_decimal to a byte comparable format.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-07-01 22:19:08 +05:30
Lakshmi Narayanan Sreethar
a00c5d3899 types/comparable_bytes: support uuid and timeuuid cql3 types
The uuid type values are composed of two fixed-length unsigned integers:
an msb and an lsb. The msb contains a version digit, which must be
pulled first in a byte-comparable representation. For version 1 uuids,
in addition to extracting the version digit first, the msb must be
rearranged to make it byte comparable. The lsb is written as is.

For the timeuuid type, the msb is handled simliar to the version 1 uuid
values. The lsb however is treated differently - the sign bits of all
bytes are inverted to preserve the legacy comparison order, which
compared individual bytes as signed values.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-07-01 22:19:08 +05:30
Lakshmi Narayanan Sreethar
4592b9764c types/comparable_bytes: support varint cql3 type
Any varint value less than 7 bytes is encoded using the signed long
encoding format and remaining values are all encoded using the full form
encoding :

  <signbyte><length as unsigned integer - 7><7 or more bytes>,

where <signbyte> is 00 for negative numbers and FF for positive ones,
and the length's bytes are inverted if the number is negative (so that
longer length sorts smaller).

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar
ad45a19373 types/comparable_bytes: introduce encode/decode_varint_length
The length of a varint value is encoded separately as an unsigned
variable-length integer. For negative varint values, the encoded bytes
are flipped to ensure that longer lengths sort smaller. This patch
implements both encoding and decoding logic for varint lengths and will
be used by the subsequent patch.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar
7af153c237 types/comparable_bytes: support float and double cql3 types
The sign bit is flipped for positive values to ensure that they are
ordered after negative values. For negative values, all the bytes are
inverted, allowing larger negative values to be ordered before smaller
ones.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar
0145c1d705 types/comparable_bytes: support date, time and timestamp cql3 types
Both the date and time cql3 types are internally unsigned fixed length
integers. Their serialized form is already byte comparable, so the
encoder and decoder return the serialized bytes as it is.

The timestamp type is encoded using the fixed length signed integer
encoding.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar
b6ff3f5304 types/comparable_bytes: support bigint cql3 type
The bigint type, internally implemented as a long data type, is encoded
using a variable-length encoding similar to UTF-8. This enables a
significant amount of space to be saved when smaller numbers are
frequently used, while still permitting large values to be efficiently
encoded.

The first bit of the encoding represents the inverted sign (i.e., 1 for
positive, 0 for negative), followed by length encoded as a sequence of
bits matching the inverted sign. This is then followed by a differing
bit (except for 9-byte encodings) and the bits of the number's two's
complement.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar
c0d25060bd types/comparable_bytes: support fixed length signed integers
To encode fixed-length signed integers in a byte-comparable format, the
first bit of each value is inverted. This ensures that negative numbers
are ordered before positive ones during comparison. This patch adds
support for the data types : byte_type (tinyint), short_type (smallint),
and int32_type (int). Although long_type (bigint) is a fixed length
integer type, it has different byte comparable encoding and will be
handled separately in another patch.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar
8572afca2b types/comparable_bytes: support boolean cql3 type
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar
74c556a33d types: introduce comparable_bytes class
This patch implements a new class, `comparable_bytes`, designed to
implement methods for converting data values to and from byte-comparable
formats. The class stores the comparable bytes as `managed_bytes` and
currently provides the structure for all required methods. The actual
logic for converting various data types will be implemented in subsequent
patches.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-07-01 22:19:07 +05:30