The exponent of a big decimal string is parsed as an int32, adjusted for
the removed fractional part, and stored as an int32. When parsing values
like `1.23E-2147483647`, the unscaled value becomes `123`, and the scale
is adjusted to `2147483647 + 2 = 2147483649`. This exceeds the int32
limit, and since the scale is stored as an int32, it overflows and wraps
around, losing the value.
This patch fixes that the by parsing the exponent as an int64 value and
then adjusting it for the fractional part. The adjusted scale is then
checked to see if it is still within int32 limits before storing. An
exception is thrown if it is not within the int32 limits.
Note that strings with exponents that exceed the int32 range, like
`0.01E2147483650`, were previously not parseable as a big decimal. They
are now accepted if the final adjusted scale fits within int32 limits.
For the above value, unscaled_value = 1 and scale = -2147483648, so it
is now accepted. This is in line with how Java's `BigDecimal` parses
strings.
Fixes: #24581
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Closesscylladb/scylladb#24640
Currently, the tri-compare operator for big_decimal (operator <=>), uses
a precise but potentially very expensive algorithm for comparing the
numbers: it first brings them to the same scale, then compares the
normalized unscaled values. big_decimal has abritrary precisions,
therefore the stored numbers can be arbitrarily large.
In extreme cases, comparing two numbers can result in huge amount of
memory allocated and stalls. If this type is used int he primary key of
a table, these comparisons can make the node completely unresponsive.
This patch adds the following fast-paths to operator <=>:
* An early return for the case of equal scales.
* An early return for different signs.
* An early return for the case where one or both of the numbers are 0.
* A fast algorithm for detecting the case where the there is a big
difference between the two numbers. This algorithm works only with the
scales and is able to compare the two numbers by using only one division
and some additions and substractions. This algorithm is imprecise and
when the numbers are closer than its confidence window, it will
fall-back to the current slow but precise tri-compare.
All but the last case should have been fast before as well, but the
scale-compare algorithm makes a huge difference. Numbers, which would
previously make the node unresponsive, now compare in constant-time.
Fixes: scylladb/scylladb#21716Closesscylladb/scylladb#21715
After the mechanical change in fcb8d040e8
("treewide: use Software Package Data Exchange (SPDX) license identifiers"),
a few stray license blurbs or fragments thereof remain. In two cases
these were extra blurbs in code generators intended for the generated code,
in others they were just missed by the script.
Clean them up, adding an SPDX license identifier where needed.
Closes#10072
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
This behavior is different from cassandra, but without arithmetic
operations it doesn't seem possible to notice the difference from
CQL. Using avg produces the same results, since we use an initial
value of 0 (scale = 0).
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Test cases for big decimals were quite complete, but since the
implementation was recently changed, some corner cases are added:
- incorrect strings
- numbers not fitting into uint64_t
- numbers less than uint64_t::max themselves, but with the unscaled
value exceeding the maximum
1. Move tests to test (using singular seems to be a convention
in the rest of the code base)
2. Move boost tests to test/boost, other
(non-boost) unit tests to test/unit, tests which are
expected to be run manually to test/manual.
Update configure.py and test.py with new paths to tests.