Commit Graph

1391 Commits

Author SHA1 Message Date
Botond Dénes
864d27f9af Merge 'clear_gently: handle null unique_ptr and optional values' from Benny Halevy
This series adds handling of null std::unique_ptr to utils::clear_gently
and handling of std::optional and seastar::optimized_optional (both engaged and disengaged cases).

Also, unit tests were added to tests the above cases.

Fixes #13636

Closes #13638

* github.com:scylladb/scylladb:
  utils: clear_gently: add variants for optional values
  utils: clear_gently: do not clear null unique_ptr
2023-04-24 10:27:32 +03:00
Benny Halevy
002865018f utils: clear_gently: add variants for optional values
Implement clear_gently for std:;optional<T>
and seastar::optimized_optional<T> and respective
unit tests.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 21:34:02 +03:00
Benny Halevy
12877ad026 utils: clear_gently: do not clear null unique_ptr
Otherwise the null pointer is dereferenced.

Add a unit test reproducing the issue
and testing this fix.

Fixes #13636

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 21:33:11 +03:00
Benny Halevy
d1817e9e1b utils: move generation-number to gms
Although get_generation_number implementation is
completely generic, it is used exclusively to seed
the gossip generation number.

Following patches will define a strong gms::generation_id
type and this function should return it.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:37:32 +03:00
Benny Halevy
f5f566bdd8 utils: add tagged_integer
A generic template for defining strongly typed
integer types.

Use it here to replace raft::internal::tagged_uint64.
Will be used for defining gms generation and version
as strong and distinguishable types in following patches.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:37:32 +03:00
Kefu Chai
a2aa133822 treewide: use std::lexicographical_compare_threeway
this the standard library offers
`std::lexicographical_compare_threeway()`, and we never uses the
last two addition parameters which are not provided by
`std::lexicographical_compare_threeway()`. there is no need to have
the homebrew version of trichotomic compare function.

in this change,

* all occurrences of `lexicographical_tri_compare()` are replaced
  with `std::lexicographical_compare_threeway()`.
* ``lexicographical_tri_compare()` is dropped.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13615
2023-04-21 14:28:18 +03:00
Botond Dénes
10c1f1dc80 Merge 'db: system_keyspace: use microsecond resolution for group0_history range tombstone' from Kamil Braun
in `make_group0_history_state_id_mutation`, when adding a new entry to
the group 0 history table, if the parameter `gc_older_than` is engaged,
we create a range tombstone in the mutation which deletes entries older
than the new one by `gc_older_than`. In particular if
`gc_older_than = 0`, we want to delete all older entries.

There was a subtle bug there: we were using millisecond resolution when
generating the tombstone, while the provided state IDs used microsecond
resolution. On a super fast machine it could happen that we managed to
perform two schema changes in a single millisecond; this happened
sometimes in `group0_test.test_group0_history_clearing_old_entries`
on our new CI/promotion machines, causing the test to fail because the
tombstone didn't clear the entry correspodning to the previous schema
change when performing the next schema change (since they happened in
the same millisecond).

Use microsecond resolution to fix that. The consecutive state IDs used
in group 0 mutations are guaranteed to be strictly monotonic at
microsecond resolution (see `generate_group0_state_id` in
service/raft/raft_group0_client.cc).

Fixes #13594

Closes #13604

* github.com:scylladb/scylladb:
  db: system_keyspace: use microsecond resolution for group0_history range tombstone
  utils: UUID_gen: accept decimicroseconds in min_time_UUID
2023-04-21 14:08:56 +03:00
Kamil Braun
218a056825 utils: UUID_gen: accept decimicroseconds in min_time_UUID
The function now accepts higher-resolution duration types, such as
microsecond resolution timestamps. Will be used by the next commit.
2023-04-21 10:33:02 +02:00
Pavel Emelyanov
30b6f34a0b s3/client: Explicitly set _upload_id empty when completing
The upload_sink::_upload_id remains empty until upload starts, remains
non-empty while it proceeds, then becomes empty again after it
completes. The upload_started() method cheks that and on .close()
started upload is aborted.

The final switch to empty is done by std::move()ing the upload id into
completion requrest, but it's better to use std::exchange() to emphasize
the fact the the _upload_id becomes empty at that point for a reason.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13570
2023-04-20 17:32:08 +03:00
Nadav Har'El
5b792dde68 Merge 'Extend aws_sigv4 code to suite S3 client needs' from Pavel Emelyanov
The AWS signature-generating code was moved from alternator some time ago as is. Now it's clear that in which places it should be extended to work for S3 client as well. The enhancements are

- Support UNSIGNED-PAYLOAD to omit calculating checksums for request body
- Include full URL path into the signature, not just hard-coded "/" string
- Don't check datastamp expiration if not asked for

This is a part of #13493

Closes #13535

* github.com:scylladb/scylladb:
  utils/aws: Brush up the aws_sigv4.hh header
  utils/aws: Export timepoint formatter
  utils/aws: Omit datestamp expiration checks when not needed
  utils/aws: Add canonical-uri argument
  utils/aws: Support unsigned-payload signatures
2023-04-18 16:33:52 +03:00
Avi Kivity
7724223134 Merge 'utils: big_decimal: optimize big_decimal::compare() and use <=> operator' from Kefu Chai
in this series, we use <=> operator to replace `big_decimal::compare()` for better readability. also, we trade the chained ternary expression with a more verbose if-else statement for better performance and readability.

Closes #13478

* github.com:scylladb/scylladb:
  utils: big_decimal: replace compare() with <=> operator
  utils: big_decimal: optimize big_decimal::compare()
2023-04-17 14:33:53 +03:00
Pavel Emelyanov
d09d6adbf4 utils/aws: Brush up the aws_sigv4.hh header
Add lost pragma-once directive.

Remove the hashers.hh inclusion. It was carried in when the whole code
was detached from alternator (f5de0582c8), but this header is not needed
in the header, only in the .cc file which uses sha256_hasher.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-17 11:16:45 +03:00
Pavel Emelyanov
792490e095 utils/aws: Export timepoint formatter
The format of timestamp for AWS requests is defined in documentation,
there's already the code that prepares it in this form. This patch
exports this method so that S3 client could use it in next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-17 11:14:45 +03:00
Pavel Emelyanov
706b60a0b0 utils/aws: Omit datestamp expiration checks when not needed
The signing code is used in two ways -- by alternator to verify the
arrived signed request and by S3 client to prepare the signed request.
In the former case date expiration check is performed, but for the
latter this is not required, because date stamp is most likely now (or
close to it).

So this patch makes the orig_datestamp argument optional meaning that
expiration checks can be omited.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-17 11:14:45 +03:00
Pavel Emelyanov
c5ccef078a utils/aws: Add canonical-uri argument
Current signing code hard-codes the "/" as the URL, likely this just
works for alternator. For S3 client the URL would include bucket and
object name and should thus become the argument, not constant.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-17 11:14:45 +03:00
Pavel Emelyanov
8eabe9c4ef utils/aws: Support unsigned-payload signatures
For S3 signing the whole request payload can be too resource consuming.
Fortunately, payload signing is only enforced if used with plain http,
but with real S3 we're going to use signed requests over https only (see
next patch why).

Said that, the patch turns body-content into optional reference (i.e. --
a pointer) so that the signing code could inject the UNSIGNED-PAYLOAD
mark instead of the payload signature and omit heavy payload signing.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-17 11:14:45 +03:00
Pavel Emelyanov
7c7a3416c5 s3/client: Add comments about multipart upload completion message
The message length is pre-calculated in advance to provide correct
content-length request header. This math is not obvious and deserves a
comment.

Also, the final message preparation code is also implicitly checking
if any part failed to upload. There's a comment in the upload_sink's
upload_part() method about it, but the finalization place deserves one
too.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-17 11:08:34 +03:00
Pavel Emelyanov
3f86bed600 s3/client: Fix succeeded/failed part upload final checking
When all parts upload complete the final message is prepared and sent
out to the server. The preparation code is also responsible for checking
if all parts uploaded OK by checking the part etag to be non-empty. In
that check a misprint crept in -- the whole list is checked to be empty,
not the individual etag itself.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-17 11:08:15 +03:00
Pavel Emelyanov
79379760e6 s3/client: Fix parts to start from 1
Docs say, that part numbers should start from 1, while the code follows
the tradition and starts from 0. Minio is conveniently incompatible in
this sense so test had been passing so far. On real S3 part number 0
ends up with failed request.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-17 10:43:12 +03:00
Kefu Chai
6bb32efac0 utils: big_decimal: replace compare() with <=> operator
now that we are using C++20, it'd be more convenient if we can use
the <=> operator for comparing. the compiler creates the 6 other
operators for us if the <=> operator is defined. so the code is more
compacted.

in this change, `big_decimal::compare()` is replaced with `operator<=>`,
and its caller is updated accordingly.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-15 12:52:30 +08:00
Kefu Chai
e991e6087e utils: big_decimal: optimize big_decimal::compare()
before this change in the worst case, the underlying
`number::compare()` gets called twice. as it is used by Boost::multiprecision
to implement the comparing operators of `number`. but since we can
have the result in one go, there is no need to to perform the
comparison multiple times.

so, in this change, we just call `number::compare()` explicitly,
and use it to implement `compare()`. this should save a call of
`number::compare()`. also, the chained ternary expression is
replaced using if-else statement for better readability.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-15 12:52:30 +08:00
Pavel Emelyanov
b1501d4261 s3/client: Don't use designated initialization of sys stat struct
It makes compiler complan about mis-ordered initialization of st_nlink
vs st_mode on different arches. Current code (st_nlink before st_mode)
compiled fine on x86, but fails on ARM which wants st_mode to come
before st_nlink. Changing the order would, apparently, break x86 build
with similar message.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13499
2023-04-13 15:13:56 +03:00
Botond Dénes
0c51f72ad6 Merge 'utils, mutation: replace operator<<(..) with fmt formatter' from Kefu Chai
this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `tombstone` and `shadowable_tombstone` without the help of fmt::ostream. and their `operator<<(ostream,..)` are dropped, as there are no users of them anymore.

Refs #13245

Closes #13474

* github.com:scylladb/scylladb:
  mutation: specialize fmt::formatter<tombstone> and fmt::formatter<shadowable_tombstone>
  utils: specialize fmt::formatter<optional<>>
2023-04-12 09:32:56 +03:00
Kefu Chai
ff202723c6 utils: big_decimal: specialize fmt::formatter<big_decimal>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `big_decimal` without the help of `operator<<`. this operator
is droppe in this change, as all its callers are now using fmtlib
for formatting now. we might need to use fmtlib to implement `big_decimal::to_string()`,
and use `fmt::to_string()` instead, but let's leave it for a follow-up
change.

Refs scylladb#13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13479
2023-04-12 09:20:50 +03:00
Kefu Chai
c980bd54ad utils: specialize fmt::formatter<optional<>>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `optional<T>` without the help of `operator<<()`.

this change also enables us to ditch more `operator<<()`s in future.
as we are relying on `operator<<(ostream&, const optional<T>&)` for
printing instances of `optional<T>`, and `operator<<(ostream&, const optional<T>&)`
in turn uses the `operator<<(ostream&, const T&)`. so, the new
specialization of `fmt::formatter<optional<>>` will remove yet
another caller of these operators.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-12 10:57:03 +08:00
Kefu Chai
59579d5876 utils: fragment_range: specialize fmt::formatter<FragmentedView>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print classes fulfill the requirement of `FragmentedView` concept
without the help of template function of `to_hex()`, this function is
dropped in this change, as all its callers are now using fmtlib
for formatting now. the helper of `fragment_to_hex()` is dropped
as well, its only caller is `to_hex()`.

Refs scylladb#13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13471
2023-04-11 16:09:38 +03:00
Botond Dénes
05b381bfa2 Merge 'Simple S3 storage for sstables' from Pavel Emelyanov
The PR adds sstables storage backend that keeps all component files as S3 objects and system.sstables_registry ownership table that keeps track of what sstables objects belong to local node and their names.

When a keyspace is configured with 'STORAGE = { 'type': 'S3' }' the respective class table object eventually gets the storage_options instance pointing to the target S3 endpoint and bucket. All the sstables created for that table attach the S3 storage implementation that maintains components' files as S3 objects. Writing to and reading from components is handled by the S3 client facilities from utils/. Changing the sstable state, which is -- moving between normal, staging and quarantine states -- is not yet implemented, but would eventually happen by updating entries in the sstables registry.

To keep track of which node owns which objects, to provide bucket-wide uniqueness of object names and to maintain sstable state the storage driver keeps records in the system.sstables_registry ownership table. The table maps sstable location and generation to the object format, version, status-state (*) and (!) unique identifier (some time soon this identifier is supposed to be replaced with UUID sstables generations). The component object name is thus s3://bucket/uuid/component_basename. The registry is also used on boot. The distributed loader picks up sstables from all the tables found in schema and for S3-backed keyspaces it lists entries in the registry to a) identify those and b) get their unique S3-side identifiers to open by name.

(*) About sstable's status and state.

The state field is the part of today's sstable path on disk -- staging, quarantine, normal (root table data dir), etc. Since S3 doesn't have the renaming facility, moving sstable between those states is only possible by updating the entry in the registry. This is not yet implemented in this set (#13017)

The status field tracks sstable' transition through its creation-deletion. It first starts with 'creating' status which corresponds to the today's TemporaryTOC file. After being created and written to the sstable moves into 'sealed' state which corresponds to the today's normal sstable being with the TOC file. To delete sstable atomically it first moves into 'removing' state which is equivalent to being in the deletion-log for the on-disk sstable. Once removed from the bucket, the entry is removed from the registry.

To play with:

1. Start minio (installed by install-dependencies.sh)
```
export MINIO_ROOT_USER=${root_user}
export MINIO_ROOT_PASSWORD=${root_pass}
mkdir -p ${root_directory}
minio server ${root_directory}
```

2. Configure minio CLI, create anonymous bucket
```
mc config host rm local
mc config host add local http://127.0.0.1:9000 ${root_user} ${root_pass}
mc mb local/sstables
mc anonymous set public local/sstables
```

3. Start Scylla with object-storage feature enabled
``` scylla ... --experimental-features=keyspace-storage-options --workdir ${as_usual}```

4. Create KS with S3 storage
``` create keyspace ... storage = { 'type': 'S3', 'endpoint': '127.0.0.1:9000', 'bucket': 'sstables' };```

The S3 client has a logger named "s3", it's useful to use on with `trace` verbosity.

Closes #12523

* github.com:scylladb/scylladb:
  test: Add object-storage test
  distributed_loader: Print storage type when populating
  sstable_directory: Add ownership table components lister
  sstable_directory: Make components_lister and API
  sstable_directory: Create components lister based on storage options
  sstables: Add S3 storage implementation
  system_keyspace: Add ownership table
  system_keyspace: Plug to user sstables manager too
  sstable: Make storage instance based on storage options
  sstable_directory: Keep storage_options aboard
  sstable: Virtualize the helper that gets on-disk stats for sstable
  sstable, storage: Virtualize data sink making for small components
  sstable, storage: Virtualize data sink making for Data and Index
  sstable/writer: Shuffle writer::init_file_writers()
  sstable: Make storage an API
  utils: Add S3 readable file impl for random reads
  utils: Add S3 data sink for multipart upload
  utils: Add S3 client with basic ops
  cql-pytest: Add option to run scylla over stable directory
  test.py: Equip it with minio server
  sstables: Detach write_toc() helper
2023-04-11 08:17:25 +03:00
Pavel Emelyanov
033fa107f8 utils: Add S3 readable file impl for random reads
Sometimes an sstable is used for random read, sometimes -- for streamed
read using the input stream. For both cases the file API can be
provided, because S3 API allows random reads of arbitrary lengths.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:43:01 +03:00
Pavel Emelyanov
a4a64149a6 utils: Add S3 data sink for multipart upload
Putting a large object into S3 using plain PUT is bad choice -- one need
to collect the whole object in memory, then send it as a content-length
request with plain body. Less memory stress is by using multipart
upload, but multipart upload has its limitation -- each part should be
at least 5Mb in size. For that reason using file API doesn't work --
file IO API operates with external memory buffers and the file impl
would only have raw pointers to it. In order to collect 5Mb of chunk in
RAM the impl would have to copy the memory which is not good. Unlike the
file API data_sink API is more flexible, as it has temporary buffers at
hand and can cache them in zero-copy manner.

Having sad that, the S3 data_sink implementation is like this:

* put(buffer):
  move the buffer into local cache, once the local cache grows above 5Mb
  send out the part

* flush:
  send out whatever is in cache, then send upload completion request

* close:
  check that the upload finihsed (in flush), abort the upload otherwise

User of the API may (actually should) wrap the sink with output_stream
and use it as any other output_stream.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:43:01 +03:00
Pavel Emelyanov
3745b5c715 utils: Add S3 client with basic ops
Those include -- HEAD to get size, PUT to upload object in one go, GET
to read the object as contigious buffer and DELETE to drop one.

The client uses http client from seastar and just implements the S3
protocol using it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:43:01 +03:00
Botond Dénes
985e33a768 utils: s/std::regex/boost::regex/
The former is prone to producing stack-overflow as it uses recursion in
it match implementation.

The migration is entirely mechanical.
2023-04-06 09:51:28 -04:00
Pavel Emelyanov
f5de0582c8 alternator,util: Move aws4-hmac-sha256 signature generator to util
S3 client cannot perform anonymous multipart uploads into any real S3
buckets regardless of their configuration. Since multipart upload is
essential part of the sstables backend, we need to implement the
authorisation support for the client early.

(side note): with minio anonymous multipart upload works, with aws s3
anonymous PUT and DELETE can be configured, it's exactly the combination
of aws + multipart upload that does need authorization.

Fortunately, the signature generation and signature checking code is
symmetrical and we have the checking option already in alternator :) So
what this patch does is just moves the alternator::get_signature()
helper into utils/. A sad side effect of that is all tests now need to
link with gnutls :( that is used to compute the hash value itself.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13428
2023-04-04 18:24:48 +03:00
Nadav Har'El
59ab9aac44 Merge 'functions: reframe aggregate functions in terms of scalar functions' from Avi Kivity
Currently, aggregate functions are implemented in a statefull manner.
The accumulator is stored internally in an aggregate_function::aggregate,
requiring each query to instantiate new instances (see
aggregate_function_selector's constructor, and note how it's called
from selector::new_instance()).

This makes aggregates hard to use in expressions, since expressions
are stateless (with state only provided to evaluate()). To facilitate
migration towards stateless expressions, we define a
stateless_aggregate_function (modeled after user-defined aggregates,
which are already stateless). This new struct defines the aggregate
in terms of three scalar functions: one to aggregate a new input into
an accumulator (provided in the first parameter), one to finalize an
accumulator into a result, and one to reduce two accumulators for
parallelized aggregation.

All existing native aggregate functions are converted to the new model, and
the old interface is removed. This series does not yet convert selectors to
expressions, but it does remove one of the obstacles.

Performance evaluation: I created a table with a million ints on a single-node cluster, and ran the avg() function on them. I measured the number of instructions executed with `perf stat -p $(pgrep scylla) -e instructions` while the query was running. The query executed from cache, memtables were flushed beforehand. The instruction count per row increased from roughly 49k to roughly 52k, indicating 3k extra instructions per row. While 3k instructions to execute a function is huge, it is currently dwarfed by other overhead (and will be even less important in a cluster where it CL>1 will cause non-coordinator code to run multiple times).

Closes #13105

* github.com:scylladb/scylladb:
  cql3/selection, forward_service: use use stateless_aggregate_function directly
  db: functions: fold stateless_aggregate_function_adapter into aggregate_function
  cql3: functions: simplify accumulator_for template
  cql3: functions: base user-defined aggregates on stateless aggregates
  cql3: functions: drop native_aggregate_function
  cql3: functions: reimplement count(column) statelessly
  cql3: functions: reimplement avg() statelessly
  cql3: functions: reimplement sum() statelessly
  cql3: functions: change wide accumulator type to varint
  cql3: functions: unreverse types for min/max
  cql3: functions: rename make_{min,max}_dynamic_function
  cql3: functions: reimplement min/max statelessly
  cql3: functions: reimplement count(*) statelessly
  cql3: functions: simplify creating native functions even more
  cql3: functions: add helpers for automating marshalling for scalar functions
  types: fix big_decimal constructor from literal 0
  cql3: functions: add helper class for internal scalar functions
  db: functions: add stateless aggregate functions
  db, cql3: move scalar_function from cql3/functions to db/functions
2023-03-30 13:58:47 +03:00
Kefu Chai
ea2badb25f utils: config_file: add a space after =
for better readability

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-29 19:22:21 +08:00
Avi Kivity
cd0b167d6c Merge 'bloom_filter: cleanups' from Kefu Chai
this series applies some random cleanups to bloom_filter. these cleanups were the side products when the author was working on #13314 .

Closes #13315

* github.com:scylladb/scylladb:
  bloom_filter: mark internal help function static
  bloom_filter: add more constness to false positive rate tables
  bloom_filter: use vector::back() when appropriate
2023-03-26 19:43:37 +03:00
Botond Dénes
19560419d2 Merge 'treewide: improve compatibility with gcc 13' from Avi Kivity
An assortment of patches that reduce our incompatibilities with the upcoming gcc 13.

Closes #13243

* github.com:scylladb/scylladb:
  transport: correctly format unknown opcode
  treewide: catch by reference
  test: raft: avoid confusing string compare
  utils, types, test: extract lexicographical compare utilities
  test: raft: fsm_test: disambiguate raft::configuration construction
  test: reader_concurrency_semaphore_test: handle all enum values
  repair: fix signed/unsigned compare
  repair: fix incorrect signed/unsigned compare
  treewide: avoid unused variables in if statements
  keys: disambiguate construction from initializer_list<bytes>
  cql3: expr: fix serialize_listlike() reference-to-temporary with gcc
  compaction: error on invalid scrub type
  treewide: prevent redefining names
  api: task_manager: fix signed/unsigned compare
  alternator: streams: fix signed/unsigned comparison
  test: fix some mismatched signed/unsigned comparisons
2023-03-24 15:16:05 +02:00
Kefu Chai
a7b4f84b6a bloom_filter: mark internal help function static
as `initialize_opt_k()` is not used out side of the translation unit,
let's mark it `static`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-24 15:41:45 +08:00
Kefu Chai
1a82a7ac72 bloom_filter: add more constness to false positive rate tables
we never mutate them, so mark them const for better readability.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-24 15:41:45 +08:00
Kefu Chai
7f4a3fdac8 bloom_filter: use vector::back() when appropriate
no need to use `size - 1` for accessing the last element in a vector,
let's just use `vector::back()` for more compacted code.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-24 15:41:45 +08:00
Avi Kivity
bdfc0aa748 utils, types, test: extract lexicographical compare utilities
UUID_test uses lexicograhical_compare from the types module. This
is a layering violation, since UUIDs are at a much lower level than
the database type system. In practical terms, this cause link failures
with gcc due to some thread-local-storage variables defined in types.hh
but not provided by any object, since we don't link with types.o in this
test.

Fix by extracting the relevant functions into a new header.
2023-03-21 15:42:53 +02:00
Botond Dénes
bf8b746bca Merge 'utils: UUID: specialize fmt::formatter for UUID and tagged_uuid<>' from Kefu Chai
this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print UUID without using ostream<<. also, this change re-implements some formatting helpers using fmtlib for better performance and less dependencies on operator<<(), but we cannot drop it at this moment, as quite a few caller sites are still using operator<<(ostream&, const UUID&) and operator<<(ostream&, tagged_uuid<T>&). we will address them separately.

* add `fmt::formatter<UUID>`
* add `fmt::formatter<tagged_uuid<T>>`
* implement `UUID::to_string()` using `fmt::to_string()`
* implement `operator<<(std::ostream&, const UUID&)` with `fmt::print()`, this should help to improve the performance when printing uuid, as `fmt::print()` does not materialize a string when printing the uuid.
* treewide: use fmtlib when printing UUID

Refs #13245

Closes #13246

* github.com:scylladb/scylladb:
  treewide: use fmtlib when printing UUID
  utils: UUID: specialize fmt::formatter for UUID and tagged_uuid<>
2023-03-20 14:26:11 +02:00
Kefu Chai
c14c70b89d utils: UUID: specialize fmt::formatter for UUID and tagged_uuid<>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print UUID without using ostream<<. also, this change reimplements
some formatting helpers using fmtlib for better performance and less
dependencies on operator<<(), but we cannot drop it at this moment,
as quite a few caller sites are still using operator<<(ostream&, const UUID&)
and operator<<(ostream&, tagged_uuid<T>&). we will address them separately.

* add fmt::formatter<UUID>
* add fmt::formatter<tagged_uuid<T>>
* implement UUID::to_string() using fmt::to_string()
* implement operator<<(std::ostream&, const UUID&) with fmt::print(),
  this should help to improve the performance when printing uuid, as
  fmt::print() does not materialize a string when printing the uuid.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-20 14:25:45 +08:00
Pavel Emelyanov
c88e47a624 memory_data_sink: Add move ctor
To make it possible to move the class member away resetting to be be
empty at the same time.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13208
2023-03-20 07:55:20 +02:00
Kefu Chai
93fa70069c utils: to_string: move join to namespace utils
`join` can easily be confused with boost::algorithm::join
so make it more visible that we're using scylla's
utils implementation.

Also, move `struct print_with_comma` to utils::internal.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-03-16 20:34:18 +08:00
Avi Kivity
35dd3edb9e types: fix big_decimal constructor from literal 0
Currently, big_decimal(0) will select the big_decimal(string_view)
constructor (via 0 -> const char* -> string_view conversions).
0 is important for initializing aggregates, so fix it ahead of
using it.
2023-03-15 22:24:12 +02:00
Kefu Chai
1cd3764b08 build: cmake: add missing sources in test-lib and utils
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-15 15:38:46 +08:00
Kefu Chai
0b3d25ab1b build: cmake: add missing linkages
these dependencies were found when trying to compile
`user_function_test`. whenever a library libfoo references another one,
say, libbar, the corresponding linkage from libfoo to libbar is added.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-08 22:53:42 +08:00
Avi Kivity
6aa91c13c5 Merge 'Optimize topology::compare_endpoints' from Benny Halevy
The code for compare_endpoints originates at the dawn of time (bc034aeaec)
and is called on the fast path from storage_proxy via `sort_by_proximity`.

This series considerably reduces the function's footprint by:
1. carefully coding the many comparisons in the function so to reduce the number of conditional banches (apparently the compiler isn't doing a good enough job at optimizing it in this case)
2. avoid sstring copy in topology::get_{datacenter,rack}

Closes #12761

* github.com:scylladb/scylladb:
  topology: optimize compare_endpoints
  to_string: add print operators for std::{weak,partial}_ordering
  utils: to_sstring: deinline std::strong_ordering print operator
  move to_string.hh to utils/
  test: network_topology: add test_topology_compare_endpoints
2023-03-07 15:17:19 +02:00
Kefu Chai
6b249dd301 utils: UUID: throw marshal_exception when fail to parse uuid
* throw marshal_exception if not the whole string is parsed, we
  should error out if the parsed string contains gabage at the end.
  before this change, we silent accept uuid like
  "ce84997b-6ea2-4468-9f02-8a65abf4wxyz", and parses it as
  "ce84997b-6ea2-4468-9f02-8a65abf4". this is not correct.
* throw marshal_exception if stoull() throws,
  `stoull()` throws if it fails to parse a string to an unsigned long
  long, we should translate the exception to `marshal_exception`, so
  we can handle these exception in a consistent manner.

test is updated accordingly.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13069
2023-03-06 12:59:41 +02:00
Kefu Chai
eeb8553305 build: cmake: find and link against RapidJSON
despite that RapidJSON is a header-only library, we still need to
find it and "link" against it for adding the include directory.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-04 13:11:25 +08:00