The leak sanitizer has a bug [1] where, if it detects a leak, it
forks something, and before that, it closes all files (instead of
using close_range like a good citizen).
Docker tends to create containers with the NOFILE limit (number of
open files) set to 1 billion.
The resulting 1 billion close() system calls is incredibly slow.
Work around that problem by passing the host NOFILE limit.
[1] https://github.com/llvm/llvm-project/issues/59112Closes#12638
After topology changes like removing a node, verify that the set of
group 0 members and token ring members is the same.
Modify `get_token_ring_host_ids` to only return NORMAL members. The
previous version which used the `/storage_service/host_id` endpoint
might have returned non-NORMAL members as well.
Fixes: #12153Closes#12619
* seastar d41af8b59...943c09f86 (20):
> reactor: disable io_uring on older kernels if not enough lockable memory is available
> demos/tcp_sctp_client_demo: use user-defined literal for sizes
> core/units: add user-defined literal for IEC prefixes
> core/units: include what we use
> coroutine/exception: do not include core/coroutine.hh
> seastar/coroutine: drop std-coroutine.hh
> core/bitops.hh: add type constraits to templates
> apps/iotune: s/condition == false/!condition/
> core/metrics_api: s/promehteus/prometheus/
> reactor: make io_uring the default backend if available
> tests: connect_test: use 127.0.0.1 for connect refused test
> reactor: use aio to implement reactor_backend_uring::read()
> future: schedule: get_available_state_ref under SEASTAR_DEBUG
> rpc: client_info: add retrieve_auxiliary_opt
> Merge 'Make http requests with content-length header and generated body' from Pavel Emelyanov
> Merge 'Ensure logger doesn't allocate' from Travis Downs
> http, httpd: optimize header field assignment
> sstring: operator<< std::unordered_map: delete stray space char
> Dump memory diagnostics at error level on abort
> Fix CLI help for memory diagnostics dump
Closes#12650
There are several helpers to make an sstable for the table and two with most of the arguments are only used by tests. This PR leaves table with just one arg-less call thus making it easier to patch further.
Closes#12636
* github.com:scylladb/scylladb:
table: Shrink sstables making API
tests: Use sstables manager to make sstables
distributed_loader: Add helpers to make sstables for reshape/reshard
If a server is stopped suddenly (i.e. not graceful), schema tables might
be in inconsistent state. Add a test case and enable Scylla
configuration option (force_schema_commit_log) to handle this.
Fixes#12218Closes#12630
* github.com:scylladb/scylladb:
pytest: test start after ungraceful stop
test.py: enable force_schema_commit_log
Currently there are four helpers, this patch makes it just two and one
of them becomes private the table thus making the API small and neat
(and easy to patch further).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This test uses two many-args helpers from table calss to create sstables
with desired parameters. The table API in question is not used by any
other code but these few places, to it's better to open-code it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This kills two birds with one stone. First, it factors out (quite a lot
of) common arguments that are passed to table.make_sstable(). Second, it
makes the helpers call sstable manager with extended args making it
possible to remove those wrappers from table class later.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Although the number of keyspaces should mostly be 1 here, and thus the
chance of two tables from different keyspaces colliding is miniscule, it
is not zero. Better be safe than sorry, so match the keyspace name too
when looking up a table.
Closes#12627
Each time backlog tracker is informed about a new or old sstable, it
will recompute the static part of backlog which complexity is
proportional to the total number of sstables.
On schema change, we're calling backlog_tracker::replace_sstables()
for each existing sstable, therefore it produces O(N ^ 2) complexity.
Fixes#12499.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#12593
This phrase is inaccurate and unnecessary. We know all lines in the
printout are for reads and they are semaphores: no need to repeat this
information on each line.
Example:
Read Concurrency Semaphores:
read: 0/100, 0/ 41901096, queued: 0
streaming: 0/ 10, 0/ 41901096, queued: 0
system: 0/ 10, 0/ 41901096, queued: 0
Closes#12633
The dbuild README has an example how to enable ccache, and required
modifying the PATH. Since recently, our docker image includes
required commands (cxxbridge) in /usr/local/bin, so the build will
fail if that directory isn't also in the path - so add it in the
example.
Also use the opportunity to fix the "/home/nyh" in one example to
"$HOME".
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#12631
`ScyllaCluster.server_stop` had this piece of code:
```
server = self.running.pop(server_id)
if gracefully:
await server.stop_gracefully()
else:
await server.stop()
self.stopped[server_id] = server
```
We observed `stop_gracefully()` failing due to a server hanging during
shutdown. We then ended up in a state where neither `self.running` nor
`self.stopped` had this server. Later, when releasing the cluster and
its IPs, we would release that server's IP - but the server might have
still been running (all servers in `self.running` are killed before
releasing IPs, but this one wasn't in `self.running`).
Fix this by popping the server from `self.running` only after
`stop_gracefully`/`stop` finishes.
Make an analogous fix in `server_start`: put `server` into
`self.running` *before* we actually start it. If the start fails, the
server will be considered "running" even though it isn't necessarily,
but that is OK - if it isn't running, then trying to stop it later will
simply do nothing; if it is actually running, we will kill it (which we
should do) when clearing after the cluster; and we don't leak it.
Closes#12613
Test case for a start of a server after it was stopped suddenly (instead
of gracefully). This coud cause commitlog flush issues.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
New clusters that use a fresh conf/scylla.yaml will have `consistent_cluster_management: true`, which will enable Raft, unless the user explicitly turns it off before booting the cluster.
People using existing yaml files will continue without Raft, unless consistent_cluster_management is explicitly requested during/after upgrade.
Also update the docs: cluster creation and node addition procedures.
Fixes#12572.
Closes#12585
* github.com:scylladb/scylladb:
docs: mention `consistent_cluster_management` for creating cluster and adding node procedures
conf: enable `consistent_cluster_management` by default
Its _it member keeps state about the current range.
Although it's modified by the method, this is an implementation
detail that irrelevant to the caller, hence mark the
belongs_to_current_node method as const (and noexcept while
at it).
This allows the caller, cleanup_compaction, to use it from
inside a const method, without having to mark
its respective member as mutable too.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#12634
change row purge condition for compacting_reader to remove all expired
rows to avoid read perfomance problems when there are many expired
tombstones in row cache
Refs #2252Closes#12565
From reviews of https://github.com/scylladb/scylladb/pull/12569, avoid
using `async with` and access the `Pool` of clusters with
`get()`/`put()`.
Closes#12612
* github.com:scylladb/scylladb:
test.py: manual cluster handling for PythonSuite
test.py: stop cluster if PythonSuite fails to start
test.py: minor fix for failed PythonSuite test
- makes all regexes static
If making regex compilation static
for uuid_type_impl and timeuuid_type_impl helps then it should
also help for timestamp_type and simple_date_type.
- remove unnecessary tolower transform in simple_date_type_impl::from_sstring
Following function uses only decimal and '-' characters (see date_re). They are not
affected by tolower call in any way.
Aditionally std::strtoll supports "0x" prefixes but also accepts
upper case version "0X" so it's also not affected by tolower call.
get_simple_date_time only casts strings to integer types using
boost:lexical_cast so also not affected by tolower.
Finally, serialize only uses str to include it in an exception text
so tolower doesn't affect it in a positive way. It's even better
that input is displayed to the user as it was, not converted to lower
case.
Closes#12621
* github.com:scylladb/scylladb:
types: remove unnecessary tolower transform in simple_date_type_impl::from_sstring
types: make all regexes static
Most of snitch drivers set _my_dc and _my_rack with direct assignment
thus skipping the sanity checks for dc/rack being empty. On other shards
they call set_my_dc_and_rack() helper which warns the empty value and
replaces it with some defaults.
It's better to use the helper on all shards in order to have the same
dc/rack values everywhere.
refs: #12185
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#12524
Issue #12538 suggested that maybe Alternator shouldn't bother reporting an
invalid table name in item operations like PutItem, and that it's enough
to report that the table doesn't exist. But the test added in this patch
shows that DynamoDB, like Alternator, reports the invalid table name in
this case - not just that the table doesn't exist.
That should make us think twice before acting on issue #12538. If we do
what this issue recommended, this test will need to be fixed (e.g., to
accept as correct both types of errors).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#12608
before this change, we returns the total memory managed by Seastar
in the "total" field in system.memory. but this value only reflect
the total memory managed by Seastar's allocator. if
`reserve_additional_memory` is set when starting app_template,
Seastar's memory subsystem just reserves a chunk of memory of this
specified size for system, and takes the remaining memory. since
f05d612da8, we set this value to 50MB for wasmtime runtime. hence
the test of `TestRuntimeInfoTable.test_default_content` in dtest
fails. the test expects the size passed via the option of
`--memory` to be identical to the value reported by system.memory's
"total" field.
after this change, the "total" field takes the reserved memory
for wasm udf into account. the "total" field should reflect the total
size of memory used by Scylla, no matter how we use a certain portion
of the allocated memory.
Fixes#12522
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#12573
Instead of complex async with logic, use manual cluster pool handling.
Revert the discard() logic in Pool from a recent commit.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Even though test can't fail both before and after, make the logic
explicit in case code changes in the future.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
add Make variable named `PREVIEW_HOST` so it can be overriden like
```
make preview PREVIEW_HOST=$(hostname -I | cut -d' ' -f 1)
```
it allows developer to preview the document if the host buiding the
document is not localhost.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#12589
It's a simple helper used during boot-time that can enjoy query-processor from sharded<system_keyspace>
Closes#12587
* github.com:scylladb/scylladb:
system_keyspace: De-static system_keyspace::increment_and_get_generation
system_keyspace: Fix indentation after previous patch
system_keyspace: Coroutinize system_keyspace::increment_and_get_generation
Following function uses only decimal and '-' characters (see date_re). They are not
affected by tolower call in any way.
Aditionally std::strtoll supports "0x" prefixes but also accepts
upper case version "0X" so it's also not affected by tolower call.
get_simple_date_time only casts strings to integer types using
boost:lexical_cast so also not affected by tolower.
Finally, serialize only uses str to include it in an exception text
so tolower doesn't affect it in a positive way. It's even better
that input is displayed to the user as it was, not converted to lower
case.
The reasons for force-disabling are doubly wrong: we now
use liburing from Fedora 37, which is sufficiently recent,
and the auto-detection code will disable io_uring if a
sufficiently recent version isn't present.
Closes#12620
There was a small chance that we called `timeout_src.request_abort()`
twice in the `with_timeout` function, first by timeout and then by
shutdown. `abort_source` fails on an assertion in this case. Fix this.
Fixes: #12512Closes#12514
Fix https://github.com/scylladb/scylla-docs/issues/3968
This PR adds the information that an upgrade to each successive major version is required to upgrade from an old ScyllaDB version.
Closes#12586
* github.com:scylladb/scylladb:
docs: remove repetition
doc: add the general upgrade policy to the uprage page
Don't use a range scan, which is very inefficient, to perform a query for checking CQL availability.
Improve logging when waiting for server startup times out. Provide details about the failure: whether we managed to obtain the Host ID of the server and whether we managed to establish a CQL connection.
Closes#12588
* github.com:scylladb/scylladb:
test/pylib: scylla_cluster: better logging for timeout on server startup
test/pylib: scylla_cluster: use less expensive query to check for CQL availability
Waiting for server startup is a multi-step procedure: after we start the
actual process, we will:
- try to obtain the Host ID (by querying a REST API endpoint)
- then try to connect a CQL session
- then try to perform a CQL query
The steps are repeated every .1 second until we reach a timeout (the
Host ID step is skipped if we previously managed to obtain it).
On timeout we'd only get a generic "failed to start server" message, it
wouldn't say what we managed to do and what not.
For example, on one of the failed jobs on Jenkins I observed this
timeout error. Looking at the logs of the server, it turned out that the
server printed the "initialization completed" message more than 2
minutes before the actual timeout happened. So for 2 minutes, the test
framework either couldn't obtain the Host ID, or couldn't establish a
CQL connection, or couldn't perform a CQL query, but I wasn't able to
determine fully which one of these was the case.
Improve the code by printing whether we managed to get the Host ID of
the server and if so - whether we managed to connect to CQL.
Fix https://github.com/scylladb/scylladb/issues/12605#event-8328930604
This PR removes the duplicated content (the file with the table was included twice) and reorganizes the content in the Networking section.
Closes#12615
* github.com:scylladb/scylladb:
doc: fix the broken link
doc: replace Scylla with ScyllaDB
doc: remove duplication in the Networking section (the table of ports used by ScyllaDB
If the after test check fails (is_after_test_ok is False), discard the cluster and raise exception so context manager (pool) does not recycle it.
Ignore exception re-raised by the context manager.
Fixes#12360Closes#12569
* github.com:scylladb/scylladb:
test.py: handle broken clusters for Python suite
test.py: Pool discard method
Add and use dht::compaction_group_of that computes the
compaction_group index by unbiasing the token,
similar to dht::shard_of.
This way, all tokens in `_compaction_groups[i]` are ordered
before `_compaction_groups[j]` iff i < j.
Fixes#12595
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#12599
It's only called on cluster-join from storage_service which has the
local system_keyspace reference and it's already started by that time.
This allows removing few more occurrences of global qctx.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Just unroll the fn().then({ fn2().then().then(); }); chain.
Indentation is deliberately left broken.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently reverse types match the default case (false), even though they
might be wrapping a tuple type. One user-visible effect of this is that
a schema, which has a reversed<frozen<UDT>> clustering key component,
will have this component incorrectly represented in the schema cql dump:
the UDT will loose the frozen attribute. When attempting to recreate
this schema based on the dump, it will fail as the only frozen UDTs are
allowed in primary key components.
Fixes: #12576Closes#12579