Commit Graph

125 Commits

Author SHA1 Message Date
Lakshmi Narayanan Sreethar
84d06a13c7 api: compaction: add consider_only_existing_data option
Added a new parameter `consider_only_existing_data` to major compaction
API endpoints. When enabled, major compaction will:

- Force-flush all tables.
- Force a new active segment in the commit log.
- Compact all existing SSTables and garbage-collect tombstones by only
  checking the SSTables being compacted. Memtables, commit logs, and
  other SSTables not part of the compaction will not be checked, as they
  will only contain newer data that arrived after the compaction
  started.

The `consider_only_existing_data` option is passed down to the compaction
descriptor's `gc_check_only_compacting_sstables` option to ensure that
only the existing data is considered for garbage collection.

The option is also passed to the `maybe_flush_commitlog` method to make
sure all the tables are flushed and a new active segment is created in
the commit log.

Fixes #19728

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-09-05 17:25:45 +05:30
Avi Kivity
aa1270a00c treewide: change assert() to SCYLLA_ASSERT()
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.

Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.

To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.
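
As a rough illustration of the idea (a hypothetical stand-in, not the actual definition in utils/assert.hh), an assertion macro that does not depend on NDEBUG could look like the sketch below. The names ALWAYS_ASSERT and checked_div are invented for this example.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical always-on assertion: unlike assert(), it is never compiled
// out, regardless of whether NDEBUG is defined.
#define ALWAYS_ASSERT(expr)                                              \
    do {                                                                 \
        if (!(expr)) {                                                   \
            std::fprintf(stderr, "%s:%d: assertion failed: %s\n",        \
                         __FILE__, __LINE__, #expr);                     \
            std::abort();                                                \
        }                                                                \
    } while (0)

// Example caller (hypothetical): the check stays active in release builds.
int checked_div(int a, int b) {
    ALWAYS_ASSERT(b != 0);
    return a / b;
}
```

Because the macro never consults NDEBUG, defining NDEBUG later affects only plain assert() calls and leaves these checks in place.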

[1] 66ef711d68

Closes scylladb/scylladb#20006
2024-08-05 08:23:35 +03:00
Raphael S. Carvalho
ad5c5bca5f replica: get rid of fragile compaction group intrusive list
It was added to make integration of storage groups easier, but it's
complicated since it's another source of truth and we could have
problems if it becomes inconsistent with the group map.

Fixes #18506.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2024-07-09 16:53:35 -03:00
Pavel Emelyanov
31d05925cc api,database: Move auto-compaction toggle guard
Toggling per-table auto-compaction enabling bit is guarded with
on-database boolean and raii guard. It's only used by a single
api/column_family.cc file, so it can live there.
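
A minimal sketch of what a boolean plus RAII guard can look like, assuming the guard rejects a toggle while another one is in flight. The class toggle_guard and the function set_auto_compaction are hypothetical and do not reproduce the code in api/column_family.cc.

```cpp
#include <stdexcept>

// Hypothetical RAII guard: marks a toggle as in progress for its lifetime
// and clears the flag on scope exit, including when an exception unwinds.
class toggle_guard {
    bool& _in_progress;
public:
    explicit toggle_guard(bool& in_progress) : _in_progress(in_progress) {
        if (_in_progress) {
            throw std::runtime_error("toggle already in progress");
        }
        _in_progress = true;
    }
    ~toggle_guard() { _in_progress = false; }
    toggle_guard(const toggle_guard&) = delete;
    toggle_guard& operator=(const toggle_guard&) = delete;
};

// Hypothetical toggle: flips the per-table bit while holding the guard.
bool set_auto_compaction(bool& in_progress, bool& enabled, bool value) {
    toggle_guard guard(in_progress);
    enabled = value;
    return enabled;
}
```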

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-16 14:42:51 +03:00
Pavel Emelyanov
a43b178f72 api: Move some table manipulation helpers from storage_service
Continuation of the previous patch -- helpers toggling tombstone_gc and
auto_compaction on tables should live in the same file that uses them.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-16 14:42:50 +03:00
Pavel Emelyanov
862fcd7bc7 api: Move table-related calls from storage_service domain
The storage_service/(enable|disable)_(tombstone_gc|auto_compaction)
endpoints are not handled by storage_service _service_ and should rather
live in the column_family/ domain which is handled by replica::database.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-16 14:42:50 +03:00
Pavel Emelyanov
ba53283d21 api: Reimplement some endpoints using existing helpers
The (enable|disable)_(tombstone_gc|auto_compaction) endpoints living in
column_family domain can benefit from the helpers that do the same in
the storage_service domain. The "difference" is that c.f. endpoints do
it per-table, while s.s. ones operate on a vector of tables, so the
former is a corner case of the latter.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-16 14:42:50 +03:00
Pavel Emelyanov
231ffa623c api: Lost unset of tombstone-gc endpoints
On stop, all endpoints must be unregistered; these three were missed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-16 14:42:50 +03:00
Nadav Har'El
1aacfdf460 REST API: stop using deprecated, buggy, path parameter
The API req->param["name"] to access parameters in the path part of the
URL was buggy - it forgot to do URL decoding and the result of our use
of it in Scylla was bugs like #5883 - where special characters in certain
REST API requests got botched up (encoded by the client, then not
decoded by the server).

The solution is to replace all uses of req->param["name"] by the new
req->get_path_param("name"), which does the decoding correctly.
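
To illustrate the kind of decoding that was missing, here is a minimal percent-decoding sketch. It is not Seastar's implementation; percent_decode and hex_value are hypothetical helpers written only for this example.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Returns the numeric value of a hex digit, or -1 if c is not a hex digit.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes %XX escapes in a URL path segment.
// Returns std::nullopt if an escape is truncated or not valid hex.
std::optional<std::string> percent_decode(std::string_view in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '%') {
            out.push_back(in[i]);
            continue;
        }
        if (i + 2 >= in.size()) {
            return std::nullopt;  // fewer than two characters after '%'
        }
        const int hi = hex_value(in[i + 1]);
        const int lo = hex_value(in[i + 2]);
        if (hi < 0 || lo < 0) {
            return std::nullopt;
        }
        out.push_back(static_cast<char>(hi * 16 + lo));
        i += 2;  // skip the two hex digits just consumed
    }
    return out;
}
```

With such a step in place, a client-encoded segment like "ks%3Acf" reaches the handler as "ks:cf" instead of the raw encoded form.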

Unfortunately we needed to change 104 (!) callers in this patch, but the
transformation is mostly mechanical and there are no functional changes in
this patch. Another set of changes was to bring req, not req->param, to
a few functions that want to get the path param.

This patch avoids the numerous deprecation warnings we had before, and
more importantly, it fixes #5883.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-05-02 12:33:46 +03:00
Pavel Emelyanov
186b36165e snapshot: Move per-table snap API to other snapshot endpoints
So that they are collected in one place and to facilitate next patch
that's going to use snapshot-ctl for per-table API too

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-25 10:05:01 +03:00
Pavel Emelyanov
ceac65be1e api: Reserve vectors in advance
Some endpoints in api/column_family fill vectors with data obtained from
database and return them back. Since the amount of data is known in
advance, it's good to reserve the vector.
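
A small sketch of the pattern, using a std::map as a stand-in for the database's table collection. The function table_names and its argument are hypothetical and only illustrate reserving before filling.

```cpp
#include <map>
#include <string>
#include <vector>

// Collects table names into a vector. The element count is known up front,
// so reserve() performs a single allocation instead of repeated growth.
std::vector<std::string> table_names(const std::map<std::string, int>& tables) {
    std::vector<std::string> names;
    names.reserve(tables.size());
    for (const auto& [name, id] : tables) {
        (void)id;  // only the name is needed in this sketch
        names.push_back(name);
    }
    return names;
}
```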

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-02-20 19:13:05 +03:00
Pavel Emelyanov
f3e58cb806 api: Use range-loop to iterate keyspaces
The code uses standard for (;;) loop, but range version is nicer

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-02-20 19:12:12 +03:00
Kefu Chai
9afec2e3e7 api, compaction: promote flush_mode
so that this enum type can be shared by other task(s) as well.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-01 11:25:53 +08:00
Kefu Chai
ffb5ad494f api: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16973
2024-01-25 11:28:02 +03:00
Benny Halevy
b12b142232 api: add /storage_service/compact
For major compacting all tables in the database.
The advantage of this api is that `commitlog->force_new_active_segment`
happens only once in `database::flush_all_tables` rather than
once per keyspace (when `nodetool compact` translates to
a sequence of `/storage_service/keyspace_compaction` calls).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-11-28 16:37:42 +02:00
Benny Halevy
1fd85bd37b api: compaction: add flush_memtables option
When flushing is done externally, e.g. by running
`nodetool flush` prior to `nodetool compact`,
flush_memtables=false can be passed to skip flushing
of tables right before they are major-compacted.

This is useful to prevent creation of small sstables
due to excessive memtable flushing.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-11-28 16:37:42 +02:00
Aleksandra Martyniuk
e072a2341d replica: api: return table_id instead of const table_id&
Return table_id instead of const table_id& from database::find_uuid
as copying table_id does not cause much overhead and simplifies
methods signature.
2023-07-25 17:13:24 +02:00
Aleksandra Martyniuk
cdbfa0b2f5 replica: iterate safely over tables related maps
Loops over _column_families and _ks_cf_to_uuid which may preempt
are protected by reader mode of rwlock so that iterators won't
get invalid.
2023-07-25 17:13:04 +02:00
Aleksandra Martyniuk
52afd9d42d replica: wrap column families related maps into tables_metadata
As a preparation for ensuring access safety for column families
related maps, add tables_metadata, access to members of which
would be protected by rwlock.
2023-07-25 16:13:00 +02:00
Aleksandra Martyniuk
61dc98b276 api: prevent non-owner cpu access to shared_ptr
In get_sstables_for_key in api/column_family.cc, a set of lw_shared_ptrs
to sstables is passed to the reducer of map_reduce0. The reducer then accesses
these shared pointers. As the reducer is invoked on the shard where
map_reduce0 was called, we have an illegal access to a shared pointer
on a non-owner cpu.

The set of shared pointers to sstables is now transformed in the map function,
which is guaranteed to be invoked on a shard associated with the service.

Fixes: #14515.

Closes #14532
2023-07-09 23:09:59 +03:00
Pavel Emelyanov
198bca98ec table: Return shared sstable from get_sstables_by_partition_key()
The call is generic enough not to drop the sstable itself on return so
that callers can do whatever they need with it. Today's only caller
is the API, which will convert sstables to filenames on its own.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-06-07 15:04:48 +03:00
Aleksandra Martyniuk
f48b57e7b9 compaction: use table_info in compaction tasks
Task manager compaction tasks need table names for logs.
Thus, compaction tasks store table infos instead of table ids.

get_table_ids function is deleted as it isn't used anywhere.
2023-05-30 09:58:55 +02:00
Raphael S. Carvalho
abc1eae1c2 Add API to disable tombstone GC in compaction
Adding new APIs /column_family/tombstone_gc and
/storage_service/tombstone_gc.

Mimics existing APIs /column_family/autocompaction and
/storage_service/autocompaction.

The column_family variant must specify a single table only,
following existing convention, whereas the storage_service
one can specify an entire keyspace or a subset of tables
in a keyspace.

column_family API usage
-----

The table name must be in keyspace:name format

Get status:
curl -s -X GET "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf"

Enable GC
curl -s -X POST "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf"

Disable GC
curl -s -X DELETE "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf"

storage_service API usage
-----

Tables can be specified using a comma-separated list.

Enable GC on keyspace
curl -s -X POST "http://127.0.0.1:10000/storage_service/tombstone_gc/ks"

Disable GC on keyspace
curl -s -X DELETE "http://127.0.0.1:10000/storage_service/tombstone_gc/ks"

Enable GC on a subset of tables
curl -s -X POST "http://127.0.0.1:10000/storage_service/tombstone_gc/ks?cf=table1,table2"

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-05-12 10:34:38 -03:00
Botond Dénes
b9491c0134 Merge 'Test the column_family rest api' from Benny Halevy
Add a test for get/enable/disable auto_compaction via the column_family api.
And add log messages for admin operations over that api.

Closes #13566

* github.com:scylladb/scylladb:
  api: column_family: add log messages for admin operation
  test: rest_api: add test_column_family
2023-04-25 09:53:47 +02:00
Pavel Emelyanov
5e201b9120 database: Remove compaction_manager.hh inclusion into database.hh
The only reason why it's there (right next to compaction_fwd.hh) is
because the database::table_truncate_state subclass needs the definition
of compaction_manager::compaction_reenabler subclass.

However, the former sub is not used outside of database.cc and can be
defined in .cc. Keeping it outside of the header allows dropping the
compaction_manager.hh from database.hh thus greatly reducing its fanout
over the code (from ~180 indirect inclusions down to ~20).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13622
2023-04-23 16:27:11 +03:00
Benny Halevy
456f5dfce5 api: column_family: add log messages for admin operation
Similar to the storage_service api, print a log message
for admin operations like enabling/disabling auto_compaction,
running major compaction, and setting the table compaction
strategy.

Note that there is overlap in functionality
between the storage_service and the column_family api entry points.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-18 17:11:33 +03:00
Avi Kivity
7a42927a3d treewide: stop using 'using namespace std' in namespace scope
Such namespace-wide imports can create conflicts between names that
are the same in seastar and std, such as {std,seastar}::future and
{std,seastar}::format, since we also have 'using namespace seastar'.

Replace the namespace imports with explicit qualification, or with
specific name imports.
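
The sketch below shows the kind of clash described, with two invented namespaces lib_a and lib_b standing in for std and seastar. All names here are hypothetical.

```cpp
#include <string>

namespace lib_a {
inline std::string format(int v) { return "a:" + std::to_string(v); }
}

namespace lib_b {
inline std::string format(int v) { return "b:" + std::to_string(v); }
}

// With both `using namespace lib_a;` and `using namespace lib_b;` at
// namespace scope, an unqualified call format(1) would be ambiguous.
// Explicit qualification keeps each call site unambiguous.
std::string describe(int v) {
    return lib_a::format(v) + "," + lib_b::format(v);
}
```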

Closes #13528
2023-04-17 14:08:37 +03:00
Aleksandra Martyniuk
0918529fdf api: unify major compaction
Major compaction can be started from both the storage_service and column_family
apis. The former allows compacting a subset of tables in a given keyspace,
while the latter compacts a given table in a given keyspace.

As major compaction started from storage_service has a wider scope,
we use its mechanisms for column_family's one. That makes it more consistent
and reduces number of classes that would be needed to cover the major
compaction with task manager's tasks.
2023-03-10 15:01:22 +01:00
Kefu Chai
5522080f80 api: s/request/http::request/
seastar::httpd::request was deprecated in favor of `seastar::http::request`
since bdd5d929891d2cb821eca25896e25ed4ff658b7a.
so let's use the latter. this change also silences the warning of:

```
/home/kefu/dev/scylladb/api/authorization_cache.cc: In function ‘void api::set_authorization_cache(http_context&, seastar::httpd::routes&, seastar::sharded<auth::service>&)’:
/home/kefu/dev/scylladb/api/authorization_cache.cc:19:104: error: ‘using seastar::httpd::request = struct seastar::http::request’ is deprecated: Use http::request instead [-Werror=deprecated-declarations]
   19 |     httpd::authorization_cache_json::authorization_cache_reset.set(r, [&auth_service] (std::unique_ptr<request> req) -> future<json::json_return_type> {
      |                                                                                                        ^~~~~~~
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-07 14:03:42 +08:00
Kefu Chai
0cb842797a treewide: do not define/capture unused variables
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-02-15 22:57:18 +02:00
Pavel Emelyanov
d021aaf34d system_keyspace: De-static calls that update view-building tables
There's a bunch of them used mainly by view_builder and also by the API
and storage_service. All use the global qctx to do their job; now that the
callers have main-local sharded<system_keyspace> references they can be
made non-static.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-03 21:56:54 +03:00
Pavel Emelyanov
b347a0cf0b api: Unset column_family endpoints
The API calls in question will use system keyspace, that starts before
(and thus stops after) and nowadays indirectly uses database instance
that also starts earlier (and also stops later), so this avoids
potential dangling references.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-03 18:59:28 +03:00
Pavel Emelyanov
eac2e453f2 api: Carry sharded<db::system_keyspace> reference over
There's the column_family/get_built_indexes call that calls a system
keyspace method to fetch data from scylla_views_builds_in_progress
table, so the system keyspace reference will be needed in the API
handler.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-03 18:57:43 +03:00
Raphael S. Carvalho
4e836cb96c api: Estimate pending tasks on all compaction groups
Estimates # of compaction jobs to be performed on a table.
Adaptation is done by adding estimation from all groups.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-12-19 11:16:17 -03:00
Raphael S. Carvalho
ef8f542d75 replica: Adapt table::active_memtable() to compaction groups
active_memtable() was fine for a single group, but with multiple groups,
there will be one active memtable per group. Let's change the
interface to reflect that.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-12-19 11:15:14 -03:00
Avi Kivity
b0814bdd42 api: column_family: fix memtable off-heap memory reporting
We report virtual memory used, but that's not a real accounting
of the actual memory used. Use the correct real_memory_used() instead.

Note that this isn't a recent regression and was probably broken forever.
However nobody looks at this measure (and it's usually close to the
correct value) so nobody noticed.

Since it's so minor, I didn't bother filing an issue.
2022-10-04 13:56:29 +03:00
Avi Kivity
bc2fcf5187 dirty_memory_manager: unscramble terminology
Before 95f31f37c1 ("Merge 'dirty_memory_manager: simplify
region_group' from Avi Kivity"), we had two region_group
objects, one _real_region_group and another _virtual_region_group,
each with a set of "soft" and "hard" limits and related functions
and members.

In 95f31f37c1, we merged _real_region_group into _virtual_region_group,
but unfortunately the _real_region_group members received the "hard"
prefix when they got merged. This overloads the meaning of "hard" -
is it related to soft/hard limit or is it related to the real/virtual
distinction?

This patch applies some renaming to restore consistency. Anything
that came from _virtual_region_group now has "virtual" in its name.
Anything that came from _real_region_group now has "real" in its name.
The terms are still pretty bad but at least they are consistent.
2022-10-04 13:56:28 +03:00
Benny Halevy
fb7e55b0a8 api/column_family: add include db/system_keyspace.hh
For db::system_keyspace::load_view_build_progress, which is currently
satisfied only indirectly via sstables/sstables.hh ->
db/large_data_handler.hh

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-29 12:42:54 +03:00
Benny Halevy
257d74bb34 schema, everywhere: define and use table_id as a strong type
Define table_id as a distinct utils::tagged_uuid modeled after raft
tagged_id, so it can be differentiated from other uuid-class types,
in particular from table_schema_version.
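
A minimal sketch of the strong-type idea, assuming a 64-bit integer payload instead of a real UUID. The template tagged_id and the tag structs are hypothetical and are not the utils::tagged_uuid definition.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical tagged wrapper: the Tag parameter makes each instantiation
// a distinct type even though the stored representation is identical.
template <typename Tag>
class tagged_id {
    std::uint64_t _value;
public:
    explicit tagged_id(std::uint64_t v) : _value(v) {}
    std::uint64_t value() const { return _value; }
    friend bool operator==(const tagged_id& a, const tagged_id& b) {
        return a._value == b._value;
    }
};

struct table_id_tag {};
struct table_schema_version_tag {};

using table_id = tagged_id<table_id_tag>;
using table_schema_version = tagged_id<table_schema_version_tag>;

// The two aliases cannot be mixed up: neither converts to the other.
static_assert(!std::is_same_v<table_id, table_schema_version>);
static_assert(!std::is_convertible_v<table_id, table_schema_version>);
```

Passing a table_schema_version where a table_id is expected then fails at compile time rather than silently compiling.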

Fixes #11207

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-08 08:09:41 +03:00
Amnon Heiman
99a060126d database: Reduce the number of per-table metrics
This patch reduces the number of metrics that is reported per table, when
the per-table flag is on.

When possible, it moves from time_estimated_histogram and
timed_rate_moving_average_and_histogram to use the unified timer.

Instead of a histogram per shard, it will now report a summary per shard
and a histogram per node.

Counters, histograms, and summaries will not be reported if they were
never used.

The API was updated accordingly so it would not break.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2022-07-27 16:58:52 +03:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
bbad8f4677 replica: move ::database, ::keyspace, and ::table to replica namespace
Move replica-oriented classes to the replica namespace. The main
classes moved are ::database, ::keyspace, and ::table, but a few
ancillary classes are also moved. There are certainly classes that
should be moved but aren't (like distributed_loader) but we have
to start somewhere.

References are adjusted treewide. In many cases, it is obvious that
a call site should not access the replica (but the data_dictionary
instead), but that is left for separate work.

scylla-gdb.py is adjusted to look for both the new and old names.
2022-01-07 12:04:38 +02:00
Benny Halevy
8cbecb1c21 database: find_uuid: throw no_such_column_family exception if ks/cf were not found
Rather than masquerading all errors as std::out_of_range("")
convert only the std::out_of_range error from _ks_cf_to_uuid.at()
to no_such_column_family(ks, cf).  That relieves all callers of
find_uuid from doing that conversion themselves.

For example, get_uuid in api/column_family now only deals with converting
no_such_column_family to bad_param_exception, as it needs to do
at the api level, rather than generating a similar error from scratch.
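
The two-level translation can be sketched as follows. The exception types, the map layout, and the int identifier are simplified stand-ins chosen for this example and do not match the real signatures.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Simplified stand-ins for the real exception types.
struct no_such_column_family : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct bad_param_exception : std::runtime_error {
    using std::runtime_error::runtime_error;
};

using ks_cf_map = std::map<std::pair<std::string, std::string>, int>;

// Database level: only the lookup's out_of_range becomes a domain error.
int find_uuid(const ks_cf_map& m, const std::string& ks, const std::string& cf) {
    try {
        return m.at(std::make_pair(ks, cf));
    } catch (const std::out_of_range&) {
        throw no_such_column_family("Can't find a column family " + cf +
                                    " in keyspace " + ks);
    }
}

// API level: translates the domain error into a request-parameter error.
int get_uuid(const ks_cf_map& m, const std::string& ks, const std::string& cf) {
    try {
        return find_uuid(m, ks, cf);
    } catch (const no_such_column_family& e) {
        throw bad_param_exception(e.what());
    }
}
```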

Other call sites required no intervention.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-12-08 16:35:38 +02:00
Benny Halevy
b60d697084 table: futurize disable_auto_compactions
So it can stop ongoing compactions and wait
for them to complete.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-30 08:33:04 +02:00
Raphael S. Carvalho
e2f6a47999 compaction: switch to table_state in estimated_pending_compactions()
Last method in compaction_strategy using table. From now on,
compaction strategy no longer works directly with table.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-11-09 11:25:28 -03:00
Pavel Emelyanov
fece1a2f9f api: Fix indentation after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-10-11 11:13:56 +03:00
Pavel Emelyanov
c5128eea67 api, database, storage_service: Unify auto-compaction toggle
There are two knobs here -- global and per-table one. Both were added
without any synchronisation, but the former one was later fixed to
become serialized and not to be available "too early".

This patch unifies both toggles to be serialized with each other and
not be enabled too early.

The justification for this change is to move the global toggle from out
of the storage service, as it really belongs to the database, not the
storage service. Respectively, the current synchronization, that depends
on storage service internals, should be replaced with something else.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-10-11 11:12:39 +03:00
Avi Kivity
115d6d8d4c system_keyspace: prepare forward-declared members
In anticipation of making system_keyspace a class instead of a
namespace, rename any member that is currently forward-declared,
since one can't forward-declare a class member. Each member
is taken out of the system_keyspace namespace and gains a
system_keyspace prefix. Aliases are added to reduce code churn.

The result isn't lovely, but can be adjusted later.
2021-09-13 15:11:26 +03:00
Avi Kivity
6221b90b89 secondary_index_manager: stop including expression.hh
Use a forward declaration of cql3::expr::oper_t to reduce the
number of translation units depending on expression.hh.

Before:

    $ find build/dev -name '*.d' | xargs cat | grep -c expression.hh
    272

After:

    $ find build/dev -name '*.d' | xargs cat | grep -c expression.hh
    154

Some translation units adjust their includes to restore access
to required headers.

Closes #9229
2021-08-22 21:21:46 +03:00
Piotr Jastrzebski
90a607e844 api: use proper type to reduce partition count
Partition count is of type size_t but we use std::plus<int>
to reduce values of partition count in various column families.
This patch changes the argument of std::plus to the right type.
Using std::plus<int> for size_t compiles but does not work as expected.
For example plus<int>(2147483648LL, 1LL) = -2147483647 while the code
would probably want 2147483649.
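
The effect can be reproduced with the short sketch below. It uses std::uint64_t as a stand-in for a 64-bit size_t, and the helper names add_as_int and add_as_u64 are hypothetical. The narrowing result assumes the usual two's-complement wraparound, which mainstream compilers apply and C++20 guarantees.

```cpp
#include <cstdint>
#include <functional>

// Both operands are implicitly narrowed to int before the addition,
// so values above INT_MAX wrap around.
int add_as_int(std::uint64_t a, std::uint64_t b) {
    return std::plus<int>()(a, b);
}

// The operands keep their full width, so the sum is what a caller expects.
std::uint64_t add_as_u64(std::uint64_t a, std::uint64_t b) {
    return std::plus<std::uint64_t>()(a, b);
}
```

For the inputs quoted above, add_as_int(2147483648, 1) gives -2147483647, while add_as_u64(2147483648, 1) gives 2147483649.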

Fixes #9090

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>

Closes #9074
2021-07-26 11:53:06 +03:00