Added a new parameter `consider_only_existing_data` to major compaction
API endpoints. When enabled, major compaction will:
- Force-flush all tables.
- Force a new active segment in the commit log.
- Compact all existing SSTables and garbage-collect tombstones by only
checking the SSTables being compacted. Memtables, commit logs, and
other SSTables not part of the compaction will not be checked, as they
will only contain newer data that arrived after the compaction
started.
The `consider_only_existing_data` is passed down to the compaction
descriptor's `gc_check_only_compacting_sstables` option to ensure that
only the existing data is considered for garbage collection.
The option is also passed to the `maybe_flush_commitlog` method to make
sure all the tables are flushed and a new active segment is created in
the commit log.
Fixes#19728
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
When flushing is done externally, e.g. by running
`nodetool flush` prior to `nodetool compact`,
flush_memtables=false can be passed to skip flushing
of tables right before they are major-compacted.
This is useful to prevent creation of small sstables
due to excessive memtable flushing.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
In definition for /column_family/major_compaction/{name} there is an
incorrect use of "bool" instead of "boolean".
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Closes#9516
This change enhances the toppartitions api to also return
the cardinality of the read and write sample sets. It now uses
the size() method of space_saving_top_k class, counting the unique
operations in the sampled set for up to the given capacity.
Fixes#4089Closes#7766
The patch implements:
- /storage_service/auto_compaction API endpoint
- /column_family/autocompaction/{name} API endpoint
Those APIs allow to control and request the status of background
compaction jobs for the existing tables.
The implementation introduces the table::_compaction_disabled_by_user.
Then the CompactionManager checks if it can push the background
compaction job for the corresponding table.
New members
===
table::enable_auto_compaction();
table::disable_auto_compaction();
bool table::is_auto_compaction_disabled_by_user() const
Test
===
Tests: unit(sstable_datafile_test autocompaction_control_test), manual
$ ninja build/dev/test/boost/sstable_datafile_test
$ ./build/dev/test/boost/sstable_datafile_test --run_test=autocompaction_control_test -- -c1 -m2G --overprovisioned --unsafe-bypass-fsync 1 --blocked-reactor-notify-ms 2000000
The test tries to submit a compaction job after playing
with autocompaction control table switch. However, there is
no reliable way to hook pending compaction task. The code
assumed that with_scheduling_group() closure will never
preempt execution of the stats check.
Revert
===
Reverts commit c8247ac. In previous version the execution
sometimes resulted into the following error:
test/boost/sstable_datafile_test.cc(1076): fatal error: in "autocompaction_control_test":
critical check cm->get_stats().pending_tasks == 1 || cm->get_stats().active_tasks == 1 has failed
This version adds a few sstables to the cf, starts
the compaction and awaits until it is finished.
API change
===
- `/column_family/autocompaction/` always returned `true` while answering to the question: if the autocompaction disabled (see https://github.com/scylladb/scylla-jmx/blob/master/src/main/java/org/apache/cassandra/db/ColumnFamilyStore.java#L321). now it answers to the question: if the autocompaction for specific table is enabled. The question logic is inverted. The patch to the JMX is required. However, the change is decent because all old values were invalid (it always reported all compactions are disabled).
- `/column_family/autocompaction/` got support for POST/DELETE per table
Fixes
===
Fixes#1488Fixes#1808Fixes#440
Signed-off-by: Ivan Prisyazhnyy <ivan@scylladb.com>
Reviewed-by: Glauber Costa <glauber@scylladb.com>
This reverts commit 1c444b7e1e. The test
it adds sometimes fails as follows:
test/boost/sstable_datafile_test.cc(1076): fatal error: in "autocompaction_control_test":
critical check cm->get_stats().pending_tasks == 1 || cm->get_stats().active_tasks == 1 has failed
Ivan is working on a fix, but let's revert this commit to avoid blocking
next promotion failing from time to time.
This patch adds API endpoint /column_family/autocompaction/{name}
that listen to GET and POST requests to pick and control table
background compactions.
To implement that the patch introduces "_compaction_disabled_by_user"
flag that affects if CompactionManager is allowed to push background
compactions jobs into the work.
It introduces
table::enable_auto_compaction();
table::disable_auto_compaction();
bool table::is_auto_compaction_disabled_by_user() const
to control auto compaction state.
Fixes#1488Fixes#1808Fixes#440
Tests: unit(sstable_datafile_test autocompaction_control_test), manual
This implements support for triggering major compations through the REST
API. Please note that "split_output" is not supported and Glauber Costa
confirmed this this is fine:
"We don't support splits, nor do I think we should."
Signed-off-by: Ivan Prisyazhnyy <ivan@scylladb.com>
In swagger 1.2 int is defined as int32.
We originally used int following the jmx definition, in practice
internally we use uint and int64 in many places.
While the API format the type correctly, an external system that uses
swagger-based code generator can face a type issue problem.
This patch replace all use of int in a return type with long that is defined as int64.
Changing the return type, have no impact on the system, but it does help
external systems that use code generator from swagger.
Fixes#5347
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
This patch replaces the latency histogram to
rate_moving_avrage_and_histogram and the counters to
rate_moving_average.
The old endpoints where left unchagned but marked as depricated when
needed.
This patch adds the column family API that return the snapshot size.
The changes in the swagger definition file follo origin so the same API will be used for the metric and the
column_family.
The implementation is based on the get_snapshot_details in the
column_family.
This fix:
425
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
This adds the API definition with stub implementation that would make
the nodetool cfstats to run.
After this patch the nodetool cfstats command would work, but with stub
imlementation.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
compression
Timers support expect the API to return histogram. This adds the swagger
definition for the following timer in column_family:
get_coordinator_read_latency
get_coordinator_scan_latency
get_waiting_on_free_memtable_space
The following estimated histogram were added to column_family:
get_read_latency_estimated_recent_histogram
get_read_latency_estimated_histogram
get_range_latency_estimated_recent_histogram
get_range_latency_estimated_histogram
get_write_latency_recent_histogram
get_write_latency_estimated_recent_histogram
get_cas_prepare_estimated_recent_histogram
get_cas_prepare_estimated_histogram
get_cas_propose_estimated_recent_histogram
get_cas_propose_estimated_histogram
get_cas_commit_estimated_recent_histogram
get_cas_commit_estimated_histogram
And the following timers in commitlog:
get_waiting_on_segment_allocation
get_waiting_on_commit
To column family API the following API were added:
set_compaction_strategy_class
get_compaction_strategy_class
set_compression_parameters
set_crc_check_chance
get_sstable_count_per_level
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
API: Completing the column_family Swagger definition
This adds the missing definition in the column_family to make it
compatible to ColumnFamilyStoreMbean
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This patch clear the ambiguity in the swagger definition file and adds
the implementation for the memtable memory related methods.
For each column family there is an active memtable and a list of non
active.
when refering the all the memtable in the column family, the nick name
will contain cf_all_memtables.
Each URL has two versions, one, with a column family name, that is
relevant to a specific column family and one without, which is the
result of running the method on all column families.
This patch adds the following implementation to column_family:
get_memtable_on_heap_size
get_all_memtable_on_heap_size
get_memtable_off_heap_size
get_all_memtable_off_heap_size
get_memtable_live_data_size
get_all_memtable_live_data_size
get_all_memtables_on_heap_size
get_all_all_memtables_on_heap_size
get_all_memtables_off_heap_size
get_all_all_memtables_off_heap_size
get_all_memtables_live_data_size
get_all_all_memtables_live_data_size
Memory consumption is map this way: All memory assume to be off heap, so
on heap will return 0, and off heap will return the memory consumption
After this patch the following URL will be available:
/column_family/metrics/memtable_on_heap_size/{name}
/column_family/metrics/memtable_on_heap_size
/column_family/metrics/memtable_off_heap_size/{name}
/column_family/metrics/memtable_off_heap_size
/column_family/metrics/memtable_live_data_size/{name}
/column_family/metrics/memtable_live_data_size
/column_family/metrics/all_memtables_on_heap_size/{name}
/column_family/metrics/all_memtables_on_heap_size
/column_family/metrics/all_memtables_off_heap_size/{name}
/column_family/metrics/all_memtables_off_heap_size
/column_family/metrics/all_memtables_live_data_size/{name}
/column_family/metrics/all_memtables_live_data_size
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the latency histogram to the column_family swagger
definitions.
The definitions are based on the ColumnFamilyMetrics.
It adds the following commands:
get_read_latency_histogram
get_all_read_latency_histogram
get_write_latency_histogram
get_all_write_latency_histogram
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the read and write counters to the column_family swagger
definitions.
It adds the following commands:
get_read
get_all_read
get_write
get_all_write
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the Column familiy swagger definition file, the API is
equivelent to the ColumnFamilyStoreMBean definition.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>