scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-23 08:12:08 +00:00

Author	SHA1	Message	Date
Avi Kivity	6f2d060d12	Merge 'Make sstable_directory call sstable_manager for sstables' components' from Pavel Emelyanov This PR hits two goals for "object storage" effort 1. Sstables loader "knows" that sstables components are stored in a Linux directory and uses utils/lister to access it. This is not going to work with sstables over object storage, the loader should be abstracted from the underlying storage. 2. Currently class keyspace and class column_family carry "datadir" and "all_datadirs" on board which are paths on the local filesystem where sstable files are stored (those usually start with /var/lib/scylla/data). The paths include subdirs like "snapshots", "staging", etc. This is not going to look nice for object storage, the /var/lib/ prefix is excessive and meaningless in this case. Instead, ks and cf should know their "location" and some other component should know the directory in which the files are stored. That said, this PR prepares distributed_loader and sstable_directory to stop using Linux paths explicitly by making both call sstables_manager to list and open sstable objects. After it will be possible to teach manager to list sstables from object storage. Also this opens the way to removing paths from keyspace and column_family classes and replacing those with relative "location"s. Closes #12128 * github.com:scylladb/scylladb: sstable_directory: Get components lister from manager sstable_directory: Extract directory lister sstable_directory: Remove sstable creation callback sstable_directory: Call manager to make sstables sstable_directory: Keep error handler generator sstable_directory: Keep schema_ptr sstable_directory: Use directory semaphore from manager sstable_directory: Keep reference on manager tests: Use sstables creation helper in some cases sstables_manager: Keep directory semaphore reference sstables, code: Wrap directory semaphore with concurrency	2022-12-05 18:54:17 +02:00
Pavel Emelyanov	5e13ce2619	sstables_manager: Keep directory semaphore reference Preparational patch. The semaphore will be used by sstables_directory in next patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-05 12:03:18 +03:00
Botond Dénes	1e20095547	Update tools/java submodule * tools/java 1c06006447...ecab7cf7d6 (1): > Add VSCode files to gitignore	2022-12-05 09:54:51 +02:00
Avi Kivity	6a5d9ff261	treewide: use non-experimental std::source_location Now that we use libstdc++ 12, we can use the standardized source_location. Closes #12137	2022-11-30 11:06:43 +02:00
Avi Kivity	380da0586c	Update tools/python3 submodule (drop locale workaround) * tools/python3 773070e...548e860 (1): > install.sh: drop locale workaround from python3 thunk	2022-11-28 12:24:13 +02:00
Avi Kivity	45a57bf22d	Update tools/java submodule (revert scylla-driver) scylla-driver causes dtests to fail randomly (likely due to incorrect handling of the USE statement). Revert it. * tools/java 73422ee114...1c06006447 (2): > Revert "Add Scylla Cloud serverless support" > Revert "Switch cqlsh to use scylla-driver"	2022-11-28 11:29:08 +02:00
Avi Kivity	78222ea171	Update tools/java submodule (cqlsh system_distributed_everywhere is a system keyspace) * tools/java 874e2d529b...73422ee114 (1): > Mark "system_distributed_everywhere" as system ks	2022-11-27 15:37:57 +02:00
Avi Kivity	895d721d5e	Merge 'scylla-sstable: data-dump improvements' from Botond Dénes This series contains a mixed bag of improvements to `scylla sstable dump-data`. These improvements are mostly aimed at making the json output clearer, getting rid of any ambiguities. Closes #12030 * github.com:scylladb/scylladb: tools/scylla-sstable: traverse sstables in argument order tools/scylla-sstable: dump-data docs: s/clustering_fragments/clustering_elements tools/scylla-sstable: dump-data/json: use Null instead of "<unknown>" tools/scylla-sstable: dump-data/json: use more uniform format for collections tools/scylla-sstable: dump-data/json: make cells easier to parse	2022-11-20 17:02:27 +02:00
Avi Kivity	14218d82d6	Update tools/java submodule (serverless) * tools/java caf754f243...874e2d529b (2): > Add Scylla Cloud serverless support > Switch cqlsh to use scylla-driver	2022-11-20 16:41:36 +02:00
Botond Dénes	30597f17ed	tools/scylla-sstable: traverse sstables in argument order In the order the user passed them on the command-line.	2022-11-18 15:58:37 +02:00
Botond Dénes	e337b25aa9	tools/scylla-sstable: dump-data docs: s/clustering_fragments/clustering_elements The usage of clustering_fragments is a typo, the output contains clustering_elements.	2022-11-18 15:58:36 +02:00
Botond Dénes	c39408b394	tools/scylla-sstable: dump-data/json: use Null instead of "<unknown>" The currently used "<unknown>" marker for invalid values/types is indistinguishable from a normal value in some cases. Use the much more distinct and unique json Null instead.	2022-11-18 15:58:36 +02:00
Botond Dénes	1dfceb5716	tools/scylla-sstable: dump-data/json: use more uniform format for collections Instead of trying to be clever and switching the output on the type of collection, use the same format always: a list of objects, where the object has a key and value attribute, containing the respective collection item key and value. This makes processing much easier for machines (and humans too since the previous system wasn't working well).	2022-11-18 15:58:36 +02:00
Botond Dénes	f89acc8df7	tools/scylla-sstable: dump-data/json: make cells easier to parse There are several slightly different cell types in scylla: regular cells, collection cells (frozen and non-frozen) and counter cells (update and shards). In C++ code the type of the cell is always available for code wishing to make out exactly what kind of cell a cell is. In the JSON output of the dump-data this is currently really hard to do as there is not enough information to disambiguate all the different cell types. We wish to make the JSON output self-sufficient so in this patch we introduce a "type" field which contains one of: * regular * counter-update * counter-shards * frozen-collection * collection Furthermore, we bring the different types closer by also printing the counter shards under the 'value' key, not under the 'shards' key as before. The separate 'shards' is no longer needed to disambiguate. The documentation and the write operation is also updated to reflect the changes.	2022-11-18 15:58:36 +02:00
Avi Kivity	b8b78959fb	build: switch to packaged libdeflate rather than a submodule Now that our toolchain is based on Fedora 37, we can rely on its libdeflate rather than have to carry our own in a submodule. Frozen toolchain is regenerated. As a side effect clang is updated from 15.0.0 to 15.0.4. Closes #12000	2022-11-17 08:01:00 +02:00
Avi Kivity	43d3e91e56	tools: toolchain: prepare: use real bash associative array When we translate from docker/go arch names to the kernel arch names, we use an associative array hack using computed variable names "${!variable_name}". But it turns out bash has real associative arrays, introduced with "declare -A". Use them to make the code a little clearer. Closes #11985	2022-11-16 08:17:47 +02:00
Avi Kivity	d85f731478	build: update toolchain to Fedora 37 with clang 15 'cargo' instantiation now overrides internal git client with cli client due to unbounded memory usage [1]. [1] https://github.com/rust-lang/cargo/issues/10583#issuecomment-1129997984	2022-11-15 16:48:09 +00:00
Botond Dénes	94db2123b9	Update tools/java submodule * tools/java 583261fc0e...caf754f243 (1): > build: remove JavaScript snippets in ant build file	2022-11-09 07:59:04 +02:00
Avi Kivity	04ecf4ee18	Update tools/java submodule (cassandra-stress fails with node down) * tools/java 87672be28e...583261fc0e (1): > cassandra-stress: pass all hosts straight to the driver	2022-11-08 14:58:14 +02:00
Botond Dénes	243fcb96f0	Update tools/python3 submodule * tools/python3 bf6e892...773070e (1): > create-relocatable-package: harden against missing files	2022-11-08 08:43:30 +02:00
Takuya ASADA	45789004a3	install-dependencies.sh: update node_exporter to 1.4.0 To fix CVE-2022-24675, we need to a binary compiled in <= golang 1.18.1. Only released version which compiled <= golang 1.18.1 is node_exporter 1.4.0, so we need to update to it. See scylladb/scylla-enterprise#2317 Closes #11400 [avi: regenerated frozen toolchain] Closes #11879	2022-11-03 10:15:22 +04:00
Avi Kivity	dd0b571d7e	Update tools/java submodule (Scylla Cloud serverless config option) * tools/java 5f2b91d774...87672be28e (1): > Add serverless Scylla Cloud config file option	2022-10-20 16:15:28 +03:00
Tomasz Grabiec	a979bbf829	dbuild: Do not fail if .gdbinit is missing Closes #11811	2022-10-19 18:38:09 +03:00
Botond Dénes	2d581e9e8f	Merge "Maintain dc/rack by topology" from Pavel Emelyanov " There's an ongoing effort to move the endpoint -> {dc/rack} mappings from snitch onto topology object and this set finalizes it. After it the snitch service stops depending on gossiper and system keyspace and is ready for de-globalization. As a nice side-effect the system keyspace no longer needs to maintain the dc/rack info cache and its starting code gets relaxed. refs: #2737 refs: #2795 " * 'br-snitch-dont-mess-with-topology-data-2' of https://github.com/xemul/scylla: (23 commits) system_keyspace: Dont maintain dc/rack cache system_keyspace: Indentation fix after previous patch system_keyspace: Coroutinuze build_dc_rack_info() topology: Move all post-configuration to topology::config snitch: Start early gossiper: Do not export system keyspace snitch: Remove gossiper reference snitch: Mark get_datacenter/_rack methods const snitch: Drop some dead dependency knots snitch, code: Make get_datacenter() report local dc only snitch, code: Make get_rack() report local rack only storage_service: Populate pending endpoint in on_alive() code: Populate pending locations topology: Put local dc/rack on topology early topology: Add pending locations collection topology: Make get_location() errors more verbose token_metadata: Add config, spread everywhere token_metadata: Hide token_metadata_impl copy constructor gosspier: Remove messaging service getter snitch: Get local address to gossip via config ...	2022-10-19 06:50:21 +03:00
Avi Kivity	2e79bb431c	tools: change source_location location std::experimental::source_location is provided by <experimental/source_location>, not <source_location>. libstdc++ 12 insists, so change the header. Closes #11766	2022-10-12 15:29:14 +03:00
Pavel Emelyanov	b6061bb97d	topology: Move all post-configuration to topology::config Because of snitch ex-dependencies some bits on topology were initialized with nasty post-start calls. Now it all can be removed and the initial topology information can be provided by topology::config Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-10-11 05:18:31 +03:00
Pavel Emelyanov	d60ebc5ace	token_metadata: Add config, spread everywhere Next patches will need to provide some early-start data for topology. The standard way of doing it is via service config, so this patch adds one. The new config is empty in this patch, to be filled later Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-10-11 05:17:08 +03:00
Tomasz Grabiec	fcf0628bc5	dbuild: Use .gdbinit from the host Useful when starting gdb inside the dbuild container. Message-Id: <20221007154230.1936584-1-tgrabiec@scylladb.com>	2022-10-09 11:14:33 +03:00
Avi Kivity	20bad62562	Merge 'Detect and record large collections' from Benny Halevy This series adds support for detecting collections that have too many items and recording them in `system.large_cells`. A configuration variable was added to db/config: `compaction_collection_items_count_warning_threshold` set by default to 10000. Collections that have more items than this threshold will be warned about and will be recorded as a large cell in the `system.large_cells` table. Documentation has been updated accordingly. A new column was added to system.large_cells: `collection_items`. Similar to the `rows` column in system.large_partitions, `collection_items` holds the number of items in a collection when the large cell is a collection, or 0 if it isn't. Note that the collection may be recorded in system.large_cells either due to its size, like any other cell, and/or due to the number of items in it, if it crosses the said threshold. Note that #11449 called for a new system.large_collections table, but extending system.large_cells follows the logic of system.large_partitions and is a smaller change overall, hence it was preferred. Since the system keyspace schema is hard coded, the schema version of system.large_cells was bumped, and since the change is not backward compatible, we added a cluster feature - `LARGE_COLLECTION_DETECTION` - to enable using it. The large_data_handler large cell detection record function will populate the new column only when the new cluster feature is enabled. In addition, unit tests were added in sstable_3_x_test for testing large cells detection by cell size, and large_collection detection by the number of items.
Closes #11449 Closes #11674 * github.com:scylladb/scylladb: sstables: mx/writer: optimize large data stats members order sstables: mx/writer: keep large data stats entry as members db: large_data_handler: dynamically update config thresholds utils/updateable_value: add transforming_value_updater db/large_data_handler: cql_table_large_data_handler: record large_collections db/large_data_handler: pass ref to feature_service to cql_table_large_data_handler db/large_data_handler: cql_table_large_data_handler: move ctor out of line docs: large-rows-large-cells-tables: fix typos db/system_keyspace: add collection_elements column to system.large_cells gms/feature_service: add large_collection_detection cluster feature test: sstable_3_x_test: add test_sstable_too_many_collection_elements test: lib: simple_schema: add support for optional collection column test: lib: simple_schema: build schema in ctor body test: lib: simple_schema: cql: define s1 as static only if built this way db/large_data_handler: maybe_record_large_cells: consider collection_elements db/large_data_handler: debug cql_table_large_data_handler::delete_large_data_entries sstables: mx/writer: pass collection_elements to writer::maybe_record_large_cells sstables: mx/writer: add large_data_type::elements_in_collection db/large_data_handler: get the collection_elements_count_threshold db/config: add compaction_collection_elements_count_warning_threshold test: sstable_3_x_test: add test_sstable_write_large_cell test: sstable_3_x_test: pass cell_threshold_bytes to large_data_handler test: sstable_3_x_test: large_data_handler: prepare callback for testing large_cells test: sstable_3_x_test: large_data tests: use BOOST_REQUIRE_[GL]T test: sstable_3_x_test: test_sstable_log_too_many_rows: use tests::random	2022-10-06 18:28:21 +03:00
Pavel Emelyanov	2c1ef0d2b7	sstables.hh: Remove unused headers Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #11709	2022-10-04 23:37:07 +02:00
Benny Halevy	54ab038825	sstables: mx/writer: add large_data_type::elements_in_collection Add a new large_data_stats type and entry for keeping the collection_elements_count_threshold and the maximum value of collection_elements. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-04 08:41:56 +03:00
Nadav Har'El	91bccee9be	Update tools/java submodule * tools/java b004da9d1b...5f2b91d774 (1): > install.sh is using wrong permissions for install cqlsh files Fixes #11584	2022-09-20 14:42:34 +03:00
Avi Kivity	2cec417426	Merge 'tools: use the standard allocator' from Botond Dénes Tools want to be as little disrupting to the environment they run in as possible, because they might be run in a production environment, next to a running scylladb production server. As such, the usual behavior of seastar applications w.r.t. memory is an anti-pattern for tools: they don't want to reserve most of the system memory, in fact they don't want to reserve any amount, instead consuming as much as needed on-demand. To achieve this, tools want to use the standard allocator. To achieve this they need a seastar option to instruct seastar to not configure and use the seastar allocator and they need LSA to cooperate with the standard allocator. The former is provided by https://github.com/scylladb/seastar/pull/1211. The latter is solved by introducing the concept of a `segment_store_backend`, which abstracts away how the memory arena for segments is acquired and managed. We then refactor the existing segment store so that the seastar allocator specific parts are moved to an implementation of this backend concept, then we introduce another backend implementation appropriate to the standard allocator. Finally, tools configure seastar with the newly introduced option to use the standard allocator and similarly configure LSA to use the standard allocator appropriate backend. Refs: https://github.com/scylladb/scylladb/issues/9882 This is the last major code piece in scylla for making tools production ready.
Closes #11510 * github.com:scylladb/scylladb: test/boost: add alternative variant of logalloc test tools: use standard allocator utils/logalloc: add use_standard_allocator_segment_pool_backend() utils/logalloc: introduce segment store backend for standard allocator utils/logalloc: rebase release segment-store on segment-store-backend utils/logalloc: introduce segment_store_backend utils/logalloc: push segment alloc/dealloc to segment_store test/boost/logalloc_test: make test_compaction_with_multiple_regions exception-safe	2022-09-20 12:59:34 +03:00
Botond Dénes	6a0db84706	tools: use standard allocator Use the new seastar option to instruct seastar to not initialize and use the seastar allocator, relying on the standard allocator instead. Configure LSA with the standard allocator based segment store backend: * scylla-types reserves 1MB for LSA -- in theory nothing here should use LSA, but just in case... * scylla-sstable reserves 100MB for LSA, to avoid excessive thrashing in the sstable index caches. With this, tools now should allocate memory on demand, without reserving a large chunk of (or all of) the available memory, as regular seastar apps do.	2022-09-16 13:07:01 +03:00
Raphael S. Carvalho	e099a9bf3b	sstables_manager: Add sstable metadata reader concurrency semaphore Let's introduce a reader_concurrency_semaphore for reading sstable metadata, to avoid an OOM due to unlimited concurrency. The concurrency on startup is not controlled, so it's important to enforce a limit on the amount of memory used by the parallel readers. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-09-14 13:09:51 -03:00
Kamil Braun	2fe3e67a47	gms: feature_service: don't distinguish between 'known' and 'supported' features `feature_service` provided two sets of features: `known_feature_set` and `supported_feature_set`. The purpose of both and the distinction between them was unclear and undocumented. The 'supported' features were gossiped by every node. Once a feature is supported by every node in the cluster, it becomes 'enabled'. This means that whatever piece of functionality is covered by the feature, it can be used by the cluster from now on. The 'known' set was used to perform feature checks on node start; if the node saw that a feature is enabled in the cluster, but the node does not 'know' the feature, it would refuse to start. However, if the feature was 'known', but wasn't 'supported', the node would not complain. This means that we could in theory allow the following scenario: 1. all nodes support feature X. 2. X becomes enabled in the cluster. 3. the user changes the configuration of some node so feature X will become unsupported but still known. 4. The node restarts without error. So now we have a feature X which is enabled in the cluster, but not every node supports it. That does not make sense. It is not clear whether it was accidental or purposeful that we used the 'known' set instead of the 'supported' set to perform the feature check. What I think is clear, is that having two sets makes the entire thing unnecessarily complicated and hard to think about. Fortunately, at the base to which this patch is applied, the sets are always the same. So we can easily get rid of one of them. I decided that the name which should stay is 'supported', I think it's more specific than 'known' and it matches the name of the corresponding gossiper application state. Closes #11512	2022-09-12 13:09:12 +03:00
Avi Kivity	521127a253	Update tools/jmx submodule * tools/jmx 06f2735...88d9bdc (1): > install.sh: add --without-systemd option	2022-09-12 13:02:16 +03:00
Nadav Har'El	d71098a3b8	Update tools/java submodule * tools/java b7a0c5bd31...b004da9d1b (1): > Revert "dist/debian:add python3 as dependency"	2022-09-11 17:45:43 +03:00
Avi Kivity	3dc39474ec	Merge 'tools/scylla-types: add tokenof and shardof actions' from Botond Dénes `tokenof` calculates and prints the token of a partition-key. `shardof` calculates the token and finds the owner shard of a partition-key. The number of shards has to be provided by the `--shards` parameter. Ignore msb bits param can be tweaked with the `--ignore-msb-bits` parameter, which defaults to 12. Examples: ``` $ scylla types tokenof --full-compound -t UTF8Type -t SimpleDateType -t UUIDType 000d66696c655f696e7374616e63650004800049190010c61a3321045941c38e5675255feb0196 (file_instance, 2021-03-27, c61a3321-0459-41c3-8e56-75255feb0196): -5043005771368701888 $ scylla types shardof --full-compound -t UTF8Type -t SimpleDateType -t UUIDType --shards=7 000d66696c655f696e7374616e63650004800049190010c61a3321045941c38e5675255feb0196 (file_instance, 2021-03-27, c61a3321-0459-41c3-8e56-75255feb0196): token: -5043005771368701888, shard: 1 ``` Closes #11436 * github.com:scylladb/scylladb: tools/scylla-types: add shardof action tools/scylla-types: pass variable_map to action handlers tools/scylla-types: add tokenof action tools/scylla-types: extract printing code into functions	2022-09-06 11:25:54 +03:00
Avi Kivity	e3cdc8c4d3	Update tools/java submodule (python3 dependency) * tools/java 6995a83cc1...b7a0c5bd31 (1): > dist/debian:add python3 as dependency	2022-09-05 12:08:24 +03:00
Botond Dénes	21ef0c64f1	tools/scylla-types: add shardof action Decorates a partition key and calculates which shard it belongs to, given the shard count (--shards) and the ignore msb bits (--ignore-msb-bits) parameters. The latter is optional and is defaulted to 12. Example: $ scylla types shardof --full-compound -t UTF8Type -t SimpleDateType -t UUIDType --shards=7 000d66696c655f696e7374616e63650004800049190010c61a3321045941c38e5675255feb0196 (file_instance, 2021-03-27, c61a3321-0459-41c3-8e56-75255feb0196): token: -5043005771368701888, shard: 1	2022-09-05 06:22:57 +03:00
Botond Dénes	4333d33f01	tools/scylla-types: pass variable_map to action handlers Allowing them to get the value of extra command line parameters.	2022-09-05 06:22:55 +03:00
Botond Dénes	58d4f22679	tools/scylla-types: add tokenof action Calculate and print the token of a partition-key. Example: $ scylla types tokenof --full-compound -t UTF8Type -t SimpleDateType -t UUIDType 000d66696c655f696e7374616e63650004800049190010c61a3321045941c38e5675255feb0196 (file_instance, 2021-03-27, c61a3321-0459-41c3-8e56-75255feb0196): -5043005771368701888	2022-09-05 06:20:10 +03:00
Botond Dénes	be70fcf587	tools/scylla-types: extract printing code into functions To make the individual overloads on the exact type usable on their own.	2022-09-02 07:46:18 +03:00
Benny Halevy	7747b8fa33	sstables: define run_identifier as a strong tagged_uuid type Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #11321	2022-08-18 19:03:10 +03:00
Benny Halevy	d295d8e280	everywhere: define locator::host_id as a strong tagged_uuid type So it can be distinguished from other uuid-based identifiers in the system. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #11276	2022-08-12 06:01:44 +03:00
Avi Kivity	871127f641	Update tools/java submodule * tools/java ad6764b506...6995a83cc1 (1): > dist/debian: drop upgrading from scylla-tools < 2.0	2022-08-08 16:51:14 +03:00
Benny Halevy	257d74bb34	schema, everywhere: define and use table_id as a strong type Define table_id as a distinct utils::tagged_uuid modeled after raft tagged_id, so it can be differentiated from other uuid-class types, in particular from table_schema_version. Fixes #11207 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:09:41 +03:00
Botond Dénes	d0eaa72bd7	tools/scylla-sstable: introduce the write operation Allows generating an sstable based on a JSON description of its content. Uses identical schema to dump-data, so it is possible to regenerate an existing sstable, by feeding the output of dump-data to write. Most of the scylladb storage engine features are supported, with the exception of the following: * counters * non-strictly atomic types, including frozen collections, tuples or UDTs.	2022-08-03 14:00:02 +03:00
Botond Dénes	4377be30ba	tools/scylla-sstable: add support for writer operations Currently it is assumed that all operations read sstables. They get a non-empty list of sstables as input and have no means to create sstable-writers. We want to add support for operations that write sstables. For this, we relax the current top-level check about the sstable list not being empty. We defer this empty-check for operations that actually need input sstables. Furthermore, the operation_func gains an sstable_manager& argument, to allow operations to create sstable writers. Operations are now read-write capable. In addition to the above the documentation language is adjusted to not assume read-only operations.	2022-08-03 13:49:22 +03:00
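
The `shardof` entries above map a token to an owner shard from a shard count and an "ignore msb bits" value (default 12). The log does not spell out the arithmetic, so the following is only a minimal sketch under an assumed scheme: bias the signed 64-bit token into the unsigned range, drop the most significant bits with a left shift, then scale by the shard count. The function name `shard_of` and the constant `MASK64` are hypothetical and are not taken from the ScyllaDB sources.

```python
# Sketch only: an ASSUMED token-to-shard mapping, not code from the repository.
MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit wraparound


def shard_of(token: int, shards: int, ignore_msb_bits: int = 12) -> int:
    """Return the shard index for a signed 64-bit token (assumed scheme)."""
    # Bias the signed token into the unsigned range [0, 2**64).
    biased = (token + (1 << 63)) & MASK64
    # Discard the top `ignore_msb_bits` bits by shifting them out.
    shifted = (biased << ignore_msb_bits) & MASK64
    # Scale the 64-bit value into [0, shards) using the high half of the product.
    return (shifted * shards) >> 64


if __name__ == "__main__":
    # Token and shard count taken from the shardof example in the log above.
    print(shard_of(-5043005771368701888, shards=7))
```

Under these assumptions the sketch yields shard 1 for the token and `--shards=7` shown in the `shardof` example, which agrees with the output quoted in the commit message.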


429 Commits