This PR introduces four new virtual tables aimed at replacing nodetool commands, working towards the long-term goal of replacing nodetool completely, at least for cluster-information retrieval.
As you may have noticed, most of these replacements are not exact matches. This is on purpose. I feel that the nodetool commands are somewhat chaotic: there may once have been a clear plan for which command prints what, but after years of organic development they are a mess of fields that feel like they don't belong. In addition to this, they are centered on C* terminology, which often sounds strange or doesn't make any sense for scylla (off-heap memory, counter cache, etc.).
So in this PR I tried to do a few things:
* Drop all fields that don't make sense for scylla;
* Rename/reformat/rephrase fields that have a corresponding concept in scylla, so that they use the scylla terminology;
* Group information in tables based on some common theme;
With these guidelines in mind, let's look at the virtual tables introduced in this PR:
* `system.snapshots` - replacement for `nodetool listsnapshots`;
* `system.protocol_servers` - replacement for `nodetool statusbinary` as well as `Thrift active` and `Native Transport active` from `nodetool info`;
* `system.runtime_info` - replacement for `nodetool info`, not an exact match: some fields were removed, some were refactored to make sense for scylla;
* `system.versions` - replacement for `nodetool version`, prints all versions, including build-id;
Closes #9517
* github.com:scylladb/scylla:
test/cql-pytest: add virtual_tables.py
test/cql-pytest: nodetool.py: add take_snapshot()
db/system_keyspace: add versions table
configure.py: move release.cc and build_id.cc to scylla_core
db/system_keyspace: add runtime_info table
db/system_keyspace: add protocol_servers table
service: storage_service: s/client_shutdown_hooks/protocol_servers/
service: storage_service: remove unused unregister_client_shutdown_hook
redis: redis_service: implement the protocol_server interface
alternator: controller: implement the protocol_server interface
transport: controller: implement the protocol_server interface
thrift: controller: implement the protocol_server interface
Add protocol_server interface
db/system_keyspace: add snapshots virtual table
db/virtual_table: remove _db member
db/system_keyspace: propagate distributed<> database and storage_service to register_virtual_tables()
docs/design-notes/system_keyspace.md: add listing of existing virtual tables
docs/guides: add virtual-tables.md
This helps packages built on different machines have the same
datestamp, if the builds are started at the same time.
* tools/java 05ec511bbb...fd10821045 (1):
> build: use utc for build datestamp
* tools/jmx 48d37f3...d6225c5 (1):
> build: use utc for build datestamp
* tools/python3 c51db54...8a77e76 (1):
> build: use utc for build datestamp
[avi: commit own patches as this one requires excessive coordination
across submodules, for something quite innocuous]
Ref #9563 (doesn't really fix it, but helps a little)
The schema has a private constructor, which means it can't be
constructed with `make_lw_shared()` even by classes which are otherwise
able to invoke the private constructor themselves.
This results in such classes (`schema_builder`) resorting to building a
local schema object, then invoking `make_lw_shared()` with the schema's
public move constructor. However, moving a schema is not cheap at all,
so each `schema_builder::build()` call results in two expensive schema
construction operations.
We could make `make_lw_shared()` a friend of `schema` to resolve this,
but then we'd de facto open the private constructor to the world.
Instead, this patch introduces a private tag type, which is added as a
parameter to the private constructor, which is then made public.
Everybody can invoke the constructor, but only friends can create the
private tag instance required to actually call it.
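A condensed illustration of the pattern (heavily simplified; the real
schema class looks nothing like this, only the tag mechanics are shown):

    #include <seastar/core/shared_ptr.hh>
    using seastar::lw_shared_ptr;
    using seastar::make_lw_shared;

    class schema {
        class private_tag {};      // only friends can construct this
        friend class schema_builder;
    public:
        explicit schema(private_tag) {}  // public, yet uncallable without the tag
    };

    class schema_builder {
    public:
        lw_shared_ptr<schema> build() {
            // make_lw_shared() constructs the schema in place, avoiding
            // the extra move of a locally-built schema object
            return make_lw_shared<schema>(schema::private_tag{});
        }
    };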
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20211105085940.359708-1-bdenes@scylladb.com>
This PR started with the realization that, in tests of the memtable
reversing reader, `do_refresh_state` was never called with `last_row`
and `last_rts` that are not `std::nullopt`.
Changes
- fix the memtable test (`test_memtable_with_many_versions_conforms_to_mutation_source`), so that there is a background job forcing state refreshes,
- fix the way rt_slice is computed (was `(last_rts, cr_range_snapshot.end]`, now is `[cr_range_snapshot.start, last_rts)`).
Fixes #9486
Closes #9572
* github.com:scylladb/scylla:
partition_snapshot_reader: fix indentation in fill_buffer
range_tombstone_list: {lower,upper,}slice share comparator implementation
test: memtable: add full_compaction in background
partition_snapshot_reader: fix obtaining rt_slice, if Reversing and _last_rts was set
range_tombstone_list: add lower_slice
Contains all version-related information (`nodetool version` and more).
Example printout:
(cqlsh) select * from system.versions;
key | build_id | build_mode | version
-------+------------------------------------------+------------+-------------------------------
local | aaecce2f5068b0160efd04a09b0e28e100b9cd9e | dev | 4.6.dev-0.20211021.0d744fd3fa
These two files were only added to the scylla executable and some
specific unit tests. As we are about to use the symbols defined in these
files in some scylla_core code, move them there.
Loosely contains the equivalent of the `nodetool info` command, with some
notable differences:
* Protocol server related information is in `system.protocol_servers`;
* Information about memory, memtable and cache is reformatted to be
tailored to scylla: C* specific terminology and metrics are dropped;
* Information that doesn't change and is already in `system.local` is
not included;
* Added trace-probability too (`nodetool gettraceprobability`);
TODO(follow-up): exceptions.
Lists all the client protocol servers and their status. Example output:
(cqlsh) select * from system.protocol_servers;
name | is_running | listen_addresses | protocol | protocol_version
------------------+------------+---------------------------------------+----------+------------------
native transport | True | ['127.0.0.1:9042', '127.0.0.1:19042'] | cql | 3.3.1
alternator | False | [] | dynamodb |
rpc | False | [] | thrift | 20.1.0
redis | False | [] | redis |
This prints the equivalent of `nodetool statusbinary` and the "Thrift
active" and "Native Transport active" fields from the `nodetool info`
output with some additional information:
* It contains alternator and redis status;
* It contains the protocol version;
* It contains the listen addresses (if the respective server is running);
Replace the simple client shutdown hook registry mechanism with a more
powerful registry of the protocol servers themselves. This allows
enumerating the protocol servers at runtime, checking whether they are
running or not and starting/stopping them.
Nobody seems to ever unregister client shutdown hooks. We are about
to refactor the client shutdown hook machinery, so remove this unused
code to make that easier.
We want to replace the current
`storage_service::register_client_shutdown_hook()` machinery with
something more powerful. We want to register all running client protocol
servers with the storage service, allowing us to enumerate them at
runtime, check whether they are running or not, and start/stop them.
As the first step towards this, we introduce an abstract interface that
we are going to implement at the controllers of the various protocol
servers we have. Then we will switch storage service to collect pointers
to this interface instead of simple stop functors.
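A rough sketch of the shape such an interface could take (the method
names here are illustrative, not necessarily the ones in the patch):

    #include <seastar/core/future.hh>
    #include <seastar/net/socket_defs.hh>
    #include <string>
    #include <vector>

    class protocol_server {
    public:
        virtual ~protocol_server() = default;
        // e.g. "native transport", "alternator", "redis", ...
        virtual std::string name() const = 0;
        // empty unless the server is running
        virtual std::vector<seastar::socket_address> listen_addresses() const = 0;
        virtual seastar::future<> start_server() = 0;
        virtual seastar::future<> stop_server() = 0;
    };

Each protocol controller then derives from this, and the storage
service keeps pointers to the interface instead of stop functors.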
This member is potentially dangerous, as it only becomes non-null some
time after the virtual table object is constructed. This is asking
for a nullptr dereference.
Instead, remove this member and have virtual table implementations that
need a db ask for it in the constructor; it is available in
`register_virtual_tables()` now.
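An illustrative sketch of the resulting shape (the type names are
stand-ins, not the real declarations):

    class database;          // forward declaration suffices for a reference
    class virtual_table {};  // stand-in for the real base class

    class snapshots_table : public virtual_table {
        database& _db;  // bound at construction; can never be null afterwards
    public:
        explicit snapshots_table(database& db) : _db(db) {}
    };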
slice (2 overloads), upper_slice and lower_slice previously had their
own implementations of a comparator. Move out the common structs so
that all four of them can share the implementation.
compile_commands.json (a.k.a. "compdb",
https://clang.llvm.org/docs/JSONCompilationDatabase.html) is intended
to help stand-alone C-family LSP servers index the codebase as
precisely as possible.
The actively maintained LSP servers with good C++ support are:
- Clangd (https://clangd.llvm.org/)
- CCLS (https://github.com/MaskRay/ccls)
This change causes a successful invocation of configure.py to create a
unified Scylla+Seastar+Abseil compdb for every selected build mode,
and to leave a valid symlink in the source root (if a valid symlink
already exists, it will be left alone).
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Closes #9558
The rjson::set() function *sounds* like it can set any member of a JSON
object (i.e., map), but that's not true :-( It calls the RapidJSON function
AddMember() so it can only add a member to an object which doesn't have
a member with the same name (i.e., key). If it is called with a key
that already has a value, the result may have two values for the same
key, which is ill-formed and can cause bugs like issue #9542.
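A minimal standalone demonstration of the pitfall, using raw RapidJSON
(the rjson functions are thin wrappers over calls like these):

    #include <rapidjson/document.h>
    #include <rapidjson/writer.h>
    #include <rapidjson/stringbuffer.h>
    #include <cstdio>

    int main() {
        rapidjson::Document d(rapidjson::kObjectType);
        auto& a = d.GetAllocator();
        d.AddMember("x", 1, a);
        d.AddMember("x", 2, a);  // no existence check: duplicate key!

        rapidjson::StringBuffer sb;
        rapidjson::Writer<rapidjson::StringBuffer> w(sb);
        d.Accept(w);
        // prints {"x":1,"x":2} - an ill-formed JSON object
        std::printf("%s\n", sb.GetString());
    }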
So in this patch we begin by renaming rjson::set() and its variant to
rjson::add() - to suggest to its user that this function only adds
members, without checking if they already exist.
After this rename, I was left with dozens of calls to the set() functions
that needed to be changed to either add() - if we're sure that the object
cannot already have a member with the same name - or to replace() if
it might.
The vast majority of the set() calls were starting with an empty item
and adding members with fixed (string constant) names, so these can
be trivially changed to add().
It turns out that *all* other set() calls - except the one fixed in
issue #9542 - can also use add() because there are various "excuses"
why we know the member names will be unique. A typical example is
a map with column-name keys, where we know that the column names
are unique. I added comments in front of such non-obvious uses of
add() which are safe.
All but a handful of the uses of rjson are in Alternator, so I
verified that all Alternator test cases continue to pass after this
patch.
Fixes#9583
Refs #9542
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211104152540.48900-1-nyh@scylladb.com>
This patch fixes a bug in UpdateItem's ReturnValues=ALL_NEW, which in
some cases returned the OLD (pre-modification) value of some of the
attributes, instead of its NEW value.
The bug was caused by a confusion in our JSON utility function,
rjson::set(), which sounds like it can set any member of a map, but in
fact may only be used to add a *new* member - if a member with the same
name (key) already existed, the result is undefined (two values for the
same key). In ReturnValues=ALL_NEW we did exactly this: we started with
a copy of the original item, and then used set() to override some of the
members. This is not allowed.
So in this patch, we introduce a new function, rjson::replace(), which
does what we previously thought that rjson::set() does - i.e., replace a
member if it exists, or if not, add it. We call this function in
the ReturnValues=ALL_NEW code.
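A sketch of the intended replace() semantics against raw RapidJSON
(illustrative only, not the actual Scylla implementation):

    #include <rapidjson/document.h>

    void replace_member(rapidjson::Document& doc, const char* key, int new_value) {
        auto it = doc.FindMember(key);
        if (it != doc.MemberEnd()) {
            it->value.SetInt(new_value);  // member exists: overwrite in place
        } else {
            // member absent: AddMember() is safe only on this branch
            rapidjson::Value name(key, doc.GetAllocator());  // copies the key
            rapidjson::Value value(new_value);
            doc.AddMember(name, value, doc.GetAllocator());
        }
    }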
This patch also adds a test case that reproduces the incorrect ALL_NEW
results - and gets fixed by this patch.
In an upcoming patch, we should rename the confusingly-named set()
functions and audit all their uses. But we don't do this in this patch
yet. We just add some comments to clarify what set() does - but don't
change it, and just add one new function for replace().
Fixes#9542
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211104134937.40797-1-nyh@scylladb.com>
Add full compaction running in the background in
test_memtable_with_many_versions_conforms_to_mutation_source. Without
it, some paths in the partition snapshot reader weren't covered, as the
tests always managed to read all range tombstones and rows which cover
a given clustering range from just a single snapshot. Now, when
full_compaction happens in the process of reading from a clustering
range, we can force a state refresh with non-nullopt positions of the
last row and last range tombstone.
Note: this testing gap affected only the reversing reader.
If Reversing and _last_rts was set, the created rt_slice still contained
the range tombstones between *_last_rts and the (snapshot) clustering
range end. This is wrong - the correct range is between the (snapshot)
clustering range begin and *_last_rts.
Cleanup and improvements for compaction
* 'compaction_cleanup_and_improvements_v2' of https://github.com/raphaelsc/scylla:
compaction: fix outdated doc of compact_sstables()
table: fix indentation in compact_sstables()
table: give a more descriptive name to compaction_data in compact_sstables()
compaction_manager: rename submit_major_compaction to perform_major_compaction
compaction: fix indentantion in compaction.hh
compaction: move incremental_owned_ranges_checker into cleanup_compaction
compaction: make owned ranges const in cleanup_compaction
compaction: replace outdated comment in regular_compaction
compaction: give a more descriptive name to compaction_data
compaction_manager: simplify creation of compaction_data
For symmetry, let's call it perform_*, as it doesn't work like the
submission functions, which don't wait for the result, like the one for
minor compaction.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
info is no longer descriptive, as compaction now works with
compaction_data instead of compaction_info.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
There's no need to wrap compaction_data in a shared_ptr; also, let's
kill the unused params in create_compaction_data to simplify its
creation.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The GKE metadata server does not provide the same metadata as GCE, so
we should not return True from is_gce().
Instead, try to fetch the machine-type from the metadata server, and
return False if that results in a 404 Not Found.
Fixes #9471
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Closes #9582
"
Some time ago the --parallel-cases option was introduced, set to
False by default. Now everything is ready for making it True.
Running in a BYO job shows that it takes 30 minutes less to
complete the debug tests. Other timings remain almost the same.
tests: unit(dev), unit(debug)
"
* 'br-parallel-cases-by-default' of https://github.com/xemul/scylla:
test.py: Run parallel cases by default
test, raft: Keep many-400 case out of debug mode
test.py: Cache collected test-cases
There were a few missing bits before making this the default.
- the default max number of AIOs; now tests are run with the greatly
  reduced value
- a 1.5-hour single case from database_test; now it's split and
  scales with --parallel-cases
- suite add_test methods called in a loop for --repeat options;
  patch #1 from this set fixes it
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This case takes 45+ minutes, which is 1.5 times longer than the
second longest case out there. I propose to keep the many-400
case out of debug runs; there's the many-100 case, which is configured
the same way but uses 4x fewer "nodes".
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The add_test method of a suite can be called several times in a
row, e.g. in case of the --repeat option or because there is more
than one custom_args entry in the suite.yaml file. In any case,
it's pointless to re-collect the test cases by launching the
test binary again; it's much faster (and 100% safe) to keep the
list of cases from the previous call and re-use it if the test
name matches.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
node_exporter is packaged with some random uid/gid in the tarball.
When extracting it as an ordinary user this isn't a problem, since
the uid/gid are reset to the current user, but that doesn't happen
under dbuild since `tar` thinks the current user is root. This causes
a problem if one wants to delete the build directory later, since it
becomes owned by some random user (see /etc/subuid).
Reset the uid/gid information so this doesn't happen.
Closes #9579
Compaction efficiency can be defined as how much backlog is reduced per
byte read or written.
We know a few facts about efficiency:
1) the more files are compacted together (the fan-in), the higher the
efficiency will be; however...
2) the bigger the size difference of the input files, the worse the
efficiency, i.e. the higher the write amplification.
So compactions with similar-sized files are the most efficient ones,
and their efficiency increases with a higher number of files.
However, in order to avoid bad read amplification, the number of files
cannot grow without bound. So we have to allow parallel compaction
on different tiers, but to avoid "dilution" of overall efficiency,
we will only allow a compaction to proceed if its efficiency is
greater than or equal to the efficiency of ongoing compactions.
For the time being, we'll assume that strategies don't pick candidates
with wildly different sizes, so efficiency is only calculated as a
function of compaction fan-in.
Now, when the system is under heavy load, the fan-in threshold will
automatically grow to guarantee that overall efficiency remains
stable.
Please note that fan-in is defined in number of runs. LCS compaction
on higher levels will have a fan-in of 2. Under heavy load, it
may happen that LCS will temporarily switch to size-tiered mode
for compaction to keep up with the amount of data being produced.
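A hypothetical sketch of the admission rule (the names and types are
illustrative, not the actual compaction manager code):

    #include <algorithm>
    #include <vector>

    // Assuming similar-sized inputs, efficiency is a function of fan-in
    // alone: more input runs means more backlog reduced per byte written.
    static float efficiency(unsigned fan_in) {
        return static_cast<float>(fan_in);
    }

    // Admit a new compaction only if it is at least as efficient as the
    // ongoing ones, so it cannot dilute the overall efficiency. Under
    // heavy load the ongoing fan-ins - and thus the threshold - grow.
    bool may_admit(unsigned candidate_fan_in,
                   const std::vector<unsigned>& ongoing_fan_ins) {
        auto it = std::max_element(ongoing_fan_ins.begin(), ongoing_fan_ins.end());
        unsigned threshold = (it == ongoing_fan_ins.end()) ? 0 : *it;
        return efficiency(candidate_fan_in) >= efficiency(threshold);
    }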
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20211103215110.135633-2-raphaelsc@scylladb.com>