scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 10:30:38 +00:00

Author	SHA1	Message	Date
Nadav Har'El	666017f2f0	Merge 'Convert last uses of sprint() to fmt::format()' from Avi Kivity sprint() uses the printf-style formatting language while most of our code uses the Python-derived format language from fmt::format(). The last mass conversion of sprint() to fmt (in `1129134a4a`) missed some callers (principally those that were on multiple lines, and so the automatic converter missed them). Convert the remainder to fmt::format(), and some sprintf() and printf() calls, so we have just one format language in the code base. Seastar::sprint() ought to be deprecated and removed. Test: unit (dev) Closes #9529 * github.com:scylladb/scylla: utils: logalloc: convert debug printf to fmt::print() utils: convert fmt::fprintf() to fmt::print() main: convert fprint() to fmt::print() compress: convert fmt::sprintf() to fmt::format() tracing: replace seastar::sprint() with fmt::format() thrift: replace seastar::sprint() with fmt::format() test: replace seastar::sprint() with fmt::format() streaming: replace seastar::sprint() with fmt::format() storage_service: replace seastar::sprint() with fmt::format() repair: replace seastar::sprint() with fmt::format() redis: replace seastar::sprint() with fmt::format() locator: replace seastar::sprint() with fmt::format() db: replace seastar::sprint() with fmt::format() cql3: replace seastar::sprint() with fmt::format() cdc: replace seastar::sprint() with fmt::format() auth: replace seastar::sprint() with fmt::format()	2021-10-28 22:33:23 +03:00
Avi Kivity	6b02aa72e2	cdc: replace seastar::sprint() with fmt::format() sprint() is obsolete.	2021-10-27 14:30:06 +03:00
Avi Kivity	e44057d5e1	cdc: don't allow background streams description rewrite to delay too far If we're upgrading from an older version with the previous CDC streams format, we'll upgrade it in the background. Background update is needed since we need the cluster to be available when performing the upgrade, but at this point we're just starting a node, and may not succeed in forming a cluster before we shut down. However, running in the background is dangerous since the objects we use may stop existing. The code is careful to use reference counting, but this does not guarantee that other dependencies are still alive, especially since not all dependencies are expressed via constructor parameters. Fix by waiting for the rewrite work in generation_service::stop(). As long as generation_service is up, the required dependencies should be working too. Note that there is another change here besides limiting the background work: checks that were previously done in the foreground (limited to local tables) are now also done in the background. I don't think this has any impact. Note: I expect this to have no real impact. Any CDC users will have long since upgraded. This is just preparing for other patches that bring in other dependencies, which cannot be passed via reference counted pointers, so they expose the existing problem.	2021-10-18 16:56:59 +03:00
Avi Kivity	eac95e2370	cdc: adjust type of streams_count streams_count has signed type, but it's compared against an unsigned type, annoying gcc. Since a count should be positive, convert it to an unsigned type.	2021-10-06 14:56:00 +03:00
Pavel Emelyanov	db623c5f64	cdc: Replace db::config with generation_service::config This is to push the service towards general idea that each component should have its own config and db::config to stay in main. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 16:04:12 +03:00
Pavel Emelyanov	b879d3f3a5	cdc: Drop db::config from description_generator It only needs one for murmur3_partitioner_ignore_msb_bits value, provide it directly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 16:04:12 +03:00
Pavel Emelyanov	2e7364b94f	cdc: Remove all arguments from maybe_rewrite_streams_descriptions All of them are references taken from 'this', since the function is the generation_service method it can use 'this' directly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 16:04:12 +03:00
Pavel Emelyanov	6fe31d8eac	cdc: Move maybe_rewrite_streams_descriptions into after_join The generation service already has all it needs to do it. This keeps storage_service smaller and less aware about cdc internals. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 15:34:03 +03:00
Pavel Emelyanov	3b51c5c96a	cdc: Squash two methods into one The recently introduced make_new_generation() method just calls another one by passing more this->... stuff as arguments. Relax the flow by teaching the latter to use 'this' directly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 15:34:03 +03:00
Pavel Emelyanov	7a7a87f24a	cdc: Turn make_new_cdc_generation a service method It has everything needed onboard. Only two arguments are required -- the bootstrap tokens and whether or not to inject a delay. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 15:34:03 +03:00
Pavel Emelyanov	b867a19da1	cdc: Remove ring-delay arg from make_new_cdc_generation It already has the db::config from where to get one (and even this will change soon). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 15:34:03 +03:00
Pavel Emelyanov	5e2a049266	cdc: Keep database reference on generation_service The service effectively depends on it when rewrites streams descriptions. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 15:34:03 +03:00
Avi Kivity	ed396a31f3	Merge "Remove global storage proxy from cdc" from Pavel E " There's a single call to get_local_storage_proxy in cdc code that needs to get database from. Fortunately, the database can be easily provided there via call argument. tests: unit(dev) " * 'br-remove-proxy-from-cdc' of https://github.com/xemul/scylla: cdc: Add database argument to is_log_for_some_table client_state: Pass database into has_access() client_state: Add database argument to has_schema_access client_state: Add database argument to has_keyspace_access() cdc: Add database argument to check_for_attempt_to_create_nested_cdc_log	2021-09-13 18:45:46 +03:00
Pavel Emelyanov	5515f7187d	range_tombstone, code: Add range_tombstone& getters Currently all the code operates on the range_tombstone class, and many of those places get the range tombstone in question from the range_tombstone_list. Next patches will make that list carry (and return) some new object called range_tombstone_entry, so all the code that expects to see the former one there will need to be patched to get the range_tombstone from the _entry one. This patch prepares the ground for that by introducing the range_tombstone& tombstone() { return *this; } getter on the range_tombstone itself and patching all future users of the _entry to call .tombstone() right now. Next patch will remove those getters together with adding the new range_tombstone_entry object thus automatically converting all the patched places into using the entry in a proper way. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-03 19:34:45 +03:00
Pavel Emelyanov	0fd00d7016	cdc: Add database argument to is_log_for_some_table All callers have been patched already. This argument can now be used to replace get_local_storage_proxy().get_db().local() call. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-08-27 14:07:26 +03:00
Pavel Emelyanov	fe8bc0757b	cdc: Add database argument to check_for_attempt_to_create_nested_cdc_log The only caller of it already has database argument, just pass it a bit further Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-08-27 14:07:18 +03:00
Benny Halevy	4439e5c132	everywhere: cleanup defer.hh includes Get rid of unused includes of seastar/util/{defer,closeable}.hh and add a few that are missing from source files. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-22 21:11:39 +03:00
Asias He	6350a19f73	compaction: Move compaction_strategy.hh to compaction dir The top dir is a mess. Move compaction_strategy.hh and compaction_strategy_type.hh to the new home.	2021-08-07 08:06:37 +08:00
Avi Kivity	e52ebe2da5	types: convert abstract_type::compare and related to std::strong_ordering Change comparators around types to std::strong_ordering. Ref #1449.	2021-07-28 13:19:24 +03:00
Calle Wilund	59555fa363	cdc: fix broken function signature in maybe_back_insert_iterator Fixes #9103 compare overload was declared as "bool" even though it is a tri-cmp. causes us to never use the speed-up shortcut (lessen search set), in turn meaning more overhead for collections. Closes #9104	2021-07-27 20:37:30 +03:00
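
The bug class described in the commit above can be illustrated with a minimal sketch. This is not the code from `maybe_back_insert_iterator`; the function names below are hypothetical and only show why declaring a three-way comparison as `bool` loses the sign of the result.

```cpp
#include <cassert>
#include <string>

// Hypothetical three-way comparison: negative, zero, or positive.
int tri_compare(const std::string& a, const std::string& b) {
    return a.compare(b);
}

// Buggy declaration: the int result is implicitly converted to bool,
// so "less" and "greater" both collapse to true and only "equal" is false.
bool compare_as_bool(const std::string& a, const std::string& b) {
    return tri_compare(a, b);
}

// Fixed declaration: the sign of the result survives.
int compare_as_int(const std::string& a, const std::string& b) {
    return tri_compare(a, b);
}

// Small usage check showing the difference between the two declarations.
void usage_example() {
    assert(compare_as_bool("a", "b"));    // "less" reads as true
    assert(compare_as_bool("b", "a"));    // "greater" also reads as true
    assert(!compare_as_bool("a", "a"));   // only "equal" is false
    assert(compare_as_int("a", "b") < 0); // the int version keeps the order
    assert(compare_as_int("b", "a") > 0);
}
```

A caller that relies on the sign to narrow a search range cannot do so with the `bool` version, which is consistent with the "never use the speed-up shortcut" symptom in the commit message.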
Piotr Jastrzebski	c010cefc4d	cdc: Handle compact storage tables correctly When a table with compact storage has no regular column (only primary key columns), an artificial column of type empty is added. Such column type can't be returned via CQL so CDC Log shouldn't contain a column that reflects this artificial column. This patch does two things: 1. Make sure that CDC Log schema does not contain columns that reflect the artificial column from a base table. 2. When composing mutation to CDC Log, omit the artificial column. Fixes #8410 Test: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Closes #8988	2021-07-12 12:17:35 +03:00
Tomasz Grabiec	06e373e272	sstables: index_reader: Keep index objects under LSA In preparation for caching index objects, manage them under LSA. Implementation notes: key_view was changed to be a view on managed_bytes_view instead of bytes, so it now can be fragmented. Old users of key_view now have to linearize it. Actual linearization should be rare since partition keys are typically small. Index parser is now not constructing the index_entry directly, but produces value objects which live in the standard allocator space: class parsed_promoted_index_entry; class parsed_partition_index_entry; This change was needed to support consumers which don't populate the partition index cache and don't use LSA, e.g. sstable::generate_summary(). It's now consumer's responsibility to allocate index_entry out of parsed_partition_index_entry.	2021-07-02 19:02:14 +02:00
Kamil Braun	a3f3563828	storage_service: check for existing normal token owners before bootstrapping The bootstrap procedure starts by "waiting for range setup", which means waiting for a time interval specified by the `ring_delay` parameter (30s by default) so the node can receive the tokens of other nodes before introducing its own tokens. However it may sometimes happen that the node doesn't receive the tokens. There are no explicit checks for this. But the code may crash in weird ways if the tokens-received assumption is false, and we are lucky if it does crash (instead of, for example, allowing the node to incorrectly bootstrap, causing data loss in the process). Introduce an explicit check-and-throw-if-false: a bootstrapping node now checks that there's at least one NORMAL token in the token ring, which means that it had to have contacted at least one existing node in the cluster, which means that it received the gossip application states of all nodes from that node; in particular the tokens of all nodes. Also add an assert in CDC code which relies on that assumption (and would cause weird division-by-zero errors if the assumption was false; better to crash on assert than this). Ref #8889. Closes #8896	2021-06-24 13:19:08 +03:00
Pavel Solodovnikov	76bea23174	treewide: reduce header interdependencies Use forward declarations wherever possible. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Closes #8813	2021-06-07 15:58:35 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Solodovnikov	142d3b5ad9	cdc: self-sufficient headers fixup Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-06-06 19:18:49 +03:00
Kamil Braun	337a4ef8ad	cdc: when creating new generations, use format v2 if possible A node with this commit, when creating a new CDC generation (during bootstrap, upgrade, or when running checkAndRepairCdcStreams command) will check for the CDC_GENERATIONS_V2 feature and: - If the feature is enabled create the generation in the v2 format and insert it into the new internal table. This is safe because a node joins the feature only if it understands the new format. - Otherwise create it in the v1 format, limiting its size as before, and insert it into the old table. The second case should only happen if we perform bootstrap or run checkAndRepairCdcStreams in the middle of an upgrade procedure. On fully upgraded clusters the feature shall be enabled, causing all new generations to use the new format.	2021-05-25 16:07:23 +02:00
Kamil Braun	4d3870b24b	main: pass feature_service to cdc::generation_service	2021-05-25 16:07:23 +02:00
Kamil Braun	9c1a3180bb	cdc: introduce retrieve_generation_data This function given a generation ID retrieves its data from the internal table in which the data resides. This depends on the version of the ID: for _v1 we're using system_distributed.cdc_generation_descriptions, for _v2 we're using the better system_distributed_v2.cdc_generation_descriptions_v2 (see the previous commit for detailed explanation of the superiority of the new table).	2021-05-25 16:07:23 +02:00
Kamil Braun	4658adbe18	tree-wide: introduce cdc::generation_id_v2 This is a new type of CDC generation identifiers. Compared to old IDs, additionally to the timestamp it contains an UUID. These new identifiers will allow a safer and more efficient algorithm of introducing new generations into a cluster (introduced in a later commit). For now, nodes keep using the old identifier format when creating new generations and whenever they learn about a new CDC generation from gossip they assume that it also is stored in the v1 format. But they do know how to (de)serialize the second format and how to persist new identifiers in local tables.	2021-05-24 17:50:21 +02:00
Pavel Solodovnikov	fff7ef1fc2	treewide: reduce boost headers usage in scylla header files `dev-headers` target is also ensured to build successfully. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-05-20 01:33:18 +03:00
Piotr Sarna	6e28c01c53	cdc: make metadata.hh self-sufficient The header relies on topology_description class definition, which is part of cdc/generation.hh.	2021-05-18 15:10:31 +02:00
Avi Kivity	b1f9df279a	Merge "Untie cdc, storage service and migration notifier knot" from Pavel E " Storage service needs migration notifier reference to pass it to cdc service via get_local_storage_service(). This set removes - get_local_storage_service from cdc - migration notifier from storage service - db_context::builder from cdc (released nuclear binding energy) tests: unit(dev) " * 'br-cdc-no-storage-service' of https://github.com/xemul/scylla: storage_service: Remove migration notifier dependency cdc: Remove db_context::builder cdc: Provide migration notifier right at once cdc: Remove db_context::builder::with_migration_notifier	2021-05-11 18:39:10 +03:00
Piotr Grabowski	cd6154e8bf	cdc: log: assert post_image is always in full mode Add an assertion that checks that post_image can never be in non-full mode.	2021-05-04 12:33:15 +02:00
Pavel Emelyanov	cc813ef0dd	cdc: Remove db_context::builder Right now the builder is just an opaque transfer between cdc_service constructor args and cdc_service's db_context constructor args. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-04-29 22:46:57 +03:00
Pavel Emelyanov	3a7ca647af	cdc: Provide migration notifier right at once The only way db_context's migration notifier reference is set up is via cdc_service->db_context::builder->.build chain of calls. Since the builder's notifier optional reference is always disengaged (the .with_migration_notifier is removed by previous patch) the only possible notifier reference there is from the storage service which, in turn, is the same as in main.cc. Said that -- push the notifier reference onto db_context directly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-04-29 22:40:24 +03:00
Pavel Emelyanov	421a514c30	cdc: Remove db_context::builder::with_migration_notifier It's unused and removing it makes next patch's life simpler Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-04-29 22:39:12 +03:00
Piotr Grabowski	b1650114eb	cdc: log: fill cdc$deleted_ columns in pre-images Before this change, cdc$deleted_ columns were all NULL in pre-images. Lack of such information made it hard to correctly interpret the pre-image rows, for example: INSERT INTO tbl(pk, ck, v, v2) VALUES (1, 1, null, 1); INSERT INTO tbl(pk, ck, v2) VALUES (1, 1, 1); For this example, pre-image generated for the second operation would look like this (in both 'true' and 'full' pre-image mode): pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1 v=NULL has two meanings: 1. If pre-image was in 'true' mode, v=NULL describes that v was not affected (affected columns: pk, ck, v2). 2. If pre-image was in 'full' mode, v=NULL describes that v was equal to NULL in the pre-image. Therefore, to properly decode pre-images you would need to know in which mode pre-image was configured on the CDC-enabled table at the moment this CDC log row was inserted. There is no way to determine such information (you can only check a current mode of pre-image). A solution to this problem is to fill in the cdc$deleted_ columns for pre-images. After this change, for the INSERT described above, CDC now generates the following log row: If in pre-image 'true' mode: pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1 If in pre-image 'full' mode: pk=1, ck=1, v=NULL, cdc$deleted_v=true, v2=1 A client library now can properly decode a pre-image row. If it sees a NULL value, it can now check the cdc$deleted_ column to determine if this NULL value was a part of pre-image or it was omitted due to not being an affected column in the delta operation. No such change is necessary for the post-image rows, as those images are always generated in the 'full' mode. Additional example of trouble decoding pre-images before this change. 
tbl2 - 'true' pre-image mode, tbl3 - 'full' pre-image mode: INSERT INTO tbl2(pk, ck, v, v2) VALUES (1, 1, 5, 1); INSERT INTO tbl3(pk, ck, v, v2) VALUES (1, 1, null, 1); INSERT INTO tbl2(pk, ck, v2) VALUES (1, 1, 1); generated pre-image: pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1 INSERT INTO tbl3(pk, ck, v2) VALUES (1, 1, 1); generated pre-image: pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1 Both pre-images look the same, but: 1. v=NULL in tbl2 describes v being omitted from the pre-image. 2. v=NULL in tbl3 describes v being NULL in the pre-image.	2021-04-29 18:04:07 +02:00
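
The decoding rule that the commit above enables for client libraries can be sketched as follows. This is a minimal illustration, not code from any client library: the enum, the function name, and the use of `std::optional` to model a nullable CQL column are all assumptions, and `int` stands in for whatever type column `v` has.

```cpp
#include <cassert>
#include <optional>

// Hypothetical classification of one column in a pre-image row.
enum class preimage_cell {
    has_value,        // the column had a non-NULL value in the pre-image
    was_null,         // the column was NULL in the pre-image
    not_in_preimage   // the column was omitted (not affected by the delta)
};

// `value` models column v, `deleted` models cdc$deleted_v.
// An empty optional stands for a CQL NULL.
preimage_cell classify(const std::optional<int>& value,
                       const std::optional<bool>& deleted) {
    if (value.has_value()) {
        return preimage_cell::has_value;
    }
    // v is NULL: cdc$deleted_v = true means NULL was the real pre-image value.
    if (deleted.has_value() && *deleted) {
        return preimage_cell::was_null;
    }
    // v is NULL and cdc$deleted_v is NULL: the column was left out.
    return preimage_cell::not_in_preimage;
}

// The two log rows from the commit message, decoded.
void usage_example() {
    // 'true' mode: v=NULL, cdc$deleted_v=NULL
    assert(classify(std::nullopt, std::nullopt) == preimage_cell::not_in_preimage);
    // 'full' mode: v=NULL, cdc$deleted_v=true
    assert(classify(std::nullopt, true) == preimage_cell::was_null);
}
```

With this rule the reader no longer needs to know which pre-image mode was configured when the log row was written.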
Avi Kivity	daeddda7cc	treewide: remove inclusions of storage_proxy.hh from headers storage_proxy.hh is huge and includes many headers itself, so remove its inclusions from headers and re-add smaller headers where needed (and storage_proxy.hh itself in source files that need it). Ref #1.	2021-04-20 21:23:00 +03:00
Avi Kivity	14a4173f50	treewide: make headers self-sufficient In preparation for some large header changes, fix up any headers that aren't self-sufficient by adding needed includes or forward declarations.	2021-04-20 21:23:00 +03:00
Piotr Grabowski	61c8e196be	cdc: improve exception message of invalid "ttl" Improve the exception message of providing invalid "ttl" value to the table. Previously, if you executed a CREATE TABLE query with invalid "ttl" value, you would get a non-descriptive error message: CREATE TABLE ks.t(pk int, PRIMARY KEY(pk)) WITH cdc = {'enabled': true, 'ttl': 'invalid'}; ServerError: stoi This commit adds more descriptive exception messages: CREATE TABLE ks.t(pk int, PRIMARY KEY(pk)) WITH cdc = {'enabled': true, 'ttl': 'kgjhfkjd'}; ConfigurationException: Invalid value for CDC option "ttl": kgjhfkjd CREATE TABLE ks.t(pk int, PRIMARY KEY(pk)) WITH cdc = {'enabled': true, 'ttl': '75747885787487'}; ConfigurationException: Invalid CDC option: ttl too large	2021-04-14 17:40:23 +02:00
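
The "ServerError: stoi" symptom above is what an uncaught `std::stoi` exception tends to look like. A minimal sketch of turning it into the descriptive messages quoted in the commit follows; the function name and the use of `std::runtime_error` in place of the real ConfigurationException type are assumptions, and rejecting trailing characters is an extra assumption not stated in the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical parser for the CDC "ttl" option.
int parse_cdc_ttl(const std::string& text) {
    try {
        std::size_t used = 0;
        int ttl = std::stoi(text, &used);
        // Assumption: input such as "12abc" is treated as invalid.
        if (used != text.size()) {
            throw std::invalid_argument("trailing characters");
        }
        return ttl;
    } catch (const std::out_of_range&) {
        // std::stoi throws this when the number does not fit in an int.
        throw std::runtime_error("Invalid CDC option: ttl too large");
    } catch (const std::invalid_argument&) {
        // std::stoi throws this when no conversion could be performed.
        throw std::runtime_error("Invalid value for CDC option \"ttl\": " + text);
    }
}

// Usage: a valid value parses, an invalid one yields a readable message.
void usage_example() {
    assert(parse_cdc_ttl("86400") == 86400);
    bool rejected = false;
    try {
        parse_cdc_ttl("kgjhfkjd");
    } catch (const std::runtime_error&) {
        rejected = true;
    }
    assert(rejected);
}
```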
Piotr Grabowski	10390afc10	cdc: add validation of "enable" and "postimage" Add validation of "enable" and "postimage" CDC options. Both options are boolean options, but previously they were not validated, meaning you could issue a query: CREATE TABLE ks.t(pk int, PRIMARY KEY(pk)) WITH cdc = {'enabled': 'dsfdsd'}; and it would be executed without any errors, silently interpreting "dsfdsd" as false. This commit narrows possible values of those boolean CDC options to false, true, 0, 1. After applying this change, issuing the query above would result in this error message: ConfigurationException: Invalid value for CDC option "enabled": dsfdsd	2021-04-14 17:36:38 +02:00
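
A strict boolean option parser of the kind described above might look like the sketch below. The function name and the exception type are hypothetical, and whether the real check is case-sensitive is not stated in the commit; this version assumes it is.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical strict parser: accepts only "true", "false", "1", "0".
// Anything else is rejected instead of silently being read as false.
bool parse_cdc_bool(const std::string& option_name, const std::string& text) {
    if (text == "true" || text == "1") {
        return true;
    }
    if (text == "false" || text == "0") {
        return false;
    }
    throw std::runtime_error(
        "Invalid value for CDC option \"" + option_name + "\": " + text);
}

// Usage: the query from the commit message would now be rejected.
void usage_example() {
    assert(parse_cdc_bool("enabled", "true"));
    assert(!parse_cdc_bool("enabled", "0"));
    bool rejected = false;
    try {
        parse_cdc_bool("enabled", "dsfdsd");
    } catch (const std::runtime_error&) {
        rejected = true;
    }
    assert(rejected);
}
```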
Piotr Sarna	d77eb39076	Merge 'cdc: log: avoid linearizations' from Michał Chojnowski CDC log uses `bytes` to deal with cells and their values, and linearizes all values indiscriminately. This series makes a switch from `bytes` to `managed_bytes` to avoid that linearization. Fixes #7506. Closes #8429 * github.com:scylladb/scylla: cdc: log: change yet another occurrence of `bytes` to `managed_bytes` cdc: log: switch the remaining usages of `bytes` to `managed_bytes` in collection_visitor cdc: log: change `deleted_elements` in log_mutation_builder from bytes to managed_bytes cdc: log: rewrite collection merge to use managed_bytes instead of bytes cdc: log: don't linearize collections in get_preimage_col_value cdc: log: change return type of get_preimage_col_value to managed_bytes cdc: log: remove an unnecessary copy in process_row_visitor::live_atomic_cell cdc: log: switch cell_map from bytes to managed_bytes cdc: log: change the argument of log_mutation_builder::set_value to managed_bytes_view cdc: log: don't linearize the primary key in log_mutation_builder atomic_cell: add yet another variant of make_live for managed_bytes_view compound: add explode_fragmented	2021-04-12 10:56:12 +02:00
Michał Chojnowski	6b31f73987	cdc: log: change yet another occurrence of `bytes` to `managed_bytes`	2021-04-08 10:16:21 +02:00
Michał Chojnowski	061f72166c	cdc: log: switch the remaining usages of `bytes` to `managed_bytes` in collection_visitor	2021-04-08 10:16:21 +02:00
Michał Chojnowski	2760382a68	cdc: log: change `deleted_elements` in log_mutation_builder from bytes to managed_bytes	2021-04-08 10:16:21 +02:00
Michał Chojnowski	ba53c85829	cdc: log: rewrite collection merge to use managed_bytes instead of bytes	2021-04-08 10:16:21 +02:00
Michał Chojnowski	42acdc4d09	cdc: log: don't linearize collections in get_preimage_col_value	2021-04-08 10:16:21 +02:00
Michał Chojnowski	70a2bed70b	cdc: log: change return type of get_preimage_col_value to managed_bytes	2021-04-08 10:16:21 +02:00
Michał Chojnowski	4214e74678	cdc: log: remove an unnecessary copy in process_row_visitor::live_atomic_cell	2021-04-08 10:16:11 +02:00

282 Commits