scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 02:50:33 +00:00

Author	SHA1	Message	Date
Kamil Braun	3200d415da	cdc: use a single timeuuid value for a batch of changes If a batch update is performed with a sequence of changes with a single timestamp, they will now show up in CDC with a single timeuuid in the `time` column, distinguished by different `batch_seq_no` values. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-05 12:32:57 +01:00
Kamil Braun	292eba9da0	cdc: replace `split` with `for_each_change` `for_each_change` is like `split` but it doesn't return a vector of mutations representing each change; instead, it takes as a parameter a function which gets called on each mutation. This reduces memory usage and allows preserving common context when handling each change (will be useful in next commits). Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-05 12:05:08 +01:00
Piotr Sarna	f21bd57058	Merge "cdc: log static rows correctly" from Piotr Currently, writes to a static row in a base table are not reflected at all in the corresponding cdc log. This patch causes such writes to be properly logged. Fixes: #5744 Tests: unit(dev) * piodul/5744-handle-static-row-correctly-in-cdc: cdc_test: add tests for handling static row cdc: fix indentation in transformer::transform cdc: handle static rows separately in transformer::transform cdc: move process_cells higher (and fix captured variables) cdc: reduce dependencies on captured variables in process_cells cdc: fix preimage query for static rows	2020-03-05 10:42:15 +01:00
Nadav Har'El	96ca5ac2c8	alternator: use separate smp_service_group for bouncing requests Until this patch, we used the default_smp_service_group() when bouncing Alternator requests between shards (which is needed for LWT). This patch creates a new smp_service_group for this purpose, which is limited to 5000 concurrent requests (the same limit used for CQL's bounce_request_smp_service_group). The purpose of this limit is to avoid many shards admitting a huge number of requests and bouncing all of them to the same shard who now can't "unadmit" these requests. Fixes #5664. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200304170825.27226-1-nyh@scylladb.com>	2020-03-05 10:17:51 +01:00
Juliusz Stasiewicz	c8527f20b0	CDC+LWT: fix missing CDC entries for successful LWTs Now, if CDC is enabled, `paxos_response_handler::learn_decision()` augments the base table mutation. The differences in logic between: (1) `mutate_internal<std::vector<mutation>>()` and (2) `mutate_internal<std::vector<std::tuple<paxos::proposal, schema_ptr, ...>>>()` make it necessary to separate "CDC mutations" from "base mutation" and send them, respectively, to (1) and (2). Gleb explained in #5869 why it became necessary to add CDC code to LWT writes specifically, instead of doing it somewhere central that affects all writes: "All paths that do write goes through mutate_internally() eventually so it would have been best to do augmentations there, but cdc chose to log only certain writes and not others (unlike MV that does not care how write happened) and mutate_internal have no idea which is which so I do not have other choice but code duplication. ... paxos_response_handler::learn_decision is probably the place to add cdc augmentation." Fixes #5869	2020-03-05 09:49:19 +02:00
Piotr Dulikowski	204e204586	cdc: do not attempt to log empty mutations It is possible to produce an empty mutation using CQL. For example, the following query: DELETE FROM ks.tbl WHERE pk = 0 AND ck < 1 AND ck > 2; will attempt to delete from an empty range of rows. This is translated to the following mutation: {ks.tbl {key: pk{000400000000}, token:-3485513579396041028} {mutation_partition: static: cont=1 {row: }, clustered: {}}} Such mutation does not contain any timestamp, therefore it is difficult to determine what timestamp was used while making the query. This is problematic for CDC, because an entry in CDC log should be written with the same timestamp as a part of the mutation. Because an empty mutation does not modify the table in any way, we can safely skip logging such mutations in CDC and still preserve the ability to reconstruct the current state of the base table from full CDC log. Tests: unit(dev)	2020-03-05 08:32:54 +01:00
Piotr Dulikowski	e6751fad62	cdc_test: add tests for handling static row	2020-03-05 00:16:17 +01:00
Piotr Dulikowski	39519ce923	cdc: fix indentation in transformer::transform	2020-03-05 00:16:17 +01:00
Piotr Dulikowski	0d05b17881	cdc: handle static rows separately in transformer::transform Before this patch, `transform` did not generate any log rows about static row change. This commit fixes that - now, a log row is created if a static row is changed, and this row is separate from the rows that describe changes to the clustering rows.	2020-03-05 00:16:17 +01:00
Piotr Dulikowski	6a0b0b5786	cdc: move process_cells higher (and fix captured variables) The `process_cells` lambda is moved outside the loop, because it will be used by other code in subsequent commits.	2020-03-05 00:15:57 +01:00
Piotr Dulikowski	f136f6e02c	cdc: reduce dependencies on captured variables in process_cells This is a preparation for moving the lambda outside the for loop. - `log_ck`, `pikey`, `pirow` are now passed as arguments, - `value` is now a variable local to the lambda, - `ttl` is now a variable local to the lambda that is returned.	2020-03-05 00:14:05 +01:00
Piotr Dulikowski	a7f51449c3	cdc: fix preimage query for static rows For static rows, we need to fetch at least one row from its partition in order to compute its preimage.	2020-03-04 18:43:55 +01:00
Botond Dénes	8b908a9aba	test: lib/mutation_source_test: log the name of the test-method Most test-methods log a message with their names upon entering them. This helps identify, in the logs, the test-method in which a failure happened. Two methods were missing this log line, so add it. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200304155235.46170-1-bdenes@scylladb.com>	2020-03-04 18:16:21 +02:00
Pekka Enberg	7fde2e28da	dist/redhat: Specify files once in scylla.spec file Silences the following warnings when building an RPM: warning: File listed twice: /opt/scylladb/scripts/libexec/hex2list.py warning: File listed twice: /opt/scylladb/scripts/libexec/node_exporter_install warning: File listed twice: /opt/scylladb/scripts/libexec/perftune.py warning: File listed twice: /opt/scylladb/scripts/libexec/scylla-blocktune warning: File listed twice: /opt/scylladb/scripts/libexec/scylla-housekeeping warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_bootparam_setup warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_config_get.py warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_coredump_setup warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_cpuscaling_setup warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_cpuset_setup warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_dev_mode_setup warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_ec2_check warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_fstrim warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_fstrim_setup warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_io_setup warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_kernel_check warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_ntp_setup warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_prepare warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_raid_setup warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_selinux_setup warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_setup warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_stop warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_sysconfig_setup warning: File listed twice: /opt/scylladb/scripts/libexec/seastar-addr2line warning: File listed twice: 
/opt/scylladb/share/doc/scylla/licenses warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/LICENSE-crc32-vpmsum.TXT warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/README.md warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/apache-license-2.0.txt warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/boost-license-1.0.txt warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/date-license.txt warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/git-archive-all-license.txt warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/libdeflate-license.txt warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/xxhash-license.txt warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/zstd-license.txt I verified that the files are in the generated RPMs after the change: [penberg@nero scylla]$ rpm -ql build/dist/dev/redhat/RPMS/x86_64/scylla-server-666.development-0.20200304.2bc700b008.x86_64.rpm \| grep scripts.*libexec /opt/scylladb/scripts/libexec /opt/scylladb/scripts/libexec/hex2list.py /opt/scylladb/scripts/libexec/node_exporter_install /opt/scylladb/scripts/libexec/perftune.py /opt/scylladb/scripts/libexec/scylla-blocktune /opt/scylladb/scripts/libexec/scylla-housekeeping /opt/scylladb/scripts/libexec/scylla_bootparam_setup /opt/scylladb/scripts/libexec/scylla_config_get.py /opt/scylladb/scripts/libexec/scylla_coredump_setup /opt/scylladb/scripts/libexec/scylla_cpuscaling_setup /opt/scylladb/scripts/libexec/scylla_cpuset_setup /opt/scylladb/scripts/libexec/scylla_dev_mode_setup /opt/scylladb/scripts/libexec/scylla_ec2_check /opt/scylladb/scripts/libexec/scylla_fstrim /opt/scylladb/scripts/libexec/scylla_fstrim_setup /opt/scylladb/scripts/libexec/scylla_io_setup /opt/scylladb/scripts/libexec/scylla_kernel_check /opt/scylladb/scripts/libexec/scylla_ntp_setup /opt/scylladb/scripts/libexec/scylla_prepare 
/opt/scylladb/scripts/libexec/scylla_raid_setup /opt/scylladb/scripts/libexec/scylla_selinux_setup /opt/scylladb/scripts/libexec/scylla_setup /opt/scylladb/scripts/libexec/scylla_stop /opt/scylladb/scripts/libexec/scylla_sysconfig_setup /opt/scylladb/scripts/libexec/seastar-addr2line [penberg@nero scylla]$ rpm -ql build/dist/dev/redhat/RPMS/x86_64/scylla-server-666.development-0.20200304.2bc700b008.x86_64.rpm \| grep license /opt/scylladb/share/doc/scylla/licenses /opt/scylladb/share/doc/scylla/licenses/LICENSE-crc32-vpmsum.TXT /opt/scylladb/share/doc/scylla/licenses/README.md /opt/scylladb/share/doc/scylla/licenses/apache-license-2.0.txt /opt/scylladb/share/doc/scylla/licenses/boost-license-1.0.txt /opt/scylladb/share/doc/scylla/licenses/date-license.txt /opt/scylladb/share/doc/scylla/licenses/git-archive-all-license.txt /opt/scylladb/share/doc/scylla/licenses/libdeflate-license.txt /opt/scylladb/share/doc/scylla/licenses/xxhash-license.txt /opt/scylladb/share/doc/scylla/licenses/zstd-license.txt Message-Id: <20200304150057.2621-1-penberg@scylladb.com>	2020-03-04 17:25:53 +02:00
Tomasz Grabiec	da4bd3d2e6	Merge "Clean cql3 usage of storage_proxy and _service" from Pavel E. This set removes _all_ mentions of storage_service and _all_ calls for global storage_proxy instances from cql3/ code. Tests: unit(dev)	2020-03-04 15:20:24 +01:00
Raphael S. Carvalho	3ba3ee2a7b	distributed_loader: trigger regular compaction on resharding completion Regular compaction relies on compaction manager to run compaction jobs until compaction strategy is satisfied. Resharding, on the other hand, is a one-off operation which runs only once in compaction manager, and leaves the sstable set in such a way that the strategy is very likely unsatisfied. We need to trigger regular compaction whenever a resharding job replaces a shared sstable by an unshared sstable, so that compaction will not fall way behind due to lots of new sstables created by the resharding process. Fixes #5262. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200217144946.20338-1-raphaelsc@scylladb.com>	2020-03-04 16:08:13 +02:00
Nadav Har'El	f67a402c48	merge: Remove treewide dependency on boost/multiprecision Merged patch series from Avi Kivity: boost/multiprecision is a heavyweight library, pulling in 20,000 lines of code into each header that depends on it. It is used by converting_mutation_partition_applier and types.hh. While the former is easy to put out-of-line, the latter is not. All we really need is to forward-declare boost::multiprecision::cpp_int, but that is not easy - it is a template taking several parameters, among which are non-type template parameters also defined in that header. So it's quite difficult to disentangle, and fragile wrt boost changes. This patchset introduces a wrapper type utils::multiprecision_int which _can_ be forward declared, and together with a few other small fixes, manages to uninclude boost/multiprecision from most of the source files. The total reduction in number of lines compiled over a full build is 324 * 23,227 or around 7.5 million. Tests: unit (dev) Ref #1 https://github.com/avikivity/scylla uninclude-boost-multiprecision/v1 Avi Kivity (5): converting_mutation_partition_applier: move to .cc file utils: introduce multiprecision_int tests: cdc_test: explicitly convert from cdc::operation to uint8_t treewide: use utils::multiprecision_int for varint implementation types: forward-declare multiprecision_int configure.py \| 2 + concrete_types.hh \| 2 +- converting_mutation_partition_applier.hh \| 163 ++------------- types.hh \| 12 +- utils/big_decimal.hh \| 3 +- utils/multiprecision_int.hh \| 256 +++++++++++++++++++++++ converting_mutation_partition_applier.cc \| 188 +++++++++++++++++ cql3/functions/aggregate_fcts.cc \| 10 +- cql3/functions/castas_fcts.cc \| 28 +-- cql3/type_json.cc \| 2 +- lua.cc \| 38 ++-- mutation_partition_view.cc \| 2 + test/boost/cdc_test.cc \| 6 +- test/boost/cql_query_test.cc \| 16 +- test/boost/json_cql_query_test.cc \| 12 +- test/boost/types_test.cc \| 58 ++--- test/boost/user_function_test.cc \| 2 +- 
test/lib/random_schema.cc \| 14 +- types.cc \| 20 +- utils/big_decimal.cc \| 4 +- utils/multiprecision_int.cc \| 37 ++++ 21 files changed, 627 insertions(+), 248 deletions(-) create mode 100644 utils/multiprecision_int.hh create mode 100644 converting_mutation_partition_applier.cc create mode 100644 utils/multiprecision_int.cc	2020-03-04 15:13:42 +02:00
Avi Kivity	5dee627f73	types: forward-declare multiprecision_int This reduces the number of translation units that depend on boost/multiprecision from 354 to 30, and reduces the size of database.i (as an example) from 406160 to 382933 (smaller files will benefit more, relatively). Ref #1	2020-03-04 13:28:16 +02:00
Avi Kivity	3c772757c0	treewide: use utils::multiprecision_int for varint implementation The goal is to forward-declare utils::multiprecision_int, something beyond my capabilities for boost::multiprecision::cpp_int, to reduce compile time bloat. The patch is mostly search-and-replace, with a few casts added to disambiguate conversions the compiler had trouble with.	2020-03-04 13:28:16 +02:00
Avi Kivity	874f65c58c	tests: cdc_test: explicitly convert from cdc::operation to uint8_t After the varint data type starts using the new multiprecision_int type, this code fails to compile. I expect that somehow the conversion from enum class to cpp_int was allowed to succeed, and we ended up with a data_value of type varint. The tests succeeded because the serialized representation happened to be the same.	2020-03-04 13:28:16 +02:00
Piotr Jastrzebski	354e3c34c8	cdc log: merge stream_id columns into a single column Previously we had stream_id_1 and stream_id_2 columns of type long each. They were forming a partition key. In a new format we want a single stream_id column that forms a partition key. To be able to still store two longs, the new column will have type blob and its value will be concatenated bytes of two longs that partition key is composed of. We still want partition key to logically be two longs because those two values will be used by a custom partitioner later once we implement it. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-04 13:27:48 +02:00
Avi Kivity	7434c81a29	utils: introduce multiprecision_int multiprecision_int is a wrapper around boost::multiprecision::cpp_int that adds no functionality. The intent is to allow forward declaration; cpp_int is so complicated that just finding out what its true type is is a difficult exercise, as it depends on many internal declarations. Because cpp_int uses expression templates, the implementation has to explicitly cast to the desired type in many places, otherwise the C++ compiler is presented with too many choices, especially in conjunction with data_value (which can convert from many different types too).	2020-03-04 12:42:57 +02:00
Avi Kivity	414ec8c68e	converting_mutation_partition_applier: move to .cc file converting_mutation_partition_applier is a heavyweight class that is not used in the hot path, so it can be safely out-of-lined. This moves some includes to boost/multiprecision out of header files, where they can infect a lot of code. mutation_partition_view.cc's includes were adjusted to recover missing dependencies.	2020-03-04 12:42:57 +02:00
Pavel Emelyanov	35b0e6dd7f	repair_writer: Use db from repair_meta (2nd try) The previous version erroneously used local db reference which was propagated into another shard. This time carry the sharded instance and use .local() as before. tests: unit(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200303221729.31261-1-xemul@scylladb.com>	2020-03-04 11:31:52 +01:00
Tomasz Grabiec	477dadc062	Merge "cql_test_env: Drop a few shared_ptr<sharded<...>>" from Rafael I found that a few variables in cql_test_env were wrapping sharded in shared_ptr for no apparent reason. These patches convert them to plain sharded<...>.	2020-03-04 11:31:52 +01:00
Yaron Kaikov	de19496ff7	dist/docker: Add VERSION argument to Dockerfile (#5845) Currently, the Dockerfile installs the latest version of Scylla. Let's add a VERSION argument to Dockerfile, which explicitly specifies the version to ensure scripts, for example, always build the expected version. If no VERSION is specified for "docker build", use the default value of "666.development", which is the version number for latest nightly.	2020-03-04 12:20:24 +02:00
Pekka Enberg	e76b5bdf7b	Merge 'Cleanup test.py output' from Kostja "These two patches were made suspect of failing next promotion and excluded from the original series." * 'test.py.log' of https://github.com/kostja/scylla: test.py: remove log output on success unless -s is specified test.py: do not store entire log output in junit report.	2020-03-04 11:58:46 +02:00
Eliran Sinvani	99cedf737c	docker: rsyslog configuration fixes The introduction of rsyslog had two errors in it. Both errors are non-fatal and the docker still works, however, the system is left in a wrong state in which supervisord marks rsyslogd service as failed (after several failed retry attempts). Another bug in the configuration causes rsyslog to output an error. 1) An inclusion command from a newer version was used in rsyslog's main configuration file. This caused rsyslog to complain during startup but it didn't do much damage since rsyslog converts every unrecognised command to a message command. 2) In the supervisord definition of the service, rsyslogd is run without the -n option, which means it defaults to automatically switching to the background. Supervisord interprets this as an unexpected process termination and retries starting the process (unsuccessfully, because rsyslog protects itself from having multiple processes of itself) and eventually marks it as down although it is fully up and running. This commit fixes both configuration problems. Tests: Build and run docker and validate the errors are gone. Fixes #5937	2020-03-04 11:56:30 +02:00
Pekka Enberg	325c3e13eb	build: Switch to SHA1 build IDs Currently, you have to build the relocatable package tarball with ./reloc/build_reloc.sh to be able to build an RPM out of it. You need to do this because RPMs require SHA1 build-ids, but the build system does not enforce that. To prepare for adding RPM target to the ninja build, let's switch to SHA1 build ID conditionally, because the performance difference between xxhash and SHA1 is negligible. Rafael Avila de Espindola writes: [...] the sha1 implementation in current lld is pretty fast. Linking release scylla the times I get are lld in fedora fast 2.83739 sha1 3.51990 current lld fast 2.6936 sha1 2.90250 And the sha1 implementation might get even faster: https://bugs.llvm.org/show_bug.cgi?id=44138. Message-Id: <20200303131806.22422-1-penberg@scylladb.com>	2020-03-04 11:00:43 +02:00
Tomasz Grabiec	82b76163e3	utils/small_vector: Add missing include Needed for std::uninitialized_move() et al Message-Id: <20200303191148.11716-1-tgrabiec@scylladb.com>	2020-03-03 21:23:40 +02:00
Tomasz Grabiec	5dfefc0a85	Revert "repair_writer: Use db from repair_meta" This reverts commit `c6ddd21c50`. Uses database& instance across shards, which causes repair writer to use the table object from the wrong shard. Fixes #5907	2020-03-03 19:50:53 +01:00
Avi Kivity	906784639d	Merge "Clean sstables from using global objects" from Pavel E " This set cleans sstable_writer_config and surrounding sstables code from using global storage_ and feature_ service-s and database by moving the configuration logic onto sstables_manager (that was supposed to do it since `eebc3701a5`). Most of the complexity is hidden around sstable_writer_config creation, this set makes the sstables_manager create this object with an explicit call. All the rest are consequences of this change. Tests: unit(debug), manual start-stop " * 'br-clean-sstables-manager-2' of https://github.com/xemul/scylla: sstables: Move get_highest_supported_format sstables: Remove global get_config() helper sstables: Use manager's config() in .new_sstable_component_file() sstable_writer_config: Extend with more db::config stuff sstables_manager: Don't use global helper to generate writer config sstable_writer_config: Sanitize out some features fields initialization sstable_writer_config: Factor out some field initialization sstables: Generate writer config via manager only sstables: Keep reference on manager test: Re-use existing global sstables_manager table: Pass sstable_writer_config into write_memtable_to_sstable	2020-03-03 18:33:01 +02:00
Nadav Har'El	750fe9585a	alternator: change rjson::get() to take std::string_view Change rjson::get() to take std::string_view, instead of RapidJson's version of that type, "StringRef". We already did the same change for rjson::find() in a previous patch. Not only is std::string_view more convenient for potential callers in Scylla, this change also avoids a bug in FindMember() on StringRef where the length is ignored (and instead, null-termination of the string is assumed). This patch doesn't require any changes to callers, because we actually had just a handful of remaining callers (most call sites switched to rjson::find()), and all of them used string constants which could be implicitly converted to StringRef or std::string_view just the same. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200303161019.1456-1-nyh@scylladb.com>	2020-03-03 17:13:40 +01:00
Nadav Har'El	91d9632909	alternator: add rjson::remove_member() convenience function This patch adds a rjson::remove_member() wrapper to the RemoveMember method, which takes a std::string_view. But beyond the convenience, this actually works around a subtle bug in RemoveMember which, if given a StringRef parameter, ignores its length (see upstream issue https://github.com/Tencent/rapidjson/issues/1649). In the one place we used RemoveMember, it forced us to copy the string because it wasn't null-terminated. The solution proposed here involves wrapping the string view in a GenericValue - which no longer needs to copy the string, but still works around the bug. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200303143524.28300-1-nyh@scylladb.com>	2020-03-03 16:35:41 +01:00
Nadav Har'El	0fcb226412	alternator: switch rjson::find() to use std::string_view Our rjson::find() convenience function used RapidJson's "StringRef" type, which is almost exactly like std::string_view. If we switch to use string_view as we do in this patch, a lot of call sites become much simpler. Moreover, there was an even more important motivation for this patch: the RapidJson FindMember() function we used in rjson::find() has a bug when given a StringRef - although a StringRef contains a length, the FindMember() code ignores it and expects the string to be null-terminated (see: https://github.com/Tencent/rapidjson/issues/1649). In this patch, we wrap the pointer and length of a std::string_view in an rjson::value, a code path which bypasses the FindMember bug, and yet does not require copying the string. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200303141814.26929-1-nyh@scylladb.com>	2020-03-03 16:35:41 +01:00
Nadav Har'El	2ea0b9d226	Merge branch 'split-mutations' of github.com:haaawk/scylla into next Merged pull request https://github.com/scylladb/scylla/pull/5940 from Kamil Braun: Add a bunch of new structs describing a change made to a table, and an extract_changes function which takes a mutation and returns the set of changes contained in this mutation, separated by timestamp and ttl. Add a split function which uses extract_changes to split a mutation into separate mutations, each describing a single change. Static rows are put into separate changes now. The pre_image_select function was fixed to select pre_image data always when there is a static row/clustered row change, even if there were e.g. additional range tombstones. Fixes: #5719. Tests: unit(dev)	2020-03-03 17:27:21 +02:00
Botond Dénes	103bf50e18	storage_proxy: add timeouts to smp calls on the write path When a node is overloaded requests usually start to queue up. Timeouts are supposed to prevent queues from exploding and causing an OOM. One prominent queue that tends to explode is the smp queue as it didn't support timeouts and so requests would sit in the queue until the target shard would process them. If the target shard is heavily overloaded requests might accumulate faster than they are processed, surely leading to an OOM. To prevent this use the recently introduced timeout to `seastar::smp::submit_to()` and derived APIs to time out write requests sitting in the smp queue. We simply use the request's own timeout for this purpose. Fixes: #5055 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200303131658.741720-1-bdenes@scylladb.com>	2020-03-03 15:39:58 +02:00
Kamil Braun	5de9b5b566	cdc: add change splitting test Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-03 13:31:19 +01:00
Kamil Braun	5c4a237c12	cdc: split the mutation before passing it into `transform` If the mutation contains separate logical changes (e.g. with different timestamps and/or ttls), it will be split into multiple mutations, each passed into transform.	2020-03-03 13:17:51 +01:00
Kamil Braun	9924e3aa34	cdc: reduce code duplication in augment_mutation_call Now there's only one call to `transform`.	2020-03-03 13:17:51 +01:00
Kamil Braun	24a32a13b5	cdc: retrieve preimage anytime there are static/clustered row updates Previously we wouldn't retrieve the preimage if the mutation contained something different than static/clustered row updates, e.g. if it contained a partition deletion. However, there are mutations created from batch statements which can contain both a partition deletion and a set of row updates with a later timestamp. We want to retrieve the preimage too in this case.	2020-03-03 13:17:51 +01:00
Kamil Braun	529d30ef66	cdc: add `split` function This function takes a mutation and returns a set of mutations, each representing a separate change with a single timestamp and ttl.	2020-03-03 13:17:51 +01:00
Kamil Braun	132ea89c32	cdc: add `extract_changes` function This commit introduces a bunch of new structs describing a change made to a table, and an `extract_changes` function which takes a mutation and returns the set of changes contained in this mutation, separated by timestamp and ttl.	2020-03-03 13:17:51 +01:00
Kamil Braun	b5c944370e	cdc: add `should_split` function The function checks if there are multiple timestamps and/or ttls inside a mutation, which means separate changes should be created for this mutation in CDC.	2020-03-03 13:17:50 +01:00
Konstantin Osipov	48f09b95d0	test.py: remove log output on success unless -s is specified Log output is saved by the build system and can take a lot of space. Remove it unless -s is specified.	2020-03-03 13:59:14 +03:00
Konstantin Osipov	ae2820a1c7	test.py: do not store entire log output in junit report. This makes report very heavy and is suspected to corrupt XML output.	2020-03-03 13:59:14 +03:00
Nadav Har'El	359b32fb63	merge: CDC: implement new column format and naming Merged pull request https://github.com/scylladb/scylla/pull/5910 by Calle Wilund: Rename metadata and data columns according to new spec Also use transformation methods for names in all code + tests to make switching again easier Break up data column tuple Data column is now pure frozen original type. If column is deleted (set to null), a metadata column cdc$deleted_<name> is set to true, to distinguish null column == not involved in row operation For non-atomic collections, a cdc$deleted_elements_<name> column is added, and when removing elements from collection this is where they are shown. For non-atomic assign, the "cdc$deleted_<name>" is true, and <name> is set to new value. column_op removed.	2020-03-03 12:36:16 +02:00
Pavel Emelyanov	4fa12f2fb8	header: De-bloat schema.hh The header sits in many other headers, but there's a handy schema_fwd.hh that's tiny and contains needed declarations for other headers. So replace schema.hh with schema_fwd.hh in most of the headers (and remove completely from some). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200303102050.18462-1-xemul@scylladb.com>	2020-03-03 11:34:00 +01:00
Piotr Jastrzebski	f105f43008	commitlog: remove FIXME In segment_manager::on_timer() there's a FIXME to stop discarding future returned from sync() but sync() does not return any future so it's safe to remove the FIXME and stop casting to (void). Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <6d6d819cb2972e47e5f3fbe7b896499c64b09e53.1583230579.git.piotr@scylladb.com>	2020-03-03 12:21:56 +02:00
Calle Wilund	ed0d1c5fe2	cdc: Break up data column tuple According to "new" spec: Data column is now pure frozen original type. If column is deleted (set to null), a metadata column cdc$deleted_<name> is set to true, to distinguish null column == not involved in row operation For non-atomic collections, a cdc$deleted_elements_<name> column is added, and when removing elements from collection this is where they are shown. For non-atomic assign, the "cdc$deleted_<name>" is true, and <name> is set to new value. column_op removed.	2020-03-03 08:52:20 +00:00
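
Several of the CDC commits listed above (`should_split`, `split`, `for_each_change`) describe splitting one mutation into separate changes, each with a single timestamp and ttl. The following is only a rough Python sketch of that idea, not the Scylla implementation (which is C++). The `CellWrite` type, the list-of-writes model of a mutation, and the ordering of the resulting changes are all assumptions made for illustration; only the three function names come from the commit messages.

```python
from collections import defaultdict
from typing import Callable, List, NamedTuple, Optional


class CellWrite(NamedTuple):
    """Hypothetical stand-in for one cell update inside a mutation."""
    column: str
    value: object
    timestamp: int
    ttl: Optional[int]  # None means "no TTL"


def should_split(mutation: List[CellWrite]) -> bool:
    """True if the mutation carries more than one (timestamp, ttl) pair."""
    return len({(w.timestamp, w.ttl) for w in mutation}) > 1


def for_each_change(mutation: List[CellWrite],
                    fn: Callable[[List[CellWrite]], None]) -> None:
    """Call fn once per group of writes sharing a (timestamp, ttl) pair."""
    groups = defaultdict(list)
    for w in mutation:
        groups[(w.timestamp, w.ttl)].append(w)
    # Assumed ordering: by timestamp, then writes without a TTL first.
    ordered = sorted(groups, key=lambda k: (k[0], k[1] is not None, k[1] or 0))
    for key in ordered:
        fn(groups[key])


def split(mutation: List[CellWrite]) -> List[List[CellWrite]]:
    """Like for_each_change, but collects every change into a list."""
    changes: List[List[CellWrite]] = []
    for_each_change(mutation, changes.append)
    return changes


if __name__ == "__main__":
    m = [
        CellWrite("a", 1, 10, None),
        CellWrite("b", 2, 20, None),
        CellWrite("c", 3, 10, None),
        CellWrite("d", 4, 10, 60),
    ]
    print(should_split(m))
    for_each_change(m, lambda change: print([w.column for w in change]))
```

In this sketch `split` is a thin wrapper over `for_each_change`: the callback form lets a caller handle each change as it is produced and keep shared context in the closure, instead of first building a list of all changes.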


21451 Commits