scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-01 21:55:50 +00:00

Author	SHA1	Message	Date
Piotr Jastrzebski	f711cce024	sstables: Handle empty counter value in read path Due to a bug in an sstable writer, empty counters were stored without a header. Correct way of storing empty counter is to still write a header that indicates the emptiness. Next patch in this series fixes the write path but we have to make sure that we handle incorrectly serialized counters in the read path because there may exist sstables with counters stored without header. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-05-21 12:07:12 +02:00
Avi Kivity	d92973ba86	Merge "scylla-gdb.py: scylla_fiber: add fallback mode" from Botond " Add a fallback-mode that can be used when the `scylla ptr` cannot be used, either because the application is not built with the seastar allocator, or due to bugs. The fallback mode relies on a more primitive method for determining how much memory to scan looking for task pointers inside the task object. This mode, being more primitive, is less prone to errors, but is more wasteful and less precise. " * 'scylla-fiber-fallback-mode/v2' of https://github.com/denesb/scylla: scylla-gdb.py: scylla_fiber: add fallback mode scylla-gdb.py: scylla_ptr: add is_seastar_allocator_used() scylla-gdb.py: pointer_metadata: allow constructing from non-seastar pointers scylla-gdb.py: scylla_fiber: fix misaligned text in docstring	2019-05-19 18:34:55 +03:00
Takuya ASADA	4b08a3f906	reloc/python3: add license files on relocatable python3 package It's better to have license files on our python3 distribution. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <20190516094329.13273-1-syuu@scylladb.com>	2019-05-19 18:30:19 +03:00
Jesse Haber-Kucharsky	68353a8265	build: Don't build `iotune` unconditionally We compile Seastar unconditionally so that changes to Seastar files are reflected in Scylla when it's built. We don't need to unconditionally build `iotune` in the same way. `iotune` is still listed as a build artifact, so it will be built if `ninja` is invoked without a particular target. However, building a specific target (like `ninja build/dev/scylla`) will not build `iotune`. Fixes #4165 Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com> Message-Id: <9fb96a281580a8743e04d5dd11398be53960cb58.1558100815.git.jhaberku@scylladb.com>	2019-05-19 18:24:05 +03:00
Avi Kivity	5a276d44af	Merge "row_cache: Make invalidate() preemptible" from Tomasz " This patchset fixes reactor stalls caused by cache invalidation not being preemptible. This becomes a problem when there is a lot of partitions in cache inside the invalidated range. This affects high-level operations like nodetool refresh, table truncation, repair and streaming. Fixes #2683 The improvement on stalls was measured using tests/perf_row_cache_update: Before: Small partitions, no overwrites: invalidation: 339.420624 [ms], preemption: {count: 2, 99%: 0.008239 [ms], max: 339.422144 [ms]} Small partition with a few rows: invalidation: 191.855331 [ms], preemption: {count: 2, 99%: 0.008239 [ms], max: 191.856816 [ms]} Large partition, lots of small rows: invalidation: 0.959328 [ms], preemption: {count: 2, 99%: 0.008239 [ms], max: 0.961453 [ms]} After: Small partitions, no overwrites: invalidation: 400.505554 [ms], preemption: {count: 843, 99%: 0.545791 [ms], max: 0.502340 [ms]} Small partition with a few rows: invalidation: 306.352600 [ms], preemption: {count: 644, 99%: 0.545791 [ms], max: 0.506464 [ms]} Large partition, lots of small rows: invalidation: 0.963660 [ms], preemption: {count: 2, 99%: 0.009887 [ms], max: 0.963264 [ms]} The maximum scheduling latency went down from 339 ms to 0.5 ms (task quota). Tests: - unit (dev) " * tag 'cache-preemptible-invalidation-v2' of github.com:tgrabiec/scylla: row_cache: Make invalidate() preemptible row_cache: Switch _prev_snapshot_pos to be a ring_position_ext dht: Introduce ring_position_ext dht: ring_position_view: Take key by const pointer tests: perf_row_cache_update: Rename 'stall' to 'preemption' to avoid confusion tests: perf_row_cache_update: Report stalls around invalidation	2019-05-19 10:47:46 +03:00
Takuya ASADA	f625284113	dist/debian: apply product name variable on override_dh_auto_install To make product name templatization work correctly, we cannot use "debian/scylla-server" as package contents directory path, need to use template like "debian/{{product}}-server" instead. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <20190517121946.18248-1-syuu@scylladb.com>	2019-05-19 10:46:08 +03:00
Gleb Natapov	31bf4cfb5e	cache_hitrate_calculator: make cache hitrate calculation preemptable The calculation is done in a non preemptable loop over all tables, so if the number of tables is very large it may take a while since we also build a string for gossiper state. Make the loop preemptable and also make the string calculation more efficient by preallocating memory for it. Message-Id: <20190516132748.6469-3-gleb@scylladb.com>	2019-05-16 15:32:36 +02:00
Gleb Natapov	4517c56a57	cache_hitrate_calculator: do not copy stats map for each cpu invoke_on_all() copies provided function for each shard it is executed on, so by moving stats map into the capture we copy it for each shard too. Avoid it by putting it into the top level object which is already captured by reference. Message-Id: <20190516132748.6469-2-gleb@scylladb.com>	2019-05-16 15:32:24 +02:00
Dejan Mircevski	8dcb35913a	table: Avoid needless allocation of cell lockers All `table` instances currently unconditionally allocate a cell locker for counter cells, though not all need one. Since the lockers occupy quite a bit of memory (as reported in #4441), it's wasteful to allocate them when unneeded. Fixes #4441. Tests: unit (dev, debug) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Message-Id: <20190515190910.87931-1-dejan@scylladb.com>	2019-05-16 11:10:38 +03:00
Avi Kivity	5b2c8847c7	Merge "Pre timestamp based data segregation cleanup" from Botond " This series contains loosely related generic cleanup patches that the timestamp based data segregation series depends on. Most of the patches have to do with making headers self-sustainable, that is compilable on their own. This was needed to be able to ensure that the new headers introduced or touched by that series are self-sustainable too. This series also introduces `schema_fwd.hh` which contains a forward declaration of `schema` and `schema_ptr` classes. No effort was made to find and replace all existing ad-hoc schema forward declarations in the source tree. " * 'pre-timestamp-based-data-segregation-cleanup/v1' of https://github.com/denesb/scylla: encoding_stats.hh: add missing include sstables/time_window_compaction_strategy.hh: make self-sufficient sstables/size_tiered_compaction_strategy.hh: make self-sufficient sstables/compaction_strategy_impl.hh: make header self-sufficient compaction_strategy.hh: use schema_fwd.hh db/extensions.hh: use schema_fwd.hh Add schema_fwd.hh	2019-05-15 17:37:06 +03:00
Asias He	51c4f8cc47	repair: Fix use after free in remove_repair_meta for repair_metas We should capture repair_metas so that it will not be freed until the parallel_for_each is finished. Fixes: #4333 Tests: repair_additional_test.py:RepairAdditionalTest.repair_kill_1_test Message-Id: <237b20a359122a639330f9f78c67568410aef014.1557922403.git.asias@scylladb.com>	2019-05-15 17:22:51 +03:00
Calle Wilund	e7003f1051	sstable: Make all sstable components subject to file extensions Makes opening all sstable components go through same file open routine, optionally applying extensions to each (except TOC which is special). Also ensures we read Scylla metadata before other non-TOC components, as we might need this for extensions (hint hint). Message-Id: <20190513201821.14417-1-calle@scylladb.com>	2019-05-15 17:14:58 +03:00
Botond Dénes	a0010f52c5	scylla-gdb.py: scylla_fiber: add fallback mode The current implementation of the `scylla fiber` command relies on the `scylla ptr` command to provide metadata on pointers, more specifically the boundaries of the region the object they point to occupies. However, in debug mode, seastar is using the standard allocator and thus the `scylla ptr` command doesn't work. To work around this, provide a fallback mode for debug builds. This mode assumes pointers point to the start of objects and scans a configurable region of memory. While less exact than the variant relying on `scylla ptr` it still works reasonably well. The size of the to-be-scanned memory region can be set using the `--scanned-region-size` command line argument. This defaults to 512. Additionally, add a flag (`--force-fallback-mode`) to force using the fallback mode. This is useful if `scylla ptr` is not working for any reason.	2019-05-15 15:46:42 +03:00
Botond Dénes	c78d667153	scylla-gdb.py: scylla_ptr: add is_seastar_allocator_used() Determines whether the application is using the seastar allocator or not. This is done by attempting to resolve the `seastar::memory::cpu_mem` symbol. To avoid the expensive symbol lookup the result is cached. This means that loading a new inferior will possibly return the wrong value. The cache can be flushed by re-sourcing the `scylla-gdb.py` script.	2019-05-15 15:44:38 +03:00
Botond Dénes	c3a06da8fb	scylla-gdb.py: pointer_metadata: allow constructing from non-seastar pointers	2019-05-15 15:43:34 +03:00
Botond Dénes	4964671e83	scylla-gdb.py: scylla_fiber: fix misaligned text in docstring	2019-05-15 15:43:29 +03:00
Avi Kivity	8e19121e98	Merge "Implement simple selection alongside aggregation" from Dejan " Although CQL allows SELECT statements with both simple and aggregate selectors, Scylla disallows them. This patch removes that restriction and ensures that mixed simple/aggregate selection works as specified both with and without GROUP BY. Tests: unit (dev) " * 'aggregate-and-simple-select-together' of https://github.com/dekimir/scylla: cql: Fix mixed selection with GROUP BY cql: Allow mixing of aggregate and simple selectors	2019-05-14 20:03:58 +03:00
Dejan Mircevski	f9b00a4318	cql: Fix mixed selection with GROUP BY GROUP BY is currently supported by simple_selection, the class used when all selectors are simple. But when selectors are mixed, we use selection_with_processing, which does not yet support GROUP BY. This patch fixes that. It also adapts one testcase in filtering_test to the new behavior of simple_selector. The test currently expects the last value seen, but simple_selector now outputs the first value seen. (More details: the WHERE clause implicitly selects the columns it references, and unit tests are forced to provide expected values for these columns. The user-visible result is unchanged in the test; users never see the WHERE column values due to filtering in cql::transport, outside unit tests.) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-05-14 12:50:39 -04:00
Dejan Mircevski	06e3b36164	cql: Allow mixing of aggregate and simple selectors Scylla currently rejects SELECT statements with both simple and aggregate selectors, but Cassandra allows them. This patch brings parity to Scylla. Fixes #4447. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-05-14 10:34:02 -04:00
Botond Dénes	fe3b798b51	scylla-gdb.py: scylla fiber: add seastar::smp_message_queue::async_work_item to the whitelist Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <4c49fcf5391e027eae68707c9e6ab2f9188c2ea4.1557838171.git.bdenes@scylladb.com>	2019-05-14 17:09:32 +03:00
Avi Kivity	82b91c1511	Merge "gc_clock: Fix hashing to be backwards-compatible" from Tomasz " Commit `d0f9e00` changed the representation of the gc_clock::duration from int32_t to int64_t. Mutation hashing uses appending_hash<gc_clock::time_point>, which by default feeds duration::count() into the hasher. duration::rep changed from int32_t to int64_t, which changes the value of the hash. This affects schema digest and query digests, resulting in mismatches between nodes during a rolling upgrade. Fixes #4460. Refs #4485. " * tag 'fix-gc_clock-digest-v2.1' of github.com:tgrabiec/scylla: tests: Add test which verifies that schema digest stays the same tests: Add sstables for the schema digest test schema_tables, storage_service: Make schema digest insensitive to expired tombstones in empty partition db/schema_tables: Move feed_hash_for_schema_digest() to .cc file hashing: Introduce type-erased interface for the hasher hashing: Introduce C++ concept for the hasher hashers: Rename hasher to cryptopp_hasher gc_clock: Fix hashing to be backwards-compatible	2019-05-14 16:59:50 +03:00
Tomasz Grabiec	285ada5035	Merge "config: remove _make_config_values macro" from Avi The _make_config_values macro reduces duplication (both the item name and the types need to be available as C++ identifiers and as runtime strings), but is hard to work with. The macro is huge and editors don't handle it well, errors aren't identified at the correct location, and since the macro doesn't have types, it's hard to refactor. This series replaces the macro with ordinary C++ code. Some repetition is introduced, but IMO the result is easier to maintain than the macro. As a bonus the bulk of the code is moved away from the header file. Tests: unit (dev), manual testing of the config REST API * https://github.com/avikivity/scylla config-no-macro/v2 config: make the named_value type name available without requiring _make_config_values config: remove value_status from named_value template parameter list config: add named_value::value_as_json() api: config: stop using _make_config_values config: auto-add named_values into config_file config: add allowed_values parameter to named_value constructor config: convert _make_config_values to individual named_value member declarations and initializers	2019-05-14 16:00:23 +03:00
Avi Kivity	987739898f	docs: document SSTable Scylla.db component Document the format and meaning of the various bits of the Scylla.db component. Message-Id: <20190513081605.7394-1-avi@scylladb.com>	2019-05-14 16:00:23 +03:00
Avi Kivity	786ce70dfc	doc: mention the Slack workspace as a place to get help Message-Id: <20190514090420.5598-1-avi@scylladb.com>	2019-05-14 16:00:23 +03:00
Botond Dénes	c2ec78358b	encoding_stats.hh: add missing include	2019-05-14 13:27:30 +03:00
Botond Dénes	eeacf45b4a	sstables/time_window_compaction_strategy.hh: make self-sufficient	2019-05-14 13:27:30 +03:00
Botond Dénes	9953cecc83	sstables/size_tiered_compaction_strategy.hh: make self-sufficient	2019-05-14 13:27:30 +03:00
Botond Dénes	d02c2253a5	sstables/compaction_strategy_impl.hh: make header self-sufficient Add missing includes and forward declarations. De-inline some methods.	2019-05-14 13:27:30 +03:00
Botond Dénes	20d9d18ab3	compaction_strategy.hh: use schema_fwd.hh	2019-05-14 13:27:30 +03:00
Botond Dénes	690ef09b8f	db/extensions.hh: use schema_fwd.hh	2019-05-14 13:27:30 +03:00
Botond Dénes	48bf1d5629	Add schema_fwd.hh	2019-05-14 13:27:30 +03:00
Tomasz Grabiec	6159d5522d	tests: Add test which verifies that schema digest stays the same (cherry picked from commit `8019634dba`)	2019-05-14 10:43:06 +02:00
Tomasz Grabiec	815295547d	tests: Add sstables for the schema digest test Generated by running test_schema_digest_does_not_change with regenerate set to true. (cherry picked from commit `1f2995c8c5`)	2019-05-14 10:43:06 +02:00
Tomasz Grabiec	9de071d214	schema_tables, storage_service: Make schema digest insensitive to expired tombstones in empty partition Schema digest is calculated by querying for mutations of all schema tables, then compacting them so that all tombstones in them are dropped. However, even if the mutation becomes empty after compaction, we still feed its partition key. If the same mutations were compacted prior to the query, because the tombstones expire, we won't get any mutation at all and won't feed the partition key. So schema digest will change once an empty partition of some schema table is compacted away. That's not a problem during normal cluster operation because the tombstones will expire at all nodes at the same time, and schema digest, although changes, will change to the same value on all nodes at about the same time. This fix changes digest calculation to not feed any digest for partitions which are empty after compaction. The digest returned by schema_mutations::digest() is left unchanged by this patch. It affects the table schema version calculation. It's not changed because the version is calculated on boot, where we don't yet know all the cluster features. It's possible to fix this but it's more complicated, so this patch defers that. Refs #4485. Asd	2019-05-14 10:43:06 +02:00
Tomasz Grabiec	3a4a903674	db/schema_tables: Move feed_hash_for_schema_digest() to .cc file	2019-05-14 10:43:06 +02:00
Tomasz Grabiec	b0eecdcb8f	hashing: Introduce type-erased interface for the hasher The motivation is to allow hiding the definition of functions accepting a hasher. For one, this reduces (re)compilation times, because we can put the definition in .cc	2019-05-14 10:43:06 +02:00
Avi Kivity	1cf72b39a5	Merge "Unbreak the Unbreakable Linux" from Glauber " scylla_setup is currently broken for OEL. This happens because the OS detection code checks for RHEL and Fedora. CentOS returns itself as RHEL, but OEL does not. " * 'unbreakable' of github.com:glommer/scylla: scylla_setup: be nicer about unrecognized OS scylla_util: recognize OEL as part of the RHEL family	2019-05-13 21:38:21 +03:00
Glauber Costa	3b64727244	scylla_setup: be nicer about unrecognized OS Right now if the user tries to execute this in an unrecognized OS, the following will be thrown: Traceback (most recent call last): File "/usr/lib/scylla/libexec/scylla_setup", line 214, in <module> do_verify_package('scylla-enterprise-jmx') File "/usr/lib/scylla/libexec/scylla_setup", line 73, in do_verify_package if res != 0: UnboundLocalError: local variable 'res' referenced before assignment It would be a lot nicer to exit gracefully and print a message saying what is going on. This was caught when running on OEL, which the previous patch fixed. Still, there are other unknown OS out there the users may try to run on. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2019-05-13 14:31:49 -04:00
Glauber Costa	6c15ae5b36	scylla_util: recognize OEL as part of the RHEL family Oracle Linux is a RHEL-like distribution and we support it just fine, but our new incarnation of scylla_setup is failing to recognize it. os-release for OEL is a bit different. It doesn't have an ID_LIKE string, and only shows an ID string, which is set to 'ol'. So let's recognize this. Fixes: #4493 Branches: 3.1 Signed-off-by: Glauber Costa <glauber@scylladb.com>	2019-05-13 14:31:38 -04:00
Tomasz Grabiec	77fb34821b	row_cache: Make invalidate() preemptible This change inserts preemption points between removal of partitions. The main complication is in maintaining consistency in the face of concurrent population or eviction. We use the same mechanism which is used by memtable updates. _prev_snapshot_pos is the ring position which partitions the ring into the part which is already updated in cache and the one which is yet to be updated. That position should be set accordingly on preemption. In case of invalidation, updating means removing all entries in the range and marking the range as discontinuous. When resuming invalidation of a range we continue from _prev_snapshot_pos as the lower bound. This affects high-level operations like nodetool refresh, table truncation, repair and streaming. Fixes #2683 The improvement on stalls was measured using tests/perf_row_cache_update: Before: Small partitions, no overwrites: invalidation: 339.420624 [ms], preemption: {count: 2, 99%: 0.008239 [ms], max: 339.422144 [ms]} Small partition with a few rows: invalidation: 191.855331 [ms], preemption: {count: 2, 99%: 0.008239 [ms], max: 191.856816 [ms]} Large partition, lots of small rows: invalidation: 0.959328 [ms], preemption: {count: 2, 99%: 0.008239 [ms], max: 0.961453 [ms]} After: Small partitions, no overwrites: invalidation: 400.505554 [ms], preemption: {count: 843, 99%: 0.545791 [ms], max: 0.502340 [ms]} Small partition with a few rows: invalidation: 306.352600 [ms], preemption: {count: 644, 99%: 0.545791 [ms], max: 0.506464 [ms]} Large partition, lots of small rows: invalidation: 0.963660 [ms], preemption: {count: 2, 99%: 0.009887 [ms], max: 0.963264 [ms]} The maximum scheduling latency went down from 339 ms to 0.5 ms (task quota).	2019-05-13 19:32:00 +02:00
Tomasz Grabiec	595e1a540e	row_cache: Switch _prev_snapshot_pos to be a ring_position_ext dht::ring_position cannot represent all ring_position_view instances, in particular those obtained from dht::ring_position_view::for_range_start(). To allow using the latter, switch to views.	2019-05-13 19:30:50 +02:00
Tomasz Grabiec	1530224377	dht: Introduce ring_position_ext It's an owning version of ring_position_view. Note that ring_position has a narrower domain than the ring_position_view for historical reasons, so we cannot use that.	2019-05-13 19:30:50 +02:00
Tomasz Grabiec	b08180c7fa	dht: ring_position_view: Take key by const pointer	2019-05-13 19:30:39 +02:00
Tomasz Grabiec	ed697306be	tests: perf_row_cache_update: Rename 'stall' to 'preemption' to avoid confusion	2019-05-13 19:18:20 +02:00
Tomasz Grabiec	b516e5fdbf	tests: perf_row_cache_update: Report stalls around invalidation	2019-05-13 10:47:03 +02:00
Avi Kivity	a8b3cb8a28	Update seastar submodule * seastar f73690e...3f7a5e1 (7): > Revert "Make sure all allocations/deallocations are properly byte aligned" > http: fix request content for POST requests > doc: discourage generic lambdas and unconstrained templates > smp: add smp_service_group for smp::submit_to() resource control > Revert "smp: add smp_service_group for smp::submit_to() resource control" > smp: add smp_service_group for smp::submit_to() resource control > Make sure all allocations/deallocations are properly byte aligned	2019-05-12 13:32:41 +03:00
Tomasz Grabiec	fd349a3c65	hashing: Introduce C++ concept for the hasher	2019-05-10 12:54:30 +02:00
Tomasz Grabiec	5c2f5b522d	hashers: Rename hasher to cryptopp_hasher So that we can introduce a truly generic interface named "hasher".	2019-05-10 12:54:08 +02:00
Tomasz Grabiec	b7ece4b884	gc_clock: Fix hashing to be backwards-compatible Commit `d0f9e00` changed the representation of the gc_clock::duration from int32_t to int64_t. Mutation hashing uses appending_hash<gc_clock::time_point>, which by default feeds duration::count() into the hasher. duration::rep changed from int32_t to int64_t, which changes the value of the hash. This affects schema digest and query digests, resulting in mismatches between nodes during a rolling upgrade. Fixes #4460. (cherry picked from commit `549d0eb2f3`)	2019-05-10 12:48:46 +02:00
Avi Kivity	fdace36fa5	Merge "Fixes for GCC9 build" from Paweł " This series contains fixes for GCC9 build, mostly corrections needed after changes in libstdc++. With this series and a workaround for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90415 (not included) Scylla builds and passes unit tests with GCC9 (tested on Fedora 30, development mode only). Tests: unit(dev with gcc8 and gcc9). " * tag 'gcc9-fixes/v1' of https://github.com/pdziepak/scylla: tests/imr: add missing noexcept counters: bytes_view::pointer is not const pointer imr/fundamental: use bytes_view::const_pointer for const pointer	2019-05-09 21:51:24 +03:00
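
The OS-detection problem described in commits `6c15ae5b36` and `3b64727244` can be illustrated with a minimal sketch. This is not the actual `scylla_util` code: the function names, the `RHEL_FAMILY_IDS` set, and the sample os-release strings are all hypothetical and only mirror what the commit messages state (OEL has no `ID_LIKE` and sets `ID` to `ol`, while CentOS identifies itself as RHEL-like).

```python
# Hypothetical sketch of RHEL-family detection from os-release content.
# None of these names come from scylla_util; they are illustrative only.

# IDs treated as RHEL-like. 'ol' is the ID the commit message reports for OEL.
RHEL_FAMILY_IDS = {"rhel", "fedora", "ol"}

# Illustrative os-release fragments (assumed, not copied from real systems).
OEL_SAMPLE = 'NAME="Oracle Linux Server"\nID="ol"\n'
CENTOS_SAMPLE = 'NAME="CentOS Linux"\nID="centos"\nID_LIKE="rhel fedora"\n'
UNKNOWN_SAMPLE = 'NAME="Some Other Linux"\nID=other\n'


def parse_os_release(text):
    """Parse KEY=value lines into a dict, dropping surrounding quotes."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def is_rhel_family(fields):
    """Check ID and every word of ID_LIKE against the known family IDs."""
    ids = {fields.get("ID", "")}
    ids.update(fields.get("ID_LIKE", "").split())
    return bool(ids & RHEL_FAMILY_IDS)


def describe(text):
    """Return a readable verdict instead of failing on an unknown OS."""
    fields = parse_os_release(text)
    if is_rhel_family(fields):
        return "RHEL family"
    return "unrecognized OS: " + fields.get("ID", "unknown")


if __name__ == "__main__":
    for sample in (OEL_SAMPLE, CENTOS_SAMPLE, UNKNOWN_SAMPLE):
        print(describe(sample))
```

Checking `ID` alongside `ID_LIKE` is what lets a distribution without an `ID_LIKE` entry be recognized, and returning an explicit verdict for the unknown case avoids the kind of unassigned-variable failure shown in the traceback above.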


18679 Commits