scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 20:27:03 +00:00

Author	SHA1	Message	Date
Duarte Nunes	1953c5fa61	Merge 'Fix filtering with LIMIT' from Piotr " This series adds proper handling of filtering queries with LIMIT. Previously the limit was erroneously applied before filtering, which leads to truncated results. To avoid that, paged filtering queries now use an enhanced pager, which remembers how many rows were dropped and uses that information to fetch more pages if the limit is not yet reached. For unpaged filtering queries, paging is done internally as in case of aggregations to avoid keeping huge results in memory. Also, previously, all limited queries used the page size counted from max(page size, limit). It's not good for filtering, because with LIMIT 1 we would then query for rows one-by-one. To avoid that, filtered queries ask for the whole page and the results are truncated if need be afterwards. Tests: unit (release) " * 'fix_filtering_with_limit_2' of https://github.com/psarna/scylla: tests: add filtering with LIMIT test tests: split filtering tests from cql_query_test cql3: add proper handling of filtering with LIMIT service/pager: use dropped_rows to adjust how many rows to read service/pager: virtualize max_rows_to_fetch function cql3: add counting dropped rows in filtering pager (cherry picked from commit `1afda28cf3`)	2018-12-02 12:07:46 +02:00
Duarte Nunes	b72a94b53e	Merge 'Fix checking if system tables need view updates' from Piotr " This miniseries ensures that system tables are not checked for having view updates, because they never do. What's more, distributed system table is used in the process, so it's unsafe to query the table while streaming it. Tests: unit (release), dtest(update_cluster_layout_tests.py:TestUpdateClusterLayout.simple_decommission_node_2_test) " * 'fix_checking_if_system_tables_need_view_updates_3' of https://github.com/psarna/scylla: streaming: don't check view building of system tables database: add is_internal_keyspace streaming: remove unused sstable_is_staging bool class (cherry picked from commit `d09d4bbd91`)	2018-11-28 15:39:34 +00:00
Piotr Sarna	3f82b697f2	main: fix deinitialization order for view update generator View update generator should be stopped only after drain_on_shutdown() is performed on storage service. Message-Id: <4d2bda4c73422a2ebf46d6dcd06c95d960839889.1543230849.git.sarna@scylladb.com> (cherry picked from commit `6ab8235369`)	2018-11-27 12:34:50 +00:00
Takuya ASADA	ee1ef853e5	dist/common/systemd/scylla-housekeeping-restart.service.mustache: specify correct repo for Debian variants We do specify correct repo for both Red Hat/Debian variants on -daily, but mistakenly don't for -restart, so do same on -restart. Fixes #3906 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <20181109224509.27380-1-syuu@scylladb.com> (cherry picked from commit `7740cd2142`)	2018-11-27 09:59:05 +02:00
Raphael S. Carvalho	6e7e7f3822	sstables: deprecate sstable metadata's ancestors The reason for that is that it's not available in sstable format mc, so we can no longer rely on it in common code for the currently supported formats. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20181121170057.20900-1-raphaelsc@scylladb.com> (cherry picked from commit `d29482dce8`)	2018-11-24 12:36:40 +02:00
Paweł Dziepak	82a36edc9d	Merge "Optimize sstable writing of the MC format" from Tomasz " Tested with perf_fast_forward from: github.com/tgrabiec/scylla.git perf_fast_forward-for-sst3-opt-write-v1 Using the following command line: build/release/tests/perf/perf_fast_forward_g --populate --sstable-format=mc \ --data-directory /tmp/perf-mc --rows=10000000 -c1 -m4G \ --datasets small-part The average reported flush throughput was (stdev for the averages is around 4k): - for mc before the series: 367848 frag/s - for lc before the series: 463458 frag/s (= mc.before +25%) - for mc after the series: 429276 frag/s (= mc.before +16%) - for lc after the series: 466495 frag/s (= mc.before +26%) Refs #3874. " * tag 'sst3-opt-write-v2' of github.com:tgrabiec/scylla: sstables: mc: Avoid serialization of promoted index when empty sstables: mc: Avoid double serialization of rows tests: sstable 3.x: Do not compare Statistics component utils: Introduce memory_data_sink schema: Optimize column count getters sstables: checksummed_file_data_sink_impl: Bypass output_stream (cherry picked from commit `4aa5d83590`)	2018-11-24 12:36:40 +02:00
Avi Kivity	d4efa3c9b2	Update seastar submodule * seastar d6647df...880826e (1): > fstream: Introduce make_file_data_sink()	2018-11-24 12:36:40 +02:00
Avi Kivity	324dae3e12	Merge "compress: Restore lz4 as default compressor" from Duarte " Enables sstable compression with LZ4 by default, which was the long-time behavior until a regression turned off compression by default. Fixes #3926 " * 'restore-default-compression/v2' of https://github.com/duarten/scylla: tests/cql_query_test: Assert default compression options compress: Restore lz4 as default compressor tests: Be explicit about absence of compression (cherry picked from commit `bb85a21a8f`)	2018-11-21 16:45:22 +02:00
Tomasz Grabiec	c0ffc9a2b7	utils: phased_barrier: Make advance_and_await() have strong exception guarantees Currently, when advance_and_await() fails to allocate the new gate object, it will throw bad_alloc and leave the phased_barrier object in an invalid state. Calling advance_and_await() again on it will result in undefined behavior (typically SIGSEGV) because _gate will be disengaged. One place affected by this is table::seal_active_memtable(), which calls _flush_barrier.advance_and_await(). If this throws, subsequent flush attempts will SIGSEGV. This patch rearranges the code so that advance_and_await() has strong exception guarantees. Message-Id: <1542645562-20932-1-git-send-email-tgrabiec@scylladb.com> Fixes #3931. (cherry picked from commit `57e25fa0f8`)	2018-11-21 12:17:27 +02:00
Glauber Costa	f81fa5f75c	remove monitor if sstable write failed In (almost) all SSTable write paths, we need to inform the monitor that the write has failed as well. The monitor will remove the SSTable from controller's tracking at that point. Except there is one place where we are not doing that: streaming of big mutations. Streaming of big mutations is an interesting use case, in which it is done in 2 parts: if the writing of the SSTable fails right away, then we do the correct thing. But the SSTables are not committed at that point and the monitors are still kept around with the SSTables until a later time, when they are finally committed. Between those two points in time, it is possible that the streaming code will detect a failure and manually call fail_streaming_mutations(), which marks the SSTable for deletion. At that point we should propagate that information to the monitor as well, but we don't. Fixes #3732 (hopefully) Tests: unit (release) Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20181114213618.16789-1-glauber@scylladb.com> (cherry picked from commit `9f403334c8`)	2018-11-20 19:27:54 +02:00
Glauber Costa	6fd1cfcfce	sstables: correctly parse estimated histograms In commit `a33f0d6`, we changed the way we handle arrays during the write and parse code to avoid reactor stalls. Some potentially big loops were transformed into futurized loops, and also some calls to vector resizes were replaced by a reserve + push_back idiom. The latter broke parsing of the estimated histogram. The reason being that the vectors that are used here are already initialized internally by the estimated_histogram object. Therefore, when we push_back, we don't fill the array all the way from index 0, but end up with a zeroed beginning and only push back some of the elements we need. We could revert this array to a resize() call. After all, the reason we are using reserve + push_back is to avoid calling the constructor member for each element, but we don't really expect the integer specialization to do any of that. However, to avoid confusion with future developers that may feel tempted to convert this as well for the sake of consistency, it is safer to just make sure these arrays are zeroed. Fixes #3918 Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20181116130853.10473-1-glauber@scylladb.com> (cherry picked from commit `c6811bd877`)	2018-11-17 17:20:00 +02:00
Nadav Har'El	9d458ffea9	Materialized Views and Secondary Index: no longer experimental After this patch, the Materialized Views and Secondary Index features are considered generally-available and no longer require passing an explicit "--experimental=on" flag to Scylla. The "--experimental=on" flag and the db::config::check_experimental() function remain unused, as we graduated the only two features which used this flag. However, we leave the support for experimental features in the code, to make it easier to add new experimental features in the future. Another reason to leave the command-line parameter behind is so existing scripts that still use it will not break. Fixes #3917 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20181115144456.25518-1-nyh@scylladb.com> (cherry picked from commit `78ed7d6d0c`)	2018-11-15 19:50:30 +02:00
Duarte Nunes	9776a048e7	Merge 'Generating view updates during streaming' from Piotr During streaming, there are cases when we should invoke the view write path. In particular, if we're streaming because of repair or if a view has not yet finished building and we're bootstrapping a new node. The design constraints are: 1) The streamed writes should be visible to new writes, but the sstable should not participate in compaction, or we would lose the ability to exclude the streamed writes on a restart; 2) The streamed writes must not be considered when generating view updates for them; 3) Resilient to node restarts; 4) Resilient to concurrent stream sessions, possibly streaming mutations for overlapping ranges. We achieve this by writing the streamed writes to an sstable in a different folder, call it "staging". We achieve 1) by publishing the sstable to the column family sstable set, but excluding it from compactions. We do these steps upon boot, by looking at the staging directory, thus achieving 3). 
Fixes #3275 * 'streaming_view_to_staging_sstables_9' of https://github.com/psarna/scylla: (29 commits) tests: add materialized views test tests: add view update generator to cql test env main: add registering staging sstables read from disk database: add a check if loaded sstable is already staging database: add get_staging_sstable method streaming: stream tables with views through staging sstables streaming: add system distributed keyspace ref to streaming streaming: add view update generator reference to streaming main: add generating missed mv updates from staging sstables storage_service: move initializing sys_dist_ks before bootstrap db/view: add view_update_from_staging_generator service db/view: add view updating consumer table: add stream_view_replica_updates table: split push_view_replica_updates table: add as_mutation_source_excluding table: move push_view_replica_updates to table.cc database: add populating tables with staging sstables database: add creating /staging directory for sstables database: add sstable-excluding reader table: add move_sstable_from_staging_in_thread function ... (cherry picked from commit `a38f6078fb`)	2018-11-15 17:46:20 +02:00
Asias He	10cf97375e	streaming: Expose reason for streaming On receiving a mutation_fragment or a mutation triggered by a streaming operation, we pass an enum stream_reason to notify the receiver what the streaming is used for. So the receiver can decide further operation, e.g., send view updates, beyond applying the streaming data on disk. Fixes #3276 Message-Id: <f15ebcdee25e87a033dcdd066770114a499881c0.1539498866.git.asias@scylladb.com> (cherry picked from commit `7f826d3343`)	2018-11-15 17:45:31 +02:00
Paweł Dziepak	e6355a9a01	Merge "Write static rows for all partitions if there are static columns" from Vladimir " It appears that in case when there are any static columns in serialization header, Cassandra would write a (possibly empty) static row to every partition in the SSTables file. This patchset aligns Scylla's logic with that of Cassandra. Note that Cassandra optimizes the case when no partition contains a static row because it keeps track of updated columns that Scylla currently does not do - see #3901 for details. Fixes #3900. " * 'projects/sstables-30/write-all-static-rows/v1' of https://github.com/argenet/scylla: tests: Test writing empty static rows for partitions in tables with static columns. sstables: Ignore empty static rows on reading. sstables: Write empty static rows when there are static columns in the table. (cherry picked from commit `6469a1b451`)	2018-11-12 15:59:35 -08:00
Raphael S. Carvalho	e57907a1d5	sstables: fix procedure to get fully expired sstables with MC format MC format lacks ancestors metadata, so we need to work around it by using ancestors in metadata collector, which is only available for an sstable written during this instance. It works fine here because we only want to know if an sstable recently compacted has an ancestor which wasn't yet deleted. Fixes #3852. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Reviewed-by: Vladimir Krivopalov <vladimir@scylladb.com> Message-Id: <20181102154951.22950-1-raphaelsc@scylladb.com> (cherry picked from commit `1c5934c934`)	2018-11-06 16:03:18 +02:00
Pekka Enberg	f94b46e7e0	docker: Switch to 3.0 RPM repository	2018-11-01 19:40:10 +02:00
Avi Kivity	6847c12668	Merge "dist: use perftune.py for disks tuning" from Vlad " Use perftune.py for tuning disks: - Distribute/pin disks' IRQs: - For NVMe drives: evenly among all present CPUs. - For non-NVMe drives: according to chosen tuning mode. - For all disks used by scylla: - Tune nomerges - Tune I/O scheduler. It's important to tune NIC and disks together in order to keep IRQ pinning in the same mode. Disks are detected and tuned based on the current content of /etc/scylla/scylla.yaml configuration file. " Fixes #3831. * 'use_perftune_for_disks-v3' of https://github.com/vladzcloudius/scylla: dist: change the sysconfig parameter name to reflect the new semantics scylla_util.py::sysconfig_parser: introduce has_option() dist: scylla_setup and scylla_sysconfig_setup: change parameters names to reflect new semantics dist: don't distribute posix_net_conf.sh any more dist: use perftune.py to tune disks and NIC (cherry picked from commit `f170e3e589`)	2018-11-01 19:19:04 +02:00
Avi Kivity	80b86def1f	Update seastar submodule * seastar 0c8a2c8...d6647df (3): > scripts: perftune.py: properly merge parameters from the command line and the configuration file > scripts: perftune.py: prioritize I/O schedulers > Merge "scripts: perftune.py: support different I/O schedulers" from Vlad Ref #3831.	2018-11-01 19:18:07 +02:00
Vlad Zolotarov	c6de9ea39b	config: enable hinted handoff by default Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <20181019180401.12400-1-vladz@scylladb.com> (cherry picked from commit `4d1bb719a4`)	2018-11-01 10:41:44 +02:00
Avi Kivity	94bed81c1d	Update seastar submodule * seastar 39b89de...0c8a2c8 (1): > prometheus: Allow preemption between each metric See scylladb/seastar#469.	2018-10-31 19:21:21 +02:00
Hagit Segev	0f3a21f0bb	release: prepare for 3.0-rc1 scylla-3.0.rc1	2018-10-31 12:08:43 +02:00
Tomasz Grabiec	976db7e9e0	Merge "Proper support for static rows in SSTables 3.x" from Vladimir This patchset addresses two issues with static rows support in SSTables 3.x. ('mc' format): 1. Since collections are allowed in static rows, we need to check for complex deletion, set corresponding flag and write tombstones, if any. 2. Column indices need to be partitioned for static columns the same way they are partitioned for regular ones. * github.com/argenet/scylla.git projects/sstables-30/columns-proper-order-followup/v1: sstables: Partition static columns by atomicity when reading/writing SSTables 3.x. sstables: Use std::reference_wrapper<> instead of a helper structure. sstables: Check for complex deletion when writing static rows. tests: Add/fix comments to test_write_interleaved_atomic_and_collection_columns. tests: Add test covering interleaved atomic and collection cells in static row. (cherry picked from commit `62c7685b0d`)	2018-10-30 14:51:21 +01:00
Nadav Har'El	996b86b804	Materialized views: fix race condition in resharding while view building When a node reshards (i.e., restarts with a different number of CPUs), and is in the middle of building a view for a pre-existing table, the view building needs to find the right token from which to start building on all shards. We ran the same code on all shards, hoping they would all make the same decision on which token to continue. But in some cases, one shard might make the decision, start building, and make progress - all before a second shard goes to make the decision, which will now be different. This resulted, in some rare cases, in the new materialized view missing a few rows when the build was interrupted with a resharding. The fix is to add the missing synchronization: All shards should make the same decision on whether and how to reshard - and only then should start building the view. Fixes #3890 Fixes #3452 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20181028140549.21200-1-nyh@scylladb.com> (cherry picked from commit `b8337f8c9d`)	2018-10-29 09:52:25 +00:00
Avi Kivity	b7b217cc43	Merge "Re-order columns when reading/writing SSTables 3.x" from Vladimir " In Cassandra, row columns are stored in a BTree that uses the following ordering on them: - all atomic columns go first, then all multi-cell ones - columns of both types (atomic and multi-cell) are lexicographically ordered by name regarding each other Scylla needs to store columns and their respective indices using the same ordering as well as when reading them back. Fixes #3853 Tests: unit {release} + Checked that the following SSTables are dumped fine using Cassandra's sstabledump: cqlsh:sst3> CREATE TABLE atomic_and_collection3 ( pk int, ck int, rc1 text, rc2 list<text>, rc3 text, rc4 list<text>, rc5 text, rc6 list<text>, PRIMARY KEY (pk, ck)) WITH compression = {'sstable_compression': ''}; cqlsh:sst3> INSERT INTO atomic_and_collection3 (pk, ck, rc1, rc4, rc5) VALUES (0, 0, 'hello', ['beautiful','world'], 'here'); << flush >> sstabledump: [ { "partition" : { "key" : [ "0" ], "position" : 0 }, "rows" : [ { "type" : "row", "position" : 96, "clustering" : [ 0 ], "liveness_info" : { "tstamp" : "1540599270139464" }, "cells" : [ { "name" : "rc1", "value" : "hello" }, { "name" : "rc5", "value" : "here" }, { "name" : "rc4", "deletion_info" : { "marked_deleted" : "1540599270139463", "local_delete_time" : "1540599270" } }, { "name" : "rc4", "path" : [ "45e22cb0-d97d-11e8-9f07-000000000000" ], "value" : "beautiful" }, { "name" : "rc4", "path" : [ "45e22cb1-d97d-11e8-9f07-000000000000" ], "value" : "world" } ] } ] } ] " * 'projects/sstables-30/columns-proper-order/v1' of https://github.com/argenet/scylla: tests: Test interleaved atomic and multi-cell columns written to SSTables 3.x. sstables: Re-order columns (atomic first, then collections) for SSTables 3.x. sstables: Use a compound structure for storing information used for reading columns. (cherry picked from commit `75dbff984c`)	2018-10-28 15:51:47 +02:00
Tomasz Grabiec	c274430933	Merge "Properly write static rows missing columns for SSTables 3.x." from Vladimir Before this fix, write_missing_columns() helper would always deal with regular columns even when writing static rows. This would cause errors on reading those files. Now, the missing columns are written correctly for regular and static rows alike. * github.com/argenet/scylla.git projects/sstables-30/fix-writing-static-missing-columns/v1: schema: Add helper method returning the count of columns of specified kind. sstables: Honour the column kind when writing missing columns in 'mc' format. tests: Add test for a static row with missing columns (SStables 3.x.). (cherry picked from commit `cf2d5c19fb`)	2018-10-26 13:30:12 +03:00
Avi Kivity	893a18a7c4	Merge "Properly writing/reading shadowable deletions with SSTables 3.x." from Vladimir " This patchset addresses two problems with shadowable deletions handling in SSTables 3.x. ('mc' format). Firstly, we previously did not set a flag indicating the presence of extended flags byte with HAS_SHADOWABLE_DELETION bitmask on writing. This would break subsequent reading and cause all types of failures up to crash. Secondly, when reading rows with this extended flag set, we need to preserve that information and create a shadowable_tombstone for the row. Tests: unit {release} + Verified manually with 'hexdump' and using modified 'sstabledump' that second (shadowable) tombstone is written for MV tables by Scylla. + DTest (materialized_views_test.py:TestMaterializedViews.hundred_mv_concurrent_test) that originally failed due to this issue has successfully passed locally. " * 'projects/sstables-30/shadowable-deletion/v4' of https://github.com/argenet/scylla: tests: Add tests writing both regular and shadowable tombstones to SSTables 3.x. tests: Add test covering writing and reading a shadowable tombstone with SSTables 3.x. sstables: Support Scylla-specific extension for writing shadowable tombstones. sstables: Introduce a feature for shadowable tombstones in Scylla.db. memtable: Track regular and shadowable tombstones separately in encoding_stats_collector. sstables: Error out when reading SSTables 3.x with Cassandra shadowable deletion. sstables: Support checking row extension flags for Cassandra shadowable deletion. (cherry picked from commit `8210f4c982`)	2018-10-24 19:32:57 +03:00
Tomasz Grabiec	39b39058fc	sstable_mutation_reader: Do not read partition index when scanning Even when we're using a full clustering range, need_skip() will return true when we start a new partition and advance_context() will be called with position_in_partition::before_all_clustered_rows(). We should detect that there is no need to skip to that position before the call to advance_to(*_current_partition_key), which will read the index page. Fixes #3868. Message-Id: <1539881775-8578-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `9e756d3863`)	2018-10-24 19:32:40 +03:00
Avi Kivity	6bf4a73d88	thrift: limit message size Limit message size according to the configuration, to avoid a huge message from allocating all of the server's memory. We also need to limit memory used in aggregate by thrift, but that is left to another patch. Fixes #3878. Message-Id: <20181024081042.13067-1-avi@scylladb.com> (cherry picked from commit `a9836ad758`)	2018-10-24 19:32:25 +03:00
Gleb Natapov	ca4846dd63	stream_session: remove unused capture 'Consumer function' parameter for distribute_reader_and_consume_on_shards() captures schema_ptr (which is a seastar::shared_ptr), but the function is later copied on another shard at which point schema_ptr is also copied and its counter is incremented by the wrong shard. The capture is not even used, so let's just drop it. Fixes #3838 Message-Id: <20181011075500.GN14449@scylladb.com> (cherry picked from commit `ceb361544a`)	2018-10-24 09:47:02 +03:00
Takuya ASADA	2663ff7bc1	dist/common/sysctl.d: add new conf file to set fs.aio-max-nr We need to raise fs.aio-max-nr to a larger value since Seastar may allocate more than 65535 AIO events (= kernel default value) Fixes #3842 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <20181023030449.15445-1-syuu@scylladb.com> (cherry picked from commit `950dbdb466`)	2018-10-24 09:45:51 +03:00
Tomasz Grabiec	043a575fcd	Merge "Correctly handle dropped columns in SSTable 3" from Piotr J. Previously we were making assumptions about missing columns (the size of its value, whether it's a collection or a counter) but they didn't have to be always true. Now we're using column type from serialization header to use the right values. Fixes #3859 * seastar-dev.git haaawk/projects/sstables-30/handling-dropped-columns/v4: sstables 3: Correctly handle dropped columns in column_translation sstables 3: Add test for dropped columns handling (cherry picked from commit `fc37b80d24`)	2018-10-24 09:45:25 +03:00
Vlad Zolotarov	00dc400993	storage_proxy::query_result_local: create a single tracing span on a replica shard Every call of a tracing::global_trace_state_ptr object instead of a tracing::tracing_state_ptr or a call to tracing::global_trace_state_ptr::get() creates a new tracing session (span) object. This should never be done unless query handling moves to a different shard. Fixes #3862 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <20181018003500.10030-1-vladz@scylladb.com> (cherry picked from commit `a87c11bad2`)	2018-10-24 09:45:14 +03:00
Duarte Nunes	522a48a244	Merge 'Fix for a select statement with filtered columns' from Eliran " This patchset fixes #3803. When a select statement with filtering is executed and the column that is needed for the filtering is not present in the select clause, rows that should have been filtered out according to this column will still be present in the result set. Tests: 1. The testcase from the issue. 2. Unit tests (release) including the newly added test from this patchset. " * 'issues/3803/v10' of https://github.com/eliransin/scylla: unit test: add test for filtering queries without the filtered column cql3 unit test: add assertion for the number of serialized columns cql3: ensure retrieval of columns for filtering cql3: refactor find_idx to be part of statement restrictions object cql3: add prefix size common functionality to all clustering restrictions cql3: rename selection metadata manipulation functions (cherry picked from commit `3fe92663d4`)	2018-10-24 09:44:46 +03:00
Paweł Dziepak	5faa28ce45	cql3: restore original timeout behaviour for aggregate queries Commit `1d34ef38a8` "cql3: make pagers use time_point instead of duration" has unintentionally altered the timeout semantics for aggregate queries. Such requests fetch multiple pages before sending a response to the client. Originally, each of those fetches had a timeout-duration to finish, after the problematic commit the whole request needs to complete in a single timeout-duration. This, unsurprisingly, makes some queries that were successful before fail with a timeout. This patch restores the original behaviour. Fixes #3877. Message-Id: <20181022125318.4384-1-pdziepak@scylladb.com> (cherry picked from commit `c94d2b6aa6`)	2018-10-24 09:43:59 +03:00
Avi Kivity	52be02558e	config: mark range_request_timeout_in_ms and request_timeout_in_ms as Used This makes them available in scylla --help. Fixes #3884. Message-Id: <20181023101150.29856-1-avi@scylladb.com> (cherry picked from commit `d9e0ea6bb0`)	2018-10-24 09:43:54 +03:00
Avi Kivity	a7cbfbe63f	Merge "hinted handoff: give a sender a low priority" from Vlad " Hinted handoff should not overpower regular flows like READs, WRITEs or background activities like memtable flushes or compactions. In order to achieve this put its sending in the STREAMING CPU scheduling group and its commitlog object into the STREAMING I/O scheduling group. Fixes #3817 " * 'hinted_handoff_scheduling_groups-v2' of https://github.com/vladzcloudius/scylla: db::hints::manager: use "streaming" I/O scheduling class for reads commitlog::read_log_file(): set a read I/O priority class explicitly db::hints::manager: add hints sender to the "streaming" CPU scheduling group (cherry picked from commit `1533487ba8`)	2018-10-24 09:43:39 +03:00
Duarte Nunes	28fd2044d2	Merge 'hinted handoff: add manager::state and split storing and replaying enablement' from Vlad " Refs #3828 (Probably fixes it) We found a few flaws in a way we enable hints replaying. First of all it was allowed before manager::start() is complete. Then, since manager::start() is called after messaging_service is initialized there was a time window when hints are rejected and this creates an issue for MV. Both issues above were found in the context of #3828. This series fixes them both. Tested {release}: dtest: materialized_views_test.py:TestMaterializedViews.write_to_hinted_handoff_for_views_test dtest: hintedhandoff_additional_test.py " * 'hinted_handoff_dont_create_hints_until_started-v1' of https://github.com/vladzcloudius/scylla: hinted handoff: enable storing hints before starting messaging_service db::hints::manager: add a "started" state db::hints::manager: introduce a _state (cherry picked from commit `3a53b3cebc`)	2018-10-24 09:43:03 +03:00
Calle Wilund	76ff2e5c3d	messaging_service: Make rpc streaming sink respect tls connection Fixes #3787 Message service streaming sink was created using direct call to rpc::client::make_sink. This in turn needs a new socket, which it creates completely ignoring what underlying transport is active for the client in question. Fix by retaining the tls credential pointer in the client wrapper, and using this in a sink method to determine whether to create a new tls socket, or just go ahead with a plain one. Message-Id: <20181010003249.30526-1-calle@scylladb.com> (cherry picked from commit `3cb50c861d`)	2018-10-23 07:36:21 +00:00
Avi Kivity	7b34d54a96	locator: fix abstract_replication_strategy::get_ranges() and friends violating sort order get_ranges() is supposed to return ranges in sorted order. However, `a35136533d` broke this and returned the range that was supposed to be last in the second position (e.g. [0, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9]). This broke cleanup, which relied on the sort order to perform a binary search. Other users of the get_ranges() family did not rely on the sort order. Fixes #3872. Message-Id: <20181019113613.1895-1-avi@scylladb.com> (cherry picked from commit `1ce52d5432`)	2018-10-23 07:36:21 +00:00
Duarte Nunes	26c31f6798	Merge "db/hints: Expose current backlog" from Duarte " Hints are stored on disk by a hints::manager, ensuring they are eventually sent. A hints::resource_manager ensures the hints::managers it tracks don't consume more than their allocated resources by monitoring disk space and disabling new hints if needed. This series fixes some bugs related to the backlog calculation, but mainly exposes the backlog through a hints::manager so upper layers can apply flow control. Refs #2538 " * 'hh-manager-backlog/v3' of https://github.com/duarten/scylla: db/hints/manager: Expose current backlog db/hints/manager: Move decision about blocking hints to the manager db/hints/resource_manager: Correctly account resources in space_watchdog db/hints/resource_manager: Replace timer with seastar::thread db/hints/resource_manager: Ensure managers are correctly registered db/hints/resource_manager: Fix formatting db/hints: Disallow moving or copying the managers	2018-10-23 07:36:21 +00:00
Glauber Costa	28fa66591a	sstables: print sstable path in case of an exception Without that, we don't know where to look for the problems Before: compaction failed: sstables::malformed_sstable_exception (Too big ttl: 3163676957) After: compaction_manager - compaction failed: sstables::malformed_sstable_exception (Too big ttl: 4294967295 in sstable /var/lib/scylla/data/system_traces/events-8826e8e9e16a372887533bc1fc713c25/mc-832-big-Data.db) Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20181016181004.17838-1-glauber@scylladb.com> (cherry picked from commit `7edae5421d`)	2018-10-23 07:36:14 +00:00
Piotr Sarna	0fee1d9e43	cql3: add asking for pk/ck in the base query Base query partition and clustering keys are used to generate paging state for an index query, so they always need to be present when a paged base query is processed. Message-Id: <f3bf69453a6fd2bc842c8bdbd602d62c91cf9218.1538568953.git.sarna@scylladb.com> Fixes #3855. (cherry picked from commit `4a23297117`)	2018-10-16 19:59:42 +03:00
Piotr Sarna	76e72e28f4	cql3: add checking for may_need_paging when executing base query It's not sufficient to check for positive page_size when preparing a base query for indexed select statement - may_need_paging() should be called as well. Message-Id: <d435820019e4082a64ca9807541f0c9ad334e6a8.1538568953.git.sarna@scylladb.com> (cherry picked from commit `50d3de0693`)	2018-10-16 19:58:58 +03:00
Piotr Sarna	f969e80965	cql3: move base query command creation to a separate function Message-Id: <6b48b8cbd6312da4a17bfd3c85af628b4215e9f4.1538568953.git.sarna@scylladb.com> (cherry picked from commit `11b8831c04`)	2018-10-16 19:58:56 +03:00
Vladimir Krivopalov	2029134063	sstables: Reset opened range tombstone when moving to another partition. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com> Message-Id: <f6dc6b0bd88ca44f2ef84c2a8bee43fde82c89cc.1539396572.git.vladimir@scylladb.com> (cherry picked from commit `092276b13d`)	2018-10-15 13:26:22 +03:00
Vladimir Krivopalov	f30fe7bd17	sstables: Factor out code resetting values for a new partition. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com> Message-Id: <83a3a4ce6942b036be447bcfeb66142828e75293.1539396572.git.vladimir@scylladb.com> (cherry picked from commit `926b6430fd`)	2018-10-15 13:26:20 +03:00
Piotr Sarna	aeb418af9e	service/pager: avoid dereferencing null partition key The pager::state() function returns a valid paging object even if the pager itself is exhausted. It may also not contain the partition key, so using it unconditionally was a bug - now, in case there is no partition key present, paging state will contain an empty partition key. Fixes #3829 Message-Id: <28401eb21ab8f12645c0a33d9e92ada9de83e96b.1539074813.git.sarna@scylladb.com> (cherry picked from commit `b3685342a6`)	2018-10-15 12:47:25 +03:00
Glauber Costa	714e6d741f	api: use longs instead of ints for snapshot sizes Int types in json will be serialized to int types in C++. They will then only be able to handle 4GB, and we tend to store more data than that. Without this patch, listsnapshots is broken in all versions. Fixes: #3845 Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20181012155902.7573-1-glauber@scylladb.com> (cherry picked from commit `98332de268`)	2018-10-12 22:01:59 +03:00
Tomasz Grabiec	95c5872450	Merge "Enable sstable_mutation_test with SSTables 3.x." from Vladimir Introduce uppermost_bound() method instead of upper_bound() in mutation_fragment_filter and clustering_ranges_walker. For now, this has been only used to produce the final range tombstone for sliced reads inside consume_partition_end(). Usage of the upper bound of the current range causes problems of two kinds: 1. If not all the slicing ranges have been traversed with the clustering range walker, which is possible when the last read mutation fragment was before some of the ranges and reading was limited to a specific range of positions taken from index, the emitted range tombstone will not cover the untraversed slices. 2. At the same time, if all ranges have been walked past, the end bound is set to after_all_clustered_rows and the emitted RT may span more data than it should. To avoid both situations, the uppermost bound is used instead, which refers to the upper bound of the last range in the sequence. * github.com/scylladb/seastar-dev.git haaawk/projects/sstables-30/enable-mc-with-sstable-mutation-test/v2 sstables: Use uppermost_bound() instead of upper_bound() in mutation_fragment_filter. tests: Enable sstable_mutation_test for SSTables 'mc' format. Rebased by Piotr J. (cherry picked from commit `b89556512a`)	2018-10-12 17:46:49 +03:00


16697 Commits