This reverts part of commit 364c2551c8. I mistakenly
changed the scylla-ami submodule in addition to applying the patch. The revert
keeps the intended part of the patch and undoes the scylla-ami change.
In 4b1034b (storage_service: Remove the stream_hints), we removed the
only user of the API variant that takes the column_families parameter:
std::vector<sstring> column_families = { db::system_keyspace::HINTS };
streamer->add_tx_ranges(keyspace, std::move(ranges_per_endpoint),
column_families);
We can simplify the range_streamer code a bit by removing it.
Fixes #3476
Tests: dtest update_cluster_layout_tests.py
Message-Id: <c81d79c5e6dbc8dd78c1242837de892e39d6abd2.1528356342.git.asias@scylladb.com>
It is useful for the client driver to know which shard is serving a
particular connection, so it can send through that connection only
requests that will be served by the same shard, eliminating a hop.
Support that by advertising a "SCYLLA_SHARD" option, with a value
corresponding to the shard number.
Acked-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20180606203437.1198-1-avi@scylladb.com>
* seastar 12cffef...e7275e4 (9):
> tests: execution_stage_test: capture sg by value
> Merge "Add in-path parameter suport to the code generation" from Amnon
> Merge "Add scheduling_group inheritance to execution_stage" from Avi
> tutorial: explain how to find origin of exception
> tls: Ensure handshake always drains output before return/throw
> build: cmake: correct stdc++fs library name once more
> perftune.py: make sure config file existing before write
> Update travis-ci integration
> build: fix compilation issues on cmake. missing stdc++-fs
"
The IndexInfo table tracks the secondary indexes that have already
been populated. Since our secondary index implementation is backed by
materialized views, we can virtualize that table so queries are
actually answered by built_views.
Fixes #3483
"
* 'built-indexes-virtual-reader/v2' of github.com:duarten/scylla:
tests/virtual_reader_test: Add test for built indexes virtual reader
db/system_keysace: Add virtual reader for IndexInfo table
db/system_keyspace: Explain that table_name is the keyspace in IndexInfo
index/secondary_index_manager: Expose index_table_name()
db/legacy_schema_migrator: Don't migrate indexes
If the reader's buffer is small enough, or preemption happens often
enough, fill_buffer() may not make enough progress to advance
_lower_bound. If, in addition, iterators are constantly invalidated
across fill_buffer() calls, the reader will not be able to make progress.
See row_cache_test.cc::test_reading_progress_with_small_buffer_and_invalidation()
for an example scenario.
Also reproduced in debug-mode row_cache_test.cc::test_concurrent_reads_and_eviction
Message-Id: <1528283957-16696-1-git-send-email-tgrabiec@scylladb.com>
There is no reason to use an std::set for it since we don't care about
the ordering - only about the existence of a particular entry.
A hash table will be more efficient for this use case.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1528220892-5784-2-git-send-email-vladz@scylladb.com>
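A minimal sketch of the data-structure change described above (the set only needs existence checks; the names are illustrative, not the actual patched code):

```cpp
#include <string>
#include <unordered_set>

// Membership is all we need, so a hash set replaces the ordered std::set:
// average O(1) lookup/insert instead of O(log n) with per-node comparisons.
std::unordered_set<std::string> seen_endpoints;

bool already_seen(const std::string& endpoint) {
    return seen_endpoints.count(endpoint) > 0;
}
```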
"
As in #3423, ensuring token order on secondary index queries can be done
by adding an additional column to views that back secondary indexes.
This column is the first clustering column and contains the token value,
computed on updates.
This series also updates tests and comments referring to issue 3423.
Tests: unit (release, debug)
"
* 'order_by_token_in_si_5' of https://github.com/psarna/scylla:
cql3: update token order comments
index, tests: add token column to secondary index schema
view: add handling of a token column for secondary indexes
view: add is_index method
ec2_snitch::gossiper_starting() calls the base class (default) method,
which sets _gossip_started to TRUE and thereby prevents the following
reconnectable_snitch_helper registration.
Fixes #3454
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1528208520-28046-1-git-send-email-vladz@scylladb.com>
In 455d5a5 (streaming memtables: coalesce incoming writes), we
introduced the delayed flush to coalesce incoming streaming mutations
from different stream_plan.
However, most of the time there is only one stream plan at a time; the
next stream plan won't start until the previous one is finished. So the
current coalescing does not really work.
The delayed flush adds 2s of delay for each stream session. If we have
lots of tables to stream, we will waste a lot of time.
We stream a keyspace in around 10 stream plans, i.e., 10% of the ranges
at a time. If we have 5000 tables, even if the tables are almost empty,
the delay will waste 5000 * 10 * 2 = 100,000 seconds, i.e., around 27
hours.
To stream a keyspace with 4 tables, each with 1000 rows:
Before:
[shard 0] stream_session - [Stream #944373d0-5d9c-11e8-9cdb-000000000000] Executing streaming plan for Bootstrap-ks-index-0 with peers={127.0.0.1}, master
[shard 0] stream_session - [Stream #944373d0-5d9c-11e8-9cdb-000000000000] Streaming plan for Bootstrap-ks-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=1030 KiB, 125.21 KiB/s
[shard 0] range_streamer - Bootstrap with 127.0.0.1 for keyspace=ks succeeded, took 8.233 seconds
After:
[shard 0] stream_session - [Stream #e00bf6a0-5d99-11e8-a7b8-000000000000] Executing streaming plan for Bootstrap-ks-index-0 with peers={127.0.0.1}, master
[shard 0] stream_session - [Stream #e00bf6a0-5d99-11e8-a7b8-000000000000] Streaming plan for Bootstrap-ks-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=1030 KiB, 4772.32 KiB/s
[shard 0] range_streamer - Bootstrap with 127.0.0.1 for keyspace=ks succeeded, took 0.216 seconds
Fixes #3436
Message-Id: <cb2dde263782d2a2915ddfe678c74f9637ffd65b.1526979175.git.asias@scylladb.com>
An additional token column is now present in every view schema that
backs a secondary index. This column is always the first part of the
clustering key, so it forces token order on queries.
The column's name is ideally idx_token, but it can be suffixed with a
number to ensure its uniqueness.
This series also updates tests to make them acknowledge the new token
order.
Fixes #3423
In order to ensure token order on secondary index queries, the first
clustering column of each view that backs a secondary index is going to
store a token computed from the base table's partition key.
After this commit, if there is a view column that is not present in the
base schema, it will be filled with the computed token.
After 70c72773be it's possible that
open_version() is called with a phase which is smaller than the phase
of the latest version, because the latest version belongs to the
in-progress cache update. In such a case we must return the existing
non-latest snapshot and not create a new version on top of the
in-progress update. Not doing this violates several invariants, and
may lead to inconsistencies, including violation of write atomicity or
temporary loss of writes.
partition_entry::read() was already adjusted by the aforementioned
commit. Do a similar adjustment for open_version().
Fixes sporadic failures of row_cache_test.cc::test_concurrent_reads_and_eviction
Message-Id: <1528211847-22825-1-git-send-email-tgrabiec@scylladb.com>
We mistakenly added only network-online.target, which doesn't promise
to wait for the /var/lib/scylla mount.
To do this we need local-fs.target.
Fixes #3441
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180521083349.8970-1-syuu@scylladb.com>
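The described ordering dependency can be sketched as a systemd unit fragment (a hedged illustration; the actual scylla-server.service contents may differ):

```ini
[Unit]
# local-fs.target guarantees local mounts such as /var/lib/scylla are
# finished; network-online.target alone does not promise that.
After=local-fs.target network-online.target
Wants=network-online.target
```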
"
It turns out that compression just works for SSTables 3.x, thanks to
the previous work done on the write path.
This series cleans up the tests a bit and introduces a test for
compression on the read path.
"
* 'haaawk/sstables3/read-compression-v1' of ssh://github.com/scylladb/seastar-dev:
Add test for compression in sstables 3.x
Extract test_partition_key_with_values_of_different_types_read
sstable_3_x_test: use SEASTAR_THREAD_TEST_CASE
Drop UNCOMPRESSD_ when code will be used for compressed too
"
This patch adds nr_shards, msb_ignore, and the actual sharding algorithm to the
system.local table. Drivers and other tools can then make use of this
information to talk to scylla in an optimal way.
"
* 'system_tables-v3' of github.com:glommer/scylla:
system_keyspace: add sharding information to local table
partitioner: export the name of the algorithm used to do intra-node sharding
We would like the clients to be able to route work directly to the right
shards. To do that, they need to know the sharding algorithm and its
parameters.
The algorithm can be copied into the client, but the parameters need to
be exported somewhere. Let's use the local table for that.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
---
v2: force msb to zero on non-murmur
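The multiply-shift shard-selection scheme these parameters describe can be sketched as follows (a hedged reconstruction; the function name and exact bias handling are assumptions, not the verbatim partitioner code):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of shard selection from a murmur3 token, given the parameters
// exported in system.local (nr_shards, msb_ignore). Assumed scheme, not
// the verbatim Scylla implementation.
uint32_t shard_of(int64_t token, uint32_t nr_shards, uint32_t ignore_msb) {
    // Bias the signed token into an unsigned 64-bit space.
    uint64_t biased = static_cast<uint64_t>(token) + (1ull << 63);
    // Drop the most significant bits the partitioner ignores for sharding.
    biased <<= ignore_msb;
    // Multiply-shift maps the 64-bit space evenly onto [0, nr_shards).
    return static_cast<uint32_t>(
        (static_cast<unsigned __int128>(biased) * nr_shards) >> 64);
}
```

A client that knows these parameters can compute the owning shard for a partition's token and pick the matching connection, which is exactly the routing the series enables.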
We will export this on system tables. To avoid hard-coding it in the system
table level, keep it at least in the dht layer where it belongs.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Currently, build_deb.sh looks very complicated because each distribution
requires different parameters, and we are applying them with sed
commands one by one.
This patch replaces them with Mustache, a template language with a
simple and easy syntax.
Both the .rpm distributions and the .deb distributions have pystache (a
Python implementation of Mustache), so we will use it.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180604104026.22765-1-syuu@scylladb.com>
"
This series introduces a separate hinted handoff manager for materialized views.
Steps:
* decouple resource limits from hinted handoff, so multiple instances can share space
and throughput limits in order to avoid internal fragmentation for every instance's
reservations
* add a subdirectory to data/, responsible for storing materialized view hints
* decouple registering global metrics from the hinted handoff constructor, now that there
can be more than one instance - otherwise 'registering metrics twice' errors would occur
* add a hints_for_views_manager to storage proxy and route failed view updates to use it
instead of the original hints_manager
* restore previous semantics for enabling/disabling hinted handoff - regular hinted handoff
can be disabled or enabled just for specific datacenters without influencing materialized
views flow
"
* 'separate_hh_for_mv_4' of https://github.com/psarna/scylla:
storage_proxy: restore optional hinted handoff
storage_proxy: add hints manager for views
hints: decouple hints manager metrics from constructor
db, config: add view_pending_updates directory
hints: move space_watchdog to resource manager
hints: move send limiter to resource manager
hints: move constants to resource_manager
The IndexInfo table tracks the secondary indexes that have already
been populated. Since our secondary index implementation is backed by
materialized views, we can virtualize that table so queries are
actually answered by built_views.
Fixes #3483
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch adds the same comment that exists in Apache Cassandra,
explaining that the table_name column in the IndexInfo system table
actually refers to the keyspace name. Don't be fooled.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Expose secondary_index::index_table_name() so knowledge of how to
build an index name can remain centralized.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Because the authorized_prepared_statements_cache caches information that
comes from the permissions cache and from the prepared statements cache,
it should have its entry expiration period set to the minimum of the
expiration periods of those caches.
The same goes for the entry refresh period, but since the prepared
statements cache doesn't have a refresh period, the
authorized_prepared_statements_cache's entry refresh period is simply
equal to that of the permissions cache.
Fixes #3473
Tests: dtest{release} auth_test.py
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1527789716-6206-1-git-send-email-vladz@scylladb.com>
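The expiry and refresh rules above can be sketched as follows (illustrative helpers, not the actual cache code):

```cpp
#include <algorithm>
#include <chrono>
#include <optional>

using std::chrono::seconds;

// Sketch: the combined cache expires entries after the minimum of its
// sources' expiration periods; the refresh period falls back to the
// permissions cache's, since the prepared statements cache has none.
seconds combined_expiry(seconds permissions_expiry, seconds prepared_expiry) {
    // An entry is only as fresh as the least-fresh source it was built from.
    return std::min(permissions_expiry, prepared_expiry);
}

seconds combined_refresh(seconds permissions_refresh,
                         std::optional<seconds> prepared_refresh) {
    return prepared_refresh ? std::min(permissions_refresh, *prepared_refresh)
                            : permissions_refresh;
}
```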
Now that more than one instance of hints manager can be present
at the same time, registering metrics is moved out of the constructor
to prevent 'registering metrics twice' errors.
Hints for materialized view updates need to be kept somewhere,
because their dedicated hints manager has to have a root directory.
The view_pending_updates directory resides in /data and is used
for that purpose.
Constants related to managing resources are moved to newly created
resource_manager class. Later, this class will be used to manage
(potentially shared) resources of hints managers.
"
In preparation, we change LCS so that it tries harder to push data
to the last level, where the backlog is supposed to be zero.
The backlog is defined as:
backlog_of_stcs_in_l0 + Sum(L in levels) sizeof(L) * (max_levels - L) * fan_out
where:
* the fan_out is the amount of SSTables we usually compact with the
next level (usually 10).
* max_levels is the number of levels currently populated
* sizeof(L) is the total amount of data in a particular level.
Tests: unit (release)
"
* 'lcs-backlog-v2' of github.com:glommer/scylla:
LCS: implement backlog tracker for compaction controller
LCS: don't construct property in the body of constructor
LCS: try harder to move SSTables to highest levels.
leveled manifest: turn 10 into a constant
backlog: add level to write progress monitor
This is the last missing tracker among the major strategies. After
this, only DTCS is left.
To calculate the backlog, we will define the point of zero-backlog
as having all data in the last level. The backlog is then:
Sum(L in levels) sizeof(L) * (max_levels - L) * fan_out,
where:
* the fan_out is the amount of SSTables we usually compact with the
next level (usually 10).
* max_levels is the number of levels currently populated
* sizeof(L) is the total amount of data in a particular level.
Care is taken for the backlog not to jump when a new level has just
recently been created.
Aside from that, SSTables that accumulate in L0 can be subject to STCS.
We will then add a STCS backlog in those SSTables to represent that.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
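The backlog formula above can be sketched as a worked computation (illustrative names, not the actual backlog tracker code):

```cpp
#include <cstddef>
#include <vector>

// Sketch of the described LCS backlog formula. level_sizes[L] holds the
// bytes in level L; the last index is the highest populated level, where
// the backlog is defined to be zero.
double lcs_backlog(double stcs_backlog_in_l0,
                   const std::vector<double>& level_sizes,
                   double fan_out) {
    double backlog = stcs_backlog_in_l0;
    const std::size_t max_level = level_sizes.size() - 1;
    for (std::size_t l = 0; l < max_level; ++l) {
        // Each level contributes its size, weighted by how many levels it
        // still has to travel to reach the last one, times the fan-out.
        backlog += level_sizes[l] * static_cast<double>(max_level - l) * fan_out;
    }
    return backlog;
}
```

With all data in the last level the sum vanishes, matching the stated zero-backlog point; data stuck in low levels contributes proportionally to how far it still has to move.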
Right now we are constructing the _max_sstable_size_in_mb property in
the body of the constructor, which makes it hard for us to use from
other properties.
We are doing that because we'd like to test for bounds of that value. So
a cleaner way is to have a helper function for that.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Our current implementation of LCS can end up with situations in which
just a bit of data is in the highest levels, with the majority in the
lowest levels. That happens because we will only promote things to
highest levels if the amount of data in the current level is higher than
the maximum.
This is a pre-existing problem in itself, but became even clearer when
we started trying to define what is the backlog for LCS.
We have discussed ways to fix this by redefining the criteria for when
to move data to the next levels. That would require us to change the way
things are today considerably, allowing parallel compactions, etc. There
is significant risk that we'll increase write amplification and we would
need to carefully validate that.
For now I propose a simpler change that essentially solves the
"inverted pyramid" problem of current LCS without major disruption:
keep selecting compaction candidates with the same criteria we use
today, which should help make sure we are not compacting high levels for
no reason; but if there is nothing to do, use the idle time to push data
to higher levels. As an added benefit, old data that is in the higher
levels can also be compacted away faster.
With this patch we see that in an idle, post-load system all data is
eventually pushed to the last level. Systems under constant writes keep
behaving the same way they did before.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
We increase levels in powers of 10 but that is a parameter
of the algorithm. At least make it into a constant so that we can
reuse it somewhere else.
Signed-off-by: Glauber Costa <glauber@scylladb.com>