Three nodes n1, n2, n3 in the cluster:
Shut down n1, n2, n3.
Start n1 and n2.
Start n3; we see the features being enabled using the system table while n1 and n2 are already up and running in the cluster.
INFO 2019-02-27 09:24:41,023 [shard 0] gossip - Feature check passed. Local node 127.0.0.3 features = {CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS, DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3, STREAM_WITH_RPC_STREAM, TRUNCATION_TABLE, WRITE_FAILURE_REPLY, XXHASH}, Remote common_features = {CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS, DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3, STREAM_WITH_RPC_STREAM, TRUNCATION_TABLE, WRITE_FAILURE_REPLY, XXHASH}
INFO 2019-02-27 09:24:41,025 [shard 0] storage_service - Starting up server gossip
INFO 2019-02-27 09:24:41,063 [shard 0] gossip - Node 127.0.0.1 does not contain SUPPORTED_FEATURES in gossip, using features saved in system table, features={CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS, DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3, STREAM_WITH_RPC_STREAM, TRUNCATION_TABLE, WRITE_FAILURE_REPLY, XXHASH}
INFO 2019-02-27 09:24:41,063 [shard 0] gossip - Node 127.0.0.2 does not contain SUPPORTED_FEATURES in gossip, using features saved in system table, features={CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS, DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3, STREAM_WITH_RPC_STREAM, TRUNCATION_TABLE, WRITE_FAILURE_REPLY, XXHASH}
The problem is that we enable the features too early in the startup process.
We should enable the features only after gossip has settled.
Fixes #4289
Message-Id: <04f2edb25457806bd9e8450dfdcccc9f466ae832.1551406991.git.asias@scylladb.com>
This brings the version check up-to-date with README.md and HACKING.md,
which were updated by commit fa2b03 ("Replace std::experimental types
with C++17 std version.") to say that minimum GCC 8.1.1 is required.
Tests: manually run configure.py with various `--compiler` values.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Message-Id: <20190318130543.24982-1-dejan@scylladb.com>
Three nodes in the cluster: node1, node2, node3.
Shut down the whole cluster.
Start node1.
Start node2; node2 sees empty remote common_features.
gossip - Feature check passed. Local node 127.0.0.2 features =
{CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS,
DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS,
LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT,
RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3,
STREAM_WITH_RPC_STREAM, WRITE_FAILURE_REPLY, XXHASH},
Remote common_features = {}
The problem is that node3 hasn't started yet, so node2 sees node3 with an
empty feature set. In get_supported_features(), an empty common feature set
is returned if any node's feature set is empty. To fix this, we should fall
back to the features saved in the system table.
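The fallback described above can be sketched as follows. This is a minimal Python sketch; the function and parameter names (common_supported_features, gossip_features, saved_features) are illustrative, not the real gossiper API:

```python
def common_supported_features(gossip_features, saved_features):
    # gossip_features: node -> feature set as seen in gossip
    # saved_features: node -> feature set persisted in the system table
    common = None
    for node, feats in gossip_features.items():
        if not feats:
            # Node hasn't gossiped its features yet: fall back to the
            # system-table copy instead of letting an empty set wipe
            # out the whole intersection.
            feats = saved_features.get(node, set())
        common = set(feats) if common is None else common & set(feats)
    return common if common is not None else set()

# node2 (127.0.0.2) is restarting and has not gossiped its features yet.
gossip = {"127.0.0.1": {"XXHASH", "ROLES"}, "127.0.0.2": set()}
saved = {"127.0.0.2": {"XXHASH", "ROLES"}}
print(sorted(common_supported_features(gossip, saved)))  # ['ROLES', 'XXHASH']
```

Without the fallback, the intersection with node2's empty set would yield empty remote common_features, which is the bug being fixed.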
Start node3; node3 sees empty remote common_features.
gossip - Feature check passed. Local node 127.0.0.3 features =
{CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS,
DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS,
LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT,
RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3,
STREAM_WITH_RPC_STREAM, WRITE_FAILURE_REPLY, XXHASH},
Remote common_features = {}
The problem is that node3 hasn't inserted its own features into the gossip
endpoint_state_map yet. get_supported_features() returns the common features
of all nodes in endpoint_state_map. To fix this, we should fall back to the
features stored in the system table for such a node.
Fixes #4225
Fixes #4341
* dev asias/fix_check_knows_remote_features.upstream.v4.1:
gossiper: Remove unused register_feature and unregister_feature
gossiper: Remove unused wait_for_feature_on_all_node and
wait_for_feature_on_node
gossiper: Log feature is enabled only if the feature is not enabled
previously
gossiper: Fix empty remote common_features in
check_knows_remote_features
Three nodes in the cluster: node1, node2, node3.
Shut down the whole cluster.
Start node1.
Start node2; node2 sees empty remote common_features.
gossip - Feature check passed. Local node 127.0.0.2 features =
{CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS,
DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS,
LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT,
RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3,
STREAM_WITH_RPC_STREAM, WRITE_FAILURE_REPLY, XXHASH},
Remote common_features = {}
The problem is that node3 hasn't started yet, so node2 sees node3 with an
empty feature set. In get_supported_features(), an empty common feature set
is returned if any node's feature set is empty. To fix this, we should fall
back to the features saved in the system table.
Start node3; node3 sees empty remote common_features.
gossip - Feature check passed. Local node 127.0.0.3 features =
{CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS,
DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS,
LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT,
RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3,
STREAM_WITH_RPC_STREAM, WRITE_FAILURE_REPLY, XXHASH},
Remote common_features = {}
The problem is that node3 hasn't inserted its own features into the gossip
endpoint_state_map yet. get_supported_features() returns the common features
of all nodes in endpoint_state_map. To fix this, we should fall back to the
features stored in the system table for such a node.
Fixes #4225
We saw the log message "Feature FOO is enabled" more than once, as shown
below. It is better to log it only when the feature was not previously
enabled.
gossip - InetAddress 127.0.0.1 is now UP, status = NORMAL
gossip - Feature CORRECT_COUNTER_ORDER is enabled
gossip - Feature CORRECT_NON_COMPOUND_RANGE_TOMBSTONES is enabled
gossip - Feature COUNTERS is enabled
gossip - Feature DIGEST_MULTIPARTITION_READ is enabled
gossip - Feature INDEXES is enabled
gossip - Feature LARGE_PARTITIONS is enabled
gossip - Feature LA_SSTABLE_FORMAT is enabled
gossip - Feature MATERIALIZED_VIEWS is enabled
gossip - Feature MC_SSTABLE_FORMAT is enabled
gossip - Feature RANGE_TOMBSTONES is enabled
gossip - Feature ROLES is enabled
gossip - Feature ROW_LEVEL_REPAIR is enabled
gossip - Feature SCHEMA_TABLES_V3 is enabled
gossip - Feature STREAM_WITH_RPC_STREAM is enabled
gossip - Feature TRUNCATION_TABLE is enabled
gossip - Feature WRITE_FAILURE_REPLY is enabled
gossip - Feature XXHASH is enabled
gossip - Feature CORRECT_COUNTER_ORDER is enabled
gossip - Feature CORRECT_NON_COMPOUND_RANGE_TOMBSTONES is enabled
gossip - Feature COUNTERS is enabled
gossip - Feature DIGEST_MULTIPARTITION_READ is enabled
gossip - Feature INDEXES is enabled
gossip - Feature LARGE_PARTITIONS is enabled
gossip - Feature LA_SSTABLE_FORMAT is enabled
gossip - Feature MATERIALIZED_VIEWS is enabled
gossip - Feature MC_SSTABLE_FORMAT is enabled
gossip - Feature RANGE_TOMBSTONES is enabled
gossip - Feature ROLES is enabled
gossip - Feature ROW_LEVEL_REPAIR is enabled
gossip - Feature SCHEMA_TABLES_V3 is enabled
gossip - Feature STREAM_WITH_RPC_STREAM is enabled
gossip - Feature TRUNCATION_TABLE is enabled
gossip - Feature WRITE_FAILURE_REPLY is enabled
gossip - Feature XXHASH is enabled
gossip - InetAddress 127.0.0.2 is now UP, status = NORMAL
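The fix boils down to making the enable step idempotent with respect to logging. A minimal Python sketch (the FeatureRegistry class and its names are illustrative, not the real gossiper implementation):

```python
class FeatureRegistry:
    def __init__(self):
        self._enabled = set()
        self.log = []

    def enable(self, feature):
        if feature in self._enabled:
            return  # already enabled: skip the duplicate log line
        self._enabled.add(feature)
        self.log.append(f"Feature {feature} is enabled")

reg = FeatureRegistry()
for _ in range(2):  # two nodes coming UP re-announce the same features
    for f in ("COUNTERS", "XXHASH"):
        reg.enable(f)
print(reg.log)  # ['Feature COUNTERS is enabled', 'Feature XXHASH is enabled']
```

Each feature is now logged exactly once, regardless of how many nodes re-announce it.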
It worked accidentally only because bash expanded the pattern to the
matching files in the current directory; the wildcard is not actually
recognized by tar. We need to use the full file name instead.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190312172243.5482-2-syuu@scylladb.com>
We get the following link errors when running reloc/build_reloc.sh in
dbuild; we need to enable DPDK on Seastar:
g++: error: /usr/lib64/librte_cfgfile.so: No such file or directory
g++: error: /usr/lib64/librte_cmdline.so: No such file or directory
g++: error: /usr/lib64/librte_ethdev.so: No such file or directory
g++: error: /usr/lib64/librte_hash.so: No such file or directory
g++: error: /usr/lib64/librte_kvargs.so: No such file or directory
g++: error: /usr/lib64/librte_mbuf.so: No such file or directory
g++: error: /usr/lib64/librte_eal.so: No such file or directory
g++: error: /usr/lib64/librte_mempool.so: No such file or directory
g++: error: /usr/lib64/librte_mempool_ring.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_bnxt.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_e1000.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_ena.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_enic.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_fm10k.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_qede.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_i40e.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_ixgbe.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_nfp.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_ring.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_sfc_efx.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_vmxnet3_uio.so: No such file or directory
g++: error: /usr/lib64/librte_ring.so: No such file or directory
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190312172243.5482-1-syuu@scylladb.com>
Since we removed dist/common/bin/scyllatop, we are getting a build error
during the .deb package build (1bb65a0888).
To fix the error, we need to create a symlink for /usr/bin/scyllatop.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190316162105.28855-1-syuu@scylladb.com>
"
Refuse to accept SSTables that were created with a partitioner
different from the one used by the Scylla server.
Fixes #4331
"
* 'haaawk/4331/v4' of github.com:scylladb/seastar-dev:
sstables: Add test for sstable::validate_partitioner
sstables: Add sstable::validate_partitioner and use it
Make sure the exception is thrown when Scylla
tries to load an SSTable created with an incompatible partitioner.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
The Scylla server can't read sstables that were created
with a different partitioner than the one being used by Scylla.
We should make sure that Scylla identifies such a mismatch
and refuses to use such SSTables.
We can use the partitioner information stored in the validation metadata
(the Statistics.db file) of each SSTable and compare it against the
partitioner used by Scylla.
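The check itself is a straightforward name comparison. A hedged Python sketch (validate_partitioner here is a stand-in; the real function operates on the parsed Statistics.db validation metadata):

```python
class InvalidPartitionerError(Exception):
    pass

def validate_partitioner(sstable_metadata, server_partitioner):
    # Compare the partitioner recorded when the sstable was written
    # against the partitioner this server is configured with.
    stored = sstable_metadata["partitioner"]
    if stored != server_partitioner:
        raise InvalidPartitionerError(
            f"sstable was created with {stored}, "
            f"but this server uses {server_partitioner}")

meta = {"partitioner": "org.apache.cassandra.dht.Murmur3Partitioner"}
validate_partitioner(meta, "org.apache.cassandra.dht.Murmur3Partitioner")  # accepted
try:
    validate_partitioner(meta, "org.apache.cassandra.dht.RandomPartitioner")
except InvalidPartitionerError as e:
    print("refused:", e)
```

Refusing at load time avoids silently misplacing data whose tokens were computed with a different partitioner.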
Fixes #4331
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
A previous version of the patch that introduced these calls had no
limit on how far behind the large data recording could get, and
maybe_record_large_cells returned null.
The final version switched to a semaphore, but unfortunately these
calls were not updated.
Tests: unit (dev)
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190314195856.66387-1-espindola@scylladb.com>
In order to allow yielding when handling endpoint lifecycle changes,
notifiers now run in a threaded context.
Implementations which used this assumption before are supplemented
with assertions that they indeed run in seastar::async mode.
Fixes#4317
Message-Id: <45bbaf2d25dac314e4f322a91350705fad8b81ed.1552567666.git.sarna@scylladb.com>
* seastar e640314...463d24e (3):
> Merge 'Handle IOV_MAX limit in posix_file_impl' from Paweł
> core: remove unneeded 'exceptional future ignored' report
> tests/perf: support multiple iterations in a single test run
Issue #4234 asks for a large collection detector. While discussing the
issue, Benny pointed out that it is probably better to have a generic large
cell detector, as it is a natural progression from what we already
warn about (large partitions and large rows).
This patch series implements that. It is on top of
shutdown-order-patches-v7 which is currently on next.
With the changes to use a semaphore, this patch series might be getting
a bit big. Let me know if I should split it.
* https://github.com/espindola/scylla espindola/large-cells-on-top-of-shutdown-v5:
db: refactor large data deletion code
db: Rename (maybe_)?update_large_partitions
db: refactor a try_record helper
large_data_handler: assert it is not used after stop()
db: don't use _stopped directly
sstables: delete dead error handling code.
large_data_handler: Remove const from a few functions
large_data_handler: propagate a future out of stop()
large_data_handler: Run large data recording in parallel
Create a system.large_cells table
db: Record large cells
Add a test for large cells
This is analogous to the system.large_rows table, but holds individual
cells, so it also needs the column name.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
With these changes, the futures returned by large_data_handler will not
normally wait for entries to be written to system.large_rows or
system.large_partitions.
We use a semaphore to bound how behind system.large_* table updates
can get.
This should avoid delaying sstable writes in the common case, which
becomes more relevant once we warn about large cells, since the default
threshold will be just 1MB.
Note that there is no ordering between the various maybe_record_* and
maybe_delete_large_data_entries requests. This means that we can end
up with a stale entry that is only removed once the TTL expires.
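The semaphore-bounding idea can be sketched in Python with a plain threading.Semaphore (the real code uses a Seastar semaphore; maybe_record_large_cell, background_write, and MAX_IN_FLIGHT are illustrative names):

```python
import threading
import time

MAX_IN_FLIGHT = 4  # bound on pending system.large_* updates
units = threading.Semaphore(MAX_IN_FLIGHT)
written = []
lock = threading.Lock()

def background_write(entry):
    try:
        time.sleep(0.01)  # simulate the system.large_cells insert
        with lock:
            written.append(entry)
    finally:
        units.release()

def maybe_record_large_cell(entry):
    # The caller only waits for a semaphore unit, never for the write
    # itself, so it blocks only once MAX_IN_FLIGHT updates are pending.
    units.acquire()
    threading.Thread(target=background_write, args=(entry,)).start()

for i in range(16):
    maybe_record_large_cell(i)
# Reacquiring all units waits for the tail of the background work.
for _ in range(MAX_IN_FLIGHT):
    units.acquire()
print(len(written))  # 16
```

The sstable write path thus stays fast in the common case, while the semaphore guarantees the background table updates can never fall arbitrarily far behind.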
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
These will use a member semaphore variable in a followup patch, so they
cannot be const.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
maybe_delete_large_data_entries handles exceptions internally, so the
code this patch deletes would never run.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
This should have been changed in the patch
db: stop the commit log after the tables during shutdown
But unfortunately I missed it then.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
We had almost identical error handling for large_partitions and
large_rows. Refactor in preparation for large_cells.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
This renames it to record_large_partitions, which matches
record_large_rows. It also changes the signature to be closer to
record_large_rows.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
The code for deleting entries from system.large_partitions was almost
a duplicate of the code for deleting entries from system.large_rows.
This patch unifies the two, which also improves the error message when
we fail to delete entries from system.large_partitions.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
There is no guarantee that rpc streaming makes progress within any given
time period. Remove the keep-alive timer in streaming to avoid killing the
session when the rpc streaming is just slow.
The keep-alive timer was used to close the session in the following case:
n2 (the rpc streaming sender) streams to n1 (the rpc streaming receiver),
then n2 is killed with kill -9.
We needed this because we do not kill the session when gossip thinks a node
is down: the node being down might only be temporary, and it would be a
waste to drop the work that has already been done, especially
when the stream session takes a long time.
Since in range_streamer we do not stream all the data in a single stream
session (we stream 10% of the data at a time) and we have retry logic,
I think it is fine to kill a stream session when gossip thinks a node is
down. This patch changes the code to close all stream sessions with a node
that gossip thinks is down.
Message-Id: <bdbb9486a533eee25fcaf4a23a946629ba946537.1551773823.git.asias@scylladb.com>
"
This series allows canceling view update requests when a node is
discovered DOWN. View updates are sent in the background with a long
timeout (5 minutes), and in case we discover that the node is
unavailable, there's no point in waiting that long for the request
to finish. What's more, waiting for these requests occurs on shutdown,
which may result in waiting 5 minutes until Scylla properly shuts down,
which is bad for both users and dtests.
This series implements storage_proxy as a lifecycle subscriber,
so it can react to membership changes. It also keeps track of all
"interruptible" writes per endpoint, so once a node is detected as DOWN,
an artificial timeout can be triggered for all aforementioned write
requests.
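The per-endpoint bookkeeping described above can be sketched as follows. This is an illustrative Python sketch, not the real storage_proxy code; WriteHandler, start_view_update, and on_down are hypothetical names:

```python
from collections import defaultdict

class WriteHandler:
    def __init__(self, endpoint):
        self.endpoint = endpoint
        self.done = False
        self.timed_out = False

    def timeout(self):
        # Trigger the artificial timeout; the caller then falls back to
        # storing hints and/or not progressing with view building.
        if not self.done:
            self.timed_out = True
            self.done = True

class StorageProxy:
    def __init__(self):
        # Per-endpoint registry of in-flight interruptible writes.
        self._pending = defaultdict(list)

    def start_view_update(self, endpoint):
        h = WriteHandler(endpoint)
        self._pending[endpoint].append(h)
        return h

    def on_down(self, endpoint):
        # Membership change: time out every pending write to this node
        # instead of waiting up to 5 minutes for each one.
        for h in self._pending.pop(endpoint, []):
            h.timeout()

proxy = StorageProxy()
h = proxy.start_view_update("10.0.0.2")
proxy.on_down("10.0.0.2")
print(h.timed_out)  # True
```

The same mechanism serves shutdown: draining the proxy times out all registered handlers at once.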
Fixes #3826
Fixes #3966
Fixes #4028
"
* 'write_hints_for_view_updates_on_shutdown_4' of https://github.com/psarna/scylla:
service: remove unused stop_hints_manager
storage_proxy: add drain_on_shutdown implementation
main: register storage proxy as lifecycle subscriber
storage_proxy: add endpoint_lifecycle_subscriber interface
storage_proxy: register view update handlers for view write type
storage_proxy: add intrusive list of view write handlers
storage_proxy: add view_update_write_response_handler
Complex timestamp tests were ported from dtest and contained a potential
race: rows were updated with TTL 1 and then checked, in an eventually()
loop, for existence in both base and view replicas.
During this loop, however, the TTL of 1 second might have already passed
and the row could have been deleted from the base table.
This patch changes the mentioned TTL to 30 seconds, making the tests
extremely unlikely to be flaky.
Message-Id: <6b43fe31850babeaa43465eb771c0af45ee6e80d.1552041571.git.sarna@scylladb.com>
This series contains several improvements to perf_fast_forward that
either address some of the problems seen in the automated runs or help
understanding the results.
The main problem was that the small-partition-slicing test had a
preparation stage disproportionately long compared to the actual testing
phase. While the fragments-per-second results weren't affected by that, it
restricted the number of iterations of the test that we were able to run,
and the test, whose single iteration is short (and more prone to noise),
was executed only four times. This was solved by sharing the preparation
stage across all iterations, thus enabling the test to be run many times
and improving the stability of the results.
Another improvement is the ability to dump all test results and process
them to produce histograms. This lets us see what the distribution of
particular statistics looks like and whether there are any complications.
Refs #4278.
* https://github.com/pdziepak/scylla.git more-perf_fast_forward/v1:
tests/perf_fast_forward: print number of iterations of each test
tests/perf_fast_forward: reuse keys in small partition slicing test
tests/perf_fast_forward: extract json result file writing logic
tests/perf_fast_forward: add an option to dump all results
tests/perf_fast_forward: add script for analysing full results
When the storage proxy is shutting down, all interruptible writes
can be timed out so that we do not wait for them. Instead, the mechanism
will fall back to storing hints and/or not progressing with view
building.