Commit Graph

18251 Commits

Author SHA1 Message Date
Asias He
71bf757b2c gossiper: Enable features only after gossip is settled
Three nodes n1, n2, n3 in the cluster.

Shutdown n1, n2, n3.

Start n1, n2.

Start n3. We saw that features are enabled using the system table while n1 and n2 are already up and running in the cluster:

INFO  2019-02-27 09:24:41,023 [shard 0] gossip - Feature check passed. Local node 127.0.0.3 features = {CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS, DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3, STREAM_WITH_RPC_STREAM, TRUNCATION_TABLE, WRITE_FAILURE_REPLY, XXHASH}, Remote common_features = {CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS, DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3, STREAM_WITH_RPC_STREAM, TRUNCATION_TABLE, WRITE_FAILURE_REPLY, XXHASH}
INFO  2019-02-27 09:24:41,025 [shard 0] storage_service - Starting up server gossip
INFO  2019-02-27 09:24:41,063 [shard 0] gossip - Node 127.0.0.1 does not contain SUPPORTED_FEATURES in gossip, using features saved in system table, features={CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS, DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3, STREAM_WITH_RPC_STREAM, TRUNCATION_TABLE, WRITE_FAILURE_REPLY, XXHASH}
INFO  2019-02-27 09:24:41,063 [shard 0] gossip - Node 127.0.0.2 does not contain SUPPORTED_FEATURES in gossip, using features saved in system table, features={CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS, DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS, LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT, RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3, STREAM_WITH_RPC_STREAM, TRUNCATION_TABLE, WRITE_FAILURE_REPLY, XXHASH}

The problem is that we enable the features too early in the startup process.
We should enable features only after gossip has settled.

Fixes #4289
Message-Id: <04f2edb25457806bd9e8450dfdcccc9f466ae832.1551406991.git.asias@scylladb.com>
2019-03-18 18:25:29 +01:00
Dejan Mircevski
c7d05b88a6 Update GCC version check in configure.py
This brings the version check up-to-date with README.md and HACKING.md,
which were updated by commit fa2b03 ("Replace std::experimental types
with C++17 std version.") to say that minimum GCC 8.1.1 is required.

Tests: manually run configure.py with various `--compiler` values.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Message-Id: <20190318130543.24982-1-dejan@scylladb.com>
2019-03-18 15:24:25 +02:00
Tomasz Grabiec
b0e6f17a22 Merge "Fix empty remote common_features in check_knows_remote_features" from Asias
Three nodes in the cluster: node1, node2, node3.

Shutdown the whole cluster.

Start node1.

Start node2; node2 sees empty remote common_features:

   gossip - Feature check passed.  Local node 127.0.0.2 features =
   {CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS,
   DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS,
   LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT,
   RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3,
   STREAM_WITH_RPC_STREAM, WRITE_FAILURE_REPLY, XXHASH},
   Remote common_features = {}

The problem is that node3 hasn't started yet, so node1 sees node3 with
empty features. In get_supported_features(), an empty common feature set
will be returned if a node with empty features is seen. To fix, we should
fall back to the features saved in the system table.

Start node3; node3 sees empty remote common_features:

   gossip - Feature check passed. Local node 127.0.0.3 features =
   {CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS,
   DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS,
   LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT,
   RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3,
   STREAM_WITH_RPC_STREAM, WRITE_FAILURE_REPLY, XXHASH},
   Remote common_features = {}

The problem is that node3 hasn't inserted its own features into the gossip
endpoint_state_map yet. get_supported_features() returns the common
features of all nodes in endpoint_state_map. To fix, we should fall back
to the features stored in the system table for such a node in this case.

Fixes #4225
Fixes #4341

* dev asias/fix_check_knows_remote_features.upstream.v4.1:
  gossiper: Remove unused register_feature and unregister_feature
  gossiper: Remove unused wait_for_feature_on_all_node and
    wait_for_feature_on_node
  gossiper: Log feature is enabled only if the feature is not enabled
    previously
  gossiper: Fix empty remote common_features in
    check_knows_remote_features
2019-03-18 10:56:10 +01:00
Asias He
1d59f26c11 gossiper: Fix empty remote common_features in check_knows_remote_features
Three nodes in the cluster: node1, node2, node3.

Shutdown the whole cluster.

Start node1.

Start node2; node2 sees empty remote common_features:

   gossip - Feature check passed.  Local node 127.0.0.2 features =
   {CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS,
   DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS,
   LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT,
   RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3,
   STREAM_WITH_RPC_STREAM, WRITE_FAILURE_REPLY, XXHASH},
   Remote common_features = {}

The problem is that node3 hasn't started yet, so node1 sees node3 with
empty features. In get_supported_features(), an empty common feature set
will be returned if a node with empty features is seen. To fix, we should
fall back to the features saved in the system table.

Start node3; node3 sees empty remote common_features:

   gossip - Feature check passed. Local node 127.0.0.3 features =
   {CORRECT_COUNTER_ORDER, CORRECT_NON_COMPOUND_RANGE_TOMBSTONES, COUNTERS,
   DIGEST_MULTIPARTITION_READ, INDEXES, LARGE_PARTITIONS,
   LA_SSTABLE_FORMAT, MATERIALIZED_VIEWS, MC_SSTABLE_FORMAT,
   RANGE_TOMBSTONES, ROLES, ROW_LEVEL_REPAIR, SCHEMA_TABLES_V3,
   STREAM_WITH_RPC_STREAM, WRITE_FAILURE_REPLY, XXHASH},
   Remote common_features = {}

The problem is that node3 hasn't inserted its own features into the gossip
endpoint_state_map yet. get_supported_features() returns the common
features of all nodes in endpoint_state_map. To fix, we should fall back
to the features stored in the system table for such a node in this case.
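A minimal Python sketch of the fallback described above, with hypothetical names (the real logic is C++ in the gossiper's get_supported_features()): when intersecting the feature sets of all known nodes, a node whose gossiped feature set is empty falls back to the features persisted for it in the system table.

```python
def common_features(gossiped, persisted):
    """Intersect the feature sets of all known nodes.

    gossiped:  node address -> set of features seen in gossip
    persisted: node address -> set of features saved in the system table
    Hypothetical helper, sketching only the fallback.
    """
    common = None
    for node, features in gossiped.items():
        if not features:
            # The node has not gossiped its features yet (e.g. it is down
            # or hasn't inserted itself into endpoint_state_map): fall
            # back to the features recorded for it in the system table.
            features = persisted.get(node, set())
        common = set(features) if common is None else common & set(features)
    return common or set()
```

For example, with node3 down but its features persisted, the intersection is no longer forced to the empty set.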

Fixes #4225
2019-03-18 10:56:10 +01:00
Asias He
acb4badbc3 gossiper: Log feature is enabled only if the feature is not enabled previously
We saw the log "Feature FOO is enabled" more than once, as below. It is
better to log it only when the feature was not enabled previously.
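The fix amounts to remembering which features were already enabled and logging only on the disabled-to-enabled transition; an illustrative Python sketch (not Scylla's actual gossiper API):

```python
class FeatureLogger:
    """Log a feature only on its first enablement. Illustrative names."""

    def __init__(self):
        self._enabled = set()
        self.log = []

    def enable(self, feature):
        if feature not in self._enabled:
            # Only the disabled -> enabled transition produces a log line.
            self._enabled.add(feature)
            self.log.append(f"Feature {feature} is enabled")
```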

    gossip - InetAddress 127.0.0.1 is now UP, status = NORMAL
    gossip - Feature CORRECT_COUNTER_ORDER is enabled
    gossip - Feature CORRECT_NON_COMPOUND_RANGE_TOMBSTONES is enabled
    gossip - Feature COUNTERS is enabled
    gossip - Feature DIGEST_MULTIPARTITION_READ is enabled
    gossip - Feature INDEXES is enabled
    gossip - Feature LARGE_PARTITIONS is enabled
    gossip - Feature LA_SSTABLE_FORMAT is enabled
    gossip - Feature MATERIALIZED_VIEWS is enabled
    gossip - Feature MC_SSTABLE_FORMAT is enabled
    gossip - Feature RANGE_TOMBSTONES is enabled
    gossip - Feature ROLES is enabled
    gossip - Feature ROW_LEVEL_REPAIR is enabled
    gossip - Feature SCHEMA_TABLES_V3 is enabled
    gossip - Feature STREAM_WITH_RPC_STREAM is enabled
    gossip - Feature TRUNCATION_TABLE is enabled
    gossip - Feature WRITE_FAILURE_REPLY is enabled
    gossip - Feature XXHASH is enabled

    gossip - Feature CORRECT_COUNTER_ORDER is enabled
    gossip - Feature CORRECT_NON_COMPOUND_RANGE_TOMBSTONES is enabled
    gossip - Feature COUNTERS is enabled
    gossip - Feature DIGEST_MULTIPARTITION_READ is enabled
    gossip - Feature INDEXES is enabled
    gossip - Feature LARGE_PARTITIONS is enabled
    gossip - Feature LA_SSTABLE_FORMAT is enabled
    gossip - Feature MATERIALIZED_VIEWS is enabled
    gossip - Feature MC_SSTABLE_FORMAT is enabled
    gossip - Feature RANGE_TOMBSTONES is enabled
    gossip - Feature ROLES is enabled
    gossip - Feature ROW_LEVEL_REPAIR is enabled
    gossip - Feature SCHEMA_TABLES_V3 is enabled
    gossip - Feature STREAM_WITH_RPC_STREAM is enabled
    gossip - Feature TRUNCATION_TABLE is enabled
    gossip - Feature WRITE_FAILURE_REPLY is enabled
    gossip - Feature XXHASH is enabled
    gossip - InetAddress 127.0.0.2 is now UP, status = NORMAL
2019-03-18 10:56:10 +01:00
Asias He
f32f08c91e gossiper: Remove unused wait_for_feature_on_all_node and wait_for_feature_on_node
Remove unused check_features helper as well.
2019-03-18 10:56:09 +01:00
Asias He
6dbcb2e0c9 gossiper: Remove unused register_feature and unregister_feature
They are not used any more.
2019-03-18 10:56:09 +01:00
Benny Halevy
ecf88d8e2e compaction: fix sstable_window_size calculation if only unit/size is set
A user who changes the default UNIT from DAYS to HOURS and does not set
compaction_window_size will end up with a window of 24H instead of 1H.

According to the docs https://docs.scylladb.com/getting-started/compaction/#twcs-options
compaction_window_size should default to a value of 1.
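The intended calculation can be sketched as follows (Python with a hypothetical helper name; Scylla's implementation is C++): the size defaults to 1 of whatever unit is chosen, never to the equivalent of one day.

```python
from datetime import timedelta

# Hypothetical sketch of the TWCS window calculation: if
# compaction_window_size is unset, it defaults to 1 of the chosen unit.
_UNITS = {"DAYS": timedelta(days=1),
          "HOURS": timedelta(hours=1),
          "MINUTES": timedelta(minutes=1)}

def sstable_window_size(unit, size=None):
    return _UNITS[unit] * (size if size is not None else 1)
```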

Fixes #4310

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190307131318.13998-1-bhalevy@scylladb.com>
2019-03-18 11:19:18 +02:00
Takuya ASADA
02be95365f reloc/build_rpm.sh: don't use '*' for tar xf argument
It worked only accidentally: the '*' was expanded by bash to the matching
files in the current directory, rather than being interpreted by tar.
We need to use the full file name instead.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190312172243.5482-2-syuu@scylladb.com>
2019-03-18 11:09:55 +02:00
Takuya ASADA
5b10b6a0ce reloc/build_reloc.sh: enable DPDK
We get the following link error when running reloc/build_reloc.sh in
dbuild; we need to enable DPDK in Seastar:

g++: error: /usr/lib64/librte_cfgfile.so: No such file or directory
g++: error: /usr/lib64/librte_cmdline.so: No such file or directory
g++: error: /usr/lib64/librte_ethdev.so: No such file or directory
g++: error: /usr/lib64/librte_hash.so: No such file or directory
g++: error: /usr/lib64/librte_kvargs.so: No such file or directory
g++: error: /usr/lib64/librte_mbuf.so: No such file or directory
g++: error: /usr/lib64/librte_eal.so: No such file or directory
g++: error: /usr/lib64/librte_mempool.so: No such file or directory
g++: error: /usr/lib64/librte_mempool_ring.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_bnxt.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_e1000.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_ena.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_enic.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_fm10k.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_qede.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_i40e.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_ixgbe.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_nfp.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_ring.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_sfc_efx.so: No such file or directory
g++: error: /usr/lib64/librte_pmd_vmxnet3_uio.so: No such file or directory
g++: error: /usr/lib64/librte_ring.so: No such file or directory

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190312172243.5482-1-syuu@scylladb.com>
2019-03-18 11:09:55 +02:00
Piotr Sarna
2e05d86cf3 service: reduce number of spawned threads when notifying
Commit 9c544df217 introduced running up/down/join/leave notifications
in a threaded context, but spawned a thread for every notification,
while one thread could serve all notifiees.

Reported-by: Avi Kivity <avi@scylladb.com>
Message-Id: <34815d5aa11902c4a052cff38f4c45c45ff919d8.1552897848.git.sarna@scylladb.com>
2019-03-18 10:45:47 +02:00
Avi Kivity
64fa2dd1d2 Merge "gdb: Introduce 'scylla sstables'" from Tomasz
"
Finds all sstables on current shard and prints useful information,
like on-disk and in-memory usage.

Example:

  (gdb) scylla sstables
  (sstables::sstable*) 0x60100034d200: local=1 data_file=9551, in_memory=266192 (bf=400, summary=3072, sm=262096)
  (sstables::sstable*) 0x601000348600: local=1 data_file=1229, in_memory=266192 (bf=400, summary=3072, sm=262096)
  (sstables::sstable*) 0x601000348000: local=1 data_file=4785, in_memory=266192 (bf=400, summary=3072, sm=262096)
  (sstables::sstable*) 0x60100034c600: local=1 data_file=298, in_memory=266192 (bf=400, summary=3072, sm=262096)
  ...
  total (shard-local): count=144, data_file=782839677, in_memory=59774408

Because of the way it finds sstables (bag_sstable_set), it doesn't yet support tables using LeveledCompactionStrategy.
"

* 'gdb-scylla-sstables' of github.com:tgrabiec/scylla:
  gdb: Introduce 'scylla sstables'
  gdb: Introduce find_instances()
  gdb: Extract std_unqiue_ptr.get()
  gdb: Add chunked_vector wrapper
  gdb: Add small_vector wrapper
  gdb: Add circular_buffer.size() and circular_buffer.external_memory_footprint()
  gdb: Add wrapper for seastar::lw_shared_ptr
  gdb: Add std_vector.external_memory_footprint()
  gdb: Add wrapper for boost::variant
  gdb: Add wrapper for std::optional
2019-03-17 19:37:44 +02:00
Takuya ASADA
270f9cf9e6 dist/debian: fix installing scyllatop
Since we removed dist/common/bin/scyllatop (1bb65a0888), we get a build
error when building the .deb package. To fix it, we need to create a
symlink for /usr/bin/scyllatop.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190316162105.28855-1-syuu@scylladb.com>
2019-03-17 19:37:44 +02:00
Tomasz Grabiec
05e2c87936 gdb: Introduce 'scylla sstables'
Finds all sstables on current shard and prints useful information,
like on-disk and in-memory usage.

Example:

  (gdb) scylla sstables
  (sstables::sstable*) 0x60100034d200: local=1 data_file=9551, in_memory=266192 (bf=400, summary=3072, sm=262096)
  (sstables::sstable*) 0x601000348600: local=1 data_file=1229, in_memory=266192 (bf=400, summary=3072, sm=262096)
  (sstables::sstable*) 0x601000348000: local=1 data_file=4785, in_memory=266192 (bf=400, summary=3072, sm=262096)
  (sstables::sstable*) 0x60100034c600: local=1 data_file=298, in_memory=266192 (bf=400, summary=3072, sm=262096)
2019-03-15 15:12:48 +01:00
Tomasz Grabiec
929653f51d gdb: Introduce find_instances() 2019-03-15 15:12:48 +01:00
Tomasz Grabiec
fc4952c579 gdb: Extract std_unqiue_ptr.get() 2019-03-15 15:12:48 +01:00
Tomasz Grabiec
e47a5019f2 gdb: Add chunked_vector wrapper 2019-03-15 15:12:47 +01:00
Tomasz Grabiec
a6da71e4da gdb: Add small_vector wrapper 2019-03-15 15:12:47 +01:00
Tomasz Grabiec
0e8589cfdf gdb: Add circular_buffer.size() and circular_buffer.external_memory_footprint() 2019-03-15 15:12:47 +01:00
Tomasz Grabiec
380c6fbdfe gdb: Add wrapper for seastar::lw_shared_ptr 2019-03-15 15:12:47 +01:00
Tomasz Grabiec
93e5e0d644 gdb: Add std_vector.external_memory_footprint() 2019-03-15 15:12:47 +01:00
Tomasz Grabiec
8866b1320a gdb: Add wrapper for boost::variant 2019-03-15 15:12:46 +01:00
Tomasz Grabiec
dd237c32af gdb: Add wrapper for std::optional 2019-03-15 15:12:46 +01:00
Paweł Dziepak
f4f56027bf Merge "Detect partitioner mismatch" from Piotr
"
Refuse to accept SSTables that were created with partitioner
different than the one used by the Scylla server.

Fixes #4331
"

* 'haaawk/4331/v4' of github.com:scylladb/seastar-dev:
  sstables: Add test for sstable::validate_partitioner
  sstables: Add sstable::validate_partitioner and use it
2019-03-15 11:45:10 +00:00
Piotr Jastrzebski
2b0437a147 sstables: Add test for sstable::validate_partitioner
Make sure the exception is thrown when Scylla tries to load an SSTable
created with an incompatible partitioner.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-03-15 10:47:47 +01:00
Piotr Jastrzebski
4aea97f120 sstables: Add sstable::validate_partitioner and use it
The Scylla server can't read sstables that were created with a different
partitioner than the one it is using.

We should make sure that Scylla identifies such mismatch
and refuses to use such SSTables.

We can use partitioner information stored in validation metadata
(Statistics.db file) for each SSTable and compare it against
partitioner used by Scylla.
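The check can be sketched like this (Python, hypothetical names; the real sstable::validate_partitioner is C++ and reads the name from the validation metadata in Statistics.db):

```python
def validate_partitioner(stored, configured):
    """Refuse an sstable whose recorded partitioner differs from the one
    this node is configured with. Illustrative sketch only."""
    if stored != configured:
        raise RuntimeError(
            f"SSTable was created with partitioner {stored}, "
            f"but this node uses {configured}")
```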

Fixes #4331

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-03-15 10:14:37 +01:00
Rafael Ávila de Espíndola
94c28cfb16 sstable: Wait for future returned by maybe_record_large_cells.
A previous version of the patch that introduced these calls had no
limit on how far behind the large data recording could get, and
maybe_record_large_cells returned null.

The final version switched to a semaphore, but unfortunately these
calls were not updated.

Tests: unit (dev)

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190314195856.66387-1-espindola@scylladb.com>
2019-03-14 21:01:37 +01:00
Piotr Sarna
9c544df217 service: run notifying code in threaded context
In order to allow yielding when handling endpoint lifecycle changes,
notifiers now run in a threaded context. Implementations which relied
on this assumption are supplemented with assertions that they indeed
run in seastar::async mode.

Fixes #4317
Message-Id: <45bbaf2d25dac314e4f322a91350705fad8b81ed.1552567666.git.sarna@scylladb.com>
2019-03-14 12:56:53 +00:00
Piotr Sarna
a7602bd2f1 database: add global view update stats
Currently view update metrics are only per-table, but per-table metrics
are not always enabled. In order to be able to see the number of
generated view updates in all cases, global stats are added.

Fixes #4221
Message-Id: <e94c27c530b2d7d262f76d03937e7874d674870a.1552552016.git.sarna@scylladb.com>
2019-03-14 12:04:18 +00:00
Paweł Dziepak
d4d2eb2ed5 Update seastar submodule
* seastar e640314...463d24e (3):
  > Merge 'Handle IOV_MAX limit in posix_file_impl' from Paweł
  > core: remove unneeded 'exceptional future ignored' report
  > tests/perf: support multiple iterations in a single test run
2019-03-13 14:24:58 +00:00
Tomasz Grabiec
2ef9d9c12e Merge "Record large cells to system.large_cells" from Rafael
Issue #4234 asks for a large collection detector. Discussing the issue,
Benny pointed out that it is probably better to have a generic large
cell detector, as it makes a natural progression from what we already
warn on (large partitions and large rows).

This patch series implements that. It is on top of
shutdown-order-patches-v7 which is currently on next.

With the changes to use a semaphore, this patch series might be getting
a bit big. Let me know if I should split it.

* https://github.com/espindola/scylla espindola/large-cells-on-top-of-shutdown-v5:
  db: refactor large data deletion code
  db: Rename (maybe_)?update_large_partitions
  db: refactor a try_record helper
  large_data_handler: assert it is not used after stop()
  db: don't use _stopped directly
  sstables: delete dead error handling code.
  large_data_handler: Remove const from a few functions
  large_data_handler: propagate a future out of stop()
  large_data_handler: Run large data recording in parallel
  Create a system.large_cells table
  db: Record large cells
  Add a test for large cells
2019-03-13 09:44:57 +01:00
Rafael Ávila de Espíndola
f983570ac8 Add a test for large cells
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
63251b66c1 db: Record large cells
Fixes #4234.

Large cells are now recorded in system.large_cells.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
d17083b483 Create a system.large_cells table
This is analogous to the system.large_rows table, but holds individual
cells, so it also needs the column name.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
8b4ae95168 large_data_handler: Run large data recording in parallel
With these changes, the futures returned by large_data_handler will not
normally wait for entries to be written to system.large_rows or
system.large_partitions.

We use a semaphore to bound how behind system.large_* table updates
can get.

This should avoid delaying sstable writes in the common case, which is
more relevant once we warn of large cells, since the default threshold
will be just 1MB.

Note that there is no ordering between the various maybe_record_* and
maybe_delete_large_data_entries requests. This means that we can end
up with a stale entry that is only removed once the TTL expires.
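The semaphore bound can be sketched as below, with Python threads standing in for seastar's background futures (names are illustrative; the real code uses a seastar::semaphore):

```python
import threading

class LargeDataRecorder:
    """Run recordings in the background; the writer blocks only when
    more than max_in_flight recordings are already pending."""

    def __init__(self, max_in_flight=10):
        self._sem = threading.Semaphore(max_in_flight)
        self._threads = []
        self._lock = threading.Lock()
        self.recorded = []

    def maybe_record(self, entry):
        self._sem.acquire()  # backpressure only once the bound is hit
        t = threading.Thread(target=self._record, args=(entry,))
        self._threads.append(t)
        t.start()            # return without waiting for the write

    def _record(self, entry):
        try:
            with self._lock:
                self.recorded.append(entry)  # stand-in for the table update
        finally:
            self._sem.release()

    def stop(self):
        # stop() must wait for pending recordings, which is why the
        # real stop() now returns a future.
        for t in self._threads:
            t.join()
```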

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
54b856e5e4 large_data_handler: propagate a future out of stop()
stop() will close a semaphore in a followup patch, so it needs to return a
future.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
989ab33507 large_data_handler: Remove const from a few functions
These will use a member semaphore variable in a followup patch, so they
cannot be const.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
0b763ec19b sstables: delete dead error handling code.
maybe_delete_large_data_entries handles exceptions internally, so the
code this patch deletes would never run.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
5fcb3ff2d7 db: don't use _stopped directly
This gives flexibility in how it is implemented.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
a17a936882 large_data_handler: assert it is not used after stop()
This should have been changed in the patch

db: stop the commit log after the tables during shutdown

But unfortunately I missed it then.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
f3089bf3d1 db: refactor a try_record helper
We had almost identical error handling for large_partitions and
large_rows. Refactor in preparation for large_cells.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:02 -07:00
Rafael Ávila de Espíndola
d7f263d334 db: Rename (maybe_)?update_large_partitions
This renames it to record_large_partitions, which matches
record_large_rows. It also changes the signature to be closer to
record_large_rows.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:16:04 -07:00
Rafael Ávila de Espíndola
f254664fe6 db: refactor large data deletion code
The code for deleting entries from system.large_partitions was almost a
duplicate of the code for deleting entries from system.large_rows.

This patch unifies the two, which also improves the error message when
we fail to delete entries from system.large_partitions.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:16:04 -07:00
Asias He
b8158dd65d streaming: Get rid of the keep alive timer in streaming
There is no guarantee that rpc streaming makes progress within any given
time period. Remove the keep alive timer in streaming to avoid killing
the session when the rpc streaming is merely slow.

The keep alive timer is used to close the session in the following case:

n2 (the rpc streaming sender) streams to n1 (the rpc streaming receiver)
kill -9 n2

We need this because we do not kill the session when gossip thinks a node
is down: the node might be down only temporarily, and it would be a waste
to drop the work already done, especially when the stream session takes a
long time.

Since in range_streamer we do not stream all the data in a single stream
session (we stream 10% of the data at a time, and we have retry logic),
I think it is fine to kill a stream session when gossip thinks a node is
down. This patch changes the code to close all stream sessions with a
node that gossip considers down.
Message-Id: <bdbb9486a533eee25fcaf4a23a946629ba946537.1551773823.git.asias@scylladb.com>
2019-03-12 12:20:28 +01:00
Duarte Nunes
2718c90448 Merge 'Add canceling long-standing view update requests' from Piotr
"
This series allows canceling view update requests when a node is
discovered to be DOWN. View updates are sent in the background with a
long timeout (5 minutes), and if we discover that the node is
unavailable, there's no point in waiting that long for the request
to finish. What's more, waiting for these requests occurs on shutdown,
which may mean waiting 5 minutes until Scylla properly shuts down,
which is bad for both users and dtests.

This series implements storage_proxy as a lifecycle subscriber,
so it can react to membership changes. It also keeps track of all
"interruptible" writes per endpoint, so once a node is detected as DOWN,
an artificial timeout can be triggered for all aforementioned write
requests.

Fixes #3826
Fixes #3966
Fixes #4028
"

* 'write_hints_for_view_updates_on_shutdown_4' of https://github.com/psarna/scylla:
  service: remove unused stop_hints_manager
  storage_proxy: add drain_on_shutdown implementation
  main: register storage proxy as lifecycle subscriber
  storage_proxy: add endpoint_lifecycle_subscriber interface
  storage_proxy: register view update handlers for view write type
  storage_proxy: add intrusive list of view write handlers
  storage_proxy: add view_update_write_response_handler
2019-03-08 13:34:46 -03:00
Piotr Sarna
ae52b3baa7 tests: fix complex timestamp test flakiness
Complex timestamp tests were ported from dtest and contained a potential
race: rows were updated with TTL 1 and then checked, in an eventually()
loop, for existence in both base and view replicas. During this loop,
however, the TTL of 1 second might already have passed, and the row
could have been deleted from the base. This patch changes the TTL to
30 seconds, making the tests extremely unlikely to be flaky.
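The shape of the retry loop, and why TTL 1 races with it, can be sketched as follows (hypothetical eventually() helper in Python; the actual tests are C++):

```python
import time

def eventually(check, timeout=30, interval=0.5):
    """Poll `check` until it returns true or `timeout` seconds pass.
    If the row under test has TTL 1, it can expire while this loop is
    still polling; a TTL of 30 makes that practically impossible."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return
        if time.monotonic() >= deadline:
            raise AssertionError("condition not met within timeout")
        time.sleep(interval)
```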
Message-Id: <6b43fe31850babeaa43465eb771c0af45ee6e80d.1552041571.git.sarna@scylladb.com>
2019-03-08 13:34:27 -03:00
Tomasz Grabiec
eb5506275b Merge "Further enhancements to perf_fast_forward" from Paweł
This series contains several improvements to perf_fast_forward that
either address some of the problems seen in the automated runs or help
understanding the results.

The main problem was that the small-partition-slicing test had a
preparation stage disproportionately long compared to the actual testing
phase. While the fragments-per-second results weren't affected by that,
it restricted the number of iterations we were able to run, and the
test, whose single iteration is short (and therefore more prone to
noise), was executed only four times. This was solved by sharing the
preparation stage across all iterations, enabling the test to be run
many times and improving the stability of the results.

Another improvement is the ability to dump all test results and process
them to produce histograms. This lets us see what the distribution of
particular statistics looks like and whether there are any complications.

Refs #4278.

* https://github.com/pdziepak/scylla.git more-perf_fast_forward/v1:
  tests/perf_fast_forward: print number of iterations of each test
  tests/perf_fast_forward: reuse keys in small partition slicing test
  tests/perf_fast_forward: extract json result file writing logic
  tests/perf_fast_forward: add an option to dump all results
  tests/perf_fast_forward: add script for analysing full results
2019-03-07 12:22:13 -03:00
Piotr Sarna
aea4b7ea78 service: remove unused stop_hints_manager
Stopping hints manager now occurs when draining storage proxy
and it shouldn't be executed independently, so it's removed
from external API.
2019-03-07 13:44:06 +01:00
Piotr Sarna
cc806909d7 storage_proxy: add drain_on_shutdown implementation
When the storage proxy is shutting down, all interruptible writes can be
timed out so that we don't wait for them. Instead, the mechanism falls
back to storing hints and/or not progressing with view building.
2019-03-07 13:44:05 +01:00
Piotr Sarna
c61d0ee8aa main: register storage proxy as lifecycle subscriber
In order to be able to act when a node joins or leaves, the storage
proxy is registered as an endpoint lifecycle subscriber.

Fixes #3826
Fixes #4028
2019-03-07 12:10:40 +01:00