11856 Commits

Author SHA1 Message Date
Duarte Nunes
9e88b60ef5 mutation: Set cell using clustering_key_prefix
Change the clustering key argument in mutation::set_cell from
exploded_clustering_prefix to clustering_key_prefix, which allows for
some overall code simplification and fewer copies. This mostly affects
the cql3 layer.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:50 +02:00
Duarte Nunes
db63ffdbb4 mutation_partition: Harmonize apply_delete overloads
This patch ensures the different mutation_partition::apply_delete()
overloads behave similarly, so that, for example, an empty clustering
key is treated the same way as an empty
exploded_clustering_key_prefix.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:50 +02:00
Duarte Nunes
07e648251b prefix_compound_view_wrapper: Add is_full and is_empty functions
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:50 +02:00
Duarte Nunes
ef138bdd2c tests/cql_query_test: Add range deletion tests
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:50 +02:00
Duarte Nunes
42873189d4 cql3: Partially support ranged deletions
This patch introduces partial support for range deletions. This allows
deletion operations such as

delete from cf where p=1 and c > 0 and c <= 3.

We enforce that both range bounds be specified, because we can't represent
infinite bounds in the current sstable format. Such bounds are represented
as a prefix with no components, with the bound_kind informing whether they
are a bottom or top bound.

We're currently unable to serialize an infinite bound in such a way that it
would be correctly interpreted by Cassandra 2.2.x. A serialized bound is a
composite with a (<length><value><EOC>)+ format. While we could technically
represent the bottom bound, the top bound, if written as a single component
with 0 bytes in size and some EOC, would always sort before other values.
The same would happen if represented as an empty (no components) composite,
because in Cassandra 2.2.x those always have EOC = NONE.

This limitation should stay in place until we can properly represent range
tombstones in the storage format.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:50 +02:00
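
To illustrate the (<length><value><EOC>)+ layout mentioned in 42873189d4, here is a minimal sketch of a composite serializer. The names `eoc`, `component` and `serialize_composite` are hypothetical and are not Scylla's types; the 2-byte big-endian length and the EOC values -1/0/1 are assumptions about the Cassandra 2.2.x encoding, not taken from the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical end-of-component marker (assumed values: -1, 0, 1).
enum class eoc : int8_t { start = -1, none = 0, end = 1 };

// Hypothetical component: raw value bytes plus its EOC marker.
struct component {
    std::string value;
    eoc marker;
};

// Serialize components as (<length><value><EOC>)+, assuming a
// 2-byte big-endian length in front of every value.
std::vector<uint8_t> serialize_composite(const std::vector<component>& cs) {
    std::vector<uint8_t> out;
    for (const auto& c : cs) {
        auto len = static_cast<uint16_t>(c.value.size());
        out.push_back(static_cast<uint8_t>(len >> 8));
        out.push_back(static_cast<uint8_t>(len & 0xff));
        out.insert(out.end(), c.value.begin(), c.value.end());
        out.push_back(static_cast<uint8_t>(c.marker));
    }
    return out;
}
```

With this layout, a composite with no components serializes to zero bytes, so it has no room for an EOC that could mark it as a top bound; this matches the limitation the commit message describes.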
Duarte Nunes
169cc41251 single_column_primary_key_restrictions: Implement has_bound()
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:49 +02:00
Duarte Nunes
f7bc88734a modification_statement: Use statement_restrictions for where clause
This patch replaces the custom where clause processing by adding and
using a statement_restrictions field to modification_statement.

This improves code reuse and also moves some checks to prepare-time.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:49 +02:00
Duarte Nunes
aff23f93b4 statement_restrictions: Expose primary key restrictions
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:49 +02:00
Duarte Nunes
8b7d7c4e6d to_string: Add missing include
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:49 +02:00
Raphael S. Carvalho
ddc1d80c28 compaction: remove dead function declaration
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170504013046.23522-2-raphaelsc@scylladb.com>
2017-05-04 11:48:51 +03:00
Raphael S. Carvalho
61229ab88c compaction: fix type for cleanup
After compaction revamp, compaction type set by cleanup at its ctor
is being overwritten at compaction::setup(). Consequently, cleanup
would not be stopped by 'nodetool stop cleanup' and cleanup would
be listed as regular compaction in 'nodetool compactionstats'.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170504013046.23522-1-raphaelsc@scylladb.com>
2017-05-04 11:48:50 +03:00
Avi Kivity
211a337883 Merge seastar upstream
* seastar 194d80f...4a3118c (4):
  > execution_stage: fix wrong exception thrown for non-unique stages
  > metrics: add missing move assignment operators for metric_group, metric_groups
  > Remove unused lambda captures
  > core: lw_shared_ptr::get() should return nullptr for null pointer
2017-05-04 11:47:05 +03:00
Asias He
66e3b73b9c repair: Fix partition estimation
We estimate the number of partitions for a given range of a column family
and split the range into sub-ranges that contain fewer partitions, each
serving as a checksum unit.

The estimation is wrong, because we need to count the partitions on all
the shards, instead of only counting the local shard.

Fixes #2299

Message-Id: <7876285bd26cfaf65563d6e03ec541626814118a.1493817339.git.asias@scylladb.com>
2017-05-03 16:25:45 +03:00
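
The fix in 66e3b73b9c amounts to summing the estimate over every shard before splitting the range. A minimal sketch, assuming the per-shard estimates have already been collected into a vector; `estimate_partitions_all_shards` and `subrange_count` are hypothetical names, not functions from the repair code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical: per_shard[i] is the partition estimate reported by shard i.
uint64_t estimate_partitions_all_shards(const std::vector<uint64_t>& per_shard) {
    return std::accumulate(per_shard.begin(), per_shard.end(), uint64_t{0});
}

// Hypothetical: number of sub-ranges needed so that each holds at most
// max_per_subrange partitions (rounded up, never less than 1).
uint64_t subrange_count(uint64_t estimated, uint64_t max_per_subrange) {
    if (max_per_subrange == 0) {
        return 1;
    }
    return std::max<uint64_t>(1, (estimated + max_per_subrange - 1) / max_per_subrange);
}
```

For example, with shard estimates of 100, 150 and 50 and a limit of 100 partitions per sub-range, the total is 300 and the range is split into 3 sub-ranges; counting only the first shard would yield a single sub-range.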
Pekka Enberg
1e04731fa0 Merge "gossip mark alive fixes" from Asias
"This series fixes the use-after-free issue in gossip and eliminates the
duplicated / unnecessary mark alive operations.

Fixes #2341"

* tag 'asias/gossip_fix_mark_alive/v1' of github.com:cloudius-systems/seastar-dev:
  gossip: Ignore callbacks and mark alive operation in shadow round
  gossip: Ignore the duplicated mark alive operation
  gossip: Fix use after free in mark_alive
2017-05-03 12:19:16 +03:00
Jacob Johansen
9616956c16 dist/docker: Add support for experimental flag
Fixes #2188

Message-Id: <20170502180047.24071-1-jacob.johansen@virginpulse.com>
2017-05-03 10:29:55 +03:00
Asias He
3bd9840c01 gossip: Ignore callbacks and mark alive operation in shadow round
In shadow round, we are only interested in the peer's endpoint_state, e.g., gossip
features, host_id, tokens. No need to call the on_restart or on_join callbacks
or to go through the mark alive procedure with EchoMessage gossip message. We
will do them during normal gossip runs anyway.
2017-05-03 07:24:21 +08:00
Asias He
1441ae5cac gossip: Ignore the duplicated mark alive operation
If a node is being marked as alive with EchoMessage, ignore any future
duplicated mark alive operation.
2017-05-03 07:24:21 +08:00
Asias He
d682fbfa28 gossip: Fix use after free in mark_alive
After sending the echo message, the node might not be in the
endpoint_state_map anymore, so using the reference to local_state
might cause a use-after-free.

Fixes #2341
2017-05-03 07:24:20 +08:00
Raphael S. Carvalho
8b0e358d73 tests/sstable_test: fix release-mode compaction_manager_test
In release mode, the compaction task is active after submitting the request
because a ready future may be scheduled immediately.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170502171925.9893-1-raphaelsc@scylladb.com>
2017-05-02 20:48:30 +03:00
Avi Kivity
7e29dd7066 managed_bytes: improve alignment hygiene
While blob_storage is marked as an unaligned type, the back references also
point to an unaligned type (a pointer to blob_storage), since a back
reference can live in a blob_storage.  This triggers errors from zapcc/clang 4.

Fix by creating a type for the reference, which is marked as unaligned.
Message-Id: <20170502071404.507-1-avi@scylladb.com>
2017-05-02 10:04:13 +01:00
Avi Kivity
b46f6a4124 build: ignore unused lambda capture warnings from clang
Worthwhile to revisit later.
2017-05-02 10:09:58 +03:00
Raphael S. Carvalho
8dfb5f9c33 tests/sstable_test: fix compaction_manager_test
After 'compaction: make major compaction go through compaction manager',
the test fails because the task is preempted in debug mode before it reaches
the instruction that increases the stat.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170501183255.6191-1-raphaelsc@scylladb.com>
2017-05-02 09:06:41 +03:00
Avi Kivity
1d12d69881 logalloc: define segment_zone::maximum_size
A missing definition yields build errors with some compilers.
2017-05-01 16:31:29 +03:00
Amnon Heiman
b59c95359d scylla_setup: Fix conditional when checking for newer version
When the way housekeeping checks for a newer version and warns about it
during installation was changed, the UUID part was removed but was kept
in the surrounding if.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20170426075724.7132-1-amnon@scylladb.com>
2017-05-01 12:13:35 +03:00
Raphael S. Carvalho
3071b9052a compaction: make cleanup_compaction inherit from regular_compaction
Some fields that belong to regular and cleanup aren't needed for
resharding_compaction, such as incremental selector (which is used
for determining max purgeable timestamp for a given decorated key)
Better move those fields to regular and make cleanup inherit from
regular compaction.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170428195611.9196-1-raphaelsc@scylladb.com>
2017-04-30 19:37:09 +03:00
Raphael S. Carvalho
687a4bb0c2 dtcs: do not compact fully expired sstable whose ancestor is not deleted yet
Currently, fully expired sstable[1] is unconditionally chosen for compaction
by DTCS, but that may lead to a compaction loop under certain conditions.

Let's consider that an almost expired sstable is compacted, and it's not
deleted yet, and that the new sstable becomes expired before its ancestor is
deleted.
Because this new sstable is expired, it will be chosen by DTCS, but it will
not be purged because 'compacted undeleted' sstables are taken into account
when calculating the max purgeable timestamp, which prevents expired data from
being purged. The problem is that this sequence of events can keep happening
forever, as reported in issue #2260.
NOTE: This problem was easier to reproduce before the improvement to compaction
of expired cells, because a fully expired sstable was being converted into an
sstable full of tombstones, which is also considered fully expired.

Fixes #2260.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170428233554.13744-1-raphaelsc@scylladb.com>
2017-04-30 19:35:46 +03:00
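
The rule described in 687a4bb0c2 can be sketched as a filter over candidate sstables. This is only an illustration under stated assumptions: `sstable_info`, `expired_candidates` and the generation-based ancestor tracking are hypothetical stand-ins, not the data structures the DTCS implementation uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified view of an sstable.
struct sstable_info {
    uint64_t generation;
    bool fully_expired;
    std::vector<uint64_t> ancestors;  // generations it was compacted from
};

// Return generations of fully expired sstables that are safe to pick:
// those with no ancestor still in the 'compacted undeleted' set.
std::vector<uint64_t> expired_candidates(
        const std::vector<sstable_info>& sstables,
        const std::unordered_set<uint64_t>& compacted_undeleted) {
    std::vector<uint64_t> out;
    for (const auto& s : sstables) {
        if (!s.fully_expired) {
            continue;
        }
        bool ancestor_alive = std::any_of(
            s.ancestors.begin(), s.ancestors.end(),
            [&](uint64_t g) { return compacted_undeleted.count(g) > 0; });
        if (!ancestor_alive) {
            out.push_back(s.generation);
        }
    }
    return out;
}
```

An expired sstable whose ancestor has not been deleted yet is skipped on this pass and becomes eligible once the ancestor leaves the set, which breaks the loop described above.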
Paweł Dziepak
24f4dcf9e4 db: make virtual dirty soft limit configurable
Message-Id: <20170428150005.28454-1-pdziepak@scylladb.com>
2017-04-30 19:17:22 +03:00
Avi Kivity
248aa4fc23 Merge "Fix update of counter in static rows" from Paweł
"The logic responsible for converting counter updates to counter shards was
not covered by unit tests and didn't transform counter cells inside static
rows.

This series fixes the problem and makes sure that the tests cover both
static rows and transformation logic."

* tag 'pdziepak/static-counter-updates/v1' of github.com:cloudius-systems/seastar-dev:
  tests/counter: test transform_counter_updates_to_shards
  tests/counter: test static columns
  counters: transform static rows from updates to shards
2017-04-30 19:13:44 +03:00
Avi Kivity
339322517e Merge "sstables: index_reader: Fix advance_to() to include relevant range tombstones" from Tomasz
"Fixes #2326."

* 'tgrabiec/fix-range-tombstones-missing-when-slicing' of github.com:cloudius-systems/seastar-dev:
  tests: mutation_source_test: Cover single-ranged queries in test_streamed_mutation_slicing_returns_only_relevant_tombstones()
  tests: mutation_source_test: Add test for slicing of clustered rows
  tests: mutation_reader_assertions: Log expectations
  tests: mutation_reader_assertions: Add produces_eos_or_empty_mutation()
  tests: sstables: Use read_row() for single-key reads
  tests: sstables: Test more configurations of sstable writer in test_sstable_conforms_to_mutation_source()
  sstables: Improve logging
  sstables: index_reader: Fix advance_to() to include relevant range tombstones
2017-04-30 14:40:41 +03:00
Avi Kivity
831ee80c3c tests: workaround older boost::apply_visitor requiring a result_type member
Older versions of boost::apply_visitor require a result_type member for the
visitor; supply it to make them happy.

Fixes #2312.
2017-04-30 13:56:44 +03:00
Takuya ASADA
a19c1b7f86 dist/redhat: add missing dependencies for Fedora
We only have "%{?rhel:Requires}" for scylla-server; we need one for Fedora too.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1493314367-419-1-git-send-email-syuu@scylladb.com>
2017-04-30 11:06:27 +03:00
Takuya ASADA
fe9f72d2c0 dist/debian: add python3-pyudev to dependencies
pyudev is required for seastar/scripts/perftune.py.

Fixes #2315

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1493309116-18074-1-git-send-email-syuu@scylladb.com>
2017-04-30 11:05:15 +03:00
Paweł Dziepak
f5cf86484e lsa: introduce upper bound on zone size
Attempting to create huge zones may introduce significant latency. This
patch introduces a maximum allowed zone size so that the time spent
trying to allocate and initialise a zone is bounded.

Fixes #2335.

Message-Id: <20170428145916.28093-1-pdziepak@scylladb.com>
2017-04-30 10:58:11 +03:00
Paweł Dziepak
5c302cf67b tests/counter: test transform_counter_updates_to_shards
2017-04-28 16:29:34 +01:00
Paweł Dziepak
0473750056 tests/counter: test static columns
2017-04-28 16:29:34 +01:00
Paweł Dziepak
0ffdd8d3d0 counters: transform static rows from updates to shards
2017-04-28 16:29:34 +01:00
Tomasz Grabiec
d4df6e214e tests: mutation_source_test: Cover single-ranged queries in test_streamed_mutation_slicing_returns_only_relevant_tombstones()
2017-04-27 18:43:49 +02:00
Tomasz Grabiec
22cce52dff tests: mutation_source_test: Add test for slicing of clustered rows
2017-04-27 18:43:49 +02:00
Tomasz Grabiec
86b693f562 tests: mutation_reader_assertions: Log expectations
2017-04-27 18:43:49 +02:00
Tomasz Grabiec
ece6e107cc tests: mutation_reader_assertions: Add produces_eos_or_empty_mutation()
2017-04-27 18:43:49 +02:00
Tomasz Grabiec
6354acc1a2 tests: sstables: Use read_row() for single-key reads
So that as_mutation_reader() will create the same kind of reader which
database::make_sstable_reader() does.

Before this change, all readers were range readers.
2017-04-27 18:43:49 +02:00
Tomasz Grabiec
fd5dbe04b5 tests: sstables: Test more configurations of sstable writer in test_sstable_conforms_to_mutation_source()
Test different versions of the format, and different promoted index
block sizes.  The size of 1 is especially important, it will put each
fragment in a separate block, exposing various issues with promoted
index handling.
2017-04-27 18:43:49 +02:00
Tomasz Grabiec
c5baeed6d2 sstables: Improve logging
2017-04-27 18:43:49 +02:00
Tomasz Grabiec
b523815ac1 sstables: index_reader: Fix advance_to() to include relevant range tombstones
Fixes #2326.
2017-04-27 18:43:49 +02:00
Glauber Costa
14b9aa2285 reduce kernel scheduler wakeup granularity
We set the scheduler wakeup granularity to 500usec, because that is the
difference in runtime we want to see from a waking task before it
preempts the running task (which will usually be Scylla). Scheduling
other processes less often is usually good for Scylla, but in this case,
one of the "other processes" is also a Scylla thread, the one we have
been using for marking ticks after we have abandoned signals.

However, there is an artifact of the Linux scheduler that causes those
preemptions to be missed if the wakeup granularity is exactly half of
the sched_latency. Our sched_latency is set to 1ms, which
represents the maximum time period in which we will run all runnable
tasks.

We want to keep the sched_latency at 1ms, so we will reduce the wakeup
granularity to something slightly lower than 500usec, to make sure
that this artifact won't affect the scheduler calculations. According to
my tests 499.99usec will do, but we will reduce it to a round
number.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20170427135039.8350-1-glauber@scylladb.com>
2017-04-27 18:11:35 +03:00
Pekka Enberg
9cfb94510f Merge "Fix issues found by PVS-Studio static analyzer" from Vlad
Fix issues found by PVS-Studio as reported by Phillip Khandeliants.

Merge branch 'pvs_analyzer_errors-v1' of github.com:cloudius-systems/seastar-dev

* 'pvs_analyzer_errors-v1' of github.com:cloudius-systems/seastar-dev:
  type_parser: catch exceptions by reference and not by value
  token_metadata::get_host_id(ep): add a missing 'throw'
2017-04-27 11:39:49 +03:00
Vlad Zolotarov
d5b76d5198 type_parser: catch exceptions by reference and not by value
Found by PVS-Studio static analyzer:

Type slicing. An exception should be caught by reference rather than by value.

Fixes #2288

Reported-by: Phillip Khandeliants
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-04-26 15:12:15 -04:00
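
The slicing problem fixed in d5b76d5198 can be reproduced with a small hierarchy. This is a minimal sketch; `base_error`, `derived_error` and the two helper functions are hypothetical and unrelated to the actual type_parser code.

```cpp
#include <cassert>
#include <string>

// Hypothetical exception hierarchy used only for this illustration.
struct base_error {
    virtual ~base_error() = default;
    virtual std::string name() const { return "base_error"; }
};

struct derived_error : base_error {
    std::string name() const override { return "derived_error"; }
};

// Catching by value copies only the base_error part (slicing),
// so the override in derived_error is lost.
std::string caught_by_value() {
    try {
        throw derived_error();
    } catch (base_error e) {
        return e.name();
    }
}

// Catching by const reference keeps the dynamic type intact.
std::string caught_by_reference() {
    try {
        throw derived_error();
    } catch (const base_error& e) {
        return e.name();
    }
}
```

Here `caught_by_value()` returns "base_error" while `caught_by_reference()` returns "derived_error", which is why the analyzer flags catch-by-value of polymorphic types.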
Vlad Zolotarov
181c68e97d token_metadata::get_host_id(ep): add a missing 'throw'
Caught by PVS-Studio static analyzer:

The object was created but it is not being used. The 'throw' keyword could be missing: throw runtime_error(FOO);

Reported-by: Phillip Khandeliants
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-04-26 14:54:34 -04:00
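
The class of bug fixed in 181c68e97d is easy to miss because the code still compiles. A minimal sketch with hypothetical helper names (`require_found_buggy`, `require_found_fixed`), not the actual token_metadata code:

```cpp
#include <cassert>
#include <stdexcept>

// Buggy: constructs a temporary exception object and discards it,
// so the error path silently continues.
bool require_found_buggy(bool found) {
    if (!found) {
        std::runtime_error("host id not found");  // missing 'throw'
    }
    return found;
}

// Fixed: the exception is actually thrown.
bool require_found_fixed(bool found) {
    if (!found) {
        throw std::runtime_error("host id not found");
    }
    return found;
}
```

Calling `require_found_buggy(false)` returns normally, whereas `require_found_fixed(false)` propagates a `std::runtime_error` to the caller.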
Takuya ASADA
7a59336b8a main.cc: drop FS type check
Since we added support for ext4, we don't need to limit the filesystem to XFS anymore.

Fixes #1933

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1493212525-26264-1-git-send-email-syuu@scylladb.com>
2017-04-26 17:35:55 +03:00
Raphael S. Carvalho
8bae413bcf database: fix format msg for sprint
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170425224920.16607-1-raphaelsc@scylladb.com>
2017-04-26 17:18:58 +03:00