Commit Graph

14737 Commits

Author SHA1 Message Date
Tomasz Grabiec
29d167bf01 mvcc: Introduce partition_snapshot_row_cursor::ensure_entry_in_latest()
To avoid duplication of logic between cache reader and
ensure_entry_if_complete().
2018-03-06 11:50:28 +01:00
Tomasz Grabiec
fb2107416b tests: cache: Invoke partial eviction in test_concurrent_reads_and_eviction
In hope of catching more issues.
2018-03-06 11:50:27 +01:00
Tomasz Grabiec
bee875fa7d cache: Ensure all evictable partition_versions have a dummy after all rows
Every evictable version will have a dummy entry at the end so that it can be
tracked in the LRU.

It is also needed to allow old versions to stay around (with
tombstones and static rows) after all rows are evicted. Such versions
must be fully discontinuous, and we need some entry to mark that.
2018-03-06 11:50:27 +01:00
Tomasz Grabiec
5320705300 cache: Propagate cache_tracker to places manipulating evictable entries
cache_tracker reference will be needed to link/unlink row entries.

No change of behavior in this patch.
2018-03-06 11:50:27 +01:00
Tomasz Grabiec
30df3ddd7d cache: Do not evict from cache_entry destructor
We will need to propagate a cache_tracker reference to evict(). Instead
of evicting from destructor, do so before cache_entry gets unlinked
from the tree. Entries which are not linked, don't need to be
explicitly evicted.
2018-03-06 11:50:27 +01:00
Tomasz Grabiec
4efab6f6a6 cache: Use on_evicted() in cache_tracker::clear()
In preparation for switching LRU to row level.
2018-03-06 11:50:27 +01:00
Tomasz Grabiec
2118bdce01 cache: Extract cache_entry::on_evicted() 2018-03-06 11:50:27 +01:00
Tomasz Grabiec
24c5949518 cache: cache_tracker: Rename on_merge() to on_partition_merge() 2018-03-06 11:50:27 +01:00
Tomasz Grabiec
d66e864310 cache: cache_tracer: Rename on_erase() to on_partition_erase() 2018-03-06 11:50:27 +01:00
Tomasz Grabiec
3dc9000c51 mutation_partition: Introduce rows_entry::is_last_dummy()
Will be needed by row evictor, which needs to treat last dummies
specially (not evict them).
2018-03-06 11:50:26 +01:00
Tomasz Grabiec
e571bd5a2e mvcc: Add partition_entry::versions_from_oldest() 2018-03-06 11:50:26 +01:00
Tomasz Grabiec
654d4b76c0 anchorless_list: Introduce all_elements_reversed() 2018-03-06 11:50:26 +01:00
Tomasz Grabiec
d9a38c1c85 mutation_partition: Add API to walk from rows_entry to cache_entry
Will be needed on row eviction, to unlink containers when they become
fully evicted.
2018-03-06 11:50:26 +01:00
Tomasz Grabiec
0ccae80332 intrusive_set_external_comparator: Introduce container_of_only_member() 2018-03-06 11:50:26 +01:00
Tomasz Grabiec
758dfd404b intrusive_set_external_comparator: Use auto_unlink on nodes
Needed for row-level eviction, which doesn't have a reference to the
container.
2018-03-06 11:50:26 +01:00
Tomasz Grabiec
1a85c6d556 intrusive_set_external_comparator: Introduce iterator_to() 2018-03-06 11:50:26 +01:00
Tomasz Grabiec
bbe771e28f tests: Add more tests for continuity merging 2018-03-06 11:50:26 +01:00
Tomasz Grabiec
9893e8e5f7 mvcc: Make each version have independent continuity
This change is a preparation for introducing row-level eviction, such that entries
can be evicted from older versions without having to touch other versions.

Currently continuity flags on entries are interpreted relative to the
combined view merged from all entries. For example:

 v2:                  <key=2, cont=1>
 v1: <key=1, cont=1>

In v2, the flag on entry key=2 marks the range (1, 2) as
continuous. This is problematic because if the old version is evicted, continuity
will change in an incorrect way:

   v2:                  <key=2, cont=1>

Here, the range (-inf, 1) would be marked as continuous, which is not true.

To solve this problem, we change the rules for continuity
interpretation in MVCC. Each version will have its own continuity,
fully specified in that version, independent of continuity of other
versions. Continuity of the snapshot will be a union of continuous
ranges in each version.

It is assumed that continuous intervals in different versions are non-
overlapping, except for points corresponding to complete rows, in
which case a later version may overlap with an older version
(overwrite). We make use of this assumption to make calculation of the
union of intervals on merging easier. I make use of the above
assumption in mutation_partition::apply_monotonically().

MVCC population of incomplete entries already almost maintains the
non-overlapping invariant, because population intervals correspond to
intervals which are incomplete in the old snapshot. The only change
needed is to ensure that both population bounds will have entries in
the latest version. Population from memtables doesn't mark any
intervals as continuous, so also conforms. The only change needed
there is to not inherit continuity flags from the old snapshot,
effectively making the new version internally discontinuous except for
row points.

The example from the beginning will become:

 v2: <key=1, cont=0>  <key=2, cont=1>
 v1: <key=1, cont=1>

When marking a range as continuous with some rows present only in
older versions, we need to insert entries in the latest version, so
that we can mark the range as continuous. The easiest solution is to
copy the entry from the old version. Another option would be to add
support for incomplete rows and insert such instead. This way we would
avoid duplicating row contents. This optimization is deferred.
2018-03-06 11:50:25 +01:00
Tomasz Grabiec
bd1e730053 tests: cache: Add test for merging and reading randomly populated versions 2018-03-06 11:32:09 +01:00
Tomasz Grabiec
1b959cb6e9 tests: cache: Take parameters by const& 2018-03-06 11:32:09 +01:00
Tomasz Grabiec
d2744b6ad8 tests: mvcc: Don't set mutations in versions directly
Simply copying mutations which are not fully continuous may violate
MVCC invariants, like the one about non-overlapping continuity which
will be added later. Use apply_to_incomplete() instead.

This unfortunately reduces strenght of the test, since the continuity
of the entry is now completely determined by the first version. We should
use populate() instead, but it doesn't exist yet. It could be extracted
from cache_streamed_mutation, but that's not an easy change.

This is alleviated by adding a similar test to row_cache_test_g, in a
later patch.
2018-03-06 11:32:09 +01:00
Tomasz Grabiec
2a0ece5205 mvcc: Allow dereferencing partition_snapshot_row_weakref 2018-03-06 11:32:09 +01:00
Tomasz Grabiec
d0e1a3c63e mvcc: partition_snapshot_row_weakref: Introduce is_in_latest_version() 2018-03-06 11:32:09 +01:00
Tomasz Grabiec
2f956499a7 mvcc: Drop unused _evictable flag from partition_version_ref 2018-03-06 11:32:09 +01:00
Tomasz Grabiec
313f2c2bb0 cache: Document intent of maybe_update_continuity() 2018-03-06 11:32:09 +01:00
Tomasz Grabiec
3214883a25 cache: Extract cache_streamed_mutation::ensure_population_lower_bound() 2018-03-06 11:32:09 +01:00
Tomasz Grabiec
d9f0c1f097 tests: cache: Fix invalidate() not being waited for
Probably responsible for occasional failures of subsequent assertion.
Didn't mange to reproduce.

Message-Id: <1520330967-584-1-git-send-email-tgrabiec@scylladb.com>
2018-03-06 12:14:04 +02:00
Asias He
25aa59f2f1 gossip: Fix force_after in wait_for_gossip
In commit 8af0b501a2 (gossip: wait for stabilized gossip on bootstrap)

The force_after variable was changed from int32_t to stdx::optional<int32_t>

-            if (force_after > 0 && total_polls > force_after) {
+            if (force_after && total_polls > *force_after) {

Checking force_after > 0 was dropped which is wrong because force_after
is set to -1 by default. So the if branch will always be executed after
1 poll.

We always see:

   [shard 0] gossip - Gossip not settled but startup forced by
   skip_wait_for_gossip_to_settle. Gossp total polls: 1

even if skip_wait_for_gossip_to_settle is not set at all.

Fixes #3257
Message-Id: <845d219cea6101a7c507c13879c850a5c882e510.1520297548.git.asias@scylladb.com>
2018-03-06 10:11:02 +02:00
Vladimir Krivopalov
2cbdb91070 Remove unused io/ directory
Commit 9309a2ee6f ("Remove obselete
files") removed all of the callers but forgot to remove the directory.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <dcdd6ac66e88fac29cc2b0a12936688e71c1d267.1520314939.git.vladimir@scylladb.com>
2018-03-06 08:08:02 +02:00
Asias He
8900e830a3 storage_service: Add missing return in pieces empty check
If pieces.empty is empty, it is bogus to access pieces[0]:

   sstring move_name = pieces[0];

Fix by adding the missing return.

Spotted by Vlad Zolotarov <vladz@scylladb.com>

Fixes #3258
Message-Id: <bcb446f34f953bc51c3704d06630b53fda82e8d2.1520297558.git.asias@scylladb.com>
2018-03-06 08:04:39 +02:00
Vladimir Krivopalov
acdce55572 Inject CryptoPP namespace where Crypto++ byte typedef is used.
In Crypto++ v6, the `byte` typedef has been moved from the global
namespace to the CryptoPP:: namespace.
To make Scylla code compile with both old and new versions, bring the
namespace in so that the code works regardless of the scope of `byte`
definition.

Fixes #3252

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <60e7bfe868b778b1c9bbe15d7247db64b61bd406.1520272198.git.vladimir@scylladb.com>
2018-03-05 20:43:07 +02:00
Avi Kivity
eb598876e5 build: remove broken and unneeded xxhash include path
"-I$full_builddir/{mode}/xxhash" doesn't resolve to a valid path, because
full_builddir is a Python variable, not a Ninja variable.  In build.ninja
it appears as "-I/release/xxhash".

Since the build nevertheless works, we can remove the broken flag instead
of fixing it.
Message-Id: <20180305135919.13634-1-avi@scylladb.com>
2018-03-05 15:34:30 +01:00
Duarte Nunes
0c05fc0bff tests/flush_queue_test: Don't assume continuations run immediately
This patch fixes an issue with test_propagation(), where the test
assumed that after the future returned from wait_for_pending(0)
resolved, the continuations set for the post operation had already
run, which is not true.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180305131908.7667-1-duarte@scylladb.com>
2018-03-05 15:22:33 +02:00
Avi Kivity
1dae29b48d test: mutation_reader_test: fix no-timeout case in reader_wrapper
reader_wrapper's _timeout defaults to now(), which means to time
out immediately rather than no timeout.

Fix by switching to a time_point, defaulting to no_timeout, and
provide a compatible constructor (with a duration parameter) for
callers that do want a duration-based timeout.

Tests: mutation_reader_test (debug, release)
Message-Id: <20180305111739.31972-1-avi@scylladb.com>
2018-03-05 12:40:07 +01:00
Avi Kivity
a9942bd84a Merge seastar upstream
* seastar f841d2d...08e02dc (3):
  > future: make future::wait() a supported function
  > scripts: perftune.py: don't allow cpu-mask that does't include any IRQ CPU
  > Tutorial: show nice dashes in HTML
2018-03-05 12:58:15 +02:00
Vlad Zolotarov
e3ca390333 tests: gce_snitch_test: drop the property file related message
The message in question is printed with printf() which is bad by itself.
And most importantly this test uses a single .property file so this message
doesn't add any interesting information to begin with. Therefore it makes
more sense to drop it than to fix it.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1519661059-13325-1-git-send-email-vladz@scylladb.com>
2018-03-04 16:16:37 +02:00
Takuya ASADA
3229a87fee dist/debian: Drop scylla-fstrim cron job from Debian 8/9
Since we installs scylla-fstrim systemd unit files on Debian 8/9, no need to
install cron job, so drop them.

Fixes #3249

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1519950212-16231-2-git-send-email-syuu@scylladb.com>
2018-03-04 16:13:06 +02:00
Takuya ASADA
759b4de7a5 dist/debian: drop systemd unit files on Ubuntu 14.04
Ubuntu 14.04 uses upstart as init program, don't need systemd unit files,
so drop them.

Fixes #3245

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1519950212-16231-1-git-send-email-syuu@scylladb.com>
2018-03-04 16:13:05 +02:00
Vladimir Krivopalov
e9e9ec2d16 Guidelines for preparing patches in HACKING.md
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <93bf4d5c04848daf2157d1343748410995b224db.1520045191.git.vladimir@scylladb.com>
2018-03-04 16:12:00 +02:00
Piotr Jastrzebski
29eb9f30bc Fix memtable::clear_gently to work in debug mode.
It was getting into an infinite loop because
need_preempt was always returning true.

Tests: units (release,debug)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <a324e7f576b247124080830455c920bdad1f617b.1520025213.git.piotr@scylladb.com>
2018-03-04 14:11:54 +02:00
Vladimir Krivopalov
99bd5180ba Fix Scylla compilation with Crypto++ v6.
In Crypto++ v6, the `byte` typedef has been moved from the global
namespace to the `CryptoPP::` namespace.

This fix brings in the CryptoPP namespace so that the `byte` typedef is
seen with both old and new versions of Crypto++.

Fixes #3252.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <799d055be710231884d101a52c0be8ed8b0a9806.1520125889.git.vladimir@scylladb.com>
2018-03-04 10:23:00 +02:00
Duarte Nunes
45d762703c Merge 'CQL syntax refinements for access-control' from Jesse
This patch series ties up some loose ends around CQL syntax for access-control statements.

The USER-based syntax statements are all backwards compatible. ROLE-specific statements have a new syntax which is described in "cql: Make role syntax for consistent". Other statements (like GRANT) have been updated to accept role names (instead of the more restrictive `username` rule).

Fixes #3217.

Tests: unit (debug)

* 'jhk/roles_syntax/v2' of https://github.com/hakuch/scylla:
  tests: Rename test for consistency
  cql: Eliminate uses of legacy `username` rule
  cql: Elaborate error for quoted user names
  cql: Allow role names to be string literals
  cql: Make role syntax more consistent
  tests: Add CQL syntax tests for access-control
2018-03-02 15:11:14 +00:00
Raphael S. Carvalho
954efcd209 storage_service: log sstable integrity checker status
INFO  2018-02-27 16:02:36,246 [shard 0] storage_service - SSTable data integrity checker is enabled.

Fixes #3071.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180228174253.9190-1-raphaelsc@scylladb.com>
2018-03-01 20:57:06 +01:00
Jesse Haber-Kucharsky
90af3d889a tests: Rename test for consistency
Now we have `cql_auth_query_test` and `cql_auth_syntax_test`.
2018-03-01 12:06:59 -05:00
Jesse Haber-Kucharsky
464f41d2bb cql: Eliminate uses of legacy username rule
All users of `username` are replaced with `userOrRoleName`, except in
USER-specific (legacy) statements: CREATE USER, ALTER USER, DROP USER.
2018-03-01 12:06:59 -05:00
Jesse Haber-Kucharsky
b84e22acdd cql: Elaborate error for quoted user names
Since quoted names are allowed for role names, we add a more descriptive
error message when a quoted name is (erroneously) used for a user name.

This behavior is consistent with Apache Cassandra.
2018-03-01 12:06:59 -05:00
Jesse Haber-Kucharsky
b5264d8bf7 cql: Allow role names to be string literals
This behavior matches that of Apache Cassandra. When a role name is
specified as a string literal (single quotes), the case is preserved.
2018-03-01 12:06:59 -05:00
Jesse Haber-Kucharsky
d7f2035dea cql: Make role syntax more consistent
This patch changes the syntax for CQL statements related to roles to
favor a form like

    CREATE ROLE sam WITH PASSWORD = 'shire' AND LOGIN = false;

instead of

    CREATE ROLE sam WITH PASSWORD 'shire' NOLOGIN;

This new syntax has the benefit of not imposing any ordering constraints
on the modifiers for roles and being consistent with other parts of the
CQL grammar. It is also consistent with syntax in Apache Cassandra.

The old USER-based statements (CREATE USER and ALTER USER) still have
the old forms for backwards compatibility.

A previous change modified the USER-related statements to allow for the
OPTIONS option. However, this was a mistake; only the PASSWORD option
should have been allowed. This patch also corrects this mistake.
2018-03-01 12:04:40 -05:00
Jesse Haber-Kucharsky
62bfc3939c tests: Add CQL syntax tests for access-control
These are quick-running tests for verifying the accepted forms of CQL
statements (and fragments) related to access-control: users, roles, and
permissions.

Establishing the allowed forms of statements is helpful for reference,
but also makes syntax changes (like those expected in later patches)
clearer and more safe.
2018-03-01 11:46:37 -05:00
Tomasz Grabiec
91ccf82ce4 mvcc: Improve printout of partition_snapshot_row_cursor
Multiline output is easier to read by humans.
Also, print continuity.

Message-Id: <1519909484-24531-1-git-send-email-tgrabiec@scylladb.com>
2018-03-01 13:44:00 +00:00