Commit Graph

11719 Commits

Author SHA1 Message Date
Tomasz Grabiec
b030ce693d sstables: mutation_reader: Don't try to read index to skip to static row
Static row is always at the beginning, there's no point in doing
index lookups.
2017-04-20 10:54:37 +02:00
Tomasz Grabiec
3e060659f1 sstables: mutation_reader: Don't try to read static row if table doesn't have any 2017-04-20 10:54:37 +02:00
Tomasz Grabiec
b1860a8a24 clustering_ranges_walker: Allow excluding the static row 2017-04-20 10:54:37 +02:00
Tomasz Grabiec
77d3e30239 sstables: mutation_reader: Use index to skip across clustering restrictions
Improves scans with clustering restrictions. Before the change such
scans would scan the whole partition.

Below are results of a test case from perf_fast_forward which selects
a few rows from a large partition using query restrictions (not fast forwarding).

Before:

  stride  rows      time [s]     frags     frag/s    aio      [KiB] blocked dropped  idx hit idx miss  idx blk    cpu
  1000000 1         0.000609         1       1642      3        152       2       1        0        1        1  38.0%
  500000  2         0.242255         2          8    511      64152     398       4        0        1        1  98.6%
  250000  4         0.281592         4         14    749      95832     564       4        0        1        1  98.4%
  125000  8         0.328056         8         24    873     111704     657       4        0        1        1  98.4%
  62500   16        0.306700        16         52    935     119640     751       4        0        1        1  99.4%

After:

  stride  rows      time [s]     frags     frag/s    aio      [KiB] blocked dropped  idx hit idx miss  idx blk    cpu
  1000000 1         0.000711         1       1406      3        152       2       1        0        1        1  42.1%
  500000  2         0.000910         2       2197      5        216       3       2        0        1        1  39.2%
  250000  4         0.001384         4       2891      9        344       5       4        0        1        1  35.3%
  125000  8         0.003197         8       2502     21        728      13       8        0        1        1  53.1%
  62500   16        0.006664        16       2401     41       1368      25      16        0        1        1  58.2%
2017-04-20 10:54:37 +02:00
Tomasz Grabiec
05a1f92cbc clustering_ranges_walker: Introduce lower_bound_change_counter()
Allows detecting changes of lower_bound().

Result of advance_to() is not enough. When we get false from
advance_to() twice in a row, lower bound may or may not have changed.
2017-04-20 10:54:37 +02:00
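The ambiguity described in 05a1f92cbc can be reproduced with a toy walker. This is only a sketch under simplifying assumptions: positions are plain ints, ranges are sorted half-open intervals, and `toy_ranges_walker` and `pos_range` are hypothetical names, not the real clustering_ranges_walker types.

```cpp
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Toy model: a position is an int, a range is the half-open interval [start, end).
struct pos_range {
    int start;
    int end;
};

// Hypothetical stand-in for a ranges walker; ranges must be sorted and disjoint.
class toy_ranges_walker {
    std::vector<pos_range> _ranges;
    std::size_t _current = 0;         // index of the first range not yet passed
    std::size_t _change_counter = 0;  // bumped whenever lower_bound() moves
public:
    explicit toy_ranges_walker(std::vector<pos_range> ranges)
        : _ranges(std::move(ranges)) {}

    // Returns true only if pos lies inside the current range.
    bool advance_to(int pos) {
        while (_current < _ranges.size() && pos >= _ranges[_current].end) {
            ++_current;
            ++_change_counter;
        }
        return _current < _ranges.size() && pos >= _ranges[_current].start;
    }

    // Start of the current range, or the maximum int once all ranges are passed.
    int lower_bound() const {
        return _current < _ranges.size() ? _ranges[_current].start
                                         : std::numeric_limits<int>::max();
    }

    std::size_t lower_bound_change_counter() const { return _change_counter; }
};
```

With ranges [10, 20) and [30, 40), both advance_to(5) and advance_to(25) return false, but only the second call moves the lower bound from 10 to 30. Comparing the counter before and after the call tells the two cases apart.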
Tomasz Grabiec
461f2af0a1 sstables: mutation_reader: Avoid index lookups when out of range 2017-04-20 10:54:37 +02:00
Tomasz Grabiec
10c92d37d1 sstables: mutation_reader: Simplify fast_forward_to() 2017-04-20 10:54:37 +02:00
Tomasz Grabiec
bfb6858e55 sstables: mutation_reader: Let clustering_ranges_walker handle the _fwd_range start
Simplifies the code a bit, but also will make it easier to calculate
the next position we should skip to after forwarding, taking into
consideration both the position forwarded to as well as clustering
ranges of the query. That will be just calling
_ck_ranges_walker->lower_bound() after it is trimmed.
2017-04-20 10:54:37 +02:00
Tomasz Grabiec
d056d9c31b sstables: mutation_reader: Let mp_row_consumer decide about position passed to the index
In general mp_row_consumer has better information about the next
position to read. It could be after the position we forward to if
there are clustering restrictions. This will be exploited in the
following patches.
2017-04-20 10:54:37 +02:00
Tomasz Grabiec
a37712e9ae sstables: mutation_reader: Move mp_row_consumer::fast_forward_to() out of line 2017-04-20 10:54:37 +02:00
Tomasz Grabiec
bb3e683783 clustering_ranges_walker: Support trimming
Makes implementing fast_forward_to() easier. mp_row_consumer emulates
this currently. This change will allow simplifying this.
2017-04-20 10:54:37 +02:00
Tomasz Grabiec
652d04e78a clustering_ranges_walker: Generalize to work on position ranges
It will include the static row by default. This will allow simplifying
users, which work with position ranges already.
2017-04-20 10:54:36 +02:00
Tomasz Grabiec
c85fe3183c position_range: Allow stealing of bounds 2017-04-20 10:54:36 +02:00
Tomasz Grabiec
503c68de44 position_in_partition: Add more factory methods 2017-04-20 10:54:36 +02:00
Tomasz Grabiec
6c1dc642ee sstables: mutation_reader: Create index on-demand 2017-04-20 10:54:36 +02:00
Tomasz Grabiec
434fda3577 sstables: mutation_reader: Keep priority_class by reference
To indicate that it is not optional.
2017-04-20 10:54:36 +02:00
Tomasz Grabiec
a8c126c82a sstables: Expose get_index_reader() 2017-04-20 10:54:36 +02:00
Tomasz Grabiec
e1af5a406d sstables: Make sstable::get_index_reader() return unique_ptr<>
Makes callers a bit simpler
2017-04-20 10:54:36 +02:00
Tomasz Grabiec
7dc3fe7d3f tests: perf_fast_forward: Add test case for forwarding with clustering restrictions in a large partition 2017-04-20 10:54:36 +02:00
Tomasz Grabiec
eed864690b tests: perf_fast_forward: Add test case for slicing of large partition using a single-partition reader 2017-04-20 10:54:36 +02:00
Tomasz Grabiec
81fc7977a4 tests: perf_fast_forward: Add test for selecting few rows from large partition 2017-04-20 10:54:36 +02:00
Tomasz Grabiec
02da3ba316 tests: perf_fast_forward: Fix use-after-free in scan_with_stride_partitions()
partition_range must live as long as the reader is used.
2017-04-19 08:37:56 +02:00
Raphael S. Carvalho
e78db43b79 compaction_manager: fix crash when dropping a resharding column family
Problem is that column family field of task wasn't being set for resharding,
so column family wasn't being properly removed from compaction manager.
In addition to fixing this issue, we'll also interrupt ongoing compactions
when dropping a column family, exactly like we do with shutdown.

Fixes #2291.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170418125807.7712-1-raphaelsc@scylladb.com>
2017-04-18 17:39:27 +03:00
Duarte Nunes
af37a3fdbf logalloc: Fix compilation error
This patch moves a function using the region_impl type after the type
has been defined.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170418124551.25369-1-duarte@scylladb.com>
2017-04-18 15:56:26 +03:00
Raphael S. Carvalho
11b74050a1 partitioned_sstable_set: fix quadratic space complexity
Streaming generates lots of small sstables with large token ranges,
which triggers O(N^2) space usage in the interval map.
Level 0 sstables will now be stored in a structure that has O(N)
space complexity and which will be included for every read.

Fixes #2287.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170417185509.6633-1-raphaelsc@scylladb.com>
2017-04-18 13:04:38 +03:00
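The quadratic growth mentioned in 11b74050a1 can be illustrated by counting entries. This is a rough sketch, assuming the interval map keeps, for every disjoint sub-interval, the set of sstables covering it; `token_range`, `interval_map_entries` and `staggered_ranges` are hypothetical helpers, and tokens are modelled as ints.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Half-open token range [first, last) owned by one sstable (tokens are ints here).
struct token_range {
    int first;
    int last;
};

// Counts (sub-interval, sstable) pairs that a splitting interval map would hold.
std::size_t interval_map_entries(const std::vector<token_range>& ranges) {
    // Collect and deduplicate all endpoints; consecutive endpoints delimit
    // the disjoint sub-intervals.
    std::vector<int> points;
    for (const auto& r : ranges) {
        points.push_back(r.first);
        points.push_back(r.last);
    }
    std::sort(points.begin(), points.end());
    points.erase(std::unique(points.begin(), points.end()), points.end());

    std::size_t entries = 0;
    for (std::size_t i = 0; i + 1 < points.size(); ++i) {
        for (const auto& r : ranges) {
            if (r.first <= points[i] && points[i + 1] <= r.last) {
                ++entries;  // this sstable is listed under this sub-interval
            }
        }
    }
    return entries;
}

// N wide, mutually overlapping ranges: [0, n), [1, n + 1), ..., [n - 1, 2n - 1).
std::vector<token_range> staggered_ranges(int n) {
    std::vector<token_range> v;
    for (int i = 0; i < n; ++i) {
        v.push_back({i, i + n});
    }
    return v;
}
```

For N staggered ranges the count is N * N, while a flat list of the same sstables needs only N entries.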
Takuya ASADA
86e464ab26 dist/offline_installer: support Ubuntu/Debian
Moved the existing script to dist/offline_installer/redhat and added a .deb
version in dist/offline_installer/debian.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1492474821-9907-1-git-send-email-syuu@scylladb.com>
2017-04-18 10:56:50 +03:00
Pekka Enberg
b31c45d8af Merge "clang fixes (part 1)" from Avi
"This series fixes some errors found by clang, with the aim of enabling
clang/zapcc as a supported compiler.  A few more fixes are needed to produce
a binary."

* tag 'clang/1/v1' of https://github.com/avikivity/scylla:
  logalloc: avoid auto in function argument declaration
  thrift: avoid auto in function argument declaration
  streamed_mutation: fix non-POD argument to C-style variadic function
  mutation_partition_serializer: avoid auto in function argument declaration
  date: use correct casts for years
  streaming: avoid auto in function argument declaration
  repair: avoid auto in function argument declaration
  gms: expose gms::inet_address streaming operator
  murmur3_partitioner: fix build on clang
  i_partitioner: remove unused function
  byte_ordered_partitioner: fix bad operator precedence
  result_set: pass comparator by reference to std::sort()
  to_string: move standard container overloads of to_string to std:: namespace
  cql_type: fix bad enum syntax on clang
  build: disable more warnings for clang
  build: fix detection of unsupported warnings on clang
2017-04-18 08:49:25 +03:00
Avi Kivity
844529fe33 logalloc: avoid auto in function argument declaration
'auto' in a non-lambda function argument is not legal C++, and is hard
to read besides.  Replace with the right type.

Since the right type is private, add some friendship.
2017-04-17 23:18:44 +03:00
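The pattern behind 844529fe33 and the similar "avoid auto in function argument declaration" commits can be sketched as follows. The functions are hypothetical and not taken from the Scylla sources. Under the C++14 rules in force at the time, `auto` parameters were allowed only in lambdas; GCC accepted them in ordinary functions as an extension.

```cpp
#include <cstddef>
#include <vector>

// Accepted by GCC as an extension at the time, rejected by clang in C++14 mode:
//
//   std::size_t total_size(const auto& sizes) { ... }

// Portable alternative 1: spell out the concrete type.
std::size_t total_size(const std::vector<std::size_t>& sizes) {
    std::size_t sum = 0;
    for (std::size_t s : sizes) {
        sum += s;
    }
    return sum;
}

// Portable alternative 2: an explicit template, when several types must be accepted.
template <typename Range>
std::size_t total_size_generic(const Range& sizes) {
    std::size_t sum = 0;
    for (const auto& s : sizes) {  // 'auto' for a local variable is standard C++11
        sum += s;
    }
    return sum;
}
```

Naming the concrete type also documents the interface, which matches the readability argument in the commit message.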
Avi Kivity
54add19ca2 thrift: avoid auto in function argument declaration
'auto' in a non-lambda function argument is not legal C++, and is hard
to read besides.  Replace with the right type.
2017-04-17 23:18:44 +03:00
Avi Kivity
f0c25fc20f streamed_mutation: fix non-POD argument to C-style variadic function
Clang warns that passing a non-POD to a C-style variadic function will
result in an abort().  That happens to be exactly what we want, but to
silence the warning, use a template instead.  Since templates aren't
allowed in local classes, move the containing class to namespace scope.
2017-04-17 23:18:44 +03:00
Avi Kivity
635c32eb32 mutation_partition_serializer: avoid auto in function argument declaration
'auto' in a non-lambda function argument is not legal C++, and is hard
to read besides.  Replace with the right type.
2017-04-17 23:18:44 +03:00
Avi Kivity
a0858dda3e date: use correct casts for years
Our date implementation uses int64_t for years, but some of the code was
not changed; clang complains, so use the correct casts to make it happy.
2017-04-17 23:03:15 +03:00
Avi Kivity
ca69a04969 streaming: avoid auto in function argument declaration
'auto' in a non-lambda function argument is not legal C++, and is hard
to read besides.  Replace with the right type.
2017-04-17 23:03:15 +03:00
Avi Kivity
ae7d7ae20f repair: avoid auto in function argument declaration
'auto' in a non-lambda function argument is not legal C++, and is hard
to read besides.  Replace with the right type.
2017-04-17 23:03:15 +03:00
Avi Kivity
c885c468a9 gms: expose gms::inet_address streaming operator
The standard says, and clang enforces, that declaring a function via
a friend declaration is not sufficient for ADL to kick in.  Add a namespace
level declaration so ADL works.
2017-04-17 23:03:15 +03:00
Avi Kivity
af118ab52b murmur3_partitioner: fix build on clang
Don't know what the root cause is, but the fix is harmless.
2017-04-17 23:03:15 +03:00
Avi Kivity
c05f60387b i_partitioner: remove unused function
Found by clang.
2017-04-17 23:03:15 +03:00
Avi Kivity
a496ec7f5b byte_ordered_partitioner: fix bad operator precedence
Found by clang.
2017-04-17 23:03:15 +03:00
Avi Kivity
d9aaa95b29 result_set: pass comparator by reference to std::sort()
Clang complains about some error without it, I could not understand it, but
I'm not going to argue with it.

Since std::sort() will copy the comparator, it's better to pass using an
std::ref(), and everyone is happy.
2017-04-17 23:03:15 +03:00
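The std::ref() approach from d9aaa95b29 can be shown on a small example. The comparator below is hypothetical; the point is that std::sort() copies only the std::reference_wrapper, so the comparator object itself is neither copied nor sliced, and its state stays visible to the caller.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical stateful comparator: counts how often it is invoked.
struct counting_less {
    int calls = 0;
    bool operator()(int a, int b) {
        ++calls;
        return a < b;
    }
};

// Sorts a copy of v, passing the comparator by reference via std::ref().
std::vector<int> sorted_copy(std::vector<int> v, counting_less& cmp) {
    std::sort(v.begin(), v.end(), std::ref(cmp));
    return v;
}
```

After the call, cmp.calls reflects the comparisons made inside std::sort(), which would not be the case if the comparator had been passed by value.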
Avi Kivity
a83a24268d to_string: move standard container overloads of to_string to std:: namespace
Argument-dependent lookup will not find to_string() overloads in the global
namespace if the argument and the caller are in other namespaces.

Move these to_string() overloads to std:: so ADL will find them.

Found by clang.
2017-04-17 23:03:15 +03:00
Avi Kivity
a7fe7aedbf cql_type: fix bad enum syntax on clang
cql3::type used some gcc extension that is not recognized on clang; use
the standard syntax instead.
2017-04-17 22:35:41 +03:00
Avi Kivity
1faef017e3 build: disable more warnings for clang
We should fix the source and re-enable the warnings, but this will do for
now.
2017-04-17 22:34:59 +03:00
Avi Kivity
78e9b0265b build: fix detection of unsupported warnings on clang
The diagnostic that clang spits out when it sees an unrecognized warning
is itself a warning, so the test compilation succeeds and we don't notice
the warning is not supported.

Adding -Werror turns the warning about the unrecognized warning into an
error, allowing the detection machinery to work.
2017-04-17 22:33:01 +03:00
Takuya ASADA
b8f40a2dff dist/ami/files/.bash_profile: warn user when enhanced networking is not enabled
Show a warning under the following conditions:
 - VPC is not used
 - Driver is not enhanced networking one

Fixes #1984

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1488844756-14935-1-git-send-email-syuu@scylladb.com>
2017-04-15 15:16:55 +03:00
Benoît Canet
8f793905a3 perf_sstable: Change busy loop to futurized loop
The blocked task detector introduced in
113ed9e963 was seeing
the initialization phase of perf_sstable as a blocked
task.

Transform this part of the code into a futurized loop
to make the blocked task detector happy.

Signed-off-by: Benoît Canet <benoit@scylladb.com>
Message-Id: <20170413132506.17806-1-benoit@scylladb.com>
2017-04-13 18:17:28 +03:00
Avi Kivity
7d16cfa5f0 Merge branch 'penberg/create-index-stmt-cleanup/v1' of github.com:cloudius-systems/seastar-dev
"The version of create_index_statement class that was translated to C++
is pretty old by now. This series of cleanups brings it closer to Apache
Cassandra trunk to make it easier to bring over more secondary index
code to Scylla."

* 'penberg/create-index-stmt-cleanup/v1' of github.com:cloudius-systems/seastar-dev:
  cql3/statements/create_index_statement: Move target validation
  cql3/statements/create_index_statement: Remove static column validation
  cql3/statements/create_index_statement: Extract validations
  cql3/statements/create_index_statement: Kill bogus custom validation
  cql3/statements/create_index_statement: Add materialized view to validate()
  cql3/statements/create_index_statement: Remove validation
2017-04-13 13:27:53 +03:00
Asias He
d27b47595b gossip: Fix possible use-after-free of entry in endpoint_state_map
We take a reference to an endpoint_state entry in endpoint_state_map and
access it again after code which defers. The reference can become invalid
during the defer if someone deletes the entry.

Fix this by taking the reference again after the deferring code.

I also audited the code to remove unsafe references to endpoint_state_map
entries as much as possible.

Fixes the following SIGSEGV:

Core was generated by `/usr/bin/scylla --log-to-syslog 1 --log-to-stdout
0 --default-log-level info --'.
Program terminated with signal SIGSEGV, Segmentation fault.
(this=<optimized out>) at /usr/include/c++/5/bits/stl_pair.h:127
127     in /usr/include/c++/5/bits/stl_pair.h
[Current thread is 1 (Thread 0x7f1448f39bc0 (LWP 107308))]

Fixes #2271

Message-Id: <529ec8ede6da884e844bc81d408b93044610afd2.1491960061.git.asias@scylladb.com>
2017-04-13 13:18:17 +03:00
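The fix in d27b47595b follows a general pattern: do not keep a reference or iterator into a map across a point where other code may run. A minimal sketch, assuming a plain std::unordered_map and a callback that stands in for the deferring code; `state_map`, `bump_generation` and this simplified `endpoint_state` are hypothetical, not the gossiper's real types.

```cpp
#include <functional>
#include <string>
#include <unordered_map>

// Simplified stand-in for the gossiper's per-endpoint state.
struct endpoint_state {
    int generation = 0;
};

using state_map = std::unordered_map<std::string, endpoint_state>;

// 'defer' models any code that may yield and let other code modify the map.
// Returns false if the entry is missing before or after the deferring code.
bool bump_generation(state_map& m, const std::string& endpoint,
                     const std::function<void()>& defer) {
    auto it = m.find(endpoint);
    if (it == m.end()) {
        return false;
    }
    const int seen = it->second.generation;  // copy what we need by value

    defer();  // 'it' and any reference taken from it may dangle after this

    it = m.find(endpoint);  // take the reference again instead of reusing it
    if (it == m.end()) {
        return false;  // the entry was removed while we were deferred
    }
    it->second.generation = seen + 1;
    return true;
}
```

If the callback erases the entry, the function reports failure instead of writing through a dangling iterator.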
Takuya ASADA
81c1b07bac dist: add offline installer
This introduces an offline installer generator.
It generates a self-extracting archive which contains the Scylla packages
and their dependency packages.
Package installation starts automatically when the archive is executed.

Limitation: Only CentOS is supported at this point.

Fixes #2268

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1491997091-15323-1-git-send-email-syuu@scylladb.com>
2017-04-13 13:16:09 +03:00
Avi Kivity
ac48767146 Merge "tracing and cql3 patches" from Vlad
"This series was initially meant to only transition the keyspace based backend to work
on top of prepared statements but there were a few potential issues found on the way.

In addition the original Tracing series has been expanded with a few patches in the cql3 layer
that improve the generic cql3 layer but are not obvious without the context of the
following Tracing patches.

The "main" patch contains a heavy rework of trace_keyspace_helper:
   - Use prepared statements for updating tables instead of manually constructing mutations:
      - We intentionally decrease the level of code robustness from "paranoid" to "normal".
      - The code gets a lot simpler, e.g. we don't need to cache column definitions any more.
      - We are losing some performance here but:
          - Tracing write is not in the fast path.
          - Tracing write events should be rare.
          - Currently the performance loss (for the actual write time of all trace records) for a "SELECT" query with a specific key is
            about 45%: 144us vs 99us."

* 'tracing_rework_using_prepared-v6' of github.com:cloudius-systems/seastar-dev:
  tracing: use prepared statment for updating tables
  tracing::trace_keyspace_helper: add a bad_column_family constructor that accepts an std::exception parameter
  tracing::trace_keyspace_helper: introduce a table_helper class
  tracing::trace_keyspace_helper: add static qualifier to  make_monotonic_UUID_tp() and elapsed_to_micros() methods
  tracing::tracing: allow slow query TTL only in the signed 32-bit integer range
  cql3::query_processor::prepare(): futurize the error case
  cql3::query_options: add a factory method for creation of options for a BATCH statement
  cql3::statements::batch_statement: add a constructor that doesn't receive the "bound_terms" value
  cql3::query_processor: use weak_ptr for passing the prepared statements around
2017-04-13 11:07:49 +03:00
Raphael S. Carvalho
a6f8f4fe24 compaction: do not write expired cell as dead cell if it can be purged right away
When compacting a fully expired sstable, we're not allowing that sstable
to be purged because an expired cell is *unconditionally* converted into a
dead cell. Why not check instead whether the expired cell can be purged,
using gc_before and the max purgeable timestamp?

Currently, we need two compactions to get rid of a fully expired sstable
whose cells could have been purged all along.

Look at this sstable with an expired cell:
  {
    "partition" : {
      "key" : [ "2" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 120,
        "liveness_info" : { "tstamp" : "2017-04-09T17:07:12.702597Z",
"ttl" : 20, "expires_at" : "2017-04-09T17:07:32Z", "expired" : true },
        "cells" : [
          { "name" : "country", "value" : "1" },
        ]

Now the sstable data after the first compaction:
[shard 0] compaction - Compacted 1 sstables to [...]. 120 bytes to 79
(~65% of original) in 229ms = 0.000328997MB/s.

  {
    ...
    "rows" : [
      {
        "type" : "row",
        "position" : 79,
        "cells" : [
          { "name" : "country", "deletion_info" :
{ "local_delete_time" : "2017-04-09T17:07:12Z" },
            "tstamp" : "2017-04-09T17:07:12.702597Z"
          },
        ]

Now another compaction will actually get rid of the data:
compaction - Compacted 1 sstables to []. 79 bytes to 0 (~0% of original)
in 1ms = 0MB/s. ~2 total partitions merged to 0

NOTE:
It's a waste of time to wait for a second compaction, because the expired
cell could have been purged at the first compaction: it already satisfied
gc_before and the max purgeable timestamp.

Fixes #2249, #2253

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170413001049.9663-1-raphaelsc@scylladb.com>
2017-04-13 10:59:19 +03:00
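The decision described in a6f8f4fe24 can be sketched as a small function. This is an illustration under stated assumptions, not the actual compaction code: all times share one integer unit, the types and names are hypothetical, and the deletion time of an expired cell is taken as expiry minus TTL, which is what the sstable dump in the commit message suggests (expires_at 17:07:32, ttl 20, local_delete_time 17:07:12).

```cpp
#include <cstdint>

// What compaction does with one expiring cell.
enum class cell_fate { keep_live, write_dead_cell, purge };

// Hypothetical model of a cell written with a TTL.
struct expiring_cell {
    std::int64_t write_time;  // write timestamp ("tstamp")
    std::int64_t ttl;         // time to live
    std::int64_t expiry;      // write_time + ttl ("expires_at")
};

cell_fate compact_expiring_cell(const expiring_cell& c,
                                std::int64_t now,
                                std::int64_t gc_before,
                                std::int64_t max_purgeable) {
    if (now < c.expiry) {
        return cell_fate::keep_live;  // not expired yet
    }
    // Assumption: the dead cell's deletion time is expiry - ttl.
    const std::int64_t deletion_time = c.expiry - c.ttl;
    // Before the fix the result here was always write_dead_cell;
    // the check below lets a purgeable cell disappear in one pass.
    const bool purgeable =
        deletion_time < gc_before && c.write_time < max_purgeable;
    return purgeable ? cell_fate::purge : cell_fate::write_dead_cell;
}
```

A cell that is expired but not yet older than gc_before still becomes a dead cell, so the grace period is preserved.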