Commit Graph

12481 Commits

Author SHA1 Message Date
Botond Dénes
c4277d6774 cql3: Add K_FROZEN and K_TUPLE to basic_unreserved_keyword
To allow the non-reserved keywords "frozen" and "tuple" to be used as
column names without double-quotes.

Fixes #2507

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <9ae17390662aca90c14ae695c9b4a39531c6cde6.1499329781.git.bdenes@scylladb.com>
2017-07-06 12:25:38 +03:00
Avi Kivity
a6d9cf09a7 build: fix excessive stack usage in CqlParser in debug mode
The state machines generated by antlr allocate many local variables per function.
In release mode, the stack space occupied by the variables is reused, but in debug
build, it is not, due to Address Sanitizer setting -fstack-reuse=none. This causes
a single function to take above 100k of stack space.

Fix by hacking the generated code to use just one variable.

Fixes #2546
Message-Id: <20170704135824.13225-1-avi@scylladb.com>
2017-07-05 23:05:26 +02:00
Duarte Nunes
d583ef6860 thrift/handler: Remove leftover debug artifacts
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170705161156.2307-1-duarte@scylladb.com>
2017-07-05 19:57:07 +03:00
Takuya ASADA
6d0bd01e0f dist/offline_installer/redhat: enable EPEL repo before try to install makeself
To prevent yum install error, we need to enable EPEL repo before install
makeself.

Fixes #2508

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1499196715-19710-1-git-send-email-syuu@scylladb.com>
2017-07-05 09:50:03 +03:00
Takuya ASADA
71624d7919 dist/common/scripts/scylla_raid_setup: prevent renaming MDRAID device after reboot
On Debian variants, mdadm.conf should placed at /etc/mdadm instead of /etc.
Also it seems we need update-initramfs to fix renaming issue.

Fixes #2502

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1499179912-14125-1-git-send-email-syuu@scylladb.com>
2017-07-04 18:07:20 +03:00
Avi Kivity
b1a0e37fcb Merge "Adjust row cache metrics for row granularity" from Tomasz
* tag 'tgrabiec/row-cache-metrics-v2' of github.com:cloudius-systems/seastar-dev:
  row_cache: Switch _stats.hits/misses to row granularity
  row_cache: Rename num_entries() to partitions() for clarity
  row_cache: Track mispopulations also at row level
  row_cache: Track row insertions
  row_cache: Track row hits and misses
  row_cache: Make mispopulation counter also apply for continuity information
  row_cache: Add partition_ prefix to current counters
  misc_services: Switch to using reads_with[_no]_misses counters
  row_cache: Add metrics for operations on underlying reader
  row_cache: Add reader-related metrics
  row_cache: Remove dead code
2017-07-04 15:20:25 +03:00
Tomasz Grabiec
37d2b6b3c6 row_cache: Switch _stats.hits/misses to row granularity
Those are exported by the RESTful APIs called
"get_row_hits/get_row_misses" and reported by nodetool.
2017-07-04 13:55:06 +02:00
Tomasz Grabiec
62c76abf71 row_cache: Rename num_entries() to partitions() for clarity 2017-07-04 13:55:06 +02:00
Tomasz Grabiec
60c2a86192 row_cache: Track mispopulations also at row level 2017-07-04 13:55:06 +02:00
Tomasz Grabiec
94547db620 row_cache: Track row insertions 2017-07-04 13:55:06 +02:00
Tomasz Grabiec
a58f2c8640 row_cache: Track row hits and misses 2017-07-04 13:55:06 +02:00
Tomasz Grabiec
77b2a92ece row_cache: Make mispopulation counter also apply for continuity information 2017-07-04 13:55:06 +02:00
Tomasz Grabiec
a5fdff2ac2 row_cache: Add partition_ prefix to current counters
In preparation for adding per-row counters.
2017-07-04 13:55:06 +02:00
Tomasz Grabiec
ae4b24db06 misc_services: Switch to using reads_with[_no]_misses counters
They better approximate the intended meaning than hits/misses, which
according to Gleb is whether a read did any I/O or not.
2017-07-04 13:55:06 +02:00
Tomasz Grabiec
6a22cbceaf row_cache: Add metrics for operations on underlying reader 2017-07-04 13:55:06 +02:00
Tomasz Grabiec
5c7b6fc164 row_cache: Add reader-related metrics 2017-07-04 13:55:06 +02:00
Tomasz Grabiec
be2e89d596 row_cache: Remove dead code 2017-07-04 13:55:06 +02:00
Tomasz Grabiec
e720b317c9 row_cache: Restore update of concurrent_misses_same_key
It was lost in action in 6f6575f456.

Message-Id: <1499168837-5072-1-git-send-email-tgrabiec@scylladb.com>
2017-07-04 14:51:05 +03:00
Avi Kivity
66e56511d6 Merge "Use selective_token_range_sharder in repair" from Asias
"This series introduces selective_token_range_sharder and uses it in repair to
generate dht::token_range belongs to a specific shard."

* tag 'asias/repair-selective_token_range_sharder-v3' of github.com:cloudius-systems/seastar-dev:
  repair: Use selective_token_range_sharder
  tests: Add test_selective_token_range_sharder
  dht: Add selective_token_range_sharder
2017-07-04 14:14:33 +03:00
Asias He
b10e961a64 repair: Use selective_token_range_sharder
With this change, we ask all the shard to handle the ranges provided by
user and we use selective_token_range_sharder to split the ranges and
ignore the ranges do not belong to the current shard.
2017-07-04 18:46:19 +08:00
Asias He
2a794db61b tests: Add test_selective_token_range_sharder 2017-07-04 18:46:19 +08:00
Asias He
d835cf2748 dht: Add selective_token_range_sharder
It is like ring_position_range_sharder but it works with
dht::token_range. This sharder will return the ranges belong to a
selected shard.
2017-07-04 18:46:19 +08:00
Tomasz Grabiec
1d6fec0755 row_cache: Drop not very useful prefixes from metric names
This drops "total_opertaions_" and "objects_" prefixes. There is no
convention of adding them in other parts of the system, and they don't
add much value.

Fixes scylladb/scylla-grafana-monitoring#169.

Message-Id: <1499160342-25865-1-git-send-email-tgrabiec@scylladb.com>
2017-07-04 13:37:12 +03:00
Nadav Har'El
d95f908586 Fix test to use non-wrapping range
The test put a wrapping range into a non-wrapping range variable.
This was harmless at the time this test was written, but newer code
may not be as forgiving so better use a non-wrapping range as intended.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20170704103128.29689-1-nyh@scylladb.com>
2017-07-04 13:36:29 +03:00
Avi Kivity
07b8adce0e sstables: fix use-after-free in read_simple()
`r` is moved-from, and later captured in a different lambda. The compiler may
choose to move and perform the other capture later, resulting in a use-after-free.

Fix by copying `r` instead of moving it.

Discovered by sstable_test in debug mode.
Message-Id: <20170702082546.20570-1-avi@scylladb.com>
2017-07-04 10:24:07 +02:00
Raphael S. Carvalho
7b777fe2e3 sstables/lcs: choose sstable with highest droppable tombstone ratio
Currently, lcs will choose, for tombstone compaction, sstable with
the lowest ratio from the ones which ratio is at least above threshold
(0.2 by default).

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170703185633.6644-1-raphaelsc@scylladb.com>
2017-07-04 10:25:10 +03:00
Avi Kivity
bcf7867ac9 Merge "small fixes and cleanup for leveled strategy (part 2)" from from Raphael
* 'lcs_improvements_part_2' of github.com:raphaelsc/scylla:
  lcs: Match estimated tasks arithmetic to score in LCS
  lcs: prevent leveled_compaction_strategy.hh from being included more than once
  lcs: use vector instead for storing a level of sstables
  compaction: keep only one variant of size_tiered_most_interesting_bucket
  lcs: get rid of unused code in leveled_manifest
2017-07-04 10:10:53 +03:00
Raphael S. Carvalho
7606ffd744 lcs: Match estimated tasks arithmetic to score in LCS
Contains fix for CASSANDRA-8904.

Added TARGET_SCORE to get rid of magic number for target score which
is now used more than once.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2017-07-04 03:35:02 -03:00
Raphael S. Carvalho
dfb5463478 lcs: prevent leveled_compaction_strategy.hh from being included more than once
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2017-07-04 03:35:00 -03:00
Raphael S. Carvalho
db98ab6aaf lcs: use vector instead for storing a level of sstables
list is no longer needed because lcs no longer moves a sstable breaking
invariant at its level to level 0. Now lcs incrementally restores invariant
by compacting together first set of overlapping tables.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2017-07-04 03:34:57 -03:00
Raphael S. Carvalho
b350352e6c compaction: keep only one variant of size_tiered_most_interesting_bucket
two variants of size_tiered_most_interesting_bucket existed to avoid copy,
but subsequent work will make lcs use vector for each level of sstables,
so let's only keep one variant.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2017-07-04 03:34:51 -03:00
Raphael S. Carvalho
5921600b95 lcs: get rid of unused code in leveled_manifest
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2017-07-04 03:34:34 -03:00
Nadav Har'El
d177ec05cb repair: further limit parallelism of checksum calculation
Repair today has a semaphore limiting the number of ongoing checksum
comparisons running in parallel (on one shard) to 100. We needed this
number to be fairly high, because a "checksum comparison" can involve
high latency operations - namely, sending an RPC request to another node
in a remote DC and waiting for it to calculate a checksum there, and while
waiting for a response we need to proceed calculating checksums in parallel.

But as a consequence, in the current code, we can end up with as many as
100 fibers all at the same stage of reading partitions to checksum from
sstables. This requires tons of memory, to hold at least 128K of buffer
(even more with read-ahead) for each of these fibers, plus partition data
for each. But doing 100 reads in parallel is pointless - one (or very few)
should be enough.

So this patch adds another semaphore to limit the number of checksum
*calculations* (including the read and checksum calculation) on each shard
to just 2. There may still be 100 ongoing checksum *comparisons*, in
other stages of the comparisons (sending the checksum requests to other
and waiting for them to return), but only 2 will ever be in the stage of
reading from disk and checksumming them.

The limit of 2 checksum calculations (per shard) applies on the repair
slave, not just to the master: The slave may receive many checksum
requests in parallel, but will only actually work on 2 at a time.

Because the parallelism=100 now rate-limits operations which use very little
memory, in the future we can safely increase it even more, to support
situations where the disk is very fast but the link between nodes has
very high latency.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20170703151329.25716-1-nyh@scylladb.com>
2017-07-03 18:14:57 +03:00
Piotr Jastrzebski
80f08921c4 Make table_helper independent from trace_keyspace_helper
table_helper is a generic helper than can easily be used in other places.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <11e46dbc1c90d0273a41c8144e6f6013e21efcdb.1499077818.git.piotr@scylladb.com>
2017-07-03 15:55:00 +03:00
Raphael S. Carvalho
972a0237ef database: restore indentation for cleanup_sstables
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170630035324.19881-2-raphaelsc@scylladb.com>
2017-07-03 12:48:54 +03:00
Raphael S. Carvalho
b9d0645199 database: fix potential use-after-free in sstable cleanup
when do_for_each is in its last iteration and with_semaphore defers
because there's an ongoing cleanup, sstable object will be used after
freed because it was taken by ref and the container it lives in was
destroyed prematurely.

Let's fix it with a do_with, also making code nicer.

Fixes #2537.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170630035324.19881-1-raphaelsc@scylladb.com>
2017-07-03 12:48:53 +03:00
Avi Kivity
5883e85da3 Merge "improve maintainability of compaction strategies" from Raphael
"compaction_strategy.cc keeps the full implementation of size tiered,
major, and null strategies, and partial implementation of leveled
and date tiered strategies. It's a mess. In the future, we will also
need space for time window strategy. The file is hard to read and
maintain.
My goal here is to improve maintainability of the strategies by
putting each of them into its own header.

NOTE: No semantic change is introduced here."

* 'improve_compaction_strategy_maintainability' of github.com:raphaelsc/scylla:
  compaction_strategy: move dtcs to its existing header
  compaction_strategy: move lcs implementation to its own header
  compaction_strategy: move stcs implementation to its own header
  compaction_strategy: move compaction_strategy_impl to its own header
2017-07-03 11:39:30 +03:00
Takuya ASADA
0c81974bc4 dist/common/systemd: move scylla-server.service to be after network-online.target instead of network.target
To make sure start Scylla after network is up, we need to move from
network.target to network-online.target.

Fixes #2337

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1493661832-9545-1-git-send-email-syuu@scylladb.com>
2017-07-03 10:01:21 +03:00
Asias He
b2a2fbcf73 repair: Do not store the failed ranges
The number of failed ranges can be large so it can consume a lot of memory.
We already logged the failed ranges in the log. No need to storge them
in memory.

Message-Id: <7a70c4732667c5c3a69211785e8efff0c222fc28.1498809367.git.asias@scylladb.com>
2017-07-03 10:00:25 +03:00
Takuya ASADA
1c35549932 dist/common/scripts/scylla_cpuscaling_setup: skip configuration when cpufreq driver doesn't loaded
Configuring cpufreq service on VMs/IaaS causes an error because it doesn't supported cpufreq.
To prevent causing error, skip whole configuration when the driver not loaded.

Fixes #2051

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1498809504-27029-1-git-send-email-syuu@scylladb.com>
2017-07-03 09:59:56 +03:00
Takuya ASADA
e645b0fb13 dist/common/scripts: move EC2 configuration verification to 'scylla_ec2_check'
Currently we only have EC2 configuration verification on AMI, so move it to
/usr/lib/scylla and run it from scylla_setup, to make it usable for
non-AMI users.

Fixes #1997

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1498811107-29135-1-git-send-email-syuu@scylladb.com>
2017-07-03 09:59:28 +03:00
Avi Kivity
6895f6e603 sstable_datafile_test: fix sstable_expired_data_ratio failure
A comment states that we want the file to be old enough, but sets
a timestamp of max(), which is in the future. This may have passed
because the conversion from numeric_limits<time_t>::max() to
db_clock::time_point is not well defined (their dynamic range is
different), so truncation may have converted the large number to a
low one.
Message-Id: <20170702082903.20879-1-avi@scylladb.com>
2017-07-02 20:22:51 +02:00
Avi Kivity
51b6066212 cql3: operation: correctly format error messages
Error messages incorrectly used the debug representation of the receiver,
rather than the text representation of the operation itself.

Fixes #113.
Message-Id: <20170701101325.3163-1-avi@scylladb.com>
2017-07-02 20:06:50 +02:00
Duarte Nunes
d157e4558a utils/log_histogram: Remove largest() function
It should never have existed in the first place, as there are no
legitimate callers and it can be misused.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170630095939.2429-1-duarte@scylladb.com>
2017-07-02 14:29:17 +03:00
Gleb Natapov
d23111312f main: wait for wait_for_gossip_to_settle() to complete during boot
Boot should not continue until a future returned by
wait_for_gossip_to_settle() is resolved.  Commit 991ec4a16 mistakenly
broke that, so restore it back. Also fix calls for supervisor::notify()
to be in the right places.

Message-Id: <20170702082355.GQ14563@scylladb.com>
2017-07-02 11:32:36 +03:00
Avi Kivity
5bc13e4454 Revert "Make table_helper independent from trace_keyspace_helper"
This reverts commit db5bf363d0. Causes
errors of the sort

    Exiting on unhandled exception: exceptions::invalid_request_exception
    (Keyspace 'system_traces' does not exist)
2017-07-02 11:30:51 +03:00
Avi Kivity
7c809917b6 compaction_manager: fix debug mode build (periodic_compaction_submission_interval)
Turn static constexpr variable into a function.
2017-07-01 19:34:46 +03:00
Avi Kivity
c2c69e003f compaction: fix build on debug mode (DEFAULT_TOMBSTONE_COMPACTION_INTERVAL)
Debug mode wants to allocate storage for a constexpr variable for some
reason. Turn it into a function.
2017-07-01 19:26:22 +03:00
Avi Kivity
59f649e2bc Revert "cql_server::do_accepts: modernize loop"
This reverts commit 37af493f6e. Connections
are not accepted and ^C does not work anymore.
2017-07-01 12:54:23 +03:00
Jesse Haber-Kucharsky
1100bb8a5b cql: Eagerly throw lexing and parsing exceptions
Previously, lexing and parsing errors were aggregated while CQL queries were
evaluated. Afterwards, the first collected error (if present) would be thrown as
an exception.

The problem was that when parsing and lexing errors were aggregated this way,
the parser would continue even in spite of errors like "no viable alternative".
Semantic actions attached to grammar rules would still execute, though with
variables that had not yet been initialized. This would crash Scylla.

This change modifies the error-handling strategy of CQL parsing. Rather than
aggregate errors, we throw an exception on the first error we encounter. This
ensures that grammar actions never execute unless there is a precise match.

One possible issue with this approach is that the generated C++ code from the
ANTLR grammar may not be exception-safe. I compiled Scylla in debug-mode with
ASan support and executed several erroneous CQL queries with `cqlsh`. No memory
leaks were reported.

Fixes #2466.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <db1f650a2bbb615b506d9015486eece45375a440.1498836703.git.jhaberku@scylladb.com>
2017-07-01 12:13:44 +03:00