Consumer may throw, in this case, break from the loop and retry.
move_sstable_from_staging_in_thread may theoretically throw too,
ignore the error in this case since the sstable was already processed,
individual move failures are already ignored and moving from staging
will be retried upon restart.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
To be used for "batch" move of several sstables from staging
to the base directory, allowing the caller to sync the directories
once when all are moved rather than for each one of them.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
distributed_loader::probe_file needlessly creates a seastar
thread for it and the next patch will use it as part of
a parallel_for_each loop to move a list of sstables
(and sync the directories once at the end).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
"
LWT is much more efficient if a request is processed on a shard that owns
a token for the request. This is because otherwise the processing will
bounce to an owning shard multiple times. The patch proposes a way to
move request to correct shard before running lwt. It works by returning
an error from lwt code if a shard is incorrect one specifying the shard
the request should be moved to. The error is processed by the transport
code that jumps to a correct shard and re-process incoming message there.
"
* 'gleb/bounce_lwt_request' of github.com:scylladb/seastar-dev:
lwt: take raw lock for entire cas duration
lwt: drop invoke_on in paxos_state prepare and accept
lwt: Process lwt request on a owning shard
storage_service: move start_native_transport into a thread
transport: change make_result to takes a reference to cql result instead of shared_ptr
The implementation of Expected's BEGINS_WITH operator on blobs was
incorrect, naively comparing the base64-encoded strings, which doesn't
work. This patches fixes the code to compare the decoded strings.
The reason why the BEGINS_WITH test missed this bug was that we forgot
to check the blob case and only tested the string case; So this patch
also adds the missing test - which reproduces this bug, and verifies
its fix.
Fixes#5457
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20191211115526.29862-1-nyh@scylladb.com>
The LIKE operator requires filtering, so needs_filtering() must check
is_LIKE(). This already happens for partition columns, but it was
overlooked for clustering columns in the initial implementation of
LIKE.
Fixes#5400.
Tests: unit(dev)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
To be used by dtest as an indicator that endpoint's hints
were drained and hints directory is removed.
Refs #5354
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
LWT is much more efficient if a request is processed on a shard that owns
a token for the request. This is because otherwise the processing will
bounce to an owning shard multiple times. The patch proposes a way to
move request to correct shard before running lwt. It works by returning
an error from lwt code if a shard is incorrect one specifying the shard
the request should be moved to. The error is processed by transport code
that jumps to a correct shard and re-process incoming message there.
This patch adds comprehensive tests for the ReturnValue parameter of
the write operations (PutItem, UpdateItem, DeleteItem), which can return
pre-write or post-write values of the modified item. The tests are in
a new test file, alternator-test/test_returnvalues.py.
This feature is not yet implemented in Alternator, so all the new
tests xfail on Alternator (and all pass on AWS).
Refs #5053
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20191127163735.19499-1-nyh@scylladb.com>
This patch adds tests for Query's "ScanIndexForward" parameter, which
can be used to return items in reversed sort order.
We test that a Limit works and returns the given number of *last* items
in the sort order, and also that such reverse queries can be resumed,
i.e., paging works in the reverse order.
These tests pass against AWS DynamoDB, but fail against Alternator (which
doesn't support ScanIndexForward yet), so it is marked xfail.
Refs #5153.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20191127114657.14953-1-nyh@scylladb.com>
The JMX interface is implemented by the scylla-jmx project, not scylla.
Therefore, let's remove this historical reference to MBeans from
storage_proxy.
Message-Id: <20191211121652.22461-1-penberg@scylladb.com>
"
Add --experimental-features -- a vector of features to unlock. Make corresponding changes in the YAML parser.
Fixes#5338
"
* 'vecexper' of https://github.com/dekimir/scylla:
config: Add `experimental_features` option
utils: Add enum_option
* seastar e440e831c8...00da4c8760 (7):
> Merge "reactor: fix iocb pool underflow due to unaccounted aio fsync" from Avi
Fixes#5443.
> install-dependencies.sh: fix arch dependencies
> Merge " rpc: fix use-after-free during rpc teardown vs. rpc server message handling" from Benny
> Merge "testing: improve the observability of abandoned failed futures" from Botond
> rework the fair_queue tester
> directory_test: Update to use run instead of run_deprecated
> log: support fmt 6.0 branch with chrono.h for log
In the calculate_delay() code for view-backlog flow control, we calculate
a delay and cap it at a "budget" - the remaining timeout. This timeout is
measured in milliseconds, but the capping calculation converted it into
microseconds, which overflowed if the timeout is very large. This causes
some tests which enable the UB sanitizer to fail.
We fix this problem by comparing the delay to the budget in millisecond
resolution, not in microsecond resolution. Then, if the calculated delay
is short enough, we return it using its full microsecond resolution.
Fixes#5412
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20191205131130.16793-1-nyh@scylladb.com>
Merged pull request https://github.com/scylladb/scylla/pull/5453
from Piotr Sarna:
Checking the EQ relation for alternator attributes is usually performed
simply by comparing underlying JSON objects, but sets (SS, BS, NS types)
need a special routine, as we need to make sure that sets stored in
a different order underneath are still equal, e.g:
[1, 3, 2] == [1, 2, 3]
Fixes#5021
Checking the EQ relation for alternator attributes is usually performed
simply by comparing underlying JSON objects, but sets (SS, BS, NS types)
need a special routine, as we need to make sure that sets stored in
a different order underneath are still equal, e.g:
[1, 3, 2] == [1, 2, 3]
Fixes#5021
If a set of mutations contains both an entry that deletes a table
and an entry that adds a table with the same name, it's expected
to be a replacement operation (delete old + create new),
rather than a useless "try to create a table even though it exists
already and then immediately delete the original one" operation.
As such, notifications about the deletions should be performed
before notifications about the creations. The place that originally
suffered from this wrong order is view building - which in this case
created an incorrect duplicated entry in the view building bookkeeping,
and then immediately deleted it, resulting in having old, deprecated
entries with stale UUIDS lying in the build queue and never proceeding,
because the underlying table is long gone.
The issue is fixed by ensuring the order of notifications:
- drops are announced first, view drops are announced before table drops;
- creations follow, table creations are announced before views;
- finally, changes to tables and views are announced;
Fixes#4382
Tests: unit(dev), mv_populating_from_existing_data_during_node_stop_test
Iterate over an array holding all rpm names to see if any
of them is missing from `dist/ami/files`. If they are missing,
look them up in build/redhat/RPMS/x86_64 so that if reloc/build_rpm.sh
was run manually before dist/ami/build_ami.sh we can just collect
the built rpms from its output dir.
If we're still missing any rpms, then run reloc/build_rpm.sh
and copy the required rpms from build/redhat/RPMS/x86_64.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Reviewed-by: Glauber Costa <glauber@scylladb.com>
In swagger 1.2 int is defined as int32.
We originally used int following the jmx definition, in practice
internally we use uint and int64 in many places.
While the API format the type correctly, an external system that uses
swagger-based code generator can face a type issue problem.
This patch replace all use of int in a return type with long that is defined as int64.
Changing the return type, have no impact on the system, but it does help
external systems that use code generator from swagger.
Fixes#5347
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Provide some explanation on prio strings + direction to gnutls manual.
Document client auth option.
Remove confusing/misleading statement on "custom options"
Message-Id: <20191210123714.12278-1-calle@scylladb.com>
Enable existing NOT_CONTAINS test, add NOT_CONTAINS to the list of
recognized operators, implement check_NOT_CONTAINS, and hook it up to
verify_expected_one().
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
When the user wants to turn on only some experimental features, they
can use this new option. The existing `experimental` option is
preserved for backwards compatibility.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Almost all commands provided by `scylla-gdb.py` are safe to use. The
worst that could happen if they fail is that you won't get the desired
information. There is one notable exception: `scylla thread`. If
anything goes wrong while this command is executed - gdb crashes, a bug
in the command, etc. - there is a good change the process under
examination will crash. Sometimes this is fine, but other times e.g.
when live debugging a production node, this is unacceptable.
To avoid any accidents add documentation to all commands working with
`seastar::thread`. And since most people don't read documentation,
especially when debugging under pressure, add a safety net to the
`scylla thread` command. When run, this command will now warn of the
dangers and will ask for explicit acknowledgment of the risk of crash,
by means of passing an `--iamsure` flag. When this flag is missing, it
will refuse to run. I am sure this will be very annoying but I am also
sure that the avoided crashes are worth it.
As part of making `scylla thread` safe, its argument parsing code is
migrated to `argparse`. This changes the usage but this should be fine
because it is well documented.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20191129092838.390878-1-bdenes@scylladb.com>
The ami description attribute is only allowed to be 255
characters long. When build_ami.sh generates an ami, it
generates an ami description which is a concatenation
of all of the componnents version strings. It can
happen that the description string is too long which
eventually causes the ami build to fail. This patch
trims the description string to 255 characters.
It is ok since the individual versions of the components
are also saved in tags attached to the image.
Tests:
1. Reproduced with a long description and
validated that it doesn't fail after the fix.
Fixes#5435
Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <20191209141143.28893-1-eliransin@scylladb.com>
A Linux machine typically has multiple clocksources with distinct
performances. Setting a high-performant clocksource might result in
better performance for ScyllaDB, so this should be considered whenever
starting it up.
This patch introduces the possibility of enforcing optimized Linux
clocksource to Scylla's setup/start-up processes. It does so by adding
an interactive question about enforcing clocksource setting to scylla_setup,
which modifies the parameter "CLOCKSOURCE" in scylla_server configuration
file. This parameter is read by perftune.py which, if set to "yes", proceeds
to (non persistently) setting the clocksource. On x86, TSC clocksource is
used.
Fixes#4474
A general build system knows about 3 machines:
* build: where the building is running
* host: where the built software will run
* target: the machine the software will produce code for
The target machine is only relevant for compilers, so we can ignore
it.
Until now we could ignore the build and host distinction too. This
patch adds the first difference: don't use host ld_flags when linking
build tools (gen_crc_combine_table).
The reason for this change is to make it possible to build with
-Wl,--dynamic-linker pointing to a path that will exist on the host
machine, but may not exist on the build machine.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191207030408.987508-1-espindola@scylladb.com>
The vm.swappiness sysctl controls the kernel's prefernce for swapping
anonymous memory vs page cache. Since Scylla uses very large amounts
of anonymous memory, and tiny amounts of page cache, the correct setting
is to prefer swapping page cache. If the kernel swaps anonymous memory
the reactor will stall until the page fault is satisfied. On the other
hand, page cache pages usually belong to other applications, usually
backup processes that read Scylla files.
This setting has been used in production in Scylla Cloud for a while
with good results.
Users can opt out by not installing the scylla-kernel-conf package
(same as with the other kernel tunables).
* seastar 166061da3...e440e831c (8):
> Fail tests on ubsan errors
> future: make a couple of asserts more strict
> future: Move make_ready out of line
> config: Do not allow zero rates
Fixes#5360
> future: add new state to avoid temporaries in get_available_state().
> future: avoid temporary future_state on get_available_state().
> future: inline future::abandoned
> noncopyable_function: Avoid uninitialized warning on empty types
Previously, scylla used min/max(blob)->blob overload for collections,
tuples and UDTs; effectively making the results being printed as blobs.
This PR adds "dynamically"-typed min()/max() functions for compound types.
These types can be complicated, like map<int,set<tuple<..., and created
in runtime, so functions for them are created on-demand,
similarly to tojson(). The comparison remains unchanged - underneath
this is still byte-by-byte weak lex ordering.
Fixes#5139
* jul-stas/5139-minmax-bad-printing-collections:
cql_query_tests: Added tests for min/max/count on collections
cql3: min()/max() for collections/tuples/UDTs do not cast to blobs
This tests new min/max function for collections and tuples. CFs
in test suite were named according to types being tested, e.g.
`cf_map<int,text>' what is not a valid CF name. Therefore, these
names required "escaping" of invalid characters, here: simply
replacing with '_'.
Before:
cqlsh> insert into ks.list_types (id, val) values (1, [3,4,5]);
cqlsh> select max(val) from ks.list_types;
system.max(val)
------------------------------------------------------------
0x00000003000000040000000300000004000000040000000400000005
After:
cqlsh> select max(val) from ks.list_types;
system.max(val)
--------------------
[3, 4, 5]
This is accomplished similarly to `tojson()`/`fromjson()`: functions
are generated on demand from within `cql3::functions::get()`.
Because collections can have a variety of types, including UDTs
and tuples, it would be impossible to statically define max(T t)->T
for every T. Until now, max(blob)->blob overload was used.
Because `impl_max/min_function_for` is templated with the
input/output type, which can be defined in runtime, we need type-erased
("dynamic") versions of these functors. They work identically, i.e.
they compare byte representations of lhs and rhs with
`bytes::operator<`.
Resolves#5139
If you merge a pull request that contains multiple patches via
the github interface, it will document itself as the committer.
Work around this brain damage by using the command line.