Commit Graph

1589 Commits

Author SHA1 Message Date
Jan Ciolek
4c4ed8e6df test/boost: move expr_test_utils.hh to .hh and .cc in test/lib
expr_test_utils.hh was a header file with helper methods for
expression tests. All functions were inline, because I didn't
know how to create and link a .cc file in test/boost.

Now the header is split into expr_test_utils.hh and expr_test_utils.cc
and moved to test/lib, which is designed to keep this kind of files.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-10-20 17:31:37 +02:00
Nadav Har'El
d2cd9b71b3 Merge 'Make tracing test run again, simplify backend registry and few related cleanups' from Pavel Emelyanov
It turned out that boost/tracing test is not run because its name doesn't match the *_test.cc pattern. While fixing it it turned out that the test cannot even start, because it uses future<>.get() calls outside of seastar::thread context. While patching this place the trace-backend registry was removed for simplicity. And, while at it, few more cleanups "while at it"

Closes #11779

* github.com:scylladb/scylladb:
  tracing: Wire tracing test back
  tracing: Indentation fix after previous patch
  tracing: Move test into thread
  tracing: Dismantle trace-backend registry
  tracing: Use class-registrator for backends
  tracing: Add constraint to trace_state::begin()
  tracing: Remove copy-n-paste comments from test
  tracing: Outline may_create_new_session
2022-10-16 12:32:17 +03:00
Pavel Emelyanov
707efb6dfb tracing: Wire tracing test back
The boost/tracing test is not run, because test.py boost suite collects
tests that match *_test.cc pattern. The tracing one apparently doesn't

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:59:13 +03:00
Pavel Emelyanov
5c8a61ace2 tracing: Dismantle trace-backend registry
It's not used any longer

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:57:24 +03:00
Takuya ASADA
49d5e51d76 reloc: add support stripped binary installation for relocatable package
This add support stripped binary installation for relocatable package.
After this change, scylla and unified packages only contain stripped binary,
and introduce "scylla-debuginfo" package for debug symbol.
On scylla-debuginfo package, install.sh script will extract debug symbol
at /opt/scylladb/<dir>/.debug.

Note that we need to keep unstripped version of relocatable package for rpm/deb,
otherwise rpmbuild/debuild fails to create debug symbol package.
This version is renamed to scylla-unstripped-$version-$release.$arch.tar.gz.

See #8918

Signed-off-by: Takuya ASADA <syuu@scylladb.com>

Closes #9005
2022-10-13 15:11:32 +02:00
Avi Kivity
f673d0abbe build: support fmt 9 ostream formatter deprecation
fmt 9 deprecates automatic fallback to std::ostream formatting.
We should migrate, but in order to do so incrementally, first enable
the deprecated fallback so the code continues to compile.

Closes #11768
2022-10-12 09:27:36 +03:00
Avi Kivity
0952cecfc9 build: mark abseil as a system header
Abseil is not under our control, so if a header generates a
warning, we can do nothing about it. So far this wasn't a problem,
but under clang 15 it spews a harmless deprecation warning. Silence
the warning by treating the header as a system header (which it is,
for us).

Closes #11767
2022-10-12 09:27:36 +03:00
Michael Livshin
7bd13be3f2 build: improvements & upgrades to Nix dev environment
* Add some more useful stuff to the shell environment, so it actually
  works for debugging & post-mortem analysis.

* Wrap ccache & distcc transparently (distcc will be used unless
  NODISTCC is set to a non-empty value in the environment; ccache will
  be used if CCACHE_DIR is not empty).

* Package the Scylla Python driver (instead of the C* one).

* Catch up to misc build/test requirements (including optional) by
  requiring or custom-packaging: wasmtime 0.29.0, cxxbridge,
  pytest-asyncio, liburing.

* Build statically-linked zstd in a saner and more idiomatic fashion.

* In pure builds (where sources lack Git metadata), derive
  SCYLLA_RELEASE from source hash.

* Refactor things for more parameterization.

* Explicitly stub out installPhase (seeing that "nix build" succeeds
  up to installPhase means we didn't miss any dependencies).

* Add flake support.

* Add copious comments.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-10-02 11:47:16 +03:00
Avi Kivity
2cec417426 Merge 'tools: use the standard allocator' from Botond Dénes
Tools want to be as little disrupting to the environment they run in as possible, because they might be run in a production environment, next to a running scylladb production server. As such, the usual behavior of seastar applications w.r.t. memory is an anti-pattern for tools: they don't want to reserve most of the system memory, in fact they don't want to reserve any amount, instead consuming as much as needed on-demand.
To achieve this, tools want to use the standard allocator. To achieve this they need a seastar option to to instruct seastar to *not* configure and use the seastar allocator and they need LSA to cooperate with the standard allocator.
The former is provided by https://github.com/scylladb/seastar/pull/1211.
The latter is solved by introducing the concept of a `segment_store_backend`, which abstracts away how the memory arena for segments is acquired and managed. We then refactor the existing segment store so that the seastar allocator specific parts are moved to an implementation of this backend concept, then we introduce another backend implementation appropriate to the standard allocator.
Finally, tools configure seastar with the newly introduced option to use the standard allocator and similarly configure LSA to use the standard allocator appropriate backend.

Refs: https://github.com/scylladb/scylladb/issues/9882
This is the last major code piece in scylla for making tools production ready.

Closes #11510

* github.com:scylladb/scylladb:
  test/boost: add alternative variant of logalloc test
  tools: use standard allocator
  utils/logalloc: add use_standard_allocator_segment_pool_backend()
  utils/logalloc: introduce segment store backend for standard allocator
  utils/logalloc: rebase release segment-store on segment-store-backend
  utils/logalloc: introduce segment_store_backend
  utils/logalloc: push segment alloc/dealloc to segment_store
  test/boost/logalloc_test: make test_compaction_with_multiple_regions exception-safe
2022-09-20 12:59:34 +03:00
Botond Dénes
22128977e4 test/boost: add alternative variant of logalloc test
Which intializes LSA with use_standard_allocator_segment_pool_backend()
running the logalloc_test suite on the standard allocator segment pool
backend. To avoid duplicating the test code, the new test-file pulls in
the test code via #include. I'm not proud of it, but it works and we
test LSA with both the debug and standard memory segment stores without
duplicating code.
2022-09-16 14:57:23 +03:00
Kamil Braun
728161003a Merge 'raft server, abort on background errors' from Gusev Petr
Halted background fibers render raft server effectively unusable, so
report this explicitly to the clients.

Fix: #11352

Closes #11370

* github.com:scylladb/scylladb:
  raft server, status metric
  raft server, abort group0 server on background errors
  raft server, provide a callback to handle background errors
  raft server, check aborted state on public server public api's
2022-09-15 14:12:11 +02:00
Botond Dénes
5374f0edbf Merge 'Task manager' from Aleksandra Martyniuk
Task manager for observing and managing long-running, asynchronous tasks in Scylla
with the interface for the user. It will allow listing of tasks, getting detailed
task status and progression, waiting for their completion, and aborting them.
The task manager will be configured with a “task ttl” that determines how long
the task status is kept in memory after the task completes.

At first it will support repair and compaction tasks, and possibly more in the future.

Currently:
Sharded `task_manager` is started in `main.cc` where it is further passed
to `http_context` for the purpose of user interface.

Task manager's tasks are implemented in two two layers: the abstract
and the implementation one. The latter is a pure virtual class which needs
to be overriden by each module. Abstract layer provides the methods that
are shared by all modules and the access to module-specific methods.

Each module can access task manager, create and manage its tasks through
`task_manager::module` object. This way data specific to a module can be
separated from the other modules.

User can access task manager rest api interface to track asynchronous tasks.
The available options consist of:
- getting a list of modules
- getting a list of basic stats of all tasks in the requested module
- getting the detailed status of the requested task
- aborting the requested task
- waiting for the requested task to finish

To enable testing of the provided api, test specific task implementation and module
are provided. Their lifetime can be simulated with the standalone test api.
These components are compiled and the tests are run in all but release build modes.

Fixes: #9809

Closes #11216

* github.com:scylladb/scylladb:
  test: task manager api test
  task_manager: test api layer implementation
  task_manager: add test specific classes
  task_manager: test api layer
  task_manager: api layer implementation
  task_manager: api layer
  task_manager: keep task_manager reference in http_context
  start sharded task manager
  task_manager: create task manager object
2022-09-12 09:26:46 +03:00
Petr Gusev
1b5fa4088e raft server, abort group0 server on background errors 2022-09-12 10:16:43 +04:00
Petr Gusev
c57238d3d6 raft server, check aborted state on public server public api's
Fix: #11352
2022-09-12 10:16:40 +04:00
Aleksandra Martyniuk
ec86410094 task_manager: test api layer implementation
The implementation of a test api that helps testing task manager
api. It provides methods to simulate the operations that can happen
on modules and theirs task. Through the api user can: register
and unregister the test module and the tasks belonging to the module,
and finish the tasks with success or custom error.
2022-09-09 14:29:28 +02:00
Aleksandra Martyniuk
42f36db55b task_manager: test api layer
The test api that helps testing task manager api. It can be used
to simulate the operations that can happen on modules and theirs
task. Through the api user can: register and unregister the test
module and the tasks belonging to the module, and finish the tasks
with success or custom error.
2022-09-09 14:29:28 +02:00
Aleksandra Martyniuk
c9637705a6 task_manager: api layer implementation
The implementation of a task manager api layer. It provides
methods to list the modules registered in task_manager, list
tasks belonging to the given module, abort, wait for or retrieve
a status of the given task.
2022-09-09 14:29:28 +02:00
Aleksandra Martyniuk
07043cee68 task_manager: api layer
The task manager api layer. It can be used to list the modules
registered in task_manager, list tasks belonging to the given
module, abort, wait for or retrieve a status of the given task.
2022-09-09 14:29:28 +02:00
Aleksandra Martyniuk
2439e55974 task_manager: create task manager object
Implementation of a task manager that allows tracking
and managing asynchronous tasks.

The tasks are represented by task_manager::task class providing
members common to all types of tasks. The methods that differ
among tasks of different module can be overriden in a class
inheriting from task_manager::task::impl class. Each task stores
its status containing parameters like id, sequence number, begin
and end time, state etc. After the task finishes, it is kept
in memory for configurable time or until it is unregistered.
Tasks need to be created with make_task method.

Each module is represented by task_manager::module type and should
have an access to task manager through task_manager::module methods.
That allows to easily separate and collectively manage data
belonging to each module.
2022-09-09 14:29:28 +02:00
Mikołaj Grzebieluch
82df8a9905 raft: broadcast_tables: add compilation of cql to intermediate language
We decided to extend `cql_statement` hierarchy with `strongly_consistent_modification_statement`
and `strongly_consistent_select_statement`. Statements operating on
system.broadcast_kv_store will be compiled to these new subclasses if
BROADCAST_TABLES flag is enabled.

If the query is executed on a shard other than 0 it's bounced to that shard.
2022-09-08 15:25:36 +02:00
Mikołaj Grzebieluch
c541d5c363 raft: broadcast_tables: add definition of intermediate language
In broadcast tables, raft command contains a whole program to be executed.
Sending and parsing on each node entire CQL statement is inefficient,
thus we decided to compile it to an intermediate language which can be
easily serializable.

This patch adds a definition of such a language. For now, only the following
types of statements can be compiled:
* select value where key = CONST from system.broadcast_kv_store;
* update system.broadcast_kv_store set value = CONST where key = CONST;
* update system.broadcast_kv_store set value = CONST where key = CONST if value = CONST;
where CONST is string literal.
2022-09-08 14:03:51 +02:00
Benny Halevy
d588e2a7c5 release: properly evaluate SCYLLA_BUILD_MODE_* macros
Patch 765d2f5e46 did not
evaluate the #if SCYLLA_BUILD_MODE directives properly
and it always matched SCYLLA_BULD_MODE == release.

This change fixes that by defining numerical codes
for each build mode and using macro expansion to match
the define SCYLLA_BUILD_MODE against these codes.

Also, ./configure.py was changes to pass SCYLLA_BUILD_MODE
to all .cc source files, and makes sure it is defined
in build_mode.hh.

Support was added for coverage build mode,
and an #error was added if SCYLLA_BUILD_MODE
was not recognized by the #if ladder directives.

Additional checks verifying the expected SEASTAR_DEBUG
against SCYLLA_BUILD_MODE were added as well,

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #11387
2022-08-29 10:20:19 +03:00
Benny Halevy
765d2f5e46 release: define SCYLLA_BUILD_MODE_STR by stringifying SCYLLA_BUILD_MODE
Currently SCYLLA_BULD_MODE is defined as a string by the cxxflags
generated by configure.py.  This is not very useful since one cannot use
it in a @if preprocessor directive.

Instead, use -DSCYLLA_BULD_MODE=release, for example, and define a
SCYLLA_BULD_MODE_STR as the dtirng representation of it.

In addition define the respective
SCYLLA_BUILD_MODE_{RELEASE,DEV,DEBUG,SANITIZE} macros that can be easily
used in @ifdef (or #ifndef :)) for conditional compilation.

The planned use case for it is to enable a task_manager test module only
in non-release modes.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #11357
2022-08-25 16:50:42 +02:00
Benny Halevy
fa7033bc2b configure: add --perf-tests-debuginfo option
Provides separate control over debuginfo for perf tests
since enabling --tests-debuginfo affects both today
causing the Jenkins archives of perf tests binaries to
inflate considerably.

Refs https://github.com/scylladb/scylla-pkg/issues/3060

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #11337
2022-08-21 19:08:21 +03:00
Michał Chojnowski
de0f2c21ec configure.py: make messaging_service.cc the first source file
Currently messaging_service.o takes the longest of all core objects to
compile.  For a full build of build/release/scylla, with current ninja
scheduling, on a 32-hyperthread machine, the last ~16% of the total
build time is spent just waiting on messaging_service.o to finish
compiling.

Moving the file to the top of the list makes ninja start its compilation
early and gets rid of that single-threaded tail, improving the total build
time.

Closes #11255
2022-08-10 11:18:09 +03:00
Yaron Kaikov
2fe2306efb configure.py: add date-stamp parameter
When starting `Build` job we have a situation when `x86` and `arm`
starting in different dates causing the all process to fail

As suggested by @avikivity , adding a date-stamp parameter and will pass
it through downstream jobs to get one release for each job

Ref: scylladb/scylla-pkg#3008

Closes #11234
2022-08-08 17:28:38 +03:00
Avi Kivity
268e4abe77 Merge 'wasm: reuse instances for wasm UDFs' from Wojciech Mitros
Calling WebAssembly UDFs requires wasmtime instance. Creating such an instance is expensive,
but these instances can be reused for subsequent calls of the same UDF on various inputs.

This patch introduces a way of reusing wasmtime instances: a wasm instance cache.
The cache stores a wasmtime instance for each UDF and scheduling group. The instances are
evicted using LRU strategy and their size is based on the size of their wasm memories.

The instances stored in the cache are also dropped when the UDF is dropped itself. For that reason,
the first patch modifies the current implementation of UDF dropping, so that the instance dropping may be added
later. The patch also removes the need of compiling the UDF again when dropping it.

The second patch contains the implementation and use of the new cache. The cache is implemented
in `lang/wasm_instance_cache.hh` and the main ways of using it are the `run_script` methods from `wasm.hh`

The third patch adds tests to `test_wasm.py` that check the correctness and performance of the new
cache. The tests confirm the instance reuse, size limits, instance eviction after timeout and after dropping the UDF.

Closes #10306

* github.com:scylladb/scylladb:
  wasm: test instances reuse
  wasm: reuse UDF instances
  schema_tables: simplify merge_functions and avoid extra compilation
2022-08-02 13:51:16 +03:00
Avi Kivity
2c0932cc41 Merge 'Reduce the amount of per-table metrics' from Amnon Heiman
This series is the first step in the effort to reduce the number of metrics reported by Scylla.
The series focuses on the per-table metrics.

The combination of histograms, per-tables, and per shard makes the number of metrics in a cluster explode.
The following series uses multiple tools to reduce the number of metrics.
1. Multiple metrics should only be reported for the user tables and the condition that checked it was not updated when more non-user keyspaces were added.
2. Second, instead of a histogram, per table, per shard, it will report a summary per table, per shard, and a single histogram per node.
3. Histograms, summaries, and counters will be reported only if they are used (for example, the cas-related metrics will not be reported for tables that are not using cas).

Closes #11058

* github.com:scylladb/scylla:
  Add summary_test
  database: Reduce the number of per-table metrics
  replica/table.cc: Do not register per-table metrics for system
  histogram_metrics_helper.hh: Add to_metrics_summary function
  Unified histogram, estimated_histogram, rates, and summaries
  Split the timed_rate_moving_average into data and timer
  utils/histogram.hh: should_sample should use a bitmask
  estimated_histogram: add missing getter method
2022-07-27 22:01:08 +03:00
Amnon Heiman
3658aa9ec2 Add summary_test
This patch adds unit tests for the summary implementation.
2022-07-27 16:58:52 +03:00
Amnon Heiman
9a3e70adfb histogram_metrics_helper.hh: Add to_metrics_summary function
The to_metrics_summary is a helper function that create a metrics type
summary from a timed_rate_moving_average_with_summary object.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2022-07-27 16:58:52 +03:00
Botond Dénes
81e20ceaab Merge 'logalloc, dirty_memory_manager: move region_groups to dirty_memory_manager' from Avi Kivity
logalloc manages regions of log-structured allocated memory, and region_groups
containing such regions and other region_groups. region_groups were introduced
for accounting purposes - first to limit the amount of memory in memtables, then to
match new dirty memory allocation rate with memtable flushing rate so we never
hit a situation where allocation rate exceeded flush rate, and we exceed our limit.

The problem is that the abstraction is very weak - if we want to change anything
in memtable flush control we'll need to change region_groups too - and also
expensive to maintain.

The solution is to break the abstraction and move region_groups to memtable
dirty memory management code. Instead introduce a new, simpler abstraction,
the region_listener, which communicates changes in region memory consumption
to an external piece of code, which can then choose to do with it what it likes.

The long term plan is to completely remove region_groups and fold them into dirty_memory_manager:
 - make each memtable a region_listener so it gets called back after size changes
 - make memtables inform their dirty_memory_manager about the size to dirty_memory_manager can decide to throttle writes and which memtable to pick to flush

Closes #10839

* github.com:scylladb/scylla:
  logalloc: drop region_impl public accessors
  logalloc, dirty_memory_manager: move size-tracking binomial heap out of logalloc
  logalloc: relax lifetime rules around region_listener
  logalloc, dirty_memory_manager: move region_group and associated code
  logalloc: expose tracker_reclaimer_lock
  logalloc: reimplement tracker_reclaim_lock to avoid using hidden classes
  logalloc: reduce friendship between region and region_group
  logalloc: decouple region_group from region
  memtable: stop using logalloc::region::group() to test for flushed memtables
2022-07-26 17:08:37 +03:00
Nadav Har'El
cb8a67dc98 Merge 'Allow materialized views to by synchronous' from Piotr Sarna
This pull request introduces a "synchronous mode" for global views. In this mode, all view updates are applied synchronously as if the view was local.

Marking view as a synchronous one can be done using `CREATE MATERIALIZED VIEW` and `ALTER MATERIALIZED VIEW`. E.g.:
```cql
ALTER MATERIALIZED VIEW ks.v WITH synchronous_updates = true;
```

Marking view as a synchronous one was done using tags (originally used by alternator). No big modifications in the view's code were needed.

Fixes: https://github.com/scylladb/scylla/issues/10545

Closes #11013

* github.com:scylladb/scylla:
  cql-pytest: extend synchronous mv test with new cases
  cql-pytest: allow extra parameters in new_materialized_view
  docs: add a paragraph on view synchronous updates
  test/boost/cql_query_test: add test setting synchronous updates property
  test: cql-pytest: add a test for synchronous mode materialized views
  db: view: react to synchronous updates tag
  cql3: statements: cf_prop_defs: apply synchronous updates tag
  alternator, db: move the tag code to db/tags
  cql3: statements: add a synchronous_updates property
2022-07-26 15:42:51 +03:00
Avi Kivity
fbe8ea7727 logalloc, dirty_memory_manager: move region_group and associated code
region_group is an abstraction that allows accounting for groups of
regions, but the cost/benefit ratio of maintaining the abstraction
is poor. Each time we need to change decision algorithm of memtable
flushing (admittedly rarely), we need to distill that into an abstraction
for region_groups and then use it. An example is virtual regions groups;
we wanted to account for the partially flushed memtables and had to
invent region groups to stand in their place.

Rather than continuing to invest in the abstraction, break it now
and move it to the memtable dirty memory manager which is responsible
for making those decisions. The relevant code is moved to
dirty_memory_manager.hh and dirty_memory_manager.cc (new file), and
a new unit test file is added as well.

A downside of the change is that unit testing will be more difficult.
2022-07-26 11:12:10 +03:00
Yaron Kaikov
c42c5111eb SCYLLA-VERSION-GEN: use semver-compatible version
Setting Scylla to use semantic versioning. (Ref: https://semver.org/)

Closes: https://github.com/scylladb/scylla/issues/9543

Closes #10957
2022-07-25 18:06:28 +03:00
Michał Sala
041cb77ad0 alternator, db: move the tag code to db/tags
Tags are a useful mechanism that could be used outside of alternator
namespace. My motivation to move tags_extension and other utilities to
db/tags/ was that I wanted to use them to mark "synchronous mode" views.

I have extracted `get_tags_of_table`, `find_tag` and `update_tags`
method to db/tags/utils.cc and moved alternator/tags_extension.hh to
db/tags/.

The signature of `get_tags_of_table` was changed from `const
std::map<sstring, sstring>&` to `const std::map<sstring, sstring>*`
Original behavior of this function was to throw an
`alternator::api_error` exception. This was undesirable, as it
introduced a dependency on the alternator module. I chose to change it
to return a potentially null value, and added a wrapper function to the
alternator module - `get_tags_of_table_or_throw` to keep the previous
throwing behavior.
2022-07-25 09:53:33 +02:00
Wojciech Mitros
9281ba3919 wasm: reuse UDF instances
When executing a wasm UDF, most of the time is spent on
setting up the instance. To minimize its cost, we reuse
the instance using wasm::instance_cache.

This patch adds a wasm instance cache, that stores
a wasmtime instance for each UDF and scheduling group.
The instances are evicted using LRU strategy. The
cache may store some entries for the UDF after evicting
the instance, but they are evicted when the corresponding
UDF is dropped, which greatly limits their number.

The size of stored instances is estimated using the size
of their WASM memories. In order to be able to read the
size of memory, we require that the memory is exported
by the client.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-07-20 18:19:22 +02:00
Takuya ASADA
752be6536a rename relocatable packages
Currently, we use following naming convention for relocatable package
filename:
  ${package_name}-${arch}-package-${version}.${release}.tar.gz
But this is very different with Linux standard packaging system such as
.rpm and .deb.
Let's align the convention to .rpm style, so new convention should be:
  ${package_name}-${version}-${release}.${arch}.tar.gz

Closes #9799

Closes #10891

* tools/java de8289690e...d0143b447c (1):
  > build_reloc.sh: rename relocatable packages

* tools/jmx fe351e8...06f2735 (1):
  > build_reloc.sh: rename relocatable packages

* tools/python3 e48dcc2...bf6e892 (1):
  > reloc/build_reloc.sh: rename relocatable packages
2022-07-19 15:46:49 +03:00
Takuya ASADA
23973f9591 Support installing pip provided command symlinks to /usr/bin
This is part of support installing executables from PIP package,
now we support installing executable from PIP package but it will
install under /opt/scylladb/python3/bin.
To call these commands without speciying full path, we also need to install
symlink to /usr/bin.
To do this, we need new list which specifies command name for symlink.

Closes #10748
2022-07-12 17:26:05 +03:00
Nadav Har'El
f5ff687b64 Merge 'cql3: Reorganize expr::to_restriction' from Jan Ciołek
This PR introduces improvements to `expr::to_restriction` and prepares the validation part for restriction classes removal.

`expr::to_restriction` is currently used to take a restriction from the WHERE clause, prepare it, perform some validation checks and finally convert it to an instance of the restriction class.

Soon we will get rid of the restriction class.

In preparation for that `expr::to_restriction` is split into two independent parts:
* The part that prepares and validates a binary_operator
* The part that converts a binary_operator to restriction

Thanks to this split getting rid of restriction class will be painless, we will just stop using the second part.

`to_restriction.cc` is replaced by `restrictions.hh/cc`. In the future we can put all the restriction expressions code there to avoid clutter in `expression.hh/cc`.

This change made it much easier to fix #10631, so I did that as well.

Fixes: #10631

Closes #10979

* github.com:scylladb/scylla:
  cql-pytest: Test that IS NOT only accepts NULL
  cql-pytest: Enable testInvalidCollectionNonEQRelation
  cql3: Move single element IN restrictions handling
  cql3: Check for disallowed operators early
  cql3: Simplify adding restrictions
  cql3: Reorganize to_restriction code
  cql3: Fix IS NOT NULL check in to_restriction
  cql3: Swap order of arguments in error message
2022-07-12 00:26:34 +03:00
Jan Ciolek
debd7399fd cql3: Reorganize to_restriction code
expr::to_restriction is currently used to
take a restriction from the WHERE clause,
prepare it, perform some validation checks
and finally convert it to an instance of
the restriction class.

Soon we will get rid of the restriction class.

In preparation for that expr::to_restriction
is split into two independent parts:
* The part that prepares and validates a binary_operator
* The part that converts a binary_operator to restriction

Thanks to this split getting rid of restriction class
will be painless, we will just stop using the
second part.

This commit splits expr::to_restriction into two functions;
* validate_and_prepare_new_restriction
* convert_to_restriction
that handle each of those parts.

All helper validation methods in the anonymous namespace
are copied from the to_restriction.cc file.

to_restriction.cc isn't the best filename for the new functionality,
so it has been renamed to restrictions.hh/cc.
In the future all the code regarding restrictions could be
put there to reduce clutter in expression.hh/cc

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:16 +02:00
Piotr Dulikowski
18f43fa00e utils/exceptions: add try_catch
Introduces a utility function which allows obtaining a pointer to the
exception data held behind an std::exception_ptr if the data matches the
requested type. It can be used to implement manual but concise
try..catch chains.

The `try_catch` has the best performance when used with libstdc++ as it
uses the stdlib specific functions for simulating a try..catch without
having to actually throw. For other stdlibs, the implementation falls
back to a throw surrounded by an actual try..catch.
2022-07-05 16:41:09 +02:00
Pavel Emelyanov
3a753068be Merge "Make permissions cache live updateable and add an API for resetting authorization cache" from Igor Ribeiro Barbosa Duarte
Currently, for users who have permissions_cache configs set to very high
values (and thus can't wait for the configured times to pass) having to restart
the service every time they make a change related to permissions or
prepared_statements cache (e.g. Adding a user and changing their permissions)
can become pretty annoying.
This patch series make permissions_validity_in_ms, permissions_update_interval_in_ms
and permissions_cache_max_entries live updateable so that restarting the
service is not necessary anymore for these cases.
It also adds an API for flushing the cache to make it easier for users who
don't want to modify their permissions_cache config.

branch: https://github.com/igorribeiroduarte/scylla/tree/make_permissions_cache_live_updateable
CI: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1005/
dtests: https://github.com/igorribeiroduarte/scylla-dtest/tree/test_permissions_cache

* https://github.com/igorribeiroduarte/scylla/make_permissions_cache_live_updateable:
  loading_cache_test: Test loading_cache::reset and loading_cache::update_config
  api: Add API for resetting authorization cache
  authorization_cache: Make permissions cache and authorized prepared statements cache live updateable
  auth_prep_statements_cache: Make aut_prep_statements_cache accept a config struct
  utils/loading_cache.hh: Add update_config method
  utils/loading_cache.hh: Rename permissions_cache_config to loading_cache_config and move it to loading_cache.hh
  utils/loading_cache.hh: Add reset method
2022-06-29 11:14:13 +03:00
Igor Ribeiro Barbosa Duarte
a23c3d6338 api: Add API for resetting authorization cache
For cases where we have very high values set to permissions_cache validity and
update interval (E.g.: 1 day), whenever a change to permissions is made it's
necessary to update scylla config and decrease these values, since waiting for
all this time to pass wouldn't be viable.
This patch adds an API for resetting the authorization cache so that changing
the config won't be mandatory for these cases.

Usage:
    $ curl -X POST http://localhost:10000/authorization_cache/reset

Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
2022-06-28 19:58:06 -03:00
Avi Kivity
3131cbea62 Merge 'query: allow replica to provide arbitrary continue position' from Botond Dénes
Currently, we use the last row in the query result set as the position where the query is continued from on the next page. Since only live rows make it into query result set, this mandates the query to be stopped on a live row on the replica, lest any dead rows or tombstones processed after the live rows, would have to be re-processed on the next page (and the saved reader would have to be thrown away due to position mismatch). This requirement of having to stop on a live row is problematic with datasets which have lots of dead rows or tombstones, especially if these form a prefix. In the extreme case, a query can time out before it can process a single live row and the data-set becomes effectively unreadable until compaction gets rid of the tombstones.
This series prepares the way for the solution: it allows the replica to determine what position the query should continue from on the next page. This position can be that of a dead row, if the query stopped on a dead row. For now, the replica supplies the same position that would have been obtained with looking at the last row in the result set, this series merely introduces the infrastructure for transferring a position together with the query result, and it prepares the paging logic to make use of this position. If the coordinator is not prepared for the new field, it will simply fall-back to the old way of looking at the last row in the result set. As I said for now this is still the same as the content of the new field so there is no problem in mixed clusters.

Refs: https://github.com/scylladb/scylla/issues/3672
Refs: https://github.com/scylladb/scylla/issues/7689
Refs: https://github.com/scylladb/scylla/issues/7933

Tests: manual upgrade test.
I wrote a data set with:
```
./scylla-bench -mode=write -workload=sequential -replication-factor=3 -nodes 127.0.0.1,127.0.0.2,127.0.0.3 -clustering-row-count=10000 -clustering-row-size=8096 -partition-count=1000
```
This creates large, 80MB partitions, which should fill many pages if read in full. Then I started a read workload:
```
./scylla-bench -mode=read -workload=uniform -replication-factor=3 -nodes 127.0.0.1,127.0.0.2,127.0.0.3 -clustering-row-count=10000 -duration=10m -rows-per-request=9000 -page-size=100
```
I confirmed that paging is happening as expected, then upgraded the nodes one-by-one to this PR (while the read-load was ongoing). I observed no read errors or any other errors in the logs.

Closes #10829

* github.com:scylladb/scylla:
  query: have replica provide the last position
  idl/query: add last_position to query_result
  mutlishard_mutation_query: propagate compaction state to result builder
  multishard_mutation_query: defer creating result builder until needed
  querier: use full_position instead of ad-hoc struct
  querier: rely on compactor for position tracking
  mutation_compactor: add current_full_position() convenience accessor
  mutation_compactor: s/_last_clustering_pos/_last_pos/
  mutation_compactor: add state accessor to compact_mutation
  introduce full_position
  idl: move position_in_partition into own header
  service/paging: use position_in_partition instead of clustering_key for last row
  alternator/serialization: extract value object parsing logic
  service/pagers/query_pagers.cc: fix indentation
  position_in_partition: add to_string(partition_region) and parse_partition_region()
  mutation_fragment.hh: move operator<<(partition_region) to position_in_partition.hh
2022-06-27 12:23:21 +03:00
Botond Dénes
119be5d5db idl: move position_in_partition into own header
So it can be used without pulling in all of partition_checksum.idl.hh.
2022-06-23 13:36:24 +03:00
Piotr Dulikowski
bc50163016 tests: add per_partition_rate_limit_test
Adds the per_partition_rate_limit_test.cc file. Currently, it only
contains a test which verifies that the feature correctly switches off
rate limiting for internal queries (!allow_limit || internal sg).
2022-06-22 20:16:49 +02:00
Piotr Dulikowski
dccb8a5729 schema: add per_partition_rate_limit schema extension
Adds the new `per_partition_rate_limit` schema extension. It has two
parameters: `max_writes_per_second` and `max_reads_per_second`.
In the future commits they will control how many operations of given
type are allowed for each partition in the given table.
2022-06-22 20:16:48 +02:00
Piotr Dulikowski
0fe8b55427 db: add rate_limiter
Introduces the rate_limiter, a replica-side data structure meant for
tracking the frequence with which each partition is being accessed
(separately for reads and writes) and deciding whether the request
should be accepted and processed further or rejected.

The limiter is implemented as a statically allocated hashmap which keeps
track of the frequency with which partitions are accessed. Its entries
are incremented when an operation is admitted and are decayed
exponentially over time.

If a partition is detected to be accessed more than its limit allows,
requests are rejected with a probability calculated in such a way that,
on average, the number of accepted requests is kept at the limit.

The structure currently weights a bit above 1MB and each shard is meant
to keep a separate instance. All operations are O(1), including the
periodic timer.
2022-06-22 20:16:48 +02:00
Piotr Dulikowski
621b7f35e2 replica: add rate_limit_exception and a simple serialization framework
Introduces `replica::rate_limit_exception` - an exceptions that is
supposed to be thrown/returned on the replica side when the request is
rejected due to the exceeding the per-partition rate limit.

Additionally, introduces the `exception_variant` type which allows to
transport the new exception over RPC while preserving the type
information. This will be useful in later commits, as the coordinator
will have to know whether a replica has failed due to rate limit being
exceeded or another kind of error.

The `exception_variant` currently can only either hold "other exception"
(std::monostate) or the aforementioned `rate_limit_exception`, but can
be extended in a backwards-compatible way in the future to be able to
hold more exceptions that need to be handled in a different way.
2022-06-22 20:07:58 +02:00
Michael Livshin
43f2c55c5d configure.py: speed up and simplify compdb generation
The most time-consuming part is invoking "ninja -t compdb", and there
is no need to repeat that for every mode.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>

Closes #10733
2022-06-15 16:40:52 +03:00