Add a table name to Alternator's tracing output, as some clients would
like to consistently receive this information.
- add missing `tracing::add_table_name` in `executor::scan`
- add emiting tables' names in `trace_state::build_parameters_map`
- update tests, so when tracing is looked for it is filtered by table's
name, which confirms table is being outputed.
- change `struct one_session_records` declaration to `class one_session_records`,
as `one_session_records` is later defined as class.
Refs #26618Fixes#24031Closesscylladb/scylladb#26634
In #24031 users complained, that trace message is truncated, namely it's
no longer json parsable and table name might not be part of the output.
This path enables users to configure maximum size of trace message.
In case user wanted `table` name, but didn't care about message size,
#26634 will help.
- add configuration varable `alternator_max_users_query_size_in_trace_output`
with default value of 4096 (4 times old default value).
- modify `truncated_content_view` function to use new configuration
variable for truncation limit
- update `truncated_content_view` to consistently truncate at given
size, previously trunctation would also happen when data arrived in
more than one chunk
- update `truncated_content_view` to better handle truncated value
(limit number of copies)
- fix `scylla_config_read` call - call to `query` for a configuration
name that is not existing will return `Items` array empty
(but present) - this would raise array access exception few lines
below.
- add test
Refs #26634
Refs #24031Closesscylladb/scylladb#26618
As requested in #22104, moved the files and fixed other includes and build system.
Moved files:
- combine.hh
- collection_mutation.hh
- collection_mutation.cc
- converting_mutation_partition_applier.hh
- converting_mutation_partition_applier.cc
- counters.hh
- counters.cc
- timestamp.hh
Fixes: #22104
This is a cleanup, no need to backport
Closesscylladb/scylladb#25085
Each query-type (QUERY, EXECUTE, BATCH) CQL opcode has a number of parameters
in their payload which we always want to record in the Tracing object.
Today it's a Consistency Level, Serial Consistency Level and a Default Timestamp.
Setting each of them individually can lead to a human error when one (or more) of
them would not be set. Let's eliminate such a possibility by defining
a single function that sets them all.
This also allows an easy addition of such parameters to this function in
the future.
Our "sstring_view" is an historic alias for the standard std::string_view.
The patch changes the last remaining random uses of this old alias across
our source directory to the standard type name.
After this patch, there are no more uses of the "sstring_view" alias.
It will be removed in the following patch.
Refs #4062.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The pointer points to immutable prepared_statement, so tune up the type
respectively. Tracing has its own alieas for it, fix one too.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.
Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
this change is a follow-up of 4f5fcb02fd,
the goal is to avoid the programming oversights like
```c++
trace(trace_ptr, "foo {} with {} but {} is {}");
```
as `trace(const trace_state_ptr& p, const std::string& msg)` is
a better match than the templated one, i.e.,
`trace(const trace_state_ptr& p, fmt::format_string<T...> fmt, T&&...
args)`. so we cannot detect this with the compile-time format checking.
so let's just drop this overload, and update its callers to use
the other overload.
The change was suggested by Avi. the example also came from him.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14188
when creating an event_record, the typical use case is to use a
string created using fmt::format(), which returns a std::string.
before this change, we always convert the std::string to a sstring,
and move this shinny new sstring into a new event_record. but
when creating sstring, we always performs a deep copy, which is not
necessary, as we own the std::string already.
so, in this change, instead of performing a deep copy, we just keep
the std::string and pass it all the way to where event_record is
created. please note, the std::string will be implicitly converted
to data_value, and it will be dropped on the floor after being
serialized in abstract_type::decompose(). so this deep copy is
inevitable.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
in this change we pass the fmt string using fmt::format_string<T...>
in order to {fmt}'s compile-time formatting. so we can identify
the bad format specifier or bad format placeholders at compile-time.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
This field needs to call trace_state::ttl_by_type() which, in turn,
looks into _props. The latter should have been initialized already
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It takes props from constructor args and tunes them according to the
constructing "flavor" -- primary or secondary state. Adding two static
helpers code-document the intent and make list-initialization possible
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The instance ptr and props have to be set up early, because other
members' initialization depends on them. It's currently OK, because
other members are initialized in the constructor body, but moving them
into initializer list would require correct ordering
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It now does nothing but wraps make_lw_shared<one_session_records>()
call. Callers can do it on their own thus facilitating further
list-initialization patching
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
For that to happen the value evaluation is moved from the
init_session_records() into a private trace_state helper as it checks
the props values initialized earlier
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
they are part of the CQL type system, and are "closer" to types.
let's move them into "types" directory.
the building systems are updated accordingly.
the source files referencing `types.hh` were updated using following
command:
```
find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} +
```
the source files under sstables include "types.hh", which is
indeed the one located under "sstables", so include "sstables/types.hh"
instea, so it's more explicit.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#12926
The CQL binary protocol introduced "unset" values in version 4
of the protocol. Unset values can be bound to variables, which
cause certain CQL fragments to be skipped. For example, the
fragment `SET a = :var` will not change the value of `a` if `:var`
is bound to an unset value.
Unsets, however, are very limited in where they can appear. They
can only appear at the top-level of an expression, and any computation
done with them is invalid. For example, `SET list_column = [3, :var]`
is invalid if `:var` is bound to unset.
This causes the code to be littered with checks for unset, and there
are plenty of tests dedicated to catching unsets. However, a simpler
way is possible - prevent the infiltration of unsets at the point of
entry (when evaluating a bind variable expression), and introduce
guards to check for the few cases where unsets are allowed.
This is what this long patch does. It performs the following:
(general)
1. unset is removed from the possible values of cql3::raw_value and
cql3::raw_value_view.
(external->cql3)
2. query_options is fortified with a vector of booleans,
unset_bind_variable_vector, where each boolean corresponds to a bind
variable index and is true when it is unset.
3. To avoid churn, two compatiblity structs are introduced:
cql3::raw_value{,_view}_vector_with_unset, which can be constructed
from a std::vector<raw_value{,_view/}>, which is what most callers
have. They can also be constructed with explicit unset vectors, for
the few cases they are needed.
(cql3->variables)
4. query_options::get_value_at() now throws if the requested bind variable
is unset. This replaces all the throwing checks in expression evaluation
and statement execution, which are removed.
5. A new query_options::is_unset() is added for the users that can tolerate
unset; though it is not used directly.
6. A new cql3::unset_operation_guard class guards against unsets. It accepts
an expression, and can be queried whether an unset is present. Two
conditions are checked: the expression must be a singleton bind
variable, and at runtime it must be bound to an unset value.
7. The modification_statement operations are split into two, via two
new subclasses of cql3::operation. cql3::operation_no_unset_support
ignores unsets completely. cql3::operation_skip_if_unset checks if
an operand is unset (luckily all operations have at most one operand that
tolerates unset) and applies unset_operation_guard to it.
8. The various sites that accept expressions or operations are modified
to check for should_skip_operation(). This are the loops around
operations in update_statement and delete_statement, and the checks
for unset in attributes (LIMIT and PER PARTITION LIMIT)
(tests)
9. Many unset tests are removed. It's now impossible to enter an
unset value into the expression evaluation machinery (there's
just no unset value), so it's impossible to test for it.
10. Other unset tests now have to be invoked via bind variables,
since there's no way to create an unset cql3::expr::constant.
11. Many tests have their exception message match strings relaxed.
Since unsets are now checked very early, we don't know the context
where they happen. It would be possible to reintroduce it (by adding
a format string parameter to cql3::unset_operation_guard), but it
seems not to be worth the effort. Usage of unsets is rare, and it is
explicit (at least with the Python driver, an unset cannot be
introduced by ommission).
I tried as an alternative to wrap cql3::raw_value{,_view} (that doesn't
recognize unsets) with cql3::maybe_unset_value (that does), but that
caused huge amounts of churn, so I abandoned that in favor of the
current approach.
Closes#12517
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.
Closes#10562
Secondary tracing sessions used to compute the execution time
from the point of their `begin()`-ning, not the parent session's
`begin()`. As a result, replica reported a slow query if it
exceeded the entire threshold *on that replica* too.
This change augments `trace_info` with the TS of parent's session
starting point, to be used as a reference on replicas.
Fixes#9403Closes#10005
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
The write paths in storage_proxy pass replica sets as
std::unordered_set<gms::inet_address>. This is a complex type, with
N+1 allocations for N members, so we change it to a small_vector (via
inet_address_vector_replica_set) which requires just one allocation, and
even zero when up to three replicas are used.
This change is more nuanced than the corresponding change to the read path
abe3d7d7 ("Merge 'storage_proxy: use small_vector for vectors of
inet_address' from Avi Kivity"), for two reasons:
- there is a quadratic algorithm in
abstract_write_response_handler::response(): it searches for a replica
and erases it. Since this happens for every replica, it happens N^2/2
times.
- replica sets for writes always include all datacenters, while reads
usually involve just one datacenter.
So, a write to a keyspace that has 5 datacenters will invoke 15*(15-1)/2
=105 compares.
We could remove this by sending the index of the replica in the replica
set to the replica and ask it to include the index in the response, but
I think that this is unnecessary. Those 105 compares need to be only
105/15 = 7 times cheaper than the corresponding unordered_set operation,
which they surely will. Handling a response after a cross-datacenter round
trip surely involves L3 cache misses, and a small_vector reduces these
to a minimum compared to an unordered_set with its bucket table, linked
list walking and managent, and table rehashing.
Tests using perf_simple_query --write --smp 1 --operations-per-shard 1000000
--task-quota-ms show two allocations removed (as expected) and a nice
reduction in instructions executed.
before: median 204842.54 tps ( 54.2 allocs/op, 13.2 tasks/op, 49890 insns/op)
after: median 206077.65 tps ( 52.2 allocs/op, 13.2 tasks/op, 49138 insns/op)
Closes#8847
Eliminate not used includes and replace some more includes
with forward declarations where appropriate.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
If tracing::tracing::_ignore_trace_events is enabled then
the tracing system must ignore all sessions events
for non full_tracing sessions (probability tracing and
user requested) and creating subsessions with the
make_trace_info.
Patch introduces the slow query tracing fast mode that
omits all events during tracing.
Signed-off-by: Ivan Prisyazhnyy <ivan@scylladb.com>
The mechanism of session record params is currently only used
to store query strings and a couple more params like consistency level,
but since we now have more frontends than just CQL and Thrift,
it would be nice to also allow the users to put custom parameters in
there.
An immediate first user of this mechanism would be alternator,
which is going to put the operation type under the "alternator_op" key.
The operation type is not part of the query string due to how DynamoDB's
protocol works - the op type is stored separately in the HTTP header.
While it's possible to extract the operation type from the session_id,
it might not be the case once #2572 is implemented.
std::memory_order is an unscoped enum, and so does not need its
members to be prefixed with std::memory_order::, just std::.
This used to work, but in C++20 it no longer does. Use the
standard way to name these constants, which works in both C++17
and C++20.
Reviewed-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200512092408.115649-1-avi@scylladb.com>
Currently query_options objects is passed to a trace stopping function
which makes it mandatory to make them alive until the end of the
query. The reason for that is to add prepared statement parameters to
the trace. All other query options that we want to put in the trace are
copied into trace_state::params_values, so lets copy prepared statement
parameters there too. Trace enabled case will become a little bit more
expensive but on the other hand we can drop a continuation that holds
query_options object alive from a fast path. It is safe to drop the call
to stop_foreground_prepared() here since The tracing will be stopped
in process_request_one().
Message-Id: <20191205102026.GJ9084@scylladb.com>
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.
Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.
Scylla now requires GCC 8 to compile.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().
Mechanically converted with https://github.com/avikivity/unsprint.
This patch completes what was started in a4282c2c6e
Make trace_state_ptr to be a wrapper class around lw_shared_ptr<trace_state> that
hints that bool(trace_state_ptr) is likely to return FALSE.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Add a new "response_size" column to system_traces.sessions and store a size of an uncompressed response
for a traced query.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Add a new column "request_size" to system_traces.sessions and store
the uncompressed request frame data size.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Most queries run without tracing (and those that run with tracing
are not sensitive to a few cycles), so mark the tracing paths as
cold.
Message-Id: <20180723133000.30482-1-avi@scylladb.com>
Store the prepared statement positional parameters values in the
corresponding system_traces.sessions entry in the 'parameters' column
(which has a map<text,text> type).
Parameters are stored as a pair of "param[X]" : "value", where X is
the index of the parameter starting from 0 and the "value" is the first
64 characters of the parameter's value string representation.
If parameters were given with their names attached (see the description
on bit 0x40 of QUERY flags in the CQL binary protocol specification) then
parameters are going to be stored in the "param[X](<bound variable name>)" : "value"
form.
If the value's string representation is longer than 64 characters then the "value" will
contain only first 64 characters of it and will have the "..." at
the end.
For a BATCH of prepared statements the parameter "name" will have a form of
param[Y][X] where Y is the index of the corresponding prepared statement
in the BATCH and X is the index of the parameter. Both X and Y start from
0.
Note:
Had to switch to boost::range::find() in sstables::big_sstable_set in order to
address the "ambiguous overload" compilation error.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Similarly to the regular QUERY of EXECUTE we want to see the actual
queries statement that were part of the BATCH.
If a traced query has only a single statement to execute then its statement will be stored in a form 'query':'<statement>'.
If there are two or more queries (BATCH) then statements of each query in the BATCH will be stored in a form 'query[X]':'<statement>', where X is the index of the query in the
BATCH starting from 0.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Hide it inside the trace_state.cc in order to avoid future circular
dependencies with other .hh files.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
All we require are value semantics.
`client_state` still stores `authenticated_user` in a `shared_ptr`, but
the behavior of that class is complex enough to warrant its own
discussion/design/refactor.