Commit Graph

67 Commits

Author SHA1 Message Date
Kefu Chai
bd71e0b794 tracing: add formatter for tracing::span_id
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for `tracing::span_id`, and drop
its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17058
2024-01-31 13:43:46 +02:00
Kefu Chai
db77587309 tracing: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16925
2024-01-23 08:57:11 +02:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Pavel Emelyanov
4c74425780 tracing: Remove stop_tracing() wrapper
Now it's confusing, as it doesn't stop tracing, but rather shuts it down
on all shards. The only caller of it can be more descriptive without the
wrapper

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-03 10:46:47 +03:00
Pavel Emelyanov
61381feaad tracing: Remove start_tracing() wrapper
Callers can make one-like stop with the help of invoke_on_all() overload
that wraps std::invoke

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-03 10:46:47 +03:00
Pavel Emelyanov
89c43f6677 tracing: Remove create_tracing() wrapper
It doesn't make callers' life easier, but hides global tracing instance

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-03 10:46:47 +03:00
Pavel Emelyanov
8234235b94 tracing: Rename helper's stop() to shutdown()
Because it's called on shutdown, not on stop

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-03 10:02:12 +03:00
Kamil Braun
59d4bb3787 tracing: remove qp.get_migration_manager() calls
Pass `migration_manager&` from top-level instead.
2023-06-15 09:48:54 +02:00
Kefu Chai
428c13076f tracing: use std::string instead of sstring for event_record::message
when creating an event_record, the typical use case is to use a
string created using fmt::format(), which returns a std::string.

before this change, we always convert the std::string to a sstring,
and move this shinny new sstring into a new event_record. but
when creating sstring, we always performs a deep copy, which is not
necessary, as we own the std::string already.

so, in this change, instead of performing a deep copy, we just keep
the std::string and pass it all the way to where event_record is
created. please note, the std::string will be implicitly converted
to data_value, and it will be dropped on the floor after being
serialized in abstract_type::decompose(). so this deep copy is
inevitable.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-06-07 18:59:37 +08:00
Pavel Emelyanov
16e1315eef tracing: Remove init_session_records()
It now does nothing but wraps make_lw_shared<one_session_records>()
call. Callers can do it on their own thus facilitating further
list-initialization patching

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-12 16:11:18 +03:00
Pavel Emelyanov
dd87adadf3 tracing: List-initialize one_session_records::ttl
For that to happen the value evaluation is moved from the
init_session_records() into a private trace_state helper as it checks
the props values initialized earlier

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-12 16:09:05 +03:00
Pavel Emelyanov
b63084237c tracing: List-initialize one_session_records
This touches session_id, parent_id and my_span_id fields

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-12 16:07:24 +03:00
Pavel Emelyanov
944b98f261 tracing: List-initialize session_record
This object is constructed via one_session_records thus the latter needs
to pass some arguments along

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-12 16:04:01 +03:00
Pavel Emelyanov
5c8a61ace2 tracing: Dismantle trace-backend registry
It's not used any longer

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:57:24 +03:00
Pavel Emelyanov
fe7d38661c tracing: Use class-registrator for backends
Currently the code uses its own class registration engine, but there's a
generic one in utils/ that applies here too. In fact, the tracing
backend registry is just a transparent wrapper over the generic one :\

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:56:24 +03:00
Pavel Emelyanov
79820c2006 tracing: Outline may_create_new_session
It's a private method used purely in tracing.cc, no need in compiling it
every time the header is met somewhere else.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:55:14 +03:00
Avi Kivity
5937b1fa23 treewide: remove empty comments in top-of-files
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.

Closes #10562
2022-05-13 07:11:58 +02:00
Juliusz Stasiewicz
00a6fda7b9 tracing: Trace slow queries on replicas wrt. parent's clock
Secondary tracing sessions used to compute the execution time
from the point of their `begin()`-ning, not the parent session's
`begin()`. As a result, replica reported a slow query if it
exceeded the entire threshold *on that replica* too.

This change augments `trace_info` with the TS of parent's session
starting point, to be used as a reference on replicas.

Fixes #9403

Closes #10005
2022-02-10 12:03:53 +01:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Benny Halevy
d96a67eb57 abstract_replication_strategy: use shared_ptr in registry
Enable creating shared_ptr<BaseClass> in nonstatic_class_registry
using BaseClass::ptr_type and use that for
abstract_replication_strategy.

While at it, also clean up compressor with that respect
to define compressor::ptr_type as shared_ptr<compressor>
thus simplifying compressor_registry.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 12:39:36 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Emelyanov
887a1b0d3d tracing: Stop tracing in main's deferred action
Tracing is created in two steps and is destroyed in two too.
The 2nd step doesn't have the corresponding stop part, so here
it is -- defer tracing stop after it was started.

But need to keep in mind, that tracing is also shut down on
drain, so the stopping should handle this.

Fixes #8382
tests: unit(dev), manual(start-stop, aborted-start)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210331092221.1602-1-xemul@scylladb.com>
2021-03-31 12:28:37 +03:00
Ivan Prisyazhnyy
85fbca2049 tracing: omit tracing session events and subsessions in fast mode
If tracing::tracing::_ignore_trace_events is enabled then
the tracing system must ignore all sessions events
for non full_tracing sessions (probability tracing and
user requested) and creating subsessions with the
make_trace_info.

Patch introduces the slow query tracing fast mode that
omits all events during tracing.

Signed-off-by: Ivan Prisyazhnyy <ivan@scylladb.com>
2021-03-18 15:04:47 +02:00
Pavel Emelyanov
87f1223965 tracing: Push query processor through init methods
The goal is to make tracing keyspace helper reference query processor, so this
patch adds the needed arguments through the initialization stack.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-06 15:45:12 +03:00
Benny Halevy
7aef39e400 tracing: one_session_records: keep local tracing ptr
Similar to trace_state keep shared_ptr<tracing> _local_tracing_ptr
in one_session_records when constructed so it can be used
during shutdown.

Fixes #5243

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-11-28 15:24:10 +01:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Avi Kivity
6641353854 tracing: remove static class_registry
Static class_registries hinder librarification by requiring linking with
all object files (instead of a library from which objects are linked on
demand) and reduce readability by hiding dependencies and by their
horrible syntax. Hide them behind a non-static, non-template tracing
backend registry.
Message-Id: <20181229121000.7885-1-avi@scylladb.com>
2018-12-31 13:24:54 +00:00
Vlad Zolotarov
896c1822b5 tracing: move all tracing related API functions to a cold path
This patch completes what was started in a4282c2c6e

Make trace_state_ptr to be a wrapper class around lw_shared_ptr<trace_state> that
hints that bool(trace_state_ptr) is likely to return FALSE.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-08-03 12:32:54 -04:00
Vlad Zolotarov
6db90a2e63 tracing: store a query response size
Add a new "response_size" column to system_traces.sessions and store a size of an uncompressed response
for a traced query.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-08-03 12:29:36 -04:00
Vlad Zolotarov
05020921bb tracing: store request size
Add a new column "request_size" to system_traces.sessions and store
the uncompressed request frame data size.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-08-03 12:29:36 -04:00
Vlad Zolotarov
fcff872089 tracing: make the session state modifying methods and tracing::trace(...) noexcept
Make state session creation, stop_forground() and tracing::trace(...) methods
noexcept.
Most of them have already been implemented the way that they won't throw
but this patch makes it official...

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-12-14 15:05:48 -05:00
Avi Kivity
c6fa727af0 tracing: add missing include
The IDE doesn't understand what lw_shared_ptr<> means without it,
though it does compile.
2017-11-21 13:24:07 +02:00
Avi Kivity
a9f19e37b5 tracing: add missing include "log.hh"
It's currently made available via another include, which is going away.
2017-08-27 15:18:41 +03:00
Vlad Zolotarov
b0f660331a tracing: introduce a span ID and parent span ID
This patch makes the tracing framework follow the general idea of Google's
Dapper paper: traces generated in a context of the same query are forming
a single-rooted acyclic tree where in a ScyllaDB case vertexes are spans running
on each involved replica Node and edges are RPCs sent from one Node to another.

   - Each vertex in the tree above has an ID - "span ID".
   - In order to be able to build the tree from the sessions traces we need
     to know the parent "span ID" - the ID of a span that sent an RPC that created
     the current span.
   - Each span of a tracing session is given a 64-bit random span ID.
   - The root span has a span_id::illegal_id value.

This patch adds:
   - The described above parent span ID and a span ID to the one_session_records
     object.
   - The current span ID is passed in the trace_info struct to the remote replica.
   - Add parent_id and span_id columns to system_traces.events table for the parent
     ID and span ID.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-04-25 21:52:23 -04:00
Vlad Zolotarov
aa1f8ccea4 tracing::tracing: allow slow query TTL only in the signed 32-bit integer range
Any TTL is eventually converted into the gc_clock::duration value, which
is based on int32_t type. Limit the node_slow_log TTL user configurable value
to the same values range for consistency.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-04-12 12:24:08 -04:00
Amnon Heiman
e19fa02a17 remove scollectd from headers
As the metrics migration progressed, some include to scollectd.hh left
behind.

Because of the nature of the scollecd implementation those include
brings alot of code with them to the header files and eventually to many
source file.

This patch remove those include and add a missing include to
storage_proxy.cc.

The reason the compiler didn't complain is an indication to the
problematic nature of those include in the first place.

Before this patch, change in metrics.hh would cause 169 files to
compile, after this change 17.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1484667536-2185-1-git-send-email-amnon@scylladb.com>
2017-01-17 17:39:47 +02:00
Vlad Zolotarov
6267bb63f4 tracing::tracing: move collectd metrics registration to metrics registration layer
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-01-10 16:24:54 -05:00
Pekka Enberg
a443dfa95e tracing: Add seastar/core/scollectd.hh include
Fix the following build breakage:

FAILED: build/release/gen/cql3/CqlParser.o
g++ -MMD -MT build/release/gen/cql3/CqlParser.o -MF build/release/gen/cql3/CqlParser.o.d -std=gnu++1y -g  -Wall -Werror -fvisibility=hidden -pthread -I/home/penberg/scylla/seastar -I/home/penberg/scylla/seastar/fmt -I/home/penberg/scylla/seastar/build/release/gen  -march=nehalem -Ifmt -DBOOST_TEST_DYN_LINK -Wno-overloaded-virtual -DFMT_HEADER_ONLY -DHAVE_HWLOC -DHAVE_NUMA -DHAVE_LZ4_COMPRESS_DEFAULT  -O2 -DBOOST_TEST_DYN_LINK  -Wno-maybe-uninitialized -DHAVE_LIBSYSTEMD=1 -I. -I build/release/gen -I seastar -I seastar/build/release/gen -c -o build/release/gen/cql3/CqlParser.o build/release/gen/cql3/CqlParser.cpp
In file included from ./query-request.hh:31:0,
                 from ./locator/token_metadata.hh:51,
                 from ./locator/abstract_replication_strategy.hh:29,
                 from ./database.hh:26,
                 from ./service/storage_proxy.hh:44,
                 from ./db/schema_tables.hh:43,
                 from ./db/system_keyspace.hh:46,
                 from ./cql3/functions/function_name.hh:45,
                 from ./cql3/selection/selectable.hh:48,
                 from ./cql3/selection/writetime_or_ttl.hh:45,
                 from build/release/gen/cql3/CqlParser.hpp:63,
                 from build/release/gen/cql3/CqlParser.cpp:44:
./tracing/tracing.hh:357:5: error: ‘scollectd’ does not name a type
     scollectd::registrations _registrations;
     ^~~~~~~~~

Message-Id: <1482939751-8756-1-git-send-email-penberg@scylladb.com>
2016-12-28 18:40:18 +02:00
Vlad Zolotarov
62cad0f5f5 tracing: don't start tracing until a Tracing service is fully initialized
RPC messaging service is initialized before the Tracing service, so
we should prevent creation of tracing spans before the service is
fully initialized.

We will use an already existing "_down" state and extend it in a way
that !_down equals "started", where "started" is TRUE when the local
service is fully initialized.

We will also split the Tracing service initialization into two parts:
   1) Initialize the sharded object.
   2) Start the tracing service:
      - Create the I/O backend service.
      - Enable tracing.

Fixes issue #1939

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1481836429-28478-1-git-send-email-vladz@scylladb.com>
2016-12-21 12:40:14 +02:00
Vlad Zolotarov
a491ac0f18 tracing: introduce a log_slow_query logic
The main idea is to log queries that take "too long" to complete.
The "too long" is above the given threshold.

To achieve the above this patch does the following:
   - Introduce two new properties to the tracing::trace_state:
      - "Full tracing": when the tracing of this query was explicitly requested.
        In this state we will record all possible traces related to this query:
        both on the coordinator and on any replica involved.
      - "Log slow query": when slow query logging is enabled.
        If slow query logging is enabled and a session's "duration" is above
        the specified threshold we will create a record in the "slow queries log"
        and write all trace records created on the coordinator and on a replica
        if a replica's session lasts longer than that threshold.
        (We will propagate the Coordinator's slow query logging threshold to replicas
        in the context of a specific tracing/logging session).

     The properties above are independent, namely they may be enabled and/or disabled
     independently and any combination of them is legal (naturally, creating a tracing
     session when both states above are disabled makes no sense).
   - Instrument the tracing::tracing service to allow the following:
    - Enable/disable slow query logging.
    - Set/get the slow query duration threshold (in microseconds).
    - Set/get the slow query log record TTL value (in seconds).
   - Instrument the trace_keyspace_helper to write a slow query log entry
     when requested.
   - The slow query logging is disabled by default and the threshold is set to half a second.
   - The TTL of a slow log record is set to 86400 seconds by default.
   - It makes sense to use the same "slow query logging threshold" and a "slow query record TTL"
     both on a coordinator and on a replica Nodes in a context of the same tracing session:
     - Pass both TTL and a threshold to the replica in a trace_info.

This patch also implements the new slow query logging specific logic:
   - Don't write the pending tracing records before the end of a tracing session
     until "duration" reaches the logging threshold.
   - Don't build the parameters<sstring, sstring> map unless we know we will write it
     to I/O.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-28 18:28:44 +03:00
Vlad Zolotarov
8609900621 tracing: introduce trace_state capabilities bit field
- Instead of keeping separate booleans introduce a trace_state_props_set enum_set and
     pass it around instead of separate booleans.
   - Change the trace_info to hold this value in addition to write_on_close. Initialize
     a corresponding bit in an enum_set based on a write_on_close value in a trace_info
     constructor for a backward compatibility.
   - Separate a trace_state constructor into two:
      - For a primary session object.
      - For a secondary session object.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-23 18:34:36 +03:00
Vlad Zolotarov
ed21398ce9 trace_keyspace_helper: create a system_traces.node_slow_log table
This table is going to be used to store information about queries
which are slower than a specified threshold.

Also added a column caching and mutation creation functions

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-23 17:58:42 +03:00
Vlad Zolotarov
6c3e1935b0 tracing::session_record: change a type of a "ttl" field to be std::chrono::seconds
TTL is always defined in seconds - make its type explicitly reflect that.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-23 17:58:42 +03:00
Vlad Zolotarov
372da7e71b tracing: add support for setting a username and a table name parameters
- "username" is a name used in the authentication process.
  - "table name" is a <keyspace>.<cf name> string representing a name
    of a table used for a query in question.

Note that there may be more than one table name in a batch query. Therefore
we store an unordered set of tables names.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-23 17:58:42 +03:00
Vlad Zolotarov
eaf5db66a8 tracing::session_record: store "parameters" data in an std::map instead of in an unordered_map
Avoid sorting (and creating a new one) container at a backend code when a sorted
container is needed.

The overhead for the backends where it's not needed is minimal since the size of the
map is very small.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-23 17:58:10 +03:00
Vlad Zolotarov
37da6f53f8 tracing: fix a session "duration" semantics
A session's "duration" should be a time it took to
handle a request, which is a time till response to a user.
In other words - till a consistency level is reached.

Before this patch is was a time that takes a complete
handling of a request, which is the time it takes to handle
all replicas and not only those required to reach a CL.

This patch fixes this situation by extending the trace_state's state
values to 3 states: inactive, foreground and background.

A primary session may be in 3 states:
  - "inactive": between the creation and a begin() call.
  - "foreground": after a begin() call and before a
    stop_foreground_and_write() call.
  - "background": after a stop_foreground_and_write() call and till the
    state object is destroyed.

- Traces are not allowed while state is in an "inactive" state.
- The time the primary session was in a "foreground" state is the time
  reported as a session's "duration".
- Traces that have arrived during the "background" state will be recorded
  as usual but their "elapsed" time will be greater or equal to the
  session's "duration".

Secondary sessions may only be in an "inactive" or in a "foreground"
states.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-16 12:32:34 +03:00
Vlad Zolotarov
f83e33fc13 tracing: make "elapsed" be std::chrono::duration
- Define an tracing::elapsed_clock type (std::chrono::steady_clock).
     Use it instead of trace_state::clock_type.
   - Store the "elapsed" information in a form of elapsed_clock::duration.
   - Make all keyspace_backend specific conversions inside the trace_keyspace_helper
     class, where they belong.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-16 12:32:34 +03:00
Vlad Zolotarov
ebf13da9c9 tracing::session_record: make start_at to be a time_point
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-16 12:32:34 +03:00
Vlad Zolotarov
5deec0e327 tracing::write_complete(): improve a message in case of a logic error
Improve a message if there is a logic error and add logging of such
errors.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-09 19:00:43 +03:00
Vlad Zolotarov
67d537ecb5 tracing: issue a write event if a single session creates a lot of events
Currently write events are issued every time a trace session is closed.
However if a single session creates a lot of events we will start dropping them
after the total amount of pending records bypasses the limit.

This patch will issue a write event before the session end in that case.

Since now new events may be added to the active tracing session while it's
scheduled for write we have to ensure the following:
   - Not to add the already pending for write session to the pending bulk.
   - Grab all pending data in a specific session in a synchronous way during
     the write event.
   - Serialize creation of events mutations - otherwise the "monotonic nanos"
     logic won't work.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-09 19:00:43 +03:00