Commit Graph

50 Commits

Author SHA1 Message Date
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Kefu Chai
bd71e0b794 tracing: add formatter for tracing::span_id
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for `tracing::span_id`, and drop
its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17058
2024-01-31 13:43:46 +02:00
Pavel Emelyanov
4c74425780 tracing: Remove stop_tracing() wrapper
Now it's confusing, as it doesn't stop tracing, but rather shuts it down
on all shards. The only caller of it can be more descriptive without the
wrapper

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-03 10:46:47 +03:00
Pavel Emelyanov
61381feaad tracing: Remove start_tracing() wrapper
Callers can make one-like stop with the help of invoke_on_all() overload
that wraps std::invoke

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-03 10:46:47 +03:00
Pavel Emelyanov
89c43f6677 tracing: Remove create_tracing() wrapper
It doesn't make callers' life easier, but hides global tracing instance

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-03 10:46:47 +03:00
Pavel Emelyanov
ce5062eb13 tracing: Make shutdown() re-entrable
Today's shutdown() and its stop() peer are very restrictive in a way
callers should use them. There's no much point in it, making shutdown()
re-entrable, as for other services, will let relaxing callers code here
and in next patches

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-03 10:46:47 +03:00
Pavel Emelyanov
232de8b180 tracing: Coroutinize start/shutdown/stop
They are all simple enough to be in one patch.
Further patching is simpler in coroutinized form.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-03 10:46:46 +03:00
Pavel Emelyanov
8234235b94 tracing: Rename helper's stop() to shutdown()
Because it's called on shutdown, not on stop

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-03 10:02:12 +03:00
Kamil Braun
59d4bb3787 tracing: remove qp.get_migration_manager() calls
Pass `migration_manager&` from top-level instead.
2023-06-15 09:48:54 +02:00
Pavel Emelyanov
dd87adadf3 tracing: List-initialize one_session_records::ttl
For that to happen the value evaluation is moved from the
init_session_records() into a private trace_state helper as it checks
the props values initialized earlier

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-12 16:09:05 +03:00
Pavel Emelyanov
b63084237c tracing: List-initialize one_session_records
This touches session_id, parent_id and my_span_id fields

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-12 16:07:24 +03:00
Pavel Emelyanov
944b98f261 tracing: List-initialize session_record
This object is constructed via one_session_records thus the latter needs
to pass some arguments along

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-12 16:04:01 +03:00
Pavel Emelyanov
5c8a61ace2 tracing: Dismantle trace-backend registry
It's not used any longer

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:57:24 +03:00
Pavel Emelyanov
fe7d38661c tracing: Use class-registrator for backends
Currently the code uses its own class registration engine, but there's a
generic one in utils/ that applies here too. In fact, the tracing
backend registry is just a transparent wrapper over the generic one :\

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:56:24 +03:00
Pavel Emelyanov
79820c2006 tracing: Outline may_create_new_session
It's a private method used purely in tracing.cc, no need in compiling it
every time the header is met somewhere else.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:55:14 +03:00
Avi Kivity
528ab5a502 treewide: change metric calls from make_derive to make_counter
make_derive was recently deprecated in favor of make_counter, so
make the change throughput the codebase.

Closes #10564
2022-05-14 12:53:55 +02:00
Avi Kivity
5937b1fa23 treewide: remove empty comments in top-of-files
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.

Closes #10562
2022-05-13 07:11:58 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Emelyanov
887a1b0d3d tracing: Stop tracing in main's deferred action
Tracing is created in two steps and is destroyed in two too.
The 2nd step doesn't have the corresponding stop part, so here
it is -- defer tracing stop after it was started.

But need to keep in mind, that tracing is also shut down on
drain, so the stopping should handle this.

Fixes #8382
tests: unit(dev), manual(start-stop, aborted-start)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210331092221.1602-1-xemul@scylladb.com>
2021-03-31 12:28:37 +03:00
Ivan Prisyazhnyy
85fbca2049 tracing: omit tracing session events and subsessions in fast mode
If tracing::tracing::_ignore_trace_events is enabled then
the tracing system must ignore all sessions events
for non full_tracing sessions (probability tracing and
user requested) and creating subsessions with the
make_trace_info.

Patch introduces the slow query tracing fast mode that
omits all events during tracing.

Signed-off-by: Ivan Prisyazhnyy <ivan@scylladb.com>
2021-03-18 15:04:47 +02:00
Pavel Emelyanov
87f1223965 tracing: Push query processor through init methods
The goal is to make tracing keyspace helper reference query processor, so this
patch adds the needed arguments through the initialization stack.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-06 15:45:12 +03:00
Rafael Ávila de Espíndola
c5795e8199 everywhere: Replace engine().cpu_id() with this_shard_id()
This is a bit simpler and might allow removing a few includes of
reactor.hh.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200326194656.74041-1-espindola@scylladb.com>
2020-03-27 11:40:03 +03:00
Benny Halevy
7aef39e400 tracing: one_session_records: keep local tracing ptr
Similar to trace_state keep shared_ptr<tracing> _local_tracing_ptr
in one_session_records when constructed so it can be used
during shutdown.

Fixes #5243

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-11-28 15:24:10 +01:00
Avi Kivity
6641353854 tracing: remove static class_registry
Static class_registries hinder librarification by requiring linking with
all object files (instead of a library from which objects are linked on
demand) and reduce readability by hiding dependencies and by their
horrible syntax. Hide them behind a non-static, non-template tracing
backend registry.
Message-Id: <20181229121000.7885-1-avi@scylladb.com>
2018-12-31 13:24:54 +00:00
Vlad Zolotarov
fcff872089 tracing: make the session state modifying methods and tracing::trace(...) noexcept
Make state session creation, stop_forground() and tracing::trace(...) methods
noexcept.
Most of them have already been implemented the way that they won't throw
but this patch makes it official...

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-12-14 15:05:48 -05:00
Vlad Zolotarov
81bcc36b16 tracing: cleanup: use nullptr instead of trace_state_ptr()
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-04-25 21:52:28 -04:00
Vlad Zolotarov
b0f660331a tracing: introduce a span ID and parent span ID
This patch makes the tracing framework follow the general idea of Google's
Dapper paper: traces generated in a context of the same query are forming
a single-rooted acyclic tree where in a ScyllaDB case vertexes are spans running
on each involved replica Node and edges are RPCs sent from one Node to another.

   - Each vertex in the tree above has an ID - "span ID".
   - In order to be able to build the tree from the sessions traces we need
     to know the parent "span ID" - the ID of a span that sent an RPC that created
     the current span.
   - Each span of a tracing session is given a 64-bit random span ID.
   - The root span has a span_id::illegal_id value.

This patch adds:
   - The described above parent span ID and a span ID to the one_session_records
     object.
   - The current span ID is passed in the trace_info struct to the remote replica.
   - Add parent_id and span_id columns to system_traces.events table for the parent
     ID and span ID.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-04-25 21:52:23 -04:00
Vlad Zolotarov
6267bb63f4 tracing::tracing: move collectd metrics registration to metrics registration layer
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-01-10 16:24:54 -05:00
Vlad Zolotarov
9606db2f08 api::set_tracing_probability: prevent a server from returning 500 for a bad probability value
- Change an exception type thrown by a tracing::tracing::set_trace_probability()
     to make it different from the one thrown by an std::stod() when it fails to
     parse a given string.
   - Catch the std::out_of_range exception thrown by a tracing::tracing::set_trace_probability() and
     wrap the exception string into the httpd::bad_param_exception() object.
   - Throw a httpd::bad_param_exception() with a
     "Bad format in a probability value: <a user given probability string value>"
     message if std::invalid_argument is caught.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1465300738-1557-1-git-send-email-vladz@cloudius-systems.com>
2016-12-27 12:07:09 +02:00
Vlad Zolotarov
62cad0f5f5 tracing: don't start tracing until a Tracing service is fully initialized
RPC messaging service is initialized before the Tracing service, so
we should prevent creation of tracing spans before the service is
fully initialized.

We will use an already existing "_down" state and extend it in a way
that !_down equals "started", where "started" is TRUE when the local
service is fully initialized.

We will also split the Tracing service initialization into two parts:
   1) Initialize the sharded object.
   2) Start the tracing service:
      - Create the I/O backend service.
      - Enable tracing.

Fixes issue #1939

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1481836429-28478-1-git-send-email-vladz@scylladb.com>
2016-12-21 12:40:14 +02:00
Vlad Zolotarov
a491ac0f18 tracing: introduce a log_slow_query logic
The main idea is to log queries that take "too long" to complete.
The "too long" is above the given threshold.

To achieve the above this patch does the following:
   - Introduce two new properties to the tracing::trace_state:
      - "Full tracing": when the tracing of this query was explicitly requested.
        In this state we will record all possible traces related to this query:
        both on the coordinator and on any replica involved.
      - "Log slow query": when slow query logging is enabled.
        If slow query logging is enabled and a session's "duration" is above
        the specified threshold we will create a record in the "slow queries log"
        and write all trace records created on the coordinator and on a replica
        if a replica's session lasts longer than that threshold.
        (We will propagate the Coordinator's slow query logging threshold to replicas
        in the context of a specific tracing/logging session).

     The properties above are independent, namely they may be enabled and/or disabled
     independently and any combination of them is legal (naturally, creating a tracing
     session when both states above are disabled makes no sense).
   - Instrument the tracing::tracing service to allow the following:
    - Enable/disable slow query logging.
    - Set/get the slow query duration threshold (in microseconds).
    - Set/get the slow query log record TTL value (in seconds).
   - Instrument the trace_keyspace_helper to write a slow query log entry
     when requested.
   - The slow query logging is disabled by default and the threshold is set to half a second.
   - The TTL of a slow log record is set to 86400 seconds by default.
   - It makes sense to use the same "slow query logging threshold" and a "slow query record TTL"
     both on a coordinator and on a replica Nodes in a context of the same tracing session:
     - Pass both TTL and a threshold to the replica in a trace_info.

This patch also implements the new slow query logging specific logic:
   - Don't write the pending tracing records before the end of a tracing session
     until "duration" reaches the logging threshold.
   - Don't build the parameters<sstring, sstring> map unless we know we will write it
     to I/O.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-28 18:28:44 +03:00
Vlad Zolotarov
8609900621 tracing: introduce trace_state capabilities bit field
- Instead of keeping separate booleans introduce a trace_state_props_set enum_set and
     pass it around instead of separate booleans.
   - Change the trace_info to hold this value in addition to write_on_close. Initialize
     a corresponding bit in an enum_set based on a write_on_close value in a trace_info
     constructor for a backward compatibility.
   - Separate a trace_state constructor into two:
      - For a primary session object.
      - For a secondary session object.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-23 18:34:36 +03:00
Vlad Zolotarov
c8cf2ef82c tracing::trace_state: introduce is_in_state() and set_state() accessors
Use these new methods to manipulate trace_state::_state value.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-23 17:58:42 +03:00
Vlad Zolotarov
5391bcc5a9 tracing: improve a back pressure policy
Use a per-shard tracing records budget instead
of maintaining a fixed-size per-session records budget and
a per-shard sessions budget.

The original policy could lead to some irrational situations,
when we have a single tracing session that creates a substantial
amount of records that we can handle but we would start dropping
new records after it surpasses the per-session limit.

The new policy handles a per-shard trace records budget that is
being consumed by each trace() call and by a primary session destructor
when a session record is created.

Each active record may only be in one of the following states:
   - cached: stored in its session's object. When record is in this state
             it's not going to be written to I/O during the next write event.
   - pending for write: when record is in this state it's going to be written
             to I/O during the next write event.
   - flushing: the record is being currently written to the I/O.

There are counters of the total amount of records in each state above.

Each record may only be in a specific state at every point of time and
thereby it must be accounted only in one and only one of the three
counters.

The sum of all three counters should not be greater than
(max_pending_trace_records + write_event_records_threshold) at any time
(actually it can get as high as a value above plus (max_pending_sessions)
if all sessions are primary but we won't take this into an account for
simplicity).

The same is about the number of outstanding sessions: it may not be greater
than (max_pending_sessions + write_event_sessions_threshold) at any time.

If total number of tracing records is greater or equal to the limit
above, the new trace point is going to be dropped.

If current number or records plus the expected number of trace records
per session (exp_trace_events_per_session) is greater than the limit
above new sessions will be dropped. A new session will also be dropped if
there are too many active sessions.

When the record or a session is dropped the appropriate statistics
counters are updated and there is a rate-limited warning message printed
to the log.

Every time a number of records pending for write is greater or equal to
(write_event_records_threshold) or a number of sessions pending for
write is greater or equal to (write_event_sessions_threshold) a write
event is issued.

Every 2 seconds a timer would write all pending for write records
available so far.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-09 19:00:43 +03:00
Vlad Zolotarov
d8fe5317d1 tracing::trace_keyspace_helper: make events' mutations applying loop interruptible
When building events' mutation don't apply them in a tight loop
but rather apply each of them in a separate continuation to allow
reactor to interrupt this loop if it takes too long for it to
complete (e.g. where there are a lot of mutations to apply).

Since building all events' mutations is asynchronous now we can
no longer keep the "nanos" state in a global trace_keyspace_helper
object but rather have to move it into the per-session
backend_session_state class.

backend_session_state class is a backend-specific implementation of a
tracing::backend_session_state_base class.

An instance of the above object is created by a
tracing::i_tracing_backend_helper::allocate_session_state() virtual
method and is stored in a tracing::one_session_records object.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-09 19:00:39 +03:00
Vlad Zolotarov
63a0502ed1 tracing: rework the interface between the tracing/trace_state and the backend
Before this patch the interaction between the layers above was as follows:
   - trace_state was passing the trace event data to a backend object every
     time trace() method was called.
   - trace_state was passing the session data to a backend object in a destructor.
   - A backend object was storing this data in a form of lambda where all data
     above was caught in a capture list. This was primarily done in order to
     delay the call for make_xxx_mutation(). Lambdas were stored in a map by a session
     ID and they were executed when a kick() method was called.
   - A tracing::tracing object was periodically calling a kick() method of a
     backend that was initiating a write of all pending data to the storage.

All backend methods used in the described above interactions were virtual.
Thereby, for instance, for each and every trace record we were calling a virtual method that was
receiving a significant amount of parameters, store a lambda in a map and return.
This is clearly a suboptimal way of using virtual functions since we prevent a compiler
from inlining an obviously inlinable operations.

This patch changes the interaction scheme to be as follows:
   - Trace events and session data are stored and passed around in a form of structs
     that hold all relevant information (no more lambdas).
   - As long as a trace session is active its data is aggregated inside the corresponding
     trace_state object.
   - The object containing all records is passed and stored as a lw_shared_ptr to save extra
     copies and to shorten capture lists.
   - All aggregated data is passed to a tracing::tracing object in a trace_state destructor.
     The data is stored in a std::deque in a tracing::tracing object (instead of a map by a session ID).
   - A single backend's virtual method call writes all data aggregated so far (kick()
     method is not needed any more), every time a write event occurs.
   - Backend has only one virtual method now:
      - Write a bulk of sessions' data aggregated so far.
   - Backend's virtual method receives a records bulk object by reference.

As a result:
   - A latency of a single trace event that has no formatting improved from 0.2us to 0.1us.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-09 15:25:52 +03:00
Vlad Zolotarov
960b423ce0 tracing/tracing.cc: rename a logger object
s/logger/tracing_logger/

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-09 15:21:47 +03:00
Vlad Zolotarov
e1480cd00d tracing: add a word "shard" to a "thread" value
Add a word "shard" to a "thread" column value.
From now its format is "shard X", where X is a shard index.

Fixes #1480

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1469099732-23334-1-git-send-email-vladz@cloudius-systems.com>
2016-07-21 15:08:09 +03:00
Duarte Nunes
8d87976fa1 tracing: Downgrade debug level log messages
Two periodic debug log level messages triggered by
tracing::write_timer_callback() quickly fill the logs, so this patch
downgrades them to trace level.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Reviewed-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1469003034-1147-1-git-send-email-duarte@scylladb.com>
2016-07-20 11:34:02 +03:00
Vlad Zolotarov
9c0a725c56 tracing: add a _local_tracing to a i_tracing_backend_helper
A backend helper has to constantly communicate with the corresponding
tracing::tracing instance. By saving a reference to the tracing::tracing instance
will save us a lot of tracing::get_local_tracing_instance() calls and thus
a lot of dereferencing.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:58 +03:00
Vlad Zolotarov
da4836becc tracing::trace_state: add support for a formatted message in trace()
Add an support for passing a format string plus positional parameters
for creation of a trace point message.

Format string should be given in a fmt library native format described
here: http://fmtlib.net/latest/syntax.html#syntax .

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:57 +03:00
Vlad Zolotarov
ee0e986e96 tracing: make a service shutdown stages more strict
kick() backend during shutdown and restrict accessing a backend
after that.

Flush pending records when service is being shut down.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:57 +03:00
Vlad Zolotarov
a5022a09a4 tracing: use 'write' instead of 'flush' and 'store' for consistency with seastar's API
In names of functions and variables:
s/flush_/write_/
s/store_/write_/

In a i_tracing_backend_helper:
s/flush()/kick()/

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:57 +03:00
Vlad Zolotarov
d3960f0bbb tracing: rearrange shut down
tracing::tracing local instance is dereferenced from a
cql_server::connection::process_request(), therefore tracing::tracing
service may be stop()ed only after a CQL server service is down.
On the other hand it may not be stopped before RPC service is down
because a remote side may request a tracing for a specific command too.

This patch splits the tracing::tracing stop() into two phases:
   1) Flush all pending tracing records and stop the backend.
   2) Stop the service.

The first phase is called after CQL server is down and before RPC is down.
The second phase is called after RPC is down.

Fixes #1339

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1465840496-19990-1-git-send-email-vladz@cloudius-systems.com>
2016-06-14 07:58:04 +03:00
Vlad Zolotarov
ce08bc611c tracing: fix debug compilation
Define flush_period as a const and not as constexpr.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1465240516-20128-1-git-send-email-vladz@cloudius-systems.com>
2016-06-06 22:15:27 -04:00
Vlad Zolotarov
905190ac06 tracing: add support for probabilistic tracing
Add a support for defining a probability (a value in a [0,1] range)
for tracing the next CQL request.

Traces for requests that are chosen to be traced due to this feature
are not going to flushed immediately.

Use std::subtract_with_carry_engine (implements the "lagged Fibonacci" algorithm)
random number engine for fastest generation of random integer values.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-06-06 15:41:01 +03:00
Vlad Zolotarov
779ff88c76 tracing: add flush timer
Flush pending sessions to the storage every 2 seconds.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-06-06 14:34:08 +03:00
Vlad Zolotarov
4b008ac5ea tracing: rework maximum sessions amount back pressure strategy
A tracing session life cycle includes 3 stages:
   1) Active: when new trace records are being added to this session.
   2) Pending for flushing to a storage: when session is over but not
      yet flushed to the storage ("backend").
   3) Flushing: when session's records are being flushed to the storage
      and this process is not yet completed.

Sessions may accumulate in each of the stages above and we should limit
the maximum amount of sessions being accumulated in each of them in order to avoid OOM
situation.

Current in-tree implementation only limits the number of tracing sessions
accumulated in the first ("Active") stage.

Since currently every closing session is being immediately flushed (as long
as "settraceprobability" is not implemented) the second stage never accumulates
tracing sessions.

The third stage is currently not controlled at all and if, for instance, we
succeed to push enough tracing session towards a slow storage backend, they may
accumulate there consuming an uncontrolled amount of memory and may eventually consume
all of it.

This patch fixes this unpleasant situation by implying the following strategy:

   - Limit the total amount of accumulated tracing sessions in all stages above together
     by a static value - 2 times "flush threshold". "2 times" is needed to allow new
     tracing sessions to accumulate in the stage 2 while sessions in the stage 3 are still
     being  processed.
   - Forcefully flush sessions in the stage 2 to the storage when their count reaches a "flush
     threshold".

This would ensure that there will not more than totally (2 * "flush threshold") sessions (in any stage)
on each shard.

An advantage of this strategy is its simplicity - we only need a single threshold to control all stages.
If we feel that we needed a finer graining for each stage we may add separate limits for each of them
in the future.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-06-06 13:50:41 +03:00
Vlad Zolotarov
c965528a03 tracing: add a trace_state and tracing classes
trace_state: Is a single tracing session.
tracing:     A sharded service that contains an i_trace_backend_helper instance
             and is a "factory" of trace_state objects.

trace_state main interface functions are:
   - begin(): Start time counting (should be used via tracing::begin() wrapper).
   - trace(): Create a tracing event - it's coupled with a time passed since begin()
              (should be used via tracing::trace() wrapper).
   - ~trace_state(): Destructor will close the tracing session.

"tracing" service main interface function is:
   - start(): Initialize a backend.
   - stop():  Shut down a backend.
   - create_session(): Creates a new tracing session.

(tracing::end_session(): Is called by a trace_state destructor).

When trace_state needs to store a tracing event it uses a backend helper from
a "tracing" service.

A "tracing" service limits a number of opened tracing session by a static number.
If this number is reached - next sessions will be dropped.

trace_state implements a similar strategy in regard to tracing events per singe
session.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-06-01 20:13:42 +03:00