Commit Graph

52 Commits

Author SHA1 Message Date
Kefu Chai
a6152cb87b sstables: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16666
2024-01-09 11:45:44 +02:00
Avi Kivity
f125a3e315 Merge 'tree: finish the reader_permit state renames' from Botond Dénes
In https://github.com/scylladb/scylladb/pull/13482 we renamed the reader permit states to more descriptive names. That PR, however, covered only the states themselves and their usages, as well as the documentation in `docs/dev`.
This PR is a followup to said PR, completing the name changes: renaming all symbols, names, comments etc, so all is consistent and up-to-date.

Closes #13573

* github.com:scylladb/scylladb:
  reader_concurrency_semaphore: misc updates w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update permit members w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update RAII state guard classes w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update API w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update stats w.r.t. recent permit state name changes
2023-05-04 18:29:04 +03:00
Kefu Chai
f5b05cf981 treewide: use defaulted operator!=() and operator==()
in C++20, the compiler generates operator!=() if the corresponding
operator==() is already defined; the language now understands
that the comparison is symmetric in the new standard.

fortunately, our operator!=() is always equivalent to
`! operator==()`, this matches the behavior of the default
generated operator!=(). so, in this change, all `operator!=`
are removed.

in addition to the defaulted operator!=, C++20 also brings to us
the defaulted operator==() -- the compiler is able to generate
operator==() as a member-wise lexicographical comparison.
under some circumstances, this is exactly what we need. so,
in this change, if the operator==() is also implemented as
a lexicographical comparison of all member variables of the
class/struct in question, it is implemented using the default
generated one by removing its body and marking the function as
`default`. moreover, if the class happens to have other comparison
operators which are implemented using lexicographical comparison,
the default generated `operator<=>` is used in place of
the defaulted `operator==`.

sometimes, we fail to mark the operator== with the `const`
specifier, in this change, to fulfil the need of C++ standard,
and to be more correct, the `const` specifier is added.

also, to generate the defaulted operator==, the operand should
be `const class_name&`, but it is not always the case, in the
class of `version`, we use `version` as the parameter type, to
fulfill the need of the C++ standard, the parameter type is
changed to `const version&` instead. this does not change
the semantic of the comparison operator. and is a more idiomatic
way to pass non-trivial struct as function parameters.

please note, because in C++20, both operator== and operator<=> are
symmetric, some of the operators in `multiprecision` are removed.
they are the symmetric form of another variant. if they were
not removed, the compiler would, for instance, find an ambiguous
overloaded operator '=='.

this change is a cleanup to modernize the code base with C++20
features.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13687
2023-04-27 10:24:46 +03:00
Botond Dénes
804403f618 reader_concurrency_semaphore: update RAII state guard classes w.r.t. recent permit state name changes
They are still using the old terminology for permit state names, bring
them up to date with the recent state name changes.
2023-04-19 05:20:42 -04:00
Pavel Emelyanov
b13ff5248c sstables: Mark continuous_data_consumer::reader_position() const
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13285
2023-03-23 13:27:33 +02:00
Botond Dénes
8b0afc28d4 reader_permit: add make_new_tracked_temporary_buffer()
A separate method for callers of make_tracked_temporary_buffer() who
are creating new empty tracked buffers of a certain size.
make_tracked_temporary_buffer() is about to be changed to be more
targeted at callers who call it with pre-consumed memory units.
2023-01-16 02:05:27 -05:00
Michał Chojnowski
ddc535a4a2 sstables: consumer: reuse the fragmented_temporary_buffer in read_bytes()
read_bytes destroys and creates a vector for every value it reads.
This happens for every cell.
We can save a bit of work by reusing the vector.
2022-05-07 13:04:16 +02:00
Pavel Emelyanov
3f884fbdd7 sstables: Remove excessive type-match assertions
The primitive_consumer method templates overcomplicate the
declaration of the fact that one of the method arguments is
the sub-type of a template argument

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-24 19:49:20 +03:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Wojciech Mitros
7107e32390 continuous_data_consumer: properly skip bytes at the end of a range
When skipping bytes at the end of a continuous_data_consumer range,
the position of the consumer is moved after the skipped bytes, but
the position of the underlying input_stream is not.

This patch adds skipping of the underlying input_stream, to make
its position consistent with the position of the consumer.

Fixes #9024

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2021-07-19 11:43:30 +02:00
Botond Dénes
434f2efde5 sstables: continuous_data_consumer: mark permit as blocked when doing IO 2021-07-14 16:48:43 +03:00
Tomasz Grabiec
23bc19643f sstables: read: Document that primitive_consumer::read_32() is alloc-free
Callers will rely on it to assume that it does not invalidate
references to LSA objects.
2021-07-02 19:02:14 +02:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day 2021-06-06 19:18:49 +03:00
Benny Halevy
6a82e9f4be sstables: index_reader: mark close noexcept
We'd like that to simplify the soon-to-be-introduced
sstable_mutation_reader::close error handling path.

close_index_list can be marked noexcept since parallel_for_each is;
with that, index_reader::close can be marked noexcept too.

Note that since reader close cannot fail,
both lower and upper bounds are closed (since
closing lower_bound cannot fail).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:16:10 +03:00
Wojciech Mitros
201b86b042 primitive_consumer: keep fragments of parsed buffer in a small_vector
When we want to parse a linearized buffer of bytes, we're copying them
into the first and only element of the _read_bytes vector. Thus
_read_bytes often contains only one element, which makes a small_vector
a better alternative.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2021-04-01 16:05:52 +02:00
Wojciech Mitros
b1b5bda848 sstables: add non-contiguous parsing of byte strings to the primitive_consumer
Currently, the primitive_consumer parses all values in contiguous buffers.
A string of bytes may be very long, so parsing it in a single buffer
can cause a big allocation. This patch allows parsing into
fragmented_temporary_buffers instead of temporary_buffers.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2021-03-31 12:09:52 +02:00
Tomasz Grabiec
95df7126a7 sstables: consumer: Extract primitive_consumer
This change extracts the parser for primitive types out of
continuous_data_consumer so that it can be used stand-alone
or embedded in other parsers.
2020-06-16 16:14:30 +02:00
Botond Dénes
936619a8d3 sstables/continuous_data_consumer: track buffers used for parsing
Based on heap profiling, buffers used for storing half-parsed fields are
a major contributor to the overall memory consumption of reads. This
memory was completely "under the radar" before. Track it by using
tracked `temporary_buffer` instances everywhere in
`continuous_data_consumer`. As `continuous_data_consumer` is the basis
for parsing all index and data files, adding the tracking here
automatically covers all data, index and promoted index parsing.

I'm almost convinced that there is a better place to store the `permit`
than the three places now, but so far I was unable to completely
decipher our data/index file parsing class hierarchy.
2020-01-28 08:13:16 +02:00
Paweł Dziepak
349601ac32 sstable: pass full length of buffer to vint deserialiser
vint deserialiser can be more performant if it is allowed to do an
overread (i.e. read more memory than the value it is deserialising).
In the case of sstable reads those vints are usually going to be in the middle
of a much larger buffer, so let's pass the whole length of the buffer and
enable this optimisation.
2019-03-14 13:37:06 +00:00
Paweł Dziepak
57de2c26b3 vint: drop deserialize_type structure
Deserialisation function returns a structure containing both the value
and its length in the input buffer. In the vast majority of the cases
the caller will already know the length and having this structure will
make it harder for the compiler to emit good code, especially if the
function is not inlined.

In practice I've seen the structure causing register pressure problems
that lead to spilling variables to memory.
2019-03-14 13:37:06 +00:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Tomasz Grabiec
b4c3b78082 sstables: continuous_data_consumer: Introduce skip() 2018-12-18 11:11:47 +01:00
Tomasz Grabiec
36dd660507 sstables: continuous_data_consumer: Make position() meaningful inside state_processor::process_state()
Will allow state_processor to know its position in the
stream.

Currently position() is meaningless inside process_state() because in
some cases it points to the position after the buffer and in some
cases before it. This patch standardizes on the former. This is more
useful than the latter because process_state() trims from the front of
the buffer as it consumes, so the position inside the stream can be
obtained by subtracting the remaining buffer size from position(),
without introducing any new variables.
2018-12-18 11:11:47 +01:00
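The arithmetic described in commit 36dd660507 can be written out as a small sketch. The function and parameter names below are hypothetical: `position_after_buffer` stands for what position() reports (the stream offset just past the current buffer) and `remaining_in_buffer` for the bytes of that buffer not yet consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, for illustration only: derives the current offset
// inside the stream from the position past the buffer and the number of
// bytes still left in the (front-trimmed) buffer.
inline uint64_t current_stream_offset(uint64_t position_after_buffer,
                                      size_t remaining_in_buffer) {
    // The remaining bytes are part of the stream prefix that ends at
    // position_after_buffer, so they can never exceed it.
    assert(remaining_in_buffer <= position_after_buffer);
    return position_after_buffer - remaining_in_buffer;
}
```

For example, if the buffer ends at stream offset 4096 and 96 bytes are still unconsumed, the parser is at offset 4000.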
Rafael Ávila de Espíndola
6746907999 Use fully covered switches in continuous_data_consumer
do_process_buffer had two unreachable default cases and a long
if-else-if chain.

This converts the if-else-if chain to a switch and a helper
function.

This moves the error checking from run time to compile time. If we
were to add a 128 bit integer for example, gcc would complain about it
missing from the switch.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20181125221451.106067-1-espindola@scylladb.com>
2018-11-25 22:52:11 +00:00
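The idea behind commit 6746907999 can be sketched as follows. The enum and function are hypothetical and are not taken from continuous_data_consumer; the point is only that a switch over an enum with no `default:` label lets GCC and Clang (with -Wswitch, which is part of -Wall) report an enumerator that has not been handled.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical enumeration of integer widths, for illustration only.
enum class int_width { w8, w16, w32, w64 };

inline size_t width_in_bytes(int_width w) {
    // No `default:` label on purpose: if a new enumerator (say, w128) is
    // added, the compiler warns that this switch does not handle it.
    switch (w) {
    case int_width::w8:  return 1;
    case int_width::w16: return 2;
    case int_width::w32: return 4;
    case int_width::w64: return 8;
    }
    // Only reachable if `w` holds a value that is not a declared enumerator.
    std::abort();
}
```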
Avi Kivity
775b7e41f4 Update seastar submodule
* seastar d59fcef...b924495 (2):
  > build: Fix protobuf generation rules
  > Merge "Restructure files" from Jesse

Includes fixup patch from Jesse:

"
Update Seastar `#include`s to reflect restructure

All Seastar header files are now prefixed with "seastar" and the
configure script reflects the new locations of files.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <5d22d964a7735696fb6bb7606ed88f35dde31413.1542731639.git.jhaberku@scylladb.com>
"
2018-11-21 00:01:44 +02:00
Vladimir Krivopalov
997ebaaa14 sstables: Support reading signed vints in continuous_data_consumer.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-07-20 13:50:17 -07:00
Piotr Jastrzebski
21a0e95a06 Implement read_unsigned_vint_length_bytes
It's a common operation that's used in multiple
places so it's best to have it implemented once.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-06-06 15:44:06 +02:00
Piotr Jastrzebski
06ceea9c3e Add continuous_data_consumer::read_short_length_bytes
This is a common operation so it's better to have it
implemented in a single place.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-04-26 12:49:37 +02:00
Piotr Jastrzebski
e664360730 Reduce duplication with continuous_data_consumer::read_partial_int
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-04-26 12:49:37 +02:00
Piotr Jastrzebski
b68d1fa5bd sstables: add continuous_data_consumer::read_unsigned_vint
This allows reading unsigned variable-length integers (vints) from
SSTable format 3.x.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-04-16 20:30:10 +02:00
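For readers unfamiliar with vints, here is a minimal decoding sketch. It assumes the Cassandra-style layout, in which the number of leading 1 bits in the first byte gives the number of bytes that follow and the remaining bits are the most significant bits of the value. The function names are hypothetical, and this is not the implementation added by commit b68d1fa5bd.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of bytes following the first one: the count of leading 1 bits.
inline unsigned vint_extra_bytes(uint8_t first) {
    unsigned n = 0;
    while (n < 8 && (first & (0x80u >> n))) {
        ++n;
    }
    return n;
}

// Hypothetical decoder, assuming the layout described above.
// The caller must supply the complete encoded value.
inline uint64_t decode_unsigned_vint(const uint8_t* data, size_t size) {
    assert(size >= 1);
    const unsigned extra = vint_extra_bytes(data[0]);
    assert(size >= 1 + extra);
    // Drop the length prefix bits from the first byte.
    uint64_t value = data[0] & (0xffu >> extra);
    // Remaining bytes are appended in big-endian order.
    for (unsigned i = 0; i < extra; ++i) {
        value = (value << 8) | data[1 + i];
    }
    return value;
}
```

Under this layout a value below 128 occupies one byte, and the two-byte sequence 0x80 0x80 decodes to 128.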
Piotr Jastrzebski
20705c4536 sstables: add all dependent headers to consumer.hh
Before, it was depending on byteorder.hh, which just happened
to be included in all compilation units that were using consumer.hh.
This change makes the header compile when used in new compilation units.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-04-16 11:02:49 +02:00
Avi Kivity
87f10bc853 sstables: continuous_data_consumer: make _remain an unsigned type
All of the adjustments to _remain already ensure it is greater than 0,
and indeed a negative _remain doesn't make sense.

Switching to an unsigned type allows us to re-enable -Wsign-compare.

Tests: unit (release)
Message-Id: <20180212121636.10463-1-avi@scylladb.com>
2018-02-12 12:25:21 +00:00
Vladimir Krivopalov
0a7a56edd5 Simplify continuous_data_consumer::consume_input() interface.
Remove redundant input parameter as continuous_data_consumer derivatives
would only use themselves as a context. So take it internally and make
the function regular (non-template) and having no parameters.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-01-29 11:57:26 -08:00
Vladimir Krivopalov
5dca3100ed Support skipping over bytes from input stream in parsers based on continuous_data_consumer
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-01-29 11:56:55 -08:00
Glauber Costa
f0391bf9a0 sstables: enhance data consumer with a position tracker
Callers, like compactions, will be able to know at any time the current
progress of a read.

As we do that, the currently unimplemented position() method of
data_consume_context becomes redundant and is removed.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-01-02 18:43:07 -05:00
Tomasz Grabiec
6baad2c2e6 sstables: Introduce data_consume_context::eof() 2017-08-28 09:19:43 +02:00
Tomasz Grabiec
27d86dfe18 sstables: Enable skipping to cells at data_consume_context level 2017-03-28 18:10:39 +02:00
Tomasz Grabiec
56f1ad7841 sstables: Swap order of values in "proceed" so that "no" is assigned 0 2017-03-10 14:42:22 +01:00
Gleb Natapov
ae0a2935b4 sstables: fix ad-hoc summary creation
If the sstable Summary is not present, Scylla does not refuse to boot but
instead creates summary information on the fly. There is a bug in this
code, though. The Summary file is a map between keys and offsets into the Index
file, but the code creates a map between keys and Data file offsets
instead. Fix it by keeping the offset of an index entry in the index_entry
structure and using it during Summary file creation.

Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20161116165421.GA22296@scylladb.com>
2016-11-17 11:05:23 +02:00
Paweł Dziepak
0bc873ace5 sstables: add fast_forward_to() to continuous_data_consumer
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-10-19 15:29:08 +01:00
Paweł Dziepak
5feed84e32 sstables: do not call consume_end_partition() after proceed::no
After state_processor().process_state() returns proceed::no the upper
layer should have a chance to act before more data is pushed to the
consumer. This means that in case of proceed::no verify_end_state()
should not be called immediately since it may invoke
consume_end_partition().

Fixes #1605.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1471943032-7290-1-git-send-email-pdziepak@scylladb.com>
2016-08-23 12:24:39 +03:00
Avi Kivity
106e3703d9 sstables: stop using unaligned_cast
unaligned_cast violates strict aliasing, and causes code misgeneration on
gcc 6.  Replace it with read_be/write_be, which are nicer anyway.
Message-Id: <1469122850-7511-1-git-send-email-avi@scylladb.com>
2016-07-22 07:03:08 +01:00
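The aliasing-safe approach mentioned in commit 106e3703d9 can be sketched like this. `read_be_u32` is a hypothetical stand-in written for this example; it is not claimed to match the signature or implementation of the read_be helper the commit switched to.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reads a big-endian 32-bit value from a possibly
// unaligned position without casting the pointer to uint32_t*.
inline uint32_t read_be_u32(const char* p) {
    unsigned char b[4];
    // A byte-wise copy makes no alignment or strict-aliasing assumptions.
    std::memcpy(b, p, sizeof(b));
    return (uint32_t(b[0]) << 24) | (uint32_t(b[1]) << 16) |
           (uint32_t(b[2]) << 8)  |  uint32_t(b[3]);
}
```

Because the bytes are combined with shifts, the result is the same on little-endian and big-endian hosts.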
Paweł Dziepak
55a6911d7a sstables: close input_stream<> properly
If read ahead is going to be enabled it is important to close
input_stream<> properly (and wait for completion) before destroying it.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:39:01 +01:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Raphael S. Carvalho
3926748594 sstable: fix consumer parser
The first problem is the while loop around the code that processes prestate.
That's wrong because there may be a need to read more data before continuing
to process a prestate.
The second problem is the code assumption that a prestate will be processed
at once, and then unconditionally process the current state.
Both problems are likely to happen when reading a large buffer because more
than one read may be required.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-10-13 11:45:10 -03:00
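The problem fixed by commit 3926748594 comes down to a value that straddles two input buffers. The following sketch assumes a hypothetical `partial_u32_reader` class, which is not part of the code base. It shows a reader that remembers how many bytes it has already taken, so it can return to the caller for more data instead of assuming the value completes in one call.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical resumable reader for one big-endian 32-bit value.
class partial_u32_reader {
    uint32_t _value = 0;
    unsigned _have = 0;  // bytes consumed so far (the saved "prestate")
public:
    // Consumes only as many bytes as are still missing from
    // [data, data + size) and returns how many were consumed.
    // The caller must check done() before using value().
    size_t feed(const uint8_t* data, size_t size) {
        size_t used = 0;
        while (_have < 4 && used < size) {
            _value = (_value << 8) | data[used++];
            ++_have;
        }
        return used;
    }
    bool done() const { return _have == 4; }
    uint32_t value() const { return _value; }
};
```

A caller feeds the first buffer, sees `done()` return false, reads more input, and feeds the next buffer; only then does it move on to the next state.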
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Avi Kivity
bee1cf1352 consumer.hh: tidy up copyright 2015-09-20 10:35:41 +03:00
Raphael S. Carvalho
6f07379646 row consumer: don't fallthrough if mask cannot be consumed
When the row consumer falls through from ATOM_NAME_BYTES to ATOM_MASK,
we assume that the mask can be consumed, but it may happen that
data.size() equals zero, in which case the mask cannot be consumed.
The solution is to add read_8 so that the code only falls through
if the mask can be consumed right away.

Fixes #197.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-09-04 12:26:03 +03:00
Glauber Costa
2623362d20 continuous_data_consumer: do not pass reference to child
Since the child is a base class, we don't need to pass a reference: we can
just cast our 'this' pointer.

By doing that, the move constructor can come back.

Welcome back, move constructor.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-29 20:32:56 +03:00
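The technique in commit 2623362d20, where the base class reaches the derived class by casting `this` rather than storing a reference, is commonly known as CRTP. Here is a minimal sketch with hypothetical names (`consumer_base`, `until_newline`, `process_byte`); the real continuous_data_consumer interface is different.

```cpp
#include <cstddef>

// Hypothetical base: owns the loop, calls into Derived without virtual
// functions and without storing a reference to it. Since no reference is
// stored, moving an object cannot leave a dangling back-reference.
template <typename Derived>
class consumer_base {
public:
    // Feeds bytes to Derived::process_byte() until it asks to stop.
    // Returns the number of bytes consumed.
    size_t consume(const char* data, size_t size) {
        auto& self = static_cast<Derived&>(*this);
        size_t consumed = 0;
        while (consumed < size) {
            const bool proceed = self.process_byte(data[consumed]);
            ++consumed;
            if (!proceed) {
                break;
            }
        }
        return consumed;
    }
};

// Hypothetical derived consumer: stops after the first newline.
class until_newline : public consumer_base<until_newline> {
    size_t _seen = 0;
public:
    bool process_byte(char c) {
        ++_seen;
        return c != '\n';
    }
    size_t seen() const { return _seen; }
};
```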
Glauber Costa
4b174c754d commonize the NSM
In order to reuse the NSM in other scenarios, we need to push as much code
as possible into a common class.

This patch does that, making the continuous_data_consumer class now the main
placeholder for the NSM class. The actual readers will have to inherit from it.

However, despite using inheritance, I am not using virtual functions at all.
Instead, we let the continuous_data_consumer receive an instance of the derived
class, and then it can safely call its methods without paying the cost of
virtual functions.

In another attempt, I had kept the main process() function in the derived class,
which then had the responsibility of coding the loop.

With the use of the new pattern, we can keep the loop logic in the base class,
which is a lot cleaner. There is a performance penalty associated with it, but
it is fairly small: 0.5 % in the sequential_read perf_sstable test. I think we
can live with it.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-28 18:56:26 -05:00