Commit Graph

20377 Commits

Piotr Sarna
9504bbf5a4 alternator: move unwrap_set to serialization header
The utility function for unwrapping a set is going to be useful
across source files, so it's moved to serialization.hh/serialization.cc.
2019-12-10 15:08:47 +01:00
Piotr Sarna
4660e58088 alternator: move rjson value comparison to rjson.hh
The comparison struct is going to be useful across source files,
so it's moved into rjson header, where it conceptually belongs anyway.
2019-12-10 15:08:47 +01:00
Rafael Ávila de Espíndola
761b19cee5 build: Split the build and host linker flags
A general build system knows about 3 machines:

* build: where the building is running
* host: where the built software will run
* target: the machine the software will produce code for

The target machine is only relevant for compilers, so we can ignore
it.

Until now we could ignore the build and host distinction too. This
patch adds the first difference: don't use host ld_flags when linking
build tools (gen_crc_combine_table).

The reason for this change is to make it possible to build with
-Wl,--dynamic-linker pointing to a path that will exist on the host
machine, but may not exist on the build machine.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191207030408.987508-1-espindola@scylladb.com>
2019-12-09 15:54:57 +02:00
fastio
8f326b28f4 Redis: Combine all the source files redis/commands/* into redis/commands.{hh,cc}
Fixes: #5394

Signed-off-by: Peng Jian <pengjian.uestc@gmail.com>
2019-12-08 13:54:33 +02:00
Avi Kivity
9c63cd8da5 sysctl: reduce kernel tendency to swap anonymous pages relative to page cache (#5417)
The vm.swappiness sysctl controls the kernel's preference for swapping
anonymous memory vs. page cache. Since Scylla uses very large amounts
of anonymous memory, and tiny amounts of page cache, the correct setting
is to prefer swapping page cache. If the kernel swaps anonymous memory,
the reactor will stall until the page fault is satisfied. On the other
hand, page cache pages usually belong to other applications, typically
backup processes that read Scylla files.

This setting has been used in production in Scylla Cloud for a while
with good results.

Users can opt out by not installing the scylla-kernel-conf package
(same as with the other kernel tunables).
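
The tunable itself is a one-line sysctl fragment. A minimal illustration (the exact file path and value shipped by scylla-kernel-conf are assumptions here):

```
# /etc/sysctl.d/99-scylla-vm.conf  -- illustrative path and value
# Prefer reclaiming page cache over swapping out anonymous memory.
vm.swappiness = 1
```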
2019-12-08 13:04:25 +02:00
Avi Kivity
0e319e0359 Update seastar submodule
* seastar 166061da3...e440e831c (8):
  > Fail tests on ubsan errors
  > future: make a couple of asserts more strict
  > future: Move make_ready out of line
  > config: Do not allow zero rates
Fixes #5360
  > future: add new state to avoid temporaries in get_available_state().
  > future: avoid temporary future_state on get_available_state().
  > future: inline future::abandoned
  > noncopyable_function: Avoid uninitialized warning on empty types
2019-12-06 18:33:23 +02:00
Piotr Sarna
0718ff5133 Merge 'min/max on collections returns human-readable result' from Juliusz
Previously, scylla used the min/max(blob)->blob overload for collections,
tuples and UDTs, effectively causing the results to be printed as blobs.
This PR adds "dynamically"-typed min()/max() functions for compound types.

These types can be complicated, like map<int,set<tuple<..., and created
at runtime, so functions for them are created on demand,
similarly to tojson(). The comparison remains unchanged - underneath
this is still byte-by-byte weak lexicographic ordering.

Fixes #5139

* jul-stas/5139-minmax-bad-printing-collections:
  cql_query_tests: Added tests for min/max/count on collections
  cql3: min()/max() for collections/tuples/UDTs do not cast to blobs
2019-12-06 16:40:17 +01:00
Juliusz Stasiewicz
75955beb0b cql_query_tests: Added tests for min/max/count on collections
This tests the new min/max functions for collections and tuples. CFs
in the test suite were named according to the types being tested, e.g.
`cf_map<int,text>', which is not a valid CF name. Therefore, these
names required "escaping" of invalid characters, here by simply
replacing them with '_'.
2019-12-06 12:15:49 +01:00
Juliusz Stasiewicz
9efad36fb8 cql3: min()/max() for collections/tuples/UDTs do not cast to blobs
Before:
cqlsh> insert into ks.list_types (id, val) values (1, [3,4,5]);
cqlsh> select max(val) from ks.list_types;

 system.max(val)
------------------------------------------------------------
 0x00000003000000040000000300000004000000040000000400000005

After:
cqlsh> select max(val) from ks.list_types;

 system.max(val)
--------------------
 [3, 4, 5]

This is accomplished similarly to `tojson()`/`fromjson()`: functions
are generated on demand from within `cql3::functions::get()`.
Because collections can have a variety of types, including UDTs
and tuples, it would be impossible to statically define max(T t)->T
for every T. Until now, the max(blob)->blob overload was used.

Because `impl_max/min_function_for` is templated with the
input/output type, which can be defined at runtime, we need type-erased
("dynamic") versions of these functors. They work identically, i.e.
they compare the byte representations of lhs and rhs with
`bytes::operator<`.

Resolves #5139
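
The comparison logic itself is simple; a minimal Python sketch of the type-erased idea (names and serialization format here are illustrative; the real implementation is C++ comparing with `bytes::operator<`):

```python
# Illustrative sketch: a "dynamic" max aggregate that compares serialized
# byte representations lexicographically, the way bytes::operator< does.

def dynamic_max(serialized_values):
    """Return the value whose byte representation is lexicographically largest."""
    result = None
    for value in serialized_values:
        if result is None or result < value:  # bytes compare lexicographically
            result = value
    return result

# Two serialized lists (the byte format is made up for illustration):
a = bytes.fromhex("0000000300000004")
b = bytes.fromhex("0000000300000005")
print(dynamic_max([a, b]) == b)  # True
```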
2019-12-06 12:14:51 +01:00
Avi Kivity
a18a921308 docs: maintainer.md: use command line to merge multi-commit pull requests
If you merge a pull request that contains multiple patches via
the github interface, it will document itself as the committer.

Work around this brain damage by using the command line.
2019-12-06 10:59:46 +01:00
Botond Dénes
7b37a700e1 configure.py: make tests explicitly depend on libseastar_testing.a
So that changes to libseastar_testing.a make all test targets out of
date.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20191205142436.560823-1-bdenes@scylladb.com>
2019-12-05 19:30:34 +02:00
Piotr Sarna
3a46b1bb2b Merge "handle hints on separate connection and scheduling group" from Piotr
Introduce a new verb dedicated to receiving and sending hints: HINT_MUTATION. It is handled on the streaming connection, which is separate from the one used for handling mutations sent by the coordinator during a write.

The intent of using a separate connection is to increase fairness while handling hints and user requests - this way we avoid a situation in which one type of request saturates the connection, negatively impacting the other.

Information about the new RPC support is propagated through the new gossip feature HINTED_HANDOFF_SEPARATE_CONNECTION.

Fixes #4974.

Tests: unit(release)
2019-12-05 17:25:26 +01:00
Calle Wilund
c11874d851 gms::inet_address: Use special ostream formatting to match Java
To make the output of gms::inet_address::to_string() similar to origin's.
The sole purpose is a quick and easy fix for API/JMX IPv6
formatting of endpoints etc., where strings are used for lexical
comparisons instead of as a textual representation.

A better, but more laborious, solution is to fix the scylla-jmx
bridge to do an explicit parse + re-format of addresses, but there
are many such call points.

An even better solution would be to fix nodetool to not make this
mistake of doing lexical comparisons, but then we risk breaking
merge compatibility. That could, however, be an option for a separate
nodeprobe impl.

Message-Id: <20191204135319.1142-1-calle@scylladb.com>
2019-12-05 17:01:26 +02:00
Gleb Natapov
4893bc9139 tracing: split adding prepared query parameters from stopping of a trace
Currently a query_options object is passed to the trace stopping function,
which makes it mandatory to keep it alive until the end of the
query. The reason for that is to add prepared statement parameters to
the trace. All other query options that we want to put in the trace are
copied into trace_state::params_values, so let's copy prepared statement
parameters there too. The trace-enabled case will become a little more
expensive, but on the other hand we can drop a continuation that holds
the query_options object alive from a fast path. It is safe to drop the
call to stop_foreground_prepared() here since the tracing will be stopped
in process_request_one().

Message-Id: <20191205102026.GJ9084@scylladb.com>
2019-12-05 17:00:47 +02:00
Tomasz Grabiec
aa173898d6 Merge "Named semaphores in concurrency reader, segment_manager and region_group" from Juliusz
Selected semaphores' names are now included in exception messages in
case of timeout or when admission queue overflows.

Resolves #5281
2019-12-05 14:19:56 +01:00
Nadav Har'El
5b2f35a21a Merge "Redis: fix the options related to Redis API, fix the DEL and GET command"
Merged pull request https://github.com/scylladb/scylla/pull/5381 by
Peng Jian, fixing multiple small issues with Redis:

* Rename the options related to the Redis API, and describe them clearly.
* Rename redis_transport_port to redis_port
* Rename redis_transport_port_ssl to redis_ssl_port
* Rename redis_default_database_count to redis_database_count
* Remove the unnecessary option enable_redis_protocol
* Change the default value of the options redis_read_consistency_level and redis_write_consistency_level to LOCAL_QUORUM

* Fix the DEL command: support deleting multiple keys in one command.

* Fix the GET command: return the empty string when the required key does not exist.

* Fix redis-test/test_del_non_existent_key: mark it xfail.
2019-12-05 11:58:34 +02:00
Avi Kivity
85822c7786 database: fix schema use-after-move in make_multishard_streaming_reader
On aarch64, asan detected a use-after-move. It doesn't happen on x86_64,
likely due to different argument evaluation order.

Fix by evaluating full_slice before moving the schema.

Note: I used "auto&&" and "std::move()" even though full_slice()
returns a reference. I think this is safer in case full_slice()
changes, and works just as well with a reference.

Fixes #5419.
2019-12-05 11:58:34 +02:00
Piotr Sarna
79c3a508f4 table: Reduce read amplification in view update generation
This commit makes sure that single-partition readers for
read-before-write do not have fast-forwarding enabled,
as it may lead to huge read amplification. The observed case was:
1. Creating an index.
  CREATE INDEX index1  ON myks2.standard1 ("C1");
2. Running cassandra-stress in order to generate view updates.
cassandra-stress write no-warmup n=1000000 cl=ONE -schema \
  'replication(factor=2) compaction(strategy=LeveledCompactionStrategy)' \
  keyspace=myks2 -pop seq=4000000..8000000 -rate threads=100 -errors
  skip-read-validation -node 127.0.0.1;

Without disabling fast-forwarding, single-partition readers
were turned into scanning readers in cache, which resulted
in reading 36GB (sic!) on a workload which generates less
than 1GB of view updates. After applying the fix, the number
dropped down to less than 1GB, as expected.

Refs #5409
Fixes #4615
Fixes #5418
2019-12-05 11:58:34 +02:00
Konstantin Osipov
6a5e7c0e22 tests: reduce the number of iterations of dynamic_bitset_test
This test's execution time dominates the overall test execution
time in dev/release mode by a serious margin: reducing its
execution time improves the test.py turnaround by over 70%.

Message-Id: <20191204135315.86374-2-kostja@scylladb.com>
2019-12-05 11:58:34 +02:00
Avi Kivity
07427c89a2 gdb: change 'scylla thread' command to access fs_base register directly
Currently, 'scylla thread' uses arch_prctl() to extract the value of
fsbase, used to reference thread local variables. gdb 8 added support
for directly accessing the value as $fs_base, so use that instead. This
works from core dumps as well as live processes, as you don't need to
execute inferior functions.

The patch is required for debugging threads in core dumps, but not
sufficient, as we still need to set $rip and $rsp, and gdb still[1]
doesn't allow this.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=9370
2019-12-05 11:58:34 +02:00
Piotr Dulikowski
adfa7d7b8d messaging_service: don't move unsigned values in handlers
Performing std::move on integral types is pointless. This commit gets
rid of moves of values of `unsigned` type in rpc handlers.
2019-12-05 00:58:31 +01:00
Piotr Dulikowski
77d2ceaeba storage_proxy: handle hints through separate rpc verb 2019-12-05 00:51:52 +01:00
Piotr Dulikowski
2609065090 storage_proxy: move register_mutation handler to local lambda
This refactor makes it possible to reuse the lambda in the following
commits.
2019-12-05 00:51:52 +01:00
Piotr Dulikowski
6198ee2735 hh: introduce HINTED_HANDOFF_SEPARATE_CONNECTION feature
The feature introduced by this commit declares that hints can be sent
using the new dedicated RPC verb. Before using the new verb, nodes need
to know if other nodes in the cluster will be able to handle the new
RPC verb.
2019-12-05 00:51:52 +01:00
Piotr Dulikowski
2e802ca650 hh: add HINT_MUTATION verb
Introduce a new verb dedicated for receiving and sending hints:
HINT_MUTATION. It is handled on the streaming connection, which is
separate from the one used for handling mutations sent by coordinator
during a write.

The intent of using a separate connection is to increase fairness while
handling hints and user requests - this way we avoid a situation
in which one type of request saturates the connection, negatively
impacting the other.
2019-12-05 00:51:49 +01:00
Avi Kivity
fd951a36e3 Merge "Let compaction wait on background deletions" from Benny
"
In several cases in distributed testing (dtest) we trigger compaction using nodetool compact assuming that when it is done, it is indeed really done.
However, the way compaction is currently implemented in scylla, it may leave behind some background tasks to delete the old sstables that were compacted.

This commit changes major compaction (triggered via the ss::force_keyspace_compaction api) so it would wait on the background deletes and will return only when they finish.

Fixes #4909

Tests: unit(dev), nodetool_refresh_with_data_perms_test, test_nodetool_snapshot_during_major_compaction
"
2019-12-04 11:18:41 +02:00
Takuya ASADA
c9d8606786 dist/common/scripts/scylla_ntp_setup: relax RHEL version check
We may be able to use the chrony setup script on future versions of
RHEL/CentOS, so it is better to run the chrony setup when the RHEL
version is >= 8, not only on 8.

Note that Fedora still provides the ntp/ntpdate packages, so we run
the ntp setup on it for now. (same on Debian variants)

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20191203192812.5861-1-syuu@scylladb.com>
2019-12-04 10:59:14 +02:00
Juliusz Stasiewicz
430b2ad19d commitlog+region_group: timeout exceptions with names
`segment_manager' now uses a decorated version of `timed_out_error'
with a hardcoded name. On the other hand, `region_group' uses a named
`on_request_expiry' within its `expiring_fifo'.
2019-12-03 19:07:19 +01:00
Avi Kivity
91d3f2afce docs: maintainers.md: fix typo in git push --force-with-lease
Just one lease, not many.

Reported by Piotr Sarna.
2019-12-03 18:17:46 +01:00
Calle Wilund
56a5e0a251 commitlog_replayer: Ensure applied frozen_mutation is safe during apply
Fixes #5211

In 79935df959 the replay apply call was
changed from one with no continuation to one with a continuation. But
the frozen mutation arg was still just lambda-local.

Change to use do_with for this case as well.

Message-Id: <20191203162606.1664-1-calle@scylladb.com>
2019-12-03 18:28:01 +02:00
Juliusz Stasiewicz
d043393f52 db+semaphores+tests: mandatory `name' param in reader_concurrency_semaphore
Exception messages contain semaphore's name (provided in ctor).
This affects the queue overflow exception as well as timeout
exception. Also, custom throwing function in ctor was changed
to `prethrow_action', i.e. metrics can still be updated there but
now callers have no control over the type of the exception being
thrown. This affected `restricted_reader_max_queue_length' test.
`reader_concurrency_semaphore'-s docs are updated accordingly.
2019-12-03 15:41:34 +01:00
Amos Kong
e26b396f16 scylla-docker: fix default data_directories in scyllasetup.py (#5399)
Use the default data_file_directories if it is not assigned in scylla.yaml

Fixes #5398

Signed-off-by: Amos Kong <amos@scylladb.com>
2019-12-03 13:58:17 +02:00
Rafael Ávila de Espíndola
1cd17887fa build: strip debug when configured with --debuginfo 0
In a build configured with --debuginfo 0 the scylla binary still ends
up with some debug info from the libraries that are statically linked
in.

We should avoid compiling subprojects (including seastar) with debug
info when none is needed, but this at least avoids it showing up in
the binary.

The main motivation for this is that it is confusing to get a binary
with *some* debug info in it.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191127215843.44992-1-espindola@scylladb.com>
2019-12-03 12:41:04 +02:00
Tomasz Grabiec
0a453e5d30 Merge "Use fragmented buffers for collection de/serialization" from Botond
This series refactors the collection de/serialization code to use
fragmented buffers, avoiding the large allocations and the associated
pains when working with large collections. Currently all operations that
involve collections require deserializing them, executing the operation,
then serializing them again to their internal storage format. The
de/serialization operations happen in linearized buffers, which means
that we have to allocate a buffer large enough to hold the *entire*
collection. This can cause immense pressure on the memory allocator,
which, in the face of memory fragmentation, might be unable to serve the
allocation at all. We've seen this causing all sorts of nasty problems,
including but not limited to: failing compactions, failing memtable
flushes, and OOM crashes.

Users are strongly discouraged from using large collections, yet they
are still a fact of life and have been haunting us since forever.

The proper solution for these problems would be to come up with an
in-memory format for collections, however that is a major effort, with a
lot of unknowns. This is something we plan on doing at some point but
until it happens we should make life less painful for those with large
collections.

The goal of this series is to avoid the need of allocating these large
buffers. Serialization now happens into a `bytes_ostream` which
automatically fragments the values internally. Deserialization happens
with `utils::linearizing_input_stream` (introduced by this series), which
linearizes only the individual collection cells, but not the entire
collection.
An important goal of this series was to introduce the least amount of
risk, and hence the least amount of code. This series does not try to
make a revolution and completely revamp and optimize the
de/serialization codepaths. These codepaths have their days numbered so
investing a lot of effort into them is in vain. We can apply incremental
optimizations where we deem it necessary.

Fixes: #5341
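
The fragmenting-serialization idea can be sketched as a toy Python model; this is not the actual `bytes_ostream` (the class name and chunk size here are invented for illustration):

```python
class FragmentedOstream:
    """Toy model of a fragmenting output stream: data is appended to
    fixed-size chunks, never to one large contiguous buffer, so no
    single allocation has to cover the whole serialized collection."""

    def __init__(self, chunk_size=8):
        self.chunk_size = chunk_size
        self.chunks = [bytearray()]

    def write(self, data: bytes):
        for byte in data:
            if len(self.chunks[-1]) == self.chunk_size:
                self.chunks.append(bytearray())  # start a new fragment
            self.chunks[-1].append(byte)

    def linearize(self):
        # Only for inspection/tests; the whole point is to avoid this
        # operation in the real serialization path.
        return b"".join(bytes(c) for c in self.chunks)

out = FragmentedOstream(chunk_size=4)
out.write(b"hello world")
print(len(out.chunks))   # 3 fragments of at most 4 bytes each
print(out.linearize())   # b'hello world'
```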
2019-12-03 10:31:34 +01:00
fastio
01599ffbae Redis API: Support the syntax of deleting multiple keys in one DEL command, fix the return value of the GET command.
Support deleting multiple keys in one DEL command.
Returning the number of keys actually deleted is still not supported.
For the GET command, return the empty string to the client when the required key does not exist.

Fixes: #5334

Signed-off-by: Peng Jian <pengjian.uestc@gmail.com>
2019-12-03 17:27:40 +08:00
fastio
039b83ad3b Redis API: Rename options related to Redis API, describe them clearly, and remove unnecessary one.
Rename option redis_transport_port to redis_port; this is the port the redis transport listens on for clients.
Rename option redis_transport_port_ssl to redis_ssl_port; this is the port the redis TLS transport listens on for clients.
Rename option redis_default_database_count to redis_database_count; this sets the redis database count.
Rename option redis_keyspace_options to redis_keyspace_replication_strategy_options; this sets the replication strategy for the redis keyspace.
Remove option enable_redis_protocol, which is unnecessary.

Fixes: #5335

Signed-off-by: Peng Jian <pengjian.uestc@gmail.com>
2019-12-03 17:13:35 +08:00
Nadav Har'El
7b93360c8d Merge: redis: skip processing request of EOF
Merged pull request https://github.com/scylladb/scylla/pull/5393/ by
Amos Kong:
`
When I test the redis cmd with echo and nc, there is a redundant error at the end.
I checked with strace: currently, if the client reads nothing from stdin, it
shuts down the socket, and the redis server reads nothing (0 bytes) from the
socket. But it still tries to process the empty command and returns an error.

$ echo -n -e '*1\r\n$4\r\nping\r\n' |strace nc localhost 6379
| ...
|    read(0, "*1\r\n$4\r\nping\r\n", 8192)   = 14
|    select(5, [4], [4], [], NULL)           = 1 (out [4])
|>>> sendto(4, "*1\r\n$4\r\nping\r\n", 14, 0, NULL, 0) = 14
|    select(5, [0 4], [], [], NULL)          = 1 (in [0])
|    recvfrom(0, 0x7ffe4d5b6c70, 8192, 0, 0x7ffe4d5b6bf0, 0x7ffe4d5b6bec) = -1 ENOTSOCK (Socket operation on non-socket)
|    read(0, "", 8192)                       = 0
|>>> shutdown(4, SHUT_WR)                    = 0
|    select(5, [4], [], [], NULL)            = 1 (in [4])
|    recvfrom(4, "+PONG\r\n-ERR unknown command ''\r\n", 8192, 0, 0x7ffe4d5b6bf0, [0]) = 32
|    write(1, "+PONG\r\n-ERR unknown command ''\r\n", 32+PONG
|    -ERR unknown command ''
|    ) = 32
|    select(5, [4], [], [], NULL)            = 1 (in [4])
|    recvfrom(4, "", 8192, 0, 0x7ffe4d5b6bf0, [0]) = 0
|    close(1)                                = 0
|    close(4)                                = 0

Current result:
  $ echo -n -e '' |nc localhost 6379
  -ERR unknown command ''
  $ echo -n -e '*1\r\n$4\r\nping\r\n' |nc localhost 6379
  +PONG
  -ERR unknown command ''

Expected:
  $ echo -n -e '' |nc localhost 6379
  $ echo -n -e '*1\r\n$4\r\nping\r\n' |nc localhost 6379
  +PONG
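
The essence of the fix is to check for an empty (EOF) read before trying to parse a command. A hedged Python sketch of a RESP reader with that behavior (the structure and names are invented for illustration; the real server is C++):

```python
import io

def parse_commands(stream):
    """Parse RESP commands; at EOF (a 0-byte read) stop silently instead
    of emitting an 'unknown command' error for the empty input."""
    commands = []
    while True:
        header = stream.readline()
        if not header:                 # 0-byte read: peer closed connection
            break                      # do NOT treat '' as a command
        nargs = int(header[1:])        # b'*1\r\n' -> 1 argument
        args = []
        for _ in range(nargs):
            length = int(stream.readline()[1:])   # b'$4\r\n' -> 4
            args.append(stream.read(length))
            stream.read(2)             # consume trailing \r\n
        commands.append(args)
    return commands

stream = io.BytesIO(b"*1\r\n$4\r\nping\r\n")
print(parse_commands(stream))  # [[b'ping']] -- and no trailing empty command
```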
2019-12-03 10:40:20 +02:00
Avi Kivity
83feb9ea77 tools: toolchain: update frozen image
Commit 96009881d8 added diffutils to the dependencies via
Seastar's install-dependencies.sh, after it was inadvertently
dropped in 1164ff5329 (update to Fedora 31; diffutils is no
longer brought in as a side effect of something else).

Regenerate the image to include diffutils.

Ref #5401.
2019-12-03 10:36:55 +02:00
Amos Kong
fb9af2a86b redis-test: add test_raw_cmd.py
This patch adds subtests for EOF processing; they read and write the socket
directly using protocol commands.

We can add more tests in the future; tests using the Redis client module
would hide some protocol errors.

Signed-off-by: Amos Kong <amos@scylladb.com>
2019-12-03 10:47:56 +08:00
Amos Kong
4fa862adf4 redis: skip processing request of EOF
When I test the redis cmd with echo and nc, there is a redundant error at the end.
I checked with strace: currently, if the client reads nothing from stdin, it
shuts down the socket, and the redis server reads nothing (0 bytes) from the
socket. But it still tries to process the empty command and returns an error.

$ echo -n -e '*1\r\n$4\r\nping\r\n' |strace nc localhost 6379
| ...
|    read(0, "*1\r\n$4\r\nping\r\n", 8192)   = 14
|    select(5, [4], [4], [], NULL)           = 1 (out [4])
|>>> sendto(4, "*1\r\n$4\r\nping\r\n", 14, 0, NULL, 0) = 14
|    select(5, [0 4], [], [], NULL)          = 1 (in [0])
|    recvfrom(0, 0x7ffe4d5b6c70, 8192, 0, 0x7ffe4d5b6bf0, 0x7ffe4d5b6bec) = -1 ENOTSOCK (Socket operation on non-socket)
|    read(0, "", 8192)                       = 0
|>>> shutdown(4, SHUT_WR)                    = 0
|    select(5, [4], [], [], NULL)            = 1 (in [4])
|    recvfrom(4, "+PONG\r\n-ERR unknown command ''\r\n", 8192, 0, 0x7ffe4d5b6bf0, [0]) = 32
|    write(1, "+PONG\r\n-ERR unknown command ''\r\n", 32+PONG
|    -ERR unknown command ''
|    ) = 32
|    select(5, [4], [], [], NULL)            = 1 (in [4])
|    recvfrom(4, "", 8192, 0, 0x7ffe4d5b6bf0, [0]) = 0
|    close(1)                                = 0
|    close(4)                                = 0

Current result:
  $ echo -n -e '' |nc localhost 6379
  -ERR unknown command ''
  $ echo -n -e '*1\r\n$4\r\nping\r\n' |nc localhost 6379
  +PONG
  -ERR unknown command ''

Expected:
  $ echo -n -e '' |nc localhost 6379
  $ echo -n -e '*1\r\n$4\r\nping\r\n' |nc localhost 6379
  +PONG

Signed-off-by: Amos Kong <amos@scylladb.com>
2019-12-03 10:47:56 +08:00
Rafael Ávila de Espíndola
bb114de023 dbuild: Fix confusion about relabeling
podman needs to relabel directories in exactly the same cases as docker
does. The difference is that podman cannot relabel /tmp.

The reason it was working before is that in practice anyone using
dbuild has already relabeled any directories that need relabeling,
with the exception of /tmp, since it is recreated on every boot.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191201235614.10511-2-espindola@scylladb.com>
2019-12-02 18:38:16 +02:00
Rafael Ávila de Espíndola
867cdbda28 dbuild: Use a temporary directory for /tmp
With this we don't have to use --security-opt label=disable.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191201235614.10511-1-espindola@scylladb.com>
2019-12-02 18:38:14 +02:00
Botond Dénes
1d1f8b0d82 tests: mutation_test: add large collection allocation test
Checking that there are no large allocations when a large collection is
de/serialized.
2019-12-02 17:13:53 +02:00
Avi Kivity
28355af134 docs: add maintainer's handbook (#5396)
This is a list of recipes used by maintainers to maintain
scylla.git.
2019-12-02 15:01:54 +02:00
Botond Dénes
4c59487502 collection_mutation: don't linearize the buffer on deserialization
Use `utils::linearizing_input_stream` for the deserialization of the
collection. This allows avoiding the linearization of the entire cell
value, instead only linearizing individual values as they are
deserialized from the buffer.
2019-12-02 10:10:31 +02:00
Botond Dénes
690e9d2b44 utils: introduce linearizing_input_stream
`linearizing_input_stream` allows transparently reading linearized
values from a fragmented buffer. This is done by linearizing on-the-fly
only those read values that happen to be split across multiple
fragments. This reduces the size of the largest allocation from the size
of the entire buffer (when the entire buffer is linearized) to the size
of the largest read value. This is a huge gain when the buffer contains
loads of small objects, and modest gains when the buffer contains a few
large objects. But even in the worst case, the size of the largest
allocation will be less than or equal to that in the case where the
entire buffer is linearized.

This stream is planned to be used as glue code between the fragmented
cell value and the collection deserialization code which expects to be
reading linearized values.
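
The idea can be modeled in a toy Python sketch (names invented; the real implementation is C++): a value contained in a single fragment is read directly, and only a value straddling a fragment boundary is copied (linearized):

```python
class LinearizingInputStream:
    """Toy model: reads from a list of byte fragments, copying
    (linearizing) only values that straddle a fragment boundary."""

    def __init__(self, fragments):
        self.fragments = [memoryview(f) for f in fragments]
        self.idx = 0  # current fragment
        self.pos = 0  # offset within the current fragment

    def read(self, n):
        frag = self.fragments[self.idx]
        if self.pos + n <= len(frag):      # fast path: value fits in one fragment
            value = bytes(frag[self.pos:self.pos + n])
            self.pos += n
            return value
        out = bytearray()                  # slow path: linearize just this value
        while n:
            frag = self.fragments[self.idx]
            take = min(n, len(frag) - self.pos)
            out += frag[self.pos:self.pos + take]
            self.pos += take
            n -= take
            if self.pos == len(frag) and n:
                self.idx += 1              # advance to the next fragment
                self.pos = 0
        return bytes(out)

s = LinearizingInputStream([b"abc", b"defg"])
print(s.read(2), s.read(3), s.read(2))  # b'ab' b'cde' b'fg'
```

Note that only the middle read, which crosses the fragment boundary, needs its own allocation covering the whole value.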
2019-12-02 10:10:31 +02:00
Botond Dénes
065d8d37eb tests: random-utils: get_string(): add overload that takes engine parameter 2019-12-02 10:10:31 +02:00
Botond Dénes
2f9307c973 collection_mutation: use a fragmented buffer for serialization
For the serialization `bytes_ostream` is used.
2019-12-02 10:10:31 +02:00
Botond Dénes
fc5b096f73 imr: value_writer::write_to_destination(): don't dereference chunk iterator eagerly
Currently the loop which writes the data from the fragmented origin to
the destination moves to the next chunk eagerly after writing the value
of the current chunk, if the current chunk is exhausted.
This presents a problem when we are writing the last piece of data from
the last chunk, as the chunk will be exhausted and we eagerly attempt to
move to the next chunk, which doesn't exist and dereferencing it will
fail. The solution is to not be eager about moving to the next chunk and
only attempt it if we actually have more data to write and hence expect
more chunks.
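
The fixed loop shape can be modeled in Python (a sketch, not the actual imr code): the chunk iterator is advanced only when more data remains to be written, so we never step past the last chunk:

```python
def write_fragments(chunks, total):
    """Copy `total` bytes from an iterator over chunks into one output.
    The iterator is advanced only when more data remains, so the code
    never dereferences a past-the-end chunk (toy model of the fix)."""
    out = bytearray()
    it = iter(chunks)
    chunk = next(it)
    pos = 0
    while True:
        take = min(total - len(out), len(chunk) - pos)
        out += chunk[pos:pos + take]
        pos += take
        if len(out) == total:
            break            # done: do NOT advance to a next chunk
        chunk = next(it)     # safe: more data to write implies more chunks
        pos = 0
    return bytes(out)

print(write_fragments([b"ab", b"cd"], 4))  # b'abcd' -- no StopIteration
```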
2019-12-02 10:10:31 +02:00
Botond Dénes
875314fc4b bytes_ostream: make it a FragmentRange
The presence of `const_iterator` seems to be a requirement as well,
although it is not part of the concept. Perhaps it is just an
assumption made by code using it.
2019-12-02 10:10:31 +02:00