scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 10:00:35 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	5bc201df10	cache: Release dirty memory with row granularity	2018-05-30 14:41:41 +02:00
Tomasz Grabiec	70c72773be	cache: Defer during partition merging	2018-05-30 14:41:41 +02:00
Tomasz Grabiec	051bb74583	mvcc: partition_snapshot_row_cursor: Introduce consume_row()	2018-05-30 14:41:41 +02:00
Tomasz Grabiec	518fd7083f	mvcc: partition_snapshot_row_cursor: Introduce maybe_refresh_static() A version of maybe_refresh() optimized for snapshots which are no longer populated. Will be used to implement cache update from memtable.	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	c653137b2b	mvcc: Make apply_to_incomplete() work with attached versions Needed before making it preemptible. We cannot steal the entry since we may need to resume merging later.	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	1792be3697	cache: Propagate phase to apply_to_incomplete() It will be needed to create snapshots with appropriate phase markers.	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	494cb3f3da	cache: Prepare for incremental apply_to_incomplete() Incremental merging will be implemented by means of resumable functions, which return stop_iteration::no when not yet finished. We're not using futures, so that the caller can do work around preemption points as well.	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	a19c5cbc16	Introduce a coroutine wrapper Represents a deferring operation which defers cooperatively with the caller. The operation is started and resumed by calling run(), which returns with stop_iteration::no whenever the operation defers and is not completed yet. When the operation is finally complete, run() returns with stop_iteration::yes. This allows the caller to: 1) execute some post-defer and pre-resume actions atomically 2) have control over when the operation is resumed and in which context, in particular the caller can cancel the operation at deferring points. It will be used to implement deferring partition_version::apply_to_incomplete().	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	6bd1a04c10	tests: mvcc: Encapsulate memory management details Currently tests have a single LSA region lock around construction of managed objects, their manipulation, and access. This way we avoid the complexity of dealing with allocating sections. That will not be possible once apply_to_incomplete() is changed to enter an allocating section itself because this requires region to be unlocked at entry. The tests will have to take more fine-grained locks. That is somewhat tricky and would add a lot of noise to tests. This patch will make things easier by abstracting LSA management, among other things, inside mvcc_container and mvcc_partition classes.	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	f6e21accc7	tests: cache: Take into account that update() may defer The test incorrectly assumed that once update() is started the cache will return only versions from last_generation. This will not hold once we start to defer during partition merging.	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	c10d9e1607	cache: real_dirty_memory_accounter: Allow construction without memtable	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	6ecda1ccd7	cache: Extract real_dirty_memory_accounter	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	3f19f76c67	mvcc: Destroy memtable partition versions gently Now all snapshots will have a mutation_cleaner which they will use to gently destroy freed partition_version objects. Destruction of memtable entries during cache update is also using the gentle cleaner now. We need to have a separate cleaner for memtable objects even though they're owned by cache's region, because memtable versions must be cleared without a cache_tracker. Each memtable will have its own cleaner, which will be merged with the cache's cleaner when memtable is merged into cache. Fixes some sources of reactor stalls on cache update when there are large partition entries in memtables.	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	c2d702622e	memtable: Destroy partitions incrementally from clear_gently() Destroying large partitions may stall the reactor for a long time. Avoid this by clearing incrementally.	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	81d231f35b	mvcc: Remove rows from tracker gently Some partitions may have a lot of rows. Better to iterate over them incrementally as part of clear_gently() to avoid stalls.	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	f0c1edd672	cache: Destroy partition versions incrementally Instead of destroying whole partition_versions at once, we will do that gently using mutation_cleaner to avoid reactor stalls. Large deletions could happen when a large partition gets invalidated, upgraded to a new schema, or when it's abandoned by a detached snapshot. Refs #3289.	2018-05-30 14:41:40 +02:00
Tomasz Grabiec	e0803ff71e	Introduce mutation_cleaner Used for collecting unused partition_version objects and freeing them incrementally. Will be used for both cache and memtables.	2018-05-30 14:41:39 +02:00
Tomasz Grabiec	e5aa02efeb	mvcc: Introduce partition_version_list	2018-05-30 12:18:56 +02:00
Tomasz Grabiec	ca1ee93577	mvcc: Fix move constructor of partition_version_ref() not preserving _unique_owner We didn't rely on that yet, it seems, but will. (cherry picked from commit 21a744337de01f699d5c5c340483ad23cabab2ee)	2018-05-30 12:18:56 +02:00
Tomasz Grabiec	40cc766cf2	database: Add API for incremental clearing of partition entries Partitions can get very large. Destroying them all at once can stall the reactor for a significant amount of time. We want to avoid that by doing destruction incrementally, deferring in between. A new API is added for that at various levels: stop_iteration clear_gently() noexcept; It returns stop_iteration::yes when the object is fully cleared and can now be destroyed quickly. So a deferring destruction can look like this: return repeat([this] { return clear_gently(); }); The reason why clear_gently() doesn't return a future<> itself is that some contexts cannot defer, like memory reclamation.	2018-05-30 12:18:56 +02:00
Tomasz Grabiec	2f75212ca4	cache: Define trivial methods inline They have users in a different compilation unit, in partition_version.cc	2018-05-30 12:18:56 +02:00
Tomasz Grabiec	25b3641d9e	tests: Improve perf_row_cache_update We now test more kinds of workloads: - small partitions with no clustering key - large partition with lots of small rows - large partition with lots of range tombstones We also collect statistics about scheduling latency induced by cache update. Example output: Small partitions, no overwrites: update: 356.809113 [ms], stall: {ticks: 396, min: 0.006867 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.358102 [ms]}, cache: 257/257 [MB] LSA: 257/257 [MB] std free: 83 [MB] update: 337.542999 [ms], stall: {ticks: 373, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.358102 [ms]}, cache: 514/514 [MB] LSA: 514/514 [MB] std free: 83 [MB] update: 383.485291 [ms], stall: {ticks: 425, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 771/788 [MB] LSA: 771/788 [MB] std free: 83 [MB] update: 574.968811 [ms], stall: {ticks: 634, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.629722 [ms], max: 1.955666 [ms]}, cache: 879/917 [MB] LSA: 879/917 [MB] std free: 83 [MB] update: 411.541138 [ms], stall: {ticks: 455, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.358102 [ms]}, cache: 787/835 [MB] LSA: 787/835 [MB] std free: 83 [MB] update: 368.491211 [ms], stall: {ticks: 408, min: 0.001332 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 750/790 [MB] LSA: 750/790 [MB] std free: 83 [MB] update: 343.671967 [ms], stall: {ticks: 380, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 734/769 [MB] LSA: 734/769 [MB] std free: 83 [MB] update: 320.277283 [ms], stall: {ticks: 357, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 724/753 [MB] LSA: 724/753 [MB] std free: 83 [MB] update: 310.583282 [ms], 
stall: {ticks: 344, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 714/740 [MB] LSA: 714/740 [MB] std free: 83 [MB] update: 303.627106 [ms], stall: {ticks: 338, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.955666 [ms]}, cache: 707/731 [MB] LSA: 707/731 [MB] std free: 83 [MB] update: 296.742523 [ms], stall: {ticks: 330, min: 0.001332 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 701/724 [MB] LSA: 701/724 [MB] std free: 83 [MB] update: 286.598541 [ms], stall: {ticks: 319, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 697/719 [MB] LSA: 697/719 [MB] std free: 83 [MB] update: 288.649323 [ms], stall: {ticks: 321, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 694/715 [MB] LSA: 694/715 [MB] std free: 83 [MB] update: 282.069916 [ms], stall: {ticks: 314, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 692/712 [MB] LSA: 692/712 [MB] std free: 83 [MB] update: 292.462036 [ms], stall: {ticks: 325, min: 0.001917 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 689/708 [MB] LSA: 689/708 [MB] std free: 83 [MB] update: 274.390442 [ms], stall: {ticks: 305, min: 0.001332 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 687/705 [MB] LSA: 687/705 [MB] std free: 83 [MB] invalidation: 172.617508 [ms] Large partition, lots of small rows: update: 262.132721 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.005722 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 268.650944 [ms]}, cache: 187/188 [MB] LSA: 187/188 [MB] std free: 82 [MB] update: 281.359467 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 322.381152 
[ms]}, cache: 375/376 [MB] LSA: 375/376 [MB] std free: 82 [MB] update: 287.229065 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 322.381152 [ms]}, cache: 563/564 [MB] LSA: 563/564 [MB] std free: 82 [MB] update: 1294.816284 [ms], stall: {ticks: 4, min: 0.001917 [ms], 50%: 0.005722 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 1386.179840 [ms]}, cache: 586/625 [MB] LSA: 586/625 [MB] std free: 82 [MB] update: 845.022461 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.005722 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 962.624896 [ms]}, cache: 439/475 [MB] LSA: 439/475 [MB] std free: 82 [MB] update: 380.335938 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 386.857376 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB] update: 477.234680 [ms], stall: {ticks: 4, min: 0.002760 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB] update: 525.955017 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB] update: 548.003784 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.006866 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB] update: 528.697937 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB] update: 609.292603 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.005722 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB] update: 575.762451 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.017084 [ms], 99%: 0.017084 
[ms], max: 668.489536 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB] update: 530.801392 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB] update: 535.948364 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB] update: 527.143555 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.020501 [ms], 99%: 0.020501 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB] update: 521.869202 [ms], stall: {ticks: 4, min: 0.002760 [ms], 50%: 0.004768 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB] invalidation: 173.069733 [ms] Large partition, lots of range tombstones: update: 224.003220 [ms], stall: {ticks: 4, min: 0.001917 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 268.650944 [ms]}, cache: 52/52 [MB] LSA: 52/52 [MB] std free: 82 [MB] update: 570.882874 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 105/105 [MB] LSA: 105/105 [MB] std free: 82 [MB] update: 577.249878 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 158/158 [MB] LSA: 158/158 [MB] std free: 82 [MB] update: 580.239624 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 211/211 [MB] LSA: 211/211 [MB] std free: 82 [MB] update: 614.187134 [ms], stall: {ticks: 4, min: 0.001917 [ms], 50%: 0.004768 [ms], 90%: 0.011864 [ms], 99%: 0.011864 [ms], max: 668.489536 [ms]}, cache: 264/264 [MB] LSA: 264/264 [MB] std free: 82 [MB] update: 618.709229 [ms], 
stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.003973 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 317/317 [MB] LSA: 317/317 [MB] std free: 82 [MB] update: 626.943359 [ms], stall: {ticks: 4, min: 0.001598 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 369/370 [MB] LSA: 369/370 [MB] std free: 82 [MB] update: 602.873474 [ms], stall: {ticks: 4, min: 0.001917 [ms], 50%: 0.003973 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 422/423 [MB] LSA: 422/423 [MB] std free: 82 [MB] update: 617.522583 [ms], stall: {ticks: 4, min: 0.001598 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 475/475 [MB] LSA: 475/475 [MB] std free: 82 [MB] update: 627.291138 [ms], stall: {ticks: 4, min: 0.001598 [ms], 50%: 0.004768 [ms], 90%: 0.011864 [ms], 99%: 0.011864 [ms], max: 668.489536 [ms]}, cache: 528/528 [MB] LSA: 528/528 [MB] std free: 82 [MB] update: 623.720886 [ms], stall: {ticks: 4, min: 0.001598 [ms], 50%: 0.003973 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 581/581 [MB] LSA: 581/581 [MB] std free: 82 [MB] update: 630.735596 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 634/634 [MB] LSA: 634/634 [MB] std free: 82 [MB] update: 2776.525635 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 2874.382592 [ms]}, cache: 687/687 [MB] LSA: 687/687 [MB] std free: 82 [MB]	2018-05-30 12:18:56 +02:00
Tomasz Grabiec	bb96518cc5	mutation_reader: Make empty mutation source advertise no partitions So that perf_row_cache_update will always populate cache.	2018-05-30 12:18:56 +02:00
Tomasz Grabiec	aefb5e0fbd	Merge "Get rid of cql_statement::execute_internal" from Avi execute_internal() duplicates several code paths, especially in the select path, for no good reason. It boils down to timeout and consistency level selection which can be done based on client_state::is_internal(). This patchset eliminated the duplication and execute_internal(), simplifying the code. * github.com:avikivity/scylla cql-no-execute_internal/v2: cql: schema_altering_statement: make execute() and execute_internal() equivalent cql: select_statement: make execute() and execute_internal() equivalent cql: query_processor: don't call cql_statement::execute_internal() any more cql: cql_statement: remove execute_internal()	2018-05-28 13:01:43 +02:00
Avi Kivity	8033785b36	Update scylla-ami submodule * dist/ami/files/scylla-ami 025644d...1f5329f (1): > scylla_install_ami: Update CentOS to latest version	2018-05-28 13:59:57 +03:00
Avi Kivity	ff3e86888a	tests: report tests as they are completed As each test completes, report it. This prevents a long-running test in the beginning of the list from stalling output. Message-Id: <20180526173517.23078-1-avi@scylladb.com>	2018-05-28 13:58:01 +03:00
Avi Kivity	3a4d11d374	Merge "Introduce frozen_mutation_fragment" from Paweł " This series introduces frozen_mutation_fragment which can be used to send mutation_fragments over the wire to a remote node. The main intended user is going to be the new streaming implementation. The first part of the series fixes some IDL issues related to empty structures and variant being the first member of a structure. Both these problems make the generated code fail to build and they do not, in any way, affect the existing on-wire protocol. Logic responsible for freezing and unfreezing of mutation_fragments is heavily based on the existing code for freezing mutations and shares the same drawbacks (for example, unnecessary copy during unfreezing). These preexisting performance problems can be fixed incrementally. Another performance problem (which affects frozen_mutations as well, but to a lesser extent) is that since the batching is done at a different layer each frozen mutation fragment is a separate bytes_ostream object owning at least one memory buffer. If the mutation fragments are small this will cause an excessive number of allocations. This could be solved either by freezing fragments in batches (though it goes against the RPC layer doing its own batching) or using bytes_ostream or an equivalent object with a buffer allocation policy more suitable for such use cases. This also is something that probably could be an incremental fix. 
Tests: unit (release) " * tag 'frozen_mutation_fragment/v1-rebased' of https://github.com/pdziepak/scylla: idl: add idl description of frozen_mutation_fragments tests: add test for frozen_mutation_fragments frozen_mutation: introduce frozen_mutation_fragment tests/idl: test variant being the first member of a structure idl: create variant state in root node tests/idl: test serialising and deserialising empty structures idl-compiler: avoid unused variable in empty struct deserialisers tests/mutation_reader: disambiguate freeze() overload	2018-05-28 13:54:01 +03:00
Takuya ASADA	55d6be9254	Revert "dist/ami: update CentOS base image to latest version" This reverts commit `69d226625a`. Since ami-4bf3d731 is a Marketplace AMI, it is not possible to publish a public AMI based on it. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <20180523112414.27307-1-syuu@scylladb.com>	2018-05-28 13:52:34 +03:00
Avi Kivity	b70febe246	cql: cql_statement: remove execute_internal() With no callers, it can be safely removed.	2018-05-27 12:40:27 +03:00
Avi Kivity	c8a66efb6a	cql: query_processor: don't call cql_statement::execute_internal() any more All cql_statement::execute_internal() overrides now either throw or call execute(). Since we shouldn't be calling the throwing overrides internally, we can safely call execute() instead. This allows us to get rid of execute_internal().	2018-05-27 12:37:37 +03:00
Avi Kivity	eb19798f99	cql: select_statement: make execute() and execute_internal() equivalent execute_internal(), for some code paths, differs from execute by the following: 1. it uses CL_ONE unconditionally 2. it has no query timeout 3. it doesn't use execution stages for other code paths, it just calls execute. As preparation for getting rid of execute_internal(), unify the two code paths. Commit `4859b759b9` caused the consistency level and timeouts to be provided by the caller, so using the caller provided parameters instead of overriding them does not change behavior.	2018-05-27 12:36:02 +03:00
Avi Kivity	d998f06633	cql: schema_altering_statement: make execute() and execute_internal() equivalent To get rid of execute_internal(), make the normal execute() equivalent and call it instead of having two different paths.	2018-05-27 11:08:55 +03:00
Duarte Nunes	4859b759b9	Merge 'Make all timeouts explicit' from Avi " This patchset makes all users of query_processor specify their timeouts explicitly, in preparation for the removal of cql_statement::execute_internal() (whose main function was to override timeouts). " * tag 'cql-explicit-timeouts/v1' of https://github.com/avikivity/scylla: query_processor: require clients to specify timeout configuration query_processor: un-default consistency level in make_internal_options	2018-05-26 16:10:58 +02:00
Avi Kivity	6e97609049	Merge "Improve support for data types handling in SSTables 3.x" from Vladimir " Firstly, this patchset removes the is_fixed_length() function of abstract_type in favour of value_length_if_fixed(). Secondly, it fixed the byte_type to be compatible with Cassandra which erroneously treats it as a variable-length data type. Lastly, it adds a unit test covering all non-composite CQL data types for writing. Tests: unit {release} " * 'projects/sstables-30/different-data-types/v1' of https://github.com/argenet/scylla: tests: Add a unit test for writing different data types to SSTables 3.x format. types: Treat byte_type as a variable-length type for compatibility reasons. types: Remove is_value_fixed() and use value_length_if_fixed() instead.	2018-05-26 10:24:35 +03:00
Vladimir Krivopalov	0951153292	tests: Add a unit test for writing different data types to SSTables 3.x format. This test covers all non-composite CQL data types. The resulting files are dumped using sstabledump as follows: [ { "partition" : { "key" : [ "key" ], "position" : 0 }, "rows" : [ { "type" : "row", "position" : 174, "liveness_info" : { "tstamp" : "1525385507816568" }, "cells" : [ { "name" : "asciival", "value" : "hello" }, { "name" : "bigintval", "value" : 9223372036854775807 }, { "name" : "blobval", "value" : "0x6772656174" }, { "name" : "boolval", "value" : true }, { "name" : "dateval", "value" : "2017-05-05" }, { "name" : "decimalval", "value" : 5.45 }, { "name" : "doubleval", "value" : 36.6 }, { "name" : "durationval", "value" : 1h4m48s20ms }, { "name" : "floatval", "value" : 7.62 }, { "name" : "inetval", "value" : "192.168.0.110" }, { "name" : "intval", "value" : -2147483648 }, { "name" : "smallintval", "value" : 32767 }, { "name" : "timeuuidval", "value" : "50554d6e-29bb-11e5-b345-feff819cdc9f" }, { "name" : "timeval", "value" : "19:45:05.090000000" }, { "name" : "tinyintval", "value" : 127 }, { "name" : "tsval", "value" : "2015-05-01 09:30:54.234Z" }, { "name" : "uuidval", "value" : "01234567-0123-0123-0123-0123456789ab" }, { "name" : "varcharval", "value" : "привет" }, { "name" : "varintval", "value" : 123 } ] } ] } ] Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-25 21:41:23 -07:00
Vladimir Krivopalov	3981dd6dd6	types: Treat byte_type as a variable-length type for compatibility reasons. Although values of the byte_type that corresponds to CQL TINYINT type always occupy only a single byte, Cassandra treats it as a variable-length type for SSTables 3.0 reading and writing. While it is clearly a mistake on Cassandra's side, we have to stay compatible. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-25 21:41:23 -07:00
Vladimir Krivopalov	24cb062834	types: Remove is_value_fixed() and use value_length_if_fixed() instead. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-25 21:41:23 -07:00
Paweł Dziepak	ed12555192	idl: add idl description of frozen_mutation_fragments	2018-05-25 10:15:10 +01:00
Paweł Dziepak	0bac487426	tests: add test for frozen_mutation_fragments	2018-05-25 10:15:10 +01:00
Paweł Dziepak	aa4e589ace	frozen_mutation: introduce frozen_mutation_fragment This patch introduces IDL definition as well as serialisers and deserialisers for freezing mutation_fragment so that they can be transferred between nodes in a cluster.	2018-05-25 10:15:10 +01:00
Paweł Dziepak	b2e9491728	tests/idl: test variant being the first member of a structure	2018-05-25 10:15:10 +01:00
Paweł Dziepak	a5731ded98	idl: create variant state in root node Each non-final IDL object is preceded by a frame containing its size. In case of boost::variant there is a frame for the variant itself, an integer determining the active alternative of the variant and a frame of that active alternative. However, if a variant was the first member of a writable stub object the IDL would generate code that would not write the frame for the variant. This is not a very severe issue since there are no such cases right now as C++ type system would not allow such generated code to compile.	2018-05-25 10:15:10 +01:00
Paweł Dziepak	d731cf427d	tests/idl: test serialising and deserialising empty structures	2018-05-25 10:15:10 +01:00
Paweł Dziepak	f719516be8	idl-compiler: avoid unused variable in empty struct deserialisers Deserialisers generated by IDL compiler first create a substream covering the deserialised structure and then skip and read appropriate members. If there are no members the substream will be unused and prompt the compiler to emit a warning.	2018-05-25 10:15:10 +01:00
Paweł Dziepak	fde9e1d55f	tests/mutation_reader: disambiguate freeze() overload freeze() is about to get overloaded so make sure we don't get any ambiguities.	2018-05-25 10:15:10 +01:00
Duarte Nunes	4db0b4af58	Merge 'secondary index: Fixes for tables with multiple clustering columns' from Nadav " This patch series fixes #3405: secondary-index search only provided correct results in certain cases, where entire partitions or contiguous partition slices matched the query. When this was not the case, and individual clustering rows match or do not match the query, the wrong results were returned. To fix this bug, we need to fix the two stages of secondary-index search: 1. In the first stage, we read from the index MV a list of row keys (i.e., primary keys) matching the query. We can no longer remember just the partition keys, and need to keep the list of full primary keys. 2. In the second stage, we have a list of rows (not partitions) and need to read their selected contents to return to the user. Since CQL queries do not have a syntax to select an arbitrary list of rows, we have to add new code to do such a selection. Because we provide an ad-hoc, inefficient, implementation for the row selection described in stage 2, these patches leave two paths in the code: The old path, efficiently selecting entire partitions, and the new path, selecting individual rows. The old path is still used when it is applicable, which is when a partition key column or the first clustering key column is searched. " * 'si-fix-v4' of http://github.com/nyh/scylla: secondary index: test multiple clustering column secondary index: fix wrong results returned in certain cases secondary index: method for fetching list of rows from base table secondary index: method for fetching list of rows from index select_statement.cc: refactor find_index_partition_ranges() select_statement.cc: fix variable lifetime errors	2018-05-24 21:36:18 +01:00
Nadav Har'El	a6d9ea2fb5	secondary index: test multiple clustering column This patch adds a test for secondary indexes on a table which has many columns - two partition key columns, two clustering key columns, and two regular columns. We add a bunch of data in various rows and partitions, index all columns and search on this data and verify the results. This test exposed various bugs in secondary index search, including issue #3405. After we fixed those bugs, the test now passes. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-05-24 15:56:57 +03:00
Nadav Har'El	1b29dd44f7	secondary index: fix wrong results returned in certain cases The current secondary-index search code, in indexed_table_select_statement::do_execute(), begins by fetching a list of partitions, and then the content of these partitions from the base table. However, in some cases, when the table has clustering columns and the search is not on the first one of them, doing this work in partition granularity is wrong, and yields wrong results as demonstrated in issue #3405. So in this patch, we recognize the cases where we need to work in clustering row granularity, and in those cases use the new functions introduced in the previous patches - find_index_clustering_rows() and the execute() variant taking a list of primary-keys of rows. Fixes #3405. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-05-24 15:56:03 +03:00
Nadav Har'El	adf6d742be	secondary index: method for fetching list of rows from base table We add a new variant of select_statement::execute() which allows selecting an arbitrary list of clustering rows. The existing execute() variant can't do that - it can only take a list of partitions, and read the same clustering rows from all of them. The new select variant is not needed for regular CQL queries (which do not have a syntax allowing reading a list of rows with arbitrary primary keys), but we will need it for secondary index search, for solving issue #3405. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-05-24 15:54:36 +03:00
Nadav Har'El	a096a82adc	secondary index: method for fetching list of rows from index We already have a method find_index_partition_ranges(), to fetch a list of partition keys from the secondary index. However, as we shall see in the following patches (and see also issue #3405), getting a list of entire partitions is not always enough - the secondary index actually holds a list of primary keys, which includes clustering keys, and in some queries we can't just ignore them. So this patch provides a new method find_index_clustering_rows(), to query the secondary index and get a list of matching clustering keys. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-05-24 15:53:29 +03:00
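
Note on commits 40cc766cf2 and a19c5cbc16: both describe operations that do a bounded amount of work per call and report progress through stop_iteration, so that the caller decides when to resume. The following is a minimal sketch of that pattern, not the code from those commits. The stop_iteration enum is a local stand-in for the Seastar type, and big_partition, rows_per_step and run_to_completion are hypothetical names introduced only for illustration. A real caller would yield to the reactor between steps, for example via repeat() as shown in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-in for seastar::stop_iteration (assumption: a plain enum here).
enum class stop_iteration { no, yes };

// Hypothetical batch size: how many rows one clear_gently() call may free.
constexpr std::size_t rows_per_step = 128;

// Hypothetical container standing in for a large partition entry.
class big_partition {
    std::vector<int> _rows;
public:
    explicit big_partition(std::size_t n) : _rows(n, 0) {}

    // Frees at most rows_per_step rows, then reports whether the object
    // is fully cleared (yes) or needs to be called again (no).
    stop_iteration clear_gently() noexcept {
        std::size_t n = _rows.size() < rows_per_step ? _rows.size() : rows_per_step;
        assert(n <= _rows.size());
        _rows.resize(_rows.size() - n);
        return _rows.empty() ? stop_iteration::yes : stop_iteration::no;
    }

    std::size_t size() const noexcept { return _rows.size(); }
};

// Plain-loop stand-in for a deferring driver: calls `step` until it returns
// stop_iteration::yes and returns the number of calls made. A reactor-based
// driver would give other tasks a chance to run between calls.
template <typename Step>
std::size_t run_to_completion(Step step) {
    std::size_t calls = 0;
    do {
        ++calls;
    } while (step() == stop_iteration::no);
    return calls;
}
```

Returning a status instead of a future keeps the operation usable from contexts that cannot defer, which is the reason given in 40cc766cf2; such a caller can simply loop until stop_iteration::yes.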

15534 Commits
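
Note on commit a5731ded98: the message describes a variant as an outer size frame, followed by an integer selecting the active alternative, followed by a frame holding that alternative. The sketch below only illustrates that nesting and is not the Scylla IDL wire format. The 32-bit little-endian integers, the choice that a frame's size counts only its payload, and the names put_u32, framed, encode and encode_variant are all assumptions made for the example, and std::variant is used in place of boost::variant.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

using buffer = std::vector<std::uint8_t>;

// Appends a 32-bit little-endian integer (width and byte order are assumptions).
void put_u32(buffer& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
    }
}

// Wraps a payload in a frame: a size prefix followed by the payload bytes.
// Assumption: the size counts the payload only, not the prefix itself.
buffer framed(const buffer& payload) {
    buffer out;
    put_u32(out, static_cast<std::uint32_t>(payload.size()));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Hypothetical encoders for the two alternatives used in this example.
buffer encode(std::uint32_t v) { buffer b; put_u32(b, v); return b; }
buffer encode(const std::string& s) { return buffer(s.begin(), s.end()); }

// Nesting described in the commit message:
// frame { alternative index, frame { active alternative } }.
buffer encode_variant(const std::variant<std::uint32_t, std::string>& v) {
    buffer body;
    put_u32(body, static_cast<std::uint32_t>(v.index()));
    buffer alt = std::visit([](const auto& x) { return encode(x); }, v);
    buffer inner = framed(alt);
    body.insert(body.end(), inner.begin(), inner.end());
    return framed(body);
}
```

The bug fixed by the commit corresponds to omitting the outermost framed() call when the variant is the first member of a structure; a reader that expects the outer frame would then misinterpret the index as a size.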