scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 06:05:53 +00:00

Author	SHA1	Message	Date
Duarte Nunes	69798df95e	query: Limit number of partitions returned This is required to implement a thrift verb. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-06-22 09:48:13 +02:00
Duarte Nunes	594e43a60a	compact_query: Rename partition_limit This patch renames compact_query::_partition_limit to _current_partition_limit for clarity, as the next patch adds a partition limit that limits the number of partitions. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-06-22 09:47:29 +02:00
Duarte Nunes	e9ebd87991	compact_query: Rename limit to row_limit This patch renames compact_query::_limit to _row_limit for clarity, as a subsequent patch introduces yet another limit. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-06-22 09:47:28 +02:00
Duarte Nunes	01b18063ea	query: Add per-partition row limit This patch adds a per-partition row limit. It ensures both local queries and the reconciliation logic abide by this limit. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-06-22 09:46:51 +02:00
Paweł Dziepak	ed12c164f8	mutation_query: make mutation queries streaming-friendly Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:31:28 +01:00
Paweł Dziepak	0828c88b25	mutation_partition: implement streaming-friendly data_query() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:31:19 +01:00
Paweł Dziepak	67ae9457e3	mutation_partition: introduce mutation_querier mutation_querier is a streamed_mutation consumer that adds the mutation content to query::result. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:53 +01:00
Paweł Dziepak	f54e604a16	mutation_partition: introduce compact_for_query compact_for_query is an intermediate stage used to compact data in a flattened stream of mutations before they are consumed by query building consumers. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:53 +01:00
Paweł Dziepak	f95c5542dc	mutation_partition: allow slicing moved mutation_partition Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:51 +01:00
Paweł Dziepak	5a60f6d1ec	range_tombstone: extract is_single_clustering_row_tombstone() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:50 +01:00
Paweł Dziepak	847bf878ec	mutation_partition: add more row::apply() overloads Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:48 +01:00
Duarte Nunes	70083efee2	sstables: Read and write range tombstone bounds This patch uses the composite_marker to add inclusiveness information to the prefixes of a range tombstone. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-06-02 16:21:59 +02:00
Duarte Nunes	7628e403a3	sstables: Drop code for tombstone merging Since Scylla now supports proper range tombstones, the code for reading ranges from sstables and converting them to overlapping tombstones is no longer necessary, and is, in fact, wasteful as the internal representation converts overlapping tombstones back to ranges. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-06-02 16:21:59 +02:00
Duarte Nunes	95594b8171	mutations: Encapsulate row tombstones difference This patch moves the difference between two mutation_partition's row_tombstones inside the range_tombstone_list. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-06-02 16:21:59 +02:00
Duarte Nunes	91aac30f12	mutations: Row tombstones are now a set of ranges This patch changes the type of the mutation partition's row_tombstones to be a range_tombstone_list, so that they are now represented as a set of disjoint ranges. All of its usages are updated accordingly. Fixes #1155 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-06-02 16:21:59 +02:00
Gleb Natapov	5fef0717cc	query: find latest modification timestamp while calculating result digest	2016-05-24 13:27:34 +03:00
Piotr Jastrzebski	23c23abe53	Make memtable mutation_reader slice using clustering ranges. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-05-16 11:46:41 +02:00
Gleb Natapov	b75475de80	query: fix result row counting for results with multiple partitions Message-Id: <1462377579-2419-1-git-send-email-gleb@scylladb.com>	2016-05-04 18:18:15 +02:00
Gleb Natapov	db322d8f74	query: put live row count into query::result The patch calculates row count during result building and while merging. If one of results that are being merged does not have row count the merged result will not have one either.	2016-05-02 15:10:15 +03:00
Tomasz Grabiec	c69d0a8e87	mutation_partition: Fix collection emptiness check Broken by `f15c380a4f`. This resulted in empty collection being returned in the results instead of no collection. Fixes org.apache.cassandra.cql3.validation.entities.CollectionsTest from cassandra-unit-tests.	2016-04-15 18:14:05 +02:00
Tomasz Grabiec	c2b955d40b	mutation_partition: Fix static row being returned when paginating Reproduced by dtest paging_test.py:TestPagingData.static_columns_paging_test. Broken by `f15c380a4f`, where the calculation of has_ck_selector got broken, in such a way that present clustering restrictions were treated as if not present, which resulted in static row being returned when it shouldn't. While at it, unify the check between query_compacted() and do_compact() by extracting it to a function.	2016-04-08 20:53:33 +02:00
Tomasz Grabiec	a1539fed95	mutation_partition: Fix reversed trim_rows() The first erase_and_dispose(), which removes rows between last position and beginning of the next range, can invalidate end() iterator of the range. Fix by looking up end after erasing. mutation_partition::range() was split into lower_bound() and upper_bound() to allow for that. This affects for example queries with descending order where the selected clustering range is empty and falls before all rows. Exposed by `f15c380a4f`, which is now calling do_compact() during query. Reproduced by dtest paging_test.py:TestPagingData.static_columns_paging_test	2016-04-08 20:53:33 +02:00
Avi Kivity	db03295c8a	Merge "Fix query digest mismatch" from Tomasz "Currently data query digest includes cells and tombstones which may have expired or be covered by higher-level tombstones. This causes digest mismatch between replicas if some elements are compacted on one of the nodes and not on others. This mismatch triggers read-repair which doesn't resolve because mutations received by mutation queries are not differing, they are compacted already. The fix adds compacting step before writing and digesting query results by reusing the algorithm used by mutation query. This is not the most optimal way to fix this. The compaction step could be folded with the query writing, there is redundancy in both steps. However such change carries more risk, and thus was postponed. perf_simple_query test (cassandra-stress-like partitions) shows regression from 83k to 77k (7%) ops/s. Fixes #1165."	2016-04-08 12:13:29 +03:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Tomasz Grabiec	f15c380a4f	database: Compact mutations when executing data queries Currently data query digest includes cells and tombstones which may have expired or be covered by higher-level tombstones. This causes digest mismatch between replicas if some elements are compacted on one of the nodes and not on others. This mismatch triggers read-repair which doesn't resolve because mutations received by mutation queries are not differing, they are compacted already. The fix adds compacting step before writing and digesting query results by reusing the algorithm used by mutation query. This is not the most optimal way to fix this. The compaction step could be folded with the query writing, there is redundancy in both steps. However such change carries more risk, and thus was postponed. perf_simple_query test (cassandra-stress-like partitions) shows regression from 83k to 77k (7%) ops/s. Fixes #1165.	2016-04-07 19:56:58 +02:00
Tomasz Grabiec	dc290f0af7	mutation_partition: Make apply() atomic even in case of exception We cannot leave partially applied mutation behind when the write fails. It may fail if memory allocation fails in the middle of apply(). This for example would violate write atomicity, readers should either see the whole write or none at all. This fix makes apply() revert partially applied data upon failure, by the means of ReversiblyMergeable concept. In a nutshell the idea is to store old state in the source mutation as we apply it and swap back in case of exception. At cell level this swapping is inexpensive, just rewiring pointers. For this to work, the source mutation needs to be brought into mutable form, so frozen mutations need to be unfrozen. In practice this doesn't increase amount of cell allocations in the memtable apply path because incoming data will usually be newer and we will have to copy it into LSA anyway. There are extra allocations though for the data structures which hold cells. I didn't see significant change in performance of: build/release/tests/perf/perf_simple_query -c1 -m1G --write --duration 13 The score fluctuates around ~77k ops/s. Fixes #283.	2016-03-21 21:49:52 +01:00
Tomasz Grabiec	e09d186c7c	mutation_partition: Make intrusive sets ReversiblyMergeable	2016-03-21 21:49:52 +01:00
Tomasz Grabiec	e4a576a90f	mutation_partition: Make rows_entry ReversiblyMergeable	2016-03-21 19:26:24 +01:00
Tomasz Grabiec	aadcd75d89	mutation_partition: Make row_marker ReversiblyMergeable	2016-03-21 19:26:24 +01:00
Tomasz Grabiec	ea7c2dd085	mutation_partition: Make row ReversiblyMergeable	2016-03-21 19:26:24 +01:00
Tomasz Grabiec	d5e66a5b0d	mutation_partition: row: Allow storing empty cells internally Currently only "set" storage could store empty cells, but not the "vector" one because there empty cell has the meaning of being missing. To implement rollback, we need to be able to distinguish empty cells from missing ones. Solve by making vector storage use a bitmap for presence checking instead of emptiness. This adds 4 bytes to vector storage.	2016-03-21 18:41:27 +01:00
Tomasz Grabiec	ed1e6515db	mutation_partition: Make row::merge() tolerate empty row The row may be empty and still have a set storage, in which case rbegin() dereference is undefined behavior.	2016-03-21 18:41:27 +01:00
Tomasz Grabiec	518e956736	mutation_partition: Make row::vector_to_set() exception-safe Currently allocation failure can leave the old row in a half-moved-from state and leak cell_entry objects.	2016-03-18 22:30:04 +01:00
Tomasz Grabiec	c91eefa183	mutation_partition: Unmark cell_entry's copy constructor as noexcept It was a mistake, it certainly may throw because it copies cells.	2016-03-18 22:30:04 +01:00
Paweł Dziepak	21e2ebcf8c	query: build only result, only digest or both Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-11 18:27:13 +00:00
Paweł Dziepak	46079f763b	query: add keys and tombstones to result digest Query result digest is used to verify that all replicas have the same data. Therefore, it needs to contain more information than the query result itself in order to ensure proper detection of disagreements. Generally, adding clustering keys to the digest regardless of whether the client asked for them will guarantee correctness. However, adding tombstones as well improves the chances of early detection of nodes containing stale data. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-11 18:27:13 +00:00
Paweł Dziepak	c1f7f11d54	mutation_partition: do not add ck to result when not asked to Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-11 18:27:13 +00:00
Amnon Heiman	1c7bc28d35	idl-compiler: change optional vector implementation This patch changes the way optional vectors are implemented. Now a vector of optionals is handled like any other non-primitive type, with a single method add() that returns a writer to the optional. The writer to the optional has skip and write methods, like a simple optional field. For basic types the write method takes the value as a parameter; for composite types, it returns a writer to the type. Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <1456796143-3366-2-git-send-email-amnon@scylladb.com>	2016-03-01 09:41:30 +02:00
Tomasz Grabiec	6cec131432	query: Switch to IDL-generated views and writers The query result footprint for cassandra-stress mutation as reported by tests/memory-footprint increased by 18% from 285 B to 337 B. perf_simple_query shows slight regression in throughput (-8%): build/release/tests/perf/perf_simple_query -c4 -m1G --partitions 100000 Before: ~433k tps After: ~400k tps	2016-02-26 12:26:13 +01:00
Tomasz Grabiec	4284715ddf	Relax includes	2016-02-26 12:26:13 +01:00
Tomasz Grabiec	a921479e71	Merge tag '807-v3' from https://github.com/avikivity/scylla From Avi: This patchset introduces a linearization context for managed_bytes objects. Within this context, any scattered managed_bytes (found only in lsa regions, so limited to memtable and cache) are auto-linearized for the lifetime of the context. This ensures that key and value lookups can use fast contiguous iterators instead of using slow discontiguous iterators (or crashing, as is the case now).	2016-02-16 14:29:48 +01:00
Avi Kivity	13144ea9eb	managed_bytes: get rid of explicit linearize/scatter Now that everything is in a linearization context, we don't need to explicitly gather data.	2016-02-16 14:37:46 +02:00
Tomasz Grabiec	63006e5dd2	query: Serialize collection cells using CQL format We want the format of query results to be eventually defined in the IDL and be independent of the format we use in memory to represent collections. This change is a step in this direction. The change decouples format of collection cells in query results from our in-memory representation. We currently use collection_mutation_view, after the change we will use CQL binary protocol format. We use that because it requires less transformations on the coordinator side. One complication is that some list operations need to retrieve keys used in list cells, not only values. To satisfy this need, new query option was added called "collections_as_maps" which will cause lists and sets to be reinterpreted as maps matching their underlying representation. This allows the coordinator to generate mutations referencing existing items in lists.	2016-02-15 17:05:55 +01:00
Tomasz Grabiec	036974e19b	Make mutation interfaces support multiple versions Schema is tracked in memtable and cache per-entry. Entries are upgraded lazily on access. Incoming mutations are upgraded to table's current schema on given shard. Mutating nodes need to keep schema_ptr alive in case schema version is requested by target node.	2016-01-11 10:34:51 +01:00
Tomasz Grabiec	f59ec59abc	mutation: Implement upgrade() Converts mutation to a new schema.	2016-01-08 21:10:26 +01:00
Tomasz Grabiec	ade5cf1b4b	mutation_partition: Make visitable with mutation_partition_visitor	2016-01-08 21:10:25 +01:00
Tomasz Grabiec	6f955e1290	mutation_partition: Make equal() work with different schemas	2016-01-08 21:10:25 +01:00
Tomasz Grabiec	ff3a2e1239	mutation_partition: Drop row tombstones in do_compact()	2016-01-08 21:10:25 +01:00
Raphael S. Carvalho	03eee06784	remove empty rows in mutation_partition::do_compact do_compact() wasn't removing an empty row that is covered by a tombstone. As a result, an empty partition could be written to a sstable. To solve this problem, let's make trim_rows remove a row that is considered to be empty. A row is empty if it has no tombstone, no marker and no cells. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-05 15:19:21 +01:00
Calle Wilund	8c17e9e26c	mutation_partition: Do not return static row if CK range does not match Fixes #589 If we got no rows, but have live static columns, we should only give them back IFF we did not have any CK restrictions. If ck:s exist, and we have a restriction on them, we either have matching rows, or return nothing, since cql does not allow "is null".	2015-12-21 10:38:48 +00:00

117 Commits