scylladb

Author	SHA1	Message	Date
Duarte Nunes	b27da688f9	mutation: Remove dead get_cell() function Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170316234843.23130-1-duarte@scylladb.com>	2017-03-17 11:18:23 +02:00
Tomasz Grabiec	cbf4601e31	streamed_mutation: Add non-owning variant of mutation_from_streamed_mutation()	2017-02-23 18:50:53 +01:00
Piotr Jastrzebski	4bbe05dd47	mutation_partition: take schema in find_row and clustered_row This will allow intrusive set implementation that does not store schema. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-01-05 11:26:03 +01:00
Avi Kivity	1d9ee358f1	Revert "Merge "Reduce the size of mutation_partition" from Piotr" This reverts commit aa392810ff, reversing changes made to a24ff47c637e6a5fd158099b8a65f1191fc2d023; it uses boost::intrusive::detail directly, which it must not, and doesn't compile on all boost versions as a consequence.	2016-12-25 16:07:48 +02:00
Piotr Jastrzebski	2af6ff68d9	mutation_partition: take schema in find_row and clustered_row This will allow intrusive set implementation that does not store schema. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-12-23 11:29:07 +01:00
Asias He	e5485f3ea6	Get rid of query::partition_range Use dht::partition_range instead	2016-12-19 08:09:25 +08:00
Paweł Dziepak	15de8de9e5	reconcilable_result: keep result_memory_tracker object Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-12-14 14:10:02 +00:00
Paweł Dziepak	948c062e64	mutation_rebuilder: use memory_usage() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-11-18 11:25:36 +00:00
Paweł Dziepak	ef57b9a26f	rename memory_usage() to external_memory_usage() where applicable Renaming the function to external_memory_usage() makes it clear that sizeof(T) is not included, something that was a source of confusion in the past. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-11-18 11:25:36 +00:00
Piotr Jastrzebski	9d33948487	mutation_rebuilder: fix fragment size calculation It wasn't calculating the size of data correctly. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <c03dfff7bf1ca3199991e5864189f98bfa2942ea.1479397736.git.piotr@scylladb.com>	2016-11-17 16:23:42 +00:00
Tomasz Grabiec	ecf85cbffb	mutation: Define + operation It's more convenient to write m1 + m2 in tests than to do more elaborate constructs with copy constructors and apply().	2016-10-18 11:16:08 +02:00
Piotr Jastrzebski	0d39bb1ad0	Implement mutation_from_streamed_mutation_with_limit If mutation is bigger than this limit it won't be read and mutation_from_streamed_mutation will return empty optional. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-07-21 09:35:23 +02:00
Paweł Dziepak	93cc4454a6	streamed_mutation: emit range_tombstones directly Originally, streamed_mutations guaranteed that emitted tombstones are disjoint. In order to achieve that two separate objects were produced for each range tombstone: range_tombstone_begin and range_tombstone_end. Unfortunately, this forced sstable writer to accumulate all clustering rows between range_tombstone_begin and range_tombstone_end. However, since there is no need to write disjoint tombstones to sstables (see #1153 "Write range tombstones to sstables like Cassandra does") it is also not necessary for streamed_mutations to produce disjoint range tombstones. This patch changes that by making streamed_mutation produce range_tombstone objects directly. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-13 09:51:18 +01:00
Duarte Nunes	01b18063ea	query: Add per-partition row limit This patch adds a per-partition row limit. It ensures both local queries and the reconciliation logic abide by this limit. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-06-22 09:46:51 +02:00
Paweł Dziepak	48e08fa997	mutation: add mutation_from_streamed_mutation() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:49 +01:00
Avi Kivity	db03295c8a	Merge "Fix query digest mismatch" from Tomasz "Currently data query digest includes cells and tombstones which may have expired or be covered by higher-level tombstones. This causes digest mismatch between replicas if some elements are compacted on one of the nodes and not on others. This mismatch triggers read-repair which doesn't resolve because mutations received by mutation queries are not differing, they are compacted already. The fix adds compacting step before writing and digesting query results by reusing the algorithm used by mutation query. This is not the most optimal way to fix this. The compaction step could be folded with the query writing, there is redundancy in both steps. However such change carries more risk, and thus was postponed. perf_simple_query test (cassandra-stress-like partitions) shows regression from 83k to 77k (7%) ops/s. Fixes #1165."	2016-04-08 12:13:29 +03:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Tomasz Grabiec	f15c380a4f	database: Compact mutations when executing data queries Currently data query digest includes cells and tombstones which may have expired or be covered by higher-level tombstones. This causes digest mismatch between replicas if some elements are compacted on one of the nodes and not on others. This mismatch triggers read-repair which doesn't resolve because mutations received by mutation queries are not differing, they are compacted already. The fix adds compacting step before writing and digesting query results by reusing the algorithm used by mutation query. This is not the most optimal way to fix this. The compaction step could be folded with the query writing, there is redundancy in both steps. However such change carries more risk, and thus was postponed. perf_simple_query test (cassandra-stress-like partitions) shows regression from 83k to 77k (7%) ops/s. Fixes #1165.	2016-04-07 19:56:58 +02:00
Tomasz Grabiec	87d7279267	mutation: Add copy assignment operator We already have a copy constructor, so can have copy assignment as well.	2016-03-21 18:41:27 +01:00
Paweł Dziepak	82d2a2dccb	specify whether query::result, result_digest or both are needed Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-11 18:27:13 +00:00
Tomasz Grabiec	4284715ddf	Relax includes	2016-02-26 12:26:13 +01:00
Tomasz Grabiec	036974e19b	Make mutation interfaces support multiple versions Schema is tracked in memtable and cache per-entry. Entries are upgraded lazily on access. Incoming mutations are upgraded to table's current schema on given shard. Mutating nodes need to keep schema_ptr alive in case schema version is requested by target node.	2016-01-11 10:34:51 +01:00
Tomasz Grabiec	f59ec59abc	mutation: Implement upgrade() Converts mutation to a new schema.	2016-01-08 21:10:26 +01:00
Tomasz Grabiec	6f955e1290	mutation_partition: Make equal() work with different schemas	2016-01-08 21:10:25 +01:00
Calle Wilund	284b10cabe	Make partition_slice::row_ranges multiplex on partition Allows for having more than one clustering row range set, depending on PK queried (although right now limited to one - which happens to be exactly the number of multiplexing paging needs... What a coincidence...) Encapsulates the row_ranges member in a query function, and if needed holds ranges outside the default one in an extra object. Query result::builder::add_partition now fetches the correct row range for the partition, and this is the range used in subsequent iteration.	2015-11-10 13:12:33 +01:00
Avi Kivity	2c3591cbd9	data_value de-any-fication We use boost::any to convert to and from database values (stored in serialized form) and native C++ values. boost::any captures information about the data type (how to copy/move/delete etc.) and stores it inside the boost::any instance. We later retrieve the real value using boost::any_cast. However, data_value (which has a boost::any member) already has type information as a data_type instance. By teaching data_type instances about the corresponding native type, we can eliminate the use of boost::any. While boost::any is evil and eliminating it improves efficiency somewhat, the real goal is growing native type support in data_type. We will use that later to store native types in the cache, enabling O(log n) access to collections, O(1) access to tuples, and more efficient large blob support.	2015-10-30 17:38:51 +01:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Paweł Dziepak	11efd5c639	mutation: store mutation data externally Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-09-03 10:29:53 +02:00
Paweł Dziepak	868f5d91df	mutation: avoid moving mutation_partition By passing mutation_partition either by const ref or rref instead of by value one move can be avoided if copying is necessary. Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-09-03 10:29:32 +02:00
Pekka Enberg	aca4c0d2bb	mutation: Avoid a copy in set_cell() and others Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-08-24 09:06:13 +03:00
Tomasz Grabiec	59b91fe350	mutation: Introduce slice() helper	2015-07-22 13:13:38 +02:00
Tomasz Grabiec	4248d87544	Introduce mutation_decorated_key_less_comparator	2015-07-22 13:13:38 +02:00
Tomasz Grabiec	33ca07af33	mutation: Introduce live_row_count()	2015-07-15 18:56:10 +02:00
Tomasz Grabiec	9724b84bb3	db: Fix query of partitions with no live clustered rows When partition has no live regular rows, but has some data live in the static row, then it should appear in the results, even though we didn't select any static column. To reproduce: create table cf (k blob, c blob, v blob, s1 blob static, primary key (k, c)); update cf set s1 = 0x01 where k = 0x01; update cf set s1 = 0x02 where k = 0x02; select k from cf; The "select" statement should return 2 rows, but was returning 0. The following query worked fine, because static columns were included: select * from cf; The data query should contain only live data, so we shouldn't write a partition entry if it's supposed to be absent from the results. We can't tell that though until we've processed all the data. To solve this problem, query result writer is using an optimistic approach, where the partition header will be retracted from the buffer (cheaply), if it turns out there's no live data in it.	2015-07-09 19:55:00 +02:00
Tomasz Grabiec	09ed972068	mutation_partition: Remove redundant slice parameter from query() The slice used by partition_writer must match the one used by query() anyway.	2015-07-09 19:47:32 +02:00
Paweł Dziepak	183b6fc6d9	db: do not return already expired cells in queries Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-07-02 17:25:41 +02:00
Tomasz Grabiec	464434d3d4	mutation: Add query() method	2015-07-02 14:51:28 +02:00
Avi Kivity	a2fa63e09b	db: add another mutation constructor	2015-06-03 12:35:13 +03:00
Tomasz Grabiec	f656ae8ed4	db: Encapsulate deletable_row fields	2015-05-13 08:56:54 +02:00
Tomasz Grabiec	dbc40dfb09	db: Encapsulate the "row" class Reduces coupling. Users should not rely on the fact that it's an std::map<>. It also allows us to extend row's interface with domain-specific methods, which are a lot easier to discover than free functions.	2015-05-13 08:56:54 +02:00
Tomasz Grabiec	6047b6d9b2	mutation: Introduce set_static_cell()	2015-05-08 09:19:01 +02:00
Tomasz Grabiec	b1e45e4401	db: Store ttl in atomic_cell Origin does that, so should we. Both ttl and expiry time are stored in sstables. The value of ttl seems to be used to calculate the read digest (expiry is not used for that). The API for creating atomic_cells changed a bit. To create a non-expiring cell: atomic_cell::make_live(timestamp, value); To create an expiring cell: atomic_cell::make_live(timestamp, value, expiry, ttl); or: // Expiry is calculated based on current clock reading atomic_cell::make_live(timestamp, value, ttl_optional);	2015-05-06 19:42:38 +02:00
Tomasz Grabiec	5ba1486ae7	db: Rename "ttl" to "expiry" when it's used as time point To avoid confusion with "ttl" the duration.	2015-05-06 17:27:22 +02:00
Tomasz Grabiec	04846ed3d2	mutation: Make mutation equality comparable	2015-05-06 16:40:48 +02:00
Avi Kivity	3a0de14aa8	db: more const correctness for column_family and component types Ensure that read-side accessors are const. This is important in preparation for multiple memtables (and later, sstables) since a read-side mutation_partition may be a temporary object coming from multiple memtables (and sstables) while a write-side mutation_partition is guaranteed to belong to a single memtable (and thus, not be temporary). Since writers will want non-const mutation_partitions to write to, they won't be able to use the read-side accessors by accident.	2015-05-05 19:37:21 +03:00
Tomasz Grabiec	aec740f895	db: Make decorated_key have ordering compatible with Origin	2015-04-30 12:02:39 +02:00
Pekka Enberg	00510d610b	database: Add set_clustered_cell() variant Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-04-27 11:39:57 +03:00
Tomasz Grabiec	5a7e3d3278	db: Order partitions by decorated_key Partitions should be ordered using Origin's ordering, which is first by token, then by Origin's representation of the key. That is the natural ordering of decorated_key. This also changes mutation class to hold decorated_key, to avoid decoration overhead at different layers.	2015-04-24 18:01:01 +02:00
Tomasz Grabiec	1c3275c950	mutation: Encapsulate fields	2015-04-24 18:01:01 +02:00
Tomasz Grabiec	0d4821009c	db: Move mutation and mutation_partition to separate headers and compilation units	2015-04-22 18:42:33 +02:00

50 Commits