scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 02:20:37 +00:00

Author	SHA1	Message	Date
Glauber Costa	c2f49da609	partition: add method to calculate memory size of a partition Once that is added, also add a method to a memtable entry to calculate the entire size of a memtable entry. Right now we only have one method to calculate the size minus rows. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2017-11-08 16:21:44 -05:00
Tomasz Grabiec	967cabcaf2	mvcc: Make the null state of partition_snapshot::change_mark explicit	2017-11-02 11:05:19 +01:00
Tomasz Grabiec	4b7933543d	mvcc: Add partition_snapshot::region() getter	2017-11-02 11:05:19 +01:00
Tomasz Grabiec	9cf30f19ae	mvcc: Add partition_snapshot::schema() getter	2017-11-02 11:05:19 +01:00
Tomasz Grabiec	b6ae5783cd	mvcc: Introduce partition_entry::evict() The operation frees as much memory as possible, marking affected mutation elements as discontinuous.	2017-09-13 17:47:03 +02:00
Tomasz Grabiec	cda86abdbc	mvcc: Encapsulate reference stability check in partition_snapshot	2017-09-13 17:38:08 +02:00
Tomasz Grabiec	2df6f356b1	mvcc: Store LSA region reference in partition_snapshot Will be useful for improving encapsulation.	2017-09-13 17:38:08 +02:00
Piotr Jastrzebski	896bf2e5de	Remove unused methods from MVCC Some apply methods were replaced by apply_to_incomplete(). Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	6ebfb730ee	partition_entry: Introduce partition_tombstone() getter	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	b680de930c	partition_entry: Introduce apply_to_incomplete() Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> [tgrabiec: - extracted from a larger commit - fix heap comparator in apply_incomplete_target to order versions properly - extracted partition_version detaching into partition_entry::with_detached_versions() - dropped unnecessary rows_iterator::_version field - dropped unnecessary allocation of rows_entry and key copies in rows_iterator - dropped row_pointer - replaced apply_reversibly() with weaker and faster apply() - added handling of dummy entries at any position - fixed exception safety issue in apply_to_incomplete() which may result in data loss. We cannot move data out of applied versions into a new synthetic row and then apply it, because if exception happens in the middle, the data which was moved from the source will be lost. To fix that, row_iterator::consume_row() is introduced which allows in-place consumption of data without construction of temporary deletable_row. ]	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	b6ce963200	partition_version: Introduce partition_entry::with_detached_versions()	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	2d8f024e4d	partition_version: Document version merging rules on partition_entry	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	64626b32b0	row_cache: Make printable	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	639af55a78	partition_version: Add versions() getter [tgrabiec: Use explicit return type]	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	1d3fec43eb	partition_version: Make return type of versions() explicit	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	22a0e301f1	partition_version: Make is_referenced() const-qualified	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	b47c8f1df7	partition_snapshot: Add const-qualified overload of version() [tgrabiec: Extracted from a different patch]	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	dd9d35c166	partition_snapshot: Add getter for range tombstones	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	60c3c0a471	partition_entry: Add squashed() overload with a single schema	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	98f7671553	partition_snapshot: Introduce squashed()	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	87b0f11be3	partition_snapshot: Add getters for static row and partition tombstone [tgrabiec: - Extracted from a different patch - Renamed concept names to more familiar Map and Reduce - Renamed aggregate() to squashed() to match the existing nomenclature - Uncommented the concepts ]	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	ea59b9475e	partition_version: Add const-qualified variant of operator-> [tgrabiec: Extracted from a different patch]	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	f6fe0acea4	partition_version: Make operator bool() const-qualified [tgrabiec: Extracted from a different patch]	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	2fdabcaa9b	Track population phase in partition_snapshot This will be used by partial cache in later patches. [tgrabiec: - changed title, - documented meaning of the variable, - renamed the variable, - introduced open_version(), - fixed continuity of the static row not being preserved in case a new version is created] Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	9642f806ab	partition_version: Introduce version() getter	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	05b56fcfb0	mutation_partition: Add support for specifying continuity This will allow expressing lack of information about certain ranges of rows (including the static row), which will be used in cache to determine if information in cache is complete or not. Continuity is represented internally using flags on row entries. The key range between two consecutive entries is continuous iff rows_entry::continuous() is true for the later entry. The range starting after the last entry is assumed to be continuous. The range corresponding to the key of the entry is continuous iff rows_entry::dummy() is false. [tgrabiec: - based on the following commits: 4a5bf75 - Piotr Jastrzebski : mutation_partition: introduce dummy rows_entry 773070e - Piotr Jastrzebski : mutation_partition: add continuity flag to rows_entry - documented that partition tombstone is always complete - require specifying the partition tombstone when creating an incomplete entry - replaced rows_entry(dummy_tag, ...) constructor with more general rows_entry(position_in_partition, ...) - documented continuity semantics on mutation_partition - fixed _static_row_cached being lost by mutation_partition copy constructors - fixed conversion to streamed_mutation to ignore dummy entries - fixed mutation_partition serializer to drop dummy entries - documented semantics of continuity on mutation_partition level - dropped assumptions that dummy entries can be only at the last position - changed equality to ignore continuity completely, rather than partially (it was not ignoring dummy entries, but ignoring continuity flag) - added printout of continuity information in mutation_partition - fixed handling of empty entries in apply_reversibly() with regards to continuity; we no longer can remove empty entries before merging, since that may affect continuity of the right-hand mutation. Added _erased flag. 
- fixed mutation_partition::clustered_row() with dummy==true to not ignore the key - fixed partition_builder to not ignore continuity - renamed dummy_tag_t to dummy_tag. _t suffix is reserved. - standardized all APIs on is_dummy and is_continuous bool_class:es - replaced add_dummy_entry() with ensure_last_dummy() with safer semantics - dropped unused remove_dummy_entry() - simplified and inlined cache_entry::add_dummy_entry() - fixed mutation_partition(incomplete_tag) constructor to mark all row ranges as discontinuous ]	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	6cf2841654	mvcc: Extract partition_snapshot_reader to separate header Right now the whole world includes it transitively, which results in painful recompiles when the code changes. Relax dependencies. Message-Id: <1495620201-8046-1-git-send-email-tgrabiec@scylladb.com>	2017-05-24 12:13:15 +01:00
Tomasz Grabiec	892d4a2165	db: Enable creating forwardable readers via mutation_source Right now all mutation source implementations will use make_forwardable() wrapper.	2017-02-23 18:50:44 +01:00
Tomasz Grabiec	acfad565f0	partition_version: Refactor make_partition_snapshot_reader() overloads So that streamed_mutation is created in only one of the overloads and others delegate to that one. Later there will be common logic added to the construction and doing this will help avoid a duplication.	2017-02-23 18:23:52 +01:00
Tomasz Grabiec	fcf3391785	partition_snapshot_reader: Emit only relevant tombstones Refs #1254.	2017-02-13 16:12:15 +01:00
Paweł Dziepak	354ce0b2c7	mutation_fragment: make write access more explicit mutation_fragments are going to be caching their size in memory. In order to be able to invalidate that correctly, they need to know when that size may change (but avoid invalidation when it is not necessary).	2017-02-09 10:49:46 +00:00
Avi Kivity	18df2d9e9e	partition_version: fix const correctness in rows_entry_compare Using a non-const-correct comparator results in build failures with boost 1.55. Fixes #1892. Message-Id: <20161128104335.28789-1-avi@scylladb.com>	2016-11-28 10:55:12 +00:00
Paweł Dziepak	f16d6f9c40	partition_version: make sure that snapshot is destroyed under LSA Snapshot destructor may free some objects managed by the LSA. That's why partition_snapshot_reader destructor explicitly destroys the snapshot it uses. However, it was possible that an exception thrown by _read_section prevented that from happening, leaving the snapshot to be destroyed implicitly without the current allocator set to LSA. Refs #1831. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1478778570-2795-1-git-send-email-pdziepak@scylladb.com>	2016-11-10 13:13:10 +01:00
Piotr Jastrzebski	27726cecff	Clean up position_in_partition. Introduce position_in_partition_view and use it in position() method in mutation_fragment, range_tombstone, static_row and clustering_row. Clean up comparators in position_in_partition. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <c65293c71a6aa23cf930ed317fb63df1fdc34fd1.1477399763.git.piotr@scylladb.com>	2016-10-25 15:13:20 +01:00
Glauber Costa	1db245b52d	add accounting of memory read to partition_snapshot_reader By default, we don't do any accounting. By specializing this class and providing an accounter class, we can account how much memory we are reading as we read through the elements. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-10-04 10:39:10 -04:00
Glauber Costa	452eb95943	move partition_snapshot_reader code to header file This is so we can template it without worrying about declaring the specializations in the .cc file. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-10-04 10:39:10 -04:00
Piotr Jastrzebski	b05b90b3a5	Introduce clustering_key_filter_ranges. This fixes the problem of multiple concurrent get_ranges calls. Previously each call was invalidating the result of the previous call. Now they don't step on each other's toes. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-08-30 19:46:38 +02:00
Paweł Dziepak	5cae44114f	partition_version: handle errors during version merge Currently, partition snapshot destructor can throw which is a big no-no. The solution is to ignore the exception and leave versions unmerged and hope that subsequent reads will succeed at merging. However, another problem is that the merge doesn't use allocating sections which means that memory won't be reclaimed to satisfy its needs. If the cache is full this may result in partition versions not being merged for a very long time. This patch introduces partition_snapshot::merge_partition_versions() which contains all the version merging logic that was previously present in the snapshot destructor. This function may throw so that it can be used with allocating sections. The actual merging and handling of potential errors is done from partition_snapshot_reader destructor. It tries to merge versions under the allocating section. Only if that fails it gives up and leaves them unmerged. Fixes #1578 Fixes #1579. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1471265544-23579-1-git-send-email-pdziepak@scylladb.com>	2016-08-15 15:56:53 +03:00
Paweł Dziepak	db5ea591ad	add mvcc implementation for mutation_partitions To ensure isolation of operation when streaming a mutation from a mutable source (such as cache or memtable) MVCC is used. Each entry in memtable or cache is actually a list of used versions of that entry. Incoming writes are either applied directly to the last version (if it wasn't being read by anyone) or prepended to the list (if the former head was being read by someone). When a reader finishes it tries to squash versions together provided there is no other reader that could prevent this. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:51 +01:00
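
The version-list scheme described in commit db5ea591ad can be illustrated with a toy model. This is a minimal sketch, not the ScyllaDB implementation: `toy_partition`, `toy_version` and `toy_entry` are hypothetical names, partition data is reduced to a string map, and reader tracking is a plain counter. It assumes a reader's logical view is its pinned version plus all older versions, so an older version may be folded into its newer neighbour only when no reader is pinned to the older one.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <map>
#include <string>

// Hypothetical stand-in for partition contents: key -> value.
using toy_partition = std::map<std::string, std::string>;

struct toy_version {
    toy_partition data;
    int readers = 0;  // number of readers currently pinned to this version
};

class toy_entry {
    std::list<toy_version> _versions;  // front() is the newest version

    // Fold every unreferenced older version into its newer neighbour.
    void squash() {
        auto newer = _versions.begin();
        auto older = std::next(newer);
        while (older != _versions.end()) {
            if (older->readers == 0) {
                // map::insert keeps existing keys, so newer values win.
                newer->data.insert(older->data.begin(), older->data.end());
                older = _versions.erase(older);
            } else {
                newer = older;
                ++older;
            }
        }
    }

public:
    toy_entry() { _versions.emplace_front(); }

    // Write into the newest version, or prepend a fresh version
    // if the current head is being read by someone.
    void apply(const std::string& key, const std::string& value) {
        if (_versions.front().readers > 0) {
            _versions.emplace_front();
        }
        _versions.front().data[key] = value;
    }

    // Start reading: pin the newest version.
    toy_version* read() {
        toy_version& v = _versions.front();
        ++v.readers;
        return &v;
    }

    // Finish reading: unpin, then try to squash versions together.
    // The pointer must not be used after this call.
    void release(toy_version* v) {
        assert(v->readers > 0);
        --v->readers;
        squash();
    }

    std::size_t version_count() const { return _versions.size(); }

    // Merged view of all versions, oldest first so newer values override.
    toy_partition squashed() const {
        toy_partition result;
        for (auto it = _versions.rbegin(); it != _versions.rend(); ++it) {
            for (const auto& kv : it->data) {
                result[kv.first] = kv.second;
            }
        }
        return result;
    }
};
```

In this sketch a write issued while a reader is active leaves the reader's version untouched and grows the list to two versions; once the reader calls `release()`, the list collapses back to a single version holding the newest values.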

39 Commits
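
The continuity rules stated in commit 05b56fcfb0 can be sketched as a small lookup. This is an illustrative model only: `toy_rows_entry` and `is_continuous_at` are hypothetical names, clustering keys are reduced to integers, and the sketch assumes the range before the first entry is governed by that entry's continuity flag, which the commit message does not spell out.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified row entry with integer keys.
struct toy_rows_entry {
    int key;
    bool continuous;  // is the range between the previous entry and this key fully known?
    bool dummy;       // true: marks a position only and carries no row data
};

// Returns whether information about position `pos` is complete.
// `entries` must be sorted by key.
bool is_continuous_at(const std::vector<toy_rows_entry>& entries, int pos) {
    assert(std::is_sorted(entries.begin(), entries.end(),
                          [](const toy_rows_entry& a, const toy_rows_entry& b) {
                              return a.key < b.key;
                          }));
    // First entry whose key is not less than pos.
    auto it = std::lower_bound(entries.begin(), entries.end(), pos,
                               [](const toy_rows_entry& e, int p) {
                                   return e.key < p;
                               });
    if (it == entries.end()) {
        // The range after the last entry is assumed to be continuous.
        return true;
    }
    if (it->key == pos) {
        // The entry's own key is continuous iff the entry is not a dummy.
        return !it->dummy;
    }
    // pos falls before this entry: the flag of the later entry decides.
    return it->continuous;
}
```

With entries at keys 10 (continuous), 20 (not continuous) and 30 (continuous, dummy), positions between 10 and 20 count as unknown, key 20 itself is known, and key 30 is unknown because only a dummy entry sits there.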