scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-07 15:33:15 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	b26ce36d4b	mvcc: Introduce partition_snapshot::static_row_continuous()	2017-12-08 17:50:47 +01:00
Tomasz Grabiec	c283744fcb	mvcc: Introduce partition_snapshot::range_tombstones() for full range	2017-12-08 17:50:47 +01:00
Tomasz Grabiec	df964c70f8	mvcc: Don't require external schema in partition_snapshot::range_tombstones()	2017-12-08 17:50:47 +01:00
Tomasz Grabiec	5541c9fd63	mutation_partition: Define equal_continuity() using get_continuity() This fixes the problem of equal_continuity() being prone to false positives due to redundant information (extra dummy rows) present in one of the partitions. get_continuity() is minified, so is not prone to this.	2017-12-08 12:01:27 +01:00
Tomasz Grabiec	bde050835f	mutation_partition: Make check_continuity() const-qualified	2017-12-08 12:01:27 +01:00
Tomasz Grabiec	f9257886cb	mutation_partition: Make check_continuity() public	2017-12-08 12:01:27 +01:00
Tomasz Grabiec	865bd8a594	mutation_partition: Introduce mutation_partition::get_continuity() Intended to be used in tests.	2017-12-08 12:01:27 +01:00
Tomasz Grabiec	7e5d243a95	Introduce clustering_interval_set Will make it easy to represent and manipulate continuity in tests. Could also replace clustering_row_ranges in the future, which is currently a naked vector<> with no semantic methods.	2017-12-08 12:01:27 +01:00
Tomasz Grabiec	22138554e6	mutation_partition: Leave moved-from row in an empty state Needed by apply_monotonically(). Fixes SIGSEGV in mutation_test_g.	2017-12-08 12:01:27 +01:00
Tomasz Grabiec	a305a28574	mutation_partition: Fix upgrade() not preserving static row continuity We do not rely on this yet, but will.	2017-12-08 12:01:27 +01:00
Paweł Dziepak	051cbbc9af	Merge "Fix range tombstone emitting which led to skipping over data" from Tomasz "Fixes cache reader to not skip over data in some cases involving overlapping range tombstones in different partition versions and discontinuous cache. Introduced in 2.0 Fixes #3053." * tag 'tgrabiec/fix-range-tombstone-slicing-v2' of github.com:scylladb/seastar-dev: tests: row_cache: Add reproducer for issue #3053 tests: mvcc: Add test for partition_snapshot::range_tombstones() mvcc: Optimize partition_snapshot::range_tombstones() for single version case mvcc: Fix partition_snapshot::range_tombstones() tests: random_mutation_generator: Do not emit dummy entries at clustering row positions	2017-12-08 10:27:17 +00:00
Tomasz Grabiec	4cc4c661f3	tests: row_cache: Add reproducer for issue #3053 The issue is that partition_snapshot::range_tombstones() is deoverlapping tombstones coming from different versions, and it may happen that due to range tombstone splitting that function will return a tombstone which starts after the requested range. This breaks assumptions made by the cache reader. It keeps track of the maximum fragment position, and if cache reader will then need to read from sstables due to a miss, it would do so starting from the position marked by that out of range tombstone, possibly skipping over some rows.	2017-12-08 10:15:58 +01:00
Tomasz Grabiec	b6f4637aec	tests: mvcc: Add test for partition_snapshot::range_tombstones()	2017-12-08 10:15:58 +01:00
Tomasz Grabiec	183554cbc4	mvcc: Optimize partition_snapshot::range_tombstones() for single version case	2017-12-08 10:15:58 +01:00
Tomasz Grabiec	1303320377	mvcc: Fix partition_snapshot::range_tombstones() partition_snapshot::range_tombstones() is deoverlapping tombstones coming from different versions and it may happen that due to range tombstone splitting the method will return a tombstone which starts after the requested range. This would cause it to return a tombstone which doesn't overlap with the requested range. This breaks assumptions made by cache reader. It keeps track of the maximum fragment position, and if cache reader will then need to read from sstables due to a miss, it would do so starting from the position marked by that out of range tombstone, possibly skipping over some rows. Exposed by a change in row_cache_test.cc::test_mvcc() which fills the buffer of sm5 reader after it is created. Fixes #3053.	2017-12-08 10:15:58 +01:00
Tomasz Grabiec	89e3b734ed	tests: random_mutation_generator: Do not emit dummy entries at clustering row positions It is assumed that dummy entries are only at !is_clustering_row() positions. Causes cache_streamed_mutation to assert when trying to trim a range tombstone.	2017-12-07 20:20:37 +01:00
Avi Kivity	d934ca55a7	Merge "SSTable resharding fixes" from Raphael "Didn't affect any release. Regression introduced in `301358e`. Fixes #3041" * 'resharding_fix_v4' of github.com:raphaelsc/scylla: tests: add sstable resharding test to test.py tests: fix sstable resharding test sstables: Fix resharding by not filtering out mutation that belongs to other shard db: introduce make_range_sstable_reader rename make_range_sstable_reader to make_local_shard_sstable_reader db: extract sstable reader creation from incremental_reader_selector db: reuse make_range_sstable_reader in make_sstable_reader	2017-12-07 16:42:48 +02:00
Amos Kong	8fd5d27508	dist/debian: add scylla-tools-core to depends list Signed-off-by: Amos Kong <amos@scylladb.com> Message-Id: <db39cbda0e08e501633556ab238d816e357ad327.1512646123.git.amos@scylladb.com>	2017-12-07 13:40:10 +02:00
Amos Kong	eb3b138ee2	dist/redhat: add scylla-tools-core to requires list Fixes #3051 Signed-off-by: Amos Kong <amos@scylladb.com> Message-Id: <f7013a4fbc241bb4429d855671fee4b845b255cd.1512646123.git.amos@scylladb.com>	2017-12-07 13:40:08 +02:00
Gleb Natapov	8f104bab5d	storage_proxy: send negative write replies only when entire cluster supports the feature Message-Id: <20171207102934.GM1885@scylladb.com>	2017-12-07 12:31:35 +02:00
Botond Dénes	1ff65f41fd	mutation_reader_merger: don't query the kind of moved-from fragment Call mutation_fragment_kind() on the fragment before it's moved, as there are no guarantees for the state of a moved-from object (apart from that it's in a valid one). Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <c47b1e22877bb9499f1fbb9d513093c29ef1901b.1512635422.git.bdenes@scylladb.com>	2017-12-07 10:40:31 +02:00
Avi Kivity	060e5d3354	Merge "Improve time-series performance by not actually compacting fully expired tables" from Raphael "In time-series, it's common for tables in a given time window to be eventually fully expired. The deletion of such tables is done by compaction, but there's no need to actually compact such fully expired sstables iff their full deletion will not cause older data to be resurrected. In other words, a fully expired table can be actually skipped (but deleted in the end) by compaction iff it doesn't contain newer data than its overlapping counterparts. So there may be false negatives, but never false positives. All that said, the goal behind this patchset is to save read bandwidth of disk in such scenarios. Given that fully expired sstables will not be read by compaction process anymore, read amplification will be greatly reduced too. Fixes #2620." * 'time_series_performance_improvement_v2_2' of github.com:raphaelsc/scylla: tests: check sstable auto correct bad max deletion time tests: add test for compaction with fully expired table sstables/compaction: do not actually compact fully expired sstables sstables: make sstable auto correct max_local_deletion_time sstables: switch to const ref wherever possible sstables: use gc_clock::time_point for gc_before gc_clock: introduce operator<<(ostream&, gc_clock::time_point) sstables: introduce sstable::get_max_local_deletion_time sstables: remove unnecessary copy in time series strategies sstables: change return value type of get_fully_expired_sstables dtcs: make code to extract non expired tables faster sstables: add has_correct_max_deletion_time to sstable	2017-12-07 10:29:31 +02:00
Avi Kivity	908daa67bd	Merge "Generalize `data_resource`" from Jesse "Soon we will have resources beyond just keyspaces and table names. There will be resources for roles, for user-defined functions (UDFs), and possible resources for REST end-points. This change generalizes the implementation of a `data_resource` to many different kinds of resources, though there is still only one kind (`data`). The most important patch is 2/5 ("auth/resource: Generalize to different kinds"), which re-writes `auth::data_resource`. The patch message should sufficiently explain the design decisions involved. The other patches rename files and identifiers based on the expanded role of this class, except for 5/5 ("auth/resource.hh: Rename `resource_ids`"): this patch gives a more appropriate name to a type alias. Fixes #3027." * 'jhk/generalize_resource/v3' of https://github.com/hakuch/scylla: auth/resource.hh: Rename `resource_ids` auth: Rename `data_resource` files cql3/authorization_statement: Fix typo auth/resource: Generalize to different kinds auth: Rename `data_resource` to `resource`	2017-12-07 10:25:58 +02:00
Botond Dénes	9fce51f8a0	Add streamed mutation fast-forwarding unit test for the flat combined-reader Test for the bug fixed by `9661769`. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <fc917bae8e9c99f026bf7b366e6e9d39faf466af.1512630741.git.bdenes@scylladb.com>	2017-12-07 09:45:12 +02:00
Raphael S. Carvalho	39f7404436	tests: add sstable resharding test to test.py Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-07 03:15:27 -02:00
Raphael S. Carvalho	fc193c29cf	tests: fix sstable resharding test wrong sstable was used when checking for content, and storage service for test was missing. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-07 03:15:27 -02:00
Raphael S. Carvalho	bad21ba444	sstables: Fix resharding by not filtering out mutation that belongs to other shard After `301358e`, sstable resharding stopped working because shared sstables would use a filtering reader, which excludes mutations that belong to other shards. That completely breaks resharding, which relies on compaction of mutations that belong to different shards. The fix is to use the recently introduced non-local shard reader. Fixes #3041. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-07 03:15:26 -02:00
Raphael S. Carvalho	f1b65a115a	db: introduce make_range_sstable_reader introduce reader variant that will allow its caller to read a range in a given table without any filter applied. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-07 03:15:26 -02:00
Raphael S. Carvalho	d1b146baa6	rename make_range_sstable_reader to make_local_shard_sstable_reader Tomek says: "I think that the least surprising behavior for a function named like this is to read the sstables unfiltered (it just reads them), and the filtering should be indicated specially in the name or by accepting a parameter." Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-07 03:15:25 -02:00
Raphael S. Carvalho	3d725d6823	db: extract sstable reader creation from incremental_reader_selector step closer to divorcing incremental_selector from sstables Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-07 01:53:16 -02:00
Raphael S. Carvalho	ab82bacddd	db: reuse make_range_sstable_reader in make_sstable_reader Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-07 01:53:14 -02:00
Raphael S. Carvalho	5eef7371b3	tests: check sstable auto correct bad max deletion time Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-06 19:52:33 -02:00
Raphael S. Carvalho	a86ee38638	tests: add test for compaction with fully expired table Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-06 19:52:33 -02:00
Raphael S. Carvalho	809b30c4a2	sstables/compaction: do not actually compact fully expired sstables There's no need to actually compact an sstable which is fully expired and whose full deletion will not resurrect older data. For that, an sstable will only be considered fully expired if it doesn't contain data newer than its overlapping counterparts. That way, there could be a false negative, but never a false positive. Currently, a fully expired sstable would unnecessarily waste read bandwidth of disk. This will help time series workloads a lot, since in them data for a given time window is all deleted at once using TTL. Fixes #2620. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-06 19:52:33 -02:00
Raphael S. Carvalho	810e2ec3d9	sstables: make sstable auto correct max_local_deletion_time sstables created prior to `cc6c383` can contain bad max deletion time stat, which would make get_fully_expired_sstables return sstables that aren't actually fully expired. Let's make sstable invalidate the stat if it is potentially incorrect. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-06 19:52:33 -02:00
Raphael S. Carvalho	d2ab154f12	sstables: switch to const ref wherever possible Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-06 19:52:33 -02:00
Raphael S. Carvalho	d916c8cdad	sstables: use gc_clock::time_point for gc_before Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-06 19:52:33 -02:00
Raphael S. Carvalho	1d0e6496ec	gc_clock: introduce operator<<(ostream&, gc_clock::time_point) Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-06 19:52:32 -02:00
Raphael S. Carvalho	fcdce38e7f	sstables: introduce sstable::get_max_local_deletion_time Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-06 18:47:05 -02:00
Raphael S. Carvalho	18bdf496fe	sstables: remove unnecessary copy in time series strategies Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-06 18:46:46 -02:00
Raphael S. Carvalho	45c11865fa	sstables: change return value type of get_fully_expired_sstables unordered_set will allow us to quickly extract fully expired tables from a set of compacting sstables. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-06 18:45:55 -02:00
Raphael S. Carvalho	4fe6fea758	dtcs: make code to extract non expired tables faster since it's O(n) and not O(n log n). change also needed for change in interface of function to retrieve fully expired tables, or sort lambda would need to be parametrized. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-06 18:40:16 -02:00
Raphael S. Carvalho	11176324bd	sstables: add has_correct_max_deletion_time to sstable Commit `cc6c38324` fixes the stat. It was only updated for range tombstone prior to fix, so a sstable that had a regular cell with no expiration time could be considered fully expired which can lead to bad decisions in compaction for time series workloads. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-12-06 18:40:05 -02:00
Jesse Haber-Kucharsky	aea262cdc4	auth/resource.hh: Rename `resource_ids`	2017-12-06 14:39:40 -05:00
Jesse Haber-Kucharsky	3cad18631d	auth: Rename `data_resource` files Now that there can be many kinds of resources, the old name doesn't fit.	2017-12-06 14:39:40 -05:00
Jesse Haber-Kucharsky	3665261a90	cql3/authorization_statement: Fix typo	2017-12-06 14:39:40 -05:00
Jesse Haber-Kucharsky	1bb22bb190	auth/resource: Generalize to different kinds This change generalizes the implementation of a `resource` to many different kinds of resources, though there is still only one kind (`data`). In the future, we also expect resource kinds for roles, user-defined functions (UDFs), and possibly on particular REST end-points. I considered several approaches to generalizing to different kinds of resources. One approach is to have a base class that is inherited from by different resource kinds. The common functionality would be accessed through virtual member functions and kind-specific functions would exist in sub-classes. I rejected this approach because dealing with different kinds of resources uniformly requires storage and life-time management through something like `std::unique_ptr<auth::resource>`, which means that we lose value semantics (including comparison) and must deal with complications around ownership. Another option was to use `boost::variant` (or, in future, `std::variant`). This is closer to what we want, since there is a static set of resource kinds that we support. I rejected this approach for two reasons. The first is that all resource kinds share the same data (a list of segments and a root identifier), which would be duplicated in each type that composed the variant. The second is that the complexity and source-code overhead of `boost::variant` didn't seem warranted. The solution I ended up with is a home-grown variant. All resources are described in the same `final` class: `auth::resource`. This class has value semantics, supports equality comparison, and has a strict ordering. All resources have in common a tag ("kind") and a list of parts. Most operations on resources don't care about the kind of resource (like getting its name, parsing a name, querying for the parent, etc). These are just member functions of the class.
When we care about a kind-specific interpretation of a resource, we can produce a "view" of the resource. For example, `data_resource_view` allows for accessing the (optional) keyspace and table names. I anticipate in the future to add functions for creating role resources (`auth::resource::role`) and also `role_resource_view`. The functional behaviour of the system should be unchanged with this patch. I've added new unit tests in `auth_resource_test.cc` and removed the old test from `auth_test.cc`. Fixes #3027.	2017-12-06 14:37:56 -05:00
Jesse Haber-Kucharsky	8fe53ecf78	auth: Rename `data_resource` to `resource` The implementation and interface of `auth::resource` will change soon to support different kinds of resources beyond just data (keyspaces and tables).	2017-12-06 10:18:05 -05:00
Gleb Natapov	ddf117535a	storage_proxy: add counters for speculative reads Fixes #3030 Message-Id: <20171206143611.8756-1-gleb@scylladb.com>	2017-12-06 16:38:16 +02:00
Avi Kivity	ccc315bcfe	Merge "storage_proxy: allow fail request earlier if CL cannot be reached due to errors" from Gleb "This is CASSANDRA-7886 and CASSANDRA-8592. The patch series detects that CL of a request can no longer be reached due to errors and fails the request earlier. New type of errors are reported: read/write failure which were introduced in cql v4 protocol. For compatibility if older protocol is used the error is translated to timeout error." * 'gleb/request-failure_v2' of github.com:scylladb/seastar-dev: storage_proxy: fail read/write requests early if it cannot be completed due to errors storage_service: add WRITE_FAILURE_REPLY_FEATURE feature gossiper: add node_has_feature() function cql: add read/write failure exceptions storage_proxy: fix data presence reporting in read timeout error during storage_proxy: remove inheritance from enable_shared_from_this for abstract_write_response_handler storage_proxy: remove unneeded field in abstract_write_response_handler storage_proxy: fix pending endpoint accounting for EACH_QUORUM consistency_level: constify quorum_for() and local_quorum_for()	2017-12-06 16:17:19 +02:00


13877 Commits