scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-03 13:37:04 +00:00

Author	SHA1	Message	Date
Paweł Dziepak	7e89dc3bbf	tests/sstables: add storage_service_for_tests to counter write test Writing counters to an sstable is going to require cluster feature information, which requires accessing some singletons.	2017-09-05 13:49:01 +01:00
Paweł Dziepak	2cdcaeba6e	tests/sstables: add test for reading wrong-order counter cells	2017-09-05 13:49:01 +01:00
Paweł Dziepak	b86da0c479	tests/counter: test 1.7.4 compatible shard ordering	2017-09-05 13:49:01 +01:00
Paweł Dziepak	89c037dfc8	tests/counter: add tests for 1.7.4 counter shard order	2017-09-05 13:49:00 +01:00
Paweł Dziepak	b5787ca640	tests/counter: verify order of counter shards	2017-09-05 13:49:00 +01:00
Paweł Dziepak	838dbd98ac	tests/counter: add test for sorting and deduplicating shards	2017-09-05 13:49:00 +01:00
Calle Wilund	34260ce471	utils::UUID: operator< should behave as comparison of hex strings/bytes I.e. need to be unsigned comparison. Message-Id: <1487683665-23426-1-git-send-email-calle@scylladb.com> (cherry picked from commit `0d87f3dd7d`)	2017-08-24 14:18:55 +01:00
Duarte Nunes	1fd4a3ed34	tests/sstable_mutation_test: Don't use moved-from object Fix a bug introduced in `dbbb9e93d` and exposed by gcc6 by not using a moved-from object. Twice. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170802161033.4213-1-duarte@scylladb.com> (cherry picked from commit `4c9206ba2f`)	2017-08-03 09:46:33 +03:00
Avi Kivity	0b48863a7e	Merge "Ensure correct EOC for PI block cell names" from Duarte "This series ensures that we always write correct cell names to promoted index cell blocks, taking into account the eoc of range tombstones. Fixes #2333" * 'pi-cell-name/v1' of github.com:duarten/scylla: tests/sstable_mutation_test: Test promoted index blocks are monotonic sstables: Consider eoc when flushing pi block sstables: Extract out converting bound_kind to eoc (cherry picked from commit `db7329b1cb`)	2017-08-01 18:13:19 +03:00
Nadav Har'El	b594f21f91	Allow reading exactly desired byte ranges and fast_forward_to In commit `c63e88d556`, support was added for fast_forward_to() in data_consume_rows(). Because an input stream's end cannot be changed after creation, that patch ignores the specified end byte, and uses the end of file as the end position of the stream. As a result, even when we want to read a specific byte range (e.g., in the repair code to checksum the partitions in a given range), the code reads an entire 128K buffer around the end byte, or significantly more, with read-ahead enabled. This causes repair to do more than 10 times the amount of I/O it really has to do in the checksumming phase (which in the current implementation, reads small ranges of partitions at a time). This patch has two levels: 1. In the lower level, sstable::data_consume_rows(), which reads all partitions in a given disk byte range, now gets another byte position, "last_end". That can be the range's end, the end of the file, or anything in between the two. It opens the disk stream until last_end, which means 1. we will never read-ahead beyond last_end, and 2. fast_forward_to() is not allowed beyond last_end. 2. In the upper level, we add to the various layers of sstable readers, mutation readers, etc., a boolean flag mutation_reader::forwarding, which says whether fast_forward_to() is allowed on the stream of mutations to move the stream to a different partition range. Note that this flag is separate from the existing boolean flag streamed_mutation::forwarding - that one talks about skipping inside a single partition, while the flag we are adding is about switching the partition range being read. Most of the functions that previously accepted streamed_mutation::forwarding now also accept the option mutation_reader::forwarding.
The exceptions are functions which are known to read only a single partition, and do not support fast_forward_to() a different partition range. We note that if mutation_reader::forwarding::no is requested, and fast_forward_to() is forbidden, there is no point in reading anything beyond the range's end, so data_consume_rows() is called with last_end as the range's end. But if forwarding::yes is requested, we use the end of the file as last_end, exactly like the code before this patch did. Importantly, we note that the repair's partition reading code, column_family::make_streaming_reader, uses mutation_reader::forwarding::no, while the other existing reading code will use the default forwarding::yes. In the future, we can further optimize the amount of bytes read from disk by replacing forwarding::yes by an actual last partition that may ever be read, and use its byte position as the last_end passed to data_consume_rows. But we don't do this yet, and it's not a regression from the existing code, which also opened the file input stream until the end of the file, and not until the end of the range query. Moreover, such an improvement will not improve anything if the overall range is always very large, in which case not over-reading at its end will not improve performance. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20170718110643.8667-1-nyh@scylladb.com>	2017-07-18 16:54:11 +03:00
Asias He	0a9d26de4a	tests: Add test_selective_token_range_sharder (cherry picked from commit `2a794db61b`)	2017-07-11 08:40:49 +08:00
Tomasz Grabiec	f306b47a88	tests: commitlog: Check there are no segments left on disk after clean shutdown Reproduces #2550. Message-Id: <1499358825-17855-2-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `72e01b7fe8`)	2017-07-10 12:41:33 +03:00
Avi Kivity	a4bd56ce40	tests: fix partitioner_test build on gcc 5	2017-06-13 21:56:02 +03:00
Calle Wilund	6340fe61af	commitlog_test: Fix test_commitlog_delete_when_over_disk_limit Test should a.) Wait for the flush semaphore b.) Only compare segment sets between start and end, not start, end and in between. I.e. the test sort of assumed we started with < 2 (or so) segments. Not always the case (timing) Message-Id: <1496828317-14375-1-git-send-email-calle@scylladb.com> (cherry picked from commit `0c598e5645`)	2017-06-13 19:53:13 +03:00
Avi Kivity	a85b70d846	Merge "repair memory usage fix" from Asias "This series switches repair to use more stream plans to stream the mismatched sub ranges and use a range generator to produce sub ranges. Test shows no huge memory is used for repair with large data set. In addition, we now have a progress reporter in the log how many ranges are processed. Jun 06 14:18:22 [shard 0] repair - Repair 512 out of 529 ranges, id=1, keyspace=myks, cf=mytable, range=(8526136029525195375, 8549482295083869942] Jun 06 14:19:55 [shard 0] repair - Repair 513 out of 529 ranges, id=1, keyspace=myks, cf=mytable, range=(8526136029525195375, 8549482295083869942] Fixes #2430." * tag 'asias/fix-repair-2430-branch-master-v1' of github.com:cloudius-systems/seastar-dev: repair: Remove unused sub_ranges_max repair: Reduce parallelism in repair_ranges repair: Tweak the log a bit repair: Use more stream_plan repair: iterator over subranges instead of list (cherry picked from commit `419ad9d6cb`)	2017-06-08 14:52:28 +03:00
Raphael S. Carvalho	17d8a0c727	compaction: do not write expired cell as dead cell if it can be purged right away When compacting a fully expired sstable, we're not allowing that sstable to be purged because expired cell is unconditionally converted into a dead cell. Why not check if the expired cell can be purged instead using gc before and max purgeable timestamp? Currently, we need two compactions to get rid of a fully expired sstable which cells could have always been purged. look at this sstable with expired cell: { "partition" : { "key" : [ "2" ], "position" : 0 }, "rows" : [ { "type" : "row", "position" : 120, "liveness_info" : { "tstamp" : "2017-04-09T17:07:12.702597Z", "ttl" : 20, "expires_at" : "2017-04-09T17:07:32Z", "expired" : true }, "cells" : [ { "name" : "country", "value" : "1" }, ] now this sstable data after first compaction: [shard 0] compaction - Compacted 1 sstables to [...]. 120 bytes to 79 (~65% of original) in 229ms = 0.000328997MB/s. { ... "rows" : [ { "type" : "row", "position" : 79, "cells" : [ { "name" : "country", "deletion_info" : { "local_delete_time" : "2017-04-09T17:07:12Z" }, "tstamp" : "2017-04-09T17:07:12.702597Z" }, ] now another compaction will actually get rid of data: compaction - Compacted 1 sstables to []. 79 bytes to 0 (~0% of original) in 1ms = 0MB/s. ~2 total partitions merged to 0 NOTE: It's a waste of time to wait for second compaction because the expired cell could have been purged at first compaction because it satisfied gc_before and max purgeable timestamp. Fixes #2249, #2253 Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20170413001049.9663-1-raphaelsc@scylladb.com> (cherry picked from commit `a6f8f4fe24`)	2017-05-23 20:57:54 +03:00
Avi Kivity	b26bd8bbeb	tests: fix partitioner_test for g++ 5 It can't make the leap from dht::ring_position to stdx::optional<range_bound<dht::ring_position>> for some reason. (cherry picked from commit `ba31619594`)	2017-05-18 13:10:48 +03:00
Tomasz Grabiec	e2c75d8532	Merge "Fix performance problems with high shard counts tag" from Avi From http://github.com/avikivity/scylla exponential-sharder/v3. The sharder, which takes a range of tokens and splits it among shards, is slow with large shard count and the default murmur3_partitioner_ignore_msb_bits. This patchset fixes excessive iteration in sstable sharding metadata writer and nonsingular range scans. Without this patchset, sealing a memtable takes > 60 ms on a 48-shard system. With the patchset, it drops below the latency tracker threshold I used (5 ms). Fixes #2392. (cherry picked from commit `84648f73ef`)	2017-05-17 16:19:24 +03:00
Duarte Nunes	59063f4891	tests: Add test case for nonwrapping_range::intersection() Signed-off-by: Duarte Nunes <duarte@scylladb.com> (cherry picked from commit `f365b7f1f7`)	2017-05-17 15:59:06 +03:00
Paweł Dziepak	bd67d23927	tests/counter: test transform_counter_updates_to_shards	2017-05-02 13:49:43 +01:00
Paweł Dziepak	bdeeebbd74	tests/counter: test static columns	2017-05-02 13:49:43 +01:00
Raphael S. Carvalho	82cc3d7aa5	dtcs: do not compact fully expired sstable which ancestor is not deleted yet Currently, fully expired sstable[1] is unconditionally chosen for compaction by DTCS, but that may lead to a compaction loop under certain conditions. Let's consider that an almost expired sstable is compacted, and it's not deleted yet, and that the new sstable becomes expired before its ancestor is deleted. Because this new sstable is expired, it will be chosen by DTCS, but it will not be purged because 'compacted undeleted' sstables are taken into account by calculation of max purgeable timestamp and prevents expired data from being purged. The problem is that this sequence of events can keep happening forever as reported by issue #2260. NOTE: This problem was easier to reproduce before improvement on compaction of expired cells, because fully expired sstable was being converted into a sstable full of tombstones, which is also considered fully expired. Fixes #2260. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20170428233554.13744-1-raphaelsc@scylladb.com> (cherry picked from commit `687a4bb0c2`)	2017-04-30 19:36:00 +03:00
Avi Kivity	1a77312aec	Merge "Reduce memory reclamation latency" from Tomasz "Currently eviction is performed until occupancy of the whole region drops below the 85% threshold. This may take a while if region had high occupancy and is large. We could improve the situation by only evicting until occupancy of the sparsest segment drops below the threshold, as is done by this change. I tested this using a c-s read workload in which the condition triggers in the cache region, with 1G per shard: lsa-timing - Reclamation cycle took 12.934 us. lsa-timing - Reclamation cycle took 47.771 us. lsa-timing - Reclamation cycle took 125.946 us. lsa-timing - Reclamation cycle took 144356 us. lsa-timing - Reclamation cycle took 655.765 us. lsa-timing - Reclamation cycle took 693.418 us. lsa-timing - Reclamation cycle took 509.869 us. lsa-timing - Reclamation cycle took 1139.15 us. The 144ms pause is when large eviction is necessary. Statistics for reclamation pauses for a read workload over larger-than-memory data set: Before: avg = 865.796362 stdev = 10253.498038 min = 93.891000 max = 264078.000000 sum = 574022.988000 samples = 663 After: avg = 513.685650 stdev = 275.270157 min = 212.286000 max = 1089.670000 sum = 340573.586000 samples = 663 Refs #1634." * tag 'tgrabiec/lsa-reduce-reclaim-latency-v3' of github.com:cloudius-systems/seastar-dev: lsa: Reduce reclamation latency tests: Add test for log_histogram log_histogram: Allow non-power-of-two minimum values lsa: Use regular compaction threshold in on-idle compaction tests: row_cache_test: Induce update failure more reliably lsa: Add getter for region's eviction function (cherry picked from commit `fccbf2c51f`) [avi: adjustments for 1.7's heap vs. master's log_histogram]	2017-04-21 22:12:52 +03:00
Raphael S. Carvalho	193b5d1782	partitioned_sstable_set: fix quadratic space complexity streaming generates lots of small sstables with large token range, which triggers O(N^2) in space in interval map. level 0 sstables will now be stored in a structure that has O(N) in space complexity and which will be included for every read. Fixes #2287. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20170417185509.6633-1-raphaelsc@scylladb.com> (cherry picked from commit `11b74050a1`)	2017-04-18 13:05:00 +03:00
Avi Kivity	699648d5a1	Merge "tests: Use allocating_section in lsa_async_eviction_test" from Tomasz "The test allocates objects in batches (allocation is always under a reclaim lock) of ~3MiB and assumes that it will always succeed because if we cross the low water mark for free memory (20MiB) in seastar, reclamation will be performed between the batches, asynchronously. Unfortunately that's prevented by can_allocate_more_memory(), which fails segment allocation when we're below the low water mark. LSA currently doesn't allow allocating below the low water mark. The solution which is employed across the code base is to use allocating_section, so use it here as well. Exposed by recent consistent failures on branch-1.7." * 'tgrabiec/fix-lsa-async-eviction-test' of github.com:cloudius-systems/seastar-dev: tests: lsa_async_eviction_test: Allocate objects under allocating section lsa: Allow adjusting reserves in allocating_section (cherry picked from commit `434a4fee28`)	2017-03-16 12:44:54 +02:00
Paweł Dziepak	42e7a59cca	tests/cql_test_env: wait for storage service initialization Message-Id: <20170221121130.14064-1-pdziepak@scylladb.com> (cherry picked from commit `274bcd415a`)	2017-02-21 17:06:10 +02:00
Avi Kivity	2cd019ee47	Merge "Fixes for counter cell locking" from Paweł "This series contains some fixes and a unit test for the logic responsible for locking counter cells." * 'pdziepak/cell-locking-fixes/v1' of github.com:cloudius-systems/seastar-dev: tests: add test for counter cell locker cell_locking: fix schema upgrades cell_locker: make locker non-movable cell_locking: allow to be included by anyone (cherry picked from commit `b8c4b35b57`)	2017-02-15 17:37:38 +02:00
Avi Kivity	a203c87f0d	Merge "Disallow mixed schemas" from Paweł "This series makes sure that schemas containing both counter and non-counter regular or static columns are not allowed." * 'pdziepak/disallow-mixed-schemas/v1' of github.com:cloudius-systems/seastar-dev: schema: verify that there are no both counter and non-counter columns test/mutation_source: specify whether to generate counter mutations tests/canonical_mutation: don't try to upgrade incompatible schemas (cherry picked from commit `9e4ae0763d`)	2017-02-07 18:04:24 +02:00
Avi Kivity	4f416c7272	Merge "Avoid avalanche of tasks after memtable flush" from Tomasz "Before, the logic for releasing writes blocked on dirty worked like this: 1) When region group size changes and it is not under pressure and there are some requests blocked, then schedule request releasing task 2) request releasing task, if no pressure, runs one request and if there are still blocked requests, schedules next request releasing task If requests don't change the size of the region group, then either some request executes or there is a request releasing task scheduled. The amount of scheduled tasks is at most 1, there is a single releasing thread. However, if requests themselves would change the size of the group, then each such change would schedule yet another request releasing thread, growing the task queue size by one. The group size can also change when memory is reclaimed from the groups (e.g. when contains sparse segments). Compaction may start many request releasing threads due to group size updates. Such behavior is detrimental for performance and stability if there are a lot of blocked requests. This can happen on 1.5 even with modest concurrency because timed out requests stay in the queue. This is less likely on 1.6 where they are dropped from the queue. The releasing of tasks may start to dominate over other processes in the system. When the amount of scheduled tasks reaches 1000, polling stops and server becomes unresponsive until all of the released requests are done, which is either when they start to block on dirty memory again or run out of blocked requests. It may take a while to reach pressure condition after memtable flush if it brings virtual dirty much below the threshold, which is currently the case for workloads with overwrites producing sparse regions. I saw this happening in a write workload from issue #2021 where the number of request releasing threads grew into thousands. 
Fix by ensuring there is at most one request releasing thread at a time. There will be one releasing fiber per region group which is woken up when pressure is lifted. It executes blocked requests until pressure occurs." * tag 'tgrabiec/lsa-single-threaded-releasing-v2' of github.com:cloudius-systems/seastar-dev: tests: lsa: Add test for reclaimer starting and stopping tests: lsa: Add request releasing stress test lsa: Avoid avalanche releasing of requests lsa: Move definitions to .cc lsa: Simplify hard pressure notification management lsa: Do not start or stop reclaiming on hard pressure tests: lsa: Adjust to take into account that reclaimers are run synchronously lsa: Document and annotate reclaimer notification callbacks tests: lsa: Use with_timeout() in quiesce() (cherry picked from commit `7a00dd6985`)	2017-02-03 09:47:50 +01:00
Piotr Jastrzebski	36b2c4df19	row_cache_test: extend test_mvcc Make the test execute with and without an active reader to memtable that's flushed to cache. This improves the code coverage of MVCC with tests. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <007b6cd1ba7a84ea5675ea82e454bf1adf3b3330.1485954941.git.piotr@scylladb.com>	2017-02-02 13:51:32 +01:00
Paweł Dziepak	8671d8329d	perf_simple_query: add counter tables tests	2017-02-02 10:35:14 +00:00
Paweł Dziepak	99b21fbb86	tests: random_mutation_generator: generate counter cells	2017-02-02 10:35:14 +00:00
Paweł Dziepak	de2acd47c9	tests/sstables: test reading and writing counters	2017-02-02 10:35:14 +00:00
Paweł Dziepak	5905729c4a	sstables: read counter cells	2017-02-02 10:35:14 +00:00
Paweł Dziepak	de698105e4	tests/counter: test apply, difference and freeze	2017-02-02 10:35:14 +00:00
Paweł Dziepak	496b42fcc7	tests: add test for counters	2017-02-02 10:35:13 +00:00
Piotr Jastrzebski	c7e95af0b0	row_cache_test: fix test_mvcc Currently the test does not wait for cache update to finish before carrying on with the checks. This makes the test nondeterministic and purely wrong because checks expect update to be finished. This patch changes the test to wait for update to finish. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <2a99bba24b1628466d3495332b48ef3ccdb43c26.1485862389.git.piotr@scylladb.com>	2017-01-31 11:37:29 +00:00
Pekka Enberg	be0351b49c	cql3: Introduce raw_value and raw_value_view types Currently, the code is using bytes_opt and bytes_view_opt to represent CQL values, which can hold a value or null. In preparation for supporting a third state, unset value introduced in CQL v4, introduce new raw_value and raw_value_view types and use them instead. The new types are based on boost::variant<> and are capable of holding null, unset values, and blobs that represent a value.	2017-01-26 13:50:04 +02:00
Tomasz Grabiec	2c7902fb2b	Revert "lsa: Reduce reclamation latency" This reverts commit `d61002cc33`. Introduced a regression in row_cache_alloc_stress. The problem is that reclaim_from_evictable() evicts way too much after the refactor due to the stop condition not taking into account how much data was evicted so far and only looking at occupancy of the minimal segment. This may lead to eviction of the whole region.	2017-01-26 10:43:18 +01:00
Duarte Nunes	54a464ae27	random_mutation_generator: Always generate range tombstones Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-01-23 19:02:23 +01:00
Duarte Nunes	a01aa91c82	range_tombstone_list: Add unit tests for difference() Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-01-23 18:14:33 +01:00
Benoît Canet	bcc826cc34	mutation_reader: Short circuit the read path on empty range Add a boolean to short circuit the read path on empty range hoping for some speedup. tested in read write with cs using: cl=QUORUM duration=1m -mode native cql3 -rate threads=700 -node localhost Will do some additional benchmark. Fixes #1056 Signed-off-by: Benoît Canet <benoit@scylladb.com> Message-Id: <20170118194451.16836-1-benoit@scylladb.com>	2017-01-20 10:05:40 +00:00
Tomasz Grabiec	d61002cc33	lsa: Reduce reclamation latency Currently eviction is performed until occupancy of the whole region drops below the 85% threshold. This may take a while if region had high occupancy and is large. We could improve the situation by only evicting until occupancy of the sparsest segment drops below the threshold, as is done by this change. I tested this using a c-s read workload in which the condition triggers in the cache region, with 1G per shard: lsa-timing - Reclamation cycle took 12.934 us. lsa-timing - Reclamation cycle took 47.771 us. lsa-timing - Reclamation cycle took 125.946 us. lsa-timing - Reclamation cycle took 144356 us. lsa-timing - Reclamation cycle took 655.765 us. lsa-timing - Reclamation cycle took 693.418 us. lsa-timing - Reclamation cycle took 509.869 us. lsa-timing - Reclamation cycle took 1139.15 us. The 144ms pause is when large eviction is necessary. The change improves worst case latency. Reclamation time statistics over 30 second period after cache fills up, in microseconds: Before: avg = 1524.283148 stdev = 11021.021118 min = 12.934000 max = 144356.000000 sum = 257603.852000 samples = 169 After: avg = 1317.362414 stdev = 1913.542802 min = 263.935000 max = 19244.600000 sum = 175209.201000 samples = 133 Refs #1634. Message-Id: <1484730859-11969-1-git-send-email-tgrabiec@scylladb.com>	2017-01-19 17:35:36 +02:00
Tomasz Grabiec	ddfee57c97	Replace iostream include with iosfwd in headers Message-Id: <1484656119-8386-4-git-send-email-tgrabiec@scylladb.com>	2017-01-17 14:52:44 +02:00
Paweł Dziepak	e03868c226	tests: run with all features enabled Since `ce083308a1` "random_mutation_generator: Generate RTs by default" random mutation generator produces range tombstones. However, so far the tests were run with all features disabled (because of incomplete initialization of all services) which meant that RANGE_TOMBSTONE feature was not enabled and the code couldn't handle range tombstones that weren't just prefixes. This patch solves the problem by forcing all features to be enabled when tests are run. Message-Id: <20170116103324.22956-1-pdziepak@scylladb.com>	2017-01-16 11:38:45 +01:00
Duarte Nunes	ce083308a1	random_mutation_generator: Generate RTs by default This patch changes the random_mutation_generator so it generates range tombstones by default. This fixes a bug where reversibly applying range tombstones wasn't being tested. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170110164822.28747-1-duarte@scylladb.com>	2017-01-11 09:24:37 +00:00
Avi Kivity	0591303b72	Merge "avoid excessive memory usage during resharding" from Raphael "Intended to reduce memory usage when resharding by sharing sstable components among shards. File descriptors are also shared from now on, meaning that a much smaller number of file descriptors will be used during resharding. Fixes #1951." branch 'excessive_memory_usage_v4' of github.com:raphaelsc/scylla * 'excessive_memory_usage_v4' of github.com:raphaelsc/scylla: db: avoid excessive memory usage during resharding checked_file_impl: add support to dup sstables: group sstable components that can be shared among shards sstables: rename sstable member	2017-01-09 20:43:50 +02:00
Raphael S. Carvalho	68dfcf5256	db: avoid excessive memory usage during resharding After resharding, sstables may be owned by all shards, which means that file descriptors and memory usage for metadata will increase by a factor equal to number of shards. That can easily lead to OOM. SSTable components are immutable, so they can be stored in one shard and shared with others that need it. We use the following formula to decide which shard will open the sstable and share it with the others: (generation % smp::count), which is the inverse of how we calculate generation for new sstables. So if no resharding is performed, everything is shard-local. With this approach, resource usage due to loaded sstables will be evenly distributed among shards. For this approach to work, we now only populate keyspaces from shard 0. Shard 0 is now solely responsible for iterating through column family dirs. In addition, most of the population functions are now free and take distributed database object as parameter. Fixes #1951. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-01-09 15:24:36 -02:00
Avi Kivity	77cb2b452f	Merge "CQL 3.3.1 support" from Pekka "This patch series adds support for CQL 3.3.1. The changes to CQL are listed here: https://github.com/apache/cassandra/blob/cassandra-2.2/doc/cql3/CQL.textile#changes The following CQL features are already supported by Scylla: - TRUNCATE TABLE alias - Double-dollar string literals - Aggregate functions: MIN, MAX, SUM, and AVG This series adds the following CQL features: - New data types: tinyint, smallint, date, and time - CQL binary protocol v4 (required by the new data types) - Advertise Cassandra 2.2.8 version from Scylla so that drivers correctly detect the presence of CQL 3.3.1 The following CQL features are not supported by Scylla: - Role-based access control (issue #1941) - JSON data type - User-defined functions (UDFs) - User-defined aggregates (UDAs) The following CQL binary protocol v4 changes are not implemented by this series: - Read_failure and Write_failure error codes are not implemented. These error codes are not used by the smart drivers, but as they are propagated to application code, we eventually need to wire them up to our storage proxy implementation. - Function_failure error code is only used by user-defined functions and the fromJson function, which are not implemented by Scylla. Fixes #1284."
* 'penberg/cql-3.3.1/v5' of github.com:cloudius-systems/seastar-dev: version: Bump Cassandra version to 2.2.8 db/schema_tables: Add schema_functions and schema_aggregates tables tests/type_tests: TIME type test cases tests/cql_query_test: TIME type test cases cql3: TIME data type support tests/type_tests: DATE type test cases tests/cql_query_test: DATE type test cases cql3: DATE type support date.h: 64-bit year and days representation licenses: Add utils/date.h license utils/date.h: Import date and time library sources tests/type_tests: TINYINT and SMALLINT type test cases tests/cql_query_test: TINYINT and SMALLINT type test cases cql3: TINYINT and SMALLINT data type support types: Fix integer_type_impl::parse_int() for bytes	2017-01-09 11:54:45 +02:00
Pekka Enberg	10facd7db8	tests/type_tests: TIME type test cases	2017-01-09 10:42:21 +02:00

1314 Commits