scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-09 16:33:35 +00:00

Author	SHA1	Message	Date
Duarte Nunes	115ff1095e	db/view: Use view schema for view pk operations Instead of base schema. Fixes #2504 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170718190703.12972-1-duarte@scylladb.com>	2017-07-19 09:59:34 +02:00
Avi Kivity	bfae5c7bac	Merge "Time window compaction strategy support" from Raphael "Time window strategy was introduced to address several limitations of date tiered strategy. In addition, its options are much easier to reason about, basically just window size and window unit. TWCS will work to keep only one sstable in each window. So the only real optimization needed is to align partition key to the window. Size tiered strategy is used to reduce write amplification when compacting the incoming window. For more details: https://issues.apache.org/jira/browse/CASSANDRA-9666 Fixes #1432." * 'twcs_v2' of github.com:raphaelsc/scylla: tests: add tests for time window compaction strategy compaction: wire up time window compaction strategy compaction/twcs: override default values with options in schema sstables: implement time window compaction strategy sstables: import TimeWindowCompactionStrategy.java	2017-07-19 10:22:53 +03:00
Duarte Nunes	3bfcf47cc6	types: Implement hash() for collections This patch provides a rather trivial implementation of hash() for collection types. It is needed for view building, where we hold mutations in a map indexed by partition keys (and frozen collection types can be part of the key). Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170718192107.13746-1-duarte@scylladb.com>	2017-07-19 09:52:56 +03:00
Raphael S. Carvalho	c55c63f213	tests: add tests for time window compaction strategy Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-07-19 02:58:37 -03:00
Raphael S. Carvalho	7ecedac222	compaction: wire up time window compaction strategy Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-07-19 02:58:37 -03:00
Raphael S. Carvalho	01886c23a8	compaction/twcs: override default values with options in schema Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-07-19 02:58:37 -03:00
Raphael S. Carvalho	206d30c52a	sstables: implement time window compaction strategy For more details, https://issues.apache.org/jira/browse/CASSANDRA-9666 Fixes #1432. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-07-19 02:58:35 -03:00
Raphael S. Carvalho	2686e84792	sstables: import TimeWindowCompactionStrategy.java It will be later converted to C++. Imported from latest scylla-tools-java repository. Checked that it doesn't lack anything. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-07-18 18:26:17 -03:00
Takuya ASADA	49b01e764a	dist/common/scripts/scylla_prepare: stop running hugeadm when it's posix mode A user reported that scylla-server.service is not able to run on their cloud instance because of hugeadm (hugeadm says the kernel does not support huge pages). We don't need it for posix mode, so move it into dpdk mode. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1500367219-8728-1-git-send-email-syuu@scylladb.com>	2017-07-18 16:39:16 +03:00
Tomasz Grabiec	63caa58b70	Merge "Drop mutations that raced with truncate" from Duarte Instead of retrying, just drop mutations that raced with a truncate. * git@github.com:duarten/scylla.git truncate-reorder/v1: database: Rename replay_position_reordered_exception database: Drop mutations that raced with truncate	2017-07-18 12:53:36 +02:00
Duarte Nunes	d9fa3bf322	thrift: Fail when mixed CFs are detected Fixes #2588 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170717222612.7429-1-duarte@scylladb.com>	2017-07-18 10:21:33 +03:00
Avi Kivity	64ef7aa5e4	Merge seastar upstream * seastar 867b7c7...a14d667 (1): > tls: remove unneeded lambda captures	2017-07-17 19:30:59 +03:00
Duarte Nunes	6b464da67d	schema: Get rid of regular_columns_by_name They are unused. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170717103635.6473-2-duarte@scylladb.com>	2017-07-17 12:52:41 +02:00
Asias He	adc5f0bd21	gossip: Implement the missing fd_max_interval_ms and fd_initial_value_ms option It is useful for larger clusters with larger gossip message latency. By default the fd_max_interval_ms is 2 seconds, which means the failure_detector will ignore any gossip message update interval larger than 2 seconds. However, in larger clusters, the gossip message update interval can be larger than 2 seconds. Fixes #2603. Message-Id: <49b387955fbf439e49f22e109723d3a19d11a1b9.1500278434.git.asias@scylladb.com>	2017-07-17 13:29:16 +03:00
Duarte Nunes	13caccf1cf	Merge 'Fixes around migration to v3 schema tables' from Tomasz branch 'tgrabiec/schema-migration-fixes' of github.com:scylladb/seastar-dev: schema: Use proper name comparator legacy_schema_migrator: Properly migrate non-UTF8 named columns schema_tables: Store column_name in text form legacy_schema_migrator: Migrate columns like Cassandra schema_builder: Add factory method for default_names legacy_schema_migrator: Simplify logic thrift: Don't set regular_column_name_type schema: Use proper column name type for static columns schema: Fix column_name_type() for static compact tables schema: Introduce clustering_column_at() thrift: Reuse cell_comparator::to_sstring() for obtaining comparator type partition_slice_builder: Use proper column's type instead of regular_column_name_type()	2017-07-17 11:16:52 +02:00
Tomasz Grabiec	34dae0588c	schema: Use proper name comparator This replaces column_definition::name_comparator, which incorrectly assumes that names are always utf8, with name_compare moved from schema::rebuild() and unifies usages.	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	7e54290d38	legacy_schema_migrator: Properly migrate non-UTF8 named columns Currently migrator assumed all columns are utf8-named, which doesn't have to be the case for static compact tables. Refs #2597. Due to #2573, we can assume that Scylla wasn't used with non-utf8 column names, and that old names are always in textual form.	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	60a76efd37	schema_tables: Store column_name in text form That's how it is stored by Cassandra. Refs #2597.	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	61229a7536	legacy_schema_migrator: Migrate columns like Cassandra This fixes generation of synthetic columns for static compact tables. Current code always generates synthetic clustering column with utf8 type and synthetic regular column with bytes type (in schema_builder). That's fine when creating a new CQL table, but not when migrating existing tables created via thrift API. Fixes #2584. This also migrates empty compact value columns like Cassandra does. Such columns are present in compact tables without regular columns, e.g.: create table test (k int, ck int, primary key (k, ck)) with compact storage; They should be migrated to a synthetic regular column with empty_type type and a non-empty name.	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	49e21b3b8e	schema_builder: Add factory method for default_names	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	6dc299c27a	legacy_schema_migrator: Simplify logic The expression "is_dense.value_or(true)" is always true inside the if, so drop it. This allows us to drop temporary calulated_is_dense. We can also get rid of one of the if branches by extracting builder.set_is_dense() outside.	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	3987e9be31	thrift: Don't set regular_column_name_type Regular columns are always utf8 after `f5dae826ce`.	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	b919c50d21	schema: Use proper column name type for static columns After `f5dae826ce`, static columns not always have utf8 column names. For static compact tables it's determined by the cell name comparator type, which is equal to the type of the synthetic clustering column. Caused various errors with static thrift tables with non-utf8 comparator.	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	f685f7f8a1	schema: Fix column_name_type() for static compact tables Introduced in `f5dae826ce`.	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	84536a4a75	schema: Introduce clustering_column_at()	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	9ed958a1eb	thrift: Reuse cell_comparator::to_sstring() for obtaining comparator type	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	9768036d61	partition_slice_builder: Use proper column's type instead of regular_column_name_type()	2017-07-17 09:40:06 +02:00
Avi Kivity	c51001b598	Merge seastar upstream * seastar b812cee...867b7c7 (1): > rpc: start server's send loop only after protocol negotiation Fixes #2600.	2017-07-16 19:36:31 +03:00
Avi Kivity	a5bd854019	Merge seastar upstream * seastar 844bcfb...b812cee (1): > Update dpdk submodule Fix #2595 (again).	2017-07-16 17:00:48 +03:00
Avi Kivity	d9c64ef737	tests: move tmpdir to /tmp Reduces view_schema_test runtime to 5 seconds, from 53 seconds on an NVMe disk with write-back cache, and forever on a spinning disk. Message-Id: <20170716081653.10018-1-avi@scylladb.com>	2017-07-16 11:55:08 +02:00
Avi Kivity	9116dd91cb	tests: copy the sstable with an unknown component to the data directory We will be creating links to those sstable's files, and those don't work if the data directory and the test sstable are on different devices. Copying the files to the same directory fixes the problem. Message-Id: <20170716090405.14307-1-avi@scylladb.com>	2017-07-16 11:55:00 +02:00
Duarte Nunes	2c711922cc	database: Drop mutations that raced with truncate Mutations that race with a truncate can just be dropped. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-07-16 00:08:05 +02:00
Duarte Nunes	0825c9c805	database: Rename replay_position_reordered_exception Rename replay_position_reordered_exception to mutation_reordered_with_truncate_exception for more precision, since this is the only situation where this exception can be thrown. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-07-16 00:08:05 +02:00
Avi Kivity	e87ab54bfc	Merge seastar upstream * seastar ff34c42...844bcfb (1): > Update dpdk submodule Fixes #2595.	2017-07-15 19:17:05 +03:00
Tomasz Grabiec	caa62f7f05	Merge "Fixes for memtable flushing and replay positions" from Duarte We don't ensure mutations are applied in memory following the order of their replay positions. A memtable can thus be flushed with replay position rp, with the new one being at replay position rp', where rp' < rp. This breaks an intrinsic assumption in the code, which this series addresses. Fixes #2074 branch memtable-flush/v3 of git@github.com:duarten/scylla.git: commitlog: Always flush latest memtable column_family: More precise count of switched memtables column_family: Fix typo in pending_tasks metric name column_family: More precise count of pending flushes dirty_memory_manager: Remove unnecessary check from flush_one() column_family: Don't rely on flush_queue to guarantee flushes finished column_family: Don't bother closing the flush_queue on stop() column_family: Stop using flush_queue column_family: Remove outdated comment about the flush_queue memtable: Stop tracking the highest flushed rp	2017-07-14 11:39:37 +02:00
Avi Kivity	162d9aa85d	tests: fix view_schema_test with clang Clang is happy to create a vector<data_value> from a {}, a {1, 2}, but not a {1}. No doubt it is correct, but sheesh. Make the data_value explicit to humor it. Message-Id: <20170713074315.9857-1-avi@scylladb.com>	2017-07-14 12:24:27 +03:00
Duarte Nunes	b8235f2e88	storage_proxy: Preserve replica order across mutations In storage_proxy we arrange the mutations sent by the replicas in a vector of vectors, such that each row corresponds to a partition key and each column contains the mutation, possibly empty, as sent by a particular replica. There is reconciliation-related code that assumes that all the mutations sent by a particular replica can be found in a single column, but that isn't guaranteed by the way we initially arrange the mutations. This patch fixes this and enforces the expected order. Fixes #2531 Fixes #2593 Signed-off-by: Gleb Natapov <gleb@scylladb.com> Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170713162014.15343-1-duarte@scylladb.com>	2017-07-14 12:11:22 +03:00
Duarte Nunes	5f24e9a4a5	memtable: Stop tracking the highest flushed rp Since we no longer enforce that mutations are applied in memory ordered by their replay_positions, the way the highest_flush_rp is being tracked is no longer correct. The invariant it was used to maintain no longer exists, so we can get rid of it together with the assertion on the highest_flush_rp on flush(). Fixes #2074 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-07-13 22:56:06 +02:00
Duarte Nunes	22a53a52a1	column_family: Remove outdated comment about the flush_queue Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-07-13 22:56:05 +02:00
Duarte Nunes	003941cd95	column_family: Stop using flush_queue Since commitlog ordering requirements have been relaxed, we now keep the set of replay_positions seen by a memtable in a set, which we then use to clean up relevant segments in the commitlog. This means that the guarantees provided by the flush_queue are no longer necessary. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-07-13 22:56:00 +02:00
Duarte Nunes	7e6fe5895e	column_family: Don't bother closing the flush_queue on stop() When stopping a column family we issue a flush(), for which we wait. Since writes are supposed to have stopped coming in, and also new flush requests, there's no need to call and wait for the flush_queue to be closed. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-07-13 22:51:58 +02:00
Duarte Nunes	a1f4536ffb	column_family: Don't rely on flush_queue to guarantee flushes finished We now don't ensure mutations are applied in memory following the order of their replay positions, so we can't rely on the replay position to order memtable flushes. So, use a phased_barrier() to ensure that calling flush() returns a future that completes when all flushes up to that point have finished. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-07-13 22:51:58 +02:00
Duarte Nunes	1b320496e2	dirty_memory_manager: Remove unnecessary check from flush_one() We don't need to check whether a memtable is empty in flush_one(), as that must be checked later, during the actual sealing. The condition itself is rare and is checked already after the potentially contended semaphore has been acquired. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-07-13 22:51:57 +02:00
Duarte Nunes	59bdaed02b	column_family: More precise count of pending flushes This patch ensures we update the count of pending flushes in the same place as we update the stats across column families, which is more correct since it only accounts for actual flushes and not those of empty memtables or that have been coalesced together. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-07-13 22:51:25 +02:00
Duarte Nunes	3e27c335a9	column_family: Fix typo in pending_tasks metric name Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-07-13 22:51:25 +02:00
Duarte Nunes	a11724c6e1	column_family: More precise count of switched memtables The memtable_switch_count metric is supposed to count the number of times a flush has resulted in the memtable being switched out, but we were incrementing the count regardless of whether we tried to flush an empty memtable or two or more flushes were coalesced into one. This patch fixes this by moving the metric to where the memtable is actually switched. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-07-13 22:51:25 +02:00
Duarte Nunes	bca1b19ce9	commitlog: Always flush latest memtable We now don't ensure mutations are applied in memory following the order of their replay positions, so we can't rely on the replay position to order memtable flushes. When flushing commit log segments, ensure we flush the latest memtable. Refs #2074 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-07-13 22:51:25 +02:00
Paweł Dziepak	ec689b2fe1	Merge "utils: minor fixes in the loading_cache class" from Vlad "This series aims to fix the "serving invalid (old) values" issue in the loading_cache (issue #2590) by arming the timer with a period that equals min(expire, refresh). We are still trying to optimize the main case where 'expire' is significantly longer than 'refresh' period. We don't want to add any additional logic in the fast path and this series gives the immediate solution for the issue above while not adding any additional CPU cycle to the fast path." * 'loading_cache_short_expired-v2' of https://github.com/vladzcloudius/scylla: utils::loading_cache: arm the timer with a period equal to min(_expire, _update) utils::loading_cache: make a timer use a loading_cache_clock_type clock as a source	2017-07-13 16:58:53 +01:00
Vlad Zolotarov	45e23d8090	db::config: fix the permissions cache related parameters description Make the descriptions of permissions_validity_in_ms, permissions_update_interval_in_ms and permissions_cache_max_entries more readable and more related to what they really do. Mention the non-zero value requirement for the permissions_update_interval_in_ms and the permissions_cache_max_entries when the permissions cache is enabled. Adjust the parameters description in the scylla.yaml too. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <1499957053-31792-1-git-send-email-vladz@scylladb.com>	2017-07-13 16:00:40 +01:00
Vlad Zolotarov	76ea74f3fd	utils::loading_cache: arm the timer with a period equal to min(_expire, _update) Arm the timer with a period that is not greater than either the permissions_validity_in_ms or the permissions_update_interval_in_ms in order to ensure that we are not stuck with the values older than permissions_validity_in_ms. Fixes #2590 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-07-13 10:48:59 -04:00


12626 Commits