scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 11:55:15 +00:00

Author	SHA1	Message	Date
Duarte Nunes	3da54ffff0	schema_builder: Replace type when re-dropping column Fixes #2634 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170725183933.5311-1-duarte@scylladb.com> (cherry picked from commit `e988121dbb`)	2017-07-26 16:26:59 +02:00
Duarte Nunes	804793e291	tests/schema_change_test: Add test case for add+drop notification Reproduces #2616 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170725170622.4380-2-duarte@scylladb.com> (cherry picked from commit `472f32fb06`)	2017-07-26 16:26:59 +02:00
Duarte Nunes	83ea9b6fc0	db/schema_tables: Consider differing dropped columns If a node is notified of a schema change where the schema's dropped columns have changed, that node will miss the changes to the dropped columns. A scenario where this can happen is where a column c is dropped, then added with a different type, and then dropped again, with a node n having seen the first drop and being notified of the subsequent add and drop. Fixes #2616 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170725170622.4380-1-duarte@scylladb.com> (cherry picked from commit `33e18a1779`)	2017-07-26 16:26:59 +02:00
Asias He	b45855fc1c	gossip: Fix nr_live_nodes calculation We need to consider the _live_endpoints size. The nr_live_nodes should not be larger than _live_endpoints size, otherwise the loop to collect the live node can run forever. It is a regression introduced in commit `437899909d` (gossip: Talk to more live nodes in each gossip round). Fixes #2637 Message-Id: <863ec3890647038ae1dfcffc73dde0163e29db20.1501026478.git.asias@scylladb.com> (cherry picked from commit `515a744303`)	2017-07-26 16:48:51 +03:00
Duarte Nunes	3900babff2	schema: Remove unnecessary print Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170725174000.71061-1-duarte@scylladb.com> (cherry picked from commit `9c831b4e97`)	2017-07-26 16:07:41 +03:00
Tomasz Grabiec	7c805187a9	Merge fixes related to row cache from Raphael * git@github.com:raphaelsc/scylla.git row_cache_fixes: db: atomically synchronize cache with changes to the snapshot db: refresh row cache's underlying data source after compaction (cherry picked from commit `18be42f71a`)	2017-07-25 15:37:40 +02:00
Paweł Dziepak	345a91d55d	tests/row_cache: test queries with no clustering ranges Reproducer for #2604. Message-Id: <20170725131220.17467-3-pdziepak@scylladb.com> (cherry picked from commit `79a1ad7a37`)	2017-07-25 15:37:32 +02:00
Paweł Dziepak	fda8b35cda	tests: do not overload the meaning of empty clustering range Empty clustering key range is perfectly valid and signifies that the reader is not interested in anything but the static row. Let's not make it mean anything else. Message-Id: <20170725131220.17467-2-pdziepak@scylladb.com> (cherry picked from commit `1ea507d6ae`)	2017-07-25 15:37:29 +02:00
Paweł Dziepak	08ac0f1100	cache: fix aborts if no clustering range is specified cache_streamed_mutation assumed that at least one clustering range was specified. That was wrong since the readers are allowed to query just for a static row (e.g. counter update that modifies only static columns). Fixes #2604. Message-Id: <20170725131220.17467-1-pdziepak@scylladb.com> (cherry picked from commit `6572f38450`)	2017-07-25 15:37:28 +02:00
Calle Wilund	db455305a2	system_keyspace: Make sure "system" is written to keyspaces (visible) Fixes #2514 Bug in schema version 3 update: We failed to write "system" to the schema tables. Only visible on an empty instance of course. Message-Id: <1500469809-23546-2-git-send-email-calle@scylladb.com> (cherry picked from commit `7a583585a2`)	2017-07-24 11:33:25 +02:00
Avi Kivity	e1a3052e76	tests: fix sstable_datafile_test build with boost 1.55 Boost 1.55 accidentally removed support for "range for" on recursive_directory_iterator (earlier and later versions do support it). Use old-style iteration instead. Message-Id: <20170724080128.8824-1-avi@scylladb.com> (cherry picked from commit `c21bb5ae05`)	2017-07-24 11:20:53 +03:00
Tomasz Grabiec	50fa3f3b89	schema_registry: Keep unused entries around for 1 second This is in order to avoid frequent misses which have a relatively high cost. A miss means we need to fetch schema definition from another node and in case of writes do a schema merge. If the schema is kept alive only by the incoming request, then it will be forgotten immediately when the request is done, and the next request using the same schema version will miss again. Refs #2608. Message-Id: <1500632447-10104-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `29a82f5554`)	2017-07-24 10:12:09 +02:00
Tomasz Grabiec	8474b7a725	legacy_schema_migrator: Don't snapshot empty legacy tables Otherwise we will create a new (empty) snapshot each time we boot. Message-Id: <1500573920-31478-2-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `ecc85988dd`)	2017-07-24 09:56:22 +02:00
Tomasz Grabiec	0fc874e129	database: Allow disabling auto snapshots during drop/truncate Message-Id: <1500573920-31478-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `408cea66cd`)	2017-07-24 09:56:19 +02:00
Duarte Nunes	5cf1a19f3f	Merge 'Fix possible inconsistency of table schema version' from Tomasz "Fixes issues uncovered in longevity test (#2608). Main problem is that due to time drift scylla_tables.version column may not get deleted on all nodes doing the schema merge, which will make some nodes come up with different table schema version than others. The inconsistency will not heal because scylla_tables doesn't take part in the schema sync. This is fixed by the last patch. This will cause nodes to constantly try to sync the schema, which under some conditions triggers #2617." * tag 'tgrabiec/fix-table-schema-version-inconsistency-v1' of github.com:scylladb/seastar-dev: schema_tables: Add scylla_tables to ALL schema: Make schema_mutations equality consistent with digest schema_tables: Extract compact_for_schema_digest() schema_tables: Always drop scylla_tables::version (cherry picked from commit `937fe80a1a`)	2017-07-24 09:54:45 +02:00
Tomasz Grabiec	f48466824f	schema_registry: Ensure schema_ptr is always synced on the other core global_schema_ptr ensures that schema object is replicated to other cores on access. It was replicating the "synced" state as well, but only when the shard didn't know about the schema. It could happen that the other shard has the entry, but it's not yet synced, in which case we would fail to replicate the "synced" state. This will result in exception from mutate(), which rejects attempts to mutate using an unsynced schema. The fix is to always replicate the "synced" state. If the entry is syncing, we will preemptively mark it as synced earlier. The syncing code is already prepared for this. Refs #2617. Message-Id: <1500555224-15825-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `65c64614aa`)	2017-07-24 09:52:31 +02:00
Avi Kivity	914f6f019f	Update ami submodule * dist/ami/files/scylla-ami 5dfe42f...2bd1481 (1): > Enable support for experimental CPU controller in i3 instances	2017-07-24 10:27:35 +03:00
Shlomi Livne	f5bb363f96	release: prepare for 2.0.rc1 Signed-off-by: Shlomi Livne <shlomi@scylladb.com>	2017-07-23 09:47:11 +03:00
Duarte Nunes	61ba56f628	schema: Support compaction enabled attribute Fixes #2547 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170721132206.3037-1-duarte@scylladb.com> (cherry picked from commit `7eecda3a61`)	2017-07-21 15:39:48 +02:00
Tomasz Grabiec	f4d3e5cdcf	Merge "Drop mutations that raced with truncate" from Duarte Instead of retrying, just drop mutations that raced with a truncate. * git@github.com:duarten/scylla.git truncate-reorder/v1: database: Rename replay_position_reordered_exception database: Drop mutations that raced with truncate (cherry picked from commit `63caa58b70`)	2017-07-21 15:39:20 +02:00
Avi Kivity	0291a4491e	Merge "restrict background writers with scheduling groups" from Glauber "This patchset restricts background writers - such as compactions, streaming flushes and memtable flushes to a maximum amount of CPU usage through a seastar::thread_scheduling_group. The said maximum is recommended to be set 50 % - it is default disabled, but can be adjusted through a configuration option until we are able to auto-tune this. The second patch in this series provides a preview on how such auto-tune would look like. By implementing a simple controller we automatically adjust the quota for the memtable writer processes, so that the rate at which bytes come in is equal to the rates at which bytes are flushed. Tail latencies are greatly reduced by this series, and heavy spikes that previously appeared on CPU-bound workloads are no more." * 'memtable-controller-v5' of https://github.com/glommer/scylla: simple controller for memtable/streaming writer shares. restrict background writers to 50 % of CPU. (cherry picked from commit `c5ee62a6a4`)	2017-07-20 15:13:39 +03:00
Duarte Nunes	83cc640c6a	Merge 'Revert back to 1.7 schema layout in memory' from Tomasz "Fixes schema layout incompatibility in a mixed 1.7 and 2.0 cluster (#2555) by reverting back to using the old layout in memory and thus also in across-node requests. We still use the new v3 layout in schema tables (needed by drivers and external tools). Translations happen when converting to/from schema mutations." * tag 'tgrabiec/use-v2-schema-layout-in-memory-v2' of github.com:scylladb/seastar-dev: schema: Revert back to the 1.7 layout of static compact tables in memory schema: Use v3 column layout when converting to/from schema mutations schema: Encapsulate column layout translations in the v3_columns class (cherry picked from commit `1daf1bc4bb`)	2017-07-19 19:49:43 +03:00
Duarte Nunes	2f06c54033	thrift/handler: Remove leftover debug artifacts Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170705161156.2307-1-duarte@scylladb.com> (cherry picked from commit `d583ef6860`)	2017-07-19 19:49:35 +03:00
Calle Wilund	9abe7651f7	system_schema: Fix remaining places not handling two system keyspaces Some places remained where code looked directly at system_keyspace::NAME to determine iff a ks is considered special/system/protected. Including schema digest calculation. Export "is_system_keyspace" and use accordingly. Message-Id: <1500469809-23546-1-git-send-email-calle@scylladb.com> (cherry picked from commit `247c36e048`)	2017-07-19 19:48:30 +03:00
Amos Kong	784aea12e7	scylla_raid_setup: fix syntax error /usr/lib/scylla/scylla_raid_setup: line 132: syntax error near unexpected token `fi' Fixes #2610 Signed-off-by: Amos Kong <amos@scylladb.com> Message-Id: <af3a5bc77c5ba2b49a8f48a5aaa19afffb787886.1500430021.git.amos@scylladb.com> (cherry picked from commit `2bdcad5bc3`)	2017-07-19 11:10:43 +03:00
Avi Kivity	3a98959eba	dist: tolerate sysctl failures sysctl may fail in a container environment if /proc is not virtualized properly. Fixes #1990 Message-Id: <20170625145930.31619-1-avi@scylladb.com> (cherry picked from commit `08488a75e0`)	2017-07-18 15:45:41 +03:00
Duarte Nunes	2c7d597307	wrapping_range: Fix lvalue transform() Instead of copying and moving the bound, pass it by reference so the transformer can decide whether it wants to copy or not. The only caller so far doesn't want a copy and takes the value by reference, which would be capturing a temporary value. Caught by the view_schema_test with gcc7. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170705210255.29669-1-duarte@scylladb.com> (cherry picked from commit `3dd0397700`)	2017-07-18 14:35:58 +03:00
Duarte Nunes	8d46c4e049	thrift: Fail when mixed CFs are detected Fixes #2588 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170717222612.7429-1-duarte@scylladb.com> (cherry picked from commit `d9fa3bf322`)	2017-07-18 10:21:45 +03:00
Asias He	b1c080984f	gossip: Implement the missing fd_max_interval_ms and fd_initial_value_ms options They are useful for larger clusters with larger gossip message latency. By default fd_max_interval_ms is 2 seconds, which means the failure_detector will ignore any gossip message update interval larger than 2 seconds. However, in a larger cluster, the gossip message update interval can be larger than 2 seconds. Fixes #2603. Message-Id: <49b387955fbf439e49f22e109723d3a19d11a1b9.1500278434.git.asias@scylladb.com> (cherry picked from commit `adc5f0bd21`)	2017-07-17 13:29:30 +03:00
Duarte Nunes	e1706c36b7	Merge 'Fixes around migration to v3 schema tables' from Tomasz branch 'tgrabiec/schema-migration-fixes' of github.com:scylladb/seastar-dev: schema: Use proper name comparator legacy_schema_migrator: Properly migrate non-UTF8 named columns schema_tables: Store column_name in text form legacy_schema_migrator: Migrate columns like Cassandra schema_builder: Add factory method for default_names legacy_schema_migrator: Simplify logic thrift: Don't set regular_column_name_type schema: Use proper column name type for static columns schema: Fix column_name_type() for static compact tables schema: Introduce clustering_column_at() thrift: Reuse cell_comparator::to_sstring() for obtaining comparator type partition_slice_builder: Use proper column's type instead of regular_column_name_type() (cherry picked from commit `13caccf1cf`)	2017-07-17 12:42:19 +03:00
Avi Kivity	63c8306733	Update seastar submodule * seastar b812cee...867b7c7 (1): > rpc: start server's send loop only after protocol negotiation Fixes #2600. Still tracking upstream.	2017-07-17 10:41:59 +03:00
Avi Kivity	a7dfdc0155	tests: move tmpdir to /tmp Reduces view_schema_test runtime to 5 seconds, from 53 seconds on an NVMe disk with write-back cache, and forever on a spinning disk. Message-Id: <20170716081653.10018-1-avi@scylladb.com> (cherry picked from commit `d9c64ef737`)	2017-07-17 08:47:17 +03:00
Avi Kivity	70be29173a	tests: copy the sstable with an unknown component to the data directory We will be creating links to those sstable's files, and those don't work if the data directory and the test sstable are on different devices. Copying the files to the same directory fixes the problem. Message-Id: <20170716090405.14307-1-avi@scylladb.com> (cherry picked from commit `9116dd91cb`)	2017-07-17 08:47:08 +03:00
Avi Kivity	e09d4a9b75	Update seastar submodule * seastar 844bcfb...b812cee (1): > Update dpdk submodule Fixes #2595 (again). Still tracking master.	2017-07-16 17:01:48 +03:00
Avi Kivity	67f25e56a6	Update seastar submodule * seastar ff34c42...844bcfb (1): > Update dpdk submodule Still tracking master. Fixes #2595.	2017-07-15 19:18:10 +03:00
Tomasz Grabiec	74c4651b95	Merge "Fixes for memtable flushing and replay positions" from Duarte We don't ensure mutations are applied in memory following the order of their replay positions. A memtable can thus be flushed with replay position rp, with the new one being at replay position rp', where rp' < rp. This breaks an intrinsic assumption in the code, which this series addresses. Fixes #2074 branch memtable-flush/v3 of git@github.com:duarten/scylla.git: commitlog: Always flush latest memtable column_family: More precise count of switched memtables column_family: Fix typo in pending_tasks metric name column_family: More precise count of pending flushes dirty_memory_manager: Remove unnecessary check from flush_one() column_family: Don't rely on flush_queue to guarantee flushes finished column_family: Don't bother closing the flush_queue on stop() column_family: Stop using flush_queue column_family: Remove outdated comment about the flush_queue memtable: Stop tracking the highest flushed rp (cherry picked from commit `caa62f7f05`)	2017-07-14 19:07:33 +02:00
Duarte Nunes	58bfb86d73	storage_proxy: Preserve replica order across mutations In storage_proxy we arrange the mutations sent by the replicas in a vector of vectors, such that each row corresponds to a partition key and each column contains the mutation, possibly empty, as sent by a particular replica. There is reconciliation-related code that assumes that all the mutations sent by a particular replica can be found in a single column, but that isn't guaranteed by the way we initially arrange the mutations. This patch fixes this and enforces the expected order. Fixes #2531 Fixes #2593 Signed-off-by: Gleb Natapov <gleb@scylladb.com> Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170713162014.15343-1-duarte@scylladb.com> (cherry picked from commit `b8235f2e88`)	2017-07-14 12:11:50 +03:00
Tomasz Grabiec	cb94c66823	legacy_schema_migrator: Fix calculation of is_dense Current algorithm was marking tables with regular columns not named "value" as not dense, which doesn't have to be the case. It can be either way. It should be enough to look at clustering components. If there is a clustering key, then table is dense if and only if all comparator components belong to the clustering key. If there is no clustering key, then if there are any regular columns we're sure it's not dense. Fixes #2587. Message-Id: <1499877777-7083-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `30ec4af949`)	2017-07-13 17:28:25 +03:00
Tomasz Grabiec	5aa3e23fcd	gdb: Fix "scylla columnfamilies" command Broken in `0e4d5bc2f3`. Message-Id: <1499951956-26206-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `54953c8d27`)	2017-07-13 16:33:50 +03:00
Takuya ASADA	aac1d5d54d	dist/common/systemd: move scylla-server.service to be after network-online.target instead of network.target To make sure Scylla starts after the network is up, we need to move from network.target to network-online.target. Fixes #2337 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1493661832-9545-1-git-send-email-syuu@scylladb.com> (cherry picked from commit `0c81974bc4`)	2017-07-12 13:36:52 +03:00
Glauber Costa	a371b8a5bf	change task quota's default The default of 2ms is somewhat arbitrary. Now that we have a lot more mileage deploying Scylla applications in production it does sound not only arbitrary, but high. In particular, it is really hard to achieve 1ms latencies in the face of CPU-heavy workloads with it. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <1499354495-27173-1-git-send-email-glauber@scylladb.com> (cherry picked from commit `780a6e4d2e`)	2017-07-12 10:21:35 +03:00
Avi Kivity	a69fb8a8ed	Update seastar submodule * seastar 89cc97c...ff34c42 (6): > tls: Wrap all IO in semaphore (Fixes #2575) > tests/lowres_clock_test.cc: Declare helper static > tests/lowres_clock_test.cc: fix compilation error for older GCC > configure.py: verifies boost version > pkg-config: Eliminate spaces in include path arguments > allow applications to override task-quota-ms Still tracking seastar master.	2017-07-12 10:20:49 +03:00
Avi Kivity	00b9640b2c	Merge "Preserve table schema digest on schema tables migration" from Tomasz "Currently new nodes calculate digests based on v3 schema mutations, which are very different from v2 mutations. As a result they will use schemas with different table_schema_version than the old nodes. The old nodes will not recognize the version and will try to request its definition. That will fail, because old nodes don't understand v3 schema mutations. To fix this problem, let's preserve the digests during migration, so that they're the same on new and old nodes. This will allow requests to proceed as usual. This does not solve the problem of schema being changed during the rolling upgrade. This is not allowed, as it would bring the same problem back. Fixes #2549." * tag 'tgrabiec/use-consistent-schema-table-digests-v2' of github.com:cloudius-systems/seastar-dev: tests: Add test for concurrent column addition legacy_schema_migrator: Set digest to one compatible with the old nodes schema_tables: Persist table_schema_version schema_tables: Introduce system_schema.scylla_tables schema_tables: Simplify read_table_mutations() schema_tables: Resurrect v2 read_table_mutations() system_keyspace: Forward-declare legacy schemas legacy_schema_migrator: Take storage_proxy as dependency (cherry picked from commit `a397889c81`)	2017-07-11 17:23:21 +03:00
Gleb Natapov	59d608f77f	consistency_level: report less live endpoints in Unavailable exception if there are pending nodes DowngradingConsistencyRetryPolicy uses live replicas count from Unavailable exception to adjust CL for retry, but when there are pending nodes CL is increased internally by a coordinator and that may prevent retried query from succeeding. Adjust live replica count in case of pending node presence so that retried query will be able to proceed. Fixes #2535 Message-Id: <20170710085238.GY2324@scylladb.com> (cherry picked from commit `739dd878e3`)	2017-07-11 17:16:46 +03:00
Botond Dénes	1717922219	Fix crash in the out-of-order restrictions error msg composition Use name of the existing preceding column with restriction (last_column) instead of assuming that the column right after the current column already has restrictions. This will yield an error message that is different from that of Cassandra, albeit still a correct one. Fixes #2421 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <40335768a2c8bd6c911b881c27e9ea55745c442e.1499781685.git.bdenes@scylladb.com> (cherry picked from commit `33bc62a9cf`)	2017-07-11 17:15:45 +03:00
Paweł Dziepak	7cd4bb0c4a	transport: send correct type id for counter columns CQL reply may contain metadata that describes columns present in the response including the information about their type. However, Scylla incorrectly reports counter types as bigint. The serialised format of counters and bigint is exactly the same, which could explain why the problem hasn't been noticed earlier but it is a bug nevertheless. Fixes #2569. Message-Id: <20170711130520.27603-1-pdziepak@scylladb.com> (cherry picked from commit `5aa523aaf9`)	2017-07-11 16:37:24 +03:00
Tomasz Grabiec	588ae935e7	legacy_schema_migrator: Use separate joinpoint instance for each table Otherwise we may deadlock, as explained in commit `5e8f0efc8`: Table drop starts with creating a snapshot on all shards. All shards must use the same snapshot timestamp which, among other things, is part of the snapshot name. The timestamp is generated using supplied timestamp generating function (joinpoint object). The joinpoint object will wait for all shards to arrive and then generate and return the timestamp. However, we drop tables in parallel, using the same joinpoint instance. So joinpoint may be contacted by snapshotting shards of tables A and B concurrently, generating timestamp t1 for some shards of table A and some shards of table B. Later the remaining shards of table A will get a different timestamp. As a result, different shards may use different snapshot names for the same table. The snapshot creation will never complete because the sealing fiber waits for all shards to signal it, on the same name. Message-Id: <1499762663-21967-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `310d2a54d2`)	2017-07-11 12:31:21 +03:00
Avi Kivity	c292e86b3c	sstables: fix use-after-free in read_simple() `r` is moved-from, and later captured in a different lambda. The compiler may choose to move and perform the other capture later, resulting in a use-after-free. Fix by copying `r` instead of moving it. Discovered by sstable_test in debug mode. Message-Id: <20170702082546.20570-1-avi@scylladb.com> (cherry picked from commit `07b8adce0e`)	2017-07-10 15:32:57 +03:00
Asias He	3dc0d734b0	repair: Do not store the failed ranges The number of failed ranges can be large so it can consume a lot of memory. We already logged the failed ranges in the log. No need to store them in memory. Message-Id: <7a70c4732667c5c3a69211785e8efff0c222fc28.1498809367.git.asias@scylladb.com> (cherry picked from commit `b2a2fbcf73`)	2017-07-10 14:37:47 +03:00
Takuya ASADA	2d612022ba	dist/common/scripts/scylla_cpuscaling_setup: skip configuration when cpufreq driver isn't loaded Configuring the cpufreq service on VMs/IaaS causes an error because they don't support cpufreq. To prevent the error, skip the whole configuration when the driver is not loaded. Fixes #2051 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1498809504-27029-1-git-send-email-syuu@scylladb.com> (cherry picked from commit `1c35549932`)	2017-07-10 14:08:54 +03:00


12463 Commits