Different request types have different timeouts (for example,
read requests have shorter timeouts than truncate requests), and
different request sources have different timeouts as well (for example,
an internal local query wants an infinite timeout, while a user query
has a user-defined timeout).
To allow for this, define two types: timeout_config represents the
timeout configuration for a source (e.g. user), while
timeout_config_selector represents the request type, and is used
to select a timeout within a timeout configuration. The latter is
implemented as a pointer-to-member.
Also introduce an infinite timeout configuration for internal
queries.
"
Recently many changes have landed in seastar for the I/O Scheduler. We
can now describe the I/O storage of a machine by its visible properties
like throughput and bandwidth instead of relying on an indirect
calculation.
For the instances we support, we can just measure those properties and
start using them right away.
A version of iotune that computes those properties is not yet ready, but
while working on it I have noticed that we aren't really setting the
nomerges and scheduler properties of the disks under test. We definitely
should, since that can influence the results. So this patchset also
starts doing that.
The commandline for iotunev2 shouldn't change much. When it is ready we
will just adjust this script once more.
"
* 'scylla_io_setup' of github.com:glommer/scylla:
scylla_io_setup: preconfigure i3 and i2 instances with new I/O scheduler properties
scylla_lib: drop support for m3 and c3 AWS instance types
io_setup: call blocktune before tuning I/O
blocktune: allow it to be called as a library.
scripts: move scylla-blocktune to scripts location
* seastar 70aecca...ac02df7 (5):
> Merge "Prefix preprocessor definitions" from Jesse
> cmake: Do not enable warnings transitively
> posix: prevent unused variable warning
> build: Adjust DPDK options to fix compilation
> io_scheduler: adjust property names
DEBUG, DEFAULT_ALLOCATOR, and HAVE_LZ4_COMPRESS_DEFAULT macro
references are now prefixed with SEASTAR_. Some may need to become
Scylla macros.
An iterator was incorrectly dereferenced when the timestamp resolution
was not explicitly specified.
The following dtests are fixed:
compaction_additional_test.CompactionAdditionalStrategyTests_with_TimeWindowCompactionStrategy.compaction_is_started_on_boot_test
compaction_additional_test.CompactionAdditionalTest.compact_data_by_time_window_test
compaction_additional_test.CompactionAdditionalTest.compaction_removes_ttld_data_by_time_windows_test
compaction_test.TestCompaction_with_DateTieredCompactionStrategy.compaction_strategy_switching_test
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180427192545.17440-1-raphaelsc@scylladb.com>
We can use iotunev2 (or any other I/O generator) to test for the limits
of the disks for the i2 and i3 instance classes. The values here come
from ~5 invocations of the (yet to be upstreamed) iotune v2, with the
IOPS numbers rounded for convenience of reading.
During the execution, I verified that the disks were saturated, so we
can trust these numbers even if iotunev2 is merged in a different form.
The numbers are very consistent, unlike what we usually saw with the
first version of iotune.
Previously, we were just multiplying the concurrency number by the
number of disks. Now that we have better infrastructure, we will
manually test i3.large and i3.xlarge, since their disks are smaller
and slower.
For the other i3 instances, and all instances in the i2 family, storage
scales up by adding more disks. So we can keep multiplying the
characteristics of one known disk by the number of disks, assuming
perfect scaling.
Example for i3, obtained with i3.2xlarge:
read_iops = 411k
read_bandwidth = 1.9GB/s
So for i3.16xlarge, we would have read_iops = 3.28M and 15GB/s - very
close to the numbers advertised by AWS.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
m3 has 80GB SSDs in its largest form and I doubt anybody has ever
used it with Scylla.
I am also not aware of any c3 deployments. Since it is past generation,
it doesn't even show up in the default instance selector anymore.
I propose we drop AMI support for both. In practice, what that means is
that we won't auto-tune their I/O properties and people who want to use
them will have to run scylla_io_setup - like they do today with the EBS
instances.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
We are not configuring the disks the way we want them with respect to
scheduler and nomerges. This is an oversight that became clear now that
I started rewriting iotune, since I will explicitly test for that. But
since this can affect the results, it should have been here all along.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
This patch makes the functions in scylla-blocktune available as a
library for other scripts - namely scylla_io_setup.
The filename, scylla-blocktune, is not the most convenient thing to
import from Python, so instead of just wrapping it in the usual
__main__ guard I am splitting the file in two.
Another option would be to patch all callers to call
scylla_blocktune.py, but because we usually do not use extensions in
scripts that are meant to be called directly, I decided on the split.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
scylla-blocktune currently lives in the top level, but this is mostly
historical. When the time comes for us to install it, the packaging
systems will copy it to /usr/lib/scylla with the others.
So, for consistency, let's make sure that it also lives in the scripts
directory.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
After upgrade from 1.7 to 2.0, nodes will record a per-table schema
version which matches that on 1.7 to support the rolling upgrade. Any
later schema change (after the upgrade is done) will drop this record
from affected tables so that the per-table schema version is
recalculated. If nodes perform a schema pull (they detect schema
mismatch), then the merge will affect all tables and will wipe the
per-table schema version record from all tables, even if their schema
did not change. If then only some nodes get restarted, the restarted
nodes will load tables with the new (recalculated) per-table schema
version, while nodes that were not restarted will still use the 1.7
per-table schema version. Until all nodes are restarted, writes and
reads between nodes from the two groups will involve a needless
exchange of schema definitions.
This will manifest in logs with repeated messages indicating schema
merge with no effect, triggered by writes:
database - Schema version changed to 85ab46cd-771d-36c9-bc37-db6d61bfa31f
database - Schema version changed to 85ab46cd-771d-36c9-bc37-db6d61bfa31f
database - Schema version changed to 85ab46cd-771d-36c9-bc37-db6d61bfa31f
The sync will be performed if the receiving shard forgets the foreign
version, which happens if it doesn't process any request referencing
it for more than 1 second.
This may impact latency of writes and reads.
The fix is to treat schema changes which drop the 1.7 per-table schema
version marker as an alter, which will switch in-memory data
structures to use the new per-table schema version immediately,
without the need for a restart.
Fixes #3394
Tests:
- dtest: schema_test.py, schema_management_test.py
- reproduced and validated the fix with run_upgrade_tests.sh from git@github.com:tgrabiec/scylla-dtest.git
- unit (release)
Message-Id: <1524764211-12868-1-git-send-email-tgrabiec@scylladb.com>
"
This patch series introduces initial support for writing SSTables in
'mc' format (aka SSTables 3.0).
Currently, the following components are written in 3.0 format:
- Data.db
- Index.db
- Summary.db
(there were no changes to summary files format compared to ka/la)
Other SSTables components are written in the old format for now as they
still need to exist to satisfy post-flush processing.
For now, only rows are written to the data file and indexed. Range
tombstones are not supported.
Writing rows is supported in full with the only exception being counter
cells. All the other features (TTLed data, row/cell level tombstones,
collections, etc) are supported.
Unit tests rely on producing files and binary-comparing them with
'golden' copies produced using Cassandra 3.11, so that the tests do not
have to wait for SSTables 3.0 read support to be implemented.
=======================================
Implementation notes
=======================================
Internally, sstable_writer has been refactored to support multiple
implementations that are instantiated in its constructor based on the
sstable version. Little to no code is shared between sstable_writer_v2
and sstable_writer_v3, as we only intend to support sstable_writer_v2
alongside sstable_writer_v3 for a single release (to be able to roll
back on a rolling upgrade failure) and then plan to get rid of it
entirely and switch to always writing SSTables in the new format.
The design of sstable_writer_v3 mostly follows that of its precursors
sstable_writer(_v2) and components_writer. Some refactoring and further
code rearrangements are expected in the future but the main code is
there.
"
* 'projects/sstables-30/write-rows/v2' of https://github.com/argenet/scylla:
Add tests for writing data and index files in SSTables 3.0 ('mc') format.
Support for writing SSTables 3.0 ('mc') Data.db and Index.db files - rows only.
Add missing enum values to bound_kind.
Add building blocks for writing data in SSTables 3.0 format.
Refactor sstable_writer to support various internal implementations.
Add is_fixed_length() to data types.
Add mutation_partition::apply_insert() overload that accepts TTL and expiry for row marker.
bound_kind::clustering, bound_kind::excl_end_incl_start and
bound_kind::incl_end_excl_start are used during SSTables 3.0 writing.
bound_kind::static_clustering is not used yet but added for completeness
and parity with the Origin.
For #1969.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
For any given CQL data type, this member returns whether its values are
of fixed or variable length. This is used by SSTables 3.0 format to only
store the length value for variable-length cells.
For #1969.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
"
This patchset prepares everything for support of both 2.x and 3.x
formats and implements reading a very simple table with just partition
keys from an sstable in 3.x format.
Tests: units (release)
"
* 'haaawk/sstables3/read_only_partitions_v4' of ssh://github.com/scylladb/seastar-dev: (22 commits)
Test for reading sstable in MC format with no columns
Use new mp_row_consumer_m and data_consume_rows_context_m
Introduce mp_row_consumer_m
Rename mp_row_consumer to mp_row_consumer_k_l
Introduce consumer_m and data_consume_rows_context_m
Use read_short_length_bytes in RANGE_TOMBSTONE
Use read_short_length_bytes in ATOM_START
Use read_short_length_bytes in ROW_START
Add continuous_data_consumer::read_short_length_bytes
Reduce duplication with continuous_data_consumer::read_partial_int
Add test for a simple table with just partition key
Add test for reading index
Extract mp_row_consumer to separate header
Make sstable_mutation_reader independent from mp_row_consumer
Make sstable_mutation_reader a template
Make data_consume_context a template
Move data_consume_rows_context from row.cc to row.hh
Decouple sstable.hh and row.hh
Reduce visibility of sstable::data_consume_*
Move data_consume_context to separate header
...
Take DataConsumeRowsContext type as parameter.
This will allow us to implement different context
for reading 3.x files.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Parametrize it with the type of data consume rows context.
There will be different implementations used for different
sstable file formats.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
It will be used as a template parameter for sstable_mutation_reader
once it's turned into a template. This means the definition has
to be accessible.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
They are used just in partition.cc, row.cc and sstables_test.cc,
so it is useful to narrow their scope by moving them
to data_consume_context.hh.
This will make it much easier to turn data_consume_context into
a template.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
It's used only in row.cc, partition.cc and sstables_test.cc
so it's better to reduce the dependency just to those files.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
On some build environments we may want to limit the number of parallel
jobs: ninja-build runs ncpus jobs by default, which may be too many
since g++ consumes a very large amount of memory.
So support --jobs <njobs>, just like in the rpm build script.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180425205439.30053-1-syuu@scylladb.com>
"
This patchset brings in a statistics collector that tracks minimal
values for timestamps, TTLs and local deletion times for all the updates
made to a given memtable.
These statistics are later used when flushing memtables into SSTables
in 3.x ('mc') format to delta-encode the corresponding values using the
collected minimums as bases (that is why they are called encoding
statistics).
This patchset is sent out separately from the other changes that
introduce writing SSTables 3.x, to facilitate the read path
implementation, which also needs the encoding_stats structure.
The tests for the write path implicitly cover this functionality, as
any rows written to an SSTables 3.0 file make use of delta-encoding.
"
* 'projects/sstables-30/collect-encoding-statistics-v4' of https://github.com/argenet/scylla:
Collect encoding statistics for memtable updates.
Factor out min_tracker and max_tracker as common helpers.
Always pass mutation_partitions to partition_entry::apply()
We keep track of all updates and store the minimal values of timestamps,
TTLs and local deletion times across all the inserted data.
These values are written as a part of serialization_header for
Statistics.db and used for delta-encoding values when writing Data.db
file in SSTables 3.0 (mc) format.
For #1969.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>