scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-08 16:03:20 +00:00

Author	SHA1	Message	Date
Botond Dénes	fae5a2a8c8	shard_reader: recreate_reader(): fix empty range case If the shard reader is created for a singular range (has a single partition), and then it is evicted after reaching EOS, when recreated we would have to create a reader that reads an empty range, since the only partition the range has was already read. Since it is not possible to create a reader with an empty range, we just didn't recreate the reader in this case. This is incorrect however, as the code might still attempt to read from this reader, if only due to a bug, and would trigger a crash. The correct fix is to create an empty reader that will immediately be at EOS.	2019-02-12 16:20:51 +02:00
Botond Dénes	cd807586f6	foreign_reader: rip out the now unused private API Drop all the glue code, needed in the past so the shard reader can be implemented on top of foreign reader. As the shard reader moved away from foreign reader, this glue code is not needed anymore.	2019-02-12 16:20:51 +02:00
Botond Dénes	d80bc3c0a5	shard_reader: move away from foreign_reader In the past, shard reader wrapped a foreign reader instance, adding functionality required by the multishard reader on top. This has worked well to a certain degree, but after the addition of pause-resume of shard reader, the cooperation with foreign reader became more-and-more a struggle. It has now gotten to a point, where it feels like shard reader is fighting foreign reader as much as it reuses it. This manifested itself in the ever growing amount of glue code, and hacks baked into foreign reader (which is supposed to be of general use), specific to the usage in the multishard reader. It is time we don't force this code-reuse anymore and instead implement all the required functionality in shard reader directly.	2019-02-12 16:20:51 +02:00
Botond Dénes	da0c01c68b	multishard_combining_reader: make shard_reader a shared pointer Some members of shard reader have to be accessed even after it is destroyed. This is required by background work that might still be pending when the reader is destroyed. This was solved by creating a special `state` struct, which contained all the members of the shard readers that had to be accessed even after it was destroyed. This state struct was managed through a shared pointer, that each continuation that was expected to outlive the reader, held a copy of. This however created a minefield, where each line of the code had to be carefully audited to access only fields that will be guaranteed to remain valid. Fix this mess by making the whole class a shared pointer, with `enable_shared_from_this`. Now each continuation just has to make sure to keep `this` alive and code can now access all members freely (well, almost).	2019-02-12 16:20:51 +02:00
Botond Dénes	f1c3421eb4	multishard_combining_reader: move the shard reader definition out Shard reader started its life as a very thin layer above foreign reader, with just some convenience methods added. As usual, by now it has grown into a hairy monster, its class definition out-growing even that of the multishard reader itself. It is time shard reader is moved into the top-level scope, improving the readability of both classes.	2019-02-12 16:20:51 +02:00
Botond Dénes	7114b59309	multishard_combining_reader: disentangle shard_reader Currently shard reader has a reference to the owning multishard reader and it freely accesses its members. This resulted in a mess, where it's not clear what exactly shard reader depends on. Disentangle this mess, by making the shard reader self-sufficient, passing all it depends on into its constructor.	2019-02-12 16:20:51 +02:00
Nadav Har'El	85e5791710	tests/view_schema_test: fix flakiness caused by missing eventually() All tests that involve writing to a base table and then reading from the view table must use the eventually() function to account for the fact that the view update is asynchronous, and may be visible only some time after writing the base table. Forgetting an eventually() can cause the test to become flaky and sometimes fail because the expected data is not yet in the view. Botond noticed these failures in practice in two subtests (test_partition_key_filtering_with_slice and test_clustering_key_in_restrictions). This patch fixes both tests, and I also reviewed the entire source file view_schem_test.cc and found additional places missing an eventually() (and also places that unnecessarily used eventually() to read from the base table), and fixed those as well. Fixes #4212 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190212121140.14679-1-nyh@scylladb.com>	2019-02-12 16:10:30 +02:00
Paweł Dziepak	eb03cf00f5	sstable: write_components: drop default for encoding stats There is no value if having a default value for encoding_stats parameter of write_components(). If anything it weakens the tests by encouraging not using the real encoding stats which is not what the actual sstable write path in Scylla does. This patch removes the default value and makes most of the tests provide real encoding statistics. The ones that do not are those that have no easy way of obtaining those (and those stats are not that important for the test itself) or there is a reason for not using those (sstable_3_x_test::test_sstable_write_large_row uses row size thresholds based on size with default-constructed encoding_stats). Message-Id: <20190212124356.14878-1-pdziepak@scylladb.com>	2019-02-12 16:08:24 +02:00
Calle Wilund	4a52ed7884	commitlog: Accept recycled (not yet re-used) segments in replay Refs #4085 Changes commitlog descriptor to both accept "Recycled-Commitlog..." file names, and preserve said name in the descriptor. This ensures we pick up the not-yet-used recycled segments left from a crash for replay. The replay in turn will simply ignore the recycled files, and post actual replay they will be deleted as needed. Message-Id: <20190129123311.16050-1-calle@scylladb.com>	2019-02-12 12:23:55 +02:00
Nadav Har'El	93baa334ea	create-relocatable-package.py: speed up slow compression create-relocatable-package.py currently (refs #4194) builds a compressed tar file, but does so using a painfully slow Python implementation of gzip, which is a problem considering the huge size (around 2 gigabytes) of Scylla's executable. On my machine, running it for a release build of Scylla takes a whopping 6 minutes. Just replacing the Python compression with a pipe to an external "gzip" process speeds up the run to just 2 minutes. But gzip is still not optimal, using only one thread even when on a many-core machine. If we switch to "pigz", a parallel implementation of "gzip", all cores are used and on my machine the compression speeds up to just 23 seconds - that's 15 times faster than before this patch. So this patch has create-relocatable-package.py use an external pigz process. "pigz" is now required on the build system (if you want to create packages), so is added to install-dependencies.sh. [avi: update toolchain] Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190212090333.3970-1-nyh@scylladb.com>	2019-02-12 11:19:04 +02:00
Nadav Har'El	1cf1af1502	scylla_setup: fix non-interactive behavior In commit `ec66dd6562`, in non-interactive runs of scylla_setup all options were unintentionally set to "false", regardless of the options passed on the scylla_setup command line. This can lead to all sorts of wrong behaviors, and in particular one test setup assumed it was enabling the Scylla service (which was previously the default) but after this commit, it no longer did. This patch restores the previous behavior: Non-interactive invocations of scylla_setup adhere to the defaults and the command-line options, rather than blindly choosing "false". Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190211214105.32613-1-nyh@scylladb.com>	2019-02-12 10:50:00 +02:00
Avi Kivity	da9628c6dc	auth: password_authenticator: protect against NULL salted_hash In case salted_hash was NULL, we'd access uninitialized memory when dereferencing the optional in get_as<>(). Protect against that by using get_opt() and failing authentication if we see a NULL. Fixes #4168. Tests: unit (release) Branches: 3.0, 2.3 Message-Id: <20190211173820.8053-1-avi@scylladb.com>	2019-02-11 18:54:03 +01:00
Avi Kivity	cb51fcab9d	README: improbe dbuild instructions Add a quick start, document more options, and link from the main README. Message-Id: <20190210154606.21739-1-avi@scylladb.com>	2019-02-11 09:25:25 +01:00
Avi Kivity	2724a66a12	docker: don't send .git during "docker build" It's huge and useless during "docker build" operations. Message-Id: <20190208161848.21125-1-avi@scylladb.com>	2019-02-11 09:17:14 +01:00
Glauber Costa	e0bfd1c40a	allow Cassandra SSTables with counters to be imported if they are new enough Right now Cassandra SSTables with counters cannot be imported into Scylla. The reason for that is that Cassandra changed their counter representation in their 2.1 version and kept transparently supporting both representations. We do not support their old representation, nor there is a sane way to figure out by looking at the data which one is in use. For safety, we had made the decision long ago to not import any tables with counters: if a counter was generated in older Cassandra, we would misrepresent them. In this patch, I propose we offer a non-default way to import SSTables with counters: we can gate it with a flag, and trust that the user knows what they are doing when flipping it (at their own peril). Cassandra 2.1 is by now pretty old. many users can safely say they've never used anything older. While there are tools like sstableloader that can be used to import those counters, there are often situations in which directly importing SSTables is either better, faster, or worse: the only option left. I argue that having a flag that allow us to import them when we are sure it is safe is better than having no option at all. With this patch I was able to successfully import Cassandra tables with counters that were generated in Cassandra 2.1, reshard and compact their SSTables, and read the data back to get the same values in Scylla as in Cassandra. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20190210154028.12472-1-glauber@scylladb.com>	2019-02-10 17:50:48 +02:00
Glauber Costa	61ea54eff6	tools: toolchain: dbuild: use host networking This is convenient to test scylla directly by invoking build/dev/scylla. This needs to be done under docker because the shared objects scylla looks for may not exist in the host system. During quick development we may not want to go through the trouble of packaging relocatable scylla every time to test changes. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20190209021033.8400-1-glauber@scylladb.com>	2019-02-10 12:16:47 +02:00
Duarte Nunes	d2d885fb93	Merge 'Fix misdetection of remote counter shards' from Paweł " The code reading counter cells form sstables verifies that there are no unsupported local or remote shards. The latter are detected by checking if all shards are present in the counter cell header (only remote shards do not have entries there). However, the logic responsible for doing that was incorrectly computing the total number of counter shards in a cell if the header was larger than a single counter shard. This resulted in incorrect complaints that remote shards are present. Fixes #4206 Tests: unit(release) " * tag 'counter-header-fix/v1' of https://github.com/pdziepak/scylla: tests/sstables: test counter cell header with large number of shards sstables/counters: fix remote counter shard detection	2019-02-10 12:16:31 +02:00
Paweł Dziepak	4eeb8eeed5	tests/sstables: test counter cell header with large number of shards The logic responsible for reading counters from sstables was getting confused by large headers. The size of the header depends directly on the number of shards. This tests checks that we can handle cells with large number of counter shards properly.	2019-02-08 17:06:31 +00:00
Paweł Dziepak	df1ac03154	sstables/counters: fix remote counter shard detection Each counter cell has a header with an entry for each local and global shards. The detection of remote shards is done by checking if there are any counter shards that do not have an entry in the header. This is done by computing the number of counter shards in a cell and comparing it to the number of header entries. However, the computation was wrong and included the size taken by the header itself. As a result, if the header was as big or larger than a single counter shard Scylla incorrectly complained about remote shards.	2019-02-08 17:04:22 +00:00
Glauber Costa	8ba6b569b1	relocatable python: make sure all shared objects are relocated The interpreter as it is right now has a bug: I incorrectly assumed that all the shared libraries that python dynamically links would be in lib-dynload. That is not true, and at least some of them are in site-packages. With that, we were loading system libraries for some shared objects. The approach taken to fix this is to just check if we're seeing a shared library and relocate everything we see: we will end up relocating the ones in lib64 too, but that not only should be okay, it is probably even more fool-proof. While doing that I noticed that I had forgotten to incorporate one of previous feedback from Avi (that we're leaving temporary files behind). So I'm fixing that as well. [avi: update toolchain] Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20190208115501.7234-1-glauber@scylladb.com>	2019-02-08 18:42:24 +02:00
Glauber Costa	fb742473e2	replace /usr/local as a source of packages in the python relocatable interpreter I was playing with the python3 interpreter trying to get pip to work, just to see how far we can go. We don't really need pip, but I figured it would be a good stress test to make sure that the process is working and robust. And it didn't really work, because although pip will correctly install things into $relocatable_root/local/lib, sys.path will still refer to a hardcoded /usr/local. While this should not affect Scylla, since we expect to have all our modules in out path anyway -- and that path is searched before /usr/local, it is still dangerous to make an absolute reference like this. Unfortunately, /usr/local/ it is included unconditionally by site.py, which is executed when the interpreter is started and there is no environment variable I found to change that (the help string refers to PYTHONNOUSERSITE, but I found no mention of that in site.py whatsoever) There is a way to tell site.py not to bother to add user sites, by passing the -s flag, which this patch does. Aside from doing that, we also enhance PYTHONPATH to include a reference to ./local/{lib,lib64}/python<version>/site-packages. After applying this patch, I was able to build an interpreter containing only python3-pip and python3-setuptools, and build the relocatable environment from there. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20190206052104.25927-1-glauber@scylladb.com>	2019-02-08 18:41:52 +02:00
Paweł Dziepak	64b1a2caf9	tests: modernise tmpdir tmpdir is a helper class representing a temporary directory. Unfortunately, it suffers for some problems such as lack of proper encapsulation and weak typing. This has caused bugs in the past when the user code accidentally modified the member variable with the path to the directory. This patch modernises tmpdir and updates its users. The path is stored in a std::filesystem::path and available read-only to the class users. mkdtemp and boost are replaced by standard solution. The users are update to use path more (when it didn't involve too many changes to their code) and stop using lw_shared_ptr to store the tmpdir when it wasn't necessary. tmpdir intentionally doesn't provide any helpers for getting the path as a string in order to discourage weak types. Message-Id: <20190207145727.491-1-pdziepak@scylladb.com>	2019-02-07 20:18:14 +02:00
Avi Kivity	e2e25720c1	Update seastar submodule * seastar c3be06d...428f4ac (13): > build: make the "dist" test respect the build type > Merge 'Add support for docker --cpuset-cpus' from Juliana > Merge "Add support for Coroutines TS" from Paweł > Merge "Modernize dependency management" from Avi > future: propagate broken_promise exception to abandoned continuations > net/inet_address: avoid clang Wmissing-braces > build: Default to the "Release" type if unspecified > rpc: log an exception that may happen while processing an RPC message > Add a --split-dwarf option to configure.py > build: Fix the `StdFilesystem` module > Compress debug info by default > Add an option for building with split dwarf > Dockerfile: install stow	2019-02-07 20:08:15 +02:00
Paweł Dziepak	de2a447576	utils/extremum_tracking: drop default constructor Default constructed extremum_tracker has uninitialised _default_value which basically makes it never correct to do that. Since this class is a mechanism and not a value it doesn't really need to be a regular type, so let's drop the default constructor. Message-Id: <20190207162430.7460-1-pdziepak@scylladb.com>	2019-02-07 18:31:25 +02:00
Tomasz Grabiec	7184289015	Merge "Various fixes and improvements for sstables statistics" from Paweł This series contains several fixes and improvements as well as new tests for sstable code dealing with statistics. * https://github.com/pdziepak/scylla.git sstable-stats-fixes/v1-rebased: sstables: compaction: don't access moved-from vector of sstables memtable: move encoding_stats_collector implementation out of header sstables: seal_statistics(): pass encoding_stats by constant reference sstables/mc/writer: don't assume all schema columns are present tests/sstable3: improvements to file compare tests: extract mutation data model tests/data_model: add support for expiring atomic cells tests/data_model: allow specifying timestamp for row markers tests/memtable: test column tracking for encoding stats sstables: use correct source of statistics in get_encoding_stats_for_compaction() utils/extremum_tracking: preserve "not-set" status on merge sstables/metadata_collector: move the default values to the global tracker tests/sstables: test for reading serialisation header tests/sstables: pass encoding stats to write_components() tests/sstable: test merging encoding_stats Fixes #4202.	2019-02-07 12:35:29 +01:00
Paweł Dziepak	67252de195	tests/sstable: test merging encoding_stats	2019-02-07 10:17:06 +00:00
Paweł Dziepak	e25603fbf7	tests/sstables: pass encoding stats to write_components() By default write_components() uses a safe default for encoding_stats which indicates that all columns are present. This may hide so bugs, so let's pass the real thing in the tests that this may matter.	2019-02-07 10:17:06 +00:00
Paweł Dziepak	d44d5ebf86	tests/sstables: test for reading serialisation header	2019-02-07 10:17:06 +00:00
Paweł Dziepak	ebf667fb9c	sstables/metadata_collector: move the default values to the global tracker column_stats is a per-partition tracker, while metadata_collector is the global one. The statistics gathered by column_stats are merged into the metadata_collector. In order to ensure that we get proper default values in case no value of particular kind (e.g. no TTLs) was seen they need to be set on the global tracker, not the per-partition one.	2019-02-07 10:16:50 +00:00
Paweł Dziepak	2680022df0	utils/extremum_tracking: preserve "not-set" status on merge extremum_tracker allows choosing a default value that's going to be used only if no "real" values were provided. Since it is never compared with the actual input values it can be anything. For instance, if the minimum tracker default value is 0 and there was one update with the value 1 the detected minimum is going to be 1 (the default is ignored). However, this doesn't work when the trackers are merged since that process always leaves the destination tracker in the "set" state regardless whether any of the merged trakcers has ever seen any value. This is fixed by this patch, by properly preserving _is_set state on merge.	2019-02-07 10:16:50 +00:00
Paweł Dziepak	84d8ee35d4	sstables: use correct source of statistics in get_encoding_stats_for_compaction() sstable class is responsible for much more things that it should. In particular, it takes care of both writing and reading sstables. The problem that it causes is that it is very easy to confuse those two. This is what has happened in get_encoding_stats_for_compaction(). Originally, it was using _c_stats as a source of the statistics, which is used only during the write and per-partition. Needless to say, the returned encoding_stats were bogus. The correct source of those statistics is get_stats_metadata().	2019-02-07 10:16:50 +00:00
Paweł Dziepak	e315448d0a	tests/memtable: test column tracking for encoding stats	2019-02-07 10:16:50 +00:00
Paweł Dziepak	591d5195a9	tests/data_model: allow specifying timestamp for row markers	2019-02-07 10:16:50 +00:00
Paweł Dziepak	b07cba6a89	tests/data_model: add support for expiring atomic cells	2019-02-07 10:16:50 +00:00
Paweł Dziepak	aab0b7360f	tests: extract mutation data model	2019-02-07 10:16:50 +00:00
Paweł Dziepak	fa216be260	tests/sstable3: improvements to file compare This patch introduces some improvement to file comparison: - exception flags are set so that any error triggers an exceptions and guarantees that they are not silently ignored - std::ios_base::binary flag is passed to open() - istreambuf_iterator is used instead of istream_iterator. It is better suited for comparing binary data.	2019-02-07 10:16:50 +00:00
Paweł Dziepak	bc61471132	sstables/mc/writer: don't assume all schema columns are present The writer constructor prepares lists of present static and regular columns, those should be used for any further checks.	2019-02-07 10:16:50 +00:00
Paweł Dziepak	0132bcc035	sstables: seal_statistics(): pass encoding_stats by constant reference	2019-02-07 10:16:50 +00:00
Paweł Dziepak	341f186933	memtable: move encoding_stats_collector implementation out of header	2019-02-07 10:16:50 +00:00
Paweł Dziepak	6d5c1a9813	sstables: compaction: don't access moved-from vector of sstables	2019-02-07 10:16:50 +00:00
Paweł Dziepak	a8a45a243b	tests/cql_test_env: don't override tmpdir::path The interface tmpdir::path isn't properly encapsulated and its users can modify the path even though they really shouldn't. This can happen accidentally, in cql_test_env a reference to tmpdir::path was created and later assigned to in one of the code paths. This caused tmpdir destructor to remove wrong directory at program exit. This patch solves the problem by avoiding referencing tmpdir::path, a copy is perfectly acceptable considering that this is tests-only code. Message-Id: <20190206173046.26801-1-pdziepak@scylladb.com>	2019-02-06 20:55:40 +02:00
Takuya ASADA	96b1cb97ba	dist/ami: don't cleanup build dir rm -rf build/* was to start rpm building on clean state, but it also delete scylla built binaries so it was not good idea. Instead of rm -rf build/*, we can check file existance on cloned directory, if it seems good we can reuse it. Also we need to run git pull on each package repo since it may not included latest commit. Fixes #4189 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <20190206101755.2056-1-syuu@scylladb.com>	2019-02-06 15:33:09 +02:00
Nadav Har'El	3e7dc7230d	build_deb.sh: fix error message The error message was apparently copied from the RPM script. Fix it. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190205162148.20698-1-nyh@scylladb.com>	2019-02-05 18:22:36 +02:00
Avi Kivity	54748ad15b	Merge "Allow non-key IN restrictions" from Piotr " Fixes #4193 Fixes #3795 This series enables handling IN restrictions for regular columns, which is needed by both filtering and indexing mechanisms. Tests: unit (release) " * 'allow_non_key_in_restrictions' of https://github.com/psarna/scylla: tests: add filtering with IN restriction test cql3: remove unused can_have_only_one_value function cql3: allow non-key IN restrictions	2019-02-05 17:30:35 +02:00
Piotr Sarna	45db5da51b	tests: add filtering with IN restriction test Test case for filtering regular columns with IN restriction is added.	2019-02-05 16:04:17 +01:00
Piotr Sarna	36609d1376	cql3: remove unused can_have_only_one_value function	2019-02-05 16:04:17 +01:00
Piotr Sarna	c178ed8b16	cql3: allow non-key IN restrictions Restricting a regular column with IN restriction is a perfectly valid case for filtering and indexing, so it should be allowed. Fixes #4193 Fixes #3795	2019-02-05 15:50:17 +01:00
Rafael Ávila de Espíndola	84542dadfa	sstables: delete_atomically: don't drop futures We still allow the delete of rows from system.large_partition to run in parallel with the sstable deletion, but now we return a future that waits for both. Tests: unit (release) Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20190205001526.68774-1-espindola@scylladb.com>	2019-02-05 16:47:58 +02:00
Calle Wilund	ba6a8ef35b	tls: Use a default prio string disabling TLS1.0 forcing min 128bits Fixes #4010 Unless user sets this explicitly, we should try explicitly avoid deprecated protocol versions. While gnutls should do this for connections initiated thusly, clients such as drivers etc might use obsolete versions. Message-Id: <20190107131513.30197-1-calle@scylladb.com>	2019-02-05 15:34:18 +02:00
Avi Kivity	6c71eae63f	Merge "API: Stream compaction history records" from Amnon " get_compaction_history can return a lot of records which will add up to a big http reply. This series makes sure it will not create large allocations when returning the results. It adds an api to the query_processor to use paged queries with a consumer function that returns a future, this way we can use the http stream after each record. This implementation will prevent large allocations and stalls. Fixes #4152 " * 'amnon/compaction_history_stream_v7' of github.com:scylladb/seastar-dev: tests/query_processor_test: add query_with_consumer_test system_keyspace, api: stream get_compaction_history query_processor: query and for_each_cql_result with future	2019-02-05 14:16:36 +02:00

1 2 3 4 5 ...

17940 Commits