This reverts commit b1226fb15a. When the
data volume is mounted from the host (as is usual in container
deployments), we can't expect that the files will be owned by the
in-container scylla user. So that commit didn't really fix #4536.
A follow-up patch will relax the check so it passes in a container
environment.
"
Refs #4726
Implement the API portion of a "describe sstables" command.
Adds REST types for collecting both fixed and dynamic attributes, some grouped. Allows extensions to add attributes as well. (Hint hint)
"
* 'sstabledesc' of https://github.com/elcallio/scylla:
api/storage_service: Add "sstable_info" command
sstables/compress: Make compressor pointer accessible from compression info
sstables.hh: Add attribute description API to file extension
sstables.hh: Add compression component accessor
sstables.hh: Make "has_component" public
Match SQL's LIKE in allowing an empty pattern, which matches only
an empty text field.
Tests: unit (dev)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
This simplifies the debug implementation and it now should work with
scylla-gdb.py.
It is not clear what, if anything, is lost by not using random
ids. They were never being reused in the debug implementation anyway.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190618144755.31212-1-espindola@scylladb.com>
After commit 8a0c4d5 (Merge "Repair switch to rpc stream" from Asias),
we increased the row buffer size for repair from 512KiB to 32MiB per
repair instance. We allow repairing 16 ranges (16 repair instances) in
parallel per repair request. So, a node can consume 16 * 32MiB = 512MiB
per user requested repair. In addition, the repair master node can hold
data from all the repair followers, so the memory usage on repair master
can be larger than 512MiB. We need to provide a way to limit the memory
usage.
In this patch, we limit the total memory used by repair to 10% of the
shard memory. The number of ranges that can be repaired in parallel is:
max_repair_ranges_in_parallel = max_repair_memory / max_repair_memory_per_range.
For example, if each shard has 4096MiB of memory, then max_repair_memory =
4096MiB * 10% = 409.6MiB, and max_repair_ranges_in_parallel = 409.6MiB / 32MiB = 12.
Fixes #4675
"
Current admission control takes a permit when a cql request starts and
releases it when the reply is sent, but some requests may leave background
work behind after that point (some because there is genuine background
work to do, like completing a write or doing a read repair, and some because
a read/write may be stuck in a queue longer than the request's timeout), so
after Scylla replies with a timeout some resources are still occupied.
The series fixes this by passing the permit down to storage_proxy where
it is held until all background work is completed.
Fixes#4768
"
* 'gleb/admission-v3' of github.com:scylladb/seastar-dev:
transport: add a metric to follow memory available for service permit.
storage_proxy: store a permit in a read executor
storage_proxy: store a permit in a write response handler
Pass service permit to storage_proxy
transport: introduce service_permit class and use it instead of semaphore_units
transport: hold admission a permit until a reply is sent
transport: remove cql server load balancer
A read executor exists until the read operation completes in its entirety,
so storing a permit there guarantees that it will be freed only after
no background work is left for the request on this server.
A write response handler exists until the write operation completes in its
entirety, so storing a permit there guarantees that it will be freed only
after no background work is left for the request on this server.
Current cql transport code acquires a permit before processing a query and
releases it when the query gets a reply, but some queries leave work behind.
If the work is allowed to accumulate without any limit, a server may
eventually run out of memory. To prevent that, the permit system should
account for the background work as well. This patch is a first step in
that direction: it passes a permit down to storage proxy, where it will
later be held by background work.
The manager trims sstables off to allow compaction jobs to proceed in parallel
according to their weights. The problem is that the trimming procedure is not
sstable-run aware, so it could incorrectly remove only a subset of an sstable
run, leading to partial sstable run compaction.
Compacting only part of an sstable run could lead to inefficiency because the
run structure would be messed up, affecting all amplification factors, and the
same generation could even end up being compacted twice.
This is fixed by making the trim procedure respect the sstable runs.
Fixes #4773.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20190730042023.11351-1-raphaelsc@scylladb.com>
Current code releases the admission permit too soon. If new requests are
admitted faster than clients read replies back, the reply queue can grow
very big. This patch defers the service permit release until after a reply
is sent.
Merged patch series from Avi Kivity:
In rare but valid cases (reconciling many tombstones, paging disabled),
a reconcilable_result can grow large. This triggers large allocation
warnings. Switch to chunked_vector to avoid the large allocation.
In passing, fix chunked_vector's begin()/end() const correctness, and
add the reverse iterator function family which is needed by the conversion.
Fixes #4780.
Tests: unit (dev)
Commit Summary
utils: chunked_vector: make begin()/end() const correct
utils::chunked_vector: add rbegin() and related iterators
reconcilable_result: use chunked_vector to hold partitions
"
Add listen/rpc "prefer_ipv6" options to DNS lookup of bind addresses for API/rpc/prometheus etc.
Fixes #4751
Adds use of a preferred address family in dns name lookups related to the
listen address and rpc address, adhering to the respective "prefer" options.
API, prometheus and broadcast address are all considered to be covered by
the "listen_interface_prefer_ipv6" option.
Note: scylla does not yet support actual interface binding, but these
options should apply equally to address name parameters.
Setting a "prefer_ipv6" option automtially enables ipv6 dns family query.
"
* 'calle/ipv6' of https://github.com/elcallio/scylla:
init: Use the "prefer_ipv6" options available for rpc/listen address/interface
inet_address: Add optional "preferred type" to lookup
config: Add rpc_interface_prefer_ipv6 parameter
config: Add listen_interface_perfer_ipv6 parameter
config.cc: Fix enable_ipv6_dns_lookup actual param name
The FIXME was added in the very first commit ("utils: Convert
utils/FBUtilities.java") that introduced the fb_utilities class as a
stub. However, we have long implemented the parts that we actually use,
so drop the FIXME as obsolete. In addition, drop the remaining
uncommented Java code as unused and also obsolete.
Message-Id: <20190808182758.1155-1-penberg@scylladb.com>
The Maven build tool ("mvn"), which is used by scylla-jmx and
scylla-tools-java, stores dependencies in a local repository stored at
$HOME/.m2. Make sure it's accessible to dbuild.
Message-Id: <20190808140216.26141-1-penberg@scylladb.com>
ignore_partial_runs() brings confusion, because its returning true doesn't
mean partial runs are filtered out of compaction. It actually means not
caring about compaction of a partial run.
The logic was wrong because any compaction strategy that chooses not to
ignore partial sstable runs[1] would incorrectly have every fragment
composing such a run become a candidate for compaction.
This problem could make compaction include only a subset of the fragments
composing the partial run, or even make the same fragment be compacted
twice due to parallel compaction.
[1]: a partial sstable run is an sstable run that is still being generated
by compaction and as a result cannot be selected as a candidate whatsoever.
The fix makes sure a partial sstable run has none of its fragments
selected for compaction, and also renames ignore_partial_runs().
Fixes #4729.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20190807022814.12567-1-raphaelsc@scylladb.com>
This documents the steps needed to build Scylla's Linux packages with
the relocatable package infrastructure we use today.
Message-Id: <20190807134017.4275-1-penberg@scylladb.com>
Running "dbuild" without a build command fails as follows:
$ ./tools/toolchain/dbuild
Error: This command has to be run under the root user.
Israel Fruchter discovered that the default command of our Docker image is this:
"Cmd": [
"bash",
"-c",
"dnf -y install python3-cassandra-driver && dnf clean all"
]
Let's make "/bin/bash" the default command instead, which will make
"dbuild" with no build command return to the host shell.
Message-Id: <20190807133955.4202-1-penberg@scylladb.com>
The build_reloc.sh script passes "--with=scylla" and "--with=iotune" to
the configure.py script. This is redundant as the
"scylla-package.tar.gz" target of ninja already limits itself to them.
Removing the "--with" options allows building unit tests after a
relocatable package has been built without having to rebuild anything.
Message-Id: <20190807130505.30089-1-penberg@scylladb.com>
Add a '--mode <mode>' command line option to the 'build_reloc.sh' script
so that we can create relocatable packages for debug builds.
The '--mode' command line option defaults to 'release' so existing users
are unaffected.
Message-Id: <20190807120759.32634-1-penberg@scylladb.com>
Avoid including the lengthy stream_session.hh in messaging_service.
More importantly, fix the build, because currently messaging_service.cc
and messaging_service.hh do not include stream_mutation_fragments_cmd.
I am not sure why it builds on my machine. Spotted this when backporting
"streaming: Send error code from the sender to receiver" to the 3.0
branch.
Refs: #4789
The stream close() guarantees the data sent will be flushed. No need to
call the stream flush() since the stream is not reused.
Follow-up fix for commit bac987e32a (streaming: Send error code from
the sender to receiver).
Refs #4789
* seastar d199d27681...a1cf07858b (1):
> Merge 'Do not return a variadic future form server_socket::accept()' from Avi
Seastar configure.py now has --api-level=1, to keep us on the old variadic future
server_socket::accept() API.
In case of an error on the sender side, the sender does not propagate the
error to the receiver. The sender will close the stream. As a result,
the receiver will get nullopt from the source in
get_next_mutation_fragment and pass a mutation_fragment_opt with no value
to the generating_reader. In turn, the generating_reader generates end
of stream. However, the last element that the generating_reader has
generated can be any type of mutation_fragment. This makes the sstable
that consumes the generating_reader violate the mutation_fragment
stream rule.
To fix this, we need to propagate the error. However, RPC streaming does
not support propagating the error in the framework. The user has to send
an error code explicitly.
Fixes: #4789
Assembles information and attributes of sstables in one or more
column families.
v2:
* Use (not really legal) nested "type" in json
* Rename "table" param to "cf" for consistency
* Some comments on data sizes
* Stream result to avoid huge string allocations on final json
"
This adds the option to compress sstables using the Zstandard algorithm
(https://facebook.github.io/zstd/).
To use, pass 'sstable_compression': 'org.apache.cassandra.io.compress.ZstdCompressor'
to the 'compression' argument when creating a table.
You can also specify a 'compression_level' (default is 3). See Zstd documentation for the available
compression levels.
Resolves #2613.
This PR also fixes a bug in sstables/compress.cc, where chunk length in bytes
was passed to the compressor as chunk length in kilobytes. Fortunately,
none of the compressors implemented until now used this parameter.
Example usage (assuming there exists a keyspace 'a'):
create table a.a (a text primary key, b int) with compression = {'sstable_compression': 'org.apache.cassandra.io.compress.ZstdCompressor', 'compression_level': 1, 'chunk_length_in_kb': '64'};
Notes:
1. The code uses an external dependency: https://github.com/facebook/zstd. Since I'm using "experimental" features of the library (using my own allocated memory to store the compression/decompression contexts), according to the library's documentation we need to link it statically (https://github.com/facebook/zstd/blob/dev/lib/zstd.h#L63). I added a git submodule.
2. The compressor performs some dynamic allocations. Depending on the specified chunk length and/or compression level the allocations might be big and seastar throws warnings. But with reasonable chunk length sizes it should be OK.
3. It doesn't yet provide an option to train it with dictionaries, but that should be easy to add in another commit.
"
* 'zstd' of https://github.com/kbr-/scylla:
Configure: rename seastar_pool to submodule_pool, add more submodules to the pool
Add unit tests for Zstd compression
Enable tests that use compressed sstable files
Add ZStandard compression
Fix the value of the chunk length parameter passed to compressors
The fragment reader calls `fast_forward_to()` from its constructor to
discard fragments that fall outside the query range. Move the
fast-forward code into an internal void-returning method, and call
that from both the constructor and `fast_forward_to()`, to avoid a
warning on a discarded future<>.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20190801133942.10744-1-bdenes@scylladb.com>