scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 18:40:38 +00:00

Author	SHA1	Message	Date
Botond Dénes	fea6214a0a	Update reader restriction related metrics Update description of existing reader count metrics, add memory consumption metrics. Use labels to distinguish between system, user and streaming reads related metrics.	2017-10-03 12:44:17 +03:00
Botond Dénes	47e07b787e	restricted_mutation_reader: restrict based-on memory consumption Restrict readers based on their memory consumption, instead of the count of the top-level readers. To do this an interposer is installed at the input_stream level which tracks buffers emitted by the stream. This way we can have an accurate picture of the readers' actual memory consumption. New readers will consume 16k units from the semaphore up-front. This is to account for their own memory consumption, apart from the buffers they will allocate. Creating the reader will be deferred to when there are enough resources to create it. As before only new readers will be blocked on an exhausted semaphore, existing readers can continue to work.	2017-10-03 12:44:12 +03:00
Avi Kivity	78eae8bf48	Revert "Merge "Make restricting_mutation_reader more accurate" from Botond" This reverts commit `c6e5dcc556`, reversing changes made to `19b21a0ab2`. Fails to build, plus author has more changes.	2017-10-03 11:58:59 +03:00
Botond Dénes	43dba8f173	Update reader restriction related metrics Update description of existing reader count metrics, add memory consumption metrics.	2017-09-20 11:16:21 +03:00
Botond Dénes	33e97e7457	restricted_mutation_reader: restrict based-on memory consumption Restrict readers based on their memory consumption, instead of the count of the top-level readers. To do this an interposer is installed at the input_stream level which tracks buffers emitted by the stream. This way we can have an accurate picture of the readers' actual memory consumption. New readers will consume 16k units from the semaphore up-front. This is to account for their own memory consumption, apart from the buffers they will allocate. Creating the reader will be deferred to when there are enough resources to create it. As before only new readers will be blocked on an exhausted semaphore, existing readers can continue to work.	2017-09-20 11:14:35 +03:00
Avi Kivity	e44517851e	untyped_result_set: reduce dependencies Forward-declare untyped_result_set and untyped_result_set_row, and remove the include from query_processor.hh. Message-Id: <20170916170859.27612-3-avi@scylladb.com>	2017-09-18 15:15:15 +02:00
Avi Kivity	0aaefe665b	system_keyspace: add missing include	2017-09-11 20:09:45 +03:00
Piotr Jastrzebski	dd5dc75605	Stop calling _local_cache.stop in at_exit. This removes a race condition that was causing #2721 Fixes #2721 Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <ad060fab43d63c17db9f811c421d7ab26e5e57c8.1503933021.git.piotr@scylladb.com>	2017-09-03 15:55:48 +03:00
Avi Kivity	ebff739a84	Merge "use paging for compaction history" from Amnon "This series adds an option to use paging in internal query and use that for the get compaction history function. Internal paging will be done explicitly, to use paging, you first create a state object (that contains the query as well) and use that state to get the first page, the result will contain both the query result and a new state that can be used to get the next page. Fixes #2366" * 'amnon/paged_compaction_history_v5' of github.com:cloudius-systems/seastar-dev: system_keyspace: Use paging for get compaction history Add paging for internal queries query_options: Allows creating query_options from query_options	2017-08-02 18:15:58 +03:00
Amnon Heiman	e345d05ebe	system_keyspace: Use paging for get compaction history There could be a lot of compactions when querying for compaction history. This patch changes the query to use paging. It still collects all results before returning to the caller. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2017-07-20 18:17:49 +03:00
Calle Wilund	7a583585a2	system_keyspace: Make sure "system" is written to keyspaces (visible) Fixes #2514 Bug in schema version 3 update: We failed to write "system" to the schema tables. Only visible on an empty instance of course. Message-Id: <1500469809-23546-2-git-send-email-calle@scylladb.com>	2017-07-19 16:18:56 +03:00
Avi Kivity	f0b20be14d	Revert "system_keyspace: Make sure "system" is written to keyspaces (visible)" This reverts commit `89ef69c4b3`. Prevents nodes from joining the cluster.	2017-06-21 16:58:04 +03:00
Calle Wilund	89ef69c4b3	system_keyspace: Make sure "system" is written to keyspaces (visible) Fixes #2514 Bug in schema version 3 update: We failed to write "system" to the schema tables. Only visible on an empty instance of course. Message-Id: <1497966982-10044-1-git-send-email-calle@scylladb.com>	2017-06-20 20:59:47 +02:00
Gleb Natapov	69c5526301	messaging_service: return cache hit ratio as part of data read	2017-06-13 09:57:14 +03:00
Avi Kivity	ebaeefa02b	Merge seastar upstream (seastar namespace) - introduced "seastarx.hh" header, which does a "using namespace seastar"; - 'net' namespace conflicts with seastar::net, renamed to 'netw'. - 'transport' namespace conflicts with seastar::transport, renamed to cql_transport. - "logger" global variables now conflict with logger global type, renamed to xlogger. - other minor changes	2017-05-21 12:26:15 +03:00
Calle Wilund	6c8b5fc09d	schema_tables: Use v3 schema tables and formats Switches system/schema_* for system_schema/*, updates schema/schema builder and uses to hold/expect v3 style info (i.e. types & dropped).	2017-05-10 16:44:48 +00:00
Calle Wilund	8066efb710	system_keyspace: Add getter/setter for built index status Even though we have none.	2017-05-09 13:48:55 +00:00
Calle Wilund	061ef16562	system_tables/schema_tables: Remove special format case of "execute_cql" Having a variadic parameter being used in implicit sprint is not very readable + makes it less intuitive when suddenly system keyspace becomes more than one -> multiple sprints in the chain -> more confusion or more execution paths. It's not that horrible with some spread out sprint:s	2017-05-09 13:48:55 +00:00
Calle Wilund	27fdc5cfef	schema_tables/system_tables: Add v3 tables to "ALL" and handle in init I.e. deal with more than one keyspace in system_keyspace::make	2017-05-09 13:48:55 +00:00
Calle Wilund	2fb36e3bf8	system_keyspace: Add query overloads with named keyspace	2017-05-09 13:48:55 +00:00
Calle Wilund	32909d4c84	system_keyspace: Add v3+legacy schema definitions	2017-05-09 13:48:55 +00:00
Avi Kivity	d542cdddf6	thrift: change generated code namespace org::apache::cassandra (the generated namespace name) gets confused with apache::cassandra (the thrift runtime library namespace), either due to changes in gcc 7 or in thrift 0.10. Either way, the problem is fixed by changing the generated namespace to plain cassandra.	2017-05-05 05:26:20 +03:00
Tomasz Grabiec	586dbaa8d3	db: Replace virtual_reader_type with mutation_source_opt Virtual reader is a mutation_source.	2017-02-23 18:23:52 +01:00
Calle Wilund	ef26ab0e1b	db::system_keyspace: Find rpc_address by lookup	2017-02-06 09:45:37 +00:00
Duarte Nunes	40c684b5f5	database: Extract common create cf code This patch moves some duplicate code into the add_column_family_and_create_directory() function. It also saves some superfluous keyspace lookups and readies the code to be used by materialized views. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-12-20 13:06:11 +00:00
Asias He	e5485f3ea6	Get rid of query::partition_range Use dht::partition_range instead	2016-12-19 08:09:25 +08:00
Glauber Costa	db7cc3cba8	system keyspace: write batchlog mutation in user memory Batchlog is a potentially memory-intensive table whose workload is driven by user needs, not system's. Move it to the user dirty memory manager. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-12-13 13:59:35 -05:00
Duarte Nunes	6a37d87c76	db: Delete size_estimates_recorder Now that access to the size_estimates system is virtualized, we no longer need the recorder. Fixes #1616 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-11-21 11:15:05 +00:00
Duarte Nunes	225648780d	size_estimates: Add virtual reader This patch adds a virtual mutation_reader so that queries to the size_estimates system table are handled by the engine without needing to perform any IO. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-11-21 11:15:05 +00:00
Duarte Nunes	636287fdf2	system_keyspace: Build mutations for size estimates This patch adds a function to system_keyspace responsible for creating a mutation to a partition of the size_estimates system table from a set of range_estimates. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-11-21 11:15:04 +00:00
Duarte Nunes	18ddec245e	size_estimates: Store the token range as bytes This patch changes the range_estimates struct so that the tokens are represented as utf8 encoded bytes. This will make future patches require less conversions. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-11-21 11:14:21 +00:00
Duarte Nunes	e7a5162c1d	range_estimates: Add schema This will be used in future patches, when virtualizing the size_estimates system table. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-11-21 10:56:32 +00:00
Tomasz Grabiec	c1a7e2090e	Revert "database: change find_column_families signature so it returns a lw_shared_ptr" This reverts commit `f3528ede65`.	2016-11-04 10:48:21 +01:00
Glauber Costa	f3528ede65	database: change find_column_families signature so it returns a lw_shared_ptr There are places in which we need to use the column family object many times, with deferring points in between. Because the column family may have been destroyed in the deferring point, we need to go and find it again. If we use lw_shared_ptr, however, we'll be able to at least guarantee that the object will be alive. Some users will still need to check, if they want to guarantee that the column family wasn't removed. But others that only need to make sure we don't access an invalid object will be able to avoid the cost of re-finding it just fine. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <722bf49e158da77ff509372c2034e5707706e5bf.1478111467.git.glauber@scylladb.com>	2016-11-03 13:27:31 +01:00
Avi Kivity	c94fb1bf12	build: reduce inclusions of messaging_service.hh Remove inclusions from header files (primary offender is fb_utilities.hh) and introduce new messaging_service_fwd.hh to reduce rebuilds when the messaging service changes. Message-Id: <1475584615-22836-1-git-send-email-avi@scylladb.com>	2016-10-05 11:46:49 +03:00
Duarte Nunes	e0a43a82c6	system_keyspace: Correctly deal with wrapped ranges This patch ensures we correctly deal with ranges that wrap around when querying the size_estimates system table. Ref #693 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1470412433-7767-1-git-send-email-duarte@scylladb.com>	2016-08-05 19:17:00 +03:00
Duarte Nunes	ecfa04da77	system_keyspace: Add query_size_estimates() function The query_size_estimates() function queries the size_estimates system table for a given keyspace and table, filtering out the token ranges according to the specified tokens. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-07-24 22:43:58 +00:00
Duarte Nunes	e16f3f2969	system_keyspace: Avoid pointers in range_estimates This patch makes range_estimates a proper struct, where tokens are represented as dht::tokens rather than dht::ring_position*. We also pass other arguments to update_ and clear_size_estimates by copy, since one will already be required. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-07-24 22:43:35 +00:00
Piotr Jastrzebski	636a4acfd0	Add flag to configure max size of a cached partition. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-07-21 09:47:20 +02:00
Vlad Zolotarov	baa6496816	service::storage_proxy: READ instrumentation: store trace state object in abstract_read_executor Having a trace_state_ptr in the storage_proxy level is needed to trace code bits in this level. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-07-19 18:21:59 +03:00
Duarte Nunes	f8f61cf246	system_keyspace: Record and clear size estimates This patch implements functions that allow the size_estimates system table to be updated and cleared. The size_estimates table is updated per schema with a set of token ranges and the associated estimations of how many partitions there are and their mean size. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-07-18 23:58:31 +00:00
Glauber Costa	7169b727ea	move system tables to its own region In the spirit of what we are doing for the read semaphore, this patch moves system writes to its own dirty memory manager. Not only will it make sure that system tables will not be serialized by its own semaphore, but it will also put system tables in its own region group. Moving system tables to its own region group has the advantage that system requests won't be waiting during throttle behind a potentially big queue of user requests, since requests are tended to in FIFO order within the same region group. However, system tables being more controlled and predictable, we can actually go a step further and give them some extra reservation so they may not necessarily block even if under pressure (up to 10 MB more). Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-07-05 17:46:28 -04:00
Avi Kivity	76cc6408cd	Merge "feature check for seed node" from Asias ""This series implements feature check for seed node.	2016-07-05 19:01:01 +03:00
Asias He	6f69963ef9	system_keyspace: Simplify load_host_ids implementation - Use plain loop instead of do_for_each - Use row.get_as() instead of row.template get_as() Message-Id: <3e108d3a6258c0caaf569eb9c79532d9789ea411.1467703722.git.asias@scylladb.com>	2016-07-05 09:47:21 +02:00
Asias He	3f31be58b6	system_keyspace: Simplify load_tokens implementation - Use plain loop instead of do_for_each - Use row.get_as() instead of row.template get_as() Message-Id: <f959ace4f30078695d383c849ed4520169228f97.1467703722.git.asias@scylladb.com>	2016-07-05 09:47:21 +02:00
Asias He	31df4e5316	system_keyspace: Introduce load_peer_features To get the peer features stored in the system.peers table.	2016-07-05 10:09:53 +08:00
Avi Kivity	9ac730dcc9	mutation_reader: make restricting_mutation_reader even more restricting While limiting the number of concurrently executing sstable readers reduces our memory load, the queued readers, although consuming a small amount of memory, can still grow without bounds. To limit the damage, add two limits on the queue: - a timeout, which is equal to the read timeout - a queue length limit, which is equal to 2% of the shard memory divided by an estimate of the queued request size (1kb) Together, these limits bound the amount of memory needed by queued disk requests in case the disk can't keep up. Message-Id: <1467206055-30769-1-git-send-email-avi@scylladb.com>	2016-06-29 15:17:35 +02:00
Avi Kivity	edeef03b34	db: restrict replica read concurrency Since reading mutations can consume a large amount of memory, which, moreover, is not predictable at the time the read is initiated, restrict the number of reads to 100 per shard. This is more than enough to saturate the disk, and hopefully enough to prevent allocation failures. Restriction is applied in column_family::make_sstable_reader(), which is called either on a cache miss or if the cache is disabled. This allows cached reads to proceed without restriction, since their memory usage is supposedly low. Reads from the system keyspace use a separate semaphore, to prevent user reads from blocking system reads. Perhaps we should select the semaphore based on the source of the read rather than the keyspace, but for now using the keyspace is sufficient.	2016-06-27 17:17:56 +03:00
Pekka Enberg	47a904c0f6	Merge "gossip: Introduce SUPPORTED_FEATURES" from Asias "There is a need to have an ability to detect whether a feature is supported by entire cluster. The way to do it is to advertise feature availability over gossip and then each node will be able to check if all other nodes have a feature in question. The idea is to have new application state SUPPORTED_FEATURES that will contain set of strings, each string holding feature name. This series adds API to do so. The following patch on top of this series demonstrates how to wait for features during boot up. FEATURE1 and FEATURE2 are introduced. We use wait_for_feature_on_all_node to wait for FEATURE1 and FEATURE2 successfully. Since FEATURE3 is not supported, the wait will not succeed, the wait will time out. --- a/service/storage_service.cc +++ b/service/storage_service.cc @@ -95,7 +95,7 @@ sstring storage_service::get_config_supported_features() { // Add features supported by this local node. When a new feature is // introduced in scylla, update it here, e.g., // return sstring("FEATURE1,FEATURE2") - return sstring(""); + return sstring("FEATURE1,FEATURE2"); } std::set<inet_address> get_seeds() { @@ -212,6 +212,11 @@ void storage_service::prepare_to_join() { // gossip snitch infos (local DC and rack) gossip_snitch_info().get(); + gossiper.wait_for_feature_on_all_node(std::set<sstring>{sstring("FEATURE1"), sstring("FEATURE2")}, std::chrono::seconds(30)).get(); + logger.info("Wait for FEATURE1 and FEATURE2 done"); + gossiper.wait_for_feature_on_all_node(std::set<sstring>{sstring("FEATURE3")}).get(); + logger.info("Wait for FEATURE3 done"); + We can query the supported_features: cqlsh> SELECT supported_features from system.peers; supported_features -------------------- FEATURE1,FEATURE2 FEATURE1,FEATURE2 (2 rows) cqlsh> SELECT supported_features from system.local; supported_features -------------------- FEATURE1,FEATURE2 (1 rows)"	2016-04-08 09:22:50 +03:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00


168 Commits