scylladb

Author	SHA1	Message	Date
Asias He	6496cdf0fb	db: Get rid of the streaming memtable delayed flush In `455d5a5` (streaming memtables: coalesce incoming writes), we introduced the delayed flush to coalesce incoming streaming mutations from different stream_plans. However, most of the time there will be one stream plan at a time: the next stream plan won't start until the previous one is finished. So, the current coalescing does not really work. The delayed flush adds 2s of delay for each stream session. If we have lots of tables to stream, we will waste a lot of time. We stream a keyspace in around 10 stream plans, i.e., 10% of ranges at a time. If we have 5000 tables, even if the tables are almost empty, the delay will waste 5000 * 10 * 2 = 100000 seconds, about 27 hours. Streaming a keyspace with 4 tables, each with 1000 rows: Before: [shard 0] stream_session - [Stream #944373d0-5d9c-11e8-9cdb-000000000000] Executing streaming plan for Bootstrap-ks-index-0 with peers={127.0.0.1}, master [shard 0] stream_session - [Stream #944373d0-5d9c-11e8-9cdb-000000000000] Streaming plan for Bootstrap-ks-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=1030 KiB, 125.21 KiB/s [shard 0] range_streamer - Bootstrap with 127.0.0.1 for keyspace=ks succeeded, took 8.233 seconds After: [shard 0] stream_session - [Stream #e00bf6a0-5d99-11e8-a7b8-000000000000] Executing streaming plan for Bootstrap-ks-index-0 with peers={127.0.0.1}, master [shard 0] stream_session - [Stream #e00bf6a0-5d99-11e8-a7b8-000000000000] Streaming plan for Bootstrap-ks-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=1030 KiB, 4772.32 KiB/s [shard 0] range_streamer - Bootstrap with 127.0.0.1 for keyspace=ks succeeded, took 0.216 seconds Fixes #3436 Message-Id: <cb2dde263782d2a2915ddfe678c74f9637ffd65b.1526979175.git.asias@scylladb.com>	2018-06-06 10:16:02 +03:00
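The delay estimate in the commit above can be checked with a few lines. This is only a sketch of the arithmetic, assuming (as the message implies) one 2-second delayed flush per table per stream plan; the function name is made up for illustration.

```python
def wasted_seconds(tables, plans_per_keyspace, delay_per_session_s):
    """Total time lost to the delayed flush, assuming one delay
    per table per stream plan (an assumption, not Scylla code)."""
    return tables * plans_per_keyspace * delay_per_session_s


total = wasted_seconds(tables=5000, plans_per_keyspace=10, delay_per_session_s=2)
print(total, "seconds =", round(total / 3600, 1), "hours")
```

With the figures from the message this gives 100000 seconds, which is where the "27 hours" comes from.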
Avi Kivity	9b21fbc055	Merge "LCS: enable compaction controller" from Glauber " In preparation, we change LCS so that it tries harder to push data to the last level, where the backlog is supposed to be zero. The backlog is defined as: backlog_of_stcs_in_l0 + Sum(L in level) sizeof(L) * (max_level - L) * fan_out where: * the fan_out is the amount of SSTables we usually compact with the next level (usually 10). * max_level is the number of levels currently populated * sizeof(L) is the total amount of data in a particular level. Tests: unit (release) " * 'lcs-backlog-v2' of github.com:glommer/scylla: LCS: implement backlog tracker for compaction controller LCS: don't construct property in the body of constructor LCS: try harder to move SSTables to highest levels. leveled manifest: turn 10 into a constant backlog: add level to write progress monitor	2018-06-04 10:29:56 +03:00
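The LCS backlog formula quoted in the merge message can be written out as a small function. This is a sketch, not the tracker from the series. It assumes the highest populated level index plays the role of the "max" term (so the last level contributes zero, as the message says it should) and that the sum runs over every level, including L0. The function and parameter names are hypothetical.

```python
def lcs_backlog(stcs_backlog_l0, level_sizes, fan_out=10):
    """level_sizes maps level number -> total bytes in that level.

    Assumption: max_level is the highest level that holds data,
    so data already in the last level adds nothing to the backlog.
    """
    populated = [lvl for lvl, size in level_sizes.items() if size > 0]
    if not populated:
        return stcs_backlog_l0
    max_level = max(populated)
    return stcs_backlog_l0 + sum(
        size * (max_level - lvl) * fan_out
        for lvl, size in level_sizes.items()
    )


# 100 bytes in L1 still have one level to travel; 1000 bytes in L2 are done.
print(lcs_backlog(0, {0: 0, 1: 100, 2: 1000}))
```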
Glauber Costa	7e3093709a	backlog: add level to write progress monitor For SSTables being written, we don't know their level yet. Add that information to the write monitor. New SSTables will always be at L0. Compacted SSTables will have their level determined by the compaction process. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2018-05-31 21:09:38 -04:00
Paweł Dziepak	aa25f0844f	atomic_cell: introduce fragmented buffer value interface As a preparation for the switch to the new cell representation, this patch changes the type returned by atomic_cell_view::value() to one that requires explicit linearisation of the cell value. Even though the value is still implicitly linearised (and only when managed by the LSA) the new interface is the same as the target one so that no more changes to its users will be needed.	2018-05-31 15:51:11 +01:00
Paweł Dziepak	27014a23d7	treewide: require type info for copying atomic_cell_or_collection	2018-05-31 15:51:11 +01:00
Avi Kivity	701e6f2cff	Merge "Implement backlog controller for TWCS" from Glauber " This series implements the backlog tracker for TWCS, allowing it to be controlled. The backlog for a TWCS column family is just the sum of the SizeTiered backlogs for all the windows that we know about. A possible optimization for this is to stop tracking windows after they become old enough and revert to zero backlog. I reverted that last minute, though, since this will probably cause the backlog to completely misrepresent reality if we import SSTables into old buckets with things like repairs or nodetool refresh. " * 'twcs-backlog-v4.1' of github.com:glommer/scylla: backlog: implement backlog tracker for the TWCS STCS_backlog: allow users to query for the total bytes managed backlog: keep track of maximum timestamp in write monitor memtable: also keep track of max timestamp	2018-05-23 13:37:49 +03:00
Piotr Sarna	f8237dd664	database: do not truncate already removed views This commit clears table's views before truncating it in drop_column_family function. The only case when views are not empty during drop is when they're backing secondary indexes of a base table and they are all atomically dropped in the same go as the base table itself. This change will prevent trying to truncate views that were already dropped, which used to result in no_such_column_family error. References #3202	2018-05-22 21:10:51 +02:00
Duarte Nunes	a3bbd52e2e	Merge 'Add materialized view metrics' from Piotr " This series introduces materialized view statistics, as stated in issue #3385: - updates pushed - updates failed - row lock stats It also addresses issue #3416 by decoupling user write stats from view update stats. " * 'materialized_view_metrics_9' of https://github.com/psarna/scylla: view: adapt view_stats to act as write stats storage_proxy: decouple write_stats from stats db: add row locking metrics view: add view metrics	2018-05-22 18:41:51 +01:00
Glauber Costa	b573a2ff61	backlog: keep track of maximum timestamp in write monitor For sealed SSTables we can get the maximum timestamp from the statistics component. But for partially written SSTables, the metadata is not yet available. One way to solve this would be to make the SSTable statistics available earlier. But we would end up with a maximum timestamp that potentially changes all the time as we write more cells. A better approach is to take note of what's the maximum timestamp in a memtable before we start to flush, and when time comes for us to flush we will use the progress manager to inform the consumers about the maximum timestamp. For SSTables being compacted, we can't know for sure what is the maximum timestamp as some entries could be TTLd already. But the maximum of all SSTables present in the compaction is a good enough estimation for this purposes. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2018-05-22 12:55:58 -04:00
Piotr Sarna	3792bed3ed	view: adapt view_stats to act as write stats This commit adapts view_stats structure so it can be passed to storage_proxy as write stats. Thanks to that, mv replica updates will not interfere with user write metrics. As a side effect it also provides more stats to replica view updates. Closes #3385 Closes #3416	2018-05-22 16:52:58 +02:00
Piotr Sarna	9246bb36bc	db: add row locking metrics This commit adds statistics to row_locker class. Metrics are independently counted for all lock types: row<->partition and exclusive<->shared. Metrics gathered: - total acquisitions - operations that wait on the lock - histogram of the time spent on waiting on this type of lock References #3385 References #3416	2018-05-22 16:52:58 +02:00
Piotr Sarna	49bebcfa25	view: add view metrics This commit introduces view statistics: - updates pushed to local/remote replicas - updates failed to be pushed to local/remote replicas Metrics are kept on per-table basis, i.e. updates_pushed_remote shows the number of total updates (mutations) pushed to all paired mv replicas that this particular table has. Every single update is taken into consideration, so if view update requires removing a row from one view and adding a row to another, it will be counted as 2 updates. References #3385 References #3416	2018-05-22 16:52:58 +02:00
Glauber Costa	d758a416f8	backlog_controller: move compaction controller to the compaction manager There was recently an attempt to add minimum shares to major compactions which ended up being harder than it should be due to all the plumbing necessary to call the compaction controller from inside the compaction manager-- since it is currently a database object. We had this problem again when trying to return fixed shares in case of an exception. Taking a step back, all of those problems stem from the fact that the compaction controller really shouldn't be a part of the database: as it deals with compactions and its consequences it is a lot more natural to have it inside the compaction manager to begin with. Once we do that, all the aforementioned problems go away. So let's move there where it belongs. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2018-05-22 09:24:19 -04:00
Glauber Costa	d3f985ef46	backlog_controller: allow users to compute inverse function of shares There are some situations in which we want to force a specific amount of shares and don't have a backlog. We can provide a function to get that from the controller. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2018-05-21 19:35:07 -04:00
Duarte Nunes	a23bda3393	Merge 'Implement separate timeout for range queries' from Avi " This patchset implements separate timeouts for range queries, and lays the foundations for separate timeouts for other query types. While the feature in itself is worthy, the real motivation is to have the timeouts decided by the caller, instead of storage_proxy. This in turn is required to disentangle each layer behaving differently depending on whether the query is internal or not; instead, the goal is to have each caller declare its needs in terms of consistency level and timeouts, and have the lower layers implement its requirements instead of making their own decisions. Fixes #3013. Tests: unit (release) " * tag '3013/v1.1' of https://github.com/avikivity/scylla: storage_proxy: remove default_query_timeout() storage_proxy: don't use default timeouts query_options: augment with timeout_config thrift: configure thrift transport and handler with a timeout_config transport: configure native transport with a timeout_config cql3: define and populate timeout_config_selector timeout_config: introduce timeout configuration	2018-05-13 20:05:50 +02:00
Glauber Costa	2e0c673432	database: release flush permits earlier There is an ongoing discussion in issue 2678 about the right time to release permits. Right now we are releasing the permit after we flush all data for the memtable plus the SSTables accompanying components - plus flushing them, closing them, etc. During all that time, we are increasing virtual dirty by adding more data to the buffers but we are not able to decrease it-- until we release the permit we can't start flushing the next memtable. This is much more of a concern than I/O overlapping as described in the issue. We have a hook in the SSTable write process that is (should be) called as soon as data is written. We should move the permit release there. We aren't, though, calling that as early as we could. The call to the data written hook is made after the Index is closed, summary is sealed, etc. This patch fixes that. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20180508182746.28310-2-glauber@scylladb.com>	2018-05-13 19:22:54 +03:00
Glauber Costa	94f686f946	memtable controller: reduce adjustment period to 50ms 250ms is too high of a period for memtable controller. Since memtable flushes are relatively efficient, especially in comparison to compactions, if the shares are high we can flush a lot of data down with the high shares - so in the next adjustment period our shares will be minuscule and we won't flush much at all. This leads to oscillating behavior that is mitigated by adjusting faster. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20180508182746.28310-3-glauber@scylladb.com>	2018-05-09 17:40:46 +03:00
Calle Wilund	b2b1a1f7e1	database: Fix assert in truncate Fixes crash in cql_tests.StorageProxyCQLTester.table_test "avoid race condition when deleting sstable on behalf..." changed discard_sstables behaviour to only return rp:s for sstables owned and submitted for deletion (not all matching time stamp), which can in some cases cause zero rp returned. Message-Id: <20180508070003.1110-1-calle@scylladb.com>	2018-05-08 22:29:21 +01:00
Botond Dénes	6f7d919470	database: when dropping a table evict all relevant queriers Queriers shouldn't outlive the table they read from as that could lead to use-after-free problems when they are destroyed. Fixes: #3414 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <3d7172cef79bb52b7097596e1d4ebba3a6ff757e.1525716986.git.bdenes@scylladb.com>	2018-05-07 21:20:25 +03:00
Duarte Nunes	c053275a48	db/view/row_locking: Add timeout when waiting for the lock This ensures we respect the write timeout set by the client when applying base writes, in case a write takes too long to acquire the row lock for the read-before-write phase of a materialized view update. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180507132755.8751-1-duarte@scylladb.com>	2018-05-07 18:22:39 +01:00
Duarte Nunes	4b3562c3f5	db/view: Limit number of pending view updates This patch adds a simple and naive mechanism to ensure a base replica doesn't overwhelm a potentially overloaded view replica by sending too many concurrent view updates. We add a semaphore to limit to 100 the number of outstanding view updates. We limit globally per shard, and not per destination view replica. We also limit statically. Refs #2538 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180426134457.21290-2-duarte@scylladb.com>	2018-05-07 11:25:27 +03:00
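The limiting scheme described in the commit above (a static semaphore capping outstanding view updates at 100, per shard rather than per destination) can be illustrated with a short sketch. This is not Scylla's code, which uses Seastar semaphores in C++. The class and function names are hypothetical, and `asyncio.sleep(0)` stands in for the actual network send.

```python
import asyncio

MAX_PENDING_VIEW_UPDATES = 100  # the limit named in the commit message


class ViewUpdateSender:
    """Hypothetical sender; one instance stands in for one shard."""

    def __init__(self, limit=MAX_PENDING_VIEW_UPDATES):
        self._sem = asyncio.Semaphore(limit)
        self.in_flight = 0
        self.peak_in_flight = 0

    async def send(self, update):
        # Wait here if `limit` updates are already outstanding.
        async with self._sem:
            self.in_flight += 1
            self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
            try:
                await asyncio.sleep(0)  # placeholder for the real send
            finally:
                self.in_flight -= 1


async def _demo(n_updates, limit):
    sender = ViewUpdateSender(limit)
    await asyncio.gather(*(sender.send(u) for u in range(n_updates)))
    return sender.peak_in_flight


def run_demo(n_updates=250, limit=MAX_PENDING_VIEW_UPDATES):
    """Return the highest number of updates that were in flight at once."""
    return asyncio.run(_demo(n_updates, limit))


print(run_demo() <= MAX_PENDING_VIEW_UPDATES)
```

Because the semaphore is shared by all destinations, a single slow view replica can consume the whole budget, which is the trade-off the message calls "simple and naive".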
Raphael S. Carvalho	abcfc19fe9	db: make compaction slightly faster by not using filtering reader on unshared sstable After reboot, all existing sstables are considered shared. That's a safe default. Reader used by compaction decides to use filtering reader (filters out data that doesn't belong to this shard) if sstable is considered shared even though it may actually be unshared. By avoiding filtering reader we're avoiding an extra check for each key, and that may be meaningful for compaction of tons of small partitions and even range reads of such. We do so by fixing sstable::_shared, which is now set properly for existing sstables at start. quick check using microbenchmark which extends perf_sstable with compaction mode: before: 69407.61 +- 37.03 partitions / sec (30 runs, 1 concurrent ops) after: 70161.09 +- 40.35 partitions / sec (30 runs, 1 concurrent ops) Fixes #3042. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20180504182158.21130-1-raphaelsc@scylladb.com>	2018-05-04 19:34:09 +01:00
Duarte Nunes	7916368df8	Merge "Introduce system.large_partitions table" from Piotr " This series introduces a system.large_partitions table, used to gather information on largest partitions in the cluster. Schema below allows easy extraction of most offending keys and removal by sstable name, which happens when a table is compacted away. Schema: ( keyspace_name text, table_name text, sstable_name text, partition_size bigint, key text, compaction_time timestamp, PRIMARY KEY((keyspace_name, table_name), sstable_name, partition_size, key) ) WITH CLUSTERING ORDER BY (partition_size DESC); " Closes #3292. * 'large_partition_table_3' of https://github.com/psarna/scylla: database, sstables, tests: add large_partition_handler db: add large_partition_handler interface with implementations docs: init system_keyspace entry with system.large_partitions db: add system.large_partitions table	2018-05-04 18:18:50 +01:00
Piotr Sarna	fe02c3d0e2	database, sstables, tests: add large_partition_handler This commit makes database, sstables and tests aware of which large_partition_handler they use. Proper large_partition_handler is retrievable from config information and is based on existing compaction_large_partition_warning_threshold_mb entry. Right now CQL TABLE variant of large_partition_handler is used in the database. Tests use a NOP version of large_partition_handler, which does not depend on CQL queries at all.	2018-05-04 14:38:13 +02:00
Raphael S. Carvalho	ce689a0807	database: avoid race condition when deleting sstable on behalf of cf truncate After removal of deletion manager, caller is now responsible for properly submitting the deletion of a shared sstable. That's because deletion manager was responsible for holding deletion until all owners agreed on it. Resharding for example was changed to delete the shared sstables at the end, but truncate wasn't changed and so race condition could happen when deleting same sstable at more than one shard in parallel. Change the operation to only submit a shared sstable for deletion in only one owner. Fixes dtest migration_test.TestMigration.migrate_sstable_with_schema_change_test Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20180503193427.24049-1-raphaelsc@scylladb.com>	2018-05-04 11:42:56 +01:00
Tomasz Grabiec	5e985192b2	db: Log table id and schema version on boot Message-Id: <1524585689-12458-1-git-send-email-tgrabiec@scylladb.com>	2018-05-03 10:50:31 +03:00
Avi Kivity	b6d74b1c19	timeout_config: introduce timeout configuration Different request types have different timeouts (for example, read requests have shorter timeouts than truncate requests), and also different request sources have different timeouts (for example, an internal local query wants infinite timeout while a user query has a user-defined timeout). To allow for this, define two types: timeout_config represents the timeout configuration for a source (e.g. user), while timeout_config_selector represents the request type, and is used to select a timeout within a timeout configuration. The latter is implemented as a pointer-to-member. Also introduce an infinite timeout configuration for internal queries.	2018-04-29 19:52:40 +03:00
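The split between a per-source timeout configuration and a per-request-type selector, as described in the commit above, might look like the following. The original implements the selector as a C++ pointer-to-member; here `operator.attrgetter` plays that role. The field names and timeout values are invented for illustration and are not taken from the Scylla source.

```python
import math
from dataclasses import dataclass
from operator import attrgetter


@dataclass(frozen=True)
class TimeoutConfig:
    """Timeouts for one request source (field names are hypothetical)."""
    read_timeout_s: float
    write_timeout_s: float
    range_read_timeout_s: float
    truncate_timeout_s: float


# A selector picks one field out of whichever config it is applied to,
# much like a pointer-to-member in C++.
read_selector = attrgetter("read_timeout_s")
truncate_selector = attrgetter("truncate_timeout_s")

# Made-up user configuration, plus the infinite one for internal queries.
user_config = TimeoutConfig(5.0, 2.0, 10.0, 60.0)
infinite_config = TimeoutConfig(math.inf, math.inf, math.inf, math.inf)


def timeout_for(config, selector):
    """Resolve the timeout for a request type within a source's config."""
    return selector(config)


print(timeout_for(user_config, read_selector))
print(timeout_for(infinite_config, truncate_selector))
```

The caller chooses both the config (who is asking) and the selector (what is being asked), so lower layers need no knowledge of whether a query is internal.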
Vladimir Krivopalov	948c4d79d3	Collect encoding statistics for memtable updates. We keep track of all updates and store the minimal values of timestamps, TTLs and local deletion times across all the inserted data. These values are written as a part of serialization_header for Statistics.db and used for delta-encoding values when writing Data.db file in SSTables 3.0 (mc) format. For #1969. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-04-25 15:39:14 -07:00
Piotr Jastrzebski	d492e92b15	Extract sstable::component_type to separate header It will be used in other places which won't depend on sstable. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-04-24 11:29:57 +02:00
Duarte Nunes	31370fd7b1	view_info: Explicitly initialize base-dependent fields Instead of lazily-initializing the regular base column in the view's PK field, explicitly initialize it. This will be used by future patches that don't have access to the schema when wanting to obtain that column. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-04-23 09:32:02 +01:00
Avi Kivity	28be4ff5da	Revert "Merge "Implement loading sstables in 3.x format" from Piotr" This reverts commit `513479f624`, reversing changes made to `01c36556bf`. It breaks booting. Fixes #3376.	2018-04-23 06:47:00 +03:00
Piotr Jastrzebski	82d483a1d3	Extract sstable::component_type to separate header It will be used in other places which won't depend on sstable. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-04-22 13:45:29 +02:00
Duarte Nunes	b5e7d5fa2c	column_family: Make reader without going through mutation source When doing the read before write for a materialized view update, call make_reader directly. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180417091918.10043-1-duarte@scylladb.com>	2018-04-17 12:22:36 +03:00
Daniel Fiala	202bff0b18	database: Remember versions and formats of all temporary TOC files. The patch fixes a bug introduced by commit `089b54f2d2`. This bug exhibited when master was deployed in an attempt to populate materialised views. The nodes restarted in the middle and they were not able to come back. The fix is to remember formats and versions of sstables for every generation. Fixes: #3324. Signed-off-by: Daniel Fiala <daniel@scylladb.com> Message-Id: <20180410083114.17315-1-daniel@scylladb.com>	2018-04-11 16:47:33 +03:00
Raphael S. Carvalho	30b6c9b4cd	database: make sure sstable is also forwarded to shard responsible for its generation After `f59f423f3c`, sstable is loaded only at shards that own it so as to reduce the sstable load overhead. The problem is that a sstable may no longer be forwarded to a shard that needs to be aware of its existence which would result in that sstable generation being reallocated for a write request. That would result in a failure as follows: "SSTable write failed due to existence of TOC file for generation..." This can be fixed by forwarding any sstable at load to all its owner shards and the shard responsible for its generation, which is determined as follows: s = generation % smp::count Fixes #3273. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20180405035245.30194-1-raphaelsc@scylladb.com>	2018-04-05 10:58:05 +03:00
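The forwarding rule from the commit above is easy to state in code: an sstable goes to every shard that owns it, plus the shard given by `generation % smp::count`. The sketch below uses hypothetical function names and plain integers for shard ids.

```python
def generation_shard(generation, smp_count):
    """Shard responsible for allocating this generation number."""
    return generation % smp_count


def shards_to_forward(owner_shards, generation, smp_count):
    """Owner shards plus the generation's shard, without duplicates."""
    return sorted(set(owner_shards) | {generation_shard(generation, smp_count)})


# Generation 10 on a 4-shard node belongs to shard 2, which is not an owner.
print(shards_to_forward([1, 3], generation=10, smp_count=4))
```

Without the extra shard, shard 2 in this example would not know generation 10 is taken and could hand it out again for a new write.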
Duarte Nunes	f298f57137	column_family: Add function to populate views The populate_views() function takes a set of views to update, a token to select base table partitions, and the set of sstables to query. This lays the foundation for a view building mechanism to exist, which walks over a given base table, reads data token-by-token, calculates view updates (in a simplified way, compared to the existing functions that push view updates), and sends them to the paired view replicas. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-03-27 01:20:10 +01:00
Duarte Nunes	67dd3e6e5d	column_family: Allow synchronizing with in-progress writes This patch adds a mechanism to class column_family through which we can synchronize with in-progress writes. This is useful for code that, after some modification, needs to ensure that new writes will see it before it can proceed. In particular, this will be used by the view building code, which needs to wait until the in-progress writes, which may have missed that there is now a view, are observable to the view building code. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-03-27 01:20:10 +01:00
Duarte Nunes	9640205f11	database: Compare view id instead of name in find_views() Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-03-27 01:20:10 +01:00
Duarte Nunes	9b9ba525f7	database: Add get_views() function Returns all the schemas that are views. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-03-27 01:20:10 +01:00
Duarte Nunes	dc44a08370	db/view: Return a future when sending view updates While we now send view mutations asynchronously in the normal view write path, other processes interested in sending view updates, such as streaming or view building, may wish to do it synchronously. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-03-27 01:20:10 +01:00
Duarte Nunes	a985ea0fcb	column_family: Don't retry flushing memtable if shutdown is requested Since we just keep retrying, this can cause Scylla to not shutdown for a while. The data will be safe in the commit log. Note that this patch doesn't fix the issue when shutdown goes through storage_service::drain_on_shutdown - more work is required to handle that case. Ref #3318. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180324140822.3743-3-duarte@scylladb.com>	2018-03-26 14:36:40 +03:00
Duarte Nunes	50ad37d39b	column_family: Increase scope of exception handling when flushing a memtable In column_family::try_flush_memtable_to_sstable, the handle_exception() block is on the inside of the continuations to write_memtable_to_sstable(), which, if it fails, will leave the sstable in the compaction_backlog_tracker::_ongoing_writes map, which will waste disk space, and that sstable will map to a dangling pointer to a destroyed database_sstable_write_monitor, which causes a seg fault when accessed (for example, through the backlog_controller, which accounts the _ongoing_writes when calculating the backlog). Fix this by increasing the scope of handle_exception(). Fixes #3315 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180324140822.3743-2-duarte@scylladb.com>	2018-03-26 14:36:16 +03:00
Duarte Nunes	f298e3e6f8	database: Log exception which caused flush to fail Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180322204419.12961-1-duarte@scylladb.com>	2018-03-23 10:57:35 +00:00
Botond Dénes	a65b063ab2	incremental_reader_selector: remove unused members Since `3d725d6823` the incremental_reader_selector creates readers via a factory function so these members, used previously for creating the readers, are not needed anymore. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <64b5cef93c1f9a2e544ccfd89e293627e99dd4cd.1521724155.git.bdenes@scylladb.com>	2018-03-22 13:14:03 +00:00
Glauber Costa	9188059427	database: group statements in their own scheduling group When we introduced the CPU scheduler, we have also introduced a group for commitlog - but never used it. There is also doubtful value in separating reads from writes, since they are often part of the same workload. To accommodate for that, let's rename the query group to "statement" (query is not incorrect, just confusing), and move the write path, currently ungrouped, inside it. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2018-03-20 16:58:36 -04:00
Glauber Costa	c8e169f6d8	database: apply streaming mutations with streaming priority We are flushing the streaming memtables with streaming priority, but applying the mutations themselves is still done with normal priorities. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2018-03-20 16:58:35 -04:00
Avi Kivity	03c22ad524	Merge "Support for Cassandra 2.2 (LA) SSTable formats" from Daniel " These patches add support for C* 2.2 file(name) format. Namely: * It forces Scylla to write files in la format. * Adds storage-service feature for them. * cf and ks are determined from directory, not from file-name (for 2.2 format). * Adds some other fixes to make dtest happy. * Unit tests work with la format or with both formats. " * 'danfiala/filename-format-2.2-v4' of https://github.com/hagrid-the-developer/scylla: tests/sstables: Tests use la format or iterate over both formats. tests/sstables: Helper functions support 2.2 format directory structure. stables: Use 2.2 (la) format as a default format to store sstables if it is enabled by feature-bits. storage_service: Support la sstable storage format as a feature. sstables: make_descriptor accepts sstable-directory, because it is necessary to determine cf and ks in 2.2 format. sstables: Throw more detail exception for unknown item in reverse_map. sstables/compaction: Suppress NaN in a report of a throughput.	2018-03-19 17:49:44 +02:00
Daniel Fiala	089b54f2d2	stables: Use 2.2 (la) format as a default format to store sstables if it is enabled by feature-bits. Signed-off-by: Daniel Fiala <daniel@scylladb.com>	2018-03-19 14:12:01 +01:00
Daniel Fiala	10db711259	sstables: make_descriptor accepts sstable-directory, because it is necessary to determine cf and ks in 2.2 format. Signed-off-by: Daniel Fiala <daniel@scylladb.com>	2018-03-18 06:09:47 +01:00
Botond Dénes	b2f75a6c53	Add counters to monitor querier-cache efficiency Add the following counters: (1) querier_cache_lookups (2) querier_cache_misses (3) querier_cache_drops (4) querier_cache_time_based_evictions (5) querier_cache_resource_based_evictions (6) querier_cache_memory_based_evictions (7) querier_cache_population (1) counts the total number of querier cache lookups. Not all page-fetches will result in a querier lookup. For example the first page of a query will not do a lookup as there was no previous page to reuse the querier from. The second, and all subsequent pages however should attempt to reuse the querier from the previous page. (2) counts the subset of (1) where the read missed the querier cache (failed to find a matching saved querier). (3) counts the subset of (1) where the querier was recalled and dropped immediately. This can happen for example if the querier was at the wrong position. (4) counts the cached queriers that were evicted due to their TTL expiring. (5) counts the cached queriers that were evicted due to reader-resource (those limited by reader-concurrency limits) shortage. (6) counts the cached queriers that were evicted due to reaching the cache's memory limits (currently set to 4% of the shards' memory). (7) is the current number of entries in the cache. Note: * The count of cache hits can be derived from these counters as (1) - (2). * cache_drop (3) also implies a cache hit (see above). This means that the number of actually reused queriers is: (1) - (2) - (3)	2018-03-13 10:34:34 +02:00
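The two derived figures in the note above (cache hits and actually reused queriers) follow directly from the lookup, miss, and drop counters. A minimal sketch, with a hypothetical helper name and made-up sample values:

```python
def querier_cache_summary(lookups, misses, drops):
    """Derive hit and reuse counts from the raw counters.

    hits   = lookups - misses          (a drop still counts as a hit)
    reused = lookups - misses - drops  (hits that were not dropped)
    """
    hits = lookups - misses
    reused = hits - drops
    return {"hits": hits, "reused": reused}


# Sample values, not measurements: 1000 lookups, 150 misses, 50 drops.
print(querier_cache_summary(lookups=1000, misses=150, drops=50))
```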


1053 Commits