Adapt our compaction code to start writing a new sstable if the
one being written reaches its maximum size. The leveled strategy
relies on this concept. If a strategy other than leveled is being used,
everything works as before.
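A minimal sketch of the rollover idea, using hypothetical, simplified names rather than the actual compaction code:
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified writer: the real code streams rows into sstable
// writers; here "size" is just a byte counter.
struct sstable_writer {
    uint64_t written = 0;
    void write(const std::string& row) { written += row.size(); }
};

// Roll over to a new output sstable once the one being written reaches
// max_sstable_size; with an effectively unlimited size this degenerates to a
// single output, which is what non-leveled strategies get.
std::vector<std::unique_ptr<sstable_writer>>
compact(const std::vector<std::string>& rows, uint64_t max_sstable_size) {
    std::vector<std::unique_ptr<sstable_writer>> outputs;
    outputs.push_back(std::make_unique<sstable_writer>());
    for (const auto& row : rows) {
        if (outputs.back()->written >= max_sstable_size) {
            outputs.push_back(std::make_unique<sstable_writer>());
        }
        outputs.back()->write(row);
    }
    return outputs;
}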
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Currently, we are calculating truncated_at during truncate() independently for
each shard. It will work if we're lucky, but it is fairly easy to trigger cases
in which each shard will end up with a slightly different time.
The main problem here is that this time is used as the snapshot name when auto
snapshots are enabled. Prior to my last fixes, this would just generate two
separate directories in this case, which is wrong but not severe.
But after the fix, this means that both shards will wait for one another to
synchronize, and this will hang the database.
Fix this by making sure that the truncation time is calculated before
invoke_on_all in all needed places.
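The fix boils down to the following pattern; a sketch with simplified, illustrative names (the real code goes through invoke_on_all on the sharded database):
#include <chrono>
#include <cstddef>

// Illustrative stand-in for running a lambda on every shard; the real code
// uses seastar's invoke_on_all().
template <typename Func>
void invoke_on_all(std::size_t shard_count, Func f) {
    for (std::size_t shard = 0; shard < shard_count; ++shard) {
        f(shard);
    }
}

void truncate_all_shards(std::size_t shard_count) {
    // Compute the truncation time once, on the coordinating shard...
    const auto truncated_at = std::chrono::system_clock::now();
    // ...and hand the very same value to every shard, so all of them agree on
    // the snapshot name instead of each computing a slightly different time.
    invoke_on_all(shard_count, [truncated_at](std::size_t shard) {
        (void)shard;
        (void)truncated_at; // each shard snapshots/truncates using truncated_at
    });
}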
Signed-off-by: Glauber Costa <glommer@scylladb.com>
From Pekka:
This patch series implements support for CQL DROP TABLE. It uses the newly
added truncate infrastructure under the hood. After this series, the
table_test CQL test in dtest passes:
[penberg@nero urchin-dtest]$ nosetests -v cql_tests.py:TestCQL.table_test
table_test (cql_tests.TestCQL) ... ok
----------------------------------------------------------------------
Ran 1 test in 23.841s
OK
For drop_column_family(), we want to first remove the column_family from
lookup tables and truncate after that to avoid races. Introduce a
truncate() variant that takes keyspace and column_family references.
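Roughly, the ordering looks like this (a simplified, synchronous sketch; the real code is future-based and the types here are stand-ins):
#include <map>
#include <string>
#include <utility>

struct keyspace {};
struct column_family {};

// The new truncate() variant: takes the keyspace and column_family by
// reference, so it can still be used after the cf was removed from the
// lookup tables. Body elided; illustration only.
void truncate(keyspace&, column_family&) {}

struct database_sketch {
    keyspace ks;
    std::map<std::string, column_family> column_families;

    void drop_column_family(const std::string& name) {
        auto it = column_families.find(name);
        if (it == column_families.end()) {
            return;
        }
        // Remove the cf from the lookup tables first, so no new request can
        // reach it, and only then truncate it; the opposite order would race
        // with concurrent operations resolving the cf by name.
        column_family cf = std::move(it->second);
        column_families.erase(it);
        truncate(ks, cf);
    }
};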
Signed-off-by: Pekka Enberg <penberg@scylladb.com>
Currently, we control the incremental backup behavior from the storage service.
This creates some very concrete problems, since the storage service is not
always available and initialized.
The solution is to move it to the column family (and to the keyspace, so we can
properly propagate the config file value). When we change this from the API, we
will have to iterate over all of them, changing the value accordingly.
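A sketch of what the API-side change amounts to, with illustrative names only:
#include <string>
#include <unordered_map>

// Illustrative only: the flag now lives on the column family (seeded from the
// keyspace / config file) instead of being a single knob on the storage service.
struct column_family {
    bool incremental_backups = false;
};

struct database_sketch {
    std::unordered_map<std::string, column_family> column_families;

    // The API handler has to walk every column family and update the flag.
    void set_incremental_backups(bool enabled) {
        for (auto& [name, cf] : column_families) {
            (void)name;
            cf.incremental_backups = enabled;
        }
    }
};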
Signed-off-by: Glauber Costa <glommer@scylladb.com>
We will need to change some properties of the keyspace / cf. We need an accessor
that is not marked as const.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
This patch contains the following changes: in the definition of the read
and write latency histograms, it removes the mask value, so the
default value will be used.
To support gathering the read latency histogram, the query method
cannot be const, as it modifies the histogram statistics.
The read statistic is sample based and should have no real impact on
performance; if an impact does show up, we can always change it in the
future to a lower sampling rate.
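To illustrate why sampling keeps the cost down, a sketch with hypothetical names (this is not the actual histogram type used by the patch):
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative sampled recorder: only one read out of every sample_rate
// touches the histogram, which keeps the per-read overhead negligible.
struct sampled_latency_histogram {
    std::vector<uint64_t> buckets = std::vector<uint64_t>(32, 0);
    uint64_t counter = 0;
    uint64_t sample_rate = 128; // hypothetical rate; can be lowered later

    void maybe_record(uint64_t latency_us) {
        if (++counter % sample_rate != 0) {
            return;
        }
        std::size_t bucket = 0; // log2-style bucketing
        while ((latency_us >>= 1) && bucket + 1 < buckets.size()) {
            ++bucket;
        }
        ++buckets[bucket];
    }
};

struct column_family_sketch {
    sampled_latency_histogram read_latency;

    // query() can no longer be const: recording the read latency mutates
    // the histogram statistics.
    void query(uint64_t observed_latency_us) {
        read_latency.maybe_record(observed_latency_us);
    }
};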
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
When we convert exceptions into CQL server errors, type information is
not preserved. Therefore, improve exception error messages to make
debugging dtest failures, for example, slightly easier.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
column_family
This patch adds a getter for the dirty_memory_region_group in the
database object and adds an occupancy method to column family that
returns the total occupancy of all the memtables in the column family.
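The occupancy part is conceptually just a sum over the cf's memtables; a sketch with stand-in types:
#include <cstddef>
#include <memory>
#include <vector>

// Simplified memtable stand-in; the real one reports the memory it holds.
struct memtable {
    std::size_t bytes_used = 0;
    std::size_t occupancy() const { return bytes_used; }
};

struct column_family_sketch {
    std::vector<std::shared_ptr<memtable>> memtables;

    // Total occupancy across every memtable held by this column family.
    std::size_t occupancy() const {
        std::size_t total = 0;
        for (const auto& mt : memtables) {
            total += mt->occupancy();
        }
        return total;
    }
};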
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
All database code was converted to use distributed<storage_proxy> when
storage_proxy was made distributed, but then new code was written to use
storage_proxy& again. Passing the distributed<> object is safer since it can be
passed between shards safely. There was a patch to fix one such case yesterday;
I found one more while converting.
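The difference, sketched with a minimal stand-in for the distributed<> template (signatures are illustrative, not the actual ones):
#include <vector>

struct storage_proxy {};

// Minimal stand-in for a distributed<> wrapper: one instance per shard,
// local() returning the calling shard's instance.
template <typename T>
struct distributed {
    std::vector<T> instances;
    T& local(unsigned shard = 0) { return instances.at(shard); }
};

// Risky to hold across shards: a plain reference is only meaningful on the
// shard that owns the object.
void handle_query_unsafe(storage_proxy& proxy);

// Safer: the distributed<> wrapper itself can be passed between shards, and
// each shard grabs its own instance via local() where it is actually used.
void handle_query(distributed<storage_proxy>& proxy) {
    storage_proxy& local_proxy = proxy.local();
    (void)local_proxy;
}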
Unlike cache, dirty memory cannot be evicted at will, so we must limit it.
This patch establishes a hard limit of 50% of all memory. Above that,
new requests are not allowed to start. This allows the system some time
to clean up memory.
Note that we will need more fine-grained bandwidth control than this;
the hard limit is the last line of defense against running out of reclaimable
memory.
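In effect the admission check is this simple (a sketch; the names are hypothetical, not the actual ones):
#include <cstddef>

// Illustrative admission control: refuse to start new requests while dirty
// (not yet flushed) memory exceeds half of all memory, giving flushes and
// compaction time to reclaim it. The 50% figure is the hard limit above.
struct dirty_memory_limiter {
    std::size_t total_memory;
    std::size_t dirty_memory = 0;

    explicit dirty_memory_limiter(std::size_t total) : total_memory(total) {}

    bool may_admit_request() const {
        return dirty_memory < total_memory / 2;
    }
};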
Tested with a mixed read/write load; after reads start to dominate writes
(due to the proliferation of small sstables and the inability of compaction
to keep up), dirty memory usage starts to climb until the hard stop prevents
it from climbing further and OOMing the server.
"Initial implementation/transposition of commit log replay.
* Changes replay position to be shard aware
* Commit log segment IDs now follow basically the same scheme as origin:
max(previous ID, wall clock time in ms) + shard info (for us)
* SSTables now use the DB definition of replay_position.
* Stores and propagates (compaction) flush replay positions in sstables
* If CL segments are left over from a previous run, they and the existing
sstables are inspected for their high-water marks, and the segments are then
replayed from those marks to recover mutations potentially lost in a crash
(see the sketch after this list)
* Note that a CPU count change is "handled" only insofar as shard matching is
done per the _previous_ run's shards, not the current ones.
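A heavily simplified sketch of that high-water-mark filtering (the real replayer is asynchronous and per-shard; the types below are stand-ins):
#include <cstdint>
#include <vector>

// Stand-in for the replay position stored in sstables: the segment the
// mutation came from and its offset within that segment.
struct replay_position {
    uint64_t segment_id = 0;
    uint32_t position = 0;
    bool operator<(const replay_position& o) const {
        return segment_id != o.segment_id ? segment_id < o.segment_id
                                          : position < o.position;
    }
};

struct logged_mutation {
    replay_position rp;
    std::vector<char> data;
};

// Replay only mutations written after the high-water mark recovered from the
// existing sstables, i.e. those potentially lost in a crash before a flush.
std::vector<logged_mutation>
mutations_to_replay(const std::vector<logged_mutation>& from_segments,
                    const replay_position& flushed_high_water_mark) {
    std::vector<logged_mutation> pending;
    for (const auto& m : from_segments) {
        if (flushed_high_water_mark < m.rp) {
            pending.push_back(m);
        }
    }
    return pending;
}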
Known limitations:
* Mutations deserialized from old CL segments are _not_ fully validated
against existing schemas.
* System::truncated_at (not currently used) does not handle sharding, AFAIK,
so watermark IDs coming from there are dubious.
* Mutations that fail to apply (invalid, broken) are not placed in blob files
as in origin, partly because I am lazy, but also partly because our serial
format differs and we currently have no tools to do anything useful with it
* No replay filtering (origin allows a system property to designate a filter
file detailing which keyspace/cfs to replay), partly because we have no
system properties.
There is no unit test for the commit log replayer (yet), because I could not
really come up with a good one given the test infrastructure that exists (it
is tricky to kill stuff just "right").
The functionality is verified by manual testing, i.e. running scylla,
building up data (cassandra-stress), then kill -9 + restart.
This of course does not fully validate whether the resulting DB is 100% valid
compared to the one at the time of kill -9, but at least it verifies that
replay took place and that mutations were applied.
(Note that origin also lacks validity testing)"
This patch adds the get_non_system_keyspaces method that is found in origin,
and exposes the replication strategy via the get_replication_strategy
method.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
Add to the API a function that returns the count of sstables in L0 if the
leveled compaction strategy is enabled, and 0 otherwise. Currently, we don't
support the leveled compaction strategy, so the function always returns zero.
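The intended behavior, as a sketch (illustrative names; today the leveled branch is simply never taken):
#include <cstdint>
#include <vector>

struct sstable_sketch {
    int level = 0;
};

// Once the leveled strategy is supported, count the sstables sitting in L0;
// with any other strategy the answer is defined to be zero.
int64_t sstables_in_l0(const std::vector<sstable_sketch>& ssts, bool leveled_enabled) {
    if (!leveled_enabled) {
        return 0;
    }
    int64_t count = 0;
    for (const auto& s : ssts) {
        if (s.level == 0) {
            ++count;
        }
    }
    return count;
}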
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
It was noticed that the same sstable files could be selected for
compaction if concurrent compactions happen on the same cf.
That's possible because the compaction manager uses 2 tasks for
handling compactions.
The solution is to not duplicate a cf in the compaction manager queue,
and to re-schedule compaction for a cf if needed.
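The deduplication amounts to something like this (a synchronous sketch with illustrative names; the real queue is future-based):
#include <deque>
#include <string>
#include <unordered_set>

// Illustrative compaction queue: a cf is only enqueued if it isn't already
// pending, so two compaction tasks can never pick the same cf (and hence the
// same sstables) at once. Callers re-submit the cf later if more work remains.
struct compaction_queue {
    std::deque<std::string> pending;
    std::unordered_set<std::string> queued;

    bool submit(const std::string& cf) {
        if (!queued.insert(cf).second) {
            return false; // already queued; avoid a duplicate entry
        }
        pending.push_back(cf);
        return true;
    }

    std::string pop() {
        // precondition: !pending.empty()
        auto cf = pending.front();
        pending.pop_front();
        queued.erase(cf);
        return cf;
    }
};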
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
"This series exposes statistics from the row_cache in the cache_service API.
After this series the following methods will be available:
get_row_hits
get_row_requests
get_row_hit_rate
get_row_size
get_row_entries"
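The hit rate is simply derived from the other counters; a sketch with hypothetical names:
#include <cstdint>

// Illustrative counters behind the listed methods: hits and total requests,
// with the hit rate computed from the two.
struct row_cache_stats {
    uint64_t hits = 0;
    uint64_t requests = 0;

    double hit_rate() const {
        return requests ? static_cast<double>(hits) / requests : 0.0;
    }
};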
We need a way to remove a column family from the compaction manager,
because when dropping a column family we need to make sure that the
compaction manager doesn't hold a reference to it anymore.
The compaction manager queue is now a queue of column_family, allowing us
to cancel requests pertaining to a column family being dropped.
There may be an ongoing compaction for the column family being
dropped, so we also need to wait for its termination.
The test case for the compaction manager was also adapted and improved.
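In spirit, removal looks like this (a blocking sketch with stand-in types; the real code is future-based and does not block a shard):
#include <algorithm>
#include <deque>
#include <future>
#include <string>
#include <unordered_map>

// Illustrative compaction manager: dropping a cf removes any queued request
// for it and then waits for an ongoing compaction on that cf, if any, to
// finish before the caller may destroy the cf.
struct compaction_manager_sketch {
    std::deque<std::string> pending;                             // queued cfs
    std::unordered_map<std::string, std::future<void>> ongoing;  // running compactions

    void remove(const std::string& cf) {
        // Cancel queued requests pertaining to the dropped cf.
        pending.erase(std::remove(pending.begin(), pending.end(), cf), pending.end());
        // Wait for an ongoing compaction of this cf to terminate.
        auto it = ongoing.find(cf);
        if (it != ongoing.end()) {
            it->second.wait();
            ongoing.erase(it);
        }
    }
};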
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
This exposes the row_cache in the column family; it will be used by the
API to get the row_cache statistics.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>