scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 11:00:35 +00:00

Author	SHA1	Message	Date
Glauber Costa	f4a167670a	database: seal active memtables when we close the database Failing to do so can lead to data not being written to disk when we terminate. Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-06-21 09:39:31 +03:00
Glauber Costa	1f13d3e38f	database: gate seal_active_memtable We need to do that in order to close the database cleanly, flushing all pending data before we do. Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-06-21 09:39:29 +03:00
Avi Kivity	f221301d5e	Merge "preparation work - system table handling" from Glauber	2015-06-18 17:49:29 +03:00
Tomasz Grabiec	51cae834e3	db: Put all sstables behind single reader This change abstracts reading from on-disk data sources behind a single reader which is then composed with memtable readers. This change also abstracts all data sources behind a single reader obtained via column_family::make_reader(). That reader is then used by algorithms like column_family::for_all_partitions() or column_family::query(). Having those abstractions will make it easier to add row cache, because it will be encapsulated in a single place.	2015-06-18 16:33:33 +02:00
Tomasz Grabiec	7f1ff0401e	db: Move mutation_reader definition to separate header	2015-06-18 15:47:40 +02:00
Glauber Costa	057c38b61c	only populate system keyspace Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-06-18 09:22:20 -04:00
Pekka Enberg	8345874dda	database: Add database::has_schema() helper Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-06-17 15:45:45 +03:00
Gleb Natapov	2d409250f2	remove ad-hoc token_metadata creation	2015-06-15 12:51:09 +03:00
Avi Kivity	446731cf88	Merge "column family API" Column family API, from Amnon.	2015-06-15 10:50:23 +03:00
Vlad Zolotarov	e045d8465c	db: use snitch name from the configuration file Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2015-06-14 15:31:58 +03:00
Gleb Natapov	b7155ad862	pass partitions_ranges separately from read_command partitions_ranges will be split across different destinations, so provide it separately from read_command to avoid copying the latter for each destination.	2015-06-11 15:18:07 +03:00
Avi Kivity	ce6cd4b67e	Merge "Store keyspace strategy options to database" From Pekka: "This series fixes up schema management code to store keyspace strategy options to database. The map is stored as JSON just like in Origin."	2015-06-11 14:21:53 +03:00
Pekka Enberg	d088cb8181	Fix keyspace strategy options to preserve key-value ordering Fix keyspace strategy options to preserve key-value ordering by switching to std::map. We need this to be able to store the map in database as JSON because unordered maps can cause the schema merging code to attempt a keyspace update, which we don't support, even though the values did not change. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-06-11 13:02:42 +03:00
Pekka Enberg	b7a23ddadd	database: Memtable flush batching Currently, we flush out memtables very aggressively, which results in lots of small sstable writes. The proper fix here is to do accounting on the memtable size but before that happens, bump up the threshold to another magic number which gives better batching: $ ./build/release/seastar --smp 1 --data-file-directories data --commitlog-directory commitlog/ $ tools/bin/cassandra-stress write -mode cql3 native prepared -rate threads=32 Before: Results: op rate : 37280 partition rate : 37280 row rate : 37280 latency mean : 0.8 latency median : 0.6 latency 95th percentile : 1.1 latency 99th percentile : 7.6 latency 99.9th percentile : 11.9 latency max : 50.5 Total operation time : 00:00:30 END After: Results: op rate : 46721 partition rate : 46721 row rate : 46721 latency mean : 0.7 latency median : 0.5 latency 95th percentile : 0.9 latency 99th percentile : 1.3 latency 99.9th percentile : 5.8 latency max : 96.3 Total operation time : 00:00:39 END Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-06-11 10:24:35 +03:00
Amnon Heiman	b9e3a03483	Expose the column family info in the database The API needs the column family information in the database object. This adds function to the database to expose the column family information.	2015-06-11 09:50:52 +03:00
Calle Wilund	8b9a63a3c6	Database/commitlog: guard against replay position reordering Commit log guarantees that once an RP is assigned to a data frame/caller, it will not block before returning the result via future. However, this is not enough, since we could a.) Have blocked earlier, in which case the return value processing will be async anyway b.) Even if no blocking takes place, future chaining mechanism could decide it has to reorder execution. Assuming though that the case where this happens is rare, and cases where it actually affects the rule of replay position ordering are even rarer, we can guard against it by simply keeping track of the highest RP _discarded_ (sent to sstable flush), and if we attempt to apply a mutation with a higher RP, simply re-do the operation (i.e. write same entry to commit log again). Signed-off-by: Calle Wilund <calle@cloudius-systems.com>	2015-06-10 11:56:45 +03:00
Vlad Zolotarov	a2594015f9	locator: futurize snitch creation - Forbid explicit snitch creation with constructor. - Allow the creation of snitches only with locator::make_snitch() template function. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> New in v4: - Make sure the snitch is stopped before it's destroyed when _snitch_is_ready is returned in an exceptional state. New in v2: - Change snitch_ptr to be std::unique_ptr<i_endpoint_snitch> - abstract_replication_strategy::create_replication_strategy(): explicitly specify (template) types of create_object() parameters. - Re-arrange the loop in merge_keyspaces() so that lambdas that depend on "this" complete before there is a chance that "this" gets destroyed. - create_keyspace(): Don't add a new keyspace if a keyspace with this name already exists. - i_endpoint_snitch: added a stop() virtual method - Added a stop() pure virtual method. - Added an enum class snitch_state and a _state member initialized to snitch_state::initializing, added an assert() in a destructor requiring _state to become snitch_state::stopped, which should be set when stop() is complete. - rack_inferring_snitch: added a stop() method. - simple_snitch: added a stop() method. - Added stop() methods to abstract_replication_strategy and keyspace. - Updated database::stop() to wait for all keyspaces in _keyspaces to stop.	2015-06-09 15:33:38 +03:00
Vlad Zolotarov	c1f0d285bb	database: make the create_keyspace() function declaration match the definition. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2015-06-09 15:18:46 +03:00
Pekka Enberg	87e525b6b5	database: Add update and drop column family stubs They're needed by table merging in db/legacy_schema_tables.cc. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-06-08 14:42:36 +03:00
Tomasz Grabiec	b2549a7b14	Merge branch 'calle/secondary_index' from seastar-dev.git	2015-06-03 13:22:01 +02:00
Calle Wilund	293dbf66e3	Forward and use replay_position when applying mutation * Forward commitlog replay_position to column_family.memtable, updating highest RP if needed * When flushing memtable, signal back to commitlog that RP has been dealt with to potentially remove finished segment(s) Note: since memtable flushing right now is _not_ explicitly ordered, this does not actually work, since we need to guarantee ordering with regards to RP. I.e. if we flush N blocks, we must guarantee that: a.) We report "flushed RP" in RP order b.) For a given RP1, all RP* lower than RP1 must also have been flushed. (The latter means that it is fine to say, flush X tables at the same time, as long as we report a single RP that is the highest, and no lower RP:s exist in non-flushed tables) I am however letting someone else deal with ensuring MT->sstable flush order. Signed-off-by: Calle Wilund <calle@cloudius-systems.com>	2015-06-03 12:38:13 +03:00
Calle Wilund	724a33c11d	Database: add "existing_index_names"	2015-06-03 10:13:53 +02:00
Paweł Dziepak	8e66bfc9d4	db: add getter for database::_keyspaces Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-06-02 14:11:34 +02:00
Paweł Dziepak	d50859907f	db: update keyspace_metadata when column family is added Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-06-02 14:11:34 +02:00
Pekka Enberg	4dc488afb2	database: Store metadata in 'struct keyspace' Store a lw_shared_ptr<keyspace_metadata> in struct keyspace so callers in migration manager, for example, can look it up. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-05-25 09:12:29 +02:00
Avi Kivity	ff42d58881	db: use CoW to modify the memtable list in column_family Allow memtables to be removed from a column_family while a running query continues to use them.	2015-05-20 16:00:00 +03:00
Avi Kivity	1342553fed	db: remove column_family::testonly_all_memtables() Unused and gets in the way.	2015-05-20 15:28:53 +03:00
Avi Kivity	f8f6e979ef	db: use CoW to modify the sstable table in column_family Allow sstables to be removed from a column_family while a running query continues to use them.	2015-05-20 15:17:35 +03:00
Tomasz Grabiec	137b3beb2f	Merge tag 'avi/readpath-prep/v1' from seastar-dev.git From Avi: "This patchset prepares for adding sstables to the read path. Because sstables involve I/O, their APIs return futures, which means that APIs that may call those sstable APIs also need to return futures. This patchset uses the two-space indent + do_with + reference aliases trick to make patches more readable. Cleanup patches will follow once it is merged."	2015-05-19 20:39:36 +02:00
Pekka Enberg	56d6fdacfe	database: Simplify replication strategy initialization Initialize replication strategy when keyspace is being created now that we have access to keyspace_metadata. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-05-19 15:27:47 +03:00
Pekka Enberg	cd35617855	database: Use keyspace_metadata for creation functions Use the keyspace_metadata type for keyspace creation functions. This is needed to be able to have a mapping from keyspace name to keyspace metadata for various call-sites. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-05-19 15:27:47 +03:00
Avi Kivity	db04bba208	db: futurize the single partition query path Prepare for disk reads.	2015-05-19 15:13:09 +03:00
Avi Kivity	738be63b28	db: define column_family move constructor in .cc Allows using it from files that do not include sstable.hh.	2015-05-19 15:13:09 +03:00
Pekka Enberg	8380df84b4	database: Rename ks_meta_data to keyspace_metadata Follow the naming convention set by user_types_metadata and rename ks_meta_data to keyspace_metadata. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-05-19 11:24:06 +03:00
Pekka Enberg	7a84b53d61	database: Use lw_shared_ptr for user types metadata Use lw_shared_ptr for user types metadata member in ks_meta_data. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-05-19 11:17:55 +03:00
Pekka Enberg	a225439fdb	database: Inline ks_meta_data implementation The implementation part of ks_meta_data is just a few lines of code. Inline that to the database.hh header file. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-05-19 11:07:14 +03:00
Pekka Enberg	032af4d53b	database: Move ks_meta_data definition to database.hh Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-05-19 11:03:28 +03:00
Avi Kivity	07d7f410f3	Merge branch 'memtable' into db Conflicts: database.hh memtable changes moved to memtable.hh	2015-05-18 15:50:24 +03:00
Avi Kivity	875148dae6	db: create keyspace/column_family directory structure This is slightly awkward, since the directory structure is not sharded. This requires some processing to occur outside the shard, while the rest is sharded.	2015-05-18 15:34:41 +03:00
Avi Kivity	20775b9d5c	db: store a column_family's memtables in a list instead of a vector A vector can cause memtables to be move()d around, which breaks any code that captures a memtable's this pointer. Fix by using a linked list.	2015-05-18 15:34:25 +03:00
Avi Kivity	394e0d3a8c	db: make database::add_keyspace() return void Returning a reference to the keyspace is dangerous in that the keyspace can be moved away, when we start futurizing the add_keyspace() process. Make it return void and look up the keyspace at the point of use.	2015-05-18 15:34:25 +03:00
Avi Kivity	d8fed7e211	db: add simple memtable sealing policy Needs to be replaced with something better, but we lack the infrastructure so far (region memory allocator).	2015-05-18 15:34:25 +03:00
Avi Kivity	0eb842dc5b	db: write memtable after sealing it Still missing handling after write completes.	2015-05-18 15:00:33 +03:00
Avi Kivity	ca49d73f97	db: allow configuring a column family to be memory-only Useful for tests.	2015-05-18 15:00:33 +03:00
Avi Kivity	dda5cbfd0d	db: make column_family and keyspace configurable Currently used for the data directory.	2015-05-18 15:00:31 +03:00
Avi Kivity	7842113cb6	db: prune some unused column_family methods Made redundant by switching tests to using memtable directly.	2015-05-18 14:59:02 +03:00
Glauber Costa	2174285c31	db: move memtable definition to its own file Following what happened to others: we can now include memtable.hh without including database.hh Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-05-17 12:38:32 +03:00
Avi Kivity	40c2d91cd8	db: add memtable::find_or_create_row_slow() Useful for tests that do not need a column_family.	2015-05-17 10:31:22 +03:00
Tomasz Grabiec	f7abbda156	db: Apply frozen_mutation directly We don't convert it back to mutation before applying. mutation_partition has now apply() which works on mutation_partition_view.	2015-05-08 09:19:02 +02:00
Tomasz Grabiec	4ab66de0ae	db: Introduce frozen_mutation The immediate motivation for introducing frozen_mutation is the inability to deserialize the current "mutation" object, which needs a schema reference at the time it's constructed. It needs the schema to initialize its internal maps with proper key comparators, which depend on the schema. frozen_mutation is an immutable, compact form of a mutation. It doesn't use complex in-memory structures; data is stored in a linear buffer. In the case of frozen_mutation, the schema needs to be supplied only at the time the mutation partition is visited. Therefore it can be trivially deserialized without a schema.	2015-05-08 09:19:01 +02:00
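
Commits ff42d58881 and f8f6e979ef in the log above describe using copy-on-write (CoW) so that memtables or sstables can be removed from a column_family while a running query continues to use them. The idea can be sketched roughly as follows. This is only an illustration and not the code from those commits: `column_family_sketch`, `memtable_list`, `snapshot()`, `add()` and `remove()` are hypothetical names, memtables are reduced to plain strings, and the sketch assumes a single thread owns the object.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in: each "memtable" is just a name here.
using memtable_list = std::vector<std::string>;

// Illustrative sketch only; not ScyllaDB's column_family.
class column_family_sketch {
    // Readers share an immutable list; writers swap in a modified copy.
    std::shared_ptr<const memtable_list> _memtables =
        std::make_shared<memtable_list>();
public:
    // A query takes a snapshot and keeps the list alive for as long as
    // it holds the returned pointer.
    std::shared_ptr<const memtable_list> snapshot() const {
        return _memtables;
    }

    // Copy the current list, modify the copy, then publish it.
    void add(std::string name) {
        auto copy = std::make_shared<memtable_list>(*_memtables);
        copy->push_back(std::move(name));
        _memtables = std::move(copy);
    }

    // Removal never touches the list that existing snapshots point to.
    void remove(const std::string& name) {
        auto copy = std::make_shared<memtable_list>(*_memtables);
        copy->erase(std::remove(copy->begin(), copy->end(), name),
                    copy->end());
        _memtables = std::move(copy);
    }
};
```

A snapshot taken before `remove()` still sees the removed entry, while later snapshots do not. The old list is freed once the last snapshot referring to it is released.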

184 Commits