scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-06 06:53:12 +00:00

Author	SHA1	Message	Date
Piotr Sarna	dbe8491655	view: cache is_index for view pointer It's detrimental to keep querying index manager whether a view is backing a secondary index every time, so this value is cached at construct time. At the same time, this value is not simply passed to view_info when being created in secondary index manager, in order to decouple materialized view logic from secondary indexes as much as possible (the sole existence of is_index() is bad enough).	2019-02-20 12:52:32 +01:00
Nadav Har'El	05db7d8957	Materialized views: name the "batch_memory_max" constant Give the constant 1024*1024 introduced in an earlier commit a name, "batch_memory_max", and move it from view.cc to view_builder.hh. It now resides next to the pre-existing constant that controlled how many rows were read in each build step, "batch_size". Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190217100222.15673-1-nyh@scylladb.com>	2019-02-17 13:28:16 +00:00
Rafael Ávila de Espíndola	9cd14f2602	Don't write to system.large_partition during shutdown The included testcase used to crash because during database::stop() we would try to update system.large_partition. There doesn't seem to be an order we can stop the existing services in cql_test_env that makes this possible. This patch then adds another step when shutting down a database: first stop updating system.large_partition. This means that during shutdown any memtable flush, compaction or sstable deletion will not be reflected in system.large_partition. This is hopefully not too bad since the data in the table is TTLed. This seems to impact only tests, since main.cc calls _exit directly. Tests: unit (release,debug) Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20190213194851.117692-1-espindola@scylladb.com>	2019-02-15 10:49:10 +01:00
Gleb Natapov	0b84b04f97	consistency_level: make it more const correct Message-Id: <20190214122631.GF19055@scylladb.com>	2019-02-14 14:52:51 +02:00
Nadav Har'El	fec562ec8f	Materialized views: limit size of row batching during bulk view building The bulk materialized-view building processes (when adding a materialized view to a table with existing data) currently reads the base table in batches of 128 (view_builder::batch_size) rows. This is clearly better than reading entire partitions (which may be huge), but still, 128 rows may grow pretty large when we have rows with large strings or blobs, and there is no real reason to buffer 128 rows when they are large. Instead, when the rows we read so far exceed some size threshold (in this patch, 1MB), we can operate on them immediately instead of waiting for 128. As a side-effect, this patch also solves another bug: At worst case, all the base rows of one batch may be written into one output view partition, in one mutation. But there is a hard limit on the size of one mutation (commitlog_segment_size_in_mb, by default 32MB), so we cannot allow the batch size to exceed this limit. By not batching further after 1MB, we avoid reaching this limit when individual rows do not reach it but 128 of them did. Fixes #4213. This patch also includes a unit test reproducing #4213, and demonstrating that it is now solved. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190214093424.7172-1-nyh@scylladb.com>	2019-02-14 12:04:40 +02:00
Calle Wilund	e70286a849	db/extensions: Allow schema extensions to turn themselves off Fixes #4222 Iff an extension creation callback returns null (not exception) we treat this as "I'm not needed" and simply ignore it. Message-Id: <20190213124311.23238-1-calle@scylladb.com>	2019-02-13 14:50:51 +02:00
Calle Wilund	4e657c0633	system_keyspace: Add waitable for trunc. migration For tests. Hooray for separation of concern.	2019-02-13 09:08:12 +00:00
Calle Wilund	64e8c6f31d	storage_service: Add features disabling for tests	2019-02-13 09:08:12 +00:00
Calle Wilund	12ebcf1ec7	commitlog_replay: Use dedicated table for truncation Fixes #4083 Instead of sharded collection in system.local, use a dedicated system table (system.truncated) to store truncation positions. Makes query/update easier and easier on the query memory. The code also migrates any existing truncation positions on startup and clears the old data.	2019-02-13 09:08:12 +00:00
Calle Wilund	4a52ed7884	commitlog: Accept recycled (not yet re-used) segments in replay Refs #4085 Changes commitlog descriptor to both accept "Recycled-Commitlog..." file names, and preserve said name in the descriptor. This ensures we pick up the not-yet-used recycled segments left from a crash for replay. The replay in turn will simply ignore the recycled files, and post actual replay they will be deleted as needed. Message-Id: <20190129123311.16050-1-calle@scylladb.com>	2019-02-12 12:23:55 +02:00
Glauber Costa	e0bfd1c40a	allow Cassandra SSTables with counters to be imported if they are new enough Right now Cassandra SSTables with counters cannot be imported into Scylla. The reason for that is that Cassandra changed their counter representation in their 2.1 version and kept transparently supporting both representations. We do not support their old representation, nor is there a sane way to figure out by looking at the data which one is in use. For safety, we had made the decision long ago to not import any tables with counters: if a counter was generated in older Cassandra, we would misrepresent it. In this patch, I propose we offer a non-default way to import SSTables with counters: we can gate it with a flag, and trust that the user knows what they are doing when flipping it (at their own peril). Cassandra 2.1 is by now pretty old. Many users can safely say they've never used anything older. While there are tools like sstableloader that can be used to import those counters, there are often situations in which directly importing SSTables is either better, faster, or worse: the only option left. I argue that having a flag that allows us to import them when we are sure it is safe is better than having no option at all. With this patch I was able to successfully import Cassandra tables with counters that were generated in Cassandra 2.1, reshard and compact their SSTables, and read the data back to get the same values in Scylla as in Cassandra. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20190210154028.12472-1-glauber@scylladb.com>	2019-02-10 17:50:48 +02:00
Calle Wilund	ba6a8ef35b	tls: Use a default prio string disabling TLS1.0 forcing min 128bits Fixes #4010 Unless user sets this explicitly, we should try to explicitly avoid deprecated protocol versions. While gnutls should do this for connections initiated thusly, clients such as drivers etc might use obsolete versions. Message-Id: <20190107131513.30197-1-calle@scylladb.com>	2019-02-05 15:34:18 +02:00
Avi Kivity	6c71eae63f	Merge "API: Stream compaction history records" from Amnon " get_compaction_history can return a lot of records which will add up to a big http reply. This series makes sure it will not create large allocations when returning the results. It adds an api to the query_processor to use paged queries with a consumer function that returns a future, this way we can use the http stream after each record. This implementation will prevent large allocations and stalls. Fixes #4152 " * 'amnon/compaction_history_stream_v7' of github.com:scylladb/seastar-dev: tests/query_processor_test: add query_with_consumer_test system_keyspace, api: stream get_compaction_history query_processor: query and for_each_cql_result with future	2019-02-05 14:16:36 +02:00
Amnon Heiman	6c7742d616	system_keyspace, api: stream get_compaction_history get_compaction_history can return a big chunk of data. To prevent large memory allocations, get_compaction_history now reads each compaction_history record and uses the http stream to send it. Fixes #4152 Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2019-02-05 11:14:53 +02:00
Piotr Jastrzebski	834bec5cc9	Read shard awareness columns as dropped Without this, a new version of Scylla won't be able to start with system tables inherited from an older version that had shard awareness columns. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <cb62f20fc0c98f532c6f4ad5e08b3794951e85bd.1549289050.git.piotr@scylladb.com>	2019-02-04 18:43:11 +02:00
Calle Wilund	9cadbaa96f	commitlog_replayer: Bugfix: finding truncation positions uses local var ref "uuid" was ref:ed in a continuation. Works 99.9% of the time because the continuation is not actually delayed (and assuming we begin the checks with non-truncated (system) cf:s it works). But if we do delay continuation, the resulting cf map will be borked. Fixes #4187. Message-Id: <20190204141831.3387-1-calle@scylladb.com>	2019-02-04 16:51:13 +02:00
Avi Kivity	468f8c7ee7	Merge "Print a warning if a row is too large" from Rafael " This is a first step in fixing #3988. " * 'espindola/large-row-warn-only-v4' of https://github.com/espindola/scylla: Rename large_partition_handler Print a warning if a row is too large Remove default parameter value Rename _threshold_bytes to _partition_threshold_bytes keys: add schema-aware printing for clustering_key_prefix	2019-02-03 13:57:42 +02:00
Piotr Jastrzebski	ad217bbdc7	Revert "system_keyspace: add sharding information to local table" This reverts commit `bdce561ada`. Those columns are not used and cause problems with tools. Refs #4112 Message-Id: <c772ebc0ebc001e5bdf229424c6d51dc58cd5d2e.1548945023.git.piotr@scylladb.com>	2019-01-31 19:06:55 +01:00
Rafael Ávila de Espíndola	625080b414	Rename large_partition_handler Now that it also handles large rows, rename it to large_data_handler. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-01-28 15:03:14 -08:00
Rafael Ávila de Espíndola	1185138a34	Print a warning if a row is too large Tests: unit (release) Refs #3988. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-01-28 15:03:10 -08:00
Rafael Ávila de Espíndola	776d5bb9e2	Remove default parameter value The value is already passed by cql_table_large_partition_handler, so the default was just for nop_large_partition_handler. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-01-28 13:02:01 -08:00
Rafael Ávila de Espíndola	30528fa853	Rename _threshold_bytes to _partition_threshold_bytes A followup patch will add a threshold for rows. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-01-28 13:02:01 -08:00
Duarte Nunes	ea34e242de	Merge 'Do not use hints for view building' from Piotr " This series prevents view building from falling back to storing hints. Instead, it will try to send hints to an endpoint as if it has consistency level ONE, and in case of failure retry the whole building step. Then, view building will never be marked as finished prematurely (because of pending hints), which will help avoid creating inconsistencies when decommissioning a node from the cluster. Tests: unit (release) dtest (materialized_views_test.py.) Fixes #3857 Fixes #4039 " 'do_not_mark_view_as_built_with_hints_7' of https://github.com/psarna/scylla: db,view: add updating view_building_paused statistics database: add view_building_paused metrics table: make populate_views not allow hints db,view: add allow_hints parameter to mutate_MV storage_proxy: add allow_hints parameter to send_to_endpoint	2019-01-28 10:31:14 +00:00
Piotr Sarna	9a6261ca27	db,view: add updating view_building_paused statistics Each time view building is paused because of a connection failure, the view_building_paused metric is bumped.	2019-01-28 09:38:42 +01:00
Piotr Sarna	e30cf22956	db,view: add allow_hints parameter to mutate_MV Mutating MV function can now accept a parameter whether hints should be allowed during sending mutations to endpoints.	2019-01-28 09:38:42 +01:00
Piotr Sarna	e0fe9ce2c0	storage_proxy: add allow_hints parameter to send_to_endpoint With hints allowed, send_to_endpoint will leverage consistency level ANY to send data. Otherwise, it will use the default - cl::ONE.	2019-01-28 09:38:41 +01:00
Rafael Ávila de Espíndola	5332ebd50c	Update the description of compaction_large_partition_warning_threshold_mb Despite the name, this option also controls if a warning is issued during memtable writes. Warning during memtable writes is useful but the option name also exists in cassandra, so probably the best we can do is update the description. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20190125020821.72815-1-espindola@scylladb.com>	2019-01-28 09:09:35 +02:00
Piotr Jastrzebski	ad016a732b	Move set_type_impl out of types.hh to types/set.hh Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-01-24 09:56:38 +01:00
Piotr Jastrzebski	b1e1b66732	Move list_type_impl out of types.hh to types/list.hh Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-01-24 09:56:38 +01:00
Piotr Jastrzebski	147cc031db	Move map_type_impl out of types.hh to types/map.hh Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-01-24 09:56:38 +01:00
Piotr Jastrzebski	7666e81b51	Decouple database.hh from types/user.hh This commit declares shared_ptr<user_types_metadata> in database.hh where user_types_metadata is an incomplete type, so it requires "Allow to use shared_ptr with incomplete type other than sstable" to compile correctly. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-01-24 09:55:04 +01:00
Piotr Jastrzebski	e92b4c3dbc	Move user_type_impl out of types.hh to types/user.hh Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-01-24 09:04:04 +01:00
Rafael Ávila de Espíndola	f7d1dc16d4	database: Use nop_large_partition_handler to avoid self-reporting Currently nop_large_partition_handler is only used in tests, but it can also be used to avoid self-reporting. Tests: unit(Release) I also tested starting scylla with --compaction-large-partition-warning-threshold-mb=0. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20190123205059.39573-1-espindola@scylladb.com>	2019-01-23 21:11:21 +00:00
Duarte Nunes	88c7c1e851	Merge 'hinted handoff: cache cf mappings' from Vlad " Cache cf mappings when breaking in the middle of a segment sending so that the sender has them the next time it wants to send this segment for where it left off before. Also add the "discard" metric so that we can track hints that are being discarded in the send flow. " Fixes #4122 * 'hinted_handoff_cache_cf_mappings-v1' of https://github.com/vladzcloudius/scylla: hinted handoff: cache column family mappings for segments that were not sent out in full hinted handoff: add a "discarded" metric	2019-01-23 00:44:41 +00:00
Vlad Zolotarov	34829b8f81	hinted handoff: cache column family mappings for segments that were not sent out in full We will try to send a particular segment later (in 1s) from the place where we left off if it wasn't sent out in full before. However, we may miss some of the column family mappings when we get back to sending this file and start sending from some entry in the middle of it (where we left off) if we didn't save the column family mappings we cached while reading this segment from its beginning. This happens because commitlog doesn't save column family information in every entry but rather once for each unique column family (version) per "cycle" (see commitlog::segment description for more info). Therefore we have to assume that a particular column family mapping appears once in the whole segment (worst case). And therefore, when we decide to resume sending a segment we need to keep the column family mappings we accumulated so far and drop them only after we are done with this particular segment (sent it out in full). Fixes #4122 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2019-01-22 15:24:22 -05:00
Vlad Zolotarov	4516a8cfc4	hinted handoff: add a "discarded" metric Account for the number of hints that were discarded in the send path. This may happen, for instance, due to a schema change or because a hint is too old. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2019-01-22 14:11:09 -05:00
Benny Halevy	93270dd8e0	gc_clock: make 64 bit Fixes: #3353 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-01-22 15:34:32 +02:00
Benny Halevy	9878b36895	db: get default_time_to_live as int32_t rather than gc_clock::rep Otherwise, value_cast<> throws std::bad_cast exception when gc_clock::rep is defined as int64_t. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-01-22 15:34:32 +02:00
Botond Dénes	4e89dea9ea	database: don't allow access to global semaphores Recently we had a bug (#4096) due to a component (`multishard_mutation_query()`) assuming that all reads used the semaphore obtainable via `database::user_read_concurrency_sem()`. This problem revealed that it is plain wrong to allow access to the shard-global semaphores residing in the database object. Instead all code wishing to access the relevant semaphore for some read, should do so via the relevant `table` object, thus guaranteeing that it will get the correct semaphore, configured for that table. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <4f3a6780eb3240822db34aba7c1ba0a675a96592.1547734212.git.bdenes@scylladb.com>	2019-01-21 16:29:02 +02:00
Avi Kivity	e0914a080e	schema_tables: partially de-template make_map_mutation() make_map_mutation() is called several times, hopefully with the same Map type parameter. Replace the Func parameter with a noncopyable_function<>.	2019-01-20 15:55:20 +02:00
Avi Kivity	630f841e5b	hints: de-template scan_for_hints_dirs() This function is called twice, and is not doing anything performance critical, so replace the template parameter Func with std::function<>.	2019-01-20 15:55:20 +02:00
Avi Kivity	6e6372e8d2	Revert "Merge "Type-erase gratuitous templates with functions" from Avi" This reverts commit `31c6a794e9`, reversing changes made to `4537ec7426`. It causes bad_function_calls in some situations: INFO 2019-01-20 01:41:12,164 [shard 0] database - Keyspace system: Reading CF sstable_activity id=5a1ff267-ace0-3f12-8563-cfae6103c65e version=d69820df-9d03-3cd0-91b0-c078c030b708 INFO 2019-01-20 01:41:13,952 [shard 0] legacy_schema_migrator - Moving 0 keyspaces from legacy schema tables to the new schema keyspace (system_schema) INFO 2019-01-20 01:41:13,958 [shard 0] legacy_schema_migrator - Dropping legacy schema tables INFO 2019-01-20 01:41:14,702 [shard 0] legacy_schema_migrator - Completed migration of legacy schema tables ERROR 2019-01-20 01:41:14,999 [shard 0] seastar - Exiting on unhandled exception: std::bad_function_call (bad_function_call)	2019-01-20 11:32:14 +02:00
Avi Kivity	2407c35cc1	schema_tables: partially de-template make_map_mutation() make_map_mutation() is called several times, hopefully with the same Map type parameter. Replace the Func parameter with a noncopyable_function<>.	2019-01-17 18:54:43 +02:00
Avi Kivity	81d004b2c0	hints: de-template scan_for_hints_dirs() This function is called twice, and is not doing anything performance critical, so replace the template parameter Func with std::function<>.	2019-01-17 18:51:46 +02:00
Piotr Sarna	02d88de082	db,view: add consuming units in staging table registration View update generator service can accept sstables even before it starts, but it should still acknowledge the number of waiters in the semaphore. Reported-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <fcaa0f2884ebb4d34d1716e9e1cfed0642b4b85d.1547661048.git.sarna@scylladb.com>	2019-01-16 18:05:17 +00:00
Duarte Nunes	04a14b27e4	Merge 'Add handling staging sstables to /upload dir' from Piotr " This series adds generating view updates from sstables added through /upload directory if their tables have accompanying materialized views. Said sstables are left in /upload directory until updates are generated from them and are treated just like staging sstables from /staging dir. If there are no views for a given tables, sstables are simply moved from /upload dir to datadir without any changes. Tests: unit (release) " * 'add_handling_staging_sstables_to_upload_dir_5' of https://github.com/psarna/scylla: all: rename view_update_from_staging_generator distributed_loader: fix indentation service: add generating view updates from uploaded sstables init: pass view update generator to storage service sstables: treat sstables in upload dir as needing view build sstables,table: rename is_staging to requires_view_building distributed_loader: use proper directory for opening SSTable db,view: make throttling optional for view_update_generator	2019-01-15 18:19:27 +00:00
Piotr Sarna	0eb703dc80	all: rename view_update_from_staging_generator The new name, view_update_generator, is both more concise and correct, since we now generate from directories other than "/staging".	2019-01-15 17:31:47 +01:00
Piotr Sarna	beb4836726	db,view: make throttling optional for view_update_generator Currently registering new view updates is throttled by a semaphore, which makes sense during stream sessions in order to avoid overloading the queue. Still, registration also occurs during initialization, where it makes little sense to wait on a semaphore, since view update generator might not have started at all yet.	2019-01-15 16:47:01 +01:00
Piotr Sarna	b9203ec4f8	view: wait for stream sessions to finish before view building During streaming, there's a race between streamed sstables and view creation, which might result in some tables not being used to generate view updates, even though they should. That happens when the decision about view update path for a table is done before view creation, but after already receiving some sstables via streaming. These will not be used in view building even though they should. Hence, a phaser is used to make the view builder wait for all ongoing stream sessions for a table to finish before proceeding with build steps. Refs #4032	2019-01-15 09:36:55 +01:00
Rafael Ávila de Espíndola	26ac2c23ef	Change _row_ names that refer to partitions This renames some variables and functions to make it clear that they refer to partitions and not rows. Old versions of sstablemetadata used to refer to a row histogram, but current versions now mention a partition histogram instead. This patch doesn't change the exposed API names. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20181229223311.4184-2-espindola@scylladb.com>	2019-01-09 14:53:42 +02:00


4972 Commits