scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 20:05:10 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	c95dd67d11	utils: Introduce cached_file It is a read-through cache of a file. Will be used to cache contents of the promoted index area from the index file. Currently, cached pages are evicted manually using the invalidate_*() method family, or when the object is destroyed. The cached_file represents a subset of the file. The reason for this is to satisfy two requirements. One is that we have a page-aligned caching, where pages are aligned relative to the start of the underlying file. This matches requirements of the seastar I/O engine on I/O requests. Another requirement is to have an effective way to populate the cache using an unaligned buffer which starts in the middle of the file when we know that we won't need to access bytes located before the buffer's position. See populate_front(). If we couldn't assume that, we wouldn't be able to insert an unaligned buffer into the cache.	2020-06-16 16:15:23 +02:00
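The page-aligned read-through idea described in the cached_file commit can be sketched roughly as follows. This is a toy illustration in Python, not the Scylla implementation: the class name `CachedFile`, the `reads` counter, the `invalidate_all()` method and the 4096-byte page size are all assumptions made for the example, and `populate_front()` is not modeled.

```python
import io


class CachedFile:
    """Toy read-through cache whose pages are aligned to the start of the file."""

    def __init__(self, f, page_size=4096):
        self._f = f
        self._page_size = page_size
        self._pages = {}   # page index -> bytes of that page
        self.reads = 0     # number of reads that reached the underlying file

    def _page(self, idx):
        # Read-through: fetch the page from the file only on a cache miss.
        page = self._pages.get(idx)
        if page is None:
            self._f.seek(idx * self._page_size)
            page = self._f.read(self._page_size)
            self.reads += 1
            self._pages[idx] = page
        return page

    def read(self, offset, length):
        # Serve an arbitrary (unaligned) range from aligned pages.
        out = bytearray()
        end = offset + length
        while offset < end:
            idx, start = divmod(offset, self._page_size)
            chunk = self._page(idx)[start:start + (end - offset)]
            if not chunk:  # past the end of the file
                break
            out += chunk
            offset += len(chunk)
        return bytes(out)

    def invalidate_all(self):
        # Manual eviction (hypothetical stand-in for the invalidate_*() family).
        self._pages.clear()
```

A read that straddles a page boundary touches two aligned pages; repeating it is served from the cache until the pages are invalidated.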
Tomasz Grabiec	ab274b8203	sstables: clustered_index: Relax scope of validity of entry_info entry_info holds views, which may get invalidated when the containing index blocks are removed. Current implementations of next_entry() keep the blocks in memory as long as the cursor is alive but that will change in new implementations of the cursor. Adjust the assumption of tests accordingly.	2020-06-16 16:15:23 +02:00
Tomasz Grabiec	ea2fbcc2cd	sstables: index_entry: Introduce owning promoted_index_block_position	2020-06-16 16:15:23 +02:00
Tomasz Grabiec	714da3c644	compound_compat: Allow constructing composite from a view	2020-06-16 16:15:23 +02:00
Tomasz Grabiec	f2e52c433f	sstables: index_entry: Rename promoted_index_block_position to promoted_index_block_position_view	2020-06-16 16:15:23 +02:00
Tomasz Grabiec	101fd613c5	sstables: mc: Extract parser for promoted index block It will be reused in binary search over the index.	2020-06-16 16:15:14 +02:00
Tomasz Grabiec	a557c374fd	sstables: mc: Extract parser for clustering out of the promoted index block parser This parser will be used stand-alone when doing a binary search over promoted index blocks. We will only parse the start key not the whole block.	2020-06-16 16:14:31 +02:00
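A binary search that looks only at each block's start key might look like the sketch below. The block representation and the `parse_start` callback are hypothetical; the real parser and block layout are not shown in this log.

```python
def find_block(blocks, key, parse_start):
    """Return the index of the last block whose start key is <= key, or -1.

    parse_start extracts only the start key of a block, so the rest of the
    block is never parsed during the search.
    """
    lo, hi = 0, len(blocks)
    while lo < hi:
        mid = (lo + hi) // 2
        if parse_start(blocks[mid]) <= key:
            lo = mid + 1
        else:
            hi = mid
    return lo - 1
```

With blocks sorted by start key, the search needs O(log n) start-key parses instead of scanning every block.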
Tomasz Grabiec	95df7126a7	sstables: consumer: Extract primitive_consumer This change extracts the parser for primitive types out of continuous_data_consumer so that it can be used stand-alone or embedded in other parsers.	2020-06-16 16:14:30 +02:00
Tomasz Grabiec	d5bf540079	sstables: Abstract the clustering index cursor behavior In preparation for supporting more than one algorithm for lookups in the promoted index, extract relevant logic out of the index_reader (which is a partition index cursor). The clustered index cursor implementation is now hidden behind abstract interface called clustered_index_cursor. The current implementation is put into the scanning_clustered_index_cursor. It's mostly code movement with minor adjustments. In order to encapsulate iteration over promoted index entries, clustered_index_cursor::next_entry() was introduced. No change in behavior intended in this patch.	2020-06-16 16:14:17 +02:00
Tomasz Grabiec	a858f87b11	sstables: index_reader: Rearrange to reduce branching and optionals No change in logic. Will make it easier to make further refactoring.	2020-06-16 16:13:39 +02:00
Calle Wilund	5105e9f5e1	cdc::log: Missing "preimage" check in row deletion pre-image Fixes #6561 Pre-image generation in row deletion case only checked if we had a pre-image result set row. But that can be from post-image. Also check actual existence of the pre-image CK. Message-Id: <20200608132804.23541-1-calle@scylladb.com>	2020-06-09 10:56:41 +03:00
Piotr Sarna	2746a3597f	Update seastar submodule * seastar 42e77050...81242ccc (7): > demos: coroutine_demo: fix for SEASTAR_API_LEVEL >= 3 > core: Avoid warning on disable_backtrace_temporarily::_old being unused > future: Add a couple of friend declarations > Merge "net: make socket stack nothrow move constructible" from Benny > reactor: Avoid declaring _Unwind_RaiseException > future-util: Delete SEASTAR__WAIT_ALL__AVOID_ALLOCATION_WHEN_ALL_READY > file: io_priority_class: specify constructor as noexcept	2020-06-08 19:38:28 +02:00
Takuya ASADA	1e2509ffec	dist/offline_installer/debian: fix umask error As on redhat, the makeself script changes the current umask, so scylla_setup fails with the "scylla does not work with current umask setting (0077)" error. To fix that we need to use the latest version of makeself and specify the --keep-umask option. See #6243	2020-06-08 20:06:21 +03:00
Takuya ASADA	4eae7f66eb	dist/offline_installer/debian: support cross build Unlike the redhat version, the debian version already supported cross build since it uses debootstrap, but the shell script refused to continue the build on non-debian distributions, so drop these lines to allow building on Fedora. [avi: regenerate toolchain]	2020-06-08 19:54:09 +03:00
Takuya ASADA	058da69a3b	dist/debian/python3: cleanup build/debian, rename build directory This is the scylla-python3 version of #6611, but we also need to rename the .deb build directory for scylla-python3: because the two currently share a build directory, we may lose a .deb when building both the scylla and scylla-python3 .deb packages. So rename it to build/python3/debian.	2020-06-08 15:49:22 +03:00
Takuya ASADA	260d264d3c	dist/debian: cleanup build/debian before building .deb In `287d6e5`, we stopped running rm -rf debian/ in build_deb.sh, since we now have a prebuilt debian/ directory. However, this can cause a .deb build error when the debian package source is modified, since the directory is never cleaned up. To prevent the build error, we need to clean up build/debian in reloc/build_deb.sh before extracting contents from the relocatable package.	2020-06-08 15:18:42 +03:00
Kamil Braun	013330199d	cdc/storage_proxy: keep cdc_service alive in storage_proxy operations storage_proxy is never deinitialized, so it may have still used cdc_service after its destructor was called. This fixes the problem by cdc_service inheriting from async_sharded_service and storage_proxy calling shared_from_this on the service whenever it uses it. cdc_service inherits from async_sharded_service and not simply from enable_shared_from_this, because there might be other services that cdc_service depends on. Assuming that these services are deinitialized after cdc_service (as they should), i.e. after stop() is called on cdc_service, making cdc_service async_sharded_service will keep their deinitialization code from being called until all references to cdc_service disappear (async_sharded_service keeps stop() from returning until this happens). Some more improvements should be possible through some refactoring: 1. Make augment_mutation_call a free function, not a member of cdc_service: it doesn't need any state that cdc_service has. db_context can be passed down from storage_proxy when it calls the function. 2. Remove the storage_proxy -> cdc_service reference. storage_proxy only needs augment_mutation_call, which would not be a part of the service. This would also get rid of the proxy -> cdc -> proxy reference cycle that we have now, and would allow storage_proxy to be safely deinitialized after cdc_service. 3. Maybe we could even remove the cdc_service -> storage_proxy reference. Is it really needed?	2020-06-08 13:25:51 +03:00
Takuya ASADA	969c4258cf	aws: update enhanced networking supported instance list Sync enhanced networking supported instance list to latest one. Reference: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html Fixes #6540	2020-06-08 12:48:36 +03:00
Takuya ASADA	bebaaa038f	dist/debian: fix node-exporter.service file name Since `287d6e5`, we have mistakenly been packaging node-exporter.service under the wrong name in the .deb; rename it to the correct name. Fixes #6604	2020-06-08 12:39:18 +03:00
Asias He	dddde33512	gossip: Do not send shutdown message when a node is in unknown status When a replacing node is in early boot up and is not in HIBERNATE state yet, if the node is killed by a user, the node will wrongly send a shutdown message to other nodes. This is because UNKNOWN is not in SILENT_SHUTDOWN_STATES, so in gossiper::do_stop_gossiping, the node will send shutdown message. Other nodes in the cluster will call storage_service::handle_state_normal for this node, since NORMAL and SHUTDOWN status share the same status handler. As a result, other nodes will incorrectly think the node is part of the cluster and the replace operation is finished. Such problem was seen in replace_node_no_hibernate_state_test dtest: n1, n2 are in the cluster; n2 is dead; n3 is started to replace n2, but n3 is killed in the middle; n3 announces SHUTDOWN status wrongly; n1 runs storage_service::handle_state_normal for n3; n1 gets tokens for n3, which are empty because n3 hasn't gossiped tokens yet; n1 skips updating normal tokens for n3, but thinks n3 has replaced n2; n4 starts to replace n2; n4 checks the tokens for n2 in storage_service::join_token_ring (Cannot replace token {} which does not exist!) or storage_service::prepare_replacement_info (Cannot replace_address {} because it doesn't exist in gossip). To fix, we add UNKNOWN into SILENT_SHUTDOWN_STATES and avoid sending shutdown message. Tests: replace_address_test.py:TestReplaceAddress.replace_node_no_hibernate_state_test Fixes: #6436	2020-06-08 11:32:23 +02:00
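The decision described in the gossip commit reduces to a membership test on the node's own status. A minimal sketch, assuming a simplified string status and an illustrative two-element set (the real SILENT_SHUTDOWN_STATES contains other states not listed in this log, and `should_announce_shutdown` is a hypothetical name):

```python
# Illustrative subset only; the actual set is defined in the gossiper.
SILENT_SHUTDOWN_STATES = {"HIBERNATE", "UNKNOWN"}


def should_announce_shutdown(status):
    """Return True if a stopping node in this status should gossip SHUTDOWN."""
    return status not in SILENT_SHUTDOWN_STATES
```

With UNKNOWN in the set, a replacing node killed during early boot stays silent, so other nodes never run the NORMAL/SHUTDOWN handler for it.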
Pavel Solodovnikov	6f6e6762ba	cql: remove unused functions It seems that the following functions are never used, delete them: * `function::has_reference_to` * `functions::get_overload_count` * `to_identifiers` in column_identifier.hh * `single_column_relation::get_map_key` Tests: unit(dev, debug) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200606115149.1770453-1-pa.solodovnikov@scylladb.com>	2020-06-08 11:28:57 +03:00
Piotr Sarna	3458bd2e32	db,view: fix outdated comments Some comments still referred to variable names which are no longer up-to-date. Follow-up for #6560. Message-Id: <2b857ccc900dd64f0d9379f5d6c87fd3aaa5d902.1591594042.git.sarna@scylladb.com>	2020-06-08 09:02:10 +03:00
Nadav Har'El	d6626c217a	merge: add error injection to mv Merged pull request https://github.com/scylladb/scylla/pull/6516 from Piotr Sarna: This series adds error injection points to materialized view paths: view update generation from staging sstables; view building; generating view updates from user writes. This series comes with a corresponding dtest pull request which adds some test cases based on error injection. Fixes #6488	2020-06-07 19:23:23 +03:00
Avi Kivity	53a19fc1f2	Merge 'Debian version number fix' from Takuya " Now that we generate dist/changelog at relocatable package generation time, we cannot run the '.rc' fixup at .deb package build time; we need to do it in debian_files_gen.py. Also, we use '_' in the version number for some test version packages, which is not supported by the .deb packaging system and needs to be replaced with '-'. " * syuu1228-debian_version_number_fix: dist/debian: support version number containing '_' dist/debian: move version number fixup to debian_files_gen.py	2020-06-07 19:14:24 +03:00
Piotr Sarna	b3a6a33487	db,view: ensure that local updates are applied locally In current mutate_MV() code it's possible for a local endpoint to become a target for a network operation. That's the source of occasional `broken promise` benign error messages appearing, since the mutation is actually applied locally, so there's no point in creating a write response handler - the node will not send a response to itself via network. While at it, the code is deduplicated a little bit - with the paths simplified, it's easier to ensure that a local endpoint is never listed as a target for remote network operations. Fixes #5459 Tests: unit(dev), dtest(materialized_views_test.TestMaterializedViews.add_dc_during_mv_insert_test)	2020-06-07 19:10:03 +03:00
Kamil Braun	a1e235b1a4	CDC: Don't split collection tombstone away from base update Overwriting a collection cell using timestamp T is a process with following steps: 1. inserting a row marker (if applicable) with timestamp T; 2. writing a collection tombstone with timestamp T-1; 3. writing the new collection value with timestamp T. Since CDC does clustering of the operations by timestamp, this would result in 3 separate calls to `transform` (in case of INSERT, or 2 - in the case of UPDATE), which seems excessive, especially when pre-/postimage is enabled. This patch makes collection tombstones being treated as if they had the same TS as the base write and thus they are processed in one call to `transform` (as long as TTLs are not used). Also, `cdc_test` had to be updated in places that relied on former splitting strategy. Fixes #6084	2020-06-07 17:09:05 +03:00
Tomasz Grabiec	c1df00859e	sstables: Make deletion_time printable Message-Id: <1591387901-7974-12-git-send-email-tgrabiec@scylladb.com>	2020-06-07 13:55:34 +03:00
Raphael S. Carvalho	8e47f61df7	compaction: Enable tombstone expiration based on the presence of the sstable set For tombstone expiration to proceed correctly without the risk of resurrecting data, the sstable set must be present. Regular compaction and derivatives provide the sstable set, so they're able to expire tombstones with no resurrection risk. Resharding, on the other hand, can run on any shard, not necessarily on the same shard that one of the input sstables belongs to, so it currently cannot provide a sstable set for tombstone expiration to proceed safely. That being said, let's only do expiration based on the presence of the set. This makes room for the sstable set to be fed to compaction via descriptor, allowing even resharding to do expiration. Currently, compaction thinks that sstable set can only come from the table, and that also needs to be changed for further flexibility. It's theoretically possible that a given resharding job will resurrect data if a fully expired SSTable is resharded at a shard which it doesn't belong to. Resharding will have no way to tell that expiring all that data will lead to resurrection because the relevant SSTables are at different shards. This is fixed by checking for fully expired sstables only on presence of the sstable set. Fixes #6600. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200605200954.24696-1-raphaelsc@scylladb.com>	2020-06-07 11:46:48 +03:00
Pavel Solodovnikov	5b1b6b1395	cql: pass `cql3::operation::raw_deletion` by unique_ptr Another small step towards shared_ptr usage reduction in cql3 code. Also make `raw_deletion` dtor virtual to make address sanitizer happy in debug builds. Tests: unit(dev, debug) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200606104528.1732241-1-pa.solodovnikov@scylladb.com>	2020-06-06 21:04:06 +03:00
Takuya ASADA	9de65f26de	dist/debian: support version number containing '_' The .deb packaging system does not support version numbers containing '_'; it should be replaced with '-'.	2020-06-05 21:35:02 +09:00
Takuya ASADA	509ad875aa	dist/debian: move version number fixup to debian_files_gen.py Now that we generate dist/changelog at relocatable package generation time, we cannot run the '.rc' fixup at .deb package build time; we need to do it in debian_files_gen.py.	2020-06-05 21:34:55 +09:00
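The underscore replacement can be illustrated in a couple of lines. The function name `debian_version` and the sample version string are hypothetical; the actual code in debian_files_gen.py is not shown in this log and may differ.

```python
def debian_version(version):
    """Map a version string to a .deb-compatible one by replacing '_' with '-'."""
    return version.replace("_", "-")
```

For example, a test build versioned `4.2.dev_test1` would be packaged as `4.2.dev-test1`, while versions without underscores pass through unchanged.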
Kamil Braun	1b7f1806ac	test: improve comments on test_schema_digest_does_not_change This test tends to cause a lot of discussion resulting from not understanding what is actually being tested. Closes https://github.com/scylladb/scylla/issues/6582.	2020-06-05 14:30:02 +02:00
Kamil Braun	d89b7a0548	cdc: rename CDC description tables Commit `968177da04` has changed the schema of cdc_topology_description and cdc_description tables in the system_distributed keyspace. Unfortunately this was a backwards-incompatible change: these tables would always be created, irrespective of whether or not "experimental" was enabled. They just wouldn't be populated with experimental=off. If the user now tries to upgrade Scylla from a version before this change to a version after this change, it will work as long as CDC is protected by the experimental flag and the flag is off. However, if we drop the flag, or if the user turns experimental on, weird things will happen, such as nodes refusing to start because they try to populate cdc_topology_description while assuming a different schema for this table. The simplest fix for this problem is to rename the tables. This fix must get merged in before CDC goes out of experimental. If the user upgrades his cluster from a pre-rename version, he will simply have two garbage tables that he is free to delete after upgrading. sstables and digests need to be regenerated for schema_digest_test since this commit effectively adds new tables to the system_distributed keyspace. This doesn't result in schema disagreement because the table is announced to all nodes through the migration manager.	2020-06-05 09:59:16 +02:00
Piotr Sarna	64b8b77ac2	table: add error injection points to the materialized view path ... in order to be able to test scenarios with failures.	2020-06-05 09:39:58 +02:00
Piotr Sarna	76e89efc1a	db,view: add error injection points to view building ... in order to be able to test scenarios with failures.	2020-06-05 09:39:58 +02:00
Piotr Sarna	9d524a7a7e	db,view: add error injection points to view update generator ... in order to be able to test scenarios with failures.	2020-06-05 09:39:58 +02:00
Piotr Sarna	9a4394327a	Merge 'CDC: Disallowed CDC for tables with counter column(s)' from Juliusz. CDC for counters is unimplemented as of now, therefore any attempt to enable CDC log on counter table needs to be clearly disallowed. This patch does exactly this. The check whether schema has counter columns is performed in `cdc_service::impl` in: - `on_before_create_column_family`, - `on_before_update_column_family` and, if so, results in `invalid_request_exception` thrown. Fixes #6553 * jul-stas-6553-disallow-cdc-for-counters: test/cql: Check that CDC for counters is disallowed CDC: Disallowed CDC for tables with counter column(s)	2020-06-05 07:46:53 +02:00
Nadav Har'El	ace1697aa9	alternator test: reproducer for unjustly refused condition expression This patch adds a test reproducing issue #6572, where the perfectly good condition expression: #name1 = :val1 OR #name2 = :val2 Gets refused because of the following combination in our implementation: 1. Short-circuit evaluation, i.e., after we discover #name1 = :val1 we don't evaluate the second half of the expression. 2. The list of "used" references is collected at evaluation time, instead of at parsing time. Because evaluation never reaches #name2 (or :val2) our implementation complains that they are not used, and refuses the request - which should have been allowed. This test xfails on Alternator. It passes on DynamoDB. Refs #6572 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200604171954.444291-1-nyh@scylladb.com>	2020-06-05 07:43:50 +02:00
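The interaction between short-circuit evaluation and evaluation-time reference tracking can be reproduced with a small model. This is a sketch under simplified assumptions, not Alternator's code: expressions are nested tuples, and `evaluate` and `references` are hypothetical helpers.

```python
def evaluate(expr, item, names, values, used):
    """Evaluate ("eq", name_ref, value_ref) or ("or", left, right).

    References are recorded in `used` only when a comparison is actually
    evaluated, which mirrors the behaviour described in the commit.
    """
    if expr[0] == "or":
        # Python's `or` short-circuits: the right side is skipped if left is true.
        return (evaluate(expr[1], item, names, values, used)
                or evaluate(expr[2], item, names, values, used))
    _, name_ref, value_ref = expr
    used.update((name_ref, value_ref))
    return item.get(names[name_ref]) == values[value_ref]


def references(expr):
    """Collect every reference in the expression without evaluating it."""
    if expr[0] == "or":
        return references(expr[1]) | references(expr[2])
    return {expr[1], expr[2]}
```

When the left operand of the OR is true, `used` ends up holding only `#name1` and `:val1`, so a check for unused references based on it would wrongly flag `#name2` and `:val2`. Walking the parsed expression with `references` finds all four regardless of the item.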
Piotr Sarna	0ba23d2b40	test: add manual test for tagging return value While not very interesting by itself, the test case shows that in case of TagResource and UntagResource it's actually correct to return empty HTTP body instead of an empty JSON object, which was the case for PutItem. Message-Id: <6331963179c5174a695f0e9eeed17de6c9f9a3be.1591269516.git.sarna@scylladb.com>	2020-06-04 16:17:24 +03:00
Nadav Har'El	db45ff2733	alternator: clean up usage of describe_item() The DynamoDB GetItem request returns the requested item in a specific way, wrapped in a map with a "Item" member. For historic reasons, we used the same function that returns this (describe_item()) also in other code which reads items - e.g. for checking conditional operations. The result is wasteful - after adding this "Item" member we had other code to extract it, all for no good reason. It is also ugly and confusing. Importantly, this situation also makes it harder for me to add support for FilterExpression. The issue is that the expression evaluator got the item with the wrapper (from the existing ConditionExpression code) but the filtering code had it without this wrapper, as it didn't use describe_item(). So this patch uses describe_single_item(), which doesn't add the wrapper map, instead of describe_item(). The latter function is used just once - to implement GetItem. The unnecessary code to unwrap the item in multiple places was then dropped. All the tests still pass. I also tested test_expected.py in unsafe_rmw write isolation mode, because code only for this mode had to be modified as well. Refs #5038. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200604092050.422092-1-nyh@scylladb.com>	2020-06-04 12:33:48 +02:00
Nadav Har'El	3d26bde4c1	alternator doc: correct state of filtering support Correct the compatibility section in docs/alternator/alternator.md: Filtering of Scan/Query results using the older syntax (ScanFilter, QueryFilter) is, after commit `bea9629031`, now fully supported. The newer syntax (FilterExpression) is not yet. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200604073207.416860-1-nyh@scylladb.com>	2020-06-04 12:33:10 +02:00
Avi Kivity	5b92a6d9e4	build: drop __pycache__ directories from python3 relocatable package Recently ./reloc/build_deb.sh started failing with dpkg-source: info: using source format '1.0' dpkg-source: info: building scylla-python3 using existing scylla-python3_3.8.3-0.20200604.77dfa4f15.orig.tar.gz dpkg-source: info: building scylla-python3 in scylla-python3_3.8.3-0.20200604.77dfa4f15-1.diff.gz dpkg-source: error: cannot represent change to scylla-python3/lib64/python3.8/site-packages/urllib3/packages/backports/__pycache__/__init__.cpython-38.pyc: dpkg-source: error: new version is plain file dpkg-source: error: old version is symlink to /usr/lib/python3.8/site-packages/__pycache__/six.cpython-38.pyc dpkg-source: error: unrepresentable changes to source dpkg-buildpackage: error: dpkg-source -b . subprocess returned exit status 1 debuild: fatal error at line 1182: Those files are not in fact symlinks, so it's clear that dpkg is confused about something. Rather than debug dpkg, however, it's easier to just drop __pycache__ directories. These hold the result of bytecode compilation and are therefore optional, as Python will compile the sources if the cache is not populated. Fixes #6584.	2020-06-04 13:04:34 +03:00
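Dropping the bytecode caches from a directory tree can be sketched as below. The helper name `drop_pycache` is hypothetical and this is not the change made to the build scripts, only an illustration of the idea using the standard library.

```python
import os
import shutil


def drop_pycache(root):
    """Remove every __pycache__ directory under root; return the removed paths."""
    removed = []
    for dirpath, dirnames, _ in os.walk(root):
        if "__pycache__" in dirnames:
            target = os.path.join(dirpath, "__pycache__")
            shutil.rmtree(target)
            # Prune the entry so os.walk does not try to descend into it.
            dirnames.remove("__pycache__")
            removed.append(target)
    return removed
```

Because Python recompiles sources when the cache is missing, removing these directories affects only start-up time of the first import, not correctness.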
Israel Fruchter	a2bb48f44b	fix "scylla_coredump_setup: Remove the coredump create by the check" In 28c3d4 `out()` was used without `shell=True`, so the splitting of arguments failed because of the complex commands in the cmd (pipes and such) Fixes #6159	2020-06-04 12:55:10 +03:00
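Why a piped command needs `shell=True` can be seen with a small experiment. The helper `capture` below is a hypothetical stand-in, not the `out()` function from the Scylla scripts, and the example assumes a POSIX system with `echo` and `tr` available.

```python
import shlex
import subprocess


def capture(cmd, shell=False):
    """Hypothetical helper: run cmd and return its stdout, stripped."""
    # Without a shell the command string is split into an argument vector,
    # so shell syntax such as '|' becomes a literal argument.
    args = cmd if shell else shlex.split(cmd)
    return subprocess.check_output(args, shell=shell).decode().strip()
```

Without a shell, `"echo a | tr a b"` is split into `['echo', 'a', '|', 'tr', 'a', 'b']` and `echo` simply prints the pipe and the rest as text. With `shell=True` the string is handed to the shell, which sets up the pipeline.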
Raphael S. Carvalho	77dfa4f151	sstables: kill unused resharding code output_sstables is no longer needed after we made resharding use a special interposer. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200603165324.176665-1-raphaelsc@scylladb.com>	2020-06-03 23:20:15 +03:00
Avi Kivity	0c34e114e2	Merge "Upgrade to seastar api version 3" (make_file_output_stream returns future) from Rafael " The new seastar api changes make_file_output_stream and make_file_data_sink to return futures. This series includes a few refactoring patches and the actual transition. " * 'espindola/api-v3-v3' of https://github.com/espindola/scylla: table: Fix indentation everywhere: Move to seastar api level 3 sstables: Pass an output_stream to make_compressed_file_.*_format_output_stream sstables: Pass a data_sink to checksummed_file_writer's constructor sstables: Convert a file_writer constructor to a static make sstables: Move file_writer constructor out of line	2020-06-03 23:09:49 +03:00
Rafael Ávila de Espíndola	686f9220c1	table: Fix indentation It was broken by the previous commit. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-06-03 10:32:46 -07:00
Rafael Ávila de Espíndola	e5876f6696	everywhere: Move to seastar api level 3 Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-06-03 10:32:46 -07:00
Rafael Ávila de Espíndola	13282b3d4c	sstables: Pass an output_stream to make_compressed_file_.*_format_output_stream This is a bit simpler as we don't have to pass in the options and moves the calls to make_file_output_stream to places where we can handle futures. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-06-03 10:32:46 -07:00
Rafael Ávila de Espíndola	f6ec7364a7	sstables: Pass a data_sink to checksummed_file_writer's constructor checksummed_file_writer cannot be moved, so we can't have a checksummed_file_writer::make that returns a future. So instead we pass in a data_sink and let the callers call make_file_data_sink. This is in preparation for make_file_data_sink returning a future in the seastar api v3. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-06-03 10:32:46 -07:00
Rafael Ávila de Espíndola	c1f37db72b	sstables: Convert a file_writer constructor to a static make For now it always returns a ready future. This is in preparation for using seastar v3 api where make_file_output_stream returns a future. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-06-03 10:32:45 -07:00

22336 Commits