scylladb

Author	SHA1	Message	Date
Kefu Chai	5db315930e	sstables: fix a typo in comment: s/Mimicks/Mimics/ this typo was identified by the codespell workflow Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18781	2024-05-21 12:14:10 +03:00
Raphael S. Carvalho	715ae689c0	Implement fast streaming for intra-node migration With intra-node migration, all the movement is local, so we can make streaming faster by just cloning the sstable set of leaving replica and loading it into the pending one. This cloning is underlying storage specific, but s3 doesn't support snapshot() yet (the sstables::storage procedure which clone is built upon). It's only supported by file system, with help of hard links. A new generation is picked for new cloned sstable, and it will live in the same directory as the original. A challenge I bumped into was to understand why table refused to load the sstable at pending replica, as it considered them foreign. Later I realized that sharder (for reads) at this stage of migration will point only to leaving replica. It didn't fail with mutation based streaming, because the sstable writer considers the shard -- that the sstable was written into -- as its owner, regardless of what sharder says. That was fixed by mimicking this behavior during loading at pending. test: ./test.py --mode=dev intranode --repeat=100 passes. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-05-16 00:28:47 +02:00
Pavel Emelyanov	96651e0ddb	sstables: Do not keep directory, keyspace and table names on descriptor Now no code uses those strings. Even worse -- there are some places that need to provide some strings but don't have real values at hand, so just hard-code the empty strings there (because they are really not used). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-10-05 12:21:01 +03:00
Pavel Emelyanov	62d71d398f	sstables: Return tuple from parse_path() without ks.cf hints There are two path parsers. One of them accepts keyspace and table names and the other one doesn't. The latter is then supposed to parse the ks.cf pair from path and put it on the descriptor. This patch makes this method return ks.cf so that later it will be possible to remove these strings from the descriptor itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-10-05 12:21:00 +03:00
Pavel Emelyanov	d56f9db121	sstables: Rename make_descriptor() to parse_path() The method really parses provided path, so the existing name is pretty confusing. It's extra confusing in the table::get_snapshot_details() where it's just called and the return value is simply ignored. Named "parse_..." makes it clear what the method is for. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-10-05 11:04:07 +03:00
Kefu Chai	a29838f9e1	sstables: change make_descriptor() to accept fs::path change another overload of `make_descriptor()` to accept `fs::path`, in the same spirit of a previous change in this area. so we have a more consistent API for creating sstable descriptor. and this new API is simpler to use. Refs #15187 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-01 07:44:06 +08:00
Kefu Chai	50c1d9aee7	sstables: switch entry_descriptor(sstring..) to std::string_view so its callers don't need to construct a temporary `sstring` if the parameter's type is not `sstring`. for instance, before this change, `entry_descriptor::make_descriptor(const std::filesystem::path...)` would have to construct two temporary instances of `sstring` for calling this function. after this change, it does not have to do so. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-01 07:44:06 +08:00
Kefu Chai	6656707164	sstables: change make_descriptor() to accept fs::path to lower the programmer's cognitive load. as programmer might want to pass the full path as the `fname` when calling `make_descriptor(sstring sstdir, sstring fname)`, but this overload only accepts the filename component as its second parameter. a single `path` parameter would be easier to work with. Refs #15187 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-01 07:44:06 +08:00
Raphael S. Carvalho	2dbae856f8	sstable: Piggyback on sstable parser and writer to provide bytes_on_disk bytes_on_disk is the sum of all sstable components. As read_simple() fetches the file size before parsing the component, bytes_on_disk can be added incrementally rather than an additional step after all components were already parsed. Likewise, write_simple() tracks the offset for each new component, and therefore bytes_on_disk can also be added incrementally. This simplifies s3 life as it no longer has to care about feeding a bytes_on_disk, which is currently limited to data and index sizes only. Refs #13649. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:06:48 -03:00
Raphael S. Carvalho	17261369ea	sstables: Allow SSTable loading to discard bloom filter If bloom filter is not loaded, it means that an always-present filter is used, which translates into the SSTable being opened on every single read. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:34:22 -03:00
Raphael S. Carvalho	86516f4cef	sstables: Move sstable_open_info into open_info.hh So sstable_directory can access its definition without having to include sstables.hh. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:31:14 -03:00
Piotr Sarna	209c2f5d99	sstables: define generation_type for sstables No functional changes intended - this series is quite verbose, but after it's in, it should be considerably easier to change the type of SSTable generations to something else - e.g. a string or timeUUID. Closes #10533	2022-05-11 14:46:30 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Botond Dénes	1b7b3a81e6	sstables: entry_descriptor::make_descriptor(): add overload with provided ks/cf Not necessitating these to be extracted from the sstable dir path. This practically allows for la/mx sstables at non-standard paths to be opened. This will be used by the `scylla-sstable` tool which wants to be flexible about where the sstables it opens are located.	2021-10-12 11:43:23 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Raphael S. Carvalho	593c1e00c8	sstables:: kill unused sstables::sstable_open_info Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-06-29 14:23:48 -03:00
Glauber Costa	fd89e9f740	sstables: move open-related structures to their own file. sstables/sstables.hh is one of our heaviest headers and it's better that we don't include it if possible. For some users, like distributed_loader, we are mostly interested in knowing the shape of structures used to open an SSTable. They are: - the entry_descriptor, representing an SSTable that we are scanning on-disk - the sstable_open_info, representing information about a local, opened SSTable - the foreign_sstable_open_info, representing information about an opened SSTable that can cross shard boundaries. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-08 16:06:00 -04:00

17 Commits