scylladb

Author	SHA1	Message	Date
Gleb Natapov	ad3cf2c174	utils: fix get_random_time_UUID_from_micros to generate correct time uuid According to the IETF spec uuid variant bits should be set to '10'. All others are either invalid or reserved. The patch changes the code to follow the spec. Closes scylladb/scylladb#27073	2025-11-20 10:27:29 +02:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Nadav Har'El	e639434a89	change remaining sstring_view to std::string_view Our "sstring_view" is an historic alias for the standard std::string_view. The patch changes the last remaining random uses of this old alias across our source directory to the standard type name. After this patch, there are no more uses of the "sstring_view" alias. It will be removed in the following patch. Refs #4062. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-11-18 16:48:57 +02:00
Tomasz Grabiec	1d0c6aa26f	utils: UUID: Make get_time_UUID() respect the clock offset schema_change_test currently fails due to failure to start a cql test env in unit tests after the point where this is called (in one of the test cases): forward_jump_clocks(std::chrono::seconds(60*60*24*31)); The problem manifests with a failure to join the cluster due to missing_column exception ("missing_column: done") being thrown from system_keyspace::get_topology_request_state(). It's a symptom of join request being missing in system.topology_requests. It's missing because the row is expired. When request is created, we insert the mutations with intended TTL of 1 month. The actual TTL value is computed like this: ttl_opt topology_request_tracking_mutation_builder::ttl() const { return std::chrono::duration_cast<std::chrono::seconds>(std::chrono::microseconds(_ts)) + std::chrono::months(1) - std::chrono::duration_cast<std::chrono::seconds>(gc_clock::now().time_since_epoch()); } _ts comes from the request_id, which is supposed to be a timeuuid set from current time when request starts. It's set using utils::UUID_gen::get_time_UUID(). It reads the system clock without adding the clock offset, so after forward_jump_clocks(), _ts and gc_clock::now() may be far off. In some cases the accumulated offset is larger than 1 month and the ttl becomes negative, causing the request row to expire immediately and failing the boot sequence. The fix is to use db_clock, which respects offsets and is consistent with gc_clock. The test doesn't fail in CI because there each test case runs in a separate process, so there is no bootstrap attempt (by new cql test env) after forward_jump_clocks(). Closes scylladb/scylladb#21558	2024-11-14 10:32:07 +02:00
Avi Kivity	aa1270a00c	treewide: change assert() to SCYLLA_ASSERT() assert() is traditionally disabled in release builds, but not in scylladb. This hasn't caused problems so far, but the latest abseil release includes a commit [1] that causes a 1000 insn/op regression when NDEBUG is not defined. Clearly, we must move towards a build system where NDEBUG is defined in release builds. But we can't just define it blindly without vetting all the assert() calls, as some were written with the expectation that they are enabled in release mode. To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT() macro in utils/assert.hh. This macro is always defined and is not conditional on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release mode. [1] `66ef711d68` Closes scylladb/scylladb#20006	2024-08-05 08:23:35 +03:00
Pavel Emelyanov	88a40b0dfa	uuid: UUID_gen::get_UUID src argument is const pointer Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#17762	2024-03-13 10:21:25 +02:00
Nadav Har'El	458fd0c2f7	utils: replace assert() by on_internal_error() In issue #17035 we had a situation where a certain input timestamp could result in the create_time() utility function getting called on a timestamp that cannot be represented as timeuuid, and this resulted in an assertion failure, and a crash. I guess we used an assertion because we believed that callers try to avoid calling this function on excessively large timestamps, but evidently, they didn't try hard enough and we got a crash. The code in UUID_gen.hh changed a lot over the years and has become very convoluted and it is almost impossible to understand all the code paths that could lead to this assertion failure. So it's better to replace this assertion by an on_internal_error, which by default is just an exception - and also logs the backtrace of the failure. Issue #17035 would have been much less serious if we had an exception instead of an assert. Refs #17035 Refs #7871, Refs #13970 (removes an assert) Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-01-31 16:45:28 +02:00
Nadav Har'El	827c20467c	utils: add a timeuuid minimum, like we had maximum Our time-handling code in UUID_gen.hh is very fragile for very large timestamps, because the different types - such as Cassandra "timestamp" and Timeuuid - use very different resolutions and ranges. In issue #17035 we discovered a situation where a certain CQL "timestamp"-type value could cause an assertion failure and a crash in the create_time() function that creates a timeuuid - because that timestamp didn't fit the place we have in timeuuid. We already added in the past a limit, UUID_UNIXTIME_MAX, beyond which we refuse timestamps, to avoid these assertion failures. However, we missed the possibility of negative timestamps (which are allowed in CQL), and indeed a negative timestamp (or a timestamp which was "wrapped" to a negative value) is what caused issue #17035. So this patch adds a second limit, UUID_UNIXTIME_MIN - limiting the most negative timestamp that we support to well below the area which causes problems, and adds tests that reproduce #17035 and that we didn't break anything else (e.g., negative timestamps are still allowed - just not extremely negative timestamps). Fixes #17035. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-01-31 11:32:26 +02:00
Kefu Chai	a1dcddd300	utils: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16833	2024-01-18 12:50:06 +02:00
Kamil Braun	9d4b3c6036	test: use correct timestamp resolution in `test_group0_history_clearing_old_entries` In `10c1f1dc80` I fixed `make_group0_history_state_id_mutation` to use correct timestamp resolution (microseconds instead of milliseconds) which was supposed to fix the flakiness of `test_group0_history_clearing_old_entries`. Unfortunately, the test is still flaky, although now it's failing at a later step -- this is because I was sloppy and I didn't adjust this second part of the test to also use microsecond resolution. The test is counting the number of entries in the `system.group0_history` table that are older than a certain timestamp, but it's doing the counting using millisecond resolution, causing it to give results that are off by one sometimes. Fix it by using microseconds everywhere. Fixes #14653 Closes #14670	2023-07-13 10:33:52 +03:00
Kamil Braun	218a056825	utils: UUID_gen: accept decimicroseconds in min_time_UUID The function now accepts higher-resolution duration types, such as microsecond resolution timestamps. Will be used by the next commit.	2023-04-21 10:33:02 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Botond Dénes	65913f4cfa	utils: UUID_gen: introduce negate()	2021-09-09 11:49:05 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Konstantin Osipov	c83cf1f965	uuid: switch the API to use std::chrono A follow up for the patch for #7611. This change was requested during review and moved out of #7611 to reduce its scope. The patch switches UUID_gen API from using plain integers to hold time units to units from std::chrono. For one, we plan to switch the entire code base to std::chrono units, to ensure type safety. Secondly, using std::chrono units allows to increase code reuse with template metaprogramming and remove a few of UUID_gen functions that became redundant as a result. * switch get_time_UUID(), unix_timestamp(), get_time_UUID_raw(), switch min_time_UUID(), max_time_UUID(), create_time_safe() to std::chrono * remove unused variant of from_unix_timestamp() * remove unused get_time_UUID_bytes(), create_time_unsafe(), redundant get_adjusted_timestamp() * inline get_raw_UUID_bytes() * collapse two similar implementations of get_time_UUID() * switch internal constants to std::chrono * remove unnecessary unique_ptr from UUID_gen::_instance Message-Id: <20210406130152.3237914-2-kostja@scylladb.com>	2021-04-06 17:12:54 +03:00
Konstantin Osipov	56d8d166cb	test: add tests for legacy uuid compare & msb monotonicity	2021-01-21 13:03:59 +03:00
Konstantin Osipov	2b8ce83eea	lists: use query timestamp for list cell values during append Scylla list cells are represented internally as a map of timeuuid => value. To append a new value to a list the coordinator generates a timeuuid reflecting the current time as key and adds a value to the map using this key. Before this patch, Scylla always generated a timeuuid for a new value, even if the query had a user supplied or LWT timestamp. This could break LWT linearizability. User supplied timestamps were ignored. This is reported as https://github.com/scylladb/scylla/issues/7611 A statement which appended multiple values to a list or a BATCH generated its own microsecond-resolution timeuuid for each value: BEGIN BATCH UPDATE ... SET a = a + [3] UPDATE ... SET a = a + [4] APPLY BATCH UPDATE ... SET a = a + [3, 4] To fix the bug, it's necessary to preserve monotonicity of timeuuids within a batch or multi-value append, but make sure they all use the microsecond time, as is set by LWT or user. To explain the fix, it's first necessary to recall the structure of time-based UUIDs: 60 bits: time since start of GMT epoch, year 1582, represented in 100-nanosecond units 4 bits: version 14 bits: clock sequence, a random number to avoid duplicates in case system clock is adjusted 2 bits: type 48 bits: MAC address (or other hardware address) The purpose of clockseq bits, as defined in https://tools.ietf.org/html/rfc4122#section-4.1.5, is to reduce the probability of UUID collision in case clock goes back in time or node id changes. The implementation should reset it whenever one of these events may occur. Since LWT microsecond time is guaranteed to be unique by Paxos, the RFC provisioning for clockseq and MAC slots becomes excessive. The fix thus changes timeuuid slot content in the following way: - time component now contains the same microsecond time for all values of a statement or a batch. The time is unique and monotonic in case of LWT. Otherwise it's almost always monotonic, but may not be unique if two timestamps are created on different coordinators. - clockseq component is used to store a sequence number which is unique and monotonic for all values within the statement/batch. - to protect against time back-adjustments and duplicates if time is auto-generated, MAC component contains a random (spoof) MAC address, re-created on each restart. The address is different at each shard. The change is made for all sources of time: user, generated, LWT. Conditioning the list key generation algorithm on the source of time would unnecessarily complicate the code while not increasing the quality (uniqueness) of created list keys. Since 14 bits of clockseq provide us with only 16383 distinct slots per statement or batch, 3 extra bits in nanosecond part of the time are used to extend the range to 131071 values per statement/batch. If the range is exceeded, an exception is produced. A twist on the use of clockseq to extend timeuuid uniqueness is that Scylla, like Cassandra, uses int8 compare to compare lower bits of timeuuid for ordering. The patch takes this into account and sign-complements the clockseq value to make it monotonic according to the legacy compare function. Fixes #7611 test: unit (dev)	2021-01-21 13:03:59 +03:00
Calle Wilund	83339f4bac	Alternator::streams: Make SequenceNumber monotonically growing Fixes #7424 AWS sdk (kinesis) assumes SequenceNumbers are monotonically growing bigints. Since we sort on and use timeuuids for these, a "raw" bit representation of them will _not_ fulfill the requirement. However, we can "unwrap" the timestamp of uuid msb and give the value as timestamp<<64|lsb, which will ensure sort order == bigint order.	2020-10-14 16:45:21 +03:00
Benny Halevy	72e2ea47c1	cql3: time_uuid_fcts: validate time UUID Throw an error in case we hit an invalid time UUID rather than hitting an assert. Ref #5552 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-01-27 11:09:01 +02:00
Benny Halevy	f8b079b599	utils: UUID: create_time assert nanos_since validity Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-01-27 11:09:01 +02:00
Benny Halevy	cd3460cc88	utils/UUID_gen: make_nanos_since Safely convert millis to "nanos_since" (number of 100 nanoseconds since START_EPOCH) while type casting to uint64_t to avoid possible int overflow. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-01-27 11:08:16 +02:00
Benny Halevy	22bac26023	utils: UUID: assert UUID.is_timestamp Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-01-26 18:54:36 +02:00
Gleb Natapov	f9209e27d4	lwt: Add missing functions to utils/UUID_gen.hh Some lwt related code is missing in our UUID implementation. Add it.	2019-09-26 11:44:00 +03:00
Vlad Zolotarov	f64f27beb9	utils: add get_time_UUID(system_clock::time_point) Creates a type 1 UUID (time-based UUID) with the given system_clock::time_point Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-07-19 18:21:58 +03:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Pekka Enberg	c003f89484	utils/UUID_gen: Add bytes_view variant of get_name_UUID() Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-07-16 14:53:30 +03:00
Tomasz Grabiec	957544f69b	utils: UUID_gen: Add support for name-based UUIDs (type 3)	2015-04-17 14:19:07 +02:00
Tomasz Grabiec	b79d2008c0	utils: UUID_gen: Fix comment about get_UUID() UUID can hold not only type 1 UUIDs, but any UUID.	2015-04-17 14:19:07 +02:00
Tomasz Grabiec	5300caadf6	utils: Fix UUID::get_time_UUID() creating conflicting UUIDs in SMP UUID_gen::create_time_safe() does not synchronize across cores. The comment says that it assumes it runs on a single core. This is no longer true, we can run urchin on many cores. This easily leads to UUID conflicts with more than one core. Fix by adding a per-core unique number to the node part of the UUID.	2015-04-15 20:33:47 +02:00
Tomasz Grabiec	2902395129	Relax includes	2015-03-30 09:01:59 +02:00
Avi Kivity	24506efc43	uuid: fix serialization of least significant bytes Shift amount was incorrect.	2015-03-23 22:42:34 +02:00
Avi Kivity	b5125cc03e	uuid: remove debug print	2015-03-11 14:42:42 +02:00
Avi Kivity	835c8b693c	uuid: fix uuidgen thread safety The instance must be thread local since it is mutable (last_nanos).	2015-03-11 14:42:42 +02:00
Avi Kivity	07947764b2	uuid: convert UUID_gen::get_UUID()	2015-01-11 15:46:03 +02:00
Nadav Har'El	31a982b41e	Convert time (version 1) UUID to C++ Convert Cassandra's UUIDGen class, which generates time-dependent UUID, and parts of the java.util.UUID which I thought we need, to C++. It is possible I missed some needed features of java.util.UUID that we'll need to add later. Also, part of the version-1 UUID is supposed to be node-unique (so that if two nodes happen to boot at the same time and get a UUID at exactly the same time, they still get different UUIDs). Cassandra uses for this a hash function of the IP address, we should use in the future the MAC address (from Seastar's network stack). But currently we just use 0. Left a FIXME to fix that. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com> [avi: add to ./configure.py]	2015-01-07 16:13:42 +02:00

36 Commits
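
The time-based UUID layout recalled in the 2b8ce83eea message (60-bit time, 4-bit version, 14-bit clock sequence, 2-bit variant, which that message calls "type", and a 48-bit node) and the '10' variant bits mentioned in ad3cf2c174 can be illustrated with a small sketch. It assumes the java.util.UUID convention of two 64-bit halves (msb and lsb). The helper names `pack_v1_msb`, `unpack_v1_ticks` and `pack_lsb` are hypothetical and are not the functions in utils/UUID_gen.hh.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only: hypothetical helpers, not ScyllaDB's UUID_gen API.

// Most significant half, in java.util.UUID order:
// time_low (32 bits) | time_mid (16 bits) | version (4 bits) | time_hi (12 bits).
// `ticks` is a 60-bit count of 100 ns units since the 1582 epoch.
constexpr std::uint64_t pack_v1_msb(std::uint64_t ticks) {
    const std::uint64_t time_low = ticks & 0xFFFFFFFFULL;
    const std::uint64_t time_mid = (ticks >> 32) & 0xFFFFULL;
    const std::uint64_t time_hi  = (ticks >> 48) & 0x0FFFULL;
    return (time_low << 32) | (time_mid << 16) | (0x1ULL << 12) | time_hi;
}

// Inverse of pack_v1_msb: recover the 60-bit tick count, dropping the version.
constexpr std::uint64_t unpack_v1_ticks(std::uint64_t msb) {
    const std::uint64_t time_low = msb >> 32;
    const std::uint64_t time_mid = (msb >> 16) & 0xFFFFULL;
    const std::uint64_t time_hi  = msb & 0x0FFFULL;
    return (time_hi << 48) | (time_mid << 32) | time_low;
}

// Least significant half:
// variant '10' (2 bits) | clock sequence (14 bits) | node (48 bits).
constexpr std::uint64_t pack_lsb(std::uint16_t clock_seq, std::uint64_t node) {
    return (0x2ULL << 62)
         | (static_cast<std::uint64_t>(clock_seq & 0x3FFF) << 48)
         | (node & 0xFFFFFFFFFFFFULL);
}

// Compile-time checks of the layout described above.
static_assert(((pack_v1_msb(0) >> 12) & 0xF) == 0x1, "version nibble is 1");
static_assert((pack_lsb(0x3FFF, 0xFFFFFFFFFFFFULL) >> 62) == 0x2, "variant bits are '10'");
```

Because the low 32 bits of the time sit in the high half of msb, comparing raw msb values does not order UUIDs by time; unpacking the tick count first, as in `unpack_v1_ticks`, does.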