This PR reverts the scylla-sstable schema loading improvements, as they fail in CI every other run. I am already working on fixes for these, but I am not sure I understand all the failures, so it is best to revert and re-post the series later.
Fixes: #13404
Fixes: #13410
Closes #13419
* github.com:scylladb/scylladb:
Revert "Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes"
Revert "tools/schema_loader: don't require results from optional schema tables"
Related: https://github.com/scylladb/scylla-enterprise/issues/2770
This commit adds the upgrade guide from ScyllaDB Open Source 5.2
to ScyllaDB Enterprise 2023.1.
This commit does not cover metric updates (the metrics file has no
content, which needs to be added in another PR).
As this is an upgrade guide, this commit must be merged to master and
backported to branch-5.2 and branch-2023.1 in scylla-enterprise.git.
Closes #13294
This reverts commit 32fff17e19, reversing
changes made to 164afe14ad.
This series proved to be problematic, the new test introduced by it
failing quite often. Revert it until the problems are tracked down and
fixed.
The commitlog API originally implied that the commitlog_directory would contain files from a single commitlog instance. This is checked in segment_manager::list_descriptors: if it encounters a file with an unknown prefix, an exception occurs in `commitlog::descriptor::descriptor`, which is logged with the `WARN` level.
A new schema commitlog was added recently, which shares the filesystem directory with the main commitlog. This causes warnings to be emitted on each boot. This patch solves the warnings problem by moving the schema commitlog to a separate directory. In addition, the user can employ the new `schema_commitlog_directory` parameter to move the schema commitlog to another disk drive.
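The prefix check described above can be pictured with a minimal Python sketch. The file-name layout and the `CommitLog` / `SchemaLog` prefixes used here are assumptions made for illustration; the real logic lives in the C++ `commitlog::descriptor` constructor and may differ in detail.

```python
import logging
import re

log = logging.getLogger("commitlog")

# Assumed file-name layout: <prefix>-<version>-<id>.log (illustrative only).
_SEGMENT_RE = re.compile(r"^(?P<prefix>[A-Za-z]+)-(?P<version>\d+)-(?P<id>\d+)\.log$")


def list_descriptors(filenames, expected_prefix):
    """Return (version, id) pairs for segments owned by one commitlog instance.

    Files with any other prefix are skipped with a warning, which mirrors the
    warnings seen on every boot while both commitlogs shared one directory.
    """
    descriptors = []
    for name in filenames:
        match = _SEGMENT_RE.match(name)
        if match is None or match["prefix"] != expected_prefix:
            log.warning("ignoring unrecognized commitlog file: %s", name)
            continue
        descriptors.append((int(match["version"]), int(match["id"])))
    return descriptors
```

With a shared directory, each instance warns about the other's files; with separate directories, neither sees a foreign prefix.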
This is expected to be released in 5.3.
As #13134 (raft tables->schema commitlog) is also scheduled for 5.3, and it already requires a clean rolling restart (no cl segments to replay), we don't need to specifically handle upgrade here.
Fixes: #11867
Closes #13263
* github.com:scylladb/scylladb:
commitlog: use separate directory for schema commitlog
schema commitlog: fix commitlog_total_space_in_mb initialization
The commitlog api originally implied that
the commitlog_directory would contain files
from a single commitlog instance. This is
checked in segment_manager::list_descriptors:
if it encounters a file with an unknown
prefix, an exception occurs in
commitlog::descriptor::descriptor, which is
logged with the WARN level.
A new schema commitlog was added recently,
which shares the filesystem directory with
the main commitlog. This causes warnings
to be emitted on each boot. This patch
solves the warnings problem by moving
the schema commitlog to a separate directory.
In addition, the user can employ the new
schema_commitlog_directory parameter to move
the schema commitlog to another disk drive.
By default, the schema commitlog directory is
nested in the commitlog_directory. This can help
avoid problems during an upgrade if the
commitlog_directory in the custom scylla.yaml
is located on a separate disk partition.
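A minimal sketch of how the default could be resolved, assuming the nested directory is a `schema` subdirectory of `commitlog_directory` (the subdirectory name is an assumption, as is the helper `schema_commitlog_dir`; only the two option names come from this message):

```python
from pathlib import PurePosixPath


def schema_commitlog_dir(config):
    """Resolve the schema commitlog directory from a scylla.yaml-like mapping."""
    explicit = config.get("schema_commitlog_directory")
    if explicit:
        # The user placed the schema commitlog somewhere else, e.g. another disk.
        return PurePosixPath(explicit)
    # Assumption: the nested default is a "schema" subdirectory.
    return PurePosixPath(config["commitlog_directory"]) / "schema"
```

Because the default follows `commitlog_directory`, a custom scylla.yaml that already points the commitlog at a dedicated partition keeps both commitlogs on that partition without further changes.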
This is expected to be released in 5.3.
As #13134 (raft tables->schema commitlog)
is also scheduled for 5.3, and it already
requires a clean rolling restart (no cl
segments to replay), we don't need to
specifically handle upgrade here.
Fixes: #11867
`scylla-sstable` currently has two ways to obtain the schema:
* via a `schema.cql` file.
* load schema definition from memory (only works for system tables).
This meant that in most cases it was necessary to export the schema into a `CQL` format and write it to a file. This is very flexible: the sstable can be inspected anywhere, it doesn't have to be on the same host it originates from. Yet in many cases the sstable *is* inspected on the same host it originates from. In these cases, the schema is readily available in the schema tables on disk, and it is plain annoying to have to export it into a file just to quickly inspect an sstable file.
This series solves this annoyance by providing a mechanism to load schemas from the on-disk schema tables. Furthermore, an auto-detect mechanism is provided to detect the location of these schema tables based on the path of the sstable. If that fails, the tool checks the usual locations of the scylla data dir and the scylla configuration file, and even looks for environment variables that point to these. The old methods are still supported. In fact, if a `schema.cql` is present in the working directory of the tool, it is preferred over any other method, allowing for an easy force-override.
If the auto-detection magic fails, an error is printed to the console, advising the user to turn on debug level logging to see what went wrong.
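The fallback behaviour described above can be sketched as a chain of schema sources tried in order. This is an illustration only: `load_schema` and the `(name, loader)` pairs are hypothetical and do not reflect the tool's C++ implementation.

```python
import logging

log = logging.getLogger("schema-loader")


def load_schema(sources):
    """Try each (name, loader) pair in order and return the first schema found.

    A loader either returns a schema, returns None, or raises. Failures are
    only logged at debug level, so the final error points the user there.
    """
    for name, loader in sources:
        try:
            schema = loader()
        except Exception as exc:
            log.debug("schema source %s failed: %s", name, exc)
            continue
        if schema is not None:
            return name, schema
        log.debug("schema source %s found nothing", name)
    raise RuntimeError("could not obtain schema; enable debug-level logging for details")
```

Placing the `schema.cql` loader first in `sources` gives the force-override behaviour: whenever that file is present, later sources are never consulted.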
A comprehensive test is added which checks all the different schema loading mechanisms. The documentation is also updated to reflect the changes.
This change breaks the backward-compatibility of the command-line API of the tool, as `--system-schema` is now just a flag; the keyspace and table names are supplied separately via the new `--keyspace` and `--table` options. I don't think this will break anybody's workflow, as this tool is still lightly used, exactly because of the annoying way the schema has to be provided. Hopefully after this series, this will change.
Example:
```
$ ./build/dev/scylla sstable dump-data /var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine/me-1-big-Data.db
{"sstables":{"/var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine//me-1-big-Data.db":[{"key":{"token":"-3485513579396041028","raw":"000400000000","value":"0"},"clustering_elements":[{"type":"clustering-row","key":{"raw":"","value":""},"marker":{"timestamp":1677837047297728},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1677837047297728,"value":"0"}}}]}]}}
```
As seen above, subdirectories like `quarantine`, `staging`, etc. are also supported.
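One way to picture the path-based detection is the sketch below. It assumes the table directory is always named `<table>-<32 hex digits>` with the keyspace directory directly above it, as in the example path; the helper `locate_table` is hypothetical and is not the tool's actual code.

```python
import re
from pathlib import PurePosixPath

# Assumed table directory layout: <table name>-<32 hex digit table id>.
_TABLE_DIR_RE = re.compile(r"^(?P<table>.+)-(?P<uuid>[0-9a-f]{32})$")


def locate_table(sstable_path):
    """Derive data dir, keyspace and table name from an sstable path.

    Walks up the parent directories, so subdirectories such as quarantine
    or staging between the table directory and the file are skipped over.
    """
    for directory in PurePosixPath(sstable_path).parents:
        match = _TABLE_DIR_RE.match(directory.name)
        if match:
            return {
                "data_dir": str(directory.parent.parent),
                "keyspace": directory.parent.name,
                "table": match["table"],
            }
    return None  # caller falls back to the other detection methods
```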
Fixes: https://github.com/scylladb/scylladb/issues/10126
Closes #13075
* github.com:scylladb/scylladb:
docs/operating-scylla/admin-tools: scylla-sstable.rst: update schema section
test/cql-pytest: test_tools.py: add test for schema loading
test/cql-pytest: nodetool.py: add flush_keyspace()
tools/scylla-sstable: reform schema loading mechanism
tools/schema_loader: add load_schema_from_schema_tables()
db/schema_tables: expose types schema
Fixes https://github.com/scylladb/scylladb/issues/13106
This commit removes the information that BYPASS CACHE
is an Enterprise-only feature and replaces that info
with the link to the BYPASS CACHE description.
Closes #13316
This commit removes the Enterprise upgrade guides from
the Open Source documentation. The Enterprise upgrade guides
should only be available in the Enterprise documentation,
with the source files stored in scylla-enterprise.git.
In addition, this commit:
- adds the links to the Enterprise user guides in the Enterprise
documentation at https://enterprise.docs.scylladb.com/
- adds the redirections for the removed pages to avoid
breaking any links.
This commit must be reverted in scylla-enterprise.git.
Closes #13298
The patch series introduces linearisable topology changes using
the raft protocol. The state machine driven by raft is described in
"service: Introduce topology state machine". Some explanations about
the implementation can be found in "storage_service: raft topology:
implement topology management through raft".
The code is not ready for production. There is not much in terms of error
handling, and integration with the rest of the system has not even started.
For full integration, request fencing will need to be implemented and
token_metadata has to be extended to support not just "pending" nodes
but also the concepts of "read replica set" and "write replica set".
The code may be far from being usable, but it is hidden behind the
"experimental raft" flag, and having it in tree will relieve me of the
constant rebase burden.
* 'raft-topology-v6' of github.com:scylladb/scylla-dev:
storage_service: fix indentation from previous patch
storage_service: raft topology: implement topology management through raft
service: raft: make group0_guard move assignable
service: raft: wire up apply() and snapshot transfer for topology in group0 state machine
storage_service: raft topology: introduce a function that applies topology cmd to local state machine
storage_service: raft topology: introduce a raft monitor and topology coordinator fibers
storage_service: raft topology: introduce snapshot transfer code for the topology table
raft topology: add RAFT_TOPOLOGY_CMD verb that will be used by topology coordinator to communicated with nodes
bootstrapper: Add get_random_bootstrap_tokens function
service: raft: add support for topology_change command into raft_group0_client
service: raft: introduce topology_change group0 command
system_keyspace: add a table to persist topology change state machine's state
service: Introduce topology state machine data structures
storage_proxy: not consult topology on local table write
Hey y'all!
Me and @malusev998 are maintaining an updated version of the [PHP Driver](https://github.com/he4rt/scylladb-php-driver) together with the @he4rt community, and it has had a bunch of improvements in the last month.
Before, it worked only on PHP 7.1 (DataStax branch); on our branch we have it working on PHP 8.1 and 8.2.
We are also using the ScyllaDB C++ Driver in this project, and I think it is a good idea to point new users to this project since it's the most up-to-date PHP driver maintained now.
What do y'all think about that?
Closes #13218
* github.com:scylladb/scylladb:
fix: links to php driver
fix: adding php versions into driver's description
docs: scylladb better php driver
Point to the difference between the official MurmurHash3 and Scylla / Cassandra implementation
Update docs/glossary.rst
Co-authored-by: Anna Stuchlik <37244380+annastuchlik@users.noreply.github.com>
Closes #11369
The topology state machine will track all the nodes in a cluster,
their state, properties (topology, tokens, etc) and requested actions.
Node state can be one of the following:
none - the node is not yet in the cluster
bootstrapping - the node is currently bootstrapping
decommissioning - the node is being decommissioned
removing - the node is being removed
replacing - the node is replacing another node
normal - the node is working normally
rebuild - the node is being rebuilt
left - the node has left the cluster
Nodes in state left are never removed from the state.
Tokens can also be in one of the following states:
write_both_read_old - writes go to both new and old replicas, but reads are
still served from old replicas
write_both_read_new - writes still go to old and new replicas, but reads are
served from new replicas
owner - tokens are owned by the node, and reads and writes go to the new
replica set only
Tokens that need to be moved start in the 'write_both_read_old' state. After the
entire cluster learns about it, streaming starts. After streaming, tokens move
to the 'write_both_read_new' state, and again the whole cluster needs to learn
about it and make sure no reads that started before that point still exist in
the system. After that, tokens may move to the 'owner' state.
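The token state progression above can be sketched in Python as follows. The names `TokenState` and `next_state` and the two boolean inputs are hypothetical; they only model the ordering described in this message, not the actual C++ data structures.

```python
from enum import Enum


class TokenState(Enum):
    WRITE_BOTH_READ_OLD = "write_both_read_old"
    WRITE_BOTH_READ_NEW = "write_both_read_new"
    OWNER = "owner"


_NEXT = {
    TokenState.WRITE_BOTH_READ_OLD: TokenState.WRITE_BOTH_READ_NEW,
    TokenState.WRITE_BOTH_READ_NEW: TokenState.OWNER,
}


def next_state(state, cluster_knows, work_done):
    """Return the state that follows `state`, or `state` itself if not ready.

    cluster_knows: every node has learned about the current state.
    work_done: streaming has finished (when leaving write_both_read_old), or
               reads started before the switch have drained (when leaving
               write_both_read_new).
    """
    if state is TokenState.OWNER or not (cluster_knows and work_done):
        return state
    return _NEXT[state]
```

Both conditions gate every step, so a token range can never skip the intermediate 'write_both_read_new' state.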
topology_request is the field through which a topology operation request
can be issued to a node. A request is one of the topology operations
currently supported: join, leave, replace or remove.
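As a rough illustration of the per-node data this message describes, the sketch below models node state, tokens and the pending request in Python. All class and method names are hypothetical; the real structures are C++ and are introduced by the patch itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, Optional


class NodeState(Enum):
    NONE = "none"
    BOOTSTRAPPING = "bootstrapping"
    DECOMMISSIONING = "decommissioning"
    REMOVING = "removing"
    REPLACING = "replacing"
    NORMAL = "normal"
    REBUILD = "rebuild"
    LEFT = "left"


class TopologyRequest(Enum):
    JOIN = "join"
    LEAVE = "leave"
    REPLACE = "replace"
    REMOVE = "remove"


@dataclass
class Node:
    state: NodeState = NodeState.NONE
    tokens: FrozenSet[int] = frozenset()
    request: Optional[TopologyRequest] = None


class Topology:
    """Tracks every node ever seen; entries are never deleted."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def request(self, node_id: str, request: TopologyRequest) -> None:
        # Issue a topology operation request to a (possibly new) node.
        self.nodes.setdefault(node_id, Node()).request = request

    def mark_left(self, node_id: str) -> None:
        # The node stays in the map in state LEFT instead of being removed.
        node = self.nodes[node_id]
        node.state = NodeState.LEFT
        node.request = None
```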
Fixes https://github.com/scylladb/scylladb/issues/12758
This commit adds a new page with a matrix that shows
which ScyllaDB Open Source version each
ScyllaDB Enterprise version is based on.
The new file is added to the newly created Reference
section.
Closes #13230
Even after the last fixups, the documentation still had some issues, with
the compilation instructions in particular. I also ran a spelling and
grammar check on the text, and fixed the issues it found.
Closes #13206
This commit adds branch-5.2 to the list of branches
for which we want to build the docs. As a result,
version 5.2 will be added to the version selector.
NOTE: Version 5.2 will be marked as unstable and
an appropriate message will be shown to the user.
After 5.2 is released, branch-5.2 needs to be
moved from UNSTABLE_VERSIONS to LATEST_VERSION
(where it should replace branch-5.1).
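The follow-up step after the release could look roughly like the sketch below. Only the names `UNSTABLE_VERSIONS` and `LATEST_VERSION` come from this message; the list contents and the helper `promote_to_latest` are assumptions for illustration, not the actual contents of the docs configuration.

```python
# Assumed state while 5.2 is unreleased (values are illustrative).
UNSTABLE_VERSIONS = ["master", "branch-5.2"]
LATEST_VERSION = "branch-5.1"


def promote_to_latest(branch, unstable_versions):
    """Return (unstable_versions, latest_version) after `branch` is released."""
    if branch not in unstable_versions:
        raise ValueError(f"{branch} is not marked unstable")
    remaining = [b for b in unstable_versions if b != branch]
    return remaining, branch


# After the 5.2 release, branch-5.2 replaces branch-5.1 as the latest version.
UNSTABLE_VERSIONS, LATEST_VERSION = promote_to_latest("branch-5.2", UNSTABLE_VERSIONS)
```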
Closes #13200
Fixes https://github.com/scylladb/scylladb/issues/13138
Fixes https://github.com/scylladb/scylladb/issues/13153
This PR:
- Fixes outdated information about the recommended OS. Since version 5.2, the recommended OS should be Ubuntu 22.04 because that OS is used for building the ScyllaDB image.
- Adds the OS support information for version 5.2.
This PR (both commits) needs to be backported to branch-5.2.
Closes #13188
* github.com:scylladb/scylladb:
doc: Add OS support for version 5.2
doc: Updates the recommended OS to be Ubuntu 22.04
This patch fixes 2 small issues with the Wasm UDF documentation that
recently got uploaded:
1. a link was unnecessarily wrapped in angle brackets
2. a link did not redirect to the correct page due to a missing ":doc:" tag
Closes #13193
Fixes https://github.com/scylladb/scylladb/issues/13138
This PR fixes the outdated information about the recommended
OS. Since version 5.2, the recommended OS should be Ubuntu 22.04
because that OS is used for building the ScyllaDB image.
This commit needs to be backported to branch-5.2.
Until now, the instructions on generating wasm files and using them
for Scylla UDFs were stored in docs/dev, so they were not visible
on the docs website. Now that the Rust helper library for UDFs
is ready, and we're inviting users to try it out, we should also
make the rest of the Wasm UDF documentation readily available
for the users.
Closes #13139
Related: https://github.com/scylladb/scylladb/issues/13119
This commit removes the pages that describe Enterprise only features
from the Open Source documentation:
- Encryption at Rest
- Workload Prioritization
- LDAP Authorization
- LDAP Authentication
- Audit
In addition, it removes most of the information about Incremental
Compaction Strategy (ICS), which is replaced with links to the
Enterprise documentation.
The changes above required additional updates introduced with this
commit:
- The links to Enterprise-only features are replaced with the
corresponding links in the Enterprise documentation.
- The redirections are added for the removed pages to be redirected to
the corresponding pages in the Enterprise documentation.
This commit must be reverted in the scylla-enterprise repository to
avoid deleting the Enterprise-only content from the Enterprise docs.
Closes #13123
In issue #5283 we noted that the auto_snapshot option is not useful
in Alternator (as we don't offer any API to restore the snapshot...),
and suggested that we should automatically disable this option for
Alternator tables. However, this issue has been open for more than three
years, and we never changed this default.
So until we solve that issue - if we ever do - let's add a paragraph
in docs/alternator/alternator.md recommending that users disable
this option in the configuration themselves. The text explains why,
and also provides a link to the issue.
Refs #5283
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #13103
When the WASM UDFs were first introduced, the LANGUAGE required in
the CQL statements to use them was "xwasm", because the ABI for the
UDFs was still not specified and changes to it could be backwards
incompatible.
Now, the ABI is stabilized, but if backwards incompatible changes
are made in the future, we will add a new ABI version for them, so
the name "xwasm" is no longer needed and we can finally
change it to "wasm".
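For existing scripts that still use the old language name, the rename amounts to a one-word change in each statement. A minimal sketch, assuming the statements are available as plain CQL text (the helper `use_stable_language` is hypothetical):

```python
import re

# CQL keywords are case-insensitive, so match LANGUAGE in any case.
_LANGUAGE_XWASM = re.compile(r"\bLANGUAGE\s+xwasm\b", re.IGNORECASE)


def use_stable_language(cql):
    """Rewrite 'LANGUAGE xwasm' to 'LANGUAGE wasm' in a CQL statement."""
    return _LANGUAGE_XWASM.sub("LANGUAGE wasm", cql)
```

Statements that already say `LANGUAGE wasm`, or that use another language, pass through unchanged.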
Closes #13089
docs/alternator/compatibility.md mentions a known problem that
Alternator Streams are divided into too many "shards". This patch
add a link to a github issue to track our work on this issue - like
we did for most other differences mentioned in compatibility.md.
Refs #13080
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #13081
This commit makes the following changes to the docs landing page:
- Adds the ScyllaDB enterprise docs as one of three tiles.
- Modifies the three tiles to reflect the three flavors of ScyllaDB.
- Moves the "New to ScyllaDB? Start here!" under the page title.
- Renames "Our Products" to "Other Products" to list the products other
than ScyllaDB itself. In addition, the boxes are enlarged to
large-4 to look better.
The major purpose of this commit is to expose the ScyllaDB
documentation.
docs: fix the link
Closes #13065
This PR adds a note to the Alternator TTL section to specify in which Open Source and Enterprise versions the feature was promoted from experimental to non-experimental.
The challenge here is that OSS and Enterprise are (still) **documented together**, but they're **not in sync** in promoting the TTL feature: it's still experimental in 5.1 (released) but no longer experimental in 2022.2 (to be released soon).
We can take one of the following approaches:
a) Merge this PR with master and ask the 2022.2 users to refer to master.
b) Merge this PR with master and then backport to branch-5.1. If we choose this approach, it is necessary to backport https://github.com/scylladb/scylladb/pull/11997 beforehand to avoid conflicts.
I'd opt for a) because it makes more sense from the OSS perspective and helps us avoid mess and backporting.
Closes #12295
* github.com:scylladb/scylladb:
doc: fix the version in the comment on removing the note
doc: specify the versions where Alternator TTL is no longer experimental
The WASM UDF implementation has changed since the last time the docs
were written. In particular, the Rust helper library has been
released, and using it should be the recommended method.
Some decisions that were only experimental at the start have since been
"set in stone", so we should refer to them as such.
The docs also contain some code examples. This patch adds tests for
these examples to make sure that they are not wrong and misleading.
Closes #12941