boost::intrusive::value_traits_pointers was introduced in boost 1.56, while
we also support boost 1.55. Replace with an equivalent expression.
(with additions by Asias)
Message-Id: <20170110084700.19994-1-avi@scylladb.com>
Commit f0c28e1 ("db/schema_tables: Add schema_functions and
schema_aggregates tables") forgot to add the newly added tables to the
db::schema_tables::ALL list, which is used for authorization checks, for
example.
Fixes the following auth_test.py dtest failures:
('Unable to connect to any servers', {'127.0.0.1': Unauthorized('Error from server: code=2100 [Unauthorized] message="User cathy has no SELECT permission on <table system.schema_functions> or any of its parents"',)})
Message-Id: <1484045277-4997-1-git-send-email-penberg@scylladb.com>
"Intended to reduce memory usage when resharding by sharing sstable
components among shards. File descriptors are also shared from now
on, meaning that a much smaller number of file descriptors will be
used during resharding.
Fixes #1951."
branch 'excessive_memory_usage_v4' of github.com:raphaelsc/scylla
* 'excessive_memory_usage_v4' of github.com:raphaelsc/scylla:
db: avoid excessive memory usage during resharding
checked_file_impl: add support to dup
sstables: group sstable components that can be shared among shards
sstables: rename sstable member
After resharding, sstables may be owned by all shards, which
means that file descriptors and memory usage for metadata will
increase by a factor equal to the number of shards. That can easily
lead to OOM.
SSTable components are immutable, so they can be stored in one
shard and shared with others that need it. We use the following
formula to decide which shard will open the sstable and share
it with the others: (generation % smp::count), which is the
inverse of how we calculate generation for new sstables.
So if no resharding is performed, everything is shard-local.
With this approach, resource usage due to loaded sstables will
be evenly distributed among shards.
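A minimal sketch of the shard selection described above. The function names are hypothetical, shard_count stands in for smp::count, and the generation scheme in make_generation() is an assumption chosen so that the modulo really is its inverse; the actual code may differ.

```cpp
#include <cassert>
#include <cstdint>

// Assumed generation scheme (not spelled out in this message): a per-shard
// sequence number is interleaved with the id of the shard creating the sstable.
constexpr int64_t make_generation(int64_t seq, unsigned shard, unsigned shard_count) {
    return seq * shard_count + shard;
}

// The formula from the text: the shard that opens the sstable and shares
// its immutable components with the other shards.
inline unsigned owner_shard(int64_t generation, unsigned shard_count) {
    assert(shard_count > 0 && generation >= 0);
    return static_cast<unsigned>(generation % shard_count);
}
```

Under that assumption, an sstable written by shard 3 maps back to shard 3, so nothing leaves its shard unless resharding happened.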
For this approach to work, we now only populate keyspaces from
shard 0, which is now solely responsible for iterating through
column family dirs. In addition, most of the population functions
are now free functions that take the distributed database object
as a parameter.
Fixes #1951.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
"This patch series adds support for CQL 3.3.1. The changes to CQL are listed
here:
https://github.com/apache/cassandra/blob/cassandra-2.2/doc/cql3/CQL.textile#changes
The following CQL features are already supported by Scylla:
- TRUNCATE TABLE alias
- Double-dollar string literals
- Aggregate functions: MIN, MAX, SUM, and AVG
This series adds the following CQL features:
- New data types: tinyint, smallint, date, and time
- CQL binary protocol v4 (required by the new data types)
- Advertise Cassandra 2.2.8 version from Scylla so that drivers correctly
detect the presence of CQL 3.3.1
The following CQL features are not supported by Scylla:
- Role-based access control (issue #1941)
- JSON data type
- User-defined functions (UDFs)
- User-defined aggregates (UDAs)
The following CQL binary protocol v4 changes are not implemented by this
series:
- Read_failure and Write_failure error codes are not implemented.
These error codes are not used by the smart drivers, but as they are
propagated to application code, we eventually need to wire them up
to our storage proxy implementation.
- Function_failure error code is only used by user-defined functions
and the fromJson function, which are not implemented by Scylla.
Fixes #1284."
* 'penberg/cql-3.3.1/v5' of github.com:cloudius-systems/seastar-dev:
version: Bump Cassandra version to 2.2.8
db/schema_tables: Add schema_functions and schema_aggregates tables
tests/type_tests: TIME type test cases
tests/cql_query_test: TIME type test cases
cql3: TIME data type support
tests/type_tests: DATE type test cases
tests/cql_query_test: DATE type test cases
cql3: DATE type support
date.h: 64-bit year and days representation
licenses: Add utils/date.h license
utils/date.h: Import date and time library sources
tests/type_tests: TINYINT and SMALLINT type test cases
tests/cql_query_test: TINYINT and SMALLINT type test cases
cql3: TINYINT and SMALLINT data type support
types: Fix integer_type_impl::parse_int() for bytes
mutation_result_merger::get() assumes that the merged result may be a
short read if at least one of the partial results is a short read (in
other words, if none of the partial results is a short read, then the
merged result is also not a short read). However this is not true;
because we update the memory accounter incrementally, we may stop
scanning early. All the partial results are full; but we did not scan
the entire range.
Fix by changing the short_read variable initialization from `no`
(which assumes we'll encounter a short read indication when processing
one of the batches) to `this->short_read()`, which also takes into
account the memory accounter.
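A simplified sketch of the idea, not the real mutation_result_merger: all types and member names below are hypothetical, and the memory accounter is reduced to a byte counter with a limit.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical partial result: only the fields needed for the illustration.
struct partial_result {
    bool short_read;
    std::size_t size;
};

class result_merger {
    std::size_t _max_size;          // memory budget for the merged result
    std::size_t _used = 0;          // incrementally updated accounting
    bool _stopped_early = false;    // set when the budget is exhausted
    std::vector<partial_result> _partials;
public:
    explicit result_merger(std::size_t max_size) : _max_size(max_size) {}

    // Returns false once the budget is exhausted and scanning should stop.
    bool add(partial_result p) {
        _used += p.size;
        _partials.push_back(p);
        if (_used >= _max_size) {
            _stopped_early = true;
        }
        return !_stopped_early;
    }

    // Short read as seen by the accounter, independent of the partials.
    bool short_read() const { return _stopped_early; }

    bool merged_is_short_read() const {
        bool sr = short_read();     // before the fix this started as `false`
        for (const auto& p : _partials) {
            sr = sr || p.short_read;
        }
        return sr;
    }
};
```

With the old initialization, two full partial results that together exhaust the budget would be reported as a complete read even though the rest of the range was never scanned.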
Fixes #2001.
Message-Id: <20170108111315.17877-1-avi@scylladb.com>
We need 64-bit year and days representation to support the boundary
values of the CQL data type, which is implemented using Joda Time
library's DateTime type.
No one overrides file_writer::write(), so there is no reason to inhibit
optimisations and cause the compiler to emit indirect calls.
Message-Id: <20170104163618.26251-1-pdziepak@scylladb.com>
The Ubuntu Packaging Guide says that if there is no upstream package (meaning it
is not ported from Debian), the revision should be "0ubuntu1", not "ubuntu1",
which is what we currently use.
On Debian, the Debian Policy Manual says it is conventional to restart the revision from 1 when the upstream version increases, so we should set it to "1".
To handle both cases in a single script, we generate the revision at build time.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1483498658-27491-1-git-send-email-syuu@scylladb.com>
This patch makes housekeeping create a uuid file if a path to a uuid
file is supplied but the file is missing.
Because it imports the uuid lib, the uuid parameters were renamed.
Fixes #1987
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1483866553-13855-2-git-send-email-amnon@scylladb.com>
We intend to share immutable sstable components among shards to
reduce excessive memory usage when resharding shared sstables.
This change is about grouping those components into a structure,
and using foreign_ptr to make sure that the structure will be
deleted by whichever shard created it.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Rename _components to _recognized_components because _components
will be used to name a field with shareable components.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
* seastar 1c8e389...240b0bf (15):
> file/dup: don't decrease refcnt twice when file is explicitly closed
> reactor: Add missing CentOS 7.2 dependency systemtap-sdt-devel
> reactor: Cleaning the smp queue metrics when shuting down
> metrics: metrics keep the value map while unregistering
> change the reactor load metrics to utilization
> Merge "ASan fiber switches" from Paweł
> tls: Add missing credentials_builder::set_client_auth method
> collectd: create metrics with the right format
> io_queue: remove owner number from metric name
> reactor: change the load metric name to load
> Merge "reactor: stop using signals for task_quota timer"
> metrics: Allow initializing the metric_group in its constructor
> Update DPDK to 16.11
> Revert "rpc: Avoid using zero-copy interface of output_stream"
> core::metrics_groups: add a clear() method
Transform the supervisor_notify() and related functions into
the "supervisor" class and place this class implementation in
a separate .cc file.
This fixes the compilation breakage of tests introduced by
commit 8014adc2a1
init: serialize the creation of system_traces KS objects
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1483663955-20096-1-git-send-email-vladz@scylladb.com>
"Reduce the size of mutation_partition by implementing intrusive set using
bi::rbtree_algorithms directly and using tree nodes optimized for size.
This will reduce the size of mutation_partition by:
24 bytes + <number of cql rows> * 8 bytes
This should have a positive impact on performance because mutation_partitions
are stored both in memtable and cache.
Fixes #742."
* 'haaawk/742' of github.com:cloudius-systems/seastar-dev:
intrusive_set: rename size() to calculate_size()
Make intrusive_set_external_comparator::_value_traits static
Implement intrusive set using rbtree_algorithms
mutation_partition: make apply_reversibly_intrusive_set nongeneric
mutation_partition: take schema in find_row and clustered_row
mutation_partition: Extract intrusive set logic to a class.
mutation_partition: Replace value_comp with key_comp calls
Before this patch system table writes were not writing to commit log
because database::add_column_family() disables writes to commit log
for the table which is added if _commitlog is not set at that
time. Fix by initializing commit log before system tables are created.
Fixes #1986.
Fixes recent regression in
batch_test.py:TestBatch.replay_after_schema_change_test after
scylla-jmx was updated to not flush system tables on nodetool flush.
This could cause system keyspace writes to be delayed longer than before
under heavy write workload. Refs #1926.
Message-Id: <1483618117-4535-1-git-send-email-tgrabiec@scylladb.com>
During a range scan, we try to avoid sorting according to partition range
when we can do so. This is when we scan fewer than smp::count shards --
each shard's range is strictly ordered with respect to the others.
However, we use the wrong key for the sort -- we use the shard number. But
if we started at shard s > 0 and wrapped around to shard 0, then shard 0's
range will be after the range belonging to shard s, but will sort before it.
Fix by storing the iteration order as the sort key. We use that when we
know that shards do not overlap (shards < smp::count) and the index within
the source partition range vector when they do.
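To illustrate the wrap-around case, here is a small sketch with hypothetical names (shard_result, visit and the two sort helpers are not from the patch). It only models the non-overlapping case, where fewer than shard_count shards are scanned.

```cpp
#include <algorithm>
#include <vector>

struct shard_result {
    unsigned shard;   // shard that produced the result
    unsigned order;   // position in iteration order (the sort key after the fix)
};

// Visit `n` shards starting at `start`, wrapping around at `shard_count`.
inline std::vector<shard_result> visit(unsigned start, unsigned n, unsigned shard_count) {
    std::vector<shard_result> r;
    for (unsigned i = 0; i < n; ++i) {
        r.push_back({(start + i) % shard_count, i});
    }
    return r;
}

// Old behaviour: sort by shard number, which is wrong after a wrap-around.
inline std::vector<unsigned> sorted_by_shard(std::vector<shard_result> r) {
    std::sort(r.begin(), r.end(),
              [](const shard_result& a, const shard_result& b) { return a.shard < b.shard; });
    std::vector<unsigned> out;
    for (const auto& x : r) out.push_back(x.shard);
    return out;
}

// Fixed behaviour: sort by iteration order, which follows the partition range.
inline std::vector<unsigned> sorted_by_order(std::vector<shard_result> r) {
    std::sort(r.begin(), r.end(),
              [](const shard_result& a, const shard_result& b) { return a.order < b.order; });
    std::vector<unsigned> out;
    for (const auto& x : r) out.push_back(x.shard);
    return out;
}
```

Starting at shard 2 of 4 and scanning three shards visits 2, 3, 0. Sorting by shard number puts shard 0 first, although its range comes last; sorting by iteration order keeps 2, 3, 0.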
Fixes #1998.
Message-Id: <20170105114253.17492-1-avi@scylladb.com>
Serialize the creation of the system_traces KS objects when
they do not exist, i.e. on the initial cluster boot.
Avoid creating them in parallel on different cluster nodes
in order to avoid issue #420.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1483552503-12873-3-git-send-email-vladz@scylladb.com>
This hopefully will make it more apparent that
the time complexity of this method is O(N) not O(1).
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
_value_traits can be shared among all instances
and there's no need to store it in every single one.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
1.6 truncates paged queries early to avoid overrunning server memory
with too-large query results, but in the case of partition range queries,
this terminates too early due to an uninitialized variable holding the
maximum result size. This results in slow performance due to additional
round trips.
Fix by initializing the maximum result size from the result_memory_tracker
running on the coordinating shard.
Fixes #1995.
Message-Id: <20170105103915.10633-1-avi@scylladb.com>
This new implementation takes less memory because it
does not store a comparator.
It also uses tree nodes optimized for size. This means
that instead of storing an enum field |color|, they embed
this information inside the pointer to the parent.
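The size-optimized node layout can be sketched as follows. This is an illustration of the general pointer-tagging technique, not the Boost.Intrusive node type; compact_node and plain_node are hypothetical names, and the sketch assumes nodes are at least 2-byte aligned so the lowest pointer bit is free.

```cpp
#include <cassert>
#include <cstdint>

enum class color : unsigned char { red = 0, black = 1 };

// Conventional layout: a separate color field, padded up to pointer size.
struct plain_node {
    plain_node* parent;
    plain_node* left;
    plain_node* right;
    color c;
};

// Compact layout: the color lives in the lowest bit of the parent pointer.
struct compact_node {
    std::uintptr_t parent_and_color = 0;
    compact_node* left = nullptr;
    compact_node* right = nullptr;

    compact_node* parent() const {
        return reinterpret_cast<compact_node*>(parent_and_color & ~std::uintptr_t(1));
    }
    void set_parent(compact_node* p) {
        auto bits = reinterpret_cast<std::uintptr_t>(p);
        assert((bits & 1) == 0);    // relies on the alignment assumption
        parent_and_color = bits | (parent_and_color & 1);
    }
    color get_color() const {
        return static_cast<color>(parent_and_color & 1);
    }
    void set_color(color c) {
        parent_and_color = (parent_and_color & ~std::uintptr_t(1))
                         | static_cast<std::uintptr_t>(c);
    }
};

static_assert(sizeof(compact_node) < sizeof(plain_node),
              "dropping the color field should shrink the node");
```

On a typical 64-bit platform this saves one pointer-sized slot (8 bytes) per node, since the separate color field is padded to pointer alignment.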
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
apply_reversibly_intrusive_set is used only in one place
and always with rows_type. There's no need for it to be generic.
This will allow changing intrusive set implementation.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
The integer_type_impl::parse_int() function uses boost::lexical_cast()
under the hood, which parses 8-bit numbers as characters. Fix the
function to lexical cast to 64-bit integer and convert the result to
integer_type_impl template type.
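The shape of the fix, sketched with the standard library only: std::stoll stands in for boost::lexical_cast to a 64-bit integer, and the free function parse_int below is a hypothetical stand-in for the member function. The explicit range check is an addition for the sketch, not something this message describes.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>

// Parse through a wide integer first, then narrow to the target type T,
// so that "65" becomes the number 65 rather than the character 'A'.
template <typename T>
T parse_int(const std::string& s) {
    std::size_t pos = 0;
    long long wide = std::stoll(s, &pos);   // throws on non-numeric input
    if (pos != s.size()) {
        throw std::invalid_argument("trailing characters in integer: " + s);
    }
    if (wide < std::numeric_limits<T>::min() || wide > std::numeric_limits<T>::max()) {
        throw std::out_of_range("integer out of range for target type: " + s);
    }
    return static_cast<T>(wide);
}
```

For example, parse_int<int8_t>("127") yields 127, while parse_int<int8_t>("128") throws std::out_of_range in this sketch.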
Previously, if the Prometheus port (by default, 0.0.0.0:9180) could not
be opened, the following message appeared in the log about 10 seconds into
the run, and Scylla crashed.
ERROR 2017-01-01 19:31:04,066 [shard 0] seastar - Exiting on unhandled exception: std::system_error (error system:98, Address already in use)
The puzzled user would have no idea *which* address was already in use, why,
or why Scylla stopped.
In this patch, before the above message we get the much more informative
message:
ERROR 2017-01-01 19:58:19,080 [shard 0] init - Could not start Prometheus API server on 0.0.0.0:9180: std::system_error (error system:98, Address already in use)
We continue to print the original message - and exit - in this case,
under the assumption that it's better not to run the database while
improperly configured.
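The pattern is roughly the following. This is a plain C++ sketch with hypothetical helper names (describe_start_failure, start_with_context); the real code uses Scylla's logger and Seastar's server startup rather than std::cerr and a std::function.

```cpp
#include <functional>
#include <iostream>
#include <string>
#include <system_error>

// Build a message that names the service and the address that failed.
inline std::string describe_start_failure(const std::string& what,
                                          const std::string& addr,
                                          const std::system_error& e) {
    return "Could not start " + what + " on " + addr + ": " + e.what();
}

// Run `start`; on failure, log the context and rethrow so that the
// process still exits instead of running improperly configured.
inline void start_with_context(const std::string& what,
                               const std::string& addr,
                               const std::function<void()>& start) {
    try {
        start();
    } catch (const std::system_error& e) {
        std::cerr << "ERROR init - " << describe_start_failure(what, addr, e) << "\n";
        throw;
    }
}
```

Rethrowing keeps the original "Exiting on unhandled exception" message and the exit, as described above; the extra log line only adds the missing context.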
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20170102121304.2060-1-nyh@scylladb.com>