5566 Commits

Author SHA1 Message Date
Avi Kivity
a8ff8ea442 commitlog: switch to faster crc32 implementation 2015-08-09 00:05:36 +03:00
Avi Kivity
d6351ecca7 utils: add crc32 class
C++ interface to the crc32 x86 instruction.
2015-08-09 00:05:33 +03:00
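
The x86 crc32 instruction mentioned above computes CRC-32C (Castagnoli polynomial). For reference, here is a minimal portable sketch of the same checksum, one bit at a time. It is not the class added by this commit: the names crc32c_update and crc32c are hypothetical, and the usual initial value and final inversion are assumed (the instruction itself applies neither).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: feeds `len` bytes into a running CRC-32C state.
// 0x82F63B78 is the bit-reflected Castagnoli polynomial.
inline uint32_t crc32c_update(uint32_t crc, const void* data, size_t len) {
    assert(data != nullptr || len == 0);
    const unsigned char* p = static_cast<const unsigned char*>(data);
    for (size_t i = 0; i < len; ++i) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; xor in the polynomial if the dropped bit was set.
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc;
}

// Hypothetical convenience wrapper: conventional CRC-32C with
// all-ones initial state and final bit inversion (both assumed here).
inline uint32_t crc32c(const std::string& s) {
    return ~crc32c_update(0xFFFFFFFFu, s.data(), s.size());
}
```

A hardware-backed version would replace the inner bit loop with the instruction, which is what makes it so much faster than a software loop like this one.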
Avi Kivity
70618762c3 build: require at least a Nehalem-class cpu
We want to use the crc32 instruction, which was made available
on Nehalem, so let's require it.  It's old enough to be present
everywhere.
2015-08-08 23:28:32 +03:00
Avi Kivity
4a5845ae60 Merge "Incremental eviction" from Tomasz
"This series enables incremental eviction of data from cache. The eviction is
controlled by the LSA tracker, which considers evictable regions as part of
its reclaim() method."
2015-08-08 14:39:13 +03:00
Tomasz Grabiec
5e677f4331 tests: Add row_cache eviction test 2015-08-08 09:59:24 +02:00
Tomasz Grabiec
ef549ae5a5 lsa: Reclaim space from evictable regions incrementally
When LSA reclaimer cannot reclaim more space by compaction, it
will reclaim data by evicting from evictable regions.

Currently the only evictable region is the one owned by the row cache.
2015-08-08 09:59:24 +02:00
Tomasz Grabiec
7a8f1ef6c3 row_cache: Replace _lru_len counter with region occupancy
_lru_len may get stale when row_cache instance goes out of scope
purging all its partitions from cache. I'm assuming we're not really
interested in the number of partitions here, but rather a measure of
occupancy, so I applied a simple fix of using LSA region occupancy
instead.
2015-08-08 09:59:24 +02:00
Tomasz Grabiec
bceeb301b7 tests: lsa: Add test for region merging 2015-08-08 09:59:24 +02:00
Tomasz Grabiec
a095b39091 lsa: Don't leak empty _active segment in merge() 2015-08-08 09:59:24 +02:00
Tomasz Grabiec
5b5c0038e6 lsa: Don't allocate aligned segments
Requiring alignment means that there must be 64K of contiguous space
to allocate each 32K segment. When memory is fragmented, we may fail
to allocate such segment, even though there's plenty of free space.

This especially hurts forward progress of compaction, which frees
segments randomly and relies on the fact that freeing a segment will
make it available to the next segment request.
2015-08-07 22:13:17 +02:00
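
The fragmentation argument above can be checked with a little arithmetic. The sketch below is illustrative only: fits is a hypothetical helper, not part of the LSA code, and it simply asks whether a free range can hold a block of a given size at a given alignment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: can the free range [start, start + len) hold a
// block of `size` bytes whose address is a multiple of `align`?
// `align` must be a power of two; align == 1 means "no alignment".
inline bool fits(uint64_t start, uint64_t len, uint64_t size, uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    // First suitably aligned address at or after `start`.
    uint64_t first = (start + align - 1) & ~(align - 1);
    return first + size <= start + len;
}
```

For example, a 48K free range starting at offset 8K cannot hold a 32K segment aligned to 32K, because the first aligned address is 32K and the segment would end at 64K, past the end of the range at 56K. The same range holds an unaligned 32K segment with room to spare.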
Tomasz Grabiec
64bd4bee94 lsa: Log segment closing and releasing on trace level 2015-08-07 22:06:15 +02:00
Tomasz Grabiec
02ff31b815 lsa: Reduce amount of calls to descriptor() in free() 2015-08-07 22:05:53 +02:00
Tomasz Grabiec
e3592a4a04 api: lsa: Invoke compaction on all shards 2015-08-07 22:05:53 +02:00
Avi Kivity
416d8f7799 sstables: don't pass temporary string to regex
Since the regex match returns views into that string, it must not be
a temporary. gcc 5.1's libstdc++ won't accept it, either.
2015-08-07 21:46:55 +03:00
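
A minimal sketch of the pattern this commit describes, using std::regex. The file name pattern and the parse_name helper are hypothetical and much simpler than the real sstable name parser. The point is only that the matched string is a named object that outlives the match results, and that sub-matches are copied out before it goes away.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical result type for this sketch.
struct parsed_name {
    std::string version;
    std::string generation;
    std::string component;
    bool ok;
};

// Hypothetical parser for names shaped like "la-1-Data.db".
inline parsed_name parse_name(const std::string& fname) {
    static const std::regex re(R"((\w+)-(\d+)-(\w+)\.db)");
    std::smatch m;  // holds iterators into `fname`, not copies
    parsed_name out{"", "", "", false};
    // `fname` is an lvalue that outlives `m`. Writing
    // std::regex_match(make_name(), m, re) with a temporary would leave
    // `m` dangling; since C++14 that overload is deleted.
    if (std::regex_match(fname, m, re)) {
        // .str() copies each sub-match, so the result owns its data.
        out = parsed_name{m[1].str(), m[2].str(), m[3].str(), true};
    }
    return out;
}
```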
Avi Kivity
a1543dc4f9 tests: mark fake variable as unused in logalloc_test
So that gcc 5.1 doesn't complain.
2015-08-07 21:32:09 +03:00
Glauber Costa
ae2ce78ee6 version: change all fields to uint16_t
Ok, shame on me: the version string was so obviously correct that I only
verified that the comparisons were working as expected.

Turns out it isn't: http://lists.boost.org/boost-users/2006/12/24194.php

boost::format will treat uint8_t arguments as char, and therefore we will end
up with the version string misprinted.

We can just cast it to uint16_t before we print, but since this is not exactly
a struct that we will be using all the time, let's favor readability over
saving a few bytes, and change all fields to uint16_t.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 20:25:20 +03:00
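
The misprint described above can be reproduced without boost. The sketch below assumes that boost::format formats its arguments through stream insertion, so std::ostringstream shows the same effect; format_version is a hypothetical helper, not the function changed by this commit.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: prints three version fields of type Field.
template <typename Field>
std::string format_version(Field major_, Field minor_, Field patch_) {
    std::ostringstream os;
    // On mainstream platforms uint8_t is unsigned char, so operator<<
    // writes the raw character with that code rather than a number.
    os << major_ << '.' << minor_ << '.' << patch_;
    return os.str();
}
```

With uint16_t fields, format_version<uint16_t>(2, 1, 8) yields "2.1.8". With uint8_t fields, the same call yields three control characters separated by dots.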
Tomasz Grabiec
5dc58a7cd4 allocation_strategy: Leak the standard strategy
Some code may attempt to use it during finalization after "instance"
was destroyed.

Reported by Pekka:

/usr/include/c++/4.9.2/bits/unique_ptr.h:291:14: runtime error:
reference binding to null pointer of type 'struct
standard_allocation_strategy'
./utils/allocation_strategy.hh:105:13: runtime error: reference
binding to null pointer of type 'struct standard_allocation_strategy'
./utils/allocation_strategy.hh:118:35: runtime error: reference
binding to null pointer of type 'struct allocation_strategy'
./utils/managed_bytes.hh:59:45: runtime error: member call on null
pointer of type 'struct allocation_strategy'
./utils/allocation_strategy.hh:82:9: runtime error: member access
within null pointer of type 'struct allocation_strategy'
2015-08-07 18:35:20 +03:00
Glauber Costa
5d3c7165d2 version: use a tuple internally.
As Avi suggested, we can use a tuple to make some comparisons more natural.
However, instead of doing a make_tuple on the comparison only, we can go
further and store the tuple internally.

I am still keeping the outer type, so it can host convenience functions like
to_sstring() and current().

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 18:24:54 +03:00
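
A minimal sketch of the idea, not the actual class: the fields live in a std::tuple, so comparisons come for free from the tuple's lexicographic operators, while the outer type still hosts convenience functions. The member names are assumptions, and to_string with std::string stands in for the to_sstring() mentioned above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>

// Hypothetical version type storing its fields as a tuple.
class version {
    std::tuple<uint16_t, uint16_t, uint16_t> _v;
public:
    version(uint16_t major_, uint16_t minor_, uint16_t patch_)
        : _v(major_, minor_, patch_) {}

    // Stand-in for to_sstring(); std::string is used for portability.
    std::string to_string() const {
        return std::to_string(std::get<0>(_v)) + "." +
               std::to_string(std::get<1>(_v)) + "." +
               std::to_string(std::get<2>(_v));
    }

    // std::tuple compares lexicographically: major, then minor, then patch.
    friend bool operator==(const version& a, const version& b) { return a._v == b._v; }
    friend bool operator<(const version& a, const version& b) { return a._v < b._v; }
    friend bool operator<=(const version& a, const version& b) { return a._v <= b._v; }
};
```

A check such as version(2, 1, 8) < version(2, 2, 0) then reads naturally at the call site, with no hand-written field-by-field comparison.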
Avi Kivity
33a03d4da4 Merge "Read C* 2.1.8 sstables - and vice versa" from Glauber
"With the present patchset, we are now able to
- start cassandra-2.1.8
- generate a keyspace and a table
- flush it
- exit it
- start scylla on that same directory

And it works, from the sstable point of view. Note that if we don't flush,
Cassandra-2.1.8 won't do that automatically (it seems to do for 2.2), and
things will be left in the commitlog. We are not replaying their commitlog,
so we won't see any data.

The reverse also works - starting scylla, creating a table, exiting it.
After starting cassandra-2.1.8 in the same directory, everything works fine.

There are still some minor issues left, but they are not showstoppers. I will
open individual issues for each of them."
2015-08-07 17:54:29 +03:00
Glauber Costa
92031be642 index_interval: another field for schema_columnfamilies
There is another field I missed, index_interval. It is not actually used for
2.1.8 - so that's why it is easy to stop, but it at least exists.

2.1.8 already has "min_index_interval" and "max_index_interval". If we see a
table that contains index_interval, that will become "min_index_interval".

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:54 -05:00
Glauber Costa
0237a73e05 system_keyspace: make collections multi-cell
They are multi-cell in Origin. This has nothing to do with 2.2 vs 2.1,
and it is just a plain bug.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:54 -05:00
Glauber Costa
d1f897b63b cell_name: always include default comparator for version 2.1.8 and lower
This is the biggest change from 2.2: for the 2.1 series, the default type is
always stored in the comparator for compound types.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:54 -05:00
Glauber Costa
d122b98c08 version: do not store current version as a string
The class I am presenting will make it easier for us to compare it with desired
versions so we can control proper behavior when needed.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:54 -05:00
Glauber Costa
f33f432474 schema: set is_all_components for compact columns
They should be set. As a result, those columns will have the index "null"
at the schema_columns table.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:54 -05:00
Glauber Costa
a7c1e16bc2 schema_tables: properly calculate index
We are currently assigning non-partition keys the index 0. That is not what
happens in Origin:

cqlsh> create table ks.twoclust \
        (ks int, cl1 int, cl2 int, r1 text, r2 text, primary key (ks, cl1, cl2));
cqlsh> select columnfamily_name, column_name, component_index \
        from system.schema_columns where keyspace_name='ks';

 columnfamily_name | column_name | component_index
-------------------+-------------+-----------------
          twoclust |         cl1 |               0
          twoclust |         cl2 |               1
          twoclust |          ks |            null
          twoclust |          r1 |               2
          twoclust |          r2 |               2

This is happening because we use column.position(), which has no knowledge of
the clustering keys at all.  We should instead pass that by the schema, which
will then do the right thing.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:54 -05:00
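
The rule implied by the cqlsh output above can be sketched as follows. This is modelled only on that one table, which has a single partition key, and is not the code from this commit; column_kind and component_index are hypothetical names.

```cpp
#include <cassert>
#include <optional>

// Hypothetical classification of a column within its table.
enum class column_kind { partition_key, clustering_key, regular };

// Hypothetical helper. `position` is the column's index among columns of
// its own kind; `clustering_key_count` must come from the schema, since a
// single column cannot know how many clustering columns precede it.
inline std::optional<int> component_index(column_kind kind, int position,
                                          int clustering_key_count) {
    switch (kind) {
    case column_kind::partition_key:
        return std::nullopt;          // shown as "null" for ks
    case column_kind::clustering_key:
        return position;              // cl1 -> 0, cl2 -> 1
    case column_kind::regular:
        return clustering_key_count;  // r1 and r2 -> 2
    }
    return std::nullopt;
}
```

The regular-column case is why the index has to be computed by the schema rather than by the column alone.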
Glauber Costa
21ebaeffae schema_builder: provide a build function that doesn't take compact storage.
We will invoke the schema builder from schema_tables.cc, and at that point, the
information about compact storage no longer exists anywhere. If we just call it
like this, it will be the same as calling it with compact_storage::no, which
will trigger a (wrong) recomputation for compact_storage::yes CFs

The best way to solve that is to make the compact_storage parameter mandatory
every time we create a new table - instead of defaulting to no. This will
ensure that the correct dense and compound calculations are always done when
calling the builder with a parameter, and not done at all when we call it
without a parameter.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:54 -05:00
Glauber Costa
6d7a3d2f0a schema: always rebuild the schema
If we alter the compound property, we also have to rebuild the schema,
since some aspects of the columns depend on it. Let's just go ahead and
always rebuild the schema.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:53 -05:00
Glauber Costa
68bdb1d0c1 schema: set thrift properties in schema, not schema builder
We will use those properties during initialization - for instance, to calculate
thrift_bits.is_on_all_components. In order to do that, it has to be available at
schema creation, and not through the schema builder.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:53 -05:00
Glauber Costa
4bfd5b9f65 schema_tables: handle caching options
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:53 -05:00
Glauber Costa
498824971d schema: handle caching options
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:53 -05:00
Glauber Costa
97e965aac8 add caching_options
This is basically a glorified map<sstring, sstring>, that does some validation
on the options. Analogous structures in the past were put directly at
schema.hh, but I will keep this one separate because it got slightly more
complex than the average.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:53 -05:00
Avi Kivity
ea485fdb0b Merge "SSTables format change" from Glauber
"This still doesn't achieve compatibility with Cassandra-2.1.8, my next series
will. But let's merge this one first so the rest can be reviewed separately."
2015-08-07 17:12:14 +03:00
Glauber Costa
c2a0232048 database: generate UUIDs compatible with Cassandra 2.1.8
Without this, Cassandra won't even try to read our sstables. The containing
directories will be ignored.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:56 -05:00
Glauber Costa
c19819290a system.size_estimates: define schema
This table exists in 2.1.8, and although it is dropped in 2.2, we
should at least list its schema.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:56 -05:00
Glauber Costa
28c0498bb6 system.local: add more fields
2.1.8 tables have 3 more fields in their system tables, that 2.2 don't.
Since we aim at 2.1 compatibility, we have to include them.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:56 -05:00
Glauber Costa
e9ec0ad7f7 schema_columnfamilies: add columns present in 2.1.8 version
They do not exist in 2.2, and don't serve a huge purpose. But we will
need them for compatibility with 2.1

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:55 -05:00
Glauber Costa
8e44aa214b schema: extend column accessors to allow access to compact columns
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:55 -05:00
Glauber Costa
9ee7b5d8f2 legacy schema tables: disable unused tables
Let's leave their schema in here, since it's ready and we may need them in the
future. But since they are not present in 2.1.8, we will remove them from the
schema list.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:55 -05:00
Glauber Costa
7bbf8c2a6f sstable types: correctly state version of metadata field
Don't let the current name fool you: Having this listed as "la" here
was just lack of discipline on my part. I meant by it "the format from
which we are importing" - which was named la for Origin. I wasn't
really thinking at the time that it would be dangerous to stop between
versions.

This should read ka, not la.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:55 -05:00
Glauber Costa
c8ca9b376d database: change default sstable version
Let's change the default generated tables to ka, which is the one that is present
in Origin

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:55 -05:00
Glauber Costa
2d1b965f91 database: change filename parser to also accept ka
A ka file has a slightly different name on disk. Change the
parser so we can deal with both

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:55 -05:00
Glauber Costa
a7f88004be sstables: build a descriptor from filename
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:55 -05:00
Glauber Costa
859dc58511 sstables: construct filename for ka sstables
A helper struct - entry_descriptor - is introduced to aid in this goal.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:55 -05:00
Glauber Costa
976de6f6f4 sstables: get cf and ks strings for filename
We will need them to properly build names in some situations.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:55 -05:00
Glauber Costa
d5a5ee98f0 sstables: add new version
We'll keep the old one around. Eventually we'll need it.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:55 -05:00
Glauber Costa
cd8c9ad288 sstables: add ks and cf name to sstable constructor
When a schema is available, we use it. However, we have, by now, way too many
tests. Some of them use tables for which we don't even know the schema. It would
have been a massive amount of work to require a schema for all of them - so I am
keeping both constructors around.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:55 -05:00
Glauber Costa
2cbfe261e3 sstables: reuse code for filename
We have currently two versions of filename: one static, where the caller has to
pass all parameters, and an internal one where those parameters are derived
from the sstable attributes. Implement the latter in terms of the former so
making changes gets easier.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:55 -05:00
Glauber Costa
77e06c3ab1 sstables: remove name parameter
It is currently only used to log a message, and for that we have an sstable
method that will do just fine. Using the name itself just makes it being passed
along throughout the captures.  Remove it.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:53 -05:00
Glauber Costa
8a3c935c21 sstables: component_from_sstring
Analogous to version and format

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:18:09 -05:00
Tomasz Grabiec
3e1fe072d4 Merge branch 'penberg/fix-heap-overflow-in-bytes-ostream/v2' from seastar-dev.git
Fix for heap overflow in bytes_ostream::append() from Pekka.
2015-08-07 13:13:47 +03:00