There are places in which we need to use the column family object many
times, with deferring points in between. Because the column family may
have been destroyed in the deferring point, we need to go and find it
again.
If we use lw_shared_ptr, however, we'll be able to at least guarantee
that the object will be alive. Some users will still need to check, if
they want to guarantee that the column family wasn't removed. But others
that only need to make sure we don't access an invalid object will be
able to avoid the cost of re-finding it just fine.
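A minimal sketch of the lifetime guarantee described above. It uses std::shared_ptr as a stand-in for lw_shared_ptr, and the database, table, find and drop names are hypothetical, chosen only for illustration:

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins; std::shared_ptr plays the role of lw_shared_ptr.
struct table {
    std::string name;
    int writes = 0;
};

struct database {
    std::map<std::string, std::shared_ptr<table>> tables;

    // Returns an owning pointer, or nullptr if the table is not registered.
    std::shared_ptr<table> find(const std::string& name) const {
        auto it = tables.find(name);
        if (it == tables.end()) {
            return nullptr;
        }
        return it->second;
    }

    void drop(const std::string& name) { tables.erase(name); }
};

// Simulates work split by a deferring point during which the table is dropped.
// Returns the write count observed through the pointer held across the drop.
int use_across_deferring_point(database& db, const std::string& name) {
    auto t = db.find(name);  // our reference is held across the deferring point
    t->writes++;
    db.drop(name);           // what another fiber might do while we are deferred
    t->writes++;             // still valid: our reference keeps the object alive
    return t->writes;
}

// Callers that care about removal still have to look the table up again.
bool still_registered(const database& db, const std::string& name) {
    return db.find(name) != nullptr;
}
```

The second helper reflects the caveat in the message: holding the pointer guarantees a valid object, not that the column family is still registered.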
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <722bf49e158da77ff509372c2034e5707706e5bf.1478111467.git.glauber@scylladb.com>
Nothing fundamentally ties the estimated histogram to sstables. This patch
gets rid of the few incidental ties. They are:
- the namespace name, which is now moved to utils. Users inside sstables/
now need to add a namespace prefix, while the ones outside have to change
it to the right one
- sstables::merge, which has a very non-descriptive name to begin with, is
changed to a more descriptive name that can live inside utils/
- the disk_types.hh include has to be removed - but it had no reason to be
here in the first place.
What remains to be done is to actually move the file outside sstables/. That is
done in a separate step for clarity.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
get_sstables_including_compacted_undeleted() may return a temporary shared
pointer, which is destroyed before the loop runs unless it is stored locally.
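The pattern can be sketched as follows. The accessor name make_generation_list is hypothetical and stands in for any function that returns a freshly built list by shared pointer. The hazard described in the comment applies to C++ standards before C++23:

```cpp
#include <memory>
#include <vector>

// Hypothetical accessor: builds a fresh list on every call and returns it
// by value, so the caller receives the only reference.
std::shared_ptr<std::vector<int>> make_generation_list() {
    return std::make_shared<std::vector<int>>(std::vector<int>{1, 2, 3});
}

// Risky form (intentionally not compiled here):
//   for (int g : *make_generation_list()) { ... }
// The temporary shared_ptr is destroyed once the range expression has been
// evaluated, so the loop would iterate over a freed vector.

// Safe form: keep the pointer in a local for the duration of the loop.
int sum_generations() {
    auto list = make_generation_list();
    int sum = 0;
    for (int g : *list) {
        sum += g;
    }
    return sum;
}
```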
Fixes #1514
Message-Id: <20160728100504.GD2502@scylladb.com>
sstable_list is now a map<generation, sstable>; change it to a set
in preparation for replacing it with sstable_set. The change simplifies
a lot of code; the only casualty is the code that computes the highest
generation number.
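With an ordered map keyed by generation, the highest generation is simply the last key. With a set of sstables it needs a scan. A rough sketch, assuming an unordered set of shared pointers and a hypothetical sstable struct that only carries its generation:

```cpp
#include <algorithm>
#include <cstdint>
#include <memory>
#include <unordered_set>

// Hypothetical, heavily reduced types for illustration only.
struct sstable {
    int64_t generation;
};
using shared_sstable = std::shared_ptr<sstable>;
using sstable_list = std::unordered_set<shared_sstable>;

// Scans the whole set; returns 0 when there are no sstables.
int64_t highest_generation(const sstable_list& sstables) {
    int64_t highest = 0;
    for (const auto& sst : sstables) {
        highest = std::max(highest, sst->generation);
    }
    return highest;
}
```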
The space calculation counters in column family had two problems:
1. The total bytes is an ever-growing counter, which is meaningless for
the API.
2. Simply summing the sizes on all shards ignores the fact that the
same sstable file can be referenced by multiple shards. This is
especially noticeable during migration.
To solve this, the implementation was modified so that, instead of
collecting the sizes, the API collects a map of file name to size
and then does the summing.
This removes the duplicates and fixes the total bytes calculation.
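The deduplication idea can be sketched like this. The size_map alias and both function names are hypothetical; the real code runs across shards rather than over an in-memory vector:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// File name -> size in bytes, as reported by one shard.
using size_map = std::map<std::string, int64_t>;

// Merges the per-shard maps. A file referenced by several shards ends up
// with a single entry, so it is counted only once.
size_map merge_shard_maps(const std::vector<size_map>& per_shard) {
    size_map merged;
    for (const auto& shard : per_shard) {
        for (const auto& entry : shard) {
            merged[entry.first] = entry.second;
        }
    }
    return merged;
}

// Sums the sizes of the deduplicated files.
int64_t total_size(const size_map& files) {
    int64_t total = 0;
    for (const auto& entry : files) {
        total += entry.second;
    }
    return total;
}
```

For two shards that both reference the same file, a naive per-shard sum counts that file twice, while the merged map counts it once.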
Calling cfstats before the change, under load, after a compaction happened:
$ nodetool cfstats keyspace1
Keyspace: keyspace1
Verify write latency 1068253.0 76435
Read Count: 75915
Read Latency: 0.5953986037015082 ms.
Write Count: 76435
Write Latency: 0.013975966507490025 ms.
Pending Flushes: 0
Table: standard1
SSTable count: 5
Space used (live): 44261215
Space used (total): 219724478
After the fix:
$ nodetool cfstats keyspace1
Keyspace: keyspace1
Verify write latency 1863206.0 124219
Read Count: 125401
Read Latency: 0.9381053978835895 ms.
Write Count: 124219
Write Latency: 0.01499936402643718 ms.
Pending Flushes: 0
Table: standard1
SSTable count: 6
Space used (live): 50402904
Space used (total): 50402904
Space used by snapshots (total): 0
Fixes: #1042
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1464518757-14666-2-git-send-email-amnon@scylladb.com>
The API now exposes the rate_moving_average and
rate_moving_average_and_histogram.
The old endpoints remain for the transition period, but are marked as
deprecated.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
After this change, users can query the compression ratio on a per column
family basis with 'nodetool cfstats'.
Example 'nodetool cfstats' output:
./bin/nodetool cfstats ks.test5
Keyspace: ks
Read Count: 0
Read Latency: NaN ms.
Write Count: 0
Write Latency: NaN ms.
Pending Flushes: 0
Table: test5
SSTable count: 1
Space used (live): 4774
Space used (total): 4774
Space used by snapshots (total): 0
Off heap memory used (total): 131384
SSTable Compression Ratio: 0.833333
...
Fixes #636.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <a1bee5a23fe63787df3e387a88f2d216ba4a4134.1459802771.git.raphaelsc@scylladb.com>
The helper functions for summing statistics over the column families are
template functions that infer the return type according to the type of the
init parameter.
In the API the return value should be int64_t; passing an integer would
cause the number to wrap around.
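A small sketch of why the type of the init argument matters. The helper sum_stats is hypothetical and only mimics the shape of a summing template whose return type follows init:

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical helper: the accumulator (and return) type is whatever type
// the caller passes as init.
template <typename T, typename I>
I sum_stats(const std::vector<T>& values, I init) {
    for (const auto& v : values) {
        init += v;
    }
    return init;
}

// A literal 0 makes the result an int, even when the inputs are 64-bit.
static_assert(
    std::is_same<decltype(sum_stats(std::vector<int64_t>{}, 0)), int>::value,
    "int init yields an int result");

// An int64_t init keeps the full 64-bit range.
static_assert(
    std::is_same<decltype(sum_stats(std::vector<int64_t>{}, int64_t{0})), int64_t>::value,
    "int64_t init yields an int64_t result");
```

Calling it as sum_stats(sizes, int64_t{0}) is the form that can represent totals above 2^31.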
A partial output from nodetool cfstats after the fix:
nodetool cfstats keyspace1
Keyspace: keyspace1
Read Count: 0
Read Latency: NaN ms.
Write Count: 4050000
Write Latency: 0.009178098765432099 ms.
Pending Flushes: 0
Table: standard1
SSTable count: 12
Space used (live): 1118617445
Space used (total): 23336562465
Fixes #682
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
This adds the implementation for the index_summary_off_heap_memory for a
single column family and for all of them.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Objects extending json_base are not movable, so we won't be able to
pass them via future<>, which will assert that types are nothrow move
constructible.
This problem only affects httpd::utils_json::histogram, which is used
in map-reduce. This patch changes the aggregation to work on the domain
value (utils::ihistogram) instead of json objects.
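The constraint can be illustrated with a compile-time check. Both structs below are hypothetical stand-ins: json_histogram mimics a type whose user-declared copy constructor suppresses the implicit move, and plain_histogram mimics a simple domain value:

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a json_base-derived type. The user-declared
// copy constructor may throw and suppresses the implicit move constructor.
struct json_histogram {
    json_histogram() = default;
    json_histogram(const json_histogram&) {}
    int64_t count = 0;
};

// Hypothetical plain domain value with implicit, non-throwing moves.
struct plain_histogram {
    int64_t count = 0;
    int64_t sum = 0;
};

static_assert(!std::is_nothrow_move_constructible<json_histogram>::value,
              "cannot travel through a future-like type that requires this");
static_assert(std::is_nothrow_move_constructible<plain_histogram>::value,
              "plain values satisfy the requirement");

// Reduce on the plain values; conversion to JSON would happen once, at the end.
plain_histogram reduce(const std::vector<plain_histogram>& per_shard) {
    plain_histogram total;
    for (const auto& h : per_shard) {
        total.count += h.count;
        total.sum += h.sum;
    }
    return total;
}
```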
This patch adds the column family API that returns the snapshot size.
The changes in the swagger definition file follow origin, so the same API will be used for the metric and the
column_family.
The implementation is based on get_snapshot_details in the
column_family.
Fixes #425
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
This series adds a histogram to the column family for live scanned and
tombstone scanned.
It exposes those histograms via the API instead of the stub implementation
that currently exists.
The implementation update of the histogram will be added in a different
series.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This patch fixes an issue with the read latency estimated histogram
implementation and adds a call to the estimated number of sstable
histogram.
The latter is not yet implemented on the database side.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the implementation that returns the estimated total latency of
the reads and of the writes.
First, the method that sums the count was renamed to get_cf_stats_count,
and a method named get_cf_stats_sum was added to sum the estimated
latencies.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the API definition with a stub implementation that allows
nodetool cfstats to run.
After this patch the nodetool cfstats command works, but with a stub
implementation.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
For us, everything is "off heap", so this will just be the total amount of
memory used by the filters.
Fixes #339
Signed-off-by: Glauber Costa <glommer@scylladb.com>
The following functions were added to column family:
is_auto_compaction_disabled
get_built_indexes
get_compression_metadata_off_heap_memory_used
get_compression_parameters
get_compression_ratio
get_read_latency_estimated_histogram
get_write_latency_estimated_histogram
Also added are the get and set compaction strategy methods and a stub
implementation for the compression parameter, crc check and sstable
count.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
The bloom filter memory calculation is missing. As a workaround until
it is completed, the memory calculation will return 0.
It is needed by the nodetool info command.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This patch clears the ambiguity in the swagger definition file and adds
the implementation for the memtable memory related methods.
For each column family there is an active memtable and a list of
non-active ones.
When referring to all the memtables in the column family, the nickname
will contain cf_all_memtables.
Each URL has two versions, one, with a column family name, that is
relevant to a specific column family and one without, which is the
result of running the method on all column families.
This patch adds the following implementation to column_family:
get_memtable_on_heap_size
get_all_memtable_on_heap_size
get_memtable_off_heap_size
get_all_memtable_off_heap_size
get_memtable_live_data_size
get_all_memtable_live_data_size
get_all_memtables_on_heap_size
get_all_all_memtables_on_heap_size
get_all_memtables_off_heap_size
get_all_all_memtables_off_heap_size
get_all_memtables_live_data_size
get_all_all_memtables_live_data_size
Memory consumption is mapped this way: all memory is assumed to be off heap, so
on heap will return 0, and off heap will return the memory consumption.
After this patch the following URLs will be available:
/column_family/metrics/memtable_on_heap_size/{name}
/column_family/metrics/memtable_on_heap_size
/column_family/metrics/memtable_off_heap_size/{name}
/column_family/metrics/memtable_off_heap_size
/column_family/metrics/memtable_live_data_size/{name}
/column_family/metrics/memtable_live_data_size
/column_family/metrics/all_memtables_on_heap_size/{name}
/column_family/metrics/all_memtables_on_heap_size
/column_family/metrics/all_memtables_off_heap_size/{name}
/column_family/metrics/all_memtables_off_heap_size
/column_family/metrics/all_memtables_live_data_size/{name}
/column_family/metrics/all_memtables_live_data_size
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
"This series modifies the stub implementation of unimplemented API methods to
return a 500 HTTP error.
It does so by adding a new API exception, unimplemented_exception, and a helper
function, unimplemented, that throws that exception.
A call to unimplemented was added to each of the stub API methods.
After this series, a call to an unimplemented API returns a 500."
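A rough sketch of the mechanism, under stated assumptions: the class layout, the message text, api_exception's status field and the status_of wrapper are all hypothetical, and only the names unimplemented_exception and unimplemented come from the series description:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical base: an exception that carries the HTTP status to return.
struct api_exception : std::runtime_error {
    int status;
    api_exception(int s, const std::string& msg)
        : std::runtime_error(msg), status(s) {}
};

// Assumed shape of the new exception: always maps to a 500.
struct unimplemented_exception : api_exception {
    unimplemented_exception()
        : api_exception(500, "API call is not supported yet") {}
};

// Helper that stub handlers call first.
inline void unimplemented() {
    throw unimplemented_exception();
}

// A stub handler. The return statement stays so that the return type
// remains deducible, even though it is never reached.
inline int stub_handler() {
    unimplemented();
    return 0;
}

// Maps a handler outcome to an HTTP status, as a server loop might.
template <typename Handler>
int status_of(Handler h) {
    try {
        h();
        return 200;
    } catch (const api_exception& e) {
        return e.status;
    }
}
```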
Some APIs other than the column_family need to use get_cf_stats.
This adds the helper method declaration to column_family.hh and
changes the implementation declaration to be non-static.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
The API contains stub API methods. This adds a call to the unimplemented
method in each stubbed method that is not implemented.
The return remains the same to help the compiler deduce the return type
of the lambda function.
After this patch a call to an unimplemented API function will return
500.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the column family mean row size in the per column family and
the total version. It uses the ratio_helper class to calculate the mean
over all the shards.
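Computing a mean across shards means combining numerators and denominators before dividing, since averaging the per-shard means gives a different answer. A minimal sketch with a hypothetical mean_accumulator struct, which is not the actual ratio_helper:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical accumulator: keeps the total and the count separately so
// that shards can be combined before the division.
struct mean_accumulator {
    double sum = 0;
    int64_t count = 0;

    mean_accumulator& operator+=(const mean_accumulator& other) {
        sum += other.sum;
        count += other.count;
        return *this;
    }

    // Returns 0 when nothing was recorded, to avoid dividing by zero.
    double mean() const {
        return count ? sum / static_cast<double>(count) : 0.0;
    }
};

// Combines the per-shard accumulators and divides once at the end.
double mean_over_shards(const std::vector<mean_accumulator>& shards) {
    mean_accumulator total;
    for (const auto& s : shards) {
        total += s;
    }
    return total.mean();
}
```

With one shard holding 4 rows totalling 100 bytes and another holding 1 row of 20 bytes, the combined mean is 24, whereas the average of the two per-shard means would be 22.5.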
This patch uses the now existing infrastructure to expose statistics about the bloom
filters hit/miss rates.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Add an API function that returns the count of sstables in L0 if leveled
compaction strategy is enabled, and 0 otherwise. Currently, we don't
support leveled compaction strategy, so the function always returns
zero.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
"This series cleans the streaming_histogram and the estimated histogram that
were imported from origin, then uses them to get the estimated min and max row
size in the API."
This adds the implementation of the row size histogram to the API.
It adds a map_cf method that performs a map operation over all column
families on the different shards.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the implementation for min and max row size in column family.
It uses the column family map reduce helper function with the additional
function to get the min and max row size.
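The map-reduce shape can be sketched as follows. The cf_stats struct, the shard_list alias and map_reduce_cf are hypothetical simplifications: the real helper runs across shards asynchronously, while this version walks nested vectors:

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical per-column-family statistics.
struct cf_stats {
    int64_t min_row_size;
    int64_t max_row_size;
};

// One inner vector per shard, one entry per column family on that shard.
using shard_list = std::vector<std::vector<cf_stats>>;

// Applies mapper to every column family and folds the results with reducer.
template <typename Mapper, typename Reducer>
int64_t map_reduce_cf(const shard_list& shards, int64_t init,
                      Mapper mapper, Reducer reducer) {
    for (const auto& shard : shards) {
        for (const auto& cf : shard) {
            init = reducer(init, mapper(cf));
        }
    }
    return init;
}

int64_t min_row_size(const shard_list& shards) {
    return map_reduce_cf(
        shards, std::numeric_limits<int64_t>::max(),
        [](const cf_stats& s) { return s.min_row_size; },
        [](int64_t a, int64_t b) { return std::min(a, b); });
}

int64_t max_row_size(const shard_list& shards) {
    return map_reduce_cf(
        shards, int64_t{0},
        [](const cf_stats& s) { return s.max_row_size; },
        [](int64_t a, int64_t b) { return std::max(a, b); });
}
```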
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
With the change in column_family stats, the API needs to get the counter
from the read and write histogram.
It also adds the implementation for the read and write latency histogram.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the API implementation for the read, write, number of
pending flushes and memtable switch count.
The implementation uses a helper function to perform map and map_reduce
on column_family.
The get_uuid helper method now supports both colon notations (i.e.
either as a ":" or as %3A)
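Accepting both notations could look roughly like the sketch below. The function split_cf_name is hypothetical and only splits the name. It does not perform the UUID lookup, and it does not handle a lowercase "%3a":

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Splits "keyspace:table" or "keyspace%3Atable" into its two parts.
// Throws std::invalid_argument when neither separator is present.
std::pair<std::string, std::string> split_cf_name(const std::string& name) {
    std::string::size_type pos = name.find("%3A");
    std::string::size_type sep_len = 3;
    if (pos == std::string::npos) {
        pos = name.find(':');
        sep_len = 1;
    }
    if (pos == std::string::npos) {
        throw std::invalid_argument("expected keyspace:table, got " + name);
    }
    return std::make_pair(name.substr(0, pos), name.substr(pos + sep_len));
}
```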
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the API implementation to the column family API.
After this patch the following API will be supported:
/column_family/name
/column_family
/column_family/name/keyspace