When compacting a fully expired sstable, we're not allowing that sstable
to be purged because expired cell is *unconditionally* converted into a
dead cell. Why not check if the expired cell can be purged instead using
gc before and max purgeable timestamp?
Currently, we need two compactions to get rid of a fully expired sstable
which cells could have always been purged.
look at this sstable with expired cell:
{
"partition" : {
"key" : [ "2" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 120,
"liveness_info" : { "tstamp" : "2017-04-09T17:07:12.702597Z",
"ttl" : 20, "expires_at" : "2017-04-09T17:07:32Z", "expired" : true },
"cells" : [
{ "name" : "country", "value" : "1" },
]
now this sstable data after first compaction:
[shard 0] compaction - Compacted 1 sstables to [...]. 120 bytes to 79
(~65% of original) in 229ms = 0.000328997MB/s.
{
...
"rows" : [
{
"type" : "row",
"position" : 79,
"cells" : [
{ "name" : "country", "deletion_info" :
{ "local_delete_time" : "2017-04-09T17:07:12Z" },
"tstamp" : "2017-04-09T17:07:12.702597Z"
},
]
now another compaction will actually get rid of data:
compaction - Compacted 1 sstables to []. 79 bytes to 0 (~0% of original)
in 1ms = 0MB/s. ~2 total partitions merged to 0
NOTE:
It's a waste of time to wait for second compaction because the expired
cell could have been purged at first compaction because it satisfied
gc_before and max purgeable timestamp.
Fixes#2249, #2253
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170413001049.9663-1-raphaelsc@scylladb.com>
(cherry picked from commit a6f8f4fe24)