alternator-test: reproduce bug in Expected with EQ of set value

Our implementation of the "EQ" operator in Expected (conditional operation) just compares the JSON representation of the values. This is almost always correct, but it is incorrect for sets, where two sets can be equal despite listing their elements in a different order.

This patch just adds an (xfailing) test for this bug. The bug itself can be fixed in the future in one of several ways, including changing the implementation of EQ or changing the serialization of sets so they are always sorted in the same way.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190909125147.16484-1-nyh@scylladb.com>
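To illustrate the problem described in the message above, here is a minimal Python sketch. It is not Alternator's code (the server is not written in Python); `eq_by_json` and `eq_set_aware` are hypothetical helpers, and the values use DynamoDB's typed JSON form (`{"SS": [...]}` for a string set). The sketch also ignores details such as normalizing numbers inside number sets.

```python
import json

# DynamoDB type tags that denote sets (string, number, binary).
SET_TYPES = ("SS", "NS", "BS")

def eq_by_json(a, b):
    """Naive equality: compare the serialized JSON text of both values."""
    return json.dumps(a) == json.dumps(b)

def eq_set_aware(a, b):
    """Hypothetical fix: compare set-typed values regardless of element order."""
    # Each typed value is a single-entry dict such as {"SS": ["cat", "dog"]}.
    (type_a, val_a), = a.items()
    (type_b, val_b), = b.items()
    if type_a == type_b and type_a in SET_TYPES:
        return sorted(val_a) == sorted(val_b)
    # Non-set values keep the naive comparison.
    return eq_by_json(a, b)

stored = {"SS": ["dog", "cat"]}
expected = {"SS": ["cat", "dog"]}

print(eq_by_json(stored, expected))    # the two serializations differ
print(eq_set_aware(stored, expected))  # same elements, so the sets are equal
```

The second approach mentioned in the message, always serializing sets in sorted order, would let the naive comparison keep working, at the cost of sorting on every write.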
view: handle multiple regular base columns in view pk
2019-09-10 17:06:41 +03:00 · 2019-09-05 19:31:11 +03:00 · 2019-09-05 17:14:23 +03:00 · 2019-09-05 16:24:35 +03:00 · 2019-09-05 14:40:43 +03:00 · 2019-09-05 14:37:51 +03:00
2770 changed files with 12125 additions and 38634 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -19,8 +19,3 @@ CMakeLists.txt.user
 __pycache__CMakeLists.txt.user
 .gdbinit
 resources
-.pytest_cache
-/expressions.tokens
-tags
-testlog/*
-test/*/*.reject
--- a/.gitmodules
+++ b/.gitmodules
@@ -1,6 +1,6 @@
 [submodule "seastar"]
 	path = seastar
-	url = ../scylla-seastar
+	url = ../seastar
 	ignore = dirty
 [submodule "swagger-ui"]
 	path = swagger-ui
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -97,7 +97,7 @@ scan_scylla_source_directories(
          service
          sstables
          streaming
-          test
+          tests
          thrift
          tracing
          transport
--- a/HACKING.md
+++ b/HACKING.md
@@ -56,7 +56,7 @@ $ ./configure.py --help

 The most important option is:

- `--enable-dpdk`: [DPDK](http://dpdk.org/) is a set of libraries and drivers for fast packet processing. During development, it's not necessary to enable support even if it is supported by your platform.
+- `--{enable,disable}-dpdk`: [DPDK](http://dpdk.org/) is a set of libraries and drivers for fast packet processing. During development, it's not necessary to enable support even if it is supported by your platform.

 Source files and build targets are tracked manually in `configure.py`, so the script needs to be updated when new files or targets are added or removed.

--- a/docs/IDL.md
+++ b/docs/IDL.md
--- a/31
+++ b/31
@@ -5,6 +5,8 @@ F: Filename, directory, or pattern for the subsystem
 ---

 AUTH
+M: Paweł Dziepak <pdziepak@scylladb.com>
+M: Duarte Nunes <duarte@scylladb.com>
 R: Calle Wilund <calle@scylladb.com>
 R: Vlad Zolotarov <vladz@scylladb.com>
 R: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
@@ -12,17 +14,22 @@ F: auth/*

 CACHE
 M: Tomasz Grabiec <tgrabiec@scylladb.com>
+M: Paweł Dziepak <pdziepak@scylladb.com>
 R: Piotr Jastrzebski <piotr@scylladb.com>
 F: row_cache*
 F: *mutation*
 F: tests/mvcc*

 COMMITLOG / BATCHLOGa
+M: Paweł Dziepak <pdziepak@scylladb.com>
+M: Duarte Nunes <duarte@scylladb.com>
 R: Calle Wilund <calle@scylladb.com>
 F: db/commitlog/*
 F: db/batch*

 COORDINATOR
+M: Paweł Dziepak <pdziepak@scylladb.com>
+M: Duarte Nunes <duarte@scylladb.com>
 R: Gleb Natapov <gleb@scylladb.com>
 F: service/storage_proxy*

@@ -42,10 +49,12 @@ M: Pekka Enberg <penberg@scylladb.com>
 F: cql3/*

 COUNTERS
+M: Paweł Dziepak <pdziepak@scylladb.com>
 F: counters*
 F: tests/counter_test*

 GOSSIP
+M: Duarte Nunes <duarte@scylladb.com>
 M: Tomasz Grabiec <tgrabiec@scylladb.com>
 R: Asias He <asias@scylladb.com>
 F: gms/*
@@ -56,11 +65,14 @@ F: dist/docker/*

 LSA
 M: Tomasz Grabiec <tgrabiec@scylladb.com>
+M: Paweł Dziepak <pdziepak@scylladb.com>
 F: utils/logalloc*

 MATERIALIZED VIEWS
+M: Duarte Nunes <duarte@scylladb.com>
 M: Pekka Enberg <penberg@scylladb.com>
-M: Nadav Har'El <nyh@scylladb.com>
+R: Nadav Har'El <nyh@scylladb.com>
+R: Duarte Nunes <duarte@scylladb.com>
 F: db/view/*
 F: cql3/statements/*view*

@@ -70,12 +82,14 @@ F: dist/*

 REPAIR
 M: Tomasz Grabiec <tgrabiec@scylladb.com>
+M: Duarte Nunes <duarte@scylladb.com>
 R: Asias He <asias@scylladb.com>
 R: Nadav Har'El <nyh@scylladb.com>
 F: repair/*

 SCHEMA MANAGEMENT
 M: Tomasz Grabiec <tgrabiec@scylladb.com>
+M: Duarte Nunes <duarte@scylladb.com>
 M: Pekka Enberg <penberg@scylladb.com>
 F: db/schema_tables*
 F: db/legacy_schema_migrator*
@@ -84,13 +98,15 @@ F: schema*

 SECONDARY INDEXES
 M: Pekka Enberg <penberg@scylladb.com>
-M: Nadav Har'El <nyh@scylladb.com>
+M: Duarte Nunes <duarte@scylladb.com>
+R: Nadav Har'El <nyh@scylladb.com>
 R: Pekka Enberg <penberg@scylladb.com>
 F: db/index/*
 F: cql3/statements/*index*

 SSTABLES
 M: Tomasz Grabiec <tgrabiec@scylladb.com>
+M: Duarte Nunes <duarte@scylladb.com>
 R: Raphael S. Carvalho <raphaelsc@scylladb.com>
 R: Glauber Costa <glauber@scylladb.com>
 R: Nadav Har'El <nyh@scylladb.com>
@@ -98,17 +114,18 @@ F: sstables/*

 STREAMING
 M: Tomasz Grabiec <tgrabiec@scylladb.com>
+M: Duarte Nunes <duarte@scylladb.com>
 R: Asias He <asias@scylladb.com>
 F: streaming/*
 F: service/storage_service.*

-ALTERNATOR
-M: Nadav Har'El <nyh@scylladb.com>
-F: alternator/*
-F: alternator-test/*
+THRIFT TRANSPORT LAYER
+M: Duarte Nunes <duarte@scylladb.com>
+F: thrift/*

 THE REST
 M: Avi Kivity <avi@scylladb.com>
+M: Paweł Dziepak <pdziepak@scylladb.com>
+M: Duarte Nunes <duarte@scylladb.com>
 M: Tomasz Grabiec <tgrabiec@scylladb.com>
-M: Nadav Har'El <nyh@scylladb.com>
 F: *
--- a/README-DPDK.md
+++ b/README-DPDK.md
@@ -0,0 +1,29 @@
+Seastar and DPDK
+================
+
+Seastar uses the Data Plane Development Kit to drive NIC hardware directly.  This
+provides an enormous performance boost.
+
+To enable DPDK, specify `--enable-dpdk` to `./configure.py`, and `--dpdk-pmd` as a
+run-time parameter.  This will use the DPDK package provided as a git submodule with the
+seastar sources.
+
+To use your own self-compiled DPDK package, follow this procedure:
+
+1. Setup host to compile DPDK:
+   - Ubuntu 
+     `sudo apt-get install -y build-essential linux-image-extra-$(uname -r)` 
+2. Prepare a DPDK SDK:
+   - Download the latest DPDK release: `wget http://dpdk.org/browse/dpdk/snapshot/dpdk-1.8.0.tar.gz`
+   - Untar it.
+   - Edit config/common_linuxapp: set CONFIG_RTE_MBUF_REFCNT and CONFIG_RTE_LIBRTE_KNI to 'n'.
+   - For DPDK 1.7.x: edit config/common_linuxapp: 
+     - Set CONFIG_RTE_LIBRTE_PMD_BOND  to 'n'.
+     - Set CONFIG_RTE_MBUF_SCATTER_GATHER to 'n'.
+     - Set CONFIG_RTE_LIBRTE_IP_FRAG to 'n'.
+   - Start the tools/setup.sh script as root.
+   - Compile a linuxapp target (option 9).
+   - Install IGB_UIO module (option 11).
+   - Bind some physical port to IGB_UIO (option 17).
+   - Configure hugepage mappings (option 14/15).
+3. Run a configure.py: `./configure.py --dpdk-target <Path to untared dpdk-1.8.0 above>/x86_64-native-linuxapp-gcc`.
--- a/README.md
+++ b/README.md
@@ -27,10 +27,10 @@ Please see [HACKING.md](HACKING.md) for detailed information on building and dev

 ```

-* run Scylla with one CPU and ./tmp as work directory
+* run Scylla with one CPU and ./tmp as data directory

 ```
-./build/release/scylla --workdir tmp --smp 1
+./build/release/scylla --datadir tmp --commitlog-directory tmp --smp 1
 ```

 * For more run options:
@@ -38,24 +38,6 @@ Please see [HACKING.md](HACKING.md) for detailed information on building and dev
 ./build/release/scylla --help
 ```

-## Scylla APIs and compatibility
-By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and
-Thrift. There is also experimental support for the API of Amazon DynamoDB,
-but being experimental it needs to be explicitly enabled to be used. For more
-information on how to enable the experimental DynamoDB compatibility in Scylla,
-and the current limitations of this feature, see
-[Alternator](docs/alternator/alternator.md) and
-[Getting started with Alternator](docs/alternator/getting-started.md).
-
-## Documentation
-
-Documentation can be found in [./docs](./docs) and on the
-[wiki](https://github.com/scylladb/scylla/wiki). There is currently no clear
-definition of what goes where, so when looking for something be sure to check
-both.
-Seastar documentation can be found [here](http://docs.seastar.io/master/index.html).
-User documentation can be found [here](https://docs.scylladb.com/).
-
 ## Building Fedora RPM

 As a pre-requisite, you need to install [Mock](https://fedoraproject.org/wiki/Mock) on your machine:
--- a/2
+++ b/2
@@ -1,7 +1,7 @@
 #!/bin/sh

 PRODUCT=scylla
-VERSION=3.3.4
+VERSION=666.development

 if test -f version
 then
--- a/alternator-test/README.md
+++ b/alternator-test/README.md
@@ -31,48 +31,3 @@ and ~/.aws/config with the default region to use in the test:
 region = us-east-1
 ```

-## HTTPS support
-
-In order to run tests with HTTPS, run pytest with `--https` parameter. Note that the Scylla cluster needs to be provided
-with alternator\_https\_port configuration option in order to initialize a HTTPS server.
-Moreover, running an instance of a HTTPS server requires a certificate. Here's how to easily generate
-a key and a self-signed certificate, which is sufficient to run `--https` tests:
-
-```
-openssl genrsa 2048 > scylla.key
-openssl req -new -x509 -nodes -sha256 -days 365 -key scylla.key -out scylla.crt
-```
-
-If this pair is put into `conf/` directory, it will be enough
-to allow the alternator HTTPS server to think it's been authorized and properly certified.
-Still, boto3 library issues warnings that the certificate used for communication is self-signed,
-and thus should not be trusted. For the sake of running local tests this warning is explicitly ignored.
-
-
-## Authorization
-
-By default, boto3 prepares a properly signed Authorization header with every request.
-In order to confirm the authorization, the server recomputes the signature by using
-user credentials (user-provided username + a secret key known by the server),
-and then checks if it matches the signature from the header.
-Early alternator code did not verify signatures at all, which is also allowed by the protocol.
-A partial implementation of the authorization verification can be allowed by providing a Scylla
-configuration parameter:
-```yaml
-  alternator_enforce_authorization: true
-```
-The implementation is currently coupled with Scylla's system\_auth.roles table,
-which means that an additional step needs to be performed when setting up Scylla
-as the test environment. Tests will use the following credentials:
-Username: `alternator`
-Secret key: `secret_pass`
-
-With CQLSH, it can be achieved by executing this snipped:
-
-```bash
-cqlsh -x "INSERT INTO system_auth.roles (role, salted_hash) VALUES ('alternator', 'secret_pass')"
-```
-
-Most tests expect the authorization to succeed, so they will pass even with `alternator_enforce_authorization`
-turned off. However, test cases from `test_authorization.py` may require this option to be turned on,
-so it's advised.
--- a/alternator-test/conftest.py
+++ b/alternator-test/conftest.py
@@ -43,9 +43,6 @@ if (LooseVersion(botocore.__version__) < LooseVersion('1.12.54')):
 def pytest_addoption(parser):
    parser.addoption("--aws", action="store_true",
        help="run against AWS instead of a local Scylla installation")
-    parser.addoption("--https", action="store_true",
-        help="communicate via HTTPS protocol on port 8043 instead of HTTP when"
-            " running against a local Scylla installation")

 # "dynamodb" fixture: set up client object for communicating with the DynamoDB
 # API. Currently this chooses either Amazon's DynamoDB in the default region
@@ -62,15 +59,8 @@ def dynamodb(request):
        # requires us to specify dummy region and credential parameters,
        # otherwise the user is forced to properly configure ~/.aws even
        # for local runs.
-        local_url = 'https://localhost:8043' if request.config.getoption('https') else 'http://localhost:8000'
-        # Disable verifying in order to be able to use self-signed TLS certificates
-        verify = not request.config.getoption('https')
-        # Silencing the 'Unverified HTTPS request warning'
-        if request.config.getoption('https'):
-            import urllib3
-            urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
-        return boto3.resource('dynamodb', endpoint_url=local_url, verify=verify,
-            region_name='us-east-1', aws_access_key_id='alternator', aws_secret_access_key='secret_pass')
+        return boto3.resource('dynamodb', endpoint_url='http://localhost:8000',
+            region_name='us-east-1', aws_access_key_id='whatever', aws_secret_access_key='whatever')

 # "test_table" fixture: Create and return a temporary table to be used in tests
 # that need a table to work on. The table is automatically deleted at the end.
--- a/alternator-test/test_authorization.py
+++ b/alternator-test/test_authorization.py
@@ -1,74 +0,0 @@
-# Copyright 2019 ScyllaDB
-#
-# This file is part of Scylla.
-#
-# Scylla is free software: you can redistribute it and/or modify
-# it under the terms of the GNU Affero General Public License as published by
-# the Free Software Foundation, either version 3 of the License, or
-# (at your option) any later version.
-#
-# Scylla is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU Affero General Public License
-# along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
-
-# Tests for authorization
-
-import pytest
-import botocore
-from botocore.exceptions import ClientError
-import boto3
-import requests
-
-# Test that trying to perform an operation signed with a wrong key
-# will not succeed
-def test_wrong_key_access(request, dynamodb):
-    print("Please make sure authorization is enforced in your Scylla installation: alternator_enforce_authorization: true")
-    url = dynamodb.meta.client._endpoint.host
-    with pytest.raises(ClientError, match='UnrecognizedClientException'):
-        if url.endswith('.amazonaws.com'):
-            boto3.client('dynamodb',endpoint_url=url, aws_access_key_id='wrong_id', aws_secret_access_key='').describe_endpoints()
-        else:
-            verify = not url.startswith('https')
-            boto3.client('dynamodb',endpoint_url=url, region_name='us-east-1', aws_access_key_id='whatever', aws_secret_access_key='', verify=verify).describe_endpoints()
-
-# A similar test, but this time the user is expected to exist in the database (for local tests)
-def test_wrong_password(request, dynamodb):
-    print("Please make sure authorization is enforced in your Scylla installation: alternator_enforce_authorization: true")
-    url = dynamodb.meta.client._endpoint.host
-    with pytest.raises(ClientError, match='UnrecognizedClientException'):
-        if url.endswith('.amazonaws.com'):
-            boto3.client('dynamodb',endpoint_url=url, aws_access_key_id='alternator', aws_secret_access_key='wrong_key').describe_endpoints()
-        else:
-            verify = not url.startswith('https')
-            boto3.client('dynamodb',endpoint_url=url, region_name='us-east-1', aws_access_key_id='alternator', aws_secret_access_key='wrong_key', verify=verify).describe_endpoints()
-
-# A test ensuring that expired signatures are not accepted
-def test_expired_signature(dynamodb, test_table):
-    url = dynamodb.meta.client._endpoint.host
-    print(url)
-    headers = {'Content-Type': 'application/x-amz-json-1.0',
-               'X-Amz-Date': '20170101T010101Z',
-               'X-Amz-Target': 'DynamoDB_20120810.DescribeEndpoints',
-               'Authorization': 'AWS4-HMAC-SHA256 Credential=alternator/2/3/4/aws4_request SignedHeaders=x-amz-date;host Signature=123'
-    }
-    response = requests.post(url, headers=headers, verify=False)
-    assert not response.ok
-    assert "InvalidSignatureException" in response.text and "Signature expired" in response.text
-
-# A test ensuring that signatures that exceed current time too much are not accepted.
-# Watch out - this test is valid only for around next 1000 years, it needs to be updated later.
-def test_signature_too_futuristic(dynamodb, test_table):
-    url = dynamodb.meta.client._endpoint.host
-    print(url)
-    headers = {'Content-Type': 'application/x-amz-json-1.0',
-               'X-Amz-Date': '30200101T010101Z',
-               'X-Amz-Target': 'DynamoDB_20120810.DescribeEndpoints',
-               'Authorization': 'AWS4-HMAC-SHA256 Credential=alternator/2/3/4/aws4_request SignedHeaders=x-amz-date;host Signature=123'
-    }
-    response = requests.post(url, headers=headers, verify=False)
-    assert not response.ok
-    assert "InvalidSignatureException" in response.text and "Signature not yet current" in response.text
--- a/alternator-test/test_condition_expression.py
+++ b/alternator-test/test_condition_expression.py
--- a/alternator-test/test_describe_endpoints.py
+++ b/alternator-test/test_describe_endpoints.py
@@ -22,7 +22,7 @@ import boto3
 # Test that the DescribeEndpoints operation works as expected: that it
 # returns one endpoint (it may return more, but it never does this in
 # Amazon), and this endpoint can be used to make more requests.
-def test_describe_endpoints(request, dynamodb):
+def test_describe_endpoints(dynamodb):
    endpoints = dynamodb.meta.client.describe_endpoints()['Endpoints']
    # It is not strictly necessary that only a single endpoint be returned,
    # but this is what Amazon DynamoDB does today (and so does Alternator).
@@ -34,16 +34,14 @@ def test_describe_endpoints(request, dynamodb):
        # send it another describe_endpoints() request ;-) Note that the
        # address does not include the "http://" or "https://" prefix, and
        # we need to choose one manually.
-        prefix = "https://" if request.config.getoption('https') else "http://"
-        verify = not request.config.getoption('https')
-        url = prefix + address
+        url = "http://" + address
        if address.endswith('.amazonaws.com'):
-            boto3.client('dynamodb',endpoint_url=url, verify=verify).describe_endpoints()
+            boto3.client('dynamodb',endpoint_url=url).describe_endpoints()
        else:
            # Even though we connect to the local installation, Boto3 still
            # requires us to specify dummy region and credential parameters,
            # otherwise the user is forced to properly configure ~/.aws even
            # for local runs.
-            boto3.client('dynamodb',endpoint_url=url, region_name='us-east-1', aws_access_key_id='alternator', aws_secret_access_key='secret_pass', verify=verify).describe_endpoints()
+            boto3.client('dynamodb',endpoint_url=url, region_name='us-east-1', aws_access_key_id='whatever', aws_secret_access_key='whatever').describe_endpoints()
        # Nothing to check here - if the above call failed with an exception,
        # the test would fail.
--- a/alternator-test/test_describe_table.py
+++ b/alternator-test/test_describe_table.py
@@ -41,6 +41,7 @@ def test_describe_table_basic(test_table):

 # Test that DescribeTable correctly returns the table's schema, in
 # AttributeDefinitions and KeySchema attributes
+@pytest.mark.xfail(reason="DescribeTable does not yet return schema")
 def test_describe_table_schema(test_table):
    got = test_table.meta.client.describe_table(TableName=test_table.name)['Table']
    expected = { # Copied from test_table()'s fixture
--- a/alternator-test/test_expected.py
+++ b/alternator-test/test_expected.py
@@ -56,7 +56,6 @@ def test_update_expression_and_expected(test_table_s):
 # and the "false" case, where the condition evaluates to false, so the update
 # doesn't happen and we get a ConditionalCheckFailedException instead.

-# Tests for Expected with ComparisonOperator = "EQ":
 def test_update_expected_1_eq_true(test_table_s):
    p = random_string()
    test_table_s.update_item(Key={'p': p},
@@ -86,6 +85,7 @@ def test_update_expected_1_eq_true(test_table_s):
 # Check that set equality is checked correctly. Unlike string equality (for
 # example), it cannot be done with just naive string comparison of the JSON
 # representation, and we need to allow for any order.
+@pytest.mark.xfail(reason="bug in EQ test of sets")
 def test_update_expected_1_eq_set(test_table_s):
    p = random_string()
    # Because boto3 sorts the set values we give it, in order to generate a
@@ -125,759 +125,49 @@ def test_update_expected_1_eq_false(test_table_s):
        )
    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 1}

-# Tests for Expected with ComparisonOperator = "NE":
-def test_update_expected_1_ne_true(test_table_s):
-    p = random_string()
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'}})
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'NE',
-                        'AttributeValueList': [2]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 1, 'b': 3}
-    # For NE, AttributeValueList must have a single element
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'NE',
-                            'AttributeValueList': [2, 3]}}
-        )
-    # If the types are different, this is considered "not equal":
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'b': {'Value': 4, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'NE',
-                        'AttributeValueList': ["1"]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 1, 'b': 4}
-    # If the attribute does not exist at all, this is also considered "not equal":
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'b': {'Value': 5, 'Action': 'PUT'}},
-        Expected={'q': {'ComparisonOperator': 'NE',
-                        'AttributeValueList': [1]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 1, 'b': 5}
-
-def test_update_expected_1_ne_false(test_table_s):
-    p = random_string()
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'}})
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'NE',
-                            'AttributeValueList': [1]}}
-        )
-
-# Tests for Expected with ComparisonOperator = "LE":
-def test_update_expected_1_le(test_table_s):
-    p = random_string()
-    # LE should work for string, number, and binary type
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'},
-                          'b': {'Value': 'cat', 'Action': 'PUT'},
-                          'c': {'Value': bytearray('cat', 'utf-8'), 'Action': 'PUT'}})
-    # true cases:
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'LE',
-                        'AttributeValueList': [2]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'LE',
-                        'AttributeValueList': [1]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
-        Expected={'b': {'ComparisonOperator': 'LE',
-                        'AttributeValueList': ['dog']}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
-        Expected={'c': {'ComparisonOperator': 'LE',
-                        'AttributeValueList': [bytearray('dog', 'utf-8')]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
-    # false cases:
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'LE',
-                            'AttributeValueList': [0]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'b': {'ComparisonOperator': 'LE',
-                            'AttributeValueList': ['aardvark']}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'c': {'ComparisonOperator': 'LE',
-                            'AttributeValueList': [bytearray('aardvark', 'utf-8')]}}
-        )
-    # If the types are different, this is also considered false
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'LE',
-                            'AttributeValueList': ["1"]}}
-        )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
-    # For LE, AttributeValueList must have a single element
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'LE',
-                            'AttributeValueList': [2, 3]}}
-        )
-
-# Tests for Expected with ComparisonOperator = "LT":
-def test_update_expected_1_lt(test_table_s):
-    p = random_string()
-    # LT should work for string, number, and binary type
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'},
-                          'b': {'Value': 'cat', 'Action': 'PUT'},
-                          'c': {'Value': bytearray('cat', 'utf-8'), 'Action': 'PUT'}})
-    # true cases:
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'LT',
-                        'AttributeValueList': [2]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
-        Expected={'b': {'ComparisonOperator': 'LT',
-                        'AttributeValueList': ['dog']}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
-        Expected={'c': {'ComparisonOperator': 'LT',
-                        'AttributeValueList': [bytearray('dog', 'utf-8')]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
-    # false cases:
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'LT',
-                            'AttributeValueList': [1]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'LT',
-                            'AttributeValueList': [0]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'b': {'ComparisonOperator': 'LT',
-                            'AttributeValueList': ['aardvark']}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'c': {'ComparisonOperator': 'LT',
-                            'AttributeValueList': [bytearray('aardvark', 'utf-8')]}}
-        )
-    # If the types are different, this is also considered false
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'LT',
-                            'AttributeValueList': ["1"]}}
-        )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
-    # For LT, AttributeValueList must have a single element
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'LT',
-                            'AttributeValueList': [2, 3]}}
-        )
-
-# Tests for Expected with ComparisonOperator = "GE":
-def test_update_expected_1_ge(test_table_s):
-    p = random_string()
-    # GE should work for string, number, and binary type
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'},
-                          'b': {'Value': 'cat', 'Action': 'PUT'},
-                          'c': {'Value': bytearray('cat', 'utf-8'), 'Action': 'PUT'}})
-    # true cases:
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'GE',
-                        'AttributeValueList': [0]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'GE',
-                        'AttributeValueList': [1]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
-        Expected={'b': {'ComparisonOperator': 'GE',
-                        'AttributeValueList': ['aardvark']}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
-        Expected={'c': {'ComparisonOperator': 'GE',
-                        'AttributeValueList': [bytearray('aardvark', 'utf-8')]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
-    # false cases:
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'GE',
-                            'AttributeValueList': [3]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'b': {'ComparisonOperator': 'GE',
-                            'AttributeValueList': ['dog']}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'c': {'ComparisonOperator': 'GE',
-                            'AttributeValueList': [bytearray('dog', 'utf-8')]}}
-        )
-    # If the types are different, this is also considered false
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'GE',
-                            'AttributeValueList': ["1"]}}
-        )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
-    # For GE, AttributeValueList must have a single element
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'GE',
-                            'AttributeValueList': [2, 3]}}
-        )
-
-# Tests for Expected with ComparisonOperator = "GT":
-def test_update_expected_1_gt(test_table_s):
-    p = random_string()
-    # GT should work for string, number, and binary type
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'},
-                          'b': {'Value': 'cat', 'Action': 'PUT'},
-                          'c': {'Value': bytearray('cat', 'utf-8'), 'Action': 'PUT'}})
-    # true cases:
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'GT',
-                        'AttributeValueList': [0]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
-        Expected={'b': {'ComparisonOperator': 'GT',
-                        'AttributeValueList': ['aardvark']}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
-        Expected={'c': {'ComparisonOperator': 'GT',
-                        'AttributeValueList': [bytearray('aardvark', 'utf-8')]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
-    # false cases:
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'GT',
-                            'AttributeValueList': [3]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'GT',
-                            'AttributeValueList': [1]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'b': {'ComparisonOperator': 'GT',
-                            'AttributeValueList': ['dog']}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'c': {'ComparisonOperator': 'GT',
-                            'AttributeValueList': [bytearray('dog', 'utf-8')]}}
-        )
-    # If the types are different, this is also considered false
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'GT',
-                            'AttributeValueList': ["1"]}}
-        )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
-    # For GT, AttributeValueList must have a single element
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'GT',
-                            'AttributeValueList': [2, 3]}}
-        )
-
-# Tests for Expected with ComparisonOperator = "NOT_NULL":
-def test_update_expected_1_not_null(test_table_s):
-    # Note that despite its name, the "NOT_NULL" comparison operator doesn't check if
-    # the attribute has the type "NULL", or an empty value. Rather it is explicitly
-    # documented to check if the attribute exists at all.
-    p = random_string()
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'},
-                          'b': {'Value': 'cat', 'Action': 'PUT'},
-                          'c': {'Value': None, 'Action': 'PUT'}})
-    # true cases:
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'NOT_NULL', 'AttributeValueList': []}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
-        Expected={'b': {'ComparisonOperator': 'NOT_NULL', 'AttributeValueList': []}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
-        Expected={'c': {'ComparisonOperator': 'NOT_NULL', 'AttributeValueList': []}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
-    # false cases:
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'q': {'ComparisonOperator': 'NOT_NULL', 'AttributeValueList': []}}
-        )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
-    # For NOT_NULL, AttributeValueList must be empty
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'b': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'NOT_NULL', 'AttributeValueList': [2]}}
-        )
-
-# Tests for Expected with ComparisonOperator = "NULL":
-def test_update_expected_1_null(test_table_s):
-    # Note that despite its name, the "NULL" comparison operator doesn't check if
-    # the attribute has the type "NULL", or an empty value. Rather it is explicitly
-    # documented to check that the attribute does not exist at all.
-    p = random_string()
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'},
-                          'b': {'Value': 'cat', 'Action': 'PUT'},
-                          'c': {'Value': None, 'Action': 'PUT'}})
-    # true cases:
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-        Expected={'q': {'ComparisonOperator': 'NULL', 'AttributeValueList': []}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
-    # false cases:
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'NULL', 'AttributeValueList': []}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'b': {'ComparisonOperator': 'NULL', 'AttributeValueList': []}}
-        )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
-            Expected={'c': {'ComparisonOperator': 'NULL', 'AttributeValueList': []}}
-        )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
-    # For NULL, AttributeValueList must be empty
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'b': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'NULL', 'AttributeValueList': [2]}}
-        )
-
-# Tests for Expected with ComparisonOperator = "CONTAINS":
-def test_update_expected_1_contains(test_table_s):
-    # true cases. CONTAINS can be used for two unrelated things: check substrings
-    # (in string or binary) and membership (in set or list).
-    p = random_string()
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 'hello', 'Action': 'PUT'},
-                          'b': {'Value': set([2, 4, 7]), 'Action': 'PUT'},
-                          'c': {'Value': [2, 4, 7], 'Action': 'PUT'},
-                          'd': {'Value': bytearray('hi there', 'utf-8'), 'Action': 'PUT'}})
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': ['ell']}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
-        Expected={'b': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [4]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
-    # The CONTAINS documentation uses confusing wording on whether it works
-    # only on sets, or also on lists. In fact, it does work on lists:
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
-        Expected={'c': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [4]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
-        Expected={'d': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [bytearray('here', 'utf-8')]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
-    # false cases:
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': ['dog']}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [1]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'b': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [1]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'c': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [1]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'q': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [1]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'d': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [bytearray('dog', 'utf-8')]}}
-        )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
-    # For CONTAINS, AttributeValueList must have just one item, and it must be
-    # a string, number or binary
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [2, 3]}}
-        )
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': []}}
-        )
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [[1]]}}
-        )
-
-# Tests for Expected with ComparisonOperator = "NOT_CONTAINS":
-def test_update_expected_1_not_contains(test_table_s):
-    # true cases. NOT_CONTAINS can be used for two unrelated things: check substrings
-    # (in string or binary) and membership (in set or list).
-    p = random_string()
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 'hello', 'Action': 'PUT'},
-                          'b': {'Value': set([2, 4, 7]), 'Action': 'PUT'},
-                          'c': {'Value': [2, 4, 7], 'Action': 'PUT'},
-                          'd': {'Value': bytearray('hi there', 'utf-8'), 'Action': 'PUT'}})
-
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': ['dog']}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [1]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
-        Expected={'b': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [1]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
-        Expected={'c': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [1]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 7, 'Action': 'PUT'}},
-        Expected={'d': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [bytearray('dog', 'utf-8')]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 7
-
-    # false cases:
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': ['ell']}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'b': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [4]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
-            Expected={'c': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [4]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
-            Expected={'d': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [bytearray('here', 'utf-8')]}}
-        )
-    # Surprisingly, if an attribute does not exist at all, NOT_CONTAINS
-    # fails, rather than succeeding. This is surprising because it means in
-    # this case both CONTAINS and NOT_CONTAINS are false, and because "NE" does not
-    # behave this way (if the attribute does not exist, NE succeeds).
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'q': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [1]}}
-        )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 7
-    # For NOT_CONTAINS, AttributeValueList must have just one item, and it must be
-    # a string, number or binary
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [2, 3]}}
-        )
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': []}}
-        )
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [[1]]}}
-        )
-
-# Tests for Expected with ComparisonOperator = "BEGINS_WITH":
 def test_update_expected_1_begins_with_true(test_table_s):
    p = random_string()
    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 'hello', 'Action': 'PUT'},
-                          'd': {'Value': bytearray('hi there', 'utf-8'), 'Action': 'PUT'}})
+        AttributeUpdates={'a': {'Value': 'hello', 'Action': 'PUT'}})
    # Case where expected and update are on different attribute:
    test_table_s.update_item(Key={'p': p},
        AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
        Expected={'a': {'ComparisonOperator': 'BEGINS_WITH',
                        'AttributeValueList': ['hell']}}
    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['b'] == 3
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'b': {'Value': 4, 'Action': 'PUT'}},
-        Expected={'d': {'ComparisonOperator': 'BEGINS_WITH',
-                        'AttributeValueList': [bytearray('hi', 'utf-8')]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['b'] == 4
-    # For BEGINS_WITH, AttributeValueList must have a single element
+    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 'hello', 'b': 3}
+    # For BEGINS_WITH, AttributeValueList must have a single element
    with pytest.raises(ClientError, match='ValidationException'):
        test_table_s.update_item(Key={'p': p},
            AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'BEGINS_WITH',
+            Expected={'a': {'ComparisonOperator': 'EQ',
                            'AttributeValueList': ['hell', 'heaven']}}
        )

 def test_update_expected_1_begins_with_false(test_table_s):
    p = random_string()
    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 'hello', 'Action': 'PUT'},
-                          'x': {'Value': 3, 'Action': 'PUT'}})
+        AttributeUpdates={'a': {'Value': 'hello', 'Action': 'PUT'}})
    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
        test_table_s.update_item(Key={'p': p},
            AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'BEGINS_WITH',
+            Expected={'a': {'ComparisonOperator': 'EQ',
                            'AttributeValueList': ['dog']}}
        )
-    # BEGINS_WITH requires a String or Binary operand; giving it a number
-    # results in a ValidationException (not a normal failed condition):
-    with pytest.raises(ClientError, match='ValidationException'):
+    # Although BEGINS_WITH requires a String or Binary type, giving it a
+    # number results not in a ValidationException but rather in a
+    # failed condition (ConditionalCheckFailedException)
+    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
        test_table_s.update_item(Key={'p': p},
            AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'BEGINS_WITH',
+            Expected={'a': {'ComparisonOperator': 'EQ',
                            'AttributeValueList': [3]}}
        )
-    # However, if we try to compare the attribute to a String or Binary, and
-    # the attribute value itself is a number, this is just a failed condition:
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'x': {'ComparisonOperator': 'BEGINS_WITH',
-                            'AttributeValueList': ['dog']}}
-        )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 'hello', 'x': 3}
+    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 'hello'}

-# Tests for Expected with ComparisonOperator = "IN":
-def test_update_expected_1_in(test_table_s):
-    # Some copies of "IN"'s documentation are outright wrong: "IN" checks
-    # whether the attribute value is in the given list of values. It does NOT
-    # do the opposite - testing whether certain items are in a set attribute.
-    p = random_string()
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': set([2, 4, 7]), 'Action': 'PUT'},
-                          'c': {'Value': 3, 'Action': 'PUT'}})
-    # true cases:
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-        Expected={'c': {'ComparisonOperator': 'IN', 'AttributeValueList': [2, 3, 8]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
-    # false cases:
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'c': {'ComparisonOperator': 'IN', 'AttributeValueList': [1, 2, 4]}}
-        )
-    # a bunch of wrong interpretations of what the heck that "IN" does :-(
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'IN', 'AttributeValueList': [2]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'IN', 'AttributeValueList': [1, 2, 4, 7, 8]}}
-        )
-    # Strangely, all the items in AttributeValueList must be of the same type,
-    # we can't check if an item is either the number 3 or the string 'dog',
-    # although allowing this case as well would have been easy:
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'c': {'ComparisonOperator': 'IN', 'AttributeValueList': [3, 'dog']}}
-        )
-    # Empty list is not allowed
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'c': {'ComparisonOperator': 'IN', 'AttributeValueList': []}}
-        )
-    # Non-scalar attribute values are not allowed
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
-            Expected={'c': {'ComparisonOperator': 'IN', 'AttributeValueList': [[1], [2]]}}
-        )
+# FIXME: need to test many more ComparisonOperator options... See full list in
+# description in https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LegacyConditionalParameters.Expected.html

-# Tests for Expected with ComparisonOperator = "BETWEEN":
-def test_update_expected_1_between(test_table_s):
-    p = random_string()
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'a': {'Value': 2, 'Action': 'PUT'},
-                          'b': {'Value': 'cat', 'Action': 'PUT'},
-                          'c': {'Value': bytearray('cat', 'utf-8'), 'Action': 'PUT'},
-                          'd': {'Value': set([2, 4, 7]), 'Action': 'PUT'}})
-    # true cases:
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [1, 3]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [1, 2]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
-        Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [2, 3]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
-        Expected={'b': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': ['aardvark', 'dog']}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
-    test_table_s.update_item(Key={'p': p},
-        AttributeUpdates={'z': {'Value': 6, 'Action': 'PUT'}},
-        Expected={'c': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [bytearray('aardvark', 'utf-8'), bytearray('dog', 'utf-8')]}}
-    )
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 6
-    # false cases:
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [0, 1]}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': ['cat', 'dog']}}
-        )
-    with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'q': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [0, 100]}})
-    assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 6
-    # The given AttributeValueList array must contain exactly two items of the
-    # same type, and in the right order. Any other input is considered a validation
-    # error:
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': []}})
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [2, 3, 4]}})
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [4, 3]}})
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'b': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': ['dog', 'aardvark']}})
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [4, 'dog']}})
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p},
-            AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
-            Expected={'d': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [set([1]), set([2])]}})
-
-##############################################################################
 # Instead of ComparisonOperator and AttributeValueList, one can specify either
 # Value or Exists:
 def test_update_expected_1_value_true(test_table_s):
--- a/alternator-test/test_gsi.py
+++ b/alternator-test/test_gsi.py
@@ -377,6 +377,7 @@ def test_gsi_3(test_table_gsi_3):
        KeyConditions={'a': {'AttributeValueList': [items[3]['a']], 'ComparisonOperator': 'EQ'},
                       'b': {'AttributeValueList': [items[3]['b']], 'ComparisonOperator': 'EQ'}})

+@pytest.mark.xfail(reason="GSIs in alternator currently have a bug when updating the second regular base column")
 def test_gsi_update_second_regular_base_column(test_table_gsi_3):
    items = [{'p': random_string(), 'a': random_string(), 'b': random_string(), 'd': random_string()} for i in range(10)]
    with test_table_gsi_3.batch_writer() as batch:
@@ -388,34 +389,6 @@ def test_gsi_update_second_regular_base_column(test_table_gsi_3):
        KeyConditions={'a': {'AttributeValueList': [items[3]['a']], 'ComparisonOperator': 'EQ'},
                       'b': {'AttributeValueList': [items[3]['b']], 'ComparisonOperator': 'EQ'}})

-# Test that when a table has a GSI, if the indexed attribute is missing, the
-# item is added to the base table but not the index.
-# This is the same feature we already tested in test_gsi_missing_attribute()
-# above, but on a different table: In that test we used test_table_gsi_2,
-# with one indexed attribute, and in this test we use test_table_gsi_3 which
-# has two base regular attributes in the view key, and more possibilities
-# of which value might be missing. Reproduces issue #6008.
-def test_gsi_missing_attribute_3(test_table_gsi_3):
-    p = random_string()
-    a = random_string()
-    b = random_string()
-    # First, add an item with a missing "a" value. It should appear in the
-    # base table, but not in the index:
-    test_table_gsi_3.put_item(Item={'p':  p, 'b': b})
-    assert test_table_gsi_3.get_item(Key={'p':  p})['Item'] == {'p': p, 'b': b}
-    # Note: with eventually consistent read, we can't really be sure that
-    # an item will "never" appear in the index. We hope that if a bug exists
-    # and such an item did appear, sometimes the delay here will be enough
-    # for the unexpected item to become visible.
-    assert not any([i['p'] == p for i in full_scan(test_table_gsi_3, IndexName='hello')])
-    # Same thing for an item with a missing "b" value:
-    test_table_gsi_3.put_item(Item={'p':  p, 'a': a})
-    assert test_table_gsi_3.get_item(Key={'p':  p})['Item'] == {'p': p, 'a': a}
-    assert not any([i['p'] == p for i in full_scan(test_table_gsi_3, IndexName='hello')])
-    # And for an item missing both:
-    test_table_gsi_3.put_item(Item={'p':  p})
-    assert test_table_gsi_3.get_item(Key={'p':  p})['Item'] == {'p': p}
-    assert not any([i['p'] == p for i in full_scan(test_table_gsi_3, IndexName='hello')])

 # A fourth scenario of GSI. Two GSIs on a single base table.
@pytest.fixture(scope="session")
@@ -504,52 +477,6 @@ def test_gsi_5(test_table_gsi_5):
        KeyConditions={'p': {'AttributeValueList': [p2], 'ComparisonOperator': 'EQ'},
                       'x': {'AttributeValueList': [x2], 'ComparisonOperator': 'EQ'}})

-# Verify that DescribeTable correctly returns the schema of both base-table
-# and secondary indexes. KeySchema is given for each of the base table and
-# indexes, and AttributeDefinitions is merged for all of them together.
-def test_gsi_5_describe_table_schema(test_table_gsi_5):
-    got = test_table_gsi_5.meta.client.describe_table(TableName=test_table_gsi_5.name)['Table']
-    # Copied from test_table_gsi_5 fixture
-    expected_base_keyschema = [
-                    { 'AttributeName': 'p', 'KeyType': 'HASH' },
-                    { 'AttributeName': 'c', 'KeyType': 'RANGE' } ]
-    expected_gsi_keyschema = [
-                    { 'AttributeName': 'p', 'KeyType': 'HASH' },
-                    { 'AttributeName': 'x', 'KeyType': 'RANGE' } ]
-    expected_all_attribute_definitions = [
-                    { 'AttributeName': 'p', 'AttributeType': 'S' },
-                    { 'AttributeName': 'c', 'AttributeType': 'S' },
-                    { 'AttributeName': 'x', 'AttributeType': 'S' } ]
-    assert got['KeySchema'] == expected_base_keyschema
-    gsis = got['GlobalSecondaryIndexes']
-    assert len(gsis) == 1
-    assert gsis[0]['KeySchema'] == expected_gsi_keyschema
-    # The list of attribute definitions may be arbitrarily reordered
-    assert multiset(got['AttributeDefinitions']) == multiset(expected_all_attribute_definitions)
-
-# Similar DescribeTable schema test for test_table_gsi_2. The peculiarity
-# in that table is that the base table has only a hash key p, and the index
-# has only a hash key x. Now, while internally Scylla needs to add "p" as a
-# clustering key in the materialized view (in Scylla the view key always
-# contains the base key), when describing the table, "p" shouldn't be
-# returned as a range key, because the user didn't ask for it.
-# This test reproduces issue #5320.
-@pytest.mark.xfail(reason="GSI DescribeTable spurious range key (#5320)")
-def test_gsi_2_describe_table_schema(test_table_gsi_2):
-    got = test_table_gsi_2.meta.client.describe_table(TableName=test_table_gsi_2.name)['Table']
-    # Copied from test_table_gsi_2 fixture
-    expected_base_keyschema = [ { 'AttributeName': 'p', 'KeyType': 'HASH' } ]
-    expected_gsi_keyschema = [ { 'AttributeName': 'x', 'KeyType': 'HASH' } ]
-    expected_all_attribute_definitions = [
-                    { 'AttributeName': 'p', 'AttributeType': 'S' },
-                    { 'AttributeName': 'x', 'AttributeType': 'S' } ]
-    assert got['KeySchema'] == expected_base_keyschema
-    gsis = got['GlobalSecondaryIndexes']
-    assert len(gsis) == 1
-    assert gsis[0]['KeySchema'] == expected_gsi_keyschema
-    # The list of attribute definitions may be arbitrarily reordered
-    assert multiset(got['AttributeDefinitions']) == multiset(expected_all_attribute_definitions)
-
 # All tests above involved "ProjectionType: ALL". This test checks how
 # "ProjectionType: KEYS_ONLY" works. We note that it projects both
 # the index's key, *and* the base table's key. So items which had different
--- a/alternator-test/test_health.py
+++ b/alternator-test/test_health.py
@@ -1,35 +0,0 @@
-# Copyright 2019 ScyllaDB
-#
-# This file is part of Scylla.
-#
-# Scylla is free software: you can redistribute it and/or modify
-# it under the terms of the GNU Affero General Public License as published by
-# the Free Software Foundation, either version 3 of the License, or
-# (at your option) any later version.
-#
-# Scylla is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU Affero General Public License
-# along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
-
-# Tests for the health check
-
-import requests
-
-# Test that a health check can be performed with a GET packet
-def test_health_works(dynamodb):
-    url = dynamodb.meta.client._endpoint.host
-    response = requests.get(url)
-    assert response.ok
-    assert response.content.decode('utf-8').strip()  == 'healthy: {}'.format(url.replace('https://', '').replace('http://', ''))
-
-# Test that a health check only works for the root URL ('/')
-def test_health_only_works_for_root_path(dynamodb):
-    url = dynamodb.meta.client._endpoint.host
-    for suffix in ['/abc', '/-', '/index.htm', '/health']:
-        print(url + suffix)
-        response = requests.get(url + suffix, verify=False)
-        assert response.status_code in range(400, 405)
--- a/alternator-test/test_query.py
+++ b/alternator-test/test_query.py
@@ -20,7 +20,7 @@

 import random
 import pytest
-from botocore.exceptions import ClientError, ParamValidationError
+from botocore.exceptions import ClientError
 from decimal import Decimal
 from util import random_string, random_bytes, full_query, multiset
 from boto3.dynamodb.conditions import Key, Attr
@@ -356,161 +356,3 @@ def test_query_which_key(test_table):
            'c': {'AttributeValueList': [c], 'ComparisonOperator': 'EQ'},
            'z': {'AttributeValueList': [c], 'ComparisonOperator': 'EQ'}
        })
-
-# Test the "Select" parameter of Query. The default Select mode,
-# ALL_ATTRIBUTES, returns items with all their attributes. Other modes
-# allow returning just specific attributes or just counting the results
-# without returning items at all.
-@pytest.mark.xfail(reason="Select not supported yet")
-def test_query_select(test_table_sn):
-    numbers = [Decimal(i) for i in range(10)]
-    # Insert these numbers, in random order, into one partition:
-    p = random_string()
-    items = [{'p': p, 'c': num, 'x': num} for num in random.sample(numbers, len(numbers))]
-    with test_table_sn.batch_writer() as batch:
-        for item in items:
-            batch.put_item(item)
-    # Verify that we get back the numbers in their sorted order. By default,
-    # query returns all attributes:
-    got_items = test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}})['Items']
-    got_sort_keys = [x['c'] for x in got_items]
-    assert got_sort_keys == numbers
-    got_x_attributes = [x['x'] for x in got_items]
-    assert got_x_attributes == numbers
-    # Select=ALL_ATTRIBUTES does exactly the same as the default - return
-    # all attributes:
-    got_items = test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, Select='ALL_ATTRIBUTES')['Items']
-    got_sort_keys = [x['c'] for x in got_items]
-    assert got_sort_keys == numbers
-    got_x_attributes = [x['x'] for x in got_items]
-    assert got_x_attributes == numbers
-    # Select=ALL_PROJECTED_ATTRIBUTES is not allowed on a base table (it
-    # is just for indexes, when IndexName is specified)
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, Select='ALL_PROJECTED_ATTRIBUTES')
-    # Select=SPECIFIC_ATTRIBUTES requires that either an AttributesToGet
-    # or ProjectionExpression appears, but then really does nothing:
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, Select='SPECIFIC_ATTRIBUTES')
-    got_items = test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, Select='SPECIFIC_ATTRIBUTES', AttributesToGet=['x'])['Items']
-    expected_items = [{'x': i} for i in numbers]
-    assert got_items == expected_items
-    got_items = test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, Select='SPECIFIC_ATTRIBUTES', ProjectionExpression='x')['Items']
-    assert got_items == expected_items
-    # Select=COUNT just returns a count - not any items
-    got = test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, Select='COUNT')
-    assert got['Count'] == len(numbers)
-    assert not 'Items' in got
-    # Check again that we also get a count - not just with Select=COUNT,
-    # but without Select=COUNT we also get the items:
-    got = test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}})
-    assert got['Count'] == len(numbers)
-    assert 'Items' in got
-    # Select with some unknown string generates a validation exception:
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, Select='UNKNOWN')
-
-# Test that the "Limit" parameter can be used to return only some of the
-# items in a single partition. The items returned are the first in the
-# sorted order.
-def test_query_limit(test_table_sn):
-    numbers = [Decimal(i) for i in range(10)]
-    # Insert these numbers, in random order, into one partition:
-    p = random_string()
-    items = [{'p': p, 'c': num} for num in random.sample(numbers, len(numbers))]
-    with test_table_sn.batch_writer() as batch:
-        for item in items:
-            batch.put_item(item)
-    # Verify that we get back the numbers in their sorted order.
-    # First, no Limit so we should get all numbers (we have few of them, so
-    # it all fits in the default 1MB limitation)
-    got_items = test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}})['Items']
-    got_sort_keys = [x['c'] for x in got_items]
-    assert got_sort_keys == numbers
-    # Now try a few different Limit values, and verify that the query
-    # returns exactly the first Limit sorted numbers.
-    for limit in [1, 2, 3, 7, 10, 17, 100, 10000]:
-        got_items = test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, Limit=limit)['Items']
-        assert len(got_items) == min(limit, len(numbers))
-        got_sort_keys = [x['c'] for x in got_items]
-        assert got_sort_keys == numbers[0:limit]
-    # Unfortunately, the boto3 library forbids a Limit of 0 on its own,
-    # before even sending a request, so we can't test how the server responds.
-    with pytest.raises(ParamValidationError):
-        test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, Limit=0)
-
-# In test_query_limit we tested only that Limit stops the result after
-# the right number of items. Here we test that such a stopped result
-# can be resumed, via the LastEvaluatedKey/ExclusiveStartKey paging mechanism.
-def test_query_limit_paging(test_table_sn):
-    numbers = [Decimal(i) for i in range(20)]
-    # Insert these numbers, in random order, into one partition:
-    p = random_string()
-    items = [{'p': p, 'c': num} for num in random.sample(numbers, len(numbers))]
-    with test_table_sn.batch_writer() as batch:
-        for item in items:
-            batch.put_item(item)
-    # Verify that full_query() returns all these numbers, in sorted order.
-    # full_query() will do a query with the given limit, and resume it again
-    # and again until the last page.
-    for limit in [1, 2, 3, 7, 10, 17, 100, 10000]:
-        got_items = full_query(test_table_sn, KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, Limit=limit)
-        got_sort_keys = [x['c'] for x in got_items]
-        assert got_sort_keys == numbers
-
-# Test that the ScanIndexForward parameter works, and can be used to
-# return items sorted in reverse order. Combining this with Limit can
-# be used to return the last items instead of the first items of the
-# partition.
-@pytest.mark.xfail(reason="ScanIndexForward not supported yet")
-def test_query_reverse(test_table_sn):
-    numbers = [Decimal(i) for i in range(20)]
-    # Insert these numbers, in random order, into one partition:
-    p = random_string()
-    items = [{'p': p, 'c': num} for num in random.sample(numbers, len(numbers))]
-    with test_table_sn.batch_writer() as batch:
-        for item in items:
-            batch.put_item(item)
-    # Verify that we get back the numbers in their sorted order or reverse
-    # order, depending on the ScanIndexForward parameter being True or False.
-    # First, no Limit so we should get all numbers (we have few of them, so
-    # it all fits in the default 1MB limitation)
-    reversed_numbers = list(reversed(numbers))
-    got_items = test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, ScanIndexForward=True)['Items']
-    got_sort_keys = [x['c'] for x in got_items]
-    assert got_sort_keys == numbers
-    got_items = test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, ScanIndexForward=False)['Items']
-    got_sort_keys = [x['c'] for x in got_items]
-    assert got_sort_keys == reversed_numbers
-    # Now try a few different Limit values, and verify that the query
-    # returns exactly the first Limit sorted numbers - in regular or
-    # reverse order, depending on ScanIndexForward.
-    for limit in [1, 2, 3, 7, 10, 17, 100, 10000]:
-        got_items = test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, Limit=limit, ScanIndexForward=True)['Items']
-        assert len(got_items) == min(limit, len(numbers))
-        got_sort_keys = [x['c'] for x in got_items]
-        assert got_sort_keys == numbers[0:limit]
-        got_items = test_table_sn.query(KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, Limit=limit, ScanIndexForward=False)['Items']
-        assert len(got_items) == min(limit, len(numbers))
-        got_sort_keys = [x['c'] for x in got_items]
-        assert got_sort_keys == reversed_numbers[0:limit]
-
-# Test that paging also works properly with reverse order
-# (ScanIndexForward=false), i.e., reverse-order queries can be resumed
-@pytest.mark.xfail(reason="ScanIndexForward not supported yet")
-def test_query_reverse_paging(test_table_sn):
-    numbers = [Decimal(i) for i in range(20)]
-    # Insert these numbers, in random order, into one partition:
-    p = random_string()
-    items = [{'p': p, 'c': num} for num in random.sample(numbers, len(numbers))]
-    with test_table_sn.batch_writer() as batch:
-        for item in items:
-            batch.put_item(item)
-    reversed_numbers = list(reversed(numbers))
-    # Verify that with ScanIndexForward=False, full_query() returns all
-    # these numbers in reversed sorted order - getting pages of Limit items
-    # at a time and resuming the query.
-    for limit in [1, 2, 3, 7, 10, 17, 100, 10000]:
-        got_items = full_query(test_table_sn, KeyConditions={'p': {'AttributeValueList': [p], 'ComparisonOperator': 'EQ'}}, ScanIndexForward=False, Limit=limit)
-        got_sort_keys = [x['c'] for x in got_items]
-        assert got_sort_keys == reversed_numbers
--- a/alternator-test/test_returnvalues.py
+++ b/alternator-test/test_returnvalues.py
@@ -1,226 +0,0 @@
-# Copyright 2019 ScyllaDB
-#
-# This file is part of Scylla.
-#
-# Scylla is free software: you can redistribute it and/or modify
-# it under the terms of the GNU Affero General Public License as published by
-# the Free Software Foundation, either version 3 of the License, or
-# (at your option) any later version.
-#
-# Scylla is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU Affero General Public License
-# along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
-
-# Tests for the ReturnValues parameter for the different update operations
-# (PutItem, UpdateItem, DeleteItem).
-
-import pytest
-from botocore.exceptions import ClientError
-from util import random_string
-
-# Test trivial support for the ReturnValues parameter in PutItem, UpdateItem
-# and DeleteItem - test that "NONE" works (and changes nothing), while a
-# completely unsupported value gives an error.
-# This test is useful to check that before the ReturnValues parameter is fully
-# implemented, it returns an error when a still-unsupported ReturnValues
-# option is attempted in the request - instead of simply being ignored.
-def test_trivial_returnvalues(test_table_s):
-    # PutItem:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi'})
-    ret=test_table_s.put_item(Item={'p': p, 'a': 'hello'}, ReturnValues='NONE')
-    assert not 'Attributes' in ret
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.put_item(Item={'p': p, 'a': 'hello'}, ReturnValues='DOG')
-    # UpdateItem:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi', 'b': 'dog'})
-    ret=test_table_s.update_item(Key={'p': p}, ReturnValues='NONE',
-        UpdateExpression='SET b = :val',
-        ExpressionAttributeValues={':val': 'cat'})
-    assert not 'Attributes' in ret
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p}, ReturnValues='DOG',
-            UpdateExpression='SET a = a + :val',
-            ExpressionAttributeValues={':val': 1})
-    # DeleteItem:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi'})
-    ret=test_table_s.delete_item(Key={'p': p}, ReturnValues='NONE')
-    assert not 'Attributes' in ret
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.delete_item(Key={'p': p}, ReturnValues='DOG')
-
-# Test the ReturnValues parameter on a PutItem operation. Only two settings
-# are supported for this parameter for this operation: NONE (the default)
-# and ALL_OLD.
-@pytest.mark.xfail(reason="ReturnValues not supported")
-def test_put_item_returnvalues(test_table_s):
-    # By default, the previous value of an item is not returned:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi'})
-    ret=test_table_s.put_item(Item={'p': p, 'a': 'hello'})
-    assert not 'Attributes' in ret
-    # Using ReturnValues=NONE is the same:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi'})
-    ret=test_table_s.put_item(Item={'p': p, 'a': 'hello'}, ReturnValues='NONE')
-    assert not 'Attributes' in ret
-    # With ReturnValues=ALL_OLD, the old value of the item is returned
-    # in an "Attributes" attribute:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi'})
-    ret=test_table_s.put_item(Item={'p': p, 'a': 'hello'}, ReturnValues='ALL_OLD')
-    assert ret['Attributes'] == {'p': p, 'a': 'hi'}
-    # Other ReturnValue options - UPDATED_OLD, ALL_NEW, UPDATED_NEW,
-    # are supported by other operations but not by PutItem:
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.put_item(Item={'p': p, 'a': 'hello'}, ReturnValues='UPDATED_OLD')
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.put_item(Item={'p': p, 'a': 'hello'}, ReturnValues='ALL_NEW')
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.put_item(Item={'p': p, 'a': 'hello'}, ReturnValues='UPDATED_NEW')
-    # An unsupported setting such as "DOG" also results in an error:
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.put_item(Item={'p': p, 'a': 'hello'}, ReturnValues='DOG')
-    # The ReturnValues value is case sensitive, so while "NONE" is supported
-    # (and tested above), "none" isn't:
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.put_item(Item={'p': p, 'a': 'hello'}, ReturnValues='none')
-
-# Test the ReturnValues parameter on a DeleteItem operation. Only two settings
-# are supported for this parameter for this operation: NONE (the default)
-# and ALL_OLD.
-@pytest.mark.xfail(reason="ReturnValues not supported")
-def test_delete_item_returnvalues(test_table_s):
-    # By default, the previous value of an item is not returned:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi'})
-    ret=test_table_s.delete_item(Key={'p': p})
-    assert not 'Attributes' in ret
-    # Using ReturnValues=NONE is the same:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi'})
-    ret=test_table_s.delete_item(Key={'p': p}, ReturnValues='NONE')
-    assert not 'Attributes' in ret
-    # With ReturnValues=ALL_OLD, the old value of the item is returned
-    # in an "Attributes" attribute:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi'})
-    ret=test_table_s.delete_item(Key={'p': p}, ReturnValues='ALL_OLD')
-    assert ret['Attributes'] == {'p': p, 'a': 'hi'}
-    # Other ReturnValue options - UPDATED_OLD, ALL_NEW, UPDATED_NEW,
-    # are supported by other operations but not by DeleteItem:
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.delete_item(Key={'p': p}, ReturnValues='UPDATED_OLD')
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.delete_item(Key={'p': p}, ReturnValues='ALL_NEW')
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.delete_item(Key={'p': p}, ReturnValues='UPDATED_NEW')
-    # An unsupported setting such as "DOG" also results in an error:
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.delete_item(Key={'p': p}, ReturnValues='DOG')
-    # The ReturnValues value is case sensitive, so while "NONE" is supported
-    # (and tested above), "none" isn't:
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.delete_item(Key={'p': p}, ReturnValues='none')
-
-# Test the ReturnValues parameter on an UpdateItem operation. All five
-# settings are supported for this parameter for this operation: NONE
-# (the default), ALL_OLD, UPDATED_OLD, ALL_NEW and UPDATED_NEW.
-@pytest.mark.xfail(reason="ReturnValues not supported")
-def test_update_item_returnvalues(test_table_s):
-    # By default, the previous value of an item is not returned:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi', 'b': 'dog'})
-    ret=test_table_s.update_item(Key={'p': p},
-        UpdateExpression='SET b = :val',
-        ExpressionAttributeValues={':val': 'cat'})
-    assert not 'Attributes' in ret
-
-    # Using ReturnValues=NONE is the same:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi', 'b': 'dog'})
-    ret=test_table_s.update_item(Key={'p': p}, ReturnValues='NONE',
-        UpdateExpression='SET b = :val',
-        ExpressionAttributeValues={':val': 'cat'})
-    assert not 'Attributes' in ret
-
-    # With ReturnValues=ALL_OLD, the entire old value of the item (even
-    # attributes we did not modify) is returned in an "Attributes" attribute:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi', 'b': 'dog'})
-    ret=test_table_s.update_item(Key={'p': p}, ReturnValues='ALL_OLD',
-        UpdateExpression='SET b = :val',
-        ExpressionAttributeValues={':val': 'cat'})
-    assert ret['Attributes'] == {'p': p, 'a': 'hi', 'b': 'dog'}
-
-    # With ReturnValues=UPDATED_OLD, only the overwritten attributes of the
-    # old item are returned in an "Attributes" attribute:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi', 'b': 'dog'})
-    ret=test_table_s.update_item(Key={'p': p}, ReturnValues='UPDATED_OLD',
-        UpdateExpression='SET b = :val, c = :val2',
-        ExpressionAttributeValues={':val': 'cat', ':val2': 'hello'})
-    assert ret['Attributes'] == {'b': 'dog'}
-    # Even if an update overwrites an attribute by the same value again,
-    # this is considered an update, and the old value (identical to the
-    # new one) is returned:
-    ret=test_table_s.update_item(Key={'p': p}, ReturnValues='UPDATED_OLD',
-        UpdateExpression='SET b = :val',
-        ExpressionAttributeValues={':val': 'cat'})
-    assert ret['Attributes'] == {'b': 'cat'}
-    # Deleting an attribute also counts as overwriting it, of course:
-    ret=test_table_s.update_item(Key={'p': p}, ReturnValues='UPDATED_OLD',
-        UpdateExpression='REMOVE b')
-    assert ret['Attributes'] == {'b': 'cat'}
-
-    # With ReturnValues=ALL_NEW, the entire new value of the item (including
-    # old attributes we did not modify) is returned:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi', 'b': 'dog'})
-    ret=test_table_s.update_item(Key={'p': p}, ReturnValues='ALL_NEW',
-        UpdateExpression='SET b = :val',
-        ExpressionAttributeValues={':val': 'cat'})
-    assert ret['Attributes'] == {'p': p, 'a': 'hi', 'b': 'cat'}
-
-    # With ReturnValues=UPDATED_NEW, only the new values of the updated
-    # attributes are returned. Note that "updated attributes" means
-    # the newly set attributes - it doesn't require that these attributes
-    # have any previous values
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 'hi', 'b': 'dog'})
-    ret=test_table_s.update_item(Key={'p': p}, ReturnValues='UPDATED_NEW',
-        UpdateExpression='SET b = :val, c = :val2',
-        ExpressionAttributeValues={':val': 'cat', ':val2': 'hello'})
-    assert ret['Attributes'] == {'b': 'cat', 'c': 'hello'}
-    # Deleting an attribute also counts as overwriting it, but the deleted
-    # attribute is not returned in the response - so it's empty in this case.
-    ret=test_table_s.update_item(Key={'p': p}, ReturnValues='UPDATED_NEW',
-        UpdateExpression='REMOVE b')
-    assert not 'Attributes' in ret
-    # In the above examples, UPDATED_NEW is not useful because it just
-    # returns the new values we already know from the request... UPDATED_NEW
-    # becomes more useful in read-modify-write operations:
-    p = random_string()
-    test_table_s.put_item(Item={'p': p, 'a': 1})
-    ret=test_table_s.update_item(Key={'p': p}, ReturnValues='UPDATED_NEW',
-        UpdateExpression='SET a = a + :val',
-        ExpressionAttributeValues={':val': 1})
-    assert ret['Attributes'] == {'a': 2}
-
-    # An unsupported setting such as "DOG" also results in an error:
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p}, ReturnValues='DOG',
-            UpdateExpression='SET a = a + :val',
-            ExpressionAttributeValues={':val': 1})
-    # The ReturnValues value is case sensitive, so while "NONE" is supported
-    # (and tested above), "none" isn't:
-    with pytest.raises(ClientError, match='ValidationException'):
-        test_table_s.update_item(Key={'p': p}, ReturnValues='none',
-            UpdateExpression='SET a = a + :val',
-            ExpressionAttributeValues={':val': 1})
--- a/alternator-test/test_scan.py
+++ b/alternator-test/test_scan.py
@@ -19,7 +19,7 @@

 import pytest
 from botocore.exceptions import ClientError
-from util import random_string, full_scan, full_scan_and_count, multiset
+from util import random_string, full_scan, multiset
 from boto3.dynamodb.conditions import Attr

 # Test that scanning works fine with/without pagination
@@ -189,64 +189,3 @@ def test_scan_with_key_equality_filtering(dynamodb, filled_test_table):
    got_items = full_scan(table, ScanFilter=scan_filter_c_and_another)
    expected_items = [item for item in items if "c" in item.keys() and "another" in item.keys() and item["c"] == "9" and item["another"] == "y"*16]
    assert multiset(expected_items) == multiset(got_items)
-
-# Test the "Select" parameter of Scan. The default Select mode,
-# ALL_ATTRIBUTES, returns items with all their attributes. Other modes
-# allow returning just specific attributes or just counting the results
-# without returning items at all.
-@pytest.mark.xfail(reason="Select not supported yet")
-def test_scan_select(filled_test_table):
-    test_table, items = filled_test_table
-    got_items = full_scan(test_table)
-    # By default, a scan returns all the items, with all their
-    # attributes:
-    got_items = full_scan(test_table)
-    assert multiset(items) == multiset(got_items)
-    # Select=ALL_ATTRIBUTES does exactly the same as the default - return
-    # all attributes:
-    got_items = full_scan(test_table, Select='ALL_ATTRIBUTES')
-    assert multiset(items) == multiset(got_items)
-    # Select=ALL_PROJECTED_ATTRIBUTES is not allowed on a base table (it
-    # is just for indexes, when IndexName is specified)
-    with pytest.raises(ClientError, match='ValidationException'):
-        full_scan(test_table, Select='ALL_PROJECTED_ATTRIBUTES')
-    # Select=SPECIFIC_ATTRIBUTES requires that either an AttributesToGet
-    # or ProjectionExpression appears, but then really does nothing beyond
-    # what AttributesToGet and ProjectionExpression already do:
-    with pytest.raises(ClientError, match='ValidationException'):
-        full_scan(test_table, Select='SPECIFIC_ATTRIBUTES')
-    wanted = ['c', 'another']
-    got_items = full_scan(test_table, Select='SPECIFIC_ATTRIBUTES', AttributesToGet=wanted)
-    expected_items = [{k: x[k] for k in wanted if k in x} for x in items]
-    assert multiset(expected_items) == multiset(got_items)
-    got_items = full_scan(test_table, Select='SPECIFIC_ATTRIBUTES', ProjectionExpression=','.join(wanted))
-    assert multiset(expected_items) == multiset(got_items)
-    # Select=COUNT just returns a count - not any items
-    (got_count, got_items) = full_scan_and_count(test_table, Select='COUNT')
-    assert got_count == len(items)
-    assert got_items == []
-    # Check that we also get a count in regular scans - not just with
-    # Select=COUNT, but without Select=COUNT we get both items and count:
-    (got_count, got_items) = full_scan_and_count(test_table)
-    assert got_count == len(items)
-    assert multiset(items) == multiset(got_items)
-    # Select with some unknown string generates a validation exception:
-    with pytest.raises(ClientError, match='ValidationException'):
-        full_scan(test_table, Select='UNKNOWN')
-
-# Test parallel scan, i.e., the Segment and TotalSegments options.
-# In the following test we check that these parameters allow splitting
-# a scan into multiple parts, and that these parts are in fact disjoint,
-# and their union is the entire contents of the table. We do not actually
-# try to run these queries in *parallel* in this test.
-@pytest.mark.xfail(reason="parallel scan not supported yet")
-def test_scan_parallel(filled_test_table):
-    test_table, items = filled_test_table
-    for nsegments in [1, 2, 17]:
-        print('Testing TotalSegments={}'.format(nsegments))
-        got_items = []
-        for segment in range(nsegments):
-            got_items.extend(full_scan(test_table, TotalSegments=nsegments, Segment=segment))
-        # The following comparison verifies that each of the expected items
-        # was returned in one - and just one - of the segments.
-        assert multiset(items) == multiset(got_items)
--- a/alternator-test/test_update_expression.py
+++ b/alternator-test/test_update_expression.py
@@ -584,7 +584,7 @@ def test_update_expression_if_not_exists(test_table_s):
 # value may itself be a function call - ad infinitum. So expressions like
 # list_append(if_not_exists(a, :val1), :val2) are legal and so is deeper
 # nesting.
-@pytest.mark.xfail(reason="for unknown reason, DynamoDB does not allow nesting list_append")
+@pytest.mark.xfail(reason="SET functions not yet implemented")
 def test_update_expression_function_nesting(test_table_s):
    p = random_string()
    test_table_s.update_item(Key={'p': p},
--- a/alternator-test/util.py
+++ b/alternator-test/util.py
@@ -39,26 +39,6 @@ def full_scan(table, **kwargs):
        items.extend(response['Items'])
    return items

-# full_scan_and_count returns both items and count as returned by the server.
-# Note that count isn't simply len(items) - the server returns them
-# independently. e.g., with Select='COUNT' the items are not returned, but
-# count is.
-def full_scan_and_count(table, **kwargs):
-    response = table.scan(**kwargs)
-    items = []
-    count = 0
-    if 'Items' in response:
-        items.extend(response['Items'])
-    if 'Count' in response:
-        count = count + response['Count']
-    while 'LastEvaluatedKey' in response:
-        response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'], **kwargs)
-        if 'Items' in response:
-            items.extend(response['Items'])
-        if 'Count' in response:
-            count = count + response['Count']
-    return (count, items)
-
 # Utility function for fetching the entire results of a query into an array of items
 def full_query(table, **kwargs):
    response = table.query(**kwargs)
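Note on the removed full_scan_and_count helper: it accumulates the server-reported Count separately from the returned Items, page by page, because with Select='COUNT' the server returns a count but no items. The pattern can be sketched as below. FakeTable is a hypothetical stand-in for a boto3 Table, introduced only so the sketch runs without a server, and scan_and_count is an illustrative restatement, not the code from the diff.

```python
class FakeTable:
    """Hypothetical stand-in for a boto3 Table: serves fixed pages of items."""

    def __init__(self, pages):
        self.pages = pages

    def scan(self, ExclusiveStartKey=None, **kwargs):
        # In this fake, the "key" is simply the index of the next page.
        index = 0 if ExclusiveStartKey is None else ExclusiveStartKey
        response = {'Count': len(self.pages[index])}
        # With Select='COUNT' only the count is returned, not the items.
        if kwargs.get('Select') != 'COUNT':
            response['Items'] = list(self.pages[index])
        if index + 1 < len(self.pages):
            response['LastEvaluatedKey'] = index + 1
        return response


def scan_and_count(table, **kwargs):
    """Follow LastEvaluatedKey until exhausted, summing Count and Items."""
    items, count = [], 0
    response = table.scan(**kwargs)
    while True:
        items.extend(response.get('Items', []))
        count += response.get('Count', 0)
        if 'LastEvaluatedKey' not in response:
            return count, items
        response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'], **kwargs)
```

Keeping the two accumulators independent is what lets a test assert that Select='COUNT' yields a non-zero count together with an empty item list.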
--- a/alternator/auth.cc
+++ b/alternator/auth.cc
@@ -1,147 +0,0 @@
-/*
- * Copyright 2019 ScyllaDB
- */
-
-/*
- * This file is part of Scylla.
- *
- * Scylla is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Affero General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * Scylla is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU Affero General Public License
- * along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include "alternator/error.hh"
-#include "log.hh"
-#include <string>
-#include <string_view>
-#include <gnutls/crypto.h>
-#include <seastar/util/defer.hh>
-#include "hashers.hh"
-#include "bytes.hh"
-#include "alternator/auth.hh"
-#include <fmt/format.h>
-#include "auth/common.hh"
-#include "auth/password_authenticator.hh"
-#include "auth/roles-metadata.hh"
-#include "cql3/query_processor.hh"
-#include "cql3/untyped_result_set.hh"
-
-namespace alternator {
-
-static logging::logger alogger("alternator-auth");
-
-static hmac_sha256_digest hmac_sha256(std::string_view key, std::string_view msg) {
-    hmac_sha256_digest digest;
-    int ret = gnutls_hmac_fast(GNUTLS_MAC_SHA256, key.data(), key.size(), msg.data(), msg.size(), digest.data());
-    if (ret) {
-        throw std::runtime_error(fmt::format("Computing HMAC failed ({}): {}", ret, gnutls_strerror(ret)));
-    }
-    return digest;
-}
-
-static hmac_sha256_digest get_signature_key(std::string_view key, std::string_view date_stamp, std::string_view region_name, std::string_view service_name) {
-    auto date = hmac_sha256("AWS4" + std::string(key), date_stamp);
-    auto region = hmac_sha256(std::string_view(date.data(), date.size()), region_name);
-    auto service = hmac_sha256(std::string_view(region.data(), region.size()), service_name);
-    auto signing = hmac_sha256(std::string_view(service.data(), service.size()), "aws4_request");
-    return signing;
-}
-
-static std::string apply_sha256(std::string_view msg) {
-    sha256_hasher hasher;
-    hasher.update(msg.data(), msg.size());
-    return to_hex(hasher.finalize());
-}
-
-static std::string format_time_point(db_clock::time_point tp) {
-    time_t time_point_repr = db_clock::to_time_t(tp);
-    std::string time_point_str;
-    time_point_str.resize(17);
-    ::tm time_buf;
-    // strftime prints the terminating null character as well
-    std::strftime(time_point_str.data(), time_point_str.size(), "%Y%m%dT%H%M%SZ", ::gmtime_r(&time_point_repr, &time_buf));
-    time_point_str.resize(16);
-    return time_point_str;
-}
-
-void check_expiry(std::string_view signature_date) {
-    //FIXME: The default 15min can be changed with X-Amz-Expires header - we should honor it
-    std::string expiration_str = format_time_point(db_clock::now() - 15min);
-    std::string validity_str = format_time_point(db_clock::now() + 15min);
-    if (signature_date < expiration_str) {
-        throw api_error("InvalidSignatureException",
-                fmt::format("Signature expired: {} is now earlier than {} (current time - 15 min.)",
-                signature_date, expiration_str));
-    }
-    if (signature_date > validity_str) {
-        throw api_error("InvalidSignatureException",
-                fmt::format("Signature not yet current: {} is still later than {} (current time + 15 min.)",
-                signature_date, validity_str));
-    }
-}
-
-std::string get_signature(std::string_view access_key_id, std::string_view secret_access_key, std::string_view host, std::string_view method,
-        std::string_view orig_datestamp, std::string_view signed_headers_str, const std::map<std::string_view, std::string_view>& signed_headers_map,
-        std::string_view body_content, std::string_view region, std::string_view service, std::string_view query_string) {
-    auto amz_date_it = signed_headers_map.find("x-amz-date");
-    if (amz_date_it == signed_headers_map.end()) {
-        throw api_error("InvalidSignatureException", "X-Amz-Date header is mandatory for signature verification");
-    }
-    std::string_view amz_date = amz_date_it->second;
-    check_expiry(amz_date);
-    std::string_view datestamp = amz_date.substr(0, 8);
-    if (datestamp != orig_datestamp) {
-        throw api_error("InvalidSignatureException",
-                format("X-Amz-Date date does not match the provided datestamp. Expected {}, got {}",
-                        orig_datestamp, datestamp));
-    }
-    std::string_view canonical_uri = "/";
-
-    std::stringstream canonical_headers;
-    for (const auto& header : signed_headers_map) {
-        canonical_headers << fmt::format("{}:{}", header.first, header.second) << '\n';
-    }
-
-    std::string payload_hash = apply_sha256(body_content);
-    std::string canonical_request = fmt::format("{}\n{}\n{}\n{}\n{}\n{}", method, canonical_uri, query_string, canonical_headers.str(), signed_headers_str, payload_hash);
-
-    std::string_view algorithm = "AWS4-HMAC-SHA256";
-    std::string credential_scope = fmt::format("{}/{}/{}/aws4_request", datestamp, region, service);
-    std::string string_to_sign = fmt::format("{}\n{}\n{}\n{}", algorithm, amz_date, credential_scope,  apply_sha256(canonical_request));
-
-    hmac_sha256_digest signing_key = get_signature_key(secret_access_key, datestamp, region, service);
-    hmac_sha256_digest signature = hmac_sha256(std::string_view(signing_key.data(), signing_key.size()), string_to_sign);
-
-    return to_hex(bytes_view(reinterpret_cast<const int8_t*>(signature.data()), signature.size()));
-}
-
-future<std::string> get_key_from_roles(cql3::query_processor& qp, std::string username) {
-    static const sstring query = format("SELECT salted_hash FROM {} WHERE {} = ?",
-            auth::meta::roles_table::qualified_name(), auth::meta::roles_table::role_col_name);
-
-    auto cl = auth::password_authenticator::consistency_for_user(username);
-    auto timeout = auth::internal_distributed_timeout_config();
-    return qp.process(query, cl, timeout, {sstring(username)}, true).then_wrapped([username = std::move(username)] (future<::shared_ptr<cql3::untyped_result_set>> f) {
-        auto res = f.get0();
-        auto salted_hash = std::optional<sstring>();
-        if (res->empty()) {
-            throw api_error("UnrecognizedClientException", fmt::format("User not found: {}", username));
-        }
-        salted_hash = res->one().get_opt<sstring>("salted_hash");
-        if (!salted_hash) {
-            throw api_error("UnrecognizedClientException", fmt::format("No password found for user: {}", username));
-        }
-        return make_ready_future<std::string>(*salted_hash);
-    });
-}
-
-}
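Note on the removed alternator/auth.cc: get_signature_key chains four HMAC-SHA256 steps (date, region, service, "aws4_request"), and get_signature then signs a string built from the algorithm name, the X-Amz-Date value, the credential scope, and the hash of the canonical request. A minimal Python sketch of the same chain, using only the standard library, might look like the following. The function names are illustrative, and the canonical request is taken as an already-built string rather than assembled from headers.

```python
import hashlib
import hmac


def hmac_sha256(key: bytes, msg: str) -> bytes:
    """HMAC-SHA256 of msg under key, returned as raw bytes."""
    return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest()


def signing_key(secret_access_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the signing key: each step keys the next HMAC."""
    k_date = hmac_sha256(('AWS4' + secret_access_key).encode('utf-8'), date_stamp)
    k_region = hmac_sha256(k_date, region)
    k_service = hmac_sha256(k_region, service)
    return hmac_sha256(k_service, 'aws4_request')


def sign(secret_access_key: str, amz_date: str, region: str, service: str,
         canonical_request: str) -> str:
    """Return the hex signature for an already-built canonical request."""
    date_stamp = amz_date[:8]  # YYYYMMDD prefix of the X-Amz-Date value
    scope = '{}/{}/{}/aws4_request'.format(date_stamp, region, service)
    hashed_request = hashlib.sha256(canonical_request.encode('utf-8')).hexdigest()
    string_to_sign = '\n'.join(['AWS4-HMAC-SHA256', amz_date, scope, hashed_request])
    key = signing_key(secret_access_key, date_stamp, region, service)
    return hmac_sha256(key, string_to_sign).hex()
```

Because the date stamp is the first input to the chain, a signing key derived for one day cannot be reused to produce a valid signature for another.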
--- a/alternator/auth.hh
+++ b/alternator/auth.hh
@@ -1,46 +0,0 @@
-/*
- * Copyright 2019 ScyllaDB
- */
-
-/*
- * This file is part of Scylla.
- *
- * Scylla is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Affero General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * Scylla is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU Affero General Public License
- * along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#pragma once
-
-#include <string>
-#include <string_view>
-#include <array>
-#include "gc_clock.hh"
-#include "utils/loading_cache.hh"
-
-namespace cql3 {
-class query_processor;
-}
-
-namespace alternator {
-
-using hmac_sha256_digest = std::array<char, 32>;
-
-using key_cache = utils::loading_cache<std::string, std::string>;
-
-std::string get_signature(std::string_view access_key_id, std::string_view secret_access_key, std::string_view host, std::string_view method,
-        std::string_view orig_datestamp, std::string_view signed_headers_str, const std::map<std::string_view, std::string_view>& signed_headers_map,
-        std::string_view body_content, std::string_view region, std::string_view service, std::string_view query_string);
-
-future<std::string> get_key_from_roles(cql3::query_processor& qp, std::string username);
-
-}
--- a/alternator/base64.hh
+++ b/alternator/base64.hh
@@ -21,14 +21,8 @@

 #pragma once

-#include <string_view>
+#include <string>
 #include "bytes.hh"
-#include "rjson.hh"

 std::string base64_encode(bytes_view);
-
 bytes base64_decode(std::string_view);
-
-inline bytes base64_decode(const rjson::value& v) {
-  return base64_decode(std::string_view(v.GetString(), v.GetStringLength()));
-}
--- a/alternator/conditions.cc
+++ b/alternator/conditions.cc
@@ -27,9 +27,6 @@
 #include "cql3/constants.hh"
 #include <unordered_map>
 #include "rjson.hh"
-#include "serialization.hh"
-#include "base64.hh"
-#include <stdexcept>

 namespace alternator {

@@ -38,19 +35,13 @@ static logging::logger clogger("alternator-conditions");
 comparison_operator_type get_comparison_operator(const rjson::value& comparison_operator) {
    static std::unordered_map<std::string, comparison_operator_type> ops = {
            {"EQ", comparison_operator_type::EQ},
-            {"NE", comparison_operator_type::NE},
            {"LE", comparison_operator_type::LE},
            {"LT", comparison_operator_type::LT},
            {"GE", comparison_operator_type::GE},
            {"GT", comparison_operator_type::GT},
-            {"IN", comparison_operator_type::IN},
-            {"NULL", comparison_operator_type::IS_NULL},
-            {"NOT_NULL", comparison_operator_type::NOT_NULL},
            {"BETWEEN", comparison_operator_type::BETWEEN},
            {"BEGINS_WITH", comparison_operator_type::BEGINS_WITH},
-            {"CONTAINS", comparison_operator_type::CONTAINS},
-            {"NOT_CONTAINS", comparison_operator_type::NOT_CONTAINS},
-    };
+    }; //TODO(sarna): NE, IN, CONTAINS, NULL, NOT_NULL
    if (!comparison_operator.IsString()) {
        throw api_error("ValidationException", format("Invalid comparison operator definition {}", rjson::print(comparison_operator)));
    }
@@ -105,324 +96,30 @@ static ::shared_ptr<cql3::restrictions::single_column_restriction::EQ> make_key_
    return filtering_restrictions;
 }

-namespace {
-
-struct size_check {
-    // True iff size passes this check.
-    virtual bool operator()(rapidjson::SizeType size) const = 0;
-    // Check description, such that format("expected array {}", check.what()) is human-readable.
-    virtual sstring what() const = 0;
-};
-
-class exact_size : public size_check {
-    rapidjson::SizeType _expected;
-  public:
-    explicit exact_size(rapidjson::SizeType expected) : _expected(expected) {}
-    bool operator()(rapidjson::SizeType size) const override { return size == _expected; }
-    sstring what() const override { return format("of size {}", _expected); }
-};
-
-struct empty : public size_check {
-    bool operator()(rapidjson::SizeType size) const override { return size < 1; }
-    sstring what() const override { return "to be empty"; }
-};
-
-struct nonempty : public size_check {
-    bool operator()(rapidjson::SizeType size) const override { return size > 0; }
-    sstring what() const override { return "to be non-empty"; }
-};
-
-} // anonymous namespace
-
-// Check that array has the expected number of elements
-static void verify_operand_count(const rjson::value* array, const size_check& expected, const rjson::value& op) {
-    if (!array || !array->IsArray()) {
-        throw api_error("ValidationException", "With ComparisonOperator, AttributeValueList must be given and an array");
-    }
-    if (!expected(array->Size())) {
-        throw api_error("ValidationException",
-                        format("{} operator requires AttributeValueList {}, instead found list size {}",
-                               op, expected.what(), array->Size()));
-    }
-}
-
-struct rjson_engaged_ptr_comp {
-    bool operator()(const rjson::value* p1, const rjson::value* p2) const {
-        return rjson::single_value_comp()(*p1, *p2);
-    }
-};
-
-// It's not enough to compare underlying JSON objects when comparing sets,
-// as internally they're stored in an array, and the order of elements is
-// not important in set equality. See issue #5021
-static bool check_EQ_for_sets(const rjson::value& set1, const rjson::value& set2) {
-    if (set1.Size() != set2.Size()) {
-        return false;
-    }
-    std::set<const rjson::value*, rjson_engaged_ptr_comp> set1_raw;
-    for (auto it = set1.Begin(); it != set1.End(); ++it) {
-        set1_raw.insert(&*it);
-    }
-    for (const auto& a : set2.GetArray()) {
-        if (set1_raw.count(&a) == 0) {
-            return false;
-        }
-    }
-    return true;
-}
-
 // Check if two JSON-encoded values match with the EQ relation
-static bool check_EQ(const rjson::value* v1, const rjson::value& v2) {
-    if (!v1) {
-        return false;
-    }
-    if (v1->IsObject() && v1->MemberCount() == 1 && v2.IsObject() && v2.MemberCount() == 1) {
-        auto it1 = v1->MemberBegin();
-        auto it2 = v2.MemberBegin();
-        if ((it1->name == "SS" && it2->name == "SS") || (it1->name == "NS" && it2->name == "NS") || (it1->name == "BS" && it2->name == "BS")) {
-            return check_EQ_for_sets(it1->value, it2->value);
-        }
-    }
-    return *v1 == v2;
-}
-
-// Check if two JSON-encoded values match with the NE relation
-static bool check_NE(const rjson::value* v1, const rjson::value& v2) {
-    return !v1 || *v1 != v2; // null is unequal to anything.
+static bool check_EQ(const rjson::value& v1, const rjson::value& v2) {
+    return v1 == v2;
 }

 // Check if two JSON-encoded values match with the BEGINS_WITH relation
-static bool check_BEGINS_WITH(const rjson::value* v1, const rjson::value& v2) {
-    // BEGINS_WITH requires that its single operand (v2) be a string or
-    // binary - otherwise it's a validation error. However, problems with
-    // the stored attribute (v1) will just return false (no match).
-    if (!v2.IsObject() || v2.MemberCount() != 1) {
-        throw api_error("ValidationException", format("BEGINS_WITH operator encountered malformed AttributeValue: {}", v2));
-    }
-    auto it2 = v2.MemberBegin();
-    if (it2->name != "S" && it2->name != "B") {
-        throw api_error("ValidationException", format("BEGINS_WITH operator requires String or Binary in AttributeValue, got {}", it2->name));
-    }
-
-
-    if (!v1 || !v1->IsObject() || v1->MemberCount() != 1) {
+static bool check_BEGINS_WITH(const rjson::value& v1, const rjson::value& v2) {
+    // BEGINS_WITH only supports comparing two strings or two binaries -
+    // any other combinations of types, or other malformed values, return
+    // false (no match).
+    if (!v1.IsObject() || v1.MemberCount() != 1 || !v2.IsObject() || v2.MemberCount() != 1) {
        return false;
    }
-    auto it1 = v1->MemberBegin();
+    auto it1 = v1.MemberBegin();
+    auto it2 = v2.MemberBegin();
    if (it1->name != it2->name) {
        return false;
    }
-    if (it2->name == "S") {
-        std::string_view val1(it1->value.GetString(), it1->value.GetStringLength());
-        std::string_view val2(it2->value.GetString(), it2->value.GetStringLength());
-        return val1.substr(0, val2.size()) == val2;
-    } else /* it2->name == "B" */ {
-        // TODO (optimization): Check the begins_with condition directly on
-        // the base64-encoded string, without making a decoded copy.
-        bytes val1 = base64_decode(it1->value);
-        bytes val2 = base64_decode(it2->value);
-        return val1.substr(0, val2.size()) == val2;
-    }
-}
-
-static std::string_view to_string_view(const rjson::value& v) {
-    return std::string_view(v.GetString(), v.GetStringLength());
-}
-
-static bool is_set_of(const rjson::value& type1, const rjson::value& type2) {
-    return (type2 == "S" && type1 == "SS") || (type2 == "N" && type1 == "NS") || (type2 == "B" && type1 == "BS");
-}
-
-// Check if two JSON-encoded values match with the CONTAINS relation
-static bool check_CONTAINS(const rjson::value* v1, const rjson::value& v2) {
-    if (!v1) {
+    if (it1->name != "S" && it1->name != "B") {
        return false;
    }
-    const auto& kv1 = *v1->MemberBegin();
-    const auto& kv2 = *v2.MemberBegin();
-    if (kv2.name != "S" && kv2.name != "N" &&  kv2.name != "B") {
-        throw api_error("ValidationException",
-                        format("CONTAINS operator requires a single AttributeValue of type String, Number, or Binary, "
-                               "got {} instead", kv2.name));
-    }
-    if (kv1.name == "S" && kv2.name == "S") {
-        return to_string_view(kv1.value).find(to_string_view(kv2.value)) != std::string_view::npos;
-    } else if (kv1.name == "B" && kv2.name == "B") {
-        return base64_decode(kv1.value).find(base64_decode(kv2.value)) != bytes::npos;
-    } else if (is_set_of(kv1.name, kv2.name)) {
-        for (auto i = kv1.value.Begin(); i != kv1.value.End(); ++i) {
-            if (*i == kv2.value) {
-                return true;
-            }
-        }
-    } else if (kv1.name == "L") {
-        for (auto i = kv1.value.Begin(); i != kv1.value.End(); ++i) {
-            if (!i->IsObject() || i->MemberCount() != 1) {
-                clogger.error("check_CONTAINS received a list whose element is malformed");
-                return false;
-            }
-            const auto& el = *i->MemberBegin();
-            if (el.name == kv2.name && el.value == kv2.value) {
-                return true;
-            }
-        }
-    }
-    return false;
-}
-
-// Check if two JSON-encoded values match with the NOT_CONTAINS relation
-static bool check_NOT_CONTAINS(const rjson::value* v1, const rjson::value& v2) {
-    if (!v1) {
-        return false;
-    }
-    return !check_CONTAINS(v1, v2);
-}
-
-// Check if a JSON-encoded value equals any element of an array, which must have at least one element.
-static bool check_IN(const rjson::value* val, const rjson::value& array) {
-    if (!array[0].IsObject() || array[0].MemberCount() != 1) {
-        throw api_error("ValidationException",
-                        format("IN operator encountered malformed AttributeValue: {}", array[0]));
-    }
-    const auto& type = array[0].MemberBegin()->name;
-    if (type != "S" && type != "N" && type != "B") {
-        throw api_error("ValidationException",
-                        "IN operator requires AttributeValueList elements to be of type String, Number, or Binary ");
-    }
-    if (!val) {
-        return false;
-    }
-    bool have_match = false;
-    for (const auto& elem : array.GetArray()) {
-        if (!elem.IsObject() || elem.MemberCount() != 1 || elem.MemberBegin()->name != type) {
-            throw api_error("ValidationException",
-                            "IN operator requires all AttributeValueList elements to have the same type ");
-        }
-        if (!have_match && *val == elem) {
-            // Can't return yet, must check types of all array elements. <sigh>
-            have_match = true;
-        }
-    }
-    return have_match;
-}
-
-static bool check_NULL(const rjson::value* val) {
-    return val == nullptr;
-}
-
-static bool check_NOT_NULL(const rjson::value* val) {
-    return val != nullptr;
-}
-
-// Check if two JSON-encoded values match with cmp.
-template <typename Comparator>
-bool check_compare(const rjson::value* v1, const rjson::value& v2, const Comparator& cmp) {
-    if (!v2.IsObject() || v2.MemberCount() != 1) {
-        throw api_error("ValidationException",
-                        format("{} requires a single AttributeValue of type String, Number, or Binary",
-                               cmp.diagnostic));
-    }
-    const auto& kv2 = *v2.MemberBegin();
-    if (kv2.name != "S" && kv2.name != "N" && kv2.name != "B") {
-        throw api_error("ValidationException",
-                        format("{} requires a single AttributeValue of type String, Number, or Binary",
-                               cmp.diagnostic));
-    }
-    if (!v1 || !v1->IsObject() || v1->MemberCount() != 1) {
-        return false;
-    }
-    const auto& kv1 = *v1->MemberBegin();
-    if (kv1.name != kv2.name) {
-        return false;
-    }
-    if (kv1.name == "N") {
-        return cmp(unwrap_number(*v1, cmp.diagnostic), unwrap_number(v2, cmp.diagnostic));
-    }
-    if (kv1.name == "S") {
-        return cmp(std::string_view(kv1.value.GetString(), kv1.value.GetStringLength()),
-                   std::string_view(kv2.value.GetString(), kv2.value.GetStringLength()));
-    }
-    if (kv1.name == "B") {
-        return cmp(base64_decode(kv1.value), base64_decode(kv2.value));
-    }
-    clogger.error("check_compare panic: LHS type equals RHS type, but one is in {N,S,B} while the other isn't");
-    return false;
-}
-
-struct cmp_lt {
-    template <typename T> bool operator()(const T& lhs, const T& rhs) const { return lhs < rhs; }
-    static constexpr const char* diagnostic = "LT operator";
-};
-
-struct cmp_le {
-    // bytes only has <, so we cannot use <=.
-    template <typename T> bool operator()(const T& lhs, const T& rhs) const { return lhs < rhs || lhs == rhs; }
-    static constexpr const char* diagnostic = "LE operator";
-};
-
-struct cmp_ge {
-    // bytes only has <, so we cannot use >=.
-    template <typename T> bool operator()(const T& lhs, const T& rhs) const { return rhs < lhs || lhs == rhs; }
-    static constexpr const char* diagnostic = "GE operator";
-};
-
-struct cmp_gt {
-    // bytes only has <, so we cannot use >.
-    template <typename T> bool operator()(const T& lhs, const T& rhs) const { return rhs < lhs; }
-    static constexpr const char* diagnostic = "GT operator";
-};
-
-// True if v is between lb and ub, inclusive.  Throws if lb > ub.
-template <typename T>
-bool check_BETWEEN(const T& v, const T& lb, const T& ub) {
-    if (ub < lb) {
-        throw api_error("ValidationException",
-                        format("BETWEEN operator requires lower_bound <= upper_bound, but {} > {}", lb, ub));
-    }
-    return cmp_ge()(v, lb) && cmp_le()(v, ub);
-}
-
-static bool check_BETWEEN(const rjson::value* v, const rjson::value& lb, const rjson::value& ub) {
-    if (!v) {
-        return false;
-    }
-    if (!v->IsObject() || v->MemberCount() != 1) {
-        throw api_error("ValidationException", format("BETWEEN operator encountered malformed AttributeValue: {}", *v));
-    }
-    if (!lb.IsObject() || lb.MemberCount() != 1) {
-        throw api_error("ValidationException", format("BETWEEN operator encountered malformed AttributeValue: {}", lb));
-    }
-    if (!ub.IsObject() || ub.MemberCount() != 1) {
-        throw api_error("ValidationException", format("BETWEEN operator encountered malformed AttributeValue: {}", ub));
-    }
-
-    const auto& kv_v = *v->MemberBegin();
-    const auto& kv_lb = *lb.MemberBegin();
-    const auto& kv_ub = *ub.MemberBegin();
-    if (kv_lb.name != kv_ub.name) {
-        throw api_error(
-                "ValidationException",
-                format("BETWEEN operator requires the same type for lower and upper bound; instead got {} and {}",
-                       kv_lb.name, kv_ub.name));
-    }
-    if (kv_v.name != kv_lb.name) { // Cannot compare different types, so v is NOT between lb and ub.
-        return false;
-    }
-    if (kv_v.name == "N") {
-        const char* diag = "BETWEEN operator";
-        return check_BETWEEN(unwrap_number(*v, diag), unwrap_number(lb, diag), unwrap_number(ub, diag));
-    }
-    if (kv_v.name == "S") {
-        return check_BETWEEN(std::string_view(kv_v.value.GetString(), kv_v.value.GetStringLength()),
-                             std::string_view(kv_lb.value.GetString(), kv_lb.value.GetStringLength()),
-                             std::string_view(kv_ub.value.GetString(), kv_ub.value.GetStringLength()));
-    }
-    if (kv_v.name == "B") {
-        return check_BETWEEN(base64_decode(kv_v.value), base64_decode(kv_lb.value), base64_decode(kv_ub.value));
-    }
-    throw api_error("ValidationException",
-        format("BETWEEN operator requires AttributeValueList elements to be of type String, Number, or Binary; instead got {}",
-               kv_lb.name));
+    std::string_view val1(it1->value.GetString(), it1->value.GetStringLength());
+    std::string_view val2(it2->value.GetString(), it2->value.GetStringLength());
+    return val1.substr(0, val2.size()) == val2;
 }

 // Verify one Expect condition on one attribute (whose content is "got")
@@ -445,7 +142,7 @@ static bool verify_expected_one(const rjson::value& condition, const rjson::valu
        if (comparison_operator) {
            throw api_error("ValidationException", "Cannot combine Value with ComparisonOperator");
        }
-        return check_EQ(got, *value);
+        return got && check_EQ(*got, *value);
    } else if (exists) {
        if (comparison_operator) {
            throw api_error("ValidationException", "Cannot combine Exists with ComparisonOperator");
@@ -459,49 +156,33 @@ static bool verify_expected_one(const rjson::value& condition, const rjson::valu
        if (!comparison_operator) {
            throw api_error("ValidationException", "Missing ComparisonOperator, Value or Exists");
        }
+        if (!attribute_value_list || !attribute_value_list->IsArray()) {
+            throw api_error("ValidationException", "With ComparisonOperator, AttributeValueList must be given and an array");
+        }
        comparison_operator_type op = get_comparison_operator(*comparison_operator);
        switch (op) {
        case comparison_operator_type::EQ:
-            verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
-            return check_EQ(got, (*attribute_value_list)[0]);
-        case comparison_operator_type::NE:
-            verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
-            return check_NE(got, (*attribute_value_list)[0]);
-        case comparison_operator_type::LT:
-            verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
-            return check_compare(got, (*attribute_value_list)[0], cmp_lt{});
-        case comparison_operator_type::LE:
-            verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
-            return check_compare(got, (*attribute_value_list)[0], cmp_le{});
-        case comparison_operator_type::GT:
-            verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
-            return check_compare(got, (*attribute_value_list)[0], cmp_gt{});
-        case comparison_operator_type::GE:
-            verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
-            return check_compare(got, (*attribute_value_list)[0], cmp_ge{});
+            if (attribute_value_list->Size() != 1) {
+                throw api_error("ValidationException", "EQ operator requires one element in AttributeValueList");
+            }
+            if (got) {
+                const rjson::value& expected = (*attribute_value_list)[0];
+                return check_EQ(*got, expected);
+            }
+            return false;
        case comparison_operator_type::BEGINS_WITH:
-            verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
-            return check_BEGINS_WITH(got, (*attribute_value_list)[0]);
-        case comparison_operator_type::IN:
-            verify_operand_count(attribute_value_list, nonempty(), *comparison_operator);
-            return check_IN(got, *attribute_value_list);
-        case comparison_operator_type::IS_NULL:
-            verify_operand_count(attribute_value_list, empty(), *comparison_operator);
-            return check_NULL(got);
-        case comparison_operator_type::NOT_NULL:
-            verify_operand_count(attribute_value_list, empty(), *comparison_operator);
-            return check_NOT_NULL(got);
-        case comparison_operator_type::BETWEEN:
-            verify_operand_count(attribute_value_list, exact_size(2), *comparison_operator);
-            return check_BETWEEN(got, (*attribute_value_list)[0], (*attribute_value_list)[1]);
-        case comparison_operator_type::CONTAINS:
-            verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
-            return check_CONTAINS(got, (*attribute_value_list)[0]);
-        case comparison_operator_type::NOT_CONTAINS:
-            verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
-            return check_NOT_CONTAINS(got, (*attribute_value_list)[0]);
+            if (attribute_value_list->Size() != 1) {
+                throw api_error("ValidationException", "BEGINS_WITH operator requires one element in AttributeValueList");
+            }
+            if (got) {
+                const rjson::value& expected = (*attribute_value_list)[0];
+                return check_BEGINS_WITH(*got, expected);
+            }
+            return false;
+        default:
+            // FIXME: implement all the missing types, so there will be no default here.
+            throw api_error("ValidationException", format("ComparisonOperator {} is not yet supported", *comparison_operator));
        }
-        throw std::logic_error(format("Internal error: corrupted operator enum: {}", int(op)));
    }
 }

--- a/alternator/conditions.hh
+++ b/alternator/conditions.hh
@@ -37,7 +37,7 @@
 namespace alternator {

 enum class comparison_operator_type {
-    EQ, NE, LE, LT, GE, GT, IN, BETWEEN, CONTAINS, NOT_CONTAINS, IS_NULL, NOT_NULL, BEGINS_WITH
+    EQ, NE, LE, LT, GE, GT, IN, BETWEEN, CONTAINS, IS_NULL, NOT_NULL, BEGINS_WITH
 };

 comparison_operator_type get_comparison_operator(const rjson::value& comparison_operator);
--- a/alternator/executor.cc
+++ b/alternator/executor.cc
@@ -35,7 +35,6 @@
 #include "query-result-reader.hh"
 #include "cql3/selection/selection.hh"
 #include "cql3/result_set.hh"
-#include "cql3/type_json.hh"
 #include "bytes.hh"
 #include "cql3/update_parameters.hh"
 #include "server.hh"
@@ -50,7 +49,6 @@
 #include "utils/big_decimal.hh"
 #include "seastar/json/json_elements.hh"
 #include <boost/algorithm/cxx11/any_of.hpp>
-#include "collection_mutation.hh"

 #include <boost/range/adaptors.hpp>

@@ -238,75 +236,17 @@ static std::string get_string_attribute(const rjson::value& value, rjson::string
                attribute_name, value));
    }
    return attribute_value->GetString();
-}
-
-// Convenience function for getting the value of a boolean attribute, or a
-// default value if it is missing. If the attribute exists, but is not a
-// bool, a descriptive api_error is thrown.
-static bool get_bool_attribute(const rjson::value& value, rjson::string_ref_type attribute_name, bool default_return) {
-    const rjson::value* attribute_value = rjson::find(value, attribute_name);
-    if (!attribute_value) {
-        return default_return;
-    }
-    if (!attribute_value->IsBool()) {
-        throw api_error("ValidationException", format("Expected boolean value for attribute {}, got: {}",
-                attribute_name, value));
-    }
-    return attribute_value->GetBool();
-}
-
-// Convenience function for getting the value of an integer attribute, or
-// an empty optional if it is missing. If the attribute exists, but is not
-// an integer, a descriptive api_error is thrown.
-static std::optional<int> get_int_attribute(const rjson::value& value, rjson::string_ref_type attribute_name) {
-    const rjson::value* attribute_value = rjson::find(value, attribute_name);
-    if (!attribute_value)
-        return {};
-    if (!attribute_value->IsInt()) {
-        throw api_error("ValidationException", format("Expected integer value for attribute {}, got: {}",
-                attribute_name, value));
-    }
-    return attribute_value->GetInt();
-}
-
-// Sets a KeySchema object inside the given JSON parent describing the key
-// attributes of the the given schema as being either HASH or RANGE keys.
-// Additionally, adds to a given map mappings between the key attribute
-// names and their type (as a DynamoDB type string).
-static void describe_key_schema(rjson::value& parent, const schema& schema, std::unordered_map<std::string,std::string>& attribute_types) {
-    rjson::value key_schema = rjson::empty_array();
-    for (const column_definition& cdef : schema.partition_key_columns()) {
-        rjson::value key = rjson::empty_object();
-        rjson::set(key, "AttributeName", rjson::from_string(cdef.name_as_text()));
-        rjson::set(key, "KeyType", "HASH");
-        rjson::push_back(key_schema, std::move(key));
-        attribute_types[cdef.name_as_text()] = type_to_string(cdef.type);
-
-    }
-    for (const column_definition& cdef : schema.clustering_key_columns()) {
-        rjson::value key = rjson::empty_object();
-        rjson::set(key, "AttributeName", rjson::from_string(cdef.name_as_text()));
-        rjson::set(key, "KeyType", "RANGE");
-        rjson::push_back(key_schema, std::move(key));
-        attribute_types[cdef.name_as_text()] = type_to_string(cdef.type);
-        // FIXME: this "break" can avoid listing some clustering key columns
-        // we added for GSIs just because they existed in the base table -
-        // but not in all cases. We still have issue #5320. See also
-        // reproducer in test_gsi_2_describe_table_schema.
-        break;
-    }
-    rjson::set(parent, "KeySchema", std::move(key_schema));

 }

-future<json::json_return_type> executor::describe_table(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content) {
+future<json::json_return_type> executor::describe_table(client_state& client_state, std::string content) {
    _stats.api_operations.describe_table++;
    rjson::value request = rjson::parse(content);
    elogger.trace("Describing table {}", request);

    schema_ptr schema = get_table(_proxy, request);

-    tracing::add_table_name(trace_state, schema->ks_name(), schema->cf_name());
+    tracing::add_table_name(client_state.get_trace_state(), schema->ks_name(), schema->cf_name());

    rjson::value table_description = rjson::empty_object();
    rjson::set(table_description, "TableName", rjson::from_string(schema->cf_name()));
@@ -327,11 +267,6 @@ future<json::json_return_type> executor::describe_table(client_state& client_sta
    rjson::set(table_description, "BillingModeSummary", rjson::empty_object());
    rjson::set(table_description["BillingModeSummary"], "BillingMode", "PAY_PER_REQUEST");
    rjson::set(table_description["BillingModeSummary"], "LastUpdateToPayPerRequestDateTime", rjson::value(creation_date_seconds));
-
-    std::unordered_map<std::string,std::string> key_attribute_types;
-    // Add base table's KeySchema and collect types for AttributeDefinitions:
-    describe_key_schema(table_description, *schema, key_attribute_types);
-
    table& t = _proxy.get_db().local().find_column_family(schema);
    if (!t.views().empty()) {
        rjson::value gsi_array = rjson::empty_array();
@@ -346,8 +281,6 @@ future<json::json_return_type> executor::describe_table(client_state& client_sta
            }
            sstring index_name = cf_name.substr(delim_it + 1);
            rjson::set(view_entry, "IndexName", rjson::from_string(index_name));
-            // Add indexes's KeySchema and collect types for AttributeDefinitions:
-            describe_key_schema(view_entry, *vptr, key_attribute_types);
            // Local secondary indexes are marked by an extra '!' sign occurring before the ':' delimiter
            rjson::value& index_array = (delim_it > 1 && cf_name[delim_it-1] == '!') ? lsi_array : gsi_array;
            rjson::push_back(index_array, std::move(view_entry));
@@ -359,32 +292,23 @@ future<json::json_return_type> executor::describe_table(client_state& client_sta
            rjson::set(table_description, "GlobalSecondaryIndexes", std::move(gsi_array));
        }
    }
-    // Use map built by describe_key_schema() for base and indexes to produce
-    // AttributeDefinitions for all key columns:
-    rjson::value attribute_definitions = rjson::empty_array();
-    for (auto& type : key_attribute_types) {
-        rjson::value key = rjson::empty_object();
-        rjson::set(key, "AttributeName", rjson::from_string(type.first));
-        rjson::set(key, "AttributeType", rjson::from_string(type.second));
-        rjson::push_back(attribute_definitions, std::move(key));
-    }
-    rjson::set(table_description, "AttributeDefinitions", std::move(attribute_definitions));
-
-    // FIXME: still missing some response fields (issue #5026)

+    // FIXME: more attributes! Check https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_TableDescription.html#DDB-Type-TableDescription-TableStatus but also run a test to see what DynamoDB really fills
+    // maybe for TableId or TableArn use  schema.id().to_sstring().c_str();
+    // Of course, the whole schema is missing!
    rjson::value response = rjson::empty_object();
    rjson::set(response, "Table", std::move(table_description));
    elogger.trace("returning {}", response);
    return make_ready_future<json::json_return_type>(make_jsonable(std::move(response)));
 }

-future<json::json_return_type> executor::delete_table(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content) {
+future<json::json_return_type> executor::delete_table(client_state& client_state, std::string content) {
    _stats.api_operations.delete_table++;
    rjson::value request = rjson::parse(content);
    elogger.trace("Deleting table {}", request);

    std::string table_name = get_table_name(request);
-    tracing::add_table_name(trace_state, KEYSPACE_NAME, table_name);
+    tracing::add_table_name(client_state.get_trace_state(), KEYSPACE_NAME, table_name);

    if (!_proxy.get_db().local().has_schema(KEYSPACE_NAME, table_name)) {
        throw api_error("ResourceNotFoundException",
@@ -481,14 +405,14 @@ static std::pair<std::string, std::string> parse_key_schema(const rjson::value&
 }


-future<json::json_return_type> executor::create_table(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content) {
+future<json::json_return_type> executor::create_table(client_state& client_state, std::string content) {
    _stats.api_operations.create_table++;
    rjson::value table_info = rjson::parse(content);
    elogger.trace("Creating table {}", table_info);
    std::string table_name = get_table_name(table_info);
    const rjson::value& attribute_definitions = table_info["AttributeDefinitions"];

-    tracing::add_table_name(trace_state, KEYSPACE_NAME, table_name);
+    tracing::add_table_name(client_state.get_trace_state(), KEYSPACE_NAME, table_name);

    schema_builder builder(KEYSPACE_NAME, table_name);
    auto [hash_key, range_key] = parse_key_schema(table_info);
@@ -682,8 +606,8 @@ public:
    void del(bytes&& name, api::timestamp_type ts) {
        add(std::move(name), atomic_cell::make_dead(ts, gc_clock::now()));
    }
-    collection_mutation_description to_mut() {
-        collection_mutation_description ret;
+    collection_type_impl::mutation to_mut() {
+        collection_type_impl::mutation ret;
        for (auto&& e : collected) {
            ret.cells.emplace_back(e.first, std::move(e.second));
        }
@@ -719,7 +643,7 @@ static mutation make_item_mutation(const rjson::value& item, schema_ptr schema)
    }

    if (!attrs_collector.empty()) {
-        auto serialized_map = attrs_collector.to_mut().serialize(*attrs_type());
+        auto serialized_map = attrs_type()->serialize_mutation_form(attrs_collector.to_mut());
        row.cells().apply(attrs_column(*schema), std::move(serialized_map));
    }
    // To allow creation of an item with no attributes, we need a row marker.
@@ -731,12 +655,7 @@ static mutation make_item_mutation(const rjson::value& item, schema_ptr schema)
    // Scylla proper, to implement the operation to replace an entire
    // collection ("UPDATE .. SET x = ..") - see
    // cql3::update_parameters::make_tombstone_just_before().
-    const bool use_partition_tombstone = schema->clustering_key_size() == 0;
-    if (use_partition_tombstone) {
-        m.partition().apply(tombstone(ts-1, gc_clock::now()));
-    } else {
-        row.apply(tombstone(ts-1, gc_clock::now()));
-    }
+    row.apply(tombstone(ts-1, gc_clock::now()));
    return m;
 }

@@ -748,43 +667,36 @@ static db::timeout_clock::time_point default_timeout() {

 static future<std::unique_ptr<rjson::value>> maybe_get_previous_item(
        service::storage_proxy& proxy,
-        service::client_state& client_state,
        schema_ptr schema,
        const rjson::value& item,
        bool need_read_before_write,
        alternator::stats& stats);

-future<json::json_return_type> executor::put_item(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content) {
+future<json::json_return_type> executor::put_item(client_state& client_state, std::string content) {
    _stats.api_operations.put_item++;
    auto start_time = std::chrono::steady_clock::now();
    rjson::value update_info = rjson::parse(content);
    elogger.trace("Updating value {}", update_info);

    schema_ptr schema = get_table(_proxy, update_info);
-    tracing::add_table_name(trace_state, schema->ks_name(), schema->cf_name());
+    tracing::add_table_name(client_state.get_trace_state(), schema->ks_name(), schema->cf_name());

    if (rjson::find(update_info, "ConditionExpression")) {
        throw api_error("ValidationException", "ConditionExpression is not yet implemented in alternator");
    }
-    auto return_values = get_string_attribute(update_info, "ReturnValues", "NONE");
-    if (return_values != "NONE") {
-        // FIXME: Need to support also the ALL_OLD option. See issue #5053.
-        throw api_error("ValidationException", format("Unsupported ReturnValues={} for PutItem operation", return_values));
-    }
-
    const bool has_expected = update_info.HasMember("Expected");

    const rjson::value& item = update_info["Item"];

    mutation m = make_item_mutation(item, schema);

-    return maybe_get_previous_item(_proxy, client_state, schema, item, has_expected, _stats).then(
+    return maybe_get_previous_item(_proxy, schema, item, has_expected, _stats).then(
            [this, schema, has_expected,  update_info = rjson::copy(update_info), m = std::move(m),
-             &client_state, start_time, trace_state] (std::unique_ptr<rjson::value> previous_item) mutable {
+             &client_state, start_time] (std::unique_ptr<rjson::value> previous_item) mutable {
        if (has_expected) {
            verify_expected(update_info, previous_item);
        }
-        return _proxy.mutate(std::vector<mutation>{std::move(m)}, db::consistency_level::LOCAL_QUORUM, default_timeout(), trace_state, empty_service_permit()).then([this, start_time] () {
+        return _proxy.mutate(std::vector<mutation>{std::move(m)}, db::consistency_level::LOCAL_QUORUM, default_timeout(), client_state.get_trace_state(), empty_service_permit()).then([this, start_time] () {
            _stats.api_operations.put_item_latency.add(std::chrono::steady_clock::now() - start_time, _stats.api_operations.put_item_latency._count + 1);
            // Without special options on what to return, PutItem returns nothing.
            return make_ready_future<json::json_return_type>(json_string(""));
@@ -807,32 +719,22 @@ static mutation make_delete_item_mutation(const rjson::value& key, schema_ptr sc
    clustering_key ck = ck_from_json(key, schema);
    check_key(key, schema);
    mutation m(schema, pk);
-    const bool use_partition_tombstone = schema->clustering_key_size() == 0;
-    if (use_partition_tombstone) {
-        m.partition().apply(tombstone(api::new_timestamp(), gc_clock::now()));
-    } else {
-        auto& row = m.partition().clustered_row(*schema, ck);
-        row.apply(tombstone(api::new_timestamp(), gc_clock::now()));
-    }
+    auto& row = m.partition().clustered_row(*schema, ck);
+    row.apply(tombstone(api::new_timestamp(), gc_clock::now()));
    return m;
 }

-future<json::json_return_type> executor::delete_item(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content) {
+future<json::json_return_type> executor::delete_item(client_state& client_state, std::string content) {
    _stats.api_operations.delete_item++;
    auto start_time = std::chrono::steady_clock::now();
    rjson::value update_info = rjson::parse(content);

    schema_ptr schema = get_table(_proxy, update_info);
-    tracing::add_table_name(trace_state, schema->ks_name(), schema->cf_name());
+    tracing::add_table_name(client_state.get_trace_state(), schema->ks_name(), schema->cf_name());

    if (rjson::find(update_info, "ConditionExpression")) {
        throw api_error("ValidationException", "ConditionExpression is not yet implemented in alternator");
    }
-    auto return_values = get_string_attribute(update_info, "ReturnValues", "NONE");
-    if (return_values != "NONE") {
-        // FIXME: Need to support also the ALL_OLD option. See issue #5053.
-        throw api_error("ValidationException", format("Unsupported ReturnValues={} for DeleteItem operation", return_values));
-    }
    const bool has_expected = update_info.HasMember("Expected");

    const rjson::value& key = update_info["Key"];
@@ -840,13 +742,13 @@ future<json::json_return_type> executor::delete_item(client_state& client_state,
    mutation m = make_delete_item_mutation(key, schema);
    check_key(key, schema);

-    return maybe_get_previous_item(_proxy, client_state, schema, key, has_expected, _stats).then(
+    return maybe_get_previous_item(_proxy, schema, key, has_expected, _stats).then(
            [this, schema, has_expected,  update_info = rjson::copy(update_info), m = std::move(m),
-             &client_state, start_time, trace_state] (std::unique_ptr<rjson::value> previous_item) mutable {
+             &client_state, start_time] (std::unique_ptr<rjson::value> previous_item) mutable {
        if (has_expected) {
            verify_expected(update_info, previous_item);
        }
-        return _proxy.mutate(std::vector<mutation>{std::move(m)}, db::consistency_level::LOCAL_QUORUM, default_timeout(), trace_state, empty_service_permit()).then([this, start_time] () {
+        return _proxy.mutate(std::vector<mutation>{std::move(m)}, db::consistency_level::LOCAL_QUORUM, default_timeout(), client_state.get_trace_state(), empty_service_permit()).then([this, start_time] () {
            _stats.api_operations.delete_item_latency.add(std::chrono::steady_clock::now() - start_time, _stats.api_operations.delete_item_latency._count + 1);
            // Without special options on what to return, DeleteItem returns nothing.
            return make_ready_future<json::json_return_type>(json_string(""));
@@ -879,7 +781,7 @@ struct primary_key_equal {
    }
 };

-future<json::json_return_type> executor::batch_write_item(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content) {
+future<json::json_return_type> executor::batch_write_item(client_state& client_state, std::string content) {
    _stats.api_operations.batch_write_item++;
    rjson::value batch_info = rjson::parse(content);
    rjson::value& request_items = batch_info["RequestItems"];
@@ -889,7 +791,7 @@ future<json::json_return_type> executor::batch_write_item(client_state& client_s

    for (auto it = request_items.MemberBegin(); it != request_items.MemberEnd(); ++it) {
        schema_ptr schema = get_table_from_batch_request(_proxy, it);
-        tracing::add_table_name(trace_state, schema->ks_name(), schema->cf_name());
+        tracing::add_table_name(client_state.get_trace_state(), schema->ks_name(), schema->cf_name());
        std::unordered_set<primary_key, primary_key_hash, primary_key_equal> used_keys(1, primary_key_hash{schema}, primary_key_equal{schema});
        for (auto& request : it->value.GetArray()) {
            if (!request.IsObject() || request.MemberCount() != 1) {
@@ -922,7 +824,7 @@ future<json::json_return_type> executor::batch_write_item(client_state& client_s
        }
    }

-    return _proxy.mutate(std::move(mutations), db::consistency_level::LOCAL_QUORUM, default_timeout(), trace_state, empty_service_permit()).then([] () {
+    return _proxy.mutate(std::move(mutations), db::consistency_level::LOCAL_QUORUM, default_timeout(), client_state.get_trace_state(), empty_service_permit()).then([] () {
        // Without special options on what to return, BatchWriteItem returns nothing,
        // unless there are UnprocessedItems - it's possible to just stop processing a batch
        // due to throttling. TODO(sarna): Consider UnprocessedItems when returning.
@@ -1007,6 +909,21 @@ static std::string get_item_type_string(const rjson::value& v) {
    return it->name.GetString();
 }

+// Check if a given JSON object encodes a set (i.e., it is a {"SS": [...]}, or
+// likewise with "NS" or "BS") and return the set's type and a pointer to that set.
+// If the object does not encode a set, the returned value is {"", nullptr}.
+static const std::pair<std::string, const rjson::value*> unwrap_set(const rjson::value& v) {
+    if (!v.IsObject() || v.MemberCount() != 1) {
+        return {"", nullptr};
+    }
+    auto it = v.MemberBegin();
+    const std::string it_key = it->name.GetString();
+    if (it_key != "SS" && it_key != "BS" && it_key != "NS") {
+        return {"", nullptr};
+    }
+    return std::make_pair(it_key, &(it->value));
+}
+
 // Take two JSON-encoded list values (remember that a list value is
 // {"L": [...the actual list]}) and return the concatenation, again as
 // a list value.
@@ -1025,6 +942,50 @@ static rjson::value list_concatenate(const rjson::value& v1, const rjson::value&
    return ret;
 }

+struct single_value_rjson_comp {
+    bool operator()(const rjson::value& r1, const rjson::value& r2) const {
+        auto r1_type = r1.GetType();
+        auto r2_type = r2.GetType();
+        switch (r1_type) {
+        case rjson::type::kNullType:
+            return r1_type < r2_type;
+        case rjson::type::kFalseType:
+            return r1_type < r2_type;
+        case rjson::type::kTrueType:
+            return r1_type < r2_type;
+        case rjson::type::kObjectType:
+            throw rjson::error("Object type comparison is not supported");
+        case rjson::type::kArrayType:
+            throw rjson::error("Array type comparison is not supported");
+        case rjson::type::kStringType: {
+            const size_t r1_len = r1.GetStringLength();
+            const size_t r2_len = r2.GetStringLength();
+            size_t len = std::min(r1_len, r2_len);
+            int result = std::strncmp(r1.GetString(), r2.GetString(), len);
+            return result < 0 || (result == 0 && r1_len < r2_len);
+        }
+        case rjson::type::kNumberType: {
+            if (r1_type != r2_type) {
+                throw rjson::error("All numbers in a set should have the same type");
+            }
+            if (r1.IsDouble()) {
+                return r1.GetDouble() < r2.GetDouble();
+            } else if (r1.IsInt()) {
+                return r1.GetInt() < r2.GetInt();
+            } else if (r1.IsUint()) {
+                return r1.GetUint() < r2.GetUint();
+            } else if (r1.IsInt64()) {
+                return r1.GetInt64() < r2.GetInt64();
+            } else {
+                return r1.GetUint64() < r2.GetUint64();
+            }
+        }
+        default:
+            return false;
+        }
+    }
+};
+
 // Take two JSON-encoded set values (e.g. {"SS": [...the actual set]}) and return the sum of both sets,
 // again as a set value.
 static rjson::value set_sum(const rjson::value& v1, const rjson::value& v2) {
@@ -1037,7 +998,7 @@ static rjson::value set_sum(const rjson::value& v1, const rjson::value& v2) {
        throw api_error("ValidationException", "UpdateExpression: ADD operation for sets must be given sets as arguments");
    }
    rjson::value sum = rjson::copy(*set1);
-    std::set<rjson::value, rjson::single_value_comp> set1_raw;
+    std::set<rjson::value, single_value_rjson_comp> set1_raw;
    for (auto it = sum.Begin(); it != sum.End(); ++it) {
        set1_raw.insert(rjson::copy(*it));
    }
@@ -1062,7 +1023,7 @@ static rjson::value set_diff(const rjson::value& v1, const rjson::value& v2) {
    if (!set1 || !set2) {
        throw api_error("ValidationException", "UpdateExpression: DELETE operation can only be performed on a set");
    }
-    std::set<rjson::value, rjson::single_value_comp> set1_raw;
+    std::set<rjson::value, single_value_rjson_comp> set1_raw;
    for (auto it = set1->Begin(); it != set1->End(); ++it) {
        set1_raw.insert(rjson::copy(*it));
    }
@@ -1078,11 +1039,31 @@ static rjson::value set_diff(const rjson::value& v1, const rjson::value& v2) {
    return ret;
 }

+// Check that a given JSON object encodes a number (i.e., it is a {"N": ...})
+// and return a big_decimal representing it. Throws api_error otherwise.
+static big_decimal unwrap_number(const rjson::value& v) {
+    if (!v.IsObject() || v.MemberCount() != 1) {
+        throw api_error("ValidationException", "UpdateExpression: invalid number object");
+    }
+    auto it = v.MemberBegin();
+    if (it->name != "N") {
+        throw api_error("ValidationException",
+                format("UpdateExpression: expected number, found type '{}'", it->name));
+    }
+    if (it->value.IsNumber()) {
+        return big_decimal(rjson::print(it->value)); // FIXME(sarna): should use big_decimal constructor with numeric values directly
+    }
+    if (!it->value.IsString()) {
+        throw api_error("ValidationException", "UpdateExpression: improperly formatted number constant");
+    }
+    return big_decimal(it->value.GetString());
+}
+
 // Take two JSON-encoded numeric values ({"N": "thenumber"}) and return the
 // sum, again as a JSON-encoded number.
 static rjson::value number_add(const rjson::value& v1, const rjson::value& v2) {
-    auto n1 = unwrap_number(v1, "UpdateExpression");
-    auto n2 = unwrap_number(v2, "UpdateExpression");
+    auto n1 = unwrap_number(v1);
+    auto n2 = unwrap_number(v2);
    rjson::value ret = rjson::empty_object();
    std::string str_ret = std::string((n1 + n2).to_string());
    rjson::set(ret, "N", rjson::from_string(str_ret));
@@ -1090,8 +1071,8 @@ static rjson::value number_add(const rjson::value& v1, const rjson::value& v2) {
 }

 static rjson::value number_subtract(const rjson::value& v1, const rjson::value& v2) {
-    auto n1 = unwrap_number(v1, "UpdateExpression");
-    auto n2 = unwrap_number(v2, "UpdateExpression");
+    auto n1 = unwrap_number(v1);
+    auto n2 = unwrap_number(v2);
    rjson::value ret = rjson::empty_object();
    std::string str_ret = std::string((n1 - n2).to_string());
    rjson::set(ret, "N", rjson::from_string(str_ret));
@@ -1353,7 +1334,6 @@ static bool check_needs_read_before_write(const std::vector<parsed::update_expre
 // It should be overridden once we can leverage a consensus protocol.
 static future<std::unique_ptr<rjson::value>> do_get_previous_item(
        service::storage_proxy& proxy,
-        service::client_state& client_state,
        schema_ptr schema,
        const partition_key& pk,
        const clustering_key& ck,
@@ -1378,7 +1358,7 @@ static future<std::unique_ptr<rjson::value>> do_get_previous_item(

    auto cl = db::consistency_level::LOCAL_QUORUM;

-    return proxy.query(schema, std::move(command), std::move(partition_ranges), cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit(), client_state)).then(
+    return proxy.query(schema, std::move(command), std::move(partition_ranges), cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit())).then(
            [schema, partition_slice = std::move(partition_slice), selection = std::move(selection)] (service::storage_proxy::coordinator_query_result qr) {
        auto previous_item = describe_item(schema, partition_slice, *selection, std::move(qr.query_result), {});
        return make_ready_future<std::unique_ptr<rjson::value>>(std::make_unique<rjson::value>(std::move(previous_item)));
@@ -1387,7 +1367,6 @@ static future<std::unique_ptr<rjson::value>> do_get_previous_item(

 static future<std::unique_ptr<rjson::value>> maybe_get_previous_item(
        service::storage_proxy& proxy,
-        service::client_state& client_state,
        schema_ptr schema,
        const partition_key& pk,
        const clustering_key& ck,
@@ -1401,12 +1380,11 @@ static future<std::unique_ptr<rjson::value>> maybe_get_previous_item(
    if (!needs_read_before_write) {
        return make_ready_future<std::unique_ptr<rjson::value>>();
    }
-    return do_get_previous_item(proxy, client_state, std::move(schema), pk, ck, stats);
+    return do_get_previous_item(proxy, std::move(schema), pk, ck, stats);
 }

 static future<std::unique_ptr<rjson::value>> maybe_get_previous_item(
        service::storage_proxy& proxy,
-        service::client_state& client_state,
        schema_ptr schema,
        const rjson::value& item,
        bool needs_read_before_write,
@@ -1417,26 +1395,21 @@ static future<std::unique_ptr<rjson::value>> maybe_get_previous_item(
    }
    partition_key pk = pk_from_json(item, schema);
    clustering_key ck = ck_from_json(item, schema);
-    return do_get_previous_item(proxy, client_state, std::move(schema), pk, ck, stats);
+    return do_get_previous_item(proxy, std::move(schema), pk, ck, stats);
 }


-future<json::json_return_type> executor::update_item(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content) {
+future<json::json_return_type> executor::update_item(client_state& client_state, std::string content) {
    _stats.api_operations.update_item++;
    auto start_time = std::chrono::steady_clock::now();
    rjson::value update_info = rjson::parse(content);
    elogger.trace("update_item {}", update_info);
    schema_ptr schema = get_table(_proxy, update_info);
-    tracing::add_table_name(trace_state, schema->ks_name(), schema->cf_name());
+    tracing::add_table_name(client_state.get_trace_state(), schema->ks_name(), schema->cf_name());

    if (rjson::find(update_info, "ConditionExpression")) {
        throw api_error("ValidationException", "ConditionExpression is not yet implemented in alternator");
    }
-    auto return_values = get_string_attribute(update_info, "ReturnValues", "NONE");
-    if (return_values != "NONE") {
-        // FIXME: Need to support also ALL_OLD, UPDATED_OLD, ALL_NEW and UPDATED_NEW options. See issue #5053.
-        throw api_error("ValidationException", format("Unsupported ReturnValues={} for UpdateItem operation", return_values));
-    }

    if (!update_info.HasMember("Key")) {
        throw api_error("ValidationException", "UpdateItem requires a Key parameter");
@@ -1480,10 +1453,10 @@ future<json::json_return_type> executor::update_item(client_state& client_state,
        attribute_updates = update_info["AttributeUpdates"];
    }

-    return maybe_get_previous_item(_proxy, client_state, schema, pk, ck, has_update_expression, expression, has_expected, _stats).then(
+    return maybe_get_previous_item(_proxy, schema, pk, ck, has_update_expression, expression, has_expected, _stats).then(
            [this, schema, expression = std::move(expression), has_update_expression, ck = std::move(ck), has_expected,
             update_info = rjson::copy(update_info), m = std::move(m), attrs_collector = std::move(attrs_collector),
-             attribute_updates = rjson::copy(attribute_updates), ts, &client_state, start_time, trace_state] (std::unique_ptr<rjson::value> previous_item) mutable {
+             attribute_updates = rjson::copy(attribute_updates), ts, &client_state, start_time] (std::unique_ptr<rjson::value> previous_item) mutable {
        if (has_expected) {
            verify_expected(update_info, previous_item);
        }
@@ -1605,7 +1578,7 @@ future<json::json_return_type> executor::update_item(client_state& client_state,
            }
        }
        if (!attrs_collector.empty()) {
-            auto serialized_map = attrs_collector.to_mut().serialize(*attrs_type());
+            auto serialized_map = attrs_type()->serialize_mutation_form(attrs_collector.to_mut());
            row.cells().apply(attrs_column(*schema), std::move(serialized_map));
        }
        // To allow creation of an item with no attributes, we need a row marker.
@@ -1614,7 +1587,7 @@ future<json::json_return_type> executor::update_item(client_state& client_state,
        row.apply(row_marker(ts));

        elogger.trace("Applying mutation {}", m);
-        return _proxy.mutate(std::vector<mutation>{std::move(m)}, db::consistency_level::LOCAL_QUORUM, default_timeout(), trace_state, empty_service_permit()).then([this, start_time] () {
+        return _proxy.mutate(std::vector<mutation>{std::move(m)}, db::consistency_level::LOCAL_QUORUM, default_timeout(), client_state.get_trace_state(), empty_service_permit()).then([this, start_time] () {
            // Without special options on what to return, UpdateItem returns nothing.
            _stats.api_operations.update_item_latency.add(std::chrono::steady_clock::now() - start_time, _stats.api_operations.update_item_latency._count + 1);
            return make_ready_future<json::json_return_type>(json_string(""));
@@ -1641,7 +1614,7 @@ static db::consistency_level get_read_consistency(const rjson::value& request) {
    return consistent_read ? db::consistency_level::LOCAL_QUORUM : db::consistency_level::LOCAL_ONE;
 }

-future<json::json_return_type> executor::get_item(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content) {
+future<json::json_return_type> executor::get_item(client_state& client_state, std::string content) {
    _stats.api_operations.get_item++;
    auto start_time = std::chrono::steady_clock::now();
    rjson::value table_info = rjson::parse(content);
@@ -1649,7 +1622,7 @@ future<json::json_return_type> executor::get_item(client_state& client_state, tr

    schema_ptr schema = get_table(_proxy, table_info);

-    tracing::add_table_name(trace_state, schema->ks_name(), schema->cf_name());
+    tracing::add_table_name(client_state.get_trace_state(), schema->ks_name(), schema->cf_name());

    rjson::value& query_key = table_info["Key"];
    db::consistency_level cl = get_read_consistency(table_info);
@@ -1677,14 +1650,14 @@ future<json::json_return_type> executor::get_item(client_state& client_state, tr

    auto attrs_to_get = calculate_attrs_to_get(table_info);

-    return _proxy.query(schema, std::move(command), std::move(partition_ranges), cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit(), client_state)).then(
+    return _proxy.query(schema, std::move(command), std::move(partition_ranges), cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit())).then(
            [this, schema, partition_slice = std::move(partition_slice), selection = std::move(selection), attrs_to_get = std::move(attrs_to_get), start_time = std::move(start_time)] (service::storage_proxy::coordinator_query_result qr) mutable {
        _stats.api_operations.get_item_latency.add(std::chrono::steady_clock::now() - start_time, _stats.api_operations.get_item_latency._count + 1);
        return make_ready_future<json::json_return_type>(make_jsonable(describe_item(schema, partition_slice, *selection, std::move(qr.query_result), std::move(attrs_to_get))));
    });
 }

-future<json::json_return_type> executor::batch_get_item(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content) {
+future<json::json_return_type> executor::batch_get_item(client_state& client_state, std::string content) {
    // FIXME: In this implementation, an unbounded batch size can cause
    // unbounded response JSON object to be buffered in memory, unbounded
    // parallelism of the requests, and unbounded amount of non-preemptable
@@ -1712,7 +1685,7 @@ future<json::json_return_type> executor::batch_get_item(client_state& client_sta
    for (auto it = request_items.MemberBegin(); it != request_items.MemberEnd(); ++it) {
        table_requests rs;
        rs.schema = get_table_from_batch_request(_proxy, it);
-        tracing::add_table_name(trace_state, KEYSPACE_NAME, rs.schema->cf_name());
+        tracing::add_table_name(client_state.get_trace_state(), KEYSPACE_NAME, rs.schema->cf_name());
        rs.cl = get_read_consistency(it->value);
        rs.attrs_to_get = calculate_attrs_to_get(it->value);
        auto& keys = (it->value)["Keys"];
@@ -1740,7 +1713,7 @@ future<json::json_return_type> executor::batch_get_item(client_state& client_sta
            auto selection = cql3::selection::selection::wildcard(rs.schema);
            auto partition_slice = query::partition_slice(std::move(bounds), {}, std::move(regular_columns), selection->get_query_options());
            auto command = ::make_lw_shared<query::read_command>(rs.schema->id(), rs.schema->version(), partition_slice, query::max_partitions);
-            future<std::tuple<std::string, std::optional<rjson::value>>> f = _proxy.query(rs.schema, std::move(command), std::move(partition_ranges), rs.cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit(), client_state)).then(
+            future<std::tuple<std::string, std::optional<rjson::value>>> f = _proxy.query(rs.schema, std::move(command), std::move(partition_ranges), rs.cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit())).then(
                    [schema = rs.schema, partition_slice = std::move(partition_slice), selection = std::move(selection), attrs_to_get = rs.attrs_to_get] (service::storage_proxy::coordinator_query_result qr) mutable {
                std::optional<rjson::value> json = describe_single_item(schema, partition_slice, *selection, std::move(qr.query_result), std::move(attrs_to_get));
                return make_ready_future<std::tuple<std::string, std::optional<rjson::value>>>(
@@ -1852,7 +1825,7 @@ static rjson::value encode_paging_state(const schema& schema, const service::pag
    for (const column_definition& cdef : schema.partition_key_columns()) {
        rjson::set_with_string_name(last_evaluated_key, cdef.name_as_text(), rjson::empty_object());
        rjson::value& key_entry = last_evaluated_key[cdef.name_as_text()];
-        rjson::set_with_string_name(key_entry, type_to_string(cdef.type), rjson::parse(to_json_string(*cdef.type, *exploded_pk_it)));
+        rjson::set_with_string_name(key_entry, type_to_string(cdef.type), rjson::parse(cdef.type->to_json_string(*exploded_pk_it)));
        ++exploded_pk_it;
    }
    auto ck = paging_state.get_clustering_key();
@@ -1862,7 +1835,7 @@ static rjson::value encode_paging_state(const schema& schema, const service::pag
        for (const column_definition& cdef : schema.clustering_key_columns()) {
            rjson::set_with_string_name(last_evaluated_key, cdef.name_as_text(), rjson::empty_object());
            rjson::value& key_entry = last_evaluated_key[cdef.name_as_text()];
-            rjson::set_with_string_name(key_entry, type_to_string(cdef.type), rjson::parse(to_json_string(*cdef.type, *exploded_ck_it)));
+            rjson::set_with_string_name(key_entry, type_to_string(cdef.type), rjson::parse(cdef.type->to_json_string(*exploded_ck_it)));
            ++exploded_ck_it;
        }
    }
@@ -1878,11 +1851,10 @@ static future<json::json_return_type> do_query(schema_ptr schema,
        db::consistency_level cl,
        ::shared_ptr<cql3::restrictions::statement_restrictions> filtering_restrictions,
        service::client_state& client_state,
-        cql3::cql_stats& cql_stats,
-        tracing::trace_state_ptr trace_state) {
+        cql3::cql_stats& cql_stats) {
    ::shared_ptr<service::pager::paging_state> paging_state = nullptr;

-    tracing::trace(trace_state, "Performing a database query");
+    tracing::trace(client_state.get_trace_state(), "Performing a database query");

    if (exclusive_start_key) {
        partition_key pk = pk_from_json(*exclusive_start_key, schema);
@@ -1899,7 +1871,7 @@ static future<json::json_return_type> do_query(schema_ptr schema,
    auto partition_slice = query::partition_slice(std::move(ck_bounds), {}, std::move(regular_columns), selection->get_query_options());
    auto command = ::make_lw_shared<query::read_command>(schema->id(), schema->version(), partition_slice, query::max_partitions);

-    auto query_state_ptr = std::make_unique<service::query_state>(client_state, trace_state, empty_service_permit());
+    auto query_state_ptr = std::make_unique<service::query_state>(client_state, empty_service_permit());

    command->slice.options.set<query::partition_slice::option::allow_short_read>();
    auto query_options = std::make_unique<cql3::query_options>(cl, infinite_timeout_config, std::vector<cql3::raw_value>{});
@@ -1931,7 +1903,7 @@ static future<json::json_return_type> do_query(schema_ptr schema,
 // 2. Filtering - by passing appropriately created restrictions to pager as a last parameter
 // 3. Proper timeouts instead of gc_clock::now() and db::no_timeout
 // 4. Implement parallel scanning via Segments
-future<json::json_return_type> executor::scan(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content) {
+future<json::json_return_type> executor::scan(client_state& client_state, std::string content) {
    _stats.api_operations.scan++;
    rjson::value request_info = rjson::parse(content);
    elogger.trace("Scanning {}", request_info);
@@ -1941,10 +1913,6 @@ future<json::json_return_type> executor::scan(client_state& client_state, tracin
    if (rjson::find(request_info, "FilterExpression")) {
        throw api_error("ValidationException", "FilterExpression is not yet implemented in alternator");
    }
-    if (get_int_attribute(request_info, "Segment") || get_int_attribute(request_info, "TotalSegments")) {
-        // FIXME: need to support parallel scan. See issue #5059.
-        throw api_error("ValidationException", "Scan Segment/TotalSegments is not yet implemented in alternator");
-    }

    rjson::value* exclusive_start_key = rjson::find(request_info, "ExclusiveStartKey");
    //FIXME(sarna): ScanFilter is deprecated in favor of FilterExpression
@@ -1968,7 +1936,7 @@ future<json::json_return_type> executor::scan(client_state& client_state, tracin
        partition_ranges = filtering_restrictions->get_partition_key_ranges(query_options);
        ck_bounds = filtering_restrictions->get_clustering_bounds(query_options);
    }
-    return do_query(schema, exclusive_start_key, std::move(partition_ranges), std::move(ck_bounds), std::move(attrs_to_get), limit, cl, std::move(filtering_restrictions), client_state, _stats.cql_stats, trace_state);
+    return do_query(schema, exclusive_start_key, std::move(partition_ranges), std::move(ck_bounds), std::move(attrs_to_get), limit, cl, std::move(filtering_restrictions), client_state, _stats.cql_stats);
 }

 static dht::partition_range calculate_pk_bound(schema_ptr schema, const column_definition& pk_cdef, comparison_operator_type op, const rjson::value& attrs) {
@@ -2091,14 +2059,14 @@ calculate_bounds(schema_ptr schema, const rjson::value& conditions) {
    return {std::move(partition_ranges), std::move(ck_bounds)};
 }

-future<json::json_return_type> executor::query(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content) {
+future<json::json_return_type> executor::query(client_state& client_state, std::string content) {
    _stats.api_operations.query++;
    rjson::value request_info = rjson::parse(content);
    elogger.trace("Querying {}", request_info);

    schema_ptr schema = get_table_or_view(_proxy, request_info);

-    tracing::add_table_name(trace_state, schema->ks_name(), schema->cf_name());
+    tracing::add_table_name(client_state.get_trace_state(), schema->ks_name(), schema->cf_name());

    rjson::value* exclusive_start_key = rjson::find(request_info, "ExclusiveStartKey");
    db::consistency_level cl = get_read_consistency(request_info);
@@ -2114,11 +2082,6 @@ future<json::json_return_type> executor::query(client_state& client_state, traci
    if (rjson::find(request_info, "FilterExpression")) {
        throw api_error("ValidationException", "FilterExpression is not yet implemented in alternator");
    }
-    bool forward = get_bool_attribute(request_info, "ScanIndexForward", true);
-    if (!forward) {
-        // FIXME: need to support the !forward (i.e., reverse sort order) case. See issue #5153.
-        throw api_error("ValidationException", "ScanIndexForward=false is not yet implemented in alternator");
-    }

    //FIXME(sarna): KeyConditions are deprecated in favor of KeyConditionExpression
    rjson::value& conditions = rjson::get(request_info, "KeyConditions");
@@ -2141,7 +2104,7 @@ future<json::json_return_type> executor::query(client_state& client_state, traci
            throw api_error("ValidationException", format("QueryFilter can only contain non-primary key attributes: Primary key attribute: {}", ck_defs.front()->name_as_text()));
        }
    }
-    return do_query(schema, exclusive_start_key, std::move(partition_ranges), std::move(ck_bounds), std::move(attrs_to_get), limit, cl, std::move(filtering_restrictions), client_state, _stats.cql_stats, std::move(trace_state));
+    return do_query(schema, exclusive_start_key, std::move(partition_ranges), std::move(ck_bounds), std::move(attrs_to_get), limit, cl, std::move(filtering_restrictions), client_state, _stats.cql_stats);
 }

 static void validate_limit(int limit) {
@@ -2250,20 +2213,18 @@ future<> executor::maybe_create_keyspace() {
    });
 }

-static tracing::trace_state_ptr create_tracing_session() {
+static void create_tracing_session(executor::client_state& client_state) {
    tracing::trace_state_props_set props;
    props.set<tracing::trace_state_props::full_tracing>();
-    return tracing::tracing::get_local_tracing_instance().create_session(tracing::trace_type::QUERY, props);
+    client_state.create_tracing_session(tracing::trace_type::QUERY, props);
 }

-tracing::trace_state_ptr executor::maybe_trace_query(client_state& client_state, sstring_view op, sstring_view query) {
-    tracing::trace_state_ptr trace_state;
+void executor::maybe_trace_query(client_state& client_state, sstring_view op, sstring_view query) {
    if (tracing::tracing::get_local_tracing_instance().trace_next_query()) {
-        trace_state = create_tracing_session();
-        tracing::add_query(trace_state, query);
-        tracing::begin(trace_state, format("Alternator {}", op), client_state.get_client_address());
+        create_tracing_session(client_state);
+        tracing::add_query(client_state.get_trace_state(), query);
+        tracing::begin(client_state.get_trace_state(), format("Alternator {}", op), client_state.get_client_address());
    }
-    return trace_state;
 }

 future<> executor::start() {
--- a/alternator/executor.hh
+++ b/alternator/executor.hh
@@ -46,26 +46,26 @@ public:

    executor(service::storage_proxy& proxy, service::migration_manager& mm) : _proxy(proxy), _mm(mm) {}

-    future<json::json_return_type> create_table(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content);
-    future<json::json_return_type> describe_table(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content);
-    future<json::json_return_type> delete_table(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content);
-    future<json::json_return_type> put_item(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content);
-    future<json::json_return_type> get_item(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content);
-    future<json::json_return_type> delete_item(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content);
-    future<json::json_return_type> update_item(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content);
+    future<json::json_return_type> create_table(client_state& client_state, std::string content);
+    future<json::json_return_type> describe_table(client_state& client_state, std::string content);
+    future<json::json_return_type> delete_table(client_state& client_state, std::string content);
+    future<json::json_return_type> put_item(client_state& client_state, std::string content);
+    future<json::json_return_type> get_item(client_state& client_state, std::string content);
+    future<json::json_return_type> delete_item(client_state& client_state, std::string content);
+    future<json::json_return_type> update_item(client_state& client_state, std::string content);
    future<json::json_return_type> list_tables(client_state& client_state, std::string content);
-    future<json::json_return_type> scan(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content);
+    future<json::json_return_type> scan(client_state& client_state, std::string content);
    future<json::json_return_type> describe_endpoints(client_state& client_state, std::string content, std::string host_header);
-    future<json::json_return_type> batch_write_item(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content);
-    future<json::json_return_type> batch_get_item(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content);
-    future<json::json_return_type> query(client_state& client_state, tracing::trace_state_ptr trace_state, std::string content);
+    future<json::json_return_type> batch_write_item(client_state& client_state, std::string content);
+    future<json::json_return_type> batch_get_item(client_state& client_state, std::string content);
+    future<json::json_return_type> query(client_state& client_state, std::string content);

    future<> start();
    future<> stop() { return make_ready_future<>(); }

    future<> maybe_create_keyspace();

-    static tracing::trace_state_ptr maybe_trace_query(client_state& client_state, sstring_view op, sstring_view query);
+    static void maybe_trace_query(client_state& client_state, sstring_view op, sstring_view query);
 };

 }
--- a/alternator/rjson.cc
+++ b/alternator/rjson.cc
@@ -113,58 +113,6 @@ void push_back(rjson::value& base_array, rjson::value&& item) {

 }

-bool single_value_comp::operator()(const rjson::value& r1, const rjson::value& r2) const {
-   auto r1_type = r1.GetType();
-   auto r2_type = r2.GetType();
-
-   // null is the smallest type and compares with every other type, nothing is lesser than null
-   if (r1_type == rjson::type::kNullType || r2_type == rjson::type::kNullType) {
-       return r1_type < r2_type;
-   }
-   // only null, true, and false are comparable with each other, other types are not compatible
-   if (r1_type != r2_type) {
-       if (r1_type > rjson::type::kTrueType || r2_type > rjson::type::kTrueType) {
-           throw rjson::error(format("Types are not comparable: {} {}", r1, r2));
-       }
-   }
-
-   switch (r1_type) {
-   case rjson::type::kNullType:
-       // fall-through
-   case rjson::type::kFalseType:
-       // fall-through
-   case rjson::type::kTrueType:
-       return r1_type < r2_type;
-   case rjson::type::kObjectType:
-       throw rjson::error("Object type comparison is not supported");
-   case rjson::type::kArrayType:
-       throw rjson::error("Array type comparison is not supported");
-   case rjson::type::kStringType: {
-       const size_t r1_len = r1.GetStringLength();
-       const size_t r2_len = r2.GetStringLength();
-       size_t len = std::min(r1_len, r2_len);
-       int result = std::strncmp(r1.GetString(), r2.GetString(), len);
-       return result < 0 || (result == 0 && r1_len < r2_len);
-   }
-   case rjson::type::kNumberType: {
-       if (r1.IsInt() && r2.IsInt()) {
-           return r1.GetInt() < r2.GetInt();
-       } else if (r1.IsUint() && r2.IsUint()) {
-           return r1.GetUint() < r2.GetUint();
-       } else if (r1.IsInt64() && r2.IsInt64()) {
-           return r1.GetInt64() < r2.GetInt64();
-       } else if (r1.IsUint64() && r2.IsUint64()) {
-           return r1.GetUint64() < r2.GetUint64();
-       } else {
-           // it's safe to call GetDouble() on any number type
-           return r1.GetDouble() < r2.GetDouble();
-       }
-   }
-   default:
-       return false;
-   }
-}
-
 } // end namespace rjson

 std::ostream& std::operator<<(std::ostream& os, const rjson::value& v) {
--- a/alternator/rjson.hh
+++ b/alternator/rjson.hh
@@ -152,10 +152,6 @@ void set(rjson::value& base, rjson::string_ref_type name, rjson::string_ref_type
 // Throws if base_array is not a JSON array.
 void push_back(rjson::value& base_array, rjson::value&& item);

-struct single_value_comp {
-    bool operator()(const rjson::value& r1, const rjson::value& r2) const;
-};
-
 } // end namespace rjson

 namespace std {
--- a/alternator/serialization.cc
+++ b/alternator/serialization.cc
@@ -25,7 +25,6 @@
 #include "error.hh"
 #include "rapidjson/writer.h"
 #include "concrete_types.hh"
-#include "cql3/type_json.hh"

 static logging::logger slogger("alternator-serialization");

@@ -68,7 +67,7 @@ struct from_json_visitor {
        bo.write(t.from_string(sstring_view(v.GetString(), v.GetStringLength())));
    }
    void operator()(const bytes_type_impl& t) const {
-        bo.write(base64_decode(v));
+        bo.write(base64_decode(std::string_view(v.GetString(), v.GetStringLength())));
    }
    void operator()(const boolean_type_impl& t) const {
        bo.write(boolean_type->decompose(v.GetBool()));
@@ -78,7 +77,7 @@ struct from_json_visitor {
    }
    // default
    void operator()(const abstract_type& t) const {
-        bo.write(from_json_object(t, Json::Value(rjson::print(v)), cql_serialization_format::internal()));
+        bo.write(t.from_json_object(Json::Value(rjson::print(v)), cql_serialization_format::internal()));
    }
 };

@@ -108,7 +107,7 @@ struct to_json_visitor {

    void operator()(const reversed_type_impl& t) const { visit(*t.underlying_type(), to_json_visitor{deserialized, type_ident, bv}); };
    void operator()(const decimal_type_impl& t) const {
-        auto s = to_json_string(*decimal_type, bytes(bv));
+        auto s = decimal_type->to_json_string(bytes(bv));
        //FIXME(sarna): unnecessary copy
        rjson::set_with_string_name(deserialized, type_ident, rjson::from_string(s));
    }
@@ -178,7 +177,7 @@ bytes get_key_from_typed_value(const rjson::value& key_typed_value, const column
                        expected_type, column.name_as_text(), it->name.GetString()));
    }
    if (column.type == bytes_type) {
-        return base64_decode(it->value);
+        return base64_decode(it->value.GetString());
    } else {
        return column.type->from_string(it->value.GetString());
    }
@@ -195,7 +194,7 @@ rjson::value json_key_column_value(bytes_view cell, const column_definition& col
        // FIXME: use specialized Alternator number type, not the more
        // general "decimal_type". A dedicated type can be more efficient
        // in storage space and in parsing speed.
-        auto s = to_json_string(*decimal_type, bytes(cell));
+        auto s = decimal_type->to_json_string(bytes(cell));
        return rjson::from_string(s);
    } else {
        // We shouldn't get here, we shouldn't see such key columns.
@@ -228,34 +227,4 @@ clustering_key ck_from_json(const rjson::value& item, schema_ptr schema) {
    return clustering_key::from_exploded(raw_ck);
 }

-big_decimal unwrap_number(const rjson::value& v, std::string_view diagnostic) {
-    if (!v.IsObject() || v.MemberCount() != 1) {
-        throw api_error("ValidationException", format("{}: invalid number object", diagnostic));
-    }
-    auto it = v.MemberBegin();
-    if (it->name != "N") {
-        throw api_error("ValidationException", format("{}: expected number, found type '{}'", diagnostic, it->name));
-    }
-    if (it->value.IsNumber()) {
-         // FIXME(sarna): should use big_decimal constructor with numeric values directly:
-        return big_decimal(rjson::print(it->value));
-    }
-    if (!it->value.IsString()) {
-        throw api_error("ValidationException", format("{}: improperly formatted number constant", diagnostic));
-    }
-    return big_decimal(it->value.GetString());
-}
-
-const std::pair<std::string, const rjson::value*> unwrap_set(const rjson::value& v) {
-    if (!v.IsObject() || v.MemberCount() != 1) {
-        return {"", nullptr};
-    }
-    auto it = v.MemberBegin();
-    const std::string it_key = it->name.GetString();
-    if (it_key != "SS" && it_key != "BS" && it_key != "NS") {
-        return {"", nullptr};
-    }
-    return std::make_pair(it_key, &(it->value));
-}
-
 }
--- a/alternator/serialization.hh
+++ b/alternator/serialization.hh
@@ -22,12 +22,10 @@
 #pragma once

 #include <string>
-#include <string_view>
 #include "types.hh"
 #include "schema.hh"
 #include "keys.hh"
 #include "rjson.hh"
-#include "utils/big_decimal.hh"

 namespace alternator {

@@ -60,13 +58,4 @@ rjson::value json_key_column_value(bytes_view cell, const column_definition& col
 partition_key pk_from_json(const rjson::value& item, schema_ptr schema);
 clustering_key ck_from_json(const rjson::value& item, schema_ptr schema);

-// If v encodes a number (i.e., it is a {"N": [...]}, returns an object representing it.  Otherwise,
-// raises ValidationException with diagnostic.
-big_decimal unwrap_number(const rjson::value& v, std::string_view diagnostic);
-
-// Check if a given JSON object encodes a set (i.e., it is a {"SS": [...]}, or "NS", "BS"
-// and returns set's type and a pointer to that set. If the object does not encode a set,
-// returned value is {"", nullptr}
-const std::pair<std::string, const rjson::value*> unwrap_set(const rjson::value& v);
-
 }
--- a/alternator/server.cc
+++ b/alternator/server.cc
@@ -24,11 +24,10 @@
 #include <seastar/http/function_handlers.hh>
 #include <seastar/json/json_elements.hh>
 #include <seastarx.hh>
+#include <boost/algorithm/string/split.hpp>
+#include <boost/algorithm/string/classification.hpp>
 #include "error.hh"
 #include "rjson.hh"
-#include "auth.hh"
-#include <cctype>
-#include "cql3/query_processor.hh"

 static logging::logger slogger("alternator-server");

@@ -38,23 +37,12 @@ namespace alternator {

 static constexpr auto TARGET = "X-Amz-Target";

-inline std::vector<std::string_view> split(std::string_view text, char separator) {
-    std::vector<std::string_view> tokens;
+inline std::vector<sstring> split(const sstring& text, const char* separator) {
    if (text == "") {
-        return tokens;
+        return std::vector<sstring>();
    }
-
-    while (true) {
-        auto pos = text.find_first_of(separator);
-        if (pos != std::string_view::npos) {
-            tokens.emplace_back(text.data(), pos);
-            text.remove_prefix(pos + 1);
-        } else {
-            tokens.emplace_back(text);
-            break;
-        }
-    }
-    return tokens;
+    std::vector<sstring> tokens;
+    return boost::split(tokens, text, boost::is_any_of(separator));
 }

 // DynamoDB HTTP error responses are structured as follows
@@ -119,194 +107,64 @@ protected:
    sstring _type;
 };

-class health_handler : public handler_base {
-    virtual future<std::unique_ptr<reply>> handle(const sstring& path, std::unique_ptr<request> req, std::unique_ptr<reply> rep) override {
-        rep->set_status(reply::status_type::ok);
-        rep->write_body("txt", format("healthy: {}", req->get_header("Host")));
-        return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
-    }
-};
-
-future<> server::verify_signature(const request& req) {
-    if (!_enforce_authorization) {
-        slogger.debug("Skipping authorization");
-        return make_ready_future<>();
-    }
-    auto host_it = req._headers.find("Host");
-    if (host_it == req._headers.end()) {
-        throw api_error("InvalidSignatureException", "Host header is mandatory for signature verification");
-    }
-    auto authorization_it = req._headers.find("Authorization");
-    if (host_it == req._headers.end()) {
-        throw api_error("InvalidSignatureException", "Authorization header is mandatory for signature verification");
-    }
-    std::string host = host_it->second;
-    std::vector<std::string_view> credentials_raw = split(authorization_it->second, ' ');
-    std::string credential;
-    std::string user_signature;
-    std::string signed_headers_str;
-    std::vector<std::string_view> signed_headers;
-    for (std::string_view entry : credentials_raw) {
-        std::vector<std::string_view> entry_split = split(entry, '=');
-        if (entry_split.size() != 2) {
-            if (entry != "AWS4-HMAC-SHA256") {
-                throw api_error("InvalidSignatureException", format("Only AWS4-HMAC-SHA256 algorithm is supported. Found: {}", entry));
-            }
-            continue;
-        }
-        std::string_view auth_value = entry_split[1];
-        // Commas appear as an additional (quite redundant) delimiter
-        if (auth_value.back() == ',') {
-            auth_value.remove_suffix(1);
-        }
-        if (entry_split[0] == "Credential") {
-            credential = std::string(auth_value);
-        } else if (entry_split[0] == "Signature") {
-            user_signature = std::string(auth_value);
-        } else if (entry_split[0] == "SignedHeaders") {
-            signed_headers_str = std::string(auth_value);
-            signed_headers = split(auth_value, ';');
-            std::sort(signed_headers.begin(), signed_headers.end());
-        }
-    }
-    std::vector<std::string_view> credential_split = split(credential, '/');
-    if (credential_split.size() != 5) {
-        throw api_error("ValidationException", format("Incorrect credential information format: {}", credential));
-    }
-    std::string user(credential_split[0]);
-    std::string datestamp(credential_split[1]);
-    std::string region(credential_split[2]);
-    std::string service(credential_split[3]);
-
-    std::map<std::string_view, std::string_view> signed_headers_map;
-    for (const auto& header : signed_headers) {
-        signed_headers_map.emplace(header, std::string_view());
-    }
-    for (auto& header : req._headers) {
-        std::string header_str;
-        header_str.resize(header.first.size());
-        std::transform(header.first.begin(), header.first.end(), header_str.begin(), ::tolower);
-        auto it = signed_headers_map.find(header_str);
-        if (it != signed_headers_map.end()) {
-            it->second = std::string_view(header.second);
-        }
-    }
-
-    auto cache_getter = [] (std::string username) {
-        return get_key_from_roles(cql3::get_query_processor().local(), std::move(username));
+void server::set_routes(routes& r) {
+    using alternator_callback = std::function<future<json::json_return_type>(executor&, executor::client_state&, std::unique_ptr<request>)>;
+    std::unordered_map<std::string, alternator_callback> routes{
+        {"CreateTable", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) {
+            return e.maybe_create_keyspace().then([&e, &client_state, req = std::move(req)] { return e.create_table(client_state, req->content); }); }
+        },
+        {"DescribeTable", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.describe_table(client_state, req->content); }},
+        {"DeleteTable", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.delete_table(client_state, req->content); }},
+        {"PutItem", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.put_item(client_state, req->content); }},
+        {"UpdateItem", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.update_item(client_state, req->content); }},
+        {"GetItem", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.get_item(client_state, req->content); }},
+        {"DeleteItem", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.delete_item(client_state, req->content); }},
+        {"ListTables", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.list_tables(client_state, req->content); }},
+        {"Scan", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.scan(client_state, req->content); }},
+        {"DescribeEndpoints", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.describe_endpoints(client_state, req->content, req->get_header("Host")); }},
+        {"BatchWriteItem", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.batch_write_item(client_state, req->content); }},
+        {"BatchGetItem", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.batch_get_item(client_state, req->content); }},
+        {"Query", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.query(client_state, req->content); }},
    };
-    return _key_cache.get_ptr(user, cache_getter).then([this, &req,
-                                                    user = std::move(user),
-                                                    host = std::move(host),
-                                                    datestamp = std::move(datestamp),
-                                                    signed_headers_str = std::move(signed_headers_str),
-                                                    signed_headers_map = std::move(signed_headers_map),
-                                                    region = std::move(region),
-                                                    service = std::move(service),
-                                                    user_signature = std::move(user_signature)] (key_cache::value_ptr key_ptr) {
-        std::string signature = get_signature(user, *key_ptr, std::string_view(host), req._method,
-                datestamp, signed_headers_str, signed_headers_map, req.content, region, service, "");

-        if (signature != std::string_view(user_signature)) {
-            _key_cache.remove(user);
-            throw api_error("UnrecognizedClientException", "The security token included in the request is invalid.");
-        }
-    });
-}
-
-future<json::json_return_type> server::handle_api_request(std::unique_ptr<request>&& req) {
-    _executor.local()._stats.total_operations++;
-    sstring target = req->get_header(TARGET);
-    std::vector<std::string_view> split_target = split(target, '.');
-    //NOTICE(sarna): Target consists of Dynamo API version followed by a dot '.' and operation type (e.g. CreateTable)
-    std::string op = split_target.empty() ? std::string() : std::string(split_target.back());
-    slogger.trace("Request: {} {}", op, req->content);
-    return verify_signature(*req).then([this, op, req = std::move(req)] () mutable {
-        auto callback_it = _callbacks.find(op);
-        if (callback_it == _callbacks.end()) {
+    api_handler* handler = new api_handler([this, routes = std::move(routes)](std::unique_ptr<request> req) -> future<json::json_return_type> {
+        _executor.local()._stats.total_operations++;
+        sstring target = req->get_header(TARGET);
+        std::vector<sstring> split_target = split(target, ".");
+        //NOTICE(sarna): Target consists of Dynamo API version followed by a dot '.' and operation type (e.g. CreateTable)
+        sstring op = split_target.empty() ? sstring() : split_target.back();
+        slogger.trace("Request: {} {}", op, req->content);
+        auto callback_it = routes.find(op);
+        if (callback_it == routes.end()) {
            _executor.local()._stats.unsupported_operations++;
            throw api_error("UnknownOperationException",
                    format("Unsupported operation {}", op));
        }
        //FIXME: Client state can provide more context, e.g. client's endpoint address
-        // We use unique_ptr because client_state cannot be moved or copied
-        return do_with(std::make_unique<executor::client_state>(executor::client_state::internal_tag()), [this, callback_it = std::move(callback_it), op = std::move(op), req = std::move(req)] (std::unique_ptr<executor::client_state>& client_state) mutable {
-            client_state->set_raw_keyspace(executor::KEYSPACE_NAME);
-            tracing::trace_state_ptr trace_state = executor::maybe_trace_query(*client_state, op, req->content);
-            tracing::trace(trace_state, op);
-            return callback_it->second(_executor.local(), *client_state, trace_state, std::move(req)).finally([trace_state] {});
+        return do_with(executor::client_state::for_internal_calls(), [this, callback_it = std::move(callback_it), op = std::move(op), req = std::move(req)] (executor::client_state& client_state) mutable {
+            client_state.set_raw_keyspace(executor::KEYSPACE_NAME);
+            executor::maybe_trace_query(client_state, op, req->content);
+            tracing::trace(client_state.get_trace_state(), op);
+            return callback_it->second(_executor.local(), client_state, std::move(req));
        });
    });
+
+    r.add(operation_type::POST, url("/"), handler);
 }

-void server::set_routes(routes& r) {
-    api_handler* req_handler = new api_handler([this] (std::unique_ptr<request> req) mutable {
-        return handle_api_request(std::move(req));
-    });
-
-    r.add(operation_type::POST, url("/"), req_handler);
-    r.add(operation_type::GET, url("/"), new health_handler);
-}
-
-//FIXME: A way to immediately invalidate the cache should be considered,
-// e.g. when the system table which stores the keys is changed.
-// For now, this propagation may take up to 1 minute.
-server::server(seastar::sharded<executor>& e)
-        : _executor(e), _key_cache(1024, 1min, slogger), _enforce_authorization(false)
-      , _callbacks{
-        {"CreateTable", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) {
-            return e.maybe_create_keyspace().then([&e, &client_state, req = std::move(req), trace_state = std::move(trace_state)] () mutable { return e.create_table(client_state, std::move(trace_state), req->content); }); }
-        },
-        {"DescribeTable", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) { return e.describe_table(client_state, std::move(trace_state), req->content); }},
-        {"DeleteTable", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) { return e.delete_table(client_state, std::move(trace_state), req->content); }},
-        {"PutItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) { return e.put_item(client_state, std::move(trace_state), req->content); }},
-        {"UpdateItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) { return e.update_item(client_state, std::move(trace_state), req->content); }},
-        {"GetItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) { return e.get_item(client_state, std::move(trace_state), req->content); }},
-        {"DeleteItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) { return e.delete_item(client_state, std::move(trace_state), req->content); }},
-        {"ListTables", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) { return e.list_tables(client_state, req->content); }},
-        {"Scan", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) { return e.scan(client_state, std::move(trace_state), req->content); }},
-        {"DescribeEndpoints", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) { return e.describe_endpoints(client_state, req->content, req->get_header("Host")); }},
-        {"BatchWriteItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) { return e.batch_write_item(client_state, std::move(trace_state), req->content); }},
-        {"BatchGetItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) { return e.batch_get_item(client_state, std::move(trace_state), req->content); }},
-        {"Query", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, std::unique_ptr<request> req) { return e.query(client_state, std::move(trace_state), req->content); }},
-    } {
-}
-
-future<> server::init(net::inet_address addr, std::optional<uint16_t> port, std::optional<uint16_t> https_port, std::optional<tls::credentials_builder> creds, bool enforce_authorization) {
-    _enforce_authorization = enforce_authorization;
-    if (!port && !https_port) {
-        return make_exception_future<>(std::runtime_error("Either regular port or TLS port"
-                " must be specified in order to init an alternator HTTP server instance"));
-    }
-    return seastar::async([this, addr, port, https_port, creds] {
-        try {
-            _executor.invoke_on_all([] (executor& e) {
-                return e.start();
-            }).get();
-
-            if (port) {
-                _control.start().get();
-                _control.set_routes(std::bind(&server::set_routes, this, std::placeholders::_1)).get();
-                _control.listen(socket_address{addr, *port}).get();
-                slogger.info("Alternator HTTP server listening on {} port {}", addr, *port);
-            }
-            if (https_port) {
-                _https_control.start().get();
-                _https_control.set_routes(std::bind(&server::set_routes, this, std::placeholders::_1)).get();
-                _https_control.server().invoke_on_all([creds] (http_server& serv) {
-                    return serv.set_tls_credentials(creds->build_server_credentials());
-                }).get();
-
-                _https_control.listen(socket_address{addr, *https_port}).get();
-                slogger.info("Alternator HTTPS server listening on {} port {}", addr, *https_port);
-            }
-        } catch (...) {
-            slogger.error("Failed to set up Alternator HTTP server on {} port {}, TLS port {}: {}",
-                    addr, port ? std::to_string(*port) : "OFF", https_port ? std::to_string(*https_port) : "OFF", std::current_exception());
-            std::throw_with_nested(std::runtime_error(
-                    format("Failed to set up Alternator HTTP server on {} port {}, TLS port {}",
-                            addr, port ? std::to_string(*port) : "OFF", https_port ? std::to_string(*https_port) : "OFF")));
-        }
+future<> server::init(net::inet_address addr, uint16_t port) {
+    return _executor.invoke_on_all([] (executor& e) {
+        return e.start();
+    }).then([this] {
+        return _control.start();
+    }).then([this] {
+        return _control.set_routes(std::bind(&server::set_routes, this, std::placeholders::_1));
+    }).then([this, addr, port] {
+        return _control.listen(socket_address{addr, port});
+    }).then([addr, port] {
+        slogger.info("Alternator HTTP server listening on {} port {}", addr, port);
+    }).handle_exception([addr, port] (std::exception_ptr e) {
+        slogger.warn("Failed to set up Alternator HTTP server on {} port {}: {}", addr, port, e);
    });
 }

--- a/alternator/server.hh
+++ b/alternator/server.hh
@@ -24,30 +24,18 @@
 #include "alternator/executor.hh"
 #include <seastar/core/future.hh>
 #include <seastar/http/httpd.hh>
-#include <seastar/net/tls.hh>
-#include <optional>
-#include <alternator/auth.hh>

 namespace alternator {

 class server {
-    using alternator_callback = std::function<future<json::json_return_type>(executor&, executor::client_state&, tracing::trace_state_ptr, std::unique_ptr<request>)>;
-    using alternator_callbacks_map = std::unordered_map<std::string_view, alternator_callback>;
-
    seastar::httpd::http_server_control _control;
-    seastar::httpd::http_server_control _https_control;
    seastar::sharded<executor>& _executor;
-    key_cache _key_cache;
-    bool _enforce_authorization;
-    alternator_callbacks_map _callbacks;
 public:
-    server(seastar::sharded<executor>& executor);
+    server(seastar::sharded<executor>& executor) : _executor(executor) {}

-    seastar::future<> init(net::inet_address addr, std::optional<uint16_t> port, std::optional<uint16_t> https_port, std::optional<tls::credentials_builder> creds, bool enforce_authorization);
+    seastar::future<> init(net::inet_address addr, uint16_t port);
 private:
    void set_routes(seastar::httpd::routes& r);
-    future<> verify_signature(const seastar::httpd::request& r);
-    future<json::json_return_type> handle_api_request(std::unique_ptr<request>&& req);
 };

 }
--- a/api/api-doc/cache_service.json
+++ b/api/api-doc/cache_service.json
@@ -13,7 +13,7 @@
            {
               "method":"GET",
               "summary":"get row cache save period in seconds",
-               "type": "long",
+               "type":"int",
               "nickname":"get_row_cache_save_period_in_seconds",
               "produces":[
                  "application/json"
@@ -35,7 +35,7 @@
                     "description":"row cache save period in seconds",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -48,7 +48,7 @@
            {
               "method":"GET",
               "summary":"get key cache save period in seconds",
-               "type": "long",
+               "type":"int",
               "nickname":"get_key_cache_save_period_in_seconds",
               "produces":[
                  "application/json"
@@ -70,7 +70,7 @@
                     "description":"key cache save period in seconds",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -83,7 +83,7 @@
            {
               "method":"GET",
               "summary":"get counter cache save period in seconds",
-               "type": "long",
+               "type":"int",
               "nickname":"get_counter_cache_save_period_in_seconds",
               "produces":[
                  "application/json"
@@ -105,7 +105,7 @@
                     "description":"counter cache save period in seconds",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -118,7 +118,7 @@
            {
               "method":"GET",
               "summary":"get row cache keys to save",
-               "type": "long",
+               "type":"int",
               "nickname":"get_row_cache_keys_to_save",
               "produces":[
                  "application/json"
@@ -140,7 +140,7 @@
                     "description":"row cache keys to save",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -153,7 +153,7 @@
            {
               "method":"GET",
               "summary":"get key cache keys to save",
-               "type": "long",
+               "type":"int",
               "nickname":"get_key_cache_keys_to_save",
               "produces":[
                  "application/json"
@@ -175,7 +175,7 @@
                     "description":"key cache keys to save",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -188,7 +188,7 @@
            {
               "method":"GET",
               "summary":"get counter cache keys to save",
-               "type": "long",
+               "type":"int",
               "nickname":"get_counter_cache_keys_to_save",
               "produces":[
                  "application/json"
@@ -210,7 +210,7 @@
                     "description":"counter cache keys to save",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -448,7 +448,7 @@
        {
          "method": "GET",
          "summary": "Get key entries",
-          "type": "long",
+          "type": "int",
          "nickname": "get_key_entries",
          "produces": [
            "application/json"
@@ -568,7 +568,7 @@
        {
          "method": "GET",
          "summary": "Get row entries",
-          "type": "long",
+          "type": "int",
          "nickname": "get_row_entries",
          "produces": [
            "application/json"
@@ -688,7 +688,7 @@
        {
          "method": "GET",
          "summary": "Get counter entries",
-          "type": "long",
+          "type": "int",
          "nickname": "get_counter_entries",
          "produces": [
            "application/json"
--- a/api/api-doc/column_family.json
+++ b/api/api-doc/column_family.json
@@ -121,7 +121,7 @@
                     "description":"The minimum number of sstables in queue before compaction kicks off",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -172,7 +172,7 @@
                     "description":"The maximum number of sstables in queue before compaction kicks off",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -223,7 +223,7 @@
                     "description":"The maximum number of sstables in queue before compaction kicks off",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  },
                  {
@@ -231,7 +231,7 @@
                     "description":"The minimum number of sstables in queue before compaction kicks off",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -544,7 +544,7 @@
               "summary":"sstable count for each level. empty unless leveled compaction is used",
               "type":"array",
               "items":{
-                  "type": "long"
+                  "type":"int"
               },
               "nickname":"get_sstable_count_per_level",
               "produces":[
@@ -636,7 +636,7 @@
                     "description":"Duration (in milliseconds) of monitoring operation",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  },
                  {
@@ -644,7 +644,7 @@
                    "description":"number of the top partitions to list",
                    "required":false,
                    "allowMultiple":false,
-                    "type": "long",
+                    "type":"int",
                    "paramType":"query"
                 },
                 {
@@ -652,7 +652,7 @@
                    "description":"capacity of stream summary: determines amount of resources used in query processing",
                    "required":false,
                    "allowMultiple":false,
-                    "type": "long",
+                    "type":"int",
                    "paramType":"query"
                 }
              ]
@@ -921,7 +921,7 @@
            {
               "method":"GET",
               "summary":"Get memtable switch count",
-               "type": "long",
+               "type":"int",
               "nickname":"get_memtable_switch_count",
               "produces":[
                  "application/json"
@@ -945,7 +945,7 @@
            {
               "method":"GET",
               "summary":"Get all memtable switch count",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_memtable_switch_count",
               "produces":[
                  "application/json"
@@ -1082,7 +1082,7 @@
            {
               "method":"GET",
               "summary":"Get read latency",
-               "type": "long",
+               "type":"int",
               "nickname":"get_read_latency",
               "produces":[
                  "application/json"
@@ -1235,7 +1235,7 @@
            {
               "method":"GET",
               "summary":"Get all read latency",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_read_latency",
               "produces":[
                  "application/json"
@@ -1251,7 +1251,7 @@
            {
               "method":"GET",
               "summary":"Get range latency",
-               "type": "long",
+               "type":"int",
               "nickname":"get_range_latency",
               "produces":[
                  "application/json"
@@ -1275,7 +1275,7 @@
            {
               "method":"GET",
               "summary":"Get all range latency",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_range_latency",
               "produces":[
                  "application/json"
@@ -1291,7 +1291,7 @@
            {
               "method":"GET",
               "summary":"Get write latency",
-               "type": "long",
+               "type":"int",
               "nickname":"get_write_latency",
               "produces":[
                  "application/json"
@@ -1444,7 +1444,7 @@
            {
               "method":"GET",
               "summary":"Get all write latency",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_write_latency",
               "produces":[
                  "application/json"
@@ -1460,7 +1460,7 @@
            {
               "method":"GET",
               "summary":"Get pending flushes",
-               "type": "long",
+               "type":"int",
               "nickname":"get_pending_flushes",
               "produces":[
                  "application/json"
@@ -1484,7 +1484,7 @@
            {
               "method":"GET",
               "summary":"Get all pending flushes",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_pending_flushes",
               "produces":[
                  "application/json"
@@ -1500,7 +1500,7 @@
            {
               "method":"GET",
               "summary":"Get pending compactions",
-               "type": "long",
+               "type":"int",
               "nickname":"get_pending_compactions",
               "produces":[
                  "application/json"
@@ -1524,7 +1524,7 @@
            {
               "method":"GET",
               "summary":"Get all pending compactions",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_pending_compactions",
               "produces":[
                  "application/json"
@@ -1540,7 +1540,7 @@
            {
               "method":"GET",
               "summary":"Get live ss table count",
-               "type": "long",
+               "type":"int",
               "nickname":"get_live_ss_table_count",
               "produces":[
                  "application/json"
@@ -1564,7 +1564,7 @@
            {
               "method":"GET",
               "summary":"Get all live ss table count",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_live_ss_table_count",
               "produces":[
                  "application/json"
@@ -1580,7 +1580,7 @@
            {
               "method":"GET",
               "summary":"Get live disk space used",
-               "type": "long",
+               "type":"int",
               "nickname":"get_live_disk_space_used",
               "produces":[
                  "application/json"
@@ -1604,7 +1604,7 @@
            {
               "method":"GET",
               "summary":"Get all live disk space used",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_live_disk_space_used",
               "produces":[
                  "application/json"
@@ -1620,7 +1620,7 @@
            {
               "method":"GET",
               "summary":"Get total disk space used",
-               "type": "long",
+               "type":"int",
               "nickname":"get_total_disk_space_used",
               "produces":[
                  "application/json"
@@ -1644,7 +1644,7 @@
            {
               "method":"GET",
               "summary":"Get all total disk space used",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_total_disk_space_used",
               "produces":[
                  "application/json"
@@ -2100,7 +2100,7 @@
            {
               "method":"GET",
               "summary":"Get speculative retries",
-               "type": "long",
+               "type":"int",
               "nickname":"get_speculative_retries",
               "produces":[
                  "application/json"
@@ -2124,7 +2124,7 @@
            {
               "method":"GET",
               "summary":"Get all speculative retries",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_speculative_retries",
               "produces":[
                  "application/json"
@@ -2204,7 +2204,7 @@
            {
               "method":"GET",
               "summary":"Get row cache hit out of range",
-               "type": "long",
+               "type":"int",
               "nickname":"get_row_cache_hit_out_of_range",
               "produces":[
                  "application/json"
@@ -2228,7 +2228,7 @@
            {
               "method":"GET",
               "summary":"Get all row cache hit out of range",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_row_cache_hit_out_of_range",
               "produces":[
                  "application/json"
@@ -2244,7 +2244,7 @@
            {
               "method":"GET",
               "summary":"Get row cache hit",
-               "type": "long",
+               "type":"int",
               "nickname":"get_row_cache_hit",
               "produces":[
                  "application/json"
@@ -2268,7 +2268,7 @@
            {
               "method":"GET",
               "summary":"Get all row cache hit",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_row_cache_hit",
               "produces":[
                  "application/json"
@@ -2284,7 +2284,7 @@
            {
               "method":"GET",
               "summary":"Get row cache miss",
-               "type": "long",
+               "type":"int",
               "nickname":"get_row_cache_miss",
               "produces":[
                  "application/json"
@@ -2308,7 +2308,7 @@
            {
               "method":"GET",
               "summary":"Get all row cache miss",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_row_cache_miss",
               "produces":[
                  "application/json"
@@ -2324,7 +2324,7 @@
            {
               "method":"GET",
               "summary":"Get cas prepare",
-               "type": "long",
+               "type":"int",
               "nickname":"get_cas_prepare",
               "produces":[
                  "application/json"
@@ -2348,7 +2348,7 @@
            {
               "method":"GET",
               "summary":"Get cas propose",
-               "type": "long",
+               "type":"int",
               "nickname":"get_cas_propose",
               "produces":[
                  "application/json"
@@ -2372,7 +2372,7 @@
            {
               "method":"GET",
               "summary":"Get cas commit",
-               "type": "long",
+               "type":"int",
               "nickname":"get_cas_commit",
               "produces":[
                  "application/json"
--- a/api/api-doc/compaction_manager.json
+++ b/api/api-doc/compaction_manager.json
@@ -118,7 +118,7 @@
        {
          "method": "GET",
          "summary": "Get pending tasks",
-          "type": "long",
+          "type": "int",
          "nickname": "get_pending_tasks",
          "produces": [
            "application/json"
@@ -181,7 +181,7 @@
        {
          "method": "GET",
          "summary": "Get bytes compacted",
-          "type": "long",
+          "type": "int",
          "nickname": "get_bytes_compacted",
          "produces": [
            "application/json"
@@ -197,7 +197,7 @@
         "description":"A row merged information",
         "properties":{
            "key":{
-               "type": "long",
+               "type":"int",
               "description":"The number of sstable"
            },
            "value":{
--- a/api/api-doc/failure_detector.json
+++ b/api/api-doc/failure_detector.json
@@ -110,7 +110,7 @@
            {
               "method":"GET",
               "summary":"Get count down endpoint",
-               "type": "long",
+               "type":"int",
               "nickname":"get_down_endpoint_count",
               "produces":[
                  "application/json"
@@ -126,7 +126,7 @@
            {
               "method":"GET",
               "summary":"Get count up endpoint",
-               "type": "long",
+               "type":"int",
               "nickname":"get_up_endpoint_count",
               "produces":[
                  "application/json"
@@ -180,11 +180,11 @@
                    "description": "The endpoint address"
                },
                "generation": {
-                    "type": "long",
+                    "type": "int",
                    "description": "The heart beat generation"
                },
                "version": {
-                    "type": "long",
+                    "type": "int",
                    "description": "The heart beat version"
                },
                "update_time": {
@@ -209,7 +209,7 @@
           "description": "Holds a version value for an application state",
               "properties": {
                "application_state": {
-                    "type": "long",
+                    "type": "int",
                    "description": "The application state enum index"
                },
                "value": {
@@ -217,7 +217,7 @@
                    "description": "The version value"
                },
                "version": {
-                    "type": "long",
+                    "type": "int",
                    "description": "The application state version"
                }
            }
--- a/api/api-doc/gossiper.json
+++ b/api/api-doc/gossiper.json
@@ -75,7 +75,7 @@
            {
               "method":"GET",
               "summary":"Returns files which are pending for archival attempt. Does NOT include failed archive attempts",
-               "type": "long",
+               "type":"int",
               "nickname":"get_current_generation_number",
               "produces":[
                  "application/json"
@@ -99,7 +99,7 @@
            {
               "method":"GET",
               "summary":"Get heart beat version for a node",
-               "type": "long",
+               "type":"int",
               "nickname":"get_current_heart_beat_version",
               "produces":[
                  "application/json"
--- a/api/api-doc/hinted_handoff.json
+++ b/api/api-doc/hinted_handoff.json
@@ -99,7 +99,7 @@
        {
          "method": "GET",
          "summary": "Get create hint count",
-          "type": "long",
+          "type": "int",
          "nickname": "get_create_hint_count",
          "produces": [
            "application/json"
@@ -123,7 +123,7 @@
        {
          "method": "GET",
          "summary": "Get not stored hints count",
-          "type": "long",
+          "type": "int",
          "nickname": "get_not_stored_hints_count",
          "produces": [
            "application/json"
--- a/api/api-doc/messaging_service.json
+++ b/api/api-doc/messaging_service.json
@@ -191,7 +191,7 @@
            {
               "method":"GET",
               "summary":"Get the version number",
-               "type": "long",
+               "type":"int",
               "nickname":"get_version",
               "produces":[
                  "application/json"
--- a/api/api-doc/storage_proxy.json
+++ b/api/api-doc/storage_proxy.json
@@ -105,7 +105,7 @@
            {
               "method":"GET",
               "summary":"Get the max hint window",
-               "type": "long",
+               "type":"int",
               "nickname":"get_max_hint_window",
               "produces":[
                  "application/json"
@@ -128,7 +128,7 @@
                     "description":"max hint window in ms",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -141,7 +141,7 @@
            {
               "method":"GET",
               "summary":"Get max hints in progress",
-               "type": "long",
+               "type":"int",
               "nickname":"get_max_hints_in_progress",
               "produces":[
                  "application/json"
@@ -164,7 +164,7 @@
                     "description":"max hints in progress",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -177,7 +177,7 @@
            {
               "method":"GET",
               "summary":"get hints in progress",
-               "type": "long",
+               "type":"int",
               "nickname":"get_hints_in_progress",
               "produces":[
                  "application/json"
@@ -602,7 +602,7 @@
        {
          "method": "GET",
          "summary": "Get cas write metrics",
-          "type": "long",
+          "type": "int",
          "nickname": "get_cas_write_metrics_unfinished_commit",
          "produces": [
            "application/json"
@@ -632,7 +632,7 @@
        {
          "method": "GET",
          "summary": "Get cas write metrics",
-          "type": "long",
+          "type": "int",
          "nickname": "get_cas_write_metrics_condition_not_met",
          "produces": [
            "application/json"
@@ -647,7 +647,7 @@
        {
          "method": "GET",
          "summary": "Get cas read metrics",
-          "type": "long",
+          "type": "int",
          "nickname": "get_cas_read_metrics_unfinished_commit",
          "produces": [
            "application/json"
@@ -671,13 +671,28 @@
        }
      ]
    },
+    {
+      "path": "/storage_proxy/metrics/cas_read/condition_not_met",
+      "operations": [
+        {
+          "method": "GET",
+          "summary": "Get cas read metrics",
+          "type": "int",
+          "nickname": "get_cas_read_metrics_condition_not_met",
+          "produces": [
+            "application/json"
+          ],
+          "parameters": []
+        }
+      ]
+    },
    {
      "path": "/storage_proxy/metrics/read/timeouts",
      "operations": [
        {
          "method": "GET",
          "summary": "Get read metrics",
-          "type": "long",
+          "type": "int",
          "nickname": "get_read_metrics_timeouts",
          "produces": [
            "application/json"
@@ -692,7 +707,7 @@
        {
          "method": "GET",
          "summary": "Get read metrics",
-          "type": "long",
+          "type": "int",
          "nickname": "get_read_metrics_unavailables",
          "produces": [
            "application/json"
@@ -827,7 +842,7 @@
        {
          "method": "GET",
          "summary": "Get range metrics",
-          "type": "long",
+          "type": "int",
          "nickname": "get_range_metrics_timeouts",
          "produces": [
            "application/json"
@@ -842,7 +857,7 @@
        {
          "method": "GET",
          "summary": "Get range metrics",
-          "type": "long",
+          "type": "int",
          "nickname": "get_range_metrics_unavailables",
          "produces": [
            "application/json"
@@ -887,7 +902,7 @@
        {
          "method": "GET",
          "summary": "Get write metrics",
-          "type": "long",
+          "type": "int",
          "nickname": "get_write_metrics_timeouts",
          "produces": [
            "application/json"
@@ -902,7 +917,7 @@
        {
          "method": "GET",
          "summary": "Get write metrics",
-          "type": "long",
+          "type": "int",
          "nickname": "get_write_metrics_unavailables",
          "produces": [
            "application/json"
@@ -1008,7 +1023,7 @@
            {
               "method":"GET",
               "summary":"Get read latency",
-               "type": "long",
+               "type":"int",
               "nickname":"get_read_latency",
               "produces":[
                  "application/json"
@@ -1040,7 +1055,7 @@
            {
               "method":"GET",
               "summary":"Get write latency",
-               "type": "long",
+               "type":"int",
               "nickname":"get_write_latency",
               "produces":[
                  "application/json"
@@ -1072,7 +1087,7 @@
            {
               "method":"GET",
               "summary":"Get range latency",
-               "type": "long",
+               "type":"int",
               "nickname":"get_range_latency",
               "produces":[
                  "application/json"
--- a/api/api-doc/storage_service.json
+++ b/api/api-doc/storage_service.json
@@ -458,7 +458,7 @@
            {
               "method":"GET",
               "summary":"Return the generation value for this node.",
-               "type": "long",
+               "type":"int",
               "nickname":"get_current_generation_number",
               "produces":[
                  "application/json"
@@ -646,7 +646,7 @@
            {
               "method":"POST",
               "summary":"Trigger a cleanup of keys on a single keyspace",
-               "type": "long",
+               "type":"int",
               "nickname":"force_keyspace_cleanup",
               "produces":[
                  "application/json"
@@ -678,7 +678,7 @@
            {
               "method":"GET",
               "summary":"Scrub (deserialize + reserialize at the latest version, skipping bad rows if any) the given keyspace. If columnFamilies array is empty, all CFs are scrubbed. Scrubbed CFs will be snapshotted first, if disableSnapshot is false",
-               "type": "long",
+               "type":"int",
               "nickname":"scrub",
               "produces":[
                  "application/json"
@@ -726,7 +726,7 @@
            {
               "method":"GET",
               "summary":"Rewrite all sstables to the latest version. Unlike scrub, it doesn't skip bad rows and do not snapshot sstables first.",
-               "type": "long",
+               "type":"int",
               "nickname":"upgrade_sstables",
               "produces":[
                  "application/json"
@@ -800,7 +800,7 @@
               "summary":"Return an array with the ids of the currently active repairs",
               "type":"array",
               "items":{
-                  "type": "long"
+                  "type":"int"
               },
               "nickname":"get_active_repair_async",
               "produces":[
@@ -816,7 +816,7 @@
            {
               "method":"POST",
               "summary":"Invoke repair asynchronously. You can track repair progress by using the get supplying id",
-               "type": "long",
+               "type":"int",
               "nickname":"repair_async",
               "produces":[
                  "application/json"
@@ -947,7 +947,7 @@
                     "description":"The repair ID to check for status",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -1277,18 +1277,18 @@
                  },
                  {
                     "name":"dynamic_update_interval",
-                     "description":"interval in ms (default 100)",
+                     "description":"integer, in ms (default 100)",
                     "required":false,
                     "allowMultiple":false,
-                     "type":"long",
+                     "type":"integer",
                     "paramType":"query"
                  },
                  {
                     "name":"dynamic_reset_interval",
-                     "description":"interval in ms (default 600,000)",
+                     "description":"integer, in ms (default 600,000)",
                     "required":false,
                     "allowMultiple":false,
-                     "type":"long",
+                     "type":"integer",
                     "paramType":"query"
                  },
                  {
@@ -1493,7 +1493,7 @@
                     "description":"Stream throughput",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -1501,7 +1501,7 @@
            {
               "method":"GET",
               "summary":"Get stream throughput mb per sec",
-               "type": "long",
+               "type":"int",
               "nickname":"get_stream_throughput_mb_per_sec",
               "produces":[
                  "application/json"
@@ -1517,7 +1517,7 @@
            {
               "method":"GET",
               "summary":"get compaction throughput mb per sec",
-               "type": "long",
+               "type":"int",
               "nickname":"get_compaction_throughput_mb_per_sec",
               "produces":[
                  "application/json"
@@ -1539,7 +1539,7 @@
                     "description":"compaction throughput",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -1943,7 +1943,7 @@
            {
               "method":"GET",
               "summary":"Returns the threshold for warning of queries with many tombstones",
-               "type": "long",
+               "type":"int",
               "nickname":"get_tombstone_warn_threshold",
               "produces":[
                  "application/json"
@@ -1965,7 +1965,7 @@
                     "description":"tombstone debug threshold",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -1978,7 +1978,7 @@
            {
               "method":"GET",
               "summary":"",
-               "type": "long",
+               "type":"int",
               "nickname":"get_tombstone_failure_threshold",
               "produces":[
                  "application/json"
@@ -2000,7 +2000,7 @@
                     "description":"tombstone debug threshold",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -2013,7 +2013,7 @@
            {
               "method":"GET",
               "summary":"Returns the threshold for rejecting queries due to a large batch size",
-               "type": "long",
+               "type":"int",
               "nickname":"get_batch_size_failure_threshold",
               "produces":[
                  "application/json"
@@ -2035,7 +2035,7 @@
                     "description":"batch size debug threshold",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -2059,7 +2059,7 @@
                     "description":"throttle in kb",
                     "required":true,
                     "allowMultiple":false,
-                     "type": "long",
+                     "type":"int",
                     "paramType":"query"
                  }
               ]
@@ -2072,7 +2072,7 @@
            {
               "method":"GET",
               "summary":"Get load",
-               "type": "long",
+               "type":"int",
               "nickname":"get_metrics_load",
               "produces":[
                  "application/json"
@@ -2088,7 +2088,7 @@
            {
               "method":"GET",
               "summary":"Get exceptions",
-               "type": "long",
+               "type":"int",
               "nickname":"get_exceptions",
               "produces":[
                  "application/json"
@@ -2104,7 +2104,7 @@
            {
               "method":"GET",
               "summary":"Get total hints in progress",
-               "type": "long",
+               "type":"int",
               "nickname":"get_total_hints_in_progress",
               "produces":[
                  "application/json"
@@ -2120,7 +2120,7 @@
            {
               "method":"GET",
               "summary":"Get total hints",
-               "type": "long",
+               "type":"int",
               "nickname":"get_total_hints",
               "produces":[
                  "application/json"
--- a/api/api-doc/stream_manager.json
+++ b/api/api-doc/stream_manager.json
@@ -32,7 +32,7 @@
            {
               "method":"GET",
               "summary":"Get number of active outbound streams",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_active_streams_outbound",
               "produces":[
                  "application/json"
@@ -48,7 +48,7 @@
            {
               "method":"GET",
               "summary":"Get total incoming bytes",
-               "type": "long",
+               "type":"int",
               "nickname":"get_total_incoming_bytes",
               "produces":[
                  "application/json"
@@ -72,7 +72,7 @@
            {
               "method":"GET",
               "summary":"Get all total incoming bytes",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_total_incoming_bytes",
               "produces":[
                  "application/json"
@@ -88,7 +88,7 @@
            {
               "method":"GET",
               "summary":"Get total outgoing bytes",
-               "type": "long",
+               "type":"int",
               "nickname":"get_total_outgoing_bytes",
               "produces":[
                  "application/json"
@@ -112,7 +112,7 @@
            {
               "method":"GET",
               "summary":"Get all total outgoing bytes",
-               "type": "long",
+               "type":"int",
               "nickname":"get_all_total_outgoing_bytes",
               "produces":[
                  "application/json"
@@ -154,7 +154,7 @@
               "description":"The peer"
            },
            "session_index":{
-               "type": "long",
+               "type":"int",
               "description":"The session index"
            },
            "connecting":{
@@ -211,7 +211,7 @@
               "description":"The ID"
            },
            "files":{
-               "type": "long",
+               "type":"int",
               "description":"Number of files to transfer. Can be 0 if nothing to transfer for some streaming request."
            },
            "total_size":{
@@ -242,7 +242,7 @@
               "description":"The peer address"
            },
            "session_index":{
-               "type": "long",
+               "type":"int",
               "description":"The session index"
            },
            "file_name":{
--- a/api/api-doc/system.json
+++ b/api/api-doc/system.json
@@ -52,21 +52,6 @@
            }
         ]
      },
-      {
-         "path":"/system/uptime_ms",
-         "operations":[
-            {
-               "method":"GET",
-               "summary":"Get system uptime, in milliseconds",
-               "type":"long",
-               "nickname":"get_system_uptime",
-               "produces":[
-                  "application/json"
-               ],
-               "parameters":[]
-            }
-         ]
-      },
      {
         "path":"/system/logger/{name}",
         "operations":[
--- a/api/api_init.hh
+++ b/api/api_init.hh
@@ -23,8 +23,6 @@
 #include "service/storage_proxy.hh"
 #include <seastar/http/httpd.hh>

-namespace service { class load_meter; }
-
 namespace api {

 struct http_context {
@@ -33,11 +31,9 @@ struct http_context {
    httpd::http_server_control http_server;
    distributed<database>& db;
    distributed<service::storage_proxy>& sp;
-    service::load_meter& lmeter;
    http_context(distributed<database>& _db,
-            distributed<service::storage_proxy>& _sp,
-            service::load_meter& _lm)
-            : db(_db), sp(_sp), lmeter(_lm) {
+            distributed<service::storage_proxy>& _sp)
+            : db(_db), sp(_sp) {
    }
 };

--- a/api/column_family.cc
+++ b/api/column_family.cc
@@ -26,7 +26,7 @@
 #include "sstables/sstables.hh"
 #include "utils/estimated_histogram.hh"
 #include <algorithm>
-#include "db/system_keyspace_view_types.hh"
+
 #include "db/data_listeners.hh"

 extern logging::logger apilog;
@@ -53,7 +53,8 @@ std::tuple<sstring, sstring> parse_fully_qualified_cf_name(sstring name) {
    return std::make_tuple(name.substr(0, pos), name.substr(end));
 }

-const utils::UUID& get_uuid(const sstring& ks, const sstring& cf, const database& db) {
+const utils::UUID& get_uuid(const sstring& name, const database& db) {
+    auto [ks, cf] = parse_fully_qualified_cf_name(name);
    try {
        return db.find_uuid(ks, cf);
    } catch (std::out_of_range& e) {
@@ -61,11 +62,6 @@ const utils::UUID& get_uuid(const sstring& ks, const sstring& cf, const database
    }
 }

-const utils::UUID& get_uuid(const sstring& name, const database& db) {
-    auto [ks, cf] = parse_fully_qualified_cf_name(name);
-    return get_uuid(ks, cf, db);
-}
-
 future<> foreach_column_family(http_context& ctx, const sstring& name, function<void(column_family&)> f) {
    auto uuid = get_uuid(name, ctx.db.local());

@@ -75,28 +71,28 @@ future<> foreach_column_family(http_context& ctx, const sstring& name, function<
 }

 future<json::json_return_type>  get_cf_stats(http_context& ctx, const sstring& name,
-        int64_t column_family_stats::*f) {
+        int64_t column_family::stats::*f) {
    return map_reduce_cf(ctx, name, int64_t(0), [f](const column_family& cf) {
        return cf.get_stats().*f;
    }, std::plus<int64_t>());
 }

 future<json::json_return_type>  get_cf_stats(http_context& ctx,
-        int64_t column_family_stats::*f) {
+        int64_t column_family::stats::*f) {
    return map_reduce_cf(ctx, int64_t(0), [f](const column_family& cf) {
        return cf.get_stats().*f;
    }, std::plus<int64_t>());
 }

 static future<json::json_return_type>  get_cf_stats_count(http_context& ctx, const sstring& name,
-        utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
+        utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
    return map_reduce_cf(ctx, name, int64_t(0), [f](const column_family& cf) {
        return (cf.get_stats().*f).hist.count;
    }, std::plus<int64_t>());
 }

 static future<json::json_return_type>  get_cf_stats_sum(http_context& ctx, const sstring& name,
-        utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
+        utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
    auto uuid = get_uuid(name, ctx.db.local());
    return ctx.db.map_reduce0([uuid, f](database& db) {
        // Histograms information is sample of the actual load
@@ -112,14 +108,14 @@ static future<json::json_return_type>  get_cf_stats_sum(http_context& ctx, const


 static future<json::json_return_type>  get_cf_stats_count(http_context& ctx,
-        utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
+        utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
    return map_reduce_cf(ctx, int64_t(0), [f](const column_family& cf) {
        return (cf.get_stats().*f).hist.count;
    }, std::plus<int64_t>());
 }

 static future<json::json_return_type>  get_cf_histogram(http_context& ctx, const sstring& name,
-        utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
+        utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
    utils::UUID uuid = get_uuid(name, ctx.db.local());
    return ctx.db.map_reduce0([f, uuid](const database& p) {
        return (p.find_column_family(uuid).get_stats().*f).hist;},
@@ -130,7 +126,7 @@ static future<json::json_return_type>  get_cf_histogram(http_context& ctx, const
    });
 }

-static future<json::json_return_type> get_cf_histogram(http_context& ctx, utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
+static future<json::json_return_type> get_cf_histogram(http_context& ctx, utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
    std::function<utils::ihistogram(const database&)> fun = [f] (const database& db)  {
        utils::ihistogram res;
        for (auto i : db.get_column_families()) {
@@ -146,7 +142,7 @@ static future<json::json_return_type> get_cf_histogram(http_context& ctx, utils:
 }

 static future<json::json_return_type>  get_cf_rate_and_histogram(http_context& ctx, const sstring& name,
-        utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
+        utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
    utils::UUID uuid = get_uuid(name, ctx.db.local());
    return ctx.db.map_reduce0([f, uuid](const database& p) {
        return (p.find_column_family(uuid).get_stats().*f).rate();},
@@ -157,7 +153,7 @@ static future<json::json_return_type>  get_cf_rate_and_histogram(http_context& c
    });
 }

-static future<json::json_return_type> get_cf_rate_and_histogram(http_context& ctx, utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
+static future<json::json_return_type> get_cf_rate_and_histogram(http_context& ctx, utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
    std::function<utils::rate_moving_average_and_histogram(const database&)> fun = [f] (const database& db)  {
        utils::rate_moving_average_and_histogram res;
        for (auto i : db.get_column_families()) {
@@ -254,11 +250,12 @@ class sum_ratio {
    uint64_t _n = 0;
    T _total = 0;
 public:
-    void operator()(T value) {
+    future<> operator()(T value) {
        if (value > 0) {
            _total += value;
            _n++;
        }
+        return make_ready_future<>();
    }
    // Returns average value of all registered ratios.
    T get() && {
@@ -407,11 +404,11 @@ void set_column_family(http_context& ctx, routes& r) {
    });

    cf::get_memtable_switch_count.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_stats(ctx,req->param["name"] ,&column_family_stats::memtable_switch_count);
+        return get_cf_stats(ctx,req->param["name"] ,&column_family::stats::memtable_switch_count);
    });

    cf::get_all_memtable_switch_count.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_stats(ctx, &column_family_stats::memtable_switch_count);
+        return get_cf_stats(ctx, &column_family::stats::memtable_switch_count);
    });

    // FIXME: this refers to partitions, not rows.
@@ -456,67 +453,67 @@ void set_column_family(http_context& ctx, routes& r) {
    });

    cf::get_pending_flushes.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_stats(ctx,req->param["name"] ,&column_family_stats::pending_flushes);
+        return get_cf_stats(ctx,req->param["name"] ,&column_family::stats::pending_flushes);
    });

    cf::get_all_pending_flushes.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_stats(ctx, &column_family_stats::pending_flushes);
+        return get_cf_stats(ctx, &column_family::stats::pending_flushes);
    });

    cf::get_read.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_stats_count(ctx,req->param["name"] ,&column_family_stats::reads);
+        return get_cf_stats_count(ctx,req->param["name"] ,&column_family::stats::reads);
    });

    cf::get_all_read.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_stats_count(ctx, &column_family_stats::reads);
+        return get_cf_stats_count(ctx, &column_family::stats::reads);
    });

    cf::get_write.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_stats_count(ctx, req->param["name"] ,&column_family_stats::writes);
+        return get_cf_stats_count(ctx, req->param["name"] ,&column_family::stats::writes);
    });

    cf::get_all_write.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_stats_count(ctx, &column_family_stats::writes);
+        return get_cf_stats_count(ctx, &column_family::stats::writes);
    });

    cf::get_read_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_histogram(ctx, req->param["name"], &column_family_stats::reads);
+        return get_cf_histogram(ctx, req->param["name"], &column_family::stats::reads);
    });

    cf::get_read_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_rate_and_histogram(ctx, req->param["name"], &column_family_stats::reads);
+        return get_cf_rate_and_histogram(ctx, req->param["name"], &column_family::stats::reads);
    });

    cf::get_read_latency.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_stats_sum(ctx,req->param["name"] ,&column_family_stats::reads);
+        return get_cf_stats_sum(ctx,req->param["name"] ,&column_family::stats::reads);
    });

    cf::get_write_latency.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_stats_sum(ctx, req->param["name"] ,&column_family_stats::writes);
+        return get_cf_stats_sum(ctx, req->param["name"] ,&column_family::stats::writes);
    });

    cf::get_all_read_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_histogram(ctx, &column_family_stats::writes);
+        return get_cf_histogram(ctx, &column_family::stats::writes);
    });

    cf::get_all_read_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_rate_and_histogram(ctx, &column_family_stats::writes);
+        return get_cf_rate_and_histogram(ctx, &column_family::stats::writes);
    });

    cf::get_write_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_histogram(ctx, req->param["name"], &column_family_stats::writes);
+        return get_cf_histogram(ctx, req->param["name"], &column_family::stats::writes);
    });

    cf::get_write_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_rate_and_histogram(ctx, req->param["name"], &column_family_stats::writes);
+        return get_cf_rate_and_histogram(ctx, req->param["name"], &column_family::stats::writes);
    });

    cf::get_all_write_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_histogram(ctx, &column_family_stats::writes);
+        return get_cf_histogram(ctx, &column_family::stats::writes);
    });

    cf::get_all_write_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_rate_and_histogram(ctx, &column_family_stats::writes);
+        return get_cf_rate_and_histogram(ctx, &column_family::stats::writes);
    });

    cf::get_pending_compactions.set(r, [&ctx] (std::unique_ptr<request> req) {
@@ -532,11 +529,11 @@ void set_column_family(http_context& ctx, routes& r) {
    });

    cf::get_live_ss_table_count.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_stats(ctx, req->param["name"], &column_family_stats::live_sstable_count);
+        return get_cf_stats(ctx, req->param["name"], &column_family::stats::live_sstable_count);
    });

    cf::get_all_live_ss_table_count.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_stats(ctx, &column_family_stats::live_sstable_count);
+        return get_cf_stats(ctx, &column_family::stats::live_sstable_count);
    });

    cf::get_unleveled_sstables.set(r, [&ctx] (std::unique_ptr<request> req) {
@@ -795,25 +792,25 @@ void set_column_family(http_context& ctx, routes& r) {

    });

-    cf::get_cas_prepare.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](column_family& cf) {
-            return cf.get_stats().estimated_cas_prepare;
-        },
-        utils::estimated_histogram_merge, utils_json::estimated_histogram());
+    cf::get_cas_prepare.set(r, [] (std::unique_ptr<request> req) {
+        //TBD
+        unimplemented();
+        //auto id = get_uuid(req->param["name"], ctx.db.local());
+        return make_ready_future<json::json_return_type>(0);
    });

-    cf::get_cas_propose.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](column_family& cf) {
-            return cf.get_stats().estimated_cas_propose;
-        },
-        utils::estimated_histogram_merge, utils_json::estimated_histogram());
+    cf::get_cas_propose.set(r, [] (std::unique_ptr<request> req) {
+        //TBD
+        unimplemented();
+        //auto id = get_uuid(req->param["name"], ctx.db.local());
+        return make_ready_future<json::json_return_type>(0);
    });

-    cf::get_cas_commit.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](column_family& cf) {
-            return cf.get_stats().estimated_cas_commit;
-        },
-        utils::estimated_histogram_merge, utils_json::estimated_histogram());
+    cf::get_cas_commit.set(r, [] (std::unique_ptr<request> req) {
+        //TBD
+        unimplemented();
+        //auto id = get_uuid(req->param["name"], ctx.db.local());
+        return make_ready_future<json::json_return_type>(0);
    });

    cf::get_sstables_per_read_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
@@ -824,11 +821,11 @@ void set_column_family(http_context& ctx, routes& r) {
    });

    cf::get_tombstone_scanned_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_histogram(ctx, req->param["name"], &column_family_stats::tombstone_scanned);
+        return get_cf_histogram(ctx, req->param["name"], &column_family::stats::tombstone_scanned);
    });

    cf::get_live_scanned_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return get_cf_histogram(ctx, req->param["name"], &column_family_stats::live_scanned);
+        return get_cf_histogram(ctx, req->param["name"], &column_family::stats::live_scanned);
    });

    cf::get_col_update_time_delta_histogram.set(r, [] (std::unique_ptr<request> req) {
@@ -846,28 +843,13 @@ void set_column_family(http_context& ctx, routes& r) {
        return true;
    });

-    cf::get_built_indexes.set(r, [&ctx](std::unique_ptr<request> req) {
-        auto [ks, cf_name] = parse_fully_qualified_cf_name(req->param["name"]);
-        return db::system_keyspace::load_view_build_progress().then([ks, cf_name, &ctx](const std::vector<db::system_keyspace::view_build_progress>& vb) mutable {
-            std::set<sstring> vp;
-            for (auto b : vb) {
-                if (b.view.first == ks) {
-                    vp.insert(b.view.second);
-                }
-            }
-            std::vector<sstring> res;
-            auto uuid = get_uuid(ks, cf_name, ctx.db.local());
-            column_family& cf = ctx.db.local().find_column_family(uuid);
-            res.reserve(cf.get_index_manager().list_indexes().size());
-            for (auto&& i : cf.get_index_manager().list_indexes()) {
-                if (vp.find(secondary_index::index_table_name(i.metadata().name())) == vp.end()) {
-                    res.emplace_back(i.metadata().name());
-                }
-            }
-            return make_ready_future<json::json_return_type>(res);
-        });
+    cf::get_built_indexes.set(r, [](const_req) {
+        // FIXME
+        // Currently there is no index support
+        return std::vector<sstring>();
    });

+
    cf::get_compression_metadata_off_heap_memory_used.set(r, [](const_req) {
        // FIXME
        // Currently there are no information on the compression
--- a/api/column_family.hh
+++ b/api/column_family.hh
@@ -109,9 +109,9 @@ future<json::json_return_type> map_reduce_cf(http_context& ctx, I init,
 }

 future<json::json_return_type>  get_cf_stats(http_context& ctx, const sstring& name,
-        int64_t column_family_stats::*f);
+        int64_t column_family::stats::*f);

 future<json::json_return_type>  get_cf_stats(http_context& ctx,
-        int64_t column_family_stats::*f);
+        int64_t column_family::stats::*f);

 }
--- a/api/compaction_manager.cc
+++ b/api/compaction_manager.cc
@@ -74,14 +74,13 @@ void set_compaction_manager(http_context& ctx, routes& r) {

    cm::get_pending_tasks_by_table.set(r, [&ctx] (std::unique_ptr<request> req) {
        return ctx.db.map_reduce0([&ctx](database& db) {
-            return do_with(std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>(), [&ctx, &db](std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>& tasks) {
-                return do_for_each(db.get_column_families(), [&tasks](const std::pair<utils::UUID, seastar::lw_shared_ptr<table>>& i) {
-                    table& cf = *i.second.get();
-                    tasks[std::make_pair(cf.schema()->ks_name(), cf.schema()->cf_name())] = cf.get_compaction_strategy().estimated_pending_compactions(cf);
-                    return make_ready_future<>();
-                }).then([&tasks] {
-                    return std::move(tasks);
-                });
+            std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash> tasks;
+            return do_for_each(db.get_column_families(), [&tasks](const std::pair<utils::UUID, seastar::lw_shared_ptr<table>>& i) {
+                table& cf = *i.second.get();
+                tasks[std::make_pair(cf.schema()->ks_name(), cf.schema()->cf_name())] = cf.get_compaction_strategy().estimated_pending_compactions(cf);
+                return make_ready_future<>();
+            }).then([&tasks] {
+                return tasks;
            });
        }, std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>(), sum_pending_tasks).then(
                [](const std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>& task_map) {
--- a/api/storage_proxy.cc
+++ b/api/storage_proxy.cc
@@ -81,9 +81,12 @@ void set_storage_proxy(http_context& ctx, routes& r) {
        return make_ready_future<json::json_return_type>(0);
    });

-    sp::get_hinted_handoff_enabled.set(r, [&ctx](std::unique_ptr<request> req)  {
-        auto enabled = ctx.db.local().get_config().hinted_handoff_enabled();
-        return make_ready_future<json::json_return_type>(enabled);
+    sp::get_hinted_handoff_enabled.set(r, [](std::unique_ptr<request> req)  {
+        //TBD
+        // FIXME
+        // hinted handoff is not supported currently,
+        // so we should return false
+        return make_ready_future<json::json_return_type>(false);
    });

    sp::set_hinted_handoff_enabled.set(r, [](std::unique_ptr<request> req)  {
@@ -247,40 +250,68 @@ void set_storage_proxy(http_context& ctx, routes& r) {
        });
    });

-    sp::get_cas_read_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
-        return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_read_timeouts);
+    sp::get_cas_read_timeouts.set(r, [](std::unique_ptr<request> req) {
+        //TBD
+        // FIXME
+        // cas is not supported yet, so just return 0
+        return make_ready_future<json::json_return_type>(0);
    });

-    sp::get_cas_read_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
-        return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_read_unavailables);
+    sp::get_cas_read_unavailables.set(r, [](std::unique_ptr<request> req) {
+        //TBD
+        // FIXME
+        // cas is not supported yet, so just return 0
+        return make_ready_future<json::json_return_type>(0);
    });

-    sp::get_cas_write_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
-        return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_write_timeouts);
+    sp::get_cas_write_timeouts.set(r, [](std::unique_ptr<request> req) {
+        //TBD
+        // FIXME
+        // cas is not supported yet, so just return 0
+        return make_ready_future<json::json_return_type>(0);
    });

-    sp::get_cas_write_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
-        return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_write_unavailables);
+    sp::get_cas_write_unavailables.set(r, [](std::unique_ptr<request> req) {
+        //TBD
+        // FIXME
+        // cas is not supported yet, so just return 0
+        return make_ready_future<json::json_return_type>(0);
    });

-    sp::get_cas_write_metrics_unfinished_commit.set(r, [&ctx](std::unique_ptr<request> req) {
-        return sum_stats(ctx.sp, &proxy::stats::cas_write_unfinished_commit);
+    sp::get_cas_write_metrics_unfinished_commit.set(r, [](std::unique_ptr<request> req) {
+        //TBD
+        unimplemented();
+        return make_ready_future<json::json_return_type>(0);
    });

-    sp::get_cas_write_metrics_contention.set(r, [&ctx](std::unique_ptr<request> req) {
-        return sum_estimated_histogram(ctx, &proxy::stats::cas_write_contention);
+    sp::get_cas_write_metrics_contention.set(r, [](std::unique_ptr<request> req) {
+        //TBD
+        unimplemented();
+        return make_ready_future<json::json_return_type>(0);
    });

-    sp::get_cas_write_metrics_condition_not_met.set(r, [&ctx](std::unique_ptr<request> req) {
-        return sum_stats(ctx.sp, &proxy::stats::cas_write_condition_not_met);
+    sp::get_cas_write_metrics_condition_not_met.set(r, [](std::unique_ptr<request> req) {
+        //TBD
+        unimplemented();
+        return make_ready_future<json::json_return_type>(0);
    });

-    sp::get_cas_read_metrics_unfinished_commit.set(r, [&ctx](std::unique_ptr<request> req) {
-        return sum_stats(ctx.sp, &proxy::stats::cas_read_unfinished_commit);
+    sp::get_cas_read_metrics_unfinished_commit.set(r, [](std::unique_ptr<request> req) {
+        //TBD
+        unimplemented();
+        return make_ready_future<json::json_return_type>(0);
    });

-    sp::get_cas_read_metrics_contention.set(r, [&ctx](std::unique_ptr<request> req) {
-        return sum_estimated_histogram(ctx, &proxy::stats::cas_read_contention);
+    sp::get_cas_read_metrics_contention.set(r, [](std::unique_ptr<request> req) {
+        //TBD
+        unimplemented();
+        return make_ready_future<json::json_return_type>(0);
+    });
+
+    sp::get_cas_read_metrics_condition_not_met.set(r, [](std::unique_ptr<request> req) {
+        //TBD
+        unimplemented();
+        return make_ready_future<json::json_return_type>(0);
    });

    sp::get_read_metrics_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
@@ -351,11 +382,19 @@ void set_storage_proxy(http_context& ctx, routes& r) {
        return sum_timer_stats(ctx.sp, &proxy::stats::write);
    });
    sp::get_cas_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
-        return sum_timer_stats(ctx.sp, &proxy::stats::cas_write);
+        //TBD
+        // FIXME
+        // cas is not supported yet, so just return empty moving average
+
+        return make_ready_future<json::json_return_type>(get_empty_moving_average());
    });

    sp::get_cas_read_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
-        return sum_timer_stats(ctx.sp, &proxy::stats::cas_read);
+        //TBD
+        // FIXME
+        // cas is not supported yet, so just return empty moving average
+
+        return make_ready_future<json::json_return_type>(get_empty_moving_average());
    });

    sp::get_view_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
--- a/api/storage_service.cc
+++ b/api/storage_service.cc
@@ -27,7 +27,6 @@
 #include <boost/range/adaptor/map.hpp>
 #include <boost/range/adaptor/filtered.hpp>
 #include "service/storage_service.hh"
-#include "service/load_meter.hh"
 #include "db/commitlog/commitlog.hh"
 #include "gms/gossiper.hh"
 #include "db/system_keyspace.hh"
@@ -56,22 +55,26 @@ static sstring validate_keyspace(http_context& ctx, const parameters& param) {
    throw bad_param_exception("Keyspace " + param["keyspace"] + " Does not exist");
 }

-static ss::token_range token_range_endpoints_to_json(const dht::token_range_endpoints& d) {
-    ss::token_range r;
-    r.start_token = d._start_token;
-    r.end_token = d._end_token;
-    r.endpoints = d._endpoints;
-    r.rpc_endpoints = d._rpc_endpoints;
-    for (auto det : d._endpoint_details) {
-        ss::endpoint_detail ed;
-        ed.host = det._host;
-        ed.datacenter = det._datacenter;
-        if (det._rack != "") {
-            ed.rack = det._rack;
+static std::vector<ss::token_range> describe_ring(const sstring& keyspace) {
+    std::vector<ss::token_range> res;
+    for (auto d : service::get_local_storage_service().describe_ring(keyspace)) {
+        ss::token_range r;
+        r.start_token = d._start_token;
+        r.end_token = d._end_token;
+        r.endpoints = d._endpoints;
+        r.rpc_endpoints = d._rpc_endpoints;
+        for (auto det : d._endpoint_details) {
+            ss::endpoint_detail ed;
+            ed.host = det._host;
+            ed.datacenter = det._datacenter;
+            if (det._rack != "") {
+                ed.rack = det._rack;
+            }
+            r.endpoint_details.push(ed);
        }
-        r.endpoint_details.push(ed);
+        res.push_back(r);
    }
-    return r;
+    return res;
 }

 void set_storage_service(http_context& ctx, routes& r) {
@@ -173,13 +176,13 @@ void set_storage_service(http_context& ctx, routes& r) {
        return make_ready_future<json::json_return_type>(res);
    });

-    ss::describe_any_ring.set(r, [&ctx](std::unique_ptr<request> req) {
-        return make_ready_future<json::json_return_type>(stream_range_as_array(service::get_local_storage_service().describe_ring(""), token_range_endpoints_to_json));
+    ss::describe_any_ring.set(r, [&ctx](const_req req) {
+        return describe_ring("");
    });

-    ss::describe_ring.set(r, [&ctx](std::unique_ptr<request> req) {
-        auto keyspace = validate_keyspace(ctx, req->param);
-        return make_ready_future<json::json_return_type>(stream_range_as_array(service::get_local_storage_service().describe_ring(keyspace), token_range_endpoints_to_json));
+    ss::describe_ring.set(r, [&ctx](const_req req) {
+        auto keyspace = validate_keyspace(ctx, req.param);
+        return describe_ring(keyspace);
    });

    ss::get_host_id_map.set(r, [](const_req req) {
@@ -189,11 +192,11 @@ void set_storage_service(http_context& ctx, routes& r) {
    });

    ss::get_load.set(r, [&ctx](std::unique_ptr<request> req) {
-        return get_cf_stats(ctx, &column_family_stats::live_disk_space_used);
+        return get_cf_stats(ctx, &column_family::stats::live_disk_space_used);
    });

-    ss::get_load_map.set(r, [&ctx] (std::unique_ptr<request> req) {
-        return ctx.lmeter.get_load_map().then([] (auto&& load_map) {
+    ss::get_load_map.set(r, [] (std::unique_ptr<request> req) {
+        return service::get_local_storage_service().get_load_map().then([] (auto&& load_map) {
            std::vector<ss::map_string_double> res;
            for (auto i : load_map) {
                ss::map_string_double val;
@@ -251,9 +254,6 @@ void set_storage_service(http_context& ctx, routes& r) {
        if (column_family.empty()) {
            resp = service::get_local_storage_service().take_snapshot(tag, keynames);
        } else {
-            if (keynames.empty()) {
-                throw httpd::bad_param_exception("The keyspace of column families must be specified");
-            }
            if (keynames.size() > 1) {
                throw httpd::bad_param_exception("Only one keyspace allowed when specifying a column family");
            }
@@ -304,24 +304,17 @@ void set_storage_service(http_context& ctx, routes& r) {
        if (column_families.empty()) {
            column_families = map_keys(ctx.db.local().find_keyspace(keyspace).metadata().get()->cf_meta_data());
        }
-        return service::get_local_storage_service().is_cleanup_allowed(keyspace).then([&ctx, keyspace,
-                column_families = std::move(column_families)] (bool is_cleanup_allowed) mutable {
-            if (!is_cleanup_allowed) {
-                return make_exception_future<json::json_return_type>(
-                        std::runtime_error("Can not perform cleanup operation when topology changes"));
+        return ctx.db.invoke_on_all([keyspace, column_families] (database& db) {
+            std::vector<column_family*> column_families_vec;
+            auto& cm = db.get_compaction_manager();
+            for (auto cf : column_families) {
+                column_families_vec.push_back(&db.find_column_family(keyspace, cf));
            }
-            return ctx.db.invoke_on_all([keyspace, column_families] (database& db) {
-                std::vector<column_family*> column_families_vec;
-                auto& cm = db.get_compaction_manager();
-                for (auto cf : column_families) {
-                    column_families_vec.push_back(&db.find_column_family(keyspace, cf));
-                }
-                return parallel_for_each(column_families_vec, [&cm] (column_family* cf) {
-                    return cm.perform_cleanup(cf);
-                });
-            }).then([]{
-                return make_ready_future<json::json_return_type>(0);
+            return parallel_for_each(column_families_vec, [&cm] (column_family* cf) {
+                return cm.perform_cleanup(cf);
            });
+        }).then([]{
+            return make_ready_future<json::json_return_type>(0);
        });
    });

@@ -605,7 +598,9 @@ void set_storage_service(http_context& ctx, routes& r) {
    });

    ss::join_ring.set(r, [](std::unique_ptr<request> req) {
-        return make_ready_future<json::json_return_type>(json_void());
+        return service::get_local_storage_service().join_ring().then([] {
+            return make_ready_future<json::json_return_type>(json_void());
+        });
    });

    ss::is_joined.set(r, [] (std::unique_ptr<request> req) {
@@ -865,7 +860,7 @@ void set_storage_service(http_context& ctx, routes& r) {
    });

    ss::get_metrics_load.set(r, [&ctx](std::unique_ptr<request> req) {
-        return get_cf_stats(ctx, &column_family_stats::live_disk_space_used);
+        return get_cf_stats(ctx, &column_family::stats::live_disk_space_used);
    });

    ss::get_exceptions.set(r, [](const_req req) {
--- a/api/system.cc
+++ b/api/system.cc
@@ -30,10 +30,6 @@ namespace api {
 namespace hs = httpd::system_json;

 void set_system(http_context& ctx, routes& r) {
-    hs::get_system_uptime.set(r, [](const_req req) {
-        return std::chrono::duration_cast<std::chrono::milliseconds>(engine().uptime()).count();
-    });
-
    hs::get_all_logger_names.set(r, [](const_req req) {
        return logging::logger_registry().get_all_logger_names();
    });
--- a/atomic_cell.cc
+++ b/atomic_cell.cc
@@ -21,8 +21,8 @@

 #include "atomic_cell.hh"
 #include "atomic_cell_or_collection.hh"
-#include "counters.hh"
 #include "types.hh"
+#include "types/collection.hh"

 /// LSA mirator for cells with irrelevant type
 ///
@@ -148,6 +148,35 @@ atomic_cell_or_collection::atomic_cell_or_collection(const abstract_type& type,
 {
 }

+static collection_mutation_view get_collection_mutation_view(const uint8_t* ptr)
+{
+    auto f = data::cell::structure::get_member<data::cell::tags::flags>(ptr);
+    auto ti = data::type_info::make_collection();
+    data::cell::context ctx(f, ti);
+    auto view = data::cell::structure::get_member<data::cell::tags::cell>(ptr).as<data::cell::tags::collection>(ctx);
+    auto dv = data::cell::variable_value::make_view(view, f.get<data::cell::tags::external_data>());
+    return collection_mutation_view { dv };
+}
+
+collection_mutation_view atomic_cell_or_collection::as_collection_mutation() const {
+    return get_collection_mutation_view(_data.get());
+}
+
+collection_mutation::collection_mutation(const collection_type_impl& type, collection_mutation_view v)
+    : _data(imr_object_type::make(data::cell::make_collection(v.data), &type.imr_state().lsa_migrator()))
+{
+}
+
+collection_mutation::collection_mutation(const collection_type_impl& type, bytes_view v)
+    : _data(imr_object_type::make(data::cell::make_collection(v), &type.imr_state().lsa_migrator()))
+{
+}
+
+collection_mutation::operator collection_mutation_view() const
+{
+    return get_collection_mutation_view(_data.get());
+}
+
 bool atomic_cell_or_collection::equals(const abstract_type& type, const atomic_cell_or_collection& other) const
 {
    auto ptr_a = _data.get();
@@ -202,7 +231,7 @@ size_t atomic_cell_or_collection::external_memory_usage(const abstract_type& t)
    size_t external_value_size = 0;
    if (flags.get<data::cell::tags::external_data>()) {
        if (flags.get<data::cell::tags::collection>()) {
-            external_value_size = as_collection_mutation().data.size_bytes();
+            external_value_size = get_collection_mutation_view(_data.get()).data.size_bytes();
        } else {
            auto cell_view = data::cell::atomic_cell_view(t.imr_state().type_info(), view);
            external_value_size = cell_view.value_size();
@@ -215,61 +244,6 @@ size_t atomic_cell_or_collection::external_memory_usage(const abstract_type& t)
        + imr_object_type::size_overhead + external_value_size;
 }

-std::ostream&
-operator<<(std::ostream& os, const atomic_cell_view& acv) {
-    if (acv.is_live()) {
-        return fmt_print(os, "atomic_cell{{{},ts={:d},expiry={:d},ttl={:d}}}",
-            acv.is_counter_update()
-                    ? "counter_update_value=" + to_sstring(acv.counter_update_value())
-                    : to_hex(acv.value().linearize()),
-            acv.timestamp(),
-            acv.is_live_and_has_ttl() ? acv.expiry().time_since_epoch().count() : -1,
-            acv.is_live_and_has_ttl() ? acv.ttl().count() : 0);
-    } else {
-        return fmt_print(os, "atomic_cell{{DEAD,ts={:d},deletion_time={:d}}}",
-            acv.timestamp(), acv.deletion_time().time_since_epoch().count());
-    }
-}
-
-std::ostream&
-operator<<(std::ostream& os, const atomic_cell& ac) {
-    return os << atomic_cell_view(ac);
-}
-
-std::ostream&
-operator<<(std::ostream& os, const atomic_cell_view::printer& acvp) {
-    auto& type = acvp._type;
-    auto& acv = acvp._cell;
-    if (acv.is_live()) {
-        std::ostringstream cell_value_string_builder;
-        if (type.is_counter()) {
-            if (acv.is_counter_update()) {
-                cell_value_string_builder << "counter_update_value=" << acv.counter_update_value();
-            } else {
-                cell_value_string_builder << "shards: ";
-                counter_cell_view::with_linearized(acv, [&cell_value_string_builder] (counter_cell_view& ccv) {
-                    cell_value_string_builder << ::join(", ", ccv.shards());
-                });
-            }
-        } else {
-            cell_value_string_builder << type.to_string(acv.value().linearize());
-        }
-        return fmt_print(os, "atomic_cell{{{},ts={:d},expiry={:d},ttl={:d}}}",
-            cell_value_string_builder.str(),
-            acv.timestamp(),
-            acv.is_live_and_has_ttl() ? acv.expiry().time_since_epoch().count() : -1,
-            acv.is_live_and_has_ttl() ? acv.ttl().count() : 0);
-    } else {
-        return fmt_print(os, "atomic_cell{{DEAD,ts={:d},deletion_time={:d}}}",
-            acv.timestamp(), acv.deletion_time().time_since_epoch().count());
-    }
-}
-
-std::ostream&
-operator<<(std::ostream& os, const atomic_cell::printer& acp) {
-    return operator<<(os, static_cast<const atomic_cell_view::printer&>(acp));
-}
-
 std::ostream& operator<<(std::ostream& os, const atomic_cell_or_collection::printer& p) {
    if (!p._cell._data.get()) {
        return os << "{ null atomic_cell_or_collection }";
@@ -279,9 +253,9 @@ std::ostream& operator<<(std::ostream& os, const atomic_cell_or_collection::prin
    if (dc::structure::get_member<dc::tags::flags>(p._cell._data.get()).get<dc::tags::collection>()) {
        os << "collection ";
        auto cmv = p._cell.as_collection_mutation();
-        os << collection_mutation_view::printer(*p._cdef.type, cmv);
+        os << to_hex(cmv.data.linearize());
    } else {
-        os << atomic_cell_view::printer(*p._cdef.type, p._cell.as_atomic_cell(p._cdef));
+        os << p._cell.as_atomic_cell(p._cdef);
    }
    return os << " }";
 }
--- a/atomic_cell.hh
+++ b/atomic_cell.hh
@@ -153,14 +153,6 @@ public:
    }

    friend std::ostream& operator<<(std::ostream& os, const atomic_cell_view& acv);
-
-    class printer {
-        const abstract_type& _type;
-        const atomic_cell_view& _cell;
-    public:
-        printer(const abstract_type& type, const atomic_cell_view& cell) : _type(type), _cell(cell) {}
-        friend std::ostream& operator<<(std::ostream& os, const printer& acvp);
-    };
 };

 class atomic_cell_mutable_view final : public basic_atomic_cell_view<mutable_view::yes> {
@@ -227,12 +219,30 @@ public:
    static atomic_cell make_live_uninitialized(const abstract_type& type, api::timestamp_type timestamp, size_t size);
    friend class atomic_cell_or_collection;
    friend std::ostream& operator<<(std::ostream& os, const atomic_cell& ac);
+};

-    class printer : atomic_cell_view::printer {
-    public:
-        printer(const abstract_type& type, const atomic_cell_view& cell) : atomic_cell_view::printer(type, cell) {}
-        friend std::ostream& operator<<(std::ostream& os, const printer& acvp);
-    };
+class collection_mutation_view;
+
+// Represents a mutation of a collection.  Actual format is determined by collection type,
+// and is:
+//   set:  list of atomic_cell
+//   map:  list of pair<atomic_cell, bytes> (for key/value)
+//   list: tbd, probably ugly
+class collection_mutation {
+public:
+    using imr_object_type =  imr::utils::object<data::cell::structure>;
+    imr_object_type _data;
+
+    collection_mutation() {}
+    collection_mutation(const collection_type_impl&, collection_mutation_view v);
+    collection_mutation(const collection_type_impl&, bytes_view bv);
+    operator collection_mutation_view() const;
+};
+
+
+class collection_mutation_view {
+public:
+    atomic_cell_value_view data;
 };

 class column_definition;
--- a/atomic_cell_hash.hh
+++ b/atomic_cell_hash.hh
@@ -34,12 +34,14 @@ template<>
 struct appending_hash<collection_mutation_view> {
    template<typename Hasher>
    void operator()(Hasher& h, collection_mutation_view cell, const column_definition& cdef) const {
-        cell.with_deserialized(*cdef.type, [&] (collection_mutation_view_description m_view) {
-            ::feed_hash(h, m_view.tomb);
-            for (auto&& key_and_value : m_view.cells) {
-                ::feed_hash(h, key_and_value.first);
-                ::feed_hash(h, key_and_value.second, cdef);
-            }
+      cell.data.with_linearized([&] (bytes_view cell_bv) {
+        auto ctype = static_pointer_cast<const collection_type_impl>(cdef.type);
+        auto m_view = ctype->deserialize_mutation_form(cell_bv);
+        ::feed_hash(h, m_view.tomb);
+        for (auto&& key_and_value : m_view.cells) {
+            ::feed_hash(h, key_and_value.first);
+            ::feed_hash(h, key_and_value.second, cdef);
+        }
      });
    }
 };
--- a/atomic_cell_or_collection.hh
+++ b/atomic_cell_or_collection.hh
@@ -22,7 +22,6 @@
 #pragma once

 #include "atomic_cell.hh"
-#include "collection_mutation.hh"
 #include "schema.hh"
 #include "hashing.hh"

--- a/auth/role_manager.hh
+++ b/auth/role_manager.hh
@@ -33,7 +33,6 @@

 #include "auth/resource.hh"
 #include "seastarx.hh"
-#include "exceptions/exceptions.hh"

 namespace auth {

@@ -53,9 +52,9 @@ struct role_config_update final {
 ///
 /// A logical argument error for a role-management operation.
 ///
-class roles_argument_exception : public exceptions::invalid_request_exception {
+class roles_argument_exception : public std::invalid_argument {
 public:
-    using exceptions::invalid_request_exception::invalid_request_exception;
+    using std::invalid_argument::invalid_argument;
 };

 class role_already_exists : public roles_argument_exception {
--- a/auth/service.cc
+++ b/auth/service.cc
@@ -39,7 +39,7 @@
 #include "db/consistency_level_type.hh"
 #include "exceptions/exceptions.hh"
 #include "log.hh"
-#include "service/migration_manager.hh"
+#include "service/migration_listener.hh"
 #include "utils/class_registrator.hh"
 #include "database.hh"

@@ -77,23 +77,17 @@ private:
    void on_update_view(const sstring& ks_name, const sstring& view_name, bool columns_changed) override {}

    void on_drop_keyspace(const sstring& ks_name) override {
-        // Do it in the background.
-        (void)_authorizer.revoke_all(
+        _authorizer.revoke_all(
                auth::make_data_resource(ks_name)).handle_exception_type([](const unsupported_authorization_operation&) {
            // Nothing.
-        }).handle_exception([] (std::exception_ptr e) {
-            log.error("Unexpected exception while revoking all permissions on dropped keyspace: {}", e);
        });
    }

    void on_drop_column_family(const sstring& ks_name, const sstring& cf_name) override {
-        // Do it in the background.
-        (void)_authorizer.revoke_all(
+        _authorizer.revoke_all(
                auth::make_data_resource(
                        ks_name, cf_name)).handle_exception_type([](const unsupported_authorization_operation&) {
            // Nothing.
-        }).handle_exception([] (std::exception_ptr e) {
-            log.error("Unexpected exception while revoking all permissions on dropped table: {}", e);
        });
    }

@@ -114,14 +108,14 @@ static future<> validate_role_exists(const service& ser, std::string_view role_n
 service::service(
        permissions_cache_config c,
        cql3::query_processor& qp,
-        ::service::migration_notifier& mn,
+        ::service::migration_manager& mm,
        std::unique_ptr<authorizer> z,
        std::unique_ptr<authenticator> a,
        std::unique_ptr<role_manager> r)
            : _permissions_cache_config(std::move(c))
            , _permissions_cache(nullptr)
            , _qp(qp)
-            , _mnotifier(mn)
+            , _migration_manager(mm)
            , _authorizer(std::move(z))
            , _authenticator(std::move(a))
            , _role_manager(std::move(r))
@@ -141,19 +135,18 @@ service::service(
 service::service(
        permissions_cache_config c,
        cql3::query_processor& qp,
-        ::service::migration_notifier& mn,
        ::service::migration_manager& mm,
        const service_config& sc)
            : service(
                      std::move(c),
                      qp,
-                      mn,
+                      mm,
                      create_object<authorizer>(sc.authorizer_java_name, qp, mm),
                      create_object<authenticator>(sc.authenticator_java_name, qp, mm),
                      create_object<role_manager>(sc.role_manager_java_name, qp, mm)) {
 }

-future<> service::create_keyspace_if_missing(::service::migration_manager& mm) const {
+future<> service::create_keyspace_if_missing() const {
    auto& db = _qp.db();

    if (!db.has_keyspace(meta::AUTH_KS)) {
@@ -167,15 +160,15 @@ future<> service::create_keyspace_if_missing(::service::migration_manager& mm) c

        // We use min_timestamp so that default keyspace metadata will loose with any manual adjustments.
        // See issue #2129.
-        return mm.announce_new_keyspace(ksm, api::min_timestamp, false);
+        return _migration_manager.announce_new_keyspace(ksm, api::min_timestamp, false);
    }

    return make_ready_future<>();
 }

-future<> service::start(::service::migration_manager& mm) {
-    return once_among_shards([this, &mm] {
-        return create_keyspace_if_missing(mm);
+future<> service::start() {
+    return once_among_shards([this] {
+        return create_keyspace_if_missing();
    }).then([this] {
        return _role_manager->start().then([this] {
            return when_all_succeed(_authorizer->start(), _authenticator->start());
@@ -184,7 +177,7 @@ future<> service::start(::service::migration_manager& mm) {
        _permissions_cache = std::make_unique<permissions_cache>(_permissions_cache_config, *this, log);
    }).then([this] {
        return once_among_shards([this] {
-            _mnotifier.register_listener(_migration_listener.get());
+            _migration_manager.register_listener(_migration_listener.get());
            return make_ready_future<>();
        });
    });
@@ -193,9 +186,9 @@ future<> service::start(::service::migration_manager& mm) {
 future<> service::stop() {
    // Only one of the shards has the listener registered, but let's try to
    // unregister on each one just to make sure.
-    return _mnotifier.unregister_listener(_migration_listener.get()).then([this] {
-        return _permissions_cache->stop();
-    }).then([this] {
+    _migration_manager.unregister_listener(_migration_listener.get());
+
+    return _permissions_cache->stop().then([this] {
        return when_all_succeed(_role_manager->stop(), _authorizer->stop(), _authenticator->stop());
    });
 }
--- a/auth/service.hh
+++ b/auth/service.hh
@@ -28,7 +28,6 @@
 #include <seastar/core/future.hh>
 #include <seastar/core/sstring.hh>
 #include <seastar/util/bool_class.hh>
-#include <seastar/core/sharded.hh>

 #include "auth/authenticator.hh"
 #include "auth/authorizer.hh"
@@ -43,7 +42,6 @@ class query_processor;

 namespace service {
 class migration_manager;
-class migration_notifier;
 class migration_listener;
 }

@@ -78,15 +76,13 @@ public:
 ///
 /// All state associated with access-control is stored externally to any particular instance of this class.
 ///
-/// peering_sharded_service inheritance is needed to be able to access shard local authentication service
-/// given an object from another shard. Used for bouncing lwt requests to correct shard.
-class service final : public seastar::peering_sharded_service<service> {
+class service final {
    permissions_cache_config _permissions_cache_config;
    std::unique_ptr<permissions_cache> _permissions_cache;

    cql3::query_processor& _qp;

-    ::service::migration_notifier& _mnotifier;
+    ::service::migration_manager& _migration_manager;

    std::unique_ptr<authorizer> _authorizer;

@@ -101,7 +97,7 @@ public:
    service(
            permissions_cache_config,
            cql3::query_processor&,
-            ::service::migration_notifier&,
+            ::service::migration_manager&,
            std::unique_ptr<authorizer>,
            std::unique_ptr<authenticator>,
            std::unique_ptr<role_manager>);
@@ -114,11 +110,10 @@ public:
    service(
            permissions_cache_config,
            cql3::query_processor&,
-            ::service::migration_notifier&,
            ::service::migration_manager&,
            const service_config&);

-    future<> start(::service::migration_manager&);
+    future<> start();

    future<> stop();

@@ -164,7 +159,7 @@ public:
 private:
    future<bool> has_existing_legacy_users() const;

-    future<> create_keyspace_if_missing(::service::migration_manager& mm) const;
+    future<> create_keyspace_if_missing() const;
 };

 future<bool> has_superuser(const service&, const authenticated_user&);
--- a/auth/standard_role_manager.cc
+++ b/auth/standard_role_manager.cc
@@ -101,8 +101,8 @@ static future<std::optional<record>> find_record(cql3::query_processor& qp, std:
        return std::make_optional(
                record{
                        row.get_as<sstring>(sstring(meta::roles_table::role_col_name)),
-                        row.get_or<bool>("is_superuser", false),
-                        row.get_or<bool>("can_login", false),
+                        row.get_as<bool>("is_superuser"),
+                        row.get_as<bool>("can_login"),
                        (row.has("member_of")
                                 ? row.get_set<sstring>("member_of")
                                 : role_set())});
@@ -203,7 +203,7 @@ future<> standard_role_manager::migrate_legacy_metadata() const {
            internal_distributed_timeout_config()).then([this](::shared_ptr<cql3::untyped_result_set> results) {
        return do_for_each(*results, [this](const cql3::untyped_result_set_row& row) {
            role_config config;
-            config.is_superuser = row.get_or<bool>("super", false);
+            config.is_superuser = row.get_as<bool>("super");
            config.can_login = true;

            return do_with(
--- a/build_id.cc
+++ b/build_id.cc
@@ -1,71 +0,0 @@
-/*
- * Copyright (C) 2019 ScyllaDB
- */
-
-#include "build_id.hh"
-#include <fmt/printf.h>
-#include <link.h>
-#include <seastar/core/align.hh>
-#include <sstream>
-
-using namespace seastar;
-
-static const Elf64_Nhdr* get_nt_build_id(dl_phdr_info* info) {
-    auto base = info->dlpi_addr;
-    const auto* h = info->dlpi_phdr;
-    auto num_headers = info->dlpi_phnum;
-    for (int i = 0; i != num_headers; ++i, ++h) {
-        if (h->p_type != PT_NOTE) {
-            continue;
-        }
-
-        auto* p = reinterpret_cast<const char*>(base) + h->p_vaddr;
-        auto* e = p + h->p_memsz;
-        while (p != e) {
-            const auto* n = reinterpret_cast<const Elf64_Nhdr*>(p);
-            if (n->n_type == NT_GNU_BUILD_ID) {
-                return n;
-            }
-
-            p += sizeof(Elf64_Nhdr);
-
-            p += n->n_namesz;
-            p = align_up(p, 4);
-
-            p += n->n_descsz;
-            p = align_up(p, 4);
-        }
-    }
-
-    assert(0 && "no NT_GNU_BUILD_ID note");
-}
-
-static int callback(dl_phdr_info* info, size_t size, void* data) {
-    std::string& ret = *(std::string*)data;
-    std::ostringstream os;
-
-    // The first DSO is always the main program, which has an empty name.
-    assert(strlen(info->dlpi_name) == 0);
-
-    auto* n = get_nt_build_id(info);
-    auto* p = reinterpret_cast<const char*>(n);
-
-    p += sizeof(Elf64_Nhdr);
-
-    p += n->n_namesz;
-    p = align_up(p, 4);
-
-    const char* desc = p;
-    for (unsigned i = 0; i < n->n_descsz; ++i) {
-        fmt::fprintf(os, "%02x", (unsigned char)*(desc + i));
-    }
-    ret = os.str();
-    return 1;
-}
-
-std::string get_build_id() {
-    std::string ret;
-    int r = dl_iterate_phdr(callback, &ret);
-    assert(r == 1);
-    return ret;
-}
--- a/build_id.hh
+++ b/build_id.hh
@@ -1,9 +0,0 @@
-/*
- * Copyright (C) 2019 ScyllaDB
- */
-
-#pragma once
-
-#include <string>
-
-std::string get_build_id();
--- a/bytes_ostream.hh
+++ b/bytes_ostream.hh
@@ -38,7 +38,6 @@ class bytes_ostream {
 public:
    using size_type = bytes::size_type;
    using value_type = bytes::value_type;
-    using fragment_type = bytes_view;
    static constexpr size_type max_chunk_size() { return 128 * 1024; }
 private:
    static_assert(sizeof(value_type) == 1, "value_type is assumed to be one byte long");
@@ -94,29 +93,6 @@ public:
            return _current != other._current;
        }
    };
-    using const_iterator = fragment_iterator;
-
-    class output_iterator {
-    public:
-        using iterator_category = std::output_iterator_tag;
-        using difference_type = std::ptrdiff_t;
-        using value_type = bytes_ostream::value_type;
-        using pointer = bytes_ostream::value_type*;
-        using reference = bytes_ostream::value_type&;
-
-        friend class bytes_ostream;
-
-    private:
-        bytes_ostream* _ostream = nullptr;
-
-    private:
-        explicit output_iterator(bytes_ostream& os) : _ostream(&os) { }
-
-    public:
-        reference operator*() const { return *_ostream->write_place_holder(1); }
-        output_iterator& operator++() { return *this; }
-        output_iterator operator++(int) { return *this; }
-    };
 private:
    inline size_type current_space_left() const {
        if (!_current) {
@@ -313,11 +289,6 @@ public:
        return _size;
    }

-    // For the FragmentRange concept
-    size_type size_bytes() const {
-        return _size;
-    }
-
    bool empty() const {
        return _size == 0;
    }
@@ -355,8 +326,6 @@ public:
    fragment_iterator begin() const { return { _begin.get() }; }
    fragment_iterator end() const { return { nullptr }; }

-    output_iterator write_begin() { return output_iterator(*this); }
-
    boost::iterator_range<fragment_iterator> fragments() const {
        return { begin(), end() };
    }
--- a/cache_flat_mutation_reader.hh
+++ b/cache_flat_mutation_reader.hh
@@ -61,7 +61,6 @@ class cache_flat_mutation_reader final : public flat_mutation_reader::impl {
        // - _last_row points at a direct predecessor of the next row which is going to be read.
        //   Used for populating continuity.
        // - _population_range_starts_before_all_rows is set accordingly
-        // - _underlying is engaged and fast-forwarded
        reading_from_underlying,

        end_of_stream
@@ -100,13 +99,7 @@ class cache_flat_mutation_reader final : public flat_mutation_reader::impl {
    // forward progress is not guaranteed in case iterators are getting constantly invalidated.
    bool _lower_bound_changed = false;

-    // Points to the underlying reader conforming to _schema,
-    // either to *_underlying_holder or _read_context->underlying().underlying().
-    flat_mutation_reader* _underlying = nullptr;
-    std::optional<flat_mutation_reader> _underlying_holder;
-
    future<> do_fill_buffer(db::timeout_clock::time_point);
-    future<> ensure_underlying(db::timeout_clock::time_point);
    void copy_from_cache_to_buffer();
    future<> process_static_row(db::timeout_clock::time_point);
    void move_to_end();
@@ -193,22 +186,23 @@ future<> cache_flat_mutation_reader::process_static_row(db::timeout_clock::time_
        return make_ready_future<>();
    } else {
        _read_context->cache().on_row_miss();
-        return ensure_underlying(timeout).then([this, timeout] {
-            return (*_underlying)(timeout).then([this] (mutation_fragment_opt&& sr) {
-                if (sr) {
-                    assert(sr->is_static_row());
-                    maybe_add_to_cache(sr->as_static_row());
-                    push_mutation_fragment(std::move(*sr));
-                }
-                maybe_set_static_row_continuous();
-            });
+        return _read_context->get_next_fragment(timeout).then([this] (mutation_fragment_opt&& sr) {
+            if (sr) {
+                assert(sr->is_static_row());
+                maybe_add_to_cache(sr->as_static_row());
+                push_mutation_fragment(std::move(*sr));
+            }
+            maybe_set_static_row_continuous();
        });
    }
 }

 inline
 void cache_flat_mutation_reader::touch_partition() {
-    _snp->touch();
+    if (_snp->at_latest_version()) {
+        rows_entry& last_dummy = *_snp->version()->partition().clustered_rows().rbegin();
+        _snp->tracker()->touch(last_dummy);
+    }
 }

 inline
@@ -238,36 +232,14 @@ future<> cache_flat_mutation_reader::fill_buffer(db::timeout_clock::time_point t
    });
 }

-inline
-future<> cache_flat_mutation_reader::ensure_underlying(db::timeout_clock::time_point timeout) {
-    if (_underlying) {
-        return make_ready_future<>();
-    }
-    return _read_context->ensure_underlying(timeout).then([this, timeout] {
-        flat_mutation_reader& ctx_underlying = _read_context->underlying().underlying();
-        if (ctx_underlying.schema() != _schema) {
-            _underlying_holder = make_delegating_reader(ctx_underlying);
-            _underlying_holder->upgrade_schema(_schema);
-            _underlying = &*_underlying_holder;
-        } else {
-            _underlying = &ctx_underlying;
-        }
-    });
-}
-
 inline
 future<> cache_flat_mutation_reader::do_fill_buffer(db::timeout_clock::time_point timeout) {
    if (_state == state::move_to_underlying) {
-        if (!_underlying) {
-            return ensure_underlying(timeout).then([this, timeout] {
-                return do_fill_buffer(timeout);
-            });
-        }
        _state = state::reading_from_underlying;
        _population_range_starts_before_all_rows = _lower_bound.is_before_all_clustered_rows(*_schema);
        auto end = _next_row_in_range ? position_in_partition(_next_row.position())
                                      : position_in_partition(_upper_bound);
-        return _underlying->fast_forward_to(position_range{_lower_bound, std::move(end)}, timeout).then([this, timeout] {
+        return _read_context->fast_forward_to(position_range{_lower_bound, std::move(end)}, timeout).then([this, timeout] {
            return read_from_underlying(timeout);
        });
    }
@@ -308,7 +280,7 @@ future<> cache_flat_mutation_reader::do_fill_buffer(db::timeout_clock::time_poin

 inline
 future<> cache_flat_mutation_reader::read_from_underlying(db::timeout_clock::time_point timeout) {
-    return consume_mutation_fragments_until(*_underlying,
+    return consume_mutation_fragments_until(_read_context->underlying().underlying(),
        [this] { return _state != state::reading_from_underlying || is_buffer_full(); },
        [this] (mutation_fragment mf) {
            _read_context->cache().on_row_miss();
--- a/canonical_mutation.cc
+++ b/canonical_mutation.cc
@@ -35,7 +35,6 @@
 #include "idl/uuid.dist.impl.hh"
 #include "idl/keys.dist.impl.hh"
 #include "idl/mutation.dist.impl.hh"
-#include <iostream>

 canonical_mutation::canonical_mutation(bytes data)
        : _data(std::move(data))
@@ -80,8 +79,7 @@ mutation canonical_mutation::to_mutation(schema_ptr s) const {

    if (version == m.schema()->version()) {
        auto partition_view = mutation_partition_view::from_view(mv.partition());
-        mutation_application_stats app_stats;
-        m.partition().apply(*m.schema(), partition_view, *m.schema(), app_stats);
+        m.partition().apply(*m.schema(), partition_view, *m.schema());
    } else {
        column_mapping cm = mv.mapping();
        converting_mutation_partition_applier v(cm, *m.schema(), m.partition());
@@ -90,81 +88,3 @@ mutation canonical_mutation::to_mutation(schema_ptr s) const {
    }
    return m;
 }
-
-static sstring bytes_to_text(bytes_view bv) {
-    sstring ret(sstring::initialized_later(), bv.size());
-    std::copy_n(reinterpret_cast<const char*>(bv.data()), bv.size(), ret.data());
-    return ret;
-}
-
-std::ostream& operator<<(std::ostream& os, const canonical_mutation& cm) {
-    auto in = ser::as_input_stream(cm._data);
-    auto mv = ser::deserialize(in, boost::type<ser::canonical_mutation_view>());
-    column_mapping mapping = mv.mapping();
-    auto partition_view = mutation_partition_view::from_view(mv.partition());
-    fmt::print(os, "{{canonical_mutation: ");
-    fmt::print(os, "table_id {} schema_version {} ", mv.table_id(), mv.schema_version());
-    fmt::print(os, "partition_key {} ", mv.key());
-    class printing_visitor : public mutation_partition_view_virtual_visitor {
-        std::ostream& _os;
-        const column_mapping& _cm;
-        bool _first = true;
-        bool _in_row = false;
-    private:
-        void print_separator() {
-            if (!_first) {
-                fmt::print(_os, ", ");
-            }
-            _first = false;
-        }
-    public:
-        printing_visitor(std::ostream& os, const column_mapping& cm) : _os(os), _cm(cm) {}
-        virtual void accept_partition_tombstone(tombstone t) override {
-            print_separator();
-            fmt::print(_os, "partition_tombstone {}", t);
-        }
-        virtual void accept_static_cell(column_id id, atomic_cell ac) override {
-            print_separator();
-            auto&& entry = _cm.static_column_at(id);
-            fmt::print(_os, "static column {} {}", bytes_to_text(entry.name()), atomic_cell::printer(*entry.type(), ac));
-        }
-        virtual void accept_static_cell(column_id id, collection_mutation_view cmv) override {
-            print_separator();
-            auto&& entry = _cm.static_column_at(id);
-            fmt::print(_os, "static column {} {}", bytes_to_text(entry.name()), collection_mutation_view::printer(*entry.type(), cmv));
-        }
-        virtual void accept_row_tombstone(range_tombstone rt) override {
-            print_separator();
-            fmt::print(_os, "row tombstone {}", rt);
-        }
-        virtual void accept_row(position_in_partition_view pipv, row_tombstone rt, row_marker rm, is_dummy, is_continuous) override {
-            if (_in_row) {
-                fmt::print(_os, "}}, ");
-            }
-            fmt::print(_os, "{{row {} tombstone {} marker {}", pipv, rt, rm);
-            _in_row = true;
-            _first = false;
-        }
-        virtual void accept_row_cell(column_id id, atomic_cell ac) override {
-            print_separator();
-            auto&& entry = _cm.regular_column_at(id);
-            fmt::print(_os, "column {} {}", bytes_to_text(entry.name()), atomic_cell::printer(*entry.type(), ac));
-        }
-        virtual void accept_row_cell(column_id id, collection_mutation_view cmv) override {
-            print_separator();
-            auto&& entry = _cm.regular_column_at(id);
-            fmt::print(_os, "column {} {}", bytes_to_text(entry.name()), collection_mutation_view::printer(*entry.type(), cmv));
-        }
-        void finalize() {
-            if (_in_row) {
-                fmt::print(_os, "}}");
-            }
-        }
-    };
-    printing_visitor pv(os, mapping);
-    partition_view.accept(mapping, pv);
-    pv.finalize();
-    fmt::print(os, "}}");
-    return os;
-}
-
--- a/canonical_mutation.hh
+++ b/canonical_mutation.hh
@@ -26,7 +26,6 @@
 #include "database_fwd.hh"
 #include "mutation_partition_visitor.hh"
 #include "mutation_partition_serializer.hh"
-#include <iosfwd>

 // Immutable mutation form which can be read using any schema version of the same table.
 // Safe to access from other shards via const&.
@@ -53,5 +52,4 @@ public:

    const bytes& representation() const { return _data; }

-    friend std::ostream& operator<<(std::ostream& os, const canonical_mutation& cm);
 };
--- a/cdc/cdc.cc
+++ b/cdc/cdc.cc
@@ -1,835 +0,0 @@
-/*
- * Copyright (C) 2019 ScyllaDB
- */
-
-/*
- * This file is part of Scylla.
- *
- * Scylla is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Affero General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * Scylla is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <utility>
-#include <algorithm>
-
-#include <boost/range/irange.hpp>
-#include <seastar/util/defer.hh>
-#include <seastar/core/thread.hh>
-
-#include "cdc/cdc.hh"
-#include "bytes.hh"
-#include "database.hh"
-#include "db/config.hh"
-#include "dht/murmur3_partitioner.hh"
-#include "partition_slice_builder.hh"
-#include "schema.hh"
-#include "schema_builder.hh"
-#include "service/migration_listener.hh"
-#include "service/storage_service.hh"
-#include "types/tuple.hh"
-#include "cql3/statements/select_statement.hh"
-#include "cql3/multi_column_relation.hh"
-#include "cql3/tuples.hh"
-#include "log.hh"
-#include "json.hh"
-
-using locator::snitch_ptr;
-using locator::token_metadata;
-using locator::topology;
-using seastar::sstring;
-using service::migration_notifier;
-using service::storage_proxy;
-
-namespace std {
-
-template<> struct hash<std::pair<net::inet_address, unsigned int>> {
-    std::size_t operator()(const std::pair<net::inet_address, unsigned int> &p) const {
-        return std::hash<net::inet_address>{}(p.first) ^ std::hash<int>{}(p.second);
-    }
-};
-
-}
-
-using namespace std::chrono_literals;
-
-static logging::logger cdc_log("cdc");
-
-namespace cdc {
-static schema_ptr create_log_schema(const schema&, std::optional<utils::UUID> = {});
-static schema_ptr create_stream_description_table_schema(const schema&, std::optional<utils::UUID> = {});
-static future<> populate_desc(db_context ctx, const schema& s);
-}
-
-class cdc::cdc_service::impl : service::migration_listener::empty_listener {
-    friend cdc_service;
-    db_context _ctxt;
-    bool _stopped = false;
-public:
-    impl(db_context ctxt)
-        : _ctxt(std::move(ctxt))
-    {
-        _ctxt._migration_notifier.register_listener(this);
-    }
-    ~impl() {
-        assert(_stopped);
-    }
-
-    future<> stop() {
-        return _ctxt._migration_notifier.unregister_listener(this).then([this] {
-            _stopped = true;
-        });
-    }
-
-    void on_before_create_column_family(const schema& schema, std::vector<mutation>& mutations, api::timestamp_type timestamp) override {
-        if (schema.cdc_options().enabled()) {
-            auto& db = _ctxt._proxy.get_db().local();
-            auto logname = log_name(schema.cf_name());
-            if (!db.has_schema(schema.ks_name(), logname)) {
-                // in seastar thread
-                auto log_schema = create_log_schema(schema);
-                auto stream_desc_schema = create_stream_description_table_schema(schema);
-                auto& keyspace = db.find_keyspace(schema.ks_name());
-
-                auto log_mut = db::schema_tables::make_create_table_mutations(keyspace.metadata(), log_schema, timestamp);
-                auto stream_mut = db::schema_tables::make_create_table_mutations(keyspace.metadata(), stream_desc_schema, timestamp);
-
-                mutations.insert(mutations.end(), std::make_move_iterator(log_mut.begin()), std::make_move_iterator(log_mut.end()));
-                mutations.insert(mutations.end(), std::make_move_iterator(stream_mut.begin()), std::make_move_iterator(stream_mut.end()));
-            }
-        }
-    }
-
-    void on_before_update_column_family(const schema& new_schema, const schema& old_schema, std::vector<mutation>& mutations, api::timestamp_type timestamp) override {
-        bool is_cdc = new_schema.cdc_options().enabled();
-        bool was_cdc = old_schema.cdc_options().enabled();
-
-        // we need to create or modify the log & stream schemas iff either we changed cdc status (was != is)
-        // or if cdc is on now unconditionally, since then any actual base schema changes will affect the column 
-        // etc.
-        if (was_cdc || is_cdc) {
-            auto logname = log_name(old_schema.cf_name());
-            auto descname = desc_name(old_schema.cf_name());
-            auto& db = _ctxt._proxy.get_db().local();
-            auto& keyspace = db.find_keyspace(old_schema.ks_name());
-            auto log_schema = was_cdc ? db.find_column_family(old_schema.ks_name(), logname).schema() : nullptr;
-            auto stream_desc_schema = was_cdc ? db.find_column_family(old_schema.ks_name(), descname).schema() : nullptr;
-
-            if (!is_cdc) {
-                auto log_mut = db::schema_tables::make_drop_table_mutations(keyspace.metadata(), log_schema, timestamp);
-                auto stream_mut = db::schema_tables::make_drop_table_mutations(keyspace.metadata(), stream_desc_schema, timestamp);
-
-                mutations.insert(mutations.end(), std::make_move_iterator(log_mut.begin()), std::make_move_iterator(log_mut.end()));
-                mutations.insert(mutations.end(), std::make_move_iterator(stream_mut.begin()), std::make_move_iterator(stream_mut.end()));
-                return;
-            }
-
-            auto new_log_schema = create_log_schema(new_schema, log_schema ? std::make_optional(log_schema->id()) : std::nullopt);
-            auto new_stream_desc_schema = create_stream_description_table_schema(new_schema, stream_desc_schema ? std::make_optional(stream_desc_schema->id()) : std::nullopt);
-
-            auto log_mut = log_schema 
-                ? db::schema_tables::make_update_table_mutations(keyspace.metadata(), log_schema, new_log_schema, timestamp, false)
-                : db::schema_tables::make_create_table_mutations(keyspace.metadata(), new_log_schema, timestamp)
-                ;
-            auto stream_mut = stream_desc_schema 
-                ? db::schema_tables::make_update_table_mutations(keyspace.metadata(), stream_desc_schema, new_stream_desc_schema, timestamp, false)
-                : db::schema_tables::make_create_table_mutations(keyspace.metadata(), new_stream_desc_schema, timestamp)
-                ;
-
-            mutations.insert(mutations.end(), std::make_move_iterator(log_mut.begin()), std::make_move_iterator(log_mut.end()));
-            mutations.insert(mutations.end(), std::make_move_iterator(stream_mut.begin()), std::make_move_iterator(stream_mut.end()));
-        }
-    }
-
-    void on_before_drop_column_family(const schema& schema, std::vector<mutation>& mutations, api::timestamp_type timestamp) override {
-        if (schema.cdc_options().enabled()) {
-            auto logname = log_name(schema.cf_name());
-            auto descname = desc_name(schema.cf_name());
-            auto& db = _ctxt._proxy.get_db().local();
-            auto& keyspace = db.find_keyspace(schema.ks_name());
-            auto log_schema = db.find_column_family(schema.ks_name(), logname).schema();
-            auto stream_desc_schema = db.find_column_family(schema.ks_name(), descname).schema();
-
-            auto log_mut = db::schema_tables::make_drop_table_mutations(keyspace.metadata(), log_schema, timestamp);
-            auto stream_mut = db::schema_tables::make_drop_table_mutations(keyspace.metadata(), stream_desc_schema, timestamp);
-
-            mutations.insert(mutations.end(), std::make_move_iterator(log_mut.begin()), std::make_move_iterator(log_mut.end()));
-            mutations.insert(mutations.end(), std::make_move_iterator(stream_mut.begin()), std::make_move_iterator(stream_mut.end()));
-        }
-    }
-
-    void on_create_column_family(const sstring& ks_name, const sstring& cf_name) override {
-        // This callback is done on all shards. Only do the work once. 
-        if (engine().cpu_id() != 0) {
-            return; 
-        }
-        auto& db = _ctxt._proxy.get_db().local();
-        auto& cf = db.find_column_family(ks_name, cf_name);
-        auto schema = cf.schema();
-        if (schema->cdc_options().enabled()) {
-            populate_desc(_ctxt, *schema).get();
-        }
-    }
-
-    void on_update_column_family(const sstring& ks_name, const sstring& cf_name, bool columns_changed) override {
-        on_create_column_family(ks_name, cf_name);
-    }
-
-    void on_drop_column_family(const sstring& ks_name, const sstring& cf_name) override {}
-
-    future<std::tuple<std::vector<mutation>, result_callback>> augment_mutation_call(
-        lowres_clock::time_point timeout,
-        std::vector<mutation>&& mutations
-    );
-
-    template<typename Iter>
-    future<> append_mutations(Iter i, Iter e, schema_ptr s, lowres_clock::time_point, std::vector<mutation>&);
-};
-
-cdc::cdc_service::cdc_service(service::storage_proxy& proxy)
-    : cdc_service(db_context::builder(proxy).build())
-{}
-
-cdc::cdc_service::cdc_service(db_context ctxt)
-    : _impl(std::make_unique<impl>(std::move(ctxt)))
-{
-    _impl->_ctxt._proxy.set_cdc_service(this);
-}
-
-future<> cdc::cdc_service::stop() {
-    return _impl->stop();
-}
-
-cdc::cdc_service::~cdc_service() = default;
-
-cdc::options::options(const std::map<sstring, sstring>& map) {
-    if (map.find("enabled") == std::end(map)) {
-        return;
-    }
-
-    for (auto& p : map) {
-        if (p.first == "enabled") {
-            _enabled = p.second == "true";
-        } else if (p.first == "preimage") {
-            _preimage = p.second == "true";
-        } else if (p.first == "postimage") {
-            _postimage = p.second == "true";
-        } else if (p.first == "ttl") {
-            _ttl = std::stoi(p.second);
-        } else {
-            throw exceptions::configuration_exception("Invalid CDC option: " + p.first);
-        }
-    }
-}
-
-std::map<sstring, sstring> cdc::options::to_map() const {
-    if (!_enabled) {
-        return {};
-    }
-    return {
-        { "enabled", _enabled ? "true" : "false" },
-        { "preimage", _preimage ? "true" : "false" },
-        { "postimage", _postimage ? "true" : "false" },
-        { "ttl", std::to_string(_ttl) },
-    };
-}
-
-sstring cdc::options::to_sstring() const {
-    return json::to_json(to_map());
-}
-
-bool cdc::options::operator==(const options& o) const {
-    return _enabled == o._enabled && _preimage == o._preimage && _postimage == o._postimage && _ttl == o._ttl;
-}
-bool cdc::options::operator!=(const options& o) const {
-    return !(*this == o);
-}
-
-namespace cdc {
-
-using operation_native_type = std::underlying_type_t<operation>;
-using column_op_native_type = std::underlying_type_t<column_op>;
-
-sstring log_name(const sstring& table_name) {
-    static constexpr auto cdc_log_suffix = "_scylla_cdc_log";
-    return table_name + cdc_log_suffix;
-}
-
-sstring desc_name(const sstring& table_name) {
-    static constexpr auto cdc_desc_suffix = "_scylla_cdc_desc";
-    return table_name + cdc_desc_suffix;
-}
-
-static schema_ptr create_log_schema(const schema& s, std::optional<utils::UUID> uuid) {
-    schema_builder b(s.ks_name(), log_name(s.cf_name()));
-    b.set_comment(sprint("CDC log for %s.%s", s.ks_name(), s.cf_name()));
-    b.with_column("stream_id", uuid_type, column_kind::partition_key);
-    b.with_column("time", timeuuid_type, column_kind::clustering_key);
-    b.with_column("batch_seq_no", int32_type, column_kind::clustering_key);
-    b.with_column("operation", data_type_for<operation_native_type>());
-    b.with_column("ttl", long_type);
-    auto add_columns = [&] (const schema::const_iterator_range_type& columns, bool is_data_col = false) {
-        for (const auto& column : columns) {
-            auto type = column.type;
-            if (is_data_col) {
-                type = tuple_type_impl::get_instance({ /* op */ data_type_for<column_op_native_type>(), /* value */ type, /* ttl */long_type});
-            }
-            b.with_column("_" + column.name(), type);
-        }
-    };
-    add_columns(s.partition_key_columns());
-    add_columns(s.clustering_key_columns());
-    add_columns(s.static_columns(), true);
-    add_columns(s.regular_columns(), true);
-
-    if (uuid) {
-        b.set_uuid(*uuid);
-    }
-    
-    return b.build();
-}
-
-static schema_ptr create_stream_description_table_schema(const schema& s, std::optional<utils::UUID> uuid) {
-    schema_builder b(s.ks_name(), desc_name(s.cf_name()));
-    b.set_comment(sprint("CDC description for %s.%s", s.ks_name(), s.cf_name()));
-    b.with_column("node_ip", inet_addr_type, column_kind::partition_key);
-    b.with_column("shard_id", int32_type, column_kind::partition_key);
-    b.with_column("created_at", timestamp_type, column_kind::clustering_key);
-    b.with_column("stream_id", uuid_type);
-
-    if (uuid) {
-        b.set_uuid(*uuid);
-    }
-
-    return b.build();
-}
-
-// This function assumes setup_stream_description_table was called on |s| before the call to this
-// function.
-static future<> populate_desc(db_context ctx, const schema& s) {
-    auto& db = ctx._proxy.get_db().local();
-    auto desc_schema =
-        db.find_schema(s.ks_name(), desc_name(s.cf_name()));
-    auto log_schema =
-        db.find_schema(s.ks_name(), log_name(s.cf_name()));
-    auto belongs_to = [&](const gms::inet_address& endpoint,
-                          const unsigned int shard_id,
-                          const int shard_count,
-                          const unsigned int ignore_msb_bits,
-                          const utils::UUID& stream_id) {
-        const auto log_pk = partition_key::from_singular(*log_schema,
-                                                         data_value(stream_id));
-        const auto token = ctx._partitioner.decorate_key(*log_schema, log_pk).token();
-        if (ctx._token_metadata.get_endpoint(ctx._token_metadata.first_token(token)) != endpoint) {
-            return false;
-        }
-        const auto owning_shard_id = dht::murmur3_partitioner(shard_count, ignore_msb_bits).shard_of(token);
-        return owning_shard_id == shard_id;
-    };
-
-    std::vector<mutation> mutations;
-    const auto ts = api::new_timestamp();
-    const auto ck = clustering_key::from_single_value(
-            *desc_schema, timestamp_type->decompose(ts));
-    auto cdef = desc_schema->get_column_definition(to_bytes("stream_id"));
-
-    for (const auto& dc : ctx._token_metadata.get_topology().get_datacenter_endpoints()) {
-        for (const auto& endpoint : dc.second) {
-            const auto decomposed_ip = inet_addr_type->decompose(endpoint.addr());
-            const unsigned int shard_count = ctx._snitch->get_shard_count(endpoint);
-            const unsigned int ignore_msb_bits = ctx._snitch->get_ignore_msb_bits(endpoint);
-            for (unsigned int shard_id = 0; shard_id < shard_count; ++shard_id) {
-                const auto pk = partition_key::from_exploded(
-                        *desc_schema, { decomposed_ip, int32_type->decompose(static_cast<int>(shard_id)) });
-                mutations.emplace_back(desc_schema, pk);
-
-                auto stream_id = utils::make_random_uuid();
-                while (!belongs_to(endpoint, shard_id, shard_count, ignore_msb_bits, stream_id)) {
-                    stream_id = utils::make_random_uuid();
-                }
-                auto value = atomic_cell::make_live(*uuid_type,
-                                                    ts,
-                                                    uuid_type->decompose(stream_id));
-                mutations.back().set_cell(ck, *cdef, std::move(value));
-            }
-        }
-    }
-    return ctx._proxy.mutate(std::move(mutations),
-                             db::consistency_level::QUORUM,
-                             db::no_timeout,
-                             nullptr,
-                             empty_service_permit());
-}
-
-db_context::builder::builder(service::storage_proxy& proxy) 
-    : _proxy(proxy) 
-{}
-
-db_context::builder& db_context::builder::with_migration_notifier(service::migration_notifier& migration_notifier) {
-    _migration_notifier = migration_notifier;
-    return *this;
-}
-
-db_context::builder& db_context::builder::with_token_metadata(locator::token_metadata& token_metadata) {
-    _token_metadata = token_metadata;
-    return *this;
-}
-
-db_context::builder& db_context::builder::with_snitch(locator::snitch_ptr& snitch) {
-    _snitch = snitch;
-    return *this;
-}
-
-db_context::builder& db_context::builder::with_partitioner(dht::i_partitioner& partitioner) {
-    _partitioner = partitioner;
-    return *this;
-}
-
-db_context db_context::builder::build() {
-    return db_context{
-        _proxy,
-        _migration_notifier ? _migration_notifier->get() : service::get_local_storage_service().get_migration_notifier(),
-        _token_metadata ? _token_metadata->get() : service::get_local_storage_service().get_token_metadata(),
-        _snitch ? _snitch->get() : locator::i_endpoint_snitch::get_local_snitch_ptr(),
-        _partitioner ? _partitioner->get() : dht::global_partitioner()
-    };
-}
-
-class transformer final {
-public:
-    using streams_type = std::unordered_map<std::pair<net::inet_address, unsigned int>, utils::UUID>;
-private:
-    db_context _ctx;
-    schema_ptr _schema;
-    schema_ptr _log_schema;
-    utils::UUID _time;
-    bytes _decomposed_time;
-    ::shared_ptr<const transformer::streams_type> _streams;
-    const column_definition& _op_col;
-    ttl_opt _cdc_ttl_opt;
-
-    clustering_key set_pk_columns(const partition_key& pk, int batch_no, mutation& m) const {
-        const auto log_ck = clustering_key::from_exploded(
-                *m.schema(), { _decomposed_time, int32_type->decompose(batch_no) });
-        auto pk_value = pk.explode(*_schema);
-        size_t pos = 0;
-        for (const auto& column : _schema->partition_key_columns()) {
-            assert (pos < pk_value.size());
-            auto cdef = m.schema()->get_column_definition(to_bytes("_" + column.name()));
-            auto value = atomic_cell::make_live(*column.type,
-                                                _time.timestamp(),
-                                                bytes_view(pk_value[pos]),
-                                                _cdc_ttl_opt);
-            m.set_cell(log_ck, *cdef, std::move(value));
-            ++pos;
-        }
-        return log_ck;
-    }
-
-    void set_operation(const clustering_key& ck, operation op, mutation& m) const {
-        m.set_cell(ck, _op_col, atomic_cell::make_live(*_op_col.type, _time.timestamp(), _op_col.type->decompose(operation_native_type(op)), _cdc_ttl_opt));
-    }
-
-    partition_key stream_id(const net::inet_address& ip, unsigned int shard_id) const {
-        auto it = _streams->find(std::make_pair(ip, shard_id));
-        if (it == std::end(*_streams)) {
-                throw std::runtime_error(format("No stream found for node {} and shard {}", ip, shard_id));
-        }
-        return partition_key::from_exploded(*_log_schema, { uuid_type->decompose(it->second) });
-    }
-public:
-    transformer(db_context ctx, schema_ptr s, ::shared_ptr<const transformer::streams_type> streams)
-        : _ctx(ctx)
-        , _schema(std::move(s))
-        , _log_schema(ctx._proxy.get_db().local().find_schema(_schema->ks_name(), log_name(_schema->cf_name())))
-        , _time(utils::UUID_gen::get_time_UUID())
-        , _decomposed_time(timeuuid_type->decompose(_time))
-        , _streams(std::move(streams))
-        , _op_col(*_log_schema->get_column_definition(to_bytes("operation")))
-    {
-        if (_schema->cdc_options().ttl()) {
-            _cdc_ttl_opt = std::chrono::seconds(_schema->cdc_options().ttl());
-        }
-    }
-
-    // TODO: is pre-image data based on a query enough? We only have actual column data. Do we need
-    // more details like tombstones/ttl? Probably not, but keep in mind.
-    mutation transform(const mutation& m, const cql3::untyped_result_set* rs = nullptr) const {
-        auto& t = m.token();
-        auto&& ep = _ctx._token_metadata.get_endpoint(
-                _ctx._token_metadata.first_token(t));
-        if (!ep) {
-            throw std::runtime_error(format("No owner found for key {}", m.decorated_key()));
-        }
-        auto shard_id = dht::murmur3_partitioner(_ctx._snitch->get_shard_count(*ep), _ctx._snitch->get_ignore_msb_bits(*ep)).shard_of(t);
-        mutation res(_log_schema, stream_id(ep->addr(), shard_id));
-        auto& p = m.partition();
-        if (p.partition_tombstone()) {
-            // Partition deletion
-            auto log_ck = set_pk_columns(m.key(), 0, res);
-            set_operation(log_ck, operation::partition_delete, res);
-        } else if (!p.row_tombstones().empty()) {
-            // range deletion
-            int batch_no = 0;
-            for (auto& rt : p.row_tombstones()) {
-                auto set_bound = [&] (const clustering_key& log_ck, const clustering_key_prefix& ckp) {
-                    auto exploded = ckp.explode(*_schema);
-                    size_t pos = 0;
-                    for (const auto& column : _schema->clustering_key_columns()) {
-                        if (pos >= exploded.size()) {
-                            break;
-                        }
-                        auto cdef = _log_schema->get_column_definition(to_bytes("_" + column.name()));
-                        auto value = atomic_cell::make_live(*column.type,
-                                                            _time.timestamp(),
-                                                            bytes_view(exploded[pos]),
-                                                            _cdc_ttl_opt);
-                        res.set_cell(log_ck, *cdef, std::move(value));
-                        ++pos;
-                    }
-                };
-                {
-                    auto log_ck = set_pk_columns(m.key(), batch_no, res);
-                    set_bound(log_ck, rt.start);
-                    // TODO: separate inclusive/exclusive range
-                    set_operation(log_ck, operation::range_delete_start, res);
-                    ++batch_no;
-                }
-                {
-                    auto log_ck = set_pk_columns(m.key(), batch_no, res);
-                    set_bound(log_ck, rt.end);
-                    // TODO: separate inclusive/exclusive range
-                    set_operation(log_ck, operation::range_delete_end, res);
-                    ++batch_no;
-                }
-            }
-        } else {
-            // should be update or deletion
-            int batch_no = 0;
-            for (const rows_entry& r : p.clustered_rows()) {
-                auto ck_value = r.key().explode(*_schema);
-
-                std::optional<clustering_key> pikey;
-                const cql3::untyped_result_set_row * pirow = nullptr;
-
-                if (rs) {
-                    for (auto& utr : *rs) {
-                        bool match = true;
-                        for (auto& c : _schema->clustering_key_columns()) {
-                            auto rv = utr.get_view(c.name_as_text());
-                            auto cv = r.key().get_component(*_schema, c.component_index());
-                            if (rv != cv) {
-                                match = false;
-                                break;
-                            }
-                        }
-                        if (match) {
-                            pikey = set_pk_columns(m.key(), batch_no, res);
-                            set_operation(*pikey, operation::pre_image, res);
-                            pirow = &utr;
-                            ++batch_no;
-                            break;
-                        }
-                    }
-                }
-
-                auto log_ck = set_pk_columns(m.key(), batch_no, res);
-
-                size_t pos = 0;
-                for (const auto& column : _schema->clustering_key_columns()) {
-                    assert (pos < ck_value.size());
-                    auto cdef = _log_schema->get_column_definition(to_bytes("_" + column.name()));
-                    res.set_cell(log_ck, *cdef, atomic_cell::make_live(*column.type, _time.timestamp(), bytes_view(ck_value[pos]), _cdc_ttl_opt));
-
-                    if (pirow) {
-                        assert(pirow->has(column.name_as_text()));
-                        res.set_cell(*pikey, *cdef, atomic_cell::make_live(*column.type, _time.timestamp(), bytes_view(ck_value[pos]), _cdc_ttl_opt));
-                    }
-
-                    ++pos;
-                }
-
-                std::vector<bytes_opt> values(3);
-
-                auto process_cells = [&](const row& r, column_kind ckind) {
-                    r.for_each_cell([&](column_id id, const atomic_cell_or_collection& cell) {
-                        auto& cdef = _schema->column_at(ckind, id);
-                        auto* dst = _log_schema->get_column_definition(to_bytes("_" + cdef.name()));
-                        // todo: collections.
-                        if (cdef.is_atomic()) {
-                            column_op op;
-
-                            values[1] = values[2] = std::nullopt;
-                            auto view = cell.as_atomic_cell(cdef);
-                            if (view.is_live()) {
-                                op = column_op::set;
-                                values[1] = view.value().linearize();
-                                if (view.is_live_and_has_ttl()) {
-                                    values[2] = long_type->decompose(data_value(view.ttl().count()));
-                                }
-                            } else {
-                                op = column_op::del;
-                            }
-
-                            values[0] = data_type_for<column_op_native_type>()->decompose(data_value(static_cast<column_op_native_type>(op)));
-                            res.set_cell(log_ck, *dst, atomic_cell::make_live(*dst->type, _time.timestamp(), tuple_type_impl::build_value(values), _cdc_ttl_opt));
-
-                            if (pirow && pirow->has(cdef.name_as_text())) {
-                                values[0] = data_type_for<column_op_native_type>()->decompose(data_value(static_cast<column_op_native_type>(column_op::set)));
-                                values[1] = pirow->get_blob(cdef.name_as_text());
-                                values[2] = std::nullopt;
-
-                                assert(std::addressof(res.partition().clustered_row(*_log_schema, *pikey)) != std::addressof(res.partition().clustered_row(*_log_schema, log_ck)));
-                                assert(pikey->explode() != log_ck.explode());
-                                res.set_cell(*pikey, *dst, atomic_cell::make_live(*dst->type, _time.timestamp(), tuple_type_impl::build_value(values), _cdc_ttl_opt));
-                            }
-                        } else {
-                            cdc_log.warn("Non-atomic cell ignored {}.{}:{}", _schema->ks_name(), _schema->cf_name(), cdef.name_as_text());
-                        }
-                    });
-                };
-
-                process_cells(r.row().cells(), column_kind::regular_column);
-                process_cells(p.static_row().get(), column_kind::static_column);
-
-                set_operation(log_ck, operation::update, res);
-                ++batch_no;
-            }
-        }
-
-        return res;
-    }
-
-    static db::timeout_clock::time_point default_timeout() {
-        return db::timeout_clock::now() + 10s;
-    }
-
-    future<lw_shared_ptr<cql3::untyped_result_set>> pre_image_select(
-            service::client_state& client_state,
-            db::consistency_level cl,
-            const mutation& m)
-    {
-        auto& p = m.partition();
-        if (p.partition_tombstone() || !p.row_tombstones().empty() || p.clustered_rows().empty()) {
-            return make_ready_future<lw_shared_ptr<cql3::untyped_result_set>>();
-        }
-
-        dht::partition_range_vector partition_ranges{dht::partition_range(m.decorated_key())};
-
-        auto&& pc = _schema->partition_key_columns();
-        auto&& cc = _schema->clustering_key_columns();
-
-        std::vector<query::clustering_range> bounds;
-        if (cc.empty()) {
-            bounds.push_back(query::clustering_range::make_open_ended_both_sides());
-        } else {
-            for (const rows_entry& r : p.clustered_rows()) {
-                auto& ck = r.key();
-                bounds.push_back(query::clustering_range::make_singular(ck));
-            }
-        }
-
-        std::vector<const column_definition*> columns;
-        columns.reserve(_schema->all_columns().size());
-
-        std::transform(pc.begin(), pc.end(), std::back_inserter(columns), [](auto& c) { return &c; });
-        std::transform(cc.begin(), cc.end(), std::back_inserter(columns), [](auto& c) { return &c; });
-
-        query::column_id_vector static_columns, regular_columns;
-
-        auto sk = column_kind::static_column;
-        auto rk = column_kind::regular_column;
-        // TODO: this assumes all mutations touch the same set of columns. This might not be true, and we may need to do more horrible set operation here.
-        for (auto& [r, cids, kind] : { std::tie(p.static_row().get(), static_columns, sk), std::tie(p.clustered_rows().begin()->row().cells(), regular_columns, rk) }) {
-            r.for_each_cell([&](column_id id, const atomic_cell_or_collection&) {
-                auto& cdef =_schema->column_at(kind, id);
-                cids.emplace_back(id);
-                columns.emplace_back(&cdef);
-            });
-        }
-
-        auto selection = cql3::selection::selection::for_columns(_schema, std::move(columns));
-        auto partition_slice = query::partition_slice(std::move(bounds), std::move(static_columns), std::move(regular_columns), selection->get_query_options());
-        auto command = ::make_lw_shared<query::read_command>(_schema->id(), _schema->version(), partition_slice, query::max_partitions);
-
-        return _ctx._proxy.query(_schema, std::move(command), std::move(partition_ranges), cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit(), client_state)).then(
-                [s = _schema, partition_slice = std::move(partition_slice), selection = std::move(selection)] (service::storage_proxy::coordinator_query_result qr) -> lw_shared_ptr<cql3::untyped_result_set> {
-                    cql3::selection::result_set_builder builder(*selection, gc_clock::now(), cql_serialization_format::latest());
-                    query::result_view::consume(*qr.query_result, partition_slice, cql3::selection::result_set_builder::visitor(builder, *s, *selection));
-                    auto result_set = builder.build();
-                    if (!result_set || result_set->empty()) {
-                        return {};
-                    }
-                    return make_lw_shared<cql3::untyped_result_set>(*result_set);
-        });
-    }
-};
-
-// This class is used to build a mapping from <node ip, shard id> to stream_id
-// It is used as a consumer for rows returned by the query to CDC Description Table
-class streams_builder {
-    const schema& _schema;
-    transformer::streams_type _streams;
-    net::inet_address _node_ip = net::inet_address();
-    unsigned int _shard_id = 0;
-    api::timestamp_type _latest_row_timestamp = api::min_timestamp;
-    utils::UUID _latest_row_stream_id = utils::UUID();
-public:
-    streams_builder(const schema& s) : _schema(s) {}
-
-    void accept_new_partition(const partition_key& key, uint32_t row_count) {
-        auto exploded = key.explode(_schema);
-        _node_ip = value_cast<net::inet_address>(inet_addr_type->deserialize(exploded[0]));
-        _shard_id = static_cast<unsigned int>(value_cast<int>(int32_type->deserialize(exploded[1])));
-        _latest_row_timestamp = api::min_timestamp;
-        _latest_row_stream_id = utils::UUID();
-    }
-
-    void accept_new_partition(uint32_t row_count) {
-        assert(false);
-    }
-
-    void accept_new_row(
-            const clustering_key& key,
-            const query::result_row_view& static_row,
-            const query::result_row_view& row) {
-        auto row_iterator = row.iterator();
-        api::timestamp_type timestamp = value_cast<db_clock::time_point>(
-                timestamp_type->deserialize(key.explode(_schema)[0])).time_since_epoch().count();
-        if (timestamp <= _latest_row_timestamp) {
-            return;
-        }
-        _latest_row_timestamp = timestamp;
-        for (auto&& cdef : _schema.regular_columns()) {
-            if (cdef.name_as_text() != "stream_id") {
-                row_iterator.skip(cdef);
-                continue;
-            }
-            auto val_opt = row_iterator.next_atomic_cell();
-            assert(val_opt);
-            val_opt->value().with_linearized([&] (bytes_view bv) {
-                _latest_row_stream_id = value_cast<utils::UUID>(uuid_type->deserialize(bv));
-            });
-        }
-    }
-
-    void accept_new_row(const query::result_row_view& static_row, const query::result_row_view& row) {
-        assert(false);
-    }
-
-    void accept_partition_end(const query::result_row_view& static_row) {
-        _streams.emplace(std::make_pair(_node_ip, _shard_id), _latest_row_stream_id);
-    }
-
-    transformer::streams_type build() {
-        return std::move(_streams);
-    }
-};
-
-static future<::shared_ptr<transformer::streams_type>> get_streams(
-        db_context ctx,
-        const sstring& ks_name,
-        const sstring& cf_name,
-        lowres_clock::time_point timeout,
-        service::query_state& qs) {
-    auto s =
-        ctx._proxy.get_db().local().find_schema(ks_name, desc_name(cf_name));
-    query::read_command cmd(
-            s->id(),
-            s->version(),
-            partition_slice_builder(*s).with_no_static_columns().build());
-    return ctx._proxy.query(
-            s,
-            make_lw_shared(std::move(cmd)),
-            {dht::partition_range::make_open_ended_both_sides()},
-            db::consistency_level::QUORUM,
-            {timeout, qs.get_permit(), qs.get_client_state()}).then([s = std::move(s)] (auto qr) mutable {
-        return query::result_view::do_with(*qr.query_result,
-                [s = std::move(s)] (query::result_view v) {
-            auto slice = partition_slice_builder(*s)
-                    .with_no_static_columns()
-                    .build();
-            streams_builder builder{ *s };
-            v.consume(slice, builder);
-            return ::make_shared<transformer::streams_type>(builder.build());
-        });
-    });
-}
-
-template <typename Func>
-future<std::vector<mutation>>
-transform_mutations(std::vector<mutation>& muts, decltype(muts.size()) batch_size, Func&& f) {
-    return parallel_for_each(
-            boost::irange(static_cast<decltype(muts.size())>(0), muts.size(), batch_size),
-            std::move(f))
-        .then([&muts] () mutable { return std::move(muts); });
-}
-
-} // namespace cdc
-
-future<std::tuple<std::vector<mutation>, cdc::result_callback>>
-cdc::cdc_service::impl::augment_mutation_call(lowres_clock::time_point timeout, std::vector<mutation>&& mutations) {
-    // we do all this because in the case of batches, we can have mixed schemas.
-    auto e = mutations.end();
-    auto i = std::find_if(mutations.begin(), e, [](const mutation& m) {
-        return m.schema()->cdc_options().enabled();
-    });
-
-    if (i == e) {
-        return make_ready_future<std::tuple<std::vector<mutation>, cdc::result_callback>>(std::make_tuple(std::move(mutations), result_callback{}));
-    }
-
-    mutations.reserve(2 * mutations.size());
-
-    return do_with(std::move(mutations), service::query_state(service::client_state::for_internal_calls(), empty_service_permit()), [this, timeout, i](std::vector<mutation>& mutations, service::query_state& qs) {
-        return transform_mutations(mutations, 1, [this, &mutations, timeout, &qs] (int idx) {
-            auto& m = mutations[idx];
-            auto s = m.schema();
-
-            if (!s->cdc_options().enabled()) {
-                return make_ready_future<>();
-            }
-            // for batches/multiple mutations this is super inefficient. either partition the mutation set by schema
-            // and re-use streams, or probably better: add a cache so this lookup is a noop on second mutation
-            return get_streams(_ctxt, s->ks_name(), s->cf_name(), timeout, qs).then([this, s = std::move(s), &qs, &mutations, idx](::shared_ptr<transformer::streams_type> streams) mutable {
-                auto& m = mutations[idx]; // should not really be needed because of reserve, but let's be conservative
-                transformer trans(_ctxt, s, streams);
-
-                if (!s->cdc_options().preimage()) {
-                    mutations.emplace_back(trans.transform(m));
-                    return make_ready_future<>();
-                }
-
-                // Note: further improvement here would be to coalesce the pre-image selects into one
-                // iff a batch contains several modifications to the same table. Otoh, batch is rare(?)
-                // so this is premature.
-                auto f = trans.pre_image_select(qs.get_client_state(), db::consistency_level::LOCAL_QUORUM, m);
-                return f.then([trans = std::move(trans), &mutations, idx] (lw_shared_ptr<cql3::untyped_result_set> rs) mutable {
-                    mutations.push_back(trans.transform(mutations[idx], rs.get()));
-                });
-            });
-        }).then([](std::vector<mutation> mutations) {
-            return make_ready_future<std::tuple<std::vector<mutation>, cdc::result_callback>>(std::make_tuple(std::move(mutations), result_callback{}));
-        });
-    });
-}
-
-bool cdc::cdc_service::needs_cdc_augmentation(const std::vector<mutation>& mutations) const {
-    return std::any_of(mutations.begin(), mutations.end(), [](const mutation& m) {
-        return m.schema()->cdc_options().enabled();
-    });
-}
-
-future<std::tuple<std::vector<mutation>, cdc::result_callback>>
-cdc::cdc_service::augment_mutation_call(lowres_clock::time_point timeout, std::vector<mutation>&& mutations) {
-    return _impl->augment_mutation_call(timeout, std::move(mutations));
-}
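The removed `streams_builder` keeps, for each `<node ip, shard id>` partition of the description table, the `stream_id` of the row with the newest `created_at` timestamp. A minimal sketch of that selection rule follows, assuming simplified stand-in types: `desc_row`, `streams_map` and `latest_streams` are hypothetical names, and plain strings and integers replace `net::inet_address`, `utils::UUID` and `api::timestamp_type`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one row of the CDC description table.
struct desc_row {
    std::string node_ip;   // partition key, part 1
    unsigned shard_id;     // partition key, part 2
    int64_t created_at;    // clustering key
    std::string stream_id; // regular column
};

using streams_map = std::map<std::pair<std::string, unsigned>, std::string>;

// For every (node_ip, shard_id) keep the stream_id of the row with the
// greatest created_at. A row whose timestamp is not strictly newer than
// the one already seen is skipped, as in accept_new_row().
streams_map latest_streams(const std::vector<desc_row>& rows) {
    std::map<std::pair<std::string, unsigned>, int64_t> latest_ts;
    streams_map result;
    for (const auto& r : rows) {
        auto key = std::make_pair(r.node_ip, r.shard_id);
        auto it = latest_ts.find(key);
        if (it != latest_ts.end() && r.created_at <= it->second) {
            continue; // an equally new or newer row was already seen
        }
        latest_ts[key] = r.created_at;
        result[key] = r.stream_id;
    }
    return result;
}
```

Unlike the real consumer, the sketch does not rely on rows arriving grouped by partition, which is why it tracks the newest timestamp per key in a second map.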
--- a/cdc/cdc.hh
+++ b/cdc/cdc.hh
@@ -1,142 +0,0 @@
-/*
- * Copyright (C) 2019 ScyllaDB
- */
-
-/*
- * This file is part of Scylla.
- *
- * Scylla is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Affero General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * Scylla is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#pragma once
-
-#include <functional>
-#include <optional>
-#include <map>
-#include <string>
-#include <vector>
-
-#include <seastar/core/future.hh>
-#include <seastar/core/lowres_clock.hh>
-#include <seastar/core/shared_ptr.hh>
-#include <seastar/core/sstring.hh>
-
-#include "exceptions/exceptions.hh"
-#include "timestamp.hh"
-#include "cdc_options.hh"
-
-class schema;
-using schema_ptr = seastar::lw_shared_ptr<const schema>;
-
-namespace locator {
-
-class snitch_ptr;
-class token_metadata;
-
-} // namespace locator
-
-namespace service {
-
-class migration_notifier;
-class storage_proxy;
-class query_state;
-
-} // namespace service
-
-namespace dht {
-
-class i_partitioner;
-
-} // namespace dht
-
-class mutation;
-class partition_key;
-
-namespace cdc {
-
-class db_context;
-
-// Callback to be invoked on mutation finish to fix
-// the whole bit about post-image.
-// TODO: decide on what the parameters are to be for this.
-using result_callback = std::function<future<>()>;
-
-/// \brief CDC service, responsible for schema listeners
-///
-/// CDC service will listen for schema changes and iff CDC is enabled/changed
-/// create/modify/delete corresponding log tables etc as part of the schema change. 
-///
-class cdc_service {
-    class impl;
-    std::unique_ptr<impl> _impl;
-public:
-    future<> stop();
-    cdc_service(service::storage_proxy&);
-    cdc_service(db_context);
-    ~cdc_service();
-
-    // If any of the mutations are cdc enabled, optionally selects preimage, and adds the
-    // appropriate augments to set the log entries.
-    // Iff post-image is enabled for any of these, a non-empty callback is also
-    // returned to be invoked post the mutation query.
-    future<std::tuple<std::vector<mutation>, result_callback>> augment_mutation_call(
-        lowres_clock::time_point timeout,
-        std::vector<mutation>&& mutations
-        );
-    bool needs_cdc_augmentation(const std::vector<mutation>&) const;
-};
-
-struct db_context final {
-    service::storage_proxy& _proxy;
-    service::migration_notifier& _migration_notifier;
-    locator::token_metadata& _token_metadata;
-    locator::snitch_ptr& _snitch;
-    dht::i_partitioner& _partitioner;
-
-    class builder final {
-        service::storage_proxy& _proxy;
-        std::optional<std::reference_wrapper<service::migration_notifier>> _migration_notifier;
-        std::optional<std::reference_wrapper<locator::token_metadata>> _token_metadata;
-        std::optional<std::reference_wrapper<locator::snitch_ptr>> _snitch;
-        std::optional<std::reference_wrapper<dht::i_partitioner>> _partitioner;
-    public:
-        builder(service::storage_proxy& proxy);
-
-        builder& with_migration_notifier(service::migration_notifier& migration_notifier);
-        builder& with_token_metadata(locator::token_metadata& token_metadata);
-        builder& with_snitch(locator::snitch_ptr& snitch);
-        builder& with_partitioner(dht::i_partitioner& partitioner);
-
-        db_context build();
-    };
-};
-
-// cdc log table operation
-enum class operation : int8_t {
-    // note: these values will eventually be read by a third party, probably not privy to this
-    // enum decl, so don't change the constant values (or the datatype).
-    pre_image = 0, update = 1, row_delete = 2, range_delete_start = 3, range_delete_end = 4, partition_delete = 5
-};
-
-// cdc log data column operation
-enum class column_op : int8_t {
-    // same as "operation". Do not edit values or type unless you _really_ want to.
-    set = 0, del = 1, add = 2,
-};
-
-seastar::sstring log_name(const seastar::sstring& table_name);
-
-seastar::sstring desc_name(const seastar::sstring& table_name);
-
-} // namespace cdc
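The comments on `operation` ask that neither the constant values nor the underlying type change, because third parties will read them from the log table. One possible way to enforce that at compile time is sketched below. The enum is copied locally so the example is self-contained, and `wire_value` is a hypothetical helper that is not part of the removed header.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Local copy of the enum from cdc.hh, so the sketch compiles on its own.
enum class operation : int8_t {
    pre_image = 0, update = 1, row_delete = 2,
    range_delete_start = 3, range_delete_end = 4, partition_delete = 5
};

// Hypothetical helper: the integer written to the "operation" column.
constexpr int8_t wire_value(operation op) {
    return static_cast<int8_t>(op);
}

// Compilation fails if someone changes the underlying type or renumbers a value.
static_assert(std::is_same<std::underlying_type<operation>::type, int8_t>::value,
              "operation must stay int8_t");
static_assert(wire_value(operation::pre_image) == 0, "pre_image must stay 0");
static_assert(wire_value(operation::update) == 1, "update must stay 1");
static_assert(wire_value(operation::row_delete) == 2, "row_delete must stay 2");
static_assert(wire_value(operation::range_delete_start) == 3, "range_delete_start must stay 3");
static_assert(wire_value(operation::range_delete_end) == 4, "range_delete_end must stay 4");
static_assert(wire_value(operation::partition_delete) == 5, "partition_delete must stay 5");
```

The same pattern would apply to `column_op`.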
--- a/cdc/cdc_options.hh
+++ b/cdc/cdc_options.hh
@@ -1,51 +0,0 @@
-/*
- * Copyright (C) 2019 ScyllaDB
- */
-
-/*
- * This file is part of Scylla.
- *
- * Scylla is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Affero General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * Scylla is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#pragma once
-
-#include <map>
-#include <seastar/core/sstring.hh>
-#include "seastarx.hh"
-
-namespace cdc {
-
-class options final {
-    bool _enabled = false;
-    bool _preimage = false;
-    bool _postimage = false;
-    int _ttl = 86400; // 24h in seconds
-public:
-    options() = default;
-    options(const std::map<sstring, sstring>& map);
-
-    std::map<sstring, sstring> to_map() const;
-    sstring to_sstring() const;
-
-    bool enabled() const { return _enabled; }
-    bool preimage() const { return _preimage; }
-    bool postimage() const { return _postimage; }
-    int ttl() const { return _ttl; }
-
-    bool operator==(const options& o) const;
-    bool operator!=(const options& o) const;
-};
-
-} // namespace cdc
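`cdc::options` is declared with a constructor taking a string map and a `to_map()` inverse, but the definitions are not part of this header. The round trip might look roughly like the sketch below. The key names (`enabled`, `preimage`, `postimage`, `ttl`) and the `"true"`/`"false"` encoding are assumptions, the class name `options_sketch` is hypothetical, and `std::string` replaces `sstring`.

```cpp
#include <cassert>
#include <map>
#include <string>

class options_sketch {
    bool _enabled = false;
    bool _preimage = false;
    bool _postimage = false;
    int _ttl = 86400; // 24h in seconds, the same default as cdc::options
public:
    options_sketch() = default;

    // Assumed key names and value encoding; unknown keys are ignored here.
    explicit options_sketch(const std::map<std::string, std::string>& map) {
        for (const auto& kv : map) {
            if (kv.first == "enabled") {
                _enabled = kv.second == "true";
            } else if (kv.first == "preimage") {
                _preimage = kv.second == "true";
            } else if (kv.first == "postimage") {
                _postimage = kv.second == "true";
            } else if (kv.first == "ttl") {
                _ttl = std::stoi(kv.second);
            }
        }
    }

    // Inverse of the constructor above: emits every field, defaults included.
    std::map<std::string, std::string> to_map() const {
        return {
            {"enabled", _enabled ? "true" : "false"},
            {"preimage", _preimage ? "true" : "false"},
            {"postimage", _postimage ? "true" : "false"},
            {"ttl", std::to_string(_ttl)},
        };
    }

    bool enabled() const { return _enabled; }
    bool preimage() const { return _preimage; }
    bool postimage() const { return _postimage; }
    int ttl() const { return _ttl; }

    bool operator==(const options_sketch& o) const {
        return _enabled == o._enabled && _preimage == o._preimage
            && _postimage == o._postimage && _ttl == o._ttl;
    }
    bool operator!=(const options_sketch& o) const { return !(*this == o); }
};
```

Emitting all four fields from `to_map()` keeps the round trip lossless, so `options_sketch(o.to_map()) == o` holds for any `o`.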
--- a/cell_locking.hh
+++ b/cell_locking.hh
@@ -68,7 +68,7 @@ public:
    public:
        explicit iterator(const mutation_partition& mp)
            : _mp(mp)
-            , _current(position_in_partition_view(position_in_partition_view::static_row_tag_t()), mp.static_row().get())
+            , _current(position_in_partition_view(position_in_partition_view::static_row_tag_t()), mp.static_row())
        { }

        iterator(const mutation_partition& mp, mutation_partition::rows_type::const_iterator it)
--- a/collection_mutation.cc
+++ b/collection_mutation.cc
@@ -1,476 +0,0 @@
-/*
- * Copyright (C) 2019 ScyllaDB
- */
-
-/*
- * This file is part of Scylla.
- *
- * Scylla is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Affero General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * Scylla is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include "types/collection.hh"
-#include "types/user.hh"
-#include "concrete_types.hh"
-#include "atomic_cell_or_collection.hh"
-#include "mutation_partition.hh"
-#include "compaction_garbage_collector.hh"
-#include "combine.hh"
-
-#include "collection_mutation.hh"
-
-collection_mutation::collection_mutation(const abstract_type& type, collection_mutation_view v)
-    : _data(imr_object_type::make(data::cell::make_collection(v.data), &type.imr_state().lsa_migrator())) {}
-
-collection_mutation::collection_mutation(const abstract_type& type, const bytes_ostream& data)
-	: _data(imr_object_type::make(data::cell::make_collection(fragment_range_view(data)), &type.imr_state().lsa_migrator())) {}
-
-static collection_mutation_view get_collection_mutation_view(const uint8_t* ptr)
-{
-    auto f = data::cell::structure::get_member<data::cell::tags::flags>(ptr);
-    auto ti = data::type_info::make_collection();
-    data::cell::context ctx(f, ti);
-    auto view = data::cell::structure::get_member<data::cell::tags::cell>(ptr).as<data::cell::tags::collection>(ctx);
-    auto dv = data::cell::variable_value::make_view(view, f.get<data::cell::tags::external_data>());
-    return collection_mutation_view { dv };
-}
-
-collection_mutation::operator collection_mutation_view() const
-{
-    return get_collection_mutation_view(_data.get());
-}
-
-collection_mutation_view atomic_cell_or_collection::as_collection_mutation() const {
-    return get_collection_mutation_view(_data.get());
-}
-
-bool collection_mutation_view::is_empty() const {
-    auto in = collection_mutation_input_stream(data);
-    auto has_tomb = in.read_trivial<bool>();
-    return !has_tomb && in.read_trivial<uint32_t>() == 0;
-}
-
-template <typename F>
-GCC6_CONCEPT(requires std::is_invocable_r_v<const data::type_info&, F, collection_mutation_input_stream&>)
-static bool is_any_live(const atomic_cell_value_view& data, tombstone tomb, gc_clock::time_point now, F&& read_cell_type_info) {
-    auto in = collection_mutation_input_stream(data);
-    auto has_tomb = in.read_trivial<bool>();
-    if (has_tomb) {
-        auto ts = in.read_trivial<api::timestamp_type>();
-        auto ttl = in.read_trivial<gc_clock::duration::rep>();
-        tomb.apply(tombstone{ts, gc_clock::time_point(gc_clock::duration(ttl))});
-    }
-
-    auto nr = in.read_trivial<uint32_t>();
-    for (uint32_t i = 0; i != nr; ++i) {
-        auto& type_info = read_cell_type_info(in);
-        auto vsize = in.read_trivial<uint32_t>();
-        auto value = atomic_cell_view::from_bytes(type_info, in.read(vsize));
-        if (value.is_live(tomb, now, false)) {
-            return true;
-        }
-    }
-
-    return false;
-}
-
-bool collection_mutation_view::is_any_live(const abstract_type& type, tombstone tomb, gc_clock::time_point now) const {
-    return visit(type, make_visitor(
-    [&] (const collection_type_impl& ctype) {
-        auto& type_info = ctype.value_comparator()->imr_state().type_info();
-        return ::is_any_live(data, tomb, now, [&type_info] (collection_mutation_input_stream& in) -> const data::type_info& {
-            auto key_size = in.read_trivial<uint32_t>();
-            in.skip(key_size);
-            return type_info;
-        });
-    },
-    [&] (const user_type_impl& utype) {
-        return ::is_any_live(data, tomb, now, [&utype] (collection_mutation_input_stream& in) -> const data::type_info& {
-            auto key_size = in.read_trivial<uint32_t>();
-            auto key = in.read(key_size);
-            return utype.type(deserialize_field_index(key))->imr_state().type_info();
-        });
-    },
-    [&] (const abstract_type& o) -> bool {
-        throw std::runtime_error(format("collection_mutation_view::is_any_live: unknown type {}", o.name()));
-    }
-    ));
-}
-
-template <typename F>
-GCC6_CONCEPT(requires std::is_invocable_r_v<const data::type_info&, F, collection_mutation_input_stream&>)
-static api::timestamp_type last_update(const atomic_cell_value_view& data, F&& read_cell_type_info) {
-    auto in = collection_mutation_input_stream(data);
-    api::timestamp_type max = api::missing_timestamp;
-    auto has_tomb = in.read_trivial<bool>();
-    if (has_tomb) {
-        max = std::max(max, in.read_trivial<api::timestamp_type>());
-        (void)in.read_trivial<gc_clock::duration::rep>();
-    }
-
-    auto nr = in.read_trivial<uint32_t>();
-    for (uint32_t i = 0; i != nr; ++i) {
-        auto& type_info = read_cell_type_info(in);
-        auto vsize = in.read_trivial<uint32_t>();
-        auto value = atomic_cell_view::from_bytes(type_info, in.read(vsize));
-        max = std::max(value.timestamp(), max);
-    }
-
-    return max;
-}
-
-
-api::timestamp_type collection_mutation_view::last_update(const abstract_type& type) const {
-    return visit(type, make_visitor(
-    [&] (const collection_type_impl& ctype) {
-        auto& type_info = ctype.value_comparator()->imr_state().type_info();
-        return ::last_update(data, [&type_info] (collection_mutation_input_stream& in) -> const data::type_info& {
-            auto key_size = in.read_trivial<uint32_t>();
-            in.skip(key_size);
-            return type_info;
-        });
-    },
-    [&] (const user_type_impl& utype) {
-        return ::last_update(data, [&utype] (collection_mutation_input_stream& in) -> const data::type_info& {
-            auto key_size = in.read_trivial<uint32_t>();
-            auto key = in.read(key_size);
-            return utype.type(deserialize_field_index(key))->imr_state().type_info();
-        });
-    },
-    [&] (const abstract_type& o) -> api::timestamp_type {
-        throw std::runtime_error(format("collection_mutation_view::last_update: unknown type {}", o.name()));
-    }
-    ));
-}
-
-std::ostream& operator<<(std::ostream& os, const collection_mutation_view::printer& cmvp) {
-    fmt::print(os, "{{collection_mutation_view ");
-    cmvp._cmv.with_deserialized(cmvp._type, [&os, &type = cmvp._type] (const collection_mutation_view_description& cmvd) {
-        bool first = true;
-        fmt::print(os, "tombstone {}", cmvd.tomb);
-        visit(type, make_visitor(
-        [&] (const collection_type_impl& ctype) {
-            auto&& key_type = ctype.name_comparator();
-            auto&& value_type = ctype.value_comparator();
-            for (auto&& [key, value] : cmvd.cells) {
-                if (!first) {
-                    fmt::print(os, ", ");
-                }
-                fmt::print(os, "{}: {}", key_type->to_string(key), atomic_cell_view::printer(*value_type, value));
-                first = false;
-            }
-        },
-        [&] (const user_type_impl& utype) {
-            for (auto&& [raw_idx, value] : cmvd.cells) {
-                if (!first) {
-                    fmt::print(os, ", ");
-                }
-                auto idx = deserialize_field_index(raw_idx);
-                fmt::print(os, "{}: {}", utype.field_name_as_string(idx), atomic_cell_view::printer(*utype.type(idx), value));
-                first = false;
-            }
-        },
-        [&] (const abstract_type& o) {
-            // Not throwing exception in this likely-to-be debug context
-            fmt::print(os, "attempted to pretty-print collection_mutation_view_description with type {}", o.name());
-        }
-        ));
-    });
-    fmt::print(os, "}}");
-    return os;
-}
-
-
-collection_mutation_description
-collection_mutation_view_description::materialize(const abstract_type& type) const {
-    collection_mutation_description m;
-    m.tomb = tomb;
-    m.cells.reserve(cells.size());
-
-    visit(type, make_visitor(
-    [&] (const collection_type_impl& ctype) {
-        auto& value_type = *ctype.value_comparator();
-        for (auto&& e : cells) {
-            m.cells.emplace_back(to_bytes(e.first), atomic_cell(value_type, e.second));
-        }
-    },
-    [&] (const user_type_impl& utype) {
-        for (auto&& e : cells) {
-            m.cells.emplace_back(to_bytes(e.first), atomic_cell(*utype.type(deserialize_field_index(e.first)), e.second));
-        }
-    },
-    [&] (const abstract_type& o) {
-        throw std::runtime_error(format("attempted to materialize collection_mutation_view_description with type {}", o.name()));
-    }
-    ));
-
-    return m;
-}
-
-bool collection_mutation_description::compact_and_expire(column_id id, row_tombstone base_tomb, gc_clock::time_point query_time,
-    can_gc_fn& can_gc, gc_clock::time_point gc_before, compaction_garbage_collector* collector)
-{
-    bool any_live = false;
-    auto t = tomb;
-    tombstone purged_tomb;
-    if (tomb <= base_tomb.regular()) {
-        tomb = tombstone();
-    } else if (tomb.deletion_time < gc_before && can_gc(tomb)) {
-        purged_tomb = tomb;
-        tomb = tombstone();
-    }
-    t.apply(base_tomb.regular());
-    utils::chunked_vector<std::pair<bytes, atomic_cell>> survivors;
-    utils::chunked_vector<std::pair<bytes, atomic_cell>> losers;
-    for (auto&& name_and_cell : cells) {
-        atomic_cell& cell = name_and_cell.second;
-        auto cannot_erase_cell = [&] {
-            return cell.deletion_time() >= gc_before || !can_gc(tombstone(cell.timestamp(), cell.deletion_time()));
-        };
-
-        if (cell.is_covered_by(t, false) || cell.is_covered_by(base_tomb.shadowable().tomb(), false)) {
-            continue;
-        }
-        if (cell.has_expired(query_time)) {
-            if (cannot_erase_cell()) {
-                survivors.emplace_back(std::make_pair(
-                    std::move(name_and_cell.first), atomic_cell::make_dead(cell.timestamp(), cell.deletion_time())));
-            } else if (collector) {
-                losers.emplace_back(std::pair(
-                        std::move(name_and_cell.first), atomic_cell::make_dead(cell.timestamp(), cell.deletion_time())));
-            }
-        } else if (!cell.is_live()) {
-            if (cannot_erase_cell()) {
-                survivors.emplace_back(std::move(name_and_cell));
-            } else if (collector) {
-                losers.emplace_back(std::move(name_and_cell));
-            }
-        } else {
-            any_live |= true;
-            survivors.emplace_back(std::move(name_and_cell));
-        }
-    }
-    if (collector) {
-        collector->collect(id, collection_mutation_description{purged_tomb, std::move(losers)});
-    }
-    cells = std::move(survivors);
-    return any_live;
-}
-
-template <typename Iterator>
-static collection_mutation serialize_collection_mutation(
-        const abstract_type& type,
-        const tombstone& tomb,
-        boost::iterator_range<Iterator> cells) {
-    auto element_size = [] (size_t c, auto&& e) -> size_t {
-        return c + 8 + e.first.size() + e.second.serialize().size();
-    };
-    auto size = accumulate(cells, (size_t)4, element_size);
-    size += 1;
-    if (tomb) {
-        size += sizeof(tomb.timestamp) + sizeof(tomb.deletion_time);
-    }
-    bytes_ostream ret;
-    ret.reserve(size);
-    auto out = ret.write_begin();
-    *out++ = bool(tomb);
-    if (tomb) {
-        write(out, tomb.timestamp);
-        write(out, tomb.deletion_time.time_since_epoch().count());
-    }
-    auto writeb = [&out] (bytes_view v) {
-        serialize_int32(out, v.size());
-        out = std::copy_n(v.begin(), v.size(), out);
-    };
-    // FIXME: overflow?
-    serialize_int32(out, boost::distance(cells));
-    for (auto&& kv : cells) {
-        auto&& k = kv.first;
-        auto&& v = kv.second;
-        writeb(k);
-
-        writeb(v.serialize());
-    }
-    return collection_mutation(type, ret);
-}
-
-collection_mutation collection_mutation_description::serialize(const abstract_type& type) const {
-    return serialize_collection_mutation(type, tomb, boost::make_iterator_range(cells.begin(), cells.end()));
-}
-
-collection_mutation collection_mutation_view_description::serialize(const abstract_type& type) const {
-    return serialize_collection_mutation(type, tomb, boost::make_iterator_range(cells.begin(), cells.end()));
-}
-
-template <typename C>
-GCC6_CONCEPT(requires std::is_base_of_v<abstract_type, std::remove_reference_t<C>>)
-static collection_mutation_view_description
-merge(collection_mutation_view_description a, collection_mutation_view_description b, C&& key_type) {
-    using element_type = std::pair<bytes_view, atomic_cell_view>;
-
-    auto compare = [&] (const element_type& e1, const element_type& e2) {
-        return key_type.less(e1.first, e2.first);
-    };
-
-    auto merge = [] (const element_type& e1, const element_type& e2) {
-        // FIXME: use std::max()?
-        return std::make_pair(e1.first, compare_atomic_cell_for_merge(e1.second, e2.second) > 0 ? e1.second : e2.second);
-    };
-
-    // applied to a tombstone, returns a predicate checking whether a cell is killed by
-    // the tombstone
-    auto cell_killed = [] (const std::optional<tombstone>& t) {
-        return [&t] (const element_type& e) {
-            if (!t) {
-                return false;
-            }
-            // tombstone wins if timestamps equal here, unlike row tombstones
-            if (t->timestamp < e.second.timestamp()) {
-                return false;
-            }
-            return true;
-            // FIXME: should we consider TTLs too?
-        };
-    };
-
-    collection_mutation_view_description merged;
-    merged.cells.reserve(a.cells.size() + b.cells.size());
-
-    combine(a.cells.begin(), std::remove_if(a.cells.begin(), a.cells.end(), cell_killed(b.tomb)),
-            b.cells.begin(), std::remove_if(b.cells.begin(), b.cells.end(), cell_killed(a.tomb)),
-            std::back_inserter(merged.cells),
-            compare,
-            merge);
-    merged.tomb = std::max(a.tomb, b.tomb);
-
-    return merged;
-}
-
-collection_mutation merge(const abstract_type& type, collection_mutation_view a, collection_mutation_view b) {
-    return a.with_deserialized(type, [&] (collection_mutation_view_description a_view) {
-        return b.with_deserialized(type, [&] (collection_mutation_view_description b_view) {
-            return visit(type, make_visitor(
-            [&] (const collection_type_impl& ctype) {
-                return merge(std::move(a_view), std::move(b_view), *ctype.name_comparator());
-            },
-            [&] (const user_type_impl& utype) {
-                return merge(std::move(a_view), std::move(b_view), *short_type);
-            },
-            [] (const abstract_type& o) -> collection_mutation_view_description {
-                throw std::runtime_error(format("collection_mutation merge: unknown type: {}", o.name()));
-            }
-            )).serialize(type);
-        });
-    });
-}
-
-template <typename C>
-GCC6_CONCEPT(requires std::is_base_of_v<abstract_type, std::remove_reference_t<C>>)
-static collection_mutation_view_description
-difference(collection_mutation_view_description a, collection_mutation_view_description b, C&& key_type)
-{
-    collection_mutation_view_description diff;
-    diff.cells.reserve(std::max(a.cells.size(), b.cells.size()));
-
-    auto it = b.cells.begin();
-    for (auto&& c : a.cells) {
-        while (it != b.cells.end() && key_type.less(it->first, c.first)) {
-            ++it;
-        }
-        if (it == b.cells.end() || !key_type.equal(it->first, c.first)
-            || compare_atomic_cell_for_merge(c.second, it->second) > 0) {
-
-            auto cell = std::make_pair(c.first, c.second);
-            diff.cells.emplace_back(std::move(cell));
-        }
-    }
-    if (a.tomb > b.tomb) {
-        diff.tomb = a.tomb;
-    }
-
-    return diff;
-}
-
-collection_mutation difference(const abstract_type& type, collection_mutation_view a, collection_mutation_view b)
-{
-    return a.with_deserialized(type, [&] (collection_mutation_view_description a_view) {
-        return b.with_deserialized(type, [&] (collection_mutation_view_description b_view) {
-            return visit(type, make_visitor(
-            [&] (const collection_type_impl& ctype) {
-                return difference(std::move(a_view), std::move(b_view), *ctype.name_comparator());
-            },
-            [&] (const user_type_impl& utype) {
-                return difference(std::move(a_view), std::move(b_view), *short_type);
-            },
-            [] (const abstract_type& o) -> collection_mutation_view_description {
-                throw std::runtime_error(format("collection_mutation difference: unknown type: {}", o.name()));
-            }
-            )).serialize(type);
-        });
-    });
-}
-
-template <typename F>
-GCC6_CONCEPT(requires std::is_invocable_r_v<std::pair<bytes_view, atomic_cell_view>, F, collection_mutation_input_stream&>)
-static collection_mutation_view_description
-deserialize_collection_mutation(collection_mutation_input_stream& in, F&& read_kv) {
-    collection_mutation_view_description ret;
-
-    auto has_tomb = in.read_trivial<bool>();
-    if (has_tomb) {
-        auto ts = in.read_trivial<api::timestamp_type>();
-        auto ttl = in.read_trivial<gc_clock::duration::rep>();
-        ret.tomb = tombstone{ts, gc_clock::time_point(gc_clock::duration(ttl))};
-    }
-
-    auto nr = in.read_trivial<uint32_t>();
-    ret.cells.reserve(nr);
-    for (uint32_t i = 0; i != nr; ++i) {
-        ret.cells.push_back(read_kv(in));
-    }
-
-    assert(in.empty());
-    return ret;
-}
-
-collection_mutation_view_description
-deserialize_collection_mutation(const abstract_type& type, collection_mutation_input_stream& in) {
-    return visit(type, make_visitor(
-    [&] (const collection_type_impl& ctype) {
-        // value_comparator(), ugh
-        auto& type_info = ctype.value_comparator()->imr_state().type_info();
-        return deserialize_collection_mutation(in, [&type_info] (collection_mutation_input_stream& in) {
-            // FIXME: we could probably avoid the need for size
-            auto ksize = in.read_trivial<uint32_t>();
-            auto key = in.read(ksize);
-            auto vsize = in.read_trivial<uint32_t>();
-            auto value = atomic_cell_view::from_bytes(type_info, in.read(vsize));
-            return std::make_pair(key, value);
-        });
-    },
-    [&] (const user_type_impl& utype) {
-        return deserialize_collection_mutation(in, [&utype] (collection_mutation_input_stream& in) {
-            // FIXME: we could probably avoid the need for size
-            auto ksize = in.read_trivial<uint32_t>();
-            auto key = in.read(ksize);
-            auto vsize = in.read_trivial<uint32_t>();
-            auto value = atomic_cell_view::from_bytes(
-                    utype.type(deserialize_field_index(key))->imr_state().type_info(), in.read(vsize));
-            return std::make_pair(key, value);
-        });
-    },
-    [&] (const abstract_type& o) -> collection_mutation_view_description {
-        throw std::runtime_error(format("deserialize_collection_mutation: unknown type {}", o.name()));
-    }
-    ));
-}
--- a/collection_mutation.hh
+++ b/collection_mutation.hh
@@ -1,139 +0,0 @@
-/*
- * Copyright (C) 2019 ScyllaDB
- */
-
-/*
- * This file is part of Scylla.
- *
- * Scylla is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Affero General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * Scylla is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#pragma once
-
-#include "utils/chunked_vector.hh"
-#include "schema_fwd.hh"
-#include "gc_clock.hh"
-#include "atomic_cell.hh"
-#include "cql_serialization_format.hh"
-#include "marshal_exception.hh"
-#include "utils/linearizing_input_stream.hh"
-#include <iosfwd>
-
-class abstract_type;
-class bytes_ostream;
-class compaction_garbage_collector;
-class row_tombstone;
-
-class collection_mutation;
-
-// An auxiliary struct used to (de)construct collection_mutations.
-// Unlike collection_mutation, which is a serialized blob, this struct allows easy inspection of the logical units
-// of information (tombstone and cells) inside the mutation.
-struct collection_mutation_description {
-    tombstone tomb;
-    // FIXME: use iterators?
-    // we never iterate over `cells` more than once, so there is no need to store them in memory.
-    // In some cases instead of constructing the `cells` vector, it would be more efficient to provide
-    // a one-time-use forward iterator which returns the cells.
-    utils::chunked_vector<std::pair<bytes, atomic_cell>> cells;
-
-    // Expires cells based on query_time. Expires tombstones based on max_purgeable and gc_before.
-    // Removes cells covered by tomb or this->tomb.
-    bool compact_and_expire(column_id id, row_tombstone tomb, gc_clock::time_point query_time,
-        can_gc_fn&, gc_clock::time_point gc_before, compaction_garbage_collector* collector = nullptr);
-
-    // Packs the data to a serialized blob.
-    collection_mutation serialize(const abstract_type&) const;
-};
-
-// Similar to collection_mutation_description, except that it doesn't store the cells' data, only observes it.
-struct collection_mutation_view_description {
-    tombstone tomb;
-    // FIXME: use iterators? See the fixme in collection_mutation_description; the same considerations apply here.
-    utils::chunked_vector<std::pair<bytes_view, atomic_cell_view>> cells;
-
-    // Copies the observed data, storing it in a collection_mutation_description.
-    collection_mutation_description materialize(const abstract_type&) const;
-
-    // Packs the data to a serialized blob.
-    collection_mutation serialize(const abstract_type&) const;
-};
-
-using collection_mutation_input_stream = utils::linearizing_input_stream<atomic_cell_value_view, marshal_exception>;
-
-// Given a linearized collection_mutation_view, returns an auxiliary struct allowing the inspection of each cell.
-// The struct is an observer of the data given by the collection_mutation_view and is only valid while the
-// passed in `collection_mutation_input_stream` is alive.
-// The function needs to be given the type of stored data to reconstruct the structural information.
-collection_mutation_view_description deserialize_collection_mutation(const abstract_type&, collection_mutation_input_stream&);
-
-class collection_mutation_view {
-public:
-    atomic_cell_value_view data;
-
-    // Is this a noop mutation?
-    bool is_empty() const;
-
-    // Is any of the stored cells live (not deleted nor expired) at the time point `tp`,
-    // given the later of the tombstones `t` and the one stored in the mutation (if any)?
-    // Requires a type to reconstruct the structural information.
-    bool is_any_live(const abstract_type&, tombstone t = tombstone(), gc_clock::time_point tp = gc_clock::time_point::min()) const;
-
-    // The maximum of timestamps of the mutation's cells and tombstone.
-    api::timestamp_type last_update(const abstract_type&) const;
-
-    // Given a function that operates on a collection_mutation_view_description,
-    // calls it on the corresponding description of `this`.
-    template <typename F>
-    inline decltype(auto) with_deserialized(const abstract_type& type, F f) const {
-        auto stream = collection_mutation_input_stream(data);
-        return f(deserialize_collection_mutation(type, stream));
-    }
-
-    class printer {
-        const abstract_type& _type;
-        const collection_mutation_view& _cmv;
-    public:
-        printer(const abstract_type& type, const collection_mutation_view& cmv)
-                : _type(type), _cmv(cmv) {}
-        friend std::ostream& operator<<(std::ostream& os, const printer& cmvp);
-    };
-};
-
-// A serialized mutation of a collection of cells.
-// Used to represent mutations of collections (lists, maps, sets) or non-frozen user defined types.
-// It contains a sequence of cells, each representing a mutation of a single entry (element or field) of the collection.
-// Each cell has an associated 'key' (or 'path'). The meaning of each (key, cell) pair is:
-//  for sets: the key is the serialized set element, the cell contains no data (except liveness information),
-//  for maps: the key is the serialized map element's key, the cell contains the serialized map element's value,
-//  for lists: the key is a timeuuid identifying the list entry, the cell contains the serialized value,
-//  for user types: the key is an index identifying the field, the cell contains the value of the field.
-//  The mutation may also contain a collection-wide tombstone.
-class collection_mutation {
-public:
-    using imr_object_type =  imr::utils::object<data::cell::structure>;
-    imr_object_type _data;
-
-    collection_mutation() {}
-    collection_mutation(const abstract_type&, collection_mutation_view);
-    collection_mutation(const abstract_type& type, const bytes_ostream& data);
-    operator collection_mutation_view() const;
-};
-
-collection_mutation merge(const abstract_type&, collection_mutation_view, collection_mutation_view);
-
-collection_mutation difference(const abstract_type&, collection_mutation_view, collection_mutation_view);
-
-// Serializes the given collection of cells to a sequence of bytes ready to be sent over the CQL protocol.
-bytes serialize_for_cql(const abstract_type&, collection_mutation_view, cql_serialization_format);
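The layout written by `serialize_collection_mutation` and read back by `deserialize_collection_mutation` in the removed `collection_mutation.cc` can be sketched in isolation. What follows is a minimal, self-contained illustration and not Scylla code: `toy_tombstone`, `toy_description`, `put_be`, `get_be`, `get_bytes` and `expected_size` are hypothetical names, cells are plain strings rather than `atomic_cell` objects, and the big-endian byte order and 8-byte tombstone fields are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for tombstone and collection_mutation_description.
struct toy_tombstone {
    int64_t timestamp = 0;
    int64_t deletion_time = 0;
};

struct toy_description {
    bool has_tomb = false;
    toy_tombstone tomb;
    std::vector<std::pair<std::string, std::string>> cells; // (key, serialized cell)
};

using buffer = std::vector<uint8_t>;

// Append the low `width` bytes of v, most significant byte first (assumed byte order).
void put_be(buffer& out, uint64_t v, int width) {
    for (int i = width - 1; i >= 0; --i) {
        out.push_back(static_cast<uint8_t>(v >> (8 * i)));
    }
}

// Read `width` bytes as a big-endian integer; at() throws if the input is truncated.
uint64_t get_be(const buffer& in, size_t& pos, int width) {
    uint64_t v = 0;
    for (int i = 0; i < width; ++i) {
        v = (v << 8) | in.at(pos++);
    }
    return v;
}

std::string get_bytes(const buffer& in, size_t& pos, size_t n) {
    if (n > in.size() - pos) {
        throw std::out_of_range("truncated collection mutation");
    }
    std::string s(in.begin() + pos, in.begin() + pos + n);
    pos += n;
    return s;
}

// Mirrors the size estimate in the diff: 1 flag byte, optional tombstone,
// 4 bytes for the cell count, and 8 bytes of length prefixes per cell.
size_t expected_size(const toy_description& d) {
    size_t size = 1 + 4 + (d.has_tomb ? 16 : 0);
    for (const auto& kv : d.cells) {
        size += 8 + kv.first.size() + kv.second.size();
    }
    return size;
}

buffer serialize(const toy_description& d) {
    buffer out;
    out.reserve(expected_size(d));
    out.push_back(d.has_tomb ? 1 : 0);
    if (d.has_tomb) {
        put_be(out, static_cast<uint64_t>(d.tomb.timestamp), 8);
        put_be(out, static_cast<uint64_t>(d.tomb.deletion_time), 8);
    }
    put_be(out, d.cells.size(), 4);
    for (const auto& kv : d.cells) {
        put_be(out, kv.first.size(), 4);
        out.insert(out.end(), kv.first.begin(), kv.first.end());
        put_be(out, kv.second.size(), 4);
        out.insert(out.end(), kv.second.begin(), kv.second.end());
    }
    return out;
}

toy_description deserialize(const buffer& in) {
    toy_description d;
    size_t pos = 0;
    d.has_tomb = get_be(in, pos, 1) != 0;
    if (d.has_tomb) {
        d.tomb.timestamp = static_cast<int64_t>(get_be(in, pos, 8));
        d.tomb.deletion_time = static_cast<int64_t>(get_be(in, pos, 8));
    }
    uint32_t nr = static_cast<uint32_t>(get_be(in, pos, 4));
    for (uint32_t i = 0; i != nr; ++i) {
        std::string key = get_bytes(in, pos, get_be(in, pos, 4));
        std::string value = get_bytes(in, pos, get_be(in, pos, 4));
        d.cells.emplace_back(std::move(key), std::move(value));
    }
    assert(pos == in.size()); // the whole input must be consumed
    return d;
}

// Same check as collection_mutation_view::is_empty(): no tombstone and zero cells.
bool is_empty(const buffer& in) {
    size_t pos = 0;
    bool has_tomb = get_be(in, pos, 1) != 0;
    return !has_tomb && get_be(in, pos, 4) == 0;
}
```

A round trip through `serialize` and `deserialize` should reproduce the tombstone and the cell list. A default-constructed `toy_description` serializes to five bytes (the flag plus a zero count), which is exactly the case `is_empty` detects.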
--- a/compaction_garbage_collector.hh
+++ b/compaction_garbage_collector.hh
@@ -22,7 +22,7 @@
 #pragma once

 #include "schema.hh"
-#include "collection_mutation.hh"
+#include "types/collection.hh"

 class atomic_cell;
 class row_marker;
@@ -31,6 +31,6 @@ class compaction_garbage_collector {
 public:
    virtual ~compaction_garbage_collector() = default;
    virtual void collect(column_id id, atomic_cell) = 0;
-    virtual void collect(column_id id, collection_mutation_description) = 0;
+    virtual void collect(column_id id, collection_type_impl::mutation) = 0;
    virtual void collect(row_marker) = 0;
 };
--- a/compound.hh
+++ b/compound.hh
@@ -74,8 +74,8 @@ private:
     *   <len(value1)><value1><len(value2)><value2>...<len(value_n)><value_n>
     *
     */
-    template<typename RangeOfSerializedComponents, typename CharOutputIterator>
-    static void serialize_value(RangeOfSerializedComponents&& values, CharOutputIterator& out) {
+    template<typename RangeOfSerializedComponents>
+    static void serialize_value(RangeOfSerializedComponents&& values, bytes::iterator& out) {
        for (auto&& val : values) {
            assert(val.size() <= std::numeric_limits<size_type>::max());
            write<size_type>(out, size_type(val.size()));
--- a/compound_compat.hh
+++ b/compound_compat.hh
@@ -248,16 +248,15 @@ private:
    static size_t size(const data_value& val) {
        return val.serialized_size();
    }
-    template<typename Value, typename CharOutputIterator, typename = std::enable_if_t<!std::is_same<data_value, std::decay_t<Value>>::value>>
-    static void write_value(Value&& val, CharOutputIterator& out) {
+    template<typename Value, typename = std::enable_if_t<!std::is_same<data_value, std::decay_t<Value>>::value>>
+    static void write_value(Value&& val, bytes::iterator& out) {
        out = std::copy(val.begin(), val.end(), out);
    }
-    template <typename CharOutputIterator>
-    static void write_value(const data_value& val, CharOutputIterator& out) {
+    static void write_value(const data_value& val, bytes::iterator& out) {
        val.serialize(out);
    }
-    template<typename RangeOfSerializedComponents, typename CharOutputIterator>
-    static void serialize_value(RangeOfSerializedComponents&& values, CharOutputIterator& out, bool is_compound) {
+    template<typename RangeOfSerializedComponents>
+    static void serialize_value(RangeOfSerializedComponents&& values, bytes::iterator& out, bool is_compound) {
        if (!is_compound) {
            auto it = values.begin();
            write_value(std::forward<decltype(*it)>(*it), out);
--- a/concrete_types.hh
+++ b/concrete_types.hh
@@ -92,17 +92,14 @@ struct duration_type_impl final : public concrete_type<cql_duration> {

 struct timestamp_type_impl final : public simple_type_impl<db_clock::time_point> {
    timestamp_type_impl();
-    static db_clock::time_point from_sstring(sstring_view s);
 };

 struct simple_date_type_impl final : public simple_type_impl<uint32_t> {
    simple_date_type_impl();
-    static uint32_t from_sstring(sstring_view s);
 };

 struct time_type_impl final : public simple_type_impl<int64_t> {
    time_type_impl();
-    static int64_t from_sstring(sstring_view s);
 };

 struct string_type_impl : public concrete_type<sstring> {
@@ -128,11 +125,8 @@ struct date_type_impl final : public concrete_type<db_clock::time_point> {
    date_type_impl();
 };

-using timestamp_date_base_class = concrete_type<db_clock::time_point>;
-
 struct timeuuid_type_impl final : public concrete_type<utils::UUID> {
    timeuuid_type_impl();
-    static utils::UUID from_sstring(sstring_view s);
 };

 struct varint_type_impl final : public concrete_type<boost::multiprecision::cpp_int> {
@@ -141,13 +135,10 @@ struct varint_type_impl final : public concrete_type<boost::multiprecision::cpp_

 struct inet_addr_type_impl final : public concrete_type<seastar::net::inet_address> {
    inet_addr_type_impl();
-    static sstring to_sstring(const seastar::net::inet_address& addr);
-    static seastar::net::inet_address from_sstring(sstring_view s);
 };

 struct uuid_type_impl final : public concrete_type<utils::UUID> {
    uuid_type_impl();
-    static utils::UUID from_sstring(sstring_view s);
 };

 template <typename Func> using visit_ret_type = std::invoke_result_t<Func, const ascii_type_impl&>;
@@ -248,28 +239,3 @@ static inline visit_ret_type<Func> visit(const abstract_type& t, Func&& f) {
    }
    __builtin_unreachable();
 }
-
-template <typename Func> struct data_value_visitor {
-    const void* v;
-    Func& f;
-    auto operator()(const empty_type_impl& t) { return f(t, v); }
-    auto operator()(const counter_type_impl& t) { return f(t, v); }
-    auto operator()(const reversed_type_impl& t) { return f(t, v); }
-    template <typename T> auto operator()(const T& t) {
-        return f(t, reinterpret_cast<const typename T::native_type*>(v));
-    }
-};
-
-// Given an abstract_type and a void pointer to an object of that
-// type, call f with the runtime type of t and v cast to the
-// corresponding native type.
-// This takes an abstract_type and a void pointer instead of a
-// data_value to support reversed_type_impl without requiring that
-// each visitor create a new data_value just to recurse.
-template <typename Func> inline auto visit(const abstract_type& t, const void* v, Func&& f) {
-    return ::visit(t, data_value_visitor<Func>{v, f});
-}
-
-template <typename Func> inline auto visit(const data_value& v, Func&& f) {
-    return ::visit(*v.type(), v._value, f);
-}
--- a/conf/scylla.yaml
+++ b/conf/scylla.yaml
@@ -25,19 +25,15 @@
 # multiple tokens per node, see http://cassandra.apache.org/doc/latest/operating
 num_tokens: 256

-# Directory where Scylla should store all its files, which are commitlog,
-# data, hints, view_hints and saved_caches subdirectories. All of these
-# subs can be overriden by the respective options below.
-# If unset, the value defaults to /var/lib/scylla
-# workdir: /var/lib/scylla
-
 # Directory where Scylla should store data on disk.
-# data_file_directories:
-#    - /var/lib/scylla/data
+# If not set, the default directory is /var/lib/scylla/data.
+data_file_directories:
+    - /var/lib/scylla/data

 # commit log.  when running on magnetic HDD, this should be a
 # separate spindle than the data directories.
-# commitlog_directory: /var/lib/scylla/commitlog
+# If not set, the default directory is /var/lib/scylla/commitlog.
+commitlog_directory: /var/lib/scylla/commitlog

 # commitlog_sync may be either "periodic" or "batch."
 #
@@ -116,9 +112,6 @@ read_request_timeout_in_ms: 5000

 # How long the coordinator should wait for writes to complete
 write_request_timeout_in_ms: 2000
-# how long a coordinator should continue to retry a CAS operation
-# that contends with other proposals for the same row
-cas_contention_timeout_in_ms: 1000

 # phi value that must be reached for a host to be marked down.
 # most users should never need to adjust this.
@@ -245,10 +238,7 @@ batch_size_fail_threshold_in_kb: 50
 # broadcast_rpc_address: 1.2.3.4

 # Uncomment to enable experimental features
-# experimental_features:
-#     - cdc
-#     - lwt
-#     - udf
+# experimental: true

 # The directory where hints files are stored if hinted handoff is enabled.
 # hints_directory: /var/lib/scylla/hints
@@ -267,6 +257,24 @@ batch_size_fail_threshold_in_kb: 50
 # created until it has been seen alive and gone down again.
 # max_hint_window_in_ms: 10800000 # 3 hours

+# Maximum throttle in KBs per second, per delivery thread.  This will be
+# reduced proportionally to the number of nodes in the cluster.  (If there
+# are two nodes in the cluster, each delivery thread will use the maximum
+# rate; if there are three, each will throttle to half of the maximum,
+# since we expect two nodes to be delivering hints simultaneously.)
+# hinted_handoff_throttle_in_kb: 1024
+# Number of threads with which to deliver hints;
+# Consider increasing this number when you have multi-dc deployments, since
+# cross-dc handoff tends to be slower
+# max_hints_delivery_threads: 2
+
+###################################################
+## Not currently supported, reserved for future use
+###################################################
+
+# Maximum throttle in KBs per second, total. This will be
+# reduced proportionally to the number of nodes in the cluster.
+# batchlog_replay_throttle_in_kb: 1024

 # Validity period for permissions cache (fetching permissions can be an
 # expensive operation depending on the authorizer, CassandraAuthorizer is
@@ -294,6 +302,120 @@ batch_size_fail_threshold_in_kb: 50
 #
 partitioner: org.apache.cassandra.dht.Murmur3Partitioner

+# Maximum size of the key cache in memory.
+#
+# Each key cache hit saves 1 seek and each row cache hit saves 2 seeks at the
+# minimum, sometimes more. The key cache is fairly tiny for the amount of
+# time it saves, so it's worthwhile to use it at large numbers.
+# The row cache saves even more time, but must contain the entire row,
+# so it is extremely space-intensive. It's best to only use the
+# row cache if you have hot rows or static rows.
+#
+# NOTE: if you reduce the size, you may not get your hottest keys loaded on startup.
+#
+# Default value is empty to make it "auto" (min(5% of Heap (in MB), 100MB)). Set to 0 to disable key cache.
+# key_cache_size_in_mb:
+
+# Duration in seconds after which Scylla should
+# save the key cache. Caches are saved to saved_caches_directory as
+# specified in this configuration file.
+#
+# Saved caches greatly improve cold-start speeds, and is relatively cheap in
+# terms of I/O for the key cache. Row cache saving is much more expensive and
+# has limited use.
+#
+# Default is 14400 or 4 hours.
+# key_cache_save_period: 14400
+
+# Number of keys from the key cache to save
+# Disabled by default, meaning all keys are going to be saved
+# key_cache_keys_to_save: 100
+
+# Maximum size of the row cache in memory.
+# NOTE: if you reduce the size, you may not get your hottest keys loaded on startup.
+#
+# Default value is 0, to disable row caching.
+# row_cache_size_in_mb: 0
+
+# Duration in seconds after which Scylla should
+# save the row cache. Caches are saved to saved_caches_directory as specified
+# in this configuration file.
+#
+# Saved caches greatly improve cold-start speeds, and is relatively cheap in
+# terms of I/O for the key cache. Row cache saving is much more expensive and
+# has limited use.
+#
+# Default is 0 to disable saving the row cache.
+# row_cache_save_period: 0
+
+# Number of keys from the row cache to save
+# Disabled by default, meaning all keys are going to be saved
+# row_cache_keys_to_save: 100
+
+# Maximum size of the counter cache in memory.
+#
+# Counter cache helps to reduce counter locks' contention for hot counter cells.
+# In case of RF = 1 a counter cache hit will cause Scylla to skip the read before
+# write entirely. With RF > 1 a counter cache hit will still help to reduce the duration
+# of the lock hold, helping with hot counter cell updates, but will not allow skipping
+# the read entirely. Only the local (clock, count) tuple of a counter cell is kept
+# in memory, not the whole counter, so it's relatively cheap.
+#
+# NOTE: if you reduce the size, you may not get your hottest keys loaded on startup.
+#
+# Default value is empty to make it "auto" (min(2.5% of Heap (in MB), 50MB)). Set to 0 to disable counter cache.
+# NOTE: if you perform counter deletes and rely on low gcgs, you should disable the counter cache.
+# counter_cache_size_in_mb:
+
+# Duration in seconds after which Scylla should
+# save the counter cache (keys only). Caches are saved to saved_caches_directory as
+# specified in this configuration file.
+#
+# Default is 7200 or 2 hours.
+# counter_cache_save_period: 7200
+
+# Number of keys from the counter cache to save
+# Disabled by default, meaning all keys are going to be saved
+# counter_cache_keys_to_save: 100
+
+# The off-heap memory allocator.  Affects storage engine metadata as
+# well as caches.  Experiments show that JEMalloc saves some memory
+# compared to the native GCC allocator (i.e., JEMalloc is more
+# fragmentation-resistant).
+# 
+# Supported values are: NativeAllocator, JEMallocAllocator
+#
+# If you intend to use JEMallocAllocator you have to install JEMalloc as library and
+# modify cassandra-env.sh as directed in the file.
+#
+# Defaults to NativeAllocator
+# memory_allocator: NativeAllocator
+
+# saved caches
+# If not set, the default directory is /var/lib/scylla/saved_caches.
+# saved_caches_directory: /var/lib/scylla/saved_caches
+
+
+
+# For workloads with more data than can fit in memory, Scylla's
+# bottleneck will be reads that need to fetch data from
+# disk. "concurrent_reads" should be set to (16 * number_of_drives) in
+# order to allow the operations to enqueue low enough in the stack
+# that the OS and drives can reorder them. Same applies to
+# "concurrent_counter_writes", since counter writes read the current
+# values before incrementing and writing them back.
+#
+# On the other hand, since writes are almost never IO bound, the ideal
+# number of "concurrent_writes" is dependent on the number of cores in
+# your system; (8 * number_of_cores) is a good rule of thumb.
+# concurrent_reads: 32
+# concurrent_writes: 32
+# concurrent_counter_writes: 32
+
+# Total memory to use for sstable-reading buffers.  Defaults to
+# the smaller of 1/4 of heap or 512MB.
+# file_cache_size_in_mb: 512
+
 # Total space to use for commitlogs.
 #
 # If space gets above this value (it will round up to the next nearest
@@ -305,6 +427,28 @@ partitioner: org.apache.cassandra.dht.Murmur3Partitioner
 # available for Scylla.
 commitlog_total_space_in_mb: -1

+# A fixed memory pool size in MB for SSTable index summaries. If left
+# empty, this will default to 5% of the heap size. If the memory usage of
+# all index summaries exceeds this limit, SSTables with low read rates will
+# shrink their index summaries in order to meet this limit.  However, this
+# is a best-effort process. In extreme conditions Scylla may need to use
+# more than this amount of memory.
+# index_summary_capacity_in_mb:
+
+# How frequently index summaries should be resampled.  This is done
+# periodically to redistribute memory from the fixed-size pool to sstables
+# proportional to their recent read rates.  Setting to -1 will disable this
+# process, leaving existing index summaries at their current sampling level.
+# index_summary_resize_interval_in_minutes: 60
+
+# Whether to, when doing sequential writing, fsync() at intervals in
+# order to force the operating system to flush the dirty
+# buffers. Enable this to avoid sudden dirty buffer flushing from
+# impacting read latencies. Almost always a good idea on SSDs; not
+# necessarily on platters.
+# trickle_fsync: false
+# trickle_fsync_interval_in_kb: 10240
+
 # TCP port, for commands and data
 # For security reasons, you should not expose this port to the internet.  Firewall it if needed.
 # storage_port: 7000
@@ -317,21 +461,91 @@ commitlog_total_space_in_mb: -1
 # listen_interface: eth0
 # listen_interface_prefer_ipv6: false

+# Internode authentication backend, implementing IInternodeAuthenticator;
+# used to allow/disallow connections from peer nodes.
+# internode_authenticator: org.apache.cassandra.auth.AllowAllInternodeAuthenticator
+
 # Whether to start the native transport server.
 # Please note that the address on which the native transport is bound is the
 # same as the rpc_address. The port however is different and specified below.
 # start_native_transport: true

+# The maximum threads for handling requests when the native transport is used.
+# This is similar to rpc_max_threads though the default differs slightly (and
+# there is no native_transport_min_threads, idle threads will always be stopped
+# after 30 seconds).
+# native_transport_max_threads: 128
+#
 # The maximum size of allowed frame. Frame (requests) larger than this will
 # be rejected as invalid. The default is 256MB.
 # native_transport_max_frame_size_in_mb: 256

+# The maximum number of concurrent client connections.
+# The default is -1, which means unlimited.
+# native_transport_max_concurrent_connections: -1
+
+# The maximum number of concurrent client connections per source ip.
+# The default is -1, which means unlimited.
+# native_transport_max_concurrent_connections_per_ip: -1
+
 # Whether to start the thrift rpc server.
 # start_rpc: true

 # enable or disable keepalive on rpc/native connections
 # rpc_keepalive: true

+# Scylla provides two out-of-the-box options for the RPC Server:
+#
+# sync  -> One thread per thrift connection. For a very large number of clients, memory
+#          will be your limiting factor. On a 64 bit JVM, 180KB is the minimum stack size
+#          per thread, and that will correspond to your use of virtual memory (but physical memory
+#          may be limited depending on use of stack space).
+#
+# hsha  -> Stands for "half synchronous, half asynchronous." All thrift clients are handled
+#          asynchronously using a small number of threads that does not vary with the amount
+#          of thrift clients (and thus scales well to many clients). The rpc requests are still
+#          synchronous (one thread per active request). If hsha is selected then it is essential
+#          that rpc_max_threads is changed from the default value of unlimited.
+#
+# The default is sync because on Windows hsha is about 30% slower.  On Linux,
+# sync/hsha performance is about the same, with hsha of course using less memory.
+#
+# Alternatively, you can provide your own RPC server by providing the fully-qualified class name
+# of an o.a.c.t.TServerFactory that can create an instance of it.
+# rpc_server_type: sync
+
+# Uncomment rpc_min|max_thread to set request pool size limits.
+#
+# Regardless of your choice of RPC server (see above), the number of maximum requests in the
+# RPC thread pool dictates how many concurrent requests are possible (but if you are using the sync
+# RPC server, it also dictates the number of clients that can be connected at all).
+#
+# The default is unlimited and thus provides no protection against clients overwhelming the server. You are
+# encouraged to set a maximum that makes sense for you in production, but do keep in mind that
+# rpc_max_threads represents the maximum number of client requests this server may execute concurrently.
+#
+# rpc_min_threads: 16
+# rpc_max_threads: 2048
+
+# uncomment to set socket buffer sizes on rpc connections
+# rpc_send_buff_size_in_bytes:
+# rpc_recv_buff_size_in_bytes:
+
+# Uncomment to set socket buffer size for internode communication
+# Note that when setting this, the buffer size is limited by net.core.wmem_max
+# and when not setting it it is defined by net.ipv4.tcp_wmem
+# See:
+# /proc/sys/net/core/wmem_max
+# /proc/sys/net/core/rmem_max
+# /proc/sys/net/ipv4/tcp_wmem
+# /proc/sys/net/ipv4/tcp_rmem
+# and: man tcp
+# internode_send_buff_size_in_bytes:
+# internode_recv_buff_size_in_bytes:
+
+# Frame size for thrift (maximum message length).
+# thrift_framed_transport_size_in_mb: 15
+
 # Set to true to have Scylla create a hard link to each sstable
 # flushed or streamed locally in a backups/ subdirectory of the
 # keyspace data.  Removing these links is the operator's
@@ -374,6 +588,30 @@ commitlog_total_space_in_mb: -1
 # column_index_size_in_kb: 64


+# Number of simultaneous compactions to allow, NOT including
+# validation "compactions" for anti-entropy repair.  Simultaneous
+# compactions can help preserve read performance in a mixed read/write
+# workload, by mitigating the tendency of small sstables to accumulate
+# during a single long-running compaction. The default is usually
+# fine and if you experience problems with compaction running too
+# slowly or too fast, you should look at
+# compaction_throughput_mb_per_sec first.
+#
+# concurrent_compactors defaults to the smaller of (number of disks,
+# number of cores), with a minimum of 2 and a maximum of 8.
+# 
+# If your data directories are backed by SSD, you should increase this
+# to the number of cores.
+#concurrent_compactors: 1
+
+# Throttles compaction to the given total throughput across the entire
+# system. The faster you insert data, the faster you need to compact in
+# order to keep the sstable count down, but in general, setting this to
+# 16 to 32 times the rate you are inserting data is more than sufficient.
+# Setting this to 0 disables throttling. Note that this accounts for all types
+# of compaction, including validation compaction.
+# compaction_throughput_mb_per_sec: 16
+
 # Log a warning when writing partitions larger than this value
 # compaction_large_partition_warning_threshold_mb: 1000

@@ -386,6 +624,18 @@ commitlog_total_space_in_mb: -1
 # Log a warning when row number is larger than this value
 # compaction_rows_count_warning_threshold: 100000

+# When compacting, the replacement sstable(s) can be opened before they
+# are completely written, and used in place of the prior sstables for
+# any range that has been written. This helps to smoothly transfer reads 
+# between the sstables, reducing page cache churn and keeping hot rows hot
+# sstable_preemptive_open_interval_in_mb: 50
+
+# Throttles all streaming file transfer between the datacenters,
+# this setting allows users to throttle inter dc stream throughput in addition
+# to throttling all network stream traffic as configured with
+# stream_throughput_outbound_megabits_per_sec
+# inter_dc_stream_throughput_outbound_megabits_per_sec:
+
 # How long the coordinator should wait for seq or index scans to complete
 # range_request_timeout_in_ms: 10000
 # How long the coordinator should wait for writes to complete
@@ -400,23 +650,88 @@ commitlog_total_space_in_mb: -1
 # The default timeout for other, miscellaneous operations
 # request_timeout_in_ms: 10000

+# Enable operation timeout information exchange between nodes to accurately
+# measure request timeouts.  If disabled, replicas will assume that requests
+# were forwarded to them instantly by the coordinator, which means that
+# under overload conditions we will waste that much extra time processing 
+# already-timed-out requests.
+#
+# Warning: before enabling this property make sure ntp is installed
+# and the times are synchronized between the nodes.
+# cross_node_timeout: false
+
+# Enable socket timeout for streaming operation.
+# When a timeout occurs during streaming, streaming is retried from the start
+# of the current file. This _can_ involve re-streaming an important amount of
+# data, so you should avoid setting the value too low.
+# Default value is 0, which means streams never time out.
+# streaming_socket_timeout_in_ms: 0
+
+# controls how often to perform the more expensive part of host score
+# calculation
+# dynamic_snitch_update_interval_in_ms: 100 
+
+# controls how often to reset all host scores, allowing a bad host to
+# possibly recover
+# dynamic_snitch_reset_interval_in_ms: 600000
+
+# if set greater than zero and read_repair_chance is < 1.0, this will allow
+# 'pinning' of replicas to hosts in order to increase cache capacity.
+# The badness threshold will control how much worse the pinned host has to be
+# before the dynamic snitch will prefer other replicas over it.  This is
+# expressed as a double which represents a percentage.  Thus, a value of
+# 0.2 means Scylla would continue to prefer the static snitch values
+# until the pinned host was 20% worse than the fastest.
+# dynamic_snitch_badness_threshold: 0.1
+
+# request_scheduler -- Set this to a class that implements
+# RequestScheduler, which will schedule incoming client requests
+# according to the specific policy. This is useful for multi-tenancy
+# with a single Scylla cluster.
+# NOTE: This is specifically for requests from the client and does
+# not affect inter node communication.
+# org.apache.cassandra.scheduler.NoScheduler - No scheduling takes place
+# org.apache.cassandra.scheduler.RoundRobinScheduler - Round robin of
+# client requests to a node with a separate queue for each
+# request_scheduler_id. The scheduler is further customized by
+# request_scheduler_options as described below.
+# request_scheduler: org.apache.cassandra.scheduler.NoScheduler
+
+# Scheduler Options vary based on the type of scheduler
+# NoScheduler - Has no options
+# RoundRobin
+#  - throttle_limit -- The throttle_limit is the number of in-flight
+#                      requests per client.  Requests beyond 
+#                      that limit are queued up until
+#                      running requests can complete.
+#                      The value of 80 here is twice the number of
+#                      concurrent_reads + concurrent_writes.
+#  - default_weight -- default_weight is optional and allows for
+#                      overriding the default which is 1.
+#  - weights -- Weights are optional and will default to 1 or the
+#               overridden default_weight. The weight translates into how
+#               many requests are handled during each turn of the
+#               RoundRobin, based on the scheduler id.
+#
+# request_scheduler_options:
+#    throttle_limit: 80
+#    default_weight: 5
+#    weights:
+#      Keyspace1: 1
+#      Keyspace2: 5
+
+# request_scheduler_id -- An identifier based on which to perform
+# the request scheduling. Currently the only valid option is keyspace.
+# request_scheduler_id: keyspace
+
 # Enable or disable inter-node encryption. 
 # You must also generate keys and provide the appropriate key and trust store locations and passwords. 
+# No custom encryption options are currently enabled. The available options are:
 #
 # The available internode options are : all, none, dc, rack
 # If set to dc scylla  will encrypt the traffic between the DCs
 # If set to rack scylla  will encrypt the traffic between the racks
 #
-# SSL/TLS algorithm and ciphers used can be controlled by 
-# the priority_string parameter. Info on priority string
-# syntax and values is available at:
-#   https://gnutls.org/manual/html_node/Priority-Strings.html
-#
-# The require_client_auth parameter allows you to 
-# restrict access to service based on certificate 
-# validation. Client must provide a certificate 
-# accepted by the used trust store to connect.
-# 
 # server_encryption_options:
 #    internode_encryption: none
 #    certificate: conf/scylla.crt
--- a/configure.py
+++ b/configure.py
@@ -144,12 +144,8 @@ def flag_supported(flag, compiler):

 def gold_supported(compiler):
    src_main = 'int main(int argc, char **argv) { return 0; }'
-    link_flags = ['-fuse-ld=gold']
-    if try_compile_and_link(source=src_main, flags=link_flags, compiler=compiler):
-        threads_flag = '-Wl,--threads'
-        if try_compile_and_link(source=src_main, flags=link_flags + [threads_flag], compiler=compiler):
-            link_flags.append(threads_flag)
-        return ' '.join(link_flags)
+    if try_compile_and_link(source=src_main, flags=['-fuse-ld=gold'], compiler=compiler):
+        return '-fuse-ld=gold'
    else:
        print('Note: gold not found; using default system linker')
        return ''
@@ -261,142 +257,136 @@ modes = {
 }

 scylla_tests = [
-    'test/boost/UUID_test',
-    'test/boost/aggregate_fcts_test',
-    'test/boost/allocation_strategy_test',
-    'test/boost/anchorless_list_test',
-    'test/boost/auth_passwords_test',
-    'test/boost/auth_resource_test',
-    'test/boost/auth_test',
-    'test/boost/batchlog_manager_test',
-    'test/boost/big_decimal_test',
-    'test/boost/broken_sstable_test',
-    'test/boost/bytes_ostream_test',
-    'test/boost/cache_flat_mutation_reader_test',
-    'test/boost/caching_options_test',
-    'test/boost/canonical_mutation_test',
-    'test/boost/cartesian_product_test',
-    'test/boost/castas_fcts_test',
-    'test/boost/cdc_test',
-    'test/boost/cell_locker_test',
-    'test/boost/checksum_utils_test',
-    'test/boost/chunked_vector_test',
-    'test/boost/clustering_ranges_walker_test',
-    'test/boost/commitlog_test',
-    'test/boost/compound_test',
-    'test/boost/compress_test',
-    'test/boost/config_test',
-    'test/boost/continuous_data_consumer_test',
-    'test/boost/counter_test',
-    'test/boost/cql_auth_query_test',
-    'test/boost/cql_auth_syntax_test',
-    'test/boost/cql_query_test',
-    'test/boost/crc_test',
-    'test/boost/data_listeners_test',
-    'test/boost/database_test',
-    'test/boost/duration_test',
-    'test/boost/dynamic_bitset_test',
-    'test/boost/enum_option_test',
-    'test/boost/enum_set_test',
-    'test/boost/extensions_test',
-    'test/boost/filtering_test',
-    'test/boost/flat_mutation_reader_test',
-    'test/boost/flush_queue_test',
-    'test/boost/fragmented_temporary_buffer_test',
-    'test/boost/frozen_mutation_test',
-    'test/boost/gossip_test',
-    'test/boost/gossiping_property_file_snitch_test',
-    'test/boost/hash_test',
-    'test/boost/idl_test',
-    'test/boost/input_stream_test',
-    'test/boost/json_cql_query_test',
-    'test/boost/keys_test',
-    'test/boost/like_matcher_test',
-    'test/boost/limiting_data_source_test',
-    'test/boost/linearizing_input_stream_test',
-    'test/boost/loading_cache_test',
-    'test/boost/log_heap_test',
-    'test/boost/logalloc_test',
-    'test/boost/managed_vector_test',
-    'test/boost/map_difference_test',
-    'test/boost/memtable_test',
-    'test/boost/meta_test',
-    'test/boost/multishard_mutation_query_test',
-    'test/boost/murmur_hash_test',
-    'test/boost/mutation_fragment_test',
-    'test/boost/mutation_query_test',
-    'test/boost/mutation_reader_test',
-    'test/boost/mutation_test',
-    'test/boost/mutation_writer_test',
-    'test/boost/mvcc_test',
-    'test/boost/network_topology_strategy_test',
-    'test/boost/nonwrapping_range_test',
-    'test/boost/observable_test',
-    'test/boost/partitioner_test',
-    'test/boost/querier_cache_test',
-    'test/boost/query_processor_test',
-    'test/boost/range_test',
-    'test/boost/range_tombstone_list_test',
-    'test/boost/reusable_buffer_test',
-    'test/boost/role_manager_test',
-    'test/boost/row_cache_test',
-    'test/boost/schema_change_test',
-    'test/boost/schema_registry_test',
-    'test/boost/secondary_index_test',
-    'test/boost/serialization_test',
-    'test/boost/serialized_action_test',
-    'test/boost/small_vector_test',
-    'test/boost/snitch_reset_test',
-    'test/boost/sstable_3_x_test',
-    'test/boost/sstable_datafile_test',
-    'test/boost/sstable_mutation_test',
-    'test/boost/sstable_resharding_test',
-    'test/boost/sstable_test',
-    'test/boost/storage_proxy_test',
-    'test/boost/top_k_test',
-    'test/boost/transport_test',
-    'test/boost/truncation_migration_test',
-    'test/boost/types_test',
-    'test/boost/user_function_test',
-    'test/boost/user_types_test',
-    'test/boost/utf8_test',
-    'test/boost/view_build_test',
-    'test/boost/view_complex_test',
-    'test/boost/view_schema_test',
-    'test/boost/vint_serialization_test',
-    'test/boost/virtual_reader_test',
-    'test/manual/ec2_snitch_test',
-    'test/manual/gce_snitch_test',
-    'test/manual/gossip',
-    'test/manual/hint_test',
-    'test/manual/imr_test',
-    'test/manual/json_test',
-    'test/manual/message',
-    'test/manual/partition_data_test',
-    'test/manual/row_locker_test',
-    'test/manual/streaming_histogram_test',
-    'test/perf/perf_cache_eviction',
-    'test/perf/perf_cql_parser',
-    'test/perf/perf_fast_forward',
-    'test/perf/perf_hash',
-    'test/perf/perf_mutation',
-    'test/perf/perf_row_cache_update',
-    'test/perf/perf_simple_query',
-    'test/perf/perf_sstable',
-    'test/tools/cql_repl',
-    'test/unit/lsa_async_eviction_test',
-    'test/unit/lsa_sync_eviction_test',
-    'test/unit/memory_footprint_test',
-    'test/unit/row_cache_alloc_stress_test',
-    'test/unit/row_cache_stress_test',
+    'tests/mutation_test',
+    'tests/mvcc_test',
+    'tests/mutation_fragment_test',
+    'tests/flat_mutation_reader_test',
+    'tests/schema_registry_test',
+    'tests/canonical_mutation_test',
+    'tests/range_test',
+    'tests/types_test',
+    'tests/keys_test',
+    'tests/partitioner_test',
+    'tests/frozen_mutation_test',
+    'tests/serialized_action_test',
+    'tests/hint_test',
+    'tests/clustering_ranges_walker_test',
+    'tests/perf/perf_mutation',
+    'tests/lsa_async_eviction_test',
+    'tests/lsa_sync_eviction_test',
+    'tests/row_cache_alloc_stress',
+    'tests/perf_row_cache_update',
+    'tests/perf/perf_hash',
+    'tests/perf/perf_cql_parser',
+    'tests/perf/perf_simple_query',
+    'tests/perf/perf_fast_forward',
+    'tests/perf/perf_cache_eviction',
+    'tests/cache_flat_mutation_reader_test',
+    'tests/row_cache_stress_test',
+    'tests/memory_footprint',
+    'tests/perf/perf_sstable',
+    'tests/cql_query_test',
+    'tests/secondary_index_test',
+    'tests/json_cql_query_test',
+    'tests/filtering_test',
+    'tests/storage_proxy_test',
+    'tests/schema_change_test',
+    'tests/mutation_reader_test',
+    'tests/mutation_query_test',
+    'tests/row_cache_test',
+    'tests/test-serialization',
+    'tests/broken_sstable_test',
+    'tests/sstable_test',
+    'tests/sstable_datafile_test',
+    'tests/sstable_3_x_test',
+    'tests/sstable_mutation_test',
+    'tests/sstable_resharding_test',
+    'tests/memtable_test',
+    'tests/commitlog_test',
+    'tests/cartesian_product_test',
+    'tests/hash_test',
+    'tests/map_difference_test',
+    'tests/message',
+    'tests/gossip',
+    'tests/gossip_test',
+    'tests/compound_test',
+    'tests/config_test',
+    'tests/gossiping_property_file_snitch_test',
+    'tests/ec2_snitch_test',
+    'tests/gce_snitch_test',
+    'tests/snitch_reset_test',
+    'tests/network_topology_strategy_test',
+    'tests/query_processor_test',
+    'tests/batchlog_manager_test',
+    'tests/bytes_ostream_test',
+    'tests/UUID_test',
+    'tests/murmur_hash_test',
+    'tests/allocation_strategy_test',
+    'tests/logalloc_test',
+    'tests/log_heap_test',
+    'tests/managed_vector_test',
+    'tests/crc_test',
+    'tests/checksum_utils_test',
+    'tests/flush_queue_test',
+    'tests/dynamic_bitset_test',
+    'tests/auth_test',
+    'tests/idl_test',
+    'tests/range_tombstone_list_test',
+    'tests/anchorless_list_test',
+    'tests/database_test',
+    'tests/nonwrapping_range_test',
+    'tests/input_stream_test',
+    'tests/virtual_reader_test',
+    'tests/view_schema_test',
+    'tests/view_build_test',
+    'tests/view_complex_test',
+    'tests/counter_test',
+    'tests/cell_locker_test',
+    'tests/row_locker_test',
+    'tests/streaming_histogram_test',
+    'tests/duration_test',
+    'tests/vint_serialization_test',
+    'tests/continuous_data_consumer_test',
+    'tests/compress_test',
+    'tests/chunked_vector_test',
+    'tests/loading_cache_test',
+    'tests/castas_fcts_test',
+    'tests/big_decimal_test',
+    'tests/aggregate_fcts_test',
+    'tests/role_manager_test',
+    'tests/caching_options_test',
+    'tests/auth_resource_test',
+    'tests/cql_auth_query_test',
+    'tests/enum_set_test',
+    'tests/extensions_test',
+    'tests/cql_auth_syntax_test',
+    'tests/querier_cache',
+    'tests/limiting_data_source_test',
+    'tests/meta_test',
+    'tests/imr_test',
+    'tests/partition_data_test',
+    'tests/reusable_buffer_test',
+    'tests/mutation_writer_test',
+    'tests/observable_test',
+    'tests/transport_test',
+    'tests/fragmented_temporary_buffer_test',
+    'tests/json_test',
+    'tests/auth_passwords_test',
+    'tests/multishard_mutation_query_test',
+    'tests/top_k_test',
+    'tests/utf8_test',
+    'tests/small_vector_test',
+    'tests/data_listeners_test',
+    'tests/truncation_migration_test',
+    'tests/like_matcher_test',
 ]

 perf_tests = [
-    'test/perf/perf_mutation_readers',
-    'test/perf/perf_checksum',
-    'test/perf/perf_mutation_fragment',
-    'test/perf/perf_idl',
-    'test/perf/perf_vint',
+    'tests/perf/perf_mutation_readers',
+    'tests/perf/perf_checksum',
+    'tests/perf/perf_mutation_fragment',
+    'tests/perf/perf_idl',
+    'tests/perf/perf_vint',
 ]

 apps = [
@@ -439,6 +429,8 @@ arg_parser.add_argument('--dpdk-target', action='store', dest='dpdk_target', def
                        help='Path to DPDK SDK target location (e.g. <DPDK SDK dir>/x86_64-native-linuxapp-gcc)')
 arg_parser.add_argument('--debuginfo', action='store', dest='debuginfo', type=int, default=1,
                        help='Enable(1)/disable(0)compiler debug information generation')
+arg_parser.add_argument('--compress-exec-debuginfo', action='store', dest='compress_exec_debuginfo', type=int, default=1,
+                        help='Enable(1)/disable(0) debug information compression in executables')
 arg_parser.add_argument('--static-stdc++', dest='staticcxx', action='store_true',
                        help='Link libgcc and libstdc++ statically')
 arg_parser.add_argument('--static-thrift', dest='staticthrift', action='store_true',
@@ -461,8 +453,6 @@ arg_parser.add_argument('--enable-alloc-failure-injector', dest='alloc_failure_i
                        help='enable allocation failure injection')
 arg_parser.add_argument('--with-antlr3', dest='antlr3_exec', action='store', default=None,
                        help='path to antlr3 executable')
-arg_parser.add_argument('--with-ragel', dest='ragel_exec', action='store', default='ragel',
-        help='path to ragel executable')
 args = arg_parser.parse_args()

 defines = ['XXH_PRIVATE_API',
@@ -476,8 +466,6 @@ cassandra_interface = Thrift(source='interface/cassandra.thrift', service='Cassa
 scylla_core = (['database.cc',
                'table.cc',
                'atomic_cell.cc',
-                'collection_mutation.cc',
-                'connection_notifier.cc',
                'hashers.cc',
                'schema.cc',
                'frozen_schema.cc',
@@ -497,7 +485,6 @@ scylla_core = (['database.cc',
                'utils/buffer_input_stream.cc',
                'utils/limiting_data_source.cc',
                'utils/updateable_value.cc',
-                'utils/directories.cc',
                'mutation_partition.cc',
                'mutation_partition_view.cc',
                'mutation_partition_serializer.cc',
@@ -518,8 +505,6 @@ scylla_core = (['database.cc',
                'sstables/partition.cc',
                'sstables/compaction.cc',
                'sstables/compaction_strategy.cc',
-                'sstables/size_tiered_compaction_strategy.cc',
-                'sstables/leveled_compaction_strategy.cc',
                'sstables/compaction_manager.cc',
                'sstables/integrity_checked_file_impl.cc',
                'sstables/prepended_input_stream.cc',
@@ -528,8 +513,6 @@ scylla_core = (['database.cc',
                'transport/event_notifier.cc',
                'transport/server.cc',
                'transport/messages/result_message.cc',
-                'cdc/cdc.cc',
-                'cql3/type_json.cc',
                'cql3/abstract_marker.cc',
                'cql3/attributes.cc',
                'cql3/cf_name.cc',
@@ -541,9 +524,7 @@ scylla_core = (['database.cc',
                'cql3/sets.cc',
                'cql3/tuples.cc',
                'cql3/maps.cc',
-                'cql3/functions/user_function.cc',
                'cql3/functions/functions.cc',
-                'cql3/functions/aggregate_fcts.cc',
                'cql3/functions/castas_fcts.cc',
                'cql3/statements/cf_prop_defs.cc',
                'cql3/statements/cf_statement.cc',
@@ -552,18 +533,14 @@ scylla_core = (['database.cc',
                'cql3/statements/create_table_statement.cc',
                'cql3/statements/create_view_statement.cc',
                'cql3/statements/create_type_statement.cc',
-                'cql3/statements/create_function_statement.cc',
                'cql3/statements/drop_index_statement.cc',
                'cql3/statements/drop_keyspace_statement.cc',
                'cql3/statements/drop_table_statement.cc',
                'cql3/statements/drop_view_statement.cc',
                'cql3/statements/drop_type_statement.cc',
-                'cql3/statements/drop_function_statement.cc',
                'cql3/statements/schema_altering_statement.cc',
                'cql3/statements/ks_prop_defs.cc',
-                'cql3/statements/function_statement.cc',
                'cql3/statements/modification_statement.cc',
-                'cql3/statements/cas_request.cc',
                'cql3/statements/parsed_statement.cc',
                'cql3/statements/property_definitions.cc',
                'cql3/statements/update_statement.cc',
@@ -601,10 +578,6 @@ scylla_core = (['database.cc',
                'service/priority_manager.cc',
                'service/migration_manager.cc',
                'service/storage_proxy.cc',
-                'service/paxos/proposal.cc',
-                'service/paxos/prepare_response.cc',
-                'service/paxos/paxos_state.cc',
-                'service/paxos/prepare_summary.cc',
                'cql3/operator.cc',
                'cql3/relation.cc',
                'cql3/column_identifier.cc',
@@ -743,7 +716,6 @@ scylla_core = (['database.cc',
                'tracing/trace_keyspace_helper.cc',
                'tracing/trace_state.cc',
                'tracing/tracing_backend_registry.cc',
-                'tracing/traced_file.cc',
                'table_helper.cc',
                'range_tombstone.cc',
                'range_tombstone_list.cc',
@@ -761,7 +733,6 @@ scylla_core = (['database.cc',
                'utils/ascii.cc',
                'utils/like_matcher.cc',
                'mutation_writer/timestamp_based_splitting_writer.cc',
-                'lua.cc',
                ] + [Antlr3Grammar('cql3/Cql.g')] + [Thrift('interface/cassandra.thrift', 'Cassandra')]
               )

@@ -811,24 +782,8 @@ alternator = [
       Antlr3Grammar('alternator/expressions.g'),
       'alternator/conditions.cc',
       'alternator/rjson.cc',
-       'alternator/auth.cc',
 ]

-redis = [
-        'redis/service.cc',
-        'redis/server.cc',
-        'redis/query_processor.cc',
-        'redis/protocol_parser.rl',
-        'redis/keyspace_utils.cc',
-        'redis/options.cc',
-        'redis/stats.cc',
-        'redis/mutation_utils.cc',
-        'redis/query_utils.cc',
-        'redis/abstract_command.cc',
-        'redis/command_factory.cc',
-        'redis/commands.cc',
-        ]
-
 idls = ['idl/gossip_digest.idl.hh',
        'idl/uuid.idl.hh',
        'idl/range.idl.hh',
@@ -853,80 +808,77 @@ idls = ['idl/gossip_digest.idl.hh',
        'idl/consistency_level.idl.hh',
        'idl/cache_temperature.idl.hh',
        'idl/view.idl.hh',
-        'idl/messaging_service.idl.hh',
-        'idl/paxos.idl.hh',
        ]

 headers = find_headers('.', excluded_dirs=['idl', 'build', 'seastar', '.git'])

 scylla_tests_generic_dependencies = [
-    'test/lib/cql_test_env.cc',
-    'test/lib/test_services.cc',
+    'tests/cql_test_env.cc',
+    'tests/test_services.cc',
 ]

 scylla_tests_dependencies = scylla_core + idls + scylla_tests_generic_dependencies + [
-    'test/lib/cql_assertions.cc',
-    'test/lib/result_set_assertions.cc',
-    'test/lib/mutation_source_test.cc',
-    'test/lib/data_model.cc',
-    'test/lib/exception_utils.cc',
-    'test/lib/random_schema.cc',
+    'tests/cql_assertions.cc',
+    'tests/result_set_assertions.cc',
+    'tests/mutation_source_test.cc',
+    'tests/data_model.cc',
+    'tests/exception_utils.cc',
+    'tests/random_schema.cc',
 ]

 deps = {
-    'scylla': idls + ['main.cc', 'release.cc', 'build_id.cc'] + scylla_core + api + alternator + redis,
+    'scylla': idls + ['main.cc', 'release.cc'] + scylla_core + api + alternator,
 }

 pure_boost_tests = set([
-    'test/boost/anchorless_list_test',
-    'test/boost/auth_passwords_test',
-    'test/boost/auth_resource_test',
-    'test/boost/big_decimal_test',
-    'test/boost/caching_options_test',
-    'test/boost/cartesian_product_test',
-    'test/boost/checksum_utils_test',
-    'test/boost/chunked_vector_test',
-    'test/boost/compound_test',
-    'test/boost/compress_test',
-    'test/boost/cql_auth_syntax_test',
-    'test/boost/crc_test',
-    'test/boost/duration_test',
-    'test/boost/dynamic_bitset_test',
-    'test/boost/enum_option_test',
-    'test/boost/enum_set_test',
-    'test/boost/idl_test',
-    'test/boost/keys_test',
-    'test/boost/like_matcher_test',
-    'test/boost/linearizing_input_stream_test',
-    'test/boost/map_difference_test',
-    'test/boost/meta_test',
-    'test/boost/nonwrapping_range_test',
-    'test/boost/observable_test',
-    'test/boost/range_test',
-    'test/boost/range_tombstone_list_test',
-    'test/boost/serialization_test',
-    'test/boost/small_vector_test',
-    'test/boost/top_k_test',
-    'test/boost/vint_serialization_test',
-    'test/manual/json_test',
-    'test/manual/streaming_histogram_test',
+    'tests/map_difference_test',
+    'tests/keys_test',
+    'tests/compound_test',
+    'tests/range_tombstone_list_test',
+    'tests/anchorless_list_test',
+    'tests/nonwrapping_range_test',
+    'tests/test-serialization',
+    'tests/range_test',
+    'tests/crc_test',
+    'tests/checksum_utils_test',
+    'tests/managed_vector_test',
+    'tests/dynamic_bitset_test',
+    'tests/idl_test',
+    'tests/cartesian_product_test',
+    'tests/streaming_histogram_test',
+    'tests/duration_test',
+    'tests/vint_serialization_test',
+    'tests/compress_test',
+    'tests/chunked_vector_test',
+    'tests/big_decimal_test',
+    'tests/caching_options_test',
+    'tests/auth_resource_test',
+    'tests/enum_set_test',
+    'tests/cql_auth_syntax_test',
+    'tests/meta_test',
+    'tests/observable_test',
+    'tests/json_test',
+    'tests/auth_passwords_test',
+    'tests/top_k_test',
+    'tests/small_vector_test',
+    'tests/like_matcher_test',
 ])

 tests_not_using_seastar_test_framework = set([
-    'test/boost/small_vector_test',
-    'test/manual/gossip',
-    'test/manual/message',
-    'test/perf/perf_cache_eviction',
-    'test/perf/perf_cql_parser',
-    'test/perf/perf_hash',
-    'test/perf/perf_mutation',
-    'test/perf/perf_row_cache_update',
-    'test/perf/perf_sstable',
-    'test/unit/lsa_async_eviction_test',
-    'test/unit/lsa_sync_eviction_test',
-    'test/unit/memory_footprint_test',
-    'test/unit/row_cache_alloc_stress_test',
-    'test/unit/row_cache_stress_test',
+    'tests/perf/perf_mutation',
+    'tests/lsa_async_eviction_test',
+    'tests/lsa_sync_eviction_test',
+    'tests/row_cache_alloc_stress',
+    'tests/perf_row_cache_update',
+    'tests/perf/perf_hash',
+    'tests/perf/perf_cql_parser',
+    'tests/message',
+    'tests/perf/perf_cache_eviction',
+    'tests/row_cache_stress_test',
+    'tests/memory_footprint',
+    'tests/gossip',
+    'tests/perf/perf_sstable',
+    'tests/small_vector_test',
 ]) | pure_boost_tests

 for t in tests_not_using_seastar_test_framework:
@@ -947,29 +899,28 @@ perf_tests_seastar_deps = [
 for t in perf_tests:
    deps[t] = [t + '.cc'] + scylla_tests_dependencies + perf_tests_seastar_deps

-deps['test/boost/sstable_test'] += ['test/lib/sstable_utils.cc', 'test/lib/normalizing_reader.cc']
-deps['test/boost/sstable_datafile_test'] += ['test/lib/sstable_utils.cc', 'test/lib/normalizing_reader.cc']
-deps['test/boost/mutation_reader_test'] += ['test/lib/sstable_utils.cc']
+deps['tests/sstable_test'] += ['tests/sstable_utils.cc', 'tests/normalizing_reader.cc']
+deps['tests/sstable_datafile_test'] += ['tests/sstable_utils.cc', 'tests/normalizing_reader.cc']
+deps['tests/mutation_reader_test'] += ['tests/sstable_utils.cc']

-deps['test/boost/bytes_ostream_test'] = ['test/boost/bytes_ostream_test.cc', 'utils/managed_bytes.cc', 'utils/logalloc.cc', 'utils/dynamic_bitset.cc']
-deps['test/boost/input_stream_test'] = ['test/boost/input_stream_test.cc']
-deps['test/boost/UUID_test'] = ['utils/UUID_gen.cc', 'test/boost/UUID_test.cc', 'utils/uuid.cc', 'utils/managed_bytes.cc', 'utils/logalloc.cc', 'utils/dynamic_bitset.cc', 'hashers.cc']
-deps['test/boost/murmur_hash_test'] = ['bytes.cc', 'utils/murmur_hash.cc', 'test/boost/murmur_hash_test.cc']
-deps['test/boost/allocation_strategy_test'] = ['test/boost/allocation_strategy_test.cc', 'utils/logalloc.cc', 'utils/dynamic_bitset.cc']
-deps['test/boost/log_heap_test'] = ['test/boost/log_heap_test.cc']
-deps['test/boost/anchorless_list_test'] = ['test/boost/anchorless_list_test.cc']
-deps['test/perf/perf_fast_forward'] += ['release.cc']
-deps['test/perf/perf_simple_query'] += ['release.cc']
-deps['test/boost/meta_test'] = ['test/boost/meta_test.cc']
-deps['test/manual/imr_test'] = ['test/manual/imr_test.cc', 'utils/logalloc.cc', 'utils/dynamic_bitset.cc']
-deps['test/boost/reusable_buffer_test'] = ['test/boost/reusable_buffer_test.cc']
-deps['test/boost/utf8_test'] = ['utils/utf8.cc', 'test/boost/utf8_test.cc']
-deps['test/boost/small_vector_test'] = ['test/boost/small_vector_test.cc']
-deps['test/boost/multishard_mutation_query_test'] += ['test/boost/test_table.cc']
-deps['test/boost/vint_serialization_test'] = ['test/boost/vint_serialization_test.cc', 'vint-serialization.cc', 'bytes.cc']
-deps['test/boost/linearizing_input_stream_test'] = ['test/boost/linearizing_input_stream_test.cc']
+deps['tests/bytes_ostream_test'] = ['tests/bytes_ostream_test.cc', 'utils/managed_bytes.cc', 'utils/logalloc.cc', 'utils/dynamic_bitset.cc']
+deps['tests/input_stream_test'] = ['tests/input_stream_test.cc']
+deps['tests/UUID_test'] = ['utils/UUID_gen.cc', 'tests/UUID_test.cc', 'utils/uuid.cc', 'utils/managed_bytes.cc', 'utils/logalloc.cc', 'utils/dynamic_bitset.cc', 'hashers.cc']
+deps['tests/murmur_hash_test'] = ['bytes.cc', 'utils/murmur_hash.cc', 'tests/murmur_hash_test.cc']
+deps['tests/allocation_strategy_test'] = ['tests/allocation_strategy_test.cc', 'utils/logalloc.cc', 'utils/dynamic_bitset.cc']
+deps['tests/log_heap_test'] = ['tests/log_heap_test.cc']
+deps['tests/anchorless_list_test'] = ['tests/anchorless_list_test.cc']
+deps['tests/perf/perf_fast_forward'] += ['release.cc']
+deps['tests/perf/perf_simple_query'] += ['release.cc']
+deps['tests/meta_test'] = ['tests/meta_test.cc']
+deps['tests/imr_test'] = ['tests/imr_test.cc', 'utils/logalloc.cc', 'utils/dynamic_bitset.cc']
+deps['tests/reusable_buffer_test'] = ['tests/reusable_buffer_test.cc']
+deps['tests/utf8_test'] = ['utils/utf8.cc', 'tests/utf8_test.cc']
+deps['tests/small_vector_test'] = ['tests/small_vector_test.cc']
+deps['tests/multishard_mutation_query_test'] += ['tests/test_table.cc']
+deps['tests/vint_serialization_test'] = ['tests/vint_serialization_test.cc', 'vint-serialization.cc', 'bytes.cc']

-deps['test/boost/duration_test'] += ['test/lib/exception_utils.cc']
+deps['tests/duration_test'] += ['tests/exception_utils.cc']

 deps['utils/gz/gen_crc_combine_table'] = ['utils/gz/gen_crc_combine_table.cc']

@@ -1012,13 +963,9 @@ modes['release']['cxx_ld_flags'] += ' ' + ' '.join(optimization_flags)

 gold_linker_flag = gold_supported(compiler=args.cxx)

-dbgflag = '-g -gz' if args.debuginfo else ''
+dbgflag = '-g' if args.debuginfo else ''
 tests_link_rule = 'link' if args.tests_debuginfo else 'link_stripped'

-# Strip if debuginfo is disabled, otherwise we end up with partial
-# debug info from the libraries we static link with
-regular_link_rule = 'link' if args.debuginfo else 'link_stripped'
-
 if args.so:
    args.pie = '-shared'
    args.fpie = '-fpic'
@@ -1035,10 +982,6 @@ else:
 optional_packages = [['libsystemd', 'libsystemd-daemon']]
 pkgs = []

-# Lua can be provided by lua53 package on Debian-like
-# systems and by Lua on others.
-pkgs.append('lua53' if have_pkg('lua53') else 'lua')
-

 def setup_first_pkg_of_list(pkglist):
    # The HAVE_pkg symbol is taken from the first alternative
@@ -1129,6 +1072,23 @@ scylla_release = file.read().strip()

 extra_cxxflags["release.cc"] = "-DSCYLLA_VERSION=\"\\\"" + scylla_version + "\\\"\" -DSCYLLA_RELEASE=\"\\\"" + scylla_release + "\\\"\""

+seastar_flags = []
+if args.dpdk:
+    # fake dependencies on dpdk, so that it is built before anything else
+    seastar_flags += ['--enable-dpdk']
+if args.gcc6_concepts:
+    seastar_flags += ['--enable-gcc6-concepts']
+if args.alloc_failure_injector:
+    seastar_flags += ['--enable-alloc-failure-injector']
+if args.split_dwarf:
+    seastar_flags += ['--split-dwarf']
+
+# We never compress debug info in debug mode
+modes['debug']['cxxflags'] += ' -gz'
+# We compress it by default in release mode
+flag_dest = 'cxx_ld_flags' if args.compress_exec_debuginfo else 'cxxflags'
+modes['release'][flag_dest] += ' -gz'
+
 for m in ['debug', 'release', 'sanitize']:
    modes[m]['cxxflags'] += ' ' + dbgflag

@@ -1137,56 +1097,27 @@ seastar_cflags += ' -Wno-error'
 if args.target != '':
    seastar_cflags += ' -march=' + args.target
 seastar_ldflags = args.user_ldflags
+seastar_flags += ['--compiler', args.cxx, '--c-compiler', args.cc, '--cflags=%s' % (seastar_cflags), '--ldflags=%s' % (seastar_ldflags),
+                  '--c++-dialect=gnu++17', '--use-std-optional-variant-stringview=1', '--optflags=%s' % (modes['release']['cxx_ld_flags']), ]

 libdeflate_cflags = seastar_cflags
 zstd_cflags = seastar_cflags + ' -Wno-implicit-fallthrough'

-MODE_TO_CMAKE_BUILD_TYPE = {'release' : 'RelWithDebInfo', 'debug' : 'Debug', 'dev' : 'Dev', 'sanitize' : 'Sanitize' }
+status = subprocess.call([args.python, './configure.py'] + seastar_flags, cwd='seastar')

-def configure_seastar(build_dir, mode):
-    seastar_build_dir = os.path.join(build_dir, mode, 'seastar')
+if status != 0:
+    print('Seastar configuration failed')
+    sys.exit(1)

-    seastar_cmake_args = [
-        '-DCMAKE_BUILD_TYPE={}'.format(MODE_TO_CMAKE_BUILD_TYPE[mode]),
-        '-DCMAKE_C_COMPILER={}'.format(args.cc),
-        '-DCMAKE_CXX_COMPILER={}'.format(args.cxx),
-        '-DSeastar_CXX_FLAGS={}'.format((seastar_cflags + ' ' + modes[mode]['cxx_ld_flags']).replace(' ', ';')),
-        '-DSeastar_LD_FLAGS={}'.format(seastar_ldflags),
-        '-DSeastar_CXX_DIALECT=gnu++17',
-        '-DSeastar_STD_OPTIONAL_VARIANT_STRINGVIEW=ON',
-        '-DSeastar_UNUSED_RESULT_ERROR=ON',
-    ]
-    if args.dpdk:
-        seastar_cmake_args += ['-DSeastar_DPDK=ON', '-DSeastar_DPDK_MACHINE=wsm']
-    if args.gcc6_concepts:
-        seastar_cmake_args += ['-DSeastar_GCC6_CONCEPTS=ON']
-    if args.split_dwarf:
-        seastar_cmake_args += ['-DSeastar_SPLIT_DWARF=ON']
-    if args.alloc_failure_injector:
-        seastar_cmake_args += ['-DSeastar_ALLOC_FAILURE_INJECTION=ON']

-    seastar_cmd = ['cmake', '-G', 'Ninja', os.path.relpath('seastar', seastar_build_dir)] + seastar_cmake_args
-    cmake_dir = seastar_build_dir
-    if args.dpdk:
-        # need to cook first
-        cmake_dir = 'seastar' # required by cooking.sh
-        relative_seastar_build_dir = os.path.join('..', seastar_build_dir)  # relative to seastar/
-        seastar_cmd = ['./cooking.sh', '-i', 'dpdk', '-d', relative_seastar_build_dir, '--'] + seastar_cmd[4:]
-
-    print(seastar_cmd)
-    os.makedirs(seastar_build_dir, exist_ok=True)
-    subprocess.check_call(seastar_cmd, shell=False, cwd=cmake_dir)
-
-for mode in build_modes:
-    configure_seastar('build', mode)
-
-pc = {mode: 'build/{}/seastar/seastar.pc'.format(mode) for mode in build_modes}
+pc = {mode: 'build/{}/seastar.pc'.format(mode) for mode in build_modes}
 ninja = find_executable('ninja') or find_executable('ninja-build')
 if not ninja:
    print('Ninja executable (ninja or ninja-build) not found on PATH\n')
    sys.exit(1)

-def query_seastar_flags(pc_file, link_static_cxx=False):
+def query_seastar_flags(seastar_pc_file, link_static_cxx=False):
+    pc_file = os.path.join('seastar', seastar_pc_file)
    cflags = pkg_config(pc_file, '--cflags', '--static')
    libs = pkg_config(pc_file, '--libs', '--static')

@@ -1200,6 +1131,8 @@ for mode in build_modes:
    modes[mode]['seastar_cflags'] = seastar_cflags
    modes[mode]['seastar_libs'] = seastar_libs

+MODE_TO_CMAKE_BUILD_TYPE = {'release' : 'RelWithDebInfo', 'debug' : 'Debug', 'dev' : 'Dev', 'sanitize' : 'Sanitize' }
+
 # We need to use experimental features of the zstd library (to use our own allocators for the (de)compression context),
 # which are available only when the library is linked statically.
 def configure_zstd(build_dir, mode):
@@ -1269,11 +1202,6 @@ if args.antlr3_exec:
 else:
    antlr3_exec = "antlr3"

-if args.ragel_exec:
-    ragel_exec = args.ragel_exec
-else:
-    ragel_exec = "ragel"
-
 for mode in build_modes:
    configure_zstd(outdir, mode)

@@ -1290,7 +1218,6 @@ with open(buildfile_tmp, 'w') as f:
        cxx = {cxx}
        cxxflags = {user_cflags} {warnings} {defines}
        ldflags = {gold_linker_flag} {user_ldflags}
-        ldflags_build = {gold_linker_flag}
        libs = {libs}
        pool link_pool
            depth = {link_pool_depth}
@@ -1309,11 +1236,6 @@ with open(buildfile_tmp, 'w') as f:
            command = {ninja} -C $subdir $target
            restat = 1
            description = NINJA $out
-        rule ragel
-            # sed away a bug in ragel 7 that emits some extraneous _nfa* variables
-            # (the $$ is collapsed to a single one by ninja)
-            command = {ragel_exec} -G2 -o $out $in && sed -i -e '1h;2,$$H;$$!d;g' -re 's/static const char _nfa[^;]*;//g' $out
-            description = RAGEL $out
        rule run
            command = $in > $out
            description = GEN $out
@@ -1333,7 +1255,7 @@ with open(buildfile_tmp, 'w') as f:
            libs_{mode} = -l{fmt_lib}
            seastar_libs_{mode} = {seastar_libs}
            rule cxx.{mode}
-              command = $cxx -MD -MT $out -MF $out.d {seastar_cflags} $cxxflags_{mode} $cxxflags $obj_cxxflags -c -o $out $in
+              command = $cxx -MD -MT $out -MF $out.d {seastar_cflags} $cxxflags $cxxflags_{mode} $obj_cxxflags -c -o $out $in
              description = CXX $out
              depfile = $out.d
            rule link.{mode}
@@ -1344,10 +1266,6 @@ with open(buildfile_tmp, 'w') as f:
              command = $cxx  $ld_flags_{mode} -s $ldflags -o $out $in $libs $libs_{mode}
              description = LINK (stripped) $out
              pool = link_pool
-            rule link_build.{mode}
-              command = $cxx  $ld_flags_{mode} $ldflags_build -o $out $in $libs $libs_{mode}
-              description = LINK (build) $out
-              pool = link_pool
            rule ar.{mode}
              command = rm -f $out; ar cr $out $in; ranlib $out
              description = AR $out
@@ -1382,10 +1300,8 @@ with open(buildfile_tmp, 'w') as f:
        swaggers = {}
        serializers = {}
        thrifts = set()
-        ragels = {}
        antlr3_grammars = set()
-        seastar_dep = 'build/{}/seastar/libseastar.a'.format(mode)
-        seastar_testing_dep = 'build/{}/seastar/libseastar_testing.a'.format(mode)
+        seastar_dep = 'seastar/build/{}/libseastar.a'.format(mode)
        for binary in build_artifacts:
            if binary in other:
                continue
@@ -1409,12 +1325,12 @@ with open(buildfile_tmp, 'w') as f:
                    'zstd/lib/libzstd.a',
                ]])
                objs.append('$builddir/' + mode + '/gen/utils/gz/crc_combine_table.o')
-                if binary.startswith('test/'):
+                if binary.startswith('tests/'):
                    local_libs = '$seastar_libs_{} $libs'.format(mode)
                    if binary in pure_boost_tests:
                        local_libs += ' ' + maybe_static(args.staticboost, '-lboost_unit_test_framework')
                    if binary not in tests_not_using_seastar_test_framework:
-                        pc_path = pc[mode].replace('seastar.pc', 'seastar-testing.pc')
+                        pc_path = os.path.join('seastar', pc[mode].replace('seastar.pc', 'seastar-testing.pc'))
                        local_libs += ' ' + pkg_config(pc_path, '--libs', '--static')
                    if has_thrift:
                        local_libs += ' ' + thrift_libs + ' ' + maybe_static(args.staticboost, '-lboost_system')
@@ -1423,12 +1339,12 @@ with open(buildfile_tmp, 'w') as f:
                    # So we strip the tests by default; The user can very
                    # quickly re-link the test unstripped by adding a "_g"
                    # to the test name, e.g., "ninja build/release/testname_g"
-                    f.write('build $builddir/{}/{}: {}.{} {} | {} {}\n'.format(mode, binary, tests_link_rule, mode, str.join(' ', objs), seastar_dep, seastar_testing_dep))
+                    f.write('build $builddir/{}/{}: {}.{} {} | {}\n'.format(mode, binary, tests_link_rule, mode, str.join(' ', objs), seastar_dep))
                    f.write('   libs = {}\n'.format(local_libs))
-                    f.write('build $builddir/{}/{}_g: {}.{} {} | {} {}\n'.format(mode, binary, regular_link_rule, mode, str.join(' ', objs), seastar_dep, seastar_testing_dep))
+                    f.write('build $builddir/{}/{}_g: link.{} {} | {}\n'.format(mode, binary, mode, str.join(' ', objs), seastar_dep))
                    f.write('   libs = {}\n'.format(local_libs))
                else:
-                    f.write('build $builddir/{}/{}: {}.{} {} | {}\n'.format(mode, binary, regular_link_rule, mode, str.join(' ', objs), seastar_dep))
+                    f.write('build $builddir/{}/{}: link.{} {} | {}\n'.format(mode, binary, mode, str.join(' ', objs), seastar_dep))
                    if has_thrift:
                        f.write('   libs =  {} {} $seastar_libs_{} $libs\n'.format(thrift_libs, maybe_static(args.staticboost, '-lboost_system'), mode))
            for src in srcs:
@@ -1441,9 +1357,6 @@ with open(buildfile_tmp, 'w') as f:
                elif src.endswith('.json'):
                    hh = '$builddir/' + mode + '/gen/' + src + '.hh'
                    swaggers[hh] = src
-                elif src.endswith('.rl'):
-                    hh = '$builddir/' + mode + '/gen/' + src.replace('.rl', '.hh')
-                    ragels[hh] = src
                elif src.endswith('.thrift'):
                    thrifts.add(src)
                elif src.endswith('.g'):
@@ -1454,7 +1367,7 @@ with open(buildfile_tmp, 'w') as f:
        compiles['$builddir/' + mode + '/utils/gz/gen_crc_combine_table.o'] = 'utils/gz/gen_crc_combine_table.cc'
        f.write('build {}: run {}\n'.format('$builddir/' + mode + '/gen/utils/gz/crc_combine_table.cc',
                                            '$builddir/' + mode + '/utils/gz/gen_crc_combine_table'))
-        f.write('build {}: link_build.{} {}\n'.format('$builddir/' + mode + '/utils/gz/gen_crc_combine_table', mode,
+        f.write('build {}: link.{} {}\n'.format('$builddir/' + mode + '/utils/gz/gen_crc_combine_table', mode,
                                                '$builddir/' + mode + '/utils/gz/gen_crc_combine_table.o'))
        f.write('   libs = $seastar_libs_{}\n'.format(mode))
        f.write(
@@ -1472,7 +1385,6 @@ with open(buildfile_tmp, 'w') as f:
            gen_headers += g.headers('$builddir/{}/gen'.format(mode))
        gen_headers += list(swaggers.keys())
        gen_headers += list(serializers.keys())
-        gen_headers += list(ragels.keys())
        gen_headers_dep = ' '.join(gen_headers)

        for obj in compiles:
@@ -1486,9 +1398,6 @@ with open(buildfile_tmp, 'w') as f:
        for hh in serializers:
            src = serializers[hh]
            f.write('build {}: serializer {} | idl-compiler.py\n'.format(hh, src))
-        for hh in ragels:
-            src = ragels[hh]
-            f.write('build {}: ragel {}\n'.format(hh, src))
        for thrift in thrifts:
            outs = ' '.join(thrift.generated('$builddir/{}/gen'.format(mode)))
            f.write('build {}: thrift.{} {}\n'.format(outs, mode, thrift.source))
@@ -1502,33 +1411,25 @@ with open(buildfile_tmp, 'w') as f:
            for cc in grammar.sources('$builddir/{}/gen'.format(mode)):
                obj = cc.replace('.cpp', '.o')
                f.write('build {}: cxx.{} {} || {}\n'.format(obj, mode, cc, ' '.join(serializers)))
-                if cc.endswith('Parser.cpp'):
-                    # Unoptimized parsers end up using huge amounts of stack space and overflowing their stack
-                    flags = '-O1'
-                    if has_sanitize_address_use_after_scope:
-                        flags += ' -fno-sanitize-address-use-after-scope'
-                    f.write('  obj_cxxflags = %s\n' % flags)
+                if cc.endswith('Parser.cpp') and has_sanitize_address_use_after_scope:
+                    # Parsers end up using huge amounts of stack space and overflowing their stack
+                    f.write('  obj_cxxflags = -fno-sanitize-address-use-after-scope\n')
        for hh in headers:
            f.write('build $builddir/{mode}/{hh}.o: checkhh.{mode} {hh} || {gen_headers_dep}\n'.format(
                    mode=mode, hh=hh, gen_headers_dep=gen_headers_dep))

-        f.write('build build/{mode}/seastar/libseastar.a: ninja | always\n'
+        f.write('build seastar/build/{mode}/libseastar.a: ninja | always\n'
                .format(**locals()))
        f.write('  pool = submodule_pool\n')
-        f.write('  subdir = build/{mode}/seastar\n'.format(**locals()))
-        f.write('  target = seastar\n'.format(**locals()))
-        f.write('build build/{mode}/seastar/libseastar_testing.a: ninja\n'
+        f.write('  subdir = seastar/build/{mode}\n'.format(**locals()))
+        f.write('  target = seastar seastar_testing\n'.format(**locals()))
+        f.write('build seastar/build/{mode}/apps/iotune/iotune: ninja\n'
                .format(**locals()))
        f.write('  pool = submodule_pool\n')
-        f.write('  subdir = build/{mode}/seastar\n'.format(**locals()))
-        f.write('  target = seastar_testing\n'.format(**locals()))
-        f.write('build build/{mode}/seastar/apps/iotune/iotune: ninja\n'
-                .format(**locals()))
-        f.write('  pool = submodule_pool\n')
-        f.write('  subdir = build/{mode}/seastar\n'.format(**locals()))
+        f.write('  subdir = seastar/build/{mode}\n'.format(**locals()))
        f.write('  target = iotune\n'.format(**locals()))
        f.write(textwrap.dedent('''\
-            build build/{mode}/iotune: copy build/{mode}/seastar/apps/iotune/iotune
+            build build/{mode}/iotune: copy seastar/build/{mode}/apps/iotune/iotune
            ''').format(**locals()))
        f.write('build build/{mode}/scylla-package.tar.gz: package build/{mode}/scylla build/{mode}/iotune build/SCYLLA-RELEASE-FILE build/SCYLLA-VERSION-FILE | always\n'.format(**locals()))
        f.write('  pool = submodule_pool\n')
@@ -1549,7 +1450,7 @@ with open(buildfile_tmp, 'w') as f:
        rule configure
          command = {python} configure.py $configure_args
          generator = 1
-        build build.ninja: configure | configure.py SCYLLA-VERSION-GEN
+        build build.ninja: configure | configure.py seastar/configure.py
        rule cscope
            command = find -name '*.[chS]' -o -name "*.cc" -o -name "*.hh" | cscope -bq -i-
            description = CSCOPE
@@ -1558,10 +1459,6 @@ with open(buildfile_tmp, 'w') as f:
            command = rm -rf build
            description = CLEAN
        build clean: clean
-        rule mode_list
-            command = echo {modes_list}
-            description = List configured modes
-        build mode_list: mode_list
        default {modes_list}
        ''').format(modes_list=' '.join(default_modes), **globals()))
    f.write(textwrap.dedent('''\
--- a/connection_notifier.cc
+++ b/connection_notifier.cc
@@ -1,71 +0,0 @@
-/*
- * Copyright (C) 2019 ScyllaDB
- */
-
-/*
- * This file is part of Scylla.
- *
- * Scylla is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Affero General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * Scylla is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include "connection_notifier.hh"
-#include "db/query_context.hh"
-#include "cql3/constants.hh"
-#include "database.hh"
-#include "service/storage_proxy.hh"
-
-#include <stdexcept>
-
-namespace db::system_keyspace {
-extern const char *const CLIENTS;
-}
-
-static sstring to_string(client_type ct) {
-    switch (ct) {
-        case client_type::cql: return "cql";
-        case client_type::thrift: return "thrift";
-        case client_type::alternator: return "alternator";
-        default: throw std::runtime_error("Invalid client_type");
-    }
-}
-
-future<> notify_new_client(client_data cd) {
-    // FIXME: consider prepared statement
-    const static sstring req
-            = format("INSERT INTO system.{} (address, port, client_type, shard_id, protocol_version, username) "
-                     "VALUES (?, ?, ?, ?, ?, ?);", db::system_keyspace::CLIENTS);
-    
-    return db::execute_cql(req,
-            std::move(cd.ip), cd.port, to_string(cd.ct), cd.shard_id,
-            cd.protocol_version.has_value() ? data_value(*cd.protocol_version) : data_value::make_null(int32_type),
-            cd.username.value_or("anonymous")).discard_result();
-}
-
-future<> notify_disconnected_client(gms::inet_address addr, client_type ct, int port) {
-    // FIXME: consider prepared statement
-    const static sstring req
-            = format("DELETE FROM system.{} where address=? AND port=? AND client_type=?;",
-                     db::system_keyspace::CLIENTS);
-    return db::execute_cql(req, addr.addr(), port, to_string(ct)).discard_result();
-}
-
-future<> clear_clientlist() {
-    auto& db_local = service::get_storage_proxy().local().get_db().local();
-    return db_local.truncate(
-            db_local.find_keyspace(db::system_keyspace_name()),
-            db_local.find_column_family(db::system_keyspace_name(),
-                    db::system_keyspace::CLIENTS),
-            [] { return make_ready_future<db_clock::time_point>(db_clock::now()); },
-            false /* with_snapshot */);
-}
--- a/connection_notifier.hh
+++ b/connection_notifier.hh
@@ -1,57 +0,0 @@
-/*
- * Copyright (C) 2019 ScyllaDB
- */
-
-/*
- * This file is part of Scylla.
- *
- * Scylla is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Affero General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * Scylla is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
- */
-#pragma once
-
-#include "gms/inet_address.hh"
-#include <seastar/core/sstring.hh>
-#include <optional>
-
-enum class client_type {
-    cql = 0,
-    thrift,
-    alternator,
-};
-
-// Representation of a row in `system.clients'. std::optionals are for nullable cells.
-struct client_data {
-    gms::inet_address ip;
-    int32_t port;
-    client_type ct;
-    int32_t shard_id;  /// ID of server-side shard which is processing the connection.
-
-    // `optional' column means that it's nullable (possibly because it's
-    // unimplemented yet). If you want to fill ("implement") any of them,
-    // remember to update the query in `notify_new_client()'.
-    std::optional<sstring> connection_stage;
-    std::optional<sstring> driver_name;
-    std::optional<sstring> driver_version;
-    std::optional<sstring> hostname;
-    std::optional<int32_t> protocol_version;
-    std::optional<sstring> ssl_cipher_suite;
-    std::optional<bool> ssl_enabled;
-    std::optional<sstring> ssl_protocol;
-    std::optional<sstring> username;
-};
-
-future<> notify_new_client(client_data cd);
-future<> notify_disconnected_client(gms::inet_address addr, client_type ct, int port);
-
-future<> clear_clientlist();
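Note on the removed `client_data` struct: its `std::optional` members stand for nullable cells, and `notify_new_client()` binds either the value, a typed null (`protocol_version`), or a default (`username` falls back to "anonymous"). A minimal sketch of that mapping follows, assuming a trimmed-down struct; `client_row`, `bound_protocol_version` and `bound_username` are hypothetical names used only for illustration and are not part of Scylla.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical, trimmed-down mirror of client_data (illustration only).
struct client_row {
    int32_t port;
    std::optional<int32_t> protocol_version;  // nullable cell
    std::optional<std::string> username;      // nullable cell
};

// Text form of the value bound for protocol_version: the number, or "null" when unset.
std::string bound_protocol_version(const client_row& r) {
    return r.protocol_version ? std::to_string(*r.protocol_version) : "null";
}

// Value bound for username, defaulting to "anonymous" when unset.
std::string bound_username(const client_row& r) {
    return r.username.value_or("anonymous");
}
```

The sketch renders values as strings only to stay self-contained; the removed code binds `data_value` objects instead.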
--- a/converting_mutation_partition_applier.hh
+++ b/converting_mutation_partition_applier.hh
@@ -21,9 +21,6 @@

 #pragma once

-#include "types/user.hh"
-#include "concrete_types.hh"
-
 #include "mutation_partition_view.hh"
 #include "mutation_partition.hh"
 #include "schema.hh"
@@ -38,8 +35,8 @@ class converting_mutation_partition_applier : public mutation_partition_visitor
    const column_mapping& _visited_column_mapping;
    deletable_row* _current_row;
 private:
-    static bool is_compatible(const column_definition& new_def, const abstract_type& old_type, column_kind kind) {
-        return ::is_compatible(new_def.kind, kind) && new_def.type->is_value_compatible_with(old_type);
+    static bool is_compatible(const column_definition& new_def, const data_type& old_type, column_kind kind) {
+        return ::is_compatible(new_def.kind, kind) && new_def.type->is_value_compatible_with(*old_type);
    }
    static atomic_cell upgrade_cell(const abstract_type& new_type, const abstract_type& old_type, atomic_cell_view cell,
                                    atomic_cell::collection_member cm = atomic_cell::collection_member::no) {
@@ -52,59 +49,32 @@ private:
            return atomic_cell(new_type, cell);
        }
    }
-    static void accept_cell(row& dst, column_kind kind, const column_definition& new_def, const abstract_type& old_type, atomic_cell_view cell) {
+    static void accept_cell(row& dst, column_kind kind, const column_definition& new_def, const data_type& old_type, atomic_cell_view cell) {
        if (!is_compatible(new_def, old_type, kind) || cell.timestamp() <= new_def.dropped_at()) {
            return;
        }
-        dst.apply(new_def, upgrade_cell(*new_def.type, old_type, cell));
+        dst.apply(new_def, upgrade_cell(*new_def.type, *old_type, cell));
    }
-    static void accept_cell(row& dst, column_kind kind, const column_definition& new_def, const abstract_type& old_type, collection_mutation_view cell) {
+    static void accept_cell(row& dst, column_kind kind, const column_definition& new_def, const data_type& old_type, collection_mutation_view cell) {
        if (!is_compatible(new_def, old_type, kind)) {
            return;
        }
+      cell.data.with_linearized([&] (bytes_view cell_bv) {
+        auto new_ctype = static_pointer_cast<const collection_type_impl>(new_def.type);
+        auto old_ctype = static_pointer_cast<const collection_type_impl>(old_type);
+        auto old_view = old_ctype->deserialize_mutation_form(cell_bv);

-      cell.with_deserialized(old_type, [&] (collection_mutation_view_description old_view) {
-        collection_mutation_description new_view;
+        collection_type_impl::mutation new_view;
        if (old_view.tomb.timestamp > new_def.dropped_at()) {
            new_view.tomb = old_view.tomb;
        }
-
-        visit(old_type, make_visitor(
-            [&] (const collection_type_impl& old_ctype) {
-                assert(new_def.type->is_collection()); // because is_compatible
-                auto& new_ctype = static_cast<const collection_type_impl&>(*new_def.type);
-
-                auto& new_value_type = *new_ctype.value_comparator();
-                auto& old_value_type = *old_ctype.value_comparator();
-
-                for (auto& c : old_view.cells) {
-                    if (c.second.timestamp() > new_def.dropped_at()) {
-                        new_view.cells.emplace_back(c.first, upgrade_cell(
-                                new_value_type, old_value_type, c.second, atomic_cell::collection_member::yes));
-                    }
-                }
-            },
-            [&] (const user_type_impl& old_utype) {
-                assert(new_def.type->is_user_type()); // because is_compatible
-                auto& new_utype = static_cast<const user_type_impl&>(*new_def.type);
-
-                for (auto& c : old_view.cells) {
-                    if (c.second.timestamp() > new_def.dropped_at()) {
-                        auto idx = deserialize_field_index(c.first);
-                        assert(idx < new_utype.size() && idx < old_utype.size());
-
-                        new_view.cells.emplace_back(c.first, upgrade_cell(
-                                *new_utype.type(idx), *old_utype.type(idx), c.second, atomic_cell::collection_member::yes));
-                    }
-                }
-            },
-            [&] (const abstract_type& o) {
-                throw std::runtime_error(format("not a multi-cell type: {}", o.name()));
+        for (auto& c : old_view.cells) {
+            if (c.second.timestamp() > new_def.dropped_at()) {
+                new_view.cells.emplace_back(c.first, upgrade_cell(*new_ctype->value_comparator(), *old_ctype->value_comparator(), c.second, atomic_cell::collection_member::yes));
            }
-        ));
-
+        }
        if (new_view.tomb || !new_view.cells.empty()) {
-            dst.apply(new_def, new_view.serialize(*new_def.type));
+            dst.apply(new_def, new_ctype->serialize_mutation_form(std::move(new_view)));
        }
      });
    }
@@ -130,7 +100,7 @@ public:
        const column_mapping_entry& col = _visited_column_mapping.static_column_at(id);
        const column_definition* def = _p_schema.get_column_definition(col.name());
        if (def) {
-            accept_cell(_p._static_row.maybe_create(), column_kind::static_column, *def, *col.type(), cell);
+            accept_cell(_p._static_row, column_kind::static_column, *def, col.type(), cell);
        }
    }

@@ -138,7 +108,7 @@ public:
        const column_mapping_entry& col = _visited_column_mapping.static_column_at(id);
        const column_definition* def = _p_schema.get_column_definition(col.name());
        if (def) {
-            accept_cell(_p._static_row.maybe_create(), column_kind::static_column, *def, *col.type(), collection);
+            accept_cell(_p._static_row, column_kind::static_column, *def, col.type(), collection);
        }
    }

@@ -161,7 +131,7 @@ public:
        const column_mapping_entry& col = _visited_column_mapping.regular_column_at(id);
        const column_definition* def = _p_schema.get_column_definition(col.name());
        if (def) {
-            accept_cell(_current_row->cells(), column_kind::regular_column, *def, *col.type(), cell);
+            accept_cell(_current_row->cells(), column_kind::regular_column, *def, col.type(), cell);
        }
    }

@@ -169,7 +139,7 @@ public:
        const column_mapping_entry& col = _visited_column_mapping.regular_column_at(id);
        const column_definition* def = _p_schema.get_column_definition(col.name());
        if (def) {
-            accept_cell(_current_row->cells(), column_kind::regular_column, *def, *col.type(), collection);
+            accept_cell(_current_row->cells(), column_kind::regular_column, *def, col.type(), collection);
        }
    }

@@ -177,9 +147,9 @@ public:
    // Cells must have monotonic names.
    static void append_cell(row& dst, column_kind kind, const column_definition& new_def, const column_definition& old_def, const atomic_cell_or_collection& cell) {
        if (new_def.is_atomic()) {
-            accept_cell(dst, kind, new_def, *old_def.type, cell.as_atomic_cell(old_def));
+            accept_cell(dst, kind, new_def, old_def.type, cell.as_atomic_cell(old_def));
        } else {
-            accept_cell(dst, kind, new_def, *old_def.type, cell.as_collection_mutation());
+            accept_cell(dst, kind, new_def, old_def.type, cell.as_collection_mutation());
        }
    }
 };
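Note on the collection overload of `accept_cell`: on both sides of this diff, the tombstone and each cell are kept only when their timestamp is strictly greater than `new_def.dropped_at()`, and the result is applied only if something survived. The sketch below illustrates that filter with simplified stand-in types; `cell`, `collection_mutation` and `filter_dropped` are hypothetical, and modelling the tombstone as an `std::optional` is a simplification of this sketch, not how Scylla represents it.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; these are not Scylla's types.
struct cell {
    int64_t timestamp;
    std::string value;
};

struct collection_mutation {
    std::optional<int64_t> tomb_timestamp;             // collection tombstone, if any
    std::vector<std::pair<std::string, cell>> cells;   // (key, cell) pairs
};

// Keep only the tombstone and cells that are newer than the drop timestamp.
// Returns std::nullopt when nothing survives, mirroring the
// "if (new_view.tomb || !new_view.cells.empty())" guard in the diff.
std::optional<collection_mutation>
filter_dropped(const collection_mutation& old_view, int64_t dropped_at) {
    collection_mutation new_view;
    if (old_view.tomb_timestamp && *old_view.tomb_timestamp > dropped_at) {
        new_view.tomb_timestamp = old_view.tomb_timestamp;
    }
    for (const auto& c : old_view.cells) {
        if (c.second.timestamp > dropped_at) {
            new_view.cells.push_back(c);
        }
    }
    if (!new_view.tomb_timestamp && new_view.cells.empty()) {
        return std::nullopt;
    }
    return new_view;
}
```

The comparison is strict, so data written at exactly the drop timestamp is discarded, which matches the `>` used in both versions of the hunk.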
--- a/cql3/Cql.g
+++ b/cql3/Cql.g
@@ -43,14 +43,12 @@ options {
 #include "cql3/statements/create_table_statement.hh"
 #include "cql3/statements/create_view_statement.hh"
 #include "cql3/statements/create_type_statement.hh"
-#include "cql3/statements/create_function_statement.hh"
 #include "cql3/statements/drop_type_statement.hh"
 #include "cql3/statements/alter_type_statement.hh"
 #include "cql3/statements/property_definitions.hh"
 #include "cql3/statements/drop_index_statement.hh"
 #include "cql3/statements/drop_table_statement.hh"
 #include "cql3/statements/drop_view_statement.hh"
-#include "cql3/statements/drop_function_statement.hh"
 #include "cql3/statements/truncate_statement.hh"
 #include "cql3/statements/raw/update_statement.hh"
 #include "cql3/statements/raw/insert_statement.hh"
@@ -245,14 +243,10 @@ struct uninitialized {
        return res;
    }

-    sstring to_lower(std::string_view s) {
-        sstring lower_s(s.size(), '\0');
-        std::transform(s.cbegin(), s.cend(), lower_s.begin(), &::tolower);
-        return lower_s;
-    }
-
    bool convert_boolean_literal(std::string_view s) {
-        return to_lower(s) == "true";
+        std::string lower_s(s.size(), '\0');
+        std::transform(s.cbegin(), s.cend(), lower_s.begin(), &::tolower);
+        return lower_s == "true";
    }

    void add_raw_update(std::vector<std::pair<::shared_ptr<cql3::column_identifier::raw>,::shared_ptr<cql3::operation::raw_update>>>& operations,
@@ -354,9 +348,9 @@ cqlStatement returns [shared_ptr<raw::parsed_statement> stmt]
    | st25=createTypeStatement         { $stmt = st25; }
    | st26=alterTypeStatement          { $stmt = st26; }
    | st27=dropTypeStatement           { $stmt = st27; }
+#if 0
    | st28=createFunctionStatement     { $stmt = st28; }
    | st29=dropFunctionStatement       { $stmt = st29; }
-#if 0
    | st30=createAggregateStatement    { $stmt = st30; }
    | st31=dropAggregateStatement      { $stmt = st31; }
 #endif
@@ -530,7 +524,6 @@ usingClauseObjective[::shared_ptr<cql3::attributes::raw> attrs]
 */
 updateStatement returns [::shared_ptr<raw::update_statement> expr]
    @init {
-        bool if_exists = false;
        auto attrs = ::make_shared<cql3::attributes::raw>();
        std::vector<std::pair<::shared_ptr<cql3::column_identifier::raw>, ::shared_ptr<cql3::operation::raw_update>>> operations;
    }
@@ -538,14 +531,13 @@ updateStatement returns [::shared_ptr<raw::update_statement> expr]
      ( usingClause[attrs] )?
      K_SET columnOperation[operations] (',' columnOperation[operations])*
      K_WHERE wclause=whereClause
-      ( K_IF (K_EXISTS{ if_exists = true; } | conditions=updateConditions) )?
+      ( K_IF conditions=updateConditions )?
      {
          return ::make_shared<raw::update_statement>(std::move(cf),
                                                  std::move(attrs),
                                                  std::move(operations),
                                                  std::move(wclause),
-                                                  std::move(conditions),
-                                                  if_exists);
+                                                  std::move(conditions));
     }
    ;

@@ -589,7 +581,6 @@ deleteSelection returns [std::vector<::shared_ptr<cql3::operation::raw_deletion>
 deleteOp returns [::shared_ptr<cql3::operation::raw_deletion> op]
    : c=cident                { $op = ::make_shared<cql3::operation::column_deletion>(std::move(c)); }
    | c=cident '[' t=term ']' { $op = ::make_shared<cql3::operation::element_deletion>(std::move(c), std::move(t)); }
-    | c=cident '.' field=ident { $op = ::make_shared<cql3::operation::field_deletion>(std::move(c), std::move(field)); }
    ;

 usingClauseDelete[::shared_ptr<cql3::attributes::raw> attrs]
@@ -692,56 +683,54 @@ dropAggregateStatement returns [DropAggregateStatement expr]
      )?
      { $expr = new DropAggregateStatement(fn, argsTypes, argsPresent, ifExists); }
    ;
-#endif

-createFunctionStatement returns [shared_ptr<cql3::statements::create_function_statement> expr]
+createFunctionStatement returns [CreateFunctionStatement expr]
    @init {
-        bool or_replace = false;
-        bool if_not_exists = false;
+        boolean orReplace = false;
+        boolean ifNotExists = false;

-        std::vector<shared_ptr<cql3::column_identifier>> arg_names;
-        std::vector<shared_ptr<cql3_type::raw>> arg_types;
-        bool called_on_null_input = false;
+        boolean deterministic = true;
+        List<ColumnIdentifier> argsNames = new ArrayList<>();
+        List<CQL3Type.Raw> argsTypes = new ArrayList<>();
    }
-    : K_CREATE
-        // "OR REPLACE" and "IF NOT EXISTS" cannot be used together
-        ((K_OR K_REPLACE { or_replace = true; } K_FUNCTION)
-         | (K_FUNCTION K_IF K_NOT K_EXISTS { if_not_exists = true; })
-         | K_FUNCTION)
+    : K_CREATE (K_OR K_REPLACE { orReplace = true; })?
+      ((K_NON { deterministic = false; })? K_DETERMINISTIC)?
+      K_FUNCTION
+      (K_IF K_NOT K_EXISTS { ifNotExists = true; })?
      fn=functionName
      '('
        (
-          k=ident v=comparatorType { arg_names.push_back(k); arg_types.push_back(v); }
-          ( ',' k=ident v=comparatorType { arg_names.push_back(k); arg_types.push_back(v); } )*
+          k=ident v=comparatorType { argsNames.add(k); argsTypes.add(v); }
+          ( ',' k=ident v=comparatorType { argsNames.add(k); argsTypes.add(v); } )*
        )?
      ')'
-      ( (K_RETURNS K_NULL) | (K_CALLED { called_on_null_input = true; })) K_ON K_NULL K_INPUT
      K_RETURNS rt = comparatorType
      K_LANGUAGE language = IDENT
      K_AS body = STRING_LITERAL
-      { $expr = ::make_shared<cql3::statements::create_function_statement>(std::move(fn), to_lower($language.text), $body.text, std::move(arg_names), std::move(arg_types), std::move(rt), called_on_null_input, or_replace, if_not_exists); }
+      { $expr = new CreateFunctionStatement(fn, $language.text.toLowerCase(), $body.text, deterministic, argsNames, argsTypes, rt, orReplace, ifNotExists); }
    ;

-dropFunctionStatement returns [shared_ptr<cql3::statements::drop_function_statement> expr]
+dropFunctionStatement returns [DropFunctionStatement expr]
    @init {
-        bool if_exists = false;
-        std::vector<shared_ptr<cql3_type::raw>> arg_types;
-        bool args_present = false;
+        boolean ifExists = false;
+        List<CQL3Type.Raw> argsTypes = new ArrayList<>();
+        boolean argsPresent = false;
    }
    : K_DROP K_FUNCTION
-      (K_IF K_EXISTS { if_exists = true; } )?
+      (K_IF K_EXISTS { ifExists = true; } )?
      fn=functionName
      (
        '('
          (
-            v=comparatorType { arg_types.push_back(v); }
-            ( ',' v=comparatorType { arg_types.push_back(v); } )*
+            v=comparatorType { argsTypes.add(v); }
+            ( ',' v=comparatorType { argsTypes.add(v); } )*
          )?
        ')'
-        { args_present = true; }
+        { argsPresent = true; }
      )?
-      { $expr = ::make_shared<cql3::statements::drop_function_statement>(std::move(fn), std::move(arg_types), args_present, if_exists); }
+      { $expr = new DropFunctionStatement(fn, argsTypes, argsPresent, ifExists); }
    ;
+#endif

 /**
 * CREATE KEYSPACE [IF NOT EXISTS] <KEYSPACE> WITH attr1 = value1 AND attr2 = value2;
@@ -1407,9 +1396,8 @@ columnOperation[operations_type& operations]

 columnOperationDifferentiator[operations_type& operations, ::shared_ptr<cql3::column_identifier::raw> key]
    : '=' normalColumnOperation[operations, key]
-    | '[' k=term ']' collectionColumnOperation[operations, key, k, false]
-    | '.' field=ident udtColumnOperation[operations, key, field]
-    | '[' K_SCYLLA_TIMEUUID_LIST_INDEX '(' k=term ')' ']' collectionColumnOperation[operations, key, k, true]
+    | '[' k=term ']' specializedColumnOperation[operations, key, k, false]
+    | '[' K_SCYLLA_TIMEUUID_LIST_INDEX '(' k=term ')' ']' specializedColumnOperation[operations, key, k, true]
    ;

 normalColumnOperation[operations_type& operations, ::shared_ptr<cql3::column_identifier::raw> key]
@@ -1452,38 +1440,31 @@ normalColumnOperation[operations_type& operations, ::shared_ptr<cql3::column_ide
      }
    ;

-collectionColumnOperation[operations_type& operations,
-                          shared_ptr<cql3::column_identifier::raw> key,
-                          shared_ptr<cql3::term::raw> k,
-                          bool by_uuid]
+specializedColumnOperation[std::vector<std::pair<shared_ptr<cql3::column_identifier::raw>,
+                                                 shared_ptr<cql3::operation::raw_update>>>& operations,
+                           shared_ptr<cql3::column_identifier::raw> key,
+                           shared_ptr<cql3::term::raw> k,
+                           bool by_uuid]
+
    : '=' t=term
      {
          add_raw_update(operations, key, make_shared<cql3::operation::set_element>(k, t, by_uuid));
      }
    ;

-udtColumnOperation[operations_type& operations,
-                   shared_ptr<cql3::column_identifier::raw> key,
-                   shared_ptr<cql3::column_identifier> field]
-    : '=' t=term
-      {
-          add_raw_update(operations, std::move(key), make_shared<cql3::operation::set_field>(std::move(field), std::move(t)));
-      }
-    ;
-
 columnCondition[conditions_type& conditions]
    // Note: we'll reject duplicates later
    : key=cident
-        ( op=relationType t=term { conditions.emplace_back(key, cql3::column_condition::raw::simple_condition(t, {}, *op)); }
+        ( op=relationType t=term { conditions.emplace_back(key, cql3::column_condition::raw::simple_condition(t, *op)); }
        | K_IN
-            ( values=singleColumnInValues { conditions.emplace_back(key, cql3::column_condition::raw::in_condition({}, {}, values)); }
-            | marker=inMarker { conditions.emplace_back(key, cql3::column_condition::raw::in_condition({}, marker, {})); }
+            ( values=singleColumnInValues { conditions.emplace_back(key, cql3::column_condition::raw::simple_in_condition(values)); }
+            | marker=inMarker { conditions.emplace_back(key, cql3::column_condition::raw::simple_in_condition(marker)); }
            )
        | '[' element=term ']'
-            ( op=relationType t=term { conditions.emplace_back(key, cql3::column_condition::raw::simple_condition(t, element, *op)); }
+            ( op=relationType t=term { conditions.emplace_back(key, cql3::column_condition::raw::collection_condition(t, element, *op)); }
            | K_IN
-                ( values=singleColumnInValues { conditions.emplace_back(key, cql3::column_condition::raw::in_condition(element, {}, values)); }
-                | marker=inMarker { conditions.emplace_back(key, cql3::column_condition::raw::in_condition(element, marker, {})); }
+                ( values=singleColumnInValues { conditions.emplace_back(key, cql3::column_condition::raw::collection_in_condition(element, values)); }
+                | marker=inMarker { conditions.emplace_back(key, cql3::column_condition::raw::collection_in_condition(element, marker)); }
                )
            )
        )
@@ -1751,8 +1732,8 @@ basic_unreserved_keyword returns [sstring str]
        | K_INITCOND
        | K_RETURNS
        | K_LANGUAGE
-        | K_CALLED
-        | K_INPUT
+        | K_NON
+        | K_DETERMINISTIC
        | K_JSON
        | K_CACHE
        | K_BYPASS
@@ -1891,11 +1872,11 @@ K_STYPE:       S T Y P E;
 K_FINALFUNC:   F I N A L F U N C;
 K_INITCOND:    I N I T C O N D;
 K_RETURNS:     R E T U R N S;
-K_CALLED:      C A L L E D;
-K_INPUT:       I N P U T;
 K_LANGUAGE:    L A N G U A G E;
+K_NON:         N O N;
 K_OR:          O R;
 K_REPLACE:     R E P L A C E;
+K_DETERMINISTIC: D E T E R M I N I S T I C;
 K_JSON:        J S O N;
 K_DEFAULT:     D E F A U L T;
 K_UNSET:       U N S E T;
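Note on `convert_boolean_literal`: both sides of the hunk lowercase the literal and compare it with "true"; the patch only inlines the removed `to_lower` helper. A minimal standalone sketch of that check follows. The name `is_true_literal` is hypothetical, and the `unsigned char` parameter is an addition of this sketch (passing a negative `char` value to `std::tolower` is undefined behaviour), not something either version of the grammar code does.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper, not part of the patch: case-insensitive test for "true".
bool is_true_literal(std::string_view s) {
    std::string lower(s.size(), '\0');
    // Lowercase byte by byte; taking the argument as unsigned char keeps std::tolower well defined.
    std::transform(s.cbegin(), s.cend(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return lower == "true";
}
```

Any literal other than a case variant of "true" yields false, so no separate check for "false" is needed here.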
--- a/cql3/abstract_marker.cc
+++ b/cql3/abstract_marker.cc
@@ -45,7 +45,6 @@
 #include "cql3/lists.hh"
 #include "cql3/maps.hh"
 #include "cql3/sets.hh"
-#include "cql3/user_types.hh"
 #include "types/list.hh"

 namespace cql3 {
@@ -55,7 +54,7 @@ abstract_marker::abstract_marker(int32_t bind_index, ::shared_ptr<column_specifi
    , _receiver{std::move(receiver)}
 { }

-void abstract_marker::collect_marker_specification(lw_shared_ptr<variable_specifications> bound_names) {
+void abstract_marker::collect_marker_specification(::shared_ptr<variable_specifications> bound_names) {
    bound_names->add(_bind_index, _receiver);
 }

@@ -69,22 +68,19 @@ abstract_marker::raw::raw(int32_t bind_index)

 ::shared_ptr<term> abstract_marker::raw::prepare(database& db, const sstring& keyspace, ::shared_ptr<column_specification> receiver)
 {
-    if (receiver->type->is_collection()) {
-        if (receiver->type->get_kind() == abstract_type::kind::list) {
-            return ::make_shared<lists::marker>(_bind_index, receiver);
-        } else if (receiver->type->get_kind() == abstract_type::kind::set) {
-            return ::make_shared<sets::marker>(_bind_index, receiver);
-        } else if (receiver->type->get_kind() == abstract_type::kind::map) {
-            return ::make_shared<maps::marker>(_bind_index, receiver);
-        }
-        assert(0);
+    auto receiver_type = ::dynamic_pointer_cast<const collection_type_impl>(receiver->type);
+    if (receiver_type == nullptr) {
+        return ::make_shared<constants::marker>(_bind_index, receiver);
    }
-
-    if (receiver->type->is_user_type()) {
-        return ::make_shared<user_types::marker>(_bind_index, receiver);
+    if (receiver_type->get_kind() == abstract_type::kind::list) {
+        return ::make_shared<lists::marker>(_bind_index, receiver);
+    } else if (receiver_type->get_kind() == abstract_type::kind::set) {
+        return ::make_shared<sets::marker>(_bind_index, receiver);
+    } else if (receiver_type->get_kind() == abstract_type::kind::map) {
+        return ::make_shared<maps::marker>(_bind_index, receiver);
    }
-
-    return ::make_shared<constants::marker>(_bind_index, receiver);
+    assert(0);
+    return shared_ptr<term>();
 }

 assignment_testable::test_result abstract_marker::raw::test_assignment(database& db, const sstring& keyspace, ::shared_ptr<column_specification> receiver) {
--- a/cql3/abstract_marker.hh
+++ b/cql3/abstract_marker.hh
@@ -57,7 +57,7 @@ protected:
 public:
    abstract_marker(int32_t bind_index, ::shared_ptr<column_specification>&& receiver);

-    virtual void collect_marker_specification(lw_shared_ptr<variable_specifications> bound_names) override;
+    virtual void collect_marker_specification(::shared_ptr<variable_specifications> bound_names) override;

    virtual bool contains_bind_marker() const override;

--- a/cql3/attributes.cc
+++ b/cql3/attributes.cc
@@ -120,7 +120,7 @@ int32_t attributes::get_time_to_live(const query_options& options) {
    return ttl;
 }

-void attributes::collect_marker_specification(lw_shared_ptr<variable_specifications> bound_names) {
+void attributes::collect_marker_specification(::shared_ptr<variable_specifications> bound_names) {
    if (_timestamp) {
        _timestamp->collect_marker_specification(bound_names);
    }
--- a/cql3/attributes.hh
+++ b/cql3/attributes.hh
@@ -69,7 +69,7 @@ public:

    int32_t get_time_to_live(const query_options& options);

-    void collect_marker_specification(lw_shared_ptr<variable_specifications> bound_names);
+    void collect_marker_specification(::shared_ptr<variable_specifications> bound_names);

    class raw {
    public:
--- a/cql3/column_condition.cc
+++ b/cql3/column_condition.cc
@@ -45,8 +45,6 @@
 #include "lists.hh"
 #include "maps.hh"
 #include <boost/range/algorithm_ext/push_back.hpp>
-#include "types/map.hh"
-#include "types/list.hh"

 namespace {

@@ -63,58 +61,12 @@ void validate_operation_on_durations(const abstract_type& type, const cql3::oper
    }
 }

-int is_satisfied_by(const cql3::operator_type &op, const abstract_type& cell_type,
-        const abstract_type& param_type, const data_value& cell_value, const bytes& param) {
-
-        int rc;
-        // For multi-cell sets and lists, cell value is represented as a map,
-        // thanks to collections_as_maps flag in partition_slice. param, however,
-        // is represented as a set or list type.
-        // We must implement an own compare of two different representations
-        // to compare the two.
-        if (cell_type.is_map() && cell_type.is_multi_cell() && param_type.is_listlike()) {
-            const listlike_collection_type_impl& list_type = static_cast<const listlike_collection_type_impl&>(param_type);
-            const map_type_impl& map_type = static_cast<const map_type_impl&>(cell_type);
-            assert(list_type.is_multi_cell());
-            // Inverse comparison result since the order of arguments is inverse.
-            rc = -list_type.compare_with_map(map_type, param, map_type.decompose(cell_value));
-        } else {
-            rc = cell_type.compare(cell_type.decompose(cell_value), param);
-        }
-        if (op == cql3::operator_type::EQ) {
-            return rc == 0;
-        } else if (op == cql3::operator_type::NEQ) {
-            return rc != 0;
-        } else if (op == cql3::operator_type::GTE) {
-            return rc >= 0;
-        } else if (op == cql3::operator_type::LTE) {
-            return rc <= 0;
-        } else if (op == cql3::operator_type::GT) {
-            return rc > 0;
-        } else if (op == cql3::operator_type::LT) {
-            return rc < 0;
-        }
-        assert(false);
-        return false;
 }

-// Read the list index from key and check that list index is not
-// negative. The negative range check repeats Cassandra behaviour.
-uint32_t read_and_check_list_index(const cql3::raw_value_view& key) {
-    // The list element type is always int32_type, see lists::index_spec_of
-    int32_t idx = read_simple_exactly<int32_t>(to_bytes(key));
-    if (idx < 0) {
-        throw exceptions::invalid_request_exception(format("Invalid negative list index {}", idx));
-    }
-    return static_cast<uint32_t>(idx);
-}
-
-} // end of anonymous namespace
-
 namespace cql3 {

 bool
-column_condition::uses_function(const sstring& ks_name, const sstring& function_name) const {
+column_condition::uses_function(const sstring& ks_name, const sstring& function_name) {
    if (bool(_collection_element) && _collection_element->uses_function(ks_name, function_name)) {
        return true;
    }
@@ -131,7 +83,7 @@ column_condition::uses_function(const sstring& ks_name, const sstring& function_
    return false;
 }

-void column_condition::collect_marker_specificaton(lw_shared_ptr<variable_specifications> bound_names) {
+void column_condition::collect_marker_specificaton(::shared_ptr<variable_specifications> bound_names) {
    if (_collection_element) {
        _collection_element->collect_marker_specification(bound_names);
    }
@@ -140,134 +92,7 @@ void column_condition::collect_marker_specificaton(lw_shared_ptr<variable_specif
            value->collect_marker_specification(bound_names);
        }
    }
-    if (_value) {
-        _value->collect_marker_specification(bound_names);
-    }
-}
-
-bool column_condition::applies_to(const data_value* cell_value, const query_options& options) const {
-
-    // Cassandra condition support has a few quirks:
-    // - only a simple conjunct of predicates is supported "predicate AND predicate AND ..."
-    // - a predicate can operate on a column or a collection element, which must always be
-    // on the right side: "a = 3" or "collection['key'] IN (1,2,3)"
-    // - parameter markers are allowed on the right hand side only
-    // - only <, >, >=, <=, != and IN predicates are supported.
-    // - NULLs and missing values are treated differently from the WHERE clause:
-    // a term or cell in IF clause is allowed to be NULL or compared with NULL,
-    // and NULL value is treated just like any other value in the domain (there is no
-    // three-value logic or UNKNOWN like in SQL).
-    // - empty sets/lists/maps are treated differently when comparing with NULLs depending on
-    // whether the object is frozen or not. An empty *frozen* set/map/list is not equal to NULL.
-    //  An empty *multi-cell* set/map/list is identical to NULL.
-    // The code below implements these rules in a way compatible with Cassandra.
-
-    // Use a map/list value instead of entire collection if a key is present in the predicate.
-    if (_collection_element != nullptr && cell_value != nullptr) {
-        // Checked in column_condition::raw::prepare()
-        assert(cell_value->type()->is_collection());
-        const collection_type_impl& cell_type = static_cast<const collection_type_impl&>(*cell_value->type());
-
-        cql3::raw_value_view key = _collection_element->bind_and_get(options);
-        if (key.is_unset_value()) {
-            throw exceptions::invalid_request_exception(
-                    format("Invalid 'unset' value in {} element access", cell_type.cql3_type_name()));
-        }
-        if (key.is_null()) {
-            throw exceptions::invalid_request_exception(
-                    format("Invalid null value for {} element access", cell_type.cql3_type_name()));
-        }
-        if (cell_type.is_map()) {
-            // If a collection is multi-cell and not frozen, it is returned as a map even if the
-            // underlying data type is "set" or "list". This is controlled by
-            // partition_slice::collections_as_maps enum, which is set when preparing a read command
-            // object. Representing a list as a map<timeuuid, listval> is necessary to identify the list field
-            // being updated, e.g. in case of UPDATE t SET list[3] = null WHERE a = 1 IF list[3]
-            // = 'key'
-            const map_type_impl& map_type = static_cast<const map_type_impl&>(cell_type);
-            // A map is serialized as a vector of data value pairs.
-            const std::vector<std::pair<data_value, data_value>>& map = map_type.from_value(*cell_value);
-            if (column.type->is_map()) {
-                // We're working with a map *type*, not only map *representation*.
-                with_linearized(*key, [&map, &map_type, &cell_value] (bytes_view key) {
-                    auto end = map.end();
-                    const auto& map_key_type = *map_type.get_keys_type();
-                    auto less = [&map_key_type](const std::pair<data_value, data_value>& value, bytes_view key) {
-                        return map_key_type.less(map_key_type.decompose(value.first), key);
-                    };
-                    // Map elements are sorted by key.
-                    auto it = std::lower_bound(map.begin(), end, key, less);
-                    if (it != end && map_key_type.equal(map_key_type.decompose(it->first), key)) {
-                        cell_value = &it->second;
-                    } else {
-                        cell_value = nullptr;
-                    }
-                });
-            } else if (column.type->is_list()) {
-                // We're working with a list type, represented as map.
-                uint32_t idx = read_and_check_list_index(key);
-                cell_value = idx >= map.size() ? nullptr : &map[idx].second;
-            } else {
-                // Syntax like "set_column['key'] = constant" is invalid.
-                assert(false);
-            }
-        } else if (cell_type.is_list()) {
-            // This is a *frozen* list.
-            const list_type_impl& list_type = static_cast<const list_type_impl&>(cell_type);
-            const std::vector<data_value>& list = list_type.from_value(*cell_value);
-            uint32_t idx = read_and_check_list_index(key);
-            cell_value = idx >= list.size() ? nullptr : &list[idx];
-        } else {
-            assert(false);
-        }
-    }
-
-    if (_op.is_compare()) {
-        // =, <, >, >=, <=, !=
-        cql3::raw_value_view param = _value->bind_and_get(options);
-
-        if (param.is_unset_value()) {
-            throw exceptions::invalid_request_exception("Invalid 'unset' value in condition");
-        }
-        if (param.is_null()) {
-            if (_op == operator_type::EQ) {
-                return cell_value == nullptr;
-            } else if (_op == operator_type::NEQ) {
-                return cell_value != nullptr;
-            } else {
-                throw exceptions::invalid_request_exception(format("Invalid comparison with null for operator \"{}\"", _op));
-            }
-        } else if (cell_value == nullptr) {
-            // The condition parameter is not null, so only NEQ can return true
-            return _op == operator_type::NEQ;
-        }
-        // type::validate() is called by bind_and_get(), so it's safe to pass to_bytes() result
-        // directly to compare.
-        return is_satisfied_by(_op, *cell_value->type(), *column.type, *cell_value, to_bytes(param));
-    }
-    assert(_op == operator_type::IN);
-
-    std::vector<bytes_opt> in_values;
-
-    if (_value) {
-        auto&& lval = dynamic_pointer_cast<multi_item_terminal>(_value->bind(options));
-        if (!lval) {
-            throw exceptions::invalid_request_exception("Invalid null value for IN condition");
-        }
-        in_values = std::move(lval->get_elements());
-    } else {
-        for (auto&& v : _in_values) {
-            in_values.emplace_back(to_bytes_opt(v->bind_and_get(options)));
-        }
-    }
-    // If cell value is NULL, IN list must contain NULL or an empty set/list. Otherwise it must contain cell value.
-    if (cell_value) {
-        return std::any_of(in_values.begin(), in_values.end(), [this, cell_value] (const bytes_opt& value) {
-            return value.has_value() && is_satisfied_by(operator_type::EQ, *cell_value->type(), *column.type, *cell_value, *value);
-        });
-    } else {
-        return std::any_of(in_values.begin(), in_values.end(), [] (const bytes_opt& value) { return !value.has_value() || value->empty(); });
-    }
+    _value->collect_marker_specification(bound_names);
 }

 ::shared_ptr<column_condition>
@@ -275,54 +100,61 @@ column_condition::raw::prepare(database& db, const sstring& keyspace, const colu
    if (receiver.type->is_counter()) {
        throw exceptions::invalid_request_exception("Conditions on counters are not supported");
    }
-    shared_ptr<term> collection_element_term;
-    shared_ptr<column_specification> value_spec = receiver.column_specification;

-    if (_collection_element) {
-        if (!receiver.type->is_collection()) {
-            throw exceptions::invalid_request_exception(format("Invalid element access syntax for non-collection column {}",
-                        receiver.name_as_text()));
-        }
-        // Pass a correct type specification to collection_element->prepare(), so that it can
-        // later be used to validate that the parameter type is compatible with the receiver type.
-        shared_ptr<column_specification> element_spec;
-        auto ctype = static_cast<const collection_type_impl*>(receiver.type.get());
-        if (ctype->get_kind() == abstract_type::kind::list) {
-            element_spec = lists::index_spec_of(receiver.column_specification);
-            value_spec = lists::value_spec_of(receiver.column_specification);
-        } else if (ctype->get_kind() == abstract_type::kind::map) {
-            element_spec = maps::key_spec_of(*receiver.column_specification);
-            value_spec = maps::value_spec_of(*receiver.column_specification);
-        } else if (ctype->get_kind() == abstract_type::kind::set) {
-            throw exceptions::invalid_request_exception(format("Invalid element access syntax for set column {}",
-                        receiver.name_as_text()));
+    if (!_collection_element) {
+        if (_op == operator_type::IN) {
+            if (_in_values.empty()) { // ?
+                return column_condition::in_condition(receiver, _in_marker->prepare(db, keyspace, receiver.column_specification));
+            }
+
+            std::vector<::shared_ptr<term>> terms;
+            for (auto&& value : _in_values) {
+                terms.push_back(value->prepare(db, keyspace, receiver.column_specification));
+            }
+            return column_condition::in_condition(receiver, std::move(terms));
        } else {
-            throw exceptions::invalid_request_exception(
-                    format("Unsupported collection type {} in a condition with element access", ctype->cql3_type_name()));
+            validate_operation_on_durations(*receiver.type, _op);
+            return column_condition::condition(receiver, _value->prepare(db, keyspace, receiver.column_specification), _op);
        }
-        collection_element_term = _collection_element->prepare(db, keyspace, element_spec);
    }

-    if (_op.is_compare()) {
+    if (!receiver.type->is_collection()) {
+        throw exceptions::invalid_request_exception(format("Invalid element access syntax for non-collection column {}", receiver.name_as_text()));
+    }
+
+    shared_ptr<column_specification> element_spec, value_spec;
+    auto ctype = static_cast<const collection_type_impl*>(receiver.type.get());
+    if (ctype->get_kind() == abstract_type::kind::list) {
+        element_spec = lists::index_spec_of(receiver.column_specification);
+        value_spec = lists::value_spec_of(receiver.column_specification);
+    } else if (ctype->get_kind() == abstract_type::kind::map) {
+        element_spec = maps::key_spec_of(*receiver.column_specification);
+        value_spec = maps::value_spec_of(*receiver.column_specification);
+    } else if (ctype->get_kind() == abstract_type::kind::set) {
+        throw exceptions::invalid_request_exception(format("Invalid element access syntax for set column {}", receiver.name()));
+    } else {
+        abort();
+    }
+
+    if (_op == operator_type::IN) {
+        if (_in_values.empty()) {
+            return column_condition::in_condition(receiver,
+                    _collection_element->prepare(db, keyspace, element_spec),
+                    _in_marker->prepare(db, keyspace, value_spec));
+        }
+        std::vector<shared_ptr<term>> terms;
+        terms.reserve(_in_values.size());
+        boost::push_back(terms, _in_values
+                                | boost::adaptors::transformed(std::bind(&term::raw::prepare, std::placeholders::_1, std::ref(db), std::ref(keyspace), value_spec)));
+        return column_condition::in_condition(receiver, _collection_element->prepare(db, keyspace, element_spec), terms);
+    } else {
        validate_operation_on_durations(*receiver.type, _op);
-        return column_condition::condition(receiver, collection_element_term, _value->prepare(db, keyspace, value_spec), _op);
-    }
-    if (_op != operator_type::IN) {
-        throw exceptions::invalid_request_exception(format("Unsupported operator type {} in a condition ", _op));
-    }

-    if (_in_marker) {
-        assert(_in_values.empty());
-        shared_ptr<term> multi_item_term = _in_marker->prepare(db, keyspace, value_spec);
-        return column_condition::in_condition(receiver, collection_element_term, multi_item_term, {});
+        return column_condition::condition(receiver,
+                _collection_element->prepare(db, keyspace, element_spec),
+                _value->prepare(db, keyspace, value_spec),
+                _op);
    }
-    // Both _in_values and _in_marker can be missing in case of an empty IN list: "a IN ()"
-    std::vector<::shared_ptr<term>> terms;
-    terms.reserve(_in_values.size());
-    for (auto&& value : _in_values) {
-        terms.push_back(value->prepare(db, keyspace, value_spec));
-    }
-    return column_condition::in_condition(receiver, collection_element_term, {}, std::move(terms));
 }

-} // end of namespace cql3
+}
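
The NULL-handling rules described in the removed `applies_to()` comment (NULL is treated as an ordinary value, with no three-valued logic) can be illustrated with a minimal standalone sketch. It is not part of this patch: `op` and `compare_nullable` are hypothetical names, and `std::optional<int>` merely stands in for a cell or parameter that may be NULL.

```cpp
#include <optional>
#include <stdexcept>

// Hypothetical stand-in for operator_type; not the Scylla class.
enum class op { EQ, NEQ, LT, LTE, GT, GTE };

// Illustrative helper (assumption, not Scylla code): evaluates "cell <op> param"
// where either side may be NULL, using two-valued logic.
bool compare_nullable(op o, const std::optional<int>& cell, const std::optional<int>& param) {
    if (!param) {
        // Comparing with NULL is only meaningful for = and !=.
        if (o == op::EQ) {
            return !cell.has_value();
        }
        if (o == op::NEQ) {
            return cell.has_value();
        }
        throw std::invalid_argument("Invalid comparison with null");
    }
    if (!cell) {
        // The parameter is not NULL, so only != can hold.
        return o == op::NEQ;
    }
    switch (o) {
    case op::EQ:  return *cell == *param;
    case op::NEQ: return *cell != *param;
    case op::LT:  return *cell <  *param;
    case op::LTE: return *cell <= *param;
    case op::GT:  return *cell >  *param;
    case op::GTE: return *cell >= *param;
    }
    return false; // unreachable for valid enum values
}
```

Under these assumptions, `IF a = NULL` holds for a missing cell, while `IF a < 3` on a missing cell is simply false rather than UNKNOWN.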
--- a/cql3/column_condition.hh
+++ b/cql3/column_condition.hh
@@ -52,18 +52,11 @@ namespace cql3 {
 */
 class column_condition final {
 public:
-    // If _collection_element is not zero, this defines the receiver cell, not the entire receiver
-    // column.
-    // E.g. if column type is list<string> and expression is "a = ['test']", then the type of the
-    // column definition below is list<string>. If expression is "a[0] = 'test'", then the column
-    // object stands for the string cell. See column_condition::raw::prepare() for details.
    const column_definition& column;
 private:
    // For collection, when testing the equality of a specific element, nullptr otherwise.
    ::shared_ptr<term> _collection_element;
-    // A literal value for comparison predicates or a multi item terminal for "a IN ?"
    ::shared_ptr<term> _value;
-    // List of terminals for "a IN (value, value, ...)"
    std::vector<::shared_ptr<term>> _in_values;
    const operator_type& _op;
 public:
@@ -79,42 +72,635 @@ public:
            assert(_in_values.empty());
        }
    }
-    /**
-     * Collects the column specification for the bind variables of this operation.
-     *
-     * @param boundNames the list of column specification where to collect the
-     * bind variables of this term in.
-     */
-    void collect_marker_specificaton(lw_shared_ptr<variable_specifications> bound_names);

-    bool uses_function(const sstring& ks_name, const sstring& function_name) const;
+    static ::shared_ptr<column_condition> condition(const column_definition& def, ::shared_ptr<term> value, const operator_type& op) {
+        return ::make_shared<column_condition>(def, ::shared_ptr<term>{}, std::move(value), std::vector<::shared_ptr<term>>{}, op);
+    }

-    // Retrieve parameter marker values, if any, find the appropriate collection
-    // element if the cell is a collection and an element access is used in the expression,
-    // and evaluate the condition.
-    bool applies_to(const data_value* cell_value, const query_options& options) const;
-
-    // Helper constructor wrapper for "IF col['key'] = 'foo'" or "IF col = 'foo'"
    static ::shared_ptr<column_condition> condition(const column_definition& def, ::shared_ptr<term> collection_element,
            ::shared_ptr<term> value, const operator_type& op) {
        return ::make_shared<column_condition>(def, std::move(collection_element), std::move(value),
            std::vector<::shared_ptr<term>>{}, op);
    }

-    // Helper constructor wrapper for "IF col IN ..." and "IF col['key'] IN ..."
-    static ::shared_ptr<column_condition> in_condition(const column_definition& def, ::shared_ptr<term> collection_element,
-            ::shared_ptr<term> in_marker, std::vector<::shared_ptr<term>> in_values) {
-        return ::make_shared<column_condition>(def, std::move(collection_element), std::move(in_marker),
+    static ::shared_ptr<column_condition> in_condition(const column_definition& def, std::vector<::shared_ptr<term>> in_values) {
+        return ::make_shared<column_condition>(def, ::shared_ptr<term>{}, ::shared_ptr<term>{},
            std::move(in_values), operator_type::IN);
    }

+    static ::shared_ptr<column_condition> in_condition(const column_definition& def, ::shared_ptr<term> collection_element,
+            std::vector<::shared_ptr<term>> in_values) {
+        return ::make_shared<column_condition>(def, std::move(collection_element), ::shared_ptr<term>{},
+            std::move(in_values), operator_type::IN);
+    }
+
+    static ::shared_ptr<column_condition> in_condition(const column_definition& def, ::shared_ptr<term> in_marker) {
+        return ::make_shared<column_condition>(def, ::shared_ptr<term>{}, std::move(in_marker),
+            std::vector<::shared_ptr<term>>{}, operator_type::IN);
+    }
+
+    static ::shared_ptr<column_condition> in_condition(const column_definition& def, ::shared_ptr<term> collection_element,
+        ::shared_ptr<term> in_marker) {
+        return ::make_shared<column_condition>(def, std::move(collection_element), std::move(in_marker),
+            std::vector<::shared_ptr<term>>{}, operator_type::IN);
+    }
+
+    bool uses_function(const sstring& ks_name, const sstring& function_name);
+public:
+    /**
+     * Collects the column specification for the bind variables of this operation.
+     *
+     * @param boundNames the list of column specification where to collect the
+     * bind variables of this term in.
+     */
+    void collect_marker_specificaton(::shared_ptr<variable_specifications> bound_names);
+
+#if 0
+    public ColumnCondition.Bound bind(QueryOptions options) throws InvalidRequestException
+    {
+        boolean isInCondition = operator == Operator.IN;
+        if (column.type instanceof CollectionType)
+        {
+            if (collectionElement == null)
+                return isInCondition ? new CollectionInBound(this, options) : new CollectionBound(this, options);
+            else
+                return isInCondition ? new ElementAccessInBound(this, options) : new ElementAccessBound(this, options);
+        }
+        return isInCondition ? new SimpleInBound(this, options) : new SimpleBound(this, options);
+    }
+
+    public static abstract class Bound
+    {
+        public final ColumnDefinition column;
+        public final Operator operator;
+
+        protected Bound(ColumnDefinition column, Operator operator)
+        {
+            this.column = column;
+            this.operator = operator;
+        }
+
+        /**
+         * Validates whether this condition applies to {@code current}.
+         */
+        public abstract boolean appliesTo(Composite rowPrefix, ColumnFamily current, long now) throws InvalidRequestException;
+
+        public ByteBuffer getCollectionElementValue()
+        {
+            return null;
+        }
+
+        protected boolean isSatisfiedByValue(ByteBuffer value, Cell c, AbstractType<?> type, Operator operator, long now) throws InvalidRequestException
+        {
+            ByteBuffer columnValue = (c == null || !c.isLive(now)) ? null : c.value();
+            return compareWithOperator(operator, type, value, columnValue);
+        }
+
+        /** Returns true if the operator is satisfied (i.e. "value operator otherValue == true"), false otherwise. */
+        protected boolean compareWithOperator(Operator operator, AbstractType<?> type, ByteBuffer value, ByteBuffer otherValue) throws InvalidRequestException
+        {
+            if (value == null)
+            {
+                switch (operator)
+                {
+                    case EQ:
+                        return otherValue == null;
+                    case NEQ:
+                        return otherValue != null;
+                    default:
+                        throw new InvalidRequestException(String.format("Invalid comparison with null for operator \"%s\"", operator));
+                }
+            }
+            else if (otherValue == null)
+            {
+                // the condition value is not null, so only NEQ can return true
+                return operator == Operator.NEQ;
+            }
+            int comparison = type.compare(otherValue, value);
+            switch (operator)
+            {
+                case EQ:
+                    return comparison == 0;
+                case LT:
+                    return comparison < 0;
+                case LTE:
+                    return comparison <= 0;
+                case GT:
+                    return comparison > 0;
+                case GTE:
+                    return comparison >= 0;
+                case NEQ:
+                    return comparison != 0;
+                default:
+                    // we shouldn't get IN, CONTAINS, or CONTAINS KEY here
+                    throw new AssertionError();
+            }
+        }
+
+        protected Iterator<Cell> collectionColumns(CellName collection, ColumnFamily cf, final long now)
+        {
+            // We are testing for collection equality, so we need to have the expected values *and* only those.
+            ColumnSlice[] collectionSlice = new ColumnSlice[]{ collection.slice() };
+            // Filter live columns, this makes things simpler afterwards
+            return Iterators.filter(cf.iterator(collectionSlice), new Predicate<Cell>()
+            {
+                public boolean apply(Cell c)
+                {
+                    // we only care about live columns
+                    return c.isLive(now);
+                }
+            });
+        }
+    }
+
+    /**
+     * A condition on a single non-collection column. This does not support IN operators (see SimpleInBound).
+     */
+    static class SimpleBound extends Bound
+    {
+        public final ByteBuffer value;
+
+        private SimpleBound(ColumnCondition condition, QueryOptions options) throws InvalidRequestException
+        {
+            super(condition.column, condition.operator);
+            assert !(column.type instanceof CollectionType) && condition.collectionElement == null;
+            assert condition.operator != Operator.IN;
+            this.value = condition.value.bindAndGet(options);
+        }
+
+        public boolean appliesTo(Composite rowPrefix, ColumnFamily current, long now) throws InvalidRequestException
+        {
+            CellName name = current.metadata().comparator.create(rowPrefix, column);
+            return isSatisfiedByValue(value, current.getColumn(name), column.type, operator, now);
+        }
+    }
+
+    /**
+     * An IN condition on a single non-collection column.
+     */
+    static class SimpleInBound extends Bound
+    {
+        public final List<ByteBuffer> inValues;
+
+        private SimpleInBound(ColumnCondition condition, QueryOptions options) throws InvalidRequestException
+        {
+            super(condition.column, condition.operator);
+            assert !(column.type instanceof CollectionType) && condition.collectionElement == null;
+            assert condition.operator == Operator.IN;
+            if (condition.inValues == null)
+                this.inValues = ((Lists.Marker) condition.value).bind(options).getElements();
+            else
+            {
+                this.inValues = new ArrayList<>(condition.inValues.size());
+                for (Term value : condition.inValues)
+                    this.inValues.add(value.bindAndGet(options));
+            }
+        }
+
+        public boolean appliesTo(Composite rowPrefix, ColumnFamily current, long now) throws InvalidRequestException
+        {
+            CellName name = current.metadata().comparator.create(rowPrefix, column);
+            for (ByteBuffer value : inValues)
+            {
+                if (isSatisfiedByValue(value, current.getColumn(name), column.type, Operator.EQ, now))
+                    return true;
+            }
+            return false;
+        }
+    }
+
+    /** A condition on an element of a collection column. IN operators are not supported here, see ElementAccessInBound. */
+    static class ElementAccessBound extends Bound
+    {
+        public final ByteBuffer collectionElement;
+        public final ByteBuffer value;
+
+        private ElementAccessBound(ColumnCondition condition, QueryOptions options) throws InvalidRequestException
+        {
+            super(condition.column, condition.operator);
+            assert column.type instanceof CollectionType && condition.collectionElement != null;
+            assert condition.operator != Operator.IN;
+            this.collectionElement = condition.collectionElement.bindAndGet(options);
+            this.value = condition.value.bindAndGet(options);
+        }
+
+        public boolean appliesTo(Composite rowPrefix, ColumnFamily current, final long now) throws InvalidRequestException
+        {
+            if (collectionElement == null)
+                throw new InvalidRequestException("Invalid null value for " + (column.type instanceof MapType ? "map" : "list") + " element access");
+
+            if (column.type instanceof MapType)
+            {
+                MapType mapType = (MapType) column.type;
+                if (column.type.isMultiCell())
+                {
+                    Cell cell = current.getColumn(current.metadata().comparator.create(rowPrefix, column, collectionElement));
+                    return isSatisfiedByValue(value, cell, mapType.getValuesType(), operator, now);
+                }
+                else
+                {
+                    Cell cell = current.getColumn(current.metadata().comparator.create(rowPrefix, column));
+                    ByteBuffer mapElementValue = cell.isLive(now) ? mapType.getSerializer().getSerializedValue(cell.value(), collectionElement, mapType.getKeysType())
+                                                                  : null;
+                    return compareWithOperator(operator, mapType.getValuesType(), value, mapElementValue);
+                }
+            }
+
+            // sets don't have element access, so it's a list
+            ListType listType = (ListType) column.type;
+            if (column.type.isMultiCell())
+            {
+                ByteBuffer columnValue = getListItem(
+                        collectionColumns(current.metadata().comparator.create(rowPrefix, column), current, now),
+                        getListIndex(collectionElement));
+                return compareWithOperator(operator, listType.getElementsType(), value, columnValue);
+            }
+            else
+            {
+                Cell cell = current.getColumn(current.metadata().comparator.create(rowPrefix, column));
+                ByteBuffer listElementValue = cell.isLive(now) ? listType.getSerializer().getElement(cell.value(), getListIndex(collectionElement))
+                                                               : null;
+                return compareWithOperator(operator, listType.getElementsType(), value, listElementValue);
+            }
+        }
+
+        static int getListIndex(ByteBuffer collectionElement) throws InvalidRequestException
+        {
+            int idx = ByteBufferUtil.toInt(collectionElement);
+            if (idx < 0)
+                throw new InvalidRequestException(String.format("Invalid negative list index %d", idx));
+            return idx;
+        }
+
+        static ByteBuffer getListItem(Iterator<Cell> iter, int index)
+        {
+            int adv = Iterators.advance(iter, index);
+            if (adv == index && iter.hasNext())
+                return iter.next().value();
+            else
+                return null;
+        }
+
+        public ByteBuffer getCollectionElementValue()
+        {
+            return collectionElement;
+        }
+    }
+
+    static class ElementAccessInBound extends Bound
+    {
+        public final ByteBuffer collectionElement;
+        public final List<ByteBuffer> inValues;
+
+        private ElementAccessInBound(ColumnCondition condition, QueryOptions options) throws InvalidRequestException
+        {
+            super(condition.column, condition.operator);
+            assert column.type instanceof CollectionType && condition.collectionElement != null;
+            this.collectionElement = condition.collectionElement.bindAndGet(options);
+
+            if (condition.inValues == null)
+                this.inValues = ((Lists.Marker) condition.value).bind(options).getElements();
+            else
+            {
+                this.inValues = new ArrayList<>(condition.inValues.size());
+                for (Term value : condition.inValues)
+                    this.inValues.add(value.bindAndGet(options));
+            }
+        }
+
+        public boolean appliesTo(Composite rowPrefix, ColumnFamily current, final long now) throws InvalidRequestException
+        {
+            if (collectionElement == null)
+                throw new InvalidRequestException("Invalid null value for " + (column.type instanceof MapType ? "map" : "list") + " element access");
+
+            CellNameType nameType = current.metadata().comparator;
+            if (column.type instanceof MapType)
+            {
+                MapType mapType = (MapType) column.type;
+                AbstractType<?> valueType = mapType.getValuesType();
+                if (column.type.isMultiCell())
+                {
+                    CellName name = nameType.create(rowPrefix, column, collectionElement);
+                    Cell item = current.getColumn(name);
+                    for (ByteBuffer value : inValues)
+                    {
+                        if (isSatisfiedByValue(value, item, valueType, Operator.EQ, now))
+                            return true;
+                    }
+                    return false;
+                }
+                else
+                {
+                    Cell cell = current.getColumn(nameType.create(rowPrefix, column));
+                    ByteBuffer mapElementValue  = null;
+                    if (cell != null && cell.isLive(now))
+                        mapElementValue =  mapType.getSerializer().getSerializedValue(cell.value(), collectionElement, mapType.getKeysType());
+                    for (ByteBuffer value : inValues)
+                    {
+                        if (value == null)
+                        {
+                            if (mapElementValue == null)
+                                return true;
+                            continue;
+                        }
+                        if (valueType.compare(value, mapElementValue) == 0)
+                            return true;
+                    }
+                    return false;
+                }
+            }
+
+            ListType listType = (ListType) column.type;
+            AbstractType<?> elementsType = listType.getElementsType();
+            if (column.type.isMultiCell())
+            {
+                ByteBuffer columnValue = ElementAccessBound.getListItem(
+                        collectionColumns(nameType.create(rowPrefix, column), current, now),
+                        ElementAccessBound.getListIndex(collectionElement));
+
+                for (ByteBuffer value : inValues)
+                {
+                    if (compareWithOperator(Operator.EQ, elementsType, value, columnValue))
+                        return true;
+                }
+            }
+            else
+            {
+                Cell cell = current.getColumn(nameType.create(rowPrefix, column));
+                ByteBuffer listElementValue = null;
+                if (cell != null && cell.isLive(now))
+                    listElementValue = listType.getSerializer().getElement(cell.value(), ElementAccessBound.getListIndex(collectionElement));
+
+                for (ByteBuffer value : inValues)
+                {
+                    if (value == null)
+                    {
+                        if (listElementValue == null)
+                            return true;
+                        continue;
+                    }
+                    if (elementsType.compare(value, listElementValue) == 0)
+                        return true;
+                }
+            }
+            return false;
+        }
+    }
+
+    /** A condition on an entire collection column. IN operators are not supported here, see CollectionInBound. */
+    static class CollectionBound extends Bound
+    {
+        private final Term.Terminal value;
+
+        private CollectionBound(ColumnCondition condition, QueryOptions options) throws InvalidRequestException
+        {
+            super(condition.column, condition.operator);
+            assert column.type.isCollection() && condition.collectionElement == null;
+            assert condition.operator != Operator.IN;
+            this.value = condition.value.bind(options);
+        }
+
+        public boolean appliesTo(Composite rowPrefix, ColumnFamily current, final long now) throws InvalidRequestException
+        {
+            CollectionType type = (CollectionType)column.type;
+
+            if (type.isMultiCell())
+            {
+                Iterator<Cell> iter = collectionColumns(current.metadata().comparator.create(rowPrefix, column), current, now);
+                if (value == null)
+                {
+                    if (operator == Operator.EQ)
+                        return !iter.hasNext();
+                    else if (operator == Operator.NEQ)
+                        return iter.hasNext();
+                    else
+                        throw new InvalidRequestException(String.format("Invalid comparison with null for operator \"%s\"", operator));
+                }
+
+                return valueAppliesTo(type, iter, value, operator);
+            }
+
+            // frozen collections
+            Cell cell = current.getColumn(current.metadata().comparator.create(rowPrefix, column));
+            if (value == null)
+            {
+                if (operator == Operator.EQ)
+                    return cell == null || !cell.isLive(now);
+                else if (operator == Operator.NEQ)
+                    return cell != null && cell.isLive(now);
+                else
+                    throw new InvalidRequestException(String.format("Invalid comparison with null for operator \"%s\"", operator));
+            }
+
+            // make sure we use v3 serialization format for comparison
+            ByteBuffer conditionValue;
+            if (type.kind == CollectionType.Kind.LIST)
+                conditionValue = ((Lists.Value) value).getWithProtocolVersion(Server.VERSION_3);
+            else if (type.kind == CollectionType.Kind.SET)
+                conditionValue = ((Sets.Value) value).getWithProtocolVersion(Server.VERSION_3);
+            else
+                conditionValue = ((Maps.Value) value).getWithProtocolVersion(Server.VERSION_3);
+
+            return compareWithOperator(operator, type, conditionValue, cell.value());
+        }
+
+        static boolean valueAppliesTo(CollectionType type, Iterator<Cell> iter, Term.Terminal value, Operator operator)
+        {
+            if (value == null)
+                return !iter.hasNext();
+
+            switch (type.kind)
+            {
+                case LIST: return listAppliesTo((ListType)type, iter, ((Lists.Value)value).elements, operator);
+                case SET: return setAppliesTo((SetType)type, iter, ((Sets.Value)value).elements, operator);
+                case MAP: return mapAppliesTo((MapType)type, iter, ((Maps.Value)value).map, operator);
+            }
+            throw new AssertionError();
+        }
+
+        private static boolean setOrListAppliesTo(AbstractType<?> type, Iterator<Cell> iter, Iterator<ByteBuffer> conditionIter, Operator operator, boolean isSet)
+        {
+            while(iter.hasNext())
+            {
+                if (!conditionIter.hasNext())
+                    return (operator == Operator.GT) || (operator == Operator.GTE) || (operator == Operator.NEQ);
+
+                // for lists we use the cell value; for sets we use the cell name
+                ByteBuffer cellValue = isSet? iter.next().name().collectionElement() : iter.next().value();
+                int comparison = type.compare(cellValue, conditionIter.next());
+                if (comparison != 0)
+                    return evaluateComparisonWithOperator(comparison, operator);
+            }
+
+            if (conditionIter.hasNext())
+                return (operator == Operator.LT) || (operator == Operator.LTE) || (operator == Operator.NEQ);
+
+            // they're equal
+            return operator == Operator.EQ || operator == Operator.LTE || operator == Operator.GTE;
+        }
+
+        private static boolean evaluateComparisonWithOperator(int comparison, Operator operator)
+        {
+            // called when comparison != 0
+            switch (operator)
+            {
+                case EQ:
+                    return false;
+                case LT:
+                case LTE:
+                    return comparison < 0;
+                case GT:
+                case GTE:
+                    return comparison > 0;
+                case NEQ:
+                    return true;
+                default:
+                    throw new AssertionError();
+            }
+        }
+
+        static boolean listAppliesTo(ListType type, Iterator<Cell> iter, List<ByteBuffer> elements, Operator operator)
+        {
+            return setOrListAppliesTo(type.getElementsType(), iter, elements.iterator(), operator, false);
+        }
+
+        static boolean setAppliesTo(SetType type, Iterator<Cell> iter, Set<ByteBuffer> elements, Operator operator)
+        {
+            ArrayList<ByteBuffer> sortedElements = new ArrayList<>(elements.size());
+            sortedElements.addAll(elements);
+            Collections.sort(sortedElements, type.getElementsType());
+            return setOrListAppliesTo(type.getElementsType(), iter, sortedElements.iterator(), operator, true);
+        }
+
+        static boolean mapAppliesTo(MapType type, Iterator<Cell> iter, Map<ByteBuffer, ByteBuffer> elements, Operator operator)
+        {
+            Iterator<Map.Entry<ByteBuffer, ByteBuffer>> conditionIter = elements.entrySet().iterator();
+            while(iter.hasNext())
+            {
+                if (!conditionIter.hasNext())
+                    return (operator == Operator.GT) || (operator == Operator.GTE) || (operator == Operator.NEQ);
+
+                Map.Entry<ByteBuffer, ByteBuffer> conditionEntry = conditionIter.next();
+                Cell c = iter.next();
+
+                // compare the keys
+                int comparison = type.getKeysType().compare(c.name().collectionElement(), conditionEntry.getKey());
+                if (comparison != 0)
+                    return evaluateComparisonWithOperator(comparison, operator);
+
+                // compare the values
+                comparison = type.getValuesType().compare(c.value(), conditionEntry.getValue());
+                if (comparison != 0)
+                    return evaluateComparisonWithOperator(comparison, operator);
+            }
+
+            if (conditionIter.hasNext())
+                return (operator == Operator.LT) || (operator == Operator.LTE) || (operator == Operator.NEQ);
+
+            // they're equal
+            return operator == Operator.EQ || operator == Operator.LTE || operator == Operator.GTE;
+        }
+    }
+
+    public static class CollectionInBound extends Bound
+    {
+        private final List<Term.Terminal> inValues;
+
+        private CollectionInBound(ColumnCondition condition, QueryOptions options) throws InvalidRequestException
+        {
+            super(condition.column, condition.operator);
+            assert column.type instanceof CollectionType && condition.collectionElement == null;
+            assert condition.operator == Operator.IN;
+            inValues = new ArrayList<>();
+            if (condition.inValues == null)
+            {
+                // We have a list of serialized collections that need to be deserialized for later comparisons
+                CollectionType collectionType = (CollectionType) column.type;
+                Lists.Marker inValuesMarker = (Lists.Marker) condition.value;
+                if (column.type instanceof ListType)
+                {
+                    ListType deserializer = ListType.getInstance(collectionType.valueComparator(), false);
+                    for (ByteBuffer buffer : inValuesMarker.bind(options).elements)
+                    {
+                        if (buffer == null)
+                            this.inValues.add(null);
+                        else
+                            this.inValues.add(Lists.Value.fromSerialized(buffer, deserializer, options.getProtocolVersion()));
+                    }
+                }
+                else if (column.type instanceof MapType)
+                {
+                    MapType deserializer = MapType.getInstance(collectionType.nameComparator(), collectionType.valueComparator(), false);
+                    for (ByteBuffer buffer : inValuesMarker.bind(options).elements)
+                    {
+                        if (buffer == null)
+                            this.inValues.add(null);
+                        else
+                            this.inValues.add(Maps.Value.fromSerialized(buffer, deserializer, options.getProtocolVersion()));
+                    }
+                }
+                else if (column.type instanceof SetType)
+                {
+                    SetType deserializer = SetType.getInstance(collectionType.valueComparator(), false);
+                    for (ByteBuffer buffer : inValuesMarker.bind(options).elements)
+                    {
+                        if (buffer == null)
+                            this.inValues.add(null);
+                        else
+                            this.inValues.add(Sets.Value.fromSerialized(buffer, deserializer, options.getProtocolVersion()));
+                    }
+                }
+            }
+            else
+            {
+                for (Term value : condition.inValues)
+                    this.inValues.add(value.bind(options));
+            }
+        }
+
+        public boolean appliesTo(Composite rowPrefix, ColumnFamily current, final long now) throws InvalidRequestException
+        {
+            CollectionType type = (CollectionType)column.type;
+            CellName name = current.metadata().comparator.create(rowPrefix, column);
+            if (type.isMultiCell())
+            {
+                // copy iterator contents so that we can properly reuse them for each comparison with an IN value
+                List<Cell> cells = newArrayList(collectionColumns(name, current, now));
+                for (Term.Terminal value : inValues)
+                {
+                    if (CollectionBound.valueAppliesTo(type, cells.iterator(), value, Operator.EQ))
+                        return true;
+                }
+                return false;
+            }
+            else
+            {
+                Cell cell = current.getColumn(name);
+                for (Term.Terminal value : inValues)
+                {
+                    if (value == null)
+                    {
+                        if (cell == null || !cell.isLive(now))
+                            return true;
+                    }
+                    else if (type.compare(((Term.CollectionTerminal)value).getWithProtocolVersion(Server.VERSION_3), cell.value()) == 0)
+                    {
+                        return true;
+                    }
+                }
+                return false;
+            }
+        }
+    }
+#endif
+
    class raw final {
    private:
        ::shared_ptr<term::raw> _value;
        std::vector<::shared_ptr<term::raw>> _in_values;
        ::shared_ptr<abstract_marker::in_raw> _in_marker;

-        // Can be nullptr, used with the syntax "IF m[e] = ..." (in which case it's 'e')
+        // Can be nullptr, only used with the syntax "IF m[e] = ..." (in which case it's 'e')
        ::shared_ptr<term::raw> _collection_element;
        const operator_type& _op;
    public:
@@ -130,29 +716,46 @@ public:
                , _op(op)
        { }

-        /** A condition on a column or collection element. For example: "IF col['key'] = 'foo'" or "IF col = 'foo'" */
-        static ::shared_ptr<raw> simple_condition(::shared_ptr<term::raw> value, ::shared_ptr<term::raw> collection_element,
-                const operator_type& op) {
+        /** A condition on a column. For example: "IF col = 'foo'" */
+        static ::shared_ptr<raw> simple_condition(::shared_ptr<term::raw> value, const operator_type& op) {
            return ::make_shared<raw>(std::move(value), std::vector<::shared_ptr<term::raw>>{},
-                    ::shared_ptr<abstract_marker::in_raw>{}, std::move(collection_element), op);
+                ::shared_ptr<abstract_marker::in_raw>{}, ::shared_ptr<term::raw>{}, op);
        }

-        /**
-         * An IN condition on a column or a collection element. IN may contain a list of values or a single marker.
-         * For example:
-         * "IF col IN ('foo', 'bar', ...)"
-         * "IF col IN ?"
-         * "IF col['key'] IN * ('foo', 'bar', ...)"
-         * "IF col['key'] IN ?"
-         */
-        static ::shared_ptr<raw> in_condition(::shared_ptr<term::raw> collection_element,
-                ::shared_ptr<abstract_marker::in_raw> in_marker, std::vector<::shared_ptr<term::raw>> in_values) {
-            return ::make_shared<raw>(::shared_ptr<term::raw>{}, std::move(in_values), std::move(in_marker),
-                    std::move(collection_element), operator_type::IN);
+        /** An IN condition on a column. For example: "IF col IN ('foo', 'bar', ...)" */
+        static ::shared_ptr<raw> simple_in_condition(std::vector<::shared_ptr<term::raw>> in_values) {
+            return ::make_shared<raw>(::shared_ptr<term::raw>{}, std::move(in_values),
+                ::shared_ptr<abstract_marker::in_raw>{}, ::shared_ptr<term::raw>{}, operator_type::IN);
+        }
+
+        /** An IN condition on a column with a single marker. For example: "IF col IN ?" */
+        static ::shared_ptr<raw> simple_in_condition(::shared_ptr<abstract_marker::in_raw> in_marker) {
+            return ::make_shared<raw>(::shared_ptr<term::raw>{}, std::vector<::shared_ptr<term::raw>>{},
+                std::move(in_marker), ::shared_ptr<term::raw>{}, operator_type::IN);
+        }
+
+        /** A condition on a collection element. For example: "IF col['key'] = 'foo'" */
+        static ::shared_ptr<raw> collection_condition(::shared_ptr<term::raw> value, ::shared_ptr<term::raw> collection_element,
+                const operator_type& op) {
+            return ::make_shared<raw>(std::move(value), std::vector<::shared_ptr<term::raw>>{}, ::shared_ptr<abstract_marker::in_raw>{}, std::move(collection_element), op);
+        }
+
+        /** An IN condition on a collection element. For example: "IF col['key'] IN ('foo', 'bar', ...)" */
+        static ::shared_ptr<raw> collection_in_condition(::shared_ptr<term::raw> collection_element,
+                std::vector<::shared_ptr<term::raw>> in_values) {
+            return ::make_shared<raw>(::shared_ptr<term::raw>{}, std::move(in_values), ::shared_ptr<abstract_marker::in_raw>{},
+                std::move(collection_element), operator_type::IN);
+        }
+
+        /** An IN condition on a collection element with a single marker. For example: "IF col['key'] IN ?" */
+        static ::shared_ptr<raw> collection_in_condition(::shared_ptr<term::raw> collection_element,
+                ::shared_ptr<abstract_marker::in_raw> in_marker) {
+            return ::make_shared<raw>(::shared_ptr<term::raw>{}, std::vector<::shared_ptr<term::raw>>{}, std::move(in_marker),
+                std::move(collection_element), operator_type::IN);
        }

        ::shared_ptr<column_condition> prepare(database& db, const sstring& keyspace, const column_definition& receiver);
    };
 };

-} // end of namespace cql3
+}
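Review note: the commented-out Java reference in this header (`setAppliesTo` feeding `setOrListAppliesTo`) sorts the condition's set elements before walking them in lockstep with the stored cells, so two sets with the same members compare equal whatever order the condition listed them in. Below is a minimal C++ sketch of that idea. It assumes plain `std::string` elements with lexicographic ordering; `op`, `evaluate_nonzero`, and `set_applies_to` are hypothetical names for illustration, not part of the codebase.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical operator enum standing in for the comparison operators
// used by the Java reference (EQ, NEQ, LT, LTE, GT, GTE).
enum class op { EQ, NEQ, LT, LTE, GT, GTE };

// Decide the result once two elements are known to differ (cmp != 0).
bool evaluate_nonzero(int cmp, op o) {
    switch (o) {
    case op::EQ:  return false;
    case op::NEQ: return true;
    case op::LT:
    case op::LTE: return cmp < 0;
    case op::GT:
    case op::GTE: return cmp > 0;
    }
    return false;
}

// stored_sorted: elements of the stored set, already in sorted order.
// condition: elements of the condition value, in arbitrary order
// (taken by value so it can be sorted locally).
bool set_applies_to(const std::vector<std::string>& stored_sorted,
                    std::vector<std::string> condition, op o) {
    // Sort the condition side so both sequences use the same order.
    std::sort(condition.begin(), condition.end());
    auto c = condition.begin();
    for (const auto& s : stored_sorted) {
        if (c == condition.end()) {
            // The stored set has extra elements: it is "greater".
            return o == op::GT || o == op::GTE || o == op::NEQ;
        }
        int cmp = s.compare(*c++);
        if (cmp != 0) {
            return evaluate_nonzero(cmp, o);
        }
    }
    if (c != condition.end()) {
        // The condition has extra elements: the stored set is "less".
        return o == op::LT || o == op::LTE || o == op::NEQ;
    }
    // Same elements in the same sorted order: the sets are equal.
    return o == op::EQ || o == op::LTE || o == op::GTE;
}
```

With this shape, an equality check does not depend on how either side happened to be serialized, only on the sorted element sequences.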
--- a/cql3/constants.cc
+++ b/cql3/constants.cc
@@ -85,7 +85,7 @@ assignment_testable::test_result
 constants::literal::test_assignment(database& db, const sstring& keyspace, ::shared_ptr<column_specification> receiver)
 {
    auto receiver_type = receiver->type->as_cql3_type();
-    if (receiver_type.is_collection() || receiver_type.is_user_type()) {
+    if (receiver_type.is_collection()) {
        return test_result::NOT_ASSIGNABLE;
    }
    if (!receiver_type.is_native()) {
@@ -166,10 +166,10 @@ constants::literal::prepare(database& db, const sstring& keyspace, ::shared_ptr<

 void constants::deleter::execute(mutation& m, const clustering_key_prefix& prefix, const update_parameters& params) {
    if (column.type->is_multi_cell()) {
-        collection_mutation_description coll_m;
+        collection_type_impl::mutation coll_m;
        coll_m.tomb = params.make_tombstone();
-
-        m.set_cell(prefix, column, coll_m.serialize(*column.type));
+        auto ctype = static_pointer_cast<const collection_type_impl>(column.type);
+        m.set_cell(prefix, column, atomic_cell_or_collection::from_collection_mutation(ctype->serialize_mutation_form(coll_m)));
    } else {
        m.set_cell(prefix, column, make_dead_cell(params));
    }
--- a/cql3/constants.hh
+++ b/cql3/constants.hh
@@ -173,7 +173,7 @@ public:
        marker(int32_t bind_index, ::shared_ptr<column_specification> receiver)
            : abstract_marker{bind_index, std::move(receiver)}
        {
-            assert(!_receiver->type->is_collection() && !_receiver->type->is_user_type());
+            assert(!_receiver->type->is_collection());
        }

        virtual cql3::raw_value_view bind_and_get(const query_options& options) override {
--- a/cql3/cql3_type.cc
+++ b/cql3/cql3_type.cc
@@ -31,44 +31,9 @@
 #include "types/map.hh"
 #include "types/set.hh"
 #include "types/list.hh"
-#include "concrete_types.hh"

 namespace cql3 {

-static cql3_type::kind get_cql3_kind(const abstract_type& t) {
-    struct visitor {
-        cql3_type::kind operator()(const ascii_type_impl&) { return cql3_type::kind::ASCII; }
-        cql3_type::kind operator()(const byte_type_impl&) { return cql3_type::kind::TINYINT; }
-        cql3_type::kind operator()(const bytes_type_impl&) { return cql3_type::kind::BLOB; }
-        cql3_type::kind operator()(const boolean_type_impl&) { return cql3_type::kind::BOOLEAN; }
-        cql3_type::kind operator()(const counter_type_impl&) { return cql3_type::kind::COUNTER; }
-        cql3_type::kind operator()(const decimal_type_impl&) { return cql3_type::kind::DECIMAL; }
-        cql3_type::kind operator()(const double_type_impl&) { return cql3_type::kind::DOUBLE; }
-        cql3_type::kind operator()(const duration_type_impl&) { return cql3_type::kind::DURATION; }
-        cql3_type::kind operator()(const empty_type_impl&) { return cql3_type::kind::EMPTY; }
-        cql3_type::kind operator()(const float_type_impl&) { return cql3_type::kind::FLOAT; }
-        cql3_type::kind operator()(const inet_addr_type_impl&) { return cql3_type::kind::INET; }
-        cql3_type::kind operator()(const int32_type_impl&) { return cql3_type::kind::INT; }
-        cql3_type::kind operator()(const long_type_impl&) { return cql3_type::kind::BIGINT; }
-        cql3_type::kind operator()(const short_type_impl&) { return cql3_type::kind::SMALLINT; }
-        cql3_type::kind operator()(const simple_date_type_impl&) { return cql3_type::kind::DATE; }
-        cql3_type::kind operator()(const utf8_type_impl&) { return cql3_type::kind::TEXT; }
-        cql3_type::kind operator()(const time_type_impl&) { return cql3_type::kind::TIME; }
-        cql3_type::kind operator()(const timestamp_date_base_class&) { return cql3_type::kind::TIMESTAMP; }
-        cql3_type::kind operator()(const timeuuid_type_impl&) { return cql3_type::kind::TIMEUUID; }
-        cql3_type::kind operator()(const uuid_type_impl&) { return cql3_type::kind::UUID; }
-        cql3_type::kind operator()(const varint_type_impl&) { return cql3_type::kind::VARINT; }
-        cql3_type::kind operator()(const reversed_type_impl& r) { return get_cql3_kind(*r.underlying_type()); }
-        cql3_type::kind operator()(const tuple_type_impl&) { assert(0 && "no kind for this type"); }
-        cql3_type::kind operator()(const collection_type_impl&) { assert(0 && "no kind for this type"); }
-    };
-    return visit(t, visitor{});
-}
-
-cql3_type::kind_enum_set::prepared cql3_type::get_kind() const {
-    return kind_enum_set::prepare(get_cql3_kind(*_type));
-}
-
 cql3_type cql3_type::raw::prepare(database& db, const sstring& keyspace) {
    try {
        auto&& ks = db.find_keyspace(keyspace);
@@ -89,10 +54,6 @@ bool cql3_type::raw::references_user_type(const sstring& name) const {
 class cql3_type::raw_type : public raw {
 private:
    cql3_type _type;
-
-    virtual sstring to_string() const override {
-        return _type.to_string();
-    }
 public:
    raw_type(cql3_type type)
        : _type{type}
@@ -101,7 +62,7 @@ public:
    virtual cql3_type prepare(database& db, const sstring& keyspace) {
        return _type;
    }
-    cql3_type prepare_internal(const sstring&, const user_types_metadata&) override {
+    cql3_type prepare_internal(const sstring&, lw_shared_ptr<user_types_metadata>) override {
        return _type;
    }

@@ -113,6 +74,10 @@ public:
        return _type.is_counter();
    }

+    virtual sstring to_string() const {
+        return _type.to_string();
+    }
+
    virtual bool is_duration() const override {
        return _type.get_type() == duration_type;
    }
@@ -122,19 +87,6 @@ class cql3_type::raw_collection : public raw {
    const abstract_type::kind _kind;
    shared_ptr<raw> _keys;
    shared_ptr<raw> _values;
-
-    virtual sstring to_string() const override {
-        sstring start = is_frozen() ? "frozen<" : "";
-        sstring end = is_frozen() ? ">" : "";
-        if (_kind == abstract_type::kind::list) {
-            return format("{}list<{}>{}", start, _values, end);
-        } else if (_kind == abstract_type::kind::set) {
-            return format("{}set<{}>{}", start, _values, end);
-        } else if (_kind == abstract_type::kind::map) {
-            return format("{}map<{}, {}>{}", start, _keys, _values, end);
-        }
-        abort();
-    }
 public:
    raw_collection(const abstract_type::kind kind, shared_ptr<raw> keys, shared_ptr<raw> values)
            : _kind(kind), _keys(std::move(keys)), _values(std::move(values)) {
@@ -158,37 +110,35 @@ public:
        return true;
    }

-    virtual cql3_type prepare_internal(const sstring& keyspace, const user_types_metadata& user_types) override {
+    virtual cql3_type prepare_internal(const sstring& keyspace, lw_shared_ptr<user_types_metadata> user_types) override {
        assert(_values); // "Got null values type for a collection";

-        if (!is_frozen() && _values->supports_freezing() && !_values->is_frozen()) {
-            throw exceptions::invalid_request_exception(
-                    format("Non-frozen user types or collections are not allowed inside collections: {}", *this));
+        if (!_frozen && _values->supports_freezing() && !_values->_frozen) {
+            throw exceptions::invalid_request_exception(format("Non-frozen collections are not allowed inside collections: {}", *this));
        }
        if (_values->is_counter()) {
            throw exceptions::invalid_request_exception(format("Counters are not allowed inside collections: {}", *this));
        }

        if (_keys) {
-            if (!is_frozen() && _keys->supports_freezing() && !_keys->is_frozen()) {
-                throw exceptions::invalid_request_exception(
-                        format("Non-frozen user types or collections are not allowed inside collections: {}", *this));
+            if (!_frozen && _keys->supports_freezing() && !_keys->_frozen) {
+                throw exceptions::invalid_request_exception(format("Non-frozen collections are not allowed inside collections: {}", *this));
            }
        }

        if (_kind == abstract_type::kind::list) {
-            return cql3_type(list_type_impl::get_instance(_values->prepare_internal(keyspace, user_types).get_type(), !is_frozen()));
+            return cql3_type(list_type_impl::get_instance(_values->prepare_internal(keyspace, user_types).get_type(), !_frozen));
        } else if (_kind == abstract_type::kind::set) {
            if (_values->is_duration()) {
                throw exceptions::invalid_request_exception(format("Durations are not allowed inside sets: {}", *this));
            }
-            return cql3_type(set_type_impl::get_instance(_values->prepare_internal(keyspace, user_types).get_type(), !is_frozen()));
+            return cql3_type(set_type_impl::get_instance(_values->prepare_internal(keyspace, user_types).get_type(), !_frozen));
        } else if (_kind == abstract_type::kind::map) {
            assert(_keys); // "Got null keys type for a collection";
            if (_keys->is_duration()) {
                throw exceptions::invalid_request_exception(format("Durations are not allowed as map keys: {}", *this));
            }
-            return cql3_type(map_type_impl::get_instance(_keys->prepare_internal(keyspace, user_types).get_type(), _values->prepare_internal(keyspace, user_types).get_type(), !is_frozen()));
+            return cql3_type(map_type_impl::get_instance(_keys->prepare_internal(keyspace, user_types).get_type(), _values->prepare_internal(keyspace, user_types).get_type(), !_frozen));
        }
        abort();
    }
@@ -200,18 +150,23 @@ public:
    bool is_duration() const override {
        return false;
    }
+
+    virtual sstring to_string() const override {
+        sstring start = _frozen ? "frozen<" : "";
+        sstring end = _frozen ? ">" : "";
+        if (_kind == abstract_type::kind::list) {
+            return format("{}list<{}>{}", start, _values, end);
+        } else if (_kind == abstract_type::kind::set) {
+            return format("{}set<{}>{}", start, _values, end);
+        } else if (_kind == abstract_type::kind::map) {
+            return format("{}map<{}, {}>{}", start, _keys, _values, end);
+        }
+        abort();
+    }
 };

 class cql3_type::raw_ut : public raw {
    ut_name _name;
-
-    virtual sstring to_string() const override {
-        if (is_frozen()) {
-            return format("frozen<{}>", _name.to_string());
-        }
-
-        return _name.to_string();
-    }
 public:
    raw_ut(ut_name name)
            : _name(std::move(name)) {
@@ -225,7 +180,7 @@ public:
        _frozen = true;
    }

-    virtual cql3_type prepare_internal(const sstring& keyspace, const user_types_metadata& user_types) override {
+    virtual cql3_type prepare_internal(const sstring& keyspace, lw_shared_ptr<user_types_metadata> user_types) override {
        if (_name.has_keyspace()) {
            // The provided keyspace is the one of the current statement this is part of. If it's different from the keyspace of
            // the UTName, we reject since we want to limit user types to their own keyspace (see #6643)
@@ -237,10 +192,14 @@ public:
        } else {
            _name.set_keyspace(keyspace);
        }
+        if (!user_types) {
+            // bootstrap mode.
+            throw exceptions::invalid_request_exception(format("Unknown type {}", _name));
+        }
        try {
-            data_type type = user_types.get_type(_name.get_user_type_name());
-            if (is_frozen()) {
-                type = type->freeze();
+            auto&& type = user_types->get_type(_name.get_user_type_name());
+            if (!_frozen) {
+                throw exceptions::invalid_request_exception("Non-frozen User-Defined types are not supported, please use frozen<>");
            }
            return cql3_type(std::move(type));
        } catch (std::out_of_range& e) {
@@ -254,18 +213,14 @@ public:
        return true;
    }

-    virtual bool is_user_type() const override {
-        return true;
+    virtual sstring to_string() const override {
+        return _name.to_string();
    }
 };


 class cql3_type::raw_tuple : public raw {
    std::vector<shared_ptr<raw>> _types;
-
-    virtual sstring to_string() const override {
-        return format("tuple<{}>", join(", ", _types));
-    }
 public:
    raw_tuple(std::vector<shared_ptr<raw>> types)
            : _types(std::move(types)) {
@@ -284,8 +239,8 @@ public:
        }
        _frozen = true;
    }
-    virtual cql3_type prepare_internal(const sstring& keyspace, const user_types_metadata& user_types) override {
-        if (!is_frozen()) {
+    virtual cql3_type prepare_internal(const sstring& keyspace, lw_shared_ptr<user_types_metadata> user_types) override {
+        if (!_frozen) {
            freeze();
        }
        std::vector<data_type> ts;
@@ -303,6 +258,10 @@ public:
            return t->references_user_type(name);
        });
    }
+
+    virtual sstring to_string() const override {
+        return format("tuple<{}>", join(", ", _types));
+    }
 };

 bool
@@ -315,16 +274,6 @@ cql3_type::raw::is_counter() const {
    return false;
 }

-bool
-cql3_type::raw::is_user_type() const {
-    return false;
-}
-
-bool
-cql3_type::raw::is_frozen() const {
-    return _frozen;
-}
-
 std::optional<sstring>
 cql3_type::raw::keyspace() const {
    return std::nullopt;
@@ -430,42 +379,14 @@ operator<<(std::ostream& os, const cql3_type::raw& r) {
 namespace util {

 sstring maybe_quote(const sstring& identifier) {
-    const auto* p = identifier.begin();
-    const auto* ep = identifier.end();
-
-    // quote empty string
-    if (__builtin_expect(p == ep, false)) {
-        return "\"\"";
-    }
-
-    // string needs no quoting if it matches [a-z][a-z0-9_]*
-    // quotes ('"') in the string are doubled
-    bool need_quotes;
-    bool has_quotes;
-    auto c = *p;
-    if ('a' <= c && c <= 'z') {
-        need_quotes = false;
-        has_quotes = false;
-    } else {
-        need_quotes = true;
-        has_quotes = (c == '"');
-    }
-    while ((++p != ep) && !has_quotes) {
-        c = *p;
-        if (!(('a' <= c && c <= 'z') || ('0' <= c && c <= '9') || (c == '_'))) {
-            need_quotes = true;
-            has_quotes = (c == '"');
-        }
-    }
-
-    if (!need_quotes) {
+    static const std::regex unquoted_identifier_re("[a-z][a-z0-9_]*");
+    if (std::regex_match(identifier.begin(), identifier.end(), unquoted_identifier_re)) {
        return identifier;
    }
-    if (!has_quotes) {
-        return make_sstring("\"", identifier, "\"");
-    }
    static const std::regex double_quote_re("\"");
-    return '"' + std::regex_replace(identifier.c_str(), double_quote_re, "\"\"") + '"';
+    std::string result = identifier;
+    result = std::regex_replace(result, double_quote_re, "\"\"");
+    return '"' + result + '"';
 }

 }
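Review note: the quoting rule used by `maybe_quote` (leave identifiers matching `[a-z][a-z0-9_]*` alone, otherwise wrap in double quotes and double any embedded quote) can be sketched on its own. The version below is a hedged illustration that uses `std::string` instead of `sstring`; `maybe_quote_sketch` is a hypothetical name. Note that `std::regex_replace` returns the substituted string and does not modify its input, so the result has to be captured.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-alone variant of maybe_quote, using std::string
// in place of sstring so it compiles without the rest of the tree.
std::string maybe_quote_sketch(const std::string& identifier) {
    // Identifiers matching [a-z][a-z0-9_]* need no quoting.
    static const std::regex unquoted_identifier_re("[a-z][a-z0-9_]*");
    if (std::regex_match(identifier, unquoted_identifier_re)) {
        return identifier;
    }
    // Double every embedded double quote; regex_replace returns a new
    // string, so its return value is what gets wrapped below.
    static const std::regex double_quote_re("\"");
    std::string escaped = std::regex_replace(identifier, double_quote_re, "\"\"");
    return '"' + escaped + '"';
}
```

An empty identifier does not match the unquoted pattern, so it falls through to the quoting path and comes back as a pair of double quotes.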
--- a/cql3/cql3_type.hh
+++ b/cql3/cql3_type.hh
@@ -60,28 +60,23 @@ public:
    bool is_collection() const { return _type->is_collection(); }
    bool is_counter() const { return _type->is_counter(); }
    bool is_native() const { return _type->is_native(); }
-    bool is_user_type() const { return _type->is_user_type(); }
    data_type get_type() const { return _type; }
    const sstring& to_string() const { return _type->cql3_type_name(); }

    // For UserTypes, we need to know the current keyspace to resolve the
    // actual type used, so Raw is a "not yet prepared" CQL3Type.
    class raw {
-        virtual sstring to_string() const = 0;
-    protected:
-        bool _frozen = false;
    public:
        virtual ~raw() {}
+        bool _frozen = false;
        virtual bool supports_freezing() const = 0;
        virtual bool is_collection() const;
        virtual bool is_counter() const;
        virtual bool is_duration() const;
-        virtual bool is_user_type() const;
-        bool is_frozen() const;
        virtual bool references_user_type(const sstring&) const;
        virtual std::optional<sstring> keyspace() const;
        virtual void freeze();
-        virtual cql3_type prepare_internal(const sstring& keyspace, const user_types_metadata&) = 0;
+        virtual cql3_type prepare_internal(const sstring& keyspace, lw_shared_ptr<user_types_metadata>) = 0;
        virtual cql3_type prepare(database& db, const sstring& keyspace);
        static shared_ptr<raw> from(cql3_type type);
        static shared_ptr<raw> user_type(ut_name name);
@@ -90,6 +85,7 @@ public:
        static shared_ptr<raw> set(shared_ptr<raw> t);
        static shared_ptr<raw> tuple(std::vector<shared_ptr<raw>> ts);
        static shared_ptr<raw> frozen(shared_ptr<raw> t);
+        virtual sstring to_string() const = 0;
        friend std::ostream& operator<<(std::ostream& os, const raw& r);
    };

@@ -103,33 +99,6 @@ private:
    }

 public:
-    enum class kind : int8_t {
-        ASCII, BIGINT, BLOB, BOOLEAN, COUNTER, DECIMAL, DOUBLE, EMPTY, FLOAT, INT, SMALLINT, TINYINT, INET, TEXT, TIMESTAMP, UUID, VARINT, TIMEUUID, DATE, TIME, DURATION
-    };
-    using kind_enum = super_enum<kind,
-        kind::ASCII,
-        kind::BIGINT,
-        kind::BLOB,
-        kind::BOOLEAN,
-        kind::COUNTER,
-        kind::DECIMAL,
-        kind::DOUBLE,
-        kind::EMPTY,
-        kind::FLOAT,
-        kind::INET,
-        kind::INT,
-        kind::SMALLINT,
-        kind::TINYINT,
-        kind::TEXT,
-        kind::TIMESTAMP,
-        kind::UUID,
-        kind::VARINT,
-        kind::TIMEUUID,
-        kind::DATE,
-        kind::TIME,
-        kind::DURATION>;
-    using kind_enum_set = enum_set<kind_enum>;
-
    static thread_local cql3_type ascii;
    static thread_local cql3_type bigint;
    static thread_local cql3_type blob;
@@ -154,7 +123,9 @@ public:

    static const std::vector<cql3_type>& values();
 public:
-    kind_enum_set::prepared get_kind() const;
+    using kind = abstract_type::cql3_kind;
+    using kind_enum_set = abstract_type::cql3_kind_enum_set;
+    kind_enum_set::prepared get_kind() const { return _type->get_cql3_kind(); }
 };

 inline bool operator==(const cql3_type& a, const cql3_type& b) {
--- a/cql3/cql_config.hh
+++ b/cql3/cql_config.hh
@@ -1,36 +0,0 @@
-/*
- * Copyright (C) 2019 ScyllaDB
- */
-
-/*
- * This file is part of Scylla.
- *
- * Scylla is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Affero General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * Scylla is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-
-
-#pragma once
-
-#include "restrictions/restrictions_config.hh"
-
-namespace cql3 {
-
-struct cql_config {
-    restrictions::restrictions_config restrictions;
-};
-
-extern const cql_config default_cql_config;
-
-}
--- a/cql3/cql_statement.hh
+++ b/cql3/cql_statement.hh
@@ -72,14 +72,14 @@ public:

    timeout_config_selector get_timeout_config_selector() const { return _timeout_config_selector; }

-    virtual uint32_t get_bound_terms() const = 0;
+    virtual uint32_t get_bound_terms() = 0;

    /**
     * Perform any access verification necessary for the statement.
     *
     * @param state the current client state
     */
-    virtual future<> check_access(const service::client_state& state) const = 0;
+    virtual future<> check_access(const service::client_state& state) = 0;

    /**
     * Perform additional validation required by the statment.
@@ -87,7 +87,7 @@ public:
     *
     * @param state the current client state
     */
-    virtual void validate(service::storage_proxy& proxy, const service::client_state& state) const = 0;
+    virtual void validate(service::storage_proxy& proxy, const service::client_state& state) = 0;

    /**
     * Execute the statement and return the resulting result or null if there is no result.
@@ -96,7 +96,7 @@ public:
     * @param options options for this query (consistency, variables, pageSize, ...)
     */
    virtual future<::shared_ptr<cql_transport::messages::result_message>>
-        execute(service::storage_proxy& proxy, service::query_state& state, const query_options& options) const = 0;
+        execute(service::storage_proxy& proxy, service::query_state& state, const query_options& options) = 0;

    virtual bool uses_function(const sstring& ks_name, const sstring& function_name) const = 0;

@@ -115,21 +115,4 @@ public:
    }
 };

-// Conditional modification statements and batches
-// return a result set and have metadata, while same
-// statements without conditions do not.
-class cql_statement_opt_metadata : public cql_statement {
-protected:
-    // Result set metadata, may be empty for simple updates and batches
-    shared_ptr<metadata> _metadata;
-public:
-    using cql_statement::cql_statement;
-    virtual shared_ptr<const metadata> get_result_metadata() const override {
-        if (_metadata) {
-            return _metadata;
-        }
-        return make_empty_metadata();
-    }
-};
-
 }
--- a/cql3/error_collector.hh
+++ b/cql3/error_collector.hh
@@ -55,12 +55,12 @@ class error_collector : public error_listener<RecognizerType, ExceptionBaseType>
    /**
     * The offset of the first token of the snippet.
     */
-    static constexpr int32_t FIRST_TOKEN_OFFSET = 10;
+    static const int32_t FIRST_TOKEN_OFFSET = 10;

    /**
     * The offset of the last token of the snippet.
     */
-    static constexpr int32_t LAST_TOKEN_OFFSET = 2;
+    static const int32_t LAST_TOKEN_OFFSET = 2;

    /**
     * The CQL query.
--- a/cql3/functions/abstract_function.hh
+++ b/cql3/functions/abstract_function.hh
@@ -48,10 +48,6 @@
 #include <iosfwd>
 #include <boost/functional/hash.hpp>

-namespace std {
-    std::ostream& operator<<(std::ostream& os, const std::vector<data_type>& arg_types);
-}
-
 namespace cql3 {

 namespace functions {
@@ -70,9 +66,6 @@ protected:
    }

 public:
-
-    virtual bool requires_thread() const;
-
    virtual const function_name& name() const override {
        return _name;
    }
@@ -91,15 +84,15 @@ public:
            && _return_type == x._return_type;
    }

-    virtual bool uses_function(const sstring& ks_name, const sstring& function_name) const override {
+    virtual bool uses_function(const sstring& ks_name, const sstring& function_name) override {
        return _name.keyspace == ks_name && _name.name == function_name;
    }

-    virtual bool has_reference_to(function& f) const override {
+    virtual bool has_reference_to(function& f) override {
        return false;
    }

-    virtual sstring column_name(const std::vector<sstring>& column_names) const override {
+    virtual sstring column_name(const std::vector<sstring>& column_names) override {
        return format("{}({})", _name, join(", ", column_names));
    }

@@ -110,7 +103,12 @@ inline
 void
 abstract_function::print(std::ostream& os) const {
    os << _name << " : (";
-    os << _arg_types;
+    for (size_t i = 0; i < _arg_types.size(); ++i) {
+        if (i > 0) {
+            os << ", ";
+        }
+        os << _arg_types[i]->as_cql3_type().to_string();
+    }
    os << ") -> " << _return_type->as_cql3_type().to_string();
 }

--- a/cql3/functions/aggregate_fcts.cc
+++ b/cql3/functions/aggregate_fcts.cc
@@ -1,612 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-/*
- * Copyright (C) 2019 ScyllaDB
- *
- * Modified by ScyllaDB
- */
-
-/*
- * This file is part of Scylla.
- *
- * Scylla is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Affero General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * Scylla is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with Scylla.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-
-#include "utils/big_decimal.hh"
-#include "aggregate_fcts.hh"
-#include "functions.hh"
-#include "native_aggregate_function.hh"
-#include "exceptions/exceptions.hh"
-
-using namespace cql3;
-using namespace functions;
-using namespace aggregate_fcts;
-
-namespace {
-class impl_count_function : public aggregate_function::aggregate {
-    int64_t _count;
-public:
-    virtual void reset() override {
-        _count = 0;
-    }
-    virtual opt_bytes compute(cql_serialization_format sf) override {
-        return long_type->decompose(_count);
-    }
-    virtual void add_input(cql_serialization_format sf, const std::vector<opt_bytes>& values) override {
-        ++_count;
-    }
-};
-
-class count_rows_function final : public native_aggregate_function {
-public:
-    count_rows_function() : native_aggregate_function(COUNT_ROWS_FUNCTION_NAME, long_type, {}) {}
-    virtual std::unique_ptr<aggregate> new_aggregate() override {
-        return std::make_unique<impl_count_function>();
-    }
-    virtual sstring column_name(const std::vector<sstring>& column_names) const override {
-        return "count";
-    }
-};
-
-// We need a wider accumulator for sum and average,
-// since summing the inputs can overflow the input type
-template <typename T>
-struct accumulator_for;
-
-template <typename NarrowType, typename AccType>
-static NarrowType checking_narrow(AccType acc) {
-    NarrowType ret = static_cast<NarrowType>(acc);
-    if (static_cast<AccType>(ret) != acc) {
-        throw exceptions::overflow_error_exception("Sum overflow. Values should be casted to a wider type.");
-    }
-    return ret;
-}
-
-template <>
-struct accumulator_for<int8_t> {
-    using type = __int128;
-
-    static int8_t narrow(type acc) {
-        return checking_narrow<int8_t>(acc);
-    }
-};
-
-template <>
-struct accumulator_for<int16_t> {
-    using type = __int128;
-
-    static int16_t narrow(type acc) {
-        return checking_narrow<int16_t>(acc);
-    }
-};
-
-template <>
-struct accumulator_for<int32_t> {
-    using type = __int128;
-
-    static int32_t narrow(type acc) {
-        return checking_narrow<int32_t>(acc);
-    }
-};
-
-template <>
-struct accumulator_for<int64_t> {
-    using type = __int128;
-
-    static int64_t narrow(type acc) {
-        return checking_narrow<int64_t>(acc);
-    }
-};
-
-template <>
-struct accumulator_for<float> {
-    using type = float;
-
-    static auto narrow(type acc) {
-        return acc;
-    }
-};
-
-template <>
-struct accumulator_for<double> {
-    using type = double;
-
-    static auto narrow(type acc) {
-        return acc;
-    }
-};
-
-template <>
-struct accumulator_for<boost::multiprecision::cpp_int> {
-    using type = boost::multiprecision::cpp_int;
-
-    static auto narrow(type acc) {
-        return acc;
-    }
-};
-
-template <>
-struct accumulator_for<big_decimal> {
-    using type = big_decimal;
-
-    static auto narrow(type acc) {
-        return acc;
-    }
-};
-
-template <typename Type>
-class impl_sum_function_for final : public aggregate_function::aggregate {
-    using accumulator_type = typename accumulator_for<Type>::type;
-    accumulator_type _sum{};
-public:
-    virtual void reset() override {
-        _sum = {};
-    }
-    virtual opt_bytes compute(cql_serialization_format sf) override {
-        return data_type_for<Type>()->decompose(accumulator_for<Type>::narrow(_sum));
-    }
-    virtual void add_input(cql_serialization_format sf, const std::vector<opt_bytes>& values) override {
-        if (!values[0]) {
-            return;
-        }
-        _sum += value_cast<Type>(data_type_for<Type>()->deserialize(*values[0]));
-    }
-};
-
-template <typename Type>
-class sum_function_for final : public native_aggregate_function {
-public:
-    sum_function_for() : native_aggregate_function("sum", data_type_for<Type>(), { data_type_for<Type>() }) {}
-    virtual std::unique_ptr<aggregate> new_aggregate() override {
-        return std::make_unique<impl_sum_function_for<Type>>();
-    }
-};
-
-
-template <typename Type>
-static
-shared_ptr<aggregate_function>
-make_sum_function() {
-    return make_shared<sum_function_for<Type>>();
-}
-
-template <typename Type>
-class impl_div_for_avg {
-public:
-    static Type div(const typename accumulator_for<Type>::type& x, const int64_t y) {
-        return x/y;
-    }
-};
-
-template <>
-class impl_div_for_avg<big_decimal> {
-public:
-    static big_decimal div(const big_decimal& x, const int64_t y) {
-        return x.div(y, big_decimal::rounding_mode::HALF_EVEN);
-    }
-};
-
-template <typename Type>
-class impl_avg_function_for final : public aggregate_function::aggregate {
-   typename accumulator_for<Type>::type _sum{};
-   int64_t _count = 0;
-public:
-    virtual void reset() override {
-        _sum = {};
-        _count = 0;
-    }
-    virtual opt_bytes compute(cql_serialization_format sf) override {
-        Type ret{};
-        if (_count) {
-            ret = impl_div_for_avg<Type>::div(_sum, _count);
-        }
-        return data_type_for<Type>()->decompose(ret);
-    }
-    virtual void add_input(cql_serialization_format sf, const std::vector<opt_bytes>& values) override {
-        if (!values[0]) {
-            return;
-        }
-        ++_count;
-        _sum += value_cast<Type>(data_type_for<Type>()->deserialize(*values[0]));
-    }
-};
-
-template <typename Type>
-class avg_function_for final : public native_aggregate_function {
-public:
-    avg_function_for() : native_aggregate_function("avg", data_type_for<Type>(), { data_type_for<Type>() }) {}
-    virtual std::unique_ptr<aggregate> new_aggregate() override {
-        return std::make_unique<impl_avg_function_for<Type>>();
-    }
-};
-
-template <typename Type>
-static
-shared_ptr<aggregate_function>
-make_avg_function() {
-    return make_shared<avg_function_for<Type>>();
-}
-
-template <typename T>
-struct aggregate_type_for {
-    using type = T;
-};
-
-template<>
-struct aggregate_type_for<ascii_native_type> {
-    using type = ascii_native_type::primary_type;
-};
-
-template<>
-struct aggregate_type_for<simple_date_native_type> {
-    using type = simple_date_native_type::primary_type;
-};
-
-template<>
-struct aggregate_type_for<timeuuid_native_type> {
-    using type = timeuuid_native_type::primary_type;
-};
-
-template<>
-struct aggregate_type_for<time_native_type> {
-    using type = time_native_type::primary_type;
-};
-
-template <typename Type>
-const Type& max_wrapper(const Type& t1, const Type& t2) {
-    using std::max;
-    return max(t1, t2);
-}
-
-inline const net::inet_address& max_wrapper(const net::inet_address& t1, const net::inet_address& t2) {
-    using family = seastar::net::inet_address::family;
-    const size_t len =
-            (t1.in_family() == family::INET || t2.in_family() == family::INET)
-            ? sizeof(::in_addr) : sizeof(::in6_addr);
-    return std::memcmp(t1.data(), t2.data(), len) >= 0 ? t1 : t2;
-}
-
-template <typename Type>
-class impl_max_function_for final : public aggregate_function::aggregate {
-   std::optional<typename aggregate_type_for<Type>::type> _max{};
-public:
-    virtual void reset() override {
-        _max = {};
-    }
-    virtual opt_bytes compute(cql_serialization_format sf) override {
-        if (!_max) {
-            return {};
-        }
-        return data_type_for<Type>()->decompose(data_value(Type{*_max}));
-    }
-    virtual void add_input(cql_serialization_format sf, const std::vector<opt_bytes>& values) override {
-        if (!values[0]) {
-            return;
-        }
-        auto val = value_cast<typename aggregate_type_for<Type>::type>(data_type_for<Type>()->deserialize(*values[0]));
-        if (!_max) {
-            _max = val;
-        } else {
-            _max = max_wrapper(*_max, val);
-        }
-    }
-};
-
-/// The same as `impl_max_function_for' but without knowledge of `Type'.
-class impl_max_dynamic_function final : public aggregate_function::aggregate {
-    opt_bytes _max;
-public:
-    virtual void reset() override {
-        _max = {};
-    }
-    virtual opt_bytes compute(cql_serialization_format sf) override {
-        return _max.value_or(bytes{});
-    }
-    virtual void add_input(cql_serialization_format sf, const std::vector<opt_bytes>& values) override {
-        if (!values[0]) {
-            return;
-        }
-        const auto val = *values[0];
-        if (!_max || *_max < val) {
-            _max = val;
-        }
-    }
-};
-
-template <typename Type>
-class max_function_for final : public native_aggregate_function {
-public:
-    max_function_for() : native_aggregate_function("max", data_type_for<Type>(), { data_type_for<Type>() }) {}
-    virtual std::unique_ptr<aggregate> new_aggregate() override {
-        return std::make_unique<impl_max_function_for<Type>>();
-    }
-};
-
-class max_dynamic_function final : public native_aggregate_function {
-public:
-    max_dynamic_function(data_type io_type) : native_aggregate_function("max", io_type, { io_type }) {}
-    virtual std::unique_ptr<aggregate> new_aggregate() override {
-        return std::make_unique<impl_max_dynamic_function>();
-    }
-};
-
-/**
- * Creates a MAX function for the specified type.
- *
- * @param inputType the function input and output type
- * @return a MAX function for the specified type.
- */
-template <typename Type>
-static
-shared_ptr<aggregate_function>
-make_max_function() {
-    return make_shared<max_function_for<Type>>();
-}
-
-template <typename Type>
-const Type& min_wrapper(const Type& t1, const Type& t2) {
-    using std::min;
-    return min(t1, t2);
-}
-
-inline const net::inet_address& min_wrapper(const net::inet_address& t1, const net::inet_address& t2) {
-    using family = seastar::net::inet_address::family;
-    const size_t len =
-            (t1.in_family() == family::INET || t2.in_family() == family::INET)
-            ? sizeof(::in_addr) : sizeof(::in6_addr);
-    return std::memcmp(t1.data(), t2.data(), len) <= 0 ? t1 : t2;
-}
-
-template <typename Type>
-class impl_min_function_for final : public aggregate_function::aggregate {
-   std::optional<typename aggregate_type_for<Type>::type> _min{};
-public:
-    virtual void reset() override {
-        _min = {};
-    }
-    virtual opt_bytes compute(cql_serialization_format sf) override {
-        if (!_min) {
-            return {};
-        }
-        return data_type_for<Type>()->decompose(data_value(Type{*_min}));
-    }
-    virtual void add_input(cql_serialization_format sf, const std::vector<opt_bytes>& values) override {
-        if (!values[0]) {
-            return;
-        }
-        auto val = value_cast<typename aggregate_type_for<Type>::type>(data_type_for<Type>()->deserialize(*values[0]));
-        if (!_min) {
-            _min = val;
-        } else {
-            _min = min_wrapper(*_min, val);
-        }
-    }
-};
-
-/// The same as `impl_min_function_for' but without knowledge of `Type'.
-class impl_min_dynamic_function final : public aggregate_function::aggregate {
-    opt_bytes _min;
-public:
-    virtual void reset() override {
-        _min = {};
-    }
-    virtual opt_bytes compute(cql_serialization_format sf) override {
-        return _min.value_or(bytes{});
-    }
-    virtual void add_input(cql_serialization_format sf, const std::vector<opt_bytes>& values) override {
-        if (!values[0]) {
-            return;
-        }
-        const auto val = *values[0];
-        if (!_min || val < *_min) {
-            _min = val;
-        }
-    }
-};
-
-template <typename Type>
-class min_function_for final : public native_aggregate_function {
-public:
-    min_function_for() : native_aggregate_function("min", data_type_for<Type>(), { data_type_for<Type>() }) {}
-    virtual std::unique_ptr<aggregate> new_aggregate() override {
-        return std::make_unique<impl_min_function_for<Type>>();
-    }
-};
-
-class min_dynamic_function final : public native_aggregate_function {
-public:
-    min_dynamic_function(data_type io_type) : native_aggregate_function("min", io_type, { io_type }) {}
-    virtual std::unique_ptr<aggregate> new_aggregate() override {
-        return std::make_unique<impl_min_dynamic_function>();
-    }
-};
-
-/**
- * Creates a MIN function for the specified type.
- *
- * @param inputType the function input and output type
- * @return a MIN function for the specified type.
- */
-template <typename Type>
-static
-shared_ptr<aggregate_function>
-make_min_function() {
-    return make_shared<min_function_for<Type>>();
-}
-
-template <typename Type>
-class impl_count_function_for final : public aggregate_function::aggregate {
-   int64_t _count = 0;
-public:
-    virtual void reset() override {
-        _count = 0;
-    }
-    virtual opt_bytes compute(cql_serialization_format sf) override {
-        return long_type->decompose(_count);
-    }
-    virtual void add_input(cql_serialization_format sf, const std::vector<opt_bytes>& values) override {
-        if (!values[0]) {
-            return;
-        }
-        ++_count;
-    }
-};
-
-template <typename Type>
-class count_function_for final : public native_aggregate_function {
-public:
-    count_function_for() : native_aggregate_function("count", long_type, { data_type_for<Type>() }) {}
-    virtual std::unique_ptr<aggregate> new_aggregate() override {
-        return std::make_unique<impl_count_function_for<Type>>();
-    }
-};
-
-/**
- * Creates a COUNT function for the specified type.
- *
- * @param inputType the function input type
- * @return a COUNT function for the specified type.
- */
-template <typename Type>
-static shared_ptr<aggregate_function> make_count_function() {
-    return make_shared<count_function_for<Type>>();
-}
-}
-
-shared_ptr<aggregate_function>
-aggregate_fcts::make_count_rows_function() {
-    return make_shared<count_rows_function>();
-}
-
-shared_ptr<aggregate_function>
-aggregate_fcts::make_max_dynamic_function(data_type io_type) {
-    return make_shared<max_dynamic_function>(io_type);
-}
-
-shared_ptr<aggregate_function>
-aggregate_fcts::make_min_dynamic_function(data_type io_type) {
-    return make_shared<min_dynamic_function>(io_type);
-}
-
-void cql3::functions::add_agg_functions(declared_t& funcs) {
-    auto declare = [&funcs] (shared_ptr<function> f) { funcs.emplace(f->name(), f); };
-
-    declare(make_count_function<int8_t>());
-    declare(make_max_function<int8_t>());
-    declare(make_min_function<int8_t>());
-
-    declare(make_count_function<int16_t>());
-    declare(make_max_function<int16_t>());
-    declare(make_min_function<int16_t>());
-
-    declare(make_count_function<int32_t>());
-    declare(make_max_function<int32_t>());
-    declare(make_min_function<int32_t>());
-
-    declare(make_count_function<int64_t>());
-    declare(make_max_function<int64_t>());
-    declare(make_min_function<int64_t>());
-
-    declare(make_count_function<boost::multiprecision::cpp_int>());
-    declare(make_max_function<boost::multiprecision::cpp_int>());
-    declare(make_min_function<boost::multiprecision::cpp_int>());
-
-    declare(make_count_function<big_decimal>());
-    declare(make_max_function<big_decimal>());
-    declare(make_min_function<big_decimal>());
-
-    declare(make_count_function<float>());
-    declare(make_max_function<float>());
-    declare(make_min_function<float>());
-
-    declare(make_count_function<double>());
-    declare(make_max_function<double>());
-    declare(make_min_function<double>());
-
-    declare(make_count_function<sstring>());
-    declare(make_max_function<sstring>());
-    declare(make_min_function<sstring>());
-
-    declare(make_count_function<ascii_native_type>());
-    declare(make_max_function<ascii_native_type>());
-    declare(make_min_function<ascii_native_type>());
-
-    declare(make_count_function<simple_date_native_type>());
-    declare(make_max_function<simple_date_native_type>());
-    declare(make_min_function<simple_date_native_type>());
-
-    declare(make_count_function<db_clock::time_point>());
-    declare(make_max_function<db_clock::time_point>());
-    declare(make_min_function<db_clock::time_point>());
-
-    declare(make_count_function<timeuuid_native_type>());
-    declare(make_max_function<timeuuid_native_type>());
-    declare(make_min_function<timeuuid_native_type>());
-
-    declare(make_count_function<time_native_type>());
-    declare(make_max_function<time_native_type>());
-    declare(make_min_function<time_native_type>());
-
-    declare(make_count_function<utils::UUID>());
-    declare(make_max_function<utils::UUID>());
-    declare(make_min_function<utils::UUID>());
-
-    declare(make_count_function<bytes>());
-    declare(make_max_function<bytes>());
-    declare(make_min_function<bytes>());
-
-    declare(make_count_function<bool>());
-    declare(make_max_function<bool>());
-    declare(make_min_function<bool>());
-
-    declare(make_count_function<net::inet_address>());
-    declare(make_max_function<net::inet_address>());
-    declare(make_min_function<net::inet_address>());
-
-    // FIXME: more count/min/max
-
-    declare(make_sum_function<int8_t>());
-    declare(make_sum_function<int16_t>());
-    declare(make_sum_function<int32_t>());
-    declare(make_sum_function<int64_t>());
-    declare(make_sum_function<float>());
-    declare(make_sum_function<double>());
-    declare(make_sum_function<boost::multiprecision::cpp_int>());
-    declare(make_sum_function<big_decimal>());
-    declare(make_avg_function<int8_t>());
-    declare(make_avg_function<int16_t>());
-    declare(make_avg_function<int32_t>());
-    declare(make_avg_function<int64_t>());
-    declare(make_avg_function<float>());
-    declare(make_avg_function<double>());
-    declare(make_avg_function<boost::multiprecision::cpp_int>());
-    declare(make_avg_function<big_decimal>());
-}
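
Note on the removed sum/avg aggregates above: they accumulate into a wider type and narrow the result back with a round-trip check (checking_narrow). The following is a minimal standalone sketch of that idea, not the Scylla code. The names checked_narrow and sum_int32 are hypothetical, std::overflow_error stands in for exceptions::overflow_error_exception, and int64_t stands in for the __int128 accumulator so the sketch stays portable.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper modelled on the round-trip check in the diff:
// cast to the narrow type, cast back, and report overflow if the value
// changed. Before C++20 the out-of-range conversion is
// implementation-defined; mainstream compilers wrap modulo 2^N.
template <typename Narrow, typename Wide>
Narrow checked_narrow(Wide acc) {
    Narrow ret = static_cast<Narrow>(acc);
    if (static_cast<Wide>(ret) != acc) {
        throw std::overflow_error("Sum overflow: cast the values to a wider type.");
    }
    return ret;
}

// Sums 32-bit inputs in a 64-bit accumulator, so intermediate sums cannot
// overflow for any realistic number of inputs; only the final result is
// checked against the 32-bit range.
int32_t sum_int32(const std::vector<int32_t>& values) {
    int64_t acc = 0;
    for (int32_t v : values) {
        acc += v;
    }
    return checked_narrow<int32_t>(acc);
}
```

With this sketch, sum_int32({INT32_MAX, -1}) returns normally, while sum_int32({INT32_MAX, 1}) throws instead of silently wrapping around.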