scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-30 05:07:05 +00:00

Author	SHA1	Message	Date
Paweł Dziepak	bc2ff41003	cql3: fix units in large batch warning When displaying a warning about batch being too large C* reports batch size and limit in bytes while S* uses kB. This patch switches Scylla to use bytes. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1473867171-18932-1-git-send-email-pdziepak@scylladb.com>	2016-09-14 18:38:46 +03:00
Takuya ASADA	647673195c	dist/redhat/build_rpm.sh: add dependency for rpmbuild Install rpmbuild when it's not installed yet. Fixes #1651 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1473193430-14792-1-git-send-email-syuu@scylladb.com>	2016-09-14 14:57:55 +03:00
Calle Wilund	f126cf769a	column_family: Ensure flush() waits for all previous flushes + self Fixes #1577 Message-Id: <1472569952-4066-1-git-send-email-calle@scylladb.com>	2016-09-14 11:00:41 +01:00
Duarte Nunes	f864bca773	row_cache: Deal with side-effects in allocating_section In row_cache::make_reader, we update statistics inside an allocating_section, which retries the supplied function until it can satisfy all allocations by way of reserving LSA memory up front. Since those updates are interleaved with allocations, retries can lead to miscounts. This patch fixes this by updating statistics after all allocations. Fixes #1659 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1473845977-20205-1-git-send-email-duarte@scylladb.com>	2016-09-14 10:46:25 +01:00
Tomasz Grabiec	a498da1987	database: Ignore spaces in initial_token list Currently we get a boost::lexical_cast error on startup if initial_token has a list which contains spaces after commas, e.g.: initial_token: -1100081313741479381, -1104041856484663086, ... Fixes #1664. Message-Id: <1473840915-5682-1-git-send-email-tgrabiec@scylladb.com>	2016-09-14 11:58:13 +03:00
Paweł Dziepak	c220c676c8	types: honour end of sstring_view There are several places in types.cc where we assume that an sstring_view range is null terminated. That may not be true and we should always use either begin()/end() or data()/size() pairs. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-09-07 14:30:56 -07:00
Paweł Dziepak	6373289532	Merge "Adding slow query API" from Amnon "This series adds an API for the slow query recording. After this series it will be possible to set the/get the slow query recording parameters."	2016-09-07 11:06:09 -07:00
Pekka Enberg	1095705a6b	Update scylla-ami submodule * dist/ami/files/scylla-ami 14c1666...e1e3919 (1): > scylla_ami_setup: remove scylla_cpuset_setup	2016-09-07 21:04:03 +03:00
Avi Kivity	7ac729b4d5	Merge "Optimize reads for clustered data" from Raphael "This will be very important for read performance of the time series use case, where timestamp is usually stored as a clustering key, and the user asks for specific data using a clustering range filter. Example: CREATE TABLE temperature ( weatherstation_id text, event_time timestamp, temperature text, PRIMARY KEY (weatherstation_id,event_time) ); ... SELECT * FROM temperature WHERE weatherstation_id='1234ABCD' AND event_time > '2013-04-03 07:01:00' AND event_time < '2013-04-03 07:04:00'; This is based on: https://issues.apache.org/jira/browse/CASSANDRA-5514 To check correctness, I wrote a dtest that runs scylla with row cache disabled, creates several sstables with non overlapping clustering key ranges, queries data using several clustering range filters, and checks that the database returns the expected results. Tested performance with a tool I wrote myself [1] and performance is indeed improved by this patchset. This tool works as follows: Scylla is started with row cache disabled. That's wanted here because we're measuring a specific code path that only gets executed if row cache misses the data we asked for. Then the Scylla node is populated with N sstables ('nodetool flush' is used to ensure it), where each will have M clustering keys, totaling N*M clustering keys. Finally, we will start asking for data using a clustering range filter. The tool measures throughput and min/max/avg latency.
[1]: https://gist.github.com/raphaelsc/4c415f592aaed14a18be31279d225972 The results follow: BEFORE ----- ('Clustering keys / second: ', 747.9672111659951) ('Max latency (ms): ', 33) ('Min latency (ms): ', 12) ('Avg latency (ms): ', 13.0) The operation took 13.3695700169 seconds AFTER ----- ('Clustering keys / second: ', 3159.115303945648) ('Max latency (ms): ', 22) ('Min latency (ms): ', 2) ('Avg latency (ms): ', 3.0) The operation took 3.16544318199 seconds NOTE: Throughput and average latency are improved by a factor of ~4. -----"	2016-09-04 15:06:32 +03:00
Amnon Heiman	11c687dd93	API: Add slow query logging implementation This adds the implementation for the slow query logging API. After this patch the following will be available: curl -X GET "http://localhost:10000/storage_service/slow_query" curl -X POST "http://localhost:10000/storage_service/slow_query?enable=true&ttl=10&threshold=6000" Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2016-09-03 01:15:22 +03:00
Amnon Heiman	ed1d02b1a3	API: Add slow query API definition This adds the GET and POST API for slow query logging. The GET returns an object with the enable, ttl and threshold and the POST lets you configure each of them. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2016-09-03 01:15:15 +03:00
Raphael S. Carvalho	b9f67351da	db: expose clustering filter info via collectd That's needed to observe behavior of clustering filter, and to check if it's worthwhile for a specific workload. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-02 11:32:23 -03:00
Raphael S. Carvalho	a2dc88889d	db: enable clustering optimization only on dtcs Leveled strategy will not benefit from this strategy because there's only a few sstables that will contain a given partition key, which means that a clustering key that belongs to a specific partition key can only be in a few sstables as well. Date tiered strategy is the one that will actually benefit the most from this optimization. Size tiered may benefit from it too if clustering key isn't overwritten, but it will not use the clustering optimization. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-02 11:31:07 -03:00
Raphael S. Carvalho	8d03ccd604	sstables: optimize reads with clustering filter If user specifies a clustering filter, it's possible to filter out sstable based on its metadata that tracks min/max clustering value. For example, if sstable stores clustering key from 'a' through 'c', it's possible to filter out that sstable if user asks for data with clustering key greater than 'c'. That's done by comparing each component separately because clustering key may be composite. Further information can be found here: https://issues.apache.org/jira/browse/CASSANDRA-5514 Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-02 10:51:50 -03:00
Raphael S. Carvalho	768aced741	partition_slice: introduce key-independent function to get ranges That will be important for sstable code that will rule out a sstable if it doesn't cover a given clustering key range. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-02 10:50:56 -03:00
Raphael S. Carvalho	dce61ddb02	types: introduce abstract_type::as_tri_comparator() That's akin to abstract_type::as_less_comparator's nature. So we don't have to repeat something like the following everywhere: auto cmp = [&type] (const bytes_view& b1, const bytes_view& b2) { return type->compare(b1, b2); } Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-02 10:50:53 -03:00
Raphael S. Carvalho	004617839d	database: check bloom filter of all sstables earlier All sstables will now have bloom filter checked in a single pass before reader iterate through all candidates. It's possible that we will need to futurize the procedure if it holds cpu for too long. This change is also a step towards the optimization that will rule out sstables based on clustering filter. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-02 10:50:08 -03:00
Raphael S. Carvalho	2a426ab248	tests: add test to check tombstone metadata Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-02 10:49:35 -03:00
Raphael S. Carvalho	94c8ef39c3	sstables: store components ranges in sstable object Store range for each clustering component in sstable itself to optimize sstable filtering based on clustering key. If schema defines no clustering key, this new field will be empty. Each range stores min and max value of that specific component. With this information, it's possible to know if a sstable possibly stores a given clustering component. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-02 10:49:32 -03:00
Raphael S. Carvalho	026853fabb	tests: add test to check composite validity Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-02 10:49:30 -03:00
Raphael S. Carvalho	0a5af61176	sstables: introduce function to validate min max clustering values Scylla was generating a sstable with incorrect min max clustering values. This information is used to filter out a sstable when user asks for a range of clustering rows. So it's important to detect wrong metadata and make sure that it will not be used. The validation is fast and will only happen when loading a sstable. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-02 10:49:28 -03:00
Raphael S. Carvalho	1f31223f32	sstables: store schema in sstable object That will be needed for optimization that will store decorated keys in the sstable object, and also for a subsequent work that will detect wrong metadata (min/max column names) by looking at columns in the schema. As schema is stored in sstable, there's no longer a need to store ks and cf names in it. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-02 10:49:17 -03:00
Avi Kivity	7a140a306e	Revert "sstables: optimize selection of sstables for leveled strategy" This reverts commit c75b07fc34f0e7267a8e49276b96bbd4686cb78d; does not deduplicate the sstable list.	2016-09-01 18:34:08 +03:00
Raphael S. Carvalho	c75b07fc34	sstables: optimize selection of sstables for leveled strategy It's possible to copy sstables directly into vector, and that will improve performance. my benchmark tool[1] shows that new version reduces running time of copy procedure by factor of two after 1024^2 calls. Switching to back_inserter improves throughput even further. [1]: gist.github.com/raphaelsc/a4b27290f362cdecdef399770dda759c Refs #1632. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <7153514a9b5f5eb24dff518ee9fa3680e0881dae.1472741401.git.raphaelsc@scylladb.com>	2016-09-01 18:08:53 +03:00
Glauber Costa	dc5d8e33af	Revert "row_cache: update sstable histograms on cache hits" This reverts commit `1726b1d0cc`. Reverting this patch turns our SSTable access counter into a miss counter only. The estimated histogram always starts its first bucket at 1, so by marking cache accesses we will be wrongly feeding "1" into the buckets. Notice that this is not yet ideal: nodetool is supposed to show a histogram of all reads, and by doing this we are changing its meaning slightly. Workloads that serve mostly from cache will be distorted towards their misses. The real solution is to use a different histogram, but we will need to enforce a newer version of nodetool for that: the current issue is that nodetool expects an EstimatedHistogram in a specific format in the other side. Conflicts: row_cache.hh Message-Id: <a599fa9e949766e7c9697450ae34fc28e881e90a.1472742276.git.glauber@scylladb.com> Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-09-01 18:07:31 +03:00
Avi Kivity	e33671c285	Merge "tracing: Trace read sstables" from Duarte "This patchset traces sstables we read from. To do that, we need to flow the trace_state_ptr to the mutation_readers."	2016-09-01 13:24:16 +03:00
Duarte Nunes	ba374da043	database: Trace sstable accesses This patch traces when we read from an sstable, be it a key range or a single one. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-09-01 12:04:32 +02:00
Duarte Nunes	f4cf2f2aef	tracing: Make trace_state_ptr argument required This patch makes the optional trace_state_ptr arguments introduced in previous patches mandatory where possible. Functions which are called internally don't have a trace context, so for those we keep the argument's default value for convenience. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-09-01 12:04:32 +02:00
Duarte Nunes	46b86ff801	storage_proxy: Pass along trace_state for queries This patch changes the storage_proxy so it passes along a trace_state_ptr to the layers below, when querying locally or receiving a remote query request. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-09-01 12:04:32 +02:00
Duarte Nunes	030db65c62	database: Accept a trace_state_ptr This patch changes the database and column_family types so a trace_state_ptr can be passed in when querying. This enables tracing of the inner components. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-09-01 12:04:28 +02:00
Duarte Nunes	9269256246	row_cache: Accept a trace_state_ptr This patch changes the row_cache so it accepts a trace_state_ptr, which it is responsible of flowing to the underlying mutation_reader if needed. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-09-01 12:00:55 +02:00
Duarte Nunes	5fd66f00c2	mutation_reader: Accept trace_state_ptr This patch changes the mutation_reader so it optionally accepts a trace_state_ptr. This will allow us to trace, for example, which sstables are accessed during a request. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-09-01 12:00:31 +02:00
Avi Kivity	cc127295e9	Merge "Fill in information for sstables per read histogram" from Glauber "Nodetool cfhistograms is supposed to tell us how many SSTables were touched per read. Currently, we are a bit in the dark as we don't export that information. This patch exports that, so that we can start using it."	2016-09-01 12:54:24 +03:00
Glauber Costa	1726b1d0cc	row_cache: update sstable histograms on cache hits If we have a cache hit, we still need to update our sstable histogram - noting that we have touched 0 SSTables. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-08-31 15:14:22 -04:00
Glauber Costa	ce24fd05fe	database: keep statistics on SSTables touched per read That is done for single partition queries only - mimicking what Cassandra does on that matter. For this to be correct, we also need to update this histogram on cache hits - in which case we update the read as having touched 0 SSTables. That will be done on a separate patch. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-08-31 15:14:21 -04:00
Glauber Costa	0f413695ac	database: make column family stats mutable The make_reader method is currently a const method, but we would like to start keeping hit statistics from it. Instead of relaxing the const condition too much, we can just mark the _stats field as mutable, indicating that make_reader will not be able to change anything in the CF, except for keeping statistics. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-08-31 15:13:24 -04:00
Glauber Costa	5c4d73577a	initialize sstables_per_read histogram with 35 instead of 90 buckets This is to match what Cassandra does. Nodetool may be expecting this on the other side. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-08-31 15:13:24 -04:00
Glauber Costa	4310635bae	move estimated histogram to utils Nothing sstable-specific in it, really. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-08-31 15:13:23 -04:00
Glauber Costa	ffc2131c51	decouple estimated_histogram from sstables There is nothing really that fundamentally ties the estimated histogram to sstables. This patch gets rid of the few incidental ties. They are: - the namespace name, which is now moved to utils. Users inside sstables/ now need to add a namespace prefix, while the ones outside have to change it to the right one - sstables::merge, which has a very non-descriptive name to begin with, is changed to a more descriptive name that can live inside utils/ - the disk_types.hh include has to be removed - but it had no reason to be here in the first place. Todo, is to actually move the file outside sstables/. That is done in a separate step for clarity. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-08-31 15:13:23 -04:00
Yoav Kleinberger	624165da79	scyllatop: dump all output to stdout instead of running a fancy console interface Sometimes the user would like to dump all the metrics into a file or pipe it to another program, as requested in issue #1506. This patch makes scyllatop check if stdout is connected to a TTY, and if not - it does not fire up the fancy urwid UI but instead, just writes all its collected metrics to stdout. Optionally, the user can tell the program to quit after a specific number of iterations via the -n or --iterations flag. Signed-off-by: Yoav Kleinberger <yoav@scylladb.com> Message-Id: <1471777516-9903-1-git-send-email-yoav@scylladb.com>	2016-08-31 08:31:36 +03:00
Paweł Dziepak	e981101fa9	Merge "Remove clustering_key_filtering_context" from Piotr "clustering_key_filtering_context is no longer needed. partition_slice can be used instead so this series removes clustering_key_filtering_context and passes partition_slice down where it's needed. Then a static get_ranges method is used to obtain clustering key ranges for a given partition. Fixes #1614."	2016-08-30 22:30:15 +01:00
Piotr Jastrzebski	3607d99269	Remove clustering_key_filtering_context. Remove clustering_key_filter_factory and clustering_key_filtering_context. Use partition_slice directly with a static get_ranges method. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-08-30 20:31:55 +02:00
Piotr Jastrzebski	b05b90b3a5	Introduce clustering_key_filter_ranges. This fixes the problem of multiple concurrent get_ranges calls. Previously each call was invalidating the result of the previous call. Now they don't step on each other's toes. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-08-30 19:46:38 +02:00
Duarte Nunes	39e0fb1260	storage_proxy: Support multiple partition ranges This patch adds the ability to query multiple partition ranges. This is needed since `55f2cf1626`, where we started unwrapping partition ranges in Thrift. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1472474594-15368-1-git-send-email-duarte@scylladb.com>	2016-08-30 17:43:40 +03:00
Takuya ASADA	533dc0485d	dist/common/scripts/scylla_sysconfig_setup: sync cpuset parameters with rps_cpus settings when posix_net_conf.sh is enabled and NIC is single queue In posix_net_conf.sh's single queue NIC mode (which means RPS enabled mode), cpu0 and its sibling are excluded from the network stack processing cpus, and the NIC IRQ is assigned to cpu0. So the network stack never runs on cpu0 and its sibling, and to get better performance we need to exclude these cpus from scylla too. To do this, we need to get the RPS cpu mask from posix_net_conf.sh and pass it to scylla_cpuset_setup to construct /etc/scylla.d/cpuset.conf when scylla_setup is executed. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1472544875-2033-2-git-send-email-syuu@scylladb.com>	2016-08-30 16:51:16 +03:00
Takuya ASADA	0c3bb2ee63	dist/common/scripts/scylla_prepare: drop unnecessary multiqueue NIC detection code on scylla_prepare Right now scylla_prepare specifies the -mq option to posix_net_conf.sh when the number of RX queues > 1, but posix_net_conf.sh sets NIC mode to sq when queues < ncpus / 2. So the logic is different, and actually posix_net_conf.sh does not need -sq/-mq to be specified now, since it autodetects queue mode. So we need to drop the detection logic from scylla_prepare and let posix_net_conf.sh detect it. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1472544875-2033-1-git-send-email-syuu@scylladb.com>	2016-08-30 16:51:15 +03:00
Pekka Enberg	eff14bae0e	transport/server: Explicit CQL type IDs The CQL type IDs are specified as hex in the CQL binary protocol specification. Define CQL type IDs in the code explicitly to make reviewing the code and adding new types easier. Message-Id: <1472537971-26053-1-git-send-email-penberg@scylladb.com>	2016-08-30 09:45:26 +03:00
Avi Kivity	809d739ae8	Merge seastar upstream * seastar 2b07b1f...0303e0c (3): > scripts/posix_net_conf.sh: add support --cpu-mask mode > file: improve tmpfs support > file::close: remove trailing newline in log message	2016-08-29 13:26:04 +03:00
Pekka Enberg	2d3aee73a6	systemd: Don't start Scylla service until network is up Alexandr Porunov reports that Scylla fails to start up after reboot as follows: Aug 25 19:44:51 scylla1 scylla[637]: Exiting on unhandled exception of type 'std::system_error': Error system:99 (Cannot assign requested address) The problem is that because there's no dependency to network service, Scylla simply attempts to start up too soon in the boot sequence and fails. Fixes #1618. Message-Id: <1472212447-21445-1-git-send-email-penberg@scylladb.com>	2016-08-29 13:15:39 +03:00
Takuya ASADA	74d994f6a1	dist/common/scripts/scylla_setup: support enabling services on Ubuntu 15.10/16.04 Right now it ignores Ubuntu, but we share the .service file between Fedora/CentOS and Ubuntu >= 15.10, so support it. Fixes #1556. Message-Id: <1471932814-17347-1-git-send-email-syuu@scylladb.com>	2016-08-29 13:13:14 +03:00
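
The sstable filtering described in commit 8d03ccd604 (skip an sstable whose min/max clustering metadata cannot overlap the requested range) can be sketched for a single clustering component as below. This is a minimal illustration, not Scylla's code: `component_range`, `query_range` and `may_contain` are hypothetical names, values are plain strings compared bytewise, and both query bounds are assumed to be exclusive.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for per-sstable metadata: the smallest and largest
// value seen for one clustering component. Not Scylla's real type.
struct component_range {
    std::string min;
    std::string max;
};

// Hypothetical query restriction on that component.
// An empty optional means "unbounded on this side".
struct query_range {
    std::optional<std::string> lower;  // exclusive lower bound
    std::optional<std::string> upper;  // exclusive upper bound
};

// Returns false only when the sstable provably holds no value inside the
// queried range, so the reader can skip it without touching its data.
bool may_contain(const component_range& sst, const query_range& q) {
    assert(sst.min <= sst.max);  // assumes metadata was validated on load
    if (q.lower && sst.max <= *q.lower) {
        return false;  // every stored value is at or below the lower bound
    }
    if (q.upper && sst.min >= *q.upper) {
        return false;  // every stored value is at or above the upper bound
    }
    return true;  // ranges may overlap, so the sstable has to be read
}
```

With the example from the commit message, an sstable covering 'a' through 'c' is skipped for a query asking for keys greater than 'c', while a query for keys greater than 'b' still has to read it.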

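The fix in commit a498da1987 (accepting spaces after commas in the `initial_token` list) amounts to trimming each item before converting it. A rough sketch under stated assumptions: `parse_initial_tokens` is a hypothetical helper, tokens are assumed to be signed 64-bit integers, and `std::stoll` stands in for whatever conversion the real code uses.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not Scylla's actual code: split a comma-separated
// initial_token value and ignore blanks around each token.
std::vector<int64_t> parse_initial_tokens(const std::string& list) {
    std::vector<int64_t> tokens;
    std::size_t pos = 0;
    while (pos <= list.size()) {
        std::size_t comma = list.find(',', pos);
        if (comma == std::string::npos) {
            comma = list.size();  // last item runs to the end of the string
        }
        const std::string item = list.substr(pos, comma - pos);
        // Trim spaces and tabs on both sides before converting.
        const std::size_t first = item.find_first_not_of(" \t");
        if (first != std::string::npos) {
            const std::size_t last = item.find_last_not_of(" \t");
            assert(last >= first);  // holds whenever a non-blank char exists
            tokens.push_back(std::stoll(item.substr(first, last - first + 1)));
        }
        pos = comma + 1;  // continue after the comma
    }
    return tokens;
}
```

Items that are empty or blank are skipped here; whether the real parser rejects them instead is not stated in the commit message.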

10374 Commits