scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-02 04:56:58 +00:00

Author	SHA1	Message	Date
Glauber Costa	2cd756ae5e	repair: replace a magic number with another magic number In due time we will have to fix this, but as an interim step, let's use a "better" magic number. The problem with 100 is that as soon as the partitions start to get bigger, we're using too much memory. Since this is multiplied by the number of token ranges, and happens in every shard, the final number can become really big, and the amount of resources we use goes up proportionally. This means that even if we are mistaken about the new number (we probably are), in this case it is better to err on the side of a more conservative resource usage. Reviewed-by: Nadav Har'El <nyh@scylladb.com> Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <97158f3db5734916cee4ccf12eaa66e7402570bb.1457448855.git.glauber@scylladb.com>	2016-03-08 17:29:00 +02:00
Nadav Har'El	b7e29691c2	sstables: avoid index and data file over-reads When we do a streaming read that knows the expected end position of the read, we can use a large read-ahead buffer, and at the same time, stop reading at exactly the intended end (or small rounding of it to the DMA block size) and not waste resources blindly reading a large amount of data after the end just to fill the read-ahead buffer. The sstable reading code, both for reading the data file and the index file, created a file input stream without specifying its end, thereby losing this optimization - so when a large buffer was used, we would get a large over-read. This patch fixes this, so sstable data file and index file are read using a file input stream which is aware of its end. Fixes #964. Note that this patch does not change the behavior when reading a compressed data file. For compressed read, we did not have the problem of over-read in the first place, because chunks are read one by one. But we do have other sources of inefficiencies there (stemming, again, from the fact that the compressed chunks are read one by one), and I opened a separate issue #992 for that. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1457219304-12680-1-git-send-email-nyh@scylladb.com>	2016-03-08 17:26:10 +02:00
Calle Wilund	8575f1391f	lists.cc: fix update insert of frozen list Fixes #967 Frozen lists are just atomic cells. However, old code inserted the frozen data directly as an atomic_cell_or_collection, which in turn meant it lacked the header data of a cell. When in turn it was handled by internal serialization (freeze), since the schema said it was not a (non-frozen) collection, we tried to look at frozen list data as cell header -> most likely considered dead. Message-Id: <1457432538-28836-1-git-send-email-calle@scylladb.com>	2016-03-08 13:48:45 +01:00
Pekka Enberg	81af486b69	Update scylla-ami submodule * dist/ami/files/scylla-ami d4a0e18...84bcd0d (1): > Add --ami parameter	2016-03-08 13:49:31 +02:00
Takuya ASADA	254b0fa676	dist: show message to use XFS for scylla data directory and also notify about developer mode, when iotune fails Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1457426286-15925-1-git-send-email-syuu@scylladb.com>	2016-03-08 12:20:33 +02:00
Pekka Enberg	83d82ea901	Merge "Fix Ubuntu package issues on AMI" from Takuya "This fixes bugs on Ubuntu package and AMI scripts, closes #991."	2016-03-08 11:51:30 +02:00
Takuya ASADA	18a27de3c8	dist: export all entries on /etc/default/scylla-server on Ubuntu Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2016-03-08 18:18:30 +09:00
Gleb Natapov	ce6d1a242a	storage_proxy: fix background_reads counter background_reads collectd counter was not always properly decremented. Fix it and streamline background read repair error handling. Message-Id: <20160307182255.GI4849@scylladb.com>	2016-03-07 19:41:09 +01:00
Yoav Kleinberger	1cd01cd2ab	tools/scyllatop: defend against curses "out of screen bounds" error Fixes issue #945 (hopefully) This issue was probably the result of trying to write outside the confines of the window. The views.Base class now defends against this. Signed-off-by: Yoav Kleinberger <yoav@scylladb.com> Message-Id: <9735806b211567f3239e187d87437c484f532291.1457265435.git.yoav@scylladb.com>	2016-03-07 18:02:26 +01:00
Raphael S. Carvalho	0f4239d63a	service: improve logging of storage_service::load_new_sstables Closes #952. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <2402f387c32d2d1221e740edb67e56c1593c1936.1457366098.git.raphaelsc@scylladb.com>	2016-03-07 18:01:52 +01:00
Raphael S. Carvalho	e850c1406e	sstables: update comment Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <8abc1c6c66ed8d3bb35ecfb6d8251de3f61a97ae.1457093016.git.raphaelsc@scylladb.com>	2016-03-07 17:36:34 +01:00
Raphael S. Carvalho	822759eee0	compaction_manager: update stat pending_tasks properly Size of both _cfs_to_cleanup and _cfs_to_compact must be added when calculating a new value to _stats.pending_tasks. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <b601e24d0631922798575f39d00fb54fe00d4971.1457093016.git.raphaelsc@scylladb.com>	2016-03-07 17:36:03 +01:00
Gleb Natapov	2d092bbd32	storage_proxy: send read requests with timeout No need to wait for replies long after request is timed out. Message-Id: <1457351304-28721-2-git-send-email-gleb@scylladb.com>	2016-03-07 14:00:11 +01:00
Gleb Natapov	4122422d19	storage_proxy: always wait for digest read resolver done future Currently it is waited upon only if background read repair check is needed, and this causes an unhandled exception warning to be printed if it enters failed state. Fix this by always waiting on it, but doing anything beyond ignoring an exception only if check is needed. Message-Id: <1457351304-28721-1-git-send-email-gleb@scylladb.com>	2016-03-07 14:00:09 +01:00
Gleb Natapov	626c9d046b	fix EACH_QUORUM handling during bootstrapping Currently write acknowledgements handling does not take bootstrapping node into account for CL=EACH_QUORUM. The patch fixes it. Fixes #994 Message-Id: <20160307121620.GR2253@scylladb.com>	2016-03-07 13:56:34 +01:00
Raphael S. Carvalho	d65642cee8	fix storage_service::load_new_sstables() to not disable write permanently Avi says: "If an exception happens, then enable_sstable_writes won't be called." The problem is fixed by catching a possible exception and enabling sstable write for the relevant column family if it wasn't enabled already. Closes #953. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <32c1bcb2c60c7b9e5514eb0a95062f40ca92093a.1457119308.git.raphaelsc@scylladb.com>	2016-03-07 13:56:02 +01:00
Gleb Natapov	f59415b3c6	Take pending endpoints into account while checking for sufficient live nodes During bootstrapping additional copies of data have to be made to ensure that CL level is met (see CASSANDRA-833 for details). Our code does that, but it does not take into account that a bootstrapping node can be dead, which may cause a request to proceed even though there are not enough live nodes for it to be completed. In such a case the request neither completes nor times out, so it appears to be stuck from the CQL layer POV. The patch fixes this by taking pending nodes into account while checking that there are enough live nodes for the operation to proceed. Fixes #965 Message-Id: <20160303165250.GG2253@scylladb.com>	2016-03-07 13:30:13 +01:00
Gleb Natapov	8dad399256	log: add space between log level and date in the output It was dropped by `6dc51027a3` Message-Id: <20160306125313.GI2253@scylladb.com>	2016-03-07 13:06:06 +01:00
Tomasz Grabiec	9deb036e4e	Merge branch 'dev/issue-845-set-incremental-backup-config-v1' from seastar-dev.git From Vlad: This series modifies the 'database' class to use the internal _enable_incremental_backups value (initialized with 'incremental_backups' configuration value) instead of using the 'incremental_backups' configuration value directly. Then we update this internal value in runtime from 'nodetool enable/disablebackup' API callback so that newly created keyspaces and column families use the newly configured incremental backup configuration.	2016-03-07 10:47:20 +01:00
Tomasz Grabiec	b3e56549ca	Merge branch 'dev/issue-909-synchronization-part-v2' from seastar-dev.git From Vlad: This series fixes the first part of issue #909 (the second part has a separate github issue #965) which is a discrepancy between a storage_service::token_metadata and a gossiper::endpoint_state_map contents on non-zero shards.	2016-03-07 10:20:15 +01:00
Paweł Dziepak	99b61d3944	lsa: set _active to nullptr in region destructor In region destructor, after the active segment is freed, the pointer to it is left unchanged. This confuses the remaining parts of the destructor logic (namely, removal from region group) which may rely on the information in region_impl::_active. In this particular case the problem was that code removing from the region group called region_impl::occupancy() which was dereferencing _active if not null. Fixes #993. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1457341670-18266-1-git-send-email-pdziepak@scylladb.com>	2016-03-07 10:15:28 +01:00
Takuya ASADA	9ee14abf24	dist: export sysconfig for scylla-io-setup.service Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2016-03-07 18:13:30 +09:00
Takuya ASADA	3d9dc52f5f	Revert "Revert "dist: align ami option with others (-a --> --ami)"" This reverts commit `66c5feb9e9`. Conflicts: dist/common/scripts/scylla_sysconfig_setup Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2016-03-07 18:13:30 +09:00
Takuya ASADA	c9882bc2c4	Revert "Revert "Revert "dist: remove AMI entry from sysconfig, since there is no script refering it""" This reverts commit `643beefc8c`. Conflicts: dist/common/scripts/scylla_sysconfig_setup dist/common/sysconfig/scylla-server Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2016-03-07 17:15:42 +09:00
Takuya ASADA	c888eaac74	dist: add /etc/scylla.d/io.conf on Ubuntu Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2016-03-07 17:15:42 +09:00
Vlad Zolotarov	2cd836a02e	api::set_storage_service(): fix the 'nodetool enablebackup' API 'nodetool enable/disablebackup' callback was modifying only the existing keyspaces and column families configurations. However new keyspaces/column families were using the original 'incremental_backups' configuration value which could be different from the value configured by 'nodetool enable/disablebackup' user command. This patch updates the database::_enable_incremental_backups per-shard value in addition to updating the existing keyspaces and column families configurations. Fixes #845 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-03-06 17:26:31 +02:00
Vlad Zolotarov	a45ecaf336	database: store "incremental backup" configuration value in per-shard instance Store the "incremental_backups" configuration value in the database class (and use it when creating a keyspace::config) in order to be able to modify it in runtime. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-03-06 17:22:48 +02:00
Vlad Zolotarov	87e6efcdab	storage_service: distribute gossiper::endpoint_state_map together with token_metadata If storage_service::token_metadata is not distributed together with gossiper::endpoint_state_map there may be a situation when a non-zero shard sees a new value in token_metadata (e.g. newly added node's token ranges) while still seeing an old gossiper::endpoint_state_map contents (e.g. a mentioned above newly added node may not be present, thus causing gossiper::is_alive() to return FALSE for that node, while the node is actually alive and kicking). To avoid this discrepancy we will always update a token_metadata together with an endpoint_state_map when we distribute new token_metadata data among shards. Fixes #909 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-03-06 13:15:19 +02:00
Vlad Zolotarov	3a72ef87f2	gossiper: make _shadow_endpoint_state_map public and rename We will need to access it from a storage_service class when replicating token_metadata. Rename _shadow_endpoint_state_map -> shadow_endpoint_state_map according to our coding convention. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-03-06 11:16:44 +02:00
Vlad Zolotarov	4a21d48cc5	gossiper: use a semaphore instead of a future<> for serializing a timer callback Use a semaphore to allow serializing with a gossiper's timer callback. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-03-06 11:16:44 +02:00
Takuya ASADA	6dc51027a3	log: make log.cc able to compile with g++-4.9 std::put_time() is not implemented on g++-4.9, so replace it with strftime(). Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1457024183-893-1-git-send-email-syuu@scylladb.com>	2016-03-04 12:48:43 +01:00
Avi Kivity	6c2e57b003	Merge seastar upstream * seastar ba615c7...906b562 (1): > rpc: prepare some more for feature negotiation	2016-03-03 18:22:57 +02:00
Gleb Natapov	b89b6f442b	storage_proxy: fix race between read cl completion and timeout in digest resolver If timeout happens after cl promise is fulfilled, but before continuation runs it removes all the data that cl continuation needs to calculate result. Fix this by calculating result immediately and returning it in cl promise instead of delaying this work until continuation runs. This has a nice side effect of simplifying digest mismatch handling and making it exception free. Fixes #977. Message-Id: <1457015870-2106-3-git-send-email-gleb@scylladb.com>	2016-03-03 16:48:28 +02:00
Gleb Natapov	e4ac5157bc	storage_proxy: store only one data reply in digest resolver. Read executor may ask for more than one data reply during digest resolving stage, but only one result is actually needed to satisfy a query, so no need to store all of them. Message-Id: <1457015870-2106-2-git-send-email-gleb@scylladb.com>	2016-03-03 16:47:53 +02:00
Gleb Natapov	69b61b81ce	storage_proxy: fix cl achieved condition in digest resolver timeout handler In digest resolver for cl to be achieved it is not enough to get correct number of replies, but also to have data reply among them. The condition in digest timeout does not check that, fortunately we have a variable that we set to true when cl is achieved, so use it instead. Message-Id: <1457015870-2106-1-git-send-email-gleb@scylladb.com>	2016-03-03 16:47:11 +02:00
Tomasz Grabiec	2abd62b5cb	bytes_ostream: Drop methods which serialize integers This will make bytes_ostream completely agnostic to serialization format, which should be determined by layer above it. Message-Id: <1457004221-8345-2-git-send-email-tgrabiec@scylladb.com>	2016-03-03 13:27:27 +02:00
Tomasz Grabiec	aaac2a3cec	serializer: Add missing include Message-Id: <1457004221-8345-1-git-send-email-tgrabiec@scylladb.com>	2016-03-03 13:27:22 +02:00
Pekka Enberg	9c930d88a0	db/system_keyspace: Remove ifdef'd code We have our implementations of all the three ifdef'd functions. Message-Id: <1456926917-12594-1-git-send-email-penberg@scylladb.com>	2016-03-03 12:26:50 +02:00
Takuya ASADA	da56325f69	configure.py: add support --static-stdc++ for seastar binaries (iotune) Ubuntu 14.04LTS package is broken now because iotune is not statically linked against libstdc++, so this patch fixes it. Requires seastar patch to add --static-stdc++ on configure.py. Fixes #982 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1456995050-22007-1-git-send-email-syuu@scylladb.com>	2016-03-03 12:18:47 +02:00
Avi Kivity	d4c92c7e27	Merge seastar upstream * seastar b3fc7c5...ba615c7 (1): > configure.py: add --static-stdc++ to link libstdc++ statically	2016-03-03 12:18:23 +02:00
Asias He	01cb6b0d42	gossip: Send syn message in parallel and do not wait for it 1) As explained in commit `697b16414a` (gossip: Make gossip message handling async), in each gossip round we can talk to the 1-3 peer nodes in parallel to reduce latency of gossip round. 2) Gossip syn message uses one way rpc message, but now the returned future of the one way message is ready only when message is dequeued for some reason (sent or dropped). If we wait for the one way syn message to return it might block the gossip round for an unbounded time. To fix, do not wait for it in the gossip round. The downside is there will be no back pressure to bound the syn messages, however since the messages are once per second, I think it is fine. Message-Id: <ea4655f121213702b3f58185378bb8899e422dd1.1456991561.git.asias@scylladb.com>	2016-03-03 11:17:50 +02:00
Takuya ASADA	e545013e47	Revert "dist: downgrade g++ to 4.9 on Ubuntu" This reverts commit `01bd4959ac`. Fixes #983 Conflicts: dist/ubuntu/build_deb.sh dist/ubuntu/control.in dist/ubuntu/rules.in Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1456996244-19889-1-git-send-email-syuu@scylladb.com>	2016-03-03 11:12:18 +02:00
Tomasz Grabiec	04f2482d74	schema_tables: Log results of schema merge Currently schema changes are only logged at coordinator node which initiates the change. It would be helpful in post-mortem analysis to also see when and how schema changes are resolved when applied on other nodes. Message-Id: <1456953095-1982-1-git-send-email-tgrabiec@scylladb.com>	2016-03-03 11:12:15 +02:00
Nadav Har'El	2cf09147b5	Repair: don't use freeze() to calculate mutation checksums Use the existing "feed_hash" mechanism to find a checksum of the content of a mutation, instead of serializing the mutation (with freeze()) and then finding the checksum of that string. The serialized form is more prone to future changes, and not really guaranteed to provide equal hashes for mutations which are considered "equal". Fixes #971 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1456958676-27121-1-git-send-email-nyh@scylladb.com>	2016-03-03 09:58:24 +01:00
Avi Kivity	bec30ccf25	build: add order-only dependency between building antlr .o and IDL headers This ensures that if an antlr generated .cpp file depends on an IDL-generated .hh file, then that .hh is generated before the .o is built.	2016-03-03 09:52:25 +02:00
Tomasz Grabiec	b42d3a90b3	cql3: create_table_statement: Sort _defined_names by text Currently they are sorted by address in memory, which breaks the check for column name duplicates, which assumes sorting by text. Fixes #975. Message-Id: <1456937400-20475-1-git-send-email-tgrabiec@scylladb.com>	2016-03-02 18:53:43 +02:00
Avi Kivity	dda77d14b9	Merge seastar upstream * seastar 9964cbf...b3fc7c5 (2): > Introduce util/indirect.hh > reactor: new counters for the io queue	2016-03-02 18:52:36 +02:00
Calle Wilund	0c3322befd	commitlog: Ensure segment survives whole flush call Must keep shared pointer alive. Likewise though, the shared pointer copy in cycle main continuation is not needed. Message-Id: <1456931988-5876-3-git-send-email-calle@scylladb.com>	2016-03-02 18:22:13 +02:00
Calle Wilund	f1c4e3eb3d	commitlog: Clear reserve segments in orphan_all Otherwise they will keep the segment_manager alive (leak). Fixes jenkins ASan errors. Message-Id: <1456931988-5876-2-git-send-email-calle@scylladb.com>	2016-03-02 18:22:09 +02:00
Calle Wilund	a556f665c0	commitlog: Take segment_manager locks first in write/flush While it is formally better to take a local lock first and then contend for a global one, in this case it is arguably better to ensure we get a gate exception synchronously (early) instead of potentially in a continuation. Old version might cause us to do a gate::leave even while never entered. And since we should really only have one active (contending) segment per shard anyway, it should not matter. Message-Id: <1456931988-5876-1-git-send-email-calle@scylladb.com>	2016-03-02 18:22:05 +02:00
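
Note on 2cf09147b5 (Repair: don't use freeze() to calculate mutation checksums): the message argues that a checksum over the serialized form is not guaranteed to match for mutations that are considered equal. The following is only a rough analogy in Python, not Scylla code. A plain dict stands in for a mutation, `json.dumps` stands in for `freeze()`, and the hypothetical `checksum_by_feeding` stands in for the "feed_hash" mechanism; none of these names come from the repository.

```python
import hashlib
import json


def checksum_of_serialized(mutation: dict) -> str:
    """Hash the serialized bytes; the result depends on the serialization layout."""
    data = json.dumps(mutation).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def checksum_by_feeding(mutation: dict) -> str:
    """Feed the logical content to the hasher in a canonical (sorted) order."""
    hasher = hashlib.sha256()
    for column in sorted(mutation):
        for part in (column, mutation[column]):
            encoded = part.encode("utf-8")
            # Length-prefix each part so adjacent fields cannot run together.
            hasher.update(len(encoded).to_bytes(4, "big"))
            hasher.update(encoded)
    return hasher.hexdigest()


# Two logically equal "mutations" whose serialized layouts differ.
a = {"name": "alice", "city": "oslo"}
b = {"city": "oslo", "name": "alice"}
```

Here `a == b` holds, yet `checksum_of_serialized(a)` and `checksum_of_serialized(b)` differ because the serialized byte order differs, while `checksum_by_feeding` returns the same digest for both. This is the property a repair checksum needs when two replicas hold equal data in different layouts.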


8785 Commits