path: root/block
2023-05-19  graph-lock: Disable locking for now  (Kevin Wolf)
In QEMU 8.0, we've been seeing deadlocks in bdrv_graph_wrlock(). They come from callers that hold an AioContext lock, which is not allowed during polling. In theory, we could temporarily release the lock, but callers are inconsistent about whether they hold a lock, and if they do, some are also confused about which one they hold. While all of this is fixable, it's not trivial, and the best course of action for 8.0.1 is probably just disabling the graph locking code temporarily. We don't currently rely on graph locking yet. It is supposed to replace the AioContext lock eventually to enable multiqueue support, but as long as we still have the AioContext lock, it is sufficient without the graph lock. Once the AioContext lock goes away, the deadlock doesn't exist any more either and this commit can be reverted. (Of course, it can also be reverted while the AioContext lock still exists if the callers have been fixed.) Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230517152834.277483-2-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-19  graph-lock: Honour read locks even in the main thread  (Kevin Wolf)
There are some conditions under which we don't actually need to do anything for taking a reader lock: Writing the graph is only possible from the main context while holding the BQL. So if a reader is running in the main context under the BQL and knows that it won't be interrupted until the next writer runs, we don't actually need to do anything. This is the case if the reader code neither has a nested event loop (this is forbidden anyway while you hold the lock) nor is a coroutine (because a writer could run when the coroutine has yielded). These conditions are exactly what bdrv_graph_rdlock_main_loop() asserts. They are not fulfilled in bdrv_graph_co_rdlock(), which always runs in a coroutine. This deletes the shortcuts in bdrv_graph_co_rdlock() that skip taking the reader lock in the main thread. Reported-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230510203601.418015-9-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
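The distinction described above can be sketched in C. The following is an illustrative sketch, not the actual graph-lock implementation: the assertions paraphrase the conditions from the commit message, and qemu_in_main_thread()/qemu_in_coroutine()/bdrv_graph_co_rdlock() are existing QEMU helpers used here only to show the shape of the two paths.

    /* Main-loop, non-coroutine readers may rely on the BQL alone. */
    static void graph_rdlock_main_loop_sketch(void)
    {
        assert(qemu_in_main_thread());   /* writers also run here, under the BQL */
        assert(!qemu_in_coroutine());    /* a yield point could let a writer in */
        /* Nothing else to do: no writer can run before we return. */
    }

    /* Coroutine readers really have to take the reader lock. */
    static void coroutine_fn graph_co_rdlock_sketch(void)
    {
        bdrv_graph_co_rdlock();          /* no main-thread shortcut after this commit */
    }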
2023-05-19  blockjob: Adhere to rate limit even when reentered early  (Kevin Wolf)
When jobs are sleeping, for example to enforce a given rate limit, they can be reentered early, in particular in order to get paused, to update the rate limit or to get cancelled. Before this patch, they behave in this case as if they had fully completed their rate limiting delay. This means that requests are sped up beyond their limit, violating the constraints that the user gave us. Change the block jobs to sleep in a loop until the necessary delay is completed, while still allowing cancelling them immediately, as well as pausing (handled by the pause point in job_sleep_ns()) and updating the rate limit. This change is also motivated by iotests cases being prone to fail because drain operations pause and unpause them so often that block jobs complete earlier than they are supposed to. In particular, the next commit would fail iotests 030 without this change. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230510203601.418015-8-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
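The "sleep in a loop until the delay has really elapsed" idea can be shown outside of QEMU. The standalone C sketch below is not QEMU code: plain POSIX clock_gettime()/nanosleep() stand in for job_sleep_ns() and the job infrastructure, and only the re-check-after-early-wakeup pattern is the point.

    #define _POSIX_C_SOURCE 200809L
    #include <stdint.h>
    #include <time.h>

    /* Current monotonic time in nanoseconds. */
    static int64_t now_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
    }

    /*
     * Sleep until deadline_ns, tolerating early wakeups: if the sleep returns
     * early (a signal here plays the role of an early job reentry), loop and
     * sleep again for the remaining time instead of treating the delay as done.
     */
    static void sleep_until(int64_t deadline_ns)
    {
        int64_t remaining;

        while ((remaining = deadline_ns - now_ns()) > 0) {
            struct timespec req = {
                .tv_sec  = remaining / 1000000000LL,
                .tv_nsec = remaining % 1000000000LL,
            };
            nanosleep(&req, NULL);   /* may return early; the loop re-checks */
        }
    }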
2023-05-19  qcow2: Unlock the graph in qcow2_do_open() where necessary  (Kevin Wolf)
qcow2_do_open() calls a few no_co_wrappers that wrap functions taking the graph lock internally as a writer. Therefore, it can't hold the reader lock across these calls; doing so causes deadlocks. Drop the lock temporarily around the calls. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230510203601.418015-4-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
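A sketch of the pattern this commit describes, inside a coroutine_fn that normally runs with the graph reader lock held. bdrv_graph_co_rdlock()/bdrv_graph_co_rdunlock() are the existing locking primitives; do_graph_modifying_call() is hypothetical and stands in for one of the no_co_wrappers that takes the writer lock internally.

    static int coroutine_fn open_step_sketch(BlockDriverState *bs, Error **errp)
    {
        int ret;

        /* Can't hold the reader lock while the callee takes the writer lock. */
        bdrv_graph_co_rdunlock();
        ret = do_graph_modifying_call(bs, errp);   /* hypothetical no_co_wrapper */
        bdrv_graph_co_rdlock();

        return ret;
    }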
2023-05-19  block/export: Fix null pointer dereference in error path  (Kevin Wolf)
There are some error paths in blk_exp_add() that jump to 'fail:' before 'exp' is even created. So we can't just unconditionally access exp->blk. Add a NULL check, and switch from exp->blk to blk, which is available earlier, just to be extra sure that we really cover all cases where BlockDevOps could have been set for it (in practice, this only happens in drv->create() today, so this part of the change isn't strictly necessary). Fixes: Coverity CID 1509238 Fixes: de79b52604e43fdeba6cee4f5af600b62169f2d2 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230510203601.418015-3-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
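A minimal sketch of the NULL-check pattern described above. The real blk_exp_add() is much more involved; setup_export() is a hypothetical helper, and only the guarded error path is the point.

    static int blk_exp_add_sketch(BlockExportOptions *opts, Error **errp)
    {
        BlockBackend *blk = NULL;
        int ret;

        ret = setup_export(opts, &blk, errp);   /* hypothetical; may fail early */
        if (ret < 0) {
            goto fail;
        }
        return 0;

    fail:
        if (blk) {                              /* blk may never have been created */
            blk_set_dev_ops(blk, NULL, NULL);
            blk_unref(blk);
        }
        return ret;
    }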
2023-05-19  block: Call .bdrv_co_create(_opts) unlocked  (Kevin Wolf)
These are functions that modify the graph, so they must be able to take a writer lock. This is impossible if they already hold the reader lock. If they need a reader lock for some of their operations, they should take it internally. Many of them go through blk_*(), which will always take the lock itself. Direct calls of bdrv_*() need to take the reader lock. Note that while locking for bdrv_co_*() calls is checked by TSA, this is not the case for the mixed_coroutine_fns bdrv_*(). Holding the lock is still required when they are called from coroutine context like here! This effectively reverts 4ec8df0183, but adds some internal locking instead. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230510203601.418015-2-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-15  block: add accounting for zone append operation  (Sam Li)
To account for the new zone append write operation on zoned devices, the BLOCK_ACCT_ZONE_APPEND enum value is introduced alongside the other I/O request types (read, write, flush). Signed-off-by: Sam Li <faithilikerun@gmail.com> Message-id: 20230508051916.178322-3-faithilikerun@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-15  block: add some trace events for zone append  (Sam Li)
Signed-off-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20230508051510.177850-5-faithilikerun@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-15  block: introduce zone append write for zoned devices  (Sam Li)
A zone append command is a write operation that specifies the first logical block of a zone as the write position. When writing to a zoned block device using zone append, the byte offset of the call may point at any position within the zone to which the data is being appended. Upon completion the device will respond with the position where the data has been written in the zone. Signed-off-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20230508051510.177850-3-faithilikerun@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
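As a sketch of how a caller might consume such an interface: the blk_co_zone_append() signature below is assumed from the commit description (the offset is passed by reference so the device can report where the data actually landed), not copied from the final API.

    static int coroutine_fn append_to_zone_sketch(BlockBackend *blk,
                                                  int64_t zone_start,
                                                  QEMUIOVector *qiov,
                                                  int64_t *written_at)
    {
        int64_t offset = zone_start;   /* in: start of the target zone */
        int ret = blk_co_zone_append(blk, &offset, qiov, 0);   /* assumed signature */

        if (ret == 0) {
            *written_at = offset;      /* out: where the device placed the data */
        }
        return ret;
    }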
2023-05-15  file-posix: add tracking of the zone write pointers  (Sam Li)
Since Linux doesn't have a user API to issue zone append operations to zoned devices from user space, the file-posix driver is modified to add zone append emulation using regular writes. To do this, the file-posix driver tracks the write pointer (wp) location of all zones of the device in an array of uint64_t. The most significant bit of each wp entry indicates whether the zone is a conventional zone. A zone's wp can change due to the following operations: - zone reset: change the wp to the start offset of that zone - zone finish: change the wp to the end location of that zone - write to a zone - zone append Signed-off-by: Sam Li <faithilikerun@gmail.com> Message-id: 20230508051510.177850-2-faithilikerun@gmail.com [Fix errno propagation from handle_aiocb_zone_mgmt() --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
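A standalone sketch of the "most significant bit marks a conventional zone" encoding mentioned above. The helper names are invented for illustration and the real file-posix code may differ; only the bit-packing idea is shown.

    #include <inttypes.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define ZONE_CONV_FLAG (1ULL << 63)   /* MSB: zone is conventional */

    /* Pack a write pointer and the "conventional zone" flag into one uint64_t. */
    static uint64_t wp_encode(uint64_t wp, bool conventional)
    {
        return conventional ? (wp | ZONE_CONV_FLAG) : wp;
    }

    static bool wp_is_conventional(uint64_t v)
    {
        return v & ZONE_CONV_FLAG;
    }

    static uint64_t wp_location(uint64_t v)
    {
        return v & ~ZONE_CONV_FLAG;
    }

    int main(void)
    {
        uint64_t v = wp_encode(0x100000, true);

        printf("conventional=%d wp=0x%" PRIx64 "\n",
               wp_is_conventional(v), wp_location(v));
        return 0;
    }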
2023-05-15  block: add some trace events for new block layer APIs  (Sam Li)
Signed-off-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20230508045533.175575-8-faithilikerun@gmail.com Message-id: 20230324090605.28361-8-faithilikerun@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-15  block: add zoned BlockDriver check to block layer  (Sam Li)
Putting zoned/non-zoned BlockDrivers on top of each other is not allowed. Signed-off-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20230508045533.175575-6-faithilikerun@gmail.com Message-id: 20230324090605.28361-6-faithilikerun@gmail.com [Adjust commit message prefix as suggested by Philippe Mathieu-Daudé <philmd@linaro.org> and clarify that the check is about zoned BlockDrivers. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-15  block/raw-format: add zone operations to pass through requests  (Sam Li)
The raw-format driver usually sits on top of the file-posix driver, so it needs to pass zone command requests through. Signed-off-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20230508045533.175575-5-faithilikerun@gmail.com Message-id: 20230324090605.28361-5-faithilikerun@gmail.com [Adjust commit message prefix as suggested by Philippe Mathieu-Daudé <philmd@linaro.org>. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-15  block/block-backend: add block layer APIs resembling Linux ZonedBlockDevice ioctls  (Sam Li)
Add a zoned device option to the host_device BlockDriver. It will be presented only for zoned host block devices. By adding zone management operations to the host_device BlockDriver, users can use the new block layer APIs including Report Zone and four zone management operations (open, close, finish, reset, reset_all). qemu-io uses the new APIs to perform zoned storage commands on the device: zone_report (zrp), zone_open (zo), zone_close (zc), zone_reset (zrs), zone_finish (zf). For example, to test zone_report, use the following command: $ ./build/qemu-io --image-opts -n driver=host_device,filename=/dev/nullb0 -c "zrp offset nr_zones" Signed-off-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20230508045533.175575-4-faithilikerun@gmail.com Message-id: 20230324090605.28361-4-faithilikerun@gmail.com [Adjust commit message prefix as suggested by Philippe Mathieu-Daudé <philmd@linaro.org> and remove spurious ret = -errno in raw_co_zone_mgmt(). --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-15  block/file-posix: introduce helper functions for sysfs attributes  (Sam Li)
Use get_sysfs_str_val() to get the string value of the device's zoned model. Then get_sysfs_zoned_model() can convert it to QEMU's BlockZoneModel type. Use get_sysfs_long_val() to get the long value of zoned device information. Signed-off-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20230508045533.175575-3-faithilikerun@gmail.com Message-id: 20230324090605.28361-3-faithilikerun@gmail.com [Adjust commit message prefix as suggested by Philippe Mathieu-Daudé <philmd@linaro.org>. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
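A standalone sketch of what such a sysfs helper can look like, reading a block-device queue attribute such as /sys/block/<dev>/queue/zoned. The function name and error handling are illustrative, not the actual file-posix helpers.

    #include <stdio.h>
    #include <string.h>

    /* Read a sysfs queue attribute of a block device into buf; 0 on success. */
    static int read_sysfs_attr(const char *dev, const char *attr,
                               char *buf, size_t len)
    {
        char path[256];
        FILE *f;

        snprintf(path, sizeof(path), "/sys/block/%s/queue/%s", dev, attr);
        f = fopen(path, "r");
        if (!f) {
            return -1;
        }
        if (!fgets(buf, len, f)) {
            fclose(f);
            return -1;
        }
        fclose(f);
        buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
        return 0;
    }

For a zoned null_blk device, read_sysfs_attr("nullb0", "zoned", buf, sizeof(buf)) would typically yield "none", "host-aware" or "host-managed", which is the kind of string a helper like get_sysfs_zoned_model() would then map to a BlockZoneModel value.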
2023-05-10  block/meson.build: prefer positive condition for replication  (Vladimir Sementsov-Ogievskiy)
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20230428194928.1426370-2-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-05-10  block: compile out assert_bdrv_graph_readable() by default  (Stefan Hajnoczi)
reader_count() is a performance bottleneck because the global aio_context_list_lock mutex causes thread contention. Put this debugging assertion behind a new ./configure --enable-debug-graph-lock option and disable it by default. The --enable-debug-graph-lock option is also enabled by the more general --enable-debug option. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230501173443.153062-1-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
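The compile-out approach described here is the usual config-gated assertion idiom. A generic sketch follows; the CONFIG_DEBUG_GRAPH_LOCK symbol is assumed to be what --enable-debug-graph-lock defines, and the wrapper macro name is invented for illustration.

    /*
     * Only debug builds pay for the contended reader_count() bookkeeping;
     * release builds compile the check away entirely.
     */
    #ifdef CONFIG_DEBUG_GRAPH_LOCK
    #define ASSERT_GRAPH_READABLE() assert_bdrv_graph_readable()
    #else
    #define ASSERT_GRAPH_READABLE() ((void)0)
    #endif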
2023-05-10  block: Mark bdrv_refresh_limits() and callers GRAPH_RDLOCK  (Kevin Wolf)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_refresh_limits() need to hold a reader lock for the graph because it accesses the children list of a node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-21-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
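As an illustration of what such an annotation looks like (a sketch: the bdrv_refresh_limits() parameter list is abbreviated, and the main-loop lock/unlock pair shown in the caller is just one way to satisfy the TSA requirement):

    /* Declaration: callers must hold the graph reader lock (checked by TSA). */
    void GRAPH_RDLOCK bdrv_refresh_limits(BlockDriverState *bs, Error **errp);

    /* A non-coroutine caller in the main loop can satisfy the requirement so: */
    static void refresh_limits_example(BlockDriverState *bs, Error **errp)
    {
        bdrv_graph_rdlock_main_loop();
        bdrv_refresh_limits(bs, errp);
        bdrv_graph_rdunlock_main_loop();
    }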
2023-05-10  block: Mark bdrv_recurse_can_replace() and callers GRAPH_RDLOCK  (Kevin Wolf)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_recurse_can_replace() need to hold a reader lock for the graph because it accesses the children list of a node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-20-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  block: Mark bdrv_query_bds_stats() and callers GRAPH_RDLOCK  (Kevin Wolf)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_query_bds_stats() need to hold a reader lock for the graph because it accesses the children list of a node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-18-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  block: Mark BlockDriver callbacks for amend job GRAPH_RDLOCK  (Emanuele Giuseppe Esposito)
This adds GRAPH_RDLOCK annotations to declare that callers of amend callbacks in BlockDriver need to hold a reader lock for the graph. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-17-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  block: Mark bdrv_co_get_info() and callers GRAPH_RDLOCK  (Emanuele Giuseppe Esposito)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_get_info() need to hold a reader lock for the graph. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-15-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  block: Mark bdrv_co_get_allocated_file_size() and callers GRAPH_RDLOCK  (Emanuele Giuseppe Esposito)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_get_allocated_file_size() need to hold a reader lock for the graph. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20230504115750.54437-14-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  mirror: Require GRAPH_RDLOCK for accessing a node's parent list  (Kevin Wolf)
This adds GRAPH_RDLOCK annotations to declare that functions accessing the parent list of a node need to hold a reader lock for the graph. As it happens, they already do. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230504115750.54437-13-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  vhdx: Require GRAPH_RDLOCK for accessing a node's parent list  (Kevin Wolf)
This adds GRAPH_RDLOCK annotations to declare that functions accessing the parent list of a node need to hold a reader lock for the graph. As it happens, they already do. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-12-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  nbd: Mark nbd_co_do_establish_connection() and callers GRAPH_RDLOCK  (Emanuele Giuseppe Esposito)
This adds GRAPH_RDLOCK annotations to declare that callers of nbd_co_do_establish_connection() need to hold a reader lock for the graph. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-11-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  nbd: Remove nbd_co_flush() wrapper function  (Kevin Wolf)
The only thing nbd_co_flush() does is call nbd_client_co_flush(). Just use that function directly in the BlockDriver definitions and remove the wrapper. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-10-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  block: .bdrv_open is non-coroutine and unlocked  (Kevin Wolf)
Drivers were a bit confused about whether .bdrv_open can run in a coroutine and whether or not it holds a graph lock. It cannot keep a graph lock from the caller across the whole function because it both changes the graph (requires a writer lock) and does I/O (requires a reader lock). Therefore, it should take these locks internally as needed. The functions used to be called in coroutine context during image creation. This was buggy for other reasons, and as of commit 32192301, all block drivers go through no_co_wrappers. So it is not called in coroutine context any more. Fix qcow2 and qed to work with the correct assumptions: The graph lock needs to be taken internally instead of just assuming it's already there, and the coroutine path is dead code that can be removed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-9-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
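A sketch of the resulting shape of a .bdrv_open implementation, taking the locks internally as described. The driver, the helpers and their split between graph-changing and graph-reading work are hypothetical; only the locking pattern is the point.

    /* Hypothetical driver open: non-coroutine, entered without the graph lock. */
    static int example_open(BlockDriverState *bs, QDict *options, int flags,
                            Error **errp)
    {
        int ret;

        /*
         * Graph-changing setup (attaching children, etc.) needs the writer
         * side; the helpers used for that take it internally.
         */
        ret = example_attach_children(bs, options, errp);   /* hypothetical */
        if (ret < 0) {
            return ret;
        }

        /* Reading the graph (probing children) takes the reader lock. */
        bdrv_graph_rdlock_main_loop();
        ret = example_probe_format(bs, errp);               /* hypothetical */
        bdrv_graph_rdunlock_main_loop();

        return ret;
    }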
2023-05-10  block: bdrv/blk_co_unref() for calls in coroutine context  (Kevin Wolf)
These functions must not be called in coroutine context, because they need write access to the graph. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-4-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  block: Consistently call bdrv_activate() outside coroutine  (Kevin Wolf)
Migration code can call bdrv_activate() in coroutine context, whereas other callers call it outside of coroutines. As it calls other code that is not supposed to run in coroutines, standardise on running outside of coroutines. This adds a no_co_wrapper to switch to the main loop before calling bdrv_activate(). Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-3-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
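The no_co_wrapper machinery referred to here generates, from a declaration, a wrapper that hops out of coroutine context into the main loop before calling the real function. A sketch of how such a declaration reads (the exact name and placement of the generated wrapper are assumptions):

    /*
     * Sketch: from this declaration the generator emits bdrv_co_activate(),
     * which coroutine code calls; it schedules bdrv_activate() in the main
     * loop and yields until that call completes.
     */
    int no_co_wrapper bdrv_co_activate(BlockDriverState *bs, Error **errp);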
2023-05-10  qcow2: Don't call bdrv_getlength() in coroutine_fns  (Kevin Wolf)
There is a bdrv_co_getlength() now, which should be used in coroutine context. This requires adding GRAPH_RDLOCK to some functions so that this still compiles with TSA because bdrv_co_getlength() is GRAPH_RDLOCK. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-2-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  block: add missing coroutine_fn annotations  (Paolo Bonzini)
After the recent introduction of many new coroutine callbacks, a couple of calls from non-coroutine_fn to coroutine_fn have sneaked in; fix them. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230406101752.242125-1-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  block: add configure options for excluding vmdk, vhdx and vpc  (Vladimir Sementsov-Ogievskiy)
Let's add --enable / --disable configure options for these formats, so that those who don't need them may not build them. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230421092758.814122-1-vsementsov@yandex-team.ru> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-03  block/export: call blk_set_dev_ops(blk, NULL, NULL)  (Stefan Hajnoczi)
Most export types install BlockDeviceOps pointers. It is easy to forget to remove them because that happens automatically via the "drive" qdev property in hw/ but not block/export/. Put blk_set_dev_ops(blk, NULL, NULL) calls in the core export.c code so the export types don't need to remember. This fixes the nbd and vhost-user-blk export types. Fixes: fd6afc501a01 ("nbd/server: Use drained block ops to quiesce the server") Fixes: ca858a5fe94c ("vhost-user-blk-server: notify client about disk resize") Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20230502211119.720647-1-stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2023-04-26  Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging  (Richard Henderson)
Block layer patches - Protect BlockBackend.queued_requests with its own lock - Switch to AIO_WAIT_WHILE_UNLOCKED() where possible - AioContext removal: LinuxAioState/LuringState/ThreadPool - Add more coroutine_fn annotations, use bdrv/blk_co_* - Fix crash when executing hmp_commit # gpg: Signature made Tue 25 Apr 2023 02:12:29 PM BST # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] * tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (25 commits) block/monitor: Fix crash when executing HMP commit vmdk: make vmdk_is_cid_valid a coroutine_fn qcow2: mark various functions as coroutine_fn and GRAPH_RDLOCK tests: mark more coroutine_fns qemu-pr-helper: mark more coroutine_fns 9pfs: mark more coroutine_fns nbd: mark more coroutine_fns, do not use co_wrappers mirror: make mirror_flush a coroutine_fn, do not use co_wrappers blkdebug: add missing coroutine_fn annotation vvfat: mark various functions as coroutine_fn thread-pool: avoid passing the pool parameter every time thread-pool: use ThreadPool from the running thread io_uring: use LuringState from the running thread linux-aio: use LinuxAioState from the running thread block: add missing coroutine_fn to bdrv_sum_allocated_file_size() include/block: fixup typos monitor: convert monitor_cleanup() to AIO_WAIT_WHILE_UNLOCKED() hmp: convert handle_hmp_command() to AIO_WAIT_WHILE_UNLOCKED() block: convert bdrv_drain_all_begin() to AIO_WAIT_WHILE_UNLOCKED() block: convert bdrv_graph_wrlock() to AIO_WAIT_WHILE_UNLOCKED() ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-04-25  block/monitor: Fix crash when executing HMP commit  (Wang Liang)
hmp_commit() calls blk_is_available() from a non-coroutine context (and in the main loop). blk_is_available() is a co_wrapper_mixed_bdrv_rdlock function, and in a non-coroutine context it calls AIO_WAIT_WHILE(), which crashes if the aio_context lock has not been taken beforehand. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1615 Signed-off-by: Wang Liang <wangliangzz@inspur.com> Message-Id: <20230424103902.45265-1-wangliangzz@126.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  vmdk: make vmdk_is_cid_valid a coroutine_fn  (Paolo Bonzini)
Functions that can do I/O are prime candidates for being coroutine_fns. Make the change for the one that is itself called only from coroutine_fns. Unfortunately vmdk does not use a coroutine_fn for the bulk of the open (like qcow2 does) so vmdk_read_cid cannot have the same treatment. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230309084456.304669-10-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  qcow2: mark various functions as coroutine_fn and GRAPH_RDLOCK  (Paolo Bonzini)
Functions that can do I/O (including calling bdrv_is_allocated and bdrv_block_status functions) are prime candidates for being coroutine_fns. Make the change for those that are themselves called only from coroutine_fns. Also annotate that they are called with the graph rdlock taken, thus allowing them to call bdrv_co_*() functions for I/O. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230309084456.304669-9-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  mirror: make mirror_flush a coroutine_fn, do not use co_wrappers  (Paolo Bonzini)
mirror_flush() calls the mixed function blk_flush(), but it is itself only called from mirror_run(); so call the coroutine version and make mirror_flush() a coroutine_fn too. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230309084456.304669-4-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  blkdebug: add missing coroutine_fn annotation  (Paolo Bonzini)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230309084456.304669-3-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  vvfat: mark various functions as coroutine_fn  (Paolo Bonzini)
Functions that can do I/O are prime candidates for being coroutine_fns. Make the change for those that are themselves called only from coroutine_fns. In addition, coroutine_fns should do I/O using bdrv_co_*() functions, for which it is required to hold the BlockDriverState graph lock. So also annotate functions on the I/O path with TSA attributes, making it possible to switch them to use bdrv_co_*() functions. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230309084456.304669-2-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  thread-pool: avoid passing the pool parameter every time  (Emanuele Giuseppe Esposito)
thread_pool_submit_aio() is always called on a pool taken from qemu_get_current_aio_context(), and that is the only intended use: each pool runs only in the same thread that is submitting work to it; it can't run anywhere else. Therefore simplify the thread_pool_submit* API and remove the ThreadPool function parameter. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20230203131731.851116-5-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
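A sketch of the simplified call shape after dropping the pool parameter. The thread_pool_submit_aio() prototype is assumed from the commit description, and the worker/callback pair is a placeholder; the pool is now implicitly the one belonging to the current thread's AioContext.

    /* Hypothetical blocking worker, executed in a pool thread. */
    static int do_stat_work(void *opaque)
    {
        struct stat st;
        return stat((const char *)opaque, &st);
    }

    /* Completion callback, run back in the submitting AioContext. */
    static void stat_done(void *opaque, int ret)
    {
        /* handle the result */
    }

    static void submit_example(char *path)
    {
        /* Before: thread_pool_submit_aio(pool, do_stat_work, path, stat_done, path);
         * After:  no pool argument, the current thread's pool is used.          */
        thread_pool_submit_aio(do_stat_work, path, stat_done, path);
    }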
2023-04-25  thread-pool: use ThreadPool from the running thread  (Emanuele Giuseppe Esposito)
Use qemu_get_current_aio_context() where possible, since we always submit work to the current thread anyway. We also want to be sure that the thread submitting the work is the same as the one processing the pool, to avoid adding synchronization to the pool list. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20230203131731.851116-4-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  io_uring: use LuringState from the running thread  (Emanuele Giuseppe Esposito)
Remove usage of aio_context_acquire by always submitting asynchronous AIO to the current thread's LuringState. In order to prevent mistakes from the caller side, avoid passing LuringState in luring_io_{plug/unplug} and luring_co_submit, and document the functions to make clear that they work in the current thread's AioContext. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20230203131731.851116-3-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  linux-aio: use LinuxAioState from the running thread  (Emanuele Giuseppe Esposito)
Remove usage of aio_context_acquire by always submitting asynchronous AIO to the current thread's LinuxAioState. In order to prevent mistakes from the caller side, avoid passing LinuxAioState in laio_io_{plug/unplug} and laio_co_submit, and document the functions to make clear that they work in the current thread's AioContext. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20230203131731.851116-2-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  block: convert bdrv_drain_all_begin() to AIO_WAIT_WHILE_UNLOCKED()  (Stefan Hajnoczi)
Since the AioContext argument was already NULL, AIO_WAIT_WHILE() was never going to unlock the AioContext. Therefore it is possible to replace AIO_WAIT_WHILE() with AIO_WAIT_WHILE_UNLOCKED(). Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230309190855.414275-5-stefanha@redhat.com> Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  block: convert bdrv_graph_wrlock() to AIO_WAIT_WHILE_UNLOCKED()  (Stefan Hajnoczi)
The following conversion is safe and does not change behavior: GLOBAL_STATE_CODE(); ... - AIO_WAIT_WHILE(qemu_get_aio_context(), ...); + AIO_WAIT_WHILE_UNLOCKED(NULL, ...); Since we're in GLOBAL_STATE_CODE(), qemu_get_aio_context() is our home thread's AioContext. Thus AIO_WAIT_WHILE() does not unlock the AioContext: if (ctx_ && in_aio_context_home_thread(ctx_)) { \ while ((cond)) { \ aio_poll(ctx_, true); \ waited_ = true; \ } \ And that means AIO_WAIT_WHILE_UNLOCKED(NULL, ...) can be substituted. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230309190855.414275-4-stefanha@redhat.com> Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  block: convert blk_exp_close_all_type() to AIO_WAIT_WHILE_UNLOCKED()  (Stefan Hajnoczi)
There is no change in behavior. Switch to AIO_WAIT_WHILE_UNLOCKED() instead of AIO_WAIT_WHILE() to document that this code has already been audited and converted. The AioContext argument is already NULL so aio_context_release() is never called anyway. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230309190855.414275-3-stefanha@redhat.com> Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  block: don't acquire AioContext lock in bdrv_drain_all()  (Stefan Hajnoczi)
There is no need for the AioContext lock in bdrv_drain_all() because nothing in AIO_WAIT_WHILE() needs the lock and the condition is atomic. AIO_WAIT_WHILE_UNLOCKED() has no use for the AioContext parameter other than performing a check that is nowadays already done by the GLOBAL_STATE_CODE()/IO_CODE() macros. Set the ctx argument to NULL here to help us keep track of all converted callers. Eventually all callers will have been converted and then the argument can be dropped entirely. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230309190855.414275-2-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  block: protect BlockBackend->queued_requests with a lock  (Stefan Hajnoczi)
The CoQueue API offers thread-safety via the lock argument that qemu_co_queue_wait() and qemu_co_enter_next() take. BlockBackend currently does not make use of the lock argument. This means that multiple threads submitting I/O requests can corrupt the CoQueue's QSIMPLEQ. Add a QemuMutex and pass it to CoQueue APIs so that the queue is protected. While we're at it, also assert that the queue is empty when the BlockBackend is deleted. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230307210427.269214-4-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
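A sketch of the thread-safe CoQueue pattern this commit introduces. The structure and field names are illustrative rather than the actual BlockBackend layout; qemu_co_queue_wait() releases the given lock while the coroutine sleeps and re-acquires it on wakeup, which is what makes the queue safe to touch from multiple threads.

    typedef struct Backend {
        QemuMutex lock;             /* protects queued_requests and quiescing */
        CoQueue queued_requests;
        bool quiescing;
    } Backend;

    static void coroutine_fn wait_while_quiesced(Backend *b)
    {
        qemu_mutex_lock(&b->lock);
        while (b->quiescing) {
            /* Drops b->lock while waiting, takes it again when woken. */
            qemu_co_queue_wait(&b->queued_requests, &b->lock);
        }
        qemu_mutex_unlock(&b->lock);
    }

    static void resume_one_request(Backend *b)
    {
        qemu_mutex_lock(&b->lock);
        qemu_co_enter_next(&b->queued_requests, &b->lock);
        qemu_mutex_unlock(&b->lock);
    }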