aboutsummaryrefslogtreecommitdiff
path: root/block
AgeCommit message (Collapse)Author
2 daysMerge tag 'housekeeping-20240424' of https://github.com/philmd/qemu into stagingRichard Henderson
Removal of deprecated code - Remove the Nios II target and hardware - Remove pvrdma device and rdmacm-mux helper - Remove GlusterFS RDMA protocol handling - Update Sriram Yagnaraman mail address # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmYpE0YACgkQ4+MsLN6t # wN5PIA//egomANjRHAUAf9tdjljgT/JR49ejM7iInyxspR/xaiq0TlP2kP6aDNps # y1HAWBwfj5lGxeMgQ1mSKJGka3v2AIPWb7RbNT+9AaiWHv+sx5OrEytozUsFHLo8 # gSgRQocq0NY2a9dPbtkDqfbmq/rkCC7wgZzwroHsyOdiqYsWDKPJFleBDMjGmEaf # colhiDmhUPgvE3NNpwfEVNh/2SzxUxY8k5FHal6qij5z56ZqBglgnziDZEvGVCZ1 # uF4Hca/kh7TV2MVsdStPbGWZYDhJ/Np/2FnRoThD1Hc4qq8d/SH997m2F94tSOud # YeH54Vp5lmCeYgba5y8VP0ZPx/b9XnTtLvKggNdoqB+T2LBWPRt8kehqoaxvammF # ALzbY/t2vUxL6nIVbosOaTyqVOXvynk3/Js5S0jbnlu+vP2WvvFEzfYKIs2DIA8w # z56o/rG4KfyxF0aDB+CvLNwtJS8THqeivPqmYoKTdN9FPpN2RyBNLITrKo389ygF # 3oWy3+xsKGIPdNFY0a4l25xntqWNhND89ejzyL9M6G1cQ9RdEmTIUGTrinPQQmfP # oHIJMBeTdj7EqPL4LB3BR/htw9U5PobeMNYKFsRkS39PjGDqba5wbIdk3w5/Rcxa # s/PKdspDKWPwZ5jhcLD0qxAGJFnqM2UFjPo+U8qyI3RXKXFAn0E= # =c8Aj # -----END PGP SIGNATURE----- # gpg: Signature made Wed 24 Apr 2024 07:12:22 AM PDT # gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full] * tag 'housekeeping-20240424' of https://github.com/philmd/qemu: block/gluster: Remove deprecated RDMA protocol handling hw/rdma: Remove deprecated pvrdma device and rdmacm-mux helper hw/timer: Remove the ALTERA_TIMER model target/nios2: Remove the deprecated Nios II target MAINTAINERS: Update Sriram Yagnaraman mail address Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2 daysblock/gluster: Remove deprecated RDMA protocol handlingPhilippe Mathieu-Daudé
GlusterFS+RDMA has been deprecated 8 years ago in commit 0552ff2465 ("block/gluster: deprecate rdma support"): gluster volfile server fetch happens through unix and/or tcp, it doesn't support volfile fetch over rdma. The rdma code may actually mislead, so to make sure things do not break, for now we fallback to tcp when requested for rdma, with a warning. If you are wondering how this worked all these days, its the gluster libgfapi code which handles anything other than unix transport as socket/tcp, sad but true. Besides, the whole RDMA subsystem was deprecated in commit e9a54265f5 ("hw/rdma: Deprecate the pvrdma device and the rdma subsystem") released in v8.2. Cc: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20240328130255.52257-4-philmd@linaro.org>
3 daysqapi: Inline and remove QERR_DEVICE_HAS_NO_MEDIUM definitionPhilippe Mathieu-Daudé
Address the comment added in commit 4629ed1e98 ("qerror: Finally unused, clean up"), from 2015: /* * These macros will go away, please don't use * in new code, and do not add new ones! */ Mechanical transformation using sed, and manual cleanup. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240312141343.3168265-4-armbru@redhat.com>
2024-04-02block: Remove unnecessary NULL check in bdrv_pad_request()Kevin Wolf
Coverity complains that the check introduced in commit 3f934817 suggests that qiov could be NULL and we dereference it before reaching the check. In fact, all of the callers pass a non-NULL pointer, so just remove the misleading check. Resolves: Coverity CID 1542668 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20240327192750.204197-1-kwolf@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-26block-backend: fix edge case in bdrv_next_cleanup() where BDS associated to ↵Fiona Ebner
BB changes Same rationale as for commit "block-backend: fix edge case in bdrv_next() where BDS associated to BB changes". The block graph might change between the bdrv_next() call and the bdrv_next_cleanup() call, so it could be that the associated BDS is not the same that was referenced previously anymore. Instead, rely on bdrv_next() to set it->bs to the BDS it referenced and unreference that one in any case. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20240322095009.346989-4-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-03-26block-backend: fix edge case in bdrv_next() where BDS associated to BB changesFiona Ebner
The old_bs variable in bdrv_next() is currently determined by looking at the old block backend. However, if the block graph changes before the next bdrv_next() call, it might be that the associated BDS is not the same that was referenced previously. In that case, the wrong BDS is unreferenced, leading to an assertion failure later: > bdrv_unref: Assertion `bs->refcnt > 0' failed. In particular, this can happen in the context of bdrv_flush_all(), when polling for bdrv_co_flush() in the generated co-wrapper leads to a graph change (for example with a stream block job [0]). A racy reproducer: > #!/bin/bash > rm -f /tmp/backing.qcow2 > rm -f /tmp/top.qcow2 > ./qemu-img create /tmp/backing.qcow2 -f qcow2 64M > ./qemu-io -c "write -P42 0x0 0x1" /tmp/backing.qcow2 > ./qemu-img create /tmp/top.qcow2 -f qcow2 64M -b /tmp/backing.qcow2 -F qcow2 > ./qemu-system-x86_64 --qmp stdio \ > --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/top.qcow2 \ > <<EOF > {"execute": "qmp_capabilities"} > {"execute": "block-stream", "arguments": { "job-id": "stream0", "device": "node0" } } > {"execute": "quit"} > EOF [0]: > #0 bdrv_replace_child_tran (child=..., new_bs=..., tran=...) > #1 bdrv_replace_node_noperm (from=..., to=..., auto_skip=..., tran=..., errp=...) > #2 bdrv_replace_node_common (from=..., to=..., auto_skip=..., detach_subchain=..., errp=...) > #3 bdrv_drop_filter (bs=..., errp=...) > #4 bdrv_cor_filter_drop (cor_filter_bs=...) > #5 stream_prepare (job=...) > #6 job_prepare_locked (job=...) > #7 job_txn_apply_locked (fn=..., job=...) > #8 job_do_finalize_locked (job=...) > #9 job_exit (opaque=...) > #10 aio_bh_poll (ctx=...) > #11 aio_poll (ctx=..., blocking=...) > #12 bdrv_poll_co (s=...) > #13 bdrv_flush (bs=...) > #14 bdrv_flush_all () > #15 do_vm_stop (state=..., send_stop=...) > #16 vm_shutdown () Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20240322095009.346989-3-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-03-26block/io: accept NULL qiov in bdrv_pad_requestStefan Reiter
Some operations, e.g. block-stream, perform reads while discarding the results (only copy-on-read matters). In this case, they will pass NULL as the target QEMUIOVector, which will however trip bdrv_pad_request, since it wants to extend its passed vector. In particular, this is the case for the blk_co_preadv() call in stream_populate(). If there is no qiov, no operation can be done with it, but the bytes and offset still need to be updated, so the subsequent aligned read will actually be aligned and not run into an assertion failure. In particular, this can happen when the request alignment of the top node is larger than the allocated part of the bottom node, in which case padding becomes necessary. For example: > ./qemu-img create /tmp/backing.qcow2 -f qcow2 64M -o cluster_size=32768 > ./qemu-io -c "write -P42 0x0 0x1" /tmp/backing.qcow2 > ./qemu-img create /tmp/top.qcow2 -f qcow2 64M -b /tmp/backing.qcow2 -F qcow2 > ./qemu-system-x86_64 --qmp stdio \ > --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/top.qcow2 \ > <<EOF > {"execute": "qmp_capabilities"} > {"execute": "blockdev-add", "arguments": { "driver": "compress", "file": "node0", "node-name": "node1" } } > {"execute": "block-stream", "arguments": { "job-id": "stream0", "device": "node1" } } > EOF Originally-by: Stefan Reiter <s.reiter@proxmox.com> Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com> [FE: do update bytes and offset in any case add reproducer to commit message] Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20240322095009.346989-2-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-03-18qemu-img: Fix Column Width and Improve Formatting in snapshot listAbhiram Tilak
When running the command `qemu-img snapshot -l SNAPSHOT` the output of VM_CLOCK (measures the offset between host and VM clock) cannot to accommodate values in the order of thousands (4-digit). This line [1] hints on the problem. Additionally, the column width for the VM_CLOCK field was reduced from 15 to 13 spaces in commit b39847a5 in line [2], resulting in a shortage of space. [1]: https://gitlab.com/qemu-project/qemu/-/blob/master/block/qapi.c?ref_type=heads#L753 [2]: https://gitlab.com/qemu-project/qemu/-/blob/master/block/qapi.c?ref_type=heads#L763 This patch restores the column width to 15 spaces and makes adjustments to the affected iotests accordingly. Furthermore, addresses a potential source of confusion by removing whitespace in column headers. Example, VM CLOCK is modified to VM_CLOCK. Additionally a '--' symbol is introduced when ICOUNT returns no output for clarity. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2062 Fixes: b39847a50553 ("migration: introduce icount field for snapshots") Signed-off-by: Abhiram Tilak <atp.exp@gmail.com> Message-ID: <20240123050354.22152-2-atp.exp@gmail.com> [kwolf: Fixed up qemu-iotests 261 and 286] Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-03-18mirror: Don't call job_pause_point() under graph lockKevin Wolf
Calling job_pause_point() while holding the graph reader lock potentially results in a deadlock: bdrv_graph_wrlock() first drains everything, including the mirror job, which pauses it. The job is only unpaused at the end of the drain section, which is when the graph writer lock has been successfully taken. However, if the job happens to be paused at a pause point where it still holds the reader lock, the writer lock can't be taken as long as the job is still paused. Mark job_pause_point() as GRAPH_UNLOCKED and fix mirror accordingly. Cc: qemu-stable@nongnu.org Buglink: https://issues.redhat.com/browse/RHEL-28125 Fixes: 004915a96a7a ("block: Protect bs->backing with graph_lock") Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20240313153000.33121-1-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-03-12error: Move ERRP_GUARD() to the beginning of the functionZhao Liu
Since the commit 05e385d2a9 ("error: Move ERRP_GUARD() to the beginning of the function"), there are new codes that don't put ERRP_GUARD() at the beginning of the functions. As stated in the commit 05e385d2a9: "include/qapi/error.h advises to put ERRP_GUARD() right at the beginning of the function, because only then can it guard the whole function.", so clean up the few spots disregarding the advice. Inspired-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240312060337.3240965-1-zhao1.liu@linux.intel.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12block/vmdk: Fix missing ERRP_GUARD() for error_prepend()Zhao Liu
As the comment in qapi/error, passing @errp to error_prepend() requires ERRP_GUARD(): * = Why, when and how to use ERRP_GUARD() = * * Without ERRP_GUARD(), use of the @errp parameter is restricted: ... * - It should not be passed to error_prepend(), error_vprepend() or * error_append_hint(), because that doesn't work with &error_fatal. * ERRP_GUARD() lifts these restrictions. * * To use ERRP_GUARD(), add it right at the beginning of the function. * @errp can then be used without worrying about the argument being * NULL or &error_fatal. ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user can't see this additional information, because exit() happens in error_setg earlier than information is added [1]. The vmdk_parse_extents() passes @errp to error_prepend(), and its @errp is from vmdk_open(). Though, vmdk_open(), as a BlockDriver.bdrv_open(), gets the @errp parameter which is pointer of its caller's local_err, to follow the requirement of @errp, add missing ERRP_GUARD() at the beginning of this function. [1]: Issue description in the commit message of commit ae7c80a7bd73 ("error: New macro ERRP_GUARD()"). Cc: Fam Zheng <fam@euphon.net> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Hanna Reitz <hreitz@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240311033822.3142585-13-zhao1.liu@linux.intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12block/vdi: Fix missing ERRP_GUARD() for error_prepend()Zhao Liu
As the comment in qapi/error, passing @errp to error_prepend() requires ERRP_GUARD(): * = Why, when and how to use ERRP_GUARD() = * * Without ERRP_GUARD(), use of the @errp parameter is restricted: ... * - It should not be passed to error_prepend(), error_vprepend() or * error_append_hint(), because that doesn't work with &error_fatal. * ERRP_GUARD() lifts these restrictions. * * To use ERRP_GUARD(), add it right at the beginning of the function. * @errp can then be used without worrying about the argument being * NULL or &error_fatal. ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user can't see this additional information, because exit() happens in error_setg earlier than information is added [1]. The vdi_co_do_create() passes @errp to error_prepend() without ERRP_GUARD(), and its @errp parameter is so widely sourced that it is necessary to protect it with ERRP_GUARD(). To avoid the potential issues as [1] said, add missing ERRP_GUARD() at the beginning of this function. [1]: Issue description in the commit message of commit ae7c80a7bd73 ("error: New macro ERRP_GUARD()"). Cc: Stefan Weil <sw@weilnetz.de> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Hanna Reitz <hreitz@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240311033822.3142585-12-zhao1.liu@linux.intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12block/snapshot: Fix missing ERRP_GUARD() for error_prepend()Zhao Liu
As the comment in qapi/error, passing @errp to error_prepend() requires ERRP_GUARD(): * = Why, when and how to use ERRP_GUARD() = * * Without ERRP_GUARD(), use of the @errp parameter is restricted: ... * - It should not be passed to error_prepend(), error_vprepend() or * error_append_hint(), because that doesn't work with &error_fatal. * ERRP_GUARD() lifts these restrictions. * * To use ERRP_GUARD(), add it right at the beginning of the function. * @errp can then be used without worrying about the argument being * NULL or &error_fatal. ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user can't see this additional information, because exit() happens in error_setg earlier than information is added [1]. In block/snapshot.c, there are 2 functions passing @errp to error_prepend() without ERRP_GUARD(): - bdrv_all_delete_snapshot() - bdrv_all_goto_snapshot() As the APIs exposed in include/block/snapshot.h, they could be called by other modules. To avoid potential issues as [1] said, add missing ERRP_GUARD() at the beginning of these 2 functions. [1]: Issue description in the commit message of commit ae7c80a7bd73 ("error: New macro ERRP_GUARD()"). Cc: Kevin Wolf <kwolf@redhat.com> Cc: Hanna Reitz <hreitz@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240311033822.3142585-11-zhao1.liu@linux.intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12block/qed: Fix missing ERRP_GUARD() for error_prepend()Zhao Liu
As the comment in qapi/error, passing @errp to error_prepend() requires ERRP_GUARD(): * = Why, when and how to use ERRP_GUARD() = * * Without ERRP_GUARD(), use of the @errp parameter is restricted: ... * - It should not be passed to error_prepend(), error_vprepend() or * error_append_hint(), because that doesn't work with &error_fatal. * ERRP_GUARD() lifts these restrictions. * * To use ERRP_GUARD(), add it right at the beginning of the function. * @errp can then be used without worrying about the argument being * NULL or &error_fatal. ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user can't see this additional information, because exit() happens in error_setg earlier than information is added [1]. The bdrv_qed_co_invalidate_cache() passes @errp to error_prepend() without ERRP_GUARD(). Though it is a BlockDriver.bdrv_co_invalidate_cache() method, and currently its @errp parameter only points to callers' local_err, to follow the requirement of @errp, add missing ERRP_GUARD() at the beginning of this function. [1]: Issue description in the commit message of commit ae7c80a7bd73 ("error: New macro ERRP_GUARD()"). Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Hanna Reitz <hreitz@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20240311033822.3142585-10-zhao1.liu@linux.intel.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12block/qcow2: Fix missing ERRP_GUARD() for error_prepend()Zhao Liu
As the comment in qapi/error, passing @errp to error_prepend() requires ERRP_GUARD(): * = Why, when and how to use ERRP_GUARD() = * * Without ERRP_GUARD(), use of the @errp parameter is restricted: ... * - It should not be passed to error_prepend(), error_vprepend() or * error_append_hint(), because that doesn't work with &error_fatal. * ERRP_GUARD() lifts these restrictions. * * To use ERRP_GUARD(), add it right at the beginning of the function. * @errp can then be used without worrying about the argument being * NULL or &error_fatal. ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user can't see this additional information, because exit() happens in error_setg earlier than information is added [1]. In block/qcow2.c, there are 2 functions passing @errp to error_prepend() without ERRP_GUARD(): - qcow2_co_create() - qcow2_co_truncate() There are too many possible callers to check the impact of the defect; it may or may not be harmless. Thus it is necessary to protect @errp with ERRP_GUARD(). Therefore, to avoid the issue like [1] said, add missing ERRP_GUARD() at their beginning. [1]: Issue description in the commit message of commit ae7c80a7bd73 ("error: New macro ERRP_GUARD()"). Cc: Kevin Wolf <kwolf@redhat.com> Cc: Hanna Reitz <hreitz@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-ID: <20240311033822.3142585-9-zhao1.liu@linux.intel.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12block/qcow2-bitmap: Fix missing ERRP_GUARD() for error_prepend()Zhao Liu
As the comment in qapi/error, passing @errp to error_prepend() requires ERRP_GUARD(): * = Why, when and how to use ERRP_GUARD() = * * Without ERRP_GUARD(), use of the @errp parameter is restricted: ... * - It should not be passed to error_prepend(), error_vprepend() or * error_append_hint(), because that doesn't work with &error_fatal. * ERRP_GUARD() lifts these restrictions. * * To use ERRP_GUARD(), add it right at the beginning of the function. * @errp can then be used without worrying about the argument being * NULL or &error_fatal. ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user can't see this additional information, because exit() happens in error_setg earlier than information is added [1]. The qcow2_co_can_store_new_dirty_bitmap() passes @errp to error_prepend(). As a BlockDriver.bdrv_co_can_store_new_dirty_bitmap method, it's called by bdrv_co_can_store_new_dirty_bitmap(). Its caller is not being called anywhere, but as the API in include/block/block-io.h, we can't ensure what kind of @errp future users will pass in. To avoid potential issues as [1] said, add missing ERRP_GUARD() at the beginning of qcow2_co_can_store_new_dirty_bitmap(). [1]: Issue description in the commit message of commit ae7c80a7bd73 ("error: New macro ERRP_GUARD()"). Cc: Kevin Wolf <kwolf@redhat.com> Cc: Hanna Reitz <hreitz@redhat.com> Cc: Eric Blake <eblake@redhat.com> Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Cc: John Snow <jsnow@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-ID: <20240311033822.3142585-8-zhao1.liu@linux.intel.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12block/nvme: Fix missing ERRP_GUARD() for error_prepend()Zhao Liu
As the comment in qapi/error, passing @errp to error_prepend() requires ERRP_GUARD(): * = Why, when and how to use ERRP_GUARD() = * * Without ERRP_GUARD(), use of the @errp parameter is restricted: ... * - It should not be passed to error_prepend(), error_vprepend() or * error_append_hint(), because that doesn't work with &error_fatal. * ERRP_GUARD() lifts these restrictions. * * To use ERRP_GUARD(), add it right at the beginning of the function. * @errp can then be used without worrying about the argument being * NULL or &error_fatal. ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user can't see this additional information, because exit() happens in error_setg earlier than information is added [1]. In nvme.c, there are 3 functions passing @errp to error_prepend() without ERRP_GUARD(): - nvme_init_queue() - nvme_create_queue_pair() - nvme_identify() All these 3 functions take their @errp parameters from the nvme_file_open(), which is a BlockDriver.bdrv_nvme() method and its @errp points to its caller's local_err. Though these 3 cases haven't trigger the issue like [1] said, to follow the requirement of @errp, add missing ERRP_GUARD() at their beginning. [1]: Issue description in the commit message of commit ae7c80a7bd73 ("error: New macro ERRP_GUARD()"). Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Fam Zheng <fam@euphon.net> Cc: Philippe Mathieu-Daudé <philmd@linaro.org> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Hanna Reitz <hreitz@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20240311033822.3142585-7-zhao1.liu@linux.intel.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12block/nbd: Fix missing ERRP_GUARD() for error_prepend()Zhao Liu
As the comment in qapi/error, passing @errp to error_prepend() requires ERRP_GUARD(): * = Why, when and how to use ERRP_GUARD() = * * Without ERRP_GUARD(), use of the @errp parameter is restricted: ... * - It should not be passed to error_prepend(), error_vprepend() or * error_append_hint(), because that doesn't work with &error_fatal. * ERRP_GUARD() lifts these restrictions. * * To use ERRP_GUARD(), add it right at the beginning of the function. * @errp can then be used without worrying about the argument being * NULL or &error_fatal. ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user can't see this additional information, because exit() happens in error_setg earlier than information is added [1]. The nbd_co_do_receive_one_chunk() passes @errp to error_prepend() without ERRP_GUARD(), and though its @errp parameter points to its caller's local_err, to follow the requirement of @errp, add missing ERRP_GUARD() at the beginning of this function. [1]: Issue description in the commit message of commit ae7c80a7bd73 ("error: New macro ERRP_GUARD()"). Cc: Eric Blake <eblake@redhat.com> Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Hanna Reitz <hreitz@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-ID: <20240311033822.3142585-6-zhao1.liu@linux.intel.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12block/copy-before-write: Fix missing ERRP_GUARD() for error_prepend()Zhao Liu
As the comment in qapi/error, passing @errp to error_prepend() requires ERRP_GUARD(): * = Why, when and how to use ERRP_GUARD() = * * Without ERRP_GUARD(), use of the @errp parameter is restricted: ... * - It should not be passed to error_prepend(), error_vprepend() or * error_append_hint(), because that doesn't work with &error_fatal. * ERRP_GUARD() lifts these restrictions. * * To use ERRP_GUARD(), add it right at the beginning of the function. * @errp can then be used without worrying about the argument being * NULL or &error_fatal. ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user can't see this additional information, because exit() happens in error_setg earlier than information is added [1]. The cbw_open() passes @errp to error_prepend() without ERRP_GUARD(). Though it is the BlockDriver.bdrv_open() method, and currently its @errp parameter only points to callers' local_err, to follow the requirement of @errp, add missing ERRP_GUARD() at the beginning of this function. [1]: Issue description in the commit message of commit ae7c80a7bd73 ("error: New macro ERRP_GUARD()"). Cc: John Snow <jsnow@redhat.com> Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Hanna Reitz <hreitz@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-ID: <20240311033822.3142585-5-zhao1.liu@linux.intel.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-02-12Merge tag 'misc-fixes-pull-request' of https://gitlab.com/berrange/qemu into ↵Peter Maydell
staging - LUKS support for detached headers - Update x86 CPU model docs and script - Add missing close of chardev QIOChannel - More trace events o nTKS handshake - Drop unsafe VNC constants - Increase NOFILE limit during startup # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEE2vOm/bJrYpEtDo4/vobrtBUQT98FAmXGMNUACgkQvobrtBUQ # T998JQ//SqQ3L/AZmhE5cIwZ1XipSMMZ/yEoVIyniA3tL41S7Oimj3O9XvY68TEG # nnj9Oh+zOlVLxauTHAczveJ7z+XfonQZS3HrbGRUTHU+ezGVjyM618e/h9pSQtYI # +CCkrjtey1NoT42/um4D/bKg/B2XQeulS+pD12Z9l5zbqEZiw0R9+UwVIJ52G811 # 5UQgIjJ7GNFzalxqiMCkGc0nTyU8keEXQJcdZ4droo42DnU4pZeQWGDimzP61JnW # 1Crm6aZSuUriUbVmxJde+2eEdPSR4rr/yQ4Pw06hoi1QJALSgGYtOTo8+qsyumHd # us/2ouMrxOMdsIk4ViAkSTiaje9agPj84VE1Z229Y/uqZcEAuX572n730/kkzqUv # ZDKxMz0v3rzpkjFmsgj5D4yqJaQp4zn1zYm98ld7HWJVIOf3GSvpaNg9J6jwN7Gi # HKKkvYns9pxg3OSx++gqnM32HV6nnMDFiddipl/hTiUsnNlnWyTDSvJoNxIUU5+l # /uEbbdt8xnxx1JP0LiOhgmz6N6FU7oOpaPuJ5CD8xO2RO8D1uBRvmpFcdOTDAfv0 # uYdjhKBI+quKjE64p7gNWYCoqZtipRIJ6AY2VaPU8XHx8GvGFwBLX64oLYiYtrBG # gkv3NTHRkMhQw9cGQcZIgZ+OLU+1eNF+m9EV7LUjuKl0HWC3Vjs= # =61zI # -----END PGP SIGNATURE----- # gpg: Signature made Fri 09 Feb 2024 14:04:05 GMT # gpg: using RSA key DAF3A6FDB26B62912D0E8E3FBE86EBB415104FDF # gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full] # gpg: aka "Daniel P. Berrange <berrange@redhat.com>" [full] # Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF * tag 'misc-fixes-pull-request' of https://gitlab.com/berrange/qemu: tests: Add case for LUKS volume with detached header crypto: Introduce 'detached-header' field in QCryptoBlockInfoLUKS block: Support detached LUKS header creation using qemu-img block: Support detached LUKS header creation using blockdev-create crypto: Modify the qcrypto_block_create to support creation flags qapi: Make parameter 'file' optional for BlockdevCreateOptionsLUKS crypto: Support LUKS volume with detached header io: add trace event when cancelling TLS handshake chardev: close QIOChannel before unref'ing docs: re-generate x86_64 ABI compatibility CSV docs: fix highlighting of CPU ABI header rows scripts: drop comment about autogenerated CPU API file softmmu: remove obsolete comment about libvirt timeouts ui: drop VNC feature _MASK constants qemu_init: increase NOFILE soft limit on POSIX crypto: Introduce SM4 symmetric cipher algorithm meson: sort C warning flags alphabetically Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-02-09block: Support detached LUKS header creation using qemu-imgHyman Huang
Even though a LUKS header might be created with cryptsetup, qemu-img should be enhanced to accommodate it as well. Add the 'detached-header' option to specify the creation of a detached LUKS header. This is how it is used: $ qemu-img create --object secret,id=sec0,data=abc123 -f luks > -o cipher-alg=aes-256,cipher-mode=xts -o key-secret=sec0 > -o detached-header=true header.luks Using qemu-img or cryptsetup tools to query information of an LUKS header image as follows: Assume a detached LUKS header image has been created by: $ dd if=/dev/zero of=test-header.img bs=1M count=32 $ dd if=/dev/zero of=test-payload.img bs=1M count=1000 $ cryptsetup luksFormat --header test-header.img test-payload.img > --force-password --type luks1 Header image information could be queried using cryptsetup: $ cryptsetup luksDump test-header.img or qemu-img: $ qemu-img info 'json:{"driver":"luks","file":{"filename": > "test-payload.img"},"header":{"filename":"test-header.img"}}' When using qemu-img, keep in mind that the entire disk information specified by the JSON-format string above must be supplied on the commandline; if not, an overlay check will reveal a problem with the LUKS volume check logic. Signed-off-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> [changed to pass 'cflags' to block_crypto_co_create_generic] Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-02-09block: Support detached LUKS header creation using blockdev-createHyman Huang
Firstly, enable the ability to choose the block device containing a detachable LUKS header by adding the 'header' parameter to BlockdevCreateOptionsLUKS. Secondly, when formatting the LUKS volume with a detachable header, truncate the payload volume to length without a header size. Using the qmp blockdev command, create the LUKS volume with a detachable header as follows: 1. add the secret to lock/unlock the cipher stored in the detached LUKS header $ virsh qemu-monitor-command vm '{"execute":"object-add", > "arguments":{"qom-type": "secret", "id": "sec0", "data": "foo"}}' 2. create a header img with 0 size $ virsh qemu-monitor-command vm '{"execute":"blockdev-create", > "arguments":{"job-id":"job0", "options":{"driver":"file", > "filename":"/path/to/detached_luks_header.img", "size":0 }}}' 3. add protocol blockdev node for header $ virsh qemu-monitor-command vm '{"execute":"blockdev-add", > "arguments": {"driver":"file", "filename": > "/path/to/detached_luks_header.img", "node-name": > "detached-luks-header-storage"}}' 4. create a payload img with 0 size $ virsh qemu-monitor-command vm '{"execute":"blockdev-create", > "arguments":{"job-id":"job1", "options":{"driver":"file", > "filename":"/path/to/detached_luks_payload_raw.img", "size":0}}}' 5. add protocol blockdev node for payload $ virsh qemu-monitor-command vm '{"execute":"blockdev-add", > "arguments": {"driver":"file", "filename": > "/path/to/detached_luks_payload_raw.img", "node-name": > "luks-payload-raw-storage"}}' 6. do the formatting with 128M size $ virsh qemu-monitor-command c81_node1 '{"execute":"blockdev-create", > "arguments":{"job-id":"job2", "options":{"driver":"luks", "header": > "detached-luks-header-storage", "file":"luks-payload-raw-storage", > "size":134217728, "preallocation":"full", "key-secret":"sec0" }}}' Signed-off-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-02-09crypto: Modify the qcrypto_block_create to support creation flagsHyman Huang
Expand the signature of qcrypto_block_create to enable the formation of LUKS volumes with detachable headers. To accomplish that, introduce QCryptoBlockCreateFlags to instruct the creation process to set the payload_offset_sector to 0. Signed-off-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-02-09qapi: Make parameter 'file' optional for BlockdevCreateOptionsLUKSHyman Huang
To support detached LUKS header creation, make the existing 'file' field in BlockdevCreateOptionsLUKS optional. Signed-off-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-02-09crypto: Support LUKS volume with detached headerHyman Huang
By enhancing the LUKS driver, it is possible to implement the LUKS volume with a detached header. Normally a LUKS volume has a layout: disk: | header | key material | disk payload data | With a detached LUKS header, you need 2 disks so getting: disk1: | header | key material | disk2: | disk payload data | There are a variety of benefits to doing this: * Secrecy - the disk2 cannot be identified as containing LUKS volume since there's no header * Control - if access to the disk1 is restricted, then even if someone has access to disk2 they can't unlock it. Might be useful if you have disks on NFS but want to restrict which host can launch a VM instance from it, by dynamically providing access to the header to a designated host * Flexibility - your application data volume may be a given size and it is inconvenient to resize it to add encryption.You can store the LUKS header separately and use the existing storage volume for payload * Recovery - corruption of a bit in the header may make the entire payload inaccessible. It might be convenient to take backups of the header. If your primary disk header becomes corrupt, you can unlock the data still by pointing to the backup detached header Take the raw-format image as an example to introduce the usage of the LUKS volume with a detached header: 1. prepare detached LUKS header images $ dd if=/dev/zero of=test-header.img bs=1M count=32 $ dd if=/dev/zero of=test-payload.img bs=1M count=1000 $ cryptsetup luksFormat --header test-header.img test-payload.img > --force-password --type luks1 2. block-add a protocol blockdev node of payload image $ virsh qemu-monitor-command vm '{"execute":"blockdev-add", > "arguments":{"node-name":"libvirt-1-storage", "driver":"file", > "filename":"test-payload.img"}}' 3. block-add a protocol blockdev node of LUKS header as above. $ virsh qemu-monitor-command vm '{"execute":"blockdev-add", > "arguments":{"node-name":"libvirt-2-storage", "driver":"file", > "filename": "test-header.img" }}' 4. object-add the secret for decrypting the cipher stored in LUKS header above $ virsh qemu-monitor-command vm '{"execute":"object-add", > "arguments":{"qom-type":"secret", "id": > "libvirt-2-storage-secret0", "data":"abc123"}}' 5. block-add the raw-drived blockdev format node $ virsh qemu-monitor-command vm '{"execute":"blockdev-add", > "arguments":{"node-name":"libvirt-1-format", "driver":"raw", > "file":"libvirt-1-storage"}}' 6. block-add the luks-drived blockdev to link the raw disk with the LUKS header by specifying the field "header" $ virsh qemu-monitor-command vm '{"execute":"blockdev-add", > "arguments":{"node-name":"libvirt-2-format", "driver":"luks", > "file":"libvirt-1-format", "header":"libvirt-2-storage", > "key-secret":"libvirt-2-format-secret0"}}' 7. hot-plug the virtio-blk device finally $ virsh qemu-monitor-command vm '{"execute":"device_add", > "arguments": {"num-queues":"1", "driver":"virtio-blk-pci", > "drive": "libvirt-2-format", "id":"virtio-disk2"}}' Starting a VM with a LUKS volume with detached header is somewhat similar to hot-plug in that both maintaining the same json command while the starting VM changes the "blockdev-add/device_add" parameters to "blockdev/device". Signed-off-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-02-07blkio: Respect memory-alignment for bounce buffer allocationsKevin Wolf
blkio_alloc_mem_region() requires that the requested buffer size is a multiple of the memory-alignment property. If it isn't, the allocation fails with a return value of -EINVAL. Fix the call in blkio_resize_bounce_pool() to make sure the requested size is properly aligned. I observed this problem with vhost-vdpa, which requires page aligned memory. As the virtio-blk device behind it still had 512 byte blocks, we got bs->bl.request_alignment = 512, but actually any request that needed a bounce buffer and was not aligned to 4k would fail without this fix. Suggested-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20240131173140.42398-1-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-02-07block-backend: Allow concurrent context changesHanna Czenczek
Since AioContext locks have been removed, a BlockBackend's AioContext may really change at any time (only exception is that it is often confined to a drained section, as noted in this patch). Therefore, blk_get_aio_context() cannot rely on its root node's context always matching that of the BlockBackend. In practice, whether they match does not matter anymore anyway: Requests can be sent to BDSs from any context, so anyone who requests the BB's context should have no reason to require the root node to have the same context. Therefore, we can and should remove the assertion to that effect. In addition, because the context can be set and queried from different threads concurrently, it has to be accessed with atomic operations. Buglink: https://issues.redhat.com/browse/RHEL-19381 Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-ID: <20240202144755.671354-2-hreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-01-30block/blkio: Make s->mem_region_alignment be 64 bitsRichard W.M. Jones
With GCC 14 the code failed to compile on i686 (and was wrong for any version of GCC): ../block/blkio.c: In function ‘blkio_file_open’: ../block/blkio.c:857:28: error: passing argument 3 of ‘blkio_get_uint64’ from incompatible pointer type [-Wincompatible-pointer-types] 857 | &s->mem_region_alignment); | ^~~~~~~~~~~~~~~~~~~~~~~~ | | | size_t * {aka unsigned int *} In file included from ../block/blkio.c:12: /usr/include/blkio.h:49:67: note: expected ‘uint64_t *’ {aka ‘long long unsigned int *’} but argument is of type ‘size_t *’ {aka ‘unsigned int *’} 49 | int blkio_get_uint64(struct blkio *b, const char *name, uint64_t *value); | ~~~~~~~~~~^~~~~ Signed-off-by: Richard W.M. Jones <rjones@redhat.com> Message-id: 20240130122006.2977938-1-rjones@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-01-30block/io_uring: improve error message when init failsFiona Ebner
The man page for io_uring_queue_init states: > io_uring_queue_init(3) returns 0 on success and -errno on failure. and the man page for io_uring_setup (which is one of the functions where the return value of io_uring_queue_init() can come from) states: > On error, a negative error code is returned. The caller should not > rely on errno variable. Tested using 'sysctl kernel.io_uring_disabled=2'. Output before this change: > failed to init linux io_uring ring Output after this change: > failed to init linux io_uring ring: Operation not permitted Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20240123135044.204985-1-f.ebner@proxmox.com>
2024-01-26block/blklogwrites: Protect mutable driver state with a mutex.Ari Sundholm
During the review of a fix for a concurrency issue in blklogwrites, it was found that the driver needs an additional fix when enabling multiqueue, which is a new feature introduced in QEMU 9.0, as the driver state may be read and written by multiple threads at the same time, which was not the case when the driver was originally written. Fix the multi-threaded scenario by introducing a mutex to protect the mutable fields in the driver state, and always having the mutex locked by the current thread when accessing them. Also use the mutex and a CoQueue to ensure that the super block is not being written to by multiple threads concurrently and updates are properly serialized. Additionally, add the const qualifier to a few BDRVBlkLogWritesState pointer targets in contexts where the driver state is not written to. Signed-off-by: Ari Sundholm <ari@tuxera.com> Message-ID: <20240119162913.2620245-1-ari@tuxera.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-01-26stream: Allow users to request only format driver names in backing file formatPeter Krempa
Introduce a new flag 'backing-mask-protocol' for the block-stream QMP command which instructs the internals to use 'raw' instead of the protocol driver in case when a image is used without a dummy 'raw' wrapper. The flag is designed such that it can be always asserted by management tools even when there isn't any update to backing files. The flag will be used by libvirt so that the backing images still reference the proper format even when libvirt will stop using the dummy raw driver (raw driver with no other config). Libvirt needs this so that the images stay compatible with older libvirt versions which didn't expect that a protocol driver name can appear in the backing file format field. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-ID: <bbee9a0a59748a8893289bf8249f568f0d587e62.1701796348.git.pkrempa@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-01-26commit: Allow users to request only format driver names in backing file formatPeter Krempa
Introduce a new flag 'backing-mask-protocol' for the block-commit QMP command which instructs the internals to use 'raw' instead of the protocol driver in case when a image is used without a dummy 'raw' wrapper. The flag is designed such that it can be always asserted by management tools even when there isn't any update to backing files. The flag will be used by libvirt so that the backing images still reference the proper format even when libvirt will stop using the dummy raw driver (raw driver with no other config). Libvirt needs this so that the images stay compatible with older libvirt versions which didn't expect that a protocol driver name can appear in the backing file format field. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-ID: <2cb46e37093ce793ea1604abc8bbb90f4c8e434b.1701796348.git.pkrempa@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-01-26block/blklogwrites: Fix a bug when logging "write zeroes" operations.Ari Sundholm
There is a bug in the blklogwrites driver pertaining to logging "write zeroes" operations, causing log corruption. This can be easily observed by setting detect-zeroes to something other than "off" for the driver. The issue is caused by a concurrency bug pertaining to the fact that "write zeroes" operations have to be logged in two parts: first the log entry metadata, then the zeroed-out region. While the log entry metadata is being written by bdrv_co_pwritev(), another operation may begin in the meanwhile and modify the state of the blklogwrites driver. This is as intended by the coroutine-driven I/O model in QEMU, of course. Unfortunately, this specific scenario is mishandled. A short example: 1. Initially, in the current operation (#1), the current log sector number in the driver state is only incremented by the number of sectors taken by the log entry metadata, after which the log entry metadata is written. The current operation yields. 2. Another operation (#2) may start while the log entry metadata is being written. It uses the current log position as the start offset for its log entry. This is in the sector right after the operation #1 log entry metadata, which is bad! 3. After bdrv_co_pwritev() returns (#1), the current log sector number is reread from the driver state in order to find out the start offset for bdrv_co_pwrite_zeroes(). This is an obvious blunder, as the offset will be the sector right after the (misplaced) operation #2 log entry, which means that the zeroed-out region begins at the wrong offset. 4. As a result of the above, the log is corrupt. Fix this by only reading the driver metadata once, computing the offsets and sizes in one go (including the optional zeroed-out region) and setting the log sector number to the appropriate value for the next operation in line. Signed-off-by: Ari Sundholm <ari@tuxera.com> Cc: qemu-stable@nongnu.org Message-ID: <20240109184646.1128475-1-megari@gmx.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-01-22block/io: clear BDRV_BLOCK_RECURSE flag after recursing in bdrv_co_block_statusFiona Ebner
Using fleecing backup like in [0] on a qcow2 image (with metadata preallocation) can lead to the following assertion failure: > bdrv_co_do_block_status: Assertion `!(ret & BDRV_BLOCK_ZERO)' failed. In the reproducer [0], it happens because the BDRV_BLOCK_RECURSE flag will be set by the qcow2 driver, so the caller will recursively check the file child. Then the BDRV_BLOCK_ZERO set too. Later up the call chain, in bdrv_co_do_block_status() for the snapshot-access driver, the assertion failure will happen, because both flags are set. To fix it, clear the recurse flag after the recursive check was done. In detail: > #0 qcow2_co_block_status Returns 0x45 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID. > #1 bdrv_co_do_block_status Because of the data flag, bdrv_co_do_block_status() will now also set BDRV_BLOCK_ALLOCATED. Because of the recurse flag, bdrv_co_do_block_status() for the bdrv_file child will be called, which returns 0x16 = BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_ZERO. Now the return value inherits the zero flag. Returns 0x57 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_ZERO. > #2 bdrv_co_common_block_status_above > #3 bdrv_co_block_status_above > #4 bdrv_co_block_status > #5 cbw_co_snapshot_block_status > #6 bdrv_co_snapshot_block_status > #7 snapshot_access_co_block_status > #8 bdrv_co_do_block_status Return value is propagated all the way up to here, where the assertion failure happens, because BDRV_BLOCK_RECURSE and BDRV_BLOCK_ZERO are both set. > #9 bdrv_co_common_block_status_above > #10 bdrv_co_block_status_above > #11 block_copy_block_status > #12 block_copy_dirty_clusters > #13 block_copy_common > #14 block_copy_async_co_entry > #15 coroutine_trampoline [0]: > #!/bin/bash > rm /tmp/disk.qcow2 > ./qemu-img create /tmp/disk.qcow2 -o preallocation=metadata -f qcow2 1G > ./qemu-img create /tmp/fleecing.qcow2 -f qcow2 1G > ./qemu-img create /tmp/backup.qcow2 -f qcow2 1G > ./qemu-system-x86_64 --qmp stdio \ > --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 \ > --blockdev qcow2,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.qcow2 \ > --blockdev qcow2,node-name=node2,file.driver=file,file.filename=/tmp/backup.qcow2 \ > <<EOF > {"execute": "qmp_capabilities"} > {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", "file": "node0", "target": "node1", "node-name": "node3" } } > {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", "file": "node3", "node-name": "snap0" } } > {"execute": "blockdev-backup", "arguments": { "device": "snap0", "target": "node1", "sync": "full", "job-id": "backup0" } } > EOF Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-id: 20240116154839.401030-1-f.ebner@proxmox.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-01-18remove unnecessary casts from uintptr_tPaolo Bonzini
uintptr_t, or unsigned long which is equivalent on Linux I32LP64 systems, is an unsigned type and there is no need to further cast to __u64 which is another unsigned integer type; widening casts from unsigned integers zero-extend the value. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-01-18io_uring: move LuringState typedef to block/aio.hPaolo Bonzini
The LuringState typedef is defined twice, in include/block/raw-aio.h and block/io_uring.c. Move it in include/block/aio.h, which is included everywhere the typedef is needed, since include/block/aio.h already has to define the forward reference to the struct. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-01-04Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into stagingPeter Maydell
* configure: use a native non-cross compiler for linux-user * meson: cleanups * target/i386: miscellaneous cleanups and optimizations * target/i386: implement CMPccXADD * target/i386: the sgx_epc_get_section stub is reachable * esp: check for NULL result from scsi_device_find() # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmWRImYUHHBib256aW5p # QHJlZGhhdC5jb20ACgkQv/vSX3jHroNd7AgAgcyJGiMfUkXqhefplpm06RDXQIa8 # FuoJqPb21lO75DQKfaFRAc4xGLagjJROMJGHMm9HvMu2VlwvOydkQlfFRspENxQ/ # 5XzGdb/X0A7HA/mwUfnMB1AZx0Vs32VI5IBSc6acc9fmgeZ84XQEoM3KBQHUik7X # mSkE4eltR9gJ+4IaGo4voZtK+YoVD8nEcuqmnKihSPWizev0FsZ49aNMtaYa9qC/ # Xs3kiQd/zPibHDHJu0ulFsNZgxtUcvlLHTCf8gO4dHWxCFLXGubMush83McpRtNB # Qoh6cTLH+PBXfrxMR3zmTZMNvo8Euls3s07Y8TkNP4vdIIE/kMeMDW1wJw== # =mq30 # -----END PGP SIGNATURE----- # gpg: Signature made Sun 31 Dec 2023 08:12:22 GMT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (46 commits) meson.build: report graphics backends separately configure, meson: rename targetos to host_os meson: rename config_all meson: remove CONFIG_ALL meson: remove config_targetos meson: remove CONFIG_POSIX and CONFIG_WIN32 from config_targetos meson: remove OS definitions from config_targetos meson: always probe u2f and canokey if the option is enabled meson: move subdirs to "Collect sources" section meson: move config-host.h definitions together meson: move CFI detection code with other compiler flags meson: keep subprojects together meson: move accelerator dependency checks together meson: move option validation together meson: move program checks together meson: add more sections to main meson.build configure: unify again the case arms in probe_target_compiler configure: remove unnecessary subshell Makefile: clean qemu-iotests output meson: use version_compare() to compare version ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-12-31configure, meson: rename targetos to host_osPaolo Bonzini
This variable is about the host OS, not the target. It is used a lot more since the Meson conversion, but the original sin dates back to 2003. Time to fix it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31meson: remove CONFIG_POSIX and CONFIG_WIN32 from config_targetosPaolo Bonzini
For consistency with other OSes, use if...endif for rules that are target-independent. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31meson: remove OS definitions from config_targetosPaolo Bonzini
CONFIG_DARWIN, CONFIG_LINUX and CONFIG_BSD are used in some rules, but only CONFIG_LINUX has substantial use. Convert them all to if...endif. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-21block: remove outdated AioContext locking commentsStefan Hajnoczi
The AioContext lock no longer exists. There is one noteworthy change: - * More specifically, these functions use BDRV_POLL_WHILE(bs), which - * requires the caller to be either in the main thread and hold - * the BlockdriverState (bs) AioContext lock, or directly in the - * home thread that runs the bs AioContext. Calling them from - * another thread in another AioContext would cause deadlocks. + * More specifically, these functions use BDRV_POLL_WHILE(bs), which requires + * the caller to be either in the main thread or directly in the home thread + * that runs the bs AioContext. Calling them from another thread in another + * AioContext would cause deadlocks. I am not sure whether deadlocks are still possible. Maybe they have just moved to the fine-grained locks that have replaced the AioContext. Since I am not sure if the deadlocks are gone, I have kept the substance unchanged and just removed mention of the AioContext. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-ID: <20231205182011.1976568-15-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-12-21block: remove AioContext lockingStefan Hajnoczi
This is the big patch that removes aio_context_acquire()/aio_context_release() from the block layer and affected block layer users. There isn't a clean way to split this patch and the reviewers are likely the same group of people, so I decided to do it in one patch. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org> Message-ID: <20231205182011.1976568-7-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-12-21graph-lock: remove AioContext lockingStefan Hajnoczi
Stop acquiring/releasing the AioContext lock in bdrv_graph_wrlock()/bdrv_graph_unlock() since the lock no longer has any effect. The distinction between bdrv_graph_wrunlock() and bdrv_graph_wrunlock_ctx() becomes meaningless and they can be collapsed into one function. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231205182011.1976568-6-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-12-21block: Fix crash when loading snapshot on inactive nodeKevin Wolf
bdrv_is_read_only() only checks if the node is configured to be read-only eventually, but even if it returns false, writing to the node may not be permitted at the moment (because it's inactive). bdrv_is_writable() checks that the node can be written to right now, and this is what the snapshot operations really need. Change bdrv_can_snapshot() to use bdrv_is_writable() to fix crashes like the following: $ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer qemu-system-x86_64: ../block/io.c:1990: int bdrv_co_write_req_prepare(BdrvChild *, int64_t, int64_t, BdrvTrackedRequest *, int): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed. The resulting error message after this patch isn't perfect yet, but at least it doesn't crash any more: $ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer qemu-system-x86_64: Device 'ide0-hd0' is writable but does not support snapshots Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231201142520.32255-2-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-12-21block/file-posix: set up Linux AIO and io_uring in the current threadStefan Hajnoczi
The file-posix block driver currently only sets up Linux AIO and io_uring in the BDS's AioContext. In the multi-queue block layer we must be able to submit I/O requests in AioContexts that do not have Linux AIO and io_uring set up yet since any thread can call into the block driver. Set up Linux AIO and io_uring for the current AioContext during request submission. We lose the ability to return an error from .bdrv_file_open() when Linux AIO and io_uring setup fails (e.g. due to resource limits). Instead the user only gets warnings and we fall back to aio=threads. This is still better than a fatal error after startup. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230914140101.1065008-2-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-28export/vhost-user-blk: Fix consecutive drainsKevin Wolf
The vhost-user-blk export implement AioContext switches in its drain implementation. This means that on drain_begin, it detaches the server from its AioContext and on drain_end, attaches it again and schedules the server->co_trip coroutine in the updated AioContext. However, nothing guarantees that server->co_trip is even safe to be scheduled. Not only is it unclear that the coroutine is actually in a state where it can be reentered externally without causing problems, but with two consecutive drains, it is possible that the scheduled coroutine didn't have a chance yet to run and trying to schedule an already scheduled coroutine a second time crashes with an assertion failure. Following the model of NBD, this commit makes the vhost-user-blk export shut down server->co_trip during drain so that resuming the export means creating and scheduling a new coroutine, which is always safe. There is one exception: If the drain call didn't poll (for example, this happens in the context of bdrv_graph_wrlock()), then the coroutine didn't have a chance to shut down. However, in this case the AioContext can't have changed; changing the AioContext always involves a polling drain. So in this case we can simply assert that the AioContext is unchanged and just leave the coroutine running or wake it up if it has yielded to wait for the AioContext to be attached again. Fixes: e1054cd4aad03a493a5d1cded7508f7c348205bf Fixes: https://issues.redhat.com/browse/RHEL-1708 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231127115755.22846-1-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-28vmdk: Don't corrupt desc file in vmdk_write_cidFam Zheng
If the text description file is larger than DESC_SIZE, we force the last byte in the buffer to be 0 and write it out. This results in a corruption. Try to allocate a big buffer in this case. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1923 Signed-off-by: Fam Zheng <fam@euphon.net> Message-ID: <20231124115654.3239137-1-fam@euphon.net> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-21stream: Fix AioContext locking during bdrv_graph_wrlock()Kevin Wolf
In stream_prepare(), we need to temporarily drop the AioContext lock that job_prepare_locked() took for us while calling the graph write lock functions which can poll. All block nodes related to this block job are in the same AioContext, so we can pass any of them to bdrv_graph_wrlock()/ bdrv_graph_wrunlock(). Unfortunately, the one that we picked is base, which can be NULL - and in this case the AioContext lock is not released and deadlocks can occur. Fix this by passing s->target_bs, which is never NULL. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231115172012.112727-4-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-21block: Fix deadlocks in bdrv_graph_wrunlock()Kevin Wolf
bdrv_graph_wrunlock() calls aio_poll(), which may run callbacks that have a nested event loop. Nested event loops can depend on other iothreads making progress, so in order to allow them to make progress it must not hold the AioContext lock of another thread while calling aio_poll(). This introduces a @bs parameter to bdrv_graph_wrunlock() whose AioContext is temporarily dropped (which matches bdrv_graph_wrlock()), and a bdrv_graph_wrunlock_ctx() that can be used if the BlockDriverState doesn't necessarily exist any more when unlocking. This also requires a change to bdrv_schedule_unref(), which was relying on the incorrectly taken lock. It needs to take the lock itself now. While this is a separate bug, it can't be fixed a separate patch because otherwise the intermediate state would either deadlock or try to release a lock that we don't even hold. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231115172012.112727-3-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [kwolf: Fixed up bdrv_schedule_unref()] Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-21block: Fix bdrv_graph_wrlock() call in blk_remove_bs()Kevin Wolf
While not all callers of blk_remove_bs() are correct in this respect, the assumption in the function is that callers hold the AioContext lock of the BlockBackend (this is required by the drain calls in it). In order to avoid deadlock in the nested event loop, bdrv_graph_wrlock() has then to be called with the root BlockDriverState as its parameter instead of NULL, so that this AioContext lock is temporarily dropped. Fixes: https://issues.redhat.com/browse/RHEL-1761 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231115172012.112727-2-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>