aboutsummaryrefslogtreecommitdiff
path: root/block.c
AgeCommit message (Collapse)Author
2015-10-16block: Introduce parents listKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16block: Split bdrv_move_feature_fields()Kevin Wolf
After bdrv_swap(), some fields must be moved back to their original BDS to compensate for the effects that a swap of the contents of the objects has while keeping the old addresses. Other fields must be moved back because they should logically be moved and must stay on top When replacing bdrv_swap() with operations changing the pointers in the parents, we only need the latter and must avoid swapping the former. Split the function accordingly. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16block: Manage backing file references in bdrv_set_backing_hd()Kevin Wolf
This simplifies the code somewhat, especially when dropping whole backing file subchains. The exception is the mirroring code that does adventurous things with bdrv_swap() and in order to keep it working, I had to duplicate most of bdrv_set_backing_hd() locally. We'll get rid again of this ugliness shortly. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16block: Convert bs->backing_hd to BdrvChildKevin Wolf
This is the final step in converting all of the BlockDriverState pointers that block drivers use to BdrvChild. After this patch, bs->children contains the full list of child nodes that are referenced by a given BDS, and these children are only referenced through BdrvChild, so that updating the pointer in there is enough for changing edges in the graph. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16block: Remove bdrv_open_image()Kevin Wolf
It is unused now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16block: Convert bs->file to BdrvChildKevin Wolf
This patch removes the temporary duplication between bs->file and bs->file_child by converting everything to BdrvChild. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16block: Introduce BDS.file_childKevin Wolf
Store the BdrvChild for bs->file. At this point, bs->file_child->bs just duplicates the bs->file pointer. Later, it will completely replace it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-02block: disable I/O limits at the beginning of bdrv_close()Alberto Garcia
Disabling I/O limits from a BDS also drains all pending throttled requests, so it should be done at the beginning of bdrv_close() with the rest of the bdrv_drain() calls before the BlockDriver is closed. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-09-14block: Allow specifying driver-specific options to reopenKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-09-14block: Drop bdrv_find_whitelisted_format()Max Reitz
It is unused by now, so we can drop it. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-09-14block: Drop drv parameter from bdrv_fill_options()Max Reitz
Now that this parameter is effectively unused, we can drop it and change the function accordingly. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-09-14block: Drop drv parameter from bdrv_open_inherit()Max Reitz
Now that this parameter is effectively unused, we can drop it and just pass NULL to bdrv_fill_options(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-09-14block: Drop drv parameter from bdrv_open()Max Reitz
Now that this parameter is effectively unused, we can drop it and just pass NULL on to bdrv_open_inherit(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-09-14block: Always pass NULL as drv for bdrv_open()Max Reitz
Change all callers of bdrv_open() to pass the driver name in the options QDict instead of passing its BlockDriver pointer. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-09-11opts: produce valid command line in qemu_opts_printKővágó, Zoltán
This will let us print options in a format that the user would actually write it on the command line (foo=bar,baz=asd,etc=def), without prepending a spurious comma at the beginning of the list, or quoting values unnecessarily. This patch provides the following changes: * write and id=, if the option has an id * do not print separator before the first element * do not quote string arguments * properly escape commas (,) for QEMU Signed-off-by: Kővágó, Zoltán <DirtY.iCE.hu@gmail.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-02block: more check for replaced nodeWen Congyang
We use mirror+replace to fix quorum's broken child. bs/s->common.bs is quorum, and to_replace is the broken child. The new child is target_bs. Without this patch, the replace node can be any node, and it can be top BDS with BB, or another quorum's child. We just check if the broken child is part of the quorum BDS in this patch. Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Message-id: 55A86486.1000404@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-14block: Fix backing file child when modifying graphKevin Wolf
This patch moves bdrv_attach_child() from the individual places that add a backing file to a BDS to bdrv_set_backing_hd(), which is called by all of them. It also adds bdrv_detach_child() there. For normal operation (starting with one backing file chain and not changing it until the topmost image is closed) and live snapshots, this constitutes no change in behaviour. For all other cases, this is a fix for the bug that the old backing file was still referenced as a child, and the new one wasn't referenced. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-07-14block: Reorder cleanups in bdrv_close()Kevin Wolf
Block drivers may still want to access their child nodes in their .bdrv_close handler. If they unref and/or detach a child by themselves, this should not result in a double free. There is additional code for backing files, which are just a special case of child nodes. The same applies for them. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-07-14block: Introduce bdrv_unref_child()Kevin Wolf
This is the counterpart for bdrv_open_child(). It decreases the reference count of the child BDS and removes it from the list of children of the given parent BDS. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-07-14block: Introduce bdrv_open_child()Kevin Wolf
It is the same as bdrv_open_image(), except that it doesn't only return success or failure, but the newly created BdrvChild object for the new child node. As the BdrvChild object already contains a BlockDriverState pointer (and this is supposed to become the only pointer so that bdrv_append() and friends can just change a single pointer in BdrvChild), the pbs parameter is removed for bdrv_open_child(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-07-14block: Move bdrv_attach_child() calls up the call chainKevin Wolf
Let the callers of bdrv_open_inherit() call bdrv_attach_child(). It needs to be called in all cases where bdrv_open_inherit() succeeds (i.e. returns 0) and a child_role is given. bdrv_attach_child() is moved upwards to avoid a forward declaration. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-07-07block: Use bdrv_drain to replace uncessary bdrv_drain_allFam Zheng
There callers work on a single BlockDriverState subtree, where using bdrv_drain() is more accurate. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-07block: Initialize local_err in bdrv_append_temp_snapshotFam Zheng
Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1436156684-16526-1-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-02block: Remove bdrv_reset_dirtyFam Zheng
Using this function would always be wrong because a dirty bitmap must have a specific owner that consumes the dirty bits and calls bdrv_reset_dirty_bitmap(). Remove the unused function to avoid future misuse. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23block: Use bdrv_is_sg() everywhereDimitris Aragiorgis
Instead of checking bs->sg use bdrv_is_sg() consistently throughout the code. Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-2-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23util/hbitmap: Add an API to reset all set bits in hbitmapWen Congyang
The function bdrv_clear_dirty_bitmap() is updated to use faster hbitmap_reset_all() call. Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 555E868A.60506@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-22Include qapi/qmp/qerror.h exactly where neededMarkus Armbruster
In particular, don't include it into headers. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
2015-06-22qerror: Move #include out of qerror.hMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
2015-06-22qerror: Clean up QERR_ macros to expand into a single stringMarkus Armbruster
These macros expand into error class enumeration constant, comma, string. Unclean. Has been that way since commit 13f59ae. The error class is always ERROR_CLASS_GENERIC_ERROR since the previous commit. Clean up as follows: * Prepend every use of a QERR_ macro by ERROR_CLASS_GENERIC_ERROR, and delete it from the QERR_ macro. No change after preprocessing. * Rewrite error_set(ERROR_CLASS_GENERIC_ERROR, ...) into error_setg(...). Again, no change after preprocessing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
2015-06-15Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell
Block layer core and image format patches # gpg: Signature made Fri Jun 12 16:08:53 2015 BST using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: (25 commits) block: Fix reopen flag inheritance block: Add BlockDriverState.inherits_from block: Add list of children to BlockDriverState queue.h: Add QLIST_FIX_HEAD_PTR() block: Drain requests before swapping nodes in bdrv_swap() block: Move flag inheritance to bdrv_open_inherit() block: Use QemuOpts in bdrv_open_common() block: Use macro for cache option names vmdk: Use bdrv_open_image() quorum: Use bdrv_open_image() check-qdict: Test cases for new functions qdict: Add qdict_{set,copy}_default() qdict: Add qdict_array_entries() iotests: Add tests for overriding BDRV_O_PROTOCOL block: driver should override flags in bdrv_open() block: Change bitmap truncate conditional to assertion block: record new size in bdrv_dirty_bitmap_truncate raw-posix: Fix .bdrv_co_get_block_status() for unaligned image size vmdk: Use vmdk_find_index_in_cluster everywhere vmdk: Fix index_in_cluster calculation in vmdk_co_get_block_status ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-12block: Fix reopen flag inheritanceKevin Wolf
When reopening an image, the block layer already takes care to reopen bs->file as well with recalculated inherited flags. The same must happen for any other child (most notably missing before this patch: backing files). If bs->file (or any other child) didn't originally inherit from bs, e.g. because it was created separately and then only referenced, it must not inherit flags on reopen either, so check the inherited_from field before propagation the reopen down. VMDK already reopened its extents manually; this code can now be dropped. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-06-12block: Add BlockDriverState.inherits_fromKevin Wolf
Currently, the block layer assumes that any block node can have only one parent, and if it has a parent, that it inherits some options/flags from this parent. This is not true any more: With references used in block device creation, a single node can be used by multiple parents, or it can be created separately and not inherit flags from any parent. To handle reopens correctly, a node must know from which parent it inherited options. This patch adds the information to BlockDriverState. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: Add list of children to BlockDriverStateKevin Wolf
This allows iterating over all children of a given BDS, not only including bs->file and bs->backing_hd, but also driver-specific ones like VMDK extents or Quorum children. For bdrv_swap(), the list of children of the swapped BDS stays at that BDS (because that's where the pointers stay as well). The list head moves and pointers to it must be fixed up therefore. The list of children in the parent of the swapped BDS is not affected by the swap. The contents of the BDS objects is swapped, so the existing pointer in the parent automatically points to the newly swapped in BDS. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: Drain requests before swapping nodes in bdrv_swap()Kevin Wolf
bdrv_swap() requires that there are no requests in flight on either of the two devices. The request coroutine would work on the wrong BlockDriverState object (with bs->opaque even being interpreted as a different type potentially) and all sorts of bad things would result from this. The currently existing callers mostly ensure that there is no I/O pending on nodes that are swapped. In detail, this is: 1. Live snapshots. This goes through qmp_transaction(), which calls bdrv_drain_all() before doing anything. The command is executed synchronously, so no new I/O can be issued concurrently. 2. snapshot=on in bdrv_open(). We're in the middle of opening the image (both the original image and its temporary overlay), so there can't be any I/O in flight yet. 3. Mirroring. bdrv_drain() is already used on the source device so that the mirror doesn't miss anything. However, the main loop runs between that and the bdrv_swap() (which is actually a bug, being addressed in another series), so there is a small window in which new I/O might be issued that would be in flight during bdrv_swap(). It is safer to just drain the request queue of both devices in bdrv_swap() instead of relying on callers to do the right thing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: Move flag inheritance to bdrv_open_inherit()Kevin Wolf
Instead of letting every caller of bdrv_open() determine the right flags for its child node manually and pass them to the function, pass the parent node and the role of the newly opened child (like backing file, protocol layer, etc.). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: Use QemuOpts in bdrv_open_common()Kevin Wolf
Instead of manually parsing options and then deleting them from the options QDict, just use QemuOpts like most other places that deal with block device options. More options will be added there and then QemuOpts is a lot more manageable than open-coding everything. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: driver should override flags in bdrv_open()Max Reitz
The BDRV_O_PROTOCOL flag should have an impact only if no driver is specified explicitly. Therefore, if bdrv_open() is called with an explicit block driver argument (either through the options QDict or through the drv parameter) and that block driver is a protocol block driver, BDRV_O_PROTOCOL should be set; if it is a format block driver, BDRV_O_PROTOCOL should be unset. While there was code to unset the flag in case a format block driver has been selected, it only followed the bdrv_fill_options() function call whereas the flag in fact needs to be adjusted before it is used there. With that change, BDRV_O_PROTOCOL will always be set if the BDS should be a protocol driver; if the driver has been specified explicitly, the new code will set it; and bdrv_fill_options() will only "probe" a protocol driver if BDRV_O_PROTOCOL is set. The probing after bdrv_fill_options() cannot select a protocol driver. Thus, bdrv_open_image() to open BDS.file is never called if a protocol BDS is about to be created. With that change in turn it is impossible to call bdrv_open_common() with a protocol drv and file != NULL, which allows us to remove the bdrv_swap() call. This change breaks a test case in qemu-iotest 051: "-drive file=t.qcow2,file.driver=qcow2" now works because the explicitly specified "qcow2" overrides the BDRV_O_PROTOCOL which is automatically set for the "file" BDS (and the filename is just passed down). Therefore, this patch removes that test case. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12block: Change bitmap truncate conditional to assertionJohn Snow
This is an artifact of an older version that had both all-bitmap and single-bitmap truncate functions, and some info got lost in the shuffle. Bitmaps can only be frozen during a backup operation, and a backup operation should prevent a resize operation, so just assert that this cannot happen. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12block: record new size in bdrv_dirty_bitmap_truncateJohn Snow
ce1ffea8 neglected to update the BdrvDirtyBitmap structure itself for internal consistency. It's currently not an issue, but for migration and persistence series this will cause headaches. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12throttle: acquire the ThrottleGroup lock in bdrv_swap()Alberto Garcia
bdrv_swap() touches the fields of a BlockDriverState that are protected by the ThrottleGroup lock. Although those fields end up in their original place, they are temporarily swapped in the process, so there's a chance that an operation on a member of the same group happening on a different thread can try to use them. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: d92dc40d7c4f1fc5cda5cbbf4ffb7a4670b79d17.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12throttle: Add throttle group supportAlberto Garcia
The throttle group support use a cooperative round robin scheduling algorithm. The principles of the algorithm are simple: - Each BDS of the group is used as a token in a circular way. - The active BDS computes if a wait must be done and arms the right timer. - If a wait must be done the token timer will be armed so the token will become the next active BDS. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: f0082a86f3ac01c46170f7eafe2101a92e8fde39.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12throttle: Extract timers from ThrottleState into a separate structureBenoît Canet
Group throttling will share ThrottleState between multiple bs. As a consequence the ThrottleState will be accessed by multiple aio context. Timers are tied to their aio context so they must go out of the ThrottleState structure. This commit paves the way for each bs of a common ThrottleState to have its own timer. Signed-off-by: Benoit Canet <benoit.canet@nodalink.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 6cf9ea96d8b32ae2f8769cead38f68a6a0c8c909.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-29qapi: add dirty bitmap statusJohn Snow
Bitmaps can be in a handful of different states with potentially more to come as we tool around with migration and persistence patches. Management applications may need to know why certain bitmaps are unavailable for various commands, e.g. busy in another operation, busy being migrated, etc. Right now, all we offer is BlockDirtyInfo's boolean member 'frozen'. Instead of adding more booleans, replace it by an enumeration member 'status' with values 'active' and 'frozen'. Then add new value 'disabled'. Incompatible change. Fine because the changed part hasn't been released so far. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [Commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-05-22block: Detect multiplication overflow in bdrv_getlengthFam Zheng
Bogus image may have a large total_sectors that will overflow the multiplication. For cleanness, fix the return code so the error message will be meaningful. Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-05-22block: align bounce buffers to pageDenis V. Lunev
The following sequence int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644); for (i = 0; i < 100000; i++) write(fd, buf, 4096); performs 5% better if buf is aligned to 4096 bytes. The difference is quite reliable. On the other hand we do not want at the moment to enforce bounce buffering if guest request is aligned to 512 bytes. The patch changes default bounce buffer optimal alignment to MAX(page size, 4k). 4k is chosen as maximal known sector size on real HDD. The justification of the performance improve is quite interesting. From the kernel point of view each request to the disk was split by two. This could be seen by blktrace like this: 9,0 11 1 0.000000000 11151 Q WS 312737792 + 1023 [qemu-img] 9,0 11 2 0.000007938 11151 Q WS 312738815 + 8 [qemu-img] 9,0 11 3 0.000030735 11151 Q WS 312738823 + 1016 [qemu-img] 9,0 11 4 0.000032482 11151 Q WS 312739839 + 8 [qemu-img] 9,0 11 5 0.000041379 11151 Q WS 312739847 + 1016 [qemu-img] 9,0 11 6 0.000042818 11151 Q WS 312740863 + 8 [qemu-img] 9,0 11 7 0.000051236 11151 Q WS 312740871 + 1017 [qemu-img] 9,0 5 1 0.169071519 11151 Q WS 312741888 + 1023 [qemu-img] After the patch the pattern becomes normal: 9,0 6 1 0.000000000 12422 Q WS 314834944 + 1024 [qemu-img] 9,0 6 2 0.000038527 12422 Q WS 314835968 + 1024 [qemu-img] 9,0 6 3 0.000072849 12422 Q WS 314836992 + 1024 [qemu-img] 9,0 6 4 0.000106276 12422 Q WS 314838016 + 1024 [qemu-img] and the amount of requests sent to disk (could be calculated counting number of lines in the output of blktrace) is reduced about 2 times. Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest does his job well and real requests comes properly aligned (to page). Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1431441056-26198-3-git-send-email-den@openvz.org CC: Paolo Bonzini <pbonzini@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22block: minimal bounce buffer alignmentDenis V. Lunev
The patch introduces new concept: minimal memory alignment for bounce buffers. Original so called "optimal" value is actually minimal required value for aligment. It should be used for validation that the IOVec is properly aligned and bounce buffer is not required. Though, from the performance point of view, it would be better if bounce buffer or IOVec allocated by QEMU will be aligned stricter. The patch does not change any alignment value yet. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1431441056-26198-2-git-send-email-den@openvz.org CC: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-04-28block: move I/O request processing to block/io.cStefan Hajnoczi
The block.c file has grown to over 6000 lines. It is time to split this file so there are fewer conflicts and the code is easier to maintain. Extract I/O request processing code: * Read * Write * Zero writes and making the image empty * Flush * Discard * ioctl * Tracked requests and queuing * Throttling and copy-on-read * Block status and allocated functions * Refreshing block limits * Reading/writing vmstate * qemu_blockalign() and friends The patch simply moves code from block.c into block/io.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: extract bdrv_setup_io_funcs()Stefan Hajnoczi
Move the code to install coroutine and aio emulation function pointers in a BlockDriver to its own function. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: add bdrv_set_dirty()/bdrv_reset_dirty() to block_int.hStefan Hajnoczi
The dirty bitmap functions are called from the block I/O processing code. Make them visible to block_int.h users so they can be used outside block.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: replace bdrv_states iteration with bdrv_next()Stefan Hajnoczi
The bdrv_states list is a static variable in block.c. bdrv_drain_all() and bdrv_flush_all() use this variable to iterate over all drives. The next patch will move bdrv_drain_all() and bdrv_flush_all() out of block.c so it's necessary to switch to the public bdrv_next() interface. Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>