aboutsummaryrefslogtreecommitdiff
path: root/block.h
AgeCommit message (Collapse)Author
2012-03-12block: handle -EBUSY in bdrv_commit_all()Stefan Hajnoczi
Monitor operations that manipulate image files must not execute while a background job (like image streaming) is in progress. This prevents corruptions from happening when two pieces of code are manipulating the image file without knowledge of each other. The monitor "commit" command raises QERR_DEVICE_IN_USE when bdrv_commit() returns -EBUSY but "commit all" has no error handling. This is easy to fix, although note that we do not deliver a detailed error about which device was busy in the "commit all" case. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-29qapi: Introduce blockdev-group-snapshot-sync commandJeff Cody
This is a QAPI/QMP only command to take a snapshot of a group of devices. This is similar to the blockdev-snapshot-sync command, except blockdev-group-snapshot-sync accepts a list devices, filenames, and formats. It is attempted to keep the snapshot of the group atomic; if the creation or open of any of the new snapshots fails, then all of the new snapshots are abandoned, and the name of the snapshot image that failed is returned. The failure case should not interrupt any operations. Rather than use bdrv_close() along with a subsequent bdrv_open() to perform the pivot, the original image is never closed and the new image is placed 'in front' of the original image via manipulation of the BlockDriverState fields. Thus, once the new snapshot image has been successfully created, there are no more failure points before pivoting to the new snapshot. This allows the group of disks to remain consistent with each other, even across snapshot failures. Signed-off-by: Jeff Cody <jcody@redhat.com> Acked-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-29block: add a transfer rate for floppy typesHervé Poussineau
Floppies must be read at a specific transfer rate, depending of its own format. Update floppy description table to include required transfer rate. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-22block: bdrv_eject(): Make eject_flag a real boolLuiz Capitulino
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>
2012-02-22block: Rename bdrv_mon_event() & BlockMonEventActionLuiz Capitulino
They are QMP events, not monitor events. Rename them accordingly. Also, move bdrv_emit_qmp_error_event() up in the file. A new event will be added soon and it's good to have them next each other. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>
2012-02-09block: add .bdrv_co_write_zeroes() interfaceStefan Hajnoczi
The ability to zero regions of an image file is a useful primitive for higher-level features such as image streaming or zero write detection. Image formats may support an optimized metadata representation instead of writing zeroes into the image file. This allows zero writes to be potentially faster than regular write operations and also preserve sparseness of the image file. The .bdrv_co_write_zeroes() interface should be implemented by block drivers that wish to provide efficient zeroing. Note that this operation is different from the discard operation, which may leave the contents of the region indeterminate. That means discarded blocks are not guaranteed to contain zeroes and may contain junk data instead. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-01-26block: add bdrv_find_backing_imageMarcelo Tosatti
Add bdrv_find_backing_image: given a BlockDriverState pointer, and an id, traverse the backing image chain to locate the id. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-01-26block: make copy-on-read a per-request flagStefan Hajnoczi
Previously copy-on-read could only be enabled for all requests to a block device. This means requests coming from the guest as well as QEMU's internal requests would perform copy-on-read when enabled. For image streaming we want to support finer-grained behavior than just populating the image file from its backing image. Image streaming supports partial streaming where a common backing image is preserved. In this case guest requests should not perform copy-on-read because they would indiscriminately copy data which should be left in a backing image from the backing chain. Introduce a per-request flag for copy-on-read so that a block device can process both regular and copy-on-read requests. Overlapping reads and writes still need to be serialized for correctness when copy-on-read is happening, so add an in-flight reference count to track this. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-15qcow2: Allow >4 GB VM stateKevin Wolf
This is a compatible extension to the snapshot header format that allows saving a 64 bit VM state size. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-14Fix spelling in comments, documentation and messagesStefan Weil
accidently->accidentally annother->another choosen->chosen consideres->considers decriptor->descriptor developement->development paramter->parameter preceed->precede preceeding->preceding priviledge->privilege propogation->propagation substraction->subtraction throught->through upto->up to usefull->useful Fix also grammar in posix-aio-compat.c Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05block: convert qemu_aio_flush() calls to bdrv_drain_all()Stefan Hajnoczi
Many places in QEMU call qemu_aio_flush() to complete all pending asynchronous I/O. Most of these places actually want to drain all block requests but there is no block layer API to do so. This patch introduces the bdrv_drain_all() API to wait for requests across all BlockDriverStates to complete. As a bonus we perform checks after qemu_aio_wait() to ensure that requests really have finished. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05block: add interface to toggle copy-on-readStefan Hajnoczi
The bdrv_enable_copy_on_read()/bdrv_disable_copy_on_read() functions can be used to programmatically enable or disable copy-on-read for a block device. Later patches add the actual copy-on-read logic. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05block: add bdrv_co_is_allocated() interfaceStefan Hajnoczi
This patch introduces the public bdrv_co_is_allocated() interface which can be used to query image allocation status while the VM is running. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05block: add I/O throttling algorithmZhi Yong Wu
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05block: add the blockio limits command line supportZhi Yong Wu
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-11-21block: allow migration to work with image files (v3)Anthony Liguori
Image files have two types of data: immutable data that describes things like image size, backing files, etc. and mutable data that includes offset and reference count tables. Today, image formats aggressively cache mutable data to improve performance. In some cases, this happens before a guest even starts. When dealing with live migration, since a file is open on two machines, the caching of meta data can lead to data corruption. This patch addresses this by introducing a mechanism to invalidate any cached mutable data a block driver may have which is then used by the live migration code. NB, this still requires coherent shared storage. Addressing migration without coherent shared storage (i.e. NFS) requires additional work. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-11block: add eject request callbackPaolo Bonzini
Recent versions of udev always keep the tray locked so that the kernel can observe "eject request" events (aka tray button presses) even on discs that aren't mounted. Add support for these events in the ATAPI and SCSI cd drive device models. To let management cope with the behavior of udev, an event should also be added for "tray opened/closed". This way, after issuing an "eject" command, management can poll until the guests actually reacts to the command. They can then issue the "change" command after the tray has been opened, or try with "eject -f" after a (configurable?) timeout. However, with this patch and the corresponding support in the device models, at least it is possible to do a manual two-step eject+change sequence. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-27qapi: Convert query-blockLuiz Capitulino
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-10-27block: Rename the BlockIOStatus enum valuesLuiz Capitulino
The biggest change is to rename its prefix from BDRV_IOS to BLOCK_DEVICE_IO_STATUS. Next commit will convert the query-block command to the QAPI and that's how the enumeration is going to be generated. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-10-27block: iostatus: Drop BDRV_IOS_INVALLuiz Capitulino
A future commit will convert bdrv_info() to the QAPI and it won't provide IOS_INVAL. Luckily all we have to do is to add a new 'iostatus_enabled' member to BlockDriverState and use it instead. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-10-21block: add bdrv_co_discard and bdrv_aio_discard supportPaolo Bonzini
This similarly adds support for coroutine and asynchronous discard. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21block: unify flush implementationsPaolo Bonzini
Add coroutine support for flush and apply the same emulation that we already do for read/write. bdrv_aio_flush is simplified to always go through a coroutine. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-11block: Keep track of devices' I/O statusLuiz Capitulino
This commit adds support to the BlockDriverState type to keep track of devices' I/O status. There are three possible status: BDRV_IOS_OK (no error), BDRV_IOS_ENOSPC (no space error) and BDRV_IOS_FAILED (any other error). The distinction between no space and other errors is important because a management application may want to watch for no space in order to extend the space assigned to the VM and put it to run again. Qemu devices supporting the I/O status feature have to enable it explicitly by calling bdrv_iostatus_enable() _and_ have to be configured to stop the VM on errors (ie. werror=stop|enospc or rerror=stop). In case of multiple errors being triggered in sequence only the first one is stored. The I/O status is always reset to BDRV_IOS_OK when the 'cont' command is issued. Next commits will add support to some devices and extend the query-block/info block commands to return the I/O status information. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12block: New change_media_cb() parameter loadMarkus Armbruster
To let device models distinguish between eject and load. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12block: New bdrv_set_buffer_alignment()Markus Armbruster
Device models should be able to set it without an unclean include of block_int.h. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12block: Move BlockConf & friends from block_int.h to block.hMarkus Armbruster
It's convenience stuff for block device models, so block.h isn't the ideal home either, but better than block_int.h. Permits moving some #include "block_int.h" from device model .h into .c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12block: Show whether the virtual tray is open in info blockMarkus Armbruster
Need to ask the device, so this requires new BlockDevOps member is_tray_open(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12block: Drop BlockDriverState member removableMarkus Armbruster
It's a confused mess (see previous commit). No users remain. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12block: Clean up remaining users of "removable"Markus Armbruster
BlockDriverState member removable is a confused mess. It is true when an ide-cd, scsi-cd or floppy qdev is attached, or when the BlockDriverState was created with -drive if={floppy,sd} or -drive if={ide,scsi,xen,none},media=cdrom ("created removable"), except when an ide-hd, scsi-hd, scsi-generic or virtio-blk qdev is attached. Three users remain: 1. eject_device(), via bdrv_is_removable() uses it to determine whether a block device can eject media. 2. bdrv_info() is monitor command "info block". QMP documentation says "true if the device is removable, false otherwise". From the monitor user's point of view, the only sensible interpretation of "is removable" is "can eject media with monitor commands eject and change". A block device can eject media unless a device is attached that doesn't support it. Switch the two users over to new bdrv_dev_has_removable_media() that returns exactly that. 3. bdrv_getlength() uses to suppress its length cache when media can change (see commit 46a4e4e6). Media change is either monitor command change (updates the length cache), monitor command eject (doesn't update the length cache, easily fixable), or physical media change (invalidates length cache, not so easily fixable). I'm refraining from improving anything here, because this series is long enough already. Instead, I simply switch it over to bdrv_dev_has_removable_media() as well. This changes the behavior of the length cache and of monitor commands eject and change in two cases: a. drive not created removable, no device attached The commit makes the drive removable, and defeats the length cache. Example: -drive if=none b. drive created removable, but the attached drive is non-removable, and doesn't call bdrv_set_removable(..., 0) (most devices don't) The commit makes the drive non-removable, and enables the length cache. Example: -drive if=xen,media=cdrom -M xenpv The other non-removable devices that don't call bdrv_set_removable() can't currently use a drive created removable, either because they aren't qdevified, or because they lack a drive property. Won't stay that way. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12block: Rename bdrv_set_locked() to bdrv_lock_medium()Markus Armbruster
While there, make the locked parameter bool. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12block: Drop medium lock tracking, ask device models insteadMarkus Armbruster
Requires new BlockDevOps member is_medium_locked(). Implement for IDE and SCSI CD-ROMs. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12block: Leave enforcing tray lock to device modelsMarkus Armbruster
The device model knows best when to accept the guest's eject command. No need to detour through the block layer. bdrv_eject() can't fail anymore. Make it void. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06block: Declare qemu_blockalign() in block.h, not block_int.hMarkus Armbruster
Device models should be able to use it without an unclean include of block_int.h. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06block: Split change_cb() into change_media_cb(), resize_cb()Markus Armbruster
Multiplexing callbacks complicates matters needlessly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06block: Generalize change_cb() to BlockDevOpsMarkus Armbruster
So we can more easily add device model callbacks. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06block: Attach non-qdev devices as wellMarkus Armbruster
For now, this just protects against programming errors like having the same drive back multiple non-qdev devices, or untimely bdrv_delete(). Later commits will add other interesting uses. While there, rename BlockDriverState member peer to dev, bdrv_attach() to bdrv_attach_dev(), bdrv_detach() to bdrv_detach_dev(), and bdrv_get_attached() to bdrv_get_attached_dev(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-26block: latency accountingChristoph Hellwig
Account the total latency for read/write/flush requests. This allows management tools to average it based on a snapshot of the nr ops counters and allow checking for SLAs or provide statistics. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-25block: explicit I/O accountingChristoph Hellwig
Decouple the I/O accounting from bdrv_aio_readv/writev/flush and make the hardware models call directly into the accounting helpers. This means: - we do not count internal requests from image formats in addition to guest originating I/O - we do not double count I/O ops if the device model handles it chunk wise - we only account I/O once it actuall is done - can extent I/O accounting to synchronous or coroutine I/O easily - implement I/O latency tracking easily (see the next patch) I've conveted the existing device model callers to the new model, device models that are using synchronous I/O and weren't accounted before haven't been updated yet. Also scsi hasn't been converted to the end-to-end accounting as I want to defer that after the pending scsi layer overhaul. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23block: parse cache mode flags in a single placeStefan Hajnoczi
This patch introduces bdrv_parse_cache_flags() which sets open flags given a cache mode. Previously this was duplicated in blockdev.c and qemu-img.c. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-02block: Add bdrv_co_readv/writevKevin Wolf
Add new block driver callbacks bdrv_co_readv/writev, which work on a QEMUIOVector like bdrv_aio_*, but don't need a callback. The function may only be called inside a coroutine, so a block driver implementing this interface can yield instead of blocking during I/O. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-01block: Removed unused function bdrv_write_syncFrediano Ziglio
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19block: add bdrv_get_allocated_file_size() operationFam Zheng
qemu-img.c wants to count allocated file size of image. Previously it counts a single bs->file by 'stat' or Window API. As VMDK introduces multiple file support, the operation becomes format specific with platform specific meanwhile. The functions are moved to block/raw-{posix,win32}.c and qemu-img.c calls bdrv_get_allocated_file_size to count the bs. And also added VMDK code to count his own extents. Signed-off-by: Fam Zheng <famcool@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-15Replaced tabs with spaces in block.h and block_int.hDevin Nakamura
Signed-off-by: Devin Nakamura <devin122@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-05-19block: Remove type hint, it's guest matter, doesn't belong hereMarkus Armbruster
No users of bdrv_get_type_hint() left. bdrv_set_type_hint() can make the media removable by side effect. Make that explicit. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-04-07Do not delete BlockDriverState when deleting the driveRyan Harper
When removing a drive from the host-side via drive_del we currently have the following path: drive_del qemu_aio_flush() bdrv_close() // zaps bs->drv, which makes any subsequent I/O get // dropped. Works as designed drive_uninit() bdrv_delete() // frees the bs. Since the device is still connected to // bs, any subsequent I/O is a use-after-free. The value of bs->drv becomes unpredictable on free. As long as it remains null, I/O still gets dropped, however it could become non-null at any point after the free resulting SEGVs or other QEMU state corruption. To resolve this issue as simply as possible, we can chose to not actually delete the BlockDriverState pointer. Since bdrv_close() handles setting the drv pointer to NULL, we just need to remove the BlockDriverState from the QLIST that is used to enumerate the block devices. This is currently handled within bdrv_delete, so move this into its own function, bdrv_make_anon(). The result is that we can now invoke drive_del, this closes the file descriptors and sets BlockDriverState->drv to NULL which prevents futher IO to the device, and since we do not free BlockDriverState, we don't have to worry about the copy retained in the block devices. We also don't attempt to remove the qdev property since we are no longer deleting the BlockDriverState on drives with associated drives. This also allows for removing Drives with no devices associated either. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Ryan Harper <ryanh@us.ibm.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-20fdc: move floppy geometry guessing to block.cBlue Swirl
Other geometry guessing functions already reside in block.c. Remove some unused or debugging only fields. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-02-07Add flag to indicate external users to block deviceMarcelo Tosatti
Certain operations such as drive_del or resize cannot be performed while external users (eg. block migration) reference the block device. Add a flag to indicate that. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-31block: tell drivers about an image resizeChristoph Hellwig
Extend the change_cb callback with a reason argument, and use it to tell drivers about size changes. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-12-17block: add discard supportChristoph Hellwig
Add a new bdrv_discard method to free blocks in a mapping image, and a new drive property to set the granularity for these discard. If no discard granularity support is set discard support is disabled. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-12-17qemu-img.c: Re-factor img_create()Jes Sorensen
This patch re-factors img_create() moving the code doing the actual work into block.c where it can be shared with QEMU. This is needed to be able to create images from QEMU to be used for live snapshots. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>