aboutsummaryrefslogtreecommitdiff
path: root/drivers/xen
AgeCommit message (Collapse)Author
2015-06-06xen/events: don't bind non-percpu VIRQs with percpu chipDavid Vrabel
commit 77bb3dfdc0d554befad58fdefbc41be5bc3ed38a upstream. A non-percpu VIRQ (e.g., VIRQ_CONSOLE) may be freed on a different VCPU than it is bound to. This can result in a race between handle_percpu_irq() and removing the action in __free_irq() because handle_percpu_irq() does not take desc->lock. The interrupt handler sees a NULL action and oopses. Only use the percpu chip/handler for per-CPU VIRQs (like VIRQ_TIMER). # cat /proc/interrupts | grep virq 40: 87246 0 xen-percpu-virq timer0 44: 0 0 xen-percpu-virq debug0 47: 0 20995 xen-percpu-virq timer1 51: 0 0 xen-percpu-virq debug1 69: 0 0 xen-dyn-virq xen-pcpu 74: 0 0 xen-dyn-virq mce 75: 29 0 xen-dyn-virq hvc_console Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-17xen/events: Set irq_info->evtchn before binding the channel to CPU in ↵Boris Ostrovsky
__startup_pirq() commit 16e6bd5970c88a2ac018b84a5f1dd5c2ff1fdf2c upstream. .. because bind_evtchn_to_cpu(evtchn, cpu) will map evtchn to 'info' and pass 'info' down to xen_evtchn_port_bind_to_cpu(). Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Annie Li <annie.li@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-17xen/events: Clear cpu_evtchn_mask before resumingBoris Ostrovsky
commit 5cec98834989a014a9560b1841649eaca95cf00e upstream. When a guest is resumed, the hypervisor may change event channel assignments. If this happens and the guest uses 2-level events it is possible for the interrupt to be claimed by wrong VCPU since cpu_evtchn_mask bits may be stale. This can happen even though evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that is passed in is not necessarily the original one (from pre-migration times) but instead is freshly allocated during resume and so any information about which CPU the channel was bound to is lost. Thus we should clear the mask during resume. We also need to make sure that bits for xenstore and console channels are set when these two subsystems are resumed. While rebind_evtchn_irq() (which is invoked for both of them on a resume) calls irq_set_affinity(), the latter will in fact postpone setting affinity until handling the interrupt. But because cpu_evtchn_mask will have bits for these two cleared we won't be able to take the interrupt. With that in mind, we need to bind those two channels explicitly in rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a pass through generic irq affinity code later, in case something needs to be updated there as well. (Also replace cpumask_of(0) with cpumask_of(info->cpu) in rebind_evtchn_irq(): it should be set to zero in preceding xen_irq_info_evtchn_setup().) Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reported-by: Annie Li <annie.li@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26xen-pciback: limit guest control of command registerJan Beulich
commit af6fc858a35b90e89ea7a7ee58e66628c55c776b upstream. Otherwise the guest can abuse that control to cause e.g. PCIe Unsupported Request responses by disabling memory and/or I/O decoding and subsequently causing (CPU side) accesses to the respective address ranges, which (depending on system configuration) may be fatal to the host. Note that to alter any of the bits collected together as PCI_COMMAND_GUEST permissive mode is now required to be enabled globally or on the specific device. This is CVE-2015-2150 / XSA-120. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26xen/events: avoid NULL pointer dereference in dom0 on large machinesJuergen Gross
commit 85e40b0539b24518c8bdf63e2605c8522377d00f upstream. Using the pvops kernel a NULL pointer dereference was detected on a large machine (144 processors) when booting as dom0 in evtchn_fifo_unmask() during assignment of a pirq. The event channel in question was the first to need a new entry in event_array[] in events_fifo.c. Unfortunately xen_irq_info_pirq_setup() is called with evtchn being 0 for a new pirq and the real event channel number is assigned to the pirq only during __startup_pirq(). It is mandatory to call xen_evtchn_port_setup() after assigning the event channel number to the pirq to make sure all memory needed for the event channel is allocated. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-06xen/manage: Fix USB interaction issues when resumingRoss Lagerwall
commit 72978b2fe2f2cdf9f319c6c6dcdbe92b38de2be2 upstream. Commit 61a734d305e1 ("xen/manage: Always freeze/thaw processes when suspend/resuming") ensured that userspace processes were always frozen before suspending to reduce interaction issues when resuming devices. However, freeze_processes() does not freeze kernel threads. Freeze kernel threads as well to prevent deadlocks with the khubd thread when resuming devices. This is what native suspend and resume does. Example deadlock: [ 7279.648010] [<ffffffff81446bde>] ? xen_poll_irq_timeout+0x3e/0x50 [ 7279.648010] [<ffffffff81448d60>] xen_poll_irq+0x10/0x20 [ 7279.648010] [<ffffffff81011723>] xen_lock_spinning+0xb3/0x120 [ 7279.648010] [<ffffffff810115d1>] __raw_callee_save_xen_lock_spinning+0x11/0x20 [ 7279.648010] [<ffffffff815620b6>] ? usb_control_msg+0xe6/0x120 [ 7279.648010] [<ffffffff81747e50>] ? _raw_spin_lock_irq+0x50/0x60 [ 7279.648010] [<ffffffff8174522c>] wait_for_completion+0xac/0x160 [ 7279.648010] [<ffffffff8109c520>] ? try_to_wake_up+0x2c0/0x2c0 [ 7279.648010] [<ffffffff814b60f2>] dpm_wait+0x32/0x40 [ 7279.648010] [<ffffffff814b6eb0>] device_resume+0x90/0x210 [ 7279.648010] [<ffffffff814b7d71>] dpm_resume+0x121/0x250 [ 7279.648010] [<ffffffff8144c570>] ? xenbus_dev_request_and_reply+0xc0/0xc0 [ 7279.648010] [<ffffffff814b80d5>] dpm_resume_end+0x15/0x30 [ 7279.648010] [<ffffffff81449fba>] do_suspend+0x10a/0x200 [ 7279.648010] [<ffffffff8144a2f0>] ? xen_pre_suspend+0x20/0x20 [ 7279.648010] [<ffffffff8144a1d0>] shutdown_handler+0x120/0x150 [ 7279.648010] [<ffffffff8144c60f>] xenwatch_thread+0x9f/0x160 [ 7279.648010] [<ffffffff810ac510>] ? finish_wait+0x80/0x80 [ 7279.648010] [<ffffffff8108d189>] kthread+0xc9/0xe0 [ 7279.648010] [<ffffffff8108d0c0>] ? flush_kthread_worker+0x80/0x80 [ 7279.648010] [<ffffffff8175087c>] ret_from_fork+0x7c/0xb0 [ 7279.648010] [<ffffffff8108d0c0>] ? flush_kthread_worker+0x80/0x80 [ 7441.216287] INFO: task khubd:89 blocked for more than 120 seconds. [ 7441.219457] Tainted: G X 3.13.11-ckt12.kz #1 [ 7441.222176] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 7441.225827] khubd D ffff88003f433440 0 89 2 0x00000000 [ 7441.229258] ffff88003ceb9b98 0000000000000046 ffff88003ce83000 0000000000013440 [ 7441.232959] ffff88003ceb9fd8 0000000000013440 ffff88003cd13000 ffff88003ce83000 [ 7441.236658] 0000000000000286 ffff88003d3e0000 ffff88003ceb9bd0 00000001001aa01e [ 7441.240415] Call Trace: [ 7441.241614] [<ffffffff817442f9>] schedule+0x29/0x70 [ 7441.243930] [<ffffffff81743406>] schedule_timeout+0x166/0x2c0 [ 7441.246681] [<ffffffff81075b80>] ? call_timer_fn+0x110/0x110 [ 7441.249339] [<ffffffff8174357e>] schedule_timeout_uninterruptible+0x1e/0x20 [ 7441.252644] [<ffffffff81077710>] msleep+0x20/0x30 [ 7441.254812] [<ffffffff81555f00>] hub_port_reset+0xf0/0x580 [ 7441.257400] [<ffffffff81558465>] hub_port_init+0x75/0xb40 [ 7441.259981] [<ffffffff814bb3c9>] ? update_autosuspend+0x39/0x60 [ 7441.262817] [<ffffffff814bb4f0>] ? pm_runtime_set_autosuspend_delay+0x50/0xa0 [ 7441.266212] [<ffffffff8155a64a>] hub_thread+0x71a/0x1750 [ 7441.268728] [<ffffffff810ac510>] ? finish_wait+0x80/0x80 [ 7441.271272] [<ffffffff81559f30>] ? usb_port_resume+0x670/0x670 [ 7441.274067] [<ffffffff8108d189>] kthread+0xc9/0xe0 [ 7441.276305] [<ffffffff8108d0c0>] ? flush_kthread_worker+0x80/0x80 [ 7441.279131] [<ffffffff8175087c>] ret_from_fork+0x7c/0xb0 [ 7441.281659] [<ffffffff8108d0c0>] ? flush_kthread_worker+0x80/0x80 Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-29Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single"David Vrabel
commit dbdd74763f1faf799fbb9ed30423182e92919378 upstream. This reverts commit 2c3fc8d26dd09b9d7069687eead849ee81c78e46. This commit broke on x86 PV because entries in the generic SWIOTLB are indexed using (pseudo-)physical address not DMA address and these are not the same in a x86 PV guest. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_singleStefano Stabellini
commit 2c3fc8d26dd09b9d7069687eead849ee81c78e46 upstream. Need to pass the pointer within the swiotlb internal buffer to the swiotlb library, that in the case of xen_unmap_single is dev_addr, not paddr. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16swiotlb-xen: call xen_dma_sync_single_for_device when appropriateStefano Stabellini
commit 9490c6c67e2f41760de8ece4e4f56f75f84ceb9e upstream. In xen_swiotlb_sync_single we always call xen_dma_sync_single_for_cpu, even when we should call xen_dma_sync_single_for_device. Fix that. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16swiotlb-xen: remove BUG_ON in xen_bus_to_physStefano Stabellini
commit c884227eaae9936f8ecbde6e1387bccdab5f4e90 upstream. On x86 truncation cannot occur because config XEN depends on X86_64 || (X86_32 && X86_PAE). On ARM truncation can occur without CONFIG_ARM_LPAE, when the dma operation involves foreign grants. However in that case the physical address returned by xen_bus_to_phys is actually invalid (there is no mfn to pfn tracking for foreign grants on ARM) and it is not used. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16swiotlb-xen: pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpuStefano Stabellini
commit d6883e6f32e07ef2cc974753ba00927de099e6d7 upstream. xen_dma_unmap_page and xen_dma_sync_single_for_cpu take a dma_addr_t handle as argument, not a physical address. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05xen/manage: Always freeze/thaw processes when suspend/resumingRoss Lagerwall
commit 61a734d305e16944b42730ef582a7171dc733321 upstream. Always freeze processes when suspending and thaw processes when resuming to prevent a race noticeable with HVM guests. This prevents a deadlock where the khubd kthread (which is designed to be freezable) acquires a usb device lock and then tries to allocate memory which requires the disk which hasn't been resumed yet. Meanwhile, the xenwatch thread deadlocks waiting for the usb device lock. Freezing processes fixes this because the khubd thread is only thawed after the xenwatch thread finishes resuming all the devices. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-17xen/events/fifo: reset control block and local HEADs on resumeDavid Vrabel
commit c12784c3d14a2110468ec4d1383f60cfd2665576 upstream. When using the FIFO-based event channel ABI, if the control block or the local HEADs are not reset after resuming the guest may see stale HEAD values and will fail to traverse the FIFO correctly. This may prevent one or more VCPUs from receiving any events following a resume. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-05xen/events/fifo: ensure all bitops are properly aligned even on x86David Vrabel
commit dcecb8fd93a65787130a74e61fdf29932c8d85eb upstream. When using the FIFO-based ABI on x86_64, if the last port is at the end of an event array page then sync_test_bit() on this port's event word will read beyond the end of the page and in certain circumstances this may fault. The fault requires the following page in the kernel's direct mapping to be not present, which would mean: a) the array page is the last page of RAM; or b) the following page is ballooned out /and/ it has been used for a foreign mapping by a kernel driver (such as netback or blkback) /and/ the grant has been unmapped. Use the infrastructure added for arm64 to ensure that all bitops operating on event words are unsigned long aligned. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-28xen/balloon: set ballooned out pages as invalid in p2mDavid Vrabel
commit fb9a0c443691ceaab3daba966bbbd9f5ff3aa26f upstream. Since cd9151e26d31048b2b5e00fd02e110e07d2200c9 (xen/balloon: set a mapping for ballooned out pages), a ballooned out page had its entry in the p2m set to the MFN of one of the scratch pages. This means that the p2m will contain many entries pointing to the same MFN. During a domain save, these many-to-one entries are not identified as such and the scratch page is saved multiple times. On restore the ballooned pages are populated with new frames and the domain may use up its allocation before all pages can be restored. Since the original fix only needed to keep a mapping for the ballooned page it is safe to set ballooned out pages as INVALID_P2M_ENTRY in the p2m (as they were before). Thus preventing them from being saved and re-populated on restore. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reported-by: Marek Marczykowski <marmarek@invisiblethingslab.com> Tested-by: Marek Marczykowski <marmarek@invisiblethingslab.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-31xen/events/fifo: correctly align bitopsVladimir Murzin
commit 05a812ac474d0d6aef6d54b66bb08b81abde79c6 upstream. FIFO event channels require bitops on 32-bit aligned values (the event words). Linux's bitops require unsigned long alignment which may be 64-bits. On arm64 an incorrectly unaligned access will fault. Fix this by aligning the bitops along with an adjustment for bit position and using an unsigned long for the local copy of the ready word. Signed-off-by: Vladimir Murzin <murzin.v@gmail.com> Tested-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Reviewed-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-25xen/balloon: flush persistent kmaps in correct positionWei Liu
Xen balloon driver will update ballooned out pages' P2M entries to point to scratch page for PV guests. In 24f69373e2 ("xen/balloon: don't alloc page while non-preemptible", kmap_flush_unused was moved after updating P2M table. In that case for 32 bit PV guest we might end up with P2M X -----> S (S is mfn of balloon scratch page) M2P Y -----> X (Y is mfn in persistent kmap entry) kmap_flush_unused() iterates through all the PTEs in the kmap address space, using pte_to_page() to obtain the page. If the p2m and the m2p are inconsistent the incorrect page is returned. This will clear page->address on the wrong page which may cause subsequent oopses if that page is currently kmap'ed. Move the flush back between get_page and __set_phys_to_machine to fix this. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: stable@vger.kernel.org # 3.12+
2014-02-12Merge tag 'stable/for-linus-3.14-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen bugfixes from Konrad Rzeszutek Wilk: "This has an healthy amount of code being removed - which we do not use anymore (the only user of it was ia64 Xen which had been removed already). The other bug-fixes are to make Xen ARM be able to use the new event channel mechanism and proper export of header files to user-space. Summary: - Fix ARM and Xen FIFO not working. - Remove more Xen ia64 vestigates. - Fix UAPI missing Xen files" * tag 'stable/for-linus-3.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: ia64/xen: Remove Xen support for ia64 even more xen: install xen/gntdev.h and xen/gntalloc.h xen/events: bind all new interdomain events to VCPU0
2014-02-11ia64/xen: Remove Xen support for ia64 even morePaul Bolle
Commit d52eefb47d4e ("ia64/xen: Remove Xen support for ia64") removed the Kconfig symbol XEN_XENCOMM. But it didn't remove the code depending on that symbol. Remove that code now. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Acked-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-02-11xen/events: bind all new interdomain events to VCPU0David Vrabel
Commit fc087e10734a4d3e40693fc099461ec1270b3fff (xen/events: remove unnecessary init_evtchn_cpu_bindings()) causes a regression. The kernel-side VCPU binding was not being correctly set for newly allocated or bound interdomain events. In ARM guests where 2-level events were used, this would result in no interdomain events being handled because the kernel-side VCPU masks would all be clear. x86 guests would work because the irq affinity was set during irq setup and this would set the correct kernel-side VCPU binding. Fix this by properly initializing the kernel-side VCPU binding in bind_evtchn_to_irq(). Reported-and-tested-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-02-05Merge tag 'stable/for-linus-3.14-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen fixes from Konrad Rzeszutek Wilk: "Bug-fixes: - Revert "xen/grant-table: Avoid m2p_override during mapping" as it broke Xen ARM build. - Fix CR4 not being set on AP processors in Xen PVH mode" * tag 'stable/for-linus-3.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pvh: set CR4 flags for APs Revert "xen/grant-table: Avoid m2p_override during mapping"
2014-02-03Revert "xen/grant-table: Avoid m2p_override during mapping"Konrad Rzeszutek Wilk
This reverts commit 08ece5bb2312b4510b161a6ef6682f37f4eac8a1. As it breaks ARM builds and needs more attention on the ARM side. Acked-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-01-31Merge tag 'stable/for-linus-3.14-rc0-late-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen bugfixes from Konrad Rzeszutek Wilk: "Bug-fixes for the new features that were added during this cycle. There are also two fixes for long-standing issues for which we have a solution: grant-table operations extra work that was not needed causing performance issues and the self balloon code was too aggressive causing OOMs. Details: - Xen ARM couldn't use the new FIFO events - Xen ARM couldn't use the SWIOTLB if compiled as 32-bit with 64-bit PCIe devices. - Grant table were doing needless M2P operations. - Ratchet down the self-balloon code so it won't OOM. - Fix misplaced kfree in Xen PVH error code paths" * tag 'stable/for-linus-3.14-rc0-late-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pvh: Fix misplaced kfree from xlated_setup_gnttab_pages drivers: xen: deaggressive selfballoon driver xen/grant-table: Avoid m2p_override during mapping xen/gnttab: Use phys_addr_t to describe the grant frame base address xen: swiotlb: handle sizeof(dma_addr_t) != sizeof(phys_addr_t) arm/xen: Initialize event channels earlier
2014-01-31drivers: xen: deaggressive selfballoon driverBob Liu
Current xen-selfballoon driver is too aggressive which may cause OOM be triggered more often. Eg. this bug reported by James: https://lkml.org/lkml/2013/11/21/158 There are two mainly reasons: 1) The original goal_page didn't consider some pages used by kernel space, like slab pages and pages used by device drivers. 2) The balloon driver may not give back memory to guest OS fast enough when the workload suddenly aquries a lot of physical memory. In both cases, the guest OS will suffer from memory pressure and OOM may be triggered. The fix is make xen-selfballoon driver not that aggressive by adding extra 10% of total ram pages to goal_page. It's more valuable to keep the guest system reliable and response faster than balloon out these 10% pages to XEN. Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-01-31xen/grant-table: Avoid m2p_override during mappingZoltan Kiss
The grant mapping API does m2p_override unnecessarily: only gntdev needs it, for blkback and future netback patches it just cause a lock contention, as those pages never go to userspace. Therefore this series does the following: - the original functions were renamed to __gnttab_[un]map_refs, with a new parameter m2p_override - based on m2p_override either they follow the original behaviour, or just set the private flag and call set_phys_to_machine - gnttab_[un]map_refs are now a wrapper to call __gnttab_[un]map_refs with m2p_override false - a new function gnttab_[un]map_refs_userspace provides the old behaviour It also removes a stray space from page.h and change ret to 0 if XENFEAT_auto_translated_physmap, as that is the only possible return value there. v2: - move the storing of the old mfn in page->index to gnttab_map_refs - move the function header update to a separate patch v3: - a new approach to retain old behaviour where it needed - squash the patches into one v4: - move out the common bits from m2p* functions, and pass pfn/mfn as parameter - clear page->private before doing anything with the page, so m2p_find_override won't race with this v5: - change return value handling in __gnttab_[un]map_refs - remove a stray space in page.h - add detail why ret = 0 now at some places v6: - don't pass pfn to m2p* functions, just get it locally Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Suggested-by: David Vrabel <david.vrabel@citrix.com> Acked-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-01-30xen/gnttab: Use phys_addr_t to describe the grant frame base addressJulien Grall
On ARM, address size can be 32 bits or 64 bits (if CONFIG_ARCH_PHYS_ADDR_T_64BIT is enabled). We can't assume that the grant frame base address will always fits in an unsigned long. Use phys_addr_t instead of unsigned long as argument for gnttab_setup_auto_xlat_frames. Signed-off-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com>
2014-01-30xen: swiotlb: handle sizeof(dma_addr_t) != sizeof(phys_addr_t)Ian Campbell
The use of phys_to_machine and machine_to_phys in the phys<=>bus conversions causes us to lose the top bits of the DMA address if the size of a DMA address is not the same as the size of the phyiscal address. This can happen in practice on ARM where foreign pages can be above 4GB even though the local kernel does not have LPAE page tables enabled (which is totally reasonable if the guest does not itself have >4GB of RAM). In this case the kernel still maps the foreign pages at a phys addr below 4G (as it must) but the resulting DMA address (returned by the grant map operation) is much higher. This is analogous to a hardware device which has its view of RAM mapped up high for some reason. This patch makes I/O to foreign pages (specifically blkif) work on 32-bit ARM systems with more than 4GB of RAM. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2014-01-24Merge tag 'pm+acpi-3.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "As far as the number of commits goes, the top spot belongs to ACPI this time with cpufreq in the second position and a handful of PM core, PNP and cpuidle updates. They are fixes and cleanups mostly, as usual, with a couple of new features in the mix. The most visible change is probably that we will create struct acpi_device objects (visible in sysfs) for all devices represented in the ACPI tables regardless of their status and there will be a new sysfs attribute under those objects allowing user space to check that status via _STA. Consequently, ACPI device eject or generally hot-removal will not delete those objects, unless the table containing the corresponding namespace nodes is unloaded, which is extremely rare. Also ACPI container hotplug will be handled quite a bit differently and cpufreq will support CPU boost ("turbo") generically and not only in the acpi-cpufreq driver. Specifics: - ACPI core changes to make it create a struct acpi_device object for every device represented in the ACPI tables during all namespace scans regardless of the current status of that device. In accordance with this, ACPI hotplug operations will not delete those objects, unless the underlying ACPI tables go away. - On top of the above, new sysfs attribute for ACPI device objects allowing user space to check device status by triggering the execution of _STA for its ACPI object. From Srinivas Pandruvada. - ACPI core hotplug changes reducing code duplication, integrating the PCI root hotplug with the core and reworking container hotplug. - ACPI core simplifications making it use ACPI_COMPANION() in the code "glueing" ACPI device objects to "physical" devices. - ACPICA update to upstream version 20131218. This adds support for the DBG2 and PCCT tables to ACPICA, fixes some bugs and improves debug facilities. From Bob Moore, Lv Zheng and Betty Dall. - Init code change to carry out the early ACPI initialization earlier. That should allow us to use ACPI during the timekeeping initialization and possibly to simplify the EFI initialization too. From Chun-Yi Lee. - Clenups of the inclusions of ACPI headers in many places all over from Lv Zheng and Rashika Kheria (work in progress). - New helper for ACPI _DSM execution and rework of the code in drivers that uses _DSM to execute it via the new helper. From Jiang Liu. - New Win8 OSI blacklist entries from Takashi Iwai. - Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun Guo, Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava, Rashika Kheria, Tang Chen, Zhang Rui. - intel_pstate driver updates, including proper Baytrail support, from Dirk Brandewie and intel_pstate documentation from Ramkumar Ramachandra. - Generic CPU boost ("turbo") support for cpufreq from Lukasz Majewski. - powernow-k6 cpufreq driver fixes from Mikulas Patocka. - cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark Brown. - Assorted cpufreq drivers fixes and cleanups from Anson Huang, John Tobias, Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh Kumar. - cpuidle cleanups from Bartlomiej Zolnierkiewicz. - Support for hibernation APM events from Bin Shi. - Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC disabled during thaw transitions from Bjørn Mork. - PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf Hansson. - PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente Kurusa, Rashika Kheria. - New tool for profiling system suspend from Todd E Brandt and a cpupower tool cleanup from One Thousand Gnomes" * tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (153 commits) thermal: exynos: boost: Automatic enable/disable of BOOST feature (at Exynos4412) cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ Documentation: cpufreq / boost: Update BOOST documentation cpufreq: exynos: Extend Exynos cpufreq driver to support boost cpufreq / boost: Kconfig: Support for software-managed BOOST acpi-cpufreq: Adjust the code to use the common boost attribute cpufreq: Add boost frequency support in core intel_pstate: Add trace point to report internal state. cpufreq: introduce cpufreq_generic_get() routine ARM: SA1100: Create dummy clk_get_rate() to avoid build failures cpufreq: stats: create sysfs entries when cpufreq_stats is a module cpufreq: stats: free table and remove sysfs entry in a single routine cpufreq: stats: remove hotplug notifiers cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly cpufreq: speedstep: remove unused speedstep_get_state platform: introduce OF style 'modalias' support for platform bus PM / tools: new tool for suspend/resume performance optimization ACPI: fix module autoloading for ACPI enumerated devices ACPI: add module autoloading support for ACPI enumerated devices ACPI: fix create_modalias() return value handling ...
2014-01-22Merge tag 'stable/for-linus-3.14-rc0-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen updates from Konrad Rzeszutek Wilk: "Two major features that Xen community is excited about: The first is event channel scalability by David Vrabel - we switch over from an two-level per-cpu bitmap of events (IRQs) - to an FIFO queue with priorities. This lets us be able to handle more events, have lower latency, and better scalability. Good stuff. The other is PVH by Mukesh Rathor. In short, PV is a mode where the kernel lets the hypervisor program page-tables, segments, etc. With EPT/NPT capabilities in current processors, the overhead of doing this in an HVM (Hardware Virtual Machine) container is much lower than the hypervisor doing it for us. In short we let a PV guest run without doing page-table, segment, syscall, etc updates through the hypervisor - instead it is all done within the guest container. It is a "hybrid" PV - hence the 'PVH' name - a PV guest within an HVM container. The major benefits are less code to deal with - for example we only use one function from the the pv_mmu_ops (which has 39 function calls); faster performance for syscall (no context switches into the hypervisor); less traps on various operations; etc. It is still being baked - the ABI is not yet set in stone. But it is pretty awesome and we are excited about it. Lastly, there are some changes to ARM code - you should get a simple conflict which has been resolved in #linux-next. In short, this pull has awesome features. Features: - FIFO event channels. Key advantages: support for over 100,000 events (2^17), 16 different event priorities, improved fairness in event latency through the use of FIFOs. - Xen PVH support. "It’s a fully PV kernel mode, running with paravirtualized disk and network, paravirtualized interrupts and timers, no emulated devices of any kind (and thus no qemu), no BIOS or legacy boot — but instead of requiring PV MMU, it uses the HVM hardware extensions to virtualize the pagetables, as well as system calls and other privileged operations." (from "The Paravirtualization Spectrum, Part 2: From poles to a spectrum") Bug-fixes: - Fixes in balloon driver (refactor and make it work under ARM) - Allow xenfb to be used in HVM guests. - Allow xen_platform_pci=0 to work properly. - Refactors in event channels" * tag 'stable/for-linus-3.14-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (52 commits) xen/pvh: Set X86_CR0_WP and others in CR0 (v2) MAINTAINERS: add git repository for Xen xen/pvh: Use 'depend' instead of 'select'. xen: delete new instances of __cpuinit usage xen/fb: allow xenfb initialization for hvm guests xen/evtchn_fifo: fix error return code in evtchn_fifo_setup() xen-platform: fix error return code in platform_pci_init() xen/pvh: remove duplicated include from enlighten.c xen/pvh: Fix compile issues with xen_pvh_domain() xen: Use dev_is_pci() to check whether it is pci device xen/grant-table: Force to use v1 of grants. xen/pvh: Support ParaVirtualized Hardware extensions (v3). xen/pvh: Piggyback on PVHVM XenBus. xen/pvh: Piggyback on PVHVM for grant driver (v4) xen/grant: Implement an grant frame array struct (v3). xen/grant-table: Refactor gnttab_init xen/grants: Remove gnttab_max_grant_frames dependency on gnttab_init. xen/pvh: Piggyback on PVHVM for event channels (v2) xen/pvh: Update E820 to work with PVH (v2) xen/pvh: Secondary VCPU bringup (non-bootup CPUs) ...
2014-01-22Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina: "Usual rocket science stuff from trivial.git" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) neighbour.h: fix comment sched: Fix warning on make htmldocs caused by wait.h slab: struct kmem_cache is protected by slab_mutex doc: Fix typo in USB Gadget Documentation of/Kconfig: Spelling s/one/once/ mkregtable: Fix sscanf handling lp5523, lp8501: comment improvements thermal: rcar: comment spelling treewide: fix comments and printk msgs IXP4xx: remove '1 &&' from a condition check in ixp4xx_restart() Documentation: update /proc/uptime field description Documentation: Fix size parameter for snprintf arm: fix comment header and macro name asm-generic: uaccess: Spelling s/a ny/any/ mtd: onenand: fix comment header doc: driver-model/platform.txt: fix a typo drivers: fix typo in DEVTMPFS_MOUNT Kconfig help text doc: Fix typo (acces_process_vm -> access_process_vm) treewide: Fix typos in printk drivers/gpu/drm/qxl/Kconfig: reformat the help text ...
2014-01-12Merge branch 'acpi-hotplug'Rafael J. Wysocki
* acpi-hotplug: ACPI / scan: ACPI device object sysfs attribute for _STA evaluation ACPI / hotplug / driver core: Handle containers in a special way ACPI / hotplug: Add demand_offline hotplug profile flag ACPI / bind: Move acpi_get_child() to drivers/ide/ide-acpi.c ACPI / bind: Pass struct acpi_device pointer to acpi_bind_one() ACPI / bind: Rework struct acpi_bus_type ACPI / bind: Redefine acpi_preset_companion() ACPI / bind: Redefine acpi_get_child() PCI / ACPI: Use acpi_find_child_device() for child devices lookup ACPI / bind: Simplify child device lookups ACPI / scan: Use direct recurrence for device hierarchy walks ACPI: Introduce acpi_set_device_status() ACPI / hotplug: Drop unfinished global notification handling routines ACPI / hotplug: Rework generic code to handle suprise removals ACPI / hotplug: Move container-specific code out of the core ACPI / hotplug: Make ACPI PCI root hotplug use common hotplug code ACPI / hotplug: Introduce common hotplug function acpi_device_hotplug() ACPI / hotplug: Do not fail bus and device checks for disabled hotplug ACPI / scan: Add acpi_device objects for all device nodes in the namespace ACPI / scan: Define non-empty device removal handler
2014-01-12Merge branch 'acpi-cleanup'Rafael J. Wysocki
* acpi-cleanup: (22 commits) ACPI / tables: Return proper error codes from acpi_table_parse() and fix comment. ACPI / tables: Check if id is NULL in acpi_table_parse() ACPI / proc: Include appropriate header file in proc.c ACPI / EC: Remove unused functions and add prototype declaration in internal.h ACPI / dock: Include appropriate header file in dock.c ACPI / PCI: Include appropriate header file in pci_link.c ACPI / PCI: Include appropriate header file in pci_slot.c ACPI / EC: Mark the function acpi_ec_add_debugfs() as static in ec_sys.c ACPI / NVS: Include appropriate header file in nvs.c ACPI / OSL: Mark the function acpi_table_checksum() as static ACPI / processor: initialize a variable to silence compiler warning ACPI / processor: use ACPI_COMPANION() to get ACPI device ACPI: correct minor typos ACPI / sleep: Drop redundant acpi_disabled check ACPI / dock: Drop redundant acpi_disabled check ACPI / table: Replace '1' with specific error return values ACPI: remove trailing whitespace ACPI / IBFT: Fix incorrect <acpi/acpi.h> inclusion in iSCSI boot firmware module ACPI / i915: Fix incorrect <acpi/acpi.h> inclusions via <linux/acpi_io.h> SFI / ACPI: Fix warnings reported during builds with W=1 ... Conflicts: drivers/acpi/nvs.c drivers/hwmon/asus_atk0110.c
2014-01-10xen: delete new instances of __cpuinit usagePaul Gortmaker
Commit 1fe565517b57676884349dccfd6ce853ec338636 ("xen/events: use the FIFO-based ABI if available") added new instances of __cpuinit macro usage. We removed this a couple versions ago; we now want to remove the compat no-op stubs. Introducing new users is not what we want to see at this point in time, as it will break once the stubs are gone. Cc: David Vrabel <david.vrabel@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-01-07xen/evtchn_fifo: fix error return code in evtchn_fifo_setup()Wei Yongjun
Fix to return -ENOMEM from the error handling case instead of 0 (overwrited to 0 by the HYPERVISOR_event_channel_op call), otherwise the error condition cann't be reflected from the return value. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com>
2014-01-07xen-platform: fix error return code in platform_pci_init()Wei Yongjun
Fix to return a negative error code from the error handling case instead of 0, otherwise the error condition cann't be reflected from the return value. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-01-07xen: Use dev_is_pci() to check whether it is pci deviceYijing Wang
Use PCI standard marco dev_is_pci() instead of directly compare pci_bus_type to check whether it is pci device. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jan Beulich <JBeulich@suse.com>
2014-01-06xen/grant-table: Force to use v1 of grants.Konrad Rzeszutek Wilk
We have the framework to use v2, but there are no backends that actually use it. The end result is that on PV we use v2 grants and on PVHVM v1. The v1 has a capacity of 512 grants per page while the v2 has 256 grants per page. This means we lose about 50% capacity - and if we want more than 16 VIFs (each VIF takes 512 grants), then we are hitting the max per guest of 32. Oracle-bug: 16039922 CC: annie.li@oracle.com CC: msw@amazon.com Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com>
2014-01-06xen/pvh: Piggyback on PVHVM XenBus.Mukesh Rathor
PVH is a PV guest with a twist - there are certain things that work in it like HVM and some like PV. For the XenBus mechanism we want to use the PVHVM mechanism. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2014-01-06xen/pvh: Piggyback on PVHVM for grant driver (v4)Konrad Rzeszutek Wilk
In PVH the shared grant frame is the PFN and not MFN, hence its mapped via the same code path as HVM. The allocation of the grant frame is done differently - we do not use the early platform-pci driver and have an ioremap area - instead we use balloon memory and stitch all of the non-contingous pages in a virtualized area. That means when we call the hypervisor to replace the GMFN with a XENMAPSPACE_grant_table type, we need to lookup the old PFN for every iteration instead of assuming a flat contingous PFN allocation. Lastly, we only use v1 for grants. This is because PVHVM is not able to use v2 due to no XENMEM_add_to_physmap calls on the error status page (see commit 69e8f430e243d657c2053f097efebc2e2cd559f0 xen/granttable: Disable grant v2 for HVM domains.) Until that is implemented this workaround has to be in place. Also per suggestions by Stefano utilize the PVHVM paths as they share common functionality. v2 of this patch moves most of the PVH code out in the arch/x86/xen/grant-table driver and touches only minimally the generic driver. v3, v4: fixes us some of the code due to earlier patches. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2014-01-06xen/grant: Implement an grant frame array struct (v3).Konrad Rzeszutek Wilk
The 'xen_hvm_resume_frames' used to be an 'unsigned long' and contain the virtual address of the grants. That was OK for most architectures (PVHVM, ARM) were the grants are contiguous in memory. That however is not the case for PVH - in which case we will have to do a lookup for each virtual address for the PFN. Instead of doing that, lets make it a structure which will contain the array of PFNs, the virtual address and the count of said PFNs. Also provide a generic functions: gnttab_setup_auto_xlat_frames and gnttab_free_auto_xlat_frames to populate said structure with appropriate values for PVHVM and ARM. To round it off, change the name from 'xen_hvm_resume_frames' to a more descriptive one - 'xen_auto_xlat_grant_frames'. For PVH, in patch "xen/pvh: Piggyback on PVHVM for grant driver" we will populate the 'xen_auto_xlat_grant_frames' by ourselves. v2 moves the xen_remap in the gnttab_setup_auto_xlat_frames and also introduces xen_unmap for gnttab_free_auto_xlat_frames. Suggested-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [v3: Based on top of 'asm/xen/page.h: remove redundant semicolon'] Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2014-01-06xen/grant-table: Refactor gnttab_initKonrad Rzeszutek Wilk
We have this odd scenario of where for PV paths we take a shortcut but for the HVM paths we first ioremap xen_hvm_resume_frames, then assign it to gnttab_shared.addr. This is needed because gnttab_map uses gnttab_shared.addr. Instead of having: if (pv) return gnttab_map if (hvm) ... gnttab_map Lets move the HVM part before the gnttab_map and remove the first call to gnttab_map. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2014-01-06xen/grants: Remove gnttab_max_grant_frames dependency on gnttab_init.Konrad Rzeszutek Wilk
The function gnttab_max_grant_frames() returns the maximum amount of frames (pages) of grants we can have. Unfortunatly it was dependent on gnttab_init() having been run before to initialize the boot max value (boot_max_nr_grant_frames). This meant that users of gnttab_max_grant_frames would always get a zero value if they called before gnttab_init() - such as 'platform_pci_init' (drivers/xen/platform-pci.c). Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2014-01-06xen/pvh: Piggyback on PVHVM for event channels (v2)Mukesh Rathor
PVH is a PV guest with a twist - there are certain things that work in it like HVM and some like PV. There is a similar mode - PVHVM where we run in HVM mode with PV code enabled - and this patch explores that. The most notable PV interfaces are the XenBus and event channels. We will piggyback on how the event channel mechanism is used in PVHVM - that is we want the normal native IRQ mechanism and we will install a vector (hvm callback) for which we will call the event channel mechanism. This means that from a pvops perspective, we can use native_irq_ops instead of the Xen PV specific. Albeit in the future we could support pirq_eoi_map. But that is a feature request that can be shared with PVHVM. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2014-01-06xen/events: use the FIFO-based ABI if availableDavid Vrabel
Implement all the event channel port ops for the FIFO-based ABI. If the hypervisor supports the FIFO-based ABI, enable it by initializing the control block for the boot VCPU and subsequent VCPUs as they are brought up and on resume. The event array is expanded as required when event ports are setup. The 'xen.fifo_events=0' command line option may be used to disable use of the FIFO-based ABI. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2014-01-06xen/events: allow event channel priority to be setDavid Vrabel
Add xen_irq_set_priority() to set an event channels priority. This function will only work with event channel ABIs that support priority (i.e., the FIFO-based ABI). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2014-01-06xen/events: Add the hypervisor interface for the FIFO-based event channelsDavid Vrabel
Add the hypercall sub-ops and the structures for the shared data used in the FIFO-based event channel ABI. The design document for this new ABI is available here: http://xenbits.xen.org/people/dvrabel/event-channels-H.pdf In summary, events are reported using a per-domain shared event array of event words. Each event word has PENDING, LINKED and MASKED bits and a LINK field for pointing to the next event in the event queue. There are 16 event queues (with different priorities) per-VCPU. Key advantages of this new ABI include: - Support for over 100,000 events (2^17). - 16 different event priorities. - Improved fairness in event latency through the use of FIFOs. The ABI is available in Xen 4.4 and later. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2014-01-06xen/evtchn: support more than 4096 portsDavid Vrabel
Remove the check during unbind for NR_EVENT_CHANNELS as this limits support to less than 4096 ports. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2014-01-06xen/events: add xen_evtchn_mask_all()David Vrabel
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2014-01-06xen/events: Refactor evtchn_to_irq array to be dynamically allocatedDavid Vrabel
Refactor static array evtchn_to_irq array to be dynamically allocated by implementing get and set functions for accesses to the array. Two new port ops are added: max_channels (maximum supported number of event channels) and nr_channels (number of currently usable event channels). For the 2-level ABI, these numbers are both the same as the shared data structure is a fixed size. For the FIFO ABI, these will be different as the event array is expanded dynamically. This allows more than 65000 event channels so an unsigned short is no longer sufficient for an event channel port number and unsigned int is used instead. Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2014-01-06xen/events: add a evtchn_op for port setupDavid Vrabel
Add a hook for port-specific setup and call it from xen_irq_info_common_setup(). The FIFO-based ABIs may need to perform additional setup (expanding the event array) before a bound event channel can start to receive events. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>