aboutsummaryrefslogtreecommitdiff
path: root/softmmu
diff options
context:
space:
mode:
authorDavid Hildenbrand <david@redhat.com>2021-05-10 13:43:22 +0200
committerPaolo Bonzini <pbonzini@redhat.com>2021-06-15 20:27:38 +0200
commitd94e0bc9ef7848f69550a80e7be6d4de68856e46 (patch)
tree2dec04c2b8029862b0d9d2375add13d2ee76f62d /softmmu
parent8dbe22c6868b8a5efd1df3d0c5150524fabe61ff (diff)
util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE under Linux
Let's support RAM_NORESERVE via MAP_NORESERVE on Linux. The flag has no effect on most shared mappings - except for hugetlbfs and anonymous memory. Linux man page: "MAP_NORESERVE: Do not reserve swap space for this mapping. When swap space is reserved, one has the guarantee that it is possible to modify the mapping. When swap space is not reserved one might get SIGSEGV upon a write if no physical memory is available. See also the discussion of the file /proc/sys/vm/overcommit_memory in proc(5). In kernels before 2.6, this flag had effect only for private writable mappings." Note that the "guarantee" part is wrong with memory overcommit in Linux. Also, in Linux hugetlbfs is treated differently - we configure reservation of huge pages from the pool, not reservation of swap space (huge pages cannot be swapped). The rough behavior is [1]: a) !Hugetlbfs: 1) Without MAP_NORESERVE *or* with memory overcommit under Linux disabled ("/proc/sys/vm/overcommit_memory == 2"), the following accounting/reservation happens: For a file backed map SHARED or READ-only - 0 cost (the file is the map not swap) PRIVATE WRITABLE - size of mapping per instance For an anonymous or /dev/zero map SHARED - size of mapping PRIVATE READ-only - 0 cost (but of little use) PRIVATE WRITABLE - size of mapping per instance 2) With MAP_NORESERVE, no accounting/reservation happens. b) Hugetlbfs: 1) Without MAP_NORESERVE, huge pages are reserved. 2) With MAP_NORESERVE, no huge pages are reserved. Note: With "/proc/sys/vm/overcommit_memory == 0", we were already able to configure it for !hugetlbfs globally; this toggle now allows configuring it more fine-grained, not for the whole system. The target use case is virtio-mem, which dynamically exposes memory inside a large, sparse memory area to the VM. [1] https://www.kernel.org/doc/Documentation/vm/overcommit-accounting Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-10-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Diffstat (limited to 'softmmu')
-rw-r--r--softmmu/physmem.c1
1 files changed, 1 insertions, 0 deletions
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 11ea8e19a6..9b171c9dbe 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -2251,6 +2251,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
flags = MAP_FIXED;
flags |= block->flags & RAM_SHARED ?
MAP_SHARED : MAP_PRIVATE;
+ flags |= block->flags & RAM_NORESERVE ? MAP_NORESERVE : 0;
if (block->fd >= 0) {
area = mmap(vaddr, length, PROT_READ | PROT_WRITE,
flags, block->fd, offset);