From 00e4d116a7ef94eb910be037912b0b2fc09f608b Mon Sep 17 00:00:00 2001 From: Gerrit Renker Date: Fri, 22 Sep 2006 09:33:58 +0100 Subject: [DCCP]: Allow default/fallback service code. This has been discussed on dccp@vger and removes the necessity for applications to supply service codes in each and every case. If an application does not want to provide a service code, that's fine, it will be given 0. Otherwise, service codes can be set via socket options as before. This patch has been tested using various client/server configurations (including listening on multiple service codes). Signed-off-by: Gerrit Renker Signed-off-by: Arnaldo Carvalho de Melo --- Documentation/networking/dccp.txt | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/dccp.txt b/Documentation/networking/dccp.txt index c45daabd3bf..74563b38ffd 100644 --- a/Documentation/networking/dccp.txt +++ b/Documentation/networking/dccp.txt @@ -1,7 +1,6 @@ DCCP protocol ============ -Last updated: 10 November 2005 Contents ======== @@ -42,8 +41,11 @@ Socket options DCCP_SOCKOPT_PACKET_SIZE is used for CCID3 to set default packet size for calculations. -DCCP_SOCKOPT_SERVICE sets the service. This is compulsory as per the -specification. If you don't set it you will get EPROTO. +DCCP_SOCKOPT_SERVICE sets the service. The specification mandates use of +service codes (RFC 4340, sec. 8.1.2); if this socket option is not set, +the socket will fall back to 0 (which means that no meaningful service code +is present). Connecting sockets set at most one service option; for +listening sockets, multiple service codes can be specified. Notes ===== -- cgit v1.2.3 From 0ff38490c836dc379ff7ec45b10a15a662f4e5f6 Mon Sep 17 00:00:00 2001 From: Christoph Lameter Date: Mon, 25 Sep 2006 23:31:52 -0700 Subject: [PATCH] zone_reclaim: dynamic slab reclaim Currently one can enable slab reclaim by setting an explicit option in /proc/sys/vm/zone_reclaim_mode. Slab reclaim is then used as a final option if the freeing of unmapped file backed pages is not enough to free enough pages to allow a local allocation. However, that means that the slab can grow excessively and that most memory of a node may be used by slabs. We have had a case where a machine with 46GB of memory was using 40-42GB for slab. Zone reclaim was effective in dealing with pagecache pages. However, slab reclaim was only done during global reclaim (which is a bit rare on NUMA systems). This patch implements slab reclaim during zone reclaim. Zone reclaim occurs if there is a danger of an off node allocation. At that point we 1. Shrink the per node page cache if the number of pagecache pages is more than min_unmapped_ratio percent of pages in a zone. 2. Shrink the slab cache if the number of the nodes reclaimable slab pages (patch depends on earlier one that implements that counter) are more than min_slab_ratio (a new /proc/sys/vm tunable). The shrinking of the slab cache is a bit problematic since it is not node specific. So we simply calculate what point in the slab we want to reach (current per node slab use minus the number of pages that neeed to be allocated) and then repeately run the global reclaim until that is unsuccessful or we have reached the limit. I hope we will have zone based slab reclaim at some point which will make that easier. The default for the min_slab_ratio is 5% Also remove the slab option from /proc/sys/vm/zone_reclaim_mode. [akpm@osdl.org: cleanups] Signed-off-by: Christoph Lameter Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/sysctl/vm.txt | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-) (limited to 'Documentation') diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt index 7cee90223d3..20d0d797f53 100644 --- a/Documentation/sysctl/vm.txt +++ b/Documentation/sysctl/vm.txt @@ -29,6 +29,7 @@ Currently, these files are in /proc/sys/vm: - drop-caches - zone_reclaim_mode - min_unmapped_ratio +- min_slab_ratio - panic_on_oom ============================================================== @@ -138,7 +139,6 @@ This is value ORed together of 1 = Zone reclaim on 2 = Zone reclaim writes dirty pages out 4 = Zone reclaim swaps pages -8 = Also do a global slab reclaim pass zone_reclaim_mode is set during bootup to 1 if it is determined that pages from remote zones will cause a measurable performance reduction. The @@ -162,18 +162,13 @@ Allowing regular swap effectively restricts allocations to the local node unless explicitly overridden by memory policies or cpuset configurations. -It may be advisable to allow slab reclaim if the system makes heavy -use of files and builds up large slab caches. However, the slab -shrink operation is global, may take a long time and free slabs -in all nodes of the system. - ============================================================= min_unmapped_ratio: This is available only on NUMA kernels. -A percentage of the file backed pages in each zone. Zone reclaim will only +A percentage of the total pages in each zone. Zone reclaim will only occur if more than this percentage of pages are file backed and unmapped. This is to insure that a minimal amount of local pages is still available for file I/O even if the node is overallocated. @@ -182,6 +177,24 @@ The default is 1 percent. ============================================================= +min_slab_ratio: + +This is available only on NUMA kernels. + +A percentage of the total pages in each zone. On Zone reclaim +(fallback from the local zone occurs) slabs will be reclaimed if more +than this percentage of pages in a zone are reclaimable slab pages. +This insures that the slab growth stays under control even in NUMA +systems that rarely perform global reclaim. + +The default is 5 percent. + +Note that slab reclaim is triggered in a per zone / node fashion. +The process of reclaiming slab memory is currently not node specific +and may not be fast. + +============================================================= + panic_on_oom This enables or disables panic on out-of-memory feature. If this is set to 1, -- cgit v1.2.3 From 461a9afff5e731d6337c0f5b08a1e727ccd57e0a Mon Sep 17 00:00:00 2001 From: Zachary Amsden Date: Mon, 25 Sep 2006 23:32:25 -0700 Subject: [PATCH] x86: add a bootparameter to reserve high linear address space Add a boot parameter to reserve high linear address space for hypervisors. This is necessary to allow dynamically loaded hypervisor modules, which might not happen until userspace is already running, and also provides a useful tool to benchmark the performance impact of reduced lowmem address space. Signed-off-by: Zachary Amsden Signed-off-by: Chris Wright Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/kernel-parameters.txt | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 71d05f48172..766abdab94e 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1363,6 +1363,11 @@ running once the system is up. reserve= [KNL,BUGS] Force the kernel to ignore some iomem area + reservetop= [IA-32] + Format: nn[KMG] + Reserves a hole at the top of the kernel virtual + address space. + resume= [SWSUSP] Specify the partition device for software suspend -- cgit v1.2.3 From c5c6ba4e08ab9c9e390a0f3a7d9a5c332f5cc6ef Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Mon, 25 Sep 2006 23:32:58 -0700 Subject: [PATCH] PM: Add pm_trace switch Add the pm_trace attribute in /sys/power which has to be explicitly set to one to really enable the "PM tracing" code compiled in when CONFIG_PM_TRACE is set (which modifies the machine's CMOS clock in unpredictable ways). Signed-off-by: Rafael J. Wysocki Acked-by: Pavel Machek Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/power/interface.txt | 15 +++++++++++++++ 1 file changed, 15 insertions(+) (limited to 'Documentation') diff --git a/Documentation/power/interface.txt b/Documentation/power/interface.txt index 4117802af0f..a66bec222b1 100644 --- a/Documentation/power/interface.txt +++ b/Documentation/power/interface.txt @@ -52,3 +52,18 @@ suspend image will be as small as possible. Reading from this file will display the current image size limit, which is set to 500 MB by default. + +/sys/power/pm_trace controls the code which saves the last PM event point in +the RTC across reboots, so that you can debug a machine that just hangs +during suspend (or more commonly, during resume). Namely, the RTC is only +used to save the last PM event point if this file contains '1'. Initially it +contains '0' which may be changed to '1' by writing a string representing a +nonzero integer into it. + +To use this debugging feature you should attempt to suspend the machine, then +reboot it and run + + dmesg -s 1000000 | grep 'hash matches' + +CAUTION: Using it will cause your machine's real-time (CMOS) clock to be +set to a random invalid time after a resume. -- cgit v1.2.3