SLUB: Place kmem_cache_cpu structures in a NUMA aware way

The kmem_cache_cpu structures introduced are currently an array placed in the kmem_cache struct. This means the kmem_cache_cpu structures are overwhelmingly on the wrong node for systems with a higher number of nodes. These are performance critical structures since the per cpu information has to be touched for every alloc and free in a slab.

In order to place the kmem_cache_cpu structure optimally we put an array of pointers to kmem_cache_cpu structs in kmem_cache (similar to SLAB).

However, the kmem_cache_cpu structures can now be allocated in a more intelligent way.

We would like to put per cpu structures for the same cpu but different slab caches in cachelines together to save space and decrease the cache footprint. However, the slab allocators themselves control only allocations per node. We set up a simple per cpu array for every processor with 100 per cpu structures which is usually enough to get them all set up right. If we run out then we fall back to kmalloc_node. This also solves the bootstrap problem since we do not have to use slab allocator functions early in boot to get memory for the small per cpu structures.

Pro:
- NUMA aware placement improves memory performance
- All global structures in struct kmem_cache become readonly
- Dense packing of per cpu structures reduces cacheline footprint in SMP and NUMA.
- Potential avoidance of exclusive cacheline fetches on the free and alloc hotpath since multiple kmem_cache_cpu structures are in one cacheline. This is particularly important for the kmalloc array.

Cons:
- Additional reference to one read only cacheline (per cpu array of pointers to kmem_cache_cpu) in both slab_alloc() and slab_free().

[akinobu.mita@gmail.com: fix cpu hotplug offline/online path]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: "Pekka Enberg" <penberg@cs.helsinki.fi>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
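
The pool-with-fallback scheme described in the message can be illustrated with a small userspace sketch. This is not the kernel code: `union pool_slot`, `init_pool()`, `alloc_cpu_struct()`, `free_cpu_struct()`, `from_pool()` and `count_pool_hits()` are hypothetical names, `NR_CPUS` is set to 4 only for the demo, and `malloc()`/`free()` stand in for kmalloc_node()/kfree(). Only the pool size of 100 and the three struct fields come from the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS   4	/* assumption: small CPU count for the sketch */
#define POOL_SIZE 100	/* per cpu pool size named in the commit message */

struct page;		/* opaque in this sketch */

struct kmem_cache_cpu {
	struct page *page;
	int node;
	unsigned int offset;
};

/* Hypothetical slot: the structure while in use, a free-list link while free. */
union pool_slot {
	struct kmem_cache_cpu c;
	union pool_slot *next_free;
};

static union pool_slot pool[NR_CPUS][POOL_SIZE];
static union pool_slot *free_head[NR_CPUS];

/* Thread every slot of one cpu's pool onto that cpu's free list. */
static void init_pool(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	free_head[cpu] = NULL;
	for (int i = POOL_SIZE - 1; i >= 0; i--) {
		pool[cpu][i].next_free = free_head[cpu];
		free_head[cpu] = &pool[cpu][i];
	}
}

/* True if c lies inside the static pool of the given cpu. */
static int from_pool(int cpu, const struct kmem_cache_cpu *c)
{
	uintptr_t p = (uintptr_t)c;
	uintptr_t lo = (uintptr_t)&pool[cpu][0];

	return p >= lo && p < lo + sizeof(pool[cpu]);
}

/* Take a structure from the cpu's pool; fall back to the heap when empty. */
static struct kmem_cache_cpu *alloc_cpu_struct(int cpu)
{
	union pool_slot *s = free_head[cpu];

	if (s) {
		free_head[cpu] = s->next_free;
	} else {
		/* Pool exhausted: malloc() stands in for kmalloc_node(). */
		s = malloc(sizeof(*s));
		if (!s)
			return NULL;
	}
	s->c.page = NULL;
	s->c.node = 0;
	s->c.offset = 0;
	return &s->c;
}

/* Return a structure to its pool, or to the heap if it came from there. */
static void free_cpu_struct(int cpu, struct kmem_cache_cpu *c)
{
	union pool_slot *s = (union pool_slot *)c;

	if (!from_pool(cpu, c)) {
		free(s);
		return;
	}
	s->next_free = free_head[cpu];
	free_head[cpu] = s;
}

/* Demo helper: allocate n structures, count how many came from the pool. */
static int count_pool_hits(int cpu, int n)
{
	struct kmem_cache_cpu **v = calloc((size_t)n, sizeof(*v));
	int hits = 0;

	if (!v)
		return -1;
	for (int i = 0; i < n; i++) {
		v[i] = alloc_cpu_struct(cpu);
		if (v[i] && from_pool(cpu, v[i]))
			hits++;
	}
	for (int i = 0; i < n; i++)
		if (v[i])
			free_cpu_struct(cpu, v[i]);
	free(v);
	return hits;
}
```

After `init_pool(0)`, requesting 101 structures for cpu 0 serves the first 100 from the static pool and only the last one from the fallback allocator. Because the pool is static, it needs no allocator at startup, which mirrors the bootstrap argument in the message.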
author: Christoph Lameter <clameter@sgi.com> 2007-10-16 01:26:08 -0700
committer: Linus Torvalds <torvalds@woody.linux-foundation.org> 2007-10-16 09:43:01 -0700
commit: 4c93c355d5d563f300df7e61ef753d7a064411e9 (patch)
tree: 24bcdbed58a51c69640da9c8e220dd5ce0c054a7 /include/linux/slub_def.h
parent: ee3c72a14bfecdf783738032ff3c73ef6412f5b3 (diff)
1 files changed, 6 insertions, 3 deletions
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 92e10cf6d0e..f74716b59ce 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -16,8 +16,7 @@ struct kmem_cache_cpu {
 	struct page *page;
 	int node;
 	unsigned int offset;
-	/* Lots of wasted space */
-} ____cacheline_aligned_in_smp;
+};
 
 struct kmem_cache_node {
 	spinlock_t list_lock;	/* Protect partial list and nr_partial */
@@ -62,7 +61,11 @@ struct kmem_cache {
 	int defrag_ratio;
 	struct kmem_cache_node *node[MAX_NUMNODES];
 #endif
-	struct kmem_cache_cpu cpu_slab[NR_CPUS];
+#ifdef CONFIG_SMP
+	struct kmem_cache_cpu *cpu_slab[NR_CPUS];
+#else
+	struct kmem_cache_cpu cpu_slab;
+#endif
 };
 
 /*
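
The header change gives `cpu_slab` two shapes: an array of pointers under CONFIG_SMP and a single embedded structure otherwise. Callers would typically hide that difference behind one accessor. A minimal userspace sketch follows, assuming a helper named `get_cpu_slab()` (the header diff above does not show the accessor, so the name and signature are illustrative) and a demo value of 4 for `NR_CPUS`.

```c
#include <assert.h>
#include <stddef.h>

#define CONFIG_SMP 1	/* assumption: build the SMP variant for the demo */
#define NR_CPUS    4	/* assumption: small CPU count for the sketch */

struct page;		/* opaque in this sketch */

struct kmem_cache_cpu {
	struct page *page;
	int node;
	unsigned int offset;
};

/* Only the member touched by the diff is reproduced here. */
struct kmem_cache {
#ifdef CONFIG_SMP
	struct kmem_cache_cpu *cpu_slab[NR_CPUS];
#else
	struct kmem_cache_cpu cpu_slab;
#endif
};

/* Hypothetical accessor: one call site for both layouts. */
static inline struct kmem_cache_cpu *get_cpu_slab(struct kmem_cache *s, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
#ifdef CONFIG_SMP
	/* One extra load: the pointer array, then the per cpu structure. */
	return s->cpu_slab[cpu];
#else
	/* Uniprocessor: the structure is embedded, no indirection needed. */
	return &s->cpu_slab;
#endif
}
```

The extra load through the pointer array in the SMP branch is the cost listed under "Cons" in the message; that array is only read on the hot path.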