2010-03-03Merge branch 'for-linus' of ↵Linus Torvalds
git:// * 'for-linus' of git:// fuse: fix large stack use fuse: cleanup in fuse_notify_inval_...()
2010-03-03Merge branch 'for-linus' of ↵Linus Torvalds
git:// * 'for-linus' of git:// percpu: add __percpu sparse annotations to what's left percpu: add __percpu sparse annotations to fs percpu: add __percpu sparse annotations to core kernel subsystems local_t: Remove leftover local.h this_cpu: Remove pageset_notifier this_cpu: Page allocator conversion percpu, x86: Generic inc / dec percpu instructions local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c module: Use this_cpu_xx to dynamically allocate counters local_t: Remove cpu_local_xx macros percpu: refactor the code in pcpu_[de]populate_chunk() percpu: remove compile warnings caused by __verify_pcpu_ptr() percpu: make accessors check for percpu pointer in sparse percpu: add __percpu for sparse. percpu: make access macros universal percpu: remove per_cpu__ prefix.
2010-03-03Merge git:// Torvalds
* git:// GFS2: print glock numbers in hex GFS2: ordered writes are backwards GFS2: Remove old, unused linked list code from quota GFS2: Remove loopy umount code GFS2: Metadata address space clean up
2010-03-03Merge git:// Torvalds
* git:// [CIFS] pSesInfo->sesSem is used as mutex. Rename it to session_mutex and [CIFS] Use unsigned ea length for clarity cifs: set server_eof in cifs_fattr_to_inode [CIFS] Minor cleanup to EA patch cifs: merge CIFSSMBQueryEA with CIFSSMBQAllEAs cifs: verify lengths of QueryAllEAs reply cifs: increase maximum buffer size in CIFSSMBQAllEAs cifs: rename name_len to list_len in CIFSSMBQAllEAs cifs: clean up indentation in CIFSSMBQAllEAs cifs: add parens around smb_var in BCC macros
2010-03-02Merge branch 'for-linus' of ↵Linus Torvalds
git:// * 'for-linus' of git:// (38 commits) SELinux: Make selinux_kernel_create_files_as() shouldn't just always return 0 TOMOYO: Protect find_task_by_vpid() with RCU. Security: add static to security_ops and default_security_ops variable selinux: libsepol: remove dead code in check_avtab_hierarchy_callback() TOMOYO: Remove __func__ from tomoyo_is_correct_path/domain security: fix a couple of sparse warnings TOMOYO: Remove unneeded parameter. TOMOYO: Use shorter names. TOMOYO: Use enum for index numbers. TOMOYO: Add garbage collector. TOMOYO: Add refcounter on domain structure. TOMOYO: Merge headers. TOMOYO: Add refcounter on string data. TOMOYO: Reduce lines by using common path for addition and deletion. selinux: fix memory leak in sel_make_bools TOMOYO: Extract bitfield syslog: clean up needless comment syslog: use defined constants instead of raw numbers syslog: distinguish between /proc/kmsg and syscalls selinux: allow MLS->non-MLS and vice versa upon policy reload ...
2010-03-02Merge branch 'for-linus' of git:// Torvalds
* 'for-linus' of git:// Revert "blkdev: fix merge_bvec_fn return value checks"
2010-03-02Revert "blkdev: fix merge_bvec_fn return value checks"Jens Axboe
This reverts commit 9f7cdbc33f36d28e57eaba0093f68f0d14c38c5b. It's causing oopses on dm setups, so revert it until we investigate. Reported-by: Dmitry Torokhov <> Tested-by: Steven Rostedt <> Signed-off-by: Jens Axboe <>
2010-03-02Merge git:// Torvalds
* git:// (1341 commits) virtio_net: remove forgotten assignment be2net: fix tx completion polling sis190: fix cable detect via link status poll net: fix protocol sk_buff field bridge: Fix build error when IGMP_SNOOPING is not enabled bnx2x: Tx barriers and locks scm: Only support SCM_RIGHTS on unix domain sockets. vhost-net: restart tx poll on sk_sndbuf full vhost: fix get_user_pages_fast error handling vhost: initialize log eventfd context pointer vhost: logging thinko fix wireless: convert to use netdev_for_each_mc_addr ethtool: do not set some flags, if others failed ipoib: returned back addrlen check for mc addresses netlink: Adding inode field to /proc/net/netlink axnet_cs: add new id bridge: Make IGMP snooping depend upon BRIDGE. bridge: Add multicast count/interval sysfs entries bridge: Add hash elasticity/max sysfs entries bridge: Add multicast_snooping sysfs toggle ... Trivial conflicts in Documentation/feature-removal-schedule.txt
2010-03-01Merge branch 'for-2.6.34' of git:// Torvalds
* 'for-2.6.34' of git:// (38 commits) block: don't access jiffies when initialising io_context cfq: remove 8 bytes of padding from cfq_rb_root on 64 bit builds block: fix for "Consolidate phys_segment and hw_segment limits" cfq-iosched: quantum check tweak blktrace: perform cleanup after setup error blkdev: fix merge_bvec_fn return value checks cfq-iosched: requests "in flight" vs "in driver" clarification cciss: Fix problem with scatter gather elements in the scsi half of the driver cciss: eliminate unnecessary pointer use in cciss scsi code cciss: do not use void pointer for scsi hba data cciss: factor out scatter gather chain block mapping code cciss: fix scatter gather chain block dma direction kludge cciss: simplify scatter gather code cciss: factor out scatter gather chain block allocation and freeing cciss: detect bad alignment of scsi commands at build time cciss: clarify command list padding calculation cfq-iosched: rethink seeky detection for SSDs cfq-iosched: rework seeky detection block: remove padding from io_context on 64bit builds block: Consolidate phys_segment and hw_segment limits ...
2010-03-01GFS2: print glock numbers in hexBob Peterson
This patch changes glock numbers from printing in decimal to hex. Since DLM prints corresponding resource IDs in hex, it makes debugging easier. Signed-off-by: Bob Peterson <> Signed-off-by: Steven Whitehouse <>
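A minimal sketch of the formatting change described above. The helper name and the "type/number" layout are assumptions for illustration and are not taken from the GFS2 source; only the switch from decimal to hex reflects the commit message.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print a glock as "type/number" with the number
 * in hex, so it can be matched against hex resource IDs in DLM output. */
static int format_glock_name(char *buf, size_t len, unsigned int type,
			     unsigned long long number)
{
	assert(buf != NULL && len > 0);
	/* %llx instead of %llu is the whole change being illustrated. */
	return snprintf(buf, len, "%u/%llx", type, number);
}
```

For type 2 and number 6699 this writes 2/1a2b, where decimal formatting would have written 2/6699.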
2010-03-01GFS2: ordered writes are backwardsDave Chinner
When we queue data buffers for ordered write, the buffers are added to the head of the ordered write list. When the log needs to push these buffers to disk, it also walks the list from the head. The result is that the ordered buffers are submitted to disk in reverse order. For large writes, this means that whenever the log flushes, large streams of reverse sequential order buffers are pushed down into the block layers. The elevators don't handle this particularly well, so IO rates tend to be significantly lower than if the IO was issued in ascending block order. Queue new ordered buffers to the tail of the ordered buffer list to ensure that IO is dispatched in the order it was submitted. This should significantly improve large sequential write speeds. On a disk capable of 85MB/s, speeds increase from 50MB/s to 65MB/s for noop and from 38MB/s to 50MB/s for cfq. Signed-off-by: Dave Chinner <> Signed-off-by: Steven Whitehouse <>
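The ordering effect described above can be sketched in userspace C. The struct and function names here are hypothetical stand-ins, not GFS2 code; the list is a circular doubly linked list with a sentinel head, loosely in the style of the kernel's list_head.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer on the ordered write list. */
struct obuf {
	unsigned long block;
	struct obuf *prev, *next;
};

static void olist_init(struct obuf *head)
{
	head->prev = head->next = head;
}

static void olist_insert(struct obuf *new, struct obuf *prev,
			 struct obuf *next)
{
	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
}

/* Old behaviour: each new buffer goes to the head of the list. */
static void olist_add(struct obuf *new, struct obuf *head)
{
	olist_insert(new, head, head->next);
}

/* Fixed behaviour: each new buffer goes to the tail of the list. */
static void olist_add_tail(struct obuf *new, struct obuf *head)
{
	olist_insert(new, head->prev, head);
}

/* Walk from the head and record the order blocks would be submitted in. */
static size_t olist_flush_order(const struct obuf *head,
				unsigned long *out, size_t max)
{
	size_t n = 0;

	assert(out != NULL);
	for (const struct obuf *p = head->next; p != head && n < max; p = p->next)
		out[n++] = p->block;
	return n;
}
```

Queuing blocks 100, 101, 102 with olist_add() and then flushing yields 102, 101, 100; with olist_add_tail() the flush yields them in submission order.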
2010-03-01GFS2: Remove loopy umount codeSteven Whitehouse
As a consequence of the previous patch, we can now remove the loop which used to be required due to the circular dependency between the inodes and glocks. Instead we can just invalidate the inodes, and then clear up any glocks which are left. Also we no longer need the rwsem since there is no longer any danger of the inode invalidation calling back into the glock code (and from there back into the inode code). Signed-off-by: Steven Whitehouse <>
2010-03-01GFS2: Metadata address space clean upSteven Whitehouse
Since the start of GFS2, an "extra" inode has been used to store the metadata belonging to each inode. The only reason for using this inode was to have an extra address space; the other fields were unused. This means that the memory usage was rather inefficient. The reason for keeping each inode's metadata in a separate address space is that when glocks are requested on remote nodes, we need to be able to efficiently locate the data and metadata relating to that glock (inode) in order to sync or sync and invalidate it (depending on the remotely requested lock mode). This patch adds a new type of glock which, in addition to its normal fields, has an address space. This applies to all inode and rgrp glocks (but to no other glock types, which remain as before). As a result, we no longer need to have the second inode. This results in three major improvements: 1. A saving of approx 25% of memory used in caching inodes 2. A removal of the circular dependency between inodes and glocks 3. No confusion between "normal" and "metadata" inodes in super.c Although the first of these is the more immediately apparent, the second is just as important as it now enables a number of clean ups at umount time. Those will be the subject of future patches. Signed-off-by: Steven Whitehouse <>
2010-02-28blkdev: fix merge_bvec_fn return value checksDmitry Monakhov
merge_bvec_fn() returns bvec->bv_len on success, so we have to check against this value. But in the case of the fs_optimization merge we compare with the wrong value. This patch should have been included in b428cd6da7e6559aca69aa2e3a526037d3f20403, but I accidentally forgot to add it to the initial patch. To make things straight, let's replace all such checks. In fact, this makes the code easier to understand. Signed-off-by: Dmitry Monakhov <> Signed-off-by: Jens Axboe <>
git:// * 'core-rcu-for-linus' of git:// (44 commits) rcu: Fix accelerated GPs for last non-dynticked CPU rcu: Make non-RCU_PROVE_LOCKING rcu_read_lock_sched_held() understand boot rcu: Fix accelerated grace periods for last non-dynticked CPU rcu: Export rcu_scheduler_active rcu: Make rcu_read_lock_sched_held() take boot time into account rcu: Make lockdep_rcu_dereference() message less alarmist sched, cgroups: Fix module export rcu: Add RCU_CPU_STALL_VERBOSE to dump detailed per-task information rcu: Fix rcutorture mod_timer argument to delay one jiffy rcu: Fix deadlock in TREE_PREEMPT_RCU CPU stall detection rcu: Convert to raw_spinlocks rcu: Stop overflowing signed integers rcu: Use canonical URL for Mathieu's dissertation rcu: Accelerate grace period if last non-dynticked CPU rcu: Fix citation of Mathieu's dissertation rcu: Documentation update for CONFIG_PROVE_RCU security: Apply lockdep-based checking to rcu_dereference() uses idr: Apply lockdep-based diagnostics to rcu_dereference() uses radix-tree: Disable RCU lockdep checking in radix tree vfs: Abstract rcu_dereference_check for files-fdtable use ...
2010-02-26Remove EXPERIMENTAL from NFS_FSCACHEChristian Kujau
There's currently an open Ubuntu bug[0], with the intent to compile NFS_FSCACHE (and possibly AFS_FSCACHE, 9P_FSCACHE) into the standard Ubuntu kernel. However, since *_FSCACHE still depends on EXPERIMENTAL, this won't happen. Since, as Arjan van de Ven pointed out[1], the EXPERIMENTAL flag doesn't mean that much any more, I propose the following patch to fs/nfs/Kconfig. I'd do the same for fs/9p/Kconfig and fs/afs/Kconfig, but as I did not test 9p or AFS, I feel it would not be appropriate for me to remove the flag. [0] [1] Signed-off-by: Christian Kujau <> Signed-off-by: David Howells <> Signed-off-by: Linus Torvalds <>
git:// * 'for-linus' of git:// dlm: use bastmode in debugfs output dlm: Send lockspace name with uevents dlm: send reply before bast dlm: fix ordering of bast and cast
* 'for-linus' of git:// (52 commits) fs/xfs: Correct NULL test xfs: optimize log flushing in xfs_fsync xfs: only clear the suid bit once in xfs_write xfs: kill xfs_bawrite xfs: log changed inodes instead of writing them synchronously xfs: remove invalid barrier optimization from xfs_fsync xfs: kill the unused XFS_QMOPT_* flush flags V2 xfs: Use delay write promotion for dquot flushing xfs: Sort delayed write buffers before dispatch xfs: Don't issue buffer IO direct from AIL push V2 xfs: Use delayed write for inodes rather than async V2 xfs: Make inode reclaim states explicit xfs: more reserved blocks fixups xfs: turn off sign warnings xfs: don't hold onto reserved blocks on remount,ro xfs: quota limit statvfs available blocks xfs: replace KM_LARGE with explicit vmalloc use xfs: cleanup up xfs_log_force calling conventions xfs: kill XLOG_VEC_SET_TYPE xfs: remove duplicate buffer flags ...
* git:// xfs: fix xfs to work with Virtually Indexed architectures sh: add mm API for DMA to vmalloc/vmap areas arm: add mm API for DMA to vmalloc/vmap areas parisc: add mm API for DMA to vmalloc/vmap areas mm: add coherence API for DMA to vmalloc/vmap areas
2010-02-26dlm: use bastmode in debugfs outputDavid Teigland
The bast mode that appears in the debugfs output should be useful on both master and process nodes. lkb_highbast is currently printed, and is only useful on the master node. lkb_bastmode is only useful on the process node. This patch sets lkb_bastmode on the master node as well, and uses that value in the debugfs print. Signed-off-by: David Teigland <>
2010-02-26dlm: Send lockspace name with ueventsSteven Whitehouse
Although it is possible to get this information from the path, it's much easier to provide the lockspace as a separate env variable. Signed-off-by: Steven Whitehouse <> Signed-off-by: David Teigland <>
2010-02-26dlm: send reply before bastDavid Teigland
When the lock master processes a successful operation (request, convert, cancel, or unlock), it will process the effects of the change before sending the reply for the operation. The "effects" of the operation are: - blocking callbacks (basts) for any newly granted locks - waiting or converting locks that can now be granted The cast is queued on the local node when the reply from the lock master is received. This means that a lock holder can receive a bast for a lock mode that it doesn't yet know has been granted. Signed-off-by: David Teigland <>
2010-02-26block: Consolidate phys_segment and hw_segment limitsMartin K. Petersen
Except for SCSI no device drivers distinguish between physical and hardware segment limits. Consolidate the two into a single segment limit. Signed-off-by: Martin K. Petersen <> Signed-off-by: Jens Axboe <>
* 'next-devicetree' of git:// (41 commits) of: remove undefined request_OF_resource & release_OF_resource of/sparc: Remove sparc-local declaration of allnodes and devtree_lock of: move definition of of_chosen into common code. of: remove unused extern reference to devtree_lock of: put default string compare and #a/s-cell values into common header of/flattree: Don't assume HAVE_LMB of: protect linux/of.h with CONFIG_OF proc_devtree: fix THIS_MODULE without module.h of: Remove old and misplaced function declarations of/flattree: Make the kernel accept ePAPR style phandle information of/flattree: endian-convert members of boot_param_header of: assume big-endian properties, adding conversions where necessary of: use __be32 for cell value accessors of/flattree: use OF_ROOT_NODE_{SIZE,ADDR}_CELLS DEFAULT for fdt parsing of/flattree: use callback to setup initrd from /chosen proc_devtree: include linux/of.h of: make set_node_proc_entry private to proc_devtree.c of: include linux/proc_fs.h of/flattree: merge early_init_dt_scan_memory() common code of: add 'of_' prefix to machine_is_compatible() ...
2010-02-25vfs: Apply lockdep-based checking to rcu_dereference() usesPaul E. McKenney
Add lockdep-ified RCU primitives to alloc_fd(), files_fdtable() and fcheck_files(). Cc: Alexander Viro <> Signed-off-by: Paul E. McKenney <> LKML-Reference: <> Signed-off-by: Ingo Molnar <>
2010-02-25[CIFS] pSesInfo->sesSem is used as mutex. Rename it to session_mutex andSteve French
convert it to a real mutex. Signed-off-by: Thomas Gleixner <> Acked-by: Jeff Layton <> Signed-off-by: Steve French <>
2010-02-24[CIFS] Use unsigned ea length for claritySteve French
Jeff correctly noted that using unsigned ea length is more intuitive. CC: Jeff Layton <> Signed-off-by: Steve French <>
2010-02-24dlm: fix ordering of bast and castDavid Teigland
When both blocking and completion callbacks are queued for a lock, the dlm would always deliver the completion callback (cast) first. In some cases the blocking callback (bast) is queued before the cast, though, and should be delivered first. This patch keeps track of the order in which they were queued and delivers them in that order. This patch also keeps track of the granted mode in the last cast and eliminates the following bast if the bast mode is compatible with the preceding cast mode. This happens when a remotely mastered lock is demoted, e.g. EX->NL, in which case the local node queues a cast immediately after sending the demote message. In this way a cast can be queued for a mode, e.g. NL, that makes an in-transit bast extraneous. Signed-off-by: David Teigland <>
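A rough sketch of the delivery rule described above, in userspace C. The types, the sequence-number field and the deliver() function are hypothetical and do not mirror the dlm source; the compatibility table is the conventional six-mode DLM matrix, written out here as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mode { NL, CR, CW, PR, PW, EX, NMODES };

/* compat[a][b] is true when modes a and b can be held at the same time. */
static const bool compat[NMODES][NMODES] = {
	/*         NL CR CW PR PW EX */
	[NL] = { 1, 1, 1, 1, 1, 1 },
	[CR] = { 1, 1, 1, 1, 1, 0 },
	[CW] = { 1, 1, 1, 0, 0, 0 },
	[PR] = { 1, 1, 0, 1, 0, 0 },
	[PW] = { 1, 1, 0, 0, 0, 0 },
	[EX] = { 1, 0, 0, 0, 0, 0 },
};

enum cb_type { CB_CAST, CB_BAST };

/* Hypothetical queued callback; seq records the order of queuing. */
struct cb {
	enum cb_type type;
	enum mode mode;
	unsigned long seq;
};

/* Deliver one queued cast and one queued bast in the order they were
 * queued. A bast that follows a cast is dropped when its mode is
 * compatible with the mode granted by that cast. Returns the number
 * of callbacks written to out. */
static size_t deliver(struct cb cast, struct cb bast, struct cb out[2])
{
	assert(cast.type == CB_CAST && bast.type == CB_BAST);
	assert(cast.seq != bast.seq);

	struct cb first = cast.seq < bast.seq ? cast : bast;
	struct cb second = cast.seq < bast.seq ? bast : cast;
	size_t n = 0;

	out[n++] = first;
	if (second.type == CB_BAST && compat[first.mode][second.mode])
		return n;	/* extraneous bast: nothing is blocked */
	out[n++] = second;
	return n;
}
```

In the EX->NL demotion case from the message, a cast for NL queued before a bast for EX results in only the cast being delivered, since NL is compatible with every mode.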
2010-02-23cifs: set server_eof in cifs_fattr_to_inodeJeff Layton
Signed-off-by: Jeff Layton <> Signed-off-by: Steve French <>
2010-02-23[CIFS] Minor cleanup to EA patchSteve French
CC: Jeff Layton <> Signed-off-by: Steve French <>
2010-02-23cifs: merge CIFSSMBQueryEA with CIFSSMBQAllEAsJeff Layton
Add an "ea_name" parameter to CIFSSMBQAllEAs. When it's set make it behave like CIFSSMBQueryEA does now. The current callers of CIFSSMBQueryEA are converted to use CIFSSMBQAllEAs, and the old CIFSSMBQueryEA function is removed. Signed-off-by: Jeff Layton <> Signed-off-by: Steve French <>
2010-02-23cifs: verify lengths of QueryAllEAs replyJeff Layton
Make sure the lengths in a QUERY_ALL_EAS reply don't make the parser walk off the end of the SMB. Signed-off-by: Jeff Layton <> Signed-off-by: Steve French <>
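The kind of check described above can be sketched as a bounds-checked walk over length-prefixed entries. The entry layout and function name below are simplified assumptions for illustration, not the real SMB wire format or the CIFS parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry layout (NOT the real SMB format):
 *   1 byte  name_len
 *   2 bytes value_len, little endian
 *   name_len bytes of name, then value_len bytes of value */
#define ENTRY_HDR_LEN 3u

/* Count the entries in buf. Return -1 as soon as a header or payload
 * would extend past the end of the buffer, instead of reading on. */
static int count_entries(const uint8_t *buf, size_t buf_len)
{
	size_t off = 0;
	int count = 0;

	assert(buf != NULL || buf_len == 0);
	while (off < buf_len) {
		if (buf_len - off < ENTRY_HDR_LEN)
			return -1;	/* truncated header */
		size_t name_len = buf[off];
		size_t value_len = (size_t)buf[off + 1] |
				   ((size_t)buf[off + 2] << 8);
		off += ENTRY_HDR_LEN;
		if (buf_len - off < name_len + value_len)
			return -1;	/* lengths point past the end */
		off += name_len + value_len;
		count++;
	}
	return count;
}
```

Comparing against the remaining length (buf_len - off) rather than adding to off first keeps the check itself from overflowing.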
2010-02-23cifs: increase maximum buffer size in CIFSSMBQAllEAsJeff Layton
It's 4000 now, but there's no reason to limit it to that. We should be able to handle a response up to CIFSMaxBufSize. Signed-off-by: Jeff Layton <> Signed-off-by: Steve French <>
2010-02-23cifs: rename name_len to list_len in CIFSSMBQAllEAsJeff Layton
...for clarity and so we can reuse the name for the real name_len. Signed-off-by: Jeff Layton <> Signed-off-by: Steve French <>
2010-02-23cifs: clean up indentation in CIFSSMBQAllEAsJeff Layton
Add a label that we can goto on error, and reduce some of the if/then/else indentation in this function. Signed-off-by: Jeff Layton <> Signed-off-by: Steve French <>
2010-02-23cifs: add parens around smb_var in BCC macrosJeff Layton
Remove ambiguity about how these values are interpreted when passing in more complex values as arguments. Signed-off-by: Jeff Layton <> Signed-off-by: Steve French <>
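The ambiguity that parentheses around a macro argument remove can be shown with a small example. The macros below are hypothetical and are not the BCC macros from the CIFS headers.

```c
#include <assert.h>

/* Hypothetical macros for illustration only. */
#define BYTES_UNSAFE(words) (2 * words)		/* argument used as-is */
#define BYTES_SAFE(words)   (2 * (words))	/* argument parenthesised */

static void macro_demo(void)
{
	int n = 3;

	/* Expands to (2 * n + 1): the multiplication binds only to n. */
	assert(BYTES_UNSAFE(n + 1) == 7);
	/* Expands to (2 * (n + 1)), which is what the caller meant. */
	assert(BYTES_SAFE(n + 1) == 8);
}
```

With a plain variable as the argument both forms agree; the difference only appears once the caller passes an expression.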
2010-02-22fs/exec.c: fix initial stack reservationMichael Neuling
803bf5ec259941936262d10ecc84511b76a20921 ("fs/exec.c: restrict initial stack space expansion to rlimit") attempts to limit the initial stack to 20*PAGE_SIZE. Unfortunately, in attempting to ensure the stack is not reduced in size, we ended up not changing the stack at all. This size reduction check is not necessary as the expand_stack call does this already. This caused a regression in UML resulting in most guest processes being killed. Signed-off-by: Michael Neuling <> Reviewed-by: KOSAKI Motohiro <> Acked-by: WANG Cong <> Cc: Anton Blanchard <> Cc: Oleg Nesterov <> Cc: James Morris <> Cc: Serge Hallyn <> Cc: Benjamin Herrenschmidt <> Cc: Jouni Malinen <> Cc: <> Signed-off-by: Andrew Morton <> Signed-off-by: Linus Torvalds <>
2010-02-22seq_file: add RCU versions of new hlist/list iterators (v3)stephen hemminger
Many usages of seq_file use RCU protected lists, so non RCU iterators will not work safely. Signed-off-by: Stephen Hemminger <> Signed-off-by: David S. Miller <>
2010-02-20CacheFiles: Fix a race in cachefiles_delete_object() vs renameDavid Howells
cachefiles_delete_object() can race with rename. It gets the parent directory of the object it's asked to delete, then locks it - but rename may have changed the object's parent between the get and the completion of the lock. However, if such a circumstance is detected, we abandon our attempt to delete the object - since it's no longer in the index key path, it won't be seen again by lookups of that key. The assumption is that cachefilesd may have culled it by renaming it to the graveyard for later destruction. Signed-off-by: David Howells <> Signed-off-by: Al Viro <>
2010-02-20vfs: don't call ima_file_check() unconditionally in nfsd_open()Chuck Ebbert
commit 1e41568d7378d1ba8c64ba137b9ddd00b59f893a ("Take ima_path_check() in nfsd past dentry_open() in nfsd_open()") moved this code back to its original location but missed the "else". Signed-off-by: Chuck Ebbert <> Signed-off-by: Al Viro <>
2010-02-19Switch proc/self to nd_set_link()Al Viro
Signed-off-by: Al Viro <>
2010-02-19fix LOOKUP_FOLLOW on automount "symlinks"Al Viro
Make sure that automount "symlinks" are followed regardless of LOOKUP_FOLLOW; it should have no effect on them. Cc: Signed-off-by: Al Viro <>
2010-02-17percpu: add __percpu sparse annotations to fsTejun Heo
Add __percpu sparse annotations to fs. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. Signed-off-by: Tejun Heo <> Cc: "Theodore Ts'o" <> Cc: Trond Myklebust <> Cc: Alex Elder <> Cc: Christoph Hellwig <> Cc: Alexander Viro <>
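A userspace sketch of how such an annotation can vanish in normal builds while still being visible to sparse. The macro definitions follow the general pattern of sparse address-space annotations, but the address-space number, the slot array and both functions are assumptions for illustration, not kernel code.

```c
#include <assert.h>

#ifdef __CHECKER__	/* defined when the source is run through sparse */
# define __percpu __attribute__((noderef, address_space(3)))
# define __force  __attribute__((force))
#else			/* normal builds: the annotations expand to nothing */
# define __percpu
# define __force
#endif

#define NR_CPUS 4	/* hypothetical CPU count */

struct counter {
	long value;
};

/* Hypothetical stand-in for per-CPU storage: one slot per CPU. */
static struct counter slots[NR_CPUS];

/* Hand out an annotated pointer; sparse now treats it as living in a
 * separate address space and warns on a direct dereference. */
static struct counter __percpu *alloc_counter(void)
{
	return (struct counter __percpu __force *)slots;
}

/* The accessor is the only place the annotation is cast away. */
static struct counter *counter_on_cpu(struct counter __percpu *p, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return (struct counter __force *)p + cpu;
}
```

Compiled normally this is ordinary pointer arithmetic, which matches the statement above that normal builds are unaffected.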
2010-02-16sysfs: sysfs_sd_setattr set iattrs unconditionallyEric W. Biederman
There is currently a bug in sysfs_sd_setattr inherited from sysfs_setattr in 2.6.32 where the first time we set the attributes on a sysfs file we allocate backing store but do not set the backing store attributes, resulting in overly restrictive permissions on sysfs files. The fix is to simply modify the code so that it always executes when we update the sysfs attributes, as we did in 2.6.31 and earlier. Signed-off-by: Eric W. Biederman <> Tested-by: Jean Delvare <> Cc: stable <> Signed-off-by: Greg Kroah-Hartman <>
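The shape of the fix can be sketched as follows. The struct and function names are hypothetical and much simpler than the sysfs code; the point is only that the attribute store runs on every call, including the one that allocates the backing store.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical backing store for a node's attributes. */
struct node_attrs {
	unsigned int mode;
};

struct node {
	struct node_attrs *attrs;	/* NULL until the first setattr */
};

/* Returns 0 on success, -1 if the backing store cannot be allocated. */
static int node_setattr(struct node *n, unsigned int mode)
{
	assert(n != NULL);
	if (!n->attrs) {
		n->attrs = calloc(1, sizeof(*n->attrs));
		if (!n->attrs)
			return -1;
	}
	/* Unconditional store. Placing this in an else branch would
	 * reproduce the described bug: the first call would allocate
	 * the backing store but leave the requested mode unset. */
	n->attrs->mode = mode;
	return 0;
}
```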