path: root/fs
Age | Commit message | Author
2010-06-16 | cifs: don't call cifs_new_fileinfo unless cifs_open succeeds | Jeff Layton
It's currently possible for cifs_open to fail after it has already called cifs_new_fileinfo. In that situation, the new fileinfo will be leaked as the caller doesn't call fput. That in turn leads to a busy inodes after umount problem since the fileinfo holds an extra inode reference now. Shuffle cifs_open around a bit so that it only calls cifs_new_fileinfo if it's going to succeed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
2010-06-16 | cifs: don't ignore cifs_posix_open_inode_helper return value | Suresh Jayaraman
...and ensure that we propagate the error back to avoid any surprises. Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-and-Tested-by: Jeff Layton <jlayton@redhat.com>
2010-06-16 | cifs: clean up arguments to cifs_open_inode_helper | Jeff Layton
...which takes a ton of unneeded arguments and does a lot more pointer dereferencing than is really needed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
2010-06-16 | cifs: pass instantiated filp back after open call | Jeff Layton
The current scheme of sticking open files on a list and assuming that cifs_open will scoop them off of it is broken and leads to "Busy inodes after umount..." errors at unmount time. The problem is that there is no guarantee that cifs_open will always be called after a ->lookup or ->create operation. If there are permissions or other problems, then it's quite likely that it *won't* be called. Fix this by fully instantiating the filp whenever the file is created and pass that filp back to the VFS. If there is a problem, the VFS can clean up the references. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
2010-06-16 | cifs: move cifs_new_fileinfo call out of cifs_posix_open | Jeff Layton
Having cifs_posix_open call cifs_new_fileinfo is problematic and inconsistent with how "regular" opens work. It's also buggy as cifs_reopen_file calls this function on a reconnect, which creates a new struct cifsFileInfo that just gets leaked. Push it out into the callers. This also allows us to get rid of the "mnt" arg to cifs_posix_open. Finally, in the event that a cifsFileInfo isn't or can't be created, we always want to close the filehandle out on the server as the client won't have a record of the filehandle and can't actually use it. Make sure that CIFSSMBClose is called in those cases. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
2010-06-16 | Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 | Steve French
2010-06-15 | ocfs2: Limit default local alloc size within bitmap range. | Tao Ma
In commit 6b82021b9e91cd689fdffadbcdb9a42597bbe764, we increased our local alloc size and calculate how many megabytes we can get according to group size and volume size. But we also need to check the maximum number of bits a local alloc block bitmap can have. With bs=512, cs=32K, and a local volume of 160G, it calculates 96MB while the maximum local alloc size is only 76M. So the bitmap will overflow and corrupt the system truncate log file. See bug http://oss.oracle.com/bugzilla/show_bug.cgi?id=1262 Signed-off-by: Tao Ma <tao.ma@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
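The arithmetic in this message can be checked with a small standalone sketch. The bitmap capacity of 2432 bits is not stated in the commit; it is back-computed here from the quoted 76M limit at cs=32K (76 MB / 32 KB = 2432 clusters). The function name is hypothetical and is not the ocfs2 helper.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the ocfs2 code: clamp a requested local alloc
 * size (in MB) so it never needs more bits than the bitmap can hold.
 */
static uint32_t clamp_local_alloc_mb(uint32_t requested_mb,
                                     uint32_t bitmap_bits,
                                     uint32_t cluster_size_bytes)
{
    /* Largest size the bitmap can describe: one bit per cluster. */
    uint64_t max_bytes = (uint64_t)bitmap_bits * cluster_size_bytes;
    uint32_t max_mb = (uint32_t)(max_bytes >> 20);

    return requested_mb > max_mb ? max_mb : requested_mb;
}
```

With the figures from the message, `clamp_local_alloc_mb(96, 2432, 32768)` yields 76, so the 96MB default would be cut back to what the bitmap can represent.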
2010-06-15 | ocfs2: Move orphan scan work to ocfs2_wq. | Tao Ma
We used to let orphan scan work run in the default work queue, but there is a corner case which will make the system deadlock. The scenario is like this: 1. set heartbeat threshold to 200. This gives us a good chance of having an orphan scan work before our quorum decision. 2. mount node 1. 3. after 1~2 minutes, mount node 2 (to make the bug easier to reproduce, add maxcpus=1 to the kernel command line). 4. node 1 does orphan scan work. 5. node 2 does orphan scan work. 6. node 1 does orphan scan work. After this, node 1 holds the orphan scan lock while node 2 knows node 1 is the master. 7. ifdown eth2 on node 2 (eth2 is what we use for the ocfs2 interconnect). Now when node 2 begins orphan scan, the system queue is blocked. The root cause is that both orphan scan work and quorum decision work use the system event work queue. Orphan scan has a chance of blocking the event work queue (in dlm_wait_for_node_death) so that there is no chance for quorum decision work to proceed. This patch resolves it by moving orphan scan work to ocfs2_wq. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-06-15 | fs/ocfs2/dlm: Add missing spin_unlock | Julia Lawall
Add a spin_unlock missing on the error path. Unlock as in the other code that leads to the leave label. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression E1; @@ * spin_lock(E1,...); <+... when != E1 if (...) { ... when != E1 * return ...; } ...+> * spin_unlock(E1,...); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Joel Becker <joel.becker@oracle.com>
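The bug class that the semantic match looks for can be illustrated outside the kernel. This is a minimal userspace sketch, assuming a toy lock type in place of a real spinlock; `toy_lock`, `toy_lock_acquire`, `toy_lock_release` and `update_under_lock` are hypothetical names, not code from fs/ocfs2/dlm.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a spinlock, only tracking whether it is held. */
struct toy_lock {
    bool held;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held = true; }
static void toy_lock_release(struct toy_lock *l) { l->held = false; }

/*
 * Every exit after the lock is taken funnels through the `leave` label,
 * so the error path releases the lock just like the success path does.
 */
static int update_under_lock(struct toy_lock *l, int *value, int delta)
{
    int ret = 0;

    toy_lock_acquire(l);
    if (delta < 0) {
        ret = -EINVAL;
        goto leave; /* the buggy shape would `return -EINVAL;` here */
    }
    *value += delta;
leave:
    toy_lock_release(l);
    return ret;
}
```

A direct `return` inside the `if` block is exactly the pattern the rule flags: a return between the lock and unlock calls with no release on that path.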
2010-06-14 | Merge branch 'for-jens' of git://git.drbd.org/linux-2.6-drbd into for-linus | Jens Axboe
2010-06-13 | of: Drop properties with "/" in their name | Michael Ellerman
Some bogus firmwares include properties with "/" in their name. This causes problems when creating the /proc/device-tree file system, because the slash is taken to indicate a directory. We don't care about those properties, and we don't want to encourage them, so just throw them away when creating /proc/device-tree. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Tested-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-06-13 | ceph: fix message memory leak, uninitialized variable | Sage Weil
We need to properly initialize skip, as not all alloc_msg op instances set it. Also, BUG if someone says skip but also allocates a message. Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-13 | ceph: fix map handler error path | Sage Weil
Don't leak message if we receive an unexpected message type. Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-13 | ceph: some endianity fixes | Yehuda Sadeh
Fix some problems that came up with sparse. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-12 | cifs: implement drop_inode superblock op | Jeff Layton
The standard behavior for drop_inode is to delete the inode when the last reference to it is put and the nlink count goes to 0. This helps keep inodes that are still considered "not deleted" in cache as long as possible even when there aren't dentries attached to them. When server inode numbers are disabled, it's not possible for cifs_iget to ever match an existing inode (since inode numbers are generated via iunique). In this situation, cifs can keep a lot of inodes in cache that will never be used again. Implement a drop_inode routine that deletes the inode if server inode numbers are disabled on the mount. This helps keep the cifs inode caches down to a more manageable size when server inode numbers are disabled. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
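The decision described above reduces to a small predicate. The following is a sketch of that logic only, under the assumption that the two inputs are available as plain values; `sketch_should_delete_inode` is a hypothetical name and does not reproduce the cifs or VFS implementation.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate: should the inode be deleted when its last
 * reference is put?
 *  - standard rule from the message: delete once nlink has dropped to 0
 *  - added rule: also delete when server inode numbers are disabled,
 *    because such an inode can never be matched by a later lookup
 */
static bool sketch_should_delete_inode(unsigned int nlink,
                                       bool server_inums_enabled)
{
    return nlink == 0 || !server_inums_enabled;
}
```

With server inode numbers enabled, the predicate falls back to the standard rule and linked inodes stay cached.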
2010-06-12 | cifs: don't attempt busy-file rename unless it's in same directory | Jeff Layton
Busy-file renames don't actually work across directories, so we need to limit this code to renames within the same dir. This fixes the bug detailed here: https://bugzilla.redhat.com/show_bug.cgi?id=591938 Signed-off-by: Jeff Layton <jlayton@redhat.com> CC: Stable <stable@kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-06-11 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: The file argument for fsync() is never null
  Btrfs: handle ERR_PTR from posix_acl_from_xattr()
  Btrfs: avoid BUG when dropping root and reference in same transaction
  Btrfs: prohibit a operation of changing acl's mask when noacl mount option used
  Btrfs: should add a permission check for setfacl
  Btrfs: btrfs_lookup_dir_item() can return ERR_PTR
  Btrfs: btrfs_read_fs_root_no_name() returns ERR_PTRs
  Btrfs: unwind after btrfs_start_transaction() errors
  Btrfs: btrfs_iget() returns ERR_PTR
  Btrfs: handle kzalloc() failure in open_ctree()
  Btrfs: handle error returns from btrfs_lookup_dir_item()
  Btrfs: Fix BUG_ON for fs converted from extN
  Btrfs: Fix null dereference in relocation.c
  Btrfs: fix remap_file_pages error
  Btrfs: uninitialized data is check_path_shared()
  Btrfs: fix fallocate regression
  Btrfs: fix loop device on top of btrfs
2010-06-11 | Btrfs: The file argument for fsync() is never null | Dan Carpenter
The "file" argument for fsync is never null so we can remove this check. What drew my attention here is that 7ea8085910e: "drop unused dentry argument to ->fsync" introduced an unconditional dereference at the start of the function and that generated a smatch warning. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: handle ERR_PTR from posix_acl_from_xattr() | Dan Carpenter
posix_acl_from_xattr() returns both ERR_PTRs and null, but it's OK to pass null values to set_cached_acl() Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
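The distinction between an error pointer and a legitimate null can be sketched in userspace. The `ERR_PTR`, `PTR_ERR` and `IS_ERR` helpers below are simplified re-implementations written for this example, and `lookup_acl` and `cache_acl` are hypothetical stand-ins for the lookup and the caching step; none of it is the btrfs code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Error codes occupy the top MAX_ERRNO values of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_acl;

/* Hypothetical lookup: error pointer, NULL ("no ACL"), or a valid object. */
static void *lookup_acl(int mode)
{
    if (mode < 0)
        return ERR_PTR(-EIO);
    if (mode == 0)
        return NULL;
    return &dummy_acl;
}

/* Only error pointers abort; NULL is passed through to the cache. */
static long cache_acl(int mode, void **cache)
{
    void *acl = lookup_acl(mode);

    if (IS_ERR(acl))
        return PTR_ERR(acl);
    *cache = acl;
    return 0;
}
```

Note that `IS_ERR(NULL)` is false, which is why a separate null check is needed wherever null is not an acceptable value.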
2010-06-11 | Btrfs: avoid BUG when dropping root and reference in same transaction | Sage Weil
If btrfs_ioctl_snap_destroy() deletes a snapshot but finishes with end_transaction(), the cleaner kthread may come in and drop the root in the same transaction. If that's the case, the root's refs still == 1 in the tree when btrfs_del_root() deletes the item, because commit_fs_roots() hasn't updated it yet (that happens during the commit). This wasn't a problem before only because btrfs_ioctl_snap_destroy() would commit the transaction before dropping the dentry reference, so the dead root wouldn't get queued up until after the fs root item was updated in the btree. Since it is not an error to drop the root reference and the root in the same transaction, just drop the BUG_ON() in btrfs_del_root(). Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: prohibit a operation of changing acl's mask when noacl mount option used | Shi Weihua
When using the Posix File System Test Suite (pjd-fstest) to test btrfs, some setfacl cases failed when the noacl mount option was used. I simplified the commands used in pjd-fstest, and the following steps reproduce it. ------------------------ # cd btrfs-part/ # mkdir aaa # setfacl -m m::rw aaa <- succeeded, but not expected by pjd-fstest. ------------------------ I checked ext3, where a warning message occurred, like: setfacl: aaa/: Operation not supported That is what pjd-fstest expects. So I compared acl.c of btrfs and ext3 and created a patch based on that. Fortunately, it works. Signed-off-by: Shi Weihua <shiwh@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: should add a permission check for setfacl | Shi Weihua
On btrfs, do the following ------------------ # su user1 # cd btrfs-part/ # touch aaa # getfacl aaa # file: aaa # owner: user1 # group: user1 user::rw- group::rw- other::r-- # su user2 # cd btrfs-part/ # setfacl -m u::rwx aaa # getfacl aaa # file: aaa # owner: user1 # group: user1 user::rwx <- setfacl succeeded group::rw- other::r-- ------------------ but we should prohibit user2 from changing user1's acl. In fact, on ext3 and other filesystems, a message occurs: setfacl: aaa: Operation not permitted This patch fixes it. Signed-off-by: Shi Weihua <shiwh@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: btrfs_lookup_dir_item() can return ERR_PTR | Dan Carpenter
btrfs_lookup_dir_item() can return either ERR_PTRs or null. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: btrfs_read_fs_root_no_name() returns ERR_PTRs | Dan Carpenter
btrfs_read_fs_root_no_name() returns ERR_PTRs on error so I added a check for that. It's not clear to me if it can also return NULL pointers or not so I left the original NULL pointer check as is. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: unwind after btrfs_start_transaction() errors | Dan Carpenter
This was added by a22285a6a3: "Btrfs: Integrate metadata reservation with start_transaction". If we goto out here then we skip all the unwinding and there are locks still held etc. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: btrfs_iget() returns ERR_PTR | Dan Carpenter
btrfs_iget() returns an ERR_PTR() on failure and not null. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: handle kzalloc() failure in open_ctree() | Dan Carpenter
Unwind and return -ENOMEM if the allocation fails here. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: handle error returns from btrfs_lookup_dir_item() | Dan Carpenter
If btrfs_lookup_dir_item() fails, we can just let the mount fail with an error. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: Fix BUG_ON for fs converted from extN | Yan, Zheng
Tree blocks can live in data block groups in FS converted from extN. So it's easy to trigger the BUG_ON. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: Fix null dereference in relocation.c | Yan, Zheng
Fix a potential null dereference in relocation.c Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Acked-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: try to send partial cap release on cap message on missing inode
  ceph: release cap on import if we don't have the inode
  ceph: fix misleading/incorrect debug message
  ceph: fix atomic64_t initialization on ia64
  ceph: fix lease revocation when seq doesn't match
  ceph: fix f_namelen reported by statfs
  ceph: fix memory leak in statfs
  ceph: fix d_subdirs ordering problem
2010-06-11 | Btrfs: fix remap_file_pages error | Miao Xie
When we use remap_file_pages() to remap a file, remap_file_pages() always returns an error. This is because btrfs didn't set VM_CAN_NONLINEAR for the vma. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: uninitialized data is check_path_shared() | Dan Carpenter
refs can be used with uninitialized data if btrfs_lookup_extent_info() fails on the first pass through the loop. In the original code, if that happens then check_path_shared() probably returns 1; this patch changes it to return 1 for safety. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: fix fallocate regression | Josef Bacik
Seems that when btrfs_fallocate was converted to use the new ENOSPC stuff we dropped passing the mode to the function that actually does the preallocation. This breaks anybody who wants to use FALLOC_FL_KEEP_SIZE. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | Btrfs: fix loop device on top of btrfs | Miao Xie
We cannot use a loop device which has been connected to a file in btrfs. The steps to reproduce are as follows: # dd if=/dev/zero of=vdev0 bs=1M count=1024 # losetup /dev/loop0 vdev0 # mkfs.btrfs /dev/loop0 ... failed to zero device start -5 The reason is that btrfs doesn't implement either ->write_begin or ->write of the VFS API, so we fix it by setting ->write to do_sync_write(). Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11 | writeback: fix pin_sb_for_writeback | Christoph Hellwig
We need to check for s_instances to make sure we don't bother working against a filesystem that is being unmounted, and we need to call put_super to make sure a superblock is freed when we race against umount. Also, there is no need to keep sb_lock after we have a reference on it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-11 | writeback: add missing requeue_io in writeback_inodes_wb | Christoph Hellwig
In "writeback: fix writeback_inodes_wb from writeback_inodes_sb" I accidentally removed the requeue_io if we need to skip a superblock because we can't pin it. Add it back, otherwise we're getting spurious lockups after multiple xfstests runs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-11 | writeback: simplify and split bdi_start_writeback | Christoph Hellwig
bdi_start_writeback now never gets a superblock passed, so we can just remove that case. And to further untangle the code and flatten the call stack, split it into two trivial helpers for its two callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-11 | writeback: simplify wakeup_flusher_threads | Christoph Hellwig
bdi_writeback_all only has one caller, so fold it to simplify the code and flatten the call stack. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-11 | writeback: fix writeback_inodes_wb from writeback_inodes_sb | Christoph Hellwig
When we call writeback_inodes_wb from writeback_inodes_sb we always have s_umount held, which currently makes the whole operation a no-op. But if we are called to write out inodes for a specific superblock we always have s_umount held, so replace the incorrect logic checking for WB_SYNC_ALL which only worked by coincidence with the proper check for an explicit superblock argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-11 | writeback: enforce s_umount locking in writeback_inodes_sb | Christoph Hellwig
Make sure that not only sync_filesystem but all callers of writeback_inodes_sb have the superblock protected against remount. As-is this disables all functionality for these callers, but the next patch relies on this locking to fix writeback_inodes_sb for sync_filesystem. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-11 | writeback: queue work on stack in writeback_inodes_sb | Christoph Hellwig
If we want to rely on s_umount in the caller we need to wait for completion of the I/O submission before returning to the caller. Refactor bdi_sync_writeback into a bdi_queue_work_onstack helper and use it for this case. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-11 | writeback: fix writeback completion notifications | Christoph Hellwig
The code dealing with bdi_work->state and completion of a bdi_work is a major mess currently. This patch makes sure we directly use one set of flags to deal with it, and use it consistently, which means: - always notify about completion from the rcu callback. We only ever wait for it from on-stack callers, so this simplification does not even cause a theoretical slowdown currently. It also makes sure we don't miss out on the notification if we ever add other callers to wait for it. - make earlier completion notification depend on the on-stack allocation, not the sync mode. If we introduce new callers that want to do WB_SYNC_NONE writeback from on-stack callers this will be necessary. Also rename bdi_wait_on_work_clear to bdi_wait_on_work_done and inline a few small functions into their only caller to make the code understandable. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-10 | ceph: try to send partial cap release on cap message on missing inode | Sage Weil
If we have enough memory to allocate a new cap release message, do so, so that we can send a partial release message immediately. This keeps us from making the MDS wait when the cap release it needs is in a partially full release message. If we fail because of ENOMEM, oh well, they'll just have to wait a bit longer. Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-10 | ceph: release cap on import if we don't have the inode | Sage Weil
If we get an IMPORT that gives us a cap, but we don't have the inode, queue a release (and try to send it immediately) so that the MDS doesn't get stuck waiting for us. Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-10 | ceph: fix misleading/incorrect debug message | Sage Weil
Nothing is released here: the caps message is simply ignored in this case. Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-10 | ceph: fix atomic64_t initialization on ia64 | Jeff Mahoney
bdi_seq is an atomic_long_t but we're using ATOMIC_INIT, which causes build failures on ia64. This patch fixes it to use ATOMIC_LONG_INIT. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-10 | pipe: fix check in "set size" fcntl | Miklos Szeredi
As it stands this check compares the number of pages to the page size. This makes no sense and makes the fcntl fail in almost any sane case. Fix it by checking if nr_pages is not zero (it can become zero only if arg is too big and round_pipe_size() overflows). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
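The corrected check can be sketched as a standalone model. This assumes 4 KiB pages and a 32-bit `unsigned int`; every `sketch_` name is hypothetical, and the rounding helper only imitates the overflow behaviour the message attributes to round_pipe_size(), it is not the kernel function.

```c
#include <errno.h>
#include <limits.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

/* Smallest power of two >= n (returns 1 for n == 0). */
static unsigned int sketch_roundup_pow2(unsigned int n)
{
    unsigned int p = 1;

    while (p && p < n)
        p <<= 1; /* wraps to 0 if the next power of two does not fit */
    return p;
}

/* Round a byte count up to a power-of-two number of pages; 0 on overflow. */
static unsigned int sketch_round_pipe_size(unsigned int size)
{
    unsigned int nr_pages = (size >> SKETCH_PAGE_SHIFT) +
                            ((size & (SKETCH_PAGE_SIZE - 1)) != 0);
    unsigned int pow2 = sketch_roundup_pow2(nr_pages);

    if (pow2 == 0 || pow2 > (UINT_MAX >> SKETCH_PAGE_SHIFT))
        return 0; /* the byte size no longer fits in an unsigned int */
    return pow2 << SKETCH_PAGE_SHIFT;
}

/* Returns the page count to use, or -EINVAL when the rounding overflowed. */
static long sketch_set_pipe_pages(unsigned int arg)
{
    unsigned int nr_pages = sketch_round_pipe_size(arg) >> SKETCH_PAGE_SHIFT;

    /* The fixed check: reject only a zero page count, not "pages < page size". */
    if (!nr_pages)
        return -EINVAL;
    return nr_pages;
}
```

A request of 65536 bytes gives 16 pages here, which a comparison against the page size (4096) would have rejected even though it is a perfectly sane size.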
2010-06-10 | pipe: fix pipe buffer resizing | Miklos Szeredi
pipe_set_size() needs to copy pipe bufs from the old circular buffer to the new. The current code gets this wrong in multiple ways, resulting in oops. Test program is available here: http://www.kernel.org/pub/linux/kernel/people/mszeredi/piperesize/ Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
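Copying out of a circular buffer is easy to get wrong because the live entries may wrap past the end of the old array. A minimal sketch of one way to do it, using plain integers instead of pipe buffers; `ring_copy_linear` and its parameters are hypothetical and this is not the pipe_set_size() code.

```c
#include <stddef.h>

/*
 * Copy `len` live entries of a ring that starts at index `head` and wraps
 * modulo `old_cap` into `new_buf`, laid out linearly from index 0.
 * The caller must ensure head < old_cap, len <= old_cap, and that new_buf
 * has room for len entries; afterwards the new ring's head is 0.
 */
static void ring_copy_linear(const int *old, size_t old_cap,
                             size_t head, size_t len, int *new_buf)
{
    size_t tail_run = old_cap - head;                /* entries before the wrap */
    size_t first = len < tail_run ? len : tail_run;  /* part taken from the tail */
    size_t i;

    for (i = 0; i < first; i++)
        new_buf[i] = old[head + i];
    for (i = first; i < len; i++)
        new_buf[i] = old[i - first];                 /* wrapped part, from index 0 */
}
```

For an old array `{30, 40, 10, 20}` with head 2 and length 4, the new buffer starts `10, 20, 30, 40`, preserving the logical order across the wrap.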
2010-06-10 | block: remove duplicate BUG_ON() in bd_finish_claiming() | Jens Axboe
We do the same BUG_ON() just a line later when calling into __bd_abort_claiming(). Reported-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>