path: root/fs
2015-07-01  nfs: Remove invalid tk_pid from debug message  (Kinglong Mee)

Before rpc_run_task() is called, tk_pid has not been initialized and always reads as 0, so printing it in the debug message is meaningless.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2015-07-01  nfs: Remove invalid NFS_ATTR_FATTR_V4_REFERRAL checking in nfs4_get_rootfh  (Kinglong Mee)

NFS_ATTR_FATTR_V4_REFERRAL is only set in nfs4_proc_lookup_common(), so there is no point in checking for it in nfs4_get_rootfh().

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2015-07-01  nfs: Drop bad comment in nfs41_walk_client_list()  (Kinglong Mee)

Commit 7b1f1fd184 ("NFSv4/4.1: Fix bugs in nfs4[01]_walk_client_list") changed the logic of the list_for_each_entry() loop, so the old comment no longer describes the code; drop it.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2015-07-01  nfs: Remove unneeded checking of the CONFIG_PROC_FS macro  (Kinglong Mee)

CONFIG_PROC_FS is already checked in include/linux/sunrpc/stats.h, so there is no need to check it again here.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2015-07-01  nfs: Don't set the FILE_CREATED flag unconditionally  (Kinglong Mee)

Commit 5bc2afc2b5 ("NFSv4: Honour the 'opened' parameter in the atomic_open() filesystem method") added support for the 'opened' argument, and commit 03da633aa7 ("atomic_open: take care of EEXIST in no-open case with O_CREAT|O_EXCL in fs/namei.c") changed the VFS logic, so FILE_CREATED no longer needs to be set unconditionally.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2015-07-01  nfs: Use remove_proc_subtree() instead of remove_proc_entry()  (Kinglong Mee)

Thanks to Al Viro for the suggestion to kill proc_fs_nfs completely.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2015-07-01  nfs: Remove unused argument in nfs_server_set_fsinfo()  (Kinglong Mee)

Commit e38eb6506f ("NFS: set_pnfs_layoutdriver() from nfs4_proc_fsinfo()") removed the use of mntfh in nfs_server_set_fsinfo(), so the argument can be dropped.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2015-07-01  nfs: Fix a memory leak when meeting an unsupported state protect  (Kinglong Mee)

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2015-07-01  nfs: take extra reference to fl->fl_file when running a LOCKU operation  (Jeff Layton)

Jean reported another crash, similar to the one fixed by feaff8e5b2cf:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000148
    IP: [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
    PGD 0
    Oops: 0000 [#1] SMP
    Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache vmw_vsock_vmci_transport vsock cfg80211 rfkill coretemp crct10dif_pclmul ppdev vmw_balloon crc32_pclmul crc32c_intel ghash_clmulni_intel pcspkr vmxnet3 parport_pc i2c_piix4 microcode serio_raw parport nfsd floppy vmw_vmci acpi_cpufreq auth_rpcgss shpchp nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi scsi_transport_spi mptscsih ata_generic mptbase i2c_core pata_acpi
    CPU: 0 PID: 329 Comm: kworker/0:1H Not tainted 4.1.0-rc7+ #2
    Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
    Workqueue: rpciod rpc_async_schedule [sunrpc] 30ec000
    RIP: 0010:[<ffffffff8124ef7f>] [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
    RSP: 0018:ffff8802330efc08 EFLAGS: 00010296
    RAX: ffff8802330efc58 RBX: ffff880097187c80 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
    RBP: ffff8802330efc18 R08: ffff88023fc173d8 R09: 3038b7bf00000000
    R10: 00002f1a02000000 R11: 3038b7bf00000000 R12: 0000000000000000
    R13: 0000000000000000 R14: ffff8802337a2300 R15: 0000000000000020
    FS: 0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000148 CR3: 000000003680f000 CR4: 00000000001407f0
    Stack:
     ffff880097187c80 ffff880097187cd8 ffff8802330efc98 ffffffff81250281
     ffff8802330efc68 ffffffffa013e7df ffff8802330efc98 0000000000000246
     ffff8801f6901c00 ffff880233d2b8d8 ffff8802330efc58 ffff8802330efc58
    Call Trace:
     [<ffffffff81250281>] __posix_lock_file+0x31/0x5e0
     [<ffffffffa013e7df>] ? rpc_wake_up_task_queue_locked.part.35+0xcf/0x240 [sunrpc]
     [<ffffffff8125088b>] posix_lock_file_wait+0x3b/0xd0
     [<ffffffffa03890b2>] ? nfs41_wake_and_assign_slot+0x32/0x40 [nfsv4]
     [<ffffffffa0365808>] ? nfs41_sequence_done+0xd8/0x300 [nfsv4]
     [<ffffffffa0367525>] do_vfs_lock+0x35/0x40 [nfsv4]
     [<ffffffffa03690c1>] nfs4_locku_done+0x81/0x120 [nfsv4]
     [<ffffffffa013e310>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
     [<ffffffffa013e310>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
     [<ffffffffa013e33c>] rpc_exit_task+0x2c/0x90 [sunrpc]
     [<ffffffffa0134400>] ? call_refreshresult+0x170/0x170 [sunrpc]
     [<ffffffffa013ece4>] __rpc_execute+0x84/0x410 [sunrpc]
     [<ffffffffa013f085>] rpc_async_schedule+0x15/0x20 [sunrpc]
     [<ffffffff810add67>] process_one_work+0x147/0x400
     [<ffffffff810ae42b>] worker_thread+0x11b/0x460
     [<ffffffff810ae310>] ? rescuer_thread+0x2f0/0x2f0
     [<ffffffff810b35d9>] kthread+0xc9/0xe0
     [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pmd+0xa0/0x160
     [<ffffffff810b3510>] ? kthread_create_on_node+0x170/0x170
     [<ffffffff8173c222>] ret_from_fork+0x42/0x70
     [<ffffffff810b3510>] ? kthread_create_on_node+0x170/0x170
    Code: a5 81 e8 85 75 e4 ff c6 05 31 ee aa 00 01 eb 98 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 49 89 fc 53 <48> 8b 9f 48 01 00 00 48 85 db 74 08 48 89 d8 5b 41 5c 5d c3 83
    RIP [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
     RSP <ffff8802330efc08>
    CR2: 0000000000000148
    ---[ end trace 64484f16250de7ef ]---

The problem is almost exactly the same as the one fixed by feaff8e5b2cf. We must take a reference to the struct file when running the LOCKU compound to prevent the final fput from running until the operation is complete.

Reported-by: Jean Spector <jean@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

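The fix follows the same pattern as feaff8e5b2cf: pin the struct file before the asynchronous LOCKU compound is dispatched and drop the reference only in the completion callback. A minimal kernel-style sketch of that pattern (the context structure and function names here are illustrative, not the actual nfs4proc.c code):

    struct nfs_unlock_ctx {             /* hypothetical container */
            struct file *filp;
            struct file_lock *fl;
    };

    static void nfs_unlock_prepare(struct nfs_unlock_ctx *ctx,
                                   struct file_lock *fl)
    {
            ctx->fl = fl;
            /* extra reference: the final fput() cannot complete while
             * the async RPC is still in flight */
            ctx->filp = get_file(fl->fl_file);
            /* ... queue the asynchronous LOCKU RPC ... */
    }

    static void nfs_unlock_done(struct nfs_unlock_ctx *ctx)
    {
            /* ctx->filp is still pinned, so the inode and lock context
             * it points to cannot have been torn down underneath us */
            /* ... release the local POSIX lock via the VFS ... */
            fput(ctx->filp);            /* drop the extra reference */
    }
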
2015-07-01  fuse: separate pqueue for clones  (Miklos Szeredi)

Make each fuse device clone refer to a separate processing queue. The only constraint on userspace code is that the request answer must be written to the same device clone as it was read off.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

2015-07-01  fuse: introduce per-instance fuse_dev structure  (Miklos Szeredi)

Allow fuse device clones to be distinguished from one another. This patch just adds the infrastructure by associating a separate "struct fuse_dev" with each clone.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: device fd clone  (Miklos Szeredi)

Allow an open fuse device to be "cloned". Userspace can create a clone by:

    newfd = open("/dev/fuse", O_RDWR);
    ioctl(newfd, FUSE_DEV_IOC_CLONE, &oldfd);

At this point newfd will refer to the same fuse connection as oldfd.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

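A self-contained userspace sketch of that sequence (assuming <linux/fuse.h> provides FUSE_DEV_IOC_CLONE and that session_fd is a /dev/fuse fd already attached to a mounted connection):

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/fuse.h>

    /* returns a new fd referring to the same fuse connection, or -1 */
    int clone_fuse_fd(int session_fd)
    {
            uint32_t oldfd = session_fd;
            int newfd = open("/dev/fuse", O_RDWR);

            if (newfd < 0) {
                    perror("open /dev/fuse");
                    return -1;
            }
            if (ioctl(newfd, FUSE_DEV_IOC_CLONE, &oldfd) < 0) {
                    perror("FUSE_DEV_IOC_CLONE");
                    close(newfd);
                    return -1;
            }
            return newfd;
    }

A typical use is to hand each worker thread its own clone so requests can be read and answered in parallel; the only constraint (see the pqueue patch above) is that an answer must be written to the same clone the request was read from.
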
2015-07-01  fuse: abort: no fc->lock needed for request ending  (Miklos Szeredi)

In fuse_abort_conn() when all requests are on private lists we no longer need fc->lock protection.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: no fc->lock for pqueue parts  (Miklos Szeredi)

Remove fc->lock protection from processing queue members, now protected by fpq->lock.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: no fc->lock in request_end()  (Miklos Szeredi)

No longer need to call request_end() with the connection lock held. We still protect the background counters and queue with fc->lock, so acquire it if necessary.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: cleanup request_end()  (Miklos Szeredi)

Now that we atomically test whether the request has already been ended, we no longer need any other protection.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: request_end(): do once  (Miklos Szeredi)

When the connection is aborted it is possible that request_end() will be called twice. Use atomic test and set to do the actual ending only once. test_and_set_bit() also provides the necessary barrier semantics so no explicit smp_wmb() is necessary.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

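This is the standard atomic do-once idiom; a simplified sketch (FR_FINISHED and the req->flags word follow this series, the function body is condensed rather than the exact dev.c code):

    static void request_end(struct fuse_conn *fc, struct fuse_req *req)
    {
            /* only the caller that actually flips the bit proceeds;
             * test_and_set_bit() is an atomic RMW with full barrier
             * semantics, so no extra smp_wmb() is needed */
            if (test_and_set_bit(FR_FINISHED, &req->flags))
                    return;

            /* ... wake up waiters, run ->end(), drop the reference ... */
    }
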
2015-07-01  fuse: add req flag for private list  (Miklos Szeredi)

When an unlocked request is aborted, it is moved from fpq->io to a private list. Then, after unlocking fpq->lock, the private list is processed and the requests are finished off. To protect the private list, we need to mark the request with a flag, so if in the meantime the request is unlocked the list is not corrupted.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: pqueue locking  (Miklos Szeredi)

Add fpq->lock for protecting members of struct fuse_pqueue and the FR_LOCKED request flag.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: abort: group pqueue accesses  (Miklos Szeredi)

Rearrange fuse_abort_conn() so that processing queue accesses are grouped together.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: cleanup fuse_dev_do_read()  (Miklos Szeredi)

- locked list_add() + list_del_init() cancel out
- common handling of case when request is ended here in the read phase

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: move list_del_init() from request_end() into callers  (Miklos Szeredi)

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

2015-07-01  fuse: duplicate ->connected in pqueue  (Miklos Szeredi)

This will allow checking ->connected just with the processing queue lock.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: separate out processing queue  (Miklos Szeredi)

This is just two fields: fc->io and fc->processing. This patch just rearranges the fields, no functional change.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: simplify request_wait()  (Miklos Szeredi)

wait_event_interruptible_exclusive_locked() will do everything request_wait() does, so replace it.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

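For reference, the replacement collapses the open-coded exclusive-wait loop into a single helper call; roughly (a sketch of the call site, which in the real code sits in fuse_dev_do_read() with fiq->waitq.lock held; the error label is illustrative):

    /* caller holds fiq->waitq.lock; the helper drops and retakes it
     * around schedule(), waiting exclusively on fiq->waitq */
    err = wait_event_interruptible_exclusive_locked(fiq->waitq,
                    !fiq->connected || request_pending(fiq));
    if (err)
            goto err_unlock;        /* interrupted by a signal */
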
2015-07-01  fuse: no fc->lock for iqueue parts  (Miklos Szeredi)

Remove fc->lock protection from input queue members, now protected by fiq->waitq.lock.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: allow interrupt queuing without fc->lock  (Miklos Szeredi)

An interrupt is only queued after the request has been sent to userspace. This is done either in request_wait_answer() or in fuse_dev_do_read(), depending on which state the request is in at the time of the interrupt. If it has not yet been sent, queuing the interrupt is postponed until the request is read. Otherwise (the request has already been read and is waiting for an answer) the interrupt is queued immediately.

We want to call queue_interrupt() without fc->lock protection, in which case there can be a race between the two functions:

- neither of them queues the interrupt (each thinking the other one has already done it);
- both of them queue the interrupt.

The first is prevented by adding memory barriers, the second by checking (under fiq->waitq.lock) whether the interrupt has already been queued.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

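The resulting pattern is a symmetric "set my flag, barrier, check the other side's flag" cross-check, with the queuing itself made idempotent under the iqueue lock. A condensed sketch (close to, but not a verbatim copy of, the dev.c code):

    /* interrupt side (request_wait_answer) */
    set_bit(FR_INTERRUPTED, &req->flags);
    smp_mb__after_atomic();
    if (test_bit(FR_SENT, &req->flags))
            queue_interrupt(fiq, req);

    /* read side (fuse_dev_do_read), after copying the request out */
    set_bit(FR_SENT, &req->flags);
    smp_mb__after_atomic();
    if (test_bit(FR_INTERRUPTED, &req->flags))
            queue_interrupt(fiq, req);

    /* queuing twice is harmless: checked under fiq->waitq.lock */
    static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
    {
            spin_lock(&fiq->waitq.lock);
            if (list_empty(&req->intr_entry)) {
                    list_add_tail(&req->intr_entry, &fiq->interrupts);
                    wake_up_locked(&fiq->waitq);
            }
            spin_unlock(&fiq->waitq.lock);
    }
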
2015-07-01  fuse: iqueue locking  (Miklos Szeredi)

Use fiq->waitq.lock for protecting members of struct fuse_iqueue and FR_PENDING request flag, previously protected by fc->lock. Following patches will remove fc->lock protection from these members.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: dev read: split list_move  (Miklos Szeredi)

Different lists will need different locks.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: abort: group iqueue accesses  (Miklos Szeredi)

Rearrange fuse_abort_conn() so that input queue accesses are grouped together.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: duplicate ->connected in iqueue  (Miklos Szeredi)

This will allow checking ->connected just with the input queue lock.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: separate out input queue  (Miklos Szeredi)

The input queue contains normal requests (fc->pending), forgets (fc->forget_*) and interrupts (fc->interrupts). There's also fc->waitq and fc->fasync for waking up the readers of the fuse device when a request is available. The fc->reqctr is also moved to the input queue (it is assigned to the request when the request is added to the input queue). This patch just rearranges the fields, no functional change.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: req state use flags  (Miklos Szeredi)

Use flags for representing the state in fuse_req. This is needed since req->list will be protected by different locks in different states, hence we'll want the state itself to be split into distinct bits, each protected with the relevant lock in that state.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

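Concretely, each lifecycle stage becomes its own bit in req->flags, so each transition can be guarded by whichever lock owns that stage. A sketch of the idea (the FR_* names follow this patch series; the lock annotations summarize where the series ends up, not this single patch):

    enum fuse_req_flag {
            FR_PENDING,     /* on the input queue, guarded by the iqueue lock */
            FR_SENT,        /* read by userspace, guarded by the pqueue lock */
            FR_FINISHED,    /* completion has run (set atomically, exactly once) */
            /* ... */
    };

    /* e.g. dequeueing from the pending list under the iqueue lock: */
    clear_bit(FR_PENDING, &req->flags);
    list_del_init(&req->list);
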
2015-07-01  fuse: simplify req states  (Miklos Szeredi)

FUSE_REQ_INIT is actually the same state as FUSE_REQ_PENDING, and FUSE_REQ_READING and FUSE_REQ_WRITING can be merged into a common FUSE_REQ_IO state.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: don't hold lock over request_wait_answer()  (Miklos Szeredi)

Only hold fc->lock over sections of request_wait_answer() that actually need it. If wait_event_interruptible() returns zero, it means that the request finished. Need to add memory barriers, though, to make sure that all relevant data in the request is synchronized.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

2015-07-01  fuse: simplify unique ctr  (Miklos Szeredi)

Since it's a 64-bit counter, it is never going to wrap around in practice (even at a million requests per second it would take on the order of half a million years). Remove the code dealing with that possibility.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: rework abort  (Miklos Szeredi)

Splice fc->pending and fc->processing lists into a common kill list while holding fc->lock. By the time we release fc->lock, pending and processing lists are empty and the io list contains only locked requests.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: fold helpers into abort  (Miklos Szeredi)

Fold end_io_requests() and end_queued_requests() into fuse_abort_conn().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: use per req lock for lock/unlock_request()  (Miklos Szeredi)

Reuse req->waitq.lock for protecting FR_ABORTED and FR_LOCKED flags.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: req use bitops  (Miklos Szeredi)

Finer grained locking will mean there's no single lock to protect modification of bitfields in fuse_req. So move to using bitops. Can use the non-atomic variants for those which happen while the request definitely has only one reference.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

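The last sentence is the interesting part: the atomic and non-atomic bitops differ in cost, and the non-atomic ones are only safe while nobody else can see the request. Schematically (flag names follow the series; the snippet is illustrative, not the exact diff):

    /* request just allocated, we hold the only reference:
     * the cheaper non-atomic variant is fine */
    __set_bit(FR_ISREPLY, &req->flags);

    /* request already visible to other CPUs:
     * must use the atomic variant */
    set_bit(FR_ABORTED, &req->flags);
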
2015-07-01  fuse: simplify request abort  (Miklos Szeredi)

- don't end the request while req->locked is true
- make unlock_request() return an error if the connection was aborted

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: call fuse_abort_conn() in dev release  (Miklos Szeredi)

fuse_abort_conn() does all the work done by fuse_dev_release() and more. "More" consists of:

    end_io_requests(fc);
    wake_up_all(&fc->waitq);
    kill_fasync(&fc->fasync, SIGIO, POLL_IN);

All of which should be no-ops (WARN_ONs added).

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: fold fuse_request_send_nowait() into single caller  (Miklos Szeredi)

And the same with fuse_request_send_nowait_locked().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: check conn_error earlier  (Miklos Szeredi)

fc->conn_error is set once in FUSE_INIT reply and never cleared. Check it in request allocation, there's no sense in doing all the preparation if sending will surely fail.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

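In other words, the check moves from the send path into request allocation; a rough sketch of the shape of the change (the function name is illustrative, and the error values here just follow common fuse conventions, -ENOTCONN / -ECONNREFUSED):

    static struct fuse_req *fuse_get_req_sketch(struct fuse_conn *fc,
                                                unsigned npages)
    {
            if (!fc->connected)
                    return ERR_PTR(-ENOTCONN);
            if (fc->conn_error)
                    return ERR_PTR(-ECONNREFUSED);

            return fuse_request_alloc(npages);
    }
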
2015-07-01  fuse: account as waiting before queuing for background  (Miklos Szeredi)

Move accounting of fc->num_waiting to the point where the request actually starts waiting. This is earlier than the current queue_request() for background requests, since they might be waiting on the fc->bg_queue before being queued on fc->pending.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: reset waiting  (Miklos Szeredi)

Reset req->waiting in fuse_put_request(). This is needed for correct accounting in fc->num_waiting for reserved requests.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

2015-07-01  fuse: fix background request if not connected  (Miklos Szeredi)

request_end() expects fc->num_background and fc->active_background to have been incremented, which is not the case in the fuse_request_send_nowait() failure path. So instead just call the ->end() callback (which is actually set by all callers).

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

2015-07-01  fuse: initialize fc->release before calling it  (Miklos Szeredi)

fc->release is called from fuse_conn_put(), which was used in the error cleanup before fc->release was initialized.

[Jeremiah Mahler <jmmahler@gmail.com>: assign fc->release after calling fuse_conn_init(fc) instead of before.]

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Fixes: a325f9b92273 ("fuse: update fuse_conn_init() and separate out fuse_conn_kill()")
Cc: <stable@vger.kernel.org> #v2.6.31+

2015-07-01  fs/file.c: __fget() and dup2() atomicity rules  (Eric Dumazet)

__fget() does a lockless fetch of the pointer from the descriptor table, attempts to grab a reference and treats "it was already zero" as "it's already gone from the table, we just hadn't seen the store, let's fail". Unfortunately, that breaks the atomicity of dup2() - __fget() might see the old pointer, notice that it's been already dropped and treat that as "it's closed". What we should be getting is either the old file or the new one, depending on whether we come before or after dup2(). Dmitry had the following test failing sometimes:

    #include <errno.h>
    #include <fcntl.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int fd;

    void *Thread(void *x)
    {
            char buf;
            int n = read(fd, &buf, 1);

            if (n != 1)
                    exit(printf("read failed: n=%d errno=%d\n", n, errno));
            return 0;
    }

    int main()
    {
            fd = open("/dev/urandom", O_RDONLY);
            int fd2 = open("/dev/urandom", O_RDONLY);

            if (fd == -1 || fd2 == -1)
                    exit(printf("open failed\n"));

            pthread_t th;
            pthread_create(&th, 0, Thread, 0);

            if (dup2(fd2, fd) == -1)
                    exit(printf("dup2 failed\n"));

            pthread_join(th, 0);

            if (close(fd) == -1)
                    exit(printf("close failed\n"));
            if (close(fd2) == -1)
                    exit(printf("close failed\n"));

            printf("DONE\n");
            return 0;
    }

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

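The fix that follows from this is to treat "refcount already zero" as "retry the table lookup" instead of "the fd is closed"; a condensed sketch of that loop (simplified from fs/file.c, with the fmode mask check and the fdget() variants omitted):

    static struct file *fget_sketch(unsigned int fd)
    {
            struct files_struct *files = current->files;
            struct file *file;

            rcu_read_lock();
    loop:
            file = fcheck_files(files, fd);
            if (file) {
                    /* a zero refcount means we raced with the final fput();
                     * dup2() may already have installed a new file in the
                     * slot, so look the descriptor up again */
                    if (!atomic_long_inc_not_zero(&file->f_count))
                            goto loop;
            }
            rcu_read_unlock();

            return file;
    }
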
2015-07-01  fs/file.c: don't acquire files->file_lock in fd_install()  (Eric Dumazet)

Mateusz Guzik reported: currently, obtaining a new file descriptor results in locking the fdtable twice - once in order to reserve a slot and a second time to fill it. Holding the spinlock in __fd_install() is needed in case a resize is done, or to prevent a resize. Mateusz provided an RFC patch and a micro-benchmark: http://people.redhat.com/~mguzik/pipebench.c

A resize is an unlikely operation in a process lifetime, as the table size is at least doubled at every resize. We can use RCU instead of the spinlock:

- __fd_install() must wait if a resize is in progress;
- the resize must block new __fd_install() callers from starting, and wait for ongoing installs to finish (synchronize_sched());
- the resize should be attempted by a single thread so as not to waste resources.

The rcu_sched variant is used, as __fd_install() and expand_fdtable() run from process context. This gives us a ~30% speedup using pipebench on a dual Intel(R) Xeon(R) CPU E5-2696 v2 @ 2.50GHz.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Mateusz Guzik <mguzik@redhat.com>
Acked-by: Mateusz Guzik <mguzik@redhat.com>
Tested-by: Mateusz Guzik <mguzik@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

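A condensed sketch of the resulting lockless install fast path (simplified from the fs/file.c change; resize_in_progress and resize_wait stand for the fields this change adds to struct files_struct, which expand_fdtable() sets and clears around its synchronize_sched()):

    void fd_install_sketch(struct files_struct *files, unsigned int fd,
                           struct file *file)
    {
            struct fdtable *fdt;

            rcu_read_lock_sched();

            /* a resize is rare: back off and sleep until it completes */
            while (unlikely(files->resize_in_progress)) {
                    rcu_read_unlock_sched();
                    wait_event(files->resize_wait, !files->resize_in_progress);
                    rcu_read_lock_sched();
            }
            /* pairs with the write barrier in expand_fdtable() */
            smp_rmb();
            fdt = rcu_dereference_sched(files->fdt);
            rcu_assign_pointer(fdt->fd[fd], file);
            rcu_read_unlock_sched();
    }
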