aboutsummaryrefslogtreecommitdiff
path: root/security
AgeCommit message (Collapse)Author
2011-09-23UBUNTU: SAUCE: seccomp_filter: new mode with configurable syscall filtersWill Drewry
This change adds a new seccomp mode which specifies the allowed system calls dynamically. When in the new mode (13), all system calls are checked against process-defined filters - first by system call number, then by a filter string. If an entry exists for a given system call and all filter predicates evaluate to true, then the task may proceed. Otherwise, the task is killed. Filter string parsing and evaluation is handled by the ftrace filter engine. Related patches tweak to the perf filter trace and free allowing the calls to be shared. Filters inherit their understanding of types and arguments for each system call from the CONFIG_FTRACE_SYSCALLS subsystem which already populates this information in syscall_metadata associated enter_event (and exit_event) structures. If CONFIG_FTRACE_SYSCALLS is not compiled in, only filter strings of "1" will be allowed. The net result is a process may have its system calls filtered using the ftrace filter engine's inherent understanding of systems calls. The set of filters is specified through the PR_SET_SECCOMP_FILTER argument in prctl(). For example, a filterset for a process, like pdftotext, that should only process read-only input could (roughly) look like: sprintf(rdonly, "flags == %u", O_RDONLY|O_LARGEFILE); type = PR_SECCOMP_FILTER_SYSCALL; prctl(PR_SET_SECCOMP_FILTER, type, __NR_open, rdonly); prctl(PR_SET_SECCOMP_FILTER, type, __NR__llseek, "1"); prctl(PR_SET_SECCOMP_FILTER, type, __NR_brk, "1"); prctl(PR_SET_SECCOMP_FILTER, type, __NR_close, "1"); prctl(PR_SET_SECCOMP_FILTER, type, __NR_exit_group, "1"); prctl(PR_SET_SECCOMP_FILTER, type, __NR_fstat64, "1"); prctl(PR_SET_SECCOMP_FILTER, type, __NR_mmap2, "1"); prctl(PR_SET_SECCOMP_FILTER, type, __NR_munmap, "1"); prctl(PR_SET_SECCOMP_FILTER, type, __NR_read, "1"); prctl(PR_SET_SECCOMP_FILTER, type, __NR_write, "fd == 1 || fd == 2"); prctl(PR_SET_SECCOMP, 13); Subsequent calls to PR_SET_SECCOMP_FILTER for the same system call will be &&'d together to ensure that attack surface may only be reduced: prctl(PR_SET_SECCOMP_FILTER, __NR_write, "fd != 2"); With the earlier example, the active filter becomes: "(fd == 1 || fd == 2) && (fd != 2)" The patch also adds PR_CLEAR_SECCOMP_FILTER and PR_GET_SECCOMP_FILTER. The latter returns the current filter for a system call to userspace: prctl(PR_GET_SECCOMP_FILTER, type, __NR_write, buf, bufsize); while the former clears any filters for a given system call changing it back to a defaulty deny: prctl(PR_CLEAR_SECCOMP_FILTER, type, __NR_write); Note, type may be either PR_SECCOMP_FILTER_EVENT or PR_SECCOMP_FILTER_SYSCALL. This allows for ftrace event ids to be used in lieu of system call numbers. At present, only syscalls:sys_enter_* event id are supported, but this allows for potential future extension of the backend. v11: - Use mode "13" to avoid future overlap; with comment update - Use kref; extra memset; other clean up from msb@chromium.org - Cleaned up Makefile object merging since locally shared symbols are gone v10: - Note that PERF_EVENTS are also needed for ftrace filter engine support. - Removed dependency on ftrace code changes for event_filters (wrapping with perf_events and violating opaqueness for the filter str) - pulled in all the hacks to get access to syscall_metadata and build call objects for filter evaluation. v9: - rebase on to de505e709ffb09a7382ca8e0d8c7dbb171ba5 - disallow PR_SECCOMP_FILTER_EVENT when a compat task is calling as ftrace has no compat_syscalls awareness yet. - return -ENOSYS when filter engine strings are used on a compat call as there are no compat_syscalls events to reference yet. v8: - expand parenthical use during SET_SECCOMP_FILTER to avoid operator precedence undermining attack surface reduction (caught by segoon@openwall.com). Opted to waste bytes on () than reparse to avoid OP_OR precedence overriding extend_filter's intentions. - remove more lingering references to @state - fix incorrect compat mismatch check (anyone up for a Tested-By?) v7: - disallow seccomp_filter inheritance across fork except when seccomp is active. This avoids filters leaking across processes when they are not actively in use but ensure an allowed fork/clone doesn't drop filters. - remove the Mode: print from show as it reflected current and not the filters holder. v6: - clean up minor unnecessary changes (empty lines, ordering, etc) - fix one overly long line - add refcount overflow BUG_ON v5: - drop mutex usage when the task_struct is safe to access directly v4: - move off of RCU to a read/write guarding mutex after paulmck@linux.vnet.ibm.com's feedback (mem leak, rcu fail) - stopped inc/dec refcounts in mutex guard sections - added required changes to init the mutex in INIT_TASK and safely lock around fork inheritance. - added id_type support to the prctl interface to support using ftrace event ids as an alternative to syscall numbers. Behavior is identical otherwise (as per discussion with mingo@elte.hu) v3: - always block execve calls (as per torvalds@linux-foundation.org) - add __NR_seccomp_execve(_32) to seccomp-supporting arches - ensure compat tasks can't reach ftrace:syscalls - dropped new defines for seccomp modes. - two level array instead of hlists (sugg. by olofj@chromium.org) - added generic Kconfig entry that is not connected. - dropped internal seccomp.h - move prctl helpers to seccomp_filter - killed seccomp_t typedef (as per checkpatch) v2: - changed to use the existing syscall number ABI. - prctl changes to minimize parsing in the kernel: prctl(PR_SET_SECCOMP, {0 | 1 | 2 }, { 0 | ON_EXEC }); prctl(PR_SET_SECCOMP_FILTER, __NR_read, "fd == 5"); prctl(PR_CLEAR_SECCOMP_FILTER, __NR_read); prctl(PR_GET_SECCOMP_FILTER, __NR_read, buf, bufsize); - defined PR_SECCOMP_MODE_STRICT and ..._FILTER - added flags - provide a default fail syscall_nr_to_meta in ftrace - provides fallback for unhooked system calls - use -ENOSYS and ERR_PTR(-ENOSYS) for stubbed functionality - added kernel/seccomp.h to share seccomp.c/seccomp_filter.c - moved to a hlist and 4 bit hash of linked lists - added support to operate without CONFIG_FTRACE_SYSCALLS - moved Kconfig support next to SECCOMP - made Kconfig entries dependent on EXPERIMENTAL - added macros to avoid ifdefs from kernel/fork.c - added compat task/filter matching - drop seccomp.h inclusion in sched.h and drop seccomp_t - added Filtering to "show" output - added on_exec state dup'ing when enabling after a fast-path accept. Signed-off-by: Will Drewry <wad@chromium.org> BUG=chromium-os:14496 TEST=built in x86-alex. Out of tree commandline helper test confirms functionality works. Will check in a test into the minijail repo which can be used from autotest. Change-Id: I901595e3399914783739d113a058d83550ddf8e2 Reviewed-on: http://gerrit.chromium.org/gerrit/4814 Reviewed-by: Sonny Rao <sonnyrao@chromium.org> Tested-by: Will Drewry <wad@chromium.org> Signed-off-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
2011-09-23UBUNTU: ubuntu: Yama: if an underlying filesystem provides a permissions op ↵Andy Whitcroft
use it When we are checking permissions on hardlinks we use generic_permissions() to work out if the user actually has read/write permissions and only then allow the link. However where the underlying filesystem supplies a permissions() op there is no guarentee that the inode ownership is actually valid and we must use that op instead. Add a new function mirroring the core fragment from inode_permission using the filesystem specific permissions() op falling back to generic_permissions() when it is not present. With this in place links in overlayfs behave as expected. Signed-off-by: Andy Whitcroft <apw@canonical.com>
2011-09-23UBUNTU: SAUCE: fix yama_ptracer_del lockdep warningMing Lei
yama_ptracer_del can be called in softirq context, so ptracer_relations_lock may be held in softirq context. This patch replaces spin_[un]lock with spin_[un]lock_bh for &ptracer_relations_lock to fix reported lockdep warning and avoid possible dealock. BugLink: http://bugs.launchpad.net/bugs/791019 Signed-off-by: Ming Lei <ming.lei@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2011-09-23UBUNTU: fix ERROR: __devcgroup_inode_permission undefinedLeann Ogasawara
Fix build error: ERROR: "__devcgroup_inode_permission" [ubuntu/aufs/aufs.ko] undefined! Result of upstream commit: commit 482e0cd3dbaa70f2a2bead4b5f2c0d203ef654ba Author: Al Viro <viro@zeniv.linux.org.uk> Date: Sun Jun 19 13:01:04 2011 -0400 devcgroup_inode_permission: take "is it a device node" checks to inlined wrapper Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
2011-09-23UBUNTU: ubuntu: Yama - unconditionally chain to Yama LSMKees Cook
This patch forces the LSM to always chain through the Yama LSM regardless of which LSM is selected as the primary LSM. Signed-off-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
2011-09-23UBUNTU: ubuntu: Yama - add ptrace relationship tracking interfaceKees Cook
Some application suites have external crash handlers that depend on being able to use ptrace to generate crash reports (KDE, Wine, Chromium, Firefox, etc). Since the inferior process has a defined application-specific relationship with the debugger, allow the inferior to express that relationship by declaring who can call PTRACE_ATTACH against it. The inferior can use prctl() with PR_SET_PTRACER to allow a specific PID and its descendants to perform the ptrace instead of only a direct ancestor. Signed-off-by: Kees Cook <kees.cook@canonical.com> --- v2: - kmalloc, spinlock init, and doc typo corrections from Tetsuo Handa. - make sure to replace if possible on add, thanks to Eric Paris. v3: - make sure to use thread group leader when searching for exceptions. v4: - make sure to use thread group leader when creating exceptions. v5: - make sure to use thread group leader when deleting exceptions. Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
2011-09-23UBUNTU: ubuntu: Yama - create task_free security callbackKees Cook
The current LSM interface to cred_free is not sufficient for allowing an LSM to track the life and death of a task. This patch adds the task_free hook so that an LSM can clean up resources on task death. Signed-off-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
2011-09-23UBUNTU: ubuntu: Yama - LSM hooksKees Cook
This adds the Yama Linux Security Module to collect several security features (symlink, hardlink, and ptrace restrictions) that have existed in various forms over the years and have been carried outside the mainline kernel by other Linux distributions like Openwall and grsecurity. Signed-off-by: Kees Cook <kees.cook@canonical.com> --- v2: - add rcu locking, thanks to Tetsuo Handa. - add Documentation/Yama.txt for summary of features. v3: - drop needless cap_ callbacks. - fix usage of get_task_comm. - drop CONFIG_ of sysctl defaults, as recommended by Andi Kleen. - require SYSCTL. v4: - drop accidentally included fs/exec.c chunk. v5: - resend, with ptrace relationship interface v6: - merge with 2.6.39, thanks to Andy Whitcroft Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
2011-09-23UBUNTU: ubuntu: AUFS -- aufs2-standalone.patch aufs2.1-39Andy Whitcroft
Signed-off-by: Andy Whitcroft <apw@canonical.com>
2011-09-23UBUNTU: SAUCE: AppArmor: Fix unpack of network tables.John Johansen
The unpacking of network rules, unpacks 1 more rule than it should. It should drop all rules with network types AF_MAX or greater. Fix suggested by Tetsuo Handa in https://lists.ubuntu.com/archives/kernel-team/2010-November/013327.html Reported-by: Tetsuo Handa <from-ubuntu@I-love.SAKURA.ne.jp> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
2011-09-23AppArmor: compatibility patch for v5 interfaceJohn Johansen
Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
2011-09-23AppArmor: compatibility patch for v5 network controllJohn Johansen
Add compatibility for v5 network rules. Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
2011-09-23UBUNTU: SAUCE: AppArmor: Allow dfa backward compatibility with broken userspaceJohn Johansen
Allow broken Lucid userspace tools to load policy, on Maverick kernel. The fix for http://launchpad.net/bugs/581525 blocks Lucid tools from loading policy, this provides compatibility with Lucid tools without reintroducing the bug. The apparmor_parser when compiling policy could generate invalid dfas that did not have sufficient padding to avoid invalid references, when used by the kernel. The kernels check to verify the next/check table size was broken meaning invalid dfas were being created by userspace and not caught. To remain compatible with old tools that are not fixed, pad the loaded dfas next/check table. The dfa's themselves are valid except for the high padding for potentially invalid transitions (high bounds error), which have a maximimum is 256 entries. So just allocate an extra null filled 256 entries for the next/check tables. This will guarentee all bounds are good and invalid transitions go to the null (0) state. Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
2011-08-04AppArmor: Fix masking of capabilities in complain modeJohn Johansen
commit 25e75dff519bcce2cb35023105e7df51d7b9e691 upstream. AppArmor is masking the capabilities returned by capget against the capabilities mask in the profile. This is wrong, in complain mode the profile has effectively all capabilities, as the profile restrictions are not being enforced, merely tested against to determine if an access is known by the profile. This can result in the wrong behavior of security conscience applications like sshd which examine their capability set, and change their behavior accordingly. In this case because of the masked capability set being returned sshd fails due to DAC checks, even when the profile is in complain mode. Kernels affected: 2.6.36 - 3.0. Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-04AppArmor: Fix reference to rcu protected pointer outside of rcu_read_lockJohn Johansen
commit 04fdc099f9c80c7775dbac388fc97e156d4d47e7 upstream. The pointer returned from tracehook_tracer_task() is only valid inside the rcu_read_lock. However the tracer pointer obtained is being passed to aa_may_ptrace outside of the rcu_read_lock critical section. Mover the aa_may_ptrace test into the rcu_read_lock critical section, to fix this. Kernels affected: 2.6.36 - 3.0 Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-21KEYS: Fix error handling in construct_key_and_link()David Howells
Fix error handling in construct_key_and_link(). If construct_alloc_key() returns an error, it shouldn't pass out through the normal path as the key_serial() called by the kleave() statement will oops when it gets an error code in the pointer: BUG: unable to handle kernel paging request at ffffffffffffff84 IP: [<ffffffff8120b401>] request_key_and_link+0x4d7/0x52f .. Call Trace: [<ffffffff8120b52c>] request_key+0x41/0x75 [<ffffffffa00ed6e8>] cifs_get_spnego_key+0x206/0x226 [cifs] [<ffffffffa00eb0c9>] CIFS_SessSetup+0x511/0x1234 [cifs] [<ffffffffa00d9799>] cifs_setup_session+0x90/0x1ae [cifs] [<ffffffffa00d9c02>] cifs_get_smb_ses+0x34b/0x40f [cifs] [<ffffffffa00d9e05>] cifs_mount+0x13f/0x504 [cifs] [<ffffffffa00caabb>] cifs_do_mount+0xc4/0x672 [cifs] [<ffffffff8113ae8c>] mount_fs+0x69/0x155 [<ffffffff8114ff0e>] vfs_kern_mount+0x63/0xa0 [<ffffffff81150be2>] do_kern_mount+0x4d/0xdf [<ffffffff81152278>] do_mount+0x63c/0x69f [<ffffffff8115255c>] sys_mount+0x88/0xc2 [<ffffffff814fbdc2>] system_call_fastpath+0x16/0x1b Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: devcgroup_inode_permission: take "is it a device node" checks to inlined wrapper fix comment in generic_permission() kill obsolete comment for follow_down() proc_sys_permission() is OK in RCU mode reiserfs_permission() doesn't need to bail out in RCU mode proc_fd_permission() is doesn't need to bail out in RCU mode nilfs2_permission() doesn't need to bail out in RCU mode logfs doesn't need ->permission() at all coda_ioctl_permission() is safe in RCU mode cifs_permission() doesn't need to bail out in RCU mode bad_inode_permission() is safe from RCU mode ubifs: dereferencing an ERR_PTR in ubifs_mount()
2011-06-20devcgroup_inode_permission: take "is it a device node" checks to inlined wrapperAl Viro
inode_permission() calls devcgroup_inode_permission() and almost all such calls are _not_ for device nodes; let's at least keep the common path straight... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-17KEYS/DNS: Fix ____call_usermodehelper() to not lose the session keyringDavid Howells
____call_usermodehelper() now erases any credentials set by the subprocess_inf::init() function. The problem is that commit 17f60a7da150 ("capabilites: allow the application of capability limits to usermode helpers") creates and commits new credentials with prepare_kernel_cred() after the call to the init() function. This wipes all keyrings after umh_keys_init() is called. The best way to deal with this is to put the init() call just prior to the commit_creds() call, and pass the cred pointer to init(). That means that umh_keys_init() and suchlike can modify the credentials _before_ they are published and potentially in use by the rest of the system. This prevents request_key() from working as it is prevented from passing the session keyring it set up with the authorisation token to /sbin/request-key, and so the latter can't assume the authority to instantiate the key. This causes the in-kernel DNS resolver to fail with ENOKEY unconditionally. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Eric Paris <eparis@redhat.com> Tested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-15Merge branch 'for-linus' of git://git.infradead.org/users/eparis/selinux ↵James Morris
into for-linus
2011-06-14SELinux: skip file_name_trans_write() when policy downgraded.Roy.Li
When policy version is less than POLICYDB_VERSION_FILENAME_TRANS, skip file_name_trans_write(). Signed-off-by: Roy.Li <rongqing.li@windriver.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2011-06-14TOMOYO: Fix oops in tomoyo_mount_acl().Tetsuo Handa
In tomoyo_mount_acl() since 2.6.36, kern_path() was called without checking dev_name != NULL. As a result, an unprivileged user can trigger oops by issuing mount(NULL, "/", "ext3", 0, NULL) request. Fix this by checking dev_name != NULL before calling kern_path(dev_name). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: stable@kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2011-06-09AppArmor: Fix sleep in invalid context from task_setrlimitJohn Johansen
Affected kernels 2.6.36 - 3.0 AppArmor may do a GFP_KERNEL memory allocation with task_lock(tsk->group_leader); held when called from security_task_setrlimit. This will only occur when the task's current policy has been replaced, and the task's creds have not been updated before entering the LSM security_task_setrlimit() hook. BUG: sleeping function called from invalid context at mm/slub.c:847 in_atomic(): 1, irqs_disabled(): 0, pid: 1583, name: cupsd 2 locks held by cupsd/1583: #0: (tasklist_lock){.+.+.+}, at: [<ffffffff8104dafa>] do_prlimit+0x61/0x189 #1: (&(&p->alloc_lock)->rlock){+.+.+.}, at: [<ffffffff8104db2d>] do_prlimit+0x94/0x189 Pid: 1583, comm: cupsd Not tainted 3.0.0-rc2-git1 #7 Call Trace: [<ffffffff8102ebf2>] __might_sleep+0x10d/0x112 [<ffffffff810e6f46>] slab_pre_alloc_hook.isra.49+0x2d/0x33 [<ffffffff810e7bc4>] kmem_cache_alloc+0x22/0x132 [<ffffffff8105b6e6>] prepare_creds+0x35/0xe4 [<ffffffff811c0675>] aa_replace_current_profile+0x35/0xb2 [<ffffffff811c4d2d>] aa_current_profile+0x45/0x4c [<ffffffff811c4d4d>] apparmor_task_setrlimit+0x19/0x3a [<ffffffff811beaa5>] security_task_setrlimit+0x11/0x13 [<ffffffff8104db6b>] do_prlimit+0xd2/0x189 [<ffffffff8104dea9>] sys_setrlimit+0x3b/0x48 [<ffffffff814062bb>] system_call_fastpath+0x16/0x1b Signed-off-by: John Johansen <john.johansen@canonical.com> Reported-by: Miles Lane <miles.lane@gmail.com> Cc: stable@kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2011-06-08selinux: simplify and clean up inode_has_perm()Linus Torvalds
This is a rather hot function that is called with a potentially NULL "struct common_audit_data" pointer argument. And in that case it has to provide and initialize its own dummy common_audit_data structure. However, all the _common_ cases already pass it a real audit-data structure, so that uncommon NULL case not only creates a silly run-time test, more importantly it causes that function to have a big stack frame for the dummy variable that isn't even used in the common case! So get rid of that stupid run-time behavior, and make the (few) functions that currently call with a NULL pointer just call a new helper function instead (naturally called inode_has_perm_noapd(), since it has no adp argument). This makes the run-time test be a static code generation issue instead, and allows for a much denser stack since none of the common callers need the dummy structure. And a denser stack not only means less stack space usage, it means better cache behavior. So we have a win-win-win from this simplification: less code executed, smaller stack footprint, and better cache behavior. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-01AppArmor: fix oops in apparmor_setprocattrKees Cook
When invalid parameters are passed to apparmor_setprocattr a NULL deref oops occurs when it tries to record an audit message. This is because it is passing NULL for the profile parameter for aa_audit. But aa_audit now requires that the profile passed is not NULL. Fix this by passing the current profile on the task that is trying to setprocattr. Signed-off-by: Kees Cook <kees@ubuntu.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Cc: stable@kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2011-05-27Merge branch 'docs-move' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rdunlap/linux-docs * 'docs-move' of git://git.kernel.org/pub/scm/linux/kernel/git/rdunlap/linux-docs: Create Documentation/security/, move LSM-, credentials-, and keys-related files from Documentation/ to Documentation/security/, add Documentation/security/00-INDEX, and update all occurrences of Documentation/<moved_file> to Documentation/security/<moved_file>.
2011-05-26selinux: don't pass in NULL avd to avc_has_perm_noauditLinus Torvalds
Right now security_get_user_sids() will pass in a NULL avd pointer to avc_has_perm_noaudit(), which then forces that function to have a dummy entry for that case and just generally test it. Don't do it. The normal callers all pass a real avd pointer, and this helper function is incredibly hot. So don't make avc_has_perm_noaudit() do conditional stuff that isn't needed for the common case. This also avoids some duplicated stack space. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26cgroups: add per-thread subsystem callbacksBen Blum
Add cgroup subsystem callbacks for per-thread attachment in atomic contexts Add can_attach_task(), pre_attach(), and attach_task() as new callbacks for cgroups's subsystem interface. Unlike can_attach and attach, these are for per-thread operations, to be called potentially many times when attaching an entire threadgroup. Also, the old "bool threadgroup" interface is removed, as replaced by this. All subsystems are modified for the new interface - of note is cpuset, which requires from/to nodemasks for attach to be globally scoped (though per-cpuset would work too) to persist from its pre_attach to attach_task and attach. This is a pre-patch for cgroup-procs-writable.patch. Signed-off-by: Ben Blum <bblum@andrew.cmu.edu> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Matt Helsley <matthltc@us.ibm.com> Reviewed-by: Paul Menage <menage@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26selinux: fix case of names with whitespace/multibytes on /selinux/createKohei Kaigai
I submit the patch again, according to patch submission convension. This patch enables to accept percent-encoded object names as forth argument of /selinux/create interface to avoid possible bugs when we give an object name including whitespace or multibutes. E.g) if and when a userspace object manager tries to create a new object named as "resolve.conf but fake", it shall give this name as the forth argument of the /selinux/create. But sscanf() logic in kernel space fetches only the part earlier than the first whitespace. In this case, selinux may unexpectedly answer a default security context configured to "resolve.conf", but it is bug. Although I could not test this patch on named TYPE_TRANSITION rules actually, But debug printk() message seems to me the logic works correctly. I assume the libselinux provides an interface to apply this logic transparently, so nothing shall not be changed from the viewpoint of application. Signed-off-by: KaiGai Kohei <kohei.kaigai@emea.nec.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2011-05-26Merge commit 'v2.6.39' into 20110526Eric Paris
Conflicts: lib/flex_array.c security/selinux/avc.c security/selinux/hooks.c security/selinux/ss/policydb.c security/smack/smack_lsm.c
2011-05-26Set cred->user_ns in key_replace_session_keyringSerge E. Hallyn
Since this cred was not created with copy_creds(), it needs to get initialized. Otherwise use of syscall(__NR_keyctl, KEYCTL_SESSION_TO_PARENT); can lead to a NULL deref. Thanks to Robert for finding this. But introduced by commit 47a150edc2a ("Cache user_ns in struct cred"). Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Reported-by: Robert Święcki <robert@swiecki.net> Cc: David Howells <dhowells@redhat.com> Cc: stable@kernel.org (2.6.39) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-24Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into ↵James Morris
for-linus Conflicts: lib/flex_array.c security/selinux/avc.c security/selinux/hooks.c security/selinux/ss/policydb.c security/smack/smack_lsm.c Manually resolve conflicts. Signed-off-by: James Morris <jmorris@namei.org>
2011-05-24Merge branch 'next' into for-linusJames Morris
2011-05-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) b43: fix comment typo reqest -> request Haavard Skinnemoen has left Atmel cris: typo in mach-fs Makefile Kconfig: fix copy/paste-ism for dell-wmi-aio driver doc: timers-howto: fix a typo ("unsgined") perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course'). treewide: fix a few typos in comments regulator: change debug statement be consistent with the style of the rest Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations" audit: acquire creds selectively to reduce atomic op overhead rtlwifi: don't touch with treewide double semicolon removal treewide: cleanup continuations and remove logging message whitespace ath9k_hw: don't touch with treewide double semicolon removal include/linux/leds-regulator.h: fix syntax in example code tty: fix typo in descripton of tty_termios_encode_baud_rate xtensa: remove obsolete BKL kernel option from defconfig m68k: fix comment typo 'occcured' arch:Kconfig.locks Remove unused config option. treewide: remove extra semicolons ...
2011-05-19selinux: avoid unnecessary avc cache stat hit countLinus Torvalds
There is no point in counting hits - we can calculate it from the number of lookups and misses. This makes the avc statistics a bit smaller, and makes the code generation better too. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-19selinux: de-crapify avc cache stat code generationLinus Torvalds
You can turn off the avc cache stats, but distributions seem to not do that (perhaps because several performance tuning how-to's talk about the avc cache statistics). Which is sad, because the code it generates is truly horrendous, with the statistics update being sandwitched between get_cpu/put_cpu which in turn causes preemption disables etc. We're talking ten+ instructions just to increment a per-cpu variable in some pretty hot code. Fix the craziness by just using 'this_cpu_inc()' instead. Suddenly we only need a single 'inc' instruction to increment the statistics. This is quite noticeable in the incredibly hot avc_has_perm_noaudit() function (which triggers all the statistics by virtue of doing an avc_lookup() call). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-19Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits) Revert "rcu: Decrease memory-barrier usage based on semi-formal proof" net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree() batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu() net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu() net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu() perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu() perf,rcu: convert call_rcu(free_ctx) to kfree_rcu() net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(net_generic_release) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu() security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu() net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu() net,rcu: convert call_rcu(xps_map_release) to kfree_rcu() net,rcu: convert call_rcu(rps_map_release) to kfree_rcu() ...
2011-05-19Create Documentation/security/,Randy Dunlap
move LSM-, credentials-, and keys-related files from Documentation/ to Documentation/security/, add Documentation/security/00-INDEX, and update all occurrences of Documentation/<moved_file> to Documentation/security/<moved_file>.
2011-05-19Merge branch 'master' into nextJames Morris
Conflicts: include/linux/capability.h Manually resolve merge conflict w/ thanks to Stephen Rothwell. Signed-off-by: James Morris <jmorris@namei.org>
2011-05-13Merge branch 'for-linus' of git://git.infradead.org/users/eparis/selinux ↵James Morris
into for-linus
2011-05-12SELinux: delete debugging printks from filename_trans rule processingEric Paris
The filename_trans rule processing has some printk(KERN_ERR ) messages which were intended as debug aids in creating the code but weren't removed before it was submitted. Remove them. Reported-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Eric Paris <eparis@redhat.com>
2011-05-12TOMOYO: Fix wrong domainname validation.Tetsuo Handa
In tomoyo_correct_domain() since 2.6.36, TOMOYO was by error validating "<kernel>" + "/foo/\" + "/bar" when "<kernel> /foo/\* /bar" was given. As a result, legal domainnames like "<kernel> /foo/\* /bar" are rejected. Reported-by: Hayama Yossihiro <yossi@yedo.src.co.jp> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: James Morris <jmorris@namei.org>
2011-05-11SELINUX: add /sys/fs/selinux mount point to put selinuxfsGreg Kroah-Hartman
In the interest of keeping userspace from having to create new root filesystems all the time, let's follow the lead of the other in-kernel filesystems and provide a proper mount point for it in sysfs. For selinuxfs, this mount point should be in /sys/fs/selinux/ Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Eric Paris <eparis@parisplace.org> Cc: Lennart Poettering <mzerqung@0pointer.de> Cc: Daniel J Walsh <dwalsh@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> [include kobject.h - Eric Paris] [use selinuxfs_obj throughout - Eric Paris] Signed-off-by: Eric Paris <eparis@redhat.com>
2011-05-07security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu()Lai Jiangshan
The rcu callback sel_netif_free() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(sel_netif_free). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-07security,rcu: convert call_rcu(user_update_rcu_disposal) to kfree_rcu()Lai Jiangshan
The rcu callback user_update_rcu_disposal() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(user_update_rcu_disposal). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-04Merge branch 'for-linus' of git://git.infradead.org/users/eparis/selinux ↵James Morris
into for-linus
2011-04-28flex_array: flex_array_prealloc takes a number of elements, not an endEric Paris
Change flex_array_prealloc to take the number of elements for which space should be allocated instead of the last (inclusive) element. Users and documentation are updated accordingly. flex_arrays got introduced before they had users. When folks started using it, they ended up needing a different API than was coded up originally. This swaps over to the API that folks apparently need. Based-on-patch-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Eric Paris <eparis@redhat.com> Tested-by: Chris Richards <gizmo@giz-works.com> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Cc: stable@kernel.org [2.6.38+]
2011-04-28SELinux: pass last path component in may_createEric Paris
New inodes are created in a two stage process. We first will compute the label on a new inode in security_inode_create() and check if the operation is allowed. We will then actually re-compute that same label and apply it in security_inode_init_security(). The change to do new label calculations based in part on the last component of the path name only passed the path component information all the way down the security_inode_init_security hook. Down the security_inode_create hook the path information did not make it past may_create. Thus the two calculations came up differently and the permissions check might not actually be against the label that is created. Pass and use the same information in both places to harmonize the calculations and checks. Reported-by: Dominick Grift <domg472@gmail.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2011-04-28SELinux: introduce path_has_permEric Paris
We currently have inode_has_perm and dentry_has_perm. dentry_has_perm just calls inode_has_perm with additional audit data. But dentry_has_perm can take either a dentry or a path. Split those to make the code obvious and to fix the previous problem where I thought dentry_has_perm always had a valid dentry and mnt. Signed-off-by: Eric Paris <eparis@redhat.com>
2011-04-28flex_array: flex_array_prealloc takes a number of elements, not an endEric Paris
Change flex_array_prealloc to take the number of elements for which space should be allocated instead of the last (inclusive) element. Users and documentation are updated accordingly. flex_arrays got introduced before they had users. When folks started using it, they ended up needing a different API than was coded up originally. This swaps over to the API that folks apparently need. Based-on-patch-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Eric Paris <eparis@redhat.com> Tested-by: Chris Richards <gizmo@giz-works.com> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Cc: stable@kernel.org [2.6.38+]