aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2015-04-18Merge remote-tracking branch 'origin/v3.14/topic/overlayfs' into ↵Alex Shi
linux-linaro-lsk-v3.14
2015-04-17vfs: make first argument of dir_context.actor typedMiklos Szeredi
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> (cherry picked from commit ac7576f4b1da8c9c6bc1ae026c2b9e86ae617ba5) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17overlayfs: fix lockdep misannotationMiklos Szeredi
In an overlay directory that shadows an empty lower directory, say /mnt/a/empty102, do: touch /mnt/a/empty102/x unlink /mnt/a/empty102/x rmdir /mnt/a/empty102 It's actually harmless, but needs another level of nesting between I_MUTEX_CHILD and I_MUTEX_NORMAL. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> (cherry picked from commit d1b72cc6d8cb766c802fdc70a5edc2f0ba8a2b57) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17fs: limit filesystem stacking depthMiklos Szeredi
Add a simple read-only counter to super_block that indicates how deep this is in the stack of filesystems. Previously ecryptfs was the only stackable filesystem and it explicitly disallowed multiple layers of itself. Overlayfs, however, can be stacked recursively and also may be stacked on top of ecryptfs or vice versa. To limit the kernel stack usage we must limit the depth of the filesystem stack. Initially the limit is set to 2. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> (cherry picked from commit 69c433ed2ecd2d3264efd7afec4439524b319121) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17rcu: Provide counterpart to rcu_dereference() for non-RCU situationsPaul E. McKenney
Although rcu_dereference() and friends can be used in situations where object lifetimes are being managed by something other than RCU, the resulting sparse and lockdep-RCU noise can be annoying. This commit therefore supplies a lockless_dereference(), which provides the protection for dereferences without the RCU-related debugging noise. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> (cherry picked from commit 54ef6df3f3f1353d99c80c437259d317b2cd1cbd) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17vfs: add RENAME_WHITEOUTMiklos Szeredi
This adds a new RENAME_WHITEOUT flag. This flag makes rename() create a whiteout of source. The whiteout creation is atomic relative to the rename. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> (cherry picked from commit 0d7a855526dd672e114aff2ac22b60fc6f155b08) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17vfs: add whiteout supportMiklos Szeredi
Whiteout isn't actually a new file type, but is represented as a char device (Linus's idea) with 0/0 device number. This has several advantages compared to introducing a new whiteout file type: - no userspace API changes (e.g. trivial to make backups of upper layer filesystem, without losing whiteouts) - no fs image format changes (you can boot an old kernel/fsck without whiteout support and things won't break) - implementation is trivial Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> (cherry picked from commit 787fb6bc9682ec7c05fb5d9561b57100fbc1cc41) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17vfs: export check_sticky()Miklos Szeredi
It's already duplicated in btrfs and about to be used in overlayfs too. Move the sticky bit check to an inline helper and call the out-of-line helper only in the unlikly case of the sticky bit being set. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> (cherry picked from commit cbdf35bcb833bfd00f0925d7a9a33a21f41ea582) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17vfs: introduce clone_private_mount()Miklos Szeredi
Overlayfs needs a private clone of the mount, so create a function for this and export to modules. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> (cherry picked from commit c771d683a62e5d36bc46036f5c07f4f5bb7dda61) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17vfs: export __inode_permission() to modulesMiklos Szeredi
We need to be able to check inode permissions (but not filesystem implied permissions) for stackable filesystems. Expose this interface for overlayfs. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> (cherry picked from commit bd5d08569cc379f8366663a61558a9ce17c2e460) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17vfs: export do_splice_direct() to modulesMiklos Szeredi
Export do_splice_direct() to modules. Needed by overlay filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> (cherry picked from commit 1c118596a7682912106c80007102ce0184c77780) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17vfs: add i_op->dentry_open()Miklos Szeredi
Add a new inode operation i_op->dentry_open(). This is for stacked filesystems that want to return a struct file from a different filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> (cherry picked from commit 4aa7c6346be395bdf776f82bbb2e3e2bc60bdd2b) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17new helper: readlink_copy()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> (cherry picked from commit 5d826c847b34de6415b4f1becd88a57ff619af50) Signed-off-by: Alex Shi <alex.shi@linaro.org> Conflicts: fs/namei.c
2015-04-17vfs: add cross-renameMiklos Szeredi
If flags contain RENAME_EXCHANGE then exchange source and destination files. There's no restriction on the type of the files; e.g. a directory can be exchanged with a symlink. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: J. Bruce Fields <bfields@redhat.com> (cherry picked from commit da1ce0670c14d8380e423a3239e562a1dc15fa9e) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17security: add flags to rename hooksMiklos Szeredi
Add flags to security_path_rename() and security_inode_rename() hooks. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reviewed-by: J. Bruce Fields <bfields@redhat.com> (cherry picked from commit 0b3974eb04c4874e85fa1d4fc70450d12f28611d) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17vfs: add RENAME_NOREPLACE flagMiklos Szeredi
If this flag is specified and the target of the rename exists then the rename syscall fails with EEXIST. The VFS does the existence checking, so it is trivial to enable for most local filesystems. This patch only enables it in ext4. For network filesystems the VFS check is not enough as there may be a race between a remote create and the rename, so these filesystems need to handle this flag in their ->rename() implementations to ensure atomicity. Andy writes about why this is useful: "The trivial answer: to eliminate the race condition from 'mv -i'. Another answer: there's a common pattern to atomically create a file with contents: open a temporary file, write to it, optionally fsync it, close it, then link(2) it to the final name, then unlink the temporary file. The reason to use link(2) is because it won't silently clobber the destination. This is annoying: - It requires an extra system call that shouldn't be necessary. - It doesn't work on (IMO sensible) filesystems that don't support hard links (e.g. vfat). - It's not atomic -- there's an intermediate state where both files exist. - It's ugly. The new rename flag will make this totally sensible. To be fair, on new enough kernels, you can also use O_TMPFILE and linkat to achieve the same thing even more cleanly." Suggested-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reviewed-by: J. Bruce Fields <bfields@redhat.com> (cherry picked from commit 0a7c3937a1f23f8cb5fc77ae01661e9968a51d0c) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-04-17vfs: add renameat2 syscallMiklos Szeredi
Add new renameat2 syscall, which is the same as renameat with an added flags argument. Pass flags to vfs_rename() and to i_op->rename() as well. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reviewed-by: J. Bruce Fields <bfields@redhat.com> (cherry picked from commit 520c8b16505236fc82daa352e6c5e73cd9870cff) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-03-30 Merge tag 'v3.14.37' into linux-linaro-lsk-v3.14Alex Shi
This is the 3.14.37 stable release
2015-03-26workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for ↵Tejun Heo
PREEMPT_NONE commit 8603e1b30027f943cc9c1eef2b291d42c3347af1 upstream. cancel[_delayed]_work_sync() are implemented using __cancel_work_timer() which grabs the PENDING bit using try_to_grab_pending() and then flushes the work item with PENDING set to prevent the on-going execution of the work item from requeueing itself. try_to_grab_pending() can always grab PENDING bit without blocking except when someone else is doing the above flushing during cancelation. In that case, try_to_grab_pending() returns -ENOENT. In this case, __cancel_work_timer() currently invokes flush_work(). The assumption is that the completion of the work item is what the other canceling task would be waiting for too and thus waiting for the same condition and retrying should allow forward progress without excessive busy looping Unfortunately, this doesn't work if preemption is disabled or the latter task has real time priority. Let's say task A just got woken up from flush_work() by the completion of the target work item. If, before task A starts executing, task B gets scheduled and invokes __cancel_work_timer() on the same work item, its try_to_grab_pending() will return -ENOENT as the work item is still being canceled by task A and flush_work() will also immediately return false as the work item is no longer executing. This puts task B in a busy loop possibly preventing task A from executing and clearing the canceling state on the work item leading to a hang. task A task B worker executing work __cancel_work_timer() try_to_grab_pending() set work CANCELING flush_work() block for work completion completion, wakes up A __cancel_work_timer() while (forever) { try_to_grab_pending() -ENOENT as work is being canceled flush_work() false as work is no longer executing } This patch removes the possible hang by updating __cancel_work_timer() to explicitly wait for clearing of CANCELING rather than invoking flush_work() after try_to_grab_pending() fails with -ENOENT. Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com v3: bit_waitqueue() can't be used for work items defined in vmalloc area. Switched to custom wake function which matches the target work item and exclusive wait and wakeup. v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if the target bit waitqueue has wait_bit_queue's on it. Use DEFINE_WAIT_BIT() and __wake_up_bit() instead. Reported by Tomeu Vizoso. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Rabin Vincent <rabin.vincent@axis.com> Cc: Tomeu Vizoso <tomeu.vizoso@gmail.com> Tested-by: Jesper Nilsson <jesper.nilsson@axis.com> Tested-by: Rabin Vincent <rabin.vincent@axis.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-19 Merge tag 'v3.14.36' into linux-linaro-lsk-v3.14Alex Shi
This is the 3.14.36 stable release
2015-03-18target: Fix PR_APTPL_BUF_LEN buffer size limitationNicholas Bellinger
commit f161d4b44d7cc1dc66b53365215227db356378b1 upstream. This patch addresses the original PR_APTPL_BUF_LEN = 8k limitiation for write-out of PR APTPL metadata that Martin has recently been running into. It changes core_scsi3_update_and_write_aptpl() to use vzalloc'ed memory instead of kzalloc, and increases the default hardcoded length to 256k. It also adds logic in core_scsi3_update_and_write_aptpl() to double the original length upon core_scsi3_update_aptpl_buf() failure, and retries until the vzalloc'ed buffer is large enough to accommodate the outgoing APTPL metadata. Reported-by: Martin Svec <martin.svec@zoner.cz> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-18mm: when stealing freepages, also take pages created by splitting buddy pageVlastimil Babka
commit 99592d598eca62bdbbf62b59941c189176dfc614 upstream. When studying page stealing, I noticed some weird looking decisions in try_to_steal_freepages(). The first I assume is a bug (Patch 1), the following two patches were driven by evaluation. Testing was done with stress-highalloc of mmtests, using the mm_page_alloc_extfrag tracepoint and postprocessing to get counts of how often page stealing occurs for individual migratetypes, and what migratetypes are used for fallbacks. Arguably, the worst case of page stealing is when UNMOVABLE allocation steals from MOVABLE pageblock. RECLAIMABLE allocation stealing from MOVABLE allocation is also not ideal, so the goal is to minimize these two cases. The evaluation of v2 wasn't always clear win and Joonsoo questioned the results. Here I used different baseline which includes RFC compaction improvements from [1]. I found that the compaction improvements reduce variability of stress-highalloc, so there's less noise in the data. First, let's look at stress-highalloc configured to do sync compaction, and how these patches reduce page stealing events during the test. First column is after fresh reboot, other two are reiterations of test without reboot. That was all accumulater over 5 re-iterations (so the benchmark was run 5x3 times with 5 fresh restarts). Baseline: 3.19-rc4 3.19-rc4 3.19-rc4 5-nothp-1 5-nothp-2 5-nothp-3 Page alloc extfrag event 10264225 8702233 10244125 Extfrag fragmenting 10263271 8701552 10243473 Extfrag fragmenting for unmovable 13595 17616 15960 Extfrag fragmenting unmovable placed with movable 7989 12193 8447 Extfrag fragmenting for reclaimable 658 1840 1817 Extfrag fragmenting reclaimable placed with movable 558 1677 1679 Extfrag fragmenting for movable 10249018 8682096 10225696 With Patch 1: 3.19-rc4 3.19-rc4 3.19-rc4 6-nothp-1 6-nothp-2 6-nothp-3 Page alloc extfrag event 11834954 9877523 9774860 Extfrag fragmenting 11833993 9876880 9774245 Extfrag fragmenting for unmovable 7342 16129 11712 Extfrag fragmenting unmovable placed with movable 4191 10547 6270 Extfrag fragmenting for reclaimable 373 1130 923 Extfrag fragmenting reclaimable placed with movable 302 906 738 Extfrag fragmenting for movable 11826278 9859621 9761610 With Patch 2: 3.19-rc4 3.19-rc4 3.19-rc4 7-nothp-1 7-nothp-2 7-nothp-3 Page alloc extfrag event 4725990 3668793 3807436 Extfrag fragmenting 4725104 3668252 3806898 Extfrag fragmenting for unmovable 6678 7974 7281 Extfrag fragmenting unmovable placed with movable 2051 3829 4017 Extfrag fragmenting for reclaimable 429 1208 1278 Extfrag fragmenting reclaimable placed with movable 369 976 1034 Extfrag fragmenting for movable 4717997 3659070 3798339 With Patch 3: 3.19-rc4 3.19-rc4 3.19-rc4 8-nothp-1 8-nothp-2 8-nothp-3 Page alloc extfrag event 5016183 4700142 3850633 Extfrag fragmenting 5015325 4699613 3850072 Extfrag fragmenting for unmovable 1312 3154 3088 Extfrag fragmenting unmovable placed with movable 1115 2777 2714 Extfrag fragmenting for reclaimable 437 1193 1097 Extfrag fragmenting reclaimable placed with movable 330 969 879 Extfrag fragmenting for movable 5013576 4695266 3845887 In v2 we've seen apparent regression with Patch 1 for unmovable events, this is now gone, suggesting it was indeed noise. Here, each patch improves the situation for unmovable events. Reclaimable is improved by patch 1 and then either the same modulo noise, or perhaps sligtly worse - a small price for unmovable improvements, IMHO. The number of movable allocations falling back to other migratetypes is most noisy, but it's reduced to half at Patch 2 nevertheless. These are least critical as compaction can move them around. If we look at success rates, the patches don't affect them, that didn't change. Baseline: 3.19-rc4 3.19-rc4 3.19-rc4 5-nothp-1 5-nothp-2 5-nothp-3 Success 1 Min 49.00 ( 0.00%) 42.00 ( 14.29%) 41.00 ( 16.33%) Success 1 Mean 51.00 ( 0.00%) 45.00 ( 11.76%) 42.60 ( 16.47%) Success 1 Max 55.00 ( 0.00%) 51.00 ( 7.27%) 46.00 ( 16.36%) Success 2 Min 53.00 ( 0.00%) 47.00 ( 11.32%) 44.00 ( 16.98%) Success 2 Mean 59.60 ( 0.00%) 50.80 ( 14.77%) 48.20 ( 19.13%) Success 2 Max 64.00 ( 0.00%) 56.00 ( 12.50%) 52.00 ( 18.75%) Success 3 Min 84.00 ( 0.00%) 82.00 ( 2.38%) 78.00 ( 7.14%) Success 3 Mean 85.60 ( 0.00%) 82.80 ( 3.27%) 79.40 ( 7.24%) Success 3 Max 86.00 ( 0.00%) 83.00 ( 3.49%) 80.00 ( 6.98%) Patch 1: 3.19-rc4 3.19-rc4 3.19-rc4 6-nothp-1 6-nothp-2 6-nothp-3 Success 1 Min 49.00 ( 0.00%) 44.00 ( 10.20%) 44.00 ( 10.20%) Success 1 Mean 51.80 ( 0.00%) 46.00 ( 11.20%) 45.80 ( 11.58%) Success 1 Max 54.00 ( 0.00%) 49.00 ( 9.26%) 49.00 ( 9.26%) Success 2 Min 58.00 ( 0.00%) 49.00 ( 15.52%) 48.00 ( 17.24%) Success 2 Mean 60.40 ( 0.00%) 51.80 ( 14.24%) 50.80 ( 15.89%) Success 2 Max 63.00 ( 0.00%) 54.00 ( 14.29%) 55.00 ( 12.70%) Success 3 Min 84.00 ( 0.00%) 81.00 ( 3.57%) 79.00 ( 5.95%) Success 3 Mean 85.00 ( 0.00%) 81.60 ( 4.00%) 79.80 ( 6.12%) Success 3 Max 86.00 ( 0.00%) 82.00 ( 4.65%) 82.00 ( 4.65%) Patch 2: 3.19-rc4 3.19-rc4 3.19-rc4 7-nothp-1 7-nothp-2 7-nothp-3 Success 1 Min 50.00 ( 0.00%) 44.00 ( 12.00%) 39.00 ( 22.00%) Success 1 Mean 52.80 ( 0.00%) 45.60 ( 13.64%) 42.40 ( 19.70%) Success 1 Max 55.00 ( 0.00%) 46.00 ( 16.36%) 47.00 ( 14.55%) Success 2 Min 52.00 ( 0.00%) 48.00 ( 7.69%) 45.00 ( 13.46%) Success 2 Mean 53.40 ( 0.00%) 49.80 ( 6.74%) 48.80 ( 8.61%) Success 2 Max 57.00 ( 0.00%) 52.00 ( 8.77%) 52.00 ( 8.77%) Success 3 Min 84.00 ( 0.00%) 81.00 ( 3.57%) 79.00 ( 5.95%) Success 3 Mean 85.00 ( 0.00%) 82.40 ( 3.06%) 79.60 ( 6.35%) Success 3 Max 86.00 ( 0.00%) 83.00 ( 3.49%) 80.00 ( 6.98%) Patch 3: 3.19-rc4 3.19-rc4 3.19-rc4 8-nothp-1 8-nothp-2 8-nothp-3 Success 1 Min 46.00 ( 0.00%) 44.00 ( 4.35%) 42.00 ( 8.70%) Success 1 Mean 50.20 ( 0.00%) 45.60 ( 9.16%) 44.00 ( 12.35%) Success 1 Max 52.00 ( 0.00%) 47.00 ( 9.62%) 47.00 ( 9.62%) Success 2 Min 53.00 ( 0.00%) 49.00 ( 7.55%) 48.00 ( 9.43%) Success 2 Mean 55.80 ( 0.00%) 50.60 ( 9.32%) 49.00 ( 12.19%) Success 2 Max 59.00 ( 0.00%) 52.00 ( 11.86%) 51.00 ( 13.56%) Success 3 Min 84.00 ( 0.00%) 80.00 ( 4.76%) 79.00 ( 5.95%) Success 3 Mean 85.40 ( 0.00%) 81.60 ( 4.45%) 80.40 ( 5.85%) Success 3 Max 87.00 ( 0.00%) 83.00 ( 4.60%) 82.00 ( 5.75%) While there's no improvement here, I consider reduced fragmentation events to be worth on its own. Patch 2 also seems to reduce scanning for free pages, and migrations in compaction, suggesting it has somewhat less work to do: Patch 1: Compaction stalls 4153 3959 3978 Compaction success 1523 1441 1446 Compaction failures 2630 2517 2531 Page migrate success 4600827 4943120 5104348 Page migrate failure 19763 16656 17806 Compaction pages isolated 9597640 10305617 10653541 Compaction migrate scanned 77828948 86533283 87137064 Compaction free scanned 517758295 521312840 521462251 Compaction cost 5503 5932 6110 Patch 2: Compaction stalls 3800 3450 3518 Compaction success 1421 1316 1317 Compaction failures 2379 2134 2201 Page migrate success 4160421 4502708 4752148 Page migrate failure 19705 14340 14911 Compaction pages isolated 8731983 9382374 9910043 Compaction migrate scanned 98362797 96349194 98609686 Compaction free scanned 496512560 469502017 480442545 Compaction cost 5173 5526 5811 As with v2, /proc/pagetypeinfo appears unaffected with respect to numbers of unmovable and reclaimable pageblocks. Configuring the benchmark to allocate like THP page fault (i.e. no sync compaction) gives much noisier results for iterations 2 and 3 after reboot. This is not so surprising given how [1] offers lower improvements in this scenario due to less restarts after deferred compaction which would change compaction pivot. Baseline: 3.19-rc4 3.19-rc4 3.19-rc4 5-thp-1 5-thp-2 5-thp-3 Page alloc extfrag event 8148965 6227815 6646741 Extfrag fragmenting 8147872 6227130 6646117 Extfrag fragmenting for unmovable 10324 12942 15975 Extfrag fragmenting unmovable placed with movable 5972 8495 10907 Extfrag fragmenting for reclaimable 601 1707 2210 Extfrag fragmenting reclaimable placed with movable 520 1570 2000 Extfrag fragmenting for movable 8136947 6212481 6627932 Patch 1: 3.19-rc4 3.19-rc4 3.19-rc4 6-thp-1 6-thp-2 6-thp-3 Page alloc extfrag event 8345457 7574471 7020419 Extfrag fragmenting 8343546 7573777 7019718 Extfrag fragmenting for unmovable 10256 18535 30716 Extfrag fragmenting unmovable placed with movable 6893 11726 22181 Extfrag fragmenting for reclaimable 465 1208 1023 Extfrag fragmenting reclaimable placed with movable 353 996 843 Extfrag fragmenting for movable 8332825 7554034 6987979 Patch 2: 3.19-rc4 3.19-rc4 3.19-rc4 7-thp-1 7-thp-2 7-thp-3 Page alloc extfrag event 3512847 3020756 2891625 Extfrag fragmenting 3511940 3020185 2891059 Extfrag fragmenting for unmovable 9017 6892 6191 Extfrag fragmenting unmovable placed with movable 1524 3053 2435 Extfrag fragmenting for reclaimable 445 1081 1160 Extfrag fragmenting reclaimable placed with movable 375 918 986 Extfrag fragmenting for movable 3502478 3012212 2883708 Patch 3: 3.19-rc4 3.19-rc4 3.19-rc4 8-thp-1 8-thp-2 8-thp-3 Page alloc extfrag event 3181699 3082881 2674164 Extfrag fragmenting 3180812 3082303 2673611 Extfrag fragmenting for unmovable 1201 4031 4040 Extfrag fragmenting unmovable placed with movable 974 3611 3645 Extfrag fragmenting for reclaimable 478 1165 1294 Extfrag fragmenting reclaimable placed with movable 387 985 1030 Extfrag fragmenting for movable 3179133 3077107 2668277 The improvements for first iteration are clear, the rest is much noisier and can appear like regression for Patch 1. Anyway, patch 2 rectifies it. Allocation success rates are again unaffected so there's no point in making this e-mail any longer. [1] http://marc.info/?l=linux-mm&m=142166196321125&w=2 This patch (of 3): When __rmqueue_fallback() is called to allocate a page of order X, it will find a page of order Y >= X of a fallback migratetype, which is different from the desired migratetype. With the help of try_to_steal_freepages(), it may change the migratetype (to the desired one) also of: 1) all currently free pages in the pageblock containing the fallback page 2) the fallback pageblock itself 3) buddy pages created by splitting the fallback page (when Y > X) These decisions take the order Y into account, as well as the desired migratetype, with the goal of preventing multiple fallback allocations that could e.g. distribute UNMOVABLE allocations among multiple pageblocks. Originally, decision for 1) has implied the decision for 3). Commit 47118af076f6 ("mm: mmzone: MIGRATE_CMA migration type added") changed that (probably unintentionally) so that the buddy pages in case 3) are always changed to the desired migratetype, except for CMA pageblocks. Commit fef903efcf0c ("mm/page_allo.c: restructure free-page stealing code and fix a bug") did some refactoring and added a comment that the case of 3) is intended. Commit 0cbef29a7821 ("mm: __rmqueue_fallback() should respect pageblock type") removed the comment and tried to restore the original behavior where 1) implies 3), but due to the previous refactoring, the result is instead that only 2) implies 3) - and the conditions for 2) are less frequently met than conditions for 1). This may increase fragmentation in situations where the code decides to steal all free pages from the pageblock (case 1)), but then gives back the buddy pages produced by splitting. This patch restores the original intended logic where 1) implies 3). During testing with stress-highalloc from mmtests, this has shown to decrease the number of events where UNMOVABLE and RECLAIMABLE allocations steal from MOVABLE pageblocks, which can lead to permanent fragmentation. In some cases it has increased the number of events when MOVABLE allocations steal from UNMOVABLE or RECLAIMABLE pageblocks, but these are fixable by sync compaction and thus less harmful. Note that evaluation has shown that the behavior introduced by 47118af076f6 for buddy pages in case 3) is actually even better than the original logic, so the following patch will introduce it properly once again. For stable backports of this patch it makes thus sense to only fix versions containing 0cbef29a7821. [iamjoonsoo.kim@lge.com: tracepoint fix] Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-10 Merge tag 'v3.14.35' into linux-linaro-lsk-v3.14Alex Shi
This is the 3.14.35 stable release
2015-03-06usb: core: buffer: smallest buffer should start at ARCH_DMA_MINALIGNSebastian Andrzej Siewior
commit 5efd2ea8c9f4f12916ffc8ba636792ce052f6911 upstream. the following error pops up during "testusb -a -t 10" | musb-hdrc musb-hdrc.1.auto: dma_pool_free buffer-128, f134e000/be842000 (bad dma) hcd_buffer_create() creates a few buffers, the smallest has 32 bytes of size. ARCH_KMALLOC_MINALIGN is set to 64 bytes. This combo results in hcd_buffer_alloc() returning memory which is 32 bytes aligned and it might by identified by buffer_offset() as another buffer. This means the buffer which is on a 32 byte boundary will not get freed, instead it tries to free another buffer with the error message. This patch fixes the issue by creating the smallest DMA buffer with the size of ARCH_KMALLOC_MINALIGN (or 32 in case ARCH_KMALLOC_MINALIGN is smaller). This might be 32, 64 or even 128 bytes. The next three pools will have the size 128, 512 and 2048. In case the smallest pool is 128 bytes then we have only three pools instead of four (and zero the first entry in the array). The last pool size is always 2048 bytes which is the assumed PAGE_SIZE / 2 of 4096. I doubt it makes sense to continue using PAGE_SIZE / 2 where we would end up with 8KiB buffer in case we have 16KiB pages. Instead I think it makes sense to have a common size(s) and extend them if there is need to. There is a BUILD_BUG_ON() now in case someone has a minalign of more than 128 bytes. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-06fsnotify: fix handling of renames in auditJan Kara
commit 6ee8e25fc3e916193bce4ebb43d5439e1e2144ab upstream. Commit e9fd702a58c4 ("audit: convert audit watches to use fsnotify instead of inotify") broke handling of renames in audit. Audit code wants to update inode number of an inode corresponding to watched name in a directory. When something gets renamed into a directory to a watched name, inotify previously passed moved inode to audit code however new fsnotify code passes directory inode where the change happened. That confuses audit and it starts watching parent directory instead of a file in a directory. This can be observed for example by doing: cd /tmp touch foo bar auditctl -w /tmp/foo touch foo mv bar foo touch foo In audit log we see events like: type=CONFIG_CHANGE msg=audit(1423563584.155:90): auid=1000 ses=2 op="updated rules" path="/tmp/foo" key=(null) list=4 res=1 ... type=PATH msg=audit(1423563584.155:91): item=2 name="bar" inode=1046884 dev=08:0 2 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=DELETE type=PATH msg=audit(1423563584.155:91): item=3 name="foo" inode=1046842 dev=08:0 2 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=DELETE type=PATH msg=audit(1423563584.155:91): item=4 name="foo" inode=1046884 dev=08:0 2 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=CREATE ... and that's it - we see event for the first touch after creating the audit rule, we see events for rename but we don't see any event for the last touch. However we start seeing events for unrelated stuff happening in /tmp. Fix the problem by passing moved inode as data in the FS_MOVED_FROM and FS_MOVED_TO events instead of the directory where the change happens. This doesn't introduce any new problems because noone besides audit_watch.c cares about the passed value: fs/notify/fanotify/fanotify.c cares only about FSNOTIFY_EVENT_PATH events. fs/notify/dnotify/dnotify.c doesn't care about passed 'data' value at all. fs/notify/inotify/inotify_fsnotify.c uses 'data' only for FSNOTIFY_EVENT_PATH. kernel/audit_tree.c doesn't care about passed 'data' at all. kernel/audit_watch.c expects moved inode as 'data'. Fixes: e9fd702a58c49db ("audit: convert audit watches to use fsnotify instead of inotify") Signed-off-by: Jan Kara <jack@suse.cz> Cc: Paul Moore <paul@paul-moore.com> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-03Merge tag 'v3.14.34' into linux-linaro-lsk-v3.14Mark Brown
This is the 3.14.34 stable release Conflicts: arch/arm64/kernel/setup.c
2015-03-03Merge remote-tracking branch 'lsk/v3.14/topic/coresight' into ↵Mark Brown
linux-linaro-lsk-v3.14
2015-03-02coresight: remove the unnecessary function coresight_is_bit_set()Kaixu Xia
This function coresight_is_bit_set() isn't called, so we should remove it. Signed-off-by: Kaixu Xia <xiakaixu@huawei.com> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit c4546f246636ccf4cda092bcfcafcb5f5f752ec7) Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2015-03-02coresight: Fixing wrong #ifdef/#endif placementMathieu Poirier
Fixing problem reported by: https://lkml.org/lkml/2015/1/6/86 The #ifdef/#endif is wrong and prevents the stub of function of_get_coresight_platform_data() from being visible when CONFIG_OF is not defined. Moving CONFIG_OF condition out of CONFIG_CORESIGHT, making them both independent. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit c61c4b5dd2c6b5dbf0f7e299db1e8411ef590f5c) Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2015-03-02coresight: remove the unused macro CORESIGHT_DEBUGFS_ENTRYXia Kaixu
Debugfs isn't used for coresight configuration, so the macro CORESIGHT_DEBUGFS_ENTRY is unnecessary, just remove it. Signed-off-by: Xia Kaixu <xiakaixu@huawei.com> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit f379984f849d729bd2eb076b633200b1c040611e) Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2015-02-26ipv4: tcp: get rid of ugly unicast_sockEric Dumazet
[ Upstream commit bdbbb8527b6f6a358dbcb70dac247034d665b8e4 ] In commit be9f4a44e7d41 ("ipv4: tcp: remove per net tcp_sock") I tried to address contention on a socket lock, but the solution I chose was horrible : commit 3a7c384ffd57e ("ipv4: tcp: unicast_sock should not land outside of TCP stack") addressed a selinux regression. commit 0980e56e506b ("ipv4: tcp: set unicast_sock uc_ttl to -1") took care of another regression. commit b5ec8eeac46 ("ipv4: fix ip_send_skb()") fixed another regression. commit 811230cd85 ("tcp: ipv4: initialize unicast_sock sk_pacing_rate") was another shot in the dark. Really, just use a proper socket per cpu, and remove the skb_orphan() call, to re-enable flow control. This solves a serious problem with FQ packet scheduler when used in hostile environments, as we do not want to allocate a flow structure for every RST packet sent in response to a spoofed packet. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-02-26ipv4: try to cache dst_entries which would cause a redirectHannes Frederic Sowa
[ Upstream commit df4d92549f23e1c037e83323aff58a21b3de7fe0 ] Not caching dst_entries which cause redirects could be exploited by hosts on the same subnet, causing a severe DoS attack. This effect aggravated since commit f88649721268999 ("ipv4: fix dst race in sk_dst_get()"). Lookups causing redirects will be allocated with DST_NOCACHE set which will force dst_release to free them via RCU. Unfortunately waiting for RCU grace period just takes too long, we can end up with >1M dst_entries waiting to be released and the system will run OOM. rcuos threads cannot catch up under high softirq load. Attaching the flag to emit a redirect later on to the specific skb allows us to cache those dst_entries thus reducing the pressure on allocation and deallocation. This issue was discovered by Marcelo Leitner. Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: Marcelo Leitner <mleitner@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-02-21Merge branch 'lsk/v3.14/topic/of' into linux-linaro-lsk-v3.14lsk-v3.14-15.02lsk-v3.14-14.02Mark Brown
Conflicts: drivers/of/Kconfig drivers/of/Makefile drivers/of/address.c drivers/of/selftest.c include/linux/of.h
2015-02-20of/overlay: Introduce DT overlay supportPantelis Antoniou
Overlays are a method to dynamically modify part of the kernel's device tree with dynamically loaded data. Add the core functionality to parse, apply and remove an overlay changeset. The core functionality takes care of managing the overlay data format and performing the add and remove. Drivers are expected to use the overlay functionality to support custom expansion busses commonly found on consumer development boards like the BeagleBone or Raspberry Pi. The overlay code uses CONFIG_OF_DYNAMIC changesets to perform the low level work of modifying the devicetree. Documentation about internal and APIs is provided in Documentation/devicetree/overlay-notes.txt v2: - Switch from __of_node_alloc() to __of_node_dup() - Documentation fixups - Remove 2-pass processing of properties - Remove separate ov_lock; just use the DT mutex. v1: - Drop delete capability using '-' prefix. The '-' prefixed names are valid properties and nodes and there is no need for it just yet. - Do not update special properties - name & phandle ones. - Change order of node attachment, so that the special property update works. Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Signed-off-by: Grant Likely <grant.likely@linaro.org> (cherry picked from commit 7518b5890d8ac366faa2326ce2356ef6392ce63d) Signed-off-by: Mark Brown <broonie@kernel.org> Conflicts: drivers/of/Makefile
2015-02-20of/reconfig: Add OF_DYNAMIC notifier for platform_bus_typePantelis Antoniou
Add OF notifier handler needed for creating/destroying platform devices according to dynamic runtime changes in the DT live tree. Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Signed-off-by: Grant Likely <grant.likely@linaro.org> (cherry picked from commit 801d728c10db4b28e01590b46bf1f0df930760cc) Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-20of/reconfig: Always use the same structure for notifiersGrant Likely
The OF_RECONFIG notifier callback uses a different structure depending on whether it is a node change or a property change. This is silly, and not very safe. Rework the code to use the same data structure regardless of the type of notifier. Signed-off-by: Grant Likely <grant.likely@linaro.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Cc: <linuxppc-dev@lists.ozlabs.org> (cherry picked from commit f5242e5a883bf1c1aba6bfd87b85e7dda0e62191) Signed-off-by: Mark Brown <broonie@kernel.org> Conflicts: arch/powerpc/platforms/pseries/hotplug-memory.c include/linux/of.h
2015-02-20of/reconfig: Add of_reconfig_get_state_change() of notifier helper.Pantelis Antoniou
Introduce of_reconfig_get_state_change() which allows an of notifier to query about device state changes. Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Signed-off-by: Grant Likely <grant.likely@linaro.org> (cherry picked from commit b53a2340d0d30468b7315992ba77fe188c3bc5c8) Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-20of/platform: Fix of_platform_device_destroy iteration of devicesGrant Likely
of_platform_destroy does not work properly, since the tree population test was iterating on all devices having as its parent the given platform device. The check was intended to check whether any other platform or amba devices created by of_platform_populate were still populated, but instead checked for every kind of device. This is wrong, since platform devices typically create a subsystem regular device and set themselves as parents. Instead, go ahead and call the unregister functions for any devices created with of_platform_populate. The driver core will take care of unbinding drivers, and drivers are responsible for getting rid of any child devices that weren't created by of_platform_populate. Signed-off-by: Grant Likely <grant.likely@linaro.org> Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com> (cherry picked from commit 75f353b61342b5847c7f6d8499fd6301dce09845) Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-20of: Keep track of populated platform devicesPawel Moll
In "Device Tree powered" systems, platform devices are usually massively populated with of_platform_populate() call, executed at some level of initcalls, either by generic architecture or by platform-specific code. There are situations though where certain devices must be created (and bound with drivers) before all the others. This presents a challenge, as devices created explicitly would be created again by of_platform_populate(). This patch tries to solve that issue in a generic way, adding a "populated" flag for a DT node description. Subsequent of_platform_populate() will skip such nodes (and its children) in a similar way to the non-available ones. This patch also adds of_platform_depopulate() as an operation complementary to the _populate() one. It removes a platform or an amba device populated from the Device Tree, together with its all children (leaving, however, devices without associated of_node untouched) clearing the "populated" flag on the way. Signed-off-by: Pawel Moll <pawel.moll@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Grant Likely <grant.likely@linaro.org> (cherry picked from commit c6e126de43e7d4abfd6cf796b40589db3a046167) Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-20of: kill off of_can_translate_addressRob Herring
of_can_translate_address only checks some conditions for address translation, but does not check other conditions like having range properties. The checks it does do are redundant with __of_address_translate. The only difference is printing a message or not. Since we only have a single caller that does the full translation anyway, just remove of_can_translate_address and quiet the error message. Cc: Grant Likely <grant.likely@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org> Tested-by: Frank Rowand <frank.rowand@sonymobile.com> Reviewed-by: Frank Rowand <frank.rowand@sonymobile.com> (cherry picked from commit d9c6866be8a145e32da616d8dcbae806032d75b5) Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-20of: Introduce Device Tree resolve support.Pantelis Antoniou
Introduce support for dynamic device tree resolution. Using it, it is possible to prepare a device tree that's been loaded on runtime to be modified and inserted at the kernel live tree. Export of of_resolve and bug fix of double free by Guenter Roeck <groeck@juniper.net> Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com> [grant.likely: Don't need to select CONFIG_OF_DYNAMIC and CONFIG_OF_DEVICE] [grant.likely: Don't need to depend on OF or !SPARC] [grant.likely: Factor out duplicate code blocks into single function] Signed-off-by: Grant Likely <grant.likely@linaro.org> (cherry picked from commit 7941b27b16e3282f6ec8817e36492f1deec753a7) Signed-off-by: Mark Brown <broonie@kernel.org> Conflicts: drivers/of/Kconfig drivers/of/Makefile
2015-02-20of: Transactional DT support.Pantelis Antoniou
Introducing DT transactional support. A DT transaction is a method which allows one to apply changes in the live tree, in such a way that either the full set of changes take effect, or the state of the tree can be rolled-back to the state it was before it was attempted. An applied transaction can be rolled-back at any time. Documentation is in Documentation/devicetree/changesets.txt Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com> [glikely: Removed device notifiers and reworked to be more consistent] Signed-off-by: Grant Likely <grant.likely@linaro.org> (cherry picked from commit 201c910bd6898d81d4ac6685d0f421b7e10f3c5d) Signed-off-by: Mark Brown <broonie@kernel.org> Conflicts: include/linux/of.h
2015-02-19of: Reorder device tree changes and notifiersGrant Likely
Currently, devicetree reconfig notifiers get emitted before the change is applied to the tree, but that behaviour is problematic if the receiver wants the determine the new state of the tree. The current users don't care, but the changeset code to follow will be making multiple changes at once. Reorder notifiers to get emitted after the change has been applied to the tree so that callbacks see the new tree state. At the same time, fixup the existing callbacks to expect the new order. There are a few callbacks that compare the old and new values of a changed property. Put both property pointers into the of_prop_reconfig structure. The current notifiers also allow the notifier callback to fail and cancel the change to the tree, but that feature isn't actually used. It really isn't valid to ignore a tree modification provided by firmware anyway, so remove the ability to cancel a change to the tree. Signed-off-by: Grant Likely <grant.likely@linaro.org> Cc: Nathan Fontenot <nfont@austin.ibm.com> (cherry picked from commit 259092a35c7e11f1d4616b0f5b3ba7b851fe4fa6) Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-19of: Make devicetree sysfs update functions consistent.Grant Likely
All of the DT modification functions are split into two parts, the first part manipulates the DT data structure, and the second part updates sysfs, but the code isn't very consistent about how the second half is called. They don't all enforce the same rules about when it is valid to update sysfs, and there isn't any clarity on locking. The transactional DT modification feature that is coming also needs access to these functions so that it can perform all the structure changes together, and then all the sysfs updates as a second stage instead of doing each one at a time. Fix up the second have by creating a separate __of_*_sysfs() function for each of the helpers. The new functions have consistent naming (ie. of_node_add() becomes __of_attach_node_sysfs()) and all of them now defer if of_init hasn't been called yet. Callers of the new functions must hold the of_mutex to ensure there are no race conditions with of_init(). The mutex ensures that there will only ever be one writer to the tree at any given time. There can still be any number of readers and the raw_spin_lock is still used to make sure access to the data structure is still consistent. Finally, put the function prototypes into of_private.h so they are accessible to the transaction code. Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com> [grant.likely: Changed suffix from _post to _sysfs to match existing code] [grant.likely: Reorganized to eliminate trivial wrappers] Signed-off-by: Grant Likely <grant.likely@linaro.org> (cherry picked from commit 8a2b22a2595bf89d4396530edf8388159fad9d83) Signed-off-by: Mark Brown <broonie@kernel.org> Conflicts: drivers/of/base.c
2015-02-19of: Introduce device tree node flag helpers.Pantelis Antoniou
Helper functions for working with device node flags. Signed-off-by: Pantelis Antoniou <panto@antoniou-consulting.com> Signed-off-by: Grant Likely <grant.likely@linaro.org> (cherry picked from commit 588453c69dace39351129a038dd2796608f74bb3) Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-19of: device_node kobject lifecycle fixesPantelis Antoniou
After the move to having device nodes be proper kobjects the lifecycle of the node needs to be controlled better. At first convert of_add_node() in the unflattened functions to of_init_node() which initializes the kobject so that of_node_get/put work correctly even before of_init is called. Afterwards introduce of_node_is_initialized & of_node_is_attached that query the underlying kobject about the state (attached means kobj is visible in sysfs) Using that make sure the lifecycle of the tree is correct at all times. Signed-off-by: Pantelis Antoniou <panto@antoniou-consulting.com> [grant.likely: moved of_node_init() calls, fixed up locking, and dropped __of_populate() hunks] Signed-off-by: Grant Likely <grant.likely@linaro.org> (cherry picked from commit 0829f6d1f69e4f2fae4062987ae6531a9af1a2e3) Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-19of: remove /proc/device-treeGrant Likely
The same data is now available in sysfs, so we can remove the code that exports it in /proc and replace it with a symlink to the sysfs version. Tested on versatile qemu model and mpc5200 eval board. More testing would be appreciated. v5: Fixed up conflicts with mainline changes Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Pantelis Antoniou <panto@antoniou-consulting.com> (cherry picked from commit 8357041a69b368991d1b04d9f1d297f8d71e1314) Signed-off-by: Mark Brown <broonie@kernel.org> Conflicts: drivers/of/base.c
2015-02-19of: add functions to count number of elements in a propertyHeiko Stuebner
The need to know the number of array elements in a property is a common pattern. To prevent duplication of open-coded implementations add a helper static function that also centralises strict sanity checking and DTB format details, as well as a set of wrapper functions for u8, u16, u32 and u64. Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Heiko Stuebner <heiko.stuebner@bqreaders.com> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Grant Likely <grant.likely@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from commit ad54a0cfbeb4bd4033d09017557ccbc423f9d5ff) Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-19of: Make device nodes kobjects so they show up in sysfsGrant Likely
Device tree nodes are already treated as objects, and we already want to expose them to userspace which is done using the /proc filesystem today. Right now the kernel has to do a lot of work to keep the /proc view in sync with the in-kernel representation. If device_nodes are switched to be kobjects then the device tree code can be a whole lot simpler. It also turns out that switching to using /sysfs from /proc results in smaller code and data size, and the userspace ABI won't change if /proc/device-tree symlinks to /sys/firmware/devicetree/base. v7: Add missing sysfs_bin_attr_init() v6: Add __of_add_property() early init fixes from Pantelis v5: Rename firmware/ofw to firmware/devicetree Fix updating property values in sysfs v4: Fixed build error on Powerpc Fixed handling of dynamic nodes on powerpc v3: Fixed handling of duplicate attribute and child node names v2: switch to using sysfs bin_attributes which solve the problem of reporting incorrect property size. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Tested-by: Sascha Hauer <s.hauer@pengutronix.de> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Pantelis Antoniou <panto@antoniou-consulting.com> (cherry picked from commit 75b57ecf9d1d1e17d099ab13b8f48e6e038676be) Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-11ALSA: ak411x: Fix stall in work callbackTakashi Iwai
commit 4161b4505f1690358ac0a9ee59845a7887336b21 upstream. When ak4114 work calls its callback and the callback invokes ak4114_reinit(), it stalls due to flush_delayed_work(). For avoiding this, control the reentrance by introducing a refcount. Also flush_delayed_work() is replaced with cancel_delayed_work_sync(). The exactly same bug is present in ak4113.c and fixed as well. Reported-by: Pavel Hofman <pavel.hofman@ivitera.com> Acked-by: Jaroslav Kysela <perex@perex.cz> Tested-by: Pavel Hofman <pavel.hofman@ivitera.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>