aboutsummaryrefslogtreecommitdiff
path: root/net/ipv4
AgeCommit message (Collapse)Author
2015-05-12Merge branch 'linux-linaro-lsk-v3.10' into linux-linaro-lsk-v3.10-androidAlex Shi
2015-05-11Merge branch 'linaro-android-3.10-lsk' of ↵Kevin Hilman
git://android.git.linaro.org/kernel/linaro-android into linux-linaro-lsk-v3.10-android * 'linaro-android-3.10-lsk' of git://android.git.linaro.org/kernel/linaro-android: android: fiq_debugger: fix cut-off help message ipv4: Missing sk_nulls_node_init() in ping_unhash(). android: base-cfg: add ALSA usb: gadget: add audio dependencies to USB_G_ANDROID SELinux: ss: Fix policy write for ioctl operations nf: IDLETIMER: Adds the uid field in the msg android: configs: Enable SELinux and its dependencies. SELinux: use deletion-safe iterator to free list subsystem: CPU FREQUENCY DRIVERS- Set cpu_load calculation on current frequency
2015-05-10Merge branch 'android-3.10' of https://android.googlesource.com/kernel/commonAmit Pundir
* android-3.10: android: fiq_debugger: fix cut-off help message ipv4: Missing sk_nulls_node_init() in ping_unhash(). android: base-cfg: add ALSA usb: gadget: add audio dependencies to USB_G_ANDROID SELinux: ss: Fix policy write for ioctl operations nf: IDLETIMER: Adds the uid field in the msg android: configs: Enable SELinux and its dependencies. SELinux: use deletion-safe iterator to free list subsystem: CPU FREQUENCY DRIVERS- Set cpu_load calculation on current frequency
2015-05-06tcp: avoid looping in tcp_send_fin()Eric Dumazet
[ Upstream commit 845704a535e9b3c76448f52af1b70e4422ea03fd ] Presence of an unbound loop in tcp_send_fin() had always been hard to explain when analyzing crash dumps involving gigantic dying processes with millions of sockets. Lets try a different strategy : In case of memory pressure, try to add the FIN flag to last packet in write queue, even if packet was already sent. TCP stack will be able to deliver this FIN after a timeout event. Note that this FIN being delivered by a retransmit, it also carries a Push flag given our current implementation. By checking sk_under_memory_pressure(), we anticipate that cooking many FIN packets might deplete tcp memory. In the case we could not allocate a packet, even with __GFP_WAIT allocation, then not sending a FIN seems quite reasonable if it allows to get rid of this socket, free memory, and not block the process from eventually doing other useful work. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-06tcp: fix possible deadlock in tcp_send_fin()Eric Dumazet
[ Upstream commit d83769a580f1132ac26439f50068a29b02be535e ] Using sk_stream_alloc_skb() in tcp_send_fin() is dangerous in case a huge process is killed by OOM, and tcp_mem[2] is hit. To be able to free memory we need to make progress, so this patch allows FIN packets to not care about tcp_mem[2], if skb allocation succeeded. In a follow-up patch, we might abort tcp_send_fin() infinite loop in case TIF_MEMDIE is set on this thread, as memory allocator did its best getting extra memory already. This patch reverts d22e15371811 ("tcp: fix tcp fin memory accounting") Fixes: d22e15371811 ("tcp: fix tcp fin memory accounting") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-06ip_forward: Drop frames with attached skb->skSebastian Pöhn
[ Upstream commit 2ab957492d13bb819400ac29ae55911d50a82a13 ] Initial discussion was: [FYI] xfrm: Don't lookup sk_policy for timewait sockets Forwarded frames should not have a socket attached. Especially tw sockets will lead to panics later-on in the stack. This was observed with TPROXY assigning a tw socket and broken policy routing (misconfigured). As a result frame enters forwarding path instead of input. We cannot solve this in TPROXY as it cannot know that policy routing is broken. v2: Remove useless comment Signed-off-by: Sebastian Poehn <sebastian.poehn@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-04ipv4: Missing sk_nulls_node_init() in ping_unhash().David S. Miller
If we don't do that, then the poison value is left in the ->pprev backlink. This can cause crashes if we do a disconnect, followed by a connect(). Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Wen Xu <hotdog3645@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Bug: 20770158 Change-Id: I944eb20fddea190892c2da681d934801d268096b
2015-04-29tcp: tcp_make_synack() should clear skb->tstampEric Dumazet
[ Upstream commit b50edd7812852d989f2ef09dcfc729690f54a42d ] I noticed tcpdump was giving funky timestamps for locally generated SYNACK messages on loopback interface. 11:42:46.938990 IP 127.0.0.1.48245 > 127.0.0.2.23850: S 945476042:945476042(0) win 43690 <mss 65495,nop,nop,sackOK,nop,wscale 7> 20:28:58.502209 IP 127.0.0.2.23850 > 127.0.0.1.48245: S 3160535375:3160535375(0) ack 945476043 win 43690 <mss 65495,nop,nop,sackOK,nop,wscale 7> This is because we need to clear skb->tstamp before entering lower stack, otherwise net_timestamp_check() does not set skb->tstamp. Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-29tcp: fix FRTO undo on cumulative ACK of SACKed rangeNeal Cardwell
[ Upstream commit 666b805150efd62f05810ff0db08f44a2370c937 ] On processing cumulative ACKs, the FRTO code was not checking the SACKed bit, meaning that there could be a spurious FRTO undo on a cumulative ACK of a previously SACKed skb. The FRTO code should only consider a cumulative ACK to indicate that an original/unretransmitted skb is newly ACKed if the skb was not yet SACKed. The effect of the spurious FRTO undo would typically be to make the connection think that all previously-sent packets were in flight when they really weren't, leading to a stall and an RTO. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Fixes: e33099f96d99c ("tcp: implement RFC5682 F-RTO") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-29tcp: prevent fetching dst twice in early demux codeMichal Kubeček
[ Upstream commit d0c294c53a771ae7e84506dfbd8c18c30f078735 ] On s390x, gcc 4.8 compiles this part of tcp_v6_early_demux() struct dst_entry *dst = sk->sk_rx_dst; if (dst) dst = dst_check(dst, inet6_sk(sk)->rx_dst_cookie); to code reading sk->sk_rx_dst twice, once for the test and once for the argument of ip6_dst_check() (dst_check() is inline). This allows ip6_dst_check() to be called with null first argument, causing a crash. Protect sk->sk_rx_dst access by ACCESS_ONCE() both in IPv4 and IPv6 TCP early demux code. Fixes: 41063e9dd119 ("ipv4: Early TCP socket demux.") Fixes: c7109986db3c ("ipv6: Early TCP socket demux") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-29remove extra definitions of U32_MAXAlex Elder
commit 04f9b74e4d96d349de12fdd4e6626af4a9f75e09 upstream. Now that the definition is centralized in <linux/kernel.h>, the definitions of U32_MAX (and related) elsewhere in the kernel can be removed. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Sage Weil <sage@inktank.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-29conditionally define U32_MAXAlex Elder
commit 77719536dc00f8fd8f5abe6dadbde5331c37f996 upstream. The symbol U32_MAX is defined in several spots. Change these definitions to be conditional. This is in preparation for the next patch, which centralizes the definition in <linux/kernel.h>. Signed-off-by: Alex Elder <elder@linaro.org> Cc: Sage Weil <sage@inktank.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-19tcp: Fix crash in TCP Fast OpenBen Hutchings
Commit 355a901e6cf1 ("tcp: make connect() mem charging friendly") changed tcp_send_syn_data() to perform an open-coded copy of the 'syn' skb rather than using skb_copy_expand(). The open-coded copy does not cover the skb_shared_info::gso_segs field, so in the new skb it is left set to 0. When this commit was backported into stable branches between 3.10.y and 3.16.7-ckty inclusive, it triggered the BUG() in tcp_transmit_skb(). Since Linux 3.18 the GSO segment count is kept in the tcp_skb_cb::tcp_gso_segs field and tcp_send_syn_data() does copy the tcp_skb_cb structure to the new skb, so mainline and newer stable branches are not affected. Set skb_shared_info::gso_segs to the correct value of 1. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-17Merge branch 'linaro-android-3.10-lsk' of ↵Alex Shi
git://android.git.linaro.org/kernel/linaro-android into linux-linaro-lsk-android Remove conflictsed drivers/staging/android/logger.c
2015-04-16tcp: Silence warning: ‘in6’ may be used uninitializedJon Medhurst
Initialise in6 to avoid... In file included from include/net/inetpeer.h:15:0, from include/net/route.h:28, from include/net/inet_hashtables.h:32, from include/net/tcp.h:37, from net/ipv4/tcp.c:275: net/ipv4/tcp.c: In function ‘tcp_nuke_addr’: include/net/ipv6.h:417:34: warning: ‘in6’ may be used uninitialized in this function [-Wmaybe-uninitialized] return ((ul1[0] ^ ul2[0]) | (ul1[1] ^ ul2[1])) == 0UL; ^ net/ipv4/tcp.c:3513:19: note: ‘in6’ was declared here struct in6_addr *in6; Signed-off-by: Jon Medhurst <tixy@linaro.org> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2015-03-30Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-androidAlex Shi
2015-03-26tcp: make connect() mem charging friendlyEric Dumazet
[ Upstream commit 355a901e6cf1b2b763ec85caa2a9f04fbcc4ab4a ] While working on sk_forward_alloc problems reported by Denys Fedoryshchenko, we found that tcp connect() (and fastopen) do not call sk_wmem_schedule() for SYN packet (and/or SYN/DATA packet), so sk_forward_alloc is negative while connect is in progress. We can fix this by calling regular sk_stream_alloc_skb() both for the SYN packet (in tcp_connect()) and the syn_data packet in tcp_send_syn_data() Then, tcp_send_syn_data() can avoid copying syn_data as we simply can manipulate syn_data->cb[] to remove SYN flag (and increment seq) Instead of open coding memcpy_fromiovecend(), simply use this helper. This leaves in socket write queue clean fast clone skbs. This was tested against our fastopen packetdrill tests. Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26tcp: fix tcp fin memory accountingJosh Hunt
[ Upstream commit d22e1537181188e5dc8cbc51451832625035bdc2 ] tcp_send_fin() does not account for the memory it allocates properly, so sk_forward_alloc can be negative in cases where we've sent a FIN: ss example output (ss -amn | grep -B1 f4294): tcp FIN-WAIT-1 0 1 192.168.0.1:45520 192.0.2.1:8080 skmem:(r0,rb87380,t0,tb87380,f4294966016,w1280,o0,bl0) Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26inet_diag: fix possible overflow in inet_diag_dump_one_icsk()Eric Dumazet
[ Upstream commit c8e2c80d7ec00d020320f905822bf49c5ad85250 ] inet_diag_dump_one_icsk() allocates too small skb. Add inet_sk_attr_size() helper right before inet_sk_diag_fill() so that it can be updated if/when new attributes are added. iproute2/ss currently does not use this dump_one() interface, this might explain nobody noticed this problem yet. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-19Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-androidAlex Shi
2015-03-18udp: only allow UFO for packets from SOCK_DGRAM socketsMichal Kubeček
[ Upstream commit acf8dd0a9d0b9e4cdb597c2f74802f79c699e802 ] If an over-MTU UDP datagram is sent through a SOCK_RAW socket to a UFO-capable device, ip_ufo_append_data() sets skb->ip_summed to CHECKSUM_PARTIAL unconditionally as all GSO code assumes transport layer checksum is to be computed on segmentation. However, in this case, skb->csum_start and skb->csum_offset are never set as raw socket transmit path bypasses udp_send_skb() where they are usually set. As a result, driver may access invalid memory when trying to calculate the checksum and store the result (as observed in virtio_net driver). Moreover, the very idea of modifying the userspace provided UDP header is IMHO against raw socket semantics (I wasn't able to find a document clearly stating this or the opposite, though). And while allowing CHECKSUM_NONE in the UFO case would be more efficient, it would be a bit too intrusive change just to handle a corner case like this. Therefore disallowing UFO for packets from SOCK_DGRAM seems to be the best option. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-18ipv4: ip_check_defrag should not assume that skb_network_offset is zeroAlexander Drozdov
[ Upstream commit 3e32e733d1bbb3f227259dc782ef01d5706bdae0 ] ip_check_defrag() may be used by af_packet to defragment outgoing packets. skb_network_offset() of af_packet's outgoing packets is not zero. Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-18ipv4: ip_check_defrag should correctly check return value of skb_copy_bitsAlexander Drozdov
[ Upstream commit fba04a9e0c869498889b6445fd06cbe7da9bb834 ] skb_copy_bits() returns zero on success and negative value on error, so it is needed to invert the condition in ip_check_defrag(). Fixes: 1bf3751ec90c ("ipv4: ip_check_defrag must not modify skb before unsharing") Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-16Merge branch 'linaro-android-3.10-lsk' ofAlex Shi
git://android.git.linaro.org/kernel/linaro-android into linux-linaro-lsk-android
2015-03-15Merge branch 'android-3.10' of ↵Amit Pundir
https://android.googlesource.com/kernel/common into linaro-android-3.10-lsk * android-3.10: (40 commits) fs: ecryptfs: readdir: constify actor mm: reorder can_do_mlock to fix audit denial dm-verity: Add modes and emit uevent on corrupted blocks proc: make oom adjustment files user read-only Revert "Grants system server access to /proc/<pid>/oom_adj for Android applications." fs/proc/task_mmu.c: add user-space support for resetting mm->hiwater_rss (peak RSS) net: ping: Return EAFNOSUPPORT when appropriate. lz4: fix compression/decompression signedness mismatch lib: add lz4 compressor module decompressor: add LZ4 decompressor module Squashfs: Add LZ4 compression configuration option Squashfs: add LZ4 compression support fs/squashfs/super.c: logging cleanup fs/squashfs/file_direct.c: replace count*size kmalloc by kmalloc_array fs/squashfs/squashfs.h: replace pr_warning by pr_warn fs: push sync_filesystem() down to the file system's remount_fs() Squashfs: fix failure to unlock pages on decompress error Squashfs: Check stream is not NULL in decompressor_multi.c Squashfs: Directly decompress into the page cache for file data Squashfs: Restructure squashfs_readpage() ...
2015-03-10Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-androidAlex Shi
Most of cpu feature which hardcode in commit 3868e7f8d4799 are included in compat_hwcap_str[]. We don't need repeat them. Conflicts: arch/arm64/kernel/setup.c
2015-03-05net: ping: Return EAFNOSUPPORT when appropriate.Lorenzo Colitti
1. For an IPv4 ping socket, ping_check_bind_addr does not check the family of the socket address that's passed in. Instead, make it behave like inet_bind, which enforces either that the address family is AF_INET, or that the family is AF_UNSPEC and the address is 0.0.0.0. 2. For an IPv6 ping socket, ping_check_bind_addr returns EINVAL if the socket family is not AF_INET6. Return EAFNOSUPPORT instead, for consistency with inet6_bind. 3. Make ping_v4_sendmsg and ping_v6_sendmsg return EAFNOSUPPORT instead of EINVAL if an incorrect socket address structure is passed in. 4. Make IPv6 ping sockets be IPv6-only. The code does not support IPv4, and it cannot easily be made to support IPv4 because the protocol numbers for ICMP and ICMPv6 are different. This makes connect(::ffff:192.0.2.1) fail with EAFNOSUPPORT instead of making the socket unusable. Among other things, this fixes an oops that can be triggered by: int s = socket(AF_INET, SOCK_DGRAM, IPPROTO_ICMP); struct sockaddr_in6 sin6 = { .sin6_family = AF_INET6, .sin6_addr = in6addr_any, }; bind(s, (struct sockaddr *) &sin6, sizeof(sin6)); [backport of net 9145736d4862145684009d6a72a6e61324a9439e] Change-Id: If06ca86d9f1e4593c0d6df174caca3487c57a241 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-03ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iifCong Wang
As suggested by Julian: Simply, flowi4_iif must not contain 0, it does not look logical to ignore all ip rules with specified iif. because in fib_rule_match() we do: if (rule->iifindex && (rule->iifindex != fl->flowi_iif)) goto out; flowi4_iif should be LOOPBACK_IFINDEX by default. We need to move LOOPBACK_IFINDEX to include/net/flow.h: 1) It is mostly used by flowi_iif 2) Fix the following compile error if we use it in flow.h by the patches latter: In file included from include/linux/netfilter.h:277:0, from include/net/netns/netfilter.h:5, from include/net/net_namespace.h:21, from include/linux/netdevice.h:43, from include/linux/icmpv6.h:12, from include/linux/ipv6.h:61, from include/net/ipv6.h:16, from include/linux/sunrpc/clnt.h:27, from include/linux/nfs_fs.h:30, from init/do_mounts.c:32: include/net/flow.h: In function ‘flowi4_init_output’: include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function) [Backport of net-next 6a662719c9868b3d6c7d26b3a085f0cd3cc15e64] Change-Id: Ib7a0a08d78c03800488afa1b2c170cb70e34cfd9 Cc: Eric Biederman <ebiederm@xmission.com> Cc: Julian Anastasov <ja@ssi.bg> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2015-02-26ipv4: tcp: get rid of ugly unicast_sockEric Dumazet
[ Upstream commit bdbbb8527b6f6a358dbcb70dac247034d665b8e4 ] In commit be9f4a44e7d41 ("ipv4: tcp: remove per net tcp_sock") I tried to address contention on a socket lock, but the solution I chose was horrible : commit 3a7c384ffd57e ("ipv4: tcp: unicast_sock should not land outside of TCP stack") addressed a selinux regression. commit 0980e56e506b ("ipv4: tcp: set unicast_sock uc_ttl to -1") took care of another regression. commit b5ec8eeac46 ("ipv4: fix ip_send_skb()") fixed another regression. commit 811230cd85 ("tcp: ipv4: initialize unicast_sock sk_pacing_rate") was another shot in the dark. Really, just use a proper socket per cpu, and remove the skb_orphan() call, to re-enable flow control. This solves a serious problem with FQ packet scheduler when used in hostile environments, as we do not want to allocate a flow structure for every RST packet sent in response to a spoofed packet. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-02-26tcp: ipv4: initialize unicast_sock sk_pacing_rateEric Dumazet
[ Upstream commit 811230cd853d62f09ed0addd0ce9a1b9b0e13fb5 ] When I added sk_pacing_rate field, I forgot to initialize its value in the per cpu unicast_sock used in ip_send_unicast_reply() This means that for sch_fq users, RST packets, or ACK packets sent on behalf of TIME_WAIT sockets might be sent to slowly or even dropped once we reach the per flow limit. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 95bd09eb2750 ("tcp: TSO packets automatic sizing") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-02-26ping: Fix race in free in receive pathsubashab@codeaurora.org
[ Upstream commit fc752f1f43c1c038a2c6ae58cc739ebb5953ccb0 ] An exception is seen in ICMP ping receive path where the skb destructor sock_rfree() tries to access a freed socket. This happens because ping_rcv() releases socket reference with sock_put() and this internally frees up the socket. Later icmp_rcv() will try to free the skb and as part of this, skb destructor is called and which leads to a kernel panic as the socket is freed already in ping_rcv(). -->|exception -007|sk_mem_uncharge -007|sock_rfree -008|skb_release_head_state -009|skb_release_all -009|__kfree_skb -010|kfree_skb -011|icmp_rcv -012|ip_local_deliver_finish Fix this incorrect free by cloning this skb and processing this cloned skb instead. This patch was suggested by Eric Dumazet Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-02-26udp_diag: Fix socket skipping within chainHerbert Xu
[ Upstream commit 86f3cddbc3037882414c7308973530167906b7e9 ] While working on rhashtable walking I noticed that the UDP diag dumping code is buggy. In particular, the socket skipping within a chain never happens, even though we record the number of sockets that should be skipped. As this code was supposedly copied from TCP, this patch does what TCP does and resets num before we walk a chain. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-02-26ipv4: try to cache dst_entries which would cause a redirectHannes Frederic Sowa
[ Upstream commit df4d92549f23e1c037e83323aff58a21b3de7fe0 ] Not caching dst_entries which cause redirects could be exploited by hosts on the same subnet, causing a severe DoS attack. This effect aggravated since commit f88649721268999 ("ipv4: fix dst race in sk_dst_get()"). Lookups causing redirects will be allocated with DST_NOCACHE set which will force dst_release to free them via RCU. Unfortunately waiting for RCU grace period just takes too long, we can end up with >1M dst_entries waiting to be released and the system will run OOM. rcuos threads cannot catch up under high softirq load. Attaching the flag to emit a redirect later on to the specific skb allows us to cache those dst_entries thus reducing the pressure on allocation and deallocation. This issue was discovered by Marcelo Leitner. Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: Marcelo Leitner <mleitner@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-02-26ip: zero sockaddr returned on error queueWillem de Bruijn
[ Upstream commit f812116b174e59a350acc8e4856213a166a91222 ] The sockaddr is returned in IP(V6)_RECVERR as part of errhdr. That structure is defined and allocated on the stack as struct { struct sock_extended_err ee; struct sockaddr_in(6) offender; } errhdr; The second part is only initialized for certain SO_EE_ORIGIN values. Always initialize it completely. An MTU exceeded error on a SOCK_RAW/IPPROTO_RAW is one example that would return uninitialized bytes. Signed-off-by: Willem de Bruijn <willemb@google.com> ---- Also verified that there is no padding between errhdr.ee and errhdr.offender that could leak additional kernel data. Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-28Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-androidMark Brown
2015-01-27tcp: Do not apply TSO segment limit to non-TSO packetsHerbert Xu
[ Upstream commit 843925f33fcc293d80acf2c5c8a78adf3344d49b ] Thomas Jarosch reported IPsec TCP stalls when a PMTU event occurs. In fact the problem was completely unrelated to IPsec. The bug is also reproducible if you just disable TSO/GSO. The problem is that when the MSS goes down, existing queued packet on the TX queue that have not been transmitted yet all look like TSO packets and get treated as such. This then triggers a bug where tcp_mss_split_point tells us to generate a zero-sized packet on the TX queue. Once that happens we're screwed because the zero-sized packet can never be removed by ACKs. Fixes: 1485348d242 ("tcp: Apply device TSO segment limit earlier") Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cheers, Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-12-08Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-androidMark Brown
2014-12-06ipv4: Fix incorrect error code when adding an unreachable routePanu Matilainen
[ Upstream commit 49dd18ba4615eaa72f15c9087dea1c2ab4744cf5 ] Trying to add an unreachable route incorrectly returns -ESRCH if if custom FIB rules are present: [root@localhost ~]# ip route add 74.125.31.199 dev eth0 via 1.2.3.4 RTNETLINK answers: Network is unreachable [root@localhost ~]# ip rule add to 55.66.77.88 table 200 [root@localhost ~]# ip route add 74.125.31.199 dev eth0 via 1.2.3.4 RTNETLINK answers: No such process [root@localhost ~]# Commit 83886b6b636173b206f475929e58fac75c6f2446 ("[NET]: Change "not found" return value for rule lookup") changed fib_rules_lookup() to use -ESRCH as a "not found" code internally, but for user space it should be translated into -ENETUNREACH. Handle the translation centrally in ipv4-specific fib_lookup(), leaving the DECnet case alone. On a related note, commit b7a71b51ee37d919e4098cd961d59a883fd272d8 ("ipv4: removed redundant conditional") removed a similar translation from ip_route_input_slow() prematurely AIUI. Fixes: b7a71b51ee37 ("ipv4: removed redundant conditional") Signed-off-by: Panu Matilainen <pmatilai@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-12-05Merge branch 'linaro-android-3.10-lsk' of ↵Mark Brown
git://android.git.linaro.org/kernel/linaro-android into linux-linaro-lsk-android Conflicts: arch/arm64/Makefile
2014-12-05Merge branch 'linaro-fixes/android-3.10' into 'linaro-android-3.10-lsk'Amit Pundir
2014-12-05Merge branch 'upstream/android-3.10' into linaro-fixes/android-3.10Amit Pundir
2014-12-02net/ping: handle protocol mismatching scenarioJane Zhou
ping_lookup() may return a wrong sock if sk_buff's and sock's protocols dont' match. For example, sk_buff's protocol is ETH_P_IPV6, but sock's sk_family is AF_INET, in that case, if sk->sk_bound_dev_if is zero, a wrong sock will be returned. the fix is to "continue" the searching, if no matching, return NULL. [cherry-pick of net 91a0b603469069cdcce4d572b7525ffc9fd352a6] Bug: 18512516 Change-Id: I520223ce53c0d4e155c37d6b65a03489cc7fd494 Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Jane Zhou <a17711@motorola.com> Signed-off-by: Yiwei Zhao <gbjc64@motorola.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2014-11-21Merge branch 'linux-linaro-lsk-android' of ↵Mark Brown
git://git.linaro.org/kernel/linux-linaro-stable into linux-linaro-lsk-android
2014-11-14Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-androidMark Brown
2014-11-14ipv4: dst_entry leak in ip_send_unicast_reply()Vasily Averin
[ Upstream commit 4062090e3e5caaf55bed4523a69f26c3265cc1d2 ] ip_setup_cork() called inside ip_append_data() steals dst entry from rt to cork and in case errors in __ip_append_data() nobody frees stolen dst entry Fixes: 2e77d89b2fa8 ("net: avoid a pair of dst_hold()/dst_release() in ip_append_data()") Signed-off-by: Vasily Averin <vvs@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14ipv4: fix nexthop attlen check in fib_nh_matchJiri Pirko
[ Upstream commit f76936d07c4eeb36d8dbb64ebd30ab46ff85d9f7 ] fib_nh_match does not match nexthops correctly. Example: ip route add 172.16.10/24 nexthop via 192.168.122.12 dev eth0 \ nexthop via 192.168.122.13 dev eth0 ip route del 172.16.10/24 nexthop via 192.168.122.14 dev eth0 \ nexthop via 192.168.122.15 dev eth0 Del command is successful and route is removed. After this patch applied, the route is correctly matched and result is: RTNETLINK answers: No such process Please consider this for stable trees as well. Fixes: 4e902c57417c4 ("[IPv4]: FIB configuration using struct fib_config") Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-05Merge branch 'linaro-android-3.10-lsk' of ↵Mark Brown
git://android.git.linaro.org/kernel/linaro-android into linux-linaro-lsk-android Conflicts: arch/arm64/Kconfig arch/arm64/include/asm/barrier.h arch/arm64/include/asm/elf.h arch/arm64/include/asm/ptrace.h arch/arm64/kernel/Makefile arch/arm64/kernel/debug-monitors.c arch/arm64/kernel/entry.S arch/arm64/kernel/hw_breakpoint.c arch/arm64/kernel/kuser32.S arch/arm64/kernel/ptrace.c arch/arm64/kernel/setup.c arch/arm64/kernel/traps.c kernel/fork.c
2014-10-15Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-androidMark Brown
2014-10-15tcp: fixing TLP's FIN recoveryPer Hurtig
[ Upstream commit bef1909ee3ed1ca39231b260a8d3b4544ecd0c8f ] Fix to a problem observed when losing a FIN segment that does not contain data. In such situations, TLP is unable to recover from *any* tail loss and instead adds at least PTO ms to the retransmission process, i.e., RTO = RTO + PTO. Signed-off-by: Per Hurtig <per.hurtig@kau.se> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15tcp: fix tcp_release_cb() to dispatch via address family for mtu_reduced()Neal Cardwell
[ Upstream commit 4fab9071950c2021d846e18351e0f46a1cffd67b ] Make sure we use the correct address-family-specific function for handling MTU reductions from within tcp_release_cb(). Previously AF_INET6 sockets were incorrectly always using the IPv6 code path when sometimes they were handling IPv4 traffic and thus had an IPv4 dst. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Diagnosed-by: Willem de Bruijn <willemb@google.com> Fixes: 563d34d057862 ("tcp: dont drop MTU reduction indications") Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>