aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2013-02-08net: sctp: sctp_endpoint_free: zero out secret key dataDaniel Borkmann
On sctp_endpoint_destroy, previously used sensitive keying material should be zeroed out before the memory is returned, as we already do with e.g. auth keys when released. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfreeDaniel Borkmann
In sctp_setsockopt_auth_key, we create a temporary copy of the user passed shared auth key for the endpoint or association and after internal setup, we free it right away. Since it's sensitive data, we should zero out the key before returning the memory back to the allocator. Thus, use kzfree instead of kfree, just as we do in sctp_auth_key_put(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08l2tp: dont play with skb->truesizeEric Dumazet
Andrew Savchenko reported a DNS failure and we diagnosed that some UDP sockets were unable to send more packets because their sk_wmem_alloc was corrupted after a while (tx_queue column in following trace) $ cat /proc/net/udp sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops ... 459: 00000000:0270 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4507 2 ffff88003d612380 0 466: 00000000:0277 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4802 2 ffff88003d613180 0 470: 076A070A:007B 00000000:0000 07 FFFF4600:00000000 00:00000000 00000000 123 0 5552 2 ffff880039974380 0 470: 010213AC:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4986 2 ffff88003dbd3180 0 470: 010013AC:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4985 2 ffff88003dbd2e00 0 470: 00FCA8C0:007B 00000000:0000 07 FFFFFB00:00000000 00:00000000 00000000 0 0 4984 2 ffff88003dbd2a80 0 ... Playing with skb->truesize is tricky, especially when skb is attached to a socket, as we can fool memory charging. Just remove this code, its not worth trying to be ultra precise in xmit path. Reported-by: Andrew Savchenko <bircoph@gmail.com> Tested-by: Andrew Savchenko <bircoph@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07net: sctp: sctp_auth_key_put: use kzfree instead of kfreeDaniel Borkmann
For sensitive data like keying material, it is common practice to zero out keys before returning the memory back to the allocator. Thus, use kzfree instead of kfree. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07Merge branch 'fixes' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch into openvswitch Jesse Gross says: ==================== One bug fix for net/3.8 for a long standing problem that was reported a few times recently. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06ipv6/ip6_gre: fix error case handling in ip6gre_tunnel_xmit()Tommi Rantala
ip6gre_tunnel_xmit() is leaking the skb when we hit this error branch, and the -1 return value from this function is bogus. Use the error handling we already have in place in ip6gre_tunnel_xmit() for this error case to fix this. Signed-off-by: Tommi Rantala <tt.rantala@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06tcp: fix for zero packets_in_flight was too broadIlpo Järvinen
There are transients during normal FRTO procedure during which the packets_in_flight can go to zero between write_queue state updates and firing the resulting segments out. As FRTO processing occurs during that window the check must be more precise to not match "spuriously" :-). More specificly, e.g., when packets_in_flight is zero but FLAG_DATA_ACKED is true the problematic branch that set cwnd into zero would not be taken and new segments might be sent out later. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2013-02-04Merge branch 'for-upstream' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
2013-02-04tcp: ipv6: Update MIB counters for dropsVijay Subramanian
This patch updates LINUX_MIB_LISTENDROPS and LINUX_MIB_LISTENOVERFLOWS in tcp_v6_conn_request() and tcp_v6_err(). tcp_v6_conn_request() in particular can drop SYNs for various reasons which are not currently tracked. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-04tcp: Update MIB counters for dropsVijay Subramanian
This patch updates LINUX_MIB_LISTENDROPS in tcp_v4_conn_request() and tcp_v4_err(). tcp_v4_conn_request() in particular can drop SYNs for various reasons which are not currently tracked. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-03packet: fix leakage of tx_ring memoryPhil Sutter
When releasing a packet socket, the routine packet_set_ring() is reused to free rings instead of allocating them. But when calling it for the first time, it fills req->tp_block_nr with the value of rb->pg_vec_len which in the second invocation makes it bail out since req->tp_block_nr is greater zero but req->tp_block_size is zero. This patch solves the problem by passing a zeroed auto-variable to packet_set_ring() upon each invocation from packet_release(). As far as I can tell, this issue exists even since 69e3c75 (net: TX_RING and packet mmap), i.e. the original inclusion of TX ring support into af_packet, but applies only to sockets with both RX and TX ring allocated, which is probably why this was unnoticed all the time. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> Cc: Johann Baudy <johann.baudy@gnu-log.net> Cc: Daniel Borkmann <dborkman@redhat.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-03net: Fix inner_network_header assignment in skb-copy.Pravin B Shelar
Use correct inner offset to set inner_network_offset. Found by inspection. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-03tcp: frto should not set snd_cwnd to 0Eric Dumazet
Commit 9dc274151a548 (tcp: fix ABC in tcp_slow_start()) uncovered a bug in FRTO code : tcp_process_frto() is setting snd_cwnd to 0 if the number of in flight packets is 0. As Neal pointed out, if no packet is in flight we lost our chance to disambiguate whether a loss timeout was spurious. We should assume it was a proper loss. Reported-by: Pasi Kärkkäinen <pasik@iki.fi> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-03tcp: fix an infinite loop in tcp_slow_start()Eric Dumazet
Since commit 9dc274151a548 (tcp: fix ABC in tcp_slow_start()), a nul snd_cwnd triggers an infinite loop in tcp_slow_start() Avoid this infinite loop and log a one time error for further analysis. FRTO code is suspected to cause this bug. Reported-by: Pasi Kärkkäinen <pasik@iki.fi> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-01Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2013-01-31tcp: detect SYN/data drop when F-RTO is disabledYuchung Cheng
On receiving the SYN-ACK, Fast Open checks icsk_retransmit for SYN retransmission to detect SYN/data drops. But if F-RTO is disabled, icsk_retransmit is reset at step D of tcp_fastretrans_alert() ( under tcp_ack()) before tcp_rcv_fastopen_synack(). The fix is to use total_retrans instead which accounts for SYN retransmission regardless the use of F-RTO. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31l2tp: correctly handle ancillary data in the ip6 recv pathTom Parkin
l2tp_ip6 is incorrectly using the IPv4-specific ip_cmsg_recv to handle ancillary data. This means that socket options such as IPV6_RECVPKTINFO are not honoured in userspace. Convert l2tp_ip6 to use the IPv6-specific handler. Ref: net/ipv6/udp.c Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: Chris Elston <celston@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31ipv6: export ip6_datagram_recv_ctlTom Parkin
ip6_datagram_recv_ctl and ip6_datagram_send_ctl are used for handling IPv6 ancillary data. Since ip6_datagram_send_ctl is already publicly exported for use in modules, ip6_datagram_recv_ctl should also be available to support ancillary data in the receive path. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31ipv6: rename datagram_send_ctl and datagram_recv_ctlTom Parkin
The datagram_*_ctl functions in net/ipv6/datagram.c are IPv6-specific. Since datagram_send_ctl is publicly exported it should be appropriately named to reflect the fact that it's for IPv6 only. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31Bluetooth: Fix hci_conn timeout routineAndre Guedes
If occurs a LE or SCO hci_conn timeout and the connection is already established (BT_CONNECTED state), the connection is not terminated as expected. This bug can be reproduced using l2test or scotest tool. Once the connection is established, kill l2test/scotest and the connection won't be terminated. This patch fixes hci_conn_disconnect helper so it is able to terminate LE and SCO connections, as well as ACL. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-01-31Bluetooth: Fix handling of unexpected SMP PDUsJohan Hedberg
The conn->smp_chan pointer can be NULL if SMP PDUs arrive at unexpected moments. To avoid NULL pointer dereferences the code should be checking for this and disconnect if an unexpected SMP PDU arrives. This patch fixes the issue by adding a check for conn->smp_chan for all other PDUs except pairing request and security request (which are are the first PDUs to come to initialize the SMP context). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> CC: stable@vger.kernel.org Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-01-30ipv6: do not create neighbor entries for local deliveryMarcelo Ricardo Leitner
They will be created at output, if ever needed. This avoids creating empty neighbor entries when TPROXYing/Forwarding packets for addresses that are not even directly reachable. Note that IPv4 already handles it this way. No neighbor entries are created for local input. Tested by myself and customer. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29tcp: Increment LISTENOVERFLOW and LISTENDROPS in tcp_v4_conn_request()Nivedita Singhvi
We drop a connection request if the accept backlog is full and there are sufficient packets in the syn queue to warrant starting drops. Increment the appropriate counters so this isn't silent, for accurate stats and help in debugging. This patch assumes LINUX_MIB_LISTENDROPS is a superset of/includes the counter LINUX_MIB_LISTENOVERFLOWS. Signed-off-by: Nivedita Singhvi <niv@us.ibm.com> Acked-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29ipv6 addrconf: Fix interface identifiers of 802.15.4 devices.YOSHIFUJI Hideaki / 吉藤英明
The "Universal/Local" (U/L) bit must be complmented according to RFC4944 and RFC2464. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29pktgen: correctly handle failures when adding a deviceCong Wang
The return value of pktgen_add_device() is not checked, so even if we fail to add some device, for example, non-exist one, we still see "OK:...". This patch fixes it. After this patch, I got: # echo "add_device non-exist" > /proc/net/pktgen/kpktgend_0 -bash: echo: write error: No such device # cat /proc/net/pktgen/kpktgend_0 Running: Stopped: Result: ERROR: can not add device non-exist # echo "add_device eth0" > /proc/net/pktgen/kpktgend_0 # cat /proc/net/pktgen/kpktgend_0 Running: Stopped: eth0 Result: OK: add_device=eth0 (Candidate for -stable) Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29netem: fix delay calculation in rate extensionJohannes Naab
The delay calculation with the rate extension introduces in v3.3 does not properly work, if other packets are still queued for transmission. For the delay calculation to work, both delay types (latency and delay introduces by rate limitation) have to be handled differently. The latency delay for a packet can overlap with the delay of other packets. The delay introduced by the rate however is separate, and can only start, once all other rate-introduced delays finished. Latency delay is from same distribution for each packet, rate delay depends on the packet size. .: latency delay -: rate delay x: additional delay we have to wait since another packet is currently transmitted .....---- Packet 1 .....xx------ Packet 2 .....------ Packet 3 ^^^^^ latency stacks ^^ rate delay doesn't stack ^^ latency stacks -----> time When a packet is enqueued, we first consider the latency delay. If other packets are already queued, we can reduce the latency delay until the last packet in the queue is send, however the latency delay cannot be <0, since this would mean that the rate is overcommitted. The new reference point is the time at which the last packet will be send. To find the time, when the packet should be send, the rate introduces delay has to be added on top of that. Signed-off-by: Johannes Naab <jn@stusta.de> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29l2tp: prevent l2tp_tunnel_delete racing with userspace closeTom Parkin
If a tunnel socket is created by userspace, l2tp hooks the socket destructor in order to clean up resources if userspace closes the socket or crashes. It also caches a pointer to the struct sock for use in the data path and in the netlink interface. While it is safe to use the cached sock pointer in the data path, where the skb references keep the socket alive, it is not safe to use it elsewhere as such access introduces a race with userspace closing the socket. In particular, l2tp_tunnel_delete is prone to oopsing if a multithreaded userspace application closes a socket at the same time as sending a netlink delete command for the tunnel. This patch fixes this oops by forcing l2tp_tunnel_delete to explicitly look up a tunnel socket held by userspace using sockfd_lookup(). Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28Merge branch 'for-john' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
2013-01-28SCTP: Free the per-net sysctl table on net exit. v2Vlad Yasevich
Per-net sysctl table needs to be explicitly freed at net exit. Otherwise we see the following with kmemleak: unreferenced object 0xffff880402d08000 (size 2048): comm "chrome_sandbox", pid 18437, jiffies 4310887172 (age 9097.630s) hex dump (first 32 bytes): b2 68 89 81 ff ff ff ff 20 04 04 f8 01 88 ff ff .h...... ....... 04 00 00 00 a4 01 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff815b4aad>] kmemleak_alloc+0x21/0x3e [<ffffffff81110352>] slab_post_alloc_hook+0x28/0x2a [<ffffffff81113fad>] __kmalloc_track_caller+0xf1/0x104 [<ffffffff810f10c2>] kmemdup+0x1b/0x30 [<ffffffff81571e9f>] sctp_sysctl_net_register+0x1f/0x72 [<ffffffff8155d305>] sctp_net_init+0x100/0x39f [<ffffffff814ad53c>] ops_init+0xc6/0xf5 [<ffffffff814ad5b7>] setup_net+0x4c/0xd0 [<ffffffff814ada5e>] copy_net_ns+0x6d/0xd6 [<ffffffff810938b1>] create_new_namespaces+0xd7/0x147 [<ffffffff810939f4>] copy_namespaces+0x63/0x99 [<ffffffff81076733>] copy_process+0xa65/0x1233 [<ffffffff81077030>] do_fork+0x10b/0x271 [<ffffffff8100a0e9>] sys_clone+0x23/0x25 [<ffffffff815dda73>] stub_clone+0x13/0x20 [<ffffffffffffffff>] 0xffffffffffffffff I fixed the spelling of sysctl_header so the code actually compiles. -- EWB. Reported-by: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28IP_GRE: Fix kernel panic in IP_GRE with GRE csum.Pravin B Shelar
Due to IP_GRE GSO support, GRE can recieve non linear skb which results in panic in case of GRE_CSUM. Following patch fixes it by using correct csum API. Bug introduced in commit 6b78f16e4bdde3936b (gre: add GSO support) Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-27sctp: set association state to established in dupcook_a handlerXufeng Zhang
While sctp handling a duplicate COOKIE-ECHO and the action is 'Association restart', sctp_sf_do_dupcook_a() will processing the unexpected COOKIE-ECHO for peer restart, but it does not set the association state to SCTP_STATE_ESTABLISHED, so the association could stuck in SCTP_STATE_SHUTDOWN_PENDING state forever. This violates the sctp specification: RFC 4960 5.2.4. Handle a COOKIE ECHO when a TCB Exists Action A) In this case, the peer may have restarted. ..... After this, the endpoint shall enter the ESTABLISHED state. To resolve this problem, adding a SCTP_CMD_NEW_STATE cmd to the command list before SCTP_CMD_REPLY cmd, this will set the restart association to SCTP_STATE_ESTABLISHED state properly and also avoid I-bit being set in the DATA chunk header when COOKIE_ACK is bundled with DATA chunks. Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-27ip6mr: limit IPv6 MRT_TABLE identifiersDan Carpenter
We did this for IPv4 in b49d3c1e1c "net: ipmr: limit MRT_TABLE identifiers" but we need to do it for IPv6 as well. On IPv6 the name is "pim6reg" instead of "pimreg" so there is one less digit allowed. The strcpy() is in ip6mr_reg_vif(). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-27batman-adv: filter ARP packets with invalid MAC addresses in DATMatthias Schiffer
We never want multicast MAC addresses in the Distributed ARP Table, so it's best to completely ignore ARP packets containing them where we expect unicast addresses. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2013-01-27batman-adv: check for more types of invalid IP addresses in DATMatthias Schiffer
There are more types of IP addresses that may appear in ARP packets that we don't want to process. While some of these should never appear in sane ARP packets, a 0.0.0.0 source is used for duplicate address detection and thus seen quite often. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2013-01-27batman-adv: fix skb leak in batadv_dat_snoop_incoming_arp_reply()Matthias Schiffer
The callers of batadv_dat_snoop_incoming_arp_reply() assume the skb has been freed when it returns true; fix this by calling kfree_skb before returning as it is done in batadv_dat_snoop_incoming_arp_request(). Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Acked-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2013-01-24cfg80211: off by one in ieee80211_bss()Dan Carpenter
We do a: sprintf(buf, " Last beacon: %ums ago", elapsed_jiffies_msecs(bss->ts)); elapsed_jiffies_msecs() can return a 10 digit number so "buf" needs to be 31 characters long. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-23Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2013-01-22ipv4: Fix route refcount on pmtu discoverySteffen Klassert
git commit 9cb3a50c (ipv4: Invalidate the socket cached route on pmtu events if possible) introduced a refcount problem. We don't get a refcount on the route if we get it from__sk_dst_get(), but we need one if we want to reuse this route because __sk_dst_set() releases the refcount of the old route. This patch adds proper refcount handling for that case. We introduce a 'new' flag to indicate that we are going to use a new route and we release the old route only if we replace it by a new one. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-22Merge branch 'for-john' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
2013-01-22Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== 1) The transport header did not point to the right place after esp/ah processing on tunnel mode in the receive path. As a result, the ECN field of the inner header was not set correctly, fixes from Li RongQing. 2) We did a null check too late in one of the xfrm_replay advance functions. This can lead to a division by zero, fix from Nickolai Zeldovich. 3) The size calculation of the hash table missed the muiltplication with the actual struct size when the hash table is freed. We might call the wrong free function, fix from Michal Kubecek. 4) On IPsec pmtu events we can't access the transport headers of the original packet, so force a relookup for all routes to notify about the pmtu event. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-22net: net_cls: fd passed in SCM_RIGHTS datagram not set correctlyDaniel Wagner
Commit 6a328d8c6f03501657ad580f6f98bf9a42583ff7 changed the update logic for the socket but it does not update the SCM_RIGHTS update as well. This patch is based on the net_prio fix commit 48a87cc26c13b68f6cce4e9d769fcb17a6b3e4b8 net: netprio: fd passed in SCM_RIGHTS datagram not set correctly A socket fd passed in a SCM_RIGHTS datagram was not getting updated with the new tasks cgrp prioidx. This leaves IO on the socket tagged with the old tasks priority. To fix this add a check in the scm recvmsg path to update the sock cgrp prioidx with the new tasks value. Let's apply the same fix for net_cls. Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Reported-by: Li Zefan <lizefan@huawei.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-21openvswitch: Move LRO check from transmit to receive.Jesse Gross
The check for LRO packets was incorrectly put in the transmit path instead of on receive. Since this check is supposed to protect OVS (and other parts of the system) from packets that it cannot handle it is obviously not useful on egress. Therefore, this commit moves it back to the receive side. The primary problem that this caused is upcalls to userspace tried to segment the packet even though no segmentation information is available. This would later cause NULL pointer dereferences when skb_gso_segment() did nothing. Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-01-21ipv4: Add a socket release callback for datagram socketsSteffen Klassert
This implements a socket release callback function to check if the socket cached route got invalid during the time we owned the socket. The function is used from udp, raw and ping sockets. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-21ipv4: Invalidate the socket cached route on pmtu events if possibleSteffen Klassert
The route lookup in ipv4_sk_update_pmtu() might return a route different from the route we cached at the socket. This is because standart routes are per cpu, so each cpu has it's own struct rtable. This means that we do not invalidate the socket cached route if the NET_RX_SOFTIRQ is not served by the same cpu that the sending socket uses. As a result, the cached route reused until we disconnect. With this patch we invalidate the socket cached route if possible. If the socket is owened by the user, we can't update the cached route directly. A followup patch will implement socket release callback functions for datagram sockets to handle this case. Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-21xfrm4: Invalidate all ipv4 routes on IPsec pmtu eventsSteffen Klassert
On IPsec pmtu events we can't access the transport headers of the original packet, so we can't find the socket that sent the packet. The only chance to notify the socket about the pmtu change is to force a relookup for all routes. This patch implenents this for the IPsec protocols. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-01-21xfrm: fix freed block size calculation in xfrm_policy_fini()Michal Kubecek
Missing multiplication of block size by sizeof(struct hlist_head) can cause xfrm_hash_free() to be called with wrong second argument so that kfree() is called on a block allocated with vzalloc() or __get_free_pages() or free_pages() is called with wrong order when a namespace with enough policies is removed. Bug introduced by commit a35f6c5d, i.e. versions >= 2.6.29 are affected. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-01-20net: splice: fix __splice_segment()Eric Dumazet
commit 9ca1b22d6d2 (net: splice: avoid high order page splitting) forgot that skb->head could need a copy into several page frags. This could be the case for loopback traffic mostly. Also remove now useless skb argument from linear_to_page() and __splice_segment() prototypes. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-20net: splice: avoid high order page splittingEric Dumazet
splice() can handle pages of any order, but network code tries hard to split them in PAGE_SIZE units. Not quite successfully anyway, as __splice_segment() assumed poff < PAGE_SIZE. This is true for the skb->data part, not necessarily for the fragments. This patch removes this logic to give the pages as they are in the skb. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-20tcp: fix incorrect LOCKDROPPEDICMPS counterEric Dumazet
commit 563d34d057 (tcp: dont drop MTU reduction indications) added an error leading to incorrect accounting of LINUX_MIB_LOCKDROPPEDICMPS If socket is owned by the user, we want to increment this SNMP counter, unless the message is a (ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED) one. Reported-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>