aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/e1000
AgeCommit message (Collapse)Author
2016-09-15e1000: fix data race between tx_ring->next_to_cleanDmitriy Vyukov
[ Upstream commit 9eab46b7cb8d0b0dcf014bf7b25e0e72b9e4d929 ] e1000_clean_tx_irq cleans buffers and sets tx_ring->next_to_clean, then e1000_xmit_frame reuses the cleaned buffers. But there are no memory barriers when buffers gets recycled, so the recycled buffers can be corrupted. Use smp_store_release to update tx_ring->next_to_clean and smp_load_acquire to read tx_ring->next_to_clean to properly hand off buffers from e1000_clean_tx_irq to e1000_xmit_frame. The data race was found with KernelThreadSanitizer (KTSAN). Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-16drivers/net/intel: use napi_complete_done()Jesse Brandeburg
As per Eric Dumazet's previous patches: (see commit (24d2e4a50737) - tg3: use napi_complete_done()) Quoting verbatim: Using napi_complete_done() instead of napi_complete() allows us to use /sys/class/net/ethX/gro_flush_timeout GRO layer can aggregate more packets if the flush is delayed a bit, without having to set too big coalescing parameters that impact latencies. </end quote> Tested configuration: low latency via ethtool -C ethx adaptive-rx off rx-usecs 10 adaptive-tx off tx-usecs 15 workload: streaming rx using netperf TCP_MAERTS igb: MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo ... Interim result: 941.48 10^6bits/s over 1.000 seconds ending at 1440193171.589 Alignment Offset Bytes Bytes Recvs Bytes Sends Local Remote Local Remote Xfered Per Per Recv Send Recv Send Recv (avg) Send (avg) 8 8 0 0 1176930056 1475.36 797726 16384.00 71905 MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo ... Interim result: 941.49 10^6bits/s over 0.997 seconds ending at 1440193142.763 Alignment Offset Bytes Bytes Recvs Bytes Sends Local Remote Local Remote Xfered Per Per Recv Send Recv Send Recv (avg) Send (avg) 8 8 0 0 1175182320 50476.00 23282 16384.00 71816 i40e: Hard to test because the traffic is incoming so fast (24Gb/s) that GRO always receives 87kB, even at the highest interrupt rate. Other drivers were only compile tested. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-16drivers/net: get rid of unnecessary initializations in .get_drvinfo()Ivan Vecera
Many drivers initialize uselessly n_priv_flags, n_stats, testinfo_len, eedump_len & regdump_len fields in their .get_drvinfo() ethtool op. It's not necessary as these fields is filled in ethtool_get_drvinfo(). v2: removed unused variable v3: removed another unused variable Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-22e1000: remove dead e1000_init_eeprom_params callsFrancois Romieu
The device probe method e1000_probe calls e1000_init_eeprom_params itself so there's no reason to call it again from e1000_do_write_eeprom or e1000_do_read_eeprom. The sentence above assumes that e1000_init_eeprom_params is effective. e1000_init_eeprom_params depends mostly on hw->mac_type and e1000_probe bails out early if it can't set mac_type (see e1000_init_hw_struct, then e1000_set_mac_type), qed. Btw, if effective, the removed paths would had been deadlock prone when e1000_eeprom_spi was set: -> e1000_write_eeprom (takes e1000_eeprom_lock) -> e1000_do_write_eeprom -> e1000_init_eeprom_params -> e1000_read_eeprom (takes e1000_eeprom_lock) (same narrative with e1000_read_eeprom -> e1000_do_read_eeprom etc.) As a final note, the candidate deadlock above can't happen in e1000_probe due to the way eeprom->word_size is set / tested. Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-05-12e1000: Replace e1000_free_frag with skb_free_fragAlexander Duyck
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08e1000, e1000e: Use dma_rmb instead of rmb for descriptor read orderingAlexander Duyck
This change replaces calls to rmb with dma_rmb in the case where we want to order all follow-on descriptor reads after the check for the descriptor status bit. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-08ethernet: codespell comment spelling fixesJoe Perches
To test a checkpatch spelling patch, I ran codespell against drivers/net/ethernet/. $ git ls-files drivers/net/ethernet/ | \ while read file ; do \ codespell -w $file; \ done I removed a false positive in e1000_hw.h Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-06e1000: add dummy allocator to fix race condition between mtu change and netpollSabrina Dubroca
There is a race condition between e1000_change_mtu's cleanups and netpoll, when we change the MTU across jumbo size: Changing MTU frees all the rx buffers: e1000_change_mtu -> e1000_down -> e1000_clean_all_rx_rings -> e1000_clean_rx_ring Then, close to the end of e1000_change_mtu: pr_info -> ... -> netpoll_poll_dev -> e1000_clean -> e1000_clean_rx_irq -> e1000_alloc_rx_buffers -> e1000_alloc_frag And when we come back to do the rest of the MTU change: e1000_up -> e1000_configure -> e1000_configure_rx -> e1000_alloc_jumbo_rx_buffers alloc_jumbo finds the buffers already != NULL, since data (shared with page in e1000_rx_buffer->rxbuf) has been re-alloc'd, but it's garbage, or at least not what is expected when in jumbo state. This results in an unusable adapter (packets don't get through), and a NULL pointer dereference on the next call to e1000_clean_rx_ring (other mtu change, link down, shutdown): BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81194d6e>] put_compound_page+0x7e/0x330 [...] Call Trace: [<ffffffff81195445>] put_page+0x55/0x60 [<ffffffff815d9f44>] e1000_clean_rx_ring+0x134/0x200 [<ffffffff815da055>] e1000_clean_all_rx_rings+0x45/0x60 [<ffffffff815df5e0>] e1000_down+0x1c0/0x1d0 [<ffffffff811e2260>] ? deactivate_slab+0x7f0/0x840 [<ffffffff815e21bc>] e1000_change_mtu+0xdc/0x170 [<ffffffff81647050>] dev_set_mtu+0xa0/0x140 [<ffffffff81664218>] do_setlink+0x218/0xac0 [<ffffffff814459e9>] ? nla_parse+0xb9/0x120 [<ffffffff816652d0>] rtnl_newlink+0x6d0/0x890 [<ffffffff8104f000>] ? kvm_clock_read+0x20/0x40 [<ffffffff810a2068>] ? sched_clock_cpu+0xa8/0x100 [<ffffffff81663802>] rtnetlink_rcv_msg+0x92/0x260 By setting the allocator to a dummy version, netpoll can't mess up our rx buffers. The allocator is set back to a sane value in e1000_configure_rx. Fixes: edbbb3ca1077 ("e1000: implement jumbo receive with partial descriptors") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-06e1000: call netif_carrier_off early on downEliezer Tamir
When bringing down an interface netif_carrier_off() should be one the first things we do, since this will prevent the stack from queuing more packets to this interface. This operation is very fast, and should make the device behave much nicer when trying to bring down an interface under load. Also, this would Do The Right Thing (TM) if this device has some sort of fail-over teaming and redirect traffic to the other IF. Move netif_carrier_off as early as possible. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-01-22net: e1000: support txtd update delay via xmit_moreFlorian Westphal
Don't update Tx tail descriptor if we queue hasn't been stopped and we know at least one more skb will be sent right away. Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-01-22e1000: fix time comparisonAsaf Vertz
To be future-proof and for better readability the time comparisons are modified to use time_after_eq() instead of plain, error-prone math. Signed-off-by: Asaf Vertz <asaf.vertz@tandemg.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-01-13net: rename vlan_tx_* helpers since "tx" is misleading thereJiri Pirko
The same macros are used for rx as well. So rename it. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10ethernet/intel: Use napi_alloc_skbAlexander Duyck
This change replaces calls to netdev_alloc_skb_ip_align with napi_alloc_skb. The advantage of napi_alloc_skb is currently the fact that the page allocation doesn't make use of any irq disable calls. There are few spots where I couldn't replace the calls as the buffer allocation routine is called as a part of init which is outside of the softirq context. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-08ethernet/intel: Use eth_skb_pad and skb_put_padto helpersAlexander Duyck
Update the Intel Ethernet drivers to use eth_skb_pad() and skb_put_padto instead of doing their own implementations of the function. Also this cleans up two other spots where skb_pad was called but the length and tail pointers were being manipulated directly instead of just having the padding length added via __skb_put. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-30e1000: unset IFF_UNICAST_FLT on WMware 82545EMFrancesco Ruggeri
VMWare's e1000 implementation does not seem to support unicast filtering. This can be observed by configuring a macvlan interface on eth0 in a VM in VMWare Fusion 5.0.5, and trying to use that interface instead of eth0. Tested on 3.16. Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-12e1000: switch to napi_gro_frags apiFlorian Westphal
napi_gro_frags allows skb re-use in case GRO can merge payload pages into an skb on the GRO lists. netperf TCP_STREAM, kvm-e1000 emulation, mtu 9k: Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec old: 87380 16384 16384 30.00 8985.78 new: 87380 16384 16384 30.00 9907.05 Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-12e1000: convert to build_skbFlorian Westphal
Instead of preallocating Rx skbs, allocate them right before sending inbound packet up the stack. e1000-kvm, mtu1500, netperf TCP_STREAM: Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec old: 87380 16384 16384 60.00 4532.40 new: 87380 16384 16384 60.00 4599.05 Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-12e1000: rename struct e1000_buffer to e1000_tx_bufferFlorian Westphal
and remove *page, its only used for Rx. Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-12e1000: add and use e1000_rx_buffer info for RxFlorian Westphal
e1000 uses the same metadata struct for Rx and Tx. But Tx and Rx have different requirements. For Rx, we only need to store a buffer and a DMA address. Follow-up patch will remove skb for Rx, bringing rx_buffer_info down to 16 bytes on x86_64. [ buffer_info is 48 bytes ] Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-12e1000: perform copybreak ahead of DMA unmapFlorian Westphal
Currently we unmap the DMA range, then copy to new skb. Change this so we can keep the mapping in case the data is copied. Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-12e1000: move tbi workaround code into helper functionFlorian Westphal
Its the same in both handlers. Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-12e1000: move e1000_tbi_adjust_stats to where its usedFlorian Westphal
... and make it static. Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2014-09-06e1000: e1000_ethertool.c coding style fixesKrzysztof Majzerowicz-Jaszcz
Fixed many errors/warnings and checks in e1000_ethtool.c reported by checkpatch.pl. Suggestions from Joe Perches and Alexander Duyck applied as well Signed-off-by: Krzysztof Majzerowicz-Jaszcz <cristos@vipserv.org> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-08-25e1000: Fix TSO for non-accelerated vlan trafficVlad Yasevich
This device claims TSO and checksum support for vlans. It also allows a user to control vlan acceleration offloading. As such, it is possible to turn off vlan acceleration and configure a vlan which will continue to support TSO. In such situation the packet passed down the the device will contain a vlan header and skb->protocol will be set to ETH_P_8021Q. The device assumes that skb->protocol contains network protocol value and uses that value to set up TSO and checksum information. This will results in corrupted frames sent on the wire. This patch extract the protocol value correctly and corrects TSO for non-accelerated traffic. CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Bruce Allan <bruce.w.allan@intel.com> CC: Carolyn Wyborny <carolyn.wyborny@intel.com> CC: Don Skidmore <donald.c.skidmore@intel.com> CC: Greg Rose <gregory.v.rose@intel.com> CC: Alex Duyck <alexander.h.duyck@intel.com> CC: John Ronciak <john.ronciak@intel.com> CC: Mitch Williams <mitch.a.williams@intel.com> CC: Linux NICS <linux.nics@intel.com> CC: e1000-devel@lists.sourceforge.net Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-12PCI: Remove DEFINE_PCI_DEVICE_TABLE macro useBenoit Taine
We should prefer `struct pci_device_id` over `DEFINE_PCI_DEVICE_TABLE` to meet kernel coding style guidelines. This issue was reported by checkpatch. A simplified version of the semantic patch that makes this change is as follows (http://coccinelle.lip6.fr/): // <smpl> @@ identifier i; declarer name DEFINE_PCI_DEVICE_TABLE; initializer z; @@ - DEFINE_PCI_DEVICE_TABLE(i) + const struct pci_device_id i[] = z; // </smpl> [bhelgaas: add semantic patch] Signed-off-by: Benoit Taine <benoit.taine@lip6.fr> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-07-20e1000: remove unnecessary break after returnFabian Frederick
Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-06net: use SPEED_UNKNOWN and DUPLEX_UNKNOWN when appropriateJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-03e1000: Use time_after() for time comparisonManuel Schölling
To be future-proof and for better readability the time comparisons are modified to use time_after() instead of plain, error-prone math. Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-03e1000: remove the check: skb->len<=0Yongjian Xu
There is no case skb->len would be 0 or 'negative'. Remove the check. Signed-off-by: Yongjian Xu <xuyongjiande@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-05-27e1000: Use is_broadcast_ether_addr/is_multicast_ether_addr helpersTobias Klauser
Use the is_broadcast_ether_addr/is_multicast_ether_addr helper functions from linux/etherdevice.h instead of open coding them. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-05-13net: get rid of SET_ETHTOOL_OPSWilfried Klaebe
net: get rid of SET_ETHTOOL_OPS Dave Miller mentioned he'd like to see SET_ETHTOOL_OPS gone. This does that. Mostly done via coccinelle script: @@ struct ethtool_ops *ops; struct net_device *dev; @@ - SET_ETHTOOL_OPS(dev, ops); + dev->ethtool_ops = ops; Compile tested only, but I'd seriously wonder if this broke anything. Suggested-by: Dave Miller <davem@davemloft.net> Signed-off-by: Wilfried Klaebe <w-lkml@lebenslange-mailadresse.de> Acked-by: Felipe Balbi <balbi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-11e1000: remove open-coded skb_cow_headFrancois Romieu
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-11e1000: remove debug messages with function namesJakub Kicinski
e1000_hw.c contains a lot of debug messages which print name of invoked function and contain no new line character at the end. Remove them as equivalent information can be nowadays obtained using function tracer. Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Jakub Kicinski <kubakici@wp.pl> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-16drivers/net: delete non-required instances of include <linux/init.h>Paul Gortmaker
None of these files are actually using any __init type directives and hence don't need to include <linux/init.h>. Most are just a left over from __devinit and __cpuinit removal, or simply due to code getting copied from one driver to the next. This covers everything under drivers/net except for wireless, which has been submitted separately. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-30e1000: fix possible reset_task running after adapter downVladimir Davydov
On e1000_down(), we should ensure every asynchronous work is canceled before proceeding. Since the watchdog_task can schedule other works apart from itself, it should be stopped first, but currently it is stopped after the reset_task. This can result in the following race leading to the reset_task running after the module unload: e1000_down_and_stop(): e1000_watchdog(): ---------------------- ----------------- cancel_work_sync(reset_task) schedule_work(reset_task) cancel_delayed_work_sync(watchdog_task) The patch moves cancel_delayed_work_sync(watchdog_task) at the beginning of e1000_down_and_stop() thus ensuring the race is impossible. Cc: Tushar Dave <tushar.n.dave@intel.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-11-29e1000: fix lockdep warning in e1000_reset_taskVladimir Davydov
The patch fixes the following lockdep warning, which is 100% reproducible on network restart: ====================================================== [ INFO: possible circular locking dependency detected ] 3.12.0+ #47 Tainted: GF ------------------------------------------------------- kworker/1:1/27 is trying to acquire lock: ((&(&adapter->watchdog_task)->work)){+.+...}, at: [<ffffffff8108a5b0>] flush_work+0x0/0x70 but task is already holding lock: (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&adapter->mutex){+.+...}: [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff816b8cbc>] mutex_lock_nested+0x4c/0x390 [<ffffffffa017233d>] e1000_watchdog+0x7d/0x5b0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 -> #0 ((&(&adapter->watchdog_task)->work)){+.+...}: [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810 [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff8108a5eb>] flush_work+0x3b/0x70 [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140 [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000] [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000] [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&adapter->mutex); lock((&(&adapter->watchdog_task)->work)); lock(&adapter->mutex); lock((&(&adapter->watchdog_task)->work)); *** DEADLOCK *** 3 locks held by kworker/1:1/27: #0: (events){.+.+.+}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510 #1: ((&adapter->reset_task)){+.+...}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510 #2: (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000] stack backtrace: CPU: 1 PID: 27 Comm: kworker/1:1 Tainted: GF 3.12.0+ #47 Hardware name: System manufacturer System Product Name/P5B-VM SE, BIOS 0501 05/31/2007 Workqueue: events e1000_reset_task [e1000] ffffffff820f6000 ffff88007b9dba98 ffffffff816b54a2 0000000000000002 ffffffff820f5e50 ffff88007b9dbae8 ffffffff810ba936 ffff88007b9dbac8 ffff88007b9dbb48 ffff88007b9d8f00 ffff88007b9d8780 ffff88007b9d8f00 Call Trace: [<ffffffff816b54a2>] dump_stack+0x49/0x5f [<ffffffff810ba936>] print_circular_bug+0x216/0x310 [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff8108a5eb>] flush_work+0x3b/0x70 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140 [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000] [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000] [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108b906>] ? process_one_work+0x166/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff8108c960>] ? manage_workers+0x2c0/0x2c0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70 == The issue background == The problem occurs, because e1000_down(), which is called under adapter->mutex by e1000_reset_task(), tries to synchronously cancel e1000 auxiliary works (reset_task, watchdog_task, phy_info_task, fifo_stall_task), which take adapter->mutex in their handlers. So the question is what does adapter->mutex protect there? The adapter->mutex was introduced by commit 0ef4ee ("e1000: convert to private mutex from rtnl") as a replacement for rtnl_lock() taken in the asynchronous handlers. It targeted on fixing a similar lockdep warning issued when e1000_down() was called under rtnl_lock(), and it fixed it, but unfortunately it introduced the lockdep warning described above. Anyway, that said the source of this bug is that the asynchronous works were made to take rtnl_lock() some time ago, so let's look deeper and find why it was added there. The rtnl_lock() was added to asynchronous handlers by commit 338c15 ("e1000: fix occasional panic on unload") in order to prevent asynchronous handlers from execution after the module is unloaded (e1000_down() is called) as it follows from the comment to the commit: > Net drivers in general have an issue where timers fired > by mod_timer or work threads with schedule_work are running > outside of the rtnl_lock. > > With no other lock protection these routines are vulnerable > to races with driver unload or reset paths. > > The longer term solution to this might be a redesign with > safer locks being taken in the driver to guarantee no > reentrance, but for now a safe and effective fix is > to take the rtnl_lock in these routines. I'm not sure if this locking scheme fixed the problem or just made it unlikely, although I incline to the latter. Anyway, this was long time ago when e1000 auxiliary works were implemented as timers scheduling real work handlers in their routines. The e1000_down() function only canceled the timers, but left the real handlers running if they were running, which could result in work execution after module unload. Today, the e1000 driver uses sane delayed works instead of the pair timer+work to implement its delayed asynchronous handlers, and the e1000_down() synchronously cancels all the works so that the problem that commit 338c15 tried to cope with disappeared, and we don't need any locks in the handlers any more. Moreover, any locking there can potentially result in a deadlock. So, this patch reverts commits 0ef4ee and 338c15. Fixes: 0ef4eedc2e98 ("e1000: convert to private mutex from rtnl") Fixes: 338c15e470d8 ("e1000: fix occasional panic on unload") Cc: Tushar Dave <tushar.n.dave@intel.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-11-29e1000: prevent oops when adapter is being closed and reset simultaneouslyyzhu1
This change is based on a similar change made to e1000e support in commit bb9e44d0d0f4 ("e1000e: prevent oops when adapter is being closed and reset simultaneously"). The same issue has also been observed on the older e1000 cards. Here, we have increased the RESET_COUNT value to 50 because there are too many accesses to e1000 nic on stress tests to e1000 nic, it is not enough to set RESET_COUT 25. Experimentation has shown that it is enough to set RESET_COUNT 50. Signed-off-by: yzhu1 <yanjun.zhu@windriver.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-11-14Merge branch 'for-linus-dma-masks' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull DMA mask updates from Russell King: "This series cleans up the handling of DMA masks in a lot of drivers, fixing some bugs as we go. Some of the more serious errors include: - drivers which only set their coherent DMA mask if the attempt to set the streaming mask fails. - drivers which test for a NULL dma mask pointer, and then set the dma mask pointer to a location in their module .data section - which will cause problems if the module is reloaded. To counter these, I have introduced two helper functions: - dma_set_mask_and_coherent() takes care of setting both the streaming and coherent masks at the same time, with the correct error handling as specified by the API. - dma_coerce_mask_and_coherent() which resolves the problem of drivers forcefully setting DMA masks. This is more a marker for future work to further clean these locations up - the code which creates the devices really should be initialising these, but to fix that in one go along with this change could potentially be very disruptive. The last thing this series does is prise away some of Linux's addition to "DMA addresses are physical addresses and RAM always starts at zero". We have ARM LPAE systems where all system memory is above 4GB physical, hence having DMA masks interpreted by (eg) the block layers as describing physical addresses in the range 0..DMAMASK fails on these platforms. Santosh Shilimkar addresses this in this series; the patches were copied to the appropriate people multiple times but were ignored. Fixing this also gets rid of some ARM weirdness in the setup of the max*pfn variables, and brings ARM into line with every other Linux architecture as far as those go" * 'for-linus-dma-masks' of git://git.linaro.org/people/rmk/linux-arm: (52 commits) ARM: 7805/1: mm: change max*pfn to include the physical offset of memory ARM: 7797/1: mmc: Use dma_max_pfn(dev) helper for bounce_limit calculations ARM: 7796/1: scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations ARM: 7795/1: mm: dma-mapping: Add dma_max_pfn(dev) helper function ARM: 7794/1: block: Rename parameter dma_mask to max_addr for blk_queue_bounce_limit() ARM: DMA-API: better handing of DMA masks for coherent allocations ARM: 7857/1: dma: imx-sdma: setup dma mask DMA-API: firmware/google/gsmi.c: avoid direct access to DMA masks DMA-API: dcdbas: update DMA mask handing DMA-API: dma: edma.c: no need to explicitly initialize DMA masks DMA-API: usb: musb: use platform_device_register_full() to avoid directly messing with dma masks DMA-API: crypto: remove last references to 'static struct device *dev' DMA-API: crypto: fix ixp4xx crypto platform device support DMA-API: others: use dma_set_coherent_mask() DMA-API: staging: use dma_set_coherent_mask() DMA-API: usb: use new dma_coerce_mask_and_coherent() DMA-API: usb: use dma_set_coherent_mask() DMA-API: parport: parport_pc.c: use dma_coerce_mask_and_coherent() DMA-API: net: octeon: use dma_coerce_mask_and_coherent() DMA-API: net: nxp/lpc_eth: use dma_coerce_mask_and_coherent() ...
2013-11-01e1000: fix wrong queue idx calculationHong Zhiguo
tx_ring and adapter->tx_ring are already of type "struct e1000_tx_ring *" Signed-off-by: Hong Zhiguo <zhiguohong@tencent.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-09-24intel: Remove extern from function prototypesJoe Perches
There are a mix of function prototypes with and without extern in the kernel sources. Standardize on not using extern for function prototypes. Function prototypes don't need to be written with extern. extern is assumed by the compiler. Its use is as unnecessary as using auto to declare automatic/local variables in a block. Signed-off-by: Joe Perches <joe@perches.com>
2013-09-21DMA-API: net: intel/e1000: replace dma_set_mask()+dma_set_coherent_mask() ↵Russell King
with new helper Replace the following sequence: dma_set_mask(dev, mask); dma_set_coherent_mask(dev, mask); with a call to the new helper dma_set_mask_and_coherent(). Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-29drivers:net: Convert dma_alloc_coherent(...__GFP_ZERO) to dma_zalloc_coherentJoe Perches
__GFP_ZERO is an uncommon flag and perhaps is better not used. static inline dma_zalloc_coherent exists so convert the uses of dma_alloc_coherent with __GFP_ZERO to the more common kernel style with zalloc. Remove memset from the static inline dma_zalloc_coherent and add just one use of __GFP_ZERO instead. Trivially reduces the size of the existing uses of dma_zalloc_coherent. Realign arguments as appropriate. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19net: vlan: add protocol argument to packet tagging functionsPatrick McHardy
Add a protocol argument to the VLAN packet tagging functions. In case of HW tagging, we need that protocol available in the ndo_start_xmit functions, so it is stored in a new field in the skb. The new field fits into a hole (on 64 bit) and doesn't increase the sks's size. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19net: vlan: prepare for 802.1ad VLAN filtering offloadPatrick McHardy
Change the rx_{add,kill}_vid callbacks to take a protocol argument in preparation of 802.1ad support. The protocol argument used so far is always htons(ETH_P_8021Q). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19net: vlan: rename NETIF_F_HW_VLAN_* feature flags to NETIF_F_HW_VLAN_CTAG_*Patrick McHardy
Rename the hardware VLAN acceleration features to include "CTAG" to indicate that they only support CTAGs. Follow up patches will introduce 802.1ad server provider tagging (STAGs) and require the distinction for hardware not supporting acclerating both. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: net/mac80211/sta_info.c net/wireless/core.h Two minor conflicts in wireless. Overlapping additions of extern declarations in net/wireless/core.h and a bug fix overlapping with the addition of a boolean parameter to __ieee80211_key_free(). Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-27e1000: ethtool: Add missing dma_mapping_error-call in e1000_setup_desc_ringsChristoph Paasch
After dma_map_single, dma_mapping_error must be called. Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-17drivers:net: dma_alloc_coherent: use __GFP_ZERO instead of memset(, 0)Joe Perches
Reduce the number of calls required to alloc a zeroed block of memory. Trivially reduces overall object size. Other changes around these removals o Neaten call argument alignment o Remove an unnecessary OOM message after dma_alloc_coherent failure o Remove unnecessary gfp_t stack variable Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-15drivers:net: Remove dma_alloc_coherent OOM messagesJoe Perches
I believe these error messages are already logged on allocation failure by warn_alloc_failed and so get a dump_stack on OOM. Remove the unnecessary additional error logging. Around these deletions: o Alignment neatening. o Remove unnecessary casts of dma_alloc_coherent. o Hoist assigns from ifs. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>