Since we moved the "lock" to be the first element of
struct tlb_core_data in commit 82d86de25b ("powerpc/e6500: Make TLB
lock recursive"), this macro is not used by any code. Just delete it.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
In u-boot, when set the video as console, the name 'vga' is used
as a general name for the video device, during the fdt_fixup_stdout
process, the 'vga' name is used to search in the dtb to setup the
'linux,stdout-path' node. Though the P1022 DIU is not VGA-compatible
device, to meet the 'vga' name used in u-boot, the vga alias node is
added for P1022 in this patch. At the same time, a display alias is
also added so that no other components grow dependencies on the vga
alias node.
Signed-off-by: Jason Jin <Jason.Jin@freescale.com>
Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
LTC3886 is a is a dual PolyPhase DC/DC synchronous step-down switching
regulator controller. It is mostly command compatible to LTC3883,
but supports two phases instead of one.
Suggested-by: Michael Jones <mike@proclivis.com>
Tested-by: Michael Jones <mike@proclivis.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
LTC2980 and LTM2987 are command compatible to LTC2977. They consist of
two LTC2977 on a single die, and are instantiated as two separate chips,
each supporting eight channels.
Suggested-by: Michael Jones <mike@proclivis.com>
Tested-by: Michael Jones <mike@proclivis.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Add additional chip ID for an older revision of LTC2978, as well
as two chip IDs for LTC3882. Turns out the LTC3882 does support the
LTC2978_MFR_SPECIAL_ID register, and reading it returns its chip ID,
but the register is undocumented.
Suggested-by: Michael Jones <mike@proclivis.com>
Tested-by: Michael Jones <mike@proclivis.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Per information from Linear Technologies, the ID mask is 12 bit
for all chips of this series. Use this mask to detect chips to ensure
that all chip revisions are detected.
Suggested-by: Michael Jones <mike@proclivis.com>
Tested-by: Michael Jones <mike@proclivis.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Pull rdma bugfix from Doug Ledford:
"Bugfix in iw_cxgb4"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
iw_cxgb4: gracefully handle unknown CQE status errors
This work adds the possibility of deriving the zone id from the skb->mark
field in a scalable manner. This allows for having only a single template
serving hundreds/thousands of different zones, for example, instead of the
need to have one match for each zone as an extra CT jump target.
Note that we'd need to have this information attached to the template as at
the time when we're trying to lookup a possible ct object, we already need
to know zone information for a possible match when going into
__nf_conntrack_find_get(). This work provides a minimal implementation for
a possible mapping.
In order to not add/expose an extra ct->status bit, the zone structure has
been extended to carry a flag for deriving the mark.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This work adds a direction parameter to netfilter zones, so identity
separation can be performed only in original/reply or both directions
(default). This basically opens up the possibility of doing NAT with
conflicting IP address/port tuples from multiple, isolated tenants
on a host (e.g. from a netns) without requiring each tenant to NAT
twice resp. to use its own dedicated IP address to SNAT to, meaning
overlapping tuples can be made unique with the zone identifier in
original direction, where the NAT engine will then allocate a unique
tuple in the commonly shared default zone for the reply direction.
In some restricted, local DNAT cases, also port redirection could be
used for making the reply traffic unique w/o requiring SNAT.
The consensus we've reached and discussed at NFWS and since the initial
implementation [1] was to directly integrate the direction meta data
into the existing zones infrastructure, as opposed to the ct->mark
approach we proposed initially.
As we pass the nf_conntrack_zone object directly around, we don't have
to touch all call-sites, but only those, that contain equality checks
of zones. Thus, based on the current direction (original or reply),
we either return the actual id, or the default NF_CT_DEFAULT_ZONE_ID.
CT expectations are direction-agnostic entities when expectations are
being compared among themselves, so we can only use the identifier
in this case.
Note that zone identifiers can not be included into the hash mix
anymore as they don't contain a "stable" value that would be equal
for both directions at all times, f.e. if only zone->id would
unconditionally be xor'ed into the table slot hash, then replies won't
find the corresponding conntracking entry anymore.
If no particular direction is specified when configuring zones, the
behaviour is exactly as we expect currently (both directions).
Support has been added for the CT netlink interface as well as the
x_tables raw CT target, which both already offer existing interfaces
to user space for the configuration of zones.
Below a minimal, simplified collision example (script in [2]) with
netperf sessions:
+--- tenant-1 ---+ mark := 1
| netperf |--+
+----------------+ | CT zone := mark [ORIGINAL]
[ip,sport] := X +--------------+ +--- gateway ---+
| mark routing |--| SNAT |-- ... +
+--------------+ +---------------+ |
+--- tenant-2 ---+ | ~~~|~~~
| netperf |--+ +-----------+ |
+----------------+ mark := 2 | netserver |------ ... +
[ip,sport] := X +-----------+
[ip,port] := Y
On the gateway netns, example:
iptables -t raw -A PREROUTING -j CT --zone mark --zone-dir ORIGINAL
iptables -t nat -A POSTROUTING -o <dev> -j SNAT --to-source <ip> --random-fully
iptables -t mangle -A PREROUTING -m conntrack --ctdir ORIGINAL -j CONNMARK --save-mark
iptables -t mangle -A POSTROUTING -m conntrack --ctdir REPLY -j CONNMARK --restore-mark
conntrack dump from gateway netns:
netperf -H 10.1.1.2 -t TCP_STREAM -l60 -p12865,5555 from each tenant netns
tcp 6 431995 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=1
src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=1024
[ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1
tcp 6 431994 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=2
src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=5555
[ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=1
tcp 6 299 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=39438 dport=33768 zone-orig=1
src=10.1.1.2 dst=10.1.1.1 sport=33768 dport=39438
[ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1
tcp 6 300 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=32889 dport=40206 zone-orig=2
src=10.1.1.2 dst=10.1.1.1 sport=40206 dport=32889
[ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=2
Taking this further, test script in [2] creates 200 tenants and runs
original-tuple colliding netperf sessions each. A conntrack -L dump in
the gateway netns also confirms 200 overlapping entries, all in ESTABLISHED
state as expected.
I also did run various other tests with some permutations of the script,
to mention some: SNAT in random/random-fully/persistent mode, no zones (no
overlaps), static zones (original, reply, both directions), etc.
[1] http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.devel/57412/
[2] https://paste.fedoraproject.org/242835/65657871/
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull libata fixes from Tejun Heo:
"Three minor device-specific fixes and revert of NCQ autosense added
during this -rc1.
It turned out that NCQ autosense as currently implemented interferes
with the usual error handling behavior. It will be revisited in the
near future"
* 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ata: ahci_brcmstb: Fix misuse of IS_ENABLED
sata_sx4: Check return code from pdc20621_i2c_read()
Revert "libata: Implement NCQ autosense"
Revert "libata: Implement support for sense data reporting"
Revert "libata-eh: Set 'information' field for autosense"
ata: ahci_brcmstb: Fix warnings with CONFIG_PM_SLEEP=n
Pull cgroup fix from Tejun Heo:
"A fix for a subtle bug introduced back during 3.17 cycle which
interferes with setting configurations under specific conditions"
* 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cpuset: use trialcs->mems_allowed as a temp variable
We should not sleep while holding a spinlock.
bcm_gpio_set_power is called while holding the bcm_device_list lock.
Signed-off-by: Loic Poulain <loic.poulain@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
kbuild test robot reported:
tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head: d52736e24f
commit: 4e3c89920c [751/762] net: Introduce VRF related flags and helpers
reproduce: make htmldocs
>> Warning(include/linux/netdevice.h:1293): Enum value 'IFF_VRF_MASTER' not described in enum 'netdev_priv_flags'
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As Eric noted netif_index_is_vrf is not called with rcu_read_lock held,
so wrap the dev_get_by_index_rcu in rcu_read_lock and unlock.
If VRF is not enabled or oif is 0 skip the device lookup. In both cases
index cannot be the VRF master.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Achiad Shochat says:
====================
Driver updates 16-Aug-2015
This patchset contains bug fixes, new RSS and pause parameters ethtool
options, and support for RX CHECKSUM_COMPLETE.
Patchset was applied and tested over commit adc6310 ("Merge branch
'mv88e6xxx-switchdev-fdb'").
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Only for packets with first ethertype set to IPv4/6 for now.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only rx/tx pause settings.
Autoneg setting is currently not supported.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Port speed settings are applied by the device only upon
port admin status transition from DOWN to UP.
So we enforce this transition regardless of the port's
current operation state (which may be occasionally DOWN if
for example the network cable is disconnected).
- Fix the PORT_UP/DOWN device interface enum
- Set the local_port bit in the device PAOS register
- EXPORT the PAOS (Port Administrative and Operational Status)
register set/query access functions.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Change the maximum LRO session size from 16KB to 64KB
- Reduce the LRO session timeout from 512us to 32us in
order to reduce the TCP latency of non-LRO'ed flows.
- Fix skb_shinfo(skb)->gso_size and set skb_shinfo(skb)->gso_type.
- Fix a bug accessing un-initialized mdev pointer.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We un-intentionally limited the minimum rings size too much.
TX minimum ring size reduced from 128 to 64.
RX minimum ring size reduced from 128 to 2.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The indirection table size was defined by a variable that
was actually assigned a constant value.
Since we do not have any forseen intension to make it configurable
we simply made it a constant.
We also limit the number of channels such that the RSS indirection
table could always populate all RX rings.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to generate a unique key per TIR.
Generating a single key per netdev and copying it to all
its TIRs.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hedberg says:
====================
pull request: bluetooth-next 2015-08-16
Here's what's likely the last bluetooth-next pull request for 4.3:
- 6lowpan/802.15.4 refactoring, cleanups & fixes
- Document 6lowpan netdev usage in Documentation/networking/6lowpan.txt
- Support for UART based QCA Bluetooth controllers
- Power management support for Broeadcom Bluetooth controllers
- Change LE connection initiation to always use passive scanning first
- Support for new Silicon Wave USB ID
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When competing sync(2) calls walk the same filesystem, they need to
walk the list of inodes on the superblock to find all the inodes
that we need to wait for IO completion on. However, when multiple
wait_sb_inodes() calls do this at the same time, they contend on the
the inode_sb_list_lock and the contention causes system wide
slowdowns. In effect, concurrent sync(2) calls can take longer and
burn more CPU than if they were serialised.
Stop the worst of the contention by adding a per-sb mutex to wrap
around wait_sb_inodes() so that we only execute one sync(2) IO
completion walk per superblock superblock at a time and hence avoid
contention being triggered by concurrent sync(2) calls.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Dave Chinner <dchinner@redhat.com>
The process of reducing contention on per-superblock inode lists
starts with moving the locking to match the per-superblock inode
list. This takes the global lock out of the picture and reduces the
contention problems to within a single filesystem. This doesn't get
rid of contention as the locks still have global CPU scope, but it
does isolate operations on different superblocks form each other.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Dave Chinner <dchinner@redhat.com>
Some filesystems don't use the VFS inode hash and fake the fact they
are hashed so that all the writeback code works correctly. However,
this means the evict() path still tries to remove the inode from the
hash, meaning that the inode_hash_lock() needs to be taken
unnecessarily. Hence under certain workloads the inode_hash_lock can
be contended even if the inode is never actually hashed.
To avoid this add hlist_fake to test if the inode isn't actually
hashed to avoid taking the hash lock on inodes that have never been
hashed. Based on Dave Chinner's
inode: add IOP_NOTHASHED to avoid inode hash lock in evict
basd on Al's suggestions. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Dave Chinner <dchinner@redhat.com>
Doing writeback on lots of little files causes terrible IOPS storms
because of the per-mapping writeback plugging we do. This
essentially causes imeediate dispatch of IO for each mapping,
regardless of the context in which writeback is occurring.
IOWs, running a concurrent write-lots-of-small 4k files using fsmark
on XFS results in a huge number of IOPS being issued for data
writes. Metadata writes are sorted and plugged at a high level by
XFS, so aggregate nicely into large IOs. However, data writeback IOs
are dispatched in individual 4k IOs, even when the blocks of two
consecutively written files are adjacent.
Test VM: 8p, 8GB RAM, 4xSSD in RAID0, 100TB sparse XFS filesystem,
metadata CRCs enabled.
Kernel: 3.10-rc5 + xfsdev + my 3.11 xfs queue (~70 patches)
Test:
$ ./fs_mark -D 10000 -S0 -n 10000 -s 4096 -L 120 -d
/mnt/scratch/0 -d /mnt/scratch/1 -d /mnt/scratch/2 -d
/mnt/scratch/3 -d /mnt/scratch/4 -d /mnt/scratch/5 -d
/mnt/scratch/6 -d /mnt/scratch/7
Result:
wall sys create rate Physical write IO
time CPU (avg files/s) IOPS Bandwidth
----- ----- ------------ ------ ---------
unpatched 6m56s 15m47s 24,000+/-500 26,000 130MB/s
patched 5m06s 13m28s 32,800+/-600 1,500 180MB/s
improvement -26.44% -14.68% +36.67% -94.23% +38.46%
If I use zero length files, this workload at about 500 IOPS, so
plugging drops the data IOs from roughly 25,500/s to 1000/s.
3 lines of code, 35% better throughput for 15% less CPU.
The benefits of plugging at this layer are likely to be higher for
spinning media as the IO patterns for this workload are going make a
much bigger difference on high IO latency devices.....
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Govindarajulu Varadarajan says:
====================
enic: add devcmd2
This series adds new devcmd2 support. The first two patches are code
refactoring.
devcmd is an interface for driver to communicate with fw/adaptor. It
involves writing data to hardware registers and waiting for the result.
This mechanism does not scale well. The queuing of "no wait" devcmds is
done in firmware memory rather than on the host. Firmware memory is a
rather more scarce and valuable resource than host memory. A devcmd storm
from one vf can disrupt the service on other pf/vf. The lack of flow
control allows for possible denial of server from one VM to another.
Devcmd2 uses work queue to post the devcmds, just like tx work queue. This
allows better flow control.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
devcmd is an interface for driver to communicate with fw/adaptor. It
involves writing data to hardware registers and waiting for the result.
This mechanism does not scale well. The queuing of "no wait" devcmds is
done in firmware memory rather than on the host. Firmware memory is a
rather more scarce and valuable resource than host memory. A devcmd storm
from one vf can disrupt the service on other pf/vf. The lack of flow
control allows for possible denial of server from one VM to another.
Devcmd2 uses work queue to post the devcmds, just like tx work queue. This
allows better flow control.
Initialize devcmd2, if fails we fall back to devcmd1.
Also change the driver version.
Signed-off-by: N V V Satyanarayana Reddy <nalreddy@cisco.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add devcmd resources to vnic_res_type. Add data types used by devcmd.
Signed-off-by: N V V Satyanarayana Reddy <nalreddy@cisco.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pr_info does not give any details about the interface involved. This patch
uses netdev_info for printing the message. Use dev_info where netdev is not
ready.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the structure definitions are in .c file to make them private to
that file. This patch moves the struct definition to .h file, So that their
definitions are accessible from other files.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
generic_file_write_iter() will already do an fsync on our behalf
if the file descriptor is O_SYNC or the file is marked as IS_SYNC.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
This sets the queue's max segment size to match the device's
capabilities. The default of 128 is usable until a device's transfer
capability exceeds 512k, assuming a device page size of 4k. Many nvme
devices exceed that transfer limit, so this lets the block layer know what
kind of commands it to allow to form rather than unnecessarily split them.
One additional segment is added to account for a transfer that may start
in the middle of a page.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Change brace placement to be in line with coding standards
Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
After having tested insertion, lookup, table walk and removal, spawn a
number of threads running operations on the same rhashtable. Each of
them will:
1) insert it's own set of objects,
2) lookup every successfully inserted object and finally
3) remove objects in several rounds until all of them have been removed,
making sure the remaining ones are still found after each round.
This should put a good amount of load onto the system and due to
synchronising thread startup via two semaphores also extensive
concurrent table access.
The default number of ten threads returned within half a second on my
local VM with two cores. Running 200 threads took about four seconds. If
slow systems suffer too much from this though, the default could be
lowered or even set to zero so this extended test does not run at all by
default.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antonio Quartulli says:
====================
Included changes:
- avoid integer overflow in GW selection routine
- prevent race condition by making capability bit changes atomic (use
clear/set/test_bit)
- fix synchronization issue in mcast tvlv handler
- fix crash on double list removal of TT Request objects
- fix leak by puring packets enqueued for sending upon iface removal
- ensure network header pointer is set in skb
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
VxLAN offloading is not functional if the NIC is running in multichannel
mode (UMC, FLEX-10, VNIC...). Enabling this additionally kills whole
connectivity through the NIC and the device needs to be down and up to
restore it. The firmware should take care about it and does not allow
the conversion of interface to tunnel type (be_cmd_manage_iface) or should
support VxLAN offloading if multichannel config is enabled.
I have tested this on the latest available firmware (10.6.144.21).
Result:
[root@sm-04 ~]# ip link set enp5s0f0 up[root@sm-04 ~]# ip addr add 172.30.10.50/24 dev enp5s0f0
[root@sm-04 ~]# ping -c 3 172.30.10.254PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
64 bytes from 172.30.10.254: icmp_seq=1 ttl=64 time=0.317 ms
64 bytes from 172.30.10.254: icmp_seq=2 ttl=64 time=0.187 ms
64 bytes from 172.30.10.254: icmp_seq=3 ttl=64 time=0.188 ms
--- 172.30.10.254 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.187/0.230/0.317/0.063 ms
[root@sm-04 ~]# ip link add link enp5s0f0 vxlan10 type vxlan id 10 remote 172.30.10.60 dstport 4789
[root@sm-04 ~]# ip link set vxlan10 up
[ 7900.442811] be2net 0000:05:00.0: Enabled VxLAN offloads for UDP port 4789
[ 7900.455722] be2net 0000:05:00.1: Enabled VxLAN offloads for UDP port 4789
[ 7900.468635] be2net 0000:05:00.2: Enabled VxLAN offloads for UDP port 4789
[ 7900.481553] be2net 0000:05:00.3: Enabled VxLAN offloads for UDP port 4789
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
--- 172.30.10.254 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms
[root@sm-04 ~]# ip link set vxlan10 down
[ 7959.434093] be2net 0000:05:00.0: Disabled VxLAN offloads for UDP port 4789
[ 7959.444792] be2net 0000:05:00.1: Disabled VxLAN offloads for UDP port 4789
[ 7959.455592] be2net 0000:05:00.2: Disabled VxLAN offloads for UDP port 4789
[ 7959.466416] be2net 0000:05:00.3: Disabled VxLAN offloads for UDP port 4789
[root@sm-04 ~]# ip link del vxlan10
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
--- 172.30.10.254 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms
[root@sm-04 ~]# ip link set enp5s0f0 down
[root@sm-04 ~]# ip link set enp5s0f0 up
[ 8071.019003] be2net 0000:05:00.0 enp5s0f0: Link is Up
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
64 bytes from 172.30.10.254: icmp_seq=1 ttl=64 time=0.318 ms
64 bytes from 172.30.10.254: icmp_seq=2 ttl=64 time=0.196 ms
64 bytes from 172.30.10.254: icmp_seq=3 ttl=64 time=0.194 ms
--- 172.30.10.254 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.194/0.236/0.318/0.057 ms
Cc: Sathya Perla <sathya.perla@avagotech.com>
Cc: Ajit Khaparde <ajit.khaparde@avagotech.com>
Cc: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@avagotech.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Ajit Khaparde <ajit.khaparde@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin KaFai Lau says:
====================
ipv6: Fix a potential deadlock when creating pcpu rt
v1 -> v2:
A minor change in the commit message of patch 2.
This patch series fixes a potential deadlock when creating a pcpu rt.
It happens when dst_alloc() decided to run gc. Something like this:
read_lock(&table->tb6_lock);
ip6_rt_pcpu_alloc()
=> dst_alloc()
=> ip6_dst_gc()
=> write_lock(&table->tb6_lock); /* oops */
Patch 1 and 2 are some prep works.
Patch 3 is the fix.
Original report: https://bugzilla.kernel.org/show_bug.cgi?id=102291
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
It is a prep work for fixing a potential deadlock when creating
a pcpu rt.
The current rt6_get_pcpu_route() will also create a pcpu rt if one does not
exist. This patch moves the pcpu rt creation logic into another function,
rt6_make_pcpu_route().
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
After 4b32b5ad31 ("ipv6: Stop rt6_info from using inet_peer's metrics"),
ip6_dst_alloc() does not need the 'table' argument. This patch
cleans it up.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* Due to HW bug, LAN8700 sometimes does not detect presence of energy in the
Ethernet cable in Energy Detect Power-Down mode (e.g while EDPWRDOWN bit is
set, the ENERGYON bit does not asserted sometimes). This is a common bug of
LAN87xx family of PHY chips.
* The lan87xx_read_status() was improved to acquire ENERGYON bit. Its previous
algorythm still not reliable on 100 % and sometimes skip cable plugging.
Signed-off-by: Igor Plyatov <plyatov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Another pull request for the next cycle, this time with quite
a bit of content:
* mesh fixes/improvements from Alexis, Bob, Chun-Yeow and Jesse
* TDLS higher bandwidth support (Arik)
* OCB fixes from Bertold Van den Bergh
* suspend/resume fixes from Eliad
* dynamic SMPS support for minstrel-HT (Krishna Chaitanya)
* VHT bitrate mask support (Lorenzo Bianconi)
* better regulatory support for 5/10 MHz channels (Matthias May)
* basic support for MU-MIMO to avoid the multi-vif issue (Sara Sharon)
along with a number of other cleanups.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>