We currently always send fragments without DF bit set.
Thus, given following setup:
mtu1500 - mtu1500:1400 - mtu1400:1280 - mtu1280
A R1 R2 B
Where R1 and R2 run linux with netfilter defragmentation/conntrack
enabled, then if Host A sent a fragmented packet _with_ DF set to B, R1
will respond with icmp too big error if one of these fragments exceeded
1400 bytes.
However, if R1 receives fragment sizes 1200 and 100, it would
forward the reassembled packet without refragmenting, i.e.
R2 will send an icmp error in response to a packet that was never sent,
citing mtu that the original sender never exceeded.
The other minor issue is that a refragmentation on R1 will conceal the
MTU of R2-B since refragmentation does not set DF bit on the fragments.
This modifies ip_fragment so that we track largest fragment size seen
both for DF and non-DF packets, and set frag_max_size to the largest
value.
If the DF fragment size is larger or equal to the non-df one, we will
consider the packet a path mtu probe:
We set DF bit on the reassembled skb and also tag it with a new IPCB flag
to force refragmentation even if skb fits outdev mtu.
We will also set DF bit on each fragment in this case.
Joint work with Hannes Frederic Sowa.
Reported-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In certain circumstances it may not be possible to schedule particular
events due to constraints other than a lack of hardware counters (e.g.
on big.LITTLE systems where CPUs support different events). The core
perf event code does not distinguish these cases and pessimistically
assumes that any failure to schedule an event means that it is not worth
attempting to schedule later events, even if some hardware counters are
still unused.
When an event a pmu cannot schedule exists in a flexible group list it
can unnecessarily prevent event groups following it in the list from
being scheduled (until it is rotated to the end of the list). This means
some events are scheduled for only a portion of the time they could be,
and for short running programs no events may be scheduled if the list is
initially sorted in an unfortunate order.
This patch adds a new (optional) filter_match function pointer to struct
pmu which a pmu driver can use to tell perf core when an event matches
pmu-specific scheduling requirements. This plugs into the existing
event_filter_match logic, and makes it possible to avoid the scheduling
problem described above. When no filter is provided by the PMU, the
existing behaviour is retained.
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch adds transmission power setting support for IEEE-802.15.4
devices via nl802154.
Signed-off-by: Varka Bhadram <varkab@cdac.in>
Acked-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
During initialization, the DRBG now tries to allocate a handle of the
Jitter RNG. If such a Jitter RNG is available during seeding, the DRBG
pulls the required entropy/nonce string from get_random_bytes and
concatenates it with a string of equal size from the Jitter RNG. That
combined string is now the seed for the DRBG.
Written differently, the initial seed of the DRBG is now:
get_random_bytes(entropy/nonce) || jitterentropy (entropy/nonce)
If the Jitter RNG is not available, the DRBG only seeds from
get_random_bytes.
CC: Andreas Steffen <andreas.steffen@strongswan.org>
CC: Theodore Ts'o <tytso@mit.edu>
CC: Sandy Harris <sandyinchina@gmail.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The async seeding operation is triggered during initalization right
after the first non-blocking seeding is completed. As required by the
asynchronous operation of random.c, a callback function is provided that
is triggered by random.c once entropy is available. That callback
function performs the actual seeding of the DRBG.
CC: Andreas Steffen <andreas.steffen@strongswan.org>
CC: Theodore Ts'o <tytso@mit.edu>
CC: Sandy Harris <sandyinchina@gmail.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In order to prepare for the addition of the asynchronous seeding call,
the invocation of seeding the DRBG is moved out into a helper function.
In addition, a block of memory is allocated during initialization time
that will be used as a scratchpad for obtaining entropy. That scratchpad
is used for the initial seeding operation as well as by the
asynchronous seeding call. The memory must be zeroized every time the
DRBG seeding call succeeds to avoid entropy data lingering in memory.
CC: Andreas Steffen <andreas.steffen@strongswan.org>
CC: Theodore Ts'o <tytso@mit.edu>
CC: Sandy Harris <sandyinchina@gmail.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commit 43b4578071 ("perf/x86: Reduce stack usage of
x86_schedule_events()") violated the rule that 'fake' scheduling; as
used for event/group validation; should not change the event state.
This went mostly un-noticed because repeated calls of
x86_pmu::get_event_constraints() would give the same result. And
x86_pmu::put_event_constraints() would mostly not do anything.
Commit e979121b1b ("perf/x86/intel: Implement cross-HT corruption
bug workaround") made the situation much worse by actually setting the
event->hw.constraint value to NULL, so when validation and actual
scheduling interact we get NULL ptr derefs.
Fix it by removing the constraint pointer from the event and move it
back to an array, this time in cpuc instead of on the stack.
validate_group()
x86_schedule_events()
event->hw.constraint = c; # store
<context switch>
perf_task_event_sched_in()
...
x86_schedule_events();
event->hw.constraint = c2; # store
...
put_event_constraints(event); # assume failure to schedule
intel_put_event_constraints()
event->hw.constraint = NULL;
<context switch end>
c = event->hw.constraint; # read -> NULL
if (!test_bit(hwc->idx, c->idxmsk)) # <- *BOOM* NULL deref
This in particular is possible when the event in question is a
cpu-wide event and group-leader, where the validate_group() tries to
add an event to the group.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Hunter <ahh@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 43b4578071 ("perf/x86: Reduce stack usage of x86_schedule_events()")
Fixes: e979121b1b ("perf/x86/intel: Implement cross-HT corruption bug workaround")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
RGMII interfaces come in 4 different flavors that the PHY library needs
to care about: regular RGMII (no delays), RGMII with either RX or TX
delay, and both. In order to avoid errors of checking only for one type
of RGMII interface and miss the 3 others, introduce a convenience
function which tests for all values.
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 5590f3196b ("drivers/core/of: Add symlink to device-tree from
devices with an OF node") adds the symlink `of_node` for each device
pointing to it's device tree node while creating/initialising it.
However the devicetree sysfs is created and setup in of_init which is
executed at core_initcall level. For all the devices created before
of_init, the following error is thrown:
"Error -2(-ENOENT) creating of_node link"
Like many other components in driver model, initialize the sysfs support
for OF/devicetree from driver_init so that it's ready before any devices
are created.
Fixes: 5590f3196b ("drivers/core/of: Add symlink to device-tree from
devices with an OF node")
Suggested-by: Rob Herring <robh+dt@kernel.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Robert Schwebel <r.schwebel@pengutronix.de>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The cgroup side of threadgroup locking uses signal_struct->group_rwsem
to synchronize against threadgroup changes. This per-process rwsem
adds small overhead to thread creation, exit and exec paths, forces
cgroup code paths to do lock-verify-unlock-retry dance in a couple
places and makes it impossible to atomically perform operations across
multiple processes.
This patch replaces signal_struct->group_rwsem with a global
percpu_rwsem cgroup_threadgroup_rwsem which is cheaper on the reader
side and contained in cgroups proper. This patch converts one-to-one.
This does make writer side heavier and lower the granularity; however,
cgroup process migration is a fairly cold path, we do want to optimize
thread operations over it and cgroup migration operations don't take
enough time for the lower granularity to matter.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
threadgroup_change_begin/end() are used to mark the beginning and end
of threadgroup modifying operations to allow code paths which require
a threadgroup to stay stable across blocking operations to synchronize
against those sections using threadgroup_lock/unlock().
It's currently implemented as a general mechanism in sched.h using
per-signal_struct rwsem; however, this never grew non-cgroup use cases
and becomes noop if !CONFIG_CGROUPS. It turns out that cgroups is
gonna be better served with a different sycnrhonization scheme and is
a bit silly to keep cgroups specific details as a general mechanism.
What's general here is identifying the places where threadgroups are
modified. This patch restructures threadgroup locking so that
threadgroup_change_begin/end() become a place where subsystems which
need to sycnhronize against threadgroup changes can hook into.
cgroup_threadgroup_change_begin/end() which operate on the
per-signal_struct rwsem are created and threadgroup_lock/unlock() are
moved to cgroup.c and made static.
This is pure reorganization which doesn't cause any functional
changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
If tcp ehash table is constrained to a very small number of buckets
(eg boot parameter thash_entries=128), then we can crash if spinlock
array has more entries.
While we are at it, un-inline inet_ehash_locks_alloc() and make
following changes :
- Budget 2 cache lines per cpu worth of 'spinlocks'
- Try to kmalloc() the array to avoid extra TLB pressure.
(Most servers at Google allocate 8192 bytes for this hash table)
- Get rid of various #ifdef
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The STMPE MFD is only used with device tree configured systems (and STMPE
MFD core depends on OF), so force the configuration to come from device
tree only.
Tested-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
This allows us to create netdev tables that contain ingress chains. Use
skb_header_pointer() as we may see shared sk_buffs at this stage.
This change provides access to the existing nf_tables features from the ingress
hook.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds the internal NFT_AF_NEEDS_DEV flag to indicate that you must
attach this table to a net_device.
This change is required by the follow up patch that introduces the new netdev
table.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Split the "get phy from device_node" functionality out of
"get phy by phandle" so it can be used directly.
This is useful when a battery-charger is intimately associated with a
particular phy but handled by a separate driver. The charger
can find the device_node based on sibling relationships
without the need for a redundant declaration in the devicetree
description.
As a peripheral that gets a phy will often want to register a
notifier block, and de-register it later, that functionality
is included so the de-registration is automatic.
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Felipe Balbi <balbi@ti.com>
USB3380 enhanced mode allows GPEP to be used in both IN and OUT
directions. However, IN and OUT endpoints must use same USB endpoint
address (bEndpointAddress). Fix this by setting the ep_cfg.ep_number
during initialization and keep it in net2280_enable()
Tested-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mian Yousaf Kaukab <yousaf.kaukab@intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
USB3380 in enhanced mode has 4 IN and 4 OUT endpoints. Check
interrupts for all of them.
Tested-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mian Yousaf Kaukab <yousaf.kaukab@intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Since the HSUSB controllers of R-Car Gen2 are the same specification
(they have 16 pipes and usb-dmac), this patch changes USBHS_TYPE_R8A7790
and USBHS_TYPE_R8A7791 to USBHS_TYPE_RCAR_GEN2.
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
for_each_*_in_state validate array index after
access to array elements, thus perform out of bounds read.
Fix this by validating index in the first place and read
array element iff validation was successful.
Fixes: df63b9994e ("drm/atomic: Add for_each_{connector,crtc,plane}_in_state helper macros")
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It was reported that this comment was confusing, and indeed it is.
v2: (one year later!) Add the range for the DRM_I915_* iotcl defines
(Daniel)
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Userspace will be able to tell whether a GPU reset occured by comparing
an old referece value of the counter with a new value.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Atomic modesetting: now with modesetting support.
v2: Moved drm_atomic_set_mode_prop_for_crtc from previous patch; removed
state->active fiddling, documented return code. Changed property
type to DRM_MODE_PROP_BLOB.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Tested-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Add a blob property tracking the current mode to the CRTC state, and
ensure it is properly updated and referenced.
v2: Continue using crtc_state->mode inside getcrtc, instead of reading
out the mode blob. Use IS_ERR and PTR_ERR from create_blob. Move
set_mode_prop_for_crtc to later patch where it actually gets used.
Enforce !!state->enable == !!state->mode_blob inside
drm_atomic_crtc_check.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Tested-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Add a new helper, to be used later for blob property management, that
sets the mode for a CRTC state, as well as updating the CRTC enable/active
state at the same time.
v2: Do not touch active/mode_changed in CRTC state. Document return
value. Remove stray drm_atomic_set_mode_prop_for_crtc declaration.
v3: Remove i915 changes, and leave it directly bashing crtc_state->mode
for the meantime.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Tested-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch fix spelling typos found in DocBook/rapidio.xml
Ths file was generated from comments in the source files,
I had to fix them, instead of the xml file.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
When we disconnect from the AP, drivers call cfg80211_disconnect().
This doesn't know whether the disconnection was initiated locally
or by the AP though, which can cause problems with the supplicant,
for example with WPS. This issue obviously doesn't show up with any
mac80211 based driver since mac80211 doesn't call this function.
Fix this by requiring drivers to indicate whether the disconnect is
locally generated or not. I've tried to update the drivers, but may
not have gotten the values correct, and some drivers may currently
not be able to report correct values. In case of doubt I left it at
false, which is the current behaviour.
For libertas, make adjustments as indicated by Dan Williams.
Reported-by: Matthieu Mauger <matthieux.mauger@intel.com>
Tested-by: Matthieu Mauger <matthieux.mauger@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This makes it possible to save some lines of code in drivers with an
simple bcma driver registration.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Architecture-specific helpers are not supposed to muck with
struct kvm_userspace_memory_region contents. Add const to
enforce this.
In order to eliminate the only write in __kvm_set_memory_region,
the cleaning of deleted slots is pulled up from update_memslots
to __kvm_set_memory_region.
Reviewed-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Reviewed-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a new driver for pxa SoCs, which is also compatible with the former
mmp_pdma.
The rationale behind a new driver (as opposed to incremental patching) was :
- the new driver relies on virt-dma, which obsoletes all the internal
structures of mmp_pdma (sw_desc, hw_desc, ...), and by consequence all the
functions
- mmp_pdma allocates dma coherent descriptors containing not only hardware
descriptors but linked list information
The new driver only puts the dma hardware descriptors (ie. 4 u32) into the
dma pool allocated memory. This changes completely the way descriptors are
handled
- the architecture behind the interrupt/tasklet management was rewritten to be
more conforming to virt-dma
- the buffers alignment is handled differently
The former driver assumed that the DMA channel stopped between each
descriptor. The new one chains descriptors to let the channel running. This
is a necessary guarantee for real-time high bandwidth usecases such as video
capture on "old" architectures such as pxa.
- hot chaining / cold chaining / no chaining
Whenever possible, submitting a descriptor "hot chains" it to a running
channel. There is still no guarantee that the descriptor will be issued, as
the channel might be stopped just before the descriptor is submitted. Yet
this allows to submit several video buffers, and resubmit a buffer while
another is under handling.
As before, dma_async_issue_pending() is the only guarantee to have all the
buffers issued.
When an alignment issue is detected (ie. one address in a descriptor is not
a multiple of 8), if the already running channel is in "aligned mode", the
channel will stop, and restarted in "misaligned mode" to finished the issued
list.
- descriptors reusing
A submitted, issued and completed descriptor can be reused, ie resubmitted if
it was prepared with the proper flag (DMA_PREP_ACK). Only a channel
resources release will in this case release that buffer.
This allows a rolling ring of buffers to be reused, where there are several
thousands of hardware descriptors used (video buffer for example).
Additionally, a set of more casual features is introduced :
- debugging traces
- lockless way to know if a descriptor is terminated or not
The driver was tested on zylonite board (pxa3xx) and mioa701 (pxa27x),
with dmatest, pxa_camera and pxamci.
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
ipv6_select_ident() returns a 32bit value in network order.
Fixes: 286c2349f6 ("ipv6: Clean up ipv6_select_ident() and ip6_fragment()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If GPIOLIB=n:
drivers/leds/leds-gpio.c: In function ‘gpio_leds_create’:
drivers/leds/leds-gpio.c:187: error: implicit declaration of function ‘devm_get_gpiod_from_child’
drivers/leds/leds-gpio.c:187: warning: assignment makes pointer from integer without a cast
Add dummies for fwnode_get_named_gpiod() and devm_get_gpiod_from_child()
for the !GPIOLIB case to fix this.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: 40b7318319 ("gpio: Support for unified device properties interface")
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bryan Wu <cooloney@gmail.com>
After the patch
'ipv6: Only create RTF_CACHE routes after encountering pmtu exception',
we need to compensate the performance hit (bouncing dst->__refcnt).
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch always creates RTF_CACHE clone with DST_NOCACHE
when FLOWI_FLAG_KNOWN_NH is set so that the rt6i_dst is set to
the fl6->daddr.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Julian Anastasov <ja@ssi.bg>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of doing the rt6->rt6i_node check whenever we need
to get the route's cookie. Refactor it into rt6_get_cookie().
It is a prep work to handle FLOWI_FLAG_KNOWN_NH and also
percpu rt6_info later.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch creates a RTF_CACHE routes only after encountering a pmtu
exception.
After ip6_rt_update_pmtu() has inserted the RTF_CACHE route to the fib6
tree, the rt->rt6i_node->fn_sernum is bumped which will fail the
ip6_dst_check() and trigger a relookup.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
When creating a RTF_CACHE route, RTF_ANYCAST is set based on rt6i_dst.
Also, rt6i_gateway is always set to the nexthop while the nexthop
could be a gateway or the rt6i_dst.addr.
After removing the rt6i_dst and rt6i_src dependency in the last patch,
we also need to stop the caller from depending on rt6i_gateway and
RTF_ANYCAST.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>