Commit Graph

112844 Commits

Author SHA1 Message Date
Ido Schimmel
c82481f7ea netlink: Add field to skip in-kernel notifications
The struct includes a 'skip_notify' flag that indicates if netlink
notifications to user space should be suppressed. As explained in commit
3b1137fe74 ("net: ipv6: Change notifications for multipath add to
RTA_MULTIPATH"), this is useful to suppress per-nexthop RTM_NEWROUTE
notifications when an IPv6 multipath route is added / deleted. Instead,
one notification is sent for the entire multipath route.

This concept is also useful for in-kernel notifications. Sending one
in-kernel notification for the addition / deletion of an IPv6 multipath
route - instead of one per-nexthop - provides a significant increase in
the insertion / deletion rate to underlying devices.

Add a 'skip_notify_kernel' flag to suppress in-kernel notifications.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18 09:45:36 -07:00
Ido Schimmel
3de205cde4 netlink: Document all fields of 'struct nl_info'
Some fields were not documented. Add documentation.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18 09:45:36 -07:00
Boris Brezillon
7786fd1087 drm/panfrost: Expose performance counters through unstable ioctls
Expose performance counters through 2 driver specific ioctls: one to
enable/disable the perfcnt block, and one to dump the counter values.

There are discussions to expose global performance monitors (those
counters that can't be retrieved on a per-job basis) in a consistent
way, but this is likely to take time to settle on something that works
for various HW/users.
The ioctls are marked unstable so we can get rid of them when the time
comes. We initally went for a debugfs-based interface, but this was
making the transition to per-FD address space more complicated (we need
to specify the namespace the GPU has to use when dumping the perf
counters), hence the decision to switch back to driver specific ioctls
which are passed the FD they operate on and thus will have a dedicated
address space attached to them.

Other than that, the implementation is pretty simple: it basically dumps
all counters and copy the values to a userspace buffer. The parsing is
left to userspace which has to know the specific layout that's used
by the GPU (layout differs on a per-revision basis).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190618081648.17297-5-boris.brezillon@collabora.com
2019-06-18 09:23:48 -06:00
Jean-Philippe Brucker
58b55c859a iommu: Add padding to struct iommu_fault
Ease future extensions of struct iommu_fault_page_request and struct
iommu_fault_unrecoverable by adding a few bytes of padding. That way, a
new field can be added to either of these structures by simply introducing
a new flag. To extend it after the size limit is reached, a new fault
reporting structure will have to be negotiated with userspace.

With 56 bytes of padding, the total size of iommu_fault is 64 bytes and
fits in a cache line on a lot of contemporary machines, while providing 16
and 24 bytes of extension to structures iommu_fault_page_request and
iommu_fault_unrecoverable respectively.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-06-18 17:14:35 +02:00
Jason Gunthorpe
47f245985a mm/hmm: Hold on to the mmget for the lifetime of the range
Range functions like hmm_range_snapshot() and hmm_range_fault() call
find_vma, which requires hodling the mmget() and the mmap_sem for the mm.

Make this simpler for the callers by holding the mmget() inside the range
for the lifetime of the range. Other functions that accept a range should
only be called if the range is registered.

This has the side effect of directly preventing hmm_release() from
happening while a range is registered. That means range->dead cannot be
false during the lifetime of the range, so remove dead and
hmm_mirror_mm_is_alive() entirely.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-18 12:11:16 -03:00
Jason Gunthorpe
378a604064 mm/hmm: Remove duplicate condition test before wait_event_timeout
The wait_event_timeout macro already tests the condition as its first
action, so there is no reason to open code another version of this, all
that does is skip the might_sleep() debugging in common cases, which is
not helpful.

Further, based on prior patches, we can now simplify the required condition
test:
 - If range is valid memory then so is range->hmm
 - If hmm_release() has run then range->valid is set to false
   at the same time as dead, so no reason to check both.
 - A valid hmm has a valid hmm->mm.

Allowing the return value of wait_event_timeout() (along with its internal
barriers) to compute the result of the function.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-18 11:55:20 -03:00
Thomas Hellstrom
4ba3976712 drm/vmwgfx: Add surface dirty-tracking callbacks
Add the callbacks necessary to implement emulated coherent memory for
surfaces. Add a flag to the gb_surface_create ioctl to indicate that
surface memory should be coherent.
Also bump the drm minor version to signal the availability of coherent
surfaces.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
2019-06-18 15:19:35 +02:00
Thomas Hellstrom
7a39f35ce4 drm/ttm: TTM fault handler helpers
With the vmwgfx dirty tracking, the default TTM fault handler is not
completely sufficient (vmwgfx need to modify the vma->vm_flags member,
and also needs to restrict the number of prefaults).

We also want to replicate the new ttm_bo_vm_reserve() functionality

So start turning the TTM vm code into helpers: ttm_bo_vm_fault_reserved()
and ttm_bo_vm_reserve(), and provide a default TTM fault handler for other
drivers to use.

Cc: "Christian König" <christian.koenig@amd.com>

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: "Christian König" <christian.koenig@amd.com> #v1
2019-06-18 15:19:34 +02:00
Thomas Hellstrom
32d1f6985c drm/ttm: Allow the driver to provide the ttm struct vm_operations_struct
Add a pointer to the struct vm_operations_struct in the bo_device, and
assign that pointer to the default value currently used.

The driver can then optionally modify that pointer and the new value
can be used for each new vma created.

Cc: "Christian König" <christian.koenig@amd.com>

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2019-06-18 15:19:34 +02:00
Thomas Hellstrom
4fe51e9e79 mm: Add write-protect and clean utilities for address space ranges
Add two utilities to a) write-protect and b) clean all ptes pointing into
a range of an address space.
The utilities are intended to aid in tracking dirty pages (either
driver-allocated system memory or pci device memory).
The write-protect utility should be used in conjunction with
page_mkwrite() and pfn_mkwrite() to trigger write page-faults on page
accesses. Typically one would want to use this on sparse accesses into
large memory regions. The clean utility should be used to utilize
hardware dirtying functionality and avoid the overhead of page-faults,
typically on large accesses into small memory regions.

The added file "as_dirty_helpers.c" is initially listed as maintained by
VMware under our DRM driver. If somebody would like it elsewhere,
that's of course no problem.

Notable changes since RFC:
- Added comments to help avoid the usage of these function for VMAs
  it's not intended for. We also do advisory checks on the vm_flags and
  warn on illegal usage.
- Perform the pte modifications the same way softdirty does.
- Add mmu_notifier range invalidation calls.
- Add a config option so that this code is not unconditionally included.
- Tell the mmu_gather code about pending tlb flushes.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> #v1
2019-06-18 15:19:34 +02:00
Thomas Hellstrom
29875a5291 mm: Add an apply_to_pfn_range interface
This is basically apply_to_page_range with added functionality:
Allocating missing parts of the page table becomes optional, which
means that the function can be guaranteed not to error if allocation
is disabled. Also passing of the closure struct and callback function
becomes different and more in line with how things are done elsewhere.

Finally we keep apply_to_page_range as a wrapper around apply_to_pfn_range

The reason for not using the page-walk code is that we want to perform
the page-walk on vmas pointing to an address space without requiring the
mmap_sem to be held rather than on vmas belonging to a process with the
mmap_sem held.

Notable changes since RFC:
Don't export apply_to_pfn range.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> #v1
2019-06-18 15:19:33 +02:00
Masahiro Yamada
4eb293487d pinctrl: make pinconf.h self-contained
This header uses 'bool', but it does not include any header by itself.

So, it could cause unknown type name error, depending on the header
include order, although probably <linux/types.h> has been included by
someone else.

Include <linux/types.h> to make it self-contained.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-06-18 13:44:41 +02:00
Chen-Yu Tsai
8b13a48b89 clk: fixed-factor: Add CLK_FIXED_FACTOR_FW_NAME for DT clock-names parent
With the new clk parenting code, clk_init_data was expanded to include
.parent_data, for clk drivers to specify parents using a combination of
device tree clock-names, pointers to struct clk_hw, device tree clocks,
and/or fallback global clock names.

Add a new macro, CLK_FIXED_FACTOR_FW_NAME, that takes a string to match
a clock-names entry in the device tree to specify the clock parent.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2019-06-18 19:39:00 +08:00
Chen-Yu Tsai
1bef004e26 clk: fixed-factor: Add CLK_FIXED_FACTOR_HWS which takes list of struct clk_hw *
With the new clk parenting code, clk_init_data was expanded to include
.parent_hws, for clk drivers to directly reference parents by clk_hw.

Add a new macro, CLK_FIXED_FACTOR_HWS, that can take an array of pointers
to struct clk_hw, instead of a string, as its parent. Taking an array
instead of a direct pointer allows the reuse of the array for multiple
clks, rather than having one compound literal with the same contents
allocated for each clk declaration.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2019-06-18 19:39:00 +08:00
Chen-Yu Tsai
d7b15114ab clk: fixed-factor: Add CLK_FIXED_FACTOR_HW which takes clk_hw pointer as parent
With the new clk parenting code, clk_init_data was expanded to include
.parent_hws, for clk drivers to directly reference parents by clk_hw.

Add a new macro, CLK_FIXED_FACTOR_HW, that can take a struct clk_hw
pointer, instead of a string, as its parent.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2019-06-18 19:39:00 +08:00
Chen-Yu Tsai
13933109df clk: Add CLK_HW_INIT_PARENT_DATA macro using .parent_data
With the new clk parenting code, struct clk_init_data was expanded to
include .parent_data, for clk drivers that have parents referenced using
a combination of device tree clock-names, clock indices, and/or struct
clk_hw pointers.

Add a new macro that can take a list of struct clk_parent_data for
drivers to use.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2019-06-18 19:38:59 +08:00
Chen-Yu Tsai
2d6b4f33e6 clk: Add CLK_HW_INIT_FW_NAME macro using .fw_name in .parent_data
With the new clk parenting code, clk_init_data was expanded to include
.parent_data, for clk drivers that have parents referenced using a
combination of device tree clock-names, clock indices, and/or clk_hw
pointers.

Add a CLK_HW_INIT macro for specifying a single parent from the device
tree using .fw_name in struct clk_parent_data.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2019-06-18 19:38:59 +08:00
Chen-Yu Tsai
99600fd47e clk: Add CLK_HW_INIT_* macros using .parent_hws
With the new clk parenting code, struct clk_init_data was expanded to
include .parent_hws, for clk drivers to directly list parents by
pointing to their respective struct clk_hw's.

Add macros that can take either one single struct clk_hw *, or an array
of them, for drivers to use.

A special CLK_HW_INIT_HWS macro is included, which takes an array of
struct clk_hw *, but sets .num_parents to 1. This variant is to allow
the reuse of the array, instead of having a compound literal allocated
for each clk sharing the same parent.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2019-06-18 19:38:59 +08:00
Douglas Anderson
b4c9f938d5 mmc: core: Add sdio_retune_hold_now() and sdio_retune_release()
We want SDIO drivers to be able to temporarily stop retuning when the
driver knows that the SDIO card is not in a state where retuning will
work (maybe because the card is asleep).  We'll move the relevant
functions to a place where drivers can call them.

Cc: stable@vger.kernel.org #v4.18+
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-06-18 13:30:32 +02:00
Douglas Anderson
0a55f4ab96 mmc: core: API to temporarily disable retuning for SDIO CRC errors
Normally when the MMC core sees an "-EILSEQ" error returned by a host
controller then it will trigger a retuning of the card.  This is
generally a good idea.

However, if a command is expected to sometimes cause transfer errors
then these transfer errors shouldn't cause a re-tuning.  This
re-tuning will be a needless waste of time.  One example case where a
transfer is expected to cause errors is when transitioning between
idle (sometimes referred to as "sleep" in Broadcom code) and active
state on certain Broadcom WiFi SDIO cards.  Specifically if the card
was already transitioning between states when the command was sent it
could cause an error on the SDIO bus.

Let's add an API that the SDIO function drivers can call that will
temporarily disable the auto-tuning functionality.  Then we can add a
call to this in the Broadcom WiFi driver and any other driver that
might have similar needs.

NOTE: this makes the assumption that the card is already tuned well
enough that it's OK to disable the auto-retuning during one of these
error-prone situations.  Presumably the driver code performing the
error-prone transfer knows how to recover / retry from errors.  ...and
after we can get back to a state where transfers are no longer
error-prone then we can enable the auto-retuning again.  If we truly
find ourselves in a case where the card needs to be retuned sometimes
to handle one of these error-prone transfers then we can always try a
few transfers first without auto-retuning and then re-try with
auto-retuning if the first few fail.

Without this change on rk3288-veyron-minnie I periodically see this in
the logs of a machine just sitting there idle:
  dwmmc_rockchip ff0d0000.dwmmc: Successfully tuned phase to XYZ

Cc: stable@vger.kernel.org #v4.18+
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-06-18 13:30:21 +02:00
Anurag Kumar Vulisha
4ed9890c4c usb: gadget: send usb_gadget as an argument in get_config_params
Passing struct usb_gadget * as an extra argument in get_config_params
makes gadget drivers to easily update the U1DevExitLat & U2DevExitLat
values based on the values passed from the device tree. This patch
does the same

Signed-off-by: Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2019-06-18 11:58:30 +03:00
Dmitry Torokhov
6d101f24f1 USB: add usbfs ioctl to retrieve the connection parameters
Recently usfbs gained availability to retrieve device speed, but there
is sill no way to determine the bus number or list of ports the device
is connected to when using usbfs. While this information can be obtained
from sysfs, not all environments allow sysfs access. In a jailed
environment a program might be simply given an opened file descriptor to
usbfs device, and it is really important that all data can be gathered
from said file descriptor.

This patch introduces a new ioctl, USBDEVFS_CONNINFO_EX, which return
extended connection information for the device, including the bus
number, address, port list and speed. The API allows kernel to extend
amount of data returned by the ioctl and userspace has an option of
adjusting the amount of data it is willing to consume. A new capability,
USBDEVFS_CAP_CONNINFO_EX, is introduced to help userspace in determining
whether the kernel supports this new ioctl.

Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-18 08:44:09 +02:00
David S. Miller
13091aa305 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Honestly all the conflicts were simple overlapping changes,
nothing really interesting to report.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-17 20:20:36 -07:00
Linus Torvalds
da0f382029 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Lots of bug fixes here:

   1) Out of bounds access in __bpf_skc_lookup, from Lorenz Bauer.

   2) Fix rate reporting in cfg80211_calculate_bitrate_he(), from John
      Crispin.

   3) Use after free in psock backlog workqueue, from John Fastabend.

   4) Fix source port matching in fdb peer flow rule of mlx5, from Raed
      Salem.

   5) Use atomic_inc_not_zero() in fl6_sock_lookup(), from Eric Dumazet.

   6) Network header needs to be set for packet redirect in nfp, from
      John Hurley.

   7) Fix udp zerocopy refcnt, from Willem de Bruijn.

   8) Don't assume linear buffers in vxlan and geneve error handlers,
      from Stefano Brivio.

   9) Fix TOS matching in mlxsw, from Jiri Pirko.

  10) More SCTP cookie memory leak fixes, from Neil Horman.

  11) Fix VLAN filtering in rtl8366, from Linus Walluij.

  12) Various TCP SACK payload size and fragmentation memory limit fixes
      from Eric Dumazet.

  13) Use after free in pneigh_get_next(), also from Eric Dumazet.

  14) LAPB control block leak fix from Jeremy Sowden"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (145 commits)
  lapb: fixed leak of control-blocks.
  tipc: purge deferredq list for each grp member in tipc_group_delete
  ax25: fix inconsistent lock state in ax25_destroy_timer
  neigh: fix use-after-free read in pneigh_get_next
  tcp: fix compile error if !CONFIG_SYSCTL
  hv_sock: Suppress bogus "may be used uninitialized" warnings
  be2net: Fix number of Rx queues used for flow hashing
  net: handle 802.1P vlan 0 packets properly
  tcp: enforce tcp_min_snd_mss in tcp_mtu_probing()
  tcp: add tcp_min_snd_mss sysctl
  tcp: tcp_fragment() should apply sane memory limits
  tcp: limit payload size of sacked skbs
  Revert "net: phylink: set the autoneg state in phylink_phy_change"
  bpf: fix nested bpf tracepoints with per-cpu data
  bpf: Fix out of bounds memory access in bpf_sk_storage
  vsock/virtio: set SOCK_DONE on peer shutdown
  net: dsa: rtl8366: Fix up VLAN filtering
  net: phylink: set the autoneg state in phylink_phy_change
  net: add high_order_alloc_disable sysctl/static key
  tcp: add tcp_tx_skb_cache sysctl
  ...
2019-06-17 15:55:34 -07:00
Ard Biesheuvel
c681edae33 net: ipv4: move tcp_fastopen server side code to SipHash library
Using a bare block cipher in non-crypto code is almost always a bad idea,
not only for security reasons (and we've seen some examples of this in
the kernel in the past), but also for performance reasons.

In the TCP fastopen case, we call into the bare AES block cipher one or
two times (depending on whether the connection is IPv4 or IPv6). On most
systems, this results in a call chain such as

  crypto_cipher_encrypt_one(ctx, dst, src)
    crypto_cipher_crt(tfm)->cit_encrypt_one(crypto_cipher_tfm(tfm), ...);
      aesni_encrypt
        kernel_fpu_begin();
        aesni_enc(ctx, dst, src); // asm routine
        kernel_fpu_end();

It is highly unlikely that the use of special AES instructions has a
benefit in this case, especially since we are doing the above twice
for IPv6 connections, instead of using a transform which can process
the entire input in one go.

We could switch to the cbcmac(aes) shash, which would at least get
rid of the duplicated overhead in *some* cases (i.e., today, only
arm64 has an accelerated implementation of cbcmac(aes), while x86 will
end up using the generic cbcmac template wrapping the AES-NI cipher,
which basically ends up doing exactly the above). However, in the given
context, it makes more sense to use a light-weight MAC algorithm that
is more suitable for the purpose at hand, such as SipHash.

Since the output size of SipHash already matches our chosen value for
TCP_FASTOPEN_COOKIE_SIZE, and given that it accepts arbitrary input
sizes, this greatly simplifies the code as well.

NOTE: Server farms backing a single server IP for load balancing purposes
      and sharing a single fastopen key will be adversely affected by
      this change unless all systems in the pool receive their kernel
      upgrades at the same time.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-17 13:56:26 -07:00
Jason Gunthorpe
5d60c11154 RDMA: Move rdma_node_type to uapi/
This enum is exposed over the sysfs file 'node_type' and over netlink via
RDMA_NLDEV_ATTR_DEV_NODE_TYPE, so declare it in the uapi headers.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-17 14:58:59 -04:00
Daniel Vetter
eb69c8a4bf drm/gem: Unexport drm_gem_(un)pin/v(un)map
They're purely for internal use, not for drivers.

Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614203615.12639-3-daniel.vetter@ffwll.ch
2019-06-17 17:37:01 +02:00
Christoph Hellwig
ff896738be block: return from __bio_try_merge_page if merging occured in the same page
We currently have an input same_page parameter to __bio_try_merge_page
to prohibit merging in the same page.  The rationale for that is that
some callers need to account for every page added to a bio.  Instead of
letting these callers call twice into the merge code to account for the
new vs existing page cases, just turn the paramter into an output one that
returns if a merge in the same page occured and let them act accordingly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-17 09:33:02 -06:00
Fernando Fernandez Mancera
d7f9b2f18e netfilter: synproxy: extract SYNPROXY infrastructure from {ipt, ip6t}_SYNPROXY
Add common functions into nf_synproxy_core.c to prepare for nftables support.
The prototypes of the functions used by {ipt, ip6t}_SYNPROXY are in the new
file nf_synproxy.h

Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-17 17:12:55 +02:00
Fernando Fernandez Mancera
3006a5224f netfilter: synproxy: remove module dependency on IPv6 SYNPROXY
This is a prerequisite for the infrastructure module NETFILTER_SYNPROXY.
The new module is needed to avoid duplicated code for the SYNPROXY
nftables support.

Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-17 17:12:09 +02:00
Fernando Fernandez Mancera
5fcc88ecf6 netfilter: synproxy: add common uapi for SYNPROXY infrastructure
This new UAPI file is going to be used by the xt and nft common SYNPROXY
infrastructure. It is needed to avoid duplicated code.

Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-17 17:10:38 +02:00
Pablo Neira Ayuso
667ec21ebf Merge branch 'master' of git://blackhole.kfki.hu/nf-next
Jozsef Kadlecsik says:

====================
ipset patches for nf-next

- Remove useless memset() calls, nla_parse_nested/nla_parse
  erase the tb array properly, from Florent Fourcot.
- Merge the uadd and udel functions, the code is nicer
  this way, also from Florent Fourcot.
- Add a missing check for the return value of a
  nla_parse[_deprecated] call, from Aditya Pakki.
- Add the last missing check for the return value
  of nla_parse[_deprecated] call.
- Fix error path and release the references properly
  in set_target_v3_checkentry().
- Fix memory accounting which is reported to userspace
  for hash types on resize, from Stefano Brivio.
- Update my email address to kadlec@netfilter.org.
  The patch covers all places in the source tree where
  my kadlec@blackhole.kfki.hu address could be found.
====================

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-17 16:37:24 +02:00
Christian Brauner
ff6d090d0d netfilter: bridge: port sysctls to use brnf_net
This ports the sysctls to use struct brnf_net.

With this patch we make it possible to namespace the br_netfilter module in
the following patch.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-17 16:36:30 +02:00
Pablo Neira Ayuso
9911c1139f netfilter: xt_owner: bail out with EINVAL in case of unsupported flags
Reject flags that are not supported with EINVAL.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-17 16:36:30 +02:00
Florian Westphal
87e389b4c2 netfilter: conntrack: small conntrack lookup optimization
____nf_conntrack_find() performs checks on the conntrack objects in
this order:

1. if (nf_ct_is_expired(ct))

This fetches ct->timeout, in third cache line.

The hnnode that is used to store the list pointers resides in the first
(origin) or second (reply tuple) cache lines.

This test rarely passes, but its necessary to reap obsolete entries.

2. if (nf_ct_is_dying(ct))

This fetches ct->status, also in third cache line.

The test is useless, and can be removed:
  Consider:
     cpu0                                           cpu1
    ct = ____nf_conntrack_find()
    atomic_inc_not_zero(ct) -> ok
    nf_ct_key_equal -> ok
    is_dying -> DYING bit not set, ok
                                                    set_bit(ct, DYING);
						    ... unhash ... etc.
    return ct
    -> returning a ct with dying bit set, despite
    having a test for it.

This (unlikely) case is fine - refcount prevents ct from getting free'd.

3. if (nf_ct_key_equal(h, tuple, zone, net))

nf_ct_key_equal checks in following order:

1. Tuple equal (first or second cacheline)
2. Zone equal (third cacheline)
3. confirmed bit set (->status, third cacheline)
4. net namespace match (third cacheline).

Swapping "timeout" and "cpu" places timeout in the first cacheline.
This has two advantages:

1. For a conntrack that won't even match the original tuple,
   we will now only fetch the first and maybe the second cacheline
   instead of always accessing the 3rd one as well.

2.  in case of TCP ct->timeout changes frequently because we
    reduce/increase it when there are packets outstanding in the network.

The first cacheline contains both the reference count and the ct spinlock,
i.e. moving timeout there avoids writes to 3rd cacheline.

The restart sequence in __nf_conntrack_find() is removed, if we found a
candidate, but then fail to increment the refcount or discover the tuple
has changed (object recycling), just pretend we did not find an entry.

A second lookup won't find anything until another CPU adds a new conntrack
with identical tuple into the hash table, which is very unlikely.

We have the confirmation-time checks (when we hold hash lock) that deal
with identical entries and even perform clash resolution in some cases.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-17 16:35:30 +02:00
Stéphane Veyret
857b46027d netfilter: nft_ct: add ct expectations support
This patch allows to add, list and delete expectations via nft objref
infrastructure and assigning these expectations via nft rule.

This allows manual port triggering when no helper is defined to manage a
specific protocol. For example, if I have an online game which protocol
is based on initial connection to TCP port 9753 of the server, and where
the server opens a connection to port 9876, I can set rules as follow:

table ip filter {
    ct expectation mygame {
        protocol udp;
        dport 9876;
        timeout 2m;
        size 1;
    }

    chain input {
        type filter hook input priority 0; policy drop;
        tcp dport 9753 ct expectation set "mygame";
    }

    chain output {
        type filter hook output priority 0; policy drop;
        udp dport 9876 ct status expected accept;
    }
}

Signed-off-by: Stéphane Veyret <sveyret@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-17 16:35:20 +02:00
Andy Shevchenko
36f34737ff spi: Add a prototype for exported spi_set_cs_timing()
Compiler is not happy about spi_set_cs_timing() prototype.

drivers/spi/spi.c:3016:6: warning: no previous prototype for ‘spi_set_cs_timing’ [-Wmissing-prototypes]
 void spi_set_cs_timing(struct spi_device *spi, u8 setup, u8 hold,
      ^~~~~~~~~~~~~~~~~

Let's add it to the header.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2019-06-17 13:49:56 +01:00
Mattias Jacobsson
440c4983de platform/x86: wmi: add context argument to the probe function
The struct wmi_device_id has a context pointer field, forward this
pointer as an argument to the probe function in struct wmi_driver.

Update existing users of the same probe function to accept this new
context argument.

Signed-off-by: Mattias Jacobsson <2pi@mok.nu>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-06-17 15:22:47 +03:00
Mattias Jacobsson
a48e23385f platform/x86: wmi: add context pointer field to struct wmi_device_id
When using wmi_install_notify_handler() to initialize a WMI handler a
data pointer can be supplied which will be passed on to the notification
handler. No similar feature exist when handling WMI events via struct
wmi_driver.

Add a context field pointer to struct wmi_device_id and add a function
find_guid_context() to retrieve that context pointer.

Signed-off-by: Mattias Jacobsson <2pi@mok.nu>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-06-17 15:22:47 +03:00
Yurii Pavlovskyi
b096f626a6 platform/x86: asus-wmi: Switch fan boost mode
The WMI exposes a write-only device ID where up to three fan modes can be
switched on some laptops (TUF Gaming FX505GM). There is a hotkey
combination Fn-F5 that does have a fan icon, which is designed to toggle
between fan modes. The DSTS of the device ID returns information about the
presence of this capability and the presence of each of the two additional
fan modes as a bitmask (0x01 - overboost present, 0x02 - silent present)
[1].

Add a SysFS entry that reads the last written value and updates value in
WMI on write and a hotkey handler that toggles the modes taking into
account their availability according to DSTS.

Modes:
* 0x00 - normal or balanced,
* 0x01 - overboost, increased fan RPM,
* 0x02 - silent, decreased fan RPM

[1] Link: https://lkml.org/lkml/2019/4/12/110

Signed-off-by: Yurii Pavlovskyi <yurii.pavlovskyi@gmail.com>
Suggested-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-06-17 15:22:47 +03:00
Yurii Pavlovskyi
e0668f2888 platform/x86: asus-wmi: Improve DSTS WMI method ID detection
The DSTS method detection mistakenly selects DCTS instead of DSTS if
nothing is returned when the method ID is not defined in WMNB. As a result,
the control of keyboard backlight is not functional for TUF Gaming series
laptops. Implement detection based on _UID of the WMI device instead.

There is evidence that DCTS is handled by ACPI WMI devices that have _UID
ASUSWMI, whereas none of the devices without ASUSWMI respond to DCTS and
DSTS is used instead [1].

DSDT examples:

FX505GM (_UID ATK):
Method (WMNB, 3, Serialized)
{ ...
    If ((Local0 == 0x53545344))
    {
        ...
        Return (Zero)
    }
    ...
    // No return
}

K54C (_UID ATK):
Method (WMNB, 3, Serialized)
{ ...
    If ((Local0 == 0x53545344))
    {
        ...
        Return (0x02)
    }
    ...
    Return (0xFFFFFFFE)
}

[1] Link: https://lkml.org/lkml/2019/4/11/322

Signed-off-by: Yurii Pavlovskyi <yurii.pavlovskyi@gmail.com>
Suggested-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-06-17 15:22:47 +03:00
Yurii Pavlovskyi
e7488e58c7 platform/x86: wmi: Add function to get _UID of WMI device
Add a new function to acpi.h / wmi.c that returns _UID of the ACPI WMI
device. For example, it returns "ATK" for the following declaration in
DSDT:
Device (ATKD)
{
    Name (_HID, "PNP0C14" /* Windows Management Instrumentation Device */)
      // _HID: Hardware ID
    Name (_UID, "ATK")  // _UID: Unique ID
    ..

Generally, it is possible that multiple PNP0C14 ACPI devices are present in
the system as mentioned in the commit message of commit bff431e49f
("ACPI: WMI: Add ACPI-WMI mapping driver").

Therefore the _UID is returned for a specific ACPI device that declares the
given GUID, to which it is also mapped by other methods of wmi module.

Signed-off-by: Yurii Pavlovskyi <yurii.pavlovskyi@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-06-17 15:22:47 +03:00
Olof Johansson
449c1cd297 Merge tag 'versatile-v5.3-armsoc' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator into arm/soc
Versatile platform updates for the v5.3 kernel cycle:
- Drop a slew of unused CLCD platform data
- Fix OF reference counts

* tag 'versatile-v5.3-armsoc' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator:
  ARM: versatile: Drop CLCD platform data
  ARM: versatile: fix a leaked reference by addingmissing of_node_put

Signed-off-by: Olof Johansson <olof@lixom.net>
2019-06-17 05:11:53 -07:00
Olof Johansson
df767c0a43 Merge tag 'scmi-updates-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/drivers
ARM SCMI updates/fixes for v5.3

1. Correction to ARM document ID referred in SCMI protocol binding
2. Fix to correct bitfield definitions for SENSOR_DESC attributes which
   otherwise will calculate sensor values on wrong scale
3. Adds the missing rate_discrete flag setting so that discrete clocks
   are handled correctly. Without this fix it assumes continuous range
   which is incorrect
4. Adds support to read and scale the sensor values based on the factor
   read from the firmware

* tag 'scmi-updates-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
  hwmon: scmi: Scale values to target desired HWMON units
  firmware: arm_scmi: fetch and store sensor scale
  firmware: arm_scmi: update rate_discrete in clock_describe_rates_get
  firmware: arm_scmi: fix bitfield definitions for SENSOR_DESC attributes
  dt-bindings: arm: fix the document ID for SCMI protocol documentation

Signed-off-by: Olof Johansson <olof@lixom.net>
2019-06-17 04:54:11 -07:00
Olof Johansson
c3bd15a078 Merge tag 'omap-for-v5.3/ti-sysc-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/drivers
ti-sysc interconnect target module driver changes for v5.3

This series of changes improves probing devices with ti-sysc to the
point where we can now probe most devices without the custom dts
property "ti,hwmods" and no legacy platform data :)

We add support for platform data callbacks for idling and unidling the
clockdomain the module belongs to. The rest of the series mostly adds
handling for the various quirks needed by old legacy modules such as
i2c and watchdog. Some quirk handling is still missing for few modules,
but those will be added as they get tested.

The related platform data and dts changes will be sent separately.

* tag 'omap-for-v5.3/ti-sysc-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  bus: ti-sysc: Add support for module specific reset quirks
  bus: ti-sysc: Detect uarts also on omap34xx
  bus: ti-sysc: Do rstctrl reset handling in two phases
  bus: ti-sysc: Add support for disabling module without legacy mode
  bus: ti-sysc: Set ENAWAKEUP if available
  bus: ti-sysc: Handle swsup idle mode quirks
  bus: ti-sysc: Handle clockactivity for enable and disable
  bus: ti-sysc: Enable interconnect target module autoidle bit on enable
  bus: ti-sysc: Allow QUIRK_LEGACY_IDLE even if legacy_mode is not set
  bus: ti-sysc: Make OCP reset work for sysstatus and sysconfig reset bits
  bus: ti-sysc: Support 16-bit writes too
  bus: ti-sysc: Add support for missing clockdomain handling
  ARM: dts: dra71x: Disable usb4_tm target module
  ARM: dts: dra71x: Disable rtc target module
  ARM: dts: dra76x: Disable usb4_tm target module
  ARM: dts: dra76x: Disable rtc target module
  ARM: dts: dra76x: Update MMC2_HS200_MANUAL1 iodelay values
  ARM: dts: am57xx-idk: Remove support for voltage switching for SD card
  bus: ti-sysc: Handle devices with no control registers
  ARM: dts: Configure osc clock for d_can on am335x

Signed-off-by: Olof Johansson <olof@lixom.net>
2019-06-17 04:53:32 -07:00
Masahiro Yamada
5ec47cda74 memory: move jedec_ddr.h from include/memory to drivers/memory/
Now that jedec_ddr_data.c was moved from lib/ to drivers/memory/,
<memory/jedec_ddr.h> is included only from drivers/memory/.

Make it a local header of drivers/memory/.

The directory include/memory is now gone.

While I am here, I also changed #include <linux/module.h> to
<linux/export.h>. Because CONFIG_DDR is bool, jedec_ddr_data.c is
never compiled as a module.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
2019-06-17 04:45:21 -07:00
Olof Johansson
1126037393 Merge tag 'soc-fsl-next-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux into arm/drivers
NXP/FSL SoC driver updates for v5.3

DPAA2 Console driver
- Add driver to export two char devices to dump logs for MC and
  AIOP

DPAA2 DPIO driver
- Add support for memory backed QBMan portals
- Increase the timeout period to prevent false error
- Add APIs to retrieve QBMan portal probing status

DPAA Qman driver
- Only make liodn fixup on powerpc SoCs with PAMU iommu

* tag 'soc-fsl-next-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux:
  soc: fsl: qbman_portals: add APIs to retrieve the probing status
  soc: fsl: qman: fixup liodns only on ppc targets
  soc: fsl: dpio: Add support for memory backed QBMan portals
  bus: mc-bus: Add support for mapping shareable portals
  soc: fsl: dpio: Increase timeout for QBMan Management Commands
  soc: fsl: add DPAA2 console support
  Documentation: DT: Add entry for DPAA2 console
  soc: fsl: guts: Add definition for LX2160A

Signed-off-by: Olof Johansson <olof@lixom.net>
2019-06-17 04:38:06 -07:00
Peter Zijlstra
9ed7d75b2f x86/percpu: Relax smp_processor_id()
Nadav reported that since this_cpu_read() became asm-volatile, many
smp_processor_id() users generated worse code due to the extra
constraints.

However since smp_processor_id() is reading a stable value, we can use
__this_cpu_read().

While this does reduce text size somewhat, this mostly results in code
movement to .text.unlikely as a result of more/larger .cold.
subfunctions. Less text on the hotpath is good for I$.

  $ ./compare.sh defconfig-build1 defconfig-build2 vmlinux.o
  setup_APIC_ibs                                             90         98   -12,+20
  force_ibs_eilvt_setup                                     400        413   -57,+70
  pci_serr_error                                            109        104   -54,+49
  pci_serr_error                                            109        104   -54,+49
  unknown_nmi_error                                         125        120   -76,+71
  unknown_nmi_error                                         125        120   -76,+71
  io_check_error                                            125        132   -97,+104
  intel_thermal_interrupt                                   730        822   +92,+0
  intel_init_thermal                                        951        945   -6,+0
  generic_get_mtrr                                          301        294   -7,+0
  generic_get_mtrr                                          301        294   -7,+0
  generic_set_all                                           749        754   -44,+49
  get_fixed_ranges                                          352        360   -41,+49
  x86_acpi_suspend_lowlevel                                 369        363   -6,+0
  check_tsc_sync_source                                     412        412   -71,+71
  irq_migrate_all_off_this_cpu                              662        674   -14,+26
  clocksource_watchdog                                      748        748   -113,+113
  __perf_event_account_interrupt                            204        197   -7,+0
  attempt_merge                                            1748       1741   -7,+0
  intel_guc_send_ct                                        1424       1409   -15,+0
  __fini_doorbell                                           235        231   -4,+0
  bdw_set_cdclk                                             928        923   -5,+0
  gen11_dsi_disable                                        1571       1556   -15,+0
  gmbus_wait                                                493        488   -5,+0
  md_make_request                                           376        369   -7,+0
  __split_and_process_bio                                   543        536   -7,+0
  delay_tsc                                                  96         89   -7,+0
  hsw_disable_pc8                                           696        691   -5,+0
  tsc_verify_tsc_adjust                                     215        228   -22,+35
  cpuidle_driver_unref                                       56         49   -7,+0
  blk_account_io_completion                                 159        148   -11,+0
  mtrr_wrmsr                                                 95         99   -29,+33
  __intel_wait_for_register_fw                              401        419   +18,+0
  cpuidle_driver_ref                                         43         36   -7,+0
  cpuidle_get_driver                                         15          8   -7,+0
  blk_account_io_done                                       535        528   -7,+0
  irq_migrate_all_off_this_cpu                              662        674   -14,+26
  check_tsc_sync_source                                     412        412   -71,+71
  irq_wait_for_poll                                         170        163   -7,+0
  generic_end_io_acct                                       329        322   -7,+0
  x86_acpi_suspend_lowlevel                                 369        363   -6,+0
  nohz_balance_enter_idle                                   198        191   -7,+0
  generic_start_io_acct                                     254        247   -7,+0
  blk_account_io_start                                      341        334   -7,+0
  perf_event_task_tick                                      682        675   -7,+0
  intel_init_thermal                                        951        945   -6,+0
  amd_e400_c1e_apic_setup                                    47         51   -28,+32
  setup_APIC_eilvt                                          350        328   -22,+0
  hsw_enable_pc8                                           1611       1605   -6,+0
                                               total   12985947   12985892   -994,+939

Reported-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-17 12:43:41 +02:00
Ingo Molnar
bddb363673 Merge branch 'x86/cpu' into perf/core, to pick up dependent changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-17 12:29:16 +02:00
Waiman Long
94a9717b3c locking/rwsem: Make rwsem->owner an atomic_long_t
The rwsem->owner contains not just the task structure pointer, it also
holds some flags for storing the current state of the rwsem. Some of
the flags may have to be atomically updated. To reflect the new reality,
the owner is now changed to an atomic_long_t type.

New helper functions are added to properly separate out the task
structure pointer and the embedded flags.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Link: https://lkml.kernel.org/r/20190520205918.22251-14-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-17 12:28:06 +02:00