If a read request fits entirely in a chunk, it will be passed directly to the
underlying device (providing it hasn't failed of course). If it doesn't fit,
the slightly less efficient path that uses the stripe_cache is used.
Requests that get to the stripe cache are always completely split up as
necessary.
So with RAID5, ripping out the merge_bvec_fn doesn't cause it to stop work,
but could cause it to take the less efficient path more often.
All that is needed to manage this is for 'chunk_aligned_read' do some bio
splitting, much like the RAID0 code does.
Cc: Neil Brown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The bcache driver has always accepted arbitrarily large bios and split
them internally. Now that every driver must accept arbitrarily large
bios this code isn't nessecary anymore.
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
[dpark: add more description in commit message]
Signed-off-by: Dongsu Park <dpark@posteo.net>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The way the block layer is currently written, it goes to great lengths
to avoid having to split bios; upper layer code (such as bio_add_page())
checks what the underlying device can handle and tries to always create
bios that don't need to be split.
But this approach becomes unwieldy and eventually breaks down with
stacked devices and devices with dynamic limits, and it adds a lot of
complexity. If the block layer could split bios as needed, we could
eliminate a lot of complexity elsewhere - particularly in stacked
drivers. Code that creates bios can then create whatever size bios are
convenient, and more importantly stacked drivers don't have to deal with
both their own bio size limitations and the limitations of the
(potentially multiple) devices underneath them. In the future this will
let us delete merge_bvec_fn and a bunch of other code.
We do this by adding calls to blk_queue_split() to the various
make_request functions that need it - a few can already handle arbitrary
size bios. Note that we add the call _after_ any call to
blk_queue_bounce(); this means that blk_queue_split() and
blk_recalc_rq_segments() don't need to be concerned with bouncing
affecting segment merging.
Some make_request_fn() callbacks were simple enough to audit and verify
they don't need blk_queue_split() calls. The skipped ones are:
* nfhd_make_request (arch/m68k/emu/nfblock.c)
* axon_ram_make_request (arch/powerpc/sysdev/axonram.c)
* simdisk_make_request (arch/xtensa/platforms/iss/simdisk.c)
* brd_make_request (ramdisk - drivers/block/brd.c)
* mtip_submit_request (drivers/block/mtip32xx/mtip32xx.c)
* loop_make_request
* null_queue_bio
* bcache's make_request fns
Some others are almost certainly safe to remove now, but will be left
for future patches.
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: drbd-user@lists.linbit.com
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Jim Paris <jim@jtan.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Acked-by: NeilBrown <neilb@suse.de> (for the 'md/md.c' bits)
Acked-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
[dpark: skip more mq-based drivers, resolve merge conflicts, etc.]
Signed-off-by: Dongsu Park <dpark@posteo.net>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Pull networking fixes from David Miller:
1) Workaround hw bug when acquiring PCI bos ownership of iwlwifi
devices, from Emmanuel Grumbach.
2) Falling back to vmalloc in conntrack should not emit a warning, from
Pablo Neira Ayuso.
3) Fix NULL deref when rtlwifi driver is used as an AP, from Luis
Felipe Dominguez Vega.
4) Rocker doesn't free netdev on device removal, from Ido Schimmel.
5) UDP multicast early sock demux has route handling races, from Eric
Dumazet.
6) Fix L4 checksum handling in openvswitch, from Glenn Griffin.
7) Fix use-after-free in skb_set_peeked, from Herbert Xu.
8) Don't advertize NETIF_F_FRAGLIST in virtio_net driver, this can lead
to fraglists longer than the driver can support. From Jason Wang.
9) Fix mlx5 on non-4k-pagesize systems, from Carol L Soto.
10) Fix interrupt storm in bna driver, from Ivan Vecera.
11) Don't propagate -EBUSY from netlink_insert(), from Daniel Borkmann.
12) Fix inet request sock leak, from Eric Dumazet.
13) Fix TX interrupt masking and marking in TX descriptors of fs_enet
driver, from LEROY Christophe.
14) Get rid of rule optimizer in gianfar driver, it's buggy and unlikely
to get fixed any time soon. From Jakub Kicinski
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
cosa: missing error code on failure in probe()
gianfar: remove faulty filer optimizer
gianfar: correct list membership accounting
gianfar: correct filer table writing
bonding: Gratuitous ARP gets dropped when first slave added
net: dsa: Do not override PHY interface if already configured
net: fs_enet: mask interrupts for TX partial frames.
net: fs_enet: explicitly remove I flag on TX partial frames
inet: fix possible request socket leak
inet: fix races with reqsk timers
mkiss: Fix error handling in mkiss_open()
bnx2x: Free NVRAM lock at end of each page
bnx2x: Prevent null pointer dereference on SKB release
cxgb4: missing curly braces in t4_setup_debugfs()
net-timestamp: Update skb_complete_tx_timestamp comment
ipv6: don't reject link-local nexthop on other interface
netlink: make sure -EBUSY won't escape from netlink_insert
bna: fix interrupts storm caused by erroneous packets
net: mvpp2: replace TX coalescing interrupts with hrtimer
net: mvpp2: enable proper per-CPU TX buffers unmapping
...
Commit 966f2a71a9 ("cpufreq: exynos: remove Exynos4x12 specific
cpufreq driver support") deleted option ARM_EXYNOS_CPUFREQ but missed
to delete a rule in drivers/cpufreq/Makefile which depends on that
option.
Remove unselectable rule for arm-exynos-cpufreq.o
from drivers/cpufreq/Makefile.
Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Pull EDAC fix from Borislav Petkov:
"A ppc4xx_edac fix for accessing ->csrows properly. This driver was
missed during the conversion a couple of years ago"
* tag 'edac_fix_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, ppc4xx: Access mci->csrows array elements properly
Exynos4x12 based platforms have switched over to use generic
cpufreq driver for cpufreq functionality. So the Exynos
specific cpufreq support for these platforms can be removed.
Also once Exynos4x12 based platforms support have been removed
the shared exynos-cpufreq driver is no longer needed and can
be deleted.
Based on the earlier work by Thomas Abraham.
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Thomas Abraham <thomas.ab@samsung.com>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
The number of TLB lines was increased from 16 on Tegra30 to 32 on
Tegra114 and later. Parameterize the value so that the initial default
can be set accordingly.
On Tegra30, initializing the value to 32 would effectively disable the
TLB and hence cause massive latencies for memory accesses translated
through the SMMU. This is especially noticeable for isochronuous clients
such as display, whose FIFOs would continuously underrun.
Fixes: 8918465163 ("memory: Add NVIDIA Tegra memory controller support")
Signed-off-by: Thierry Reding <treding@nvidia.com>
The driver requests the pclk clock at probe time already and stores its
reference to it in struct tegra_pmc, so there is no need to look it up
everytime it is needed. Use the existing reference instead.
Signed-off-by: Thierry Reding <treding@nvidia.com>
If the gpio DT node has the gpio-ranges property, the range will be
added by the gpio core and doesn't need to be added by the pinctrl
driver.
By having the gpio-ranges property, we have an explicit dependency from
the gpio node to the pinctrl node and we can stop using the deprecated
pinctrl_add_gpio_range() function.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Kbuild descends into drivers/soc/tegra/ only when CONFIG_ARCH_TEGRA
is enabled. (see drivers/soc/Makefile)
$(CONFIG_ARCH_TEGRA) in drivers/soc/tegra/Makefile always evaluates
to 'y'.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
pclk_pd_pmu needs to keep running and with the upcoming gpio clock
handling this is not always the case anymore. So add it to the list
of critical clocks for now.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Lin Huang <hl@rock-chips.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Recent versions of the Tegra MC hardware extend the size of the client
ID bitfield in the MC_ERR_STATUS register by one bit. While one could
simply extend the bitfield for older hardware, that would allow data
from reserved bits into the driver code, which is generally a bad idea
on principle. So this patch instead passes in the client ID mask from
from the per-SoC MC data.
There's no MC support for T210 (yet), but when that support winds up
in the kernel, the appropriate soc->client_id_mask value for that chip
will be 0xff.
Based on an original patch by David Ung <davidu@nvidia.com>.
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Paul Walmsley <pwalmsley@nvidia.com>
Cc: Thierry Reding <treding@nvidia.com>
Cc: David Ung <davidu@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
This code is used both when creating a new page directory entry and when
tearing it down, with only the PDE value changing between both cases.
Factor the code out so that it can be reused.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: make commit message more accurate]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Extract the use count reference accounting into a separate function and
separate it from allocating the PTE.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: extract and write commit message]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Rather than explicitly zeroing pages allocated via alloc_page(), add
__GFP_ZERO to the gfp mask to ask the allocator for zeroed pages.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Remove the unnecessary manipulation of the PageReserved flags in the
Tegra SMMU driver. None of this is required as the page(s) remain
private to the SMMU driver.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Use the DMA API instead of calling architecture internal functions in
the Tegra SMMU driver.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Pass smmu_flush_ptc() the device address rather than struct page
pointer.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
smmu_flush_ptc() is used in two modes: one is to flush an individual
entry, the other is to flush all entries. We know at the call site
which we require. Split the function into smmu_flush_ptc_all() and
smmu_flush_ptc().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Drivers should not be using __cpuc_* functions nor outer_cache_flush()
directly. This change partly cleans up tegra-smmu.c.
The only difference between cache handling of the tegra variants is
Denver, which omits the call to outer_cache_flush(). This is due to
Denver being an ARM64 CPU, and the ARM64 architecture does not provide
this function. (This, in itself, is a good reason why these should not
be used.)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: fix build failure on 64-bit ARM]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Use kcalloc() to allocate the use-counter array for the page directory
entries/page tables. Using kcalloc() allows us to be provided with
zero-initialised memory from the allocators, rather than initialising
it ourselves.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Store the struct page pointer for the second level page tables, rather
than working back from the page directory entry. This is necessary as
we want to eliminate the use of physical addresses used with
arch-private functions, switching instead to use the streaming DMA API.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Fix the page table lookup in the unmap and iova_to_phys methods.
Neither of these methods should allocate a page table; a missing page
table should be treated the same as no mapping present.
More importantly, using as_get_pte() for an IOVA corresponding with a
non-present page table entry increments the use-count for the page
table, on the assumption that the caller of as_get_pte() is going to
setup a mapping. This is an incorrect assumption.
Fix both of these bugs by providing a separate helper which only looks
up the page table, but never allocates it. This is akin to pte_offset()
for CPU page tables.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add a pair of helpers to get the page directory and page table indexes.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The Tegra SMMU unmap path has several problems:
1. as_pte_put() can perform a write-after-free
2. tegra_smmu_unmap() can perform cache maintanence on a page we have
just freed.
3. when a page table is unmapped, there is no CPU cache maintanence of
the write clearing the page directory entry, nor is there any
maintanence of the IOMMU to ensure that it sees the page table has
gone.
Fix this by getting rid of as_pte_put(), and instead coding the PTE
unmap separately from the PDE unmap, placing the PDE unmap after the
PTE unmap has been completed.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
iova_to_phys() has several problems:
(a) iova_to_phys() is supposed to return 0 if there is no entry present
for the iova.
(b) if as_get_pte() fails, we oops the kernel by dereferencing a NULL
pointer. Really, we should not even be trying to allocate a page
table at all, but should only be returning the presence of the 2nd
level page table. This will be fixed in a subsequent patch.
Treat both of these conditions as "no mapping" conditions.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Fix the section mismatch warning
WARNING: vmlinux.o(.text+0x2b2788): Section mismatch in reference from the function mxc_gpio_probe() to the function .init.text:mxc_gpio_init_gc()
The function mxc_gpio_probe() references
the function __init mxc_gpio_init_gc().
This is often because mxc_gpio_probe lacks a __init
annotation or the annotation of mxc_gpio_init_gc is wrong.
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
writel() already does a cpu_to_le32 conversion, so
remove cpu_to_le32().
Signed-off-by: Leilk Liu <leilk.liu@mediatek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Currently in the FSL platform all GPIO interrupts in a bank are muxed
into two GPIO lines to the GPC interrupt controller. In each GPIO bank
GPIOs 0-15 are OR'ed into one GPC interrupt controller interrupt and 16-31
are OR'ed into another. With the current code, if any of the 0-15 or
16-31 interrupts are marked as wakeup capable, all interrupts belonging
to that sub-bank (either 0-15 or 16-31) will wake up the device. This is
because interrupts are only being masked at the interrupt controller
and not at the GPIO controller.
This patch allows masking of GPIO interrupts at the GPIO controller during
suspend if they have not been labeled wakeup capable. This patch uses
preexisting IRQCHIP_MASK_ON_SUSPEND flag while initializing the GPIO
interrupts to get the desired behavior.
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: linux-gpio@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Ulises Brindis <ubrindis56@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Fix below build warning:
CC drivers/gpio/gpio-omap.o
drivers/gpio/gpio-omap.c: In function 'omap_gpio_irq_type':
drivers/gpio/gpio-omap.c:504:3: warning: passing argument 1 of 'spin_unlock_irqrestore' from incompatible pointer type [enabled by default]
include/linux/spinlock.h:360:29: note: expected 'struct spinlock_t *' but argument is of type 'struct raw_spinlock_t *'
Fixes: commit 4dbada2be4 ("gpio: omap: use raw locks for locking")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
To set/get atlas7 pull & drive strength, we use lots of if/else
to check pad type. But except mask value or immediate value, all
actions in these conditional branches are the same.
So we use predefined pull info table and drive strength table
to reduce these redundancy code.
Signed-off-by: Wei Chen <Wei.Chen@csr.com>
Signed-off-by: Barry Song <Baohua.Song@csr.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
alloc_orinocodev() would allocate the wiphy entry, but it would only get
registered much later in orinoco_init(). If something failed in the init
process inbetween the call to alloc_orinocodev() and the completion
of orinoco_init(), the drivers would end up calling wiphy_unregister()
with a NULL pointer causing beautiful OOPS fireworks.
Explicitly call wiphy_unregister() instead in the right places.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>