Commit Graph

4733 Commits

Author SHA1 Message Date
Linus Torvalds
052db7ec86 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc updates from David Miller:

 1) Move to 4-level page tables on sparc64 and support up to 53-bits of
    physical addressing.  Kernel static image BSS size reduced by
    several megabytes.

 2) M6/M7 cpu support, from Allan Pais.

 3) Move to sparse IRQs, handle hypervisor TLB call errors more
    gracefully, and add T5 perf_event support.  From Bob Picco.

 4) Recognize cdroms and compute geometry from capacity in virtual disk
    driver, also from Allan Pais.

 5) Fix memset() return value on sparc32, from Andreas Larsson.

 6) Respect gfp flags in dma_alloc_coherent on sparc32, from Daniel
    Hellstrom.

 7) Fix handling of compound pages in virtual disk driver, from Dwight
    Engen.

 8) Fix lockdep warnings in LDC layer by moving IRQ requesting to
    ldc_alloc() from ldc_bind().

 9) Increase boot string length to 1024 bytes, from Dave Kleikamp.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: (31 commits)
  sparc64: Fix lockdep warnings on reboot on Ultra-5
  sparc64: Increase size of boot string to 1024 bytes
  sparc64: Kill unnecessary tables and increase MAX_BANKS.
  sparc64: sparse irq
  sparc64: Adjust vmalloc region size based upon available virtual address bits.
  sparc64: Increase MAX_PHYS_ADDRESS_BITS to 53.
  sparc64: Use kernel page tables for vmemmap.
  sparc64: Fix physical memory management regressions with large max_phys_bits.
  sparc64: Adjust KTSB assembler to support larger physical addresses.
  sparc64: Define VA hole at run time, rather than at compile time.
  sparc64: Switch to 4-level page tables.
  sparc64: Fix reversed start/end in flush_tlb_kernel_range()
  sparc64: Add vio_set_intr() to enable/disable Rx interrupts
  vio: fix reuse of vio_dring slot
  sunvdc: limit each sg segment to a page
  sunvdc: compute vdisk geometry from capacity
  sunvdc: add cdrom and v1.1 protocol support
  sparc: VIO protocol version 1.6
  sparc64: Fix hibernation code refrence to PAGE_OFFSET.
  sparc64: Move request_irq() from ldc_bind() to ldc_alloc()
  ...
2014-10-11 20:36:34 -04:00
Linus Torvalds
81ae31d782 Merge tag 'stable/for-linus-3.18-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen updates from David Vrabel:
 "Features and fixes:

   - Add pvscsi frontend and backend drivers.
   - Remove _PAGE_IOMAP PTE flag, freeing it for alternate uses.
   - Try and keep memory contiguous during PV memory setup (reduces
     SWIOTLB usage).
   - Allow front/back drivers to use threaded irqs.
   - Support large initrds in PV guests.
   - Fix PVH guests in preparation for Xen 4.5"

* tag 'stable/for-linus-3.18-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (22 commits)
  xen: remove DEFINE_XENBUS_DRIVER() macro
  xen/xenbus: Remove BUG_ON() when error string trucated
  xen/xenbus: Correct the comments for xenbus_grant_ring()
  x86/xen: Set EFER.NX and EFER.SCE in PVH guests
  xen: eliminate scalability issues from initrd handling
  xen: sync some headers with xen tree
  xen: make pvscsi frontend dependant on xenbus frontend
  arm{,64}/xen: Remove "EXPERIMENTAL" in the description of the Xen options
  xen-scsifront: don't deadlock if the ring becomes full
  x86: remove the Xen-specific _PAGE_IOMAP PTE flag
  x86/xen: do not use _PAGE_IOMAP PTE flag for I/O mappings
  x86: skip check for spurious faults for non-present faults
  xen/efi: Directly include needed headers
  xen-scsiback: clean up a type issue in scsiback_make_tpg()
  xen-scsifront: use GFP_ATOMIC under spin_lock
  MAINTAINERS: Add xen pvscsi maintainer
  xen-scsiback: Add Xen PV SCSI backend driver
  xen-scsifront: Add Xen PV SCSI frontend driver
  xen: Add Xen pvSCSI protocol description
  xen/events: support threaded irqs for interdomain event channels
  ...
2014-10-11 20:29:01 -04:00
Sergey Senozhatsky
015254daf1 zram: use notify_free to account all free notifications
`notify_free' device attribute accounts the number of slot free
notifications and internally represents the number of zram_free_page()
calls.  Slot free notifications are sent only when device is used as a
swap device, hence `notify_free' is used only for swap devices.  Since
f4659d8e62 (zram: support REQ_DISCARD) ZRAM handles yet another one
free notification (also via zram_free_page() call) -- REQ_DISCARD
requests, which are sent by a filesystem, whenever some data blocks are
discarded.  However, there is no way to know the number of notifications
in the latter case.

Use `notify_free' to account the number of pages freed by
zram_bio_discard() and zram_slot_free_notify().  Depending on usage
scenario `notify_free' represents:

 a) the number of pages freed because of slot free notifications, which is
   equal to the number of swap_slot_free_notify() calls, so there is no
   behaviour change

 b) the number of pages freed because of REQ_DISCARD notifications

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:03 -04:00
Minchan Kim
461a8eee6a zram: report maximum used memory
Normally, zram user could get maximum memory usage zram consumed via
polling mem_used_total with sysfs in userspace.

But it has a critical problem because user can miss peak memory usage
during update inverval of polling.  For avoiding that, user should poll it
with shorter interval(ie, 0.0000000001s) with mlocking to avoid page fault
delay when memory pressure is heavy.  It would be troublesome.

This patch adds new knob "mem_used_max" so user could see the maximum
memory usage easily via reading the knob and reset it via "echo 0 >
/sys/block/zram0/mem_used_max".

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Dan Streetman <ddstreet@ieee.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: <juno.choi@lge.com>
Cc: <seungho1.park@lge.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Seth Jennings <sjennings@variantweb.net>
Reviewed-by: David Horner <ds2horner@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:02 -04:00
Minchan Kim
9ada9da957 zram: zram memory size limitation
Since zram has no control feature to limit memory usage, it makes hard to
manage system memrory.

This patch adds new knob "mem_limit" via sysfs to set up the a limit so
that zram could fail allocation once it reaches the limit.

In addition, user could change the limit in runtime so that he could
manage the memory more dynamically.

Initial state is no limit so it doesn't break old behavior.

[akpm@linux-foundation.org: fix typo, per Sergey]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: <juno.choi@lge.com>
Cc: <seungho1.park@lge.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: David Horner <ds2horner@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:02 -04:00
Minchan Kim
722cdc1723 zsmalloc: change return value unit of zs_get_total_size_bytes
zs_get_total_size_bytes returns a amount of memory zsmalloc consumed with
*byte unit* but zsmalloc operates *page unit* rather than byte unit so
let's change the API so benefit we could get is that reduce unnecessary
overhead (ie, change page unit with byte unit) in zsmalloc.

Since return type is pages, "zs_get_total_pages" is better than
"zs_get_total_size_bytes".

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Dan Streetman <ddstreet@ieee.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: <juno.choi@lge.com>
Cc: <seungho1.park@lge.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: David Horner <ds2horner@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:02 -04:00
Al Viro
8e3fb059ae rsxx debugfs inanity
check with the author of that horror...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09 02:39:07 -04:00
Linus Torvalds
28596c9722 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull "trivial tree" updates from Jiri Kosina:
 "Usual pile from trivial tree everyone is so eagerly waiting for"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  Remove MN10300_PROC_MN2WS0038
  mei: fix comments
  treewide: Fix typos in Kconfig
  kprobes: update jprobe_example.c for do_fork() change
  Documentation: change "&" to "and" in Documentation/applying-patches.txt
  Documentation: remove obsolete pcmcia-cs from Changes
  Documentation: update links in Changes
  Documentation: Docbook: Fix generated DocBook/kernel-api.xml
  score: Remove GENERIC_HAS_IOMAP
  gpio: fix 'CONFIG_GPIO_IRQCHIP' comments
  tty: doc: Fix grammar in serial/tty
  dma-debug: modify check_for_stack output
  treewide: fix errors in printk
  genirq: fix reference in devm_request_threaded_irq comment
  treewide: fix synchronize_rcu() in comments
  checkstack.pl: port to AArch64
  doc: queue-sysfs: minor fixes
  init/do_mounts: better syntax description
  MIPS: fix comment spelling
  powerpc/simpleboot: fix comment
  ...
2014-10-07 21:16:26 -04:00
David Vrabel
95afae4814 xen: remove DEFINE_XENBUS_DRIVER() macro
The DEFINE_XENBUS_DRIVER() macro looks a bit weird and causes sparse
errors.

Replace the uses with standard structure definitions instead.  This is
similar to pci and usb device registration.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-06 10:27:57 +01:00
Mike Snitzer
b277da0a8a block: disable entropy contributions for nonrot devices
Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set
QUEUE_FLAG_NONROT.

Historically, all block devices have automatically made entropy
contributions.  But as previously stated in commit e2e1a148 ("block: add
sysfs knob for turning off disk entropy contributions"):
    - On SSD disks, the completion times aren't as random as they
      are for rotational drives. So it's questionable whether they
      should contribute to the random pool in the first place.
    - Calling add_disk_randomness() has a lot of overhead.

There are more reliable sources for randomness than non-rotational block
devices.  From a security perspective it is better to err on the side of
caution than to allow entropy contributions from unreliable "random"
sources.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-10-04 10:55:32 -06:00
Jens Axboe
7b7b7f7e02 Merge branch 'stable/for-jens-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-3.18/drivers
Konrad writes:

This pull has two fixes and one cleanup. Nothing earthshattering.
2014-10-01 14:37:25 -06:00
Arianna Avanzini
0f1ca65ee5 xen, blkfront: factor out flush-related checks from do_blkif_request()
This commit factors out some checks related to the request insertion
path, which can be done in an function instead of by itself.

Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-10-01 16:32:39 -04:00
Roger Pau Monné
61cecca865 xen-blkback: fix leak on grant map error path
Fix leaking a page when a grant mapping has failed.

CC: stable@vger.kernel.org
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-and-Tested-by: Tao Chen <boby.chen@huawei.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-10-01 16:32:31 -04:00
Vitaly Kuznetsov
12ea729645 xen/blkback: unmap all persistent grants when frontend gets disconnected
blkback does not unmap persistent grants when frontend goes to Closed
state (e.g. when blkfront module is being removed). This leads to the
following in guest's dmesg:

[  343.243825] xen:grant_table: WARNING: g.e. 0x445 still in use!
[  343.243825] xen:grant_table: WARNING: g.e. 0x42a still in use!
...

When load module -> use device -> unload module sequence is performed multiple times
it is possible to hit BUG() condition in blkfront module:

[  343.243825] kernel BUG at drivers/block/xen-blkfront.c:954!
[  343.243825] invalid opcode: 0000 [#1] SMP
[  343.243825] Modules linked in: xen_blkfront(-) ata_generic pata_acpi [last unloaded: xen_blkfront]
...
[  343.243825] Call Trace:
[  343.243825]  [<ffffffff814111ef>] ? unregister_xenbus_watch+0x16f/0x1e0
[  343.243825]  [<ffffffffa0016fbf>] blkfront_remove+0x3f/0x140 [xen_blkfront]
...
[  343.243825] RIP  [<ffffffffa0016aae>] blkif_free+0x34e/0x360 [xen_blkfront]
[  343.243825]  RSP <ffff88001eb8fdc0>

We don't need to keep these grants if we're disconnecting as frontend might already
forgot about them. Solve the issue by moving xen_blkbk_free_caches() call from
xen_blkif_free() to xen_blkif_disconnect().

Now we can see the following:
[  928.590893] xen:grant_table: WARNING: g.e. 0x587 still in use!
[  928.591861] xen:grant_table: WARNING: g.e. 0x372 still in use!
...
[  929.592146] xen:grant_table: freeing g.e. 0x587
[  929.597174] xen:grant_table: freeing g.e. 0x372
...

Backend does not keep persistent grants any more, reconnect works fine.

CC: stable@vger.kernel.org
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-10-01 16:32:23 -04:00
Michael Opdenacker
baf378126b rsxx: Remove deprecated IRQF_DISABLED
This removes the use of the IRQF_DISABLED flag
from drivers/block/rsxx/core.c

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Acked-by Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-10-01 14:07:39 -06:00
Michael Opdenacker
fc2021fb9b block: hd: remove deprecated IRQF_DISABLED
This patch removes the use of the IRQF_DISABLED flag
from drivers/block/hd.c

It's a NOOP since 2.6.35 and it will be removed one day.

This also removes a related comment which is obsolete too.

Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-10-01 08:16:07 -06:00
Dwight Engen
d0aedcd4f1 vio: fix reuse of vio_dring slot
vio_dring_avail() will allow use of every dring entry, but when the last
entry is allocated then dr->prod == dr->cons which is indistinguishable from
the ring empty condition. This causes the next allocation to reuse an entry.
When this happens in sunvdc, the server side vds driver begins nack'ing the
messages and ends up resetting the ldc channel. This problem does not effect
sunvnet since it checks for < 2.

The fix here is to just never allocate the very last dring slot so that full
and empty are not the same condition. The request start path was changed to
check for the ring being full a bit earlier, and to stop the blk_queue if
there is no space left. The blk_queue will be restarted once the ring is
only half full again. The number of ring entries was increased to 512 which
matches the sunvnet and Solaris vdc drivers, and greatly reduces the
frequency of hitting the ring full condition and the associated blk_queue
stop/starting. The checks in sunvent were adjusted to account for
vio_dring_avail() returning 1 less.

Orabug: 19441666
OraBZ: 14983

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 14:37:35 -07:00
Dwight Engen
5eed69ffd2 sunvdc: limit each sg segment to a page
ldc_map_sg() could fail its check that the number of pages referred to
by the sg scatterlist was <= the number of cookies.

This fixes the issue by doing a similar thing to the xen-blkfront driver,
ensuring that the scatterlist will only ever contain a segment count <=
port->ring_cookies, and each segment will be page aligned, and <= page
size. This ensures that the scatterlist is always mappable.

Orabug: 19347817
OraBZ: 15945

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 14:37:35 -07:00
Allen Pais
de5b73f084 sunvdc: compute vdisk geometry from capacity
The LDom diskserver doesn't return reliable geometry data. In addition,
the types for all fields in the vio_disk_geom are u16, which were being
truncated in the cast into the u8's of the Linux struct hd_geometry.

Modify vdc_getgeo() to compute the geometry from the disk's capacity in a
manner consistent with xen-blkfront::blkif_getgeo().

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 14:37:35 -07:00
Allen Pais
9bce21828d sunvdc: add cdrom and v1.1 protocol support
Interpret the media type from v1.1 protocol to support CDROM/DVD.

For v1.0 protocol, a disk's size continues to be calculated from the
geometry returned by the vdisk server. The geometry returned by the server
can be less than the actual number of sectors available in the backing
image/device due to the rounding in the division used to compute the
geometry in the vdisk server.

In v1.1 protocol a disk's actual size in sectors is returned during the
handshake. Use this size when v1.1 protocol is negotiated. Since this size
will always be larger than the former geometry computed size, disks created
under v1.0 will be forwards compatible to v1.1, but not vice versa.

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 14:37:34 -07:00
Christoph Hellwig
c8a446ad69 blk-mq: rename blk_mq_end_io to blk_mq_end_request
Now that we've changed the driver API on the submission side use the
opportunity to fix up the name on the completion side to fit into the
general scheme.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-22 12:00:07 -06:00
Christoph Hellwig
e2490073cd blk-mq: call blk_mq_start_request from ->queue_rq
When we call blk_mq_start_request from the core blk-mq code before calling into
->queue_rq there is a racy window where the timeout handler can hit before we've
fully set up the driver specific part of the command.

Move the call to blk_mq_start_request into the driver so the driver can start
the request only once it is fully set up.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-22 12:00:07 -06:00
Christoph Hellwig
bf57229745 blk-mq: remove REQ_END
Pass an explicit parameter for the last request in a batch to ->queue_rq
instead of using a request flag.  Besides being a cleaner and non-stateful
interface this is also required for the next patch, which fixes the blk-mq
I/O submission code to not start a time too early.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-22 12:00:07 -06:00
Jens Axboe
6d11fb454b Merge branch 'for-linus' into for-3.18/core
Moving patches from for-linus to 3.18 instead, pull in this changes
that will go to Linus today.
2014-09-22 11:57:32 -06:00
Lai Jiangshan
e9f05b4cfe drbd: use RB_DECLARE_CALLBACKS() to define augment callbacks
The original code are the same as RB_DECLARE_CALLBACKS().

CC: Michel Lespinasse <walken@google.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-18 09:00:17 -06:00
Lai Jiangshan
82cfb90bc9 drbd: compute the end before rb_insert_augmented()
Commit 98683650 "Merge branch 'drbd-8.4_ed6' into
for-3.8-drivers-drbd-8.4_ed6" switches to the new augment API, but the
new API requires that the tree is augmented before rb_insert_augmented()
is called, which is missing.

So we add the augment-code to drbd_insert_interval() when it travels the
tree up to down before rb_insert_augmented().  See the example in
include/linux/interval_tree_generic.h or Documentation/rbtree.txt.

drbd_insert_interval() may cancel the insertion when traveling, in this
case, the just added augment-code does nothing before cancel since the
@this node is already in the subtrees in this case.

CC: Michel Lespinasse <walken@google.com>
CC: stable@kernel.org # v3.10+
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-18 09:00:16 -06:00
Linus Torvalds
645cc09381 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A small collection of fixes for the current rc series.  This contains:

   - Two small blk-mq patches from Rob Elliott, cleaning up error case
     at init time.

   - A fix from Ming Lei, fixing SG merging for blk-mq where
     QUEUE_FLAG_SG_NO_MERGE is the default.

   - A dev_t minor lifetime fix from Keith, fixing an issue where a
     minor might be reused before all references to it were gone.

   - Fix from Alan Stern where an unbalanced queue bypass caused SCSI
     some headaches when it does a series of add/del on devices without
     fully registrering the queue.

   - A fix from me for improving the scaling of tag depth in blk-mq if
     we are short on memory"

* 'for-linus' of git://git.kernel.dk/linux-block:
  blk-mq: scale depth and rq map appropriate if low on memory
  Block: fix unbalanced bypass-disable in blk_register_queue
  block: Fix dev_t minor allocation lifetime
  blk-mq: cleanup after blk_mq_init_rq_map failures
  blk-mq: pass along blk_mq_alloc_tag_set return values
  blk-merge: fix blk_recount_segments
2014-09-13 09:39:55 -07:00
Jens Axboe
b207892b06 Merge branch 'for-linus' into for-3.18/core
A bit of churn on the for-linus side that would be nice to have
in the core bits for 3.18, so pull it in to catch us up and make
forward progress easier.

Signed-off-by: Jens Axboe <axboe@fb.com>

Conflicts:
	block/scsi_ioctl.c
2014-09-11 09:31:18 -06:00
Philipp Reisner
590001c229 drbd: Add missing newline in resync progress display in /proc/drbd
Was broken in 2010 with commit 4b0715f096

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-11 08:41:29 -06:00
Lars Ellenberg
729e8b87ba drbd: reduce lock contention in drbd_worker
The worker may now dequeue work items in batches.
This should reduce lock contention during busy periods.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-11 08:41:29 -06:00
Lars Ellenberg
abde9cc6a5 drbd: Improve asender performance
Shorten receive path in the asender thread. Reduces CPU utilisation
of asender when receiving packets, and with that increases IOPs.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-11 08:41:29 -06:00
Andreas Gruenbacher
b47a06d105 drbd: Get rid of the WORK_PENDING macro
This macro doesn't add any value; just use test_bit() instead.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-11 08:41:29 -06:00
Andreas Gruenbacher
d1b8085356 drbd: Get rid of the __no_warn and __cond_lock macros
These macros can easily be replaced with its definition.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-11 08:41:29 -06:00
Andreas Gruenbacher
8d4ba3f0fa drbd: Avoid inconsistent locking warning
request_timer_fn() takes resource->req_lock via the device and releases it via
the connection.  Avoid this as it is confusing static code checkers.

Reported-by: "Dan Carpenter" <dan.carpenter@oracle.com>
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-11 08:41:29 -06:00
Philipp Marek
f0c21e6228 drbd: Remove superfluous newline from "resync_extents" debugfs entry.
See "drbd/resources/*/volumes/*/resync_extents".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-11 08:41:29 -06:00
Andreas Gruenbacher
ed15b79509 drbd: Use consistent names for all the bi_end_io callbacks
Now they follow the _endio naming sheme.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-11 08:41:29 -06:00
Andreas Gruenbacher
11f8b2b69d drbd: Use better variable names
Rename local variable 'ds' to 'disk_state' or 'data_size'.
'dgs' to 'digest_size'

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-11 08:41:29 -06:00
Wei Yongjun
255939e783 rbd: fix error return code in rbd_dev_device_setup()
Fix to return -ENOMEM from the workqueue alloc error handling
case instead of 0, as done elsewhere in this function.

Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
2014-09-10 11:59:06 +04:00
Ilya Dryomov
58d1362b50 rbd: avoid format-security warning inside alloc_workqueue()
drivers/block/rbd.c: In function ‘rbd_dev_device_setup’:
drivers/block/rbd.c:5090:19: warning: format not a string literal and no format arguments [-Wformat-security]

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-09-10 11:59:06 +04:00
Robert Elliott
dc501dc0d9 blk-mq: pass along blk_mq_alloc_tag_set return values
Two of the blk-mq based drivers do not pass back the return value
from blk_mq_alloc_tag_set, instead just returning -ENOMEM.

blk_mq_alloc_tag_set returns -EINVAL if the number of queues or
queue depth is bad.  -ENOMEM implies that retrying after freeing some
memory might be more successful, but that won't ever change
in the -EINVAL cases.

Change the null_blk and mtip32xx drivers to pass along
the return value.

Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-02 15:39:03 -06:00
Linus Torvalds
10f3291a1d Merge branch 'akpm' (fixes from Andrew Morton)
Merge patches from Andrew Morton:
 "22 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
  kexec: purgatory: add clean-up for purgatory directory
  Documentation/kdump/kdump.txt: add ARM description
  flush_icache_range: export symbol to fix build errors
  tools: selftests: fix build issue with make kselftests target
  ocfs2: quorum: add a log for node not fenced
  ocfs2: o2net: set tcp user timeout to max value
  ocfs2: o2net: don't shutdown connection when idle timeout
  ocfs2: do not write error flag to user structure we cannot copy from/to
  x86/purgatory: use approprate -m64/-32 build flag for arch/x86/purgatory
  drivers/rtc/rtc-s5m.c: re-add support for devices without irq specified
  xattr: fix check for simultaneous glibc header inclusion
  kexec: remove CONFIG_KEXEC dependency on crypto
  kexec: create a new config option CONFIG_KEXEC_FILE for new syscall
  x86,mm: fix pte_special versus pte_numa
  hugetlb_cgroup: use lockdep_assert_held rather than spin_is_locked
  mm/zpool: use prefixed module loading
  zram: fix incorrect stat with failed_reads
  lib: turn CONFIG_STACKTRACE into an actual option.
  mm: actually clear pmd_numa before invalidating
  memblock, memhotplug: fix wrong type in memblock_find_in_range_node().
  ...
2014-08-29 16:28:29 -07:00
Chao Yu
0cf1e9d6c3 zram: fix incorrect stat with failed_reads
Since we allocate a temporary buffer in zram_bvec_read to handle partial
page operations in commit 924bd88d70 ("Staging: zram: allow partial
page operations"), our ->failed_reads value may be incorrect as we do
not increase its value when failing to allocate the temporary buffer.

Let's fix this issue and correct the annotation of failed_reads.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-29 16:28:16 -07:00
Joe Lawrence
a492f07545 block,scsi: fixup blk_get_request dead queue scenarios
The blk_get_request function may fail in low-memory conditions or during
device removal (even if __GFP_WAIT is set). To distinguish between these
errors, modify the blk_get_request call stack to return the appropriate
ERR_PTR. Verify that all callers check the return status and consider
IS_ERR instead of a simple NULL pointer check.

For consistency, make a similar change to the blk_mq_alloc_request leg
of blk_get_request.  It may fail if the queue is dead, or the caller was
unwilling to wait.

Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Jiri Kosina <jkosina@suse.cz> [for pktdvd]
Acked-by: Boaz Harrosh <bharrosh@panasas.com> [for osd]
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-08-28 10:03:46 -06:00
Joe Lawrence
eb571eeade block,scsi: verify return pointer from blk_get_request
The blk-core dead queue checks introduce an error scenario to
blk_get_request that returns NULL if the request queue has been
shutdown. This affects the behavior for __GFP_WAIT callers, who should
verify the return value before dereferencing.

Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Jiri Kosina <jkosina@suse.cz> [for pktdvd]
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-08-26 15:20:23 -06:00
Geert Uytterhoeven
336ec13734 paride/pcd: Fix grammar
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-08-26 09:35:57 +02:00
Dmitry Monakhov
aeac318194 brd: add ram disk visibility option
Currenly ram disk is not visiable inside /proc/partitions. This was
done for compatibility reasons here: 53978d0a7a. But some utilities
expect disk presents in /proc/partitions.
Let's add module's option and let's administrator chose visibility behaviour.
By default, old behaviour preserved.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-08-21 20:42:01 -05:00
Michal Simek
ffb5db73eb block: systemace: Remove .owner field for driver
There is no need to init .owner field.

Based on the patch from Peter Griffin <peter.griffin@linaro.org>
"mmc: remove .owner field for drivers using module_platform_driver"

This patch removes the superflous .owner field for drivers which
use the module_platform_driver API, as this is overriden in
platform_driver_register anyway."

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-08-21 20:37:54 -05:00
Linus Torvalds
a11c5c9ef6 Merge tag 'pci-v3.17-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull DEFINE_PCI_DEVICE_TABLE removal from Bjorn Helgaas:
 "Part two of the PCI changes for v3.17:

    - Remove DEFINE_PCI_DEVICE_TABLE macro use (Benoit Taine)

  It's a mechanical change that removes uses of the
  DEFINE_PCI_DEVICE_TABLE macro.  I waited until later in the merge
  window to reduce conflicts, but it's possible you'll still see a few"

* tag 'pci-v3.17-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Remove DEFINE_PCI_DEVICE_TABLE macro use
2014-08-14 18:10:33 -06:00
Linus Torvalds
d429a3639c Merge branch 'for-3.17/drivers' of git://git.kernel.dk/linux-block
Pull block driver changes from Jens Axboe:
 "Nothing out of the ordinary here, this pull request contains:

   - A big round of fixes for bcache from Kent Overstreet, Slava Pestov,
     and Surbhi Palande.  No new features, just a lot of fixes.

   - The usual round of drbd updates from Andreas Gruenbacher, Lars
     Ellenberg, and Philipp Reisner.

   - virtio_blk was converted to blk-mq back in 3.13, but now Ming Lei
     has taken it one step further and added support for actually using
     more than one queue.

   - Addition of an explicit SG_FLAG_Q_AT_HEAD for block/bsg, to
     compliment the the default behavior of adding to the tail of the
     queue.  From Douglas Gilbert"

* 'for-3.17/drivers' of git://git.kernel.dk/linux-block: (86 commits)
  bcache: Drop unneeded blk_sync_queue() calls
  bcache: add mutex lock for bch_is_open
  bcache: Correct printing of btree_gc_max_duration_ms
  bcache: try to set b->parent properly
  bcache: fix memory corruption in init error path
  bcache: fix crash with incomplete cache set
  bcache: Fix more early shutdown bugs
  bcache: fix use-after-free in btree_gc_coalesce()
  bcache: Fix an infinite loop in journal replay
  bcache: fix crash in bcache_btree_node_alloc_fail tracepoint
  bcache: bcache_write tracepoint was crashing
  bcache: fix typo in bch_bkey_equal_header
  bcache: Allocate bounce buffers with GFP_NOWAIT
  bcache: Make sure to pass GFP_WAIT to mempool_alloc()
  bcache: fix uninterruptible sleep in writeback thread
  bcache: wait for buckets when allocating new btree root
  bcache: fix crash on shutdown in passthrough mode
  bcache: fix lockdep warnings on shutdown
  bcache allocator: send discards with correct size
  bcache: Fix to remove the rcu_sched stalls.
  ...
2014-08-14 09:10:21 -06:00
Linus Torvalds
8d2d441ac4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph updates from Sage Weil:
 "There is a lot of refactoring and hardening of the libceph and rbd
  code here from Ilya that fix various smaller bugs, and a few more
  important fixes with clone overlap.  The main fix is a critical change
  to the request_fn handling to not sleep that was exposed by the recent
  mutex changes (which will also go to the 3.16 stable series).

  Yan Zheng has several fixes in here for CephFS fixing ACL handling,
  time stamps, and request resends when the MDS restarts.

  Finally, there are a few cleanups from Himangi Saraogi based on
  Coccinelle"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (39 commits)
  libceph: set last_piece in ceph_msg_data_pages_cursor_init() correctly
  rbd: remove extra newlines from rbd_warn() messages
  rbd: allocate img_request with GFP_NOIO instead GFP_ATOMIC
  rbd: rework rbd_request_fn()
  ceph: fix kick_requests()
  ceph: fix append mode write
  ceph: fix sizeof(struct tYpO *) typo
  ceph: remove redundant memset(0)
  rbd: take snap_id into account when reading in parent info
  rbd: do not read in parent info before snap context
  rbd: update mapping size only on refresh
  rbd: harden rbd_dev_refresh() and callers a bit
  rbd: split rbd_dev_spec_update() into two functions
  rbd: remove unnecessary asserts in rbd_dev_image_probe()
  rbd: introduce rbd_dev_header_info()
  rbd: show the entire chain of parent images
  ceph: replace comma with a semicolon
  rbd: use rbd_segment_name_free() instead of kfree()
  ceph: check zero length in ceph_sync_read()
  ceph: reset r_resend_mds after receiving -ESTALE
  ...
2014-08-13 17:43:29 -06:00