linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-21 00:04:01 -04:00

Author	SHA1	Message	Date
Sergey Senozhatsky	10447b60be	zram: remove `num_migrated' device attr This patch introduces rework to zram stats. We have per-stat sysfs nodes, and it makes things a bit hard to use in user space: it doesn't give an immediate stats 'snapshot', it requires user space to use more syscalls - open, read, close for every stat file, with appropriate error checks on every step, etc. First, zram now accounts block layer statistics, available in /sys/block/zram<id>/stat and /proc/diskstats files. So some new stats are available (see Documentation/block/stat.txt), besides, zram's activities now can be monitored by sysstat's iostat or similar tools. Example: cat /sys/block/zram0/stat 248 0 1984 0 251029 0 2008232 5120 0 5116 5116 Second, group currently exported on per-stat basis nodes into two categories (files): -- zram<id>/io_stat accumulates device's IO stats, that are not accounted by block layer, and contains: failed_reads failed_writes invalid_io notify_free Example: cat /sys/block/zram0/io_stat 0 0 0 652572 -- zram<id>/mm_stat accumulates zram mm stats and contains: orig_data_size compr_data_size mem_used_total mem_limit mem_used_max zero_pages num_migrated Example: cat /sys/block/zram0/mm_stat 434634752 270288572 279158784 0 579895296 15060 0 per-stat sysfs nodes are now considered to be deprecated and we plan to remove them (and clean up some of the existing stat code) in two years (as of now, there is no warning printed to syslog about deprecated stats being used). User space is advised to use the above mentioned 3 files. This patch (of 7): Remove sysfs `num_migrated' attribute. We are moving away from per-stat device attrs towards 3 stat files that will accumulate io and mm stats in a format similar to block layer statistics in /sys/block/<dev>/stat. That will be easier to use in user space, and reduce the number of syscalls needed to read zram device statistics. `num_migrated' will return back in zram<id>/mm_stat file. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-04-15 16:35:21 -07:00
Minchan Kim	4e3ba87845	zram: support compaction Now that zsmalloc supports compaction, zram can use it. For the first step, this patch exports compact knob via sysfs so user can do compaction via "echo 1 > /sys/block/zram0/compact". Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Juneho Choi <juno.choi@lge.com> Cc: Gunho Lee <gunho.lee@lge.com> Cc: Luigi Semenzato <semenzato@google.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Seth Jennings <sjennings@variantweb.net> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-04-15 16:35:21 -07:00
Linus Torvalds	fa927894bb	Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull second vfs update from Al Viro: "Now that net-next went in... Here's the next big chunk - killing ->aio_read() and ->aio_write(). There'll be one more pile today (direct_IO changes and generic_write_checks() cleanups/fixes), but I'd prefer to keep that one separate" * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits) ->aio_read and ->aio_write removed pcm: another weird API abuse infinibad: weird APIs switched to ->write_iter() kill do_sync_read/do_sync_write fuse: use iov_iter_get_pages() for non-splice path fuse: switch to ->read_iter/->write_iter switch drivers/char/mem.c to ->read_iter/->write_iter make new_sync_{read,write}() static coredump: accept any write method switch /dev/loop to vfs_iter_write() serial2002: switch to __vfs_read/__vfs_write ashmem: use __vfs_read() export __vfs_read() autofs: switch to __vfs_write() new helper: __vfs_write() switch hugetlbfs to ->read_iter() coda: switch to ->read_iter/->write_iter ncpfs: switch to ->read_iter/->write_iter net/9p: remove (now-)unused helpers p9_client_attach(): set fid->uid correctly ...	2015-04-15 13:22:56 -07:00
David Howells	75c3cfa855	VFS: assorted weird filesystems: d_inode() annotations Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-15 15:06:58 -04:00
Christoph Hellwig	aa4d86163e	block: loop: switch to VFS ITER_BVEC Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-15 15:06:33 -04:00
Wei Liu	ccc9d90a9a	xenbus_client: Extend interface to support multi-page ring Originally Xen PV drivers only use single-page ring to pass along information. This might limit the throughput between frontend and backend. The patch extends Xenbus driver to support multi-page ring, which in general should improve throughput if ring is the bottleneck. Changes to various frontend / backend to adapt to the new interface are also included. Affected Xen drivers: * blkfront/back * netfront/back * pcifront/back * scsifront/back * vtpmfront The interface is documented, as before, in xenbus_client.c. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Cc: Konrad Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-04-15 10:56:47 +01:00
Al Viro	283e7e5d24	switch /dev/loop to vfs_iter_write() all writable files that might be used as backing store for /dev/loop already support ->write_iter() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:29:39 -04:00
James Bottomley	b9f28d8635	sd, mmc, virtio_blk, string_helpers: fix block size units The current string_get_size() overflows when the device size goes over 2^64 bytes because the string helper routine computes the suffix from the size in bytes. However, the entirety of Linux thinks in terms of blocks, not bytes, so this will artificially induce an overflow on very large devices. Fix this by making the function string_get_size() take blocks and the block size instead of bytes. This should allow us to keep working until the current SCSI standard overflows. Also fix virtio_blk and mmc (both of which were also artificially multiplying by the block size to pass a byte side to string_get_size()). The mathematics of this is pretty simple: we're taking a product of size in blocks (S) and block size (B) and trying to re-express this in exponential form: SB = RN^E (where N, the exponent is either 1000 or 1024) and R < N. Mathematically, S = RSN^ES and B=RBN^EB, so if RSRB < N it's easy to see that SB = RSRBN^(ES+EB). However, if RSBS > N, we can see that this can be re-expressed as RSBS = RN (where R = RSBS/N < N) so the whole exponent becomes R*N^(ES+EB+1) [jejb: fix incorrect 32 bit do_div spotted by kbuild test robot <fengguang.wu@intel.com>] Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-04-10 16:27:48 -07:00
Thomas Gleixner	462b69b1e4	Merge branch 'linus' into irq/core to get the GIC updates which conflict with pending GIC changes. Conflicts: drivers/usb/isp1760/isp1760-core.c	2015-04-08 23:26:21 +02:00
Jens Axboe	976f2ab498	Merge branch 'stable/for-jens-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-4.1/drivers Konrad writes: This pull has one fix and an cleanup. Note that David Vrabel in the xen/tip.git tree has other changes for the Xen block drivers that are related to his grant work - and they do not conflict with this git pull.	2015-04-07 20:00:45 -06:00
Keith Busch	a67a95134f	NVMe: Meta data handling through submit io ioctl This adds support for the extended metadata formats through the submit IO ioctl, and simplifies the rest when using a separate metadata format. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-04-07 19:11:06 -06:00
Keith Busch	7f749d9c10	NVMe: Add translation for block limits Adds SCSI-to-NVMe translation for VPD B0h, block limits inquiry data. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-04-07 19:11:04 -06:00
Keith Busch	447228023e	NVMe: Remove check for null Checking fails static analysis due to additional arithmetic prior to the NULL check. Mapping doesn't return NULL here anyway, so removing the check. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-04-07 19:10:51 -06:00
Alexey Khoroshilov	c727040bda	NVMe: Fix error handling of class_create("nvme") class_create() returns ERR_PTR on failure, so IS_ERR() should be used instead of check for NULL. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-04-07 19:08:58 -06:00
Tao Chen	77387b82d1	xen-blkback: define pr_fmt macro to avoid the duplication of DRV_PFX Define pr_fmt macro with {xen-blkback: } prefix, then remove all use of DRV_PFX in the pr sentences. Replace all DPRINTK with pr sentences, and get rid of DPRINTK macro. It will simplify the code. And if the pr sentences miss a \n, add it in the end. If the DPRINTK sentences have redundant \n, remove it. It will format the code. These all make the readability of the code become better. Signed-off-by: Tao Chen <boby.chen@huawei.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>	2015-04-07 10:32:43 -04:00
Tao Chen	1375590d3e	xen-blkback: enlarge the array size of blkback name The blkback name is like blkback.domid.xvd[a-z], if domid has four digits (means larger than 1000), then the backmost xvd wouldn't be fully shown. Define a BLKBACK_NAME_LEN macro to be 20, enlarge the array size of blkback name, so it will be fully shown in any case. Signed-off-by: Tao Chen <boby.chen@huawei.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>	2015-04-07 10:32:42 -04:00
Markus Pargmann	de9ad6d4ed	nbd: Return error pointer directly Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-04-02 12:39:28 -06:00
Markus Pargmann	dab5313aa4	nbd: Return error code directly By returning the error code directly, we can avoid the jump label error_out. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-04-02 12:39:26 -06:00
Markus Pargmann	e018e7570c	nbd: Remove fixme that was already fixed The mentioned problem is not present anymore. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-04-02 12:39:24 -06:00
Markus Pargmann	d18509f597	nbd: Restructure debugging prints dprintk has some name collisions with other frameworks and drivers. It is also not necessary to have these custom debug print filters. Dynamic debug offers the same amount of filtered debugging. This patch replaces all dprintks with dev_dbg(). It also removes the ioctl dprintk which prints the ingoing ioctls which should be replaceable by strace or similar stuff. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-04-02 12:39:22 -06:00
Markus Pargmann	b9c495bb6d	nbd: Fix device bytesize type The block subsystem uses loff_t to store the device size. Change the type for nbd_device bytesize to loff_t. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-04-02 12:39:21 -06:00
Markus Pargmann	d06df60b94	nbd: Replace kthread_create with kthread_run kthread_run includes the wake_up_process() call, so instead of kthread_create() followed by wake_up_process() we can use this macro. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-04-02 12:39:19 -06:00
Markus Pargmann	13e71d69cc	nbd: Remove kernel internal header The header is not included anywhere. Remove it and include the private nbd_device struct in nbd.c. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-04-02 12:39:18 -06:00
Ingo Molnar	4c1eaa2344	drivers/block/pmem: Fix 32-bit build warning in pmem_alloc() Fix: drivers/block/pmem.c: In function ‘pmem_alloc’: drivers/block/pmem.c:138:7: warning: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘phys_addr_t’ [-Wformat=] By using the proper %pa format specifier we use for 'phys_addr_t' arguments. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jens Axboe <axboe@fb.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Keith Busch <keith.busch@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-nvdimm@ml01.01.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 17:03:57 +02:00
Ross Zwisler	9e853f2313	drivers/block/pmem: Add a driver for persistent memory PMEM is a new driver that presents a reserved range of memory as a block device. This is useful for developing with NV-DIMMs, and can be used with volatile memory as a development platform. This patch contains the initial driver from Ross Zwisler, with various changes: converted it to use a platform_device for discovery, fixed partition support and merged various patches from Boaz Harrosh. Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jens Axboe <axboe@fb.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Keith Busch <keith.busch@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-nvdimm@ml01.01.org Link: http://lkml.kernel.org/r/1427872339-6688-3-git-send-email-hch@lst.de [ Minor cleanups. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 17:03:56 +02:00
Jens Axboe	d31af0a325	NVMe: increase depth of admin queue Usually the admin queue depth of 64 is plenty, but for some use cases we really need it larger. Examples are use cases like MAT, where you have to touch all of NAND for init/format like purposes. In those cases, we see a good 2x increase with an increased queue depth. Signed-off-by: Jens Axboe <axboe@fb.com> Acked-by: Keith Busch <keith.busch@intel.com>	2015-03-31 10:41:34 -06:00
Murali Iyer	f137e0f151	nvme: Fix PRP list calculation for non-4k system page size PRP list calculation is supposed to be based on device's page size. Systems with page size larger than device's page size cause corruption to the name space as well as system memory with out this fix. Systems like x86 might not experience this issue because it uses PAGE_SIZE of 4K where as powerpc uses PAGE_SIZE of 64k while NVMe device's page size varies depending upon the vendor. Signed-off-by: Murali Iyer <mniyer@us.ibm.com> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-03-31 10:39:56 -06:00
Keith Busch	1efccc9ddb	NVMe: Fix blk-mq hot cpu notification The driver may issue commands to a device that may never return, so its request_queue could always have active requests while the controller is running. Waiting for the queue to freeze could block forever, which is what blk-mq's hot cpu notification handler was doing when nvme drives were in use. This has the nvme driver make the asynchronous event command's tag reserved and does not keep the request active. We can't have more than one since the request is released back to the request_queue before the command is completed. Having only one avoids potential tag collisions, and reserving the tag for this purpose prevents other admin tasks from reusing the tag. I also couldn't think of a scenario where issuing AEN requests single depth is worse than issuing them in batches, so I don't think we lose anything with this change. As an added bonus, doing it this way removes "Cancelling I/O" warnings observed when unbinding the nvme driver from a device. Reported-by: Yigal Korman <yigal@plexistor.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-03-31 10:39:56 -06:00
Chong Yuan	fda631ffe5	NVMe: embedded iod mask cleanup Remove unused mask in nvme_alloc_iod Signed-off-by: Chong Yuan <chong.yuan@memblaze.com> Reviewed-by: Wenbo Wang <wenbo.wang@memblaze.com> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-03-31 10:36:41 -06:00
Keith Busch	6df3dbc83f	NVMe: Freeze admin queue on device failure This fixes a race accessing an invalid address when a controller's admin queue is in use during a reset for failure or hot removal occurs. The admin queue will be frozen to prevent new users from entering prior to the doorbell queue being unmapped. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-03-31 10:36:06 -06:00
David Rientjes	cbc4ffdbe7	block, drbd: use mempool_create_slab_pool() Mempools created for slab caches should use mempool_create_slab_pool(). Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-03-24 20:00:18 -06:00
David Rientjes	23fe8f8b11	block, drbd: fix drbd_req_new() initialization mempool_alloc() does not support __GFP_ZERO since elements may come from memory that has already been released by mempool_free(). Remove __GFP_ZERO from mempool_alloc() in drbd_req_new() and properly initialize it to 0. Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-03-24 20:00:16 -06:00
Keith Busch	e6e96d73a2	NVMe: Initialize device list head before starting Driver recovery requires the device's list node to have been initialized. Fixes: https://lkml.org/lkml/2015/3/22/262 Reported-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Keith Busch <keith.busch@intel.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-03-23 09:35:12 -06:00
David Gibson	f571872671	powerpc: Move Power Macintosh drivers to generic byteswappers ppc has special instruction forms to efficiently load and store values in non-native endianness. These can be accessed via the arch-specific {ld,st}_le{16,32}() inlines in arch/powerpc/include/asm/swab.h. However, gcc is perfectly capable of generating the byte-reversing load/store instructions when using the normal, generic cpu_to_le() and le_to_cpu() functions eaning the arch-specific functions don't have much point. Worse the "le" in the names of the arch specific functions is now misleading, because they always generate byte-reversing forms, but some ppc machines can now run a little-endian kernel. To start getting rid of the arch-specific forms, this patch removes them from all the old Power Macintosh drivers, replacing them with the generic byteswappers. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2015-03-23 14:29:40 +11:00
Valentin Rothberg	d8bf368d06	genirq: Remove the deprecated 'IRQF_DISABLED' request_irq() flag entirely The IRQF_DISABLED flag is a NOOP and has been scheduled for removal since Linux v2.6.36 by commit `6932bf37be` ("genirq: Remove IRQF_DISABLED from core code"). According to commit `e58aa3d2d0` ("genirq: Run irq handlers with interrupts disabled"), running IRQ handlers with interrupts enabled can cause stack overflows when the interrupt line of the issuing device is still active. This patch ends the grace period for IRQF_DISABLED (i.e., SA_INTERRUPT in older versions of Linux) and removes the definition and all remaining usages of this flag. There's still a few non-functional references left in the kernel source: - The bigger hunk in Documentation/scsi/ncr53c8xx.txt is removed entirely as IRQF_DISABLED is gone now; the usage in older kernel versions (including the old SA_INTERRUPT flag) should be discouraged. The trouble of using IRQF_SHARED is a general problem and not specific to any driver. - I left the reference in Documentation/PCI/MSI-HOWTO.txt untouched since it has already been removed in linux-next. - All remaining references are changelogs that I suggest to keep. Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com> Cc: Afzal Mohammed <afzal@ti.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Brian Norris <computersforpeace@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Ewan Milne <emilne@redhat.com> Cc: Eyal Perry <eyalpe@mellanox.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Hongliang Tao <taohl@lemote.com> Cc: Huacai Chen <chenhc@lemote.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Keerthy <j-keerthy@ti.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nishanth Menon <nm@ti.com> Cc: Paul Bolle <pebolle@tiscali.nl> Cc: Peter Ujfalusi <peter.ujfalusi@ti.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Lambert <lambert.quentin@gmail.com> Cc: Rajendra Nayak <rnayak@ti.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Sricharan R <r.sricharan@ti.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Lindgren <tony@atomide.com> Cc: Zhou Wang <wangzhou1@hisilicon.com> Cc: iss_storagedev@hp.com Cc: linux-mips@linux-mips.org Cc: linux-mtd@lists.infradead.org Link: http://lkml.kernel.org/r/1425565425-12604-1-git-send-email-valentinrothberg@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-05 20:53:06 +01:00
Jens Axboe	b8be79b714	Merge tag 'nbd_fixes_20150305' of git://git.pengutronix.de/git/mpa/linux-nbd into for-linus NBD fixes based on v4.0-rc1	2015-03-05 07:46:13 -07:00
Sudip Mukherjee	ff6b8090e2	nbd: fix possible memory leak we have already allocated memory for nbd_dev, but we were not releasing that memory and just returning the error value. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Acked-by: Paul Clements <Paul.Clements@SteelEye.com> Cc: <stable@vger.kernel.org> Signed-off-by: Markus Pargmann <mpa@pengutronix.de>	2015-03-05 08:51:03 +01:00
Linus Torvalds	a015d33c98	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block layer fixes from Jens Axboe: "Two smaller fixes for this cycle: - A fixup from Keith so that NVMe compiles without BLK_INTEGRITY, basically just moving the code around appropriately. - A fixup for shm, fixing an oops in shmem_mapping() for mapping with no inode. From Sasha" [ The shmem fix doesn't look block-layer-related, but fixes a bug that happened due to the backing_dev_info removal.. - Linus ] * 'for-linus' of git://git.kernel.dk/linux-block: mm: shmem: check for mapping owner before dereferencing NVMe: Fix for BLK_DEV_INTEGRITY not set	2015-02-28 10:21:57 -08:00
Joonsoo Kim	2ea55a2cae	zram: use proper type to update max_used_pages max_used_pages is defined as atomic_long_t so we need to use unsigned long to keep temporary value for it rather than int which is smaller than unsigned long in a 64 bit system. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-28 09:57:51 -08:00
Keith Busch	52b68d7ef8	NVMe: Fix for BLK_DEV_INTEGRITY not set Need to define and use appropriate functions for when BLK_DEV_INTEGRITY is not set. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-02-23 09:17:54 -08:00
Jens Axboe	decf6d79de	Merge branch 'for-3.20' of git://git.infradead.org/users/kbusch/linux-nvme into for-linus Merge 3.20 NVMe changes from Keith.	2015-02-20 22:12:02 -08:00
Keith Busch	0c0f9b95c8	NVMe: Fix potential corruption on sync commands This makes all sync commands uninterruptible and schedules without timeout so the controller either has to post a completion or the timeout recovery fails the command. This fixes potential memory or data corruption from a command timing out too early or woken by a signal. Previously any DMA buffers mapped for that command would have been released even though we don't know what the controller is planning to do with those addresses. Signed-off-by: Keith Busch <keith.busch@intel.com>	2015-02-19 16:15:38 -07:00
Keith Busch	4832851840	NVMe: Remove unused variables We don't track queues in a llist, subscribe to hot-cpu notifications, or internally retry commands. Delete the unused artifacts. Signed-off-by: Keith Busch <keith.busch@intel.com>	2015-02-19 16:15:38 -07:00
Keith Busch	9ac16938ab	NVMe: Fix scsi mode select llbaa setting It should be a logical bitwise AND, not conditional. Signed-off-by: Keith Busch <keith.busch@intel.com>	2015-02-19 16:15:37 -07:00
Keith Busch	07836e659c	NVMe: Fix potential corruption during shutdown The driver has to end unreturned commands at some point even if the controller has not provided a completion. The driver tried to be safe by deleting IO queues prior to ending all unreturned commands. That should cause the controller to internally abort inflight commands, but IO queue deletion request does not have to be successful, so all bets are off. We still have to make progress, so to be extra safe, this patch doesn't clear a queue to release the dma mapping for a command until after the pci device has been disabled. This patch removes the special handling during device initialization so controller recovery can be done all the time. This is possible since initialization is not inlined with pci probe anymore. Reported-by: Nilish Choudhury <nilesh.choudhury@oracle.com> Signed-off-by: Keith Busch <keith.busch@intel.com>	2015-02-19 16:15:37 -07:00
Keith Busch	2e1d844819	NVMe: Asynchronous controller probe This performs the longest parts of nvme device probe in scheduled work. This speeds up probe significantly when multiple devices are in use. Signed-off-by: Keith Busch <keith.busch@intel.com>	2015-02-19 16:15:36 -07:00
Keith Busch	b3fffdefab	NVMe: Register management handle under nvme class This creates a new class type for nvme devices to register their management character devices with. This is so we do not rely on miscdev to provide enough minors for as many nvme devices some people plan to use. The previous limit was approximately 60 NVMe controllers, depending on the platform and kernel. Now the limit is 1M, which ought to be enough for anybody. Since we have a new device class, it makes sense to attach the block devices under this as well, so part of this patch moves the management handle initialization prior to the namespaces discovery. Signed-off-by: Keith Busch <keith.busch@intel.com>	2015-02-19 16:15:36 -07:00
Keith Busch	4f1982b4e2	NVMe: Update SCSI Inquiry VPD 83h translation The original translation created collisions on Inquiry VPD 83 for many existing devices. Newer specifications provide other ways to translate based on the device's version can be used to create unique identifiers. Version 1.1 provides an EUI64 field that uniquely identifies each namespace, and 1.2 added the longer NGUID field for the same reason. Both follow the IEEE EUI format and readily translate to the SCSI device identification EUI designator type 2h. For devices implementing either, the translation will use this type, defaulting to the EUI64 8-byte type if implemented then NGUID's 16 byte version if not. If neither are provided, the 1.0 translation is used, and is updated to use the SCSI String format to guarantee a unique identifier. Knowing when to use the new fields depends on the nvme controller's revision. The NVME_VS macro was not decoding this correctly, so that is fixed in this patch and moved to a more appropriate place. Since the Identify Namespace structure required an update for the NGUID field, this patch adds the remaining new 1.2 fields to the structure. Signed-off-by: Keith Busch <keith.busch@intel.com>	2015-02-19 16:15:35 -07:00
Keith Busch	e1e5e5641e	NVMe: Metadata format support Adds support for NVMe metadata formats and exposes block devices for all namespaces regardless of their format. Namespace formats that are unusable will have disk capacity set to 0, but a handle to the block device is created to simplify device management. A namespace is not usable when the format requires host interleave block and metadata in single buffer, has no provisioned storage, or has better data but failed to register with blk integrity. The namespace has to be scanned in two phases to support separate metadata formats. The first establishes the sector size and capacity prior to invoking add_disk. If metadata is required, the capacity will be temporarilly set to 0 until it can be revalidated and registered with the integrity extenstions after add_disk completes. The driver relies on the integrity extensions to provide the metadata buffer. NVMe requires this be a single physically contiguous region, so only one integrity segment is allowed per command. If the metadata is used for T10 PI, the driver provides mappings to save and restore the reftag physical block translation. The driver provides no-op functions for generate and verify if metadata is not used for protection information. This way the setup is always provided by the block layer. If a request does not supply a required metadata buffer, the command is failed with bad address. This could only happen if a user manually disables verify/generate on such a disk. The only exception to where this is okay is if the controller is capable of stripping/generating the metadata, which is possible on some types of formats. The metadata scatter gather list now occupies the spot in the nvme_iod that used to be used to link retryable IOD's, but we don't do that anymore, so the field was unused. Signed-off-by: Keith Busch <keith.busch@intel.com>	2015-02-19 16:15:35 -07:00
Linus Torvalds	4533f6e27a	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph changes from Sage Weil: "On the RBD side, there is a conversion to blk-mq from Christoph, several long-standing bug fixes from Ilya, and some cleanup from Rickard Strandqvist. On the CephFS side there is a long list of fixes from Zheng, including improved session handling, a few IO path fixes, some dcache management correctness fixes, and several blocking while !TASK_RUNNING fixes. The core code gets a few cleanups and Chaitanya has added support for TCP_NODELAY (which has been used on the server side for ages but we somehow missed on the kernel client). There is also an update to MAINTAINERS to fix up some email addresses and reflect that Ilya and Zheng are doing most of the maintenance for RBD and CephFS these days. Do not be surprised to see a pull request come from one of them in the future if I am unavailable for some reason" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (27 commits) MAINTAINERS: update Ceph and RBD maintainers libceph: kfree() in put_osd() shouldn't depend on authorizer libceph: fix double __remove_osd() problem rbd: convert to blk-mq ceph: return error for traceless reply race ceph: fix dentry leaks ceph: re-send requests when MDS enters reconnecting stage ceph: show nocephx_require_signatures and notcp_nodelay options libceph: tcp_nodelay support rbd: do not treat standalone as flatten ceph: fix atomic_open snapdir ceph: properly mark empty directory as complete client: include kernel version in client metadata ceph: provide seperate {inode,file}_operations for snapdir ceph: fix request time stamp encoding ceph: fix reading inline data when i_size > PAGE_SIZE ceph: avoid block operation when !TASK_RUNNING (ceph_mdsc_close_sessions) ceph: avoid block operation when !TASK_RUNNING (ceph_get_caps) ceph: avoid block operation when !TASK_RUNNING (ceph_mdsc_sync) rbd: fix error paths in rbd_dev_refresh() ...	2015-02-19 14:14:42 -08:00

... 3 4 5 6 7 ...

4900 Commits