linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-03 14:02:43 -04:00

Author	SHA1	Message	Date
Dan Williams	50690762cf	iommu/vt-d: Fix leaked ioremap mapping iommu_load_old_irte() appears to leak the old_irte mapping after use. Cc: Joerg Roedel <jroedel@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:22:38 +02:00
Kees Cook	2439d4aa92	iommu/vt-d: Avoid format string leaks into iommu_device_create This makes sure it won't be possible to accidentally leak format strings into iommu device names. Current name allocations are safe, but this makes the "%s" explicit. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:15:47 +02:00
Robin Murphy	7b0ce727bf	of: iommu: Silence misleading warning Printing "IOMMU is currently not supported for PCI" for every PCI device probed on a DT-based system proves to be both irritatingly noisy and confusing to users who have misinterpreted it to mean they can no longer use VFIO device assignment. Since configuring DMA masks for PCI devices via of_dma_configure() has not in fact changed anything with regard to IOMMUs there really is nothing to warn about here; shut it up. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:07:49 +02:00
Suman Anna	5835b6a64c	iommu/omap: Align code with open parenthesis Fix all the occurrences of the following check warning generated with the checkpatch --strict option: "CHECK: Alignment should match open parenthesis" Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:04:43 +02:00
Suman Anna	eb642a3f5a	iommu/omap: Use BIT(x) macros in omap-iommu.h Switch to using the BIT(x) macros in omap-iommu.h where possible. This eliminates the following checkpatch check warning: "CHECK: Prefer using the BIT macro" A couple of the warnings were ignored for better readability of the bit-shift for the different values. Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:04:42 +02:00
Suman Anna	5ff98fa68c	iommu/omap: Use BIT(x) macros in omap-iopgtable.h Switch to using the BIT(x) macros in omap-iopgtable.h where possible. This eliminates the following checkpatch check warning: "CHECK: Prefer using the BIT macro" A couple of macros that used zero bit shifting are defined directly to avoid the above warning on one of the macros. Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:04:42 +02:00
Suman Anna	99ee98d6ac	iommu/omap: Remove unnecessary error traces on alloc failures Fix couple of checkpatch warnings of the type, "WARNING: Possible unnecessary 'out of memory' message" Signed-off-by: Suman Anna <s-anna@ti.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:04:28 +02:00
Suman Anna	5b39a37abc	iommu/omap: Remove trailing semi-colon from a macro Remove the trailing semi-colon in the DEBUG_FOPS_RO macro definition. This fixes the checking warning, "WARNING: macros should not use a trailing semicolon" Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:04:26 +02:00
Suman Anna	dc308f9f92	iommu/omap: Remove unused union fields There are couple of unions defined in the structures iotlb_entry and cr_regs. There are no usage/references to some of these union fields in the code, so clean them up and simplify the structures. Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:04:25 +02:00
Vineet Gupta	e13c42ecbe	ARCv2: Fix the peripheral address space detection With HS 2.1 release, the peripheral space register no longer contains the uncached space specifics, causing the kernel to panic early on. So read the newer NON VOLATILE AUX register to get that info. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-03 19:34:07 +05:30
Suman Anna	ad8e29a080	iommu/omap: Protect omap-iopgtable.h against double inclusion Protect the omap-pgtable.h header against double inclusion in source code by using the standard include guard mechanism. Signed-off-by: Suman Anna <s-anna@ti.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:03:52 +02:00
Suman Anna	69c2c19632	iommu/omap: Move debugfs functions to omap-iommu-debug.c The main OMAP IOMMU driver file has some helper functions used by the OMAP IOMMU debugfs functionality, and there is already a dedicated source file omap-iommu-debug.c dealing with these debugfs routines. Move all these functions to the omap-iommu-debug.c file, so that all the debugfs related routines are in one place. The move required exposing some new functions and moving some definitions to the internal omap-iommu.h header file. Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:03:50 +02:00
Suman Anna	0cdbf72716	iommu/omap: Remove all module references The OMAP IOMMU driver has been adapted to the IOMMU framework for a while now, and it does not support being built as a module anymore. So, remove all the module references from the OMAP IOMMU driver. While at it, also relocate a comment around the subsys_initcall to avoid a checkpatch strict warning about using a blank line after function/struct/union/enum declarations. Signed-off-by: Suman Anna <s-anna@ti.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:03:01 +02:00
Suman Anna	a73622a70a	Documentation: dt: Add #iommu-cells info to OMAP iommu bindings The OMAP IOMMU bindings is updated to reflect the required #iommu-cells property. Signed-off-by: Suman Anna <s-anna@ti.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:02:24 +02:00
James Cowgill	a4504755e7	MIPS: Replace add and sub instructions in relocate_kernel.S with addiu Fixes the assembler errors generated when compiling a MIPS R6 kernel with CONFIG_KEXEC on, by replacing the offending add and sub instructions with addiu instructions. Build errors: arch/mips/kernel/relocate_kernel.S: Assembler messages: arch/mips/kernel/relocate_kernel.S:27: Error: invalid operands `dadd $16,$16,8' arch/mips/kernel/relocate_kernel.S:64: Error: invalid operands `dadd $20,$20,8' arch/mips/kernel/relocate_kernel.S:65: Error: invalid operands `dadd $18,$18,8' arch/mips/kernel/relocate_kernel.S:66: Error: invalid operands `dsub $22,$22,1' scripts/Makefile.build:294: recipe for target 'arch/mips/kernel/relocate_kernel.o' failed Signed-off-by: James Cowgill <James.Cowgill@imgtec.com> Cc: <stable@vger.kernel.org> # 4.0+ Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/10558/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2015-08-03 15:26:30 +02:00
Axel Lin	4a9e5ca1a5	phy: Constify struct phy_ops variables The phy_ops variables are never modified after initialized in these drivers, so make them const. Signed-off-by: Axel Lin <axel.lin@ingics.com> Acked-by: Patrice Chotard <patrice.chotard@st.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>	2015-08-03 18:35:09 +05:30
Axel Lin	42ad8f6721	phy: ulpi_phy: Add const qualifier to ops The ops is never changed in ulpi_phy_create(), so make it const. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>	2015-08-03 18:35:09 +05:30
Hans de Goede	8083526e05	phy-sun4i-usb: Only check vbus-det on power-on on boards with vbus-det data->vbus_det is always 1 on boards without a (working) vbus-det, skip the vbus_det test on such boards. This fixes the sun4i usb phy code never turning on Vbus on such boards. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>	2015-08-03 18:35:08 +05:30
Hans de Goede	3d772c4adb	phy-sun4i-sub: Move vbus-detect helper functions up in the file Move vbus-detect helper functions up in the file, just moving some code around, no functional changes what so ever. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>	2015-08-03 18:35:08 +05:30
Vincent Abriou	29d1dc62e1	drm/sti: atomic crtc/plane update Better fit STI hardware structure. Planes are no more responsible of updating mixer information such as z-order and status. It is now up to the CRTC atomic flush to do it. Plane actions (enable or disable) are performed atomically. Disabling of a plane is synchronize with the vsync event. Signed-off-by: Vincent Abriou <vincent.abriou@st.com> Reviewed-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>	2015-08-03 14:26:05 +02:00
Vincent Abriou	9e1f05b280	drm/sti: rename files and functions replace all "sti_drm_" occurences by "sti_" Signed-off-by: Vincent Abriou <vincent.abriou@st.com> Reviewed-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>	2015-08-03 14:25:06 +02:00
Vincent Abriou	871bcdfea6	drm/sti: code clean up Purpose is to simplify the STI driver: - remove layer structure - consider video subdev as part of the compositor (like mixer subdev) - remove useless STI_VID0 and STI_VID1 enum Signed-off-by: Vincent Abriou <vincent.abriou@st.com> Reviewed-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>	2015-08-03 14:25:01 +02:00
Vincent Abriou	bf60b29f8e	drm/sti: fix dynamic z-ordering Apply the plane depth when the plane is updated. If the depth is different from the previous plane update, the register controlling the plane depth is cleaned and updated with the new depth. Signed-off-by: Vincent Abriou <vincent.abriou@st.com> Reviewed-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>	2015-08-03 14:24:54 +02:00
Benjamin Gaignard	53bdcf5f02	drm: sti: fix sub-components bind Fix misunderstanding in how use component framework. drm_platform_init() is now call only when all the sub-components are register themselves instead of the previous broken two stages mechanism. Update bindings documentation. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>	2015-08-03 14:24:44 +02:00
Robin Murphy	97942c2862	arm64: dma-mapping: Simplify pgprot handling Since __get_dma_pgprot() does The Right Thing(TM) in the non-coherent case, and the non-cacheable alias for DMA buffers is private to the kernel anyway, we can simplify things slightly and make the code more readable by just using PAGE_KERNEL as the base pgprot. Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2015-08-03 13:17:38 +01:00
Mark Rutland	514f161abc	MAINTAINERS: add PSCI entry Add myself and Lorenzo as maintainers of the PSCI client code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2015-08-03 12:35:42 +01:00
Mark Rutland	5211df00a4	drivers: psci: support native SMC{32,64} calls A 32-bit OS cannot make calls with SMC64 IDs, while a 64-bit OS must invoke some PSCI functions with SMC64 IDs. This patch introduces and makes use of a new macro to choose the appropriate IDs based on the register width of the OS, which will allow 32-bit callers to use the PSCI client code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2015-08-03 12:35:00 +01:00
Mark Rutland	bff60792f9	arm64: psci: factor invocation code to drivers To enable sharing with arm, move the core PSCI framework code to drivers/firmware. This results in a minor gain in lines of code, but this will quickly be amortised by the removal of code currently duplicated in arch/arm. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2015-08-03 12:33:39 +01:00
Dinh Nguyen	27e44646dc	reset: socfpga: Update reset-socfpga to read the altr,modrst-offset property In order for the Arria10 to be able to re-use the reset driver for SoCFPGA Cyclone5/Arria5, we need to read the 'altr,modrst-offset' property from the device tree entry. The 'altr,modrst-offset' property is the first register into the reset manager that is used for bringing peripherals out of reset. The driver assumes a modrst-offset of 0x10 in order to support legacy Cyclone5/Arria5 hardware. Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>	2015-08-03 13:13:59 +02:00
Joachim Eastwood	27dc2fb18f	doc: dt: add documentation for lpc1850-rgu reset driver Add device tree binding documentation for the Reset Generation Unit (RGU) found on NXP LPC18xx and LPC43xx devies. This documentation also includes a table which shows the RGU reset number and the connected peripheral. Signed-off-by: Joachim Eastwood <manabian@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>	2015-08-03 13:13:53 +02:00
Joachim Eastwood	c392b65ba8	reset: add driver for lpc18xx rgu Add reset driver for the Reset Generation Unit (RGU) found on NXP LPC18xx and LPC43xx devies. This reset controller features up to 64 reset lines connected to different blocks and peripheral in the SoC. Most reset lines on the controller are self clearing except for those dealing with the Cortex-M0 cores on LPC43xx devices. This driver also registers a restart handler that can be used to reset the entire device. Signed-off-by: Joachim Eastwood <manabian@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>	2015-08-03 13:13:51 +02:00
Fabian Frederick	a518db45c2	reset: sti: constify of_device_id array of_device_id is always used as const. (See driver.of_match_table and open firmware functions) Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Maxime Coquelin <maxime.coquelin@st.com> Acked-by: Patrice Chotard <patrice.chotard@st.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>	2015-08-03 13:13:50 +02:00
Philipp Zabel	efdf5aa8f1	ARM: STi: DT: Move reset controller constants into common location By popular vote, the DT binding includes for reset controllers are located in include/dt-bindings/reset/. Move the STi reset constants in there, too, to avoid confusion. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Patrice Chotard <patrice.chotard@st.com>	2015-08-03 13:13:44 +02:00
Philipp Zabel	b2f6dd7b94	MAINTAINERS: add include/dt-bindings/reset path to reset controller entry This is the path for reset definitions to be used in both device tree and reset controller drivers. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>	2015-08-03 13:13:01 +02:00
Yuyang Du	7ea241afbf	sched/fair: Clean up load average references For cfs_rq, we have load.weight, runnable_load_avg, and load_avg. Clean up how they are used: - First, as group sched_entity already largely uses load_avg, we now expand to use load_avg in all cases. - Second, for CPU-wide load balancing, we choose to use runnable_load_avg in all cases, which is the same as before this series. Signed-off-by: Yuyang Du <yuyang.du@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arjan@linux.intel.com Cc: bsegall@google.com Cc: dietmar.eggemann@arm.com Cc: fengguang.wu@intel.com Cc: len.brown@intel.com Cc: morten.rasmussen@arm.com Cc: pjt@google.com Cc: rafael.j.wysocki@intel.com Cc: umgwanakikbuti@gmail.com Cc: vincent.guittot@linaro.org Link: http://lkml.kernel.org/r/1436918682-4971-8-git-send-email-yuyang.du@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:24:32 +02:00
Yuyang Du	139622343e	sched/fair: Provide runnable_load_avg back to cfs_rq The cfs_rq's load_avg is composed of runnable_load_avg and blocked_load_avg. Before this series, sometimes the runnable_load_avg is used, and sometimes the load_avg is used. Completely replacing all uses of runnable_load_avg with load_avg may be too big a leap, i.e., the blocked_load_avg is concerned to result in overrated load. Therefore, we get runnable_load_avg back. The new cfs_rq's runnable_load_avg is improved to be updated with all of the runnable sched_eneities at the same time, so the one sched_entity updated and the others stale problem is solved. Signed-off-by: Yuyang Du <yuyang.du@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arjan@linux.intel.com Cc: bsegall@google.com Cc: dietmar.eggemann@arm.com Cc: fengguang.wu@intel.com Cc: len.brown@intel.com Cc: morten.rasmussen@arm.com Cc: pjt@google.com Cc: rafael.j.wysocki@intel.com Cc: umgwanakikbuti@gmail.com Cc: vincent.guittot@linaro.org Link: http://lkml.kernel.org/r/1436918682-4971-7-git-send-email-yuyang.du@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:24:31 +02:00
Yuyang Du	1269557889	sched/fair: Remove task and group entity load when they are dead When task exits or group is destroyed, the entity's load should be removed from its parent cfs_rq's load. Otherwise, it will take time for the parent cfs_rq to decay the dead entity's load to 0, which is not desired. Signed-off-by: Yuyang Du <yuyang.du@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arjan@linux.intel.com Cc: bsegall@google.com Cc: dietmar.eggemann@arm.com Cc: fengguang.wu@intel.com Cc: len.brown@intel.com Cc: morten.rasmussen@arm.com Cc: pjt@google.com Cc: rafael.j.wysocki@intel.com Cc: umgwanakikbuti@gmail.com Cc: vincent.guittot@linaro.org Link: http://lkml.kernel.org/r/1436918682-4971-6-git-send-email-yuyang.du@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:24:30 +02:00
Yuyang Du	540247fb5d	sched/fair: Init cfs_rq's sched_entity load average The runnable load and utilization averages of cfs_rq's sched_entity were not initiated. Like done to a task, give new cfs_rq' sched_entity start values to heavy its load in infant time. Signed-off-by: Yuyang Du <yuyang.du@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arjan@linux.intel.com Cc: bsegall@google.com Cc: dietmar.eggemann@arm.com Cc: fengguang.wu@intel.com Cc: len.brown@intel.com Cc: morten.rasmussen@arm.com Cc: pjt@google.com Cc: rafael.j.wysocki@intel.com Cc: umgwanakikbuti@gmail.com Cc: vincent.guittot@linaro.org Link: http://lkml.kernel.org/r/1436918682-4971-5-git-send-email-yuyang.du@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:24:29 +02:00
Vincent Guittot	6c1d47c082	sched/fair: Implement update_blocked_averages() for CONFIG_FAIR_GROUP_SCHED=n The load and the utilization of idle CPUs must be updated periodically in order to decay the blocked part. If CONFIG_FAIR_GROUP_SCHED is not set, the load and util of idle cpus are not decayed and stay at the values set before becoming idle. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Yuyang Du <yuyang.du@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arjan@linux.intel.com Cc: bsegall@google.com Cc: dietmar.eggemann@arm.com Cc: fengguang.wu@intel.com Cc: len.brown@intel.com Cc: morten.rasmussen@arm.com Cc: pjt@google.com Cc: rafael.j.wysocki@intel.com Cc: umgwanakikbuti@gmail.com Link: http://lkml.kernel.org/r/1436918682-4971-4-git-send-email-yuyang.du@intel.com [ Fixed up the SOB chain. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:24:28 +02:00
Yuyang Du	9d89c257df	sched/fair: Rewrite runnable load and utilization average tracking The idea of runnable load average (let runnable time contribute to weight) was proposed by Paul Turner and Ben Segall, and it is still followed by this rewrite. This rewrite aims to solve the following issues: 1. cfs_rq's load average (namely runnable_load_avg and blocked_load_avg) is updated at the granularity of an entity at a time, which results in the cfs_rq's load average is stale or partially updated: at any time, only one entity is up to date, all other entities are effectively lagging behind. This is undesirable. To illustrate, if we have n runnable entities in the cfs_rq, as time elapses, they certainly become outdated: t0: cfs_rq { e1_old, e2_old, ..., en_old } and when we update: t1: update e1, then we have cfs_rq { e1_new, e2_old, ..., en_old } t2: update e2, then we have cfs_rq { e1_old, e2_new, ..., en_old } ... We solve this by combining all runnable entities' load averages together in cfs_rq's avg, and update the cfs_rq's avg as a whole. This is based on the fact that if we regard the update as a function, then: w * update(e) = update(w * e) and update(e1) + update(e2) = update(e1 + e2), then w1 * update(e1) + w2 * update(e2) = update(w1 * e1 + w2 * e2) therefore, by this rewrite, we have an entirely updated cfs_rq at the time we update it: t1: update cfs_rq { e1_new, e2_new, ..., en_new } t2: update cfs_rq { e1_new, e2_new, ..., en_new } ... 2. cfs_rq's load average is different between top rq->cfs_rq and other task_group's per CPU cfs_rqs in whether or not blocked_load_average contributes to the load. The basic idea behind runnable load average (the same for utilization) is that the blocked state is taken into account as opposed to only accounting for the currently runnable state. Therefore, the average should include both the runnable/running and blocked load averages. This rewrite does that. In addition, we also combine runnable/running and blocked averages of all entities into the cfs_rq's average, and update it together at once. This is based on the fact that: update(runnable) + update(blocked) = update(runnable + blocked) This significantly reduces the code as we don't need to separately maintain/update runnable/running load and blocked load. 3. How task_group entities' share is calculated is complex and imprecise. We reduce the complexity in this rewrite to allow a very simple rule: the task_group's load_avg is aggregated from its per CPU cfs_rqs's load_avgs. Then group entity's weight is simply proportional to its own cfs_rq's load_avg / task_group's load_avg. To illustrate, if a task_group has { cfs_rq1, cfs_rq2, ..., cfs_rqn }, then, task_group_avg = cfs_rq1_avg + cfs_rq2_avg + ... + cfs_rqn_avg, then cfs_rqx's entity's share = cfs_rqx_avg / task_group_avg * task_group's share To sum up, this rewrite in principle is equivalent to the current one, but fixes the issues described above. Turns out, it significantly reduces the code complexity and hence increases clarity and efficiency. In addition, the new averages are more smooth/continuous (no spurious spikes and valleys) and updated more consistently and quickly to reflect the load dynamics. As a result, we have less load tracking overhead, better performance, and especially better power efficiency due to more balanced load. Signed-off-by: Yuyang Du <yuyang.du@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arjan@linux.intel.com Cc: bsegall@google.com Cc: dietmar.eggemann@arm.com Cc: fengguang.wu@intel.com Cc: len.brown@intel.com Cc: morten.rasmussen@arm.com Cc: pjt@google.com Cc: rafael.j.wysocki@intel.com Cc: umgwanakikbuti@gmail.com Cc: vincent.guittot@linaro.org Link: http://lkml.kernel.org/r/1436918682-4971-3-git-send-email-yuyang.du@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:29 +02:00
Yuyang Du	cd126afe83	sched/fair: Remove rq's runnable avg The current rq->avg is not used at all since its merge into the kernel, and the code is in the scheduler's hot path, so remove it. Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Yuyang Du <yuyang.du@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arjan@linux.intel.com Cc: bsegall@google.com Cc: fengguang.wu@intel.com Cc: len.brown@intel.com Cc: morten.rasmussen@arm.com Cc: pjt@google.com Cc: rafael.j.wysocki@intel.com Cc: umgwanakikbuti@gmail.com Cc: vincent.guittot@linaro.org Link: http://lkml.kernel.org/r/1436918682-4971-2-git-send-email-yuyang.du@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:28 +02:00
Oleg Nesterov	d308b9f1e4	stop_machine: Remove cpu_stop_work's from list in cpu_stop_park() cpu_stop_park() does cpu_stop_signal_done() but leaves the work on stopper->works. The owner of this work can free/reuse this memory right after that and corrupt the list, so if this CPU becomes online again cpu_stopper_thread() will crash. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Cc: der.herr@hofr.at Cc: paulmck@linux.vnet.ibm.com Cc: riel@redhat.com Cc: viro@ZenIV.linux.org.uk Link: http://lkml.kernel.org/r/20150630012958.GA23944@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:28 +02:00
Oleg Nesterov	9a301f22fa	stop_machine: Use 'cpu_stop_fn_t' where possible Cosmetic, but 'cpu_stop_fn_t' actually makes the code more readable and it doesn't break cscope. And most of the declarations already use it. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Cc: der.herr@hofr.at Cc: paulmck@linux.vnet.ibm.com Cc: riel@redhat.com Cc: viro@ZenIV.linux.org.uk Link: http://lkml.kernel.org/r/20150630012955.GA23937@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:27 +02:00
Oleg Nesterov	7eeb088e72	stop_machine: Unexport __stop_machine() The only caller outside of stop_machine.c is _cpu_down(), it can use stop_machine(). get_online_cpus() is fine under cpu_hotplug_begin(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Cc: der.herr@hofr.at Cc: paulmck@linux.vnet.ibm.com Cc: riel@redhat.com Cc: viro@ZenIV.linux.org.uk Link: http://lkml.kernel.org/r/20150630012951.GA23934@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:26 +02:00
Oleg Nesterov	b377c2a089	stop_machine: Don't do for_each_cpu() twice in queue_stop_cpus_work() queue_stop_cpus_work() can do everything in one for_each_cpu() loop. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Cc: der.herr@hofr.at Cc: paulmck@linux.vnet.ibm.com Cc: riel@redhat.com Cc: viro@ZenIV.linux.org.uk Link: http://lkml.kernel.org/r/20150630012948.GA23927@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:26 +02:00
Oleg Nesterov	02cb7aa923	stop_machine: Move 'cpu_stopper_task' and 'stop_cpus_work' into 'struct cpu_stopper' Multpiple DEFINE_PER_CPU's do not make sense, move all the per-cpu variables into 'struct cpu_stopper'. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Cc: der.herr@hofr.at Cc: paulmck@linux.vnet.ibm.com Cc: riel@redhat.com Cc: viro@ZenIV.linux.org.uk Link: http://lkml.kernel.org/r/20150630012944.GA23924@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:25 +02:00
Konstantin Khlebnikov	fe32d3cd5e	sched/preempt: Fix cond_resched_lock() and cond_resched_softirq() These functions check should_resched() before unlocking spinlock/bh-enable: preempt_count always non-zero => should_resched() always returns false. cond_resched_lock() worked iff spin_needbreak is set. This patch adds argument "preempt_offset" to should_resched(). preempt_count offset constants for that: PREEMPT_DISABLE_OFFSET - offset after preempt_disable() PREEMPT_LOCK_OFFSET - offset after spin_lock() SOFTIRQ_DISABLE_OFFSET - offset after local_bh_distable() SOFTIRQ_LOCK_OFFSET - offset after spin_lock_bh() Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Graf <agraf@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `bdb4380658` ("sched: Extract the basic add/sub preempt_count modifiers") Link: http://lkml.kernel.org/r/20150715095204.12246.98268.stgit@buzz Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:24 +02:00
Konstantin Khlebnikov	c56dadf397	sched/preempt, powerpc, kvm: Use need_resched() instead of should_resched() Function should_resched() is equal to (!preempt_count() && need_resched()). In preemptive kernel preempt_count here is non-zero because of vc->lock. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Graf <agraf@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150715095203.12246.72922.stgit@buzz Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:24 +02:00
Konstantin Khlebnikov	0fa2f5cb2b	sched/preempt, xen: Use need_resched() instead of should_resched() This code is used only when CONFIG_PREEMPT=n and only in non-atomic context: xen_in_preemptible_hcall is set only in privcmd_ioctl_hypercall(). Thus preempt_count is zero and should_resched() is equal to need_resched(). Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Graf <agraf@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150715095201.12246.49283.stgit@buzz Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:23 +02:00
Mike Galbraith	63b0e9edce	sched/fair: Beef up wake_wide() Josef Bacik reported that Facebook sees better performance with their 1:N load (1 dispatch/node, N workers/node) when carrying an old patch to try very hard to wake to an idle CPU. While looking at wake_wide(), I noticed that it doesn't pay attention to the wakeup of a many partner waker, returning 1 only when waking one of its many partners. Correct that, letting explicit domain flags override the heuristic. While at it, adjust task_struct bits, we don't need a 64-bit counter. Tested-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> [ Tidy things up. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-team<Kernel-team@fb.com> Cc: morten.rasmussen@arm.com Cc: riel@redhat.com Link: http://lkml.kernel.org/r/1436888390.7983.49.camel@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:23 +02:00

... 175 176 177 178 179 ...

547843 Commits