Commit Graph

26349 Commits

Author SHA1 Message Date
Juergen Gross
47ae4b05d0 virt, sched: Add generic vCPU pinning support
Add generic virtualization support for pinning the current vCPU to a
specified physical CPU. As this operation isn't performance critical
(a very limited set of operations like BIOS calls and SMIs is expected
to need this) just add a hypervisor specific indirection.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Douglas_Warzecha@dell.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akataria@vmware.com
Cc: boris.ostrovsky@oracle.com
Cc: chrisw@sous-sol.org
Cc: david.vrabel@citrix.com
Cc: hpa@zytor.com
Cc: jdelvare@suse.com
Cc: jeremy@goop.org
Cc: linux@roeck-us.net
Cc: pali.rohar@gmail.com
Cc: rusty@rustcorp.com.au
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1472453327-19050-3-git-send-email-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-05 13:52:38 +02:00
Jeffrey Hugo
d64934019f x86/efi: Use efi_exit_boot_services()
The eboot code directly calls ExitBootServices.  This is inadvisable as the
UEFI spec details a complex set of errors, race conditions, and API
interactions that the caller of ExitBootServices must get correct.  The
eboot code attempts allocations after calling ExitBootSerives which is
not permitted per the spec.  Call the efi_exit_boot_services() helper
intead, which handles the allocation scenario properly.

Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-05 12:40:16 +01:00
Jeffrey Hugo
dadb57abc3 efi/libstub: Allocate headspace in efi_get_memory_map()
efi_get_memory_map() allocates a buffer to store the memory map that it
retrieves.  This buffer may need to be reused by the client after
ExitBootServices() is called, at which point allocations are not longer
permitted.  To support this usecase, provide the allocated buffer size back
to the client, and allocate some additional headroom to account for any
reasonable growth in the map that is likely to happen between the call to
efi_get_memory_map() and the client reusing the buffer.

Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-05 12:18:17 +01:00
Stephane Eranian
24cf84672e perf/x86/intel/uncore: Handle non-standard counter offset
The offset of the counters for UPI and M2M boxes on Skylake server is
non-standard (8 bytes apart).

This patch introduces a custom flag UNCORE_BOX_FLAG_CTL_OFFS8 to
specially handle it.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1471378190-17276-2-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-05 13:15:08 +02:00
Kan Liang
68ce4a0dea perf/x86/intel/uncore: Remove hard-coded implementation for Node ID mapping location
The method to build PCI bus to socket mapping is similar among
platforms. However, the PCI location which stores Node ID mapping could
vary between different platforms. For example, the Node ID mapping address
on Skylake server is different from the previous platform. Also, to
build the mapping for the PCI bus without UBOX, it has to start from
bus 0 on Skylake server.

This patch removes the current hardcoded implementation and adds
three parameters for snbep_pci2phy_map_init(). This way the Node ID mapping
address and bus searching direction can be configured according to
different platforms.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nilay Vaish <nilayvaish@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1471378190-17276-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-05 13:15:07 +02:00
Fabian Frederick
395adae450 iommu/amd: Remove AMD_IOMMU_STATS
Commit e85e8f69ce ("iommu/amd: Remove statistics code")
removed that configuration.

Also remove function definition (suggested by Joerg Roedel)

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-09-05 12:42:27 +02:00
Ingo Molnar
2cc538412a Merge branch 'perf/urgent' into perf/core, to pick up fixed and resolve conflicts
Conflicts:
	kernel/events/core.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-05 12:09:59 +02:00
Tony Luck
ffb173e657 x86/mce: Drop X86_FEATURE_MCE_RECOVERY and the related model string test
We now have a better way to determine if we are running on a cpu that
supports machine check recovery. Free up this feature bit.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Boris Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/d5db39e08d46cf1012d94d3902275d08ba931926.1472754712.git.tony.luck@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-05 11:47:31 +02:00
Tony Luck
9a6fb28a35 x86/mce: Improve memcpy_mcsafe()
Use the mcsafe_key defined in the previous patch to make decisions on which
copy function to use. We can't use the FEATURE bit any more because PCI
quirks run too late to affect the patching of code. So we use a static key.

Turn memcpy_mcsafe() into an inline function to make life easier for
callers. The assembly code that actually does the copy is now named
memcpy_mcsafe_unrolled()

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Boris Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/bfde2fc774e94f53d91b70a4321c85a0d33e7118.1472754712.git.tony.luck@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-05 11:47:31 +02:00
Tony Luck
3637efb008 x86/mce: Add PCI quirks to identify Xeons with machine check recovery
Each Xeon includes a number of capability registers in PCI space that
describe some features not enumerated by CPUID.

Use these to determine that we are running on a model that can recover from
machine checks. Hooks for Ivybridge ... Skylake provided.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Boris Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/abf331dc4a3e2a2d17444129bc51127437bcf4ba.1472754711.git.tony.luck@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-05 11:47:31 +02:00
Borislav Petkov
cc2187a6e0 x86/microcode/AMD: Fix load of builtin microcode with randomized memory
We do not need to add the randomization offset when the microcode is
built in.

Reported-and-tested-by: Emanuel Czirai <icanrealizeum@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20160904093736.GA11939@pd.tnic
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-05 10:38:56 +02:00
Linus Torvalds
9ca581b50d Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
 "A single fix for an AMD erratum so machines without a BIOS fix work"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/AMD: Apply erratum 665 on machines without a BIOS fix
2016-09-04 08:45:41 -07:00
Emanuel Czirai
d199299675 x86/AMD: Apply erratum 665 on machines without a BIOS fix
AMD F12h machines have an erratum which can cause DIV/IDIV to behave
unpredictably. The workaround is to set MSRC001_1029[31] but sometimes
there is no BIOS update containing that workaround so let's do it
ourselves unconditionally. It is simple enough.

[ Borislav: Wrote commit message. ]

Signed-off-by: Emanuel Czirai <icanrealizeum@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Yaowu Xu <yaowu@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160902053550.18097-1-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-02 20:42:28 +02:00
Steven Rostedt
15301a5707 x86/paravirt: Do not trace _paravirt_ident_*() functions
Łukasz Daniluk reported that on a RHEL kernel that his machine would lock up
after enabling function tracer. I asked him to bisect the functions within
available_filter_functions, which he did and it came down to three:

  _paravirt_nop(), _paravirt_ident_32() and _paravirt_ident_64()

It was found that this is only an issue when noreplace-paravirt is added
to the kernel command line.

This means that those functions are most likely called within critical
sections of the funtion tracer, and must not be traced.

In newer kenels _paravirt_nop() is defined within gcc asm(), and is no
longer an issue.  But both _paravirt_ident_{32,64}() causes the
following splat when they are traced:

 mm/pgtable-generic.c:33: bad pmd ffff8800d2435150(0000000001d00054)
 mm/pgtable-generic.c:33: bad pmd ffff8800d3624190(0000000001d00070)
 mm/pgtable-generic.c:33: bad pmd ffff8800d36a5110(0000000001d00054)
 mm/pgtable-generic.c:33: bad pmd ffff880118eb1450(0000000001d00054)
 NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [systemd-journal:469]
 Modules linked in: e1000e
 CPU: 2 PID: 469 Comm: systemd-journal Not tainted 4.6.0-rc4-test+ #513
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
 task: ffff880118f740c0 ti: ffff8800d4aec000 task.ti: ffff8800d4aec000
 RIP: 0010:[<ffffffff81134148>]  [<ffffffff81134148>] queued_spin_lock_slowpath+0x118/0x1a0
 RSP: 0018:ffff8800d4aefb90  EFLAGS: 00000246
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011eb16d40
 RDX: ffffffff82485760 RSI: 000000001f288820 RDI: ffffea0000008030
 RBP: ffff8800d4aefb90 R08: 00000000000c0000 R09: 0000000000000000
 R10: ffffffff821c8e0e R11: 0000000000000000 R12: ffff880000200fb8
 R13: 00007f7a4e3f7000 R14: ffffea000303f600 R15: ffff8800d4b562e0
 FS:  00007f7a4e3d7840(0000) GS:ffff88011eb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f7a4e3f7000 CR3: 00000000d3e71000 CR4: 00000000001406e0
 Call Trace:
   _raw_spin_lock+0x27/0x30
   handle_pte_fault+0x13db/0x16b0
   handle_mm_fault+0x312/0x670
   __do_page_fault+0x1b1/0x4e0
   do_page_fault+0x22/0x30
   page_fault+0x28/0x30
   __vfs_read+0x28/0xe0
   vfs_read+0x86/0x130
   SyS_read+0x46/0xa0
   entry_SYSCALL_64_fastpath+0x1e/0xa8
 Code: 12 48 c1 ea 0c 83 e8 01 83 e2 30 48 98 48 81 c2 40 6d 01 00 48 03 14 c5 80 6a 5d 82 48 89 0a 8b 41 08 85 c0 75 09 f3 90 8b 41 08 <85> c0 74 f7 4c 8b 09 4d 85 c9 74 08 41 0f 18 09 eb 02 f3 90 8b

Reported-by: Łukasz Daniluk <lukasz.daniluk@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-02 09:40:47 -07:00
Arnd Bergmann
236dec0510 kconfig: tinyconfig: provide whole choice blocks to avoid warnings
Using "make tinyconfig" produces a couple of annoying warnings that show
up for build test machines all the time:

    .config:966:warning: override: NOHIGHMEM changes choice state
    .config:965:warning: override: SLOB changes choice state
    .config:963:warning: override: KERNEL_XZ changes choice state
    .config:962:warning: override: CC_OPTIMIZE_FOR_SIZE changes choice state
    .config:933:warning: override: SLOB changes choice state
    .config:930:warning: override: CC_OPTIMIZE_FOR_SIZE changes choice state
    .config:870:warning: override: SLOB changes choice state
    .config:868:warning: override: KERNEL_XZ changes choice state
    .config:867:warning: override: CC_OPTIMIZE_FOR_SIZE changes choice state

I've made a previous attempt at fixing them and we discussed a number of
alternatives.

I tried changing the Makefile to use "merge_config.sh -n
$(fragment-list)" but couldn't get that to work properly.

This is yet another approach, based on the observation that we do want
to see a warning for conflicting 'choice' options, and that we can
simply make them non-conflicting by listing all other options as
disabled.  This is a trivial patch that we can apply independent of
plans for other changes.

Link: http://lkml.kernel.org/r/20160829214952.1334674-2-arnd@arndb.de
Link: https://storage.kernelci.org/mainline/v4.7-rc6/x86-tinyconfig/build.log
https://patchwork.kernel.org/patch/9212749/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-01 17:52:01 -07:00
Thomas Gleixner
0cb7bf61b1 Merge branch 'linus' into smp/hotplug
Apply upstream changes to avoid conflicts with pending patches.
2016-09-01 18:33:46 +02:00
Bjorn Helgaas
6af7e4f772 PCI: Mark Haswell Power Control Unit as having non-compliant BARs
The Haswell Power Control Unit has a non-PCI register (CONFIG_TDP_NOMINAL)
where BAR 0 is supposed to be.  This is erratum HSE43 in the spec update
referenced below:

  The PCIe* Base Specification indicates that Configuration Space Headers
  have a base address register at offset 0x10.  Due to this erratum, the
  Power Control Unit's CONFIG_TDP_NOMINAL CSR (Bus 1; Device 30; Function
  3; Offset 0x10) is located where a base register is expected.

Mark the PCU as having non-compliant BARs so we don't try to probe any of
them.  There are no other BARs on this device.

Rename the quirk so it's not Broadwell-specific.

Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v3-spec-update.html
Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v3-datasheet-vol-2.html (section 5.4, Device 30 Function 3)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=153881
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Myron Stowe <myron.stowe@redhat.com>
2016-09-01 08:52:29 -05:00
Andy Shevchenko
3976b0380b x86/platform/intel-mid: Enable SD card detection on Merrifield
Intel Merrifield platform provides SD card interface. The interface allows user
to plug SD card to extend storage capacity.

Append the essential data to enable SD card detection on it.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160831135713.79066-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-01 08:22:42 +02:00
Andy Shevchenko
dd8d6ec672 x86/platform/intel-mid: Enable WiFi on Intel Edison
Intel Edison board provides built-in WiFi dongle based on Broadcom BCM43340.

Append the essential data to enable WiFi on Intel Edison.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160831135713.79066-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-01 08:22:42 +02:00
Josh Poimboeuf
0d025d271e mm/usercopy: get rid of CONFIG_DEBUG_STRICT_USER_COPY_CHECKS
There are three usercopy warnings which are currently being silenced for
gcc 4.6 and newer:

1) "copy_from_user() buffer size is too small" compile warning/error

   This is a static warning which happens when object size and copy size
   are both const, and copy size > object size.  I didn't see any false
   positives for this one.  So the function warning attribute seems to
   be working fine here.

   Note this scenario is always a bug and so I think it should be
   changed to *always* be an error, regardless of
   CONFIG_DEBUG_STRICT_USER_COPY_CHECKS.

2) "copy_from_user() buffer size is not provably correct" compile warning

   This is another static warning which happens when I enable
   __compiletime_object_size() for new compilers (and
   CONFIG_DEBUG_STRICT_USER_COPY_CHECKS).  It happens when object size
   is const, but copy size is *not*.  In this case there's no way to
   compare the two at build time, so it gives the warning.  (Note the
   warning is a byproduct of the fact that gcc has no way of knowing
   whether the overflow function will be called, so the call isn't dead
   code and the warning attribute is activated.)

   So this warning seems to only indicate "this is an unusual pattern,
   maybe you should check it out" rather than "this is a bug".

   I get 102(!) of these warnings with allyesconfig and the
   __compiletime_object_size() gcc check removed.  I don't know if there
   are any real bugs hiding in there, but from looking at a small
   sample, I didn't see any.  According to Kees, it does sometimes find
   real bugs.  But the false positive rate seems high.

3) "Buffer overflow detected" runtime warning

   This is a runtime warning where object size is const, and copy size >
   object size.

All three warnings (both static and runtime) were completely disabled
for gcc 4.6 with the following commit:

  2fb0815c9e ("gcc4: disable __compiletime_object_size for GCC 4.6+")

That commit mistakenly assumed that the false positives were caused by a
gcc bug in __compiletime_object_size().  But in fact,
__compiletime_object_size() seems to be working fine.  The false
positives were instead triggered by #2 above.  (Though I don't have an
explanation for why the warnings supposedly only started showing up in
gcc 4.6.)

So remove warning #2 to get rid of all the false positives, and re-enable
warnings #1 and #3 by reverting the above commit.

Furthermore, since #1 is a real bug which is detected at compile time,
upgrade it to always be an error.

Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer
needed.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-30 10:10:21 -07:00
Linus Torvalds
5d84ee7964 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
 "A single bugfix to prevent irq remapping when the ioapic is disabled"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Do not init irq remapping if ioapic is disabled
2016-08-28 10:00:21 -07:00
Linus Torvalds
af56ff27eb Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "ARM:
   - fixes for ITS init issues, error handling, IRQ leakage, race
     conditions
   - an erratum workaround for timers
   - some removal of misleading use of errors and comments
   - a fix for GICv3 on 32-bit guests

  MIPS:
   - fix for where the guest could wrongly map the first page of
     physical memory

  x86:
   - nested virtualization fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  MIPS: KVM: Check for pfn noslot case
  kvm: nVMX: fix nested tsc scaling
  KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write
  KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC
  arm64: KVM: report configured SRE value to 32-bit world
  arm64: KVM: remove misleading comment on pmu status
  KVM: arm/arm64: timer: Workaround misconfigured timer interrupt
  arm64: Document workaround for Cortex-A72 erratum #853709
  KVM: arm/arm64: Change misleading use of is_error_pfn
  KVM: arm64: ITS: avoid re-mapping LPIs
  KVM: arm64: check for ITS device on MSI injection
  KVM: arm64: ITS: move ITS registration into first VCPU run
  KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic
  KVM: arm64: vgic-its: Plug race in vgic_put_irq
  KVM: arm64: vgic-its: Handle errors from vgic_add_lpi
  KVM: arm64: ITS: return 1 on successful MSI injection
2016-08-27 15:51:50 -07:00
Linus Torvalds
5e608a0270 Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "11 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: silently skip readahead for DAX inodes
  dax: fix device-dax region base
  fs/seq_file: fix out-of-bounds read
  mm: memcontrol: avoid unused function warning
  mm: clarify COMPACTION Kconfig text
  treewide: replace config_enabled() with IS_ENABLED() (2nd round)
  printk: fix parsing of "brl=" option
  soft_dirty: fix soft_dirty during THP split
  sysctl: handle error writing UINT_MAX to u32 fields
  get_maintainer: quiet noisy implicit -f vcs_file_exists checking
  byteswap: don't use __builtin_bswap*() with sparse
2016-08-26 23:12:12 -07:00
Linus Torvalds
219c04cea3 Merge tag 'pci-v4.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
 "Resource management:
   - Update "pci=resource_alignment" documentation (Mathias Koehrer)

  MSI:
   - Use positive flags in pci_alloc_irq_vectors() (Christoph Hellwig)
   - Call pci_intx() when using legacy interrupts in pci_alloc_irq_vectors() (Christoph Hellwig)

  Intel VMD host bridge driver:
   - Fix infinite loop executing irq's (Keith Busch)"

* tag 'pci-v4.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  x86/PCI: VMD: Fix infinite loop executing irq's
  PCI: Call pci_intx() when using legacy interrupts in pci_alloc_irq_vectors()
  PCI: Use positive flags in pci_alloc_irq_vectors()
  PCI: Update "pci=resource_alignment" documentation
2016-08-26 18:26:07 -07:00
Masahiro Yamada
a5ff1b34e1 treewide: replace config_enabled() with IS_ENABLED() (2nd round)
Commit 97f2645f35 ("tree-wide: replace config_enabled() with
IS_ENABLED()") mostly killed config_enabled(), but some new users have
appeared for v4.8-rc1.  They are all used for a boolean option, so can
be replaced with IS_ENABLED() safely.

Link: http://lkml.kernel.org/r/1471970749-24867-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26 17:39:35 -07:00
Markus Elfring
4f0fbdf22e xen/grant-table: Use kmalloc_array() in arch_gnttab_valloc()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus reuse the corresponding function "kmalloc_array".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data type by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-26 10:44:22 +01:00
Linus Torvalds
fe2dd21282 Merge tag 'for-linus-4.8b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen regression fix from David Vrabel:
 "Fix a regression in the xenbus device preventing userspace tools from
  working"

* tag 'for-linus-4.8b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: change the type of xen_vcpu_id to uint32_t
  xenbus: don't look up transaction IDs for ordinary writes
2016-08-24 14:04:30 -04:00
Juergen Gross
0252937a87 xen: Make VPMU init message look less scary
The default for the Xen hypervisor is to not enable VPMU in order to
avoid security issues. In this case the Linux kernel will issue the
message "Could not initialize VPMU for cpu 0, error -95" which looks
more like an error than a normal state.

Change the message to something less scary in case the hypervisor
returns EOPNOTSUPP or ENOSYS when trying to activate VPMU.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-24 18:45:38 +01:00
Boris Ostrovsky
e1c105a9d7 hotplug: Prevent alloc/free of irq descriptors during cpu up/down (again)
Now that Xen no longer allocates irqs in _cpu_up() we can restore
commit a899418167 ("hotplug: Prevent alloc/free of irq descriptors
during cpu up/down")

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
CC: x86@kernel.org
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-24 18:44:46 +01:00
Boris Ostrovsky
5fc509bc2b xen/x86: Move irq allocation from Xen smp_op.cpu_up()
Commit ce0d3c0a6f ("genirq: Revert sparse irq locking around
__cpu_up() and move it to x86 for now") reverted irq locking
introduced by commit a899418167 ("hotplug: Prevent alloc/free
of irq descriptors during cpu up/down") because of Xen allocating
irqs in both of its cpu_up ops.

We can move those allocations into CPU notifiers so that original
patch can be reinstated.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-24 18:44:40 +01:00
Vitaly Kuznetsov
55467dea29 xen: change the type of xen_vcpu_id to uint32_t
We pass xen_vcpu_id mapping information to hypercalls which require
uint32_t type so it would be cleaner to have it as uint32_t. The
initializer to -1 can be dropped as we always do the mapping before using
it and we never check the 'not set' value anyway.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-24 18:17:27 +01:00
Brian Gerst
ffcb043ba5 sched/x86: Fix thread_saved_pc()
thread_saved_pc() was using a completely bogus method to get the return
address.  Since switch_to() was previously inlined, there was no sane way
to know where on the stack the return address was stored.  Now with the
frame of a sleeping thread well defined, this can be implemented correctly.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1471106302-10159-7-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:31:51 +02:00
Brian Gerst
616d24835e sched/x86: Pass kernel thread parameters in 'struct fork_frame'
Instead of setting up a fake pt_regs context, put the kernel thread
function pointer and arg into the unused callee-restored registers
of 'struct fork_frame'.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1471106302-10159-6-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:31:50 +02:00
Brian Gerst
0100301bfd sched/x86: Rewrite the switch_to() code
Move the low-level context switch code to an out-of-line asm stub instead of
using complex inline asm.  This allows constructing a new stack frame for the
child process to make it seamlessly flow to ret_from_fork without an extra
test and branch in __switch_to().  It also improves code generation for
__schedule() by using the C calling convention instead of clobbering all
registers.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1471106302-10159-5-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:31:41 +02:00
Brian Gerst
7b32aeadbc sched/x86: Add 'struct inactive_task_frame' to better document the sleeping task stack frame
Add 'struct inactive_task_frame', which defines the layout of the stack for
a sleeping process.  For now, the only defined field is the BP register
(frame pointer).

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1471106302-10159-4-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:27:41 +02:00
Brian Gerst
163630191e sched/x86/64, kgdb: Clear GDB_PS on 64-bit
switch_to() no longer saves EFLAGS, so it's bogus to look for it on the
stack.  Set it to zero like 32-bit.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1471106302-10159-3-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:27:40 +02:00
Brian Gerst
4e047aa7f2 sched/x86/32, kgdb: Don't use thread.ip in sleeping_thread_to_gdb_regs()
Match 64-bit and set gdb_regs[GDB_PC] to zero.  thread.ip is always the
same point in the scheduler (except for newly forked processes), and will
be removed in a future patch.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1471106302-10159-2-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:27:40 +02:00
Josh Poimboeuf
13e25bab7e x86/dumpstack/ftrace: Don't print unreliable addresses in print_context_stack_bp()
When function graph tracing is enabled, print_context_stack_bp() can
report return_to_handler() as an unreliable address, which is confusing
and misleading: return_to_handler() is really only useful as a hint for
debugging, whereas print_context_stack_bp() users only care about the
actual 'reliable' call path.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c51aef578d8027791b38d2ad9bac0c7f499fde91.1471607358.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:15:15 +02:00
Josh Poimboeuf
6f727b84e2 x86/dumpstack/ftrace: Mark function graph handler function as unreliable
When function graph tracing is enabled for a function, its return
address on the stack is replaced with the address of an ftrace handler
(return_to_handler).

Currently 'return_to_handler' can be reported as reliable.  That's not
ideal, and can actually be misleading.  When saving or dumping the
stack, you normally only care about what led up to that point (the call
path), rather than what will happen in the future (the return path).

That's especially true in the non-oops stack trace case, which isn't
used for debugging.  For example, in a perf profiling operation,
reporting return_to_handler() in the trace would just be confusing.

And in the oops case, where debugging is important, "unreliable" is also
more appropriate there because it serves as a hint that graph tracing
was involved, instead of trying to imply that return_to_handler() was
the real caller.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/f8af15749c7d632d3e7f815995831d5b7f82950d.1471607358.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:15:15 +02:00
Josh Poimboeuf
471bd10f5e ftrace/x86: Implement HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
Use the more reliable version of ftrace_graph_ret_addr() so we no longer
have to worry about the unwinder getting out of sync with the function
graph ret_stack index, which can happen if the unwinder skips any frames
before calling ftrace_graph_ret_addr().

This fixes this issue (and several others like it):

  $ cat /proc/self/stack
  [<ffffffff810489a2>] save_stack_trace_tsk+0x22/0x40
  [<ffffffff81311a89>] proc_pid_stack+0xb9/0x110
  [<ffffffff813127c4>] proc_single_show+0x54/0x80
  [<ffffffff812be088>] seq_read+0x108/0x3e0
  [<ffffffff812923d7>] __vfs_read+0x37/0x140
  [<ffffffff812929d9>] vfs_read+0x99/0x140
  [<ffffffff81293f28>] SyS_read+0x58/0xc0
  [<ffffffff818af97c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
  [<ffffffffffffffff>] 0xffffffffffffffff

  $ echo function_graph > /sys/kernel/debug/tracing/current_tracer

  $ cat /proc/self/stack
  [<ffffffff818b2428>] return_to_handler+0x0/0x27
  [<ffffffff810394cc>] print_context_stack+0xfc/0x100
  [<ffffffff818b2428>] return_to_handler+0x0/0x27
  [<ffffffff8103891b>] dump_trace+0x12b/0x350
  [<ffffffff818b2428>] return_to_handler+0x0/0x27
  [<ffffffff810489a2>] save_stack_trace_tsk+0x22/0x40
  [<ffffffff818b2428>] return_to_handler+0x0/0x27
  [<ffffffff81311a89>] proc_pid_stack+0xb9/0x110
  [<ffffffff818b2428>] return_to_handler+0x0/0x27
  [<ffffffff813127c4>] proc_single_show+0x54/0x80
  [<ffffffff818b2428>] return_to_handler+0x0/0x27
  [<ffffffff812be088>] seq_read+0x108/0x3e0
  [<ffffffff818b2428>] return_to_handler+0x0/0x27
  [<ffffffff812923d7>] __vfs_read+0x37/0x140
  [<ffffffff818b2428>] return_to_handler+0x0/0x27
  [<ffffffff812929d9>] vfs_read+0x99/0x140
  [<ffffffffffffffff>] 0xffffffffffffffff

Enabling function graph tracing causes the stack trace to change in two
ways:

First, the real call addresses are confusingly interspersed with
'return_to_handler' addresses.  This issue will be fixed by the next
patch.

Second, the stack trace is offset by two frames, because the unwinder
skipped the first two frames and got out of sync with the ret_stack
index.  This patch fixes this issue.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a6d623e36f8d08f9a17bd74d804d201177a23afd.1471607358.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:15:15 +02:00
Josh Poimboeuf
408fe5de2f x86/dumpstack/ftrace: Convert dump_trace() callbacks to use ftrace_graph_ret_addr()
Convert print_context_stack() and print_context_stack_bp() to use the
arch-independent ftrace_graph_ret_addr() helper.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/56ec97cafc1bf2e34d1119e6443d897db406da86.1471607358.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:15:14 +02:00
Josh Poimboeuf
9a7c348ba6 ftrace: Add return address pointer to ftrace_ret_stack
Storing this value will help prevent unwinders from getting out of sync
with the function graph tracer ret_stack.  Now instead of needing a
stateful iterator, they can compare the return address pointer to find
the right ret_stack entry.

Note that an array of 50 ftrace_ret_stack structs is allocated for every
task.  So when an arch implements this, it will add either 200 or 400
bytes of memory usage per task (depending on whether it's a 32-bit or
64-bit platform).

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a95cfcc39e8f26b89a430c56926af0bb217bc0a1.1471607358.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:15:14 +02:00
Josh Poimboeuf
e4a744ef2f ftrace: Remove CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST from config
Make HAVE_FUNCTION_GRAPH_FP_TEST a normal define, independent from
kconfig.  This removes some config file pollution and simplifies the
checking for the fp test.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2c4e5f05054d6d367f702fd153af7a0109dd5c81.1471607358.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:15:13 +02:00
Andy Lutomirski
e37e43a497 x86/mm/64: Enable vmapped stacks (CONFIG_HAVE_ARCH_VMAP_STACK=y)
This allows x86_64 kernels to enable vmapped stacks by setting
HAVE_ARCH_VMAP_STACK=y - which enables the CONFIG_VMAP_STACK=y
high level Kconfig option.

There are a couple of interesting bits:

First, x86 lazily faults in top-level paging entries for the vmalloc
area.  This won't work if we get a page fault while trying to access
the stack: the CPU will promote it to a double-fault and we'll die.
To avoid this problem, probe the new stack when switching stacks and
forcibly populate the pgd entry for the stack when switching mms.

Second, once we have guard pages around the stack, we'll want to
detect and handle stack overflow.

I didn't enable it on x86_32.  We'd need to rework the double-fault
code a bit and I'm concerned about running out of vmalloc virtual
addresses under some workloads.

This patch, by itself, will behave somewhat erratically when the
stack overflows while RSP is still more than a few tens of bytes
above the bottom of the stack.  Specifically, we'll get #PF and make
it to no_context and them oops without reliably triggering a
double-fault, and no_context doesn't know about stack overflows.
The next patch will improve that case.

Thank you to Nadav and Brian for helping me pay enough attention to
the SDM to hopefully get this right.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c88f3e2920b18e6cc621d772a04a62c06869037e.1470907718.git.luto@kernel.org
[ Minor edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:11:42 +02:00
Ingo Molnar
eb4e841099 Merge tag 'v4.8-rc3' into x86/asm, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:11:29 +02:00
Wei Jiangang
5035da4199 x86/apic: Update comment about disabling processor focus
Fix references to discarded end_level_ioapic_irq().

Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@suse.de
Link: http://lkml.kernel.org/r/1471576957-12961-2-git-send-email-weijg.fnst@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 11:24:33 +02:00
Wei Jiangang
384d9fe374 x86/smpboot: Check APIC ID before setting up default routing
This is not a bugfix, but code optimization.

If the BSP's APIC ID in local APIC is unexpected,
a kernel panic will occur and the system will halt.
That means no need to enable APIC mode, and no reason
to set up the default routing for APIC.

The combination of default_setup_apic_routing() and
apic_bsp_setup() are used to enable APIC mode.
They two should be kept together, rather than being
separated by the codes of checking APIC ID.
Just like their usage in APIC_init_uniprocessor().

Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@suse.de
Link: http://lkml.kernel.org/r/1471576957-12961-1-git-send-email-weijg.fnst@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 11:24:33 +02:00
Borislav Petkov
556b672368 x86/entry: Remove outdated comment about SYSCALL targets
The comment probably meant some old AMD64 incarnation which most likely
never saw the light of day. STAR and LSTAR are two different registers
and STAR sets CS/SS(DS) selectors for *all* modes, not only 32-bit.

So simply remove that comment.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160823172356.15879-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 11:20:31 +02:00
Wanpeng Li
2e63ad4bd5 x86/apic: Do not init irq remapping if ioapic is disabled
native_smp_prepare_cpus
  -> default_setup_apic_routing
    -> enable_IR_x2apic
      -> irq_remapping_prepare
        -> intel_prepare_irq_remapping
          -> intel_setup_irq_remapping		  

So IR table is setup even if "noapic" boot parameter is added. As a result we
crash later when the interrupt affinity is set due to a half initialized
remapping infrastructure.

Prevent remap initialization when IOAPIC is disabled.

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Link: http://lkml.kernel.org/r/1471954039-3942-1-git-send-email-wanpeng.li@hotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-08-24 09:45:40 +02:00
Keith Busch
21c80c9fef x86/PCI: VMD: Fix infinite loop executing irq's
We can't initialize the list head on deletion as this causes the node to
point to itself, which causes an infinite loop if vmd_irq() happens to be
servicing that node.

The list initialization was trying to fix a bug from multiple calls to
disable the same IRQ.  Fix this instead by having the VMD driver track if
the interrupt is enabled.

[bhelgaas: changelog, add "Fixes"]
Fixes: 97e9230635 ("x86/PCI: VMD: Initialize list item in IRQ disable")
Reported-by: Grzegorz Koczot <grzegorz.koczot@intel.com>
Tested-by: Miroslaw Drost <miroslaw.drost@intel.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by Jon Derrick: <jonathan.derrick@intel.com>
2016-08-23 16:36:42 -05:00