Commit Graph

22740 Commits

Author SHA1 Message Date
Steven Rostedt (Red Hat)
4fd3279b48 ftrace: Add more information to ftrace_bug() output
With the introduction of the dynamic trampolines, it is useful that if
things go wrong that ftrace_bug() produces more information about what
the current state is. This can help debug issues that may arise.

Ftrace has lots of checks to make sure that the state of the system it
touchs is exactly what it expects it to be. When it detects an abnormality
it calls ftrace_bug() and disables itself to prevent any further damage.
It is crucial that ftrace_bug() produces sufficient information that
can be used to debug the situation.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Borislav Petkov <bp@suse.de>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-11 12:42:13 -05:00
Steven Rostedt (Red Hat)
12cce594fa ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines
When the static ftrace_ops (like function tracer) enables tracing, and it
is the only callback that is referencing a function, a trampoline is
dynamically allocated to the function that calls the callback directly
instead of calling a loop function that iterates over all the registered
ftrace ops (if more than one ops is registered).

But when it comes to dynamically allocated ftrace_ops, where they may be
freed, on a CONFIG_PREEMPT kernel there's no way to know when it is safe
to free the trampoline. If a task was preempted while executing on the
trampoline, there's currently no way to know when it will be off that
trampoline.

But this is not true when it comes to !CONFIG_PREEMPT. The current method
of calling schedule_on_each_cpu() will force tasks off the trampoline,
becaues they can not schedule while on it (kernel preemption is not
configured). That means it is safe to free a dynamically allocated
ftrace ops trampoline when CONFIG_PREEMPT is not configured.

Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-11 12:41:52 -05:00
Borislav Petkov
6f9b63a0ae x86, CPU, AMD: Move K8 TLB flush filter workaround to K8 code
This belongs with the rest of the code in init_amd_k8() which gets
executed on family 0xf.

Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-11 17:58:20 +01:00
Borislav Petkov
585e4777be x86, espfix: Remove stale ptemask
Previous versions of the espfix had a single function which did setup
the pagetables. It was later split into BSP and AP version. Drop unused
leftovers after that split.

Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-11 17:57:46 +01:00
Ingo Molnar
0cafa3e714 Merge tag 'microcode_fixes_for_3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/urgent
Pull two fixes for early microcode loader on 32-bit from Borislav Petkov:

 - access the dis_ucode_ldr chicken bit properly
 - fix patch stashing on AMD on 32-bit

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-10 17:08:01 +01:00
Thierry Reding
4707a341b4 /dev/mem: Use more consistent data types
The xlate_dev_{kmem,mem}_ptr() functions take either a physical address
or a kernel virtual address, so data types should be phys_addr_t and
void *. They both return a kernel virtual address which is only ever
used in calls to copy_{from,to}_user(), so make variables that store it
void * rather than char * for consistency.

Also only define a weak unxlate_dev_mem_ptr() function if architectures
haven't overridden them in the asm/io.h header file.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2014-11-10 15:59:21 +01:00
Paolo Bonzini
ac146235d4 KVM: x86: fix warning on 32-bit compilation
PCIDs are only supported in 64-bit mode.  No need to clear bit 63
of CR3 unless the host is 64-bit.

Reported by Fengguang Wu's autobuilder.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-10 13:53:25 +01:00
Borislav Petkov
c0a717f23d x86, microcode, AMD: Fix ucode patch stashing on 32-bit
Save the patch while we're running on the BSP instead of later, before
the initrd has been jettisoned. More importantly, on 32-bit we need to
access the physical address instead of the virtual.

This way we actually do find it on the APs instead of having to go
through the initrd each time.

Tested-by: Richard Hendershot <rshendershot@mchsi.com>
Fixes: 5335ba5cf4 ("x86, microcode, AMD: Fix early ucode loading")
Cc: <stable@vger.kernel.org> # v3.13+
Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-10 13:50:55 +01:00
Boris Ostrovsky
54279552bd x86/core, x86/xen/smp: Use 'die_complete' completion when taking CPU down
Commit 2ed53c0d6c ("x86/smpboot: Speed up suspend/resume by
avoiding 100ms sleep for CPU offline during S3") introduced
completions to CPU offlining process. These completions are not
initialized on Xen kernels causing a panic in
play_dead_common().

Move handling of die_complete into common routines to make them
available to Xen guests.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Cc: tianyu.lan@intel.com
Cc: konrad.wilk@oracle.com
Cc: xen-devel@lists.xenproject.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1414770572-7950-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-10 11:16:40 +01:00
Andy Lutomirski
26893107aa x86_64/vsyscall: Restore orig_ax after vsyscall seccomp
The vsyscall emulation code sets orig_ax for seccomp's benefit,
but it forgot to set it back.

I'm not sure that this is observable at all, but it could cause
confusion to various /proc or ptrace users, and it's possible
that it could cause minor artifacts if a signal were to be
delivered on return from vsyscall emulation.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/cdc6a564517a4df09235572ee5f530ccdcf933f7.1415144089.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-10 10:46:35 +01:00
Andy Lutomirski
07114f0f1c x86_64: Add a comment explaining the TASK_SIZE_MAX guard page
That guard page is absolutely necessary; explain why for
posterity.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/23320cb5017c2da8475ec20fcde8089d82aa2699.1415144745.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-10 10:43:13 +01:00
David Matlack
ce1a5e60a6 kvm: x86: add trace event for pvclock updates
The new trace event records:
  * the id of vcpu being updated
  * the pvclock_vcpu_time_info struct being written to guest memory

This is useful for debugging pvclock bugs, such as the bug fixed by
"[PATCH] kvm: x86: Fix kvm clock versioning.".

Signed-off-by: David Matlack <dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-08 08:20:55 +01:00
Owen Hofmann
09a0c3f110 kvm: x86: Fix kvm clock versioning.
kvm updates the version number for the guest paravirt clock structure by
incrementing the version of its private copy. It does not read the guest
version, so will write version = 2 in the first update for every new VM,
including after restoring a saved state. If guest state is saved during
reading the clock, it could read and accept struct fields and guest TSC
from two different updates. This changes the code to increment the guest
version and write it back.

Signed-off-by: Owen Hofmann <osh@google.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-08 08:20:54 +01:00
Nadav Amit
ed9aad215f KVM: x86: MOVNTI emulation min opsize is not respected
Commit 3b32004a66 ("KVM: x86: movnti minimum op size of 32-bit is not kept")
did not fully fix the minimum operand size of MONTI emulation. Still, MOVNTI
may be mistakenly performed using 16-bit opsize.

This patch add No16 flag to mark an instruction does not support 16-bits
operand size.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-08 08:20:54 +01:00
Marcelo Tosatti
7f187922dd KVM: x86: update masterclock values on TSC writes
When the guest writes to the TSC, the masterclock TSC copy must be
updated as well along with the TSC_OFFSET update, otherwise a negative
tsc_timestamp is calculated at kvm_guest_time_update.

Once "if (!vcpus_matched && ka->use_master_clock)" is simplified to
"if (ka->use_master_clock)", the corresponding "if (!ka->use_master_clock)"
becomes redundant, so remove the do_request boolean and collapse
everything into a single condition.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-08 08:20:53 +01:00
Nadav Amit
b2c9d43e6c KVM: x86: Return UNHANDLABLE on unsupported SYSENTER
Now that KVM injects #UD on "unhandlable" error, it makes better sense to
return such error on sysenter instead of directly injecting #UD to the guest.
This allows to track more easily the unhandlable cases the emulator does not
support.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-08 08:20:52 +01:00
Nadav Amit
db324fe6f2 KVM: x86: Warn on APIC base relocation
APIC base relocation is unsupported by KVM. If anyone uses it, the least should
be to report a warning in the hypervisor.

Note that KVM-unit-tests uses this feature for some reason, so running the
tests triggers the warning.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-08 08:20:51 +01:00
Nadav Amit
d14cb5df59 KVM: x86: Emulator mis-decodes VEX instructions on real-mode
Commit 7fe864dc94 (KVM: x86: Mark VEX-prefix instructions emulation as
unimplemented, 2014-06-02) marked VEX instructions as such in protected
mode.  VEX-prefix instructions are not supported relevant on real-mode
and VM86, but should cause #UD instead of being decoded as LES/LDS.

Fix this behaviour to be consistent with real hardware.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
[Check for mod == 3, rather than 2 or 3. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-08 08:20:10 +01:00
Sudeep Holla
5aaba36318 cpumask: factor out show_cpumap into separate helper function
Many sysfs *_show function use cpu{list,mask}_scnprintf to copy cpumap
to the buffer aligned to PAGE_SIZE, append '\n' and '\0' to return null
terminated buffer with newline.

This patch creates a new helper function cpumap_print_to_pagebuf in
cpumask.h using newly added bitmap_print_to_pagebuf and consolidates
most of those sysfs functions using the new helper function.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Suggested-by: Stephen Boyd <sboyd@codeaurora.org>
Tested-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: x86@kernel.org
Cc: linux-acpi@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-07 11:45:00 -08:00
Nadav Amit
2c2ca2d12f KVM: x86: Remove redundant and incorrect cpl check on task-switch
Task-switch emulation checks the privilege level prior to performing the
task-switch.  This check is incorrect in the case of task-gates, in which the
tss.dpl is ignored, and can cause superfluous exceptions.  Moreover this check
is unnecassary, since the CPU checks the privilege levels prior to exiting.
Intel SDM 25.4.2 says "If CALL or JMP accesses a TSS descriptor directly
outside IA-32e mode, privilege levels are checked on the TSS descriptor" prior
to exiting.  AMD 15.14.1 says "The intercept is checked before the task switch
takes place but after the incoming TSS and task gate (if one was involved) have
been checked for correctness."

This patch removes the CPL checks for CALL and JMP.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:10 +01:00
Nadav Amit
9a9abf6b61 KVM: x86: Inject #GP when loading system segments with non-canonical base
When emulating LTR/LDTR/LGDT/LIDT, #GP should be injected if the base is
non-canonical. Otherwise, VM-entry will fail.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:09 +01:00
Nadav Amit
5b7f6a1e6f KVM: x86: Combine the lgdt and lidt emulation logic
LGDT and LIDT emulation logic is almost identical. Merge the logic into a
single point to avoid redundancy. This will be used by the next patch that
will ensure the bases of the loaded GDTR and IDTR are canonical.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:08 +01:00
Nadav Amit
38827dbd3f KVM: x86: Do not update EFLAGS on faulting emulation
If the emulation ends in fault, eflags should not be updated.  However, several
instruction emulations (actually all the fastops) currently update eflags, if
the fault was detected afterwards (e.g., #PF during writeback).

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:08 +01:00
Nadav Amit
9d88fca71a KVM: x86: MOV to CR3 can set bit 63
Although Intel SDM mentions bit 63 is reserved, MOV to CR3 can have bit 63 set.
As Intel SDM states in section 4.10.4 "Invalidation of TLBs and
Paging-Structure Caches": " MOV to CR3. ... If CR4.PCIDE = 1 and bit 63 of the
instruction’s source operand is 0 ..."

In other words, bit 63 is not reserved. KVM emulator currently consider bit 63
as reserved. Fix it.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:07 +01:00
Nadav Amit
0fcc207c66 KVM: x86: Emulate push sreg as done in Core
According to Intel SDM push of segment selectors is done in the following
manner: "if the operand size is 32-bits, either a zero-extended value is pushed
on the stack or the segment selector is written on the stack using a 16-bit
move. For the last case, all recent Core and Atom processors perform a 16-bit
move, leaving the upper portion of the stack location unmodified."

This patch modifies the behavior to match the core behavior.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:06 +01:00
Nadav Amit
5aca372236 KVM: x86: Wrong flags on CMPS and SCAS emulation
CMPS and SCAS instructions are evaluated in the wrong order.  For reference (of
CMPS), see http://www.fermimn.gov.it/linux/quarta/x86/cmps.htm : "Note that the
direction of subtraction for CMPS is [SI] - [DI] or [ESI] - [EDI]. The left
operand (SI or ESI) is the source and the right operand (DI or EDI) is the
destination. This is the reverse of the usual Intel convention in which the
left operand is the destination and the right operand is the source."

Introducing em_cmp_r for this matter that performs comparison in reverse order
using fastop infrastructure to avoid a wrapper function.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:06 +01:00
Nadav Amit
807c142595 KVM: x86: SYSCALL cannot clear eflags[1]
SYSCALL emulation currently clears in 64-bit mode eflags according to
MSR_SYSCALL_MASK.  However, on bare-metal eflags[1] which is fixed to one
cannot be cleared, even if MSR_SYSCALL_MASK masks the bit.  This wrong behavior
may result in failed VM-entry, as VT disallows entry with eflags[1] cleared.

This patch sets the bit after masking eflags on syscall.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:05 +01:00
Nadav Amit
b5bbf10ee6 KVM: x86: Emulation of MOV-sreg to memory uses incorrect size
In x86, you can only MOV-sreg to memory with either 16-bits or 64-bits size.
In contrast, KVM may write to 32-bits memory on MOV-sreg. This patch fixes KVM
behavior, and sets the destination operand size to two, if the destination is
memory.

When destination is registers, and the operand size is 32-bits, the high
16-bits in modern CPUs is filled with zero.  This is handled correctly.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:04 +01:00
Nadav Amit
82b32774c2 KVM: x86: Breakpoints do not consider CS.base
x86 debug registers hold a linear address. Therefore, breakpoints detection
should consider CS.base, and check whether instruction linear address equals
(CS.base + RIP). This patch introduces a function to evaluate RIP linear
address and uses it for breakpoints detection.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:04 +01:00
Nadav Amit
7305eb5d8c KVM: x86: Clear DR6[0:3] on #DB during handle_dr
DR6[0:3] (previous breakpoint indications) are cleared when #DB is injected
during handle_exception, just as real hardware does.  Similarily, handle_dr
should clear DR6[0:3].

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:03 +01:00
Nadav Amit
6d2a0526b0 KVM: x86: Emulator should set DR6 upon GD like real CPU
It should clear B0-B3 and set BD.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:02 +01:00
Nadav Amit
3ffb24681c KVM: x86: No error-code on real-mode exceptions
Real-mode exceptions do not deliver error code. As can be seen in Intel SDM
volume 2, real-mode exceptions do not have parentheses, which indicate
error-code.  To avoid significant changes of the code, the error code is
"removed" during exception queueing.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:02 +01:00
Nadav Amit
5b38ab877e KVM: x86: decode_modrm does not regard modrm correctly
In one occassion, decode_modrm uses the rm field after it is extended with
REX.B to determine the addressing mode. Doing so causes it not to read the
offset for rip-relative addressing with REX.B=1.

This patch moves the fetch where we already mask REX.B away instead.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:01 +01:00
Wei Wang
4114c27d45 KVM: x86: reset RVI upon system reset
A bug was reported as follows: when running Windows 7 32-bit guests on qemu-kvm,
sometimes the guests run into blue screen during reboot. The problem was that a
guest's RVI was not cleared when it rebooted. This patch has fixed the problem.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Tested-by: Rongrong Liu <rongrongx.liu@intel.com>, Da Chun <ngugc@qq.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:00 +01:00
Paolo Bonzini
a2ae9df7c9 kvm: x86: vmx: avoid returning bool to distinguish success from error
Return a negative error code instead, and WARN() when we should be covering
the entire 2-bit space of vmcs_field_type's return value.  For increased
robustness, add a BUILD_BUG_ON checking the range of vmcs_field_to_offset.

Suggested-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:44:00 +01:00
Tiejun Chen
34a1cd60d1 kvm: x86: vmx: move some vmx setting from vmx_init() to hardware_setup()
Instead of vmx_init(), actually it would make reasonable sense to do
anything specific to vmx hardware setting in vmx_x86_ops->hardware_setup().

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:43:59 +01:00
Tiejun Chen
f2c7648d91 kvm: x86: vmx: move down hardware_setup() and hardware_unsetup()
Just move this pair of functions down to make sure later we can
add something dependent on others.

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07 15:43:59 +01:00
Yijing Wang
38737d82f9 PCI/MSI: Add pci_msi_ignore_mask to prevent writes to MSI/MSI-X Mask Bits
MSI-X vector Mask Bits are in MSI-X Tables in PCI memory space.  Xen PV
guests can't write to those tables.  MSI vector Mask Bits are in PCI
configuration space.  Xen PV guests can write to config space, but those
writes are ignored.

Commit 0e4ccb1505 ("PCI: Add x86_msi.msi_mask_irq() and
msix_mask_irq()") added a way to override default_mask_msi_irqs() and
default_mask_msix_irqs() so they can be no-ops in Xen guests, but this is
more complicated than necessary.

Add "pci_msi_ignore_mask" in the core PCI MSI code.  If set,
default_mask_msi_irqs() and default_mask_msix_irqs() return without doing
anything.  This is less flexible, but much simpler.

[bhelgaas: changelog]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: xen-devel@lists.xenproject.org
2014-11-06 16:34:39 -07:00
Valentin Rothberg
304576a776 crypto: aesni - remove unnecessary #define
The CPP identifier 'HAS_PCBC' is defined when the Kconfig
option CRYPTO_PCBC is set as 'y' or 'm', and is further
used in two ifdef blocks to conditionally compile source
code. This indirection hides the actual Kconfig dependency
and complicates readability. Moreover, it's inconsistent
with the rest of the ifdef blocks in the file, which
directly reference Kconfig options.

This patch removes 'HAS_PCBC' and replaces its occurrences
with the actual dependency on 'CRYPTO_PCBC' being set as
'y' or 'm'.

Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-11-06 23:14:59 +08:00
Borislav Petkov
85be07c324 x86, microcode: Fix accessing dis_ucode_ldr on 32-bit
We should be accessing it through a pointer, like on the BSP.

Tested-by: Richard Hendershot <rshendershot@mchsi.com>
Fixes: 65cef1311d ("x86, microcode: Add a disable chicken bit")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-05 17:28:06 +01:00
Nadav Amit
d29b9d7ed7 KVM: x86: Fix uninitialized op->type for some immediate values
The emulator could reuse an op->type from a previous instruction for some
immediate values.  If it mistakenly considers the operands as memory
operands, it will performs a memory read and overwrite op->val.

Consider for instance the ROR instruction - src2 (the number of times)
would be read from memory instead of being used as immediate.

Mark every immediate operand as such to avoid this problem.

Cc: stable@vger.kernel.org
Fixes: c44b4c6ab8
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-05 12:36:58 +01:00
Jan Beulich
97b67ae559 x86-64: Use RIP-relative addressing for most per-CPU accesses
Observing that per-CPU data (in the SMP case) is reachable by
exploiting 64-bit address wraparound (building on the default kernel
load address being at 16Mb), the one byte shorter RIP-relative
addressing form can be used for most per-CPU accesses. The one
exception are the "stable" reads, where the use of the "P" operand
modifier prevents the compiler from using RIP-relative addressing, but
is unavoidable due to the use of the "p" constraint (side note: with
gcc 4.9.x the intended effect of this isn't being achieved anymore,
see gcc bug 63637).

With the dependency on the minimum kernel load address, arbitrarily
low values for CONFIG_PHYSICAL_START are now no longer possible. A
link time assertion is being added, directing to the need to increase
that value when it triggers.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/5458A1780200007800044A9D@mail.emea.novell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-04 20:43:14 +01:00
Jan Beulich
6d24c5f72d x86-64: Handle PC-relative relocations on per-CPU data
This is in preparation of using RIP-relative addressing in many of the
per-CPU accesses.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/5458A15A0200007800044A9A@mail.emea.novell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-04 20:43:14 +01:00
Jan Beulich
2c773dd31f x86: Convert a few more per-CPU items to read-mostly ones
Both this_cpu_off and cpu_info aren't getting modified post boot, yet
are being accessed on enough code paths that grouping them with other
frequently read items seems desirable. For cpu_info this at the same
time implies removing the cache line alignment (which afaict became
pointless when it got converted to per-CPU data years ago).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/54589BD20200007800044A84@mail.emea.novell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-04 20:13:28 +01:00
Daniel J Blueman
bdee237c03 x86: mm: Use 2GB memory block size on large-memory x86-64 systems
On large-memory x86-64 systems of 64GB or more with memory hot-plug
enabled, use a 2GB memory block size. Eg with 64GB memory, this reduces
the number of directories in /sys/devices/system/memory from 512 to 32,
making it more manageable, and reducing the creation time accordingly.

This caveat is that the memory can't be offlined (for hotplug or
otherwise) with the finer default 128MB granularity, but this is
unimportant due to the high memory densities generally used with such
large-memory systems, where eg a single DIMM is the order of 16GB.

Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Cc: Steffen Persvold <sp@numascale.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1415089784-28779-4-git-send-email-daniel@numascale.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-04 18:19:27 +01:00
Daniel J Blueman
b980dcf25d x86: numachip: APIC driver cleanups
Drop printing that serves no purpose, as it's printing fixed or known
values, and mark constant structure appropriately.

Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Cc: Steffen Persvold <sp@numascale.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1415089784-28779-3-git-send-email-daniel@numascale.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-04 18:17:27 +01:00
Daniel J Blueman
25e5a76bae x86: numachip: Elide self-IPI ICR polling
The default self-IPI path polls the ICR to delay sending the IPI until
there is no IPI in progress. This is redundant on x86-86 APICs, since
IPIs are queued. See the AMD64 Architecture Programmer's Manual, vol 2,
p525.

Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Cc: Steffen Persvold <sp@numascale.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1415089784-28779-2-git-send-email-daniel@numascale.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-04 18:17:27 +01:00
Daniel J Blueman
00e7977dd1 x86: numachip: Fix 16-bit APIC ID truncation
Prevent 16-bit APIC IDs being truncated by using correct mask. This fixes
booting large systems, where the wrong core would receive the startup and
init IPIs, causing hanging.

Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Cc: Steffen Persvold <sp@numascale.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1415089784-28779-1-git-send-email-daniel@numascale.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-04 18:17:27 +01:00
Peter Zijlstra (Intel)
ce5686d4ed perf/x86: Fix embarrasing typo
Because we're all human and typing sucks..

Fixes: 7fb0f1de49 ("perf/x86: Fix compile warnings for intel_uncore")
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/n/tip-be0bftjh8yfm4uvmvtf3yi87@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-04 07:06:58 +01:00
Greg Kroah-Hartman
a8a93c6f99 Merge branch 'platform/remove_owner' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux into driver-core-next
Remove all .owner fields from platform drivers
2014-11-03 19:53:56 -08:00