Commit Graph

518 Commits

Author SHA1 Message Date
Benjamin Herrenschmidt
bf7588a085 powerpc/powernv: Fix endian bug in LPC bus debugfs accessors
When reading from the LPC, the OPAL FW calls return the value via pointer
to a uint32_t which is always returned big endian. Our internal inb/outb
implementation byteswaps that fine but our debugfs code is still broken.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-07 16:01:18 +11:00
Anton Blanchard
9a4f5cd0a5 powerpc: Add printk levels to powernv platform code
Add printk levels to powernv platform code, and convert to
pr_err() etc while here.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:33:55 +10:00
Alexander Gordeev
6b2fd7efeb PCI/MSI/PPC: Remove arch_msi_check_device()
Move MSI checks from arch_msi_check_device() to arch_setup_msi_irqs().
This makes the code more compact and allows removing
arch_msi_check_device() from generic MSI code.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-01 12:21:14 -06:00
Gavin Shan
fe7e85c6f5 powerpc/powernv: Override dma_get_required_mask()
The dma_get_required_mask() function is used by some drivers to
query the platform about what DMA mask is needed to cover all of
memory. This is a bit of a strange semantic when we have to choose
between IOMMU translation or bypass, but essentially what it means
is "what DMA mask will give best performances".

Currently, our IOMMU backend always returns a 32-bit mask here, we
don't do anything special to it when we have bypass available. This
causes some drivers to choose a 32-bit mask, thus losing the ability
to use the bypass window, thinking this is more efficient. The problem
was reported from the driver of following device:

0004:03:00.0 0107: 1000:0087 (rev 05)
0004:03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios \
             Logic SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)

This patch adds an override of that function in order to, instead,
return a 64-bit mask whenever a bypass window is available in order
for drivers to prefer this configuration.

Reported-by: Murali N. Iyer <mniyer@us.ibm.com>
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:20 +10:00
Gavin Shan
372fb80db9 powerpc/powernv: Fetch frozen PE on top level
It should have been part of commit 1ad7a72c5 ("powerpc/eeh: Report
frozen parent PE prior to child PE"). There are 2 ways to report
EEH errors: proactively polling because of 0xFF's returned from
PCI config or IO read, or interrupt driven event. We missed to
report and handle parent frozen PE prior to child frozen PE for
the later case on PowerNV platform.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:20 +10:00
Gavin Shan
d1a85eee35 powerpc/powernv: Sync OpalPciResetScope with firmware
The names of PCI reset scopes aren't sychronized with firmware.
The patch fixes it.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:17 +10:00
Gavin Shan
d9df1b5da1 powerpc/powernv: Clear PAPR error injection registers
The frozen state on one specific PE is probably caused by error
injection, which is done with help of PAPR error injection registers.
According to the hardware spec, those registers should be cleared
automatically after one-shot frozen PE. However, that's not always
true, at least on P7IOC of Firebird-L. So we have to clear them
before doing PE reset to avoid recursive EEH errors at recovery
stage.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:11 +10:00
Mike Qiu
7a06278229 powerpc/powernv: Add PCI error injection debugfs entry
The patch adds debugfs file (/sys/kernel/debug/powerpc/PCIxxxx/
err_injct), which accepts following formated string, to support
error injection. It will be used to support userland utility
"errinjct" in future.

  "pe_no:0:function:address:mask" - 32-bits PCI errors
  "pe_no:1:function:address:mask" - 64-bits PCI errors

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:10 +10:00
Gavin Shan
131c123abe powerpc/eeh: Introduce eeh_ops::err_inject
The patch introduces eeh_ops::err_inject(), which allows to inject
specified errors to indicated PE for testing purpose. The functionality
isn't support on pSeries platform. On PowerNV, the functionality
relies on OPAL API opal_pci_err_inject().

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:10 +10:00
Gavin Shan
5b64234081 powerpc/powernv: Sync header with firmware
The patch synchronizes firmware header file (opal.h) for PCI error
injection.

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:09 +10:00
Gavin Shan
0d5ee5205e powerpc/eeh: Freeze PE before PE reset
The patch adds one more option (EEH_OPT_FREEZE_PE) to set_option()
method to proactively freeze PE, which will be issued before resetting
pass-throughed PE to drop MMIO access during reset because it's
always contributing to recursive EEH error.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:07 +10:00
Joe Perches
6d31c2fa0e powerpc: pci-ioda: Use a single function to emit logging messages
No need for 3 functions when a single one will do.

Modify the function declaring macros to call the single function.

Reduces object code size a little:

$ size arch/powerpc/platforms/powernv/pci-ioda.o*
   text	   data	    bss	    dec	    hex	filename
  22303	   1073	   6680	  30056	   7568	arch/powerpc/platforms/powernv/pci-ioda.o.new
  22840	   1121	   6776	  30737	   7811	arch/powerpc/platforms/powernv/pci-ioda.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 23:14:57 +10:00
Joe Perches
45eb47242d powerpc: pci-ioda: Remove unnecessary return value from printk
The return value is unnecessary and unused, so make the functions
void instead of int.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 23:14:56 +10:00
Paul Mackerras
d6a4f70909 powerpc/powernv: Don't call generic code on offline cpus
On PowerNV platforms, when a CPU is offline, we put it into nap mode.
It's possible that the CPU wakes up from nap mode while it is still
offline due to a stray IPI.  A misdirected device interrupt could also
potentially cause it to wake up.  In that circumstance, we need to clear
the interrupt so that the CPU can go back to nap mode.

In the past the clearing of the interrupt was accomplished by briefly
enabling interrupts and allowing the normal interrupt handling code
(do_IRQ() etc.) to handle the interrupt.  This has the problem that
this code calls irq_enter() and irq_exit(), which call functions such
as account_system_vtime() which use RCU internally.  Use of RCU is not
permitted on offline CPUs and will trigger errors if RCU checking is
enabled.

To avoid calling into any generic code which might use RCU, we adopt
a different method of clearing interrupts on offline CPUs.  Since we
are on the PowerNV platform, we know that the system interrupt
controller is a XICS being driven directly (i.e. not via hcalls) by
the kernel.  Hence this adds a new icp_native_flush_interrupt()
function to the native-mode XICS driver and arranges to call that
when an offline CPU is woken from nap.  This new function reads the
interrupt from the XICS.  If it is an IPI, it clears the IPI; if it
is a device interrupt, it prints a warning and disables the source.
Then it does the end-of-interrupt processing for the interrupt.

The other thing that briefly enabling interrupts did was to check and
clear the irq_happened flag in this CPU's PACA.  Therefore, after
flushing the interrupt from the XICS, we also clear all bits except
the PACA_IRQ_HARD_DIS (interrupts are hard disabled) bit from the
irq_happened flag.  The PACA_IRQ_HARD_DIS flag is set by power7_nap()
and is left set to indicate that interrupts are hard disabled.  This
means we then have to ignore that flag in power7_nap(), which is
reasonable since it doesn't indicate that any interrupt event needs
servicing.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 23:14:50 +10:00
Zhouyi Zhou
d4fe0965e2 powerpc/jump_label: use HAVE_JUMP_LABEL?
CONFIG_JUMP_LABEL doesn't ensure HAVE_JUMP_LABEL, if it
is not the case use maintainers's own mutex to guard
the modification of global values.

Signed-off-by: Zhouyi Zhou <yizhouzhou@ict.ac.cn>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 23:14:45 +10:00
Anton Blanchard
1217d34b53 powerpc: Ensure global functions include their prototype
Fix a number of places where global functions were not including
their prototype. This ensures the prototype and the function match.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 23:14:42 +10:00
Anton Blanchard
e51df2c170 powerpc: Make a bunch of things static
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 23:14:41 +10:00
Michael Neuling
831cf65b02 powerpc/powernv: Check OPAL dump calls exist before using
Check that the OPAL_DUMP_READ token exists before initalising the elog
infrastructure.

This avoids littering the OPAL console with:
  "OPAL: Called with bad token 91"

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 23:14:36 +10:00
Michael Neuling
7dc992ec7b powerpc/powernv: Check OPAL elog calls exist before using
Check that the OPAL_ELOG_READ token exists before initalising the elog
infrastructure.

This avoids littering the OPAL console with:
  "OPAL: Called with bad token 74"

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 23:14:36 +10:00
Michael Neuling
035ed26fb0 powerpc/powernv: Check OPAL RTC calls exists before using
Check that the OPAL_RTC_READ token exists before we use the OPAL RTC.

Refactors the code a little to merge error paths.

This avoids littering the OPAL console with:
  "OPAL: Called with bad token 3".

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 23:14:35 +10:00
Michael Neuling
bffe6bda34 powerpc/powernv: Add OPAL check token call
Currently there is no way to generically check if an OPAL call exists or not
from the host kernel.

This adds an OPAL call opal_check_token() which tells you if the given token is
present in OPAL or not.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 23:14:35 +10:00
Vasant Hegde
cdd91b89ad powerpc/powernv: Improve error messages in dump code
Presently we only support initiating Service Processor dump from host.
Hence update sysfs message. Also update couple of other error/info
messages.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 23:14:34 +10:00
Li Zhong
c1c8a92f70 powerpc: use machine_subsys_initcall() for opal_hmi_handler_init()
As opal_message_init() uses machine_early_initcall(powernv, ), and
opal_hmi_handler_init() depends on that early initcall, so it also needs
use machine_* to check the machine_id.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
2014-09-09 19:02:46 +10:00
Masanari Iida
1a84db567a treewide: fix errors in printk
This patch fix spelling typo in printk.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-09-01 11:18:25 +02:00
Linus Torvalds
1d508f8ace Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull more powerpc updates from Ben Herrenschmidt:
 "Here are some more powerpc bits for 3.17, essentially fixes.

  The biggest series, also aimed at -stable, is from Aneesh and is the
  result of weeks and weeks of debugging to find out why the heck or THP
  implementation was occasionally triggering multi-hit errors in our
  level 1 TLB.  It ended up being a combination of issues including
  subtleties as to how we should invalidate those special 'MPSS' pages
  we use to allow the use of 16M pages inside 4K/64K "base page size"
  segments (you really have to love our MMU !)

  Another interesting one in the "OMG" category is the series from
  Michael adding memory barriers to spin_is_locked().  That's also the
  result of many days of debugging to figure out why the semaphore code
  would occasionally crash in ways that made no sense.  It ended up
  being some creative lock stacking that was defeated by the fact that
  our locks allow a load inside the locked section to be re-ordered with
  the load of the lock value itself (I'm still of two mind about whether
  to kill that once and for all by putting a heavier barrier back into
  our lock implementation...).  The fixes come with a long explanation
  in the cset comments, feel free to read it if you feel like having a
  headache today"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits)
  powerpc/thp: Add tracepoints to track hugepage invalidate
  powerpc/mm: Use read barrier when creating real_pte
  powerpc/thp: Use ACCESS_ONCE when loading pmdp
  powerpc/thp: Invalidate with vpn in loop
  powerpc/thp: Handle combo pages in invalidate
  powerpc/thp: Invalidate old 64K based hash page mapping before insert of 4k pte
  powerpc/thp: Don't recompute vsid and ssize in loop on invalidate
  powerpc/thp: Add write barrier after updating the valid bit
  powerpc: reorder per-cpu NUMA information's initialization
  powerpc/perf/hv-24x7: Use kmem_cache_free
  powerpc/pseries/hvcserver: Fix endian issue in hvcs_get_partner_info
  powerpc: Hard disable interrupts in xmon
  powerpc: remove duplicate definition of TEXASR_FS
  powerpc/pseries: Avoid deadlock on removing ddw
  powerpc/pseries: Failure on removing device node
  powerpc/boot: Use correct zlib types for comparison
  powerpc/powernv: Interface to register/unregister opal dump region
  printk: Add function to return log buffer address and size
  powerpc: Add POWER8 features to CPU_FTRS_POSSIBLE/ALWAYS
  powerpc/ppc476: Disable BTAC
  ...
2014-08-14 10:14:07 -06:00
Vasant Hegde
b09c2ec408 powerpc/powernv: Interface to register/unregister opal dump region
PowerNV platform is capable of capturing host memory region when system
crashes (because of host/firmware). We have new OPAL API to register/
unregister memory region to be captured when system crashes.

This patch adds support for new API. Also during boot time we register
kernel log buffer and unregister before doing kexec.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:13:45 +10:00
Gavin Shan
763fe0addb powerpc/powernv: Fix IOMMU group lost
When we take full hotplug to recover from EEH errors, PCI buses
could be involved. For the case, the child devices of involved
PCI buses can't be attached to IOMMU group properly, which is
caused by commit 3f28c5a ("powerpc/powernv: Reduce multi-hit of
iommu_add_device()").

When adding the PCI devices of the newly created PCI buses to
the system, the IOMMU group is expected to be added in (C).
(A) fails to bind the IOMMU group because bus->is_added is
false. (B) fails because the device doesn't have binding IOMMU
table yet. bus->is_added is set to true at end of (C) and
pdev->is_added is set to true at (D).

   pcibios_add_pci_devices()
      pci_scan_bridge()
         pci_scan_child_bus()
            pci_scan_slot()
               pci_scan_single_device()
                  pci_scan_device()
                  pci_device_add()
                     pcibios_add_device()           A: Ignore
                     device_add()                   B: Ignore
                  pcibios_fixup_bus()
                     pcibios_setup_bus_devices()
                        pcibios_setup_device()      C: Hit
      pcibios_finish_adding_to_bus()
         pci_bus_add_devices()
            pci_bus_add_device()                    D: Add device

If the parent PCI bus isn't involved in hotplug, the IOMMU
group is expected to be bound in (B). (A) should fail as the
sysfs entries aren't populated.

The patch fixes the issue by reverting commit 3f28c5a and remove
WARN_ON() in iommu_add_device() to allow calling the function
even the specified device already has associated IOMMU group.

Cc: <stable@vger.kernel.org>  # 3.16+
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:13:42 +10:00
Linus Torvalds
c8d6637d04 Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module updates from Rusty Russell:
 "This finally applies the stricter sysfs perms checking we pulled out
  before last merge window.  A few stragglers are fixed (thanks
  linux-next!)"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  arch/powerpc/platforms/powernv/opal-dump.c: fix world-writable sysfs files
  arch/powerpc/platforms/powernv/opal-elog.c: fix world-writable sysfs files
  drivers/video/fbdev/s3c2410fb.c: don't make debug world-writable.
  ARM: avoid ARM binutils leaking ELF local symbols
  scripts: modpost: Remove numeric suffix pattern matching
  scripts: modpost: fix compilation warning
  sysfs: disallow world-writable files.
  module: return bool from within_module*()
  module: add within_module() function
  modules: Fix build error in moduleloader.h
2014-08-10 21:31:58 -07:00
Rusty Russell
76215b04fd arch/powerpc/platforms/powernv/opal-dump.c: fix world-writable sysfs files
If you don't have a store function, you're not writable anyway!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-08-07 22:38:55 +09:30
Rusty Russell
6656c21ca1 arch/powerpc/platforms/powernv/opal-elog.c: fix world-writable sysfs files
If you don't have a store function, you're not writable anyway!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2014-08-07 21:04:52 +09:30
Mahesh Salgaonkar
0ef95b411e powerpc/powernv: Invoke opal call to handle hmi.
When we hit the HMI in Linux, invoke opal call to handle/recover from HMI
errors in real mode and then in virtual mode during check_irq_replay()
invoke opal_poll_events()/opal_do_notifier() to retrieve HMI event from
OPAL and act accordingly.

Now that we are ready to handle HMI interrupt directly in linux, remove
the HMI interrupt registration with firmware.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 16:33:52 +10:00
Mahesh Salgaonkar
0869b6fd20 powerpc/book3s: Add basic infrastructure to handle HMI in Linux.
Handle Hypervisor Maintenance Interrupt (HMI) in Linux. This patch implements
basic infrastructure to handle HMI in Linux host. The design is to invoke
opal handle hmi in real mode for recovery and set irq_pending when we hit HMI.
During check_irq_replay pull opal hmi event and print hmi info on console.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 16:33:48 +10:00
Gavin Shan
98fd700287 powerpc/powernv: Handle compound PE in config accessors
The PCI config accessors check for PE frozen state and clear it if
EEH isn't functional. The patch handles compound PE in config accessors
if PHB supports it. For consistency, all PEs will be put into frozen
state if any one in compound group gets frozen by hardware.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 16:33:39 +10:00
Gavin Shan
5828790931 powerpc/powernv: Handle compound PE for EEH
The patch handles compound PE for EEH backend. If one specific
PE in compound group has been frozen, we enforces to freeze
all PEs in the group. If we're enable DMA or MMIO for one PE
in compound group, DMA or MMIO of all PEs in the group will be
enabled.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 16:33:34 +10:00
Gavin Shan
49dec9222f powerpc/powernv: Handle compound PE
The patch introduces 3 PHB callbacks: compound PE state retrieval,
force freezing and unfreezing compound PE. The PCI config accessors
and PowerNV EEH backend can use them in subsequent patches.

We don't export the capability of compound PE to EEH core, which
helps avoiding more complexity to EEH core.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 16:33:30 +10:00
Gavin Shan
c979c70ed1 powerpc/powernv: Split ioda_eeh_get_state()
Function ioda_eeh_get_state() is used to fetch EEH state for PHB
or PE. We're going to support compound PE and the function becomes
more complicated with that. The patch splits the function into two
functions for PHB and PE cases separately to improve readability.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 16:33:21 +10:00
Gavin Shan
5ca27efbd8 powerpc/powernv: Allow to freeze PE
The patch synchronizes header file with firmware to have new OPAL
API opal_pci_eeh_freeze_set(), which is used to freeze the specified
PE in order to support "compound" PE.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:41:52 +10:00
Guo Chao
262af557dd powerpc/powernv: Enable M64 aperatus for PHB3
This patch enables M64 aperatus for PHB3.

We already had platform hook (ppc_md.pcibios_window_alignment) to affect
the PCI resource assignment done in PCI core so that each PE's M32 resource
was built on basis of M32 segment size. Similarly, we're using that for
M64 assignment on basis of M64 segment size.

   * We're using last M64 BAR to cover M64 aperatus, and it's shared by all
     256 PEs.
   * We don't support P7IOC yet. However, some function callbacks are added
     to (struct pnv_phb) so that we can reuse them on P7IOC in future.
   * PE, corresponding to PCI bus with large M64 BAR device attached, might
     span multiple M64 segments. We introduce "compound" PE to cover the case.
     The compound PE is a list of PEs and the master PE is used as before.
     The slave PEs are just for MMIO isolation.

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:41:47 +10:00
Gavin Shan
bb593c0049 powerpc/eeh: Aux PE data for error log
The patch allows PE (struct eeh_pe) instance to have auxillary data,
whose size is configurable on basis of platform. For PowerNV, the
auxillary data will be used to cache PHB diag-data for that PE
(frozen PE or fenced PHB). In turn, we can retrieve the diag-data
at any later points.

It's useful for the case of VFIO PCI devices where the error log
should be cached, and then be retrieved by the guest at later point.
Also, it can avoid PHB diag-data overwritting if another frozen PE
reported and the previous diag-data isn't fetched by guest.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:41:43 +10:00
Gavin Shan
f18440fb7e powerpc/eeh: Make diag-data not endian dependent
It's followup of commit ddf0322a ("powerpc/powernv: Fix endianness
problems in EEH"). The patch helps to get non-endian-dependent
diag-data.

Cc: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:41:38 +10:00
Gavin Shan
0dae27439a powerpc/eeh: Replace pr_warning() with pr_warn()
pr_warn() is equal to pr_warning(), but the former is a bit more
formal according to commit fc62f2f ("kernel.h: add pr_warn for
symmetry to dev_warn, netdev_warn").

The patch replaces pr_warning() with pr_warn().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:41:34 +10:00
Gavin Shan
dc561fb9e7 powerpc/eeh: Selectively enable IO for error log
According to the experiment I did, PCI config access is blocked
on P7IOC frozen PE by hardware, but PHB3 doesn't do that. That
means we always get 0xFF's while dumping PCI config space of the
frozen PE on P7IOC. We don't have the problem on PHB3. So we have
to enable I/O prioir to collecting error log. Otherwise, meaningless
0xFF's are always returned.

The patch fixes it by EEH flag (EEH_ENABLE_IO_FOR_LOG), which is
selectively set to indicate the case for: P7IOC on PowerNV platform,
pSeries platform.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:41:25 +10:00
Gavin Shan
05b1721d9f powerpc/eeh: Refactor EEH flag accessors
There are multiple global EEH flags. Almost each flag has its own
accessor, which doesn't make sense. The patch refactors EEH flag
accessors so that they look unified:

  eeh_add_flag():   Add EEH flag
  eeh_clear_flag(): Clear EEH flag
  eeh_has_flag():   Check if one specific flag has been set
  eeh_enabled():    Check if EEH functionality has been enabled

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:41:21 +10:00
Gavin Shan
dff4a39e88 powerpc/powernv: Fix IOMMU table for VFIO dev
On PHB3, PCI devices can bypass IOMMU for DMA access. If we pass
through one PCI device, whose hose driver ever enable the bypass
mode, pdev->dev.archdata.dma_data.iommu_table_base isn't IOMMU
table. However, EEH needs access the IOMMU table when the device
is owned by guest.

The patch fixes pdev->dev.archdata.dma_data.iommu_table when
passing through the device to guest in pnv_pci_ioda2_set_bypass().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:41:12 +10:00
Brian W Hart
a32305bf90 powerpc/powernv: Update dev->dma_mask in pci_set_dma_mask() path
powerpc defines various machine-specific routines for handling
pci_set_dma_mask().  The routines for machine "PowerNV" may neglect
to set dev->dma_mask.  This could confuse anyone (e.g. drivers) that
consult dev->dma_mask to find the current mask.  Set the dma_mask in
the PowerNV leaf routine.

Signed-off-by: Brian W. Hart <hartb@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:40:54 +10:00
Mike Qiu
dadcd6d6e7 powerpc/eeh: sysfs entries lost
The sysfs entries are lost because of commit 2213fb1 ("powerpc/eeh:
Skip eeh sysfs when eeh is disabled"). That commit added condition
to create sysfs entries with EEH_ENABLED, which isn't populated
when trying to create sysfs entries on PowerNV platform during system
boot time. The patch fixes the issue by:

   * Reoder EEH initialization functions so that they're same on
     PowerNV/pSeries.
   * Cache PE's primary bus by PowerNV platform instead of EEH core
     to avoid kernel crash caused by the function reorder. Another
     benefit with this is to avoid one eeh_probe_mode_dev() in EEH
     core.

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:28:49 +10:00
Gavin Shan
05ec424e38 powerpc/eeh: Avoid event on passed PE
We must not handle EEH error on devices which are passed to somebody
else. Instead, we expect that the frozen device owner detects an EEH
error and recovers from it.

This avoids EEH error handling on passed through devices so the device
owner gets a chance to handle them.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:28:47 +10:00
Michael Ellerman
b14726c51c powerpc/powernv: Switch powernv drivers to use machine_xxx_initcall()
A lot of the code in platforms/powernv is using non-machine initcalls.
That means if a kernel built with powernv support runs on another
platform, for example pseries, the initcalls will still run.

That is usually OK, because the initcalls will check for something in
the device tree or elsewhere before doing anything, so on other
platforms they will usually just return.

But it's fishy for powernv code to be running on other platforms, so
switch them all to be machine initcalls. If we want any of them to run
on other platforms in future they should move to sysdev.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28 14:11:26 +10:00
Benjamin Herrenschmidt
cdc2652ee5 Merge branch 'merge' into next
Bring in some important fixes from the 3.16 branch
2014-07-28 13:41:12 +10:00
Vasant Hegde
fa952c54ba powerpc/powernv: Change BUG_ON to WARN_ON in elog code
We can continue to read the error log (up to MAX size) even if
we get the elog size more than MAX size. Hence change BUG_ON to
WARN_ON.

Also updated error message.

Reported-by: Gopesh Kumar Chaudhary <gopchaud@in.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28 11:30:54 +10:00