linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-02 13:32:40 -04:00

Author	SHA1	Message	Date
Linus Torvalds	636f64db07	Merge tag 'ras_core_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: - More noinstr fixes - Add an erratum workaround for Intel CPUs which, in certain circumstances, end up consuming an unrelated uncorrectable memory error when using fast string copy insns - Remove the MCE tolerance level control as it is not really needed or used anymore * tag 'ras_core_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Remove the tolerance level control x86/mce: Work around an erratum on fast string copy instructions x86/mce: Use arch atomic and bit helpers	2022-03-25 12:34:53 -07:00
Linus Torvalds	3bf03b9a08	Merge branch 'akpm' (patches from Andrew) Merge updates from Andrew Morton: - A few misc subsystems: kthread, scripts, ntfs, ocfs2, block, and vfs - Most the MM patches which precede the patches in Willy's tree: kasan, pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap, sparsemem, vmalloc, pagealloc, memory-failure, mlock, hugetlb, userfaultfd, vmscan, compaction, mempolicy, oom-kill, migration, thp, cma, autonuma, psi, ksm, page-poison, madvise, memory-hotplug, rmap, zswap, uaccess, ioremap, highmem, cleanups, kfence, hmm, and damon. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (227 commits) mm/damon/sysfs: remove repeat container_of() in damon_sysfs_kdamond_release() Docs/ABI/testing: add DAMON sysfs interface ABI document Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface selftests/damon: add a test for DAMON sysfs interface mm/damon/sysfs: support DAMOS stats mm/damon/sysfs: support DAMOS watermarks mm/damon/sysfs: support schemes prioritization mm/damon/sysfs: support DAMOS quotas mm/damon/sysfs: support DAMON-based Operation Schemes mm/damon/sysfs: support the physical address space monitoring mm/damon/sysfs: link DAMON for virtual address spaces monitoring mm/damon: implement a minimal stub for sysfs-based DAMON interface mm/damon/core: add number of each enum type values mm/damon/core: allow non-exclusive DAMON start/stop Docs/damon: update outdated term 'regions update interval' Docs/vm/damon/design: update DAMON-Idle Page Tracking interference handling Docs/vm/damon: call low level monitoring primitives the operations mm/damon: remove unnecessary CONFIG_DAMON option mm/damon/paddr,vaddr: remove damon_{p,v}a_{target_valid,set_operations}() mm/damon/dbgfs-test: fix is_target_id() change ...	2022-03-22 16:11:53 -07:00
luofei	d1fe111fb6	mm/hwpoison: avoid the impact of hwpoison_filter() return value on mce handler When the hwpoison page meets the filter conditions, it should not be regarded as successful memory_failure() processing for mce handler, but should return a distinct value, otherwise mce handler regards the error page has been identified and isolated, which may lead to calling set_mce_nospec() to change page attribute, etc. Here memory_failure() return -EOPNOTSUPP to indicate that the error event is filtered, mce handler should not take any action for this situation and hwpoison injector should treat as correct. Link: https://lkml.kernel.org/r/20220223082135.2769649-1-luofei@unicloud.com Signed-off-by: luofei <luofei@unicloud.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-03-22 15:57:07 -07:00
Borislav Petkov	7f1b8e0d63	x86/mce: Remove the tolerance level control This is pretty much unused and not really useful. What is more, all relevant MCA hardware has recoverable machine checks support so there's no real need to tweak MCA tolerance levels in order to maybe extend machine lifetime. So rip it out. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/YcDq8PxvKtTENl/e@zn.tnic	2022-02-23 11:09:25 +01:00
Jue Wang	8ca97812c3	x86/mce: Work around an erratum on fast string copy instructions A rare kernel panic scenario can happen when the following conditions are met due to an erratum on fast string copy instructions: 1) An uncorrected error. 2) That error must be in first cache line of a page. 3) Kernel must execute page_copy from the page immediately before that page. The fast string copy instructions ("REP; MOVS") could consume an uncorrectable memory error in the cache line _right after_ the desired region to copy and raise an MCE. Bit 0 of MSR_IA32_MISC_ENABLE can be cleared to disable fast string copy and will avoid such spurious machine checks. However, that is less preferable due to the permanent performance impact. Considering memory poison is rare, it's desirable to keep fast string copy enabled until an MCE is seen. Intel has confirmed the following: 1. The CPU erratum of fast string copy only applies to Skylake, Cascade Lake and Cooper Lake generations. Directly return from the MCE handler: 2. Will result in complete execution of the "REP; MOVS" with no data loss or corruption. 3. Will not result in another MCE firing on the next poisoned cache line due to "REP; MOVS". 4. Will resume execution from a correct point in code. 5. Will result in the same instruction that triggered the MCE firing a second MCE immediately for any other software recoverable data fetch errors. 6. Is not safe without disabling the fast string copy, as the next fast string copy of the same buffer on the same CPU would result in a PANIC MCE. This should mitigate the erratum completely with the only caveat that the fast string copy is disabled on the affected hyper thread thus performance degradation. This is still better than the OS crashing on MCEs raised on an irrelevant process due to "REP; MOVS' accesses in a kernel context, e.g., copy_page. Tested: Injected errors on 1st cache line of 8 anonymous pages of process 'proc1' and observed MCE consumption from 'proc2' with no panic (directly returned). Without the fix, the host panicked within a few minutes on a random 'proc2' process due to kernel access from copy_page. [ bp: Fix comment style + touch ups, zap an unlikely(), improve the quirk function's readability. ] Signed-off-by: Jue Wang <juew@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20220218013209.2436006-1-juew@google.com	2022-02-19 14:26:42 +01:00
Borislav Petkov	f11445ba7a	x86/mce: Use arch atomic and bit helpers The arch helpers do not have explicit KASAN instrumentation. Use them in noinstr code. Inline a couple more functions with single call sites, while at it: mce_severity_amd_smca() has a single call-site which is noinstr so force the inlining and fix: vmlinux.o: warning: objtool: mce_severity_amd.constprop.0()+0xca: call to \ mce_severity_amd_smca() leaves .noinstr.text section Always inline mca_msr_reg(): text data bss dec hex filename 16065240 128031326 36405368 180501934 ac23dae vmlinux.before 16065240 128031294 36405368 180501902 ac23d8e vmlinux.after and mce_no_way_out() as the latter one is used only once, to fix: vmlinux.o: warning: objtool: mce_read_aux()+0x53: call to mca_msr_reg() leaves .noinstr.text section vmlinux.o: warning: objtool: do_machine_check()+0xc9: call to mce_no_way_out() leaves .noinstr.text section Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Marco Elver <elver@google.com> Link: https://lore.kernel.org/r/20220204083015.17317-4-bp@alien8.de	2022-02-13 22:08:27 +01:00
Tony Luck	822ccfade5	x86/cpu: Read/save PPIN MSR during initialization Currently, the PPIN (Protected Processor Inventory Number) MSR is read by every CPU that processes a machine check, CMCI, or just polls machine check banks from a periodic timer. This is not a "fast" MSR, so this adds to overhead of processing errors. Add a new "ppin" field to the cpuinfo_x86 structure. Read and save the PPIN during initialization. Use this copy in mce_setup() instead of reading the MSR. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220131230111.2004669-4-tony.luck@intel.com	2022-02-01 16:29:26 +01:00
Tony Luck	0dcab41d34	x86/cpu: Merge Intel and AMD ppin_init() functions The code to decide whether a system supports the PPIN (Protected Processor Inventory Number) MSR was cloned from the Intel implementation. Apart from the X86_FEATURE bit and the MSR numbers it is identical. Merge the two functions into common x86 code, but use x86_match_cpu() instead of the switch (c->x86_model) that was used by the old Intel code. No functional change. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220131230111.2004669-2-tony.luck@intel.com	2022-02-01 12:56:23 +01:00
Greg Kroah-Hartman	7f99cb5e60	x86/CPU/AMD: Use default_groups in kobj_type There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the AMD mce sysfs code to use default_groups field which has been the preferred way since `aa30f47cf6` ("kobject: Add support for default attribute groups to kobj_type") so that the obsolete default_attrs field can be removed soon. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20220106103537.3663852-1-gregkh@linuxfoundation.org	2022-02-01 12:41:24 +01:00
Tony Luck	e464121f2d	x86/cpu: Add Xeon Icelake-D to list of CPUs that support PPIN Missed adding the Icelake-D CPU to the list. It uses the same MSRs to control and read the inventory number as all the other models. Fixes: `dc6b025de9` ("x86/mce: Add Xeon Icelake to list of CPUs that support PPIN") Reported-by: Ailin Xu <ailin.xu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220121174743.1875294-2-tony.luck@intel.com	2022-01-25 18:40:30 +01:00
Yazen Ghannam	1f52b0aba6	x86/MCE/AMD: Allow thresholding interface updates after init Changes to the AMD Thresholding sysfs code prevents sysfs writes from updating the underlying registers once CPU init is completed, i.e. "threshold_banks" is set. Allow the registers to be updated if the thresholding interface is already initialized or if in the init path. Use the "set_lvt_off" value to indicate if running in the init path, since this value is only set during init. Fixes: `a037f3ca0e` ("x86/mce/amd: Make threshold bank setting hotplug robust") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220117161328.19148-1-yazen.ghannam@amd.com	2022-01-23 20:50:18 +01:00
Zhang Zixun	de768416b2	x86/mce/inject: Avoid out-of-bounds write when setting flags A contrived zero-length write, for example, by using write(2): ... ret = write(fd, str, 0); ... to the "flags" file causes: BUG: KASAN: stack-out-of-bounds in flags_write Write of size 1 at addr ffff888019be7ddf by task writefile/3787 CPU: 4 PID: 3787 Comm: writefile Not tainted 5.16.0-rc7+ #12 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 due to accessing buf one char before its start. Prevent such out-of-bounds access. [ bp: Productize into a proper patch. Link below is the next best thing because the original mail didn't get archived on lore. ] Fixes: `0451d14d05` ("EDAC, mce_amd_inj: Modify flags attribute to use string arguments") Signed-off-by: Zhang Zixun <zhang133010@icloud.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/linux-edac/YcnePfF1OOqoQwrX@zn.tnic/	2021-12-28 11:45:36 +01:00
Yazen Ghannam	91f75eb481	x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration AMD systems currently lay out MCA bank types such that the type of bank number "i" is either the same across all CPUs or is Reserved/Read-as-Zero. For example: Bank # \| CPUx \| CPUy 0 LS LS 1 RAZ UMC 2 CS CS 3 SMU RAZ Future AMD systems will lay out MCA bank types such that the type of bank number "i" may be different across CPUs. For example: Bank # \| CPUx \| CPUy 0 LS LS 1 RAZ UMC 2 CS NBIO 3 SMU RAZ Change the structures that cache MCA bank types to be per-CPU and update smca_get_bank_type() to handle this change. Move some SMCA-specific structures to amd.c from mce.h, since they no longer need to be global. Break out the "count" for bank types from struct smca_hwid, since this should provide a per-CPU count rather than a system-wide count. Apply the "const" qualifier to the struct smca_hwid_mcatypes array. The values in this array should not change at runtime. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211216162905.4132657-3-yazen.ghannam@amd.com	2021-12-22 17:22:09 +01:00
Yazen Ghannam	5176a93ab2	x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types Add HWID and McaType values for new SMCA bank types, and add their error descriptions to edac_mce_amd. The "PHY" bank types all have the same error descriptions, and the NBIF and SHUB bank types have the same error descriptions. So reuse the same arrays where appropriate. [ bp: Remove useless comments over hwid types. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211216162905.4132657-2-yazen.ghannam@amd.com	2021-12-22 17:19:18 +01:00
Borislav Petkov	1acd85feba	x86/mce: Check regs before accessing it Commit in Fixes accesses pt_regs before checking whether it is NULL or not. Make sure the NULL pointer check happens first. Fixes: `0a5b288e85` ("x86/mce: Prevent severity computation from being instrumented") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20211217102029.GA29708@kili	2021-12-20 11:41:02 +01:00
Borislav Petkov	e3d72e8eee	x86/mce: Mark mce_start() noinstr Fixes vmlinux.o: warning: objtool: do_machine_check()+0x4ae: call to __const_udelay() leaves .noinstr.text section Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-13-bp@alien8.de	2021-12-13 14:14:05 +01:00
Borislav Petkov	edb3d07e24	x86/mce: Mark mce_timed_out() noinstr Fixes vmlinux.o: warning: objtool: do_machine_check()+0x482: call to mce_timed_out() leaves .noinstr.text section Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-12-bp@alien8.de	2021-12-13 14:13:54 +01:00
Borislav Petkov	75581a203e	x86/mce: Move the tainting outside of the noinstr region add_taint() is yet another external facility which the #MC handler calls. Move that tainting call into the instrumentation-allowed part of the handler. Fixes vmlinux.o: warning: objtool: do_machine_check()+0x617: call to add_taint() leaves .noinstr.text section While at it, allow instrumentation around the mce_log() call. Fixes vmlinux.o: warning: objtool: do_machine_check()+0x690: call to mce_log() leaves .noinstr.text section Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-11-bp@alien8.de	2021-12-13 14:13:35 +01:00
Borislav Petkov	db6c996d6c	x86/mce: Mark mce_read_aux() noinstr Fixes vmlinux.o: warning: objtool: do_machine_check()+0x681: call to mce_read_aux() leaves .noinstr.text section Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-10-bp@alien8.de	2021-12-13 14:13:23 +01:00
Borislav Petkov	b4813539d3	x86/mce: Mark mce_end() noinstr It is called by the #MC handler which is noinstr. Fixes vmlinux.o: warning: objtool: do_machine_check()+0xbd6: call to memset() leaves .noinstr.text section Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-9-bp@alien8.de	2021-12-13 14:13:12 +01:00
Borislav Petkov	3c7ce80a81	x86/mce: Mark mce_panic() noinstr And allow instrumentation inside it because it does calls to other facilities which will not be tagged noinstr. Fixes vmlinux.o: warning: objtool: do_machine_check()+0xc73: call to mce_panic() leaves .noinstr.text section Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-8-bp@alien8.de	2021-12-13 14:13:01 +01:00
Borislav Petkov	0a5b288e85	x86/mce: Prevent severity computation from being instrumented Mark all the MCE severity computation logic noinstr and allow instrumentation when it "calls out". Fixes vmlinux.o: warning: objtool: do_machine_check()+0xc5d: call to mce_severity() leaves .noinstr.text section Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-7-bp@alien8.de	2021-12-13 14:12:48 +01:00
Borislav Petkov	4fbce464db	x86/mce: Allow instrumentation during task work queueing Fixes vmlinux.o: warning: objtool: do_machine_check()+0xdb1: call to queue_task_work() leaves .noinstr.text section Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-6-bp@alien8.de	2021-12-13 14:12:35 +01:00
Borislav Petkov	487d654db3	x86/mce: Remove noinstr annotation from mce_setup() Instead, sandwitch around the call which is done in noinstr context and mark the caller - mce_gather_info() - as noinstr. Also, document what the whole instrumentation strategy with #MC is going to be in the future and where it all is supposed to be going to. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-5-bp@alien8.de	2021-12-13 14:12:21 +01:00
Borislav Petkov	88f66a4235	x86/mce: Use mce_rdmsrl() in severity checking code MCA has its own special MSR accessors. Use them. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-4-bp@alien8.de	2021-12-13 14:12:08 +01:00
Borislav Petkov	ad669ec16a	x86/mce: Remove function-local cpus variables Use num_online_cpus() directly. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-3-bp@alien8.de	2021-12-13 14:11:53 +01:00
Borislav Petkov	cd5e0d1fc9	x86/mce: Do not use memset to clear the banks bitmaps The bitmap is a single unsigned long so no need for the function call. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-2-bp@alien8.de	2021-12-13 14:11:22 +01:00
Smita Koralahalli	1e56279a49	x86/mce/inject: Set the valid bit in MCA_STATUS before error injection MCA handlers check the valid bit in each status register (MCA_STATUS[Val]) and continue processing the error only if the valid bit is set. Set the valid bit unconditionally in the corresponding MCA_STATUS register and correct any Val=0 injections made by the user as such errors will get ignored and such injections will be largely pointless. Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211104215846.254012-3-Smita.KoralahalliChannabasappa@amd.com	2021-12-08 12:01:01 +01:00
Smita Koralahalli	e48d008bd1	x86/mce/inject: Check if a bank is populated before injecting The MCA_IPID register uniquely identifies a bank's type on Scalable MCA (SMCA) systems. When an MCA bank is not populated, the MCA_IPID register will read as zero and writes to it will be ignored. On a hw-type error injection (injection which writes the actual MCA registers in an attempt to cause a real MCE) check the value of this register before trying to inject the error. Do not impose any limitations on a sw injection and allow the user to test out all the decoding paths without relying on the available hardware, as its purpose is to just test the code. [ bp: Heavily massage. ] Link: https://lkml.kernel.org/r/20211019233641.140275-2-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211104215846.254012-2-Smita.KoralahalliChannabasappa@amd.com	2021-12-08 12:00:56 +01:00
Zhaolong Zhang	2322b532ad	x86/mce: Get rid of cpu_missing Get rid of cpu_missing because `7bb39313cd` ("x86/mce: Make mce_timed_out() identify holdout CPUs") provides a more detailed message about which CPUs are missing. Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Zhaolong Zhang <zhangzl2013@126.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211109112345.2673403-1-zhangzl2013@126.com	2021-11-17 15:32:31 +01:00
Yazen Ghannam	0b746e8c1e	x86/MCE/AMD, EDAC/amd64: Move address translation to AMD64 EDAC The address translation code used for current AMD systems is non-architectural. So move it to EDAC. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211028175728.121452-2-yazen.ghannam@amd.com	2021-11-15 12:36:32 +01:00
Linus Torvalds	1654e95ee3	Merge tag 'x86_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add the model number of a new, Raptor Lake CPU, to intel-family.h - Do not log spurious corrected MCEs on SKL too, due to an erratum - Clarify the path of paravirt ops patches upstream - Add an optimization to avoid writing out AMX components to sigframes when former are in init state * tag 'x86_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add Raptor Lake to Intel family x86/mce: Add errata workaround for Skylake SKX37 MAINTAINERS: Add some information to PARAVIRT_OPS entry x86/fpu: Optimize out sigframe xfeatures when in init state	2021-11-14 09:29:03 -08:00
Dave Jones	e629fc1407	x86/mce: Add errata workaround for Skylake SKX37 Errata SKX37 is word-for-word identical to the other errata listed in this workaround. I happened to notice this after investigating a CMCI storm on a Skylake host. While I can't confirm this was the root cause, spurious corrected errors does sound like a likely suspect. Fixes: `2976908e41` ("x86/mce: Do not log spurious corrected mce errors") Signed-off-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20211029205759.GA7385@codemonkey.org.uk	2021-11-12 11:43:35 -08:00
Linus Torvalds	56d3375448	Merge tag 'drm-next-2021-11-03' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "Summary below. i915 starts to add support for DG2 GPUs, enables DG1 and ADL-S support by default, lots of work to enable DisplayPort 2.0 across drivers. Lots of documentation updates and fixes across the board. core: - improve dma_fence, lease and resv documentation - shmem-helpers: allocate WC pages on x86, use vmf_insert_pin - sched fixes/improvements - allow empty drm leases - add dma resv iterator - add more DP 2.0 headers - DP MST helper improvements for DP2.0 dma-buf: - avoid warnings, remove fence trace macros bridge: - new helper to get rid of panels - probe improvements for it66121 - enable DSI EOTP for anx7625 fbdev: - efifb: release runtime PM on destroy ttm: - kerneldoc switch - helper to clear all DMA mappings - pool shrinker optimizaton - remove ttm_tt_destroy_common - update ttm_move_memcpy for async use panel: - add new panel-edp driver amdgpu: - Initial DP 2.0 support - Initial USB4 DP tunnelling support - Aldebaran MCE support - Modifier support for DCC image stores for GFX 10.3 - Display rework for better FP code handling - Yellow Carp/Cyan Skillfish updates - Cyan Skillfish display support - convert vega/navi to IP discovery asic enumeration - validate IP discovery table - RAS improvements - Lots of fixes i915: - DG1 PCI IDs + LMEM discovery/placement - DG1 GuC submission by default - ADL-S PCI IDs updated + enabled by default - ADL-P (XE_LPD) fixed and updates - DG2 display fixes - PXP protected object support for Gen12 integrated - expose multi-LRC submission interface for GuC - export logical engine instance to user - Disable engine bonding on Gen12+ - PSR cleanup - PSR2 selective fetch by default - DP 2.0 prep work - VESA vendor block + MSO use of it - FBC refactor - try again to fix fast-narrow vs slow-wide eDP training - use THP when IOMMU enabled - LMEM backup/restore for suspend/resume - locking simplification - GuC major reworking - async flip VT-D workaround changes - DP link training improvements - misc display refactorings bochs: - new PCI ID rcar-du: - Non-contiguious buffer import support for rcar-du - r8a779a0 support prep omapdrm: - COMPILE_TEST fixes sti: - COMPILE_TEST fixes msm: - fence ordering improvements - eDP support in DP sub-driver - dpu irq handling cleanup - CRC support for making igt happy - NO_CONNECTOR bridge support - dsi: 14nm phy support for msm8953 - mdp5: msm8x53, sdm450, sdm632 support stm: - layer alpha + zpo support v3d: - fix Vulkan CTS failure - support multiple sync objects gud: - add R8/RGB332/RGB888 pixel formats vc4: - convert to new bridge helpers vgem: - use shmem helpers virtio: - support mapping exported vram zte: - remove obsolete driver rockchip: - use bridge attach no connector for LVDS/RGB" * tag 'drm-next-2021-11-03' of git://anongit.freedesktop.org/drm/drm: (1259 commits) drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits drm/amd/display: MST support for DPIA drm/amdgpu: Fix even more out of bound writes from debugfs drm/amdgpu/discovery: add SDMA IP instance info for soc15 parts drm/amdgpu/discovery: add UVD/VCN IP instance info for soc15 parts drm/amdgpu/UAPI: rearrange header to better align related items drm/amd/display: Enable dpia in dmub only for DCN31 B0 drm/amd/display: Fix USB4 hot plug crash issue drm/amd/display: Fix deadlock when falling back to v2 from v3 drm/amd/display: Fallback to clocks which meet requested voltage on DCN31 drm/amd/display: move FPU associated DCN301 code to DML folder drm/amd/display: fix link training regression for 1 or 2 lane drm/amd/display: add two lane settings training options drm/amd/display: decouple hw_lane_settings from dpcd_lane_settings drm/amd/display: implement decide lane settings drm/amd/display: adopt DP2.0 LT SCR revision 8 drm/amd/display: FEC configuration for dpia links in MST mode drm/amd/display: FEC configuration for dpia links drm/amd/display: Add workaround flag for EDID read on certain docks drm/amd/display: Set phy_mux_sel bit in dmub scratch register ...	2021-11-02 16:47:49 -07:00
Linus Torvalds	158405e888	Merge tag 'ras_core_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: - Get rid of a bunch of function pointers used in MCA land in favor of normal functions. This is in preparation of making the MCA code noinstr-aware - When the kernel copies data from user addresses and it encounters a machine check, a SIGBUS is sent to that process. Change this action to either an -EFAULT which is returned to the user or a short write, making the recovery action a lot more user-friendly * tag 'ras_core_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Sort mca_config members to get rid of unnecessary padding x86/mce: Get rid of the ->quirk_no_way_out() indirect call x86/mce: Get rid of msr_ops x86/mce: Get rid of machine_check_vector x86/mce: Get rid of the mce_severity function pointer x86/mce: Drop copyin special case for #MC x86/mce: Change to not send SIGBUS error during copy from user	2021-11-01 15:12:04 -07:00
Ingo Molnar	082f20b21d	Merge branch 'x86/urgent' into x86/fpu, to resolve a conflict Resolve the conflict between these commits: x86/fpu: `1193f408cd` ("x86/fpu/signal: Change return type of __fpu_restore_sig() to boolean") x86/urgent: `d298b03506` ("x86/fpu: Restore the masking out of reserved MXCSR bits") `b2381acd3f` ("x86/fpu: Mask out the invalid MXCSR bits properly") Conflicts: arch/x86/kernel/fpu/signal.c Signed-off-by: Ingo Molnar <mingo@kernel.org>	2021-10-16 15:17:46 +02:00
Mukul Joshi	f38ce910d8	x86/MCE/AMD: Export smca_get_bank_type symbol Export smca_get_bank_type for use in the AMD GPU driver to determine MCA bank while handling correctable and uncorrectable errors in GPU UMC. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-06 15:53:24 -04:00
Borislav Petkov	15802468a9	x86/mce: Sort mca_config members to get rid of unnecessary padding $ pahole -C mca_config arch/x86/kernel/cpu/mce/core.o before: /* size: 40, cachelines: 1, members: 16 / / sum members: 21, holes: 1, sum holes: 3 / / sum bitfield members: 64 bits, bit holes: 2, sum bit holes: 32 bits / / padding: 4 / / last cacheline: 40 bytes / after: / size: 32, cachelines: 1, members: 16 / / padding: 3 / / last cacheline: 32 bytes */ No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20210922165101.18951-6-bp@alien8.de	2021-09-23 11:38:04 +02:00
Borislav Petkov	cc466666ab	x86/mce: Get rid of the ->quirk_no_way_out() indirect call Use a flag setting to call the only quirk function for that. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20210922165101.18951-5-bp@alien8.de	2021-09-23 11:34:49 +02:00
Borislav Petkov	8121b8f947	x86/mce: Get rid of msr_ops Avoid having indirect calls and use a normal function which returns the proper MSR address based on ->smca setting. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20210922165101.18951-4-bp@alien8.de	2021-09-23 11:20:20 +02:00
Borislav Petkov	cbe1de162d	x86/mce: Get rid of machine_check_vector Get rid of the indirect function pointer and use flags settings instead to steer execution. Now that it is not an indirect call any longer, drop the instrumentation annotation for objtool too. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20210922165101.18951-3-bp@alien8.de	2021-09-23 11:15:49 +02:00
Borislav Petkov	631adc7b0b	x86/mce: Get rid of the mce_severity function pointer Turn it into a normal function which calls an AMD- or Intel-specific variant depending on the CPU it runs on. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20210922165101.18951-2-bp@alien8.de	2021-09-23 11:03:51 +02:00
Tony Luck	a6e3cf70b7	x86/mce: Change to not send SIGBUS error during copy from user Sending a SIGBUS for a copy from user is not the correct semantic. System calls should return -EFAULT (or a short count for write(2)). Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210818002942.1607544-3-tony.luck@intel.com	2021-09-20 10:55:41 +02:00
Tony Luck	81065b35e2	x86/mce: Avoid infinite loop for copy from user recovery There are two cases for machine check recovery: 1) The machine check was triggered by ring3 (application) code. This is the simpler case. The machine check handler simply queues work to be executed on return to user. That code unmaps the page from all users and arranges to send a SIGBUS to the task that triggered the poison. 2) The machine check was triggered in kernel code that is covered by an exception table entry. In this case the machine check handler still queues a work entry to unmap the page, etc. but this will not be called right away because the #MC handler returns to the fix up code address in the exception table entry. Problems occur if the kernel triggers another machine check before the return to user processes the first queued work item. Specifically, the work is queued using the ->mce_kill_me callback structure in the task struct for the current thread. Attempting to queue a second work item using this same callback results in a loop in the linked list of work functions to call. So when the kernel does return to user, it enters an infinite loop processing the same entry for ever. There are some legitimate scenarios where the kernel may take a second machine check before returning to the user. 1) Some code (e.g. futex) first tries a get_user() with page faults disabled. If this fails, the code retries with page faults enabled expecting that this will resolve the page fault. 2) Copy from user code retries a copy in byte-at-time mode to check whether any additional bytes can be copied. On the other side of the fence are some bad drivers that do not check the return value from individual get_user() calls and may access multiple user addresses without noticing that some/all calls have failed. Fix by adding a counter (current->mce_count) to keep track of repeated machine checks before task_work() is called. First machine check saves the address information and calls task_work_add(). Subsequent machine checks before that task_work call back is executed check that the address is in the same page as the first machine check (since the callback will offline exactly one page). Expected worst case is four machine checks before moving on (e.g. one user access with page faults disabled, then a repeat to the same address with page faults enabled ... repeat in copy tail bytes). Just in case there is some code that loops forever enforce a limit of 10. [ bp: Massage commit message, drop noinstr, fix typo, extend panic messages. ] Fixes: `5567d11c21` ("x86/mce: Send #MC singal from task work") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/YT/IJ9ziLqmtqEPu@agluck-desk2.amr.corp.intel.com	2021-09-14 10:27:03 +02:00
Thomas Gleixner	0c2e62ba04	x86/extable: Remove EX_TYPE_FAULT from MCE safe fixups Now that the MC safe copy and FPU have been converted to use the MCE safe fixup types remove EX_TYPE_FAULT from the list of types which MCE considers to be safe to be recovered in kernel. This removes the SGX exception handling of ENCLS from the #MC safe handling, but according to the SGX wizards the current SGX implementations cannot survive #MC on ENCLS: https://lore.kernel.org/r/YS+upEmTfpZub3s9@google.com The code relies on the trap number being stored if ENCLS raised an exception. That's still working, but it does no longer trick the MCE code into assuming that #MC is handled correctly for ENCLS. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210908132525.445255957@linutronix.de	2021-09-13 18:23:00 +02:00
Thomas Gleixner	2cadf5248b	x86/extable: Provide EX_TYPE_DEFAULT_MCE_SAFE and EX_TYPE_FAULT_MCE_SAFE Provide exception fixup types which can be used to identify fixups which allow in kernel #MC recovery and make them invoke the existing handlers. These will be used at places where #MC recovery is handled correctly by the caller. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210908132525.269689153@linutronix.de	2021-09-13 17:56:56 +02:00
Thomas Gleixner	46d28947d9	x86/extable: Rework the exception table mechanics The exception table entries contain the instruction address, the fixup address and the handler address. All addresses are relative. Storing the handler address has a few downsides: 1) Most handlers need to be exported 2) Handlers can be defined everywhere and there is no overview about the handler types 3) MCE needs to check the handler type to decide whether an in kernel #MC can be recovered. The functionality of the handler itself is not in any way special, but for these checks there need to be separate functions which in the worst case have to be exported. Some of these 'recoverable' exception fixups are pretty obscure and just reuse some other handler to spare code. That obfuscates e.g. the #MC safe copy functions. Cleaning that up would require more handlers and exports Rework the exception fixup mechanics by storing a fixup type number instead of the handler address and invoke the proper handler for each fixup type. Also teach the extable sort to leave the type field alone. This makes most handlers static except for special cases like the MCE MSR fixup and the BPF fixup. This allows to add more types for cleaning up the obscure places without adding more handler code and exports. There is a marginal code size reduction for a production config and it removes _eight_ exported symbols. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lkml.kernel.org/r/20210908132525.211958725@linutronix.de	2021-09-13 17:51:47 +02:00
Thomas Gleixner	083b32d6f4	x86/mce: Get rid of stray semicolons and the random number of tabs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210908132525.154428878@linutronix.de	2021-09-13 17:42:51 +02:00
Thomas Gleixner	e42404afc4	x86/mce: Deduplicate exception handling Prepare code for further simplification. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210908132525.096452100@linutronix.de	2021-09-13 17:00:23 +02:00
Linus Torvalds	230bda0873	Merge tag 'x86_cleanups_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: "The usual round of minor cleanups and fixes" * tag 'x86_cleanups_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kaslr: Have process_mem_region() return a boolean x86/power: Fix kernel-doc warnings in cpu.c x86/mce/inject: Replace deprecated CPU-hotplug functions. x86/microcode: Replace deprecated CPU-hotplug functions. x86/mtrr: Replace deprecated CPU-hotplug functions. x86/mmiotrace: Replace deprecated CPU-hotplug functions.	2021-08-30 13:35:36 -07:00

1 2 3 4 5

218 Commits