commit 0ccfb8e07e797d57830f3008028de56e22de6e0b Author: Greg Kroah-Hartman Date: Wed Apr 10 16:36:08 2024 +0200 Linux 6.6.26 Link: https://lore.kernel.org/r/20240408125306.643546457@linuxfoundation.org Tested-by: SeongJae Park Tested-by: Kelsey Steele Tested-by: Ron Economos Tested-by: Jon Hunter Tested-by: Takeshi Ogasawara Tested-by: Conor Dooley Tested-by: Mark Brown Tested-by: Harshit Mogalapalli Tested-by: Shuah Khan Link: https://lore.kernel.org/r/20240409172821.820897593@linuxfoundation.org Tested-by: Florian Fainelli Tested-by: kernelci.org bot Link: https://lore.kernel.org/r/20240409173540.185904475@linuxfoundation.org Tested-by: Takeshi Ogasawara Tested-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 6d9ef0c36980ef051cb55aeefb6438429e37268a Author: Greg Kroah-Hartman Date: Tue Apr 9 19:32:41 2024 +0200 x86: set SPECTRE_BHI_ON as default commit 2bb69f5fc72183e1c62547d900f560d0e9334925 upstream. Part of a merge commit from Linus that adjusted the default setting of SPECTRE_BHI_ON. Cc: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit cb238e95ee72a64e53a4f93181aae634cc0d3be6 Author: Daniel Sneddon Date: Wed Mar 13 09:49:17 2024 -0700 KVM: x86: Add BHI_NO commit ed2e8d49b54d677f3123668a21a57822d679651f upstream. Intel processors that aren't vulnerable to BHI will set MSR_IA32_ARCH_CAPABILITIES[BHI_NO] = 1;. Guests may use this BHI_NO bit to determine if they need to implement BHI mitigations or not. Allow this bit to be passed to the guests. Signed-off-by: Daniel Sneddon Signed-off-by: Pawan Gupta Signed-off-by: Daniel Sneddon Signed-off-by: Thomas Gleixner Reviewed-by: Alexandre Chartre Reviewed-by: Josh Poimboeuf Signed-off-by: Daniel Sneddon Signed-off-by: Greg Kroah-Hartman commit 1c42ff893a8fb802dd90ca06af928826fdf0d16b Author: Pawan Gupta Date: Mon Mar 11 08:57:09 2024 -0700 x86/bhi: Mitigate KVM by default commit 95a6ccbdc7199a14b71ad8901cb788ba7fb5167b upstream. BHI mitigation mode spectre_bhi=auto does not deploy the software mitigation by default. In a cloud environment, it is a likely scenario where userspace is trusted but the guests are not trusted. Deploying system wide mitigation in such cases is not desirable. Update the auto mode to unconditionally mitigate against malicious guests. Deploy the software sequence at VMexit in auto mode also, when hardware mitigation is not available. Unlike the force =on mode, software sequence is not deployed at syscalls in auto mode. Suggested-by: Alexandre Chartre Signed-off-by: Pawan Gupta Signed-off-by: Daniel Sneddon Signed-off-by: Thomas Gleixner Reviewed-by: Alexandre Chartre Reviewed-by: Josh Poimboeuf Signed-off-by: Daniel Sneddon Signed-off-by: Greg Kroah-Hartman commit d414b401f9539858574a19af4ffc0fc0d53bfb8f Author: Pawan Gupta Date: Mon Mar 11 08:57:05 2024 -0700 x86/bhi: Add BHI mitigation knob commit ec9404e40e8f36421a2b66ecb76dc2209fe7f3ef upstream. Branch history clearing software sequences and hardware control BHI_DIS_S were defined to mitigate Branch History Injection (BHI). Add cmdline spectre_bhi={on|off|auto} to control BHI mitigation: auto - Deploy the hardware mitigation BHI_DIS_S, if available. on - Deploy the hardware mitigation BHI_DIS_S, if available, otherwise deploy the software sequence at syscall entry and VMexit. off - Turn off BHI mitigation. The default is auto mode which does not deploy the software sequence mitigation. This is because of the hardening done in the syscall dispatch path, which is the likely target of BHI. Signed-off-by: Pawan Gupta Signed-off-by: Daniel Sneddon Signed-off-by: Thomas Gleixner Reviewed-by: Alexandre Chartre Reviewed-by: Josh Poimboeuf Signed-off-by: Daniel Sneddon Signed-off-by: Greg Kroah-Hartman commit 118794d0a572c7a8514dc774e68b59d41857b81c Author: Pawan Gupta Date: Mon Mar 11 08:57:03 2024 -0700 x86/bhi: Enumerate Branch History Injection (BHI) bug commit be482ff9500999f56093738f9219bbabc729d163 upstream. Mitigation for BHI is selected based on the bug enumeration. Add bits needed to enumerate BHI bug. Signed-off-by: Pawan Gupta Signed-off-by: Daniel Sneddon Signed-off-by: Thomas Gleixner Reviewed-by: Alexandre Chartre Reviewed-by: Josh Poimboeuf Signed-off-by: Daniel Sneddon Signed-off-by: Greg Kroah-Hartman commit c6e3d590d0514612d96c572cba66ae0cb4b505a2 Author: Daniel Sneddon Date: Wed Mar 13 09:47:57 2024 -0700 x86/bhi: Define SPEC_CTRL_BHI_DIS_S commit 0f4a837615ff925ba62648d280a861adf1582df7 upstream. Newer processors supports a hardware control BHI_DIS_S to mitigate Branch History Injection (BHI). Setting BHI_DIS_S protects the kernel from userspace BHI attacks without having to manually overwrite the branch history. Define MSR_SPEC_CTRL bit BHI_DIS_S and its enumeration CPUID.BHI_CTRL. Mitigation is enabled later. Signed-off-by: Daniel Sneddon Signed-off-by: Pawan Gupta Signed-off-by: Daniel Sneddon Signed-off-by: Thomas Gleixner Reviewed-by: Alexandre Chartre Reviewed-by: Josh Poimboeuf Signed-off-by: Daniel Sneddon Signed-off-by: Greg Kroah-Hartman commit eb36b0dce2138581bc6b5e39d0273cb4c96ded81 Author: Pawan Gupta Date: Mon Mar 11 08:56:58 2024 -0700 x86/bhi: Add support for clearing branch history at syscall entry commit 7390db8aea0d64e9deb28b8e1ce716f5020c7ee5 upstream. Branch History Injection (BHI) attacks may allow a malicious application to influence indirect branch prediction in kernel by poisoning the branch history. eIBRS isolates indirect branch targets in ring0. The BHB can still influence the choice of indirect branch predictor entry, and although branch predictor entries are isolated between modes when eIBRS is enabled, the BHB itself is not isolated between modes. Alder Lake and new processors supports a hardware control BHI_DIS_S to mitigate BHI. For older processors Intel has released a software sequence to clear the branch history on parts that don't support BHI_DIS_S. Add support to execute the software sequence at syscall entry and VMexit to overwrite the branch history. For now, branch history is not cleared at interrupt entry, as malicious applications are not believed to have sufficient control over the registers, since previous register state is cleared at interrupt entry. Researchers continue to poke at this area and it may become necessary to clear at interrupt entry as well in the future. This mitigation is only defined here. It is enabled later. Signed-off-by: Pawan Gupta Co-developed-by: Daniel Sneddon Signed-off-by: Daniel Sneddon Signed-off-by: Thomas Gleixner Reviewed-by: Alexandre Chartre Reviewed-by: Josh Poimboeuf Signed-off-by: Daniel Sneddon Signed-off-by: Greg Kroah-Hartman commit eb0f175b34287f886019b86ac2f410df331d2c34 Author: Linus Torvalds Date: Wed Apr 3 16:36:44 2024 -0700 x86/syscall: Don't force use of indirect calls for system calls commit 1e3ad78334a69b36e107232e337f9d693dcc9df2 upstream. Make build a switch statement instead, and the compiler can either decide to generate an indirect jump, or - more likely these days due to mitigations - just a series of conditional branches. Yes, the conditional branches also have branch prediction, but the branch prediction is much more controlled, in that it just causes speculatively running the wrong system call (harmless), rather than speculatively running possibly wrong random less controlled code gadgets. This doesn't mitigate other indirect calls, but the system call indirection is the first and most easily triggered case. Signed-off-by: Linus Torvalds Signed-off-by: Daniel Sneddon Signed-off-by: Thomas Gleixner Reviewed-by: Josh Poimboeuf Signed-off-by: Daniel Sneddon Signed-off-by: Greg Kroah-Hartman commit 108feca9e47df1bed26ac7b04306587d9ebccda3 Author: Josh Poimboeuf Date: Fri Apr 5 11:14:13 2024 -0700 x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file commit 0cd01ac5dcb1e18eb18df0f0d05b5de76522a437 upstream. Change the format of the 'spectre_v2' vulnerabilities sysfs file slightly by converting the commas to semicolons, so that mitigations for future variants can be grouped together and separated by commas. Signed-off-by: Josh Poimboeuf Signed-off-by: Daniel Sneddon Signed-off-by: Thomas Gleixner Signed-off-by: Daniel Sneddon Signed-off-by: Greg Kroah-Hartman commit 046545314c792a5545e5f236293149346058c73e Author: Ard Biesheuvel Date: Tue Feb 27 16:19:14 2024 +0100 x86/boot: Move mem_encrypt= parsing to the decompressor commit cd0d9d92c8bb46e77de62efd7df13069ddd61e7d upstream. The early SME/SEV code parses the command line very early, in order to decide whether or not memory encryption should be enabled, which needs to occur even before the initial page tables are created. This is problematic for a number of reasons: - this early code runs from the 1:1 mapping provided by the decompressor or firmware, which uses a different translation than the one assumed by the linker, and so the code needs to be built in a special way; - parsing external input while the entire kernel image is still mapped writable is a bad idea in general, and really does not belong in security minded code; - the current code ignores the built-in command line entirely (although this appears to be the case for the entire decompressor) Given that the decompressor/EFI stub is an intrinsic part of the x86 bootable kernel image, move the command line parsing there and out of the core kernel. This removes the need to build lib/cmdline.o in a special way, or to use RIP-relative LEA instructions in inline asm blocks. This involves a new xloadflag in the setup header to indicate that mem_encrypt=on appeared on the kernel command line. Signed-off-by: Ard Biesheuvel Signed-off-by: Borislav Petkov (AMD) Tested-by: Tom Lendacky Link: https://lore.kernel.org/r/20240227151907.387873-17-ardb+git@google.com Signed-off-by: Greg Kroah-Hartman commit ccde70aa54c484f05030a353fec47de3a0de5a2d Author: Ard Biesheuvel Date: Thu Jan 25 14:32:07 2024 +0100 x86/efistub: Remap kernel text read-only before dropping NX attribute commit 9c55461040a9264b7e44444c53d26480b438eda6 upstream. Currently, the EFI stub invokes the EFI memory attributes protocol to strip any NX restrictions from the entire loaded kernel, resulting in all code and data being mapped read-write-execute. The point of the EFI memory attributes protocol is to remove the need for all memory allocations to be mapped with both write and execute permissions by default, and make it the OS loader's responsibility to transition data mappings to code mappings where appropriate. Even though the UEFI specification does not appear to leave room for denying memory attribute changes based on security policy, let's be cautious and avoid relying on the ability to create read-write-execute mappings. This is trivially achievable, given that the amount of kernel code executing via the firmware's 1:1 mapping is rather small and limited to the .head.text region. So let's drop the NX restrictions only on that subregion, but not before remapping it as read-only first. Signed-off-by: Ard Biesheuvel Signed-off-by: Greg Kroah-Hartman commit 56408ed92903795d350801d4809a77e946774f13 Author: Ard Biesheuvel Date: Tue Feb 27 16:19:16 2024 +0100 x86/sev: Move early startup code into .head.text section commit 428080c9b19bfda37c478cd626dbd3851db1aff9 upstream. In preparation for implementing rigorous build time checks to enforce that only code that can support it will be called from the early 1:1 mapping of memory, move SEV init code that is called in this manner to the .head.text section. Signed-off-by: Ard Biesheuvel Signed-off-by: Borislav Petkov (AMD) Tested-by: Tom Lendacky Link: https://lore.kernel.org/r/20240227151907.387873-19-ardb+git@google.com Signed-off-by: Ard Biesheuvel Signed-off-by: Greg Kroah-Hartman commit af90ced75242b5a1ca4fc861f4bd55890a4ed873 Author: Ard Biesheuvel Date: Tue Feb 27 16:19:15 2024 +0100 x86/sme: Move early SME kernel encryption handling into .head.text commit 48204aba801f1b512b3abed10b8e1a63e03f3dd1 upstream. The .head.text section is the initial primary entrypoint of the core kernel, and is entered with the CPU executing from a 1:1 mapping of memory. Such code must never access global variables using absolute references, as these are based on the kernel virtual mapping which is not active yet at this point. Given that the SME startup code is also called from this early execution context, move it into .head.text as well. This will allow more thorough build time checks in the future to ensure that early startup code only uses RIP-relative references to global variables. Also replace some occurrences of __pa_symbol() [which relies on the compiler generating an absolute reference, which is not guaranteed] and an open coded RIP-relative access with RIP_REL_REF(). Signed-off-by: Ard Biesheuvel Signed-off-by: Borislav Petkov (AMD) Tested-by: Tom Lendacky Link: https://lore.kernel.org/r/20240227151907.387873-18-ardb+git@google.com Signed-off-by: Ard Biesheuvel Signed-off-by: Greg Kroah-Hartman commit dc4cbf9e2df4d2ad361659aa037f5a9b0d32691f Author: Ard Biesheuvel Date: Tue Feb 27 16:19:13 2024 +0100 efi/libstub: Add generic support for parsing mem_encrypt= commit 7205f06e847422b66c1506eee01b9998ffc75d76 upstream. Parse the mem_encrypt= command line parameter from the EFI stub if CONFIG_ARCH_HAS_MEM_ENCRYPT=y, so that it can be passed to the early boot code by the arch code in the stub. This avoids the need for the core kernel to do any string parsing very early in the boot. Signed-off-by: Ard Biesheuvel Signed-off-by: Borislav Petkov (AMD) Tested-by: Tom Lendacky Link: https://lore.kernel.org/r/20240227151907.387873-16-ardb+git@google.com Signed-off-by: Ard Biesheuvel Signed-off-by: Greg Kroah-Hartman commit 5447cb97e9b29bd3ac7569c5998d9cd7901482bd Author: Hou Wenlong Date: Tue Oct 17 15:08:06 2023 +0800 x86/head/64: Move the __head definition to commit d2a285d65bfde3218fd0c3b88794d0135ced680b upstream. Move the __head section definition to a header to widen its use. An upcoming patch will mark the code as __head in mem_encrypt_identity.c too. Signed-off-by: Hou Wenlong Signed-off-by: Ingo Molnar Link: https://lore.kernel.org/r/0583f57977be184689c373fe540cbd7d85ca2047.1697525407.git.houwenlong.hwl@antgroup.com Signed-off-by: Ard Biesheuvel Signed-off-by: Greg Kroah-Hartman commit 876941f533e7b47fc69977fc4551c02f2d18af97 Author: Andrii Nakryiko Date: Wed Mar 27 22:24:26 2024 -0700 bpf: support deferring bpf_link dealloc to after RCU grace period commit 1a80dbcb2dbaf6e4c216e62e30fa7d3daa8001ce upstream. BPF link for some program types is passed as a "context" which can be used by those BPF programs to look up additional information. E.g., for multi-kprobes and multi-uprobes, link is used to fetch BPF cookie values. Because of this runtime dependency, when bpf_link refcnt drops to zero there could still be active BPF programs running accessing link data. This patch adds generic support to defer bpf_link dealloc callback to after RCU GP, if requested. This is done by exposing two different deallocation callbacks, one synchronous and one deferred. If deferred one is provided, bpf_link_free() will schedule dealloc_deferred() callback to happen after RCU GP. BPF is using two flavors of RCU: "classic" non-sleepable one and RCU tasks trace one. The latter is used when sleepable BPF programs are used. bpf_link_free() accommodates that by checking underlying BPF program's sleepable flag, and goes either through normal RCU GP only for non-sleepable, or through RCU tasks trace GP *and* then normal RCU GP (taking into account rcu_trace_implies_rcu_gp() optimization), if BPF program is sleepable. We use this for multi-kprobe and multi-uprobe links, which dereference link during program run. We also preventively switch raw_tp link to use deferred dealloc callback, as upcoming changes in bpf-next tree expose raw_tp link data (specifically, cookie value) to BPF program at runtime as well. Fixes: 0dcac2725406 ("bpf: Add multi kprobe link") Fixes: 89ae89f53d20 ("bpf: Add multi uprobe link") Reported-by: syzbot+981935d9485a560bfbcb@syzkaller.appspotmail.com Reported-by: syzbot+2cb5a6c573e98db598cc@syzkaller.appspotmail.com Reported-by: syzbot+62d8b26793e8a2bd0516@syzkaller.appspotmail.com Signed-off-by: Andrii Nakryiko Acked-by: Jiri Olsa Link: https://lore.kernel.org/r/20240328052426.3042617-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov Signed-off-by: Greg Kroah-Hartman commit 771690b7c31d6ef210f94587e6562f445d7b438d Author: Andrii Nakryiko Date: Wed Mar 27 22:24:25 2024 -0700 bpf: put uprobe link's path and task in release callback commit e9c856cabefb71d47b2eeb197f72c9c88e9b45b0 upstream. There is no need to delay putting either path or task to deallocation step. It can be done right after bpf_uprobe_unregister. Between release and dealloc, there could be still some running BPF programs, but they don't access either task or path, only data in link->uprobes, so it is safe to do. On the other hand, doing path_put() in dealloc callback makes this dealloc sleepable because path_put() itself might sleep. Which is problematic due to the need to call uprobe's dealloc through call_rcu(), which is what is done in the next bug fix patch. So solve the problem by releasing these resources early. Signed-off-by: Andrii Nakryiko Link: https://lore.kernel.org/r/20240328052426.3042617-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov Signed-off-by: Greg Kroah-Hartman commit 43eca11b7c73030deb2034f33e9dc6a91fe56c23 Author: Davide Caratti Date: Fri Mar 29 13:08:52 2024 +0100 mptcp: don't account accept() of non-MPC client as fallback to TCP commit 7a1b3490f47e88ec4cbde65f1a77a0f4bc972282 upstream. Current MPTCP servers increment MPTcpExtMPCapableFallbackACK when they accept non-MPC connections. As reported by Christoph, this is "surprising" because the counter might become greater than MPTcpExtMPCapableSYNRX. MPTcpExtMPCapableFallbackACK counter's name suggests it should only be incremented when a connection was seen using MPTCP options, then a fallback to TCP has been done. Let's do that by incrementing it when the subflow context of an inbound MPC connection attempt is dropped. Also, update mptcp_connect.sh kselftest, to ensure that the above MIB does not increment in case a pure TCP client connects to a MPTCP server. Fixes: fc518953bc9c ("mptcp: add and use MIB counter infrastructure") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/449 Signed-off-by: Davide Caratti Reviewed-by: Mat Martineau Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://lore.kernel.org/r/20240329-upstream-net-20240329-fallback-mib-v1-1-324a8981da48@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Matthieu Baerts (NGI0) Signed-off-by: Greg Kroah-Hartman commit 12f353fac65d6cd7ed4658ed0467c60b1cdb02d6 Author: Davide Caratti Date: Tue Dec 19 22:31:04 2023 +0100 mptcp: don't overwrite sock_ops in mptcp_is_tcpsk() commit 8e2b8a9fa512709e6fee744dcd4e2a20ee7f5c56 upstream. Eric Dumazet suggests: > The fact that mptcp_is_tcpsk() was able to write over sock->ops was a > bit strange to me. > mptcp_is_tcpsk() should answer a question, with a read-only argument. re-factor code to avoid overwriting sock_ops inside that function. Also, change the helper name to reflect the semantics and to disambiguate from its dual, sk_is_mptcp(). While at it, collapse mptcp_stream_accept() and mptcp_accept() into a single function, where fallback / non-fallback are separated into a single sk_is_mptcp() conditional. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/432 Suggested-by: Eric Dumazet Signed-off-by: Davide Caratti Acked-by: Paolo Abeni Signed-off-by: Matthieu Baerts Signed-off-by: David S. Miller Signed-off-by: Matthieu Baerts (NGI0) Signed-off-by: Greg Kroah-Hartman commit 5b5ff82491a1bff97c0daed82724f578e9a21fe4 Author: Matthieu Baerts (NGI0) Date: Wed Mar 6 10:42:57 2024 +0100 selftests: mptcp: connect: fix shellcheck warnings commit e3aae1098f109f0bd33c971deff1926f4e4441d0 upstream. shellcheck recently helped to prevent issues. It is then good to fix the other harmless issues in order to spot "real" ones later. Here, two categories of warnings are now ignored: - SC2317: Command appears to be unreachable. The cleanup() function is invoked indirectly via the EXIT trap. - SC2086: Double quote to prevent globbing and word splitting. This is recommended, but the current usage is correct and there is no need to do all these modifications to be compliant with this rule. For the modifications: - SC2034: ksft_skip appears unused. - SC2181: Check exit code directly with e.g. 'if mycmd;', not indirectly with $?. - SC2004: $/${} is unnecessary on arithmetic variables. - SC2155: Declare and assign separately to avoid masking return values. - SC2166: Prefer [ p ] && [ q ] as [ p -a q ] is not well defined. - SC2059: Don't use variables in the printf format string. Use printf '..%s..' "$foo". Now this script is shellcheck (0.9.0) compliant. We can easily spot new issues. Reviewed-by: Mat Martineau Signed-off-by: Matthieu Baerts (NGI0) Link: https://lore.kernel.org/r/20240306-upstream-net-next-20240304-selftests-mptcp-shared-code-shellcheck-v2-8-bc79e6e5e6a0@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Matthieu Baerts (NGI0) Signed-off-by: Greg Kroah-Hartman commit e4a449368a2ce6d57a775d0ead27fc07f5a86e5b Author: Sergey Shtylyov Date: Wed Mar 27 19:52:49 2024 +0300 of: module: prevent NULL pointer dereference in vsnprintf() commit a1aa5390cc912934fee76ce80af5f940452fa987 upstream. In of_modalias(), we can get passed the str and len parameters which would cause a kernel oops in vsnprintf() since it only allows passing a NULL ptr when the length is also 0. Also, we need to filter out the negative values of the len parameter as these will result in a really huge buffer since snprintf() takes size_t parameter while ours is ssize_t... Found by Linux Verification Center (linuxtesting.org) with the Svace static analysis tool. Signed-off-by: Sergey Shtylyov Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1d211023-3923-685b-20f0-f3f90ea56e1f@omp.ru Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit 37b81aed6468a031f8515142e226e6ddc392b9a2 Author: Greg Kroah-Hartman Date: Mon Apr 8 12:42:06 2024 +0200 Revert "x86/mpparse: Register APIC address only once" This reverts commit bebb5af001dc6cb4f505bb21c4d5e2efbdc112e2 which is commit f2208aa12c27bfada3c15c550c03ca81d42dcac2 upstream. It is reported to cause problems in the stable branches, so revert it. Link: https://lore.kernel.org/r/899b7c1419a064a2b721b78eade06659@stwm.de Reported-by: Wolfgang Walter Cc: Thomas Gleixner Cc: Borislav Petkov (AMD) Cc: Guenter Roeck Cc: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a7ff84a6fe5ae8889a5f1c97008358836bd7f947 Author: Andi Shyti Date: Thu Mar 28 08:34:05 2024 +0100 drm/i915/gt: Enable only one CCS for compute workload commit 6db31251bb265813994bfb104eb4b4d0f44d64fb upstream. Enable only one CCS engine by default with all the compute sices allocated to it. While generating the list of UABI engines to be exposed to the user, exclude any additional CCS engines beyond the first instance. This change can be tested with igt i915_query. Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement") Signed-off-by: Andi Shyti Cc: Chris Wilson Cc: Joonas Lahtinen Cc: Matt Roper Cc: # v6.2+ Reviewed-by: Matt Roper Acked-by: Michal Mrozek Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-4-andi.shyti@linux.intel.com (cherry picked from commit 2bebae0112b117de7e8a7289277a4bd2403b9e17) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit 726ff623869ddc3de887d99296cac3c849061b21 Author: Andi Shyti Date: Thu Mar 28 08:34:04 2024 +0100 drm/i915/gt: Do not generate the command streamer for all the CCS commit ea315f98e5d6d3191b74beb0c3e5fc16081d517c upstream. We want a fixed load CCS balancing consisting in all slices sharing one single user engine. For this reason do not create the intel_engine_cs structure with its dedicated command streamer for CCS slices beyond the first. Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement") Signed-off-by: Andi Shyti Cc: Chris Wilson Cc: Joonas Lahtinen Cc: Matt Roper Cc: # v6.2+ Acked-by: Michal Mrozek Reviewed-by: Matt Roper Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-3-andi.shyti@linux.intel.com (cherry picked from commit c7a5aa4e57f88470313a8277eb299b221b86e3b1) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit c1f7ce2a11a945044d9d5556e638efdca70fb321 Author: Andi Shyti Date: Thu Mar 28 08:34:03 2024 +0100 drm/i915/gt: Disable HW load balancing for CCS commit bc9a1ec01289e6e7259dc5030b413a9c6654a99a upstream. The hardware should not dynamically balance the load between CCS engines. Wa_14019159160 recommends disabling it across all platforms. Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement") Signed-off-by: Andi Shyti Cc: Chris Wilson Cc: Joonas Lahtinen Cc: Matt Roper Cc: # v6.2+ Reviewed-by: Matt Roper Acked-by: Michal Mrozek Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-2-andi.shyti@linux.intel.com (cherry picked from commit f5d2904cf814f20b79e3e4c1b24a4ccc2411b7e0) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit 2cfff21732132e363b4cc275d63ea98f1af726c1 Author: Paulo Alcantara Date: Tue Apr 2 16:34:04 2024 -0300 smb: client: fix potential UAF in cifs_signal_cifsd_for_reconnect() commit e0e50401cc3921c9eaf1b0e667db174519ea939f upstream. Skip sessions that are being teared down (status == SES_EXITING) to avoid UAF. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit aa582b33f94453fdeaff1e7d0aa252c505975e01 Author: Paulo Alcantara Date: Tue Apr 2 16:34:02 2024 -0300 smb: client: fix potential UAF in smb2_is_network_name_deleted() commit 63981561ffd2d4987807df4126f96a11e18b0c1d upstream. Skip sessions that are being teared down (status == SES_EXITING) to avoid UAF. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 0a15ba88a32fa7a516aff7ffd27befed5334dff2 Author: Paulo Alcantara Date: Tue Apr 2 16:34:00 2024 -0300 smb: client: fix potential UAF in is_valid_oplock_break() commit 69ccf040acddf33a3a85ec0f6b45ef84b0f7ec29 upstream. Skip sessions that are being teared down (status == SES_EXITING) to avoid UAF. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit f92739fdd4522c4291277136399353d7c341fae4 Author: Paulo Alcantara Date: Tue Apr 2 16:33:58 2024 -0300 smb: client: fix potential UAF in smb2_is_valid_lease_break() commit 705c76fbf726c7a2f6ff9143d4013b18daaaebf1 upstream. Skip sessions that are being teared down (status == SES_EXITING) to avoid UAF. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 21fed37d2bdcde33453faf61d3d4d96c355f04bd Author: Paulo Alcantara Date: Tue Apr 2 16:33:59 2024 -0300 smb: client: fix potential UAF in smb2_is_valid_oplock_break() commit 22863485a4626ec6ecf297f4cc0aef709bc862e4 upstream. Skip sessions that are being teared down (status == SES_EXITING) to avoid UAF. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 10e17ca4000ec34737bde002a13435c38ace2682 Author: Paulo Alcantara Date: Tue Apr 2 16:33:54 2024 -0300 smb: client: fix potential UAF in cifs_dump_full_key() commit 58acd1f497162e7d282077f816faa519487be045 upstream. Skip sessions that are being teared down (status == SES_EXITING) to avoid UAF. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit c3cf8b74c57924c0985e49a1fdf02d3395111f39 Author: Paulo Alcantara Date: Tue Apr 2 16:33:56 2024 -0300 smb: client: fix potential UAF in cifs_stats_proc_show() commit 0865ffefea197b437ba78b5dd8d8e256253efd65 upstream. Skip sessions that are being teared down (status == SES_EXITING) to avoid UAF. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit cf03020c56d3ed28c4942280957a007b5e9544f7 Author: Paulo Alcantara Date: Tue Apr 2 16:33:55 2024 -0300 smb: client: fix potential UAF in cifs_stats_proc_write() commit d3da25c5ac84430f89875ca7485a3828150a7e0a upstream. Skip sessions that are being teared down (status == SES_EXITING) to avoid UAF. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit a65f2b56334ba4dc30bd5ee9ce5b2691b973344d Author: Paulo Alcantara Date: Tue Apr 2 16:33:53 2024 -0300 smb: client: fix potential UAF in cifs_debug_files_proc_show() commit ca545b7f0823f19db0f1148d59bc5e1a56634502 upstream. Skip sessions that are being teared down (status == SES_EXITING) to avoid UAF. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 6f17163b9339fac92023a1d9bef22128db3b9a4b Author: Ritvik Budhiraja Date: Tue Apr 2 14:01:28 2024 -0500 smb3: retrying on failed server close commit 173217bd73365867378b5e75a86f0049e1069ee8 upstream. In the current implementation, CIFS close sends a close to the server and does not check for the success of the server close. This patch adds functionality to check for server close return status and retries in case of an EBUSY or EAGAIN error. This can help avoid handle leaks Cc: stable@vger.kernel.org Signed-off-by: Ritvik Budhiraja Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit ba55f8a995f6badf7f3288a900e3ede006bf4cbd Author: Paulo Alcantara Date: Mon Apr 1 22:44:09 2024 -0300 smb: client: serialise cifs_construct_tcon() with cifs_mount_mutex commit 93cee45ccfebc62a3bb4cd622b89e00c8c7d8493 upstream. Serialise cifs_construct_tcon() with cifs_mount_mutex to handle parallel mounts that may end up reusing the session and tcon created by it. Cc: stable@vger.kernel.org # 6.4+ Signed-off-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 9b2ee27e8de59257f4482b94b55cc7b18425f440 Author: Paulo Alcantara Date: Mon Apr 1 22:44:08 2024 -0300 smb: client: handle DFS tcons in cifs_construct_tcon() commit 4a5ba0e0bfe552ac7451f57e304f6343c3d87f89 upstream. The tcons created by cifs_construct_tcon() on multiuser mounts must also be able to failover and refresh DFS referrals, so set the appropriate fields in order to get a full DFS tcon. They could be shared among different superblocks later, too. Cc: stable@vger.kernel.org # 6.4+ Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202404021518.3Xu2VU4s-lkp@intel.com/ Signed-off-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 00effef72c98294edb1efa87ffa0f6cfb61b36a4 Author: Stefan O'Rear Date: Wed Mar 27 02:12:58 2024 -0400 riscv: process: Fix kernel gp leakage commit d14fa1fcf69db9d070e75f1c4425211fa619dfc8 upstream. childregs represents the registers which are active for the new thread in user context. For a kernel thread, childregs->gp is never used since the kernel gp is not touched by switch_to. For a user mode helper, the gp value can be observed in user space after execve or possibly by other means. [From the email thread] The /* Kernel thread */ comment is somewhat inaccurate in that it is also used for user_mode_helper threads, which exec a user process, e.g. /sbin/init or when /proc/sys/kernel/core_pattern is a pipe. Such threads do not have PF_KTHREAD set and are valid targets for ptrace etc. even before they exec. childregs is the *user* context during syscall execution and it is observable from userspace in at least five ways: 1. kernel_execve does not currently clear integer registers, so the starting register state for PID 1 and other user processes started by the kernel has sp = user stack, gp = kernel __global_pointer$, all other integer registers zeroed by the memset in the patch comment. This is a bug in its own right, but I'm unwilling to bet that it is the only way to exploit the issue addressed by this patch. 2. ptrace(PTRACE_GETREGSET): you can PTRACE_ATTACH to a user_mode_helper thread before it execs, but ptrace requires SIGSTOP to be delivered which can only happen at user/kernel boundaries. 3. /proc/*/task/*/syscall: this is perfectly happy to read pt_regs for user_mode_helpers before the exec completes, but gp is not one of the registers it returns. 4. PERF_SAMPLE_REGS_USER: LOCKDOWN_PERF normally prevents access to kernel addresses via PERF_SAMPLE_REGS_INTR, but due to this bug kernel addresses are also exposed via PERF_SAMPLE_REGS_USER which is permitted under LOCKDOWN_PERF. I have not attempted to write exploit code. 5. Much of the tracing infrastructure allows access to user registers. I have not attempted to determine which forms of tracing allow access to user registers without already allowing access to kernel registers. Fixes: 7db91e57a0ac ("RISC-V: Task implementation") Cc: stable@vger.kernel.org Signed-off-by: Stefan O'Rear Reviewed-by: Alexandre Ghiti Link: https://lore.kernel.org/r/20240327061258.2370291-1-sorear@fastmail.com Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit 7a82963245eb5779bac7cc09100f799d4ebea30b Author: Samuel Holland Date: Mon Mar 11 19:19:13 2024 -0700 riscv: Fix spurious errors from __get/put_kernel_nofault commit d080a08b06b6266cc3e0e86c5acfd80db937cb6b upstream. These macros did not initialize __kr_err, so they could fail even if the access did not fault. Cc: stable@vger.kernel.org Fixes: d464118cdc41 ("riscv: implement __get_kernel_nofault and __put_user_nofault") Signed-off-by: Samuel Holland Reviewed-by: Alexandre Ghiti Reviewed-by: Charlie Jenkins Link: https://lore.kernel.org/r/20240312022030.320789-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit 3dcb2223b973887f6360780f17ef99be38c3bbd3 Author: Sumanth Korikkar Date: Tue Mar 26 18:12:13 2024 +0100 s390/entry: align system call table on 8 bytes commit 378ca2d2ad410a1cd5690d06b46c5e2297f4c8c0 upstream. Align system call table on 8 bytes. With sys_call_table entry size of 8 bytes that eliminates the possibility of a system call pointer crossing cache line boundary. Cc: stable@kernel.org Suggested-by: Ulrich Weigand Reviewed-by: Alexander Gordeev Signed-off-by: Sumanth Korikkar Signed-off-by: Vasily Gorbik Signed-off-by: Greg Kroah-Hartman commit 782baf52e7cbddd01543b58c43b019c70ce7681d Author: Edward Liaw Date: Fri Mar 29 18:58:10 2024 +0000 selftests/mm: include strings.h for ffsl commit 176517c9310281d00dd3210ab4cc4d3cdc26b17e upstream. Got a compilation error on Android for ffsl after 91b80cc5b39f ("selftests: mm: fix map_hugetlb failure on 64K page size systems") included vm_util.h. Link: https://lkml.kernel.org/r/20240329185814.16304-1-edliaw@google.com Fixes: af605d26a8f2 ("selftests/mm: merge util.h into vm_util.h") Signed-off-by: Edward Liaw Reviewed-by: Muhammad Usama Anjum Cc: Axel Rasmussen Cc: David Hildenbrand Cc: "Mike Rapoport (IBM)" Cc: Peter Xu Cc: Shuah Khan Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 43fad1d0284de30159661d0badfc3cbaf7e6f8f8 Author: David Hildenbrand Date: Tue Mar 26 15:32:08 2024 +0100 mm/secretmem: fix GUP-fast succeeding on secretmem folios commit 65291dcfcf8936e1b23cfd7718fdfde7cfaf7706 upstream. folio_is_secretmem() currently relies on secretmem folios being LRU folios, to save some cycles. However, folios might reside in a folio batch without the LRU flag set, or temporarily have their LRU flag cleared. Consequently, the LRU flag is unreliable for this purpose. In particular, this is the case when secretmem_fault() allocates a fresh page and calls filemap_add_folio()->folio_add_lru(). The folio might be added to the per-cpu folio batch and won't get the LRU flag set until the batch was drained using e.g., lru_add_drain(). Consequently, folio_is_secretmem() might not detect secretmem folios and GUP-fast can succeed in grabbing a secretmem folio, crashing the kernel when we would later try reading/writing to the folio, because the folio has been unmapped from the directmap. Fix it by removing that unreliable check. Link: https://lkml.kernel.org/r/20240326143210.291116-2-david@redhat.com Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas") Signed-off-by: David Hildenbrand Reported-by: xingwei lee Reported-by: yue sun Closes: https://lore.kernel.org/lkml/CABOYnLyevJeravW=QrH0JUPYEcDN160aZFb7kwndm-J2rmz0HQ@mail.gmail.com/ Debugged-by: Miklos Szeredi Tested-by: Miklos Szeredi Reviewed-by: Mike Rapoport (IBM) Cc: Lorenzo Stoakes Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 8a44119ca4456d2a15037b5b76ed98fa783ce614 Author: Mark Brown Date: Mon Mar 25 16:35:21 2024 +0000 arm64/ptrace: Use saved floating point state type to determine SVE layout commit b017a0cea627fcbe158fc2c214fe893e18c4d0c4 upstream. The SVE register sets have two different formats, one of which is a wrapped version of the standard FPSIMD register set and another with actual SVE register data. At present we check TIF_SVE to see if full SVE register state should be provided when reading the SVE regset but if we were in a syscall we may have saved only floating point registers even though that is set. Fix this and simplify the logic by checking and using the format which we recorded when deciding if we should use FPSIMD or SVE format. Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch") Cc: # 6.2.x Signed-off-by: Mark Brown Link: https://lore.kernel.org/r/20240325-arm64-ptrace-fp-type-v1-1-8dc846caf11f@kernel.org Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit 92f32f108693fefda53e75fca5fbfa7553ef9f41 Author: Kan Liang Date: Mon Apr 1 06:33:20 2024 -0700 perf/x86/intel/ds: Don't clear ->pebs_data_cfg for the last PEBS event commit 312be9fc2234c8acfb8148a9f4c358b70d358dee upstream. The MSR_PEBS_DATA_CFG MSR register is used to configure which data groups should be generated into a PEBS record, and it's shared among all counters. If there are different configurations among counters, perf combines all the configurations. The first perf command as below requires a complete PEBS record (including memory info, GPRs, XMMs, and LBRs). The second perf command only requires a basic group. However, after the second perf command is running, the MSR_PEBS_DATA_CFG register is cleared. Only a basic group is generated in a PEBS record, which is wrong. The required information for the first perf command is missed. $ perf record --intr-regs=AX,SP,XMM0 -a -C 8 -b -W -d -c 100000003 -o /dev/null -e cpu/event=0xd0,umask=0x81/upp & $ sleep 5 $ perf record --per-thread -c 1 -e cycles:pp --no-timestamp --no-tid taskset -c 8 ./noploop 1000 The first PEBS event is a system-wide PEBS event. The second PEBS event is a per-thread event. When the thread is scheduled out, the intel_pmu_pebs_del() function is invoked to update the PEBS state. Since the system-wide event is still available, the cpuc->n_pebs is 1. The cpuc->pebs_data_cfg is cleared. The data configuration for the system-wide PEBS event is lost. The (cpuc->n_pebs == 1) check was introduced in commit: b6a32f023fcc ("perf/x86: Fix PEBS threshold initialization") At that time, it indeed didn't hurt whether the state was updated during the removal, because only the threshold is updated. The calculation of the threshold takes the last PEBS event into account. However, since commit: b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG") we delay the threshold update, and clear the PEBS data config, which triggers the bug. The PEBS data config update scope should not be shrunk during removal. [ mingo: Improved the changelog & comments. ] Fixes: b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG") Reported-by: Stephane Eranian Signed-off-by: Kan Liang Signed-off-by: Ingo Molnar Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240401133320.703971-1-kan.liang@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 453b5f2dec276c1bb4ea078bf8c0da57ee4627e5 Author: Jason A. Donenfeld Date: Tue Mar 26 17:07:35 2024 +0100 x86/coco: Require seeding RNG with RDRAND on CoCo systems commit 99485c4c026f024e7cb82da84c7951dbe3deb584 upstream. There are few uses of CoCo that don't rely on working cryptography and hence a working RNG. Unfortunately, the CoCo threat model means that the VM host cannot be trusted and may actively work against guests to extract secrets or manipulate computation. Since a malicious host can modify or observe nearly all inputs to guests, the only remaining source of entropy for CoCo guests is RDRAND. If RDRAND is broken -- due to CPU hardware fault -- the RNG as a whole is meant to gracefully continue on gathering entropy from other sources, but since there aren't other sources on CoCo, this is catastrophic. This is mostly a concern at boot time when initially seeding the RNG, as after that the consequences of a broken RDRAND are much more theoretical. So, try at boot to seed the RNG using 256 bits of RDRAND output. If this fails, panic(). This will also trigger if the system is booted without RDRAND, as RDRAND is essential for a safe CoCo boot. Add this deliberately to be "just a CoCo x86 driver feature" and not part of the RNG itself. Many device drivers and platforms have some desire to contribute something to the RNG, and add_device_randomness() is specifically meant for this purpose. Any driver can call it with seed data of any quality, or even garbage quality, and it can only possibly make the quality of the RNG better or have no effect, but can never make it worse. Rather than trying to build something into the core of the RNG, consider the particular CoCo issue just a CoCo issue, and therefore separate it all out into driver (well, arch/platform) code. [ bp: Massage commit message. ] Signed-off-by: Jason A. Donenfeld Signed-off-by: Borislav Petkov (AMD) Reviewed-by: Elena Reshetova Reviewed-by: Kirill A. Shutemov Reviewed-by: Theodore Ts'o Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240326160735.73531-1-Jason@zx2c4.com Signed-off-by: Greg Kroah-Hartman commit 5a02df3e92470efd589712925b5c722e730276a0 Author: Borislav Petkov (AMD) Date: Wed Mar 13 14:48:27 2024 +0100 x86/mce: Make sure to grab mce_sysfs_mutex in set_bank() commit 3ddf944b32f88741c303f0b21459dbb3872b8bc5 upstream. Modifying a MCA bank's MCA_CTL bits which control which error types to be reported is done over /sys/devices/system/machinecheck/ ├── machinecheck0 │   ├── bank0 │   ├── bank1 │   ├── bank10 │   ├── bank11 ... sysfs nodes by writing the new bit mask of events to enable. When the write is accepted, the kernel deletes all current timers and reinits all banks. Doing that in parallel can lead to initializing a timer which is already armed and in the timer wheel, i.e., in use already: ODEBUG: init active (active state 0) object: ffff888063a28000 object type: timer_list hint: mce_timer_fn+0x0/0x240 arch/x86/kernel/cpu/mce/core.c:2642 WARNING: CPU: 0 PID: 8120 at lib/debugobjects.c:514 debug_print_object+0x1a0/0x2a0 lib/debugobjects.c:514 Fix that by grabbing the sysfs mutex as the rest of the MCA sysfs code does. Reported by: Yue Sun Reported by: xingwei lee Signed-off-by: Borislav Petkov (AMD) Cc: Link: https://lore.kernel.org/r/CAEkJfYNiENwQY8yV1LYJ9LjJs%2Bx_-PqMv98gKig55=2vbzffRw@mail.gmail.com Signed-off-by: Greg Kroah-Hartman commit 51b7841f3fe84606ec0bd8da859d22e05e5419ec Author: David Hildenbrand Date: Wed Apr 3 23:21:30 2024 +0200 x86/mm/pat: fix VM_PAT handling in COW mappings commit 04c35ab3bdae7fefbd7c7a7355f29fa03a035221 upstream. PAT handling won't do the right thing in COW mappings: the first PTE (or, in fact, all PTEs) can be replaced during write faults to point at anon folios. Reliably recovering the correct PFN and cachemode using follow_phys() from PTEs will not work in COW mappings. Using follow_phys(), we might just get the address+protection of the anon folio (which is very wrong), or fail on swap/nonswap entries, failing follow_phys() and triggering a WARN_ON_ONCE() in untrack_pfn() and track_pfn_copy(), not properly calling free_pfn_range(). In free_pfn_range(), we either wouldn't call memtype_free() or would call it with the wrong range, possibly leaking memory. To fix that, let's update follow_phys() to refuse returning anon folios, and fallback to using the stored PFN inside vma->vm_pgoff for COW mappings if we run into that. We will now properly handle untrack_pfn() with COW mappings, where we don't need the cachemode. We'll have to fail fork()->track_pfn_copy() if the first page was replaced by an anon folio, though: we'd have to store the cachemode in the VMA to make this work, likely growing the VMA size. For now, lets keep it simple and let track_pfn_copy() just fail in that case: it would have failed in the past with swap/nonswap entries already, and it would have done the wrong thing with anon folios. Simple reproducer to trigger the WARN_ON_ONCE() in untrack_pfn(): <--- C reproducer ---> #include #include #include #include int main(void) { struct io_uring_params p = {}; int ring_fd; size_t size; char *map; ring_fd = io_uring_setup(1, &p); if (ring_fd < 0) { perror("io_uring_setup"); return 1; } size = p.sq_off.array + p.sq_entries * sizeof(unsigned); /* Map the submission queue ring MAP_PRIVATE */ map = mmap(0, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, ring_fd, IORING_OFF_SQ_RING); if (map == MAP_FAILED) { perror("mmap"); return 1; } /* We have at least one page. Let's COW it. */ *map = 0; pause(); return 0; } <--- C reproducer ---> On a system with 16 GiB RAM and swap configured: # ./iouring & # memhog 16G # killall iouring [ 301.552930] ------------[ cut here ]------------ [ 301.553285] WARNING: CPU: 7 PID: 1402 at arch/x86/mm/pat/memtype.c:1060 untrack_pfn+0xf4/0x100 [ 301.553989] Modules linked in: binfmt_misc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_g [ 301.558232] CPU: 7 PID: 1402 Comm: iouring Not tainted 6.7.5-100.fc38.x86_64 #1 [ 301.558772] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebu4 [ 301.559569] RIP: 0010:untrack_pfn+0xf4/0x100 [ 301.559893] Code: 75 c4 eb cf 48 8b 43 10 8b a8 e8 00 00 00 3b 6b 28 74 b8 48 8b 7b 30 e8 ea 1a f7 000 [ 301.561189] RSP: 0018:ffffba2c0377fab8 EFLAGS: 00010282 [ 301.561590] RAX: 00000000ffffffea RBX: ffff9208c8ce9cc0 RCX: 000000010455e047 [ 301.562105] RDX: 07fffffff0eb1e0a RSI: 0000000000000000 RDI: ffff9208c391d200 [ 301.562628] RBP: 0000000000000000 R08: ffffba2c0377fab8 R09: 0000000000000000 [ 301.563145] R10: ffff9208d2292d50 R11: 0000000000000002 R12: 00007fea890e0000 [ 301.563669] R13: 0000000000000000 R14: ffffba2c0377fc08 R15: 0000000000000000 [ 301.564186] FS: 0000000000000000(0000) GS:ffff920c2fbc0000(0000) knlGS:0000000000000000 [ 301.564773] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 301.565197] CR2: 00007fea88ee8a20 CR3: 00000001033a8000 CR4: 0000000000750ef0 [ 301.565725] PKRU: 55555554 [ 301.565944] Call Trace: [ 301.566148] [ 301.566325] ? untrack_pfn+0xf4/0x100 [ 301.566618] ? __warn+0x81/0x130 [ 301.566876] ? untrack_pfn+0xf4/0x100 [ 301.567163] ? report_bug+0x171/0x1a0 [ 301.567466] ? handle_bug+0x3c/0x80 [ 301.567743] ? exc_invalid_op+0x17/0x70 [ 301.568038] ? asm_exc_invalid_op+0x1a/0x20 [ 301.568363] ? untrack_pfn+0xf4/0x100 [ 301.568660] ? untrack_pfn+0x65/0x100 [ 301.568947] unmap_single_vma+0xa6/0xe0 [ 301.569247] unmap_vmas+0xb5/0x190 [ 301.569532] exit_mmap+0xec/0x340 [ 301.569801] __mmput+0x3e/0x130 [ 301.570051] do_exit+0x305/0xaf0 ... Link: https://lkml.kernel.org/r/20240403212131.929421-3-david@redhat.com Signed-off-by: David Hildenbrand Reported-by: Wupeng Ma Closes: https://lkml.kernel.org/r/20240227122814.3781907-1-mawupeng1@huawei.com Fixes: b1a86e15dc03 ("x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines") Fixes: 5899329b1910 ("x86: PAT: implement track/untrack of pfnmap regions for x86 - v3") Acked-by: Ingo Molnar Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Borislav Petkov Cc: "H. Peter Anvin" Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 801c8b8ec5bfb3519566dff16a5ecd48302fca82 Author: Herve Codina Date: Mon Mar 25 16:21:26 2024 +0100 of: dynamic: Synchronize of_changeset_destroy() with the devlink removals commit 8917e7385346bd6584890ed362985c219fe6ae84 upstream. In the following sequence: 1) of_platform_depopulate() 2) of_overlay_remove() During the step 1, devices are destroyed and devlinks are removed. During the step 2, OF nodes are destroyed but __of_changeset_entry_destroy() can raise warnings related to missing of_node_put(): ERROR: memory leak, expected refcount 1 instead of 2 ... Indeed, during the devlink removals performed at step 1, the removal itself releasing the device (and the attached of_node) is done by a job queued in a workqueue and so, it is done asynchronously with respect to function calls. When the warning is present, of_node_put() will be called but wrongly too late from the workqueue job. In order to be sure that any ongoing devlink removals are done before the of_node destruction, synchronize the of_changeset_destroy() with the devlink removals. Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal") Cc: stable@vger.kernel.org Signed-off-by: Herve Codina Reviewed-by: Saravana Kannan Tested-by: Luca Ceresoli Reviewed-by: Nuno Sa Link: https://lore.kernel.org/r/20240325152140.198219-3-herve.codina@bootlin.com Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit dfa65572768868d0a0414ac77cfcb949da8e4a9e Author: Herve Codina Date: Mon Mar 25 16:21:25 2024 +0100 driver core: Introduce device_link_wait_removal() commit 0462c56c290a99a7f03e817ae5b843116dfb575c upstream. The commit 80dd33cf72d1 ("drivers: base: Fix device link removal") introduces a workqueue to release the consumer and supplier devices used in the devlink. In the job queued, devices are release and in turn, when all the references to these devices are dropped, the release function of the device itself is called. Nothing is present to provide some synchronisation with this workqueue in order to ensure that all ongoing releasing operations are done and so, some other operations can be started safely. For instance, in the following sequence: 1) of_platform_depopulate() 2) of_overlay_remove() During the step 1, devices are released and related devlinks are removed (jobs pushed in the workqueue). During the step 2, OF nodes are destroyed but, without any synchronisation with devlink removal jobs, of_overlay_remove() can raise warnings related to missing of_node_put(): ERROR: memory leak, expected refcount 1 instead of 2 Indeed, the missing of_node_put() call is going to be done, too late, from the workqueue job execution. Introduce device_link_wait_removal() to offer a way to synchronize operations waiting for the end of devlink removals (i.e. end of workqueue jobs). Also, as a flushing operation is done on the workqueue, the workqueue used is moved from a system-wide workqueue to a local one. Cc: stable@vger.kernel.org Signed-off-by: Herve Codina Tested-by: Luca Ceresoli Reviewed-by: Nuno Sa Reviewed-by: Saravana Kannan Acked-by: Greg Kroah-Hartman Link: https://lore.kernel.org/r/20240325152140.198219-2-herve.codina@bootlin.com Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit 65938e81df2197203bda4b9a0c477e7987218d66 Author: Jens Axboe Date: Tue Apr 2 16:16:03 2024 -0600 io_uring/kbuf: hold io_buffer_list reference over mmap commit 561e4f9451d65fc2f7eef564e0064373e3019793 upstream. If we look up the kbuf, ensure that it doesn't get unregistered until after we're done with it. Since we're inside mmap, we cannot safely use the io_uring lock. Rely on the fact that we can lookup the buffer list under RCU now and grab a reference to it, preventing it from being unregistered until we're done with it. The lookup returns the io_buffer_list directly with it referenced. Cc: stable@vger.kernel.org # v6.4+ Fixes: 5cf4f52e6d8a ("io_uring: free io_buffer_list entries via RCU") Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 6b9d49bcd97bfe2eed9ee69014fd977ed0d6b27d Author: Jens Axboe Date: Mon Apr 1 15:16:19 2024 -0600 io_uring: use private workqueue for exit work commit 73eaa2b583493b680c6f426531d6736c39643bfb upstream. Rather than use the system unbound event workqueue, use an io_uring specific one. This avoids dependencies with the tty, which also uses the system_unbound_wq, and issues flushes of said workqueue from inside its poll handling. Cc: stable@vger.kernel.org Reported-by: Rasmus Karlsson Tested-by: Rasmus Karlsson Tested-by: Iskren Chernev Link: https://github.com/axboe/liburing/issues/1113 Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit b392402d29ab50bfe2e958c71d66e1bc1ef15be4 Author: Jens Axboe Date: Fri Mar 15 16:12:51 2024 -0600 io_uring/kbuf: protect io_buffer_list teardown with a reference commit 6b69c4ab4f685327d9e10caf0d84217ba23a8c4b upstream. No functional changes in this patch, just in preparation for being able to keep the buffer list alive outside of the ctx->uring_lock. Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 4c0a5da0e70e7055311a8a949b9759811fa7656d Author: Jens Axboe Date: Thu Mar 14 10:46:40 2024 -0600 io_uring/kbuf: get rid of bl->is_ready commit 3b80cff5a4d117c53d38ce805823084eaeffbde6 upstream. Now that xarray is being exclusively used for the buffer_list lookup, this check is no longer needed. Get rid of it and the is_ready member. Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit d6e03f6d8bcce48c181a4177ec19a9ddb8b20e67 Author: Jens Axboe Date: Thu Mar 14 10:45:07 2024 -0600 io_uring/kbuf: get rid of lower BGID lists commit 09ab7eff38202159271534d2f5ad45526168f2a5 upstream. Just rely on the xarray for any kind of bgid. This simplifies things, and it really doesn't bring us much, if anything. Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 781477d729be26cd0e42b87084ef886ca5988690 Author: I Gede Agastya Darma Laksana Date: Tue Apr 2 00:46:02 2024 +0700 ALSA: hda/realtek: Update Panasonic CF-SZ6 quirk to support headset with microphone commit 1576f263ee2147dc395531476881058609ad3d38 upstream. This patch addresses an issue with the Panasonic CF-SZ6's existing quirk, specifically its headset microphone functionality. Previously, the quirk used ALC269_FIXUP_HEADSET_MODE, which does not support the CF-SZ6's design of a single 3.5mm jack for both mic and audio output effectively. The device uses pin 0x19 for the headset mic without jack detection. Following verification on the CF-SZ6 and discussions with the original patch author, i determined that the update to ALC269_FIXUP_ASPIRE_HEADSET_MIC is the appropriate solution. This change is custom-designed for the CF-SZ6's unique hardware setup, which includes a single 3.5mm jack for both mic and audio output, connecting the headset microphone to pin 0x19 without the use of jack detection. Fixes: 0fca97a29b83 ("ALSA: hda/realtek - Add Panasonic CF-SZ6 headset jack quirk") Signed-off-by: I Gede Agastya Darma Laksana Cc: Message-ID: <20240401174602.14133-1-gedeagas22@gmail.com> Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 04d78aa05ae4601e35d935da4545a2b242ba44e0 Author: Christoffer Sandberg Date: Thu Mar 28 11:27:57 2024 +0100 ALSA: hda/realtek - Fix inactive headset mic jack commit daf6c4681a74034d5723e2fb761e0d7f3a1ca18f upstream. This patch adds the existing fixup to certain TF platforms implementing the ALC274 codec with a headset jack. It fixes/activates the inactive microphone of the headset. Signed-off-by: Christoffer Sandberg Signed-off-by: Werner Sembach Cc: Message-ID: <20240328102757.50310-1-wse@tuxedocomputers.com> Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 67c477f3201c170ac6cc38219be3253c21928cf9 Author: Namjae Jeon Date: Tue Apr 2 09:31:22 2024 +0900 ksmbd: do not set SMB2_GLOBAL_CAP_ENCRYPTION for SMB 3.1.1 commit 5ed11af19e56f0434ce0959376d136005745a936 upstream. SMB2_GLOBAL_CAP_ENCRYPTION flag should be used only for 3.0 and 3.0.2 dialects. This flags set cause compatibility problems with other SMB clients. Reported-by: James Christopher Adduono Tested-by: James Christopher Adduono Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit a637fabac554270a851033f5ab402ecb90bc479c Author: Namjae Jeon Date: Sun Mar 31 21:59:10 2024 +0900 ksmbd: validate payload size in ipc response commit a677ebd8ca2f2632ccdecbad7b87641274e15aac upstream. If installing malicious ksmbd-tools, ksmbd.mountd can return invalid ipc response to ksmbd kernel server. ksmbd should validate payload size of ipc response from ksmbd.mountd to avoid memory overrun or slab-out-of-bounds. This patch validate 3 ipc response that has payload. Cc: stable@vger.kernel.org Reported-by: Chao Ma Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit a06562fd4ce2a7b82bc12962553453396017b3a9 Author: Namjae Jeon Date: Sun Mar 31 21:58:26 2024 +0900 ksmbd: don't send oplock break if rename fails commit c1832f67035dc04fb89e6b591b64e4d515843cda upstream. Don't send oplock break if rename fails. This patch fix smb2.oplock.batch20 test. Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 2f0262ac3a8cd27811e88790f0aadae866047aea Author: Kent Gibson Date: Thu Apr 4 11:33:28 2024 +0200 gpio: cdev: fix missed label sanitizing in debounce_setup() commit 83092341e15d0dfee1caa8dc502f66c815ccd78a upstream. When adding sanitization of the label, the path through edge_detector_setup() that leads to debounce_setup() was overlooked. A request taking this path does not allocate a new label and the request label is freed twice when the request is released, resulting in memory corruption. Add label sanitization to debounce_setup(). Cc: stable@vger.kernel.org Fixes: b34490879baa ("gpio: cdev: sanitize the label before requesting the interrupt") Signed-off-by: Kent Gibson [Bartosz: rebased on top of the fix for empty GPIO labels] Co-developed-by: Bartosz Golaszewski Signed-off-by: Bartosz Golaszewski Signed-off-by: Greg Kroah-Hartman commit d9f0804ab0b888e809ead50f0ae5da0b4cfc07b1 Author: Bartosz Golaszewski Date: Thu Apr 4 11:33:27 2024 +0200 gpio: cdev: check for NULL labels when sanitizing them for irqs commit b3b95964590a3d756d69ea8604c856de805479ad upstream. We need to take into account that a line's consumer label may be NULL and not try to kstrdup() it in that case but rather pass the NULL pointer up the stack to the interrupt request function. To that end: let make_irq_label() return NULL as a valid return value and use ERR_PTR() instead to signal an allocation failure to callers. Cc: stable@vger.kernel.org Fixes: b34490879baa ("gpio: cdev: sanitize the label before requesting the interrupt") Reported-by: Linux Kernel Functional Testing Closes: https://lore.kernel.org/lkml/20240402093534.212283-1-naresh.kamboju@linaro.org/ Signed-off-by: Bartosz Golaszewski Tested-by: Anders Roxell Signed-off-by: Greg Kroah-Hartman commit 63bd08629aeeeefa5ee728d08be327e756051c72 Author: Borislav Petkov (AMD) Date: Fri Apr 5 16:46:37 2024 +0200 x86/retpoline: Add NOENDBR annotation to the SRSO dummy return thunk commit b377c66ae3509ccea596512d6afb4777711c4870 upstream. srso_alias_untrain_ret() is special code, even if it is a dummy which is called in the !SRSO case, so annotate it like its real counterpart, to address the following objtool splat: vmlinux.o: warning: objtool: .export_symbol+0x2b290: data relocation to !ENDBR: srso_alias_untrain_ret+0x0 Fixes: 4535e1a4174c ("x86/bugs: Fix the SRSO mitigation on Zen3/4") Signed-off-by: Borislav Petkov (AMD) Signed-off-by: Ingo Molnar Cc: Linus Torvalds Link: https://lore.kernel.org/r/20240405144637.17908-1-bp@kernel.org Signed-off-by: Greg Kroah-Hartman commit ac522af8db5c024eac693584778f031d1f2bdcc9 Author: Jesse Brandeburg Date: Mon Mar 4 16:37:07 2024 -0800 ice: fix typo in assignment commit 6c5b6ca7642f2992502a22dbd8b80927de174b67 upstream. Fix an obviously incorrect assignment, created with a typo or cut-n-paste error. Fixes: 5995ef88e3a8 ("ice: realloc VSI stats arrays") Signed-off-by: Jesse Brandeburg Reviewed-by: Simon Horman Reviewed-by: Paul Menzel Signed-off-by: Tony Nguyen Signed-off-by: Greg Kroah-Hartman commit 9d60e8ec996ff84e7f5c08c51e58d9a465d7f1bd Author: Jeff Layton Date: Fri Apr 5 13:56:18 2024 -0400 nfsd: hold a lighter-weight client reference over CB_RECALL_ANY [ Upstream commit 10396f4df8b75ff6ab0aa2cd74296565466f2c8d ] Currently the CB_RECALL_ANY job takes a cl_rpc_users reference to the client. While a callback job is technically an RPC that counter is really more for client-driven RPCs, and this has the effect of preventing the client from being unhashed until the callback completes. If nfsd decides to send a CB_RECALL_ANY just as the client reboots, we can end up in a situation where the callback can't complete on the (now dead) callback channel, but the new client can't connect because the old client can't be unhashed. This usually manifests as a NFS4ERR_DELAY return on the CREATE_SESSION operation. The job is only holding a reference to the client so it can clear a flag after the RPC completes. Fix this by having CB_RECALL_ANY instead hold a reference to the cl_nfsdfs.cl_ref. Typically we only take that sort of reference when dealing with the nfsdfs info files, but it should work appropriately here to ensure that the nfs4_client doesn't disappear. Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition") Reported-by: Vladimir Benes Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 6e307a6d9eb4d38298b7c9f4160e975e91e832b0 Author: Alexandre Ghiti Date: Tue Mar 26 21:30:17 2024 +0100 riscv: Disable preemption when using patch_map() [ Upstream commit a370c2419e4680a27382d9231edcf739d5d74efc ] patch_map() uses fixmap mappings to circumvent the non-writability of the kernel text mapping. The __set_fixmap() function only flushes the current cpu tlb, it does not emit an IPI so we must make sure that while we use a fixmap mapping, the current task is not migrated on another cpu which could miss the newly introduced fixmap mapping. So in order to avoid any task migration, disable the preemption. Reported-by: Andrea Parri Closes: https://lore.kernel.org/all/ZcS+GAaM25LXsBOl@andrea/ Reported-by: Andy Chiu Closes: https://lore.kernel.org/linux-riscv/CABgGipUMz3Sffu-CkmeUB1dKVwVQ73+7=sgC45-m0AE9RCjOZg@mail.gmail.com/ Fixes: cad539baa48f ("riscv: implement a memset like function for text") Fixes: 0ff7c3b33127 ("riscv: Use text_mutex instead of patch_lock") Co-developed-by: Andy Chiu Signed-off-by: Andy Chiu Signed-off-by: Alexandre Ghiti Acked-by: Puranjay Mohan Link: https://lore.kernel.org/r/20240326203017.310422-3-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt Signed-off-by: Sasha Levin commit 1ba1291172f935e6b6fe703161a948f3347400b8 Author: Chuck Lever Date: Wed Apr 3 10:36:25 2024 -0400 SUNRPC: Fix a slow server-side memory leak with RPC-over-TCP [ Upstream commit 05258a0a69b3c5d2c003f818702c0a52b6fea861 ] Jan Schunk reports that his small NFS servers suffer from memory exhaustion after just a few days. A bisect shows that commit e18e157bb5c8 ("SUNRPC: Send RPC message on TCP with a single sock_sendmsg() call") is the first bad commit. That commit assumed that sock_sendmsg() releases all the pages in the underlying bio_vec array, but the reality is that it doesn't. svc_xprt_release() releases the rqst's response pages, but the record marker page fragment isn't one of those, so it is never released. This is a narrow fix that can be applied to stable kernels. A more extensive fix is in the works. Reported-by: Jan Schunk Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218671 Fixes: e18e157bb5c8 ("SUNRPC: Send RPC message on TCP with a single sock_sendmsg() call") Cc: Alexander Duyck Cc: Jakub Kacinski Cc: David Howells Reviewed-by: David Howells Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e12149dd9ba29a3acc55ab13d58ef10beacfdd60 Author: Vijendar Mukunda Date: Thu Apr 4 09:47:13 2024 +0530 ASoC: SOF: amd: fix for false dsp interrupts [ Upstream commit b9846a386734e73a1414950ebfd50f04919f5e24 ] Before ACP firmware loading, DSP interrupts are not expected. Sometimes after reboot, it's observed that before ACP firmware is loaded false DSP interrupt is reported. Registering the interrupt handler before acp initialization causing false interrupts sometimes on reboot as ACP reset is not applied. Correct the sequence by invoking acp initialization sequence prior to registering interrupt handler. Fixes: 738a2b5e2cc9 ("ASoC: SOF: amd: Add IPC support for ACP IP block") Signed-off-by: Vijendar Mukunda Link: https://msgid.link/r/20240404041717.430545-1-Vijendar.Mukunda@amd.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit cbd080c308348ffac455e2833da2ee4f7fc59d5a Author: Arnd Bergmann Date: Wed Apr 3 10:06:48 2024 +0200 ata: sata_mv: Fix PCI device ID table declaration compilation warning [ Upstream commit 3137b83a90646917c90951d66489db466b4ae106 ] Building with W=1 shows a warning for an unused variable when CONFIG_PCI is diabled: drivers/ata/sata_mv.c:790:35: error: unused variable 'mv_pci_tbl' [-Werror,-Wunused-const-variable] static const struct pci_device_id mv_pci_tbl[] = { Move the table into the same block that containsn the pci_driver definition. Fixes: 7bb3c5290ca0 ("sata_mv: Remove PCI dependency") Signed-off-by: Arnd Bergmann Signed-off-by: Damien Le Moal Signed-off-by: Sasha Levin commit 4b31a226097cf8cc3c9de5e855d97757fdb2bf06 Author: Huai-Yuan Liu Date: Wed Apr 3 09:42:21 2024 +0800 spi: mchp-pci1xxx: Fix a possible null pointer dereference in pci1xxx_spi_probe [ Upstream commit 1f886a7bfb3faf4c1021e73f045538008ce7634e ] In function pci1xxxx_spi_probe, there is a potential null pointer that may be caused by a failed memory allocation by the function devm_kzalloc. Hence, a null pointer check needs to be added to prevent null pointer dereferencing later in the code. To fix this issue, spi_bus->spi_int[iter] should be checked. The memory allocated by devm_kzalloc will be automatically released, so just directly return -ENOMEM without worrying about memory leaks. Fixes: 1cc0cbea7167 ("spi: microchip: pci1xxxx: Add driver for SPI controller of PCI1XXXX PCIe switch") Signed-off-by: Huai-Yuan Liu Link: https://msgid.link/r/20240403014221.969801-1-qq810974084@gmail.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 0fdada1ef5b10ea73bf618d0498723c7860b5be9 Author: David Howells Date: Tue Apr 2 10:11:35 2024 +0100 cifs: Fix caching to try to do open O_WRONLY as rdwr on server [ Upstream commit e9e62243a3e2322cf639f653a0b0a88a76446ce7 ] When we're engaged in local caching of a cifs filesystem, we cannot perform caching of a partially written cache granule unless we can read the rest of the granule. This can result in unexpected access errors being reported to the user. Fix this by the following: if a file is opened O_WRONLY locally, but the mount was given the "-o fsc" flag, try first opening the remote file with GENERIC_READ|GENERIC_WRITE and if that returns -EACCES, try dropping the GENERIC_READ and doing the open again. If that last succeeds, invalidate the cache for that file as for O_DIRECT. Fixes: 70431bfd825d ("cifs: Support fscache indexing rewrite") Signed-off-by: David Howells cc: Steve French cc: Shyam Prasad N cc: Rohith Surabattula cc: Jeff Layton cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 0f28afed9f9d71a3f97df94b9691ac49f49ec887 Author: Oswald Buddenhagen Date: Mon Apr 1 16:58:05 2024 +0200 Revert "ALSA: emu10k1: fix synthesizer sample playback position and caching" [ Upstream commit 03f56ed4ead162551ac596c9e3076ff01f1c5836 ] As already anticipated in the original commit, playback was broken for very short samples. I just didn't expect it to be an actual problem, because we're talking about less than 1.5 milliseconds here. But clearly such wavetable samples do actually exist. The problem was that for such short samples we'd set the current position beyond the end of the loop, so we'd run off the end of the sample and play garbage. This is a bigger (more audible) problem than the original one, which was that we'd start playback with garbage (whatever was still in the cache), which would be mostly masked by the note's attack phase. So revert to the old behavior for now. We'll subsequently fix it properly with a bigger patch series. Note that this isn't a full revert - the dead code is not re-introduced, because that would be silly. Fixes: df335e9a8bcb ("ALSA: emu10k1: fix synthesizer sample playback position and caching") Link: https://bugzilla.kernel.org/show_bug.cgi?id=218625 Signed-off-by: Oswald Buddenhagen Message-ID: <20240401145805.528794-1-oswald.buddenhagen@gmx.de> Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit f3e692c8c24a7161667413c2f75236d616317ae2 Author: Li Nan Date: Fri Dec 8 16:23:35 2023 +0800 scsi: sd: Unregister device if device_add_disk() failed in sd_probe() [ Upstream commit 0296bea01cfa6526be6bd2d16dc83b4e7f1af91f ] "if device_add() succeeds, you should call device_del() when you want to get rid of it." In sd_probe(), device_add_disk() fails when device_add() has already succeeded, so change put_device() to device_unregister() to ensure device resources are released. Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()") Signed-off-by: Li Nan Link: https://lore.kernel.org/r/20231208082335.1754205-1-linan666@huaweicloud.com Reviewed-by: Bart Van Assche Reviewed-by: Yu Kuai Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 56de23eac65fd54666316761c75d3cb8c8f8662d Author: Arnd Bergmann Date: Tue Mar 26 23:38:06 2024 +0100 scsi: mylex: Fix sysfs buffer lengths [ Upstream commit 1197c5b2099f716b3de327437fb50900a0b936c9 ] The myrb and myrs drivers use an odd way of implementing their sysfs files, calling snprintf() with a fixed length of 32 bytes to print into a page sized buffer. One of the strings is actually longer than 32 bytes, which clang can warn about: drivers/scsi/myrb.c:1906:10: error: 'snprintf' will always be truncated; specified size is 32, but format string expands to at least 34 [-Werror,-Wformat-truncation] drivers/scsi/myrs.c:1089:10: error: 'snprintf' will always be truncated; specified size is 32, but format string expands to at least 34 [-Werror,-Wformat-truncation] These could all be plain sprintf() without a length as the buffer is always long enough. On the other hand, sysfs files should not be overly long either, so just double the length to make sure the longest strings don't get truncated here. Fixes: 77266186397c ("scsi: myrs: Add Mylex RAID controller (SCSI interface)") Fixes: 081ff398c56c ("scsi: myrb: Add Mylex RAID controller (block interface)") Signed-off-by: Arnd Bergmann Link: https://lore.kernel.org/r/20240326223825.4084412-8-arnd@kernel.org Reviewed-by: Hannes Reinecke Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 4cad40d93665abf87fb769e29bdaa783f85128d8 Author: Arnd Bergmann Date: Tue Mar 26 15:53:37 2024 +0100 ata: sata_sx4: fix pdc20621_get_from_dimm() on 64-bit [ Upstream commit 52f80bb181a9a1530ade30bc18991900bbb9697f ] gcc warns about a memcpy() with overlapping pointers because of an incorrect size calculation: In file included from include/linux/string.h:369, from drivers/ata/sata_sx4.c:66: In function 'memcpy_fromio', inlined from 'pdc20621_get_from_dimm.constprop' at drivers/ata/sata_sx4.c:962:2: include/linux/fortify-string.h:97:33: error: '__builtin_memcpy' accessing 4294934464 bytes at offsets 0 and [16, 16400] overlaps 6442385281 bytes at offset -2147450817 [-Werror=restrict] 97 | #define __underlying_memcpy __builtin_memcpy | ^ include/linux/fortify-string.h:620:9: note: in expansion of macro '__underlying_memcpy' 620 | __underlying_##op(p, q, __fortify_size); \ | ^~~~~~~~~~~~~ include/linux/fortify-string.h:665:26: note: in expansion of macro '__fortify_memcpy_chk' 665 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ | ^~~~~~~~~~~~~~~~~~~~ include/asm-generic/io.h:1184:9: note: in expansion of macro 'memcpy' 1184 | memcpy(buffer, __io_virt(addr), size); | ^~~~~~ The problem here is the overflow of an unsigned 32-bit number to a negative that gets converted into a signed 'long', keeping a large positive number. Replace the complex calculation with a more readable min() variant that avoids the warning. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Arnd Bergmann Signed-off-by: Damien Le Moal Signed-off-by: Sasha Levin commit fce7a547b9c88a45d7481c0875f38a779c39d778 Author: Richard Fitzgerald Date: Fri Mar 29 14:46:30 2024 +0000 regmap: maple: Fix uninitialized symbol 'ret' warnings [ Upstream commit eaa03486d932572dfd1c5f64f9dfebe572ad88c0 ] Fix warnings reported by smatch by initializing local 'ret' variable to 0. drivers/base/regmap/regcache-maple.c:186 regcache_maple_drop() error: uninitialized symbol 'ret'. drivers/base/regmap/regcache-maple.c:290 regcache_maple_sync() error: uninitialized symbol 'ret'. Signed-off-by: Richard Fitzgerald Fixes: f033c26de5a5 ("regmap: Add maple tree based register cache") Link: https://lore.kernel.org/r/20240329144630.1965159-1-rf@opensource.cirrus.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 04b52388c46dc87c46594fb18a6cd97f201ec2fb Author: Vijendar Mukunda Date: Fri Mar 29 11:08:12 2024 +0530 ASoC: amd: acp: fix for acp_init function error handling [ Upstream commit 2c603a4947a1247102ccb008d5eb6f37a4043c98 ] If acp_init() fails, acp pci driver probe should return error. Add acp_init() function return value check logic. Fixes: e61b415515d3 ("ASoC: amd: acp: refactor the acp init and de-init sequence") Signed-off-by: Vijendar Mukunda Link: https://lore.kernel.org/r/20240329053815.2373979-1-Vijendar.Mukunda@amd.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 3d3e148c7576a04d9f6e0899428ff78ceafd4344 Author: Jaewon Kim Date: Fri Mar 29 17:58:40 2024 +0900 spi: s3c64xx: Use DMA mode from fifo size [ Upstream commit a3d3eab627bbbb0cb175910cf8d0f7022628a642 ] If the SPI data size is smaller than FIFO, it operates in PIO mode, and if it is larger than FIFO size, it oerates in DMA mode. If the SPI data size is equal to fifo, it operates in PIO mode and it is separated to 2 transfers. To prevent it, it must operate in DMA mode from the case where the data size and the fifo size are the same. Fixes: 1ee806718d5e ("spi: s3c64xx: support interrupt based pio mode") Signed-off-by: Jaewon Kim Reviewed-by: Sam Protsenko Link: https://lore.kernel.org/r/20240329085840.65856-1-jaewon02.kim@samsung.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 5448a99c809654be994e4c8ee5665341449ab790 Author: Tudor Ambarus Date: Fri Feb 16 07:05:47 2024 +0000 spi: s3c64xx: determine the fifo depth only once [ Upstream commit c6e776ab6abdfce5a1edcde7a22c639e76499939 ] Determine the FIFO depth only once, at probe time. ``sdd->fifo_depth`` can be set later on with the FIFO depth specified in the device tree. Signed-off-by: Tudor Ambarus Link: https://msgid.link/r/20240216070555.2483977-5-tudor.ambarus@linaro.org Signed-off-by: Mark Brown Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size") Signed-off-by: Sasha Levin commit f8a6edd44903a1eb7ca9a1a35feaee3219b932d1 Author: Tudor Ambarus Date: Fri Feb 16 07:05:46 2024 +0000 spi: s3c64xx: allow full FIFO masks [ Upstream commit d6911cf27e5c8491cbfedd4ae2d1ee74a3e685b4 ] The driver is wrong because is using partial register field masks for the SPI_STATUS.{RX, TX}_FIFO_LVL register fields. We see s3c64xx_spi_port_config.fifo_lvl_mask with different values for different instances of the same IP. Take s5pv210_spi_port_config for example, it defines: .fifo_lvl_mask = { 0x1ff, 0x7F }, fifo_lvl_mask is used to determine the FIFO depth of the instance of the IP. In this case, the integrator uses a 256 bytes FIFO for the first SPI instance of the IP, and a 64 bytes FIFO for the second instance. While the first mask reflects the SPI_STATUS.{RX, TX}_FIFO_LVL register fields, the second one is two bits short. Using partial field masks is misleading and can hide problems of the driver's logic. Allow platforms to specify the full FIFO mask, regardless of the FIFO depth. Introduce {rx, tx}_fifomask to represent the SPI_STATUS.{RX, TX}_FIFO_LVL register fields. It's a shifted mask defining the field's length and position. We'll be able to deprecate the use of @rx_lvl_offset, as the shift value can be determined from the mask. The existing compatibles shall start using {rx, tx}_fifomask so that they use the full field mask and to avoid shifting the mask to position, and then shifting it back to zero in the {TX, RX}_FIFO_LVL macros. @rx_lvl_offset will be deprecated in a further patch, after we have the infrastructure to deprecate @fifo_lvl_mask as well. No functional change intended. Signed-off-by: Tudor Ambarus Link: https://msgid.link/r/20240216070555.2483977-4-tudor.ambarus@linaro.org Signed-off-by: Mark Brown Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size") Signed-off-by: Sasha Levin commit 6f9d907bee2a7bf4f40cec754846d421071e83f4 Author: Tudor Ambarus Date: Fri Feb 16 07:05:45 2024 +0000 spi: s3c64xx: define a magic value [ Upstream commit ff8faa8a5c0f4c2da797cd22a163ee3cc8823b13 ] Define a magic value, it will be used in the next patch as well. Signed-off-by: Tudor Ambarus Link: https://msgid.link/r/20240216070555.2483977-3-tudor.ambarus@linaro.org Signed-off-by: Mark Brown Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size") Signed-off-by: Sasha Levin commit 3fa0085f1052a3fb4353334503afa659cbdb4a97 Author: Tudor Ambarus Date: Wed Feb 7 12:04:22 2024 +0000 spi: s3c64xx: remove else after return [ Upstream commit 9d47e411f4d636519a8d26587928d34cf52c0c1f ] Else case is not needed after a return, remove it. Reviewed-by: Andi Shyti Reviewed-by: Sam Protsenko Signed-off-by: Tudor Ambarus Link: https://lore.kernel.org/r/20240207120431.2766269-9-tudor.ambarus@linaro.org Signed-off-by: Mark Brown Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size") Signed-off-by: Sasha Levin commit 56aeaed8c82287771eced00faa96a057be6355f9 Author: Tudor Ambarus Date: Wed Feb 7 12:04:17 2024 +0000 spi: s3c64xx: explicitly include [ Upstream commit 4568fa574fcef3811a8140702979f076ef0f5bc0 ] The driver uses GENMASK() but does not include . It is good practice to directly include all headers used, it avoids implicit dependencies and spurious breakage if someone rearranges headers and causes the implicit include to vanish. Include the missing header. Reviewed-by: Peter Griffin Signed-off-by: Tudor Ambarus Link: https://lore.kernel.org/r/20240207120431.2766269-4-tudor.ambarus@linaro.org Signed-off-by: Mark Brown Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size") Signed-off-by: Sasha Levin commit 0df4616ef533c85d20436365d1bdd39d468e1adc Author: Tudor Ambarus Date: Wed Feb 7 12:04:15 2024 +0000 spi: s3c64xx: sort headers alphabetically [ Upstream commit a77ce80f63f06d7ae933c332ed77c79136fa69b0 ] Sorting headers alphabetically helps locating duplicates, and makes it easier to figure out where to insert new headers. Reviewed-by: Andi Shyti Reviewed-by: Peter Griffin Signed-off-by: Tudor Ambarus Link: https://lore.kernel.org/r/20240207120431.2766269-2-tudor.ambarus@linaro.org Signed-off-by: Mark Brown Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size") Signed-off-by: Sasha Levin commit bb3ee5fddac125527d3425d65d1c9f12edfc3d27 Author: Sam Protsenko Date: Sat Jan 20 11:00:01 2024 -0600 spi: s3c64xx: Extract FIFO depth calculation to a dedicated macro [ Upstream commit 460efee706c2b6a4daba62ec143fea29c2e7b358 ] Simplify the code by extracting all cases of FIFO depth calculation into a dedicated macro. No functional change. Signed-off-by: Sam Protsenko Reviewed-by: Andi Shyti Link: https://msgid.link/r/20240120170001.3356-1-semen.protsenko@linaro.org Signed-off-by: Mark Brown Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size") Signed-off-by: Sasha Levin commit 80ca762f1bdd423994e5ff9578d52eb8d2088201 Author: Stephen Lee Date: Mon Mar 25 18:01:31 2024 -0700 ASoC: ops: Fix wraparound for mask in snd_soc_get_volsw [ Upstream commit fc563aa900659a850e2ada4af26b9d7a3de6c591 ] In snd_soc_info_volsw(), mask is generated by figuring out the index of the most significant bit set in max and converting the index to a bitmask through bit shift 1. Unintended wraparound occurs when max is an integer value with msb bit set. Since the bit shift value 1 is treated as an integer type, the left shift operation will wraparound and set mask to 0 instead of all 1's. In order to fix this, we type cast 1 as `1ULL` to prevent the wraparound. Fixes: 7077148fb50a ("ASoC: core: Split ops out of soc-core.c") Signed-off-by: Stephen Lee Link: https://msgid.link/r/20240326010131.6211-1-slee08177@gmail.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 229c761b198e58098a27aa074dfca28a8f679a72 Author: Pierre-Louis Bossart Date: Mon Mar 25 17:18:16 2024 -0500 ASoC: rt722-sdca-sdw: fix locking sequence [ Upstream commit adb354bbc231b23d3a05163ce35c1d598512ff64 ] The disable_irq_lock protects the 'disable_irq' value, we need to lock before testing it. Fixes: a0b7c59ac1a9 ("ASoC: rt722-sdca: fix for JD event handling in ClockStop Mode0") Signed-off-by: Pierre-Louis Bossart Reviewed-by: Bard Liao Reviewed-by: Chao Song Link: https://msgid.link/r/20240325221817.206465-6-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 1064108334bb29884b87fa30e40cf8477ee73abe Author: Pierre-Louis Bossart Date: Mon Mar 25 17:18:15 2024 -0500 ASoC: rt712-sdca-sdw: fix locking sequence [ Upstream commit c8b2e5c1b959d100990e4f0cbad38e7d047bb97c ] The disable_irq_lock protects the 'disable_irq' value, we need to lock before testing it. Fixes: 7a8735c1551e ("ASoC: rt712-sdca: fix for JD event handling in ClockStop Mode0") Signed-off-by: Pierre-Louis Bossart Reviewed-by: Bard Liao Reviewed-by: Chao Song Link: https://msgid.link/r/20240325221817.206465-5-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 3bfbc530a65859d8f91cdc5994f038a2dc72c253 Author: Pierre-Louis Bossart Date: Mon Mar 25 17:18:14 2024 -0500 ASoC: rt711-sdw: fix locking sequence [ Upstream commit aae86cfd8790bcc7693a5a0894df58de5cb5128c ] The disable_irq_lock protects the 'disable_irq' value, we need to lock before testing it. Fixes: b69de265bd0e ("ASoC: rt711: fix for JD event handling in ClockStop Mode0") Signed-off-by: Pierre-Louis Bossart Reviewed-by: Bard Liao Reviewed-by: Chao Song Link: https://msgid.link/r/20240325221817.206465-4-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 53c8045621c174192d6ab62e58b4dacaa66a90fd Author: Pierre-Louis Bossart Date: Mon Mar 25 17:18:13 2024 -0500 ASoC: rt711-sdca: fix locking sequence [ Upstream commit ee287771644394d071e6a331951ee8079b64f9a7 ] The disable_irq_lock protects the 'disable_irq' value, we need to lock before testing it. Fixes: 23adeb7056ac ("ASoC: rt711-sdca: fix for JD event handling in ClockStop Mode0") Signed-off-by: Pierre-Louis Bossart Reviewed-by: Bard Liao Reviewed-by: Chao Song Link: https://msgid.link/r/20240325221817.206465-3-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 8eea5ae23babfcab4699fa01c5b5a295ba877874 Author: Pierre-Louis Bossart Date: Mon Mar 25 17:18:12 2024 -0500 ASoC: rt5682-sdw: fix locking sequence [ Upstream commit 310a5caa4e861616a27a83c3e8bda17d65026fa8 ] The disable_irq_lock protects the 'disable_irq' value, we need to lock before testing it. Fixes: 02fb23d72720 ("ASoC: rt5682-sdw: fix for JD event handling in ClockStop Mode0") Signed-off-by: Pierre-Louis Bossart Reviewed-by: Bard Liao Reviewed-by: Chao Song Link: https://msgid.link/r/20240325221817.206465-2-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit cc4d9f0597ee1f1f94323611ae5d7473ddf2a99a Author: Rob Clark Date: Fri Mar 22 14:48:01 2024 -0700 drm/prime: Unbreak virtgpu dma-buf export [ Upstream commit a4ec240f6b7c21cf846d10017c3ce423a0eae92c ] virtgpu "vram" GEM objects do not implement obj->get_sg_table(). But they also don't use drm_gem_map_dma_buf(). In fact they may not even have guest visible pages. But it is perfectly fine to export and share with other virtual devices. Reported-by: Dominik Behr Fixes: 207395da5a97 ("drm/prime: reject DMA-BUF attach when get_sg_table is missing") Signed-off-by: Rob Clark Reviewed-by: Simon Ser Signed-off-by: Simon Ser Link: https://patchwork.freedesktop.org/patch/msgid/20240322214801.319975-1-robdclark@gmail.com Signed-off-by: Sasha Levin commit 692a51bebf4552bdf0a79ccd68d291182a26a569 Author: Dave Airlie Date: Thu Mar 28 12:43:16 2024 +1000 nouveau/uvmm: fix addr/range calcs for remap operations [ Upstream commit be141849ec00ef39935bf169c0f194ac70bf85ce ] dEQP-VK.sparse_resources.image_rebind.2d_array.r64i.128_128_8 was causing a remap operation like the below. op_remap: prev: 0000003fffed0000 00000000000f0000 00000000a5abd18a 0000000000000000 op_remap: next: op_remap: unmap: 0000003fffed0000 0000000000100000 0 op_map: map: 0000003ffffc0000 0000000000010000 000000005b1ba33c 00000000000e0000 This was resulting in an unmap operation from 0x3fffed0000+0xf0000, 0x100000 which was corrupting the pagetables and oopsing the kernel. Fixes the prev + unmap range calcs to use start/end and map back to addr/range. Signed-off-by: Dave Airlie Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI") Cc: Danilo Krummrich Signed-off-by: Danilo Krummrich Link: https://patchwork.freedesktop.org/patch/msgid/20240328024317.2041851-1-airlied@gmail.com Signed-off-by: Sasha Levin commit 9e3941c90e46d6227c8bec46a58148d4879b6c09 Author: Christian Hewitt Date: Fri Mar 22 16:45:25 2024 +0000 drm/panfrost: fix power transition timeout warnings [ Upstream commit 2bd02f5a0bac4bb13e0da18652dc75ba0e4958ec ] Increase the timeout value to prevent system logs on Amlogic boards flooding with power transition warnings: [ 13.047638] panfrost ffe40000.gpu: shader power transition timeout [ 13.048674] panfrost ffe40000.gpu: l2 power transition timeout [ 13.937324] panfrost ffe40000.gpu: shader power transition timeout [ 13.938351] panfrost ffe40000.gpu: l2 power transition timeout ... [39829.506904] panfrost ffe40000.gpu: shader power transition timeout [39829.507938] panfrost ffe40000.gpu: l2 power transition timeout [39949.508369] panfrost ffe40000.gpu: shader power transition timeout [39949.509405] panfrost ffe40000.gpu: l2 power transition timeout The 2000 value has been found through trial and error testing with devices using G52 and G31 GPUs. Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()") Signed-off-by: Christian Hewitt Reviewed-by: Steven Price Reviewed-by: AngeloGioacchino Del Regno Signed-off-by: Steven Price Link: https://patchwork.freedesktop.org/patch/msgid/20240322164525.2617508-1-christianshewitt@gmail.com Signed-off-by: Sasha Levin commit 4930d7a414c1566eed878ae647c363039c484f1a Author: Simon Trimmer Date: Thu Mar 28 12:13:55 2024 +0000 ALSA: hda: cs35l56: Add ACPI device match tables [ Upstream commit 2d0401ee38d43ab0e4cdd02dfc9d402befb2b5c8 ] Adding the ACPI HIDs to the match table triggers the cs35l56-hda modules to be loaded on boot so that Serial Multi Instantiate can add the devices to the bus and begin the driver init sequence. Signed-off-by: Simon Trimmer Fixes: 73cfbfa9caea ("ALSA: hda/cs35l56: Add driver for Cirrus Logic CS35L56 amplifier") Message-ID: <20240328121355.18972-1-simont@opensource.cirrus.com> Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 3af6c5ac72dc5b721058132a0a1d7779e443175e Author: Richard Fitzgerald Date: Wed Mar 27 11:44:06 2024 +0000 regmap: maple: Fix cache corruption in regcache_maple_drop() [ Upstream commit 00bb549d7d63a21532e76e4a334d7807a54d9f31 ] When keeping the upper end of a cache block entry, the entry[] array must be indexed by the offset from the base register of the block, i.e. max - mas.index. The code was indexing entry[] by only the register address, leading to an out-of-bounds access that copied some part of the kernel memory over the cache contents. This bug was not detected by the regmap KUnit test because it only tests with a block of registers starting at 0, so mas.index == 0. Signed-off-by: Richard Fitzgerald Fixes: f033c26de5a5 ("regmap: Add maple tree based register cache") Link: https://msgid.link/r/20240327114406.976986-1-rf@opensource.cirrus.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 4e73748d5954a56420e549a87e3f78df62efd6f7 Author: Victor Isaev Date: Fri Dec 15 23:27:20 2023 -0500 RISC-V: Update AT_VECTOR_SIZE_ARCH for new AT_MINSIGSTKSZ [ Upstream commit 13dddf9319808badd2c1f5d7007b4e82838a648e ] "riscv: signal: Report signal frame size to userspace via auxv" (e92f469) has added new constant AT_MINSIGSTKSZ but failed to increment the size of auxv, keeping AT_VECTOR_SIZE_ARCH at 9. This fix correctly increments AT_VECTOR_SIZE_ARCH to 10, following the approach in the commit 94b07c1 ("arm64: signal: Report signal frame size to userspace via auxv"). Link: https://lore.kernel.org/r/73883406.20231215232720@torrio.net Link: https://lore.kernel.org/all/20240102133617.3649-1-victor@torrio.net/ Reported-by: Ivan Komarov Closes: https://lore.kernel.org/linux-riscv/CY3Z02NYV1C4.11BLB9PLVW9G1@fedora/ Fixes: e92f469b0771 ("riscv: signal: Report signal frame size to userspace via auxv") Signed-off-by: Victor Isaev Signed-off-by: Palmer Dabbelt Signed-off-by: Sasha Levin commit b2ddeb7fb3226fa7f0cadf33ba5b2f6db5e79226 Author: Pu Lehui Date: Tue Mar 12 01:20:53 2024 +0000 drivers/perf: riscv: Disable PERF_SAMPLE_BRANCH_* while not supported [ Upstream commit ea6873118493019474abbf57d5a800da365734df ] RISC-V perf driver does not yet support branch sampling. Although the specification is in the works [0], it is best to disable such events until support is available, otherwise we will get unexpected results. Due to this reason, two riscv bpf testcases get_branch_snapshot and perf_branches/perf_branches_hw fail. Link: https://github.com/riscv/riscv-control-transfer-records [0] Fixes: f5bfa23f576f ("RISC-V: Add a perf core library for pmu drivers") Signed-off-by: Pu Lehui Reviewed-by: Atish Patra Reviewed-by: Conor Dooley Link: https://lore.kernel.org/r/20240312012053.1178140-1-pulehui@huaweicloud.com Signed-off-by: Palmer Dabbelt Signed-off-by: Sasha Levin commit 3e1a29fb81c2fbaedf1e55e38244702deead85ca Author: Richard Fitzgerald Date: Thu Mar 7 11:02:27 2024 +0000 ASoC: wm_adsp: Fix missing mutex_lock in wm_adsp_write_ctl() [ Upstream commit f193957b0fbbba397c8bddedf158b3bf7e4850fc ] wm_adsp_write_ctl() must hold the pwr_lock mutex when calling cs_dsp_get_ctl(). This was previously partially fixed by commit 781118bc2fc1 ("ASoC: wm_adsp: Fix missing locking in wm_adsp_[read|write]_ctl()") but this only put locking around the call to cs_dsp_coeff_write_ctrl(), missing the call to cs_dsp_get_ctl(). Signed-off-by: Richard Fitzgerald Fixes: 781118bc2fc1 ("ASoC: wm_adsp: Fix missing locking in wm_adsp_[read|write]_ctl()") Link: https://msgid.link/r/20240307110227.41421-1-rf@opensource.cirrus.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 40613ea1d5ea36c97b11732ea5327dca87388929 Author: Dominique Martinet Date: Tue Jan 9 12:39:03 2024 +0900 9p: Fix read/write debug statements to report server reply [ Upstream commit be3193e58ec210b2a72fb1134c2a0695088a911d ] Previous conversion to iov missed these debug statements which would now always print the requested size instead of the actual server reply. Write also added a loop in a much older commit but we didn't report these, while reads do report each iteration -- it's more coherent to keep reporting all requests to server so move that at the same time. Fixes: 7f02464739da ("9p: convert to advancing variant of iov_iter_get_pages_alloc()") Signed-off-by: Dominique Martinet Message-ID: <20240109-9p-rw-trace-v1-1-327178114257@codewreck.org> Signed-off-by: Sasha Levin commit f4a192cd7b25e571bdcde0daafb360e6b8327c87 Author: Jann Horn Date: Fri Nov 24 16:08:22 2023 +0100 fs/pipe: Fix lockdep false-positive in watchqueue pipe_write() [ Upstream commit 055ca83559912f2cfd91c9441427bac4caf3c74e ] When you try to splice between a normal pipe and a notification pipe, get_pipe_info(..., true) fails, so splice() falls back to treating the notification pipe like a normal pipe - so we end up in iter_file_splice_write(), which first locks the input pipe, then calls vfs_iter_write(), which locks the output pipe. Lockdep complains about that, because we're taking a pipe lock while already holding another pipe lock. I think this probably (?) can't actually lead to deadlocks, since you'd need another way to nest locking a normal pipe into locking a watch_queue pipe, but the lockdep annotations don't make that clear. Bail out earlier in pipe_write() for notification pipes, before taking the pipe lock. Reported-and-tested-by: Closes: https://syzkaller.appspot.com/bug?extid=011e4ea1da6692cf881c Fixes: c73be61cede5 ("pipe: Add general notification queue support") Signed-off-by: Jann Horn Link: https://lore.kernel.org/r/20231124150822.2121798-1-jannh@google.com Signed-off-by: Christian Brauner Signed-off-by: Sasha Levin commit ab7a6fe9c1b5964414062218d1b99eb6c69c9d1f Author: Ashish Kalra Date: Wed Jan 31 15:56:08 2024 -0800 KVM: SVM: Add support for allowing zero SEV ASIDs [ Upstream commit 0aa6b90ef9d75b4bd7b6d106d85f2a3437697f91 ] Some BIOSes allow the end user to set the minimum SEV ASID value (CPUID 0x8000001F_EDX) to be greater than the maximum number of encrypted guests, or maximum SEV ASID value (CPUID 0x8000001F_ECX) in order to dedicate all the SEV ASIDs to SEV-ES or SEV-SNP. The SEV support, as coded, does not handle the case where the minimum SEV ASID value can be greater than the maximum SEV ASID value. As a result, the following confusing message is issued: [ 30.715724] kvm_amd: SEV enabled (ASIDs 1007 - 1006) Fix the support to properly handle this case. Fixes: 916391a2d1dc ("KVM: SVM: Add support for SEV-ES capability in KVM") Suggested-by: Sean Christopherson Signed-off-by: Ashish Kalra Cc: stable@vger.kernel.org Acked-by: Tom Lendacky Link: https://lore.kernel.org/r/20240104190520.62510-1-Ashish.Kalra@amd.com Link: https://lore.kernel.org/r/20240131235609.4161407-4-seanjc@google.com Signed-off-by: Sean Christopherson Signed-off-by: Sasha Levin commit 79b79ea2b3bf9e85f2792ce9f3366b5644186767 Author: Sean Christopherson Date: Wed Jan 31 15:56:07 2024 -0800 KVM: SVM: Use unsigned integers when dealing with ASIDs [ Upstream commit 466eec4a22a76c462781bf6d45cb02cbedf21a61 ] Convert all local ASID variables and parameters throughout the SEV code from signed integers to unsigned integers. As ASIDs are fundamentally unsigned values, and the global min/max variables are appropriately unsigned integers, too. Functionally, this is a glorified nop as KVM guarantees min_sev_asid is non-zero, and no CPU supports -1u as the _only_ asid, i.e. the signed vs. unsigned goof won't cause problems in practice. Opportunistically use sev_get_asid() in sev_flush_encrypted_page() instead of open coding an equivalent. Reviewed-by: Tom Lendacky Link: https://lore.kernel.org/r/20240131235609.4161407-3-seanjc@google.com Signed-off-by: Sean Christopherson Stable-dep-of: 0aa6b90ef9d7 ("KVM: SVM: Add support for allowing zero SEV ASIDs") Signed-off-by: Sasha Levin commit 0a583b7ebb6faed8bf2e627af5b9f9b67015d8c8 Author: Paul Barker Date: Tue Apr 2 15:53:05 2024 +0100 net: ravb: Always update error counters [ Upstream commit 101b76418d7163240bc74a7e06867dca0e51183e ] The error statistics should be updated each time the poll function is called, even if the full RX work budget has been consumed. This prevents the counts from becoming stuck when RX bandwidth usage is high. This also ensures that error counters are not updated after we've re-enabled interrupts as that could result in a race condition. Also drop an unnecessary space. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Paul Barker Reviewed-by: Sergey Shtylyov Link: https://lore.kernel.org/r/20240402145305.82148-2-paul.barker.ct@bp.renesas.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 1dd9204143d17a7098ba639e48e4a819bef5bb23 Author: Paul Barker Date: Tue Apr 2 15:53:04 2024 +0100 net: ravb: Always process TX descriptor ring [ Upstream commit 596a4254915f94c927217fe09c33a6828f33fb25 ] The TX queue should be serviced each time the poll function is called, even if the full RX work budget has been consumed. This prevents starvation of the TX queue when RX bandwidth usage is high. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Paul Barker Reviewed-by: Sergey Shtylyov Link: https://lore.kernel.org/r/20240402145305.82148-1-paul.barker.ct@bp.renesas.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit a9fb2f7463cd263e3e0ad5899513305c9a14ca6b Author: Claudiu Beznea Date: Fri Feb 2 10:41:22 2024 +0200 net: ravb: Let IP-specific receive function to interrogate descriptors [ Upstream commit 2b993bfdb47b3aaafd8fe9cd5038b5e297b18ee1 ] ravb_poll() initial code used to interrogate the first descriptor of the RX queue in case gPTP is false to determine if ravb_rx() should be called. This is done for non-gPTP IPs. For gPTP IPs the driver PTP-specific information was used to determine if receive function should be called. As every IP has its own receive function that interrogates the RX descriptors list in the same way the ravb_poll() was doing there is no need to double check this in ravb_poll(). Removing the code from ravb_poll() leads to a cleaner code. Signed-off-by: Claudiu Beznea Reviewed-by: Sergey Shtylyov Signed-off-by: Paolo Abeni Stable-dep-of: 596a4254915f ("net: ravb: Always process TX descriptor ring") Signed-off-by: Sasha Levin commit 199a1314ef78ac1a95e28d1d754f0e18aa960f65 Author: Vitaly Lifshits Date: Sun Mar 3 12:51:32 2024 +0200 e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue [ Upstream commit 861e8086029e003305750b4126ecd6617465f5c7 ] Forcing SMBUS inside the ULP enabling flow leads to sporadic PHY loss on some systems. It is suspected to be caused by initiating PHY transactions before the interface settles. Separating this configuration from the ULP enabling flow and moving it to the shutdown function allows enough time for the interface to settle and avoids adding a delay. Fixes: 6607c99e7034 ("e1000e: i219 - fix to enable both ULP and EEE in Sx state") Co-developed-by: Dima Ruinskiy Signed-off-by: Dima Ruinskiy Signed-off-by: Vitaly Lifshits Tested-by: Naama Meir Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit eb96a5c025533ba42eb2a16559e6a4e90d4dfe9e Author: Vitaly Lifshits Date: Fri Mar 1 10:48:05 2024 -0800 e1000e: Minor flow correction in e1000_shutdown function [ Upstream commit 662200e324daebe6859c1f0f3ea1538b0561425a ] Add curly braces to avoid entering to an if statement where it is not always required in e1000_shutdown function. This improves code readability and might prevent non-deterministic behaviour in the future. Signed-off-by: Vitaly Lifshits Tested-by: Naama Meir Signed-off-by: Tony Nguyen Link: https://lore.kernel.org/r/20240301184806.2634508-5-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski Stable-dep-of: 861e8086029e ("e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue") Signed-off-by: Sasha Levin commit 1d16cd91cd319d5bf6230c8493feb56a61e486a1 Author: Vitaly Lifshits Date: Thu Jan 4 16:16:52 2024 +0200 e1000e: Workaround for sporadic MDI error on Meteor Lake systems [ Upstream commit 6dbdd4de0362c37e54e8b049781402e5a409e7d0 ] On some Meteor Lake systems accessing the PHY via the MDIO interface may result in an MDI error. This issue happens sporadically and in most cases a second access to the PHY via the MDIO interface results in success. As a workaround, introduce a retry counter which is set to 3 on Meteor Lake systems. The driver will only return an error if 3 consecutive PHY access attempts fail. The retry mechanism is disabled in specific flows, where MDI errors are expected. Fixes: cc23f4f0b6b9 ("e1000e: Add support for Meteor Lake") Suggested-by: Nikolay Mushayev Co-developed-by: Nir Efrati Signed-off-by: Nir Efrati Signed-off-by: Vitaly Lifshits Tested-by: Naama Meir Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit d5752c7bb1b2acb6c9f45c105d26c2176f558f79 Author: Jesse Brandeburg Date: Tue Dec 5 17:01:08 2023 -0800 intel: legacy: field get conversion [ Upstream commit b9a4525450758dd75edbdaee97425ba7546c2b5c ] Refactor several older Intel drivers to use FIELD_GET(), which reduces lines of code and adds clarity of intent. This code was generated by the following coccinelle/spatch script and then manually repaired. @get@ constant shift,mask; type T; expression a; @@ ( -((T)((a) & mask) >> shift) +FIELD_GET(mask, a) and applied via: spatch --sp-file field_prep.cocci --in-place --dir \ drivers/net/ethernet/intel/ Cc: Julia Lawall CC: Alexander Lobakin Reviewed-by: Marcin Szycik Reviewed-by: Simon Horman Signed-off-by: Jesse Brandeburg Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems") Signed-off-by: Sasha Levin commit e383353b799269460a71e40bc100d91c4e892f31 Author: Jesse Brandeburg Date: Tue Dec 5 17:01:01 2023 -0800 intel: add bit macro includes where needed [ Upstream commit 3314f2097dee43defc20554f961a8b17f4787e2d ] This series is introducing the use of FIELD_GET and FIELD_PREP which requires bitfield.h to be included. Fix all the includes in this one change, and rearrange includes into alphabetical order to ease readability and future maintenance. virtchnl.h and it's usage was modified to have it's own includes as it should. This required including bits.h for virtchnl.h. Reviewed-by: Marcin Szycik Signed-off-by: Jesse Brandeburg Signed-off-by: Tony Nguyen Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems") Signed-off-by: Sasha Levin commit e77220eec3ee6e75d3887d7b8c54305a66ad57ee Author: Ivan Vecera Date: Wed Sep 27 10:31:34 2023 +0200 i40e: Remove circular header dependencies and fix headers [ Upstream commit 56df345917c09ffc00b7834f88990a7a7c338b5c ] Similarly as for ice driver [1] there are also circular header dependencies in i40e driver: i40e.h -> i40e_virtchnl_pf.h -> i40e.h Another issue is that i40e header files does not contain their own dependencies on other header files (both private and standard) so their inclusion in .c file require to add these deps in certain order to that .c file to make it compilable. Fix both issues by removal the mentioned circular dependency, by filling i40e headers with their dependencies so they can be placed anywhere in a source code. Additionally remove bunch of includes from i40e.h super header file that are not necessary and include i40e.h only in .c files that really require it. [1] 649c87c6ff52 ("ice: remove circular header dependencies on ice.h") Signed-off-by: Ivan Vecera Reviewed-by: Przemek Kitszel Reviewed-by: Jesse Brandeburg Reviewed-by: Aleksandr Loktionov Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems") Signed-off-by: Sasha Levin commit 59a9de1a943000cff274644fab506630499b7930 Author: Ivan Vecera Date: Wed Sep 27 10:31:33 2023 +0200 i40e: Split i40e_osdep.h [ Upstream commit 5dfd37c37a44ba47c35ff8e6eaff14c226141111 ] Header i40e_osdep.h contains only IO primitives and couple of debug printing macros. Split this header file to i40e_io.h and i40e_debug.h and move i40e_debug_mask enum to i40e_debug.h Signed-off-by: Ivan Vecera Reviewed-by: Przemek Kitszel Reviewed-by: Jesse Brandeburg Reviewed-by: Aleksandr Loktionov Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems") Signed-off-by: Sasha Levin commit 2ee5326d32c51c25590e16285ba32eddb875a311 Author: Ivan Vecera Date: Wed Sep 27 10:31:32 2023 +0200 i40e: Move memory allocation structures to i40e_alloc.h [ Upstream commit ef5d54078d451973f90e123fafa23fc95c2a08ae ] Structures i40e_dma_mem & i40e_virt_mem are defined i40e_osdep.h while memory allocation functions that use them are declared in i40e_alloc.h Move them there. Signed-off-by: Ivan Vecera Reviewed-by: Przemek Kitszel Reviewed-by: Jesse Brandeburg Reviewed-by: Aleksandr Loktionov Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems") Signed-off-by: Sasha Levin commit 0ed115020ac486b28c76ad9349e17b17dfdf804c Author: Ivan Vecera Date: Wed Sep 27 10:31:31 2023 +0200 i40e: Simplify memory allocation functions [ Upstream commit d3276f928a1d2dfebc41a82e967cd0dffeb540f8 ] Enum i40e_memory_type enum is unused in i40e_allocate_dma_mem() thus can be safely removed. Useless macros in i40e_alloc.h can be removed as well. Signed-off-by: Ivan Vecera Reviewed-by: Przemek Kitszel Reviewed-by: Jesse Brandeburg Reviewed-by: Aleksandr Loktionov Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems") Signed-off-by: Sasha Levin commit 0c52a50aec50a848f07c8ab494b4ba4faf2dac4b Author: Ivan Vecera Date: Wed Sep 27 10:31:30 2023 +0200 virtchnl: Add header dependencies [ Upstream commit 7151d87a175c6618fe81705755eb3dc4199cad4e ] The uses BIT, struct_size and ETH_ALEN macros but does not include appropriate header files that defines them. Add these dependencies so this header file can be included anywhere. Signed-off-by: Ivan Vecera Reviewed-by: Przemek Kitszel Reviewed-by: Jesse Brandeburg Reviewed-by: Aleksandr Loktionov Signed-off-by: Tony Nguyen Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems") Signed-off-by: Sasha Levin commit 45116a7c213868fad4e3b06d475f4b6fee81e5ae Author: Ivan Vecera Date: Wed Sep 27 10:31:29 2023 +0200 i40e: Refactor I40E_MDIO_CLAUSE* macros [ Upstream commit 8196b5fd6c7312d31775f77c7fff0253eb0ecdaa ] The macros I40E_MDIO_CLAUSE22* and I40E_MDIO_CLAUSE45* are using I40E_MASK together with the same values I40E_GLGEN_MSCA_STCODE_SHIFT and I40E_GLGEN_MSCA_OPCODE_SHIFT to define masks. Introduce I40E_GLGEN_MSCA_OPCODE_MASK and I40E_GLGEN_MSCA_STCODE_MASK for both shifts in i40e_register.h and use them to refactor the macros mentioned above. Signed-off-by: Ivan Vecera Reviewed-by: Przemek Kitszel Reviewed-by: Jesse Brandeburg Reviewed-by: Aleksandr Loktionov Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems") Signed-off-by: Sasha Levin commit f629cf15dcde5b5e6f4ed5c5f0c3144494bb60c1 Author: Ivan Vecera Date: Wed Sep 27 10:31:27 2023 +0200 i40e: Remove back pointer from i40e_hw structure [ Upstream commit 39ec612acf6d075809c38a7262d7ad09314762f3 ] The .back field placed in i40e_hw is used to get pointer to i40e_pf instance but it is not necessary as the i40e_hw is a part of i40e_pf and containerof macro can be used to obtain the pointer to i40e_pf. Remove .back field from i40e_hw structure, introduce i40e_hw_to_pf() and i40e_hw_to_dev() helpers and use them. Signed-off-by: Ivan Vecera Reviewed-by: Przemek Kitszel Reviewed-by: Jesse Brandeburg Reviewed-by: Aleksandr Loktionov Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems") Signed-off-by: Sasha Levin commit 66ca011a5df5e82c728569605512f260df9828c2 Author: Ivan Vecera Date: Sat Mar 16 12:38:29 2024 +0100 i40e: Enforce software interrupt during busy-poll exit [ Upstream commit ea558de7238bb12c3435c47f0631e9d17bf4a09f ] As for ice bug fixed by commit b7306b42beaf ("ice: manage interrupts during poll exit") followed by commit 23be7075b318 ("ice: fix software generating extra interrupts") I'm seeing the similar issue also with i40e driver. In certain situation when busy-loop is enabled together with adaptive coalescing, the driver occasionally misses that there are outstanding descriptors to clean when exiting busy poll. Try to catch the remaining work by triggering a software interrupt when exiting busy poll. No extra interrupts will be generated when busy polling is not used. The issue was found when running sockperf ping-pong tcp test with adaptive coalescing and busy poll enabled (50 as value busy_pool and busy_read sysctl knobs) and results in huge latency spikes with more than 100000us. The fix is inspired from the ice driver and do the following: 1) During napi poll exit in case of busy-poll (napo_complete_done() returns false) this is recorded to q_vector that we were in busy loop. 2) Extends i40e_buildreg_itr() to be able to add an enforced software interrupt into built value 2) In i40e_update_enable_itr() enforces a software interrupt trigger if we are exiting busy poll to catch any pending clean-ups 3) Reuses unused 3rd ITR (interrupt throttle) index and set it to 20K interrupts per second to limit the number of these sw interrupts. Test results ============ Prior: [root@dell-per640-07 net]# sockperf ping-pong -i 10.9.9.1 --tcp -m 1000 --mps=max -t 120 sockperf: == version #3.10-no.git == sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s) [ 0] IP = 10.9.9.1 PORT = 11111 # TCP sockperf: Warmup stage (sending a few dummy messages)... sockperf: Starting test... sockperf: Test end (interrupted by timer) sockperf: Test ended sockperf: [Total Run] RunTime=119.999 sec; Warm up time=400 msec; SentMessages=2438563; ReceivedMessages=2438562 sockperf: ========= Printing statistics for Server No: 0 sockperf: [Valid Duration] RunTime=119.549 sec; SentMessages=2429473; ReceivedMessages=2429473 sockperf: ====> avg-latency=24.571 (std-dev=93.297, mean-ad=4.904, median-ad=1.510, siqr=1.063, cv=3.797, std-error=0.060, 99.0% ci=[24.417, 24.725]) sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0 sockperf: Summary: Latency is 24.571 usec sockperf: Total 2429473 observations; each percentile contains 24294.73 observations sockperf: ---> observation = 103294.331 sockperf: ---> percentile 99.999 = 45.633 sockperf: ---> percentile 99.990 = 37.013 sockperf: ---> percentile 99.900 = 35.910 sockperf: ---> percentile 99.000 = 33.390 sockperf: ---> percentile 90.000 = 28.626 sockperf: ---> percentile 75.000 = 27.741 sockperf: ---> percentile 50.000 = 26.743 sockperf: ---> percentile 25.000 = 25.614 sockperf: ---> observation = 12.220 After: [root@dell-per640-07 net]# sockperf ping-pong -i 10.9.9.1 --tcp -m 1000 --mps=max -t 120 sockperf: == version #3.10-no.git == sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s) [ 0] IP = 10.9.9.1 PORT = 11111 # TCP sockperf: Warmup stage (sending a few dummy messages)... sockperf: Starting test... sockperf: Test end (interrupted by timer) sockperf: Test ended sockperf: [Total Run] RunTime=119.999 sec; Warm up time=400 msec; SentMessages=2400055; ReceivedMessages=2400054 sockperf: ========= Printing statistics for Server No: 0 sockperf: [Valid Duration] RunTime=119.549 sec; SentMessages=2391186; ReceivedMessages=2391186 sockperf: ====> avg-latency=24.965 (std-dev=5.934, mean-ad=4.642, median-ad=1.485, siqr=1.067, cv=0.238, std-error=0.004, 99.0% ci=[24.955, 24.975]) sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0 sockperf: Summary: Latency is 24.965 usec sockperf: Total 2391186 observations; each percentile contains 23911.86 observations sockperf: ---> observation = 195.841 sockperf: ---> percentile 99.999 = 45.026 sockperf: ---> percentile 99.990 = 39.009 sockperf: ---> percentile 99.900 = 35.922 sockperf: ---> percentile 99.000 = 33.482 sockperf: ---> percentile 90.000 = 28.902 sockperf: ---> percentile 75.000 = 27.821 sockperf: ---> percentile 50.000 = 26.860 sockperf: ---> percentile 25.000 = 25.685 sockperf: ---> observation = 12.277 Fixes: 0bcd952feec7 ("ethernet/intel: consolidate NAPI and NAPI exit") Reported-by: Hugo Ferreira Reviewed-by: Michal Schmidt Signed-off-by: Ivan Vecera Reviewed-by: Jesse Brandeburg Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit e6d25dbd9243a5bdd26e221566f3e686a5775f8e Author: Ivan Vecera Date: Mon Nov 13 15:10:24 2023 -0800 i40e: Remove _t suffix from enum type names [ Upstream commit addca9175e5f74cf29e8ad918c38c09b8663b5b8 ] Enum type names should not be suffixed by '_t'. Either to use 'typedef enum name name_t' to so plain 'name_t var' instead of 'enum name_t var'. Signed-off-by: Ivan Vecera Reviewed-by: Jacob Keller Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Link: https://lore.kernel.org/r/20231113231047.548659-6-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski Stable-dep-of: ea558de7238b ("i40e: Enforce software interrupt during busy-poll exit") Signed-off-by: Sasha Levin commit 3da10e91ecd24c49dd80e73f5ca86166f90dcfe1 Author: Mario Limonciello Date: Wed Mar 20 13:32:21 2024 -0500 drm/amd: Flush GFXOFF requests in prepare stage [ Upstream commit ca299b4512d4b4f516732a48ce9aa19d91f4473e ] If the system hasn't entered GFXOFF when suspend starts it can cause hangs accessing GC and RLC during the suspend stage. Cc: # 6.1.y: 5095d5418193 ("drm/amd: Evict resources during PM ops prepare() callback") Cc: # 6.1.y: cb11ca3233aa ("drm/amd: Add concept of running prepare_suspend() sequence for IP blocks") Cc: # 6.1.y: 2ceec37b0e3d ("drm/amd: Add missing kernel doc for prepare_suspend()") Cc: # 6.1.y: 3a9626c816db ("drm/amd: Stop evicting resources on APUs in suspend") Cc: # 6.6.y: 5095d5418193 ("drm/amd: Evict resources during PM ops prepare() callback") Cc: # 6.6.y: cb11ca3233aa ("drm/amd: Add concept of running prepare_suspend() sequence for IP blocks") Cc: # 6.6.y: 2ceec37b0e3d ("drm/amd: Add missing kernel doc for prepare_suspend()") Cc: # 6.6.y: 3a9626c816db ("drm/amd: Stop evicting resources on APUs in suspend") Cc: # 6.1+ Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3132 Fixes: ab4750332dbe ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks") Reviewed-by: Alex Deucher Signed-off-by: Mario Limonciello Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit da67a1139f054fc59c9c18f135729bc16aef93d4 Author: Mario Limonciello Date: Fri Oct 6 13:50:21 2023 -0500 drm/amd: Add concept of running prepare_suspend() sequence for IP blocks [ Upstream commit cb11ca3233aa3303dc11dca25977d2e7f24be00f ] If any IP blocks allocate memory during their hw_fini() sequence this can cause the suspend to fail under memory pressure. Introduce a new phase that IP blocks can use to allocate memory before suspend starts so that it can potentially be evicted into swap instead. Reviewed-by: Christian König Signed-off-by: Mario Limonciello Signed-off-by: Alex Deucher Stable-dep-of: ca299b4512d4 ("drm/amd: Flush GFXOFF requests in prepare stage") Signed-off-by: Sasha Levin commit 8b5f720486ca87e102ee722a73ae0894c12f1e7a Author: Mario Limonciello Date: Fri Oct 6 13:50:20 2023 -0500 drm/amd: Evict resources during PM ops prepare() callback [ Upstream commit 5095d5418193eb2748c7d8553c7150b8f1c44696 ] Linux PM core has a prepare() callback run before suspend. If the system is under high memory pressure, the resources may need to be evicted into swap instead. If the storage backing for swap is offlined during the suspend() step then such a call may fail. So move this step into prepare() to move evict majority of resources and update all non-pmops callers to call the same callback. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362 Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Mario Limonciello Signed-off-by: Alex Deucher Stable-dep-of: ca299b4512d4 ("drm/amd: Flush GFXOFF requests in prepare stage") Signed-off-by: Sasha Levin commit 4356a2c3f296503c8b420ae8adece053960a9f06 Author: Chris Park Date: Tue Mar 5 17:41:15 2024 -0500 drm/amd/display: Prevent crash when disable stream [ Upstream commit 72d72e8fddbcd6c98e1b02d32cf6f2b04e10bd1c ] [Why] Disabling stream encoder invokes a function that no longer exists. [How] Check if the function declaration is NULL in disable stream encoder. Cc: Mario Limonciello Cc: Alex Deucher Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu Acked-by: Wayne Lin Signed-off-by: Chris Park Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 8dc9a27589a9bf5f0a7eb517543411adc185e957 Author: Dmytro Laktyushkin Date: Wed Jan 17 16:46:02 2024 -0500 drm/amd/display: Fix DPSTREAM CLK on and off sequence [ Upstream commit e8d131285c98927554cd007f47cedc4694bfedde ] [Why] Secondary DP2 display fails to light up in some instances [How] Clock needs to be on when DPSTREAMCLK*_EN =1. This change moves dtbclk_p enable/disable point to make sure this is the case Reviewed-by: Charlene Liu Reviewed-by: Dmytro Laktyushkin Acked-by: Tom Chung Signed-off-by: Daniel Miess Signed-off-by: Dmytro Laktyushkin Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Stable-dep-of: 72d72e8fddbc ("drm/amd/display: Prevent crash when disable stream") Signed-off-by: Sasha Levin commit 113b12e1648881bf3322cfc906392e2314dde781 Author: Krishna Kurapati Date: Fri Mar 1 09:39:14 2024 +0530 usb: typec: ucsi: Fix race between typec_switch and role_switch [ Upstream commit f5e9bda03aa50ffad36eccafe893d004ef213c43 ] When orientation switch is enabled in ucsi glink, there is a xhci probe failure seen when booting up in host mode in reverse orientation. During bootup the following things happen in multiple drivers: a) DWC3 controller driver initializes the core in device mode when the dr_mode is set to DRD. It relies on role_switch call to change role to host. b) QMP driver initializes the lanes to TYPEC_ORIENTATION_NORMAL as a normal routine. It relies on the typec_switch_set call to get notified of orientation changes. c) UCSI core reads the UCSI_GET_CONNECTOR_STATUS via the glink and provides initial role switch to dwc3 controller. When booting up in host mode with orientation TYPEC_ORIENTATION_REVERSE, then we see the following things happening in order: a) UCSI gives initial role as host to dwc3 controller ucsi_register_port. Upon receiving this notification, the dwc3 core needs to program GCTL from PRTCAP_DEVICE to PRTCAP_HOST and as part of this change, it asserts GCTL Core soft reset and waits for it to be completed before shifting it to host. Only after the reset is done will the dwc3_host_init be invoked and xhci is probed. DWC3 controller expects that the usb phy's are stable during this process i.e., the phy init is already done. b) During the 100ms wait for GCTL core soft reset, the actual notification from PPM is received by ucsi_glink via pmic glink for changing role to host. The pmic_glink_ucsi_notify routine first sends the orientation change to QMP and then sends role to dwc3 via ucsi framework. This is happening exactly at the time GCTL core soft reset is being processed. c) When QMP driver receives typec switch to TYPEC_ORIENTATION_REVERSE, it then re-programs the phy at the instant GCTL core soft reset has been asserted by dwc3 controller due to which the QMP PLL lock fails in qmp_combo_usb_power_on. d) After the 100ms of GCTL core soft reset is completed, the dwc3 core goes for initializing the host mode and invokes xhci probe. But at this point the QMP is non-responsive and as a result, the xhci plat probe fails during xhci_reset. Fix this by passing orientation switch to available ucsi instances if their gpio configuration is available before ucsi_register is invoked so that by the time, the pmic_glink_ucsi_notify provides typec_switch to QMP, the lane is already configured and the call would be a NOP thus not racing with role switch. Cc: stable@vger.kernel.org Fixes: c6165ed2f425 ("usb: ucsi: glink: use the connector orientation GPIO to provide switch events") Suggested-by: Wesley Cheng Signed-off-by: Krishna Kurapati Acked-by: Heikki Krogerus Link: https://lore.kernel.org/r/20240301040914.458492-1-quic_kriskura@quicinc.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 0dcf573f997732702917af1563aa2493dc772fc0 Author: Aleksandr Loktionov Date: Wed Mar 13 10:56:39 2024 +0100 i40e: fix vf may be used uninitialized in this function warning commit f37c4eac99c258111d414d31b740437e1925b8e8 upstream. To fix the regression introduced by commit 52424f974bc5, which causes servers hang in very hard to reproduce conditions with resets races. Using two sources for the information is the root cause. In this function before the fix bumping v didn't mean bumping vf pointer. But the code used this variables interchangeably, so stale vf could point to different/not intended vf. Remove redundant "v" variable and iterate via single VF pointer across whole function instead to guarantee VF pointer validity. Fixes: 52424f974bc5 ("i40e: Fix VF hang when reset is triggered on another VF") Signed-off-by: Aleksandr Loktionov Reviewed-by: Arkadiusz Kubalewski Reviewed-by: Przemek Kitszel Reviewed-by: Paul Menzel Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen Signed-off-by: Greg Kroah-Hartman commit 89e29416cf6b6f19430bb8c739ffc789a2dd4e40 Author: Aleksandr Loktionov Date: Wed Mar 13 10:44:00 2024 +0100 i40e: fix i40e_count_filters() to count only active/new filters commit eb58c598ce45b7e787568fe27016260417c3d807 upstream. The bug usually affects untrusted VFs, because they are limited to 18 MACs, it affects them badly, not letting to create MAC all filters. Not stable to reproduce, it happens when VF user creates MAC filters when other MACVLAN operations are happened in parallel. But consequence is that VF can't receive desired traffic. Fix counter to be bumped only for new or active filters. Fixes: 621650cabee5 ("i40e: Refactoring VF MAC filters counting to make more reliable") Signed-off-by: Aleksandr Loktionov Reviewed-by: Arkadiusz Kubalewski Reviewed-by: Paul Menzel Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen Signed-off-by: Greg Kroah-Hartman commit 76c39cf84cd2245305236ef71ad7896aa132c721 Author: Aleksandr Mishin Date: Thu Mar 28 19:55:05 2024 +0300 octeontx2-af: Add array index check commit ef15ddeeb6bee87c044bf7754fac524545bf71e8 upstream. In rvu_map_cgx_lmac_pf() the 'iter', which is used as an array index, can reach value (up to 14) that exceed the size (MAX_LMAC_COUNT = 8) of the array. Fix this bug by adding 'iter' value check. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 91c6945ea1f9 ("octeontx2-af: cn10k: Add RPM MAC support") Signed-off-by: Aleksandr Mishin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 43b69da260af7f133df982033d2f2c01705307ca Author: Su Hui Date: Thu Mar 28 10:06:21 2024 +0800 octeontx2-pf: check negative error code in otx2_open() commit e709acbd84fb6ef32736331b0147f027a3ef4c20 upstream. otx2_rxtx_enable() return negative error code such as -EIO, check -EIO rather than EIO to fix this problem. Fixes: c926252205c4 ("octeontx2-pf: Disable packet I/O for graceful exit") Signed-off-by: Su Hui Reviewed-by: Subbaraya Sundeep Reviewed-by: Simon Horman Reviewed-by: Kalesh AP Link: https://lore.kernel.org/r/20240328020620.4054692-1-suhui@nfschina.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit b08b0c7a66c9799a024d544154eb0d2905b38a83 Author: Hariprasad Kelam Date: Tue Mar 26 17:51:49 2024 +0530 octeontx2-af: Fix issue with loading coalesced KPU profiles commit 0ba80d96585662299d4ea4624043759ce9015421 upstream. The current implementation for loading coalesced KPU profiles has a limitation. The "offset" field, which is used to locate profiles within the profile is restricted to a u16. This restricts the number of profiles that can be loaded. This patch addresses this limitation by increasing the size of the "offset" field. Fixes: 11c730bfbf5b ("octeontx2-af: support for coalescing KPU profiles") Signed-off-by: Hariprasad Kelam Reviewed-by: Kalesh AP Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 03b6f3692baef466bbdfaa91ec25987354961171 Author: Antoine Tenart Date: Tue Mar 26 12:34:01 2024 +0100 udp: prevent local UDP tunnel packets from being GROed commit 64235eabc4b5b18c507c08a1f16cdac6c5661220 upstream. GRO has a fundamental issue with UDP tunnel packets as it can't detect those in a foolproof way and GRO could happen before they reach the tunnel endpoint. Previous commits have fixed issues when UDP tunnel packets come from a remote host, but if those packets are issued locally they could run into checksum issues. If the inner packet has a partial checksum the information will be lost in the GRO logic, either in udp4/6_gro_complete or in udp_gro_complete_segment and packets will have an invalid checksum when leaving the host. Prevent local UDP tunnel packets from ever being GROed at the outer UDP level. Due to skb->encapsulation being wrongly used in some drivers this is actually only preventing UDP tunnel packets with a partial checksum to be GROed (see iptunnel_handle_offloads) but those were also the packets triggering issues so in practice this should be sufficient. Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.") Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets") Suggested-by: Paolo Abeni Signed-off-by: Antoine Tenart Reviewed-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2a1b61d0cb9bf13abcbcfd5ee704cc12736d6aba Author: Antoine Tenart Date: Tue Mar 26 12:34:00 2024 +0100 udp: do not transition UDP GRO fraglist partial checksums to unnecessary commit f0b8c30345565344df2e33a8417a27503589247d upstream. UDP GRO validates checksums and in udp4/6_gro_complete fraglist packets are converted to CHECKSUM_UNNECESSARY to avoid later checks. However this is an issue for CHECKSUM_PARTIAL packets as they can be looped in an egress path and then their partial checksums are not fixed. Different issues can be observed, from invalid checksum on packets to traces like: gen01: hw csum failure skb len=3008 headroom=160 headlen=1376 tailroom=0 mac=(106,14) net=(120,40) trans=160 shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0) hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12 ... Fix this by only converting CHECKSUM_NONE packets to CHECKSUM_UNNECESSARY by reusing __skb_incr_checksum_unnecessary. All other checksum types are kept as-is, including CHECKSUM_COMPLETE as fraglist packets being segmented back would have their skb->csum valid. Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.") Signed-off-by: Antoine Tenart Reviewed-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 3001e7aa43d6691db2a878b0745b854bf12ddd19 Author: Antoine Tenart Date: Tue Mar 26 12:33:58 2024 +0100 udp: do not accept non-tunnel GSO skbs landing in a tunnel commit 3d010c8031e39f5fa1e8b13ada77e0321091011f upstream. When rx-udp-gro-forwarding is enabled UDP packets might be GROed when being forwarded. If such packets might land in a tunnel this can cause various issues and udp_gro_receive makes sure this isn't the case by looking for a matching socket. This is performed in udp4/6_gro_lookup_skb but only in the current netns. This is an issue with tunneled packets when the endpoint is in another netns. In such cases the packets will be GROed at the UDP level, which leads to various issues later on. The same thing can happen with rx-gro-list. We saw this with geneve packets being GROed at the UDP level. In such case gso_size is set; later the packet goes through the geneve rx path, the geneve header is pulled, the offset are adjusted and frag_list skbs are not adjusted with regard to geneve. When those skbs hit skb_fragment, it will misbehave. Different outcomes are possible depending on what the GROed skbs look like; from corrupted packets to kernel crashes. One example is a BUG_ON[1] triggered in skb_segment while processing the frag_list. Because gso_size is wrong (geneve header was pulled) skb_segment thinks there is "geneve header size" of data in frag_list, although it's in fact the next packet. The BUG_ON itself has nothing to do with the issue. This is only one of the potential issues. Looking up for a matching socket in udp_gro_receive is fragile: the lookup could be extended to all netns (not speaking about performances) but nothing prevents those packets from being modified in between and we could still not find a matching socket. It's OK to keep the current logic there as it should cover most cases but we also need to make sure we handle tunnel packets being GROed too early. This is done by extending the checks in udp_unexpected_gso: GSO packets lacking the SKB_GSO_UDP_TUNNEL/_CSUM bits and landing in a tunnel must be segmented. [1] kernel BUG at net/core/skbuff.c:4408! RIP: 0010:skb_segment+0xd2a/0xf70 __udp_gso_segment+0xaa/0x560 Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.") Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets") Signed-off-by: Antoine Tenart Reviewed-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a5eae74f39c094af13f98daa49c09149a8f83daa Author: Atlas Yu Date: Thu Mar 28 13:51:52 2024 +0800 r8169: skip DASH fw status checks when DASH is disabled commit 5e864d90b20803edf6bd44a99fb9afa7171785f2 upstream. On devices that support DASH, the current code in the "rtl_loop_wait" function raises false alarms when DASH is disabled. This occurs because the function attempts to wait for the DASH firmware to be ready, even though it's not relevant in this case. r8169 0000:0c:00.0 eth0: RTL8168ep/8111ep, 38:7c:76:49:08:d9, XID 502, IRQ 86 r8169 0000:0c:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko] r8169 0000:0c:00.0 eth0: DASH disabled ... r8169 0000:0c:00.0 eth0: rtl_ep_ocp_read_cond == 0 (loop: 30, delay: 10000). This patch modifies the driver start/stop functions to skip checking the DASH firmware status when DASH is explicitly disabled. This prevents unnecessary delays and false alarms. The patch has been tested on several ThinkStation P8/PX workstations. Fixes: 0ab0c45d8aae ("r8169: add handling DASH when DASH is disabled") Signed-off-by: Atlas Yu Reviewed-by: Heiner Kallweit Link: https://lore.kernel.org/r/20240328055152.18443-1-atlas.yu@canonical.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 36a1cb0371aa6f0698910ee70cb4ed3c349f4fa4 Author: David Thompson Date: Mon Mar 25 17:09:29 2024 -0400 mlxbf_gige: stop interface during shutdown commit 09ba28e1cd3cf715daab1fca6e1623e22fd754a6 upstream. The mlxbf_gige driver intermittantly encounters a NULL pointer exception while the system is shutting down via "reboot" command. The mlxbf_driver will experience an exception right after executing its shutdown() method. One example of this exception is: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=000000011d373000 [0000000000000070] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 96000004 [#1] SMP CPU: 0 PID: 13 Comm: ksoftirqd/0 Tainted: G S OE 5.15.0-bf.6.gef6992a #1 Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.0.2.12669 Apr 21 2023 pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : mlxbf_gige_handle_tx_complete+0xc8/0x170 [mlxbf_gige] lr : mlxbf_gige_poll+0x54/0x160 [mlxbf_gige] sp : ffff8000080d3c10 x29: ffff8000080d3c10 x28: ffffcce72cbb7000 x27: ffff8000080d3d58 x26: ffff0000814e7340 x25: ffff331cd1a05000 x24: ffffcce72c4ea008 x23: ffff0000814e4b40 x22: ffff0000814e4d10 x21: ffff0000814e4128 x20: 0000000000000000 x19: ffff0000814e4a80 x18: ffffffffffffffff x17: 000000000000001c x16: ffffcce72b4553f4 x15: ffff80008805b8a7 x14: 0000000000000000 x13: 0000000000000030 x12: 0101010101010101 x11: 7f7f7f7f7f7f7f7f x10: c2ac898b17576267 x9 : ffffcce720fa5404 x8 : ffff000080812138 x7 : 0000000000002e9a x6 : 0000000000000080 x5 : ffff00008de3b000 x4 : 0000000000000000 x3 : 0000000000000001 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: mlxbf_gige_handle_tx_complete+0xc8/0x170 [mlxbf_gige] mlxbf_gige_poll+0x54/0x160 [mlxbf_gige] __napi_poll+0x40/0x1c8 net_rx_action+0x314/0x3a0 __do_softirq+0x128/0x334 run_ksoftirqd+0x54/0x6c smpboot_thread_fn+0x14c/0x190 kthread+0x10c/0x110 ret_from_fork+0x10/0x20 Code: 8b070000 f9000ea0 f95056c0 f86178a1 (b9407002) ---[ end trace 7cc3941aa0d8e6a4 ]--- Kernel panic - not syncing: Oops: Fatal exception in interrupt Kernel Offset: 0x4ce722520000 from 0xffff800008000000 PHYS_OFFSET: 0x80000000 CPU features: 0x000005c1,a3330e5a Memory Limit: none ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]--- During system shutdown, the mlxbf_gige driver's shutdown() is always executed. However, the driver's stop() method will only execute if networking interface configuration logic within the Linux distribution has been setup to do so. If shutdown() executes but stop() does not execute, NAPI remains enabled and this can lead to an exception if NAPI is scheduled while the hardware interface has only been partially deinitialized. The networking interface managed by the mlxbf_gige driver must be properly stopped during system shutdown so that IFF_UP is cleared, the hardware interface is put into a clean state, and NAPI is fully deinitialized. Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver") Signed-off-by: David Thompson Link: https://lore.kernel.org/r/20240325210929.25362-1-davthompson@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit f2dd75e57285f49e34af1a5b6cd8945c08243776 Author: Kuniyuki Iwashima Date: Mon Apr 1 14:10:04 2024 -0700 ipv6: Fix infinite recursion in fib6_dump_done(). commit d21d40605bca7bd5fc23ef03d4c1ca1f48bc2cae upstream. syzkaller reported infinite recursive calls of fib6_dump_done() during netlink socket destruction. [1] From the log, syzkaller sent an AF_UNSPEC RTM_GETROUTE message, and then the response was generated. The following recvmmsg() resumed the dump for IPv6, but the first call of inet6_dump_fib() failed at kzalloc() due to the fault injection. [0] 12:01:34 executing program 3: r0 = socket$nl_route(0x10, 0x3, 0x0) sendmsg$nl_route(r0, ... snip ...) recvmmsg(r0, ... snip ...) (fail_nth: 8) Here, fib6_dump_done() was set to nlk_sk(sk)->cb.done, and the next call of inet6_dump_fib() set it to nlk_sk(sk)->cb.args[3]. syzkaller stopped receiving the response halfway through, and finally netlink_sock_destruct() called nlk_sk(sk)->cb.done(). fib6_dump_done() calls fib6_dump_end() and nlk_sk(sk)->cb.done() if it is still not NULL. fib6_dump_end() rewrites nlk_sk(sk)->cb.done() by nlk_sk(sk)->cb.args[3], but it has the same function, not NULL, calling itself recursively and hitting the stack guard page. To avoid the issue, let's set the destructor after kzalloc(). [0]: FAULT_INJECTION: forcing a failure. name failslab, interval 1, probability 0, space 0, times 0 CPU: 1 PID: 432110 Comm: syz-executor.3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack_lvl (lib/dump_stack.c:117) should_fail_ex (lib/fault-inject.c:52 lib/fault-inject.c:153) should_failslab (mm/slub.c:3733) kmalloc_trace (mm/slub.c:3748 mm/slub.c:3827 mm/slub.c:3992) inet6_dump_fib (./include/linux/slab.h:628 ./include/linux/slab.h:749 net/ipv6/ip6_fib.c:662) rtnl_dump_all (net/core/rtnetlink.c:4029) netlink_dump (net/netlink/af_netlink.c:2269) netlink_recvmsg (net/netlink/af_netlink.c:1988) ____sys_recvmsg (net/socket.c:1046 net/socket.c:2801) ___sys_recvmsg (net/socket.c:2846) do_recvmmsg (net/socket.c:2943) __x64_sys_recvmmsg (net/socket.c:3041 net/socket.c:3034 net/socket.c:3034) [1]: BUG: TASK stack guard page was hit at 00000000f2fa9af1 (stack is 00000000b7912430..000000009a436beb) stack guard page: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 223719 Comm: kworker/1:3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Workqueue: events netlink_sock_destruct_work RIP: 0010:fib6_dump_done (net/ipv6/ip6_fib.c:570) Code: 3c 24 e8 f3 e9 51 fd e9 28 fd ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 48 89 fd <53> 48 8d 5d 60 e8 b6 4d 07 fd 48 89 da 48 b8 00 00 00 00 00 fc ff RSP: 0018:ffffc9000d980000 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffffffff84405990 RCX: ffffffff844059d3 RDX: ffff8881028e0000 RSI: ffffffff84405ac2 RDI: ffff88810c02f358 RBP: ffff88810c02f358 R08: 0000000000000007 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000224 R12: 0000000000000000 R13: ffff888007c82c78 R14: ffff888007c82c68 R15: ffff888007c82c68 FS: 0000000000000000(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffc9000d97fff8 CR3: 0000000102309002 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <#DF> fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1)) fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1)) ... fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1)) fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1)) netlink_sock_destruct (net/netlink/af_netlink.c:401) __sk_destruct (net/core/sock.c:2177 (discriminator 2)) sk_destruct (net/core/sock.c:2224) __sk_free (net/core/sock.c:2235) sk_free (net/core/sock.c:2246) process_one_work (kernel/workqueue.c:3259) worker_thread (kernel/workqueue.c:3329 kernel/workqueue.c:3416) kthread (kernel/kthread.c:388) ret_from_fork (arch/x86/kernel/process.c:153) ret_from_fork_asm (arch/x86/entry/entry_64.S:256) Modules linked in: Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzkaller Signed-off-by: Kuniyuki Iwashima Reviewed-by: Eric Dumazet Reviewed-by: David Ahern Link: https://lore.kernel.org/r/20240401211003.25274-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 74204bf9050f7627aead9875fe4e07ba125cb19b Author: Duoming Zhou Date: Fri Mar 29 09:50:23 2024 +0800 ax25: fix use-after-free bugs caused by ax25_ds_del_timer commit fd819ad3ecf6f3c232a06b27423ce9ed8c20da89 upstream. When the ax25 device is detaching, the ax25_dev_device_down() calls ax25_ds_del_timer() to cleanup the slave_timer. When the timer handler is running, the ax25_ds_del_timer() that calls del_timer() in it will return directly. As a result, the use-after-free bugs could happen, one of the scenarios is shown below: (Thread 1) | (Thread 2) | ax25_ds_timeout() ax25_dev_device_down() | ax25_ds_del_timer() | del_timer() | ax25_dev_put() //FREE | | ax25_dev-> //USE In order to mitigate bugs, when the device is detaching, use timer_shutdown_sync() to stop the timer. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Duoming Zhou Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240329015023.9223-1-duoming@zju.edu.cn Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 8b88752d2b1291cd630b28a0681ae2d3d169f101 Author: Kuniyuki Iwashima Date: Tue Mar 26 13:42:45 2024 -0700 tcp: Fix bind() regression for v6-only wildcard and v4(-mapped-v6) non-wildcard addresses. commit d91ef1e1b55f730bee8ce286b02b7bdccbc42973 upstream. Jianguo Wu reported another bind() regression introduced by bhash2. Calling bind() for the following 3 addresses on the same port, the 3rd one should fail but now succeeds. 1. 0.0.0.0 or ::ffff:0.0.0.0 2. [::] w/ IPV6_V6ONLY 3. IPv4 non-wildcard address or v4-mapped-v6 non-wildcard address The first two bind() create tb2 like this: bhash2 -> tb2(:: w/ IPV6_V6ONLY) -> tb2(0.0.0.0) The 3rd bind() will match with the IPv6 only wildcard address bucket in inet_bind2_bucket_match_addr_any(), however, no conflicting socket exists in the bucket. So, inet_bhash2_conflict() will returns false, and thus, inet_bhash2_addr_any_conflict() returns false consequently. As a result, the 3rd bind() bypasses conflict check, which should be done against the IPv4 wildcard address bucket. So, in inet_bhash2_addr_any_conflict(), we must iterate over all buckets. Note that we cannot add ipv6_only flag for inet_bind2_bucket as it would confuse the following patetrn. 1. [::] w/ SO_REUSE{ADDR,PORT} and IPV6_V6ONLY 2. [::] w/ SO_REUSE{ADDR,PORT} 3. IPv4 non-wildcard address or v4-mapped-v6 non-wildcard address The first bind() would create a bucket with ipv6_only flag true, the second bind() would add the [::] socket into the same bucket, and the third bind() could succeed based on the wrong assumption that ipv6_only bucket would not conflict with v4(-mapped-v6) address. Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address") Diagnosed-by: Jianguo Wu Signed-off-by: Kuniyuki Iwashima Link: https://lore.kernel.org/r/20240326204251.51301-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 690e877ca2b6e5c69679fd38d861463df2e17bae Author: Jakub Kicinski Date: Fri Mar 29 09:05:59 2024 -0700 selftests: reuseaddr_conflict: add missing new line at the end of the output commit 31974122cfdeaf56abc18d8ab740d580d9833e90 upstream. The netdev CI runs in a VM and captures serial, so stdout and stderr get combined. Because there's a missing new line in stderr the test ends up corrupting KTAP: # Successok 1 selftests: net: reuseaddr_conflict which should have been: # Success ok 1 selftests: net: reuseaddr_conflict Fixes: 422d8dc6fd3a ("selftest: add a reuseaddr test") Reviewed-by: Muhammad Usama Anjum Link: https://lore.kernel.org/r/20240329160559.249476-1-kuba@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 4e3fdeecec5707678b0d1f18c259dadb97262e9d Author: Eric Dumazet Date: Thu Mar 28 11:22:48 2024 +0000 erspan: make sure erspan_base_hdr is present in skb->head commit 17af420545a750f763025149fa7b833a4fc8b8f0 upstream. syzbot reported a problem in ip6erspan_rcv() [1] Issue is that ip6erspan_rcv() (and erspan_rcv()) no longer make sure erspan_base_hdr is present in skb linear part (skb->head) before getting @ver field from it. Add the missing pskb_may_pull() calls. v2: Reload iph pointer in erspan_rcv() after pskb_may_pull() because skb->head might have changed. [1] BUG: KMSAN: uninit-value in pskb_may_pull_reason include/linux/skbuff.h:2742 [inline] BUG: KMSAN: uninit-value in pskb_may_pull include/linux/skbuff.h:2756 [inline] BUG: KMSAN: uninit-value in ip6erspan_rcv net/ipv6/ip6_gre.c:541 [inline] BUG: KMSAN: uninit-value in gre_rcv+0x11f8/0x1930 net/ipv6/ip6_gre.c:610 pskb_may_pull_reason include/linux/skbuff.h:2742 [inline] pskb_may_pull include/linux/skbuff.h:2756 [inline] ip6erspan_rcv net/ipv6/ip6_gre.c:541 [inline] gre_rcv+0x11f8/0x1930 net/ipv6/ip6_gre.c:610 ip6_protocol_deliver_rcu+0x1d4c/0x2ca0 net/ipv6/ip6_input.c:438 ip6_input_finish net/ipv6/ip6_input.c:483 [inline] NF_HOOK include/linux/netfilter.h:314 [inline] ip6_input+0x15d/0x430 net/ipv6/ip6_input.c:492 ip6_mc_input+0xa7e/0xc80 net/ipv6/ip6_input.c:586 dst_input include/net/dst.h:460 [inline] ip6_rcv_finish+0x955/0x970 net/ipv6/ip6_input.c:79 NF_HOOK include/linux/netfilter.h:314 [inline] ipv6_rcv+0xde/0x390 net/ipv6/ip6_input.c:310 __netif_receive_skb_one_core net/core/dev.c:5538 [inline] __netif_receive_skb+0x1da/0xa00 net/core/dev.c:5652 netif_receive_skb_internal net/core/dev.c:5738 [inline] netif_receive_skb+0x58/0x660 net/core/dev.c:5798 tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1549 tun_get_user+0x5566/0x69e0 drivers/net/tun.c:2002 tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048 call_write_iter include/linux/fs.h:2108 [inline] new_sync_write fs/read_write.c:497 [inline] vfs_write+0xb63/0x1520 fs/read_write.c:590 ksys_write+0x20f/0x4c0 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __x64_sys_write+0x93/0xe0 fs/read_write.c:652 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 Uninit was created at: slab_post_alloc_hook mm/slub.c:3804 [inline] slab_alloc_node mm/slub.c:3845 [inline] kmem_cache_alloc_node+0x613/0xc50 mm/slub.c:3888 kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:577 __alloc_skb+0x35b/0x7a0 net/core/skbuff.c:668 alloc_skb include/linux/skbuff.h:1318 [inline] alloc_skb_with_frags+0xc8/0xbf0 net/core/skbuff.c:6504 sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2795 tun_alloc_skb drivers/net/tun.c:1525 [inline] tun_get_user+0x209a/0x69e0 drivers/net/tun.c:1846 tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048 call_write_iter include/linux/fs.h:2108 [inline] new_sync_write fs/read_write.c:497 [inline] vfs_write+0xb63/0x1520 fs/read_write.c:590 ksys_write+0x20f/0x4c0 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __x64_sys_write+0x93/0xe0 fs/read_write.c:652 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 CPU: 1 PID: 5045 Comm: syz-executor114 Not tainted 6.9.0-rc1-syzkaller-00021-g962490525cff #0 Fixes: cb73ee40b1b3 ("net: ip_gre: use erspan key field for tunnel lookup") Reported-by: syzbot+1c1cf138518bf0c53d68@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/000000000000772f2c0614b66ef7@google.com/ Signed-off-by: Eric Dumazet Cc: Lorenzo Bianconi Link: https://lore.kernel.org/r/20240328112248.1101491-1-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit a03e138da771c8f0d0aee1c5df71e2e81955a328 Author: Ivan Vecera Date: Fri Mar 29 11:06:37 2024 -0700 i40e: Fix VF MAC filter removal commit ea2a1cfc3b2019bdea6324acd3c03606b60d71ad upstream. Commit 73d9629e1c8c ("i40e: Do not allow untrusted VF to remove administratively set MAC") fixed an issue where untrusted VF was allowed to remove its own MAC address although this was assigned administratively from PF. Unfortunately the introduced check is wrong because it causes that MAC filters for other MAC addresses including multi-cast ones are not removed. if (ether_addr_equal(addr, vf->default_lan_addr.addr) && i40e_can_vf_change_mac(vf)) was_unimac_deleted = true; else continue; if (i40e_del_mac_filter(vsi, al->list[i].addr)) { ... The else path with `continue` effectively skips any MAC filter removal except one for primary MAC addr when VF is allowed to do so. Fix the check condition so the `continue` is only done for primary MAC address. Fixes: 73d9629e1c8c ("i40e: Do not allow untrusted VF to remove administratively set MAC") Signed-off-by: Ivan Vecera Reviewed-by: Michal Schmidt Reviewed-by: Brett Creeley Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen Link: https://lore.kernel.org/r/20240329180638.211412-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit b9bd1498cdce1545a25c7c4210b8722a7294237e Author: Petr Oros Date: Mon Mar 25 21:19:01 2024 +0100 ice: fix enabling RX VLAN filtering commit 8edfc7a40e3300fc6c5fa7a3228a24d5bcd86ba5 upstream. ice_port_vlan_on/off() was introduced in commit 2946204b3fa8 ("ice: implement bridge port vlan"). But ice_port_vlan_on() incorrectly assigns ena_rx_filtering to inner_vlan_ops in DVM mode. This causes an error when rx_filtering cannot be enabled in legacy mode. Reproducer: echo 1 > /sys/class/net/$PF/device/sriov_numvfs ip link set $PF vf 0 spoofchk off trust on vlan 3 dmesg: ice 0000:41:00.0: failed to enable Rx VLAN filtering for VF 0 VSI 9 during VF rebuild, error -95 Fixes: 2946204b3fa8 ("ice: implement bridge port vlan") Signed-off-by: Petr Oros Reviewed-by: Michal Swiatkowski Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen Signed-off-by: Greg Kroah-Hartman commit fc126c1d51e9552eacd2d717b9ffe9262a8a4cd6 Author: Antoine Tenart Date: Tue Mar 26 12:33:59 2024 +0100 gro: fix ownership transfer commit ed4cccef64c1d0d5b91e69f7a8a6697c3a865486 upstream. If packets are GROed with fraglist they might be segmented later on and continue their journey in the stack. In skb_segment_list those skbs can be reused as-is. This is an issue as their destructor was removed in skb_gro_receive_list but not the reference to their socket, and then they can't be orphaned. Fix this by also removing the reference to the socket. For example this could be observed, kernel BUG at include/linux/skbuff.h:3131! (skb_orphan) RIP: 0010:ip6_rcv_core+0x11bc/0x19a0 Call Trace: ipv6_list_rcv+0x250/0x3f0 __netif_receive_skb_list_core+0x49d/0x8f0 netif_receive_skb_list_internal+0x634/0xd40 napi_complete_done+0x1d2/0x7d0 gro_cell_poll+0x118/0x1f0 A similar construction is found in skb_gro_receive, apply the same change there. Fixes: 5e10da5385d2 ("skbuff: allow 'slow_gro' for skb carring sock reference") Signed-off-by: Antoine Tenart Reviewed-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 39864092cff37498a43d5be1720dc62b788f24a6 Author: Antoine Tenart Date: Tue Mar 26 12:34:02 2024 +0100 selftests: net: gro fwd: update vxlan GRO test expectations commit 0fb101be97ca27850c5ecdbd1269423ce4d1f607 upstream. UDP tunnel packets can't be GRO in-between their endpoints as this causes different issues. The UDP GRO fwd vxlan tests were relying on this and their expectations have to be fixed. We keep both vxlan tests and expected no GRO from happening. The vxlan UDP GRO bench test was removed as it's not providing any valuable information now. Fixes: a062260a9d5f ("selftests: net: add UDP GRO forwarding self-tests") Signed-off-by: Antoine Tenart Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 23e1c6866e226dfb6869f38b8f0d6e693057832f Author: Michael Krummsdorf Date: Tue Mar 26 13:36:54 2024 +0100 net: dsa: mv88e6xxx: fix usable ports on 88e6020 commit 625aefac340f45a4fc60908da763f437599a0d6f upstream. The switch has 4 ports with 2 internal PHYs, but ports are numbered up to 6, with ports 0, 1, 5 and 6 being usable. Fixes: 71d94a432a15 ("net: dsa: mv88e6xxx: add support for MV88E6020 switch") Signed-off-by: Michael Krummsdorf Signed-off-by: Matthias Schiffer Reviewed-by: Andrew Lunn Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240326123655.40666-1-matthias.schiffer@ew.tq-group.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 95c1016a2d92c4c28a9d1b6d09859c00b19c0ea4 Author: Aleksandr Mishin Date: Fri Mar 29 09:16:31 2024 +0300 net: phy: micrel: Fix potential null pointer dereference commit 96c155943a703f0655c0c4cab540f67055960e91 upstream. In lan8814_get_sig_rx() and lan8814_get_sig_tx() ptp_parse_header() may return NULL as ptp_header due to abnormal packet type or corrupted packet. Fix this bug by adding ptp_header check. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: ece19502834d ("net: phy: micrel: 1588 support for LAN8814 phy") Signed-off-by: Aleksandr Mishin Reviewed-by: Andrew Lunn Link: https://lore.kernel.org/r/20240329061631.33199-1-amishin@t-argos.ru Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit f996e5ecf07fad2c49af1fc0563f937ef528df7c Author: Wei Fang Date: Thu Mar 28 15:59:29 2024 +0000 net: fec: Set mac_managed_pm during probe commit cbc17e7802f5de37c7c262204baadfad3f7f99e5 upstream. Setting mac_managed_pm during interface up is too late. In situations where the link is not brought up yet and the system suspends the regular PHY power management will run. Since the FEC ETHEREN control bit is cleared (automatically) on suspend the controller is off in resume. When the regular PHY power management resume path runs in this context it will write to the MII_DATA register but nothing will be transmitted on the MDIO bus. This can be observed by the following log: fec 5b040000.ethernet eth0: MDIO read timeout Microchip LAN87xx T1 5b040000.ethernet-1:04: PM: dpm_run_callback(): mdio_bus_phy_resume+0x0/0xc8 returns -110 Microchip LAN87xx T1 5b040000.ethernet-1:04: PM: failed to resume: error -110 The data written will however remain in the MII_DATA register. When the link later is set to administrative up it will trigger a call to fec_restart() which will restore the MII_SPEED register. This triggers the quirk explained in f166f890c8f0 ("net: ethernet: fec: Replace interrupt driven MDIO with polled IO") causing an extra MII_EVENT. This extra event desynchronizes all the MDIO register reads, causing them to complete too early. Leading all reads to read as 0 because fec_enet_mdio_wait() returns too early. When a Microchip LAN8700R PHY is connected to the FEC, the 0 reads causes the PHY to be initialized incorrectly and the PHY will not transmit any ethernet signal in this state. It cannot be brought out of this state without a power cycle of the PHY. Fixes: 557d5dc83f68 ("net: fec: use mac-managed PHY PM") Closes: https://lore.kernel.org/netdev/1f45bdbe-eab1-4e59-8f24-add177590d27@actia.se/ Signed-off-by: Wei Fang [jernberg: commit message] Signed-off-by: John Ernberg Link: https://lore.kernel.org/r/20240328155909.59613-2-john.ernberg@actia.se Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 22a44eeef781b39fccd3277e0b95f7c36f3bda3c Author: Duanqiang Wen Date: Tue Apr 2 10:18:43 2024 +0800 net: txgbe: fix i2c dev name cannot match clkdev commit c644920ce9220d83e070f575a4df711741c07f07 upstream. txgbe clkdev shortened clk_name, so i2c_dev info_name also need to shorten. Otherwise, i2c_dev cannot initialize clock. Fixes: e30cef001da2 ("net: txgbe: fix clk_name exceed MAX_DEV_ID limits") Signed-off-by: Duanqiang Wen Link: https://lore.kernel.org/r/20240402021843.126192-1-duanqiangwen@net-swift.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 1e304328d9c3e94e7dc1899aabd1d5182c204ae9 Author: Horatiu Vultur Date: Tue Apr 2 09:16:34 2024 +0200 net: phy: micrel: lan8814: Fix when enabling/disabling 1-step timestamping commit de99e1ea3a35f23ff83a31d6b08f43d27b2c6345 upstream. There are 2 issues with the blamed commit. 1. When the phy is initialized, it would enable the disabled of UDPv4 checksums. The UDPv6 checksum is already enabled by default. So when 1-step is configured then it would clear these flags. 2. After the 1-step is configured, then if 2-step is configured then the 1-step would be still configured because it is not clearing the flag. So the sync frames will still have origin timestamps set. Fix this by reading first the value of the register and then just change bit 12 as this one determines if the timestamp needs to be inserted in the frame, without changing any other bits. Fixes: ece19502834d ("net: phy: micrel: 1588 support for LAN8814 phy") Signed-off-by: Horatiu Vultur Reviewed-by: Divya Koppera Link: https://lore.kernel.org/r/20240402071634.2483524-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 784a65669720d27040b02f8ad266ea66353e26e6 Author: Piotr Wejman Date: Mon Apr 1 21:22:39 2024 +0200 net: stmmac: fix rx queue priority assignment commit b3da86d432b7cd65b025a11f68613e333d2483db upstream. The driver should ensure that same priority is not mapped to multiple rx queues. From DesignWare Cores Ethernet Quality-of-Service Databook, section 17.1.29 MAC_RxQ_Ctrl2: "[...]The software must ensure that the content of this field is mutually exclusive to the PSRQ fields for other queues, that is, the same priority is not mapped to multiple Rx queues[...]" Previously rx_queue_priority() function was: - clearing all priorities from a queue - adding new priorities to that queue After this patch it will: - first assign new priorities to a queue - then remove those priorities from all other queues - keep other priorities previously assigned to that queue Fixes: a8f5102af2a7 ("net: stmmac: TX and RX queue priority configuration") Fixes: 2142754f8b9c ("net: stmmac: Add MAC related callbacks for XGMAC2") Signed-off-by: Piotr Wejman Link: https://lore.kernel.org/r/20240401192239.33942-1-piotrwejman90@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit c040b99461a5bfc14c2d0cbb1780fcc3a4706c7e Author: Eric Dumazet Date: Tue Apr 2 13:41:33 2024 +0000 net/sched: fix lockdep splat in qdisc_tree_reduce_backlog() commit 7eb322360b0266481e560d1807ee79e0cef5742b upstream. qdisc_tree_reduce_backlog() is called with the qdisc lock held, not RTNL. We must use qdisc_lookup_rcu() instead of qdisc_lookup() syzbot reported: WARNING: suspicious RCU usage 6.1.74-syzkaller #0 Not tainted ----------------------------- net/sched/sch_api.c:305 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by udevd/1142: #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline] #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline] #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: net_tx_action+0x64a/0x970 net/core/dev.c:5282 #1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:350 [inline] #1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: net_tx_action+0x754/0x970 net/core/dev.c:5297 #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline] #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline] #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: qdisc_tree_reduce_backlog+0x84/0x580 net/sched/sch_api.c:792 stack backtrace: CPU: 1 PID: 1142 Comm: udevd Not tainted 6.1.74-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024 Call Trace: [] __dump_stack lib/dump_stack.c:88 [inline] [] dump_stack_lvl+0x1b1/0x28f lib/dump_stack.c:106 [] dump_stack+0x15/0x1e lib/dump_stack.c:113 [] lockdep_rcu_suspicious+0x1b9/0x260 kernel/locking/lockdep.c:6592 [] qdisc_lookup+0xac/0x6f0 net/sched/sch_api.c:305 [] qdisc_tree_reduce_backlog+0x243/0x580 net/sched/sch_api.c:811 [] pfifo_tail_enqueue+0x32c/0x4b0 net/sched/sch_fifo.c:51 [] qdisc_enqueue include/net/sch_generic.h:833 [inline] [] netem_dequeue+0xeb3/0x15d0 net/sched/sch_netem.c:723 [] dequeue_skb net/sched/sch_generic.c:292 [inline] [] qdisc_restart net/sched/sch_generic.c:397 [inline] [] __qdisc_run+0x249/0x1e60 net/sched/sch_generic.c:415 [] qdisc_run+0xd6/0x260 include/net/pkt_sched.h:125 [] net_tx_action+0x7c9/0x970 net/core/dev.c:5313 [] __do_softirq+0x2bd/0x9bd kernel/softirq.c:616 [] invoke_softirq kernel/softirq.c:447 [inline] [] __irq_exit_rcu+0xca/0x230 kernel/softirq.c:700 [] irq_exit_rcu+0x9/0x20 kernel/softirq.c:712 [] sysvec_apic_timer_interrupt+0x42/0x90 arch/x86/kernel/apic/apic.c:1107 [] asm_sysvec_apic_timer_interrupt+0x1b/0x20 arch/x86/include/asm/idtentry.h:656 Fixes: d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping") Reported-by: syzbot Signed-off-by: Eric Dumazet Reviewed-by: Jiri Pirko Acked-by: Jamal Hadi Salim Link: https://lore.kernel.org/r/20240402134133.2352776-1-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit f4d1fa512b2ab6bee5de1affdaeb5a2fd93e29a6 Author: Christophe JAILLET Date: Tue Apr 2 20:33:56 2024 +0200 net: dsa: sja1105: Fix parameters order in sja1110_pcs_mdio_write_c45() commit c120209bce34c49dcaba32f15679574327d09f63 upstream. The definition and declaration of sja1110_pcs_mdio_write_c45() don't have parameters in the same order. Knowing that sja1110_pcs_mdio_write_c45() is used as a function pointer in 'sja1105_info' structure with .pcs_mdio_write_c45, and that we have: int (*pcs_mdio_write_c45)(struct mii_bus *bus, int phy, int mmd, int reg, u16 val); it is likely that the definition is the one to change. Found with cppcheck, funcArgOrderDifferent. Fixes: ae271547bba6 ("net: dsa: sja1105: C45 only transactions for PCS") Signed-off-by: Christophe JAILLET Reviewed-by: Michael Walle Reviewed-by: Vladimir Oltean Link: https://lore.kernel.org/r/ff2a5af67361988b3581831f7bd1eddebfb4c48f.1712082763.git.christophe.jaillet@wanadoo.fr Signed-off-by: Paolo Abeni Signed-off-by: Greg Kroah-Hartman commit 729ad2ac2a2cdc9f4a4bdfd40bfd276e6bc33924 Author: Eric Dumazet Date: Wed Apr 3 13:09:08 2024 +0000 net/sched: act_skbmod: prevent kernel-infoleak commit d313eb8b77557a6d5855f42d2234bd592c7b50dd upstream. syzbot found that tcf_skbmod_dump() was copying four bytes from kernel stack to user space [1]. The issue here is that 'struct tc_skbmod' has a four bytes hole. We need to clear the structure before filling fields. [1] BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline] BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline] BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185 instrument_copy_to_user include/linux/instrumented.h:114 [inline] copy_to_user_iter lib/iov_iter.c:24 [inline] iterate_ubuf include/linux/iov_iter.h:29 [inline] iterate_and_advance2 include/linux/iov_iter.h:245 [inline] iterate_and_advance include/linux/iov_iter.h:271 [inline] _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185 copy_to_iter include/linux/uio.h:196 [inline] simple_copy_to_iter net/core/datagram.c:532 [inline] __skb_datagram_iter+0x185/0x1000 net/core/datagram.c:420 skb_copy_datagram_iter+0x5c/0x200 net/core/datagram.c:546 skb_copy_datagram_msg include/linux/skbuff.h:4050 [inline] netlink_recvmsg+0x432/0x1610 net/netlink/af_netlink.c:1962 sock_recvmsg_nosec net/socket.c:1046 [inline] sock_recvmsg+0x2c4/0x340 net/socket.c:1068 __sys_recvfrom+0x35a/0x5f0 net/socket.c:2242 __do_sys_recvfrom net/socket.c:2260 [inline] __se_sys_recvfrom net/socket.c:2256 [inline] __x64_sys_recvfrom+0x126/0x1d0 net/socket.c:2256 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 Uninit was stored to memory at: pskb_expand_head+0x30f/0x19d0 net/core/skbuff.c:2253 netlink_trim+0x2c2/0x330 net/netlink/af_netlink.c:1317 netlink_unicast+0x9f/0x1260 net/netlink/af_netlink.c:1351 nlmsg_unicast include/net/netlink.h:1144 [inline] nlmsg_notify+0x21d/0x2f0 net/netlink/af_netlink.c:2610 rtnetlink_send+0x73/0x90 net/core/rtnetlink.c:741 rtnetlink_maybe_send include/linux/rtnetlink.h:17 [inline] tcf_add_notify net/sched/act_api.c:2048 [inline] tcf_action_add net/sched/act_api.c:2071 [inline] tc_ctl_action+0x146e/0x19d0 net/sched/act_api.c:2119 rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595 netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559 rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline] netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361 netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:745 ____sys_sendmsg+0x877/0xb60 net/socket.c:2584 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638 __sys_sendmsg net/socket.c:2667 [inline] __do_sys_sendmsg net/socket.c:2676 [inline] __se_sys_sendmsg net/socket.c:2674 [inline] __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 Uninit was stored to memory at: __nla_put lib/nlattr.c:1041 [inline] nla_put+0x1c6/0x230 lib/nlattr.c:1099 tcf_skbmod_dump+0x23f/0xc20 net/sched/act_skbmod.c:256 tcf_action_dump_old net/sched/act_api.c:1191 [inline] tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227 tcf_action_dump+0x1fd/0x460 net/sched/act_api.c:1251 tca_get_fill+0x519/0x7a0 net/sched/act_api.c:1628 tcf_add_notify_msg net/sched/act_api.c:2023 [inline] tcf_add_notify net/sched/act_api.c:2042 [inline] tcf_action_add net/sched/act_api.c:2071 [inline] tc_ctl_action+0x1365/0x19d0 net/sched/act_api.c:2119 rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595 netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559 rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline] netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361 netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:745 ____sys_sendmsg+0x877/0xb60 net/socket.c:2584 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638 __sys_sendmsg net/socket.c:2667 [inline] __do_sys_sendmsg net/socket.c:2676 [inline] __se_sys_sendmsg net/socket.c:2674 [inline] __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 Local variable opt created at: tcf_skbmod_dump+0x9d/0xc20 net/sched/act_skbmod.c:244 tcf_action_dump_old net/sched/act_api.c:1191 [inline] tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227 Bytes 188-191 of 248 are uninitialized Memory access of size 248 starts at ffff888117697680 Data copied to user address 00007ffe56d855f0 Fixes: 86da71b57383 ("net_sched: Introduce skbmod action") Signed-off-by: Eric Dumazet Acked-by: Jamal Hadi Salim Link: https://lore.kernel.org/r/20240403130908.93421-1-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 3dcaf25993a285b21183278859e0563c09def8af Author: Will Deacon Date: Wed Mar 27 12:48:53 2024 +0000 KVM: arm64: Ensure target address is granule-aligned for range TLBI commit 4c36a156738887c1edd78589fe192d757989bcde upstream. When zapping a table entry in stage2_try_break_pte(), we issue range TLB invalidation for the region that was mapped by the table. However, we neglect to align the base address down to the granule size and so if we ended up reaching the table entry via a misaligned address then we will accidentally skip invalidation for some prefix of the affected address range. Align 'ctx->addr' down to the granule size when performing TLB invalidation for an unmapped table in stage2_try_break_pte(). Cc: Raghavendra Rao Ananta Cc: Gavin Shan Cc: Shaoqin Huang Cc: Quentin Perret Fixes: defc8cc7abf0 ("KVM: arm64: Invalidate the table entries upon a range") Signed-off-by: Will Deacon Reviewed-by: Shaoqin Huang Reviewed-by: Marc Zyngier Link: https://lore.kernel.org/r/20240327124853.11206-5-will@kernel.org Signed-off-by: Oliver Upton Signed-off-by: Greg Kroah-Hartman commit 3ec21104c8815c1787b2b4e5eb88adfc13c685cb Author: Borislav Petkov (AMD) Date: Tue Apr 2 16:05:49 2024 +0200 x86/retpoline: Do the necessary fixup to the Zen3/4 srso return thunk for !SRSO commit 0e110732473e14d6520e49d75d2c88ef7d46fe67 upstream. The srso_alias_untrain_ret() dummy thunk in the !CONFIG_MITIGATION_SRSO case is there only for the altenative in CALL_UNTRAIN_RET to have a symbol to resolve. However, testing with kernels which don't have CONFIG_MITIGATION_SRSO enabled, leads to the warning in patch_return() to fire: missing return thunk: srso_alias_untrain_ret+0x0/0x10-0x0: eb 0e 66 66 2e WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:826 apply_returns (arch/x86/kernel/alternative.c:826 Put in a plain "ret" there so that gcc doesn't put a return thunk in in its place which special and gets checked. In addition: ERROR: modpost: "srso_alias_untrain_ret" [arch/x86/kvm/kvm-amd.ko] undefined! make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Chyba 1 make[1]: *** [/usr/src/linux-6.8.3/Makefile:1873: modpost] Chyba 2 make: *** [Makefile:240: __sub-make] Chyba 2 since !SRSO builds would use the dummy return thunk as reported by petr.pisar@atlas.cz, https://bugzilla.kernel.org/show_bug.cgi?id=218679. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-lkp/202404020901.da75a60f-oliver.sang@intel.com Signed-off-by: Borislav Petkov (AMD) Link: https://lore.kernel.org/all/202404020901.da75a60f-oliver.sang@intel.com/ Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 668b3074aa14829e2ac2759799537a93b60fef86 Author: Jakub Sitnicki Date: Tue Apr 2 12:46:21 2024 +0200 bpf, sockmap: Prevent lock inversion deadlock in map delete elem commit ff91059932401894e6c86341915615c5eb0eca48 upstream. syzkaller started using corpuses where a BPF tracing program deletes elements from a sockmap/sockhash map. Because BPF tracing programs can be invoked from any interrupt context, locks taken during a map_delete_elem operation must be hardirq-safe. Otherwise a deadlock due to lock inversion is possible, as reported by lockdep: CPU0 CPU1 ---- ---- lock(&htab->buckets[i].lock); local_irq_disable(); lock(&host->lock); lock(&htab->buckets[i].lock); lock(&host->lock); Locks in sockmap are hardirq-unsafe by design. We expects elements to be deleted from sockmap/sockhash only in task (normal) context with interrupts enabled, or in softirq context. Detect when map_delete_elem operation is invoked from a context which is _not_ hardirq-unsafe, that is interrupts are disabled, and bail out with an error. Note that map updates are not affected by this issue. BPF verifier does not allow updating sockmap/sockhash from a BPF tracing program today. Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Reported-by: xingwei lee Reported-by: yue sun Reported-by: syzbot+bc922f476bd65abbd466@syzkaller.appspotmail.com Reported-by: syzbot+d4066896495db380182e@syzkaller.appspotmail.com Signed-off-by: Jakub Sitnicki Signed-off-by: Daniel Borkmann Tested-by: syzbot+d4066896495db380182e@syzkaller.appspotmail.com Acked-by: John Fastabend Closes: https://syzkaller.appspot.com/bug?extid=d4066896495db380182e Closes: https://syzkaller.appspot.com/bug?extid=bc922f476bd65abbd466 Link: https://lore.kernel.org/bpf/20240402104621.1050319-1-jakub@cloudflare.com Signed-off-by: Greg Kroah-Hartman commit 55fabde8d9f495d764a34c775cd68035e280a681 Author: Christophe JAILLET Date: Wed Nov 1 11:49:48 2023 +0100 vboxsf: Avoid an spurious warning if load_nls_xxx() fails commit de3f64b738af57e2732b91a0774facc675b75b54 upstream. If an load_nls_xxx() function fails a few lines above, the 'sbi->bdi_id' is still 0. So, in the error handling path, we will call ida_simple_remove(..., 0) which is not allocated yet. In order to prevent a spurious "ida_free called for id=0 which is not allocated." message, tweak the error handling path and add a new label. Fixes: 0fd169576648 ("fs: Add VirtualBox guest shared folder (vboxsf) support") Signed-off-by: Christophe JAILLET Link: https://lore.kernel.org/r/d09eaaa4e2e08206c58a1a27ca9b3e81dc168773.1698835730.git.christophe.jaillet@wanadoo.fr Reviewed-by: Hans de Goede Signed-off-by: Hans de Goede Signed-off-by: Greg Kroah-Hartman commit 81d51b9b7c95e791ba3c1a2dd77920a9d3b3f525 Author: Eric Dumazet Date: Thu Apr 4 12:20:51 2024 +0000 netfilter: validate user input for expected length commit 0c83842df40f86e529db6842231154772c20edcc upstream. I got multiple syzbot reports showing old bugs exposed by BPF after commit 20f2505fb436 ("bpf: Try to avoid kzalloc in cgroup/{s,g}etsockopt") setsockopt() @optlen argument should be taken into account before copying data. BUG: KASAN: slab-out-of-bounds in copy_from_sockptr_offset include/linux/sockptr.h:49 [inline] BUG: KASAN: slab-out-of-bounds in copy_from_sockptr include/linux/sockptr.h:55 [inline] BUG: KASAN: slab-out-of-bounds in do_replace net/ipv4/netfilter/ip_tables.c:1111 [inline] BUG: KASAN: slab-out-of-bounds in do_ipt_set_ctl+0x902/0x3dd0 net/ipv4/netfilter/ip_tables.c:1627 Read of size 96 at addr ffff88802cd73da0 by task syz-executor.4/7238 CPU: 1 PID: 7238 Comm: syz-executor.4 Not tainted 6.9.0-rc2-next-20240403-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189 __asan_memcpy+0x29/0x70 mm/kasan/shadow.c:105 copy_from_sockptr_offset include/linux/sockptr.h:49 [inline] copy_from_sockptr include/linux/sockptr.h:55 [inline] do_replace net/ipv4/netfilter/ip_tables.c:1111 [inline] do_ipt_set_ctl+0x902/0x3dd0 net/ipv4/netfilter/ip_tables.c:1627 nf_setsockopt+0x295/0x2c0 net/netfilter/nf_sockopt.c:101 do_sock_setsockopt+0x3af/0x720 net/socket.c:2311 __sys_setsockopt+0x1ae/0x250 net/socket.c:2334 __do_sys_setsockopt net/socket.c:2343 [inline] __se_sys_setsockopt net/socket.c:2340 [inline] __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340 do_syscall_64+0xfb/0x240 entry_SYSCALL_64_after_hwframe+0x72/0x7a RIP: 0033:0x7fd22067dde9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fd21f9ff0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 00007fd2207abf80 RCX: 00007fd22067dde9 RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 00007fd2206ca47a R08: 0000000000000001 R09: 0000000000000000 R10: 0000000020000880 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007fd2207abf80 R15: 00007ffd2d0170d8 Allocated by task 7238: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:370 [inline] __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387 kasan_kmalloc include/linux/kasan.h:211 [inline] __do_kmalloc_node mm/slub.c:4069 [inline] __kmalloc_noprof+0x200/0x410 mm/slub.c:4082 kmalloc_noprof include/linux/slab.h:664 [inline] __cgroup_bpf_run_filter_setsockopt+0xd47/0x1050 kernel/bpf/cgroup.c:1869 do_sock_setsockopt+0x6b4/0x720 net/socket.c:2293 __sys_setsockopt+0x1ae/0x250 net/socket.c:2334 __do_sys_setsockopt net/socket.c:2343 [inline] __se_sys_setsockopt net/socket.c:2340 [inline] __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340 do_syscall_64+0xfb/0x240 entry_SYSCALL_64_after_hwframe+0x72/0x7a The buggy address belongs to the object at ffff88802cd73da0 which belongs to the cache kmalloc-8 of size 8 The buggy address is located 0 bytes inside of allocated 1-byte region [ffff88802cd73da0, ffff88802cd73da1) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88802cd73020 pfn:0x2cd73 flags: 0xfff80000000000(node=0|zone=1|lastcpupid=0xfff) page_type: 0xffffefff(slab) raw: 00fff80000000000 ffff888015041280 dead000000000100 dead000000000122 raw: ffff88802cd73020 000000008080007f 00000001ffffefff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5103, tgid 2119833701 (syz-executor.4), ts 5103, free_ts 70804600828 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1490 prep_new_page mm/page_alloc.c:1498 [inline] get_page_from_freelist+0x2e7e/0x2f40 mm/page_alloc.c:3454 __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4712 __alloc_pages_node_noprof include/linux/gfp.h:244 [inline] alloc_pages_node_noprof include/linux/gfp.h:271 [inline] alloc_slab_page+0x5f/0x120 mm/slub.c:2249 allocate_slab+0x5a/0x2e0 mm/slub.c:2412 new_slab mm/slub.c:2465 [inline] ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3615 __slab_alloc+0x58/0xa0 mm/slub.c:3705 __slab_alloc_node mm/slub.c:3758 [inline] slab_alloc_node mm/slub.c:3936 [inline] __do_kmalloc_node mm/slub.c:4068 [inline] kmalloc_node_track_caller_noprof+0x286/0x450 mm/slub.c:4089 kstrdup+0x3a/0x80 mm/util.c:62 device_rename+0xb5/0x1b0 drivers/base/core.c:4558 dev_change_name+0x275/0x860 net/core/dev.c:1232 do_setlink+0xa4b/0x41f0 net/core/rtnetlink.c:2864 __rtnl_newlink net/core/rtnetlink.c:3680 [inline] rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3727 rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6594 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline] netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361 page last free pid 5146 tgid 5146 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1110 [inline] free_unref_page+0xd3c/0xec0 mm/page_alloc.c:2617 discard_slab mm/slub.c:2511 [inline] __put_partials+0xeb/0x130 mm/slub.c:2980 put_cpu_partial+0x17c/0x250 mm/slub.c:3055 __slab_free+0x2ea/0x3d0 mm/slub.c:4254 qlink_free mm/kasan/quarantine.c:163 [inline] qlist_free_all+0x9e/0x140 mm/kasan/quarantine.c:179 kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286 __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:322 kasan_slab_alloc include/linux/kasan.h:201 [inline] slab_post_alloc_hook mm/slub.c:3888 [inline] slab_alloc_node mm/slub.c:3948 [inline] __do_kmalloc_node mm/slub.c:4068 [inline] __kmalloc_node_noprof+0x1d7/0x450 mm/slub.c:4076 kmalloc_node_noprof include/linux/slab.h:681 [inline] kvmalloc_node_noprof+0x72/0x190 mm/util.c:634 bucket_table_alloc lib/rhashtable.c:186 [inline] rhashtable_rehash_alloc+0x9e/0x290 lib/rhashtable.c:367 rht_deferred_worker+0x4e1/0x2440 lib/rhashtable.c:427 process_one_work kernel/workqueue.c:3218 [inline] process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299 worker_thread+0x86d/0xd70 kernel/workqueue.c:3380 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243 Memory state around the buggy address: ffff88802cd73c80: 07 fc fc fc 05 fc fc fc 05 fc fc fc fa fc fc fc ffff88802cd73d00: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc >ffff88802cd73d80: fa fc fc fc 01 fc fc fc fa fc fc fc fa fc fc fc ^ ffff88802cd73e00: fa fc fc fc fa fc fc fc 05 fc fc fc 07 fc fc fc ffff88802cd73e80: 07 fc fc fc 07 fc fc fc 07 fc fc fc 07 fc fc fc Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot Signed-off-by: Eric Dumazet Reviewed-by: Pablo Neira Ayuso Link: https://lore.kernel.org/r/20240404122051.2303764-1-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 9627fd0c6ea1c446741a33e67bc5709c59923827 Author: Pablo Neira Ayuso Date: Wed Apr 3 19:35:30 2024 +0200 netfilter: nf_tables: discard table flag update with pending basechain deletion commit 1bc83a019bbe268be3526406245ec28c2458a518 upstream. Hook unregistration is deferred to the commit phase, same occurs with hook updates triggered by the table dormant flag. When both commands are combined, this results in deleting a basechain while leaving its hook still registered in the core. Fixes: 179d9ba5559a ("netfilter: nf_tables: fix table flag updates") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 8b891153b2e4dc0ca9d9dab8f619d49c740813df Author: Ziyang Xuan Date: Wed Apr 3 15:22:04 2024 +0800 netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get() commit 24225011d81b471acc0e1e315b7d9905459a6304 upstream. nft_unregister_flowtable_type() within nf_flow_inet_module_exit() can concurrent with __nft_flowtable_type_get() within nf_tables_newflowtable(). And thhere is not any protection when iterate over nf_tables_flowtables list in __nft_flowtable_type_get(). Therefore, there is pertential data-race of nf_tables_flowtables list entry. Use list_for_each_entry_rcu() to iterate over nf_tables_flowtables list in __nft_flowtable_type_get(), and use rcu_read_lock() in the caller nft_flowtable_type_get() to protect the entire type query process. Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend") Signed-off-by: Ziyang Xuan Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 333b5085522cf1898d5a0d92616046b414f631a7 Author: Pablo Neira Ayuso Date: Tue Apr 2 18:04:36 2024 +0200 netfilter: nf_tables: flush pending destroy work before exit_net release commit 24cea9677025e0de419989ecb692acd4bb34cac2 upstream. Similar to 2c9f0293280e ("netfilter: nf_tables: flush pending destroy work before netlink notifier") to address a race between exit_net and the destroy workqueue. The trace below shows an element to be released via destroy workqueue while exit_net path (triggered via module removal) has already released the set that is used in such transaction. [ 1360.547789] BUG: KASAN: slab-use-after-free in nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables] [ 1360.547861] Read of size 8 at addr ffff888140500cc0 by task kworker/4:1/152465 [ 1360.547870] CPU: 4 PID: 152465 Comm: kworker/4:1 Not tainted 6.8.0+ #359 [ 1360.547882] Workqueue: events nf_tables_trans_destroy_work [nf_tables] [ 1360.547984] Call Trace: [ 1360.547991] [ 1360.547998] dump_stack_lvl+0x53/0x70 [ 1360.548014] print_report+0xc4/0x610 [ 1360.548026] ? __virt_addr_valid+0xba/0x160 [ 1360.548040] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 1360.548054] ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables] [ 1360.548176] kasan_report+0xae/0xe0 [ 1360.548189] ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables] [ 1360.548312] nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables] [ 1360.548447] ? __pfx_nf_tables_trans_destroy_work+0x10/0x10 [nf_tables] [ 1360.548577] ? _raw_spin_unlock_irq+0x18/0x30 [ 1360.548591] process_one_work+0x2f1/0x670 [ 1360.548610] worker_thread+0x4d3/0x760 [ 1360.548627] ? __pfx_worker_thread+0x10/0x10 [ 1360.548640] kthread+0x16b/0x1b0 [ 1360.548653] ? __pfx_kthread+0x10/0x10 [ 1360.548665] ret_from_fork+0x2f/0x50 [ 1360.548679] ? __pfx_kthread+0x10/0x10 [ 1360.548690] ret_from_fork_asm+0x1a/0x30 [ 1360.548707] [ 1360.548719] Allocated by task 192061: [ 1360.548726] kasan_save_stack+0x20/0x40 [ 1360.548739] kasan_save_track+0x14/0x30 [ 1360.548750] __kasan_kmalloc+0x8f/0xa0 [ 1360.548760] __kmalloc_node+0x1f1/0x450 [ 1360.548771] nf_tables_newset+0x10c7/0x1b50 [nf_tables] [ 1360.548883] nfnetlink_rcv_batch+0xbc4/0xdc0 [nfnetlink] [ 1360.548909] nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink] [ 1360.548927] netlink_unicast+0x367/0x4f0 [ 1360.548935] netlink_sendmsg+0x34b/0x610 [ 1360.548944] ____sys_sendmsg+0x4d4/0x510 [ 1360.548953] ___sys_sendmsg+0xc9/0x120 [ 1360.548961] __sys_sendmsg+0xbe/0x140 [ 1360.548971] do_syscall_64+0x55/0x120 [ 1360.548982] entry_SYSCALL_64_after_hwframe+0x55/0x5d [ 1360.548994] Freed by task 192222: [ 1360.548999] kasan_save_stack+0x20/0x40 [ 1360.549009] kasan_save_track+0x14/0x30 [ 1360.549019] kasan_save_free_info+0x3b/0x60 [ 1360.549028] poison_slab_object+0x100/0x180 [ 1360.549036] __kasan_slab_free+0x14/0x30 [ 1360.549042] kfree+0xb6/0x260 [ 1360.549049] __nft_release_table+0x473/0x6a0 [nf_tables] [ 1360.549131] nf_tables_exit_net+0x170/0x240 [nf_tables] [ 1360.549221] ops_exit_list+0x50/0xa0 [ 1360.549229] free_exit_list+0x101/0x140 [ 1360.549236] unregister_pernet_operations+0x107/0x160 [ 1360.549245] unregister_pernet_subsys+0x1c/0x30 [ 1360.549254] nf_tables_module_exit+0x43/0x80 [nf_tables] [ 1360.549345] __do_sys_delete_module+0x253/0x370 [ 1360.549352] do_syscall_64+0x55/0x120 [ 1360.549360] entry_SYSCALL_64_after_hwframe+0x55/0x5d (gdb) list *__nft_release_table+0x473 0x1e033 is in __nft_release_table (net/netfilter/nf_tables_api.c:11354). 11349 list_for_each_entry_safe(flowtable, nf, &table->flowtables, list) { 11350 list_del(&flowtable->list); 11351 nft_use_dec(&table->use); 11352 nf_tables_flowtable_destroy(flowtable); 11353 } 11354 list_for_each_entry_safe(set, ns, &table->sets, list) { 11355 list_del(&set->list); 11356 nft_use_dec(&table->use); 11357 if (set->flags & (NFT_SET_MAP | NFT_SET_OBJECT)) 11358 nft_map_deactivate(&ctx, set); (gdb) [ 1360.549372] Last potentially related work creation: [ 1360.549376] kasan_save_stack+0x20/0x40 [ 1360.549384] __kasan_record_aux_stack+0x9b/0xb0 [ 1360.549392] __queue_work+0x3fb/0x780 [ 1360.549399] queue_work_on+0x4f/0x60 [ 1360.549407] nft_rhash_remove+0x33b/0x340 [nf_tables] [ 1360.549516] nf_tables_commit+0x1c6a/0x2620 [nf_tables] [ 1360.549625] nfnetlink_rcv_batch+0x728/0xdc0 [nfnetlink] [ 1360.549647] nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink] [ 1360.549671] netlink_unicast+0x367/0x4f0 [ 1360.549680] netlink_sendmsg+0x34b/0x610 [ 1360.549690] ____sys_sendmsg+0x4d4/0x510 [ 1360.549697] ___sys_sendmsg+0xc9/0x120 [ 1360.549706] __sys_sendmsg+0xbe/0x140 [ 1360.549715] do_syscall_64+0x55/0x120 [ 1360.549725] entry_SYSCALL_64_after_hwframe+0x55/0x5d Fixes: 0935d5588400 ("netfilter: nf_tables: asynchronous release") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 420132bee3d0136b7fba253a597b098fe15493a7 Author: Pablo Neira Ayuso Date: Mon Apr 1 00:33:02 2024 +0200 netfilter: nf_tables: reject new basechain after table flag update commit 994209ddf4f430946f6247616b2e33d179243769 upstream. When dormant flag is toggled, hooks are disabled in the commit phase by iterating over current chains in table (existing and new). The following configuration allows for an inconsistent state: add table x add chain x y { type filter hook input priority 0; } add table x { flags dormant; } add chain x w { type filter hook input priority 1; } which triggers the following warning when trying to unregister chain w which is already unregistered. [ 127.322252] WARNING: CPU: 7 PID: 1211 at net/netfilter/core.c:50 1 __nf_unregister_net_hook+0x21a/0x260 [...] [ 127.322519] Call Trace: [ 127.322521] [ 127.322524] ? __warn+0x9f/0x1a0 [ 127.322531] ? __nf_unregister_net_hook+0x21a/0x260 [ 127.322537] ? report_bug+0x1b1/0x1e0 [ 127.322545] ? handle_bug+0x3c/0x70 [ 127.322552] ? exc_invalid_op+0x17/0x40 [ 127.322556] ? asm_exc_invalid_op+0x1a/0x20 [ 127.322563] ? kasan_save_free_info+0x3b/0x60 [ 127.322570] ? __nf_unregister_net_hook+0x6a/0x260 [ 127.322577] ? __nf_unregister_net_hook+0x21a/0x260 [ 127.322583] ? __nf_unregister_net_hook+0x6a/0x260 [ 127.322590] ? __nf_tables_unregister_hook+0x8a/0xe0 [nf_tables] [ 127.322655] nft_table_disable+0x75/0xf0 [nf_tables] [ 127.322717] nf_tables_commit+0x2571/0x2620 [nf_tables] Fixes: 179d9ba5559a ("netfilter: nf_tables: fix table flag updates") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit e40f32f17642a1ea36c79d102e0b680f79793cbc Author: Borislav Petkov (AMD) Date: Thu Mar 28 13:59:05 2024 +0100 x86/bugs: Fix the SRSO mitigation on Zen3/4 Commit 4535e1a4174c4111d92c5a9a21e542d232e0fcaa upstream. The original version of the mitigation would patch in the calls to the untraining routines directly. That is, the alternative() in UNTRAIN_RET will patch in the CALL to srso_alias_untrain_ret() directly. However, even if commit e7c25c441e9e ("x86/cpu: Cleanup the untrain mess") meant well in trying to clean up the situation, due to micro- architectural reasons, the untraining routine srso_alias_untrain_ret() must be the target of a CALL instruction and not of a JMP instruction as it is done now. Reshuffle the alternative macros to accomplish that. Fixes: e7c25c441e9e ("x86/cpu: Cleanup the untrain mess") Signed-off-by: Borislav Petkov (AMD) Reviewed-by: Ingo Molnar Cc: stable@kernel.org Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 93eae88e34f65652a4b7c142fbec05ae0ae458f7 Author: Josh Poimboeuf Date: Mon Sep 4 22:05:03 2023 -0700 x86/nospec: Refactor UNTRAIN_RET[_*] Commit e8efc0800b8b5045ba8c0d1256bfbb47e92e192a upstream. Factor out the UNTRAIN_RET[_*] common bits into a helper macro. Signed-off-by: Josh Poimboeuf Signed-off-by: Ingo Molnar Signed-off-by: Borislav Petkov (AMD) Acked-by: Borislav Petkov (AMD) Link: https://lore.kernel.org/r/f06d45489778bd49623297af2a983eea09067a74.1693889988.git.jpoimboe@kernel.org Signed-off-by: Greg Kroah-Hartman commit 820a3626f3d7ae2656098052f902ea01e2b6aa49 Author: Josh Poimboeuf Date: Mon Sep 4 22:05:00 2023 -0700 x86/srso: Disentangle rethunk-dependent options Commit 34a3cae7474c6e6f4a85aad4a7b8191b8b35cdcd upstream. CONFIG_RETHUNK, CONFIG_CPU_UNRET_ENTRY and CONFIG_CPU_SRSO are all tangled up. De-spaghettify the code a bit. Some of the rethunk-related code has been shuffled around within the '.text..__x86.return_thunk' section, but otherwise there are no functional changes. srso_alias_untrain_ret() and srso_alias_safe_ret() ((which are very address-sensitive) haven't moved. Signed-off-by: Josh Poimboeuf Signed-off-by: Ingo Molnar Signed-off-by: Borislav Petkov (AMD) Acked-by: Borislav Petkov (AMD) Link: https://lore.kernel.org/r/2845084ed303d8384905db3b87b77693945302b4.1693889988.git.jpoimboe@kernel.org Signed-off-by: Greg Kroah-Hartman commit 6b10edf9164058ff54a8984206a38961697b98ab Author: Josh Poimboeuf Date: Mon Sep 4 22:04:55 2023 -0700 x86/srso: Improve i-cache locality for alias mitigation Commit aa730cff0c26244e88066b5b461a9f5fbac13823 upstream. Move srso_alias_return_thunk() to the same section as srso_alias_safe_ret() so they can share a cache line. Signed-off-by: Josh Poimboeuf Signed-off-by: Ingo Molnar Signed-off-by: Borislav Petkov (AMD) Acked-by: Borislav Petkov (AMD) Link: https://lore.kernel.org/r/eadaf5530b46a7ae8b936522da45ae555d2b3393.1693889988.git.jpoimboe@kernel.org Signed-off-by: Greg Kroah-Hartman commit 065012bb777711f4fb35502e92478637f48b74b9 Author: Marco Pinna Date: Fri Mar 29 17:12:59 2024 +0100 vsock/virtio: fix packet delivery to tap device commit b32a09ea7c38849ff925489a6bf5bd8914bc45df upstream. Commit 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks") added virtio_transport_deliver_tap_pkt() for handing packets to the vsockmon device. However, in virtio_transport_send_pkt_work(), the function is called before actually sending the packet (i.e. before placing it in the virtqueue with virtqueue_add_sgs() and checking whether it returned successfully). Queuing the packet in the virtqueue can fail even multiple times. However, in virtio_transport_deliver_tap_pkt() we deliver the packet to the monitoring tap interface only the first time we call it. This certainly avoids seeing the same packet replicated multiple times in the monitoring interface, but it can show the packet sent with the wrong timestamp or even before we succeed to queue it in the virtqueue. Move virtio_transport_deliver_tap_pkt() after calling virtqueue_add_sgs() and making sure it returned successfully. Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks") Cc: stable@vge.kernel.org Signed-off-by: Marco Pinna Reviewed-by: Stefano Garzarella Link: https://lore.kernel.org/r/20240329161259.411751-1-marco.pinn95@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit ca58927b00385005f488b6a9905ced7a4f719aad Author: Haiyang Zhang Date: Tue Apr 2 12:48:36 2024 -0700 net: mana: Fix Rx DMA datasize and skb_over_panic commit c0de6ab920aafb56feab56058e46b688e694a246 upstream. mana_get_rxbuf_cfg() aligns the RX buffer's DMA datasize to be multiple of 64. So a packet slightly bigger than mtu+14, say 1536, can be received and cause skb_over_panic. Sample dmesg: [ 5325.237162] skbuff: skb_over_panic: text:ffffffffc043277a len:1536 put:1536 head:ff1100018b517000 data:ff1100018b517100 tail:0x700 end:0x6ea dev: [ 5325.243689] ------------[ cut here ]------------ [ 5325.245748] kernel BUG at net/core/skbuff.c:192! [ 5325.247838] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 5325.258374] RIP: 0010:skb_panic+0x4f/0x60 [ 5325.302941] Call Trace: [ 5325.304389] [ 5325.315794] ? skb_panic+0x4f/0x60 [ 5325.317457] ? asm_exc_invalid_op+0x1f/0x30 [ 5325.319490] ? skb_panic+0x4f/0x60 [ 5325.321161] skb_put+0x4e/0x50 [ 5325.322670] mana_poll+0x6fa/0xb50 [mana] [ 5325.324578] __napi_poll+0x33/0x1e0 [ 5325.326328] net_rx_action+0x12e/0x280 As discussed internally, this alignment is not necessary. To fix this bug, remove it from the code. So oversized packets will be marked as CQE_RX_TRUNCATED by NIC, and dropped. Cc: stable@vger.kernel.org Fixes: 2fbbd712baf1 ("net: mana: Enable RX path to handle various MTU sizes") Signed-off-by: Haiyang Zhang Reviewed-by: Dexuan Cui Link: https://lore.kernel.org/r/1712087316-20886-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 426366d577e9712b2cabc95fc899d4f6f85105ba Author: Jose Ignacio Tornos Martinez Date: Wed Apr 3 15:21:58 2024 +0200 net: usb: ax88179_178a: avoid the interface always configured as random address commit 2e91bb99b9d4f756e92e83c4453f894dda220f09 upstream. After the commit d2689b6a86b9 ("net: usb: ax88179_178a: avoid two consecutive device resets"), reset is not executed from bind operation and mac address is not read from the device registers or the devicetree at that moment. Since the check to configure if the assigned mac address is random or not for the interface, happens after the bind operation from usbnet_probe, the interface keeps configured as random address, although the address is correctly read and set during open operation (the only reset now). In order to keep only one reset for the device and to avoid the interface always configured as random address, after reset, configure correctly the suitable field from the driver, if the mac address is read successfully from the device registers or the devicetree. Take into account if a locally administered address (random) was previously stored. cc: stable@vger.kernel.org # 6.6+ Fixes: d2689b6a86b9 ("net: usb: ax88179_178a: avoid two consecutive device resets") Reported-by: Dave Stevenson Signed-off-by: Jose Ignacio Tornos Martinez Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240403132158.344838-1-jtornosm@redhat.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 92309bed3c5fbe2ccd4c45056efd42edbd06162d Author: Mahmoud Adam Date: Tue Mar 26 16:31:33 2024 +0100 net/rds: fix possible cp null dereference commit 62fc3357e079a07a22465b9b6ef71bb6ea75ee4b upstream. cp might be null, calling cp->cp_conn would produce null dereference [Simon Horman adds:] Analysis: * cp is a parameter of __rds_rdma_map and is not reassigned. * The following call-sites pass a NULL cp argument to __rds_rdma_map() - rds_get_mr() - rds_get_mr_for_dest * Prior to the code above, the following assumes that cp may be NULL (which is indicative, but could itself be unnecessary) trans_private = rs->rs_transport->get_mr( sg, nents, rs, &mr->r_key, cp ? cp->cp_conn : NULL, args->vec.addr, args->vec.bytes, need_odp ? ODP_ZEROBASED : ODP_NOT_NEEDED); * The code modified by this patch is guarded by IS_ERR(trans_private), where trans_private is assigned as per the previous point in this analysis. The only implementation of get_mr that I could locate is rds_ib_get_mr() which can return an ERR_PTR if the conn (4th) argument is NULL. * ret is set to PTR_ERR(trans_private). rds_ib_get_mr can return ERR_PTR(-ENODEV) if the conn (4th) argument is NULL. Thus ret may be -ENODEV in which case the code in question will execute. Conclusion: * cp may be NULL at the point where this patch adds a check; this patch does seem to address a possible bug Fixes: c055fc00c07b ("net/rds: fix WARNING in rds_conn_connect_if_down") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Mahmoud Adam Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240326153132.55580-1-mngyadam@amazon.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 27aa3e4b3088426b7e34584274ad45b5afaf7629 Author: Jesper Dangaard Brouer Date: Wed Mar 27 13:14:56 2024 +0100 xen-netfront: Add missing skb_mark_for_recycle commit 037965402a010898d34f4e35327d22c0a95cd51f upstream. Notice that skb_mark_for_recycle() is introduced later than fixes tag in commit 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling"). It is believed that fixes tag were missing a call to page_pool_release_page() between v5.9 to v5.14, after which is should have used skb_mark_for_recycle(). Since v6.6 the call page_pool_release_page() were removed (in commit 535b9c61bdef ("net: page_pool: hide page_pool_release_page()") and remaining callers converted (in commit 6bfef2ec0172 ("Merge branch 'net-page_pool-remove-page_pool_release_page'")). This leak became visible in v6.8 via commit dba1b8a7ab68 ("mm/page_pool: catch page_pool memory leaks"). Cc: stable@vger.kernel.org Fixes: 6c5aa6fc4def ("xen networking: add basic XDP support for xen-netfront") Reported-by: Leonidas Spyropoulos Link: https://bugzilla.kernel.org/show_bug.cgi?id=218654 Reported-by: Arthur Borsboom Signed-off-by: Jesper Dangaard Brouer Link: https://lore.kernel.org/r/171154167446.2671062.9127105384591237363.stgit@firesoul Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 117eed2997bc5270fb2e6b17b2d543712c1b120a Author: Geliang Tang Date: Fri Mar 29 13:08:53 2024 +0100 selftests: mptcp: join: fix dev in check_endpoint commit 40061817d95bce6dd5634a61a65cd5922e6ccc92 upstream. There's a bug in pm_nl_check_endpoint(), 'dev' didn't be parsed correctly. If calling it in the 2nd test of endpoint_tests() too, it fails with an error like this: creation [FAIL] expected '10.0.2.2 id 2 subflow dev dev' \ found '10.0.2.2 id 2 subflow dev ns2eth2' The reason is '$2' should be set to 'dev', not '$1'. This patch fixes it. Fixes: 69c6ce7b6eca ("selftests: mptcp: add implicit endpoint test case") Cc: stable@vger.kernel.org Signed-off-by: Geliang Tang Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://lore.kernel.org/r/20240329-upstream-net-20240329-fallback-mib-v1-2-324a8981da48@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 8038ee3c3e5b59bcd78467686db5270c68544e30 Author: Pablo Neira Ayuso Date: Thu Mar 28 14:23:55 2024 +0100 netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path commit 0d459e2ffb541841714839e8228b845458ed3b27 upstream. The commit mutex should not be released during the critical section between nft_gc_seq_begin() and nft_gc_seq_end(), otherwise, async GC worker could collect expired objects and get the released commit lock within the same GC sequence. nf_tables_module_autoload() temporarily releases the mutex to load module dependencies, then it goes back to replay the transaction again. Move it at the end of the abort phase after nft_gc_seq_end() is called. Cc: stable@vger.kernel.org Fixes: 720344340fb9 ("netfilter: nf_tables: GC transaction race with abort path") Reported-by: Kuan-Ting Chen Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit b0b36dcbe0f24383612e5e62bd48df5a8107f7fc Author: Pablo Neira Ayuso Date: Thu Mar 28 13:27:36 2024 +0100 netfilter: nf_tables: release batch on table validation from abort path commit a45e6889575c2067d3c0212b6bc1022891e65b91 upstream. Unlike early commit path stage which triggers a call to abort, an explicit release of the batch is required on abort, otherwise mutex is released and commit_list remains in place. Add WARN_ON_ONCE to ensure commit_list is empty from the abort path before releasing the mutex. After this patch, commit_list is always assumed to be empty before grabbing the mutex, therefore 03c1f1ef1584 ("netfilter: Cleanup nft_net->module_list from nf_tables_exit_net()") only needs to release the pending modules for registration. Cc: stable@vger.kernel.org Fixes: c0391b6ab810 ("netfilter: nf_tables: missing validation from the abort path") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit d75632d0db3cdc31873d25756066a7f56bc87737 Author: Bastien Nocera Date: Wed Mar 27 15:24:56 2024 +0100 Bluetooth: Fix TOCTOU in HCI debugfs implementation commit 7835fcfd132eb88b87e8eb901f88436f63ab60f7 upstream. struct hci_dev members conn_info_max_age, conn_info_min_age, le_conn_max_interval, le_conn_min_interval, le_adv_max_interval, and le_adv_min_interval can be modified from the HCI core code, as well through debugfs. The debugfs implementation, that's only available to privileged users, will check for boundaries, making sure that the minimum value being set is strictly above the maximum value that already exists, and vice-versa. However, as both minimum and maximum values can be changed concurrently to us modifying them, we need to make sure that the value we check is the value we end up using. For example, with ->conn_info_max_age set to 10, conn_info_min_age_set() gets called from vfs handlers to set conn_info_min_age to 8. In conn_info_min_age_set(), this goes through: if (val == 0 || val > hdev->conn_info_max_age) return -EINVAL; Concurrently, conn_info_max_age_set() gets called to set to set the conn_info_max_age to 7: if (val == 0 || val > hdev->conn_info_max_age) return -EINVAL; That check will also pass because we used the old value (10) for conn_info_max_age. After those checks that both passed, the struct hci_dev access is mutex-locked, disabling concurrent access, but that does not matter because the invalid value checks both passed, and we'll end up with conn_info_min_age = 8 and conn_info_max_age = 7 To fix this problem, we need to lock the structure access before so the check and assignment are not interrupted. This fix was originally devised by the BassCheck[1] team, and considered the problem to be an atomicity one. This isn't the case as there aren't any concerns about the variable changing while we check it, but rather after we check it parallel to another change. This patch fixes CVE-2024-24858 and CVE-2024-24857. [1] https://sites.google.com/view/basscheck/ Co-developed-by: Gui-Dong Han <2045gemini@gmail.com> Signed-off-by: Gui-Dong Han <2045gemini@gmail.com> Link: https://lore.kernel.org/linux-bluetooth/20231222161317.6255-1-2045gemini@gmail.com/ Link: https://nvd.nist.gov/vuln/detail/CVE-2024-24858 Link: https://lore.kernel.org/linux-bluetooth/20231222162931.6553-1-2045gemini@gmail.com/ Link: https://lore.kernel.org/linux-bluetooth/20231222162310.6461-1-2045gemini@gmail.com/ Link: https://nvd.nist.gov/vuln/detail/CVE-2024-24857 Fixes: 31ad169148df ("Bluetooth: Add conn info lifetime parameters to debugfs") Fixes: 729a1051da6f ("Bluetooth: Expose default LE advertising interval via debugfs") Fixes: 71c3b60ec6d2 ("Bluetooth: Move BR/EDR debugfs file creation into hci_debugfs.c") Signed-off-by: Bastien Nocera Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Greg Kroah-Hartman commit 4a32840119d0d656a0750a8239d8e1d3ed6397a9 Author: Hui Wang Date: Wed Mar 27 12:30:30 2024 +0800 Bluetooth: hci_event: set the conn encrypted before conn establishes commit c569242cd49287d53b73a94233db40097d838535 upstream. We have a BT headset (Lenovo Thinkplus XT99), the pairing and connecting has no problem, once this headset is paired, bluez will remember this device and will auto re-connect it whenever the device is powered on. The auto re-connecting works well with Windows and Android, but with Linux, it always fails. Through debugging, we found at the rfcomm connection stage, the bluetooth stack reports "Connection refused - security block (0x0003)". For this device, the re-connecting negotiation process is different from other BT headsets, it sends the Link_KEY_REQUEST command before the CONNECT_REQUEST completes, and it doesn't send ENCRYPT_CHANGE command during the negotiation. When the device sends the "connect complete" to hci, the ev->encr_mode is 1. So here in the conn_complete_evt(), if ev->encr_mode is 1, link type is ACL and HCI_CONN_ENCRYPT is not set, we set HCI_CONN_ENCRYPT to this conn, and update conn->enc_key_size accordingly. After this change, this BT headset could re-connect with Linux successfully. This is the btmon log after applying the patch, after receiving the "Connect Complete" with "Encryption: Enabled", will send the command to read encryption key size: > HCI Event: Connect Request (0x04) plen 10 Address: 8C:3C:AA:D8:11:67 (OUI 8C-3C-AA) Class: 0x240404 Major class: Audio/Video (headset, speaker, stereo, video, vcr) Minor class: Wearable Headset Device Rendering (Printing, Speaker) Audio (Speaker, Microphone, Headset) Link type: ACL (0x01) ... > HCI Event: Link Key Request (0x17) plen 6 Address: 8C:3C:AA:D8:11:67 (OUI 8C-3C-AA) < HCI Command: Link Key Request Reply (0x01|0x000b) plen 22 Address: 8C:3C:AA:D8:11:67 (OUI 8C-3C-AA) Link key: ${32-hex-digits-key} ... > HCI Event: Connect Complete (0x03) plen 11 Status: Success (0x00) Handle: 256 Address: 8C:3C:AA:D8:11:67 (OUI 8C-3C-AA) Link type: ACL (0x01) Encryption: Enabled (0x01) < HCI Command: Read Encryption Key... (0x05|0x0008) plen 2 Handle: 256 < ACL Data TX: Handle 256 flags 0x00 dlen 10 L2CAP: Information Request (0x0a) ident 1 len 2 Type: Extended features supported (0x0002) > HCI Event: Command Complete (0x0e) plen 7 Read Encryption Key Size (0x05|0x0008) ncmd 1 Status: Success (0x00) Handle: 256 Key size: 16 Cc: stable@vger.kernel.org Link: https://github.com/bluez/bluez/issues/704 Reviewed-by: Paul Menzel Reviewed-by: Luiz Augusto von Dentz Signed-off-by: Hui Wang Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Greg Kroah-Hartman commit 57e089d33b9654cc46f038f4382f1c8499f40e17 Author: Johan Hovold Date: Wed Mar 20 08:55:53 2024 +0100 Bluetooth: add quirk for broken address properties commit 39646f29b100566451d37abc4cc8cdd583756dfe upstream. Some Bluetooth controllers lack persistent storage for the device address and instead one can be provided by the boot firmware using the 'local-bd-address' devicetree property. The Bluetooth devicetree bindings clearly states that the address should be specified in little-endian order, but due to a long-standing bug in the Qualcomm driver which reversed the address some boot firmware has been providing the address in big-endian order instead. Add a new quirk that can be set on platforms with broken firmware and use it to reverse the address when parsing the property so that the underlying driver bug can be fixed. Fixes: 5c0a1001c8be ("Bluetooth: hci_qca: Add helper to set device address") Cc: stable@vger.kernel.org # 5.1 Reviewed-by: Douglas Anderson Signed-off-by: Johan Hovold Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Greg Kroah-Hartman commit 1622e563b8190cdfb3a8d3eb609061befe9078af Author: Johan Hovold Date: Wed Mar 20 08:55:54 2024 +0100 Bluetooth: qca: fix device-address endianness commit 77f45cca8bc55d00520a192f5a7715133591c83e upstream. The WCN6855 firmware on the Lenovo ThinkPad X13s expects the Bluetooth device address in big-endian order when setting it using the EDL_WRITE_BD_ADDR_OPCODE command. Presumably, this is the case for all non-ROME devices which all use the EDL_WRITE_BD_ADDR_OPCODE command for this (unlike the ROME devices which use a different command and expect the address in little-endian order). Reverse the little-endian address before setting it to make sure that the address can be configured using tools like btmgmt or using the 'local-bd-address' devicetree property. Note that this can potentially break systems with boot firmware which has started relying on the broken behaviour and is incorrectly passing the address via devicetree in big-endian order. The only device affected by this should be the WCN3991 used in some Chromebooks. As ChromeOS updates the kernel and devicetree in lockstep, the new 'qcom,local-bd-address-broken' property can be used to determine if the firmware is buggy so that the underlying driver bug can be fixed without breaking backwards compatibility. Set the HCI_QUIRK_BDADDR_PROPERTY_BROKEN quirk for such platforms so that the address is reversed when parsing the address property. Fixes: 5c0a1001c8be ("Bluetooth: hci_qca: Add helper to set device address") Cc: stable@vger.kernel.org # 5.1 Cc: Balakrishna Godavarthi Cc: Matthias Kaehlcke Tested-by: Nikita Travkin # sc7180 Reviewed-by: Douglas Anderson Signed-off-by: Johan Hovold Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Greg Kroah-Hartman commit b99d0617b69804a733fc2c2470d1a6c95e6a1975 Author: Johan Hovold Date: Wed Mar 20 08:55:52 2024 +0100 arm64: dts: qcom: sc7180-trogdor: mark bluetooth address as broken commit e12e28009e584c8f8363439f6a928ec86278a106 upstream. Several Qualcomm Bluetooth controllers lack persistent storage for the device address and instead one can be provided by the boot firmware using the 'local-bd-address' devicetree property. The Bluetooth bindings clearly states that the address should be specified in little-endian order, but due to a long-standing bug in the Qualcomm driver which reversed the address some boot firmware has been providing the address in big-endian order instead. The boot firmware in SC7180 Trogdor Chromebooks is known to be affected so mark the 'local-bd-address' property as broken to maintain backwards compatibility with older firmware when fixing the underlying driver bug. Note that ChromeOS always updates the kernel and devicetree in lockstep so that there is no need to handle backwards compatibility with older devicetrees. Fixes: 7ec3e67307f8 ("arm64: dts: qcom: sc7180-trogdor: add initial trogdor and lazor dt") Cc: stable@vger.kernel.org # 5.10 Cc: Rob Clark Reviewed-by: Douglas Anderson Signed-off-by: Johan Hovold Acked-by: Bjorn Andersson Reviewed-by: Bjorn Andersson Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Greg Kroah-Hartman commit 417c6cc9ef8c06f3432398dd89cffe55ec8a0a24 Author: Johan Hovold Date: Thu Mar 14 09:44:12 2024 +0100 Revert "Bluetooth: hci_qca: Set BDA quirk bit if fwnode exists in DT" commit 4790a73ace86f3d165bbedba898e0758e6e1b82d upstream. This reverts commit 7dcd3e014aa7faeeaf4047190b22d8a19a0db696. Qualcomm Bluetooth controllers like WCN6855 do not have persistent storage for the Bluetooth address and must therefore start as unconfigured to allow the user to set a valid address unless one has been provided by the boot firmware in the devicetree. A recent change snuck into v6.8-rc7 and incorrectly started marking the default (non-unique) address as valid. This specifically also breaks the Bluetooth setup for some user of the Lenovo ThinkPad X13s. Note that this is the second time Qualcomm breaks the driver this way and that this was fixed last year by commit 6945795bc81a ("Bluetooth: fix use-bdaddr-property quirk"), which also has some further details. Fixes: 7dcd3e014aa7 ("Bluetooth: hci_qca: Set BDA quirk bit if fwnode exists in DT") Cc: stable@vger.kernel.org # 6.8 Cc: Janaki Ramaiah Thota Signed-off-by: Johan Hovold Reported-by: Clayton Craft Tested-by: Clayton Craft Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Greg Kroah-Hartman commit 3f9d57c771656bfd651e22edcfdb5f60e62542d4 Author: Uros Bizjak Date: Mon Apr 1 20:55:29 2024 +0200 x86/bpf: Fix IP after emitting call depth accounting commit 9d98aa088386aee3db1b7b60b800c0fde0654a4a upstream. Adjust the IP passed to `emit_patch` so it calculates the correct offset for the CALL instruction if `x86_call_depth_emit_accounting` emits code. Otherwise we will skip some instructions and most likely crash. Fixes: b2e9dfe54be4 ("x86/bpf: Emit call depth accounting if required") Link: https://lore.kernel.org/lkml/20230105214922.250473-1-joanbrugueram@gmail.com/ Co-developed-by: Joan Bruguera Micó Signed-off-by: Joan Bruguera Micó Signed-off-by: Uros Bizjak Cc: Alexei Starovoitov Cc: Daniel Borkmann Link: https://lore.kernel.org/r/20240401185821.224068-2-ubizjak@gmail.com Signed-off-by: Alexei Starovoitov Signed-off-by: Greg Kroah-Hartman commit 4d47169ab691497bae199f0d00a8e31901e94d1a Author: Sean Christopherson Date: Thu Apr 4 17:16:14 2024 -0700 x86/cpufeatures: Add CPUID_LNX_5 to track recently added Linux-defined word commit 8cb4a9a82b21623dbb4b3051dd30d98356cf95bc upstream. Add CPUID_LNX_5 to track cpufeatures' word 21, and add the appropriate compile-time assert in KVM to prevent direct lookups on the features in CPUID_LNX_5. KVM uses X86_FEATURE_* flags to manage guest CPUID, and so must translate features that are scattered by Linux from the Linux-defined bit to the hardware-defined bit, i.e. should never try to directly access scattered features in guest CPUID. Opportunistically add NR_CPUID_WORDS to enum cpuid_leafs, along with a compile-time assert in KVM's CPUID infrastructure to ensure that future additions update cpuid_leafs along with NCAPINTS. No functional change intended. Fixes: 7f274e609f3d ("x86/cpufeatures: Add new word for scattered features") Cc: Sandipan Das Signed-off-by: Sean Christopherson Acked-by: Dave Hansen Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit b9906101f894e8460587584dc75d7f89a5394cd9 Author: Heiner Kallweit Date: Sat Mar 30 12:49:02 2024 +0100 r8169: fix issue caused by buggy BIOS on certain boards with RTL8168d commit 5d872c9f46bd2ea3524af3c2420a364a13667135 upstream. On some boards with this chip version the BIOS is buggy and misses to reset the PHY page selector. This results in the PHY ID read accessing registers on a different page, returning a more or less random value. Fix this by resetting the page selector first. Fixes: f1e911d5d0df ("r8169: add basic phylib support") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/64f2055e-98b8-45ec-8568-665e3d54d4e6@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 477ed6789eb9f3f4d3568bb977f90c863c12724e Author: Christian Göttsche Date: Thu Mar 28 20:16:58 2024 +0100 selinux: avoid dereference of garbage after mount failure commit 37801a36b4d68892ce807264f784d818f8d0d39b upstream. In case kern_mount() fails and returns an error pointer return in the error branch instead of continuing and dereferencing the error pointer. While on it drop the never read static variable selinuxfs_mount. Cc: stable@vger.kernel.org Fixes: 0619f0f5e36f ("selinux: wrap selinuxfs state") Signed-off-by: Christian Göttsche Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit 9e2af26c29c6c6f1215d56c3246da148e54b2357 Author: Oliver Upton Date: Tue Mar 5 18:48:39 2024 +0000 KVM: arm64: Fix host-programmed guest events in nVHE commit e89c928bedd77d181edc2df01cb6672184775140 upstream. Programming PMU events in the host that count during guest execution is a feature supported by perf, e.g. perf stat -e cpu_cycles:G ./lkvm run While this works for VHE, the guest/host event bitmaps are not carried through to the hypervisor in the nVHE configuration. Make kvm_pmu_update_vcpu_events() conditional on whether or not _hardware_ supports PMUv3 rather than if the vCPU as vPMU enabled. Cc: stable@vger.kernel.org Fixes: 84d751a019a9 ("KVM: arm64: Pass pmu events to hyp via vcpu") Reviewed-by: Marc Zyngier Link: https://lore.kernel.org/r/20240305184840.636212-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton Signed-off-by: Greg Kroah-Hartman commit 651bf5b1d0708d277e5f2cf2887ec27d3b04f500 Author: Anup Patel Date: Thu Mar 21 14:20:41 2024 +0530 RISC-V: KVM: Fix APLIC in_clrip[x] read emulation commit 8e936e98718f005c986be0bfa1ee6b355acf96be upstream. The reads to APLIC in_clrip[x] registers returns rectified input values of the interrupt sources. A rectified input value of an interrupt source is defined by the section "4.5.2 Source configurations (sourcecfg[1]–sourcecfg[1023])" of the RISC-V AIA specification as: rectified input value = (incoming wire value) XOR (source is inverted) Update the riscv_aplic_input() implementation to match the above. Cc: stable@vger.kernel.org Fixes: 74967aa208e2 ("RISC-V: KVM: Add in-kernel emulation of AIA APLIC") Signed-off-by: Anup Patel Signed-off-by: Anup Patel Link: https://lore.kernel.org/r/20240321085041.1955293-3-apatel@ventanamicro.com Signed-off-by: Greg Kroah-Hartman commit 200cc2c7184138383ceb3249925c06a3e7c653db Author: Anup Patel Date: Thu Mar 21 14:20:40 2024 +0530 RISC-V: KVM: Fix APLIC setipnum_le/be write emulation commit d8dd9f113e16bef3b29c9dcceb584a6f144f55e4 upstream. The writes to setipnum_le/be register for APLIC in MSI-mode have special consideration for level-triggered interrupts as-per the section "4.9.2 Special consideration for level-sensitive interrupt sources" of the RISC-V AIA specification. Particularly, the below text from the RISC-V AIA specification defines the behaviour of writes to setipnum_le/be register for level-triggered interrupts: "A second option is for the interrupt service routine to write the APLIC’s source identity number for the interrupt to the domain’s setipnum register just before exiting. This will cause the interrupt’s pending bit to be set to one again if the source is still asserting an interrupt, but not if the source is not asserting an interrupt." Fix setipnum_le/be write emulation for in-kernel APLIC by implementing the above behaviour in aplic_write_pending() function. Cc: stable@vger.kernel.org Fixes: 74967aa208e2 ("RISC-V: KVM: Add in-kernel emulation of AIA APLIC") Signed-off-by: Anup Patel Signed-off-by: Anup Patel Link: https://lore.kernel.org/r/20240321085041.1955293-2-apatel@ventanamicro.com Signed-off-by: Greg Kroah-Hartman commit 21bc9b158983992b0709222c60f4bc749919a93d Author: Bartosz Golaszewski Date: Mon Mar 25 10:02:42 2024 +0100 gpio: cdev: sanitize the label before requesting the interrupt commit b34490879baa847d16fc529c8ea6e6d34f004b38 upstream. When an interrupt is requested, a procfs directory is created under "/proc/irq//