commit d795d7b5846cb98eaa66abeb2b0927499ede2af5 Author: Greg Kroah-Hartman Date: Wed May 12 08:40:05 2021 +0200 Linux 5.12.3 Tested-by: Jon Hunter Reviewed-by: Florian Fainelli Tested-by: Linux Kernel Functional Testing Tested-by: Fox Chen Tested-by: Guenter Roeck Tested-by: Jason Self Tested-by: Shuah Khan Tested-by: Justin M. Forbes Link: https://lore.kernel.org/r/20210510102014.849075526@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman commit 142703b82c36834a286ce667ce20f714577b8ced Author: Lukasz Luba Date: Thu Apr 22 16:36:22 2021 +0100 thermal/core/fair share: Lock the thermal zone while looping over instances commit fef05776eb02238dcad8d5514e666a42572c3f32 upstream. The tz->lock must be hold during the looping over the instances in that thermal zone. This lock was missing in the governor code since the beginning, so it's hard to point into a particular commit. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Lukasz Luba Signed-off-by: Daniel Lezcano Link: https://lore.kernel.org/r/20210422153624.6074-2-lukasz.luba@arm.com Signed-off-by: Greg Kroah-Hartman commit 6bf443acf6ca4f666d0e4225614ba9993a3aa1a9 Author: brian-sy yang Date: Tue Dec 29 13:08:31 2020 +0800 thermal/drivers/cpufreq_cooling: Fix slab OOB issue commit 34ab17cc6c2c1ac93d7e5d53bb972df9a968f085 upstream. Slab OOB issue is scanned by KASAN in cpu_power_to_freq(). If power is limited below the power of OPP0 in EM table, it will cause slab out-of-bound issue with negative array index. Return the lowest frequency if limited power cannot found a suitable OPP in EM table to fix this issue. Backtrace: [] die+0x104/0x5ac [] bug_handler+0x64/0xd0 [] brk_handler+0x160/0x258 [] do_debug_exception+0x248/0x3f0 [] el1_dbg+0x14/0xbc [] __kasan_report+0x1dc/0x1e0 [] kasan_report+0x10/0x20 [] __asan_report_load8_noabort+0x18/0x28 [] cpufreq_power2state+0x180/0x43c [] power_actor_set_power+0x114/0x1d4 [] allocate_power+0xaec/0xde0 [] power_allocator_throttle+0x3ec/0x5a4 [] handle_thermal_trip+0x160/0x294 [] thermal_zone_device_check+0xe4/0x154 [] process_one_work+0x5e4/0xe28 [] worker_thread+0xa4c/0xfac [] kthread+0x33c/0x358 [] ret_from_fork+0xc/0x18 Fixes: 371a3bc79c11b ("thermal/drivers/cpufreq_cooling: Fix wrong frequency converted from power") Signed-off-by: brian-sy yang Signed-off-by: Michael Kao Reviewed-by: Lukasz Luba Cc: stable@vger.kernel.org #v5.7 Signed-off-by: Daniel Lezcano Link: https://lore.kernel.org/r/20201229050831.19493-1-michael.kao@mediatek.com Signed-off-by: Greg Kroah-Hartman commit c23e941b235179d75ca8bd47e810836ee9e07789 Author: Rasmus Villemoes Date: Fri Apr 23 11:45:29 2021 +0200 lib/vsprintf.c: remove leftover 'f' and 'F' cases from bstr_printf() commit 84696cfaf4d90945eb2a8302edc6cf627db56b84 upstream. Commit 9af7706492f9 ("lib/vsprintf: Remove support for %pF and %pf in favour of %pS and %ps") removed support for %pF and %pf, and correctly removed the handling of those cases in vbin_printf(). However, the corresponding cases in bstr_printf() were left behind. In the same series, %pf was re-purposed for dealing with fwnodes (3bd32d6a2ee6, "lib/vsprintf: Add %pfw conversion specifier for printing fwnode names"). So should anyone use %pf with the binary printf routines, vbin_printf() would (correctly, as it involves dereferencing the pointer) do the string formatting to the u32 array, but bstr_printf() would not copy the string from the u32 array, but instead interpret the first sizeof(void*) bytes of the formatted string as a pointer - which generally won't end well (also, all subsequent get_args would be out of sync). Fixes: 9af7706492f9 ("lib/vsprintf: Remove support for %pF and %pf in favour of %pS and %ps") Cc: stable@vger.kernel.org Signed-off-by: Rasmus Villemoes Reviewed-by: Sakari Ailus Signed-off-by: Petr Mladek Link: https://lore.kernel.org/r/20210423094529.1862521-1-linux@rasmusvillemoes.dk Signed-off-by: Greg Kroah-Hartman commit 87bc16ed037b0daf8939d926ef3fa2e1e5ccbdf4 Author: 周琰杰 (Zhou Yanjie) Date: Sun Apr 18 22:44:23 2021 +0800 pinctrl: Ingenic: Add support for read the pin configuration of X1830. commit 1d0bd580ef83b78a10c0b37f3313eaa59d8c80db upstream. Add X1830 support in "ingenic_pinconf_get()", so that it can read the configuration of X1830 SoC correctly. Fixes: d7da2a1e4e08 ("pinctrl: Ingenic: Add pinctrl driver for X1830.") Cc: Signed-off-by: 周琰杰 (Zhou Yanjie) Reviewed-by: Andy Shevchenko Reviewed-by: Paul Cercueil Link: https://lore.kernel.org/r/1618757073-1724-3-git-send-email-zhouyanjie@wanyeetech.com Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit dc1a515ba108d26b0c028afbed4b02239cdae13e Author: 周琰杰 (Zhou Yanjie) Date: Sun Apr 18 22:44:22 2021 +0800 pinctrl: Ingenic: Add missing pins to the JZ4770 MAC MII group. commit 65afd97630a9d6dd9ea83ff182dfdb15bc58c5d1 upstream. The MII group of JZ4770's MAC should have 7 pins, add missing pins to the MII group. Fixes: 5de1a73e78ed ("Pinctrl: Ingenic: Add missing parts for JZ4770 and JZ4780.") Cc: Signed-off-by: 周琰杰 (Zhou Yanjie) Reviewed-by: Andy Shevchenko Reviewed-by: Paul Cercueil Link: https://lore.kernel.org/r/1618757073-1724-2-git-send-email-zhouyanjie@wanyeetech.com Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit d757bf4c69cda3c3ab7f775dfabbf5a80e2f6f9d Author: Benjamin Block Date: Thu Apr 29 23:37:00 2021 +0200 dm rq: fix double free of blk_mq_tag_set in dev remove after table load fails commit 8e947c8f4a5620df77e43c9c75310dc510250166 upstream. When loading a device-mapper table for a request-based mapped device, and the allocation/initialization of the blk_mq_tag_set for the device fails, a following device remove will cause a double free. E.g. (dmesg): device-mapper: core: Cannot initialize queue for request-based dm-mq mapped device device-mapper: ioctl: unable to set up device queue for new table. Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0305e098835de000 TEID: 0305e098835de803 Fault in home space mode while using kernel ASCE. AS:000000025efe0007 R3:0000000000000024 Oops: 0038 ilc:3 [#1] SMP Modules linked in: ... lots of modules ... Supported: Yes, External CPU: 0 PID: 7348 Comm: multipathd Kdump: loaded Tainted: G W X 5.3.18-53-default #1 SLE15-SP3 Hardware name: IBM 8561 T01 7I2 (LPAR) Krnl PSW : 0704e00180000000 000000025e368eca (kfree+0x42/0x330) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 000000000000004a 000000025efe5230 c1773200d779968d 0000000000000000 000000025e520270 000000025e8d1b40 0000000000000003 00000007aae10000 000000025e5202a2 0000000000000001 c1773200d779968d 0305e098835de640 00000007a8170000 000003ff80138650 000000025e5202a2 000003e00396faa8 Krnl Code: 000000025e368eb8: c4180041e100 lgrl %r1,25eba50b8 000000025e368ebe: ecba06b93a55 risbg %r11,%r10,6,185,58 #000000025e368ec4: e3b010000008 ag %r11,0(%r1) >000000025e368eca: e310b0080004 lg %r1,8(%r11) 000000025e368ed0: a7110001 tmll %r1,1 000000025e368ed4: a7740129 brc 7,25e369126 000000025e368ed8: e320b0080004 lg %r2,8(%r11) 000000025e368ede: b904001b lgr %r1,%r11 Call Trace: [<000000025e368eca>] kfree+0x42/0x330 [<000000025e5202a2>] blk_mq_free_tag_set+0x72/0xb8 [<000003ff801316a8>] dm_mq_cleanup_mapped_device+0x38/0x50 [dm_mod] [<000003ff80120082>] free_dev+0x52/0xd0 [dm_mod] [<000003ff801233f0>] __dm_destroy+0x150/0x1d0 [dm_mod] [<000003ff8012bb9a>] dev_remove+0x162/0x1c0 [dm_mod] [<000003ff8012a988>] ctl_ioctl+0x198/0x478 [dm_mod] [<000003ff8012ac8a>] dm_ctl_ioctl+0x22/0x38 [dm_mod] [<000000025e3b11ee>] ksys_ioctl+0xbe/0xe0 [<000000025e3b127a>] __s390x_sys_ioctl+0x2a/0x40 [<000000025e8c15ac>] system_call+0xd8/0x2c8 Last Breaking-Event-Address: [<000000025e52029c>] blk_mq_free_tag_set+0x6c/0xb8 Kernel panic - not syncing: Fatal exception: panic_on_oops When allocation/initialization of the blk_mq_tag_set fails in dm_mq_init_request_queue(), it is uninitialized/freed, but the pointer is not reset to NULL; so when dev_remove() later gets into dm_mq_cleanup_mapped_device() it sees the pointer and tries to uninitialize and free it again. Fix this by setting the pointer to NULL in dm_mq_init_request_queue() error-handling. Also set it to NULL in dm_mq_cleanup_mapped_device(). Cc: # 4.6+ Fixes: 1c357a1e86a4 ("dm: allocate blk_mq_tag_set rather than embed in mapped_device") Signed-off-by: Benjamin Block Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit f2b61e8ba153a0f79eac22ce6e5c6e9af300d44b Author: Tian Tao Date: Wed Apr 14 09:43:44 2021 +0800 dm integrity: fix missing goto in bitmap_flush_interval error handling commit 17e9e134a8efabbbf689a0904eee92bb5a868172 upstream. Fixes: 468dfca38b1a ("dm integrity: add a bitmap mode") Cc: stable@vger.kernel.org Signed-off-by: Tian Tao Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 578f39e1f2b3739c9de900f1ab0301a6b8f57dbc Author: Joe Thornber Date: Tue Apr 13 09:11:53 2021 +0100 dm space map common: fix division bug in sm_ll_find_free_block() commit 5208692e80a1f3c8ce2063a22b675dd5589d1d80 upstream. This division bug meant the search for free metadata space could skip the final allocation bitmap's worth of entries. Fix affects DM thinp, cache and era targets. Cc: stable@vger.kernel.org Signed-off-by: Joe Thornber Tested-by: Ming-Hung Tsai Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 2f97deb8b0da818c04abf89af5f3b2325b9575de Author: Joe Thornber Date: Mon Mar 29 16:34:57 2021 +0100 dm persistent data: packed struct should have an aligned() attribute too commit a88b2358f1da2c9f9fcc432f2e0a79617fea397c upstream. Otherwise most non-x86 architectures (e.g. riscv, arm) will resort to byte-by-byte access. Cc: stable@vger.kernel.org Signed-off-by: Joe Thornber Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 2a1bd74b8186d7938bf004f5603f25b84785f63e Author: Steven Rostedt (VMware) Date: Fri Apr 30 12:17:58 2021 -0400 tracing: Restructure trace_clock_global() to never block commit aafe104aa9096827a429bc1358f8260ee565b7cc upstream. It was reported that a fix to the ring buffer recursion detection would cause a hung machine when performing suspend / resume testing. The following backtrace was extracted from debugging that case: Call Trace: trace_clock_global+0x91/0xa0 __rb_reserve_next+0x237/0x460 ring_buffer_lock_reserve+0x12a/0x3f0 trace_buffer_lock_reserve+0x10/0x50 __trace_graph_return+0x1f/0x80 trace_graph_return+0xb7/0xf0 ? trace_clock_global+0x91/0xa0 ftrace_return_to_handler+0x8b/0xf0 ? pv_hash+0xa0/0xa0 return_to_handler+0x15/0x30 ? ftrace_graph_caller+0xa0/0xa0 ? trace_clock_global+0x91/0xa0 ? __rb_reserve_next+0x237/0x460 ? ring_buffer_lock_reserve+0x12a/0x3f0 ? trace_event_buffer_lock_reserve+0x3c/0x120 ? trace_event_buffer_reserve+0x6b/0xc0 ? trace_event_raw_event_device_pm_callback_start+0x125/0x2d0 ? dpm_run_callback+0x3b/0xc0 ? pm_ops_is_empty+0x50/0x50 ? platform_get_irq_byname_optional+0x90/0x90 ? trace_device_pm_callback_start+0x82/0xd0 ? dpm_run_callback+0x49/0xc0 With the following RIP: RIP: 0010:native_queued_spin_lock_slowpath+0x69/0x200 Since the fix to the recursion detection would allow a single recursion to happen while tracing, this lead to the trace_clock_global() taking a spin lock and then trying to take it again: ring_buffer_lock_reserve() { trace_clock_global() { arch_spin_lock() { queued_spin_lock_slowpath() { /* lock taken */ (something else gets traced by function graph tracer) ring_buffer_lock_reserve() { trace_clock_global() { arch_spin_lock() { queued_spin_lock_slowpath() { /* DEAD LOCK! */ Tracing should *never* block, as it can lead to strange lockups like the above. Restructure the trace_clock_global() code to instead of simply taking a lock to update the recorded "prev_time" simply use it, as two events happening on two different CPUs that calls this at the same time, really doesn't matter which one goes first. Use a trylock to grab the lock for updating the prev_time, and if it fails, simply try again the next time. If it failed to be taken, that means something else is already updating it. Link: https://lkml.kernel.org/r/20210430121758.650b6e8a@gandalf.local.home Cc: stable@vger.kernel.org Tested-by: Konstantin Kharlamov Tested-by: Todd Brandt Fixes: b02414c8f045 ("ring-buffer: Fix recursion protection transitions between interrupt context") # started showing the problem Fixes: 14131f2f98ac3 ("tracing: implement trace_clock_*() APIs") # where the bug happened Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212761 Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit 7bc6bc25a1a80554e9dab1578ef0864a5bcfe552 Author: Steven Rostedt (VMware) Date: Tue Apr 27 11:32:07 2021 -0400 tracing: Map all PIDs to command lines commit 785e3c0a3a870e72dc530856136ab4c8dd207128 upstream. The default max PID is set by PID_MAX_DEFAULT, and the tracing infrastructure uses this number to map PIDs to the comm names of the tasks, such output of the trace can show names from the recorded PIDs in the ring buffer. This mapping is also exported to user space via the "saved_cmdlines" file in the tracefs directory. But currently the mapping expects the PIDs to be less than PID_MAX_DEFAULT, which is the default maximum and not the real maximum. Recently, systemd will increases the maximum value of a PID on the system, and when tasks are traced that have a PID higher than PID_MAX_DEFAULT, its comm is not recorded. This leads to the entire trace to have "<...>" as the comm name, which is pretty useless. Instead, keep the array mapping the size of PID_MAX_DEFAULT, but instead of just mapping the index to the comm, map a mask of the PID (PID_MAX_DEFAULT - 1) to the comm, and find the full PID from the map_cmdline_to_pid array (that already exists). This bug goes back to the beginning of ftrace, but hasn't been an issue until user space started increasing the maximum value of PIDs. Link: https://lkml.kernel.org/r/20210427113207.3c601884@gandalf.local.home Cc: stable@vger.kernel.org Fixes: bc0c38d139ec7 ("ftrace: latency tracer infrastructure") Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit 337b1546cde87fb8588ddaedf0201b769baa572a Author: Calvin Walton Date: Wed Apr 28 17:09:16 2021 +0800 tools/power turbostat: Fix offset overflow issue in index converting commit 13a779de4175df602366d129e41782ad7168cef0 upstream. The idx_to_offset() function returns type int (32-bit signed), but MSR_PKG_ENERGY_STAT is u32 and would be interpreted as a negative number. The end result is that it hits the if (offset < 0) check in update_msr_sum() which prevents the timer callback from updating the stat in the background when long durations are used. The similar issue exists in offset_to_idx() and update_msr_sum(). Fix this issue by converting the 'int' to 'off_t' accordingly. Fixes: 9972d5d84d76 ("tools/power turbostat: Enable accumulate RAPL display") Signed-off-by: Calvin Walton Signed-off-by: Len Brown Signed-off-by: Greg Kroah-Hartman commit 0452b0b041884b7268ca8b5e3dd69e72f4b5c123 Author: Marek Vasut Date: Sun Mar 28 00:59:32 2021 +0100 rsi: Use resume_noirq for SDIO commit c434e5e48dc4e626364491455f97e2db0aa137b1 upstream. The rsi_resume() does access the bus to enable interrupts on the RSI SDIO WiFi card, however when calling sdio_claim_host() in the resume path, it is possible the bus is already claimed and sdio_claim_host() spins indefinitelly. Enable the SDIO card interrupts in resume_noirq instead to prevent anything else from claiming the SDIO bus first. Fixes: 20db07332736 ("rsi: sdio suspend and resume support") Signed-off-by: Marek Vasut Cc: Amitkumar Karwar Cc: Angus Ainslie Cc: David S. Miller Cc: Jakub Kicinski Cc: Kalle Valo Cc: Karun Eagalapati Cc: Martin Kepplinger Cc: Sebastian Krzyszkowiak Cc: Siva Rebbagondla Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/20210327235932.175896-1-marex@denx.de Signed-off-by: Greg Kroah-Hartman commit a46bbc14f4f0529431e15be9b392a012d562f48a Author: Pavel Skripkin Date: Sun Mar 28 00:44:43 2021 +0300 tty: fix memory leak in vc_deallocate commit 211b4d42b70f1c1660feaa968dac0efc2a96ac4d upstream. syzbot reported memory leak in tty/vt. The problem was in VT_DISALLOCATE ioctl cmd. After allocating unimap with PIO_UNIMAP it wasn't freed via VT_DISALLOCATE, but vc_cons[currcons].d was zeroed. Reported-by: syzbot+bcc922b19ccc64240b42@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin Cc: stable Link: https://lore.kernel.org/r/20210327214443.21548-1-paskripkin@gmail.com Signed-off-by: Greg Kroah-Hartman commit bb2511e92a5da1e16a9b48bf81fdab99f540dd28 Author: Hou Zhiqiang Date: Tue Apr 13 17:22:19 2021 +0300 PCI: dwc: Move iATU detection earlier commit 8bcca26585585ae4b44d25d30f351ad0afa4976b upstream. dw_pcie_ep_init() depends on the detected iATU region numbers to allocate the in/outbound window management bitmap. It fails after 281f1f99cf3a ("PCI: dwc: Detect number of iATU windows"). Move the iATU region detection into a new function, move the detection to the very beginning of dw_pcie_host_init() and dw_pcie_ep_init(). Also remove it from the dw_pcie_setup(), since it's more like a software initialization step than hardware setup. Link: https://lore.kernel.org/r/20210125044803.4310-1-Zhiqiang.Hou@nxp.com Link: https://lore.kernel.org/linux-pci/20210407131255.702054-1-dmitry.baryshkov@linaro.org Link: https://lore.kernel.org/r/20210413142219.2301430-1-dmitry.baryshkov@linaro.org Fixes: 281f1f99cf3a ("PCI: dwc: Detect number of iATU windows") Tested-by: Kunihiko Hayashi Tested-by: Marek Szyprowski Tested-by: Manivannan Sadhasivam Signed-off-by: Hou Zhiqiang [DB: moved dw_pcie_iatu_detect to happen after host_init callback] Signed-off-by: Dmitry Baryshkov Signed-off-by: Lorenzo Pieralisi Reviewed-by: Rob Herring Cc: stable@vger.kernel.org # v5.11+ Cc: Marek Szyprowski Signed-off-by: Greg Kroah-Hartman commit 07009d765c573e98420648bd52cbb9472f71ea0f Author: Artur Petrosyan Date: Thu Apr 8 13:45:49 2021 +0400 usb: dwc2: Fix session request interrupt handler commit 42b32b164acecd850edef010915a02418345a033 upstream. According to programming guide in host mode, port power must be turned on in session request interrupt handlers. Fixes: 21795c826a45 ("usb: dwc2: exit hibernation on session request") Cc: Acked-by: Minas Harutyunyan Signed-off-by: Artur Petrosyan Link: https://lore.kernel.org/r/20210408094550.75484A0094@mailhost.synopsys.com Signed-off-by: Greg Kroah-Hartman commit 1c10fd60c8595ea7ff7e29d3cf1fa88069941da3 Author: Yu Chen Date: Thu Apr 15 15:20:30 2021 -0700 usb: dwc3: core: Do core softreset when switch mode commit f88359e1588b85cf0e8209ab7d6620085f3441d9 upstream. From: John Stultz According to the programming guide, to switch mode for DRD controller, the driver needs to do the following. To switch from device to host: 1. Reset controller with GCTL.CoreSoftReset 2. Set GCTL.PrtCapDir(host mode) 3. Reset the host with USBCMD.HCRESET 4. Then follow up with the initializing host registers sequence To switch from host to device: 1. Reset controller with GCTL.CoreSoftReset 2. Set GCTL.PrtCapDir(device mode) 3. Reset the device with DCTL.CSftRst 4. Then follow up with the initializing registers sequence Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of switching from host to device. John Stult reported a lockup issue seen with HiKey960 platform without these steps[1]. Similar issue is observed with Ferry's testing platform[2]. So, apply the required steps along with some fixes to Yu Chen's and John Stultz's version. The main fixes to their versions are the missing wait for clocks synchronization before clearing GCTL.CoreSoftReset and only apply DCTL.CSftRst when switching from host to device. [1] https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/ [2] https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/ Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly") Cc: Andy Shevchenko Cc: Ferry Toth Cc: Wesley Cheng Cc: Tested-by: John Stultz Tested-by: Wesley Cheng Signed-off-by: Yu Chen Signed-off-by: John Stultz Signed-off-by: Thinh Nguyen Link: https://lore.kernel.org/r/374440f8dcd4f06c02c2caf4b1efde86774e02d9.1618521663.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman commit d91827a96d388f57f07efdc66cef64a647db9f8b Author: Thinh Nguyen Date: Mon Apr 19 19:11:12 2021 -0700 usb: dwc3: gadget: Fix START_TRANSFER link state check commit c560e76319a94a3b9285bc426c609903408e4826 upstream. The START_TRANSFER command needs to be executed while in ON/U0 link state (with an exception during register initialization). Don't use dwc->link_state to check this since the driver only tracks the link state when the link state change interrupt is enabled. Check the link state from DSTS register instead. Note that often the host already brings the device out of low power before it sends/requests the next transfer. So, the user won't see any issue when the device starts transfer then. This issue is more noticeable in cases when the device delays starting transfer, which can happen during delayed control status after the host put the device in low power. Fixes: 799e9dc82968 ("usb: dwc3: gadget: conditionally disable Link State change events") Cc: Acked-by: Felipe Balbi Signed-off-by: Thinh Nguyen Link: https://lore.kernel.org/r/bcefaa9ecbc3e1936858c0baa14de6612960e909.1618884221.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman commit dda85fb8b1134bcb8059d31b7cc0c2ed0ff3ecda Author: Thinh Nguyen Date: Thu Apr 15 00:41:58 2021 -0700 usb: dwc3: gadget: Remove FS bInterval_m1 limitation commit 3232a3ce55edfc0d7f8904543b4088a5339c2b2b upstream. The programming guide incorrectly stated that the DCFG.bInterval_m1 must be set to 0 when operating in fullspeed. There's no such limitation for all IPs. See DWC_usb3x programming guide section 3.2.2.1. Fixes: a1679af85b2a ("usb: dwc3: gadget: Fix setting of DEPCFG.bInterval_m1") Cc: Acked-by: Felipe Balbi Signed-off-by: Thinh Nguyen Link: https://lore.kernel.org/r/5d4139ae89d810eb0a2d8577fb096fc88e87bfab.1618472454.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman commit dfdc7ec6321356342a463f69009b8b54ab854291 Author: Dean Anderson Date: Wed Mar 17 15:41:09 2021 -0700 usb: gadget/function/f_fs string table fix for multiple languages commit 55b74ce7d2ce0b0058f3e08cab185a0afacfe39e upstream. Fixes bug with the handling of more than one language in the string table in f_fs.c. str_count was not reset for subsequent language codes. str_count-- "rolls under" and processes u32 max strings on the processing of the second language entry. The existing bug can be reproduced by adding a second language table to the structure "strings" in tools/usb/ffs-test.c. Signed-off-by: Dean Anderson Link: https://lore.kernel.org/r/20210317224109.21534-1-dean@sensoray.com Cc: stable Signed-off-by: Greg Kroah-Hartman commit d7a096551b86223eb6682d7c83943ebe48ef23da Author: Hemant Kumar Date: Wed Apr 21 12:47:32 2021 -0700 usb: gadget: Fix double free of device descriptor pointers commit 43c4cab006f55b6ca549dd1214e22f5965a8675f upstream. Upon driver unbind usb_free_all_descriptors() function frees all speed descriptor pointers without setting them to NULL. In case gadget speed changes (i.e from super speed plus to super speed) after driver unbind only upto super speed descriptor pointers get populated. Super speed plus desc still holds the stale (already freed) pointer. Fix this issue by setting all descriptor pointers to NULL after freeing them in usb_free_all_descriptors(). Fixes: f5c61225cf29 ("usb: gadget: Update function for SuperSpeedPlus") cc: stable@vger.kernel.org Reviewed-by: Peter Chen Signed-off-by: Hemant Kumar Signed-off-by: Wesley Cheng Link: https://lore.kernel.org/r/1619034452-17334-1-git-send-email-wcheng@codeaurora.org Signed-off-by: Greg Kroah-Hartman commit 3c6f653256879209fdbc0ea082714bbd251bb811 Author: Anirudh Rayabharam Date: Mon Apr 19 09:07:08 2021 +0530 usb: gadget: dummy_hcd: fix gpf in gadget_setup commit 4a5d797a9f9c4f18585544237216d7812686a71f upstream. Fix a general protection fault reported by syzbot due to a race between gadget_setup() and gadget_unbind() in raw_gadget. The gadget core is supposed to guarantee that there won't be any more callbacks to the gadget driver once the driver's unbind routine is called. That guarantee is enforced in usb_gadget_remove_driver as follows: usb_gadget_disconnect(udc->gadget); if (udc->gadget->irq) synchronize_irq(udc->gadget->irq); udc->driver->unbind(udc->gadget); usb_gadget_udc_stop(udc); usb_gadget_disconnect turns off the pullup resistor, telling the host that the gadget is no longer connected and preventing the transmission of any more USB packets. Any packets that have already been received are sure to processed by the UDC driver's interrupt handler by the time synchronize_irq returns. But this doesn't work with dummy_hcd, because dummy_hcd doesn't use interrupts; it uses a timer instead. It does have code to emulate the effect of synchronize_irq, but that code doesn't get invoked at the right time -- it currently runs in usb_gadget_udc_stop, after the unbind callback instead of before. Indeed, there's no way for usb_gadget_remove_driver to invoke this code before the unbind callback. To fix this, move the synchronize_irq() emulation code to dummy_pullup so that it runs before unbind. Also, add a comment explaining why it is necessary to have it there. Reported-by: syzbot+eb4674092e6cc8d9e0bd@syzkaller.appspotmail.com Suggested-by: Alan Stern Acked-by: Alan Stern Signed-off-by: Anirudh Rayabharam Link: https://lore.kernel.org/r/20210419033713.3021-1-mail@anirudhrb.com Cc: stable Signed-off-by: Greg Kroah-Hartman commit c5bddca814e9677c3a609061703480f9bf323e88 Author: Palash Oswal Date: Tue Apr 27 18:21:49 2021 +0530 io_uring: Check current->io_uring in io_uring_cancel_sqpoll commit 6d042ffb598ed83e7d5623cc961d249def5b9829 upstream. syzkaller identified KASAN: null-ptr-deref Write in io_uring_cancel_sqpoll. io_uring_cancel_sqpoll is called by io_sq_thread before calling io_uring_alloc_task_context. This leads to current->io_uring being NULL. io_uring_cancel_sqpoll should not have to deal with threads where current->io_uring is NULL. In order to cast a wider safety net, perform input sanitisation directly in io_uring_cancel_sqpoll and return for NULL value of current->io_uring. This is safe since if current->io_uring isn't set, then there's no way for the task to have submitted any requests. Reported-by: syzbot+be51ca5a4d97f017cd50@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Palash Oswal Link: https://lore.kernel.org/r/20210427125148.21816-1-hello@oswalpalash.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit cabc762764cce781d6bb9cb202c997c2254db9cd Author: Pavel Begunkov Date: Sun Apr 25 23:34:45 2021 +0100 io_uring: fix work_exit sqpoll cancellations commit 28090c133869b461c5366195a856d73469ab87d9 upstream. After closing an SQPOLL ring, io_ring_exit_work() kicks in and starts doing cancellations via io_uring_try_cancel_requests(). It will go through io_uring_try_cancel_iowq(), which uses ctx->tctx_list, but as SQPOLL task don't have a ctx note, its io-wq won't be reachable and so is left not cancelled. It will eventually cancelled when one of the tasks dies, but if a thread group survives for long and changes rings, it will spawn lots of unreclaimed resources and live locked works. Cancel SQPOLL task's io-wq separately in io_ring_exit_work(). Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov Link: https://lore.kernel.org/r/a71a7fe345135d684025bb529d5cb1d8d6b46e10.1619389911.git.asml.silence@gmail.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit cb5e0b3d0f993a6268c1a2c7ede2f9aa0c17ef68 Author: Pavel Begunkov Date: Sun Apr 18 14:52:09 2021 +0100 io_uring: fix shared sqpoll cancellation hangs commit 734551df6f9bedfbefcd113ede665945e9de0b99 upstream. [ 736.982891] INFO: task iou-sqp-4294:4295 blocked for more than 122 seconds. [ 736.982897] Call Trace: [ 736.982901] schedule+0x68/0xe0 [ 736.982903] io_uring_cancel_sqpoll+0xdb/0x110 [ 736.982908] io_sqpoll_cancel_cb+0x24/0x30 [ 736.982911] io_run_task_work_head+0x28/0x50 [ 736.982913] io_sq_thread+0x4e3/0x720 We call io_uring_cancel_sqpoll() one by one for each ctx either in sq_thread() itself or via task works, and it's intended to cancel all requests of a specified context. However the function uses per-task counters to track the number of inflight requests, so it counts more requests than available via currect io_uring ctx and goes to sleep for them to appear (e.g. from IRQ), that will never happen. Cancel a bit more than before, i.e. all ctxs that share sqpoll and continue to use shared counters. Don't forget that we should not remove ctx from the list before running that task_work sqpoll-cancel, otherwise the function wouldn't be able to find the context and will hang. Reported-by: Joakim Hassila Reported-by: Jens Axboe Fixes: 37d1e2e3642e2 ("io_uring: move SQPOLL thread io-wq forked worker") Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov Link: https://lore.kernel.org/r/1bded7e6c6b32e0bae25fce36be2868e46b116a0.1618752958.git.asml.silence@gmail.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 949938b5d3713831c49ada46d568e78e227bfa3e Author: Pavel Begunkov Date: Sun Apr 18 14:52:08 2021 +0100 io_uring: remove extra sqpoll submission halting commit 3b763ba1c77da5806e4fdc5684285814fe970c98 upstream. SQPOLL task won't submit requests for a context that is currently dying, so no need to remove ctx from sqd_list prior the main loop of io_ring_exit_work(). Kill it, will be removed by io_sq_thread_finish() and only brings confusion and lockups. Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov Link: https://lore.kernel.org/r/f220c2b786ba0f9499bebc9f3cd9714d29efb6a5.1618752958.git.asml.silence@gmail.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit b60741f40668693e9c1c6cf7c6f93efa6c4d59fe Author: Stanimir Varbanov Date: Sun Mar 7 12:17:27 2021 +0100 media: venus: hfi_parser: Check for instance after hfi platform get commit 9b5d8fd580caa898c6e1b8605c774f2517f786ab upstream. The inst function argument is != NULL only for Venus v1 and we did not migrate v1 to a hfi_platform abstraction yet. So check for instance != NULL only after hfi_platform_get returns no error. Fixes: e29929266be1 ("media: venus: Get codecs and capabilities from hfi platform") Cc: stable@vger.kernel.org # v5.12 Signed-off-by: Stanimir Varbanov Tested-by: Bryan O'Donoghue Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit dbe610d5d415692487be847f3c7b9b76a635e808 Author: Stanimir Varbanov Date: Sun Mar 7 12:16:03 2021 +0100 media: venus: hfi_parser: Don't initialize parser on v1 commit 834124c596e2dddbbdba06620835710ccca32fd0 upstream. The Venus v1 behaves differently comparing with the other Venus version in respect to capability parsing and when they are send to the driver. So we don't need to initialize hfi parser for multiple invocations like what we do for > v1 Venus versions. Fixes: 10865c98986b ("media: venus: parser: Prepare parser for multiple invocations") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Stanimir Varbanov Tested-by: Bryan O'Donoghue Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit bdb64c3da67f8dbcb2c8f15bd3323f7a791694a6 Author: Stanimir Varbanov Date: Sat Mar 6 13:16:41 2021 +0100 media: venus: hfi_cmds: Support plane-actual-info property from v1 commit 15447d18b1b877d9c6f979bd00088e470a4e0e9f upstream. The property is supported from v1 and upwards. So move it to set_property_1x. Fixes: 01e869e78756 ("media: venus: venc: fix handlig of S_SELECTION and G_SELECTION") Cc: stable@vger.kernel.org # v5.12 Signed-off-by: Stanimir Varbanov Tested-by: Bryan O'Donoghue Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 63697f519be40d8a859655a6fd2505a79036cd92 Author: Stanimir Varbanov Date: Sat Mar 6 13:13:46 2021 +0100 media: venus: venc_ctrls: Change default header mode commit 39a6b9185d305d236bff625509ee63801b50301b upstream. It is observed that on Venus v1 the default header-mode is producing a bitstream which is not playble. Change the default header-mode to joined with 1st frame. Fixes: 002c22bd360e ("media: venus: venc: set inband mode property to FW.") Cc: stable@vger.kernel.org # v5.12 Signed-off-by: Stanimir Varbanov Tested-by: Bryan O'Donoghue Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit fc437f92ebb7f63b196f73f7afbde21c36cd4b1f Author: Stanimir Varbanov Date: Thu Feb 25 15:28:57 2021 +0100 media: venus: pm_helpers: Set opp clock name for v1 commit 3215887167af7db9af9fa23d61321ebfbd6ed6d3 upstream. The rate of the core clock is set through devm_pm_opp_set_rate and to avoid errors from it we have to set the name of the clock via dev_pm_opp_set_clkname. Fixes: 9a538b83612c ("media: venus: core: Add support for opp tables/perf voting") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Stanimir Varbanov Tested-by: Bryan O'Donoghue Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 259f7f0fabc3e1a362332f3468d75742da839450 Author: Marco Felsch Date: Fri Mar 5 09:23:54 2021 +0100 media: coda: fix macroblocks count control usage commit 0b276e470a4d43e1365d3eb53c608a3d208cabd4 upstream. Commit b2d3bef1aa78 ("media: coda: Add a V4L2 user for control error macroblocks count") add the control for the decoder devices. But during streamon() this ioctl gets called for all (encoder and decoder) devices and on encoder devices this causes a null pointer exception. Fix this by setting the control only if it is really accessible. Fixes: b2d3bef1aa78 ("media: coda: Add a V4L2 user for control error macroblocks count") Signed-off-by: Marco Felsch Cc: Reviewed-by: Philipp Zabel Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 78cc3a571850c26df20ca5786a8bea75f984140a Author: Hans Verkuil Date: Mon Apr 12 13:51:23 2021 +0200 media: v4l2-ctrls: fix reference to freed memory commit ac34b79da14d67a9b494f6125186becbd067e225 upstream. When controls are used together with the Request API, then for each request a v4l2_ctrl_handler struct is allocated. This contains the controls that can be set in a request. If a control is *not* set in the request, then the value used in the most recent previous request must be used, or the current value if it is not found in any outstanding requests. The framework tried to find such a previous request and it would set the 'req' pointer in struct v4l2_ctrl_ref to the v4l2_ctrl_ref of the control in such a previous request. So far, so good. However, when that previous request was applied to the hardware, returned to userspace, and then userspace would re-init or free that request, any 'ref' pointer in still-queued requests would suddenly point to freed memory. This was not noticed before since the drivers that use this expected that each request would always have the controls set, so there was never any need to find a control in older requests. This requirement was relaxed, and now this bug surfaced. It was also made worse by changeset 2fae4d6aabc8 ("media: v4l2-ctrls: v4l2_ctrl_request_complete() should always set ref->req") which increased the chance of this happening. The use of the 'req' pointer in v4l2_ctrl_ref was very fragile, so drop this entirely. Instead add a valid_p_req bool to indicate that p_req contains a valid value for this control. And if it is false, then just use the current value of the control. Note that VIDIOC_G_EXT_CTRLS will always return -EACCES when attempting to get a control from a request until the request is completed. And in that case, all controls in the request will have the control value set (i.e. valid_p_req is true). This means that the whole 'find the most recent previous request containing a control' idea is pointless, and the code can be simplified considerably. The v4l2_g_ext_ctrls_common() function was refactored a bit to make it more understandable. It also avoids updating volatile controls in a completed request since that was already done when the request was completed. Signed-off-by: Hans Verkuil Fixes: 2fae4d6aabc8 ("media: v4l2-ctrls: v4l2_ctrl_request_complete() should always set ref->req") Fixes: 6fa6f831f095 ("media: v4l2-ctrls: add core request support") Cc: # for v5.9 and up Tested-by: Alexandre Courbot Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit cdaf358483b350c7c02719e44787149c20a1dc07 Author: Ricardo Ribalda Date: Fri Apr 9 10:41:35 2021 +0200 media: staging/intel-ipu3: Fix race condition during set_fmt commit dccfe2548746ca9cca3a20401ece4cf255d1f171 upstream. Do not modify imgu_pipe->nodes[inode].vdev_fmt.fmt.pix_mp, until the format has been correctly validated. Otherwise, even if we use a backup variable, there is a period of time where imgu_pipe->nodes[inode].vdev_fmt.fmt.pix_mp might have an invalid value that can be used by other functions. Cc: stable@vger.kernel.org Fixes: ad91849996f9 ("media: staging/intel-ipu3: Fix set_fmt error handling") Reviewed-by: Tomasz Figa Signed-off-by: Ricardo Ribalda Signed-off-by: Sakari Ailus Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 6fb617e37a39db0a3eca4489431359d0bdf3b9bc Author: Ricardo Ribalda Date: Wed Mar 10 01:16:46 2021 +0100 media: staging/intel-ipu3: Fix set_fmt error handling commit ad91849996f9dd79741a961fd03585a683b08356 upstream. If there in an error during a set_fmt, do not overwrite the previous sizes with the invalid config. Without this patch, v4l2-compliance ends up allocating 4GiB of RAM and causing the following OOPs [ 38.662975] ipu3-imgu 0000:00:05.0: swiotlb buffer is full (sz: 4096 bytes) [ 38.662980] DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:00:05.0 [ 38.663010] general protection fault: 0000 [#1] PREEMPT SMP Cc: stable@vger.kernel.org Fixes: 6d5f26f2e045 ("media: staging/intel-ipu3-v4l: reduce kernel stack usage") Signed-off-by: Ricardo Ribalda Signed-off-by: Sakari Ailus Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 74ba0adb5e983503b18a96121d965cad34ac7ce3 Author: Ricardo Ribalda Date: Mon Mar 15 13:34:05 2021 +0100 media: staging/intel-ipu3: Fix memory leak in imu_fmt commit 3630901933afba1d16c462b04d569b7576339223 upstream. We are losing the reference to an allocated memory if try. Change the order of the check to avoid that. Cc: stable@vger.kernel.org Fixes: 6d5f26f2e045 ("media: staging/intel-ipu3-v4l: reduce kernel stack usage") Signed-off-by: Ricardo Ribalda Signed-off-by: Sakari Ailus Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit e046910804f6099dc12fbaca097418062f742cd0 Author: Takashi Iwai Date: Mon Feb 1 09:32:46 2021 +0100 media: dvb-usb: Fix memory leak at error in dvb_usb_device_init() commit 13a79f14ab285120bc4977e00a7c731e8143f548 upstream. dvb_usb_device_init() allocates a dvb_usb_device object, but it doesn't release the object by itself even at errors. The object is released in the callee side (dvb_usb_init()) in some error cases via dvb_usb_exit() call, but it also missed the object free in other error paths. And, the caller (it's only dvb_usb_device_init()) doesn't seem caring the resource management as well, hence those memories are leaked. This patch assures releasing the memory at the error path in dvb_usb_device_init(). Now dvb_usb_init() frees the resources it allocated but leaves the passed dvb_usb_device object intact. In turn, the dvb_usb_device object is released in dvb_usb_device_init() instead. We could use dvb_usb_exit() function for releasing everything in the callee (as it was used for some error cases in the original code), but releasing the passed object in the callee is non-intuitive and error-prone. So I took this approach (which is more standard in Linus kernel code) although it ended with a bit more open codes. Along with the change, the patch makes sure that USB intfdata is reset and don't return the bogus pointer to the caller of dvb_usb_device_init() at the error path, too. Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sean Young Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 2aabd8da8f0d4bed272a29f789d7fd2625bd91dd Author: Takashi Iwai Date: Mon Feb 1 09:32:47 2021 +0100 media: dvb-usb: Fix use-after-free access commit c49206786ee252f28b7d4d155d1fff96f145a05d upstream. dvb_usb_device_init() copies the properties to the own data, so that the callers can release the original properties later (as done in the commit 299c7007e936 ("media: dw2102: Fix memleak on sequence of probes")). However, it also stores dev->desc pointer that is a reference to the original properties data. Since dev->desc is referred later, it may result in use-after-free, in the worst case, leading to a kernel Oops as reported. This patch addresses the problem by allocating and copying the properties at first, then get the desc from the copied properties. Reported-and-tested-by: Stefan Seyfried BugLink: http://bugzilla.opensuse.org/show_bug.cgi?id=1181104 Reviewed-by: Robert Foss Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sean Young Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit cede24d13be6c2a62be6d7ceea63c2719b0cfa82 Author: Peilin Ye Date: Fri Dec 11 09:30:39 2020 +0100 media: dvbdev: Fix memory leak in dvb_media_device_free() commit bf9a40ae8d722f281a2721779595d6df1c33a0bf upstream. dvb_media_device_free() is leaking memory. Free `dvbdev->adapter->conn` before setting it to NULL, as documented in include/media/media-device.h: "The media_entity instance itself must be freed explicitly by the driver if required." Link: https://syzkaller.appspot.com/bug?id=9bbe4b842c98f0ed05c5eed77a226e9de33bf298 Link: https://lore.kernel.org/linux-media/20201211083039.521617-1-yepeilin.cs@gmail.com Cc: stable@vger.kernel.org Fixes: 0230d60e4661 ("[media] dvbdev: Add RF connector if needed") Reported-by: syzbot+7f09440acc069a0d38ac@syzkaller.appspotmail.com Signed-off-by: Peilin Ye Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit a1d50db9020fbdfa5fce29945e50739394069c43 Author: Jan Kara Date: Thu Apr 15 17:54:17 2021 +0200 ext4: Fix occasional generic/418 failure commit 5899593f51e63dde2f07c67358bd65a641585abb upstream. Eric has noticed that after pagecache read rework, generic/418 is occasionally failing for ext4 when blocksize < pagesize. In fact, the pagecache rework just made hard to hit race in ext4 more likely. The problem is that since ext4 conversion of direct IO writes to iomap framework (commit 378f32bab371), we update inode size after direct IO write only after invalidating page cache. Thus if buffered read sneaks at unfortunate moment like: CPU1 - write at offset 1k CPU2 - read from offset 0 iomap_dio_rw(..., IOMAP_DIO_FORCE_WAIT); ext4_readpage(); ext4_handle_inode_extension() the read will zero out tail of the page as it still sees smaller inode size and thus page cache becomes inconsistent with on-disk contents with all the consequences. Fix the problem by moving inode size update into end_io handler which gets called before the page cache is invalidated. Reported-and-tested-by: Eric Whitney Fixes: 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure") CC: stable@vger.kernel.org Signed-off-by: Jan Kara Acked-by: Dave Chinner Link: https://lore.kernel.org/r/20210415155417.4734-1-jack@suse.cz Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 16e4d107b5d73a2f487a2e92221cc38f4c8369ea Author: Theodore Ts'o Date: Mon Apr 12 17:19:00 2021 -0400 ext4: allow the dax flag to be set and cleared on inline directories commit 4811d9929cdae4238baf5b2522247bd2f9fa7b50 upstream. This is needed to allow generic/607 to pass for file systems with the inline data_feature enabled, and it allows the use of file systems where the directories use inline_data, while the files are accessed via DAX. Cc: stable@kernel.org Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit c9d79b206f74a4783f5562174c5479e6631cea38 Author: Xu Yihang Date: Thu Apr 8 15:00:33 2021 +0800 ext4: fix error return code in ext4_fc_perform_commit() commit e1262cd2e68a0870fb9fc95eb202d22e8f0074b7 upstream. In case of if not ext4_fc_add_tlv branch, an error return code is missing. Cc: stable@kernel.org Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") Reported-by: Hulk Robot Signed-off-by: Xu Yihang Reviewed-by: Harshad Shirwadkar Link: https://lore.kernel.org/r/20210408070033.123047-1-xuyihang@huawei.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 5a611ecc0df71e966af07d1c84b402587ada2420 Author: Ye Bin Date: Tue Apr 6 10:53:31 2021 +0800 ext4: fix ext4_error_err save negative errno into superblock commit 6810fad956df9e5467e8e8a5ac66fda0836c71fa upstream. Fix As write_mmp_block() so that it returns -EIO instead of 1, so that the correct error gets saved into the superblock. Cc: stable@kernel.org Fixes: 54d3adbc29f0 ("ext4: save all error info in save_error_info() and drop ext4_set_errno()") Reported-by: Liu Zhi Qiang Signed-off-by: Ye Bin Reviewed-by: Andreas Dilger Link: https://lore.kernel.org/r/20210406025331.148343-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit f61f62d574c29b327da8f0b4e90a373f7593a98c Author: Fengnan Chang Date: Fri Apr 2 18:16:31 2021 +0800 ext4: fix error code in ext4_commit_super commit f88f1466e2a2e5ca17dfada436d3efa1b03a3972 upstream. We should set the error code when ext4_commit_super check argument failed. Found in code review. Fixes: c4be0c1dc4cdc ("filesystem freeze: add error handling of write_super_lockfs/unlockfs"). Cc: stable@kernel.org Signed-off-by: Fengnan Chang Reviewed-by: Andreas Dilger Link: https://lore.kernel.org/r/20210402101631.561-1-changfengnan@vivo.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 1e9ea8f4637026b8e965128953f2da061ccae9c4 Author: Ye Bin Date: Thu Apr 1 16:19:03 2021 +0800 ext4: always panic when errors=panic is specified commit ac2f7ca51b0929461ea49918f27c11b680f28995 upstream. Before commit 014c9caa29d3 ("ext4: make ext4_abort() use __ext4_error()"), the following series of commands would trigger a panic: 1. mount /dev/sda -o ro,errors=panic test 2. mount /dev/sda -o remount,abort test After commit 014c9caa29d3, remounting a file system using the test mount option "abort" will no longer trigger a panic. This commit will restore the behaviour immediately before commit 014c9caa29d3. (However, note that the Linux kernel's behavior has not been consistent; some previous kernel versions, including 5.4 and 4.19 similarly did not panic after using the mount option "abort".) This also makes a change to long-standing behaviour; namely, the following series commands will now cause a panic, when previously it did not: 1. mount /dev/sda -o ro,errors=panic test 2. echo test > /sys/fs/ext4/sda/trigger_fs_error However, this makes ext4's behaviour much more consistent, so this is a good thing. Cc: stable@kernel.org Fixes: 014c9caa29d3 ("ext4: make ext4_abort() use __ext4_error()") Signed-off-by: Ye Bin Link: https://lore.kernel.org/r/20210401081903.3421208-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit b64a3fb6ea0274483d66daac9d0cd4f3d2c8fbbc Author: Zhang Yi Date: Wed Mar 31 11:31:38 2021 +0800 ext4: do not set SB_ACTIVE in ext4_orphan_cleanup() commit 72ffb49a7b623c92a37657eda7cc46a06d3e8398 upstream. When CONFIG_QUOTA is enabled, if we failed to mount the filesystem due to some error happens behind ext4_orphan_cleanup(), it will end up triggering a after free issue of super_block. The problem is that ext4_orphan_cleanup() will set SB_ACTIVE flag if CONFIG_QUOTA is enabled, after we cleanup the truncated inodes, the last iput() will put them into the lru list, and these inodes' pages may probably dirty and will be write back by the writeback thread, so it could be raced by freeing super_block in the error path of mount_bdev(). After check the setting of SB_ACTIVE flag in ext4_orphan_cleanup(), it was used to ensure updating the quota file properly, but evict inode and trash data immediately in the last iput does not affect the quotafile, so setting the SB_ACTIVE flag seems not required[1]. Fix this issue by just remove the SB_ACTIVE setting. [1] https://lore.kernel.org/linux-ext4/99cce8ca-e4a0-7301-840f-2ace67c551f3@huawei.com/T/#m04990cfbc4f44592421736b504afcc346b2a7c00 Cc: stable@kernel.org Signed-off-by: Zhang Yi Tested-by: Jan Kara Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20210331033138.918975-1-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit e18d76a12b34791bc0318a0e0c0fa5863cd8dabf Author: Zhang Yi Date: Wed Mar 31 20:15:16 2021 +0800 ext4: fix check to prevent false positive report of incorrect used inodes commit a149d2a5cabbf6507a7832a1c4fd2593c55fd450 upstream. Commit <50122847007> ("ext4: fix check to prevent initializing reserved inodes") check the block group zero and prevent initializing reserved inodes. But in some special cases, the reserved inode may not all belong to the group zero, it may exist into the second group if we format filesystem below. mkfs.ext4 -b 4096 -g 8192 -N 1024 -I 4096 /dev/sda So, it will end up triggering a false positive report of a corrupted file system. This patch fix it by avoid check reserved inodes if no free inode blocks will be zeroed. Cc: stable@kernel.org Fixes: 50122847007 ("ext4: fix check to prevent initializing reserved inodes") Signed-off-by: Zhang Yi Suggested-by: Jan Kara Link: https://lore.kernel.org/r/20210331121516.2243099-1-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 222c345267b57a4023c418b61bb76628a3584994 Author: Jan Kara Date: Tue Apr 6 18:18:00 2021 +0200 ext4: annotate data race in jbd2_journal_dirty_metadata() commit 83fe6b18b8d04c6c849379005e1679bac9752466 upstream. Assertion checks in jbd2_journal_dirty_metadata() are known to be racy but we don't want to be grabbing locks just for them. We thus recheck them under b_state_lock only if it looks like they would fail. Annotate the checks with data_race(). Cc: stable@kernel.org Reported-by: Hao Sun Signed-off-by: Jan Kara Link: https://lore.kernel.org/r/20210406161804.20150-2-jack@suse.cz Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit cfc9b6d1521c5e6623c5c0d43e77d4b3cde42da4 Author: Jan Kara Date: Tue Apr 6 18:17:59 2021 +0200 ext4: annotate data race in start_this_handle() commit 3b1833e92baba135923af4a07e73fe6e54be5a2f upstream. Access to journal->j_running_transaction is not protected by appropriate lock and thus is racy. We are well aware of that and the code handles the race properly. Just add a comment and data_race() annotation. Cc: stable@kernel.org Reported-by: syzbot+30774a6acf6a2cf6d535@syzkaller.appspotmail.com Signed-off-by: Jan Kara Link: https://lore.kernel.org/r/20210406161804.20150-1-jack@suse.cz Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 8cb6d877b0b2fb11739f6e8f0122d2d1011e10f6 Author: Masahiro Yamada Date: Sun Apr 25 15:24:07 2021 +0900 kbuild: update config_data.gz only when the content of .config is changed commit 46b41d5dd8019b264717978c39c43313a524d033 upstream. If the timestamp of the .config file is updated, config_data.gz is regenerated, then vmlinux is re-linked. This occurs even if the content of the .config has not changed at all. This issue was mitigated by commit 67424f61f813 ("kconfig: do not write .config if the content is the same"); Kconfig does not update the .config when it ends up with the identical configuration. The issue is remaining when the .config is created by *_defconfig with some config fragment(s) applied on top. This is typical for powerpc and mips, where several *_defconfig targets are constructed by using merge_config.sh. One workaround is to have the copy of the .config. The filechk rule updates the copy, kernel/config_data, by checking the content instead of the timestamp. With this commit, the second run with the same configuration avoids the needless rebuilds. $ make ARCH=mips defconfig all [ snip ] $ make ARCH=mips defconfig all *** Default configuration is based on target '32r2el_defconfig' Using ./arch/mips/configs/generic_defconfig as base Merging arch/mips/configs/generic/32r2.config Merging arch/mips/configs/generic/el.config Merging ./arch/mips/configs/generic/board-boston.config Merging ./arch/mips/configs/generic/board-ni169445.config Merging ./arch/mips/configs/generic/board-ocelot.config Merging ./arch/mips/configs/generic/board-ranchu.config Merging ./arch/mips/configs/generic/board-sead-3.config Merging ./arch/mips/configs/generic/board-xilfpga.config # # configuration written to .config # SYNC include/config/auto.conf CALL scripts/checksyscalls.sh CALL scripts/atomic/check-atomics.sh CHK include/generated/compile.h CHK include/generated/autoksyms.h Reported-by: Elliot Berman Signed-off-by: Masahiro Yamada Signed-off-by: Greg Kroah-Hartman commit 0fc0b094a4de022549c6089cd891305133bf9eff Author: Sean Christopherson Date: Tue May 4 15:56:31 2021 -0700 x86/cpu: Initialize MSR_TSC_AUX if RDTSCP *or* RDPID is supported commit b6b4fbd90b155a0025223df2c137af8a701d53b3 upstream. Initialize MSR_TSC_AUX with CPU node information if RDTSCP or RDPID is supported. This fixes a bug where vdso_read_cpunode() will read garbage via RDPID if RDPID is supported but RDTSCP is not. While no known CPU supports RDPID but not RDTSCP, both Intel's SDM and AMD's APM allow for RDPID to exist without RDTSCP, e.g. it's technically a legal CPU model for a virtual machine. Note, technically MSR_TSC_AUX could be initialized if and only if RDPID is supported since RDTSCP is currently not used to retrieve the CPU node. But, the cost of the superfluous WRMSR is negigible, whereas leaving MSR_TSC_AUX uninitialized is just asking for future breakage if someone decides to utilize RDTSCP. Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available") Signed-off-by: Sean Christopherson Signed-off-by: Thomas Gleixner Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210504225632.1532621-2-seanjc@google.com Signed-off-by: Greg Kroah-Hartman commit 13b5638e78d81e78c1cb1da72cdd00aaba88a2a8 Author: Thomas Gleixner Date: Thu Apr 22 21:44:19 2021 +0200 futex: Do not apply time namespace adjustment on FUTEX_LOCK_PI commit cdf78db4070967869e4d027c11f4dd825d8f815a upstream. FUTEX_LOCK_PI does not require to have the FUTEX_CLOCK_REALTIME bit set because it has been using CLOCK_REALTIME based absolute timeouts forever. Due to that, the time namespace adjustment which is applied when FUTEX_CLOCK_REALTIME is not set, will wrongly take place for FUTEX_LOCK_PI and wreckage the timeout. Exclude it from that procedure. Fixes: c2f7d08cccf4 ("futex: Adjust absolute futex timeouts with per time namespace offset") Signed-off-by: Thomas Gleixner Acked-by: Peter Zijlstra (Intel) Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210422194704.984540159@linutronix.de Signed-off-by: Greg Kroah-Hartman commit 600de799010689ad8d16f7b1932ca718809f0719 Author: Thomas Gleixner Date: Thu Apr 22 21:44:18 2021 +0200 Revert 337f13046ff0 ("futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op") commit 4fbf5d6837bf81fd7a27d771358f4ee6c4f243f8 upstream. The FUTEX_WAIT operand has historically a relative timeout which means that the clock id is irrelevant as relative timeouts on CLOCK_REALTIME are not subject to wall clock changes and therefore are mapped by the kernel to CLOCK_MONOTONIC for simplicity. If a caller would set FUTEX_CLOCK_REALTIME for FUTEX_WAIT the timeout is still treated relative vs. CLOCK_MONOTONIC and then the wait arms that timeout based on CLOCK_REALTIME which is broken and obviously has never been used or even tested. Reject any attempt to use FUTEX_CLOCK_REALTIME with FUTEX_WAIT again. The desired functionality can be achieved with FUTEX_WAIT_BITSET and a FUTEX_BITSET_MATCH_ANY argument. Fixes: 337f13046ff0 ("futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op") Signed-off-by: Thomas Gleixner Acked-by: Peter Zijlstra (Intel) Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210422194704.834797921@linutronix.de Signed-off-by: Greg Kroah-Hartman commit 1ca4f55a7e8ae1bade645ecd958f68c45741abad Author: Steve French Date: Fri May 7 20:00:41 2021 -0500 smb3: do not attempt multichannel to server which does not support it commit 9c2dc11df50d1c8537075ff6b98472198e24438e upstream. We were ignoring CAP_MULTI_CHANNEL in the server response - if the server doesn't support multichannel we should not be attempting it. See MS-SMB2 section 3.2.5.2 Reviewed-by: Shyam Prasad N Reviewed-By: Tom Talpey Cc: # v5.8+ Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 0470ef330390bdffb1cf75179af5d610db5a7daa Author: Steve French Date: Fri May 7 19:33:51 2021 -0500 smb3: if max_channels set to more than one channel request multichannel commit c1f8a398b6d661b594556a91224b096d92293061 upstream. Mounting with "multichannel" is obviously implied if user requested more than one channel on mount (ie mount parm max_channels>1). Currently both have to be specified. Fix that so that if max_channels is greater than 1 on mount, enable multichannel rather than silently falling back to non-multichannel. Signed-off-by: Steve French Reviewed-By: Tom Talpey Cc: # v5.11+ Reviewed-by: Shyam Prasad N Signed-off-by: Greg Kroah-Hartman commit 524fc5482ffb17a80d92a1c2f2a2be1f23a5d44b Author: Steve French Date: Fri May 7 18:24:11 2021 -0500 smb3: when mounting with multichannel include it in requested capabilities commit 679971e7213174efb56abc8fab1299d0a88db0e8 upstream. In the SMB3/SMB3.1.1 negotiate protocol request, we are supposed to advertise CAP_MULTICHANNEL capability when establishing multiple channels has been requested by the user doing the mount. See MS-SMB2 sections 2.2.3 and 3.2.5.2 Without setting it there is some risk that multichannel could fail if the server interpreted the field strictly. Reviewed-By: Tom Talpey Reviewed-by: Shyam Prasad N Cc: # v5.8+ Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 2831dd51bec25d5b8a499a8e8e15de6237771a80 Author: Linus Torvalds Date: Tue Apr 27 17:05:53 2021 -0700 Fix misc new gcc warnings commit e7c6e405e171fb33990a12ecfd14e6500d9e5cf2 upstream. It seems like Fedora 34 ends up enabling a few new gcc warnings, notably "-Wstringop-overread" and "-Warray-parameter". Both of them cause what seem to be valid warnings in the kernel, where we have array size mismatches in function arguments (that are no longer just silently converted to a pointer to element, but actually checked). This fixes most of the trivial ones, by making the function declaration match the function definition, and in the case of intel_pm.c, removing the over-specified array size from the argument declaration. At least one 'stringop-overread' warning remains in the i915 driver, but that one doesn't have the same obvious trivial fix, and may or may not actually be indicative of a bug. [ It was a mistake to upgrade one of my machines to Fedora 34 while being busy with the merge window, but if this is the extent of the compiler upgrade problems, things are better than usual - Linus ] Signed-off-by: Linus Torvalds Cc: Andrey Zhizhikin Signed-off-by: Greg Kroah-Hartman commit 3cc63b1c6e7a2f2cbf2af5f9f200f9739ac0b88c Author: Arnd Bergmann Date: Mon Mar 22 17:02:41 2021 +0100 security: commoncap: fix -Wstringop-overread warning commit 82e5d8cc768b0c7b03c551a9ab1f8f3f68d5f83f upstream. gcc-11 introdces a harmless warning for cap_inode_getsecurity: security/commoncap.c: In function ‘cap_inode_getsecurity’: security/commoncap.c:440:33: error: ‘memcpy’ reading 16 bytes from a region of size 0 [-Werror=stringop-overread] 440 | memcpy(&nscap->data, &cap->data, sizeof(__le32) * 2 * VFS_CAP_U32); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The problem here is that tmpbuf is initialized to NULL, so gcc assumes it is not accessible unless it gets set by vfs_getxattr_alloc(). This is a legitimate warning as far as I can tell, but the code is correct since it correctly handles the error when that function fails. Add a separate NULL check to tell gcc about it as well. Signed-off-by: Arnd Bergmann Acked-by: Christian Brauner Signed-off-by: James Morris Cc: Andrey Zhizhikin Signed-off-by: Greg Kroah-Hartman commit deaaae2f918c9fc97a1418457616c9a8d2021890 Author: Frederic Weisbecker Date: Tue Feb 23 01:09:59 2021 +0100 rcu/nocb: Fix missed nocb_timer requeue commit b2fcf2102049f6e56981e0ab3d9b633b8e2741da upstream. This sequence of events can lead to a failure to requeue a CPU's ->nocb_timer: 1. There are no callbacks queued for any CPU covered by CPU 0-2's ->nocb_gp_kthread. Note that ->nocb_gp_kthread is associated with CPU 0. 2. CPU 1 enqueues its first callback with interrupts disabled, and thus must defer awakening its ->nocb_gp_kthread. It therefore queues its rcu_data structure's ->nocb_timer. At this point, CPU 1's rdp->nocb_defer_wakeup is RCU_NOCB_WAKE. 3. CPU 2, which shares the same ->nocb_gp_kthread, also enqueues a callback, but with interrupts enabled, allowing it to directly awaken the ->nocb_gp_kthread. 4. The newly awakened ->nocb_gp_kthread associates both CPU 1's and CPU 2's callbacks with a future grace period and arranges for that grace period to be started. 5. This ->nocb_gp_kthread goes to sleep waiting for the end of this future grace period. 6. This grace period elapses before the CPU 1's timer fires. This is normally improbably given that the timer is set for only one jiffy, but timers can be delayed. Besides, it is possible that kernel was built with CONFIG_RCU_STRICT_GRACE_PERIOD=y. 7. The grace period ends, so rcu_gp_kthread awakens the ->nocb_gp_kthread, which in turn awakens both CPU 1's and CPU 2's ->nocb_cb_kthread. Then ->nocb_gb_kthread sleeps waiting for more newly queued callbacks. 8. CPU 1's ->nocb_cb_kthread invokes its callback, then sleeps waiting for more invocable callbacks. 9. Note that neither kthread updated any ->nocb_timer state, so CPU 1's ->nocb_defer_wakeup is still set to RCU_NOCB_WAKE. 10. CPU 1 enqueues its second callback, this time with interrupts enabled so it can wake directly ->nocb_gp_kthread. It does so with calling wake_nocb_gp() which also cancels the pending timer that got queued in step 2. But that doesn't reset CPU 1's ->nocb_defer_wakeup which is still set to RCU_NOCB_WAKE. So CPU 1's ->nocb_defer_wakeup and its ->nocb_timer are now desynchronized. 11. ->nocb_gp_kthread associates the callback queued in 10 with a new grace period, arranges for that grace period to start and sleeps waiting for it to complete. 12. The grace period ends, rcu_gp_kthread awakens ->nocb_gp_kthread, which in turn wakes up CPU 1's ->nocb_cb_kthread which then invokes the callback queued in 10. 13. CPU 1 enqueues its third callback, this time with interrupts disabled so it must queue a timer for a deferred wakeup. However the value of its ->nocb_defer_wakeup is RCU_NOCB_WAKE which incorrectly indicates that a timer is already queued. Instead, CPU 1's ->nocb_timer was cancelled in 10. CPU 1 therefore fails to queue the ->nocb_timer. 14. CPU 1 has its pending callback and it may go unnoticed until some other CPU ever wakes up ->nocb_gp_kthread or CPU 1 ever calls an explicit deferred wakeup, for example, during idle entry. This commit fixes this bug by resetting rdp->nocb_defer_wakeup everytime we delete the ->nocb_timer. It is quite possible that there is a similar scenario involving ->nocb_bypass_timer and ->nocb_defer_wakeup. However, despite some effort from several people, a failure scenario has not yet been located. However, that by no means guarantees that no such scenario exists. Finding a failure scenario is left as an exercise for the reader, and the "Fixes:" tag below relates to ->nocb_bypass_timer instead of ->nocb_timer. Fixes: d1b222c6be1f (rcu/nocb: Add bypass callback queueing) Cc: Cc: Josh Triplett Cc: Lai Jiangshan Cc: Joel Fernandes Cc: Boqun Feng Reviewed-by: Neeraj Upadhyay Signed-off-by: Frederic Weisbecker Signed-off-by: Paul E. McKenney Signed-off-by: Greg Kroah-Hartman commit ebeac958b690123a0b40aa61f688f2f170035fad Author: Ignat Korchagin Date: Tue Apr 27 22:09:38 2021 +0100 sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues commit 99ba0ea616aabdc8e26259fd722503e012199a76 upstream. efx->xdp_tx_queue_count is initially initialized to num_possible_cpus() and is later used to allocate and traverse efx->xdp_tx_queues lookup array. However, we may end up not initializing all the array slots with real queues during probing. This results, for example, in a NULL pointer dereference, when running "# ethtool -S ", similar to below [2570283.664955][T4126959] BUG: kernel NULL pointer dereference, address: 00000000000000f8 [2570283.681283][T4126959] #PF: supervisor read access in kernel mode [2570283.695678][T4126959] #PF: error_code(0x0000) - not-present page [2570283.710013][T4126959] PGD 0 P4D 0 [2570283.721649][T4126959] Oops: 0000 [#1] SMP PTI [2570283.734108][T4126959] CPU: 23 PID: 4126959 Comm: ethtool Tainted: G O 5.10.20-cloudflare-2021.3.1 #1 [2570283.752641][T4126959] Hardware name: [2570283.781408][T4126959] RIP: 0010:efx_ethtool_get_stats+0x2ca/0x330 [sfc] [2570283.796073][T4126959] Code: 00 85 c0 74 39 48 8b 95 a8 0f 00 00 48 85 d2 74 2d 31 c0 eb 07 48 8b 95 a8 0f 00 00 48 63 c8 49 83 c4 08 83 c0 01 48 8b 14 ca <48> 8b 92 f8 00 00 00 49 89 54 24 f8 39 85 a0 0f 00 00 77 d7 48 8b [2570283.831259][T4126959] RSP: 0018:ffffb79a77657ce8 EFLAGS: 00010202 [2570283.845121][T4126959] RAX: 0000000000000019 RBX: ffffb799cd0c9280 RCX: 0000000000000018 [2570283.860872][T4126959] RDX: 0000000000000000 RSI: ffff96dd970ce000 RDI: 0000000000000005 [2570283.876525][T4126959] RBP: ffff96dd86f0a000 R08: ffff96dd970ce480 R09: 000000000000005f [2570283.892014][T4126959] R10: ffffb799cd0c9fff R11: ffffb799cd0c9000 R12: ffffb799cd0c94f8 [2570283.907406][T4126959] R13: ffffffffc11b1090 R14: ffff96dd970ce000 R15: ffffffffc11cd66c [2570283.922705][T4126959] FS: 00007fa7723f8740(0000) GS:ffff96f51fac0000(0000) knlGS:0000000000000000 [2570283.938848][T4126959] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [2570283.952524][T4126959] CR2: 00000000000000f8 CR3: 0000001a73e6e006 CR4: 00000000007706e0 [2570283.967529][T4126959] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [2570283.982400][T4126959] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [2570283.997308][T4126959] PKRU: 55555554 [2570284.007649][T4126959] Call Trace: [2570284.017598][T4126959] dev_ethtool+0x1832/0x2830 Fix this by adjusting efx->xdp_tx_queue_count after probing to reflect the true value of initialized slots in efx->xdp_tx_queues. Signed-off-by: Ignat Korchagin Fixes: e26ca4b53582 ("sfc: reduce the number of requested xdp ev queues") Cc: # 5.12.x Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e531db1ea6f98c9612cb2de093a107c7eadfb96c Author: Edward Cree Date: Tue Apr 20 13:28:28 2021 +0100 sfc: farch: fix TX queue lookup in TX event handling commit 83b09a1807415608b387c7bc748d329fefc5617e upstream. We're starting from a TXQ label, not a TXQ type, so efx_channel_get_tx_queue() is inappropriate (and could return NULL, leading to panics). Fixes: 12804793b17c ("sfc: decouple TXQ type from label") Cc: stable@vger.kernel.org Signed-off-by: Edward Cree Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 98d91180748986bfb6dfb3e72765f3225719a647 Author: Edward Cree Date: Tue Apr 20 13:27:22 2021 +0100 sfc: farch: fix TX queue lookup in TX flush done handling commit 5b1faa92289b53cad654123ed2bc8e10f6ddd4ac upstream. We're starting from a TXQ instance number ('qid'), not a TXQ type, so efx_get_tx_queue() is inappropriate (and could return NULL, leading to panics). Fixes: 12804793b17c ("sfc: decouple TXQ type from label") Reported-by: Trevor Hemsley Cc: stable@vger.kernel.org Signed-off-by: Edward Cree Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 473ffbe0b25cdf3fc5157a5223d426cc722fb98c Author: Hyeongseok Kim Date: Thu Mar 4 09:15:34 2021 +0900 exfat: fix erroneous discard when clear cluster bit commit 77edfc6e51055b61cae2f54c8e6c3bb7c762e4fe upstream. If mounted with discard option, exFAT issues discard command when clear cluster bit to remove file. But the input parameter of cluster-to-sector calculation is abnormally added by reserved cluster size which is 2, leading to discard unrelated sectors included in target+2 cluster. With fixing this, remove the wrong comments in set/clear/find bitmap functions. Fixes: 1e49a94cf707 ("exfat: add bitmap operations") Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Hyeongseok Kim Acked-by: Sungjong Seo Signed-off-by: Namjae Jeon Signed-off-by: Greg Kroah-Hartman commit d04d56f855fa9fda66998dde69e39c3f8432011d Author: Sergei Trofimovich Date: Thu Apr 29 23:02:11 2021 -0700 mm: page_alloc: ignore init_on_free=1 for debug_pagealloc=1 commit 9df65f522536719682bccd24245ff94db956256c upstream. On !ARCH_SUPPORTS_DEBUG_PAGEALLOC (like ia64) debug_pagealloc=1 implies page_poison=on: if (page_poisoning_enabled() || (!IS_ENABLED(CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC) && debug_pagealloc_enabled())) static_branch_enable(&_page_poisoning_enabled); page_poison=on needs to override init_on_free=1. Before the change it did not work as expected for the following case: - have PAGE_POISONING=y - have page_poison unset - have !ARCH_SUPPORTS_DEBUG_PAGEALLOC arch (like ia64) - have init_on_free=1 - have debug_pagealloc=1 That way we get both keys enabled: - static_branch_enable(&init_on_free); - static_branch_enable(&_page_poisoning_enabled); which leads to poisoned pages returned for __GFP_ZERO pages. After the change we execute only: - static_branch_enable(&_page_poisoning_enabled); and ignore init_on_free=1. Link: https://lkml.kernel.org/r/20210329222555.3077928-1-slyfox@gentoo.org Link: https://lkml.org/lkml/2021/3/26/443 Fixes: 8db26a3d4735 ("mm, page_poison: use static key more efficiently") Signed-off-by: Sergei Trofimovich Acked-by: Vlastimil Babka Reviewed-by: David Hildenbrand Cc: Andrey Konovalov Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit b945c480ae85ad2dbeaf2f7f10b4a3baa4263cf0 Author: Vivek Goyal Date: Wed Oct 21 16:12:49 2020 -0400 fuse: fix write deadlock commit 4f06dd92b5d0a6f8eec6a34b8d6ef3e1f4ac1e10 upstream. There are two modes for write(2) and friends in fuse: a) write through (update page cache, send sync WRITE request to userspace) b) buffered write (update page cache, async writeout later) The write through method kept all the page cache pages locked that were used for the request. Keeping more than one page locked is deadlock prone and Qian Cai demonstrated this with trinity fuzzing. The reason for keeping the pages locked is that concurrent mapped reads shouldn't try to pull possibly stale data into the page cache. For full page writes, the easy way to fix this is to make the cached page be the authoritative source by marking the page PG_uptodate immediately. After this the page can be safely unlocked, since mapped/cached reads will take the written data from the cache. Concurrent mapped writes will now cause data in the original WRITE request to be updated; this however doesn't cause any data inconsistency and this scenario should be exceedingly rare anyway. If the WRITE request returns with an error in the above case, currently the page is not marked uptodate; this means that a concurrent read will always read consistent data. After this patch the page is uptodate between writing to the cache and receiving the error: there's window where a cached read will read the wrong data. While theoretically this could be a regression, it is unlikely to be one in practice, since this is normal for buffered writes. In case of a partial page write to an already uptodate page the locking is also unnecessary, with the above caveats. Partial write of a not uptodate page still needs to be handled. One way would be to read the complete page before doing the write. This is not possible, since it might break filesystems that don't expect any READ requests when the file was opened O_WRONLY. The other solution is to serialize the synchronous write with reads from the partial pages. The easiest way to do this is to keep the partial pages locked. The problem is that a write() may involve two such pages (one head and one tail). This patch fixes it by only locking the partial tail page. If there's a partial head page as well, then split that off as a separate WRITE request. Reported-by: Qian Cai Link: https://lore.kernel.org/linux-fsdevel/4794a3fa3742a5e84fb0f934944204b55730829b.camel@lca.pw/ Fixes: ea9b9907b82a ("fuse: implement perform_write") Cc: # v2.6.26 Signed-off-by: Vivek Goyal Signed-off-by: Miklos Szeredi Signed-off-by: Greg Kroah-Hartman commit 00fd897153cadc2fd94c937bedf60cec25707762 Author: Heinz Mauelshagen Date: Wed Apr 21 23:32:36 2021 +0200 dm raid: fix inconclusive reshape layout on fast raid4/5/6 table reload sequences commit f99a8e4373eeacb279bc9696937a55adbff7a28a upstream. If fast table reloads occur during an ongoing reshape of raid4/5/6 devices the target may race reading a superblock vs the the MD resync thread; causing an inconclusive reshape state to be read in its constructor. lvm2 test lvconvert-raid-reshape-stripes-load-reload.sh can cause BUG_ON() to trigger in md_run(), e.g.: "kernel BUG at drivers/md/raid5.c:7567!". Scenario triggering the bug: 1. the MD sync thread calls end_reshape() from raid5_sync_request() when done reshaping. However end_reshape() _only_ updates the reshape position to MaxSector keeping the changed layout configuration though (i.e. any delta disks, chunk sector or RAID algorithm changes). That inconclusive configuration is stored in the superblock. 2. dm-raid constructs a mapping, loading named inconsistent superblock as of step 1 before step 3 is able to finish resetting the reshape state completely, and calls md_run() which leads to mentioned bug in raid5.c. 3. the MD RAID personality's finish_reshape() is called; which resets the reshape information on chunk sectors, delta disks, etc. This explains why the bug is rarely seen on multi-core machines, as MD's finish_reshape() superblock update races with the dm-raid constructor's superblock load in step 2. Fix identifies inconclusive superblock content in the dm-raid constructor and resets it before calling md_run(), factoring out identifying checks into rs_is_layout_change() to share in existing rs_reshape_requested() and new rs_reset_inclonclusive_reshape(). Also enhance a comment and remove an empty line. Cc: stable@vger.kernel.org Signed-off-by: Heinz Mauelshagen Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 538244fba59fde17186322776247cd9c05be86dd Author: Paul Clements Date: Thu Apr 15 17:17:57 2021 -0400 md/raid1: properly indicate failure when ending a failed write request commit 2417b9869b81882ab90fd5ed1081a1cb2d4db1dd upstream. This patch addresses a data corruption bug in raid1 arrays using bitmaps. Without this fix, the bitmap bits for the failed I/O end up being cleared. Since we are in the failure leg of raid1_end_write_request, the request either needs to be retried (R1BIO_WriteError) or failed (R1BIO_Degraded). Fixes: eeba6809d8d5 ("md/raid1: end bio when the device faulty") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Paul Clements Signed-off-by: Song Liu Signed-off-by: Greg Kroah-Hartman commit 3606420d57140dfbcda3fd2c0ae94bdb26635639 Author: Eric Biggers Date: Sun Mar 21 22:07:48 2021 -0700 crypto: rng - fix crypto_rng_reset() refcounting when !CRYPTO_STATS commit 30d0f6a956fc74bb2e948398daf3278c6b08c7e9 upstream. crypto_stats_get() is a no-op when the kernel is compiled without CONFIG_CRYPTO_STATS, so pairing it with crypto_alg_put() unconditionally (as crypto_rng_reset() does) is wrong. Fix this by moving the call to crypto_stats_get() to just before the actual algorithm operation which might need it. This makes it always paired with crypto_stats_rng_seed(). Fixes: eed74b3eba9e ("crypto: rng - Fix a refcounting bug in crypto_rng_reset()") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 694d9d5b59e296e4229b90eb939a2e4249ac5506 Author: Nathan Chancellor Date: Fri Apr 9 15:11:55 2021 -0700 crypto: arm/curve25519 - Move '.fpu' after '.arch' commit 44200f2d9b8b52389c70e6c7bbe51e0dc6eaf938 upstream. Debian's clang carries a patch that makes the default FPU mode 'vfp3-d16' instead of 'neon' for 'armv7-a' to avoid generating NEON instructions on hardware that does not support them: https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/raw/5a61ca6f21b4ad8c6ac4970e5ea5a7b5b4486d22/debian/patches/clang-arm-default-vfp3-on-armv7a.patch https://bugs.debian.org/841474 https://bugs.debian.org/842142 https://bugs.debian.org/914268 This results in the following build error when clang's integrated assembler is used because the '.arch' directive overrides the '.fpu' directive: arch/arm/crypto/curve25519-core.S:25:2: error: instruction requires: NEON vmov.i32 q0, #1 ^ arch/arm/crypto/curve25519-core.S:26:2: error: instruction requires: NEON vshr.u64 q1, q0, #7 ^ arch/arm/crypto/curve25519-core.S:27:2: error: instruction requires: NEON vshr.u64 q0, q0, #8 ^ arch/arm/crypto/curve25519-core.S:28:2: error: instruction requires: NEON vmov.i32 d4, #19 ^ Shuffle the order of the '.arch' and '.fpu' directives so that the code builds regardless of the default FPU mode. This has been tested against both clang with and without Debian's patch and GCC. Cc: stable@vger.kernel.org Fixes: d8f1308a025f ("crypto: arm/curve25519 - wire up NEON implementation") Link: https://github.com/ClangBuiltLinux/continuous-integration2/issues/118 Reported-by: Arnd Bergmann Suggested-by: Arnd Bergmann Suggested-by: Jessica Clarke Signed-off-by: Nathan Chancellor Acked-by: Jason A. Donenfeld Reviewed-by: Nick Desaulniers Tested-by: Nick Desaulniers Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 9afbad5eb1439e9e03f1b6600c94b43e1074c2b6 Author: Stefan Berger Date: Wed Mar 10 17:19:16 2021 -0500 tpm: vtpm_proxy: Avoid reading host log when using a virtual device commit 9716ac65efc8f780549b03bddf41e60c445d4709 upstream. Avoid allocating memory and reading the host log when a virtual device is used since this log is of no use to that driver. A virtual device can be identified through the flag TPM_CHIP_FLAG_VIRTUAL, which is only set for the tpm_vtpm_proxy driver. Cc: stable@vger.kernel.org Fixes: 6f99612e2500 ("tpm: Proxy driver for supporting multiple emulated TPMs") Signed-off-by: Stefan Berger Reviewed-by: Jarkko Sakkinen Signed-off-by: Jarkko Sakkinen Signed-off-by: Greg Kroah-Hartman commit ac07c557ca12ec9276c0375517bac7ae5be4e50c Author: Stefan Berger Date: Wed Mar 10 17:19:14 2021 -0500 tpm: efi: Use local variable for calculating final log size commit 48cff270b037022e37835d93361646205ca25101 upstream. When tpm_read_log_efi is called multiple times, which happens when one loads and unloads a TPM2 driver multiple times, then the global variable efi_tpm_final_log_size will at some point become a negative number due to the subtraction of final_events_preboot_size occurring each time. Use a local variable to avoid this integer underflow. The following issue is now resolved: Mar 8 15:35:12 hibinst kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Mar 8 15:35:12 hibinst kernel: Workqueue: tpm-vtpm vtpm_proxy_work [tpm_vtpm_proxy] Mar 8 15:35:12 hibinst kernel: RIP: 0010:__memcpy+0x12/0x20 Mar 8 15:35:12 hibinst kernel: Code: 00 b8 01 00 00 00 85 d2 74 0a c7 05 44 7b ef 00 0f 00 00 00 c3 cc cc cc 66 66 90 66 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 f3 a4 Mar 8 15:35:12 hibinst kernel: RSP: 0018:ffff9ac4c0fcfde0 EFLAGS: 00010206 Mar 8 15:35:12 hibinst kernel: RAX: ffff88f878cefed5 RBX: ffff88f878ce9000 RCX: 1ffffffffffffe0f Mar 8 15:35:12 hibinst kernel: RDX: 0000000000000003 RSI: ffff9ac4c003bff9 RDI: ffff88f878cf0e4d Mar 8 15:35:12 hibinst kernel: RBP: ffff9ac4c003b000 R08: 0000000000001000 R09: 000000007e9d6073 Mar 8 15:35:12 hibinst kernel: R10: ffff9ac4c003b000 R11: ffff88f879ad3500 R12: 0000000000000ed5 Mar 8 15:35:12 hibinst kernel: R13: ffff88f878ce9760 R14: 0000000000000002 R15: ffff88f77de7f018 Mar 8 15:35:12 hibinst kernel: FS: 0000000000000000(0000) GS:ffff88f87bd00000(0000) knlGS:0000000000000000 Mar 8 15:35:12 hibinst kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Mar 8 15:35:12 hibinst kernel: CR2: ffff9ac4c003c000 CR3: 00000001785a6004 CR4: 0000000000060ee0 Mar 8 15:35:12 hibinst kernel: Call Trace: Mar 8 15:35:12 hibinst kernel: tpm_read_log_efi+0x152/0x1a7 Mar 8 15:35:12 hibinst kernel: tpm_bios_log_setup+0xc8/0x1c0 Mar 8 15:35:12 hibinst kernel: tpm_chip_register+0x8f/0x260 Mar 8 15:35:12 hibinst kernel: vtpm_proxy_work+0x16/0x60 [tpm_vtpm_proxy] Mar 8 15:35:12 hibinst kernel: process_one_work+0x1b4/0x370 Mar 8 15:35:12 hibinst kernel: worker_thread+0x53/0x3e0 Mar 8 15:35:12 hibinst kernel: ? process_one_work+0x370/0x370 Cc: stable@vger.kernel.org Fixes: 166a2809d65b ("tpm: Don't duplicate events from the final event log in the TCG2 log") Signed-off-by: Stefan Berger Reviewed-by: Jarkko Sakkinen Signed-off-by: Jarkko Sakkinen Signed-off-by: Greg Kroah-Hartman commit e09145e208326b747c135ad6c59bdd7c8591e664 Author: Alexander Shishkin Date: Wed Apr 14 20:12:51 2021 +0300 intel_th: pci: Add Alder Lake-M support commit 48cb17531b15967d9d3f34c770a25cc6c4ca6ad1 upstream. This adds support for the Trace Hub in Alder Lake-M PCH. Signed-off-by: Alexander Shishkin Reviewed-by: Andy Shevchenko Cc: stable@vger.kernel.org # v4.14+ Link: https://lore.kernel.org/r/20210414171251.14672-8-alexander.shishkin@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 467c41f8b18d5ae633ad276942a23e3cbf3c550a Author: Michael Ellerman Date: Sun Apr 25 21:58:31 2021 +1000 powerpc/kvm: Fix build error when PPC_MEM_KEYS/PPC_PSERIES=n commit ee1bc694fbaec1a662770703fc34a74abf418938 upstream. lkp reported a randconfig failure: In file included from arch/powerpc/include/asm/book3s/64/pkeys.h:6, from arch/powerpc/kvm/book3s_64_mmu_host.c:15: arch/powerpc/include/asm/book3s/64/hash-pkey.h: In function 'hash__vmflag_to_pte_pkey_bits': >> arch/powerpc/include/asm/book3s/64/hash-pkey.h:10:23: error: 'VM_PKEY_BIT0' undeclared 10 | return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT0 : 0x0UL) | | ^~~~~~~~~~~~ We added the include of book3s/64/pkeys.h for pte_to_hpte_pkey_bits(), but that header on its own should only be included when PPC_MEM_KEYS=y. Instead include linux/pkeys.h, which brings in the right definitions when PPC_MEM_KEYS=y and also provides empty stubs when PPC_MEM_KEYS=n. Fixes: e4e8bc1df691 ("powerpc/kvm: Fix PR KVM with KUAP/MEM_KEYS enabled") Cc: stable@vger.kernel.org # v5.11+ Reported-by: kernel test robot Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210425115831.2818434-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman commit 5fa84d105f27bd984c86a7c723925dff0f745ecf Author: Michael Ellerman Date: Mon Apr 19 22:01:39 2021 +1000 powerpc/kvm: Fix PR KVM with KUAP/MEM_KEYS enabled commit e4e8bc1df691ba5ba749d1e2b67acf9827e51a35 upstream. The changes to add KUAP support with the hash MMU broke booting of KVM PR guests. The symptom is no visible progress of the guest, or possibly just "SLOF" being printed to the qemu console. Host code is still executing, but breaking into xmon might show a stack trace such as: __might_fault+0x84/0xe0 (unreliable) kvm_read_guest+0x1c8/0x2f0 [kvm] kvmppc_ld+0x1b8/0x2d0 [kvm] kvmppc_load_last_inst+0x50/0xa0 [kvm] kvmppc_exit_pr_progint+0x178/0x220 [kvm_pr] kvmppc_handle_exit_pr+0x31c/0xe30 [kvm_pr] after_sprg3_load+0x80/0x90 [kvm_pr] kvmppc_vcpu_run_pr+0x104/0x260 [kvm_pr] kvmppc_vcpu_run+0x34/0x48 [kvm] kvm_arch_vcpu_ioctl_run+0x340/0x450 [kvm] kvm_vcpu_ioctl+0x2ac/0x8c0 [kvm] sys_ioctl+0x320/0x1060 system_call_exception+0x160/0x270 system_call_common+0xf0/0x27c Bisect points to commit b2ff33a10c8b ("powerpc/book3s64/hash/kuap: Enable kuap on hash"), but that's just the commit that enabled KUAP with hash and made the bug visible. The root cause seems to be that KVM PR is creating kernel mappings that don't use the correct key, since we switched to using key 3. We have a helper for adding the right key value, however it's designed to take a pteflags variable, which the KVM code doesn't have. But we can make it work by passing 0 for the pteflags, and tell it explicitly that it should use the kernel key. With that changed guests boot successfully. Fixes: d94b827e89dc ("powerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation") Cc: stable@vger.kernel.org # v5.11+ Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210419120139.1455937-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman commit 524a894b21e64ae86d53030e2aab1182c2ce4083 Author: Tony Ambardar Date: Thu Sep 17 06:54:37 2020 -0700 powerpc: fix EDEADLOCK redefinition error in uapi/asm/errno.h commit 7de21e679e6a789f3729e8402bc440b623a28eae upstream. A few archs like powerpc have different errno.h values for macros EDEADLOCK and EDEADLK. In code including both libc and linux versions of errno.h, this can result in multiple definitions of EDEADLOCK in the include chain. Definitions to the same value (e.g. seen with mips) do not raise warnings, but on powerpc there are redefinitions changing the value, which raise warnings and errors (if using "-Werror"). Guard against these redefinitions to avoid build errors like the following, first seen cross-compiling libbpf v5.8.9 for powerpc using GCC 8.4.0 with musl 1.1.24: In file included from ../../arch/powerpc/include/uapi/asm/errno.h:5, from ../../include/linux/err.h:8, from libbpf.c:29: ../../include/uapi/asm-generic/errno.h:40: error: "EDEADLOCK" redefined [-Werror] #define EDEADLOCK EDEADLK In file included from toolchain-powerpc_8540_gcc-8.4.0_musl/include/errno.h:10, from libbpf.c:26: toolchain-powerpc_8540_gcc-8.4.0_musl/include/bits/errno.h:58: note: this is the location of the previous definition #define EDEADLOCK 58 cc1: all warnings being treated as errors Cc: Stable Reported-by: Rosen Penev Signed-off-by: Tony Ambardar Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200917135437.1238787-1-Tony.Ambardar@gmail.com Signed-off-by: Greg Kroah-Hartman commit fe261343667365a07d978bed8df59e94e2a4836d Author: Christophe Leroy Date: Thu Apr 29 16:52:09 2021 +0000 powerpc/32: Fix boot failure with CONFIG_STACKPROTECTOR commit f5668260b872e89b8d3942a8b7d4278aa9c2c981 upstream. Commit 7c95d8893fb5 ("powerpc: Change calling convention for create_branch() et. al.") complexified the frame of function do_feature_fixups(), leading to GCC setting up a stack guard when CONFIG_STACKPROTECTOR is selected. The problem is that do_feature_fixups() is called very early while 'current' in r2 is not set up yet and the code is still not at the final address used at link time. So, like other instrumentation, stack protection needs to be deactivated for feature-fixups.c and code-patching.c Fixes: 7c95d8893fb5 ("powerpc: Change calling convention for create_branch() et. al.") Cc: stable@vger.kernel.org # v5.8+ Reported-by: Jonathan Neuschaefer Signed-off-by: Christophe Leroy Tested-by: Jonathan Neuschaefer Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/b688fe82927b330349d9e44553363fa451ea4d95.1619715114.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman commit b54b85bd073acc2a51615ea9b74cc594d70a6b55 Author: Sourabh Jain Date: Thu Apr 29 11:32:56 2021 +0530 powerpc/kexec_file: Use current CPU info while setting up FDT commit 40c753993e3aad51a12c21233486e2037417a4d6 upstream. kexec_file_load() uses initial_boot_params in setting up the device tree for the kernel to be loaded. Though initial_boot_params holds info about CPUs at the time of boot, it doesn't account for hot added CPUs. So, kexec'ing with kexec_file_load() syscall leaves the kexec'ed kernel with inaccurate CPU info. If kdump kernel is loaded with kexec_file_load() syscall and the system crashes on a hot added CPU, the capture kernel hangs failing to identify the boot CPU, with no output. To avoid this from happening, extract current CPU info from of_root device node and use it for setting up the fdt in kexec_file_load case. Fixes: 6ecd0163d360 ("powerpc/kexec_file: Add appropriate regions for memory reserve map") Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Sourabh Jain Reviewed-by: Hari Bathini Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210429060256.199714-1-sourabhjain@linux.ibm.com Signed-off-by: Greg Kroah-Hartman commit a43ae32fc22bb8bae4c03e0d83bf6511ec4dd849 Author: Mahesh Salgaonkar Date: Mon Apr 12 13:22:50 2021 +0530 powerpc/eeh: Fix EEH handling for hugepages in ioremap space. commit 5ae5bc12d0728db60a0aa9b62160ffc038875f1a upstream. During the EEH MMIO error checking, the current implementation fails to map the (virtual) MMIO address back to the pci device on radix with hugepage mappings for I/O. This results into failure to dispatch EEH event with no recovery even when EEH capability has been enabled on the device. eeh_check_failure(token) # token = virtual MMIO address addr = eeh_token_to_phys(token); edev = eeh_addr_cache_get_dev(addr); if (!edev) return 0; eeh_dev_check_failure(edev); <= Dispatch the EEH event In case of hugepage mappings, eeh_token_to_phys() has a bug in virt -> phys translation that results in wrong physical address, which is then passed to eeh_addr_cache_get_dev() to match it against cached pci I/O address ranges to get to a PCI device. Hence, it fails to find a match and the EEH event never gets dispatched leaving the device in failed state. The commit 33439620680be ("powerpc/eeh: Handle hugepages in ioremap space") introduced following logic to translate virt to phys for hugepage mappings: eeh_token_to_phys(): + pa = pte_pfn(*ptep); + + /* On radix we can do hugepage mappings for io, so handle that */ + if (hugepage_shift) { + pa <<= hugepage_shift; <= This is wrong + pa |= token & ((1ul << hugepage_shift) - 1); + } This patch fixes the virt -> phys translation in eeh_token_to_phys() function. $ cat /sys/kernel/debug/powerpc/eeh_address_cache mem addr range [0x0000040080000000-0x00000400807fffff]: 0030:01:00.1 mem addr range [0x0000040080800000-0x0000040080ffffff]: 0030:01:00.1 mem addr range [0x0000040081000000-0x00000400817fffff]: 0030:01:00.0 mem addr range [0x0000040081800000-0x0000040081ffffff]: 0030:01:00.0 mem addr range [0x0000040082000000-0x000004008207ffff]: 0030:01:00.1 mem addr range [0x0000040082080000-0x00000400820fffff]: 0030:01:00.0 mem addr range [0x0000040082100000-0x000004008210ffff]: 0030:01:00.1 mem addr range [0x0000040082110000-0x000004008211ffff]: 0030:01:00.0 Above is the list of cached io address ranges of pci 0030:01:00.. Before this patch: Tracing 'arg1' of function eeh_addr_cache_get_dev() during error injection clearly shows that 'addr=' contains wrong physical address: kworker/u16:0-7 [001] .... 108.883775: eeh_addr_cache_get_dev: (eeh_addr_cache_get_dev+0xc/0xf0) addr=0x80103000a510 dmesg shows no EEH recovery messages: [ 108.563768] bnx2x: [bnx2x_timer:5801(eth2)]MFW seems hanged: drv_pulse (0x9ae) != mcp_pulse (0x7fff) [ 108.563788] bnx2x: [bnx2x_hw_stats_update:870(eth2)]NIG timer max (4294967295) [ 108.883788] bnx2x: [bnx2x_acquire_hw_lock:2013(eth1)]lock_status 0xffffffff resource_bit 0x1 [ 108.884407] bnx2x 0030:01:00.0 eth1: MDC/MDIO access timeout [ 108.884976] bnx2x 0030:01:00.0 eth1: MDC/MDIO access timeout <..> After this patch: eeh_addr_cache_get_dev() trace shows correct physical address: -0 [001] ..s. 1043.123828: eeh_addr_cache_get_dev: (eeh_addr_cache_get_dev+0xc/0xf0) addr=0x40080bc7cd8 dmesg logs shows EEH recovery getting triggerred: [ 964.323980] bnx2x: [bnx2x_timer:5801(eth2)]MFW seems hanged: drv_pulse (0x746f) != mcp_pulse (0x7fff) [ 964.323991] EEH: Recovering PHB#30-PE#10000 [ 964.324002] EEH: PE location: N/A, PHB location: N/A [ 964.324006] EEH: Frozen PHB#30-PE#10000 detected <..> Fixes: 33439620680b ("powerpc/eeh: Handle hugepages in ioremap space") Cc: stable@vger.kernel.org # v5.3+ Reported-by: Dominic DeMarco Signed-off-by: Mahesh Salgaonkar Signed-off-by: Aneesh Kumar K.V Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/161821396263.48361.2796709239866588652.stgit@jupiter Signed-off-by: Greg Kroah-Hartman commit a9a8818919226e60c34693a129449594d15d38be Author: Nicholas Piggin Date: Fri Apr 2 12:41:24 2021 +1000 powerpc/powernv: Enable HAIL (HV AIL) for ISA v3.1 processors commit 49c1d07fd04f54eb588c4a1dfcedc8d22c5ffd50 upstream. Starting with ISA v3.1, LPCR[AIL] no longer controls the interrupt mode for HV=1 interrupts. Instead, a new LPCR[HAIL] bit is defined which behaves like AIL=3 for HV interrupts when set. Set HAIL on bare metal to give us mmu-on interrupts and improve performance. This also fixes an scv bug: we don't implement scv real mode (AIL=0) vectors because they are at an inconvenient location, so we just disable scv support when AIL can not be set. However powernv assumes that LPCR[AIL] will enable AIL mode so it enables scv support despite HV interrupts being AIL=0, which causes scv interrupts to go off into the weeds. Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions") Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210402024124.545826-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman commit f486ef7e762e7b38482c27ee833918bfa528bf35 Author: Dmitry Safonov <0x7f454c46@gmail.com> Date: Wed Mar 31 16:48:46 2021 +0000 powerpc/vdso: Separate vvar vma from vdso commit 1c4bce6753857dc409a0197342d18764e7f4b741 upstream. Since commit 511157ab641e ("powerpc/vdso: Move vdso datapage up front") VVAR page is in front of the VDSO area. In result it breaks CRIU (Checkpoint Restore In Userspace) [1], where CRIU expects that "[vdso]" from /proc/../maps points at ELF/vdso image, rather than at VVAR data page. Laurent made a patch to keep CRIU working (by reading aux vector). But I think it still makes sence to separate two mappings into different VMAs. It will also make ppc64 less "special" for userspace and as a side-bonus will make VVAR page un-writable by debugger (which previously would COW page and can be unexpected). I opportunistically Cc stable on it: I understand that usually such stuff isn't a stable material, but that will allow us in CRIU have one workaround less that is needed just for one release (v5.11) on one platform (ppc64), which we otherwise have to maintain. I wouldn't go as far as to say that the commit 511157ab641e is ABI regression as no other userspace got broken, but I'd really appreciate if it gets backported to v5.11 after v5.12 is released, so as not to complicate already non-simple CRIU-vdso code. Thanks! [1]: https://github.com/checkpoint-restore/criu/issues/1417 Cc: stable@vger.kernel.org # v5.11 Signed-off-by: Dmitry Safonov Signed-off-by: Christophe Leroy Tested-by: Christophe Leroy Reviewed-by: Vincenzo Frascino # vDSO parts. Acked-by: Andrei Vagin Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/f401eb1ebc0bfc4d8f0e10dc8e525fd409eb68e2.1617209142.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman commit 43aa1cfeebf447a263682379e3de57c1673ad918 Author: Longpeng(Mike) Date: Thu Apr 15 08:46:28 2021 +0800 iommu/vt-d: Force to flush iotlb before creating superpage commit 38c527aeb41926c71902dd42f788a8b093b21416 upstream. The translation caches may preserve obsolete data when the mapping size is changed, suppose the following sequence which can reveal the problem with high probability. 1.mmap(4GB,MAP_HUGETLB) 2. while (1) { (a) DMA MAP 0,0xa0000 (b) DMA UNMAP 0,0xa0000 (c) DMA MAP 0,0xc0000000 * DMA read IOVA 0 may failure here (Not present) * if the problem occurs. (d) DMA UNMAP 0,0xc0000000 } The page table(only focus on IOVA 0) after (a) is: PML4: 0x19db5c1003 entry:0xffff899bdcd2f000 PDPE: 0x1a1cacb003 entry:0xffff89b35b5c1000 PDE: 0x1a30a72003 entry:0xffff89b39cacb000 PTE: 0x21d200803 entry:0xffff89b3b0a72000 The page table after (b) is: PML4: 0x19db5c1003 entry:0xffff899bdcd2f000 PDPE: 0x1a1cacb003 entry:0xffff89b35b5c1000 PDE: 0x1a30a72003 entry:0xffff89b39cacb000 PTE: 0x0 entry:0xffff89b3b0a72000 The page table after (c) is: PML4: 0x19db5c1003 entry:0xffff899bdcd2f000 PDPE: 0x1a1cacb003 entry:0xffff89b35b5c1000 PDE: 0x21d200883 entry:0xffff89b39cacb000 (*) Because the PDE entry after (b) is present, it won't be flushed even if the iommu driver flush cache when unmap, so the obsolete data may be preserved in cache, which would cause the wrong translation at end. However, we can see the PDE entry is finally switch to 2M-superpage mapping, but it does not transform to 0x21d200883 directly: 1. PDE: 0x1a30a72003 2. __domain_mapping dma_pte_free_pagetable Set the PDE entry to ZERO Set the PDE entry to 0x21d200883 So we must flush the cache after the entry switch to ZERO to avoid the obsolete info be preserved. Cc: David Woodhouse Cc: Lu Baolu Cc: Nadav Amit Cc: Alex Williamson Cc: Joerg Roedel Cc: Kevin Tian Cc: Gonglei (Arei) Fixes: 6491d4d02893 ("intel-iommu: Free old page tables before creating superpage") Cc: # v3.0+ Link: https://lore.kernel.org/linux-iommu/670baaf8-4ff8-4e84-4be3-030b95ab5a5e@huawei.com/ Suggested-by: Lu Baolu Signed-off-by: Longpeng(Mike) Acked-by: Lu Baolu Link: https://lore.kernel.org/r/20210415004628.1779-1-longpeng2@huawei.com Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit a72354b22114e35427bcbdd54427928302ced0f2 Author: Joel Stanley Date: Wed Mar 31 00:15:37 2021 +1030 jffs2: Hook up splice_write callback commit 42984af09afc414d540fcc8247f42894b0378a91 upstream. overlayfs using jffs2 as the upper filesystem would fail in some cases since moving to v5.10. The test case used was to run 'touch' on a file that exists in the lower fs, causing the modification time to be updated. It returns EINVAL when the bug is triggered. A bisection showed this was introduced in v5.9-rc1, with commit 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops"). Reverting that commit restores the expected behaviour. Some digging showed that this was due to jffs2 lacking an implementation of splice_write. (For unknown reasons the warn_unsupported that should trigger was not displaying any output). Adding this patch resolved the issue and the test now passes. Cc: stable@vger.kernel.org Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops") Signed-off-by: Joel Stanley Reviewed-by: Christoph Hellwig Tested-by: Lei YU Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit 58db0dd22decdb97f44cf126999562c88de67dcf Author: lizhe Date: Thu Mar 18 11:06:57 2021 +0800 jffs2: Fix kasan slab-out-of-bounds problem commit 960b9a8a7676b9054d8b46a2c7db52a0c8766b56 upstream. KASAN report a slab-out-of-bounds problem. The logs are listed below. It is because in function jffs2_scan_dirent_node, we alloc "checkedlen+1" bytes for fd->name and we check crc with length rd->nsize. If checkedlen is less than rd->nsize, it will cause the slab-out-of-bounds problem. jffs2: Dirent at *** has zeroes in name. Truncating to %d char ================================================================== BUG: KASAN: slab-out-of-bounds in crc32_le+0x1ce/0x260 at addr ffff8800842cf2d1 Read of size 1 by task test_JFFS2/915 ============================================================================= BUG kmalloc-64 (Tainted: G B O ): kasan: bad access detected ----------------------------------------------------------------------------- INFO: Allocated in jffs2_alloc_full_dirent+0x2a/0x40 age=0 cpu=1 pid=915 ___slab_alloc+0x580/0x5f0 __slab_alloc.isra.24+0x4e/0x64 __kmalloc+0x170/0x300 jffs2_alloc_full_dirent+0x2a/0x40 jffs2_scan_eraseblock+0x1ca4/0x3b64 jffs2_scan_medium+0x285/0xfe0 jffs2_do_mount_fs+0x5fb/0x1bbc jffs2_do_fill_super+0x245/0x6f0 jffs2_fill_super+0x287/0x2e0 mount_mtd_aux.isra.0+0x9a/0x144 mount_mtd+0x222/0x2f0 jffs2_mount+0x41/0x60 mount_fs+0x63/0x230 vfs_kern_mount.part.6+0x6c/0x1f4 do_mount+0xae8/0x1940 SyS_mount+0x105/0x1d0 INFO: Freed in jffs2_free_full_dirent+0x22/0x40 age=27 cpu=1 pid=915 __slab_free+0x372/0x4e4 kfree+0x1d4/0x20c jffs2_free_full_dirent+0x22/0x40 jffs2_build_remove_unlinked_inode+0x17a/0x1e4 jffs2_do_mount_fs+0x1646/0x1bbc jffs2_do_fill_super+0x245/0x6f0 jffs2_fill_super+0x287/0x2e0 mount_mtd_aux.isra.0+0x9a/0x144 mount_mtd+0x222/0x2f0 jffs2_mount+0x41/0x60 mount_fs+0x63/0x230 vfs_kern_mount.part.6+0x6c/0x1f4 do_mount+0xae8/0x1940 SyS_mount+0x105/0x1d0 entry_SYSCALL_64_fastpath+0x1e/0x97 Call Trace: [] dump_stack+0x59/0x7e [] print_trailer+0x125/0x1b0 [] object_err+0x34/0x40 [] kasan_report.part.1+0x21f/0x534 [] ? vprintk+0x2d/0x40 [] ? crc32_le+0x1ce/0x260 [] kasan_report+0x26/0x30 [] __asan_load1+0x3d/0x50 [] crc32_le+0x1ce/0x260 [] ? jffs2_alloc_full_dirent+0x2a/0x40 [] jffs2_scan_eraseblock+0x1d0c/0x3b64 [] ? jffs2_scan_medium+0xccf/0xfe0 [] ? jffs2_scan_make_ino_cache+0x14c/0x14c [] ? kasan_unpoison_shadow+0x35/0x50 [] ? kasan_unpoison_shadow+0x35/0x50 [] ? kasan_kmalloc+0x5e/0x70 [] ? kmem_cache_alloc_trace+0x10c/0x2cc [] ? mtd_point+0xf7/0x130 [] jffs2_scan_medium+0x285/0xfe0 [] ? jffs2_scan_eraseblock+0x3b64/0x3b64 [] ? kasan_unpoison_shadow+0x35/0x50 [] ? kasan_unpoison_shadow+0x35/0x50 [] ? kasan_kmalloc+0x5e/0x70 [] ? __kmalloc+0x12b/0x300 [] ? kasan_kmalloc+0x5e/0x70 [] ? jffs2_sum_init+0x9f/0x240 [] jffs2_do_mount_fs+0x5fb/0x1bbc [] ? jffs2_del_noinode_dirent+0x640/0x640 [] ? kasan_kmalloc+0x5e/0x70 [] ? __init_rwsem+0x97/0xac [] jffs2_do_fill_super+0x245/0x6f0 [] jffs2_fill_super+0x287/0x2e0 [] ? jffs2_parse_options+0x594/0x594 [] mount_mtd_aux.isra.0+0x9a/0x144 [] mount_mtd+0x222/0x2f0 [] ? jffs2_parse_options+0x594/0x594 [] ? mount_mtd_aux.isra.0+0x144/0x144 [] ? free_pages+0x13/0x1c [] ? selinux_sb_copy_data+0x278/0x2e0 [] jffs2_mount+0x41/0x60 [] mount_fs+0x63/0x230 [] ? alloc_vfsmnt+0x32f/0x3b0 [] vfs_kern_mount.part.6+0x6c/0x1f4 [] do_mount+0xae8/0x1940 [] ? audit_filter_rules.constprop.6+0x1d10/0x1d10 [] ? copy_mount_string+0x40/0x40 [] ? alloc_pages_current+0xa4/0x1bc [] ? __get_free_pages+0x25/0x50 [] ? copy_mount_options.part.17+0x183/0x264 [] SyS_mount+0x105/0x1d0 [] ? copy_mnt_ns+0x560/0x560 [] ? msa_space_switch_handler+0x13d/0x190 [] entry_SYSCALL_64_fastpath+0x1e/0x97 [] ? msa_space_switch+0xb0/0xe0 Memory state around the buggy address: ffff8800842cf180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8800842cf200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff8800842cf280: fc fc fc fc fc fc 00 00 00 00 01 fc fc fc fc fc ^ ffff8800842cf300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8800842cf380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ================================================================== Cc: stable@vger.kernel.org Reported-by: Kunkun Xu Signed-off-by: lizhe Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit 5f41065f6960a808f64e464d945ea3d89c1431c0 Author: Hansem Ro Date: Thu May 6 13:27:10 2021 -0700 Input: ili210x - add missing negation for touch indication on ili210x commit ac05a8a927e5a1027592d8f98510a511dadeed14 upstream. This adds the negation needed for proper finger detection on Ilitek ili2107/ili210x. This fixes polling issues (on Amazon Kindle Fire) caused by returning false for the cooresponding finger on the touchscreen. Signed-off-by: Hansem Ro Fixes: e3559442afd2a ("ili210x - rework the touchscreen sample processing") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit 7e65ea887d0c0997f3053acd91a027af45e71c5b Author: Trond Myklebust Date: Sun Apr 18 15:00:45 2021 -0400 NFSv4: Don't discard segments marked for return in _pnfs_return_layout() commit de144ff4234f935bd2150108019b5d87a90a8a96 upstream. If the pNFS layout segment is marked with the NFS_LSEG_LAYOUTRETURN flag, then the assumption is that it has some reporting requirement to perform through a layoutreturn (e.g. flexfiles layout stats or error information). Fixes: 6d597e175012 ("pnfs: only tear down lsegs that precede seqid in LAYOUTRETURN args") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit 7cf20aa8a194a1c9913aa69c297e6d9b0af05598 Author: Trond Myklebust Date: Thu Apr 15 15:41:57 2021 -0400 NFS: Don't discard pNFS layout segments that are marked for return commit 39fd01863616964f009599e50ca5c6ea9ebf88d6 upstream. If the pNFS layout segment is marked with the NFS_LSEG_LAYOUTRETURN flag, then the assumption is that it has some reporting requirement to perform through a layoutreturn (e.g. flexfiles layout stats or error information). Fixes: e0b7d420f72a ("pNFS: Don't discard layout segments that are marked for return") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit 3d0163821c035040a46d816a42c0780f0f0a30a8 Author: Randy Dunlap Date: Mon Mar 1 16:19:30 2021 -0800 NFS: fs_context: validate UDP retrans to prevent shift out-of-bounds commit c09f11ef35955785f92369e25819bf0629df2e59 upstream. Fix shift out-of-bounds in xprt_calc_majortimeo(). This is caused by a garbage timeout (retrans) mount option being passed to nfs mount, in this case from syzkaller. If the protocol is XPRT_TRANSPORT_UDP, then 'retrans' is a shift value for a 64-bit long integer, so 'retrans' cannot be >= 64. If it is >= 64, fail the mount and return an error. Fixes: 9954bf92c0cd ("NFS: Move mount parameterisation bits into their own file") Reported-by: syzbot+ba2e91df8f74809417fa@syzkaller.appspotmail.com Reported-by: syzbot+f3a0fa110fd630ab56c8@syzkaller.appspotmail.com Signed-off-by: Randy Dunlap Cc: Trond Myklebust Cc: Anna Schumaker Cc: linux-nfs@vger.kernel.org Cc: David Howells Cc: Al Viro Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit 596e079c362ac17ed02aa1b99fdc444d62072a01 Author: Marc Zyngier Date: Wed Apr 21 17:43:16 2021 +0100 ACPI: GTDT: Don't corrupt interrupt mappings on watchdow probe failure commit 1ecd5b129252249b9bc03d7645a7bda512747277 upstream. When failing the driver probe because of invalid firmware properties, the GTDT driver unmaps the interrupt that it mapped earlier. However, it never checks whether the mapping of the interrupt actially succeeded. Even more, should the firmware report an illegal interrupt number that overlaps with the GIC SGI range, this can result in an IPI being unmapped, and subsequent fireworks (as reported by Dann Frazier). Rework the driver to have a slightly saner behaviour and actually check whether the interrupt has been mapped before unmapping things. Reported-by: dann frazier Fixes: ca9ae5ec4ef0 ("acpi/arm64: Add SBSA Generic Watchdog support in GTDT driver") Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/YH87dtTfwYgavusz@xps13.dannf Cc: Cc: Fu Wei Reviewed-by: Sudeep Holla Tested-by: dann frazier Tested-by: Hanjun Guo Reviewed-by: Hanjun Guo Reviewed-by: Lorenzo Pieralisi Link: https://lore.kernel.org/r/20210421164317.1718831-2-maz@kernel.org Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit 8e6dfb7beeb6489ac1365b8a71052e737f5da76e Author: Davide Caratti Date: Wed Apr 28 15:23:14 2021 +0200 net/sched: sch_frag: fix stack OOB read while fragmenting IPv4 packets commit 31fe34a0118e0acc958c802e830ad5d37ef6b1d3 upstream. when 'act_mirred' tries to fragment IPv4 packets that had been previously re-assembled using 'act_ct', splats like the following can be observed on kernels built with KASAN: BUG: KASAN: stack-out-of-bounds in ip_do_fragment+0x1b03/0x1f60 Read of size 1 at addr ffff888147009574 by task ping/947 CPU: 0 PID: 947 Comm: ping Not tainted 5.12.0-rc6+ #418 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 Call Trace: dump_stack+0x92/0xc1 print_address_description.constprop.7+0x1a/0x150 kasan_report.cold.13+0x7f/0x111 ip_do_fragment+0x1b03/0x1f60 sch_fragment+0x4bf/0xe40 tcf_mirred_act+0xc3d/0x11a0 [act_mirred] tcf_action_exec+0x104/0x3e0 fl_classify+0x49a/0x5e0 [cls_flower] tcf_classify_ingress+0x18a/0x820 __netif_receive_skb_core+0xae7/0x3340 __netif_receive_skb_one_core+0xb6/0x1b0 process_backlog+0x1ef/0x6c0 __napi_poll+0xaa/0x500 net_rx_action+0x702/0xac0 __do_softirq+0x1e4/0x97f do_softirq+0x71/0x90 __local_bh_enable_ip+0xdb/0xf0 ip_finish_output2+0x760/0x2120 ip_do_fragment+0x15a5/0x1f60 __ip_finish_output+0x4c2/0xea0 ip_output+0x1ca/0x4d0 ip_send_skb+0x37/0xa0 raw_sendmsg+0x1c4b/0x2d00 sock_sendmsg+0xdb/0x110 __sys_sendto+0x1d7/0x2b0 __x64_sys_sendto+0xdd/0x1b0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f82e13853eb Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 75 42 2c 00 41 89 ca 8b 00 85 c0 75 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 41 57 4d 89 c7 41 56 41 89 RSP: 002b:00007ffe01fad888 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00005571aac13700 RCX: 00007f82e13853eb RDX: 0000000000002330 RSI: 00005571aac13700 RDI: 0000000000000003 RBP: 0000000000002330 R08: 00005571aac10500 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe01faefb0 R13: 00007ffe01fad890 R14: 00007ffe01fad980 R15: 00005571aac0f0a0 The buggy address belongs to the page: page:000000001dff2e03 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x147009 flags: 0x17ffffc0001000(reserved) raw: 0017ffffc0001000 ffffea00051c0248 ffffea00051c0248 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888147009400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888147009480: f1 f1 f1 f1 04 f2 f2 f2 f2 f2 f2 f2 00 00 00 00 >ffff888147009500: 00 00 00 00 00 00 00 00 00 00 f2 f2 f2 f2 f2 f2 ^ ffff888147009580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888147009600: 00 00 00 00 00 00 00 00 00 00 00 00 00 f2 f2 f2 for IPv4 packets, sch_fragment() uses a temporary struct dst_entry. Then, in the following call graph: ip_do_fragment() ip_skb_dst_mtu() ip_dst_mtu_maybe_forward() ip_mtu_locked() the pointer to struct dst_entry is used as pointer to struct rtable: this turns the access to struct members like rt_mtu_locked into an OOB read in the stack. Fix this changing the temporary variable used for IPv4 packets in sch_fragment(), similarly to what is done for IPv6 few lines below. Fixes: c129412f74e9 ("net/sched: sch_frag: add generic packet fragment support.") Cc: # 5.11 Reported-by: Shuang Li Acked-by: Marcelo Ricardo Leitner Acked-by: Cong Wang Signed-off-by: Davide Caratti Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b3502b04e84ac5349be95fc033c17bd701d2787a Author: Davide Caratti Date: Wed Apr 28 15:23:07 2021 +0200 openvswitch: fix stack OOB read while fragmenting IPv4 packets commit 7c0ea5930c1c211931819d83cfb157bff1539a4c upstream. running openvswitch on kernels built with KASAN, it's possible to see the following splat while testing fragmentation of IPv4 packets: BUG: KASAN: stack-out-of-bounds in ip_do_fragment+0x1b03/0x1f60 Read of size 1 at addr ffff888112fc713c by task handler2/1367 CPU: 0 PID: 1367 Comm: handler2 Not tainted 5.12.0-rc6+ #418 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 Call Trace: dump_stack+0x92/0xc1 print_address_description.constprop.7+0x1a/0x150 kasan_report.cold.13+0x7f/0x111 ip_do_fragment+0x1b03/0x1f60 ovs_fragment+0x5bf/0x840 [openvswitch] do_execute_actions+0x1bd5/0x2400 [openvswitch] ovs_execute_actions+0xc8/0x3d0 [openvswitch] ovs_packet_cmd_execute+0xa39/0x1150 [openvswitch] genl_family_rcv_msg_doit.isra.15+0x227/0x2d0 genl_rcv_msg+0x287/0x490 netlink_rcv_skb+0x120/0x380 genl_rcv+0x24/0x40 netlink_unicast+0x439/0x630 netlink_sendmsg+0x719/0xbf0 sock_sendmsg+0xe2/0x110 ____sys_sendmsg+0x5ba/0x890 ___sys_sendmsg+0xe9/0x160 __sys_sendmsg+0xd3/0x170 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f957079db07 Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 eb ec ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 24 ed ff ff 48 RSP: 002b:00007f956ce35a50 EFLAGS: 00000293 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000019 RCX: 00007f957079db07 RDX: 0000000000000000 RSI: 00007f956ce35ae0 RDI: 0000000000000019 RBP: 00007f956ce35ae0 R08: 0000000000000000 R09: 00007f9558006730 R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000 R13: 00007f956ce37308 R14: 00007f956ce35f80 R15: 00007f956ce35ae0 The buggy address belongs to the page: page:00000000af2a1d93 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x112fc7 flags: 0x17ffffc0000000() raw: 0017ffffc0000000 0000000000000000 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected addr ffff888112fc713c is located in stack of task handler2/1367 at offset 180 in frame: ovs_fragment+0x0/0x840 [openvswitch] this frame has 2 objects: [32, 144) 'ovs_dst' [192, 424) 'ovs_rt' Memory state around the buggy address: ffff888112fc7000: f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888112fc7080: 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00 >ffff888112fc7100: 00 00 00 f2 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 ^ ffff888112fc7180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888112fc7200: 00 00 00 00 00 00 f2 f2 f2 00 00 00 00 00 00 00 for IPv4 packets, ovs_fragment() uses a temporary struct dst_entry. Then, in the following call graph: ip_do_fragment() ip_skb_dst_mtu() ip_dst_mtu_maybe_forward() ip_mtu_locked() the pointer to struct dst_entry is used as pointer to struct rtable: this turns the access to struct members like rt_mtu_locked into an OOB read in the stack. Fix this changing the temporary variable used for IPv4 packets in ovs_fragment(), similarly to what is done for IPv6 few lines below. Fixes: d52e5a7e7ca4 ("ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmt") Cc: Acked-by: Eelco Chaudron Signed-off-by: Davide Caratti Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e0041e24ad15a685da41b3b6fd0ccd26ab90815d Author: Ido Schimmel Date: Thu May 6 10:23:08 2021 +0300 mlxsw: spectrum_mr: Update egress RIF list before route's action commit cbaf3f6af9c268caf558c8e7ec52bcb35c5455dd upstream. Each multicast route that is forwarding packets (as opposed to trapping them) points to a list of egress router interfaces (RIFs) through which packets are replicated. A route's action can transition from trap to forward when a RIF is created for one of the route's egress virtual interfaces (eVIF). When this happens, the route's action is first updated and only later the list of egress RIFs is committed to the device. This results in the route pointing to an invalid list. In case the list pointer is out of range (due to uninitialized memory), the device will complain: mlxsw_spectrum2 0000:06:00.0: EMAD reg access failed (tid=5733bf490000905c,reg_id=300f(pefa),type=write,status=7(bad parameter)) Fix this by first committing the list of egress RIFs to the device and only later update the route's action. Note that a fix is not needed in the reverse function (i.e., mlxsw_sp_mr_route_evif_unresolve()), as there the route's action is first updated and only later the RIF is removed from the list. Cc: stable@vger.kernel.org Fixes: c011ec1bbfd6 ("mlxsw: spectrum: Add the multicast routing offloading logic") Signed-off-by: Ido Schimmel Reviewed-by: Petr Machata Link: https://lore.kernel.org/r/20210506072308.3834303-1-idosch@idosch.org Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 0b60f23e29c8dfcf1b8a037fae1167e4f2e3249e Author: Chao Yu Date: Mon Mar 22 19:47:30 2021 +0800 f2fs: fix to avoid out-of-bounds memory access commit b862676e371715456c9dade7990c8004996d0d9e upstream. butt3rflyh4ck reported a bug found by syzkaller fuzzer with custom modifications in 5.12.0-rc3+ [1]: dump_stack+0xfa/0x151 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x82/0x32c mm/kasan/report.c:232 __kasan_report mm/kasan/report.c:399 [inline] kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416 f2fs_test_bit fs/f2fs/f2fs.h:2572 [inline] current_nat_addr fs/f2fs/node.h:213 [inline] get_next_nat_page fs/f2fs/node.c:123 [inline] __flush_nat_entry_set fs/f2fs/node.c:2888 [inline] f2fs_flush_nat_entries+0x258e/0x2960 fs/f2fs/node.c:2991 f2fs_write_checkpoint+0x1372/0x6a70 fs/f2fs/checkpoint.c:1640 f2fs_issue_checkpoint+0x149/0x410 fs/f2fs/checkpoint.c:1807 f2fs_sync_fs+0x20f/0x420 fs/f2fs/super.c:1454 __sync_filesystem fs/sync.c:39 [inline] sync_filesystem fs/sync.c:67 [inline] sync_filesystem+0x1b5/0x260 fs/sync.c:48 generic_shutdown_super+0x70/0x370 fs/super.c:448 kill_block_super+0x97/0xf0 fs/super.c:1394 The root cause is, if nat entry in checkpoint journal area is corrupted, e.g. nid of journalled nat entry exceeds max nid value, during checkpoint, once it tries to flush nat journal to NAT area, get_next_nat_page() may access out-of-bounds memory on nat_bitmap due to it uses wrong nid value as bitmap offset. [1] https://lore.kernel.org/lkml/CAFcO6XOMWdr8pObek6eN6-fs58KG9doRFadgJj-FnF-1x43s2g@mail.gmail.com/T/#u Reported-and-tested-by: butt3rflyh4ck Signed-off-by: Chao Yu Signed-off-by: Jaegeuk Kim Signed-off-by: Greg Kroah-Hartman commit befeb96e17f7923a5d891b5b55995a6ab745cd05 Author: Eric Biggers Date: Thu Mar 4 21:43:10 2021 -0800 f2fs: fix error handling in f2fs_end_enable_verity() commit 3c0315424f5e3d2a4113c7272367bee1e8e6a174 upstream. f2fs didn't properly clean up if verity failed to be enabled on a file: - It left verity metadata (pages past EOF) in the page cache, which would be exposed to userspace if the file was later extended. - It didn't truncate the verity metadata at all (either from cache or from disk) if an error occurred while setting the verity bit. Fix these bugs by adding a call to truncate_inode_pages() and ensuring that we truncate the verity metadata (both from cache and from disk) in all error paths. Also rework the code to cleanly separate the success path from the error paths, which makes it much easier to understand. Finally, log a message if f2fs_truncate() fails, since it might otherwise fail silently. Reported-by: Yunlei He Fixes: 95ae251fe828 ("f2fs: add fs-verity support") Cc: # v5.4+ Signed-off-by: Eric Biggers Reviewed-by: Chao Yu Signed-off-by: Jaegeuk Kim Signed-off-by: Greg Kroah-Hartman commit 325b731983d010d358505bf2c40f5e0069af62d2 Author: Guochun Mao Date: Tue Mar 16 16:52:14 2021 +0800 ubifs: Only check replay with inode type to judge if inode linked commit 3e903315790baf4a966436e7f32e9c97864570ac upstream. Conside the following case, it just write a big file into flash, when complete writing, delete the file, and then power off promptly. Next time power on, we'll get a replay list like: ... LEB 1105:211344 len 4144 deletion 0 sqnum 428783 key type 1 inode 80 LEB 15:233544 len 160 deletion 1 sqnum 428785 key type 0 inode 80 LEB 1105:215488 len 4144 deletion 0 sqnum 428787 key type 1 inode 80 ... In the replay list, data nodes' deletion are 0, and the inode node's deletion is 1. In current logic, the file's dentry will be removed, but inode and the flash space it occupied will be reserved. User will see that much free space been disappeared. We only need to check the deletion value of the following inode type node of the replay entry. Fixes: e58725d51fa8 ("ubifs: Handle re-linking of inodes correctly while recovery") Cc: stable@vger.kernel.org Signed-off-by: Guochun Mao Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit 205332585eebe3faddb4c77eeff7ef9bf9fd375c Author: Marco Elver Date: Wed Mar 3 10:38:45 2021 +0100 kcsan, debugfs: Move debugfs file creation out of early init commit e36299efe7d749976fbdaaf756dee6ef32543c2c upstream. Commit 56348560d495 ("debugfs: do not attempt to create a new file before the filesystem is initalized") forbids creating new debugfs files until debugfs is fully initialized. This means that KCSAN's debugfs file creation, which happened at the end of __init(), no longer works. And was apparently never supposed to work! However, there is no reason to create KCSAN's debugfs file so early. This commit therefore moves its creation to a late_initcall() callback. Cc: "Rafael J. Wysocki" Cc: stable Fixes: 56348560d495 ("debugfs: do not attempt to create a new file before the filesystem is initalized") Reviewed-by: Greg Kroah-Hartman Signed-off-by: Marco Elver Signed-off-by: Paul E. McKenney Signed-off-by: Greg Kroah-Hartman commit 5116e79fc6e6725b8acdad8b7e928a83ab7b47e6 Author: Luis Henriques Date: Wed Mar 17 08:44:43 2021 +0000 virtiofs: fix memory leak in virtio_fs_probe() commit c79c5e0178922a9e092ec8fed026750f39dcaef4 upstream. When accidentally passing twice the same tag to qemu, kmemleak ended up reporting a memory leak in virtiofs. Also, looking at the log I saw the following error (that's when I realised the duplicated tag): virtiofs: probe of virtio5 failed with error -17 Here's the kmemleak log for reference: unreferenced object 0xffff888103d47800 (size 1024): comm "systemd-udevd", pid 118, jiffies 4294893780 (age 18.340s) hex dump (first 32 bytes): 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... ff ff ff ff ff ff ff ff 80 90 02 a0 ff ff ff ff ................ backtrace: [<000000000ebb87c1>] virtio_fs_probe+0x171/0x7ae [virtiofs] [<00000000f8aca419>] virtio_dev_probe+0x15f/0x210 [<000000004d6baf3c>] really_probe+0xea/0x430 [<00000000a6ceeac8>] device_driver_attach+0xa8/0xb0 [<00000000196f47a7>] __driver_attach+0x98/0x140 [<000000000b20601d>] bus_for_each_dev+0x7b/0xc0 [<00000000399c7b7f>] bus_add_driver+0x11b/0x1f0 [<0000000032b09ba7>] driver_register+0x8f/0xe0 [<00000000cdd55998>] 0xffffffffa002c013 [<000000000ea196a2>] do_one_initcall+0x64/0x2e0 [<0000000008f727ce>] do_init_module+0x5c/0x260 [<000000003cdedab6>] __do_sys_finit_module+0xb5/0x120 [<00000000ad2f48c6>] do_syscall_64+0x33/0x40 [<00000000809526b5>] entry_SYSCALL_64_after_hwframe+0x44/0xae Cc: stable@vger.kernel.org Signed-off-by: Luis Henriques Fixes: a62a8ef9d97d ("virtio-fs: add virtiofs filesystem") Reviewed-by: Stefan Hajnoczi Reviewed-by: Vivek Goyal Signed-off-by: Miklos Szeredi Signed-off-by: Greg Kroah-Hartman commit f7c80e8a1b0bf3fe9175f723b65567a2cb71e08f Author: Theodore Ts'o Date: Sat Apr 17 23:03:50 2021 -0400 fs: fix reporting supported extra file attributes for statx() commit 5afa7e8b70d65819245fece61a65fd753b4aae33 upstream. statx(2) notes that any attribute that is not indicated as supported by stx_attributes_mask has no usable value. Commits 801e523796004 ("fs: move generic stat response attr handling to vfs_getattr_nosec") and 712b2698e4c02 ("fs/stat: Define DAX statx attribute") sets STATX_ATTR_AUTOMOUNT and STATX_ATTR_DAX, respectively, without setting stx_attributes_mask, which can cause xfstests generic/532 to fail. Fix this in the same way as commit 1b9598c8fb99 ("xfs: fix reporting supported extra file attributes for statx()") Fixes: 801e523796004 ("fs: move generic stat response attr handling to vfs_getattr_nosec") Fixes: 712b2698e4c02 ("fs/stat: Define DAX statx attribute") Cc: stable@kernel.org Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit fd0f06590d35c99f98d12c7984897ec4201a6263 Author: Liao Chang Date: Tue Mar 30 16:18:48 2021 +0800 riscv/kprobe: fix kernel panic when invoking sys_read traced by kprobe commit b1ebaa0e1318494a7637099a26add50509e37964 upstream. The execution of sys_read end up hitting a BUG_ON() in __find_get_block after installing kprobe at sys_read, the BUG message like the following: [ 65.708663] ------------[ cut here ]------------ [ 65.709987] kernel BUG at fs/buffer.c:1251! [ 65.711283] Kernel BUG [#1] [ 65.712032] Modules linked in: [ 65.712925] CPU: 0 PID: 51 Comm: sh Not tainted 5.12.0-rc4 #1 [ 65.714407] Hardware name: riscv-virtio,qemu (DT) [ 65.715696] epc : __find_get_block+0x218/0x2c8 [ 65.716835] ra : __getblk_gfp+0x1c/0x4a [ 65.717831] epc : ffffffe00019f11e ra : ffffffe00019f56a sp : ffffffe002437930 [ 65.719553] gp : ffffffe000f06030 tp : ffffffe0015abc00 t0 : ffffffe00191e038 [ 65.721290] t1 : ffffffe00191e038 t2 : 000000000000000a s0 : ffffffe002437960 [ 65.723051] s1 : ffffffe00160ad00 a0 : ffffffe00160ad00 a1 : 000000000000012a [ 65.724772] a2 : 0000000000000400 a3 : 0000000000000008 a4 : 0000000000000040 [ 65.726545] a5 : 0000000000000000 a6 : ffffffe00191e000 a7 : 0000000000000000 [ 65.728308] s2 : 000000000000012a s3 : 0000000000000400 s4 : 0000000000000008 [ 65.730049] s5 : 000000000000006c s6 : ffffffe00240f800 s7 : ffffffe000f080a8 [ 65.731802] s8 : 0000000000000001 s9 : 000000000000012a s10: 0000000000000008 [ 65.733516] s11: 0000000000000008 t3 : 00000000000003ff t4 : 000000000000000f [ 65.734434] t5 : 00000000000003ff t6 : 0000000000040000 [ 65.734613] status: 0000000000000100 badaddr: 0000000000000000 cause: 0000000000000003 [ 65.734901] Call Trace: [ 65.735076] [] __find_get_block+0x218/0x2c8 [ 65.735417] [] __ext4_get_inode_loc+0xb2/0x2f6 [ 65.735618] [] ext4_get_inode_loc+0x3a/0x8a [ 65.735802] [] ext4_reserve_inode_write+0x2e/0x8c [ 65.735999] [] __ext4_mark_inode_dirty+0x4c/0x18e [ 65.736208] [] ext4_dirty_inode+0x46/0x66 [ 65.736387] [] __mark_inode_dirty+0x12c/0x3da [ 65.736576] [] touch_atime+0x146/0x150 [ 65.736748] [] filemap_read+0x234/0x246 [ 65.736920] [] generic_file_read_iter+0xc0/0x114 [ 65.737114] [] ext4_file_read_iter+0x42/0xea [ 65.737310] [] new_sync_read+0xe2/0x15a [ 65.737483] [] vfs_read+0xca/0xf2 [ 65.737641] [] ksys_read+0x5e/0xc8 [ 65.737816] [] sys_read+0xe/0x16 [ 65.737973] [] ret_from_syscall+0x0/0x2 [ 65.738858] ---[ end trace fe93f985456c935d ]--- A simple reproducer looks like: echo 'p:myprobe sys_read fd=%a0 buf=%a1 count=%a2' > /sys/kernel/debug/tracing/kprobe_events echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable cat /sys/kernel/debug/tracing/trace Here's what happens to hit that BUG_ON(): 1) After installing kprobe at entry of sys_read, the first instruction is replaced by 'ebreak' instruction on riscv64 platform. 2) Once kernel reach the 'ebreak' instruction at the entry of sys_read, it trap into the riscv breakpoint handler, where it do something to setup for coming single-step of origin instruction, including backup the 'sstatus' in pt_regs, followed by disable interrupt during single stepping via clear 'SIE' bit of 'sstatus' in pt_regs. 3) Then kernel restore to the instruction slot contains two instructions, one is original instruction at entry of sys_read, the other is 'ebreak'. Here it trigger a 'Instruction page fault' exception (value at 'scause' is '0xc'), if PF is not filled into PageTabe for that slot yet. 4) Again kernel trap into page fault exception handler, where it choose different policy according to the state of running kprobe. Because afte 2) the state is KPROBE_HIT_SS, so kernel reset the current kprobe and 'pc' points back to the probe address. 5) Because 'epc' point back to 'ebreak' instrution at sys_read probe, kernel trap into breakpoint handler again, and repeat the operations at 2), however 'sstatus' without 'SIE' is keep at 4), it cause the real 'sstatus' saved at 2) is overwritten by the one withou 'SIE'. 6) When kernel cross the probe the 'sstatus' CSR restore with value without 'SIE', and reach __find_get_block where it requires the interrupt must be enabled. Fix this is very trivial, just restore the value of 'sstatus' in pt_regs with backup one at 2) when the instruction being single stepped cause a page fault. Fixes: c22b0bcb1dd02 ("riscv: Add kprobes supported") Signed-off-by: Liao Chang Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit e4228d7587b6323326618006cb64a0425c798af6 Author: Nathan Chancellor Date: Wed Apr 28 18:23:50 2021 -0700 Makefile: Move -Wno-unused-but-set-variable out of GCC only block commit 885480b084696331bea61a4f7eba10652999a9c1 upstream. Currently, -Wunused-but-set-variable is only supported by GCC so it is disabled unconditionally in a GCC only block (it is enabled with W=1). clang currently has its implementation for this warning in review so preemptively move this statement out of the GCC only block and wrap it with cc-disable-warning so that both compilers function the same. Cc: stable@vger.kernel.org Link: https://reviews.llvm.org/D100581 Signed-off-by: Nathan Chancellor Reviewed-by: Nick Desaulniers Tested-by: Nick Desaulniers Signed-off-by: Masahiro Yamada Signed-off-by: Greg Kroah-Hartman commit 48f9d3dd62833c3318a869ed895937ecd1de8fc1 Author: Bill Wendling Date: Fri Apr 23 13:51:59 2021 -0700 arm64/vdso: Discard .note.gnu.property sections in vDSO [ Upstream commit 388708028e6937f3fc5fc19aeeb847f8970f489c ] The arm64 assembler in binutils 2.32 and above generates a program property note in a note section, .note.gnu.property, to encode used x86 ISAs and features. But the kernel linker script only contains a single NOTE segment: PHDRS { text PT_LOAD FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */ dynamic PT_DYNAMIC FLAGS(4); /* PF_R */ note PT_NOTE FLAGS(4); /* PF_R */ } The NOTE segment generated by the vDSO linker script is aligned to 4 bytes. But the .note.gnu.property section must be aligned to 8 bytes on arm64. $ readelf -n vdso64.so Displaying notes found in: .note Owner Data size Description Linux 0x00000004 Unknown note type: (0x00000000) description data: 06 00 00 00 readelf: Warning: note with invalid namesz and/or descsz found at offset 0x20 readelf: Warning: type: 0x78, namesize: 0x00000100, descsize: 0x756e694c, alignment: 8 Since the note.gnu.property section in the vDSO is not checked by the dynamic linker, discard the .note.gnu.property sections in the vDSO. Similar to commit 4caffe6a28d31 ("x86/vdso: Discard .note.gnu.property sections in vDSO"), but for arm64. Signed-off-by: Bill Wendling Reviewed-by: Kees Cook Acked-by: Ard Biesheuvel Link: https://lore.kernel.org/r/20210423205159.830854-1-morbo@google.com Signed-off-by: Catalin Marinas Signed-off-by: Sasha Levin commit 868c10b9be11ac0c9cc8adb3559a1b6d21bef4e8 Author: BingJing Chang Date: Thu Mar 25 09:56:22 2021 +0800 btrfs: fix a potential hole punching failure [ Upstream commit 3227788cd369d734d2d3cd94f8af7536b60fa552 ] In commit d77815461f04 ("btrfs: Avoid trucating page or punching hole in a already existed hole."), existing holes can be skipped by calling find_first_non_hole() to adjust start and len. However, if the given len is invalid and large, when an EXTENT_MAP_HOLE extent is found, len will not be set to zero because (em->start + em->len) is less than (start + len). Then the ret will be 1 but len will not be set to 0. The propagated non-zero ret will result in fallocate failure. In the while-loop of btrfs_replace_file_extents(), len is not updated every time before it calls find_first_non_hole(). That is, after btrfs_drop_extents() successfully drops the last non-hole file extent, it may fail with ENOSPC when attempting to drop a file extent item representing a hole. The problem can happen. After it calls find_first_non_hole(), the cur_offset will be adjusted to be larger than or equal to end. However, since the len is not set to zero, the break-loop condition (ret && !len) will not be met. After it leaves the while-loop, fallocate will return 1, which is an unexpected return value. We're not able to construct a reproducible way to let btrfs_drop_extents() fail with ENOSPC after it drops the last non-hole file extent but with remaining holes left. However, it's quite easy to fix. We just need to update and check the len every time before we call find_first_non_hole(). To make the while loop more readable, we also pull the variable updates to the bottom of loop like this: while (cur_offset < end) { ... // update cur_offset & len // advance cur_offset & len in hole-punching case if needed } Reported-by: Robbie Ko Fixes: d77815461f04 ("btrfs: Avoid trucating page or punching hole in a already existed hole.") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Robbie Ko Reviewed-by: Chung-Chiang Cheng Reviewed-by: Filipe Manana Signed-off-by: BingJing Chang Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 40b87e9b12b5fcb5dc92640c3c9ccd58397b025a Author: Filipe Manana Date: Tue Apr 20 10:55:44 2021 +0100 btrfs: fix race when picking most recent mod log operation for an old root [ Upstream commit f9690f426b2134cc3e74bfc5d9dfd6a4b2ca5281 ] Commit dbcc7d57bffc0c ("btrfs: fix race when cloning extent buffer during rewind of an old root"), fixed a race when we need to rewind the extent buffer of an old root. It was caused by picking a new mod log operation for the extent buffer while getting a cloned extent buffer with an outdated number of items (off by -1), because we cloned the extent buffer without locking it first. However there is still another similar race, but in the opposite direction. The cloned extent buffer has a number of items that does not match the number of tree mod log operations that are going to be replayed. This is because right after we got the last (most recent) tree mod log operation to replay and before locking and cloning the extent buffer, another task adds a new pointer to the extent buffer, which results in adding a new tree mod log operation and incrementing the number of items in the extent buffer. So after cloning we have mismatch between the number of items in the extent buffer and the number of mod log operations we are going to apply to it. This results in hitting a BUG_ON() that produces the following stack trace: ------------[ cut here ]------------ kernel BUG at fs/btrfs/tree-mod-log.c:675! invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 3 PID: 4811 Comm: crawl_1215 Tainted: G W 5.12.0-7d1efdf501f8-misc-next+ #99 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 RIP: 0010:tree_mod_log_rewind+0x3b1/0x3c0 Code: 05 48 8d 74 10 (...) RSP: 0018:ffffc90001027090 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff8880a8514600 RCX: ffffffffaa9e59b6 RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff8880a851462c RBP: ffffc900010270e0 R08: 00000000000000c0 R09: ffffed1004333417 R10: ffff88802199a0b7 R11: ffffed1004333416 R12: 000000000000000e R13: ffff888135af8748 R14: ffff88818766ff00 R15: ffff8880a851462c FS: 00007f29acf62700(0000) GS:ffff8881f2200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f0e6013f718 CR3: 000000010d42e003 CR4: 0000000000170ee0 Call Trace: btrfs_get_old_root+0x16a/0x5c0 ? lock_downgrade+0x400/0x400 btrfs_search_old_slot+0x192/0x520 ? btrfs_search_slot+0x1090/0x1090 ? free_extent_buffer.part.61+0xd7/0x140 ? free_extent_buffer+0x13/0x20 resolve_indirect_refs+0x3e9/0xfc0 ? lock_downgrade+0x400/0x400 ? __kasan_check_read+0x11/0x20 ? add_prelim_ref.part.11+0x150/0x150 ? lock_downgrade+0x400/0x400 ? __kasan_check_read+0x11/0x20 ? lock_acquired+0xbb/0x620 ? __kasan_check_write+0x14/0x20 ? do_raw_spin_unlock+0xa8/0x140 ? rb_insert_color+0x340/0x360 ? prelim_ref_insert+0x12d/0x430 find_parent_nodes+0x5c3/0x1830 ? stack_trace_save+0x87/0xb0 ? resolve_indirect_refs+0xfc0/0xfc0 ? fs_reclaim_acquire+0x67/0xf0 ? __kasan_check_read+0x11/0x20 ? lockdep_hardirqs_on_prepare+0x210/0x210 ? fs_reclaim_acquire+0x67/0xf0 ? __kasan_check_read+0x11/0x20 ? ___might_sleep+0x10f/0x1e0 ? __kasan_kmalloc+0x9d/0xd0 ? trace_hardirqs_on+0x55/0x120 btrfs_find_all_roots_safe+0x142/0x1e0 ? find_parent_nodes+0x1830/0x1830 ? trace_hardirqs_on+0x55/0x120 ? ulist_free+0x1f/0x30 ? btrfs_inode_flags_to_xflags+0x50/0x50 iterate_extent_inodes+0x20e/0x580 ? tree_backref_for_extent+0x230/0x230 ? release_extent_buffer+0x225/0x280 ? read_extent_buffer+0xdd/0x110 ? lock_downgrade+0x400/0x400 ? __kasan_check_read+0x11/0x20 ? lock_acquired+0xbb/0x620 ? __kasan_check_write+0x14/0x20 ? do_raw_spin_unlock+0xa8/0x140 ? _raw_spin_unlock+0x22/0x30 ? release_extent_buffer+0x225/0x280 iterate_inodes_from_logical+0x129/0x170 ? iterate_inodes_from_logical+0x129/0x170 ? btrfs_inode_flags_to_xflags+0x50/0x50 ? iterate_extent_inodes+0x580/0x580 ? __vmalloc_node+0x92/0xb0 ? init_data_container+0x34/0xb0 ? init_data_container+0x34/0xb0 ? kvmalloc_node+0x60/0x80 btrfs_ioctl_logical_to_ino+0x158/0x230 btrfs_ioctl+0x2038/0x4360 ? __kasan_check_write+0x14/0x20 ? mmput+0x3b/0x220 ? btrfs_ioctl_get_supported_features+0x30/0x30 ? __kasan_check_read+0x11/0x20 ? __kasan_check_read+0x11/0x20 ? lock_release+0xc8/0x650 ? __might_fault+0x64/0xd0 ? __kasan_check_read+0x11/0x20 ? lock_downgrade+0x400/0x400 ? lockdep_hardirqs_on_prepare+0x210/0x210 ? lockdep_hardirqs_on_prepare+0x13/0x210 ? _raw_spin_unlock_irqrestore+0x51/0x63 ? __kasan_check_read+0x11/0x20 ? do_vfs_ioctl+0xfc/0x9d0 ? ioctl_file_clone+0xe0/0xe0 ? lock_downgrade+0x400/0x400 ? lockdep_hardirqs_on_prepare+0x210/0x210 ? __kasan_check_read+0x11/0x20 ? lock_release+0xc8/0x650 ? __task_pid_nr_ns+0xd3/0x250 ? __kasan_check_read+0x11/0x20 ? __fget_files+0x160/0x230 ? __fget_light+0xf2/0x110 __x64_sys_ioctl+0xc3/0x100 do_syscall_64+0x37/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f29ae85b427 Code: 00 00 90 48 8b (...) RSP: 002b:00007f29acf5fcf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f29acf5ff40 RCX: 00007f29ae85b427 RDX: 00007f29acf5ff48 RSI: 00000000c038943b RDI: 0000000000000003 RBP: 0000000001000000 R08: 0000000000000000 R09: 00007f29acf60120 R10: 00005640d5fc7b00 R11: 0000000000000246 R12: 0000000000000003 R13: 00007f29acf5ff48 R14: 00007f29acf5ff40 R15: 00007f29acf5fef8 Modules linked in: ---[ end trace 85e5fce078dfbe04 ]--- (gdb) l *(tree_mod_log_rewind+0x3b1) 0xffffffff819e5b21 is in tree_mod_log_rewind (fs/btrfs/tree-mod-log.c:675). 670 * the modification. As we're going backwards, we do the 671 * opposite of each operation here. 672 */ 673 switch (tm->op) { 674 case BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING: 675 BUG_ON(tm->slot < n); 676 fallthrough; 677 case BTRFS_MOD_LOG_KEY_REMOVE_WHILE_MOVING: 678 case BTRFS_MOD_LOG_KEY_REMOVE: 679 btrfs_set_node_key(eb, &tm->key, tm->slot); (gdb) quit The following steps explain in more detail how it happens: 1) We have one tree mod log user (through fiemap or the logical ino ioctl), with a sequence number of 1, so we have fs_info->tree_mod_seq == 1. This is task A; 2) Another task is at ctree.c:balance_level() and we have eb X currently as the root of the tree, and we promote its single child, eb Y, as the new root. Then, at ctree.c:balance_level(), we call: ret = btrfs_tree_mod_log_insert_root(root->node, child, true); 3) At btrfs_tree_mod_log_insert_root() we create a tree mod log operation of type BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING, with a ->logical field pointing to ebX->start. We only have one item in eb X, so we create only one tree mod log operation, and store in the "tm_list" array; 4) Then, still at btrfs_tree_mod_log_insert_root(), we create a tree mod log element of operation type BTRFS_MOD_LOG_ROOT_REPLACE, ->logical set to ebY->start, ->old_root.logical set to ebX->start, ->old_root.level set to the level of eb X and ->generation set to the generation of eb X; 5) Then btrfs_tree_mod_log_insert_root() calls tree_mod_log_free_eb() with "tm_list" as argument. After that, tree_mod_log_free_eb() calls tree_mod_log_insert(). This inserts the mod log operation of type BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING from step 3 into the rbtree with a sequence number of 2 (and fs_info->tree_mod_seq set to 2); 6) Then, after inserting the "tm_list" single element into the tree mod log rbtree, the BTRFS_MOD_LOG_ROOT_REPLACE element is inserted, which gets the sequence number 3 (and fs_info->tree_mod_seq set to 3); 7) Back to ctree.c:balance_level(), we free eb X by calling btrfs_free_tree_block() on it. Because eb X was created in the current transaction, has no other references and writeback did not happen for it, we add it back to the free space cache/tree; 8) Later some other task B allocates the metadata extent from eb X, since it is marked as free space in the space cache/tree, and uses it as a node for some other btree; 9) The tree mod log user task calls btrfs_search_old_slot(), which calls btrfs_get_old_root(), and finally that calls tree_mod_log_oldest_root() with time_seq == 1 and eb_root == eb Y; 10) The first iteration of the while loop finds the tree mod log element with sequence number 3, for the logical address of eb Y and of type BTRFS_MOD_LOG_ROOT_REPLACE; 11) Because the operation type is BTRFS_MOD_LOG_ROOT_REPLACE, we don't break out of the loop, and set root_logical to point to tm->old_root.logical, which corresponds to the logical address of eb X; 12) On the next iteration of the while loop, the call to tree_mod_log_search_oldest() returns the smallest tree mod log element for the logical address of eb X, which has a sequence number of 2, an operation type of BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING and corresponds to the old slot 0 of eb X (eb X had only 1 item in it before being freed at step 7); 13) We then break out of the while loop and return the tree mod log operation of type BTRFS_MOD_LOG_ROOT_REPLACE (eb Y), and not the one for slot 0 of eb X, to btrfs_get_old_root(); 14) At btrfs_get_old_root(), we process the BTRFS_MOD_LOG_ROOT_REPLACE operation and set "logical" to the logical address of eb X, which was the old root. We then call tree_mod_log_search() passing it the logical address of eb X and time_seq == 1; 15) But before calling tree_mod_log_search(), task B locks eb X, adds a key to eb X, which results in adding a tree mod log operation of type BTRFS_MOD_LOG_KEY_ADD, with a sequence number of 4, to the tree mod log, and increments the number of items in eb X from 0 to 1. Now fs_info->tree_mod_seq has a value of 4; 16) Task A then calls tree_mod_log_search(), which returns the most recent tree mod log operation for eb X, which is the one just added by task B at the previous step, with a sequence number of 4, a type of BTRFS_MOD_LOG_KEY_ADD and for slot 0; 17) Before task A locks and clones eb X, task A adds another key to eb X, which results in adding a new BTRFS_MOD_LOG_KEY_ADD mod log operation, with a sequence number of 5, for slot 1 of eb X, increments the number of items in eb X from 1 to 2, and unlocks eb X. Now fs_info->tree_mod_seq has a value of 5; 18) Task A then locks eb X and clones it. The clone has a value of 2 for the number of items and the pointer "tm" points to the tree mod log operation with sequence number 4, not the most recent one with a sequence number of 5, so there is mismatch between the number of mod log operations that are going to be applied to the cloned version of eb X and the number of items in the clone; 19) Task A then calls tree_mod_log_rewind() with the clone of eb X, the tree mod log operation with sequence number 4 and a type of BTRFS_MOD_LOG_KEY_ADD, and time_seq == 1; 20) At tree_mod_log_rewind(), we set the local variable "n" with a value of 2, which is the number of items in the clone of eb X. Then in the first iteration of the while loop, we process the mod log operation with sequence number 4, which is targeted at slot 0 and has a type of BTRFS_MOD_LOG_KEY_ADD. This results in decrementing "n" from 2 to 1. Then we pick the next tree mod log operation for eb X, which is the tree mod log operation with a sequence number of 2, a type of BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING and for slot 0, it is the one added in step 5 to the tree mod log tree. We go back to the top of the loop to process this mod log operation, and because its slot is 0 and "n" has a value of 1, we hit the BUG_ON: (...) switch (tm->op) { case BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING: BUG_ON(tm->slot < n); fallthrough; (...) Fix this by checking for a more recent tree mod log operation after locking and cloning the extent buffer of the old root node, and use it as the first operation to apply to the cloned extent buffer when rewinding it. Stable backport notes: due to moved code and renames, in =< 5.11 the change should be applied to ctree.c:get_old_root. Reported-by: Zygo Blaxell Link: https://lore.kernel.org/linux-btrfs/20210404040732.GZ32440@hungrycats.org/ Fixes: 834328a8493079 ("Btrfs: tree mod log's old roots could still be part of the tree") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 900fb3279e5edf4de5154307fec7c9f6cf67958c Author: Bas Nieuwenhuizen Date: Wed Apr 28 17:09:03 2021 +0800 tools/power/turbostat: Fix turbostat for AMD Zen CPUs commit 301b1d3a9104f4f3a8ab4171cf88d0f55d632b41 upstream. It was reported that on Zen+ system turbostat started exiting, which was tracked down to the MSR_PKG_ENERGY_STAT read failing because offset_to_idx wasn't returning a non-negative index. This patch combined the modification from Bingsong Si and Bas Nieuwenhuizen and addd the MSR to the index system as alternative for MSR_PKG_ENERGY_STATUS. Fixes: 9972d5d84d76 ("tools/power turbostat: Enable accumulate RAPL display") Reported-by: youling257 Tested-by: youling257 Tested-by: Kurt Garloff Tested-by: Bingsong Si Tested-by: Artem S. Tashkinov Co-developed-by: Bingsong Si Co-developed-by: Terry Bowman Signed-off-by: Bas Nieuwenhuizen Reviewed-by: Chen Yu Signed-off-by: Len Brown Cc: Salvatore Bonaccorso Signed-off-by: Greg Kroah-Hartman commit 19c11722771f7852e3ee25c4583b127529d2e349 Author: Eckhart Mohr Date: Tue Apr 27 17:30:25 2021 +0200 ALSA: hda/realtek: Add quirk for Intel Clevo PCx0Dx commit 970e3012c04c96351c413f193a9c909e6d871ce2 upstream. This applies a SND_PCI_QUIRK(...) to the Clevo PCx0Dx barebones. This fix enables audio output over the headset jack and ensures that a microphone connected via the headset combo jack is correctly recognized when pluged in. [ Rearranged the list entries in a sorted order -- tiwai ] Signed-off-by: Eckhart Mohr Co-developed-by: Werner Sembach Signed-off-by: Werner Sembach Cc: Link: https://lore.kernel.org/r/20210427153025.451118-1-wse@tuxedocomputers.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit b821885bfcde01bb0ee632e703fb8c51aec0547c Author: Sami Loone Date: Sun Apr 25 22:37:12 2021 +0200 ALSA: hda/realtek: fix static noise on ALC285 Lenovo laptops commit 9bbb94e57df135ef61bef075d9c99b8d9e89e246 upstream. Remove a duplicate vendor+subvendor pin fixup entry as one is masking the other and making it unreachable. Consider the more specific newcomer as a second chance instead. The generic entry is made less strict to also match for laptops with slightly different 0x12 pin configuration. Tested on Lenovo Yoga 6 (AMD) where 0x12 is 0x40000000. Fixes: 607184cb1635 ("ALSA: hda/realtek - Add supported for more Lenovo ALC285 Headset Button") Signed-off-by: Sami Loone Cc: Link: https://lore.kernel.org/r/YIXS+GT/dGI/LtK6@yoga Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 14ad4409418fa2e81cf80626c4d9b4663c3f7738 Author: Kailang Yang Date: Tue Apr 20 14:17:34 2021 +0800 ALSA: hda/realtek - Headset Mic issue on HP platform commit 1c9d9dfd2d254211cb37b1513b1da3e6835b8f00 upstream. Boot with plugged headset, the Headset Mic will be gone. Signed-off-by: Kailang Yang Cc: Link: https://lore.kernel.org/r/207eecfc3189466a820720bc0c409ea9@realtek.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 2c10b675918ed1086c9eb74e7b54dea48beba3cf Author: Phil Calvin Date: Thu Apr 15 18:01:29 2021 -0400 ALSA: hda/realtek: fix mic boost on Intel NUC 8 commit d1ee66c5d3c5a0498dd5e3f2af5b8c219a98bba5 upstream. Fix two bugs with the Intel HDA Realtek ALC233 sound codec present in Intel NUC NUC8i7BEH and probably a few other similar NUC models. These codecs advertise a 4-level microphone input boost amplifier on pin 0x19, but the highest two boost settings do not work correctly, and produce only low analog noise that does not seem to contain any discernible signal. There is an existing fixup for this exact problem but for a different PCI subsystem ID, so we re-use that logic. Changing the boost level also triggers a DC spike in the input signal that bleeds off over about a second and overwhelms any input during that time. Thankfully, the existing fixup has the side effect of making the boost control show up in userspace as a mute/unmute switch, and this keeps (e.g.) PulseAudio from fiddling with it during normal input volume adjustments. Finally, the NUC hardware has built-in inverted stereo mics. This patch also enables the usual fixup for this so the two channels cancel noise instead of the actual signal. [ Re-ordered the quirk entry point by tiwai ] Signed-off-by: Phil Calvin Cc: Link: https://lore.kernel.org/r/80dc5663-7734-e7e5-25ef-15b5df24511a@philcalvin.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit fd2fec39f8a1837bee912587d19a773889dbbb47 Author: Luke D Jones Date: Mon Apr 19 15:04:11 2021 +1200 ALSA: hda/realtek: GA503 use same quirks as GA401 commit 76fae6185f5456865ff1bcb647709d44fd987eb6 upstream. The GA503 has almost exactly the same default setup as the GA401 model with the same issues. The GA401 quirks solve all the issues so we will use the full quirk chain. Signed-off-by: Luke D Jones Cc: Link: https://lore.kernel.org/r/20210419030411.28304-1-luke@ljones.dev Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 8b32a5e4e1310d8478e0358da13ceb7173eab647 Author: Jonas Witschel Date: Fri Apr 16 12:58:54 2021 +0200 ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 445 G7 commit 75b62ab65d2715ce6ff0794033d61ab9dc4a2dfc upstream. The HP ProBook 445 G7 (17T32ES) uses ALC236. Like ALC236_FIXUP_HP_GPIO_LED, COEF index 0x34 bit 5 is used to control the playback mute LED, but the microphone mute LED is controlled using pin VREF instead of a COEF index. AlsaInfo: https://alsa-project.org/db/?f=0d3f4d1af39cc359f9fea9b550727ee87e5cf45a Signed-off-by: Jonas Witschel Cc: Link: https://lore.kernel.org/r/20210416105852.52588-1-diabonas@archlinux.org Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 8f89fb21c85a6e3038844701f4d3fb1a85c46754 Author: Timo Gurr Date: Mon May 3 13:08:22 2021 +0200 ALSA: usb-audio: Add dB range mapping for Sennheiser Communications Headset PC 8 commit ab2165e2e6ed17345ffa8ee88ca764e8788ebcd7 upstream. The decibel volume range contains a negative maximum value resulting in pipewire complaining about the device and effectivly having no sound output. The wrong values also resulted in the headset sounding muted already at a mixer level of about ~25%. PipeWire BugLink: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/1049 BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=212897 Signed-off-by: Timo Gurr Cc: Link: https://lore.kernel.org/r/20210503110822.10222-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit e117d328d397d322ad94c88b80147955a14ab015 Author: Takashi Iwai Date: Tue Apr 13 10:41:52 2021 +0200 ALSA: usb-audio: Explicitly set up the clock selector commit d2e8f641257d0d3af6e45d6ac2d6f9d56b8ea964 upstream. In the current code, we have some assumption that the audio clock selector has been set up implicitly and don't want to touch it unless it's really needed for the fallback autoclock setup. This works for most devices but some seem having a problem. Partially this was covered for the devices with a single connector at the initialization phase (commit 086b957cc17f "ALSA: usb-audio: Skip the clock selector inquiry for single connections"), but also there are cases where the wrong clock set up is kept silently. The latter seems to be the cause of the noises on Behringer devices. In this patch, we explicitly set up the audio clock selector whenever the appropriate node is found. Reported-by: Geraldo Nascimento BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=199327 Link: https://lore.kernel.org/r/CAEsQvcvF7LnO8PxyyCxuRCx=7jNeSCvFAd-+dE0g_rd1rOxxdw@mail.gmail.com Cc: Link: https://lore.kernel.org/r/20210413084152.32325-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 1699eddaa136f1d8d34452ef7823e768c79fae9a Author: Lv Yunlong Date: Mon Apr 26 07:55:41 2021 -0700 ALSA: sb: Fix two use after free in snd_sb_qsound_build commit 4fb44dd2c1dda18606348acdfdb97e8759dde9df upstream. In snd_sb_qsound_build, snd_ctl_add(..,p->qsound_switch...) and snd_ctl_add(..,p->qsound_space..) are called. But the second arguments of snd_ctl_add() could be freed via snd_ctl_add_replace() ->snd_ctl_free_one(). After the error code is returned, snd_sb_qsound_destroy(p) is called in __error branch. But in snd_sb_qsound_destroy(), the freed p->qsound_switch and p->qsound_space are still used by snd_ctl_remove(). My patch set p->qsound_switch and p->qsound_space to NULL if snd_ctl_add() failed to avoid the uaf bugs. But these codes need to further be improved with the code style. Signed-off-by: Lv Yunlong Cc: Link: https://lore.kernel.org/r/20210426145541.8070-1-lyl2019@mail.ustc.edu.cn Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 51aad882598635cd62d762695972d1dc5fb590ae Author: Takashi Iwai Date: Wed Apr 28 13:27:04 2021 +0200 ALSA: hda/conexant: Re-order CX5066 quirk table entries commit 2e6a731296be9d356fdccee9fb6ae345dad96438 upstream. Just re-order the cx5066_fixups[] entries for HP devices for avoiding the oversight of the duplicated or unapplied item in future. No functional changes. Also Cc-to-stable for the further patch applications. Cc: Link: https://lore.kernel.org/r/20210428112704.23967-14-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit ad2cd2dff035785a759557f3db814d608d1c46af Author: Lv Yunlong Date: Mon Apr 26 06:11:29 2021 -0700 ALSA: emu8000: Fix a use after free in snd_emu8000_create_mixer commit 1c98f574403dbcf2eb832d5535a10d967333ef2d upstream. Our code analyzer reported a uaf. In snd_emu8000_create_mixer, the callee snd_ctl_add(..,emu->controls[i]) calls snd_ctl_add_replace(.., kcontrol,..). Inside snd_ctl_add_replace(), if error happens, kcontrol will be freed by snd_ctl_free_one(kcontrol). Then emu->controls[i] points to a freed memory, and the execution comes to __error branch of snd_emu8000_create_mixer. The freed emu->controls[i] is used in snd_ctl_remove(card, emu->controls[i]). My patch set emu->controls[i] to NULL if snd_ctl_add() failed to avoid the uaf. Signed-off-by: Lv Yunlong Cc: Link: https://lore.kernel.org/r/20210426131129.4796-1-lyl2019@mail.ustc.edu.cn Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 21f0971f53ba440e4f50c2149fffa852aff4093f Author: Guangqing Zhu Date: Wed Apr 21 22:36:50 2021 +0800 power: supply: cpcap-battery: fix invalid usage of list cursor [ Upstream commit d0a43c12ee9f57ddb284272187bd18726c2c2c98 ] Fix invalid usage of a list_for_each_entry in cpcap_battery_irq_thread(). Empty list or fully traversed list points to list head, which is not NULL (and before the first element containing real data). Signed-off-by: Guangqing Zhu Reviewed-by: Tony Lindgren Reviewed-by: Carl Philipp Klemm Tested-by: Carl Philipp Klemm Signed-off-by: Sebastian Reichel Signed-off-by: Sasha Levin commit 284d899d90004213794f45c5506020d7782e9150 Author: Hou Pu Date: Fri Apr 16 10:45:21 2021 +0800 nvmet: avoid queuing keep-alive timer if it is disabled [ Upstream commit 8f864c595bed20ef85fef3e7314212b73800d51d ] Issue following command: nvme set-feature -f 0xf -v 0 /dev/nvme1n1 # disable keep-alive timer nvme admin-passthru -o 0x18 /dev/nvme1n1 # send keep-alive command will make keep-alive timer fired and thus delete the controller like below: [247459.907635] nvmet: ctrl 1 keep-alive timer (0 seconds) expired! [247459.930294] nvmet: ctrl 1 fatal error occurred! Avoid this by not queuing delayed keep-alive if it is disabled when keep-alive command is received from the admin queue. Signed-off-by: Hou Pu Tested-by: Chaitanya Kulkarni Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit 244022d2b53d76fcd2b73be88824bf60f702698e Author: Charan Teja Reddy Date: Fri Apr 16 20:32:16 2021 +0530 sched,psi: Handle potential task count underflow bugs more gracefully [ Upstream commit 9d10a13d1e4c349b76f1c675a874a7f981d6d3b4 ] psi_group_cpu->tasks, represented by the unsigned int, stores the number of tasks that could be stalled on a psi resource(io/mem/cpu). Decrementing these counters at zero leads to wrapping which further leads to the psi_group_cpu->state_mask is being set with the respective pressure state. This could result into the unnecessary time sampling for the pressure state thus cause the spurious psi events. This can further lead to wrong actions being taken at the user land based on these psi events. Though psi_bug is set under these conditions but that just for debug purpose. Fix it by decrementing the ->tasks count only when it is non-zero. Signed-off-by: Charan Teja Reddy Signed-off-by: Peter Zijlstra (Intel) Acked-by: Johannes Weiner Link: https://lkml.kernel.org/r/1618585336-37219-1-git-send-email-charante@codeaurora.org Signed-off-by: Sasha Levin commit a78c38ebc44ec00fd1a2a2f38417a113407196bf Author: Harald Freudenberger Date: Tue Apr 20 08:23:12 2021 +0200 s390/archrandom: add parameter check for s390_arch_random_generate [ Upstream commit 28096067686c5a5cbd4c35b079749bd805df5010 ] A review of the code showed, that this function which is exposed within the whole kernel should do a parameter check for the amount of bytes requested. If this requested bytes is too high an unsigned int overflow could happen causing this function to try to memcpy a really big memory chunk. This is not a security issue as there are only two invocations of this function from arch/s390/include/asm/archrandom.h and both are not exposed to userland. Reported-by: Sven Schnelle Signed-off-by: Harald Freudenberger Signed-off-by: Heiko Carstens Signed-off-by: Sasha Levin commit 19bbb353a22eb106a5048f9c29b7b50a62af9609 Author: Pavel Begunkov Date: Tue Apr 20 12:03:32 2021 +0100 io_uring: safer sq_creds putting [ Upstream commit 07db298a1c96bdba2102d60ad51fcecb961177c9 ] Put sq_creds as a part of io_ring_ctx_free(), it's easy to miss doing it in io_sq_thread_finish(), especially considering past mistakes related to ring creation failures. Signed-off-by: Pavel Begunkov Link: https://lore.kernel.org/r/3becb1866467a1de82a97345a0a90d7fb8ff875e.1618916549.git.asml.silence@gmail.com Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit 87a6d381613a31648781955160d69140f41b716d Author: Gioh Kim Date: Mon Apr 19 09:37:15 2021 +0200 block/rnbd-clt: Fix missing a memory free when unloading the module [ Upstream commit 12b06533104e802df73c1fbe159437c19933d6c0 ] When unloading the rnbd-clt module, it does not free a memory including the filename of the symbolic link to /sys/block/rnbdX. It is found by kmemleak as below. unreferenced object 0xffff9f1a83d3c740 (size 16): comm "bash", pid 736, jiffies 4295179665 (age 9841.310s) hex dump (first 16 bytes): 21 64 65 76 21 6e 75 6c 6c 62 30 40 62 6c 61 00 !dev!nullb0@bla. backtrace: [<0000000039f0c55e>] 0xffffffffc0456c24 [<000000001aab9513>] kernfs_fop_write+0xcf/0x1c0 [<00000000db5aa4b3>] vfs_write+0xdb/0x1d0 [<000000007a2e2207>] ksys_write+0x65/0xe0 [<00000000055e280a>] do_syscall_64+0x50/0x1b0 [<00000000c2b51831>] entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Gioh Kim Signed-off-by: Jack Wang Link: https://lore.kernel.org/r/20210419073722.15351-13-gi-oh.kim@ionos.com Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit fba4bf102cf8f0ebc46d0232d59605cc0cc40a9e Author: Gioh Kim Date: Mon Apr 19 09:37:12 2021 +0200 block/rnbd-srv: Prevent a deadlock generated by accessing sysfs in parallel [ Upstream commit b168e1d85cf3201663698dd9dcb3d46c7e67f621 ] We got a warning message below. When server tries to close one session by force, it locks the sysfs interface and locks the srv_sess lock. The problem is that client can send a request to close at the same time. By close request, server locks the srv_sess lock and locks the sysfs to remove the sysfs interfaces. The simplest way to prevent that situation could be just use mutex_trylock. [ 234.153965] ====================================================== [ 234.154093] WARNING: possible circular locking dependency detected [ 234.154219] 5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10 Tainted: G O [ 234.154381] ------------------------------------------------------ [ 234.154531] kworker/1:1H/618 is trying to acquire lock: [ 234.154651] ffff8887a09db0a8 (kn->count#132){++++}, at: kernfs_remove_by_name_ns+0x40/0x80 [ 234.154819] but task is already holding lock: [ 234.154965] ffff8887ae5f6518 (&srv_sess->lock){+.+.}, at: rnbd_srv_rdma_ev+0x144/0x1590 [rnbd_server] [ 234.155132] which lock already depends on the new lock. [ 234.155311] the existing dependency chain (in reverse order) is: [ 234.155462] -> #1 (&srv_sess->lock){+.+.}: [ 234.155614] __mutex_lock+0x134/0xcb0 [ 234.155761] rnbd_srv_sess_dev_force_close+0x36/0x50 [rnbd_server] [ 234.155889] rnbd_srv_dev_session_force_close_store+0x69/0xc0 [rnbd_server] [ 234.156042] kernfs_fop_write+0x13f/0x240 [ 234.156162] vfs_write+0xf3/0x280 [ 234.156278] ksys_write+0xba/0x150 [ 234.156395] do_syscall_64+0x62/0x270 [ 234.156513] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 234.156632] -> #0 (kn->count#132){++++}: [ 234.156782] __lock_acquire+0x129e/0x23a0 [ 234.156900] lock_acquire+0xf3/0x210 [ 234.157043] __kernfs_remove+0x42b/0x4c0 [ 234.157161] kernfs_remove_by_name_ns+0x40/0x80 [ 234.157282] remove_files+0x3f/0xa0 [ 234.157399] sysfs_remove_group+0x4a/0xb0 [ 234.157519] rnbd_srv_destroy_dev_session_sysfs+0x19/0x30 [rnbd_server] [ 234.157648] rnbd_srv_rdma_ev+0x14c/0x1590 [rnbd_server] [ 234.157775] process_io_req+0x29a/0x6a0 [rtrs_server] [ 234.157924] __ib_process_cq+0x8c/0x100 [ib_core] [ 234.158709] ib_cq_poll_work+0x31/0xb0 [ib_core] [ 234.158834] process_one_work+0x4e5/0xaa0 [ 234.158958] worker_thread+0x65/0x5c0 [ 234.159078] kthread+0x1e0/0x200 [ 234.159194] ret_from_fork+0x24/0x30 [ 234.159309] other info that might help us debug this: [ 234.159513] Possible unsafe locking scenario: [ 234.159658] CPU0 CPU1 [ 234.159775] ---- ---- [ 234.159891] lock(&srv_sess->lock); [ 234.160005] lock(kn->count#132); [ 234.160128] lock(&srv_sess->lock); [ 234.160250] lock(kn->count#132); [ 234.160364] *** DEADLOCK *** [ 234.160536] 3 locks held by kworker/1:1H/618: [ 234.160677] #0: ffff8883ca1ed528 ((wq_completion)ib-comp-wq){+.+.}, at: process_one_work+0x40a/0xaa0 [ 234.160840] #1: ffff8883d2d5fe10 ((work_completion)(&cq->work)){+.+.}, at: process_one_work+0x40a/0xaa0 [ 234.161003] #2: ffff8887ae5f6518 (&srv_sess->lock){+.+.}, at: rnbd_srv_rdma_ev+0x144/0x1590 [rnbd_server] [ 234.161168] stack backtrace: [ 234.161312] CPU: 1 PID: 618 Comm: kworker/1:1H Tainted: G O 5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10 [ 234.161490] Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00 09/04/2012 [ 234.161643] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] [ 234.161765] Call Trace: [ 234.161910] dump_stack+0x96/0xe0 [ 234.162028] check_noncircular+0x29e/0x2e0 [ 234.162148] ? print_circular_bug+0x100/0x100 [ 234.162267] ? register_lock_class+0x1ad/0x8a0 [ 234.162385] ? __lock_acquire+0x68e/0x23a0 [ 234.162505] ? trace_event_raw_event_lock+0x190/0x190 [ 234.162626] __lock_acquire+0x129e/0x23a0 [ 234.162746] ? register_lock_class+0x8a0/0x8a0 [ 234.162866] lock_acquire+0xf3/0x210 [ 234.162982] ? kernfs_remove_by_name_ns+0x40/0x80 [ 234.163127] __kernfs_remove+0x42b/0x4c0 [ 234.163243] ? kernfs_remove_by_name_ns+0x40/0x80 [ 234.163363] ? kernfs_fop_readdir+0x3b0/0x3b0 [ 234.163482] ? strlen+0x1f/0x40 [ 234.163596] ? strcmp+0x30/0x50 [ 234.163712] kernfs_remove_by_name_ns+0x40/0x80 [ 234.163832] remove_files+0x3f/0xa0 [ 234.163948] sysfs_remove_group+0x4a/0xb0 [ 234.164068] rnbd_srv_destroy_dev_session_sysfs+0x19/0x30 [rnbd_server] [ 234.164196] rnbd_srv_rdma_ev+0x14c/0x1590 [rnbd_server] [ 234.164345] ? _raw_spin_unlock_irqrestore+0x43/0x50 [ 234.164466] ? lockdep_hardirqs_on+0x1a8/0x290 [ 234.164597] ? mlx4_ib_poll_cq+0x927/0x1280 [mlx4_ib] [ 234.164732] ? rnbd_get_sess_dev+0x270/0x270 [rnbd_server] [ 234.164859] process_io_req+0x29a/0x6a0 [rtrs_server] [ 234.164982] ? rnbd_get_sess_dev+0x270/0x270 [rnbd_server] [ 234.165130] __ib_process_cq+0x8c/0x100 [ib_core] [ 234.165279] ib_cq_poll_work+0x31/0xb0 [ib_core] [ 234.165404] process_one_work+0x4e5/0xaa0 [ 234.165550] ? pwq_dec_nr_in_flight+0x160/0x160 [ 234.165675] ? do_raw_spin_lock+0x119/0x1d0 [ 234.165796] worker_thread+0x65/0x5c0 [ 234.165914] ? process_one_work+0xaa0/0xaa0 [ 234.166031] kthread+0x1e0/0x200 [ 234.166147] ? kthread_create_worker_on_cpu+0xc0/0xc0 [ 234.166268] ret_from_fork+0x24/0x30 [ 234.251591] rnbd_server L243: : Device closed [ 234.604221] rnbd_server L264: RTRS Session close_device_session disconnected Signed-off-by: Gioh Kim Signed-off-by: Md Haris Iqbal Signed-off-by: Jack Wang Link: https://lore.kernel.org/r/20210419073722.15351-10-gi-oh.kim@ionos.com Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit 9d1c4c2ff48370c2924d2f948890a732bc10a5a0 Author: Peter Zijlstra Date: Thu Mar 25 13:44:46 2021 +0100 sched,fair: Alternative sched_slice() [ Upstream commit 0c2de3f054a59f15e01804b75a04355c48de628c ] The current sched_slice() seems to have issues; there's two possible things that could be improved: - the 'nr_running' used for __sched_period() is daft when cgroups are considered. Using the RQ wide h_nr_running seems like a much more consistent number. - (esp) cgroups can slice it real fine, which makes for easy over-scheduling, ensure min_gran is what the name says. Signed-off-by: Peter Zijlstra (Intel) Tested-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210412102001.611897312@infradead.org Signed-off-by: Sasha Levin commit 76ccefa1b3f8da3750b6548317c116ab8aeccb71 Author: Peter Zijlstra Date: Thu Apr 8 12:35:56 2021 +0200 perf: Rework perf_event_exit_event() [ Upstream commit ef54c1a476aef7eef26fe13ea10dc090952c00f8 ] Make perf_event_exit_event() more robust, such that we can use it from other contexts. Specifically the up and coming remove_on_exec. For this to work we need to address a few issues. Remove_on_exec will not destroy the entire context, so we cannot rely on TASK_TOMBSTONE to disable event_function_call() and we thus have to use perf_remove_from_context(). When using perf_remove_from_context(), there's two races to consider. The first is against close(), where we can have concurrent tear-down of the event. The second is against child_list iteration, which should not find a half baked event. To address this, teach perf_remove_from_context() to special case !ctx->is_active and about DETACH_CHILD. [ elver@google.com: fix racing parent/child exit in sync_child_event(). ] Signed-off-by: Marco Elver Signed-off-by: Peter Zijlstra (Intel) Link: https://lkml.kernel.org/r/20210408103605.1676875-2-elver@google.com Signed-off-by: Sasha Levin commit 76282eddf6aced0be26010a637205129bc070e98 Author: Bart Van Assche Date: Thu Apr 15 15:08:13 2021 -0700 scsi: libfc: Fix a format specifier [ Upstream commit 90d6697810f06aceea9de71ad836a8c7669789cd ] Since the 'mfs' member has been declared as 'u32' in include/scsi/libfc.h, use the %u format specifier instead of %hu. This patch fixes the following clang compiler warning: warning: format specifies type 'unsigned short' but the argument has type 'u32' (aka 'unsigned int') [-Wformat] "lport->mfs:%hu\n", mfs, lport->mfs); ~~~ ^~~~~~~~~~ %u Link: https://lore.kernel.org/r/20210415220826.29438-8-bvanassche@acm.org Cc: Hannes Reinecke Signed-off-by: Bart Van Assche Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 0e0fb68c253823bdd1e76558cdfaefbc445b1eeb Author: Dinghao Liu Date: Wed Apr 7 13:11:49 2021 +0800 mfd: arizona: Fix rumtime PM imbalance on error [ Upstream commit fe6df2b48043bbe1e852b2320501d3b169363c35 ] pm_runtime_get_sync() will increase the rumtime PM counter even it returns an error. Thus a pairing decrement is needed to prevent refcount leak. Fix this by replacing this API with pm_runtime_resume_and_get(), which will not change the runtime PM counter on error. Signed-off-by: Dinghao Liu Acked-by: Charles Keepax Signed-off-by: Lee Jones Signed-off-by: Sasha Levin commit 2030567b0e204dccfcc67eb1e6432df9d88eb605 Author: Hubert Streidl Date: Tue Mar 16 17:22:37 2021 +0100 mfd: da9063: Support SMBus and I2C mode [ Upstream commit 586478bfc9f7e16504d6f64cf18bcbdf6fd0cbc9 ] By default the PMIC DA9063 2-wire interface is SMBus compliant. This means the PMIC will automatically reset the interface when the clock signal ceases for more than the SMBus timeout of 35 ms. If the I2C driver / device is not capable of creating atomic I2C transactions, a context change can cause a ceasing of the clock signal. This can happen if for example a real-time thread is scheduled. Then the DA9063 in SMBus mode will reset the 2-wire interface. Subsequently a write message could end up in the wrong register. This could cause unpredictable system behavior. The DA9063 PMIC also supports an I2C compliant mode for the 2-wire interface. This mode does not reset the interface when the clock signal ceases. Thus the problem depicted above does not occur. This patch tests for the bus functionality "I2C_FUNC_I2C". It can reasonably be assumed that the bus cannot obey SMBus timings if this functionality is set. SMBus commands most probably are emulated in this case which is prone to the latency issue described above. This patch enables the I2C bus mode if I2C_FUNC_I2C is set or otherwise keeps the default SMBus mode. Signed-off-by: Hubert Streidl Signed-off-by: Mark Jonas Reviewed-by: Wolfram Sang Signed-off-by: Lee Jones Signed-off-by: Sasha Levin commit f14b8dae97dbc462c974c7b0b76c0abd419ffdea Author: Xu Yilun Date: Wed Mar 10 23:55:45 2021 +0800 mfd: intel-m10-bmc: Fix the register access range [ Upstream commit d9b326b2c3673f939941806146aee38e5c635fd0 ] This patch fixes the max register address of MAX 10 BMC. The range 0x20000000 ~ 0x200000fc are for control registers of the QSPI flash controller, which are not accessible to host. Signed-off-by: Xu Yilun Reviewed-by: Tom Rix Signed-off-by: Lee Jones Signed-off-by: Sasha Levin commit e2c075537107f9be3413c997e1b842e3f84cc545 Author: James Smart Date: Sun Apr 11 18:31:22 2021 -0700 scsi: lpfc: Remove unsupported mbox PORT_CAPABILITIES logic [ Upstream commit b62232ba8caccaf1954e197058104a6478fac1af ] SLI-4 does not contain a PORT_CAPABILITIES mailbox command (only SLI-3 does, and SLI-3 doesn't use it), yet there are SLI-4 code paths that have code to issue the command. The command will always fail. Remove the code for the mailbox command and leave only the resulting "failure path" logic. Link: https://lore.kernel.org/r/20210412013127.2387-12-jsmart2021@gmail.com Co-developed-by: Justin Tee Signed-off-by: Justin Tee Signed-off-by: James Smart Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 49bff90dd104d6bfe6d76c3ddaef0e8033fc4613 Author: James Smart Date: Sun Apr 11 18:31:17 2021 -0700 scsi: lpfc: Fix error handling for mailboxes completed in MBX_POLL mode [ Upstream commit 304ee43238fed517faa123e034b593905b8679f8 ] In SLI-4, when performing a mailbox command with MBX_POLL, the driver uses the BMBX register to send the command rather than the MQ. A flag is set indicating the BMBX register is active and saves the mailbox job struct (mboxq) in the mbox_active element of the adapter. The routine then waits for completion or timeout. The mailbox job struct is not freed by the routine. In cases of timeout, the adapter will be reset. The lpfc_sli_mbox_sys_flush() routine will clean up the mbox in preparation for the reset. It clears the BMBX active flag and marks the job structure as MBX_NOT_FINISHED. But, it never frees the mboxq job structure. Expectation in both normal completion and timeout cases is that the issuer of the mbx command will free the structure. Unfortunately, not all calling paths are freeing the memory in cases of error. All calling paths were looked at and updated, if missing, to free the mboxq memory regardless of completion status. Link: https://lore.kernel.org/r/20210412013127.2387-7-jsmart2021@gmail.com Co-developed-by: Justin Tee Signed-off-by: Justin Tee Signed-off-by: James Smart Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit d30a8a240c520776a222b48edb930d6acc56c9b2 Author: James Smart Date: Sun Apr 11 18:31:14 2021 -0700 scsi: lpfc: Fix reference counting errors in lpfc_cmpl_els_rsp() [ Upstream commit f866eb06c087125619457b53e9211a9e758f64f7 ] Call traces are being seen that result from a nodelist structure ref counting error. They are typically seen after transmission of an LS_RJT ELS response. Aged code in lpfc_cmpl_els_rsp() calls lpfc_nlp_not_used() which, if the ndlp reference count is exactly 1, will decrement the reference count. Previously lpfc_nlp_put() was within lpfc_els_free_iocb(), and the 'put' within the free would only be invoked if cmdiocb->context1 was not NULL. Since the nodelist structure reference count is decremented when exiting lpfc_cmpl_els_rsp() the lpfc_nlp_not_used() calls are no longer required. Calling them is causing the reference count issue. Fix by removing the lpfc_nlp_not_used() calls. Link: https://lore.kernel.org/r/20210412013127.2387-4-jsmart2021@gmail.com Co-developed-by: Justin Tee Signed-off-by: Justin Tee Signed-off-by: James Smart Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 9465bf9804c028deb6b53a90a3811cd83fb66bea Author: James Smart Date: Sun Apr 11 18:31:13 2021 -0700 scsi: lpfc: Fix crash when a REG_RPI mailbox fails triggering a LOGO response [ Upstream commit fffd18ec6579c2d9c72b212169259062fe747888 ] Fix a crash caused by a double put on the node when the driver completed an ACC for an unsolicted abort on the same node. The second put was executed by lpfc_nlp_not_used() and is wrong because the completion routine executes the nlp_put when the iocbq was released. Additionally, the driver is issuing a LOGO then immediately calls lpfc_nlp_set_state to put the node into NPR. This call does nothing. Remove the lpfc_nlp_not_used call and additional set_state in the completion routine. Remove the lpfc_nlp_set_state post issue_logo. Isn't necessary. Link: https://lore.kernel.org/r/20210412013127.2387-3-jsmart2021@gmail.com Co-developed-by: Justin Tee Signed-off-by: Justin Tee Signed-off-by: James Smart Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit a8655bfe7ba6d1a289ce7252af4919937ddbfc0a Author: Gustavo A. R. Silva Date: Thu Apr 1 11:20:54 2021 -0500 scsi: mpt3sas: Fix out-of-bounds warnings in _ctl_addnl_diag_query [ Upstream commit 16660db3fc2af8664af5e0a3cac69c4a54bfb794 ] Fix the following out-of-bounds warnings by embedding existing struct htb_rel_query into struct mpt3_addnl_diag_query, instead of duplicating its members: include/linux/fortify-string.h:20:29: warning: '__builtin_memcpy' offset [19, 32] from the object at 'karg' is out of the bounds of referenced subobject 'buffer_rel_condition' with type 'short unsigned int' at offset 16 [-Warray-bounds] include/linux/fortify-string.h:22:29: warning: '__builtin_memset' offset [19, 32] from the object at 'karg' is out of the bounds of referenced subobject 'buffer_rel_condition' with type 'short unsigned int' at offset 16 [-Warray-bounds] The problem is that the original code is trying to copy data into a bunch of struct members adjacent to each other in a single call to memcpy(). All those members are exactly the same contained in struct htb_rel_query, so instead of duplicating them into struct mpt3_addnl_diag_query, replace them with new member rel_query of type struct htb_rel_query. So, now that this new object is introduced, memcpy() doesn't overrun the length of &karg.buffer_rel_condition, because the address of the new struct object _rel_query_ is used as destination, instead. The same issue is present when calling memset(), and it is fixed with this same approach. Below is a comparison of struct mpt3_addnl_diag_query, before and after this change (the size and cachelines remain the same): $ pahole -C mpt3_addnl_diag_query drivers/scsi/mpt3sas/mpt3sas_ctl.o struct mpt3_addnl_diag_query { struct mpt3_ioctl_header hdr; /* 0 12 */ uint32_t unique_id; /* 12 4 */ uint16_t buffer_rel_condition; /* 16 2 */ uint16_t reserved1; /* 18 2 */ uint32_t trigger_type; /* 20 4 */ uint32_t trigger_info_dwords[2]; /* 24 8 */ uint32_t reserved2[2]; /* 32 8 */ /* size: 40, cachelines: 1, members: 7 */ /* last cacheline: 40 bytes */ }; $ pahole -C mpt3_addnl_diag_query drivers/scsi/mpt3sas/mpt3sas_ctl.o struct mpt3_addnl_diag_query { struct mpt3_ioctl_header hdr; /* 0 12 */ uint32_t unique_id; /* 12 4 */ struct htb_rel_query rel_query; /* 16 16 */ uint32_t reserved2[2]; /* 32 8 */ /* size: 40, cachelines: 1, members: 4 */ /* last cacheline: 40 bytes */ }; Also, this helps with the ongoing efforts to globally enable -Warray-bounds and get us closer to being able to tighten the FORTIFY_SOURCE routines on memcpy(). Link: https://github.com/KSPP/linux/issues/109 Link: https://lore.kernel.org/lkml/60659889.bJJILx2THu3hlpxW%25lkp@intel.com/ Link: https://lore.kernel.org/r/20210401162054.GA397186@embeddedor Build-tested-by: kernel test robot Reported-by: kernel test robot Signed-off-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 1fa8aea88a39ee22c7d962bbd56eb545667fcc3e Author: Joshua Aberback Date: Tue Apr 6 17:21:50 2021 -0400 drm/amd/display: Update DCN302 SR Exit Latency [ Upstream commit 79f02534810c9557fb3217b538616dc42a1de3b9 ] [Why&How] Update SR Exit Latency to fix screen flickering caused due to OTG underflow. This is the recommended value given by the hardware IP team. Signed-off-by: Joshua Aberback Acked-by: Aurabindo Pillai Reviewed-by: Harry Wentland Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 7ebfbe221e114a41903aadb0dc7c68dd555ce85c Author: Guchun Chen Date: Tue Mar 30 17:52:18 2021 +0800 drm/amdgpu: fix NULL pointer dereference [ Upstream commit 3c3dc654333f6389803cdcaf03912e94173ae510 ] ttm->sg needs to be checked before accessing its child member. Call Trace: amdgpu_ttm_backend_destroy+0x12/0x70 [amdgpu] ttm_bo_cleanup_memtype_use+0x3a/0x60 [ttm] ttm_bo_release+0x17d/0x300 [ttm] amdgpu_bo_unref+0x1a/0x30 [amdgpu] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x78b/0x8b0 [amdgpu] kfd_ioctl_alloc_memory_of_gpu+0x118/0x220 [amdgpu] kfd_ioctl+0x222/0x400 [amdgpu] ? kfd_dev_is_large_bar+0x90/0x90 [amdgpu] __x64_sys_ioctl+0x8e/0xd0 ? __context_tracking_exit+0x52/0x90 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f97f264d317 Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48 RSP: 002b:00007ffdb402c338 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f97f3cc63a0 RCX: 00007f97f264d317 RDX: 00007ffdb402c380 RSI: 00000000c0284b16 RDI: 0000000000000003 RBP: 00007ffdb402c380 R08: 00007ffdb402c428 R09: 00000000c4000004 R10: 00000000c4000004 R11: 0000000000000246 R12: 00000000c0284b16 R13: 0000000000000003 R14: 00007f97f3cc63a0 R15: 00007f8836200000 Signed-off-by: Guchun Chen Acked-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit b47f594a65f935cf21b4ac62760b40d2ae4ceda2 Author: Werner Sembach Date: Wed Mar 17 16:13:48 2021 +0100 drm/amd/display: Try YCbCr420 color when YCbCr444 fails [ Upstream commit 68eb3ae3c63708f823aeeb63bb15197c727bd9bf ] When encoder validation of a display mode fails, retry with less bandwidth heavy YCbCr420 color mode, if available. This enables some HDMI 1.4 setups to support 4k60Hz output, which previously failed silently. On some setups, while the monitor and the gpu support display modes with pixel clocks of up to 600MHz, the link encoder might not. This prevents YCbCr444 and RGB encoding for 4k60Hz, but YCbCr420 encoding might still be possible. However, which color mode is used is decided before the link encoder capabilities are checked. This patch fixes the problem by retrying to find a display mode with YCbCr420 enforced and using it, if it is valid. Reviewed-by: Harry Wentland Signed-off-by: Werner Sembach Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 89d6e69c87b3ba5eda68c8570926d2dee4c1f18a Author: Alex Deucher Date: Tue Mar 23 12:59:26 2021 -0400 drm/amdgpu/display: fix memory leak for dimgrey cavefish [ Upstream commit 42b599732ee1d4ac742760050603fb6046789011 ] We need to clean up the dcn3 clk_mgr. Acked-by: Nirmoy Das Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit fc56fb4e57fac73827960f6f73850b1890e3c35a Author: Arnd Bergmann Date: Mon Mar 22 12:54:42 2021 +0100 amdgpu: avoid incorrect %hu format string [ Upstream commit 7d98d416c2cc1c1f7d9508e887de4630e521d797 ] clang points out that the %hu format string does not match the type of the variables here: drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:263:7: warning: format specifies type 'unsigned short' but the argument has type 'unsigned int' [-Wformat] version_major, version_minor); ^~~~~~~~~~~~~ include/drm/drm_print.h:498:19: note: expanded from macro 'DRM_ERROR' __drm_err(fmt, ##__VA_ARGS__) ~~~ ^~~~~~~~~~~ Change it to a regular %u, the same way a previous patch did for another instance of the same warning. Reviewed-by: Christian König Reviewed-by: Tom Rix Signed-off-by: Arnd Bergmann Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit d99753e15536ca8557b8460c5b08b3deb31be607 Author: Qu Huang Date: Sun Mar 21 16:28:18 2021 +0800 drm/amdkfd: Fix cat debugfs hang_hws file causes system crash bug [ Upstream commit d73610211eec8aa027850982b1a48980aa1bc96e ] Here is the system crash log: [ 1272.884438] BUG: unable to handle kernel NULL pointer dereference at (null) [ 1272.884444] IP: [< (null)>] (null) [ 1272.884447] PGD 825b09067 PUD 8267c8067 PMD 0 [ 1272.884452] Oops: 0010 [#1] SMP [ 1272.884509] CPU: 13 PID: 3485 Comm: cat Kdump: loaded Tainted: G [ 1272.884515] task: ffff9a38dbd4d140 ti: ffff9a37cd3b8000 task.ti: ffff9a37cd3b8000 [ 1272.884517] RIP: 0010:[<0000000000000000>] [< (null)>] (null) [ 1272.884520] RSP: 0018:ffff9a37cd3bbe68 EFLAGS: 00010203 [ 1272.884522] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000014d5f [ 1272.884524] RDX: fffffffffffffff4 RSI: 0000000000000001 RDI: ffff9a38aca4d200 [ 1272.884526] RBP: ffff9a37cd3bbed0 R08: ffff9a38dcd5f1a0 R09: ffff9a31ffc07300 [ 1272.884527] R10: ffff9a31ffc07300 R11: ffffffffaddd5e9d R12: ffff9a38b4e0fb00 [ 1272.884529] R13: 0000000000000001 R14: ffff9a37cd3bbf18 R15: ffff9a38aca4d200 [ 1272.884532] FS: 00007feccaa67740(0000) GS:ffff9a38dcd40000(0000) knlGS:0000000000000000 [ 1272.884534] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1272.884536] CR2: 0000000000000000 CR3: 00000008267c0000 CR4: 00000000003407e0 [ 1272.884537] Call Trace: [ 1272.884544] [] ? seq_read+0x130/0x440 [ 1272.884548] [] vfs_read+0x9f/0x170 [ 1272.884552] [] SyS_read+0x7f/0xf0 [ 1272.884557] [] system_call_fastpath+0x22/0x27 [ 1272.884558] Code: Bad RIP value. [ 1272.884562] RIP [< (null)>] (null) [ 1272.884564] RSP [ 1272.884566] CR2: 0000000000000000 Signed-off-by: Qu Huang Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 819dc7e3ace8741054d5fbe361275009840f6a8f Author: Tong Zhang Date: Sun Mar 21 11:19:07 2021 -0400 drm/radeon: don't evict if not initialized [ Upstream commit 05eacc0f8f6c7e27f1841343611f4bed9ee178c1 ] TTM_PL_VRAM may not initialized at all when calling radeon_bo_evict_vram(). We need to check before doing eviction. [ 2.160837] BUG: kernel NULL pointer dereference, address: 0000000000000020 [ 2.161212] #PF: supervisor read access in kernel mode [ 2.161490] #PF: error_code(0x0000) - not-present page [ 2.161767] PGD 0 P4D 0 [ 2.163088] RIP: 0010:ttm_resource_manager_evict_all+0x70/0x1c0 [ttm] [ 2.168506] Call Trace: [ 2.168641] radeon_bo_evict_vram+0x1c/0x20 [radeon] [ 2.168936] radeon_device_fini+0x28/0xf9 [radeon] [ 2.169224] radeon_driver_unload_kms+0x44/0xa0 [radeon] [ 2.169534] radeon_driver_load_kms+0x174/0x210 [radeon] [ 2.169843] drm_dev_register+0xd9/0x1c0 [drm] [ 2.170104] radeon_pci_probe+0x117/0x1a0 [radeon] Reviewed-by: Christian König Suggested-by: Christian König Signed-off-by: Tong Zhang Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 15433cb854784199587ca11493a5e16ea1baf5b1 Author: Anson Jacob Date: Mon Mar 1 14:25:44 2021 -0500 drm/amd/display: Fix UBSAN: shift-out-of-bounds warning [ Upstream commit 54718747a6e1037317a8b3610c3be40621b2b75e ] [Why] On NAVI14 CONFIG_UBSAN reported shift-out-of-bounds at display_rq_dlg_calc_20v2.c:304:38 rq_param->misc.rq_c.blk256_height is 0 when chroma(*_c) is invalid. dml_log2 returns -1023 for log2(0), although log2(0) is undefined. Which ended up as: rq_param->dlg.rq_c.swath_height = 1 << -1023 [How] Fix applied on all dml versions. 1. Ensure dml_log2 is only called if the argument is greater than 0. 2. Subtract req128_l/req128_c from log2_swath_height_l/log2_swath_height_c only when it is greater than 0. Tested-by: Daniel Wheeler Signed-off-by: Anson Jacob Reviewed-by: Dmytro Laktyushkin Reviewed-by: Jun Lei Acked-by: Solomon Chiu Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit c86e38aae2e28179ce5ecdd1d9e7aa29a842c7fa Author: Fangzhi Zuo Date: Tue Mar 9 11:22:36 2021 -0500 drm/amd/display: Fix debugfs link_settings entry [ Upstream commit c006a1c00de29e8cdcde1d0254ac23433ed3fee9 ] 1. Catch invalid link_rate and link_count settings 2. Call dc interface to overwrite preferred link settings, and wait until next stream update to apply the new settings. Tested-by: Daniel Wheeler Signed-off-by: Fangzhi Zuo Reviewed-by: Nicholas Kazlauskas Acked-by: Solomon Chiu Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit fc3393e6ec5bfba320966ea3e7796893a2a2a998 Author: Daniel Gomez Date: Thu Mar 18 09:32:36 2021 +0100 drm/radeon/ttm: Fix memory leak userptr pages [ Upstream commit 5aeaa43e0ef1006320c077cbc49f4a8229ca3460 ] If userptr pages have been pinned but not bounded, they remain uncleared. Reviewed-by: Christian König Signed-off-by: Daniel Gomez Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit d272c7074661b3176e8ff9f91bfbfbb85f832d37 Author: Daniel Gomez Date: Wed Mar 17 17:08:37 2021 +0100 drm/amdgpu/ttm: Fix memory leak userptr pages [ Upstream commit 0f6f9dd490d524930081a6ef1d60171ce39220b9 ] If userptr pages have been pinned but not bounded, they remain uncleared. Reviewed-by: Christian König Signed-off-by: Daniel Gomez Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 318e8c3434fd9de21395b31c99f620df292c4c51 Author: Marijn Suijten Date: Tue Apr 6 23:47:25 2021 +0200 drm/msm/mdp5: Do not multiply vclk line count by 100 [ Upstream commit 377569f82ea8228c421cef4da33e056a900b58ca ] Neither vtotal nor drm_mode_vrefresh contain a value that is premultiplied by 100 making the x100 variable name incorrect and resulting in vclks_line to become 100 times larger than it is supposed to be. The hardware counts 100 clockticks too many before tearcheck, leading to severe panel issues on at least the Sony Xperia lineup. This is likely an artifact from the original MDSS DSI panel driver where the calculation [1] corrected for a premultiplied reference framerate by 100 [2]. It does not appear that the above values were ever premultiplied in the history of the DRM MDP5 driver. With this change applied the value written to the SYNC_CONFIG_VSYNC register is now identical to downstream kernels. [1]: https://source.codeaurora.org/quic/la/kernel/msm-3.18/tree/drivers/video/msm/mdss/mdss_mdp_intf_cmd.c?h=LA.UM.8.6.c26-02400-89xx.0#n288 [2]: https://source.codeaurora.org/quic/la/kernel/msm-3.18/tree/drivers/video/msm/mdss/mdss_dsi_panel.c?h=LA.UM.8.6.c26-02400-89xx.0#n1648 Reviewed-by: AngeloGioacchino Del Regno Signed-off-by: Marijn Suijten Link: https://lore.kernel.org/r/20210406214726.131534-3-marijn.suijten@somainline.org Signed-off-by: Rob Clark Signed-off-by: Sasha Levin commit 9d105eefb786fb7f90e8045a2d5551d1faad5e23 Author: Marijn Suijten Date: Tue Apr 6 23:47:24 2021 +0200 drm/msm/mdp5: Configure PP_SYNC_HEIGHT to double the vtotal [ Upstream commit 2ad52bdb220de5ab348098e3482b01235d15a842 ] Leaving this at a close-to-maximum register value 0xFFF0 means it takes very long for the MDSS to generate a software vsync interrupt when the hardware TE interrupt doesn't arrive. Configuring this to double the vtotal (like some downstream kernels) leads to a frame to take at most twice before the vsync signal, until hardware TE comes up. In this case the hardware interrupt responsible for providing this signal - "disp-te" gpio - is not hooked up to the mdp5 vsync/pp logic at all. This solves severe panel update issues observed on at least the Xperia Loire and Tone series, until said gpio is properly hooked up to an irq. Suggested-by: AngeloGioacchino Del Regno Signed-off-by: Marijn Suijten Reviewed-by: AngeloGioacchino Del Regno Link: https://lore.kernel.org/r/20210406214726.131534-2-marijn.suijten@somainline.org Signed-off-by: Rob Clark Signed-off-by: Sasha Levin commit ce0da003ae3d04f3ffc59cc7685beaab1bd55a1d Author: Lingutla Chandrasekhar Date: Wed Apr 7 23:06:26 2021 +0100 sched/fair: Ignore percpu threads for imbalance pulls [ Upstream commit 9bcb959d05eeb564dfc9cac13a59843a4fb2edf2 ] During load balance, LBF_SOME_PINNED will be set if any candidate task cannot be detached due to CPU affinity constraints. This can result in setting env->sd->parent->sgc->group_imbalance, which can lead to a group being classified as group_imbalanced (rather than any of the other, lower group_type) when balancing at a higher level. In workloads involving a single task per CPU, LBF_SOME_PINNED can often be set due to per-CPU kthreads being the only other runnable tasks on any given rq. This results in changing the group classification during load-balance at higher levels when in reality there is nothing that can be done for this affinity constraint: per-CPU kthreads, as the name implies, don't get to move around (modulo hotplug shenanigans). It's not as clear for userspace tasks - a task could be in an N-CPU cpuset with N-1 offline CPUs, making it an "accidental" per-CPU task rather than an intended one. KTHREAD_IS_PER_CPU gives us an indisputable signal which we can leverage here to not set LBF_SOME_PINNED. Note that the aforementioned classification to group_imbalance (when nothing can be done) is especially problematic on big.LITTLE systems, which have a topology the likes of: DIE [ ] MC [ ][ ] 0 1 2 3 L L B B arch_scale_cpu_capacity(L) < arch_scale_cpu_capacity(B) Here, setting LBF_SOME_PINNED due to a per-CPU kthread when balancing at MC level on CPUs [0-1] will subsequently prevent CPUs [2-3] from classifying the [0-1] group as group_misfit_task when balancing at DIE level. Thus, if CPUs [0-1] are running CPU-bound (misfit) tasks, ill-timed per-CPU kthreads can significantly delay the upgmigration of said misfit tasks. Systems relying on ASYM_PACKING are likely to face similar issues. Signed-off-by: Lingutla Chandrasekhar [Use kthread_is_per_cpu() rather than p->nr_cpus_allowed] [Reword changelog] Signed-off-by: Valentin Schneider Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Dietmar Eggemann Reviewed-by: Vincent Guittot Link: https://lkml.kernel.org/r/20210407220628.3798191-2-valentin.schneider@arm.com Signed-off-by: Sasha Levin commit 91ef133ec62d658375d1e829a06627330336cabc Author: Rik van Riel Date: Fri Mar 26 15:19:32 2021 -0400 sched/fair: Bring back select_idle_smt(), but differently [ Upstream commit c722f35b513f807629603bbf24640b1a48be21b5 ] Mel Gorman did some nice work in 9fe1f127b913 ("sched/fair: Merge select_idle_core/cpu()"), resulting in the kernel being more efficient at finding an idle CPU, and in tasks spending less time waiting to be run, both according to the schedstats run_delay numbers, and according to measured application latencies. Yay. The flip side of this is that we see more task migrations (about 30% more), higher cache misses, higher memory bandwidth utilization, and higher CPU use, for the same number of requests/second. This is most pronounced on a memcache type workload, which saw a consistent 1-3% increase in total CPU use on the system, due to those increased task migrations leading to higher L2 cache miss numbers, and higher memory utilization. The exclusive L3 cache on Skylake does us no favors there. On our web serving workload, that effect is usually negligible. It appears that the increased number of CPU migrations is generally a good thing, since it leads to lower cpu_delay numbers, reflecting the fact that tasks get to run faster. However, the reduced locality and the corresponding increase in L2 cache misses hurts a little. The patch below appears to fix the regression, while keeping the benefit of the lower cpu_delay numbers, by reintroducing select_idle_smt with a twist: when a socket has no idle cores, check to see if the sibling of "prev" is idle, before searching all the other CPUs. This fixes both the occasional 9% regression on the web serving workload, and the continuous 2% CPU use regression on the memcache type workload. With Mel's patches and this patch together, task migrations are still high, but L2 cache misses, memory bandwidth, and CPU time used are back down to what they were before. The p95 and p99 response times for the memcache type application improve by about 10% over what they were before Mel's patches got merged. Signed-off-by: Rik van Riel Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Mel Gorman Acked-by: Vincent Guittot Link: https://lkml.kernel.org/r/20210326151932.2c187840@imladris.surriel.com Signed-off-by: Sasha Levin commit 36a2d6da4c2f072e788867410d01b3f647cb85f6 Author: Hans Verkuil Date: Thu Apr 8 12:31:20 2021 +0200 media: gscpa/stv06xx: fix memory leak [ Upstream commit 4f4e6644cd876c844cdb3bea2dd7051787d5ae25 ] For two of the supported sensors the stv06xx driver allocates memory which is stored in sd->sensor_priv. This memory is freed on a disconnect, but if the probe() fails, then it isn't freed and so this leaks memory. Add a new probe_error() op that drivers can use to free any allocated memory in case there was a probe failure. Thanks to Pavel Skripkin for discovering the cause of the memory leak. Reported-and-tested-by: syzbot+e7f4c64a4248a0340c37@syzkaller.appspotmail.com Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit d72fa22777207fa5d796c36776dd44161337a203 Author: Pavel Skripkin Date: Sun Mar 28 21:32:19 2021 +0200 media: dvb-usb: fix memory leak in dvb_usb_adapter_init [ Upstream commit b7cd0da982e3043f2eec7235ac5530cb18d6af1d ] syzbot reported memory leak in dvb-usb. The problem was in invalid error handling in dvb_usb_adapter_init(). for (n = 0; n < d->props.num_adapters; n++) { .... if ((ret = dvb_usb_adapter_stream_init(adap)) || (ret = dvb_usb_adapter_dvb_init(adap, adapter_nrs)) || (ret = dvb_usb_adapter_frontend_init(adap))) { return ret; } ... d->num_adapters_initialized++; ... } In case of error in dvb_usb_adapter_dvb_init() or dvb_usb_adapter_dvb_init() d->num_adapters_initialized won't be incremented, but dvb_usb_adapter_exit() relies on it: for (n = 0; n < d->num_adapters_initialized; n++) So, allocated objects won't be freed. Signed-off-by: Pavel Skripkin Reported-by: syzbot+3c2be7424cea3b932b0e@syzkaller.appspotmail.com Signed-off-by: Sean Young Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit bdc41f736b41d26fbb1c9be2a3e1ede02f1ad888 Author: Dinghao Liu Date: Wed Apr 7 07:46:06 2021 +0200 media: sun8i-di: Fix runtime PM imbalance in deinterlace_start_streaming [ Upstream commit f1995d5e43cf897f63b4d7a7f84a252d891ae820 ] pm_runtime_get_sync() will increase the runtime PM counter even it returns an error. Thus a pairing decrement is needed to prevent refcount leak. Fix this by replacing this API with pm_runtime_resume_and_get(), which will not change the runtime PM counter on error. Signed-off-by: Dinghao Liu Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit c9c8b61f30dd7816a8a58724efbaaf2196692844 Author: Dinghao Liu Date: Wed Apr 7 07:43:13 2021 +0200 media: platform: sti: Fix runtime PM imbalance in regs_show [ Upstream commit 69306a947b3ae21e0d1cbfc9508f00fec86c7297 ] pm_runtime_get_sync() will increase the runtime PM counter even it returns an error. Thus a pairing decrement is needed to prevent refcount leak. Fix this by replacing this API with pm_runtime_resume_and_get(), which will not change the runtime PM counter on error. Signed-off-by: Dinghao Liu Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 21b50a6ac98f350910d02b54499fe720a8b3f522 Author: Yang Yingliang Date: Tue Apr 6 15:50:53 2021 +0200 media: i2c: adv7842: fix possible use-after-free in adv7842_remove() [ Upstream commit 4a15275b6a18597079f18241c87511406575179a ] This driver's remove path calls cancel_delayed_work(). However, that function does not wait until the work function finishes. This means that the callback function may still be running after the driver's remove function has finished, which would result in a use-after-free. Fix by calling cancel_delayed_work_sync(), which ensures that the work is properly cancelled, no longer running, and unable to re-schedule itself. Reported-by: Hulk Robot Signed-off-by: Yang Yingliang Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 98e887fc93fea317365c8a89f0cd64a70e994443 Author: Yang Yingliang Date: Tue Apr 6 15:49:45 2021 +0200 media: i2c: tda1997: Fix possible use-after-free in tda1997x_remove() [ Upstream commit 7f820ab5d4eebfe2d970d32a76ae496a6c286f0f ] This driver's remove path calls cancel_delayed_work(). However, that function does not wait until the work function finishes. This means that the callback function may still be running after the driver's remove function has finished, which would result in a use-after-free. Fix by calling cancel_delayed_work_sync(), which ensures that the work is properly cancelled, no longer running, and unable to re-schedule itself. Reported-by: Hulk Robot Signed-off-by: Yang Yingliang Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 8d57011b838e6bb36463fe98284e522c679801d3 Author: Yang Yingliang Date: Tue Apr 6 15:48:12 2021 +0200 media: i2c: adv7511-v4l2: fix possible use-after-free in adv7511_remove() [ Upstream commit 2c9541720c66899adf6f3600984cf3ef151295ad ] This driver's remove path calls cancel_delayed_work(). However, that function does not wait until the work function finishes. This means that the callback function may still be running after the driver's remove function has finished, which would result in a use-after-free. Fix by calling cancel_delayed_work_sync(), which ensures that the work is properly cancelled, no longer running, and unable to re-schedule itself. Reported-by: Hulk Robot Signed-off-by: Yang Yingliang Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 2a52254ccae1d3ddebd7771f09576814f73ba619 Author: Yang Yingliang Date: Tue Apr 6 15:42:46 2021 +0200 media: adv7604: fix possible use-after-free in adv76xx_remove() [ Upstream commit fa56f5f1fe31c2050675fa63b84963ebd504a5b3 ] This driver's remove path calls cancel_delayed_work(). However, that function does not wait until the work function finishes. This means that the callback function may still be running after the driver's remove function has finished, which would result in a use-after-free. Fix by calling cancel_delayed_work_sync(), which ensures that the work is properly cancelled, no longer running, and unable to re-schedule itself. Reported-by: Hulk Robot Signed-off-by: Yang Yingliang Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 1fcbb1432a7514525ffa44c08a7d3768a1e35f18 Author: Yang Yingliang Date: Tue Apr 6 15:39:29 2021 +0200 media: tc358743: fix possible use-after-free in tc358743_remove() [ Upstream commit 6107a4fdf8554a7aa9488bdc835bb010062fa8a9 ] This driver's remove path calls cancel_delayed_work(). However, that function does not wait until the work function finishes. This means that the callback function may still be running after the driver's remove function has finished, which would result in a use-after-free. Fix by calling cancel_delayed_work_sync(), which ensures that the work is properly cancelled, no longer running, and unable to re-schedule itself. Reported-by: Hulk Robot Signed-off-by: Yang Yingliang Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 3a8aa84ad5c6092eaf598f5e174ad1c46bcbe501 Author: Yang Yingliang Date: Wed Apr 7 17:19:03 2021 +0800 power: supply: s3c_adc_battery: fix possible use-after-free in s3c_adc_bat_remove() [ Upstream commit 68ae256945d2abe9036a7b68af4cc65aff79d5b7 ] This driver's remove path calls cancel_delayed_work(). However, that function does not wait until the work function finishes. This means that the callback function may still be running after the driver's remove function has finished, which would result in a use-after-free. Fix by calling cancel_delayed_work_sync(), which ensures that the work is properly cancelled, no longer running, and unable to re-schedule itself. Reported-by: Hulk Robot Signed-off-by: Yang Yingliang Reviewed-by: Krzysztof Kozlowski Signed-off-by: Sebastian Reichel Signed-off-by: Sasha Levin commit 15efbbe7674bfcd34adc90151ca5a50e9cb22417 Author: Yang Yingliang Date: Wed Apr 7 17:17:06 2021 +0800 power: supply: generic-adc-battery: fix possible use-after-free in gab_remove() [ Upstream commit b6cfa007b3b229771d9588970adb4ab3e0487f49 ] This driver's remove path calls cancel_delayed_work(). However, that function does not wait until the work function finishes. This means that the callback function may still be running after the driver's remove function has finished, which would result in a use-after-free. Fix by calling cancel_delayed_work_sync(), which ensures that the work is properly cancelled, no longer running, and unable to re-schedule itself. Reported-by: Hulk Robot Signed-off-by: Yang Yingliang Signed-off-by: Sebastian Reichel Signed-off-by: Sasha Levin commit ce1e481865a7f2f2e1049790fb4acbc0f9472b8f Author: Colin Ian King Date: Tue Apr 6 18:01:15 2021 +0100 clk: socfpga: arria10: Fix memory leak of socfpga_clk on error return [ Upstream commit 657d4d1934f75a2d978c3cf2086495eaa542e7a9 ] There is an error return path that is not kfree'ing socfpga_clk leading to a memory leak. Fix this by adding in the missing kfree call. Addresses-Coverity: ("Resource leak") Signed-off-by: Colin Ian King Link: https://lore.kernel.org/r/20210406170115.430990-1-colin.king@canonical.com Acked-by: Dinh Nguyen Reviewed-by: Krzysztof Kozlowski Signed-off-by: Stephen Boyd Signed-off-by: Sasha Levin commit 1cf96e59da60551b45f58f77e5b582b0cae30775 Author: Abhinav Kumar Date: Fri Mar 5 11:17:18 2021 -0800 drm/msm/dp: Fix incorrect NULL check kbot warnings in DP driver [ Upstream commit 7d649cfe0314aad2ba18042885ab9de2f13ad809 ] Fix an incorrect NULL check reported by kbot in the MSM DP driver smatch warnings: drivers/gpu/drm/msm/dp/dp_hpd.c:37 dp_hpd_connect() error: we previously assumed 'hpd_priv->dp_cb' could be null (see line 37) Reported-by: kernel test robot Reported-by: Dan Carpenter Signed-off-by: Abhinav Kumar Reviewed-by: Stephen Boyd Link: https://lore.kernel.org/r/1614971839-2686-2-git-send-email-abhinavk@codeaurora.org Signed-off-by: Rob Clark Signed-off-by: Sasha Levin commit 6281cd9e34b3a2aa0ea01e3645c8bab8ad60134b Author: Akhil P Oommen Date: Mon Apr 5 19:17:12 2021 +0530 drm/msm/a6xx: Fix perfcounter oob timeout [ Upstream commit 2fc8a92e0a22c483e749232d4f13c77a92139aa7 ] We were not programing the correct bit while clearing the perfcounter oob. So, clear it correctly using the new 'clear' bit. This fixes the below error: [drm:a6xx_gmu_set_oob] *ERROR* Timeout waiting for GMU OOB set PERFCOUNTER: 0x80000000 Signed-off-by: Akhil P Oommen Link: https://lore.kernel.org/r/1617630433-36506-1-git-send-email-akhilpo@codeaurora.org Signed-off-by: Rob Clark Signed-off-by: Sasha Levin commit 4af165214d95c84ef0625692111f8025d6eb8e63 Author: Laurent Pinchart Date: Mon Mar 8 11:31:28 2021 +0100 media: uvcvideo: Support devices that report an OT as an entity source [ Upstream commit 4ca052b4ea621d0002a5e5feace51f60ad5e6b23 ] Some devices reference an output terminal as the source of extension units. This is incorrect, as output terminals only have an input pin, and thus can't be connected to any entity in the forward direction. The resulting topology would cause issues when registering the media controller graph. To avoid this problem, connect the extension unit to the source of the output terminal instead. While at it, and while no device has been reported to be affected by this issue, also handle forward scans where two output terminals would be connected together, and skip the terminals found through such an invalid connection. Reported-and-tested-by: John Nealy Signed-off-by: Laurent Pinchart Signed-off-by: Hans de Goede Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 7de578c084c93b4e28922c270b2760de5c1f12d6 Author: Laurent Pinchart Date: Sun Feb 21 22:11:50 2021 +0100 media: uvcvideo: Fix XU id print in forward scan [ Upstream commit 3293448632ff2ae8c7cde4c3475da96138e24ca7 ] An error message in the forward scan code incorrectly prints the ID of the source entity instead of the XU entity being scanned. Fix it. Signed-off-by: Laurent Pinchart Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit dd06898725d5f3069d8ed406e0a5b2208acb0575 Author: Hans Verkuil Date: Thu Mar 25 08:48:21 2021 +0100 media: vivid: update EDID [ Upstream commit 443ec4bbc6116f6f492a7a1282bfd8422c862158 ] The EDID had a few mistakes as reported by edid-decode: Block 1, CTA-861 Extension Block: Video Data Block: For improved preferred timing interoperability, set 'Native detailed modes' to 1. Video Capability Data Block: S_PT is equal to S_IT and S_CE, so should be set to 0 instead. Fixed those. Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 56fd1aa2322ae5486c278f10a22997ae55361c24 Author: Muhammad Usama Anjum Date: Wed Mar 24 19:07:53 2021 +0100 media: em28xx: fix memory leak [ Upstream commit 0ae10a7dc8992ee682ff0b1752ff7c83d472eef1 ] If some error occurs, URB buffers should also be freed. If they aren't freed with the dvb here, the em28xx_dvb_fini call doesn't frees the URB buffers as dvb is set to NULL. The function in which error occurs should do all the cleanup for the allocations it had done. Tested the patch with the reproducer provided by syzbot. This patch fixes the memleak. Reported-by: syzbot+889397c820fa56adf25d@syzkaller.appspotmail.com Signed-off-by: Muhammad Usama Anjum Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 82fa26b929758a7d98edf157a2d4046aa74940f5 Author: Ewan D. Milne Date: Wed Mar 31 16:11:54 2021 -0400 scsi: scsi_dh_alua: Remove check for ASC 24h in alua_rtpg() [ Upstream commit bc3f2b42b70eb1b8576e753e7d0e117bbb674496 ] Some arrays return ILLEGAL_REQUEST with ASC 00h if they don't support the RTPG extended header so remove the check for INVALID FIELD IN CDB. Link: https://lore.kernel.org/r/20210331201154.20348-1-emilne@redhat.com Reviewed-by: Hannes Reinecke Signed-off-by: Ewan D. Milne Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit ae8c6ad672ce55c398bc6054ae01f2f27f673893 Author: Kevin Barnett Date: Thu Mar 11 14:17:48 2021 -0600 scsi: smartpqi: Add new PCI IDs [ Upstream commit 75fbeacca3ad30835e903002dba98dd909b4dfff ] Add support for newer hardware. Link: https://lore.kernel.org/r/161549386882.25025.2594251735886014958.stgit@brunhilda Reviewed-by: Scott Benesh Reviewed-by: Scott Teel Acked-by: Martin Wilck Signed-off-by: Kevin Barnett Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 30a5371d870cb66d83b867167b6d86d198e8f905 Author: Murthy Bhat Date: Thu Mar 11 14:15:03 2021 -0600 scsi: smartpqi: Correct request leakage during reset operations [ Upstream commit b622a601a13ae5974c5b0aeecb990c224b8db0d9 ] While failing queued I/Os in TMF path, there was a request leak and hence stale entries in request pool with ref count being non-zero. In shutdown path we have a BUG_ON to catch stuck I/O either in firmware or in the driver. The stale requests caused a system crash. The I/O request pool leakage also lead to a significant performance drop. Link: https://lore.kernel.org/r/161549370379.25025.12793264112620796062.stgit@brunhilda Reviewed-by: Scott Teel Reviewed-by: Scott Benesh Reviewed-by: Kevin Barnett Signed-off-by: Murthy Bhat Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit d218413a98865b52402395e5612e39b39549bf7c Author: Don Brace Date: Thu Mar 11 14:14:57 2021 -0600 scsi: smartpqi: Use host-wide tag space [ Upstream commit c6d3ee209b9e863c6251f72101511340451ca324 ] Correct SCSI midlayer sending more requests than exposed host queue depth causing firmware ASSERT and lockup issues by enabling host-wide tags. Note: This also results in better performance. Link: https://lore.kernel.org/r/161549369787.25025.8975999483518581619.stgit@brunhilda Suggested-by: Ming Lei Suggested-by: John Garry Reviewed-by: Scott Benesh Reviewed-by: Scott Teel Reviewed-by: Mike McGowen Reviewed-by: Kevin Barnett Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit b494d1b2791c7b33493ba1ef5044babfb509bf3b Author: Carl Philipp Klemm Date: Sun Jan 17 22:48:53 2021 +0100 power: supply: cpcap-charger: Add usleep to cpcap charger to avoid usb plug bounce [ Upstream commit 751faedf06e895a17e985a88ef5b6364ffd797ed ] Adds 80000 us sleep when the usb cable is plugged in to hopefully avoid bouncing contacts. Upon pluging in the usb cable vbus will bounce for some time, causing cpcap to dissconnect charging due to detecting an undervoltage condition. This is a scope of vbus on xt894 while quickly inserting the usb cable with firm force, probed at the far side of the usb socket and vbus loaded with approx 1k: http://uvos.xyz/maserati/usbplug.jpg. As can clearly be seen, vbus is all over the place for the first 15 ms or so with a small blip at ~40 ms this causes the cpcap to trip up and disable charging again. The delay helps cpcap_usb_detect avoid the worst of this. It is, however, still not ideal as strong vibrations can cause the issue to reapear any time during charging. I have however not been able to cause the device to stop charging due to this in practice as it is hard to vibrate the device such that the vbus pins start bouncing again but cpcap_usb_detect is not called again due to a detected disconnect/reconnect event. Signed-off-by: Carl Philipp Klemm Tested-by: Tony Lindgren Signed-off-by: Sebastian Reichel Signed-off-by: Sasha Levin commit 7d8463c8acc9a566262beb714c38909091bcaceb Author: Carl Philipp Klemm Date: Sun Jan 17 22:47:45 2021 +0100 power: supply: cpcap-charger: fix small mistake in current to register conversion [ Upstream commit 8a5a0cc13aa927eac7a9eb3ca82dfc1f82cfc28d ] Signed-off-by: Carl Philipp Klemm Acked-by: Tony Lindgren Signed-off-by: Sebastian Reichel Signed-off-by: Sasha Levin commit 3a200c6b7bd5aa3e4649e059d7dd448716419568 Author: Fenghua Yu Date: Wed Mar 17 02:22:54 2021 +0000 selftests/resctrl: Fix checking for < 0 for unsigned values [ Upstream commit 1205b688c92558a04d8dd4cbc2b213e0fceba5db ] Dan reported following static checker warnings tools/testing/selftests/resctrl/resctrl_val.c:545 measure_vals() warn: 'bw_imc' unsigned <= 0 tools/testing/selftests/resctrl/resctrl_val.c:549 measure_vals() warn: 'bw_resc_end' unsigned <= 0 These warnings are reported because 1. measure_vals() declares 'bw_imc' and 'bw_resc_end' as unsigned long variables 2. Return value of get_mem_bw_imc() and get_mem_bw_resctrl() are assigned to 'bw_imc' and 'bw_resc_end' respectively 3. The returned values are checked for <= 0 to see if the calls failed Checking for < 0 for an unsigned value doesn't make any sense. Fix this issue by changing the implementation of get_mem_bw_imc() and get_mem_bw_resctrl() such that they now accept reference to a variable and set the variable appropriately upon success and return 0, else return < 0 on error. Reported-by: Dan Carpenter Tested-by: Babu Moger Signed-off-by: Fenghua Yu Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin commit bf81fc260e2569ad18a3d783245c726373120fed Author: Fenghua Yu Date: Wed Mar 17 02:22:53 2021 +0000 selftests/resctrl: Fix incorrect parsing of iMC counters [ Upstream commit d81343b5eedf84be71a4313e8fd073d0c510afcf ] iMC (Integrated Memory Controller) counters are usually at "/sys/bus/event_source/devices/" and are named as "uncore_imc_". num_of_imcs() function tries to count number of such iMC counters so that it could appropriately initialize required number of perf_attr structures that could be used to read these iMC counters. num_of_imcs() function assumes that all the directories under this path that start with "uncore_imc" are iMC counters. But, on some systems there could be directories named as "uncore_imc_free_running" which aren't iMC counters. Trying to read from such directories will result in "not found file" errors and MBM/MBA tests will fail. Hence, fix the logic in num_of_imcs() such that it looks at the first character after "uncore_imc_" to check if it's a numerical digit or not. If it's a digit then the directory represents an iMC counter, else, skip the directory. Reported-by: Reinette Chatre Tested-by: Babu Moger Signed-off-by: Fenghua Yu Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin commit 477c43592fb66d01619c73df004b64b95bf7f52d Author: Fenghua Yu Date: Wed Mar 17 02:22:47 2021 +0000 selftests/resctrl: Use resctrl/info for feature detection [ Upstream commit ee0415681eb661efa1eb2db7acc263f2c7df1e23 ] Resctrl test suite before running any unit test (like cmt, cat, mbm and mba) should first check if the feature is enabled (by kernel and not just supported by H/W) on the platform or not. validate_resctrl_feature_request() is supposed to do that. This function intends to grep for relevant flags in /proc/cpuinfo but there are several issues here 1. validate_resctrl_feature_request() calls fgrep() to get flags from /proc/cpuinfo. But, fgrep() can only return a string with maximum of 255 characters and hence the complete cpu flags are never returned. 2. The substring search logic is also busted. If strstr() finds requested resctrl feature in the cpu flags, it returns pointer to the first occurrence. But, the logic negates the return value of strstr() and hence validate_resctrl_feature_request() returns false if the feature is present in the cpu flags and returns true if the feature is not present. 3. validate_resctrl_feature_request() checks if a resctrl feature is reported in /proc/cpuinfo flags or not. Having a cpu flag means that the H/W supports the feature, but it doesn't mean that the kernel enabled it. A user could selectively enable only a subset of resctrl features using kernel command line arguments. Hence, /proc/cpuinfo isn't a reliable source to check if a feature is enabled or not. The 3rd issue being the major one and fixing it requires changing the way validate_resctrl_feature_request() works. Since, /proc/cpuinfo isn't the right place to check if a resctrl feature is enabled or not, a more appropriate place is /sys/fs/resctrl/info directory. Change validate_resctrl_feature_request() such that, 1. For cat, check if /sys/fs/resctrl/info/L3 directory is present or not 2. For mba, check if /sys/fs/resctrl/info/MB directory is present or not 3. For cmt, check if /sys/fs/resctrl/info/L3_MON directory is present and check if /sys/fs/resctrl/info/L3_MON/mon_features has llc_occupancy 4. For mbm, check if /sys/fs/resctrl/info/L3_MON directory is present and check if /sys/fs/resctrl/info/L3_MON/mon_features has mbm__bytes Please note that only L3_CAT, L3_CMT, MBA and MBM are supported. CDP and L2 variants can be added later. Reported-by: Reinette Chatre Tested-by: Babu Moger Signed-off-by: Fenghua Yu Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin commit aa177a5215c2d89d610dd6039a02bf7ac1072466 Author: Fenghua Yu Date: Wed Mar 17 02:22:40 2021 +0000 selftests/resctrl: Fix missing options "-n" and "-p" [ Upstream commit d7af3d0d515cbdf63b6c3398a3c15ecb1bc2bd38 ] resctrl test suite accepts command line arguments (like -b, -t, -n and -p) as documented in the help. But passing -n and -p throws an invalid option error. This happens because -n and -p are missing in the list of characters that getopt() recognizes as valid arguments. Hence, they are treated as invalid options. Fix this by adding them to the list of characters that getopt() recognizes as valid arguments. Please note that the main() function already has the logic to deal with the values passed as part of these arguments and hence no changes are needed there. Tested-by: Babu Moger Signed-off-by: Fenghua Yu Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin commit c3b5a2c51833e775092d9e3146912ec264037340 Author: Fenghua Yu Date: Wed Mar 17 02:22:38 2021 +0000 selftests/resctrl: Clean up resctrl features check [ Upstream commit 2428673638ea28fa93d2a38b1c3e8d70122b00ee ] Checking resctrl features call strcmp() to compare feature strings (e.g. "mba", "cat" etc). The checkings are error prone and don't have good coding style. Define the constant strings in macros and call strncmp() to solve the potential issues. Suggested-by: Shuah Khan Tested-by: Babu Moger Signed-off-by: Fenghua Yu Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin commit 6055a8a97449e94cb7a54e272dd654bcdeb18f98 Author: Fenghua Yu Date: Wed Mar 17 02:22:37 2021 +0000 selftests/resctrl: Fix compilation issues for other global variables [ Upstream commit 896016d2ad051811ff9c9c087393adc063322fbc ] Reinette reported following compilation issue on Fedora 32, gcc version 10.1.1 /usr/bin/ld: resctrl_tests.o:/resctrl.h:65: multiple definition of `bm_pid'; cache.o:/resctrl.h:65: first defined here Other variables are ppid, tests_run, llc_occup_path, is_amd. Compiler isn't happy because these variables are defined globally in two .c files but are not declared as extern. To fix issues for the global variables, declare them as extern. Chang Log: - Split this patch from v4's patch 1 (Shuah). Reported-by: Reinette Chatre Tested-by: Babu Moger Signed-off-by: Fenghua Yu Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin commit 1afd4311c06278dca56db79d773ceb73470731a7 Author: Fenghua Yu Date: Wed Mar 17 02:22:36 2021 +0000 selftests/resctrl: Fix compilation issues for global variables [ Upstream commit 8236c51d85a64643588505a6791e022cc8d84864 ] Reinette reported following compilation issue on Fedora 32, gcc version 10.1.1 /usr/bin/ld: cqm_test.o:/cqm_test.c:22: multiple definition of `cache_size'; cat_test.o:/cat_test.c:23: first defined here The same issue is reported for long_mask, cbm_mask, count_of_bits etc variables as well. Compiler isn't happy because these variables are defined globally in two .c files namely cqm_test.c and cat_test.c and the compiler during compilation finds that the variable is already defined (multiple definition error). Taking a closer look at the usage of these variables reveals that these variables are used only locally in functions such as cqm_resctrl_val() (defined in cqm_test.c) and cat_perf_miss_val() (defined in cat_test.c). These variables are not shared between those functions. So, there is no need for these variables to be global. Hence, fix this issue by making them static variables. Reported-by: Reinette Chatre Tested-by: Babu Moger Signed-off-by: Fenghua Yu Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin commit fa74403444c6be658eeef87c799d8486963d85fb Author: Fenghua Yu Date: Wed Mar 17 02:22:35 2021 +0000 selftests/resctrl: Enable gcc checks to detect buffer overflows [ Upstream commit a9d26a302dea29eb84f491b1340a57e56c631a71 ] David reported a buffer overflow error in the check_results() function of the cmt unit test and he suggested enabling _FORTIFY_SOURCE gcc compiler option to automatically detect any such errors. Feature Test Macros man page describes_FORTIFY_SOURCE as below "Defining this macro causes some lightweight checks to be performed to detect some buffer overflow errors when employing various string and memory manipulation functions (for example, memcpy, memset, stpcpy, strcpy, strncpy, strcat, strncat, sprintf, snprintf, vsprintf, vsnprintf, gets, and wide character variants thereof). For some functions, argument consistency is checked; for example, a check is made that open has been supplied with a mode argument when the specified flags include O_CREAT. Not all problems are detected, just some common cases. If _FORTIFY_SOURCE is set to 1, with compiler optimization level 1 (gcc -O1) and above, checks that shouldn't change the behavior of conforming programs are performed. With _FORTIFY_SOURCE set to 2, some more checking is added, but some conforming programs might fail. Some of the checks can be performed at compile time (via macros logic implemented in header files), and result in compiler warnings; other checks take place at run time, and result in a run-time error if the check fails. Use of this macro requires compiler support, available with gcc since version 4.0." Fix the buffer overflow error in the check_results() function of the cmt unit test and enable _FORTIFY_SOURCE gcc check to catch any future buffer overflow errors. Reported-by: David Binderman Suggested-by: David Binderman Tested-by: Babu Moger Signed-off-by: Fenghua Yu Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin commit 20eac5036476bf5b303c9c5284f95bf6ee5d9c6a Author: Hou Pu Date: Wed Mar 31 14:52:39 2021 +0800 nvmet: return proper error code from discovery ctrl [ Upstream commit 79695dcd9ad4463a82def7f42960e6d7baa76f0b ] Return NVME_SC_INVALID_FIELD from discovery controller like normal controller when executing identify or get log page command. Signed-off-by: Hou Pu Reviewed-by: Chaitanya Kulkarni Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit b70e92f2b19999a1acce4cb177e924336baa6aa3 Author: Carsten Haitzler Date: Thu Feb 4 13:11:02 2021 +0000 drm/komeda: Fix bit check to import to value of proper type [ Upstream commit a1c3be890440a1769ed6f822376a3e3ab0d42994 ] Another issue found by KASAN. The bit finding is buried inside the dp_for_each_set_bit() macro (that passes on to for_each_set_bit() that calls the bit stuff. These bit functions want an unsigned long pointer as input and just dumbly casting leads to out-of-bounds accesses. This fixes that. Signed-off-by: Carsten Haitzler Reviewed-by: Steven Price Reviewed-by: James Qian Wang Signed-off-by: Liviu Dudau Link: https://patchwork.freedesktop.org/patch/msgid/20210204131102.68658-1-carsten.haitzler@foss.arm.com Signed-off-by: Sasha Levin commit 0ec9e97d2109158ae2d5836ce946776ce5348600 Author: Xingui Yang Date: Fri Mar 12 18:24:36 2021 +0800 ata: ahci: Disable SXS for Hisilicon Kunpeng920 [ Upstream commit 234e6d2c18f5b080cde874483c4c361f3ae7cffe ] On Hisilicon Kunpeng920, ESP is set to 1 by default for all ports of SATA controller. In some scenarios, some ports are not external SATA ports, and it cause disks connected to these ports to be identified as removable disks. So disable the SXS capability on the software side to prevent users from mistakenly considering non-removable disks as removable disks and performing related operations. Signed-off-by: Xingui Yang Signed-off-by: Luo Jiaxing Reviewed-by: John Garry Link: https://lore.kernel.org/r/1615544676-61926-1-git-send-email-luojiaxing@huawei.com Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit e67ddebf1ffcd463bc145c7e9fe947cb741bf3df Author: Al Cooper Date: Thu Mar 25 15:28:34 2021 -0400 mmc: sdhci-brcmstb: Remove CQE quirk [ Upstream commit f0bdf98fab058efe7bf49732f70a0f26d1143154 ] Remove the CQHCI_QUIRK_SHORT_TXFR_DESC_SZ quirk because the latest chips have this fixed and earlier chips have other CQE problems that prevent the feature from being enabled. Signed-off-by: Al Cooper Acked-by: Florian Fainelli Link: https://lore.kernel.org/r/20210325192834.42955-1-alcooperx@gmail.com Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit e5ffc70d2ac5f303262b91878cec6e9dc24039b5 Author: Adrian Hunter Date: Mon Mar 22 07:53:56 2021 +0200 mmc: sdhci-pci: Add PCI IDs for Intel LKF [ Upstream commit ee629112be8b4eff71d4d3d108a28bc7dc877e13 ] Add PCI IDs for Intel LKF eMMC and SD card host controllers. Signed-off-by: Adrian Hunter Link: https://lore.kernel.org/r/20210322055356.24923-1-adrian.hunter@intel.com Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit fc1077b4995ef234bb0b0ee3be416f7b2426a2dc Author: Peng Fan Date: Thu Feb 25 11:10:04 2021 +0800 mmc: sdhci-esdhc-imx: validate pinctrl before use it [ Upstream commit f410ee0aa2df050a9505f5c261953e9b18e21206 ] When imx_data->pinctrl is not a valid pointer, pinctrl_lookup_state will trigger kernel panic. When we boot Dual OS on Jailhouse hypervisor, we let the 1st Linux to configure pinmux ready for the 2nd OS, so the 2nd OS not have pinctrl settings. Similar to this commit b62eee9f804e ("mmc: sdhci-esdhc-imx: no fail when no pinctrl available"). Reviewed-by: Bough Chen Reviewed-by: Alice Guo Signed-off-by: Peng Fan Link: https://lore.kernel.org/r/1614222604-27066-6-git-send-email-peng.fan@oss.nxp.com Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 1bb697e54e784bc6cb329a743f1659e9b0971855 Author: Quinn Tran Date: Mon Mar 29 01:52:22 2021 -0700 scsi: qla2xxx: Fix use after free in bsg [ Upstream commit 2ce35c0821afc2acd5ee1c3f60d149f8b2520ce8 ] On bsg command completion, bsg_job_done() was called while qla driver continued to access the bsg_job buffer. bsg_job_done() would free up resources that ended up being reused by other task while the driver continued to access the buffers. As a result, driver was reading garbage data. localhost kernel: BUG: KASAN: use-after-free in sg_next+0x64/0x80 localhost kernel: Read of size 8 at addr ffff8883228a3330 by task swapper/26/0 localhost kernel: localhost kernel: CPU: 26 PID: 0 Comm: swapper/26 Kdump: loaded Tainted: G OE --------- - - 4.18.0-193.el8.x86_64+debug #1 localhost kernel: Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 08/12/2016 localhost kernel: Call Trace: localhost kernel: localhost kernel: dump_stack+0x9a/0xf0 localhost kernel: print_address_description.cold.3+0x9/0x23b localhost kernel: kasan_report.cold.4+0x65/0x95 localhost kernel: debug_dma_unmap_sg.part.12+0x10d/0x2d0 localhost kernel: qla2x00_bsg_sp_free+0xaf6/0x1010 [qla2xxx] Link: https://lore.kernel.org/r/20210329085229.4367-6-njavali@marvell.com Reviewed-by: Himanshu Madhani Signed-off-by: Quinn Tran Signed-off-by: Saurav Kashyap Signed-off-by: Nilesh Javali Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 6d1689fadd78fa37ef086aca251f6bd1e9255918 Author: Paolo Valente Date: Thu Mar 4 18:46:25 2021 +0100 block, bfq: fix weight-raising resume with !low_latency [ Upstream commit 8c544770092a3d7532d01903b75721e537d87001 ] When the io_latency heuristic is off, bfq_queues must not start to be weight-raised. Unfortunately, by mistake, this may happen when the state of a previously weight-raised bfq_queue is resumed after a queue split. This commit fixes this error. Tested-by: Jan Kara Signed-off-by: Paolo Valente Tested-by: Oleksandr Natalenko Link: https://lore.kernel.org/r/20210304174627.161-5-paolo.valente@linaro.org Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit 15966e6a3338aaae784da203ee748f4c907489df Author: Dmitry Vyukov Date: Sat Mar 20 14:28:40 2021 +0100 drm/vkms: fix misuse of WARN_ON [ Upstream commit b4142fc4d52d051d4d8df1fb6c569e5b445d369e ] vkms_vblank_simulate() uses WARN_ON for timing-dependent condition (timer overrun). This is a mis-use of WARN_ON, WARN_ON must be used to denote kernel bugs. Use pr_warn() instead. Signed-off-by: Dmitry Vyukov Reported-by: syzbot+4fc21a003c8332eb0bdd@syzkaller.appspotmail.com Cc: Rodrigo Siqueira Cc: Melissa Wen Cc: Haneen Mohammed Cc: Daniel Vetter Cc: David Airlie Cc: dri-devel@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org Acked-by: Melissa Wen Signed-off-by: Melissa Wen Link: https://patchwork.freedesktop.org/patch/msgid/20210320132840.1315853-1-dvyukov@google.com Signed-off-by: Sasha Levin commit bc1434f7da172e041d006e595d4f700fc159c2e4 Author: Bart Van Assche Date: Sat Mar 20 16:23:58 2021 -0700 scsi: qla2xxx: Always check the return value of qla24xx_get_isp_stats() [ Upstream commit a2b2cc660822cae08c351c7f6b452bfd1330a4f7 ] This patch fixes the following Coverity warning: CID 361199 (#1 of 1): Unchecked return value (CHECKED_RETURN) 3. check_return: Calling qla24xx_get_isp_stats without checking return value (as is done elsewhere 4 out of 5 times). Link: https://lore.kernel.org/r/20210320232359.941-7-bvanassche@acm.org Cc: Quinn Tran Cc: Mike Christie Cc: Himanshu Madhani Cc: Daniel Wagner Cc: Lee Duncan Reviewed-by: Daniel Wagner Reviewed-by: Himanshu Madhani Signed-off-by: Bart Van Assche Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit b45476240f7e1154423464ec070907181bb87595 Author: Qingqing Zhuo Date: Tue Mar 9 15:10:24 2021 -0500 drm/amd/display: Fix potential memory leak [ Upstream commit 51ba691206e35464fd7ec33dd519d141c80b5dff ] [Why] vblank_workqueue is never released. [How] Free it upon dm finish. Tested-by: Daniel Wheeler Signed-off-by: Qingqing Zhuo Reviewed-by: Nicholas Kazlauskas Acked-by: Solomon Chiu Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 563ac6261e28cf21be99cf0f31613266610fd9da Author: Dmytro Laktyushkin Date: Thu Mar 4 11:04:26 2021 -0500 drm/amd/display: fix dml prefetch validation [ Upstream commit 8ee0fea4baf90e43efe2275de208a7809f9985bc ] Incorrect variable used, missing initialization during validation. Tested-by: Daniel Wheeler Signed-off-by: Dmytro Laktyushkin Reviewed-by: Eric Bernstein Acked-by: Solomon Chiu Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 4a3fc4596395cfec24a710e33159194805ca3cf3 Author: Aric Cyr Date: Mon Mar 8 15:43:34 2021 -0500 drm/amd/display: DCHUB underflow counter increasing in some scenarios [ Upstream commit 4710430a779e6077d81218ac768787545bff8c49 ] [Why] When unplugging a display, the underflow counter can be seen to increase because PSTATE switch is allowed even when some planes are not blanked. [How] Check that all planes are not active instead of all streams before allowing PSTATE change. Tested-by: Daniel Wheeler Signed-off-by: Aric Cyr Acked-by: Solomon Chiu Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 0d2d95716fdc9fe5df9ec043f509f64595b5cd89 Author: Anson Jacob Date: Tue Mar 2 16:16:36 2021 -0500 drm/amd/display: Fix UBSAN warning for not a valid value for type '_Bool' [ Upstream commit 6a30a92997eee49554f72b462dce90abe54a496f ] [Why] dc_cursor_position do not initialise position.translate_by_source when crtc or plane->state->fb is NULL. UBSAN caught this error in dce110_set_cursor_position, as the value was garbage. [How] Initialise dc_cursor_position structure elements to 0 in handle_cursor_update before calling get_cursor_position. Tested-by: Daniel Wheeler Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1471 Reported-by: Lyude Paul Signed-off-by: Anson Jacob Reviewed-by: Aurabindo Jayamohanan Pillai Acked-by: Solomon Chiu Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit f48cd93fec94f90afbdd1b14c34104e3cf22543a Author: Kenneth Feng Date: Tue Mar 16 10:28:00 2021 +0800 drm/amd/pm: fix workload mismatch on vega10 [ Upstream commit 0979d43259e13846d86ba17e451e17fec185d240 ] Workload number mapped to the correct one. This issue is only on vega10. Signed-off-by: Kenneth Feng Reviewed-by: Kevin Wang Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 02cdf707babc19efee0d41def5aad3e5f2f908a1 Author: shaoyunl Date: Tue Mar 9 10:30:15 2021 -0500 drm/amdgpu : Fix asic reset regression issue introduce by 8f211fe8ac7c4f [ Upstream commit c8941550aa66b2a90f4b32c45d59e8571e33336e ] This recent change introduce SDMA interrupt info printing with irq->process function. These functions do not require a set function to enable/disable the irq Signed-off-by: shaoyunl Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 7a12075948ef7e939c447cdede7b208da44ec359 Author: Joshua Aberback Date: Fri Feb 26 19:44:24 2021 -0500 drm/amd/display: Align cursor cache address to 2KB [ Upstream commit 554ba183b135ef09250b61a202d88512b5bbd03a ] [Why] The registers for the address of the cursor are aligned to 2KB, so all cursor surfaces also need to be aligned to 2KB. Currently, the provided cursor cache surface is not aligned, so we need a workaround until alignment is enforced by the surface provider. [How] - round up surface address to nearest multiple of 2048 - current policy is to provide a much bigger cache size than necessary,so this operation is safe Tested-by: Daniel Wheeler Signed-off-by: Joshua Aberback Reviewed-by: Jun Lei Acked-by: Eryk Brol Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 9069b1b542de8f3bbffef868aff41521b21485cf Author: Anson Jacob Date: Wed Mar 3 12:33:15 2021 -0500 drm/amdkfd: Fix UBSAN shift-out-of-bounds warning [ Upstream commit 50e2fc36e72d4ad672032ebf646cecb48656efe0 ] If get_num_sdma_queues or get_num_xgmi_sdma_queues is 0, we end up doing a shift operation where the number of bits shifted equals number of bits in the operand. This behaviour is undefined. Set num_sdma_queues or num_xgmi_sdma_queues to ULLONG_MAX, if the count is >= number of bits in the operand. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1472 Reported-by: Lyude Paul Signed-off-by: Anson Jacob Reviewed-by: Alex Deucher Reviewed-by: Felix Kuehling Tested-by: Lyude Paul Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 41db62bc06eec528e8bd423e4edeb4d35eb874b5 Author: Jonathan Kim Date: Wed Jan 27 15:24:59 2021 -0500 drm/amdgpu: mask the xgmi number of hops reported from psp to kfd [ Upstream commit 4ac5617c4b7d0f0a8f879997f8ceaa14636d7554 ] The psp supplies the link type in the upper 2 bits of the psp xgmi node information num_hops field. With a new link type, Aldebaran has these bits set to a non-zero value (1 = xGMI3) so the KFD topology will report the incorrect IO link weights without proper masking. The actual number of hops is located in the 3 least significant bits of this field so mask if off accordingly before passing it to the KFD. Signed-off-by: Jonathan Kim Reviewed-by: Amber Lin Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit e58dd0a4f199ab68fe694b87010d1dbc4591a8f8 Author: Alex Sierra Date: Fri Jan 15 17:03:18 2021 -0600 drm/amdgpu: enable 48-bit IH timestamp counter [ Upstream commit 9a9c59a8f4f4478d5951eb0bded1d17b936aad6e ] By default this timestamp is 32 bit counter. It gets overflowed in around 10 minutes. Signed-off-by: Alex Sierra Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 39ba18abd37af60233f5ea60d87edbce110f9399 Author: Philip Yang Date: Tue Sep 22 13:09:33 2020 -0400 drm/amdgpu: enable retry fault wptr overflow [ Upstream commit b672cb1eee59efe6ca5bb2a2ce90060a22860558 ] If xnack is on, VM retry fault interrupt send to IH ring1, and ring1 will be full quickly. IH cannot receive other interrupts, this causes deadlock if migrating buffer using sdma and waiting for sdma done while handling retry fault. Remove VMC from IH storm client, enable ring1 write pointer overflow, then IH will drop retry fault interrupts and be able to receive other interrupts while driver is handling retry fault. IH ring1 write pointer doesn't writeback to memory by IH, and ring1 write pointer recorded by self-irq is not updated, so always read the latest ring1 write pointer from register. Signed-off-by: Philip Yang Signed-off-by: Felix Kuehling Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit d736c4fc59a8d5e43756b6b0bed79ddae35b1964 Author: Kiran Gunda Date: Thu Mar 18 18:09:39 2021 +0530 backlight: qcom-wled: Fix FSC update issue for WLED5 [ Upstream commit 4d6e9cdff7fbb6bef3e5559596fab3eeffaf95ca ] Currently, for WLED5, the FSC (Full scale current) setting is not updated properly due to driver toggling the wrong register after an FSC update. On WLED5 we should only toggle the MOD_SYNC bit after a brightness update. For an FSC update we need to toggle the SYNC bits instead. Fix it by adopting the common wled3_sync_toggle() for WLED5 and introducing new code to the brightness update path to compensate. Signed-off-by: Kiran Gunda Reviewed-by: Daniel Thompson Signed-off-by: Lee Jones Signed-off-by: Sasha Levin commit 219db950ae5a3d542d78f23085f76609c7f362ac Author: Obeida Shamoun Date: Sun Mar 14 11:11:10 2021 +0100 backlight: qcom-wled: Use sink_addr for sync toggle [ Upstream commit cdfd4c689e2a52c313b35ddfc1852ff274f91acb ] WLED3_SINK_REG_SYNC is, as the name implies, a sink register offset. Therefore, use the sink address as base instead of the ctrl address. This fixes the sync toggle on wled4, which can be observed by the fact that adjusting brightness now works. It has no effect on wled3 because sink and ctrl base addresses are the same. This allows adjusting the brightness without having to disable then reenable the module. Signed-off-by: Obeida Shamoun Signed-off-by: Konrad Dybcio Signed-off-by: Marijn Suijten Reviewed-by: Daniel Thompson Acked-by: Kiran Gunda Signed-off-by: Lee Jones Signed-off-by: Sasha Levin commit 599edbf276f43493c5b54b66553831bdf4a317f0 Author: dongjian Date: Mon Mar 22 19:21:33 2021 +0800 power: supply: Use IRQF_ONESHOT [ Upstream commit 2469b836fa835c67648acad17d62bc805236a6ea ] Fixes coccicheck error: drivers/power/supply/pm2301_charger.c:1089:7-27: ERROR: drivers/power/supply/lp8788-charger.c:502:8-28: ERROR: drivers/power/supply/tps65217_charger.c:239:8-33: ERROR: drivers/power/supply/tps65090-charger.c:303:8-33: ERROR: Threaded IRQ with no primary handler requested without IRQF_ONESHOT Signed-off-by: dongjian Signed-off-by: Sebastian Reichel Signed-off-by: Sasha Levin commit 889cb8bec2c53738412157cef5b0420c71e96304 Author: Hans Verkuil Date: Fri Mar 12 09:49:55 2021 +0100 media: v4l2-ctrls.c: initialize flags field of p_fwht_params [ Upstream commit ea1611ba3a544b34f89ffa3d1e833caab30a3f09 ] The V4L2_CID_STATELESS_FWHT_PARAMS compound control was missing a proper initialization of the flags field, so after loading the vicodec module for the first time, running v4l2-compliance for the stateless decoder would fail on this control because the initial control value was considered invalid by the vicodec driver. Initializing the flags field to sane values fixes this. Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit b8d7e79dd4847762a381bd763d9cb0b2a130a569 Author: Hans Verkuil Date: Thu Mar 11 15:46:40 2021 +0100 media: gspca/sq905.c: fix uninitialized variable [ Upstream commit eaaea4681984c79d2b2b160387b297477f0c1aab ] act_len can be uninitialized if usb_bulk_msg() returns an error. Set it to 0 to avoid a KMSAN error. Signed-off-by: Hans Verkuil Reported-by: syzbot+a4e309017a5f3a24c7b3@syzkaller.appspotmail.com Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit b13bc773eea302bbe76c9acd3c4117501a1a7dd1 Author: Daniel Niv Date: Thu Mar 11 03:53:00 2021 +0100 media: media/saa7164: fix saa7164_encoder_register() memory leak bugs [ Upstream commit c759b2970c561e3b56aa030deb13db104262adfe ] Add a fix for the memory leak bugs that can occur when the saa7164_encoder_register() function fails. The function allocates memory without explicitly freeing it when errors occur. Add a better error handling that deallocate the unused buffers before the function exits during a fail. Signed-off-by: Daniel Niv Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 2b17e0eb9bb51381294cd5dcae694a8ccc3e1ad0 Author: Julian Wiedmann Date: Sat Jan 30 14:56:20 2021 +0100 s390/qdio: let driver manage the QAOB [ Upstream commit 396c100472dd63bb1a5389d9dfb25a94943c41c9 ] We are spending way too much effort on qdio-internal bookkeeping for QAOB management & caching, and it's still not robust. Once qdio's TX path has detached the QAOB from a PENDING buffer, we lost all track of it until it shows up in a CQ notification again. So if the device is torn down before that notification arrives, we leak the QAOB. Just have the driver take care of it, and simply pass down a QAOB if they want a TX with async-completion capability. For a buffer in PENDING state that requires the QAOB for final completion, qeth can now also try to recycle the buffer's QAOB rather than unconditionally freeing it. This also eliminates the qdio_outbuf_state array, which was only needed to transfer the aob->user1 tag from the driver to the qdio layer. Signed-off-by: Julian Wiedmann Acked-by: Benjamin Block Signed-off-by: Heiko Carstens Signed-off-by: Sasha Levin commit 54f1727a0956bcbed3a88540577c5e7dbab93aec Author: Bryan O'Donoghue Date: Fri Feb 5 19:11:49 2021 +0100 media: venus: core, venc, vdec: Fix probe dependency error [ Upstream commit 08b1cf474b7f72750adebe0f0a35f8e9a3eb75f6 ] Commit aaaa93eda64b ("media] media: venus: venc: add video encoder files") is the last in a series of three commits to add core.c vdec.c and venc.c adding core, encoder and decoder. The encoder and decoder check for core drvdata as set and return -EPROBE_DEFER if it has not been set, however both the encoder and decoder rely on core.v4l2_dev as valid. core.v4l2_dev will not be valid until v4l2_device_register() has completed in core.c's probe(). Normally this is never seen however, Dmitry reported the following backtrace when compiling drivers and firmware directly into a kernel image. [ 5.259968] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT) [ 5.269850] sd 0:0:0:3: [sdd] Optimal transfer size 524288 bytes [ 5.275505] Workqueue: events deferred_probe_work_func [ 5.275513] pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--) [ 5.441211] usb 2-1: new SuperSpeedPlus Gen 2 USB device number 2 using xhci-hcd [ 5.442486] pc : refcount_warn_saturate+0x140/0x148 [ 5.493756] hub 2-1:1.0: USB hub found [ 5.496266] lr : refcount_warn_saturate+0x140/0x148 [ 5.500982] hub 2-1:1.0: 4 ports detected [ 5.503440] sp : ffff80001067b730 [ 5.503442] x29: ffff80001067b730 [ 5.592660] usb 1-1: new high-speed USB device number 2 using xhci-hcd [ 5.598478] x28: ffff6c6bc1c379b8 [ 5.598480] x27: ffffa5c673852960 x26: ffffa5c673852000 [ 5.598484] x25: ffff6c6bc1c37800 x24: 0000000000000001 [ 5.810652] x23: 0000000000000000 x22: ffffa5c673bc7118 [ 5.813777] hub 1-1:1.0: USB hub found [ 5.816108] x21: ffffa5c674440000 x20: 0000000000000001 [ 5.820846] hub 1-1:1.0: 4 ports detected [ 5.825415] x19: ffffa5c6744f4000 x18: ffffffffffffffff [ 5.825418] x17: 0000000000000000 x16: 0000000000000000 [ 5.825421] x15: 00000a4810c193ba x14: 0000000000000000 [ 5.825424] x13: 00000000000002b8 x12: 000000000000f20a [ 5.825427] x11: 000000000000f20a x10: 0000000000000038 [ 5.845447] usb 2-1.1: new SuperSpeed Gen 1 USB device number 3 using xhci-hcd [ 5.845904] [ 5.845905] x9 : 0000000000000000 x8 : ffff6c6d36fae780 [ 5.871208] x7 : ffff6c6d36faf240 x6 : 0000000000000000 [ 5.876664] x5 : 0000000000000004 x4 : 0000000000000085 [ 5.882121] x3 : 0000000000000119 x2 : ffffa5c6741ef478 [ 5.887578] x1 : 3acbb3926faf5f00 x0 : 0000000000000000 [ 5.893036] Call trace: [ 5.895551] refcount_warn_saturate+0x140/0x148 [ 5.900202] __video_register_device+0x64c/0xd10 [ 5.904944] venc_probe+0xc4/0x148 [ 5.908444] platform_probe+0x68/0xe0 [ 5.912210] really_probe+0x118/0x3e0 [ 5.915977] driver_probe_device+0x5c/0xc0 [ 5.920187] __device_attach_driver+0x98/0xb8 [ 5.924661] bus_for_each_drv+0x68/0xd0 [ 5.928604] __device_attach+0xec/0x148 [ 5.932547] device_initial_probe+0x14/0x20 [ 5.936845] bus_probe_device+0x9c/0xa8 [ 5.940788] device_add+0x3e8/0x7c8 [ 5.944376] of_device_add+0x4c/0x60 [ 5.948056] of_platform_device_create_pdata+0xbc/0x140 [ 5.953425] of_platform_bus_create+0x17c/0x3c0 [ 5.958078] of_platform_populate+0x80/0x110 [ 5.962463] venus_probe+0x2ec/0x4d8 [ 5.966143] platform_probe+0x68/0xe0 [ 5.969907] really_probe+0x118/0x3e0 [ 5.973674] driver_probe_device+0x5c/0xc0 [ 5.977882] __device_attach_driver+0x98/0xb8 [ 5.982356] bus_for_each_drv+0x68/0xd0 [ 5.986298] __device_attach+0xec/0x148 [ 5.990242] device_initial_probe+0x14/0x20 [ 5.994539] bus_probe_device+0x9c/0xa8 [ 5.998481] deferred_probe_work_func+0x74/0xb0 [ 6.003132] process_one_work+0x1e8/0x360 [ 6.007254] worker_thread+0x208/0x478 [ 6.011106] kthread+0x150/0x158 [ 6.014431] ret_from_fork+0x10/0x30 [ 6.018111] ---[ end trace f074246b1ecdb466 ]--- This patch fixes by - Only setting drvdata after v4l2_device_register() completes - Moving v4l2_device_register() so that suspend/reume in core::probe() stays as-is - Changes pm_ops->core_function() to take struct venus_core not struct device - Minimal rework of v4l2_device_*register in probe/remove Reported-by: Dmitry Baryshkov Signed-off-by: Bryan O'Donoghue Signed-off-by: Stanimir Varbanov Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 0e5d8d321fb8da26b63ee4c03e44467e43eaffe1 Author: Hans de Goede Date: Sun Mar 7 16:17:57 2021 +0100 extcon: arizona: Fix various races on driver unbind [ Upstream commit e5b499f6fb17bc95a813e85d0796522280203806 ] We must free/disable all interrupts and cancel all pending works before doing further cleanup. Before this commit arizona_extcon_remove() was doing several register writes to shut things down before disabling the IRQs and it was cancelling only 1 of the 3 different works used. Move all the register-writes shutting things down to after the disabling of the IRQs and add the 2 missing cancel_delayed_work_sync() calls. This fixes various possible races on driver unbind. One of which would always trigger on devices using the mic-clamp feature for jack detection. The ARIZONA_MICD_CLAMP_MODE_MASK update was done before disabling the IRQs, causing: 1. arizona_jackdet() to run 2. detect a jack being inserted (clamp disabled means jack inserted) 3. call arizona_start_mic() which: 3.1 Enables the MICVDD regulator 3.2 takes a pm_runtime_reference And this was all happening after the ARIZONA_MICD_ENA bit clearing, which would undo 3.1 and 3.2 because the ARIZONA_MICD_CLAMP_MODE_MASK update was being done after the ARIZONA_MICD_ENA bit clearing. So this means that arizona_extcon_remove() would exit with 1. MICVDD enabled and 2. The pm_runtime_reference being unbalanced. MICVDD still being enabled caused the following oops when the regulator is released by the devm framework: [ 2850.745757] ------------[ cut here ]------------ [ 2850.745827] WARNING: CPU: 2 PID: 2098 at drivers/regulator/core.c:2123 _regulator_put.part.0+0x19f/0x1b0 [ 2850.745835] Modules linked in: extcon_arizona ... ... [ 2850.746909] Call Trace: [ 2850.746932] regulator_put+0x2d/0x40 [ 2850.746946] release_nodes+0x22a/0x260 [ 2850.746984] __device_release_driver+0x190/0x240 [ 2850.747002] driver_detach+0xd4/0x120 ... [ 2850.747337] ---[ end trace f455dfd7abd9781f ]--- Note this oops is just one of various theoretically possible races caused by the wrong ordering inside arizona_extcon_remove(), this fixes the ordering fixing all possible races, including the reported oops. Signed-off-by: Hans de Goede Reviewed-by: Andy Shevchenko Acked-by: Charles Keepax Tested-by: Charles Keepax Acked-by: Chanwoo Choi Signed-off-by: Lee Jones Signed-off-by: Sasha Levin commit 381d0d20ea7b432102ca0f043974f7c2d2dae59f Author: Hans de Goede Date: Sun Mar 7 16:17:56 2021 +0100 extcon: arizona: Fix some issues when HPDET IRQ fires after the jack has been unplugged [ Upstream commit c309a3e8793f7e01c4a4ec7960658380572cb576 ] When the jack is partially inserted and then removed again it may be removed while the hpdet code is running. In this case the following may happen: 1. The "JACKDET rise" or ""JACKDET fall" IRQ triggers 2. arizona_jackdet runs and takes info->lock 3. The "HPDET" IRQ triggers 4. arizona_hpdet_irq runs, blocks on info->lock 5. arizona_jackdet calls arizona_stop_mic() and clears info->hpdet_done 6. arizona_jackdet releases info->lock 7. arizona_hpdet_irq now can continue running and: 7.1 Calls arizona_start_mic() (if a mic was detected) 7.2 sets info->hpdet_done Step 7 is undesirable / a bug: 7.1 causes the device to stay in a high power-state (with MICVDD enabled) 7.2 causes hpdet to not run on the next jack insertion, which in turn causes the EXTCON_JACK_HEADPHONE state to never get set This fixes both issues by skipping these 2 steps when arizona_hpdet_irq runs after the jack has been unplugged. Signed-off-by: Hans de Goede Reviewed-by: Andy Shevchenko Acked-by: Charles Keepax Tested-by: Charles Keepax Acked-by: Chanwoo Choi Signed-off-by: Lee Jones Signed-off-by: Sasha Levin commit 348c35a21bf1c13fe0968623161ae39b29783bdf Author: Matthias Schiffer Date: Wed Mar 3 10:54:19 2021 +0100 power: supply: bq27xxx: fix power_avg for newer ICs [ Upstream commit c4d57c22ac65bd503716062a06fad55a01569cac ] On all newer bq27xxx ICs, the AveragePower register contains a signed value; in addition to handling the raw value as unsigned, the driver code also didn't convert it to µW as expected. At least for the BQ28Z610, the reference manual incorrectly states that the value is in units of 1mW and not 10mW. I have no way of knowing whether the manuals of other supported ICs contain the same error, or if there are models that actually use 1mW. At least, the new code shouldn't be *less* correct than the old version for any device. power_avg is removed from the cache structure, se we don't have to extend it to store both a signed value and an error code. Always getting an up-to-date value may be desirable anyways, as it avoids inconsistent current and power readings when switching between charging and discharging. Signed-off-by: Matthias Schiffer Signed-off-by: Sebastian Reichel Signed-off-by: Sasha Levin commit c8baf9979daec8410c5b701eec5074941a27bb8f Author: Mauro Carvalho Chehab Date: Fri Mar 12 08:16:03 2021 +0100 atomisp: don't let it go past pipes array [ Upstream commit 1f6c45ac5fd70ab59136ab5babc7def269f3f509 ] In practice, IA_CSS_PIPE_ID_NUM should never be used when calling atomisp_q_video_buffers_to_css(), as the driver should discover the right pipe before calling it. Yet, if some pipe parsing issue happens, it could end using it. So, add a WARN_ON() to prevent such case. Reported-by: Dan Carpenter Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 115a5769d710d60da4cfb4507c68029d5c55bc29 Author: Laurent Pinchart Date: Mon Feb 15 05:26:47 2021 +0100 media: imx: capture: Return -EPIPE from __capture_legacy_try_fmt() [ Upstream commit cc271b6754691af74d710b761eaf027e3743e243 ] The correct return code to report an invalid pipeline configuration is -EPIPE. Return it instead of -EINVAL from __capture_legacy_try_fmt() when the capture format doesn't match the media bus format of the connected subdev. Signed-off-by: Laurent Pinchart Reviewed-by: Rui Miguel Silva Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit a6b47637955cfdf2fd628d6fd3bb5413013d2623 Author: Brad Love Date: Tue Jan 26 05:52:05 2021 +0100 media: cx23885: add more quirks for reset DMA on some AMD IOMMU [ Upstream commit 5f864cfbf59bfed2057bd214ce7fbf6ad420d54b ] The folowing AMD IOMMU are affected by the RiSC engine stall, requiring a reset to maintain continual operation. After being added to the broken_dev_id list the systems are functional long term. 0x1481 is the PCI ID for the IOMMU found on Starship/Matisse 0x1419 is the PCI ID for the IOMMU found on 15h (Models 10h-1fh) family 0x5a23 is the PCI ID for the IOMMU found on RD890S/RD990 Signed-off-by: Brad Love Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 08acc7769692c1769be96b2a90e21b042bb4382d Author: Pavel Skripkin Date: Mon Mar 1 21:38:26 2021 +0100 media: drivers/media/usb: fix memory leak in zr364xx_probe [ Upstream commit 9c39be40c0155c43343f53e3a439290c0fec5542 ] syzbot reported memory leak in zr364xx_probe()[1]. The problem was in invalid error handling order. All error conditions rigth after v4l2_ctrl_handler_init() must call v4l2_ctrl_handler_free(). Reported-by: syzbot+efe9aefc31ae1e6f7675@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 6802047d607edfc0c40be9f99dcfddeb7b23152c Author: Julian Braha Date: Thu Feb 25 09:06:58 2021 +0100 media: drivers: media: pci: sta2x11: fix Kconfig dependency on GPIOLIB [ Upstream commit 24df8b74c8b2fb42c49ffe8585562da0c96446ff ] When STA2X11_VIP is enabled, and GPIOLIB is disabled, Kbuild gives the following warning: WARNING: unmet direct dependencies detected for VIDEO_ADV7180 Depends on [n]: MEDIA_SUPPORT [=y] && GPIOLIB [=n] && VIDEO_V4L2 [=y] && I2C [=y] Selected by [y]: - STA2X11_VIP [=y] && MEDIA_SUPPORT [=y] && MEDIA_PCI_SUPPORT [=y] && MEDIA_CAMERA_SUPPORT [=y] && PCI [=y] && VIDEO_V4L2 [=y] && VIRT_TO_BUS [=y] && I2C [=y] && (STA2X11 [=n] || COMPILE_TEST [=y]) && MEDIA_SUBDRV_AUTOSELECT [=y] This is because STA2X11_VIP selects VIDEO_ADV7180 without selecting or depending on GPIOLIB, despite VIDEO_ADV7180 depending on GPIOLIB. Signed-off-by: Julian Braha Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 2eeccff0ba79b86ca81de5185dbbf5411b068c2b Author: Sean Young Date: Mon Feb 22 09:08:35 2021 +0100 media: ite-cir: check for receive overflow [ Upstream commit 28c7afb07ccfc0a939bb06ac1e7afe669901c65a ] It's best if this condition is reported. Signed-off-by: Sean Young Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 4f8058c23b2b8d7a0cd9efbd94d13aebceb88592 Author: Chaitanya Kulkarni Date: Sat Feb 27 21:56:26 2021 -0800 scsi: target: pscsi: Fix warning in pscsi_complete_cmd() [ Upstream commit fd48c056a32ed6e7754c7c475490f3bed54ed378 ] This fixes a compilation warning in pscsi_complete_cmd(): drivers/target/target_core_pscsi.c: In function ‘pscsi_complete_cmd’: drivers/target/target_core_pscsi.c:624:5: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] ; /* XXX: TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE */ Link: https://lore.kernel.org/r/20210228055645.22253-5-chaitanya.kulkarni@wdc.com Reviewed-by: Mike Christie Reviewed-by: Johannes Thumshirn Signed-off-by: Chaitanya Kulkarni Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 1016c0f62f73d5d2c3a5d1ad1e1fb0cdce1993ae Author: xndcn Date: Fri Mar 5 23:18:19 2021 +0800 drm/virtio: fix possible leak/unlock virtio_gpu_object_array [ Upstream commit 377f8331d0565e6f71ba081c894029a92d0c7e77 ] virtio_gpu_object array is not freed or unlocked in some failed cases. Signed-off-by: xndcn Link: http://patchwork.freedesktop.org/patch/msgid/20210305151819.14330-1-xndchn@gmail.com Signed-off-by: Gerd Hoffmann Signed-off-by: Sasha Levin commit 3f926066e967f6e86aa94d5fb1e9f69f6a651d53 Author: Uladzislau Rezki (Sony) Date: Fri Jan 29 21:05:05 2021 +0100 kvfree_rcu: Use same set of GFP flags as does single-argument [ Upstream commit ee6ddf58475cce8a3d3697614679cd8cb4a6f583 ] Running an rcuscale stress-suite can lead to "Out of memory" of a system. This can happen under high memory pressure with a small amount of physical memory. For example, a KVM test configuration with 64 CPUs and 512 megabytes can result in OOM when running rcuscale with below parameters: ../kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig CONFIG_NR_CPUS=64 \ --bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 \ rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot" --trust-make [ 12.054448] kworker/1:1H invoked oom-killer: gfp_mask=0x2cc0(GFP_KERNEL|__GFP_NOWARN), order=0, oom_score_adj=0 [ 12.055303] CPU: 1 PID: 377 Comm: kworker/1:1H Not tainted 5.11.0-rc3+ #510 [ 12.055416] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014 [ 12.056485] Workqueue: events_highpri fill_page_cache_func [ 12.056485] Call Trace: [ 12.056485] dump_stack+0x57/0x6a [ 12.056485] dump_header+0x4c/0x30a [ 12.056485] ? del_timer_sync+0x20/0x30 [ 12.056485] out_of_memory.cold.47+0xa/0x7e [ 12.056485] __alloc_pages_slowpath.constprop.123+0x82f/0xc00 [ 12.056485] __alloc_pages_nodemask+0x289/0x2c0 [ 12.056485] __get_free_pages+0x8/0x30 [ 12.056485] fill_page_cache_func+0x39/0xb0 [ 12.056485] process_one_work+0x1ed/0x3b0 [ 12.056485] ? process_one_work+0x3b0/0x3b0 [ 12.060485] worker_thread+0x28/0x3c0 [ 12.060485] ? process_one_work+0x3b0/0x3b0 [ 12.060485] kthread+0x138/0x160 [ 12.060485] ? kthread_park+0x80/0x80 [ 12.060485] ret_from_fork+0x22/0x30 [ 12.062156] Mem-Info: [ 12.062350] active_anon:0 inactive_anon:0 isolated_anon:0 [ 12.062350] active_file:0 inactive_file:0 isolated_file:0 [ 12.062350] unevictable:0 dirty:0 writeback:0 [ 12.062350] slab_reclaimable:2797 slab_unreclaimable:80920 [ 12.062350] mapped:1 shmem:2 pagetables:8 bounce:0 [ 12.062350] free:10488 free_pcp:1227 free_cma:0 ... [ 12.101610] Out of memory and no killable processes... [ 12.102042] Kernel panic - not syncing: System is deadlocked on memory [ 12.102583] CPU: 1 PID: 377 Comm: kworker/1:1H Not tainted 5.11.0-rc3+ #510 [ 12.102600] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014 Because kvfree_rcu() has a fallback path, memory allocation failure is not the end of the world. Furthermore, the added overhead of aggressive GFP settings must be balanced against the overhead of the fallback path, which is a cache miss for double-argument kvfree_rcu() and a call to synchronize_rcu() for single-argument kvfree_rcu(). The current choice of GFP_KERNEL|__GFP_NOWARN can result in longer latencies than a call to synchronize_rcu(), so less-tenacious GFP flags would be helpful. Here is the tradeoff that must be balanced: a) Minimize use of the fallback path, b) Avoid pushing the system into OOM, c) Bound allocation latency to that of synchronize_rcu(), and d) Leave the emergency reserves to use cases lacking fallbacks. This commit therefore changes GFP flags from GFP_KERNEL|__GFP_NOWARN to GFP_KERNEL|__GFP_NORETRY|__GFP_NOMEMALLOC|__GFP_NOWARN. This combination leaves the emergency reserves alone and can initiate reclaim, but will not invoke the OOM killer. Signed-off-by: Uladzislau Rezki (Sony) Signed-off-by: Paul E. McKenney Signed-off-by: Sasha Levin commit cc69d5a314b10ad21a917343fd1e16b3e026d52e Author: Barry Song Date: Wed Feb 24 16:09:44 2021 +1300 sched/topology: fix the issue groups don't span domain->span for NUMA diameter > 2 [ Upstream commit 585b6d2723dc927ebc4ad884c4e879e4da8bc21f ] As long as NUMA diameter > 2, building sched_domain by sibling's child domain will definitely create a sched_domain with sched_group which will span out of the sched_domain: +------+ +------+ +-------+ +------+ | node | 12 |node | 20 | node | 12 |node | | 0 +---------+1 +--------+ 2 +-------+3 | +------+ +------+ +-------+ +------+ domain0 node0 node1 node2 node3 domain1 node0+1 node0+1 node2+3 node2+3 + domain2 node0+1+2 | group: node0+1 | group:node2+3 <-------------------+ when node2 is added into the domain2 of node0, kernel is using the child domain of node2's domain2, which is domain1(node2+3). Node 3 is outside the span of the domain including node0+1+2. This will make load_balance() run based on screwed avg_load and group_type in the sched_group spanning out of the sched_domain, and it also makes select_task_rq_fair() pick an idle CPU outside the sched_domain. Real servers which suffer from this problem include Kunpeng920 and 8-node Sun Fire X4600-M2, at least. Here we move to use the *child* domain of the *child* domain of node2's domain2 as the new added sched_group. At the same, we re-use the lower level sgc directly. +------+ +------+ +-------+ +------+ | node | 12 |node | 20 | node | 12 |node | | 0 +---------+1 +--------+ 2 +-------+3 | +------+ +------+ +-------+ +------+ domain0 node0 node1 +- node2 node3 | domain1 node0+1 node0+1 | node2+3 node2+3 | domain2 node0+1+2 | group: node0+1 | group:node2 <-------------------+ While the lower level sgc is re-used, this patch only changes the remote sched_groups for those sched_domains playing grandchild trick, therefore, sgc->next_update is still safe since it's only touched by CPUs that have the group span as local group. And sgc->imbalance is also safe because sd_parent remains the same in load_balance and LB only tries other CPUs from the local group. Moreover, since local groups are not touched, they are still getting roughly equal size in a TL. And should_we_balance() only matters with local groups, so the pull probability of those groups are still roughly equal. Tested by the below topology: qemu-system-aarch64 -M virt -nographic \ -smp cpus=8 \ -numa node,cpus=0-1,nodeid=0 \ -numa node,cpus=2-3,nodeid=1 \ -numa node,cpus=4-5,nodeid=2 \ -numa node,cpus=6-7,nodeid=3 \ -numa dist,src=0,dst=1,val=12 \ -numa dist,src=0,dst=2,val=20 \ -numa dist,src=0,dst=3,val=22 \ -numa dist,src=1,dst=2,val=22 \ -numa dist,src=2,dst=3,val=12 \ -numa dist,src=1,dst=3,val=24 \ -m 4G -cpu cortex-a57 -kernel arch/arm64/boot/Image w/o patch, we get lots of "groups don't span domain->span": [ 0.802139] CPU0 attaching sched-domain(s): [ 0.802193] domain-0: span=0-1 level=MC [ 0.802443] groups: 0:{ span=0 cap=1013 }, 1:{ span=1 cap=979 } [ 0.802693] domain-1: span=0-3 level=NUMA [ 0.802731] groups: 0:{ span=0-1 cap=1992 }, 2:{ span=2-3 cap=1943 } [ 0.802811] domain-2: span=0-5 level=NUMA [ 0.802829] groups: 0:{ span=0-3 cap=3935 }, 4:{ span=4-7 cap=3937 } [ 0.802881] ERROR: groups don't span domain->span [ 0.803058] domain-3: span=0-7 level=NUMA [ 0.803080] groups: 0:{ span=0-5 mask=0-1 cap=5843 }, 6:{ span=4-7 mask=6-7 cap=4077 } [ 0.804055] CPU1 attaching sched-domain(s): [ 0.804072] domain-0: span=0-1 level=MC [ 0.804096] groups: 1:{ span=1 cap=979 }, 0:{ span=0 cap=1013 } [ 0.804152] domain-1: span=0-3 level=NUMA [ 0.804170] groups: 0:{ span=0-1 cap=1992 }, 2:{ span=2-3 cap=1943 } [ 0.804219] domain-2: span=0-5 level=NUMA [ 0.804236] groups: 0:{ span=0-3 cap=3935 }, 4:{ span=4-7 cap=3937 } [ 0.804302] ERROR: groups don't span domain->span [ 0.804520] domain-3: span=0-7 level=NUMA [ 0.804546] groups: 0:{ span=0-5 mask=0-1 cap=5843 }, 6:{ span=4-7 mask=6-7 cap=4077 } [ 0.804677] CPU2 attaching sched-domain(s): [ 0.804687] domain-0: span=2-3 level=MC [ 0.804705] groups: 2:{ span=2 cap=934 }, 3:{ span=3 cap=1009 } [ 0.804754] domain-1: span=0-3 level=NUMA [ 0.804772] groups: 2:{ span=2-3 cap=1943 }, 0:{ span=0-1 cap=1992 } [ 0.804820] domain-2: span=0-5 level=NUMA [ 0.804836] groups: 2:{ span=0-3 mask=2-3 cap=3991 }, 4:{ span=0-1,4-7 mask=4-5 cap=5985 } [ 0.804944] ERROR: groups don't span domain->span [ 0.805108] domain-3: span=0-7 level=NUMA [ 0.805134] groups: 2:{ span=0-5 mask=2-3 cap=5899 }, 6:{ span=0-1,4-7 mask=6-7 cap=6125 } [ 0.805223] CPU3 attaching sched-domain(s): [ 0.805232] domain-0: span=2-3 level=MC [ 0.805249] groups: 3:{ span=3 cap=1009 }, 2:{ span=2 cap=934 } [ 0.805319] domain-1: span=0-3 level=NUMA [ 0.805336] groups: 2:{ span=2-3 cap=1943 }, 0:{ span=0-1 cap=1992 } [ 0.805383] domain-2: span=0-5 level=NUMA [ 0.805399] groups: 2:{ span=0-3 mask=2-3 cap=3991 }, 4:{ span=0-1,4-7 mask=4-5 cap=5985 } [ 0.805458] ERROR: groups don't span domain->span [ 0.805605] domain-3: span=0-7 level=NUMA [ 0.805626] groups: 2:{ span=0-5 mask=2-3 cap=5899 }, 6:{ span=0-1,4-7 mask=6-7 cap=6125 } [ 0.805712] CPU4 attaching sched-domain(s): [ 0.805721] domain-0: span=4-5 level=MC [ 0.805738] groups: 4:{ span=4 cap=984 }, 5:{ span=5 cap=924 } [ 0.805787] domain-1: span=4-7 level=NUMA [ 0.805803] groups: 4:{ span=4-5 cap=1908 }, 6:{ span=6-7 cap=2029 } [ 0.805851] domain-2: span=0-1,4-7 level=NUMA [ 0.805867] groups: 4:{ span=4-7 cap=3937 }, 0:{ span=0-3 cap=3935 } [ 0.805915] ERROR: groups don't span domain->span [ 0.806108] domain-3: span=0-7 level=NUMA [ 0.806130] groups: 4:{ span=0-1,4-7 mask=4-5 cap=5985 }, 2:{ span=0-3 mask=2-3 cap=3991 } [ 0.806214] CPU5 attaching sched-domain(s): [ 0.806222] domain-0: span=4-5 level=MC [ 0.806240] groups: 5:{ span=5 cap=924 }, 4:{ span=4 cap=984 } [ 0.806841] domain-1: span=4-7 level=NUMA [ 0.806866] groups: 4:{ span=4-5 cap=1908 }, 6:{ span=6-7 cap=2029 } [ 0.806934] domain-2: span=0-1,4-7 level=NUMA [ 0.806953] groups: 4:{ span=4-7 cap=3937 }, 0:{ span=0-3 cap=3935 } [ 0.807004] ERROR: groups don't span domain->span [ 0.807312] domain-3: span=0-7 level=NUMA [ 0.807386] groups: 4:{ span=0-1,4-7 mask=4-5 cap=5985 }, 2:{ span=0-3 mask=2-3 cap=3991 } [ 0.807686] CPU6 attaching sched-domain(s): [ 0.807710] domain-0: span=6-7 level=MC [ 0.807750] groups: 6:{ span=6 cap=1017 }, 7:{ span=7 cap=1012 } [ 0.807840] domain-1: span=4-7 level=NUMA [ 0.807870] groups: 6:{ span=6-7 cap=2029 }, 4:{ span=4-5 cap=1908 } [ 0.807952] domain-2: span=0-1,4-7 level=NUMA [ 0.807985] groups: 6:{ span=4-7 mask=6-7 cap=4077 }, 0:{ span=0-5 mask=0-1 cap=5843 } [ 0.808045] ERROR: groups don't span domain->span [ 0.808257] domain-3: span=0-7 level=NUMA [ 0.808571] groups: 6:{ span=0-1,4-7 mask=6-7 cap=6125 }, 2:{ span=0-5 mask=2-3 cap=5899 } [ 0.808848] CPU7 attaching sched-domain(s): [ 0.808860] domain-0: span=6-7 level=MC [ 0.808880] groups: 7:{ span=7 cap=1012 }, 6:{ span=6 cap=1017 } [ 0.808953] domain-1: span=4-7 level=NUMA [ 0.808974] groups: 6:{ span=6-7 cap=2029 }, 4:{ span=4-5 cap=1908 } [ 0.809034] domain-2: span=0-1,4-7 level=NUMA [ 0.809055] groups: 6:{ span=4-7 mask=6-7 cap=4077 }, 0:{ span=0-5 mask=0-1 cap=5843 } [ 0.809128] ERROR: groups don't span domain->span [ 0.810361] domain-3: span=0-7 level=NUMA [ 0.810400] groups: 6:{ span=0-1,4-7 mask=6-7 cap=5961 }, 2:{ span=0-5 mask=2-3 cap=5903 } w/ patch, we don't get "groups don't span domain->span" any more: [ 1.486271] CPU0 attaching sched-domain(s): [ 1.486820] domain-0: span=0-1 level=MC [ 1.500924] groups: 0:{ span=0 cap=980 }, 1:{ span=1 cap=994 } [ 1.515717] domain-1: span=0-3 level=NUMA [ 1.515903] groups: 0:{ span=0-1 cap=1974 }, 2:{ span=2-3 cap=1989 } [ 1.516989] domain-2: span=0-5 level=NUMA [ 1.517124] groups: 0:{ span=0-3 cap=3963 }, 4:{ span=4-5 cap=1949 } [ 1.517369] domain-3: span=0-7 level=NUMA [ 1.517423] groups: 0:{ span=0-5 mask=0-1 cap=5912 }, 6:{ span=4-7 mask=6-7 cap=4054 } [ 1.520027] CPU1 attaching sched-domain(s): [ 1.520097] domain-0: span=0-1 level=MC [ 1.520184] groups: 1:{ span=1 cap=994 }, 0:{ span=0 cap=980 } [ 1.520429] domain-1: span=0-3 level=NUMA [ 1.520487] groups: 0:{ span=0-1 cap=1974 }, 2:{ span=2-3 cap=1989 } [ 1.520687] domain-2: span=0-5 level=NUMA [ 1.520744] groups: 0:{ span=0-3 cap=3963 }, 4:{ span=4-5 cap=1949 } [ 1.520948] domain-3: span=0-7 level=NUMA [ 1.521038] groups: 0:{ span=0-5 mask=0-1 cap=5912 }, 6:{ span=4-7 mask=6-7 cap=4054 } [ 1.522068] CPU2 attaching sched-domain(s): [ 1.522348] domain-0: span=2-3 level=MC [ 1.522606] groups: 2:{ span=2 cap=1003 }, 3:{ span=3 cap=986 } [ 1.522832] domain-1: span=0-3 level=NUMA [ 1.522885] groups: 2:{ span=2-3 cap=1989 }, 0:{ span=0-1 cap=1974 } [ 1.523043] domain-2: span=0-5 level=NUMA [ 1.523092] groups: 2:{ span=0-3 mask=2-3 cap=4037 }, 4:{ span=4-5 cap=1949 } [ 1.523302] domain-3: span=0-7 level=NUMA [ 1.523352] groups: 2:{ span=0-5 mask=2-3 cap=5986 }, 6:{ span=0-1,4-7 mask=6-7 cap=6102 } [ 1.523748] CPU3 attaching sched-domain(s): [ 1.523774] domain-0: span=2-3 level=MC [ 1.523825] groups: 3:{ span=3 cap=986 }, 2:{ span=2 cap=1003 } [ 1.524009] domain-1: span=0-3 level=NUMA [ 1.524086] groups: 2:{ span=2-3 cap=1989 }, 0:{ span=0-1 cap=1974 } [ 1.524281] domain-2: span=0-5 level=NUMA [ 1.524331] groups: 2:{ span=0-3 mask=2-3 cap=4037 }, 4:{ span=4-5 cap=1949 } [ 1.524534] domain-3: span=0-7 level=NUMA [ 1.524586] groups: 2:{ span=0-5 mask=2-3 cap=5986 }, 6:{ span=0-1,4-7 mask=6-7 cap=6102 } [ 1.524847] CPU4 attaching sched-domain(s): [ 1.524873] domain-0: span=4-5 level=MC [ 1.524954] groups: 4:{ span=4 cap=958 }, 5:{ span=5 cap=991 } [ 1.525105] domain-1: span=4-7 level=NUMA [ 1.525153] groups: 4:{ span=4-5 cap=1949 }, 6:{ span=6-7 cap=2006 } [ 1.525368] domain-2: span=0-1,4-7 level=NUMA [ 1.525428] groups: 4:{ span=4-7 cap=3955 }, 0:{ span=0-1 cap=1974 } [ 1.532726] domain-3: span=0-7 level=NUMA [ 1.532811] groups: 4:{ span=0-1,4-7 mask=4-5 cap=6003 }, 2:{ span=0-3 mask=2-3 cap=4037 } [ 1.534125] CPU5 attaching sched-domain(s): [ 1.534159] domain-0: span=4-5 level=MC [ 1.534303] groups: 5:{ span=5 cap=991 }, 4:{ span=4 cap=958 } [ 1.534490] domain-1: span=4-7 level=NUMA [ 1.534572] groups: 4:{ span=4-5 cap=1949 }, 6:{ span=6-7 cap=2006 } [ 1.534734] domain-2: span=0-1,4-7 level=NUMA [ 1.534783] groups: 4:{ span=4-7 cap=3955 }, 0:{ span=0-1 cap=1974 } [ 1.536057] domain-3: span=0-7 level=NUMA [ 1.536430] groups: 4:{ span=0-1,4-7 mask=4-5 cap=6003 }, 2:{ span=0-3 mask=2-3 cap=3896 } [ 1.536815] CPU6 attaching sched-domain(s): [ 1.536846] domain-0: span=6-7 level=MC [ 1.536934] groups: 6:{ span=6 cap=1005 }, 7:{ span=7 cap=1001 } [ 1.537144] domain-1: span=4-7 level=NUMA [ 1.537262] groups: 6:{ span=6-7 cap=2006 }, 4:{ span=4-5 cap=1949 } [ 1.537553] domain-2: span=0-1,4-7 level=NUMA [ 1.537613] groups: 6:{ span=4-7 mask=6-7 cap=4054 }, 0:{ span=0-1 cap=1805 } [ 1.537872] domain-3: span=0-7 level=NUMA [ 1.537998] groups: 6:{ span=0-1,4-7 mask=6-7 cap=6102 }, 2:{ span=0-5 mask=2-3 cap=5845 } [ 1.538448] CPU7 attaching sched-domain(s): [ 1.538505] domain-0: span=6-7 level=MC [ 1.538586] groups: 7:{ span=7 cap=1001 }, 6:{ span=6 cap=1005 } [ 1.538746] domain-1: span=4-7 level=NUMA [ 1.538798] groups: 6:{ span=6-7 cap=2006 }, 4:{ span=4-5 cap=1949 } [ 1.539048] domain-2: span=0-1,4-7 level=NUMA [ 1.539111] groups: 6:{ span=4-7 mask=6-7 cap=4054 }, 0:{ span=0-1 cap=1805 } [ 1.539571] domain-3: span=0-7 level=NUMA [ 1.539610] groups: 6:{ span=0-1,4-7 mask=6-7 cap=6102 }, 2:{ span=0-5 mask=2-3 cap=5845 } Signed-off-by: Barry Song Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Tested-by: Meelis Roos Link: https://lkml.kernel.org/r/20210224030944.15232-1-song.bao.hua@hisilicon.com Signed-off-by: Sasha Levin commit 125cb2c7f8f31408669f44bd882f1c836f4a5f69 Author: Vincent Donnefort Date: Thu Feb 25 16:58:20 2021 +0000 sched/pelt: Fix task util_est update filtering [ Upstream commit b89997aa88f0b07d8a6414c908af75062103b8c9 ] Being called for each dequeue, util_est reduces the number of its updates by filtering out when the EWMA signal is different from the task util_avg by less than 1%. It is a problem for a sudden util_avg ramp-up. Due to the decay from a previous high util_avg, EWMA might now be close enough to the new util_avg. No update would then happen while it would leave ue.enqueued with an out-of-date value. Taking into consideration the two util_est members, EWMA and enqueued for the filtering, ensures, for both, an up-to-date value. This is for now an issue only for the trace probe that might return the stale value. Functional-wise, it isn't a problem, as the value is always accessed through max(enqueued, ewma). This problem has been observed using LISA's UtilConvergence:test_means on the sd845c board. No regression observed with Hackbench on sd845c and Perf-bench sched pipe on hikey/hikey960. Signed-off-by: Vincent Donnefort Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Dietmar Eggemann Reviewed-by: Vincent Guittot Link: https://lkml.kernel.org/r/20210225165820.1377125-1-vincent.donnefort@arm.com Signed-off-by: Sasha Levin commit 0feca71f535942c8f3f1cb845efa2f94696e3f82 Author: Vincent Donnefort Date: Thu Feb 25 08:36:11 2021 +0000 sched/fair: Fix task utilization accountability in compute_energy() [ Upstream commit 0372e1cf70c28de6babcba38ef97b6ae3400b101 ] find_energy_efficient_cpu() (feec()) computes for each perf_domain (pd) an energy delta as follows: feec(task) for_each_pd base_energy = compute_energy(task, -1, pd) -> for_each_cpu(pd) -> cpu_util_next(cpu, task, -1) energy_delta = compute_energy(task, dst_cpu, pd) -> for_each_cpu(pd) -> cpu_util_next(cpu, task, dst_cpu) energy_delta -= base_energy Then it picks the best CPU as being the one that minimizes energy_delta. cpu_util_next() estimates the CPU utilization that would happen if the task was placed on dst_cpu as follows: max(cpu_util + task_util, cpu_util_est + _task_util_est) The task contribution to the energy delta can then be either: (1) _task_util_est, on a mostly idle CPU, where cpu_util is close to 0 and _task_util_est > cpu_util. (2) task_util, on a mostly busy CPU, where cpu_util > _task_util_est. (cpu_util_est doesn't appear here. It is 0 when a CPU is idle and otherwise must be small enough so that feec() takes the CPU as a potential target for the task placement) This is problematic for feec(), as cpu_util_next() might give an unfair advantage to a CPU which is mostly busy (2) compared to one which is mostly idle (1). _task_util_est being always bigger than task_util in feec() (as the task is waking up), the task contribution to the energy might look smaller on certain CPUs (2) and this breaks the energy comparison. This issue is, moreover, not sporadic. By starving idle CPUs, it keeps their cpu_util < _task_util_est (1) while others will maintain cpu_util > _task_util_est (2). Fix this problem by always using max(task_util, _task_util_est) as a task contribution to the energy (ENERGY_UTIL). The new estimated CPU utilization for the energy would then be: max(cpu_util, cpu_util_est) + max(task_util, _task_util_est) compute_energy() still needs to know which OPP would be selected if the task would be migrated in the perf_domain (FREQUENCY_UTIL). Hence, cpu_util_next() is still used to estimate the maximum util within the pd. Signed-off-by: Vincent Donnefort Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Quentin Perret Reviewed-by: Dietmar Eggemann Link: https://lkml.kernel.org/r/20210225083612.1113823-2-vincent.donnefort@arm.com Signed-off-by: Sasha Levin commit d4be398888b76787e804de50c79f3eee9ef2a852 Author: Emily Deng Date: Thu Mar 4 19:30:51 2021 +0800 drm/amdgpu: Fix some unload driver issues [ Upstream commit bb0cd09be45ea457f25fdcbcb3d6cf2230f26c46 ] When unloading driver after killing some applications, it will hit sdma flush tlb job timeout which is called by ttm_bo_delay_delete. So to avoid the job submit after fence driver fini, call ttm_bo_lock_delayed_workqueue before fence driver fini. And also put drm_sched_fini before waiting fence. Signed-off-by: Emily Deng Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 05395840ae35c4a71c121ed4929e906d9da1dae0 Author: Arunpravin Date: Mon Mar 1 15:45:13 2021 +0530 drm/amd/pm/swsmu: clean up user profile function [ Upstream commit d8cce9306801cfbf709055677f7896905094ff95 ] Remove unnecessary comments, enable restore mode using '|=' operator, fixes the alignment to improve the code readability. v2: Move all restoration flag check to bitwise '&' operator Signed-off-by: Arunpravin Reviewed-by: Evan Quan Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit e085b5916049ba90f6231dfde72bbcbcda8f5380 Author: James Smart Date: Mon Mar 1 09:18:13 2021 -0800 scsi: lpfc: Fix ADISC handling that never frees nodes [ Upstream commit 309b477462df7542355ac984674a6e89c01c89aa ] While testing target port swap test with ADISC enabled, several nodes remain in UNUSED state. These nodes are never freed and rmmod hangs for long time before finising with "0233 Nodelist not empty" error. During PLOGI completion lpfc_plogi_confirm_nport() looks for existing nodes with same WWPN. If found, the existing node is used to continue discovery. The node on which plogi was performed is freed. When ADISC is enabled, an ADISC els request is triggered in response to an RSCN. It's possible that the ADISC may be rejected by the remote port causing the ADISC completion handler to clear the port and node name in the node. If this occurs, if a PLOGI is received it causes a node lookup based on wwpn to now fail, causing the port swap logic to kick in which allocates a new node and swaps to it. This effectively orphans the original node structure. Fix the situation by detecting when the lookup fails and forgo the node swap and node allocation by using the node on which the PLOGI was issued. Link: https://lore.kernel.org/r/20210301171821.3427-15-jsmart2021@gmail.com Co-developed-by: Dick Kennedy Signed-off-by: Dick Kennedy Signed-off-by: James Smart Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit a2dc0640eacdc8ba236cadc792d14a57d1d90fbc Author: James Smart Date: Mon Mar 1 09:18:12 2021 -0800 scsi: lpfc: Fix PLOGI ACC to be transmit after REG_LOGIN [ Upstream commit 143753059b8b957f1cf4355338a3e3a32f3a85bf ] The driver is seeing a scenario where PLOGI response was issued and traffic is arriving while the adapter is still setting up the login context. This is resulting in errors handling the traffic. Change the driver so that PLOGI response is sent after the login context has been setup to avoid the situation. Link: https://lore.kernel.org/r/20210301171821.3427-14-jsmart2021@gmail.com Co-developed-by: Dick Kennedy Signed-off-by: Dick Kennedy Signed-off-by: James Smart Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 823db6f1490e911a2a52bb47de68a06680ec887c Author: James Smart Date: Mon Mar 1 09:18:10 2021 -0800 scsi: lpfc: Fix status returned in lpfc_els_retry() error exit path [ Upstream commit 148bc64d38fe314475a074c4f757ec9d84537d1c ] An unlikely error exit path from lpfc_els_retry() returns incorrect status to a caller, erroneously indicating that a retry has been successfully issued or scheduled. Change error exit path to indicate no retry. Link: https://lore.kernel.org/r/20210301171821.3427-12-jsmart2021@gmail.com Co-developed-by: Dick Kennedy Signed-off-by: Dick Kennedy Signed-off-by: James Smart Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 895d55027f4ba2ecae05fb7529f0061432afe204 Author: James Smart Date: Mon Mar 1 09:18:06 2021 -0800 scsi: lpfc: Fix pt2pt connection does not recover after LOGO [ Upstream commit bd4f5100424d17d4e560d6653902ef8e49b2fc1f ] On a pt2pt setup, between 2 initiators, if one side issues a a LOGO, there is no relogin attempt. The FC specs are grey in this area on which port (higher wwn or not) is to re-login. As there is no spec guidance, unconditionally re-PLOGI after the logout to ensure a login is re-established. Link: https://lore.kernel.org/r/20210301171821.3427-8-jsmart2021@gmail.com Co-developed-by: Dick Kennedy Signed-off-by: Dick Kennedy Signed-off-by: James Smart Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 3224d515e4d9c0a00c1a7a359028b074ee12aad2 Author: James Smart Date: Mon Mar 1 09:18:00 2021 -0800 scsi: lpfc: Fix incorrect dbde assignment when building target abts wqe [ Upstream commit 9302154c07bff4e7f7f43c506a1ac84540303d06 ] The wqe_dbde field indicates whether a Data BDE is present in Words 0:2 and should therefore should be clear in the abts request wqe. By setting the bit we can be misleading fw into error cases. Clear the wqe_dbde field. Link: https://lore.kernel.org/r/20210301171821.3427-2-jsmart2021@gmail.com Co-developed-by: Dick Kennedy Signed-off-by: Dick Kennedy Signed-off-by: James Smart Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit b9694aef9b1acca1fe885628ee1f091779dde211 Author: Lee Jones Date: Wed Mar 3 13:42:36 2021 +0000 drm/amd/display/dc/dce/dce_aux: Remove duplicate line causing 'field overwritten' issue [ Upstream commit 3e3527f5b765c6f479ba55e5a570ee9538589a74 ] Fixes the following W=1 kernel build warning(s): In file included from drivers/gpu/drm/amd/amdgpu/../display/dc/dce112/dce112_resource.c:59: drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_11_2_sh_mask.h:10014:58: warning: initialized field overwritten [-Woverride-init] drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.h:214:16: note: in expansion of macro ‘AUX_SW_DATA__AUX_SW_AUTOINCREMENT_DISABLE__SHIFT’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.h:127:2: note: in expansion of macro ‘AUX_SF’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce112/dce112_resource.c:177:2: note: in expansion of macro ‘DCE_AUX_MASK_SH_LIST’ drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_11_2_sh_mask.h:10014:58: note: (near initialization for ‘aux_shift.AUX_SW_AUTOINCREMENT_DISABLE’) drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.h:214:16: note: in expansion of macro ‘AUX_SW_DATA__AUX_SW_AUTOINCREMENT_DISABLE__SHIFT’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.h:127:2: note: in expansion of macro ‘AUX_SF’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce112/dce112_resource.c:177:2: note: in expansion of macro ‘DCE_AUX_MASK_SH_LIST’ drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_11_2_sh_mask.h:10013:56: warning: initialized field overwritten [-Woverride-init] drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.h:214:16: note: in expansion of macro ‘AUX_SW_DATA__AUX_SW_AUTOINCREMENT_DISABLE_MASK’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.h:127:2: note: in expansion of macro ‘AUX_SF’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce112/dce112_resource.c:181:2: note: in expansion of macro ‘DCE_AUX_MASK_SH_LIST’ drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_11_2_sh_mask.h:10013:56: note: (near initialization for ‘aux_mask.AUX_SW_AUTOINCREMENT_DISABLE’) drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.h:214:16: note: in expansion of macro ‘AUX_SW_DATA__AUX_SW_AUTOINCREMENT_DISABLE_MASK’ drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.h:127:2: note: in expansion of macro ‘AUX_SF’ Cc: Harry Wentland Cc: Leo Li Cc: Alex Deucher Cc: "Christian König" Cc: David Airlie Cc: Daniel Vetter Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit a586b255e8e9c1c56a2960414a6f2dd328032bfe Author: Xiaogang Chen Date: Thu Feb 25 12:06:34 2021 -0500 drm/amdgpu/display: buffer INTERRUPT_LOW_IRQ_CONTEXT interrupt work [ Upstream commit b6f91fc183f758461b9462cc93e673adbbf95c2d ] amdgpu DM handles INTERRUPT_LOW_IRQ_CONTEXT interrupt(hpd, hpd_rx) by using work queue and uses single work_struct. If new interrupt is recevied before the previous handler finished, new interrupts(same type) will be discarded and driver just sends "amdgpu_dm_irq_schedule_work FAILED" message out. If some important hpd, hpd_rx related interrupts are missed by driver the hot (un)plug devices may cause system hang or instability, such as issues with system resume from S3 sleep with mst device connected. This patch dynamically allocates new amdgpu_dm_irq_handler_data for new interrupts if previous INTERRUPT_LOW_IRQ_CONTEXT interrupt work has not been handled. So the new interrupt works can be queued to the same workqueue_struct, instead of discard the new interrupts. All allocated amdgpu_dm_irq_handler_data are put into a single linked list and will be reused after. Signed-off-by: Xiaogang Chen Reviewed-by: Aurabindo Pillai Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 34ef003c04d38482bc6c8d55b365af77e4a70a85 Author: Wyatt Wood Date: Fri Feb 19 12:21:47 2021 -0500 drm/amd/display: Return invalid state if GPINT times out [ Upstream commit 8039bc7130ef4206a58e4dc288621bc97eba08eb ] [Why] GPINT timeout is causing PSR_STATE_0 to be returned when it shouldn't. We must guarantee that PSR is fully disabled before doing hw programming on driver-side. [How] Return invalid state if GPINT command times out. Let existing retry logic send the GPINT until successful. Tested-by: Daniel Wheeler Signed-off-by: Wyatt Wood Reviewed-by: Anthony Koo Acked-by: Rodrigo Siqueira Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 26a5506faacb65b06ba0d072263f78fef223703d Author: Aric Cyr Date: Fri Feb 12 18:13:59 2021 -0500 drm/amd/display: Don't optimize bandwidth before disabling planes [ Upstream commit 6ad98e8aeb0106f453bb154933e8355849244990 ] [Why] There is a window of time where we optimize bandwidth due to no streams enabled will enable PSTATE changing but HUBPs are not disabled yet. This results in underflow counter increasing in some hotplug scenarios. [How] Set the optimize-bandwidth flag for later processing once all the HUBPs are properly disabled. Signed-off-by: Aric Cyr Acked-by: Bindu Ramamurthy Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit e9b75f1e5ee0fd9a42049da5e3f0738dea014b6a Author: Eryk Brol Date: Tue Feb 9 17:09:52 2021 -0500 drm/amd/display: Check for DSC support instead of ASIC revision [ Upstream commit 349a19b2f1b01e713268c7de9944ad669ccdf369 ] [why] This check for ASIC revision is no longer useful and causes lightup issues after a topology change in MST DSC scenario. In this case, DSC configs should be recalculated for the new topology. This check prevented that from happening on certain ASICs that do, in fact, support DSC. [how] Change the ASIC revision to instead check if DSC is supported. Signed-off-by: Eryk Brol Acked-by: Bindu Ramamurthy Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 2cf0e67da6467533636091760d1a4ab9491bc8a1 Author: Tong Zhang Date: Sun Feb 21 21:33:22 2021 -0500 drm/ast: fix memory leak when unload the driver [ Upstream commit dc739820ff90acccd013f6bb420222978a982791 ] a connector is leaked upon module unload, it seems that we should do similar to sample driver as suggested in drm_drv.c. Adding drm_atomic_helper_shutdown() in ast_pci_remove to prevent leaking. [ 153.822134] WARNING: CPU: 0 PID: 173 at drivers/gpu/drm/drm_mode_config.c:504 drm_mode_config_cle0 [ 153.822698] Modules linked in: ast(-) drm_vram_helper drm_ttm_helper ttm [last unloaded: ttm] [ 153.823197] CPU: 0 PID: 173 Comm: modprobe Tainted: G W 5.11.0-03615-g55f62bc873474 [ 153.823708] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-4 [ 153.824333] RIP: 0010:drm_mode_config_cleanup+0x418/0x470 [ 153.824637] Code: 0c 00 00 00 00 48 8b 84 24 a8 00 00 00 65 48 33 04 25 28 00 00 00 75 65 48 81 c0 [ 153.825668] RSP: 0018:ffff888103c9fb70 EFLAGS: 00010212 [ 153.825962] RAX: ffff888102b0d100 RBX: ffff888102b0c298 RCX: ffffffff818d8b2b [ 153.826356] RDX: dffffc0000000000 RSI: 000000007fffffff RDI: ffff888102b0c298 [ 153.826748] RBP: ffff888103c9fba0 R08: 0000000000000001 R09: ffffed1020561857 [ 153.827146] R10: ffff888102b0c2b7 R11: ffffed1020561856 R12: ffff888102b0c000 [ 153.827538] R13: ffff888102b0c2d8 R14: ffff888102b0c2d8 R15: 1ffff11020793f70 [ 153.827935] FS: 00007f24bff456a0(0000) GS:ffff88815b400000(0000) knlGS:0000000000000000 [ 153.828380] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 153.828697] CR2: 0000000001c39018 CR3: 0000000103c90000 CR4: 00000000000006f0 [ 153.829096] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 153.829486] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 153.829883] Call Trace: [ 153.830024] ? drmm_mode_config_init+0x930/0x930 [ 153.830281] ? cpumask_next+0x16/0x20 [ 153.830488] ? mnt_get_count+0x66/0x80 [ 153.830699] ? drm_mode_config_cleanup+0x470/0x470 [ 153.830972] drm_managed_release+0xed/0x1c0 [ 153.831208] drm_dev_release+0x3a/0x50 [ 153.831420] release_nodes+0x39e/0x410 [ 153.831631] ? devres_release+0x40/0x40 [ 153.831852] device_release_driver_internal+0x158/0x270 [ 153.832143] driver_detach+0x76/0xe0 [ 153.832344] bus_remove_driver+0x7e/0x100 [ 153.832568] pci_unregister_driver+0x28/0xf0 [ 153.832821] __x64_sys_delete_module+0x268/0x300 [ 153.833086] ? __ia32_sys_delete_module+0x300/0x300 [ 153.833357] ? call_rcu+0x372/0x4f0 [ 153.833553] ? fpregs_assert_state_consistent+0x4d/0x60 [ 153.833840] ? exit_to_user_mode_prepare+0x2f/0x130 [ 153.834118] do_syscall_64+0x33/0x40 [ 153.834317] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 153.834597] RIP: 0033:0x7f24bfec7cf7 [ 153.834797] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 1d a0 02 00 48 89 f8 48 89 f7 48 89 d6 41 [ 153.835812] RSP: 002b:00007fff72e6cb58 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0 [ 153.836234] RAX: ffffffffffffffda RBX: 00007f24bff45690 RCX: 00007f24bfec7cf7 [ 153.836623] RDX: 00000000ffffffff RSI: 0000000000000080 RDI: 0000000001c2fb10 [ 153.837018] RBP: 0000000001c2fac0 R08: 2f2f2f2f2f2f2f2f R09: 0000000001c2fac0 [ 153.837408] R10: fefefefefefefeff R11: 0000000000000202 R12: 0000000001c2fac0 [ 153.837798] R13: 0000000001c2f9d0 R14: 0000000000000000 R15: 0000000000000001 [ 153.838194] ---[ end trace b92031513bbe596c ]--- [ 153.838441] [drm:drm_mode_config_cleanup] *ERROR* connector VGA-1 leaked! Signed-off-by: Tong Zhang Signed-off-by: Thomas Zimmermann Link: https://patchwork.freedesktop.org/patch/msgid/20210222023322.984885-1-ztong0001@gmail.com Signed-off-by: Sasha Levin commit b76edf81bf8596ba557636edec7b94f94ea5bac3 Author: Huang Rui Date: Wed Feb 10 12:27:43 2021 +0800 drm/amd/pm: do not issue message while write "r" into pp_od_clk_voltage [ Upstream commit ca1203d7d7295c49e5707d7def457bdc524a8edb ] We should commit the value after restore them back to default as well. $ echo "r" > pp_od_clk_voltage $ echo "c" > pp_od_clk_voltage Signed-off-by: Huang Rui Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 7b1d0b1ddeb99d154a93ce6f9596e1a0b70931ff Author: Nicholas Kazlauskas Date: Wed Feb 3 10:11:30 2021 -0500 drm/amd/display: Fix MPC OGAM power on/off sequence [ Upstream commit 737b2b536a30a467c405d75f2287e17828838a13 ] [Why] Color corruption can occur on bootup into a login manager that applies a non-linear gamma LUT because the LUT may not actually be powered on before writing. It's cleared on the next full pipe reprogramming as we switch to LUTB from LUTA and the pipe accessing the LUT has taken it out of light sleep mode. [How] The MPCC_OGAM_MEM_PWR_FORCE register does not force the current power mode when set to 0. It only forces when set light sleep, deep sleep or shutdown. The register to actually force power on and ignore sleep modes is MPCC_OGAM_MEM_PWR_DIS - a value of 0 will enable power requests and a value of 1 will disable them. When PWR_FORCE!=0 is combined with PWR_DIS=0 then MPCC OGAM memory is forced into the state specified by the force bits. If PWR_FORCE is 0 then it respects the mode specified by MPCC_OGAM_MEM_LOW_PWR_MODE if the RAM LUT is not in use. We set that bit to shutdown on low power, but otherwise it inherits from bootup defaults. So for the fix: 1. Update the sequence to "force" power on when needed We can use MPCC_OGAM_MEM_PWR_DIS for this to turn on the memory even when the block is in bypass and pending to be enabled for the next frame. We need this for both low power enabled or disabled. If we don't set this then we can run into issues when we first program the LUT from bootup. 2. Don't apply FORCE_SEL Once we enable power requests with DIS=0 we run into the issue of the RAM being forced into light sleep and being unusable for display output. Leave this 0 like we used to for DCN20. 3. Rely on MPCC OGAM init to determine light sleep/deep sleep MPC low power debug mode isn't enabled on any ASIC currently but we'll respect the setting determined during init if it is. Lightly tested as working with IGT tests and desktop color adjustment. 4. Change the MPC resource default for DCN30 It was interleaving the dcn20 and dcn30 versions before depending on the sequence. 5. REG_WAIT for it to be on whenever we're powering up the memory Otherwise we can write register values too early and we'll get corruption. Signed-off-by: Nicholas Kazlauskas Reviewed-by: Eric Yang Acked-by: Qingqing Zhuo Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 698b8222e590521b6028639f1a003ca98cba91d3 Author: Martin Leung Date: Tue Feb 2 16:28:05 2021 -0500 drm/amd/display: changing sr exit latency [ Upstream commit efe213e5a57e0cd92fa4f328dc1963d330549982 ] [Why] Hardware team remeasured, need to update timings to increase latency slightly and avoid intermittent underflows. [How] sr exit latency update. Signed-off-by: Martin Leung Reviewed-by: Alvin Lee Acked-by: Qingqing Zhuo Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 6c4e74b56e934cb3077c8935b3567dac8e18179c Author: Thomas Zimmermann Date: Tue Feb 9 14:46:24 2021 +0100 drm/ast: Fix invalid usage of AST_MAX_HWC_WIDTH in cursor atomic_check [ Upstream commit ee4a92d690f30f3793df942939726bec0338e65b ] Use AST_MAX_HWC_HEIGHT for setting offset_y in the cursor plane's atomic_check. The code used AST_MAX_HWC_WIDTH instead. This worked because both constants has the same value. Signed-off-by: Thomas Zimmermann Acked-by: Gerd Hoffmann Link: https://patchwork.freedesktop.org/patch/msgid/20210209134632.12157-3-tzimmermann@suse.de Signed-off-by: Sasha Levin commit 1d3ed1ec0a6b61945d866c778e7e161b8cfc5da9 Author: Gerd Hoffmann Date: Thu Feb 4 15:57:06 2021 +0100 drm/qxl: release shadow on shutdown [ Upstream commit 4ca77c513537700d3fae69030879f781dde1904c ] In case we have a shadow surface on shutdown release it so it doesn't leak. Signed-off-by: Gerd Hoffmann Acked-by: Thomas Zimmermann Link: http://patchwork.freedesktop.org/patch/msgid/20210204145712.1531203-6-kraxel@redhat.com Signed-off-by: Sasha Levin commit e7272a21ef6c5d73d23dbaa8a06aa5860838a187 Author: Tong Zhang Date: Tue Feb 2 23:07:27 2021 -0500 drm/qxl: do not run release if qxl failed to init [ Upstream commit b91907a6241193465ca92e357adf16822242296d ] if qxl_device_init() fail, drm device will not be registered, in this case, do not run qxl_drm_release() [ 5.258534] ================================================================== [ 5.258931] BUG: KASAN: user-memory-access in qxl_destroy_monitors_object+0x42/0xa0 [qxl] [ 5.259388] Write of size 8 at addr 00000000000014dc by task modprobe/95 [ 5.259754] [ 5.259842] CPU: 0 PID: 95 Comm: modprobe Not tainted 5.11.0-rc6-00007-g88bb507a74ea #62 [ 5.260309] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda54 [ 5.260917] Call Trace: [ 5.261056] dump_stack+0x7d/0xa3 [ 5.261245] kasan_report.cold+0x10c/0x10e [ 5.261475] ? qxl_destroy_monitors_object+0x42/0xa0 [qxl] [ 5.261789] check_memory_region+0x17c/0x1e0 [ 5.262029] qxl_destroy_monitors_object+0x42/0xa0 [qxl] [ 5.262332] qxl_modeset_fini+0x9/0x20 [qxl] [ 5.262595] qxl_drm_release+0x22/0x30 [qxl] [ 5.262841] drm_dev_release+0x32/0x50 [ 5.263047] release_nodes+0x39e/0x410 [ 5.263253] ? devres_release+0x40/0x40 [ 5.263462] really_probe+0x2ea/0x420 [ 5.263664] driver_probe_device+0x6d/0xd0 [ 5.263888] device_driver_attach+0x82/0x90 [ 5.264116] ? device_driver_attach+0x90/0x90 [ 5.264353] __driver_attach+0x60/0x100 [ 5.264563] ? device_driver_attach+0x90/0x90 [ 5.264801] bus_for_each_dev+0xe1/0x140 [ 5.265014] ? subsys_dev_iter_exit+0x10/0x10 [ 5.265251] ? klist_node_init+0x61/0x80 [ 5.265464] bus_add_driver+0x254/0x2a0 [ 5.265673] driver_register+0xd3/0x150 [ 5.265882] ? 0xffffffffc0048000 [ 5.266064] do_one_initcall+0x84/0x250 [ 5.266274] ? trace_event_raw_event_initcall_finish+0x150/0x150 [ 5.266596] ? unpoison_range+0xf/0x30 [ 5.266801] ? ____kasan_kmalloc.constprop.0+0x84/0xa0 [ 5.267082] ? unpoison_range+0xf/0x30 [ 5.267287] ? unpoison_range+0xf/0x30 [ 5.267491] do_init_module+0xf8/0x350 [ 5.267697] load_module+0x3fe6/0x4340 [ 5.267902] ? vm_unmap_ram+0x1d0/0x1d0 [ 5.268115] ? module_frob_arch_sections+0x20/0x20 [ 5.268375] ? __do_sys_finit_module+0x108/0x170 [ 5.268624] __do_sys_finit_module+0x108/0x170 [ 5.268865] ? __ia32_sys_init_module+0x40/0x40 [ 5.269111] ? file_open_root+0x200/0x200 [ 5.269330] ? do_sys_open+0x85/0xe0 [ 5.269527] ? filp_open+0x50/0x50 [ 5.269714] ? exit_to_user_mode_prepare+0xfc/0x130 [ 5.269978] do_syscall_64+0x33/0x40 [ 5.270176] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 5.270450] RIP: 0033:0x7fa3f685bcf7 [ 5.270646] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 1d a0 02 00 48 89 f8 48 89 f7 48 89 d1 [ 5.271634] RSP: 002b:00007ffca83048d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 5.272037] RAX: ffffffffffffffda RBX: 0000000001e94a70 RCX: 00007fa3f685bcf7 [ 5.272416] RDX: 0000000000000000 RSI: 0000000001e939e0 RDI: 0000000000000003 [ 5.272794] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001 [ 5.273171] R10: 00007fa3f68bf300 R11: 0000000000000246 R12: 0000000001e939e0 [ 5.273550] R13: 0000000000000000 R14: 0000000001e93bd0 R15: 0000000000000001 [ 5.273928] ================================================================== Signed-off-by: Tong Zhang Link: http://patchwork.freedesktop.org/patch/msgid/20210203040727.868921-1-ztong0001@gmail.com Signed-off-by: Gerd Hoffmann Signed-off-by: Sasha Levin commit 9ca845fb12c6bf586d7e1308ea72b442c90e7ad9 Author: Jared Baldridge Date: Wed Jan 20 12:56:26 2021 -0800 drm: Added orientation quirk for OneGX1 Pro [ Upstream commit 81ad7f9f78e4ff80e95be8282423f511b84f1166 ] The OneGX1 Pro has a fairly unique combination of generic strings, but we additionally match on the BIOS date just to be safe. Signed-off-by: Jared Baldridge Reviewed-by: Hans de Goede Signed-off-by: Hans de Goede Link: https://patchwork.freedesktop.org/patch/msgid/41288ccb-1012-486b-81c1-a24c31850c91@www.fastmail.com Signed-off-by: Sasha Levin commit c68956c429cdbc6a25b39585716c2f6a3a3253bc Author: Adam Ward Date: Wed Apr 21 12:03:06 2021 +0000 regulator: da9121: automotive variants identity fix [ Upstream commit 013592be146a10d3567c0062cd1416faab060704 ] This patch fixes identification of DA913x parts by the DA9121 driver, where a lack of clarity lead to implementation on the basis that variant IDs were to be identical to the equivalent rated non-automotive parts. There is a new emphasis on the DT identity to cope with overlap in these ID's - this is not considered to be problematic, because projects would be exclusively using automotive or consumer grade parts. Signed-off-by: Adam Ward Link: https://lore.kernel.org/r/20210421120306.DB5B880007F@slsrvapps-01.diasemi.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 9575200840832d832ddf08ce00cd336e7d218b52 Author: Josef Bacik Date: Fri Mar 12 15:25:21 2021 -0500 btrfs: convert logic BUG_ON()'s in replace_path to ASSERT()'s [ Upstream commit 7a9213a93546e7eaef90e6e153af6b8fc7553f10 ] A few BUG_ON()'s in replace_path are purely to keep us from making logical mistakes, so replace them with ASSERT()'s. Reviewed-by: Qu Wenruo Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 4cb0aea2e250eee35ccfac5f5395cd8f3238a9e5 Author: Josef Bacik Date: Fri Mar 12 15:25:20 2021 -0500 btrfs: do proper error handling in btrfs_update_reloc_root [ Upstream commit 592fbcd50c99b8adf999a2a54f9245caff333139 ] We call btrfs_update_root in btrfs_update_reloc_root, which can fail for all sorts of reasons, including IO errors. Instead of panicing the box lets return the error, now that all callers properly handle those errors. Reviewed-by: Qu Wenruo Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit aa7ccc79588b830af610b6d5de95a4a41f91566f Author: Josef Bacik Date: Fri Mar 12 15:25:14 2021 -0500 btrfs: do proper error handling in create_reloc_root [ Upstream commit 84c50ba5214c2f3c1be4a931d521ec19f55dfdc8 ] We do memory allocations here, read blocks from disk, all sorts of operations that could easily fail at any given point. Instead of panicing the box, simply return the error back up the chain, all callers at this point have proper error handling. Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 63d01008da136dc3de36de5852cc29c2b8694249 Author: Filipe Manana Date: Wed Mar 31 11:55:50 2021 +0100 btrfs: fix exhaustion of the system chunk array due to concurrent allocations [ Upstream commit eafa4fd0ad06074da8be4e28ff93b4dca9ffa407 ] When we are running out of space for updating the chunk tree, that is, when we are low on available space in the system space info, if we have many task concurrently allocating block groups, via fallocate for example, many of them can end up all allocating new system chunks when only one is needed. In extreme cases this can lead to exhaustion of the system chunk array, which has a size limit of 2048 bytes, and results in a transaction abort with errno EFBIG, producing a trace in dmesg like the following, which was triggered on a PowerPC machine with a node/leaf size of 64K: [1359.518899] ------------[ cut here ]------------ [1359.518980] BTRFS: Transaction aborted (error -27) [1359.519135] WARNING: CPU: 3 PID: 16463 at ../fs/btrfs/block-group.c:1968 btrfs_create_pending_block_groups+0x340/0x3c0 [btrfs] [1359.519152] Modules linked in: (...) [1359.519239] Supported: Yes, External [1359.519252] CPU: 3 PID: 16463 Comm: stress-ng Tainted: G X 5.3.18-47-default #1 SLE15-SP3 [1359.519274] NIP: c008000000e36fe8 LR: c008000000e36fe4 CTR: 00000000006de8e8 [1359.519293] REGS: c00000056890b700 TRAP: 0700 Tainted: G X (5.3.18-47-default) [1359.519317] MSR: 800000000282b033 CR: 48008222 XER: 00000007 [1359.519356] CFAR: c00000000013e170 IRQMASK: 0 [1359.519356] GPR00: c008000000e36fe4 c00000056890b990 c008000000e83200 0000000000000026 [1359.519356] GPR04: 0000000000000000 0000000000000000 0000d52a3b027651 0000000000000007 [1359.519356] GPR08: 0000000000000003 0000000000000001 0000000000000007 0000000000000000 [1359.519356] GPR12: 0000000000008000 c00000063fe44600 000000001015e028 000000001015dfd0 [1359.519356] GPR16: 000000000000404f 0000000000000001 0000000000010000 0000dd1e287affff [1359.519356] GPR20: 0000000000000001 c000000637c9a000 ffffffffffffffe5 0000000000000000 [1359.519356] GPR24: 0000000000000004 0000000000000000 0000000000000100 ffffffffffffffc0 [1359.519356] GPR28: c000000637c9a000 c000000630e09230 c000000630e091d8 c000000562188b08 [1359.519561] NIP [c008000000e36fe8] btrfs_create_pending_block_groups+0x340/0x3c0 [btrfs] [1359.519613] LR [c008000000e36fe4] btrfs_create_pending_block_groups+0x33c/0x3c0 [btrfs] [1359.519626] Call Trace: [1359.519671] [c00000056890b990] [c008000000e36fe4] btrfs_create_pending_block_groups+0x33c/0x3c0 [btrfs] (unreliable) [1359.519729] [c00000056890ba90] [c008000000d68d44] __btrfs_end_transaction+0xbc/0x2f0 [btrfs] [1359.519782] [c00000056890bae0] [c008000000e309ac] btrfs_alloc_data_chunk_ondemand+0x154/0x610 [btrfs] [1359.519844] [c00000056890bba0] [c008000000d8a0fc] btrfs_fallocate+0xe4/0x10e0 [btrfs] [1359.519891] [c00000056890bd00] [c0000000004a23b4] vfs_fallocate+0x174/0x350 [1359.519929] [c00000056890bd50] [c0000000004a3cf8] ksys_fallocate+0x68/0xf0 [1359.519957] [c00000056890bda0] [c0000000004a3da8] sys_fallocate+0x28/0x40 [1359.519988] [c00000056890bdc0] [c000000000038968] system_call_exception+0xe8/0x170 [1359.520021] [c00000056890be20] [c00000000000cb70] system_call_common+0xf0/0x278 [1359.520037] Instruction dump: [1359.520049] 7d0049ad 40c2fff4 7c0004ac 71490004 40820024 2f83fffb 419e0048 3c620000 [1359.520082] e863bcb8 7ec4b378 48010d91 e8410018 <0fe00000> 3c820000 e884bcc8 7ec6b378 [1359.520122] ---[ end trace d6c186e151022e20 ]--- The following steps explain how we can end up in this situation: 1) Task A is at check_system_chunk(), either because it is allocating a new data or metadata block group, at btrfs_chunk_alloc(), or because it is removing a block group or turning a block group RO. It does not matter why; 2) Task A sees that there is not enough free space in the system space_info object, that is 'left' is < 'thresh'. And at this point the system space_info has a value of 0 for its 'bytes_may_use' counter; 3) As a consequence task A calls btrfs_alloc_chunk() in order to allocate a new system block group (chunk) and then reserves 'thresh' bytes in the chunk block reserve with the call to btrfs_block_rsv_add(). This changes the chunk block reserve's 'reserved' and 'size' counters by an amount of 'thresh', and changes the 'bytes_may_use' counter of the system space_info object from 0 to 'thresh'. Also during its call to btrfs_alloc_chunk(), we end up increasing the value of the 'total_bytes' counter of the system space_info object by 8MiB (the size of a system chunk stripe). This happens through the call chain: btrfs_alloc_chunk() create_chunk() btrfs_make_block_group() btrfs_update_space_info() 4) After it finishes the first phase of the block group allocation, at btrfs_chunk_alloc(), task A unlocks the chunk mutex; 5) At this point the new system block group was added to the transaction handle's list of new block groups, but its block group item, device items and chunk item were not yet inserted in the extent, device and chunk trees, respectively. That only happens later when we call btrfs_finish_chunk_alloc() through a call to btrfs_create_pending_block_groups(); Note that only when we update the chunk tree, through the call to btrfs_finish_chunk_alloc(), we decrement the 'reserved' counter of the chunk block reserve as we COW/allocate extent buffers, through: btrfs_alloc_tree_block() btrfs_use_block_rsv() btrfs_block_rsv_use_bytes() And the system space_info's 'bytes_may_use' is decremented everytime we allocate an extent buffer for COW operations on the chunk tree, through: btrfs_alloc_tree_block() btrfs_reserve_extent() find_free_extent() btrfs_add_reserved_bytes() If we end up COWing less chunk btree nodes/leaves than expected, which is the typical case since the amount of space we reserve is always pessimistic to account for the worst possible case, we release the unused space through: btrfs_create_pending_block_groups() btrfs_trans_release_chunk_metadata() btrfs_block_rsv_release() block_rsv_release_bytes() btrfs_space_info_free_bytes_may_use() But before task A gets into btrfs_create_pending_block_groups()... 6) Many other tasks start allocating new block groups through fallocate, each one does the first phase of block group allocation in a serialized way, since btrfs_chunk_alloc() takes the chunk mutex before calling check_system_chunk() and btrfs_alloc_chunk(). However before everyone enters the final phase of the block group allocation, that is, before calling btrfs_create_pending_block_groups(), new tasks keep coming to allocate new block groups and while at check_system_chunk(), the system space_info's 'bytes_may_use' keeps increasing each time a task reserves space in the chunk block reserve. This means that eventually some other task can end up not seeing enough free space in the system space_info and decide to allocate yet another system chunk. This may repeat several times if yet more new tasks keep allocating new block groups before task A, and all the other tasks, finish the creation of the pending block groups, which is when reserved space in excess is released. Eventually this can result in exhaustion of system chunk array in the superblock, with btrfs_add_system_chunk() returning EFBIG, resulting later in a transaction abort. Even when we don't reach the extreme case of exhausting the system array, most, if not all, unnecessarily created system block groups end up being unused since when finishing creation of the first pending system block group, the creation of the following ones end up not needing to COW nodes/leaves of the chunk tree, so we never allocate and deallocate from them, resulting in them never being added to the list of unused block groups - as a consequence they don't get deleted by the cleaner kthread - the only exceptions are if we unmount and mount the filesystem again, which adds any unused block groups to the list of unused block groups, if a scrub is run, which also adds unused block groups to the unused list, and under some circumstances when using a zoned filesystem or async discard, which may also add unused block groups to the unused list. So fix this by: *) Tracking the number of reserved bytes for the chunk tree per transaction, which is the sum of reserved chunk bytes by each transaction handle currently being used; *) When there is not enough free space in the system space_info, if there are other transaction handles which reserved chunk space, wait for some of them to complete in order to have enough excess reserved space released, and then try again. Otherwise proceed with the creation of a new system chunk. Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 7c124584b3c0b2eb7802482a44d4981210fa7a96 Author: Filipe Manana Date: Tue Feb 23 12:08:48 2021 +0000 btrfs: fix race between marking inode needs to be logged and log syncing [ Upstream commit bc0939fcfab0d7efb2ed12896b1af3d819954a14 ] We have a race between marking that an inode needs to be logged, either at btrfs_set_inode_last_trans() or at btrfs_page_mkwrite(), and between btrfs_sync_log(). The following steps describe how the race happens. 1) We are at transaction N; 2) Inode I was previously fsynced in the current transaction so it has: inode->logged_trans set to N; 3) The inode's root currently has: root->log_transid set to 1 root->last_log_commit set to 0 Which means only one log transaction was committed to far, log transaction 0. When a log tree is created we set ->log_transid and ->last_log_commit of its parent root to 0 (at btrfs_add_log_tree()); 4) One more range of pages is dirtied in inode I; 5) Some task A starts an fsync against some other inode J (same root), and so it joins log transaction 1. Before task A calls btrfs_sync_log()... 6) Task B starts an fsync against inode I, which currently has the full sync flag set, so it starts delalloc and waits for the ordered extent to complete before calling btrfs_inode_in_log() at btrfs_sync_file(); 7) During ordered extent completion we have btrfs_update_inode() called against inode I, which in turn calls btrfs_set_inode_last_trans(), which does the following: spin_lock(&inode->lock); inode->last_trans = trans->transaction->transid; inode->last_sub_trans = inode->root->log_transid; inode->last_log_commit = inode->root->last_log_commit; spin_unlock(&inode->lock); So ->last_trans is set to N and ->last_sub_trans set to 1. But before setting ->last_log_commit... 8) Task A is at btrfs_sync_log(): - it increments root->log_transid to 2 - starts writeback for all log tree extent buffers - waits for the writeback to complete - writes the super blocks - updates root->last_log_commit to 1 It's a lot of slow steps between updating root->log_transid and root->last_log_commit; 9) The task doing the ordered extent completion, currently at btrfs_set_inode_last_trans(), then finally runs: inode->last_log_commit = inode->root->last_log_commit; spin_unlock(&inode->lock); Which results in inode->last_log_commit being set to 1. The ordered extent completes; 10) Task B is resumed, and it calls btrfs_inode_in_log() which returns true because we have all the following conditions met: inode->logged_trans == N which matches fs_info->generation && inode->last_subtrans (1) <= inode->last_log_commit (1) && inode->last_subtrans (1) <= root->last_log_commit (1) && list inode->extent_tree.modified_extents is empty And as a consequence we return without logging the inode, so the existing logged version of the inode does not point to the extent that was written after the previous fsync. It should be impossible in practice for one task be able to do so much progress in btrfs_sync_log() while another task is at btrfs_set_inode_last_trans() right after it reads root->log_transid and before it reads root->last_log_commit. Even if kernel preemption is enabled we know the task at btrfs_set_inode_last_trans() can not be preempted because it is holding the inode's spinlock. However there is another place where we do the same without holding the spinlock, which is in the memory mapped write path at: vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf) { (...) BTRFS_I(inode)->last_trans = fs_info->generation; BTRFS_I(inode)->last_sub_trans = BTRFS_I(inode)->root->log_transid; BTRFS_I(inode)->last_log_commit = BTRFS_I(inode)->root->last_log_commit; (...) So with preemption happening after setting ->last_sub_trans and before setting ->last_log_commit, it is less of a stretch to have another task do enough progress at btrfs_sync_log() such that the task doing the memory mapped write ends up with ->last_sub_trans and ->last_log_commit set to the same value. It is still a big stretch to get there, as the task doing btrfs_sync_log() has to start writeback, wait for its completion and write the super blocks. So fix this in two different ways: 1) For btrfs_set_inode_last_trans(), simply set ->last_log_commit to the value of ->last_sub_trans minus 1; 2) For btrfs_page_mkwrite() only set the inode's ->last_sub_trans, just like we do for buffered and direct writes at btrfs_file_write_iter(), which is all we need to make sure multiple writes and fsyncs to an inode in the same transaction never result in an fsync missing that the inode changed and needs to be logged. Turn this into a helper function and use it both at btrfs_page_mkwrite() and at btrfs_file_write_iter() - this also fixes the problem that at btrfs_page_mkwrite() we were setting those fields without the protection of the inode's spinlock. This is an extremely unlikely race to happen in practice. Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 7bc617ca21ed7089adf1045c1c02a5493d6859ad Author: Josef Bacik Date: Wed Feb 10 17:14:34 2021 -0500 btrfs: use btrfs_inode_lock/btrfs_inode_unlock inode lock helpers [ Upstream commit 64708539cd23b31d0f235a2c12a0cf782f95908a ] A few places we intermix btrfs_inode_lock with a inode_unlock, and some places we just use inode_lock/inode_unlock instead of btrfs_inode_lock. None of these places are using this incorrectly, but as we adjust some of these callers it would be nice to keep everything consistent, so convert everybody to use btrfs_inode_lock/btrfs_inode_unlock. Reviewed-by: Filipe Manana Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 4f2fee1e17efc63e5cd60bb40f6134d45eada6f9 Author: David Bauer Date: Fri Apr 16 21:59:56 2021 +0200 spi: sync up initial chipselect state [ Upstream commit d347b4aaa1a042ea528e385d9070b74c77a14321 ] When initially probing the SPI slave device, the call for disabling an SPI device without the SPI_CS_HIGH flag is not applied, as the condition for checking whether or not the state to be applied equals the one currently set evaluates to true. This however might not necessarily be the case, as the chipselect might be active. Add a force flag to spi_set_cs which allows to override this early exit condition. Set it to false everywhere except when called from spi_setup to sync up the initial CS state. Fixes commit d40f0b6f2e21 ("spi: Avoid setting the chip select if we don't need to") Signed-off-by: David Bauer Link: https://lore.kernel.org/r/20210416195956.121811-1-mail@david-bauer.net Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit aa7ce33cb69143eaaea1a4f9631468d050926240 Author: David E. Box Date: Fri Apr 16 20:12:44 2021 -0700 platform/x86: intel_pmc_core: Don't use global pmcdev in quirks [ Upstream commit c9f86d6ca6b5e23d30d16ade4b9fff5b922a610a ] The DMI callbacks, used for quirks, currently access the PMC by getting the address a global pmc_dev struct. Instead, have the callbacks set a global quirk specific variable. In probe, after calling dmi_check_system(), pass pmc_dev to a function that will handle each quirk if its variable condition is met. This allows removing the global pmc_dev later. Signed-off-by: David E. Box Reviewed-by: Hans de Goede Reviewed-by: Rajneesh Bhardwaj Link: https://lore.kernel.org/r/20210417031252.3020837-2-david.e.box@linux.intel.com Signed-off-by: Hans de Goede Signed-off-by: Sasha Levin commit 58f2c4df081f1f0b443e8373698b4eb357aa23b6 Author: Shixin Liu Date: Thu Apr 8 15:18:39 2021 +0800 crypto: omap-aes - Fix PM reference leak on omap-aes.c [ Upstream commit 1f34cc4a8da34fbb250efb928f9b8c6fe7ee0642 ] pm_runtime_get_sync will increment pm usage counter even it failed. Forgetting to putting operation will result in reference leak here. Fix it by replacing it with pm_runtime_resume_and_get to keep usage counter balanced. Signed-off-by: Shixin Liu Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 728936b641fcb43b494b63f22b0fbac94229c5f3 Author: Shixin Liu Date: Thu Apr 8 15:18:37 2021 +0800 crypto: sa2ul - Fix PM reference leak in sa_ul_probe() [ Upstream commit 13343badae093977295341d5a050f51ef128821c ] pm_runtime_get_sync will increment pm usage counter even it failed. Forgetting to putting operation will result in reference leak here. Fix it by replacing it with pm_runtime_resume_and_get to keep usage counter balanced. Signed-off-by: Shixin Liu Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit ef8b29f1e948e5eee9581d7ab96cd0732f06cfc5 Author: Shixin Liu Date: Thu Apr 8 15:18:36 2021 +0800 crypto: stm32/cryp - Fix PM reference leak on stm32-cryp.c [ Upstream commit 747bf30fd944f02f341b5f3bc7d97a13f2ae2fbe ] pm_runtime_get_sync will increment pm usage counter even it failed. Forgetting to putting operation will result in reference leak here. Fix it by replacing it with pm_runtime_resume_and_get to keep usage counter balanced. Signed-off-by: Shixin Liu Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 994169b0d7bff44320ae808e08456bbb5fa97b05 Author: Shixin Liu Date: Thu Apr 8 15:18:35 2021 +0800 crypto: stm32/hash - Fix PM reference leak on stm32-hash.c [ Upstream commit 1cb3ad701970e68f18a9e5d090baf2b1b703d729 ] pm_runtime_get_sync will increment pm usage counter even it failed. Forgetting to putting operation will result in reference leak here. Fix it by replacing it with pm_runtime_resume_and_get to keep usage counter balanced. Signed-off-by: Shixin Liu Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 3de2b604df544151249f8cb0d439f7a555f04e8f Author: Shixin Liu Date: Thu Apr 8 15:18:33 2021 +0800 crypto: sun8i-ce - Fix PM reference leak in sun8i_ce_probe() [ Upstream commit cc987ae9150c255352660d235ab27c834aa527be ] pm_runtime_get_sync will increment pm usage counter even it failed. Forgetting to putting operation will result in reference leak here. Fix it by replacing it with pm_runtime_resume_and_get to keep usage counter balanced. Signed-off-by: Shixin Liu Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 06871eb9a326d6041920a0338512e9746c878afc Author: Shixin Liu Date: Thu Apr 8 15:18:32 2021 +0800 crypto: sun8i-ss - Fix PM reference leak when pm_runtime_get_sync() fails [ Upstream commit 06cd7423cf451d68bfab289278d7890c9ae01a14 ] pm_runtime_get_sync will increment pm usage counter even it failed. Forgetting to putting operation will result in reference leak here. Fix it by replacing it with pm_runtime_resume_and_get to keep usage counter balanced. Signed-off-by: Shixin Liu Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 077a90e774c15f037bcda73f30e679c779d64e2f Author: Shixin Liu Date: Thu Apr 8 15:18:31 2021 +0800 crypto: sun4i-ss - Fix PM reference leak when pm_runtime_get_sync() fails [ Upstream commit ac98fc5e1c321112dab9ccac9df892c154540f5d ] pm_runtime_get_sync will increment pm usage counter even it failed. Forgetting to putting operation will result in reference leak here. Fix it by replacing it with pm_runtime_resume_and_get to keep usage counter balanced. Signed-off-by: Shixin Liu Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit fae3d0a14dfe6030685d2069614ef4084678176c Author: Yang Yingliang Date: Wed Apr 7 17:27:16 2021 +0800 phy: phy-twl4030-usb: Fix possible use-after-free in twl4030_usb_remove() [ Upstream commit e1723d8b87b73ab363256e7ca3af3ddb75855680 ] This driver's remove path calls cancel_delayed_work(). However, that function does not wait until the work function finishes. This means that the callback function may still be running after the driver's remove function has finished, which would result in a use-after-free. Fix by calling cancel_delayed_work_sync(), which ensures that the work is properly cancelled, no longer running, and unable to re-schedule itself. Reported-by: Hulk Robot Signed-off-by: Yang Yingliang Link: https://lore.kernel.org/r/20210407092716.3270248-1-yangyingliang@huawei.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 99ef43fde005d752c46ae826b9ad9ef3accdc6cf Author: Pavel Machek Date: Wed Apr 14 20:12:49 2021 +0300 intel_th: Consistency and off-by-one fix [ Upstream commit 18ffbc47d45a1489b664dd68fb3a7610a6e1dea3 ] Consistently use "< ... +1" in for loops. Fix of-by-one in for_each_set_bit(). Signed-off-by: Pavel Machek Signed-off-by: Alexander Shishkin Link: https://lore.kernel.org/lkml/20190724095841.GA6952@amd/ Reviewed-by: Andy Shevchenko Link: https://lore.kernel.org/r/20210414171251.14672-6-alexander.shishkin@linux.intel.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 8293c3d6032083a5ef12b0686688b7901dd95b9b Author: Hillf Danton Date: Mon Apr 12 11:57:58 2021 +0800 tty: n_gsm: check error while registering tty devices [ Upstream commit 0a360e8b65d62fe1a994f0a8da4f8d20877b2100 ] Add the error path for registering tty devices and roll back in case of error in bid to avoid the UAF like the below one reported. Plus syzbot reported general protection fault in cdev_del() on Sep 24, 2020 and both cases are down to the kobject_put() in tty_cdev_add(). ------------[ cut here ]------------ refcount_t: underflow; use-after-free. WARNING: CPU: 1 PID: 8923 at lib/refcount.c:28 refcount_warn_saturate+0x1cf/0x210 -origin/lib/refcount.c:28 Modules linked in: CPU: 1 PID: 8923 Comm: executor Not tainted 5.12.0-rc5+ #8 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:refcount_warn_saturate+0x1cf/0x210 -origin/lib/refcount.c:28 Code: 4f ff ff ff e8 32 fa b5 fe 48 c7 c7 3d f8 f6 86 e8 d6 ab c6 fe c6 05 7c 34 67 04 01 48 c7 c7 68 f8 6d 86 31 c0 e8 81 2e 9d fe <0f> 0b e9 22 ff ff ff e8 05 fa b5 fe 48 c7 c7 3e f8 f6 86 e8 a9 ab RSP: 0018:ffffc90001633c60 EFLAGS: 00010246 RAX: 15d08b2e34b77800 RBX: 0000000000000003 RCX: ffff88804c056c80 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000003 R08: ffffffff813767aa R09: 0001ffffffffffff R10: 0001ffffffffffff R11: ffff88804c056c80 R12: ffff888040b7d000 R13: ffff88804c206938 R14: ffff88804c206900 R15: ffff888041b18488 FS: 00000000022c9940(0000) GS:ffff88807ec00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f9f9b122008 CR3: 0000000044b4b000 CR4: 0000000000750ee0 PKRU: 55555554 Call Trace: __refcount_sub_and_test -origin/./include/linux/refcount.h:283 [inline] __refcount_dec_and_test -origin/./include/linux/refcount.h:315 [inline] refcount_dec_and_test -origin/./include/linux/refcount.h:333 [inline] kref_put -origin/./include/linux/kref.h:64 [inline] kobject_put+0x17b/0x180 -origin/lib/kobject.c:753 cdev_del+0x4b/0x50 -origin/fs/char_dev.c:597 tty_unregister_device+0x99/0xd0 -origin/drivers/tty/tty_io.c:3343 gsmld_detach_gsm -origin/drivers/tty/n_gsm.c:2409 [inline] gsmld_close+0x6c/0x140 -origin/drivers/tty/n_gsm.c:2478 tty_ldisc_close -origin/drivers/tty/tty_ldisc.c:488 [inline] tty_ldisc_kill -origin/drivers/tty/tty_ldisc.c:636 [inline] tty_ldisc_release+0x1b6/0x400 -origin/drivers/tty/tty_ldisc.c:809 tty_release_struct+0x19/0xb0 -origin/drivers/tty/tty_io.c:1714 tty_release+0x9ad/0xa00 -origin/drivers/tty/tty_io.c:1885 __fput+0x260/0x4e0 -origin/fs/file_table.c:280 ____fput+0x11/0x20 -origin/fs/file_table.c:313 task_work_run+0x8e/0x110 -origin/kernel/task_work.c:140 tracehook_notify_resume -origin/./include/linux/tracehook.h:189 [inline] exit_to_user_mode_loop -origin/kernel/entry/common.c:174 [inline] exit_to_user_mode_prepare+0x16b/0x1a0 -origin/kernel/entry/common.c:208 __syscall_exit_to_user_mode_work -origin/kernel/entry/common.c:290 [inline] syscall_exit_to_user_mode+0x20/0x40 -origin/kernel/entry/common.c:301 do_syscall_64+0x45/0x80 -origin/arch/x86/entry/common.c:56 entry_SYSCALL_64_after_hwframe+0x44/0xae Reported-by: syzbot+c49fe6089f295a05e6f8@syzkaller.appspotmail.com Reported-and-tested-by: Hao Sun Signed-off-by: Hillf Danton Link: https://lore.kernel.org/r/20210412035758.1974-1-hdanton@sina.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit ad9f7e48c5e06fb18d1cbccabed0c5ae078a2713 Author: Thinh Nguyen Date: Tue Apr 13 19:13:18 2021 -0700 usb: dwc3: gadget: Check for disabled LPM quirk [ Upstream commit 475e8be53d0496f9bc6159f4abb3ff5f9b90e8de ] If the device doesn't support LPM, make sure to disable the LPM capability and don't advertise to the host that it supports it. Acked-by: Felipe Balbi Signed-off-by: Thinh Nguyen Link: https://lore.kernel.org/r/9e68527ff932b1646f92a7593d4092a903754666.1618366071.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 26f4ea0e0ca4ff532d256fc5f09bb2f4c5e87a5c Author: Bixuan Cui Date: Thu Apr 8 21:08:31 2021 +0800 usb: core: hub: Fix PM reference leak in usb_port_resume() [ Upstream commit 025f97d188006eeee4417bb475a6878d1e0eed3f ] pm_runtime_get_sync will increment pm usage counter even it failed. thus a pairing decrement is needed. Fix it by replacing it with pm_runtime_resume_and_get to keep usage counter balanced. Reported-by: Hulk Robot Signed-off-by: Bixuan Cui Link: https://lore.kernel.org/r/20210408130831.56239-1-cuibixuan@huawei.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit e63746548f840638b692ae00265fb5e605dc6508 Author: Bixuan Cui Date: Thu Apr 8 17:18:36 2021 +0800 usb: musb: fix PM reference leak in musb_irq_work() [ Upstream commit 9535b99533904e9bc1607575aa8e9539a55435d7 ] pm_runtime_get_sync will increment pm usage counter even it failed. thus a pairing decrement is needed. Fix it by replacing it with pm_runtime_resume_and_get to keep usage counter balanced. Reported-by: Hulk Robot Signed-off-by: Bixuan Cui Link: https://lore.kernel.org/r/20210408091836.55227-1-cuibixuan@huawei.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 02798f0cb9b342bdabc4965f2ce3d6a6de6afec5 Author: Yang Yingliang Date: Wed Apr 7 17:29:47 2021 +0800 usb: gadget: tegra-xudc: Fix possible use-after-free in tegra_xudc_remove() [ Upstream commit a932ee40c276767cd55fadec9e38829bf441db41 ] This driver's remove path calls cancel_delayed_work(). However, that function does not wait until the work function finishes. This means that the callback function may still be running after the driver's remove function has finished, which would result in a use-after-free. Fix by calling cancel_delayed_work_sync(), which ensures that the work is properly cancelled, no longer running, and unable to re-schedule itself. Reported-by: Hulk Robot Signed-off-by: Yang Yingliang Link: https://lore.kernel.org/r/20210407092947.3271507-1-yangyingliang@huawei.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 1b0f64df77d0a1016e3bdf5a1d43993ea63856fc Author: Heikki Krogerus Date: Thu Apr 8 11:31:44 2021 +0300 usb: dwc3: pci: add support for the Intel Alder Lake-M [ Upstream commit 782de5e7190de0a773417708e17d9461d9109bf9 ] This patch adds the necessary PCI ID for Intel Alder Lake-M devices. Signed-off-by: Heikki Krogerus Link: https://lore.kernel.org/r/20210408083144.69350-1-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 32565a0616f984a85ccb4ed5067e6415d3dd1109 Author: Wang Li Date: Fri Apr 9 09:54:58 2021 +0000 spi: qup: fix PM reference leak in spi_qup_remove() [ Upstream commit cec77e0a249892ceb10061bf17b63f9fb111d870 ] pm_runtime_get_sync will increment pm usage counter even it failed. Forgetting to putting operation will result in reference leak here. Fix it by replacing it with pm_runtime_resume_and_get to keep usage counter balanced. Reported-by: Hulk Robot Signed-off-by: Wang Li Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/20210409095458.29921-1-wangli74@huawei.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 18fe30ff2692a290afa7de010ed40c5061a92b0e Author: Wei Yongjun Date: Fri Apr 9 08:29:54 2021 +0000 spi: omap-100k: Fix reference leak to master [ Upstream commit a23faea76d4cf5f75decb574491e66f9ecd707e7 ] Call spi_master_get() holds the reference count to master device, thus we need an additional spi_master_put() call to reduce the reference count, otherwise we will leak a reference to master. This commit fix it by removing the unnecessary spi_master_get(). Reported-by: Hulk Robot Signed-off-by: Wei Yongjun Link: https://lore.kernel.org/r/20210409082954.2906933-1-weiyongjun1@huawei.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit a505edce0bdd485e8469b454eb19dddc1faf7d6e Author: Wei Yongjun Date: Fri Apr 9 08:29:55 2021 +0000 spi: dln2: Fix reference leak to master [ Upstream commit 9b844b087124c1538d05f40fda8a4fec75af55be ] Call spi_master_get() holds the reference count to master device, thus we need an additional spi_master_put() call to reduce the reference count, otherwise we will leak a reference to master. This commit fix it by removing the unnecessary spi_master_get(). Reported-by: Hulk Robot Signed-off-by: Wei Yongjun Link: https://lore.kernel.org/r/20210409082955.2907950-1-weiyongjun1@huawei.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 840872c2aed21e82579292c09ccef8f5e34b0eba Author: Angela Czubak Date: Thu Apr 8 12:37:59 2021 +0200 resource: Prevent irqresource_disabled() from erasing flags [ Upstream commit d08a745729646f407277e904b02991458f20d261 ] Some Chromebooks use hard-coded interrupts in their ACPI tables. This is an excerpt as dumped on Relm: ... Name (_HID, "ELAN0001") // _HID: Hardware ID Name (_DDN, "Elan Touchscreen ") // _DDN: DOS Device Name Name (_UID, 0x05) // _UID: Unique ID Name (ISTP, Zero) Method (_CRS, 0, NotSerialized) // _CRS: Current Resource Settings { Name (BUF0, ResourceTemplate () { I2cSerialBusV2 (0x0010, ControllerInitiated, 0x00061A80, AddressingMode7Bit, "\\_SB.I2C1", 0x00, ResourceConsumer, , Exclusive, ) Interrupt (ResourceConsumer, Edge, ActiveLow, Exclusive, ,, ) { 0x000000B8, } }) Return (BUF0) /* \_SB_.I2C1.ETSA._CRS.BUF0 */ } ... This interrupt is hard-coded to 0xB8 = 184 which is too high to be mapped to IO-APIC, so no triggering information is propagated as acpi_register_gsi() fails and irqresource_disabled() is issued, which leads to erasing triggering and polarity information. Do not overwrite flags as it leads to erasing triggering and polarity information which might be useful in case of hard-coded interrupts. This way the information can be read later on even though mapping to APIC domain failed. Signed-off-by: Angela Czubak [ rjw: Changelog rearrangement ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit f938233fe9a668268182aad387ddb39cbcfa4940 Author: Dinh Nguyen Date: Mon Mar 22 07:18:44 2021 -0500 clocksource/drivers/dw_apb_timer_of: Add handling for potential memory leak [ Upstream commit 397dc6f7ca3c858dc95800f299357311ccf679e6 ] Add calls to disable the clock and unmap the timer base address in case of any failures. Reported-by: kernel test robot Reported-by: Dan Carpenter Signed-off-by: Dinh Nguyen Signed-off-by: Daniel Lezcano Link: https://lore.kernel.org/r/20210322121844.2271041-1-dinguyen@kernel.org Signed-off-by: Sasha Levin commit 43f27251c576fcb84118528e404ee8271d09c539 Author: Srinivas Pandruvada Date: Tue Mar 30 15:08:40 2021 -0700 platform/x86: ISST: Account for increased timeout in some cases [ Upstream commit 5c782817a981981917ec3c647cf521022ee07143 ] In some cases when firmware is busy or updating, some mailbox commands still timeout on some newer CPUs. To fix this issue, change how we process timeout. With this change, replaced timeout from using simple count with real timeout in micro-seconds using ktime. When the command response takes more than average processing time, yield to other tasks. The worst case timeout is extended upto 1 milli-second. Signed-off-by: Srinivas Pandruvada Link: https://lore.kernel.org/r/20210330220840.3113959-1-srinivas.pandruvada@linux.intel.com Signed-off-by: Hans de Goede Signed-off-by: Sasha Levin commit 5226917d0f470e9f0714a878bb7fa8a2d2c0f9ce Author: Srinivas Pandruvada Date: Thu Mar 4 17:31:49 2021 -0800 tools/power/x86/intel-speed-select: Increase string size [ Upstream commit 2e70b710f36c80b6e78cf32a5c30b46dbb72213c ] The current string size to print cpulist can accommodate upto 80 logical CPUs per package. But this limit is not enough. So increase the string size. Also prevent buffer overflow, if the string size reaches limit. Signed-off-by: Srinivas Pandruvada Signed-off-by: Hans de Goede Signed-off-by: Sasha Levin commit b605b652887e76beddf4307cf9dd7bdcacd1eba4 Author: Ludovic Desroches Date: Fri Apr 2 15:02:27 2021 +0200 ARM: dts: at91: change the key code of the gpio key [ Upstream commit ca7a049ad1a72ec5f03d1330b53575237fcb727c ] Having a button code and not a key code causes issues with libinput. udev won't set ID_INPUT_KEY. If it is forced, then it causes a bug within libinput. Signed-off-by: Ludovic Desroches Signed-off-by: Nicolas Ferre Link: https://lore.kernel.org/r/20210402130227.21478-1-nicolas.ferre@microchip.com Signed-off-by: Sasha Levin commit cf641232176d457db08a63338a20ac1a27cb8133 Author: Loic Poulain Date: Fri Mar 19 16:50:37 2021 +0100 bus: mhi: pci_generic: Implement PCI shutdown callback [ Upstream commit 757072abe1c0b67cb226936c709291889658a222 ] Deinit the device on shutdown to halt MHI/PCI operation on device side. This change fixes floating device state with some hosts that do not fully shutdown PCIe device when rebooting. Signed-off-by: Loic Poulain Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1616169037-7969-1-git-send-email-loic.poulain@linaro.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Sasha Levin commit 29792079690913fdc28ec68271cd82ac5c5a0b62 Author: Bhaumik Bhatt Date: Thu Apr 1 14:16:11 2021 -0700 bus: mhi: core: Clear context for stopped channels from remove() [ Upstream commit 4e44ae3d6d9c2c2a6d9356dd279c925532d5cd8c ] If a channel was explicitly stopped but not reset and a driver remove is issued, clean up the channel context such that it is reflected on the device. This move is useful if a client driver module is unloaded or a device crash occurs with the host having placed the channel in a stopped state. Signed-off-by: Bhaumik Bhatt Reviewed-by: Hemant Kumar Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1617311778-1254-3-git-send-email-bbhatt@codeaurora.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Sasha Levin commit 80380148c46d96f21840df1b02dd9db76a4bacfa Author: Mathias Nyman Date: Tue Apr 6 10:02:08 2021 +0300 xhci: prevent double-fetch of transfer and transfer event TRBs [ Upstream commit e9fcb07704fcef6fa6d0333fd2b3a62442eaf45b ] The same values are parsed several times from transfer and event TRBs by different functions in the same call path, all while processing one transfer event. As the TRBs are in DMA memory and can be accessed by the xHC host we want to avoid this to prevent double-fetch issues. To resolve this pass the already parsed values to the different functions in the path of parsing a transfer event Signed-off-by: Mathias Nyman Link: https://lore.kernel.org/r/20210406070208.3406266-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit f47c7ace4acdc335f70e71c26397c278ee46b78b Author: Mathias Nyman Date: Tue Apr 6 10:02:07 2021 +0300 xhci: fix potential array out of bounds with several interrupters [ Upstream commit 286fd02fd54b6acab65809549cf5fb3f2a886696 ] The Max Interrupters supported by the controller is given in a 10bit wide bitfield, but the driver uses a fixed 128 size array to index these interrupters. Klockwork reports a possible array out of bounds case which in theory is possible. In practice this hasn't been hit as a common number of Max Interrupters for new controllers is 8, not even close to 128. This needs to be fixed anyway Signed-off-by: Mathias Nyman Link: https://lore.kernel.org/r/20210406070208.3406266-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit bf6c351709a98dc8bdc19e59015b050d55838caa Author: Mathias Nyman Date: Tue Apr 6 10:02:06 2021 +0300 xhci: check control context is valid before dereferencing it. [ Upstream commit 597899d2f7c5619c87185ee7953d004bd37fd0eb ] Don't dereference ctrl_ctx before checking it's valid. Issue reported by Klockwork Signed-off-by: Mathias Nyman Link: https://lore.kernel.org/r/20210406070208.3406266-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 396481ce3f12ee047308eca49501e5dcc320cc80 Author: Mathias Nyman Date: Tue Apr 6 10:02:05 2021 +0300 xhci: check port array allocation was successful before dereferencing it [ Upstream commit 8a157d2ff104d2849c58226a1fd02365d7d60150 ] return if rhub->ports is null after rhub->ports = kcalloc_node() Klockwork reported issue Signed-off-by: Mathias Nyman Link: https://lore.kernel.org/r/20210406070208.3406266-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 5598b0a7efdee2509dde9bcb8fcc1c050ebf5672 Author: Russ Weight Date: Mon Apr 5 16:52:59 2021 -0700 fpga: dfl: pci: add DID for D5005 PAC cards [ Upstream commit a78a51a851ed3edc83264a67e2ba77a34f27965f ] This patch adds the approved PCI Express Device IDs for the PF and VF for the card for D5005 PAC cards. Signed-off-by: Russ Weight Signed-off-by: Matthew Gerlach Signed-off-by: Moritz Fischer Signed-off-by: Sasha Levin commit 04cd7a2b0ae785dd099b813934e1402db746d627 Author: Chunfeng Yun Date: Wed Mar 31 17:05:53 2021 +0800 usb: xhci-mtk: support quirk to disable usb2 lpm [ Upstream commit bee1f89aad2a51cd3339571bc8eadbb0dc88a683 ] The xHCI driver support usb2 HW LPM by default, here add support XHCI_HW_LPM_DISABLE quirk, then we can disable usb2 lpm when need it. Signed-off-by: Chunfeng Yun Link: https://lore.kernel.org/r/1617181553-3503-4-git-send-email-chunfeng.yun@mediatek.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit e132d83b2cffa05d3b9afe333975dbb462dbb72b Author: Eric Biggers Date: Sun Mar 21 22:13:47 2021 -0700 random: initialize ChaCha20 constants with correct endianness [ Upstream commit a181e0fdb2164268274453b5b291589edbb9b22d ] On big endian CPUs, the ChaCha20-based CRNG is using the wrong endianness for the ChaCha20 constants. This doesn't matter cryptographically, but technically it means it's not ChaCha20 anymore. Fix it to always use the standard constants. Cc: linux-crypto@vger.kernel.org Cc: Andy Lutomirski Cc: Jann Horn Cc: Theodore Ts'o Acked-by: Ard Biesheuvel Signed-off-by: Eric Biggers Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 16839bc05c83f5de291728594d6712b5cf58e2e5 Author: Robin Murphy Date: Fri Mar 26 16:02:41 2021 +0000 perf/arm_pmu_platform: Fix error handling [ Upstream commit e338cb6bef254821a8c095018fd27254d74bfd6a ] If we're aborting after failing to register the PMU device, we probably don't want to leak the IRQs that we've claimed. Signed-off-by: Robin Murphy Link: https://lore.kernel.org/r/53031a607fc8412a60024bfb3bb8cd7141f998f5.1616774562.git.robin.murphy@arm.com Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit bf6a8d1b4e4d55ce44319615423726a317601be0 Author: Robin Murphy Date: Fri Mar 26 16:02:40 2021 +0000 perf/arm_pmu_platform: Use dev_err_probe() for IRQ errors [ Upstream commit 11fa1dc8020a2a9e0c59998920092d4df3fb7308 ] By virtue of using platform_irq_get_optional() under the covers, platform_irq_count() needs the target interrupt controller to be available and may return -EPROBE_DEFER if it isn't. Let's use dev_err_probe() to avoid a spurious error log (and help debug any deferral issues) in that case. Reported-by: Paul Menzel Signed-off-by: Robin Murphy Link: https://lore.kernel.org/r/073d5e0d3ed1f040592cb47ca6fe3759f40cc7d1.1616774562.git.robin.murphy@arm.com Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit 65b2f85e7273a44eb626d356ed70bf1e6a2f91b8 Author: Pierre-Louis Bossart Date: Tue Mar 23 09:37:07 2021 +0800 soundwire: cadence: only prepare attached devices on clock stop [ Upstream commit 58ef9356260c291a4321e07ff507f31a1d8212af ] We sometimes see COMMAND_IGNORED responses during the clock stop sequence. It turns out we already have information if devices are present on a link, so we should only prepare those when they are attached. In addition, even when COMMAND_IGNORED are received, we should still proceed with the clock stop. The device will not be prepared but that's not a problem. The only case where the clock stop will fail is if the Cadence IP reports an error (including a timeout), or if the devices throw a COMMAND_FAILED response. BugLink: https://github.com/thesofproject/linux/issues/2621 Signed-off-by: Pierre-Louis Bossart Reviewed-by: Rander Wang Reviewed-by: Guennadi Liakhovetski Signed-off-by: Bard Liao Link: https://lore.kernel.org/r/20210323013707.21455-1-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 464461886367226f1f0c915db9e5a01bb8942289 Author: Jerome Forissier Date: Mon Mar 22 11:40:37 2021 +0100 tee: optee: do not check memref size on return from Secure World [ Upstream commit c650b8dc7a7910eb25af0aac1720f778b29e679d ] When Secure World returns, it may have changed the size attribute of the memory references passed as [in/out] parameters. The GlobalPlatform TEE Internal Core API specification does not restrict the values that this size can take. In particular, Secure World may increase the value to be larger than the size of the input buffer to indicate that it needs more. Therefore, the size check in optee_from_msg_param() is incorrect and needs to be removed. This fixes a number of failed test cases in the GlobalPlatform TEE Initial Configuratiom Test Suite v2_0_0_0-2017_06_09 when OP-TEE is compiled without dynamic shared memory support (CFG_CORE_DYN_SHM=n). Reviewed-by: Sumit Garg Suggested-by: Jens Wiklander Signed-off-by: Jerome Forissier Signed-off-by: Jens Wiklander Signed-off-by: Sasha Levin commit ad42a659f35192c5ad58c2af9ae2634ba8b8e4c7 Author: Sebastian Krzyszkowiak Date: Mon Mar 15 09:35:30 2021 +0100 arm64: dts: imx8mq-librem5-r3: Mark buck3 as always on [ Upstream commit a362b0cc94d476b097ba0ff466958c1d4e27e219 ] Commit 99e71c029213 ("arm64: dts: imx8mq-librem5: Don't mark buck3 as always on") removed always-on marking from GPU regulator, which is great for power saving - however it introduces additional i2c0 traffic which can be deadly for devices from the Dogwood batch. To workaround the i2c0 shutdown issue on Dogwood, this commit marks buck3 as always-on again - but only for Dogwood (r3). Signed-off-by: Sebastian Krzyszkowiak Signed-off-by: Martin Kepplinger Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin commit 2ed7e900e70bee8e80fbbe0acabe5bdbe72e81d6 Author: Dmitry Osipenko Date: Tue Mar 2 15:24:59 2021 +0300 soc/tegra: pmc: Fix completion of power-gate toggling [ Upstream commit c45e66a6b9f40f2e95bc6d97fbf3daa1ebe88c6b ] The SW-initiated power gate toggling is dropped by PMC if there is contention with a HW-initiated toggling, i.e. when one of CPU cores is gated by cpuidle driver. Software should retry the toggling after 10 microseconds on Tegra20/30 SoCs, hence add the retrying. On Tegra114+ the toggling method was changed in hardware, the TOGGLE_START bit indicates whether PMC is busy or could accept the command to toggle, hence handle that bit properly. The problem pops up after enabling dynamic power gating of 3D hardware, where 3D power domain fails to turn on/off "randomly". The programming sequence and quirks are documented in TRMs, but PMC driver obliviously re-used the Tegra20 logic for Tegra30+, which strikes back now. The 10 microseconds and other timeouts aren't documented in TRM, they are taken from downstream kernel. Link: https://nv-tegra.nvidia.com/gitweb/?p=linux-2.6.git;a=commit;h=311dd1c318b70e93bcefec15456a10ff2b9eb0ff Link: https://nv-tegra.nvidia.com/gitweb/?p=linux-3.10.git;a=commit;h=7f36693c47cb23730a6b2822e0975be65fb0c51d Tested-by: Peter Geis # Ouya T30 Tested-by: Nicolas Chauvet # PAZ00 T20 and TK1 T124 Tested-by: Matt Merhar # Ouya T30 Signed-off-by: Dmitry Osipenko Signed-off-by: Thierry Reding Signed-off-by: Sasha Levin commit 85e8bde13e9915fe0efb90c3c44a2cbb3495ec33 Author: Nathan Chancellor Date: Thu Mar 25 17:04:35 2021 -0700 efi/libstub: Add $(CLANG_FLAGS) to x86 flags [ Upstream commit 58d746c119dfa28e72fc35aacaf3d2a3ac625cd0 ] When cross compiling x86 on an ARM machine with clang, there are several errors along the lines of: arch/x86/include/asm/page_64.h:52:7: error: invalid output constraint '=D' in asm This happens because the x86 flags in the EFI stub are not derived from KBUILD_CFLAGS like the other architectures are and the clang flags that set the target architecture ('--target=') and the path to the GNU cross tools ('--prefix=') are not present, meaning that the host architecture is targeted. These flags are available as $(CLANG_FLAGS) from the main Makefile so add them to the cflags for x86 so that cross compiling works as expected. Signed-off-by: Nathan Chancellor Signed-off-by: Borislav Petkov Acked-by: Ard Biesheuvel Link: https://lkml.kernel.org/r/20210326000435.4785-4-nathan@kernel.org Signed-off-by: Sasha Levin commit 50e9a45834f929110b10feb74ba74e798be4e442 Author: Nathan Chancellor Date: Thu Mar 25 17:04:34 2021 -0700 x86/boot: Add $(CLANG_FLAGS) to compressed KBUILD_CFLAGS [ Upstream commit d5cbd80e302dfea59726c44c56ab7957f822409f ] When cross compiling x86 on an ARM machine with clang, there are several errors along the lines of: arch/x86/include/asm/string_64.h:27:10: error: invalid output constraint '=&c' in asm This happens because the compressed boot Makefile reassigns KBUILD_CFLAGS and drops the clang flags that set the target architecture ('--target=') and the path to the GNU cross tools ('--prefix='), meaning that the host architecture is targeted. These flags are available as $(CLANG_FLAGS) from the main Makefile so add them to the compressed boot folder's KBUILD_CFLAGS so that cross compiling works as expected. Signed-off-by: Nathan Chancellor Signed-off-by: Borislav Petkov Acked-by: Ard Biesheuvel Link: https://lkml.kernel.org/r/20210326000435.4785-3-nathan@kernel.org Signed-off-by: Sasha Levin commit 9150f7843b5d0acd415b89f59b6d2dc6b8c68867 Author: John Millikin Date: Thu Mar 25 17:04:33 2021 -0700 x86/build: Propagate $(CLANG_FLAGS) to $(REALMODE_FLAGS) [ Upstream commit 8abe7fc26ad8f28bfdf78adbed56acd1fa93f82d ] When cross-compiling with Clang, the `$(CLANG_FLAGS)' variable contains additional flags needed to build C and assembly sources for the target platform. Normally this variable is automatically included in `$(KBUILD_CFLAGS)' via the top-level Makefile. The x86 real-mode makefile builds `$(REALMODE_CFLAGS)' from a plain assignment and therefore drops the Clang flags. This causes Clang to not recognize x86-specific assembler directives:   arch/x86/realmode/rm/header.S:36:1: error: unknown directive   .type real_mode_header STT_OBJECT ; .size real_mode_header, .-real_mode_header   ^ Explicit propagation of `$(CLANG_FLAGS)' to `$(REALMODE_CFLAGS)', which is inherited by real-mode make rules, fixes cross-compilation with Clang for x86 targets. Relevant flags: * `--target' sets the target architecture when cross-compiling. This   flag must be set for both compilation and assembly (`KBUILD_AFLAGS')   to support architecture-specific assembler directives. * `-no-integrated-as' tells clang to assemble with GNU Assembler   instead of its built-in LLVM assembler. This flag is set by default   unless `LLVM_IAS=1' is set, because the LLVM assembler can't yet   parse certain GNU extensions. Signed-off-by: John Millikin Signed-off-by: Nathan Chancellor Signed-off-by: Borislav Petkov Acked-by: Ard Biesheuvel Tested-by: Sedat Dilek Link: https://lkml.kernel.org/r/20210326000435.4785-2-nathan@kernel.org Signed-off-by: Sasha Levin commit ae4a4ca270c506ff22453833869a18c7004bc394 Author: Linus Walleij Date: Fri Mar 26 10:19:11 2021 +0100 ARM: dts: ux500: Fix up TVK R3 sensors [ Upstream commit aeceecd40d94ed3c00bfe1cfe59dd1bfac2fc6fe ] The TVK1281618 R3 sensors are different from the R2 board, some incorrectness is fixed and some new sensors added, we also rename the nodes appropriately with accelerometer@ etc. Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin commit a2ecc3aa10671658542dc7d58cbcd21aca646561 Author: Rafał Miłecki Date: Tue Mar 9 13:55:00 2021 +0100 ARM: dts: BCM5301X: fix "reg" formatting in /memory node [ Upstream commit 43986f38818278bb71a7fef6de689637bb734afe ] This fixes warnings/errors like: arch/arm/boot/dts/bcm4708-buffalo-wzr-1750dhp.dt.yaml: /: memory@0:reg:0: [0, 134217728, 2281701376, 402653184] is too long From schema: /lib/python3.6/site-packages/dtschema/schemas/reg.yaml Signed-off-by: Rafał Miłecki Signed-off-by: Florian Fainelli Signed-off-by: Sasha Levin commit 52a7c20bfb4b3319db540a62d89a9ef77e38fa22 Author: Andre Przywara Date: Fri Mar 19 16:53:29 2021 +0000 kselftest/arm64: mte: Fix MTE feature detection [ Upstream commit 592432862cc4019075a7196d9961562c49507d6f ] To check whether the CPU and kernel support the MTE features we want to test, we use an (emulated) CPU ID register read. However we only check against a very particular feature version (0b0010), even though the ARM ARM promises ID register features to be backwards compatible. While this could be fixed by using ">=" instead of "==", we should actually use the explicit HWCAP2_MTE hardware capability, exposed by the kernel via the ELF auxiliary vectors. That moves this responsibility to the kernel, and fixes running the tests on machines with FEAT_MTE3 capability. Signed-off-by: Andre Przywara Reviewed-by: Mark Brown Link: https://lore.kernel.org/r/20210319165334.29213-7-andre.przywara@arm.com Signed-off-by: Catalin Marinas Signed-off-by: Sasha Levin commit 236da04d3bc4ff80e75a43dc93ba7de42a0fd325 Author: Rafael J. Wysocki Date: Tue Mar 16 16:51:40 2021 +0100 PCI: PM: Do not read power state in pci_enable_device_flags() [ Upstream commit 4514d991d99211f225d83b7e640285f29f0755d0 ] It should not be necessary to update the current_state field of struct pci_dev in pci_enable_device_flags() before calling do_pci_enable_device() for the device, because none of the code between that point and the pci_set_power_state() call in do_pci_enable_device() invoked later depends on it. Moreover, doing that is actively harmful in some cases. For example, if the given PCI device depends on an ACPI power resource whose _STA method initially returns 0 ("off"), but the config space of the PCI device is accessible and the power state retrieved from the PCI_PM_CTRL register is D0, the current_state field in the struct pci_dev representing that device will get out of sync with the power.state of its ACPI companion object and that will lead to power management issues going forward. To avoid such issues it is better to leave the current_state value as is until it is changed to PCI_D0 by do_pci_enable_device() as appropriate. However, the power state of the device is not changed to PCI_D0 if it is already enabled when pci_enable_device_flags() gets called for it, so update its current_state in that case, but use pci_update_current_state() covering platform PM too for that. Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/ Reported-by: Maximilian Luz Tested-by: Maximilian Luz Signed-off-by: Rafael J. Wysocki Reviewed-by: Mika Westerberg Signed-off-by: Sasha Levin commit 2c22c855b2ab3a086443744767d8ce9707d2345e Author: Dmitry Osipenko Date: Tue Mar 2 15:09:58 2021 +0300 ARM: tegra: acer-a500: Rename avdd to vdda of touchscreen node [ Upstream commit b27b9689e1f3278919c6183c565d837d0aef6fc1 ] Rename avdd supply to vdda of the touchscreen node. The old supply name was incorrect. Signed-off-by: Dmitry Osipenko Signed-off-by: Thierry Reding Signed-off-by: Sasha Levin commit e83a6878a1687209385a9e4ac82f96bf53dece8d Author: Andre Przywara Date: Fri Mar 19 16:53:24 2021 +0000 kselftest/arm64: mte: Fix compilation with native compiler [ Upstream commit 4a423645bc2690376a7a94b4bb7b2f74bc6206ff ] The mte selftest Makefile contains a check for GCC, to add the memtag -march flag to the compiler options. This check fails if the compiler is not explicitly specified, so reverts to the standard "cc", in which case --version doesn't mention the "gcc" string we match against: $ cc --version | head -n 1 cc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 This will not add the -march switch to the command line, so compilation fails: mte_helper.S: Assembler messages: mte_helper.S:25: Error: selected processor does not support `irg x0,x0,xzr' mte_helper.S:38: Error: selected processor does not support `gmi x1,x0,xzr' ... Actually clang accepts the same -march option as well, so we can just drop this check and add this unconditionally to the command line, to avoid any future issues with this check altogether (gcc actually prints basename(argv[0]) when called with --version). Signed-off-by: Andre Przywara Reviewed-by: Nick Desaulniers Reviewed-by: Mark Brown Link: https://lore.kernel.org/r/20210319165334.29213-2-andre.przywara@arm.com Signed-off-by: Catalin Marinas Signed-off-by: Sasha Levin commit 8815109c15efcb190270b891ed575b366e72cca9 Author: Thinh Nguyen Date: Wed Mar 10 19:43:21 2021 -0800 usb: xhci: Fix port minor revision [ Upstream commit 64364bc912c01b33bba6c22e3ccb849bfca96398 ] Some hosts incorrectly use sub-minor version for minor version (i.e. 0x02 instead of 0x20 for bcdUSB 0x320 and 0x01 for bcdUSB 0x310). Currently the xHCI driver works around this by just checking for minor revision > 0x01 for USB 3.1 everywhere. With the addition of USB 3.2, checking this gets a bit cumbersome. Since there is no USB release with bcdUSB 0x301 to 0x309, we can assume that sub-minor version 01 to 09 is incorrect. Let's try to fix this and use the minor revision that matches with the USB/xHCI spec to help with the version checking within the driver. Acked-by: Mathias Nyman Signed-off-by: Thinh Nguyen Link: https://lore.kernel.org/r/ed330e95a19dc367819c5b4d78bf7a541c35aa0a.1615432770.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 9777200246f92bb86e9d7d7dbbb06cd5c8a26943 Author: Wesley Cheng Date: Fri Mar 19 02:31:25 2021 -0700 usb: dwc3: gadget: Ignore EP queue requests during bus reset [ Upstream commit 71ca43f30df9c642970f9dc9b2d6f463f4967e7b ] The current dwc3_gadget_reset_interrupt() will stop any active transfers, but only addresses blocking of EP queuing for while we are coming from a disconnected scenario, i.e. after receiving the disconnect event. If the host decides to issue a bus reset on the device, the connected parameter will still be set to true, allowing for EP queuing to continue while we are disabling the functions. To avoid this, set the connected flag to false until the stop active transfers is complete. Signed-off-by: Wesley Cheng Link: https://lore.kernel.org/r/1616146285-19149-3-git-send-email-wcheng@codeaurora.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 7f66ca5efe7c14aab6d3325cc776f90a0b8f827e Author: Ruslan Bilovol Date: Mon Mar 1 13:49:34 2021 +0200 usb: gadget: f_uac1: validate input parameters [ Upstream commit a59c68a6a3d1b18e2494f526eb19893a34fa6ec6 ] Currently user can configure UAC1 function with parameters that violate UAC1 spec or are not supported by UAC1 gadget implementation. This can lead to incorrect behavior if such gadget is connected to the host - like enumeration failure or other issues depending on host's UAC1 driver implementation, bringing user to a long hours of debugging the issue. Instead of silently accept these parameters, throw an error if they are not valid. Signed-off-by: Ruslan Bilovol Link: https://lore.kernel.org/r/1614599375-8803-5-git-send-email-ruslan.bilovol@gmail.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit d15e57bb5bf284b336bae865136165959c116a81 Author: Ruslan Bilovol Date: Mon Mar 1 13:49:33 2021 +0200 usb: gadget: f_uac2: validate input parameters [ Upstream commit 3713d5ceb04d5ab6a5e2b86dfca49170053f3a5e ] Currently user can configure UAC2 function with parameters that violate UAC2 spec or are not supported by UAC2 gadget implementation. This can lead to incorrect behavior if such gadget is connected to the host - like enumeration failure or other issues depending on host's UAC2 driver implementation, bringing user to a long hours of debugging the issue. Instead of silently accept these parameters, throw an error if they are not valid. Signed-off-by: Ruslan Bilovol Link: https://lore.kernel.org/r/1614599375-8803-4-git-send-email-ruslan.bilovol@gmail.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 79c4569e2492d170874d993ea71714a007eb4f59 Author: Vitaly Kuznetsov Date: Fri Mar 19 12:18:23 2021 +0100 genirq/matrix: Prevent allocation counter corruption [ Upstream commit c93a5e20c3c2dabef8ea360a3d3f18c6f68233ab ] When irq_matrix_free() is called for an unallocated vector the managed_allocated and total_allocated counters get out of sync with the real state of the matrix. Later, when the last interrupt is freed, these counters will underflow resulting in UINTMAX because the counters are unsigned. While this is certainly a problem of the calling code, this can be catched in the allocator by checking the allocation bit for the to be freed vector which simplifies debugging. An example of the problem described above: https://lore.kernel.org/lkml/20210318192819.636943062@linutronix.de/ Add the missing sanity check and emit a warning when it triggers. Suggested-by: Thomas Gleixner Signed-off-by: Vitaly Kuznetsov Signed-off-by: Thomas Gleixner Link: https://lore.kernel.org/r/20210319111823.1105248-1-vkuznets@redhat.com Signed-off-by: Sasha Levin commit 6cb737a99abed908bc5c54fa14d0f02f74231406 Author: Longfang Liu Date: Sat Mar 13 15:28:23 2021 +0800 crypto: hisilicon/sec - fixes a printing error [ Upstream commit 4b7aef0230418345be1fb77abbb1592801869901 ] When the log is output here, the device has not been initialized yet. Signed-off-by: Longfang Liu Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit e0c552b91c7d2de5951f880dbc28e92cab271d8a Author: Joerg Roedel Date: Fri Mar 12 13:38:18 2021 +0100 x86/sev: Do not require Hypervisor CPUID bit for SEV guests [ Upstream commit eab696d8e8b9c9d600be6fad8dd8dfdfaca6ca7c ] A malicious hypervisor could disable the CPUID intercept for an SEV or SEV-ES guest and trick it into the no-SEV boot path, where it could potentially reveal secrets. This is not an issue for SEV-SNP guests, as the CPUID intercept can't be disabled for those. Remove the Hypervisor CPUID bit check from the SEV detection code to protect against this kind of attack and add a Hypervisor bit equals zero check to the SME detection path to prevent non-encrypted guests from trying to enable SME. This handles the following cases: 1) SEV(-ES) guest where CPUID intercept is disabled. The guest will still see leaf 0x8000001f and the SEV bit. It can retrieve the C-bit and boot normally. 2) Non-encrypted guests with intercepted CPUID will check the SEV_STATUS MSR and find it 0 and will try to enable SME. This will fail when the guest finds MSR_K8_SYSCFG to be zero, as it is emulated by KVM. But we can't rely on that, as there might be other hypervisors which return this MSR with bit 23 set. The Hypervisor bit check will prevent that the guest tries to enable SME in this case. 3) Non-encrypted guests on SEV capable hosts with CPUID intercept disabled (by a malicious hypervisor) will try to boot into the SME path. This will fail, but it is also not considered a problem because non-encrypted guests have no protection against the hypervisor anyway. [ bp: s/non-SEV/non-encrypted/g ] Signed-off-by: Joerg Roedel Signed-off-by: Borislav Petkov Acked-by: Tom Lendacky Link: https://lkml.kernel.org/r/20210312123824.306-3-joro@8bytes.org Signed-off-by: Sasha Levin commit 014379ef6849f0e70c2b32095ff4579cd10c1536 Author: Pawel Laszczak Date: Mon Mar 15 08:17:48 2021 +0100 usb: webcam: Invalid size of Processing Unit Descriptor [ Upstream commit 6a154ec9ef6762c774cd2b50215c7a8f0f08a862 ] According with USB Device Class Definition for Video Device the Processing Unit Descriptor bLength should be 12 (10 + bmControlSize), but it has 11. Invalid length caused that Processing Unit Descriptor Test Video form CV tool failed. To fix this issue patch adds bmVideoStandards into uvc_processing_unit_descriptor structure. The bmVideoStandards field was added in UVC 1.1 and it wasn't part of UVC 1.0a. Reviewed-by: Laurent Pinchart Signed-off-by: Pawel Laszczak Reviewed-by: Peter Chen Link: https://lore.kernel.org/r/20210315071748.29706-1-pawell@gli-login.cadence.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit f1fbce00b98f00694251086ecbb4335c51c206c9 Author: Pawel Laszczak Date: Mon Mar 8 13:53:38 2021 +0100 usb: gadget: uvc: add bInterval checking for HS mode [ Upstream commit 26adde04acdff14a1f28d4a5dce46a8513a3038b ] Patch adds extra checking for bInterval passed by configfs. The 5.6.4 chapter of USB Specification (rev. 2.0) say: "A high-bandwidth endpoint must specify a period of 1x125 µs (i.e., a bInterval value of 1)." The issue was observed during testing UVC class on CV. I treat this change as improvement because we can control bInterval by configfs. Reviewed-by: Peter Chen Reviewed-by: Laurent Pinchart Signed-off-by: Pawel Laszczak Link: https://lore.kernel.org/r/20210308125338.4824-1-pawell@gli-login.cadence.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 4e58e2fd84f18e4206f9f7600af041965d67d12d Author: Hui Tang Date: Fri Mar 5 14:35:01 2021 +0800 crypto: qat - fix unmap invalid dma address [ Upstream commit 792b32fad548281e1b7fe14df9063a96c54b32a2 ] 'dma_mapping_error' return a negative value if 'dma_addr' is equal to 'DMA_MAPPING_ERROR' not zero, so fix initialization of 'dma_addr'. Signed-off-by: Hui Tang Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit d7b89a167daa861d5c00cc996630abe5502f50c1 Author: Ard Biesheuvel Date: Tue Mar 2 21:33:03 2021 +0100 crypto: api - check for ERR pointers in crypto_destroy_tfm() [ Upstream commit 83681f2bebb34dbb3f03fecd8f570308ab8b7c2c ] Given that crypto_alloc_tfm() may return ERR pointers, and to avoid crashes on obscure error paths where such pointers are presented to crypto_destroy_tfm() (such as [0]), add an ERR_PTR check there before dereferencing the second argument as a struct crypto_tfm pointer. [0] https://lore.kernel.org/linux-crypto/000000000000de949705bc59e0f6@google.com/ Reported-by: syzbot+12cf5fbfdeba210a89dd@syzkaller.appspotmail.com Reviewed-by: Eric Biggers Signed-off-by: Ard Biesheuvel Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 29b9829718c5e9bd68fc1c652f5e0ba9b9a64fed Author: Bhaumik Bhatt Date: Wed Feb 24 15:23:04 2021 -0800 bus: mhi: core: Process execution environment changes serially [ Upstream commit ef2126c4e2ea2b92f543fae00a2a0332e4573c48 ] In current design, whenever the BHI interrupt is fired, the execution environment is updated. This can cause race conditions and impede ongoing power up/down processing. For example, if a power down is in progress, MHI host updates to a local "disabled" execution environment. If a BHI interrupt fires later, that value gets replaced with one from the BHI EE register. This impacts the controller as it does not expect multiple RDDM execution environment change status callbacks as an example. Another issue would be that the device can enter mission mode and the execution environment is updated, while device creation for SBL channels is still going on due to slower PM state worker thread run, leading to multiple attempts at opening the same channel. Ensure that EE changes are handled only from appropriate places and occur one after another and handle only PBL modes or RDDM EE changes as critical events directly from the interrupt handler. Simplify handling by waiting for SYS ERROR before handling RDDM. This also makes sure that we use the correct execution environment to notify the controller driver when the device resets to one of the PBL execution environments. Signed-off-by: Bhaumik Bhatt Reviewed-by: Loic Poulain Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1614208985-20851-4-git-send-email-bbhatt@codeaurora.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Sasha Levin commit 3ec3ac3b08e7f0d7fb00487b6030c1afcc17e4fb Author: Bhaumik Bhatt Date: Wed Feb 24 15:23:02 2021 -0800 bus: mhi: core: Destroy SBL devices when moving to mission mode [ Upstream commit 925089c1900f588615db5bf4e1d9064a5f2c18c7 ] Currently, client devices are created in SBL or AMSS (mission mode) and only destroyed after power down or SYS ERROR. When moving between certain execution environments, such as from SBL to AMSS, no clean-up is required. This presents an issue where SBL-specific channels are left open and client drivers now run in an execution environment where they cannot operate. Fix this by expanding the mhi_destroy_device() to do an execution environment specific clean-up if one is requested. Close the gap and destroy devices in such scenarios that allow SBL client drivers to clean up once device enters mission mode. Signed-off-by: Bhaumik Bhatt Reviewed-by: Loic Poulain Reviewed-by: Hemant Kumar Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1614208985-20851-2-git-send-email-bbhatt@codeaurora.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Sasha Levin commit 936cfae514c7b8857b22a67a9c5b7da0fbe672dd Author: Loic Poulain Date: Fri Mar 5 20:16:46 2021 +0100 bus: mhi: pci_generic: No-Op for device_wake operations [ Upstream commit e3e5e6508fc1c0e98a5a264853713dd30a60e5e5 ] The wake_db register presence is highly speculative and can fuze MHI devices. Indeed, currently the wake_db register address is defined at entry 127 of the 'Channel doorbell array', thus writing to this address is equivalent to ringing the doorbell for channel 127, causing trouble with some devics (e.g. SDX24 based modems) that get an unexpected channel 127 doorbell interrupt. This change fixes that issue by setting wake get/put as no-op for pci_generic devices. The wake device sideband mechanism seems really specific to each device, and is AFAIK not defined by the MHI spec. It also removes zeroing initialization of wake_db register during MMIO initialization, the register being set via wake_get/put accessors few cycles later during M0 transition. Signed-off-by: Loic Poulain Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1614971808-22156-4-git-send-email-loic.poulain@linaro.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Sasha Levin commit 43bcf55a67077478ee11ca41925c3a553e0b41a1 Author: David Bauer Date: Wed Mar 3 17:08:37 2021 +0100 spi: ath79: remove spi-master setup and cleanup assignment [ Upstream commit ffb597b2bd3cd78b9bfb68f536743cd46dbb2cc4 ] This removes the assignment of setup and cleanup functions for the ath79 target. Assigning the setup-method will lead to 'setup_transfer' not being assigned in spi_bitbang_init. Because of this, performing any TX/RX operation will lead to a kernel oops. Also drop the redundant cleanup assignment, as it's also assigned in spi_bitbang_init. Signed-off-by: David Bauer Link: https://lore.kernel.org/r/20210303160837.165771-2-mail@david-bauer.net Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 548b144b5f83d95914fd118a54130e054f954f86 Author: David Bauer Date: Wed Mar 3 17:08:36 2021 +0100 spi: ath79: always call chipselect function [ Upstream commit 19e2132174583beb90c1bd3e9c842bc6d5c944d1 ] spi-bitbang has to call the chipselect function on the ath79 SPI driver in order to communicate with the SPI slave device, as the ath79 SPI driver has three dedicated chipselect lines but can also be used with GPIOs for the CS lines. Fixes commit 4a07b8bcd503 ("spi: bitbang: Make chipselect callback optional") Signed-off-by: David Bauer Link: https://lore.kernel.org/r/20210303160837.165771-1-mail@david-bauer.net Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 40a66f6f4b6632067dbb2df8fc6b37dc66fed14c Author: karthik alapati Date: Sun Feb 21 21:01:05 2021 +0530 staging: wimax/i2400m: fix byte-order issue [ Upstream commit 0c37baae130df39b19979bba88bde2ee70a33355 ] fix sparse byte-order warnings by converting host byte-order type to __le16 byte-order types before assigning to hdr.length Signed-off-by: karthik alapati Link: https://lore.kernel.org/r/0ae5c5c4c646506d8be871e7be5705542671a1d5.1613921277.git.mail@karthek.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 37e7b52281ebc8a364391faac82977b77757fb03 Author: Tony Lindgren Date: Mon Mar 8 11:35:07 2021 +0200 bus: ti-sysc: Probe for l4_wkup and l4_cfg interconnect devices first [ Upstream commit 4700a00755fb5a4bb5109128297d6fd2d1272ee6 ] We want to probe l4_wkup and l4_cfg interconnect devices first to avoid issues with missing resources. Otherwise we attempt to probe l4_per devices first causing pointless deferred probe and also annoyingh renumbering of the MMC devices for example. Signed-off-by: Tony Lindgren Signed-off-by: Sasha Levin commit fbe76e9f4fbe84cc3347198d24913ab68736a641 Author: Dmitry Osipenko Date: Tue Mar 2 12:54:04 2021 +0300 cpuidle: tegra: Fix C7 idling state on Tegra114 commit 32c8c34d8132b5fe8497c2538597445a0d65c29d upstream. Trusted Foundation firmware doesn't implement the do_idle call and in this case suspending should fall back to the common suspend path. In order to fix this issue we will unconditionally set the NOFLUSH_L2 mode via firmware call, which is a NO-OP on Tegra30/124, and then proceed to the C7 idling, like it was done by the older Tegra114 cpuidle driver. Fixes: 14e086baca50 ("cpuidle: tegra: Squash Tegra114 driver into the common driver") Cc: stable@vger.kernel.org # 5.7+ Reported-by: Anton Bambura # TF701 T114 Tested-by: Anton Bambura # TF701 T114 Tested-by: Matt Merhar # Ouya T30 Tested-by: Peter Geis # Ouya T30 Signed-off-by: Dmitry Osipenko Reviewed-by: Daniel Lezcano Signed-off-by: Daniel Lezcano Link: https://lore.kernel.org/r/20210302095405.28453-1-digetx@gmail.com Signed-off-by: Greg Kroah-Hartman commit 3d40985b100da97b2e6777fade7aa7964580468b Author: Phillip Potter Date: Wed Mar 31 23:07:19 2021 +0100 fbdev: zero-fill colormap in fbcmap.c commit 19ab233989d0f7ab1de19a036e247afa4a0a1e9c upstream. Use kzalloc() rather than kmalloc() for the dynamically allocated parts of the colormap in fb_alloc_cmap_gfp, to prevent a leak of random kernel data to userspace under certain circumstances. Fixes a KMSAN-found infoleak bug reported by syzbot at: https://syzkaller.appspot.com/bug?id=741578659feabd108ad9e06696f0c1f2e69c4b6e Reported-by: syzbot+47fa9c9c648b765305b9@syzkaller.appspotmail.com Cc: stable Reviewed-by: Geert Uytterhoeven Signed-off-by: Phillip Potter Link: https://lore.kernel.org/r/20210331220719.1499743-1-phil@philpotter.co.uk Signed-off-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman commit 2cf62247f25903afe560f3acee7f590908fec171 Author: Chen Jun Date: Wed Apr 14 03:04:49 2021 +0000 posix-timers: Preserve return value in clock_adjtime32() commit 2d036dfa5f10df9782f5278fc591d79d283c1fad upstream. The return value on success (>= 0) is overwritten by the return value of put_old_timex32(). That works correct in the fault case, but is wrong for the success case where put_old_timex32() returns 0. Just check the return value of put_old_timex32() and return -EFAULT in case it is not zero. [ tglx: Massage changelog ] Fixes: 3a4d44b61625 ("ntp: Move adjtimex related compat syscalls to native counterparts") Signed-off-by: Chen Jun Signed-off-by: Thomas Gleixner Reviewed-by: Richard Cochran Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210414030449.90692-1-chenjun102@huawei.com Signed-off-by: Greg Kroah-Hartman commit a06b40880153a2360b313e19b9ccf16c6b1a311b Author: Johannes Thumshirn Date: Fri Apr 16 12:05:27 2021 +0900 btrfs: zoned: fail mount if the device does not support zone append commit 1d68128c107a0b8c0c9125cb05d4771ddc438369 upstream. For zoned btrfs, zone append is mandatory to write to a sequential write only zone, otherwise parallel writes to the same zone could result in unaligned write errors. If a zoned block device does not support zone append (e.g. a dm-crypt zoned device using a non-NULL IV cypher), fail to mount. CC: stable@vger.kernel.org # 5.12 Signed-off-by: Johannes Thumshirn Signed-off-by: Damien Le Moal Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 688ddc93faf4ad65517471ef01d950db9cc8f6a7 Author: Filipe Manana Date: Wed Apr 14 14:05:26 2021 +0100 btrfs: zoned: fix unpaired block group unfreeze during device replace commit 0dc16ef4f6c2708407fab6d141908d46a3b737bc upstream. When doing a device replace on a zoned filesystem, if we find a block group with ->to_copy == 0, we jump to the label 'done', which will result in later calling btrfs_unfreeze_block_group(), even though at this point we never called btrfs_freeze_block_group(). Since at this point we have neither turned the block group to RO mode nor made any progress, we don't need to jump to the label 'done'. So fix this by jumping instead to the label 'skip' and dropping our reference on the block group before the jump. Fixes: 78ce9fc269af6e ("btrfs: zoned: mark block groups to copy for device-replace") CC: stable@vger.kernel.org # 5.12 Reviewed-by: Johannes Thumshirn Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit e2da98788369bfba1138bada72765c47989a4338 Author: Filipe Manana Date: Mon Apr 5 12:32:16 2021 +0100 btrfs: fix race between transaction aborts and fsyncs leading to use-after-free commit 061dde8245356d8864d29e25207aa4daa0be4d3c upstream. There is a race between a task aborting a transaction during a commit, a task doing an fsync and the transaction kthread, which leads to an use-after-free of the log root tree. When this happens, it results in a stack trace like the following: BTRFS info (device dm-0): forced readonly BTRFS warning (device dm-0): Skipping commit of aborted transaction. BTRFS: error (device dm-0) in cleanup_transaction:1958: errno=-5 IO failure BTRFS warning (device dm-0): lost page write due to IO error on /dev/mapper/error-test (-5) BTRFS warning (device dm-0): Skipping commit of aborted transaction. BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0xa4e8 len 4096 err no 10 BTRFS error (device dm-0): error writing primary super block to device 1 BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0x12e000 len 4096 err no 10 BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0x12e008 len 4096 err no 10 BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0x12e010 len 4096 err no 10 BTRFS: error (device dm-0) in write_all_supers:4110: errno=-5 IO failure (1 errors while writing supers) BTRFS: error (device dm-0) in btrfs_sync_log:3308: errno=-5 IO failure general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b68: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI CPU: 2 PID: 2458471 Comm: fsstress Not tainted 5.12.0-rc5-btrfs-next-84 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 RIP: 0010:__mutex_lock+0x139/0xa40 Code: c0 74 19 (...) RSP: 0018:ffff9f18830d7b00 EFLAGS: 00010202 RAX: 6b6b6b6b6b6b6b68 RBX: 0000000000000001 RCX: 0000000000000002 RDX: ffffffffb9c54d13 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff9f18830d7bc0 R08: 0000000000000000 R09: 0000000000000000 R10: ffff9f18830d7be0 R11: 0000000000000001 R12: ffff8c6cd199c040 R13: ffff8c6c95821358 R14: 00000000fffffffb R15: ffff8c6cbcf01358 FS: 00007fa9140c2b80(0000) GS:ffff8c6fac600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa913d52000 CR3: 000000013d2b4003 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? __btrfs_handle_fs_error+0xde/0x146 [btrfs] ? btrfs_sync_log+0x7c1/0xf20 [btrfs] ? btrfs_sync_log+0x7c1/0xf20 [btrfs] btrfs_sync_log+0x7c1/0xf20 [btrfs] btrfs_sync_file+0x40c/0x580 [btrfs] do_fsync+0x38/0x70 __x64_sys_fsync+0x10/0x20 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fa9142a55c3 Code: 8b 15 09 (...) RSP: 002b:00007fff26278d48 EFLAGS: 00000246 ORIG_RAX: 000000000000004a RAX: ffffffffffffffda RBX: 0000563c83cb4560 RCX: 00007fa9142a55c3 RDX: 00007fff26278cb0 RSI: 00007fff26278cb0 RDI: 0000000000000005 RBP: 0000000000000005 R08: 0000000000000001 R09: 00007fff26278d5c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000340 R13: 00007fff26278de0 R14: 00007fff26278d96 R15: 0000563c83ca57c0 Modules linked in: btrfs dm_zero dm_snapshot dm_thin_pool (...) ---[ end trace ee2f1b19327d791d ]--- The steps that lead to this crash are the following: 1) We are at transaction N; 2) We have two tasks with a transaction handle attached to transaction N. Task A and Task B. Task B is doing an fsync; 3) Task B is at btrfs_sync_log(), and has saved fs_info->log_root_tree into a local variable named 'log_root_tree' at the top of btrfs_sync_log(). Task B is about to call write_all_supers(), but before that... 4) Task A calls btrfs_commit_transaction(), and after it sets the transaction state to TRANS_STATE_COMMIT_START, an error happens before it waits for the transaction's 'num_writers' counter to reach a value of 1 (no one else attached to the transaction), so it jumps to the label "cleanup_transaction"; 5) Task A then calls cleanup_transaction(), where it aborts the transaction, setting BTRFS_FS_STATE_TRANS_ABORTED on fs_info->fs_state, setting the ->aborted field of the transaction and the handle to an errno value and also setting BTRFS_FS_STATE_ERROR on fs_info->fs_state. After that, at cleanup_transaction(), it deletes the transaction from the list of transactions (fs_info->trans_list), sets the transaction to the state TRANS_STATE_COMMIT_DOING and then waits for the number of writers to go down to 1, as it's currently 2 (1 for task A and 1 for task B); 6) The transaction kthread is running and sees that BTRFS_FS_STATE_ERROR is set in fs_info->fs_state, so it calls btrfs_cleanup_transaction(). There it sees the list fs_info->trans_list is empty, and then proceeds into calling btrfs_drop_all_logs(), which frees the log root tree with a call to btrfs_free_log_root_tree(); 7) Task B calls write_all_supers() and, shortly after, under the label 'out_wake_log_root', it deferences the pointer stored in 'log_root_tree', which was already freed in the previous step by the transaction kthread. This results in a use-after-free leading to a crash. Fix this by deleting the transaction from the list of transactions at cleanup_transaction() only after setting the transaction state to TRANS_STATE_COMMIT_DOING and waiting for all existing tasks that are attached to the transaction to release their transaction handles. This makes the transaction kthread wait for all the tasks attached to the transaction to be done with the transaction before dropping the log roots and doing other cleanups. Fixes: ef67963dac255b ("btrfs: drop logs when we've aborted a transaction") CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Josef Bacik Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit ae4cd897367582905e5e883d6a062bd5385936d3 Author: Alexander Shishkin Date: Wed Apr 14 20:12:50 2021 +0300 intel_th: pci: Add Rocket Lake CPU support commit 9f7f2a5e01ab4ee56b6d9c0572536fe5fd56e376 upstream. This adds support for the Trace Hub in Rocket Lake CPUs. Signed-off-by: Alexander Shishkin Reviewed-by: Andy Shevchenko Cc: stable # v4.14+ Link: https://lore.kernel.org/r/20210414171251.14672-7-alexander.shishkin@linux.intel.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman commit e0d49e44953f0205d7ec49e140f8515af9f1fbdc Author: Filipe Manana Date: Tue Apr 20 10:55:12 2021 +0100 btrfs: fix metadata extent leak after failure to create subvolume commit 67addf29004c5be9fa0383c82a364bb59afc7f84 upstream. When creating a subvolume we allocate an extent buffer for its root node after starting a transaction. We setup a root item for the subvolume that points to that extent buffer and then attempt to insert the root item into the root tree - however if that fails, due to ENOMEM for example, we do not free the extent buffer previously allocated and we do not abort the transaction (as at that point we did nothing that can not be undone). This means that we effectively do not return the metadata extent back to the free space cache/tree and we leave a delayed reference for it which causes a metadata extent item to be added to the extent tree, in the next transaction commit, without having backreferences. When this happens 'btrfs check' reports the following: $ btrfs check /dev/sdi Opening filesystem to check... Checking filesystem on /dev/sdi UUID: dce2cb9d-025f-4b05-a4bf-cee0ad3785eb [1/7] checking root items [2/7] checking extents ref mismatch on [30425088 16384] extent item 1, found 0 backref 30425088 root 256 not referenced back 0x564a91c23d70 incorrect global backref count on 30425088 found 1 wanted 0 backpointer mismatch on [30425088 16384] owner ref check failed [30425088 16384] ERROR: errors found in extent allocation tree or chunk allocation [3/7] checking free space cache [4/7] checking fs roots [5/7] checking only csums items (without verifying data) [6/7] checking root refs [7/7] checking quota groups skipped (not enabled on this FS) found 212992 bytes used, error(s) found total csum bytes: 0 total tree bytes: 131072 total fs tree bytes: 32768 total extent tree bytes: 16384 btree space waste bytes: 124669 file data blocks allocated: 65536 referenced 65536 So fix this by freeing the metadata extent if btrfs_insert_root() returns an error. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 6adafbbb03a11ab7f5725c0b79b1bf90272ca242 Author: Maciej W. Rozycki Date: Wed Apr 14 12:38:28 2021 +0200 x86/build: Disable HIGHMEM64G selection for M486SX commit 0ef3439cd80ba7770723edb0470d15815914bb62 upstream. Fix a regression caused by making the 486SX separately selectable in Kconfig, for which the HIGHMEM64G setting has not been updated and therefore has become exposed as a user-selectable option for the M486SX configuration setting unlike with original M486 and all the other settings that choose non-PAE-enabled processors: High Memory Support > 1. off (NOHIGHMEM) 2. 4GB (HIGHMEM4G) 3. 64GB (HIGHMEM64G) choice[1-3?]: With the fix in place the setting is now correctly removed: High Memory Support > 1. off (NOHIGHMEM) 2. 4GB (HIGHMEM4G) choice[1-2?]: [ bp: Massage commit message. ] Fixes: 87d6021b8143 ("x86/math-emu: Limit MATH_EMULATION to 486SX compatibles") Signed-off-by: Maciej W. Rozycki Signed-off-by: Borislav Petkov Cc: stable@vger.kernel.org # v5.5+ Link: https://lkml.kernel.org/r/alpine.DEB.2.21.2104141221340.44318@angie.orcam.me.uk Signed-off-by: Greg Kroah-Hartman commit 3a736de3271abdd5aadafc4ad01009ce335286a3 Author: Qu Wenruo Date: Tue Aug 4 15:25:47 2020 +0800 btrfs: handle remount to no compress during compression commit 1d8ba9e7e785b6625f4d8e978e8a284b144a7077 upstream. [BUG] When running btrfs/071 with inode_need_compress() removed from compress_file_range(), we got the following crash: BUG: kernel NULL pointer dereference, address: 0000000000000018 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page Workqueue: btrfs-delalloc btrfs_work_helper [btrfs] RIP: 0010:compress_file_range+0x476/0x7b0 [btrfs] Call Trace: ? submit_compressed_extents+0x450/0x450 [btrfs] async_cow_start+0x16/0x40 [btrfs] btrfs_work_helper+0xf2/0x3e0 [btrfs] process_one_work+0x278/0x5e0 worker_thread+0x55/0x400 ? process_one_work+0x5e0/0x5e0 kthread+0x168/0x190 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x22/0x30 ---[ end trace 65faf4eae941fa7d ]--- This is already after the patch "btrfs: inode: fix NULL pointer dereference if inode doesn't need compression." [CAUSE] @pages is firstly created by kcalloc() in compress_file_extent(): pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS); Then passed to btrfs_compress_pages() to be utilized there: ret = btrfs_compress_pages(... pages, &nr_pages, ...); btrfs_compress_pages() will initialize each page as output, in zlib_compress_pages() we have: pages[nr_pages] = out_page; nr_pages++; Normally this is completely fine, but there is a special case which is in btrfs_compress_pages() itself: switch (type) { default: return -E2BIG; } In this case, we didn't modify @pages nor @out_pages, leaving them untouched, then when we cleanup pages, the we can hit NULL pointer dereference again: if (pages) { for (i = 0; i < nr_pages; i++) { WARN_ON(pages[i]->mapping); put_page(pages[i]); } ... } Since pages[i] are all initialized to zero, and btrfs_compress_pages() doesn't change them at all, accessing pages[i]->mapping would lead to NULL pointer dereference. This is not possible for current kernel, as we check inode_need_compress() before doing pages allocation. But if we're going to remove that inode_need_compress() in compress_file_extent(), then it's going to be a problem. [FIX] When btrfs_compress_pages() hits its default case, modify @out_pages to 0 to prevent such problem from happening. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212331 CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Josef Bacik Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit bb7d42ee6cd5822afc1e6c77c2c0d78fc000ad8f Author: Aurelien Aptel Date: Fri Apr 9 15:47:01 2021 +0200 smb2: fix use-after-free in smb2_ioctl_query_info() commit ccd48ec3d4a6cc595b2d9c5146e63b6c23546701 upstream. * rqst[1,2,3] is allocated in vars * each rqst->rq_iov is also allocated in vars or using pooled memory SMB2_open_free, SMB2_ioctl_free, SMB2_query_info_free are iterating on each rqst after vars has been freed (use-after-free), and they are freeing the kvec a second time (double-free). How to trigger: * compile with KASAN * mount a share $ smbinfo quota /mnt/foo Segmentation fault $ dmesg ================================================================== BUG: KASAN: use-after-free in SMB2_open_free+0x1c/0xa0 Read of size 8 at addr ffff888007b10c00 by task python3/1200 CPU: 2 PID: 1200 Comm: python3 Not tainted 5.12.0-rc6+ #107 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014 Call Trace: dump_stack+0x93/0xc2 print_address_description.constprop.0+0x18/0x130 ? SMB2_open_free+0x1c/0xa0 ? SMB2_open_free+0x1c/0xa0 kasan_report.cold+0x7f/0x111 ? smb2_ioctl_query_info+0x240/0x990 ? SMB2_open_free+0x1c/0xa0 SMB2_open_free+0x1c/0xa0 smb2_ioctl_query_info+0x2bf/0x990 ? smb2_query_reparse_tag+0x600/0x600 ? cifs_mapchar+0x250/0x250 ? rcu_read_lock_sched_held+0x3f/0x70 ? cifs_strndup_to_utf16+0x12c/0x1c0 ? rwlock_bug.part.0+0x60/0x60 ? rcu_read_lock_sched_held+0x3f/0x70 ? cifs_convert_path_to_utf16+0xf8/0x140 ? smb2_check_message+0x6f0/0x6f0 cifs_ioctl+0xf18/0x16b0 ? smb2_query_reparse_tag+0x600/0x600 ? cifs_readdir+0x1800/0x1800 ? selinux_bprm_creds_for_exec+0x4d0/0x4d0 ? do_user_addr_fault+0x30b/0x950 ? __x64_sys_openat+0xce/0x140 __x64_sys_ioctl+0xb9/0xf0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fdcf1f4ba87 Code: b3 66 90 48 8b 05 11 14 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 13 2c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffef1ce7748 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00000000c018cf07 RCX: 00007fdcf1f4ba87 RDX: 0000564c467c5590 RSI: 00000000c018cf07 RDI: 0000000000000003 RBP: 00007ffef1ce7770 R08: 00007ffef1ce7420 R09: 00007fdcf0e0562b R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000004018 R13: 0000000000000001 R14: 0000000000000003 R15: 0000564c467c5590 Allocated by task 1200: kasan_save_stack+0x1b/0x40 __kasan_kmalloc+0x7a/0x90 smb2_ioctl_query_info+0x10e/0x990 cifs_ioctl+0xf18/0x16b0 __x64_sys_ioctl+0xb9/0xf0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae Freed by task 1200: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0xe5/0x110 slab_free_freelist_hook+0x53/0x130 kfree+0xcc/0x320 smb2_ioctl_query_info+0x2ad/0x990 cifs_ioctl+0xf18/0x16b0 __x64_sys_ioctl+0xb9/0xf0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae The buggy address belongs to the object at ffff888007b10c00 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 0 bytes inside of 512-byte region [ffff888007b10c00, ffff888007b10e00) The buggy address belongs to the page: page:0000000044e14b75 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7b10 head:0000000044e14b75 order:2 compound_mapcount:0 compound_pincount:0 flags: 0x100000000010200(slab|head) raw: 0100000000010200 ffffea000015f500 0000000400000004 ffff888001042c80 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888007b10b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888007b10b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff888007b10c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888007b10c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888007b10d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Signed-off-by: Aurelien Aptel CC: Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit ee3b051cd948f3cc3e049f3b5049c3b0b7fc4b88 Author: Paulo Alcantara Date: Mon May 3 11:55:26 2021 -0300 cifs: fix regression when mounting shares with prefix paths commit 5c1acf3fe05ce443edba5e2110c9e581765f66a8 upstream. The commit 315db9a05b7a ("cifs: fix leak in cifs_smb3_do_mount() ctx") revealed an existing bug when mounting shares that contain a prefix path or DFS links. cifs_setup_volume_info() requires the @devname to contain the full path (UNC + prefix) to update the fs context with the new UNC and prepath values, however we were passing only the UNC path (old_ctx->UNC) in @device thus discarding any prefix paths. Instead of concatenating both old_ctx->{UNC,prepath} and pass it in @devname, just keep the dup'ed values of UNC and prepath in cifs_sb->ctx after calling smb3_fs_context_dup(), and fix smb3_parse_devname() to correctly parse and not leak the new UNC and prefix paths. Cc: # v5.11+ Fixes: 315db9a05b7a ("cifs: fix leak in cifs_smb3_do_mount() ctx") Signed-off-by: Paulo Alcantara (SUSE) Acked-by: David Disseldorp Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit eb877963081ca7fb1ccc37ff60c5716fa658d82f Author: Shyam Prasad N Date: Thu Apr 29 07:53:18 2021 +0000 cifs: detect dead connections only when echoes are enabled. commit f4916649f98e2c7bdba38c6597a98c456c17317d upstream. We can detect server unresponsiveness only if echoes are enabled. Echoes can be disabled under two scenarios: 1. The connection is low on credits, so we've disabled echoes/oplocks. 2. The connection has not seen any request till now (other than negotiate/sess-setup), which is when we enable these two, based on the credits available. So this fix will check for dead connection, only when echo is enabled. Signed-off-by: Shyam Prasad N CC: # v5.8+ Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 0049109550011245df45d4a2abf60960f3eac6af Author: David Disseldorp Date: Fri Apr 23 00:14:03 2021 +0200 cifs: fix leak in cifs_smb3_do_mount() ctx commit 315db9a05b7a56810728589baa930864107e4634 upstream. cifs_smb3_do_mount() calls smb3_fs_context_dup() and then cifs_setup_volume_info(). The latter's subsequent smb3_parse_devname() call overwrites the cifs_sb->ctx->UNC string already dup'ed by smb3_fs_context_dup(), resulting in a leak. E.g. unreferenced object 0xffff888002980420 (size 32): comm "mount", pid 160, jiffies 4294892541 (age 30.416s) hex dump (first 32 bytes): 5c 5c 31 39 32 2e 31 36 38 2e 31 37 34 2e 31 30 \\192.168.174.10 34 5c 72 61 70 69 64 6f 2d 73 68 61 72 65 00 00 4\rapido-share.. backtrace: [<00000000069e12f6>] kstrdup+0x28/0x50 [<00000000b61f4032>] smb3_fs_context_dup+0x127/0x1d0 [cifs] [<00000000c6e3e3bf>] cifs_smb3_do_mount+0x77/0x660 [cifs] [<0000000063467a6b>] smb3_get_tree+0xdf/0x220 [cifs] [<00000000716f731e>] vfs_get_tree+0x1b/0x90 [<00000000491d3892>] path_mount+0x62a/0x910 [<0000000046b2e774>] do_mount+0x50/0x70 [<00000000ca7b64dd>] __x64_sys_mount+0x81/0xd0 [<00000000b5122496>] do_syscall_64+0x33/0x40 [<000000002dd397af>] entry_SYSCALL_64_after_hwframe+0x44/0xae This change is a bandaid until the cifs_setup_volume_info() TODO and error handling issues are resolved. Signed-off-by: David Disseldorp Acked-by: Ronnie Sahlberg Reviewed-by: Paulo Alcantara (SUSE) CC: # v5.11+ Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit e9ac929c3ec9223535745b1df8764426af6b73a5 Author: Eugene Korenevsky Date: Fri Apr 16 10:35:30 2021 +0300 cifs: fix out-of-bound memory access when calling smb3_notify() at mount point commit a637f4ae037e1e0604ac008564934d63261a8fd1 upstream. If smb3_notify() is called at mount point of CIFS, build_path_from_dentry() returns the pointer to kmalloc-ed memory with terminating zero (this is empty FileName to be passed to SMB2 CREATE request). This pointer is assigned to the `path` variable. Then `path + 1` (to skip first backslash symbol) is passed to cifs_convert_path_to_utf16(). This is incorrect for empty path and causes out-of-bound memory access. Get rid of this "increase by one". cifs_convert_path_to_utf16() already contains the check for leading backslash in the path. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=212693 CC: # v5.6+ Signed-off-by: Eugene Korenevsky Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit b399c1a3ea0b9d10047ff266d65533df7f15532f Author: Paul Aurich Date: Tue Apr 13 14:25:27 2021 -0700 cifs: Return correct error code from smb2_get_enc_key commit 83728cbf366e334301091d5b808add468ab46b27 upstream. Avoid a warning if the error percolates back up: [440700.376476] CIFS VFS: \\otters.example.com crypt_message: Could not get encryption key [440700.386947] ------------[ cut here ]------------ [440700.386948] err = 1 [440700.386977] WARNING: CPU: 11 PID: 2733 at /build/linux-hwe-5.4-p6lk6L/linux-hwe-5.4-5.4.0/lib/errseq.c:74 errseq_set+0x5c/0x70 ... [440700.397304] CPU: 11 PID: 2733 Comm: tar Tainted: G OE 5.4.0-70-generic #78~18.04.1-Ubuntu ... [440700.397334] Call Trace: [440700.397346] __filemap_set_wb_err+0x1a/0x70 [440700.397419] cifs_writepages+0x9c7/0xb30 [cifs] [440700.397426] do_writepages+0x4b/0xe0 [440700.397444] __filemap_fdatawrite_range+0xcb/0x100 [440700.397455] filemap_write_and_wait+0x42/0xa0 [440700.397486] cifs_setattr+0x68b/0xf30 [cifs] [440700.397493] notify_change+0x358/0x4a0 [440700.397500] utimes_common+0xe9/0x1c0 [440700.397510] do_utimes+0xc5/0x150 [440700.397520] __x64_sys_utimensat+0x88/0xd0 Fixes: 61cfac6f267d ("CIFS: Fix possible use after free in demultiplex thread") Signed-off-by: Paul Aurich CC: stable@vger.kernel.org Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 3f72d3709f53af72835af7dc8b15ba61611a0e36 Author: He Ying Date: Fri Apr 23 04:35:16 2021 -0400 irqchip/gic-v3: Do not enable irqs when handling spurious interrups commit a97709f563a078e259bf0861cd259aa60332890a upstream. We triggered the following error while running our 4.19 kernel with the pseudo-NMI patches backported to it: [ 14.816231] ------------[ cut here ]------------ [ 14.816231] kernel BUG at irq.c:99! [ 14.816232] Internal error: Oops - BUG: 0 [#1] SMP [ 14.816232] Process swapper/0 (pid: 0, stack limit = 0x(____ptrval____)) [ 14.816233] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 4.19.95.aarch64 #14 [ 14.816233] Hardware name: evb (DT) [ 14.816234] pstate: 80400085 (Nzcv daIf +PAN -UAO) [ 14.816234] pc : asm_nmi_enter+0x94/0x98 [ 14.816235] lr : asm_nmi_enter+0x18/0x98 [ 14.816235] sp : ffff000008003c50 [ 14.816235] pmr_save: 00000070 [ 14.816237] x29: ffff000008003c50 x28: ffff0000095f56c0 [ 14.816238] x27: 0000000000000000 x26: ffff000008004000 [ 14.816239] x25: 00000000015e0000 x24: ffff8008fb916000 [ 14.816240] x23: 0000000020400005 x22: ffff0000080817cc [ 14.816241] x21: ffff000008003da0 x20: 0000000000000060 [ 14.816242] x19: 00000000000003ff x18: ffffffffffffffff [ 14.816243] x17: 0000000000000008 x16: 003d090000000000 [ 14.816244] x15: ffff0000095ea6c8 x14: ffff8008fff5ab40 [ 14.816244] x13: ffff8008fff58b9d x12: 0000000000000000 [ 14.816245] x11: ffff000008c8a200 x10: 000000008e31fca5 [ 14.816246] x9 : ffff000008c8a208 x8 : 000000000000000f [ 14.816247] x7 : 0000000000000004 x6 : ffff8008fff58b9e [ 14.816248] x5 : 0000000000000000 x4 : 0000000080000000 [ 14.816249] x3 : 0000000000000000 x2 : 0000000080000000 [ 14.816250] x1 : 0000000000120000 x0 : ffff0000095f56c0 [ 14.816251] Call trace: [ 14.816251] asm_nmi_enter+0x94/0x98 [ 14.816251] el1_irq+0x8c/0x180 (IRQ C) [ 14.816252] gic_handle_irq+0xbc/0x2e4 [ 14.816252] el1_irq+0xcc/0x180 (IRQ B) [ 14.816253] arch_timer_handler_virt+0x38/0x58 [ 14.816253] handle_percpu_devid_irq+0x90/0x240 [ 14.816253] generic_handle_irq+0x34/0x50 [ 14.816254] __handle_domain_irq+0x68/0xc0 [ 14.816254] gic_handle_irq+0xf8/0x2e4 [ 14.816255] el1_irq+0xcc/0x180 (IRQ A) [ 14.816255] arch_cpu_idle+0x34/0x1c8 [ 14.816255] default_idle_call+0x24/0x44 [ 14.816256] do_idle+0x1d0/0x2c8 [ 14.816256] cpu_startup_entry+0x28/0x30 [ 14.816256] rest_init+0xb8/0xc8 [ 14.816257] start_kernel+0x4c8/0x4f4 [ 14.816257] Code: 940587f1 d5384100 b9401001 36a7fd01 (d4210000) [ 14.816258] Modules linked in: start_dp(O) smeth(O) [ 15.103092] ---[ end trace 701753956cb14aa8 ]--- [ 15.103093] Kernel panic - not syncing: Fatal exception in interrupt [ 15.103099] SMP: stopping secondary CPUs [ 15.103100] Kernel Offset: disabled [ 15.103100] CPU features: 0x36,a2400218 [ 15.103100] Memory Limit: none which is cause by a 'BUG_ON(in_nmi())' in nmi_enter(). From the call trace, we can find three interrupts (noted A, B, C above): interrupt (A) is preempted by (B), which is further interrupted by (C). Subsequent investigations show that (B) results in nmi_enter() being called, but that it actually is a spurious interrupt. Furthermore, interrupts are reenabled in the context of (B), and (C) fires with NMI priority. We end-up with a nested NMI situation, something we definitely do not want to (and cannot) handle. The bug here is that spurious interrupts should never result in any state change, and we should just return to the interrupted context. Moving the handling of spurious interrupts as early as possible in the GICv3 handler fixes this issue. Fixes: 3f1f3234bc2d ("irqchip/gic-v3: Switch to PMR masking before calling IRQ handler") Acked-by: Mark Rutland Signed-off-by: He Ying [maz: rewrote commit message, corrected Fixes: tag] Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20210423083516.170111-1-heying24@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 9c0affe34c5cfdd8116817f7d0bbd6f0af136c5b Author: Ulf Hansson Date: Wed Mar 10 16:29:00 2021 +0100 mmc: core: Fix hanging on I/O during system suspend for removable cards commit 17a17bf50612e6048a9975450cf1bd30f93815b5 upstream. The mmc core uses a PM notifier to temporarily during system suspend, turn off the card detection mechanism for removal/insertion of (e)MMC/SD/SDIO cards. Additionally, the notifier may be used to remove an SDIO card entirely, if a corresponding SDIO functional driver don't have the system suspend/resume callbacks assigned. This behaviour has been around for a very long time. However, a recent bug report tells us there are problems with this approach. More precisely, when receiving the PM_SUSPEND_PREPARE notification, we may end up hanging on I/O to be completed, thus also preventing the system from getting suspended. In the end what happens, is that the cancel_delayed_work_sync() in mmc_pm_notify() ends up waiting for mmc_rescan() to complete - and since mmc_rescan() wants to claim the host, it needs to wait for the I/O to be completed first. Typically, this problem is triggered in Android, if there is ongoing I/O while the user decides to suspend, resume and then suspend the system again. This due to that after the resume, an mmc_rescan() work gets punted to the workqueue, which job is to verify that the card remains inserted after the system has resumed. To fix this problem, userspace needs to become frozen to suspend the I/O, prior to turning off the card detection mechanism. Therefore, let's drop the PM notifiers for mmc subsystem altogether and rely on the card detection to be turned off/on as a part of the system_freezable_wq, that we are already using. Moreover, to allow and SDIO card to be removed during system suspend, let's manage this from a ->prepare() callback, assigned at the mmc_host_class level. In this way, we can use the parent device (the mmc_host_class device), to remove the card device that is the child, in the device_prepare() phase. Reported-by: Kiwoong Kim Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Ulf Hansson Reviewed-by: Linus Walleij Link: https://lore.kernel.org/r/20210310152900.149380-1-ulf.hansson@linaro.org Reviewed-by: Kiwoong Kim Signed-off-by: Greg Kroah-Hartman commit 79c081d27cc7f9ebb181de6093ad87a7af5ab1ab Author: Seunghui Lee Date: Mon Feb 22 17:31:56 2021 +0900 mmc: core: Set read only for SD cards with permanent write protect bit commit 917a5336f2c27928be270226ab374ed0cbf3805d upstream. Some of SD cards sets permanent write protection bit in their CSD register, due to lifespan or internal problem. To avoid unnecessary I/O write operations, let's parse the bits in the CSD during initialization and mark the card as read only for this case. Signed-off-by: Seunghui Lee Link: https://lore.kernel.org/r/20210222083156.19158-1-sh043.lee@samsung.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 0549d960b0c7563be4a3eae62dac8d39ae0563a9 Author: DooHyun Hwang Date: Wed Feb 10 13:59:36 2021 +0900 mmc: core: Do a power cycle when the CMD11 fails commit 147186f531ae49c18b7a9091a2c40e83b3d95649 upstream. A CMD11 is sent to the SD/SDIO card to start the voltage switch procedure into 1.8V I/O. According to the SD spec a power cycle is needed of the card, if it turns out that the CMD11 fails. Let's fix this, to allow a retry of the initialization without the voltage switch, to succeed. Note that, whether it makes sense to also retry with the voltage switch after the power cycle is a bit more difficult to know. At this point, we treat it like the CMD11 isn't supported and therefore we skip it when retrying. Signed-off-by: DooHyun Hwang Link: https://lore.kernel.org/r/20210210045936.7809-1-dh0421.hwang@samsung.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 615bdc0dd81043d17e3492f09e116c6eb0b2577f Author: Avri Altman Date: Sun Apr 25 09:02:06 2021 +0300 mmc: block: Issue a cache flush only when it's enabled commit 97fce126e279690105ee15be652b465fd96f9997 upstream. In command queueing mode, the cache isn't flushed via the mmc_flush_cache() function, but instead by issuing a CMDQ_TASK_MGMT (CMD48) with a FLUSH_CACHE opcode. In this path, we need to check if cache has been enabled, before deciding to flush the cache, along the lines of what's being done in mmc_flush_cache(). To fix this problem, let's add a new bus ops callback ->cache_enabled() and implement it for the mmc bus type. In this way, the mmc block device driver can call it to know whether cache flushing should be done. Fixes: 1e8e55b67030 (mmc: block: Add CQE support) Cc: stable@vger.kernel.org Reported-by: Brendan Peter Signed-off-by: Avri Altman Tested-by: Brendan Peter Acked-by: Adrian Hunter Link: https://lore.kernel.org/r/20210425060207.2591-2-avri.altman@wdc.com Link: https://lore.kernel.org/r/20210425060207.2591-3-avri.altman@wdc.com [Ulf: Squashed the two patches and made some minor updates] Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 6d19fc6d44da37854f75661d41367abc78b3fa75 Author: Avri Altman Date: Tue Apr 20 16:46:41 2021 +0300 mmc: block: Update ext_csd.cache_ctrl if it was written commit aea0440ad023ab0662299326f941214b0d7480bd upstream. The cache function can be turned ON and OFF by writing to the CACHE_CTRL byte (EXT_CSD byte [33]). However, card->ext_csd.cache_ctrl is only set on init if cache size > 0. Fix that by explicitly setting ext_csd.cache_ctrl on ext-csd write. Signed-off-by: Avri Altman Acked-by: Adrian Hunter Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210420134641.57343-3-avri.altman@wdc.com Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit df2c44c9b41010189cbc8253465ba5c1f8301a5b Author: Aniruddha Tvs Rao Date: Wed Apr 7 10:46:17 2021 +0100 mmc: sdhci-tegra: Add required callbacks to set/clear CQE_EN bit commit 5ec6fa5a6dc5e42a4aa782f3a81d5f08b0fac1e6 upstream. CMD8 is not supported with Command Queue Enabled. Add required callback to clear CQE_EN and CQE_INTR fields in the host controller register before sending CMD8. Add corresponding callback in the CQHCI resume path to re-enable CQE_EN and CQE_INTR fields. Reported-by: Kamal Mostafa Tested-by: Kamal Mostafa Signed-off-by: Aniruddha Tvs Rao Signed-off-by: Jon Hunter Acked-by: Adrian Hunter Acked-by: Thierry Reding Link: https://lore.kernel.org/r/20210407094617.770495-1-jonathanh@nvidia.com Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit afc09d03bfaf1a9daf540255a971b3f235e80c0d Author: Adrian Hunter Date: Wed Mar 31 11:17:52 2021 +0300 mmc: sdhci-pci: Fix initialization of some SD cards for Intel BYT-based controllers commit 2970134b927834e9249659a70aac48e62dff804a upstream. Bus power may control card power, but the full reset done by SDHCI at initialization still may not reset the power, whereas a direct write to SDHCI_POWER_CONTROL can. That might be needed to initialize correctly, if the card was left powered on previously. Signed-off-by: Adrian Hunter Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210331081752.23621-1-adrian.hunter@intel.com Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 9ef5cb5025d1e11960a9b79d66694e91db5e3209 Author: Pradeep P V K Date: Wed Mar 3 14:02:11 2021 +0530 mmc: sdhci: Check for reset prior to DMA address unmap commit 21e35e898aa9ef7781632959db8613a5380f2eae upstream. For data read commands, SDHC may initiate data transfers even before it completely process the command response. In case command itself fails, driver un-maps the memory associated with data transfer but this memory can still be accessed by SDHC for the already initiated data transfer. This scenario can lead to un-mapped memory access error. To avoid this scenario, reset SDHC (when command fails) prior to un-mapping memory. Resetting SDHC ensures that all in-flight data transfers are either aborted or completed. So we don't run into this scenario. Swap the reset, un-map steps sequence in sdhci_request_done(). Suggested-by: Veerabhadrarao Badiganti Signed-off-by: Pradeep P V K Acked-by: Adrian Hunter Link: https://lore.kernel.org/r/1614760331-43499-1-git-send-email-pragalla@qti.qualcomm.com Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit d6e7fda496978f2763413b5523557b38dc2bf6c2 Author: Christophe JAILLET Date: Sat Feb 20 15:29:53 2021 +0100 mmc: uniphier-sd: Fix a resource leak in the remove function commit e29c84857e2d51aa017ce04284b962742fb97d9e upstream. A 'tmio_mmc_host_free()' call is missing in the remove function, in order to balance a 'tmio_mmc_host_alloc()' call in the probe. This is done in the error handling path of the probe, but not in the remove function. Add the missing call. Fixes: 3fd784f745dd ("mmc: uniphier-sd: add UniPhier SD/eMMC controller driver") Signed-off-by: Christophe JAILLET Reviewed-by: Masahiro Yamada Link: https://lore.kernel.org/r/20210220142953.918608-1-christophe.jaillet@wanadoo.fr Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 4b75f8a3b21b11f6b0655d8675c870799e5d4ed1 Author: Christophe JAILLET Date: Sat Feb 20 15:29:35 2021 +0100 mmc: uniphier-sd: Fix an error handling path in uniphier_sd_probe() commit b03aec1c1f337dfdae44cdb0645ecac34208ae0a upstream. A 'uniphier_sd_clk_enable()' call should be balanced by a corresponding 'uniphier_sd_clk_disable()' call. This is done in the remove function, but not in the error handling path of the probe. Add the missing call. Fixes: 3fd784f745dd ("mmc: uniphier-sd: add UniPhier SD/eMMC controller driver") Signed-off-by: Christophe JAILLET Reviewed-by: Masahiro Yamada Link: https://lore.kernel.org/r/20210220142935.918554-1-christophe.jaillet@wanadoo.fr Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit c8b5286c2154f847e63016bdad7800368378c726 Author: Sreekanth Reddy Date: Tue Mar 30 16:21:37 2021 +0530 scsi: mpt3sas: Block PCI config access from userspace during reset commit 3c8604691d2acc7b7d4795d9695070de9eaa5828 upstream. While diag reset is in progress there is short duration where all access to controller's PCI config space from the host needs to be blocked. This is due to a hardware limitation of the IOC controllers. Block all access to controller's config space from userland applications by calling pci_cfg_access_lock() while diag reset is in progress and unlocking it again after the controller comes back to ready state. Link: https://lore.kernel.org/r/20210330105137.20728-1-sreekanth.reddy@broadcom.com Cc: stable@vger.kernel.org #v5.4.108+ Signed-off-by: Sreekanth Reddy Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit e90e844c5b2b66ad17a7383f120778b1b31a06c9 Author: Sreekanth Reddy Date: Tue Mar 30 16:20:04 2021 +0530 scsi: mpt3sas: Only one vSES is present even when IOC has multi vSES commit 4c51f956965120b3441cdd39c358b87daba13e19 upstream. Whenever the driver is adding a vSES to virtual-phys list it is reinitializing the list head. Hence those vSES devices which were added previously are lost. Stop reinitializing the list every time a new vSES device is added. Link: https://lore.kernel.org/r/20210330105004.20413-1-sreekanth.reddy@broadcom.com Cc: stable@vger.kernel.org #v5.11.10+ Signed-off-by: Sreekanth Reddy Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit a73208e3244127ef9f2cdf24e4adb947aaa32053 Author: Arun Easi Date: Mon Mar 29 01:52:23 2021 -0700 scsi: qla2xxx: Fix crash in qla2xxx_mqueuecommand() commit 6641df81ab799f28a5d564f860233dd26cca0d93 upstream. RIP: 0010:kmem_cache_free+0xfa/0x1b0 Call Trace: qla2xxx_mqueuecommand+0x2b5/0x2c0 [qla2xxx] scsi_queue_rq+0x5e2/0xa40 __blk_mq_try_issue_directly+0x128/0x1d0 blk_mq_request_issue_directly+0x4e/0xb0 Fix incorrect call to free srb in qla2xxx_mqueuecommand(), as srb is now allocated by upper layers. This fixes smatch warning of srb unintended free. Link: https://lore.kernel.org/r/20210329085229.4367-7-njavali@marvell.com Fixes: af2a0c51b120 ("scsi: qla2xxx: Fix SRB leak on switch command timeout") Cc: stable@vger.kernel.org # 5.5 Reported-by: Laurence Oberman Reported-by: Dan Carpenter Reviewed-by: Himanshu Madhani Signed-off-by: Arun Easi Signed-off-by: Nilesh Javali Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit b86203bb433cba7c7c64b504d6f9d5fba2b3e73a Author: James Smart Date: Sun Apr 11 18:31:12 2021 -0700 scsi: lpfc: Fix rmmod crash due to bad ring pointers to abort_iotag commit 078c68b87a717b9fcd8e0f2109f73456fbc55490 upstream. Rmmod on SLI-4 adapters is sometimes hitting a bad ptr dereference in lpfc_els_free_iocb(). A prior patch refactored the lpfc_sli_abort_iocb() routine. One of the changes was to convert from building/sending an abort within the routine to using a common routine. The reworked routine passes, without modification, the pring ptr to the new common routine. The older routine had logic to check SLI-3 vs SLI-4 and adapt the pring ptr if necessary as callers were passing SLI-3 pointers even when not on an SLI-4 adapter. The new routine is missing this check and adapt, so the SLI-3 ring pointers are being used in SLI-4 paths. Fix by cleaning up the calling routines. In review, there is no need to pass the ring ptr argument to abort_iocb at all. The routine can look at the adapter type itself and reference the proper ring. Link: https://lore.kernel.org/r/20210412013127.2387-2-jsmart2021@gmail.com Fixes: db7531d2b377 ("scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlers") Cc: # v5.11+ Co-developed-by: Justin Tee Signed-off-by: Justin Tee Signed-off-by: James Smart Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 0f86d66b38501e3ac66cf2d9f9f8ad6838bad0e6 Author: Roman Bolshakov Date: Mon Apr 12 19:57:40 2021 +0300 scsi: qla2xxx: Reserve extra IRQ vectors commit f02d4086a8f36a0e1aaebf559b54cf24a177a486 upstream. Commit a6dcfe08487e ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs") lowers the number of allocated MSI-X vectors to the number of CPUs. That breaks vector allocation assumptions in qla83xx_iospace_config(), qla24xx_enable_msix() and qla2x00_iospace_config(). Either of the functions computes maximum number of qpairs as: ha->max_qpairs = ha->msix_count - 1 (MB interrupt) - 1 (default response queue) - 1 (ATIO, in dual or pure target mode) max_qpairs is set to zero in case of two CPUs and initiator mode. The number is then used to allocate ha->queue_pair_map inside qla2x00_alloc_queues(). No allocation happens and ha->queue_pair_map is left NULL but the driver thinks there are queue pairs available. qla2xxx_queuecommand() tries to find a qpair in the map and crashes: if (ha->mqenable) { uint32_t tag; uint16_t hwq; struct qla_qpair *qpair = NULL; tag = blk_mq_unique_tag(cmd->request); hwq = blk_mq_unique_tag_to_hwq(tag); qpair = ha->queue_pair_map[hwq]; # <- HERE if (qpair) return qla2xxx_mqueuecommand(host, cmd, qpair); } BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 72 Comm: kworker/u4:3 Tainted: G W 5.10.0-rc1+ #25 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Workqueue: scsi_wq_7 fc_scsi_scan_rport [scsi_transport_fc] RIP: 0010:qla2xxx_queuecommand+0x16b/0x3f0 [qla2xxx] Call Trace: scsi_queue_rq+0x58c/0xa60 blk_mq_dispatch_rq_list+0x2b7/0x6f0 ? __sbitmap_get_word+0x2a/0x80 __blk_mq_sched_dispatch_requests+0xb8/0x170 blk_mq_sched_dispatch_requests+0x2b/0x50 __blk_mq_run_hw_queue+0x49/0xb0 __blk_mq_delay_run_hw_queue+0xfb/0x150 blk_mq_sched_insert_request+0xbe/0x110 blk_execute_rq+0x45/0x70 __scsi_execute+0x10e/0x250 scsi_probe_and_add_lun+0x228/0xda0 __scsi_scan_target+0xf4/0x620 ? __pm_runtime_resume+0x4f/0x70 scsi_scan_target+0x100/0x110 fc_scsi_scan_rport+0xa1/0xb0 [scsi_transport_fc] process_one_work+0x1ea/0x3b0 worker_thread+0x28/0x3b0 ? process_one_work+0x3b0/0x3b0 kthread+0x112/0x130 ? kthread_park+0x80/0x80 ret_from_fork+0x22/0x30 The driver should allocate enough vectors to provide every CPU it's own HW queue and still handle reserved (MB, RSP, ATIO) interrupts. The change fixes the crash on dual core VM and prevents unbalanced QP allocation where nr_hw_queues is two less than the number of CPUs. Link: https://lore.kernel.org/r/20210412165740.39318-1-r.bolshakov@yadro.com Fixes: a6dcfe08487e ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs") Cc: Daniel Wagner Cc: Himanshu Madhani Cc: Quinn Tran Cc: Nilesh Javali Cc: Martin K. Petersen Cc: stable@vger.kernel.org # 5.11+ Reported-by: Aleksandr Volkov Reported-by: Aleksandr Miloserdov Reviewed-by: Daniel Wagner Reviewed-by: Himanshu Madhani Signed-off-by: Roman Bolshakov Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit e77fbc250f30e68fd8fb231b71f5084c0baf39f4 Author: Ilya Dryomov Date: Mon May 3 17:09:01 2021 +0200 libceph: allow addrvecs with a single NONE/blank address commit 3f1c6f2122fc780560f09735b6d1dbf39b44eb0f upstream. Normally, an unused OSD id/slot is represented by an empty addrvec. However, it also appears to be possible to generate an osdmap where an unused OSD id/slot has an addrvec with a single blank address of type NONE. Allow such addrvecs and make the end result be exactly the same as for the empty addrvec case -- leave addr intact. Cc: stable@vger.kernel.org # 5.11+ Signed-off-by: Ilya Dryomov Reviewed-by: Jeff Layton Signed-off-by: Greg Kroah-Hartman commit 6ce58cbf68808433ba30b0905d2f58aff219166f Author: Ilya Dryomov Date: Wed Apr 14 10:38:40 2021 +0200 libceph: bump CephXAuthenticate encoding version commit 7807dafda21a549403d922da98dde0ddfeb70d08 upstream. A dummy v3 encoding (exactly the same as v2) was introduced so that the monitors can distinguish broken clients that may not include their auth ticket in CEPHX_GET_AUTH_SESSION_KEY request on reconnects, thus failing to prove previous possession of their global_id (one part of CVE-2021-20288). The kernel client has always included its auth ticket, so it is compatible with enforcing mode as is. However we want to bump the encoding version to avoid having to authenticate twice on the initial connect -- all legacy (CephXAuthenticate < v3) are now forced do so in order to expose insecure global_id reclaim. Marking for stable since at least for 5.11 and 5.12 it is trivial (v2 -> v3). Cc: stable@vger.kernel.org # 5.11+ URL: https://tracker.ceph.com/issues/50452 Signed-off-by: Ilya Dryomov Reviewed-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit 77b96633f972a46e28c8fa0a74e5f835962a0780 Author: Tudor Ambarus Date: Thu Feb 18 15:09:50 2021 +0200 spi: spi-ti-qspi: Free DMA resources commit 1d309cd688a76fb733f0089d36dc630327b32d59 upstream. Release the RX channel and free the dma coherent memory when devm_spi_register_master() fails. Fixes: 5720ec0a6d26 ("spi: spi-ti-qspi: Add DMA support for QSPI mmap read") Cc: stable@vger.kernel.org Signed-off-by: Tudor Ambarus Link: https://lore.kernel.org/r/20210218130950.90155-1-tudor.ambarus@microchip.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit e312452b9570e0a186c692da9516aea0f7c094cf Author: Christophe Kerello Date: Mon Apr 19 14:15:39 2021 +0200 spi: stm32-qspi: fix pm_runtime usage_count counter commit 102e9d1936569d43f55dd1ea89be355ad207143c upstream. pm_runtime usage_count counter is not well managed. pm_runtime_put_autosuspend callback drops the usage_counter but this one has never been increased. Add pm_runtime_get_sync callback to bump up the usage counter. It is also needed to use pm_runtime_force_suspend and pm_runtime_force_resume APIs to handle properly the clock. Fixes: 9d282c17b023 ("spi: stm32-qspi: Add pm_runtime support") Signed-off-by: Christophe Kerello Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210419121541.11617-2-patrice.chotard@foss.st.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit aebbc29301b13436a599744b4834a07d9f53ba1d Author: Gao Xiang Date: Mon Mar 29 08:36:14 2021 +0800 erofs: add unsupported inode i_format check commit 24a806d849c0b0c1d0cd6a6b93ba4ae4c0ec9f08 upstream. If any unknown i_format fields are set (may be of some new incompat inode features), mark such inode as unsupported. Just in case of any new incompat i_format fields added in the future. Link: https://lore.kernel.org/r/20210329003614.6583-1-hsiangkao@aol.com Fixes: 431339ba9042 ("staging: erofs: add inode operations") Cc: # 4.19+ Signed-off-by: Gao Xiang Signed-off-by: Greg Kroah-Hartman commit 4d786870e3262ec098a3b4ed10b895176bc66ecb Author: Gustavo A. R. Silva Date: Fri Feb 12 04:40:22 2021 -0600 mtd: physmap: physmap-bt1-rom: Fix unintentional stack access commit 683313993dbe1651c7aa00bb42a041d70e914925 upstream. Cast &data to (char *) in order to avoid unintentionally accessing the stack. Notice that data is of type u32, so any increment to &data will be in the order of 4-byte chunks, and this piece of code is actually intended to be a byte offset. Fixes: b3e79e7682e0 ("mtd: physmap: Add Baikal-T1 physically mapped ROM support") Addresses-Coverity-ID: 1497765 ("Out-of-bounds access") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva Acked-by: Serge Semin Signed-off-by: Miquel Raynal Link: https://lore.kernel.org/linux-mtd/20210212104022.GA242669@embeddedor Signed-off-by: Greg Kroah-Hartman commit 676c4cd75307ca5b4cb15d840a497acc630643f7 Author: Kai Stuhlemmer (ebee Engineering) Date: Mon Mar 22 17:07:14 2021 +0200 mtd: rawnand: atmel: Update ecc_stats.corrected counter commit 33cebf701e98dd12b01d39d1c644387b27c1a627 upstream. Update MTD ECC statistics with the number of corrected bits. Fixes: f88fc122cc34 ("mtd: nand: Cleanup/rework the atmel_nand driver") Cc: stable@vger.kernel.org Signed-off-by: Kai Stuhlemmer (ebee Engineering) Signed-off-by: Tudor Ambarus Signed-off-by: Miquel Raynal Link: https://lore.kernel.org/linux-mtd/20210322150714.101585-1-tudor.ambarus@microchip.com Signed-off-by: Greg Kroah-Hartman commit 60015c2e48bf044fee0057d3728a4f5438e8b2c5 Author: Alexander Lobakin Date: Tue Mar 23 17:37:19 2021 +0000 mtd: spinand: core: add missing MODULE_DEVICE_TABLE() commit 25fefc88c71f47db0466570335e3f75f10952e7a upstream. The module misses MODULE_DEVICE_TABLE() for both SPI and OF ID tables and thus never autoloads on ID matches. Add the missing declarations. Present since day-0 of spinand framework introduction. Fixes: 7529df465248 ("mtd: nand: Add core infrastructure to support SPI NANDs") Cc: stable@vger.kernel.org # 4.19+ Signed-off-by: Alexander Lobakin Signed-off-by: Miquel Raynal Link: https://lore.kernel.org/linux-mtd/20210323173714.317884-1-alobakin@pm.me Signed-off-by: Greg Kroah-Hartman commit d08fcd5794ecfd39aeeacd66790180432ab85b15 Author: Tudor Ambarus Date: Fri Apr 2 11:20:30 2021 +0300 Revert "mtd: spi-nor: macronix: Add support for mx25l51245g" commit 46094049a49be777f12a9589798f7c70b90cd03f upstream. This reverts commit 04b8edad262eec0d153005973dfbdd83423c0dcb. mx25l51245g and mx66l51235l have the same flash ID. The flash detection returns the first entry in the flash_info array that matches the flash ID that was read, thus for the 0xc2201a ID, mx25l51245g was always hit, introducing a regression for mx66l51235l. If one wants to differentiate the flash names, a better fix would be to differentiate between the two at run-time, depending on SFDP, and choose the correct name from a list of flash names, depending on the SFDP differentiator. Fixes: 04b8edad262e ("mtd: spi-nor: macronix: Add support for mx25l51245g") Signed-off-by: Tudor Ambarus Acked-by: Pratyush Yadav Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210402082031.19055-2-tudor.ambarus@microchip.com Signed-off-by: Greg Kroah-Hartman commit e87cc955ca86af0c1e788d21e42f544fef2d06dd Author: Xiang Chen Date: Thu Apr 1 15:34:46 2021 +0800 mtd: spi-nor: core: Fix an issue of releasing resources during read/write commit be94215be1ab19e5d38f50962f611c88d4bfc83a upstream. If rmmod the driver during read or write, the driver will release the resources which are used during read or write, so it is possible to refer to NULL pointer. Use the testcase "mtd_debug read /dev/mtd0 0xc00000 0x400000 dest_file & sleep 0.5;rmmod spi_hisi_sfc_v3xx.ko", the issue can be reproduced in hisi_sfc_v3xx driver. To avoid the issue, fill the interface _get_device and _put_device of mtd_info to grab the reference to the spi controller driver module, so the request of rmmod the driver is rejected before read/write is finished. Fixes: b199489d37b2 ("mtd: spi-nor: add the framework for SPI NOR") Signed-off-by: Xiang Chen Signed-off-by: Yicong Yang Signed-off-by: Tudor Ambarus Tested-by: Michael Walle Tested-by: Tudor Ambarus Reviewed-by: Michael Walle Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1617262486-4223-1-git-send-email-yangyicong@hisilicon.com Signed-off-by: Greg Kroah-Hartman commit 704550d124f7b135ea6690968390b00db04d6872 Author: Jim Quinlan Date: Fri Apr 30 11:21:54 2021 -0400 reset: add missing empty function reset_control_rearm() commit 48582b2e3b87b794a9845d488af2c76ce055502b upstream. All other functions are defined for when CONFIG_RESET_CONTROLLER is not set. Fixes: 557acb3d2cd9 ("reset: make shared pulsed reset controls re-triggerable") Link: https://lore.kernel.org/r/20210430152156.21162-2-jim2101024@gmail.com Signed-off-by: Jim Quinlan Signed-off-by: Bjorn Helgaas Cc: stable@vger.kernel.org # v5.11+ Signed-off-by: Greg Kroah-Hartman commit f84f08617825b0033809a417b6a8d01b16387888 Author: Davidlohr Bueso Date: Thu May 6 18:04:07 2021 -0700 fs/epoll: restore waking from ep_done_scan() commit 7fab29e356309ff93a4b30ecc466129682ec190b upstream. Commit 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") changed the userspace visible behavior of exclusive waiters blocked on a common epoll descriptor upon a single event becoming ready. Previously, all tasks doing epoll_wait would awake, and now only one is awoken, potentially causing missed wakeups on applications that rely on this behavior, such as Apache Qpid. While the aforementioned commit aims at having only a wakeup single path in ep_poll_callback (with the exceptions of epoll_ctl cases), we need to restore the wakeup in what was the old ep_scan_ready_list() such that the next thread can be awoken, in a cascading style, after the waker's corresponding ep_send_events(). Link: https://lkml.kernel.org/r/20210405231025.33829-3-dave@stgolabs.net Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") Signed-off-by: Davidlohr Bueso Cc: Al Viro Cc: Jason Baron Cc: Roman Penyaev Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 3d5e1c964b6bdc3f4a6994a2f8ca6067f38be86b Author: Jeffrey Mitchell Date: Fri Feb 26 15:00:23 2021 -0600 ecryptfs: fix kernel panic with null dev_name commit 9046625511ad8dfbc8c6c2de16b3532c43d68d48 upstream. When mounting eCryptfs, a null "dev_name" argument to ecryptfs_mount() causes a kernel panic if the parsed options are valid. The easiest way to reproduce this is to call mount() from userspace with an existing eCryptfs mount's options and a "source" argument of 0. Error out if "dev_name" is null in ecryptfs_mount() Fixes: 237fead61998 ("[PATCH] ecryptfs: fs/Makefile and fs/Kconfig") Cc: stable@vger.kernel.org Signed-off-by: Jeffrey Mitchell Signed-off-by: Tyler Hicks Signed-off-by: Greg Kroah-Hartman commit 8f1999299af7c6c14b018ea81a82a429c5b67893 Author: Chunfeng Yun Date: Tue Mar 16 17:22:24 2021 +0800 arm64: dts: mt8173: fix property typo of 'phys' in dsi node commit e4e5d030bd779fb8321d3b8bd65406fbe0827037 upstream. Use 'phys' instead of 'phy'. Fixes: 81ad4dbaf7af ("arm64: dts: mt8173: Add display subsystem related nodes") Signed-off-by: Chunfeng Yun Reviewed-by: Chun-Kuang Hu Cc: stable Link: https://lore.kernel.org/r/20210316092232.9806-5-chunfeng.yun@mediatek.com Signed-off-by: Matthias Brugger Signed-off-by: Greg Kroah-Hartman commit 93dbb4f3911087e586827d6c7fc338fcb731769f Author: Marek Behún Date: Thu Jan 14 13:40:23 2021 +0100 arm64: dts: marvell: armada-37xx: add syscon compatible to NB clk node commit 1d88358a89dbac9c7d4559548b9a44840456e6fb upstream. Add "syscon" compatible to the North Bridge clocks node to allow the cpufreq driver to access these registers via syscon API. This is needed for a fix of cpufreq driver. Signed-off-by: Marek Behún Fixes: e8d66e7927b2 ("arm64: dts: marvell: armada-37xx: add nodes...") Cc: stable@vger.kernel.org Cc: Gregory CLEMENT Cc: Miquel Raynal Signed-off-by: Gregory CLEMENT Signed-off-by: Greg Kroah-Hartman commit 578f7183fd1d4d0e2c2a0915ab7a208b0bb38567 Author: Ard Biesheuvel Date: Fri Feb 5 19:23:00 2021 +0100 ARM: 9056/1: decompressor: fix BSS size calculation for LLVM ld.lld commit c4e792d1acce31c2eb7b9193ab06ab94de05bf42 upstream. The LLVM ld.lld linker uses a different symbol type for __bss_start, resulting in the calculation of KBSS_SZ to be thrown off. Up until now, this has gone unnoticed as it only affects the appended DTB case, but pending changes for ARM in the way the decompressed kernel is cleaned from the caches has uncovered this problem. On a ld.lld build: $ nm vmlinux |grep bss_ c1c22034 D __bss_start c1c86e98 B __bss_stop resulting in $ readelf -s arch/arm/boot/compressed/vmlinux | grep bss_size 433: c1c86e98 0 NOTYPE GLOBAL DEFAULT ABS _kernel_bss_size which is obviously incorrect, and may cause the cache clean to access unmapped memory, or cause the size calculation to wrap, resulting in no cache clean to be performed at all. Fix this by updating the sed regex to take D type symbols into account. Link: https://lore.kernel.org/linux-arm-kernel/6c65bcef-d4e7-25fa-43cf-2c435bb61bb9@collabora.com/ Link: https://lore.kernel.org/linux-arm-kernel/20210205085220.31232-1-ardb@kernel.org/ Cc: # v4.19+ Reviewed-by: Nick Desaulniers Tested-by: Nick Desaulniers Reported-by: Guillaume Tucker Reported-by: "kernelci.org bot" Signed-off-by: Ard Biesheuvel Signed-off-by: Russell King Signed-off-by: Greg Kroah-Hartman commit cdd107d7f18158d966c2bc136204fe826dac445c Author: Steven Rostedt (VMware) Date: Wed May 5 10:38:24 2021 -0400 ftrace: Handle commands when closing set_ftrace_filter file commit 8c9af478c06bb1ab1422f90d8ecbc53defd44bc3 upstream. # echo switch_mm:traceoff > /sys/kernel/tracing/set_ftrace_filter will cause switch_mm to stop tracing by the traceoff command. # echo -n switch_mm:traceoff > /sys/kernel/tracing/set_ftrace_filter does nothing. The reason is that the parsing in the write function only processes commands if it finished parsing (there is white space written after the command). That's to handle: write(fd, "switch_mm:", 10); write(fd, "traceoff", 8); cases, where the command is broken over multiple writes. The problem is if the file descriptor is closed, then the write call is not processed, and the command needs to be processed in the release code. The release code can handle matching of functions, but does not handle commands. Cc: stable@vger.kernel.org Fixes: eda1e32855656 ("tracing: handle broken names in ftrace filter") Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit b5e5faf8f60709c7032d9e51c9806330a8536c65 Author: Mark Langsdorf Date: Tue Apr 27 13:54:33 2021 -0500 ACPI: custom_method: fix a possible memory leak commit 1cfd8956437f842836e8a066b40d1ec2fc01f13e upstream. In cm_write(), if the 'buf' is allocated memory but not fully consumed, it is possible to reallocate the buffer without freeing it by passing '*ppos' as 0 on a subsequent call. Add an explicit kfree() before kzalloc() to prevent the possible memory leak. Fixes: 526b4af47f44 ("ACPI: Split out custom_method functionality into an own driver") Signed-off-by: Mark Langsdorf Cc: 5.4+ # 5.4+ Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit b7a5baaae212a686ceb812c32fceed79c03c0234 Author: Mark Langsdorf Date: Fri Apr 23 10:28:17 2021 -0500 ACPI: custom_method: fix potential use-after-free issue commit e483bb9a991bdae29a0caa4b3a6d002c968f94aa upstream. In cm_write(), buf is always freed when reaching the end of the function. If the requested count is less than table.length, the allocated buffer will be freed but subsequent calls to cm_write() will still try to access it. Remove the unconditional kfree(buf) at the end of the function and set the buf to NULL in the -EINVAL error path to match the rest of function. Fixes: 03d1571d9513 ("ACPI: custom_method: fix memory leaks") Signed-off-by: Mark Langsdorf Cc: 5.4+ # 5.4+ Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit a9571131d8899438095c88940b69601ced7c7df1 Author: Stefan Berger Date: Wed Mar 10 17:19:15 2021 -0500 tpm: acpi: Check eventlog signature before using it commit 3dcd15665aca80197333500a4be3900948afccc1 upstream. Check the eventlog signature before using it. This avoids using an empty log, as may be the case when QEMU created the ACPI tables, rather than probing the EFI log next. This resolves an issue where the EFI log was empty since an empty ACPI log was used. Cc: stable@vger.kernel.org Fixes: 85467f63a05c ("tpm: Add support for event log pointer found in TPM2 ACPI table") Signed-off-by: Stefan Berger Reviewed-by: Jarkko Sakkinen Signed-off-by: Jarkko Sakkinen Signed-off-by: Greg Kroah-Hartman commit 93dbbf20e3ffad14f04227a0b7105f6e6f0387ce Author: Jason Wang Date: Tue Apr 13 17:15:57 2021 +0800 vhost-vdpa: fix vm_flags for virtqueue doorbell mapping commit 3a3e0fad16d40a2aa68ddf7eea4acdf48b22dd44 upstream. The virtqueue doorbell is usually implemented via registeres but we don't provide the necessary vma->flags like VM_PFNMAP. This may cause several issues e.g when userspace tries to map the doorbell via vhost IOTLB, kernel may panic due to the page is not backed by page structure. This patch fixes this by setting the necessary vm_flags. With this patch, try to map doorbell via IOTLB will fail with bad address. Cc: stable@vger.kernel.org Fixes: ddd89d0a059d ("vhost_vdpa: support doorbell mapping via mmap") Signed-off-by: Jason Wang Link: https://lore.kernel.org/r/20210413091557.29008-1-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin Signed-off-by: Greg Kroah-Hartman commit 5b6e6498e9b9e3605d5eeb83307f04197d0bd936 Author: Vineeth Vijayan Date: Fri Apr 23 12:08:43 2021 +0200 s390/cio: remove invalid condition on IO_SCH_UNREG commit 2f7484fd73729f89085fe08d683f5a8d9e17fe99 upstream. The condition to check the cdev pointer validity on css_sch_device_unregister() is a leftover from the 'commit 8cc0dcfdc1c0 ("s390/cio: remove pm support from ccw bus driver")'. This could lead to a situation, where detaching the device is not happening completely. Remove this invalid condition in the IO_SCH_UNREG case. Link: https://lore.kernel.org/r/20210423100843.2230969-1-vneethv@linux.ibm.com Fixes: 8cc0dcfdc1c0 ("s390/cio: remove pm support from ccw bus driver") Reported-by: Christian Ehrhardt Suggested-by: Christian Ehrhardt Cc: Signed-off-by: Vineeth Vijayan Tested-by: Julian Wiedmann Reviewed-by: Peter Oberparleiter Tested-by: Christian Ehrhardt Signed-off-by: Heiko Carstens Signed-off-by: Greg Kroah-Hartman commit 32bff908a7c15b075510be580e598a9a83babd05 Author: Tony Krowiak Date: Thu Mar 25 08:46:40 2021 -0400 s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks commit 0cc00c8d40500c4c8fe058dc014bdaf44a82f4f7 upstream. This patch fixes a lockdep splat introduced by commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated"). The lockdep splat only occurs when starting a Secure Execution guest. Crypto virtualization (vfio_ap) is not yet supported for SE guests; however, in order to avoid this problem when support becomes available, this fix is being provided. The circular locking dependency was introduced when the setting of the masks in the guest's APCB was executed while holding the matrix_dev->lock. While the lock is definitely needed to protect the setting/unsetting of the matrix_mdev->kvm pointer, it is not necessarily critical for setting the masks; so, the matrix_dev->lock will be released while the masks are being set or cleared. Keep in mind, however, that another process that takes the matrix_dev->lock can get control while the masks in the guest's APCB are being set or cleared as a result of the driver being notified that the KVM pointer has been set or unset. This could result in invalid access to the matrix_mdev->kvm pointer by the intervening process. To avoid this scenario, two new fields are being added to the ap_matrix_mdev struct: struct ap_matrix_mdev { ... bool kvm_busy; wait_queue_head_t wait_for_kvm; ... }; The functions that handle notification that the KVM pointer value has been set or cleared will set the kvm_busy flag to true until they are done processing at which time they will set it to false and wake up the tasks on the matrix_mdev->wait_for_kvm wait queue. Functions that require access to matrix_mdev->kvm will sleep on the wait queue until they are awakened at which time they can safely access the matrix_mdev->kvm field. Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated") Cc: stable@vger.kernel.org Signed-off-by: Tony Krowiak Signed-off-by: Heiko Carstens Signed-off-by: Greg Kroah-Hartman commit 971dc8706cee47393d393905d294ea47e39503d3 Author: Harald Freudenberger Date: Thu Apr 15 11:22:03 2021 +0200 s390/zcrypt: fix zcard and zqueue hot-unplug memleak commit 70fac8088cfad9f3b379c9082832b4d7532c16c2 upstream. Tests with kvm and a kmemdebug kernel showed, that on hot unplug the zcard and zqueue structs for the unplugged card or queue are not properly freed because of a mismatch with get/put for the embedded kref counter. This fix now adjusts the handling of the kref counters. With init the kref counter starts with 1. This initial value needs to drop to zero with the unregister of the card or queue to trigger the release and free the object. Fixes: 29c2680fd2bf ("s390/ap: fix ap devices reference counting") Reported-by: Marc Hartmayer Signed-off-by: Harald Freudenberger Cc: stable@vger.kernel.org Reviewed-by: Julian Wiedmann Signed-off-by: Heiko Carstens Signed-off-by: Greg Kroah-Hartman commit aa34d125b8d241faf292a722f04ef289de7ce192 Author: Vasily Gorbik Date: Tue Apr 20 11:04:10 2021 +0200 s390/disassembler: increase ebpf disasm buffer size commit 6f3353c2d2b3eb4de52e9704cb962712033db181 upstream. Current ebpf disassembly buffer size of 64 is too small. E.g. this line takes 65 bytes: 01fffff8005822e: ec8100ed8065\tclgrj\t%r8,%r1,8,001fffff80058408\n\0 Double the buffer size like it is done for the kernel disassembly buffer. Fixes the following KASAN finding: UG: KASAN: stack-out-of-bounds in print_fn_code+0x34c/0x380 Write of size 1 at addr 001fff800ad5f970 by task test_progs/853 CPU: 53 PID: 853 Comm: test_progs Not tainted 5.12.0-rc7-23786-g23457d86b1f0-dirty #19 Hardware name: IBM 3906 M04 704 (LPAR) Call Trace: [<0000000cd8e0538a>] show_stack+0x17a/0x1668 [<0000000cd8e2a5d8>] dump_stack+0x140/0x1b8 [<0000000cd8e16e74>] print_address_description.constprop.0+0x54/0x260 [<0000000cd75a8698>] kasan_report+0xc8/0x130 [<0000000cd6e26da4>] print_fn_code+0x34c/0x380 [<0000000cd6ea0f4e>] bpf_int_jit_compile+0xe3e/0xe58 [<0000000cd72c4c88>] bpf_prog_select_runtime+0x5b8/0x9c0 [<0000000cd72d1bf8>] bpf_prog_load+0xa78/0x19c0 [<0000000cd72d7ad6>] __do_sys_bpf.part.0+0x18e/0x768 [<0000000cd6e0f392>] do_syscall+0x12a/0x220 [<0000000cd8e333f8>] __do_syscall+0x98/0xc8 [<0000000cd8e54834>] system_call+0x6c/0x94 1 lock held by test_progs/853: #0: 0000000cd9bf7460 (report_lock){....}-{2:2}, at: kasan_report+0x96/0x130 addr 001fff800ad5f970 is located in stack of task test_progs/853 at offset 96 in frame: print_fn_code+0x0/0x380 this frame has 1 object: [32, 96) 'buffer' Memory state around the buggy address: 001fff800ad5f800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 001fff800ad5f880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >001fff800ad5f900: 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00 f3 f3 ^ 001fff800ad5f980: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 001fff800ad5fa00: 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 Cc: Reviewed-by: Heiko Carstens Signed-off-by: Vasily Gorbik Signed-off-by: Heiko Carstens Signed-off-by: Greg Kroah-Hartman commit bea198f467262884f9fa320e1b325181d04144c2 Author: Shuo Chen Date: Wed Apr 14 14:24:00 2021 -0700 dyndbg: fix parsing file query without a line-range suffix commit 7b1ae248279bea33af9e797a93c35f49601cb8a0 upstream. Query like 'file tcp_input.c line 1234 +p' was broken by commit aaebe329bff0 ("dyndbg: accept 'file foo.c:func1' and 'file foo.c:10-100'") because a file name without a ':' now makes the loop in ddebug_parse_query() exits early before parsing the 'line 1234' part. As a result, all pr_debug() in tcp_input.c will be enabled, instead of only the one on line 1234. Changing 'break' to 'continue' fixes this. Fixes: aaebe329bff0 ("dyndbg: accept 'file foo.c:func1' and 'file foo.c:10-100'") Cc: stable Reviewed-by: Eric Dumazet Signed-off-by: Shuo Chen Acked-by: Jason Baron Link: https://lore.kernel.org/r/20210414212400.2927281-1-giantchen@gmail.com Signed-off-by: Greg Kroah-Hartman commit 3494c68d79cbb7ddff88fd35e0796343ef736606 Author: Mathias Krause Date: Thu Apr 29 19:59:41 2021 +0300 nitro_enclaves: Fix stale file descriptors on failed usercopy commit f1ce3986baa62cffc3c5be156994de87524bab99 upstream. A failing usercopy of the slot uid will lead to a stale entry in the file descriptor table as put_unused_fd() won't release it. This enables userland to refer to a dangling 'file' object through that still valid file descriptor, leading to all kinds of use-after-free exploitation scenarios. Exchanging put_unused_fd() for close_fd(), ksys_close() or alike won't solve the underlying issue, as the file descriptor might have been replaced in the meantime, e.g. via userland calling close() on it (leading to a NULL pointer dereference in the error handling code as 'fget(enclave_fd)' will return a NULL pointer) or by dup2()'ing a completely different file object to that very file descriptor, leading to the same situation: a dangling file descriptor pointing to a freed object -- just in this case to a file object of user's choosing. Generally speaking, after the call to fd_install() the file descriptor is live and userland is free to do whatever with it. We cannot rely on it to still refer to our enclave object afterwards. In fact, by abusing userfaultfd() userland can hit the condition without any racing and abuse the error handling in the nitro code as it pleases. To fix the above issues, defer the call to fd_install() until all possible errors are handled. In this case it's just the usercopy, so do it directly in ne_create_vm_ioctl() itself. Signed-off-by: Mathias Krause Signed-off-by: Andra Paraschiv Cc: stable Link: https://lore.kernel.org/r/20210429165941.27020-2-andraprs@amazon.com Signed-off-by: Greg Kroah-Hartman commit a99b661c3187365f81026d89b1133a76cd2652b3 Author: Loic Poulain Date: Fri Feb 26 11:53:02 2021 +0100 bus: mhi: core: Fix invalid error returning in mhi_queue commit 0ecc1c70dcd32c0f081b173a1a5d89952686f271 upstream. mhi_queue returns an error when the doorbell is not accessible in the current state. This can happen when the device is in non M0 state, like M3, and needs to be waken-up prior ringing the DB. This case is managed earlier by triggering an asynchronous M3 exit via controller resume/suspend callbacks, that in turn will cause M0 transition and DB update. So, since it's not an error but just delaying of doorbell update, there is no reason to return an error. This also fixes a use after free error for skb case, indeed a caller queuing skb will try to free the skb if the queueing fails, but in that case queueing has been done. Fixes: a8f75cb348fd ("mhi: core: Factorize mhi queuing") Signed-off-by: Loic Poulain Reviewed-by: Jeffrey Hugo Reviewed-by: Bhaumik Bhatt Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1614336782-5809-1-git-send-email-loic.poulain@linaro.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Greg Kroah-Hartman commit 1c1b5a437fc7d0f0d7a4e0dee53d998fece4986a Author: Loic Poulain Date: Tue Apr 6 11:11:54 2021 +0200 bus: mhi: core: Fix MHI runtime_pm behavior commit 4547a749be997eb12ea7edcf361ec2a5329f7aec upstream. This change ensures that PM reference is always get during packet queueing and released either after queuing completion (RX) or once the buffer has been consumed (TX). This guarantees proper update for underlying MHI controller runtime status (e.g. last_busy timestamp) and prevents suspend to be triggered while TX packets are flying, or before we completed update of the RX ring. Signed-off-by: Loic Poulain Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1617700315-12492-1-git-send-email-loic.poulain@linaro.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Greg Kroah-Hartman commit ed541cff35cbdb695f0c98ef506dd7218883fc07 Author: Loic Poulain Date: Wed Feb 24 11:18:50 2021 +0100 bus: mhi: pci_generic: Remove WQ_MEM_RECLAIM flag from state workqueue commit 0fccbf0a3b690b162f53b13ed8bc442ea33437dc upstream. A recent change created a dedicated workqueue for the state-change work with WQ_HIGHPRI (no strong reason for that) and WQ_MEM_RECLAIM flags, but the state-change work (mhi_pm_st_worker) does not guarantee forward progress under memory pressure, and will even wait on various memory allocations when e.g. creating devices, loading firmware, etc... The work is then not part of a memory reclaim path... Moreover, this causes a warning in check_flush_dependency() since we end up in code that flushes a non-reclaim workqueue: [ 40.969601] workqueue: WQ_MEM_RECLAIM mhi_hiprio_wq:mhi_pm_st_worker [mhi] is flushing !WQ_MEM_RECLAIM events_highpri:flush_backlog [ 40.969612] WARNING: CPU: 4 PID: 158 at kernel/workqueue.c:2607 check_flush_dependency+0x11c/0x140 [ 40.969733] Call Trace: [ 40.969740] __flush_work+0x97/0x1d0 [ 40.969745] ? wake_up_process+0x15/0x20 [ 40.969749] ? insert_work+0x70/0x80 [ 40.969750] ? __queue_work+0x14a/0x3e0 [ 40.969753] flush_work+0x10/0x20 [ 40.969756] rollback_registered_many+0x1c9/0x510 [ 40.969759] unregister_netdevice_queue+0x94/0x120 [ 40.969761] unregister_netdev+0x1d/0x30 [ 40.969765] mhi_net_remove+0x1a/0x40 [mhi_net] [ 40.969770] mhi_driver_remove+0x124/0x250 [mhi] [ 40.969776] device_release_driver_internal+0xf0/0x1d0 [ 40.969778] device_release_driver+0x12/0x20 [ 40.969782] bus_remove_device+0xe1/0x150 [ 40.969786] device_del+0x17b/0x3e0 [ 40.969791] mhi_destroy_device+0x9a/0x100 [mhi] [ 40.969796] ? mhi_unmap_single_use_bb+0x50/0x50 [mhi] [ 40.969799] device_for_each_child+0x5e/0xa0 [ 40.969804] mhi_pm_st_worker+0x921/0xf50 [mhi] Fixes: 8f7039787687 ("bus: mhi: core: Move to using high priority workqueue") Signed-off-by: Loic Poulain Reviewed-by: Bhaumik Bhatt Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1614161930-8513-1-git-send-email-loic.poulain@linaro.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Greg Kroah-Hartman commit 1ec103ea8684e5ae632ed5dce878332987f7475c Author: Bhaumik Bhatt Date: Tue Mar 9 10:44:50 2021 -0800 bus: mhi: core: Add missing checks for MMIO register entries commit 8de5ad99414347ad08e6ebc2260be1d2e009cb9a upstream. As per documentation, fields marked as (required) in an MHI controller structure need to be populated by the controller driver before calling mhi_register_controller(). Ensure all required pointers and non-zero fields are present in the controller before proceeding with the registration. Signed-off-by: Bhaumik Bhatt Reviewed-by: Jeffrey Hugo Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1615315490-36017-1-git-send-email-bbhatt@codeaurora.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Greg Kroah-Hartman commit 76879a980cd5ede4cb9a638999fb80d37bc09db5 Author: Jeffrey Hugo Date: Wed Mar 10 14:30:55 2021 -0700 bus: mhi: core: Sanity check values from remote device before use commit ec32332df7645e0ba463a08d483fe97665167071 upstream. When parsing the structures in the shared memory, there are values which come from the remote device. For example, a transfer completion event will have a pointer to the tre in the relevant channel's transfer ring. As another example, event ring elements may specify a channel in which the event occurred, however the specified channel value may not be valid as no channel is defined at that index even though the index may be less than the maximum allowed index. Such values should be considered to be untrusted, and validated before use. If we blindly use such values, we may access invalid data or crash if the values are corrupted. If validation fails, drop the relevant event. Signed-off-by: Jeffrey Hugo Reviewed-by: Manivannan Sadhasivam Reviewed-by: Hemant Kumar Link: https://lore.kernel.org/r/1615411855-15053-1-git-send-email-jhugo@codeaurora.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Greg Kroah-Hartman commit 1b43b43170c7b24e9249c2ba2db7214a2f6b1f2f Author: Bhaumik Bhatt Date: Thu Apr 1 14:16:15 2021 -0700 bus: mhi: core: Clear configuration from channel context during reset commit 47705c08465931923e2f2b506986ca0bdf80380d upstream. When clearing up the channel context after client drivers are done using channels, the configuration is currently not being reset entirely. Ensure this is done to appropriately handle issues where clients unaware of the context state end up calling functions which expect a context. Signed-off-by: Bhaumik Bhatt Reviewed-by: Hemant Kumar Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1617311778-1254-7-git-send-email-bbhatt@codeaurora.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Greg Kroah-Hartman commit 99b8f10d0216c6006cab3b24811f9e96936ce549 Author: Jeffrey Hugo Date: Fri Feb 12 14:27:23 2021 -0700 bus: mhi: core: Fix check for syserr at power_up commit 6403298c58d4858d93648f553abf0bcbd2dfaca2 upstream. The check to see if we have reset the device after detecting syserr at power_up is inverted. wait_for_event_timeout() returns 0 on failure, and a positive value on success. The check is looking for non-zero as a failure, which is likely to incorrectly cause a device init failure if syserr was detected at power_up. Fix this. Fixes: e18d4e9fa79b ("bus: mhi: core: Handle syserr during power_up") Signed-off-by: Jeffrey Hugo Reviewed-by: Loic Poulain Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1613165243-23359-1-git-send-email-jhugo@codeaurora.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Greg Kroah-Hartman