aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2023-12-20posix-timers: Get rid of [COMPAT_]SYS_NI() usesHEADmasterLinus Torvalds6-96/+19
Only the posix timer system calls use this (when the posix timer support is disabled, which does not actually happen in any normal case), because they had debug code to print out a warning about missing system calls. Get rid of that special case, and just use the standard COND_SYSCALL interface that creates weak system call stubs that return -ENOSYS for when the system call does not exist. This fixes a kCFI issue with the SYS_NI() hackery: CFI failure at int80_emulation+0x67/0xb0 (target: sys_ni_posix_timers+0x0/0x70; expected type: 0xb02b34d9) WARNING: CPU: 0 PID: 48 at int80_emulation+0x67/0xb0 Reported-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-20Merge tag '6.7-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds9-72/+93
Pull smb client fixes from Steve French: - two multichannel reconnect fixes, one fixing an important refcounting problem that can lead to umount problems - atime fix - five fixes for various potential OOB accesses, including a CVE fix, and two additional fixes for problems pointed out by Robert Morris's fuzzing investigation * tag '6.7-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: do not let cifs_chan_update_iface deallocate channels cifs: fix a pending undercount of srv_count fs: cifs: Fix atime update check smb: client: fix potential OOB in smb2_dump_detail() smb: client: fix potential OOB in cifs_dump_detail() smb: client: fix OOB in smbCalcSize() smb: client: fix OOB in SMB2_query_info_init() smb: client: fix OOB in cifsd when receiving compounded resps
2023-12-20Merge tag 's390-6.7-4' of ↵Linus Torvalds5-14/+16
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Alexander Gordeev: - Fix virtual vs physical address confusion in Storage Class Memory (SCM) block device driver. - Fix saving and restoring of FPU kernel context, which could lead to corruption of vector registers 8-15 - Update defconfigs * tag 's390-6.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: update defconfigs s390/vx: fix save/restore of fpu kernel context s390/scm: fix virtual vs physical address confusion
2023-12-20Merge tag 'soc-fixes-6.7-2' of ↵Linus Torvalds10-13/+32
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "There are only a handful of bugfixes this time, which feels almost too small, so I hope we are not missing something important. - One more mediatek dts warning fix after the previous larger set, this should finally result in a clean defconfig build. - TI OMAP dts fixes for a spurious hang on am335x and invalid data on DTA7 - One DTS fix for ethernet on Oriange Pi Zero (Allwinner H616) - A regression fix for ti-sysc interconnect target module driver to not access registers after reset if srst_udelay quirk is needed - Reset controller driver fixes for a crash during error handling and a build warning" * tag 'soc-fixes-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: arm64: dts: mediatek: mt8395-genio-1200-evk: add interrupt-parent for mt6360 ARM: dts: Fix occasional boot hang for am3 usb reset: Fix crash when freeing non-existent optional resets ARM: OMAP2+: Fix null pointer dereference and memory leak in omap_soc_device_init ARM: dts: dra7: Fix DRA7 L3 NoC node register size bus: ti-sysc: Flush posted write only after srst_udelay reset: hisilicon: hi6220: fix Wvoid-pointer-to-enum-cast warning arm64: dts: allwinner: h616: update emac for Orange Pi Zero 3
2023-12-20Merge tag 'platform-drivers-x86-v6.7-5' of ↵Linus Torvalds5-34/+131
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform drivers fixes from Ilpo Järvinen: - Fan reporting on some ThinkPads - Laptop 13 spurious keypresses while suspended - Intel PMC correction to avoid crash * tag 'platform-drivers-x86-v6.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/amd/pmc: Disable keyboard wakeup on AMD Framework 13 platform/x86/amd/pmc: Move keyboard wakeup disablement detection to pmc-quirks platform/x86/amd/pmc: Only run IRQ1 firmware version check on Cezanne platform/x86/amd/pmc: Move platform defines to header platform/x86/intel/pmc: Fix hang in pmc_core_send_ltr_ignore() platform/x86: thinkpad_acpi: fix for incorrect fan reporting on some ThinkPad systems
2023-12-20Merge tag 'ovl-fixes-6.7-rc7' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull overlayfs fix from Amir Goldstein: "Fix a regression from this merge window" * tag 'ovl-fixes-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: ovl: fix dentry reference leak after changes to underlying layers
2023-12-20Merge tag 'bcachefs-2023-12-19' of https://evilpiepirate.org/git/bcachefsLinus Torvalds9-28/+70
Pull more bcachefs fixes from Kent Overstreet: - Fix a deadlock in the data move path with nocow locks (vs. update in place writes); when trylock failed we were incorrectly waiting for in flight ios to flush. - Fix reporting of NFS file handle length - Fix early error path in bch2_fs_alloc() - list head wasn't being initialized early enough - Make sure correct (hardware accelerated) crc modules get loaded - Fix a rare overflow in the btree split path, when the packed bkey format grows and all the keys have no value (LRU btree). - Fix error handling in the sector allocator This was causing writes to spuriously fail in multidevice setups, and another bug meant that the errors weren't being logged, only reported via fsync. * tag 'bcachefs-2023-12-19' of https://evilpiepirate.org/git/bcachefs: bcachefs: Fix bch2_alloc_sectors_start_trans() error handling bcachefs; guard against overflow in btree node split bcachefs: btree_node_u64s_with_format() takes nr keys bcachefs: print explicit recovery pass message only once bcachefs: improve modprobe support by providing softdeps bcachefs: fix invalid memory access in bch2_fs_alloc() error path bcachefs: Fix determining required file handle length bcachefs: Fix nocow locks deadlock
2023-12-20Merge tag 'nfsd-6.7-2' of ↵Linus Torvalds9-258/+29
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Address a few recently-introduced issues * tag 'nfsd-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Revert 5f7fc5d69f6e92ec0b38774c387f5cf7812c5806 NFSD: Revert 738401a9bd1ac34ccd5723d69640a4adbb1a4bc0 NFSD: Revert 6c41d9a9bd0298002805758216a9c44e38a8500d nfsd: hold nfsd_mutex across entire netlink operation nfsd: call nfsd_last_thread() before final nfsd_put()
2023-12-20Merge tag 'dm-6.7/dm-fixes-3' of ↵Linus Torvalds5-10/+18
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - DM raid target (and MD raid) fix for reconfig_mutex MD deadlock that should have been merged along with recent v6.7-rc6 MD fixes (see MD related commits: f2d87a759f68^..b39113349de6) - DM integrity target fix to avoid modifying immutable biovec in the integrity_metadata() edge case where kmalloc fails. - Fix drivers/md/Kconfig so DM_AUDIT depends on BLK_DEV_DM. - Update DM entry in MAINTAINERS to remove stale info. * tag 'dm-6.7/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: MAINTAINERS: remove stale info for DEVICE-MAPPER dm audit: fix Kconfig so DM_AUDIT depends on BLK_DEV_DM dm-integrity: don't modify bio's immutable bio_vec in integrity_metadata() dm-raid: delay flushing event_work() after reconfig_mutex is released
2023-12-20arm64: dts: mediatek: mt8395-genio-1200-evk: add interrupt-parent for mt6360Macpaul Lin1-0/+1
This patch fix the warning introduced by mt6360 node in mt8395-genio-1200-evk.dts. arch/arm64/boot/dts/mediatek/mt8195.dtsi:464.4-27: Warning (interrupts_property): /soc/i2c@11d01000/pmic@34:#interrupt-cells: size is (8), expected multiple of 16 Add a missing 'interrupt-parent' to fix this warning. Fixes: f2b543a191b6 ("arm64: dts: mediatek: add device-tree for Genio 1200 EVK board") Reported-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/linux-devicetree/20231212214737.230115-1-arnd@kernel.org/ Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-12-20Merge tag 'am3-usb-hang-fix-signed' of ↵Arnd Bergmann1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes Fix for occasional boot hang for am335x USB A fix for occasional boot hang for am335x USB that I've only recently started noticing. This can be merged naturally whenever suitable. This issue has been seen with other similar SoCs earlier and has clearly existed for a long time. * tag 'am3-usb-hang-fix-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: Fix occasional boot hang for am3 usb Link: https://lore.kernel.org/r/pull-1703071616-395333@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-12-20Merge tag 'omap-for-v6.7/fixes-signed' of ↵Arnd Bergmann3-5/+20
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes Fixes for omaps A few fixes for omaps: - A regression fix for ti-sysc interconnect target module driver to not access registers after reset if srst_udelay quirk is needed - DRA7 L3 NoC node register size fix * tag 'omap-for-v6.7/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP2+: Fix null pointer dereference and memory leak in omap_soc_device_init ARM: dts: dra7: Fix DRA7 L3 NoC node register size bus: ti-sysc: Flush posted write only after srst_udelay Link: https://lore.kernel.org/r/pull-1702037799-781982@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-12-19bcachefs: Fix bch2_alloc_sectors_start_trans() error handlingKent Overstreet1-3/+11
When we fail to allocate because of insufficient open buckets, we don't want to retry from the full set of devices - we just want to retry in blocking mode. But if the retry in blocking mode fails with a different error code, we end up squashing the -BCH_ERR_open_buckets_empty error with an error that makes us thing we won't be able to allocate (insufficient_devices) - which is incorrect when we didn't try to allocate from the full set of devices, and causes the write to fail. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-19bcachefs; guard against overflow in btree node splitKent Overstreet1-0/+12
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-19bcachefs: btree_node_u64s_with_format() takes nr keysKent Overstreet2-17/+14
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-19Merge tag 'trace-v6.7-rc6' of ↵Linus Torvalds1-55/+24
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: "While working on the ring buffer, I found one more bug with the timestamp code, and the fix for this removed the need for the final 64-bit cmpxchg! The ring buffer events hold a "delta" from the previous event. If it is determined that the delta can not be calculated, it falls back to adding an absolute timestamp value. The way to know if the delta can be used is via two stored timestamps in the per-cpu buffer meta data: before_stamp and write_stamp The before_stamp is written by every event before it tries to allocate its space on the ring buffer. The write_stamp is written after it allocates its space and knows that nothing came in after it read the previous before_stamp and write_stamp and the two matched. A previous fix dd9394257078 ("ring-buffer: Do not try to put back write_stamp") removed putting back the write_stamp to match the before_stamp so that the next event could use the delta, but races were found where the two would match, but not be for of the previous event. It was determined to allow the event reservation to not have a valid write_stamp when it is finished, and this fixed a lot of races. The last use of the 64-bit timestamp cmpxchg depended on the write_stamp being valid after an interruption. But this is no longer the case, as if an event is interrupted by a softirq that writes an event, and that event gets interrupted by a hardirq or NMI and that writes an event, then the softirq could finish its reservation without a valid write_stamp. In the slow path of the event reservation, a delta can still be used if the write_stamp is valid. Instead of using a cmpxchg against the write stamp, the before_stamp needs to be read again to validate the write_stamp. The cmpxchg is not needed. This updates the slowpath to validate the write_stamp by comparing it to the before_stamp and removes all rb_time_cmpxchg() as there are no more users of that function. The removal of the 32-bit updates of rb_time_t will be done in the next merge window" * tag 'trace-v6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ring-buffer: Fix slowpath of interrupted event
2023-12-19Merge tag 'arc-6.7-fixes' of ↵Linus Torvalds12-326/+155
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - build error for hugetlb, sparse and smatch fixes - removal of VIPT aliasing cache code * tag 'arc-6.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: add hugetlb definitions ARC: fix smatch warning ARC: fix spare error ARC: mm: retire support for aliasing VIPT D$ ARC: entry: move ARCompact specific bits out of entry.h ARC: entry: SAVE_ABI_CALLEE_REG: ISA/ABI specific helper
2023-12-19cifs: do not let cifs_chan_update_iface deallocate channelsShyam Prasad N1-31/+19
cifs_chan_update_iface is meant to check and update the server interface used for a channel when the existing server interface is no longer available. So far, this handler had the code to remove an interface entry even if a new candidate interface is not available. Allowing this leads to several corner cases to handle. This change makes the logic much simpler by not deallocating the current channel interface entry if a new interface is not found to replace it with. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-19cifs: fix a pending undercount of srv_countShyam Prasad N1-2/+1
The following commit reverted the changes to ref count the server struct while scheduling a reconnect work: 823342524868 Revert "cifs: reconnect work should have reference on server struct" However, a following change also introduced scheduling of reconnect work, and assumed ref counting. This change fixes that as well. Fixes umount problems like: [73496.157838] CPU: 5 PID: 1321389 Comm: umount Tainted: G W OE 6.7.0-060700rc6-generic #202312172332 [73496.157841] Hardware name: LENOVO 20MAS08500/20MAS08500, BIOS N2CET67W (1.50 ) 12/15/2022 [73496.157843] RIP: 0010:cifs_put_tcp_session+0x17d/0x190 [cifs] [73496.157906] Code: 5d 31 c0 31 d2 31 f6 31 ff c3 cc cc cc cc e8 4a 6e 14 e6 e9 f6 fe ff ff be 03 00 00 00 48 89 d7 e8 78 26 b3 e5 e9 e4 fe ff ff <0f> 0b e9 b1 fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 [73496.157908] RSP: 0018:ffffc90003bcbcb8 EFLAGS: 00010286 [73496.157911] RAX: 00000000ffffffff RBX: ffff8885830fa800 RCX: 0000000000000000 [73496.157913] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [73496.157915] RBP: ffffc90003bcbcc8 R08: 0000000000000000 R09: 0000000000000000 [73496.157917] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [73496.157918] R13: ffff8887d56ba800 R14: 00000000ffffffff R15: ffff8885830fa800 [73496.157920] FS: 00007f1ff0e33800(0000) GS:ffff88887ba80000(0000) knlGS:0000000000000000 [73496.157922] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [73496.157924] CR2: 0000115f002e2010 CR3: 00000003d1e24005 CR4: 00000000003706f0 [73496.157926] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [73496.157928] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [73496.157929] Call Trace: [73496.157931] <TASK> [73496.157933] ? show_regs+0x6d/0x80 [73496.157936] ? __warn+0x89/0x160 [73496.157939] ? cifs_put_tcp_session+0x17d/0x190 [cifs] [73496.157976] ? report_bug+0x17e/0x1b0 [73496.157980] ? handle_bug+0x51/0xa0 [73496.157983] ? exc_invalid_op+0x18/0x80 [73496.157985] ? asm_exc_invalid_op+0x1b/0x20 [73496.157989] ? cifs_put_tcp_session+0x17d/0x190 [cifs] [73496.158023] ? cifs_put_tcp_session+0x1e/0x190 [cifs] [73496.158057] __cifs_put_smb_ses+0x2b5/0x540 [cifs] [73496.158090] ? tconInfoFree+0xc2/0x120 [cifs] [73496.158130] cifs_put_tcon.part.0+0x108/0x2b0 [cifs] [73496.158173] cifs_put_tlink+0x49/0x90 [cifs] [73496.158220] cifs_umount+0x56/0xb0 [cifs] [73496.158258] cifs_kill_sb+0x52/0x60 [cifs] [73496.158306] deactivate_locked_super+0x32/0xc0 [73496.158309] deactivate_super+0x46/0x60 [73496.158311] cleanup_mnt+0xc3/0x170 [73496.158314] __cleanup_mnt+0x12/0x20 [73496.158330] task_work_run+0x5e/0xa0 [73496.158333] exit_to_user_mode_loop+0x105/0x130 [73496.158336] exit_to_user_mode_prepare+0xa5/0xb0 [73496.158338] syscall_exit_to_user_mode+0x29/0x60 [73496.158341] do_syscall_64+0x6c/0xf0 [73496.158344] ? syscall_exit_to_user_mode+0x37/0x60 [73496.158346] ? do_syscall_64+0x6c/0xf0 [73496.158349] ? exit_to_user_mode_prepare+0x30/0xb0 [73496.158353] ? syscall_exit_to_user_mode+0x37/0x60 [73496.158355] ? do_syscall_64+0x6c/0xf0 Reported-by: Robert Morris <rtm@csail.mit.edu> Fixes: 705fc522fe9d ("cifs: handle when server starts supporting multichannel") Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-19s390: update defconfigsHeiko Carstens3-10/+11
Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-12-19fs: cifs: Fix atime update checkZizhi Wo1-1/+1
Commit 9b9c5bea0b96 ("cifs: do not return atime less than mtime") indicates that in cifs, if atime is less than mtime, some apps will break. Therefore, it introduce a function to compare this two variables in two places where atime is updated. If atime is less than mtime, update it to mtime. However, the patch was handled incorrectly, resulting in atime and mtime being exactly equal. A previous commit 69738cfdfa70 ("fs: cifs: Fix atime update check vs mtime") fixed one place and forgot to fix another. Fix it. Fixes: 9b9c5bea0b96 ("cifs: do not return atime less than mtime") Cc: stable@vger.kernel.org Signed-off-by: Zizhi Wo <wozizhi@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-19smb: client: fix potential OOB in smb2_dump_detail()Paulo Alcantara2-17/+19
Validate SMB message with ->check_message() before calling ->calc_smb_size(). This fixes CVE-2023-6610. Reported-by: j51569436@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218219 Cc; stable@vger.kernel.org Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-18ring-buffer: Fix slowpath of interrupted eventSteven Rostedt (Google)1-55/+24
To synchronize the timestamps with the ring buffer reservation, there are two timestamps that are saved in the buffer meta data. 1. before_stamp 2. write_stamp When the two are equal, the write_stamp is considered valid, as in, it may be used to calculate the delta of the next event as the write_stamp is the timestamp of the previous reserved event on the buffer. This is done by the following: /*A*/ w = current position on the ring buffer before = before_stamp after = write_stamp ts = read current timestamp if (before != after) { write_stamp is not valid, force adding an absolute timestamp. } /*B*/ before_stamp = ts /*C*/ write = local_add_return(event length, position on ring buffer) if (w == write - event length) { /* Nothing interrupted between A and C */ /*E*/ write_stamp = ts; delta = ts - after /* * If nothing interrupted again, * before_stamp == write_stamp and write_stamp * can be used to calculate the delta for * events that come in after this one. */ } else { /* * The slow path! * Was interrupted between A and C. */ This is the place that there's a bug. We currently have: after = write_stamp ts = read current timestamp /*F*/ if (write == current position on the ring buffer && after < ts && cmpxchg(write_stamp, after, ts)) { delta = ts - after; } else { delta = 0; } The assumption is that if the current position on the ring buffer hasn't moved between C and F, then it also was not interrupted, and that the last event written has a timestamp that matches the write_stamp. That is the write_stamp is valid. But this may not be the case: If a task context event was interrupted by softirq between B and C. And the softirq wrote an event that got interrupted by a hard irq between C and E. and the hard irq wrote an event (does not need to be interrupted) We have: /*B*/ before_stamp = ts of normal context ---> interrupted by softirq /*B*/ before_stamp = ts of softirq context ---> interrupted by hardirq /*B*/ before_stamp = ts of hard irq context /*E*/ write_stamp = ts of hard irq context /* matches and write_stamp valid */ <---- /*E*/ write_stamp = ts of softirq context /* No longer matches before_stamp, write_stamp is not valid! */ <--- w != write - length, go to slow path // Right now the order of events in the ring buffer is: // // |-- softirq event --|-- hard irq event --|-- normal context event --| // after = write_stamp (this is the ts of softirq) ts = read current timestamp if (write == current position on the ring buffer [true] && after < ts [true] && cmpxchg(write_stamp, after, ts) [true]) { delta = ts - after [Wrong!] The delta is to be between the hard irq event and the normal context event, but the above logic made the delta between the softirq event and the normal context event, where the hard irq event is between the two. This will shift all the remaining event timestamps on the sub-buffer incorrectly. The write_stamp is only valid if it matches the before_stamp. The cmpxchg does nothing to help this. Instead, the following logic can be done to fix this: before = before_stamp ts = read current timestamp before_stamp = ts after = write_stamp if (write == current position on the ring buffer && after == before && after < ts) { delta = ts - after } else { delta = 0; } The above will only use the write_stamp if it still matches before_stamp and was tested to not have changed since C. As a bonus, with this logic we do not need any 64-bit cmpxchg() at all! This means the 32-bit rb_time_t workaround can finally be removed. But that's for a later time. Link: https://lore.kernel.org/linux-trace-kernel/20231218175229.58ec3daf@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20231218230712.3a76b081@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Fixes: dd93942570789 ("ring-buffer: Do not try to put back write_stamp") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-18Merge tag 'hid-for-linus-2023121901' of ↵Linus Torvalds1-29/+42
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - fix for division by zero in Nintendo driver when generic joycon is attached, reported and fixed by SteamOS folks (Guilherme G. Piccoli) - GCC-7 build fix (which is a good cleanup anyway) for Nintendo driver (Ryan McClelland) * tag 'hid-for-linus-2023121901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: nintendo: Prevent divide-by-zero on code HID: nintendo: fix initializer element is not constant error
2023-12-18SUNRPC: Revert 5f7fc5d69f6e92ec0b38774c387f5cf7812c5806Chuck Lever1-3/+2
Guillaume says: > I believe commit 5f7fc5d69f6e ("SUNRPC: Resupply rq_pages from > node-local memory") in Linux 6.5+ is incorrect. It passes > unconditionally rq_pool->sp_id as the NUMA node. > > While the comment in the svc_pool declaration in sunrpc/svc.h says > that sp_id is also the NUMA node id, it might not be the case if > the svc is created using svc_create_pooled(). svc_created_pooled() > can use the per-cpu pool mode therefore in this case sp_id would > be the cpu id. Fix this by reverting now. At a later point this minor optimization, and the deceptive labeling of the sp_id field, can be revisited. Reported-by: Guillaume Morin <guillaume@morinfr.org> Closes: https://lore.kernel.org/linux-nfs/ZYC9rsno8qYggVt9@bender.morinfr.org/T/#u Fixes: 5f7fc5d69f6e ("SUNRPC: Resupply rq_pages from node-local memory") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-12-18HID: nintendo: Prevent divide-by-zero on codeGuilherme G. Piccoli1-7/+20
It was reported [0] that adding a generic joycon to the system caused a kernel crash on Steam Deck, with the below panic spew: divide error: 0000 [#1] PREEMPT SMP NOPTI [...] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0119 10/24/2023 RIP: 0010:nintendo_hid_event+0x340/0xcc1 [hid_nintendo] [...] Call Trace: [...] ? exc_divide_error+0x38/0x50 ? nintendo_hid_event+0x340/0xcc1 [hid_nintendo] ? asm_exc_divide_error+0x1a/0x20 ? nintendo_hid_event+0x307/0xcc1 [hid_nintendo] hid_input_report+0x143/0x160 hidp_session_run+0x1ce/0x700 [hidp] Since it's a divide-by-0 error, by tracking the code for potential denominator issues, we've spotted 2 places in which this could happen; so let's guard against the possibility and log in the kernel if the condition happens. This is specially useful since some data that fills some denominators are read from the joycon HW in some cases, increasing the potential for flaws. [0] https://github.com/ValveSoftware/SteamOS/issues/1070 Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Tested-by: Sam Lantinga <slouken@libsdl.org> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2023-12-18Merge tag 'scsi-fixes' of ↵Linus Torvalds5-43/+57
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two medium sized fixes, both in drivers. The UFS one adds parsing of clock info structures, which is required by some host drivers and the aacraid one reverts the IRQ affinity mapping patch which has been causing regressions noted in kernel bugzilla 217599" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Store min and max clk freq from OPP table Revert "scsi: aacraid: Reply queue mapping to CPUs based on IRQ affinity"
2023-12-18Merge tag 'spi-fix-v6.7-rc7' of ↵Linus Torvalds3-15/+96
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A few bigger things here, the main one being that there were changes to the atmel driver in this cycle which made it possible to kill transfers being used for filesystem I/O which turned out to be very disruptive, the series of patches here undoes that and hardens things up further. There's also a few smaller driver specific changes, the main one being to revert a change that duplicted delays" * tag 'spi-fix-v6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: atmel: Fix clock issue when using devices with different polarities spi: spi-imx: correctly configure burst length when using dma spi: cadence: revert "Add SPI transfer delays" spi: atmel: Prevent spi transfers from being killed spi: atmel: Drop unused defines spi: atmel: Do not cancel a transfer upon any signal
2023-12-18MAINTAINERS: remove stale info for DEVICE-MAPPERMike Snitzer1-2/+0
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-12-18dm audit: fix Kconfig so DM_AUDIT depends on BLK_DEV_DMMike Snitzer1-0/+1
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-12-18dm-integrity: don't modify bio's immutable bio_vec in integrity_metadata()Mikulas Patocka1-5/+6
__bio_for_each_segment assumes that the first struct bio_vec argument doesn't change - it calls "bio_advance_iter_single((bio), &(iter), (bvl).bv_len)" to advance the iterator. Unfortunately, the dm-integrity code changes the bio_vec with "bv.bv_len -= pos". When this code path is taken, the iterator would be out of sync and dm-integrity would report errors. This happens if the machine is out of memory and "kmalloc" fails. Fix this bug by making a copy of "bv" and changing the copy instead. Fixes: 7eada909bfd7 ("dm: add integrity target") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-12-18dm-raid: delay flushing event_work() after reconfig_mutex is releasedYu Kuai2-3/+11
After commit db5e653d7c9f ("md: delay choosing sync action to md_start_sync()"), md_start_sync() will hold 'reconfig_mutex', however, in order to make sure event_work is done, __md_stop() will flush workqueue with reconfig_mutex grabbed, hence if sync_work is still pending, deadlock will be triggered. Fortunately, former pacthes to fix stopping sync_thread already make sure all sync_work is done already, hence such deadlock is not possible anymore. However, in order not to cause confusions for people by this implicit dependency, delay flushing event_work to dm-raid where 'reconfig_mutex' is not held, and add some comments to emphasize that the workqueue can't be flushed with 'reconfig_mutex'. Fixes: db5e653d7c9f ("md: delay choosing sync action to md_start_sync()") Depends-on: f52f5c71f3d4 ("md: fix stopping sync thread") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-12-18NFSD: Revert 738401a9bd1ac34ccd5723d69640a4adbb1a4bc0Chuck Lever3-128/+1
There's nothing wrong with this commit, but this is dead code now that nothing triggers a CB_GETATTR callback. It can be re-introduced once the issues with handling conflicting GETATTRs are resolved. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-12-18NFSD: Revert 6c41d9a9bd0298002805758216a9c44e38a8500dChuck Lever3-118/+14
For some reason, the wait_on_bit() in nfsd4_deleg_getattr_conflict() is waiting forever, preventing a clean server shutdown. The requesting client might also hang waiting for a reply to the conflicting GETATTR. Invoking wait_on_bit() in an nfsd thread context is a hazard. The correct fix is to replace this wait_on_bit() call site with a mechanism that defers the conflicting GETATTR until the CB_GETATTR completes or is known to have failed. That will require some surgery and extended testing and it's late in the v6.7-rc cycle, so I'm reverting now in favor of trying again in a subsequent kernel release. This is my fault: I should have recognized the ramifications of calling wait_on_bit() in here before accepting this patch. Thanks to Dai Ngo <dai.ngo@oracle.com> for diagnosing the issue. Reported-by: Wolfgang Walter <linux-nfs@stwm.de> Closes: https://lore.kernel.org/linux-nfs/e3d43ecdad554fbdcaa7181833834f78@stwm.de/ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-12-18platform/x86/amd/pmc: Disable keyboard wakeup on AMD Framework 13Mario Limonciello1-0/+17
The Laptop 13 (AMD Ryzen 7040Series) BIOS 03.03 has a workaround included in the EC firmware that will cause the EC to emit a "spurious" keypress during the resume from s0i3 [1]. This series of keypress events can be observed in the kernel log on resume. ``` atkbd serio0: Unknown key pressed (translated set 2, code 0x6b on isa0060/serio0). atkbd serio0: Use 'setkeycodes 6b <keycode>' to make it known. atkbd serio0: Unknown key released (translated set 2, code 0x6b on isa0060/serio0). atkbd serio0: Use 'setkeycodes 6b <keycode>' to make it known. ``` In some user flows this is harmless, but if a user has specifically suspended the laptop and then closed the lid it will cause the laptop to wakeup. The laptop wakes up because the ACPI SCI triggers when the lid is closed and when the kernel sees that IRQ1 is "also" active. The kernel can't distinguish from a real keyboard keypress and wakes the system. Add the model into the list of quirks to disable keyboard wakeup source. This is intentionally only matching the production BIOS version in hopes that a newer EC firmware included in a newer BIOS can avoid this behavior. Cc: Kieran Levin <ktl@framework.net> Link: https://github.com/FrameworkComputer/EmbeddedController/blob/lotus-zephyr/zephyr/program/lotus/azalea/src/power_sequence.c#L313 [1] Link: https://community.frame.work/t/amd-wont-sleep-properly/41755 Link: https://community.frame.work/t/tracking-framework-amd-ryzen-7040-series-lid-wakeup-behavior-feedback/39128 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20231212045006.97581-5-mario.limonciello@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-12-18platform/x86/amd/pmc: Move keyboard wakeup disablement detection to pmc-quirksMario Limonciello3-1/+5
Other platforms may need to disable keyboard wakeup besides Cezanne, so move the detection into amd_pmc_quirks_init() where it may be applied to multiple platforms. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20231212045006.97581-4-mario.limonciello@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-12-18platform/x86/amd/pmc: Only run IRQ1 firmware version check on CezanneMario Limonciello1-9/+12
amd_pmc_wa_czn_irq1() only runs on Cezanne platforms currently but may be extended to other platforms in the future. Rename the function and only check platform firmware version when it's called for a Cezanne based platform. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20231212045006.97581-3-mario.limonciello@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-12-18platform/x86/amd/pmc: Move platform defines to headerMario Limonciello2-10/+11
The platform defines will be used by the quirks in the future, so move them to the common header to allow use by both source files. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20231212045006.97581-2-mario.limonciello@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-12-18platform/x86/intel/pmc: Fix hang in pmc_core_send_ltr_ignore()Rajvi Jingar1-1/+1
For input value 0, PMC stays unassigned which causes crash while trying to access PMC for register read/write. Include LTR index 0 in pmc_index and ltr_index calculation. Fixes: 2bcef4529222 ("platform/x86:intel/pmc: Enable debugfs multiple PMC support") Signed-off-by: Rajvi Jingar <rajvi.jingar@linux.intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20231216011650.1973941-1-rajvi.jingar@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-12-18platform/x86: thinkpad_acpi: fix for incorrect fan reporting on some ↵Vishnu Sankar1-13/+85
ThinkPad systems Some ThinkPad systems ECFW use non-standard addresses for fan control and reporting. This patch adds support for such ECFW so that it can report the correct fan values. Tested on Thinkpads L13 Yoga Gen 2 and X13 Yoga Gen 2. Suggested-by: Mark Pearson <mpearson-lenovo@squebb.ca> Signed-off-by: Vishnu Sankar <vishnuocv@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20231214134702.166464-1-vishnuocv@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-12-18s390/vx: fix save/restore of fpu kernel contextHeiko Carstens1-1/+1
The KERNEL_FPR mask only contains a flag for the first eight vector registers. However floating point registers overlay parts of the first sixteen vector registers. This could lead to vector register corruption if a kernel fpu context uses any of the vector registers 8 to 15 and is interrupted or calls a KERNEL_FPR context. If that context uses also vector registers 8 to 15, their contents will be corrupted on return. Luckily this is currently not a real bug, since the kernel has only one KERNEL_FPR user with s390_adjust_jiffies() and it is only using floating point registers 0 to 2. Fix this by using the correct bits for KERNEL_FPR. Fixes: 7f79695cc1b6 ("s390/fpu: improve kernel_fpu_[begin|end]") Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-12-18HID: nintendo: fix initializer element is not constant errorRyan McClelland1-22/+22
With gcc-7 builds, an error happens with the controller button values being defined as const. Change to a define. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312141227.C2h1IzfI-lkp@intel.com/ Signed-off-by: Ryan McClelland <rymcclel@gmail.com> Reviewed-by: Daniel J. Ogorchock <djogorchock@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2023-12-17bcachefs: print explicit recovery pass message only onceKent Overstreet1-0/+3
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-17smb: client: fix potential OOB in cifs_dump_detail()Paulo Alcantara1-5/+7
Validate SMB message with ->check_message() before calling ->calc_smb_size(). Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-17smb: client: fix OOB in smbCalcSize()Paulo Alcantara1-0/+4
Validate @smb->WordCount to avoid reading off the end of @smb and thus causing the following KASAN splat: BUG: KASAN: slab-out-of-bounds in smbCalcSize+0x32/0x40 [cifs] Read of size 2 at addr ffff88801c024ec5 by task cifsd/1328 CPU: 1 PID: 1328 Comm: cifsd Not tainted 6.7.0-rc5 #9 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4a/0x80 print_report+0xcf/0x650 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? __phys_addr+0x46/0x90 kasan_report+0xd8/0x110 ? smbCalcSize+0x32/0x40 [cifs] ? smbCalcSize+0x32/0x40 [cifs] kasan_check_range+0x105/0x1b0 smbCalcSize+0x32/0x40 [cifs] checkSMB+0x162/0x370 [cifs] ? __pfx_checkSMB+0x10/0x10 [cifs] cifs_handle_standard+0xbc/0x2f0 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 cifs_demultiplex_thread+0xed1/0x1360 [cifs] ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? lockdep_hardirqs_on_prepare+0x136/0x210 ? __pfx_lock_release+0x10/0x10 ? srso_alias_return_thunk+0x5/0xfbef5 ? mark_held_locks+0x1a/0x90 ? lockdep_hardirqs_on_prepare+0x136/0x210 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? __kthread_parkme+0xce/0xf0 ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs] kthread+0x18d/0x1d0 ? kthread+0xdb/0x1d0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x34/0x60 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> This fixes CVE-2023-6606. Reported-by: j51569436@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218218 Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-17smb: client: fix OOB in SMB2_query_info_init()Paulo Alcantara1-7/+22
A small CIFS buffer (448 bytes) isn't big enough to hold SMB2_QUERY_INFO request along with user's input data from CIFS_QUERY_INFO ioctl. That is, if the user passed an input buffer > 344 bytes, the client will memcpy() off the end of @req->Buffer in SMB2_query_info_init() thus causing the following KASAN splat: BUG: KASAN: slab-out-of-bounds in SMB2_query_info_init+0x242/0x250 [cifs] Write of size 1023 at addr ffff88801308c5a8 by task a.out/1240 CPU: 1 PID: 1240 Comm: a.out Not tainted 6.7.0-rc4 #5 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4a/0x80 print_report+0xcf/0x650 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? __phys_addr+0x46/0x90 kasan_report+0xd8/0x110 ? SMB2_query_info_init+0x242/0x250 [cifs] ? SMB2_query_info_init+0x242/0x250 [cifs] kasan_check_range+0x105/0x1b0 __asan_memcpy+0x3c/0x60 SMB2_query_info_init+0x242/0x250 [cifs] ? __pfx_SMB2_query_info_init+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? smb_rqst_len+0xa6/0xc0 [cifs] smb2_ioctl_query_info+0x4f4/0x9a0 [cifs] ? __pfx_smb2_ioctl_query_info+0x10/0x10 [cifs] ? __pfx_cifsConvertToUTF16+0x10/0x10 [cifs] ? kasan_set_track+0x25/0x30 ? srso_alias_return_thunk+0x5/0xfbef5 ? __kasan_kmalloc+0x8f/0xa0 ? srso_alias_return_thunk+0x5/0xfbef5 ? cifs_strndup_to_utf16+0x12d/0x1a0 [cifs] ? __build_path_from_dentry_optional_prefix+0x19d/0x2d0 [cifs] ? __pfx_smb2_ioctl_query_info+0x10/0x10 [cifs] cifs_ioctl+0x11c7/0x1de0 [cifs] ? __pfx_cifs_ioctl+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? rcu_is_watching+0x23/0x50 ? srso_alias_return_thunk+0x5/0xfbef5 ? __rseq_handle_notify_resume+0x6cd/0x850 ? __pfx___schedule+0x10/0x10 ? blkcg_iostat_update+0x250/0x290 ? srso_alias_return_thunk+0x5/0xfbef5 ? ksys_write+0xe9/0x170 __x64_sys_ioctl+0xc9/0x100 do_syscall_64+0x47/0xf0 entry_SYSCALL_64_after_hwframe+0x6f/0x77 RIP: 0033:0x7f893dde49cf Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00 RSP: 002b:00007ffc03ff4160 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffc03ff4378 RCX: 00007f893dde49cf RDX: 00007ffc03ff41d0 RSI: 00000000c018cf07 RDI: 0000000000000003 RBP: 00007ffc03ff4260 R08: 0000000000000410 R09: 0000000000000001 R10: 00007f893dce7300 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffc03ff4388 R14: 00007f893df15000 R15: 0000000000406de0 </TASK> Fix this by increasing size of SMB2_QUERY_INFO request buffers and validating input length to prevent other callers from overflowing @req in SMB2_query_info_init() as well. Fixes: f5b05d622a3e ("cifs: add IOCTL for QUERY_INFO passthrough to userspace") Cc: stable@vger.kernel.org Reported-by: Robert Morris <rtm@csail.mit.edu> Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-17smb: client: fix OOB in cifsd when receiving compounded respsPaulo Alcantara3-9/+20
Validate next header's offset in ->next_header() so that it isn't smaller than MID_HEADER_SIZE(server) and then standard_receive3() or ->receive() ends up writing off the end of the buffer because 'pdu_length - MID_HEADER_SIZE(server)' wraps up to a huge length: BUG: KASAN: slab-out-of-bounds in _copy_to_iter+0x4fc/0x840 Write of size 701 at addr ffff88800caf407f by task cifsd/1090 CPU: 0 PID: 1090 Comm: cifsd Not tainted 6.7.0-rc4 #5 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4a/0x80 print_report+0xcf/0x650 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? __phys_addr+0x46/0x90 kasan_report+0xd8/0x110 ? _copy_to_iter+0x4fc/0x840 ? _copy_to_iter+0x4fc/0x840 kasan_check_range+0x105/0x1b0 __asan_memcpy+0x3c/0x60 _copy_to_iter+0x4fc/0x840 ? srso_alias_return_thunk+0x5/0xfbef5 ? hlock_class+0x32/0xc0 ? srso_alias_return_thunk+0x5/0xfbef5 ? __pfx__copy_to_iter+0x10/0x10 ? srso_alias_return_thunk+0x5/0xfbef5 ? lock_is_held_type+0x90/0x100 ? srso_alias_return_thunk+0x5/0xfbef5 ? __might_resched+0x278/0x360 ? __pfx___might_resched+0x10/0x10 ? srso_alias_return_thunk+0x5/0xfbef5 __skb_datagram_iter+0x2c2/0x460 ? __pfx_simple_copy_to_iter+0x10/0x10 skb_copy_datagram_iter+0x6c/0x110 tcp_recvmsg_locked+0x9be/0xf40 ? __pfx_tcp_recvmsg_locked+0x10/0x10 ? mark_held_locks+0x5d/0x90 ? srso_alias_return_thunk+0x5/0xfbef5 tcp_recvmsg+0xe2/0x310 ? __pfx_tcp_recvmsg+0x10/0x10 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? lock_acquire+0x14a/0x3a0 ? srso_alias_return_thunk+0x5/0xfbef5 inet_recvmsg+0xd0/0x370 ? __pfx_inet_recvmsg+0x10/0x10 ? __pfx_lock_release+0x10/0x10 ? do_raw_spin_trylock+0xd1/0x120 sock_recvmsg+0x10d/0x150 cifs_readv_from_socket+0x25a/0x490 [cifs] ? __pfx_cifs_readv_from_socket+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 cifs_read_from_socket+0xb5/0x100 [cifs] ? __pfx_cifs_read_from_socket+0x10/0x10 [cifs] ? __pfx_lock_release+0x10/0x10 ? do_raw_spin_trylock+0xd1/0x120 ? _raw_spin_unlock+0x23/0x40 ? srso_alias_return_thunk+0x5/0xfbef5 ? __smb2_find_mid+0x126/0x230 [cifs] cifs_demultiplex_thread+0xd39/0x1270 [cifs] ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs] ? __pfx_lock_release+0x10/0x10 ? srso_alias_return_thunk+0x5/0xfbef5 ? mark_held_locks+0x1a/0x90 ? lockdep_hardirqs_on_prepare+0x136/0x210 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? __kthread_parkme+0xce/0xf0 ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs] kthread+0x18d/0x1d0 ? kthread+0xdb/0x1d0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x34/0x60 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> Fixes: 8ce79ec359ad ("cifs: update multiplex loop to handle compounded responses") Cc: stable@vger.kernel.org Reported-by: Robert Morris <rtm@csail.mit.edu> Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-17Linux 6.7-rc6v6.7-rc6Linus Torvalds1-1/+1
2023-12-17Merge tag 'perf_urgent_for_v6.7_rc6' of ↵Linus Torvalds1-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Borislav Petkov: - Avoid iterating over newly created group leader event's siblings because there are none, and thus prevent a lockdep splat * tag 'perf_urgent_for_v6.7_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix perf_event_validate_size() lockdep splat
2023-12-17Merge tag 'for-6.7-rc5-tag' of ↵Linus Torvalds1-0/+9
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "One more fix that verifies that the snapshot source is a root, same check is also done in user space but should be done by the ioctl as well" * tag 'for-6.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: do not allow non subvolume root targets for snapshot
2023-12-17Merge tag 'soundwire-6.7-fixes' of ↵Linus Torvalds2-4/+6
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire fixes from Vinod Koul: - Null pointer dereference for mult link in core - AC timing fix in intel driver * tag 'soundwire-6.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: intel_ace2x: fix AC timing setting for ACE2.x soundwire: stream: fix NULL pointer dereference for multi_link
2023-12-17Merge tag 'phy-fixes-6.7' of ↵Linus Torvalds3-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: - register offset fix for TI driver - mediatek driver minimal supported frequency fix - negative error code in probe fix for sunplus driver * tag 'phy-fixes-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: sunplus: return negative error code in sp_usb_phy_probe phy: mediatek: mipi: mt8183: fix minimal supported frequency phy: ti: gmii-sel: Fix register offset when parent is not a syscon node
2023-12-17Merge tag 'dmaengine-fix-6.7' of ↵Linus Torvalds7-30/+41
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: - SPI PDMA data fix for TI k3-psil drivers - suspend fix, pointer check, logic for arbitration fix and channel leak fix in fsl-edma driver - couple of fixes in idxd driver for GRPCFG descriptions and int_handle field handling - single fix for stm32 driver for bitfield overflow * tag 'dmaengine-fix-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: fsl-edma: fix DMA channel leak in eDMAv4 dmaengine: fsl-edma: fix wrong pointer check in fsl_edma3_attach_pd() dmaengine: idxd: Fix incorrect descriptions for GRPCFG register dmaengine: idxd: Protect int_handle field in hw descriptor dmaengine: stm32-dma: avoid bitfield overflow assertion dmaengine: fsl-edma: Add judgment on enabling round robin arbitration dmaengine: fsl-edma: Do not suspend and resume the masked dma channel when the system is sleeping dmaengine: ti: k3-psil-am62a: Fix SPI PDMA data dmaengine: ti: k3-psil-am62: Fix SPI PDMA data
2023-12-17Merge tag 'cxl-fixes-6.7-rc6' of ↵Linus Torvalds10-24/+49
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull CXL (Compute Express Link) fixes from Dan Williams: "A collection of CXL fixes. The touch outside of drivers/cxl/ is for a helper that allocates physical address space. Device hotplug tests showed that the driver failed to utilize (skipped over) valid capacity when allocating a new memory region. Outside of that, new tests uncovered a small crop of lockdep reports. There is also some miscellaneous error path and leak fixups that are not urgent, but useful to cleanup now. - Fix alloc_free_mem_region()'s scan for address space, prevent false negative out-of-space events - Fix sleeping lock acquisition from CXL trace event (atomic context) - Fix put_device() like for the new CXL PMU driver - Fix wrong pointer freed on error path - Fixup several lockdep reports (missing lock hold) from new assertion in cxl_num_decoders_committed() and new tests" * tag 'cxl-fixes-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/pmu: Ensure put_device on pmu devices cxl/cdat: Free correct buffer on checksum error cxl/hdm: Fix dpa translation locking kernel/resource: Increment by align value in get_free_mem_region() cxl: Add cxl_num_decoders_committed() usage to cxl_test cxl/memdev: Hold region_rwsem during inject and clear poison ops cxl/core: Always hold region_rwsem while reading poison lists cxl/hdm: Fix a benign lockdep splat
2023-12-17Merge tag 'edac_urgent_for_v6.7_rc6' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Borislav Petkov: - A single fix for the EDAC Versal driver to read out register fields properly * tag 'edac_urgent_for_v6.7_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/versal: Read num_csrows and num_chans using the correct bitfield macro
2023-12-17Merge tag 'powerpc-6.7-5' of ↵Linus Torvalds3-7/+48
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix a bug where heavy VAS (accelerator) usage could race with partition migration and prevent the migration from completing. - Update MAINTAINERS to add Aneesh & Naveen. Thanks to Haren Myneni. * tag 'powerpc-6.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: MAINTAINERS: powerpc: Add Aneesh & Naveen powerpc/pseries/vas: Migration suspend waits for no in-progress open windows
2023-12-17ovl: fix dentry reference leak after changes to underlying layersAmir Goldstein1-2/+3
syzbot excercised the forbidden practice of moving the workdir under lowerdir while overlayfs is mounted and tripped a dentry reference leak. Fixes: c63e56a4a652 ("ovl: do not open/llseek lower file with upper sb_writers held") Reported-and-tested-by: syzbot+8608bb4553edb8c78f41@syzkaller.appspotmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-12-16Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds3-15/+11
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A handful of clk fixes, mostly in the rockchip clk driver: - Fix a clk name, clk parent, and a register for a clk gate in the Rockchip rk3128 clk driver - Add a PLL frequency on Rockchip rk3568 to fix some display artifacts - Fix a kbuild dependency for Qualcomm's SM_CAMCC_8550 symbol so that it isn't possible to select the associated GCC driver" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: rockchip: rk3128: Fix SCLK_SDMMC's clock name clk: rockchip: rk3128: Fix aclk_peri_src's parent clk: qcom: Fix SM_CAMCC_8550 dependencies clk: rockchip: rk3128: Fix HCLK_OTG gate register clk: rockchip: rk3568: Add PLL rate for 292.5MHz
2023-12-16Merge tag 'trace-v6.7-rc5' of ↵Linus Torvalds6-82/+72
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix eventfs to check creating new files for events with names greater than NAME_MAX. The eventfs lookup needs to check the return result of simple_lookup(). - Fix the ring buffer to check the proper max data size. Events must be able to fit on the ring buffer sub-buffer, if it cannot, then it fails to be written and the logic to add the event is avoided. The code to check if an event can fit failed to add the possible absolute timestamp which may make the event not be able to fit. This causes the ring buffer to go into an infinite loop trying to find a sub-buffer that would fit the event. Luckily, there's a check that will bail out if it looped over a 1000 times and it also warns. The real fix is not to add the absolute timestamp to an event that is starting at the beginning of a sub-buffer because it uses the sub-buffer timestamp. By avoiding the timestamp at the start of the sub-buffer allows events that pass the first check to always find a sub-buffer that it can fit on. - Have large events that do not fit on a trace_seq to print "LINE TOO BIG" like it does for the trace_pipe instead of what it does now which is to silently drop the output. - Fix a memory leak of forgetting to free the spare page that is saved by a trace instance. - Update the size of the snapshot buffer when the main buffer is updated if the snapshot buffer is allocated. - Fix ring buffer timestamp logic by removing all the places that tried to put the before_stamp back to the write stamp so that the next event doesn't add an absolute timestamp. But each of these updates added a race where by making the two timestamp equal, it was validating the write_stamp so that it can be incorrectly used for calculating the delta of an event. - There's a temp buffer used for printing the event that was using the event data size for allocation when it needed to use the size of the entire event (meta-data and payload data) - For hardening, use "%.*s" for printing the trace_marker output, to limit the amount that is printed by the size of the event. This was discovered by development that added a bug that truncated the '\0' and caused a crash. - Fix a use-after-free bug in the use of the histogram files when an instance is being removed. - Remove a useless update in the rb_try_to_discard of the write_stamp. The before_stamp was already changed to force the next event to add an absolute timestamp that the write_stamp is not used. But the write_stamp is modified again using an unneeded 64-bit cmpxchg. - Fix several races in the 32-bit implementation of the rb_time_cmpxchg() that does a 64-bit cmpxchg. - While looking at fixing the 64-bit cmpxchg, I noticed that because the ring buffer uses normal cmpxchg, and this can be done in NMI context, there's some architectures that do not have a working cmpxchg in NMI context. For these architectures, fail recording events that happen in NMI context. * tag 'trace-v6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ring-buffer: Do not record in NMI if the arch does not support cmpxchg in NMI ring-buffer: Have rb_time_cmpxchg() set the msb counter too ring-buffer: Fix 32-bit rb_time_read() race with rb_time_cmpxchg() ring-buffer: Fix a race in rb_time_cmpxchg() for 32 bit archs ring-buffer: Remove useless update to write_stamp in rb_try_to_discard() ring-buffer: Do not try to put back write_stamp tracing: Fix uaf issue when open the hist or hist_debug file tracing: Add size check when printing trace_marker output ring-buffer: Have saved event hold the entire event ring-buffer: Do not update before stamp when switching sub-buffers tracing: Update snapshot buffer on resize if it is allocated ring-buffer: Fix memory leak of free page eventfs: Fix events beyond NAME_MAX blocking tasks tracing: Have large events show up as '[LINE TOO BIG]' instead of nothing ring-buffer: Fix writing to the buffer with max_data_size
2023-12-15Merge tag 'arm64-fixes' of ↵Linus Torvalds2-1/+7
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Arm CMN perf: fix the DTC allocation failure path which can end up erroneously clearing live counters - arm64/mm: fix hugetlb handling of the dirty page state leading to a continuous fault loop in user on hardware without dirty bit management (DBM). That's caused by the dirty+writeable information not being properly preserved across a series of mprotect(PROT_NONE), mprotect(PROT_READ|PROT_WRITE) * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modify perf/arm-cmn: Fail DTC counter allocation correctly
2023-12-15Merge tag 'pci-v6.7-fixes-1' of ↵Linus Torvalds6-32/+100
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Limit Max_Read_Request_Size (MRRS) on some MIPS Loongson systems because they don't all support MRRS > 256, and firmware doesn't always initialize it correctly, which meant some PCIe devices didn't work (Jiaxun Yang) - Add and use pci_enable_link_state_locked() to prevent potential deadlocks in vmd and qcom drivers (Johan Hovold) - Revert recent (v6.5) acpiphp resource assignment changes that fixed issues with hot-adding devices on a root bus or with large BARs, but introduced new issues with GPU initialization and hot-adding SCSI disks in QEMU VMs and (Bjorn Helgaas) * tag 'pci-v6.7-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: Revert "PCI: acpiphp: Reassign resources on bridge if necessary" PCI/ASPM: Add pci_disable_link_state_locked() lockdep assert PCI/ASPM: Clean up __pci_disable_link_state() 'sem' parameter PCI: qcom: Clean up ASPM comment PCI: qcom: Fix potential deadlock when enabling ASPM PCI: vmd: Fix potential deadlock when enabling ASPM PCI/ASPM: Add pci_enable_link_state_locked() PCI: loongson: Limit MRRS to 256
2023-12-15btrfs: do not allow non subvolume root targets for snapshotJosef Bacik1-0/+9
Our btrfs subvolume snapshot <source> <destination> utility enforces that <source> is the root of the subvolume, however this isn't enforced in the kernel. Update the kernel to also enforce this limitation to avoid problems with other users of this ioctl that don't have the appropriate checks in place. Reported-by: Martin Michaelis <code@mgjm.de> CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-12-15cred: get rid of CONFIG_DEBUG_CREDENTIALSJens Axboe15-313/+17
This code is rarely (never?) enabled by distros, and it hasn't caught anything in decades. Let's kill off this legacy debug code. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-15cred: switch to using atomic_long_tJens Axboe2-36/+36
There are multiple ways to grab references to credentials, and the only protection we have against overflowing it is the memory required to do so. With memory sizes only moving in one direction, let's bump the reference count to 64-bit and move it outside the realm of feasibly overflowing. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-15Revert "PCI: acpiphp: Reassign resources on bridge if necessary"Bjorn Helgaas1-6/+3
This reverts commit 40613da52b13fb21c5566f10b287e0ca8c12c4e9 and the subsequent fix to it: cc22522fd55e ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") 40613da52b13 fixed a problem where hot-adding a device with large BARs failed if the bridge windows programmed by firmware were not large enough. cc22522fd55e ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") fixed a problem with 40613da52b13: an ACPI hot-add of a device on a PCI root bus (common in the virt world) or firmware sending ACPI Bus Check to non-existent Root Ports (e.g., on Dell Inspiron 7352/0W6WV0) caused a NULL pointer dereference and suspend/resume hangs. Unfortunately the combination of 40613da52b13 and cc22522fd55e caused other problems: - Fiona reported that hot-add of SCSI disks in QEMU virtual machine fails sometimes. - Dongli reported a similar problem with hot-add of SCSI disks. - Jonathan reported a console freeze during boot on bare metal due to an error in radeon GPU initialization. Revert both patches to avoid adding these problems. This means we will again see the problems with hot-adding devices with large BARs and the NULL pointer dereferences and suspend/resume issues that 40613da52b13 and cc22522fd55e were intended to fix. Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary") Fixes: cc22522fd55e ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") Reported-by: Fiona Ebner <f.ebner@proxmox.com> Closes: https://lore.kernel.org/r/9eb669c0-d8f2-431d-a700-6da13053ae54@proxmox.com Reported-by: Dongli Zhang <dongli.zhang@oracle.com> Closes: https://lore.kernel.org/r/3c4a446a-b167-11b8-f36f-d3c1b49b42e9@oracle.com Reported-by: Jonathan Woithe <jwoithe@just42.net> Closes: https://lore.kernel.org/r/ZXpaNCLiDM+Kv38H@marvin.atrad.com.au Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Cc: <stable@vger.kernel.org>
2023-12-15Merge tag 'io_uring-6.7-2023-12-15' of git://git.kernel.dk/linuxLinus Torvalds3-4/+21
Pull io_uring fixes from Jens Axboe: "Just two minor fixes: - Fix for the io_uring socket option commands using the wrong value on some archs (Al) - Tweak to the poll lazy wake enable (me)" * tag 'io_uring-6.7-2023-12-15' of git://git.kernel.dk/linux: io_uring/cmd: fix breakage in SOCKET_URING_OP_SIOC* implementation io_uring/poll: don't enable lazy wake for POLLEXCLUSIVE
2023-12-15Merge tag 'mm-hotfixes-stable-2023-12-15-07-11' of ↵Linus Torvalds33-158/+171
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "17 hotfixes. 8 are cc:stable and the other 9 pertain to post-6.6 issues" * tag 'mm-hotfixes-stable-2023-12-15-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/mglru: reclaim offlined memcgs harder mm/mglru: respect min_ttl_ms with memcgs mm/mglru: try to stop at high watermarks mm/mglru: fix underprotected page cache mm/shmem: fix race in shmem_undo_range w/THP Revert "selftests: error out if kernel header files are not yet built" crash_core: fix the check for whether crashkernel is from high memory x86, kexec: fix the wrong ifdeffery CONFIG_KEXEC sh, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC mips, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC m68k, kexec: fix the incorrect ifdeffery and build dependency of CONFIG_KEXEC loongarch, kexec: change dependency of object files mm/damon/core: make damon_start() waits until kdamond_fn() starts selftests/mm: cow: print ksft header before printing anything else mm: fix VMA heap bounds checking riscv: fix VMALLOC_START definition kexec: drop dependency on ARCH_SUPPORTS_KEXEC from CRASH_DUMP
2023-12-15Merge tag 'sound-6.7-rc6' of ↵Linus Torvalds3-11/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of HD-audio quirks for TAS2781 codec and device-specific workarounds" * tag 'sound-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/tas2781: reset the amp before component_add ALSA: hda/tas2781: call cleanup functions only once ALSA: hda/tas2781: handle missing EFI calibration data ALSA: hda/tas2781: leave hda_component in usable state ALSA: hda/realtek: Apply mute LED quirk for HP15-db ALSA: hda/hdmi: add force-connect quirks for ASUSTeK Z170 variants ALSA: hda/hdmi: add force-connect quirk for NUC5CPYB
2023-12-15Merge tag 'drm-fixes-2023-12-15' of git://anongit.freedesktop.org/drm/drmLinus Torvalds36-83/+190
Pull drm fixes from Dave Airlie: "More regular fixes, amdgpu, i915, mediatek and nouveau are most of them this week. Nothing too major, then a few misc bits and pieces in core, panel and ivpu. drm: - fix uninit problems in crtc - fix fd ownership check - edid: add modes in fallback paths panel: - move LG panel into DSI yaml - ltk050h3146w: set burst mode mediatek: - mtk_disp_gamma: Fix breakage due to merge issue - fix kernel oops if no crtc is found - Add spinlock for setting vblank event in atomic_begin - Fix access violation in mtk_drm_crtc_dma_dev_get i915: - Fix selftest engine reset count storage for multi-tile - Fix out-of-bounds reads for engine reset counts - Fix ADL+ remapped stride with CCS - Fix intel_atomic_setup_scalers() plane_state handling - Fix ADL+ tiled plane stride when the POT stride is smaller than the original - Fix eDP 1.4 rate select method link configuration amdgpu: - Fix suspend fix that got accidently mangled last week - Fix OD regression - PSR fixes - OLED Backlight regression fix - JPEG 4.0.5 fix - Misc display fixes - SDMA 5.2 fix - SDMA 2.4 regression fix - GPUVM race fix nouveau: - fix gk20a instobj hierarchy - fix headless iors inheritance regression ivpu: - fix WA initialisation" * tag 'drm-fixes-2023-12-15' of git://anongit.freedesktop.org/drm/drm: (31 commits) drm/nouveau/kms/nv50-: Don't allow inheritance of headless iors drm/nouveau: Fixup gk20a instobj hierarchy drm/amdgpu: warn when there are still mappings when a BO is destroyed v2 drm/amdgpu: fix tear down order in amdgpu_vm_pt_free drm/amd: Fix a probing order problem on SDMA 2.4 drm/amdgpu/sdma5.2: add begin/end_use ring callbacks drm/panel: ltk050h3146w: Set burst mode for ltk050h3148w dt-bindings: panel-simple-dsi: move LG 5" HD TFT LCD panel into DSI yaml drm/amd/display: Disable PSR-SU on Parade 0803 TCON again drm/amd/display: Populate dtbclk from bounding box drm/amd/display: Revert "Fix conversions between bytes and KB" drm/amdgpu/jpeg: configure doorbell for each playback drm/amd/display: Restore guard against default backlight value < 1 nit drm/amd/display: fix hw rotated modes when PSR-SU is enabled drm/amd/pm: fix pp_*clk_od typo drm/amdgpu: fix buffer funcs setting order on suspend harder drm/mediatek: Fix access violation in mtk_drm_crtc_dma_dev_get drm/edid: also call add modes in EDID connector update fallback drm/i915/edp: don't write to DP_LINK_BW_SET when using rate select drm/i915: Fix ADL+ tiled plane stride when the POT stride is smaller than the original ...
2023-12-15nfsd: hold nfsd_mutex across entire netlink operationNeilBrown1-6/+3
Rather than using svc_get() and svc_put() to hold a stable reference to the nfsd_svc for netlink lookups, simply hold the mutex for the entire time. The "entire" time isn't very long, and the mutex is not often contented. This makes way for us to remove the refcounts of svc, which is more confusing than useful. Reported-by: Jeff Layton <jlayton@kernel.org> Closes: https://lore.kernel.org/linux-nfs/5d9bbb599569ce29f16e4e0eef6b291eda0f375b.camel@kernel.org/T/#u Fixes: bd9d6a3efa97 ("NFSD: add rpc_status netlink support") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-12-15nfsd: call nfsd_last_thread() before final nfsd_put()NeilBrown3-3/+9
If write_ports_addfd or write_ports_addxprt fail, they call nfsd_put() without calling nfsd_last_thread(). This leaves nn->nfsd_serv pointing to a structure that has been freed. So remove 'static' from nfsd_last_thread() and call it when the nfsd_serv is about to be destroyed. Fixes: ec52361df99b ("SUNRPC: stop using ->sv_nrthreads as a refcount") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-12-15ring-buffer: Do not record in NMI if the arch does not support cmpxchg in NMISteven Rostedt (Google)1-0/+6
As the ring buffer recording requires cmpxchg() to work, if the architecture does not support cmpxchg in NMI, then do not do any recording within an NMI. Link: https://lore.kernel.org/linux-trace-kernel/20231213175403.6fc18540@gandalf.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-15ring-buffer: Have rb_time_cmpxchg() set the msb counter tooSteven Rostedt (Google)1-0/+2
The rb_time_cmpxchg() on 32-bit architectures requires setting three 32-bit words to represent the 64-bit timestamp, with some salt for synchronization. Those are: msb, top, and bottom The issue is, the rb_time_cmpxchg() did not properly salt the msb portion, and the msb that was written was stale. Link: https://lore.kernel.org/linux-trace-kernel/20231215084114.20899342@rorschach.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: f03f2abce4f39 ("ring-buffer: Have 32 bit time stamps use all 64 bits") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-15ring-buffer: Fix 32-bit rb_time_read() race with rb_time_cmpxchg()Mathieu Desnoyers1-2/+2
The following race can cause rb_time_read() to observe a corrupted time stamp: rb_time_cmpxchg() [...] if (!rb_time_read_cmpxchg(&t->msb, msb, msb2)) return false; if (!rb_time_read_cmpxchg(&t->top, top, top2)) return false; <interrupted before updating bottom> __rb_time_read() [...] do { c = local_read(&t->cnt); top = local_read(&t->top); bottom = local_read(&t->bottom); msb = local_read(&t->msb); } while (c != local_read(&t->cnt)); *cnt = rb_time_cnt(top); /* If top and msb counts don't match, this interrupted a write */ if (*cnt != rb_time_cnt(msb)) return false; ^ this check fails to catch that "bottom" is still not updated. So the old "bottom" value is returned, which is wrong. Fix this by checking that all three of msb, top, and bottom 2-bit cnt values match. The reason to favor checking all three fields over requiring a specific update order for both rb_time_set() and rb_time_cmpxchg() is because checking all three fields is more robust to handle partial failures of rb_time_cmpxchg() when interrupted by nested rb_time_set(). Link: https://lore.kernel.org/lkml/20231211201324.652870-1-mathieu.desnoyers@efficios.com/ Link: https://lore.kernel.org/linux-trace-kernel/20231212193049.680122-1-mathieu.desnoyers@efficios.com Fixes: f458a1453424e ("ring-buffer: Test last update in 32bit version of __rb_time_read()") Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-15ring-buffer: Fix a race in rb_time_cmpxchg() for 32 bit archsSteven Rostedt (Google)1-1/+3
Mathieu Desnoyers pointed out an issue in the rb_time_cmpxchg() for 32 bit architectures. That is: static bool rb_time_cmpxchg(rb_time_t *t, u64 expect, u64 set) { unsigned long cnt, top, bottom, msb; unsigned long cnt2, top2, bottom2, msb2; u64 val; /* The cmpxchg always fails if it interrupted an update */ if (!__rb_time_read(t, &val, &cnt2)) return false; if (val != expect) return false; <<<< interrupted here! cnt = local_read(&t->cnt); The problem is that the synchronization counter in the rb_time_t is read *after* the value of the timestamp is read. That means if an interrupt were to come in between the value being read and the counter being read, it can change the value and the counter and the interrupted process would be clueless about it! The counter needs to be read first and then the value. That way it is easy to tell if the value is stale or not. If the counter hasn't been updated, then the value is still good. Link: https://lore.kernel.org/linux-trace-kernel/20231211201324.652870-1-mathieu.desnoyers@efficios.com/ Link: https://lore.kernel.org/linux-trace-kernel/20231212115301.7a9c9a64@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: 10464b4aa605e ("ring-buffer: Add rb_time_t 64 bit operations for speeding up 32 bit") Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-15ring-buffer: Remove useless update to write_stamp in rb_try_to_discard()Steven Rostedt (Google)1-36/+11
When filtering is enabled, a temporary buffer is created to place the content of the trace event output so that the filter logic can decide from the trace event output if the trace event should be filtered out or not. If it is to be filtered out, the content in the temporary buffer is simply discarded, otherwise it is written into the trace buffer. But if an interrupt were to come in while a previous event was using that temporary buffer, the event written by the interrupt would actually go into the ring buffer itself to prevent corrupting the data on the temporary buffer. If the event is to be filtered out, the event in the ring buffer is discarded, or if it fails to discard because another event were to have already come in, it is turned into padding. The update to the write_stamp in the rb_try_to_discard() happens after a fix was made to force the next event after the discard to use an absolute timestamp by setting the before_stamp to zero so it does not match the write_stamp (which causes an event to use the absolute timestamp). But there's an effort in rb_try_to_discard() to put back the write_stamp to what it was before the event was added. But this is useless and wasteful because nothing is going to be using that write_stamp for calculations as it still will not match the before_stamp. Remove this useless update, and in doing so, we remove another cmpxchg64()! Also update the comments to reflect this change as well as remove some extra white space in another comment. Link: https://lore.kernel.org/linux-trace-kernel/20231215081810.1f4f38fe@rorschach.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Vincent Donnefort <vdonnefort@google.com> Fixes: b2dd797543cf ("ring-buffer: Force absolute timestamp on discard of event") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-15ring-buffer: Do not try to put back write_stampSteven Rostedt (Google)1-23/+6
If an update to an event is interrupted by another event between the time the initial event allocated its buffer and where it wrote to the write_stamp, the code try to reset the write stamp back to the what it had just overwritten. It knows that it was overwritten via checking the before_stamp, and if it didn't match what it wrote to the before_stamp before it allocated its space, it knows it was overwritten. To put back the write_stamp, it uses the before_stamp it read. The problem here is that by writing the before_stamp to the write_stamp it makes the two equal again, which means that the write_stamp can be considered valid as the last timestamp written to the ring buffer. But this is not necessarily true. The event that interrupted the event could have been interrupted in a way that it was interrupted as well, and can end up leaving with an invalid write_stamp. But if this happens and returns to this context that uses the before_stamp to update the write_stamp again, it can possibly incorrectly make it valid, causing later events to have in correct time stamps. As it is OK to leave this function with an invalid write_stamp (one that doesn't match the before_stamp), there's no reason to try to make it valid again in this case. If this race happens, then just leave with the invalid write_stamp and the next event to come along will just add a absolute timestamp and validate everything again. Bonus points: This gets rid of another cmpxchg64! Link: https://lore.kernel.org/linux-trace-kernel/20231214222921.193037a7@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Vincent Donnefort <vdonnefort@google.com> Fixes: a389d86f7fd09 ("ring-buffer: Have nested events still record running time stamp") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-15EDAC/versal: Read num_csrows and num_chans using the correct bitfield macroShubhrajyoti Datta1-2/+2
Fix the extraction of num_csrows and num_chans. The extraction of the num_rows is wrong. Instead of extracting using the FIELD_GET it is calling FIELD_PREP. The issue was masked as the default design has the rows as 0. Fixes: 6f15b178cd63 ("EDAC/versal: Add a Xilinx Versal memory controller driver") Closes: https://lore.kernel.org/all/60ca157e-6eff-d12c-9dc0-8aeab125edda@linux-m68k.org/ Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231215053352.8740-1-shubhrajyoti.datta@amd.com
2023-12-15perf: Fix perf_event_validate_size() lockdep splatMark Rutland1-0/+10
When lockdep is enabled, the for_each_sibling_event(sibling, event) macro checks that event->ctx->mutex is held. When creating a new group leader event, we call perf_event_validate_size() on a partially initialized event where event->ctx is NULL, and so when for_each_sibling_event() attempts to check event->ctx->mutex, we get a splat, as reported by Lucas De Marchi: WARNING: CPU: 8 PID: 1471 at kernel/events/core.c:1950 __do_sys_perf_event_open+0xf37/0x1080 This only happens for a new event which is its own group_leader, and in this case there cannot be any sibling events. Thus it's safe to skip the check for siblings, which avoids having to make invasive and ugly changes to for_each_sibling_event(). Avoid the splat by bailing out early when the new event is its own group_leader. Fixes: 382c27f4ed28f803 ("perf: Fix perf_event_validate_size()") Closes: https://lore.kernel.org/lkml/20231214000620.3081018-1-lucas.demarchi@intel.com/ Closes: https://lore.kernel.org/lkml/ZXpm6gQ%2Fd59jGsuW@xpf.sh.intel.com/ Reported-by: Lucas De Marchi <lucas.demarchi@intel.com> Reported-by: Pengfei Xu <pengfei.xu@intel.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20231215112450.3972309-1-mark.rutland@arm.com
2023-12-14cxl/pmu: Ensure put_device on pmu devicesIra Weiny1-1/+1
The following kmemleaks were detected when removing the cxl module stack: unreferenced object 0xffff88822616b800 (size 1024): ... backtrace: [<00000000bedc6f83>] kmalloc_trace+0x26/0x90 [<00000000448d1afc>] devm_cxl_pmu_add+0x3a/0x110 [cxl_core] [<00000000ca3bfe16>] 0xffffffffa105213b [<00000000ba7f78dc>] local_pci_probe+0x41/0x90 [<000000005bb027ac>] pci_device_probe+0xb0/0x1c0 ... unreferenced object 0xffff8882260abcc0 (size 16): ... hex dump (first 16 bytes): 70 6d 75 5f 6d 65 6d 30 2e 30 00 26 82 88 ff ff pmu_mem0.0.&.... backtrace: ... [<00000000152b5e98>] dev_set_name+0x43/0x50 [<00000000c228798b>] devm_cxl_pmu_add+0x102/0x110 [cxl_core] [<00000000ca3bfe16>] 0xffffffffa105213b [<00000000ba7f78dc>] local_pci_probe+0x41/0x90 [<000000005bb027ac>] pci_device_probe+0xb0/0x1c0 ... unreferenced object 0xffff8882272af200 (size 256): ... backtrace: [<00000000bedc6f83>] kmalloc_trace+0x26/0x90 [<00000000a14d1813>] device_add+0x4ea/0x890 [<00000000a3f07b47>] devm_cxl_pmu_add+0xbe/0x110 [cxl_core] [<00000000ca3bfe16>] 0xffffffffa105213b [<00000000ba7f78dc>] local_pci_probe+0x41/0x90 [<000000005bb027ac>] pci_device_probe+0xb0/0x1c0 ... devm_cxl_pmu_add() correctly registers a device remove function but it only calls device_del() which is only part of device unregistration. Properly call device_unregister() to free up the memory associated with the device. Fixes: 1ad3f701c399 ("cxl/pci: Find and register CXL PMU devices") Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231016-pmu-unregister-fix-v1-1-1e2eb2fa3c69@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-12-15drm/nouveau/kms/nv50-: Don't allow inheritance of headless iorsLyude Paul1-1/+1
Turns out we made a silly mistake when coming up with OR inheritance on nouveau. On pre-DCB 4.1, iors are statically routed to output paths via the DCB. On later generations iors are only routed to an output path if they're actually being used. Unfortunately, it appears with NVIF_OUTP_INHERIT_V0 we make the mistake of assuming the later is true on all generations, which is currently leading us to return bogus ior -> head assignments through nvif, which causes WARN_ON(). So - fix this by verifying that we actually know that there's a head assigned to an ior before allowing it to be inherited through nvif. This -should- hopefully fix the WARN_ON on GT218 reported by Borislav. Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Reported-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231214004359.1028109-1-lyude@redhat.com
2023-12-15drm/nouveau: Fixup gk20a instobj hierarchyThierry Reding1-9/+9
Commit 12c9b05da918 ("drm/nouveau/imem: support allocations not preserved across suspend") uses container_of() to cast from struct nvkm_memory to struct nvkm_instobj, assuming that all instance objects are derived from struct nvkm_instobj. For the gk20a family that's not the case and they are derived from struct nvkm_memory instead. This causes some subtle data corruption (nvkm_instobj.preserve ends up mapping to gk20a_instobj.vaddr) that causes a NULL pointer dereference in gk20a_instobj_acquire_iommu() (and possibly elsewhere) and also prevents suspend/resume from working. Fix this by making struct gk20a_instobj derive from struct nvkm_instobj instead. Fixes: 12c9b05da918 ("drm/nouveau/imem: support allocations not preserved across suspend") Reported-by: Jonathan Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231208104653.1917055-1-thierry.reding@gmail.com
2023-12-14Merge tag '6.7-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds5-79/+109
Pull smb client fixes from Steve French: "Address OOBs and NULL dereference found by Dr. Morris's recent analysis and fuzzing. All marked for stable as well" * tag '6.7-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: fix OOB in smb2_query_reparse_point() smb: client: fix NULL deref in asn1_ber_decoder() smb: client: fix potential OOBs in smb2_parse_contexts() smb: client: fix OOB in receive_encrypted_standard()
2023-12-15Merge tag 'drm-misc-fixes-2023-12-14' of ↵Dave Airlie7-12/+19
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v6.7-rc6: - Fix regression for checking if FD is master capable. - Fix uninitialized variables in drm/crtc. - Fix ivpu w/a. - Refresh modes correctly when updating EDID. - Small panel fixes. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/2d46b68f-c5a4-45e5-beb4-411569f4aac8@linux.intel.com
2023-12-15Merge tag 'amd-drm-fixes-6.7-2023-12-13' of ↵Dave Airlie16-34/+84
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.7-2023-12-13: amdgpu: - Fix suspend fix that got accidently mangled last week - Fix OD regression - PSR fixes - OLED Backlight regression fix - JPEG 4.0.5 fix - Misc display fixes - SDMA 5.2 fix - SDMA 2.4 regression fix - GPUVM race fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231213221122.4937-1-alexander.deucher@amd.com
2023-12-14Merge tag 'platform-drivers-x86-v6.7-4' of ↵Linus Torvalds3-14/+41
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: - tablet-mode-switch events fix - kernel-doc warning fixes * tag 'platform-drivers-x86-v6.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: intel_ips: fix kernel-doc formatting platform/x86: thinkpad_acpi: fix kernel-doc warnings platform/x86: intel-vbtn: Fix missing tablet-mode-switch events
2023-12-15Merge tag 'drm-intel-fixes-2023-12-13' of ↵Dave Airlie8-24/+59
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v6.7-rc6: - Fix selftest engine reset count storage for multi-tile - Fix out-of-bounds reads for engine reset counts - Fix ADL+ remapped stride with CCS - Fix intel_atomic_setup_scalers() plane_state handling - Fix ADL+ tiled plane stride when the POT stride is smaller than the original - Fix eDP 1.4 rate select method link configuration Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/871qbqw4rw.fsf@intel.com
2023-12-14io_uring/cmd: fix breakage in SOCKET_URING_OP_SIOC* implementationAl Viro1-1/+1
In 8e9fad0e70b7 "io_uring: Add io_uring command support for sockets" you've got an include of asm-generic/ioctls.h done in io_uring/uring_cmd.c. That had been done for the sake of this chunk - + ret = prot->ioctl(sk, SIOCINQ, &arg); + if (ret) + return ret; + return arg; + case SOCKET_URING_OP_SIOCOUTQ: + ret = prot->ioctl(sk, SIOCOUTQ, &arg); SIOC{IN,OUT}Q are defined to symbols (FIONREAD and TIOCOUTQ) that come from ioctls.h, all right, but the values vary by the architecture. FIONREAD is 0x467F on mips 0x4004667F on alpha, powerpc and sparc 0x8004667F on sh and xtensa 0x541B everywhere else TIOCOUTQ is 0x7472 on mips 0x40047473 on alpha, powerpc and sparc 0x80047473 on sh and xtensa 0x5411 everywhere else ->ioctl() expects the same values it would've gotten from userland; all places where we compare with SIOC{IN,OUT}Q are using asm/ioctls.h, so they pick the correct values. io_uring_cmd_sock(), OTOH, ends up passing the default ones. Fixes: 8e9fad0e70b7 ("io_uring: Add io_uring command support for sockets") Cc: <stable@vger.kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20231214213408.GT1674809@ZenIV Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-12-14Merge tag 'net-6.7-rc6' of ↵Linus Torvalds62-582/+1150
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Current release - regressions: - tcp: fix tcp_disordered_ack() vs usec TS resolution Current release - new code bugs: - dpll: sanitize possible null pointer dereference in dpll_pin_parent_pin_set() - eth: octeon_ep: initialise control mbox tasks before using APIs Previous releases - regressions: - io_uring/af_unix: disable sending io_uring over sockets - eth: mlx5e: - TC, don't offload post action rule if not supported - fix possible deadlock on mlx5e_tx_timeout_work - eth: iavf: fix iavf_shutdown to call iavf_remove instead iavf_close - eth: bnxt_en: fix skb recycling logic in bnxt_deliver_skb() - eth: ena: fix DMA syncing in XDP path when SWIOTLB is on - eth: team: fix use-after-free when an option instance allocation fails Previous releases - always broken: - neighbour: don't let neigh_forced_gc() disable preemption for long - net: prevent mss overflow in skb_segment() - ipv6: support reporting otherwise unknown prefix flags in RTM_NEWPREFIX - tcp: remove acked SYN flag from packet in the transmit queue correctly - eth: octeontx2-af: - fix a use-after-free in rvu_nix_register_reporters - fix promisc mcam entry action - eth: dwmac-loongson: make sure MDIO is initialized before use - eth: atlantic: fix double free in ring reinit logic" * tag 'net-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits) net: atlantic: fix double free in ring reinit logic appletalk: Fix Use-After-Free in atalk_ioctl net: stmmac: Handle disabled MDIO busses from devicetree net: stmmac: dwmac-qcom-ethqos: Fix drops in 10M SGMII RX dpaa2-switch: do not ask for MDB, VLAN and FDB replay dpaa2-switch: fix size of the dma_unmap net: prevent mss overflow in skb_segment() vsock/virtio: Fix unsigned integer wrap around in virtio_transport_has_space() Revert "tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set" MIPS: dts: loongson: drop incorrect dwmac fallback compatible stmmac: dwmac-loongson: drop useless check for compatible fallback stmmac: dwmac-loongson: Make sure MDIO is initialized before use tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set dpll: sanitize possible null pointer dereference in dpll_pin_parent_pin_set() net: ena: Fix XDP redirection error net: ena: Fix DMA syncing in XDP path when SWIOTLB is on net: ena: Fix xdp drops handling due to multibuf packets net: ena: Destroy correct number of xdp queues upon failure net: Remove acked SYN flag from packet in the transmit queue correctly qed: Fix a potential use-after-free in qed_cxt_tables_alloc ...
2023-12-14Merge tag 'reset-fixes-for-v6.7' of git://git.pengutronix.de/pza/linux into ↵Arnd Bergmann2-5/+5
arm/fixes Reset controller fixes for v6.7 Fix a void-pointer-to-enum-cast warning in the hisilicon hi6220 reset driver and fix a NULL pointer dereference when freeing non-existent optional resets requested via the reset_array API in the core. * tag 'reset-fixes-for-v6.7' of git://git.pengutronix.de/pza/linux: reset: Fix crash when freeing non-existent optional resets reset: hisilicon: hi6220: fix Wvoid-pointer-to-enum-cast warning Link: https://lore.kernel.org/r/20231213152606.231044-1-p.zabel@pengutronix.de Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-12-14Merge tag 'sunxi-fixes-for-6.7-1' of ↵Arnd Bergmann3-3/+5
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes - Fix ethernet node for Orange Pi Zero 3 board * tag 'sunxi-fixes-for-6.7-1' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: arm64: dts: allwinner: h616: update emac for Orange Pi Zero 3 Link: https://lore.kernel.org/r/ZXtVUJ0SG2NRpPG4@archlinux Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-12-14bcachefs: improve modprobe support by providing softdepsDaniel Hill1-0/+6
We need to help modprobe load architecture specific modules so we don't fall back to generic software implementations, this should help performance when building as a module. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-14bcachefs: fix invalid memory access in bch2_fs_alloc() error pathThomas Bertschinger3-2/+8
When bch2_fs_alloc() gets an error before calling bch2_fs_btree_iter_init(), bch2_fs_btree_iter_exit() makes an invalid memory access because btree_trans_list is uninitialized. Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Fixes: 6bd68ec266ad ("bcachefs: Heap allocate btree_trans") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-14Merge tag 'for-6.7-rc5-tag' of ↵Linus Torvalds11-50/+116
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Some fixes to quota accounting code, mostly around error handling and correctness: - free reserves on various error paths, after IO errors or transaction abort - don't clear reserved range at the folio release time, it'll be properly cleared after final write - fix integer overflow due to int used when passing around size of freed reservations - fix a regression in squota accounting that missed some cases with delayed refs" * tag 'for-6.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: ensure releasing squota reserve on head refs btrfs: don't clear qgroup reserved bit in release_folio btrfs: free qgroup pertrans reserve on transaction abort btrfs: fix qgroup_free_reserved_data int overflow btrfs: free qgroup reserve when ORDERED_IOERR is set
2023-12-14net: atlantic: fix double free in ring reinit logicIgor Russkikh1-1/+4
Driver has a logic leak in ring data allocation/free, where double free may happen in aq_ring_free if system is under stress and driver init/deinit is happening. The probability is higher to get this during suspend/resume cycle. Verification was done simulating same conditions with stress -m 2000 --vm-bytes 20M --vm-hang 10 --backoff 1000 while true; do sudo ifconfig enp1s0 down; sudo ifconfig enp1s0 up; done Fixed by explicitly clearing pointers to NULL on deallocation Fixes: 018423e90bee ("net: ethernet: aquantia: Add ring support code") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Closes: https://lore.kernel.org/netdev/CAHk-=wiZZi7FcvqVSUirHBjx0bBUZ4dFrMDVLc3+3HCrtq0rBA@mail.gmail.com/ Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Link: https://lore.kernel.org/r/20231213094044.22988-1-irusskikh@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-14ALSA: hda/tas2781: reset the amp before component_addGergo Koteles1-2/+2
Calling component_add starts loading the firmware, the callback function writes the program to the amplifiers. If the module resets the amplifiers after component_add, it happens that one of the amplifiers does not work because the reset and program writing are interleaving. Call tas2781_reset before component_add to ensure reliable initialization. Fixes: 5be27f1e3ec9 ("ALSA: hda/tas2781: Add tas2781 HDA driver") CC: stable@vger.kernel.org Signed-off-by: Gergo Koteles <soyer@irl.hu> Link: https://lore.kernel.org/r/4d23bf58558e23ee8097de01f70f1eb8d9de2d15.1702511246.git.soyer@irl.hu Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-12-14ALSA: hda/tas2781: call cleanup functions only onceGergo Koteles1-5/+0
If the module can load the RCA but not the firmware binary, it will call the cleanup functions. Then unloading the module causes general protection fault due to double free. Do not call the cleanup functions in tasdev_fw_ready. general protection fault, probably for non-canonical address 0x6f2b8a2bff4c8fec: 0000 [#1] PREEMPT SMP NOPTI Call Trace: <TASK> ? die_addr+0x36/0x90 ? exc_general_protection+0x1c5/0x430 ? asm_exc_general_protection+0x26/0x30 ? tasdevice_config_info_remove+0x6d/0xd0 [snd_soc_tas2781_fmwlib] tas2781_hda_unbind+0xaa/0x100 [snd_hda_scodec_tas2781_i2c] component_unbind+0x2e/0x50 component_unbind_all+0x92/0xa0 component_del+0xa8/0x140 tas2781_hda_remove.isra.0+0x32/0x60 [snd_hda_scodec_tas2781_i2c] i2c_device_remove+0x26/0xb0 Fixes: 5be27f1e3ec9 ("ALSA: hda/tas2781: Add tas2781 HDA driver") CC: stable@vger.kernel.org Signed-off-by: Gergo Koteles <soyer@irl.hu> Link: https://lore.kernel.org/r/1a0885c424bb21172702d254655882b59ef6477a.1702510018.git.soyer@irl.hu Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-12-14appletalk: Fix Use-After-Free in atalk_ioctlHyunwoo Kim1-5/+4
Because atalk_ioctl() accesses sk->sk_receive_queue without holding a sk->sk_receive_queue.lock, it can cause a race with atalk_recvmsg(). A use-after-free for skb occurs with the following flow. ``` atalk_ioctl() -> skb_peek() atalk_recvmsg() -> skb_recv_datagram() -> skb_free_datagram() ``` Add sk->sk_receive_queue.lock to atalk_ioctl() to fix this issue. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Link: https://lore.kernel.org/r/20231213041056.GA519680@v4bel-B760M-AORUS-ELITE-AX Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-14net: stmmac: Handle disabled MDIO busses from devicetreeAndrew Halaney1-1/+5
Many hardware configurations have the MDIO bus disabled, and are instead using some other MDIO bus to talk to the MAC's phy. of_mdiobus_register() returns -ENODEV in this case. Let's handle it gracefully instead of failing to probe the MAC. Fixes: 47dd7a540b8a ("net: add support for STMicroelectronics Ethernet controllers.") Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Link: https://lore.kernel.org/r/20231212-b4-stmmac-handle-mdio-enodev-v2-1-600171acf79f@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-14spi: atmel: Fix clock issue when using devices with different polaritiesLouis Chauvet1-1/+81
The current Atmel SPI controller driver (v2) behaves incorrectly when using two SPI devices with different clock polarities and GPIO CS. When switching from one device to another, the controller driver first enables the CS and then applies whatever configuration suits the targeted device (typically, the polarities). The side effect of such order is the apparition of a spurious clock edge after enabling the CS when the clock polarity needs to be inverted wrt. the previous configuration of the controller. This parasitic clock edge is problematic when the SPI device uses that edge for internal processing, which is perfectly legitimate given that its CS was asserted. Indeed, devices such as HVS8080 driven by driver gpio-sr in the kernel are shift registers and will process this first clock edge to perform a first register shift. In this case, the first bit gets lost and the whole data block that will later be read by the kernel is all shifted by one. Current behavior: The actual switching of the clock polarity only occurs after the CS when the controller sends the first message: CLK ------------\ /-\ /-\ | | | | | . . . \---/ \-/ \ CS -----\ | \------------------ ^ ^ ^ | | | | | Actual clock of the message sent | | | Change of clock polarity, which occurs with the first | write to the bus. This edge occurs when the CS is | already asserted, and can be interpreted as | the first clock edge by the receiver. | GPIO CS toggle This issue is specific to this controller because while the SPI core performs the operations in the right order, the controller however does not. In practice, the controller only applies the clock configuration right before the first transmission. So this is not a problem when using the controller's dedicated CS, as the controller does things correctly, but it becomes a problem when you need to change the clock polarity and use an external GPIO for the CS. One possible approach to solve this problem is to send a dummy message before actually activating the CS, so that the controller applies the clock polarity beforehand. New behavior: CLK ------\ /-\ /-\ /-\ /-\ | | | ... | | | | ... | | \------/ \- -/ \------/ \- -/ \------ CS -\/-----------------------\ || | \/ \--------------------- ^ ^ ^ ^ ^ | | | | | | | | | Expected clock cycles when | | | | sending the message | | | | | | | Actual GPIO CS activation, occurs inside | | | the driver | | | | | Dummy message, to trigger clock polarity | | reconfiguration. This message is not received and | | processed by the device because CS is low. | | | Change of clock polarity, forced by the dummy message. This | time, the edge is not detected by the receiver. | This small spike in CS activation is due to the fact that the spi-core activates the CS gpio before calling the driver's set_cs callback, which deactivates this gpio again until the clock polarity is correct. To avoid having to systematically send a dummy packet, the driver keeps track of the clock's current polarity. In this way, it only sends the dummy packet when necessary, ensuring that the clock will have the correct polarity when the CS is toggled. There could be two hardware problems with this patch: 1- Maybe the small CS activation peak can confuse SPI devices 2- If on a design, a single wire is used to select two devices depending on its state, the dummy message may disturb them. Fixes: 5ee36c989831 ("spi: atmel_spi update chipselect handling") Cc: <stable@vger.kernel.org> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com> Link: https://msgid.link/r/20231204154903.11607-1-louis.chauvet@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-12-14net: stmmac: dwmac-qcom-ethqos: Fix drops in 10M SGMII RXSneh Shah1-0/+10
In 10M SGMII mode all the packets are being dropped due to wrong Rx clock. SGMII 10MBPS mode needs RX clock divider programmed to avoid drops in Rx. Update configure SGMII function with Rx clk divider programming. Fixes: 463120c31c58 ("net: stmmac: dwmac-qcom-ethqos: add support for SGMII") Tested-by: Andrew Halaney <ahalaney@redhat.com> Signed-off-by: Sneh Shah <quic_snehshah@quicinc.com> Reviewed-by: Bjorn Andersson <quic_bjorande@quicinc.com> Link: https://lore.kernel.org/r/20231212092208.22393-1-quic_snehshah@quicinc.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-13Merge branch '40GbE' of ↵Jakub Kicinski5-74/+219
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-12-12 (iavf) This series contains updates to iavf driver only. Piotr reworks Flow Director states to deal with issues in restoring filters. Slawomir fixes shutdown processing as it was missing needed calls. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: Fix iavf_shutdown to call iavf_remove instead iavf_close iavf: Handle ntuple on/off based on new state machines for flow director iavf: Introduce new state machines for flow director ==================== Link: https://lore.kernel.org/r/20231212203613.513423-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13tracing: Fix uaf issue when open the hist or hist_debug fileZheng Yejian3-4/+15
KASAN report following issue. The root cause is when opening 'hist' file of an instance and accessing 'trace_event_file' in hist_show(), but 'trace_event_file' has been freed due to the instance being removed. 'hist_debug' file has the same problem. To fix it, call tracing_{open,release}_file_tr() in file_operations callback to have the ref count and avoid 'trace_event_file' being freed. BUG: KASAN: slab-use-after-free in hist_show+0x11e0/0x1278 Read of size 8 at addr ffff242541e336b8 by task head/190 CPU: 4 PID: 190 Comm: head Not tainted 6.7.0-rc5-g26aff849438c #133 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x98/0xf8 show_stack+0x1c/0x30 dump_stack_lvl+0x44/0x58 print_report+0xf0/0x5a0 kasan_report+0x80/0xc0 __asan_report_load8_noabort+0x1c/0x28 hist_show+0x11e0/0x1278 seq_read_iter+0x344/0xd78 seq_read+0x128/0x1c0 vfs_read+0x198/0x6c8 ksys_read+0xf4/0x1e0 __arm64_sys_read+0x70/0xa8 invoke_syscall+0x70/0x260 el0_svc_common.constprop.0+0xb0/0x280 do_el0_svc+0x44/0x60 el0_svc+0x34/0x68 el0t_64_sync_handler+0xb8/0xc0 el0t_64_sync+0x168/0x170 Allocated by task 188: kasan_save_stack+0x28/0x50 kasan_set_track+0x28/0x38 kasan_save_alloc_info+0x20/0x30 __kasan_slab_alloc+0x6c/0x80 kmem_cache_alloc+0x15c/0x4a8 trace_create_new_event+0x84/0x348 __trace_add_new_event+0x18/0x88 event_trace_add_tracer+0xc4/0x1a0 trace_array_create_dir+0x6c/0x100 trace_array_create+0x2e8/0x568 instance_mkdir+0x48/0x80 tracefs_syscall_mkdir+0x90/0xe8 vfs_mkdir+0x3c4/0x610 do_mkdirat+0x144/0x200 __arm64_sys_mkdirat+0x8c/0xc0 invoke_syscall+0x70/0x260 el0_svc_common.constprop.0+0xb0/0x280 do_el0_svc+0x44/0x60 el0_svc+0x34/0x68 el0t_64_sync_handler+0xb8/0xc0 el0t_64_sync+0x168/0x170 Freed by task 191: kasan_save_stack+0x28/0x50 kasan_set_track+0x28/0x38 kasan_save_free_info+0x34/0x58 __kasan_slab_free+0xe4/0x158 kmem_cache_free+0x19c/0x508 event_file_put+0xa0/0x120 remove_event_file_dir+0x180/0x320 event_trace_del_tracer+0xb0/0x180 __remove_instance+0x224/0x508 instance_rmdir+0x44/0x78 tracefs_syscall_rmdir+0xbc/0x140 vfs_rmdir+0x1cc/0x4c8 do_rmdir+0x220/0x2b8 __arm64_sys_unlinkat+0xc0/0x100 invoke_syscall+0x70/0x260 el0_svc_common.constprop.0+0xb0/0x280 do_el0_svc+0x44/0x60 el0_svc+0x34/0x68 el0t_64_sync_handler+0xb8/0xc0 el0t_64_sync+0x168/0x170 Link: https://lore.kernel.org/linux-trace-kernel/20231214012153.676155-1-zhengyejian1@huawei.com Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-13ARC: add hugetlb definitionsPavel Kozlov1-0/+7
Add hugetlb definitions if THP enabled. ARC doesn't support HugeTLB FS but it supports THP. Some kernel code such as pagemap uses hugetlb definitions with THP. This patch fixes ARC build issue (HPAGE_SIZE undeclared error) with TRANSPARENT_HUGEPAGE enabled. Signed-off-by: Pavel Kozlov <pavel.kozlov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-12-13scsi: ufs: core: Store min and max clk freq from OPP tableNitin Rawat1-0/+54
OPP support added by commit 72208ebe181e ("scsi: ufs: core: Add support for parsing OPP") doesn't update the min_freq and max_freq of each clock in 'struct ufs_clk_info'. But these values are used by the host drivers internally for controller configuration. When the OPP support is enabled in devicetree, these values will be 0, causing boot issues on the respective platforms. So add support to parse the min_freq and max_freq of all clocks while parsing the OPP table. Fixes: 72208ebe181e ("scsi: ufs: core: Add support for parsing OPP") Co-developed-by: Manish Pandey <quic_mapa@quicinc.com> Signed-off-by: Manish Pandey <quic_mapa@quicinc.com> Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com> Link: https://lore.kernel.org/r/20231208131331.12596-1-quic_nitirawa@quicinc.com Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13Merge branch 'dpaa2-switch-various-fixes'Jakub Kicinski2-12/+6
Ioana Ciornei says: ==================== dpaa2-switch: various fixes The first patch fixes the size passed to two dma_unmap_single() calls which was wrongly put as the size of the pointer. The second patch is new to this series and reverts the behavior of the dpaa2-switch driver to not ask for object replay upon offloading so that we avoid the errors encountered when a VLAN is installed multiple times on the same port. ==================== Link: https://lore.kernel.org/r/20231212164326.2753457-1-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13dpaa2-switch: do not ask for MDB, VLAN and FDB replayIoana Ciornei1-9/+2
Starting with commit 4e51bf44a03a ("net: bridge: move the switchdev object replay helpers to "push" mode") the switchdev_bridge_port_offload() helper was extended with the intention to provide switchdev drivers easy access to object addition and deletion replays. This works by calling the replay helpers with non-NULL notifier blocks. In the same commit, the dpaa2-switch driver was updated so that it passes valid notifier blocks to the helper. At that moment, no regression was identified through testing. In the meantime, the blamed commit changed the behavior in terms of which ports get hit by the replay. Before this commit, only the initial port which identified itself as offloaded through switchdev_bridge_port_offload() got a replay of all port objects and FDBs. After this, the newly joining port will trigger a replay of objects on all bridge ports and on the bridge itself. This behavior leads to errors in dpaa2_switch_port_vlans_add() when a VLAN gets installed on the same interface multiple times. The intended mechanism to address this is to pass a non-NULL ctx to the switchdev_bridge_port_offload() helper and then check it against the port's private structure. But since the driver does not have any use for the replayed port objects and FDBs until it gains support for LAG offload, it's better to fix the issue by reverting the dpaa2-switch driver to not ask for replay. The pointers will be added back when we are prepared to ignore replays on unrelated ports. Fixes: b28d580e2939 ("net: bridge: switchdev: replay all VLAN groups") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/20231212164326.2753457-3-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13dpaa2-switch: fix size of the dma_unmapIoana Ciornei1-3/+4
The size of the DMA unmap was wrongly put as a sizeof of a pointer. Change the value of the DMA unmap to be the actual macro used for the allocation and the DMA map. Fixes: 1110318d83e8 ("dpaa2-switch: add tc flower hardware offload on ingress traffic") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/20231212164326.2753457-2-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13net: prevent mss overflow in skb_segment()Eric Dumazet1-1/+2
Once again syzbot is able to crash the kernel in skb_segment() [1] GSO_BY_FRAGS is a forbidden value, but unfortunately the following computation in skb_segment() can reach it quite easily : mss = mss * partial_segs; 65535 = 3 * 5 * 17 * 257, so many initial values of mss can lead to a bad final result. Make sure to limit segmentation so that the new mss value is smaller than GSO_BY_FRAGS. [1] general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077] CPU: 1 PID: 5079 Comm: syz-executor993 Not tainted 6.7.0-rc4-syzkaller-00141-g1ae4cd3cbdd0 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023 RIP: 0010:skb_segment+0x181d/0x3f30 net/core/skbuff.c:4551 Code: 83 e3 02 e9 fb ed ff ff e8 90 68 1c f9 48 8b 84 24 f8 00 00 00 48 8d 78 70 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 8a 21 00 00 48 8b 84 24 f8 00 RSP: 0018:ffffc900043473d0 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000010046 RCX: ffffffff886b1597 RDX: 000000000000000e RSI: ffffffff886b2520 RDI: 0000000000000070 RBP: ffffc90004347578 R08: 0000000000000005 R09: 000000000000ffff R10: 000000000000ffff R11: 0000000000000002 R12: ffff888063202ac0 R13: 0000000000010000 R14: 000000000000ffff R15: 0000000000000046 FS: 0000555556e7e380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020010000 CR3: 0000000027ee2000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> udp6_ufo_fragment+0xa0e/0xd00 net/ipv6/udp_offload.c:109 ipv6_gso_segment+0x534/0x17e0 net/ipv6/ip6_offload.c:120 skb_mac_gso_segment+0x290/0x610 net/core/gso.c:53 __skb_gso_segment+0x339/0x710 net/core/gso.c:124 skb_gso_segment include/net/gso.h:83 [inline] validate_xmit_skb+0x36c/0xeb0 net/core/dev.c:3626 __dev_queue_xmit+0x6f3/0x3d60 net/core/dev.c:4338 dev_queue_xmit include/linux/netdevice.h:3134 [inline] packet_xmit+0x257/0x380 net/packet/af_packet.c:276 packet_snd net/packet/af_packet.c:3087 [inline] packet_sendmsg+0x24c6/0x5220 net/packet/af_packet.c:3119 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 __sys_sendto+0x255/0x340 net/socket.c:2190 __do_sys_sendto net/socket.c:2202 [inline] __se_sys_sendto net/socket.c:2198 [inline] __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7f8692032aa9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fff8d685418 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f8692032aa9 RDX: 0000000000010048 RSI: 00000000200000c0 RDI: 0000000000000003 RBP: 00000000000f4240 R08: 0000000020000540 R09: 0000000000000014 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff8d685480 R13: 0000000000000001 R14: 00007fff8d685480 R15: 0000000000000003 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:skb_segment+0x181d/0x3f30 net/core/skbuff.c:4551 Code: 83 e3 02 e9 fb ed ff ff e8 90 68 1c f9 48 8b 84 24 f8 00 00 00 48 8d 78 70 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 8a 21 00 00 48 8b 84 24 f8 00 RSP: 0018:ffffc900043473d0 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000010046 RCX: ffffffff886b1597 RDX: 000000000000000e RSI: ffffffff886b2520 RDI: 0000000000000070 RBP: ffffc90004347578 R08: 0000000000000005 R09: 000000000000ffff R10: 000000000000ffff R11: 0000000000000002 R12: ffff888063202ac0 R13: 0000000000010000 R14: 000000000000ffff R15: 0000000000000046 FS: 0000555556e7e380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020010000 CR3: 0000000027ee2000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: 3953c46c3ac7 ("sk_buff: allow segmenting based on frag sizes") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231212164621.4131800-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13vsock/virtio: Fix unsigned integer wrap around in virtio_transport_has_space()Nikolay Kuratov1-1/+1
We need to do signed arithmetic if we expect condition `if (bytes < 0)` to be possible Found by Linux Verification Center (linuxtesting.org) with SVACE Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko") Signed-off-by: Nikolay Kuratov <kniv@yandex-team.ru> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20231211162317.4116625-1-kniv@yandex-team.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13Merge tag 'v6.7-rockchip-clkfixes1' of ↵Stephen Boyd2-15/+10
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into clk-fixes Pull Rockchip clk driver fixes for the merge window from Heiko Stuebner: Fixes for a wrong clockname, a wrong clock-parent, a wrong clock-gate and finally one new PLL rate for the rk3568 to fix display artifacts on a handheld devices based on that soc. * tag 'v6.7-rockchip-clkfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: clk: rockchip: rk3128: Fix SCLK_SDMMC's clock name clk: rockchip: rk3128: Fix aclk_peri_src's parent clk: rockchip: rk3128: Fix HCLK_OTG gate register clk: rockchip: rk3568: Add PLL rate for 292.5MHz
2023-12-13drm/amdgpu: warn when there are still mappings when a BO is destroyed v2Christian König1-0/+2
This can only happen when there is a reference counting bug. v2: fix typo Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13drm/amdgpu: fix tear down order in amdgpu_vm_pt_freeChristian König1-1/+2
When freeing PD/PT with shadows it can happen that the shadow destruction races with detaching the PD/PT from the VM causing a NULL pointer dereference in the invalidation code. Fix this by detaching the the PD/PT from the VM first and then freeing the shadow instead. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: https://gitlab.freedesktop.org/drm/amd/-/issues/2867 Cc: <stable@vger.kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13drm/amd: Fix a probing order problem on SDMA 2.4Mario Limonciello1-2/+2
commit 751e293f2c99 ("drm/amd: Move microcode init from sw_init to early_init for SDMA v2.4") made a fateful mistake in `adev->sdma.num_instances` wasn't declared when sdma_v2_4_init_microcode() was run. This caused probing to fail. Move the declaration to right before sdma_v2_4_init_microcode(). Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3043 Fixes: 751e293f2c99 ("drm/amd: Move microcode init from sw_init to early_init for SDMA v2.4") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13drm/amdgpu/sdma5.2: add begin/end_use ring callbacksAlex Deucher1-0/+28
Add begin/end_use ring callbacks to disallow GFXOFF when SDMA work is submitted and allow it again afterward. This should avoid corner cases where GFXOFF is erroneously entered when SDMA is still active. For now just allow/disallow GFXOFF in the begin and end helpers until we root cause the issue. This should not impact power as SDMA usage is pretty minimal and GFXOSS should not be active when SDMA is active anyway, this just makes it explicit. v2: move everything into sdma5.2 code. No reason for this to be generic at this point. v3: Add comments in new code Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2220 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> (v1) Tested-by: Mario Limonciello <mario.limonciello@amd.com> (v1) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.15+
2023-12-13sign-file: Fix incorrect return values checkYusong Gao1-6/+6
There are some wrong return values check in sign-file when call OpenSSL API. The ERR() check cond is wrong because of the program only check the return value is < 0 which ignored the return val is 0. For example: 1. CMS_final() return 1 for success or 0 for failure. 2. i2d_CMS_bio_stream() returns 1 for success or 0 for failure. 3. i2d_TYPEbio() return 1 for success and 0 for failure. 4. BIO_free() return 1 for success and 0 for failure. Link: https://www.openssl.org/docs/manmaster/man3/ Fixes: e5a2e3c84782 ("scripts/sign-file.c: Add support for signing with a raw signature") Signed-off-by: Yusong Gao <a869920004@gmail.com> Reviewed-by: Juerg Haefliger <juerg.haefliger@canonical.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20231213024405.624692-1-a869920004@gmail.com/ # v5 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-13Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-1/+1
Pull ufs fix from Al Viro: "ufs got broken this merge window on folio conversion - calling conventions for filemap_lock_folio() are not the same as for find_lock_page()" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix ufs_get_locked_folio() breakage
2023-12-13Revert "tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set"Jakub Kicinski1-1/+1
This reverts commit f3f32a356c0d2379d4431364e74f101f8f075ce3. Paolo reports that the change disables autocorking even after the userspace sets TCP_CORK. Fixes: f3f32a356c0d ("tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set") Link: https://lore.kernel.org/r/0d30d5a41d3ac990573016308aaeacb40a9dc79f.camel@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13Merge tag 'efi-urgent-for-v6.7-2' of ↵Linus Torvalds4-13/+30
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - Deal with a regression in the recently refactored x86 EFI stub code on older Dell systems by disabling randomization of the physical load address - Use the correct load address for relocatable Loongarch kernels * tag 'efi-urgent-for-v6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi/x86: Avoid physical KASLR on older Dell systems efi/loongarch: Use load address to calculate kernel entry address
2023-12-13bcachefs: Fix determining required file handle lengthJan Kara1-5/+14
The ->encode_fh method is responsible for setting amount of space required for storing the file handle if not enough space was provided. bch2_encode_fh() was not setting required length in that case which breaks e.g. fanotify. Fix it. Reported-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-13drm/panel: ltk050h3146w: Set burst mode for ltk050h3148wFarouk Bouabid1-1/+1
The ltk050h3148w variant expects the horizontal component lane byte clock cycle(lbcc) to be calculated using lane_mbps (burst mode) instead of the pixel clock. Using the pixel clock rate by default for this calculation was introduced in commit ac87d23694f4 ("drm/bridge: synopsys: dw-mipi-dsi: Use pixel clock rate to calculate lbcc") and starting from commit 93e82bb4de01 ("drm/bridge: synopsys: dw-mipi-dsi: Fix hcomponent lbcc for burst mode") only panels that support burst mode can keep using the lane_mbps. So add MIPI_DSI_MODE_VIDEO_BURST as part of the mode_flags for the dsi host. Fixes: 93e82bb4de01 ("drm/bridge: synopsys: dw-mipi-dsi: Fix hcomponent lbcc for burst mode") Signed-off-by: Farouk Bouabid <farouk.bouabid@theobroma-systems.com> Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20231213145045.41020-1-farouk.bouabid@theobroma-systems.com
2023-12-13fix ufs_get_locked_folio() breakageAl Viro1-1/+1
filemap_lock_folio() returns ERR_PTR(-ENOENT) if the thing is not in cache - not NULL like find_lock_page() used to. Fixes: 5fb7bd50b351 "ufs: add ufs_get_locked_folio and ufs_put_locked_folio" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-12-13io_uring/poll: don't enable lazy wake for POLLEXCLUSIVEJens Axboe2-3/+20
There are a few quirks around using lazy wake for poll unconditionally, and one of them is related the EPOLLEXCLUSIVE. Those may trigger exclusive wakeups, which wake a limited number of entries in the wait queue. If that wake number is less than the number of entries someone is waiting for (and that someone is also using DEFER_TASKRUN), then we can get stuck waiting for more entries while we should be processing the ones we already got. If we're doing exclusive poll waits, flag the request as not being compatible with lazy wakeups. Reported-by: Pavel Begunkov <asml.silence@gmail.com> Fixes: 6ce4a93dbb5b ("io_uring/poll: use IOU_F_TWQ_LAZY_WAKE for wakeups") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-12-13MAINTAINERS: powerpc: Add Aneesh & NaveenMichael Ellerman1-0/+2
Aneesh and Naveen are helping out with some aspects of upstream maintenance, add them as reviewers. Acked-by: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org> Acked-by: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231205051105.736470-1-mpe@ellerman.id.au
2023-12-13powerpc/pseries/vas: Migration suspend waits for no in-progress open windowsHaren Myneni2-7/+46
The hypervisor returns migration failure if all VAS windows are not closed. During pre-migration stage, vas_migration_handler() sets migration_in_progress flag and closes all windows from the list. The allocate VAS window routine checks the migration flag, setup the window and then add it to the list. So there is possibility of the migration handler missing the window that is still in the process of setup. t1: Allocate and open VAS t2: Migration event window lock vas_pseries_mutex If migration_in_progress set unlock vas_pseries_mutex return open window HCALL unlock vas_pseries_mutex Modify window HCALL lock vas_pseries_mutex setup window migration_in_progress=true Closes all windows from the list // May miss windows that are // not in the list unlock vas_pseries_mutex lock vas_pseries_mutex return if nr_closed_windows == 0 // No DLPAR CPU or migration add window to the list // Window will be added to the // list after the setup is completed unlock vas_pseries_mutex return unlock vas_pseries_mutex Close VAS window // due to DLPAR CPU or migration return -EBUSY This patch resolves the issue with the following steps: - Set the migration_in_progress flag without holding mutex. - Introduce nr_open_wins_progress counter in VAS capabilities struct - This counter tracks the number of open windows are still in progress - The allocate setup window thread closes windows if the migration is set and decrements nr_open_window_progress counter - The migration handler waits for no in-progress open windows. The code flow with the fix is as follows: t1: Allocate and open VAS t2: Migration event window lock vas_pseries_mutex If migration_in_progress set unlock vas_pseries_mutex return open window HCALL nr_open_wins_progress++ // Window opened, but not // added to the list yet unlock vas_pseries_mutex Modify window HCALL migration_in_progress=true setup window lock vas_pseries_mutex Closes all windows from the list While nr_open_wins_progress { unlock vas_pseries_mutex lock vas_pseries_mutex sleep if nr_closed_windows == 0 // Wait if any open window in or migration is not started // progress. The open window // No DLPAR CPU or migration // thread closes the window without add window to the list // adding to the list and return if nr_open_wins_progress-- // the migration is in progress. unlock vas_pseries_mutex return Close VAS window nr_open_wins_progress-- unlock vas_pseries_mutex return -EBUSY lock vas_pseries_mutex } unlock vas_pseries_mutex return Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler") Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231125235104.3405008-1-haren@linux.ibm.com
2023-12-13Merge branch 'stmmac-bug-fixes'David S. Miller3-17/+8
Yanteng Si says: ==================== stmmac: Some bug fixes * Put Krzysztof's patch into my thread, pick Conor's Reviewed-by tag and Jiaxun's Acked-by tag.(prev version is RFC patch) * I fixed an Oops related to mdio, mainly to ensure that mdio is initialized before use, because it will be used in a series of patches I am working on. see <https://lore.kernel.org/loongarch/cover.1699533745.git.siyanteng@loongson.cn/T/#t> ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13MIPS: dts: loongson: drop incorrect dwmac fallback compatibleKrzysztof Kozlowski2-4/+2
Device binds to proper PCI ID (LOONGSON, 0x7a03), already listed in DTS, so checking for some other compatible does not make sense. It cannot be bound to unsupported platform. Drop useless, incorrect (space in between) and undocumented compatible. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13stmmac: dwmac-loongson: drop useless check for compatible fallbackKrzysztof Kozlowski1-5/+0
Device binds to proper PCI ID (LOONGSON, 0x7a03), already listed in DTS, so checking for some other compatible does not make sense. It cannot be bound to unsupported platform. Drop useless, incorrect (space in between) and undocumented compatible. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13stmmac: dwmac-loongson: Make sure MDIO is initialized before useYanteng Si1-8/+6
Generic code will use mdio. If it is not initialized before use, the kernel will Oops. Fixes: 30bba69d7db4 ("stmmac: pci: Add dwmac support for Loongson") Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13dt-bindings: panel-simple-dsi: move LG 5" HD TFT LCD panel into DSI yamlDavid Heidelberg2-2/+2
Originally was in the panel-simple, but belongs to panel-simple-dsi. See arch/arm/boot/dts/nvidia/tegra114-roth.dts for more details. Resolves the following warning: ``` arch/arm/boot/dts/tegra114-roth.dt.yaml: panel@0: 'reg' does not match any of the regexes: 'pinctrl-[0-9]+' From schema: Documentation/devicetree/bindings/display/panel/panel-simple.yaml ``` Fixes: 310abcea76e9 ("dt-bindings: display: convert simple lg panels to DT Schema") Signed-off-by: David Heidelberg <david@ixit.cz> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com> Link: https://lore.kernel.org/r/20231212200934.99262-1-david@ixit.cz Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231212200934.99262-1-david@ixit.cz
2023-12-13tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is setSalvatore Dipietro1-1/+1
Based on the tcp man page, if TCP_NODELAY is set, it disables Nagle's algorithm and packets are sent as soon as possible. However in the `tcp_push` function where autocorking is evaluated the `nonagle` value set by TCP_NODELAY is not considered which can trigger unexpected corking of packets and induce delays. For example, if two packets are generated as part of a server's reply, if the first one is not transmitted on the wire quickly enough, the second packet can trigger the autocorking in `tcp_push` and be delayed instead of sent as soon as possible. It will either wait for additional packets to be coalesced or an ACK from the client before transmitting the corked packet. This can interact badly if the receiver has tcp delayed acks enabled, introducing 40ms extra delay in completion times. It is not always possible to control who has delayed acks set, but it is possible to adjust when and how autocorking is triggered. Patch prevents autocorking if the TCP_NODELAY flag is set on the socket. Patch has been tested using an AWS c7g.2xlarge instance with Ubuntu 22.04 and Apache Tomcat 9.0.83 running the basic servlet below: import java.io.IOException; import java.io.OutputStreamWriter; import java.io.PrintWriter; import javax.servlet.ServletException; import javax.servlet.http.HttpServlet; import javax.servlet.http.HttpServletRequest; import javax.servlet.http.HttpServletResponse; public class HelloWorldServlet extends HttpServlet { @Override protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { response.setContentType("text/html;charset=utf-8"); OutputStreamWriter osw = new OutputStreamWriter(response.getOutputStream(),"UTF-8"); String s = "a".repeat(3096); osw.write(s,0,s.length()); osw.flush(); } } Load was applied using wrk2 (https://github.com/kinvolk/wrk2) from an AWS c6i.8xlarge instance. With the current auto-corking behavior and TCP_NODELAY set an additional 40ms latency from P99.99+ values are observed. With the patch applied we see no occurrences of 40ms latencies. The patch has also been tested with iperf and uperf benchmarks and no regression was observed. # No patch with tcp_autocorking=1 and TCP_NODELAY set on all sockets ./wrk -t32 -c128 -d40s --latency -R10000 http://172.31.49.177:8080/hello/hello' ... 50.000% 0.91ms 75.000% 1.12ms 90.000% 1.46ms 99.000% 1.73ms 99.900% 1.96ms 99.990% 43.62ms <<< 40+ ms extra latency 99.999% 48.32ms 100.000% 49.34ms # With patch ./wrk -t32 -c128 -d40s --latency -R10000 http://172.31.49.177:8080/hello/hello' ... 50.000% 0.89ms 75.000% 1.13ms 90.000% 1.44ms 99.000% 1.67ms 99.900% 1.78ms 99.990% 2.27ms <<< no 40+ ms extra latency 99.999% 3.71ms 100.000% 4.57ms Fixes: f54b311142a9 ("tcp: auto corking") Signed-off-by: Salvatore Dipietro <dipiets@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13Merge tag 'mediatek-drm-fixes-20231211' of ↵Dave Airlie3-3/+18
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20231211 1. mtk_disp_gamma: Fix breakage due to merge issue 2. fix kernel oops if no crtc is found 3. Add spinlock for setting vblank event in atomic_begin 4. Fix access violation in mtk_drm_crtc_dma_dev_get Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231211151510.6749-1-chunkuang.hu@kernel.org
2023-12-13ARM: dts: Fix occasional boot hang for am3 usbTony Lindgren1-0/+1
With subtle timings changes, we can now sometimes get an external abort on non-linefetch error booting am3 devices at sysc_reset(). This is because of a missing reset delay needed for the usb target module. Looks like we never enabled the delay earlier for am3, although a similar issue was seen earlier with a similar usb setup for dm814x as described in commit ebf244148092 ("ARM: OMAP2+: Use srst_udelay for USB on dm814x"). Cc: stable@vger.kernel.org Fixes: 0782e8572ce4 ("ARM: dts: Probe am335x musb with ti-sysc") Signed-off-by: Tony Lindgren <tony@atomide.com>
2023-12-12tracing: Add size check when printing trace_marker outputSteven Rostedt (Google)1-2/+4
If for some reason the trace_marker write does not have a nul byte for the string, it will overflow the print: trace_seq_printf(s, ": %s", field->buf); The field->buf could be missing the nul byte. To prevent overflow, add the max size that the buf can be by using the event size and the field location. int max = iter->ent_size - offsetof(struct print_entry, buf); trace_seq_printf(s, ": %*.s", max, field->buf); Link: https://lore.kernel.org/linux-trace-kernel/20231212084444.4619b8ce@gandalf.local.home Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-12ring-buffer: Have saved event hold the entire eventSteven Rostedt (Google)1-2/+3
For the ring buffer iterator (non-consuming read), the event needs to be copied into the iterator buffer to make sure that a writer does not overwrite it while the user is reading it. If a write happens during the copy, the buffer is simply discarded. But the temp buffer itself was not big enough. The allocation of the buffer was only BUF_MAX_DATA_SIZE, which is the maximum data size that can be passed into the ring buffer and saved. But the temp buffer needs to hold the meta data as well. That would be BUF_PAGE_SIZE and not BUF_MAX_DATA_SIZE. Link: https://lore.kernel.org/linux-trace-kernel/20231212072558.61f76493@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: 785888c544e04 ("ring-buffer: Have rb_iter_head_event() handle concurrent writer") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-12ring-buffer: Do not update before stamp when switching sub-buffersSteven Rostedt (Google)1-8/+1
The ring buffer timestamps are synchronized by two timestamp placeholders. One is the "before_stamp" and the other is the "write_stamp" (sometimes referred to as the "after stamp" but only in the comments. These two stamps are key to knowing how to handle nested events coming in with a lockless system. When moving across sub-buffers, the before stamp is updated but the write stamp is not. There's an effort to put back the before stamp to something that seems logical in case there's nested events. But as the current event is about to cross sub-buffers, and so will any new nested event that happens, updating the before stamp is useless, and could even introduce new race conditions. The first event on a sub-buffer simply uses the sub-buffer's timestamp and keeps a "delta" of zero. The "before_stamp" and "write_stamp" are not used in the algorithm in this case. There's no reason to try to fix the before_stamp when this happens. As a bonus, it removes a cmpxchg() when crossing sub-buffers! Link: https://lore.kernel.org/linux-trace-kernel/20231211114420.36dde01b@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: a389d86f7fd09 ("ring-buffer: Have nested events still record running time stamp") Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-12mm/mglru: reclaim offlined memcgs harderYu Zhao2-12/+20
In the effort to reduce zombie memcgs [1], it was discovered that the memcg LRU doesn't apply enough pressure on offlined memcgs. Specifically, instead of rotating them to the tail of the current generation (MEMCG_LRU_TAIL) for a second attempt, it moves them to the next generation (MEMCG_LRU_YOUNG) after the first attempt. Not applying enough pressure on offlined memcgs can cause them to build up, and this can be particularly harmful to memory-constrained systems. On Pixel 8 Pro, launching apps for 50 cycles: Before After Change Zombie memcgs 45 35 -22% [1] https://lore.kernel.org/CABdmKX2M6koq4Q0Cmp_-=wbP0Qa190HdEGGaHfxNS05gAkUtPA@mail.gmail.com/ Link: https://lkml.kernel.org/r/20231208061407.2125867-4-yuzhao@google.com Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists") Signed-off-by: Yu Zhao <yuzhao@google.com> Reported-by: T.J. Mercier <tjmercier@google.com> Tested-by: T.J. Mercier <tjmercier@google.com> Cc: Charan Teja Kalla <quic_charante@quicinc.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12mm/mglru: respect min_ttl_ms with memcgsYu Zhao2-27/+33
While investigating kswapd "consuming 100% CPU" [1] (also see "mm/mglru: try to stop at high watermarks"), it was discovered that the memcg LRU can breach the thrashing protection imposed by min_ttl_ms. Before the memcg LRU: kswapd() shrink_node_memcgs() mem_cgroup_iter() inc_max_seq() // always hit a different memcg lru_gen_age_node() mem_cgroup_iter() check the timestamp of the oldest generation After the memcg LRU: kswapd() shrink_many() restart: iterate the memcg LRU: inc_max_seq() // occasionally hit the same memcg if raced with lru_gen_rotate_memcg(): goto restart lru_gen_age_node() mem_cgroup_iter() check the timestamp of the oldest generation Specifically, when the restart happens in shrink_many(), it needs to stick with the (memcg LRU) generation it began with. In other words, it should neither re-read memcg_lru->seq nor age an lruvec of a different generation. Otherwise it can hit the same memcg multiple times without giving lru_gen_age_node() a chance to check the timestamp of that memcg's oldest generation (against min_ttl_ms). [1] https://lore.kernel.org/CAK8fFZ4DY+GtBA40Pm7Nn5xCHy+51w3sfxPqkqpqakSXYyX+Wg@mail.gmail.com/ Link: https://lkml.kernel.org/r/20231208061407.2125867-3-yuzhao@google.com Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists") Signed-off-by: Yu Zhao <yuzhao@google.com> Tested-by: T.J. Mercier <tjmercier@google.com> Cc: Charan Teja Kalla <quic_charante@quicinc.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12mm/mglru: try to stop at high watermarksYu Zhao1-8/+28
The initial MGLRU patchset didn't include the memcg LRU support, and it relied on should_abort_scan(), added by commit f76c83378851 ("mm: multi-gen LRU: optimize multiple memcgs"), to "backoff to avoid overshooting their aggregate reclaim target by too much". Later on when the memcg LRU was added, should_abort_scan() was deemed unnecessary, and the test results [1] showed no side effects after it was removed by commit a579086c99ed ("mm: multi-gen LRU: remove eviction fairness safeguard"). However, that test used memory.reclaim, which sets nr_to_reclaim to SWAP_CLUSTER_MAX. So it can overshoot only by SWAP_CLUSTER_MAX-1 pages, i.e., from nr_reclaimed=nr_to_reclaim-1 to nr_reclaimed=nr_to_reclaim+SWAP_CLUSTER_MAX-1. Compared with the batch size kswapd sets to nr_to_reclaim, SWAP_CLUSTER_MAX is tiny. Therefore that test isn't able to reproduce the worst case scenario, i.e., kswapd overshooting GBs on large systems and "consuming 100% CPU" (see the Closes tag). Bring back a simplified version of should_abort_scan() on top of the memcg LRU, so that kswapd stops when all eligible zones are above their respective high watermarks plus a small delta to lower the chance of KSWAPD_HIGH_WMARK_HIT_QUICKLY. Note that this only applies to order-0 reclaim, meaning compaction-induced reclaim can still run wild (which is a different problem). On Android, launching 55 apps sequentially: Before After Change pgpgin 838377172 802955040 -4% pgpgout 38037080 34336300 -10% [1] https://lore.kernel.org/20221222041905.2431096-1-yuzhao@google.com/ Link: https://lkml.kernel.org/r/20231208061407.2125867-2-yuzhao@google.com Fixes: a579086c99ed ("mm: multi-gen LRU: remove eviction fairness safeguard") Signed-off-by: Yu Zhao <yuzhao@google.com> Reported-by: Charan Teja Kalla <quic_charante@quicinc.com> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Closes: https://lore.kernel.org/CAK8fFZ4DY+GtBA40Pm7Nn5xCHy+51w3sfxPqkqpqakSXYyX+Wg@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: T.J. Mercier <tjmercier@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12mm/mglru: fix underprotected page cacheYu Zhao3-13/+18
Unmapped folios accessed through file descriptors can be underprotected. Those folios are added to the oldest generation based on: 1. The fact that they are less costly to reclaim (no need to walk the rmap and flush the TLB) and have less impact on performance (don't cause major PFs and can be non-blocking if needed again). 2. The observation that they are likely to be single-use. E.g., for client use cases like Android, its apps parse configuration files and store the data in heap (anon); for server use cases like MySQL, it reads from InnoDB files and holds the cached data for tables in buffer pools (anon). However, the oldest generation can be very short lived, and if so, it doesn't provide the PID controller with enough time to respond to a surge of refaults. (Note that the PID controller uses weighted refaults and those from evicted generations only take a half of the whole weight.) In other words, for a short lived generation, the moving average smooths out the spike quickly. To fix the problem: 1. For folios that are already on LRU, if they can be beyond the tracking range of tiers, i.e., five accesses through file descriptors, move them to the second oldest generation to give them more time to age. (Note that tiers are used by the PID controller to statistically determine whether folios accessed multiple times through file descriptors are worth protecting.) 2. When adding unmapped folios to LRU, adjust the placement of them so that they are not too close to the tail. The effect of this is similar to the above. On Android, launching 55 apps sequentially: Before After Change workingset_refault_anon 25641024 25598972 0% workingset_refault_file 115016834 106178438 -8% Link: https://lkml.kernel.org/r/20231208061407.2125867-1-yuzhao@google.com Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation") Signed-off-by: Yu Zhao <yuzhao@google.com> Reported-by: Charan Teja Kalla <quic_charante@quicinc.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Cc: T.J. Mercier <tjmercier@google.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12mm/shmem: fix race in shmem_undo_range w/THPDavid Stevens1-1/+18
Split folios during the second loop of shmem_undo_range. It's not sufficient to only split folios when dealing with partial pages, since it's possible for a THP to be faulted in after that point. Calling truncate_inode_folio in that situation can result in throwing away data outside of the range being targeted. [akpm@linux-foundation.org: tidy up comment layout] Link: https://lkml.kernel.org/r/20230418084031.3439795-1-stevensd@google.com Fixes: b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios") Signed-off-by: David Stevens <stevensd@chromium.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suleiman Souhlal <suleiman@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12Revert "selftests: error out if kernel header files are not yet built"John Hubbard2-57/+4
This reverts commit 9fc96c7c19df ("selftests: error out if kernel header files are not yet built"). It turns out that requiring the kernel headers to be built as a prerequisite to building selftests, does not work in many cases. For example, Peter Zijlstra writes: "My biggest beef with the whole thing is that I simply do not want to use 'make headers', it doesn't work for me. I have a ton of output directories and I don't care to build tools into the output dirs, in fact some of them flat out refuse to work that way (bpf comes to mind)." [1] Therefore, stop erroring out on the selftests build. Additional patches will be required in order to change over to not requiring the kernel headers. [1] https://lore.kernel.org/20231208221007.GO28727@noisy.programming.kicks-ass.net Link: https://lkml.kernel.org/r/20231209020144.244759-1-jhubbard@nvidia.com Fixes: 9fc96c7c19df ("selftests: error out if kernel header files are not yet built") Signed-off-by: John Hubbard <jhubbard@nvidia.com> Cc: Anders Roxell <anders.roxell@linaro.org> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: David Hildenbrand <david@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Marcos Paulo de Souza <mpdesouza@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12crash_core: fix the check for whether crashkernel is from high memoryYuntao Wang1-5/+5
If crash_base is equal to CRASH_ADDR_LOW_MAX, it also indicates that the crashkernel memory is allocated from high memory. However, the current check only considers the case where crash_base is greater than CRASH_ADDR_LOW_MAX. Fix it. The runtime effects is that crashkernel high memory is successfully reserved, whereas the crashkernel low memory is bypassed in this case, then kdump kernel bootup will fail because of no low memory under 4G. This patch also includes some minor cleanups. Link: https://lkml.kernel.org/r/20231209141438.77233-1-ytcoode@gmail.com Fixes: 0ab97169aa05 ("crash_core: add generic function to do reservation") Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12x86, kexec: fix the wrong ifdeffery CONFIG_KEXECBaoquan He1-1/+1
With the current ifdeffery CONFIG_KEXEC, get_cmdline_acpi_rsdp() is only available when kexec_load interface is taken, while kexec_file_load interface can't make use of it. Now change it to CONFIG_KEXEC_CORE. Link: https://lkml.kernel.org/r/20231208073036.7884-6-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Eric DeVolder <eric_devolder@yahoo.com> Cc: Ignat Korchagin <ignat@cloudflare.com> Cc: kernel test robot <lkp@intel.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12sh, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXECBaoquan He4-6/+6
The select of KEXEC for CRASH_DUMP in kernel/Kconfig.kexec will be dropped, then compiling errors will be triggered if below config items are set: === CONFIG_CRASH_CORE=y CONFIG_KEXEC_CORE=y CONFIG_CRASH_DUMP=y === Here, change the dependency of building kexec_core related object files, and the ifdeffery on SuperH from CONFIG_KEXEC to CONFIG_KEXEC_CORE. Link: https://lkml.kernel.org/r/20231208073036.7884-5-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Eric DeVolder <eric_devolder@yahoo.com> Cc: Ignat Korchagin <ignat@cloudflare.com> Cc: kernel test robot <lkp@intel.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12mips, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXECBaoquan He9-16/+16
The select of KEXEC for CRASH_DUMP in kernel/Kconfig.kexec will be dropped, then compiling errors will be triggered if below config items are set: === CONFIG_CRASH_CORE=y CONFIG_KEXEC_CORE=y CONFIG_CRASH_DUMP=y === -------------------------------------------------------------------- mipsel-linux-ld: kernel/kexec_core.o: in function `kimage_free': kernel/kexec_core.c:(.text+0x2200): undefined reference to `machine_kexec_cleanup' mipsel-linux-ld: kernel/kexec_core.o: in function `__crash_kexec': kernel/kexec_core.c:(.text+0x2480): undefined reference to `machine_crash_shutdown' mipsel-linux-ld: kernel/kexec_core.c:(.text+0x2488): undefined reference to `machine_kexec' mipsel-linux-ld: kernel/kexec_core.o: in function `kernel_kexec': kernel/kexec_core.c:(.text+0x29b8): undefined reference to `machine_shutdown' mipsel-linux-ld: kernel/kexec_core.c:(.text+0x29c0): undefined reference to `machine_kexec' -------------------------------------------------------------------- Here, change the dependency of building kexec_core related object files, and the ifdeffery in mips from CONFIG_KEXEC to CONFIG_KEXEC_CORE. Link: https://lkml.kernel.org/r/20231208073036.7884-4-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311302042.sn8cDPIX-lkp@intel.com/ Cc: Eric DeVolder <eric_devolder@yahoo.com> Cc: Ignat Korchagin <ignat@cloudflare.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12m68k, kexec: fix the incorrect ifdeffery and build dependency of CONFIG_KEXECBaoquan He2-3/+3
The select of KEXEC for CRASH_DUMP in kernel/Kconfig.kexec will be dropped, then compiling errors will be triggered if below config items are set: === CONFIG_CRASH_CORE=y CONFIG_KEXEC_CORE=y CONFIG_CRASH_DUMP=y === Here, change the dependency of buinding machine_kexec.o relocate_kernel.o and the ifdeffery in asm/kexe.h to CONFIG_KEXEC_CORE. Link: https://lkml.kernel.org/r/20231208073036.7884-3-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Eric DeVolder <eric_devolder@yahoo.com> Cc: Ignat Korchagin <ignat@cloudflare.com> Cc: kernel test robot <lkp@intel.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12loongarch, kexec: change dependency of object filesBaoquan He1-1/+1
Patch series "kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC". The select of KEXEC for CRASH_DUMP in kernel/Kconfig.kexec will be dropped, then compiling errors will be triggered if below config items are set: === CONFIG_CRASH_CORE=y CONFIG_KEXEC_CORE=y CONFIG_CRASH_DUMP=y === E.g on mips, below link error are seen: -------------------------------------------------------------------- mipsel-linux-ld: kernel/kexec_core.o: in function `kimage_free': kernel/kexec_core.c:(.text+0x2200): undefined reference to `machine_kexec_cleanup' mipsel-linux-ld: kernel/kexec_core.o: in function `__crash_kexec': kernel/kexec_core.c:(.text+0x2480): undefined reference to `machine_crash_shutdown' mipsel-linux-ld: kernel/kexec_core.c:(.text+0x2488): undefined reference to `machine_kexec' mipsel-linux-ld: kernel/kexec_core.o: in function `kernel_kexec': kernel/kexec_core.c:(.text+0x29b8): undefined reference to `machine_shutdown' mipsel-linux-ld: kernel/kexec_core.c:(.text+0x29c0): undefined reference to `machine_kexec' -------------------------------------------------------------------- Here, change the incorrect dependency of building kexec_core related object files, and the ifdeffery on architectures from CONFIG_KEXEC to CONFIG_KEXEC_CORE. Testing: ======== Passed on mips and loognarch with the LKP reproducer. This patch (of 5): Currently, in arch/loongarch/kernel/Makefile, building machine_kexec.o relocate_kernel.o depends on CONFIG_KEXEC. Whereas, since we will drop the select of KEXEC for CRASH_DUMP in kernel/Kconfig.kexec, compiling error will be triggered if below config items are set: === CONFIG_CRASH_CORE=y CONFIG_KEXEC_CORE=y CONFIG_CRASH_DUMP=y === --------------------------------------------------------------- loongarch64-linux-ld: kernel/kexec_core.o: in function `.L209': >> kexec_core.c:(.text+0x1660): undefined reference to `machine_kexec_cleanup' loongarch64-linux-ld: kernel/kexec_core.o: in function `.L287': >> kexec_core.c:(.text+0x1c5c): undefined reference to `machine_crash_shutdown' >> loongarch64-linux-ld: kexec_core.c:(.text+0x1c64): undefined reference to `machine_kexec' loongarch64-linux-ld: kernel/kexec_core.o: in function `.L2^B5': >> kexec_core.c:(.text+0x2090): undefined reference to `machine_shutdown' loongarch64-linux-ld: kexec_core.c:(.text+0x20a0): undefined reference to `machine_kexec' --------------------------------------------------------------- Here, change the dependency of machine_kexec.o relocate_kernel.o to CONFIG_KEXEC_CORE can fix above building error. Link: https://lkml.kernel.org/r/20231208073036.7884-1-bhe@redhat.com Link: https://lkml.kernel.org/r/20231208073036.7884-2-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311300946.kHE9Iu71-lkp@intel.com/ Cc: Eric DeVolder <eric_devolder@yahoo.com> Cc: Ignat Korchagin <ignat@cloudflare.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12mm/damon/core: make damon_start() waits until kdamond_fn() startsSeongJae Park2-0/+8
The cleanup tasks of kdamond threads including reset of corresponding DAMON context's ->kdamond field and decrease of global nr_running_ctxs counter is supposed to be executed by kdamond_fn(). However, commit 0f91d13366a4 ("mm/damon: simplify stop mechanism") made neither damon_start() nor damon_stop() ensure the corresponding kdamond has started the execution of kdamond_fn(). As a result, the cleanup can be skipped if damon_stop() is called fast enough after the previous damon_start(). Especially the skipped reset of ->kdamond could cause a use-after-free. Fix it by waiting for start of kdamond_fn() execution from damon_start(). Link: https://lkml.kernel.org/r/20231208175018.63880-1-sj@kernel.org Fixes: 0f91d13366a4 ("mm/damon: simplify stop mechanism") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: Jakub Acs <acsjakub@amazon.de> Cc: Changbin Du <changbin.du@intel.com> Cc: Jakub Acs <acsjakub@amazon.de> Cc: <stable@vger.kernel.org> # 5.15.x Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12selftests/mm: cow: print ksft header before printing anything elseDavid Hildenbrand1-1/+2
Doing a ksft_print_msg() before the ksft_print_header() seems to confuse the ksft framework in a strange way: running the test on the cmdline results in the expected output. But piping the output somewhere else, results in some odd output, whereby we repeatedly get the same info printed: # [INFO] detected THP size: 2048 KiB # [INFO] detected hugetlb page size: 2048 KiB # [INFO] detected hugetlb page size: 1048576 KiB # [INFO] huge zeropage is enabled TAP version 13 1..190 # [INFO] Anonymous memory tests in private mappings # [RUN] Basic COW after fork() ... with base page # [INFO] detected THP size: 2048 KiB # [INFO] detected hugetlb page size: 2048 KiB # [INFO] detected hugetlb page size: 1048576 KiB # [INFO] huge zeropage is enabled TAP version 13 1..190 # [INFO] Anonymous memory tests in private mappings # [RUN] Basic COW after fork() ... with base page ok 1 No leak from parent into child # [RUN] Basic COW after fork() ... with swapped out base page # [INFO] detected THP size: 2048 KiB # [INFO] detected hugetlb page size: 2048 KiB # [INFO] detected hugetlb page size: 1048576 KiB # [INFO] huge zeropage is enabled Doing the ksft_print_header() first seems to resolve that and gives us the output we expect: TAP version 13 # [INFO] detected THP size: 2048 KiB # [INFO] detected hugetlb page size: 2048 KiB # [INFO] detected hugetlb page size: 1048576 KiB # [INFO] huge zeropage is enabled 1..190 # [INFO] Anonymous memory tests in private mappings # [RUN] Basic COW after fork() ... with base page ok 1 No leak from parent into child # [RUN] Basic COW after fork() ... with swapped out base page ok 2 No leak from parent into child # [RUN] Basic COW after fork() ... with THP ok 3 No leak from parent into child # [RUN] Basic COW after fork() ... with swapped-out THP ok 4 No leak from parent into child # [RUN] Basic COW after fork() ... with PTE-mapped THP ok 5 No leak from parent into child Link: https://lkml.kernel.org/r/20231206103558.38040-1-david@redhat.com Fixes: f4b5fd6946e2 ("selftests/vm: anon_cow: THP tests") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Nico Pache <npache@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12mm: fix VMA heap bounds checkingKefeng Wang1-4/+4
After converting selinux to VMA heap check helper, the gcl triggers an execheap SELinux denial, which is caused by a changed logic check. Previously selinux only checked that the VMA range was within the VMA heap range, and the implementation checks the intersection between the two ranges, but the corner case (vm_end=start_brk, brk=vm_start) isn't handled correctly. Since commit 11250fd12eb8 ("mm: factor out VMA stack and heap checks") was only a function extraction, it seems that the issue was introduced by commit 0db0c01b53a1 ("procfs: fix /proc/<pid>/maps heap check"). Let's fix above corner cases, meanwhile, correct the wrong indentation of the stack and heap check helpers. Fixes: 11250fd12eb8 ("mm: factor out VMA stack and heap checks") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reported-by: Ondrej Mosnacek <omosnace@redhat.com> Closes: https://lore.kernel.org/selinux/CAFqZXNv0SVT0fkOK6neP9AXbj3nxJ61JAY4+zJzvxqJaeuhbFw@mail.gmail.com/ Tested-by: Ondrej Mosnacek <omosnace@redhat.com> Link: https://lkml.kernel.org/r/20231207152525.2607420-1-wangkefeng.wang@huawei.com Cc: David Hildenbrand <david@redhat.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12riscv: fix VMALLOC_START definitionBaoquan He1-1/+1
When below config items are set, compiler complained: -------------------- CONFIG_CRASH_CORE=y CONFIG_KEXEC_CORE=y CONFIG_CRASH_DUMP=y ...... ----------------------- ------------------------------------------------------------------- arch/riscv/kernel/crash_core.c: In function 'arch_crash_save_vmcoreinfo': arch/riscv/kernel/crash_core.c:11:58: warning: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'int' [-Wformat=] 11 | vmcoreinfo_append_str("NUMBER(VMALLOC_START)=0x%lx\n", VMALLOC_START); | ~~^ | | | long unsigned int | %x ---------------------------------------------------------------------- This is because on riscv macro VMALLOC_START has different type when CONFIG_MMU is set or unset. arch/riscv/include/asm/pgtable.h: -------------------------------------------------- Changing it to _AC(0, UL) in case CONFIG_MMU=n can fix the warning. Link: https://lkml.kernel.org/r/ZW7OsX4zQRA3mO4+@MiWiFi-R3L-srv Signed-off-by: Baoquan He <bhe@redhat.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Cc: Eric DeVolder <eric_devolder@yahoo.com> Cc: Ignat Korchagin <ignat@cloudflare.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12kexec: drop dependency on ARCH_SUPPORTS_KEXEC from CRASH_DUMPIgnat Korchagin3-4/+5
In commit f8ff23429c62 ("kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP") we tried to fix a config regression, where CONFIG_CRASH_DUMP required CONFIG_KEXEC. However, it was not enough at least for arm64 platforms. While further testing the patch with our arm64 config I noticed that CONFIG_CRASH_DUMP is unavailable in menuconfig. This is because CONFIG_CRASH_DUMP still depends on the new CONFIG_ARCH_SUPPORTS_KEXEC introduced in commit 91506f7e5d21 ("arm64/kexec: refactor for kernel/Kconfig.kexec") and on arm64 CONFIG_ARCH_SUPPORTS_KEXEC requires CONFIG_PM_SLEEP_SMP=y, which in turn requires either CONFIG_SUSPEND=y or CONFIG_HIBERNATION=y neither of which are set in our config. Given that we already established that CONFIG_KEXEC (which is a switch for kexec system call itself) is not required for CONFIG_CRASH_DUMP drop CONFIG_ARCH_SUPPORTS_KEXEC dependency as well. The arm64 kernel builds just fine with CONFIG_CRASH_DUMP=y and with both CONFIG_KEXEC=n and CONFIG_KEXEC_FILE=n after f8ff23429c62 ("kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP") and this patch are applied given that the necessary shared bits are included via CONFIG_KEXEC_CORE dependency. [bhe@redhat.com: don't export some symbols when CONFIG_MMU=n] Link: https://lkml.kernel.org/r/ZW03ODUKGGhP1ZGU@MiWiFi-R3L-srv [bhe@redhat.com: riscv, kexec: fix dependency of two items] Link: https://lkml.kernel.org/r/ZW04G/SKnhbE5mnX@MiWiFi-R3L-srv Link: https://lkml.kernel.org/r/20231129220409.55006-1-ignat@cloudflare.com Fixes: 91506f7e5d21 ("arm64/kexec: refactor for kernel/Kconfig.kexec") Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Signed-off-by: Baoquan He <bhe@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: <stable@vger.kernel.org> # 6.6+: f8ff234: kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP Cc: <stable@vger.kernel.org> # 6.6+ Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12Merge tag 'hid-for-linus-2023121201' of ↵Linus Torvalds6-1/+14
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - Lenovo ThinkPad TrackPoint Keyboard II firmware-specific regression fix (Mikhail Khvainitski) - device-specific fixes (various authors) * tag 'hid-for-linus-2023121201' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: apple: Add "hfd.cn" and "WKB603" to the list of non-apple keyboards HID: lenovo: Restrict detection of patched firmware only to USB cptkbd HID: Add quirk for Labtec/ODDOR/aikeec handbrake HID: i2c-hid: Add IDEA5002 to i2c_hid_acpi_blacklist[] mailmap: add address mapping for Jiri Kosina
2023-12-12dpll: sanitize possible null pointer dereference in dpll_pin_parent_pin_set()Jiri Pirko1-5/+8
User may not pass DPLL_A_PIN_STATE attribute in the pin set operation message. Sanitize that by checking if the attr pointer is not null and process the passed state attribute value only in that case. Reported-by: Xingyuan Mo <hdthky0@gmail.com> Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://lore.kernel.org/r/20231211083758.1082853-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12Merge branch 'ena-driver-xdp-bug-fixes'Jakub Kicinski2-30/+26
David Arinzon says: ==================== ENA driver XDP bug fixes This patchset contains multiple XDP-related bug fixes in the ENA driver. ==================== Link: https://lore.kernel.org/r/20231211062801.27891-1-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12net: ena: Fix XDP redirection errorDavid Arinzon1-3/+0
When sending TX packets, the meta descriptor can be all zeroes as no meta information is required (as in XDP). This patch removes the validity check, as when `disable_meta_caching` is enabled, such TX packets will be dropped otherwise. Fixes: 0e3a3f6dacf0 ("net: ena: support new LLQ acceleration mode") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20231211062801.27891-5-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12net: ena: Fix DMA syncing in XDP path when SWIOTLB is onDavid Arinzon1-14/+9
This patch fixes two issues: Issue 1 ------- Description ``````````` Current code does not call dma_sync_single_for_cpu() to sync data from the device side memory to the CPU side memory before the XDP code path uses the CPU side data. This causes the XDP code path to read the unset garbage data in the CPU side memory, resulting in incorrect handling of the packet by XDP. Solution ```````` 1. Add a call to dma_sync_single_for_cpu() before the XDP code starts to use the data in the CPU side memory. 2. The XDP code verdict can be XDP_PASS, in which case there is a fallback to the non-XDP code, which also calls dma_sync_single_for_cpu(). To avoid calling dma_sync_single_for_cpu() twice: 2.1. Put the dma_sync_single_for_cpu() in the code in such a place where it happens before XDP and non-XDP code. 2.2. Remove the calls to dma_sync_single_for_cpu() in the non-XDP code for the first buffer only (rx_copybreak and non-rx_copybreak cases), since the new call that was added covers these cases. The call to dma_sync_single_for_cpu() for the second buffer and on stays because only the first buffer is handled by the newly added dma_sync_single_for_cpu(). And there is no need for special handling of the second buffer and on for the XDP path since currently the driver supports only single buffer packets. Issue 2 ------- Description ``````````` In case the XDP code forwarded the packet (ENA_XDP_FORWARDED), ena_unmap_rx_buff_attrs() is called with attrs set to 0. This means that before unmapping the buffer, the internal function dma_unmap_page_attrs() will also call dma_sync_single_for_cpu() on the whole buffer (not only on the data part of it). This sync is both wasteful (since a sync was already explicitly called before) and also causes a bug, which will be explained using the below diagram. The following diagram shows the flow of events causing the bug. The order of events is (1)-(4) as shown in the diagram. CPU side memory area (3)convert_to_xdp_frame() initializes the headroom with xdpf metadata || \/ ___________________________________ | | 0 | V 4K --------------------------------------------------------------------- | xdpf->data | other xdpf | < data > | tailroom ||...| | | fields | | GARBAGE || | --------------------------------------------------------------------- /\ /\ || || (4)ena_unmap_rx_buff_attrs() calls (2)dma_sync_single_for_cpu() dma_sync_single_for_cpu() on the copies data from device whole buffer page, overwriting side to CPU side memory the xdpf->data with GARBAGE. || 0 4K --------------------------------------------------------------------- | headroom | < data > | tailroom ||...| | GARBAGE | | GARBAGE || | --------------------------------------------------------------------- Device side memory area /\ || (1) device writes RX packet data After the call to ena_unmap_rx_buff_attrs() in (4), the xdpf->data becomes corrupted, and so when it is later accessed in ena_clean_xdp_irq()->xdp_return_frame(), it causes a page fault, crashing the kernel. Solution ```````` Explicitly tell ena_unmap_rx_buff_attrs() not to call dma_sync_single_for_cpu() by passing it the ENA_DMA_ATTR_SKIP_CPU_SYNC flag. Fixes: f7d625adeb7b ("net: ena: Add dynamic recycling mechanism for rx buffers") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20231211062801.27891-4-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12net: ena: Fix xdp drops handling due to multibuf packetsDavid Arinzon1-7/+10
Current xdp code drops packets larger than ENA_XDP_MAX_MTU. This is an incorrect condition since the problem is not the size of the packet, rather the number of buffers it contains. This commit: 1. Identifies and drops XDP multi-buffer packets at the beginning of the function. 2. Increases the xdp drop statistic when this drop occurs. 3. Adds a one-time print that such drops are happening to give better indication to the user. Fixes: 838c93dc5449 ("net: ena: implement XDP drop support") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20231211062801.27891-3-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12net: ena: Destroy correct number of xdp queues upon failureDavid Arinzon1-6/+7
The ena_setup_and_create_all_xdp_queues() function freed all the resources upon failure, after creating only xdp_num_queues queues, instead of freeing just the created ones. In this patch, the only resources that are freed, are the ones allocated right before the failure occurs. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: Shahar Itzko <itzko@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20231211062801.27891-2-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12tracing: Update snapshot buffer on resize if it is allocatedSteven Rostedt (Google)1-2/+2
The snapshot buffer is to mimic the main buffer so that when a snapshot is needed, the snapshot and main buffer are swapped. When the snapshot buffer is allocated, it is set to the minimal size that the ring buffer may be at and still functional. When it is allocated it becomes the same size as the main ring buffer, and when the main ring buffer changes in size, it should do. Currently, the resize only updates the snapshot buffer if it's used by the current tracer (ie. the preemptirqsoff tracer). But it needs to be updated anytime it is allocated. When changing the size of the main buffer, instead of looking to see if the current tracer is utilizing the snapshot buffer, just check if it is allocated to know if it should be updated or not. Also fix typo in comment just above the code change. Link: https://lore.kernel.org/linux-trace-kernel/20231210225447.48476a6a@rorschach.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: ad909e21bbe69 ("tracing: Add internal tracing_snapshot() functions") Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-12ring-buffer: Fix memory leak of free pageSteven Rostedt (Google)1-0/+2
Reading the ring buffer does a swap of a sub-buffer within the ring buffer with a empty sub-buffer. This allows the reader to have full access to the content of the sub-buffer that was swapped out without having to worry about contention with the writer. The readers call ring_buffer_alloc_read_page() to allocate a page that will be used to swap with the ring buffer. When the code is finished with the reader page, it calls ring_buffer_free_read_page(). Instead of freeing the page, it stores it as a spare. Then next call to ring_buffer_alloc_read_page() will return this spare instead of calling into the memory management system to allocate a new page. Unfortunately, on freeing of the ring buffer, this spare page is not freed, and causes a memory leak. Link: https://lore.kernel.org/linux-trace-kernel/20231210221250.7b9cc83c@rorschach.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: 73a757e63114d ("ring-buffer: Return reader page back into existing ring buffer") Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-12eventfs: Fix events beyond NAME_MAX blocking tasksBeau Belgrave1-0/+4
Eventfs uses simple_lookup(), however, it will fail if the name of the entry is beyond NAME_MAX length. When this error is encountered, eventfs still tries to create dentries instead of skipping the dentry creation. When the dentry is attempted to be created in this state d_wait_lookup() will loop forever, waiting for the lookup to be removed. Fix eventfs to return the error in simple_lookup() back to the caller instead of continuing to try to create the dentry. Link: https://lore.kernel.org/linux-trace-kernel/20231210213534.497-1-beaub@linux.microsoft.com Fixes: 63940449555e ("eventfs: Implement eventfs lookup, read, open functions") Link: https://lore.kernel.org/linux-trace-kernel/20231208183601.GA46-beaub@linux.microsoft.com/ Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-12tracing: Have large events show up as '[LINE TOO BIG]' instead of nothingSteven Rostedt (Google)1-1/+5
If a large event was added to the ring buffer that is larger than what the trace_seq can handle, it just drops the output: ~# cat /sys/kernel/tracing/trace # tracer: nop # # entries-in-buffer/entries-written: 2/2 #P:8 # # _-----=> irqs-off/BH-disabled # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / _-=> migrate-disable # |||| / delay # TASK-PID CPU# ||||| TIMESTAMP FUNCTION # | | | ||||| | | <...>-859 [001] ..... 141.118951: tracing_mark_write <...>-859 [001] ..... 141.148201: tracing_mark_write: 78901234 Instead, catch this case and add some context: ~# cat /sys/kernel/tracing/trace # tracer: nop # # entries-in-buffer/entries-written: 2/2 #P:8 # # _-----=> irqs-off/BH-disabled # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / _-=> migrate-disable # |||| / delay # TASK-PID CPU# ||||| TIMESTAMP FUNCTION # | | | ||||| | | <...>-852 [001] ..... 121.550551: tracing_mark_write[LINE TOO BIG] <...>-852 [001] ..... 121.550581: tracing_mark_write: 78901234 This now emulates the same output as trace_pipe. Link: https://lore.kernel.org/linux-trace-kernel/20231209171058.78c1a026@gandalf.local.home Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-12ring-buffer: Fix writing to the buffer with max_data_sizeSteven Rostedt (Google)1-1/+6
The maximum ring buffer data size is the maximum size of data that can be recorded on the ring buffer. Events must be smaller than the sub buffer data size minus any meta data. This size is checked before trying to allocate from the ring buffer because the allocation assumes that the size will fit on the sub buffer. The maximum size was calculated as the size of a sub buffer page (which is currently PAGE_SIZE minus the sub buffer header) minus the size of the meta data of an individual event. But it missed the possible adding of a time stamp for events that are added long enough apart that the event meta data can't hold the time delta. When an event is added that is greater than the current BUF_MAX_DATA_SIZE minus the size of a time stamp, but still less than or equal to BUF_MAX_DATA_SIZE, the ring buffer would go into an infinite loop, looking for a page that can hold the event. Luckily, there's a check for this loop and after 1000 iterations and a warning is emitted and the ring buffer is disabled. But this should never happen. This can happen when a large event is added first, or after a long period where an absolute timestamp is prefixed to the event, increasing its size by 8 bytes. This passes the check and then goes into the algorithm that causes the infinite loop. For events that are the first event on the sub-buffer, it does not need to add a timestamp, because the sub-buffer itself contains an absolute timestamp, and adding one is redundant. The fix is to check if the event is to be the first event on the sub-buffer, and if it is, then do not add a timestamp. This also fixes 32 bit adding a timestamp when a read of before_stamp or write_stamp is interrupted. There's still no need to add that timestamp if the event is going to be the first event on the sub buffer. Also, if the buffer has "time_stamp_abs" set, then also check if the length plus the timestamp is greater than the BUF_MAX_DATA_SIZE. Link: https://lore.kernel.org/all/20231212104549.58863438@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20231212071837.5fdd6c13@gandalf.local.home Link: https://lore.kernel.org/linux-trace-kernel/20231212111617.39e02849@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: a4543a2fa9ef3 ("ring-buffer: Get timestamp after event is allocated") Fixes: 58fbc3c63275c ("ring-buffer: Consolidate add_timestamp to remove some branches") Reported-by: Kent Overstreet <kent.overstreet@linux.dev> # (on IRC) Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-12net: Remove acked SYN flag from packet in the transmit queue correctlyDong Chenchen1-0/+6
syzkaller report: kernel BUG at net/core/skbuff.c:3452! invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty #135 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452) Call Trace: icmp_glue_bits (net/ipv4/icmp.c:357) __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165) ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341) icmp_push_reply (net/ipv4/icmp.c:370) __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772) ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577) __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295) ip_output (net/ipv4/ip_output.c:427) __ip_queue_xmit (net/ipv4/ip_output.c:535) __tcp_transmit_skb (net/ipv4/tcp_output.c:1462) __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387) tcp_retransmit_skb (net/ipv4/tcp_output.c:3404) tcp_retransmit_timer (net/ipv4/tcp_timer.c:604) tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716) The panic issue was trigered by tcp simultaneous initiation. The initiation process is as follows: TCP A TCP B 1. CLOSED CLOSED 2. SYN-SENT --> <SEQ=100><CTL=SYN> ... 3. SYN-RECEIVED <-- <SEQ=300><CTL=SYN> <-- SYN-SENT 4. ... <SEQ=100><CTL=SYN> --> SYN-RECEIVED 5. SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ... // TCP B: not send challenge ack for ack limit or packet loss // TCP A: close tcp_close tcp_send_fin if (!tskb && tcp_under_memory_pressure(sk)) tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN; // set FIN flag 6. FIN_WAIT_1 --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // TCP B: send challenge ack to SYN_FIN_ACK 7. ... <SEQ=301><ACK=101><CTL=ACK> <-- SYN-RECEIVED //challenge ack // TCP A: <SND.UNA=101> 8. FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic __tcp_retransmit_skb //skb->len=0 tcp_trim_head len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100 __pskb_trim_head skb->data_len -= len // skb->len=-1, wrap around ... ... ip_fragment icmp_glue_bits //BUG_ON If we use tcp_trim_head() to remove acked SYN from packet that contains data or other flags, skb->len will be incorrectly decremented. We can remove SYN flag that has been acked from rtx_queue earlier than tcp_trim_head(), which can fix the problem mentioned above. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Co-developed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12qed: Fix a potential use-after-free in qed_cxt_tables_allocDinghao Liu1-0/+1
qed_ilt_shadow_alloc() will call qed_ilt_shadow_free() to free p_hwfn->p_cxt_mngr->ilt_shadow on error. However, qed_cxt_tables_alloc() accesses the freed pointer on failure of qed_ilt_shadow_alloc() through calling qed_cxt_mngr_free(), which may lead to use-after-free. Fix this issue by setting p_mngr->ilt_shadow to NULL in qed_ilt_shadow_free(). Fixes: fe56b9e6a8d9 ("qed: Add module with basic common support") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Link: https://lore.kernel.org/r/20231210045255.21383-1-dinghao.liu@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12Merge tag 'ext4_for_linus-6.7-rc6' of ↵Linus Torvalds5-20/+35
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Fix various bugs / regressions for ext4, including a soft lockup, a WARN_ON, and a BUG" * tag 'ext4_for_linus-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: jbd2: fix soft lockup in journal_finish_inode_data_buffers() ext4: fix warning in ext4_dio_write_end_io() jbd2: increase the journal IO's priority jbd2: correct the printing of write_flags in jbd2_write_superblock() ext4: prevent the normalized size from exceeding EXT_MAX_BLOCKS
2023-12-12iavf: Fix iavf_shutdown to call iavf_remove instead iavf_closeSlawomir Laba1-51/+21
Make the flow for pci shutdown be the same to the pci remove. iavf_shutdown was implementing an incomplete version of iavf_remove. It misses several calls to the kernel like iavf_free_misc_irq, iavf_reset_interrupt_capability, iounmap that might break the system on reboot or hibernation. Implement the call of iavf_remove directly in iavf_shutdown to close this gap. Fixes below error messages (dmesg) during shutdown stress tests - [685814.900917] ice 0000:88:00.0: MAC 02:d0:5f:82:43:5d does not exist for VF 0 [685814.900928] ice 0000:88:00.0: MAC 33:33:00:00:00:01 does not exist for VF 0 Reproduction: 1. Create one VF interface: echo 1 > /sys/class/net/<interface_name>/device/sriov_numvfs 2. Run live dmesg on the host: dmesg -wH 3. On SUT, script below steps into vf_namespace_assignment.sh <#!/bin/sh> // Remove <>. Git removes # line if=<VF name> (edit this per VF name) loop=0 while true; do echo test round $loop let loop++ ip netns add ns$loop ip link set dev $if up ip link set dev $if netns ns$loop ip netns exec ns$loop ip link set dev $if up ip netns exec ns$loop ip link set dev $if netns 1 ip netns delete ns$loop done 4. Run the script for at least 1000 iterations on SUT: ./vf_namespace_assignment.sh Expected result: No errors in dmesg. Fixes: 129cf89e5856 ("iavf: rename functions and structs to new name") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Co-developed-by: Ranganatha Rao <ranganatha.rao@intel.com> Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-12iavf: Handle ntuple on/off based on new state machines for flow directorPiotr Gardocki1-0/+59
ntuple-filter feature on/off: Default is on. If turned off, the filters will be removed from both PF and iavf list. The removal is irrespective of current filter state. Steps to reproduce: ------------------- 1. Ensure ntuple is on. ethtool -K enp8s0 ntuple-filters on 2. Create a filter to receive the traffic into non-default rx-queue like 15 and ensure traffic is flowing into queue into 15. Now, turn off ntuple. Traffic should not flow to configured queue 15. It should flow to default RX queue. Fixes: 0dbfbabb840d ("iavf: Add framework to enable ethtool ntuple filters") Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-12iavf: Introduce new state machines for flow directorPiotr Gardocki5-23/+139
New states introduced: IAVF_FDIR_FLTR_DIS_REQUEST IAVF_FDIR_FLTR_DIS_PENDING IAVF_FDIR_FLTR_INACTIVE Current FDIR state machines (SM) are not adequate to handle a few scenarios in the link DOWN/UP event, reset event and ntuple-feature. For example, when VF link goes DOWN and comes back UP administratively, the expectation is that previously installed filters should also be restored. But with current SM, filters are not restored. So with new SM, during link DOWN filters are marked as INACTIVE in the iavf list but removed from PF. After link UP, SM will transition from INACTIVE to ADD_REQUEST to restore the filter. Similarly, with VF reset, filters will be removed from the PF, but marked as INACTIVE in the iavf list. Filters will be restored after reset completion. Steps to reproduce: ------------------- 1. Create a VF. Here VF is enp8s0. 2. Assign IP addresses to VF and link partner and ping continuously from remote. Here remote IP is 1.1.1.1. 3. Check default RX Queue of traffic. ethtool -S enp8s0 | grep -E "rx-[[:digit:]]+\.packets" 4. Add filter - change default RX Queue (to 15 here) ethtool -U ens8s0 flow-type ip4 src-ip 1.1.1.1 action 15 loc 5 5. Ensure filter gets added and traffic is received on RX queue 15 now. Link event testing: ------------------- 6. Bring VF link down and up. If traffic flows to configured queue 15, test is success, otherwise it is a failure. Reset event testing: -------------------- 7. Reset the VF. If traffic flows to configured queue 15, test is success, otherwise it is a failure. Fixes: 0dbfbabb840d ("iavf: Add framework to enable ethtool ntuple filters") Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-12Merge tag 'fuse-fixes-6.7-rc6' of ↵Linus Torvalds6-16/+106
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: - Fix a couple of potential crashes, one introduced in 6.6 and one in 5.10 - Fix misbehavior of virtiofs submounts on memory pressure - Clarify naming in the uAPI for a recent feature * tag 'fuse-fixes-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: disable FOPEN_PARALLEL_DIRECT_WRITES with FUSE_DIRECT_IO_ALLOW_MMAP fuse: dax: set fc->dax to NULL in fuse_dax_conn_free() fuse: share lookup state between submount and its parent docs/fuse-io: Document the usage of DIRECT_IO_ALLOW_MMAP fuse: Rename DIRECT_IO_RELAX to DIRECT_IO_ALLOW_MMAP
2023-12-12Merge tag '6.7-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds8-45/+171
Pull smb server fixes from Steve French: - Memory leak fix (in lock error path) - Two fixes for create with allocation size - FIx for potential UAF in lease break error path - Five directory lease (caching) fixes found during additional recent testing * tag '6.7-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix wrong name of SMB2_CREATE_ALLOCATION_SIZE ksmbd: fix wrong allocation size update in smb2_open() ksmbd: avoid duplicate opinfo_put() call on error of smb21_lease_break_ack() ksmbd: lazy v2 lease break on smb2_write() ksmbd: send v2 lease break notification for directory ksmbd: downgrade RWH lease caching state to RH for directory ksmbd: set v2 lease capability ksmbd: set epoch in create context v2 lease ksmbd: fix memory leak in smb2_lock()
2023-12-12drm/amd/display: Disable PSR-SU on Parade 0803 TCON againMario Limonciello1-0/+2
When screen brightness is rapidly changed and PSR-SU is enabled the display hangs on panels with this TCON even on the latest DCN 3.1.4 microcode (0x8002a81 at this time). This was disabled previously as commit 072030b17830 ("drm/amd: Disable PSR-SU on Parade 0803 TCON") but reverted as commit 1e66a17ce546 ("Revert "drm/amd: Disable PSR-SU on Parade 0803 TCON"") in favor of testing for a new enough microcode (commit cd2e31a9ab93 ("drm/amd/display: Set minimum requirement for using PSR-SU on Phoenix")). As hangs are still happening specifically with this TCON, disable PSR-SU again for it until it can be root caused. Cc: stable@vger.kernel.org Cc: aaron.ma@canonical.com Cc: binli@gnome.org Cc: Marc Rossi <Marc.Rossi@amd.com> Cc: Hamza Mahfooz <Hamza.Mahfooz@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2046131 Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-12drm/amd/display: Populate dtbclk from bounding boxFangzhi Zuo2-7/+12
dtbclk is unavaliable from pmfw. Try to grab the value from bounding box Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-12drm/amd/display: Revert "Fix conversions between bytes and KB"Taimur Hassan1-8/+8
[Why & How] HostVMMinPageSize is expected to be in KB according to spec, the checks later down the line reflect this as well. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Taimur Hassan <syed.hassan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-12drm/amdgpu/jpeg: configure doorbell for each playbackSaleemkhan Jamadar1-7/+8
Doorbell is configured during start of each playback. v1 - add comment for the doorbell programming change Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-12arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modifyJames Houghton1-0/+6
It is currently possible for a userspace application to enter an infinite page fault loop when using HugeTLB pages implemented with contiguous PTEs when HAFDBS is not available. This happens because: 1. The kernel may sometimes write PTEs that are sw-dirty but hw-clean (PTE_DIRTY | PTE_RDONLY | PTE_WRITE). 2. If, during a write, the CPU uses a sw-dirty, hw-clean PTE in handling the memory access on a system without HAFDBS, we will get a page fault. 3. HugeTLB will check if it needs to update the dirty bits on the PTE. For contiguous PTEs, it will check to see if the pgprot bits need updating. In this case, HugeTLB wants to write a sequence of sw-dirty, hw-dirty PTEs, but it finds that all the PTEs it is about to overwrite are all pte_dirty() (pte_sw_dirty() => pte_dirty()), so it thinks no update is necessary. We can get the kernel to write a sw-dirty, hw-clean PTE with the following steps (showing the relevant VMA flags and pgprot bits): i. Create a valid, writable contiguous PTE. VMA vmflags: VM_SHARED | VM_READ | VM_WRITE VMA pgprot bits: PTE_RDONLY | PTE_WRITE PTE pgprot bits: PTE_DIRTY | PTE_WRITE ii. mprotect the VMA to PROT_NONE. VMA vmflags: VM_SHARED VMA pgprot bits: PTE_RDONLY PTE pgprot bits: PTE_DIRTY | PTE_RDONLY iii. mprotect the VMA back to PROT_READ | PROT_WRITE. VMA vmflags: VM_SHARED | VM_READ | VM_WRITE VMA pgprot bits: PTE_RDONLY | PTE_WRITE PTE pgprot bits: PTE_DIRTY | PTE_WRITE | PTE_RDONLY Make it impossible to create a writeable sw-dirty, hw-clean PTE with pte_modify(). Such a PTE should be impossible to create, and there may be places that assume that pte_dirty() implies pte_hw_dirty(). Signed-off-by: James Houghton <jthoughton@google.com> Fixes: 031e6e6b4e12 ("arm64: hugetlb: Avoid unnecessary clearing in huge_ptep_set_access_flags") Cc: <stable@vger.kernel.org> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20231204172646.2541916-3-jthoughton@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-12-12jbd2: fix soft lockup in journal_finish_inode_data_buffers()Ye Bin1-0/+1
There's issue when do io test: WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170] CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G OE Call trace: dump_backtrace+0x0/0x1a0 show_stack+0x24/0x30 dump_stack+0xb0/0x100 watchdog_timer_fn+0x254/0x3f8 __hrtimer_run_queues+0x11c/0x380 hrtimer_interrupt+0xfc/0x2f8 arch_timer_handler_phys+0x38/0x58 handle_percpu_devid_irq+0x90/0x248 generic_handle_irq+0x3c/0x58 __handle_domain_irq+0x68/0xc0 gic_handle_irq+0x90/0x320 el1_irq+0xcc/0x180 queued_spin_lock_slowpath+0x1d8/0x320 jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2] kjournald2+0xec/0x2f0 [jbd2] kthread+0x134/0x138 ret_from_fork+0x10/0x18 Analyzed informations from vmcore as follows: (1) There are about 5k+ jbd2_inode in 'commit_transaction->t_inode_list'; (2) Now is processing the 855th jbd2_inode; (3) JBD2 task has TIF_NEED_RESCHED flag; (4) There's no pags in address_space around the 855th jbd2_inode; (5) There are some process is doing drop caches; (6) Mounted with 'nodioread_nolock' option; (7) 128 CPUs; According to informations from vmcore we know 'journal->j_list_lock' spin lock competition is fierce. So journal_finish_inode_data_buffers() maybe process slowly. Theoretically, there is scheduling point in the filemap_fdatawait_range_keep_errors(). However, if inode's address_space has no pages which taged with PAGECACHE_TAG_WRITEBACK, will not call cond_resched(). So may lead to soft lockup. journal_finish_inode_data_buffers filemap_fdatawait_range_keep_errors __filemap_fdatawait_range while (index <= end) nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK); if (!nr_pages) break; --> If 'nr_pages' is equal zero will break, then will not call cond_resched() for (i = 0; i < nr_pages; i++) wait_on_page_writeback(page); cond_resched(); To solve above issue, add scheduling point in the journal_finish_inode_data_buffers(); Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20231211112544.3879780-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-12-12HID: apple: Add "hfd.cn" and "WKB603" to the list of non-apple keyboardsYan Jun1-0/+2
JingZao(京造) WKB603 keyboard is a rebranded product of Jamesdonkey RS2 keyboard, identified as "hfd.cn WKB603" in wired mode, "WKB603" in bluetooth mode. Adding them to the list of non-apple keyboards fixes function key. Signed-off-by: Yan Jun <jerrysteve1101@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2023-12-12HID: lenovo: Restrict detection of patched firmware only to USB cptkbdMikhail Khvainitski1-1/+2
Commit 46a0a2c96f0f ("HID: lenovo: Detect quirk-free fw on cptkbd and stop applying workaround") introduced a regression for ThinkPad TrackPoint Keyboard II which has similar quirks to cptkbd (so it uses the same workarounds) but slightly different so that there are false-positives during detecting well-behaving firmware. This commit restricts detecting well-behaving firmware to the only model which known to have one and have stable enough quirks to not cause false-positives. Fixes: 46a0a2c96f0f ("HID: lenovo: Detect quirk-free fw on cptkbd and stop applying workaround") Link: https://lore.kernel.org/linux-input/ZXRiiPsBKNasioqH@jekhomev/ Link: https://bbs.archlinux.org/viewtopic.php?pid=2135468#p2135468 Signed-off-by: Mikhail Khvainitski <me@khvoinitsky.org> Tested-by: Yauhen Kharuzhy <jekhor@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2023-12-12net/rose: Fix Use-After-Free in rose_ioctlHyunwoo Kim1-1/+3
Because rose_ioctl() accesses sk->sk_receive_queue without holding a sk->sk_receive_queue.lock, it can cause a race with rose_accept(). A use-after-free for skb occurs with the following flow. ``` rose_ioctl() -> skb_peek() rose_accept() -> skb_dequeue() -> kfree_skb() ``` Add sk->sk_receive_queue.lock to rose_ioctl() to fix this issue. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Link: https://lore.kernel.org/r/20231209100538.GA407321@v4bel-B760M-AORUS-ELITE-AX Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-12atm: Fix Use-After-Free in do_vcc_ioctlHyunwoo Kim1-2/+5
Because do_vcc_ioctl() accesses sk->sk_receive_queue without holding a sk->sk_receive_queue.lock, it can cause a race with vcc_recvmsg(). A use-after-free for skb occurs with the following flow. ``` do_vcc_ioctl() -> skb_peek() vcc_recvmsg() -> skb_recv_datagram() -> skb_free_datagram() ``` Add sk->sk_receive_queue.lock to do_vcc_ioctl() to fix this issue. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Link: https://lore.kernel.org/r/20231209094210.GA403126@v4bel-B760M-AORUS-ELITE-AX Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-12perf/arm-cmn: Fail DTC counter allocation correctlyRobin Murphy1-1/+1
Calling arm_cmn_event_clear() before all DTC indices are allocated is wrong, and can lead to arm_cmn_event_add() erroneously clearing live counters from full DTCs where allocation fails. Since the DTC counters are only updated by arm_cmn_init_counter() after all DTC and DTM allocations succeed, nothing actually needs cleaning up in this case anyway, and it should just return directly as it did before. Fixes: 7633ec2c262f ("perf/arm-cmn: Rework DTC counters (again)") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/ed589c0d8e4130dc68b8ad1625226d28bdc185d4.1702322847.git.robin.murphy@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-12-11bcachefs: Fix nocow locks deadlockKent Overstreet1-1/+2
On trylock failure we were waiting for outstanding reads to complete - but nocow locks need to be held until the whole move is finished. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-11Merge tag 'bcachefs-2023-12-10' of https://evilpiepirate.org/git/bcachefsLinus Torvalds20-38/+84
Pull more bcachefs bugfixes from Kent Overstreet: - Fix a rare emergency shutdown path bug: dropping journal pins after the filesystem has mostly been torn down is not what we want. - Fix some concurrency issues with the btree write buffer and journal replay by not using the btree write buffer until journal replay is finished - A fixup from the prior patch to kill journal pre-reservations: at the start of the btree update path, where previously we took a pre-reservation, we do at least want to check the journal watermark. - Fix a race between dropping device metadata and btree node writes, which would re-add a pointer to a device that had just been dropped - Fix one of the SCRU lock warnings, in bch2_compression_stats_to_text(). - Partial fix for a rare transaction paths overflow, when indirect extents had been split by background tasks, by not running certain triggers when they're not needed. - Fix for creating a snapshot with implicit source in a subdirectory of the containing subvolume - Don't unfreeze when we're emergency read-only - Fix for rebalance spinning trying to compress unwritten extentns - Another deleted_inodes fix, for directories - Fix a rare deadlock (usually just an unecessary wait) when flushing the journal with an open journal entry. * tag 'bcachefs-2023-12-10' of https://evilpiepirate.org/git/bcachefs: bcachefs: Close journal entry if necessary when flushing all pins bcachefs: Fix uninitialized var in bch2_journal_replay() bcachefs: Fix deleted inode check for dirs bcachefs: rebalance shouldn't attempt to compress unwritten extents bcachefs: don't attempt rw on unfreeze when shutdown bcachefs: Fix creating snapshot with implict source bcachefs: Don't run indirect extent trigger unless inserting/deleting bcachefs: Convert compression_stats to for_each_btree_key2 bcachefs: Fix bch2_extent_drop_ptrs() call bcachefs: Fix a journal deadlock in replay bcachefs; Don't use btree write buffer until journal replay is finished bcachefs: Don't drop journal pins in exit path
2023-12-11afs: Fix refcount underflow from error handling raceDavid Howells1-1/+1
If an AFS cell that has an unreachable (eg. ENETUNREACH) server listed (VL server or fileserver), an asynchronous probe to one of its addresses may fail immediately because sendmsg() returns an error. When this happens, a refcount underflow can happen if certain events hit a very small window. The way this occurs is: (1) There are two levels of "call" object, the afs_call and the rxrpc_call. Each of them can be transitioned to a "completed" state in the event of success or failure. (2) Asynchronous afs_calls are self-referential whilst they are active to prevent them from evaporating when they're not being processed. This reference is disposed of when the afs_call is completed. Note that an afs_call may only be completed once; once completed completing it again will do nothing. (3) When a call transmission is made, the app-side rxrpc code queues a Tx buffer for the rxrpc I/O thread to transmit. The I/O thread invokes sendmsg() to transmit it - and in the case of failure, it transitions the rxrpc_call to the completed state. (4) When an rxrpc_call is completed, the app layer is notified. In this case, the app is kafs and it schedules a work item to process events pertaining to an afs_call. (5) When the afs_call event processor is run, it goes down through the RPC-specific handler to afs_extract_data() to retrieve data from rxrpc - and, in this case, it picks up the error from the rxrpc_call and returns it. The error is then propagated to the afs_call and that is completed too. At this point the self-reference is released. (6) If the rxrpc I/O thread manages to complete the rxrpc_call within the window between rxrpc_send_data() queuing the request packet and checking for call completion on the way out, then rxrpc_kernel_send_data() will return the error from sendmsg() to the app. (7) Then afs_make_call() will see an error and will jump to the error handling path which will attempt to clean up the afs_call. (8) The problem comes when the error handling path in afs_make_call() tries to unconditionally drop an async afs_call's self-reference. This self-reference, however, may already have been dropped by afs_extract_data() completing the afs_call (9) The refcount underflows when we return to afs_do_probe_vlserver() and that tries to drop its reference on the afs_call. Fix this by making afs_make_call() attempt to complete the afs_call rather than unconditionally putting it. That way, if afs_extract_data() manages to complete the call first, afs_make_call() won't do anything. The bug can be forced by making do_udp_sendmsg() return -ENETUNREACH and sticking an msleep() in rxrpc_send_data() after the 'success:' label to widen the race window. The error message looks something like: refcount_t: underflow; use-after-free. WARNING: CPU: 3 PID: 720 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110 ... RIP: 0010:refcount_warn_saturate+0xba/0x110 ... afs_put_call+0x1dc/0x1f0 [kafs] afs_fs_get_capabilities+0x8b/0xe0 [kafs] afs_fs_probe_fileserver+0x188/0x1e0 [kafs] afs_lookup_server+0x3bf/0x3f0 [kafs] afs_alloc_server_list+0x130/0x2e0 [kafs] afs_create_volume+0x162/0x400 [kafs] afs_get_tree+0x266/0x410 [kafs] vfs_get_tree+0x25/0xc0 fc_mount+0xe/0x40 afs_d_automount+0x1b3/0x390 [kafs] __traverse_mounts+0x8f/0x210 step_into+0x340/0x760 path_openat+0x13a/0x1260 do_filp_open+0xaf/0x160 do_sys_openat2+0xaf/0x170 or something like: refcount_t: underflow; use-after-free. ... RIP: 0010:refcount_warn_saturate+0x99/0xda ... afs_put_call+0x4a/0x175 afs_send_vl_probes+0x108/0x172 afs_select_vlserver+0xd6/0x311 afs_do_cell_detect_alias+0x5e/0x1e9 afs_cell_detect_alias+0x44/0x92 afs_validate_fc+0x9d/0x134 afs_get_tree+0x20/0x2e6 vfs_get_tree+0x1d/0xc9 fc_mount+0xe/0x33 afs_d_automount+0x48/0x9d __traverse_mounts+0xe0/0x166 step_into+0x140/0x274 open_last_lookups+0x1c1/0x1df path_openat+0x138/0x1c3 do_filp_open+0x55/0xb4 do_sys_openat2+0x6c/0xb6 Fixes: 34fa47612bfe ("afs: Fix race in async call refcounting") Reported-by: Bill MacAllister <bill@ca-zephyr.org> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052304 Suggested-by: Jeffrey E Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/2633992.1702073229@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-11smb: client: fix OOB in smb2_query_reparse_point()Paulo Alcantara1-10/+16
Validate @ioctl_rsp->OutputOffset and @ioctl_rsp->OutputCount so that their sum does not wrap to a number that is smaller than @reparse_buf and we end up with a wild pointer as follows: BUG: unable to handle page fault for address: ffff88809c5cd45f #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 4a01067 P4D 4a01067 PUD 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 2 PID: 1260 Comm: mount.cifs Not tainted 6.7.0-rc4 #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 RIP: 0010:smb2_query_reparse_point+0x3e0/0x4c0 [cifs] Code: ff ff e8 f3 51 fe ff 41 89 c6 58 5a 45 85 f6 0f 85 14 fe ff ff 49 8b 57 48 8b 42 60 44 8b 42 64 42 8d 0c 00 49 39 4f 50 72 40 <8b> 04 02 48 8b 9d f0 fe ff ff 49 8b 57 50 89 03 48 8b 9d e8 fe ff RSP: 0018:ffffc90000347a90 EFLAGS: 00010212 RAX: 000000008000001f RBX: ffff88800ae11000 RCX: 00000000000000ec RDX: ffff88801c5cd440 RSI: 0000000000000000 RDI: ffffffff82004aa4 RBP: ffffc90000347bb0 R08: 00000000800000cd R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000024 R12: ffff8880114d4100 R13: ffff8880114d4198 R14: 0000000000000000 R15: ffff8880114d4000 FS: 00007f02c07babc0(0000) GS:ffff88806ba00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff88809c5cd45f CR3: 0000000011750000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x23/0x70 ? page_fault_oops+0x181/0x480 ? search_module_extables+0x19/0x60 ? srso_alias_return_thunk+0x5/0xfbef5 ? exc_page_fault+0x1b6/0x1c0 ? asm_exc_page_fault+0x26/0x30 ? _raw_spin_unlock_irqrestore+0x44/0x60 ? smb2_query_reparse_point+0x3e0/0x4c0 [cifs] cifs_get_fattr+0x16e/0xa50 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? lock_acquire+0xbf/0x2b0 cifs_root_iget+0x163/0x5f0 [cifs] cifs_smb3_do_mount+0x5bd/0x780 [cifs] smb3_get_tree+0xd9/0x290 [cifs] vfs_get_tree+0x2c/0x100 ? capable+0x37/0x70 path_mount+0x2d7/0xb80 ? srso_alias_return_thunk+0x5/0xfbef5 ? _raw_spin_unlock_irqrestore+0x44/0x60 __x64_sys_mount+0x11a/0x150 do_syscall_64+0x47/0xf0 entry_SYSCALL_64_after_hwframe+0x6f/0x77 RIP: 0033:0x7f02c08d5b1e Fixes: 2e4564b31b64 ("smb3: add support for stat of WSL reparse points for special file types") Cc: stable@vger.kernel.org Reported-by: Robert Morris <rtm@csail.mit.edu> Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-11smb: client: fix NULL deref in asn1_ber_decoder()Paulo Alcantara1-16/+10
If server replied SMB2_NEGOTIATE with a zero SecurityBufferOffset, smb2_get_data_area() sets @len to non-zero but return NULL, so decode_negTokeninit() ends up being called with a NULL @security_blob: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 2 PID: 871 Comm: mount.cifs Not tainted 6.7.0-rc4 #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 RIP: 0010:asn1_ber_decoder+0x173/0xc80 Code: 01 4c 39 2c 24 75 09 45 84 c9 0f 85 2f 03 00 00 48 8b 14 24 4c 29 ea 48 83 fa 01 0f 86 1e 07 00 00 48 8b 74 24 28 4d 8d 5d 01 <42> 0f b6 3c 2e 89 fa 40 88 7c 24 5c f7 d2 83 e2 1f 0f 84 3d 07 00 RSP: 0018:ffffc9000063f950 EFLAGS: 00010202 RAX: 0000000000000002 RBX: 0000000000000000 RCX: 000000000000004a RDX: 000000000000004a RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000002 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000000 R14: 000000000000004d R15: 0000000000000000 FS: 00007fce52b0fbc0(0000) GS:ffff88806ba00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000001ae64000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x23/0x70 ? page_fault_oops+0x181/0x480 ? __stack_depot_save+0x1e6/0x480 ? exc_page_fault+0x6f/0x1c0 ? asm_exc_page_fault+0x26/0x30 ? asn1_ber_decoder+0x173/0xc80 ? check_object+0x40/0x340 decode_negTokenInit+0x1e/0x30 [cifs] SMB2_negotiate+0xc99/0x17c0 [cifs] ? smb2_negotiate+0x46/0x60 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 smb2_negotiate+0x46/0x60 [cifs] cifs_negotiate_protocol+0xae/0x130 [cifs] cifs_get_smb_ses+0x517/0x1040 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? queue_delayed_work_on+0x5d/0x90 cifs_mount_get_session+0x78/0x200 [cifs] dfs_mount_share+0x13a/0x9f0 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? lock_acquire+0xbf/0x2b0 ? find_nls+0x16/0x80 ? srso_alias_return_thunk+0x5/0xfbef5 cifs_mount+0x7e/0x350 [cifs] cifs_smb3_do_mount+0x128/0x780 [cifs] smb3_get_tree+0xd9/0x290 [cifs] vfs_get_tree+0x2c/0x100 ? capable+0x37/0x70 path_mount+0x2d7/0xb80 ? srso_alias_return_thunk+0x5/0xfbef5 ? _raw_spin_unlock_irqrestore+0x44/0x60 __x64_sys_mount+0x11a/0x150 do_syscall_64+0x47/0xf0 entry_SYSCALL_64_after_hwframe+0x6f/0x77 RIP: 0033:0x7fce52c2ab1e Fix this by setting @len to zero when @off == 0 so callers won't attempt to dereference non-existing data areas. Reported-by: Robert Morris <rtm@csail.mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-11smb: client: fix potential OOBs in smb2_parse_contexts()Paulo Alcantara3-47/+75
Validate offsets and lengths before dereferencing create contexts in smb2_parse_contexts(). This fixes following oops when accessing invalid create contexts from server: BUG: unable to handle page fault for address: ffff8881178d8cc3 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 4a01067 P4D 4a01067 PUD 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 1736 Comm: mount.cifs Not tainted 6.7.0-rc4 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 RIP: 0010:smb2_parse_contexts+0xa0/0x3a0 [cifs] Code: f8 10 75 13 48 b8 93 ad 25 50 9c b4 11 e7 49 39 06 0f 84 d2 00 00 00 8b 45 00 85 c0 74 61 41 29 c5 48 01 c5 41 83 fd 0f 76 55 <0f> b7 7d 04 0f b7 45 06 4c 8d 74 3d 00 66 83 f8 04 75 bc ba 04 00 RSP: 0018:ffffc900007939e0 EFLAGS: 00010216 RAX: ffffc90000793c78 RBX: ffff8880180cc000 RCX: ffffc90000793c90 RDX: ffffc90000793cc0 RSI: ffff8880178d8cc0 RDI: ffff8880180cc000 RBP: ffff8881178d8cbf R08: ffffc90000793c22 R09: 0000000000000000 R10: ffff8880180cc000 R11: 0000000000000024 R12: 0000000000000000 R13: 0000000000000020 R14: 0000000000000000 R15: ffffc90000793c22 FS: 00007f873753cbc0(0000) GS:ffff88806bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff8881178d8cc3 CR3: 00000000181ca000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x23/0x70 ? page_fault_oops+0x181/0x480 ? search_module_extables+0x19/0x60 ? srso_alias_return_thunk+0x5/0xfbef5 ? exc_page_fault+0x1b6/0x1c0 ? asm_exc_page_fault+0x26/0x30 ? smb2_parse_contexts+0xa0/0x3a0 [cifs] SMB2_open+0x38d/0x5f0 [cifs] ? smb2_is_path_accessible+0x138/0x260 [cifs] smb2_is_path_accessible+0x138/0x260 [cifs] cifs_is_path_remote+0x8d/0x230 [cifs] cifs_mount+0x7e/0x350 [cifs] cifs_smb3_do_mount+0x128/0x780 [cifs] smb3_get_tree+0xd9/0x290 [cifs] vfs_get_tree+0x2c/0x100 ? capable+0x37/0x70 path_mount+0x2d7/0xb80 ? srso_alias_return_thunk+0x5/0xfbef5 ? _raw_spin_unlock_irqrestore+0x44/0x60 __x64_sys_mount+0x11a/0x150 do_syscall_64+0x47/0xf0 entry_SYSCALL_64_after_hwframe+0x6f/0x77 RIP: 0033:0x7f8737657b1e Reported-by: Robert Morris <rtm@csail.mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-11smb: client: fix OOB in receive_encrypted_standard()Paulo Alcantara1-6/+8
Fix potential OOB in receive_encrypted_standard() if server returned a large shdr->NextCommand that would end up writing off the end of @next_buffer. Fixes: b24df3e30cbf ("cifs: update receive_encrypted_standard to handle compounded responses") Cc: stable@vger.kernel.org Reported-by: Robert Morris <rtm@csail.mit.edu> Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-11PCI/ASPM: Add pci_disable_link_state_locked() lockdep assertJohan Hovold1-0/+2
Add a lockdep assert to pci_disable_link_state_locked() which should only be called with a pci_bus_sem read lock held. Link: https://lore.kernel.org/r/20231128081512.19387-7-johan+linaro@kernel.org Signed-off-by: Johan Hovold <johan+linaro@kernel.org> [bhelgaas: include function name in subject, commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2023-12-11PCI/ASPM: Clean up __pci_disable_link_state() 'sem' parameterJohan Hovold1-5/+5
Replace the current 'sem' parameter to the __pci_disable_link_state() helper with a more descriptive 'locked' parameter, which indicates whether a pci_bus_sem read lock is already held. Link: https://lore.kernel.org/r/20231128081512.19387-6-johan+linaro@kernel.org Signed-off-by: Johan Hovold <johan+linaro@kernel.org> [bhelgaas: include function name in subject, commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2023-12-11PCI: qcom: Clean up ASPM commentJohan Hovold1-1/+4
Break up the newly added ASPM comment so that it fits within the soft 80 character limit and becomes more readable. Link: https://lore.kernel.org/r/20231128081512.19387-5-johan+linaro@kernel.org Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-12-11PCI: qcom: Fix potential deadlock when enabling ASPMJohan Hovold1-1/+1
The qcom_pcie_enable_aspm() helper is called from pci_walk_bus() during host init to enable ASPM. Since pci_walk_bus() already holds a pci_bus_sem read lock, use pci_enable_link_state_locked() to enable link states in order to avoid a potential deadlock (e.g. in case someone takes a write lock before reacquiring the read lock). This issue was reported by lockdep: ============================================ WARNING: possible recursive locking detected 6.7.0-rc1 #4 Not tainted -------------------------------------------- kworker/u16:6/147 is trying to acquire lock: ffffbf3ff9d2cfa0 (pci_bus_sem){++++}-{3:3}, at: pci_enable_link_state+0x74/0x1e8 but task is already holding lock: ffffbf3ff9d2cfa0 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x34/0xbc other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(pci_bus_sem); lock(pci_bus_sem); *** DEADLOCK *** Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops") Link: https://lore.kernel.org/r/20231128081512.19387-4-johan+linaro@kernel.org Signed-off-by: Johan Hovold <johan+linaro@kernel.org> [bhelgaas: add "potential" in subject since the deadlock has only been reported by lockdep, include helper name in commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2023-12-11PCI: vmd: Fix potential deadlock when enabling ASPMJohan Hovold1-1/+1
The vmd_pm_enable_quirk() helper is called from pci_walk_bus() during probe to enable ASPM for controllers with VMD_FEAT_BIOS_PM_QUIRK set. Since pci_walk_bus() already holds a pci_bus_sem read lock, use pci_enable_link_state_locked() to enable link states in order to avoid a potential deadlock (e.g. in case someone takes a write lock before reacquiring the read lock). Fixes: f492edb40b54 ("PCI: vmd: Add quirk to configure PCIe ASPM and LTR") Link: https://lore.kernel.org/r/20231128081512.19387-3-johan+linaro@kernel.org Signed-off-by: Johan Hovold <johan+linaro@kernel.org> [bhelgaas: add "potential" in subject since the deadlock has only been reported by lockdep, include helper name in commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: <stable@vger.kernel.org> # 6.3 Cc: Michael Bottini <michael.a.bottini@linux.intel.com> Cc: David E. Box <david.e.box@linux.intel.com>
2023-12-11PCI/ASPM: Add pci_enable_link_state_locked()Johan Hovold2-13/+43
Add pci_enable_link_state_locked() for enabling link states that can be used in contexts where a pci_bus_sem read lock is already held (e.g. from pci_walk_bus()). This helper will be used to fix a couple of potential deadlocks where the current helper is called with the lock already held, hence the CC stable tag. Fixes: f492edb40b54 ("PCI: vmd: Add quirk to configure PCIe ASPM and LTR") Link: https://lore.kernel.org/r/20231128081512.19387-2-johan+linaro@kernel.org Signed-off-by: Johan Hovold <johan+linaro@kernel.org> [bhelgaas: include helper name in subject, commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: <stable@vger.kernel.org> # 6.3 Cc: Michael Bottini <michael.a.bottini@linux.intel.com> Cc: David E. Box <david.e.box@linux.intel.com>
2023-12-11drm/amd/display: Restore guard against default backlight value < 1 nitMario Limonciello1-2/+2
Mark reports that brightness is not restored after Xorg dpms screen blank. This behavior was introduced by commit d9e865826c20 ("drm/amd/display: Simplify brightness initialization") which dropped the cached backlight value in display code, but also removed code for when the default value read back was less than 1 nit. Restore this code so that the backlight brightness is restored to the correct default value in this circumstance. Reported-by: Mark Herbert <mark.herbert42@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3031 Cc: stable@vger.kernel.org Cc: Camille Cho <camille.cho@amd.com> Cc: Krunoslav Kovac <krunoslav.kovac@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Fixes: d9e865826c20 ("drm/amd/display: Simplify brightness initialization") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-11drm/amd/display: fix hw rotated modes when PSR-SU is enabledHamza Mahfooz4-3/+16
We currently don't support dirty rectangles on hardware rotated modes. So, if a user is using hardware rotated modes with PSR-SU enabled, use PSR-SU FFU for all rotated planes (including cursor planes). Cc: stable@vger.kernel.org Fixes: 30ebe41582d1 ("drm/amd/display: add FB_DAMAGE_CLIPS support") Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2952 Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Bin Li <binli@gnome.org> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-11drm/amd/pm: fix pp_*clk_od typoDmitrii Galantsev1-2/+2
Fix pp_dpm_sclk_od and pp_dpm_mclk_od typos. Those were defined as pp_*clk_od but used as pp_dpm_*clk_od instead. This change removes the _dpm part. Fixes: 8cfd6a05750c ("drm/amd/pm: Hide irrelevant pm device attributes") Signed-off-by: Dmitrii Galantsev <dmitrii.galantsev@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org