aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2021-12-13Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostHEADmasterLinus Torvalds6-8/+11
Pull virtio fixes from Michael Tsirkin: "Misc virtio and vdpa bugfixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpa: Consider device id larger than 31 virtio/vsock: fix the transport to work with VMADDR_CID_ANY virtio_ring: Fix querying of maximum DMA mapping size for virtio device virtio: always enter drivers/virtio/ vduse: check that offset is within bounds in get_config() vdpa: check that offsets are within bounds vduse: fix memory corruption in vduse_dev_ioctl()
2021-12-13PCI: mt7621: Convert driver into 'bool'Sergio Paracuellos1-2/+2
The driver is not ready yet to be compiled as a module since it depends on some symbols not exported on MIPS. We have the following current problems: Building mips:allmodconfig ... failed -------------- Error log: ERROR: modpost: missing MODULE_LICENSE() in drivers/pci/controller/pcie-mt7621.o ERROR: modpost: "mips_cm_unlock_other" [drivers/pci/controller/pcie-mt7621.ko] undefined! ERROR: modpost: "mips_cpc_base" [drivers/pci/controller/pcie-mt7621.ko] undefined! ERROR: modpost: "mips_cm_lock_other" [drivers/pci/controller/pcie-mt7621.ko] undefined! ERROR: modpost: "mips_cm_is64" [drivers/pci/controller/pcie-mt7621.ko] undefined! ERROR: modpost: "mips_gcr_base" [drivers/pci/controller/pcie-mt7621.ko] undefined! Temporarily move from 'tristate' to 'bool' until a better solution is ready. Also RALINK is redundant because SOC_MT7621 already depends on it. Hence, simplify condition. Fixes: 2bdd5238e756 ("PCI: mt7621: Add MediaTek MT7621 PCIe host controller driver"). Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Reviewed-and-Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-13fget: clarify and improve __fget_files() implementationLinus Torvalds1-16/+56
Commit 054aa8d439b9 ("fget: check that the fd still exists after getting a ref to it") fixed a race with getting a reference to a file just as it was being closed. It was a fairly minimal patch, and I didn't think re-checking the file pointer lookup would be a measurable overhead, since it was all right there and cached. But I was wrong, as pointed out by the kernel test robot. The 'poll2' case of the will-it-scale.per_thread_ops benchmark regressed quite noticeably. Admittedly it seems to be a very artificial test: doing "poll()" system calls on regular files in a very tight loop in multiple threads. That means that basically all the time is spent just looking up file descriptors without ever doing anything useful with them (not that doing 'poll()' on a regular file is useful to begin with). And as a result it shows the extra "re-check fd" cost as a sore thumb. Happily, the regression is fixable by just writing the code to loook up the fd to be better and clearer. There's still a cost to verify the file pointer, but now it's basically in the noise even for that benchmark that does nothing else - and the code is more understandable and has better comments too. [ Side note: this patch is also a classic case of one that looks very messy with the default greedy Myers diff - it's much more legible with either the patience of histogram diff algorithm ] Link: https://lore.kernel.org/lkml/20211210053743.GA36420@xsang-OptiPlex-9020/ Link: https://lore.kernel.org/lkml/20211213083154.GA20853@linux.intel.com/ Reported-by: kernel test robot <oliver.sang@intel.com> Tested-by: Carel Si <beibei.si@intel.com> Cc: Jann Horn <jannh@google.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-12Linux 5.16-rc5Linus Torvalds1-1/+1
2021-12-12Merge tag 'usb-5.16-rc5' of ↵Linus Torvalds8-33/+61
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes for 5.16-rc5. They include: - gadget driver fixes for reported issues - xhci fixes for reported problems. - config endpoint parsing fixes for where we got bitfields wrong Most of these have been in linux-next, the remaining few were not, but got lots of local testing in my systems and in some cloud testing infrastructures" * tag 'usb-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: core: config: using bit mask instead of individual bits usb: core: config: fix validation of wMaxPacketValue entries USB: gadget: zero allocate endpoint 0 buffers USB: gadget: detect too-big endpoint 0 requests xhci: avoid race between disable slot command and host runtime suspend xhci: Remove CONFIG_USB_DEFAULT_PERSIST to prevent xHCI from runtime suspending Revert "usb: dwc3: dwc3-qcom: Enable tx-fifo-resize property by default"
2021-12-12Merge tag 'char-misc-5.16-rc5' of ↵Linus Torvalds34-102/+136
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are a bunch of small char/misc and other driver subsystem fixes. Included in here are: - iio driver fixes for reported problems - phy driver fixes for a number of reported problems - mhi resume bugfix for broken hardware - nvmem driver fix - rtsx driver fix for irq issues - fastrpc packet parsing fix All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (33 commits) bus: mhi: core: Add support for forced PM resume iio: trigger: stm32-timer: fix MODULE_ALIAS misc: rtsx: Avoid mangling IRQ during runtime PM nvmem: eeprom: at25: fix FRAM byte_len misc: fastrpc: fix improper packet size calculation MAINTAINERS: add maintainer for Qualcomm FastRPC driver bus: mhi: pci_generic: Fix device recovery failed issue iio: adc: stm32: fix null pointer on defer_probe error phy: HiSilicon: Fix copy and paste bug in error handling dt-bindings: phy: zynqmp-psgtr: fix USB phy name phy: ti: omap-usb2: Fix the kernel-doc style phy: qualcomm: ipq806x-usb: Fix kernel-doc style iio: at91-sama5d2: Fix incorrect sign extension iio: adc: axp20x_adc: fix charging current reporting on AXP22x iio: gyro: adxrs290: fix data signedness phy: ti: tusb1210: Fix the kernel-doc warn phy: qualcomm: usb-hsic: Fix the kernel-doc warn phy: qualcomm: qmp: Add missing struct documentation phy: mvebu-cp110-utmi: Fix kernel-doc warns iio: ad7768-1: Call iio_trigger_notify_done() on error ...
2021-12-12Merge tag 'timers-urgent-2021-12-12' of ↵Linus Torvalds2-3/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "Two fixes for clock chip drivers: - A regression fix for the Designware APB timer. A recent change to the error checking code transformed the error condition wrongly so it turned into a fail if good condition. - Fix a clang build fail of the ARM architected timer driver" * tag 'timers-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/drivers/arm_arch_timer: Force inlining of erratum_set_next_event_generic() clocksource/drivers/dw_apb_timer_of: Fix probe failure
2021-12-12Merge tag 'irq-urgent-2021-12-12' of ↵Linus Torvalds7-17/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A set of interrupt chip driver fixes: - Fix the multi vector MSI allocation on Armada 370XP - Do interrupt acknowledgement correctly in the aspeed-scu driver - Make the IPR register offset correct in the NVIC driver - Make redistribution table flushing correct by issueing a SYNC command to ensure that the invalidation command has been executed - Plug a device tree node reference leak in the bcm7210-l2 driver - Trivial fixes in the MIPS GIC and the Apple AIC drivers" * tag 'irq-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/irq-bcm7120-l2: Add put_device() after of_find_device_by_node() irqchip/irq-gic-v3-its.c: Force synchronisation when issuing INVALL irqchip/apple-aic: Mark aic_init_smp() as __init irqchip: nvic: Fix offset for Interrupt Priority Offsets irqchip/mips-gic: Use bitfield helpers irqchip/aspeed-scu: Replace update_bits with write_bits. irqchip/armada-370-xp: Fix support for Multi-MSI interrupts irqchip/armada-370-xp: Fix return value of armada_370_xp_msi_alloc()
2021-12-12Merge tag 'sched-urgent-2021-12-12' of ↵Linus Torvalds1-0/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Thomas Gleixner: "A single fix for the x86 scheduler topology: Using cluster topology on hybrid CPUs, e.g. Alder Lake, biases the scheduler towards the ATOM cluster as that has more total capacity. Use selection based on CPU priority instead" * tag 'sched-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched,x86: Don't use cluster topology for x86 hybrid CPUs
2021-12-12Merge tag 'csky-for-linus-5.16-rc5' of git://github.com/c-sky/csky-linuxLinus Torvalds1-2/+2
Pull csky from Guo Ren: "Only one fix for csky: fix fpu config macro" * tag 'csky-for-linus-5.16-rc5' of git://github.com/c-sky/csky-linux: csky: fix typo of fpu config macro
2021-12-12usb: core: config: using bit mask instead of individual bitsPavel Hofman1-2/+2
Using standard USB_EP_MAXP_MULT_MASK instead of individual bits for extracting multiple-transactions bits from wMaxPacketSize value. Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com> Link: https://lore.kernel.org/r/20211210085219.16796-2-pavel.hofman@ivitera.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-12usb: core: config: fix validation of wMaxPacketValue entriesPavel Hofman1-1/+1
The checks performed by commit aed9d65ac327 ("USB: validate wMaxPacketValue entries in endpoint descriptors") require that initial value of the maxp variable contains both maximum packet size bits (10..0) and multiple-transactions bits (12..11). However, the existing code assings only the maximum packet size bits. This patch assigns all bits of wMaxPacketSize to the variable. Fixes: aed9d65ac327 ("USB: validate wMaxPacketValue entries in endpoint descriptors") Cc: stable <stable@vger.kernel.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com> Link: https://lore.kernel.org/r/20211210085219.16796-1-pavel.hofman@ivitera.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-12USB: gadget: zero allocate endpoint 0 buffersGreg Kroah-Hartman2-2/+2
Under some conditions, USB gadget devices can show allocated buffer contents to a host. Fix this up by zero-allocating them so that any extra data will all just be zeros. Reported-by: Szymon Heidrich <szymon.heidrich@gmail.com> Tested-by: Szymon Heidrich <szymon.heidrich@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-12USB: gadget: detect too-big endpoint 0 requestsGreg Kroah-Hartman3-1/+40
Sometimes USB hosts can ask for buffers that are too large from endpoint 0, which should not be allowed. If this happens for OUT requests, stall the endpoint, but for IN requests, trim the request size to the endpoint buffer size. Co-developed-by: Szymon Heidrich <szymon.heidrich@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-11Merge tag 'scsi-fixes' of ↵Linus Torvalds6-29/+23
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four fixes, all in drivers. Three are small and obvious, the qedi one is a bit larger but also pretty obvious" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Format log strings only if needed scsi: scsi_debug: Fix buffer size of REPORT ZONES command scsi: qedi: Fix cmd_cleanup_cmpl counter mismatch issue scsi: pm80xx: Do not call scsi_remove_host() in pm8001_alloc()
2021-12-11Merge tag 'xfs-5.16-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-3/+11
Pull xfs fix from Darrick Wong: "This fixes a race between a readonly remount process and other processes that hold a file IOLOCK on files that previously experienced copy on write, that could result in severe filesystem corruption if the filesystem is then remounted rw. I think this is fairly rare (since the only reliable reproducer I have that fits the second criteria is the experimental xfs_scrub program), but the race is clear, so we still need to fix this. Summary: - Fix a data corruption vector that can result from the ro remount process failing to clear all speculative preallocations from files and the rw remount process not noticing the incomplete cleanup" * tag 'xfs-5.16-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: remove all COW fork extents when remounting readonly
2021-12-11Merge branch 'for-5.16-fixes' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu Pull percpu fixes from Dennis Zhou: "This contains a fix for SMP && !MMU archs for percpu which has been tested by arm and sh. It seems in the past they have gotten away with it due to mapping of vm functions to km functions, but this fell apart a few releases ago and was just reported recently. The other is just a minor dependency clean up. I think queued up right now by Andrew is a fix in percpu that papers of what seems to be a bug in hotplug for a special situation with memoryless nodes. Michal Hocko is digging into it further" * 'for-5.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu: percpu_ref: Replace kernel.h with the necessary inclusions percpu: km: ensure it is used with NOMMU (either UP or SMP)
2021-12-11Merge tag 'perf-tools-fixes-for-v5.16-2021-12-11' of ↵Linus Torvalds5-32/+64
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Prevent out-of-bounds access to per sample registers. - Fix NULL vs IS_ERR_OR_NULL() checking on the python binding. - Intel PT fixes, half of those are one-liners: - Fix some PGE (packet generation enable/control flow packets) usage. - Fix sync state when a PSB (synchronization) packet is found. - Fix intel_pt_fup_event() assumptions about setting state type. - Fix state setting when receiving overflow (OVF) packet. - Fix next 'err' value, walking trace. - Fix missing 'instruction' events with 'q' option. - Fix error timestamp setting on the decoder error path. * tag 'perf-tools-fixes-for-v5.16-2021-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf python: Fix NULL vs IS_ERR_OR_NULL() checking perf intel-pt: Fix error timestamp setting on the decoder error path perf intel-pt: Fix missing 'instruction' events with 'q' option perf intel-pt: Fix next 'err' value, walking trace perf intel-pt: Fix state setting when receiving overflow (OVF) packet perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type perf intel-pt: Fix sync state when a PSB (synchronization) packet is found perf intel-pt: Fix some PGE (packet generation enable/control flow packets) usage perf tools: Prevent out-of-bounds access to registers
2021-12-11Merge tag 'block-5.16-2021-12-10' of git://git.kernel.dk/linux-blockLinus Torvalds8-12/+40
Pull block fixes from Jens Axboe: "A few block fixes that should go into this release: - NVMe pull request: - set ana_log_size to 0 after freeing ana_log_buf (Hou Tao) - show subsys nqn for duplicate cntlids (Keith Busch) - disable namespace access for unsupported metadata (Keith Busch) - report write pointer for a full zone as zone start + zone len (Niklas Cassel) - fix use after free when disconnecting a reconnecting ctrl (Ruozhu Li) - fix a list corruption in nvmet-tcp (Sagi Grimberg) - Fix for a regression on DIO single bio async IO (Pavel) - ioprio seteuid fix (Davidlohr) - mtd fix that subsequently got reverted as it was broken, will get re-done and submitted for the next round - Two MD fixes via Song (Markus, zhangyue)" * tag 'block-5.16-2021-12-10' of git://git.kernel.dk/linux-block: Revert "mtd_blkdevs: don't scan partitions for plain mtdblock" block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2) md: fix double free of mddev->private in autorun_array() md: fix update super 1.0 on rdev size change nvmet-tcp: fix possible list corruption for unexpected command failure block: fix single bio async DIO error handling nvme: fix use after free when disconnecting a reconnecting ctrl nvme-multipath: set ana_log_size to 0 after free ana_log_buf mtd_blkdevs: don't scan partitions for plain mtdblock nvme: report write pointer for a full zone as zone start + zone len nvme: disable namespace access for unsupported metadata nvme: show subsys nqn for duplicate cntlids
2021-12-11Merge tag 'io_uring-5.16-2021-12-10' of git://git.kernel.dk/linux-blockLinus Torvalds2-8/+27
Pull io_uring fixes from Jens Axboe: "A few fixes that are all bound for stable: - Two syzbot reports for io-wq that turned out to be separate fixes, but ultimately very closely related - io_uring task_work running on cancelations" * tag 'io_uring-5.16-2021-12-10' of git://git.kernel.dk/linux-block: io-wq: check for wq exit after adding new worker task_work io_uring: ensure task_work gets run as part of cancelations io-wq: remove spurious bit clear on task_work addition
2021-12-11Merge branch 'i2c/for-current' of ↵Linus Torvalds2-21/+13
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Two more I2C driver bugfixes" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: mpc: Use atomic read and fix break condition i2c: virtio: fix completion handling
2021-12-11Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds8-6/+29
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk driver fixes from Stephen Boyd: - Fix qcom mux logic to look at the proper parent table member. Luckily this clk type isn't very common. - Don't kill clks on qcom systems that use Trion PLLs that are enabled out of the bootloader. We will simply skip programming the PLL rate if it's already done. - Use the proper clk_ops for the qcom sm6125 ICE clks. - Use module_platform_driver() in i.MX as it can be a module. - Fix a UAF in the versatile clk driver on an error path. * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: versatile: clk-icst: use after free on error path clk: qcom: sm6125-gcc: Swap ops of ice and apps on sdcc1 clk: imx: use module_platform_driver clk: qcom: clk-alpha-pll: Don't reconfigure running Trion clk: qcom: regmap-mux: fix parent clock lookup
2021-12-11Merge tag 'devicetree-fixes-for-5.16-2' of ↵Linus Torvalds7-22/+43
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Revert schema checks on %.dtb targets. This was problematic for some external build tools. - A few DT binding example fixes - Add back dropped 'enet-phy-lane-no-swap' Ethernet PHY property - Drop erroneous if/then schema in nxp,imx7-mipi-csi2 - Add a quirk to fix some interrupt controllers use of 'interrupt-map' * tag 'devicetree-fixes-for-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: Revert "kbuild: Enable DT schema checks for %.dtb targets" dt-bindings: bq25980: Fixup the example dt-bindings: input: gpio-keys: Fix interrupts in example dt-bindings: net: Reintroduce PHY no lane swap binding dt-bindings: media: nxp,imx7-mipi-csi2: Drop bad if/then schema of/irq: Add a quirk for controllers with their own definition of interrupt-map dt-bindings: iio: adc: exynos-adc: Fix node name in example
2021-12-11Merge branch 'akpm' (patches from Andrew)Linus Torvalds23-207/+322
Merge misc fixes from Andrew Morton: "21 patches. Subsystems affected by this patch series: MAINTAINERS, mailmap, and mm (mlock, pagecache, damon, slub, memcg, hugetlb, and pagecache)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits) mm: bdi: initialize bdi_min_ratio when bdi is unregistered hugetlbfs: fix issue of preallocation of gigantic pages can't work mm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock() mm/slub: fix endianness bug for alloc/free_traces attributes selftests/damon: split test cases selftests/damon: test debugfs file reads/writes with huge count selftests/damon: test wrong DAMOS condition ranges input selftests/damon: test DAMON enabling with empty target_ids case selftests/damon: skip test if DAMON is running mm/damon/vaddr-test: remove unnecessary variables mm/damon/vaddr-test: split a test function having >1024 bytes frame size mm/damon/vaddr: remove an unnecessary warning message mm/damon/core: remove unnecessary error messages mm/damon/dbgfs: remove an unnecessary error message mm/damon/core: use better timer mechanisms selection threshold mm/damon/core: fix fake load reports due to uninterruptible sleeps timers: implement usleep_idle_range() filemap: remove PageHWPoison check from next_uptodate_page() mailmap: update email address for Guo Ren MAINTAINERS: update kdump maintainers ...
2021-12-11Merge tag 'timers-v5.16-rc4' of ↵Thomas Gleixner2-3/+8
https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent Pull timer fixes from Daniel Lezcano: - Fix build error with clang and some kernel configuration on the arm64 architected timer by inlining the erratum_set_next_event_generic() function (Marc Zyngier) - Fix probe error on the dw_apb_timer_of driver by fixing the incorrect condition previously introduced (Alexey Sheplyakov) Link: https://lore.kernel.org/r/429b796d-9395-4ca8-81f3-30911f80a9a9@linaro.org
2021-12-11perf python: Fix NULL vs IS_ERR_OR_NULL() checkingMiaoqian Lin1-1/+1
The function trace_event__tp_format_id may return ERR_PTR(-ENOMEM). Use IS_ERR_OR_NULL to check tp_format. Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Link: http://lore.kernel.org/lkml/20211211053856.19827-1-linmq006@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11perf intel-pt: Fix error timestamp setting on the decoder error pathAdrian Hunter1-0/+1
An error timestamp shows the last known timestamp for the queue, but this is not updated on the error path. Fix by setting it. Fixes: f4aa081949e7b6 ("perf tools: Add Intel PT decoder") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v5.15+ Link: https://lore.kernel.org/r/20211210162303.2288710-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11perf intel-pt: Fix missing 'instruction' events with 'q' optionAdrian Hunter1-3/+8
FUP packets contain IP information, which makes them also an 'instruction' event in 'hop' mode i.e. the itrace 'q' option. That wasn't happening, so restructure the logic so that FUP events are added along with appropriate 'instruction' and 'branch' events. Fixes: 7c1b16ba0e26e6 ("perf intel-pt: Add support for decoding FUP/TIP only") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v5.15+ Link: https://lore.kernel.org/r/20211210162303.2288710-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11perf intel-pt: Fix next 'err' value, walking traceAdrian Hunter1-0/+1
Code after label 'next:' in intel_pt_walk_trace() assumes 'err' is zero, but it may not be, if arrived at via a 'goto'. Ensure it is zero. Fixes: 7c1b16ba0e26e6 ("perf intel-pt: Add support for decoding FUP/TIP only") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v5.15+ Link: https://lore.kernel.org/r/20211210162303.2288710-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11perf intel-pt: Fix state setting when receiving overflow (OVF) packetAdrian Hunter1-4/+28
An overflow (OVF packet) is treated as an error because it represents a loss of trace data, but there is no loss of synchronization, so the packet state should be INTEL_PT_STATE_IN_SYNC not INTEL_PT_STATE_ERR_RESYNC. To support that, some additional variables must be reset, and the FUP packet that may follow OVF is treated as an FUP event. Fixes: f4aa081949e7b6 ("perf tools: Add Intel PT decoder") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v5.15+ Link: https://lore.kernel.org/r/20211210162303.2288710-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state typeAdrian Hunter1-19/+13
intel_pt_fup_event() assumes it can overwrite the state type if there has been an FUP event, but this is an unnecessary and unexpected constraint on callers. Fix by touching only the state type flags that are affected by an FUP event. Fixes: a472e65fc490a ("perf intel-pt: Add decoder support for ptwrite and power event packets") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v5.15+ Link: https://lore.kernel.org/r/20211210162303.2288710-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11perf intel-pt: Fix sync state when a PSB (synchronization) packet is foundAdrian Hunter1-1/+1
When syncing, it may be that branch packet generation is not enabled at that point, in which case there will not immediately be a control-flow packet, so some packets before a control flow packet turns up, get ignored. However, the decoder is in sync as soon as a PSB is found, so the state should be set accordingly. Fixes: f4aa081949e7b6 ("perf tools: Add Intel PT decoder") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v5.15+ Link: https://lore.kernel.org/r/20211210162303.2288710-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11perf intel-pt: Fix some PGE (packet generation enable/control flow packets) ↵Adrian Hunter1-3/+4
usage Packet generation enable (PGE) refers to whether control flow (COFI) packets are being produced. PGE may be false even when branch-tracing is enabled, due to being out-of-context, or outside a filter address range. Fix some missing PGE usage. Fixes: 7c1b16ba0e26e6 ("perf intel-pt: Add support for decoding FUP/TIP only") Fixes: 839598176b0554 ("perf intel-pt: Allow decoding with branch tracing disabled") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v5.15+ Link: https://lore.kernel.org/r/20211210162303.2288710-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11perf tools: Prevent out-of-bounds access to registersGerman Gomez2-1/+7
The size of the cache of register values is arch-dependant (PERF_REGS_MAX). This has the potential of causing an out-of-bounds access in the function "perf_reg_value" if the local architecture contains less registers than the one the perf.data file was recorded on. Since the maximum number of registers is bound by the bitmask "u64 cache_mask", and the size of the cache when running under x86 systems is 64 already, fix the size to 64 and add a range-check to the function "perf_reg_value" to prevent out-of-bounds access. Reported-by: Alexandre Truong <alexandre.truong@arm.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: German Gomez <german.gomez@arm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20211201123334.679131-2-german.gomez@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11Merge tag 'irqchip-fixes-5.16-2' of ↵Thomas Gleixner7-17/+14
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Fix Armada-370-XP Multi-MSi allocation to be aligned on the allocation size, as required by the PCI spec - Fix aspeed-scu interrupt acknowledgement by directly writing to the register instead of a read-modify-write sequence - Use standard bitfirl helpers in the MIPS GIC driver instead of custom constructs - Fix the NVIC driver IPR register offset - Correctly drop the reference of the device node in the irq-bcm7120-l2 driver - Fix the GICv3 ITS INVALL command by issueing a following SYNC command - Add a missing __init attribute to the init function of the Apple AIC driver Link: https://lore.kernel.org/r/20211210133516.664497-1-maz@kernel.org
2021-12-10Merge tag 'for-5.16-rc4-tag' of ↵Linus Torvalds7-10/+35
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few more regression fixes and stable patches, mostly one-liners. Regression fixes: - fix pointer/ERR_PTR mismatch returned from memdup_user - reset dedicated zoned mode relocation block group to avoid using it and filling it without any recourse Fixes: - handle a case to FITRIM range (also to make fstests/generic/260 work) - fix warning when extent buffer state and pages get out of sync after an IO error - fix transaction abort when syncing due to missing mapping error set on metadata inode after inlining a compressed file - fix transaction abort due to tree-log and zoned mode interacting in an unexpected way - fix memory leak of additional extent data when qgroup reservation fails - do proper handling of slot search call when deleting root refs" * tag 'for-5.16-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: replace the BUG_ON in btrfs_del_root_ref with proper error handling btrfs: zoned: clear data relocation bg on zone finish btrfs: free exchange changeset on failures btrfs: fix re-dirty process of tree-log nodes btrfs: call mapping_set_error() on btree inode with a write error btrfs: clear extent buffer uptodate when we fail to write it btrfs: fail if fstrim_range->start == U64_MAX btrfs: fix error pointer dereference in btrfs_ioctl_rm_dev_v2()
2021-12-10Merge tag '5.16-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2-31/+36
Pull cifs fixes from Steve French: "Two cifs/smb3 fixes - one for stable, the other fixes a recently reported NTLMSSP auth problem" * tag '5.16-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix ntlmssp auth when there is no key exchange cifs: Fix crash on unload of cifs_arc4.ko
2021-12-10Merge tag 'nfsd-5.16-2' of git://linux-nfs.org/~bfields/linuxLinus Torvalds3-9/+15
Pull nfsd fixes from Bruce Fields: "Fix a race on startup and another in the delegation code. The latter has been around for years, but I suspect recent changes may have widened the race window a little, so I'd like to go ahead and get it in" * tag 'nfsd-5.16-2' of git://linux-nfs.org/~bfields/linux: nfsd: fix use-after-free due to delegation race nfsd: Fix nsfd startup race (again)
2021-12-10mm: bdi: initialize bdi_min_ratio when bdi is unregisteredManjong Lee1-0/+7
Initialize min_ratio if it is set during bdi unregistration. This can prevent problems that may occur a when bdi is removed without resetting min_ratio. For example. 1) insert external sdcard 2) set external sdcard's min_ratio 70 3) remove external sdcard without setting min_ratio 0 4) insert external sdcard 5) set external sdcard's min_ratio 70 << error occur(can't set) Because when an sdcard is removed, the present bdi_min_ratio value will remain. Currently, the only way to reset bdi_min_ratio is to reboot. [akpm@linux-foundation.org: tweak comment and coding style] Link: https://lkml.kernel.org/r/20211021161942.5983-1-mj0123.lee@samsung.com Signed-off-by: Manjong Lee <mj0123.lee@samsung.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Changheun Lee <nanich.lee@samsung.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: <seunghwan.hyun@samsung.com> Cc: <sookwan7.kim@samsung.com> Cc: <yt0928.kim@samsung.com> Cc: <junho89.kim@samsung.com> Cc: <jisoo2146.oh@samsung.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10hugetlbfs: fix issue of preallocation of gigantic pages can't workZhenguo Yao1-1/+1
Preallocation of gigantic pages can't work bacause of commit b5389086ad7b ("hugetlbfs: extend the definition of hugepages parameter to support node allocation"). When nid is NUMA_NO_NODE(-1), alloc_bootmem_huge_page will always return without doing allocation. Fix this by adding more check. Link: https://lkml.kernel.org/r/20211129133803.15653-1-yaozhenguo1@gmail.com Fixes: b5389086ad7b ("hugetlbfs: extend the definition of hugepages parameter to support node allocation") Signed-off-by: Zhenguo Yao <yaozhenguo1@gmail.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Tested-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock()Waiman Long1-53/+53
All the calls to mod_objcg_mlstate(), get_obj_stock() and put_obj_stock() are done by functions defined within the same "#ifdef CONFIG_MEMCG_KMEM" compilation block. When CONFIG_MEMCG_KMEM isn't defined, the following compilation warnings will be issued [1] and [2]. mm/memcontrol.c:785:20: warning: unused function 'mod_objcg_mlstate' mm/memcontrol.c:2113:33: warning: unused function 'get_obj_stock' Fix these warning by moving those functions to under the same CONFIG_MEMCG_KMEM compilation block. There is no functional change. [1] https://lore.kernel.org/lkml/202111272014.WOYNLUV6-lkp@intel.com/ [2] https://lore.kernel.org/lkml/202111280551.LXsWYt1T-lkp@intel.com/ Link: https://lkml.kernel.org/r/20211129161140.306488-1-longman@redhat.com Fixes: 559271146efc ("mm/memcg: optimize user context object stock access") Fixes: 68ac5b3c8db2 ("mm/memcg: cache vmstat data in percpu memcg_stock_pcp") Signed-off-by: Waiman Long <longman@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Roman Gushchin <guro@fb.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/slub: fix endianness bug for alloc/free_traces attributesGerald Schaefer1-6/+9
On big-endian s390, the alloc/free_traces attributes produce endless output, because of always 0 idx in slab_debugfs_show(). idx is de-referenced from *v, which points to a loff_t value, with unsigned int idx = *(unsigned int *)v; This will only give the upper 32 bits on big-endian, which remain 0. Instead of only fixing this de-reference, during discussion it seemed more appropriate to change the seq_ops so that they use an explicit iterator in private loc_track struct. This patch adds idx to loc_track, which will also fix the endianness bug. Link: https://lore.kernel.org/r/20211117193932.4049412-1-gerald.schaefer@linux.ibm.com Link: https://lkml.kernel.org/r/20211126171848.17534-1-gerald.schaefer@linux.ibm.com Fixes: 64dd68497be7 ("mm: slub: move sysfs slab alloc/free interfaces to debugfs") Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reported-by: Steffen Maier <maier@linux.ibm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Faiyaz Mohammed <faiyazm@codeaurora.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10selftests/damon: split test casesSeongJae Park7-112/+129
Currently, the single test program, debugfs.sh, contains all test cases for DAMON. When one of the cases fails, finding which case is failed from the test log is not so easy, and all remaining tests will be skipped. To improve the situation, this commit splits the single program into small test programs having their own names. Link: https://lkml.kernel.org/r/20211201150440.1088-12-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10selftests/damon: test debugfs file reads/writes with huge countSeongJae Park4-0/+61
DAMON debugfs interface users were able to trigger warning by writing some files with arbitrarily large 'count' parameter. The issue is fixed with commit db7a347b26fe ("mm/damon/dbgfs: use '__GFP_NOWARN' for user-specified size buffer allocation"). This commit adds a test case for the issue in DAMON selftests to avoid future regressions. Link: https://lkml.kernel.org/r/20211201150440.1088-11-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10selftests/damon: test wrong DAMOS condition ranges inputSeongJae Park1-0/+2
A patch titled "mm/damon/schemes: add the validity judgment of thresholds"[1] makes DAMON debugfs interface to validate DAMON scheme inputs. This commit adds a test case for the validation logic in DAMON selftests. [1] https://lore.kernel.org/linux-mm/d78360e52158d786fcbf20bc62c96785742e76d3.1637239568.git.xhao@linux.alibaba.com/ Link: https://lkml.kernel.org/r/20211201150440.1088-10-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10selftests/damon: test DAMON enabling with empty target_ids caseSeongJae Park1-0/+9
DAMON debugfs didn't check empty targets when starting monitoring, and the issue is fixed with commit b5ca3e83ddb0 ("mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on"). To avoid future regression, this commit adds a test case for that in DAMON selftests. Link: https://lkml.kernel.org/r/20211201150440.1088-9-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10selftests/damon: skip test if DAMON is runningSeongJae Park1-0/+9
Testing the DAMON debugfs files while DAMON is running makes no sense, as any write to the debugfs files will fail. This commit makes the test be skipped in this case. Link: https://lkml.kernel.org/r/20211201150440.1088-8-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/damon/vaddr-test: remove unnecessary variablesSeongJae Park1-8/+0
A couple of test functions in DAMON virtual address space monitoring primitives implementation has unnecessary damon_ctx variables. This commit removes those. Link: https://lkml.kernel.org/r/20211201150440.1088-7-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/damon/vaddr-test: split a test function having >1024 bytes frame sizeSeongJae Park1-37/+40
On some configuration[1], 'damon_test_split_evenly()' kunit test function has >1024 bytes frame size, so below build warning is triggered: CC mm/damon/vaddr.o In file included from mm/damon/vaddr.c:672: mm/damon/vaddr-test.h: In function 'damon_test_split_evenly': mm/damon/vaddr-test.h:309:1: warning: the frame size of 1064 bytes is larger than 1024 bytes [-Wframe-larger-than=] 309 | } | ^ This commit fixes the warning by separating the common logic in the function. [1] https://lore.kernel.org/linux-mm/202111182146.OV3C4uGr-lkp@intel.com/ Link: https://lkml.kernel.org/r/20211201150440.1088-6-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/damon/vaddr: remove an unnecessary warning messageSeongJae Park1-1/+0
The DAMON virtual address space monitoring primitive prints a warning message for wrong DAMOS action. However, it is not essential as the code returns appropriate failure in the case. This commit removes the message to make the log clean. Link: https://lkml.kernel.org/r/20211201150440.1088-5-sj@kernel.org Fixes: 6dea8add4d28 ("mm/damon/vaddr: support DAMON-based Operation Schemes") Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/damon/core: remove unnecessary error messagesSeongJae Park1-9/+2
DAMON core prints error messages when damon_target object creation is failed or wrong monitoring attributes are given. Because appropriate error code is returned for each case, the messages are not essential. Also, because the code path can be triggered with user-specified input, this could result in kernel log mistakenly being messy. To avoid the case, this commit removes the messages. Link: https://lkml.kernel.org/r/20211201150440.1088-4-sj@kernel.org Fixes: 4bc05954d007 ("mm/damon: implement a debugfs-based user space interface") Fixes: b9a6ac4e4ede ("mm/damon: adaptively adjust regions") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: kernel test robot <lkp@intel.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/damon/dbgfs: remove an unnecessary error messageSeongJae Park1-3/+1
When wrong scheme action is requested via the debugfs interface, DAMON prints an error message. Because the function returns error code, this is not really needed. Because the code path is triggered by the user specified input, this can result in kernel log mistakenly being messy. To avoid the case, this commit removes the message. Link: https://lkml.kernel.org/r/20211201150440.1088-3-sj@kernel.org Fixes: af122dd8f3c0 ("mm/damon/dbgfs: support DAMON-based Operation Schemes") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/damon/core: use better timer mechanisms selection thresholdSeongJae Park1-1/+2
Patch series "mm/damon: Trivial fixups and improvements". This patchset contains trivial fixups and improvements for DAMON and its kunit/kselftest tests. This patch (of 11): DAMON is using hrtimer if requested sleep time is <=100ms, while the suggested threshold[1] is <=20ms. This commit applies the threshold. [1] Documentation/timers/timers-howto.rst Link: https://lkml.kernel.org/r/20211201150440.1088-2-sj@kernel.org Fixes: ee801b7dd7822 ("mm/damon/schemes: activate schemes based on a watermarks mechanism") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/damon/core: fix fake load reports due to uninterruptible sleepsSeongJae Park1-3/+3
Because DAMON sleeps in uninterruptible mode, /proc/loadavg reports fake load while DAMON is turned on, though it is doing nothing. This can confuse users[1]. To avoid the case, this commit makes DAMON sleeps in idle mode. [1] https://lore.kernel.org/all/11868371.O9o76ZdvQC@natalenko.name/ Link: https://lkml.kernel.org/r/20211126145015.15862-3-sj@kernel.org Fixes: 2224d8485492 ("mm: introduce Data Access MONitor (DAMON)") Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: SeongJae Park <sj@kernel.org> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10timers: implement usleep_idle_range()SeongJae Park2-8/+22
Patch series "mm/damon: Fix fake /proc/loadavg reports", v3. This patchset fixes DAMON's fake load report issue. The first patch makes yet another variant of usleep_range() for this fix, and the second patch fixes the issue of DAMON by making it using the newly introduced function. This patch (of 2): Some kernel threads such as DAMON could need to repeatedly sleep in micro seconds level. Because usleep_range() sleeps in uninterruptible state, however, such threads would make /proc/loadavg reports fake load. To help such cases, this commit implements a variant of usleep_range() called usleep_idle_range(). It is same to usleep_range() but sets the state of the current task as TASK_IDLE while sleeping. Link: https://lkml.kernel.org/r/20211126145015.15862-1-sj@kernel.org Link: https://lkml.kernel.org/r/20211126145015.15862-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Cc: John Stultz <john.stultz@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10filemap: remove PageHWPoison check from next_uptodate_page()Matthew Wilcox (Oracle)1-2/+0
Pages are individually marked as suffering from hardware poisoning. Checking that the head page is not hardware poisoned doesn't make sense; we might be after a subpage. We check each page individually before we use it, so this was an optimisation gone wrong. It will cause us to fall back to the slow path when there was no need to do that Link: https://lkml.kernel.org/r/20211120174429.2596303-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Yang Shi <shy828301@gmail.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mailmap: update email address for Guo RenGuo Ren1-0/+2
The ren_guo@c-sky.com would be deprecated and use guoren@kernel.org as the main email address. Link: https://lkml.kernel.org/r/20211123022741.545541-1-guoren@kernel.org Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10MAINTAINERS: update kdump maintainersDave Young1-1/+1
Remove myself from kdump maintainers as I have no enough time to maintain it now. But I can review patches on demand though. Link: https://lkml.kernel.org/r/YZyKilzKFsWJYdgn@dhcp-128-65.nay.redhat.com Signed-off-by: Dave Young <dyoung@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10Increase default MLOCK_LIMIT to 8 MiBDrew DeVault1-3/+10
This limit has not been updated since 2008, when it was increased to 64 KiB at the request of GnuPG. Until recently, the main use-cases for this feature were (1) preventing sensitive memory from being swapped, as in GnuPG's use-case; and (2) real-time use-cases. In the first case, little memory is called for, and in the second case, the user is generally in a position to increase it if they need more. The introduction of IOURING_REGISTER_BUFFERS adds a third use-case: preparing fixed buffers for high-performance I/O. This use-case will take as much of this memory as it can get, but is still limited to 64 KiB by default, which is very little. This increases the limit to 8 MB, which was chosen fairly arbitrarily as a more generous, but still conservative, default value. It is also possible to raise this limit in userspace. This is easily done, for example, in the use-case of a network daemon: systemd, for instance, provides for this via LimitMEMLOCK in the service file; OpenRC via the rc_ulimit variables. However, there is no established userspace facility for configuring this outside of daemons: end-user applications do not presently have access to a convenient means of raising their limits. The buck, as it were, stops with the kernel. It's much easier to address it here than it is to bring it to hundreds of distributions, and it can only realistically be relied upon to be high-enough by end-user software if it is more-or-less ubiquitous. Most distros don't change this particular rlimit from the kernel-supplied default value, so a change here will easily provide that ubiquity. Link: https://lkml.kernel.org/r/20211028080813.15966-1-sir@cmpwn.com Signed-off-by: Drew DeVault <sir@cmpwn.com> Acked-by: Jens Axboe <axboe@kernel.dk> Acked-by: Cyril Hrubis <chrubis@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Pavel Begunkov <asml.silence@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Andrew Dona-Couch <andrew@donacou.ch> Cc: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10Merge tag 'thermal-5.16-rc5' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fix from Rafael Wysocki: "Fix the definition of one of the Tiger Lake MMIO registers in the int340x thermal driver (Sumeet Pawnikar)" * tag 'thermal-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: int340x: Fix VCoRefLow MMIO bit offset for TGL
2021-12-10Merge tag 'acpi-5.16-rc5' of ↵Linus Torvalds2-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Create the output directory for the ACPI tools during build if it has not been present before and prevent the compilation from failing in that case (Chen Yu)" * tag 'acpi-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: tools: Fix compilation when output directory is not present
2021-12-10Merge tag 'pm-5.16-rc5' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Fix a kernedoc comment that doesn't match the behavior of the function documented by it" * tag 'pm-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: runtime: Fix pm_runtime_active() kerneldoc comment
2021-12-10Merge tag 'hwmon-for-v5.16-rc5' of ↵Linus Torvalds5-10/+7
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - In the pwm-fan driver, ensure that the internal pwm state matches the state assumed by the pwm code. - Avoid EREMOTEIO errors in sht4 driver - In the nct6775 driver, make it explicit that the register value passed to nct6775_asuswmi_read() is an 8-bit value - Avoid WARNing in dell-smm driver removal after failing to create /proc/i8k - Stop using a plain integer as NULL pointer in corsair-psu driver * tag 'hwmon-for-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (pwm-fan) Ensure the fan going on in .probe() hwmon: (sht4x) Fix EREMOTEIO errors hwmon: (nct6775) mask out bank number in nct6775_wmi_read_value() hwmon: (dell-smm) Fix warning on /proc/i8k creation error hwmon: (corsair-psu) fix plain integer used as NULL pointer
2021-12-10Merge tag 'trace-v5.16-rc4' of ↵Linus Torvalds5-6/+242
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Tracing, ftrace and tracefs fixes: - Have tracefs honor the gid mount option - Have new files in tracefs inherit the parent ownership - Have direct_ops unregister when it has no more functions - Properly clean up the ops when unregistering multi direct ops - Add a sample module to test the multiple direct ops - Fix memory leak in error path of __create_synth_event()" * tag 'trace-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix possible memory leak in __create_synth_event() error path ftrace/samples: Add module to test multi direct modify interface ftrace: Add cleanup to unregister_ftrace_direct_multi ftrace: Use direct_ops hash in unregister_ftrace_direct tracefs: Set all files to the same group ownership as the mount option tracefs: Have new files inherit the ownership of their parent
2021-12-10Merge tag 'aio-poll-for-linus' of ↵Linus Torvalds6-58/+196
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull aio poll fixes from Eric Biggers: "Fix three bugs in aio poll, and one issue with POLLFREE more broadly: - aio poll didn't handle POLLFREE, causing a use-after-free. - aio poll could block while the file is ready. - aio poll called eventfd_signal() when it isn't allowed. - POLLFREE didn't handle multiple exclusive waiters correctly. This has been tested with the libaio test suite, as well as with test programs I wrote that reproduce the first two bugs. I am sending this pull request myself as no one seems to be maintaining this code" * tag 'aio-poll-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: aio: Fix incorrect usage of eventfd_signal_allowed() aio: fix use-after-free due to missing POLLFREE handling aio: keep poll requests on waitqueue until completed signalfd: use wake_up_pollfree() binder: use wake_up_pollfree() wait: add wake_up_pollfree()
2021-12-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds11-17/+223
Pull kvm fixes from Paolo Bonzini: "More x86 fixes: - Logic bugs in CR0 writes and Hyper-V hypercalls - Don't use Enlightened MSR Bitmap for L3 - Remove user-triggerable WARN Plus a few selftest fixes and a regression test for the user-triggerable WARN" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O KVM: x86: Don't WARN if userspace mucks with RCX during string I/O exit KVM: X86: Raise #GP when clearing CR0_PG in 64 bit mode selftests: KVM: avoid failures due to reserved HyperTransport region KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI req KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush hypercall KVM: x86: selftests: svm_int_ctl_test: fix intercept calculation KVM: nVMX: Don't use Enlightened MSR Bitmap for L3
2021-12-10i2c: mpc: Use atomic read and fix break conditionChris Packham1-1/+1
Maxime points out that the polling code in mpc_i2c_isr should use the _atomic API because it is called in an irq context and that the behaviour of the MCF bit is that it is 1 when the byte transfer is complete. All of this means the original code was effectively a udelay(100). Fix this by using readb_poll_timeout_atomic() and removing the negation of the break condition. Fixes: 4a8ac5e45cda ("i2c: mpc: Poll for MCF") Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Tested-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-12-10io-wq: check for wq exit after adding new worker task_workJens Axboe1-6/+25
We check IO_WQ_BIT_EXIT before attempting to create a new worker, and wq exit cancels pending work if we have any. But it's possible to have a race between the two, where creation checks exit finding it not set, but we're in the process of exiting. The exit side will cancel pending creation task_work, but there's a gap where we add task_work after we've canceled existing creations at exit time. Fix this by checking the EXIT bit post adding the creation task_work. If it's set, run the same cancelation that exit does. Reported-and-tested-by: syzbot+b60c982cb0efc5e05a47@syzkaller.appspotmail.com Reviewed-by: Hao Xu <haoxu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-10io_uring: ensure task_work gets run as part of cancelationsJens Axboe1-2/+4
If we successfully cancel a work item but that work item needs to be processed through task_work, then we can be sleeping uninterruptibly in io_uring_cancel_generic() and never process it. Hence we don't make forward progress and we end up with an uninterruptible sleep warning. While in there, correct a comment that should be IFF, not IIF. Reported-and-tested-by: syzbot+21e6887c0be14181206d@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-10Merge tag 'pci-v5.16-fixes-2' of ↵Linus Torvalds3-14/+16
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Revert emulation of Marvell Armada A3720 expansion ROM because it doesn't work as expected (Marek Behún) - Assert PERST# in Apple M1 driver to fix initialization when booting from bootloaders using PCIe, such as U-Boot (Marc Zyngier) - Describe PERST# as active low in Apple T8103 DT and update driver to match (Marc Zyngier) * tag 'pci-v5.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: apple: Fix PERST# polarity arm64: dts: apple: t8103: Mark PCIe PERST# polarity active low in DT PCI: apple: Follow the PCIe specifications when resetting the port Revert "PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge"
2021-12-10Merge tag 'mmc-v5.16-rc3' of ↵Linus Torvalds2-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC host fixes from Ulf Hansson: - mtk-sd: Fix memory leak during tuning - renesas_sdhi: Initialize variable properly when tuning * tag 'mmc-v5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: mediatek: free the ext_csd when mmc_get_ext_csd success mmc: renesas_sdhi: initialize variable properly when tuning
2021-12-10Merge tag 'libata-5.16-rc5' of ↵Linus Torvalds2-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull libata fixes from Damien Le Moal: - Fix a sparse warning in the ahci_ceva driver (me) - Disable the ASMedia 1092 non-functional device (Hannes) * tag 'libata-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: libata: add horkage for ASMedia 1092 ata: ahci_ceva: Fix id array access in ceva_ahci_read_id()
2021-12-10Merge tag 'sound-5.16-rc5' of ↵Linus Torvalds18-122/+274
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Another collection of small fixes. It's still not quite calm yet, but nothing looks scary. ALSA core got a few fixes for covering the issues detected by fuzzer and the 32bit compat problem of control API, while the rest are all device-specific small fixes, including the continued fixes for Tegra" * tag 'sound-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits) ALSA: hda/realtek - Add headset Mic support for Lenovo ALC897 platform ALSA: usb-audio: Reorder snd_djm_devices[] entries ALSA: hda/realtek: Fix quirk for TongFang PHxTxX1 ALSA: ctl: Fix copy of updated id with element read/write ALSA: pcm: oss: Handle missing errors in snd_pcm_oss_change_params*() ALSA: pcm: oss: Limit the period size to 16MB ALSA: pcm: oss: Fix negative period/buffer sizes ASoC: codecs: wsa881x: fix return values from kcontrol put ASoC: codecs: wcd934x: return correct value from mixer put ASoC: codecs: wcd934x: handle channel mappping list correctly ASoC: qdsp6: q6routing: Fix return value from msm_routing_put_audio_mixer ASoC: SOF: Intel: Retry codec probing if it fails ASoC: amd: fix uninitialized variable in snd_acp6x_probe() ASoC: rockchip: i2s_tdm: Dup static DAI template ASoC: rt5682s: Fix crash due to out of scope stack vars ASoC: rt5682: Fix crash due to out of scope stack vars ASoC: tegra: Use normal system sleep for ADX ASoC: tegra: Use normal system sleep for AMX ASoC: tegra: Use normal system sleep for Mixer ASoC: tegra: Use normal system sleep for MVC ...
2021-12-10Merge tag 'drm-fixes-2021-12-10' of git://anongit.freedesktop.org/drm/drmLinus Torvalds6-12/+30
Pull drm fixes from Dave Airlie: "Regular fixes, pretty small overall, couple of core fixes, two i915 and two amdgpu, hopefully it stays this quiet. ttm: - fix ttm_bo_swapout syncobj: - fix fence find bug with signalled fences i915: - fix error pointer deref in gem execbuffer - fix for GT init with GuC/HuC on ICL amdgpu: - DPIA fix - eDP fix" * tag 'drm-fixes-2021-12-10' of git://anongit.freedesktop.org/drm/drm: drm/i915/gen11: Moving WAs to icl_gt_workarounds_init() drm/amd/display: prevent reading unitialized links drm/amd/display: Fix DPIA outbox timeout after S3/S4/reset drm/i915: Fix error pointer dereference in i915_gem_do_execbuffer() drm/syncobj: Deal with signalled fences in drm_syncobj_find_fence. drm/ttm: fix ttm_bo_swapout
2021-12-10Revert "mtd_blkdevs: don't scan partitions for plain mtdblock"Jens Axboe1-4/+2
This reverts commit 776b54e97a7d993ba23696e032426d5dea5bbe70. Looks like a last minute edit snuck into this patch, and as a result, it doesn't even compile. Revert the change for now. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-10block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2)Davidlohr Bueso1-0/+3
do_each_pid_thread(PIDTYPE_PGID) can race with a concurrent change_pid(PIDTYPE_PGID) that can move the task from one hlist to another while iterating. Serialize ioprio_get to take the tasklist_lock in this case, just like it's set counterpart. Fixes: d69b78ba1de (ioprio: grab rcu_read_lock in sys_ioprio_{set,get}()) Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Link: https://lore.kernel.org/r/20211210182058.43417-1-dave@stgolabs.net Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-10Merge branch 'md-fixes' of ↵Jens Axboe1-1/+3
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-5.16 Pull MD fixes from Song. * 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md: fix double free of mddev->private in autorun_array() md: fix update super 1.0 on rdev size change
2021-12-10md: fix double free of mddev->private in autorun_array()zhangyue1-1/+2
In driver/md/md.c, if the function autorun_array() is called, the problem of double free may occur. In function autorun_array(), when the function do_md_run() returns an error, the function do_md_stop() will be called. The function do_md_run() called function md_run(), but in function md_run(), the pointer mddev->private may be freed. The function do_md_stop() called the function __md_stop(), but in function __md_stop(), the pointer mddev->private also will be freed without judging null. At this time, the pointer mddev->private will be double free, so it needs to be judged null or not. Signed-off-by: zhangyue <zhangyue1@kylinos.cn> Signed-off-by: Song Liu <songliubraving@fb.com>
2021-12-10md: fix update super 1.0 on rdev size changeMarkus Hochholdinger1-0/+1
The superblock of version 1.0 doesn't get moved to the new position on a device size change. This leads to a rdev without a superblock on a known position, the raid can't be re-assembled. The line was removed by mistake and is re-added by this patch. Fixes: d9c0fa509eaf ("md: fix max sectors calculation for super 1.0") Cc: stable@vger.kernel.org Signed-off-by: Markus Hochholdinger <markus@hochholdinger.net> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2021-12-10nfsd: fix use-after-free due to delegation raceJ. Bruce Fields1-2/+7
A delegation break could arrive as soon as we've called vfs_setlease. A delegation break runs a callback which immediately (in nfsd4_cb_recall_prepare) adds the delegation to del_recall_lru. If we then exit nfs4_set_delegation without hashing the delegation, it will be freed as soon as the callback is done with it, without ever being removed from del_recall_lru. Symptoms show up later as use-after-free or list corruption warnings, usually in the laundromat thread. I suspect aba2072f4523 "nfsd: grant read delegations to clients holding writes" made this bug easier to hit, but I looked as far back as v3.0 and it looks to me it already had the same problem. So I'm not sure where the bug was introduced; it may have been there from the beginning. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-12-10nfsd: Fix nsfd startup race (again)Alexander Sverdlin2-7/+8
Commit bd5ae9288d64 ("nfsd: register pernet ops last, unregister first") has re-opened rpc_pipefs_event() race against nfsd_net_id registration (register_pernet_subsys()) which has been fixed by commit bb7ffbf29e76 ("nfsd: fix nsfd startup race triggering BUG_ON"). Restore the order of register_pernet_subsys() vs register_cld_notifier(). Add WARN_ON() to prevent a future regression. Crash info: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000012 CPU: 8 PID: 345 Comm: mount Not tainted 5.4.144-... #1 pc : rpc_pipefs_event+0x54/0x120 [nfsd] lr : rpc_pipefs_event+0x48/0x120 [nfsd] Call trace: rpc_pipefs_event+0x54/0x120 [nfsd] blocking_notifier_call_chain rpc_fill_super get_tree_keyed rpc_fs_get_tree vfs_get_tree do_mount ksys_mount __arm64_sys_mount el0_svc_handler el0_svc Fixes: bd5ae9288d64 ("nfsd: register pernet ops last, unregister first") Cc: stable@vger.kernel.org Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-12-10clocksource/drivers/arm_arch_timer: Force inlining of ↵Marc Zyngier1-2/+7
erratum_set_next_event_generic() With some specific kernel configuration and Clang, the kernel fails to like with something like: ld.lld: error: undefined symbol: __compiletime_assert_200 >>> referenced by arch_timer.h:156 (./arch/arm64/include/asm/arch_timer.h:156) >>> clocksource/arm_arch_timer.o:(erratum_set_next_event_generic) in archive drivers/built-in.a ld.lld: error: undefined symbol: __compiletime_assert_197 >>> referenced by arch_timer.h:133 (./arch/arm64/include/asm/arch_timer.h:133) >>> clocksource/arm_arch_timer.o:(erratum_set_next_event_generic) in archive drivers/built-in.a make: *** [Makefile:1161: vmlinux] Error 1 These are due to the BUILD_BUG() macros contained in the low-level accessors (arch_timer_reg_{write,read}_cp15) being emitted, as the access type wasn't known at compile time. Fix this by making erratum_set_next_event_generic() __force_inline, resulting in the 'access' parameter to be resolved at compile time, similarly to what is already done for set_next_event(). Fixes: 4775bc63f880 ("Add build-time guards for unhandled register accesses") Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20211117113532.3895208-1-maz@kernel.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-12-10clocksource/drivers/dw_apb_timer_of: Fix probe failureAlexey Sheplyakov1-1/+1
The driver refuses to probe with -EINVAL since the commit 5d9814df0aec ("clocksource/drivers/dw_apb_timer_of: Add error handling if no clock available"). Before the driver used to probe successfully if either "clock-freq" or "clock-frequency" properties has been specified in the device tree. That commit changed if (A && B) panic("No clock nor clock-frequency property"); into if (!A && !B) return 0; That's a bug: the reverse of `A && B` is '!A || !B', not '!A && !B' Signed-off-by: Vadim V. Vlasov <vadim.vlasov@elpitech.ru> Signed-off-by: Alexey Sheplyakov <asheplyakov@basealt.ru> Fixes: 5d9814df0aec56a6 ("clocksource/drivers/dw_apb_timer_of: Add error handling if no clock available"). Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vadim V. Vlasov <vadim.vlasov@elpitech.ru> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Link: https://lore.kernel.org/r/20211109153401.157491-1-asheplyakov@basealt.ru Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-12-10selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/OSean Christopherson3-0/+116
Add an x86 selftest to verify that KVM doesn't WARN or otherwise explode if userspace modifies RCX during a userspace exit to handle string I/O. This is a regression test for a user-triggerable WARN introduced by commit 3b27de271839 ("KVM: x86: split the two parts of emulator_pio_in"). Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211025201311.1881846-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-10KVM: x86: Don't WARN if userspace mucks with RCX during string I/O exitSean Christopherson1-2/+7
Replace a WARN with a comment to call out that userspace can modify RCX during an exit to userspace to handle string I/O. KVM doesn't actually support changing the rep count during an exit, i.e. the scenario can be ignored, but the WARN needs to go as it's trivial to trigger from userspace. Cc: stable@vger.kernel.org Fixes: 3b27de271839 ("KVM: x86: split the two parts of emulator_pio_in") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211025201311.1881846-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-10KVM: X86: Raise #GP when clearing CR0_PG in 64 bit modeLai Jiangshan1-1/+2
In the SDM: If the logical processor is in 64-bit mode or if CR4.PCIDE = 1, an attempt to clear CR0.PG causes a general-protection exception (#GP). Software should transition to compatibility mode and clear CR4.PCIDE before attempting to disable paging. Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20211207095230.53437-1-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-10xhci: avoid race between disable slot command and host runtime suspendMathias Nyman3-8/+16
Make xhci_disable_slot() synchronous, thus ensuring it, and xhci_free_dev() calling it return after xHC controller completes the disable slot command. Otherwise the roothub and xHC host may runtime suspend, and clear the command ring while the disable slot command is being processed. This causes a command completion mismatch as the completion event can't be mapped to the correct command. Command ring gets out of sync and commands time out. Driver finally assumes host is unresponsive and bails out. usb 2-4: USB disconnect, device number 10 xhci_hcd 0000:00:0d.0: ERROR mismatched command completion event ... xhci_hcd 0000:00:0d.0: xHCI host controller not responding, assume dead xhci_hcd 0000:00:0d.0: HC died; cleaning up Cc: <stable@vger.kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20211210141735.1384209-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-10xhci: Remove CONFIG_USB_DEFAULT_PERSIST to prevent xHCI from runtime suspendingKai-Heng Feng1-4/+0
When the xHCI is quirked with XHCI_RESET_ON_RESUME, runtime resume routine also resets the controller. This is bad for USB drivers without reset_resume callback, because there's no subsequent call of usb_dev_complete() -> usb_resume_complete() to force rebinding the driver to the device. For instance, btusb device stops working after xHCI controller is runtime resumed, if the controlled is quirked with XHCI_RESET_ON_RESUME. So always take XHCI_RESET_ON_RESUME into account to solve the issue. Cc: <stable@vger.kernel.org> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20211210141735.1384209-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-10Merge tag 'nvme-5.16-2021-12-10' of git://git.infradead.org/nvme into block-5.16Jens Axboe5-9/+33
Pull NVMe fixes from Christoph: "nvme fixes for Linux 5.16 - set ana_log_size to 0 after freeing ana_log_buf (Hou Tao) - show subsys nqn for duplicate cntlids (Keith Busch) - disable namespace access for unsupported metadata (Keith Busch) - report write pointer for a full zone as zone start + zone len (Niklas Cassel) - fix use after free when disconnecting a reconnecting ctrl (Ruozhu Li) - fix a list corruption in nvmet-tcp (Sagi Grimberg)" * tag 'nvme-5.16-2021-12-10' of git://git.infradead.org/nvme: nvmet-tcp: fix possible list corruption for unexpected command failure nvme: fix use after free when disconnecting a reconnecting ctrl nvme-multipath: set ana_log_size to 0 after free ana_log_buf nvme: report write pointer for a full zone as zone start + zone len nvme: disable namespace access for unsupported metadata nvme: show subsys nqn for duplicate cntlids
2021-12-10irqchip/irq-bcm7120-l2: Add put_device() after of_find_device_by_node()Ye Guojin1-0/+1
This was found by coccicheck: ./drivers/irqchip/irq-bcm7120-l2.c,328,1-7,ERROR missing put_device; call of_find_device_by_node on line 234, but without a corresponding object release within this function. ./drivers/irqchip/irq-bcm7120-l2.c,341,1-7,ERROR missing put_device; call of_find_device_by_node on line 234, but without a corresponding object release within this function. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211109055958.130287-1-ye.guojin@zte.com.cn
2021-12-10selftests: KVM: avoid failures due to reserved HyperTransport regionPaolo Bonzini3-1/+78
AMD proceessors define an address range that is reserved by HyperTransport and causes a failure if used for guest physical addresses. Avoid selftests failures by reserving those guest physical addresses; the rules are: - On parts with <40 bits, its fully hidden from software. - Before Fam17h, it was always 12G just below 1T, even if there was more RAM above this location. In this case we just not use any RAM above 1T. - On Fam17h and later, it is variable based on SME, and is either just below 2^48 (no encryption) or 2^43 (encryption). Fixes: ef4c9f4f6546 ("KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()") Cc: stable@vger.kernel.org Cc: David Matlack <dmatlack@google.com> Reported-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210805105423.412878-1-pbonzini@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Tested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-10KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI reqSean Christopherson1-2/+5
Do not bail early if there are no bits set in the sparse banks for a non-sparse, a.k.a. "all CPUs", IPI request. Per the Hyper-V spec, it is legal to have a variable length of '0', e.g. VP_SET's BankContents in this case, if the request can be serviced without the extra info. It is possible that for a given invocation of a hypercall that does accept variable sized input headers that all the header input fits entirely within the fixed size header. In such cases the variable sized input header is zero-sized and the corresponding bits in the hypercall input should be set to zero. Bailing early results in KVM failing to send IPIs to all CPUs as expected by the guest. Fixes: 214ff83d4473 ("KVM: x86: hyperv: implement PV IPI send hypercalls") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211207220926.718794-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-10KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush ↵Vitaly Kuznetsov1-1/+1
hypercall Prior to commit 0baedd792713 ("KVM: x86: make Hyper-V PV TLB flush use tlb_flush_guest()"), kvm_hv_flush_tlb() was using 'KVM_REQ_TLB_FLUSH | KVM_REQUEST_NO_WAKEUP' when making a request to flush TLBs on other vCPUs and KVM_REQ_TLB_FLUSH is/was defined as: (0 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) so KVM_REQUEST_WAIT was lost. Hyper-V TLFS, however, requires that "This call guarantees that by the time control returns back to the caller, the observable effects of all flushes on the specified virtual processors have occurred." and without KVM_REQUEST_WAIT there's a small chance that the vCPU making the TLB flush will resume running before all IPIs get delivered to other vCPUs and a stale mapping can get read there. Fix the issue by adding KVM_REQUEST_WAIT flag to KVM_REQ_TLB_FLUSH_GUEST: kvm_hv_flush_tlb() is the sole caller which uses it for kvm_make_all_cpus_request()/kvm_make_vcpus_request_mask() where KVM_REQUEST_WAIT makes a difference. Cc: stable@kernel.org Fixes: 0baedd792713 ("KVM: x86: make Hyper-V PV TLB flush use tlb_flush_guest()") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211209102937.584397-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-10Merge tag 'amd-drm-fixes-5.16-2021-12-08' of ↵Dave Airlie2-1/+8
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.16-2021-12-08: amdgpu: - DPIA fix - eDP fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211209042824.6720-1-alexander.deucher@amd.com
2021-12-10Merge tag 'drm-intel-fixes-2021-12-09' of ↵Dave Airlie2-9/+10
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes A fix to a error pointer dereference in gem_execbuffer and a fix for GT initialization when GuC/HuC are used on ICL. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YbJVWYAd/jeERCYY@intel.com
2021-12-10Merge tag 'drm-misc-fixes-2021-12-09' of ↵Dave Airlie2-2/+12
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A fix in syncobj to handle fence already signalled better, and a fix for a ttm_bo_swapout eviction check. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20211209124305.gxhid5zwf7m4oasn@houat
2021-12-09Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds17-67/+102
Pull rdma fixes from Jason Gunthorpe: "Quite a few small bug fixes old and new, also Doug Ledford is retiring now, we thank him for his work. Details: - Use after free in rxe - mlx5 DM regression - hns bugs triggred by device reset - Two fixes for CONFIG_DEBUG_PREEMPT - Several longstanding corner case bugs in hfi1 - Two irdma data path bugs in rare cases and some memory issues" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/irdma: Don't arm the CQ more than two times if no CE for this CQ RDMA/irdma: Report correct WC errors RDMA/irdma: Fix a potential memory allocation issue in 'irdma_prm_add_pble_mem()' RDMA/irdma: Fix a user-after-free in add_pble_prm IB/hfi1: Fix leak of rcvhdrtail_dummy_kvaddr IB/hfi1: Fix early init panic IB/hfi1: Insure use of smp_processor_id() is preempt disabled IB/hfi1: Correct guard on eager buffer deallocation RDMA/rtrs: Call {get,put}_cpu_ptr to silence a debug kernel warning RDMA/hns: Do not destroy QP resources in the hw resetting phase RDMA/hns: Do not halt commands during reset until later Remove Doug Ledford from MAINTAINERS RDMA/mlx5: Fix releasing unallocated memory in dereg MR flow RDMA: Fix use-after-free in rxe_queue_cleanup
2021-12-09percpu_ref: Replace kernel.h with the necessary inclusionsAndy Shevchenko1-1/+1
When kernel.h is used in the headers it adds a lot into dependency hell, especially when there are circular dependencies are involved. Replace kernel.h inclusion with the list of what is really being used. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Dennis Zhou <dennis@kernel.org>
2021-12-09Merge tag 'net-5.16-rc5' of ↵Linus Torvalds100-383/+1370
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and netfilter. Current release - regressions: - bpf, sockmap: re-evaluate proto ops when psock is removed from sockmap Current release - new code bugs: - bpf: fix bpf_check_mod_kfunc_call for built-in modules - ice: fixes for TC classifier offloads - vrf: don't run conntrack on vrf with !dflt qdisc Previous releases - regressions: - bpf: fix the off-by-two error in range markings - seg6: fix the iif in the IPv6 socket control block - devlink: fix netns refcount leak in devlink_nl_cmd_reload() - dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's" - dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports Previous releases - always broken: - ethtool: do not perform operations on net devices being unregistered - udp: use datalen to cap max gso segments - ice: fix races in stats collection - fec: only clear interrupt of handling queue in fec_enet_rx_queue() - m_can: pci: fix incorrect reference clock rate - m_can: disable and ignore ELO interrupt - mvpp2: fix XDP rx queues registering Misc: - treewide: add missing includes masked by cgroup -> bpf.h dependency" * tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits) net: dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports net: wwan: iosm: fixes unable to send AT command during mbim tx net: wwan: iosm: fixes net interface nonfunctional after fw flash net: wwan: iosm: fixes unnecessary doorbell send net: dsa: felix: Fix memory leak in felix_setup_mmio_filtering MAINTAINERS: s390/net: remove myself as maintainer net/sched: fq_pie: prevent dismantle issue net: mana: Fix memory leak in mana_hwc_create_wq seg6: fix the iif in the IPv6 socket control block nfp: Fix memory leak in nfp_cpp_area_cache_add() nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done nfc: fix segfault in nfc_genl_dump_devices_done udp: using datalen to cap max gso segments net: dsa: mv88e6xxx: error handling for serdes_power functions can: kvaser_usb: get CAN clock frequency from device can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct stats->{rx,tx}_errors counter net: mvpp2: fix XDP rx queues registering vmxnet3: fix minimum vectors alloc issue net, neigh: clear whole pneigh_entry at alloc time net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's" ...
2021-12-09Merge tag 'mtd/fixes-for-5.16-rc5' of ↵Linus Torvalds4-12/+40
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull mtd fixes from Miquel Raynal: "MTD fixes: - dataflash: Add device-tree SPI IDs to avoid new warnings Raw NAND fixes: - Fix nand_choose_best_timings() on unsupported interface - Fix nand_erase_op delay (wrong unit) - fsmc: - Fix timing computation - Take instruction delay into account - denali: - Add the dependency on HAS_IOMEM to silence robots" * tag 'mtd/fixes-for-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: dataflash: Add device-tree SPI IDs mtd: rawnand: fsmc: Fix timing computation mtd: rawnand: fsmc: Take instruction delay into account mtd: rawnand: Fix nand_choose_best_timings() on unsupported interface mtd: rawnand: Fix nand_erase_op delay mtd: rawnand: denali: Add the dependency on HAS_IOMEM
2021-12-09Merge branch 'for-linus' of ↵Linus Torvalds36-36/+146
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - fixes for various drivers which assume that a HID device is on USB transport, but that might not necessarily be the case, as the device can be faked by uhid. (Greg, Benjamin Tissoires) - fix for spurious wakeups on certain Lenovo notebooks (Thomas Weißschuh) - a few other device-specific quirks * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: Ignore battery for Elan touchscreen on Asus UX550VE HID: intel-ish-hid: ipc: only enable IRQ wakeup when requested HID: google: add eel USB id HID: add USB_HID dependancy to hid-prodikeys HID: add USB_HID dependancy to hid-chicony HID: bigbenff: prevent null pointer dereference HID: sony: fix error path in probe HID: add USB_HID dependancy on some USB HID drivers HID: check for valid USB device for many HID drivers HID: wacom: fix problems when device is not a valid USB device HID: add hid_is_usb() function to make it simpler for USB detection HID: quirks: Add quirk for the Microsoft Surface 3 type-cover
2021-12-09Revert "usb: dwc3: dwc3-qcom: Enable tx-fifo-resize property by default"Douglas Anderson1-15/+0
This reverts commit cefdd52fa0455c0555c30927386ee466a108b060. On sc7180-trogdor class devices with 'fw_devlink=permissive' and KASAN enabled, you'll see a Use-After-Free reported at bootup. The root of the problem is that dwc3_qcom_of_register_core() is adding a devm-allocated "tx-fifo-resize" property to its device tree node using of_add_property(). The issue is that of_add_property() makes a _permanent_ addition to the device tree that lasts until reboot. That means allocating memory for the property using "devm" managed memory is a terrible idea since that memory will be freed upon probe deferral or device unbinding. Let's revert the patch since the system is still functional without it. The fact that of_add_property() makes a permanent change is extra fodder for those folks who were aruging that the device tree isn't really the right way to pass information between parts of the driver. It is an exercise left to the reader to submit a patch re-adding the new feature in a way that makes everyone happier. Fixes: cefdd52fa045 ("usb: dwc3: dwc3-qcom: Enable tx-fifo-resize property by default") Cc: stable <stable@vger.kernel.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20211207094327.1.Ie3cde3443039342e2963262a4c3ac36dc2c08b30@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-09aio: Fix incorrect usage of eventfd_signal_allowed()Xie Yongji1-1/+1
We should defer eventfd_signal() to the workqueue when eventfd_signal_allowed() return false rather than return true. Fixes: b542e383d8c0 ("eventfd: Make signal recursion protection a task bit") Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210913111928.98-1-xieyongji@bytedance.com Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-12-09aio: fix use-after-free due to missing POLLFREE handlingEric Biggers2-32/+107
signalfd_poll() and binder_poll() are special in that they use a waitqueue whose lifetime is the current task, rather than the struct file as is normally the case. This is okay for blocking polls, since a blocking poll occurs within one task; however, non-blocking polls require another solution. This solution is for the queue to be cleared before it is freed, by sending a POLLFREE notification to all waiters. Unfortunately, only eventpoll handles POLLFREE. A second type of non-blocking poll, aio poll, was added in kernel v4.18, and it doesn't handle POLLFREE. This allows a use-after-free to occur if a signalfd or binder fd is polled with aio poll, and the waitqueue gets freed. Fix this by making aio poll handle POLLFREE. A patch by Ramji Jiyani <ramjiyani@google.com> (https://lore.kernel.org/r/20211027011834.2497484-1-ramjiyani@google.com) tried to do this by making aio_poll_wake() always complete the request inline if POLLFREE is seen. However, that solution had two bugs. First, it introduced a deadlock, as it unconditionally locked the aio context while holding the waitqueue lock, which inverts the normal locking order. Second, it didn't consider that POLLFREE notifications are missed while the request has been temporarily de-queued. The second problem was solved by my previous patch. This patch then properly fixes the use-after-free by handling POLLFREE in a deadlock-free way. It does this by taking advantage of the fact that freeing of the waitqueue is RCU-delayed, similar to what eventpoll does. Fixes: 2c14fa838cbe ("aio: implement IOCB_CMD_POLL") Cc: <stable@vger.kernel.org> # v4.18+ Link: https://lore.kernel.org/r/20211209010455.42744-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-12-09aio: keep poll requests on waitqueue until completedEric Biggers1-20/+63
Currently, aio_poll_wake() will always remove the poll request from the waitqueue. Then, if aio_poll_complete_work() sees that none of the polled events are ready and the request isn't cancelled, it re-adds the request to the waitqueue. (This can easily happen when polling a file that doesn't pass an event mask when waking up its waitqueue.) This is fundamentally broken for two reasons: 1. If a wakeup occurs between vfs_poll() and the request being re-added to the waitqueue, it will be missed because the request wasn't on the waitqueue at the time. Therefore, IOCB_CMD_POLL might never complete even if the polled file is ready. 2. When the request isn't on the waitqueue, there is no way to be notified that the waitqueue is being freed (which happens when its lifetime is shorter than the struct file's). This is supposed to happen via the waitqueue entries being woken up with POLLFREE. Therefore, leave the requests on the waitqueue until they are actually completed (or cancelled). To keep track of when aio_poll_complete_work needs to be scheduled, use new fields in struct poll_iocb. Remove the 'done' field which is now redundant. Note that this is consistent with how sys_poll() and eventpoll work; their wakeup functions do *not* remove the waitqueue entries. Fixes: 2c14fa838cbe ("aio: implement IOCB_CMD_POLL") Cc: <stable@vger.kernel.org> # v4.18+ Link: https://lore.kernel.org/r/20211209010455.42744-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-12-09signalfd: use wake_up_pollfree()Eric Biggers1-11/+1
wake_up_poll() uses nr_exclusive=1, so it's not guaranteed to wake up all exclusive waiters. Yet, POLLFREE *must* wake up all waiters. epoll and aio poll are fortunately not affected by this, but it's very fragile. Thus, the new function wake_up_pollfree() has been introduced. Convert signalfd to use wake_up_pollfree(). Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Fixes: d80e731ecab4 ("epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211209010455.42744-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-12-09binder: use wake_up_pollfree()Eric Biggers1-12/+9
wake_up_poll() uses nr_exclusive=1, so it's not guaranteed to wake up all exclusive waiters. Yet, POLLFREE *must* wake up all waiters. epoll and aio poll are fortunately not affected by this, but it's very fragile. Thus, the new function wake_up_pollfree() has been introduced. Convert binder to use wake_up_pollfree(). Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Fixes: f5cb779ba163 ("ANDROID: binder: remove waitqueue when thread exits.") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211209010455.42744-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-12-09wait: add wake_up_pollfree()Eric Biggers2-0/+33
Several ->poll() implementations are special in that they use a waitqueue whose lifetime is the current task, rather than the struct file as is normally the case. This is okay for blocking polls, since a blocking poll occurs within one task; however, non-blocking polls require another solution. This solution is for the queue to be cleared before it is freed, using 'wake_up_poll(wq, EPOLLHUP | POLLFREE);'. However, that has a bug: wake_up_poll() calls __wake_up() with nr_exclusive=1. Therefore, if there are multiple "exclusive" waiters, and the wakeup function for the first one returns a positive value, only that one will be called. That's *not* what's needed for POLLFREE; POLLFREE is special in that it really needs to wake up everyone. Considering the three non-blocking poll systems: - io_uring poll doesn't handle POLLFREE at all, so it is broken anyway. - aio poll is unaffected, since it doesn't support exclusive waits. However, that's fragile, as someone could add this feature later. - epoll doesn't appear to be broken by this, since its wakeup function returns 0 when it sees POLLFREE. But this is fragile. Although there is a workaround (see epoll), it's better to define a function which always sends POLLFREE to all waiters. Add such a function. Also make it verify that the queue really becomes empty after all waiters have been woken up. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211209010455.42744-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-12-09Merge tag 'netfs-fixes-20211207' of ↵Linus Torvalds1-13/+8
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull netfslib fixes from David Howells: - Fix a lockdep warning and potential deadlock. This is takes the simple approach of offloading the write-to-cache done from within a network filesystem read to a worker thread to avoid taking the sb_writer lock from the cache backing filesystem whilst holding the mmap lock on an inode from the network filesystem. Jan Kara posits a scenario whereby this can cause deadlock[1], though it's quite complex and I think requires someone in userspace to actually do I/O on the cache files. Matthew Wilcox isn't so certain, though[2]. An alternative way to fix this, suggested by Darrick Wong, might be to allow cachefiles to prevent userspace from performing I/O upon the file - something like an exclusive open - but that's beyond the scope of a fix here if we do want to make such a facility in the future. - In some of the error handling paths where netfs_ops->cleanup() is called, the arguments are transposed[3]. gcc doesn't complain because one of the parameters is void* and one of the values is void*. Link: https://lore.kernel.org/r/20210922110420.GA21576@quack2.suse.cz/ [1] Link: https://lore.kernel.org/r/Ya9eDiFCE2fO7K/S@casper.infradead.org/ [2] Link: https://lore.kernel.org/r/20211207031449.100510-1-jefflexu@linux.alibaba.com/ [3] * tag 'netfs-fixes-20211207' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: netfs: fix parameter of cleanup() netfs: Fix lockdep warning from taking sb_writers whilst holding mmap_lock
2021-12-09tracing: Fix possible memory leak in __create_synth_event() error pathMiaoqian Lin1-5/+6
There's error paths in __create_synth_event() after the argv is allocated that fail to free it. Add a jump to free it when necessary. Link: https://lkml.kernel.org/r/20211209024317.11783-1-linmq006@gmail.com Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Miaoqian Lin <linmq006@gmail.com> [ Fixed up the patch and change log ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-12-09ftrace/samples: Add module to test multi direct modify interfaceJiri Olsa2-0/+153
Adding ftrace-direct-multi-modify.ko kernel module that uses modify_ftrace_direct_multi API. The core functionality is taken from ftrace-direct-modify.ko kernel module and changed to fit multi direct interface. The init function creates kthread that periodically calls modify_ftrace_direct_multi to change the trampoline address for the direct ftrace_ops. The ftrace trace_pipe then shows trace from both trampolines. Link: https://lkml.kernel.org/r/20211206182032.87248-4-jolsa@kernel.org Cc: Ingo Molnar <mingo@redhat.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-12-09bus: mhi: core: Add support for forced PM resumeLoic Poulain3-4/+36
For whatever reason, some devices like QCA6390, WCN6855 using ath11k are not in M3 state during PM resume, but still functional. The mhi_pm_resume should then not fail in those cases, and let the higher level device specific stack continue resuming process. Add an API mhi_pm_resume_force(), to force resuming irrespective of the current MHI state. This fixes a regression with non functional ath11k WiFi after suspend/resume cycle on some machines. Bug report: https://bugzilla.kernel.org/show_bug.cgi?id=214179 Link: https://lore.kernel.org/regressions/871r5p0x2u.fsf@codeaurora.org/ Fixes: 020d3b26c07a ("bus: mhi: Early MHI resume failure in non M3 state") Cc: stable@vger.kernel.org #5.13 Reported-by: Kalle Valo <kvalo@codeaurora.org> Reported-by: Pengyu Ma <mapengyu@gmail.com> Tested-by: Kalle Valo <kvalo@kernel.org> Acked-by: Kalle Valo <kvalo@kernel.org> Signed-off-by: Loic Poulain <loic.poulain@linaro.org> [mani: Switched to API, added bug report, reported-by tags and CCed stable] Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20211209131633.4168-1-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-09KVM: x86: selftests: svm_int_ctl_test: fix intercept calculationMaciej S. Szmigiero1-1/+1
INTERCEPT_x are bit positions, but the code was using the raw value of INTERCEPT_VINTR (4) instead of BIT(INTERCEPT_VINTR). This resulted in masking of bit 2 - that is, SMI instead of VINTR. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <49b9571d25588870db5380b0be1a41df4bbaaf93.1638486479.git.maciej.szmigiero@oracle.com>
2021-12-09tools/lib/lockdep: drop leftover liblockdep headersSasha Levin7-176/+0
Clean up remaining headers that are specific to liblockdep but lived in the shared header directory. These are all unused after the liblockdep code was removed in commit 7246f4dcaccc ("tools/lib/lockdep: drop liblockdep"). Note that there are still headers that were originally created for liblockdep, that still have liblockdep references, but they are used by other tools/ code at this point. Signed-off-by: Sasha Levin <sashal@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-09net: dsa: mv88e6xxx: allow use of PHYs on CPU and DSA portsRussell King (Oracle)1-30/+34
Martyn Welch reports that his CPU port is unable to link where it has been necessary to use one of the switch ports with an internal PHY for the CPU port. The reason behind this is the port control register is left forcing the link down, preventing traffic flow. This occurs because during initialisation, phylink expects the link to be down, and DSA forces the link down by synthesising a call to the DSA drivers phylink_mac_link_down() method, but we don't touch the forced-link state when we later reconfigure the port. Resolve this by also unforcing the link state when we are operating in PHY mode and the PPU is set to poll the PHY to retrieve link status information. Reported-by: Martyn Welch <martyn.welch@collabora.com> Tested-by: Martyn Welch <martyn.welch@collabora.com> Fixes: 3be98b2d5fbc ("net: dsa: Down cpu/dsa ports phylink will control") Cc: <stable@vger.kernel.org> # 5.7: 2b29cb9e3f7f: net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's" Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/E1mvFhP-00F8Zb-Ul@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09Merge branch 'net-wwan-iosm-bug-fixes'Jakub Kicinski3-18/+19
M Chetan Kumar says: ==================== net: wwan: iosm: bug fixes This patch series brings in IOSM driver bug fixes. Patch details are explained below. PATCH1: stop sending unnecessary doorbell in IP tx flow. PATCH2: Restore the IP channel configuration after fw flash. PATCH3: Removed the unnecessary check around control port TX transfer. ==================== Link: https://lore.kernel.org/r/20211209101629.2940877-1-m.chetan.kumar@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09net: wwan: iosm: fixes unable to send AT command during mbim txM Chetan Kumar3-10/+0
ev_cdev_write_pending flag is preventing a TX message post for AT port while MBIM transfer is ongoing. Removed the unnecessary check around control port TX transfer. Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09net: wwan: iosm: fixes net interface nonfunctional after fw flashM Chetan Kumar3-1/+8
Devlink initialization flow was overwriting the IP traffic channel configuration. This was causing wwan0 network interface to be unusable after fw flash. When device boots to fully functional mode restore the IP channel configuration. Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09net: wwan: iosm: fixes unnecessary doorbell sendM Chetan Kumar1-7/+11
In TX packet accumulation flow transport layer is giving a doorbell to device even though there is no pending control TX transfer that needs immediate attention. Introduced a new hpda_ctrl_pending variable to keep track of pending control TX transfer. If there is a pending control TX transfer which needs an immediate attention only then give a doorbell to device. Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09net: dsa: felix: Fix memory leak in felix_setup_mmio_filteringJosé Expósito1-1/+4
Avoid a memory leak if there is not a CPU port defined. Fixes: 8d5f7954b7c8 ("net: dsa: felix: break at first CPU port during init and teardown") Addresses-Coverity-ID: 1492897 ("Resource leak") Addresses-Coverity-ID: 1492899 ("Resource leak") Signed-off-by: José Expósito <jose.exposito89@gmail.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20211209110538.11585-1-jose.exposito89@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09MAINTAINERS: s390/net: remove myself as maintainerJulian Wiedmann1-2/+0
I won't have access to the relevant HW and docs much longer. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Link: https://lore.kernel.org/r/20211209153546.1152921-1-jwi@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09net/sched: fq_pie: prevent dismantle issueEric Dumazet1-0/+1
For some reason, fq_pie_destroy() did not copy working code from pie_destroy() and other qdiscs, thus causing elusive bug. Before calling del_timer_sync(&q->adapt_timer), we need to ensure timer will not rearm itself. rcu: INFO: rcu_preempt self-detected stall on CPU rcu: 0-....: (4416 ticks this GP) idle=60d/1/0x4000000000000000 softirq=10433/10434 fqs=2579 (t=10501 jiffies g=13085 q=3989) NMI backtrace for cpu 0 CPU: 0 PID: 13 Comm: ksoftirqd/0 Not tainted 5.16.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111 nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline] rcu_dump_cpu_stacks+0x25e/0x3f0 kernel/rcu/tree_stall.h:343 print_cpu_stall kernel/rcu/tree_stall.h:627 [inline] check_cpu_stall kernel/rcu/tree_stall.h:711 [inline] rcu_pending kernel/rcu/tree.c:3878 [inline] rcu_sched_clock_irq.cold+0x9d/0x746 kernel/rcu/tree.c:2597 update_process_times+0x16d/0x200 kernel/time/timer.c:1785 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226 tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1428 __run_hrtimer kernel/time/hrtimer.c:1685 [inline] __hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1749 hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline] __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638 RIP: 0010:write_comp_data kernel/kcov.c:221 [inline] RIP: 0010:__sanitizer_cov_trace_const_cmp1+0x1d/0x80 kernel/kcov.c:273 Code: 54 c8 20 48 89 10 c3 66 0f 1f 44 00 00 53 41 89 fb 41 89 f1 bf 03 00 00 00 65 48 8b 0c 25 40 70 02 00 48 89 ce 4c 8b 54 24 08 <e8> 4e f7 ff ff 84 c0 74 51 48 8b 81 88 15 00 00 44 8b 81 84 15 00 RSP: 0018:ffffc90000d27b28 EFLAGS: 00000246 RAX: 0000000000000000 RBX: ffff888064bf1bf0 RCX: ffff888011928000 RDX: ffff888011928000 RSI: ffff888011928000 RDI: 0000000000000003 RBP: ffff888064bf1c28 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff875d8295 R11: 0000000000000000 R12: 0000000000000000 R13: ffff8880783dd300 R14: 0000000000000000 R15: 0000000000000000 pie_calculate_probability+0x405/0x7c0 net/sched/sch_pie.c:418 fq_pie_timer+0x170/0x2a0 net/sched/sch_fq_pie.c:383 call_timer_fn+0x1a5/0x6b0 kernel/time/timer.c:1421 expire_timers kernel/time/timer.c:1466 [inline] __run_timers.part.0+0x675/0xa20 kernel/time/timer.c:1734 __run_timers kernel/time/timer.c:1715 [inline] run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1747 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 run_ksoftirqd kernel/softirq.c:921 [inline] run_ksoftirqd+0x2d/0x60 kernel/softirq.c:913 smpboot_thread_fn+0x645/0x9c0 kernel/smpboot.c:164 kthread+0x405/0x4f0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK> Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Mohit P. Tahiliani <tahiliani@nitk.edu.in> Cc: Sachin D. Patil <sdp.sachin@gmail.com> Cc: V. Saicharan <vsaicharan1998@gmail.com> Cc: Mohit Bhasi <mohitbhasi1998@gmail.com> Cc: Leslie Monis <lesliemonis@gmail.com> Cc: Gautam Ramakrishnan <gautamramk@gmail.com> Link: https://lore.kernel.org/r/20211209084937.3500020-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09net: mana: Fix memory leak in mana_hwc_create_wqJosé Expósito1-5/+5
If allocating the DMA buffer fails, mana_hwc_destroy_wq was called without previously storing the pointer to the queue. In order to avoid leaking the pointer to the queue, store it as soon as it is allocated. Addresses-Coverity-ID: 1484720 ("Resource leak") Signed-off-by: José Expósito <jose.exposito89@gmail.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Link: https://lore.kernel.org/r/20211208223723.18520-1-jose.exposito89@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09seg6: fix the iif in the IPv6 socket control blockAndrea Mayer1-0/+8
When an IPv4 packet is received, the ip_rcv_core(...) sets the receiving interface index into the IPv4 socket control block (v5.16-rc4, net/ipv4/ip_input.c line 510): IPCB(skb)->iif = skb->skb_iif; If that IPv4 packet is meant to be encapsulated in an outer IPv6+SRH header, the seg6_do_srh_encap(...) performs the required encapsulation. In this case, the seg6_do_srh_encap function clears the IPv6 socket control block (v5.16-rc4 net/ipv6/seg6_iptunnel.c line 163): memset(IP6CB(skb), 0, sizeof(*IP6CB(skb))); The memset(...) was introduced in commit ef489749aae5 ("ipv6: sr: clear IP6CB(skb) on SRH ip4ip6 encapsulation") a long time ago (2019-01-29). Since the IPv6 socket control block and the IPv4 socket control block share the same memory area (skb->cb), the receiving interface index info is lost (IP6CB(skb)->iif is set to zero). As a side effect, that condition triggers a NULL pointer dereference if commit 0857d6f8c759 ("ipv6: When forwarding count rx stats on the orig netdev") is applied. To fix that issue, we set the IP6CB(skb)->iif with the index of the receiving interface once again. Fixes: ef489749aae5 ("ipv6: sr: clear IP6CB(skb) on SRH ip4ip6 encapsulation") Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211208195409.12169-1-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09nfp: Fix memory leak in nfp_cpp_area_cache_add()Jianglei Nie1-1/+3
In line 800 (#1), nfp_cpp_area_alloc() allocates and initializes a CPP area structure. But in line 807 (#2), when the cache is allocated failed, this CPP area structure is not freed, which will result in memory leak. We can fix it by freeing the CPP area when the cache is allocated failed (#2). 792 int nfp_cpp_area_cache_add(struct nfp_cpp *cpp, size_t size) 793 { 794 struct nfp_cpp_area_cache *cache; 795 struct nfp_cpp_area *area; 800 area = nfp_cpp_area_alloc(cpp, NFP_CPP_ID(7, NFP_CPP_ACTION_RW, 0), 801 0, size); // #1: allocates and initializes 802 if (!area) 803 return -ENOMEM; 805 cache = kzalloc(sizeof(*cache), GFP_KERNEL); 806 if (!cache) 807 return -ENOMEM; // #2: missing free 817 return 0; 818 } Fixes: 4cb584e0ee7d ("nfp: add CPP access core") Signed-off-by: Jianglei Nie <niejianglei2021@163.com> Acked-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20211209061511.122535-1-niejianglei2021@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_doneKrzysztof Kozlowski1-2/+4
The done() netlink callback nfc_genl_dump_ses_done() should check if received argument is non-NULL, because its allocation could fail earlier in dumpit() (nfc_genl_dump_ses()). Fixes: ac22ac466a65 ("NFC: Add a GET_SE netlink API") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20211209081307.57337-1-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09nfc: fix segfault in nfc_genl_dump_devices_doneTadeusz Struk1-2/+4
When kmalloc in nfc_genl_dump_devices() fails then nfc_genl_dump_devices_done() segfaults as below KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 PID: 25 Comm: kworker/0:1 Not tainted 5.16.0-rc4-01180-g2a987e65025e-dirty #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-6.fc35 04/01/2014 Workqueue: events netlink_sock_destruct_work RIP: 0010:klist_iter_exit+0x26/0x80 Call Trace: <TASK> class_dev_iter_exit+0x15/0x20 nfc_genl_dump_devices_done+0x3b/0x50 genl_lock_done+0x84/0xd0 netlink_sock_destruct+0x8f/0x270 __sk_destruct+0x64/0x3b0 sk_destruct+0xa8/0xd0 __sk_free+0x2e8/0x3d0 sk_free+0x51/0x90 netlink_sock_destruct_work+0x1c/0x20 process_one_work+0x411/0x710 worker_thread+0x6fd/0xa80 Link: https://syzkaller.appspot.com/bug?id=fc0fa5a53db9edd261d56e74325419faf18bd0df Reported-by: syzbot+f9f76f4a0766420b4a02@syzkaller.appspotmail.com Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20211208182742.340542-1-tadeusz.struk@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09udp: using datalen to cap max gso segmentsJianguo Wu1-1/+1
The max number of UDP gso segments is intended to cap to UDP_MAX_SEGMENTS, this is checked in udp_send_skb(): if (skb->len > cork->gso_size * UDP_MAX_SEGMENTS) { kfree_skb(skb); return -EINVAL; } skb->len contains network and transport header len here, we should use only data len instead. Fixes: bec1f6f69736 ("udp: generate gso with UDP_SEGMENT") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/900742e5-81fb-30dc-6e0b-375c6cdd7982@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09net: dsa: mv88e6xxx: error handling for serdes_power functionsAmeer Hamza1-1/+7
Added default case to handle undefined cmode scenario in mv88e6393x_serdes_power() and mv88e6393x_serdes_power() methods. Addresses-Coverity: 1494644 ("Uninitialized scalar variable") Fixes: 21635d9203e1c (net: dsa: mv88e6xxx: Fix application of erratum 4.8 for 88E6393X) Reviewed-by: Marek Behún <kabel@kernel.org> Signed-off-by: Ameer Hamza <amhamza.mgc@gmail.com> Link: https://lore.kernel.org/r/20211209041552.9810-1-amhamza.mgc@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09Merge tag 'linux-can-fixes-for-5.16-20211209' of ↵Jakub Kicinski2-29/+80
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== can 2021-12-09 Both patches are by Jimmy Assarsson. The first one fixes the incrementing of the rx/tx error counters in the Kvaser PCIe FD driver. The second one fixes the Kvaser USB driver by using the CAN clock frequency provided by the device instead of using a hard coded value. * tag 'linux-can-fixes-for-5.16-20211209' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: kvaser_usb: get CAN clock frequency from device can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct stats->{rx,tx}_errors counter ==================== Link: https://lore.kernel.org/r/20211209081312.301036-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09drm/i915/gen11: Moving WAs to icl_gt_workarounds_init()Raviteja Goud Talla1-9/+9
Bspec page says "Reset: BUS", Accordingly moving w/a's: Wa_1407352427,Wa_1406680159 to proper function icl_gt_workarounds_init() Which will resolve guc enabling error v2: - Previous patch rev2 was created by email client which caused the Build failure, This v2 is to resolve the previous broken series Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Raviteja Goud Talla <ravitejax.goud.talla@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211203145603.4006937-1-ravitejax.goud.talla@intel.com (cherry picked from commit 67b858dd89932086ae0ee2d0ce4dd070a2c88bb3) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-12-09mmc: mediatek: free the ext_csd when mmc_get_ext_csd successWenbin Mei1-1/+3
If mmc_get_ext_csd success, the ext_csd are not freed. Add the missing kfree() calls. Signed-off-by: Wenbin Mei <wenbin.mei@mediatek.com> Fixes: c4ac38c6539b ("mmc: mtk-sd: Add HS400 online tuning support") Link: https://lore.kernel.org/r/20211207075013.22911-1-wenbin.mei@mediatek.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-12-09i2c: virtio: fix completion handlingVincent Whitchurch1-20/+12
The driver currently assumes that the notify callback is only received when the device is done with all the queued buffers. However, this is not true, since the notify callback could be called without any of the queued buffers being completed (for example, with virtio-pci and shared interrupts) or with only some of the buffers being completed (since the driver makes them available to the device in multiple separate virtqueue_add_sgs() calls). This can lead to incorrect data on the I2C bus or memory corruption in the guest if the device operates on buffers which are have been freed by the driver. (The WARN_ON in the driver is also triggered.) BUG kmalloc-128 (Tainted: G W ): Poison overwritten First byte 0x0 instead of 0x6b Allocated in i2cdev_ioctl_rdwr+0x9d/0x1de age=243 cpu=0 pid=28 memdup_user+0x2e/0xbd i2cdev_ioctl_rdwr+0x9d/0x1de i2cdev_ioctl+0x247/0x2ed vfs_ioctl+0x21/0x30 sys_ioctl+0xb18/0xb41 Freed in i2cdev_ioctl_rdwr+0x1bb/0x1de age=68 cpu=0 pid=28 kfree+0x1bd/0x1cc i2cdev_ioctl_rdwr+0x1bb/0x1de i2cdev_ioctl+0x247/0x2ed vfs_ioctl+0x21/0x30 sys_ioctl+0xb18/0xb41 Fix this by calling virtio_get_buf() from the notify handler like other virtio drivers and by actually waiting for all the buffers to be completed. Fixes: 3cfc88380413d20f ("i2c: virtio: add a virtio i2c frontend driver") Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-12-09can: kvaser_usb: get CAN clock frequency from deviceJimmy Assarsson1-28/+73
The CAN clock frequency is used when calculating the CAN bittiming parameters. When wrong clock frequency is used, the device may end up with wrong bittiming parameters, depending on user requested bittiming parameters. To avoid this, get the CAN clock frequency from the device. Various existing Kvaser Leaf products use different CAN clocks. Fixes: 080f40a6fa28 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices") Link: https://lore.kernel.org/all/20211208152122.250852-2-extja@kvaser.com Cc: stable@vger.kernel.org Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-09can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct ↵Jimmy Assarsson1-1/+7
stats->{rx,tx}_errors counter Check the direction bit in the error frame packet (EPACK) to determine which net_device_stats {rx,tx}_errors counter to increase. Fixes: 26ad340e582d ("can: kvaser_pciefd: Add driver for Kvaser PCIEcan devices") Link: https://lore.kernel.org/all/20211208152122.250852-1-extja@kvaser.com Cc: stable@vger.kernel.org Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-08drm/amd/display: prevent reading unitialized linksMikita Lipski1-0/+2
[why/how] The function can be called on boot or after suspend when links are not initialized, to prevent it guard it with NULL pointer check Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by: Mikita Lipski <mikita.lipski@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-08drm/amd/display: Fix DPIA outbox timeout after S3/S4/resetNicholas Kazlauskas1-1/+6
[Why] The HW interrupt gets disabled after S3/S4/reset so we don't receive notifications for HPD or AUX from DMUB - leading to timeout and black screen with (or without) DPIA links connected. [How] Re-enable the interrupt after S3/S4/reset like we do for the other DC interrupts. Guard both instances of the outbox interrupt enable or we'll hang during restore on ASIC that don't support it. Fixes: 6eff272dbee7ad ("drm/amd/display: Fix DPIA outbox timeout after GPU reset") Reviewed-by: Jude Shih <Jude.Shih@amd.com> Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-08net: mvpp2: fix XDP rx queues registeringLouis Amas1-2/+2
The registration of XDP queue information is incorrect because the RX queue id we use is invalid. When port->id == 0 it appears to works as expected yet it's no longer the case when port->id != 0. The problem arised while using a recent kernel version on the MACCHIATOBin. This board has several ports: * eth0 and eth1 are 10Gbps interfaces ; both ports has port->id == 0; * eth2 is a 1Gbps interface with port->id != 0. Code from xdp-tutorial (more specifically advanced03-AF_XDP) was used to test packet capture and injection on all these interfaces. The XDP kernel was simplified to: SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx) { int index = ctx->rx_queue_index; /* A set entry here means that the correspnding queue_id * has an active AF_XDP socket bound to it. */ if (bpf_map_lookup_elem(&xsks_map, &index)) return bpf_redirect_map(&xsks_map, index, 0); return XDP_PASS; } Starting the program using: ./af_xdp_user -d DEV Gives the following result: * eth0 : ok * eth1 : ok * eth2 : no capture, no injection Investigating the issue shows that XDP rx queues for eth2 are wrong: XDP expects their id to be in the range [0..3] but we found them to be in the range [32..35]. Trying to force rx queue ids using: ./af_xdp_user -d eth2 -Q 32 fails as expected (we shall not have more than 4 queues). When we register the XDP rx queue information (using xdp_rxq_info_reg() in function mvpp2_rxq_init()) we tell it to use rxq->id as the queue id. This value is computed as: rxq->id = port->id * max_rxq_count + queue_id where max_rxq_count depends on the device version. In the MACCHIATOBin case, this value is 32, meaning that rx queues on eth2 are numbered from 32 to 35 - there are four of them. Clearly, this is not the per-port queue id that XDP is expecting: it wants a value in the range [0..3]. It shall directly use queue_id which is stored in rxq->logic_rxq -- so let's use that value instead. rxq->id is left untouched ; its value is indeed valid but it should not be used in this context. This is consistent with the remaining part of the code in mvpp2_rxq_init(). With this change, packet capture is working as expected on all the MACCHIATOBin ports. Fixes: b27db2274ba8 ("mvpp2: use page_pool allocator") Signed-off-by: Louis Amas <louis.amas@eho.link> Signed-off-by: Emmanuel Deloget <emmanuel.deloget@eho.link> Reviewed-by: Marcin Wojtas <mw@semihalf.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/r/20211207143423.916334-1-louis.amas@eho.link Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09libata: add horkage for ASMedia 1092Hannes Reinecke1-0/+2
The ASMedia 1092 has a configuration mode which will present a dummy device; sadly the implementation falsely claims to provide a device with 100M which doesn't actually exist. So disable this device to avoid errors during boot. Cc: stable@vger.kernel.org Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2021-12-08vmxnet3: fix minimum vectors alloc issueRonak Doshi1-6/+7
'Commit 39f9895a00f4 ("vmxnet3: add support for 32 Tx/Rx queues")' added support for 32Tx/Rx queues. Within that patch, value of VMXNET3_LINUX_MIN_MSIX_VECT was updated. However, there is a case (numvcpus = 2) which actually requires 3 intrs which matches VMXNET3_LINUX_MIN_MSIX_VECT which then is treated as failure by stack to allocate more vectors. This patch fixes this issue. Fixes: 39f9895a00f4 ("vmxnet3: add support for 32 Tx/Rx queues") Signed-off-by: Ronak Doshi <doshir@vmware.com> Acked-by: Guolin Yang <gyang@vmware.com> Link: https://lore.kernel.org/r/20211207081737.14000-1-doshir@vmware.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net, neigh: clear whole pneigh_entry at alloc timeEric Dumazet1-2/+1
Commit 2c611ad97a82 ("net, neigh: Extend neigh->flags to 32 bit to allow for extensions") enables a new KMSAM warning [1] I think the bug is actually older, because the following intruction only occurred if ndm->ndm_flags had NTF_PROXY set. pn->flags = ndm->ndm_flags; Let's clear all pneigh_entry fields at alloc time. [1] BUG: KMSAN: uninit-value in pneigh_fill_info+0x986/0xb30 net/core/neighbour.c:2593 pneigh_fill_info+0x986/0xb30 net/core/neighbour.c:2593 pneigh_dump_table net/core/neighbour.c:2715 [inline] neigh_dump_info+0x1e3f/0x2c60 net/core/neighbour.c:2832 netlink_dump+0xaca/0x16a0 net/netlink/af_netlink.c:2265 __netlink_dump_start+0xd1c/0xee0 net/netlink/af_netlink.c:2370 netlink_dump_start include/linux/netlink.h:254 [inline] rtnetlink_rcv_msg+0x181b/0x18c0 net/core/rtnetlink.c:5534 netlink_rcv_skb+0x447/0x800 net/netlink/af_netlink.c:2491 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5589 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x1095/0x1360 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x16f3/0x1870 net/netlink/af_netlink.c:1916 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] sock_write_iter+0x594/0x690 net/socket.c:1057 call_write_iter include/linux/fs.h:2162 [inline] new_sync_write fs/read_write.c:503 [inline] vfs_write+0x1318/0x2030 fs/read_write.c:590 ksys_write+0x28c/0x520 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __x64_sys_write+0xdb/0x120 fs/read_write.c:652 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae Uninit was created at: slab_post_alloc_hook mm/slab.h:524 [inline] slab_alloc_node mm/slub.c:3251 [inline] slab_alloc mm/slub.c:3259 [inline] __kmalloc+0xc3c/0x12d0 mm/slub.c:4437 kmalloc include/linux/slab.h:595 [inline] pneigh_lookup+0x60f/0xd70 net/core/neighbour.c:766 arp_req_set_public net/ipv4/arp.c:1016 [inline] arp_req_set+0x430/0x10a0 net/ipv4/arp.c:1032 arp_ioctl+0x8d4/0xb60 net/ipv4/arp.c:1232 inet_ioctl+0x4ef/0x820 net/ipv4/af_inet.c:947 sock_do_ioctl net/socket.c:1118 [inline] sock_ioctl+0xa3f/0x13e0 net/socket.c:1235 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl+0x2df/0x4a0 fs/ioctl.c:860 __x64_sys_ioctl+0xd8/0x110 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae CPU: 1 PID: 20001 Comm: syz-executor.0 Not tainted 5.16.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 62dd93181aaa ("[IPV6] NDISC: Set per-entry is_router flag in Proxy NA.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Roopa Prabhu <roopa@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211206165329.1049835-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfJakub Kicinski11-32/+82
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Fix bogus compilter warning in nfnetlink_queue, from Florian Westphal. 2) Don't run conntrack on vrf with !dflt qdisc, from Nicolas Dichtel. 3) Fix nft_pipapo bucket load in AVX2 lookup routine for six 8-bit groups, from Stefano Brivio. 4) Break rule evaluation on malformed TCP options. 5) Use socat instead of nc in selftests/netfilter/nft_zones_many.sh, also from Florian 6) Fix KCSAN data-race in conntrack timeout updates, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf: netfilter: conntrack: annotate data-races around ct->timeout selftests: netfilter: switch zone stress to socat netfilter: nft_exthdr: break evaluation if setting TCP option fails selftests: netfilter: Add correctness test for mac,net set type nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groups vrf: don't run conntrack on vrf with !dflt qdisc netfilter: nfnetlink_queue: silence bogus compiler warning ==================== Link: https://lore.kernel.org/r/20211209000847.102598-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge branch '100GbE' of ↵Jakub Kicinski9-47/+74
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-12-08 Yahui adds re-initialization of Flow Director for VF reset. Paul restores interrupts when enabling VFs. Dave re-adds bandwidth check for DCBNL and moves DSCP mode check earlier in the function. Jesse prevents reporting of dropped packets that occur during initialization and fixes reporting of statistics which could occur with frequent reads. Michal corrects setting of protocol type for UDP header and fixes lack of differentiation when adding filters for tunnels. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: safer stats processing ice: fix adding different tunnels ice: fix choosing UDP header type ice: ignore dropped packets during init ice: Fix problems with DSCP QoS implementation ice: rearm other interrupt cause register after enabling VFs ice: fix FDIR init missing when reset VF ==================== Link: https://lore.kernel.org/r/20211208211144.2629867-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski29-80/+659
Daniel Borkmann says: ==================== bpf 2021-12-08 We've added 12 non-merge commits during the last 22 day(s) which contain a total of 29 files changed, 659 insertions(+), 80 deletions(-). The main changes are: 1) Fix an off-by-two error in packet range markings and also add a batch of new tests for coverage of these corner cases, from Maxim Mikityanskiy. 2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh. 3) Fix two functional regressions and a build warning related to BTF kfunc for modules, from Kumar Kartikeya Dwivedi. 4) Fix outdated code and docs regarding BPF's migrate_disable() use on non- PREEMPT_RT kernels, from Sebastian Andrzej Siewior. 5) Add missing includes in order to be able to detangle cgroup vs bpf header dependencies, from Jakub Kicinski. 6) Fix regression in BPF sockmap tests caused by missing detachment of progs from sockets when they are removed from the map, from John Fastabend. 7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF dispatcher, from Björn Töpel. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Add selftests to cover packet access corner cases bpf: Fix the off-by-two error in range markings treewide: Add missing includes masked by cgroup -> bpf dependency tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets bpf: Fix bpf_check_mod_kfunc_call for built-in modules bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL mips, bpf: Fix reference to non-existing Kconfig symbol bpf: Make sure bpf_disable_instrumentation() is safe vs preemption. Documentation/locking/locktypes: Update migrate_disable() bits. bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap bpf, sockmap: Attach map progs to psock early for feature probes bpf, x86: Fix "no previous prototype" warning ==================== Link: https://lore.kernel.org/r/20211208155125.11826-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08cifs: fix ntlmssp auth when there is no key exchangePaulo Alcantara1-18/+36
Warn on the lack of key exchange during NTLMSSP authentication rather than aborting it as there are some servers that do not set it in CHALLENGE message. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-12-08net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"Russell King (Oracle)1-8/+13
This commit fixes a misunderstanding in commit 4a3e0aeddf09 ("net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's"). For Marvell DSA switches with the PHY_DETECT bit (for non-6250 family devices), controls whether the PPU polls the PHY to retrieve the link, speed, duplex and pause status to update the port configuration. This applies for both internal and external PHYs. For some switches such as 88E6352 and 88E6390X, PHY_DETECT has an additional function of enabling auto-media mode between the internal PHY and SERDES blocks depending on which first gains link. The original intention of commit 5d5b231da7ac (net: dsa: mv88e6xxx: use PHY_DETECT in mac_link_up/mac_link_down) was to allow this bit to be used to detect when this propagation is enabled, and allow software to update the port configuration. This has found to be necessary for some switches which do not automatically propagate status from the SERDES to the port, which includes the 88E6390. However, commit 4a3e0aeddf09 ("net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's") breaks this assumption. Maarten Zanders has confirmed that the issue he was addressing was for an 88E6250 switch, which does not have a PHY_DETECT bit in bit 12, but instead a link status bit. Therefore, mv88e6xxx_port_ppu_updates() does not report correctly. This patch resolves the above issues by reverting Maarten's change and instead making mv88e6xxx_port_ppu_updates() indicate whether the port is internal for the 88E6250 family of switches. Yes, you're right, I'm targeting the 6250 family. And yes, your suggestion would solve my case and is a better implementation for the other devices (as far as I can see). Fixes: 4a3e0aeddf09 ("net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Tested-by: Maarten Zanders <maarten.zanders@mind.be> Link: https://lore.kernel.org/r/E1muXm7-00EwJB-7n@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Revert "kbuild: Enable DT schema checks for %.dtb targets"Rob Herring1-5/+5
This reverts commit 53182e81f47d4ea0c727c49ad23cb782173ab849. This added tool dependencies on various build systems using %.dtb targets. Validation will need to be controlled by a kconfig or make variable instead, but for now let's just revert it. Signed-off-by: Rob Herring <robh@kernel.org>
2021-12-08sched,x86: Don't use cluster topology for x86 hybrid CPUsPeter Zijlstra1-0/+14
For x86 hybrid CPUs like Alder Lake, the order of CPU selection should be based strictly on CPU priority. Don't include cluster topology for hybrid CPUs to avoid interference with such CPU selection order. On Alder Lake, the Atom CPU cluster has more capacity (4 Atom CPUs) vs Big core cluster (2 hyperthread CPUs). This could potentially bias CPU selection towards Atom over Big Core, when Big core CPU has higher priority. Fixes: 66558b730f25 ("sched: Add cluster scheduler level for x86") Suggested-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Tested-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Link: https://lkml.kernel.org/r/20211204091402.GM16608@worktop.programming.kicks-ass.net
2021-12-08vdpa: Consider device id larger than 31Parav Pandit1-1/+2
virtio device id value can be more than 31. Hence, use BIT_ULL in assignment. Fixes: 33b347503f01 ("vdpa: Define vdpa mgmt device, ops and a netlink interface") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Parav Pandit <parav@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211130042949.88958-1-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-12-08virtio/vsock: fix the transport to work with VMADDR_CID_ANYWei Wang1-1/+2
The VMADDR_CID_ANY flag used by a socket means that the socket isn't bound to any specific CID. For example, a host vsock server may want to be bound with VMADDR_CID_ANY, so that a guest vsock client can connect to the host server with CID=VMADDR_CID_HOST (i.e. 2), and meanwhile, a host vsock client can connect to the same local server with CID=VMADDR_CID_LOCAL (i.e. 1). The current implementation sets the destination socket's svm_cid to a fixed CID value after the first client's connection, which isn't an expected operation. For example, if the guest client first connects to the host server, the server's svm_cid gets set to VMADDR_CID_HOST, then other host clients won't be able to connect to the server anymore. Reproduce steps: 1. Run the host server: socat VSOCK-LISTEN:1234,fork - 2. Run a guest client to connect to the host server: socat - VSOCK-CONNECT:2:1234 3. Run a host client to connect to the host server: socat - VSOCK-CONNECT:1:1234 Without this patch, step 3. above fails to connect, and socat complains "socat[1720] E connect(5, AF=40 cid:1 port:1234, 16): Connection reset by peer". With this patch, the above works well. Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") Signed-off-by: Wei Wang <wei.w.wang@intel.com> Link: https://lore.kernel.org/r/20211126011823.1760-1-wei.w.wang@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-12-08virtio_ring: Fix querying of maximum DMA mapping size for virtio deviceWill Deacon1-1/+1
virtio_max_dma_size() returns the maximum DMA mapping size of the virtio device by querying dma_max_mapping_size() for the device when the DMA API is in use for the vring. Unfortunately, the device passed is initialised by register_virtio_device() and does not inherit the DMA configuration from its parent, resulting in SWIOTLB errors when bouncing is enabled and the default 256K mapping limit (IO_TLB_SEGSIZE) is not respected: | virtio-pci 0000:00:01.0: swiotlb buffer is full (sz: 294912 bytes), total 1024 (slots), used 725 (slots) Follow the pattern used elsewhere in the virtio_ring code when calling into the DMA layer and pass the parent device to dma_max_mapping_size() instead. Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211201112018.25276-1-will@kernel.org Acked-by: Jason Wang <jasowang@redhat.com> Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Fixes: e6d6dd6c875e ("virtio: Introduce virtio_max_dma_size()") Cc: Joerg Roedel <jroedel@suse.de> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-12-08virtio: always enter drivers/virtio/Arnd Bergmann1-2/+1
When neither VIRTIO_PCI_LIB nor VIRTIO are enabled, but the alibaba vdpa driver is, the kernel runs into a link error because the legacy virtio module never gets built: x86_64-linux-ld: drivers/vdpa/alibaba/eni_vdpa.o: in function `eni_vdpa_set_features': eni_vdpa.c:(.text+0x23f): undefined reference to `vp_legacy_set_features' x86_64-linux-ld: drivers/vdpa/alibaba/eni_vdpa.o: in function `eni_vdpa_set_vq_state': eni_vdpa.c:(.text+0x2fe): undefined reference to `vp_legacy_get_queue_enable' x86_64-linux-ld: drivers/vdpa/alibaba/eni_vdpa.o: in function `eni_vdpa_set_vq_address': eni_vdpa.c:(.text+0x376): undefined reference to `vp_legacy_set_queue_address' x86_64-linux-ld: drivers/vdpa/alibaba/eni_vdpa.o: in function `eni_vdpa_set_vq_ready': eni_vdpa.c:(.text+0x3b4): undefined reference to `vp_legacy_set_queue_address' x86_64-linux-ld: drivers/vdpa/alibaba/eni_vdpa.o: in function `eni_vdpa_free_irq': eni_vdpa.c:(.text+0x460): undefined reference to `vp_legacy_queue_vector' x86_64-linux-ld: eni_vdpa.c:(.text+0x4b7): undefined reference to `vp_legacy_config_vector' x86_64-linux-ld: drivers/vdpa/alibaba/eni_vdpa.o: in function `eni_vdpa_reset': When VIRTIO_PCI_LIB was added, it was correctly added to drivers/Makefile as well, but for the legacy module, this is missing. Solve this by always entering drivers/virtio during the build and letting its Makefile take care of the individual options, rather than having a separate line for each sub-option. Fixes: 64b9f64f80a6 ("vdpa: introduce virtio pci driver") Fixes: e85087beedca ("eni_vdpa: add vDPA driver for Alibaba ENI") Fixes: d89c8169bd70 ("virtio-pci: introduce legacy device module") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211206085034.2836099-1-arnd@kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-12-08vduse: check that offset is within bounds in get_config()Dan Carpenter1-1/+2
This condition checks "len" but it does not check "offset" and that could result in an out of bounds read if "offset > dev->config_size". The problem is that since both variables are unsigned the "dev->config_size - offset" subtraction would result in a very high unsigned value. I think these checks might not be necessary because "len" and "offset" are supposed to already have been validated using the vhost_vdpa_config_validate() function. But I do not know the code perfectly, and I like to be safe. Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211208150956.GA29160@kili Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: stable@vger.kernel.org
2021-12-08vdpa: check that offsets are within boundsDan Carpenter1-1/+1
In this function "c->off" is a u32 and "size" is a long. On 64bit systems if "c->off" is greater than "size" then "size - c->off" is a negative and we always return -E2BIG. But on 32bit systems the subtraction is type promoted to a high positive u32 value and basically any "c->len" is accepted. Fixes: 4c8cf31885f6 ("vhost: introduce vDPA-based backend") Reported-by: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211208103337.GA4047@kili Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: stable@vger.kernel.org
2021-12-08vduse: fix memory corruption in vduse_dev_ioctl()Dan Carpenter1-1/+2
The "config.offset" comes from the user. There needs to a check to prevent it being out of bounds. The "config.offset" and "dev->config_size" variables are both type u32. So if the offset if out of bounds then the "dev->config_size - config.offset" subtraction results in a very high u32 value. The out of bounds offset can result in memory corruption. Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211208103307.GA3778@kili Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: stable@vger.kernel.org
2021-12-08ice: safer stats processingJesse Brandeburg1-11/+18
The driver was zeroing live stats that could be fetched by ndo_get_stats64 at any time. This could result in inconsistent statistics, and the telltale sign was when reading stats frequently from /proc/net/dev, the stats would go backwards. Fix by collecting stats into a local, and delaying when we write to the structure so it's not incremental. Fixes: fcea6f3da546 ("ice: Add stats and ethtool support") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-08drm/i915: Fix error pointer dereference in i915_gem_do_execbuffer()Dan Carpenter1-0/+1
Originally "out_fence" was set using out_fence = sync_file_create() but which returns NULL, but now it is set with out_fence = eb_requests_create() which returns error pointers. The error path needs to be modified to avoid an Oops in the "goto err_request;" path. Fixes: 544460c33821 ("drm/i915: Multi-BB execbuf") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211202044831.29583-1-matthew.brost@intel.com (cherry picked from commit 8722ded49ce8a0c706b373e8087eb810684962ff) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-12-08HID: Ignore battery for Elan touchscreen on Asus UX550VEHans de Goede2-0/+3
Battery status is reported for the Asus UX550VE touchscreen even though it does not have a battery. Prevent it from always reporting the battery as low. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1897823 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2021-12-08ftrace: Add cleanup to unregister_ftrace_direct_multiJiri Olsa1-0/+4
Adding ops cleanup to unregister_ftrace_direct_multi, so it can be reused in another register call. Link: https://lkml.kernel.org/r/20211206182032.87248-3-jolsa@kernel.org Cc: Ingo Molnar <mingo@redhat.com> Cc: Heiko Carstens <hca@linux.ibm.com> Fixes: f64dd4627ec6 ("ftrace: Add multi direct register/unregister interface") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-12-08ftrace: Use direct_ops hash in unregister_ftrace_directJiri Olsa1-1/+3
Now when we have *direct_multi interface the direct_functions hash is no longer owned just by direct_ops. It's also used by any other ftrace_ops passed to *direct_multi interface. Thus to find out that we are unregistering the last function from direct_ops, we need to check directly direct_ops's hash. Link: https://lkml.kernel.org/r/20211206182032.87248-2-jolsa@kernel.org Cc: Ingo Molnar <mingo@redhat.com> Cc: Heiko Carstens <hca@linux.ibm.com> Fixes: f64dd4627ec6 ("ftrace: Add multi direct register/unregister interface") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-12-08drm/syncobj: Deal with signalled fences in drm_syncobj_find_fence.Bas Nieuwenhuizen1-1/+10
dma_fence_chain_find_seqno only ever returns the top fence in the chain or an unsignalled fence. Hence if we request a seqno that is already signalled it returns a NULL fence. Some callers are not prepared to handle this, like the syncobj transfer functions for example. This behavior is "new" with timeline syncobj and it looks like not all callers were updated. To fix this behavior make sure that a successful drm_sync_find_fence always returns a non-NULL fence. v2: Move the fix to drm_syncobj_find_fence from the transfer functions. Fixes: ea569910cbab ("drm/syncobj: add transition iotcls between binary and timeline v2") Cc: stable@vger.kernel.org Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211208023935.17018-1-bas@basnieuwenhuizen.nl
2021-12-08nvmet-tcp: fix possible list corruption for unexpected command failureSagi Grimberg1-1/+8
nvmet_tcp_handle_req_failure needs to understand weather to prepare for incoming data or the next pdu. However if we misidentify this, we will wait for 0-length data, and queue the response although nvmet_req_init already did that. The particular command was namespace management command with no data, which was incorrectly categorized as a command with incapsule data. Also, add a code comment of what we are trying to do here. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-12-08btrfs: replace the BUG_ON in btrfs_del_root_ref with proper error handlingQu Wenruo1-1/+2
I hit the BUG_ON() with generic/475 test case, and to my surprise, all callers of btrfs_del_root_ref() are already aborting transaction, thus there is not need for such BUG_ON(), just go to @out label and caller will properly handle the error. CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-12-08btrfs: zoned: clear data relocation bg on zone finishJohannes Thumshirn1-0/+2
When finishing a zone that is used by a dedicated data relocation block group, also remove its reference from fs_info, so we're not trying to use a full block group for allocations during data relocation, which will always fail. The result is we're not making any forward progress and end up in a deadlock situation. Fixes: c2707a255623 ("btrfs: zoned: add a dedicated data relocation block group") Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-12-08btrfs: free exchange changeset on failuresJohannes Thumshirn1-3/+9
Fstests runs on my VMs have show several kmemleak reports like the following. unreferenced object 0xffff88811ae59080 (size 64): comm "xfs_io", pid 12124, jiffies 4294987392 (age 6.368s) hex dump (first 32 bytes): 00 c0 1c 00 00 00 00 00 ff cf 1c 00 00 00 00 00 ................ 90 97 e5 1a 81 88 ff ff 90 97 e5 1a 81 88 ff ff ................ backtrace: [<00000000ac0176d2>] ulist_add_merge+0x60/0x150 [btrfs] [<0000000076e9f312>] set_state_bits+0x86/0xc0 [btrfs] [<0000000014fe73d6>] set_extent_bit+0x270/0x690 [btrfs] [<000000004f675208>] set_record_extent_bits+0x19/0x20 [btrfs] [<00000000b96137b1>] qgroup_reserve_data+0x274/0x310 [btrfs] [<0000000057e9dcbb>] btrfs_check_data_free_space+0x5c/0xa0 [btrfs] [<0000000019c4511d>] btrfs_delalloc_reserve_space+0x1b/0xa0 [btrfs] [<000000006d37e007>] btrfs_dio_iomap_begin+0x415/0x970 [btrfs] [<00000000fb8a74b8>] iomap_iter+0x161/0x1e0 [<0000000071dff6ff>] __iomap_dio_rw+0x1df/0x700 [<000000002567ba53>] iomap_dio_rw+0x5/0x20 [<0000000072e555f8>] btrfs_file_write_iter+0x290/0x530 [btrfs] [<000000005eb3d845>] new_sync_write+0x106/0x180 [<000000003fb505bf>] vfs_write+0x24d/0x2f0 [<000000009bb57d37>] __x64_sys_pwrite64+0x69/0xa0 [<000000003eba3fdf>] do_syscall_64+0x43/0x90 In case brtfs_qgroup_reserve_data() or btrfs_delalloc_reserve_metadata() fail the allocated extent_changeset will not be freed. So in btrfs_check_data_free_space() and btrfs_delalloc_reserve_space() free the allocated extent_changeset to get rid of the allocated memory. The issue currently only happens in the direct IO write path, but only after 65b3c08606e5 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range"), and also at defrag_one_locked_target(). Every other place is always calling extent_changeset_free() even if its call to btrfs_delalloc_reserve_space() or btrfs_check_data_free_space() has failed. CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-12-08btrfs: fix re-dirty process of tree-log nodesNaohiro Aota1-2/+3
There is a report of a transaction abort of -EAGAIN with the following script. #!/bin/sh for d in sda sdb; do mkfs.btrfs -d single -m single -f /dev/\${d} done mount /dev/sda /mnt/test mount /dev/sdb /mnt/scratch for dir in test scratch; do echo 3 >/proc/sys/vm/drop_caches fio --directory=/mnt/\${dir} --name=fio.\${dir} --rw=read --size=50G --bs=64m \ --numjobs=$(nproc) --time_based --ramp_time=5 --runtime=480 \ --group_reporting |& tee /dev/shm/fio.\${dir} echo 3 >/proc/sys/vm/drop_caches done for d in sda sdb; do umount /dev/\${d} done The stack trace is shown in below. [3310.967991] BTRFS: error (device sda) in btrfs_commit_transaction:2341: errno=-11 unknown (Error while writing out transaction) [3310.968060] BTRFS info (device sda): forced readonly [3310.968064] BTRFS warning (device sda): Skipping commit of aborted transaction. [3310.968065] ------------[ cut here ]------------ [3310.968066] BTRFS: Transaction aborted (error -11) [3310.968074] WARNING: CPU: 14 PID: 1684 at fs/btrfs/transaction.c:1946 btrfs_commit_transaction.cold+0x209/0x2c8 [3310.968131] CPU: 14 PID: 1684 Comm: fio Not tainted 5.14.10-300.fc35.x86_64 #1 [3310.968135] Hardware name: DIAWAY Tartu/Tartu, BIOS V2.01.B10 04/08/2021 [3310.968137] RIP: 0010:btrfs_commit_transaction.cold+0x209/0x2c8 [3310.968144] RSP: 0018:ffffb284ce393e10 EFLAGS: 00010282 [3310.968147] RAX: 0000000000000026 RBX: ffff973f147b0f60 RCX: 0000000000000027 [3310.968149] RDX: ffff974ecf098a08 RSI: 0000000000000001 RDI: ffff974ecf098a00 [3310.968150] RBP: ffff973f147b0f08 R08: 0000000000000000 R09: ffffb284ce393c48 [3310.968151] R10: ffffb284ce393c40 R11: ffffffff84f47468 R12: ffff973f101bfc00 [3310.968153] R13: ffff971f20cf2000 R14: 00000000fffffff5 R15: ffff973f147b0e58 [3310.968154] FS: 00007efe65468740(0000) GS:ffff974ecf080000(0000) knlGS:0000000000000000 [3310.968157] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [3310.968158] CR2: 000055691bcbe260 CR3: 000000105cfa4001 CR4: 0000000000770ee0 [3310.968160] PKRU: 55555554 [3310.968161] Call Trace: [3310.968167] ? dput+0xd4/0x300 [3310.968174] btrfs_sync_file+0x3f1/0x490 [3310.968180] __x64_sys_fsync+0x33/0x60 [3310.968185] do_syscall_64+0x3b/0x90 [3310.968190] entry_SYSCALL_64_after_hwframe+0x44/0xae [3310.968194] RIP: 0033:0x7efe6557329b [3310.968200] RSP: 002b:00007ffe0236ebc0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a [3310.968203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007efe6557329b [3310.968204] RDX: 0000000000000000 RSI: 00007efe58d77010 RDI: 0000000000000006 [3310.968205] RBP: 0000000004000000 R08: 0000000000000000 R09: 00007efe58d77010 [3310.968207] R10: 0000000016cacc0c R11: 0000000000000293 R12: 00007efe5ce95980 [3310.968208] R13: 0000000000000000 R14: 00007efe6447c790 R15: 0000000c80000000 [3310.968212] ---[ end trace 1a346f4d3c0d96ba ]--- [3310.968214] BTRFS: error (device sda) in cleanup_transaction:1946: errno=-11 unknown The abort occurs because of a write hole while writing out freeing tree nodes of a tree-log tree. For zoned btrfs, we re-dirty a freed tree node to ensure btrfs can write the region and does not leave a hole on write on a zoned device. The current code fails to re-dirty a node when the tree-log tree's depth is greater or equal to 2. That leads to a transaction abort with -EAGAIN. Fix the issue by properly re-dirtying a node on walking up the tree. Fixes: d3575156f662 ("btrfs: zoned: redirty released extent buffers") CC: stable@vger.kernel.org # 5.12+ Link: https://github.com/kdave/btrfs-progs/issues/415 Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-12-08btrfs: call mapping_set_error() on btree inode with a write errorJosef Bacik1-0/+8
generic/484 fails sometimes with compression on because the write ends up small enough that it goes into the btree. This means that we never call mapping_set_error() on the inode itself, because the page gets marked as fine when we inline it into the metadata. When the metadata writeback happens we see it and abort the transaction properly and mark the fs as readonly, however we don't do the mapping_set_error() on anything. In syncfs() we will simply return 0 if the sb is marked read-only, so we can't check for this in our syncfs callback. The only way the error gets returned if we called mapping_set_error() on something. Fix this by calling mapping_set_error() on the btree inode mapping. This allows us to properly return an error on syncfs and pass generic/484 with compression on. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-12-08btrfs: clear extent buffer uptodate when we fail to write itJosef Bacik1-0/+6
I got dmesg errors on generic/281 on our overnight fstests. Looking at the history this happens occasionally, with errors like this WARNING: CPU: 0 PID: 673217 at fs/btrfs/extent_io.c:6848 assert_eb_page_uptodate+0x3f/0x50 CPU: 0 PID: 673217 Comm: kworker/u4:13 Tainted: G W 5.16.0-rc2+ #469 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Workqueue: btrfs-cache btrfs_work_helper RIP: 0010:assert_eb_page_uptodate+0x3f/0x50 RSP: 0018:ffffae598230bc60 EFLAGS: 00010246 RAX: 0017ffffc0002112 RBX: ffffebaec4100900 RCX: 0000000000001000 RDX: ffffebaec45733c7 RSI: ffffebaec4100900 RDI: ffff9fd98919f340 RBP: 0000000000000d56 R08: ffff9fd98e300000 R09: 0000000000000000 R10: 0001207370a91c50 R11: 0000000000000000 R12: 00000000000007b0 R13: ffff9fd98919f340 R14: 0000000001500000 R15: 0000000001cb0000 FS: 0000000000000000(0000) GS:ffff9fd9fbc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f549fcf8940 CR3: 0000000114908004 CR4: 0000000000370ef0 Call Trace: extent_buffer_test_bit+0x3f/0x70 free_space_test_bit+0xa6/0xc0 load_free_space_tree+0x1d6/0x430 caching_thread+0x454/0x630 ? rcu_read_lock_sched_held+0x12/0x60 ? rcu_read_lock_sched_held+0x12/0x60 ? rcu_read_lock_sched_held+0x12/0x60 ? lock_release+0x1f0/0x2d0 btrfs_work_helper+0xf2/0x3e0 ? lock_release+0x1f0/0x2d0 ? finish_task_switch.isra.0+0xf9/0x3a0 process_one_work+0x270/0x5a0 worker_thread+0x55/0x3c0 ? process_one_work+0x5a0/0x5a0 kthread+0x174/0x1a0 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x1f/0x30 This happens because we're trying to read from a extent buffer page that is !PageUptodate. This happens because we will clear the page uptodate when we have an IO error, but we don't clear the extent buffer uptodate. If we do a read later and find this extent buffer we'll think its valid and not return an error, and then trip over this warning. Fix this by also clearing uptodate on the extent buffer when this happens, so that we get an error when we do a btrfs_search_slot() and find this block later. CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-12-08bpf: Add selftests to cover packet access corner casesMaxim Mikityanskiy1-16/+584
This commit adds BPF verifier selftests that cover all corner cases by packet boundary checks. Specifically, 8-byte packet reads are tested at the beginning of data and at the beginning of data_meta, using all kinds of boundary checks (all comparison operators: <, >, <=, >=; both permutations of operands: data + length compared to end, end compared to data + length). For each case there are three tests: 1. Length is just enough for an 8-byte read. Length is either 7 or 8, depending on the comparison. 2. Length is increased by 1 - should still pass the verifier. These cases are useful, because they failed before commit 2fa7d94afc1a ("bpf: Fix the off-by-two error in range markings"). 3. Length is decreased by 1 - should be rejected by the verifier. Some existing tests are just renamed to avoid duplication. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211207081521.41923-1-maximmi@nvidia.com
2021-12-08btrfs: fail if fstrim_range->start == U64_MAXJosef Bacik1-0/+3
We've always been failing generic/260 because it's testing things we actually don't care about and thus won't fail for. However we probably should fail for fstrim_range->start == U64_MAX since we clearly can't trim anything past that. This in combination with an update to generic/260 will allow us to pass this test properly. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-12-08btrfs: fix error pointer dereference in btrfs_ioctl_rm_dev_v2()Dan Carpenter1-4/+2
If memdup_user() fails the error handing will crash when it tries to kfree() an error pointer. Just return directly because there is no cleanup required. Fixes: 1a15eb724aae ("btrfs: use btrfs_get_dev_args_from_path in dev removal ioctls") Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-12-08thermal: int340x: Fix VCoRefLow MMIO bit offset for TGLSumeet Pawnikar1-1/+1
The VCoRefLow CPU FIVR register definition for Tiger Lake is incorrect. Current implementation reads it from MMIO offset 0x5A18 and bit offset [12:14], but the actual correct register definition is from bit offset [11:13]. Update to fix the bit offset. Fixes: 473be51142ad ("thermal: int340x: processor_thermal: Add RFIM driver") Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Cc: 5.14+ <stable@vger.kernel.org> # 5.14+ [ rjw: New subject, changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-08PM: runtime: Fix pm_runtime_active() kerneldoc commentRafael J. Wysocki1-1/+1
The kerneldoc comment of pm_runtime_active() does not reflect the behavior of the function, so update it accordingly. Fixes: 403d2d116ec0 ("PM: runtime: Add kerneldoc comments to multiple helpers") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-12-08ACPI: tools: Fix compilation when output directory is not presentChen Yu2-0/+2
Compiling the ACPI tools when output directory parameter is specified, but the output directory is not present, triggers the following error: make O=/data/test/tmp/ -C tools/power/acpi/ make: Entering directory '/data/src/kernel/linux/tools/power/acpi' DESCEND tools/acpidbg make[1]: Entering directory '/data/src/kernel/linux/tools/power/acpi/tools/acpidbg' MKDIR include CP include CC tools/acpidbg/acpidbg.o Assembler messages: Fatal error: can't create /data/test/tmp/tools/power/acpi/tools/acpidbg/acpidbg.o: No such file or directory make[1]: *** [../../Makefile.rules:24: /data/test/tmp/tools/power/acpi/tools/acpidbg/acpidbg.o] Error 1 make[1]: Leaving directory '/data/src/kernel/linux/tools/power/acpi/tools/acpidbg' make: *** [Makefile:18: acpidbg] Error 2 make: Leaving directory '/data/src/kernel/linux/tools/power/acpi' which occurs because the output directory has not been created yet. Fix this issue by creating the output directory before compiling. Reported-by: Andy Shevchenko <andriy.shevchenko@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> [ rjw: New subject, changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-08tracefs: Set all files to the same group ownership as the mount optionSteven Rostedt (VMware)1-0/+72
As people have been asking to allow non-root processes to have access to the tracefs directory, it was considered best to only allow groups to have access to the directory, where it is easier to just set the tracefs file system to a specific group (as other would be too dangerous), and that way the admins could pick which processes would have access to tracefs. Unfortunately, this broke tooling on Android that expected the other bit to be set. For some special cases, for non-root tools to trace the system, tracefs would be mounted and change the permissions of the top level directory which gave access to all running tasks permission to the tracing directory. Even though this would be dangerous to do in a production environment, for testing environments this can be useful. Now with the new changes to not allow other (which is still the proper thing to do), it breaks the testing tooling. Now more code needs to be loaded on the system to change ownership of the tracing directory. The real solution is to have tracefs honor the gid=xxx option when mounting. That is, (tracing group tracing has value 1003) mount -t tracefs -o gid=1003 tracefs /sys/kernel/tracing should have it that all files in the tracing directory should be of the given group. Copy the logic from d_walk() from dcache.c and simplify it for the mount case of tracefs if gid is set. All the files in tracefs will be walked and their group will be set to the value passed in. Link: https://lkml.kernel.org/r/20211207171729.2a54e1b3@gandalf.local.home Cc: Ingo Molnar <mingo@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-fsdevel@vger.kernel.org Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reported-by: Kalesh Singh <kaleshsingh@google.com> Reported-by: Yabin Cui <yabinc@google.com> Fixes: 49d67e445742 ("tracefs: Have tracefs directories not set OTH permission bits by default") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-12-08tracefs: Have new files inherit the ownership of their parentSteven Rostedt (VMware)1-0/+4
If directories in tracefs have their ownership changed, then any new files and directories that are created under those directories should inherit the ownership of the director they are created in. Link: https://lkml.kernel.org/r/20211208075720.4855d180@gandalf.local.home Cc: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Yabin Cui <yabinc@google.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: stable@vger.kernel.org Fixes: 4282d60689d4f ("tracefs: Add new tracefs file system") Reported-by: Kalesh Singh <kaleshsingh@google.com> Reported: https://lore.kernel.org/all/CAC_TJve8MMAv+H_NdLSJXZUSoxOEq2zB_pVaJ9p=7H6Bu3X76g@mail.gmail.com/ Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-12-08Merge tag 'iio-fixes-for-5.16b' of ↵Greg Kroah-Hartman15-44/+36
https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next Jonathan writes: 2nd set of IIO fixes for 5.16 Note 1st set were before the merge window. Biggest set in here fix what happens when things go wrong in the interrupt handlers for an IIO trigger. Otherwise normal mix of recent and ancient bugs. trigger core - Fix reference counting bug that was preventing the iio_trig structures from being released. adxrs290 - Correctly sign extend the rate and temperature data. at91-sama5d2 - Fix sign extension from the wrong bit and use the scan_type values to avoid it being open coded in two places (which were out of sync) axp20x_adc - Fix current reporting bit depth. dln2-adc - Fix a lock ordering issue and lockdep complaint that results. - Add error handling for failure to register the trigger. imx8qxp - Wrong config dependency kxcjk-1013 - Potential leak due to wrong guard on cleanup. ltr501, kxsd9, stk3310, itg3200, ad7768 - Don't return error codes from interrupt handler and call iio_trigger_notify_done() on all paths to avoid leaving trigger disabled on an intermittent fault. mma8452 - Fix missing iio_trigger_get() that could lead to use after free. stm32 - Fix a current leak. - Avoid null pointer derefence on defer_probe error due to wrong struct device being passed. stm32-timer - Drop space in MODULE_ALIAS. * tag 'iio-fixes-for-5.16b' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: iio: trigger: stm32-timer: fix MODULE_ALIAS iio: adc: stm32: fix null pointer on defer_probe error iio: at91-sama5d2: Fix incorrect sign extension iio: adc: axp20x_adc: fix charging current reporting on AXP22x iio: gyro: adxrs290: fix data signedness iio: ad7768-1: Call iio_trigger_notify_done() on error iio: itg3200: Call iio_trigger_notify_done() on error iio: imx8qxp-adc: fix dependency to the intended ARCH_MXC config iio: dln2: Check return value of devm_iio_trigger_register() iio: trigger: Fix reference counting iio: dln2-adc: Fix lockdep complaint iio: adc: stm32: fix a current leak by resetting pcsel before disabling vdda iio: mma8452: Fix trigger reference couting iio: stk3310: Don't return error code in interrupt handler iio: kxsd9: Don't return error code in trigger handler iio: ltr501: Don't return error code in trigger handler iio: accel: kxcjk-1013: Fix possible memory leak in probe and remove
2021-12-08irqchip/irq-gic-v3-its.c: Force synchronisation when issuing INVALLWudi Wang1-1/+1
INVALL CMD specifies that the ITS must ensure any caching associated with the interrupt collection defined by ICID is consistent with the LPI configuration tables held in memory for all Redistributors. SYNC is required to ensure that INVALL is executed. Currently, LPI configuration data may be inconsistent with that in the memory within a short period of time after the INVALL command is executed. Signed-off-by: Wudi Wang <wangwudi@hisilicon.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Fixes: cc2d3216f53c ("irqchip: GICv3: ITS command queue") Link: https://lore.kernel.org/r/20211208015429.5007-1-zhangshaokun@hisilicon.com
2021-12-08KVM: nVMX: Don't use Enlightened MSR Bitmap for L3Vitaly Kuznetsov1-9/+13
When KVM runs as a nested hypervisor on top of Hyper-V it uses Enlightened VMCS and enables Enlightened MSR Bitmap feature for its L1s and L2s (which are actually L2s and L3s from Hyper-V's perspective). When MSR bitmap is updated, KVM has to reset HV_VMX_ENLIGHTENED_CLEAN_FIELD_MSR_BITMAP from clean fields to make Hyper-V aware of the change. For KVM's L1s, this is done in vmx_disable_intercept_for_msr()/vmx_enable_intercept_for_msr(). MSR bitmap for L2 is build in nested_vmx_prepare_msr_bitmap() by blending MSR bitmap for L1 and L1's idea of MSR bitmap for L2. KVM, however, doesn't check if the resulting bitmap is different and never cleans HV_VMX_ENLIGHTENED_CLEAN_FIELD_MSR_BITMAP in eVMCS02. This is incorrect and may result in Hyper-V missing the update. The issue could've been solved by calling evmcs_touch_msr_bitmap() for eVMCS02 from nested_vmx_prepare_msr_bitmap() unconditionally but doing so would not give any performance benefits (compared to not using Enlightened MSR Bitmap at all). 3-level nesting is also not a very common setup nowadays. Don't enable 'Enlightened MSR Bitmap' feature for KVM's L2s (real L3s) for now. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211129094704.326635-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08csky: fix typo of fpu config macroKelly Devilliv1-2/+2
Fix typo which will cause fpe and privilege exception error. Signed-off-by: Kelly Devilliv <kelly.devilliv@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2021-12-07net: fec: only clear interrupt of handling queue in fec_enet_rx_queue()Joakim Zhang2-1/+4
Background: We have a customer is running a Profinet stack on the 8MM which receives and responds PNIO packets every 4ms and PNIO-CM packets every 40ms. However, from time to time the received PNIO-CM package is "stock" and is only handled when receiving a new PNIO-CM or DCERPC-Ping packet (tcpdump shows the PNIO-CM and the DCERPC-Ping packet at the same time but the PNIO-CM HW timestamp is from the expected 40 ms and not the 2s delay of the DCERPC-Ping). After debugging, we noticed PNIO, PNIO-CM and DCERPC-Ping packets would be handled by different RX queues. The root cause should be driver ack all queues' interrupt when handle a specific queue in fec_enet_rx_queue(). The blamed patch is introduced to receive as much packets as possible once to avoid interrupt flooding. But it's unreasonable to clear other queues'interrupt when handling one queue, this patch tries to fix it. Fixes: ed63f1dcd578 (net: fec: clear receive interrupts before processing a packet) Cc: Russell King <rmk+kernel@arm.linux.org.uk> Reported-by: Nicolas Diaz <nicolas.diaz@nxp.com> Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Link: https://lore.kernel.org/r/20211206135457.15946-1-qiangqing.zhang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07Merge branch '40GbE' of ↵Jakub Kicinski5-38/+91
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-12-06 This series contains updates to iavf and i40e drivers. Mitch adds restoration of MSI state during reset for iavf. Michal fixes checking and reporting of descriptor count changes to communicate changes and/or issues for iavf. Karen resolves an issue with failed handling of VF requests while a VF reset is occurring for i40e. Mateusz removes clearing of VF requested queue count when configuring VF ADQ for i40e. Norbert fixes a NULL pointer dereference that can occur when getting VSI descriptors for i40e. ==================== Link: https://lore.kernel.org/r/20211206183519.2733180-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07Merge branch 'net-phy-fix-doc-build-warning'Jakub Kicinski2-5/+7
Yanteng Si says: ==================== net: phy: Fix doc build warnings ==================== Link: https://lore.kernel.org/r/cover.1638776933.git.siyanteng@loongson.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07net: phy: Add the missing blank line in the phylink_suspend commentYanteng Si1-0/+1
Fix warning as: Documentation/networking/kapi:147: ./drivers/net/phy/phylink.c:1657: WARNING: Unexpected indentation. Documentation/networking/kapi:147: ./drivers/net/phy/phylink.c:1658: WARNING: Block quote ends without a blank line; unexpected unindent. Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07net: phy: Remove unnecessary indentation in the comments of phy_deviceYanteng Si1-5/+6
Fix warning as: linux-next/Documentation/networking/kapi:122: ./include/linux/phy.h:543: WARNING: Unexpected indentation. linux-next/Documentation/networking/kapi:122: ./include/linux/phy.h:544: WARNING: Block quote ends without a blank line; unexpected unindent. linux-next/Documentation/networking/kapi:122: ./include/linux/phy.h:546: WARNING: Unexpected indentation. Suggested-by: Akira Yokosawa <akiyks@gmail.com> Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07gve: fix for null pointer dereference.Ameer Hamza1-0/+3
Avoid passing NULL skb to __skb_put() function call if napi_alloc_skb() returns NULL. Fixes: 37149e9374bf ("gve: Implement packet continuation for RX.") Signed-off-by: Ameer Hamza <amhamza.mgc@gmail.com> Link: https://lore.kernel.org/r/20211205183810.8299-1-amhamza.mgc@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07cifs: Fix crash on unload of cifs_arc4.koVincent Whitchurch1-13/+0
The exit function is wrongly placed in the __init section and this leads to a crash when the module is unloaded. Just remove both the init and exit functions since this module does not need them. Fixes: 71c02863246167b3d ("cifs: fork arc4 and create a separate module...") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Cc: stable@vger.kernel.org # 5.15 Signed-off-by: Steve French <stfrench@microsoft.com>
2021-12-07MAINTAINERS: net: mlxsw: Remove Jiri as a maintainer, add myselfPetr Machata1-1/+1
Jiri has moved on and will not carry out the mlxsw maintainership duty any longer. Add myself as a co-maintainer instead. Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/45b54312cdebaf65c5d110b15a5dd2df795bf2be.1638807297.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07Merge branch 'net-tls-cover-all-ciphers-with-tests'Jakub Kicinski1-0/+36
Vadim Fedorenko says: ==================== net: tls: cover all ciphers with tests Recent patches to Kernel TLS showed that some ciphers are not covered with tests. Let's cover missed. ==================== Link: https://lore.kernel.org/r/20211206213932.7508-1-vfedorenko@novek.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07selftests: tls: add missing AES256-GCM cipherVadim Fedorenko1-0/+18
Add tests for TLSv1.2 and TLSv1.3 with AES256-GCM cipher Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07selftests: tls: add missing AES-CCM cipher testsVadim Fedorenko1-0/+18
Add tests for TLSv1.2 and TLSv1.3 with AES-CCM cipher. Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08netfilter: conntrack: annotate data-races around ct->timeoutEric Dumazet4-9/+9
(struct nf_conn)->timeout can be read/written locklessly, add READ_ONCE()/WRITE_ONCE() to prevent load/store tearing. BUG: KCSAN: data-race in __nf_conntrack_alloc / __nf_conntrack_find_get write to 0xffff888132e78c08 of 4 bytes by task 6029 on cpu 0: __nf_conntrack_alloc+0x158/0x280 net/netfilter/nf_conntrack_core.c:1563 init_conntrack+0x1da/0xb30 net/netfilter/nf_conntrack_core.c:1635 resolve_normal_ct+0x502/0x610 net/netfilter/nf_conntrack_core.c:1746 nf_conntrack_in+0x1c5/0x88f net/netfilter/nf_conntrack_core.c:1901 ipv6_conntrack_local+0x19/0x20 net/netfilter/nf_conntrack_proto.c:414 nf_hook_entry_hookfn include/linux/netfilter.h:142 [inline] nf_hook_slow+0x72/0x170 net/netfilter/core.c:619 nf_hook include/linux/netfilter.h:262 [inline] NF_HOOK include/linux/netfilter.h:305 [inline] ip6_xmit+0xa3a/0xa60 net/ipv6/ip6_output.c:324 inet6_csk_xmit+0x1a2/0x1e0 net/ipv6/inet6_connection_sock.c:135 __tcp_transmit_skb+0x132a/0x1840 net/ipv4/tcp_output.c:1402 tcp_transmit_skb net/ipv4/tcp_output.c:1420 [inline] tcp_write_xmit+0x1450/0x4460 net/ipv4/tcp_output.c:2680 __tcp_push_pending_frames+0x68/0x1c0 net/ipv4/tcp_output.c:2864 tcp_push_pending_frames include/net/tcp.h:1897 [inline] tcp_data_snd_check+0x62/0x2e0 net/ipv4/tcp_input.c:5452 tcp_rcv_established+0x880/0x10e0 net/ipv4/tcp_input.c:5947 tcp_v6_do_rcv+0x36e/0xa50 net/ipv6/tcp_ipv6.c:1521 sk_backlog_rcv include/net/sock.h:1030 [inline] __release_sock+0xf2/0x270 net/core/sock.c:2768 release_sock+0x40/0x110 net/core/sock.c:3300 sk_stream_wait_memory+0x435/0x700 net/core/stream.c:145 tcp_sendmsg_locked+0xb85/0x25a0 net/ipv4/tcp.c:1402 tcp_sendmsg+0x2c/0x40 net/ipv4/tcp.c:1440 inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:644 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] __sys_sendto+0x21e/0x2c0 net/socket.c:2036 __do_sys_sendto net/socket.c:2048 [inline] __se_sys_sendto net/socket.c:2044 [inline] __x64_sys_sendto+0x74/0x90 net/socket.c:2044 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff888132e78c08 of 4 bytes by task 17446 on cpu 1: nf_ct_is_expired include/net/netfilter/nf_conntrack.h:286 [inline] ____nf_conntrack_find net/netfilter/nf_conntrack_core.c:776 [inline] __nf_conntrack_find_get+0x1c7/0xac0 net/netfilter/nf_conntrack_core.c:807 resolve_normal_ct+0x273/0x610 net/netfilter/nf_conntrack_core.c:1734 nf_conntrack_in+0x1c5/0x88f net/netfilter/nf_conntrack_core.c:1901 ipv6_conntrack_local+0x19/0x20 net/netfilter/nf_conntrack_proto.c:414 nf_hook_entry_hookfn include/linux/netfilter.h:142 [inline] nf_hook_slow+0x72/0x170 net/netfilter/core.c:619 nf_hook include/linux/netfilter.h:262 [inline] NF_HOOK include/linux/netfilter.h:305 [inline] ip6_xmit+0xa3a/0xa60 net/ipv6/ip6_output.c:324 inet6_csk_xmit+0x1a2/0x1e0 net/ipv6/inet6_connection_sock.c:135 __tcp_transmit_skb+0x132a/0x1840 net/ipv4/tcp_output.c:1402 __tcp_send_ack+0x1fd/0x300 net/ipv4/tcp_output.c:3956 tcp_send_ack+0x23/0x30 net/ipv4/tcp_output.c:3962 __tcp_ack_snd_check+0x2d8/0x510 net/ipv4/tcp_input.c:5478 tcp_ack_snd_check net/ipv4/tcp_input.c:5523 [inline] tcp_rcv_established+0x8c2/0x10e0 net/ipv4/tcp_input.c:5948 tcp_v6_do_rcv+0x36e/0xa50 net/ipv6/tcp_ipv6.c:1521 sk_backlog_rcv include/net/sock.h:1030 [inline] __release_sock+0xf2/0x270 net/core/sock.c:2768 release_sock+0x40/0x110 net/core/sock.c:3300 tcp_sendpage+0x94/0xb0 net/ipv4/tcp.c:1114 inet_sendpage+0x7f/0xc0 net/ipv4/af_inet.c:833 rds_tcp_xmit+0x376/0x5f0 net/rds/tcp_send.c:118 rds_send_xmit+0xbed/0x1500 net/rds/send.c:367 rds_send_worker+0x43/0x200 net/rds/threads.c:200 process_one_work+0x3fc/0x980 kernel/workqueue.c:2298 worker_thread+0x616/0xa70 kernel/workqueue.c:2445 kthread+0x2c7/0x2e0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 value changed: 0x00027cc2 -> 0x00000000 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 17446 Comm: kworker/u4:5 Tainted: G W 5.16.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: krdsd rds_send_worker Note: I chose an arbitrary commit for the Fixes: tag, because I do not think we need to backport this fix to very old kernels. Fixes: e37542ba111f ("netfilter: conntrack: avoid possible false sharing") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08selftests: netfilter: switch zone stress to socatFlorian Westphal1-6/+13
centos9 has nmap-ncat which doesn't like the '-q' option, use socat. While at it, mark test skipped if needed tools are missing. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08netfilter: nft_exthdr: break evaluation if setting TCP option failsPablo Neira Ayuso1-4/+7
Break rule evaluation on malformed TCP options. Fixes: 99d1712bc41c ("netfilter: exthdr: tcp option set support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08selftests: netfilter: Add correctness test for mac,net set typeStefano Brivio1-3/+21
The existing net,mac test didn't cover the issue recently reported by Nikita Yushchenko, where MAC addresses wouldn't match if given as first field of a concatenated set with AVX2 and 8-bit groups, because there's a different code path covering the lookup of six 8-bit groups (MAC addresses) if that's the first field. Add a similar mac,net test, with MAC address and IPv4 address swapped in the set specification. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groupsStefano Brivio1-1/+1
The sixth byte of packet data has to be looked up in the sixth group, not in the seventh one, even if we load the bucket data into ymm6 (and not ymm5, for convenience of tracking stalls). Without this fix, matching on a MAC address as first field of a set, if 8-bit groups are selected (due to a small set size) would fail, that is, the given MAC address would never match. Reported-by: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com> Cc: <stable@vger.kernel.org> # 5.6.x Fixes: 7400b063969b ("nft_set_pipapo: Introduce AVX2-based lookup implementation") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Tested-By: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08vrf: don't run conntrack on vrf with !dflt qdiscNicolas Dichtel2-8/+30
After the below patch, the conntrack attached to skb is set to "notrack" in the context of vrf device, for locally generated packets. But this is true only when the default qdisc is set to the vrf device. When changing the qdisc, notrack is not set anymore. In fact, there is a shortcut in the vrf driver, when the default qdisc is set, see commit dcdd43c41e60 ("net: vrf: performance improvements for IPv4") for more details. This patch ensures that the behavior is always the same, whatever the qdisc is. To demonstrate the difference, a new test is added in conntrack_vrf.sh. Fixes: 8c9c296adfae ("vrf: run conntrack only in context of lower/physdev for locally generated packets") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-07Merge tag 'perf-tools-fixes-for-v5.16-2021-12-07' of ↵Linus Torvalds17-60/+47
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix SMT detection fast read path on sysfs. - Fix memory leaks when processing feature headers in perf.data files. - Fix 'Simple expression parser' 'perf test' on arch without CPU die topology info, such as s/390. - Fix building perf with BUILD_BPF_SKEL=1. - Fix 'perf bench' by reverting "perf bench: Fix two memory leaks detected with ASan". - Fix itrace space allowed for new attributes in 'perf script'. - Fix the build feature detection fast path, that was always failing on systems with python3 development packages, speeding up the build. - Reset shadow counts before loading, fixing metrics using duration_time. - Sync more kernel headers changed by the new futex_waitv syscall: s390 and powerpc. * tag 'perf-tools-fixes-for-v5.16-2021-12-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf bpf_skel: Do not use typedef to avoid error on old clang perf bpf: Fix building perf with BUILD_BPF_SKEL=1 by default in more distros perf header: Fix memory leaks when processing feature headers perf test: Reset shadow counts before loading perf test: Fix 'Simple expression parser' test on arch without CPU die topology info tools build: Remove needless libpython-version feature check that breaks test-all fast path perf tools: Fix SMT detection fast read path tools headers UAPI: Sync powerpc syscall table file changed by new futex_waitv syscall perf inject: Fix itrace space allowed for new attributes tools headers UAPI: Sync s390 syscall table file changed by new futex_waitv syscall Revert "perf bench: Fix two memory leaks detected with ASan"
2021-12-07block: fix single bio async DIO error handlingPavel Begunkov1-2/+1
BUG: KASAN: use-after-free in io_submit_one+0x496/0x2fe0 fs/aio.c:1882 CPU: 2 PID: 15100 Comm: syz-executor873 Not tainted 5.16.0-rc1-syzk #1 Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7860+a7792d29 04/01/2014 Call Trace: [...] refcount_dec_and_test include/linux/refcount.h:333 [inline] iocb_put fs/aio.c:1161 [inline] io_submit_one+0x496/0x2fe0 fs/aio.c:1882 __do_sys_io_submit fs/aio.c:1938 [inline] __se_sys_io_submit fs/aio.c:1908 [inline] __x64_sys_io_submit+0x1c7/0x4a0 fs/aio.c:1908 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae __blkdev_direct_IO_async() returns errors from bio_iov_iter_get_pages() directly, in which case upper layers won't be expecting ->ki_complete to be called by the block layer and will terminate the request. However, there is also bio_endio() leading to a second ->ki_complete and a double free. Fixes: 54a88eb838d37 ("block: add single bio async direct IO helper") Reported-by: George Kennedy <george.kennedy@oracle.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c9eb786f6cef041e159e6287de131bec0719ad5c.1638907997.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-07ice: fix adding different tunnelsMichal Swiatkowski6-13/+25
Adding filters with the same values inside for VXLAN and Geneve causes HW error, because it looks exactly the same. To choose between different type of tunnels new recipe is needed. Add storing tunnel types in creating recipes function and start checking it in finding function. Change getting open tunnels function to return port on correct tunnel type. This is needed to copy correct port to dummy packet. Block user from adding enc_dst_port via tc flower, because VXLAN and Geneve filters can be created only with destination port which was previously opened. Fixes: 8b032a55c1bd5 ("ice: low level support for tunnels") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>