aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2020-05-09gcc-10: mark more functions __init to avoid section mismatch warningsHEADmasterfor-nextLinus Torvalds2-2/+2
It seems that for whatever reason, gcc-10 ends up not inlining a couple of functions that used to be inlined before. Even if they only have one single callsite - it looks like gcc may have decided that the code was unlikely, and not worth inlining. The code generation difference is harmless, but caused a few new section mismatch errors, since the (now no longer inlined) function wasn't in the __init section, but called other init functions: Section mismatch in reference from the function kexec_free_initrd() to the function .init.text:free_initrd_mem() Section mismatch in reference from the function tpm2_calc_event_log_size() to the function .init.text:early_memremap() Section mismatch in reference from the function tpm2_calc_event_log_size() to the function .init.text:early_memunmap() So add the appropriate __init annotation to make modpost not complain. In both cases there were trivially just a single callsite from another __init function. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09Merge tag 'riscv-for-linus-5.7-rc5' of ↵Linus Torvalds9-34/+121
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: "A smattering of fixes and cleanups: - Dead code removal. - Exporting riscv_cpuid_to_hartid_mask for modules. - Per-CPU tracking of ISA features. - Setting max_pfn correctly when probing memory. - Adding a note to the VDSO so glibc can check the kernel's version without a uname(). - A fix to force the bootloader to initialize the boot spin tables, which still get used as a fallback when SBI-0.1 is enabled" * tag 'riscv-for-linus-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: Remove unused code from STRICT_KERNEL_RWX riscv: force __cpu_up_ variables to put in data section riscv: add Linux note to vdso riscv: set max_pfn to the PFN of the last page RISC-V: Remove N-extension related defines RISC-V: Add bitmap reprensenting ISA features common across CPUs RISC-V: Export riscv_cpuid_to_hartid_mask() API
2020-05-09gcc-10: avoid shadowing standard library 'free()' in cryptoLinus Torvalds2-6/+6
gcc-10 has started warning about conflicting types for a few new built-in functions, particularly 'free()'. This results in warnings like: crypto/xts.c:325:13: warning: conflicting types for built-in function ‘free’; expected ‘void(void *)’ [-Wbuiltin-declaration-mismatch] because the crypto layer had its local freeing functions called 'free()'. Gcc-10 is in the wrong here, since that function is marked 'static', and thus there is no chance of confusion with any standard library function namespace. But the simplest thing to do is to just use a different name here, and avoid this gcc mis-feature. [ Side note: gcc knowing about 'free()' is in itself not the mis-feature: the semantics of 'free()' are special enough that a compiler can validly do special things when seeing it. So the mis-feature here is that gcc thinks that 'free()' is some restricted name, and you can't shadow it as a local static function. Making the special 'free()' semantics be a function attribute rather than tied to the name would be the much better model ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09gcc-10: disable 'restrict' warning for nowLinus Torvalds1-0/+3
gcc-10 now warns about passing aliasing pointers to functions that take restricted pointers. That's actually a great warning, and if we ever start using 'restrict' in the kernel, it might be quite useful. But right now we don't, and it turns out that the only thing this warns about is an idiom where we have declared a few functions to be "printf-like" (which seems to make gcc pick up the restricted pointer thing), and then we print to the same buffer that we also use as an input. And people do that as an odd concatenation pattern, with code like this: #define sysfs_show_gen_prop(buffer, fmt, ...) \ snprintf(buffer, PAGE_SIZE, "%s"fmt, buffer, __VA_ARGS__) where we have 'buffer' as both the destination of the final result, and as the initial argument. Yes, it's a bit questionable. And outside of the kernel, people do have standard declarations like int snprintf( char *restrict buffer, size_t bufsz, const char *restrict format, ... ); where that output buffer is marked as a restrict pointer that cannot alias with any other arguments. But in the context of the kernel, that 'use snprintf() to concatenate to the end result' does work, and the pattern shows up in multiple places. And we have not marked our own version of snprintf() as taking restrict pointers, so the warning is incorrect for now, and gcc picks it up on its own. If we do start using 'restrict' in the kernel (and it might be a good idea if people find places where it matters), we'll need to figure out how to avoid this issue for snprintf and friends. But in the meantime, this warning is not useful. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09gcc-10: disable 'stringop-overflow' warning for nowLinus Torvalds1-0/+1
This is the final array bounds warning removal for gcc-10 for now. Again, the warning is good, and we should re-enable all these warnings when we have converted all the legacy array declaration cases to flexible arrays. But in the meantime, it's just noise. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09gcc-10: disable 'array-bounds' warning for nowLinus Torvalds1-0/+1
This is another fine warning, related to the 'zero-length-bounds' one, but hitting the same historical code in the kernel. Because C didn't historically support flexible array members, we have code that instead uses a one-sized array, the same way we have cases of zero-sized arrays. The one-sized arrays come from either not wanting to use the gcc zero-sized array extension, or from a slight convenience-feature, where particularly for strings, the size of the structure now includes the allocation for the final NUL character. So with a "char name[1];" at the end of a structure, you can do things like v = my_malloc(sizeof(struct vendor) + strlen(name)); and avoid the "+1" for the terminator. Yes, the modern way to do that is with a flexible array, and using 'offsetof()' instead of 'sizeof()', and adding the "+1" by hand. That also technically gets the size "more correct" in that it avoids any alignment (and thus padding) issues, but this is another long-term cleanup thing that will not happen for 5.7. So disable the warning for now, even though it's potentially quite useful. Having a slew of warnings that then hide more urgent new issues is not an improvement. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09gcc-10: disable 'zero-length-bounds' warning for nowLinus Torvalds1-0/+3
This is a fine warning, but we still have a number of zero-length arrays in the kernel that come from the traditional gcc extension. Yes, they are getting converted to flexible arrays, but in the meantime the gcc-10 warning about zero-length bounds is very verbose, and is hiding other issues. I missed one actual build failure because it was hidden among hundreds of lines of warning. Thankfully I caught it on the second go before pushing things out, but it convinced me that I really need to disable the new warnings for now. We'll hopefully be all done with our conversion to flexible arrays in the not too distant future, and we can then re-enable this warning. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09Stop the ad-hoc games with -Wno-maybe-initializedLinus Torvalds3-23/+3
We have some rather random rules about when we accept the "maybe-initialized" warnings, and when we don't. For example, we consider it unreliable for gcc versions < 4.9, but also if -O3 is enabled, or if optimizing for size. And then various kernel config options disabled it, because they know that they trigger that warning by confusing gcc sufficiently (ie PROFILE_ALL_BRANCHES). And now gcc-10 seems to be introducing a lot of those warnings too, so it falls under the same heading as 4.9 did. At the same time, we have a very straightforward way to _enable_ that warning when wanted: use "W=2" to enable more warnings. So stop playing these ad-hoc games, and just disable that warning by default, with the known and straight-forward "if you want to work on the extra compiler warnings, use W=123". Would it be great to have code that is always so obvious that it never confuses the compiler whether a variable is used initialized or not? Yes, it would. In a perfect world, the compilers would be smarter, and our source code would be simpler. That's currently not the world we live in, though. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09Merge tag 'io_uring-5.7-2020-05-08' of git://git.kernel.dk/linux-blockLinus Torvalds2-70/+40
Pull io_uring fixes from Jens Axboe: - Fix finish_wait() balancing in file cancelation (Xiaoguang) - Ensure early cleanup of resources in ring map failure (Xiaoguang) - Ensure IORING_OP_SLICE does the right file mode checks (Pavel) - Remove file opening from openat/openat2/statx, it's not needed and messes with O_PATH * tag 'io_uring-5.7-2020-05-08' of git://git.kernel.dk/linux-block: io_uring: don't use 'fd' for openat/openat2/statx splice: move f_mode checks to do_{splice,tee}() io_uring: handle -EFAULT properly in io_uring_setup() io_uring: fix mismatched finish_wait() calls in io_uring_cancel_files()
2020-05-08Merge tag 'scsi-fixes' of ↵Linus Torvalds4-6/+7
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four minor fixes, all in drivers (qla2xxx, ibmvfc, ibmvscsi)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ibmvscsi: Fix WARN_ON during event pool release scsi: ibmvfc: Don't send implicit logouts prior to NPIV login scsi: qla2xxx: Delete all sessions before unregister local nvme port scsi: qla2xxx: Fix hang when issuing nvme disconnect-all in NPIV
2020-05-08Merge tag 'ceph-for-5.7-rc5' of git://github.com/ceph/ceph-clientLinus Torvalds4-14/+7
Pull ceph fixes from Ilya Dryomov: "Fixes for an endianness handling bug that prevented mounts on big-endian arches, a spammy log message and a couple error paths. Also included a MAINTAINERS update" * tag 'ceph-for-5.7-rc5' of git://github.com/ceph/ceph-client: ceph: demote quotarealm lookup warning to a debug message MAINTAINERS: remove myself as ceph co-maintainer ceph: fix double unlock in handle_cap_export() ceph: fix special error code in ceph_try_get_caps() ceph: fix endianness bug when handling MDS session feature bits
2020-05-08ceph: demote quotarealm lookup warning to a debug messageLuis Henriques1-2/+2
A misconfigured cephx can easily result in having the kernel client flooding the logs with: ceph: Can't lookup inode 1 (err: -13) Change this message to debug level. Cc: stable@vger.kernel.org URL: https://tracker.ceph.com/issues/44546 Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-05-08Merge tag 'char-misc-5.7-rc5' of ↵Linus Torvalds14-51/+77
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small driver fixes for 5.7-rc5 that resolve a number of minor reported issues: - mhi bus driver fixes found as people actually use the code - phy driver fixes and compat string additions - most driver fix due to link order changing when the core moved out of staging - mei driver fix - interconnect build warning fix All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: bus: mhi: core: Fix channel device name conflict bus: mhi: core: Fix typo in comment bus: mhi: core: Offload register accesses to the controller bus: mhi: core: Remove link_status() callback bus: mhi: core: Make sure to powerdown if mhi_sync_power_up fails bus: mhi: Fix parsing of mhi_flags mei: me: disable mei interface on LBG servers. phy: qualcomm: usb-hs-28nm: Prepare clocks in init MAINTAINERS: Add Vinod Koul as Generic PHY co-maintainer interconnect: qcom: Move the static keyword to the front of declaration most: core: use function subsys_initcall() bus: mhi: core: Fix a NULL vs IS_ERR check in mhi_create_devices() phy: qcom-qusb2: Re add "qcom,sdm845-qusb2-phy" compat string phy: tegra: Select USB_COMMON for usb_get_maximum_speed()
2020-05-08Merge tag 'driver-core-5.7-rc5' of ↵Linus Torvalds10-30/+48
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are a number of small driver core fixes for 5.7-rc5 to resolve a bunch of reported issues with the current tree. Biggest here are the reverts and patches from John Stultz to resolve a bunch of deferred probe regressions we have been seeing in 5.7-rc right now. Along with those are some other smaller fixes: - coredump crash fix - devlink fix for when permissive mode was enabled - amba and platform device dma_parms fixes - component error silenced for when deferred probe happens All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: regulator: Revert "Use driver_deferred_probe_timeout for regulator_init_complete_work" driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires driver core: Use dev_warn() instead of dev_WARN() for deferred_probe_timeout warnings driver core: Revert default driver_deferred_probe_timeout value to 0 component: Silence bind error on -EPROBE_DEFER driver core: Fix handling of fw_devlink=permissive coredump: fix crash when umh is disabled amba: Initialize dma_parms for amba devices driver core: platform: Initialize dma_parms for platform devices
2020-05-08Merge tag 'staging-5.7-rc5' of ↵Linus Torvalds3-6/+4
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes from Greg KH: "Here are three small driver fixes for 5.7-rc5. Two of these are documentation fixes: - MAINTAINERS update due to removed driver - removing Wolfram from the ks7010 driver TODO file The other patch is a real fix: - fix gasket driver to proper check the return value of a call All of these have been in linux-next for a while with no reported issues" * tag 'staging-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: gasket: Check the return value of gasket_get_bar_index() staging: ks7010: remove me from CC list MAINTAINERS: remove entry after hp100 driver removal
2020-05-08Merge tag 'tty-5.7-rc5' of ↵Linus Torvalds3-5/+9
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are three small TTY/Serial/VT fixes for 5.7-rc5: - revert for the bcm63xx driver "fix" that was incorrect - vt unicode console bugfix - xilinx_uartps console driver fix All of these have been in linux next with no reported issues" * tag 'tty-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty: xilinx_uartps: Fix missing id assignment to the console vt: fix unicode console freeing with a common interface Revert "tty: serial: bcm63xx: fix missing clk_put() in bcm63xx_uart"
2020-05-08Merge tag 'usb-5.7-rc5' of ↵Linus Torvalds8-10/+24
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes for 5.7-rc5 to resolve some reported issues: - syzbot found problems fixed - usbfs dma mapping fix - typec bugfixs - chipidea bugfix - usb4/thunderbolt fix - new device ids/quirks All of these have been in linux-next for a while with no reported issues" * tag 'usb-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: chipidea: msm: Ensure proper controller reset using role switch API usb: typec: mux: intel: Handle alt mode HPD_HIGH usb: usbfs: correct kernel->user page attribute mismatch usb: typec: intel_pmc_mux: Fix the property names USB: core: Fix misleading driver bug report USB: serial: qcserial: Add DW5816e support USB: uas: add quirk for LaCie 2Big Quadra thunderbolt: Check return value of tb_sw_read() in usb4_switch_op() USB: serial: garmin_gps: add sanity checking for data length
2020-05-08Merge tag 'drm-fixes-2020-05-08' of git://anongit.freedesktop.org/drm/drmLinus Torvalds12-31/+57
Pull drm fixes from Dave Airlie: "Another pretty normal week. I didn't get any i915 fixes yet, so next week I'd expect double the usual i915, but otherwise a bunch of amdgpu and some scattered other fixes. hdcp: - fix HDCP regression amdgpu: - Runtime PM fixes - DC fix for PPC - Misc DC fixes virtio: - fix context ordering issue sun4i: - old gcc warning fix ingenic-drm: - missing module support" * tag 'drm-fixes-2020-05-08' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: Prevent dpcd reads with passive dongles drm/amd/display: fix counter in wait_for_no_pipes_pending drm/amd/display: Update DCN2.1 DV Code Revision drm: Fix HDCP failures when SRM fw is missing sun6i: dsi: fix gcc-4.8 drm: ingenic-drm: add MODULE_DEVICE_TABLE drm/virtio: create context before RESOURCE_CREATE_2D in 3D mode drm/amd/display: work around fp code being emitted outside of DC_FP_START/END drm/amdgpu/dc: Use WARN_ON_ONCE for ASSERT drm/amdgpu: drop redundant cg/pg ungate on runpm enter drm/amdgpu: move kfd suspend after ip_suspend_phase1
2020-05-08Merge branch 'akpm' (patches from Andrew)Linus Torvalds14-78/+275
Merge misc fixes from Andrew Morton: "14 fixes and one selftest to verify the ipc fixes herein" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: limit boost_watermark on small zones ubsan: disable UBSAN_ALIGNMENT under COMPILE_TEST mm/vmscan: remove unnecessary argument description of isolate_lru_pages() epoll: atomically remove wait entry on wake up kselftests: introduce new epoll60 testcase for catching lost wakeups percpu: make pcpu_alloc() aware of current gfp context mm/slub: fix incorrect interpretation of s->offset scripts/gdb: repair rb_first() and rb_last() eventpoll: fix missing wakeup for ovflist in ep_poll_callback arch/x86/kvm/svm/sev.c: change flag passed to GUP fast in sev_pin_memory() scripts/decodecode: fix trapping instruction formatting kernel/kcov.c: fix typos in kcov_remote_start documentation mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous() mm, memcg: fix error return value of mem_cgroup_css_alloc() ipc/mqueue.c: change __do_notify() to bypass check_kill_permission()
2020-05-08Merge tag 'drm-misc-fixes-2020-05-07' of ↵Dave Airlie6-4/+14
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A few minor fixes for an ordering issue in virtio, an (old) gcc warning in sun4i, a probe issue in ingenic-drm and a regression in the HDCP support. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20200507160130.id64niqgf5wsha4u@gilmour.lan
2020-05-08Merge tag 'amd-drm-fixes-5.7-2020-05-06' of ↵Dave Airlie6-27/+43
git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.7-2020-05-06: amdgpu: - Runtime PM fixes - DC fix for PPC - Misc DC fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200506212257.3893-1-alexander.deucher@amd.com
2020-05-07Merge branch 'for-v5.7' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem fix from James Morris: "Fix the default value of fs_context_parse_param hook" * 'for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: security: Fix the default value of fs_context_parse_param hook
2020-05-07mm: limit boost_watermark on small zonesHenry Willard1-0/+8
Commit 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs") adds a boost_watermark() function which increases the min watermark in a zone by at least pageblock_nr_pages or the number of pages in a page block. On Arm64, with 64K pages and 512M huge pages, this is 8192 pages or 512M. It does this regardless of the number of managed pages managed in the zone or the likelihood of success. This can put the zone immediately under water in terms of allocating pages from the zone, and can cause a small machine to fail immediately due to OoM. Unlike set_recommended_min_free_kbytes(), which substantially increases min_free_kbytes and is tied to THP, boost_watermark() can be called even if THP is not active. The problem is most likely to appear on architectures such as Arm64 where pageblock_nr_pages is very large. It is desirable to run the kdump capture kernel in as small a space as possible to avoid wasting memory. In some architectures, such as Arm64, there are restrictions on where the capture kernel can run, and therefore, the space available. A capture kernel running in 768M can fail due to OoM immediately after boost_watermark() sets the min in zone DMA32, where most of the memory is, to 512M. It fails even though there is over 500M of free memory. With boost_watermark() suppressed, the capture kernel can run successfully in 448M. This patch limits boost_watermark() to boosting a zone's min watermark only when there are enough pages that the boost will produce positive results. In this case that is estimated to be four times as many pages as pageblock_nr_pages. Mel said: : There is no harm in marking it stable. Clearly it does not happen very : often but it's not impossible. 32-bit x86 is a lot less common now : which would previously have been vulnerable to triggering this easily. : ppc64 has a larger base page size but typically only has one zone. : arm64 is likely the most vulnerable, particularly when CMA is : configured with a small movable zone. Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs") Signed-off-by: Henry Willard <henry.willard@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/1588294148-6586-1-git-send-email-henry.willard@oracle.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07ubsan: disable UBSAN_ALIGNMENT under COMPILE_TESTKees Cook2-10/+6
The documentation for UBSAN_ALIGNMENT already mentions that it should not be used on all*config builds (and for efficient-unaligned-access architectures), so just refactor the Kconfig to correctly implement this so randconfigs will stop creating insane images that freak out objtool under CONFIG_UBSAN_TRAP (due to the false positives producing functions that never return, etc). Link: http://lkml.kernel.org/r/202005011433.C42EA3E2D@keescook Fixes: 0887a7ebc977 ("ubsan: add trap instrumentation option") Signed-off-by: Kees Cook <keescook@chromium.org> Reported-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/linux-next/202004231224.D6B3B650@keescook/ Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07mm/vmscan: remove unnecessary argument description of isolate_lru_pages()Qiwu Chen1-1/+0
Since commit a9e7c39fa9fd9 ("mm/vmscan.c: remove 7th argument of isolate_lru_pages()"), the explanation of 'mode' argument has been unnecessary. Let's remove it. Signed-off-by: Qiwu Chen <chenqiwu@xiaomi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200501090346.2894-1-chenqiwu@xiaomi.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07epoll: atomically remove wait entry on wake upRoman Penyaev1-19/+24
This patch does two things: - fixes a lost wakeup introduced by commit 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") - improves performance for events delivery. The description of the problem is the following: if N (>1) threads are waiting on ep->wq for new events and M (>1) events come, it is quite likely that >1 wakeups hit the same wait queue entry, because there is quite a big window between __add_wait_queue_exclusive() and the following __remove_wait_queue() calls in ep_poll() function. This can lead to lost wakeups, because thread, which was woken up, can handle not all the events in ->rdllist. (in better words the problem is described here: https://lkml.org/lkml/2019/10/7/905) The idea of the current patch is to use init_wait() instead of init_waitqueue_entry(). Internally init_wait() sets autoremove_wake_function as a callback, which removes the wait entry atomically (under the wq locks) from the list, thus the next coming wakeup hits the next wait entry in the wait queue, thus preventing lost wakeups. Problem is very well reproduced by the epoll60 test case [1]. Wait entry removal on wakeup has also performance benefits, because there is no need to take a ep->lock and remove wait entry from the queue after the successful wakeup. Here is the timing output of the epoll60 test case: With explicit wakeup from ep_scan_ready_list() (the state of the code prior 339ddb53d373): real 0m6.970s user 0m49.786s sys 0m0.113s After this patch: real 0m5.220s user 0m36.879s sys 0m0.019s The other testcase is the stress-epoll [2], where one thread consumes all the events and other threads produce many events: With explicit wakeup from ep_scan_ready_list() (the state of the code prior 339ddb53d373): threads events/ms run-time ms 8 5427 1474 16 6163 2596 32 6824 4689 64 7060 9064 128 6991 18309 After this patch: threads events/ms run-time ms 8 5598 1429 16 7073 2262 32 7502 4265 64 7640 8376 128 7634 16767 (number of "events/ms" represents event bandwidth, thus higher is better; number of "run-time ms" represents overall time spent doing the benchmark, thus lower is better) [1] tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c [2] https://github.com/rouming/test-tools/blob/master/stress-epoll.c Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jason Baron <jbaron@akamai.com> Cc: Khazhismel Kumykov <khazhy@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Heiher <r@hev.cc> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200430130326.1368509-2-rpenyaev@suse.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07kselftests: introduce new epoll60 testcase for catching lost wakeupsRoman Penyaev1-0/+146
This test case catches lost wake up introduced by commit 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") The test is simple: we have 10 threads and 10 event fds. Each thread can harvest only 1 event. 1 producer fires all 10 events at once and waits that all 10 events will be observed by 10 threads. In case of lost wakeup epoll_wait() will timeout and 0 will be returned. Test case catches two sort of problems: forgotten wakeup on event, which hits the ->ovflist list, this problem was fixed by: 5a2513239750 ("eventpoll: fix missing wakeup for ovflist in ep_poll_callback") the other problem is when several sequential events hit the same waiting thread, thus other waiters get no wakeups. Problem is fixed in the following patch. Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Khazhismel Kumykov <khazhy@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Heiher <r@hev.cc> Cc: Jason Baron <jbaron@akamai.com> Link: http://lkml.kernel.org/r/20200430130326.1368509-1-rpenyaev@suse.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07percpu: make pcpu_alloc() aware of current gfp contextFilipe Manana1-4/+10
Since 5.7-rc1, on btrfs we have a percpu counter initialization for which we always pass a GFP_KERNEL gfp_t argument (this happens since commit 2992df73268f78 ("btrfs: Implement DREW lock")). That is safe in some contextes but not on others where allowing fs reclaim could lead to a deadlock because we are either holding some btrfs lock needed for a transaction commit or holding a btrfs transaction handle open. Because of that we surround the call to the function that initializes the percpu counter with a NOFS context using memalloc_nofs_save() (this is done at btrfs_init_fs_root()). However it turns out that this is not enough to prevent a possible deadlock because percpu_alloc() determines if it is in an atomic context by looking exclusively at the gfp flags passed to it (GFP_KERNEL in this case) and it is not aware that a NOFS context is set. Because percpu_alloc() thinks it is in a non atomic context it locks the pcpu_alloc_mutex. This can result in a btrfs deadlock when pcpu_balance_workfn() is running, has acquired that mutex and is waiting for reclaim, while the btrfs task that called percpu_counter_init() (and therefore percpu_alloc()) is holding either the btrfs commit_root semaphore or a transaction handle (done fs/btrfs/backref.c: iterate_extent_inodes()), which prevents reclaim from finishing as an attempt to commit the current btrfs transaction will deadlock. Lockdep reports this issue with the following trace: ====================================================== WARNING: possible circular locking dependency detected 5.6.0-rc7-btrfs-next-77 #1 Not tainted ------------------------------------------------------ kswapd0/91 is trying to acquire lock: ffff8938a3b3fdc8 (&delayed_node->mutex){+.+.}, at: __btrfs_release_delayed_node.part.0+0x3f/0x320 [btrfs] but task is already holding lock: ffffffffb4f0dbc0 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 (fs_reclaim){+.+.}: fs_reclaim_acquire.part.0+0x25/0x30 __kmalloc+0x5f/0x3a0 pcpu_create_chunk+0x19/0x230 pcpu_balance_workfn+0x56a/0x680 process_one_work+0x235/0x5f0 worker_thread+0x50/0x3b0 kthread+0x120/0x140 ret_from_fork+0x3a/0x50 -> #3 (pcpu_alloc_mutex){+.+.}: __mutex_lock+0xa9/0xaf0 pcpu_alloc+0x480/0x7c0 __percpu_counter_init+0x50/0xd0 btrfs_drew_lock_init+0x22/0x70 [btrfs] btrfs_get_fs_root+0x29c/0x5c0 [btrfs] resolve_indirect_refs+0x120/0xa30 [btrfs] find_parent_nodes+0x50b/0xf30 [btrfs] btrfs_find_all_leafs+0x60/0xb0 [btrfs] iterate_extent_inodes+0x139/0x2f0 [btrfs] iterate_inodes_from_logical+0xa1/0xe0 [btrfs] btrfs_ioctl_logical_to_ino+0xb4/0x190 [btrfs] btrfs_ioctl+0x165a/0x3130 [btrfs] ksys_ioctl+0x87/0xc0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x5c/0x260 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #2 (&fs_info->commit_root_sem){++++}: down_write+0x38/0x70 btrfs_cache_block_group+0x2ec/0x500 [btrfs] find_free_extent+0xc6a/0x1600 [btrfs] btrfs_reserve_extent+0x9b/0x180 [btrfs] btrfs_alloc_tree_block+0xc1/0x350 [btrfs] alloc_tree_block_no_bg_flush+0x4a/0x60 [btrfs] __btrfs_cow_block+0x122/0x5a0 [btrfs] btrfs_cow_block+0x106/0x240 [btrfs] commit_cowonly_roots+0x55/0x310 [btrfs] btrfs_commit_transaction+0x509/0xb20 [btrfs] sync_filesystem+0x74/0x90 generic_shutdown_super+0x22/0x100 kill_anon_super+0x14/0x30 btrfs_kill_super+0x12/0x20 [btrfs] deactivate_locked_super+0x31/0x70 cleanup_mnt+0x100/0x160 task_work_run+0x93/0xc0 exit_to_usermode_loop+0xf9/0x100 do_syscall_64+0x20d/0x260 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #1 (&space_info->groups_sem){++++}: down_read+0x3c/0x140 find_free_extent+0xef6/0x1600 [btrfs] btrfs_reserve_extent+0x9b/0x180 [btrfs] btrfs_alloc_tree_block+0xc1/0x350 [btrfs] alloc_tree_block_no_bg_flush+0x4a/0x60 [btrfs] __btrfs_cow_block+0x122/0x5a0 [btrfs] btrfs_cow_block+0x106/0x240 [btrfs] btrfs_search_slot+0x50c/0xd60 [btrfs] btrfs_lookup_inode+0x3a/0xc0 [btrfs] __btrfs_update_delayed_inode+0x90/0x280 [btrfs] __btrfs_commit_inode_delayed_items+0x81f/0x870 [btrfs] __btrfs_run_delayed_items+0x8e/0x180 [btrfs] btrfs_commit_transaction+0x31b/0xb20 [btrfs] iterate_supers+0x87/0xf0 ksys_sync+0x60/0xb0 __ia32_sys_sync+0xa/0x10 do_syscall_64+0x5c/0x260 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (&delayed_node->mutex){+.+.}: __lock_acquire+0xef0/0x1c80 lock_acquire+0xa2/0x1d0 __mutex_lock+0xa9/0xaf0 __btrfs_release_delayed_node.part.0+0x3f/0x320 [btrfs] btrfs_evict_inode+0x40d/0x560 [btrfs] evict+0xd9/0x1c0 dispose_list+0x48/0x70 prune_icache_sb+0x54/0x80 super_cache_scan+0x124/0x1a0 do_shrink_slab+0x176/0x440 shrink_slab+0x23a/0x2c0 shrink_node+0x188/0x6e0 balance_pgdat+0x31d/0x7f0 kswapd+0x238/0x550 kthread+0x120/0x140 ret_from_fork+0x3a/0x50 other info that might help us debug this: Chain exists of: &delayed_node->mutex --> pcpu_alloc_mutex --> fs_reclaim Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(fs_reclaim); lock(pcpu_alloc_mutex); lock(fs_reclaim); lock(&delayed_node->mutex); *** DEADLOCK *** 3 locks held by kswapd0/91: #0: (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30 #1: (shrinker_rwsem){++++}, at: shrink_slab+0x12f/0x2c0 #2: (&type->s_umount_key#43){++++}, at: trylock_super+0x16/0x50 stack backtrace: CPU: 1 PID: 91 Comm: kswapd0 Not tainted 5.6.0-rc7-btrfs-next-77 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x8f/0xd0 check_noncircular+0x170/0x190 __lock_acquire+0xef0/0x1c80 lock_acquire+0xa2/0x1d0 __mutex_lock+0xa9/0xaf0 __btrfs_release_delayed_node.part.0+0x3f/0x320 [btrfs] btrfs_evict_inode+0x40d/0x560 [btrfs] evict+0xd9/0x1c0 dispose_list+0x48/0x70 prune_icache_sb+0x54/0x80 super_cache_scan+0x124/0x1a0 do_shrink_slab+0x176/0x440 shrink_slab+0x23a/0x2c0 shrink_node+0x188/0x6e0 balance_pgdat+0x31d/0x7f0 kswapd+0x238/0x550 kthread+0x120/0x140 ret_from_fork+0x3a/0x50 This could be fixed by making btrfs pass GFP_NOFS instead of GFP_KERNEL to percpu_counter_init() in contextes where it is not reclaim safe, however that type of approach is discouraged since memalloc_[nofs|noio]_save() were introduced. Therefore this change makes pcpu_alloc() look up into an existing nofs/noio context before deciding whether it is in an atomic context or not. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Dennis Zhou <dennis@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Link: http://lkml.kernel.org/r/20200430164356.15543-1-fdmanana@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07mm/slub: fix incorrect interpretation of s->offsetWaiman Long1-15/+30
In a couple of places in the slub memory allocator, the code uses "s->offset" as a check to see if the free pointer is put right after the object. That check is no longer true with commit 3202fa62fb43 ("slub: relocate freelist pointer to middle of object"). As a result, echoing "1" into the validate sysfs file, e.g. of dentry, may cause a bunch of "Freepointer corrupt" error reports like the following to appear with the system in panic afterwards. ============================================================================= BUG dentry(666:pmcd.service) (Tainted: G B): Freepointer corrupt ----------------------------------------------------------------------------- To fix it, use the check "s->offset == s->inuse" in the new helper function freeptr_outside_object() instead. Also add another helper function get_info_end() to return the end of info block (inuse + free pointer if not overlapping with object). Fixes: 3202fa62fb43 ("slub: relocate freelist pointer to middle of object") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Rafael Aquini <aquini@redhat.com> Cc: Christoph Lameter <cl@linux.com> Cc: Vitaly Nikolenko <vnik@duasynt.com> Cc: Silvio Cesare <silvio.cesare@gmail.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Markus Elfring <Markus.Elfring@web.de> Cc: Changbin Du <changbin.du@gmail.com> Link: http://lkml.kernel.org/r/20200429135328.26976-1-longman@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07scripts/gdb: repair rb_first() and rb_last()Aymeric Agon-Rambosson1-2/+2
The current implementations of the rb_first() and rb_last() gdb functions have a variable that references itself in its instanciation, which causes the function to throw an error if a specific condition on the argument is met. The original author rather intended to reference the argument and made a typo. Referring the argument instead makes the function work as intended. Signed-off-by: Aymeric Agon-Rambosson <aymeric.agon@yandex.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Douglas Anderson <dianders@chromium.org> Cc: Nikolay Borisov <n.borisov.lkml@gmail.com> Cc: Jackie Liu <liuyun01@kylinos.cn> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/r/20200427051029.354840-1-aymeric.agon@yandex.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07eventpoll: fix missing wakeup for ovflist in ep_poll_callbackKhazhismel Kumykov1-9/+9
In the event that we add to ovflist, before commit 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") we would be woken up by ep_scan_ready_list, and did no wakeup in ep_poll_callback. With that wakeup removed, if we add to ovflist here, we may never wake up. Rather than adding back the ep_scan_ready_list wakeup - which was resulting in unnecessary wakeups, trigger a wake-up in ep_poll_callback. We noticed that one of our workloads was missing wakeups starting with 339ddb53d373 and upon manual inspection, this wakeup seemed missing to me. With this patch added, we no longer see missing wakeups. I haven't yet tried to make a small reproducer, but the existing kselftests in filesystem/epoll passed for me with this patch. [khazhy@google.com: use if/elif instead of goto + cleanup suggested by Roman] Link: http://lkml.kernel.org/r/20200424190039.192373-1-khazhy@google.com Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") Signed-off-by: Khazhismel Kumykov <khazhy@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Roman Penyaev <rpenyaev@suse.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Roman Penyaev <rpenyaev@suse.de> Cc: Heiher <r@hev.cc> Cc: Jason Baron <jbaron@akamai.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200424025057.118641-1-khazhy@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07arch/x86/kvm/svm/sev.c: change flag passed to GUP fast in sev_pin_memory()Janakarajan Natarajan1-1/+1
When trying to lock read-only pages, sev_pin_memory() fails because FOLL_WRITE is used as the flag for get_user_pages_fast(). Commit 73b0140bf0fe ("mm/gup: change GUP fast to use flags rather than a write 'bool'") updated the get_user_pages_fast() call sites to use flags, but incorrectly updated the call in sev_pin_memory(). As the original coding of this call was correct, revert the change made by that commit. Fixes: 73b0140bf0fe ("mm/gup: change GUP fast to use flags rather than a write 'bool'") Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Jim Mattson <jmattson@google.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Mike Marshall <hubcap@omnibond.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Link: http://lkml.kernel.org/r/20200423152419.87202-1-Janakarajan.Natarajan@amd.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07scripts/decodecode: fix trapping instruction formattingIvan Delalande1-1/+1
If the trapping instruction contains a ':', for a memory access through segment registers for example, the sed substitution will insert the '*' marker in the middle of the instruction instead of the line address: 2b: 65 48 0f c7 0f cmpxchg16b %gs:*(%rdi) <-- trapping instruction I started to think I had forgotten some quirk of the assembly syntax before noticing that it was actually coming from the script. Fix it to add the address marker at the right place for these instructions: 28: 49 8b 06 mov (%r14),%rax 2b:* 65 48 0f c7 0f cmpxchg16b %gs:(%rdi) <-- trapping instruction 30: 0f 94 c0 sete %al Fixes: 18ff44b189e2 ("scripts/decodecode: make faulting insn ptr more robust") Signed-off-by: Ivan Delalande <colona@arista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20200419223653.GA31248@visor Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07kernel/kcov.c: fix typos in kcov_remote_start documentationMaciej Grochowski1-2/+2
Signed-off-by: Maciej Grochowski <maciej.grochowski@pm.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Link: http://lkml.kernel.org/r/20200420030259.31674-1-maciek.grochowski@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()David Hildenbrand1-0/+1
Without CONFIG_PREEMPT, it can happen that we get soft lockups detected, e.g., while booting up. watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0-next-20200331+ #4 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 RIP: __pageblock_pfn_to_page+0x134/0x1c0 Call Trace: set_zone_contiguous+0x56/0x70 page_alloc_init_late+0x166/0x176 kernel_init_freeable+0xfa/0x255 kernel_init+0xa/0x106 ret_from_fork+0x35/0x40 The issue becomes visible when having a lot of memory (e.g., 4TB) assigned to a single NUMA node - a system that can easily be created using QEMU. Inside VMs on a hypervisor with quite some memory overcommit, this is fairly easy to trigger. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Reviewed-by: Baoquan He <bhe@redhat.com> Reviewed-by: Shile Zhang <shile.zhang@linux.alibaba.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Shile Zhang <shile.zhang@linux.alibaba.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200416073417.5003-1-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07mm, memcg: fix error return value of mem_cgroup_css_alloc()Yafang Shao1-6/+9
When I run my memcg testcase which creates lots of memcgs, I found there're unexpected out of memory logs while there're still enough available free memory. The error log is mkdir: cannot create directory 'foo.65533': Cannot allocate memory The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno should be -ENOSPC "No space left on device", which is an appropriate errno for userspace's failed mkdir. As the errno really misled me, we should make it right. After this patch, the error log will be mkdir: cannot create directory 'foo.65533': No space left on device [akpm@linux-foundation.org: s/EBUSY/ENOSPC/, per Michal] [akpm@linux-foundation.org: s/EBUSY/ENOSPC/, per Michal] Fixes: 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs") Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Link: http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz Link: http://lkml.kernel.org/r/1586192163-20099-1-git-send-email-laoar.shao@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07ipc/mqueue.c: change __do_notify() to bypass check_kill_permission()Oleg Nesterov1-8/+26
Commit cc731525f26a ("signal: Remove kernel interal si_code magic") changed the value of SI_FROMUSER(SI_MESGQ), this means that mq_notify() no longer works if the sender doesn't have rights to send a signal. Change __do_notify() to use do_send_sig_info() instead of kill_pid_info() to avoid check_kill_permission(). This needs the additional notify.sigev_signo != 0 check, shouldn't we change do_mq_notify() to deny sigev_signo == 0 ? Test-case: #include <signal.h> #include <mqueue.h> #include <unistd.h> #include <sys/wait.h> #include <assert.h> static int notified; static void sigh(int sig) { notified = 1; } int main(void) { signal(SIGIO, sigh); int fd = mq_open("/mq", O_RDWR|O_CREAT, 0666, NULL); assert(fd >= 0); struct sigevent se = { .sigev_notify = SIGEV_SIGNAL, .sigev_signo = SIGIO, }; assert(mq_notify(fd, &se) == 0); if (!fork()) { assert(setuid(1) == 0); mq_send(fd, "",1,0); return 0; } wait(NULL); mq_unlink("/mq"); assert(notified); return 0; } [manfred@colorfullife.com: 1) Add self_exec_id evaluation so that the implementation matches do_notify_parent 2) use PIDTYPE_TGID everywhere] Fixes: cc731525f26a ("signal: Remove kernel interal si_code magic") Reported-by: Yoji <yoji.fujihar.min@gmail.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Markus Elfring <elfring@users.sourceforge.net> Cc: <1vier1@web.de> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/e2a782e4-eab9-4f5c-c749-c07a8f7a4e66@colorfullife.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07Merge tag 'trace-v5.7-rc3' of ↵Linus Torvalds8-42/+114
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Fix bootconfig causing kernels to fail with CONFIG_BLK_DEV_RAM enabled - Fix allocation leaks in bootconfig tool - Fix a double initialization of a variable - Fix API bootconfig usage from kprobe boot time events - Reject NULL location for kprobes - Fix crash caused by preempt delay module not cleaning up kthread correctly - Add vmalloc_sync_mappings() to prevent x86_64 page faults from recursively faulting from tracing page faults - Fix comment in gpu/trace kerneldoc header - Fix documentation of how to create a trace event class - Make the local tracing_snapshot_instance_cond() function static * tag 'trace-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tools/bootconfig: Fix resource leak in apply_xbc() tracing: Make tracing_snapshot_instance_cond() static tracing: Fix doc mistakes in trace sample gpu/trace: Minor comment updates for gpu_mem_total tracepoint tracing: Add a vmalloc_sync_mappings() for safe measure tracing: Wait for preempt irq delay thread to finish tracing/kprobes: Reject new event if loc is NULL tracing/boottime: Fix kprobe event API usage tracing/kprobes: Fix a double initialization typo bootconfig: Fix to remove bootconfig data from initrd while boot
2020-05-07Merge tag 'linux-kselftest-5.7-rc5' of ↵Linus Torvalds2-3/+58
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: "ftrace test fixes and a fix to kvm Makefile for relocatable native/cross builds and installs" * tag 'linux-kselftest-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: fix kvm relocatable native/cross builds and installs selftests/ftrace: Make XFAIL green color ftrace/selftest: make unresolved cases cause failure if --fail-unresolved set ftrace/selftests: workaround cgroup RT scheduling issues
2020-05-07io_uring: don't use 'fd' for openat/openat2/statxJens Axboe1-25/+7
We currently make some guesses as when to open this fd, but in reality we have no business (or need) to do so at all. In fact, it makes certain things fail, like O_PATH. Remove the fd lookup from these opcodes, we're just passing the 'fd' to generic helpers anyway. With that, we can also remove the special casing of fd values in io_req_needs_file(), and the 'fd_non_neg' check that we have. And we can ensure that we only read sqe->fd once. This fixes O_PATH usage with openat/openat2, and ditto statx path side oddities. Cc: stable@vger.kernel.org: # v5.6 Reported-by: Max Kellermann <mk@cm4all.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-07tools/bootconfig: Fix resource leak in apply_xbc()Yunfeng Ye1-3/+6
Fix the @data and @fd allocations that are leaked in the error path of apply_xbc(). Link: http://lkml.kernel.org/r/583a49c9-c27a-931d-e6c2-6f63a4b18bea@huawei.com Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-07tracing: Make tracing_snapshot_instance_cond() staticZou Wei1-1/+2
Fix the following sparse warning: kernel/trace/trace.c:950:6: warning: symbol 'tracing_snapshot_instance_cond' was not declared. Should it be static? Link: http://lkml.kernel.org/r/1587614905-48692-1-git-send-email-zou_wei@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-07tracing: Fix doc mistakes in trace sampleWei Yang1-1/+1
As the example below shows, DECLARE_EVENT_CLASS() is used instead of DEFINE_EVENT_CLASS(). Link: http://lkml.kernel.org/r/20200428214959.11259-1-richard.weiyang@gmail.com Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-07gpu/trace: Minor comment updates for gpu_mem_total tracepointYiwei Zhang1-1/+1
This change updates the improper comment for the 'size' attribute in the tracepoint definition. Most gfx drivers pre-fault in physical pages instead of making virtual allocations. So we drop the 'Virtual' keyword here and leave this to the implementations. Link: http://lkml.kernel.org/r/20200428220825.169606-1-zzyiwei@google.com Signed-off-by: Yiwei Zhang <zzyiwei@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-07tracing: Add a vmalloc_sync_mappings() for safe measureSteven Rostedt (VMware)1-0/+13
x86_64 lazily maps in the vmalloc pages, and the way this works with per_cpu areas can be complex, to say the least. Mappings may happen at boot up, and if nothing synchronizes the page tables, those page mappings may not be synced till they are used. This causes issues for anything that might touch one of those mappings in the path of the page fault handler. When one of those unmapped mappings is touched in the page fault handler, it will cause another page fault, which in turn will cause a page fault, and leave us in a loop of page faults. Commit 763802b53a42 ("x86/mm: split vmalloc_sync_all()") split vmalloc_sync_all() into vmalloc_sync_unmappings() and vmalloc_sync_mappings(), as on system exit, it did not need to do a full sync on x86_64 (although it still needed to be done on x86_32). By chance, the vmalloc_sync_all() would synchronize the page mappings done at boot up and prevent the per cpu area from being a problem for tracing in the page fault handler. But when that synchronization in the exit of a task became a nop, it caused the problem to appear. Link: https://lore.kernel.org/r/20200429054857.66e8e333@oasis.local.home Cc: stable@vger.kernel.org Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Reported-by: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com> Suggested-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-07tracing: Wait for preempt irq delay thread to finishSteven Rostedt (VMware)1-6/+24
Running on a slower machine, it is possible that the preempt delay kernel thread may still be executing if the module was immediately removed after added, and this can cause the kernel to crash as the kernel thread might be executing after its code has been removed. There's no reason that the caller of the code shouldn't just wait for the delay thread to finish, as the thread can also be created by a trigger in the sysfs code, which also has the same issues. Link: http://lore.kernel.org/r/5EA2B0C8.2080706@cn.fujitsu.com Cc: stable@vger.kernel.org Fixes: 793937236d1ee ("lib: Add module for testing preemptoff/irqsoff latency tracers") Reported-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Reviewed-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Reviewed-by: Joel Fernandes <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-07Merge tag 'arm64-fixes' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: "Avoid potential NULL dereference in huge_pte_alloc() on pmd_alloc() failure" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: hugetlb: avoid potential NULL dereference
2020-05-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds25-125/+628
Pull kvm fixes from Paolo Bonzini: "Bugfixes, mostly for ARM and AMD, and more documentation. Slightly bigger than usual because I couldn't send out what was pending for rc4, but there is nothing worrisome going on. I have more fixes pending for guest debugging support (gdbstub) but I will send them next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits) KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly KVM: selftests: Fix build for evmcs.h kvm: x86: Use KVM CPU capabilities to determine CR4 reserved bits KVM: VMX: Explicitly clear RFLAGS.CF and RFLAGS.ZF in VM-Exit RSB path docs/virt/kvm: Document configuring and running nested guests KVM: s390: Remove false WARN_ON_ONCE for the PQAP instruction kvm: ioapic: Restrict lazy EOI update to edge-triggered interrupts KVM: x86: Fixes posted interrupt check for IRQs delivery modes KVM: SVM: fill in kvm_run->debug.arch.dr[67] KVM: nVMX: Replace a BUG_ON(1) with BUG() to squash clang warning KVM: arm64: Fix 32bit PC wrap-around KVM: arm64: vgic-v4: Initialize GICv4.1 even in the absence of a virtual ITS KVM: arm64: Save/restore sp_el0 as part of __guest_enter KVM: arm64: Delete duplicated label in invalid_vector KVM: arm64: vgic-its: Fix memory leak on the error path of vgic_add_lpi() KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy KVM: arm: vgic-v2: Only use the virtual state when userspace accesses pending bits KVM: arm: vgic: Only use the virtual state when userspace accesses enable bits KVM: arm: vgic: Synchronize the whole guest on GIC{D,R}_I{S,C}ACTIVER read KVM: arm64: PSCI: Forbid 64bit functions for 32bit guests ...
2020-05-07Merge tag 'configfs-for-5.7' of git://git.infradead.org/users/hch/configfsLinus Torvalds1-0/+1
Pull configfs fix from Christoph Hellwig: "Fix a refcount leak in configfs_rmdir (Xiyu Yang)" * tag 'configfs-for-5.7' of git://git.infradead.org/users/hch/configfs: configfs: fix config_item refcnt leak in configfs_rmdir()
2020-05-07splice: move f_mode checks to do_{splice,tee}()Pavel Begunkov1-27/+18
do_splice() is used by io_uring, as will be do_tee(). Move f_mode checks from sys_{splice,tee}() to do_{splice,tee}(), so they're enforced for io_uring as well. Fixes: 7d67af2c0134 ("io_uring: add splice(2) support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-07arm64: hugetlb: avoid potential NULL dereferenceMark Rutland1-0/+2
The static analyzer in GCC 10 spotted that in huge_pte_alloc() we may pass a NULL pmdp into pte_alloc_map() when pmd_alloc() returns NULL: | CC arch/arm64/mm/pageattr.o | CC arch/arm64/mm/hugetlbpage.o | from arch/arm64/mm/hugetlbpage.c:10: | arch/arm64/mm/hugetlbpage.c: In function ‘huge_pte_alloc’: | ./arch/arm64/include/asm/pgtable-types.h:28:24: warning: dereference of NULL ‘pmdp’ [CWE-690] [-Wanalyzer-null-dereference] | ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’ | arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’ | |arch/arm64/mm/hugetlbpage.c:232:10: | |./arch/arm64/include/asm/pgtable-types.h:28:24: | ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’ | arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’ This can only occur when the kernel cannot allocate a page, and so is unlikely to happen in practice before other systems start failing. We can avoid this by bailing out if pmd_alloc() fails, as we do earlier in the function if pud_alloc() fails. Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Kyrill Tkachov <kyrylo.tkachov@arm.com> Cc: <stable@vger.kernel.org> # 4.5.x- Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-05-07usb: chipidea: msm: Ensure proper controller reset using role switch APIBryan O'Donoghue1-1/+1
Currently we check to make sure there is no error state on the extcon handle for VBUS when writing to the HS_PHY_GENCONFIG_2 register. When using the USB role-switch API we still need to write to this register absent an extcon handle. This patch makes the appropriate update to ensure the write happens if role-switching is true. Fixes: 05559f10ed79 ("usb: chipidea: add role switch class support") Cc: stable <stable@vger.kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Philipp Zabel <p.zabel@pengutronix.de> Cc: linux-usb@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Peter Chen <peter.chen@nxp.com> Link: https://lore.kernel.org/r/20200507004918.25975-2-peter.chen@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds107-462/+801
Pull networking fixes from David Miller: 1) Fix reference count leaks in various parts of batman-adv, from Xiyu Yang. 2) Update NAT checksum even when it is zero, from Guillaume Nault. 3) sk_psock reference count leak in tls code, also from Xiyu Yang. 4) Sanity check TCA_FQ_CODEL_DROP_BATCH_SIZE netlink attribute in fq_codel, from Eric Dumazet. 5) Fix panic in choke_reset(), also from Eric Dumazet. 6) Fix VLAN accel handling in bnxt_fix_features(), from Michael Chan. 7) Disallow out of range quantum values in sch_sfq, from Eric Dumazet. 8) Fix crash in x25_disconnect(), from Yue Haibing. 9) Don't pass pointer to local variable back to the caller in nf_osf_hdr_ctx_init(), from Arnd Bergmann. 10) Wireguard should use the ECN decap helper functions, from Toke Høiland-Jørgensen. 11) Fix command entry leak in mlx5 driver, from Moshe Shemesh. 12) Fix uninitialized variable access in mptcp's subflow_syn_recv_sock(), from Paolo Abeni. 13) Fix unnecessary out-of-order ingress frame ordering in macsec, from Scott Dial. 14) IPv6 needs to use a global serial number for dst validation just like ipv4, from David Ahern. 15) Fix up PTP_1588_CLOCK deps, from Clay McClure. 16) Missing NLM_F_MULTI flag in gtp driver netlink messages, from Yoshiyuki Kurauchi. 17) Fix a regression in that dsa user port errors should not be fatal, from Florian Fainelli. 18) Fix iomap leak in enetc driver, from Dejin Zheng. 19) Fix use after free in lec_arp_clear_vccs(), from Cong Wang. 20) Initialize protocol value earlier in neigh code paths when generating events, from Roman Mashak. 21) netdev_update_features() must be called with RTNL mutex in macsec driver, from Antoine Tenart. 22) Validate untrusted GSO packets even more strictly, from Willem de Bruijn. 23) Wireguard decrypt worker needs a cond_resched(), from Jason Donenfeld. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits) net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE MAINTAINERS: put DYNAMIC INTERRUPT MODERATION in proper order wireguard: send/receive: use explicit unlikely branch instead of implicit coalescing wireguard: selftests: initalize ipv6 members to NULL to squelch clang warning wireguard: send/receive: cond_resched() when processing worker ringbuffers wireguard: socket: remove errant restriction on looping to self wireguard: selftests: use normal kernel stack size on ppc64 net: ethernet: ti: am65-cpsw-nuss: fix irqs type ionic: Use debugfs_create_bool() to export bool net: dsa: Do not leave DSA master with NULL netdev_ops net: dsa: remove duplicate assignment in dsa_slave_add_cls_matchall_mirred net: stricter validation of untrusted gso packets seg6: fix SRH processing to comply with RFC8754 net: mscc: ocelot: ANA_AUTOAGE_AGE_PERIOD holds a value in seconds, not ms net: dsa: ocelot: the MAC table on Felix is twice as large net: dsa: sja1105: the PTP_CLK extts input reacts on both edges selftests: net: tcp_mmap: fix SO_RCVLOWAT setting net: hsr: fix incorrect type usage for protocol variable net: macsec: fix rtnl locking issue net: mvpp2: cls: Prevent buffer overflow in mvpp2_ethtool_cls_rule_del() ...
2020-05-06net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CAREPablo Neira Ayuso3-4/+22
This patch adds FLOW_ACTION_HW_STATS_DONT_CARE which tells the driver that the frontend does not need counters, this hw stats type request never fails. The FLOW_ACTION_HW_STATS_DISABLED type explicitly requests the driver to disable the stats, however, if the driver cannot disable counters, it bails out. TCA_ACT_HW_STATS_* maintains the 1:1 mapping with FLOW_ACTION_HW_STATS_* except by disabled which is mapped to FLOW_ACTION_HW_STATS_DISABLED (this is 0 in tc). Add tc_act_hw_stats() to perform the mapping between TCA_ACT_HW_STATS_* and FLOW_ACTION_HW_STATS_*. Fixes: 319a1d19471e ("flow_offload: check for basic action hw stats type") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06MAINTAINERS: put DYNAMIC INTERRUPT MODERATION in proper orderLukas Bulwahn1-1/+1
Commit 9b038086f06b ("docs: networking: convert DIM to RST") added a new file entry to DYNAMIC INTERRUPT MODERATION to the end, and not following alphabetical order. So, ./scripts/checkpatch.pl -f MAINTAINERS complains: WARNING: Misordered MAINTAINERS entry - list file patterns in alphabetic order #5966: FILE: MAINTAINERS:5966: +F: lib/dim/ +F: Documentation/networking/net_dim.rst Reorder the file entries to keep MAINTAINERS nicely ordered. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06Merge branch 'wireguard-fixes'David S. Miller6-33/+72
Jason A. Donenfeld says: ==================== wireguard fixes for 5.7-rc5 With Ubuntu and Debian having backported this into their kernels, we're finally seeing testing from places we hadn't seen prior, which is nice. With that comes more fixes: 1) The CI for PPC64 was running with extremely small stacks for 64-bit, causing spurious crashes in surprising places. 2) There's was an old leftover routing loop restriction, which no longer makes sense given the queueing architecture, and was causing problems for people who really did want nested routing. 3) Not yielding our kthread on CONFIG_PREEMPT_VOLUNTARY systems caused RCU stalls and other issues, reported by Wang Jian, with the fix suggested by Sultan Alsawaf. 4) Clang spewed warnings in a selftest for CONFIG_IPV6=n, reported by Arnd Bergmann. 5) A complicated if statement was simplified to an assignment while also making the likely/unlikely hinting more correct and simple, and increasing readability, suggested by Sultan. Patches (2) and (3) have Fixes: lines and are probably good candidates for stable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06wireguard: send/receive: use explicit unlikely branch instead of implicit ↵Jason A. Donenfeld2-16/+12
coalescing It's very unlikely that send will become true. It's nearly always false between 0 and 120 seconds of a session, and in most cases becomes true only between 120 and 121 seconds before becoming false again. So, unlikely(send) is clearly the right option here. What happened before was that we had this complex boolean expression with multiple likely and unlikely clauses nested. Since this is evaluated left-to-right anyway, the whole thing got converted to unlikely. So, we can clean this up to better represent what's going on. The generated code is the same. Suggested-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06wireguard: selftests: initalize ipv6 members to NULL to squelch clang warningJason A. Donenfeld1-2/+2
Without setting these to NULL, clang complains in certain configurations that have CONFIG_IPV6=n: In file included from drivers/net/wireguard/ratelimiter.c:223: drivers/net/wireguard/selftest/ratelimiter.c:173:34: error: variable 'skb6' is uninitialized when used here [-Werror,-Wuninitialized] ret = timings_test(skb4, hdr4, skb6, hdr6, &test_count); ^~~~ drivers/net/wireguard/selftest/ratelimiter.c:123:29: note: initialize the variable 'skb6' to silence this warning struct sk_buff *skb4, *skb6; ^ = NULL drivers/net/wireguard/selftest/ratelimiter.c:173:40: error: variable 'hdr6' is uninitialized when used here [-Werror,-Wuninitialized] ret = timings_test(skb4, hdr4, skb6, hdr6, &test_count); ^~~~ drivers/net/wireguard/selftest/ratelimiter.c:125:22: note: initialize the variable 'hdr6' to silence this warning struct ipv6hdr *hdr6; ^ We silence this warning by setting the variables to NULL as the warning suggests. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06wireguard: send/receive: cond_resched() when processing worker ringbuffersJason A. Donenfeld2-0/+6
Users with pathological hardware reported CPU stalls on CONFIG_ PREEMPT_VOLUNTARY=y, because the ringbuffers would stay full, meaning these workers would never terminate. That turned out not to be okay on systems without forced preemption, which Sultan observed. This commit adds a cond_resched() to the bottom of each loop iteration, so that these workers don't hog the core. Note that we don't need this on the napi poll worker, since that terminates after its budget is expended. Suggested-by: Sultan Alsawaf <sultan@kerneltoast.com> Reported-by: Wang Jian <larkwang@gmail.com> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06wireguard: socket: remove errant restriction on looping to selfJason A. Donenfeld2-15/+51
It's already possible to create two different interfaces and loop packets between them. This has always been possible with tunnels in the kernel, and isn't specific to wireguard. Therefore, the networking stack already needs to deal with that. At the very least, the packet winds up exceeding the MTU and is discarded at that point. So, since this is already something that happens, there's no need to forbid the not very exceptional case of routing a packet back to the same interface; this loop is no different than others, and we shouldn't special case it, but rather rely on generic handling of loops in general. This also makes it easier to do interesting things with wireguard such as onion routing. At the same time, we add a selftest for this, ensuring that both onion routing works and infinite routing loops do not crash the kernel. We also add a test case for wireguard interfaces nesting packets and sending traffic between each other, as well as the loop in this case too. We make sure to send some throughput-heavy traffic for this use case, to stress out any possible recursion issues with the locks around workqueues. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06wireguard: selftests: use normal kernel stack size on ppc64Jason A. Donenfeld1-0/+1
While at some point it might have made sense to be running these tests on ppc64 with 4k stacks, the kernel hasn't actually used 4k stacks on 64-bit powerpc in a long time, and more interesting things that we test don't really work when we deviate from the default (16k). So, we stop pushing our luck in this commit, and return to the default instead of the minimum. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06net: ethernet: ti: am65-cpsw-nuss: fix irqs typeGrygorii Strashko1-2/+3
The K3 INTA driver, which is source TX/RX IRQs for CPSW NUSS, defines IRQs triggering type as EDGE by default, but triggering type for CPSW NUSS TX/RX IRQs has to be LEVEL as the EDGE triggering type may cause unnecessary IRQs triggering and NAPI scheduling for empty queues. It was discovered with RT-kernel. Fix it by explicitly specifying CPSW NUSS TX/RX IRQ type as IRQF_TRIGGER_HIGH. Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06ionic: Use debugfs_create_bool() to export boolGeert Uytterhoeven1-2/+1
Currently bool ionic_cq.done_color is exported using debugfs_create_u8(), which requires a cast, preventing further compiler checks. Fix this by switching to debugfs_create_bool(), and dropping the cast. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06net: dsa: Do not leave DSA master with NULL netdev_opsFlorian Fainelli1-1/+2
When ndo_get_phys_port_name() for the CPU port was added we introduced an early check for when the DSA master network device in dsa_master_ndo_setup() already implements ndo_get_phys_port_name(). When we perform the teardown operation in dsa_master_ndo_teardown() we would not be checking that cpu_dp->orig_ndo_ops was successfully allocated and non-NULL initialized. With network device drivers such as virtio_net, this leads to a NPD as soon as the DSA switch hanging off of it gets torn down because we are now assigning the virtio_net device's netdev_ops a NULL pointer. Fixes: da7b9e9b00d4 ("net: dsa: Add ndo_get_phys_port_name() for CPU port") Reported-by: Allen Pais <allen.pais@oracle.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Allen Pais <allen.pais@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06net: dsa: remove duplicate assignment in dsa_slave_add_cls_matchall_mirredVladimir Oltean1-5/+3
This was caused by a poor merge conflict resolution on my side. The "act = &cls->rule->action.entries[0];" assignment was already present in the code prior to the patch mentioned below. Fixes: e13c2075280e ("net: dsa: refactor matchall mirred action to separate function") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06net: stricter validation of untrusted gso packetsWillem de Bruijn1-2/+24
Syzkaller again found a path to a kernel crash through bad gso input: a packet with transport header extending beyond skb_headlen(skb). Tighten validation at kernel entry: - Verify that the transport header lies within the linear section. To avoid pulling linux/tcp.h, verify just sizeof tcphdr. tcp_gso_segment will call pskb_may_pull (th->doff * 4) before use. - Match the gso_type against the ip_proto found by the flow dissector. Fixes: bfd5f4a3d605 ("packet: Add GSO/csum offload support.") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06seg6: fix SRH processing to comply with RFC8754Ahmed Abdelsalam1-2/+8
The Segment Routing Header (SRH) which defines the SRv6 dataplane is defined in RFC8754. RFC8754 (section 4.1) defines the SR source node behavior which encapsulates packets into an outer IPv6 header and SRH. The SR source node encodes the full list of Segments that defines the packet path in the SRH. Then, the first segment from list of Segments is copied into the Destination address of the outer IPv6 header and the packet is sent to the first hop in its path towards the destination. If the Segment list has only one segment, the SR source node can omit the SRH as he only segment is added in the destination address. RFC8754 (section 4.1.1) defines the Reduced SRH, when a source does not require the entire SID list to be preserved in the SRH. A reduced SRH does not contain the first segment of the related SR Policy (the first segment is the one already in the DA of the IPv6 header), and the Last Entry field is set to n-2, where n is the number of elements in the SR Policy. RFC8754 (section 4.3.1.1) defines the SRH processing and the logic to validate the SRH (S09, S10, S11) which works for both reduced and non-reduced behaviors. This patch updates seg6_validate_srh() to validate the SRH as per RFC8754. Signed-off-by: Ahmed Abdelsalam <ahabdels@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06Merge branch 'FDB-fixes-for-Felix-and-Ocelot-switches'David S. Miller6-6/+16
Vladimir Oltean says: ==================== FDB fixes for Felix and Ocelot switches This series fixes the following problems: - Dynamically learnt addresses never expiring (neither for Ocelot nor for Felix) - Half of the FDB not visible in 'bridge fdb show' (for Felix only) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06net: mscc: ocelot: ANA_AUTOAGE_AGE_PERIOD holds a value in seconds, not msVladimir Oltean1-2/+9
One may notice that automatically-learnt entries 'never' expire, even though the bridge configures the address age period at 300 seconds. Actually the value written to hardware corresponds to a time interval 1000 times higher than intended, i.e. 83 hours. Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Faineli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06net: dsa: ocelot: the MAC table on Felix is twice as largeVladimir Oltean6-4/+7
When running 'bridge fdb dump' on Felix, sometimes learnt and static MAC addresses would appear, sometimes they wouldn't. Turns out, the MAC table has 4096 entries on VSC7514 (Ocelot) and 8192 entries on VSC9959 (Felix), so the existing code from the Ocelot common library only dumped half of Felix's MAC table. They are both organized as a 4-way set-associative TCAM, so we just need a single variable indicating the correct number of rows. Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06Merge tag 'tag-chrome-platform-fixes-for-v5.7-rc5' of ↵Linus Torvalds3-61/+93
git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux Pull chrome platform fix from Benson Leung: "Fix a resource allocation issue in cros_ec_sensorhub.c" * tag 'tag-chrome-platform-fixes-for-v5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux: platform/chrome: cros_ec_sensorhub: Allocate sensorhub resource before claiming sensors
2020-05-06net: dsa: sja1105: the PTP_CLK extts input reacts on both edgesVladimir Oltean1-8/+18
It looks like the sja1105 external timestamping input is not as generic as we thought. When fed a signal with 50% duty cycle, it will timestamp both the rising and the falling edge. When fed a short pulse signal, only the timestamp of the falling edge will be seen in the PTPSYNCTS register, because that of the rising edge had been overwritten. So the moral is: don't feed it short pulse inputs. Luckily this is not a complete deal breaker, as we can still work with 1 Hz square waves. But the problem is that the extts polling period was not dimensioned enough for this input signal. If we leave the period at half a second, we risk losing timestamps due to jitter in the measuring process. So we need to increase it to 4 times per second. Also, the very least we can do to inform the user is to deny any other flags combination than with PTP_RISING_EDGE and PTP_FALLING_EDGE both set. Fixes: 747e5eb31d59 ("net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06selftests: net: tcp_mmap: fix SO_RCVLOWAT settingEric Dumazet1-1/+3
Since chunk_size is no longer an integer, we can not use it directly as an argument of setsockopt(). This patch should fix tcp_mmap for Big Endian kernels. Fixes: 597b01edafac ("selftests: net: avoid ptl lock contention in tcp_mmap") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Arjun Roy <arjunroy@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06net: hsr: fix incorrect type usage for protocol variableMurali Karicheri1-1/+1
Fix following sparse checker warning:- net/hsr/hsr_slave.c:38:18: warning: incorrect type in assignment (different base types) net/hsr/hsr_slave.c:38:18: expected unsigned short [unsigned] [usertype] protocol net/hsr/hsr_slave.c:38:18: got restricted __be16 [usertype] h_proto net/hsr/hsr_slave.c:39:25: warning: restricted __be16 degrades to integer net/hsr/hsr_slave.c:39:57: warning: restricted __be16 degrades to integer Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06net: macsec: fix rtnl locking issueAntoine Tenart1-1/+2
netdev_update_features() must be called with the rtnl lock taken. Not doing so triggers a warning, as ASSERT_RTNL() is used in __netdev_update_features(), the first function called by netdev_update_features(). Fix this. Fixes: c850240b6c41 ("net: macsec: report real_dev features when HW offloading is enabled") Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06net: mvpp2: cls: Prevent buffer overflow in mvpp2_ethtool_cls_rule_del()Dan Carpenter1-0/+3
The "info->fs.location" is a u32 that comes from the user via the ethtool_set_rxnfc() function. We need to check for invalid values to prevent a buffer overflow. I copy and pasted this check from the mvpp2_ethtool_cls_rule_ins() function. Fixes: 90b509b39ac9 ("net: mvpp2: cls: Add Classification offload support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06net: mvpp2: prevent buffer overflow in mvpp22_rss_ctx()Dan Carpenter1-0/+2
The "rss_context" variable comes from the user via ethtool_get_rxfh(). It can be any u32 value except zero. Eventually it gets passed to mvpp22_rss_ctx() and if it is over MVPP22_N_RSS_TABLES (8) then it results in an array overflow. Fixes: 895586d5dc32 ("net: mvpp2: cls: Use RSS contexts to handle RSS tables") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06selftests: net: tcp_mmap: clear whole tcp_zerocopy_receive structEric Dumazet1-1/+2
We added fields in tcp_zerocopy_receive structure, so make sure to clear all fields to not pass garbage to the kernel. We were lucky because recent additions added 'out' parameters, still we need to clean our reference implementation, before folks copy/paste it. Fixes: c8856c051454 ("tcp-zerocopy: Return inq along with tcp receive zerocopy.") Fixes: 33946518d493 ("tcp-zerocopy: Return sk_err (if set) along with tcp receive zerocopy.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Arjun Roy <arjunroy@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06Merge branch 'linus' of ↵Linus Torvalds11-34/+69
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes a potential scheduling latency problem for the algorithms used by WireGuard" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: arch/nhpoly1305 - process in explicit 4k chunks crypto: arch/lib - limit simd usage to 4k chunks
2020-05-06Merge tag 'usb-serial-5.7-rc5' of ↵Greg Kroah-Hartman2-2/+3
https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial fixes for 5.7-rc5 Here's a fix adding a missing input sanity check and a new modem device id. Both have been in linux-next with no reported issues. * tag 'usb-serial-5.7-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: qcserial: Add DW5816e support USB: serial: garmin_gps: add sanity checking for data length
2020-05-06tracing/kprobes: Reject new event if loc is NULLMasami Hiramatsu1-0/+6
Reject the new event which has NULL location for kprobes. For kprobes, user must specify at least the location. Link: http://lkml.kernel.org/r/158779376597.6082.1411212055469099461.stgit@devnote2 Cc: Tom Zanussi <zanussi@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Fixes: 2a588dd1d5d6 ("tracing: Add kprobe event command generation functions") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-06tracing/boottime: Fix kprobe event API usageMasami Hiramatsu1-12/+8
Fix boottime kprobe events to use API correctly for multiple events. For example, when we set a multiprobe kprobe events in bootconfig like below, ftrace.event.kprobes.myevent { probes = "vfs_read $arg1 $arg2", "vfs_write $arg1 $arg2" } This cause an error; trace_boot: Failed to add probe: p:kprobes/myevent (null) vfs_read $arg1 $arg2 vfs_write $arg1 $arg2 This shows the 1st argument becomes NULL and multiprobes are merged to 1 probe. Link: http://lkml.kernel.org/r/158779375766.6082.201939936008972838.stgit@devnote2 Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Fixes: 29a154810546 ("tracing: Change trace_boot to use kprobe_event interface") Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-06tracing/kprobes: Fix a double initialization typoMasami Hiramatsu1-1/+1
Fix a typo that resulted in an unnecessary double initialization to addr. Link: http://lkml.kernel.org/r/158779374968.6082.2337484008464939919.stgit@devnote2 Cc: Tom Zanussi <zanussi@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Fixes: c7411a1a126f ("tracing/kprobe: Check whether the non-suffixed symbol is notrace") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-06bootconfig: Fix to remove bootconfig data from initrd while bootMasami Hiramatsu1-17/+52
If there is a bootconfig data in the tail of initrd/initramfs, initrd image sanity check caused an error while decompression stage as follows. [ 0.883882] Unpacking initramfs... [ 2.696429] Initramfs unpacking failed: invalid magic at start of compressed archive This error will be ignored if CONFIG_BLK_DEV_RAM=n, but CONFIG_BLK_DEV_RAM=y the kernel failed to mount rootfs and causes a panic. To fix this issue, shrink down the initrd_end for removing tailing bootconfig data while boot the kernel. Link: http://lkml.kernel.org/r/158788401014.24243.17424755854115077915.stgit@devnote2 Cc: Borislav Petkov <bp@alien8.de> Cc: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org Fixes: 7684b8582c24 ("bootconfig: Load boot config from the tail of initrd") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-06Merge tag 'kvm-s390-master-5.7-3' of ↵Paolo Bonzini1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fix for running nested uner z/VM There are circumstances when running nested under z/VM that would trigger a WARN_ON_ONCE. Remove the WARN_ON_ONCE. Long term we certainly want to make this code more robust and flexible, but just returning instead of WARNING makes guest bootable again.
2020-05-06KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properlyPeter Xu3-0/+3
KVM_CAP_SET_GUEST_DEBUG should be supported for x86 however it's not declared as supported. My wild guess is that userspaces like QEMU are using "#ifdef KVM_CAP_SET_GUEST_DEBUG" to check for the capability instead, but that could be wrong because the compilation host may not be the runtime host. The userspace might still want to keep the old "#ifdef" though to not break the guest debug on old kernels. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20200505154750.126300-1-peterx@redhat.com> [Do the same for PPC and s390. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-06KVM: selftests: Fix build for evmcs.hPeter Xu2-2/+5
I got this error when building kvm selftests: /usr/bin/ld: /home/xz/git/linux/tools/testing/selftests/kvm/libkvm.a(vmx.o):/home/xz/git/linux/tools/testing/selftests/kvm/include/evmcs.h:222: multiple definition of `current_evmcs'; /tmp/cco1G48P.o:/home/xz/git/linux/tools/testing/selftests/kvm/include/evmcs.h:222: first defined here /usr/bin/ld: /home/xz/git/linux/tools/testing/selftests/kvm/libkvm.a(vmx.o):/home/xz/git/linux/tools/testing/selftests/kvm/include/evmcs.h:223: multiple definition of `current_vp_assist'; /tmp/cco1G48P.o:/home/xz/git/linux/tools/testing/selftests/kvm/include/evmcs.h:223: first defined here I think it's because evmcs.h is included both in a test file and a lib file so the structs have multiple declarations when linking. After all it's not a good habit to declare structs in the header files. Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20200504220607.99627-1-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-06kvm: x86: Use KVM CPU capabilities to determine CR4 reserved bitsPaolo Bonzini1-15/+5
Using CPUID data can be useful for the processor compatibility check, but that's it. Using it to compute guest-reserved bits can have both false positives (such as LA57 and UMIP which we are already handling) and false negatives: in particular, with this patch we don't allow anymore a KVM guest to set CR4.PKE when CR4.PKE is clear on the host. Fixes: b9dd21e104bc ("KVM: x86: simplify handling of PKRU") Reported-by: Jim Mattson <jmattson@google.com> Tested-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-06KVM: VMX: Explicitly clear RFLAGS.CF and RFLAGS.ZF in VM-Exit RSB pathSean Christopherson1-0/+3
Clear CF and ZF in the VM-Exit path after doing __FILL_RETURN_BUFFER so that KVM doesn't interpret clobbered RFLAGS as a VM-Fail. Filling the RSB has always clobbered RFLAGS, its current incarnation just happens clear CF and ZF in the processs. Relying on the macro to clear CF and ZF is extremely fragile, e.g. commit 089dd8e53126e ("x86/speculation: Change FILL_RETURN_BUFFER to work with objtool") tweaks the loop such that the ZF flag is always set. Reported-by: Qian Cai <cai@lca.pw> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: stable@vger.kernel.org Fixes: f2fde6a5bcfcf ("KVM: VMX: Move RSB stuffing to before the first RET after VM-Exit") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200506035355.2242-1-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-06docs/virt/kvm: Document configuring and running nested guestsKashyap Chamarthy2-0/+278
This is a rewrite of this[1] Wiki page with further enhancements. The doc also includes a section on debugging problems in nested environments, among other improvements. [1] https://www.linux-kvm.org/page/Nested_Guests Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com> Message-Id: <20200505112839.30534-1-kchamart@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-05RISC-V: Remove unused code from STRICT_KERNEL_RWXAtish Patra2-24/+0
This patch removes the unused functions set_kernel_text_rw/ro. Currently, it is not being invoked from anywhere and no other architecture (except arm) uses this code. Even in ARM, these functions are not invoked from anywhere currently. Fixes: d27c3c90817e ("riscv: add STRICT_KERNEL_RWX support") Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-05-05Merge tag 'platform-drivers-x86-v5.7-2' of ↵Linus Torvalds7-27/+35
git://git.infradead.org/linux-platform-drivers-x86 Pull x86 platform driver fixes from Andy Shevchenko: - Avoid loading asus-nb-wmi module on selected laptop models - Fix S0ix debug support for Jasper Lake PMC - Few fixes which have been reported by Hulk bot and others * tag 'platform-drivers-x86-v5.7-2' of git://git.infradead.org/linux-platform-drivers-x86: platform/x86: thinkpad_acpi: Remove always false 'value < 0' statement platform/x86: intel_pmc_core: avoid unused-function warnings platform/x86: asus-nb-wmi: Do not load on Asus T100TA and T200TA platform/x86: intel_pmc_core: Change Jasper Lake S0ix debug reg map back to ICL platform/x86/intel-uncore-freq: make uncore_root_kobj static platform/x86: wmi: Make two functions static platform/x86: surface3_power: Fix a NULL vs IS_ERR() check in probe
2020-05-05neigh: send protocol value in neighbor create notificationRoman Mashak1-3/+3
When a new neighbor entry has been added, event is generated but it does not include protocol, because its value is assigned after the event notification routine has run, so move protocol assignment code earlier. Fixes: df9b0e30d44c ("neighbor: Add protocol attribute") Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-05drm/amd/display: Prevent dpcd reads with passive donglesAurabindo Pillai1-6/+11
[why] During hotplug, a DP port may be connected to the sink through passive adapter which does not support DPCD reads. Issuing reads without checking for this condition will result in errors [how] Ensure the link is in aux_mode before initiating operation that result in a DPCD read. Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Harry Wentland <Harry.Wentland@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-05drm/amd/display: fix counter in wait_for_no_pipes_pendingRoman Li1-3/+2
[Why] Wait counter is not being reset for each pipe. [How] Move counter reset into pipe loop scope. Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Zhan Liu <Zhan.Liu@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-05drm/amd/display: Update DCN2.1 DV Code RevisionSung Lee1-4/+4
[WHY & HOW] There is a problem in hscale_pixel_rate, the bug causes DCN to be more optimistic (more likely to underflow) in upscale cases during prefetch. This commit ports the fix from DV code to address these issues. Signed-off-by: Sung Lee <sung.lee@amd.com> Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-05io_uring: handle -EFAULT properly in io_uring_setup()Xiaoguang Wang1-13/+11
If copy_to_user() in io_uring_setup() failed, we'll leak many kernel resources, which will be recycled until process terminates. This bug can be reproduced by using mprotect to set params to PROT_READ. To fix this issue, refactor io_uring_create() a bit to add a new 'struct io_uring_params __user *params' parameter and move the copy_to_user() in io_uring_setup() to io_uring_setup(), if copy_to_user() failed, we can free kernel resource properly. Suggested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-05net: broadcom: fix a mistake about ioremap resourceDejin Zheng1-9/+15
Commit d7a5502b0bb8b ("net: broadcom: convert to devm_platform_ioremap_resource_byname()") will broke this driver. idm_base and nicpm_base were optional, after this change, they are mandatory. it will probe fails with -22 when the dtb doesn't have them defined. so revert part of this commit and make idm_base and nicpm_base as optional. Fixes: d7a5502b0bb8bde ("net: broadcom: convert to devm_platform_ioremap_resource_byname()") Reported-by: Jonathan Richardson <jonathan.richardson@broadcom.com> Cc: Scott Branden <scott.branden@broadcom.com> Cc: Ray Jui <ray.jui@broadcom.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-05drm: Fix HDCP failures when SRM fw is missingSean Paul1-1/+7
The SRM cleanup in 79643fddd6eb2 ("drm/hdcp: optimizing the srm handling") inadvertently altered the behavior of HDCP auth when the SRM firmware is missing. Before that patch, missing SRM was interpreted as the device having no revoked keys. With that patch, if the SRM fw file is missing we reject _all_ keys. This patch fixes that regression by returning success if the file cannot be found. It also checks the return value from request_srm such that we won't end up trying to parse the ksv list if there is an error fetching it. Fixes: 79643fddd6eb ("drm/hdcp: optimizing the srm handling") Cc: stable@vger.kernel.org Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Sean Paul <sean@poorly.run> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Reviewed-by: Ramalingam C <ramalingam.c@intel.com> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200414190258.38873-1-sean@poorly.run Changes in v2: -Noticed a couple other things to clean up Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
2020-05-05bus: mhi: core: Fix channel device name conflictJeffrey Hugo1-1/+2
When multiple instances of the same MHI product are present in a system, we can see a splat from mhi_create_devices() - "sysfs: cannot create duplicate filename". This is because the device names assigned to the MHI channel devices are non-unique. They consist of the channel's name, and the channel's pipe id. For identical products, each instance is going to have the same set of channel (both in name and pipe id). To fix this, we prepend the device name of the parent device that the MHI channels belong to. Since different instances of the same product should have unique device names, this makes the MHI channel devices for each product also unique. Additionally, remove the pipe id from the MHI channel device name. This is an internal detail to the MHI product that provides little value, and imposes too much device specific internal details to userspace. It is expected that channel with a specific name (ie "SAHARA") has a specific client, and it does not matter what pipe id that channel is enumerated on. The pipe id is an internal detail between the MHI bus, and the hardware. The client is not expected to make decisions based on the pipe id, and to do so would require the client to have intimate knowledge of the hardware, which is inappropiate as it may violate the layering provided by the MHI bus. The limitation of doing this is that each product may only have one instance of a channel by a unique name. This limitation is appropriate given the usecases of MHI channels. Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org> Reviewed-by: Hemant Kumar <hemantk@codeaurora.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20200430190555.32741-7-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05bus: mhi: core: Fix typo in commentJeffrey Hugo1-1/+1
There is a typo - "runtimet" should be "runtime". Fix it. Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org> Reviewed-by: Hemant Kumar <hemantk@codeaurora.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20200430190555.32741-6-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05bus: mhi: core: Offload register accesses to the controllerJeffrey Hugo4-14/+10
When reading or writing MHI registers, the core assumes that the physical link is a memory mapped PCI link. This assumption may not hold for all MHI devices. The controller knows what is the physical link (ie PCI, I2C, SPI, etc), and therefore knows the proper methods to access that link. The controller can also handle link specific error scenarios, such as reading -1 when the PCI link went down. Therefore, it is appropriate that the MHI core requests the controller to make register accesses on behalf of the core, which abstracts the core from link specifics, and end up removing an unnecessary assumption. Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org> Reviewed-by: Hemant Kumar <hemantk@codeaurora.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20200430190555.32741-5-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05bus: mhi: core: Remove link_status() callbackJeffrey Hugo3-9/+4
If the MHI core detects invalid data due to a PCI read, it calls into the controller via link_status() to double check that the link is infact down. All in all, this is pretty pointless, and racy. There are no good reasons for this, and only drawbacks. Its pointless because chances are, the controller is going to do the same thing to determine if the link is down - attempt a PCI access and compare the result. This does not make the link status decision any smarter. Its racy because its possible that the link was down at the time of the MHI core access, but then recovered before the controller access. In this case, the controller will indicate the link is not down, and the MHI core will precede to use a bad value as the MHI core does not attempt to retry the access. Retrying the access in the MHI core is a bad idea because again, it is racy - what if the link is down again? Furthermore, there may be some higher level state associated with the link status, that is now invalid because the link went down. The only reason why the MHI core could see "invalid" data when doing a PCI access, that is actually valid, is if the register actually contained the PCI spec defined sentinel for an invalid access. In this case, it is arguable that the MHI implementation broken, and should be fixed, not worked around. Therefore, remove the link_status() callback before anyone attempts to implement it. Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Hemant Kumar <hemantk@codeaurora.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20200430190555.32741-4-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05bus: mhi: core: Make sure to powerdown if mhi_sync_power_up failsJeffrey Hugo1-1/+5
Powerdown is necessary if mhi_sync_power_up fails due to a timeout, to clean up the resources. Otherwise a BUG could be triggered when attempting to clean up MSIs because the IRQ is still active from a request_irq(). Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org> Reviewed-by: Hemant Kumar <hemantk@codeaurora.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20200430190555.32741-3-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05bus: mhi: Fix parsing of mhi_flagsManivannan Sadhasivam1-3/+3
With the current parsing of mhi_flags, the following statement always return false: eob = !!(flags & MHI_EOB); This is due to the fact that 'enum mhi_flags' starts with index 0 and we are using direct AND operation to extract each bit. Fix this by using BIT() macros for defining the flags so that the reset of the code need not be touched. Fixes: 189ff97cca53 ("bus: mhi: core: Add support for data transfer") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20200430190555.32741-2-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05mei: me: disable mei interface on LBG servers.Tomas Winkler3-1/+13
Disable the MEI driver on LBG SPS (server) platforms, some corner flows such as recovery mode does not work, and the driver doesn't have working use cases. Cc: <stable@vger.kernel.org> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Link: https://lore.kernel.org/r/20200428211200.12200-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05platform/x86: thinkpad_acpi: Remove always false 'value < 0' statementXiongfeng Wang1-1/+1
Since 'value' is declared as unsigned long, the following statement is always false. value < 0 So let's remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-05-05platform/x86: intel_pmc_core: avoid unused-function warningsArnd Bergmann2-18/+2
When both CONFIG_DEBUG_FS and CONFIG_PM_SLEEP are disabled, the functions that got moved out of the #ifdef section now cause a warning: drivers/platform/x86/intel_pmc_core.c:654:13: error: 'pmc_core_lpm_display' defined but not used [-Werror=unused-function] 654 | static void pmc_core_lpm_display(struct pmc_dev *pmcdev, struct device *dev, | ^~~~~~~~~~~~~~~~~~~~ drivers/platform/x86/intel_pmc_core.c:617:13: error: 'pmc_core_slps0_display' defined but not used [-Werror=unused-function] 617 | static void pmc_core_slps0_display(struct pmc_dev *pmcdev, struct device *dev, | ^~~~~~~~~~~~~~~~~~~~~~ Rather than add even more #ifdefs here, remove them entirely and let the compiler work it out, it can actually get rid of all the debugfs calls without problems as long as the struct member is there. The two PM functions just need a __maybe_unused annotations to avoid another warning instead of the #ifdef. Fixes: aae43c2bcdc1 ("platform/x86: intel_pmc_core: Relocate pmc_core_*_display() to outside of CONFIG_DEBUG_FS") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-05-05platform/x86: asus-nb-wmi: Do not load on Asus T100TA and T200TAHans de Goede1-0/+24
asus-nb-wmi does not add any extra functionality on these Asus Transformer books. They have detachable keyboards, so the hotkeys are send through a HID device (and handled by the hid-asus driver) and also the rfkill functionality is not used on these devices. Besides not adding any extra functionality, initializing the WMI interface on these devices actually has a negative side-effect. For some reason the \_SB.ATKD.INIT() function which asus_wmi_platform_init() calls drives GPO2 (INT33FC:02) pin 8, which is connected to the front facing webcam LED, high and there is no (WMI or other) interface to drive this low again causing the LED to be permanently on, even during suspend. This commit adds a blacklist of DMI system_ids on which not to load the asus-nb-wmi and adds these Transformer books to this list. This fixes the webcam LED being permanently on under Linux. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-05-05platform/x86: intel_pmc_core: Change Jasper Lake S0ix debug reg map back to ICLArchana Patni1-3/+3
Jasper Lake uses Icelake PCH IPs and the S0ix debug interfaces are same as Icelake. It uses SLP_S0_DBG register latch/read interface from Icelake generation. It doesn't use Tiger Lake LPM debug registers. Change the Jasper Lake S0ix debug interface to use the ICL reg map. Fixes: 16292bed9c56 ("platform/x86: intel_pmc_core: Add Atom based Jasper Lake (JSL) platform support") Signed-off-by: Archana Patni <archana.patni@intel.com> Acked-by: David E. Box <david.e.box@intel.com> Tested-by: Divagar Mohandass <divagar.mohandass@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-05-05usb: typec: mux: intel: Handle alt mode HPD_HIGHPrashant Malani1-0/+4
According to the PMC Type C Subsystem (TCSS) Mux programming guide rev 0.6, when a device is transitioning to DP Alternate Mode state, if the HPD_STATE (bit 7) field in the status update command VDO is set to HPD_HIGH, the HPD_HIGH field in the Alternate Mode request “mode_data” field (bit 14) should also be set. Ensure the bit is correctly handled while issuing the Alternate Mode request. Signed-off-by: Prashant Malani <pmalani@chromium.org> Fixes: 6701adfa9693 ("usb: typec: driver for Intel PMC mux control") Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20200429054432.134178-1-pmalani@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05usb: usbfs: correct kernel->user page attribute mismatchJeremy Linton1-3/+2
On some architectures (e.g. arm64) requests for IO coherent memory may use non-cachable attributes if the relevant device isn't cache coherent. If these pages are then remapped into userspace as cacheable, they may not be coherent with the non-cacheable mappings. In particular this happens with libusb, when it attempts to create zero-copy buffers for use by rtl-sdr (https://github.com/osmocom/rtl-sdr/). On low end arm devices with non-coherent USB ports, the application will be unexpectedly killed, while continuing to work fine on arm machines with coherent USB controllers. This bug has been discovered/reported a few times over the last few years. In the case of rtl-sdr a compile time option to enable/disable zero copy was implemented to work around it. Rather than relaying on application specific workarounds, dma_mmap_coherent() can be used instead of remap_pfn_range(). The page cache/etc attributes will then be correctly set in userspace to match the kernel mapping. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200504201348.1183246-1-jeremy.linton@arm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05usb: typec: intel_pmc_mux: Fix the property namesHeikki Krogerus1-2/+2
The device property names for the port index number are "usb2-port-number" and "usb3-port-number", not "usb2-port" and "usb3-port". Fixes: 6701adfa9693 ("usb: typec: driver for Intel PMC mux control") Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20200430135657.45169-1-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05USB: core: Fix misleading driver bug reportAlan Stern1-2/+2
The syzbot fuzzer found a race between URB submission to endpoint 0 and device reset. Namely, during the reset we call usb_ep0_reinit() because the characteristics of ep0 may have changed (if the reset follows a firmware update, for example). While usb_ep0_reinit() is running there is a brief period during which the pointers stored in udev->ep_in[0] and udev->ep_out[0] are set to NULL, and if an URB is submitted to ep0 during that period, usb_urb_ep_type_check() will report it as a driver bug. In the absence of those pointers, the routine thinks that the endpoint doesn't exist. The log message looks like this: ------------[ cut here ]------------ usb 2-1: BOGUS urb xfer, pipe 2 != type 2 WARNING: CPU: 0 PID: 9241 at drivers/usb/core/urb.c:478 usb_submit_urb+0x1188/0x1460 drivers/usb/core/urb.c:478 Now, although submitting an URB while the device is being reset is a questionable thing to do, it shouldn't count as a driver bug as severe as submitting an URB for an endpoint that doesn't exist. Indeed, endpoint 0 always exists, even while the device is in its unconfigured state. To prevent these misleading driver bug reports, this patch updates usb_disable_endpoint() to avoid clearing the ep_in[] and ep_out[] pointers when the endpoint being disabled is ep0. There's no danger of leaving a stale pointer in place, because the usb_host_endpoint structure being pointed to is stored permanently in udev->ep0; it doesn't get deallocated until the entire usb_device structure does. Reported-and-tested-by: syzbot+db339689b2101f6f6071@syzkaller.appspotmail.com Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.2005011558590.903-100000@netrider.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05staging: gasket: Check the return value of gasket_get_bar_index()Oscar Carter1-0/+4
Check the return value of gasket_get_bar_index function as it can return a negative one (-EINVAL). If this happens, a negative index is used in the "gasket_dev->bar_data" array. Addresses-Coverity-ID: 1438542 ("Negative array index read") Fixes: 9a69f5087ccc2 ("drivers/staging: Gasket driver framework + Apex driver") Signed-off-by: Oscar Carter <oscar.carter@gmx.com> Cc: stable <stable@vger.kernel.org> Reviewed-by: Richard Yeh <rcy@google.com> Link: https://lore.kernel.org/r/20200501155118.13380-1-oscar.carter@gmx.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05staging: ks7010: remove me from CC listWolfram Sang1-1/+0
I lost interest in this driver years ago because I could't keep up with testing the incoming janitorial patches. So, drop me from CC. Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Link: https://lore.kernel.org/r/20200502112648.6942-1-wsa@the-dreams.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05KVM: s390: Remove false WARN_ON_ONCE for the PQAP instructionChristian Borntraeger1-1/+3
In LPAR we will only get an intercept for FC==3 for the PQAP instruction. Running nested under z/VM can result in other intercepts as well as ECA_APIE is an effective bit: If one hypervisor layer has turned this bit off, the end result will be that we will get intercepts for all function codes. Usually the first one will be a query like PQAP(QCI). So the WARN_ON_ONCE is not right. Let us simply remove it. Cc: Pierre Morel <pmorel@linux.ibm.com> Cc: Tony Krowiak <akrowiak@linux.ibm.com> Cc: stable@vger.kernel.org # v5.3+ Fixes: e5282de93105 ("s390: ap: kvm: add PQAP interception for AQIC") Link: https://lore.kernel.org/kvm/20200505083515.2720-1-borntraeger@de.ibm.com Reported-by: Qian Cai <cailca@icloud.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-05-04Merge branch 'for-linus' of ↵Linus Torvalds11-76/+74
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - Wacom driver functional and regression fixes from Jason Gerecke - race condition fix in usbhid, found by syzbot and fixed by Alan Stern - a few device-specific quirks and ID additions * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: quirks: Add HID_QUIRK_NO_INIT_REPORTS quirk for Dell K12A keyboard-dock HID: mcp2221: add gpiolib dependency HID: i2c-hid: reset Synaptics SYNA2393 on resume HID: wacom: Report 2nd-gen Intuos Pro S center button status over BT HID: usbhid: Fix race between usbhid_close() and usbhid_stop() Revert "HID: wacom: generic: read the number of expected touches on a per collection basis" HID: alps: ALPS_1657 is too specific; use U1_UNICORN_LEGACY instead HID: alps: Add AUI1657 device ID HID: logitech: Add support for Logitech G11 extra keys HID: multitouch: add eGalaxTouch P80H84 support HID: wacom: Read HID_DG_CONTACTMAX directly for non-generic devices
2020-05-04riscv: force __cpu_up_ variables to put in data sectionZong Li1-2/+2
Put __cpu_up_stack_pointer and __cpu_up_task_pointer in data section. Currently, these two variables are put in bss section, there is a potential risk that secondary harts get the uninitialized value before main hart finishing the bss clearing. In this case, all secondary harts would pass the waiting loop and enable the MMU before main hart set up the page table. This issue happens on random booting of multiple harts, which means it will manifest for BBL and OpenSBI v0.6 (or older version). In OpenSBI v0.7 (or higher version), we have HSM extension so all the secondary harts are brought-up by Linux kernel in an orderly fashion. This means we don't need this change for OpenSBI v0.7 (or higher version). Signed-off-by: Zong Li <zong.li@sifive.com> Reviewed-by: Greentime Hu <greentime.hu@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-05-04riscv: add Linux note to vdsoAndreas Schwab2-1/+13
The Linux note in the vdso allows glibc to check the running kernel version without having to issue the uname syscall. Signed-off-by: Andreas Schwab <schwab@suse.de> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-05-04riscv: set max_pfn to the PFN of the last pageVincent Chen1-1/+2
The current max_pfn equals to zero. In this case, I found it caused users cannot get some page information through /proc such as kpagecount in v5.6 kernel because of new sanity checks. The following message is displayed by stress-ng test suite with the command "stress-ng --verbose --physpage 1 -t 1" on HiFive unleashed board. # stress-ng --verbose --physpage 1 -t 1 stress-ng: debug: [109] 4 processors online, 4 processors configured stress-ng: info: [109] dispatching hogs: 1 physpage stress-ng: debug: [109] cache allocate: reducing cache level from L3 (too high) to L0 stress-ng: debug: [109] get_cpu_cache: invalid cache_level: 0 stress-ng: info: [109] cache allocate: using built-in defaults as no suitable cache found stress-ng: debug: [109] cache allocate: default cache size: 2048K stress-ng: debug: [109] starting stressors stress-ng: debug: [109] 1 stressor spawned stress-ng: debug: [110] stress-ng-physpage: started [110] (instance 0) stress-ng: error: [110] stress-ng-physpage: cannot read page count for address 0x3fd34de000 in /proc/kpagecount, errno=0 (Success) stress-ng: error: [110] stress-ng-physpage: cannot read page count for address 0x3fd32db078 in /proc/kpagecount, errno=0 (Success) ... stress-ng: error: [110] stress-ng-physpage: cannot read page count for address 0x3fd32db078 in /proc/kpagecount, errno=0 (Success) stress-ng: debug: [110] stress-ng-physpage: exited [110] (instance 0) stress-ng: debug: [109] process [110] terminated stress-ng: info: [109] successful run completed in 1.00s # After applying this patch, the kernel can pass the test. # stress-ng --verbose --physpage 1 -t 1 stress-ng: debug: [104] 4 processors online, 4 processors configured stress-ng: info: [104] dispatching hogs: 1 physpage stress-ng: info: [104] cache allocate: using defaults, can't determine cache details from sysfs stress-ng: debug: [104] cache allocate: default cache size: 2048K stress-ng: debug: [104] starting stressors stress-ng: debug: [104] 1 stressor spawned stress-ng: debug: [105] stress-ng-physpage: started [105] (instance 0) stress-ng: debug: [105] stress-ng-physpage: exited [105] (instance 0) stress-ng: debug: [104] process [105] terminated stress-ng: info: [104] successful run completed in 1.01s # Cc: stable@vger.kernel.org Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Yash Shah <yash.shah@sifive.com> Tested-by: Yash Shah <yash.shah@sifive.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-05-04RISC-V: Remove N-extension related definesAnup Patel1-3/+0
The RISC-V N-extension is still in draft state hence remove N-extension related defines from asm/csr.h. Signed-off-by: Anup Patel <anup.patel@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-05-04RISC-V: Add bitmap reprensenting ISA features common across CPUsAnup Patel2-3/+102
This patch adds riscv_isa bitmap which represents Host ISA features common across all Host CPUs. The riscv_isa is not same as elf_hwcap because elf_hwcap will only have ISA features relevant for user-space apps whereas riscv_isa will have ISA features relevant to both kernel and user-space apps. One of the use-case for riscv_isa bitmap is in KVM hypervisor where we will use it to do following operations: 1. Check whether hypervisor extension is available 2. Find ISA features that need to be virtualized (e.g. floating point support, vector extension, etc.) Signed-off-by: Anup Patel <anup.patel@wdc.com> Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Alexander Graf <graf@amazon.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-05-04RISC-V: Export riscv_cpuid_to_hartid_mask() APIAnup Patel1-0/+2
The riscv_cpuid_to_hartid_mask() API should be exported to allow building KVM RISC-V as loadable module. Signed-off-by: Anup Patel <anup.patel@wdc.com> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-05-04nfp: abm: fix a memory leak bugQiushi Wu1-0/+1
In function nfp_abm_vnic_set_mac, pointer nsp is allocated by nfp_nsp_open. But when nfp_nsp_has_hwinfo_lookup fail, the pointer is not released, which can lead to a memory leak bug. Fix this issue by adding nfp_nsp_close(nsp) in the error path. Fixes: f6e71efdf9fb1 ("nfp: abm: look up MAC addresses via management FW") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04atm: fix a memory leak of vcc->user_backCong Wang1-0/+6
In lec_arp_clear_vccs() only entry->vcc is freed, but vcc could be installed on entry->recv_vcc too in lec_vcc_added(). This fixes the following memory leak: unreferenced object 0xffff8880d9266b90 (size 16): comm "atm2", pid 425, jiffies 4294907980 (age 23.488s) hex dump (first 16 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 6b 6b 6b a5 ............kkk. backtrace: [<(____ptrval____)>] kmem_cache_alloc_trace+0x10e/0x151 [<(____ptrval____)>] lane_ioctl+0x4b3/0x569 [<(____ptrval____)>] do_vcc_ioctl+0x1ea/0x236 [<(____ptrval____)>] svc_ioctl+0x17d/0x198 [<(____ptrval____)>] sock_do_ioctl+0x47/0x12f [<(____ptrval____)>] sock_ioctl+0x2f9/0x322 [<(____ptrval____)>] vfs_ioctl+0x1e/0x2b [<(____ptrval____)>] ksys_ioctl+0x61/0x80 [<(____ptrval____)>] __x64_sys_ioctl+0x16/0x19 [<(____ptrval____)>] do_syscall_64+0x57/0x65 [<(____ptrval____)>] entry_SYSCALL_64_after_hwframe+0x49/0xb3 Cc: Gengming Liu <l.dmxcsnsbh@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04atm: fix a UAF in lec_arp_clear_vccs()Cong Wang1-11/+11
Gengming reported a UAF in lec_arp_clear_vccs(), where we add a vcc socket to an entry in a per-device list but free the socket without removing it from the list when vcc->dev is NULL. We need to call lec_vcc_close() to search and remove those entries contain the vcc being destroyed. This can be done by calling vcc->push(vcc, NULL) unconditionally in vcc_destroy_socket(). Another issue discovered by Gengming's reproducer is the vcc->dev may point to the static device lecatm_dev, for which we don't need to register/unregister device, so we can just check for vcc->dev->ops->owner. Reported-by: Gengming Liu <l.dmxcsnsbh@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04net: stmmac: gmac5+: fix potential integer overflow on 32 bit multiplyColin Ian King1-1/+1
The multiplication of cfg->ctr[1] by 1000000000 is performed using a 32 bit multiplication (since cfg->ctr[1] is a u32) and this can lead to a potential overflow. Fix this by making the constant a ULL to ensure a 64 bit multiply occurs. Fixes: 504723af0d85 ("net: stmmac: Add basic EST support for GMAC5+") Addresses-Coverity: ("Unintentional integer overflow") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04net_sched: fix tcm_parent in tc filter dumpCong Wang2-4/+5
When we tell kernel to dump filters from root (ffff:ffff), those filters on ingress (ffff:0000) are matched, but their true parents must be dumped as they are. However, kernel dumps just whatever we tell it, that is either ffff:ffff or ffff:0000: $ nl-cls-list --dev=dummy0 --parent=root cls basic dev dummy0 id none parent root prio 49152 protocol ip match-all cls basic dev dummy0 id :1 parent root prio 49152 protocol ip match-all $ nl-cls-list --dev=dummy0 --parent=ffff: cls basic dev dummy0 id none parent ffff: prio 49152 protocol ip match-all cls basic dev dummy0 id :1 parent ffff: prio 49152 protocol ip match-all This is confusing and misleading, more importantly this is a regression since 4.15, so the old behavior must be restored. And, when tc filters are installed on a tc class, the parent should be the classid, rather than the qdisc handle. Commit edf6711c9840 ("net: sched: remove classid and q fields from tcf_proto") removed the classid we save for filters, we can just restore this classid in tcf_block. Steps to reproduce this: ip li set dev dummy0 up tc qd add dev dummy0 ingress tc filter add dev dummy0 parent ffff: protocol arp basic action pass tc filter show dev dummy0 root Before this patch: filter protocol arp pref 49152 basic filter protocol arp pref 49152 basic handle 0x1 action order 1: gact action pass random type none pass val 0 index 1 ref 1 bind 1 After this patch: filter parent ffff: protocol arp pref 49152 basic filter parent ffff: protocol arp pref 49152 basic handle 0x1 action order 1: gact action pass random type none pass val 0 index 1 ref 1 bind 1 Fixes: a10fa20101ae ("net: sched: propagate q and parent from caller down to tcf_fill_node") Fixes: edf6711c9840 ("net: sched: remove classid and q fields from tcf_proto") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04cxgb4/chcr: avoid -Wreturn-local-addr warningArnd Bergmann1-37/+46
gcc-10 warns about functions that return a pointer to a stack variable. In chcr_write_cpl_set_tcb_ulp(), this does not actually happen, but it's too hard to see for the compiler: drivers/crypto/chelsio/chcr_ktls.c: In function 'chcr_write_cpl_set_tcb_ulp.constprop': drivers/crypto/chelsio/chcr_ktls.c:760:9: error: function may return address of local variable [-Werror=return-local-addr] 760 | return pos; | ^~~ drivers/crypto/chelsio/chcr_ktls.c:712:5: note: declared here 712 | u8 buf[48] = {0}; | ^~~ Split the middle part of the function out into a helper to make it easier to understand by both humans and compilers, which avoids the warning. Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04s390/qeth: fix cancelling of TX timer on dev_close()Julian Wiedmann1-5/+5
With the introduction of TX coalescing, .ndo_start_xmit now potentially starts the TX completion timer. So only kill the timer _after_ TX has been disabled. Fixes: ee1e52d1e4bb ("s390/qeth: add TX IRQ coalescing support for IQD devices") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04Merge tag 'gcc-plugins-v5.7-rc5' of ↵Linus Torvalds3-3/+7
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull gcc-plugins fixes from Kees Cook: "GCC 10 fixes for gcc-plugins: - Adjust caller of cgraph_create_edge for GCC 10 argument usage - Update common headers to build under GCC 10 (Frédéric Pierret)" * tag 'gcc-plugins-v5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: gcc-common.h: Update for GCC 10 gcc-plugins/stackleak: Avoid assignment for unused macro argument
2020-05-04Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2-8/+83
Pull virtio fixes from Michael Tsirkin: "A couple of bug fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: vsock: kick send_pkt worker once device is started virtio-blk: handle block_device_operations callbacks after hot unplug
2020-05-04net: enetc: fix an issue about leak system resourcesDejin Zheng1-1/+1
the related system resources were not released when enetc_hw_alloc() return error in the enetc_pci_mdio_probe(), add iounmap() for error handling label "err_hw_alloc" to fix it. Fixes: 6517798dd3432a ("enetc: Make MDIO accessors more generic and export to include/linux/fsl") Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04Merge tag 'flexible-array-member-5.7-rc5' of ↵Linus Torvalds8-12/+12
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull flex-array reverts from Gustavo Silva: "This reverts flexible array changes in include/uapi/ These structures can get embedded in other structures in user-space and cause all sorts of warnings and problems[1]. So, we better don't take any chances and keep the zero-length arrays in place for now" [1] https://lore.kernel.org/lkml/20200424121553.GE26002@ziepe.ca/ * tag 'flexible-array-member-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: uapi: revert flexible-array conversions
2020-05-04net/mlx4_core: Fix use of ENOSPC around mlx4_counter_alloc()Tariq Toukan1-1/+3
When ENOSPC is set the idx is still valid and gets set to the global MLX4_SINK_COUNTER_INDEX. However gcc's static analysis cannot tell that ENOSPC is impossible from mlx4_cmd_imm() and gives this warning: drivers/net/ethernet/mellanox/mlx4/main.c:2552:28: warning: 'idx' may be used uninitialized in this function [-Wmaybe-uninitialized] 2552 | priv->def_counter[port] = idx; Also, when ENOSPC is returned mlx4_allocate_default_counters should not fail. Fixes: 6de5f7f6a1fa ("net/mlx4_core: Allocate default counter per port") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04devlink: Fix reporter's recovery conditionAya Levin1-2/+5
Devlink health core conditions the reporter's recovery with the expiration of the grace period. This is not relevant for the first recovery. Explicitly demand that the grace period will only apply to recoveries other than the first. Fixes: c8e1da0bf923 ("devlink: Add health report functionality") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04stmmac: fix pointer check after utilization in stmmac_interruptMaxim Petrov1-6/+1
The paranoidal pointer check in IRQ handler looks very strange - it really protects us only against bogus drivers which request IRQ line with null pointer dev_id. However, the code fragment is incorrect because the dev pointer is used before the actual check which leads to undefined behavior. Remove the check to avoid confusing people with incorrect code. Signed-off-by: Maxim Petrov <mmrmaximuzz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04tipc: fix partial topology connection closureTuong Lien1-2/+3
When an application connects to the TIPC topology server and subscribes to some services, a new connection is created along with some objects - 'tipc_subscription' to store related data correspondingly... However, there is one omission in the connection handling that when the connection or application is orderly shutdown (e.g. via SIGQUIT, etc.), the connection is not closed in kernel, the 'tipc_subscription' objects are not freed too. This results in: - The maximum number of subscriptions (65535) will be reached soon, new subscriptions will be rejected; - TIPC module cannot be removed (unless the objects are somehow forced to release first); The commit fixes the issue by closing the connection if the 'recvmsg()' returns '0' i.e. when the peer is shutdown gracefully. It also includes the other unexpected cases. Acked-by: Jon Maloy <jmaloy@redhat.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04MAINTAINERS: remove myself as ceph co-maintainerSage Weil1-6/+0
Jeff, Ilya, and Dongsheng are doing all of the Ceph maintainance these days. [ idryomov: Remove Sage's git tree too, it hasn't been pushed to in years. ] Signed-off-by: Sage Weil <sage@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-05-04net: dsa: Do not make user port errors fatalFlorian Fainelli1-1/+1
Prior to 1d27732f411d ("net: dsa: setup and teardown ports"), we would not treat failures to set-up an user port as fatal, but after this commit we would, which is a regression for some systems where interfaces may be declared in the Device Tree, but the underlying hardware may not be present (pluggable daughter cards for instance). Fixes: 1d27732f411d ("net: dsa: setup and teardown ports") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04ceph: fix double unlock in handle_cap_export()Wu Bo1-0/+1
If the ceph_mdsc_open_export_target_session() return fails, it will do a "goto retry", but the session mutex has already been unlocked. Re-lock the mutex in that case to ensure that we don't unlock it twice. Signed-off-by: Wu Bo <wubo40@huawei.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-05-04ceph: fix special error code in ceph_try_get_caps()Wu Bo1-1/+1
There are 3 speical error codes: -EAGAIN/-EFBIG/-ESTALE. After calling try_get_cap_refs, ceph_try_get_caps test for the -EAGAIN twice. Ensure that it tests for -ESTALE instead. Signed-off-by: Wu Bo <wubo40@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-05-04ceph: fix endianness bug when handling MDS session feature bitsJeff Layton1-5/+3
Eduard reported a problem mounting cephfs on s390 arch. The feature mask sent by the MDS is little-endian, so we need to convert it before storing and testing against it. Cc: stable@vger.kernel.org Reported-and-Tested-by: Eduard Shishkin <edward6@linux.ibm.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-05-04tty: xilinx_uartps: Fix missing id assignment to the consoleShubhrajyoti Datta1-0/+1
When serial console has been assigned to ttyPS1 (which is serial1 alias) console index is not updated property and pointing to index -1 (statically initialized) which ends up in situation where nothing has been printed on the port. The commit 18cc7ac8a28e ("Revert "serial: uartps: Register own uart console and driver structures"") didn't contain this line which was removed by accident. Fixes: 18cc7ac8a28e ("Revert "serial: uartps: Register own uart console and driver structures"") Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Michal Simek <michal.simek@xilinx.com> Link: https://lore.kernel.org/r/ed3111533ef5bd342ee5ec504812240b870f0853.1588602446.git.michal.simek@xilinx.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-04uapi: revert flexible-array conversionsGustavo A. R. Silva8-12/+12
These structures can get embedded in other structures in user-space and cause all sorts of warnings and problems. So, we better don't take any chances and keep the zero-length arrays in place for now. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
2020-05-04kvm: ioapic: Restrict lazy EOI update to edge-triggered interruptsPaolo Bonzini1-5/+5
Commit f458d039db7e ("kvm: ioapic: Lazy update IOAPIC EOI") introduces the following infinite loop: BUG: stack guard page was hit at 000000008f595917 \ (stack is 00000000bdefe5a4..00000000ae2b06f5) kernel stack overflow (double-fault): 0000 [#1] SMP NOPTI RIP: 0010:kvm_set_irq+0x51/0x160 [kvm] Call Trace: irqfd_resampler_ack+0x32/0x90 [kvm] kvm_notify_acked_irq+0x62/0xd0 [kvm] kvm_ioapic_update_eoi_one.isra.0+0x30/0x120 [kvm] ioapic_set_irq+0x20e/0x240 [kvm] kvm_ioapic_set_irq+0x5c/0x80 [kvm] kvm_set_irq+0xbb/0x160 [kvm] ? kvm_hv_set_sint+0x20/0x20 [kvm] irqfd_resampler_ack+0x32/0x90 [kvm] kvm_notify_acked_irq+0x62/0xd0 [kvm] kvm_ioapic_update_eoi_one.isra.0+0x30/0x120 [kvm] ioapic_set_irq+0x20e/0x240 [kvm] kvm_ioapic_set_irq+0x5c/0x80 [kvm] kvm_set_irq+0xbb/0x160 [kvm] ? kvm_hv_set_sint+0x20/0x20 [kvm] .... The re-entrancy happens because the irq state is the OR of the interrupt state and the resamplefd state. That is, we don't want to show the state as 0 until we've had a chance to set the resamplefd. But if the interrupt has _not_ gone low then ioapic_set_irq is invoked again, causing an infinite loop. This can only happen for a level-triggered interrupt, otherwise irqfd_inject would immediately set the KVM_USERSPACE_IRQ_SOURCE_ID high and then low. Fortunately, in the case of level-triggered interrupts the VMEXIT already happens because TMR is set. Thus, fix the bug by restricting the lazy invocation of the ack notifier to edge-triggered interrupts, the only ones that need it. Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reported-by: borisvk@bstnet.org Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://www.spinics.net/lists/kvm/msg213512.html Fixes: f458d039db7e ("kvm: ioapic: Lazy update IOAPIC EOI") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207489 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-04USB: serial: qcserial: Add DW5816e supportMatt Jolly1-0/+1
Add support for Dell Wireless 5816e to drivers/usb/serial/qcserial.c Signed-off-by: Matt Jolly <Kangie@footclan.ninja> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org>
2020-05-04KVM: x86: Fixes posted interrupt check for IRQs delivery modesSuravee Suthikulpanit1-2/+2
Current logic incorrectly uses the enum ioapic_irq_destination_types to check the posted interrupt destination types. However, the value was set using APIC_DM_XXX macros, which are left-shifted by 8 bits. Fixes by using the APIC_DM_FIXED and APIC_DM_LOWEST instead. Fixes: (fdcf75621375 'KVM: x86: Disable posted interrupts for non-standard IRQs delivery modes') Cc: Alexander Graf <graf@amazon.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Message-Id: <1586239989-58305-1-git-send-email-suravee.suthikulpanit@amd.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-04gcc-10 warnings: fix low-hanging fruitLinus Torvalds3-3/+3
Due to a bug-report that was compiler-dependent, I updated one of my machines to gcc-10. That shows a lot of new warnings. Happily they seem to be mostly the valid kind, but it's going to cause a round of churn for getting rid of them.. This is the really low-hanging fruit of removing a couple of zero-sized arrays in some core code. We have had a round of these patches before, and we'll have many more coming, and there is nothing special about these except that they were particularly trivial, and triggered more warnings than most. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-04Merge tag 'kvmarm-fixes-5.7-2' of ↵Paolo Bonzini7-19/+49
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master KVM/arm fixes for Linux 5.7, take #2 - Fix compilation with Clang - Correctly initialize GICv4.1 in the absence of a virtual ITS - Move SP_EL0 save/restore to the guest entry/exit code - Handle PC wrap around on 32bit guests, and narrow all 32bit registers on userspace access
2020-05-04Merge tag 'kvmarm-fixes-5.7-1' of ↵Paolo Bonzini7-80/+272
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master KVM/arm fixes for Linux 5.7, take #1 - Prevent the userspace API from interacting directly with the HW stage of the virtual GIC - Fix a couple of vGIC memory leaks - Tighten the rules around the use of the 32bit PSCI functions for 64bit guest, as well as the opposite situation (matches the specification)
2020-05-04KVM: SVM: fill in kvm_run->debug.arch.dr[67]Paolo Bonzini1-0/+2
The corresponding code was added for VMX in commit 42dbaa5a057 ("KVM: x86: Virtualize debug registers, 2008-12-15) but never for AMD. Fix this. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-04KVM: nVMX: Replace a BUG_ON(1) with BUG() to squash clang warningSean Christopherson1-1/+1
Use BUG() in the impossible-to-hit default case when switching on the scope of INVEPT to squash a warning with clang 11 due to clang treating the BUG_ON() as conditional. >> arch/x86/kvm/vmx/nested.c:5246:3: warning: variable 'roots_to_free' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] BUG_ON(1); Reported-by: kbuild test robot <lkp@intel.com> Fixes: ce8fe7b77bd8 ("KVM: nVMX: Free only the affected contexts when emulating INVEPT") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200504153506.28898-1-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-04io_uring: fix mismatched finish_wait() calls in io_uring_cancel_files()Xiaoguang Wang1-5/+4
The prepare_to_wait() and finish_wait() calls in io_uring_cancel_files() are mismatched. Currently I don't see any issues related this bug, just find it by learning codes. Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-04sun6i: dsi: fix gcc-4.8Arnd Bergmann1-1/+1
Older compilers warn about initializers with incorrect curly braces: drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c: In function 'sun6i_dsi_encoder_enable': drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:720:8: error: missing braces around initializer [-Werror=missing-braces] union phy_configure_opts opts = { 0 }; ^ drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:720:8: error: (near initialization for 'opts.mipi_dphy') [-Werror=missing-braces] Use the GNU empty initializer extension to avoid this. Fixes: bb3b6fcb6849 ("sun6i: dsi: Convert to generic phy handling") Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20200428215105.3928459-1-arnd@arndb.de
2020-05-04drm: ingenic-drm: add MODULE_DEVICE_TABLEH. Nikolaus Schaller1-0/+1
so that the driver can load by matching the device tree if compiled as module. Cc: stable@vger.kernel.org # v5.3+ Fixes: 90b86fcc47b4 ("DRM: Add KMS driver for the Ingenic JZ47xx SoCs") Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Paul Cercueil <paul@crapouillou.net> Link: https://patchwork.freedesktop.org/patch/msgid/1694a29b7a3449b6b662cec33d1b33f2ee0b174a.1588574111.git.hns@goldelico.com
2020-05-04vt: fix unicode console freeing with a common interfaceNicolas Pitre1-2/+7
By directly using kfree() in different places we risk missing one if it is switched to using vfree(), especially if the corresponding vmalloc() is hidden away within a common abstraction. Oh wait, that's exactly what happened here. So let's fix this by creating a common abstraction for the free case as well. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Reported-by: syzbot+0bfda3ade1ee9288a1be@syzkaller.appspotmail.com Fixes: 9a98e7a80f95 ("vt: don't use kmalloc() for the unicode screen buffer") Cc: <stable@vger.kernel.org> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://lore.kernel.org/r/nycvar.YSQ.7.76.2005021043110.2671@knanqh.ubzr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-04Revert "tty: serial: bcm63xx: fix missing clk_put() in bcm63xx_uart"Florian Fainelli1-3/+1
This reverts commit 580d952e44de5509c69c8f9346180ecaa78ebeec ("tty: serial: bcm63xx: fix missing clk_put() in bcm63xx_uart") because we should not be doing a clk_put() if we were not successful in getting a valid clock reference via clk_get() in the first place. Fixes: 580d952e44de ("tty: serial: bcm63xx: fix missing clk_put() in bcm63xx_uart") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200501013904.1394-1-f.fainelli@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-04HID: quirks: Add HID_QUIRK_NO_INIT_REPORTS quirk for Dell K12A keyboard-dockHans de Goede2-0/+2
Add a HID_QUIRK_NO_INIT_REPORTS quirk for the Dell K12A keyboard-dock, which can be used with various Dell Venue 11 models. Without this quirk the keyboard/touchpad combo works fine when connected at boot, but when hotplugged 9 out of 10 times it will not work properly. Adding the quirk fixes this. Cc: Mario Limonciello <mario.limonciello@dell.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-05-04drm/virtio: create context before RESOURCE_CREATE_2D in 3D modeGurchetan Singh3-2/+5
If 3D is enabled, but userspace requests a dumb buffer, we will call CTX_ATTACH_RESOURCE before actually creating the context. Fixes: 72b48ae800da ("drm/virtio: enqueue virtio_gpu_create_context after the first 3D ioctl") Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Link: http://patchwork.freedesktop.org/patch/msgid/20200501185557.740-1-gurchetansingh@chromium.org Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2020-05-03net: macb: fix an issue about leak related system resourcesDejin Zheng1-9/+3
A call of the function macb_init() can fail in the function fu540_c000_init. The related system resources were not released then. use devm_platform_ioremap_resource() to replace ioremap() to fix it. Fixes: c218ad559020ff9 ("macb: Add support for SiFive FU540-C000") Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Yash Shah <yash.shah@sifive.com> Suggested-by: Nicolas Ferre <nicolas.ferre@microchip.com> Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03net: usb: qmi_wwan: add support for DW5816eMatt Jolly1-0/+1
Add support for Dell Wireless 5816e to drivers/net/usb/qmi_wwan.c Signed-off-by: Matt Jolly <Kangie@footclan.ninja> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03net_sched: sch_skbprio: add message validation to skbprio_change()Eric Dumazet1-0/+3
Do not assume the attribute has the right size. Fixes: aea5f654e6b7 ("net/sched: add skbprio scheduler") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03Linux 5.7-rc4v5.7-rc4Linus Torvalds1-1/+1
2020-05-03Merge tag 'for-5.7-rc3-tag' of ↵Linus Torvalds4-7/+53
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull more btrfs fixes from David Sterba: "A few more stability fixes, minor build warning fixes and git url fixup: - fix partial loss of prealloc extent past i_size after fsync - fix potential deadlock due to wrong transaction handle passing via journal_info - fix gcc 4.8 struct intialization warning - update git URL in MAINTAINERS entry" * tag 'for-5.7-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: MAINTAINERS: btrfs: fix git repo URL btrfs: fix gcc-4.8 build warning for struct initializer btrfs: transaction: Avoid deadlock due to bad initialization timing of fs_info::journal_info btrfs: fix partial loss of prealloc extent past i_size after fsync
2020-05-03Merge tag 'iommu-fixes-v5.7-rc3' of ↵Linus Torvalds5-7/+11
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: - Fix a memory leak when dev_iommu gets freed and a sub-pointer does not - Build dependency fixes for Mediatek, spapr_tce, and Intel IOMMU driver - Export iommu_group_get_for_dev() only for GPLed modules - Fix AMD IOMMU interrupt remapping when x2apic is enabled - Fix error path in the QCOM IOMMU driver probe function * tag 'iommu-fixes-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/qcom: Fix local_base status check iommu: Properly export iommu_group_get_for_dev() iommu/vt-d: Use right Kconfig option name iommu/amd: Fix legacy interrupt remapping for x2APIC-enabled system iommu: spapr_tce: Disable compile testing to fix build on book3s_32 config iommu/mediatek: Fix MTK_IOMMU dependencies iommu: Fix the memory leak in dev_iommu_free()
2020-05-03MAINTAINERS: btrfs: fix git repo URLEric Biggers1-1/+1
The git repo listed for btrfs hasn't been updated in over a year. List the current one instead. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-02Merge tag 'pm-5.7-rc4' of ↵Linus Torvalds3-3/+10
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: - prevent the intel_pstate driver from printing excessive diagnostic messages in some cases (Chris Wilson) - make the hibernation restore kernel freeze kernel threads as well as user space tasks (Dexuan Cui) - fix the ACPI device PM disagnostic messages to include the correct power state name (Kai-Heng Feng). * tag 'pm-5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: ACPI: Output correct message on target power state PM: hibernate: Freeze kernel threads in software_resume() cpufreq: intel_pstate: Only mention the BIOS disabling turbo mode once
2020-05-02Merge branches 'pm-cpufreq' and 'pm-sleep'Rafael J. Wysocki2-1/+8
* pm-cpufreq: cpufreq: intel_pstate: Only mention the BIOS disabling turbo mode once * pm-sleep: PM: hibernate: Freeze kernel threads in software_resume()
2020-05-02Merge tag 'iomap-5.7-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds2-4/+9
Pull iomap fix from Darrick Wong: "Hoist the check for an unrepresentable FIBMAP return value into ioctl_fibmap. The internal kernel function can handle 64-bit values (and is needed to fix a regression on ext4 + jbd2). It is only the userspace ioctl that is so old that it cannot deal" * tag 'iomap-5.7-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: fibmap: Warn and return an error in case of block > INT_MAX
2020-05-02Merge tag 'nfs-for-5.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds10-36/+79
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Stable fixes: - fix handling of backchannel binding in BIND_CONN_TO_SESSION Bugfixes: - Fix a credential use-after-free issue in pnfs_roc() - Fix potential posix_acl refcnt leak in nfs3_set_acl - defer slow parts of rpc_free_client() to a workqueue - Fix an Oopsable race in __nfs_list_for_each_server() - Fix trace point use-after-free race - Regression: the RDMA client no longer responds to server disconnect requests - Fix return values of xdr_stream_encode_item_{present, absent} - _pnfs_return_layout() must always wait for layoutreturn completion Cleanups: - Remove unreachable error conditions" * tag 'nfs-for-5.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: Fix a race in __nfs_list_for_each_server() NFSv4.1: fix handling of backchannel binding in BIND_CONN_TO_SESSION SUNRPC: defer slow parts of rpc_free_client() to a workqueue. NFSv4: Remove unreachable error condition due to rpc_run_task() SUNRPC: Remove unreachable error condition xprtrdma: Fix use of xdr_stream_encode_item_{present, absent} xprtrdma: Fix trace point use-after-free race xprtrdma: Restore wake-up-all to rpcrdma_cm_event_handler() nfs: Fix potential posix_acl refcnt leak in nfs3_set_acl NFS/pnfs: Fix a credential use-after-free issue in pnfs_roc() NFS/pnfs: Ensure that _pnfs_return_layout() waits for layoutreturn completion
2020-05-02Merge tag 'dmaengine-fix-5.7-rc4' of ↵Linus Torvalds10-60/+65
git://git.infradead.org/users/vkoul/slave-dma Pull dmaengine fixes from Vinod Koul: "Core: - Documentation typo fixes - fix the channel indexes - dmatest: fixes for process hang and iterations Drivers: - hisilicon: build error fix without PCI_MSI - ti-k3: deadlock fix - uniphier-xdmac: fix for reg region - pch: fix data race - tegra: fix clock state" * tag 'dmaengine-fix-5.7-rc4' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: dmatest: Fix process hang when reading 'wait' parameter dmaengine: dmatest: Fix iteration non-stop logic dmaengine: tegra-apb: Ensure that clock is enabled during of DMA synchronization dmaengine: fix channel index enumeration dmaengine: mmp_tdma: Reset channel error on release dmaengine: mmp_tdma: Do not ignore slave config validation errors dmaengine: pch_dma.c: Avoid data race between probe and irq handler dt-bindings: dma: uniphier-xdmac: switch to single reg region include/linux/dmaengine: Typos fixes in API documentation dmaengine: xilinx_dma: Add missing check for empty list dmaengine: ti: k3-psil: fix deadlock on error path dmaengine: hisilicon: Fix build error without PCI_MSI
2020-05-02vhost: vsock: kick send_pkt worker once device is startedJia He1-0/+5
Ning Bo reported an abnormal 2-second gap when booting Kata container [1]. The unconditional timeout was caused by VSOCK_DEFAULT_CONNECT_TIMEOUT of connecting from the client side. The vhost vsock client tries to connect an initializing virtio vsock server. The abnormal flow looks like: host-userspace vhost vsock guest vsock ============== =========== ============ connect() --------> vhost_transport_send_pkt_work() initializing | vq->private_data==NULL | will not be queued V schedule_timeout(2s) vhost_vsock_start() <--------- device ready set vq->private_data wait for 2s and failed connect() again vq->private_data!=NULL recv connecting pkt Details: 1. Host userspace sends a connect pkt, at that time, guest vsock is under initializing, hence the vhost_vsock_start has not been called. So vq->private_data==NULL, and the pkt is not been queued to send to guest 2. Then it sleeps for 2s 3. After guest vsock finishes initializing, vq->private_data is set 4. When host userspace wakes up after 2s, send connecting pkt again, everything is fine. As suggested by Stefano Garzarella, this fixes it by additional kicking the send_pkt worker in vhost_vsock_start once the virtio device is started. This makes the pending pkt sent again. After this patch, kata-runtime (with vsock enabled) boot time is reduced from 3s to 1s on a ThunderX2 arm64 server. [1] https://github.com/kata-containers/runtime/issues/1917 Reported-by: Ning Bo <n.b@live.com> Suggested-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jia He <justin.he@arm.com> Link: https://lore.kernel.org/r/20200501043840.186557-1-justin.he@arm.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2020-05-02virtio-blk: handle block_device_operations callbacks after hot unplugStefan Hajnoczi1-8/+78
A userspace process holding a file descriptor to a virtio_blk device can still invoke block_device_operations after hot unplug. This leads to a use-after-free accessing vblk->vdev in virtblk_getgeo() when ioctl(HDIO_GETGEO) is invoked: BUG: unable to handle kernel NULL pointer dereference at 0000000000000090 IP: [<ffffffffc00e5450>] virtio_check_driver_offered_feature+0x10/0x90 [virtio] PGD 800000003a92f067 PUD 3a930067 PMD 0 Oops: 0000 [#1] SMP CPU: 0 PID: 1310 Comm: hdio-getgeo Tainted: G OE ------------ 3.10.0-1062.el7.x86_64 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 task: ffff9be5fbfb8000 ti: ffff9be5fa890000 task.ti: ffff9be5fa890000 RIP: 0010:[<ffffffffc00e5450>] [<ffffffffc00e5450>] virtio_check_driver_offered_feature+0x10/0x90 [virtio] RSP: 0018:ffff9be5fa893dc8 EFLAGS: 00010246 RAX: ffff9be5fc3f3400 RBX: ffff9be5fa893e30 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff9be5fbc10b40 RBP: ffff9be5fa893dc8 R08: 0000000000000301 R09: 0000000000000301 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9be5fdc24680 R13: ffff9be5fbc10b40 R14: ffff9be5fbc10480 R15: 0000000000000000 FS: 00007f1bfb968740(0000) GS:ffff9be5ffc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000090 CR3: 000000003a894000 CR4: 0000000000360ff0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: [<ffffffffc016ac37>] virtblk_getgeo+0x47/0x110 [virtio_blk] [<ffffffff8d3f200d>] ? handle_mm_fault+0x39d/0x9b0 [<ffffffff8d561265>] blkdev_ioctl+0x1f5/0xa20 [<ffffffff8d488771>] block_ioctl+0x41/0x50 [<ffffffff8d45d9e0>] do_vfs_ioctl+0x3a0/0x5a0 [<ffffffff8d45dc81>] SyS_ioctl+0xa1/0xc0 A related problem is that virtblk_remove() leaks the vd_index_ida index when something still holds a reference to vblk->disk during hot unplug. This causes virtio-blk device names to be lost (vda, vdb, etc). Fix these issues by protecting vblk->vdev with a mutex and reference counting vblk so the vd_index_ida index can be removed in all cases. Fixes: 48e4043d4529 ("virtio: add virtio disk geometry feature") Reported-by: Lance Digby <ldigby@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://lore.kernel.org/r/20200430140442.171016-1-stefanha@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2020-05-01Merge tag 'vfio-v5.7-rc4' of git://github.com/awilliam/linux-vfioLinus Torvalds1-5/+5
Pull VFIO fixes from Alex Williamson: - copy_*_user validity check for new vfio_dma_rw interface (Yan Zhao) - Fix a potential math overflow (Yan Zhao) - Use follow_pfn() for calculating PFNMAPs (Sean Christopherson) * tag 'vfio-v5.7-rc4' of git://github.com/awilliam/linux-vfio: vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn() vfio: avoid possible overflow in vfio_iommu_type1_pin_pages vfio: checking of validity of user vaddr in vfio_dma_rw
2020-05-01Merge tag 'arm64-fixes' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: "Add -fasynchronous-unwind-tables to the vDSO CFLAGS" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: vdso: Add -fasynchronous-unwind-tables to cflags
2020-05-01Merge tag 'io_uring-5.7-2020-05-01' of git://git.kernel.dk/linux-blockLinus Torvalds1-27/+31
Pull io_uring fixes from Jens Axboe: - Fix for statx not grabbing the file table, making AT_EMPTY_PATH fail - Cover a few cases where async poll can handle retry, eliminating the need for an async thread - fallback request busy/free fix (Bijan) - syzbot reported SQPOLL thread exit fix for non-preempt (Xiaoguang) - Fix extra put of req for sync_file_range (Pavel) - Always punt splice async. We'll improve this for 5.8, but wanted to eliminate the inode mutex lock from the non-blocking path for 5.7 (Pavel) * tag 'io_uring-5.7-2020-05-01' of git://git.kernel.dk/linux-block: io_uring: punt splice async because of inode mutex io_uring: check non-sync defer_list carefully io_uring: fix extra put in sync_file_range() io_uring: use cond_resched() in io_ring_ctx_wait_and_kill() io_uring: use proper references for fallback_req locking io_uring: only force async punt if poll based retry can't handle it io_uring: enable poll retry for any file with ->read_iter / ->write_iter io_uring: statx must grab the file table for valid fd
2020-05-01drop_monitor: work around gcc-10 stringop-overflow warningArnd Bergmann1-4/+7
The current gcc-10 snapshot produces a false-positive warning: net/core/drop_monitor.c: In function 'trace_drop_common.constprop': cc1: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=] In file included from net/core/drop_monitor.c:23: include/uapi/linux/net_dropmon.h:36:8: note: at offset 0 to object 'entries' with size 4 declared here 36 | __u32 entries; | ^~~~~~~ I reported this in the gcc bugzilla, but in case it does not get fixed in the release, work around it by using a temporary variable. Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol") Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94881 Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01gtp: set NLM_F_MULTI flag in gtp_genl_dump_pdp()Yoshiyuki Kurauchi1-4/+5
In drivers/net/gtp.c, gtp_genl_dump_pdp() should set NLM_F_MULTI flag since it returns multipart message. This patch adds a new arg "flags" in gtp_genl_fill_info() so that flags can be set by the callers. Signed-off-by: Yoshiyuki Kurauchi <ahochauwaaaaa@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01cxgb4: Add missing annotation for service_ofldq()Jules Irenge1-0/+1
Sparse reports a warning at service_ofldq() warning: context imbalance in service_ofldq() - unexpected unlock The root cause is the missing annotation at service_ofldq() Add the missing __must_hold(&q->sendq.lock) annotation Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01ice: cleanup language in ice.rst for fw.appJacob Keller1-2/+2
The documentation for the ice driver around "fw.app" has a spelling mistake in variation. Additionally, the language of "shall have a unique name" sounds like a requirement. Reword this to read more like a description or property. Reported-by: Benjamin Fisher <benjamin.l.fisher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kubakici@wp.pl> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01net: Make PTP-specific drivers depend on PTP_1588_CLOCKClay McClure5-5/+5
Commit d1cbfd771ce8 ("ptp_clock: Allow for it to be optional") changed all PTP-capable Ethernet drivers from `select PTP_1588_CLOCK` to `imply PTP_1588_CLOCK`, "in order to break the hard dependency between the PTP clock subsystem and ethernet drivers capable of being clock providers." As a result it is possible to build PTP-capable Ethernet drivers without the PTP subsystem by deselecting PTP_1588_CLOCK. Drivers are required to handle the missing dependency gracefully. Some PTP-capable Ethernet drivers (e.g., TI_CPSW) factor their PTP code out into separate drivers (e.g., TI_CPTS_MOD). The above commit also changed these PTP-specific drivers to `imply PTP_1588_CLOCK`, making it possible to build them without the PTP subsystem. But as Grygorii Strashko noted in [1]: On Wed, Apr 22, 2020 at 02:16:11PM +0300, Grygorii Strashko wrote: > Another question is that CPTS completely nonfunctional in this case and > it was never expected that somebody will even try to use/run such > configuration (except for random build purposes). In my view, enabling a PTP-specific driver without the PTP subsystem is a configuration error made possible by the above commit. Kconfig should not allow users to create a configuration with missing dependencies that results in "completely nonfunctional" drivers. I audited all network drivers that call ptp_clock_register() but merely `imply PTP_1588_CLOCK` and found five PTP-specific drivers that are likely nonfunctional without PTP_1588_CLOCK: NET_DSA_MV88E6XXX_PTP NET_DSA_SJA1105_PTP MACB_USE_HWSTAMP CAVIUM_PTP TI_CPTS_MOD Note how these symbols all reference PTP or timestamping in their name; this is a clue that they depend on PTP_1588_CLOCK. Change them from `imply PTP_1588_CLOCK` [2] to `depends on PTP_1588_CLOCK`. I'm not using `select PTP_1588_CLOCK` here because PTP_1588_CLOCK has its own dependencies, which `select` would not transitively apply. Additionally, remove the `select NET_PTP_CLASSIFY` from CPTS_TI_MOD; PTP_1588_CLOCK already selects that. [1]: https://lore.kernel.org/lkml/c04458ed-29ee-1797-3a11-7f3f560553e6@ti.com/ [2]: NET_DSA_SJA1105_PTP had never declared any type of dependency on PTP_1588_CLOCK (`imply` or otherwise); adding a `depends on PTP_1588_CLOCK` here seems appropriate. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: d1cbfd771ce8 ("ptp_clock: Allow for it to be optional") Signed-off-by: Clay McClure <clay@daemons.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01devlink: fix return value after hitting end in region readJakub Kicinski1-0/+5
Commit d5b90e99e1d5 ("devlink: report 0 after hitting end in region read") fixed region dump, but region read still returns a spurious error: $ devlink region read netdevsim/netdevsim1/dummy snapshot 0 addr 0 len 128 0000000000000000 a6 f4 c4 1c 21 35 95 a6 9d 34 c3 5b 87 5b 35 79 0000000000000010 f3 a0 d7 ee 4f 2f 82 7f c6 dd c4 f6 a5 c3 1b ae 0000000000000020 a4 fd c8 62 07 59 48 03 70 3b c7 09 86 88 7f 68 0000000000000030 6f 45 5d 6d 7d 0e 16 38 a9 d0 7a 4b 1e 1e 2e a6 0000000000000040 e6 1d ae 06 d6 18 00 85 ca 62 e8 7e 11 7e f6 0f 0000000000000050 79 7e f7 0f f3 94 68 bd e6 40 22 85 b6 be 6f b1 0000000000000060 af db ef 5e 34 f0 98 4b 62 9a e3 1b 8b 93 fc 17 devlink answers: Invalid argument 0000000000000070 61 e8 11 11 66 10 a5 f7 b1 ea 8d 40 60 53 ed 12 This is a minimal fix, I'll follow up with a restructuring so we don't have two checks for the same condition. Fixes: fdd41ec21e15 ("devlink: Return right error code in case of errors for region read") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01hv_netvsc: Fix netvsc_start_xmit's return typeNathan Chancellor1-1/+2
netvsc_start_xmit is used as a callback function for the ndo_start_xmit function pointer. ndo_start_xmit's return type is netdev_tx_t but netvsc_start_xmit's return type is int. This causes a failure with Control Flow Integrity (CFI), which requires function pointer prototypes and callback function definitions to match exactly. When CFI is in enforcing, the kernel panics. When booting a CFI kernel with WSL 2, the VM is immediately terminated because of this. The splat when CONFIG_CFI_PERMISSIVE is used: [ 5.916765] CFI failure (target: netvsc_start_xmit+0x0/0x10): [ 5.916771] WARNING: CPU: 8 PID: 0 at kernel/cfi.c:29 __cfi_check_fail+0x2e/0x40 [ 5.916772] Modules linked in: [ 5.916774] CPU: 8 PID: 0 Comm: swapper/8 Not tainted 5.7.0-rc3-next-20200424-microsoft-cbl-00001-ged4eb37d2c69-dirty #1 [ 5.916776] RIP: 0010:__cfi_check_fail+0x2e/0x40 [ 5.916777] Code: 48 c7 c7 70 98 63 a9 48 c7 c6 11 db 47 a9 e8 69 55 59 00 85 c0 75 02 5b c3 48 c7 c7 73 c6 43 a9 48 89 de 31 c0 e8 12 2d f0 ff <0f> 0b 5b c3 00 00 cc cc 00 00 cc cc 00 00 cc cc 00 00 85 f6 74 25 [ 5.916778] RSP: 0018:ffffa803c0260b78 EFLAGS: 00010246 [ 5.916779] RAX: 712a1af25779e900 RBX: ffffffffa8cf7950 RCX: ffffffffa962cf08 [ 5.916779] RDX: ffffffffa9c36b60 RSI: 0000000000000082 RDI: ffffffffa9c36b5c [ 5.916780] RBP: ffff8ffc4779c2c0 R08: 0000000000000001 R09: ffffffffa9c3c300 [ 5.916781] R10: 0000000000000151 R11: ffffffffa9c36b60 R12: ffff8ffe39084000 [ 5.916782] R13: ffffffffa8cf7950 R14: ffffffffa8d12cb0 R15: ffff8ffe39320140 [ 5.916784] FS: 0000000000000000(0000) GS:ffff8ffe3bc00000(0000) knlGS:0000000000000000 [ 5.916785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5.916786] CR2: 00007ffef5749408 CR3: 00000002f4f5e000 CR4: 0000000000340ea0 [ 5.916787] Call Trace: [ 5.916788] <IRQ> [ 5.916790] __cfi_check+0x3ab58/0x450e0 [ 5.916793] ? dev_hard_start_xmit+0x11f/0x160 [ 5.916795] ? sch_direct_xmit+0xf2/0x230 [ 5.916796] ? __dev_queue_xmit.llvm.11471227737707190958+0x69d/0x8e0 [ 5.916797] ? neigh_resolve_output+0xdf/0x220 [ 5.916799] ? neigh_connected_output.cfi_jt+0x8/0x8 [ 5.916801] ? ip6_finish_output2+0x398/0x4c0 [ 5.916803] ? nf_nat_ipv6_out+0x10/0xa0 [ 5.916804] ? nf_hook_slow+0x84/0x100 [ 5.916807] ? ip6_input_finish+0x8/0x8 [ 5.916807] ? ip6_output+0x6f/0x110 [ 5.916808] ? __ip6_local_out.cfi_jt+0x8/0x8 [ 5.916810] ? mld_sendpack+0x28e/0x330 [ 5.916811] ? ip_rt_bug+0x8/0x8 [ 5.916813] ? mld_ifc_timer_expire+0x2db/0x400 [ 5.916814] ? neigh_proxy_process+0x8/0x8 [ 5.916816] ? call_timer_fn+0x3d/0xd0 [ 5.916817] ? __run_timers+0x2a9/0x300 [ 5.916819] ? rcu_core_si+0x8/0x8 [ 5.916820] ? run_timer_softirq+0x14/0x30 [ 5.916821] ? __do_softirq+0x154/0x262 [ 5.916822] ? native_x2apic_icr_write+0x8/0x8 [ 5.916824] ? irq_exit+0xba/0xc0 [ 5.916825] ? hv_stimer0_vector_handler+0x99/0xe0 [ 5.916826] ? hv_stimer0_callback_vector+0xf/0x20 [ 5.916826] </IRQ> [ 5.916828] ? hv_stimer_global_cleanup.cfi_jt+0x8/0x8 [ 5.916829] ? raw_setsockopt+0x8/0x8 [ 5.916830] ? default_idle+0xe/0x10 [ 5.916832] ? do_idle.llvm.10446269078108580492+0xb7/0x130 [ 5.916833] ? raw_setsockopt+0x8/0x8 [ 5.916833] ? cpu_startup_entry+0x15/0x20 [ 5.916835] ? cpu_hotplug_enable.cfi_jt+0x8/0x8 [ 5.916836] ? start_secondary+0x188/0x190 [ 5.916837] ? secondary_startup_64+0xa5/0xb0 [ 5.916838] ---[ end trace f2683fa869597ba5 ]--- Avoid this by using the right return type for netvsc_start_xmit. Fixes: fceaf24a943d8 ("Staging: hv: add the Hyper-V virtual network driver") Link: https://github.com/ClangBuiltLinux/linux/issues/1009 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01Merge branch 'WoL-fixes-for-DP83822-and-DP83tc811'David S. Miller2-25/+26
Dan Murphy says: ==================== WoL fixes for DP83822 and DP83tc811 The WoL feature for each device was enabled during boot or when the PHY was brought up which may be undesired. These patches disable the WoL in the config_init. The disabling and enabling of the WoL is now done though the set_wol call. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01net: phy: DP83TC811: Fix WoL in config init to be disabledDan Murphy1-9/+12
The WoL feature should be disabled when config_init is called and the feature should turned on or off when set_wol is called. In addition updated the calls to modify the registers to use the set_bit and clear_bit function calls. Fixes: 6d749428788b ("net: phy: DP83TC811: Introduce support for the DP83TC811 phy") Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01net: phy: DP83822: Fix WoL in config init to be disabledDan Murphy1-16/+14
The WoL feature should be disabled when config_init is called and the feature should turned on or off when set_wol is called. In addition updated the calls to modify the registers to use the set_bit and clear_bit function calls. Fixes: 3b427751a9d0 ("net: phy: DP83822 initial driver submission") Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01ipv6: Use global sernum for dst validation with nexthop objectsDavid Ahern3-0/+36
Nik reported a bug with pcpu dst cache when nexthop objects are used illustrated by the following: $ ip netns add foo $ ip -netns foo li set lo up $ ip -netns foo addr add 2001:db8:11::1/128 dev lo $ ip netns exec foo sysctl net.ipv6.conf.all.forwarding=1 $ ip li add veth1 type veth peer name veth2 $ ip li set veth1 up $ ip addr add 2001:db8:10::1/64 dev veth1 $ ip li set dev veth2 netns foo $ ip -netns foo li set veth2 up $ ip -netns foo addr add 2001:db8:10::2/64 dev veth2 $ ip -6 nexthop add id 100 via 2001:db8:10::2 dev veth1 $ ip -6 route add 2001:db8:11::1/128 nhid 100 Create a pcpu entry on cpu 0: $ taskset -a -c 0 ip -6 route get 2001:db8:11::1 Re-add the route entry: $ ip -6 ro del 2001:db8:11::1 $ ip -6 route add 2001:db8:11::1/128 nhid 100 Route get on cpu 0 returns the stale pcpu: $ taskset -a -c 0 ip -6 route get 2001:db8:11::1 RTNETLINK answers: Network is unreachable While cpu 1 works: $ taskset -a -c 1 ip -6 route get 2001:db8:11::1 2001:db8:11::1 from :: via 2001:db8:10::2 dev veth1 src 2001:db8:10::1 metric 1024 pref medium Conversion of FIB entries to work with external nexthop objects missed an important difference between IPv4 and IPv6 - how dst entries are invalidated when the FIB changes. IPv4 has a per-network namespace generation id (rt_genid) that is bumped on changes to the FIB. Checking if a dst_entry is still valid means comparing rt_genid in the rtable to the current value of rt_genid for the namespace. IPv6 also has a per network namespace counter, fib6_sernum, but the count is saved per fib6_node. With the per-node counter only dst_entries based on fib entries under the node are invalidated when changes are made to the routes - limiting the scope of invalidations. IPv6 uses a reference in the rt6_info, 'from', to track the corresponding fib entry used to create the dst_entry. When validating a dst_entry, the 'from' is used to backtrack to the fib6_node and check the sernum of it to the cookie passed to the dst_check operation. With the inline format (nexthop definition inline with the fib6_info), dst_entries cached in the fib6_nh have a 1:1 correlation between fib entries, nexthop data and dst_entries. With external nexthops, IPv6 looks more like IPv4 which means multiple fib entries across disparate fib6_nodes can all reference the same fib6_nh. That means validation of dst_entries based on external nexthops needs to use the IPv4 format - the per-network namespace counter. Add sernum to rt6_info and set it when creating a pcpu dst entry. Update rt6_get_cookie to return sernum if it is set and update dst_check for IPv6 to look for sernum set and based the check on it if so. Finally, rt6_get_pcpu_route needs to validate the cached entry before returning a pcpu entry (similar to the rt_cache_valid calls in __mkroute_input and __mkroute_output for IPv4). This problem only affects routes using the new, external nexthops. Thanks to the kbuild test robot for catching the IS_ENABLED needed around rt_genid_ipv6 before I sent this out. Fixes: 5b98324ebe29 ("ipv6: Allow routes to use nexthop objects") Reported-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@kernel.org> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Tested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01Merge tag 'block-5.7-2020-05-01' of git://git.kernel.dk/linux-blockLinus Torvalds2-1/+3
Pull block fixes from Jens Axboe: "A few fixes for this release: - NVMe pull request from Christoph, with a single fix for a double free in the namespace error handling. - Kill the bd_openers check in blk_drop_partitions(), fixing a regression in this merge window (Christoph)" * tag 'block-5.7-2020-05-01' of git://git.kernel.dk/linux-block: block: remove the bd_openers checks in blk_drop_partitions nvme: prevent double free in nvme_alloc_ns() error handling
2020-05-01Merge branch 'i2c/for-current-fixed' of ↵Linus Torvalds4-26/+20
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Three driver bugfixes, and two reverts because the original patches revealed underlying problems which the Tegra guys are now working on" * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: aspeed: Avoid i2c interrupt status clear race condition. i2c: amd-mp2-pci: Fix Oops in amd_mp2_pci_init() error handling Revert "i2c: tegra: Better handle case where CPU0 is busy for a long time" Revert "i2c: tegra: Synchronize DMA before termination" i2c: iproc: generate stop event for slave writes
2020-05-01Merge tag 'sound-5.7-rc4' of ↵Linus Torvalds8-33/+35
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Just a collection of small fixes around this time: - One more try for fixing PCM OSS regression - HD-audio: a new quirk for Lenovo, the improved driver blacklisting, a lock fix in the minor error path, and a fix for the possible race at monitor notifiaction - USB-audio: a quirk ID fix, a fix for POD HD500 workaround" * tag 'sound-5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: usb-audio: Correct a typo of NuPrime DAC-10 USB ID ALSA: opti9xx: shut up gcc-10 range warning ALSA: hda/hdmi: fix without unlocked before return ALSA: hda/hdmi: fix race in monitor detection during probe ALSA: hda/realtek - Two front mics on a Lenovo ThinkCenter ALSA: line6: Fix POD HD500 audio playback ALSA: pcm: oss: Place the plugin buffer overflow checks correctly (for 5.7) ALSA: pcm: oss: Place the plugin buffer overflow checks correctly ALSA: hda: Match both PCI ID and SSID for driver blacklist
2020-05-01Merge tag 'drm-fixes-2020-05-01' of git://anongit.freedesktop.org/drm/drmLinus Torvalds40-154/+290
Pull drm fixes from Dave Airlie: "Regular scheduled fixes for graphics. Nothing to extreme bunch of amdgpu fixes, i915 and qxl fixes, along with some misc ones. All seems to be progressing normally. core: - EDID off by one DTD fix - DP mst write return code fix dma-buf: - fix SET_NAME ioctl uapi - doc fixes amdgpu: - Fix a green screen on resume issue - PM fixes for SR-IOV SDMA fix for navi - Renoir display fixes - Cursor and pageflip stuttering fixes - Misc additional display fixes - (uapi) Add additional DCC tiling flags for navi1x i915: - Fix selftest refcnt leak (Xiyu) - Fix gem vma lock (Chris) - Fix gt's i915_request.timeline acquire by checking if cacheline is valid (Chris) - Fix IRQ postinistall fault masks (Matt) qxl: - use after gree fix - fix lost kunmap - release leak fix virtio: - context destruction fix" * tag 'drm-fixes-2020-05-01' of git://anongit.freedesktop.org/drm/drm: (26 commits) dma-buf: fix documentation build warnings drm/qxl: qxl_release use after free drm/qxl: lost qxl_bo_kunmap_atomic_page in qxl_image_init_helper() drm/i915: Use proper fault mask in interrupt postinstall too drm/amd/display: Use cursor locking to prevent flip delays drm/amd/display: Update downspread percent to match spreadsheet for DCN2.1 drm/amd/display: Defer cursor update around VUPDATE for all ASIC drm/amd/display: fix rn soc bb update drm/amd/display: check if REFCLK_CNTL register is present drm/amdgpu: bump version for invalidate L2 before SDMA IBs drm/amdgpu: invalidate L2 before SDMA IBs (v2) drm/amdgpu: add tiling flags from Mesa drm/amd/powerplay: avoid using pm_en before it is initialized revised Revert "drm/amd/powerplay: avoid using pm_en before it is initialized" drm/qxl: qxl_release leak in qxl_hw_surface_alloc() drm/qxl: qxl_release leak in qxl_draw_dirty_fb() drm/virtio: only destroy created contexts drm/dp_mst: Fix drm_dp_send_dpcd_write() return code drm/i915/gt: Check cacheline is valid before acquiring drm/i915/gem: Hold obj->vma.lock over for_each_ggtt_vma() ...
2020-05-01Merge tag 'scsi-fixes' of ↵Linus Torvalds3-19/+19
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four minor fixes: three in drivers and one in the core. The core one allows an additional state change that fixes a regression introduced by an update to the aacraid driver in the previous merge window" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: target/iblock: fix WRITE SAME zeroing scsi: qla2xxx: check UNLOADING before posting async work scsi: qla2xxx: set UNLOADING before waiting for session deletion scsi: core: Allow the state change from SDEV_QUIESCE to SDEV_BLOCK
2020-05-01selftests: fix kvm relocatable native/cross builds and installsShuah Khan1-1/+28
kvm test Makefile doesn't fully support cross-builds and installs. UNAME_M = $(shell uname -m) variable is used to define the target programs and libraries to be built from arch specific sources in sub-directories. For cross-builds to work, UNAME_M has to map to ARCH and arch specific directories and targets in this Makefile. UNAME_M variable to used to run the compiles pointing to the right arch directories and build the right targets for these supported architectures. TEST_GEN_PROGS and LIBKVM are set using UNAME_M variable. LINUX_TOOL_ARCH_INCLUDE is set using ARCH variable. x86_64 targets are named to include x86_64 as a suffix and directories for includes are in x86_64 sub-directory. s390x and aarch64 follow the same convention. "uname -m" doesn't result in the correct mapping for s390x and aarch64. Fix it to set UNAME_M correctly for s390x and aarch64 cross-builds. In addition, Makefile doesn't create arch sub-directories in the case of relocatable builds and test programs under s390x and x86_64 directories fail to build. This is a problem for native and cross-builds. Fix it to create all necessary directories keying off of TEST_GEN_PROGS. The following use-cases work with this change: Native x86_64: make O=/tmp/kselftest -C tools/testing/selftests TARGETS=kvm install \ INSTALL_PATH=$HOME/x86_64 arm64 cross-build: make O=$HOME/arm64_build/ ARCH=arm64 HOSTCC=gcc \ CROSS_COMPILE=aarch64-linux-gnu- defconfig make O=$HOME/arm64_build/ ARCH=arm64 HOSTCC=gcc \ CROSS_COMPILE=aarch64-linux-gnu- all make kselftest-install TARGETS=kvm O=$HOME/arm64_build ARCH=arm64 \ HOSTCC=gcc CROSS_COMPILE=aarch64-linux-gnu- s390x cross-build: make O=$HOME/s390x_build/ ARCH=s390 HOSTCC=gcc \ CROSS_COMPILE=s390x-linux-gnu- defconfig make O=$HOME/s390x_build/ ARCH=s390 HOSTCC=gcc \ CROSS_COMPILE=s390x-linux-gnu- all make kselftest-install TARGETS=kvm O=$HOME/s390x_build/ ARCH=s390 \ HOSTCC=gcc CROSS_COMPILE=s390x-linux-gnu- all No regressions in the following use-cases: make -C tools/testing/selftests TARGETS=kvm make kselftest-all TARGETS=kvm Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-05-01selftests/ftrace: Make XFAIL green colorMasami Hiramatsu1-1/+1
Since XFAIL (Expected Failure) is expected to fail the test, which means that test case works as we expected. IOW, XFAIL is same as PASS. So make it green. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-05-01ftrace/selftest: make unresolved cases cause failure if --fail-unresolved setAlan Maguire1-1/+7
Currently, ftracetest will return 1 (failure) if any unresolved cases are encountered. The unresolved status results from modules and programs not being available, and as such does not indicate any issues with ftrace itself. As such, change the behaviour of ftracetest in line with unsupported cases; if unsupported cases happen, ftracetest still returns 0 unless --fail-unsupported. Here --fail-unresolved is added and the default is to return 0 if unresolved results occur. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-05-01ftrace/selftests: workaround cgroup RT scheduling issuesAlan Maguire1-0/+22
wakeup_rt.tc and wakeup.tc tests in tracers/ subdirectory fail due to the chrt command returning: chrt: failed to set pid 0's policy: Operation not permitted. To work around this, temporarily disable grout RT scheduling during ftracetest execution. Restore original value on test run completion. With these changes in place, both tests consistently pass. Fixes: c575dea2c1a5 ("selftests/ftrace: Add wakeup_rt tracer testcase") Fixes: c1edd060b413 ("selftests/ftrace: Add wakeup tracer testcase") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-05-01io_uring: punt splice async because of inode mutexPavel Begunkov1-14/+2
Nonblocking do_splice() still may wait for some time on an inode mutex. Let's play safe and always punt it async. Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-01io_uring: check non-sync defer_list carefullyPavel Begunkov1-1/+1
io_req_defer() do double-checked locking. Use proper helpers for that, i.e. list_empty_careful(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>