author      Andrew Morton <akpm@linux-foundation.org>    2024-05-02 08:31:40 -0700
committer   Andrew Morton <akpm@linux-foundation.org>    2024-05-02 08:31:40 -0700
commit      6850ce43634ae5acf10cb967e69cc719068c7f48 (patch)
tree        897ee6ec6691b739dc46538def9087537966e713
parent      16803081a638472e502095d780f6ca16eb23e25e (diff)
download    25-new-6850ce43634ae5acf10cb967e69cc719068c7f48.tar.gz
foo
60 files changed, 817 insertions, 285 deletions
diff --git a/patches/bitops-optimize-fns-for-improved-performance.patch b/patches/bitops-optimize-fns-for-improved-performance.patch deleted file mode 100644 index 7d9758e97..000000000 --- a/patches/bitops-optimize-fns-for-improved-performance.patch +++ /dev/null @@ -1,63 +0,0 @@ -From: Kuan-Wei Chiu <visitorckw@gmail.com> -Subject: bitops: optimize fns() for improved performance -Date: Thu, 2 May 2024 17:24:43 +0800 - -The current fns() repeatedly uses __ffs() to find the index of the least -significant bit and then clears the corresponding bit using __clear_bit(). -The method for clearing the least significant bit can be optimized by -using word &= word - 1 instead. - -Typically, the execution time of one __ffs() plus one __clear_bit() is -longer than that of a bitwise AND operation and a subtraction. To improve -performance, the loop for clearing the least significant bit has been -replaced with word &= word - 1, followed by a single __ffs() operation to -obtain the answer. This change reduces the number of __ffs() iterations -from n to just one, enhancing overall performance. - -This modification significantly accelerates the fns() function in the -test_bitops benchmark, improving its speed by approximately 7.6 times. -Additionally, it enhances the performance of find_nth_bit() in the -find_bit benchmark by approximately 26%. - -Before: -test_bitops: fns: 58033164 ns -find_nth_bit: 4254313 ns, 16525 iterations - -After: -test_bitops: fns: 7637268 ns -find_nth_bit: 3362863 ns, 16501 iterations - -Link: https://lkml.kernel.org/r/20240502092443.6845-3-visitorckw@gmail.com -Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> -Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw> -Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> -Cc: Yury Norov <yury.norov@gmail.com> -Signed-off-by: Andrew Morton <akpm@linux-foundation.org> ---- - - include/linux/bitops.h | 12 +++--------- - 1 file changed, 3 insertions(+), 9 deletions(-) - ---- a/include/linux/bitops.h~bitops-optimize-fns-for-improved-performance -+++ a/include/linux/bitops.h -@@ -254,16 +254,10 @@ static inline unsigned long __ffs64(u64 - */ - static inline unsigned long fns(unsigned long word, unsigned int n) - { -- unsigned int bit; -+ while (word && n--) -+ word &= word - 1; - -- while (word) { -- bit = __ffs(word); -- if (n-- == 0) -- return bit; -- __clear_bit(bit, &word); -- } -- -- return BITS_PER_LONG; -+ return word ? __ffs(word) : BITS_PER_LONG; - } - - /** -_ diff --git a/patches/mm-hugetlb-document-why-hugetlb-uses-folio_mapcount-for-cow-reuse-decisions.patch b/patches/mm-hugetlb-document-why-hugetlb-uses-folio_mapcount-for-cow-reuse-decisions.patch new file mode 100644 index 000000000..e509777e4 --- /dev/null +++ b/patches/mm-hugetlb-document-why-hugetlb-uses-folio_mapcount-for-cow-reuse-decisions.patch @@ -0,0 +1,42 @@ +From: David Hildenbrand <david@redhat.com> +Subject: mm/hugetlb: document why hugetlb uses folio_mapcount() for COW reuse decisions +Date: Thu, 2 May 2024 10:52:59 +0200 + +Let's document why hugetlb still uses folio_mapcount() and is prone to +leaking memory between processes, for example using vmsplice() that still +uses FOLL_GET. + +More details can be found in [1], especially around how hugetlb pages +cannot really be overcommitted, and why we don't particularly care about +these vmsplice() leaks for hugetlb -- in contrast to ordinary memory. 
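
For illustration, the vmsplice()/FOLL_GET leak mentioned above can be observed from userspace with a program along the following lines. This is a rough sketch loosely modeled on the cow selftest rather than code from this series; it assumes a free 2 MiB hugetlb page (reserved via /proc/sys/vm/nr_hugepages) and uses crude sleep-based synchronization for brevity.

/*
 * Rough sketch of the vmsplice()/FOLL_GET leak on hugetlb, loosely based
 * on tools/testing/selftests/mm/cow.c. Assumes one free 2 MiB hugetlb page.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/uio.h>
#include <sys/wait.h>

#define HPAGE_SIZE	(2UL * 1024 * 1024)	/* assumed hugetlb page size */

int main(void)
{
	int data_pipe[2], sync_pipe[2], status;
	struct iovec iov;
	char *mem, c;

	mem = mmap(NULL, HPAGE_SIZE, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (mem == MAP_FAILED || pipe(data_pipe) || pipe(sync_pipe))
		exit(1);
	memset(mem, 0xff, HPAGE_SIZE);

	if (fork() == 0) {
		/* Child: vmsplice() takes a FOLL_GET reference on the folio. */
		iov.iov_base = mem;
		iov.iov_len = 4096;	/* a fraction is enough to pin the folio */
		if (vmsplice(data_pipe[1], &iov, 1, 0) <= 0)
			exit(1);
		munmap(mem, HPAGE_SIZE);
		write(sync_pipe[1], "x", 1);	/* tell the parent to write */
		sleep(1);			/* crude: let the parent's write land */
		if (read(data_pipe[0], &c, 1) != 1)
			exit(1);
		/* If COW reuse ignored the pin, the pipe now shows 0x00. */
		exit(c == (char)0xff ? 0 : 2);
	}

	/* Parent: write only after the child has pinned and unmapped. */
	read(sync_pipe[0], &c, 1);
	memset(mem, 0x00, HPAGE_SIZE);
	wait(&status);
	printf(WEXITSTATUS(status) == 2 ?
	       "leak from parent into child\n" : "no leak\n");
	return 0;
}
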
+ +[1] https://lore.kernel.org/all/8b42a24d-caf0-46ef-9e15-0f88d47d2f21@redhat.com/ + +Link: https://lkml.kernel.org/r/20240502085259.103784-3-david@redhat.com +Signed-off-by: David Hildenbrand <david@redhat.com> +Suggested-by: Peter Xu <peterx@redhat.com> +Cc: Muchun Song <muchun.song@linux.dev> +Cc: Shuah Khan <shuah@kernel.org> +Signed-off-by: Andrew Morton <akpm@linux-foundation.org> +--- + + mm/hugetlb.c | 7 +++++++ + 1 file changed, 7 insertions(+) + +--- a/mm/hugetlb.c~mm-hugetlb-document-why-hugetlb-uses-folio_mapcount-for-cow-reuse-decisions ++++ a/mm/hugetlb.c +@@ -5963,6 +5963,13 @@ retry_avoidcopy: + /* + * If no-one else is actually using this page, we're the exclusive + * owner and can reuse this page. ++ * ++ * Note that we don't rely on the (safer) folio refcount here, because ++ * copying the hugetlb folio when there are unexpected (temporary) ++ * folio references could harm simple fork()+exit() users when ++ * we run out of free hugetlb folios: we would have to kill processes ++ * in scenarios that used to work. As a side effect, there can still ++ * be leaks between processes, for example, with FOLL_GET users. + */ + if (folio_mapcount(old_folio) == 1 && folio_test_anon(old_folio)) { + if (!PageAnonExclusive(&old_folio->page)) { +_ diff --git a/patches/mm-madvise-add-mf_action_required-to-madvisemadv_hwpoison.patch b/patches/mm-madvise-add-mf_action_required-to-madvisemadv_hwpoison.patch new file mode 100644 index 000000000..1e1191d4e --- /dev/null +++ b/patches/mm-madvise-add-mf_action_required-to-madvisemadv_hwpoison.patch @@ -0,0 +1,31 @@ +From: Jane Chu <jane.chu@oracle.com> +Subject: mm/madvise: add MF_ACTION_REQUIRED to madvise(MADV_HWPOISON) +Date: Wed, 1 May 2024 17:24:57 -0600 + +The soft hwpoison injector via madvise(MADV_HWPOISON) operates in a +synchrous way in a sense, the injector is also a process under test, and +should it have the poisoned page mapped in its address space, it should +legitimately get killed as much as in a real UE situation. + +Link: https://lkml.kernel.org/r/20240501232458.3919593-3-jane.chu@oracle.com +Signed-off-by: Jane Chu <jane.chu@oracle.com> +Cc: Miaohe Lin <linmiaohe@huawei.com> +Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> +Signed-off-by: Andrew Morton <akpm@linux-foundation.org> +--- + + mm/madvise.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/mm/madvise.c~mm-madvise-add-mf_action_required-to-madvisemadv_hwpoison ++++ a/mm/madvise.c +@@ -1147,7 +1147,7 @@ static int madvise_inject_error(int beha + } else { + pr_info("Injecting memory failure for pfn %#lx at process virtual address %#lx\n", + pfn, start); +- ret = memory_failure(pfn, MF_COUNT_INCREASED | MF_SW_SIMULATED); ++ ret = memory_failure(pfn, MF_ACTION_REQUIRED | MF_COUNT_INCREASED | MF_SW_SIMULATED); + if (ret == -EOPNOTSUPP) + ret = 0; + } +_ diff --git a/patches/mm-memory-failure-send-sigbus-in-the-event-of-thp-split-fail.patch b/patches/mm-memory-failure-send-sigbus-in-the-event-of-thp-split-fail.patch new file mode 100644 index 000000000..9a3d7d86f --- /dev/null +++ b/patches/mm-memory-failure-send-sigbus-in-the-event-of-thp-split-fail.patch @@ -0,0 +1,72 @@ +From: Jane Chu <jane.chu@oracle.com> +Subject: mm/memory-failure: send SIGBUS in the event of thp split fail +Date: Wed, 1 May 2024 17:24:58 -0600 + +When handle hwpoison in a GUP longterm pin'ed thp page, +try_to_split_thp_page() will fail. And at this point, there is little +else the kernel could do except sending a SIGBUS to the user process, thus +give it a chance to recover. 
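
For context, one common way userspace ends up with a long-term GUP pin on a thp page is registering a THP-backed range as io_uring fixed buffers. The sketch below is illustrative only and not part of this series; it assumes liburing is installed, CAP_SYS_ADMIN for the injection, and that the kernel actually backs the range with a THP, which depends on system THP settings.

/*
 * Illustrative only: take a long-term GUP pin on a (hopefully) THP-backed
 * buffer via io_uring fixed buffers, then soft-inject poison into it. With
 * a pinned THP, try_to_split_thp_page() fails and, with MF_ACTION_REQUIRED,
 * this series delivers a SIGBUS instead of silently ignoring the error.
 */
#include <liburing.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#define THP_SIZE (2UL * 1024 * 1024)

int main(void)
{
	struct io_uring ring;
	struct iovec iov;
	void *buf;

	if (posix_memalign(&buf, THP_SIZE, THP_SIZE))
		return 1;
	madvise(buf, THP_SIZE, MADV_HUGEPAGE);
	memset(buf, 0, THP_SIZE);		/* fault in, ideally as a THP */

	if (io_uring_queue_init(8, &ring, 0))
		return 1;
	iov.iov_base = buf;
	iov.iov_len = THP_SIZE;
	/* Buffer registration pins the pages with FOLL_LONGTERM. */
	if (io_uring_register_buffers(&ring, &iov, 1))
		return 1;

	/*
	 * Inject poison into one base page of the pinned THP; with this
	 * series the process may receive a SIGBUS right here.
	 */
	madvise(buf, 4096, MADV_HWPOISON);

	io_uring_queue_exit(&ring);
	free(buf);
	return 0;
}
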
+ +Link: https://lkml.kernel.org/r/20240501232458.3919593-4-jane.chu@oracle.com +Signed-off-by: Jane Chu <jane.chu@oracle.com> +Cc: Miaohe Lin <linmiaohe@huawei.com> +Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> +Signed-off-by: Andrew Morton <akpm@linux-foundation.org> +--- + + mm/memory-failure.c | 36 ++++++++++++++++++++++++++++++++++++ + 1 file changed, 36 insertions(+) + +--- a/mm/memory-failure.c~mm-memory-failure-send-sigbus-in-the-event-of-thp-split-fail ++++ a/mm/memory-failure.c +@@ -2162,6 +2162,37 @@ out: + return rc; + } + ++/* ++ * The calling condition is as such: thp split failed, page might have ++ * been GUP longterm pinned, not much can be done for recovery. ++ * But a SIGBUS should be delivered with vaddr provided so that the user ++ * application has a chance to recover. Also, application processes' ++ * election for MCE early killed will be honored. ++ */ ++static int kill_procs_now(struct page *p, unsigned long pfn, int flags, ++ struct page *hpage) ++{ ++ struct folio *folio = page_folio(hpage); ++ LIST_HEAD(tokill); ++ int res = -EHWPOISON; ++ ++ /* deal with user pages only */ ++ if (PageReserved(p) || PageSlab(p) || PageTable(p) || PageOffline(p)) ++ res = -EBUSY; ++ if (!(PageLRU(hpage) || PageHuge(p))) ++ res = -EBUSY; ++ ++ if (res == -EHWPOISON) { ++ collect_procs(folio, p, &tokill, flags & MF_ACTION_REQUIRED); ++ kill_procs(&tokill, true, pfn, flags); ++ } ++ ++ if (flags & MF_COUNT_INCREASED) ++ put_page(p); ++ ++ return res; ++} ++ + /** + * memory_failure - Handle memory failure of a page. + * @pfn: Page Number of the corrupted page +@@ -2291,6 +2322,11 @@ try_again: + */ + folio_set_has_hwpoisoned(folio); + if (try_to_split_thp_page(p) < 0) { ++ if (flags & MF_ACTION_REQUIRED) { ++ pr_err("%#lx: thp split failed\n", pfn); ++ res = kill_procs_now(p, pfn, flags, hpage); ++ goto unlock_mutex; ++ } + res = action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); + goto unlock_mutex; + } +_ diff --git a/patches/mm-memory-failure-try-to-send-sigbus-even-if-unmap-failed.patch b/patches/mm-memory-failure-try-to-send-sigbus-even-if-unmap-failed.patch new file mode 100644 index 000000000..f3d1ab77b --- /dev/null +++ b/patches/mm-memory-failure-try-to-send-sigbus-even-if-unmap-failed.patch @@ -0,0 +1,91 @@ +From: Jane Chu <jane.chu@oracle.com> +Subject: mm/memory-failure: try to send SIGBUS even if unmap failed +Date: Wed, 1 May 2024 17:24:56 -0600 + +Patch series "Enhance soft hwpoison handling and injection". + +This series aim at the following enhancement - + +1. Let one hwpoison injector, that is, madvise(MADV_HWPOISON) to behave + more like as if a real UE occurred. Because the other two injectors + such as hwpoison-inject and the 'einj' on x86 can't, and it seems to + me we need a better simulation to real UE scenario. + +2. For years, if the kernel is unable to unmap a hwpoisoned page, it send + a SIGKILL instead of SIGBUS to prevent user process from potentially + accessing the page again. But in doing so, the user process also lose + important information: vaddr, for recovery. Fortunately, the kernel + already has code to kill process re-accessing a hwpoisoned page, so + remove the '!unmap_success' check. + +3. Right now, if a thp page under GUP longterm pin is hwpoisoned, and + kernel cannot split the thp page, memory-failure simply ignores + the UE and returns. That's not ideal, it could deliver a SIGBUS with + useful information for userspace recovery. 
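
To make the user-visible difference concrete: a SIGBUS carries the poisoned virtual address in siginfo (si_addr), which a SIGKILL cannot convey. Below is a minimal, illustrative recovery sketch, not code from this series; it assumes CONFIG_MEMORY_FAILURE and CAP_SYS_ADMIN for the madvise(MADV_HWPOISON) injection.

/*
 * Minimal sketch: install a SIGBUS handler, soft-inject poison into one of
 * our own pages with madvise(MADV_HWPOISON), and recover using the vaddr
 * reported in si_addr. si_code is BUS_MCEERR_AR for an "action required"
 * event. Illustrative only.
 */
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

static sigjmp_buf env;
static void * volatile poison_addr;

static void sigbus_handler(int sig, siginfo_t *info, void *uctx)
{
	poison_addr = info->si_addr;	/* the vaddr a SIGKILL could not convey */
	siglongjmp(env, 1);
}

int main(void)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	struct sigaction act = { .sa_sigaction = sigbus_handler,
				 .sa_flags = SA_SIGINFO };
	char *mem;

	sigaction(SIGBUS, &act, NULL);
	mem = mmap(NULL, pagesz, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (mem == MAP_FAILED)
		return 1;
	mem[0] = 1;			/* fault the page in */

	if (sigsetjmp(env, 1) == 0) {
		/*
		 * Inject the failure. With MF_ACTION_REQUIRED (this series)
		 * the SIGBUS may already arrive during madvise(); otherwise
		 * the access below hits the poisoned page.
		 */
		if (madvise(mem, pagesz, MADV_HWPOISON))
			perror("madvise(MADV_HWPOISON)");
		mem[0] = 2;
	} else {
		printf("recovered; poisoned vaddr %p from si_addr\n", poison_addr);
		/* a real application would discard or rebuild the data there */
	}
	return 0;
}
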
+ + +This patch (of 3): + +For years when it comes down to kill a process due to hwpoison, a SIGBUS +is delivered only if unmap has been successful. Otherwise, a SIGKILL is +delivered. And the reason for that is to prevent the involved process +from accessing the hwpoisoned page again. + +Since then a lot has changed, a hwpoisoned page is marked and upon being +re-accessed, the process will be killed immediately. So let's take out +the '!unmap_success' factor and try to deliver SIGBUS if possible. + +Link: https://lkml.kernel.org/r/20240501232458.3919593-1-jane.chu@oracle.com +Link: https://lkml.kernel.org/r/20240501232458.3919593-2-jane.chu@oracle.com +Signed-off-by: Jane Chu <jane.chu@oracle.com> +Cc: Miaohe Lin <linmiaohe@huawei.com> +Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> +Signed-off-by: Andrew Morton <akpm@linux-foundation.org> +--- + + mm/memory-failure.c | 13 ++++--------- + 1 file changed, 4 insertions(+), 9 deletions(-) + +--- a/mm/memory-failure.c~mm-memory-failure-try-to-send-sigbus-even-if-unmap-failed ++++ a/mm/memory-failure.c +@@ -517,19 +517,14 @@ void add_to_kill_ksm(struct task_struct + * Also when FAIL is set do a force kill because something went + * wrong earlier. + */ +-static void kill_procs(struct list_head *to_kill, int forcekill, bool fail, ++static void kill_procs(struct list_head *to_kill, int forcekill, + unsigned long pfn, int flags) + { + struct to_kill *tk, *next; + + list_for_each_entry_safe(tk, next, to_kill, nd) { + if (forcekill) { +- /* +- * In case something went wrong with munmapping +- * make sure the process doesn't catch the +- * signal and then access the memory. Just kill it. +- */ +- if (fail || tk->addr == -EFAULT) { ++ if (tk->addr == -EFAULT) { + pr_err("%#lx: forcibly killing %s:%d because of failure to unmap corrupted page\n", + pfn, tk->tsk->comm, tk->tsk->pid); + do_send_sig_info(SIGKILL, SEND_SIG_PRIV, +@@ -1660,7 +1655,7 @@ static bool hwpoison_user_mappings(struc + */ + forcekill = folio_test_dirty(folio) || (flags & MF_MUST_KILL) || + !unmap_success; +- kill_procs(&tokill, forcekill, !unmap_success, pfn, flags); ++ kill_procs(&tokill, forcekill, pfn, flags); + + return unmap_success; + } +@@ -1724,7 +1719,7 @@ static void unmap_and_kill(struct list_h + unmap_mapping_range(mapping, start, size, 0); + } + +- kill_procs(to_kill, flags & MF_MUST_KILL, false, pfn, flags); ++ kill_procs(to_kill, flags & MF_MUST_KILL, pfn, flags); + } + + /* +_ diff --git a/patches/old/bitops-optimize-fns-for-improved-performance.patch b/patches/old/bitops-optimize-fns-for-improved-performance.patch index edfd0c9bb..7d9758e97 100644 --- a/patches/old/bitops-optimize-fns-for-improved-performance.patch +++ b/patches/old/bitops-optimize-fns-for-improved-performance.patch @@ -1,6 +1,6 @@ From: Kuan-Wei Chiu <visitorckw@gmail.com> Subject: bitops: optimize fns() for improved performance -Date: Fri, 26 Apr 2024 11:51:52 +0800 +Date: Thu, 2 May 2024 17:24:43 +0800 The current fns() repeatedly uses __ffs() to find the index of the least significant bit and then clears the corresponding bit using __clear_bit(). @@ -14,99 +14,39 @@ replaced with word &= word - 1, followed by a single __ffs() operation to obtain the answer. This change reduces the number of __ffs() iterations from n to just one, enhancing overall performance. 
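
As an aside, the before/after logic of the fns() change can be mirrored in plain C for reference; __builtin_ctzl() stands in for the kernel's __ffs() and 64 for BITS_PER_LONG, so this is only an illustration of the technique, not the kernel code itself.

/* Userspace illustration of the fns() change described above. */
#include <stdio.h>

/* Old approach: one __ffs() plus one __clear_bit() per loop iteration. */
static unsigned long fns_old(unsigned long word, unsigned int n)
{
	while (word) {
		unsigned int bit = __builtin_ctzl(word);

		if (n-- == 0)
			return bit;
		word &= ~(1UL << bit);	/* __clear_bit() */
	}
	return 64;
}

/* New approach: clear the n lowest set bits cheaply, then one __ffs(). */
static unsigned long fns_new(unsigned long word, unsigned int n)
{
	while (word && n--)
		word &= word - 1;	/* clear the least significant set bit */

	return word ? __builtin_ctzl(word) : 64;
}

int main(void)
{
	unsigned long word = 0x2d;	/* bits 0, 2, 3, 5 set */

	/* Both return the position of the n-th (0-based) set bit: here bit 3. */
	printf("%lu %lu\n", fns_old(word, 2), fns_new(word, 2));
	return 0;
}
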
-The following microbenchmark data, conducted on my x86-64 machine, shows -the execution time (in microseconds) required for 1000000 test data -generated by get_random_u64() and executed by fns() under different values -of n: +This modification significantly accelerates the fns() function in the +test_bitops benchmark, improving its speed by approximately 7.6 times. +Additionally, it enhances the performance of find_nth_bit() in the +find_bit benchmark by approximately 26%. -+-----+---------------+---------------+ -| n | time_old | time_new | -+-----+---------------+---------------+ -| 0 | 29194 | 25878 | -| 1 | 25510 | 25497 | -| 2 | 27836 | 25721 | -| 3 | 30140 | 25673 | -| 4 | 32569 | 25426 | -| 5 | 34792 | 25690 | -| 6 | 37117 | 25651 | -| 7 | 39742 | 25383 | -| 8 | 42360 | 25657 | -| 9 | 44672 | 25897 | -| 10 | 47237 | 25819 | -| 11 | 49884 | 26530 | -| 12 | 51864 | 26647 | -| 13 | 54265 | 28915 | -| 14 | 56440 | 28373 | -| 15 | 58839 | 28616 | -| 16 | 62383 | 29128 | -| 17 | 64257 | 30041 | -| 18 | 66805 | 29773 | -| 19 | 69368 | 33203 | -| 20 | 72942 | 33688 | -| 21 | 77006 | 34518 | -| 22 | 80926 | 34298 | -| 23 | 85723 | 35586 | -| 24 | 90324 | 36376 | -| 25 | 95992 | 37465 | -| 26 | 101101 | 37599 | -| 27 | 106520 | 37466 | -| 28 | 113287 | 38163 | -| 29 | 120552 | 38810 | -| 30 | 128040 | 39373 | -| 31 | 135624 | 40500 | -| 32 | 142580 | 40343 | -| 33 | 148915 | 40460 | -| 34 | 154005 | 41294 | -| 35 | 157996 | 41730 | -| 36 | 160806 | 41523 | -| 37 | 162975 | 42088 | -| 38 | 163426 | 41530 | -| 39 | 164872 | 41789 | -| 40 | 164477 | 42505 | -| 41 | 164758 | 41879 | -| 42 | 164182 | 41415 | -| 43 | 164842 | 42119 | -| 44 | 164881 | 42297 | -| 45 | 164870 | 42145 | -| 46 | 164673 | 42066 | -| 47 | 164616 | 42051 | -| 48 | 165055 | 41902 | -| 49 | 164847 | 41862 | -| 50 | 165171 | 41960 | -| 51 | 164851 | 42089 | -| 52 | 164763 | 41717 | -| 53 | 164635 | 42154 | -| 54 | 164757 | 41983 | -| 55 | 165095 | 41419 | -| 56 | 164641 | 42381 | -| 57 | 164601 | 41654 | -| 58 | 164864 | 41834 | -| 59 | 164594 | 41920 | -| 60 | 165207 | 42020 | -| 61 | 165056 | 41185 | -| 62 | 165160 | 41722 | -| 63 | 164923 | 41702 | -| 64 | 164777 | 41880 | -+-----+---------------+---------------+ +Before: +test_bitops: fns: 58033164 ns +find_nth_bit: 4254313 ns, 16525 iterations -Link: https://lkml.kernel.org/r/20240426035152.956702-1-visitorckw@gmail.com +After: +test_bitops: fns: 7637268 ns +find_nth_bit: 3362863 ns, 16501 iterations + +Link: https://lkml.kernel.org/r/20240502092443.6845-3-visitorckw@gmail.com Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw> +Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Yury Norov <yury.norov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> --- - include/linux/bitops.h | 12 ++++-------- - 1 file changed, 4 insertions(+), 8 deletions(-) + include/linux/bitops.h | 12 +++--------- + 1 file changed, 3 insertions(+), 9 deletions(-) --- a/include/linux/bitops.h~bitops-optimize-fns-for-improved-performance +++ a/include/linux/bitops.h -@@ -254,16 +254,12 @@ static inline unsigned long __ffs64(u64 +@@ -254,16 +254,10 @@ static inline unsigned long __ffs64(u64 */ static inline unsigned long fns(unsigned long word, unsigned int n) { - unsigned int bit; -+ unsigned int i; ++ while (word && n--) ++ word &= word - 1; - while (word) { - bit = __ffs(word); @@ -114,9 +54,7 @@ Signed-off-by: Andrew Morton <akpm@linux-foundation.org> - return bit; - __clear_bit(bit, &word); - } -+ for (i = 0; word && i < n; 
i++) -+ word &= word - 1; - +- - return BITS_PER_LONG; + return word ? __ffs(word) : BITS_PER_LONG; } diff --git a/patches/lib-test_bitops-add-benchmark-test-for-fns.patch b/patches/old/lib-test_bitops-add-benchmark-test-for-fns.patch index cd4377877..cd4377877 100644 --- a/patches/lib-test_bitops-add-benchmark-test-for-fns.patch +++ b/patches/old/lib-test_bitops-add-benchmark-test-for-fns.patch diff --git a/patches/memcg-cleanup-__mod_memcg_lruvec_state.patch b/patches/old/memcg-cleanup-__mod_memcg_lruvec_state.patch index 2f84aa814..2f84aa814 100644 --- a/patches/memcg-cleanup-__mod_memcg_lruvec_state.patch +++ b/patches/old/memcg-cleanup-__mod_memcg_lruvec_state.patch diff --git a/patches/memcg-dynamically-allocate-lruvec_stats.patch b/patches/old/memcg-dynamically-allocate-lruvec_stats.patch index b06db9bb6..b06db9bb6 100644 --- a/patches/memcg-dynamically-allocate-lruvec_stats.patch +++ b/patches/old/memcg-dynamically-allocate-lruvec_stats.patch diff --git a/patches/memcg-pr_warn_once-for-unexpected-events-and-stats.patch b/patches/old/memcg-pr_warn_once-for-unexpected-events-and-stats.patch index a5c4c7684..a5c4c7684 100644 --- a/patches/memcg-pr_warn_once-for-unexpected-events-and-stats.patch +++ b/patches/old/memcg-pr_warn_once-for-unexpected-events-and-stats.patch diff --git a/patches/memcg-reduce-memory-for-the-lruvec-and-memcg-stats.patch b/patches/old/memcg-reduce-memory-for-the-lruvec-and-memcg-stats.patch index b312342d8..b312342d8 100644 --- a/patches/memcg-reduce-memory-for-the-lruvec-and-memcg-stats.patch +++ b/patches/old/memcg-reduce-memory-for-the-lruvec-and-memcg-stats.patch diff --git a/patches/memcg-reduce-memory-size-of-mem_cgroup_events_index.patch b/patches/old/memcg-reduce-memory-size-of-mem_cgroup_events_index.patch index cc0596c3d..cc0596c3d 100644 --- a/patches/memcg-reduce-memory-size-of-mem_cgroup_events_index.patch +++ b/patches/old/memcg-reduce-memory-size-of-mem_cgroup_events_index.patch diff --git a/patches/memcg-use-proper-type-for-mod_memcg_state.patch b/patches/old/memcg-use-proper-type-for-mod_memcg_state.patch index b19324cf0..b19324cf0 100644 --- a/patches/memcg-use-proper-type-for-mod_memcg_state.patch +++ b/patches/old/memcg-use-proper-type-for-mod_memcg_state.patch diff --git a/patches/mm-cleanup-workingset_nodes-in-workingset.patch b/patches/old/mm-cleanup-workingset_nodes-in-workingset.patch index 8dad71032..8dad71032 100644 --- a/patches/mm-cleanup-workingset_nodes-in-workingset.patch +++ b/patches/old/mm-cleanup-workingset_nodes-in-workingset.patch diff --git a/patches/mm-ksm-rename-mm_slot-for-get_next_rmap_item.patch b/patches/old/mm-ksm-rename-mm_slot-for-get_next_rmap_item.patch index f049983b9..f049983b9 100644 --- a/patches/mm-ksm-rename-mm_slot-for-get_next_rmap_item.patch +++ b/patches/old/mm-ksm-rename-mm_slot-for-get_next_rmap_item.patch diff --git a/patches/mm-ksm-rename-mm_slot-members-to-ksm_slot-for-better-readability.patch b/patches/old/mm-ksm-rename-mm_slot-members-to-ksm_slot-for-better-readability.patch index 8aeb4032e..8aeb4032e 100644 --- a/patches/mm-ksm-rename-mm_slot-members-to-ksm_slot-for-better-readability.patch +++ b/patches/old/mm-ksm-rename-mm_slot-members-to-ksm_slot-for-better-readability.patch diff --git a/patches/mm-ksm-rename-mm_slot_cache-to-ksm_slot_cache.patch b/patches/old/mm-ksm-rename-mm_slot_cache-to-ksm_slot_cache.patch index 66a3b6838..66a3b6838 100644 --- a/patches/mm-ksm-rename-mm_slot_cache-to-ksm_slot_cache.patch +++ b/patches/old/mm-ksm-rename-mm_slot_cache-to-ksm_slot_cache.patch diff --git 
a/patches/mm-ksm-rename-variable-mm_slot-to-ksm_slot-in-unmerge_and_remove_all_rmap_items.patch b/patches/old/mm-ksm-rename-variable-mm_slot-to-ksm_slot-in-unmerge_and_remove_all_rmap_items.patch index 0182f458d..0182f458d 100644 --- a/patches/mm-ksm-rename-variable-mm_slot-to-ksm_slot-in-unmerge_and_remove_all_rmap_items.patch +++ b/patches/old/mm-ksm-rename-variable-mm_slot-to-ksm_slot-in-unmerge_and_remove_all_rmap_items.patch diff --git a/patches/selftests-memfd-fix-spelling-mistakes.patch b/patches/selftests-memfd-fix-spelling-mistakes.patch new file mode 100644 index 000000000..c2949418f --- /dev/null +++ b/patches/selftests-memfd-fix-spelling-mistakes.patch @@ -0,0 +1,42 @@ +From: Saurav Shah <sauravshah.31@gmail.com> +Subject: selftests/memfd: fix spelling mistakes +Date: Thu, 2 May 2024 04:43:17 +0530 + +Fix spelling mistakes in the comments. + +Link: https://lkml.kernel.org/r/20240501231317.24648-1-sauravshah.31@gmail.com +Signed-off-by: Saurav Shah <sauravshah.31@gmail.com> +Cc: Aleksa Sarai <cyphar@cyphar.com> +Cc: Greg Thelen <gthelen@google.com> +Cc: Jeff Xu <jeffxu@google.com> +Cc: Shuah Khan <shuah@kernel.org> +Signed-off-by: Andrew Morton <akpm@linux-foundation.org> +--- + + tools/testing/selftests/memfd/fuse_test.c | 2 +- + tools/testing/selftests/memfd/memfd_test.c | 2 +- + 2 files changed, 2 insertions(+), 2 deletions(-) + +--- a/tools/testing/selftests/memfd/fuse_test.c~selftests-memfd-fix-spelling-mistakes ++++ a/tools/testing/selftests/memfd/fuse_test.c +@@ -306,7 +306,7 @@ int main(int argc, char **argv) + * then the kernel did a page-replacement or canceled the read() (or + * whatever magic it did..). In that case, the memfd object is still + * all zero. +- * In case the memfd-object was *not* sealed, the read() was successfull ++ * In case the memfd-object was *not* sealed, the read() was successful + * and the memfd object must *not* be all zero. + * Note that in real scenarios, there might be a mixture of both, but + * in this test-cases, we have explicit 200ms delays which should be +--- a/tools/testing/selftests/memfd/memfd_test.c~selftests-memfd-fix-spelling-mistakes ++++ a/tools/testing/selftests/memfd/memfd_test.c +@@ -1528,7 +1528,7 @@ static void test_share_open(char *banner + + /* + * Test sharing via fork() +- * Test whether seal-modifications work as expected with forked childs. ++ * Test whether seal-modifications work as expected with forked children. + */ + static void test_share_fork(char *banner, char *b_suffix) + { +_ diff --git a/patches/selftests-mm-cow-flag-vmsplice-hugetlb-tests-as-xfail.patch b/patches/selftests-mm-cow-flag-vmsplice-hugetlb-tests-as-xfail.patch new file mode 100644 index 000000000..78783ffe3 --- /dev/null +++ b/patches/selftests-mm-cow-flag-vmsplice-hugetlb-tests-as-xfail.patch @@ -0,0 +1,325 @@ +From: David Hildenbrand <david@redhat.com> +Subject: selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL +Date: Thu, 2 May 2024 10:52:58 +0200 + +Patch series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL". + +The failing hugetlb vmsplice() COW tests keep confusing people, and having +tests that have been failing for years and likely will keep failing for +years to come because nobody cares enough is rather suboptimal. Let's +mark them as XFAIL and document why fixing them is not that easy as it +would appear at first sight. 
+ +More details can be found in [1], especially around how hugetlb pages +cannot really be overcommitted, and why we don't particularly care about +these vmsplice() leaks for hugetlb -- in contrast to ordinary memory. + +[1] https://lore.kernel.org/all/8b42a24d-caf0-46ef-9e15-0f88d47d2f21@redhat.com/ + + +This patch (of 2): + +The vmsplice() hugetlb tests have been failing right from the start, and +we documented that in the introducing commit 7dad331be781 ("selftests/vm: +anon_cow: hugetlb tests"): + + Note that some tests cases still fail. This will, for example, be + fixed once vmsplice properly uses FOLL_PIN instead of FOLL_GET for + pinning. With 2 MiB and 1 GiB hugetlb on x86_64, the expected + failures are: + +Until vmsplice() is changed, these tests will likely keep failing: hugetlb +COW reuse logic is harder to change, because using the same COW reuse +logic as we use for !hugetlb could harm other (sane) users when running +out of free hugetlb pages. + +More details can be found in [1], especially around how hugetlb pages +cannot really be overcommitted, and why we don't particularly care about +these vmsplice() leaks for hugetlb -- in contrast to ordinary memory. + +These (expected) failures keep confusing people, so flag them accordingly. + +Before: + $ ./cow + [...] + Bail out! 8 out of 778 tests failed + # Totals: pass:769 fail:8 xfail:0 xpass:0 skip:1 error:0 + $ echo $? + 1 + +After: + $ ./cow + [...] + # Totals: pass:769 fail:0 xfail:8 xpass:0 skip:1 error:0 + $ echo $? + 0 + +[1] https://lore.kernel.org/all/8b42a24d-caf0-46ef-9e15-0f88d47d2f21@redhat.com/ + +Link: https://lkml.kernel.org/r/20240502085259.103784-1-david@redhat.com +Link: https://lkml.kernel.org/r/20240502085259.103784-2-david@redhat.com +Signed-off-by: David Hildenbrand <david@redhat.com> +Cc: Muchun Song <muchun.song@linux.dev> +Cc: Peter Xu <peterx@redhat.com> +Cc: Shuah Khan <shuah@kernel.org> +Signed-off-by: Andrew Morton <akpm@linux-foundation.org> +--- + + tools/testing/selftests/mm/cow.c | 106 +++++++++++++++++++---------- + 1 file changed, 71 insertions(+), 35 deletions(-) + +--- a/tools/testing/selftests/mm/cow.c~selftests-mm-cow-flag-vmsplice-hugetlb-tests-as-xfail ++++ a/tools/testing/selftests/mm/cow.c +@@ -199,7 +199,7 @@ static int child_vmsplice_memcmp_fn(char + typedef int (*child_fn)(char *mem, size_t size, struct comm_pipes *comm_pipes); + + static void do_test_cow_in_parent(char *mem, size_t size, bool do_mprotect, +- child_fn fn) ++ child_fn fn, bool xfail) + { + struct comm_pipes comm_pipes; + char buf; +@@ -247,33 +247,47 @@ static void do_test_cow_in_parent(char * + else + ret = -EINVAL; + +- ksft_test_result(!ret, "No leak from parent into child\n"); ++ if (!ret) { ++ ksft_test_result_pass("No leak from parent into child\n"); ++ } else if (xfail) { ++ /* ++ * With hugetlb, some vmsplice() tests are currently expected to ++ * fail because (a) harder to fix and (b) nobody really cares. ++ * Flag them as expected failure for now. 
++ */ ++ ksft_test_result_xfail("Leak from parent into child\n"); ++ } else { ++ ksft_test_result_fail("Leak from parent into child\n"); ++ } + close_comm_pipes: + close_comm_pipes(&comm_pipes); + } + +-static void test_cow_in_parent(char *mem, size_t size) ++static void test_cow_in_parent(char *mem, size_t size, bool is_hugetlb) + { +- do_test_cow_in_parent(mem, size, false, child_memcmp_fn); ++ do_test_cow_in_parent(mem, size, false, child_memcmp_fn, false); + } + +-static void test_cow_in_parent_mprotect(char *mem, size_t size) ++static void test_cow_in_parent_mprotect(char *mem, size_t size, bool is_hugetlb) + { +- do_test_cow_in_parent(mem, size, true, child_memcmp_fn); ++ do_test_cow_in_parent(mem, size, true, child_memcmp_fn, false); + } + +-static void test_vmsplice_in_child(char *mem, size_t size) ++static void test_vmsplice_in_child(char *mem, size_t size, bool is_hugetlb) + { +- do_test_cow_in_parent(mem, size, false, child_vmsplice_memcmp_fn); ++ do_test_cow_in_parent(mem, size, false, child_vmsplice_memcmp_fn, ++ is_hugetlb); + } + +-static void test_vmsplice_in_child_mprotect(char *mem, size_t size) ++static void test_vmsplice_in_child_mprotect(char *mem, size_t size, ++ bool is_hugetlb) + { +- do_test_cow_in_parent(mem, size, true, child_vmsplice_memcmp_fn); ++ do_test_cow_in_parent(mem, size, true, child_vmsplice_memcmp_fn, ++ is_hugetlb); + } + + static void do_test_vmsplice_in_parent(char *mem, size_t size, +- bool before_fork) ++ bool before_fork, bool xfail) + { + struct iovec iov = { + .iov_base = mem, +@@ -355,8 +369,18 @@ static void do_test_vmsplice_in_parent(c + } + } + +- ksft_test_result(!memcmp(old, new, transferred), +- "No leak from child into parent\n"); ++ if (!memcmp(old, new, transferred)) { ++ ksft_test_result_pass("No leak from child into parent\n"); ++ } else if (xfail) { ++ /* ++ * With hugetlb, some vmsplice() tests are currently expected to ++ * fail because (a) harder to fix and (b) nobody really cares. ++ * Flag them as expected failure for now. 
++ */ ++ ksft_test_result_xfail("Leak from child into parent\n"); ++ } else { ++ ksft_test_result_fail("Leak from child into parent\n"); ++ } + close_pipe: + close(fds[0]); + close(fds[1]); +@@ -367,14 +391,14 @@ free: + free(new); + } + +-static void test_vmsplice_before_fork(char *mem, size_t size) ++static void test_vmsplice_before_fork(char *mem, size_t size, bool is_hugetlb) + { +- do_test_vmsplice_in_parent(mem, size, true); ++ do_test_vmsplice_in_parent(mem, size, true, is_hugetlb); + } + +-static void test_vmsplice_after_fork(char *mem, size_t size) ++static void test_vmsplice_after_fork(char *mem, size_t size, bool is_hugetlb) + { +- do_test_vmsplice_in_parent(mem, size, false); ++ do_test_vmsplice_in_parent(mem, size, false, is_hugetlb); + } + + #ifdef LOCAL_CONFIG_HAVE_LIBURING +@@ -529,12 +553,12 @@ close_comm_pipes: + close_comm_pipes(&comm_pipes); + } + +-static void test_iouring_ro(char *mem, size_t size) ++static void test_iouring_ro(char *mem, size_t size, bool is_hugetlb) + { + do_test_iouring(mem, size, false); + } + +-static void test_iouring_fork(char *mem, size_t size) ++static void test_iouring_fork(char *mem, size_t size, bool is_hugetlb) + { + do_test_iouring(mem, size, true); + } +@@ -678,37 +702,41 @@ free_tmp: + free(tmp); + } + +-static void test_ro_pin_on_shared(char *mem, size_t size) ++static void test_ro_pin_on_shared(char *mem, size_t size, bool is_hugetlb) + { + do_test_ro_pin(mem, size, RO_PIN_TEST_SHARED, false); + } + +-static void test_ro_fast_pin_on_shared(char *mem, size_t size) ++static void test_ro_fast_pin_on_shared(char *mem, size_t size, bool is_hugetlb) + { + do_test_ro_pin(mem, size, RO_PIN_TEST_SHARED, true); + } + +-static void test_ro_pin_on_ro_previously_shared(char *mem, size_t size) ++static void test_ro_pin_on_ro_previously_shared(char *mem, size_t size, ++ bool is_hugetlb) + { + do_test_ro_pin(mem, size, RO_PIN_TEST_PREVIOUSLY_SHARED, false); + } + +-static void test_ro_fast_pin_on_ro_previously_shared(char *mem, size_t size) ++static void test_ro_fast_pin_on_ro_previously_shared(char *mem, size_t size, ++ bool is_hugetlb) + { + do_test_ro_pin(mem, size, RO_PIN_TEST_PREVIOUSLY_SHARED, true); + } + +-static void test_ro_pin_on_ro_exclusive(char *mem, size_t size) ++static void test_ro_pin_on_ro_exclusive(char *mem, size_t size, ++ bool is_hugetlb) + { + do_test_ro_pin(mem, size, RO_PIN_TEST_RO_EXCLUSIVE, false); + } + +-static void test_ro_fast_pin_on_ro_exclusive(char *mem, size_t size) ++static void test_ro_fast_pin_on_ro_exclusive(char *mem, size_t size, ++ bool is_hugetlb) + { + do_test_ro_pin(mem, size, RO_PIN_TEST_RO_EXCLUSIVE, true); + } + +-typedef void (*test_fn)(char *mem, size_t size); ++typedef void (*test_fn)(char *mem, size_t size, bool hugetlb); + + static void do_run_with_base_page(test_fn fn, bool swapout) + { +@@ -740,7 +768,7 @@ static void do_run_with_base_page(test_f + } + } + +- fn(mem, pagesize); ++ fn(mem, pagesize, false); + munmap: + munmap(mem, pagesize); + } +@@ -904,7 +932,7 @@ static void do_run_with_thp(test_fn fn, + break; + } + +- fn(mem, size); ++ fn(mem, size, false); + munmap: + munmap(mmap_mem, mmap_size); + if (mremap_mem != MAP_FAILED) +@@ -997,7 +1025,7 @@ static void run_with_hugetlb(test_fn fn, + } + munmap(dummy, hugetlbsize); + +- fn(mem, hugetlbsize); ++ fn(mem, hugetlbsize, true); + munmap: + munmap(mem, hugetlbsize); + } +@@ -1036,7 +1064,7 @@ static const struct test_case anon_test_ + */ + { + "vmsplice() + unmap in child", +- test_vmsplice_in_child ++ test_vmsplice_in_child, + }, + /* + 
* vmsplice() test, but do an additional mprotect(PROT_READ)+ +@@ -1044,7 +1072,7 @@ static const struct test_case anon_test_ + */ + { + "vmsplice() + unmap in child with mprotect() optimization", +- test_vmsplice_in_child_mprotect ++ test_vmsplice_in_child_mprotect, + }, + /* + * vmsplice() [R/O GUP] in parent before fork(), unmap in parent after +@@ -1322,23 +1350,31 @@ close_comm_pipes: + close_comm_pipes(&comm_pipes); + } + +-static void test_anon_thp_collapse_unshared(char *mem, size_t size) ++static void test_anon_thp_collapse_unshared(char *mem, size_t size, ++ bool is_hugetlb) + { ++ assert(!is_hugetlb); + do_test_anon_thp_collapse(mem, size, ANON_THP_COLLAPSE_UNSHARED); + } + +-static void test_anon_thp_collapse_fully_shared(char *mem, size_t size) ++static void test_anon_thp_collapse_fully_shared(char *mem, size_t size, ++ bool is_hugetlb) + { ++ assert(!is_hugetlb); + do_test_anon_thp_collapse(mem, size, ANON_THP_COLLAPSE_FULLY_SHARED); + } + +-static void test_anon_thp_collapse_lower_shared(char *mem, size_t size) ++static void test_anon_thp_collapse_lower_shared(char *mem, size_t size, ++ bool is_hugetlb) + { ++ assert(!is_hugetlb); + do_test_anon_thp_collapse(mem, size, ANON_THP_COLLAPSE_LOWER_SHARED); + } + +-static void test_anon_thp_collapse_upper_shared(char *mem, size_t size) ++static void test_anon_thp_collapse_upper_shared(char *mem, size_t size, ++ bool is_hugetlb) + { ++ assert(!is_hugetlb); + do_test_anon_thp_collapse(mem, size, ANON_THP_COLLAPSE_UPPER_SHARED); + } + +_ diff --git a/pc/bitops-optimize-fns-for-improved-performance.pc b/pc/bitops-optimize-fns-for-improved-performance.pc deleted file mode 100644 index 2ceed3e3f..000000000 --- a/pc/bitops-optimize-fns-for-improved-performance.pc +++ /dev/null @@ -1 +0,0 @@ -include/linux/bitops.h diff --git a/pc/devel-series b/pc/devel-series index 98752e89f..9f4f507d4 100644 --- a/pc/devel-series +++ b/pc/devel-series @@ -729,10 +729,6 @@ mm-rmap-remove-duplicated-exit-code-in-pagewalk-loop.patch mm-rmap-integrate-pmd-mapped-folio-splitting-into-pagewalk-loop.patch mm-vmscan-avoid-split-lazyfree-thp-during-shrink_folio_list.patch # -mm-ksm-rename-mm_slot-members-to-ksm_slot-for-better-readability.patch -mm-ksm-rename-variable-mm_slot-to-ksm_slot-in-unmerge_and_remove_all_rmap_items.patch -mm-ksm-rename-mm_slot_cache-to-ksm_slot_cache.patch -mm-ksm-rename-mm_slot-for-get_next_rmap_item.patch # mm-pagemap-make-trylock_page-return-bool.patch # @@ -740,16 +736,16 @@ mm-rmap-change-the-type-of-we_locked-from-int-to-bool.patch # mm-swapfile-mark-racy-access-on-si-highest_bit.patch # -memcg-reduce-memory-size-of-mem_cgroup_events_index.patch -#memcg-dynamically-allocate-lruvec_stats.patch: https://lkml.kernel.org/r/Zi_BsznrtoC1N_lq@P9FQF9L96D -memcg-dynamically-allocate-lruvec_stats.patch -#memcg-reduce-memory-for-the-lruvec-and-memcg-stats.patch: https://lkml.kernel.org/r/Zi_EEOUS_iCh2Nfh@P9FQF9L96D https://lkml.kernel.org/r/CABdmKX0Mo74f-AjY46cyPwd2qSpwmj4CLvYVCEywq_PEsVmd9w@mail.gmail.com -memcg-reduce-memory-for-the-lruvec-and-memcg-stats.patch -memcg-cleanup-__mod_memcg_lruvec_state.patch -#memcg-pr_warn_once-for-unexpected-events-and-stats.patch: https://lkml.kernel.org/r/Zi_Ff_iAs_PTph4l@P9FQF9L96D -memcg-pr_warn_once-for-unexpected-events-and-stats.patch -memcg-use-proper-type-for-mod_memcg_state.patch -mm-cleanup-workingset_nodes-in-workingset.patch +# +selftests-mm-cow-flag-vmsplice-hugetlb-tests-as-xfail.patch +mm-hugetlb-document-why-hugetlb-uses-folio_mapcount-for-cow-reuse-decisions.patch +# 
+#mm-memory-failure-try-to-send-sigbus-even-if-unmap-failed.patch+N: acks? +mm-memory-failure-try-to-send-sigbus-even-if-unmap-failed.patch +mm-madvise-add-mf_action_required-to-madvisemadv_hwpoison.patch +mm-memory-failure-send-sigbus-in-the-event-of-thp-split-fail.patch +# +selftests-memfd-fix-spelling-mistakes.patch # # # @@ -933,7 +929,5 @@ scripts-gdb-fix-detection-of-current-cpu-in-kgdb.patch squashfs-convert-squashfs_symlink_read_folio-to-use-folio-apis.patch squashfs-remove-calls-to-set-the-folio-error-flag.patch # -lib-test_bitops-add-benchmark-test-for-fns.patch -bitops-optimize-fns-for-improved-performance.patch # #ENDBRANCH mm-nonmm-unstable diff --git a/pc/lib-test_bitops-add-benchmark-test-for-fns.pc b/pc/lib-test_bitops-add-benchmark-test-for-fns.pc deleted file mode 100644 index ebe826768..000000000 --- a/pc/lib-test_bitops-add-benchmark-test-for-fns.pc +++ /dev/null @@ -1 +0,0 @@ -lib/test_bitops.c diff --git a/pc/memcg-cleanup-__mod_memcg_lruvec_state.pc b/pc/memcg-cleanup-__mod_memcg_lruvec_state.pc deleted file mode 100644 index ba4010b8e..000000000 --- a/pc/memcg-cleanup-__mod_memcg_lruvec_state.pc +++ /dev/null @@ -1 +0,0 @@ -mm/memcontrol.c diff --git a/pc/memcg-dynamically-allocate-lruvec_stats.pc b/pc/memcg-dynamically-allocate-lruvec_stats.pc deleted file mode 100644 index 78bdc29b3..000000000 --- a/pc/memcg-dynamically-allocate-lruvec_stats.pc +++ /dev/null @@ -1,2 +0,0 @@ -include/linux/memcontrol.h -mm/memcontrol.c diff --git a/pc/memcg-pr_warn_once-for-unexpected-events-and-stats.pc b/pc/memcg-pr_warn_once-for-unexpected-events-and-stats.pc deleted file mode 100644 index ba4010b8e..000000000 --- a/pc/memcg-pr_warn_once-for-unexpected-events-and-stats.pc +++ /dev/null @@ -1 +0,0 @@ -mm/memcontrol.c diff --git a/pc/memcg-reduce-memory-for-the-lruvec-and-memcg-stats.pc b/pc/memcg-reduce-memory-for-the-lruvec-and-memcg-stats.pc deleted file mode 100644 index ba4010b8e..000000000 --- a/pc/memcg-reduce-memory-for-the-lruvec-and-memcg-stats.pc +++ /dev/null @@ -1 +0,0 @@ -mm/memcontrol.c diff --git a/pc/memcg-reduce-memory-size-of-mem_cgroup_events_index.pc b/pc/memcg-reduce-memory-size-of-mem_cgroup_events_index.pc deleted file mode 100644 index ba4010b8e..000000000 --- a/pc/memcg-reduce-memory-size-of-mem_cgroup_events_index.pc +++ /dev/null @@ -1 +0,0 @@ -mm/memcontrol.c diff --git a/pc/memcg-use-proper-type-for-mod_memcg_state.pc b/pc/memcg-use-proper-type-for-mod_memcg_state.pc deleted file mode 100644 index 78bdc29b3..000000000 --- a/pc/memcg-use-proper-type-for-mod_memcg_state.pc +++ /dev/null @@ -1,2 +0,0 @@ -include/linux/memcontrol.h -mm/memcontrol.c diff --git a/pc/mm-cleanup-workingset_nodes-in-workingset.pc b/pc/mm-cleanup-workingset_nodes-in-workingset.pc deleted file mode 100644 index 9414275f2..000000000 --- a/pc/mm-cleanup-workingset_nodes-in-workingset.pc +++ /dev/null @@ -1 +0,0 @@ -mm/workingset.c diff --git a/pc/mm-hugetlb-document-why-hugetlb-uses-folio_mapcount-for-cow-reuse-decisions.pc b/pc/mm-hugetlb-document-why-hugetlb-uses-folio_mapcount-for-cow-reuse-decisions.pc new file mode 100644 index 000000000..6dc98425d --- /dev/null +++ b/pc/mm-hugetlb-document-why-hugetlb-uses-folio_mapcount-for-cow-reuse-decisions.pc @@ -0,0 +1 @@ +mm/hugetlb.c diff --git a/pc/mm-ksm-rename-mm_slot-for-get_next_rmap_item.pc b/pc/mm-ksm-rename-mm_slot-for-get_next_rmap_item.pc deleted file mode 100644 index 712364693..000000000 --- a/pc/mm-ksm-rename-mm_slot-for-get_next_rmap_item.pc +++ /dev/null @@ -1 +0,0 @@ -mm/ksm.c diff --git 
a/pc/mm-ksm-rename-mm_slot-members-to-ksm_slot-for-better-readability.pc b/pc/mm-ksm-rename-mm_slot-members-to-ksm_slot-for-better-readability.pc deleted file mode 100644 index 712364693..000000000 --- a/pc/mm-ksm-rename-mm_slot-members-to-ksm_slot-for-better-readability.pc +++ /dev/null @@ -1 +0,0 @@ -mm/ksm.c diff --git a/pc/mm-ksm-rename-mm_slot_cache-to-ksm_slot_cache.pc b/pc/mm-ksm-rename-mm_slot_cache-to-ksm_slot_cache.pc deleted file mode 100644 index 712364693..000000000 --- a/pc/mm-ksm-rename-mm_slot_cache-to-ksm_slot_cache.pc +++ /dev/null @@ -1 +0,0 @@ -mm/ksm.c diff --git a/pc/mm-ksm-rename-variable-mm_slot-to-ksm_slot-in-unmerge_and_remove_all_rmap_items.pc b/pc/mm-ksm-rename-variable-mm_slot-to-ksm_slot-in-unmerge_and_remove_all_rmap_items.pc deleted file mode 100644 index 712364693..000000000 --- a/pc/mm-ksm-rename-variable-mm_slot-to-ksm_slot-in-unmerge_and_remove_all_rmap_items.pc +++ /dev/null @@ -1 +0,0 @@ -mm/ksm.c diff --git a/pc/mm-madvise-add-mf_action_required-to-madvisemadv_hwpoison.pc b/pc/mm-madvise-add-mf_action_required-to-madvisemadv_hwpoison.pc new file mode 100644 index 000000000..74d58a564 --- /dev/null +++ b/pc/mm-madvise-add-mf_action_required-to-madvisemadv_hwpoison.pc @@ -0,0 +1 @@ +mm/madvise.c diff --git a/pc/mm-memory-failure-send-sigbus-in-the-event-of-thp-split-fail.pc b/pc/mm-memory-failure-send-sigbus-in-the-event-of-thp-split-fail.pc new file mode 100644 index 000000000..709648673 --- /dev/null +++ b/pc/mm-memory-failure-send-sigbus-in-the-event-of-thp-split-fail.pc @@ -0,0 +1 @@ +mm/memory-failure.c diff --git a/pc/mm-memory-failure-try-to-send-sigbus-even-if-unmap-failed.pc b/pc/mm-memory-failure-try-to-send-sigbus-even-if-unmap-failed.pc new file mode 100644 index 000000000..709648673 --- /dev/null +++ b/pc/mm-memory-failure-try-to-send-sigbus-even-if-unmap-failed.pc @@ -0,0 +1 @@ +mm/memory-failure.c diff --git a/pc/selftests-memfd-fix-spelling-mistakes.pc b/pc/selftests-memfd-fix-spelling-mistakes.pc new file mode 100644 index 000000000..8295a975a --- /dev/null +++ b/pc/selftests-memfd-fix-spelling-mistakes.pc @@ -0,0 +1,2 @@ +tools/testing/selftests/memfd/fuse_test.c +tools/testing/selftests/memfd/memfd_test.c diff --git a/pc/selftests-mm-cow-flag-vmsplice-hugetlb-tests-as-xfail.pc b/pc/selftests-mm-cow-flag-vmsplice-hugetlb-tests-as-xfail.pc new file mode 100644 index 000000000..d4cbb097a --- /dev/null +++ b/pc/selftests-mm-cow-flag-vmsplice-hugetlb-tests-as-xfail.pc @@ -0,0 +1 @@ +tools/testing/selftests/mm/cow.c diff --git a/txt/bitops-optimize-fns-for-improved-performance.txt b/txt/bitops-optimize-fns-for-improved-performance.txt deleted file mode 100644 index 74bb7e91a..000000000 --- a/txt/bitops-optimize-fns-for-improved-performance.txt +++ /dev/null @@ -1,34 +0,0 @@ -From: Kuan-Wei Chiu <visitorckw@gmail.com> -Subject: bitops: optimize fns() for improved performance -Date: Thu, 2 May 2024 17:24:43 +0800 - -The current fns() repeatedly uses __ffs() to find the index of the least -significant bit and then clears the corresponding bit using __clear_bit(). -The method for clearing the least significant bit can be optimized by -using word &= word - 1 instead. - -Typically, the execution time of one __ffs() plus one __clear_bit() is -longer than that of a bitwise AND operation and a subtraction. To improve -performance, the loop for clearing the least significant bit has been -replaced with word &= word - 1, followed by a single __ffs() operation to -obtain the answer. 
This change reduces the number of __ffs() iterations -from n to just one, enhancing overall performance. - -This modification significantly accelerates the fns() function in the -test_bitops benchmark, improving its speed by approximately 7.6 times. -Additionally, it enhances the performance of find_nth_bit() in the -find_bit benchmark by approximately 26%. - -Before: -test_bitops: fns: 58033164 ns -find_nth_bit: 4254313 ns, 16525 iterations - -After: -test_bitops: fns: 7637268 ns -find_nth_bit: 3362863 ns, 16501 iterations - -Link: https://lkml.kernel.org/r/20240502092443.6845-3-visitorckw@gmail.com -Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> -Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw> -Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> -Cc: Yury Norov <yury.norov@gmail.com> diff --git a/txt/mm-hugetlb-document-why-hugetlb-uses-folio_mapcount-for-cow-reuse-decisions.txt b/txt/mm-hugetlb-document-why-hugetlb-uses-folio_mapcount-for-cow-reuse-decisions.txt new file mode 100644 index 000000000..d473aca44 --- /dev/null +++ b/txt/mm-hugetlb-document-why-hugetlb-uses-folio_mapcount-for-cow-reuse-decisions.txt @@ -0,0 +1,19 @@ +From: David Hildenbrand <david@redhat.com> +Subject: mm/hugetlb: document why hugetlb uses folio_mapcount() for COW reuse decisions +Date: Thu, 2 May 2024 10:52:59 +0200 + +Let's document why hugetlb still uses folio_mapcount() and is prone to +leaking memory between processes, for example using vmsplice() that still +uses FOLL_GET. + +More details can be found in [1], especially around how hugetlb pages +cannot really be overcommitted, and why we don't particularly care about +these vmsplice() leaks for hugetlb -- in contrast to ordinary memory. + +[1] https://lore.kernel.org/all/8b42a24d-caf0-46ef-9e15-0f88d47d2f21@redhat.com/ + +Link: https://lkml.kernel.org/r/20240502085259.103784-3-david@redhat.com +Signed-off-by: David Hildenbrand <david@redhat.com> +Suggested-by: Peter Xu <peterx@redhat.com> +Cc: Muchun Song <muchun.song@linux.dev> +Cc: Shuah Khan <shuah@kernel.org> diff --git a/txt/mm-madvise-add-mf_action_required-to-madvisemadv_hwpoison.txt b/txt/mm-madvise-add-mf_action_required-to-madvisemadv_hwpoison.txt new file mode 100644 index 000000000..638e15003 --- /dev/null +++ b/txt/mm-madvise-add-mf_action_required-to-madvisemadv_hwpoison.txt @@ -0,0 +1,13 @@ +From: Jane Chu <jane.chu@oracle.com> +Subject: mm/madvise: add MF_ACTION_REQUIRED to madvise(MADV_HWPOISON) +Date: Wed, 1 May 2024 17:24:57 -0600 + +The soft hwpoison injector via madvise(MADV_HWPOISON) operates in a +synchrous way in a sense, the injector is also a process under test, and +should it have the poisoned page mapped in its address space, it should +legitimately get killed as much as in a real UE situation. + +Link: https://lkml.kernel.org/r/20240501232458.3919593-3-jane.chu@oracle.com +Signed-off-by: Jane Chu <jane.chu@oracle.com> +Cc: Miaohe Lin <linmiaohe@huawei.com> +Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> diff --git a/txt/mm-memory-failure-send-sigbus-in-the-event-of-thp-split-fail.txt b/txt/mm-memory-failure-send-sigbus-in-the-event-of-thp-split-fail.txt new file mode 100644 index 000000000..a361d92ce --- /dev/null +++ b/txt/mm-memory-failure-send-sigbus-in-the-event-of-thp-split-fail.txt @@ -0,0 +1,13 @@ +From: Jane Chu <jane.chu@oracle.com> +Subject: mm/memory-failure: send SIGBUS in the event of thp split fail +Date: Wed, 1 May 2024 17:24:58 -0600 + +When handle hwpoison in a GUP longterm pin'ed thp page, +try_to_split_thp_page() will fail. 
And at this point, there is little +else the kernel could do except sending a SIGBUS to the user process, thus +give it a chance to recover. + +Link: https://lkml.kernel.org/r/20240501232458.3919593-4-jane.chu@oracle.com +Signed-off-by: Jane Chu <jane.chu@oracle.com> +Cc: Miaohe Lin <linmiaohe@huawei.com> +Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> diff --git a/txt/mm-memory-failure-try-to-send-sigbus-even-if-unmap-failed.txt b/txt/mm-memory-failure-try-to-send-sigbus-even-if-unmap-failed.txt new file mode 100644 index 000000000..02e507840 --- /dev/null +++ b/txt/mm-memory-failure-try-to-send-sigbus-even-if-unmap-failed.txt @@ -0,0 +1,42 @@ +From: Jane Chu <jane.chu@oracle.com> +Subject: mm/memory-failure: try to send SIGBUS even if unmap failed +Date: Wed, 1 May 2024 17:24:56 -0600 + +Patch series "Enhance soft hwpoison handling and injection". + +This series aim at the following enhancement - + +1. Let one hwpoison injector, that is, madvise(MADV_HWPOISON) to behave + more like as if a real UE occurred. Because the other two injectors + such as hwpoison-inject and the 'einj' on x86 can't, and it seems to + me we need a better simulation to real UE scenario. + +2. For years, if the kernel is unable to unmap a hwpoisoned page, it send + a SIGKILL instead of SIGBUS to prevent user process from potentially + accessing the page again. But in doing so, the user process also lose + important information: vaddr, for recovery. Fortunately, the kernel + already has code to kill process re-accessing a hwpoisoned page, so + remove the '!unmap_success' check. + +3. Right now, if a thp page under GUP longterm pin is hwpoisoned, and + kernel cannot split the thp page, memory-failure simply ignores + the UE and returns. That's not ideal, it could deliver a SIGBUS with + useful information for userspace recovery. + + +This patch (of 3): + +For years when it comes down to kill a process due to hwpoison, a SIGBUS +is delivered only if unmap has been successful. Otherwise, a SIGKILL is +delivered. And the reason for that is to prevent the involved process +from accessing the hwpoisoned page again. + +Since then a lot has changed, a hwpoisoned page is marked and upon being +re-accessed, the process will be killed immediately. So let's take out +the '!unmap_success' factor and try to deliver SIGBUS if possible. + +Link: https://lkml.kernel.org/r/20240501232458.3919593-1-jane.chu@oracle.com +Link: https://lkml.kernel.org/r/20240501232458.3919593-2-jane.chu@oracle.com +Signed-off-by: Jane Chu <jane.chu@oracle.com> +Cc: Miaohe Lin <linmiaohe@huawei.com> +Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> diff --git a/txt/old/bitops-optimize-fns-for-improved-performance.txt b/txt/old/bitops-optimize-fns-for-improved-performance.txt index 40c10ca9b..74bb7e91a 100644 --- a/txt/old/bitops-optimize-fns-for-improved-performance.txt +++ b/txt/old/bitops-optimize-fns-for-improved-performance.txt @@ -1,6 +1,6 @@ From: Kuan-Wei Chiu <visitorckw@gmail.com> Subject: bitops: optimize fns() for improved performance -Date: Fri, 26 Apr 2024 11:51:52 +0800 +Date: Thu, 2 May 2024 17:24:43 +0800 The current fns() repeatedly uses __ffs() to find the index of the least significant bit and then clears the corresponding bit using __clear_bit(). @@ -14,82 +14,21 @@ replaced with word &= word - 1, followed by a single __ffs() operation to obtain the answer. This change reduces the number of __ffs() iterations from n to just one, enhancing overall performance. 
-The following microbenchmark data, conducted on my x86-64 machine, shows -the execution time (in microseconds) required for 1000000 test data -generated by get_random_u64() and executed by fns() under different values -of n: +This modification significantly accelerates the fns() function in the +test_bitops benchmark, improving its speed by approximately 7.6 times. +Additionally, it enhances the performance of find_nth_bit() in the +find_bit benchmark by approximately 26%. -+-----+---------------+---------------+ -| n | time_old | time_new | -+-----+---------------+---------------+ -| 0 | 29194 | 25878 | -| 1 | 25510 | 25497 | -| 2 | 27836 | 25721 | -| 3 | 30140 | 25673 | -| 4 | 32569 | 25426 | -| 5 | 34792 | 25690 | -| 6 | 37117 | 25651 | -| 7 | 39742 | 25383 | -| 8 | 42360 | 25657 | -| 9 | 44672 | 25897 | -| 10 | 47237 | 25819 | -| 11 | 49884 | 26530 | -| 12 | 51864 | 26647 | -| 13 | 54265 | 28915 | -| 14 | 56440 | 28373 | -| 15 | 58839 | 28616 | -| 16 | 62383 | 29128 | -| 17 | 64257 | 30041 | -| 18 | 66805 | 29773 | -| 19 | 69368 | 33203 | -| 20 | 72942 | 33688 | -| 21 | 77006 | 34518 | -| 22 | 80926 | 34298 | -| 23 | 85723 | 35586 | -| 24 | 90324 | 36376 | -| 25 | 95992 | 37465 | -| 26 | 101101 | 37599 | -| 27 | 106520 | 37466 | -| 28 | 113287 | 38163 | -| 29 | 120552 | 38810 | -| 30 | 128040 | 39373 | -| 31 | 135624 | 40500 | -| 32 | 142580 | 40343 | -| 33 | 148915 | 40460 | -| 34 | 154005 | 41294 | -| 35 | 157996 | 41730 | -| 36 | 160806 | 41523 | -| 37 | 162975 | 42088 | -| 38 | 163426 | 41530 | -| 39 | 164872 | 41789 | -| 40 | 164477 | 42505 | -| 41 | 164758 | 41879 | -| 42 | 164182 | 41415 | -| 43 | 164842 | 42119 | -| 44 | 164881 | 42297 | -| 45 | 164870 | 42145 | -| 46 | 164673 | 42066 | -| 47 | 164616 | 42051 | -| 48 | 165055 | 41902 | -| 49 | 164847 | 41862 | -| 50 | 165171 | 41960 | -| 51 | 164851 | 42089 | -| 52 | 164763 | 41717 | -| 53 | 164635 | 42154 | -| 54 | 164757 | 41983 | -| 55 | 165095 | 41419 | -| 56 | 164641 | 42381 | -| 57 | 164601 | 41654 | -| 58 | 164864 | 41834 | -| 59 | 164594 | 41920 | -| 60 | 165207 | 42020 | -| 61 | 165056 | 41185 | -| 62 | 165160 | 41722 | -| 63 | 164923 | 41702 | -| 64 | 164777 | 41880 | -+-----+---------------+---------------+ +Before: +test_bitops: fns: 58033164 ns +find_nth_bit: 4254313 ns, 16525 iterations -Link: https://lkml.kernel.org/r/20240426035152.956702-1-visitorckw@gmail.com +After: +test_bitops: fns: 7637268 ns +find_nth_bit: 3362863 ns, 16501 iterations + +Link: https://lkml.kernel.org/r/20240502092443.6845-3-visitorckw@gmail.com Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw> +Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Yury Norov <yury.norov@gmail.com> diff --git a/txt/lib-test_bitops-add-benchmark-test-for-fns.txt b/txt/old/lib-test_bitops-add-benchmark-test-for-fns.txt index 95297e3ae..95297e3ae 100644 --- a/txt/lib-test_bitops-add-benchmark-test-for-fns.txt +++ b/txt/old/lib-test_bitops-add-benchmark-test-for-fns.txt diff --git a/txt/memcg-cleanup-__mod_memcg_lruvec_state.txt b/txt/old/memcg-cleanup-__mod_memcg_lruvec_state.txt index b76f8e88e..b76f8e88e 100644 --- a/txt/memcg-cleanup-__mod_memcg_lruvec_state.txt +++ b/txt/old/memcg-cleanup-__mod_memcg_lruvec_state.txt diff --git a/txt/memcg-dynamically-allocate-lruvec_stats.txt b/txt/old/memcg-dynamically-allocate-lruvec_stats.txt index c44cec71a..c44cec71a 100644 --- a/txt/memcg-dynamically-allocate-lruvec_stats.txt +++ b/txt/old/memcg-dynamically-allocate-lruvec_stats.txt diff --git 
a/txt/memcg-pr_warn_once-for-unexpected-events-and-stats.txt b/txt/old/memcg-pr_warn_once-for-unexpected-events-and-stats.txt index 0ef6cc094..0ef6cc094 100644 --- a/txt/memcg-pr_warn_once-for-unexpected-events-and-stats.txt +++ b/txt/old/memcg-pr_warn_once-for-unexpected-events-and-stats.txt diff --git a/txt/memcg-reduce-memory-for-the-lruvec-and-memcg-stats.txt b/txt/old/memcg-reduce-memory-for-the-lruvec-and-memcg-stats.txt index defe117a8..defe117a8 100644 --- a/txt/memcg-reduce-memory-for-the-lruvec-and-memcg-stats.txt +++ b/txt/old/memcg-reduce-memory-for-the-lruvec-and-memcg-stats.txt diff --git a/txt/memcg-reduce-memory-size-of-mem_cgroup_events_index.txt b/txt/old/memcg-reduce-memory-size-of-mem_cgroup_events_index.txt index f1e0027fa..f1e0027fa 100644 --- a/txt/memcg-reduce-memory-size-of-mem_cgroup_events_index.txt +++ b/txt/old/memcg-reduce-memory-size-of-mem_cgroup_events_index.txt diff --git a/txt/memcg-use-proper-type-for-mod_memcg_state.txt b/txt/old/memcg-use-proper-type-for-mod_memcg_state.txt index 2505cdb45..2505cdb45 100644 --- a/txt/memcg-use-proper-type-for-mod_memcg_state.txt +++ b/txt/old/memcg-use-proper-type-for-mod_memcg_state.txt diff --git a/txt/mm-cleanup-workingset_nodes-in-workingset.txt b/txt/old/mm-cleanup-workingset_nodes-in-workingset.txt index 5afe3719a..5afe3719a 100644 --- a/txt/mm-cleanup-workingset_nodes-in-workingset.txt +++ b/txt/old/mm-cleanup-workingset_nodes-in-workingset.txt diff --git a/txt/mm-ksm-rename-mm_slot-for-get_next_rmap_item.txt b/txt/old/mm-ksm-rename-mm_slot-for-get_next_rmap_item.txt index c4de044ad..c4de044ad 100644 --- a/txt/mm-ksm-rename-mm_slot-for-get_next_rmap_item.txt +++ b/txt/old/mm-ksm-rename-mm_slot-for-get_next_rmap_item.txt diff --git a/txt/mm-ksm-rename-mm_slot-members-to-ksm_slot-for-better-readability.txt b/txt/old/mm-ksm-rename-mm_slot-members-to-ksm_slot-for-better-readability.txt index af91085e7..af91085e7 100644 --- a/txt/mm-ksm-rename-mm_slot-members-to-ksm_slot-for-better-readability.txt +++ b/txt/old/mm-ksm-rename-mm_slot-members-to-ksm_slot-for-better-readability.txt diff --git a/txt/mm-ksm-rename-mm_slot_cache-to-ksm_slot_cache.txt b/txt/old/mm-ksm-rename-mm_slot_cache-to-ksm_slot_cache.txt index 700482727..700482727 100644 --- a/txt/mm-ksm-rename-mm_slot_cache-to-ksm_slot_cache.txt +++ b/txt/old/mm-ksm-rename-mm_slot_cache-to-ksm_slot_cache.txt diff --git a/txt/mm-ksm-rename-variable-mm_slot-to-ksm_slot-in-unmerge_and_remove_all_rmap_items.txt b/txt/old/mm-ksm-rename-variable-mm_slot-to-ksm_slot-in-unmerge_and_remove_all_rmap_items.txt index c6c252049..c6c252049 100644 --- a/txt/mm-ksm-rename-variable-mm_slot-to-ksm_slot-in-unmerge_and_remove_all_rmap_items.txt +++ b/txt/old/mm-ksm-rename-variable-mm_slot-to-ksm_slot-in-unmerge_and_remove_all_rmap_items.txt diff --git a/txt/selftests-memfd-fix-spelling-mistakes.txt b/txt/selftests-memfd-fix-spelling-mistakes.txt new file mode 100644 index 000000000..6d314a1dc --- /dev/null +++ b/txt/selftests-memfd-fix-spelling-mistakes.txt @@ -0,0 +1,12 @@ +From: Saurav Shah <sauravshah.31@gmail.com> +Subject: selftests/memfd: fix spelling mistakes +Date: Thu, 2 May 2024 04:43:17 +0530 + +Fix spelling mistakes in the comments. 
+ +Link: https://lkml.kernel.org/r/20240501231317.24648-1-sauravshah.31@gmail.com +Signed-off-by: Saurav Shah <sauravshah.31@gmail.com> +Cc: Aleksa Sarai <cyphar@cyphar.com> +Cc: Greg Thelen <gthelen@google.com> +Cc: Jeff Xu <jeffxu@google.com> +Cc: Shuah Khan <shuah@kernel.org> diff --git a/txt/selftests-mm-cow-flag-vmsplice-hugetlb-tests-as-xfail.txt b/txt/selftests-mm-cow-flag-vmsplice-hugetlb-tests-as-xfail.txt new file mode 100644 index 000000000..23823d110 --- /dev/null +++ b/txt/selftests-mm-cow-flag-vmsplice-hugetlb-tests-as-xfail.txt @@ -0,0 +1,64 @@ +From: David Hildenbrand <david@redhat.com> +Subject: selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL +Date: Thu, 2 May 2024 10:52:58 +0200 + +Patch series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL". + +The failing hugetlb vmsplice() COW tests keep confusing people, and having +tests that have been failing for years and likely will keep failing for +years to come because nobody cares enough is rather suboptimal. Let's +mark them as XFAIL and document why fixing them is not that easy as it +would appear at first sight. + +More details can be found in [1], especially around how hugetlb pages +cannot really be overcommitted, and why we don't particularly care about +these vmsplice() leaks for hugetlb -- in contrast to ordinary memory. + +[1] https://lore.kernel.org/all/8b42a24d-caf0-46ef-9e15-0f88d47d2f21@redhat.com/ + + +This patch (of 2): + +The vmsplice() hugetlb tests have been failing right from the start, and +we documented that in the introducing commit 7dad331be781 ("selftests/vm: +anon_cow: hugetlb tests"): + + Note that some tests cases still fail. This will, for example, be + fixed once vmsplice properly uses FOLL_PIN instead of FOLL_GET for + pinning. With 2 MiB and 1 GiB hugetlb on x86_64, the expected + failures are: + +Until vmsplice() is changed, these tests will likely keep failing: hugetlb +COW reuse logic is harder to change, because using the same COW reuse +logic as we use for !hugetlb could harm other (sane) users when running +out of free hugetlb pages. + +More details can be found in [1], especially around how hugetlb pages +cannot really be overcommitted, and why we don't particularly care about +these vmsplice() leaks for hugetlb -- in contrast to ordinary memory. + +These (expected) failures keep confusing people, so flag them accordingly. + +Before: + $ ./cow + [...] + Bail out! 8 out of 778 tests failed + # Totals: pass:769 fail:8 xfail:0 xpass:0 skip:1 error:0 + $ echo $? + 1 + +After: + $ ./cow + [...] + # Totals: pass:769 fail:0 xfail:8 xpass:0 skip:1 error:0 + $ echo $? + 0 + +[1] https://lore.kernel.org/all/8b42a24d-caf0-46ef-9e15-0f88d47d2f21@redhat.com/ + +Link: https://lkml.kernel.org/r/20240502085259.103784-1-david@redhat.com +Link: https://lkml.kernel.org/r/20240502085259.103784-2-david@redhat.com +Signed-off-by: David Hildenbrand <david@redhat.com> +Cc: Muchun Song <muchun.song@linux.dev> +Cc: Peter Xu <peterx@redhat.com> +Cc: Shuah Khan <shuah@kernel.org> |