aboutsummaryrefslogtreecommitdiffstats
path: root/include/linux/mmzone.h
diff options
context:
space:
mode:
authorKalesh Singh <kaleshsingh@google.com>2023-04-13 14:43:26 -0700
committerAndrew Morton <akpm@linux-foundation.org>2023-04-18 16:30:11 -0700
commit7f63cf2d9b9bbe7b90f808927558a66ff737d399 (patch)
tree4fde6a043c710bfe4379efc126e491048cf3529d /include/linux/mmzone.h
parented8f3f999e9270ccc542ac44e237e389a2687a9e (diff)
downloadlinux-7f63cf2d9b9bbe7b90f808927558a66ff737d399.tar.gz
mm: Multi-gen LRU: remove wait_event_killable()
Android 14 and later default to MGLRU [1] and field telemetry showed occasional long tail latency (>100ms) in the reclaim path. Tracing revealed priority inversion in the reclaim path. In try_to_inc_max_seq(), when high priority tasks were blocked on wait_event_killable(), the preemption of the low priority task to call wake_up_all() caused those high priority tasks to wait longer than necessary. In general, this problem is not different from others of its kind, e.g., one caused by mutex_lock(). However, it is specific to MGLRU because it introduced the new wait queue lruvec->mm_state.wait. The purpose of this new wait queue is to avoid the thundering herd problem. If many direct reclaimers rush into try_to_inc_max_seq(), only one can succeed, i.e., the one to wake up the rest, and the rest who failed might cause premature OOM kills if they do not wait. So far there is no evidence supporting this scenario, based on how often the wait has been hit. And this begs the question how useful the wait queue is in practice. Based on Minchan's recommendation, which is in line with his commit 6d4675e60135 ("mm: don't be stuck to rmap lock on reclaim path") and the rest of the MGLRU code which also uses trylock when possible, remove the wait queue. [1] https://android-review.googlesource.com/q/I7ed7fbfd6ef9ce10053347528125dd98c39e50bf Link: https://lkml.kernel.org/r/20230413214326.2147568-1-kaleshsingh@google.com Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks") Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Suggested-by: Minchan Kim <minchan@kernel.org> Reported-by: Wei Wang <wvw@google.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org> Cc: Oleksandr Natalenko <oleksandr@natalenko.name> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'include/linux/mmzone.h')
-rw-r--r--include/linux/mmzone.h8
1 files changed, 2 insertions, 6 deletions
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 337956748976e..a4889c9d4055b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -443,18 +443,14 @@ enum {
struct lru_gen_mm_state {
/* set to max_seq after each iteration */
unsigned long seq;
- /* where the current iteration continues (inclusive) */
+ /* where the current iteration continues after */
struct list_head *head;
- /* where the last iteration ended (exclusive) */
+ /* where the last iteration ended before */
struct list_head *tail;
- /* to wait for the last page table walker to finish */
- struct wait_queue_head wait;
/* Bloom filters flip after each iteration */
unsigned long *filters[NR_BLOOM_FILTERS];
/* the mm stats for debugging */
unsigned long stats[NR_HIST_GENS][NR_MM_STATS];
- /* the number of concurrent page table walkers */
- int nr_walkers;
};
struct lru_gen_mm_walk {