From: John Hawkes

A large number of processes that are pinned to a single CPU results in
every other CPU's load_balance() seeing this overloaded CPU as
"busiest", yet move_tasks() never finds a task to pull-migrate.  This
condition occurs during module unload, but can also occur as a
denial-of-service using sys_sched_setaffinity().  Several hundred CPUs
performing this fruitless load_balance() will livelock on the busiest
CPU's runqueue lock.  A smaller number of CPUs will livelock if the
pinned task count gets high.  (An illustrative userspace sketch of the
trigger appears after the patch.)

This simple patch remedies the more common first problem: after a
move_tasks() failure to migrate anything, the balance_interval
increments.  Using a simple increment, rather than the more dramatic
doubling of the balance_interval, is conservative yet still effective
(see the comparison sketch after the patch).

Signed-off-by: John Hawkes
Acked-by: Ingo Molnar
Signed-off-by: Andrew Morton
---

 25-akpm/kernel/sched.c |   16 ++++++++++++----
 1 files changed, 12 insertions(+), 4 deletions(-)

diff -puN kernel/sched.c~sched-improved-load_balance-tolerance-for-pinned-tasks kernel/sched.c
--- 25/kernel/sched.c~sched-improved-load_balance-tolerance-for-pinned-tasks	2004-10-21 14:54:28.489592416 -0700
+++ 25-akpm/kernel/sched.c	2004-10-21 14:54:28.495591504 -0700
@@ -1974,11 +1974,19 @@ static int load_balance(int this_cpu, ru
 			 */
 			sd->nr_balance_failed = sd->cache_nice_tries;
 		}
-	} else
-		sd->nr_balance_failed = 0;
 
-	/* We were unbalanced, so reset the balancing interval */
-	sd->balance_interval = sd->min_interval;
+		/*
+		 * We were unbalanced, but unsuccessful in move_tasks(),
+		 * so bump the balance_interval to lessen the lock contention.
+		 */
+		if (sd->balance_interval < sd->max_interval)
+			sd->balance_interval++;
+	} else {
+		sd->nr_balance_failed = 0;
+
+		/* We were unbalanced, so reset the balancing interval */
+		sd->balance_interval = sd->min_interval;
+	}
 
 	return nr_moved;
 
_
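
For illustration, here is a minimal userspace sketch of the trigger
condition described above.  It is a hypothetical reproducer, not part
of the patch, and the task count of 256 is arbitrary.  Each child pins
itself to CPU 0 with sched_setaffinity() and spins, so every other
CPU's load_balance() repeatedly selects CPU 0 as "busiest" while
move_tasks() can never migrate any of the pinned tasks:

#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

int main(void)
{
	cpu_set_t mask;
	int i;

	/* Allow the children to run on CPU 0 only. */
	CPU_ZERO(&mask);
	CPU_SET(0, &mask);

	for (i = 0; i < 256; i++) {
		if (fork() == 0) {
			/* pid 0 means "this task" */
			sched_setaffinity(0, sizeof(mask), &mask);
			for (;;)
				;	/* spin to keep CPU 0's runqueue loaded */
		}
	}
	pause();
	return 0;
}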
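
And to make the "conservative yet still effective" claim concrete, a
standalone sketch comparing how the two backoff policies grow the
interval across repeated move_tasks() failures.  The starting value
and max_interval below are illustrative, not the kernel's sched_domain
defaults:

#include <stdio.h>

int main(void)
{
	unsigned int max_interval = 32;
	unsigned int inc = 1, dbl = 1;	/* both start at a min_interval of 1 */
	int fail;

	for (fail = 1; fail <= 8; fail++) {
		if (inc < max_interval)
			inc++;		/* this patch: bump by one per failure */
		if (dbl < max_interval)
			dbl *= 2;	/* the more dramatic alternative: double */
		printf("failure %2d: increment=%2u  doubling=%2u\n",
		       fail, inc, dbl);
	}
	return 0;
}

The increment keeps the interval near min_interval, so normal
balancing resumes almost immediately once the pinned tasks exit,
whereas doubling would overshoot and delay legitimate rebalancing;
both relieve the runqueue-lock contention, because the interval only
grows while the failures persist.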