[PATCH RT 1/6] kernel: softirq: unlock with irqs on

[rfc patch 1/2] rt/locking/hotplug: Kill hotplug_lock()/hotplug_unlock()

Mike Galbraith-5

This lock is itself a source of hotplug deadlocks for RT:

1. kernfs_mutex is taken during hotplug, so any path other than hotplug
meeting hotplug_lock() deadlocks us.

2. Notifiers depend upon RCU GP threads; those meeting hotplug_lock()
deadlock us as well.

Remove it.  Start migration immediately, and do not stop migrating until the
cpu is down.  Have the caller poll ->refcount before actually taking the cpu
down.  If someone manages to sneak in before we wake the migration thread,
take_cpu_down() returns -EBUSY, and the caller polls one more time.

Signed-off-by: Mike Galbraith <[hidden email]>
---
 kernel/cpu.c |  267 ++++++++++-------------------------------------------------
 1 file changed, 47 insertions(+), 220 deletions(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -23,6 +23,7 @@
 #include <linux/tick.h>
 #include <linux/irq.h>
 #include <linux/smpboot.h>
+#include <linux/delay.h>
 
 #include <trace/events/power.h>
 #define CREATE_TRACE_POINTS
@@ -166,49 +167,14 @@ static struct {
 
 /**
  * hotplug_pcp - per cpu hotplug descriptor
- * @unplug: set when pin_current_cpu() needs to sync tasks
- * @sync_tsk: the task that waits for tasks to finish pinned sections
  * @refcount: counter of tasks in pinned sections
- * @grab_lock: set when the tasks entering pinned sections should wait
- * @synced: notifier for @sync_tsk to tell cpu_down it's finished
- * @mutex: the mutex to make tasks wait (used when @grab_lock is true)
- * @mutex_init: zero if the mutex hasn't been initialized yet.
- *
- * Although @unplug and @sync_tsk may point to the same task, the @unplug
- * is used as a flag and still exists after @sync_tsk has exited and
- * @sync_tsk set to NULL.
+ * @migrate: set when the tasks entering/leaving pinned sections should migrate
  */
 struct hotplug_pcp {
- struct task_struct *unplug;
- struct task_struct *sync_tsk;
  int refcount;
- int grab_lock;
- struct completion synced;
- struct completion unplug_wait;
-#ifdef CONFIG_PREEMPT_RT_FULL
- /*
- * Note, on PREEMPT_RT, the hotplug lock must save the state of
- * the task, otherwise the mutex will cause the task to fail
- * to sleep when required. (Because it's called from migrate_disable())
- *
- * The spinlock_t on PREEMPT_RT is a mutex that saves the task's
- * state.
- */
- spinlock_t lock;
-#else
- struct mutex mutex;
-#endif
- int mutex_init;
+ int migrate;
 };
 
-#ifdef CONFIG_PREEMPT_RT_FULL
-# define hotplug_lock(hp) rt_spin_lock__no_mg(&(hp)->lock)
-# define hotplug_unlock(hp) rt_spin_unlock__no_mg(&(hp)->lock)
-#else
-# define hotplug_lock(hp) mutex_lock(&(hp)->mutex)
-# define hotplug_unlock(hp) mutex_unlock(&(hp)->mutex)
-#endif
-
 static DEFINE_PER_CPU(struct hotplug_pcp, hotplug_pcp);
 
 /**
@@ -221,42 +187,20 @@ static DEFINE_PER_CPU(struct hotplug_pcp
  */
 void pin_current_cpu(void)
 {
- struct hotplug_pcp *hp;
- int force = 0;
-
-retry:
- hp = this_cpu_ptr(&hotplug_pcp);
+ struct hotplug_pcp *hp = this_cpu_ptr(&hotplug_pcp);
 
- if (!hp->unplug || hp->refcount || force || preempt_count() > 1 ||
-    hp->unplug == current) {
+ if (!hp->migrate) {
  hp->refcount++;
  return;
  }
- if (hp->grab_lock) {
- preempt_enable();
- hotplug_lock(hp);
- hotplug_unlock(hp);
- } else {
- preempt_enable();
- /*
- * Try to push this task off of this CPU.
- */
- if (!migrate_me()) {
- preempt_disable();
- hp = this_cpu_ptr(&hotplug_pcp);
- if (!hp->grab_lock) {
- /*
- * Just let it continue it's already pinned
- * or about to sleep.
- */
- force = 1;
- goto retry;
- }
- preempt_enable();
- }
- }
+
+ /*
+ * Try to push this task off of this CPU.
+ */
+ preempt_enable_no_resched();
+ migrate_me();
  preempt_disable();
- goto retry;
+ this_cpu_ptr(&hotplug_pcp)->refcount++;
 }
 
 /**
@@ -268,182 +212,54 @@ void unpin_current_cpu(void)
 {
  struct hotplug_pcp *hp = this_cpu_ptr(&hotplug_pcp);
 
- WARN_ON(hp->refcount <= 0);
-
- /* This is safe. sync_unplug_thread is pinned to this cpu */
- if (!--hp->refcount && hp->unplug && hp->unplug != current)
- wake_up_process(hp->unplug);
-}
-
-static void wait_for_pinned_cpus(struct hotplug_pcp *hp)
-{
- set_current_state(TASK_UNINTERRUPTIBLE);
- while (hp->refcount) {
- schedule_preempt_disabled();
- set_current_state(TASK_UNINTERRUPTIBLE);
- }
-}
-
-static int sync_unplug_thread(void *data)
-{
- struct hotplug_pcp *hp = data;
-
- wait_for_completion(&hp->unplug_wait);
- preempt_disable();
- hp->unplug = current;
- wait_for_pinned_cpus(hp);
-
- /*
- * This thread will synchronize the cpu_down() with threads
- * that have pinned the CPU. When the pinned CPU count reaches
- * zero, we inform the cpu_down code to continue to the next step.
- */
- set_current_state(TASK_UNINTERRUPTIBLE);
- preempt_enable();
- complete(&hp->synced);
-
- /*
- * If all succeeds, the next step will need tasks to wait till
- * the CPU is offline before continuing. To do this, the grab_lock
- * is set and tasks going into pin_current_cpu() will block on the
- * mutex. But we still need to wait for those that are already in
- * pinned CPU sections. If the cpu_down() failed, the kthread_should_stop()
- * will kick this thread out.
- */
- while (!hp->grab_lock && !kthread_should_stop()) {
- schedule();
- set_current_state(TASK_UNINTERRUPTIBLE);
- }
-
- /* Make sure grab_lock is seen before we see a stale completion */
- smp_mb();
-
- /*
- * Now just before cpu_down() enters stop machine, we need to make
- * sure all tasks that are in pinned CPU sections are out, and new
- * tasks will now grab the lock, keeping them from entering pinned
- * CPU sections.
- */
- if (!kthread_should_stop()) {
- preempt_disable();
- wait_for_pinned_cpus(hp);
- preempt_enable();
- complete(&hp->synced);
- }
+ WARN_ON(hp->refcount-- <= 0);
 
- set_current_state(TASK_UNINTERRUPTIBLE);
- while (!kthread_should_stop()) {
- schedule();
- set_current_state(TASK_UNINTERRUPTIBLE);
- }
- set_current_state(TASK_RUNNING);
+ if (!hp->migrate)
+ return;
 
  /*
- * Force this thread off this CPU as it's going down and
- * we don't want any more work on this CPU.
+ * Try to push this task off of this CPU.
  */
- current->flags &= ~PF_NO_SETAFFINITY;
- set_cpus_allowed_ptr(current, cpu_present_mask);
+ preempt_enable_no_resched();
  migrate_me();
- return 0;
-}
-
-static void __cpu_unplug_sync(struct hotplug_pcp *hp)
-{
- wake_up_process(hp->sync_tsk);
- wait_for_completion(&hp->synced);
+ preempt_disable();
 }
 
-static void __cpu_unplug_wait(unsigned int cpu)
+static void wait_for_pinned_cpus(struct hotplug_pcp *hp)
 {
- struct hotplug_pcp *hp = &per_cpu(hotplug_pcp, cpu);
-
- complete(&hp->unplug_wait);
- wait_for_completion(&hp->synced);
+ while (hp->refcount) {
+ trace_printk("CHILL\n");
+ cpu_chill();
+ }
 }
 
 /*
- * Start the sync_unplug_thread on the target cpu and wait for it to
- * complete.
+ * Start migration of pinned tasks on the target cpu.
  */
 static int cpu_unplug_begin(unsigned int cpu)
 {
  struct hotplug_pcp *hp = &per_cpu(hotplug_pcp, cpu);
- int err;
-
- /* Protected by cpu_hotplug.lock */
- if (!hp->mutex_init) {
-#ifdef CONFIG_PREEMPT_RT_FULL
- spin_lock_init(&hp->lock);
-#else
- mutex_init(&hp->mutex);
-#endif
- hp->mutex_init = 1;
- }
 
  /* Inform the scheduler to migrate tasks off this CPU */
  tell_sched_cpu_down_begin(cpu);
+ hp->migrate = 1;
 
- init_completion(&hp->synced);
- init_completion(&hp->unplug_wait);
-
- hp->sync_tsk = kthread_create(sync_unplug_thread, hp, "sync_unplug/%d", cpu);
- if (IS_ERR(hp->sync_tsk)) {
- err = PTR_ERR(hp->sync_tsk);
- hp->sync_tsk = NULL;
- return err;
- }
- kthread_bind(hp->sync_tsk, cpu);
+ /* Let all tasks know cpu unplug is starting */
+ smp_rmb();
 
- /*
- * Wait for tasks to get out of the pinned sections,
- * it's still OK if new tasks enter. Some CPU notifiers will
- * wait for tasks that are going to enter these sections and
- * we must not have them block.
- */
- wake_up_process(hp->sync_tsk);
  return 0;
 }
 
-static void cpu_unplug_sync(unsigned int cpu)
-{
- struct hotplug_pcp *hp = &per_cpu(hotplug_pcp, cpu);
-
- init_completion(&hp->synced);
- /* The completion needs to be initialzied before setting grab_lock */
- smp_wmb();
-
- /* Grab the mutex before setting grab_lock */
- hotplug_lock(hp);
- hp->grab_lock = 1;
-
- /*
- * The CPU notifiers have been completed.
- * Wait for tasks to get out of pinned CPU sections and have new
- * tasks block until the CPU is completely down.
- */
- __cpu_unplug_sync(hp);
-
- /* All done with the sync thread */
- kthread_stop(hp->sync_tsk);
- hp->sync_tsk = NULL;
-}
-
 static void cpu_unplug_done(unsigned int cpu)
 {
  struct hotplug_pcp *hp = &per_cpu(hotplug_pcp, cpu);
 
- hp->unplug = NULL;
- /* Let all tasks know cpu unplug is finished before cleaning up */
+ /* Let all tasks know cpu unplug is finished */
  smp_wmb();
 
- if (hp->sync_tsk)
- kthread_stop(hp->sync_tsk);
-
- if (hp->grab_lock) {
- hotplug_unlock(hp);
+ if (hp->migrate) {
  /* protected by cpu_hotplug.lock */
- hp->grab_lock = 0;
+ hp->migrate = 0;
  }
  tell_sched_cpu_down_done(cpu);
 }
@@ -951,6 +767,10 @@ static int take_cpu_down(void *_param)
  enum cpuhp_state target = max((int)st->target, CPUHP_AP_OFFLINE);
  int err, cpu = smp_processor_id();
 
+ /* RT: too bad, some task snuck in on the way here */
+ if (this_cpu_ptr(&hotplug_pcp)->refcount)
+ return -EBUSY;
+
  /* Ensure this CPU doesn't handle any more interrupts. */
  err = __cpu_disable();
  if (err < 0)
@@ -972,7 +792,7 @@ static int take_cpu_down(void *_param)
 static int takedown_cpu(unsigned int cpu)
 {
  struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
- int err;
+ int err, once = 0;
 
  /*
  * By now we've cleared cpu_active_mask, wait for all preempt-disabled
@@ -989,25 +809,32 @@ static int takedown_cpu(unsigned int cpu
  else
  synchronize_rcu();
 
- __cpu_unplug_wait(cpu);
-
  /* Park the smpboot threads */
  kthread_park(per_cpu_ptr(&cpuhp_state, cpu)->thread);
  smpboot_park_threads(cpu);
 
- /* Notifiers are done. Don't let any more tasks pin this CPU. */
- cpu_unplug_sync(cpu);
-
  /*
  * Prevent irq alloc/free while the dying cpu reorganizes the
  * interrupt affinities.
  */
  irq_lock_sparse();
 
+again:
+ /* Notifiers are done.  Check for late references */
+ wait_for_pinned_cpus(&per_cpu(hotplug_pcp, cpu));
+
  /*
  * So now all preempt/rcu users must observe !cpu_active().
  */
  err = stop_machine(take_cpu_down, NULL, cpumask_of(cpu));
+ if (err == -EBUSY) {
+ if (!once) {
+ trace_printk("BUSY, trying again\n");
+ once++;
+ goto again;
+ }
+ trace_printk("CRAP, still busy.  Deal with it caller\n");
+ }
  if (err) {
  /* CPU didn't die: tell everyone.  Can't complain. */
  cpu_notify_nofail(CPU_DOWN_FAILED, cpu);
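
For orientation, the flow this patch leaves behind reduces to roughly the sketch below, assembled from the hunks above (takedown_flow() is just a label for the relevant part of takedown_cpu(); tracing, surrounding context and error handling are trimmed, so this is not standalone code):

/* pin/unpin collapse to a per-cpu refcount plus "please migrate me" */
void pin_current_cpu(void)
{
	struct hotplug_pcp *hp = this_cpu_ptr(&hotplug_pcp);

	if (!hp->migrate) {	/* no unplug in flight: just take a reference */
		hp->refcount++;
		return;
	}
	/* unplug in flight: try to get pushed off this CPU, then pin anyway */
	preempt_enable_no_resched();
	migrate_me();
	preempt_disable();
	this_cpu_ptr(&hotplug_pcp)->refcount++;
}

/* takedown_cpu() polls out late references and retries once on -EBUSY */
static int takedown_flow(unsigned int cpu)
{
	int err, once = 0;
again:
	wait_for_pinned_cpus(&per_cpu(hotplug_pcp, cpu));

	/* take_cpu_down() reports -EBUSY if a task snuck in meanwhile */
	err = stop_machine(take_cpu_down, NULL, cpumask_of(cpu));
	if (err == -EBUSY && !once) {
		once++;
		goto again;
	}
	return err;
}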

[rfc patch 2/2] rt/locking/hotplug: Fix rt_spin_lock_slowlock() migrate_disable() bug

Mike Galbraith-5
In reply to this post by Mike Galbraith-5

I met a problem while testing the shiny new hotplug machinery.

migrate_disable() -> pin_current_cpu() -> hotplug_lock() leads to..
        BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on));

With hotplug_lock()/hotplug_unlock() now gone, there is no lock added by
the CPU pinning code, thus we're free to pin after lock acquisition, and
unpin before release with no ABBA worries.  Doing so will also save a few
cycles if we have to repeat the acquisition loop.

Fixes: e24b142cfb4a ("rt/locking: Reenable migration accross schedule")
Signed-off-by: Mike Galbraith <[hidden email]>
---
 kernel/locking/rtmutex.c |   37 +++++++++++++++++--------------------
 1 file changed, 17 insertions(+), 20 deletions(-)

--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -930,12 +930,12 @@ static inline void rt_spin_lock_fastlock
 {
  might_sleep_no_state_check();
 
- if (do_mig_dis)
- migrate_disable();
-
- if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
+ if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current))) {
  rt_mutex_deadlock_account_lock(lock, current);
- else
+
+ if (do_mig_dis)
+ migrate_disable();
+ } else
  slowfn(lock, do_mig_dis);
 }
 
@@ -995,12 +995,11 @@ static int task_blocks_on_rt_mutex(struc
  * the try_to_wake_up() code handles this accordingly.
  */
 static void  noinline __sched rt_spin_lock_slowlock(struct rt_mutex *lock,
-    bool mg_off)
+    bool do_mig_dis)
 {
  struct task_struct *lock_owner, *self = current;
  struct rt_mutex_waiter waiter, *top_waiter;
  unsigned long flags;
- int ret;
 
  rt_mutex_init_waiter(&waiter, true);
 
@@ -1008,6 +1007,8 @@ static void  noinline __sched rt_spin_lo
 
  if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) {
  raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
+ if (do_mig_dis)
+ migrate_disable();
  return;
  }
 
@@ -1024,8 +1025,7 @@ static void  noinline __sched rt_spin_lo
  __set_current_state_no_track(TASK_UNINTERRUPTIBLE);
  raw_spin_unlock(&self->pi_lock);
 
- ret = task_blocks_on_rt_mutex(lock, &waiter, self, RT_MUTEX_MIN_CHAINWALK);
- BUG_ON(ret);
+ BUG_ON(task_blocks_on_rt_mutex(lock, &waiter, self, RT_MUTEX_MIN_CHAINWALK));
 
  for (;;) {
  /* Try to acquire the lock again. */
@@ -1039,13 +1039,8 @@ static void  noinline __sched rt_spin_lo
 
  debug_rt_mutex_print_deadlock(&waiter);
 
- if (top_waiter != &waiter || adaptive_wait(lock, lock_owner)) {
- if (mg_off)
- migrate_enable();
+ if (top_waiter != &waiter || adaptive_wait(lock, lock_owner))
  schedule();
- if (mg_off)
- migrate_disable();
- }
 
  raw_spin_lock_irqsave(&lock->wait_lock, flags);
 
@@ -1077,6 +1072,9 @@ static void  noinline __sched rt_spin_lo
 
  raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
 
+ if (do_mig_dis)
+ migrate_disable();
+
  debug_rt_mutex_free_waiter(&waiter);
 }
 
@@ -1159,10 +1157,10 @@ EXPORT_SYMBOL(rt_spin_unlock__no_mg);
 
 void __lockfunc rt_spin_unlock(spinlock_t *lock)
 {
+ migrate_enable();
  /* NOTE: we always pass in '1' for nested, for simplicity */
  spin_release(&lock->dep_map, 1, _RET_IP_);
  rt_spin_lock_fastunlock(&lock->lock, rt_spin_lock_slowunlock);
- migrate_enable();
 }
 EXPORT_SYMBOL(rt_spin_unlock);
 
@@ -1204,12 +1202,11 @@ int __lockfunc rt_spin_trylock(spinlock_
 {
  int ret;
 
- migrate_disable();
  ret = rt_mutex_trylock(&lock->lock);
- if (ret)
+ if (ret) {
+ migrate_disable();
  spin_acquire(&lock->dep_map, 0, 1, _RET_IP_);
- else
- migrate_enable();
+ }
  return ret;
 }
 EXPORT_SYMBOL(rt_spin_trylock);
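
The ordering the patch establishes, condensed from the hunks above into two representative entry points (the _sketch suffixes are just labels, kernel context assumed): pinning now happens strictly inside the lock, so the pinning code can no longer block on a lock of its own while the task is already queued on an rtmutex.

int __lockfunc rt_spin_trylock_sketch(spinlock_t *lock)
{
	int ret = rt_mutex_trylock(&lock->lock);

	if (ret) {
		migrate_disable();		/* pin only after acquisition */
		spin_acquire(&lock->dep_map, 0, 1, _RET_IP_);
	}
	return ret;
}

void __lockfunc rt_spin_unlock_sketch(spinlock_t *lock)
{
	migrate_enable();			/* unpin before release */
	spin_release(&lock->dep_map, 1, _RET_IP_);
	rt_spin_lock_fastunlock(&lock->lock, rt_spin_lock_slowunlock);
}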

Re: [rfc patch 2/2] rt/locking/hotplug: Fix rt_spin_lock_slowlock() migrate_disable() bug

Mike Galbraith-5
It'll take a hotplug beating seemingly as well as any non-rt kernel,
but big box NAKed it due to jitter, which can mean 1.0 things.. duh.

        -Mike

Re: [rfc patch 2/2] rt/locking/hotplug: Fix rt_spin_lock_slowlock() migrate_disable() bug

Mike Galbraith-5
On Wed, 2016-04-06 at 14:00 +0200, Mike Galbraith wrote:
> It'll take a hotplug beating seemingly as well as any non-rt kernel,
> but big box NAKed it due to jitter, which can mean 1.0 things.. duh.

FWIW, the below turned big box NAK into ACK.  Stressing hotplug over
night, iteration completion time went from about 2 1/2 hours with
bandaids on the two identified rt sore spots, to an hour and 10 minutes
as well for some reason, but whatever..

There are other things like doing the downing on the cpu being taken
down that would likely be a good idea, but I suppose I'll now wait to
see what future devel trees look like.  I suspect Thomas will aim his
axe at the annoying lock too (and his makes clean cuts).  Meanwhile,
just reverting e24b142cfb4a makes hotplug as safe as it ever was (not
at all), slaughtering the lock seems to put current rt on par with non
-rt (iow other changes left not much rt trouble remaining), and the
below is one way to make e24b142cfb4a non-toxic.

        -Mike

rt/locking/hotplug: Fix rt_spin_lock_slowlock() migrate_disable() bug

I met a problem while testing the shiny new hotplug machinery.

migrate_disable() -> pin_current_cpu() -> hotplug_lock() leads to..
        BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on));

Unpin before we block, and repin while still in atomic context
after acquisition.

Fixes: e24b142cfb4a ("rt/locking: Reenable migration accross schedule")
Signed-off-by: Mike Galbraith <[hidden email]>
---
 include/linux/cpu.h      |    2 ++
 include/linux/preempt.h  |    2 ++
 kernel/cpu.c             |   13 +++++++++++++
 kernel/locking/rtmutex.c |   18 +++++++++++-------
 kernel/sched/core.c      |    7 +++++++
 5 files changed, 35 insertions(+), 7 deletions(-)

--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -231,6 +231,7 @@ extern void put_online_cpus(void);
 extern void cpu_hotplug_disable(void);
 extern void cpu_hotplug_enable(void);
 extern void pin_current_cpu(void);
+extern void pin_current_cpu_in_atomic(void);
 extern void unpin_current_cpu(void);
 #define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri)
 #define __hotcpu_notifier(fn, pri) __cpu_notifier(fn, pri)
@@ -250,6 +251,7 @@ static inline void cpu_hotplug_done(void
 #define cpu_hotplug_disable() do { } while (0)
 #define cpu_hotplug_enable() do { } while (0)
 static inline void pin_current_cpu(void) { }
+static inline void pin_current_cpu_in_atomic(void) { }
 static inline void unpin_current_cpu(void) { }
 #define hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
 #define __hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -302,9 +302,11 @@ do { \
 # define preempt_enable_nort() barrier()
 # ifdef CONFIG_SMP
    extern void migrate_disable(void);
+   extern void migrate_disable_in_atomic(void);
    extern void migrate_enable(void);
 # else /* CONFIG_SMP */
 #  define migrate_disable() barrier()
+#  define migrate_disable_in_atomic() barrier()
 #  define migrate_enable() barrier()
 # endif /* CONFIG_SMP */
 #else
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -204,6 +204,19 @@ void pin_current_cpu(void)
 }
 
 /**
+ * pin_current_cpu_in_atomic - Prevent the current cpu from being unplugged
+ *
+ * The caller is acquiring a lock, and must have a reference before leaving
+ * the preemption disabled region therein.
+ *
+ * Must be called with preemption disabled (preempt_count = 1)!
+ */
+void pin_current_cpu_in_atomic(void)
+{
+ this_cpu_ptr(&hotplug_pcp)->refcount++;
+}
+
+/**
  * unpin_current_cpu - Allow unplug of current cpu
  *
  * Must be called with preemption or interrupts disabled!
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1002,11 +1002,17 @@ static void  noinline __sched rt_spin_lo
  unsigned long flags;
  int ret;
 
+ mg_off &= (self->migrate_disable == 1 && !self->state);
+ if (mg_off)
+ migrate_enable();
+
  rt_mutex_init_waiter(&waiter, true);
 
  raw_spin_lock_irqsave(&lock->wait_lock, flags);
 
  if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) {
+ if (mg_off)
+ migrate_disable_in_atomic();
  raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
  return;
  }
@@ -1029,8 +1035,11 @@ static void  noinline __sched rt_spin_lo
 
  for (;;) {
  /* Try to acquire the lock again. */
- if (__try_to_take_rt_mutex(lock, self, &waiter, STEAL_LATERAL))
+ if (__try_to_take_rt_mutex(lock, self, &waiter, STEAL_LATERAL)) {
+ if (mg_off)
+ migrate_disable_in_atomic();
  break;
+ }
 
  top_waiter = rt_mutex_top_waiter(lock);
  lock_owner = rt_mutex_owner(lock);
@@ -1039,13 +1048,8 @@ static void  noinline __sched rt_spin_lo
 
  debug_rt_mutex_print_deadlock(&waiter);
 
- if (top_waiter != &waiter || adaptive_wait(lock, lock_owner)) {
- if (mg_off)
- migrate_enable();
+ if (top_waiter != &waiter || adaptive_wait(lock, lock_owner))
  schedule();
- if (mg_off)
- migrate_disable();
- }
 
  raw_spin_lock_irqsave(&lock->wait_lock, flags);
 
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3328,6 +3328,13 @@ void migrate_disable(void)
 }
 EXPORT_SYMBOL(migrate_disable);
 
+void migrate_disable_in_atomic(void)
+{
+ pin_current_cpu_in_atomic();
+ current->migrate_disable++;
+}
+EXPORT_SYMBOL(migrate_disable_in_atomic);
+
 void migrate_enable(void)
 {
  struct task_struct *p = current;
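
How the new *_in_atomic() helpers are meant to be used can be condensed from the rtmutex hunks above into the sketch below (slowlock_flow() is just a label; the waiter enqueue/dequeue details are elided, kernel context assumed). The point is that the pin is dropped before the task can block and re-taken under wait_lock, i.e. before leaving the preemption-disabled region:

static void noinline __sched slowlock_flow(struct rt_mutex *lock, bool mg_off)
{
	struct task_struct *self = current;
	unsigned long flags;

	/* only unpin if this is the sole pin and we are not about to sleep */
	mg_off &= (self->migrate_disable == 1 && !self->state);
	if (mg_off)
		migrate_enable();

	raw_spin_lock_irqsave(&lock->wait_lock, flags);
	if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) {
		if (mg_off)
			migrate_disable_in_atomic();	/* re-pin under wait_lock */
		raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
		return;
	}

	/*
	 * ... otherwise enqueue as a waiter and loop; whenever the lock is
	 * finally taken, migrate_disable_in_atomic() is called before
	 * wait_lock is dropped, exactly as in the hunks above ...
	 */
	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
}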

Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Sebastian Andrzej Siewior-4
In reply to this post by Mike Galbraith-5
On 04/02/2016 05:12 AM, Mike Galbraith wrote:
>> By the time I improved hotplug I played with this. I had a few ideas but
>> it didn't fly in the end. Today however I ended up with this:
>
> Yeah, but that fails the duct tape test too.  Mine is below, and is the
> extra sticky variety ;-)  With busted 0299 patch reverted and those two
> applied, my DL980 took a beating for ~36 hours before I aborted it.. ie
> hotplug road seemingly has no more -rt specific potholes.

just to be clear: The patch I attached did _not_ work for you.

> If that lock dies, we can unpin when entering lock slow path and pin
> again post acquisition with no ABBA worries as well, and not only does
> existing hotplug work heaping truckloads better, -rt can perhaps help
> spot trouble as the rewrite proceeds.
>
> Current state is more broken than ever.. if that's possible.

And the two patches you attached here did?

>
> -Mike

Sebastian


Re: [rfc patch 2/2] rt/locking/hotplug: Fix rt_spin_lock_slowlock() migrate_disable() bug

Sebastian Andrzej Siewior-4
In reply to this post by Mike Galbraith-5
On 04/07/2016 06:37 AM, Mike Galbraith wrote:
> On Wed, 2016-04-06 at 14:00 +0200, Mike Galbraith wrote:
>> It'll take a hotplug beating seemingly as well as any non-rt kernel,
>> but big box NAKed it due to jitter, which can mean 1.0 things.. duh.
>
> FWIW, the below turned big box NAK into ACK.  Stressing hotplug over
> night, iteration completion time went from about 2 1/2 hours with
> bandaids on the two identified rt sore spots, to an hour and 10 minutes
> as well for some reason, but whatever..

Just to be clear. I need #1, #2 from this thread and this below to make
it work?

> -Mike
>

Sebastian


Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Mike Galbraith-5
In reply to this post by Sebastian Andrzej Siewior-4
On Thu, 2016-04-07 at 18:47 +0200, Sebastian Andrzej Siewior wrote:

> On 04/02/2016 05:12 AM, Mike Galbraith wrote:
> > > By the time I improved hotplug I played with this. I had a few ideas but
> > > it didn't fly in the end. Today however I ended up with this:
> >
> > Yeah, but that fails the duct tape test too.  Mine is below, and is the
> > extra sticky variety ;-)  With busted 0299 patch reverted and those two
> > applied, my DL980 took a beating for ~36 hours before I aborted it.. ie
> > hotplug road seemingly has no more -rt specific potholes.
>
> just to be clear: The patch I attached did _not_ work for you.

Sorry, I didn't test.  Marathon stress test session convinced me that
the lock added by -rt absolutely had to die.

> > If that lock dies, we can unpin when entering lock slow path and pin
> > again post acquisition with no ABBA worries as well, and not only does
> > existing hotplug work heaping truckloads better, -rt can perhaps help
> > spot trouble as the rewrite proceeds.
> >
> > Current state is more broken than ever.. if that's possible.
>
> And the two patches you attached here did?

I've killed way too many NOPREEMPT kernels to make any rash -rt claims.
 What I can tell you is that my 64 core DL980 running 4.6-rc2-rt13 plus
the two posted patches survived for ~20 hours before I had to break it
off because I needed the box.

These two haven't been through _as_ much pounding as the two targeted
bandaids I showed have, but have been through quite a bit.  Other folks
beating the living crap outta their boxen too would not be a bad idea.

        -Mike

Re: [rfc patch 2/2] rt/locking/hotplug: Fix rt_spin_lock_slowlock() migrate_disable() bug

Mike Galbraith-5
In reply to this post by Sebastian Andrzej Siewior-4
On Thu, 2016-04-07 at 18:48 +0200, Sebastian Andrzej Siewior wrote:

> On 04/07/2016 06:37 AM, Mike Galbraith wrote:
> > On Wed, 2016-04-06 at 14:00 +0200, Mike Galbraith wrote:
> > > It'll take a hotplug beating seemingly as well as any non-rt kernel,
> > > but big box NAKed it due to jitter, which can mean 1.0 things.. duh.
> >
> > FWIW, the below turned big box NAK into ACK.  Stressing hotplug over
> > night, iteration completion time went from about 2 1/2 hours with
> > bandaids on the two identified rt sore spots, to an hour and 10 minutes
> > as well for some reason, but whatever..
>
> Just to be clear. I need #1, #2 from this thread and this below to make
> it work?

The followup to 2/2 replaced 2/2.

        -Mike

Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Mike Galbraith-5
In reply to this post by Sebastian Andrzej Siewior-4
On Thu, 2016-04-07 at 18:47 +0200, Sebastian Andrzej Siewior wrote:

> > If that lock dies, we can unpin when entering lock slow path and pin
> > again post acquisition with no ABBA worries as well, and not only does
> > existing hotplug work heaping truckloads better, -rt can perhaps help
> > spot trouble as the rewrite proceeds.
> >
> > Current state is more broken than ever.. if that's possible.
>
> And the two patches you attached here did?

Re-reading your question, no, the only troubles I encountered were the
rt specific woes previously identified.  So the thought that started me
down this path turned up jack-diddly-spit.. but that's not a bad thing,
so I don't consider it to have been a waste of time.

        -Mike

Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Sebastian Andrzej Siewior-4
In reply to this post by Mike Galbraith-5
On 04/07/2016 09:04 PM, Mike Galbraith wrote:
>> just to be clear: The patch I attached did _not_ work for you.
>
> Sorry, I didn't test.  Marathon stress test session convinced me that
> the lock added by -rt absolutely had to die.

Okay. And the patch did that. I removed the lock.

>>> If that lock dies, we can unpin when entering lock slow path and pin
>>> again post acquisition with no ABBA worries as well, and not only does
>>> existing hotplug work heaping truckloads better, -rt can perhaps help
>>> spot trouble as the rewrite proceeds.
>>>
>>> Current state is more broken than ever.. if that's possible.
>>
>> And the two patches you attached here did?
>
> I've killed way too many NOPREEMPT kernels to make any rash -rt claims.
>  What I can tell you is that my 64 core DL980 running 4.6-rc2-rt13 plus
> the two posted patches survived for ~20 hours before I had to break it
> off because I needed the box.
>
> These two haven't been through _as_ much pounding as the two targeted
> bandaids I showed have, but have been through quite a bit.  Other folks
> beating the living crap outta their boxen too would not be a bad idea.

I see. So what I don't like are all the exceptions you have: one for
RCU and one for kernfs. More might come in the future. So what I'm aiming
for is the removal of the lock.

>
> -Mike
>
Sebastian

Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Mike Galbraith-5
On Fri, 2016-04-08 at 12:30 +0200, Sebastian Andrzej Siewior wrote:
> On 04/07/2016 09:04 PM, Mike Galbraith wrote:
> > > just to be clear: The patch I attached did _not_ work for you.
> >
> > Sorry, I didn't test.  Marathon stress test session convinced me that
> > the lock added by -rt absolutely had to die.
>
> Okay. And the patch did that. I removed the lock.

But it also adds one where it appears no addition is required.  I don't care
how it dies though, only that it does.

> I see. So what I don't like are all the exceptions you have: one for
> RCU and one for kernfs. More might come in the future. So what I'm aiming
> for is the removal of the lock.

Yes, those two were bandaids to allow searching for more -rt specific
disease (none found).  Removing that lock is the cure.

        -Mike

Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Mike Galbraith-5
In reply to this post by Sebastian Andrzej Siewior-4
On Thu, 2016-04-07 at 18:47 +0200, Sebastian Andrzej Siewior wrote:

> just to be clear: The patch I attached did _not_ work for you.

Did you perchance mean with "Reenable migration across schedule"
reverted?  Figured it would still explode in seconds.. it did.

[  172.996232] kernel BUG at kernel/locking/rtmutex.c:1360!
[  172.996234] invalid opcode: 0000 [#1] PREEMPT SMP
[  172.996236] Dumping ftrace buffer:
[  172.996239]    (ftrace buffer empty)
[  172.996254] Modules linked in: ebtable_filter(E) ebtables(E) fuse(E) nf_log_ipv6(E) xt_pkttype(E) xt_physdev(E) br_netfilter(E) nf_log_ipv4(E) nf_log_common(E) xt_LOG(E) xt_limit(E) af_packet(E) bridge(E) stp(E) llc(E) iscsi_ibft(E) iscsi_boot_sysfs(E) ip6t_REJECT(E) xt_tcpudp(E) nf_conntrack_ipv6(E) nf_defrag_ipv6(E) ip6table_raw(E) ipt_REJECT(E) iptable_raw(E) xt_CT(E) iptable_filter(E) ip6table_mangle(E) nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) ip_tables(E) xt_conntrack(E) nf_conntrack(E) ip6table_filter(E) ip6_tables(E) x_tables(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) intel_rapl(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) snd_hda_codec_hdmi(E) snd_hda_codec_realtek(E)
[  172.996271]  snd_hda_codec_generic(E) drbg(E) snd_hda_intel(E) ansi_cprng(E) snd_hda_codec(E) snd_hda_core(E) snd_hwdep(E) aesni_intel(E) snd_pcm(E) aes_x86_64(E) lrw(E) r8169(E) mii(E) snd_timer(E) gf128mul(E) dm_mod(E) iTCO_wdt(E) iTCO_vendor_support(E) lpc_ich(E) mei_me(E) shpchp(E) snd(E) i2c_i801(E) joydev(E) pcspkr(E) serio_raw(E) glue_helper(E) ablk_helper(E) mei(E) mfd_core(E) cryptd(E) soundcore(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) processor(E) thermal(E) battery(E) fan(E) tpm_infineon(E) fjes(E) intel_smartconnect(E) sunrpc(E) efivarfs(E) ext4(E) crc16(E) mbcache(E) jbd2(E) sr_mod(E) cdrom(E) sd_mod(E) hid_logitech_hidpp(E) hid_logitech_dj(E) hid_generic(E) uas(E) usb_storage(E) usbhid(E) nouveau(E) wmi(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E)
[  172.996275]  ahci(E) sysimgblt(E) fb_sys_fops(E) libahci(E) ttm(E) libata(E) drm(E) video(E) button(E) sg(E) scsi_mod(E) autofs4(E)
[  172.996277] CPU: 7 PID: 6109 Comm: futex_wait Tainted: G            E   4.4.6-rt13-virgin #12
[  172.996277] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[  172.996278] task: ffff88017ce6ab80 ti: ffff8803d2e20000 task.ti: ffff8803d2e20000
[  172.996283] RIP: 0010:[<ffffffff810b5a23>]  [<ffffffff810b5a23>] task_blocks_on_rt_mutex+0x243/0x260
[  172.996284] RSP: 0018:ffff8803d2e23a38  EFLAGS: 00010092
[  172.996285] RAX: ffff8803d2e23c10 RBX: ffff88017ce6ab80 RCX: 0000000000000000
[  172.996285] RDX: 0000000000000001 RSI: ffff8803d2e23a98 RDI: ffff88017ce6b258
[  172.996286] RBP: ffff8803d2e23a68 R08: ffff8800dddc0000 R09: ffffffff81f33918
[  172.996286] R10: ffff8800dddc0001 R11: 0000000000000000 R12: ffff8800dddc0000
[  172.996287] R13: ffff8803d2e23a98 R14: ffffffff81f33900 R15: 0000000000000000
[  172.996288] FS:  00007f4017988700(0000) GS:ffff88041edc0000(0000) knlGS:0000000000000000
[  172.996288] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  172.996289] CR2: 0000000000000000 CR3: 00000003bf7f4000 CR4: 00000000001406e0
[  172.996289] Stack:
[  172.996291]  000000007ce6abe8 ffffffff81f33900 ffff8803d2e23a98 0000000000000000
[  172.996292]  0000000000000000 0000000000000000 ffff8803d2e23b08 ffffffff8162f105
[  172.996293]  0000000200000000 0000000000000296 0000000000000000 ffff8803d2e23ae8
[  172.996293] Call Trace:
[  172.996298]  [<ffffffff8162f105>] rt_mutex_slowlock+0xe5/0x290
[  172.996301]  [<ffffffff810a30e5>] ? pick_next_entity+0xa5/0x160
[  172.996303]  [<ffffffff8162f3a1>] rt_mutex_lock+0x31/0x40
[  172.996304]  [<ffffffff816308ae>] _mutex_lock+0xe/0x10
[  172.996306]  [<ffffffff81096543>] migrate_me+0x63/0x1f0
[  172.996308]  [<ffffffff81093fed>] ? finish_task_switch+0x7d/0x300
[  172.996310]  [<ffffffff8106bb25>] pin_current_cpu+0x1e5/0x2a0
[  172.996311]  [<ffffffff810942e3>] migrate_disable+0x73/0xd0
[  172.996313]  [<ffffffff8162f598>] rt_spin_lock_slowlock+0x1e8/0x2e0
[  172.996314]  [<ffffffff81630748>] rt_spin_lock+0x38/0x40
[  172.996317]  [<ffffffff810ece18>] futex_wait_setup+0x98/0x100
[  172.996318]  [<ffffffff810ecfcf>] futex_wait+0x14f/0x240
[  172.996320]  [<ffffffff810b4f36>] ? rt_mutex_dequeue_pi+0x36/0x60
[  172.996322]  [<ffffffff810b5cc6>] ? rt_mutex_adjust_prio+0x36/0x40
[  172.996323]  [<ffffffff8162f714>] ? rt_spin_lock_slowunlock+0x84/0xc0
[  172.996325]  [<ffffffff810edb81>] do_futex+0xd1/0x560
[  172.996327]  [<ffffffff81003666>] ? __switch_to+0x1d6/0x450
[  172.996329]  [<ffffffff81093fed>] ? finish_task_switch+0x7d/0x300
[  172.996330]  [<ffffffff8162d40e>] ? __schedule+0x2ae/0x7d0
[  172.996332]  [<ffffffff810ee081>] SyS_futex+0x71/0x150
[  172.996334]  [<ffffffff81066123>] ? exit_to_usermode_loop+0x4b/0xe4
[  172.996335]  [<ffffffff81630c2e>] entry_SYSCALL_64_fastpath+0x12/0x71
[  172.996349] Code: 0d 1b 54 f5 7e 74 30 65 48 8b 04 25 c4 28 01 00 48 8b 80 08 c0 ff ff f6 c4 02 75 1b b8 f5 ff ff ff e9 25 ff ff ff e8 8d f5 ff ff <0f> 0b e8 d6 b5 f4 ff e9 0e ff ff ff e8 cc b5 f4 ff b8 f5 ff ff
[  172.996351] RIP  [<ffffffff810b5a23>] task_blocks_on_rt_mutex+0x243/0x260
[  172.996351]  RSP <ffff8803d2e23a38>

Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Sebastian Andrzej Siewior-4
On 04/08/2016 03:44 PM, Mike Galbraith wrote:
> On Thu, 2016-04-07 at 18:47 +0200, Sebastian Andrzej Siewior wrote:
>
>> just to be clear: The patch I attached did _not_ work for you.
>
> Did you perchance mean with "Reenable migration across schedule"
> reverted?  Figured it would still explode in seconds.. it did.

I meant 4.4.6-rt13 + my patch and nothing else.

> [  172.996232] kernel BUG at kernel/locking/rtmutex.c:1360!

okay. and how did you trigger this? Just Steven's script or was there
more to it?

Sebastian


Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Mike Galbraith-5
On Fri, 2016-04-08 at 15:58 +0200, Sebastian Andrzej Siewior wrote:

> On 04/08/2016 03:44 PM, Mike Galbraith wrote:
> > On Thu, 2016-04-07 at 18:47 +0200, Sebastian Andrzej Siewior wrote:
> >
> > > just to be clear: The patch I attached did _not_ work for you.
> >
> > Did you perchance mean with "Reenable migration across schedule"
> > reverted?  Figured it would still explode in seconds.. it did.
>
> I meant 4.4.6-rt13 + my patch and nothing else.
>
> > [  172.996232] kernel BUG at kernel/locking/rtmutex.c:1360!
>
> okay. and how did you trigger this? Just Steven's script or was there
> more to it?

I run stockfish, futextest, hackbench and tbench with it, terminating
and restarting them at random intervals just to make sure nobody gets
into a comfortable little rut.  Stockfish and tbench are sized as to
not saturate the box, hackbench runs periodically (and with no args to
turn it into a hog), futextest run.sh just does its normal thing.

Trying to grab an rtmutex while queued on an rtmutex... doesn't matter
much if it's the lock that likes to deadlock us, or the one you added
instead of making that blasted lock really really dead.

        -Mike

Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Sebastian Andrzej Siewior-4
On 04/08/2016 04:16 PM, Mike Galbraith wrote:
>> okay. and how did you trigger this? Just Steven's script or was there
>> more to it?
>
> I run stockfish, futextest, hackbench and tbench with it, terminating
> and restarting them at random intervals just to make sure nobody gets
> into a comfortable little rut.  Stockfish and tbench are sized as to
> not saturate the box, hackbench runs periodically (and with no args to
> turn it into a hog), futextest run.sh just does its normal thing.

Is there anything you can hand me over?

> Trying to grab an rtmutex while queued on an rtmutex... doesn't matter
> much if it's the lock that likes to deadlock us, or the one you added
> instead of making that blasted lock really really dead.

Yeah, doesn't look too good.

> -Mike
>
Sebastian

Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Mike Galbraith-5
On Fri, 2016-04-08 at 16:51 +0200, Sebastian Andrzej Siewior wrote:

> Is there anything you can hand me over?

Sure, I'll send it offline (yup, that proud of my scripting;)

        -Mike

Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Sebastian Andrzej Siewior-4
* Mike Galbraith | 2016-04-08 18:49:28 [+0200]:

>On Fri, 2016-04-08 at 16:51 +0200, Sebastian Andrzej Siewior wrote:
>
>> Is there anything you can hand me over?
>
>Sure, I'll send it offline (yup, that proud of my scripting;)
>
> -Mike

take 2. There is this else case in pin_current_cpu() where I take
hp_lock. I didn't manage to get in there. So I *think* we can get rid of
the lock now. Since there is no lock (or won't be), we can drop the whole
`do_mig_dis' checking and do the migrate_disable() _after_ we obtain
the lock. We were not able to do so before because of hp_lock.

And with this, I didn't manage to trigger the lockup you had with
futextest.

diff --git a/include/linux/sched.h b/include/linux/sched.h
index f9a0f2b540f1..b0f786274025 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1492,7 +1492,7 @@ struct task_struct {
 #ifdef CONFIG_COMPAT_BRK
  unsigned brk_randomized:1;
 #endif
-
+ unsigned mig_away :1;
  unsigned long atomic_flags; /* Flags needing atomic access. */
 
  struct restart_block restart_block;
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 8edd3c716092..3a1ee02ba3ab 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -30,6 +30,10 @@
 /* Serializes the updates to cpu_online_mask, cpu_present_mask */
 static DEFINE_MUTEX(cpu_add_remove_lock);
 
+static DEFINE_SPINLOCK(cpumask_lock);
+static cpumask_var_t mig_cpumask;
+static cpumask_var_t mig_cpumask_org;
+
 /*
  * The following two APIs (cpu_maps_update_begin/done) must be used when
  * attempting to serialize the updates to cpu_online_mask & cpu_present_mask.
@@ -120,6 +124,8 @@ struct hotplug_pcp {
  * state.
  */
  spinlock_t lock;
+ cpumask_var_t cpumask;
+ cpumask_var_t cpumask_org;
 #else
  struct mutex mutex;
 #endif
@@ -158,9 +164,30 @@ void pin_current_cpu(void)
  return;
  }
  if (hp->grab_lock) {
+ int cpu;
+
+ cpu = smp_processor_id();
  preempt_enable();
- hotplug_lock(hp);
- hotplug_unlock(hp);
+ if (cpu != raw_smp_processor_id())
+ goto retry;
+
+ current->mig_away = 1;
+ rt_spin_lock__no_mg(&cpumask_lock);
+
+ /* DOWN */
+ cpumask_copy(mig_cpumask_org, tsk_cpus_allowed(current));
+ cpumask_andnot(mig_cpumask, cpu_online_mask, cpumask_of(cpu));
+ set_cpus_allowed_ptr(current, mig_cpumask);
+
+ if (cpu == raw_smp_processor_id()) {
+ /* BAD */
+ hotplug_lock(hp);
+ hotplug_unlock(hp);
+ }
+ set_cpus_allowed_ptr(current, mig_cpumask_org);
+ current->mig_away = 0;
+ rt_spin_unlock__no_mg(&cpumask_lock);
+
  } else {
  preempt_enable();
  /*
@@ -800,7 +827,13 @@ static struct notifier_block smpboot_thread_notifier = {
 
 void smpboot_thread_init(void)
 {
+ bool ok;
+
  register_cpu_notifier(&smpboot_thread_notifier);
+ ok = alloc_cpumask_var(&mig_cpumask, GFP_KERNEL);
+ BUG_ON(!ok);
+ ok = alloc_cpumask_var(&mig_cpumask_org, GFP_KERNEL);
+ BUG_ON(!ok);
 }
 
 /* Requires cpu_add_remove_lock to be held */
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 66971005cc12..b5e5e6a15278 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -930,13 +930,13 @@ static inline void rt_spin_lock_fastlock(struct rt_mutex *lock,
 {
  might_sleep_no_state_check();
 
- if (do_mig_dis)
+ if (do_mig_dis && 0)
  migrate_disable();
 
  if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
  rt_mutex_deadlock_account_lock(lock, current);
  else
- slowfn(lock, do_mig_dis);
+ slowfn(lock, false);
 }
 
 static inline void rt_spin_lock_fastunlock(struct rt_mutex *lock,
@@ -1125,12 +1125,14 @@ void __lockfunc rt_spin_lock(spinlock_t *lock)
 {
  rt_spin_lock_fastlock(&lock->lock, rt_spin_lock_slowlock, true);
  spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
+ migrate_disable();
 }
 EXPORT_SYMBOL(rt_spin_lock);
 
 void __lockfunc __rt_spin_lock(struct rt_mutex *lock)
 {
  rt_spin_lock_fastlock(lock, rt_spin_lock_slowlock, true);
+ migrate_disable();
 }
 EXPORT_SYMBOL(__rt_spin_lock);
 
@@ -1145,6 +1147,7 @@ void __lockfunc rt_spin_lock_nested(spinlock_t *lock, int subclass)
 {
  spin_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
  rt_spin_lock_fastlock(&lock->lock, rt_spin_lock_slowlock, true);
+ migrate_disable();
 }
 EXPORT_SYMBOL(rt_spin_lock_nested);
 #endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index da96d97f3d79..0eb7496870bd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3369,6 +3369,9 @@ static inline void sched_submit_work(struct task_struct *tsk)
 {
  if (!tsk->state)
  return;
+
+ if (tsk->mig_away)
+ return;
  /*
  * If a worker went to sleep, notify and ask workqueue whether
  * it wants to wake up a task to maintain concurrency.
--
2.8.0.rc3


Sebastian
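
The core of the pin_current_cpu() change above, condensed into a sketch (migrate_away_from() is a hypothetical helper wrapping the new grab_lock branch; the retry check and cpumask allocation are elided, kernel context assumed): instead of blocking on hp_lock, the task temporarily restricts its affinity to every online CPU except the dying one so the scheduler moves it away, and hp_lock is only touched if it somehow remains on the dying CPU.

static void migrate_away_from(struct hotplug_pcp *hp, int cpu)
{
	current->mig_away = 1;
	rt_spin_lock__no_mg(&cpumask_lock);

	/* allowed mask = online CPUs minus the one going down */
	cpumask_copy(mig_cpumask_org, tsk_cpus_allowed(current));
	cpumask_andnot(mig_cpumask, cpu_online_mask, cpumask_of(cpu));
	set_cpus_allowed_ptr(current, mig_cpumask);

	if (cpu == raw_smp_processor_id()) {
		/* still on the dying CPU: fall back to the old lock */
		hotplug_lock(hp);
		hotplug_unlock(hp);
	}

	/* restore the original affinity */
	set_cpus_allowed_ptr(current, mig_cpumask_org);
	current->mig_away = 0;
	rt_spin_unlock__no_mg(&cpumask_lock);
}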

Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Mike Galbraith-5
On Mon, 2016-04-18 at 19:15 +0200, Sebastian Andrzej Siewior wrote:

> take 2. There is this else case in pin_current_cpu() where I take
> hp_lock. I didn't manage to get in there. So I *think* we can get rid of
> the lock now. Since there is no lock (or won't be), we can drop the whole
> `do_mig_dis' checking and do the migrate_disable() _after_ we obtain
> the lock. We were not able to do so before because of hp_lock.
>
> And with this, I didn't manage to trigger the lockup you had with
> futextest.

I'll have to feed it to DL980, hotplug and jitter test it.  It seemed
to think that pinning post acquisition was a bad idea jitter wise, but
I was bending things up while juggling multiple boxen, so..

        -Mike

Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Sebastian Andrzej Siewior-4
On 04/18/2016 07:55 PM, Mike Galbraith wrote:
>
> I'll have to feed it to DL980, hotplug and jitter test it.  It seemed
> to think that pinning post acquisition was a bad idea jitter wise, but
> I was bending things up while juggling multiple boxen, so..

pinning pre-acquisition could get you into a situation where you get the
lock but are stuck on CPU A, where a higher-priority task is also running,
while CPU B and CPU C are idle.

>
> -Mike
>
Sebastian

Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule

Mike Galbraith-5
On Tue, 2016-04-19 at 09:07 +0200, Sebastian Andrzej Siewior wrote:
> On 04/18/2016 07:55 PM, Mike Galbraith wrote:
> >
> > I'll have to feed it to DL980, hotplug and jitter test it.  It seemed
> > to think that pinning post acquisition was a bad idea jitter wise, but
> > I was bending things up while juggling multiple boxen, so..
>
> pinning pre-acquisition could get you into a situation where you get the
> lock but are stuck on CPU A, where a higher-priority task is also running,
> while CPU B and CPU C are idle.

I can't get to my DL980 to do jitter testing atm (network outage), but
wrt hotplug banging, my local boxen say patch is toxic.  i4790 desktop
box silently bricked once too.  The boom begins with...

   BUG: scheduling while atomic: futex_wait/11303/0x00000000

...very noisy funeral procession follows.

        -Mike