[PATCH v3 0/6] powerpc use pv-qspinlock as the default spinlock implementation
changes from v2:
__spin_yield_cpu() will yield slices to the lpar if the target cpu is running.
remove unnecessary rmb() in __spin_yield/wake_cpu.
__pv_wait() will check the *ptr == val.
some commit message changes
changes from v1:
separated into 6 patches from one patch
some minor code changes.
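To illustrate the "__pv_wait() will check the *ptr == val" item above, here is a minimal user-space sketch, not the code from this series. The name sketch_pv_wait() is hypothetical, and sched_yield() only stands in for whatever mechanism actually gives the cpu back to the hypervisor. The point is that the waiter re-reads the location before giving up its time slice, so it does not yield if the state has already changed.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a pv wait helper.
 * Returns true if the caller actually yielded, false if *ptr had
 * already changed and the wait was skipped.
 */
static bool sketch_pv_wait(_Atomic uint8_t *ptr, uint8_t val)
{
	/* Re-check the state first: if it no longer equals val, do not wait. */
	if (atomic_load_explicit(ptr, memory_order_relaxed) != val)
		return false;

	/* Assumption: sched_yield() models giving the cpu away. */
	sched_yield();
	return true;
}
```

A caller would typically invoke this in a loop and re-examine the lock state after each return, since a yield may end before the state has changed.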
I ran several tests on a pseries IBM,8408-E8E with 32 cpus and 64GB memory.
benchmark test results are below.
Pan Xinhui (6):
qspinlock: powerpc support qspinlock
powerpc: pseries/Kconfig: Add qspinlock build config
powerpc: lib/locks.c: Add cpu yield/wake helper function
pv-qspinlock: powerpc support pv-qspinlock
pv-qspinlock: use cmpxchg_release in __pv_queued_spin_unlock
powerpc: pseries: Add pv-qspinlock build config/make
diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index a5b1248..2bbffe4 100644
@@ -614,7 +614,7 @@ __visible void __pv_queued_spin_unlock(struct qspinlock *lock)
* unhash. Otherwise it would be possible to have multiple @lock
* entries, which would be BAD.
- locked = cmpxchg(&l->locked, _Q_LOCKED_VAL, 0);
+ locked = cmpxchg_release(&l->locked, _Q_LOCKED_VAL, 0);
if (likely(locked == _Q_LOCKED_VAL))
If unsure, select Y.
+ bool "Paravirtualization support for qspinlock"
+ depends on PPC_SPLPAR && QUEUED_SPINLOCKS
+ default y
+ If the platform supports virtualization, for example PowerVM, this option
+ can let the guest have better performance.
On Wed, May 25, 2016 at 04:18:03PM +0800, Pan Xinhui wrote:
> |futex hash | 556370 ops | 629634 ops |
> |futex lock-pi | 362 ops | 367 ops |
> scheduler test:
> Test how many loops of schedule() can finish within 10 seconds on all cpus.
> |schedule() loops| 322811921 | 311449290 |
> kernel compiling test:
> build a linux kernel image to see how long it took
> | compiling takes| 22m | 22m |
Is 'spinlock' the current test-and-set lock?
And what about regular qspinlock, in case of !SHARED_PROCESSOR?
On Thu, May 26, 2016 at 06:47:29PM +0200, Peter Zijlstra wrote:
> On Wed, May 25, 2016 at 04:18:08PM +0800, Pan Xinhui wrote:
> > cmpxchg_release is lighter-weight than cmpxchg, so we can gain better
> > performance. On some archs like ppc, barriers impact performance
> > too much.
> > Suggested-by: Boqun Feng <[hidden email]>
> > Signed-off-by: Pan Xinhui <[hidden email]>
> > ---
> > kernel/locking/qspinlock_paravirt.h | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> > diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
> > index a5b1248..2bbffe4 100644
> > --- a/kernel/locking/qspinlock_paravirt.h
> > +++ b/kernel/locking/qspinlock_paravirt.h
> > @@ -614,7 +614,7 @@ __visible void __pv_queued_spin_unlock(struct qspinlock *lock)
> > * unhash. Otherwise it would be possible to have multiple @lock
> > * entries, which would be BAD.
> > */
> > - locked = cmpxchg(&l->locked, _Q_LOCKED_VAL, 0);
> > + locked = cmpxchg_release(&l->locked, _Q_LOCKED_VAL, 0);
> > if (likely(locked == _Q_LOCKED_VAL))
> > return;
> This patch fails to explain _why_ it can be relaxed.
> And seeing how this cmpxchg() can actually unlock the lock, I don't see
> how this can possibly be correct. Maybe cmpxchg_release(), but relaxed
> seems very wrong.
Clearly I need to stop working for the day, I cannea read. You're doing
release, not relaxed.
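
For readers following the release-versus-relaxed point, here is a small user-space sketch using C11 atomics, not the kernel code. The struct, the function names and the LOCKED_VAL/SLOW_VAL constants are hypothetical stand-ins chosen for illustration. It shows the shape of an unlock fast path done with a release compare-and-exchange: release ordering keeps the critical section's accesses ordered before the store that drops the lock, and it pairs with the acquire on the lock side. A relaxed operation would give no such guarantee.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values, standing in for _Q_LOCKED_VAL and a "slow" state. */
#define LOCKED_VAL 1u
#define SLOW_VAL   3u

struct sketch_lock {
	_Atomic uint8_t locked;
};

/* Lock side: acquire ordering pairs with the release in the unlock. */
static bool sketch_trylock(struct sketch_lock *l)
{
	uint8_t expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->locked, &expected,
			LOCKED_VAL, memory_order_acquire, memory_order_relaxed);
}

/*
 * Unlock fast path: succeeds only if the byte still holds LOCKED_VAL.
 * Returns false when another value is found (for example SLOW_VAL),
 * in which case a real implementation would take its slow path.
 */
static bool sketch_unlock_fast(struct sketch_lock *l)
{
	uint8_t expected = LOCKED_VAL;

	return atomic_compare_exchange_strong_explicit(&l->locked, &expected,
			0, memory_order_release, memory_order_relaxed);
}
```

On failure the compare-and-exchange leaves the byte untouched, so the caller can still inspect the state it found and handle the waiter.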