[PATCH] md/raid5: fix locking in handle_stripe_clean_event()


[PATCH] md/raid5: fix locking in handle_stripe_clean_event()

Roman Gushchin
After commit 566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
__find_stripe() is called under conf->hash_locks + hash.
But handle_stripe_clean_event() calls remove_hash() under
conf->device_lock.

Under some circumstances the hash chain can become circular,
causing an infinite loop in __find_stripe() with interrupts disabled
and the hash lock held. This leads to a hard lockup on multiple CPUs
and a subsequent system crash.

I was able to reproduce this behavior on a raid6 array over 6 SSDs.
The devices_handle_discard_safely module option must be set to enable
trim support. The following script was used:

for i in `seq 1 32`; do
    dd if=/dev/zero of=large$i bs=10M count=100 &
done
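The race is easiest to see as a lock-striping model. Below is a minimal Python sketch (hypothetical names and a deliberately simplified hash function, not the kernel code): __find_stripe() walks a hash bucket under that bucket's lock from conf->hash_locks, so remove_hash() must take the very same per-bucket lock; taking the unrelated conf->device_lock leaves the chain unprotected against concurrent lookups.

```python
import threading

NR_STRIPE_HASH_LOCKS = 8  # power of two, mirroring the kernel constant

class Conf:
    def __init__(self):
        # one lock per group of hash buckets, instead of a single device_lock
        self.hash_locks = [threading.Lock() for _ in range(NR_STRIPE_HASH_LOCKS)]
        self.buckets = {i: [] for i in range(NR_STRIPE_HASH_LOCKS)}

def stripe_hash_locks_hash(sector):
    # map a stripe's sector to a lock index; a simplified stand-in for the
    # kernel's hashing -- the exact function doesn't matter for the point
    return sector & (NR_STRIPE_HASH_LOCKS - 1)

def find_stripe(conf, sector):
    # __find_stripe() analogue: walks the bucket under its per-bucket lock
    h = stripe_hash_locks_hash(sector)
    with conf.hash_locks[h]:
        return sector in conf.buckets[h]

def insert_stripe(conf, sector):
    h = stripe_hash_locks_hash(sector)
    with conf.hash_locks[h]:
        conf.buckets[h].append(sector)

def remove_stripe(conf, sector):
    # the fix: take the same per-bucket lock the lookup path takes,
    # not an unrelated lock
    h = stripe_hash_locks_hash(sector)
    with conf.hash_locks[h]:
        conf.buckets[h].remove(sector)

conf = Conf()
insert_stripe(conf, 24)
insert_stripe(conf, 16)   # 24 and 16 land in the same bucket, same lock
remove_stripe(conf, 24)
```

In the buggy code the reader and the remover serialize on different locks, so a lookup can walk the chain while a removal splices it, which is how the chain ends up circular.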

Signed-off-by: Roman Gushchin <[hidden email]>
Cc: Neil Brown <[hidden email]>
Cc: Shaohua Li <[hidden email]>
Cc: [hidden email]
Cc: <[hidden email]> # 3.10 - 3.19
---
 drivers/md/raid5.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e421016..5fa7549 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3060,6 +3060,8 @@ static void handle_stripe_clean_event(struct r5conf *conf,
         }
         if (!discard_pending &&
             test_bit(R5_Discard, &sh->dev[sh->pd_idx].flags)) {
+                int hash = sh->hash_lock_index;
+
                 clear_bit(R5_Discard, &sh->dev[sh->pd_idx].flags);
                 clear_bit(R5_UPTODATE, &sh->dev[sh->pd_idx].flags);
                 if (sh->qd_idx >= 0) {
@@ -3073,9 +3075,9 @@ static void handle_stripe_clean_event(struct r5conf *conf,
                  * no updated data, so remove it from hash list and the stripe
                  * will be reinitialized
                  */
-                spin_lock_irq(&conf->device_lock);
+                spin_lock_irq(conf->hash_locks + hash);
                 remove_hash(sh);
-                spin_unlock_irq(&conf->device_lock);
+                spin_unlock_irq(conf->hash_locks + hash);
                 if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))
                         set_bit(STRIPE_HANDLE, &sh->state);
 
--
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] md/raid5: fix locking in handle_stripe_clean_event()

NeilBrown
On Wed, Oct 28 2015, Roman Gushchin wrote:

> After commit 566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
> __find_stripe() is called under conf->hash_locks + hash.
> But handle_stripe_clean_event() calls remove_hash() under
> conf->device_lock.
>
> Under some circumstances the hash chain can become circular,
> causing an infinite loop in __find_stripe() with interrupts disabled
> and the hash lock held. This leads to a hard lockup on multiple CPUs
> and a subsequent system crash.
>
> I was able to reproduce this behavior on raid6 over 6 ssd disks.
> The devices_handle_discard_safely option should be set to enable trim
> support. The following script was used:
>
> for i in `seq 1 32`; do
>     dd if=/dev/zero of=large$i bs=10M count=100 &
> done
>
> Signed-off-by: Roman Gushchin <[hidden email]>
> Cc: Neil Brown <[hidden email]>
> Cc: Shaohua Li <[hidden email]>
> Cc: [hidden email]
> Cc: <[hidden email]> # 3.10 - 3.19
Hi Roman,
 thanks for reporting this and providing a fix.

I'm a bit confused by that stable range: 3.10 - 3.19

The commit you identify as introducing the bug was added in 3.13, so
presumably 3.10, 3.11, 3.12 are not affected.
Also the bug is still present in mainline, so 4.0, 4.1, 4.2 are also
affected, though the patch needs to be revised a bit for 4.1 and later.

Does that match your understanding?  Or is there something that I am
missing?

Thanks,
NeilBrown

> ---
>  drivers/md/raid5.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index e421016..5fa7549 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -3060,6 +3060,8 @@ static void handle_stripe_clean_event(struct r5conf *conf,
>   }
>   if (!discard_pending &&
>      test_bit(R5_Discard, &sh->dev[sh->pd_idx].flags)) {
> + int hash = sh->hash_lock_index;
> +
>   clear_bit(R5_Discard, &sh->dev[sh->pd_idx].flags);
>   clear_bit(R5_UPTODATE, &sh->dev[sh->pd_idx].flags);
>   if (sh->qd_idx >= 0) {
> @@ -3073,9 +3075,9 @@ static void handle_stripe_clean_event(struct r5conf *conf,
>   * no updated data, so remove it from hash list and the stripe
>   * will be reinitialized
>   */
> - spin_lock_irq(&conf->device_lock);
> + spin_lock_irq(conf->hash_locks + hash);
>   remove_hash(sh);
> - spin_unlock_irq(&conf->device_lock);
> + spin_unlock_irq(conf->hash_locks + hash);
>   if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))
>   set_bit(STRIPE_HANDLE, &sh->state);
>  
> --
> 2.4.3
>

Re: [PATCH] md/raid5: fix locking in handle_stripe_clean_event()

Roman Gushchin
29.10.2015, 03:35, "Neil Brown" <[hidden email]>:

> On Wed, Oct 28 2015, Roman Gushchin wrote:
>
>>  After commit 566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
>>  __find_stripe() is called under conf->hash_locks + hash.
>>  But handle_stripe_clean_event() calls remove_hash() under
>>  conf->device_lock.
>>
>>  Under some circumstances the hash chain can become circular,
>>  causing an infinite loop in __find_stripe() with interrupts disabled
>>  and the hash lock held. This leads to a hard lockup on multiple CPUs
>>  and a subsequent system crash.
>>
>>  I was able to reproduce this behavior on raid6 over 6 ssd disks.
>>  The devices_handle_discard_safely option should be set to enable trim
>>  support. The following script was used:
>>
>>  for i in `seq 1 32`; do
>>      dd if=/dev/zero of=large$i bs=10M count=100 &
>>  done
>>
>>  Signed-off-by: Roman Gushchin <[hidden email]>
>>  Cc: Neil Brown <[hidden email]>
>>  Cc: Shaohua Li <[hidden email]>
>>  Cc: [hidden email]
>>  Cc: <[hidden email]> # 3.10 - 3.19
>
> Hi Roman,
>  thanks for reporting this and providing a fix.
>
> I'm a bit confused by that stable range: 3.10 - 3.19
>
> The commit you identify as introducing the bug was added in 3.13, so
> presumably 3.10, 3.11, 3.12 are not affected.

Sure, it's my mistake. Correct range is 3.13 - 3.19. Sorry.

> Also the bug is still present in mainline, so 4.0, 4.1, 4.2 are also
> affected, though the patch needs to be revised a bit for 4.1 and later.

Yes, exactly, but things are a bit more complicated in mainline.
I'll try to prepare a patch for mainline in a couple of days.

Thanks,
Roman

Re: [PATCH] md/raid5: fix locking in handle_stripe_clean_event()

Greg Kroah-Hartman
On Thu, Oct 29, 2015 at 05:15:48PM +0300, Roman Gushchin wrote:

> 29.10.2015, 03:35, "Neil Brown" <[hidden email]>:
> > On Wed, Oct 28 2015, Roman Gushchin wrote:
> >
> >>  After commit 566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
> >>  __find_stripe() is called under conf->hash_locks + hash.
> >>  But handle_stripe_clean_event() calls remove_hash() under
> >>  conf->device_lock.
> >>
> >>  Under some circumstances the hash chain can become circular,
> >>  causing an infinite loop in __find_stripe() with interrupts disabled
> >>  and the hash lock held. This leads to a hard lockup on multiple CPUs
> >>  and a subsequent system crash.
> >>
> >>  I was able to reproduce this behavior on raid6 over 6 ssd disks.
> >>  The devices_handle_discard_safely option should be set to enable trim
> >>  support. The following script was used:
> >>
> >>  for i in `seq 1 32`; do
> >>      dd if=/dev/zero of=large$i bs=10M count=100 &
> >>  done
> >>
> >>  Signed-off-by: Roman Gushchin <[hidden email]>
> >>  Cc: Neil Brown <[hidden email]>
> >>  Cc: Shaohua Li <[hidden email]>
> >>  Cc: [hidden email]
> >>  Cc: <[hidden email]> # 3.10 - 3.19
> >
> > Hi Roman,
> >  thanks for reporting this and providing a fix.
> >
> > I'm a bit confused by that stable range: 3.10 - 3.19
> >
> > The commit you identify as introducing the bug was added in 3.13, so
> > presumably 3.10, 3.11, 3.12 are not affected.
>
> Sure, it's my mistake. Correct range is 3.13 - 3.19. Sorry.
>
> > Also the bug is still present in mainline, so 4.0, 4.1, 4.2 are also
> > affected, though the patch needs to be revised a bit for 4.1 and later.
>
> Yes, exactly, but things are a bit more complicated in mainline.
> I'll try to prepare a patch for mainline in a couple of days.

We can't do anything with a patch that is not already in Linus's tree,
which is why this isn't even in my patch queue anymore.  Please resend
this once the fix is in Linus's tree, with the git commit id of what it
is there and we will be glad to queue it up.

thanks,

greg k-h

Re: [PATCH] md/raid5: fix locking in handle_stripe_clean_event()

NeilBrown
On Fri, Oct 30 2015, Roman Gushchin wrote:

> 29.10.2015, 03:35, "Neil Brown" <[hidden email]>:
>> On Wed, Oct 28 2015, Roman Gushchin wrote:
>>
>>>  After commit 566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
>>>  __find_stripe() is called under conf->hash_locks + hash.
>>>  But handle_stripe_clean_event() calls remove_hash() under
>>>  conf->device_lock.
>>>
>>>  Under some circumstances the hash chain can become circular,
>>>  causing an infinite loop in __find_stripe() with interrupts disabled
>>>  and the hash lock held. This leads to a hard lockup on multiple CPUs
>>>  and a subsequent system crash.
>>>
>>>  I was able to reproduce this behavior on raid6 over 6 ssd disks.
>>>  The devices_handle_discard_safely option should be set to enable trim
>>>  support. The following script was used:
>>>
>>>  for i in `seq 1 32`; do
>>>      dd if=/dev/zero of=large$i bs=10M count=100 &
>>>  done
>>>
>>>  Signed-off-by: Roman Gushchin <[hidden email]>
>>>  Cc: Neil Brown <[hidden email]>
>>>  Cc: Shaohua Li <[hidden email]>
>>>  Cc: [hidden email]
>>>  Cc: <[hidden email]> # 3.10 - 3.19
>>
>> Hi Roman,
>>  thanks for reporting this and providing a fix.
>>
>> I'm a bit confused by that stable range: 3.10 - 3.19
>>
>> The commit you identify as introducing the bug was added in 3.13, so
>> presumably 3.10, 3.11, 3.12 are not affected.
>
> Sure, it's my mistake. Correct range is 3.13 - 3.19. Sorry.
>
>> Also the bug is still present in mainline, so 4.0, 4.1, 4.2 are also
>> affected, though the patch needs to be revised a bit for 4.1 and later.
>
> Yes, exactly, but things are a bit more complicated in mainline.
> I'll try to prepare a patch for mainline in a couple of days.
>
Thanks for the confirmation.

Isn't the 4.1 fix just:

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e5befa356dbe..6e4350a78257 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3522,16 +3522,16 @@ returnbi:
                  * no updated data, so remove it from hash list and the stripe
                  * will be reinitialized
                  */
-                spin_lock_irq(&conf->device_lock);
 unhash:
+                spin_lock_irq(conf->hash_locks + sh->hash_lock_index);
                 remove_hash(sh);
+                spin_unlock_irq(conf->hash_locks + sh->hash_lock_index);
                 if (head_sh->batch_head) {
                         sh = list_first_entry(&sh->batch_list,
                                               struct stripe_head, batch_list);
                         if (sh != head_sh)
                                 goto unhash;
                 }
-                spin_unlock_irq(&conf->device_lock);
                 sh = head_sh;
 
                 if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))

??

Or maybe
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e5befa356dbe..704ef7fcfbf8 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3509,6 +3509,7 @@ returnbi:
 
         if (!discard_pending &&
             test_bit(R5_Discard, &sh->dev[sh->pd_idx].flags)) {
+                int hash;
                 clear_bit(R5_Discard, &sh->dev[sh->pd_idx].flags);
                 clear_bit(R5_UPTODATE, &sh->dev[sh->pd_idx].flags);
                 if (sh->qd_idx >= 0) {
@@ -3522,16 +3523,17 @@ returnbi:
                  * no updated data, so remove it from hash list and the stripe
                  * will be reinitialized
                  */
-                spin_lock_irq(&conf->device_lock);
 unhash:
+                hash = sh->hash_lock_index;
+                spin_lock_irq(conf->hash_locks + hash);
                 remove_hash(sh);
+                spin_unlock_irq(conf->hash_locks + hash);
                 if (head_sh->batch_head) {
                         sh = list_first_entry(&sh->batch_list,
                                               struct stripe_head, batch_list);
                         if (sh != head_sh)
                                 goto unhash;
                 }
-                spin_unlock_irq(&conf->device_lock);
                 sh = head_sh;
 
                 if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))
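A rough Python model of that unhash loop (hypothetical structure names, a sketch rather than kernel code): stripes batched behind one head may hash to different buckets, so the lock index has to be re-read for each stripe before it is removed, which is why the per-iteration lock/unlock inside the loop is needed.

```python
import threading

NR_STRIPE_HASH_LOCKS = 8

class Conf:
    def __init__(self):
        self.hash_locks = [threading.Lock() for _ in range(NR_STRIPE_HASH_LOCKS)]
        self.buckets = {i: [] for i in range(NR_STRIPE_HASH_LOCKS)}

class Stripe:
    def __init__(self, sector):
        self.sector = sector
        self.hash_lock_index = sector & (NR_STRIPE_HASH_LOCKS - 1)
        self.batch = []          # models batch_list: stripes batched behind a head
        self.batch_head = None   # set on the head when a batch exists

def insert_stripe(conf, sh):
    with conf.hash_locks[sh.hash_lock_index]:
        conf.buckets[sh.hash_lock_index].append(sh)

def unhash_batch(conf, head_sh):
    # mirrors the 'goto unhash' loop: unhash the head, then every batch
    # member, taking each stripe's own per-bucket lock in turn
    members = head_sh.batch if head_sh.batch_head else []
    for sh in [head_sh] + members:
        h = sh.hash_lock_index        # may differ between batch members
        with conf.hash_locks[h]:
            conf.buckets[h].remove(sh)

conf = Conf()
head = Stripe(8)
member = Stripe(9)                    # lands in a different bucket than the head
head.batch = [member]
head.batch_head = head
insert_stripe(conf, head)
insert_stripe(conf, member)
unhash_batch(conf, head)
```

Under this model, holding only the head's bucket lock for the whole loop would leave the other members' buckets unprotected, which matches the motivation for re-fetching the lock per stripe.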


For personal reasons I would like to get this resolved today or
tomorrow, though it would be silly to rush if there is any uncertainty.

Thanks,
NeilBrown


Re: [PATCH] md/raid5: fix locking in handle_stripe_clean_event()

Roman Gushchin
> Isn't the 4.1 fix just:
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index e5befa356dbe..6e4350a78257 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -3522,16 +3522,16 @@ returnbi:
>                   * no updated data, so remove it from hash list and the stripe
>                   * will be reinitialized
>                   */
> - spin_lock_irq(&conf->device_lock);
>  unhash:
> + spin_lock_irq(conf->hash_locks + sh->hash_lock_index);
>                  remove_hash(sh);
> + spin_unlock_irq(conf->hash_locks + sh->hash_lock_index);
>                  if (head_sh->batch_head) {
>                          sh = list_first_entry(&sh->batch_list,
>                                                struct stripe_head, batch_list);
>                          if (sh != head_sh)
>                                          goto unhash;
>                  }
> - spin_unlock_irq(&conf->device_lock);
>                  sh = head_sh;
>
>                  if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))
>
> ??

In my opinion, this patch looks correct, although it seems to me that there is another issue here.

>                  if (head_sh->batch_head) {
>                          sh = list_first_entry(&sh->batch_list,
>                                                struct stripe_head, batch_list);
>                          if (sh != head_sh)
>                                          goto unhash;
>                  }
 
With the patch above, this code will be executed without taking any locks. Is that correct?
In my opinion, we need to take at least sh->stripe_lock, which protects sh->batch_head.
Or am I missing something?

If you want, we can handle this issue separately.


Thanks,
Roman

Re: [PATCH] md/raid5: fix locking in handle_stripe_clean_event()

Shaohua Li-2
On Fri, Oct 30, 2015 at 05:02:47PM +0300, Roman Gushchin wrote:

> > Isn't the 4.1 fix just:
> >
> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > index e5befa356dbe..6e4350a78257 100644
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -3522,16 +3522,16 @@ returnbi:
> >                   * no updated data, so remove it from hash list and the stripe
> >                   * will be reinitialized
> >                   */
> > - spin_lock_irq(&conf->device_lock);
> >  unhash:
> > + spin_lock_irq(conf->hash_locks + sh->hash_lock_index);
> >                  remove_hash(sh);
> > + spin_unlock_irq(conf->hash_locks + sh->hash_lock_index);
> >                  if (head_sh->batch_head) {
> >                          sh = list_first_entry(&sh->batch_list,
> >                                                struct stripe_head, batch_list);
> >                          if (sh != head_sh)
> >                                          goto unhash;
> >                  }
> > - spin_unlock_irq(&conf->device_lock);
> >                  sh = head_sh;
> >
> >                  if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))
> >
> > ??
>
> In my opinion, this patch looks correct, although it seems to me that there is another issue here.
>
> >                  if (head_sh->batch_head) {
> >                          sh = list_first_entry(&sh->batch_list,
> >                                                struct stripe_head, batch_list);
> >                          if (sh != head_sh)
> >                                          goto unhash;
> >                  }
>  
> With the patch above, this code will be executed without taking any locks. Is that correct?
> In my opinion, we need to take at least sh->stripe_lock, which protects sh->batch_head.
> Or am I missing something?
>
> If you want, we can handle this issue separately.

The batch_list doesn't need the protection. Only remove_hash() needs it.

Thanks,
Shaohua

Re: [PATCH] md/raid5: fix locking in handle_stripe_clean_event()

NeilBrown
On Sat, Oct 31 2015, Shaohua Li wrote:

> On Fri, Oct 30, 2015 at 05:02:47PM +0300, Roman Gushchin wrote:
>> > Isn't the 4.1 fix just:
>> >
>> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> > index e5befa356dbe..6e4350a78257 100644
>> > --- a/drivers/md/raid5.c
>> > +++ b/drivers/md/raid5.c
>> > @@ -3522,16 +3522,16 @@ returnbi:
>> >                   * no updated data, so remove it from hash list and the stripe
>> >                   * will be reinitialized
>> >                   */
>> > - spin_lock_irq(&conf->device_lock);
>> >  unhash:
>> > + spin_lock_irq(conf->hash_locks + sh->hash_lock_index);
>> >                  remove_hash(sh);
>> > + spin_unlock_irq(conf->hash_locks + sh->hash_lock_index);
>> >                  if (head_sh->batch_head) {
>> >                          sh = list_first_entry(&sh->batch_list,
>> >                                                struct stripe_head, batch_list);
>> >                          if (sh != head_sh)
>> >                                          goto unhash;
>> >                  }
>> > - spin_unlock_irq(&conf->device_lock);
>> >                  sh = head_sh;
>> >
>> >                  if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))
>> >
>> > ??
>>
>> In my opinion, this patch looks correct, although it seems to me that there is another issue here.
>>
>> >                  if (head_sh->batch_head) {
>> >                          sh = list_first_entry(&sh->batch_list,
>> >                                                struct stripe_head, batch_list);
>> >                          if (sh != head_sh)
>> >                                          goto unhash;
>> >                  }
>>  
>> With the patch above, this code will be executed without taking any locks. Is that correct?
>> In my opinion, we need to take at least sh->stripe_lock, which protects sh->batch_head.
>> Or am I missing something?
>>
>> If you want, we can handle this issue separately.
>
> The batch_list doesn't need the protection. Only remove_hash() needs it.
Yes, that's my understanding too.  The key to understanding is that
comment you (helpfully!) put in clear_batch_ready():

        /*
         * BATCH_READY is cleared, no new stripes can be added.
         * batch_list can be accessed without lock
         */
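The invariant that comment describes can be modeled roughly in Python (hypothetical names, just a sketch of the idea, not kernel code): once STRIPE_BATCH_READY is cleared under the head's stripe_lock, no stripe can join the batch anymore, so the handler may walk batch_list without holding any lock.

```python
import threading

class Stripe:
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()   # models sh->stripe_lock
        self.batch_ready = True        # models the STRIPE_BATCH_READY bit
        self.batch_list = []           # stripes batched behind this head

def try_add_to_batch(head, sh):
    # writers only add while BATCH_READY is still set, under the lock
    with head.lock:
        if not head.batch_ready:
            return False
        head.batch_list.append(sh)
        return True

def clear_batch_ready(head):
    # after this point batch_list is frozen: no writer can modify it,
    # so readers no longer need the lock to traverse it
    with head.lock:
        head.batch_ready = False

head = Stripe("head")
assert try_add_to_batch(head, Stripe("sh1"))
clear_batch_ready(head)
assert not try_add_to_batch(head, Stripe("sh2"))
# lock-free traversal is now safe: membership can no longer change
names = [sh.name for sh in head.batch_list]
```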

I'll wrangle some patches...

Thanks,
NeilBrown


Re: [PATCH] md/raid5: fix locking in handle_stripe_clean_event()

Roman Gushchin
OK, thank you for the clarifications!

--
Roman


31.10.2015, 01:17, "Neil Brown" <[hidden email]>:

> On Sat, Oct 31 2015, Shaohua Li wrote:
>
>>  On Fri, Oct 30, 2015 at 05:02:47PM +0300, Roman Gushchin wrote:
>>>  > Isn't the 4.1 fix just:
>>>  >
>>>  > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>>>  > index e5befa356dbe..6e4350a78257 100644
>>>  > --- a/drivers/md/raid5.c
>>>  > +++ b/drivers/md/raid5.c
>>>  > @@ -3522,16 +3522,16 @@ returnbi:
>>>  >                   * no updated data, so remove it from hash list and the stripe
>>>  >                   * will be reinitialized
>>>  >                   */
>>>  > - spin_lock_irq(&conf->device_lock);
>>>  >  unhash:
>>>  > + spin_lock_irq(conf->hash_locks + sh->hash_lock_index);
>>>  >                  remove_hash(sh);
>>>  > + spin_unlock_irq(conf->hash_locks + sh->hash_lock_index);
>>>  >                  if (head_sh->batch_head) {
>>>  >                          sh = list_first_entry(&sh->batch_list,
>>>  >                                                struct stripe_head, batch_list);
>>>  >                          if (sh != head_sh)
>>>  >                                          goto unhash;
>>>  >                  }
>>>  > - spin_unlock_irq(&conf->device_lock);
>>>  >                  sh = head_sh;
>>>  >
>>>  >                  if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))
>>>  >
>>>  > ??
>>>
>>>  In my opinion, this patch looks correct, although it seems to me that there is another issue here.
>>>
>>>  >                  if (head_sh->batch_head) {
>>>  >                          sh = list_first_entry(&sh->batch_list,
>>>  >                                                struct stripe_head, batch_list);
>>>  >                          if (sh != head_sh)
>>>  >                                          goto unhash;
>>>  >                  }
>>>
>>>  With the patch above, this code will be executed without taking any locks. Is that correct?
>>>  In my opinion, we need to take at least sh->stripe_lock, which protects sh->batch_head.
>>>  Or am I missing something?
>>>
>>>  If you want, we can handle this issue separately.
>>
>>  The batch_list doesn't need the protection. Only remove_hash() needs it.
>
> Yes, that's my understanding too. The key to understanding is that
> comment you (helpfully!) put in clear_batch_ready():
>
>         /*
>          * BATCH_READY is cleared, no new stripes can be added.
>          * batch_list can be accessed without lock
>          */
>
> I'll wrangle some patches...
>
> Thanks,
> NeilBrown