[ceph] what's going on with d_rehash() in splice_dentry()?

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

[ceph] what's going on with d_rehash() in splice_dentry()?

Al Viro-3
You have, modulo printks and BUG_ON(),
{
        struct dentry *realdn;
        /* dn must be unhashed */
        if (!d_unhashed(dn))
                d_drop(dn);
        realdn = d_splice_alias(in, dn);
        if (IS_ERR(realdn)) {
                if (prehash)
                        *prehash = false; /* don't rehash on error */
                dn = realdn; /* note realdn contains the error */
                goto out;
        } else if (realdn) {
                dput(dn);
                dn = realdn;
        }
        if ((!prehash || *prehash) && d_unhashed(dn))
                d_rehash(dn);

When d_splice_alias() returns NULL it has hashed the dentry you'd given it;
when it returns a different dentry, that dentry is also returned hashed.
IOW, d_rehash(dn) in there should never be called.

If you have a case when it _is_ called, you've found a bug somewhere and
I'd like to see details.  AFAICS, the whole prehash thing appears to be
pointless - even the place where we modify *prehash, since in that case
we return ERR_PTR() and the only caller passing non-NULL prehash (&have_lease)
buggers off on such return value past all code that would look at have_lease
value.

One possible reading is that you want to prevent hashing in !have_lease
case of
                        dn = splice_dentry(dn, in, &have_lease);
If that's the case, you might have a problem, since it will be hashed no
matter what...

PS: the proof that d_splice_alias() always hashes is simple - if you exclude
the places where it returns ERR_PTR(), you are left with
        d_rehash(dentry);
        return NULL;
in the very end,
                                __d_move(new, dentry, false);
                                ...
                        return new;
and
                                int err = __d_unalias(inode, dentry, new);
                                ...
                                // err turned out to be zero
                        return new;
The first one is obvious - we return NULL after an explicit d_rehash() of
the argument.  __d_move() is guaranteed to return with its first argument
hashed due to
        __d_drop(dentry);
        __d_rehash(dentry, d_hash(target->d_parent, target->d_name.hash));
(dentry here refers to the first argument of __d_move() - it's our 'new').
And zero-returning __d_unalias() ends up calling __d_move(), with the
third argument of __d_unalias() ending up as the first one of __d_move().
So in both remaining cases we return a dentry that has just been hashed.
Reply | Threaded
Open this post in threaded view
|

Re: [ceph] what's going on with d_rehash() in splice_dentry()?

Sage Weil-4
Hi Al,

On Fri, 26 Feb 2016, Al Viro wrote:

> You have, modulo printks and BUG_ON(),
> {
>         struct dentry *realdn;
>         /* dn must be unhashed */
>         if (!d_unhashed(dn))
>                 d_drop(dn);
>         realdn = d_splice_alias(in, dn);
>         if (IS_ERR(realdn)) {
>                 if (prehash)
>                         *prehash = false; /* don't rehash on error */
>                 dn = realdn; /* note realdn contains the error */
>                 goto out;
>         } else if (realdn) {
>                 dput(dn);
>                 dn = realdn;
>         }
>         if ((!prehash || *prehash) && d_unhashed(dn))
>                 d_rehash(dn);
>
> When d_splice_alias() returns NULL it has hashed the dentry you'd given it;
> when it returns a different dentry, that dentry is also returned hashed.
> IOW, d_rehash(dn) in there should never be called.
>
> If you have a case when it _is_ called, you've found a bug somewhere and
> I'd like to see details.  AFAICS, the whole prehash thing appears to be
> pointless - even the place where we modify *prehash, since in that case
> we return ERR_PTR() and the only caller passing non-NULL prehash (&have_lease)
> buggers off on such return value past all code that would look at have_lease
> value.

Right.
 
> One possible reading is that you want to prevent hashing in !have_lease
> case of
>                         dn = splice_dentry(dn, in, &have_lease);
> If that's the case, you might have a problem, since it will be hashed no
> matter what...

In this case it doesn't actually matter if it is hashed or not, since
we will look at the lease state on the dentry before trusting it...

This code dates back to when Ceph was originally upstreamed, so the
history is murky, but I expect at that point I wanted to avoid hashing in
the no-lease case.  But I don't think it matters.  We should just remove
the prehash argument from splice_dentry entirely.

Zheng, does that sound right?

Thanks!
sage
Reply | Threaded
Open this post in threaded view
|

Re: [ceph] what's going on with d_rehash() in splice_dentry()?

Yan, Zheng-3

> On Mar 1, 2016, at 22:50, Sage Weil <[hidden email]> wrote:
>
> Hi Al,
>
> On Fri, 26 Feb 2016, Al Viro wrote:
>> You have, modulo printks and BUG_ON(),
>> {
>>        struct dentry *realdn;
>>        /* dn must be unhashed */
>>        if (!d_unhashed(dn))
>>                d_drop(dn);
>>        realdn = d_splice_alias(in, dn);
>>        if (IS_ERR(realdn)) {
>>                if (prehash)
>>                        *prehash = false; /* don't rehash on error */
>>                dn = realdn; /* note realdn contains the error */
>>                goto out;
>>        } else if (realdn) {
>>                dput(dn);
>>                dn = realdn;
>>        }
>>        if ((!prehash || *prehash) && d_unhashed(dn))
>>                d_rehash(dn);
>>
>> When d_splice_alias() returns NULL it has hashed the dentry you'd given it;
>> when it returns a different dentry, that dentry is also returned hashed.
>> IOW, d_rehash(dn) in there should never be called.
>>
>> If you have a case when it _is_ called, you've found a bug somewhere and
>> I'd like to see details.  AFAICS, the whole prehash thing appears to be
>> pointless - even the place where we modify *prehash, since in that case
>> we return ERR_PTR() and the only caller passing non-NULL prehash (&have_lease)
>> buggers off on such return value past all code that would look at have_lease
>> value.
>
> Right.
>
>> One possible reading is that you want to prevent hashing in !have_lease
>> case of
>>                        dn = splice_dentry(dn, in, &have_lease);
>> If that's the case, you might have a problem, since it will be hashed no
>> matter what...
>
> In this case it doesn't actually matter if it is hashed or not, since
> we will look at the lease state on the dentry before trusting it...
>
> This code dates back to when Ceph was originally upstreamed, so the
> history is murky, but I expect at that point I wanted to avoid hashing in
> the no-lease case.  But I don't think it matters.  We should just remove
> the prehash argument from splice_dentry entirely.
>
> Zheng, does that sound right?

Yes. I think we can remove the d_rehash(dn) call and rehash parameter.

Regards
Yan, Zheng
Reply | Threaded
Open this post in threaded view
|

Re: [ceph] what's going on with d_rehash() in splice_dentry()?

Al Viro-3
On Wed, Mar 02, 2016 at 11:00:01AM +0800, Yan, Zheng wrote:

> > This code dates back to when Ceph was originally upstreamed, so the
> > history is murky, but I expect at that point I wanted to avoid hashing in
> > the no-lease case.  But I don't think it matters.  We should just remove
> > the prehash argument from splice_dentry entirely.
> >
> > Zheng, does that sound right?
>
> Yes. I think we can remove the d_rehash(dn) call and rehash parameter.

Another question in the same general area:
                /* null dentry? */
                if (!rinfo->head->is_target) {
                        dout("fill_trace null dentry\n");
                        if (d_really_is_positive(dn)) {
                                ceph_dir_clear_ordered(dir);
                                dout("d_delete %p\n", dn);
                                d_delete(dn);
                        } else {
                                dout("d_instantiate %p NULL\n", dn);
                                d_instantiate(dn, NULL);
                                if (have_lease && d_unhashed(dn))
                                        d_rehash(dn);
                                update_dentry_lease(dn, rinfo->dlease,
                                                    session,
                                                    req->r_request_started);
                        }
                        goto done;
                }
What's that d_instantiate() about?  We have just checked that it's
negative; what's the point of setting ->d_inode to NULL again?  Would it
be OK if we just do
                        } else {
                                if (have_lease && d_unhashed(dn))
                                        d_add(dn, NULL);
                                update_dentry_lease(dn, rinfo->dlease,
                                                    session,
                                                    req->r_request_started);
                        }
in there?  As an aside, tracking back to the originating fs method is
painful as hell ;-/  I _think_ that rehash can be hit during ->lookup()
returning a negative, but I wouldn't bet a dime on it not happening from
other methods...  AFAICS, the change should be OK regardless of what
it's been called from, but... _ouch_.  Is is documented anywhere public?
Reply | Threaded
Open this post in threaded view
|

Re: [ceph] what's going on with d_rehash() in splice_dentry()?

Sage Weil-4
On Mon, 7 Mar 2016, Al Viro wrote:

> On Wed, Mar 02, 2016 at 11:00:01AM +0800, Yan, Zheng wrote:
>
> > > This code dates back to when Ceph was originally upstreamed, so the
> > > history is murky, but I expect at that point I wanted to avoid hashing in
> > > the no-lease case.  But I don't think it matters.  We should just remove
> > > the prehash argument from splice_dentry entirely.
> > >
> > > Zheng, does that sound right?
> >
> > Yes. I think we can remove the d_rehash(dn) call and rehash parameter.
>
> Another question in the same general area:
>                 /* null dentry? */
>                 if (!rinfo->head->is_target) {
>                         dout("fill_trace null dentry\n");
>                         if (d_really_is_positive(dn)) {
>                                 ceph_dir_clear_ordered(dir);
>                                 dout("d_delete %p\n", dn);
>                                 d_delete(dn);
>                         } else {
>                                 dout("d_instantiate %p NULL\n", dn);
>                                 d_instantiate(dn, NULL);
>                                 if (have_lease && d_unhashed(dn))
>                                         d_rehash(dn);
>                                 update_dentry_lease(dn, rinfo->dlease,
>                                                     session,
>                                                     req->r_request_started);
>                         }
>                         goto done;
>                 }
> What's that d_instantiate() about?  We have just checked that it's
> negative; what's the point of setting ->d_inode to NULL again?  Would it
> be OK if we just do
> } else {
> if (have_lease && d_unhashed(dn))
> d_add(dn, NULL);
>                                 update_dentry_lease(dn, rinfo->dlease,
>                                                     session,
>                                                     req->r_request_started);
>                         }
> in there?

That looks okay, but changing d_rehash to d_add still means you're doing
te d_instantiate(dn, NULL) in the d_unhashed case; is there a reason you
changed that line?  Is the dentry_rcuwalk_invalidate in __d_instantiate is
important before rehashing?

> As an aside, tracking back to the originating fs method is
> painful as hell ;-/  I _think_ that rehash can be hit during ->lookup()
> returning a negative, but I wouldn't bet a dime on it not happening from
> other methods...  AFAICS, the change should be OK regardless of what
> it's been called from, but... _ouch_.  Is is documented anywhere public?

It is a pain to follow, yes. FWIW this whole block is predicated in
req->r_locked_dir being non-NULL (i.e., VFS holds dir->i_mutex), which is
only true for lookup, create operations (mkdir/mknod/symlink/etc.),
atomic_open, and the .get_name export op.  There's not much documentation
beyond a description of the meaning of fields (e.g. r_locked_dir) in
fs/ceph/mds_client.h ...

sage

Reply | Threaded
Open this post in threaded view
|

Re: [ceph] what's going on with d_rehash() in splice_dentry()?

Al Viro-3
On Mon, Mar 07, 2016 at 06:25:13AM -0500, Sage Weil wrote:

> That looks okay, but changing d_rehash to d_add still means you're doing
> te d_instantiate(dn, NULL) in the d_unhashed case; is there a reason you
> changed that line?  Is the dentry_rcuwalk_invalidate in __d_instantiate is
> important before rehashing?

d_add() gets uninlined in my queue, along with some locking massage to avoid
extra bouncing of ->d_lock we do for no good reason.  So that call is also
going away.

> > As an aside, tracking back to the originating fs method is
> > painful as hell ;-/  I _think_ that rehash can be hit during ->lookup()
> > returning a negative, but I wouldn't bet a dime on it not happening from
> > other methods...  AFAICS, the change should be OK regardless of what
> > it's been called from, but... _ouch_.  Is is documented anywhere public?
>
> It is a pain to follow, yes. FWIW this whole block is predicated in
> req->r_locked_dir being non-NULL (i.e., VFS holds dir->i_mutex), which is
> only true for lookup, create operations (mkdir/mknod/symlink/etc.),
> atomic_open, and the .get_name export op.  There's not much documentation
> beyond a description of the meaning of fields (e.g. r_locked_dir) in
> fs/ceph/mds_client.h ...

Yes.  Now consider the plans to make ->i_mutex an rwsem and weaken it for
->lookup() (i.e. held only shared when it's held for lookup alone).  IOW,
the aforementioned queue...  I would rather avoid d_rehash() in ->lookup()
instances; the tricky part is that we want to avoid parallel lookups on
the *same* name, so we need an "in parallel lookup" state for dentries and
mechanism allowing to wait for it to be finished.  d_add()/d_splice_alias()
on such a dentry would take it out of that state and I would rather
avoid dealing with it in d_rehash()...

I was going to post the series last week, got sidetracked on various fun
things I'd found (ecryptfs and lustre).  Hopefully will get all of that
sorted out and the whole series posted for review tomorrow or on Wednesday...