Re: Hotplug CPU and setaffinity?


Joel Schopp

>>The affinity of the process is reset to the default and it is migrated
>>to another cpu, for better or worse.  The kernel assumes the admin
>>knows what he/she is doing.
>
>
> Yeah, that's OK. Is there anything that would hotplug a CPU
> automatically, say on receiving some MCEs, and thus not
> give the admin a look in?

On ppc64 we have CPU guard, which would remove a processor if it is
failing.  Of course, the implications of not removing such a CPU are
pretty terrible.

>
>
>>>In particular I was thinking of the cases where a thread has a
>>> functional reason for remaining on one particular CPU (e.g. if you
>>>had calibrated for some feature of that CPU say its time stamp
>>>counter skew/speed). Another case would be a set of threads which
>>>had set their affinity to the same CPU and then made memory
>>>consistency or locking assumptions that wouldn't be valid
>>>if they got rescheduled onto different CPUs.

This sounds like a theoretical problem.  Can you think of any real
examples?  The only cases I can think of cause performance hits, but not
functional problems.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Dr. David Alan Gilbert
Joel Schopp wrote:

> On ppc64 we have CPU guard, which would remove a processor if it is
> failing.  Of course, the implications of not removing such a CPU are
> pretty terrible.

Indeed.

>>>> In particular I was thinking of the cases where a thread has a
>>>> functional reason for remaining on one particular CPU (e.g. if you
>>>> had calibrated for some feature of that CPU say its time stamp
>>>> counter skew/speed). Another case would be a set of threads which
>>>> had set their affinity to the same CPU and then made memory
>>>> consistency or locking assumptions that wouldn't be valid
>>>> if they got rescheduled onto different CPUs.
>
>
> This sounds like a theoretical problem.  Can you think of any real
> examples?  The only cases I can think of cause performance hits, but not
> functional problems.

Well, I'm not aware of anything that would currently break with it, but I
was idly wondering whether it would be possible to read cycle
counters (as a faster gettimeofday) even on systems with
unsynchronised counters, provided you could lock the thread doing the
read to a particular physical CPU.
The current behaviour makes that a bad idea, though; in this case it
would be nicer if the kernel just killed my process, unscheduled it, or
sent it a signal.
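
As a rough userspace sketch of the idea (not anything the kernel does for
you): pin the thread with sched_setaffinity(), then re-check the affinity
mask around every counter read, so that a mask reset after a CPU is
removed gets noticed instead of silently mixing counters from two CPUs.
The helper names are made up for illustration, and
read_counter_placeholder() uses clock_gettime() as a portable stand-in
for the real per-CPU cycle counter read.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Return the lowest-numbered CPU this thread may run on, or -1 on error. */
int first_allowed_cpu(void)
{
    cpu_set_t set;

    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Restrict the calling thread to a single CPU. Returns 0 or -1. */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/*
 * Return 1 if the affinity mask is still exactly {cpu}, 0 if it has
 * changed (e.g. reset after the CPU went away), -1 on error.
 */
int still_pinned_to(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set) == 1 && CPU_ISSET(cpu, &set);
}

/* Placeholder (assumption): stands in for a raw cycle counter read. */
static uint64_t read_counter_placeholder(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Read the counter only if the thread is pinned to 'cpu' both before
 * and after the read. Returns 0 and stores the value on success, -1 if
 * the pinning was lost and the caller must fall back or recalibrate.
 */
int read_counter_checked(int cpu, uint64_t *out)
{
    uint64_t value;

    assert(out != NULL);
    if (still_pinned_to(cpu) != 1)
        return -1;
    value = read_counter_placeholder();
    if (still_pinned_to(cpu) != 1)
        return -1;
    *out = value;
    return 0;
}
```

A caller would do something like pin_to_cpu(first_allowed_cpu()) once,
then call read_counter_checked() and fall back to gettimeofday() whenever
it returns -1. The two sched_getaffinity() calls are system calls, so
this gives up most of the speed that made the raw counter attractive, and
the before/after check only narrows the window rather than closing it.
That is really the point above: without a signal or similar from the
kernel, userspace has to poll.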

Dave