Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

Takashi Ikebe
Hello,
Andi Kleen wrote:

>Takashi Ikebe <[hidden email]> writes:
>
>>The patch was over 50k, so I split it by architecture and am sending it inline.
>>
>>This patch adds a function called "Live patching", which is defined in
>>OSDL's Carrier Grade Linux requirements definition, to the Linux 2.6.11.7
>>kernel. Live patching allows a process to be patched on-line (without
>>restarting the process) on the i386 and x86_64 architectures, by overwriting
>>the entry point of each function you want to fix with a jump instruction
>>to the patched function.
>
>How exactly is this different from ptrace?
>Seems just like a ptrace memcpy extension.
>Is the patching really that time critical that you can't do it
>with normal ptrace?

With only a few patch modules, time is not so critical; however, sometimes
a large number of patches are applied at one time. In that case, time becomes
very critical with normal ptrace. As you know, normal ptrace needs to stop
the target process whenever it changes memory or registers.
Our approach is "avoid stopping the target process's execution as far as
possible", because on an SMP machine the target process can keep providing
service while it is being patched (we do not want to stop service because of
a patch).
If we load hundreds of patch modules at one time, I think it becomes quite
time critical.

>>+ if(((current->uid != tsk->euid) ||
>>+    (current->uid != tsk->suid) ||
>>+    (current->uid != tsk->uid) ||
>>+    (current->gid != tsk->egid) ||
>>+    (current->gid != tsk->sgid) ||
>>+    (current->gid != tsk->gid)) && !capable(CAP_SYS_PANNUS)) {
>>+                // invalid user in sys_accesspvm
>>+                return -EPERM;
>>+        }
>>+ p = vmalloc(len);
>
>This needs a limit.

Thank you, we'll fix this soon.
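
One way to add the limit Andi asks for is to reject oversized requests before allocating anything. The sketch below is only a userspace analogue: `PATCH_MAX_LEN` and `alloc_patch_buf` are hypothetical names, `malloc` stands in for `vmalloc`, and the 1 MiB cap is an arbitrary assumption rather than a value taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical upper bound for one patch buffer; the real value is a policy choice. */
#define PATCH_MAX_LEN ((size_t)1 << 20)

/*
 * Validate a caller-supplied length before allocating.
 * Returns 0 and stores the buffer in *out on success,
 * or a negative errno value on failure (kernel convention).
 */
int alloc_patch_buf(size_t len, void **out)
{
    assert(out != NULL);

    if (len == 0 || len > PATCH_MAX_LEN)
        return -EINVAL;         /* refuse empty and oversized requests */

    *out = malloc(len);         /* stands in for vmalloc(len) */
    if (*out == NULL)
        return -ENOMEM;
    return 0;
}
```

The point of the check is that `len` comes from the caller, so without a cap any user who passes the permission test could ask the kernel for an arbitrarily large allocation.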

>>--- linux-2.6.11.7-vanilla/arch/x86_64/kernel/entry.S 2005-04-08 03:57:30.000000000 +0900
>>+++ linux-2.6.11.7-pannus-x86_64/arch/x86_64/kernel/entry.S 2005-04-18 10:45:47.000000000 +0900
>>@@ -214,6 +214,8 @@ sysret_check:
>> /* Handle reschedules */
>> /* edx: work, edi: workmask */
>> sysret_careful:
>>+ cmpl $0,threadinfo_inipending(%rcx)
>>+ jne sysret_init
>>    
>>
>
>Put the check into the normal notify_resume work mask, not adding
>a separate check into this critical fast path.
>  
>
OK, we'll fix this soon.

>> CFI_ENDPROC
>>
>> /*
>>+ * In the case restorer calls rt_handlereturn, collect and store registers,
>>+ * and call rt_handlereturn with stored register struct.
>>+ */
>>+ENTRY(stub_rt_handlereturn)
>
>This seems quite pointless since ptrace can change all registers
>in a child.

Well, ptrace can change them as you said, but I think that would increase
the time the target process is stopped.
To control the initialization of the target process (that is, of the patch
module), the command process would have to learn the target process's status
and then stop it with ptrace.
Currently rt_handlereturn runs in the target process's context, like a signal
handler return, so I think the time lost in the target process is minimal.
If the command process controlled the target process's initialization instead,
the target process would be stopped for longer.
Maybe our idea is wrong; if so, please tell us.

Thank you for your advice!

>Didn't review more.
>
>-Andi


--
Takashi Ikebe
NTT Network Service Systems Laboratories
9-11, Midori-Cho 3-Chome Musashino-Shi,
Tokyo 180-8585 Japan
Tel : +81 422 59 4246, Fax : +81 422 60 4012
e-mail : [hidden email]


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

Kyle Moffett
If you want that exact functionality, do this:

At program start, spawn a new thread:
        1) Open a UNIX socket (/var/run/someapp_live_patch.sock)
        2) poll() that socket for a connection.
        3) When you get a connection, do your own security checks
        4) If it's ok, then map the specified file into memory
        5) Read a table of crap to patch from the file
        6) Do the patching, being careful to avoid the millions of
           races involved for each CPU, *especially* regarding the
           separate icache and dcache on CPUs like PPC and such.
        7) Go back to step 2
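
Step 6 comes down to overwriting the first bytes of the old function with a jump to its replacement, the same technique the kernel patch uses. The following is a minimal sketch of the x86/x86_64 near-jump encoding only; `encode_jmp_rel32` is a hypothetical helper that fills a byte buffer. Writing those bytes into live code also means making the page writable and handling the races and cache issues mentioned in the step, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_REL32_LEN 5   /* opcode 0xE9 followed by a 32-bit displacement */

/*
 * Build the 5-byte "jmp rel32" that would sit at address `from`
 * and transfer control to address `to`.
 * Returns 0 on success, -1 if `to` is outside the rel32 range.
 */
int encode_jmp_rel32(uint8_t out[JMP_REL32_LEN], uint64_t from, uint64_t to)
{
    /* The displacement is measured from the end of the jump instruction. */
    int64_t disp = (int64_t)(to - (from + JMP_REL32_LEN));
    uint32_t rel;

    assert(out != NULL);
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;              /* target more than +/-2 GiB away */

    rel = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;
    out[1] = (uint8_t)(rel & 0xFF);          /* little-endian displacement */
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
    return 0;
}
```

Because the instruction is five bytes long, it generally cannot be stored with a single aligned write, which is one source of the races the step warns about.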

If you want equivalent functionality but much safer and not CPU
dependent and full of hand-coded assembly:

1) open(), mmap(), and mlock() the file (/var/lib/someapp/data)
2) Spawn normal operation threads
3) Spawn a new hot-patch thread:
        1) Open a UNIX socket (/var/run/someapp_live_patch.sock)
        2) poll() that socket for a connection.
        3) When you get one, coordinate with the new process as it
           attaches itself to /var/lib/someapp/data
        4) Handle shared locking of parts of /var/lib/someapp/data
        5) Send it your listen() file-descriptors over the socket.
        6) Wait for the other process to signal it's ready.
        7) Stop accepting new connections on the socket.
        8) Send file-descriptors for current connections
        9) Cleanup and quit

When live-patching:
1)  connect to the socket /var/run/someapp_live_patch.sock
2)  open(), mmap() and mlock() /var/lib/someapp/data
3)  Coordinate with the other process via the socket
4)  Receive the listen() file-descriptors over the socket.
5)  Set up the shared data locking
6)  Spawn normal operation threads
7)  Signal readiness
8)  Receive file-descriptors for current connections
9)  Spawn threads for them too.
10) Spawn a new hot-patch thread as above
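
The "send/receive file-descriptors over the socket" steps rely on `SCM_RIGHTS` ancillary messages on a UNIX-domain socket. A minimal sketch of that mechanism, assuming a connected socket between the old and new process; `send_fd` and `recv_fd` are hypothetical helper names, and the locking and coordination steps are left out:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized for exactly one descriptor, aligned for cmsghdr. */
union fd_ctl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send one open file descriptor over a connected UNIX-domain socket. */
int send_fd(int sock, int fd)
{
    char byte = 'F';    /* one data byte, so this also works on stream sockets */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof fd);

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor; returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;              /* no descriptor attached */

    memcpy(&fd, CMSG_DATA(cmsg), sizeof fd);
    return fd;
}
```

The received descriptor refers to the same open file description as the sender's, so a listening socket handed over this way keeps its pending connections while the old process shuts down.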

Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r  
!y?(-)
------END GEEK CODE BLOCK------



Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

Takashi Ikebe
Kyle, thank you so much for your detailed information.
If you are designing completely new software, your suggestion is very useful!

Unfortunately, we carriers have a great deal of existing software that we are
trying to run on Linux.
We need to find a way that can also be applied to existing software...

Kyle Moffett wrote:

> If you want that exact functionality, do this:
>
> At program start, spawn a new thread:
> 1) Open a UNIX socket (/var/run/someapp_live_patch.sock)
> 2) poll() that socket for a connection.
> 3) When you get a connection, do your own security checks
> 4) If it's ok, then map the specified file into memory
> 5) Read a table of crap to patch from the file
> 6) Do the patching, being careful to avoid the millions of
> races involved for each CPU, *especially* regarding the
> separate icache and dcache on CPUs like PPC and such.
> 7) Go back to step 2
>
> If you want equivalent functionality but much safer and not CPU
> dependent and full of hand-coded assembly:
>
> 1) open(), mmap(), and mlock() the file (/var/lib/someapp/data)
> 2) Spawn normal operation threads
> 3) Spawn a new hot-patch thread:
> 1) Open a UNIX socket (/var/run/someapp_live_patch.sock)
> 2) poll() that socket for a connection.
> 3) When you get one, coordinate with the new process as it
> attaches itself to /var/lib/someapp/data
> 4) Handle shared locking of parts of /var/lib/someapp/data
> 5) Send it your listen() file-descriptors over the socket.
> 6) Wait for the other process to signal it's ready.
> 7) Stop accepting new connections on the socket.
> 8) Send file-descriptors for current connections
> 9) Cleanup and quit
>
> When live-patching:
> 1) connect to the socket /var/run/someapp_live_patch.sock
> 2) open(), mmap() and mlock() /var/lib/someapp/data
> 3) Coordinate with the other process via the socket
> 4) Receive the listen() file-descriptors over the socket.
> 5) Set up the shared data locking
> 6) Spawn normal operation threads
> 7) Signal readiness
> 8) Receive file-descriptors for current connections
> 9) Spawn threads for them too.
> 10) Spawn a new hot-patch thread as above
>
> Cheers,
> Kyle Moffett






Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

Kyle Moffett

On Apr 25, 2005, at 06:39, Takashi Ikebe wrote:

> Kyle Moffett wrote:
>
>> If you want that exact functionality, do this:
>>
>> At program start, spawn a new thread:
>> 1) Open a UNIX socket (/var/run/someapp_live_patch.sock)
>> 2) poll() that socket for a connection.
>> 3) When you get a connection, do your own security checks
>> 4) If it's ok, then map the specified file into memory
>> 5) Read a table of crap to patch from the file
>> 6) Do the patching, being careful to avoid the millions of
>> races involved for each CPU, *especially* regarding the
>> separate icache and dcache on CPUs like PPC and such.
>> 7) Go back to step 2
> Kyle, thank you so much for your detailed information.
> If you are designing completely new software, your suggestion is very useful!
>
> Unfortunately, we carriers have a great deal of existing software that
> we are trying to run on Linux.
> We need to find a way that can also be applied to existing software...

If you notice, the above method has only minimal changes from
your mmap3 stuff, except without needing kernel support. One
thing to remember, though: since there _is_ a very clean method
to do this from userspace, you are not likely to get much
sympathy on this list.

I suggest you try adding a new hotpatch thread to your code,
as above, then use it to implement the mmap3 and other tasks
necessary for live patching instead of in kernel space.

Cheers,
Kyle Moffett




Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

Pavel Machek
In reply to this post by Takashi Ikebe
Hi!

> Kyle, thank you so much for your detailed information.
> If you are designing completely new software, your suggestion is very useful!
>
> Unfortunately, we carriers have a great deal of existing software that
> we are trying to run on Linux.
> We need to find a way that can also be applied to existing software...

"We want to do the wrong thing because we think it's easier".

Okay, you are free to do that, but don't try to push that into
mainline kernel. Maintain your own patches; if that seems too hard, do
the right thing.
                                                                        Pavel

--
Boycott Kodak -- for their patent abuse against Java.

Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

Andi Kleen-2
In reply to this post by Takashi Ikebe
On Mon, Apr 25, 2005 at 07:39:51PM +0900, Takashi Ikebe wrote:
> Kyle, thank you so much for your detailed information.
> If you are designing completely new software, your suggestion is very useful!
>
> Unfortunately, we carriers have a great deal of existing software that
> we are trying to run on Linux.
> We need to find a way that can also be applied to existing software...

ptrace can do all this, even with an existing kernel.
Your full patch is just a funky ptrace equivalent as far as I can see.


-Andi

Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

Valdis.Kletnieks
In reply to this post by Takashi Ikebe
On Mon, 25 Apr 2005 19:39:51 +0900, Takashi Ikebe said:

> Unfortunately, we carriers have a great deal of existing software that
> we are trying to run on Linux.
> We need to find a way that can also be applied to existing software...

You *really* want to take the time to re-write the software to do things
The Linux Way.  If you're looking at doing on-the-fly patching, you're
probably also carrying around a lot of *other* ugly cruft to make this
creeping horror work on Linux.  In fact, I'd not be surprised if you have
a shim layer to make the compatibility layer for the *previous* system
work on Linux...

I'm reminded of a (possibly apocryphal) quote from an ATT spokesperson from
1988 or so, when a misplaced comma in a patch kept crashing the long-distance
phone network. When asked "Why don't you just reboot the affected switches?"
his response was "This assumes that the switch had ever been booted in the
first place". (Apparently, the *whole thing* had been on-the-fly replaced/patched
without an actual reload happening...)

Gaaahhh! :)



Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

Takashi Ikebe
I think that's common sense at every carrier.
If we reboot the switch, the service will be disrupted.
The phone network is a lifeline, and it must not be disrupted by a mere
bug fix.
I think the same kind of function is needed in many real
enterprise/mission-critical/business areas.

Doing everything with ptrace may affect the target process's time-critical
tasks (the target process must be stopped for every fix).
Implementing everything in the user application costs too much: we would
need to implement it in every application... (and I do not yet know whether
this approach really works for time-critical applications.)
There is clear demand to provide this as a common, GPL-ed function....

[hidden email] wrote:

> On Mon, 25 Apr 2005 19:39:51 +0900, Takashi Ikebe said:
>
>
>>Unfortunately, we carriers have a great deal of existing software that
>>we are trying to run on Linux.
>>We need to find a way that can also be applied to existing software...
>
>
> You *really* want to take the time to re-write the software to do things
> The Linux Way.  If you're looking at doing on-the-fly patching, you're
> probably also carrying around a lot of *other* ugly cruft to make this
> creeping horror work on Linux.  In fact, I'd not be surprised if you have
> a shim layer to make the compatibility layer for the *previous* system
> work on Linux...
>
> I'm reminded of a (possibly apocryphal) quote from an ATT spokesperson from
> 1988 or so, when a misplaced comma in a patch kept crashing the long-distance
> phone network. When asked "Why don't you just reboot the affected switches?"
> his response was "This assumes that the switch had ever been booted in the
> first place". (Apparently, the *whole thing* had been on-the-fly replaced/patched
> without an actual reload happening...)
>
> Gaaahhh! :)
>


Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

Kyle Moffett
On Apr 25, 2005, at 21:34, Takashi Ikebe wrote:
> [hidden email] wrote:
>> When asked "Why don't you just reboot the affected switches?" his
>> response was "This assumes that the switch had ever been booted in
>> the first place". (Apparently, the *whole thing* had been
>> on-the-fly replaced/patched without an actual reload happening...)
>> Gaaahhh! :)
>
> I think that's common sense at every carrier.

That is definitely not common sense.  It may be good business
practice, but those are two *entirely* different things.

> If we reboot the switch, the service will be disrupted.

Yes.  My personal favorite solution to this problem is HeartBeat,
some Open-Source software that is very good at maintaining high
availability.  With a properly written multi-system clustering
switch application that utilizes the Linux Virtual-Server tools,
you could reasonably efficiently run a system such that you can
reboot any individual system without any loss of service.

> The phone network is a lifeline, and it must not be disrupted by
> a mere bug fix.  I think the same kind of function is needed in many
> real enterprise/mission-critical/business areas.

But you miss the point.  Linux is *NOT* about "business", or
"enterprise", or "mission-critical".  Linux is (at least to
many hackers) about hacking, having fun, and Good Design(TM).

> Doing everything with ptrace may affect the target process's
> time-critical tasks (the target process must be stopped for every fix).

So don't do it with ptrace!!! I've given you one other method
that uses minimal changes to existing software and emulates the
crappy mmap3 call you keep trying to push.

> Implementing everything in the user application costs too much:

What about one of the dozen other offered methods?

> we would need to implement it in every application...

So why not write a utility library?  You'd need to "implement
all in the kernel", too, and since it can be done better in
userspace, let's keep out the bloat while we're at it.

> (and I do not yet know whether this approach really works for
> time-critical applications.)

So test it! You're clearly working for a big corporation with
the money and resources to develop something like this, so do
so, and if you get something that works well, *and* uses good
design, we'll welcome patches!

> There is clear demand to provide this as a common, GPL-ed
> function....

The kernel is not about business, demand, or what the CEO of
some big-name company wants.  The kernel strives for the goal
of "Good Engineering (TM)".


Cheers,
Kyle Moffett




Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

Pavel Machek
In reply to this post by Takashi Ikebe
On Tue 26-04-05 10:34:56, Takashi Ikebe wrote:

> I think that's common sense at every carrier.
> If we reboot the switch, the service will be disrupted.
> The phone network is a lifeline, and it must not be disrupted by a mere
> bug fix.
> I think the same kind of function is needed in many real
> enterprise/mission-critical/business areas.
>
> Doing everything with ptrace may affect the target process's time-critical
> tasks (the target process must be stopped for every fix).
> Implementing everything in the user application costs too much: we would
> need to implement it in every application... (and I do not yet know whether
> this approach really works for time-critical applications.)
> There is clear demand to provide this as a common, GPL-ed function....
           ~~~~~~~~~~~~
I had a very strong urge to reply with "<plonk>" here.

Clearly no one but you wants to make the kernel uglier just for "faster
ptrace". If you want faster ptrace, fine, advertise it as such and
provide a nice and small patch to make it faster.

If you are going to handwave about "clear demand", well, find some
other list to troll on.
                                                                Pavel
--
Boycott Kodak -- for their patent abuse against Java.
-

Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

Andi Kleen-2
In reply to this post by Takashi Ikebe
On Tue, Apr 26, 2005 at 10:34:56AM +0900, Takashi Ikebe wrote:
> I think that's common sense at every carrier.
> If we reboot the switch, the service will be disrupted.
> The phone network is a lifeline, and it must not be disrupted by a mere
> bug fix.
> I think the same kind of function is needed in many real
> enterprise/mission-critical/business areas.
>
> Doing everything with ptrace may affect the target process's time-critical
> tasks (the target process must be stopped for every fix).

Sorry, but what are your exact time requirements for this?

Remember any x86-64 CPU is really fast and it can do a _lot_ of ptrace
operations in a very short time.

Just a vague "it may be too slow" is not enough justification to
push a lot of redundant code into the kernel. Also if ptrace
should be really too slow (which I doubt, but you are welcome
to show some numbers together with real time requirements from
a real system) then we could optimize ptrace for this, e.g.
by adding a ptrace subcommand to copy whole memory blocks
more efficiently or maybe even do a mmap like thing.

But unless someone actually demonstrates this is needed it seems far overkill.
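
To make the cost under discussion concrete: with stock ptrace, a tracer writes into a stopped tracee one machine word per call. A minimal sketch, assuming the tracee is already attached and stopped (for example via `PTRACE_ATTACH` followed by `waitpid`); `merge_word` and `poke_bytes` are hypothetical helper names, not part of any existing API:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/*
 * Overlay `n` bytes from `src` onto `word`, starting at byte offset `off`,
 * leaving the remaining bytes of the word untouched.
 */
long merge_word(long word, size_t off, const unsigned char *src, size_t n)
{
    assert(off + n <= sizeof word);
    memcpy((unsigned char *)&word + off, src, n);
    return word;
}

/*
 * Write `len` bytes into an attached, stopped tracee, one machine word
 * per PTRACE_POKETEXT call. Returns 0 on success, -1 on error.
 */
int poke_bytes(pid_t pid, unsigned long addr,
               const unsigned char *src, size_t len)
{
    while (len > 0) {
        size_t n = len < sizeof(long) ? len : sizeof(long);
        long word = 0;

        if (n < sizeof(long)) {
            /* Partial word: read it first so the trailing bytes survive. */
            errno = 0;
            word = ptrace(PTRACE_PEEKTEXT, pid, (void *)addr, NULL);
            if (word == -1 && errno != 0)
                return -1;
        }
        word = merge_word(word, 0, src, n);
        if (ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)word) == -1)
            return -1;

        addr += n;
        src += n;
        len -= n;
    }
    return 0;
}
```

A block-copy subcommand of the kind suggested above would replace this loop, and its one system call per word, with a single call.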

> Implementing everything in the user application costs too much: we would
> need to implement it in every application... (and I do not yet know whether
> this approach really works for time-critical applications.)

I think you have a lot of unproved and doubtful assumptions here.

-Andi