Major KVM issues with kernel 4.5 on the host

Re: transparent huge pages breaks KVM on AMD.

Borislav Petkov-3
Try this one better - it fixes an uninitialized var.

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

Attachment: 01-mm-thp-calculate_the_mapcount_correctly_for_thp_pages_during_wp_faults.patch (12K)

Re: transparent huge pages breaks KVM on AMD.

Dr. David Alan Gilbert-2
In reply to this post by Marc Haber-22
* Marc Haber ([hidden email]) wrote:

> Hi David,
>
> On Sat, Apr 23, 2016 at 07:52:46PM +0100, Dr. David Alan Gilbert wrote:
> > Hmm, your problem does sound like bad hardware, but....
> > If you've got a nice reliable crash, can you try turning transparent huge pages
> > off on the host;
> >    echo never > /sys/kernel/mm/transparent_hugepage/enabled
>
> I must have missed this hint in the middle of the "your hardware is
> bad" avalanche that came over me.
>
> I spent two weeks bisecting "good" kernels since during the repeated
> reconfigurations, transparent huge pages got turned off in kernel
> configuration. After running each kernel for 24 hours, I eventually
> ended up with a working 4.5 kernel. The configuration diff was short,
> showing transparent huge pages, and - finally - upon re-reading the
> thread I found your hint.

OK, good.  When I sent that mail I'd hit a THP bug, but in a
corner of migration. At the time we didn't know why, and there was
no reason to think it would cause any other symptoms, but since it was
also between 4.4 and 4.5 it did seem worth mentioning as a long shot,
but it was no more than a long shot.

> I have now the result that 4.5, 4.5.1 and 4.5.4 corrupt KVM guest
> memory reliably in the first hour of running under disk load, causing
> the VM to either drop dead in the water, or to read randomness from
> disk. Rebooting fixes the VM. This happens as soon as transparent huge
> pages are turned on in the host.
>
> Turning off transparent huge pages by echo never >
> /sys/kernel/mm/transparent_hugepage/enabled fixes the issue even
> without rebooting the host. Start up the VM again and it works just
> fine.
>
> Is this an issue in (a) transparent huge pages, (b) KVM or (c) qemu?
> Where should this issue be forwarded? Or do we just accept it and turn
> transparent huge pages off?

Try Andrea's fix for (a).

Dave

>
> Greetings
> Marc
>
> --
> -----------------------------------------------------------------------------
> Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
> Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
> Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421
--
 -----Open up your eyes, open up your mind, open up your code -------  
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/
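
[Editor's note] The runtime control discussed above lives in /sys/kernel/mm/transparent_hugepage/enabled, where the kernel marks the active mode with square brackets (for example "always madvise [never]"). A minimal sketch for reading the active mode before and after a test run follows. The function name thp_active_mode is made up for illustration and is not something from the thread.

```shell
#!/bin/sh
# Hypothetical helper: print the active THP mode from a line such as
# "always madvise [never]". The active value is the one in brackets.
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

THP_FILE=/sys/kernel/mm/transparent_hugepage/enabled

# Only read the file if this kernel exposes it.
if [ -r "$THP_FILE" ]; then
    echo "active THP mode: $(thp_active_mode "$(cat "$THP_FILE")")"
fi

# To switch THP off at runtime (as root), as suggested in the thread:
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

Recording the mode alongside each test run makes it easier to tell afterwards which crashes happened with THP set to always, madvise or never.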

Re: transparent huge pages breaks KVM on AMD.

Marc Haber-22
In reply to this post by Borislav Petkov-3
On Fri, May 13, 2016 at 10:07:45AM +0200, Borislav Petkov wrote:
> On Fri, May 13, 2016 at 07:23:34AM +0200, Marc Haber wrote:
> > How do I apply this?
>
> I'm attaching it.

Ok, stupid me, I thought that one could simply curl the web page. Too
bad that list archives keep mangling patches :-(

It applies now to 4.5 as well.

Greetings
Marc

--
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

Re: transparent huge pages breaks KVM on AMD.

Borislav Petkov-3
On Fri, May 13, 2016 at 11:08:46AM +0200, Marc Haber wrote:
> It applies now to 4.5 as well.

Yeah, I tried getting the raw message from marc.info but then it said:

patch unexpectedly ends in middle of line
Hunk #1 succeeded at 922 with fuzz 1.

The attached versions I sent you are from my lkml mbox - the only reason
I keep it :-)

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
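
[Editor's note] GNU patch typically prints "patch unexpectedly ends in middle of line" when the last line of the patch file has no trailing newline, which is one plausible way for a web archive to mangle a patch. The thread does not confirm that this was the cause here. The sketch below checks for that condition and then does a dry run without touching the tree. The function names are made up for illustration, and -p1 is an assumption based on the usual layout of kernel patches.

```shell
#!/bin/sh
# Return success if the file is non-empty and its last byte is a newline.
# Command substitution strips a trailing newline, so an empty result
# from `tail -c 1` means the last byte was a newline.
patch_ends_with_newline() {
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# Check whether a patch would apply to a source tree without changing it.
# Usage: dry_run_patch <tree-dir> <patch-file>
# Assumes GNU patch and paths relative to the tree root (-p1).
dry_run_patch() {
    patch -p1 --dry-run -d "$1" < "$2"
}
```

With a hypothetical tree in linux-4.5 and a saved patch fix.patch, this would be used as: patch_ends_with_newline fix.patch && dry_run_patch linux-4.5 fix.patch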

Re: transparent huge pages breaks KVM on AMD.

Marc Haber-22
In reply to this post by Borislav Petkov-3
On Fri, May 13, 2016 at 10:09:52AM +0200, Borislav Petkov wrote:
> Try this one better - it fixes an uninitialized var.

Instead, or in addition?

Greetings
Marc

--
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

Re: transparent huge pages breaks KVM on AMD.

Marc Haber-22
In reply to this post by Dr. David Alan Gilbert-2
On Fri, May 13, 2016 at 09:35:45AM +0100, Dr. David Alan Gilbert wrote:
> also between 4.4 and 4.5 it did seem worth mentioning as a long shot,
> but it was no more than a long shot.

It was helpful, however. I bisected the kernel configuration instead
of trying the runtime control first. Had I seen your long shot two
weeks earlier, it would have saved me those two weeks of tedious
bisecting.

> Try Andrea's fix for (a).

In the works.

Greetings
Marc

--
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

Re: transparent huge pages breaks KVM on AMD.

Marc Haber-22
In reply to this post by Borislav Petkov-3
On Fri, May 13, 2016 at 10:07:45AM +0200, Borislav Petkov wrote:
> On Fri, May 13, 2016 at 07:23:34AM +0200, Marc Haber wrote:
> > How do I apply this?
>
> I'm attaching it.

Had the VM crashing twice with this patch applied, THP==madvise on the
host and THP==never in the VM.

Now trying the other patch, assuming that it's intended to be used
_instead_ of this one.

Greetings
Marc

--
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

Re: transparent huge pages breaks KVM on AMD.

Borislav Petkov-3
In reply to this post by Marc Haber-22
On Fri, May 13, 2016 at 03:21:56PM +0200, Marc Haber wrote:
> Instead, or in addition?

Instead. You'll notice that it doesn't apply if you try "in addition".

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

Re: transparent huge pages breaks KVM on AMD.

Marc Haber-22
In reply to this post by Borislav Petkov-3
On Fri, May 13, 2016 at 10:09:52AM +0200, Borislav Petkov wrote:
> Try this one better - it fixes an uninitialized var.

Nosireebob, VMs crash even with this patch in the host as soon as the
host has THP enabled.

Greetings
Marc

--
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421