kvm: [ID 177374 kern.warning] WARNING: kvm: emulating exchange as write#012 #13

Open
sirhelly opened this issue Jun 7, 2013 · 10 comments

Comments

@sirhelly

sirhelly commented Jun 7, 2013

Issues:
a) Kernel warnings are written at full speed to the system console and syslogd, so the log files fill up
b) dmesg hangs

SmartOS Version:
SunOS 00-25-90-77-43-ac 5.11 joyent_20130307T214308Z i86pc i386 i86pc

Background Info:
The system runs a variety of operating systems (build and test server):
52 VMs (KVM and zones), including:
FreeBSD8 32 Bit, FreeBSD9 32 & 64 Bit, OpenBSD 32 & 64 Bit, NetBSD 32 & 64 Bit,
ReactOS, Haiku, Ubuntu, Oracle Solaris, OpenIndiana, Open Solaris (last),
WinXP, Win7 64, Win Vista 64, Linux 32 & 64 Bit,
and some OS zones.

Maybe related: Avi Kivity in 2007:
http://www.mail-archive.com/kvm-devel@lists.sourceforge.net/msg07044.html

If I can help in any way (commands, further information, etc.), please do not hesitate to
contact me by mail: helmut dot hartl at firmos dot at.

Messages for reference:

2013-06-07T15:04:02.293908+00:00 00-25-90-77-43-ac kvm: [ID 177374 kern.warning] WARNING: kvm: emulating exchange as write#012
2013-06-07T15:04:02.297239+00:00 00-25-90-77-43-ac kvm: [ID 177374 kern.warning] WARNING: kvm: emulating exchange as write#012
2013-06-07T15:04:02.300573+00:00 00-25-90-77-43-ac kvm: [ID 177374 kern.warning] WARNING: kvm: emulating exchange as write#012
2013-06-07T15:04:02.303907+00:00 00-25-90-77-43-ac kvm: [ID 177374 kern.warning] WARNING: kvm: emulating exchange as write#012
2013-06-07T15:04:02.307243+00:00 00-25-90-77-43-ac kvm: [ID 177374 kern.warning] WARNING: kvm: emulating exchange as write#012
2013-06-07T15:04:02.310578+00:00 00-25-90-77-43-ac kvm: [ID 177374 kern.warning] WARNING: kvm: emulating exchange as write#012
2013-06-07T15:04:02.313911+00:00 00-25-90-77-43-ac kvm: [ID 177374 kern.warning] WARNING: kvm: emulating exchange as write#012

@ingenthr

I've just encountered this with 20160304T005100Z. System seems to hang pretty easily as well. Nothing in /var/crash.

@kfr-

kfr- commented Mar 29, 2016

Just ran into this with joyent_20151104T185720Z. System seems to hang. Nothing in /var/crash.

@rmustacc
Contributor

On 3/28/16 18:30, kfr- wrote:

> Just ran into this with joyent_20151104T185720Z. System seems to hang. Nothing in /var/crash.

Did the host hang or the VM?

@kfr-

kfr- commented Mar 29, 2016

Seems to hang the host.

I see
2016-03-28T19:28:41.000251+00:00 kam-srv1 kvm: [ID 987709 kern.info] unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
2016-03-28T19:28:41.000310+00:00 kam-srv1 kvm: [ID 987709 kern.info] unimplemented perfctr wrmsr: 0xc0010000 data 0x530076
2016-03-28T19:28:41.000889+00:00 kam-srv1 kvm: [ID 987709 kern.info] unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
2016-03-28T19:28:41.000937+00:00 kam-srv1 kvm: [ID 987709 kern.info] unimplemented perfctr wrmsr: 0xc0010000 data 0x530076
2016-03-28T19:28:41.001028+00:00 kam-srv1 kvm: [ID 987709 kern.info] unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
(the lines above repeat many times)

Then I see:
2016-03-28T19:28:41.854469+00:00 kam-srv1 kvm: [ID 177374 kern.warning] WARNING: kvm: emulating exchange as write#012
Then more:
2016-03-28T19:28:41.854998+00:00 kam-srv1 kvm: [ID 987709 kern.info] unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
2016-03-28T19:28:41.855015+00:00 kam-srv1 kvm: [ID 987709 kern.info] unimplemented perfctr wrmsr: 0xc0010000 data 0x530076
2016-03-28T19:28:41.855938+00:00 kam-srv1 kvm: [ID 987709 kern.info] unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
2016-03-28T19:28:41.855956+00:00 kam-srv1 kvm: [ID 987709 kern.info] unimplemented perfctr wrmsr: 0xc0010000 data 0x530076
2016-03-28T19:28:41.856927+00:00 kam-srv1 kvm: [ID 987709 kern.info] unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
(the lines above repeat many times)

Then before it hangs I see one last:
2016-03-28T19:43:36.678921+00:00 kam-srv1 kvm: [ID 987709 kern.info] unimplemented perfctr wrmsr: 0xc0010000 data 0x530076

I think I have a proper filter set up in /opt/custom/etc/rsyslog.d/kvm.conf:
:msg, contains, "unimplemented perfctr wrmsr" /var/adm/messages
& ~

@kfr-

kfr- commented Mar 29, 2016

If the filter is set up correctly, I should no longer see "unimplemented perfctr wrmsr" messages.
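
For reference, a minimal sketch of a filter that drops these messages outright. In legacy rsyslog syntax, a rule that names a file as its action writes matching messages to that file, and a following `& ~` only stops later rules from processing them, so the rule quoted earlier would still log the messages to /var/adm/messages. The sketch assumes legacy rsyslog syntax and that this file is included before the rule that writes to /var/adm/messages; neither assumption is confirmed in this thread.

```
# /opt/custom/etc/rsyslog.d/kvm.conf (path as quoted in this thread)
# Discard the noisy perfctr messages before any later rule can log them.
# Assumption: legacy rsyslog syntax, where "~" is the discard action.
:msg, contains, "unimplemented perfctr wrmsr" ~

# Assumption: on newer rsyslog versions, "stop" replaces the deprecated "~":
# :msg, contains, "unimplemented perfctr wrmsr" stop
```

This only hides the log noise. It would not be expected to change whether the host hangs.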

@rmustacc
Contributor

Well, if it's causing the host to hang, then can you generate an NMI
when that happens, and verify before that that you can't ping or use
the console?

@kfr-

kfr- commented Mar 29, 2016

I know I can't ping the console and I can't ssh into the host.

@kfr-

kfr- commented Mar 29, 2016

Going to generate an NMI as per https://wiki.smartos.org/pages/viewpage.action?pageId=754743 the next time it happens. I will report back.

@ingenthr

The host was hung in my case as well. I moved to lx brand VMs for now, but may need those KVMs again at some point. I can try to reproduce the hang if it'd be helpful. My host doesn't have a BMC, so I'd need to see whether it has a hardware NMI to get a crash dump, or else set a breakpoint somewhere useful. I've done this before but it's been a while, so I'm glad to get more data if it's useful.

@davefinster

In case it's of any use, I just observed this on an old box running 20150123T200224Z whose disks are attached via a JBOD array. The box was suffering from both very low available DRAM (mistakenly almost over-committed; the free list showed <1GB available) and what appeared to be faulty SLOG SSDs exhibiting extremely high I/O latency. These SSDs are SATA Intel devices residing in a JBOD.

After a reboot, as VMs were being started, the host would eventually hang - keyboard I/O on consoles would still be accepted and be displayed, but nothing would make forward progress.

Removing the SLOG SSDs from service resolved the problem. Unfortunately I did not get a chance to inject an NMI/obtain a dump but if it happens again I'll be sure to give it a try.
