Intermittent failure to start, with pegged CPU #940

Open
5 tasks done
griffint61 opened this issue Feb 23, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@griffint61

Thanks in advance for your bug report!

  • Have you reproduced the issue in safe mode?
  • Have you used the debugging guide to try to resolve the issue?
  • Have you checked our FAQs to make sure your question isn't answered there?
  • Have you checked to make sure your issue does not already exist?
  • Have you checked you are on the latest release of Pulsar?

What happened?

When starting Pulsar, about 20% of the time it pegs the CPU and the window does not open. The workaround is to kill it and start again.

This behavior is new with version 1.114.0. It did not occur with 1.108.0. (I was not able to use anything between those two versions due to issue #733.)

There are two processes running while the CPU is pegged. I ran strace on both of them. One is doing this repeatedly:

getrandom("\x3c\xa2\x94\x82\x18\x6f\xce\x70", 8, 0) = 8
getrandom("\x50\xac\x30\xab\x23\xbc\x78\xd4", 8, 0) = 8
getrandom("\x9f\x39\x82\xf4\x59\x44\x99\x5e", 8, 0) = 8
getrandom("\xb6\xb5\xaf\x18\x96\x50\x44\x49", 8, 0) = 8
write(29, "\0", 1)                      = 1

The other is doing this:

futex(0x12c3a86f6790, FUTEX_WAKE_PRIVATE, 1) = 1
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(42, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=41, events=POLLIN}, {fd=42, events=POLLIN}], 4, 0) = 1 ([{fd=12, revents=POLLIN}])
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
read(12, "!", 2)                        = 1
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(42, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=41, events=POLLIN}, {fd=42, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(42, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=41, events=POLLIN}, {fd=42, events=POLLIN}], 4, 3916) = 1 ([{fd=12, revents=POLLIN}])
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
read(12, "!", 2)                        = 1
futex(0x56181bb4b680, FUTEX_WAKE_PRIVATE, 1) = 1
epoll_wait(14, [], 1024, 0)             = 0
futex(0x12c3a86f6790, FUTEX_WAKE_PRIVATE, 1) = 1
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(42, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=41, events=POLLIN}, {fd=42, events=POLLIN}], 4, 0) = 1 ([{fd=12, revents=POLLIN}])
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
read(12, "!", 2)                        = 1
epoll_wait(14, [{events=EPOLLIN, data={u32=29, u64=29}}], 1024, 0) = 1
read(29, "\1\0\0\0\0\0\0\0", 1024)      = 8
futex(0x12c3a86f6790, FUTEX_WAKE_PRIVATE, 1) = 1
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(42, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=41, events=POLLIN}, {fd=42, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(42, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=41, events=POLLIN}, {fd=42, events=POLLIN}], 4, 3443) = 1 ([{fd=41, revents=POLLIN}])
recvmsg(41, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\241\10\24\1\0\0\0\0<\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 32
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(42, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=41, events=POLLIN}, {fd=42, events=POLLIN}], 4, 0) = 0 (Timeout)
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(42, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=41, events=POLLIN}, {fd=42, events=POLLIN}], 4, 2965) = 1 ([{fd=12, revents=POLLIN}])
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
read(12, "!", 2)                        = 1
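
For anyone comparing traces like the two above, a small script can tally which syscalls dominate and how many of them fail. This is only a minimal sketch, assuming plain `strace` output with no `-f`, `-t`, or `-T` prefixes on each line; `summarize_strace` and `LINE_RE` are hypothetical names made up for illustration, not part of Pulsar or strace.

```python
import re
from collections import Counter

# Matches lines such as:
#   recvmsg(41, {msg_namelen=0}, 0) = -1 EAGAIN (Resource temporarily unavailable)
# Captures the syscall name, the numeric return value, and an optional errno name.
LINE_RE = re.compile(
    r"^(?P<name>\w+)\(.*\)\s+=\s+(?P<ret>-?\d+)(?:\s+(?P<errno>E[A-Z]+))?"
)


def summarize_strace(text):
    """Count syscalls and (syscall, errno) pairs in plain strace output."""
    calls = Counter()
    errors = Counter()
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognized lines
        calls[match.group("name")] += 1
        if match.group("errno"):
            errors[(match.group("name"), match.group("errno"))] += 1
    return calls, errors


if __name__ == "__main__":
    # A few lines copied from the second trace in this report.
    sample = """\
recvmsg(41, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(42, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=11, events=POLLIN}, {fd=12, events=POLLIN}], 2, 0) = 0 (Timeout)
read(12, "!", 2)                        = 1
"""
    calls, errors = summarize_strace(sample)
    print(calls.most_common())
    print(errors.most_common())
```

Running it over a longer capture (for example one saved with `strace -o`) would show at a glance whether most of the calls are `recvmsg` returning `EAGAIN`, as the excerpt above suggests.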

Pulsar version

1.114.0

Which OS does this happen on?

🐧 Red Hat based (Fedora, Alma, RockyLinux, CentOS Stream, etc.)

OS details

AlmaLinux 9

Which CPU architecture are you running this on?

x86_64/AMD64

What steps are needed to reproduce this?

Start Pulsar. If it pegs the CPU and the window does not open, the issue has been reproduced. If the window opens then it is fine.
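
To put a number on "pegs the CPU" when reproducing, one option is to sample the process's CPU time from `/proc` on Linux. The following is a rough sketch, not something from the Pulsar project: `cpu_ticks` and `cpu_fraction` are hypothetical helper names, and the one-second sampling interval is an arbitrary assumption.

```python
import os
import time


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line.

    The command name (field 2) may contain spaces or parentheses, so the
    line is split after the last ')' before the fields are counted.
    """
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 and 15 (utime, stime)
    # are at indexes 11 and 12.
    return int(rest[11]) + int(rest[12])


def cpu_fraction(pid, interval=1.0):
    """Approximate CPU use of a process over `interval` seconds (Linux only).

    A value near 1.0 means roughly one core was fully busy.
    """
    def read_ticks():
        with open(f"/proc/{pid}/stat") as f:
            return cpu_ticks(f.read())

    ticks_per_second = os.sysconf("SC_CLK_TCK")
    before = read_ticks()
    time.sleep(interval)
    after = read_ticks()
    return (after - before) / ticks_per_second / interval
```

Calling `cpu_fraction(pid)` for each of the Pulsar processes after a failed start would show which of them is the one spinning.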

Additional Information:

No response

griffint61 added the bug label on Feb 23, 2024

@confused-Techie
Member

Sorry this hasn't received a response yet.

Just wanted to say thanks for reporting the issue. We will be sure to look at this one, but unfortunately there has been a lot of change between these versions, and with the issue seemingly random or dependent on something local to your system, it may be hard to pinpoint.

But thanks a ton!

@griffint61
Author

Since originally reporting this, I've had a chance to use the same version of Pulsar on several other AlmaLinux 9 hosts. The issue has not appeared on any of them.

The thing that makes the original host different is that it is a virtual machine, whereas all of the other hosts are bare metal. The VM software is VirtualBox 7.0.14 running on Windows 10.

@confused-Techie
Member

@griffint61 This is fantastic information! Thanks for that. I'll see what I can do to try to replicate it myself.

@griffint61
Author

I tried replicating the issue on a KVM/QEMU VM. No luck.

Both host and guest were AlmaLinux 9.
