Skip to content
This repository was archived by the owner on Sep 20, 2023. It is now read-only.

sysfs-gpio: assert WaitForEdge() and Halt() work as expected #323

Closed
maruel opened this issue Nov 9, 2018 · 12 comments
Closed

sysfs-gpio: assert WaitForEdge() and Halt() work as expected #323

maruel opened this issue Nov 9, 2018 · 12 comments
Assignees

Comments

@maruel
Copy link
Contributor

maruel commented Nov 9, 2018

  • Add gpiosmoketest to assert that WaitForEdge() can be unblock by Halt().
  • Make it live, CI: run gpiosmoketest #322.
  • Document this in package gpio.
@mame82
Copy link

mame82 commented Nov 9, 2018

Hi,

quick heads up:

The code snippet below produces the following results (on Pi0w):

  • Before the go routine with Halt() fires, falling edges are detected reliably (GPIO13 has a button connected)
  • after the go routine triggers Halt() no further edges are detected, when pushing/releasing the button, but the WaitForEdge call still blocks and thus the for loop never returns
func HaltTest() {
	host.Init()
	pButton := gpioreg.ByName("GPIO13")
	pRed := gpioreg.ByName("GPIO23")
	pGreen := gpioreg.ByName("GPIO24")
	pBlue := gpioreg.ByName("GPIO25")

	out := false



	pButton.In(gpio.PullDown, gpio.FallingEdge)
	//start go routine calling Halt on pButton after some seconds
	go func() {
		pRed.Out(gpio.Low)
		time.Sleep(20 * time.Second)
		pButton.Halt()
		pButton.In(gpio.PullNoChange, gpio.NoEdge)
		pRed.Out(gpio.High) //indicate Halt() fired
	}()

	pGreen.Out(gpio.High) //Indicate loop entry by LED
	for {
		if pButton.WaitForEdge(-1) {
			fmt.Println("Edge detected")
		} else {
			fmt.Println("Edge detection failed, stopping loop")
		}
		if out {
			pBlue.Out(gpio.High)
		} else {
			pBlue.Out(gpio.Low)
		}
		out = !out //toggle
	}
	pButton.In(gpio.PullNoChange, gpio.NoEdge)

	pGreen.Out(gpio.Low) //Indicate loop exit by LED
}

The GPIO out parts (GPIO23,24,25) could be safely ignored, but I kept them, as I mostly test with headless setup and help of LED indicators.

Additionally I noticed, that ones the parent process is suspended and resumed (f.e. with CTRL+Z on Linux bash terminal), the WaitForEdge call always returns false (?intended epoll behavior?).

As mentioned on twitter, toggling the internal pull resistor between PullUp and PullDown could be (mis)used from software side to "force release" a pending WaitForEdge (without an external change of the real GPIO level). Of course this wouldn't make WaitForEdge return false and thus needs an additional check condition (for example to terminate the for loop of the code snippet), but it is the only way I found so far to replace the non-working Halt().

@mame82
Copy link

mame82 commented Nov 9, 2018

Add up:

The syscall.EpollWait returns an event which has EPOLLERR set beginning from the second call (event[0].Events == 10).

If the value file is opened with unix.O_NONBLOCK in addition to syscall.O_RDWR, the pending EpollWait returns with n==0 after calling Halt().

@maruel
Copy link
Contributor Author

maruel commented Nov 10, 2018

Ok so I enabled gpiosmoketest to run on the gohci beaglebone worker, and I created https://github.com/google/periph/commits/gpiotest to test this behavior.
It currently fails as expected: https://gist.github.com/gohci-bot/7ef3d25a3cfbb07d58f2393debd54645

I hope I covered most cases there, but this gives a good test bed, making it easy to add more.

@maruel
Copy link
Contributor Author

maruel commented Nov 10, 2018

Filed #326 for the signal part. Let's focus on normal behavior here, then signals can be looked at afterward

maruel added a commit that referenced this issue Nov 11, 2018

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
Make it run on beaglebone via gohci.

The test intentionally expects failure as described in #323.
@mame82
Copy link

mame82 commented Nov 12, 2018

I have done a bunch of tests, to find a way to interrupt epoll_wait if it is called with a timeout argument of -1.

To summarize: It isn't possible (at least not in a clean way)!

Possible solutions

  1. For "endless blocking" (timeout == -1) epoll_wait could be called in a loop with a predefined timeout.
    The loop continues calling epoll_wait, as long as the number of returned file descriptors is less than 1 (for a timeout the number of returned fds would be 0 instead of err == EINTR, like documented in the man page).
    The abort condition for this epoll_wait loop would be p.edge == gpio.NoEdge. To get access to this attribute, the loop has to be implemented directly in WaitForEdge of host/sysfs/gpio.go.
    The shortcoming of this method is that edges could be missed between successive calls to p.event.Wait (and thus to epoll_wait). Additionally there's a trade-off in responsiveness (time between setting p.Edge = gpio.NoEdge and the time till the loop calling wait really exits) and system load (short timeouts increase CPU load, timeout == 0 would be busy waiting).

  2. The intended way to interrupt epoll_wait (as well as poll/select) is to use a signal. By default every signal interrupts epoll_wait (as already highlighted). A signal filter could be set by using epoll_pwait instead of epoll_wait. In both cases a signal (f.e. USR1) has to be send to the process using the your gpio library. The logic to generate the signal could be placed in haltEdge. Anyways, to me this seems like an ugly solution, as it allows external processes to influence the behavior by sending signals, too.

So from my point of view the best solution would be to implement the looping method.
An example could look like this (periph/host/sysfs/gpio.go):

// WaitForEdge implements gpio.PinIn.
func (p *Pin) WaitForEdge(timeout time.Duration) bool {
	// Run lockless, as the normal use is to call in a busy loop.
	var ms int
	if timeout == -1 {
		ms = -1
	} else {
		ms = int(timeout / time.Millisecond)
	}
	start := time.Now()
	for {

//<--- start of customized code
		nr := 0
		var err error = nil
		if timeout == -1 {
			breakoutTimeoutMillis := 100
			for {
				//Note:
				// In contrast to man page, epoll_wait doesn't return EINTR error on timeout.
				// EINTR is only returned for signals (all signals to be more precise, if epoll_pwait isn't used)
				// This means, if p.Event.Wait returns nr == 0, no edge event was detected and we have to call
				// Wait again, unless p.edge has been set to gpio.NoEdge meanwhile.
				// If NoEdge is set, false will be returned, as the 'ms <= 0' condition becomes true.
				//
				// The only other valid solution to interrupt a pending epoll_wait, would be to send a SIGNAL.
				// Both methods dealing with endless timeouts (looping with a "breakoutTimeout" and sending a SIGNAL)
				// have shortcomings. The looping approach could miss edge interrupts between successive calls to
				// p.event.Wait(), the approach based on signals could be interrupted by external processes (which is
				// the case anyways, unless epoll_pwait is used instead of epoll_wait to get fine grained control over
				// valid signals)
				nr, err = p.event.Wait(breakoutTimeoutMillis)
				if nr != 0 || err !=nil || p.edge == gpio.NoEdge {
					break
				}
			}
		} else {
			nr, err = p.event.Wait(ms)
		}
//<--- end of customized code
		if err != nil {
			return false
		} else if nr == 1 {
			// TODO(maruel): According to pigpio, the correct way to consume the
			// interrupt is to call Seek().
			return true
		}
		// A signal occurred.
		if timeout != -1 {
			ms = int((timeout - time.Since(start)) / time.Millisecond)
		}
		if ms <= 0 {
			return false
		}
	}
}

Please understand that I can't file a PR, as neither solution seems clean to me and for my own use case I'm still fine with the idea of calling epoll_wait with timeout = -1 and raising pseudo "edge events" by toggling the internal pull resistors from PullUp to PullDown, till epoll_wait returns.

I hope you gonna find a valid fix. The code snippet from above should at least avoid the failing tests, without impacting other functionality.

@maruel
Copy link
Contributor Author

maruel commented Nov 12, 2018

I think implementing the signal method is better than looping; yes this opens the door to having WaitForEdge() return false spuriously, but it's not a big deal.

mame82 added a commit to mame82/periph that referenced this issue Nov 12, 2018
@mame82
Copy link

mame82 commented Nov 12, 2018

Another (promising) solution in the PR

Edit:
Mentioned PR isn't accepted. Please understand that I'm missing the time to tune everything for a possible merge. The intention of the PR is to show a possible solution, distributed among multiple files (way easier than trying to explain it with code snippets in a comment).

So if you consider the idea valid, I'd be happy to see it implemented, as I need to move on with my own project (where this problem is already solved by another hack)

maruel added a commit that referenced this issue Dec 1, 2018
Add a short description for each worker.

Running the GPIO smoke test on a Raspberry Pi will be helpful; up to now
it only ran on a BeagleBone Green Wireless, which doesn't have a CPU
specific GPIO driver. So this enables smoke testing a new code path.

This should help with issue #323.
@maruel
Copy link
Contributor Author

maruel commented Dec 1, 2018

@hatstand
Copy link
Contributor

Any reason not to create another FD with eventfd() and use that to interrupt epoll_wait()? Seems cleaner than using a signal.

@maruel
Copy link
Contributor Author

maruel commented Dec 18, 2018

I have no idea, I'm not super familiar with linux's events (I'm a Windows guy).

@maruel
Copy link
Contributor Author

maruel commented Dec 28, 2018

I'm making real progress on a permanent fix for this. I'm now able to run a much stricter version of gpiosmoketest in a continous loop locally, testing that all of Halt(), In() and Out() can cancel a pending WaitForEdge().
I'm going to be mostly OOO for the next two weeks but I'll try to get this done.

@maruel
Copy link
Contributor Author

maruel commented Jan 14, 2019

This is going to be fixed via #380.

@maruel maruel closed this as completed Jan 14, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants