
Slow queries when events of a type are rare. #52

Open
macosgrove opened this issue Apr 12, 2018 · 5 comments


macosgrove commented Apr 12, 2018

RSS, an Event Sourcery project, was having problems with the database CPU usage increasing over time:

[screenshot: database CPU usage graph, 12 Apr 2018, 10:47 am]

We turned on Postgres logging to look for slow queries and found these two:

SELECT * FROM "events" 
WHERE ("type" IN ('revenue_period_voided', 'revenue_period_ended')) 
ORDER BY "id" DESC 
LIMIT 1

and

SELECT * FROM "events" 
WHERE ((id >= 3616762) AND ("type" IN ('revenue_period_voided', 'revenue_period_ended'))) 
ORDER BY "id" 
LIMIT 1000

RSS stopped emitting events of type revenue_period_ended and revenue_period_voided some time ago, but we had not turned off their projector.

The dip in the CPU graph is when we disabled the relevant projector.

The implication here is that any projector that deals with RARE events will run slowly.

@envatopoho is investigating indexes that may resolve the issue.

@envatopoho

Have done some digging (many thanks to @gstib for his help) and found that we are pretty much as optimised as we are going to get with the database indexes. We tried a bunch of different indexes and combinations, but the planner always decided to use the primary key index and filter the results.

I think a possible fix is for Event Sourcery to maintain an index to the 'top' of the event store (since it last checked for that particular event type) and only check for subsequent events of that type from when it last checked (rather than for all time).
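A minimal sketch of that idea in plain Ruby: the store returns not just the matching events but also the highest event id it actually scanned, so the caller can advance its tracker past non-matching events instead of re-scanning them forever. All names here (`Event`, `EventStore`, the return shape) are illustrative, not Event Sourcery's actual API.

```ruby
Event = Struct.new(:id, :type)

class EventStore
  def initialize(events)
    @events = events
  end

  # Return up to `limit` events of the given types after `from_id`, plus the
  # highest id actually scanned, so the caller's tracker can advance even
  # when the batch of matches is empty.
  def get_next_from(from_id, types:, limit: 1000)
    scanned = @events.select { |e| e.id > from_id }.first(limit)
    matches = scanned.select { |e| types.include?(e.type) }
    [matches, scanned.empty? ? from_id : scanned.last.id]
  end
end
```

On the next poll the tracker starts from the scanned checkpoint, so a rare-event projector no longer walks the whole table from its last match.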


macosgrove commented Apr 12, 2018

Looks like that first query is coming from latest_event_id and the second from get_next_from in EventStore

But I can't find any calls of latest_event_id


macosgrove commented Apr 12, 2018

Our next slowest query is

SELECT * FROM "events" 
WHERE ("type" IN ('subscriber_billing_details_update_failed')) 
ORDER BY "id" DESC 
LIMIT 1

'subscriber_billing_details_update_failed' is a rare event.

Even after turning off the revenue period projector, we are seeing a slow rise in CPU usage:

[screenshot: database CPU usage graph, 12 Apr 2018, 11:07 am]

@envatopoho

Same thing with this query: it uses the primary key index to get results and then filters:

QUERY PLAN
Limit  (cost=0.43..32.78 rows=1 width=413) (actual time=302.200..302.201 rows=1 loops=1)
  ->  Index Scan Backward using events_pkey on events  (cost=0.43..517975.03 rows=16015 width=413) (actual time=302.198..302.198 rows=1 loops=1)
        Filter: ((type)::text = 'subscriber_billing_details_update_failed'::text)
        Rows Removed by Filter: 723668
Planning time: 0.109 ms
Execution time: 302.224 ms


ricobl commented Jan 13, 2021

We have hit this issue again: when there are no recent events of the types a certain projector cares about, its tracker stays stuck at a past event id while the event store keeps growing.

The challenge is striking a good performance compromise: loading only specific event types, while still finding a way to advance the tracker when no events of those types exist for the processor.

We could consider loading all types of events and only forwarding the ones that the processor cares about. This would allow the tracker to advance, but would place an additional burden on all projectors.
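A rough sketch of that option, assuming a processor that receives every event: the tracker advances on each event seen, but only the handled types are actually processed. `Processor` and its methods are hypothetical names for illustration, not Event Sourcery's classes.

```ruby
Event = Struct.new(:id, :type)

class Processor
  attr_reader :tracker_id, :handled

  def initialize(handled_types)
    @handled_types = handled_types
    @tracker_id = 0
    @handled = []
  end

  # Every event is delivered, so the tracker always moves forward,
  # but only events of the handled types do any projection work.
  def process(event)
    @handled << event if @handled_types.include?(event.type)
    @tracker_id = event.id
  end
end
```

The cost is that every projector now sees every event, which is the "additional burden" mentioned above.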

We also considered issuing an additional query to get the latest event (of any type) after an empty result, but this could cause events to be skipped.

In the meantime, our workaround was to register a dummy handler for another event type that is known to be frequent:

project FrequentEvent do
  # no-op
end
