Skip to content
This repository has been archived by the owner on Nov 25, 2024. It is now read-only.

Dendrite server stuck/not responding #3366

Open
hensg opened this issue Apr 19, 2024 · 1 comment
Open

Dendrite server stuck/not responding #3366

hensg opened this issue Apr 19, 2024 · 1 comment

Comments

@hensg
Copy link

hensg commented Apr 19, 2024

Background information

  • Dendrite version or git SHA: v0.13.6
  • SQLite3 or Postgres?: postgresql
  • Running in Docker?: no, nix
  • go version:
  • Client used (if applicable): element

Description

  • What is the problem:
    The dendrite server is stuck, not being able to operate normally and becoming extremely slow (10m to create a user)
  • Who is affected:
    everyone in the server
  • How is this bug manifesting:
    not sure, chat room with different region (asia(dendrite) <-> us (synapse))
  • When did this first appear:
    idk, first version install is 0.13.6
Apr 13 00:21:34 matrix-dendrite dendrite[50475]: time="2024-04-13T00:21:34.916263901Z" level=info msg="Processing registration request" auth.type=m.login.dummy req.id=D07R8ffP3r5r req.method=POST req.path=/_matrix/client/v3/register session_id="" username=<redacted>
Apr 13 00:21:35 matrix-dendrite dendrite[50475]: time="2024-04-13T00:21:35.012944307Z" level=info msg="PerformDeviceCreation" device_id="<nil>" display_name=<redacted> localpart=<redacted> req.id=D07R8ffP3r5r req.method=POST req.path=/_matrix/client/v3/register
Apr 13 00:25:18 matrix-dendrite dendrite[50475]: time="2024-04-13T00:25:18.193324285Z" level=info msg="Creating new room" req.id=SiLzMVzH2l9W req.method=POST req.path=/_matrix/client/v3/createRoom roomID=<redacted> roomVersion=10 userID=<redacted> user_id=<redacted>
Apr 13 00:28:48 matrix-dendrite dendrite[50475]: time="2024-04-13T00:28:48.534446323Z" level=info msg="Creating new room" req.id=8FY47hJw1rTr req.method=POST req.path=/_matrix/client/v3/createRoom roomID=<redacted> roomVersion=10 userID=<redacted> user_id=<redacted>
Apr 13 00:29:57 matrix-dendrite dendrite[50475]: time="2024-04-13T00:29:57.138855878Z" level=error msg="unable to update event filter with ignored users" device_id=DE-q7arv error="context canceled" limit=50 req.id=eViRvKWwr2Zw req.method=GET req.path=/_matrix/client/r0/sync since=s328_2_67_9_180_329_100_290_0 timeout=30s user_id=<redacted>
Apr 13 00:29:57 matrix-dendrite dendrite[50475]: time="2024-04-13T00:29:57.138958418Z" level=error msg="d.addRoomDeltaToResponse failed" device_id=DE-q7arv error="p.DB.RecentEvents: sql: transaction has already been committed or rolled back" limit=50 req.id=eViRvKWwr2Zw req.method=GET req.path=/_matrix/client/r0/sync since=s328_2_67_9_180_329_100_290_0 timeout=30s user_id=<redacted>
Apr 13 00:29:57 matrix-dendrite dendrite[50475]: time="2024-04-13T00:29:57.139007999Z" level=error msg="Failed to acquire database snapshot for sync request" error="context canceled"
Apr 13 00:29:57 matrix-dendrite dendrite[50475]: time="2024-04-13T00:29:57.139036961Z" level=error msg="Failed to acquire database snapshot for sync request" error="context canceled"
Apr 13 00:29:57 matrix-dendrite dendrite[50475]: time="2024-04-13T00:29:57.139059912Z" level=error msg="Failed to acquire database snapshot for sync request" error="context canceled"

Wondering if it's related to this continue and the error not being handled a few lines above where it returns the newPos:
https://github.com/matrix-org/dendrite/blob/main/syncapi/streams/stream_pdu.go#L232

Also, I observed a few queries running for too long in the db:

 SELECT pid, now() - pg_stat_activity.query_start AS duration, query
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '2 minutes';
  pid  |    duration     |                                                                                     query
-------+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 10357 | 00:02:04.219947 | BEGIN READ WRITE
 11644 | 00:02:04.261905 | SELECT room_id, deleted, (id > $3 AND id <= $4) AS changed FROM syncapi_peeks WHERE user_id = $1 AND device_id = $2 AND ((id <= $3 AND NOT deleted) OR (id > $3 AND id <= $4))
 11410 | 00:02:04.262119 | SELECT room_id, deleted, (id > $3 AND id <= $4) AS changed FROM syncapi_peeks WHERE user_id = $1 AND device_id = $2 AND ((id <= $3 AND NOT deleted) OR (id > $3 AND id <= $4))
 11647 | 00:02:04.213341 | SELECT room_id, deleted, (id > $3 AND id <= $4) AS changed FROM syncapi_peeks WHERE user_id = $1 AND device_id = $2 AND ((id <= $3 AND NOT deleted) OR (id > $3 AND id <= $4))
 11643 | 00:02:04.213535 | SELECT room_id, deleted, (id > $3 AND id <= $4) AS changed FROM syncapi_peeks WHERE user_id = $1 AND device_id = $2 AND ((id <= $3 AND NOT deleted) OR (id > $3 AND id <= $4))
 11649 | 00:02:04.238318 | SELECT room_id, deleted, (id > $3 AND id <= $4) AS changed FROM syncapi_peeks WHERE user_id = $1 AND device_id = $2 AND ((id <= $3 AND NOT deleted) OR (id > $3 AND id <= $4))
@Serlodh
Copy link

Serlodh commented Sep 26, 2024

@hensg did you find a solution to this by any chance?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants