Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a shutting down message to polykey agent start command #157

Closed
tegefaulkes opened this issue Mar 20, 2024 · 26 comments
Closed

Add a shutting down message to polykey agent start command #157

tegefaulkes opened this issue Mar 20, 2024 · 26 comments
Assignees
Labels
bug Something isn't working development Standard development r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management

Comments

@tegefaulkes
Copy link
Contributor

Specification

Right now the agent seems un-responsive when stopping. Generally there are two ways we can stop the agent.

  1. By sending a ctrl+c to the running agent.
  2. By running the polykey agent stop command for the agent.

Both will cause the polykey agent to start shutting down but no messages are output when doing this. So unless you ran the agent with the verbose -v flag you can't tell anything is happening until it exits.

To address this we need to add a message when the agent starts to shut down. We need to hook onto this event on the PolykeyAgent and output Agent is shutting down and Agent has shut down messages.

There is a possibility that the agent is actually failing to shut down via the agent stop command. But I've been unable to replicate this in brief testing on nix and the mac machine. In any case it's something to keep an eye out for.

Additional context

Roger Qiu — Today at 8:39 AM
failing to shutdown
<Image>
Had to sigkill it

image

Tasks

  1. Hook onto the stopping and stopped event to output a message reflecting such events.
@tegefaulkes tegefaulkes added the development Standard development label Mar 20, 2024
Copy link

linear bot commented Mar 20, 2024

@tegefaulkes
Copy link
Contributor Author

I'm not sure how this conforms to the work @amydevs has been doing standardising the command outputs. Do you want to weigh in @amydevs ?

I'm also pretty sure that a change like this may break some expectations in the agent tests. So check that too.

@amydevs
Copy link
Member

amydevs commented Mar 28, 2024

I'm not sure how this conforms to the work @amydevs has been doing standardising the command outputs. Do you want to weigh in @amydevs ?

I'm also pretty sure that a change like this may break some expectations in the agent tests. So check that too.

hmm i think this should just be a message that goes into stderr. It's a feedback msg imo, so it's fitting

@CryptoTotalWar
Copy link

CryptoTotalWar commented Apr 3, 2024

@tegefaulkes The issue is still persisting.

  1. Polykey agent running in a terminal won't shut down. On Mac, this required opening the Activity Monitor and Force Quitting the program.
image
  1. We initially came across this issue on Nguyen's device when we were trying to start up Polykey but kept getting the ErrorStatusLocked: Status is locked by another process. We had control C'd all the terminals that had Polykey and closed all the terminals and the issue was still persisting. We then checked the Activity Monitor to see that Polykey was still running in the background so we force quit the application then re-ran the command to start polykey and only after the force quit were we able to re-run Polykey.
image
  1. When running polykey agent stop & then polykey agant status we don't get a response that tells us once it has stopped.
    Note: We still have polykey agent start running in a separate terminal. Is it possible that if this other terminal is unresponsive, and somehow still running or in the process of shutting down that it could be interfering with the program being used in the background on a separate terminal? Probably not, just wondering because the terminal that was running polykey agent start could only be shut down at the very end of our demo with a Force Shutdown.
image image image

@CryptoTotalWar CryptoTotalWar added the bug Something isn't working label Apr 3, 2024
@CryptoTotalWar
Copy link

CryptoTotalWar commented Apr 3, 2024

Various hours after the demo with Wen, I checked my Activity Monitor and noticed that Polykey was still actively running in the background on my machine.

I had ran control+C on the terminal where I was running my polykey commands but the initial terminal where I ran polykey agent start had been closed and its possible that I did not properly shutdown Polykey with control+c.... which could explain why it was still running.

However, with all my terminal applications closed, I was able to open a new run and simply run polykey agent stop which in this case did shut down Polykey.

Point is, Wen would run polykey agent stop in a terminal where polykey was being interacted with and it would not properly shut down. For me, polykey agent stop was the very first command i ran in a fresh new terminal before running any other commands and the response was instant (where i could see from my activity monitor that PK was indeed killed).

There's some inconsistencies in the shut down process that are possibly due to various known unknown factors (known unknowns because Wen's commands and sessions were not identical to mine) and so maybe it is important that there is more responsiveness in knowing if Polykey agent has been shut down properly or if its still running (and its status) if one were to use it several hours after initiating the program...

@CMCDragonkai
Copy link
Member

It is intended for there to be multiple Polykey agents in a desktop.

So when you do PK agent stop, you have to beware which one it is talking to.

The main one is the default home path. Agents is determined by node path.

Already reported earlier that CTRL-C has problems when doing it in foreground. But backgrounder agents always require pk agent stop.

Note that if you run PK agent start twice off the same node path it should prevent the second command from running.

@tegefaulkes
Copy link
Contributor Author

I'm worried that there may be a weird interaction with polykey agent start --fresh on a node that is already running. I should check this.

Keep in mind that ErrorStatusLocked means that an agent for that node is already running. Also keep in mind if you run a agent with --background it will have to use polykey agent stop on it to stop it.

@tegefaulkes
Copy link
Contributor Author

Ok, no problem running Polykey agent start --fresh on an already running node.

❯ npm run polykey -- agent start -np ./tmp/test3/ --fresh

> polykey-cli@0.2.6 polykey
> ts-node src/polykey.ts agent start -np ./tmp/test3/ --fresh

✔ Enter new password … ***
✔ Confirm new password … ***
WARN:polykey.PolykeyAgent:Failed Creating PolykeyAgent
ErrorStatusLocked: Status is locked by another process

@tegefaulkes
Copy link
Contributor Author

The problem is, I can't re-create this and I have very little context to what it could be.

Does it happen every time? Or only sometimes? What are you doing when it happens. Can you narrow down a cause or procedure to trigger it?

For me to really get a better guess to whats happening I need some logging when it happens. For example when I run this with logging I get the following.

❯ npm run polykey -- agent start -np ./tmp/test3 -v

> polykey-cli@0.2.6 polykey
> ts-node src/polykey.ts agent start -np ./tmp/test3 -v

✔ Please enter the password … ***
INFO:polykey.PolykeyAgent:Creating PolykeyAgent
INFO:polykey.PolykeyAgent:Setting umask to 077
INFO:polykey.PolykeyAgent:Setting node path to ./tmp/test3
INFO:polykey.PolykeyAgent.Status:Starting Status
INFO:polykey.PolykeyAgent.Status:Writing Status to tmp/test3/status.json
INFO:polykey.PolykeyAgent.Status:Status is STARTING
INFO:polykey.PolykeyAgent.Schema:Creating Schema
INFO:polykey.PolykeyAgent.Schema:Starting Schema
INFO:polykey.PolykeyAgent.Schema:Setting state path to tmp/test3/state
INFO:polykey.PolykeyAgent.Schema:Started Schema
INFO:polykey.PolykeyAgent.Schema:Created Schema
INFO:polykey.PolykeyAgent.KeyRing:Creating KeyRing
INFO:polykey.PolykeyAgent.KeyRing:Setting keys path to tmp/test3/state/keys
INFO:polykey.PolykeyAgent.KeyRing:Starting KeyRing
INFO:polykey.PolykeyAgent.KeyRing:Checking tmp/test3/state/keys/private.jwk
INFO:polykey.PolykeyAgent.KeyRing:Loading root key pair
INFO:polykey.PolykeyAgent.KeyRing:Checking tmp/test3/state/keys/db.jwk
INFO:polykey.PolykeyAgent.KeyRing:Started KeyRing
INFO:polykey.PolykeyAgent.KeyRing:Created KeyRing
INFO:polykey.PolykeyAgent.DB:Creating DB
INFO:polykey.PolykeyAgent.DB:Starting DB
INFO:polykey.PolykeyAgent.DB:Setting DB path to tmp/test3/state/db
INFO:polykey.PolykeyAgent.DB:Started DB
INFO:polykey.PolykeyAgent.DB:Created DB
INFO:polykey.PolykeyAgent:Creating TaskManager
INFO:polykey.PolykeyAgent:Starting TaskManager in Lazy Mode
INFO:polykey.PolykeyAgent:Begin Tasks Repair
INFO:polykey.PolykeyAgent:Finish Tasks Repair
INFO:polykey.PolykeyAgent:Started TaskManager
INFO:polykey.PolykeyAgent:Created TaskManager
INFO:polykey.PolykeyAgent.CertManager:Creating CertManager
INFO:polykey.PolykeyAgent.CertManager:Starting CertManager
INFO:polykey.PolykeyAgent.CertManager:Begin current certificate setup
INFO:polykey.PolykeyAgent.CertManager:Existing current certificate found
INFO:polykey.PolykeyAgent.CertManager:Finish current certificate setup
INFO:polykey.PolykeyAgent.CertManager:Registering handler CertManager.renewCurrentCert
INFO:polykey.PolykeyAgent.CertManager:Started CertManager
INFO:polykey.PolykeyAgent.CertManager:Created CertManager
INFO:polykey.PolykeyAgent.Sigchain:Creating Sigchain
INFO:polykey.PolykeyAgent.Sigchain:Starting Sigchain
INFO:polykey.PolykeyAgent.Sigchain:Started Sigchain
INFO:polykey.PolykeyAgent.Sigchain:Created Sigchain
INFO:polykey.PolykeyAgent.ACL:Creating ACL
INFO:polykey.PolykeyAgent.ACL:Starting ACL
INFO:polykey.PolykeyAgent.ACL:Started ACL
INFO:polykey.PolykeyAgent.ACL:Created ACL
INFO:polykey.PolykeyAgent.GestaltGraph:Creating GestaltGraph
INFO:polykey.PolykeyAgent.GestaltGraph:Starting GestaltGraph
INFO:polykey.PolykeyAgent.GestaltGraph:Started GestaltGraph
INFO:polykey.PolykeyAgent.GestaltGraph:Created GestaltGraph
INFO:polykey.PolykeyAgent.IdentitiesManager:Creating IdentitiesManager
INFO:polykey.PolykeyAgent.IdentitiesManager:Starting IdentitiesManager
INFO:polykey.PolykeyAgent.IdentitiesManager:Started IdentitiesManager
INFO:polykey.PolykeyAgent.IdentitiesManager:Created IdentitiesManager
INFO:polykey.PolykeyAgent.NodeGraph:Creating NodeGraph
INFO:polykey.PolykeyAgent.NodeGraph:Starting NodeGraph
INFO:polykey.PolykeyAgent.NodeGraph:Started NodeGraph
INFO:polykey.PolykeyAgent.NodeGraph:Created NodeGraph
INFO:polykey.PolykeyAgent.Discovery:Creating Discovery
INFO:polykey.PolykeyAgent.Discovery:Starting Discovery
INFO:polykey.PolykeyAgent.Discovery:Started Discovery
INFO:polykey.PolykeyAgent.Discovery:Created Discovery
INFO:polykey.PolykeyAgent.NotificationsManager:Creating NotificationsManager
INFO:polykey.PolykeyAgent.NotificationsManager:Starting NotificationsManager
INFO:polykey.PolykeyAgent.NotificationsManager:Started NotificationsManager
INFO:polykey.PolykeyAgent.NotificationsManager:Created NotificationsManager
INFO:polykey.PolykeyAgent.VaultManager:Creating VaultManager
INFO:polykey.PolykeyAgent.VaultManager:Setting vaults path to tmp/test3/state/vaults
INFO:polykey.PolykeyAgent.VaultManager:Starting VaultManager
INFO:polykey.PolykeyAgent.VaultManager.EFS Database:Creating DB
INFO:polykey.PolykeyAgent.VaultManager.EFS Database:Starting DB
INFO:polykey.PolykeyAgent.VaultManager.EFS Database:Setting DB path to tmp/test3/state/vaults/efs
INFO:polykey.PolykeyAgent.VaultManager.EFS Database:Started DB
INFO:polykey.PolykeyAgent.VaultManager.EFS Database:Created DB
INFO:polykey.PolykeyAgent.VaultManager.EncryptedFileSystem.INodeManager:Creating INodeManager
INFO:polykey.PolykeyAgent.VaultManager.EncryptedFileSystem.INodeManager:Starting INodeManager
INFO:polykey.PolykeyAgent.VaultManager.EncryptedFileSystem.INodeManager:Started INodeManager
INFO:polykey.PolykeyAgent.VaultManager.EncryptedFileSystem.INodeManager:Created INodeManager
INFO:polykey.PolykeyAgent.VaultManager.EncryptedFileSystem:Starting EncryptedFS
INFO:polykey.PolykeyAgent.VaultManager.EncryptedFileSystem:Started EncryptedFS
INFO:polykey.PolykeyAgent.VaultManager:Started VaultManager
INFO:polykey.PolykeyAgent.VaultManager:Created VaultManager
INFO:polykey.PolykeyAgent.SessionManager:Creating SessionManager
INFO:polykey.PolykeyAgent.SessionManager:Starting SessionManager
INFO:polykey.PolykeyAgent.SessionManager:Started SessionManager
INFO:polykey.PolykeyAgent.SessionManager:Created SessionManager
INFO:polykey.PolykeyAgent.Audit:Creating Audit
INFO:polykey.PolykeyAgent.Audit:Starting Audit
INFO:polykey.PolykeyAgent.Audit:Started Audit
INFO:polykey.PolykeyAgent.Audit:Created Audit
INFO:polykey.PolykeyAgent:Starting PolykeyAgent
INFO:polykey.PolykeyAgent.ClientService:Start ClientService
INFO:polykey.PolykeyAgent.ClientService.RPCServer:Start RPCServer
INFO:polykey.PolykeyAgent.ClientService.RPCServer:Started RPCServer
INFO:polykey.PolykeyAgent.ClientService.WebSocketServer:Starting WebSocketServer
INFO:polykey.PolykeyAgent.ClientService.WebSocketServer:Started WebSocketServer
INFO:polykey.PolykeyAgent.ClientService:Started ClientService
INFO:polykey.PolykeyAgent.NodeConnectionManager:Start NodeConnectionManager on [::]:0
INFO:polykey.PolykeyAgent.NodeConnectionManager.RPCServer:Start RPCServer
INFO:polykey.PolykeyAgent.NodeConnectionManager.RPCServer:Started RPCServer
INFO:polykey.PolykeyAgent.NodeConnectionManager.QUICSocket:Start QUICSocket on [::]:0
INFO:polykey.PolykeyAgent.NodeConnectionManager.QUICSocket:Started QUICSocket on [::]:42616
INFO:polykey.PolykeyAgent.NodeConnectionManager.QUICServer:Start QUICServer on [::]:42616
INFO:polykey.PolykeyAgent.NodeConnectionManager.QUICServer:Started QUICServer on [::]:42616
INFO:polykey.PolykeyAgent.NodeConnectionManager:Started NodeConnectionManager
INFO:polykey.PolykeyAgent.NodeManager:Starting NodeManager
INFO:polykey.PolykeyAgent.NodeManager:Setting up refreshBucket tasks
INFO:polykey.PolykeyAgent.NodeManager:Set up refreshBucket tasks
INFO:polykey.PolykeyAgent.NodeManager.MDNS:Start MDNS
(node:95652) [DEP0112] DeprecationWarning: Socket.prototype._handle is deprecated
(Use `node --trace-deprecation ...` to show where the warning was created)
INFO:polykey.PolykeyAgent.NodeManager.MDNS:Started MDNS
INFO:polykey.PolykeyAgent.NodeManager:Started NodeManager
INFO:polykey.PolykeyAgent.NodeManager.syncNodeGraph:Synchronizing NodeGraph
INFO:polykey.PolykeyAgent.scheduler:Starting Scheduling Loop
INFO:polykey.PolykeyAgent.scheduler:Started Scheduling Loop
INFO:polykey.PolykeyAgent.queue:Starting Queueing Loop
INFO:polykey.PolykeyAgent.queue:Started Queueing Loop
INFO:polykey.PolykeyAgent.WorkerManager:Creating WorkerManager
INFO:polykey.PolykeyAgent.WorkerManager:Created WorkerManager
INFO:polykey.PolykeyAgent.Status:Finish Status STARTING
INFO:polykey.PolykeyAgent.Status:Writing Status to tmp/test3/status.json
INFO:polykey.PolykeyAgent.Status:Status is LIVE
INFO:polykey.PolykeyAgent:Started PolykeyAgent
INFO:polykey.PolykeyAgent:Created PolykeyAgent
pid             95652
nodeId          vc81km0en0e50ic4bet9ujtnde8ot3t6bn2af8o1cau580585peig
clientHost      ::1
clientPort      36221
agentHost       ::
agentPort       42616

<redacted some connection logging to save space>

^C // Sending `ctrl+c` here

INFO:polykey.PolykeyAgent:Stopping PolykeyAgent
INFO:polykey.PolykeyAgent.Status:Begin Status STOPPING
INFO:polykey.PolykeyAgent.Status:Writing Status to tmp/test3/status.json
INFO:polykey.PolykeyAgent.Status:Status is STOPPING
INFO:polykey.PolykeyAgent:Stopping Queuing Loop
INFO:polykey.PolykeyAgent:Stopping Scheduling Loop
INFO:polykey.PolykeyAgent:Stopped Queuing Loop
INFO:polykey.PolykeyAgent:Stopped Scheduling Loop
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [116.110.42.206:43449].QUICClient:ErrorQUICConnectionLocal: QUIC Connection local error - Failed to start QUIC connection due to start timeout
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [116.110.42.206:43449].QUICClient.QUICConnection 57ebb54593363ac108f41942ece5eeec175efe5e:ErrorQUICConnectionLocal: QUIC Connection local error - Failed to start QUIC connection due to start timeout
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [159.196.126.185:55260].QUICClient:ErrorQUICConnectionLocal: QUIC Connection local error - Failed to start QUIC connection due to start timeout
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [159.196.126.185:55260].QUICClient.QUICConnection 9eef7d5bdbb3ee8a0a793a936b31e31af656cd56:ErrorQUICConnectionLocal: QUIC Connection local error - Failed to start QUIC connection due to start timeout
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [116.110.42.206:43449].QUICClient:Destroy QUICClient
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [159.196.126.185:55260].QUICClient:Destroy QUICClient
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [116.110.42.206:43449].QUICClient:Destroyed QUICClient
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [159.196.126.185:55260].QUICClient:Destroyed QUICClient
INFO:polykey.PolykeyAgent.SessionManager:Stopping SessionManager
INFO:polykey.PolykeyAgent.SessionManager:Stopped SessionManager
INFO:polykey.PolykeyAgent.NotificationsManager:Stopping NotificationsManager
INFO:polykey.PolykeyAgent.NotificationsManager:Stopped NotificationsManager
INFO:polykey.PolykeyAgent.VaultManager:Stopping VaultManager
INFO:polykey.PolykeyAgent.VaultManager.EncryptedFileSystem:Stopping EncryptedFS
INFO:polykey.PolykeyAgent.VaultManager.EncryptedFileSystem.INodeManager:Stopping INodeManager
INFO:polykey.PolykeyAgent.VaultManager.EncryptedFileSystem.INodeManager:Stopped INodeManager
INFO:polykey.PolykeyAgent.VaultManager.EncryptedFileSystem:Stopped EncryptedFS
INFO:polykey.PolykeyAgent.VaultManager.EFS Database:Stopping DB
INFO:polykey.PolykeyAgent.VaultManager.EFS Database:Stopped DB
INFO:polykey.PolykeyAgent.VaultManager:Stopped VaultManager
INFO:polykey.PolykeyAgent.Discovery:Stopping Discovery
INFO:polykey.PolykeyAgent.Discovery:Stopped Discovery
INFO:polykey.PolykeyAgent.NodeConnectionManager:Stop NodeConnectionManager
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnection [vncm2mkk41vgp2fmplqiu1je7b2l3v6fhgltlqf5f3f85923ve0j0@::ffff:13.239.117.143:1314]:Destroying NodeConnection
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnection [v6p14qcvvftnnscuavsehu37t22vfvnhse054pbkb3ehemmjsrdh0@::ffff:3.145.86.40:1314]:Destroying NodeConnection
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [13.239.117.143:1314].QUICClient:Destroy QUICClient to [::ffff:13.239.117.143]:1314
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [3.145.86.40:1314].QUICClient:Destroy QUICClient to [::ffff:3.145.86.40]:1314
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [13.239.117.143:1314].QUICClient.QUICConnection 289401881386bcf02b0f46ca8ee506e07f64902f:Stop QUICConnection
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [3.145.86.40:1314].QUICClient.QUICConnection 4a4c91e570bd74615822e6b2dc56da6e945ac137:Stop QUICConnection
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [13.239.117.143:1314].QUICClient:ErrorQUICConnectionLocal: QUIC Connection local error - Locally closed with application code 0
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [13.239.117.143:1314].QUICClient.QUICConnection 289401881386bcf02b0f46ca8ee506e07f64902f:ErrorQUICConnectionLocal: QUIC Connection local error - Locally closed with application code 0
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [3.145.86.40:1314].QUICClient:ErrorQUICConnectionLocal: QUIC Connection local error - Locally closed with application code 0
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [3.145.86.40:1314].QUICClient.QUICConnection 4a4c91e570bd74615822e6b2dc56da6e945ac137:ErrorQUICConnectionLocal: QUIC Connection local error - Locally closed with application code 0
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [13.239.117.143:1314].QUICClient.QUICConnection 289401881386bcf02b0f46ca8ee506e07f64902f:Stopped QUICConnection
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [13.239.117.143:1314].QUICClient:Destroyed QUICClient to [::ffff:13.239.117.143]:1314
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnection [vncm2mkk41vgp2fmplqiu1je7b2l3v6fhgltlqf5f3f85923ve0j0@::ffff:13.239.117.143:1314]:Destroyed NodeConnection
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [3.145.86.40:1314].QUICClient.QUICConnection 4a4c91e570bd74615822e6b2dc56da6e945ac137:Stopped QUICConnection
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnectionForward [3.145.86.40:1314].QUICClient:Destroyed QUICClient to [::ffff:3.145.86.40]:1314
INFO:polykey.PolykeyAgent.NodeConnectionManager.NodeConnection [v6p14qcvvftnnscuavsehu37t22vfvnhse054pbkb3ehemmjsrdh0@::ffff:3.145.86.40:1314]:Destroyed NodeConnection
INFO:polykey.PolykeyAgent.NodeConnectionManager.QUICServer:Stop QUICServer on [::]:42616
INFO:polykey.PolykeyAgent.NodeConnectionManager.QUICServer:Stopped QUICServer on [::]:42616
INFO:polykey.PolykeyAgent.NodeConnectionManager.QUICSocket:Stop QUICSocket on [::]:42616
INFO:polykey.PolykeyAgent.NodeConnectionManager.QUICSocket:Stopped QUICSocket on [::]:42616
INFO:polykey.PolykeyAgent.NodeConnectionManager.RPCServer:Stop RPCServer
INFO:polykey.PolykeyAgent.NodeConnectionManager.RPCServer:Stopped RPCServer
INFO:polykey.PolykeyAgent.NodeConnectionManager:Stopped NodeConnectionManager
INFO:polykey.PolykeyAgent.NodeGraph:Stopping NodeGraph
INFO:polykey.PolykeyAgent.NodeGraph:Stopped NodeGraph
INFO:polykey.PolykeyAgent.NodeManager:Stopping NodeManager
INFO:polykey.PolykeyAgent.NodeManager.MDNS:Stop MDNS
INFO:polykey.PolykeyAgent.NodeManager.MDNS:Stopped MDNS
INFO:polykey.PolykeyAgent.NodeManager:Stopped NodeManager
INFO:polykey.PolykeyAgent.ClientService:Stop ClientService
INFO:polykey.PolykeyAgent.ClientService.WebSocketServer:Stopping WebSocketServer
INFO:polykey.PolykeyAgent.ClientService.WebSocketServer:Stopped WebSocketServer
INFO:polykey.PolykeyAgent.ClientService.RPCServer:Stop RPCServer
INFO:polykey.PolykeyAgent.ClientService.RPCServer:Stopped RPCServer
INFO:polykey.PolykeyAgent.ClientService:Stopped ClientService
INFO:polykey.PolykeyAgent.IdentitiesManager:Stopping IdentitiesManager
INFO:polykey.PolykeyAgent.IdentitiesManager:Stopped IdentitiesManager
INFO:polykey.PolykeyAgent.GestaltGraph:Stopping GestaltGraph
INFO:polykey.PolykeyAgent.GestaltGraph:Stopped GestaltGraph
INFO:polykey.PolykeyAgent.ACL:Stopping ACL
INFO:polykey.PolykeyAgent.ACL:Stopped ACL
INFO:polykey.PolykeyAgent.Sigchain:Stopping Sigchain
INFO:polykey.PolykeyAgent.Sigchain:Stopped Sigchain
INFO:polykey.PolykeyAgent.CertManager:Stopping CertManager
INFO:polykey.PolykeyAgent.CertManager:Cancelling task CertManager.renewCurrentCert:v0pgdtin30po01bvnegl7puoat8
INFO:polykey.PolykeyAgent.CertManager:Deregistering handler CertManager.renewCurrentCert
INFO:polykey.PolykeyAgent.CertManager:Stopped CertManager
INFO:polykey.PolykeyAgent:Stopping TaskManager
INFO:polykey.PolykeyAgent:Stopped TaskManager
INFO:polykey.PolykeyAgent.Audit:Stopping Audit
INFO:polykey.PolykeyAgent.Audit:Stopped Audit
INFO:polykey.PolykeyAgent.DB:Stopping DB
INFO:polykey.PolykeyAgent.DB:Stopped DB
INFO:polykey.PolykeyAgent.KeyRing:Stopping KeyRing
INFO:polykey.PolykeyAgent.KeyRing:Stopped KeyRing
INFO:polykey.PolykeyAgent.Schema:Stopping Schema
INFO:polykey.PolykeyAgent.Schema:Stopped Schema
INFO:polykey.PolykeyAgent.WorkerManager:Destroying WorkerManager
INFO:polykey.PolykeyAgent.WorkerManager:Destroyed WorkerManager
INFO:polykey.PolykeyAgent.Status:Stopping Status
INFO:polykey.PolykeyAgent.Status:Writing Status to tmp/test3/status.json
INFO:polykey.PolykeyAgent.Status:Status is DEAD
INFO:polykey.PolykeyAgent:Stopped PolykeyAgent

These logs will help a lot in working out what's wrong because it could be a few things. It could be ignoring the signal outright or some domain that is shutting down is taking a long time/deadlocked. I need to know where to look.

@amydevs
Copy link
Member

amydevs commented Apr 4, 2024

I've added a callback to the the EventPolykeyAgentStop event so that polykey agent start will reflect if the agent is stopping: d55d0fd

@amydevs
Copy link
Member

amydevs commented Apr 4, 2024

I can't seem to figure out how to reproduce what the comments are describing, so i'll leave it here.

@CryptoTotalWar
Copy link

@tegefaulkes To clarify, what would help going forward is to always run the command polykey agent start -v in a separate terminal window because this provides you with a log of events to help you know where to look in the case of running into bugs or issues?

@tegefaulkes
Copy link
Contributor Author

When testing always run the agent in the foreground in a separate terminal. You can do it verbose -v but this adds a lot of noise and is only really useful for really deep debugging. If you do run with verbose then redirect the output to a file. There may be an option to do this already with --output-file or something.

Copy link
Member

No don't add options for no reason. Just learn how to do process redirection or redirect stdout or stderr.

@tegefaulkes
Copy link
Contributor Author

I'm not saying we add it for no reason. It's already a feature.

const backgroundOutFile = new commander.Option(
  '-bof, --background-out-file <path>',
  'Path to STDOUT for agent process',
);

const backgroundErrFile = new commander.Option(
  '-bef, --background-err-file <path>',
  'Path to STDERR for agent process',
);

Though it only applies to a back-grounded agent. An agent running in the foreground should use redirection.

Copy link
Contributor Author

@amydevs Is this done? I recall you made some changes relating to this.

Copy link
Member

I actually do see this:

cmcdragonkai ➜ matrix-ml-1  ➜ ~/Projects/Polykey-CLI
 $ ./dist/polykey.js agent start
✔ Please enter the password … ******
(node:1149047) [DEP0112] DeprecationWarning: Socket.prototype._handle is deprecated
(Use `node --trace-deprecation ...` to show where the warning was created)
pid       	1149047
nodeId    	vve3f83v1qog578bgav365bmmgtt71prpdogs2f2dcqcf59i60br0
clientHost	::1
clientPort	32899
agentHost 	::
agentPort 	57354
WARN:polykey.PolykeyAgent.task v0pi6svoo31o0160nvvp66on06o:Failed - Reason: ErrorUtilsUndefinedBehaviour
WARN:polykey.PolykeyAgent.task v0piance285o015kje3qe84r1vg:Failed - Reason: ErrorUtilsUndefinedBehaviour
^CStopping Agent

However I'm not sure if it looks as nice it can be. It's sort of because Ctrl+C send a ^C character without a newline… Not sure if this is due to this issue being solved? @amydevs need an update.

Copy link
Contributor Author

Still needs to be verified

@tegefaulkes tegefaulkes closed this as not planned Won't fix, can't repro, duplicate, stale May 29, 2024
@tegefaulkes tegefaulkes reopened this May 29, 2024
@tegefaulkes
Copy link
Contributor Author

There was a duplicate in linear for this issue. Seemed to auto close this despite removing the link first..

Copy link
Contributor Author

When stopping in forground with ctrl+c

❯ npm run polykey -- -np ./tmp/bsd agent start

> polykey-cli@0.4.1 polykey
> ts-node src/polykey.ts -np ./tmp/bsd agent start

✔ Please enter the password … ***
(node:3871611) [DEP0112] DeprecationWarning: Socket.prototype._handle is deprecated
(Use `node --trace-deprecation ...` to show where the warning was created)
pid             3871611
nodeId          v3i2kvgtr2r8imfuidndrvo6mqrumuo2nbeehgk97felbi0bcljg0
clientHost      ::1
clientPort      43299
agentHost       ::
agentPort       36724
WARN:polykey.PolykeyAgent.task v0pida0bjmpo0174rgoh1msqd48:Failed - Reason: ErrorUtilsUndefinedBehaviour

^CStopping Agent

and via agent stop

❯ npm run polykey -- -np ./tmp/bsd agent start

> polykey-cli@0.4.1 polykey
> ts-node src/polykey.ts -np ./tmp/bsd agent start

✔ Please enter the password … ***
(node:3871951) [DEP0112] DeprecationWarning: Socket.prototype._handle is deprecated
(Use `node --trace-deprecation ...` to show where the warning was created)
pid             3871951
nodeId          v3i2kvgtr2r8imfuidndrvo6mqrumuo2nbeehgk97felbi0bcljg0
clientHost      ::1
clientPort      45147
agentHost       ::
agentPort       48231
(node:3871951) MaxListenersExceededWarning: Possible EventTarget memory leak detected. 11 abort listeners added to [AbortSignal]. Use events.setMaxListeners() to increase limit
Stopping Agent

So I'm calling this one done.

@CMCDragonkai CMCDragonkai added the r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management label Aug 13, 2024
@CMCDragonkai
Copy link
Member

So this idea of a message at the end. I didn't get around to reviewing this. But there are some guidelines/rules to CLI programs that is important.

  1. Our STDOUT should be reserved for actual output when the program succeeds.
  2. Our STDERR is being used for logs and also for exception/error messages.

So I can see that Stopping Agent is a line that is emitted to STDERR. Furthermore using CTRL+C and SIGINT results in a exit code of 130. Sending SIGTERM results in an exit code of 143.

There are several problems with this.

  1. Our STDOUT for polykey agent start should be just the status message as it is currently.
  2. Saying Stopping Agent is not an error, nor an exception. Therefore the exit code should be 0, because there's no actual error. We have custom exit handler specifically for this reason. This applies to both SIGINT, and SIGTERM and any other custom signal handler where we gracefully terminate. It should always just be 0 in this case, which is a successful process exit. I notice that this is the case when sending polykey agent stop, but not when using signals.
  3. The Stopping Agent doesn't look like a regular log message, nor does it look like an exception, this adds a 3rd case to any enum-pattern match of the output of Polykey agent process. I believe this is is unnecessary. It would have been sufficient to simply add a warning message indicating that the agent is stopped. In fact the last message is in fact INFO:polykey.PolykeyAgent:Stopped PolykeyAgent. You should just change this to the WARN level, and it would show up even if you didn't do --verbose.

So the real question is why do we even need this message? Is it because our agent stopping process is buggy? That's what the --verbose flag is for isn't it? I can see that #185 still exists, so this problem isn't really solved.

@tegefaulkes
Copy link
Contributor Author

We added it because of the lack of feedback. @CryptoTotalWar at the time was running into issues where the agent wasn't stopping properly so adding this message makes it clear if it responded at all when triggered to stop.

Copy link
Member

Yes I'm aware of that, but my arguments above indicate that such a message doesn't fit within the rest of the product design… and there are better ways of doing it.

Copy link
Contributor Author

I think some indication of stopping is needed regardless of the logging level. It adds a degree of responsiveness we don't get otherwise right now. The agent takes a moment to gracefully stop. So as a user without that feedback it feels unresponsive and in the worst case you can't tell if anything is happening.

So I say we do include a stderr message for the agent stopping as a form of feedback.

As for the error codes. I think only stopping an agent with agent stop should exit with code 0. If the agent is killed with a signal I think these should exit with their current codes. I think any external killing of the agent should be considered an error even if it was handled gracefully.

@CMCDragonkai
Copy link
Member

I think some indication of stopping is needed regardless of the logging level. It adds a degree of responsiveness we don't get otherwise right now. The agent takes a moment to gracefully stop. So as a user without that feedback it feels unresponsive and in the worst case you can't tell if anything is happening.

So I say we do include a stderr message for the agent stopping as a form of feedback. As for the error codes. I think only stopping an agent with agent stop should exit with code 0. If the agent is killed with a signal I think these should exit with their current codes. I think any external killing of the agent should be considered an error even if it was handled gracefully.

The message needs to match the logging standard. You can do a warn for this shutting down.

@tegefaulkes
Copy link
Contributor Author

New issue at #270 for fixing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working development Standard development r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management
Development

No branches or pull requests

4 participants