[NODE] Correct reconnect on read-only PSQL replica #12834

Open
shibaeff opened this issue Apr 16, 2024 · 2 comments

@shibaeff

Description
We run an active-passive PostgreSQL setup and fail over our database instances when needed. Today we tested a failover in a controlled manner, and the chainlink-ocr node needed a manual restart.

We would like the Chainlink node to automatically handle a database instance flipping from read-write (while it is still the active master) to a read-only replica (once it is no longer the active master) by tearing down all active SQL sessions, perhaps sleeping for 5 seconds or so, and then re-establishing them from scratch.

Steps to Reproduce
Put two PostgreSQL instances behind an HAProxy instance. Point the chainlink-ocr node at HAProxy and fail over PostgreSQL from one instance to the other while the Chainlink node is running.

Basic Information
We're running Chainlink in a k8s environment, using a Docker image based on the publicly available image provided by the Chainlink team.
Logs on the chainlink-ocr side:

d PoR USD version 4 contract 0x6CeA38508B186DE36AAfd0f3B513E708691bc0C4 network mainnet jobID=3703 jobName=CacheGold PoR USD version 4 contract 0x6CeA38508B186DE36AAfd0f3B513E708691bc0C4 network mainnet logger=OCR version=2.10.0@0fe6514
2024-04-16T03:31:35.194Z [ERROR] Error creating SpecError ReportGeneration: DataSource errored job/orm.go:658                   err=ERROR: cannot execute INSERT in a read-only transaction (SQLSTATE 25006) logger=JobORM stacktrace=github.com/smartcontractkit/chainlink/v2/core/services/job.(*orm).TryRecordError
        /chainlink/core/services/job/orm.go:658
github.com/smartcontractkit/chainlink/v2/core/services/ocr.(*Delegate).ServicesForSpec.func1
        /chainlink/core/services/ocr/delegate.go:162
github.com/smartcontractkit/chainlink-common/pkg/logger.(*ocrWrapper).Error
        /go/pkg/mod/github.com/smartcontractkit/chainlink-common@v0.1.7-0.20240306173252-5cbf83ca3a69/pkg/logger/ocr.go:47
github.com/smartcontractkit/libocr/internal/loghelper.loggerWithContextImpl.ErrorIfNotCanceled
        /go/pkg/mod/github.com/smartcontractkit/libocr@v0.0.0-20240229181116-bfb2432a7a66/internal/loghelper/logger_with_context.go:54
github.com/smartcontractkit/libocr/offchainreporting/internal/protocol.(*reportGenerationState).observeValue
        /go/pkg/mod/github.com/smartcontractkit/libocr@v0.0.0-20240229181116-bfb2432a7a66/offchainreporting/internal/protocol/report_generation_follower.go:383
github.com/smartcontractkit/libocr/offchainreporting/internal/protocol.(*reportGenerationState).messageObserveReq
        /go/pkg/mod/github.com/smartcontractkit/libocr@v0.0.0-20240229181116-bfb2432a7a66/offchainreporting/internal/protocol/report_generation_follower.go:108
github.com/smartcontractkit/libocr/offchainreporting/internal/protocol.MessageObserveReq.processReportGeneration
        /go/pkg/mod/github.com/smartcontractkit/libocr@v0.0.0-20240229181116-bfb2432a7a66/offchainreporting/internal/protocol/message.go:123
github.com/smartcontractkit/libocr/offchainreporting/internal/protocol.(*reportGenerationState).run
        /go/pkg/mod/github.com/smartcontractkit/libocr@v0.0.0-20240229181116-bfb2432a7a66/offchainreporting/internal/protocol/report_generation.go:147
github.com/smartcontractkit/libocr/offchainreporting/internal/protocol.RunReportGeneration
        /go/pkg/mod/github.com/smartcontractkit/libocr@v0.0.0-20240229181116-bfb2432a7a66/offchainreporting/internal/protocol/report_generation.go:55
github.com/smartcontractkit/libocr/offchainreporting/internal/protocol.(*pacemakerState).spawnReportGeneration.func1
        /go/pkg/mod/github.com/smartcontractkit/libocr@v0.0.0-20240229181116-bfb2432a7a66/offchainreporting/internal/protocol/pacemaker.go:508
github.com/smartcontractkit/libocr/subprocesses.(*Subprocesses).Go.func1
        /go/pkg/mod/github.com/smartcontractkit/libocr@v0.0.0-20240229181116-bfb2432a7a66/subprocesses/subprocesses.go:29 version=2.10.0@0fe6514
2024-04-16T03:31:35.195Z [ERROR] ReportGeneration: DataSource errored               protocol/report_generation_follower.go:383 configDigest=94027e1122b30b47797c20a59633fbce contractAddress=0x6CeA38508B186DE36AAfd0f3B513E708691bc0C4 epoch=73913 error=Number of faulty inputs 1 to median task > number allowed faults 0: too many errors errorVerbose=too many errors
Number of faulty inputs 1 to median task > number allowed faults 0
github.com/smartcontractkit/chainlink/v2/core/services/pipeline.(*MedianTask).Run
        /chainlink/core/services/pipeline/task.median.go:53
github.com/smartcontractkit/chainlink/v2/core/services/pipeline.(*runner).executeTaskRun
        /chainlink/core/services/pipeline/runner.go:472
github.com/smartcontractkit/chainlink/v2/core/services/pipeline.(*runner).run.func1
        /chainlink/core/services/pipeline/runner.go:342
github.com/smartcontractkit/chainlink/v2/core/recovery.WrapRecoverHandle
        /chainlink/core/recovery/recover.go:40
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:1650 externalJobID=d2ed2fcf-1302-487a-ae21-184765c30c1b jobID=3703 jobName=CacheGold PoR USD version 4 contract 0x6CeA38508B186DE36AAfd0f3B513E708691bc0C4 network mainnet leader=0 logger=OCR oid=3 round=4 stacktrace=github.com/smartcontractkit/libocr/offchainreporting/internal/protocol.(*reportGenerationState).observeValue
        /go/pkg/mod/github.com/smartcontractkit/libocr@v0.0.0-20240229181116-bfb2432a7a66/offchainreporting/internal/protocol/report_generation_follower.go:383
github.com/smartcontractkit/libocr/offchainreporting/internal/protocol.(*reportGenerationState).messageObserveReq
        /go/pkg/mod/github.com/smartcontractkit/libocr@v0.0.0-20240229181116-bfb2432a7a66/offchainreporting/internal/protocol/report_generation_follower.go:108
github.com/smartcontractkit/libocr/offchainreporting/internal/protocol.MessageObserveReq.processReportGeneration
        /go/pkg/mod/github.com/smartcontractkit/libocr@v0.0.0-20240229181116-bfb2432a7a66/offchainreporting/internal/protocol/message.go:123

Logs on the psql side:

root@db10:~# tail  /var/log/postgresql/log.log
                        VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
                        RETURNING id;
2024-04-16 02:52:02.749 UTC [3653275] chainlink-ocr@chainlink-ocr ERROR:  cannot execute INSERT in a read-only transaction
2024-04-16 02:52:02.749 UTC [3653275] chainlink-ocr@chainlink-ocr STATEMENT:  INSERT INTO pipeline_runs (pipeline_spec_id, meta, all_errors, fatal_errors, inputs, outputs, created_at, finished_at, state)
                        VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
                        RETURNING id;
2024-04-16 02:52:03.192 UTC [3653275] chainlink-ocr@chainlink-ocr ERROR:  cannot execute INSERT in a read-only transaction
2024-04-16 02:52:03.192 UTC [3653275] chainlink-ocr@chainlink-ocr STATEMENT:  INSERT INTO pipeline_runs (pipeline_spec_id, meta, all_errors, fatal_errors, inputs, outputs, created_at, finished_at, state)
                        VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
                        RETURNING id;
root@db10:~# tail  /var/log/postgresql/log.log
                        VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
                        RETURNING id;
2024-04-16 02:52:04.709 UTC [3653279] chainlink-ocr@chainlink-ocr ERROR:  cannot execute INSERT in a read-only transaction
2024-04-16 02:52:04.709 UTC [3653279] chainlink-ocr@chainlink-ocr STATEMENT:  INSERT INTO pipeline_runs (pipeline_spec_id, meta, all_errors, fatal_errors, inputs, outputs, created_at, finished_at, state)
                        VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
                        RETURNING id;
2024-04-16 02:52:05.361 UTC [3653279] chainlink-ocr@chainlink-ocr ERROR:  cannot execute INSERT in a read-only transaction
2024-04-16 02:52:05.361 UTC [3653279] chainlink-ocr@chainlink-ocr STATEMENT:  INSERT INTO pipeline_runs (pipeline_spec_id, meta, all_errors, fatal_errors, inputs, outputs, created_at, finished_at, state)
                        VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
                        RETURNING id;
  • Network: Ethereum
  • Blockchain Client: geth v1.13.14
  • Go Version: 1.21
  • Operating System: debian bullseye 11.8
  • Commit: Chainlink v2.10.0
  • Hosting Provider: self-hosted k8s + psql behind proxy
  • Startup Command: flags for the Chainlink entrypoint binary:
              - '-s'
              - /home/chainlink/secrets.toml
              - local
              - 'n'
              - '-p'
              - /home/chainlink/credentials/.password
              - '-a'
              - /home/chainlink/credentials/.api
    
    
@rgottleber

Thanks for sharing this and all of the details. We will take a look.

@saram-aman

@rgottleber could you please assign it to me? thanks
