Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stakepoold access to MySQL db data without needing to provide MySQL connection params on all stakepoold instances #495

Open
itswisdomagain opened this issue Aug 21, 2019 · 12 comments

Comments

@itswisdomagain
Copy link
Member

itswisdomagain commented Aug 21, 2019

Currently, stakepoold accesses the MySQL db running on the dcrstakepool web server to read user voting preferences and added low fee tickets. This db read op is performed on 2 occasions: when stakepoold starts up and every couple minutes (if stakepoold gRPC server is disabled).

Assuming that stakepoold gRPC server is enabled (shouldn't it always?), we can add a stakepoold gRPC method to supply the above data to all connected stakepoold backends when dcrstakepool starts.

This has the advantage of not needlessly exposing the MySQL db to external IPs (considering that the db would typically sit on the web server, can just keep access to the db local to the running dcrstakepool instance).

It also simplifies the process for setting up a vsp, as vsp admins would have 4 fewer stakepoold config fields to worry about when setting up their backends.

PS: The implication is that the stakepoold norpclisten config option will have to be deprecated. dcrstakepool will currently not even start if the stakepoold backends do not have their rpc servers enabled.

@JoeGruffins
Copy link
Member

Might as well move to boltdb then.

@jholdstock
Copy link
Member

As mentioned on matrix, supplying config through RPC will always get a firm No from me. Servers should be able to run in isolation and should not depend on config values being passed in via RPC from another server.

@itswisdomagain
Copy link
Member Author

itswisdomagain commented Aug 21, 2019

The data I'm proposing to supply through RPC are not config values, they're actually currently being passed via RPC when the data changes in the database, but not when dcrstakepool starts up.

err = controller.Cfg.StakepooldServers.SetAddedLowFeeTickets(votableLowFeeTickets)

err = controller.Cfg.StakepooldServers.SetUserVotingPrefs(allUsers)

@JoeGruffins
Copy link
Member

I agree with @itswisdomagain This is a good idea. We don't have to expose the db, and it simplifies things. It paves the way for maybe a db change in the future.

@itswisdomagain
Copy link
Member Author

itswisdomagain commented Sep 24, 2019

@jholdstock what's your thought on this? I agree with @JoeGruffins that we could just use the opportunity to migrate away from postgres.

@itswisdomagain
Copy link
Member Author

Based on recent conversation on matrix, particularly @jholdstock's message here, it makes sense to continue to allow stakepoold direct access to MySQL, to enable stakepoold run as independently from dcrstakepool as possible.

@chappjc
Copy link
Member

chappjc commented Mar 26, 2020

I'm surprised we're changing course from this and #134. The matrix link isn't working so I don't know what was discussed.

@chappjc
Copy link
Member

chappjc commented Mar 26, 2020

stakepoold will never run independently of dcrstakepool, but it can be made to run without connecting to the MySQL server, which only dcrstakepool should talk to.

@chappjc
Copy link
Member

chappjc commented Mar 26, 2020

After accessing the matrix chat from then, my takeaways are:

  • We would like stakepoold not to access mysql.
  • We also want to support a stakepoold mode where dcrstakepool doesn't know about it (norpclisten). Maybe that can be reconsidered though, either by eliminating norpclisten and flagging certain wallet/stakepoold instances as optional in dcrstakepool config, or by somehow supporting retrieval of the user voting data by another mechanism such as stakepoold-originating requests.

What other solutions would allow a stakepoold/dcrwallet instance to run without having it configured in dcrstakepool?

@JoeGruffins JoeGruffins reopened this Mar 26, 2020
@JoeGruffins
Copy link
Member

I just want to give some information about the current structure of dcrstakepool and the complications of running all data through dcrstakepool as appose to having stakepoold fetch their own data.

Currently stakepoold need information in the sql database to function. That data is user voting configurations, to vote properly, and redeem scripts to check that wallet has what it needs to vote.

There are two ways in which stakepoold can optain this information:

stakepoold === mysql

or:
stakepoold --- dcrstakepool === mysql

Point 1:
Having stakepoold talk to mysql cuts out the middleman, making for less code, and a more straightforward system.

Point 2:
Notice that the stakepoold/dcrstakepool connection is only one line. This means that, currently, stakepoold can't request data from dcrstakepool, only respond to requests. It saves the last known data on shutdown and reads that file on start-up. Not only does that mean that stakepoold will always start with old data (which could cause it to miss-vote, which in my opinion is unacceptable), but it means that stakepoold is basically a brick until dcrstakepool is started at least once to send it the initial data it needs to operate.

Another rpc server could be added to dcrstakepool to alleviate this, but that opens another port on dcrstakepool, and complicates the work flow.

Point 3:
I think that in the future stakepoold will rely even more on fresh data, #542 and so the complexity of maintaining the stakepoold --- dcrstakepool data connection will also increase.

Point 4:
@isuldor #586 is working on setting up mysql to run over tls. After that is in, I really can't see the argument against opening these ports. I'm not an expert on internet security though.

Point 5:
Stakepoold has already been using the mysql db (since birth?) without problems.

@itswisdomagain
Copy link
Member Author

I agree with @chappjc. We do not want stakepoold to connect directly to the db.

The question of whether or not to allow stakepoold run independently of dcrstakepool is also very valid.

Consider this scenario:
1 dcrstakepool instance and 3 stakepoold instances are actively running.
1 of the stakepoold instances is not connected to dcrstakepool. On startup, that instance downloads user config from the database.
Sometime later, a user updates his voting preferences. Another user signs up and a multisig is generated for the user. These updates don't get to the independent stakepoold instance realtime. If this instance were to vote, it could vote using an outdated user pref. And completely not vote for the newly added user.

Agreed, stakepoold (in independent mode), periodically queries the db for updates but why shouldn't it just open up a gRPC port to get updates from dcrstakepool realtime?

On the other hand, it makes sense to be able to startup stakepoold if there's no actively running dcrstakepool instance (but why wouldn't there be though?). Having stakepoold able to vote on user prefs stored in the db makes sense in this scenario.

In any case, even if we decide to allow independent stakepoold runs, it still will be preferred to give stakepoold read-only access to the data it needs, rather than direct access to the database.

I do feel the following flow would be better:

  • stakepoold should not run without establishing a connection to dcrstakepool (and should shutdown/become inactive if the connection is lost)
  • whenever a stakepoold instance connects to dcrstakepool, dcrstakepool should send the user pref data to the newly connected stakepoold instance rather than having stakepoold connect to the db to get this data
  • when new data is sent to the db via dcrstakepool web or API, dcrstakepool should send updates to all connected stakepoold instances rather than having the stakepoold instances each periodically download updates directly from the db

@JoeGruffins
Copy link
Member

JoeGruffins commented Mar 30, 2020

Is the specific security issue that if any one stakepoold were compromised an attacker would have write permissions to the database?

If so could giving stakepoold a read-only db user be viable solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants