Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request the transaction history from UTXO RPCs only when balance is changed. #612

Closed
artemii235 opened this issue Apr 9, 2020 · 26 comments
Assignees

Comments

@artemii235
Copy link
Member

artemii235 commented Apr 9, 2020

https://github.com/KomodoPlatform/atomicDEX-API/blob/9884f181d8f990f57208484f1fdb20e9ed0e0882/mm2src/coins/utxo.rs#L1485

The tx_history fetching loop starts as background thread to download address transaction history providing instant access from local database.
It might consume a lot of traffic in Electrum case - blockchain-scripthash-get-history method always returns full history of address which might contain a lot of records. As of now this request is repeated every 30 seconds again and again even if there're no new records.

We can easily avoid it by checking balance - the result has way lower size, so in general it'll look like:

Always request history on first loop iteration, also request balance, remember it.
On all next iterations request balance first and then request history only if balance is changed
@artemii235
Copy link
Member Author

Would also nice to measure traffic usage using address with significant number of transactions before making optimizations. Then measure it again after optimization is done and provide results here. It might be done as internal metric collecting the data transferred by ElectrumClient, specific design decisions are up to you, we may discuss them in person when you start working on this task.

@ArtemGr
Copy link

ArtemGr commented Apr 9, 2020

In https://github.com/ca333/komodoDEX/issues/349 we're going to gather such metrics via decentralized gossip. Having the history traffic measurements available to us would be very useful
(and will open the data for SQL analysis, such as comparison between coins and MM versions).

The easiest way to share these measurements with the gossiping UI code is probably to log them.
The log subsystem has a tag format that is intended as a stable medium for UI to parse (and we're using it in AtomicDEX mobile UI to learn of the swap progress, for example)

https://github.com/KomodoPlatform/atomicDEX-API/blob/34dd28dbc82b2f0dd7fcd2d49d920c11b5be3028/mm2src/common/log.rs#L300

https://github.com/KomodoPlatform/atomicDEX-API/blob/34dd28dbc82b2f0dd7fcd2d49d920c11b5be3028/mm2src/common/log.rs#L591-L593

@sergeyboyko0791
Copy link

sergeyboyko0791 commented Apr 17, 2020

There are several implementation questions:

  • How often we should record to log the measure? I added the log recording on each sent and received packet. I offer to record once a minute count of sent bytes. Else we can record on each specified send/received bytes (eg every 5 kibibytes)
  • Should we calculate not only payload bytes but also IP headers?
  • Should only Electrum traffic measure be logged? What about Eth and Native?
  • And also what's the best record structure for the measure? I implemented something like these:
    . 2020-04-16 20:06:06 +0700 [traffic electrum send] 62
    . 2020-04-16 20:06:06 +0700 [traffic electrum receive] 105

What should be the emotion, the tags are right and are useful?

@ArtemGr
Copy link

ArtemGr commented Apr 17, 2020

So the history loading is something that has a beginning, works for some time and then has an end, right? I mean, once we loaded a history for a coin we don't need to do that anymore, or if we do, it's unlikely to take much traffic? If yes, then this sounds like a good fit for a dashboard entry. Dashboard will print the current version of the status periodically and then will print it last final time when the operation is marked as finished (IIRC). Like it does for the state of the swap.

And it was designed specifically to track ongoing continuous operations with minimal interference to the normal tracing / signal-oriented log.

Should have probably mentioned it earlier.

But I offer to record once a minute count of sent bytes sounds a lot like what dashboard does, except it groups these continous operations together, tells us how long an op was active, separates them from the signal-oriented log.

Should we calculate not only payload bytes but also IP headers?

Most probably not. If history loading takes a lot of traffic then we'll be thinking on how to reduce it but we won't be thinking about what IPs it uses. But we should probably add the coin's ticker to the tags.

Should only Electrum traffic measure be logged? What about Eth and Native?

Mobile UIs not using native, but we'll be needing the ETH stats just as much as the UTXO ones.

[traffic electrum send] 62

The parsable data should all be in tags, like [history coin=KMD tx=62 rx=105], and the human-readable part should be human-readable, so the full dashboard entry might look like
[history coin=KMD tx=62 rx=105] Loading KMD history

cf. https://github.com/ca333/komodoDEX/blob/c2ab90a6c4d60df79f8437e966a703fd069ac9cd/lib/services/mm_service.dart#L453

P.S. I think that dashboarding the background operations in MM is a good idea even regardless of the traffic measurement, and to boot it saves us from doing extra periodic requests to MM, but I should probably mention that it's also feasible to implement this by adding the traffic fields to https://developers.atomicdex.io/basic-docs/atomicdex/atomicdex-api.html#my-tx-history ?

@artemii235
Copy link
Member Author

So the history loading is something that has a beginning, works for some time and then has an end, right? I mean, once we loaded a history for a coin we don't need to do that anymore, or if we do, it's unlikely to take much traffic?

History loading repeats over time, we need to fetch it again and again from coin RPC when balance is updated. The balance check is now missing so tx_history_loop fetches history every 30 seconds from Electrum which always returns full transactions history without paging support (that is already mentioned in the initial description of the issue).

How often we should record to log the measure? I added the log recording on each sent and received packet. I offer to record once a minute count of sent bytes. Else we can record on each specified send/received bytes (eg every 5 kibibytes)

As Artem mentioned you may create a status similar to https://github.com/KomodoPlatform/atomicDEX-API/blob/7d10c66c033ae4e6cfccaea318e2bddc3db7ea35/mm2src/lp_swap/maker_swap.rs#L1018
https://github.com/KomodoPlatform/atomicDEX-API/blob/7d10c66c033ae4e6cfccaea318e2bddc3db7ea35/mm2src/lp_swap/maker_swap.rs#L1036
So it will be periodically printed to log unless status handle is dropped.

But what I'm unsure of: we may have dozens of active coins, won't the log be overloaded with dozens of periodical messages since history fetching loop actually never ends? E.g.

[history coin=KMD tx=62 rx=105]
[history coin=RICK tx=62 rx=105]
[history coin=MORTY tx=62 rx=105]
...
[history coin=BTC tx=62 rx=105]

Should we calculate not only payload bytes but also IP headers?

IP headers size calculation is not mandatory, it's negligible overhead.

Should only Electrum traffic measure be logged? What about Eth and Native?

It will be also nice, we can extend this idea not only to coin RPC clients traffic measurement, we actually need other metrics, e.g. ordermatching protocol traffic usage, and these metrics might be not only network related. We possibly need an abstraction for such purpose, there're also metrics and more widely used prometheus crate. I think it might be nice feature to integrate a widely used OS solution for metrics collection, it already has pretty visualization capabilities.
But we should be sure that our metrics collection works without the requirement to setup 3rd party solution server.

@ArtemGr
Copy link

ArtemGr commented Apr 17, 2020

History loading repeats over time, we need to fetch it again and again from coin RPC when balance is updated. The balance check is now missing so tx_history_loop fetches history every 30 seconds from Electrum which always returns full transactions history without paging support

I see. Nasty! Thanks for the reminder!
Yes, printing this for all the coins every 30 seconds is not what I had in mind, but printing a sum
[t-history tx=62 rx=105] Loading coin transaction history
via dashboard might work and will inform the MM users of this recurring activity, its timing and network overhead.

The metrics crate is rad.

@sergeyboyko0791
Copy link

As far as I understand the metrics crate suits better for our purposes than the prometheus. The first one doesn't require a 3rd party server and allows to easily implement a custom metrics exporter. Eg records the metrics via dashboard.

But what I'm unsure of: we may have dozens of active coins, won't the log be overloaded with dozens of periodical messages since history fetching loop actually never ends? E.g.

The best way I see is collect metrics and record it every specified time (eg 5 min or less for the tests).
It'll allow record the history traffic value of each coin and to avoid the log overload.

[history coin=KMD tx=62 rx=105] Loading KMD history

Recording the log entry above may be little difficult, because the count of request and response bytes should be passed from corresponding connection loop to the RPC client API. But it may be useful later, not only for measuring the history request traffic. If it is planned in the future, the way is suitable for this. Else I can add other history metrics:

[history coin=KMD tx_requested=100 tx_detail_requested=8] Loading KMD history

where the tx_requested is size of tx history list (the list contains just tx hashes) received over the time,
the tx_detail_requested is count of tx detail received over the time for the history purposes.

I also suggest to measure and record all RPC client traffic.

@ArtemGr
Copy link

ArtemGr commented Apr 20, 2020

A couple of ideas.

  • AFAIK, we are running a lot of Electrum servers. For those servers, can't we run a proxy nearby, on an HTTP port, that would allow the MM instances to fetch blockchain.scripthash.get_history from a given local Electrum in reverse order?
    Using HTTP means the communication will be more reliable on mobile operators (some of which aren't as friendly to non-HTTP traffic). And with (brotli) compression we can save even more traffic.
    Using reverse order (and/or explicit “stopat” paging) means that MM can skip the history it already has.
  • Maybe implement this as a caretaker option, making it easier to deploy caretakers alongside electrum servers?
  • Or just skip the Electrum and reimplement the blockchain.scripthash.get_history by talking with the (same) native wallet?

cc @cipig

@cipig
Copy link
Member

cipig commented Apr 20, 2020

I would like to avoid running additional stuff on the electrums, if that is possible.
Have we already tried https://electrumx.readthedocs.io/en/latest/protocol-methods.html#blockchain-scripthash-subscribe ?
Do we really need to query the tx_history all the time? It is not needed to do swaps, just to show txes in the app. Can we get this info from an explorer instead? The explorers are designed for such requests.

@artemii235
Copy link
Member Author

@cipig
I agree, Electrum HTTP proxy is not mandatory.

Using HTTP means the communication will be more reliable on mobile operators (some of which aren't as friendly to non-HTTP traffic)

AFAIK we didn't get any reports that Electrum connectivity is getting blocked by any provider.

Can we get this info from an explorer instead? The explorers are designed for such requests.

Sure, we may integrate block explorers API too in the future (e.g. Insight).

Have we already tried https://electrumx.readthedocs.io/en/latest/protocol-methods.html#blockchain-scripthash-subscribe ?
Do we really need to query the tx_history all the time?

As I can see from docs blockchain-scripthash-subscribe notifies us about status update which is actually sha256 of scripthash transactions list, so we will need to request tx_history anyway to understand what exactly has changed.

And in summary the idea of such tx history fetching from Electrum was to avoid dependency on 3rd party blockchain explorer services. So to have history you need to run Electrum open source server only which has standardized API.
Whereas each block explorer may have it's own API, some of them are closed source, may limit requests to them etc.

@ArtemGr
Copy link

ArtemGr commented Apr 20, 2020

AFAIK we didn't get any reports that Electrum connectivity is getting blocked by any provider.

And we won't, until our network is larger.
By which time redesigning this will be much harder.

We know such providers exist though.
For example: "Claro is choking access to TCP/UDP ports because of paranoid policies on their NAT" - https://wiki.vuze.com/w/Bad_ISPs.
The two-layered decentralized technology we're developing with caretakers is consequently designed to prefer HTTP, which also benefits future WASM ports and simplifies our mobile and web stacks.

I would like to avoid running additional stuff on the electrums, if that is possible.

I take it the tx history information is public and hence we can proxy it from any server, not necessarily from the electrum ones?

Can we get this info from an explorer instead? The explorers are designed for such requests.

If the transaction history happens to take much traffic then we'll have to redesign it per https://github.com/ca333/komodoDEX/issues/349.
One of the possible routes there is to opt out of the MM history retrieval and switch to explorers or to caretaker proxies.
Which explorers should we look at?

@artemii235
Copy link
Member Author

artemii235 commented Apr 21, 2020

One of the possible routes there is to opt out of the MM history retrieval and switch to explorers or to caretaker proxies.
Which explorers should we look at?

There was a request some time ago to use open-source Insight explorer API as 1 of UTXO coins operational mode, it's in backlog as of now, it will be implemented later - #618

AFAIK we didn't get any reports that Electrum connectivity is getting blocked by any provider.
And we won't, until our network is larger.
By which time redesigning this will be much harder.
We know such providers exist though.
For example: "Claro is choking access to TCP/UDP ports because of paranoid policies on their NAT" - https://wiki.vuze.com/w/Bad_ISPs.
The two-layered decentralized technology we're developing with caretakers is consequently designed to prefer HTTP, which also benefits future WASM ports and simplifies our mobile and web stacks.

Electrum servers can use 80/443 ports, Electrums support WS protocol which is fine for pure browser environments. So even if all non-HTTP ports are blocked we can solve the problem by simple reconfiguration. Users also can use TOR proxy or VPNs to mitigate the problem. And I would like to deal with 1 thing at time, this issue is about history fetching optimization with Electrum operational mode and corresponding metrics collection, not about Electrum can be potentially blocked.

@artemii235
Copy link
Member Author

artemii235 commented Apr 21, 2020

And what is also important: we're not only persons using Electrum. Electrum based wallets and server is popular software that is used by millions people to manage their coins, I guess if there were any massive blocking attempts there would be some news about that, however I'm not able to find such reports by searching isp block electrum servers or isp blocks electrum. Some people report of course that their wallet won't connect but there's no proof that such connection is blocked by their ISP.

@ArtemGr
Copy link

ArtemGr commented Apr 21, 2020

I cede you the point, Artem. In the realm of cryptocurrency, Electrum is among the more used technologies, it is Firewall friendly, using the 80/443 ports, and we might very well follow its pattern.

(Speaking of which, is there already an issue to use Electrum ports 80/443 in MM?)

And it follows that we don't need to proxy Electrum in its entirety.

But Electrum pagination is a known problem which brings us back to the issue at hand. Any thoughts on proxying Electrum tx history in order to reverse, paginate and compress it?

P.S.

I guess if there were any massive blocking attempts there would be some news about that

Human behavior follows certain patterns or established social habits ("rules of fair play" per Jordan Peterson). One of such patterns is that people talk about providers in the forums that discuss the providers. It is exceedingly rare to see provider gossip overflowing into subject of specific technologies and their implementation.
Then again, as software designers we might need to be a bit smarter than that.

Note also, that we're not talking so much about blocking but about traffic prioritization, which is actually hard to detect, sneaky.

And if something does exists, it does not follow that you (or any of us) have heard about it. That would be a false premise and I sincerely advise you to use a better logic.

@artemii235
Copy link
Member Author

(Speaking of which, is there already an issue to use Electrum ports 80/443 in MM?)

Not yet, it requires additional DNS configuration, current electrum servers addresses maintained by cipi follow the electrum1-3.cipig.net:PORT pattern, it should be changed to e.g. kmd.electrum1.cipig.net:80 etc.
@cipig is it possible to add such DNS for electrums keeping old addresses available too?

And if something does exists, it does not follow that you (or any of us) have heard about it. That would be a false premise and I sincerely advise you to use a better logic.

Ì'm not thinking that if I didn't heard about something it doesn't exist, I'm making an assumption that if there were any massive (which is very important word here) block it's highly probable that someone would report this. But of course, "Bad ISP DB" reporting some providers tend to block any non-HTTP ports is sufficient argument to prepare for this and use 80/443 ports.

It is exceedingly rare to see provider gossip overflowing into subject of specific technologies and their implementation.

Vuse "bad ISPs DB" is example of such overflowing 🙂 And there's similar community supported list by TOR. Maybe something similar already exists in context of blockchain-related projects. Or maybe we will create something by ourselves sometime 🙂

@cipig
Copy link
Member

cipig commented Apr 22, 2020

Not yet, it requires additional DNS configuration, current electrum servers addresses maintained by cipi follow the electrum1-3.cipig.net:PORT pattern, it should be changed to e.g. kmd.electrum1.cipig.net:80 etc.
@cipig is it possible to add such DNS for electrums keeping old addresses available too?

That will not work, since all coins are running on the same server (or 3 of them), and each server has only one port 80.

sergeyboyko0791 added a commit that referenced this issue Apr 22, 2020
* Add `metrics` crate dependency
* Implement custom observer and exporter for the MM logging purposes
* Add unit tests
@artemii235
Copy link
Member Author

But Electrum pagination is a known problem which brings us back to the issue at hand. Any thoughts on proxying Electrum tx history in order to reverse, paginate and compress it?

@ArtemGr I think It's not a big issue if we fetch the history only when balance is changed.
Also there's a protocol idea to add method that will allow to fetch history from specific block height: https://electrumx.readthedocs.io/en/latest/protocol-ideas.html#blockchain-scripthash-history, so when this is implemented we will save the block number for which we have the history already and request only new transactions afterwards.
Even if this method is not implemented we can add it by ourselves to at least cipi ElectrumX fork, update our electrums and then submit a PR to the base repo (hoping it will be accepted so more coins will support it).
But before taking such actions I would like to start collecting metrics first and then base our decisions on them.

@ArtemGr
Copy link

ArtemGr commented Apr 23, 2020

Good idea on collecting the metrics!
Having them will be a nice addition to the "order liveliness" package.
I think we can get them already by parsing the logs, such as

00:00:16.760 mm_service:421] mm2] 19 21:00:16, rpc_clients:1261] Electrum client connected to electrum3.cipig.net:10001
00:01:16.764 mm_service:421] mm2] 19 21:01:16, rpc_clients:1296] "electrum3.cipig.net:10001" error "rpc_clients:1218] Didn\'t receive any data since 1587330016. Shutting down the connection."
00:01:16.807 mm_service:421] mm2] 19 21:01:16, rpc_clients:1261] Electrum client connected to electrum3.cipig.net:10001

added to my list.

@sergeyboyko0791
Copy link

sergeyboyko0791 commented Apr 30, 2020

@ArtemGr I want to do sort the tags while converting &[&TagParam] to Vec<Tag>.
https://discordapp.com/channels/@me/687147433003974657/705371359307497513

This comment is two years old. Can you say is it important to keep tags unordering? if I added tags sorting, UI applications would work correctly?

@ArtemGr
Copy link

ArtemGr commented Apr 30, 2020

Might affect https://github.com/ca333/komodoDEX/blob/e3a1d2ab39edb7719ee58f829d11a910277f1705/lib/services/mm_service.dart#L453, but it's NP as currently the UI is bundled with a specific version of MM, so we'll just update the UI code if the ordering of swap tags is flipped.

artemii235 pushed a commit that referenced this issue May 19, 2020
* Add and implement metrics module #612
* Add `metrics` crate dependency
* Implement custom observer and exporter for the MM logging purposes
* Add unit tests

* Refactor the metrics module to use it in MmCtx
* Make Metrics::receiver the Constructable<Receiver>
* Add metrics: mm_metrics::Metrics field to the MmCtx
* Add MmArc::init_metrics() method

* Add the mm_metrics using to measure rpc electrum traffic
* Add ctx: MmWeak as field to ElectrumClientImpl
* Pass ctx: MmWeak to connect_loop

* Add traffic measurement for NativeClient

* Incapsulate transport traffic metrics into TransportMetrics trait
* Create CoinTransportMetrics implemented TransportMetrics trait
* Replace the mm_counter! using within electrum's connect_loop and NativeClient::transport

* Replace Arc<dyn TransportMetrics> to Box<dyn TransportMetrics>
* Add clone_into_box method to TransportMetrics to clone Box's content

* Add traffic measurement for Web3Transport - etomic coin's rpc client

* Replace ctx: MmWeak by log_state: LogArc within Metrics, change dashboard formatting
* Add LogArc, LogWeak like to MmArc, MmWeak
* Replace log: LogState by log: LogArc within MmCtx
* Record metrics to dashboard with the same labels in one line

* Make the TransportMetrics more abstract
* Rename TransportMetrics to RpcTransportEventHandler
* Replace Box<Trait> by Arc<Trait> to avoid excess dynamic memory allocation
* Replace one shared handler by vector of shared handlers within RPC clients

* Add `metrics` method to MarketMaker
* Add method
* Implement custom JsonObserver and use it to collect metrics in Json

* Add tx_history metrics collection to ETH, ERC20, NATIVE, ELECTRUM tx_history
* Add metrics
* Add an integration test to check request the transaction history from UTXO RPCs only when balance is changed
* Fix warnings

* Change metrics format
* Replace metrics from tags to message
* Remove tags sorting

* Request the transaction history from Electrum RCP only when balance is changed #612

* Add prometheus HTTP exporter

* Add prometheus graceful shutdown on MM context dropped

* Fix merge conflicts

* Implement Debug for RpcTransportEventHandler trait

* Add ability to disable metrics recording to log
@artemii235
Copy link
Member Author

Implemented, closing.

@ArtemGr
Copy link

ArtemGr commented May 28, 2020

Any hints on how we can use the new measurement functionality to measure MM traffic?

@artemii235
Copy link
Member Author

The documentation has been published recently, I think recommended way for GUIs is to call API and parse the metrics from JSON: https://developers.komodoplatform.com/basic-docs/atomicdex/atomicdex-tutorials/atomicdex-metrics.html#api-calling

@ArtemGr
Copy link

ArtemGr commented May 28, 2020

Awesome, thanks!

@sergeyboyko0791
Copy link

@ArtemGr, I decided not to reorder the dashboard tags. I hope you haven't changed GUI's code.

@ArtemGr
Copy link

ArtemGr commented May 28, 2020

@sergeyboyko0791, Nope, everything's intact. Appreciate the update.
P.S. Looked at the PRs, but there was too much for me to parse D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants