Monero: fix decoy selection algo and add test for latest spendable #384

j-berman · 2023-09-29T01:05:49Z

DSA only selected coinbase outputs and didn't match the wallet2 implementation
Added test to make sure DSA will select a decoy output from the most recent unlocked block
Made usage of "height" in DSA consistent with other usage of "height" in Monero code (height == num blocks in chain)
Rely on monerod RPC response for output's unlocked status

Could use another round after this PR to verify the DSA matches wallet2 in its entirety.

Planning to use wallet2 to generate 100k decoys at a specific height on mainnet, output the selections to a CSV file, then do the same with the monero-serai DSA, and make sure the height distributions align. (DONE: #384 (comment))

kayabaNerve · 2023-10-09T00:11:38Z

coins/monero/src/rpc/mod.rs

-    }
-
+  /// Get the specified outputs from the RingCT (zero-amount) pool
+  pub async fn get_outs(&self, indexes: &[u64]) -> Result<Vec<OutputResponse>, RpcError> {


This breaks consensus. This will return if the outputs are unlocked by the current view when we need to check they're unlocked at the specified height.

We could add an argument by_height: Option<usize>, triggering a secondary pass, which restores in monero-serai checks as needed for consensus?

Though this would need to error if by_height < curr_height, else the first-pass may prune outputs unlocked by by_height.

We also need documentation on how the second pass is incomplete (not handling time-based timestamps).

Actually, we can't do that as it has a race condition.

If we check by_height < before-hand, it could reorg before our get_outs RPC call.

If we check it after, it could add the new block after we call get_outs.

If we check it both before and after, it could reorg to a shorter chain, then re-org to a longer chain.

We have to decide a single strategy regarding if we use Monero's unlocked or our own deterministic unlocked.

It looks like with current code, if there's a reorg, it could similarly break consensus because the output distributions used to select decoys could be inconsistent. It seems the primary way to solve this issue is for the Serai nodes to lean on block hash + block height and be reorg resistant external to this RPC.

How about a strategy along these lines: the Serai processor keeps track of the chain tip's block hash and height before it starts to construct a tx, then at some point later in tx construction after decoys are fetched, makes sure that the block hash and height remain expected (and handles reorgs accordingly)?

From an API perspective, I would think Serai-specific consensus issues would ideally be handled external to monero-serai (e.g. avoid extra round trips from client to daemon to handle Serai-specific consensus issues).

Uhhhh

The current code only breaks if it reorgs, or is inconsistent, at the specified height.

The new code is inconsistent if the tip is inconsistent.

If we apply the invariant of the specified height being stable, then my commentary is invalid. We'd have to check by_height <= current_height before-hand, to catch that error case, yet then we can do a secondary pass successfully. Sorry for not noticing the current code defined an invariant, then saying how my suggestion breaks if the invariant is broken.

As for policy, we need an RPC offering a consistent API under the defined invariant (operated on blocks are stable). So long as that is met, libraries should be as ergonomic to their projects as possible. That doesn't change this PR, as-is, is fundamentally unviable for Serai. No amount of external code will protect Serai against the issue raised (differing results based on tip block).

Also, the Serai processor already does check for lack of re-orgs as detectable.

In the latest, I created 2 decoy selection functions exposed by the API:

Decoys::select - relies on monerod's get_outs RPC response for outputs' unlocked status. Note that Serai cannot use this function because Serai requires a consistent view of unlocked outputs at a specific height across multiple daemons. This function is useful for normal wallets that point to a single daemon.

Decoys::fingerprintable_canonical_select - does extra trips to the monerod RPC to make sure that outputs are unlocked by a specific height. Does not support selecting outputs locked with a timestamp, which is why any ring with decoys that include timestamped timelocks cannot have been constructed using this function and is thus "fingerprintable"-ish. Serai can use this function because it ensures that so long as no reorg has occurred, all parallel Serai nodes will have a consistent view of outputs' unlocked statuses.

kayabaNerve · 2023-12-01T05:47:40Z

Please let me know when this is ready for review :)

j-berman · 2023-12-01T07:22:07Z

It's ready

j-berman · 2023-12-27T20:11:05Z

Using mainnet, I tested a sample of 1mn outputs from the DSA in this PR, against this reference implementation from the monero repo (monero-project/monero#9024) and found that the distributions matched.

I ran a two-sample Kolmogorov–Smirnov Test as described in that PR (and with the help of @Rucknium to validate the approach) and found a p-value of 0.645, which is greater than 0.05, which suggests the distributions match.

Here is a pretty chart that visually shows they appear to match:

Here's the code I used to get a sample from this PR's DSA (WARNING: using this code generates a sample that excludes the real output and can therefore reveal a received output, so careful with sharing it):

use rand_core::OsRng;

use monero_serai::wallet::{ViewPair, Scanner, Decoys, SpendableOutput, address::SubaddressIndex};

use zeroize::Zeroizing;

use std_shims::collections::HashSet;

use curve25519_dalek::{Scalar, edwards::CompressedEdwardsY};

use csv::Writer;

mod runner;

async_sequential!(
  async fn mainnet_decoys() {
    // CONSTANTS
    let tx_hash_bytes =
      hex::decode("TEST")
        .unwrap()
        .try_into()
        .unwrap();
    let spend_pub_bytes =
      hex::decode("TEST")
        .unwrap()
        .try_into()
        .unwrap();
    let view_priv_bytes =
      hex::decode("TEST")
        .unwrap()
        .try_into()
        .unwrap();

    let csv_file_path = "./monero_serai_decoy_selections.csv";

    let num_picks = 1_000_000;
    // END CONSTANTS

    let spend_pub = CompressedEdwardsY(spend_pub_bytes).decompress().unwrap();
    let view_priv = Zeroizing::new(Scalar::from_canonical_bytes(view_priv_bytes).unwrap());
    let view = ViewPair::new(spend_pub, view_priv);

    let mut scanner = Scanner::from_view(view, Some(HashSet::new()));

    let rpc = runner::rpc().await;

    let tx = rpc.get_transaction(tx_hash_bytes).await.unwrap();
    let output = scanner.scan_transaction(&tx).not_locked().swap_remove(0);
    let spendable_output = SpendableOutput::from(&rpc, output).await.unwrap();

    let protocol = rpc.get_protocol().await.unwrap();
    let height = rpc.get_height().await.unwrap();

    // Make selections and write to csv file
    let mut wtr = Writer::from_path(csv_file_path).unwrap();

    #[derive(serde::Serialize)]
    struct Row {
      global_index: u64,
      height: usize,
      age: usize,
    }

    let mut iters = 0;
    let mut done = false;
    while !done {
      let decoys = Decoys::select(&mut OsRng, &rpc, protocol.ring_len(), height, &[spendable_output.clone()])
        .await
        .unwrap();

      let indexes = decoys[0].indexes();
      let output_heights: Vec<usize> = rpc
        .get_outs(&indexes)
        .await
        .unwrap()
        .iter()
        .map(|o| o.height)
        .collect::<Vec<_>>();

      for i in 0 .. indexes.len() {
        let global_index = indexes[i];
        if global_index != spendable_output.global_index {
          let output_height = output_heights[i];

          wtr.serialize(Row {
            global_index: global_index,
            height: output_height,
            age: height - output_height,
          }).unwrap();

          iters += 1;
          if iters == num_picks {
            done = true;
            break;
          } else if iters % 1000 == 0 {
            println!("{iters} / {num_picks}");
          }
        }
      }
    }

    wtr.flush().unwrap();

    assert_eq!(height, rpc.get_height().await.unwrap());
  }
);

Here's the python I used to run the KS test:

import csv
from scipy.stats import kstest

def get_global_indexes(filename):
    res = []
    with open(filename, 'r') as csvfile:
        rows = csv.reader(csvfile, delimiter = ',')

        idx = 0
        for row in rows:
            if idx > 0:
                res.append(int(row[0]))

            idx += 1

    return res

reference_indices = get_global_indexes("python_decoy_selections_only_unlocked.csv")
monero_serai_indices = get_global_indexes("monero_serai_decoy_selections.csv")

print(kstest(reference_indices, monero_serai_indices))

- DSA only selected coinbase outputs and didn't match the wallet2 implementation - Added test to make sure DSA will select a decoy output from the most recent unlocked block - Made usage of "height" in DSA consistent with other usage of "height" in Monero code (height == num blocks in chain) - Rely on monerod RPC response for output's unlocked status

Makes it simpler for callers who are unconcered with consistent canonical output selection across multiple clients to rely on the simpler Decoy::select and not worry about fingerprintable canonical

The RingCT distribution on mainnet doesn't start until well after genesis, so the distribution length is expected to be < height. To be clear, this was my mistake from this series of changes to the DSA. I noticed this mistake because the DSA would error when running on mainnet.

) * Monero: fix decoy selection algo and add test for latest spendable - DSA only selected coinbase outputs and didn't match the wallet2 implementation - Added test to make sure DSA will select a decoy output from the most recent unlocked block - Made usage of "height" in DSA consistent with other usage of "height" in Monero code (height == num blocks in chain) - Rely on monerod RPC response for output's unlocked status * xmr runner tests mine until outputs are unlocked * fingerprintable canoncial select decoys * Separate fingerprintable canonical function Makes it simpler for callers who are unconcered with consistent canonical output selection across multiple clients to rely on the simpler Decoy::select and not worry about fingerprintable canonical * fix merge conflicts * Put back TODO for issue #104 * Fix incorrect check on distribution len The RingCT distribution on mainnet doesn't start until well after genesis, so the distribution length is expected to be < height. To be clear, this was my mistake from this series of changes to the DSA. I noticed this mistake because the DSA would error when running on mainnet.

kayabaNerve reviewed Oct 9, 2023

View reviewed changes

This was referenced Oct 10, 2023

Monero decoy selection algorithm is broken #266

Closed

Write a test the decoy selection can sample from all possible inputs on chain #265

Closed

j-berman force-pushed the xmr-decoy-off-by-1-test branch from 33bc227 to 438afa7 Compare November 30, 2023 09:57

j-berman mentioned this pull request Nov 30, 2023

Support time-based Monero timelocks #104

Closed

j-berman added 7 commits February 19, 2024 18:01

xmr runner tests mine until outputs are unlocked

3983019

fingerprintable canoncial select decoys

0f39d82

Separate fingerprintable canonical function

7bea197

Makes it simpler for callers who are unconcered with consistent canonical output selection across multiple clients to rely on the simpler Decoy::select and not worry about fingerprintable canonical

fix merge conflicts

a48dca3

Put back TODO for issue serai-dex#104

19973f9

j-berman force-pushed the xmr-decoy-off-by-1-test branch from 77095f7 to 733e6c9 Compare February 20, 2024 02:12

kayabaNerve merged commit 92d8b91 into serai-dex:develop Feb 20, 2024
13 of 18 checks passed

Rucknium mentioned this pull request Oct 17, 2024

Monero Research Lab Meeting - Wed 16 October 2024, 17:00 UTC monero-project/meta#1092

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Monero: fix decoy selection algo and add test for latest spendable #384

Monero: fix decoy selection algo and add test for latest spendable #384

j-berman commented Sep 29, 2023 •

edited

Loading

kayabaNerve Oct 9, 2023

kayabaNerve Oct 9, 2023 •

edited

Loading

kayabaNerve Oct 9, 2023

j-berman Oct 10, 2023

kayabaNerve Oct 10, 2023 •

edited

Loading

j-berman Nov 30, 2023

kayabaNerve commented Dec 1, 2023

j-berman commented Dec 1, 2023

j-berman commented Dec 27, 2023 •

edited

Loading

Monero: fix decoy selection algo and add test for latest spendable #384

Monero: fix decoy selection algo and add test for latest spendable #384

Conversation

j-berman commented Sep 29, 2023 • edited Loading

kayabaNerve Oct 9, 2023

Choose a reason for hiding this comment

kayabaNerve Oct 9, 2023 • edited Loading

Choose a reason for hiding this comment

kayabaNerve Oct 9, 2023

Choose a reason for hiding this comment

j-berman Oct 10, 2023

Choose a reason for hiding this comment

kayabaNerve Oct 10, 2023 • edited Loading

Choose a reason for hiding this comment

j-berman Nov 30, 2023

Choose a reason for hiding this comment

kayabaNerve commented Dec 1, 2023

j-berman commented Dec 1, 2023

j-berman commented Dec 27, 2023 • edited Loading

j-berman commented Sep 29, 2023 •

edited

Loading

kayabaNerve Oct 9, 2023 •

edited

Loading

kayabaNerve Oct 10, 2023 •

edited

Loading

j-berman commented Dec 27, 2023 •

edited

Loading