Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize account lookup by auth-addr #8

Merged
merged 1 commit into from
Sep 21, 2024
Merged

Conversation

agodnic
Copy link

@agodnic agodnic commented Sep 20, 2024

Summary

Tracking issue: #5

This pull request adds a partial index to the CRDB schema, in order to optimize the performance of account lookups by the auth-addr parameter. It wasn't necessary to modify the indexer SQL query for this index to be used.

CREATE INDEX ON account ((account_data->>'spend'))
WHERE ((account_data->>'spend') IS NOT NULL)

Note that the index expression intentionally avoids using the expression NOT account.deleted, so that it can also be used when the include-deleted=true parameter is passed in the REST API.

The same kind of index can be created in a PostgreSQL schema, with similar results.

Tests in a local CockroachDB light indexer

deleted=true
EXPLAIN ANALYZE                                                                                                                                                                   
                            -> WITH qaccounts AS (                                                                                                                                                               
                            ->     SELECT a.addr, a.microalgos, a.rewards_total, a.created_at, a.closed_at, a.deleted, a.rewardsbase, a.keytype, a.account_data FROM account a                                   
                            ->     WHERE a.account_data ->> 'spend' = 'Z+k/psisXIGZkKpzQMDBa1CKuxF4vpsw0CS4rCUZPUU='                                                                                             
                            ->     ORDER BY a.addr ASC                                                                                                                                                           
                            ->     LIMIT 100                                                                                                                                                                     
                            -> )                                                                                                                                                                                 
                            ->     SELECT za.addr, za.microalgos, za.rewards_total, za.created_at, za.closed_at, za.deleted, za.rewardsbase, za.keytype, za.account_data                                         
                            ->     FROM qaccounts za                                                                                                                                                             
                            ->     ORDER BY za.addr ASC                                                                                                                                                          
                            -> ;                                                                                                                                                                                 
                            ->                                                                                                                                                                                   
                                                          info
------------------------------------------------------------------------------------------------------------------------
  planning time: 827µs
  execution time: 21ms
  distribution: local
  vectorized: true
  plan type: custom
  rows decoded from KV: 200 (28 KiB, 3 gRPC calls)
  cumulative time spent in KV: 20ms
  maximum memory usage: 207 KiB
  network usage: 0 B (0 messages)
  sql cpu time: 231µs
  isolation level: serializable
  priority: normal
  quality of service: regular

  • render
  │
  └── • index join (streamer)
      │ sql nodes: n1
      │ kv nodes: n1
      │ actual row count: 100
      │ KV time: 19ms
      │ KV contention time: 0µs
      │ KV rows decoded: 100
      │ KV bytes read: 18 KiB
      │ KV gRPC calls: 2
      │ estimated max memory allocated: 160 KiB
      │ estimated max sql temp disk usage: 0 B
      │ sql cpu time: 191µs
      │ estimated row count: 100
      │ table: account@account_pkey
      │
      └── • scan
            sql nodes: n1
            kv nodes: n1
            actual row count: 100
            KV time: 1ms
            KV contention time: 0µs
            KV rows decoded: 100
            KV bytes read: 11 KiB
            KV gRPC calls: 1
            estimated max memory allocated: 30 KiB
            sql cpu time: 41µs
            estimated row count: 100 (<0.01% of the table; stats collected 8 minutes ago)
            table: account@account_expr_idx (partial index)
            spans: [/'Z+k/psisXIGZkKpzQMDBa1CKuxF4vpsw0CS4rCUZPUU=' - /'Z+k/psisXIGZkKpzQMDBa1CKuxF4vpsw0CS4rCUZPUU=']
            limit: 100
(46 rows)

Time: 23ms total (execution 22ms / network 0ms)
deleted=false
EXPLAIN ANALYZE                                                                                                                                                                   
                            -> WITH qaccounts AS (                                                                                                                                                               
                            ->         SELECT a.addr, a.microalgos, a.rewards_total, a.created_at, a.closed_at, a.deleted, a.rewardsbase, a.keytype, a.account_data FROM account a                               
                            ->         WHERE NOT a.deleted AND a.account_data ->> 'spend' = 'Z+k/psisXIGZkKpzQMDBa1CKuxF4vpsw0CS4rCUZPUU='                                                                       
                            ->         ORDER BY a.addr ASC                                                                                                                                                       
                            ->         LIMIT 100                                                                                                                                                                 
                            -> )                                                                                                                                                                                 
                            ->         SELECT za.addr, za.microalgos, za.rewards_total, za.created_at, za.closed_at, za.deleted, za.rewardsbase, za.keytype, za.account_data                                     
                            ->         FROM qaccounts za                                                                                                                                                         
                            ->         ORDER BY za.addr ASC                                                                                                                                                      
                            -> ;                                                                                                                                                                                 
                                                              info
--------------------------------------------------------------------------------------------------------------------------------
  planning time: 2ms
  execution time: 23ms
  distribution: local
  vectorized: true
  plan type: generic, re-optimized
  rows decoded from KV: 364 (51 KiB, 6 gRPC calls)
  cumulative time spent in KV: 22ms
  maximum memory usage: 354 KiB
  network usage: 0 B (0 messages)
  sql cpu time: 521µs
  isolation level: serializable
  priority: normal
  quality of service: regular

  • render
  │
  └── • limit
      │ count: 100
      │
      └── • filter
          │ sql nodes: n1
          │ actual row count: 100
          │ sql cpu time: 7µs
          │ estimated row count: 348
          │ filter: NOT deleted
          │
          └── • index join (streamer)
              │ sql nodes: n1
              │ kv nodes: n1
              │ actual row count: 182
              │ KV time: 20ms
              │ KV contention time: 0µs
              │ KV rows decoded: 182
              │ KV bytes read: 32 KiB
              │ KV gRPC calls: 5
              │ estimated max memory allocated: 290 KiB
              │ estimated max sql temp disk usage: 0 B
              │ sql cpu time: 412µs
              │ estimated row count: 632
              │ table: account@account_pkey
              │
              └── • scan
                    sql nodes: n1
                    kv nodes: n1
                    actual row count: 182
                    KV time: 2ms
                    KV contention time: 0µs
                    KV rows decoded: 182
                    KV bytes read: 19 KiB
                    KV gRPC calls: 1
                    estimated max memory allocated: 40 KiB
                    sql cpu time: 103µs
                    estimated row count: 182 - 633 (<0.01% of the table; stats collected 7 minutes ago)
                    table: account@account_expr_idx (partial index)
                    spans: [/'Z+k/psisXIGZkKpzQMDBa1CKuxF4vpsw0CS4rCUZPUU=' - /'Z+k/psisXIGZkKpzQMDBa1CKuxF4vpsw0CS4rCUZPUU=']
(55 rows)

Time: 26ms total (execution 25ms / network 1ms)

Tests in a CockroachDB full indexer

This test was not possible since the full indexer was reset to sync from scratch, and at the time or writing it was syncing at round ~700k.

Tests in a PostgreSQL full indexer

Note that the index already existed in the development server.

    "idx_acc_spend" btree ((account_data ->> 'spend'::text), addr) WHERE (account_data ->> 'spend'::text) IS NOT NULL AND NOT deleted

This index will not be used when the include-deleted=true parameter is used. In order to be used, the AND NOT deleted expression must be removed. However, it does work when the include-all parameter is not used.

deleted=false
EXPLAIN ANALYZE
WITH qaccounts AS (
        SELECT a.addr, a.microalgos, a.rewards_total, a.created_at, a.closed_at, a.deleted, a.rewardsbase, a.keytype, a.account_data FROM account a
        WHERE NOT a.deleted AND a.account_data ->> 'spend' = 'Z+k/psisXIGZkKpzQMDBa1CKuxF4vpsw0CS4rCUZPUU='
        ORDER BY a.addr ASC
        LIMIT 100
)
        SELECT za.addr, za.microalgos, za.rewards_total, za.created_at, za.closed_at, za.deleted, za.rewardsbase, za.keytype, za.account_data
        FROM qaccounts za
        ORDER BY za.addr ASC
;
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..107.97 rows=100 width=104) (actual time=0.429..10.268 rows=100 loops=1)
   ->  Index Scan using idx_acc_spend on account a  (cost=0.42..108487.98 rows=100876 width=104) (actual time=0.428..10.261 rows=100 loops=1)
         Index Cond: ((account_data ->> 'spend'::text) = 'Z+k/psisXIGZkKpzQMDBa1CKuxF4vpsw0CS4rCUZPUU='::text)
 Planning Time: 0.553 ms
 Execution Time: 10.294 ms
(5 rows)

@agodnic agodnic requested a review from urtho September 20, 2024 22:05
@agodnic agodnic changed the title [WIP] Optimize account lookup by auth-addr Optimize account lookup by auth-addr Sep 20, 2024
@urtho urtho merged commit 4b91c9e into main Sep 21, 2024
4 checks passed
@agodnic agodnic deleted the optimize-auth-addr-lookup branch September 21, 2024 22:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants