This repository has been archived by the owner on Apr 4, 2024. It is now read-only.

Bitcoin Blockchain

Data ingestion

  1. Run the ingest1.sh script
    • Downloads and verifies the Bitcoin blockchain
    • The dbcache parameter in the script requires at least 2 GB of RAM
    • In another terminal:
      1. Install jq with: sudo apt-get install -y jq
      2. Check the progress (from 0 to ~1) of bitcoind's synchronization with: bitcoin-cli getblockchaininfo | jq -r ".verificationprogress"
    • The synchronization does not stop on its own. Once the progress reaches 0.99, stop it manually from another terminal with:
      bitcoin-cli stop
  2. Run the ingest2.sh script
    • Creates appropriate directories in HDFS
    • Copies the Bitcoin block data into HDFS
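
The manual check-and-stop part of step 1 could be automated along the following lines. This is only a sketch, not part of ingest1.sh: the function names (progress_reached, current_progress, wait_and_stop) and the THRESHOLD and INTERVAL variables are hypothetical, and it assumes bitcoin-cli and jq are on the PATH and that bitcoind is already running.

```shell
#!/usr/bin/env bash
# Sketch: poll bitcoind's sync progress and stop it once a threshold is reached.
# THRESHOLD and INTERVAL are illustrative names, not settings from ingest1.sh;
# the 0.99 default mirrors the stopping point described in step 1.
THRESHOLD="${THRESHOLD:-0.99}"
INTERVAL="${INTERVAL:-60}"   # seconds between checks (assumption)

# Succeeds (exit status 0) when progress $1 >= threshold $2.
# awk does the floating-point comparison, which bash cannot do natively.
progress_reached() {
  awk -v p="$1" -v t="$2" 'BEGIN { exit !(p + 0 >= t + 0) }'
}

# Prints the current verification progress reported by the running bitcoind.
current_progress() {
  bitcoin-cli getblockchaininfo | jq -r '.verificationprogress'
}

# Polls until the threshold is reached, then asks bitcoind to shut down.
wait_and_stop() {
  local p
  while true; do
    p="$(current_progress)" || return 1
    echo "verificationprogress: $p"
    if progress_reached "$p" "$THRESHOLD"; then
      bitcoin-cli stop
      return $?
    fi
    sleep "$INTERVAL"
  done
}
```

To use it, source the file in a second terminal while ingest1.sh is running and call wait_and_stop. The comparison lives in its own function so it can be checked without a running node.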

Data profiling

Data analysis

  • In the analysis directory
  • MapReduce program written in Java 1.7
  • Uses Maven to install the necessary dependencies and build the project
  • Uses the hadoopcryptoledger library to parse the binary Bitcoin block data
  • See analysis/readme.md for more information

Results

  • Raw results from the analysis are stored in an Impala database
  • They are loaded with the provided script analysis/resultsToImpala.sql, run as:
    impala-shell --quiet -i compute-1-1 -f resultsToImpala.sql
  • The raw results are also available in this repository at analysis/results.csv
  • Our website fetches this file to display the results