wangkingh/FastXC
FastXC

High-Performance Noise Cross-Correlation Computing Code for 9-Component Recordings


Important Warning

A bug affecting the tf-PWS and PWS stacking in this code has been found; users running on a GPU with less than 20 GB of memory should take particular care!

💡Project Introduction

High-Performance Noise Cross-Correlation Computing Code for 9-Component Recordings

Using a high-performance CPU-GPU heterogeneous computing framework, this program is designed to efficiently compute single/nine-component noise cross-correlation functions (NCFs) from ambient noise data. It integrates data preprocessing, accelerated cross-correlation computation, and various stacking techniques (Linear, PWS, tf-PWS), particularly optimizing the computing process using CUDA technology. This significantly enhances processing speed and the signal-to-noise ratio of the data, making it especially suitable for handling large-scale noise datasets.

Program Features 🎉🎉

  1. CUDA-accelerated heterogeneous computing
  2. Supports computing both single-component and nine-component cross-correlation functions
  3. Employs regex-based file retrieval for SAC files, generally eliminating the need for users to rename files
  4. Enables cross-correlation calculation between two seismic arrays
  5. Integrates PWS and tf-PWS high-SNR stacking methods (requiring sufficient GPU memory) with CUDA acceleration
  6. Separates business logic from low-level computation, allowing users familiar with CUDA and C to customize preprocessing and cross-correlation steps

🔧Installation & Requirements

System Requirements

  • Requires Linux and an NVIDIA GPU, preferably with 8 GB or more of GPU memory.
  • If this is your first time running a CUDA program, please check the Computational Environment section beforehand.

Python Version Requirements

  • Requirement: Python 3.8 or higher

Required Third-Party Python Libraries

  • Necessary Libraries: obspy, pandas, scipy, matplotlib, tqdm, numpy
  • We only use basic functionalities from these libraries, so we recommend installing their latest versions. If your environment already has these libraries, you can skip updates. To install them, use:
pip install obspy pandas scipy matplotlib tqdm numpy

If you are familiar with setting up environments using Anaconda, that would be even better!

Compilation

The entire program's code is divided into two parts. The more "high-level" part in Python is used for allocating computational tasks, designing filters, and generating terminal executable commands, while the large-scale fundamental operations are implemented mostly in C and CUDA-C. For the C and CUDA-C portion, we need to compile them into executables before running the program. To compile, follow the steps below:

cd FastXC
make veryclean
make

If your GPU differs from the default target (for example, an A100 rather than an RTX 4090), you'll need to modify the third line of the FastXC/Makefile file:

export ARCH=sm_89

This involves the GPU’s compute capability. You can Google your device’s compute capability. I’ve also prepared a script under FastXC/utils to compile and run the check_gpu program:

bash compile.sh
./check_gpu

On my desktop, I’m using an NVIDIA RTX 4090 graphics card. The output after running the program is:

Device Number: 0
 Device  name: NVIDIA GeForce RTX 4090
 Compute capability:8.9

My device’s compute capability is 8.9, hence the compilation option is ARCH=sm_89.

After the compilation, please check FastXC/bin. All the executables generated by the compilation will be stored in this folder. At a minimum, you should see RotateNCF, extractSegments, ncfstack, sac2spec, xc_dual_channel, and xc_multi_channel.

After compilation, you can try running these executables in the bin directory to check their output and confirm that the compilation was successful. For example:

cd FastXC/bin
./sac2spec
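As a quick sanity check (this helper is not part of FastXC, just a sketch; adjust the path to your checkout), you can verify that all six executables were produced:

```python
from pathlib import Path

# The executables the compilation step is expected to produce (from the text above).
EXPECTED = ["RotateNCF", "extractSegments", "ncfstack",
            "sac2spec", "xc_dual_channel", "xc_multi_channel"]

def missing_executables(bin_dir):
    """Return the expected FastXC executables not found in bin_dir."""
    bin_dir = Path(bin_dir)
    return [name for name in EXPECTED if not (bin_dir / name).exists()]

# Adjust this path to where you cloned and compiled FastXC.
print(missing_executables("FastXC/bin"))
```

An empty list means the compilation produced everything the Quick Start needs.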

🚀Quick Start

Under the FastXC directory, there are several subfolders and files. The five most important ones are:

FastXC/src       # CUDA and C source code
FastXC/bin       # Executables compiled from CUDA-C and C code
FastXC/fastxc    # Python scripts that call the executables
FastXC/config/test.ini  # Example configuration file
FastXC/run.py     # The "main" program

Modify the Configuration File

Use vim or another text editor to modify FastXC/config/test.ini. For more details on the configuration file, refer to the Complete Configuration File Explanation.

Change line 5:

sac_dir = /mnt/c/Users/admin/Desktop/FastXC/test_data

to the absolute path of the example data in your operating system.

Change line 27:

output_dir = /mnt/c/Users/admin/Desktop/FastXC/test_output

to the absolute path of the output directory in your operating system.

Change lines 84-88:

sac2spec = /mnt/c/Users/admin/Desktop/FastXC/bin/sac2spec
xc_multi = /mnt/c/Users/admin/Desktop/FastXC/bin/xc_multi_channel
xc_dual = /mnt/c/Users/admin/Desktop/FastXC/bin/xc_dual_channel
stack = /mnt/c/Users/admin/Desktop/FastXC/bin/ncfstack
rotate = /mnt/c/Users/admin/Desktop/FastXC/bin/RotateNCF

to the absolute paths of these five executables under FastXC/bin. This is similar to setting system environment variables.

Change lines 94-96:

gpu_list = 0
gpu_task_num = 1
gpu_mem_info = 24

to match the configuration of the computing devices you intend to use. You can use nvidia-smi to check your GPU info. For example, if you have two GPUs (labeled 0 and 1), each with 40GB of memory, and you plan to run one task per GPU (this is recommended), then you could write:

gpu_list = 0,1
gpu_task_num = 1,1
gpu_mem_info = 40,40

If you’re not familiar with GPU information, refer to Computing Environment.

Running the Example Data

After modifying the configuration file, go to the FastXC main directory and run:

python run.py

Checking the Output Directory

The output file location is set by the user in the configuration file. In the test example, the output directory is ~/FastXC/test_output. After the program finishes, several files and directories are created, for example:

test_output/
├── butterworth_filter.txt   # Records filter parameters, including Butterworth filter poles and zeros
├── cmd_list                 # A record of commands invoked internally by the program
├── dat_list.txt             # Data file list (redundant)
├── sac_spec_list            # List of spectral info for SAC data (used in the sac2spec stage)
├── segspec                  # Output results from the sac2spec stage
├── ncf                      # Cross-correlation results for each time segment; not present if calculate_style=DUAL
├── stack                    # Directory with the stacked cross-correlation results from different time segments
└── xc_list                  # A record of cross-correlation pair lists

Among these files and folders, pay special attention to the following two directories (assuming they appear after the calculation finishes):

  • ncf: Stores the cross-correlation results for each time segment. This directory contains data results organized by time window or station array, allowing you to review the segmented cross-correlation information.

  • stack: Stores the final cross-correlation results after stacking. This represents the combined outcomes after linear stacking or applying PWS, tf-PWS stacking methods.

If the ncf directory doesn’t appear in your output directory, check your configuration file and computation parameters to ensure the cross-correlation calculation steps completed correctly. Sometimes the program only creates the ncf folder and related results at a specific processing stage.

In summary, ncf represents segmented processing results, while stack represents integrated, stacked results. The files in these two directories are crucial for subsequent data analysis and interpretation.

📝Complete Configuration File Explanation

SeisArrayInfo Section

The [SeisArrayInfo] section is used to specify one or two arrays’ data directories, file path naming formats, and the time range for processing. Key parameters include the location of array data, file naming patterns, and the time range for processing. By configuring this section properly, you can flexibly retrieve and match data.

  • sac_dir_1 and sac_dir_2: Specify the absolute paths to the continuous waveform data for one or two arrays. If you are not computing cross-correlations between two arrays, set sac_dir_2 to NONE.

  • pattern_1 and pattern_2: Define the path patterns for accessing the array data files. The patterns can include:

    • {home}: Represents the root directory of the array data, automatically replaced by the value of sac_dir_1 or sac_dir_2.

    • {YYYY}: Four-digit year (e.g., 2020).

    • {YY}: Two-digit year (e.g., 20 for 2020).

    • {MM}: Two-digit month (01-12).

    • {DD}: Two-digit day (01-31).

    • {JJJ}: Julian Day (001-365/366), representing the nth day of the year.

    • {HH}: Two-digit hour (00-23).

    • {MI}: Two-digit minute (00-59).

    • {component}: Represents the data component, e.g., Z, N, E.

    • {suffix}: File suffix, such as SAC or another file extension.

    • {*}: A wildcard that matches any string of arbitrary length.

    • Notes:

      • Each placeholder can appear only once in the file name or path.
      • Time information must be specified at least to the day, and station/component information is required.
      • Redundant information can be represented by {*}.
      • Supported delimiters are . and _.
  • start and end: Specify the time range for retrieving data for cross-correlation calculations, in YYYY-MM-DD HH:MM:SS format.

  • sta_list_1 and sta_list_2: Specify the paths (relative to run.py) of the station list files for each array. Each line in these files contains a station name that matches the station field in pattern_1/2.

  • component_list_1 and component_list_2: Specify the components to be used for cross-correlation, such as Z, N, E. For nine-component cross-correlation, the input components must follow the E, N, Z order (though the component names can differ, for example BH1).
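As a concrete, hypothetical illustration (the key names follow the parameters described above, but the exact pattern syntax, including the station placeholder name, is an assumption; check config/test.ini for the authoritative example):

```ini
[SeisArrayInfo]
; Hypothetical example -- verify key names and placeholders against config/test.ini
sac_dir_1 = /data/arrayA
sac_dir_2 = NONE
pattern_1 = {home}/{YYYY}/{JJJ}/{station}.{component}.{suffix}
start = 2020-01-01 00:00:00
end = 2020-12-31 23:59:59
sta_list_1 = config/sta_list_1.txt
component_list_1 = Z
```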

Parameters Section

The [Parameters] section controls various settings and algorithm choices during cross-correlation calculations, including frequency bands, time-frequency normalization options, and stacking options.

Basic Calculation Settings

  • output_dir:
    The absolute path where cross-correlation results are saved.

  • win_len (in seconds):
    The time window length for cross-correlation calculations. For example, win_len=7200 means each calculation segment is 2 hours.

  • delta (in seconds):
    The data sampling interval. Ensure accuracy because it’s critical for filter design and other key steps.

  • max_lag (in seconds):
    The maximum time lag, determining the half-length of the cross-correlation function. For example, max_lag=1000 results in a total cross-correlation length of 2000 seconds (±1000 seconds).

  • skip_step:
    Controls which continuous data segments are skipped.

    • -1 means no segments are skipped.
    • Can be set as a /-separated list, e.g. 3/4/-1, to skip specific segments.
  • distance_threshold (in km):
    Only compute cross-correlation for station pairs closer than this threshold.

Frequency-Domain and Time-Domain Normalization Settings

  • whiten:
    Determines when spectral whitening is applied. Options: BEFORE, AFTER, BOTH, OFF.

    • BEFORE: Whiten before time-domain normalization, suitable for long-period data (Zhang et al., 2018).
    • AFTER: Whiten after time-domain normalization (Bensen et al., 2007).
    • BOTH: Apply whitening both before and after, a more aggressive choice.
    • OFF: No whitening (for testing purposes).
  • normalize:
    Type of time-domain normalization. Options: RUN-ABS, ONE-BIT, RUN-ABS-MF, OFF.

    • RUN-ABS: Running absolute value normalization.
    • ONE-BIT: One-bit normalization.
    • RUN-ABS-MF: Multiband running absolute value normalization.
    • OFF: No normalization (for testing).

    Note: In practical applications, a normalization method is typically chosen to improve signal-to-noise ratio.

  • norm_special:
    Choose CUDA or PYV. Unless there’s a special need, use CUDA. PYV is mainly for testing CPU-based preprocessing.

  • bands:
    Specifies the frequency bands (in Hz) for spectral whitening/normalization. One or more frequency bands can be given, e.g. 0.01/0.02 0.02/0.05. The program uses the min and max frequencies of all bands as the processing corner frequencies.

Parallel and Logging Options

  • parallel (True/False):
    Whether to enable CPU parallel processing.

  • cpu_count:
    The number of CPU cores (threads) to use for parallel processing. For example, cpu_count=100 means using 100 cores.

  • debug (True/False):
    Whether to run in debug mode. Usually set to False.

  • log_file_path:
    Path to the log file for recording detailed runtime and debugging information.

Output and Saving Controls

  • save_flag:
    A four-digit 0/1 flag that controls the output types. The bits correspond to LINEAR, PWS, TF-PWS, and per-segment results. For example: save_flag=1001 means save linear stack results and per-segment cross-correlation results.

    Note: save_flag only takes effect in calculate_style=DUAL mode.

  • rotate_dir:
    Specifies the rotation option for nine-component cross-correlation. Choices: LINEAR, PWS, TFPWS. Determines the final stacking and rotation processing style.

  • todat_flag:
    Similar to save_flag, this four-digit switch controls which results are converted to dat format (LINEAR, PWS, TF-PWS, RTZ).
    Used similarly to save_flag.

    Note: todat_flag is also affected by calculate_style.

  • calculate_style: Choose MULTI or DUAL:

    • MULTI: Optimized for I/O performance, supports only linear stacking.
    • DUAL: Saves storage space and supports PWS and TF-PWS stacking. Use DUAL if you need PWS or tf-PWS.
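The bit layout of save_flag can be read as follows (a hypothetical helper, not part of FastXC, matching the description above):

```python
# Illustration (hypothetical helper, not part of FastXC): interpreting the
# four-digit save_flag described above (bits: LINEAR, PWS, TF-PWS, per-segment).

def decode_save_flag(flag):
    """Return the output types enabled by a four-digit 0/1 flag string."""
    labels = ["LINEAR", "PWS", "TF-PWS", "per-segment"]
    return [name for name, bit in zip(labels, flag) if bit == "1"]

print(decode_save_flag("1001"))  # ['LINEAR', 'per-segment']
```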

Command Section

The [Command] section specifies the paths to the executables required during the program’s run. By configuring these executable paths, the main script (run.py) can automatically call these tools for data preprocessing, cross-correlation calculation, stacking, and rotation, without manual command-line input.

Parameter Explanations

  • sac2spec:
    Points to the sac2spec executable, used for converting SAC-format data and computing spectra as a preprocessing step.

  • xc_multi:
    Points to the xc_multi_channel executable, used for cross-correlation with multiple stations simultaneously (MULTI mode). Only linear stacking is supported in this mode.

  • xc_dual:
    Points to the xc_dual_channel executable, used for cross-correlating data from two arrays or segments (DUAL mode), supporting both linear stacking and PWS/TF-PWS stacking. Use this tool when higher SNR stacking methods are required.

  • stack:
    Points to the ncfstack executable, used to linearly stack cross-correlation results to improve SNR.

  • rotate:
    Points to the RotateNCF executable, used to rotate nine-component cross-correlation results to obtain direction-specific cross-correlation features. In nine-component calculations, this allows component rotation and output of desired stacked results.

After correctly configuring these executable paths, run.py or other main program modules will automatically invoke the corresponding programs at each processing stage.

GPU Information (gpu_info)

The [gpu_info] section is used to specify GPU computing resources, including available GPU device numbers, the number of tasks per GPU, and GPU memory information. Under the CUDA heterogeneous computing framework, properly allocating GPU resources can greatly improve computational efficiency.

Parameter Explanations

  • gpu_list:
    Specifies the available GPU device numbers, separated by commas. For example:

    gpu_list = 0

    Means only GPU #0 is used.

    If multiple GPUs are available:

    gpu_list = 0,1
  • gpu_task_num:
    Specifies the number of tasks for each GPU in [gpu_list]. For example:

    gpu_task_num = 1

    Means 1 task is assigned to GPU #0.

    If there are two GPUs (0 and 1), each assigned one task:

    gpu_task_num = 1,1
  • gpu_mem_info (in GB):
    Specifies the available memory on each GPU for resource optimization. For example:

    gpu_mem_info = 24

    Means the GPU has 24GB of memory. If there are two GPUs with 40GB and 24GB respectively:

    gpu_mem_info = 40,24
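The three comma-separated lists are positional: the nth entry of each applies to the nth GPU. A sketch of how they pair up (an illustration, not FastXC's actual parser):

```python
# Illustration (not FastXC code): pairing up the comma-separated
# [gpu_info] values into one record per GPU.

def parse_gpu_info(gpu_list, gpu_task_num, gpu_mem_info):
    """Zip the three positional lists into per-GPU records."""
    return [
        {"id": int(g), "tasks": int(t), "mem_gb": int(m)}
        for g, t, m in zip(gpu_list.split(","),
                           gpu_task_num.split(","),
                           gpu_mem_info.split(","))
    ]

print(parse_gpu_info("0,1", "1,1", "40,24"))
```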

Summary

In the [gpu_info] section, properly setting GPU IDs, task numbers, and memory information helps efficiently utilize computing resources under CUDA heterogeneous computing. Before configuration, use nvidia-smi to check GPU IDs and memory, allowing for informed and targeted settings.

🔍Computational Environment Check

To run this CUDA program, ensure your system meets the following requirements:

  1. NVIDIA GPU: Your computer must have an NVIDIA GPU that supports CUDA.
  2. CUDA Toolkit: You must install the CUDA Toolkit, which is essential for running CUDA programs. The latest version can be downloaded from the NVIDIA official website.
  3. GPU Drivers: Make sure that your NVIDIA GPU drivers are up-to-date to be compatible with the installed version of CUDA.

Tools Check

Before starting, it is recommended to use the following commands to check if your environment is correctly configured:

  • Use the nvidia-smi command to check the status of your GPU and drivers.
nvidia-smi

This command will display details about your GPU and the current version of drivers.

nvcc --version

This command helps confirm the CUDA and CUDA compiler (NVCC) version.

❓FAQ

Q1: Does the program support Windows environment?

A1: The program is developed and optimized for Linux. However, you can try running it under Windows via WSL.

Q2: Aside from computing power, what other limitations are there on the performance of the computing device?

A2: A large part of the limitation actually comes from disk performance. Although the computation has been optimized to the limit, if the disk or disk array is underperforming, overall efficiency will still be low. (Of course, it should still be better than a pure CPU setup.)

Q3: Why doesn’t calculate_style=MULTI support tf-PWS or PWS?

A3: A new version will be released in the future to support this. During earlier development, it was assumed that using tf-PWS with MULTI would be slow, but this will be addressed in subsequent revisions.

📒Change Log

See Change Log

📧Author Contact Information

If you have any questions or suggestions or want to contribute to the project, open an issue or submit a pull request.

For more direct inquiries, you can reach the author at:
Email: wkh16@mail.ustc.edu.cn

It will be my great pleasure if my code can provide any help for your research!

🙏Acknowledgements

We extend our sincere gratitude to our colleagues from the University of Science and Technology of China, the Institute of Geophysics, China Earthquake Administration, the Institute of Earthquake Forecasting, China Earthquake Administration, and the Institute of Geology and Geophysics, Chinese Academy of Sciences, for their significant contributions during this program's testing and trial runs!

ChatGPT generated the title illustration.

📚References

Wang et al. (2025). "High-performance CPU-GPU Heterogeneous Computing Method for 9-Component Ambient Noise Cross-correlation." Earthquake Research Advances. In Press.

Bensen, G. D., et al. (2007). "Processing seismic ambient noise data to obtain reliable broad-band surface wave dispersion measurements." Geophysical Journal International 169(3): 1239-1260.

Cupillard, P., et al. (2011). "The one-bit noise correlation: a theory based on the concepts of coherent and incoherent noise." Geophysical Journal International 184(3): 1397-1414.

Zhang, Y., et al. (2018). "3-D Crustal Shear-Wave Velocity Structure of the Taiwan Strait and Fujian, SE China, Revealed by Ambient Noise Tomography." Journal of Geophysical Research: Solid Earth 123(9): 8016-8031.
