This repository contains the code and documentation for a distributed computing analysis conducted for identifying high-redshift Lyman-break galaxies from spectral files. The project was undertaken as part of Statistics 405 at the University of Wisconsin-Madison, in collaboration with the Center for High Throughput Computing (CHTC).

neuraldevx/Cosmic-Phenomena-Identification-Distributed-Computing

Cosmic-Phenomena-Identification-Distributed-Computing

Distributed Computing Analysis for Cosmic Phenomena Identification

This project develops and executes a distributed computing strategy to identify high-redshift Lyman-break galaxies in a large dataset of spectral files. The analysis was run with R and Bash scripts on the HTCondor platform, with job scheduling and data processing tuned for efficiency.

Project Overview

Research Objective: The primary objective of this project is to identify high-redshift Lyman-break galaxies from a dataset consisting of 2.5 million spectral files.

Tools Used: R, Bash scripting, HTCondor, Git, Shell

Collaborators: Center for High Throughput Computing (CHTC), University of Wisconsin-Madison, Statistics 405

Key Achievements:
- Developed and executed a distributed computing strategy for analyzing spectral files.
- Orchestrated 2459 parallel computing jobs on the HTCondor platform.
- Implemented data analysis techniques to filter and prioritize galaxy candidates.
- Automated data merging and analysis workflow for efficient handling of large datasets.
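The submit files used in this project are not reproduced here. As a rough sketch of how 2459 parallel jobs could be queued, the Bash function below prints an HTCondor submit description that starts one job per chunk of spectral files and passes the job index to a worker script. The function name `make_submit_description`, the worker script `process_chunk.sh`, the `log/` directory, and the resource requests are all illustrative assumptions, not values taken from the repository.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print an HTCondor submit description for N parallel jobs.
# The executable name, log paths, and resource requests are assumptions.

make_submit_description() {
  local n_jobs="$1"       # number of jobs to queue, e.g. 2459
  local executable="$2"   # worker script that each job runs (hypothetical)

  # \$(Process) is escaped so that HTCondor, not Bash, expands it to the
  # job index 0 .. n_jobs-1 at submit time.
  cat <<EOF
universe = vanilla
executable = ${executable}
arguments = \$(Process)
output = log/job_\$(Process).out
error = log/job_\$(Process).err
log = log/cluster.log
request_cpus = 1
request_memory = 1GB
request_disk = 1GB
queue ${n_jobs}
EOF
}

# Possible usage (not executed here):
#   make_submit_description 2459 process_chunk.sh > galaxies.sub
#   condor_submit galaxies.sub
```

Generating the description from a function keeps the job count and executable in one place, which may help when the number of chunks changes between runs.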

Project Components

Data Preprocessing: Includes scripts and code for preprocessing spectral files before analysis.

Job Scheduling: Scripts and documentation related to job scheduling and optimization using HTCondor.

Data Analysis: R scripts and documentation for analyzing spectral data and identifying Lyman-break galaxies.

Automation: Shell scripts and utilities for automating workflow processes and data merging.
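The data merging step can be sketched as follows. This is not the repository's actual script. It assumes each job writes a header-less CSV named `*.csv` into one results directory, with a hypothetical layout of `spectrum_id,score` in which a lower score means a closer match to the Lyman-break template. The function name `merge_and_rank` is likewise an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: merge per-job result files and keep the best candidates.
# Assumed input: header-less CSV lines of the form spectrum_id,score
# where a smaller score indicates a better galaxy candidate.

merge_and_rank() {
  local results_dir="$1"  # directory holding one CSV per job
  local top_n="$2"        # number of candidates to keep

  # Concatenate every per-job file, sort numerically on the score column
  # (field 2, comma-separated), and keep the first top_n lines.
  # LC_ALL=C makes sort treat "." as the decimal point regardless of locale.
  cat "${results_dir}"/*.csv | LC_ALL=C sort -t, -k2,2n | head -n "${top_n}"
}

# Possible usage (not executed here):
#   merge_and_rank results 100 > best_candidates.csv
```

Sorting the concatenated stream once avoids loading all results into memory in R just to rank them, although the same ranking could equally be done there.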

Contributors

- Jake Christensen
- Center for High Throughput Computing (CHTC)
- University of Wisconsin-Madison, Statistics 405
