Artifact associated with the paper entitled: Enhancing GitHub Actions Failure Explanations: Log Preprocessing and Prompt Optimization with LLMs

smilevo/Workflow_Log_Cleaner

 
 

workflow_log_cleaner

Large Language Models (LLMs) have recently gained attention for automating software debugging and log analysis, particularly in Continuous Integration and Continuous Deployment (CI/CD) workflows. GitHub Actions (GA), a widely used CI/CD automation tool, generates complex and verbose failure logs, making debugging a time-consuming and error-prone task. Existing studies show that LLMs perform well in explaining simple errors but struggle with complex failure scenarios due to unstructured logs and inadequate reasoning capabilities.

This research aims to improve CI/CD pipeline reliability by making LLM-based failure diagnosis more efficient. It combines log preprocessing with prompt optimization, compares the explanations produced from unfiltered and preprocessed logs, and evaluates the results.
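To make the idea of log preprocessing concrete, here is a minimal sketch of one possible cleaning step. It is not the pipeline used in the paper: the function name `clean_log`, the keyword list, and the `context` parameter are all hypothetical. It only assumes that GitHub Actions prefixes log lines with an ISO 8601 timestamp and that tools often emit ANSI colour codes.

```python
import re

# Assumption: each GitHub Actions log line starts with an ISO 8601 timestamp.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z\s?")
# ANSI colour/style escape sequences emitted by many build tools.
ANSI = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")
# Hypothetical keyword list for lines worth keeping (not taken from the paper).
KEYWORDS = re.compile(r"##\[error\]|\b(error|failed|fatal|exception)\b", re.IGNORECASE)


def clean_log(raw: str, context: int = 1) -> str:
    """Strip timestamps and ANSI codes, then keep only lines near error keywords."""
    lines = [ANSI.sub("", TIMESTAMP.sub("", line)).rstrip() for line in raw.splitlines()]
    keep = set()
    for i, line in enumerate(lines):
        if KEYWORDS.search(line):
            # Keep the matching line plus `context` lines on each side.
            keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))
    # Emit kept lines in their original order, skipping empty ones.
    return "\n".join(lines[i] for i in sorted(keep) if lines[i])
```

A filter like this trades recall for brevity: a larger `context` keeps more surrounding lines at the cost of a longer prompt.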

[Figure: Data-Preparation (draw.io diagram)]

[Figure: Data-PreProcessing, page 2 (draw.io diagram)]

[Figure: prompt-opt (draw.io diagram)]

[Figure: Data-PreProcessing, page 4 (draw.io diagram)]

Script Execution Guide

This repository contains scripts for diagnosing CI/CD pipeline failures using AI-powered analysis. The current script processes failure logs from GitHub Actions and provides summarized root causes with suggested fixes.

Usage Instructions for llama3-8B_failure_explanation.py

Prerequisites

  • Install Python 3.8+
  • Install required dependencies:
    pip install pandas openpyxl
    
  • Ensure Ollama is installed and accessible in your system path.
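The prerequisites above can be verified before running the script. The following is a small sketch using only the standard library; the helper name `missing_prerequisites` is hypothetical and not part of this repository.

```python
import importlib.util
import shutil
import sys


def missing_prerequisites(packages=("pandas", "openpyxl"), executables=("ollama",)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    for name in packages:
        # find_spec returns None when a top-level package cannot be imported.
        if importlib.util.find_spec(name) is None:
            problems.append(f"Python package not installed: {name}")
    for exe in executables:
        # shutil.which searches the directories on the system PATH.
        if shutil.which(exe) is None:
            problems.append(f"Executable not found on PATH: {exe}")
    return problems
```

Calling `missing_prerequisites()` and printing each returned entry gives a quick checklist of what still needs to be installed.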

Execution Steps

  • Run the script from the terminal:
    python llama3-8B_failure_explanation.py
  • Provide the required input when prompted:
    1. The path to the input Excel file.
    2. The path to the output Excel file where the results will be saved.

Expected Input File Format

  • The input file must be an Excel (.xlsx) file.
  • It should contain a column named "Error log", where each row includes a failure log from GitHub Actions.
  • The script processes each log and generates a summarized root cause analysis.
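As an illustration of the input format described above, the sketch below checks that a spreadsheet has the "Error log" column before any processing starts. The helper names `check_columns` and `load_error_logs` are hypothetical; the script itself may validate its input differently, or not at all.

```python
REQUIRED_COLUMN = "Error log"  # column name stated in this README


def check_columns(columns):
    """Raise ValueError if the required column is absent."""
    names = [str(c) for c in columns]
    if REQUIRED_COLUMN not in names:
        raise ValueError(
            f'Input file needs an "{REQUIRED_COLUMN}" column; found: {names}'
        )


def load_error_logs(path):
    """Read the .xlsx file and return the failure logs as a list of strings."""
    import pandas as pd  # imported lazily; reading .xlsx also requires openpyxl

    df = pd.read_excel(path)
    check_columns(df.columns)
    # Treat empty cells as empty strings so every row yields a string.
    return df[REQUIRED_COLUMN].fillna("").astype(str).tolist()
```

Note that the column name is matched exactly, so "error log" or "Error Log" would be rejected by this check.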

Expected Output

  • The output Excel file will be saved at the specified path.
  • A new column, "llama_8b", will contain AI-generated diagnoses for each error log.
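The overall flow (read the logs, ask a local model for a diagnosis, write the "llama_8b" column) could look roughly like the sketch below. This is an assumption-laden illustration, not the code of llama3-8B_failure_explanation.py: the model tag `llama3:8b`, the prompt wording, and all function names are hypothetical, and the real script may call Ollama in a different way.

```python
import subprocess

MODEL = "llama3:8b"  # assumed Ollama model tag; the script may use another
# Hypothetical prompt wording, not taken from the script.
PROMPT = (
    "Summarize the root cause of this GitHub Actions failure "
    "and suggest a fix:\n\n{log}"
)


def explain_with_ollama(log: str) -> str:
    """Send one log to a local model through the `ollama run` CLI."""
    result = subprocess.run(
        ["ollama", "run", MODEL, PROMPT.format(log=log)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if ollama exits with an error
    )
    return result.stdout.strip()


def diagnose_all(logs, explain=explain_with_ollama):
    """Return one diagnosis per log; blank logs get an empty diagnosis."""
    return [explain(log) if log.strip() else "" for log in logs]


def run(input_path: str, output_path: str, explain=explain_with_ollama) -> None:
    """Read the input workbook, add the "llama_8b" column, and save the result."""
    import pandas as pd  # imported lazily; .xlsx support requires openpyxl

    df = pd.read_excel(input_path)
    logs = df["Error log"].fillna("").astype(str)
    df["llama_8b"] = diagnose_all(logs, explain)
    df.to_excel(output_path, index=False)
```

Passing `explain` as a parameter makes it possible to try the flow with a stub function before a model is available. Very long logs may exceed the operating system's command-line length limit when passed as an argument, which is one more reason to shorten logs first.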


Languages

  • Jupyter Notebook 94.7%
  • Python 5.3%