Test Data Generation Using Parallel Genetic Algorithm

KAIST 2020 Fall CS454 Artificial Intelligence Based Software Engineering

Introduction

Generating test data is necessary for debugging, but it is a time-consuming and tedious process. We generate test data automatically using a parallel genetic algorithm.

Genetic Algorithm and Parallel Genetic Algorithm

A genetic algorithm (GA) is a search heuristic inspired by Charles Darwin’s theory of natural evolution.

Thanks to crossover and mutation, a GA can explore a large search region. Nevertheless, like other search-based algorithms, it tends to converge to local optima.
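The selection, crossover, and mutation loop can be sketched in a few lines. This is a minimal illustration, not the code in GA.py: the function name `run_ga`, the truncation selection, the single-point crossover, and the toy fitness function are all assumptions made for the example.

```python
import random

def run_ga(fitness, arg_num, max_value, pop_size=20, generations=50,
           mutation_rate=0.05, rng=None):
    """Evolve integer vectors in [0, max_value]; return the fittest one found.

    Hypothetical helper for illustration; requires arg_num >= 2 and pop_size >= 4.
    """
    rng = rng or random.Random()
    population = [[rng.randint(0, max_value) for _ in range(arg_num)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half as parents (truncation selection).
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: splice the two parents at a random cut point.
            cut = rng.randint(1, arg_num - 1)
            child = a[:cut] + b[cut:]
            # Mutation: occasionally replace a gene with a fresh random value.
            child = [rng.randint(0, max_value) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Toy fitness (assumed): inputs closer to 7 in every position score higher.
toy_fitness = lambda ind: -sum(abs(g - 7) for g in ind)
best = run_ga(toy_fitness, arg_num=5, max_value=20, rng=random.Random(0))
print(best, toy_fitness(best))
```

Because the parents survive unchanged into the next generation, the best fitness never decreases, which is also why a single population can settle on a local optimum.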

To overcome this weakness, we introduce a new variant of the genetic algorithm, which we call the parallel genetic algorithm (PGA). It has a tree-like architecture.

This is pseudocode of PGA.

Lines 10 to 15: run GA n times in parallel.

Lines 16 to 22: perform inter-crossover, so that each population can share information with the others.

To bound memory use and computation time, we limit the number of populations in one generation to at most k. Pruning keeps the best k populations of each generation.
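One generation of the branch, inter-crossover, and prune cycle might look roughly like the sketch below. It is not the code in PGA.py: the helper names, the way populations exchange individuals, and the ranking of populations by their best individual are assumptions chosen to keep the example short. The n GA runs are executed sequentially here; since they are independent, they could equally be dispatched to separate workers.

```python
import random

def evolve(pop, fitness, max_value, mutation_rate, rng):
    """One plain GA step on a single population (truncation selection, assumed)."""
    ranked = sorted(pop, key=fitness, reverse=True)
    parents = ranked[:max(2, len(pop) // 2)]
    children = []
    while len(parents) + len(children) < len(pop):
        a, b = rng.sample(parents, 2)
        cut = rng.randint(1, len(a) - 1)
        child = [rng.randint(0, max_value) if rng.random() < mutation_rate else g
                 for g in a[:cut] + b[cut:]]
        children.append(child)
    return parents + children

def inter_crossover(populations, rng):
    """Swap one random individual between each pair of neighbouring populations."""
    for left, right in zip(populations, populations[1:]):
        i, j = rng.randrange(len(left)), rng.randrange(len(right))
        left[i], right[j] = right[j], left[i]
    return populations

def prune(populations, fitness, k):
    """Keep the k populations whose best individual scores highest."""
    return sorted(populations, key=lambda p: max(map(fitness, p)), reverse=True)[:k]

def pga_step(populations, fitness, max_value, n, k, mutation_rate, rng):
    # Branch: every population spawns n independently evolved children,
    # which gives the tree-like growth from one generation to the next.
    branched = [evolve(p, fitness, max_value, mutation_rate, rng)
                for p in populations for _ in range(n)]
    # Share information between populations, then cap the count at k.
    return prune(inter_crossover(branched, rng), fitness, k)

rng = random.Random(1)
toy_fitness = lambda ind: -sum(abs(g - 7) for g in ind)  # assumed toy objective
pops = [[[rng.randint(0, 20) for _ in range(5)] for _ in range(10)]]
for _ in range(5):
    pops = pga_step(pops, toy_fitness, 20, n=3, k=4, mutation_rate=0.1, rng=rng)
print(len(pops))
```

Without pruning the number of populations would grow as a power of n with the depth of the tree, so the cap k is what keeps the cost per generation bounded.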

Evaluator

Usage

Create an evaluator instance
```
 evaluator = Tester.instance()
```

Initialize the conditions

 evaluator.reset(argnum, max_value, condition_range, error_rate, correction_range)

Five arguments control the error conditions. See the code for a detailed explanation of each.

Run an experiment
```
 evaluator.run(input)
```

Results

For the experiments, we use the evaluator from genetic_CIT. We compare the performance of PGA against GA in terms of population size and elapsed time. We run PGA and GA up to a population size of 80,000 and up to 400 seconds. For the population-size experiments, we run each configuration 5 times and report the average; for the time experiments, we run each configuration once.

We also test the evaluator both without and with a correction range. The settings are: error region 0 to 3 in 70% for parameters 0 to 2, error region 3 to 6 in 70% for parameters 0 to 2, and error region 6 to 9 in 70% for parameters 0 to 2.

These are the results in terms of population size. The left plot is with a correction range, and the right plot is without.

These are the results in terms of elapsed time. Again, the left plot is with a correction range, and the right plot is without.

Usage

We implemented both GA and PGA to compare their performance.

Clone this repository:

  git clone https://github.com/ChoiIseungil/CS454Project.git
  cd CS454Project

Experiment environments:

Correction range for the evaluator

Quit option for GA

Save performance by population size or by time in GA

Quit option for PGA

Save performance by population size or by time in PGA

Hyperparameters

n, m, k in PGA

    python main.py -p True -m 0.1 -n 3 -l 100 -c 15 -r 0.5

Arguments when running the program:

-p: "True" for PGA and "False" for GA (default = "False")
-m: mutation rate (default = 0.05)
-n: arg_num for evaluator (default = 5)
-l: max_value for evaluator (default = 20)
-c: condition_range for evaluator (default = 5)
-r: error_rate for evaluator (default = 0.3)
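For reference, the flags and defaults listed above could be declared with `argparse` as in the sketch below. This is only an illustration of the documented interface; how main.py actually parses its arguments is not shown here, and the function name `build_parser` and the help strings are assumptions.

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Test data generation with GA or PGA")
    # -p takes the literal strings "True" / "False", as in the example command.
    parser.add_argument("-p", choices=["True", "False"], default="False",
                        help='"True" for PGA, "False" for GA')
    parser.add_argument("-m", type=float, default=0.05, help="mutation rate")
    parser.add_argument("-n", type=int, default=5, help="arg_num for evaluator")
    parser.add_argument("-l", type=int, default=20, help="max_value for evaluator")
    parser.add_argument("-c", type=int, default=5, help="condition_range for evaluator")
    parser.add_argument("-r", type=float, default=0.3, help="error_rate for evaluator")
    return parser

# Parse the example command line shown above.
args = build_parser().parse_args("-p True -m 0.1 -n 3 -l 100 -c 15 -r 0.5".split())
use_pga = args.p == "True"  # compare against the string; bool("False") would be True
print(use_pga, args.m, args.n, args.l, args.c, args.r)
```

Keeping `-p` as a string and comparing it explicitly avoids the common pitfall of `type=bool`, which treats any non-empty string, including "False", as true.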

References

genetic_CIT
