A tool for summarizing a requirements conversation | RE-Lab @ Utrecht University


REConSum: the Requirements Elicitation Conversations Summarizer

About

The code in the code folder can be used to extract requirements-relevant questions from the transcript of a requirements elicitation session (an interview). It consists of three parts, plus an example.

The first part (preprocessing.py) preprocesses the data, converting it to a format the rest of the code can use. Next, the code in find_questions.py identifies questions using either Part-of-Speech (POS) tags or Dialogue Acts. Finally, the code in categorize_relevance.py determines whether these questions are requirements-relevant, using TF-IDF. These three Python files are orchestrated in the Jupyter notebook example.ipynb.
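To give a feel for the POS-tag route, here is a simplified sketch of question detection: flag an utterance as a question when it ends in a question mark or opens with a wh-word or fronted auxiliary. The heuristics below are illustrative only, not the exact rules in find_questions.py.

```python
# Simplified sketch of POS-based question detection; the rules here are
# illustrative and not the exact logic used in find_questions.py.

def is_question(tagged_tokens):
    """tagged_tokens: list of (word, pos_tag) pairs for one utterance."""
    if not tagged_tokens:
        return False
    # A trailing question mark is the strongest signal.
    if tagged_tokens[-1][0] == "?":
        return True
    # Wh-words (WP, WRB, WDT) or a fronted modal (MD) often open a question.
    return tagged_tokens[0][1] in {"WP", "WRB", "WDT", "MD"}
```

For example, `is_question([("How", "WRB"), ("does", "VBZ"), ("it", "PRP"), ("work", "VB")])` is flagged as a question even without punctuation, because the utterance opens with a wh-adverb.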

The input for running the code goes in the data folder.

The results of our tests can be found in the results folder.

Requirements

To run this code and replicate the example in the notebook, you will need the following:

With the following packages:

Furthermore, you will need some form of input: a transcript of a conversation. Our preprocessing code takes input generated by AWS Transcribe; for other formats, the preprocessing code has to be adjusted.

Finally, to perform the TF-IDF comparison, you need TF-IDF terms from a Wikipedia dump. Please place wiki_tfidf_terms.csv in the data folder.

In the example, we have put our input in the data folder, and we suggest doing the same:

```python
# For demonstration purposes, we will use one of our conversations from our dataset
transcription_file = "../data/example_conversation.txt"
wiki_tfidf_file = "../data/wiki_tfidf_terms.csv"
```
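The relevance step can be pictured with a small sketch: sum the IDF weights of a question's words against the Wikipedia terms and compare the average with a threshold. The CSV column names (`token`, `idf`) and the threshold are assumptions for illustration; the actual computation lives in categorize_relevance.py.

```python
import csv

# Hedged sketch of TF-IDF-based relevance scoring. The column names
# ("token", "idf") and the threshold are illustrative assumptions only.

def load_idf(path):
    """Load term -> IDF weight from a CSV such as wiki_tfidf_terms.csv."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row["token"]: float(row["idf"]) for row in csv.DictReader(f)}

def relevance_score(question, idf):
    """Average IDF weight of the question's words (0 for unknown words)."""
    words = [w.strip(".,?!").lower() for w in question.split()]
    if not words:
        return 0.0
    return sum(idf.get(w, 0.0) for w in words) / len(words)

def is_relevant(question, idf, threshold=1.0):
    """Rare (high-IDF) vocabulary pushes a question above the threshold."""
    return relevance_score(question, idf) >= threshold
```

The intuition: domain-specific words are rare in a general corpus like Wikipedia and therefore carry high IDF weights, so questions about the system under discussion score higher than small talk.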

Running the code

To run the code, the requirements above have to be met. Furthermore, if you wish to use Part-of-Speech tags to identify the questions, StanfordCoreNLP needs to run locally.

Then, to have it listen for calls made to http://localhost:9000/, start it with the following command from the directory in which StanfordCoreNLP is installed:

```shell
# Run the server using all jars in the current directory (e.g., the CoreNLP home directory)
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
```
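Once the server is running, it can be queried over HTTP. A minimal Python client might look like the sketch below; the function name and annotator selection are our own choices, while the `?properties=...` query interface is the standard StanfordCoreNLP server API.

```python
import json
import urllib.parse
import urllib.request

# Minimal sketch of a client for a StanfordCoreNLP server on localhost:9000.
# The helper name and annotator selection are illustrative choices.

def corenlp_pos_tags(text, url="http://localhost:9000"):
    """POST text to the server and return a list of (word, pos) pairs."""
    props = {"annotators": "tokenize,ssplit,pos", "outputFormat": "json"}
    query = urllib.parse.urlencode({"properties": json.dumps(props)})
    req = urllib.request.Request(f"{url}/?{query}", data=text.encode("utf-8"))
    with urllib.request.urlopen(req, timeout=15) as resp:
        doc = json.loads(resp.read().decode("utf-8"))
    return [(tok["word"], tok["pos"])
            for sent in doc["sentences"] for tok in sent["tokens"]]
```

Calling `corenlp_pos_tags("What does the system do?")` against a running server returns the tagged tokens that the POS-based question identification works from.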

Contact

The tool was developed by Xavier de Bondt (https://github.com/XavierdeBondt).
