REConSum: the Requirements Elicitation Conversations Summarizer

About

The code in the code folder extracts requirements-relevant questions from a transcription of a requirements elicitation session (an interview). It consists of three parts, plus an example.

The first part (preprocessing.py) preprocesses the transcription, converting it to a format the later steps can use. Next, the code in find_questions.py identifies the questions, using either Part of Speech tags or Dialogue Acts. Finally, the code in categorize_relevance.py uses TF-IDF to determine whether each question is requirements-relevant. The Jupyter notebook example.ipynb orchestrates these three Python files.
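To make the relevance step more concrete, here is a minimal sketch of how TF-IDF weights could be used to flag a question as relevant. It is not the code from categorize_relevance.py: the function names (tokenize, tfidf_weights, is_relevant), the toy texts, and the threshold are all assumptions made for illustration, and the small background list only stands in for the Wikipedia terms the tool actually uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_weights(documents):
    """Return one {term: tf-idf} dict per document.

    tf is the relative frequency of a term within a document;
    idf is log(N / df), with df the number of documents containing the term.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def is_relevant(question, domain_weights, threshold):
    """Flag a question when any of its terms weighs more than the threshold."""
    return any(domain_weights.get(term, 0.0) > threshold
               for term in tokenize(question))


# Hypothetical data: one conversation and a tiny general-purpose background.
conversation = "the invoice system should export each invoice as a pdf report"
background = [
    "the weather is nice and the sun is out",
    "the system of rivers runs to the sea",
]
weights = tfidf_weights([conversation] + background)[0]

print(is_relevant("How often do you send an invoice?", weights, 0.15))  # True
print(is_relevant("Is the weather nice today?", weights, 0.15))         # False
```

Terms that also occur in the background texts (such as "the") receive a low or zero weight, so only terms specific to the conversation can push a question over the threshold.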

Input for the code is placed in, and read from, the data folder.

The results of our tests can be found in the results folder.

Requirements

To run the code and replicate the example in the notebook, you will need the following:

(Python >= 3.7)
(Jupyter Notebook >= 6.4.0)

With the following packages:

(DialogTag >= 1.1.3)
(NLTK >= 3.6)
(Numpy >= 1.20)
(Pandas >= 1.4.0)
(Pycorenlp == 0.3.0)
(Sklearn >= 1.0)
(Stemming == 1.0)
(Textract >= 1.6.4)
(Tqdm >= 4.62.0)

Furthermore, you will need input: a transcription of a conversation. Our preprocessing code expects input generated by AWS Transcribe; for any other format, the preprocessing code has to be adjusted.

Finally, the TF-IDF comparison needs TF-IDF terms from a Wikipedia dump. Please place the file wiki_tfidf_terms.csv in the data folder.

In the example, we have put our input in a data folder, and we suggest doing the same:

# For demonstration purposes, we will use one of our conversations from our dataset
transcription_file = "../data/example_conversation.txt"
wiki_tfidf_file = "../data/wiki_tfidf_terms.csv"

Running the code

Before running the code, make sure the requirements above are met. Furthermore, if you wish to use Part of Speech tags to identify the questions, StanfordCoreNLP has to be running locally.

Start the StanfordCoreNLP server so that it listens for calls made to http://localhost:9000/ by running the following command in the StanfordCoreNLP home directory:

# Run the server using all jars in the current directory (e.g., the CoreNLP home directory)
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
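Before running the notebook, it can be useful to confirm that the server is reachable. The sketch below uses only the Python standard library; the function name corenlp_server_is_up is hypothetical and not part of this repository. It only checks that something answers HTTP requests at the given address, not that the answering process is StanfordCoreNLP.

```python
import urllib.error
import urllib.request


def corenlp_server_is_up(url="http://localhost:9000/", timeout=5):
    """Return True if an HTTP server answers at url, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even though it returned an error status.
        return True
    except OSError:
        # Connection refused, timeout, or a name resolution failure.
        return False


if __name__ == "__main__":
    if corenlp_server_is_up():
        print("A server is answering on http://localhost:9000/")
    else:
        print("No server found; start StanfordCoreNLP first.")
```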

Contact

The tool was developed by Xavier de Bondt (https://github.com/XavierdeBondt)
