
Code-commenting

A design project on code comment generation.

Dataset

  • We use the dataset from DeepCom for our training. We trained mostly on Google Colaboratory; the university's HPC (when it is not too busy) or GCloud credits (if any of your cards can get through without a refund) are also worth trying. A minimal data-loading sketch follows.
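How the data is read is not shown in this README; the sketch below is one plausible way to load DeepCom-style line-aligned code/comment files and build vocabularies. The file names, special tokens, and frequency cutoff are assumptions, not the repository's actual pipeline.

```python
# A minimal sketch, assuming DeepCom-style line-aligned code/comment files.
# File names, special tokens, and the frequency cutoff are hypothetical.
from collections import Counter

def load_pairs(code_path="train.code", comment_path="train.comment"):
    with open(code_path, encoding="utf-8") as fc, \
         open(comment_path, encoding="utf-8") as fm:
        # One tokenized code sequence per line, aligned with one comment per line.
        return [(c.split(), m.split()) for c, m in zip(fc, fm)]

def build_vocab(token_seqs, min_freq=2):
    counts = Counter(tok for seq in token_seqs for tok in seq)
    vocab = {"<pad>": 0, "<unk>": 1, "<sos>": 2, "<eos>": 3}
    for tok, n in counts.most_common():
        if n >= min_freq and tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

pairs = load_pairs()
code_vocab = build_vocab(c for c, _ in pairs)
comment_vocab = build_vocab(m for _, m in pairs)
```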

Results

  • Our model did not cross SOTA performance, which is something we expected. It has, however, managed to produce semantically correct comments, occasionally more informative than the users' own comments.
  • Many of the comments in the first epoch were repetitive, but the number of meaningful comments increased significantly as training progressed.

NOTE: In each example below, the first line is the comment produced by the model and the second is the true human comment:
[example image]

The model requires more training for rarer tokens:
[example image]

Here the model fails to produce a grammatically correct word, but it still captures the inner semantics of the code:
[example image]

Due to teacher forcing, the model gets confused when the human reference comment is poor, but it still tries to produce a meaningful comment (a sketch of teacher forcing follows the example):
[example image]
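For context, teacher forcing means the decoder is trained on the gold token from the reference comment rather than its own previous prediction, so a bad reference feeds directly into the next decoding step. The sketch below is illustrative PyTorch under that assumption; the module, its sizes, and the 0.5 forcing ratio are hypothetical, not the repository's actual model.

```python
# A minimal sketch of teacher forcing in a seq2seq decoder (PyTorch);
# module names, sizes, and the forcing ratio are illustrative only.
import random
import torch
import torch.nn as nn

class Decoder(nn.Module):
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, target, hidden, teacher_forcing_ratio=0.5):
        # target: (batch, seq_len) gold comment token ids, starting with <sos>
        batch, seq_len = target.shape
        inp = target[:, :1]                     # <sos> token
        logits = []
        for t in range(1, seq_len):
            emb = self.embed(inp)               # (batch, 1, hidden)
            out, hidden = self.gru(emb, hidden)
            step = self.out(out)                # (batch, 1, vocab)
            logits.append(step)
            if random.random() < teacher_forcing_ratio:
                inp = target[:, t:t + 1]        # feed the gold token
            else:
                inp = step.argmax(-1)           # feed the model's own guess
        return torch.cat(logits, dim=1), hidden
```

At inference time there is no gold token to feed, so the ratio is effectively zero and the decoder consumes only its own predictions.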

The model can also substitute words with similar meanings:
[example images]

BLEU Scoring and Loss Plots

[METEOR score plot]

[BLEU score plot]

[Training loss plot]
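The scoring code is not shown in this README; one standard way to compute the sentence-level BLEU and METEOR values behind such plots is with NLTK, sketched below. The example sentence pair is made up, and METEOR needs NLTK's wordnet data downloaded once.

```python
# A minimal sketch of sentence-level scoring with NLTK; the example
# hypothesis/reference pair is made up. Run nltk.download("wordnet") once
# before using meteor_score.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.meteor_score import meteor_score

reference = "returns the maximum value in the list".split()
hypothesis = "return the max value of the list".split()

# Smoothing avoids zero scores when higher-order n-grams have no overlap.
bleu = sentence_bleu([reference], hypothesis,
                     smoothing_function=SmoothingFunction().method1)
meteor = meteor_score([reference], hypothesis)
print(f"BLEU: {bleu:.3f}  METEOR: {meteor:.3f}")
```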

Box Plots

[BLEU box plot]

[METEOR box plot]
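For reference, plots like these can be produced with matplotlib from the per-example scores; the sketch below uses random stand-in values rather than the project's actual results.

```python
# A minimal sketch of the per-example score box plots; the scores here are
# random placeholders, not the project's actual results.
import random
import matplotlib.pyplot as plt

bleu_scores = [random.random() for _ in range(200)]    # stand-in values
meteor_scores = [random.random() for _ in range(200)]  # stand-in values

plt.boxplot([bleu_scores, meteor_scores])
plt.xticks([1, 2], ["BLEU", "METEOR"])
plt.ylabel("score")
plt.title("Per-example score distributions")
plt.show()
```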

Contributing
