    Repositories list

    • Contains the code for our ICSE 2020 paper: Big Code != Big Vocabulary: Open-Vocabulary Language Models for Source Code and for its earlier pre-print: Maybe Deep Neural Networks are the Best Choice for Modeling Source Code (https://arxiv.org/abs/1903.05734). This is the first open vocabulary language model for code that uses the byte pair encodin…
      Python
      Apache License 2.0
      248333 Updated Mar 24, 2023
    • Hosts our tool for mining simple "stupid" bugs (SStuBs).
      Java
      Apache License 2.0
      183538 Updated May 20, 2022
    • DeepSStuBs is a framework for learning single statement bug detectors from an existing code corpus.
      JavaScript
      MIT License
      0100 Updated Mar 6, 2020
    • bilm-tf

      Public
      TensorFlow implementation of contextualized word representations from bi-directional language models
      Python
      Apache License 2.0
      451100 Updated Mar 2, 2020
    • clams

      Public
      CLAMS API Summarizer
      Python
      Apache License 2.0
      1810 Updated Aug 17, 2018
    • codemining-core

      Public archive
      A set of tools for extracting tokens and ASTs from code
      Java
      BSD 3-Clause "New" or "Revised" License
      102200 Updated Jun 5, 2018
    • sequence-mining

      Public archive
      Probabilistic Sequence Mining
      Java
      GNU General Public License v3.0
      84400 Updated Apr 25, 2018
    • Website for Learning from "Big Code"
      CSS
      MIT License
      17400 Updated Apr 13, 2018
    • api-mining

      Public archive
      Probabilistic API Mining
      Java
      GNU General Public License v3.0
      165320 Updated Jan 8, 2018
    • MAST Group Website
      HTML
      1400 Updated Oct 2, 2017
    • codemining-treelm

      Public archive
      Tree Language Models
      Java
      BSD 3-Clause "New" or "Revised" License
      8900 Updated Jan 1, 2017
    • eqnet

      Public
      Code related to the "Learning Continuous Semantic Representations of Symbolic Expressions" project.
      Python
      BSD 3-Clause "New" or "Revised" License
      83600 Updated Dec 8, 2016
    • Source code related to the variable naming challenge
      Python
      MIT License
      2410 Updated Nov 9, 2016
    • JS Random testing tool and new Definition File creator using old versions
      JavaScript
      1200 Updated Sep 22, 2016
    • Util package to analyse instrumented and collected data from Node.js projects
      Java
      1100 Updated Sep 21, 2016
    • JavaScript analyser using Node and Esprima
      JavaScript
      1200 Updated Sep 21, 2016
    • tassal

      Public archive
      Tree-based Autofolding Software Summarization Algorithm
      Java
      BSD 3-Clause "New" or "Revised" License
      74210 Updated Jul 30, 2016
    • Repository for the code of the "A Convolutional Attention Network for Extreme Summarization of Source Code" paper
      HTML
      BSD 3-Clause "New" or "Revised" License
      3111900 Updated Jul 19, 2016
    • itemset-mining

      Public archive
      Probabilistic Itemset Mining
      Java
      GNU General Public License v3.0
      31900 Updated Jun 22, 2016
    • codemining-utils

      Public archive
      Utility classes for serialization, parameter loading, sampling and math
      Java
      BSD 3-Clause "New" or "Revised" License
      8400 Updated Mar 9, 2016
    • Maven repository for JARs not on Maven Central
      Python
      3200 Updated Feb 1, 2016
    • naturalize

      Public archive
      Source code for the Naturalize project
      Java
      BSD 3-Clause "New" or "Revised" License
      125610 Updated Sep 5, 2015
    • codemining-sequencelm

      Public archive
      Sequential Language Models
      Java
      BSD 3-Clause "New" or "Revised" License
      6400 Updated Sep 5, 2015
    • A set of tools for traversing a Git repository and possibly its files
      Java
      BSD 3-Clause "New" or "Revised" License
      6300 Updated Mar 28, 2015
    • nlptools

      Public archive
      A set of NLP tools that may be useful when processing text
      Java
      BSD 3-Clause "New" or "Revised" License
      1100 Updated Dec 17, 2014