This dataset consists of vulnerability-causing and vulnerability-fixing commits collected from Java projects hosted on GitHub. Separate data files hold different information about each commit: the changed code hunk, its surrounding context, and the entire source file, both in original form and as an AST representation.
The dataset is described in the paper "DeepCVA: Automated Commit-level Vulnerability Assessment with Deep Multi-task Learning".