Generate ASTs for JavaScript files in a format compatible with the 150k JavaScript Dataset.
This package can be installed with npm by running
npm i -g bigcode-astgen
or by fetching this repository and running
cd bigcode-astgen/javascript
npm i -g .
bigcode-astgen-js -o <output> <input>
<input> should be a file, or a glob expression matching several files.
In normal mode, <input> is interpreted as a filename and the resulting AST is written to <output> if provided, else printed to stdout.
In batch mode, <input> is interpreted as a glob, and all matching files are parsed. <output> is a prefix, and the files <output>.json, <output>.txt and <output>_failed.txt will be created.
<output>.json - contains one JSON-formatted AST per line
<output>.txt - contains one filename per line, in the same order as <output>.json
<output>_failed.txt - contains one filename per line, with the reason why it could not be parsed
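Because <output>.json and <output>.txt are written in the same order, line i of one corresponds to line i of the other. The sketch below shows one way to pair them up after a batch run. The helper name pairAsts is hypothetical (it is not part of this package), and it assumes both files are small enough to read into memory at once.

```javascript
// Pair each AST in <output>.json with its filename in <output>.txt.
// pairAsts is a hypothetical helper, not an export of bigcode-astgen.
// Both arguments are the raw text contents of the two files.
function pairAsts(jsonText, namesText) {
  // Split into lines and drop empty ones (e.g. after a trailing newline).
  const astLines = jsonText.split(/\r?\n/).filter((line) => line.trim() !== '');
  const names = namesText.split(/\r?\n/).filter((line) => line.trim() !== '');
  if (astLines.length !== names.length) {
    throw new Error(`Expected ${names.length} ASTs, found ${astLines.length}`);
  }
  // Line i of the .txt file names the source of line i of the .json file.
  return names.map((file, i) => ({ file, ast: JSON.parse(astLines[i]) }));
}

// Possible usage after a batch run (the "result/asts" prefix is illustrative):
// const fs = require('fs');
// const pairs = pairAsts(
//   fs.readFileSync('result/asts.json', 'utf8'),
//   fs.readFileSync('result/asts.txt', 'utf8')
// );
```

For large outputs, reading the files line by line with a stream would avoid holding everything in memory; the pairing logic stays the same.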
Quote your glob pattern so that it is not expanded by your shell.
bigcode-astgen-js index.js
parses index.js and outputs the result to stdout.
bigcode-astgen-js --batch -o result/asts "src/**/*.js"
parses all .js files in the src directory and writes the results to the result directory with the prefix asts, as asts.json, asts.txt and asts_failed.txt.
bigcode-astgen exports the following functions.
options {Object} - should contain the following properties
  input {String} - glob expression of the files to process
  output {String} - file basename to save the data
callback {Function}
  err {Error | null}
  count {Number} - the number of files processed

path {String} - path of the file to process
output {String} - output file to save the AST, outputs to stdout if falsy
callback {Function}
  err {Error | null}

path {String} - path of the file to process
callback {Function}
  err {Error | null}
  ast {Array} - the nodes of the AST in the 150k JavaScript dataset format

content {String} - a JavaScript program
return: {Array} the nodes of the AST in the 150k JavaScript dataset format

root {Node} - the root of the AST parsed by acorn
return: {Array} the nodes of the AST in the 150k JavaScript dataset format
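
The returned array is a flat list of nodes rather than a nested tree. As a rough illustration of how such a list might be consumed, the sketch below rebuilds a nested structure. It rests on an assumption not stated above: that each node is an object with a type, an optional value, and an optional children array of indices into the same list, with the root at index 0. The function name toNestedTree and the sample data are hypothetical; check them against actual output before relying on this.

```javascript
// Rebuild a nested tree from a flat node list.
// ASSUMPTION: each node looks like { type, value?, children? }, where
// children holds indices into the same array and index 0 is the root.
// toNestedTree is a hypothetical helper, not an export of bigcode-astgen.
function toNestedTree(nodes, index = 0) {
  const node = nodes[index];
  const nested = { type: node.type };
  if (node.value !== undefined) nested.value = node.value;
  // Resolve each child index recursively into a nested node.
  nested.children = (node.children || []).map((child) => toNestedTree(nodes, child));
  return nested;
}

// Hand-written sample in the assumed layout; not real tool output.
const sample = [
  { type: 'Program', children: [1] },
  { type: 'ExpressionStatement', children: [2] },
  { type: 'Identifier', value: 'x' },
];
console.log(JSON.stringify(toNestedTree(sample), null, 2));
```

The recursion depth equals the depth of the tree, so very deeply nested programs may call for an iterative traversal instead.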