Added instructions on running a small subset of NAB
subutai committed Jul 23, 2016
1 parent 7b15b44 commit bee1c76
Showing 2 changed files with 72 additions and 11 deletions.
49 changes: 38 additions & 11 deletions README.md
@@ -137,11 +137,13 @@ PYTHONPATH setup:

There are several different use cases for NAB:

1. If you just want to look at all
the results we reported in the paper, there is no need to run anything.
All the data files are in the data subdirectory and all individual detections
for reported algorithms are checked in to the results subdirectory. Please see
the README files in those locations.
1. If you just want to look at all the results we reported in the paper, there
is no need to run anything. All the data files are in the data subdirectory and
all individual detections for reported algorithms are checked in to the results
subdirectory. Please see the README files in those locations.

1. If you want to plot some of the results, please see the README in the
`scripts` directory for `scripts/plot.py`.

1. If you have your own algorithm and want to run the NAB benchmark, please see
the [NAB Entry Points](https://github.com/numenta/NAB/wiki#nab-entry-diagram)
@@ -157,6 +159,9 @@ the directions below to "Run HTM with NAB".
the directions below to "Run full NAB". Note that this will take hours as the
Skyline code is quite slow.

1. If you just want to run NAB on one or more data files (e.g. for debugging),
follow the directions below to "Run subset of NAB data files".


##### Run HTM with NAB

@@ -190,16 +195,38 @@ the specific version of NuPIC (and associated nupic.core) that is noted in the
cd /path/to/nab
python run.py

This will run everything and produce results files for the anomaly detection
methods. Included in the repo are the Numenta anomaly detection method, as well
as methods from the [Etsy Skyline](https://github.com/etsy/skyline) anomaly
detection library, a random detector, and a null detector. This will also pass
those results files to the scoring script to generate final NAB scores.
**Note**: this option will take many many hours to run.
This will run everything and produce results files for all anomaly detection
methods. Several algorithms are included in the repo, such as the Numenta
HTM anomaly detection method, methods from the [Etsy
Skyline](https://github.com/etsy/skyline) anomaly detection library, a sliding
window detector, and Bayes Changepoint. This will also pass those results
files to the scoring script to generate final NAB scores. **Note**: this option
will take many hours to run.

The run.py command has a number of useful options. To view a description of the
command line options, enter:

python run.py --help


##### Run subset of NAB data files

For debugging it is sometimes useful to run your algorithm on a subset of the
NAB data files or on your own set of data files. You can do that by creating a
custom `combined_windows.json` file that only contains labels for the files you
want to run. The new file must use exactly the same format as
`combined_windows.json` but contain windows only for the files you are
interested in.
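
One way to build such a file is to filter the full label set down to the files
of interest. The sketch below is an illustration only and is not part of NAB:
the helper names `subset_windows` and `write_subset` are hypothetical, and it
assumes the label file is a JSON object that maps relative data file paths to
lists of `[start, end]` windows, as in the example file added by this commit.

```python
import json


def subset_windows(all_windows, wanted):
    """Return only the entries of all_windows whose key is in wanted.

    Raises KeyError if a requested file has no entry, so that typos in
    the file names are caught before anything is written.
    """
    missing = [name for name in wanted if name not in all_windows]
    if missing:
        raise KeyError("no windows for: " + ", ".join(missing))
    return {name: all_windows[name] for name in wanted}


def write_subset(src_path, dst_path, wanted):
    """Read the full windows file and write a filtered copy (hypothetical helper)."""
    with open(src_path) as f:
        all_windows = json.load(f)
    with open(dst_path, "w") as f:
        json.dump(subset_windows(all_windows, wanted), f, indent=4)
```

For instance, `write_subset("labels/combined_windows.json",
"labels/my_windows.json", ["realKnownCause/nyc_taxi.csv"])` would write a
one-file label set. Both paths are assumptions: the location of the full label
file is inferred from the example file's directory, and `my_windows.json` is a
made-up name.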

**Example**: a labels file covering two data files is in
`labels/combined_windows_tiny.json`. The following commands run NAB on that
subset of labels:

cd /path/to/nab
python run.py -d numenta --detect --windowsFile labels/combined_windows_tiny.json

This will run the `detect` phase of NAB on the data files specified in the above
JSON file. Scoring and normalization are not supported with this option. Note
that you may see warning messages regarding the lack of labels for other files.
You can ignore these warnings.

34 changes: 34 additions & 0 deletions labels/combined_windows_tiny.json
@@ -0,0 +1,34 @@
{
"realKnownCause/nyc_taxi.csv": [
[
"2014-10-30 15:30:00.000000",
"2014-11-03 22:30:00.000000"
],
[
"2014-11-25 12:00:00.000000",
"2014-11-29 19:00:00.000000"
],
[
"2014-12-23 11:30:00.000000",
"2014-12-27 18:30:00.000000"
],
[
"2014-12-29 21:30:00.000000",
"2015-01-03 04:30:00.000000"
],
[
"2015-01-24 20:30:00.000000",
"2015-01-29 03:30:00.000000"
]
],
"realKnownCause/rogue_agent_key_hold.csv": [
[
"2014-07-15 04:35:00.000000",
"2014-07-15 13:25:00.000000"
],
[
"2014-07-17 05:50:00.000000",
"2014-07-18 06:45:00.000000"
]
]
}
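
A hand-edited labels file like the one above can be sanity-checked before a
run. The following is a minimal sketch under stated assumptions, not a NAB
utility: `check_windows` is a hypothetical helper, and the timestamp format is
inferred from the entries in this file rather than taken from NAB's own
parsing code.

```python
from datetime import datetime

# Assumed format, inferred from entries such as "2014-10-30 15:30:00.000000".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def check_windows(windows_by_file):
    """Return a list of problem descriptions; an empty list means no issues found."""
    problems = []
    for name, windows in windows_by_file.items():
        for window in windows:
            # Each window is expected to be a [start, end] pair.
            if len(window) != 2:
                problems.append("%s: %r is not a [start, end] pair" % (name, window))
                continue
            try:
                start, end = [datetime.strptime(t, TIMESTAMP_FORMAT) for t in window]
            except (ValueError, TypeError):
                problems.append("%s: unparseable timestamp in %r" % (name, window))
                continue
            if start > end:
                problems.append("%s: window %r starts after it ends" % (name, window))
    return problems
```

Loading the file with `json.load` and passing the resulting dictionary to
`check_windows` returns an empty list when every window parses and is
correctly ordered.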
