This repo will be updated soon with streamlined code (modular structure, TFRecords, and faster evaluation).
-
Extreme classification repository - http://manikvarma.org/downloads/XC/XMLRepository.html . Please download any of the datasets mentioned in the paper, unzip them, and move the files to the respective data folder (like ./amazon_670k/data/).
-
Download the ODP dataset from http://hunch.net/~vw/odp_train.vw.gz and http://hunch.net/~vw/odp_test.vw.gz . The data format must be changed to match that of the datasets in the Extreme Classification repository.
-
Download the ImageNet dataset from http://hunch.net/~jl/datasets/imagenet/training.txt.gz and http://hunch.net/~jl/datasets/imagenet/testing.txt.gz . As with ODP, the data format must be changed to match that of the datasets in the Extreme Classification repository.
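The format change mentioned above is not spelled out in this README, so the following is only a minimal sketch of one way to do it, not the script used by the authors. It assumes each input line is Vowpal Wabbit-style (`label | index:value index:value ...`, with no namespace name and integer feature indices), and that the target is the Extreme Classification repository layout: a header line `num_points num_features num_labels` followed by one `label1,label2,... index:value ...` line per point. The names `vw_line_to_xc`, `convert`, and `label_offset` are hypothetical. Please check a few lines of the downloaded files before relying on these assumptions.

```python
import io


def vw_line_to_xc(line, label_offset=0):
    """Convert one 'label | idx:val ...' line to 'label idx:val ...'.

    Assumes a single integer label and integer feature indices.
    Returns (converted_line, label, largest_feature_index).
    """
    head, _, tail = line.strip().partition("|")
    label = int(head.split()[0]) + label_offset
    feats = []
    max_idx = -1
    for tok in tail.split():
        idx, _, val = tok.partition(":")
        val = val or "1"  # assumption: a feature without a value means 1
        feats.append(f"{int(idx)}:{val}")
        max_idx = max(max_idx, int(idx))
    return f"{label} " + " ".join(feats), label, max_idx


def convert(src, dst, label_offset=0):
    """Read VW-style lines from src, write header plus converted lines to dst."""
    rows = []
    max_label = -1
    max_feat = -1
    for line in src:
        if not line.strip():
            continue  # skip blank lines
        row, label, idx = vw_line_to_xc(line, label_offset)
        rows.append(row)
        max_label = max(max_label, label)
        max_feat = max(max_feat, idx)
    # Header counts assume 0-based labels and feature indices.
    dst.write(f"{len(rows)} {max_feat + 1} {max_label + 1}\n")
    for row in rows:
        dst.write(row + "\n")


# Tiny in-memory demonstration with made-up data.
demo_in = io.StringIO("3 | 0:1.5 7:2\n1 | 4\n")
demo_out = io.StringIO()
convert(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

For the real files, `src` could be opened with `gzip.open("odp_train.vw.gz", "rt")`. If the labels in the downloaded files start at 1, passing `label_offset=-1` would shift them to 0-based. The header here is computed from a single file, so the feature and label counts for the train and test files may need to be set to the maximum over both. The sketch keeps all converted rows in memory, which is simple but may need to become a two-pass approach for very large files.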
Please move into the src folder for the respective dataset, like 'amazon_670k/src/'. The steps to build indexes, train, and evaluate are listed sequentially in run.sh.