Releases: eXascaleInfolab/GenConvNMI
Optional Node Base Sync to the 1st File
Features
- Optional synchronization of the node base to the first file added, in addition to the synchronization to the minimal node base
- Description refined
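The two synchronization modes can be illustrated with a minimal sketch. It assumes that synchronizing means dropping from every cluster the node ids absent from the chosen node base, and that the minimal node base is the one of the collection with the fewest nodes. Both are assumptions, and the function names `node_base` and `sync_to_base` are hypothetical and not part of GenConvNMI.

```python
def node_base(clusters):
    """Return the set of all node ids that occur in a collection of clusters."""
    return set().union(*clusters)


def sync_to_base(clusters, base):
    """Drop nodes outside of `base`; clusters that become empty are removed."""
    synced = (set(c) & base for c in clusters)
    return [c for c in synced if c]


first = [{0, 1, 2}, {2, 3}]
second = [{1, 2}, {2, 4}]

# Mode 1 (assumed semantics): synchronize to the node base of the first file
second_to_first = sync_to_base(second, node_base(first))

# Mode 2 (assumed semantics): synchronize to the smallest node base
base_min = min(node_base(first), node_base(second), key=len)
first_min = sync_to_base(first, base_min)
second_min = sync_to_base(second, base_min)

print(second_to_first)
print(first_min, second_min)
```

In this toy input the node 4 is dropped from the second collection in the first mode, while the nodes 0 and 3 are dropped from the first collection in the second mode.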
Fixes
- Missed dependency added (for the AggHash details see the paper StaTIX — Statistical Type Inference on Linked Data, BigData'18)
The executable is built on Ubuntu x64 16.04
Accuracy and Convergence Improved
Accuracy and convergence on large datasets are further improved, with an insignificant impact on the efficiency.
The executable is built on Ubuntu x64 16.04.
Heavy Overlaps Processing Refined and Optimized
- Normalization of the vertex samples (importance) refined
- Multimatch of the modules refined and optimized (hard and soft matching converged!)
- Some speed optimizations
The executable is built on Ubuntu x64 16.04
NMI Constraints Considered
- NMI constraints considered: explicit exceptions are thrown when NMI is not applicable instead of outputting 0 or 1. For example, NMI is formally equal to zero when one clustering consists of a single cluster containing all nodes, whatever the other clustering is, which says nothing about the similarity of the clusterings
- Some accuracy and performance optimizations made to avoid the dependence of the evaluation complexity on the size of the overlaps
- Code refined
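The single-cluster constraint can be illustrated with a minimal sketch. It is restricted to non-overlapping clusterings given as flat label lists and uses the classical NMI normalized by the maximal entropy; it does not reproduce the generalized overlapping measure or the sampling used by GenConvNMI. The function names and the choice of `ValueError` are assumptions made for the illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a flat cluster labeling."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in bits) of two labelings of the same nodes."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        c / n * math.log2(c * n / (ca[x] * cb[y]))
        for (x, y), c in cab.items()
    )


def nmi_max(a, b):
    """NMI normalized by max(H(a), H(b)); raises when NMI is not applicable."""
    if len(a) != len(b) or not a:
        raise ValueError("labelings must be non-empty and of equal length")
    if len(set(a)) == 1 or len(set(b)) == 1:
        # A single cluster of all nodes has zero entropy, so MI is formally 0
        # whatever the other clustering is: report it instead of returning 0.
        raise ValueError("NMI is not applicable to a single-cluster clustering")
    return mutual_information(a, b) / max(entropy(a), entropy(b))


print(nmi_max([0, 0, 1, 1], [1, 1, 0, 0]))  # identical partitions
try:
    nmi_max([0, 0, 0, 0], [0, 1, 0, 1])
except ValueError as err:
    print("not applicable:", err)
```

Raising an error keeps a degenerate input from being reported as a legitimate score of 0.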
The executable is built on Ubuntu x64 16.04
Evaluation of duplicated and equally similar clusters
- Evaluation of equally similar clusters in the collections (including duplicated clusters, which earlier caused exceptions)
- Bug fixes: the evaluation of fully matched clusters when MI and H1/2 are zero now correctly yields 1, etc.
- Minor optimizations
The executable is built on Ubuntu x64 16.04
Ids Remapping Added
- Input ids remapping added to allow a non-contiguous range of arbitrary ids (still numbers); the -i input option should be specified
- Default input ids are allowed to start from 1 besides 0
By default (without the -i option) the input ids should form a contiguous range that starts from 0 or 1.
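The idea of the remapping can be sketched as follows: arbitrary numeric ids are mapped to a contiguous range starting from 0 in the order of their first occurrence. This is only an illustration of the technique; the function name `remap_ids` is hypothetical and the actual implementation behind the -i option may differ.

```python
def remap_ids(clusters):
    """Map arbitrary numeric node ids to the contiguous range 0..n-1."""
    id_map = {}
    remapped = []
    for cluster in clusters:
        # setdefault assigns the next free index on the first occurrence of an id
        remapped.append([id_map.setdefault(nid, len(id_map)) for nid in cluster])
    return remapped, id_map


clusters = [[10, 42, 7], [42, 1000]]
remapped, id_map = remap_ids(clusters)
print(remapped)  # [[0, 1, 2], [1, 3]]
print(id_map)    # {10: 0, 42: 1, 7: 2, 1000: 3}
```

Keeping `id_map` lets the original ids be restored after the evaluation if they are needed.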
The build is made on Linux Ubuntu 16.04 x64.
NMIsqrt Added
- NMIsqrt added
- Output options extended
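For reference, a minimal sketch of two common NMI normalizations, assuming that NMIsqrt denotes the normalization of the mutual information by the geometric mean of the entropies, sqrt(H1 * H2), as opposed to the normalization by max(H1, H2). The function names are hypothetical, and the inputs are expected to be a precomputed mutual information and two positive entropies.

```python
import math


def normalize_max(mi, h1, h2):
    """NMI normalized by the maximal entropy of the two clusterings."""
    return mi / max(h1, h2)


def normalize_sqrt(mi, h1, h2):
    """NMI normalized by the geometric mean of the two entropies."""
    return mi / math.sqrt(h1 * h2)


mi, h1, h2 = 0.5, 1.0, 2.0
print(normalize_max(mi, h1, h2))   # 0.25
print(normalize_sqrt(mi, h1, h2))  # 0.5 / sqrt(2), about 0.354
```

Since sqrt(H1 * H2) never exceeds max(H1, H2), the sqrt-normalized value is never lower than the max-normalized one, and the two coincide when the entropies are equal.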
The executable is built on Ubuntu x64 16.04
Node Estimation Fixed
- Node estimation fixed to avoid "bad alloc" caused by memory over-reservation
- Minor refinements of notifications and docs
- Common code with onmi is shared
The executable is built on Ubuntu x64 16.04
Improved Accuracy due to the Weighted Sampling
- Improved accuracy due to the weighted sampling proportional to the number of clusters in the collection
- A random generator seeded by the random device is used instead of using the random device directly
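Both changes can be illustrated with a short sketch. It assumes one possible reading of the weighted sampling, namely that the collection to sample from is chosen with a probability proportional to its number of clusters; the actual sampling scheme of GenConvNMI is not reproduced here. The function names `make_rng` and `pick_collection` are hypothetical.

```python
import random


def make_rng():
    """Pseudo-random generator seeded once from the OS entropy source."""
    seed = random.SystemRandom().getrandbits(64)
    return random.Random(seed)


def pick_collection(cluster_counts, rng):
    """Pick a collection index with probability proportional to its cluster count."""
    indices = range(len(cluster_counts))
    return rng.choices(indices, weights=cluster_counts, k=1)[0]


rng = make_rng()
counts = [10, 30]  # number of clusters in each of the two collections
picks = [pick_collection(counts, rng) for _ in range(1000)]
print(picks.count(1) / len(picks))  # expected to be close to 0.75
```

Querying the entropy source only once for the seed and drawing the samples from a pseudo-random generator is typically much faster than querying the entropy source for every sample.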
The executable is built on Ubuntu x64 16.04
Collections Order Invariant Sampling
The stochastic sampling is fixed to be invariant to the order of the collections.
The executable is built on Ubuntu x64 16.04