- Would be nice to compare memory and CPU usage.
- Would be nice to make sure all tests use the same resampling algorithm. I assume they all use Lanczos3, but this needs to be verified.
- Would be nice to show statistical error regions in the graph, perhaps with something like a candlestick chart.
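
For the error-region idea, here is a minimal sketch of how the per-test timings could be reduced to the five values a candlestick (or box) plot needs. It assumes each test is run several times and its timings are collected in a plain list of seconds; the function name `summarize_timings` and the sample values are hypothetical and not part of the existing benchmark.

```python
import statistics


def summarize_timings(samples):
    """Reduce repeated timings (seconds) to candlestick-style statistics.

    `samples` is a hypothetical list of per-run timings for one test.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate spread")
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(samples, n=4)
    return {
        "min": min(samples),      # lower wick
        "q1": q1,                 # bottom of the body
        "median": median,
        "q3": q3,                 # top of the body
        "max": max(samples),      # upper wick
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),  # sample standard deviation
    }


if __name__ == "__main__":
    # Made-up timings, for illustration only.
    print(summarize_timings([0.101, 0.098, 0.105, 0.099, 0.110]))
```

The min/max pair gives the wicks and the quartiles give the body, so the resulting dictionary can be handed to whatever plotting tool draws the graph. Using quartiles rather than mean plus or minus standard deviation is a design choice here, since benchmark timings are often skewed by occasional slow runs.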