IOPS improvements #35

asqasq · 2017-11-28T09:43:18Z

This PR contains changes to improve the number of IOPS:

- Adapt to the new DaRPC API
- Additional Crail benchmark to measure IOPS with multiple namenodes
- Additional HDFS benchmark to measure namenode IOPS
- Move Crail namenode statistics out of the fast path into a separate class, which can be instantiated instead of the original one when we want to take measurements
- Added properties to tune the mempool
- Request DaRPC version 1.4

Note: This PR has to be merged together with the corresponding DaRPC PR.

PepperJo · 2017-11-28T10:32:17Z

client/src/main/java/com/ibm/crail/tools/CrailBenchmark.java

+		double latency = 0.0;
+		if (executionTime > 0) {
+			latency = 1000000.0 * executionTime / ops;
+		}


I know this logic has been used elsewhere in this file, but can we please change this to not use doubles and millis to calculate execution time? Instead, use System.nanoTime() and do the calculations with long where it makes sense. For latency etc. we of course need double.

I am not exactly sure why calculating in nanoseconds is better than milliseconds (both return long, but we need double for the division).

If we change that, it should be everywhere to keep it consistent. I'll create an issue, as I think that this is a bigger cleanup step.

The problem with currentTimeMillis is that it returns the actual wall-clock time, i.e. it is not monotonic, and that leads to obvious problems: if you are running an NTP service, for example, the time can change while you run a benchmark. That is why it is recommended to use System.nanoTime().

Ok sounds good. Will change all benchmark calculations in a new PR.
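For reference, a minimal sketch of what the nanoTime-based calculation discussed in this thread could look like. The class and method names (NanoTimeLatencySketch, latencyMicros, opsPerSecond) are hypothetical and not part of CrailBenchmark; the loop body is only a placeholder for the benchmarked operation.

```java
final class NanoTimeLatencySketch {
    // Average latency per operation in microseconds.
    // elapsedNanos is a difference of two System.nanoTime() readings.
    static double latencyMicros(long elapsedNanos, long ops) {
        if (elapsedNanos <= 0 || ops <= 0) {
            return 0.0;
        }
        return (elapsedNanos / 1000.0) / ops;
    }

    // Operations per second for the same interval.
    static double opsPerSecond(long elapsedNanos, long ops) {
        if (elapsedNanos <= 0) {
            return 0.0;
        }
        return ops * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        long ops = 100_000;
        long sink = 0;
        long start = System.nanoTime(); // monotonic, unaffected by NTP adjustments
        for (long i = 0; i < ops; i++) {
            sink += i; // placeholder for the operation under test
        }
        long elapsed = System.nanoTime() - start; // stays a long until the final division
        System.out.println("latency [us] " + latencyMicros(elapsed, ops)
                + ", ops/s " + opsPerSecond(elapsed, ops) + " (sink " + sink + ")");
    }
}
```

The interval is kept as a long and only converted to double at the final division, which matches the request to use long where it makes sense.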

PepperJo · 2017-11-28T10:32:39Z

client/src/main/java/com/ibm/crail/tools/CrailBenchmark.java

+		}
+		long end = System.currentTimeMillis();
+		double executionTime = ((double) (end - start));
+		double latency = executionTime*1000.0 / ((double) batch);


See comment above

PepperJo · 2017-11-28T10:34:07Z

client/src/main/java/com/ibm/crail/tools/CrailBenchmark.java

@@ -999,7 +1104,8 @@ public static void main(String[] args) throws Exception {
 		boolean useBuffered = true;

 		String benchmarkTypes = "write|writeAsync|readSequential|readRandom|readSequentialAsync|readMultiStream|"
-				+ "createFile|createFileAsync|createMultiFile|getKey|getFile|getFileAsync|enumerateDir|browseDir|"
+				+ "createFile|createFileAsync|createMultiFile|getKey|getFile|getFileAsync|getMultiFile"
+				+ "getMultiFileAsync|enumerateDir|browseDir|"


Can we make this more generic? Lots of strings are duplicated, which invites spelling errors.
I would like to see something like a map from string -> benchmark method.

I agree with this, there should be no need for string duplication.

I'd prefer to do this as a cleanup step and will create an issue.
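As a starting point for that cleanup issue, here is a rough sketch of the suggested map from string to benchmark method. Everything in it (BenchmarkRegistrySketch, the Benchmark interface and its parameters) is hypothetical and only illustrates the idea; it is not how CrailBenchmark is structured today.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BenchmarkRegistrySketch {
    // Hypothetical signature; the real benchmark methods take different arguments.
    interface Benchmark {
        void run(String path, int size, int loop);
    }

    // LinkedHashMap keeps registration order, so the usage string is stable.
    private final Map<String, Benchmark> benchmarks = new LinkedHashMap<>();

    void register(String name, Benchmark benchmark) {
        if (benchmarks.putIfAbsent(name, benchmark) != null) {
            throw new IllegalArgumentException("duplicate benchmark: " + name);
        }
    }

    // The usage string is derived from the keys, so each name is written only once.
    String usage() {
        return String.join("|", benchmarks.keySet());
    }

    void run(String name, String path, int size, int loop) {
        Benchmark benchmark = benchmarks.get(name);
        if (benchmark == null) {
            throw new IllegalArgumentException(
                    "unknown benchmark '" + name + "', expected one of " + usage());
        }
        benchmark.run(path, size, loop);
    }
}
```

Because the separator is added by String.join, a hand-concatenated list cannot lose a `|` between two names, which is what happens between getMultiFile and getMultiFileAsync in the diff quoted above.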

PepperJo · 2017-11-28T10:34:52Z

hdfs/src/main/java/com/ibm/crail/hdfs/tools/HdfsIOBenchmark.java

@@ -77,7 +79,7 @@ public void run() throws Exception {

 	public static void usage(){
 		System.out.println("Usage:");
-		System.out.println("hdfsbench <readSequentialDirect|readSequentialHeap|readRandomDirect|readRandomHeap|writeSequentialHeap> <size> <iterations> <path>");
+		System.out.println("hdfsbench <readSequentialDirect|readSequentialHeap|readRandomDirect|readRandomHeap|writeSequentialHeap|getFile|getFileIOPS> <size> <iterations> <path>");


Again, please make this more generic.

PepperJo · 2017-11-28T10:35:38Z

hdfs/src/main/java/com/ibm/crail/hdfs/tools/HdfsIOBenchmark.java

+		}
+		long end = System.currentTimeMillis();
+		double iops = ((double)loop) / (end - start) * (double)1000.0;
+		double executionTime = ((double) (end - start));


Again, see above.

PepperJo · 2017-11-28T10:36:29Z

rpc-darpc/src/main/java/com/ibm/crail/namenode/rpc/darpc/DaRPCConstants.java

@@ -55,6 +55,21 @@
 	public static final String NAMENODE_DARPC_CLUSTERSIZE_KEY = "crail.namenode.darpc.clustersize";
 	public static int NAMENODE_DARPC_CLUSTERSIZE = 128;	

+	public static final String NAMENODE_DARPC_MEMPOOL_HUGEPAGEPATH_KEY = "crail.namenode.darpc.mempool.hugepagepath";


I was under the impression that we don't want this multilevel config anymore, i.e. everything should be darpc.X.

PepperJo · 2017-11-28T10:39:17Z

rpc-darpc/src/main/java/com/ibm/crail/namenode/rpc/darpc/DaRPCServiceDispatcher.java

+	protected AtomicLong renameOps;
+	protected AtomicLong getOps;
+	protected AtomicLong locationOps;
+	protected AtomicLong errorOps;


I prefer putting the Atomics in the stats class instead of making them protected
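To illustrate the suggestion, a minimal sketch of a stats class that owns its counters instead of exposing protected fields. The class name DispatcherStatsSketch and its methods are hypothetical; only the four counter names come from the diff above.

```java
import java.util.concurrent.atomic.AtomicLong;

final class DispatcherStatsSketch {
    // Counters are private; callers only see the record/read methods.
    private final AtomicLong renameOps = new AtomicLong();
    private final AtomicLong getOps = new AtomicLong();
    private final AtomicLong locationOps = new AtomicLong();
    private final AtomicLong errorOps = new AtomicLong();

    void recordRename() { renameOps.incrementAndGet(); }
    void recordGet() { getOps.incrementAndGet(); }
    void recordLocation() { locationOps.incrementAndGet(); }
    void recordError() { errorOps.incrementAndGet(); }

    long getOps() { return getOps.get(); }
    long errorOps() { return errorOps.get(); }

    // Sum of the successful operation counters; errors are reported separately.
    long totalOps() {
        return renameOps.get() + getOps.get() + locationOps.get();
    }

    String summary() {
        return "rename " + renameOps.get() + ", get " + getOps.get()
                + ", location " + locationOps.get() + ", error " + errorOps.get();
    }
}
```

With this shape the dispatcher would hold a single stats object and call the record methods, so no subclass needs access to the atomics themselves.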

patrickstuedi

I think we should take out the code for the statistics that we only used internally (e.g., DaRPCServiceDispatcherStats.java)

asqasq · 2017-12-13T21:23:47Z

I removed the IOPS thread. Crail now uses the simple memory pool. I created two cleanup issues based on PepperJo's comments. Please have a look at the newest version.

PepperJo

One minor comment.

PepperJo · 2017-12-14T08:58:11Z

rpc-darpc/src/main/java/com/ibm/crail/namenode/rpc/darpc/DaRPCNameNodeServer.java

 		String _clusterAffinities[] = DaRPCConstants.NAMENODE_DARPC_AFFINITY.split(",");
 		long clusterAffinities[] = new long[_clusterAffinities.length];
 		for (int i = 0; i < clusterAffinities.length; i++){
 			int affinity = Integer.decode(_clusterAffinities[i]).intValue();
 			clusterAffinities[i] = 1L << affinity;
 		}
 		DaRPCServiceDispatcher darpcService = new DaRPCServiceDispatcher(service);
-		this.namenodeServerGroup = DaRPCServerGroup.createServerGroup(darpcService, clusterAffinities, -1, DaRPCConstants.NAMENODE_DARPC_MAXINLINE, DaRPCConstants.NAMENODE_DARPC_POLLING, DaRPCConstants.NAMENODE_DARPC_RECVQUEUE, DaRPCConstants.NAMENODE_DARPC_SENDQUEUE, DaRPCConstants.NAMENODE_DARPC_POLLSIZE, DaRPCConstants.NAMENODE_DARPC_CLUSTERSIZE);
+		if (!DaRPCConstants.NAMENODE_DARPC_STATS.isEmpty()) {


Maybe it makes sense to treat crail.namenode.darpc.stats as a boolean or similar. I can see this being misinterpreted otherwise and set to "crail.namenode.darpc.stats false" or "crail.namenode.darpc.stats no".
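A small sketch of the boolean interpretation, assuming a plain java.util.Properties object as a stand-in for the Crail configuration (the real configuration class is not shown here, and StatsFlagSketch is a hypothetical name). Only the key crail.namenode.darpc.stats comes from the comment above.

```java
import java.util.Properties;

final class StatsFlagSketch {
    static final String STATS_KEY = "crail.namenode.darpc.stats";

    // Only "true" (case-insensitive) enables statistics; a missing key,
    // "false", "no" or any other value leaves them disabled.
    static boolean statsEnabled(Properties conf) {
        String value = conf.getProperty(STATS_KEY, "false");
        return Boolean.parseBoolean(value.trim());
    }
}
```

With a non-empty check, a value such as "false" or "no" would switch statistics on; with Boolean.parseBoolean both leave them off.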

PepperJo

One minor comment.

PepperJo · 2018-01-18T11:40:12Z

rpc-darpc/src/main/java/com/ibm/crail/namenode/rpc/darpc/DaRPCNameNodeServer.java

+				DaRPCConstants.NAMENODE_DARPC_MEMPOOL_ALLOCSZ,
+				DaRPCConstants.NAMENODE_DARPC_MEMPOOL_ALIGNMENT,
+				DaRPCConstants.NAMENODE_DARPC_MEMPOOL_ALLOC_LIMIT
+				);
 		String _clusterAffinities[] = DaRPCConstants.NAMENODE_DARPC_AFFINITY.split(",");


I would prefer doing all parsing in the DaRPCConstants.

I did not add any parsing here. If you mean the split(), this is old code. I would prefer cleaning up existing code in a separate PR.

Sounds good.
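For the later cleanup PR, one possible shape for moving the affinity parsing out of the server class is sketched below. The helper name parseAffinityMasks and the class AffinityParsingSketch are hypothetical; the split, Integer.decode and 1L << affinity steps mirror the existing code quoted earlier in this review, and the range check is an addition.

```java
final class AffinityParsingSketch {
    // Turns a comma-separated list of CPU indices, e.g. "0,1,3",
    // into one single-bit mask per entry.
    static long[] parseAffinityMasks(String csv) {
        String[] parts = csv.split(",");
        long[] masks = new long[parts.length];
        for (int i = 0; i < parts.length; i++) {
            int cpu = Integer.decode(parts[i].trim());
            if (cpu < 0 || cpu > 63) {
                // 1L << cpu would silently wrap around outside this range
                throw new IllegalArgumentException("affinity out of range: " + cpu);
            }
            masks[i] = 1L << cpu;
        }
        return masks;
    }
}
```

The server would then only consume the ready-made long[] and no longer touch the raw string.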

PepperJo suggested changes Nov 28, 2017

patrickstuedi requested changes Nov 28, 2017

asqasq force-pushed the iopsimprovements branch from 5512a02 to 4140ce6 Compare December 13, 2017 13:55

PepperJo approved these changes Dec 14, 2017

asqasq added 6 commits January 11, 2018 11:12

IOPS improvements, new Crail benchmarks to measure multi-namenode IOPS and new HDFS benchmarks to measure IOPS.

39740af

Use simple memory pool.

64feb09

Use simple mempool with hugepages.

bbed231

Minor fix.

c05d788

NAMENODE_DARPC_STATS is now a boolean.

180a976

Added a limit property to the mempool properties.

52aea73

asqasq force-pushed the iopsimprovements branch from 8d8935a to 52aea73 Compare January 17, 2018 23:43

patrickstuedi approved these changes Jan 18, 2018

PepperJo approved these changes Jan 18, 2018
