Optimization Tool Architecture

Marco Ieni edited this page Aug 31, 2017 · 4 revisions

D-SPACE4Cloud supports the optimization of Hadoop, Spark, and Storm deployments. It uses either Stochastic Well-formed Net (SWN) or Queueing Network (QN) models to estimate the average completion times of MapReduce and Spark jobs and the throughput and cluster utilization of Storm applications, assuming that the Yet Another Resource Negotiator (YARN) Capacity Scheduler is used.

The optimization model can be applied to both statically partitioned and work-conserving clusters, but care should be taken when interpreting the results. In the former case, the performance model provides the mean completion time or throughput for every job class. If the scheduler is instead configured in work-conserving mode, the performance metrics obtained are approximations, because performance can improve when resources are used by other classes instead of lying idle. The SWN and QN models are described in DICE Deliverables D3.4 and D3.8.

D-SPACE4Cloud follows the Service Oriented Architecture (SOA) pattern. Its architecture (see Figure 1) comprises a set of services that can be roughly grouped into three tiers. The first tier implements the frontend of the optimization tool in the form of a standalone Java web service. The frontend manages several concurrent optimization runs and keeps track of the launched experiments.

Figure 1: D-SPACE4Cloud three-tier architecture

The frontend interacts with one or more D-SPACE4Cloud backend instances; each instance is a RESTful Java web service in charge of solving the resource provisioning problem described in DICE Deliverable D3.9. Since the optimization process is time-consuming, the backend has been designed to scale horizontally, while the frontend service balances the load across the backend instances.
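To make the load-balancing idea concrete, here is a minimal sketch of how a frontend could spread optimization runs across several backend instances. The page does not say which balancing policy D-SPACE4Cloud actually uses; round-robin is an assumption chosen for simplicity, and the class name `BackendSelector` and the backend URLs are hypothetical.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of D-SPACE4Cloud: picks backend base URLs
// in round-robin order. The real frontend may use a different policy.
final class BackendSelector {
    private final List<String> backends;
    private final AtomicInteger next = new AtomicInteger();

    BackendSelector(List<String> backends) {
        if (backends.isEmpty()) {
            throw new IllegalArgumentException("at least one backend is required");
        }
        this.backends = List.copyOf(backends); // defensive, immutable copy
    }

    // Returns the backend that should receive the next optimization run.
    // floorMod keeps the index valid even after the counter overflows.
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), backends.size());
        return backends.get(index);
    }
}
```

With two registered backends, successive calls to `pick()` alternate between them, so concurrent runs launched from the frontend are distributed evenly. Because the counter is an `AtomicInteger`, the selector can be shared by several request-handling threads.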

Finally, the third tier comprises a set of third-party utilities providing different services (see DICE Deliverable D3.8). In particular, the backend uses a relational database (through JPA) to store and retrieve information about the target deployment (e.g., the name, memory, number of cores, and speed of the VMs publicly offered by the considered cloud providers).
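The kind of record stored in that database can be sketched as a plain Java class. This is only an illustration under stated assumptions: the class and field names, the units, and the in-memory `VmCatalog` standing in for the database are all hypothetical, and the JPA annotations that a real entity would carry are omitted so that the snippet compiles without extra dependencies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical model of one VM type offered by a cloud provider.
// Field names and units are assumptions, not the tool's actual schema.
final class VmType {
    final String provider;
    final String name;
    final double memoryGb;
    final int cores;
    final double speed;

    VmType(String provider, String name, double memoryGb, int cores, double speed) {
        this.provider = provider;
        this.name = name;
        this.memoryGb = memoryGb;
        this.cores = cores;
        this.speed = speed;
    }
}

// In-memory stand-in for the relational database accessed through JPA.
final class VmCatalog {
    private final Map<String, VmType> byKey = new HashMap<>();

    // Stores a VM type, replacing any entry with the same provider and name.
    void save(VmType vm) {
        byKey.put(key(vm.provider, vm.name), vm);
    }

    // Looks up a VM type; empty if the provider does not offer it.
    Optional<VmType> find(String provider, String name) {
        return Optional.ofNullable(byKey.get(key(provider, name)));
    }

    private static String key(String provider, String name) {
        return provider + "/" + name;
    }
}
```

A backend could call `find("ExampleCloud", "large")` to retrieve the cores and memory of a candidate VM type before evaluating a configuration; the provider and VM names here are made up for the example.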
