Releases: linkedin/dr-elephant

New Spark Heuristics in Dr. Elephant which are supported by custom Spark History Server changes.

19 Apr 05:49

This release includes the commits below since v2.1.7. Note that this release depends on custom Spark History Server (SHS) changes made internally at LinkedIn; commits marked "Depends on Custom SHS" require those changes.

We are running this release with MapReduceFSFetcherHadoop2 for MapReduce jobs and SparkFetcher for Spark jobs.

7c1e88b making rest calls sequential
4ebd4b9 adding failedTasks value (#363)
638eb77 Removing blocking keyword so as to prevent a large number of threads being spawned (#362)
25c07bb Spark Heuristic Fixes for Dr. Elephant (#324)
019a9f4 Changing GC thresholds and calculation in spill heuristic (#319)
8e193a2 Fixed resources used/wasted computation for spark jobs - (Depends on Custom SHS - Requires peakJvmUsedMemory metric) (#287)
78dd699 Peak Unified Memory Heuristic - (Depends on Custom SHS - Requires peakUnifiedMemory metric) (#281)
7ca8706 Spark Peak jvm memory Heuristic - (Depends on Custom SHS - Requires peakJvmUsedMemory metric) (#318)
a40d251 Spark Stages with Failed tasks Heuristic - (Depends on Custom SHS - Requires stages/failedTasks Rest API) (#288)
6b4a3cf Spark Executor Spill Heuristic - (Depends on Custom SHS - Requires totalMemoryBytesSpilled metric) (#310)
50a7409 Removing blocking keyword (#361)
a0470a3 Dr. Elephant Tez Support working patch (#313)
d5a6897 added connection timeout for REST Calls. (#359)
fe7bfea changed async for LogClient (#354)
977623d Changed async to future/blocking and changed the error to warn (#353)
c89bafe Reducing timeout of spark fetcher from 60 to 5 seconds (#345)
fe076f7 Bug fix: Auto tuning disable model unit test failure (#343)
79bb59f Added support for multiple Azkaban Host URL (#342)
8ea2850 Bug fix: Delay computation of MR application (#340)
b2c24b8 Adding Auto Tuning Feature (#338)
c182c98 added a function to check if the script's required programs exist or exit the program with an indicative message (#326)
5500aad Revert "Peak JVM used memory heuristic - (Depends on Custom SHS - Requires peakJvmUsedMemory metric) (#283)" (#317)
6b2f7e8 Peak JVM used memory heuristic - (Depends on Custom SHS - Requires peakJvmUsedMemory metric) (#283)
7a27a3f Secondary Sort suggestion to reduce memory footprint at reducer (#316)
a208c31 Spark Configuration Threshold Heuristic (#286)
8c99625 Spark Executor GC Heuristic (#311)
a384fcc Added a Second Retry Queue - Useful while fetching Spark Metrics (#314)
8b46933 Add httpcore dependency to solve classpath issues (#308)
35d06d9 Dr. Elephant should check for finished directory before listing
e756226 TUNING Updating default MR fetcher for performance (#300)
9c8915c BUGFIX Updating java_args to elephant.conf for resolving argument conflicts (#299)
53fd50c BUGFIX Updating AnalyticJobGeneratorHadoop2.java to resolve the Job listing Conflict (#302)
83c1ef3 Fix MapReduceFSFetcherHadoop2 Fetcher filesystem to pick the configured URI (#292)
37ad77f BUGFIX: Fixes NullPointerException in AnalyticJobGeneratorHadoop2 (#294)

Linkedin Release

12 Sep 06:18

This release includes the commits below since v2.0.14. We are running this release with the FSFetcher for both MR and Spark.

12e02a6 HadoopSecurity should be a Singleton (#284)
68487ad Fix text box for ids composed of several words (#265)
757f0c2 DOC: Clarify the wording of output (#282)

Linkedin Release

12 Sep 04:58

This release includes the commits below since v2.0.13. We are running this release with the FSFetcher for both MR and Spark.

752a94b added logic for map reduce time-skew heuristic (#267)
7230038 Add filtering on Job Definition Id in the Search view (#269)
9a65e0e Add custom flowtime per scheduler (#268)
1d6f3f6 add s3, s3a, s3n bytes read and bytes written, and update heuristics to use them (#254)
54a16fd Add pinball scheduler to dr-elephant (#253)
f77886a Add index on severity, finish_time to speed up welcome page display (#250)
cdf680b MRfetcher ignores failed tasks (#249)
cae79c7 Refactor statusapiv1 to trait and implement for ease of creation of these objects when we implement our own parser (#248)
1ca2676 Enables SparkFetcher to only get eventLog via rest and process it locally. (#243)

Linkedin Release

11 May 15:42

This release includes the commits below since v2.0.9. We are running this release with the FSFetcher for both MR and Spark.

7c373d4 - Shekhar Gupta : Updates Spark configuration heuristic severity calculations (#229)
a1f866a - Anant Nag : Minor bug fixes in exception and UI (#238)
b7e04ab - shankar37 : Spark metrics aggregator fix (#237)
c8a7009 - shankar37 : Update #224 (credits: rayortigas) to add FSFetcher as a standalone fetcher (#232)
8e4a094 - Sergei Lebedev : Spark fetcher is now able to fetch event logs via REST API (#225)
5a98701 - Shekhar Gupta : Fixes MapReduce aggregator and heuristic to correctly handle task data when sampling is enabled (#222)
6b80614 - Akshay Rai : Include reference to the weekly meeting
965cba3 - stiga-huang : add config for timezone of job history server (#214)
f6274b1 - Shekhar Gupta : Fixes issue caused by http in history server config property (#217)
0d668ab - Shekhar Gupta : Adds an option to fetch recently finished apps from RM (#212)
da7983c - shankar37 : Fix Exception thrown when JAVA_EXTRA_OPTIONS is not present (#210)
d3c90d5 - shankar37 : Fix #162 with the right calculation for resourceswasted and add missing workflow links (#207)
dd7a458 - stiga-huang : Fix for null pointers in TaskList returned by MapReduceFSFetcherHadoop2 (#203)
e93d431 - Shekhar Gupta : Fixes Spark REST fetcher for client mode applications (#193)
2a84735 - Shekhar Gupta : Cleanes up MapReduceTaskData class by removing unnecessary constructors (#202)
4df9ba9 - Ragesh Rajagopalan : Added new heuristic DistributedCacheLimit heuristic. (#187)