Optimize Docker image layering #1677

Open
dwickern opened this issue Feb 1, 2025 · 2 comments


@dwickern
Collaborator

dwickern commented Feb 1, 2025

The default layering is not great for Play Framework applications. Here I create a fresh app and print the layers:

sbt new playframework/play-scala-seed.g8
// add to build.sbt
enablePlugins(DockerPlugin, LauncherJarPlugin)
[play-scala-seed] $ show dockerLayerMappings
[info] * LayeredMapping(Some(2),/Users/dwickern/code/play-scala-seed/target/scala-2.13/play-scala-seed_2.13-1.0-SNAPSHOT-sans-externalized.jar,/opt/docker/lib/com.example.play-scala-seed-1.0-SNAPSHOT-sans-externalized.jar)
[info] * LayeredMapping(Some(2),/Users/dwickern/Library/Caches/Coursier/v1/https/repo1.maven.org/maven2/org/scala-lang/scala-library/2.13.16/scala-library-2.13.16.jar,/opt/docker/lib/org.scala-lang.scala-library-2.13.16.jar)
[info] * LayeredMapping(Some(2),/Users/dwickern/Library/Caches/Coursier/v1/https/repo1.maven.org/maven2/org/playframework/twirl/twirl-api_2.13/2.0.7/twirl-api_2.13-2.0.7.jar,/opt/docker/lib/org.playframework.twirl.twirl-api_2.13-2.0.7.jar)
[info] * LayeredMapping(Some(2),/Users/dwickern/Library/Caches/Coursier/v1/https/repo1.maven.org/maven2/org/playframework/play-server_2.13/3.0.6/play-server_2.13-3.0.6.jar,/opt/docker/lib/org.playframework.play-server_2.13-3.0.6.jar)
[info] * LayeredMapping(Some(2),/Users/dwickern/Library/Caches/Coursier/v1/https/repo1.maven.org/maven2/org/playframework/play-logback_2.13/3.0.6/play-logback_2.13-3.0.6.jar,/opt/docker/lib/org.playframework.play-logback_2.13-3.0.6.jar)
[info] * LayeredMapping(Some(2),/Users/dwickern/Library/Caches/Coursier/v1/https/repo1.maven.org/maven2/org/playframework/play-pekko-http-server_2.13/3.0.6/play-pekko-http-server_2.13-3.0.6.jar,/opt/docker/lib/org.playframework.play-pekko-http-server_2.13-3.0.6.jar)
[info] * LayeredMapping(Some(2),/Users/dwickern/Library/Caches/Coursier/v1/https/repo1.maven.org/maven2/org/playframework/play-filters-helpers_2.13/3.0.6/play-filters-helpers_2.13-3.0.6.jar,/opt/docker/lib/org.playframework.play-filters-helpers_2.13-3.0.6.jar)
...
[info] * LayeredMapping(Some(2),/Users/dwickern/Library/Caches/Coursier/v1/https/repo1.maven.org/maven2/com/google/guava/listenablefuture/9999.0-empty-to-avoid-conflict-with-guava/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar,/opt/docker/lib/com.google.guava.listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar)
[info] * LayeredMapping(Some(2),/Users/dwickern/Library/Caches/Coursier/v1/https/repo1.maven.org/maven2/com/google/code/findbugs/jsr305/3.0.2/jsr305-3.0.2.jar,/opt/docker/lib/com.google.code.findbugs.jsr305-3.0.2.jar)
[info] * LayeredMapping(Some(2),/Users/dwickern/Library/Caches/Coursier/v1/https/repo1.maven.org/maven2/org/checkerframework/checker-qual/3.37.0/checker-qual-3.37.0.jar,/opt/docker/lib/org.checkerframework.checker-qual-3.37.0.jar)
[info] * LayeredMapping(Some(2),/Users/dwickern/Library/Caches/Coursier/v1/https/repo1.maven.org/maven2/com/google/j2objc/j2objc-annotations/2.8/j2objc-annotations-2.8.jar,/opt/docker/lib/com.google.j2objc.j2objc-annotations-2.8.jar)
[info] * LayeredMapping(Some(2),/Users/dwickern/Library/Caches/Coursier/v1/https/repo1.maven.org/maven2/org/apache/pekko/pekko-protobuf-v3_2.13/1.0.3/pekko-protobuf-v3_2.13-1.0.3.jar,/opt/docker/lib/org.apache.pekko.pekko-protobuf-v3_2.13-1.0.3.jar)
[info] * LayeredMapping(Some(2),/Users/dwickern/code/play-scala-seed/target/scala-2.13/play-scala-seed_2.13-1.0-SNAPSHOT-web-assets.jar,/opt/docker/lib/com.example.play-scala-seed-1.0-SNAPSHOT-assets.jar)
[info] * LayeredMapping(Some(2),/Users/dwickern/code/play-scala-seed/target/scala-2.13/com.example.play-scala-seed-1.0-SNAPSHOT-launcher.jar,/opt/docker/lib/com.example.play-scala-seed-1.0-SNAPSHOT-launcher.jar)
[info] * LayeredMapping(Some(4),/Users/dwickern/code/play-scala-seed/target/universal/scripts/bin/play-scala-seed,/opt/docker/bin/play-scala-seed)
[info] * LayeredMapping(Some(4),/Users/dwickern/code/play-scala-seed/target/universal/scripts/bin/play-scala-seed.bat,/opt/docker/bin/play-scala-seed.bat)
[info] * LayeredMapping(Some(1),/Users/dwickern/code/play-scala-seed/conf/logback.xml,/opt/docker/conf/logback.xml)
[info] * LayeredMapping(Some(1),/Users/dwickern/code/play-scala-seed/conf/messages,/opt/docker/conf/messages)
[info] * LayeredMapping(Some(1),/Users/dwickern/code/play-scala-seed/conf/application.conf,/opt/docker/conf/application.conf)
[info] * LayeredMapping(Some(1),/Users/dwickern/code/play-scala-seed/conf/routes,/opt/docker/conf/routes)
[info] * LayeredMapping(None,/Users/dwickern/code/play-scala-seed/target/scala-2.13/api,/opt/docker/share/doc/api/)
[info] * LayeredMapping(None,/Users/dwickern/code/play-scala-seed/target/scala-2.13/api/index.html,/opt/docker/share/doc/api//index.html)
[info] * LayeredMapping(None,/Users/dwickern/code/play-scala-seed/target/scala-2.13/api/index.js,/opt/docker/share/doc/api//index.js)
[info] * LayeredMapping(None,/Users/dwickern/code/play-scala-seed/target/scala-2.13/api/lib,/opt/docker/share/doc/api//lib)
[info] * LayeredMapping(None,/Users/dwickern/code/play-scala-seed/target/scala-2.13/api/lib/source-code-pro-v6-latin-regular.ttf,/opt/docker/share/doc/api//lib/source-code-pro-v6-latin-regular.ttf)
[info] * LayeredMapping(None,/Users/dwickern/code/play-scala-seed/target/scala-2.13/api/lib/annotation_comp.svg,/opt/docker/share/doc/api//lib/annotation_comp.svg)
[info] * LayeredMapping(None,/Users/dwickern/code/play-scala-seed/target/scala-2.13/api/lib/abstract_type.svg,/opt/docker/share/doc/api//lib/abstract_type.svg)
...
[info] * LayeredMapping(None,/Users/dwickern/code/play-scala-seed/target/scala-2.13/api/router/Routes.html,/opt/docker/share/doc/api//router/Routes.html)
[info] * LayeredMapping(None,/Users/dwickern/code/play-scala-seed/target/scala-2.13/api/router/index.html,/opt/docker/share/doc/api//router/index.html)
[info] * LayeredMapping(None,/Users/dwickern/code/play-scala-seed/target/scala-2.13/api/router/RoutesPrefix$.html,/opt/docker/share/doc/api//router/RoutesPrefix$.html)

The biggest issue is that application artifacts are put in the same layer as libraryDependencies. The library dependencies are 43 MB for this play-scala-seed app and much larger for any real application. Dependencies change less often than application code, so we should be able to cache them across builds.
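The raw listing is long, so a per-layer count makes the imbalance easier to see. The following is a minimal sketch of a helper task for build.sbt; `printLayerSummary` is a hypothetical name, not something sbt-native-packager provides, and the sketch only relies on the `layerId` field that the plugin code quoted later in this thread also uses.

```scala
// build.sbt (sketch): `printLayerSummary` is a hypothetical helper task,
// not part of sbt-native-packager.
lazy val printLayerSummary =
  taskKey[Unit]("Print the number of mapped files per Docker layer")

printLayerSummary := {
  val log = streams.value.log
  val mappings = (Docker / dockerLayerMappings).value
  mappings
    .groupBy(_.layerId)
    .toSeq
    // Same ordering idea as the plugin: None (unspecified) sorts last.
    .sortBy { case (layerId, _) => layerId.getOrElse(Int.MaxValue) }
    .foreach { case (layerId, entries) =>
      val label = layerId.map(_.toString).getOrElse("none")
      log.info(s"layer $label: ${entries.size} files")
    }
}
```

Running `printLayerSummary` in the sbt shell should then print one line per layer. It counts files rather than bytes, so it shows where files are grouped, not how large each layer is.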

Current layers

Here's how the layers are currently structured:

Layer 1

  • sbt-native-packager's generated conf/application.ini if using Universal / javaOptions
  • Play Framework's externalized resources (by default, the contents of conf/)

Layer 2

  • the project's transitive libraryDependencies
  • sbt-native-packager's launcher jar if using LauncherJarPlugin
  • Play Framework's -assets.jar and -sans-externalized.jar jars

Layer 3

  • sbt-native-packager jlink files if using JlinkPlugin

Layer 4

  • the project's transitive artifact jars
  • sbt-native-packager's shell scripts
  • sbt-native-packager's classpath jar if using ClasspathJarPlugin

Final layer (layerId = None)

  • sbt-native-packager's mappings of Docker / sourceDirectory
  • Play Framework's API docs if using includeDocumentationInBinary := true (default true)

Proposed layers

  • At minimum, the project artifacts should move out of layer 2.
  • Config files shouldn't be the bottom layer. They are only a few KB, so there is little to gain from caching them.
  • Should jlink files be layered below libraryDependencies? I would guess they change less frequently, but I don't use jlink 🤷

@muuki88
Contributor

muuki88 commented Feb 3, 2025

Hi @dwickern

That's indeed very wasteful. I remember vaguely that this was solved at one point, but apparently not 😢

I assume this can be fixed by re-ordering the dockerCommands?

dockerCommands := {
  val conv0 = fileConverter.value
  implicit val conv: FileConverter = conv0
  val strategy = dockerPermissionStrategy.value
  val dockerBaseDirectory = (Docker / defaultLinuxInstallLocation).value
  val user = (Docker / daemonUser).value
  val uidOpt = (Docker / daemonUserUid).value
  val group = (Docker / daemonGroup).value
  val gidOpt = (Docker / daemonGroupGid).value
  val base = dockerBaseImage.value
  val addPerms = dockerAdditionalPermissions.value
  val multiStageId = UUID.randomUUID().toString
  val generalCommands = makeFromAs(base, "mainstage") +: makeMaintainer((Docker / maintainer).value).toSeq
  val stage0name = "stage0"
  val layerMappings = (Docker / dockerLayerMappings).value
  val layerIdsAscending = layerMappings.map(_.layerId).distinct.sortWith { (a, b) =>
    // Make the None (unspecified) layer the last layer
    a.getOrElse(Int.MaxValue) < b.getOrElse(Int.MaxValue)
  }
  val stage0: Seq[CmdLike] = strategy match {
    case DockerPermissionStrategy.MultiStage =>
      Seq(
        makeFromAs(base, stage0name),
        makeLabel("snp-multi-stage" -> "intermediate"),
        makeLabel("snp-multi-stage-id" -> multiStageId),
        makeWorkdir(dockerBaseDirectory)
      ) ++
        layerIdsAscending.map(l => makeCopyLayerIntermediate(l, dockerBaseDirectory)) ++
        Seq(makeUser("root")) ++ layerIdsAscending.map(l =>
          makeChmodRecursive(dockerChmodType.value, Seq(pathInLayer(dockerBaseDirectory, l)))
        ) ++ {
          val layerToPath = (Docker / dockerGroupLayers).value
          addPerms map { case (tpe, v) =>
            // Try and find the source file for the path from the mappings
            val layerId = layerMappings
              .find(_.path == v)
              .map(_.layerId)
              .getOrElse {
                // We couldn't find a source file for the mapping, so try with a dummy source file,
                // in case there is an explicitly configured path based layer mapping, eg for a directory.
                layerToPath.lift((PluginCompat.toFileRef(new File("/dev/null")), v))
              }
            makeChmod(tpe, Seq(pathInLayer(v, layerId)))
          }
        } ++
        Seq(DockerStageBreak)
    case _ => Seq()
  }
  val stage1: Seq[CmdLike] = generalCommands ++
    (uidOpt match {
      case Some(_) => Seq(makeUser("root"), makeUserAdd(user, group, uidOpt, gidOpt))
      case _ => Seq()
    }) ++
    Seq(makeWorkdir(dockerBaseDirectory)) ++ {
      (strategy match {
        case DockerPermissionStrategy.MultiStage =>
          layerIdsAscending.map { layerId =>
            makeCopyFrom(pathInLayer(dockerBaseDirectory, layerId), dockerBaseDirectory, stage0name, user, group)
          }
        case DockerPermissionStrategy.Run =>
          layerIdsAscending.map(layerId => makeCopyLayerDirect(layerId, dockerBaseDirectory)) ++
            Seq(makeChmodRecursive(dockerChmodType.value, Seq(dockerBaseDirectory))) ++
            (addPerms map { case (tpe, v) => makeChmod(tpe, Seq(v)) })
        case DockerPermissionStrategy.CopyChown =>
          layerIdsAscending.map(layerId => makeCopyChown(layerId, dockerBaseDirectory, user, group))
        case DockerPermissionStrategy.None =>
          layerIdsAscending.map(layerId => makeCopyLayerDirect(layerId, dockerBaseDirectory))
      })
    } ++
    dockerLabels.value.map(makeLabel) ++
    dockerEnvVars.value.map(makeEnvVar) ++
    makeExposePorts(dockerExposedPorts.value, dockerExposedUdpPorts.value) ++
    makeVolumes(dockerExposedVolumes.value, user, group) ++
    Seq(uidOpt match {
      case Some(uid) => makeUser(uid, gidOpt)
      case _ => makeUser(user)
    }) ++
    // Use this to debug permissions
    // Seq(ExecCmd("RUN", Seq("ls", "-l", "/opt/docker/bin/"): _*)) ++
    Seq(makeEntrypoint(dockerEntrypoint.value), makeCmd(dockerCmd.value))
  stage0 ++ stage1
}

@dwickern
Collaborator Author

dwickern commented Feb 4, 2025

It's actually dockerGroupLayers. It will be easy to fix... more work to update the tests 😅

dockerGroupLayers := {
  val conv0 = fileConverter.value
  implicit val conv: FileConverter = conv0
  val dockerBaseDirectory = (Docker / defaultLinuxInstallLocation).value
  // Ensure this doesn't break even if the JvmPlugin isn't enabled.
  var artifacts = projectDependencyArtifacts.?.value.getOrElse(Nil).map(_.data).toSet
  // add the classpath jar to the project artifacts to improve layer caching as it is created by
  // the ClasspathJarPlugin and is not part of the projectDependencyArtifacts
  ClasspathJarPlugin.autoImport.packageJavaClasspathJar.?.value match {
    case Some(p) => artifacts += p
    case _ =>
  }
  val oldFunction = dockerLayerGrouping.value
  // By default we set this to a function that always returns None.
  val oldPartialFunction = Function.unlift((tuple: (PluginCompat.FileRef, String)) => oldFunction(tuple._2))
  val libDir = dockerBaseDirectory + "/lib/"
  val binDir = dockerBaseDirectory + "/bin/"
  val jreDir = dockerBaseDirectory + "/jre/"
  val confDir = dockerBaseDirectory + "/conf/"
  oldPartialFunction.orElse {
    // bin directory contains start scripts which are containing artifacts / classpath jar,
    // so should be together with actual artifacts
    case (file, path) if artifacts(file) || path.startsWith(binDir) => 4
    case (_, path) if path.startsWith(jreDir) => 3
    case (_, path) if path.startsWith(libDir) => 2
    case (_, path) if path.startsWith(confDir) => 1
  }
},

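#1425 made some improvements before. The layering is pretty good for most applications. The issue is that Play Framework outputs additional jars besides the default packageBin. Judging by the listing above, those jars are not matched by the artifacts check, so only the lib/ rule applies and they land in layer 2.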
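To make the failure mode concrete, here is a simplified stand-alone model of the path rules quoted above. It is a sketch for illustration only: `LayerModel` and `layerOf` are hypothetical names, and the `isProjectArtifact` flag stands in for the plugin's `artifacts(file)` check.

```scala
object LayerModel {
  private val base = "/opt/docker"

  // Simplified stand-in for the rules in dockerGroupLayers.
  // `isProjectArtifact` plays the role of the `artifacts(file)` check.
  def layerOf(path: String, isProjectArtifact: Boolean): Option[Int] =
    if (isProjectArtifact || path.startsWith(s"$base/bin/")) Some(4)
    else if (path.startsWith(s"$base/jre/")) Some(3)
    else if (path.startsWith(s"$base/lib/")) Some(2)
    else if (path.startsWith(s"$base/conf/")) Some(1)
    else None

  def main(args: Array[String]): Unit = {
    val playJar = s"$base/lib/com.example.play-scala-seed-1.0-SNAPSHOT-sans-externalized.jar"
    val cases = Seq(
      // A third-party dependency: layer 2 is the intended place for it.
      (s"$base/lib/org.scala-lang.scala-library-2.13.16.jar", false),
      // Play's extra jar when it is not recognized as a project artifact.
      (playJar, false),
      // The same jar if it were recognized as a project artifact.
      (playJar, true),
      (s"$base/bin/play-scala-seed", false),
      (s"$base/conf/application.conf", false)
    )
    cases.foreach { case (path, isArtifact) =>
      println(s"${layerOf(path, isArtifact)} (artifact=$isArtifact) $path")
    }
  }
}
```

In this model the Play jar only reaches layer 4 when it is flagged as a project artifact, which suggests the fix belongs in how the artifact set is built rather than in the path rules.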
