The CernVM-FS Shrinkwrap utility provides a means of exporting CVMFS repositories. These exports may consist of the complete repository or contain a curated subset of the repository.
The CernVM-FS shrinkwrap utility uses libcvmfs
to export repositories
to a POSIX file tree. This file tree can then be packaged and exported in
several ways, such as SquashFS, Docker layers, or TAR file.
The cvmfs_shrinkwrap
utility supports multithreaded copying to increase
throughput and a file specification to create a subset of a repository.
The cvmfs_shrinkwrap
utility is packaged for Red Hat based and Debian based
platforms in the cvmfs-shrinkwrap
package.
In order to compile cvmfs_shrinkwrap
from sources, use the
-DBUILD_SHRINKWRAP=on
cmake option.
The structure used in the Shrinkwrap output mirrors that used internally
by CernVM-FS. The visible files are hard linked to a hidden data directory.
By default, cvmfs_shrinkwrap
builds in a base directory (/tmp/cvmfs
)
where a directory exists for each repository and a .data
directory
containing the content-addressed files for deduplication.
The shrinkwrap output directory should be formatted with XFS. The ext file systems limit the number of hard links to 64k.
File Path | Description |
---|---|
/tmp/cvmfs |
Default base directory Single mount point that can be used to package repositories, containing both the directory tree and the data directory. |
<base>/<fqrn> |
Repository file tree Directory containing the visible structure and file names for a repository. |
<base>/.data |
File storage location for repositories Content-addressed files in a hidden directory. |
<base>/.provenance |
Storage location for provenance
Hidden directory that stores the provenance
information, including libcvmfs
configurations and specification files. |
The specification file allows for both positive entries and exclusion statements. Inclusion can be specified directly for each file, can use wildcards for directories trees, and an anchor to limit to only the specified directory. Directly specify file :
/lcg/releases/gcc/7.1.0/x86_64-centos7/setup.sh
Specify directory tree :
/lcg/releases/ROOT/6.10.04-4c60e/x86_64-cenots7-gcc7-opt/*
Specify only directory contents :
^/lcg/releases/*
Negative entries will be left out of the traversal :
!/lcg/releases/uuid
Start out with either building cvmfs_shrinkwrap
, adding it to your path,
or locating it in your working directory.
Create a file specification to limit the files subject to being shrinkwrapped.
Here is an example for ROOT version 6.10 (~8.3 GB). For our example put this in
a file named sft.cern.ch.spec
.
/lcg/releases/ROOT/6.10.04-4c60e/x86_64-centos7-gcc7-opt/* /lcg/contrib/binutils/2.28/x86_64-centos7/lib/* /lcg/contrib/gcc/* /lcg/releases/gcc/* /lcg/releases/lcgenv/*
Write the libcvmfs
configuration file that will be used for cvmfs_shrinkwrap
.
Warning
cvmfs_shrinkwrap
puts heavy load on servers. DO NOT configure it to read
from production Stratum 1s!
To use cvmfs_shrinkwrap
at CERN please use http://cvmfs-stratum-zero-hpc.cern.ch
,
and for OSG please use http://cvmfs-s1goc.opensciencegrid.org:8001
.
Here is an example that uses the CERN server, written to sft.cern.ch.config
.
CVMFS_REPOSITORIES=sft.cern.ch CVMFS_REPOSITORY_NAME=sft.cern.ch CVMFS_CONFIG_REPOSITORY=cvmfs-config.cern.ch CVMFS_SERVER_URL='http://cvmfs-stratum-zero-hpc.cern.ch/cvmfs/sft.cern.ch' CVMFS_HTTP_PROXY=DIRECT # Avoid filling up any local squid's cache CVMFS_CACHE_BASE=/var/lib/cvmfs/shrinkwrap CVMFS_KEYS_DIR=/etc/cvmfs/keys/cern.ch # Need to be provided for shrinkwrap CVMFS_SHARED_CACHE=no # Important as libcvmfs does not support shared caches CVMFS_USER=cvmfs
Note
Keys will need to be provided. The location in this configuration is the default used for CVMFS with FUSE.
Using the cvmfs repository sft.cern.ch
:
sudo cvmfs_shrinkwrap -r sft.cern.ch -f sft.cern.ch.config -t sft.cern.ch.spec --dest-base /tmp/cvmfs -j 16
Start by using the above setup.
Alternatively, shrinkwrap images can be created in user space. This is achieved using
the UID and GID mapping feature of libcvmfs
. First mapping files need to be written.
Example (Assuming UID 1000). Write * 1000
into uid.map
at /tmp/cvmfs
.
Add this rule sft.cern.ch.config
. :
CVMFS_UID_MAP=/tmp/cvmfs/uid.map
The same is done with GID into gid.map
.
Using the cvmfs repository sft.cern.ch
:
cvmfs_shrinkwrap -r sft.cern.ch -f sft.cern.ch.config -t sft.cern.ch.spec --dest-base /tmp/cvmfs -j 16
CernVM-FS variant symlinks that are used in the organization of repositories are evaluated at the time of image creation. As such, the OS the image is created on should be the expected OS the image will be used with. Specification rules can be written to include other OS compatible version, but symlinks will resolve to the original OS.
Shrinkwrap was developed to address similar restrictions as the CVMFS Preloader. Having created an image from your specification there are a number of ways this can be used and moved around.
Having a fully loaded repository, including the hard linked data, the image can be exported to a number of different formats and packages. Some examples of this could be ZIP, tarballs, or squashfs. The recommendation is to use squashfs as it provides a great amount of portability and is supported for directly mounting on most OS.
If tools for creating squashfs are not already available try :
apt-get install squashfs-tools
-- or --
yum install squashfs-tools
After this has been installed a squashfs image can be created using the above image :
mksquashfs /tmp/cvmfs root-sft-image.sqsh
This process may take time to create depending on the size of the shrinkwrapped image. The squashfs image can now be moved around and mounted using :
mount -t squashfs /PATH/TO/IMAGE/root-sft-image.sqsh /cvmfs
The shrinkwrap image can also be directly moved and mounted using bind mounts.
mount --bind /tmp/cvmfs /cvmfs
This provides a quick method for testing created images and verifying the contents will run your expected workload.
Shrinkwrap images mirror the data organization of CVMFS. As such it is important that the data and the file system tree be co-located in the file system/mountpoint. If the data is separated from the file system tree you are likely to encounter an error.