
Commit

Merge branch 'main' into book-intro
mih committed Nov 9, 2023
2 parents bd84718 + cd0a100 commit b0cd830
Showing 15 changed files with 35 additions and 31 deletions.
1 change: 1 addition & 0 deletions docs/basics/101-101-create.rst
@@ -31,6 +31,7 @@ Note the command structure of :dlcmd:`create` (optional bits are enclosed in ``[
datalad create [--description "..."] [-c <config options>] PATH
.. _createdescription:
.. index::
pair: set description for dataset location; with DataLad
.. find-out-more:: What is the description option of 'datalad create'?
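
For orientation, a concrete instantiation of that command structure could look like the following sketch; the description text, configuration, and dataset name are made up for illustration:

  $ datalad create --description "course dataset on my laptop" -c text2git DataLad-101
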
17 changes: 9 additions & 8 deletions docs/basics/101-115-symlinks.rst
@@ -21,7 +21,7 @@ We'll take a look together, using the ``books/`` directory as an example:
.. index::
pair: no symlinks; on Windows
pair: tree; terminal command
.. windows-wit:: This will look different to you
.. windows-wit:: Dataset directories look different on Windows

.. include:: topic/tree-symlinks.rst

@@ -87,7 +87,7 @@ tree is also known as the *annex* of a dataset.
.. index::
pair: elevated storage demand; in adjusted mode
pair: no symlinks; on Windows
.. windows-wit:: What happens on Windows?
.. windows-wit:: File content management on Windows
:name: woa_objecttree
:float:

@@ -139,15 +139,15 @@ This comes with two very important advantages:

One, should you have copies of the
same data in different places of your dataset, the symlinks of these files
point to the same place (in order to understand why this is the case, you
will need to read the hidden section at the end of the page).
point to the same place - in order to understand why this is the case, you
will need to read the :find-out-more:`about the object tree <fom-objecttree>`.
Therefore, any amount of copies of a piece of data
is only one single piece of data in your object tree. This, depending on
how much identical file content lies in different parts of your dataset,
can save you much disk space and time.

The second advantage is less intuitive but clear for users familiar with Git.
Small symlinks can be written very very fast when switching :term:`branch`\es, as opposed to copying and deleting huge data files.
Compared to copying and deleting huge data files, small symlinks can be written very very fast, for example, when switching dataset versions, or :term:`branch`\es.

.. gitusernote:: Speedy branch switches

@@ -168,9 +168,10 @@ work with the paths in the object tree than you or any other human are.
Lastly, understanding that annexed files in your dataset are symlinked
will be helpful to understand how common file system operations such as
moving, renaming, or copying content translate to dataset modifications
in certain situations. Later in this book we will have a section on how
to manage the file system in a DataLad dataset (:ref:`file system`).
in certain situations. Later in this book, the section :ref:`file system`
will take a closer look at that.

.. _objecttree:
.. index::
pair: key; git-annex concept
.. find-out-more:: Paths, checksums, object trees, and data integrity
Expand Down Expand Up @@ -227,7 +228,7 @@ to manage the file system in a DataLad dataset (:ref:`file system`).
consisting of two letters each.
These two letters are derived from the md5sum of the key, and their sole purpose is to avoid issues with too many files in one directory (a situation that certain file systems have problems with).
The next subdirectory in the symlink helps to prevent accidental deletions and changes, as it does not have write :term:`permissions`, so that users cannot modify any of its underlying contents.
This is the reason that annexed files need to be unlocked prior to modifications, and this information will be helpful to understand some file system management operations such as removing files or datasets (see section :ref:`file system`).
This is the reason that annexed files need to be unlocked prior to modifications, and this information will be helpful to understand some file system management operations such as removing files or datasets. Section :ref:`file system` takes a look at that.
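
As a quick sketch of that unlock-edit-save cycle (the file name is a placeholder):

  $ datalad unlock books/example.pdf
  $ # ... modify the file ...
  $ datalad save -m "update example book"
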

The next part of the symlink contains the actual hash.
There are different hash functions available.
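
To make the symlink structure described above more tangible, resolving an annexed file's link target could produce something along these lines; the file name, key, size, and hash directories are invented:

  $ readlink books/example.pdf
  ../.git/annex/objects/k1/g0/MD5E-s12345--d41d8cd98f00b204e9800998ecf8427e.pdf/MD5E-s12345--d41d8cd98f00b204e9800998ecf8427e.pdf
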
10 changes: 5 additions & 5 deletions docs/basics/101-116-sharelocal.rst
@@ -33,7 +33,8 @@ DataLad for, if everyone can already access everything?" However,
universal, unrestricted access can easily lead to chaos. DataLad can
help facilitate collaboration without requiring ultimate trust and
reliability of all participants. Essentially, with a shared dataset,
collaborators can look and use your dataset without ever touching it.
collaborators can see and use your dataset without any danger
of undesired or uncontrolled modification.

To demonstrate how to share a DataLad dataset on a common file system,
we will pretend that your personal computer
@@ -48,8 +49,8 @@ But as we cannot easily simulate a second user in this book,
for now, you will have to share your dataset with yourself.
This endeavor serves several purposes: For one, you will experience a very easy
way of sharing a dataset. Secondly, it will show you
how a dataset can be obtained from a path (instead of a URL as shown in the section
:ref:`installds`). Thirdly, ``DataLad-101`` is a dataset that can
how a dataset can be obtained from a path, instead of a URL as shown in section
:ref:`installds`. Thirdly, ``DataLad-101`` is a dataset that can
showcase many different properties of a dataset already, but it will
be an additional learning experience to see how the different parts
of the dataset -- text files, larger files, subdatasets,
@@ -194,8 +195,7 @@ and hostname of your computer. "This", you exclaim, excited about your own reali
pair: set description for dataset location; with DataLad
.. find-out-more:: What is this location, and what if I provided a description?

Back in the very first section of the Basics, :ref:`createDS`, a hidden
section mentioned the ``--description`` option of :dlcmd:`create`.
Back in the very first section of the Basics, :ref:`createDS`, a :ref:`Find-out-more mentioned the '--description' option <createdescription>` of :dlcmd:`create`.
With this option, you can provide a description about the dataset *location*.

The :gitannexcmd:`whereis` command, finally, is where such a description
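
In command form, obtaining a dataset from a path rather than a URL, and later checking where file content is available, could look roughly like this sketch; all paths, the description, and the file name are placeholders:

  $ datalad clone --description "mock user's copy" /home/me/DataLad-101 mock_user/DataLad-101
  $ git annex whereis books/example.pdf
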
2 changes: 1 addition & 1 deletion docs/basics/101-123-config2.rst
@@ -34,7 +34,7 @@ This looks neither spectacular nor pretty. Also, it does not follow the ``sectio
organization of the ``.git/config`` file anymore. Instead, there are three lines,
and all of these seem to have something to do with the configuration of git-annex.
There even is one key word that you recognize: MD5E.
If you have read the hidden section in :ref:`symlink`
If you have read the :ref:`Find-out-more on object trees <objecttree>`
you will recognize it as a reference to the type of
key used by git-annex to identify and store file content in the object-tree.
The first row, ``* annex.backend=MD5E``, therefore translates to "Everything in this
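
To check which backend applies to a particular file, the attribute can be queried directly with Git; the file name here is a placeholder:

  $ git check-attr annex.backend -- notes.txt
  notes.txt: annex.backend: MD5E
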
1 change: 1 addition & 0 deletions docs/basics/101-130-yodaproject.rst
@@ -14,6 +14,7 @@ the `Python <https://www.python.org>`__ programming language, you decide
to script your analysis in Python. Delighted, you find out that there is even
a Python API for DataLad's functionality that you can read about in :ref:`a Findoutmore on DataLad in Python<fom-pythonapi>`.

.. _pythonapi:
.. index::
pair: use DataLad API; with Python
.. find-out-more:: DataLad's Python API
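
As a minimal sketch of the Python API mentioned above, the same functionality is importable from ``datalad.api``; the example assumes it is run from inside an existing dataset, and the commit message is made up:

  $ python -c 'import datalad.api as dl; dl.save(message="add analysis results")'
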
4 changes: 2 additions & 2 deletions docs/basics/101-132-advancednesting.rst
@@ -123,7 +123,7 @@ interested in this, checkout the :ref:`dedicated Findoutmore <fom-status>`.
Note that both of these commands return only the ``untracked`` file and not
the ``modified`` subdataset because we're explicitly querying only the
subdataset for its status.
If you however, as done outside of this hidden section, you want to know about
If, however, as done outside of this Find-out-more, you want to know about
the subdataset record in the superdataset without causing a status query for
the state *within* the subdataset itself, you can also provide an explicit
path to the dataset (without a trailing path separator). This can be used
@@ -171,7 +171,7 @@ interested in this, checkout the :ref:`dedicated Findoutmore <fom-status>`.
option (especially powerful when combined with ``-f json_pp``). To get a complete overview on what you could do, check out the technical
documentation of :dlcmd:`status` `here <https://docs.datalad.org/en/latest/generated/man/datalad-status.html>`_.

Before we leave this hidden section, lets undo the modification of the subdataset
Before we leave this Find-out-more, let's undo the modification of the subdataset
by removing the untracked file:

.. runrecord:: _examples/DL-101-132-109
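
To illustrate the difference described above with a hypothetical subdataset name, querying the subdataset record from the superdataset versus querying the subdataset's own content could look like this:

  $ datalad status midterm_project        # subdataset record, as seen from the superdataset
  $ datalad status -d midterm_project     # state of the contents within the subdataset
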
4 changes: 2 additions & 2 deletions docs/basics/101-133-containersrun.rst
@@ -143,12 +143,12 @@ For this, we will pull an image from Singularity hub. This image was made
for the online-handbook, and it contains the relevant Python setup for
the analysis. Its recipe lives in the online-handbook's
`resources repository <https://github.com/datalad-handbook/resources>`_.
If you're curious how to create a Singularity image, the hidden
section below has some pointers:
If you're curious how to create a Singularity image, the :find-out-more:`on this topic <fom-container-creation>` has some pointers:

.. index::
pair: build container image; with Singularity
.. find-out-more:: How to make a Singularity image
:name: fom-container-creation

Singularity containers are built from text files, often
called "recipes", that hold a "definition" of the software container and its
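
For a rough idea of the commands involved in obtaining such an image, building from a local recipe or pulling a pre-built image could look like this sketch; file and registry names are placeholders:

  $ sudo singularity build my-analysis.sif my-recipe.def
  $ singularity pull docker://python:3.9
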
8 changes: 4 additions & 4 deletions docs/basics/101-136-filesystem.rst
@@ -110,12 +110,12 @@ save a change that is marked as a deletion in a
datalad save -m "rename file" oldname newname
Alternatively, there is also a way to save the name change
only using Git tools only, outlined in the following hidden
section. If you are a Git user, you will be very familiar with it.
using Git tools only, outlined in the :find-out-more:`on faster renaming <fom-gitmv>`. If you are a Git user, you will be very familiar with it.

.. index::
pair: rename file; with Git
.. find-out-more:: Faster renaming with Git tools
:name: fom-gitmv

Git has built-in commands that provide a solution in two steps.

@@ -757,12 +757,12 @@ use.
Beware of one thing though: If your dataset either is a sibling
or has a sibling with the source being a path, moving or renaming
the dataset will break the linkage between the datasets. This can
be fixed easily though. We can try this in the following hidden
section.
be fixed easily though. We can try this in the :find-out-more:`on adjusting sibling URLs <fom-adjust-sibling-urls>`.

.. index::
pair: move subdataset; with Git
.. find-out-more:: If a renamed/moved dataset is a sibling...
:name: fom-adjust-sibling-urls

As section :ref:`config` explains, each
sibling is registered in ``.git/config`` in a "submodule" section.
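
As a sketch of the kind of fix this involves, pointing a stale submodule entry in ``.git/config`` at the dataset's new location could look like this; the submodule name and path are placeholders:

  $ git config submodule.mysubds.url /new/path/to/mysubds
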
2 changes: 1 addition & 1 deletion docs/basics/101-139-gitlfs.rst
@@ -56,7 +56,7 @@ Alternatively, to make publication even easier for you, the dataset provider, yo
# afterwards, only datalad push is needed to publish dataset contents and history
$ datalad push --to github
Consumers of your dataset should be able to retrieve files right after cloning the dataset without a ``siblings enable`` command (as shown in the section :ref:`dropbox`), because of the ``autoenable=true`` configuration for the special remote.
Consumers of your dataset should be able to retrieve files right after cloning the dataset without a ``siblings enable`` command, as shown in section :ref:`dropbox`, because of the ``autoenable=true`` configuration for the special remote.
.. index::
pair: drop (LFS); with DataLad
2 changes: 1 addition & 1 deletion docs/basics/topic/adjustedmode-nosymlinks.rst
@@ -4,7 +4,7 @@ While git-annex on Unix-based file operating systems stores data in the annex an

**Why is that?**
Data *needs* to be in the annex for version control and transport logistics -- the annex is able to store all previous versions of the data, and manage the transport to other storage locations if you want to publish your dataset.
But as the :ref:`Findoutmore in this section <fom-objecttree>` will show, the :term:`annex` is a non-human readable tree structure, and data thus also needs to exist in its original location.
But as the :ref:`Findoutmore in this section <fom-objecttree>` shows, the :term:`annex` is a non-human readable tree structure, and data thus also needs to exist in its original location.
Thus, it exists in both places: it has been moved into the annex and copied back into its original location.
Once you edit an annexed file, the most recent version of the file is available in its original location, and past versions are stored and readily available in the annex.
If you reset your dataset to a previous state (as is shown in the section :ref:`history`), the respective version of your data is taken from the annex and copied to replace the newer version, and vice versa.
3 changes: 2 additions & 1 deletion docs/beyond_basics/101-146-providers.rst
@@ -62,7 +62,7 @@ dataset -- lacks a configuration for data access about this server::

However, data access can be configured by
the user if the required authentication and credential type are supported by
DataLad (a list is given in the hidden section below).
DataLad - a list is given in the :find-out-more:`on authentication <fom-provider-auth>`.
With a data access configuration in place, commands such as
:dlcmd:`download-url` or :dlcmd:`addurls` can work with URLs
that point to the location of the data to be retrieved, and
@@ -82,6 +82,7 @@ The following information is needed:
The example below sheds some light on this.

.. find-out-more:: Which authentication and credential types are possible?
:name: fom-provider-auth

When configuring custom data access, credential and authentication type
are required information. Below, we list the most common choices for these fields.
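
Once such a provider configuration is in place, retrieving a file by URL could look roughly like this; the URL and target path are placeholders:

  $ datalad download-url -d . -O data/raw/file.dat https://example.com/protected/file.dat
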
2 changes: 1 addition & 1 deletion docs/beyond_basics/101-147-riastores.rst
@@ -708,7 +708,7 @@ procedures.
`the docs <https://git-annex.branchable.com/internals/hashing>`_.
.. [#f3] To re-read about how git-annex's object tree works, check out section
:ref:`symlink`, and pay close attention to the hidden section.
:ref:`symlink`, and pay close attention to the :ref:`Find-out-more on the object tree <objecttree>`.
Additionally, you can find a lot of background information in git-annex's
`documentation <https://git-annex.branchable.com/internals>`_.
2 changes: 1 addition & 1 deletion docs/beyond_basics/101-161-biganalyses.rst
@@ -138,7 +138,7 @@ in size even if they are each small in size.
.. [#f1] FEAT is a software tool for model-based fMRI data analysis and part of
`FSL <https://fsl.fmrib.ox.ac.uk>`_.
.. [#f2] Read more about DataLad's Python API in the first hidden section in
.. [#f2] Read more about DataLad's Python API in the :ref:`Find-out-more on it <pythonapi>` in
:ref:`yoda_project`.
.. [#f3] Read up on these configurations in the chapter :ref:`chapter_config`.
4 changes: 2 additions & 2 deletions docs/beyond_basics/101-179-gitignore.rst
@@ -67,8 +67,7 @@ or create your own one.
To specify dataset content to be git-ignored, you can either write
a full file name, e.g. ``playlists/my-little-pony-themesongs/Friendship-is-magic.mp3``
into this file, or paths or patterns that make use of globbing, such as
``playlists/my-little-pony-themesongs/*``. The hidden section at the end of this
page contains some general rules for patterns in ``.gitignore`` files. Afterwards,
``playlists/my-little-pony-themesongs/*``. The :find-out-more:`on general rules for patterns in .gitignore files <fom-gitignore>` contains a helpful overview. Afterwards,
you just need to save the file once to your dataset so that it is version controlled.
If you have new content you do not want to track, you can add
new paths or patterns to the file, and save these modifications.
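
As a quick sketch of that workflow, appending a pattern and saving the ``.gitignore`` file could look like this; the pattern is just an example:

  $ echo "logs/" >> .gitignore
  $ datalad save -m "ignore the logs/ directory" .gitignore
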
@@ -120,6 +119,7 @@ ignored! Therefore, a ``.gitignore`` file can give you a space inside of
your dataset to be messy, if you want to be.

.. find-out-more:: Rules for .gitignore files
:name: fom-gitignore

Here are some general rules for the patterns you can put into a ``.gitignore``
file, taken from the book `Pro Git <https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository#_ignoring>`_ :
4 changes: 2 additions & 2 deletions docs/usecases/HCP_dataset.rst
@@ -166,10 +166,10 @@ which it has been aggregated are small in size, and yet provide access to the HC
data for anyone who has valid AWS S3 credentials.

At the end of this step, there is one nested dataset per subject in the HCP data
release. If you are interested in the details of this process, checkout the
hidden section below.
release. If you are interested in the details of this process, check out the :find-out-more:`on the datasets' generation <fom-hcp>`.

.. find-out-more:: How exactly did the datasets come to be?
:name: fom-hcp

All code and tables necessary to generate the HCP datasets can be found on
GitHub at `github.com/TobiasKadelka/build_hcp <https://github.com/TobiasKadelka/build_hcp>`_.
