Double whitespace breaks logging #10970

@maxgriehl-web

Bug Report

dvc repro: fails to store the correct command in dvc.lock

Description

Some double spaces in dvc.yaml commands are not preserved in the copy of the command stored in dvc.lock.
This happens when a command is long enough to be wrapped onto several lines and the added line break falls exactly on a double space.

As a result, dvc status treats the stage as dirty, even directly after committing or running it.
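
The effect can be modeled in a few lines of shell. This is only a sketch of YAML's line-folding rule for plain multi-line scalars (each line is trimmed, and a line break is read back as a single space), not DVC's actual code. The helper name fold_plain and the shortened command are made up for illustration, and it is an assumption that this folding is what happens when dvc.lock is read back.

```shell
# Simplified model of YAML plain-scalar line folding:
# trim every line, then join the lines with a single space.
# (Blank lines, which YAML folds into newlines, are not handled.)
fold_plain() {
  sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' | paste -s -d ' ' -
}

# Shortened, hypothetical version of the command from dvc.yaml.
original='echo "lots"  "of"  "words"'

# Hypothetical wrapped form: the line break replaces the double space
# in front of "words", as described for dvc.lock.
wrapped='echo "lots"  "of"
      "words"'

folded=$(printf '%s\n' "$wrapped" | fold_plain)

printf 'original: %s\n' "$original"
printf 'folded:   %s\n' "$folded"

# The folded command has only one space before "words", so the strings differ.
if [ "$original" = "$folded" ]; then
  echo "same command"
else
  echo "changed command"
fi
```

In this model the double space that was wrapped comes back as a single space, while the double space that stayed on one line is untouched, which would match the observation that only some double spaces are lost.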

Reproduce

# 1. Create a new directory and initialize DVC
mkdir my-dir && cd my-dir
git init
dvc init

# 2. Create a dvc.yaml with lots of double spaces in the command
cat > dvc.yaml << 'EOF'
stages:
  test_stage:
    cmd: echo "lots"  "of"  "words"  "(too much"  "for"  "one"  "line)"  "with"  "lots"  "of"  "double-whitespace"  "between"  "the"  "parts" > output.txt
    outs:
      - output.txt
EOF

# 3. Run the stage to generate lock file
# (inspect dvc.lock afterwards: one of the double spaces has become a line break)
dvc repro

# 4. Check status - should be clean but shows "changed command"
dvc status

Expected

dvc status should regard everything as up-to-date.

Environment information

Output of dvc doctor:

$ dvc doctor

DVC version: 3.66.1 (pip)
-------------------------
Platform: Python 3.12.3 on Linux-6.14.0-119037-tuxedo-x86_64-with-glibc2.39
Subprojects:
        dvc_data = 3.18.2
        dvc_objects = 5.2.0
        dvc_render = 1.0.2
        dvc_task = 0.40.2
        scmrepo = 3.6.1
Supports:
        http (aiohttp = 3.13.3, aiohttp-retry = 2.9.1),
        https (aiohttp = 3.13.3, aiohttp-retry = 2.9.1),
        ssh (sshfs = 2025.11.0)
Config:
        Global: /home/maxg/.config/dvc
        System: /home/maxg/.config/kdedefaults/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/mapper/system-root
Caches: local
Remotes: None
Workspace directory: ext4 on /dev/mapper/system-root
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/f919da52546ca06d0c807bb616433829

Additional Information:

dvc commit does not store the correct command either.

I already started a discussion in #10933, but I think a bug report is more appropriate.
