
Conversation


@shanmugamr1992 shanmugamr1992 commented Nov 11, 2025

What does this PR do ?

Adds Megatron-Core (mcore) dynamic inference engine support for generation.

https://wandb.ai/shanmugamr/grpo-dev?nw=nwusershanmugamr (plot showing vLLM and Megatron-Core generation reaching similar performance)

Final timings:

Generation alone:
  • Megatron-Core: 20 seconds
  • vLLM: 8 seconds

End-to-end step time:
  • Megatron-Core: 47 seconds
  • vLLM: 37 seconds

Issues


Closes #1079

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this
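A hedged launch example (the entry-point script name is an assumption; the config path is the one added by this PR):

```shell
uv run examples/run_grpo_math.py \
    --config examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.yaml
```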

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.


Summary by CodeRabbit

  • New Features

    • Introduced dynamic inference engine for improved generation performance with CUDA graph optimization support.
    • Added configuration for GRPO Llama 3.2 1B Instruct model with Megatron generation backend.
  • Bug Fixes

    • Fixed potential error in policy generation initialization.
  • Tests

    • Added new functional tests for GRPO Megatron generation workflow.
    • Enabled previously disabled generation tests.

@shanmugamr1992 shanmugamr1992 requested review from a team as code owners November 11, 2025 00:05

coderabbitai bot commented Nov 11, 2025

📝 Walkthrough


This PR introduces GRPO support for Llama 3.2-1B with dynamic megatron generation inference, refactors MegatronPolicyWorker to use DynamicInferenceEngine instead of StaticInferenceEngine, adds new test scripts to the GPU and nightly test suites, adds a None-check guard for policy_generation setup, and enables previously skipped megatron generation unit tests.

Changes

  • Configuration Addition — examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.yaml: New YAML config for GRPO with Llama-3.2-1B-Instruct, the megatron generation backend, and settings for checkpointing, logging, and inference (max_new_tokens 512).
  • Core Algorithm Logic — nemo_rl/algorithms/grpo.py: Adds a None-check guard around the prepare_refit_info(state_dict_info) call to prevent an AttributeError when policy_generation is None.
  • Inference Engine Refactor — nemo_rl/models/policy/megatron_policy_worker.py: Replaces StaticInferenceEngine with DynamicInferenceContext and DynamicInferenceEngine in the generate() method; introduces per-prompt request handling, dynamic sampling-parameter configuration, explicit detokenization, and CUDA graph optimization support.
  • GPU Functional Tests — tests/functional/L1_Functional_Tests_GPU.sh, tests/functional/grpo_megatron_generation.sh: Extends the GPU L1 test suite with the new GRPO megatron generation functional test; the new script sets up the experiment environment, runs GRPO training, generates metrics, and validates that token probability error stays below 1.05.
  • Nightly & Unit Tests — tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.sh, tests/test_suites/nightly.txt, tests/unit/models/policy/test_megatron_worker.py: Adds a nightly test script for GRPO megatron generation with config and metric validation, registers it in the nightly suite, and removes a pytest skip decorator to enable the megatron worker unit tests.

Sequence Diagram(s)

sequenceDiagram
    participant caller as Caller
    participant old as Old: StaticInferenceEngine
    participant new as New: DynamicInferenceEngine
    participant model as Model
    participant engine as Engine

    rect rgb(200, 220, 250)
    Note over old,engine: Previous Flow (Static)
    caller->>old: generate(prompts, ...)
    old->>old: run_mcore_engine()
    old->>engine: forward pass
    engine-->>old: output tokens + logprobs
    old-->>caller: return results
    end

    rect rgb(220, 250, 200)
    Note over new,engine: New Flow (Dynamic)
    caller->>new: generate(prompts, ...)
    new->>new: DynamicInferenceContext setup
    new->>new: GPTInferenceWrapper init
    new->>model: prep_model_for_inference()
    model->>model: enable CUDA graphs (local)
    new->>new: compute tokens_to_generate
    new->>new: configure SamplingParams (temp, top_k, top_p)
    loop per-prompt
        new->>engine: create request with params
        engine->>engine: dynamic batching
    end
    new->>engine: collect results by request_id
    new->>new: detokenize per-prompt
    new->>new: sort by request_id
    new-->>caller: return ordered results
    end
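The tokens_to_generate and sampling-parameter steps in the new flow can be sketched as follows (a toy illustration; the dict keys are assumptions, not the megatron-core SamplingParams API):

```python
def build_sampling_config(max_new_tokens, prompt_len, context_len,
                          temperature=1.0, top_k=0, top_p=0.0):
    """Clamp the generation budget to the context window and bundle the
    per-request sampling knobs the dynamic flow configures (temp, top_k, top_p)."""
    # Never request more tokens than fit after the prompt.
    tokens_to_generate = min(max_new_tokens, context_len - prompt_len)
    if tokens_to_generate <= 0:
        raise ValueError("prompt already fills the context window")
    return {
        "num_tokens_to_generate": tokens_to_generate,
        "temperature": temperature,
        "top_k": top_k,
        "top_p": top_p,
    }
```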

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • MegatronPolicyWorker.generate() refactor (nemo_rl/models/policy/megatron_policy_worker.py): Significant internal logic change replacing inference engine approach; requires careful verification of per-prompt batching, sampling parameter handling, result ordering, and CUDA graph integration.
  • Test script logic (tests/functional/grpo_megatron_generation.sh and corresponding test suite script): Verify environment setup, metric validation thresholds, and log parsing correctness.
  • Policy generation None-check (nemo_rl/algorithms/grpo.py): Minor defensive fix; straightforward guard condition.
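The functional test's token-probability threshold can be sketched as below (a hypothetical check; the real script parses metrics from training logs, and the function name is assumed):

```python
def max_token_prob_error(train_probs, gen_probs):
    """Worst-case multiplicative disagreement between the probabilities the
    training backend and the generation backend assign to the same sampled
    tokens. A value of 1.0 means perfect agreement."""
    return max(max(t, g) / min(t, g) for t, g in zip(train_probs, gen_probs))


# The test passes when the worst-case error stays under 1.05.
error = max_token_prob_error([0.50, 0.20, 0.90], [0.49, 0.205, 0.90])
assert error < 1.05, f"token probability error too high: {error:.3f}"
```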


Suggested labels

CI:L1, Run CICD

Suggested reviewers

  • parthchadha
  • terrykong
  • guyueh1

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Test Results For Major Changes ⚠️ Warning — The PR implements a major inference-engine refactor and performance optimizations, but the description lacks test results, performance metrics, or convergence-analysis documentation. Resolution: document performance benchmarks, test execution results, and convergence/regression analysis validating the StaticInferenceEngine → DynamicInferenceEngine refactoring.
✅ Passed checks (3 passed)
  • Title check ✅ — The title 'build: Use dynamic engine for generate' directly reflects the main change: replacing the hard-coded StaticInferenceEngine with DynamicInferenceEngine in megatron_policy_worker.py, the core technical change of the PR.
  • Docstring Coverage ✅ — Docstring coverage is 100.00%, above the required 80.00% threshold.
  • Description Check ✅ — Check skipped because CodeRabbit's high-level summary is enabled.





@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6a035bc and 8a0f86b.

📒 Files selected for processing (8)
  • examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.yaml (1 hunks)
  • nemo_rl/algorithms/grpo.py (1 hunks)
  • nemo_rl/models/policy/megatron_policy_worker.py (2 hunks)
  • tests/functional/L1_Functional_Tests_GPU.sh (1 hunks)
  • tests/functional/grpo_megatron_generation.sh (1 hunks)
  • tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.sh (1 hunks)
  • tests/test_suites/nightly.txt (1 hunks)
  • tests/unit/models/policy/test_megatron_worker.py (0 hunks)
💤 Files with no reviewable changes (1)
  • tests/unit/models/policy/test_megatron_worker.py
🧰 Additional context used
📓 Path-based instructions (10)
**/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.sh: Follow the Google Shell Style Guide for all shell scripts
Use uv run to execute Python scripts in shell/driver scripts instead of activating virtualenvs and calling python directly
Add the NVIDIA copyright header (with current year) at the top of all shell scripts, excluding tests/ and test-only scripts

Files:

  • tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.sh
  • tests/functional/grpo_megatron_generation.sh
  • tests/functional/L1_Functional_Tests_GPU.sh
tests/test_suites/llm/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

LLM driver script filenames must mirror the YAML base name and follow the same pattern with .sh extension

Files:

  • tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.sh
tests/test_suites/**

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Place driver shell scripts and common.env under tests/test_suites// and list nightly tests in tests/test_suites/nightly.txt

Files:

  • tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.sh
  • tests/test_suites/nightly.txt
examples/configs/recipes/**/*.yaml

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

examples/configs/recipes/**/*.yaml: Recipe YAMLs under examples/configs/recipes/** are runnable snapshots and may omit documentation
When adding support for a new model, add a recipe YAML under examples/configs/recipes/ in the appropriate domain (llm/ or vlm/) with the correct name

Files:

  • examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.yaml
examples/configs/recipes/llm/*.yaml

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

LLM recipe YAML filenames must follow: --ng-[-modifiers][-long][.vN].yaml

Files:

  • examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.yaml
examples/configs/recipes/**/*.{yaml,sh}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Known exception: Deepscaler recipes may encode context length in place of the cluster tuple (e.g., grpo-deepscaler-1.5b-8K.*); allowed but document intended hardware in the script

Files:

  • examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.yaml
examples/configs/recipes/**

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Place recipe YAMLs under examples/configs/recipes//

Files:

  • examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.yaml
**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Follow the Google Python Style Guide for all Python code
Target Python 3.12+ for all Python code in NeMo-RL
Indent Python code with 4 spaces; do not use tabs
Python filenames should be snake_case (e.g., some_file.py)
Class names should be PascalCase
Function and method names should be snake_case
Local variable names should be snake_case; if starting with a number, prefix with k (e.g., k_99th_percentile)
Global variables should be UPPER_SNAKE_CASE and prefixed with G_ (e.g., G_MY_GLOBAL)
Constants should be UPPER_SNAKE_CASE
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
For public interfaces used outside a file, prefer docstrings over comments
Use comments mainly for code within a function or interfaces local to a file
Commented-out code must include a nearby comment explaining usage and why it is commented out; otherwise remove before merging
Use Google-style docstrings for classes and functions (Sphinx-parseable)
Avoid using reflection when functionality can be easily achieved without it
Limit except clauses to the smallest specific set of exceptions possible
For duck-typing via try/except, keep the try body minimal and use else for main logic
Add the NVIDIA copyright header (with current year) at the top of all Python files, excluding tests/ and test-only scripts

Files:

  • nemo_rl/algorithms/grpo.py
  • nemo_rl/models/policy/megatron_policy_worker.py
nemo_rl/**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

nemo_rl/**/*.py: Do not set non-None configuration defaults in code; YAML is the single source of truth for defaults
Access required config attributes directly (e.g., policy_cfg["precision"]) and assume presence; do not introduce hidden defaults
Express configuration optionality via TypedDict using typing.NotRequired
When adding a new config key to a TypedDict subclass, document the key’s purpose, valid values/types, and recommended default in code
For any class or function decorated with @ray.remote, add '# pragma: no cover' on the class/def line (and on remote functions)

Files:

  • nemo_rl/algorithms/grpo.py
  • nemo_rl/models/policy/megatron_policy_worker.py
tests/test_suites/nightly.txt

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Append the new driver script path (relative to tests/test_suites/) to tests/test_suites/nightly.txt

Files:

  • tests/test_suites/nightly.txt
🧠 Learnings (10)
📚 Learning: 2025-10-12T14:46:57.171Z
Learnt from: zpqiu
Repo: NVIDIA-NeMo/RL PR: 1324
File: tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh:6-11
Timestamp: 2025-10-12T14:46:57.171Z
Learning: Test scripts in tests/test_suites/llm/ follow a standard configuration pattern that includes NUM_NODES, STEPS_PER_RUN, MAX_STEPS, NUM_RUNS (calculated as `$(( (MAX_STEPS + STEPS_PER_RUN - 1) / STEPS_PER_RUN ))`), and NUM_MINUTES. These variables are part of the test infrastructure's standard interface and should not be flagged as unused even if not directly referenced within the individual script, as they are consumed by external launch tooling or common.env.

Applied to files:

  • tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.sh
  • tests/test_suites/nightly.txt
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to tests/test_suites/llm/*.sh : LLM driver script filenames must mirror the YAML base name and follow the same pattern with .sh extension

Applied to files:

  • tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.sh
  • tests/test_suites/nightly.txt
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to tests/test_suites/**/*.{sh} : For new model support, add a matching driver shell script under tests/test_suites/<domain>/ that sources common.env and invokes 'uv run ... --config <yaml>'

Applied to files:

  • tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.sh
  • tests/functional/grpo_megatron_generation.sh
📚 Learning: 2025-10-12T14:46:55.513Z
Learnt from: zpqiu
Repo: NVIDIA-NeMo/RL PR: 1324
File: tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh:16-30
Timestamp: 2025-10-12T14:46:55.513Z
Learning: In the NVIDIA-NeMo/RL repository, test scripts under tests/ follow a consistent pattern: use `cd $PROJECT_ROOT` without quotes or error handling, and pass arguments with `$@` unquoted. Maintain this consistency when adding new test scripts.

Applied to files:

  • tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.sh
  • tests/functional/grpo_megatron_generation.sh
📚 Learning: 2025-09-19T07:28:29.887Z
Learnt from: shuo-nvidia
Repo: NVIDIA-NeMo/RL PR: 1006
File: tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh:1-4
Timestamp: 2025-09-19T07:28:29.887Z
Learning: The NVIDIA-NeMo/RL project prefers to maintain consistent formatting across test scripts rather than applying individual bash hardening improvements like `set -euo pipefail` or proper quoting for sourcing files.

Applied to files:

  • tests/functional/grpo_megatron_generation.sh
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to examples/configs/recipes/llm/*.yaml : LLM recipe YAML filenames must follow: <algo>-<model>-<nodes>n<gpus>g-<strategy-and-params>[-modifiers][-long][.vN].yaml

Applied to files:

  • examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.yaml
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to examples/configs/recipes/**/*.yaml : When adding support for a new model, add a recipe YAML under examples/configs/recipes/ in the appropriate domain (llm/ or vlm/) with the correct name

Applied to files:

  • examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.yaml
📚 Learning: 2025-09-18T14:57:31.003Z
Learnt from: zpqiu
Repo: NVIDIA-NeMo/RL PR: 1006
File: nemo_rl/algorithms/distillation.py:312-354
Timestamp: 2025-09-18T14:57:31.003Z
Learning: The distillation algorithm's cluster setup logic is designed to follow the same patterns used in GRPO for handling distributed training clusters and resource allocation.

Applied to files:

  • examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.yaml
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to tests/test_suites/nightly.txt : Append the new driver script path (relative to tests/test_suites/) to tests/test_suites/nightly.txt

Applied to files:

  • tests/test_suites/nightly.txt
📚 Learning: 2025-09-20T14:58:45.492Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-09-20T14:58:45.492Z
Learning: Applies to tests/test_suites/** : Place driver shell scripts and common.env under tests/test_suites/<domain>/ and list nightly tests in tests/test_suites/nightly.txt

Applied to files:

  • tests/test_suites/nightly.txt
🧬 Code graph analysis (2)
nemo_rl/algorithms/grpo.py (10)
nemo_rl/models/policy/megatron_policy_worker.py (1)
  • prepare_refit_info (1984-1998)
nemo_rl/models/generation/vllm/vllm_backend.py (1)
  • prepare_refit_info (90-97)
nemo_rl/models/policy/dtensor_policy_worker.py (1)
  • prepare_refit_info (1733-1740)
nemo_rl/models/policy/dtensor_policy_worker_v2.py (1)
  • prepare_refit_info (1694-1701)
nemo_rl/models/generation/vllm/vllm_worker.py (1)
  • prepare_refit_info (631-633)
nemo_rl/models/generation/interfaces.py (1)
  • prepare_refit_info (239-241)
nemo_rl/models/generation/vllm/vllm_generation.py (1)
  • prepare_refit_info (751-768)
nemo_rl/models/policy/lm_policy.py (1)
  • prepare_refit_info (682-691)
nemo_rl/models/policy/interfaces.py (1)
  • prepare_refit_info (157-158)
tests/unit/algorithms/test_distillation.py (2)
  • prepare_refit_info (549-550)
  • prepare_refit_info (565-566)
nemo_rl/models/policy/megatron_policy_worker.py (1)
nemo_rl/distributed/batched_data_dict.py (1)
  • size (814-823)
🪛 Ruff (0.14.4)
nemo_rl/models/policy/megatron_policy_worker.py

1796-1796: Local variable max_batch_size is assigned to but never used

Remove assignment to unused variable max_batch_size

(F841)

🪛 Shellcheck (0.11.0)
tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.sh

[warning] 6-6: NUM_NODES appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 9-9: NUM_RUNS appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 10-10: NUM_MINUTES appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 16-16: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[error] 29-29: Double quote array expansions to avoid re-splitting elements.

(SC2068)

tests/functional/grpo_megatron_generation.sh

[error] 39-39: Double quote array expansions to avoid re-splitting elements.

(SC2068)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Lint check
  • GitHub Check: Post submodule check comment / Comment on PR
  • GitHub Check: Post automodel integration comment / Comment on PR
🔇 Additional comments (7)
nemo_rl/algorithms/grpo.py (1)

514-515: LGTM! Defensive None-check prevents AttributeError.

This correctly guards against calling prepare_refit_info when policy_generation is None (which occurs when backend == "megatron" at lines 439-444).
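The guard can be illustrated with a minimal sketch (stub names are hypothetical; the real call site lives in nemo_rl/algorithms/grpo.py):

```python
class StubGeneration:
    """Stand-in for a separate generation backend (e.g. vLLM)."""

    def __init__(self):
        self.refit_info = None

    def prepare_refit_info(self, state_dict_info):
        self.refit_info = state_dict_info


def setup_refit(policy_generation, state_dict_info):
    # When generation runs in-place on the megatron policy workers,
    # policy_generation is None; calling prepare_refit_info on it
    # unconditionally would raise AttributeError.
    if policy_generation is not None:
        policy_generation.prepare_refit_info(state_dict_info)


gen = StubGeneration()
setup_refit(gen, {"layer.weight": ("shape", "dtype")})
setup_refit(None, {"layer.weight": ("shape", "dtype")})  # no-op, no error
```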

nemo_rl/models/policy/megatron_policy_worker.py (2)

1753-1756: LGTM! Ensures model is on CUDA for generation.

The model movement before disabling forward pre-hook is appropriate for the colocated inference setup.


1869-1897: LGTM! Dynamic engine request handling and result collection.

The per-prompt request loop, result collection via step_modern, and sorting by request_id correctly maintain batch order and integrate with the new dynamic inference flow.
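The ordering guarantee can be demonstrated with a toy engine (this is a simulation, not the megatron-core API; request IDs and out-of-order completion are modeled explicitly):

```python
import random


class ToyDynamicEngine:
    """Simulates a dynamic-batching engine that may finish requests out of order."""

    def __init__(self):
        self._next_id = 0
        self._pending = {}

    def add_request(self, prompt):
        # One request per prompt, tagged with a monotonically increasing id.
        request_id = self._next_id
        self._next_id += 1
        self._pending[request_id] = prompt.upper()  # fake "generation"
        return request_id

    def collect(self):
        # Dynamic batching gives no ordering guarantee on completion.
        items = list(self._pending.items())
        random.shuffle(items)
        return items


engine = ToyDynamicEngine()
for prompt in ["a", "b", "c"]:
    engine.add_request(prompt)

# Sort by request_id so outputs line up with the input batch order.
results = [text for _, text in sorted(engine.collect())]
```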

tests/test_suites/nightly.txt (1)

15-15: LGTM! Test path correctly registered.

The new Megatron generation test path follows the established pattern and aligns with the new test script added to the suite.

tests/functional/L1_Functional_Tests_GPU.sh (1)

26-26: LGTM! Extends L1 GPU test suite.

The new grpo_megatron_generation.sh test invocation follows the existing pattern and appropriately expands test coverage for the dynamic Megatron generation path.

examples/configs/recipes/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.yaml (1)

1-36: LGTM! Well-structured Megatron generation configuration.

The YAML configuration correctly defines all necessary settings for GRPO with Megatron-based generation, including:

  • Appropriate defaults inheritance
  • Megatron backend with dynamic generation parameters
  • Consistent resource allocation (8 GPUs per node)
  • Proper logging and checkpointing setup

The configuration aligns well with the dynamic inference changes in the Megatron policy worker.

tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.sh (1)

29-29: Quote the $@ expansion to handle arguments with spaces.

Unquoted $@ can cause word splitting and glob expansion issues when arguments contain spaces or special characters.

Apply this diff:

-    $@ \
+    "$@" \
⛔ Skipped due to learnings
Learnt from: zpqiu
Repo: NVIDIA-NeMo/RL PR: 1324
File: tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh:16-30
Timestamp: 2025-10-12T14:46:55.513Z
Learning: In the NVIDIA-NeMo/RL repository, test scripts under tests/ follow a consistent pattern: use `cd $PROJECT_ROOT` without quotes or error handling, and pass arguments with `$@` unquoted. Maintain this consistency when adding new test scripts.

@terrykong terrykong added the CI:L1 Run doctests, unit tests, and functional tests label Nov 22, 2025
@terrykong terrykong linked an issue Nov 22, 2025 that may be closed by this pull request
@shanmugamr1992 shanmugamr1992 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Nov 25, 2025
@shanmugamr1992 shanmugamr1992 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Nov 26, 2025
@terrykong terrykong enabled auto-merge (squash) November 26, 2025 20:19
terrykong previously approved these changes Nov 26, 2025
@shanmugamr1992 shanmugamr1992 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Nov 27, 2025
@shanmugamr1992 shanmugamr1992 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Nov 29, 2025
@shanmugamr1992 shanmugamr1992 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Nov 30, 2025
@terrykong terrykong merged commit 66d80e6 into main Dec 1, 2025
53 of 58 checks passed
@terrykong terrykong deleted the build_dynmamic branch December 1, 2025 23:09

Labels

CI:L1 Run doctests, unit tests, and functional tests


Development

Successfully merging this pull request may close these issues.

Megatron Inference support for GRPO (Dynamic batching)

3 participants