Skip to content

Conversation

@ramnar
Copy link

@ramnar ramnar commented Dec 7, 2025

📋 Description

JIRA ID:

Please provide a summary of the change and the motivation behind it. Include relevant context and details.

Enhanced the script which loads sample data to download the sql file, unzip it and run the script. Earlier this was a manual step. Updated the readme accordingly

✅ Type of Change

  • 🐞 Bug fix (non-breaking change which resolves an issue)
  • [x ] ✨ New feature (non-breaking change which adds functionality)
  • 🔥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 🛠 Refactor (change that is neither a fix nor a new feature)
  • ⚙️ Config change (configuration file or build script updates)
  • 📚 Documentation (updates to docs or readme)
  • 🧪 Tests (adding new or updating existing tests)
  • 🎨 UI/UX (changes that affect the user interface)
  • 🚀 Performance (improves performance)
  • 🧹 Chore (miscellaneous changes that don't modify src or test files)

ℹ️ Additional Information

Please describe how the changes were tested, and include any relevant screenshots, logs, or other information that provides additional context.

Summary by CodeRabbit

  • Documentation

    • Reworked local setup into a concise "Overview" with clear three-step activity flow; added MySQL Client to requirements.
    • Consolidated container orchestration steps, added explicit instructions to verify containers (MySQL, MongoDB, Redis) and service endpoints.
    • Streamlined deployment into a two-step database-start and schema-creation flow; tightened troubleshooting tips.
  • New Features

    • Added automated, cross-platform data loading scripts with download, extract, execution, per-file logging, and cleanup.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 7, 2025

📝 Walkthrough

Walkthrough

This PR restructures the local setup README, consolidating deployment and container initialization steps, and adds automated data-loading scripts that download, extract, execute SQL files against MySQL with per-file error handling, then clean up temporary artifacts.

Changes

Cohort / File(s) Summary
Documentation Update
amrit-local-setup/README.md
Rewrote setup into a single "Overview" flow, added MySQL Client as a prerequisite, consolidated container/startup steps, added explicit container verification (MySQL, MongoDB, Redis), simplified deployment into starting DBs and creating schema, replaced prior data-package steps with script-based data load instructions, and tightened troubleshooting guidance.
Unix Data Loader
amrit-local-setup/loaddummydata.sh
Added download of AmritMasterData.zip, extraction to AmritMasterDataFiles/AmritMasterData, sets SQL_DIR, iterates SQL files executing each against MySQL, captures output and exit codes, treats ERROR 1062 (duplicate key) specially, logs per-file results, and removes extracted files on completion.
Windows Data Loader
amrit-local-setup/loaddummydata.bat
Adds PowerShell-based download and Expand-Archive extraction into AmritMasterDataFiles, updates SQL_DIR, loops over SQL files executing them against MySQL with per-file success/error messages, and cleans up downloaded ZIP and extracted directory afterward.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor User
    participant Script as Data Loader<br/>(loaddummydata.sh/.bat)
    participant Remote as Download Service
    participant FS as Local File System
    participant MySQL as MySQL DB

    User->>Script: run script
    Script->>Remote: GET AmritMasterData.zip
    Remote-->>Script: ZIP stream
    Script->>FS: extract ZIP -> AmritMasterDataFiles/
    FS-->>Script: extraction complete

    loop for each SQL file in SQL_DIR
        Script->>FS: read SQL file
        Script->>MySQL: execute SQL
        alt exit 0
            MySQL-->>Script: OK
            Script->>Script: log success
        else ERROR 1062
            MySQL-->>Script: ERROR 1062 (duplicate)
            Script->>Script: log duplicate-skip
        else other error
            MySQL-->>Script: ERROR
            Script->>Script: log error + output
        end
    end

    Script->>FS: delete AmritMasterDataFiles/ and ZIP
    FS-->>Script: cleanup done
    Script-->>User: finished (summary)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐇 I hopped to fetch a zip so grand,

Extracted fields across the land,
SQL seeds planted, duplicates spared,
Logs for each file carefully shared,
I tidy my trail—local setup complete.

Pre-merge checks

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Bugfix/data loading' is vague and generic. It uses a non-descriptive pattern that doesn't clearly convey the main change—automating sample data loading scripts. Clarify the title to better describe the main change, such as: 'Automate sample data loading in setup scripts' or 'Add automated SQL data loading to local setup'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c54a12b and 219d86e.

📒 Files selected for processing (1)
  • amrit-local-setup/README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • amrit-local-setup/README.md

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
amrit-local-setup/loaddummydata.bat (1)

24-38: ERRORLEVEL check inside the for block will not behave as expected.

if %ERRORLEVEL% equ 0 ( is evaluated once when the block is parsed, so it won’t reliably reflect the exit code of each mysql invocation inside the for loop. You’ll likely get the same success/failure branch for all files, regardless of individual outcomes.

A robust fix is to enable delayed expansion and use !ERRORLEVEL!:

 @echo off
+setlocal enabledelayedexpansion
 ...
         mysql -h %HOST% -P %PORT% -u %USER% -p%PASSWORD% %%B < "%SQL_DIR%\%%A"
-        
-        if %ERRORLEVEL% equ 0 (
+        
+        if !ERRORLEVEL! equ 0 (
             echo Successfully executed %%A on %%B.
         ) else (
             echo Error executing %%A on %%B.
         )
 ...
+endlocal
🧹 Nitpick comments (3)
amrit-local-setup/loaddummydata.sh (1)

30-41: MySQL execution and duplicate-key handling look good; consider optionally surfacing non‑duplicate failures.

The pattern of capturing OUTPUT and EXIT_CODE, treating EXIT_CODE=0 as success and ERROR 1062 as “duplicates skipped,” is reasonable for idempotent sample-data loads. For non‑duplicate failures, you’re correctly logging the full output.

If you ever need this script in automation (CI or pre‑commit), consider:

  • Exiting with a non‑zero status in the else branch so calling processes can fail fast.
  • Optionally printing a brief summary at the end if any files hit non‑duplicate errors.
amrit-local-setup/README.md (2)

11-19: Prerequisites and clone/cd instructions align with the scripts.

Adding “MySQL Client” and explicitly cd‑ing into AMRIT-DevOps/amrit-local-setup matches how loaddummydata.sh and loaddummydata.bat are written (they assume mysql is in PATH and run relative to that directory). You might optionally call out that wget and unzip (or equivalents) are required on Linux/Unix for the data-loading step, but the current wording is already a solid improvement.


42-42: Tighten wording and capitalization in the container verification note.

Minor grammar/style nit; suggest:

-**Note:** Before proceeding, Verify that the Docker containers are running for mysql, redis and mongo
+**Note:** Before proceeding, verify that the Docker containers are running for MySQL, Redis and Mongo.

This matches typical capitalization and fixes the mid-sentence capital “Verify”. (Based on static analysis hints.)

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0988e47 and c54a12b.

📒 Files selected for processing (3)
  • amrit-local-setup/README.md (2 hunks)
  • amrit-local-setup/loaddummydata.bat (2 hunks)
  • amrit-local-setup/loaddummydata.sh (2 hunks)
🧰 Additional context used
🪛 Gitleaks (8.30.0)
amrit-local-setup/loaddummydata.sh

[high] 10-10: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)


[high] 11-11: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

amrit-local-setup/loaddummydata.bat

[high] 11-11: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🪛 LanguageTool
amrit-local-setup/README.md

[grammar] ~42-~42: Ensure spelling is correct
Context: ...ainers are running for mysql, redis and mongo ### 2. Database Schema Management Service De...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🔇 Additional comments (1)
amrit-local-setup/loaddummydata.sh (1)

45-47: Cleanup of extracted files is straightforward and appropriate.

Removing AmritMasterDataFiles after the run keeps the workspace clean and forces a fresh download on the next execution, which is desirable for a reproducible sample-data loader.

Comment on lines +9 to +16
:: Download the file using PowerShell's Invoke-WebRequest (equivalent to wget)
echo Downloading AmritMasterData.zip...
powershell -Command "Invoke-WebRequest -Uri 'https://1865391384-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYfDZFIsUuulWkRHaq4c1%2Fuploads%2F1WdSAf0fQBeJOea70EXE%2FAmritMasterData.zip?alt=media&token=18e0b6d6-487c-4c0c-967a-02cdd94d61ad' -OutFile 'AmritMasterData.zip'"

:: Extract the file using PowerShell's Expand-Archive (equivalent to unzip)
echo Extracting AmritMasterData.zip...
powershell -Command "Expand-Archive -Path 'AmritMasterData.zip' -DestinationPath 'AmritMasterDataFiles'"

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

External download URL embeds a token; consider treating it as a secret and hardening failure handling.

The GitBook URL used in Invoke-WebRequest contains a token= query parameter, which static analysis correctly flags as a potential generic API key. If this token is sensitive or revocable, committing it to the repo risks accidental exposure and future breakage when it’s rotated. Also, neither the download nor the extraction has error checks, so the script will continue even if the file is missing or the network fails.

Consider:

  • Moving the URL (or just the token) into a config/env variable, or hosting the ZIP at a stable, unauthenticated location.
  • Adding basic checks on %ERRORLEVEL% after Invoke-WebRequest/Expand-Archive and aborting with a clear message on failure.
🧰 Tools
🪛 Gitleaks (8.30.0)

[high] 11-11: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🤖 Prompt for AI Agents
In amrit-local-setup/loaddummydata.bat around lines 9 to 16, the script embeds a
tokenized GitBook URL and lacks error handling; remove the hardcoded token by
reading the full URL or only the token from an environment variable or a config
file (do not commit the secret), then after each PowerShell call check
%ERRORLEVEL% (or inspect PowerShell's exit code) and if non-zero echo a clear
error message and exit /b with a non‑zero code so the script stops on download
or extraction failures.

Comment on lines 17 to +18
:: Path to the extracted SQL files
set SQL_DIR=<PATH to the extracted files directory>
set SQL_DIR="AmritMasterDataFiles/AmritMasterData"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix SQL_DIR quoting to avoid malformed paths.

set SQL_DIR="AmritMasterDataFiles/AmritMasterData" stores the quotes inside the variable, and later you wrap %SQL_DIR% in another pair of quotes: "%SQL_DIR%\%%A". This yields a doubly quoted, slightly malformed path which can cause file resolution issues.

A safer pattern is:

- set SQL_DIR="AmritMasterDataFiles/AmritMasterData"
+ set "SQL_DIR=AmritMasterDataFiles\AmritMasterData"

Then keep the existing usage "%SQL_DIR%\%%A".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
:: Path to the extracted SQL files
set SQL_DIR=<PATH to the extracted files directory>
set SQL_DIR="AmritMasterDataFiles/AmritMasterData"
:: Path to the extracted SQL files
set "SQL_DIR=AmritMasterDataFiles\AmritMasterData"
🤖 Prompt for AI Agents
In amrit-local-setup/loaddummydata.bat around lines 17-18, the SQL_DIR variable
is assigned with embedded quotes which become part of the variable value and
lead to double-quoting later; change the assignment to remove the quotes (e.g.,
set SQL_DIR=AmritMasterDataFiles\AmritMasterData) so the variable contains a
plain path, then keep using it as "%SQL_DIR%\%%A" elsewhere.

Comment on lines +42 to +45
# Clean up the extracted files
del AmritMasterData.zip
rmdir /s /q AmritMasterDataFiles
echo Cleaned up AmritMasterDataFiles folder.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Batch comment syntax: use REM or :: instead of #.

# Clean up the extracted files in a .bat file is executed as a command (and will print an error), not treated as a comment.

Change it to something like:

-# Clean up the extracted files
+:: Clean up the extracted files
 del AmritMasterData.zip
 rmdir /s /q AmritMasterDataFiles
 echo Cleaned up AmritMasterDataFiles folder.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Clean up the extracted files
del AmritMasterData.zip
rmdir /s /q AmritMasterDataFiles
echo Cleaned up AmritMasterDataFiles folder.
:: Clean up the extracted files
del AmritMasterData.zip
rmdir /s /q AmritMasterDataFiles
echo Cleaned up AmritMasterDataFiles folder.
🤖 Prompt for AI Agents
In amrit-local-setup/loaddummydata.bat around lines 42 to 45, the line starting
with "#" is being treated as a command rather than a comment in a .bat file;
replace the "#" comment with a valid batch comment token (for example prefix the
line with REM or ::) so the cleanup note is a true comment, leaving the del and
rmdir commands unchanged and preserving the final echo.

Comment on lines +10 to 15
# download and extract the SQL files from the given url https://1865391384-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYfDZFIsUuulWkRHaq4c1%2Fuploads%2F1WdSAf0fQBeJOea70EXE%2FAmritMasterData.zip?alt=media&token=18e0b6d6-487c-4c0c-967a-02cdd94d61ad
wget -O AmritMasterData.zip "https://1865391384-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYfDZFIsUuulWkRHaq4c1%2Fuploads%2F1WdSAf0fQBeJOea70EXE%2FAmritMasterData.zip?alt=media&token=18e0b6d6-487c-4c0c-967a-02cdd94d61ad"
unzip AmritMasterData.zip -d AmritMasterDataFiles
rm AmritMasterData.zip
SQL_DIR="AmritMasterDataFiles/AmritMasterData"

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Download URL with embedded token may be sensitive and is a single point of failure.

The hard-coded GitBook URL (with token=) for AmritMasterData.zip is flagged by Gitleaks as a generic API key. If the token is private or short‑lived, committing it to the repo both exposes it and makes the script fragile when it’s rotated or expires. Additionally, if wget/unzip fail, the script will still proceed to the MySQL step with an invalid SQL_DIR.

Recommendations:

  • Move the URL (or token) into configuration (env var, .env file ignored by Git, or a central config script) or host the ZIP at a stable, unauthenticated location.
  • After wget and unzip, check the exit status and/or the presence of $SQL_DIR and abort early with a clear message if the data isn’t available.
🧰 Tools
🪛 Gitleaks (8.30.0)

[high] 10-10: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)


[high] 11-11: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🤖 Prompt for AI Agents
In amrit-local-setup/loaddummydata.sh around lines 10 to 15, the script embeds a
GitBook URL with a token and does not validate download/unzip success; move the
URL or token into a configuration source (eg. an environment variable or .env
entry that is gitignored) and reference that variable in the script instead of
the hard-coded string, and after running wget and unzip check their exit codes
and verify the expected SQL_DIR exists (abort with a clear error and non-zero
exit status if any step fails) so the script never proceeds to the MySQL step
with missing data.

Comment on lines 51 to 53
cp `src/main/environment/common_example.properties` to `src/main/environment/common_local.properties`
mvn clean install -DENV_VAR=local
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix the cp command syntax and polish the data-loading sentence.

The line inside the bash block is not a valid shell command as written:

cp `src/main/environment/common_example.properties` to `src/main/environment/common_local.properties`

It will be interpreted as command substitution and a literal to, and will fail. Also, the sentence introducing the data-loading step can be made a bit clearer.

Suggested changes:

 ```bash
 git clone https://github.com/PSMRI/AMRIT-DB.git
 cd AMRIT-DB
-cp `src/main/environment/common_example.properties` to `src/main/environment/common_local.properties`
-mvn clean install -DENV_VAR=local
+cp src/main/environment/common_example.properties src/main/environment/common_local.properties
+mvn clean install -DENV_VAR=local

...
-Run the below script to load sample Data into the database tables created in the previous step
+Run the following script to load sample data into the database tables created in the previous step.






Also applies to: 58-58

<details>
<summary>🤖 Prompt for AI Agents</summary>

In amrit-local-setup/README.md around lines 51-53 and line 58, fix the invalid
cp command and polish the data-loading sentence: replace the backticked and
"to"-containing line with a proper copy command using plain paths (cp
src/main/environment/common_example.properties
src/main/environment/common_local.properties) and ensure the subsequent mvn line
remains unchanged; update the sentence introducing the script to "Run the
following script to load sample data into the database tables created in the
previous step." Keep the corrected lines inside the bash code block.


</details>

<!-- fingerprinting:phantom:poseidon:olive -->

<!-- This is an auto-generated comment by CodeRabbit -->


:: Download the file using PowerShell's Invoke-WebRequest (equivalent to wget)
echo Downloading AmritMasterData.zip...
powershell -Command "Invoke-WebRequest -Uri 'https://1865391384-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYfDZFIsUuulWkRHaq4c1%2Fuploads%2F1WdSAf0fQBeJOea70EXE%2FAmritMasterData.zip?alt=media&token=18e0b6d6-487c-4c0c-967a-02cdd94d61ad' -OutFile 'AmritMasterData.zip'"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My worry is that this DB dump can change in the documentation (with the same file name). Also, I am not sure if the token will expire any time.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid Point. I was thinking about it.

File url will change when new data dump is added to the documentation. We need to update the script as well whenever we update the dump file in gitbook. Token doesnt seem to expire as long as the file name is same

Other option is to maintain the dump file in github rather than gitbook. As the file size is large, we need to use git LFS

Do you have any other ideas?

Copy link
Author

@ramnar ramnar Jan 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@drtechie - End goal I am trying to achieve in future is to have a single script to start database containers, start schema management service and load dummy data. I believe we can have this single script in schema management service git repository itself. Load dummy data can be integrated with schema management service based one some flag

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants