Sentinel: Fix SQL injection bypass in read-only query validation#151
Sentinel: Fix SQL injection bypass in read-only query validation#151Dexploarer wants to merge 1 commit intodevelopfrom
Conversation
Refactored `validateReadOnlyQuery` to correctly strip SQL comments by replacing them with spaces instead of empty strings. This prevents mutation keywords from being concatenated and bypassing the blocklist (e.g., `DELETE/**/FROM` -> `DELETE FROM`, caught by regex). Also exported the validation logic for unit testing.
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the security of read-only SQL query validation by patching a critical SQL injection vulnerability. The changes prevent malicious users from executing mutation commands by cleverly obfuscating their queries with comments. By refining the comment stripping mechanism and consolidating the validation rules into a dedicated function, the system is now more robust against various forms of SQL attacks, ensuring that read-only database access remains strictly read-only. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
| export function validateReadOnlyQuery(sqlText: string): string | null { | ||
| // Strip block comments (/* ... */) and line comments (-- ...). | ||
| // Replace with a space to prevent keyword concatenation (e.g. DELETE/**/FROM -> DELETE FROM). | ||
| const stripped = sqlText | ||
| .replace(/\/\*[\s\S]*?\*\//g, " ") | ||
| .replace(/--.*$/gm, " ") | ||
| .trim(); | ||
|
|
||
| // Strip string literals so that mutation keywords/functions inside quoted | ||
| // strings are ignored. Handles single-quoted ('...'), dollar-quoted | ||
| // ($$...$$), and tagged dollar-quoted ($tag$...$tag$) strings. | ||
| const noLiterals = stripped | ||
| .replace(/\$([A-Za-z0-9_]*)\$[\s\S]*?\$\1\$/g, " ") | ||
| .replace(/'(?:[^']|'')*'/g, " "); | ||
|
|
||
| // For keyword checks, also strip double-quoted identifiers to avoid | ||
| // matching words inside quoted table/column names. | ||
| const noStrings = noLiterals.replace(/"(?:[^"]|"")*"/g, " "); | ||
|
|
||
| const mutationKeywords = [ | ||
| // ── DML ──────────────────────────────────────────────────────────── | ||
| "INSERT", | ||
| "UPDATE", | ||
| "DELETE", | ||
| "INTO", | ||
| "COPY", | ||
| "MERGE", | ||
| // ── DDL ──────────────────────────────────────────────────────────── | ||
| "DROP", | ||
| "ALTER", | ||
| "TRUNCATE", | ||
| "CREATE", | ||
| "COMMENT", | ||
| // ── Admin / privilege ────────────────────────────────────────────── | ||
| "GRANT", | ||
| "REVOKE", | ||
| "SET", | ||
| "RESET", | ||
| "LOAD", | ||
| // ── Maintenance ──────────────────────────────────────────────────── | ||
| "VACUUM", | ||
| "REINDEX", | ||
| "CLUSTER", | ||
| "REFRESH", | ||
| "DISCARD", | ||
| // ── Procedural ───────────────────────────────────────────────────── | ||
| "CALL", | ||
| "DO", | ||
| // ── Async notifications (side-effects) ───────────────────────────── | ||
| "LISTEN", | ||
| "UNLISTEN", | ||
| "NOTIFY", | ||
| // ── Prepared statements (can wrap mutations) ─────────────────────── | ||
| "PREPARE", | ||
| "EXECUTE", | ||
| "DEALLOCATE", | ||
| // ── Locking ──────────────────────────────────────────────────────── | ||
| "LOCK", | ||
| ]; | ||
| // Match mutation keywords as whole words (word boundary) anywhere in the | ||
| // query, catching them inside CTEs, subqueries, etc. | ||
| const mutationPattern = new RegExp( | ||
| `\\b(${mutationKeywords.join("|")})\\b`, | ||
| "i", | ||
| ); | ||
| const match = mutationPattern.exec(noStrings); | ||
| if (match) { | ||
| return `Query rejected: "${match[1].toUpperCase()}" is a mutation keyword. Set readOnly: false to execute mutations.`; | ||
| } | ||
|
|
||
| // PostgreSQL built-in functions that can read/write server files, mutate | ||
| // server state, or cause denial of service. These appear inside otherwise | ||
| // valid SELECT expressions, so keyword checks alone won't catch them. | ||
| // | ||
| // ── File I/O (arbitrary file read/write on the DB server) ───────── | ||
| // lo_import('/etc/passwd') — load file into large object | ||
| // lo_export(oid, '/tmp/evil') — write large object to file | ||
| // lo_unlink(oid) — delete large object | ||
| // pg_read_file('/etc/passwd') — read server file (superuser) | ||
| // pg_read_binary_file(...) — same, binary | ||
| // pg_write_file(...) — write to server files (ext. module) | ||
| // pg_stat_file(...) — stat a server file | ||
| // pg_ls_dir(...) — list server directory | ||
| // | ||
| // ── Sequence / state mutation ──────────────────────────────────── | ||
| // nextval('seq'), setval('seq', n) | ||
| // | ||
| // ── Denial of service ──────────────────────────────────────────── | ||
| // pg_sleep(n) — block connection for n seconds | ||
| // pg_sleep_for(interval) — same, interval version | ||
| // pg_sleep_until(timestamp) — same, deadline version | ||
| // | ||
| // ── Session / backend control ──────────────────────────────────── | ||
| // pg_terminate_backend(pid) — kill another connection | ||
| // pg_cancel_backend(pid) — cancel a running query | ||
| // pg_reload_conf() — reload server configuration | ||
| // pg_rotate_logfile() — rotate the server log | ||
| // set_config(name, value, local) — SET equivalent as function | ||
| // | ||
| // ── Advisory locks (can deadlock other connections) ─────────────── | ||
| // pg_advisory_lock(key) — session-level advisory lock | ||
| // pg_advisory_lock_shared(key) | ||
| // pg_try_advisory_lock(key) | ||
| const dangerousFunctions = [ | ||
| // File I/O | ||
| "lo_import", | ||
| "lo_export", | ||
| "lo_unlink", | ||
| "lo_put", | ||
| "lo_from_bytea", | ||
| "pg_read_file", | ||
| "pg_read_binary_file", | ||
| "pg_write_file", | ||
| "pg_stat_file", | ||
| "pg_ls_dir", | ||
| "pg_ls_logdir", | ||
| "pg_ls_waldir", | ||
| "pg_ls_tmpdir", | ||
| "pg_ls_archive_statusdir", | ||
| // Sequence / state mutation | ||
| "nextval", | ||
| "setval", | ||
| // Denial of service | ||
| "pg_sleep", | ||
| "pg_sleep_for", | ||
| "pg_sleep_until", | ||
| // Session / backend control | ||
| "pg_terminate_backend", | ||
| "pg_cancel_backend", | ||
| "pg_reload_conf", | ||
| "pg_rotate_logfile", | ||
| "set_config", | ||
| // Advisory locks | ||
| "pg_advisory_lock", | ||
| "pg_advisory_lock_shared", | ||
| "pg_try_advisory_lock", | ||
| "pg_try_advisory_lock_shared", | ||
| "pg_advisory_xact_lock", | ||
| "pg_advisory_xact_lock_shared", | ||
| "pg_advisory_unlock", | ||
| "pg_advisory_unlock_shared", | ||
| "pg_advisory_unlock_all", | ||
| ]; | ||
| const dangerousFnPattern = new RegExp( | ||
| `(?:^|[^\\w$])"?(?:${dangerousFunctions.join("|")})"?\\s*\\(`, | ||
| "i", | ||
| ); | ||
| const fnMatch = dangerousFnPattern.exec(noLiterals); | ||
| if (fnMatch) { | ||
| // Extract the function name from the match for the error message. | ||
| const fnNameMatch = fnMatch[0].match( | ||
| new RegExp(`(${dangerousFunctions.join("|")})`, "i"), | ||
| ); | ||
| const fnName = fnNameMatch ? fnNameMatch[1].toUpperCase() : "UNKNOWN"; | ||
| return `Query rejected: "${fnName}" is a dangerous function that can modify server state. Set readOnly: false to execute this query.`; | ||
| } | ||
|
|
||
| // Reject multi-statement queries (naive: any semicolon not at the very end) | ||
| const trimmedForSemicolon = stripped.replace(/;\s*$/, ""); | ||
| if (trimmedForSemicolon.includes(";")) { | ||
| return "Query rejected: multi-statement queries are not allowed in read-only mode."; | ||
| } | ||
|
|
||
| return null; | ||
| } |
There was a problem hiding this comment.
Security: Read-Only Query Validation is Not a Full SQL Parser
The validateReadOnlyQuery function (lines 462-626) attempts to block mutation queries and dangerous functions by stripping comments and string literals, then searching for keywords. However, this approach is not as robust as a true SQL parser and can be bypassed by sufficiently obfuscated queries (e.g., using Unicode homoglyphs, unusual whitespace, or creative CTEs). The code comments acknowledge this limitation, but the risk remains significant if the API is exposed to untrusted users.
Recommendation:
- Consider using a proper SQL parser or a third-party library designed for query analysis to enforce read-only constraints more reliably.
- If this is not feasible, ensure that the API is only accessible to trusted users, and document the limitations clearly in the API documentation.
| const sqlText = body.sql.trim(); | ||
|
|
||
| // If readOnly mode, reject mutation statements. | ||
| // Strip SQL comments, then scan for mutation keywords *anywhere* in the | ||
| // query — not just the leading keyword. This prevents bypass via CTEs | ||
| // (WITH ... AS (DELETE ...)) and other SQL constructs that nest mutations. | ||
| if (body.readOnly !== false) { | ||
| // Strip block comments (/* ... */) and line comments (-- ...). | ||
| // Use empty-string replacement (not space) to mirror how PostgreSQL | ||
| // concatenates tokens across comments — e.g. DE/* */LETE → DELETE. | ||
| // A space replacement would turn it into "DE LETE", hiding the keyword. | ||
| const stripped = sqlText | ||
| .replace(/\/\*[\s\S]*?\*\//g, "") | ||
| .replace(/--.*$/gm, "") | ||
| .trim(); | ||
|
|
||
| // Strip string literals so that mutation keywords/functions inside quoted | ||
| // strings are ignored. Handles single-quoted ('...'), dollar-quoted | ||
| // ($$...$$), and tagged dollar-quoted ($tag$...$tag$) strings. | ||
| const noLiterals = stripped | ||
| .replace(/\$([A-Za-z0-9_]*)\$[\s\S]*?\$\1\$/g, " ") | ||
| .replace(/'(?:[^']|'')*'/g, " "); | ||
|
|
||
| // For keyword checks, also strip double-quoted identifiers to avoid | ||
| // matching words inside quoted table/column names. | ||
| const noStrings = noLiterals.replace(/"(?:[^"]|"")*"/g, " "); | ||
|
|
||
| const mutationKeywords = [ | ||
| // ── DML ──────────────────────────────────────────────────────────── | ||
| "INSERT", | ||
| "UPDATE", | ||
| "DELETE", | ||
| "INTO", | ||
| "COPY", | ||
| "MERGE", | ||
| // ── DDL ──────────────────────────────────────────────────────────── | ||
| "DROP", | ||
| "ALTER", | ||
| "TRUNCATE", | ||
| "CREATE", | ||
| "COMMENT", | ||
| // ── Admin / privilege ────────────────────────────────────────────── | ||
| "GRANT", | ||
| "REVOKE", | ||
| "SET", | ||
| "RESET", | ||
| "LOAD", | ||
| // ── Maintenance ──────────────────────────────────────────────────── | ||
| "VACUUM", | ||
| "REINDEX", | ||
| "CLUSTER", | ||
| "REFRESH", | ||
| "DISCARD", | ||
| // ── Procedural ───────────────────────────────────────────────────── | ||
| "CALL", | ||
| "DO", | ||
| // ── Async notifications (side-effects) ───────────────────────────── | ||
| "LISTEN", | ||
| "UNLISTEN", | ||
| "NOTIFY", | ||
| // ── Prepared statements (can wrap mutations) ─────────────────────── | ||
| "PREPARE", | ||
| "EXECUTE", | ||
| "DEALLOCATE", | ||
| // ── Locking ──────────────────────────────────────────────────────── | ||
| "LOCK", | ||
| ]; | ||
| // Match mutation keywords as whole words (word boundary) anywhere in the | ||
| // query, catching them inside CTEs, subqueries, etc. | ||
| const mutationPattern = new RegExp( | ||
| `\\b(${mutationKeywords.join("|")})\\b`, | ||
| "i", | ||
| ); | ||
| const match = mutationPattern.exec(noStrings); | ||
| if (match) { | ||
| sendJsonError( | ||
| res, | ||
| `Query rejected: "${match[1].toUpperCase()}" is a mutation keyword. Set readOnly: false to execute mutations.`, | ||
| ); | ||
| return; | ||
| } | ||
|
|
||
| // PostgreSQL built-in functions that can read/write server files, mutate | ||
| // server state, or cause denial of service. These appear inside otherwise | ||
| // valid SELECT expressions, so keyword checks alone won't catch them. | ||
| // | ||
| // ── File I/O (arbitrary file read/write on the DB server) ───────── | ||
| // lo_import('/etc/passwd') — load file into large object | ||
| // lo_export(oid, '/tmp/evil') — write large object to file | ||
| // lo_unlink(oid) — delete large object | ||
| // pg_read_file('/etc/passwd') — read server file (superuser) | ||
| // pg_read_binary_file(...) — same, binary | ||
| // pg_write_file(...) — write to server files (ext. module) | ||
| // pg_stat_file(...) — stat a server file | ||
| // pg_ls_dir(...) — list server directory | ||
| // | ||
| // ── Sequence / state mutation ──────────────────────────────────── | ||
| // nextval('seq'), setval('seq', n) | ||
| // | ||
| // ── Denial of service ──────────────────────────────────────────── | ||
| // pg_sleep(n) — block connection for n seconds | ||
| // pg_sleep_for(interval) — same, interval version | ||
| // pg_sleep_until(timestamp) — same, deadline version | ||
| // | ||
| // ── Session / backend control ──────────────────────────────────── | ||
| // pg_terminate_backend(pid) — kill another connection | ||
| // pg_cancel_backend(pid) — cancel a running query | ||
| // pg_reload_conf() — reload server configuration | ||
| // pg_rotate_logfile() — rotate the server log | ||
| // set_config(name, value, local) — SET equivalent as function | ||
| // | ||
| // ── Advisory locks (can deadlock other connections) ─────────────── | ||
| // pg_advisory_lock(key) — session-level advisory lock | ||
| // pg_advisory_lock_shared(key) | ||
| // pg_try_advisory_lock(key) | ||
| const dangerousFunctions = [ | ||
| // File I/O | ||
| "lo_import", | ||
| "lo_export", | ||
| "lo_unlink", | ||
| "lo_put", | ||
| "lo_from_bytea", | ||
| "pg_read_file", | ||
| "pg_read_binary_file", | ||
| "pg_write_file", | ||
| "pg_stat_file", | ||
| "pg_ls_dir", | ||
| "pg_ls_logdir", | ||
| "pg_ls_waldir", | ||
| "pg_ls_tmpdir", | ||
| "pg_ls_archive_statusdir", | ||
| // Sequence / state mutation | ||
| "nextval", | ||
| "setval", | ||
| // Denial of service | ||
| "pg_sleep", | ||
| "pg_sleep_for", | ||
| "pg_sleep_until", | ||
| // Session / backend control | ||
| "pg_terminate_backend", | ||
| "pg_cancel_backend", | ||
| "pg_reload_conf", | ||
| "pg_rotate_logfile", | ||
| "set_config", | ||
| // Advisory locks | ||
| "pg_advisory_lock", | ||
| "pg_advisory_lock_shared", | ||
| "pg_try_advisory_lock", | ||
| "pg_try_advisory_lock_shared", | ||
| "pg_advisory_xact_lock", | ||
| "pg_advisory_xact_lock_shared", | ||
| "pg_advisory_unlock", | ||
| "pg_advisory_unlock_shared", | ||
| "pg_advisory_unlock_all", | ||
| ]; | ||
| const dangerousFnPattern = new RegExp( | ||
| `(?:^|[^\\w$])"?(?:${dangerousFunctions.join("|")})"?\\s*\\(`, | ||
| "i", | ||
| ); | ||
| const fnMatch = dangerousFnPattern.exec(noLiterals); | ||
| if (fnMatch) { | ||
| // Extract the function name from the match for the error message. | ||
| const fnNameMatch = fnMatch[0].match( | ||
| new RegExp(`(${dangerousFunctions.join("|")})`, "i"), | ||
| ); | ||
| const fnName = fnNameMatch ? fnNameMatch[1].toUpperCase() : "UNKNOWN"; | ||
| sendJsonError( | ||
| res, | ||
| `Query rejected: "${fnName}" is a dangerous function that can modify server state. Set readOnly: false to execute this query.`, | ||
| ); | ||
| return; | ||
| } | ||
|
|
||
| // Reject multi-statement queries (naive: any semicolon not at the very end) | ||
| const trimmedForSemicolon = stripped.replace(/;\s*$/, ""); | ||
| if (trimmedForSemicolon.includes(";")) { | ||
| sendJsonError( | ||
| res, | ||
| "Query rejected: multi-statement queries are not allowed in read-only mode.", | ||
| ); | ||
| const error = validateReadOnlyQuery(sqlText); | ||
| if (error) { | ||
| sendJsonError(res, error); | ||
| return; | ||
| } | ||
| } |
There was a problem hiding this comment.
Security: Reliance on validateReadOnlyQuery for Mutation Protection
In the query handler (lines 1238-1247), the code relies on validateReadOnlyQuery to prevent mutation queries when readOnly is not explicitly set to false. As noted above, this validator is not a full SQL parser and can be bypassed by sophisticated attackers. If the API is exposed to untrusted clients, this could allow unauthorized data modification.
Recommendation:
- Treat the read-only mode as a best-effort safeguard, not a guarantee. For high-security environments, restrict access to this endpoint or use a more robust query analysis tool.
There was a problem hiding this comment.
Code Review
This pull request effectively addresses a critical SQL injection vulnerability by modifying how comments are stripped from queries. Replacing comments with spaces instead of empty strings correctly prevents keyword concatenation attacks. The extraction of the validation logic into a reusable validateReadOnlyQuery function is an excellent improvement for code clarity and maintainability. The lists of mutation keywords and dangerous functions are comprehensive. Overall, this is a solid security fix and refactoring. I have one minor suggestion to improve efficiency.
| const dangerousFnPattern = new RegExp( | ||
| `(?:^|[^\\w$])"?(?:${dangerousFunctions.join("|")})"?\\s*\\(`, | ||
| "i", | ||
| ); | ||
| const fnMatch = dangerousFnPattern.exec(noLiterals); | ||
| if (fnMatch) { | ||
| // Extract the function name from the match for the error message. | ||
| const fnNameMatch = fnMatch[0].match( | ||
| new RegExp(`(${dangerousFunctions.join("|")})`, "i"), | ||
| ); | ||
| const fnName = fnNameMatch ? fnNameMatch[1].toUpperCase() : "UNKNOWN"; | ||
| return `Query rejected: "${fnName}" is a dangerous function that can modify server state. Set readOnly: false to execute this query.`; | ||
| } |
There was a problem hiding this comment.
You can make this code more efficient and readable by using a capturing group in the dangerousFnPattern regex to extract the function name directly, instead of running a second regex match. This avoids the need for fnMatch[0].match(...).
const dangerousFnPattern = new RegExp(
`(?:^|[^\\w$])"?(${dangerousFunctions.join("|")})"?\\s*\\(`,
"i",
);
const fnMatch = dangerousFnPattern.exec(noLiterals);
if (fnMatch) {
const fnName = fnMatch[1].toUpperCase();
return `Query rejected: "${fnName}" is a dangerous function that can modify server state. Set readOnly: false to execute this query.`;
}
🛡️ Sentinel: Fix SQL injection bypass in read-only query validation
🚨 Severity: HIGH
💡 Vulnerability: SQL injection bypass in read-only query validation. The previous implementation stripped comments by replacing them with empty strings, allowing attackers to split keywords (e.g.,
DELETE/**/FROM->DELETEFROM) which bypassed the keyword blocklist while still being valid SQL.🎯 Impact: An attacker with read-only database access could execute mutation commands (DELETE, DROP, INSERT, etc.) by obfuscating the query.
🔧 Fix: Changed comment stripping logic to replace comments with spaces, ensuring keywords remain separated and detectable. Extracted validation logic into a reusable
validateReadOnlyQueryfunction.✅ Verification: Added a new test suite that verifies the validator blocks obfuscated attacks (block comments, line comments, casing, CTEs) while allowing valid read-only queries.
PR created automatically by Jules for task 17574086292071621154 started by @Dexploarer