feat: save AI diagnostics locally on each successful request (#21)
Conversation
Co-authored-by: tassan <7563760+tassan@users.noreply.github.com>
Pull request overview
This PR implements best-effort local persistence of every successful AI diagnostic result, so data can later be migrated to Google Drive or a similar service. It adds a storage/diagnostics/ directory outside the web root, writes timestamped JSON files there after each successful AI response, and excludes those files from version control.
Changes:
- Each validated AI diagnostic is serialized to a uniquely-named JSON file under `storage/diagnostics/`.
- The `storage/diagnostics/` directory is tracked via `.gitkeep`, with actual data files excluded by `.gitignore`.
- Write failures are logged via `error_log` and never surfaced to the caller.
Reviewed changes
Copilot reviewed 1 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `public/api/diagnostico-ia.php` | After building `$parecer`, writes a JSON record containing `created_at`, sanitized `$input`, and `$parecer` to `storage/diagnostics/`. |
| `.gitignore` | Excludes `*.json` files inside `/storage/diagnostics/` from version control. |
| `storage/diagnostics/.gitkeep` | Ensures the storage directory is tracked in git while keeping it empty. |
```php
$filename = date('Y-m-d_His') . '_' . bin2hex(random_bytes(4)) . '.json';
$record = json_encode(
    ['created_at' => date('c'), 'input' => $input, 'parecer' => $parecer],
```
Both date('Y-m-d_His') (used in the filename) and date('c') (used for created_at) depend on the PHP runtime's configured date.timezone. Since no timezone is set explicitly in the code or php.ini within the project, the output will reflect the server's system timezone rather than guaranteed UTC. The PR description example shows +00:00, implying UTC is assumed. Using date_create('now', timezone_open('UTC')) with date_format(), or simply calling date_default_timezone_set('UTC') before these calls, would make the timestamps deterministic across different hosting environments.
Suggested change:
```diff
- $filename = date('Y-m-d_His') . '_' . bin2hex(random_bytes(4)) . '.json';
+ $nowUtc = new DateTimeImmutable('now', new DateTimeZone('UTC'));
+ $filename = $nowUtc->format('Y-m-d_His') . '_' . bin2hex(random_bytes(4)) . '.json';
  $record = json_encode(
-     ['created_at' => date('c'), 'input' => $input, 'parecer' => $parecer],
+     ['created_at' => $nowUtc->format(DATE_ATOM), 'input' => $input, 'parecer' => $parecer],
```
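As a quick check of the point above: pinning the timezone on the `DateTimeImmutable` instance itself makes the output stable no matter what the host's `date.timezone` is set to. A minimal standalone sketch (the `America/Sao_Paulo` default is just an illustrative stand-in for an arbitrary host setting):

```php
<?php
// Simulate a host whose default timezone is not UTC (illustrative choice).
date_default_timezone_set('America/Sao_Paulo');

// The timezone is fixed per-instance, so formatting is deterministic
// regardless of the runtime default.
$nowUtc = new DateTimeImmutable('now', new DateTimeZone('UTC'));

echo $nowUtc->format(DATE_ATOM), PHP_EOL;   // always ends in +00:00
echo $nowUtc->format('Y-m-d_His'), PHP_EOL; // filename-safe UTC stamp
```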
```php
$written = file_put_contents($storageDir . '/' . $filename, $record . "\n");
if ($written === false) {
    error_log('Failed to write diagnostic file: ' . $storageDir . '/' . $filename);
}
```
When json_encode() returns false (e.g., due to malformed UTF-8 sequences from AI response fields being embedded in $parecer), the failure is silently ignored with no error_log() call. This makes it hard to diagnose encoding issues in production. An error_log should be added for this failure path, similar to how the file-write failure is handled at line 246.
Suggested change:
```diff
      }
- }
+ } else {
+     error_log('Failed to json_encode diagnostic record: ' . json_last_error_msg());
+ }
```
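To make the failure mode concrete: `json_encode()` returns `false` (with `json_last_error()` set to `JSON_ERROR_UTF8`) when any string in the payload contains an invalid UTF-8 byte sequence. A standalone sketch with a hypothetical payload, including `JSON_INVALID_UTF8_SUBSTITUTE` as an optional mitigation that degrades the string rather than dropping the whole record:

```php
<?php
// A truncated multi-byte sequence ("\xC3" with no continuation byte)
// makes json_encode() fail by default.
$record = ['parecer' => "resumo truncado: \xC3"];

$json = json_encode($record);
if ($json === false) {
    // Same logging style as the existing file-write failure path.
    error_log('Failed to json_encode diagnostic record: ' . json_last_error_msg());
}

// Optional mitigation (PHP 7.2+): replace invalid sequences with U+FFFD
// instead of failing, so the record can still be persisted.
$jsonSafe = json_encode($record, JSON_INVALID_UTF8_SUBSTITUTE);
```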
```php
@mkdir($storageDir, 0755, true);
}

if (is_dir($storageDir) && is_writable($storageDir)) {
    $filename = date('Y-m-d_His') . '_' . bin2hex(random_bytes(4)) . '.json';
    $record = json_encode(
        ['created_at' => date('c'), 'input' => $input, 'parecer' => $parecer],
```
The $input array persisted in the JSON file includes email and telefone — personally identifiable information (PII). These files are written to disk at storage/diagnostics/ in plain text with no access controls beyond filesystem permissions. In a Docker setup (as per docker-compose.yml) the entire project root is bind-mounted and the process runs as www-data. Consider whether saving PII locally aligns with applicable data-protection obligations (e.g. LGPD in Brazil), and whether the storage directory should have tighter permissions (e.g., 0700) or PII fields should be excluded from the stored record.
Suggested change:
```diff
- @mkdir($storageDir, 0755, true);
+ // Restrictive permissions: only the web server user can read/write
+ @mkdir($storageDir, 0700, true);
  }
  if (is_dir($storageDir) && is_writable($storageDir)) {
      $filename = date('Y-m-d_His') . '_' . bin2hex(random_bytes(4)) . '.json';
+     // Avoid persisting PII (e.g. email, telefone) in diagnostics storage
+     $inputForStorage = $input;
+     unset($inputForStorage['email'], $inputForStorage['telefone']);
      $record = json_encode(
-         ['created_at' => date('c'), 'input' => $input, 'parecer' => $parecer],
+         ['created_at' => date('c'), 'input' => $inputForStorage, 'parecer' => $parecer],
```
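An alternative to unsetting known PII keys is an allowlist: keep only fields known to be safe, so any PII field added to the form later is excluded by default rather than leaking until someone remembers to extend the `unset()`. A sketch under the assumption that `negocio` and `segmento` are the fields needed for analysis (field names taken from the example payload in the PR description):

```php
<?php
// Hypothetical input shaped like the example in the PR description.
$input = [
    'nome'     => 'João Silva',
    'email'    => 'joao@example.com',
    'telefone' => '+55 11 99999-0000',
    'negocio'  => 'Clínica Odontológica ABC',
    'segmento' => 'saude',
];

// Allowlist: anything not named here is dropped, including future PII fields.
$safeFields = ['negocio', 'segmento'];
$inputForStorage = array_intersect_key($input, array_flip($safeFields));
```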
Every AI diagnostic generated by `/api/diagnostico-ia` needs to be persisted for later analysis and eventual migration to Google Drive or similar.

Changes
- `storage/diagnostics/` — new directory at project root (outside `public/`, never web-accessible). Tracked via `.gitkeep`; actual JSON files excluded from git.
- `.gitignore` — adds `/storage/diagnostics/*.json` to keep saved diagnostics out of version control.
- `public/api/diagnostico-ia.php` — after `$parecer` is validated, writes a timestamped JSON file containing `created_at`, sanitized `input`, and `parecer`. Best-effort: write failures are logged but never surface to the caller.

Saved file format

Filename: `YYYY-MM-DD_HHmmss_<8-hex-random>.json`

```json
{
  "created_at": "2026-03-05T03:56:24+00:00",
  "input": {
    "nome": "João Silva",
    "negocio": "Clínica Odontológica ABC",
    "segmento": "saude",
    ...
  },
  "parecer": {
    "titulo": "...",
    "situacao_atual": "...",
    "gaps": [...],
    "urgencia": "alta"
  }
}
```

File uniqueness uses `bin2hex(random_bytes(4))` rather than `uniqid()` for proper entropy.

Original prompt