Description of the bug
The tailor stage sets max_tokens=2048 when calling the LLM. For candidates with extensive work history, the prompt alone can exceed 5,000 tokens, and thinking models like gemini-2.5-flash consume additional tokens for reasoning. This leaves insufficient tokens for the full JSON response, causing truncated output and repeated EXHAUSTED_RETRIES failures.
To Reproduce
Set up a profile with 15+ years of work history and run the tailor stage with gemini-2.5-flash. Every job fails with EXHAUSTED_RETRIES, with finishReason: MAX_TOKENS visible in the API logs; the response cuts off mid-JSON after only 82–418 output tokens.
"finishReason": "MAX_TOKENS",
"candidatesTokenCount": 82,
"promptTokenCount": 5380,
"thoughtsTokenCount": 7769
Expected behavior
The max_tokens limit should be high enough to accommodate long resumes, or, better yet, be configurable in profile.json or via a CLI flag so users can tune it for their situation.
Fix
In tailor.py around line 403, change:
raw = client.chat(messages, max_tokens=2048, temperature=0.4)
to:
raw = client.chat(messages, max_tokens=16384, temperature=0.4)
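To make the limit configurable as suggested above, the value could come from a CLI flag, then profile.json, then a default. A sketch under assumed names (the `tailor_max_tokens` key and `--max-tokens` flag are hypothetical, not existing project options):

```python
# Resolve the output-token budget with precedence:
# CLI flag > profile.json key > built-in default.
import argparse
import json
from pathlib import Path

DEFAULT_MAX_TOKENS = 16384  # proposed new default

def resolve_max_tokens(profile_path: str, argv: list[str]) -> int:
    parser = argparse.ArgumentParser()
    parser.add_argument("--max-tokens", type=int, default=None)
    # parse_known_args lets this coexist with the tool's other flags
    args, _ = parser.parse_known_args(argv)
    if args.max_tokens is not None:
        return args.max_tokens
    profile = json.loads(Path(profile_path).read_text())
    return profile.get("tailor_max_tokens", DEFAULT_MAX_TOKENS)
```

The call site would then become `raw = client.chat(messages, max_tokens=resolve_max_tokens(profile_path, sys.argv[1:]), temperature=0.4)`, keeping the hard bump as a fallback default.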
Environment
- Resume length: 30 years of experience
- Model: gemini-2.5-flash
- Observed prompt token count: ~5,700 tokens