Duplicate job titles cause tailored resume and cover letter PDFs to overwrite each other; batch size limit compounds the problem #17

@tearl42

Description

When multiple job postings share the same title and source site, the generated PDF filenames are identical, so files are silently overwritten. The final "ready to apply" count is lower than the number of approved tailored resumes because matched resume/cover letter pairs are lost. The problem is compounded by a default batch size of 20 jobs per run: clearing the full queue takes several manual runs, and the limit cannot be overridden from the CLI.
To Reproduce

Run the tailor and cover stages against a job database containing multiple postings with the same title from the same site (e.g. five "Network Engineer V" listings from LinkedIn, or four "Sr. Network Operations Engineer $135/hr" listings from LinkedIn).
Only one tailored resume PDF and one cover letter PDF exist after the run, even though all jobs are marked [APPROVED].
With 67 pending jobs, the pipeline processes only 20 per run, so clearing the queue takes four manual runs in total.
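
The collision described in the steps above can be checked without running the pipeline. This is a minimal sketch, assuming the prefix is built as `{safe_site}_{safe_title}` as in the "Before" snippet under Fix. The `sanitize` helper, the job dictionaries, and the URLs are hypothetical stand-ins, not ApplyPilot's actual code:

```python
import re
from collections import Counter


def sanitize(text: str) -> str:
    """Hypothetical stand-in for however the pipeline builds safe_site / safe_title."""
    return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")


def colliding_prefixes(jobs: list[dict]) -> dict[str, int]:
    """Return every filename prefix shared by more than one job, with its count."""
    counts = Counter(
        f"{sanitize(job['site'])}_{sanitize(job['title'])}" for job in jobs
    )
    return {prefix: n for prefix, n in counts.items() if n > 1}


# Made-up sample data: five postings with the same title and site, plus one distinct.
jobs = [
    {"site": "LinkedIn", "title": "Network Engineer V",
     "url": f"https://example.com/jobs/view/{1000 + i}"}
    for i in range(5)
]
jobs.append({"site": "LinkedIn", "title": "Systems Engineer",
             "url": "https://example.com/jobs/view/2000"})

print(colliding_prefixes(jobs))
```

Any prefix reported here maps several approved jobs onto a single PDF name, which is why only the last file written survives.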

Expected behavior

Each job posting should produce a uniquely named file regardless of title. The job's unique ID (available from the URL) should be appended to the filename.
The batch size should either default to processing all pending jobs, or be exposed as a CLI flag such as --limit so users can control it without editing source code.

Fix
Filename fix — tailor.py around line 496:

# Before
prefix = f"{safe_site}_{safe_title}"

# After
job_id = job["url"].rstrip("/").split("/")[-1]
prefix = f"{safe_site}_{safe_title}_{job_id}"

Filename fix — cover_letter.py around line 243:
# Before
prefix = f"{safe_site}_{safe_title}"

# After
job_id = job["url"].rstrip("/").split("/")[-1]
prefix = f"{safe_site}_{safe_title}_{job_id}"
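
One caveat with taking the last path segment: if a stored URL carries a query string or fragment, `split("/")[-1]` keeps it, and characters such as `?` or `=` may not be filename-safe. Below is a hedged sketch of a slightly more defensive extraction using only the standard library. The helper name `job_id_from_url`, the hash fallback, and the example URL are assumptions for illustration, not part of ApplyPilot:

```python
import hashlib
import re
from urllib.parse import urlparse


def job_id_from_url(url: str) -> str:
    """Derive a filename-safe identifier from a job URL (hypothetical helper)."""
    # urlparse separates the path from any ?query or #fragment.
    path = urlparse(url).path.rstrip("/")
    last_segment = path.split("/")[-1] if path else ""
    # Keep only characters that are safe in filenames on common filesystems.
    safe = re.sub(r"[^A-Za-z0-9_-]+", "", last_segment)
    if safe:
        return safe
    # Fallback: a short, stable hash of the full URL when no usable segment exists.
    return hashlib.sha1(url.encode("utf-8")).hexdigest()[:10]


# Placeholder values standing in for the pipeline's sanitized site and title.
safe_site, safe_title = "LinkedIn", "Network_Engineer_V"
url = "https://example.com/jobs/view/4012345678/?refId=abc"
prefix = f"{safe_site}_{safe_title}_{job_id_from_url(url)}"
print(prefix)
```

If some site's last path segment turns out not to be unique per posting, hashing the whole URL for every job would be the safer default.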

Batch size fix — tailor.py line 458, cover_letter.py line 188, pdf.py line 393:
# Before
limit: int = 20  # (or 50 in pdf.py)

# After
limit: int = 100

Additional context
A --limit CLI flag would be the ideal long-term solution so users can tune batch size at runtime without modifying source code.
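
A rough sketch of what such a flag could look like, written with the standard-library `argparse` because this issue does not say which CLI framework ApplyPilot uses. The parser, `select_batch`, and the convention that `limit=None` means "process everything" are all assumptions for illustration:

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for a pipeline stage that accepts --limit."""
    parser = argparse.ArgumentParser(description="Hypothetical pipeline stage CLI")
    parser.add_argument(
        "--limit",
        type=int,
        default=None,
        help="Maximum number of pending jobs to process (default: all).",
    )
    return parser


def select_batch(pending: Sequence[str], limit: Optional[int]) -> list[str]:
    """Return the jobs to process this run; None means no cap."""
    if limit is None:
        return list(pending)
    if limit < 0:
        raise ValueError("--limit must be non-negative")
    return list(pending[:limit])


# Stand-in for the real queue of 67 pending jobs.
pending_jobs = [f"job-{i}" for i in range(67)]

# Simulates passing "--limit 20" on the command line.
args = build_parser().parse_args(["--limit", "20"])
batch = select_batch(pending_jobs, args.limit)
print(f"Processing {len(batch)} of {len(pending_jobs)} pending jobs")
```

Defaulting to `None` rather than a fixed number would cover both expectations in this report: with no flag the whole queue is processed, and users who want smaller batches can still ask for them.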
Environment

ApplyPilot version: (your version)
OS: Ubuntu 24.04 (WSL)
