How Elvis Works (At a Glance)
Procedure RunElvis()
  Begin
    Read seed URLs from srv/urls.txt
    For each URL:
      Fetch job listings
      Extract company and location using sed/awk
    End For
    Deduplicate and validate results
    Write output to home/calllist.txt
    If --append-history is set:
      Append new companies to history
    End If
  End
End Procedure
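
The deduplication step above can be sketched in POSIX shell and awk. This is an illustration only, not the code Elvis ships: the function name `dedupe_records` and the `Company|Location` record format are assumptions made for the example.

```shell
#!/bin/sh
# ASSUMPTION: records look like "Company|Location", one per line.
# The real Elvis record format may differ.

# dedupe_records (hypothetical helper): read records on stdin, keep the
# first record seen for each company name (compared case-insensitively),
# and drop rows whose company field is empty.
dedupe_records() {
    awk -F'|' '
        {
            key = tolower($1)
            # Trim leading and trailing whitespace from the comparison key.
            gsub(/^[ \t]+|[ \t]+$/, "", key)
        }
        key != "" && !seen[key]++ { print }
    '
}
```

Under these assumptions, `dedupe_records < extracted.txt` would keep one row per company and preserve the order in which companies first appear (`extracted.txt` is a placeholder file name).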

flowchart TD
  A[Start] --> B[Read seed URLs]
  B --> C[Fetch job listings]
  C --> D[Extract company/location]
  D --> E[Deduplicate & validate]
  E --> F[Write calllist.txt]
  F --> G{Append history?}
  G -- Yes --> H[Update company_history.txt]
  G -- No --> I[Done]
  H --> I
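
The optional history update could look roughly like the sketch below. It is hedged in the same way as any illustration: the function name `append_history`, the `Company|Location` record format, and the one-company-per-line history file are assumptions, not details confirmed by the Elvis sources.

```shell
#!/bin/sh
# ASSUMPTIONS: the call list holds "Company|Location" rows and the history
# file holds one company name per line. Both paths are passed as arguments.

# append_history (hypothetical helper): append to the history file every
# company from the call list that is not already recorded there.
append_history() {
    calllist=$1
    history=$2
    # Load known companies from the history file (if it exists), then
    # print each call-list company that has not been seen yet.
    new=$(awk -F'|' -v hist="$history" '
        BEGIN {
            while ((getline line < hist) > 0) known[tolower(line)] = 1
            close(hist)
        }
        { key = tolower($1) }
        key != "" && !(key in known) { known[key] = 1; print $1 }
    ' "$calllist")
    # Only touch the history file when there is something new to add.
    if [ -n "$new" ]; then
        printf '%s\n' "$new" >> "$history"
    fi
}
```

Collecting the new names before appending avoids reading and writing the history file at the same time, and a second run over the same call list adds nothing.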
Pseudocode: Validating Output
Procedure ValidateCallList()
  Begin
    If home/calllist.txt does not exist or is empty then
      Log error and exit
    End If
    For each row in home/calllist.txt:
      Check format and required fields
      If invalid, log error
    End For
    If all rows valid then
      Print "Validation successful"
    Else
      Print "Validation failed"
    End If
  End
End Procedure
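
A minimal shell sketch of this validation procedure follows. It does not reproduce `lib/validate_calllist.sh`; the function name `validate_calllist` and the rule that a valid row has exactly two non-empty, pipe-separated fields are assumptions chosen for the example.

```shell
#!/bin/sh
# ASSUMPTION: a valid row is "Company|Location" with both fields non-empty.
# The actual required fields in Elvis may differ.

# validate_calllist (hypothetical helper): check every row of the given
# file, log invalid rows to stderr, and print a summary line on stdout.
validate_calllist() {
    file=$1
    # -s is true only when the file exists and is not empty.
    if [ ! -s "$file" ]; then
        echo "ERROR: $file is missing or empty" >&2
        return 1
    fi
    if awk -F'|' '
        NF != 2 || $1 ~ /^[ \t]*$/ || $2 ~ /^[ \t]*$/ {
            # Route the message to stderr in a portable way.
            printf "ERROR: line %d is invalid: %s\n", NR, $0 | "cat 1>&2"
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$file"; then
        echo "Validation successful"
    else
        echo "Validation failed"
        return 1
    fi
}
```

Called as `validate_calllist home/calllist.txt`, the function returns a non-zero status on any failure, so a calling script can stop before publishing a bad call list.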
Mermaid: Elvis Main Pipeline
Mermaid: Elvis System Architecture (C4 Container Diagram)
C4Container
  Person(user, "User", "Runs Elvis and reviews call lists")
  System(elvis, "Elvis", "POSIX shell web scraper")
  Container(bin, "bin/elvis.sh", "Shell Script", "Entrypoint orchestrator")
  Container(dataInput, "lib/data_input.sh", "Shell Script", "Fetches and extracts job data")
  Container(processor, "lib/processor.sh", "Shell Script", "Normalizes and deduplicates")
  Container(validator, "lib/validate_calllist.sh", "Shell Script", "Validates output")
  ContainerDb(output, "home/calllist.txt", "Text File", "Final call list output")

  Rel(user, elvis, "Runs")
  Rel(elvis, bin, "Orchestrates")
  Rel(bin, dataInput, "Invokes")
  Rel(dataInput, processor, "Sends extracted data")
  Rel(processor, validator, "Sends processed data")
  Rel(validator, output, "Writes validated call list")
Elvis is a POSIX shell-based web scraper that generates daily call lists of Australian companies from job boards (e.g., Seek). It is built for reliability, transparency, and easy customization using POSIX utilities only.
Onboarding: Choose Your Path
Start here! Use the flowchart below to find the best onboarding for your needs.
flowchart TD
  A[Start Here] --> B{What do you want to do?}
  B --> C[Just use Elvis to get call lists]
  B --> D[Understand how Elvis works]
  B --> E[Contribute code or docs]
  C --> F[Non-Technical Onboarding]
  D --> G[Technical Onboarding]
  E --> H[Contributor Onboarding]

- Non-Technical Onboarding: Quick start for using Elvis.
- Technical Onboarding: Learn the architecture and internals.
- Contributor Onboarding: Start contributing code or docs.

See the Onboarding Guide for step-by-step help.

Glossary (Quick Reference)

Elvis Project Concepts (Mindmap)

mindmap
  root((Elvis))
    Usage
      "Call List"
      "Seed URL"
      "User Agent"
    Architecture
      "POSIX Shell"
      "Modular Scripts"
      "Config in etc/elvisrc"
    Compliance
      "robots.txt"
      "Ethical scraping"
    Processing
      "Deduplication"
      "Validation"
      "Parser"

- Call List: The output file with extracted job leads.
- Seed URL: A starting web address for scraping.
- Parser: A script that extracts information from web pages.
- Deduplication: Removing duplicate entries from results.
- POSIX Shell: A standard command-line environment for Unix systems.
- User Agent: A string that identifies the tool to websites.
- robots.txt: A file that tells scrapers what's allowed.
- Compliance: Following legal and ethical scraping rules.

See the full Glossary in the Wiki.
home/calllist.txt.
Add a screenshot or animated GIF at assets/demo.png showing a typical run or a home/calllist.txt sample. Keep images small for mobile readability.
