This script scrapes medals earned by BYOND users and saves the data in JSON format. It handles different date formats and converts them to ISO 8601 format.
- Scrapes medals for a list of BYOND usernames
- Handles date formats: today, yesterday, on DAY, and specific dates
- Saves data in JSON format
- Supports concurrent scraping for faster execution
- Includes a progress bar to show scraping progress
- Adds a delay between batches to be considerate to the web server
- Can resume scraping from where it left off if interrupted
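The date handling listed above could look roughly like the sketch below. The exact strings shown on BYOND pages are not documented here, so the input shapes ("today, 10:22 AM", "yesterday, 10:22 AM", "on Monday, 10:22 AM", "Nov 29 2023, 10:22 AM") and the function name `normalize_medal_date` are assumptions made for illustration, not the script's actual code.

```python
from datetime import datetime, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def normalize_medal_date(text, now=None):
    """Convert an assumed '<day>, <time>' string to an ISO 8601 timestamp."""
    now = now or datetime.now()
    # Split "today, 10:22 AM" into the day part and the clock part.
    day_part, _, time_part = text.strip().rpartition(", ")
    clock = datetime.strptime(time_part.strip(), "%I:%M %p").time()
    key = day_part.strip().lower()
    if key == "today":
        day = now.date()
    elif key == "yesterday":
        day = now.date() - timedelta(days=1)
    elif key.startswith("on ") and key[3:] in WEEKDAYS:
        # Assumption: "on DAY" means the most recent past occurrence
        # of that weekday, 1 to 7 days back.
        back = (now.weekday() - WEEKDAYS.index(key[3:])) % 7 or 7
        day = now.date() - timedelta(days=back)
    else:
        # Assumed layout for specific dates, e.g. "Nov 29 2023".
        day = datetime.strptime(day_part.strip(), "%b %d %Y").date()
    return datetime.combine(day, clock).isoformat()
```

Passing a fixed `now` makes the relative cases reproducible, which is convenient when testing against saved pages.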
- Python 3.x
- requests library
- beautifulsoup4 library
- tqdm library
- Clone this repository:
    git clone https://github.com/yourusername/byond-medals-scraper.git
    cd byond-medals-scraper
- Create a virtual environment (optional but recommended):
    python -m venv venv
- Activate the virtual environment:
  - On Windows:
      venv\Scripts\activate
  - On macOS/Linux:
      source venv/bin/activate
- Install the required libraries:
    pip install -r requirements.txt
  Ensure your requirements.txt contains the following:
    requests
    beautifulsoup4
    tqdm
- Create a usernames.txt file in the same directory as the script. This file should contain one username per line. Example usernames.txt:
    user1
    user2
    user3
- Set the desired mode and parameters at the top of the script:
  - DELAY: Delay between batches, in seconds. Default is 1.
  - MAX_WORKERS: Maximum number of concurrent workers. Default is 10.
  - ERROR_DELAY: Delay between retries after a network failure. Default is 3.
  - RETRIES: Maximum retries per user. Default is 3.
  - OUTPUT_FILE: Output file name. Default is 'all_users_medals.json'.
  - INPUT_FILE: Input file name. Default is 'usernames.txt'.
  - SECTION_TITLE: Section title to search for. Default is 'Space Station 13 Medals'.
  - APPEND_MODE: Boolean that selects whether to append with checks (True) or start fresh (False). Default is False.
- Run the script:
    python scrape_medals_batch.py
- The script will create an all_users_medals.json file containing the scraped data. Errors will be logged in error_log.txt.
- Deactivate the virtual environment when you are finished working with the script. This restores your shell to the state it was in before activation:
    deactivate
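As a rough illustration of how the parameters from the configuration step might fit together, the sketch below declares them with their documented defaults and uses them in a batched, retrying scrape loop. The helper names `fetch_with_retries` and `scrape_all`, the `fetch` callable, the use of MAX_WORKERS as the batch size, and the reading of RETRIES as a maximum number of attempts are all assumptions; the real script may be organized differently.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Defaults as documented in the configuration step.
DELAY = 1            # seconds to wait between batches
MAX_WORKERS = 10     # concurrent workers (also the batch size here: an assumption)
ERROR_DELAY = 3      # wait before retrying after a network failure
RETRIES = 3          # treated here as maximum attempts per user (an assumption)
OUTPUT_FILE = "all_users_medals.json"
INPUT_FILE = "usernames.txt"
SECTION_TITLE = "Space Station 13 Medals"
APPEND_MODE = False

def fetch_with_retries(fetch, username, retries=RETRIES, error_delay=ERROR_DELAY):
    """Call fetch(username); retry on any exception, return None if all attempts fail."""
    for attempt in range(1, retries + 1):
        try:
            return fetch(username)
        except Exception:
            if attempt < retries:
                time.sleep(error_delay)
    return None

def scrape_all(usernames, fetch, max_workers=MAX_WORKERS, delay=DELAY,
               retries=RETRIES, error_delay=ERROR_DELAY):
    """Scrape usernames in batches, pausing between batches to spare the server."""
    results = {}
    for start in range(0, len(usernames), max_workers):
        batch = usernames[start:start + max_workers]
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            medals = pool.map(
                lambda u: fetch_with_retries(fetch, u, retries, error_delay), batch)
            for user, result in zip(batch, medals):
                if result is not None:  # users that failed every attempt are skipped
                    results[user] = result
        if start + max_workers < len(usernames):
            time.sleep(delay)
    return results
```

Here `fetch` stands in for whatever function downloads and parses one user's page; keeping it as a parameter lets the batching and retry logic be exercised without network access.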
Example JSON structure:
{
    "user1": [
        {
            "Name": "Fish",
            "Date": "2023-11-29T10:22:00"
        },
        {
            "Name": "It'sa me, Mario",
            "Date": "2023-11-29T10:23:00"
        }
    ],
    "user2": [
        {
            "Name": "HIGH VOLTAGE",
            "Date": "2023-11-29T10:34:00"
        }
    ]
}
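Because the dates are stored as ISO 8601 strings, the output can be consumed with the standard library alone. The following is a minimal sketch that parses data shaped like the example above and finds each user's most recent medal; the helper name `latest_medal` is hypothetical, and the inline `sample` string stands in for the real output file.

```python
import json
from datetime import datetime

# Inline copy of the example structure so the sketch runs without a file.
sample = """
{
  "user1": [
    {"Name": "Fish", "Date": "2023-11-29T10:22:00"},
    {"Name": "It'sa me, Mario", "Date": "2023-11-29T10:23:00"}
  ],
  "user2": [
    {"Name": "HIGH VOLTAGE", "Date": "2023-11-29T10:34:00"}
  ]
}
"""

def latest_medal(medals):
    """Return the most recently earned medal in one user's list, or None if empty."""
    return max(medals, key=lambda m: datetime.fromisoformat(m["Date"]), default=None)

# For the real output, replace json.loads(sample) with:
#   with open("all_users_medals.json", encoding="utf-8") as f:
#       data = json.load(f)
data = json.loads(sample)

for user, medals in data.items():
    newest = latest_medal(medals)
    print(user, len(medals), newest["Name"] if newest else "-")
```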