A Perl script that processes text files to replace specified patterns.
This script applies word-based pattern replacements defined in a configuration file.
- Perl 5
- Required Perl modules:
  - File::Copy
  - File::Basename
  - Cwd
  - Getopt::Long
./sanitize-writing.pl --template=CONFIG_FILE [options] text_file
--template=CONFIG_FILE - Path to the configuration file containing replacement patterns
-b, --backup - Create a backup of the original file before processing
--backup-file=FILENAME - Specify a custom backup filename (optional, requires -b)
text_file - Path to the text file to be processed
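The options above could be parsed with Getopt::Long roughly as follows. This is a minimal sketch, not the script's actual code: the helper name parse_args, the hash keys, and the error messages are assumptions made for illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse the documented options from an argument list.
sub parse_args {
    my @args = @_;
    my %opt = (backup => 0);

    GetOptionsFromArray(
        \@args,
        'template=s'    => \$opt{template},
        'b|backup'      => \$opt{backup},
        'backup-file=s' => \$opt{backup_file},
    ) or die "Invalid command-line arguments\n";

    # Enforce the constraints described in the option list.
    die "--template=CONFIG_FILE is required\n" unless defined $opt{template};
    die "--backup-file requires -b\n"
        if defined $opt{backup_file} && !$opt{backup};
    die "Expected exactly one text_file argument\n" unless @args == 1;

    $opt{text_file} = $args[0];
    return \%opt;
}

my $opt = parse_args('--template=my_rules.conf', '-b', 'document.txt');
print "template=$opt->{template} backup=$opt->{backup} file=$opt->{text_file}\n";
```

Parsing from an array instead of @ARGV keeps the helper easy to exercise without touching the real command line.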
The configuration file should contain replacement patterns in the following format:
search=replace
Each line defines a single replacement rule:
- Empty lines are ignored
- Lines starting with # are treated as comments
- Patterns are case-insensitive
- Words are matched with word boundaries
Example configuration:
# Replace common abbreviations
govt=government
dept=department
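The rule format and matching behavior described above can be sketched in Perl as follows. The helper names parse_rules and apply_rules are hypothetical, and the sketch inserts the replacement text verbatim (it does not try to preserve the case of the matched word), which may differ from what the script does.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "search=replace" lines into an ordered rule list.
sub parse_rules {
    my ($text) = @_;
    my @rules;
    for my $line (split /\n/, $text) {
        $line =~ s/^\s+|\s+$//g;                 # trim surrounding whitespace
        next if $line eq '' || $line =~ /^#/;    # skip empty lines and comments
        my ($search, $replace) = split /=/, $line, 2;
        next unless defined $replace && length $search;
        push @rules, [$search, $replace];
    }
    return @rules;
}

# Hypothetical helper: apply each rule and return the new text plus per-rule counts.
sub apply_rules {
    my ($text, @rules) = @_;
    my %count;
    for my $rule (@rules) {
        my ($search, $replace) = @$rule;
        # \Q...\E matches the search term literally, \b restricts matches
        # to whole words, and /i makes the match case-insensitive.
        my $n = ($text =~ s/\b\Q$search\E\b/$replace/gi) || 0;
        $count{$search} = $n;
    }
    return ($text, \%count);
}

my $config = "# Replace common abbreviations\ngovt=government\ndept=department\n";
my ($out, $count) = apply_rules("The Govt dept met the other dept.", parse_rules($config));
print "$out\n";
print "$_: $count->{$_}\n" for sort keys %$count;
```

Because of the \b anchors, a rule such as govt=government leaves a longer word like "govts" untouched.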
- Optional backup creation with -b flag
  - Default backup uses .bak extension
  - Custom backup filename supported
- Word boundary matching to prevent partial word replacements
- Assumes UTF-8 encoding
- Preserves original file on error
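The backup behavior listed above could look roughly like this with File::Copy. It is a sketch under assumptions: the helper name make_backup is hypothetical, and the default name is assumed to be the original filename with .bak appended (document.txt.bak), which the list above does not spell out.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Temp qw(tempdir);

# Hypothetical helper: copy $file to a backup path before it is modified.
# Falls back to "$file.bak" when no custom backup name is given.
sub make_backup {
    my ($file, $backup_file) = @_;
    $backup_file = "$file.bak" unless defined $backup_file;
    copy($file, $backup_file)
        or die "Cannot back up '$file' to '$backup_file': $!\n";
    return $backup_file;
}

# Demonstration in a throwaway directory so no real file is touched.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/document.txt";
open my $fh, '>:encoding(UTF-8)', $file or die "Cannot write '$file': $!\n";
print {$fh} "The govt dept.\n";
close $fh or die "Cannot close '$file': $!\n";

my $backup = make_backup($file);
print "Backup written to $backup\n";
```

Dying when the copy fails means processing never starts without a backup in place, which fits the goal of preserving the original file on error.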
The script will exit with an error message if:
- Required files are missing
- Command-line arguments are incorrect
- File operations fail
- Reports each successful pattern replacement with count
- Creates backup files when requested
- Displays completion message
Process a document using a custom template:
./sanitize-writing.pl --template=my_rules.conf document.txt
Create backup with default name (.bak extension):
./sanitize-writing.pl --template=my_rules.conf -b document.txt
Create backup with custom filename:
./sanitize-writing.pl --template=my_rules.conf -b --backup-file=document.backup.txt document.txt

sanitize-writing.pl - Main script
CONFIG_FILE - User-provided replacement patterns
  - See words-uk-to-us.txt and words-us-to-uk.txt as examples (British to American and American to British English word replacements)