Convert spoken content from video files, MP3 audio, or online platforms (YouTube, Vimeo, Dailymotion) to literal Unicode Braille or Braille-optimized text using AWS services.
- Extract audio from video files (MP4, AVI, MOV, etc.)
- Process MP3 audio files directly
- Download and process YouTube, Vimeo, and Dailymotion videos
- Transcribe audio using AWS Transcribe
- Convert to literal Unicode Braille (U+2800–U+28FF) or Braille-optimized text using AWS Bedrock (Claude)
- Automatic cleanup of temporary files
- Comprehensive logging and error handling
- Python: 3.8–3.12 (Python 3.13+ is not supported due to audio library limitations)
- AWS Account with access to:
- Amazon S3
- Amazon Transcribe
- Amazon Bedrock (Claude model)
- FFmpeg (for audio processing)
- Clone the repository:
git clone <repository-url>
cd touch
- Create and activate a Python 3.12 virtual environment:
python3.12 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Install FFmpeg:
- macOS:
brew install ffmpeg
- Ubuntu/Debian:
sudo apt install ffmpeg
- Windows: Download from https://ffmpeg.org/download.html
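To confirm FFmpeg is actually reachable before running the tool, a quick check from Python works too (a convenience sketch, not part of the project; it only looks for the binary on your PATH):

```python
import shutil
import subprocess

# Look for the ffmpeg binary on PATH; the audio libraries rely on it.
ffmpeg = shutil.which("ffmpeg")
if ffmpeg is None:
    raise SystemExit("FFmpeg not found on PATH -- install it before running the tool")

# Print the version string as a final sanity check.
subprocess.run([ffmpeg, "-version"], check=True)
```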
Use the provided script to automatically create an IAM user with the necessary permissions:
- Configure your AWS credentials (if not already done):
aws configure
- Run the automated setup script:
python setup_aws_iam.py --bucket-name my-touch-bucket --output-env
This script will:
  - Create an IAM user (`touch-app-user` by default)
  - Create an S3 bucket for audio files
  - Attach the required AWS managed policies: `AmazonS3FullAccess`, `AmazonTranscribeFullAccess`, and `AmazonBedrockFullAccess`
  - Generate access keys
  - Output the `.env` file content
- Create the `.env` file with the output from the script
- Test the setup:
python cli.py --test-aws

Advanced options:
- Use custom policy (more secure): `--use-custom-policy`
- Specify different username: `--username my-user`
- Specify different region: `--region us-west-2`
If you prefer to set up AWS manually:
- Create an IAM user in the AWS Console
- Attach these managed policies:
`AmazonS3FullAccess`, `AmazonTranscribeFullAccess`, and `AmazonBedrockFullAccess`
- Create access keys for the user
- Create an S3 bucket for audio files
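If you would rather script the manual steps, they map onto a handful of boto3 calls. The sketch below is one way to do it (the user name, bucket name, and region are placeholders; the bundled setup_aws_iam.py script may do more than this):

```python
import boto3

# Placeholder names -- substitute your own values.
USER_NAME = "touch-app-user"
BUCKET_NAME = "my-touch-bucket"
REGION = "us-east-1"

MANAGED_POLICIES = [
    "arn:aws:iam::aws:policy/AmazonS3FullAccess",
    "arn:aws:iam::aws:policy/AmazonTranscribeFullAccess",
    "arn:aws:iam::aws:policy/AmazonBedrockFullAccess",
]

iam = boto3.client("iam")
s3 = boto3.client("s3", region_name=REGION)

# 1. Create the IAM user and attach the managed policies.
iam.create_user(UserName=USER_NAME)
for arn in MANAGED_POLICIES:
    iam.attach_user_policy(UserName=USER_NAME, PolicyArn=arn)

# 2. Create access keys for the user (these go into .env).
keys = iam.create_access_key(UserName=USER_NAME)["AccessKey"]
print("AWS_ACCESS_KEY_ID=" + keys["AccessKeyId"])
print("AWS_SECRET_ACCESS_KEY=" + keys["SecretAccessKey"])

# 3. Create the S3 bucket for audio files.
if REGION == "us-east-1":
    s3.create_bucket(Bucket=BUCKET_NAME)  # us-east-1 takes no LocationConstraint
else:
    s3.create_bucket(
        Bucket=BUCKET_NAME,
        CreateBucketConfiguration={"LocationConstraint": REGION},
    )
print("TOUCH_S3_BUCKET=" + BUCKET_NAME)
```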
Create a .env file in the project root:
# AWS Configuration
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=us-east-1
# S3 Configuration
TOUCH_S3_BUCKET=your-s3-bucket-name

- Convert a local video file to Unicode Braille:
python cli.py video.mp4
- Convert a local MP3 file to Unicode Braille:
python cli.py audio.mp3
- Convert a YouTube video to Unicode Braille:
python cli.py 'https://www.youtube.com/watch?v=example'
Note: Always wrap video URLs in single or double quotes to avoid shell issues. MP3 files are processed directly and do not require extraction from video.
- Test AWS connectivity:
python cli.py --test-aws
- Check environment configuration:
python cli.py --check-env
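As a rough illustration of what the environment check has to find, the snippet below loads the same variables with python-dotenv and builds the boto3 clients. It is only a sketch, not the actual cli.py implementation:

```python
import os
import boto3
from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # reads the .env file from the project root

bucket = os.environ.get("TOUCH_S3_BUCKET")
if not bucket:
    raise SystemExit("TOUCH_S3_BUCKET environment variable is required")

region = os.environ.get("AWS_REGION", "us-east-1")

# boto3 picks up AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the environment.
s3 = boto3.client("s3", region_name=region)
transcribe = boto3.client("transcribe", region_name=region)
bedrock = boto3.client("bedrock-runtime", region_name=region)

# A cheap sanity check: confirm the bucket exists and is reachable.
s3.head_bucket(Bucket=bucket)
print("Environment looks OK:", bucket, region)
```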
- By default, output is literal Unicode Braille (U+2800–U+28FF).
- To get plain Braille-optimized text instead, use:
python cli.py video.mp4 --braille-mode optimized
- `--braille-mode unicode` (default): Output is literal Unicode Braille characters (U+2800–U+28FF)
- `--braille-mode optimized`: Output is plain text, optimized for Braille translation software
- View Unicode Braille in any Unicode-aware text editor (VSCode, Sublime, Notepad++, etc.)
- Copy/paste the output into Braille embosser software, or use it with digital Braille displays that support Unicode Braille.
- For physical Braille, use translation software or embosser tools that accept Unicode Braille input.
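To make "literal Unicode Braille" concrete, here is a tiny uncontracted letter map into the U+2800 block. This is purely illustrative; in this project the conversion is produced by AWS Bedrock (Claude), not by a lookup table:

```python
# Illustrative only: a small uncontracted (Grade 1) letter map into the
# Unicode Braille block (U+2800-U+28FF).
LETTER_TO_BRAILLE = {
    "a": "\u2801", "b": "\u2803", "c": "\u2809", "d": "\u2819", "e": "\u2811",
    "f": "\u280b", "g": "\u281b", "h": "\u2813", "i": "\u280a", "j": "\u281a",
    "k": "\u2805", "l": "\u2807", "m": "\u280d", "n": "\u281d", "o": "\u2815",
    " ": "\u2800",  # blank braille cell for a space
}

def to_braille(text: str) -> str:
    """Map known characters to braille cells; leave everything else as-is."""
    return "".join(LETTER_TO_BRAILLE.get(ch, ch) for ch in text.lower())

print(to_braille("hello"))  # ⠓⠑⠇⠇⠕
```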
- Specify output file:
python cli.py video.mp4 --output my_output.txt
- Enable verbose logging:
python cli.py video.mp4 --verbose
- S3 (Audio Storage):
- Uploaded audio files are stored in your configured S3 bucket.
- View your S3 bucket.
- Transcribe (Speech-to-Text):
- Transcription jobs are visible in the AWS Transcribe Console.
- Look for jobs named `touch-...`.
- Bedrock (AI Model):
- Bedrock model invocations are not directly visible, but you can monitor usage and logs in the Bedrock Console.
- CloudWatch (Logs & Errors):
- For detailed logs and errors, check CloudWatch Logs.
Video/Audio Input → Audio Extraction → S3 Upload → AWS Transcribe → AWS Bedrock → Braille Text
- Audio Extraction: MoviePy for video files, pydub for MP3 files
- S3 Storage: Temporary audio storage for AWS Transcribe
- Transcription: AWS Transcribe converts speech to text
- Braille Conversion: AWS Bedrock (Claude) outputs literal Unicode Braille or Braille-optimized text
- Cleanup: Automatic removal of temporary files and S3 objects
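The boto3 calls behind that flow look roughly like the sketch below. It is a simplified outline under a few assumptions (the audio is already extracted to MP3, the speech is English, and the Claude model ID is just an example), not the project's actual cli.py code:

```python
import json
import time
import urllib.request
import uuid

import boto3

s3 = boto3.client("s3")
transcribe = boto3.client("transcribe")
bedrock = boto3.client("bedrock-runtime")

def audio_to_braille(audio_path: str, bucket: str) -> str:
    # 1. Upload the extracted audio to S3 so Transcribe can read it.
    key = f"audio/{uuid.uuid4()}.mp3"
    s3.upload_file(audio_path, bucket, key)

    # 2. Start a transcription job and poll until it finishes.
    job = f"touch-{uuid.uuid4()}"
    transcribe.start_transcription_job(
        TranscriptionJobName=job,
        Media={"MediaFileUri": f"s3://{bucket}/{key}"},
        MediaFormat="mp3",
        LanguageCode="en-US",  # assumption: English audio
    )
    while True:
        status = transcribe.get_transcription_job(TranscriptionJobName=job)
        state = status["TranscriptionJob"]["TranscriptionJobStatus"]
        if state in ("COMPLETED", "FAILED"):
            break
        time.sleep(10)
    if state == "FAILED":
        raise RuntimeError("Transcription job failed")

    # 3. Fetch the transcript JSON that Transcribe produced.
    uri = status["TranscriptionJob"]["Transcript"]["TranscriptFileUri"]
    with urllib.request.urlopen(uri) as resp:
        transcript = json.load(resp)["results"]["transcripts"][0]["transcript"]

    # 4. Ask a Claude model on Bedrock to render the text as Unicode Braille.
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # assumption: any enabled Claude model
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 4000,
            "messages": [{
                "role": "user",
                "content": f"Convert this text to literal Unicode Braille (U+2800-U+28FF):\n\n{transcript}",
            }],
        }),
    )
    braille = json.loads(response["body"].read())["content"][0]["text"]

    # 5. Clean up the temporary S3 object.
    s3.delete_object(Bucket=bucket, Key=key)
    return braille
```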
The tool includes error handling for common failure cases:
- Invalid input files/URLs
- Network connectivity issues
- AWS service failures
- Audio extraction problems
- Transcription timeouts
- "TOUCH_S3_BUCKET environment variable is required"
- Ensure your `.env` file is properly configured
- Check that the S3 bucket exists and is accessible
- "Video file has no audio track"
- Verify the video file contains audio
- Try a different video file
- "Transcription job failed"
- Check AWS credentials and permissions
- Ensure the audio file is not corrupted
- Verify AWS Transcribe service is available in your region
- "audioop not found" or MP3 extraction errors
- Ensure you are using Python 3.12 or lower (Python 3.13+ is not supported)
- AWS permission errors
- Run `python cli.py --test-aws` to diagnose specific service issues
- Verify your IAM user has the required policies attached
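If `--test-aws` points at a specific service, the underlying checks amount to a few boto3 probes like these (a sketch of one way to diagnose permissions, not a transcript of what cli.py runs):

```python
import os

import boto3
from botocore.exceptions import ClientError

def check(name, fn):
    """Run one probe and report whether the credentials can reach that service."""
    try:
        fn()
        print(f"[ok]   {name}")
    except ClientError as err:
        print(f"[fail] {name}: {err.response['Error']['Code']}")

bucket = os.environ["TOUCH_S3_BUCKET"]

check("STS identity", lambda: boto3.client("sts").get_caller_identity())
check("S3 bucket", lambda: boto3.client("s3").head_bucket(Bucket=bucket))
check("Transcribe", lambda: boto3.client("transcribe").list_transcription_jobs(MaxResults=1))
check("Bedrock", lambda: boto3.client("bedrock").list_foundation_models())
```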
| Service | Estimated Cost |
|---|---|
| S3 | <$0.01 |
| Transcribe | $0.12 |
| Bedrock (Claude) | $0.01β$0.02 |
| Total | $0.13β$0.15 |
- Costs scale linearly with file length.
- Using more advanced Claude models may increase Bedrock costs.
- Local compute and YouTube download are free (except for your own bandwidth/electricity).
- AWS Free Tier may cover some or all costs for new accounts.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
For issues and questions, please open an issue on GitHub.
- All output files are placed in the `output/` directory by default.
- Unicode Braille output (default) will be in `.txt` files, and BRF output (embossable) will be in `.brf` files.
- Example: `output/test_output.brf`
This project was tested using the following YouTube video:
https://www.youtube.com/watch?v=WLQ6HyFbfKU
The pipeline was run as follows (be sure to quote the URL):
python cli.py --input-url "https://www.youtube.com/watch?v=WLQ6HyFbfKU" --braille-mode unicode
The resulting Braille output was saved in the output/ directory as test_output.brf (for BRF) and as a .txt file for Unicode Braille.