TL;DR: We tested 4 AI models on generating a macOS SSD monitoring script. Results might surprise you! ๐ฅ
Task: Create a Python script that reads and displays complete SSD SMART data on macOS, including:
- Temperature
- Total Bytes Written (TBW)
- Power On Hours
- Wear Level
- Media Errors
- Full SMART output
Models Tested:
- ๐ Claude Sonnet 4.5 (Anthropic) - Premium AI
- ๐ฅ Nemotron 3 Nano (NVIDIA) - Open Source
- ๐ฅ Qwen3 Coder 30B (Alibaba) - Open Source
- ๐ด GPT-OSS-20B (Open Source) - Failed completely
- ๐ด Devstral Small 2 (Mistral AI) - Open Source
| Rank | Model | Status | Key Features | Score |
|---|---|---|---|---|
| ๐ฅ 1st | Claude Sonnet 4.5 | โ PERFECT | Auto-detect, No sudo needed, Beautiful UI, Smart error handling | 10/10 |
| ๐ฅ 2nd | Nemotron 3 Nano | โ SUCCESS | Complete data, Works well, Requires sudo, Fast & Efficient | 8.5/10 |
| ๐ฅ 3rd | Qwen3 Coder 30B | Functional but poor UX, Hard-coded sudo | 6/10 | |
| 4th | GPT-OSS-20B | โ FAILED | No SSD detection, Wrong logic, High GPU usage | 1/10 |
| 5th | Devstral Small 2 | โ FAILED | Wrong device paths, Doesn't understand macOS | 2/10 |
Score: 10/10 | View Script
โ
Auto-detects physical SSDs (filters virtual APFS containers)
โ
Intelligent sudo handling (tries without, fallback if needed)
โ
Beautiful formatted output with tables
โ
Extracts all key metrics:
โข Temperature: 28ยฐC
โข TBW: 8,022.20 TB
โข Power On Hours: 251h (10 days)
โข Wear Level: 0%
โข Media Errors: 0
โ
Complete SMART data dump
โ
Proper error handling (handles smartctl exit codes)
โ
Professional code quality
Sample Output:
======================================================================
SSD SMART REPORT - Complete Diagnostics
======================================================================
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ KEY METRICS SUMMARY โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Critical Warning 0x00 (OK) โ
โ Temperature 28 ยฐC โ
โ Wear Level 0% โ
โ TBW (Data Units) 8022.20 TB โ
โ Power On Hours 251 hours (10 days) โ
โ Media Errors 0 โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Why it wins: Perfect execution, no sudo needed for the user, beautiful presentation, and enterprise-grade error handling.
Score: 8.5/10 | View Script
โ
Auto-detects disk (/dev/disk0)
โ
Shows complete SMART output
โ
All data visible:
โข Temperature: 28 Celsius โ
โข Data Units Written: 16,429,485 [8.41 TB] โ
โข Power On Hours: 251 โ
โข All metrics present โ
โ ๏ธ Requires user to run with sudo
โ ๏ธ No metric extraction (raw output only)
โ ๏ธ User must read through full output
What it does well:
- Gets the job done! All data is there and correct
- Very close to Claude's functionality
- Clean, readable output
- Reliable detection
What could be better:
- Needs manual sudo (user must remember to type "sudo python3 script.py")
- Doesn't parse metrics into a summary table
- Less polished UI
Verdict: Impressive performance for an open-source model! Shows that open-source AI is catching up fast. With minor improvements, it could match Claude.
โก Performance Highlights:
- Generation Speed: Fast (generated script quickly)
- Resource Efficiency: Low GPU usage, Mac stayed cool
- Code Quality: Clean, readable, functional
- Best Open-Source Model: Clear winner among free alternatives
Comparison with GPT-OSS-20B:
| Metric | Nemotron 3 Nano | GPT-OSS-20B |
|---|---|---|
| Speed | โก Fast | ๐ Very Slow |
| GPU Usage | โ Low | ๐ฅ High (overheated Mac) |
| Functionality | โ Works | โ Failed |
| Code Quality | โญโญโญโญ | โญ |
Score: 6/10 | View Script
โ
Functional (works with sudo)
โ
Retrieves SMART data
โ Hard-coded sudo requirement
โ No fallback mechanism
โ Poor user experience (blocks without sudo)
โ Less intelligent permission handling
Verdict: Works but requires significant UX improvements. Not production-ready without modifications.
Score: 1/10 | View Script
โ Completely fails to detect SSDs
โ Wrong disk detection logic (looks for 'Whole' instead of 'WholeDisk')
โ No SSDs detected even with sudo
โ High GPU usage during generation (overheated Mac)
โ Slow generation time
โ Resource inefficient
Error Output:
โ ๏ธ No SSDs detected.
Performance Issues:
- Generation time: Very slow compared to Nemotron
- Resource usage: High GPU load, caused Mac to overheat
- Efficiency: Worst resource-to-quality ratio
Verdict: Complete failure. The model consumed significant resources during generation but produced non-functional code. Wrong API assumptions (uses entry.get('Whole') instead of checking individual disk properties with WholeDisk).
Score: 2/10 | View Script
โ Looks for /dev/nvme0 (wrong path for macOS)
โ Should use /dev/disk0
โ No understanding of macOS disk architecture
โ No auto-detection
โ Complete failure on macOS
Error Output:
Error: No se encontrรณ un dispositivo NVMe
Prueba manualmente con: sudo smartctl -a /dev/nvme0
Verdict: Fundamental misunderstanding of macOS storage. Would work better on Linux.
- Intelligence: Understands macOS quirks (APFS virtual containers vs physical disks)
- UX Design: Smart sudo handling, beautiful formatting
- Error Handling: Handles smartctl's non-zero exit codes correctly
- Polish: Production-ready code
- Functionality: Gets all the data correctly
- Reliability: Solid detection and output
- Efficiency: Fast generation, low resource usage
- Performance: Didn't overheat the Mac like GPT-OSS-20B
- Cost: Free vs Claude's premium pricing
- Gap is closing: 85% of Claude's quality at 0% of the cost
Nemotron 3 Nano proved that open-source AI can compete with premium models!
Not all open-source models are created equal:
- โ Nemotron 3 Nano: Fast, efficient, functional (85% of Claude's quality)
- โ GPT-OSS-20B: Slow, resource-hungry, non-functional (worst performer)
The gap between paid and open-source AI is narrowing, but model selection matters. Quality open-source models like Nemotron offer excellent value, while others (GPT-OSS-20B) waste resources with poor results.
| Model | Speed | GPU Usage | Mac Temperature | Result |
|---|---|---|---|---|
| Nemotron 3 Nano | โกโกโก Fast | ๐ข Low | โ๏ธ Cool | โ Functional |
| Claude Sonnet 4.5 | โกโก Normal | ๐ก Medium | ๐ก๏ธ Normal | โ Perfect |
| Qwen3 Coder 30B | โก Slow | ๐ก Medium | ๐ก๏ธ Normal | |
| GPT-OSS-20B | ๐ Very Slow | ๐ด VERY HIGH | ๐ฅ Overheated | โ Failed |
| Devstral Small 2 | โก Normal | ๐ข Low | โ๏ธ Cool | โ Failed |
Why Nemotron is the best open-source model:
- โก Fastest at generating functional code
- ๐ข Lowest resource consumption (GPU, CPU)
- โ๏ธ Doesn't overheat Mac during generation
- โ Code that works (unlike GPT-OSS-20B)
- ๐ฐ Free with near-Claude performance
GPT-OSS-20B Issues:
- ๐ Extremely slow generation
- ๐ฅ High GPU usage โ Mac overheated
- โ Output: non-functional code
- ๐ธ Waste of resources and time
python3 tbw-claude-sonnet-4.5.pysudo python3 tbw-nemotron-3-nano.py# Install smartmontools
brew install smartmontools
# Python 3.7+
python3 --versionClick to see Claude Sonnet 4.5 output
======================================================================
SSD SMART REPORT - Complete Diagnostics
======================================================================
โ ๏ธ Note: This script may require sudo privileges to access SMART data.
Trying without sudo first, then with sudo if needed.
======================================================================
DISK: /dev/disk0
======================================================================
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ KEY METRICS SUMMARY โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Critical Warning 0x00 (OK) โ
โ Temperature 28 ยฐC โ
โ Wear Level 0% โ
โ TBW (Data Units) 8022.20 TB โ
โ Host Writes 379,822,896 โ
โ Power On Hours 251 hours (10 days) โ
โ Media Errors 0 โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Click to see Nemotron 3 Nano output
=== INFORME SMART COMPLETO ===
SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning: 0x00
Temperature: 28 Celsius
Available Spare: 100%
Available Spare Threshold: 99%
Percentage Used: 0%
Data Units Read: 40,640,061 [20.8 TB]
Data Units Written: 16,429,485 [8.41 TB]
Host Read Commands: 363,638,479
Host Write Commands: 379,824,113
Power Cycles: 396
Power On Hours: 251
Media and Data Integrity Errors: 0
Found a bug or want to improve a script? PRs welcome!
MIT License - Feel free to use, modify, and distribute.
If you found this comparison useful, please star the repo! It helps others discover this research.
Made with ๐ค by AI (and a human who tested them all)
Comparison conducted on macOS 14.6 (Sonoma) with Python 3.14 and smartmontools 7.5