Hugging Face, the world's largest open-source model repository (over 4 million developers), is becoming a frontier for AI supply chain attacks. A recent incident involving malicious models hosted on the platform exposes the inherent tension between open collaboration and security.
Technical details of the attack:
Pickle file abuse: the Python standard library pickle module is widely used for ML model serialization, but deserializing a Pickle stream can execute arbitrary Python code. PyTorch relies on Pickle by default, making it a convenient vehicle for hiding malicious code.
Picklescan bypass: Hugging Face uses the Picklescan tool to detect dangerous functions (such as os.system and subprocess calls) based on a blacklist. The attackers bypassed it in two ways:
Changed compression format: the model archive was compressed with 7z instead of PyTorch's standard ZIP format, which Picklescan could not decompress.
Broken-file exploit: corrupted bytes are inserted at the end of the Pickle stream. Picklescan fails to validate the file and lets it through unflagged, but the Python unpickler executes the stream sequentially, so the malicious code has already run by the time the stream breaks.
Hidden payload: the malicious code sits at the beginning of the serialized stream, and the file breaks immediately after it executes, which confuses detection tools.
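The ordering described above can be reproduced with the standard library alone. The following is a minimal sketch, not the attackers' actual payload: the class HarmlessPayload and the helper names are hypothetical, and a call to print() stands in for the malicious function. It builds a Pickle stream, replaces its final STOP opcode with an invalid byte, and shows that the callable still runs before the loader reports the corruption.

```python
import contextlib
import io
import pickle
import pickletools


class HarmlessPayload:
    """Hypothetical stand-in for a malicious object: unpickling it calls print()."""

    def __reduce__(self):
        # pickle records "call print('payload ran')" and replays it on load.
        return (print, ("payload ran",))


def make_broken_stream() -> bytes:
    """Serialize the payload, then replace the final STOP opcode with junk."""
    intact = pickle.dumps(HarmlessPayload())
    return intact[:-1] + b"\xff"


def load_and_capture(data: bytes) -> tuple[str, bool]:
    """Unpickle data; return (text printed during load, whether load failed)."""
    printed = io.StringIO()
    failed = False
    with contextlib.redirect_stdout(printed):
        try:
            pickle.loads(data)
        except Exception:
            failed = True
    return printed.getvalue().strip(), failed


def opcodes_before_break(data: bytes) -> list[str]:
    """List opcode names up to the point where parsing fails."""
    names = []
    try:
        for opcode, _arg, _pos in pickletools.genops(data):
            names.append(opcode.name)
    except ValueError:
        pass  # reached the corrupted tail
    return names


if __name__ == "__main__":
    broken = make_broken_stream()
    print(load_and_capture(broken))
    print(opcodes_before_break(broken))
```

The load fails, yet the text was printed first: the REDUCE opcode that invokes the callable precedes the corrupted byte. A scanner that gives up when parsing fails therefore misses a payload that an opcode-by-opcode walk would still see.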
Real-world samples:
Model repositories "glockr1/ballr7" and "who-r-u0000/0000000000000000000000000000000000000" contain a reverse shell
Hard-coded IP address: 107.173.7.141 (C2 server)
Scope of impact: it is unknown how many developers downloaded these models or integrated them into production code
Defensive layers:
Platform level: Hugging Face upgraded Picklescan to support 7z decompression and detection of broken files; SafeTensors was promoted as an alternative to Pickle.
Developer level: avoid loading models from unknown sources, enable model signature verification, isolate the model-loading process, and scan for Pickle files in the dependency chain
Detection level: load models inside a runtime sandbox that monitors network connections and process creation
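One way to harden the model-loading step on the developer side is to invert the blacklist into an allowlist. The sketch below follows the "restricting globals" pattern from the Python pickle documentation; the names ALLOWED_GLOBALS, AllowlistUnpickler, and safe_loads, as well as the single allowlist entry, are illustrative assumptions and not part of any tool named above. It narrows what a stream may import but is not a substitute for a sandbox or for a format such as SafeTensors.

```python
import io
import pickle

# Hypothetical allowlist: only globals listed here may be resolved on load.
# A real list would have to be tailored to the model format in use.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global not explicitly allowed."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        # Raised before the callable is ever looked up or invoked.
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Deserialize data, permitting only allowlisted globals."""
    return AllowlistUnpickler(io.BytesIO(data)).load()
```

With this loader, a stream that references os.system, subprocess, or any other unlisted callable is rejected at the moment the reference is resolved, so nothing has to be known in advance about which functions are dangerous.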
The takeaway: the price of democratizing AI is the accumulation of security debt. Every Pickle file is a potential Trojan horse, and blacklisting defenses are breaking down.