What Happened
Hugging Face — the platform that hosts over 1.5 million AI models and powers enterprise inference pipelines worldwide — is facing two simultaneous critical vulnerabilities. The first, CVE-2026-0599, affects Text Generation Inference (TGI) version 3.3.6 and allows unauthenticated attackers to crash hosts via unbounded image fetching during VLM input validation. The second, CVE-2026-25874 (CVSS 9.3), is an unauthenticated remote code execution in LeRobot — Hugging Face's open-source robotics framework with nearly 24,000 GitHub stars — through unsafe pickle deserialization over unauthenticated gRPC channels. Neither has a complete fix available from Hugging Face as of this writing.
Technical Analysis
CVE-2026-0599: TGI Resource Exhaustion
The vulnerability lives in the VLM (vision-language model) input validation logic inside text-generation-inference 3.3.6. When a user submits a request containing an external image URL, TGI fetches the image during validation, before checking whether the request itself exceeds token limits. An attacker can send crafted requests pointing to arbitrarily large external resources, causing the host to consume unbounded network bandwidth, memory, and CPU. The resource exhaustion happens even if the request is later rejected, because the fetch has already occurred. Attackers do not need authentication because TGI's default deployment configuration ships without auth enabled and imposes no memory usage caps. SentinelOne's analysis confirms the DoS is trivially exploitable, and the NVD entry reflects a High severity rating. The GitHub Security Advisory (GHSA-j7x9-7j54-2v3h) recommends upgrading to a patched version and enforcing request size limits at the reverse proxy level.
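The usual way to close a fetch-first, check-later gap like the one described above is to cap the read itself. The following is a minimal Python sketch of that idea and is not TGI's actual code: `read_capped`, `PayloadTooLarge`, and `MAX_IMAGE_BYTES` are hypothetical names, and the 10 MiB limit is an arbitrary assumption.

```python
import io

MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumed cap, tune per deployment


class PayloadTooLarge(Exception):
    """Raised when a remote resource exceeds the configured byte cap."""


def read_capped(stream, max_bytes=MAX_IMAGE_BYTES, chunk_size=64 * 1024):
    """Read a binary stream in chunks and abort once max_bytes is exceeded.

    Buffered data never grows past max_bytes + chunk_size, regardless of how
    much the remote side is willing to send.
    """
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return bytes(buf)
        buf.extend(chunk)
        if len(buf) > max_bytes:
            raise PayloadTooLarge(f"resource exceeds {max_bytes} bytes")


# Local demonstration with an in-memory stream standing in for a response body.
small = read_capped(io.BytesIO(b"x" * 1000), max_bytes=1000)
```

Any file-like object with a `read` method works here, including the response object returned by `urllib.request.urlopen`. A cap like this limits memory per request; it does not replace authentication or proxy-level limits.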
CVE-2026-25874: LeRobot Unauthenticated RCE via Pickle
This is the more dangerous of the two flaws. LeRobot's async inference pipeline uses Python's pickle.loads() to deserialize data received over gRPC channels — specifically in three RPC methods: SendPolicyInstructions, SendObservations, and GetActions. The gRPC PolicyServer in LeRobot versions through 0.5.1 accepts these connections without authentication and without TLS encryption. An attacker who can reach the gRPC port can send a crafted pickle payload that executes arbitrary system commands on the server or robot client. The Hacker News reported a CVSS score of 9.3, while some vendor analyses rate it 9.8. OpenCVE classifies this as CWE-502 (Deserialization of Untrusted Data), and SentinelOne confirms the attack path requires zero authentication — just network access to the exposed gRPC endpoint.
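To see why pickle.loads() on network input amounts to code execution, consider this deliberately harmless Python sketch. It is not LeRobot code: the `Benign` class and the JSON field names are made up for illustration, and the callable is `len` rather than anything destructive.

```python
import json
import pickle


class Benign:
    """Stand-in for an attacker's object; __reduce__ picks the callable."""

    def __reduce__(self):
        # On load, pickle calls len("abc"). A real attacker would name a
        # function such as os.system here instead.
        return (len, ("abc",))


payload = pickle.dumps(Benign())

# The receiver asked for "data" but gets the result of a function call
# chosen entirely by the sender.
result = pickle.loads(payload)

# A data-only format cannot express "call this function": json.loads yields
# only dicts, lists, strings, numbers, booleans, and None.
# The field names below are hypothetical.
observation = json.loads('{"joint_angles": [0.1, 0.2], "step": 7}')
```

Here `result` is the integer 3, not a `Benign` instance: the sender decided what ran on the receiver. Swapping the wire format to a data-only encoding removes that capability, though the decoded values still need schema validation.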
The Pickle Problem, Again
This is not the first time Python's pickle module has been the root cause of RCE in ML tooling. Earlier this year, vulnerabilities in vLLM and Ollama used the same deserialization vector. The pattern is clear: ML frameworks treat model artifacts and inference data as trusted inputs, and they use pickle as if it were a plain serialization format when it is in fact an arbitrary code execution surface. Every ML team running LeRobot, or any framework that deserializes untrusted data with pickle.loads(), needs to audit their pipeline today.
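A first pass at that audit can be automated. The sketch below uses Python's standard `ast` module to flag direct `pickle.load` and `pickle.loads` call sites in a source string; `find_pickle_loads` and `RISKY_CALLS` are hypothetical names, and the check is intentionally narrow.

```python
import ast

# Only direct attribute calls on a name called "pickle" are matched.
RISKY_CALLS = {("pickle", "load"), ("pickle", "loads")}


def find_pickle_loads(source, filename="<string>"):
    """Return sorted (line, call) pairs for pickle.load / pickle.loads calls."""
    hits = []
    # ast.parse only parses the code; nothing in `source` is executed.
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in RISKY_CALLS
        ):
            hits.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(hits)


sample = "import pickle\nobs = pickle.loads(raw)\ncfg = json.loads(text)\n"
findings = find_pickle_loads(sample)
```

This misses aliased imports (`import pickle as p`, `from pickle import loads`) and other loaders that use pickle internally, so treat an empty result as a starting point rather than a clean bill of health.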
Who's Affected
- Anyone running TGI 3.3.6 in default configuration — this includes teams using Hugging Face's own Inference Endpoints on older deployments, and self-hosted instances exposed to the internet without reverse proxy rate limiting.
- Anyone running LeRobot through version 0.5.1 — with ~24,000 GitHub stars and growing adoption in robotics research labs, manufacturing QA, and autonomous systems testing. The gRPC servers run on network-accessible ports by default.
- Downstream users of Hugging Face Hub models — both TGI and LeRobot pull models from the Hub. A compromised inference pipeline means a supply chain entry point for anyone deploying AI workloads off the platform.
How to Protect Yourself
- Pin and upgrade TGI immediately. If you're on text-generation-inference==3.3.6 or earlier, upgrade to the latest patched version. Enforce authentication on all TGI endpoints; the default no-auth config is the attack surface.
- Network-isolate LeRobot gRPC endpoints. If you run LeRobot ≤0.5.1, ensure the gRPC PolicyServer port is not exposed to the public internet. Use network policies, firewalls, or mutual TLS to restrict access.
- Replace pickle.loads() in any custom ML pipeline. Use JSON, MessagePack, or Protocol Buffers for data serialization between untrusted components. If you must use pickle, use it only for data generated entirely within a trust boundary.
- Set memory and request-size limits at your reverse proxy (nginx, Envoy, etc.) for all TGI deployments — not just inside the application. This adds defense-in-depth against CVE-2026-0599 even if the app-level fix lags.
- Audit your AI dependencies. Run pip show text-generation-inference lerobot across your environments. Both packages have open CVEs right now.
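The dependency audit in the last step can also be scripted. This Python sketch reads installed versions with the standard `importlib.metadata` module and compares them against the affected versions named in this article. The package names are taken from the pip command above and are assumed to match how the software is installed in your environment; `parse_version`, `is_affected`, and `audit` are hypothetical helpers, and the version parsing is deliberately naive.

```python
from importlib import metadata

# Last affected versions as stated in this article.
LAST_AFFECTED = {
    "text-generation-inference": "3.3.6",
    "lerobot": "0.5.1",
}


def parse_version(text):
    """Turn '0.5.1' into (0, 5, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_affected(installed, last_affected):
    """True when the installed version is at or below the last affected one."""
    return parse_version(installed) <= parse_version(last_affected)


def audit(packages=LAST_AFFECTED):
    """Map each package to 'not installed' or '<version> (affected|newer)'."""
    report = {}
    for name, last_bad in packages.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        status = "affected" if is_affected(installed, last_bad) else "newer"
        report[name] = f"{installed} ({status})"
    return report
```

Calling `audit()` in each environment gives a quick inventory. A "newer" result only means the version is above the range named here, not that a fix is confirmed, and containerized deployments will not show up in a Python package listing at all.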
The Sable Angle
Hugging Face's double CVE is a pattern we track closely at Sable. AI infrastructure is the new attack surface — and most ML teams don't have a red team. At Sable, we run offensive assessments against ML pipelines exactly like these: scanning for open inference endpoints, testing deserialization surfaces, and mapping the trust boundaries between your training data and production models. Our red team engagements cover AI supply chain risks, from Hub model poisoning to gRPC endpoint exposure — the same flaws that CVE-2026-0599 and CVE-2026-25874 exploit right now. If you're running Hugging Face infrastructure in production, we should talk.