What is Agent-as-a-Service pentesting?

Agent-as-a-Service (AaaS) pentesting lets you chat with autonomous pentesting agents that scan your application on demand. Instead of waiting weeks for a manual engagement, you talk to specialized agents: pen-scout (recon and surface mapping), pen-recon (deeper enumeration), pen-triage (validates and prioritizes findings), pen-fixer (remediation guidance), and pen-compliance (OWASP/standards mapping). Findings are validated with a proof of concept and a re-test, not just flagged. You start with 150 free credits, no credit card required.

What are credits and how do they work?

Credits are how you run agent scans. Each scan or agent action consumes credits based on its depth. New accounts get 150 free credits with no card required. After that you can buy one-time credit packs ($29, $79, or $199) for pay-as-you-go use, or subscribe to a monthly tier ($49, $149, or $399/mo) for continuous, on-demand pentesting. The agents pen-scout, pen-recon, pen-triage, pen-fixer and pen-compliance all draw from the same credit balance.

Every new account gets 150 free credits with no credit card required, which is enough to chat with the pentesting agents and run real scans against your app. There is also a free security headers scan at sable.somoswilab.com/free-scan and a sample report at sable.somoswilab.com/sample-report. The free tier runs on OpenRouter models so you can evaluate the autonomous agents before paying.

What is penetration testing for startups?

Penetration testing (pentesting) is a simulated cyberattack against your application to find security vulnerabilities before real attackers do. For startups, we focus on the issues that matter most at your stage: authentication flaws, data exposure, API security, and common mistakes in modern stacks like Next.js, Supabase, and Firebase.

How much does a pentest cost?

Traditional pentests cost $10,000-$50,000+. SableOffensive starts at $29 for a Pre-Launch Check covering OWASP Top 10 and secrets detection. Founder Shield ($79) adds IDOR testing, auth bypass, and a debrief call. Scale Secure ($199) is a full-scope assessment. Every plan includes a professional report with remediation steps.

How long does a security scan take?

Pre-Launch Check reports are delivered within 24-48 hours. Founder Shield and Scale Secure may take 2-3 business days depending on the complexity of your application.

What do I need to provide?

At minimum, just your application URL. For more comprehensive testing, we may ask for staging credentials, API documentation, or GitHub repository access. We sign NDAs for all engagements.

What is OWASP Top 10?

The OWASP Top 10 is the industry-standard list of the most critical web application security risks. It includes injection attacks, broken authentication, cross-site scripting (XSS), server-side request forgery (SSRF), and security misconfigurations. Every SableOffensive assessment tests against the full OWASP Top 10.

Do you test AI-generated code?

Yes. Code generated by AI tools like Cursor, GitHub Copilot, and v0 often contains subtle security issues: hardcoded secrets, missing input validation, insecure API patterns, and overly permissive access controls. We have specific testing procedures for AI-generated codebases.

How do you secure Supabase and Firebase apps?

For Supabase, we audit Row Level Security (RLS) policies, test for direct table access, and check for exposed service keys. For Firebase, we review security rules, test Firestore/RTDB access patterns, and check Cloud Functions for vulnerabilities.

What if you find zero vulnerabilities?

We offer a 50% money-back guarantee: if our scan finds zero security issues, you get half your money back.

Is there a free pentesting option?

Yes. SableOffensive offers a free security headers scan at sable.somoswilab.com/free-scan. It instantly checks your website for 8 critical security headers (HSTS, CSP, X-Frame-Options, and more) and gives you an A-F grade with copy-paste fixes. No signup or payment required.
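
As a rough illustration of what a headers check involves, here is a minimal Python sketch. It is not the scanner's actual implementation. It covers only the three headers named above (the full set of eight is not listed here), and the function name and sample response are hypothetical.

```python
# Only the three headers the text names; the remaining five are not listed there.
NAMED_HEADERS = {
    "HSTS": "strict-transport-security",
    "CSP": "content-security-policy",
    "X-Frame-Options": "x-frame-options",
}


def missing_security_headers(response_headers):
    """Return the labels of the named headers that are absent from a response."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in response_headers}
    return [label for label, header in NAMED_HEADERS.items() if header not in present]


# Hypothetical response headers, for illustration only.
sample = {"Content-Type": "text/html", "Strict-Transport-Security": "max-age=63072000"}
print(missing_security_headers(sample))
```

A real check would also inspect header values (for example, a weak CSP), which this sketch does not attempt.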

Can I get a free vulnerability scan?

Our free security headers check scans your website instantly and grades your security posture. For a deeper free assessment, contact us — we occasionally offer complimentary scans for early-stage startups and open source projects.

Hugging Face Double CVE: TGI DoS and LeRobot RCE Expose AI Infrastructure

What Happened

Hugging Face, the platform that hosts over 1.5 million AI models and powers enterprise inference pipelines worldwide, is facing two serious vulnerabilities at once. The first, CVE-2026-0599, affects Text Generation Inference (TGI) version 3.3.6 and lets unauthenticated attackers crash hosts through unbounded image fetching during VLM input validation. The second, CVE-2026-25874 (CVSS 9.3), is an unauthenticated remote code execution flaw in LeRobot, Hugging Face's open-source robotics framework with nearly 24,000 GitHub stars, caused by unsafe pickle deserialization over unauthenticated gRPC channels. Neither has a complete fix available from Hugging Face as of this writing.

Technical Analysis

CVE-2026-0599: TGI Resource Exhaustion

The vulnerability lives in the VLM (vision-language model) input validation logic inside text-generation-inference 3.3.6. When a user submits a request containing an external image URL, TGI fetches the image during validation, before checking whether the request itself exceeds token limits. An attacker can send crafted requests pointing to arbitrarily large external resources, forcing the host to consume unbounded network bandwidth, memory, and CPU. The exhaustion occurs even if the request is later rejected, because the fetch has already happened. Attackers do not need authentication: TGI's default deployment configuration ships without auth enabled and imposes no memory usage caps. SentinelOne's analysis confirms the DoS is trivially exploitable, and the NVD entry reflects a High severity rating. The GitHub Security Advisory (GHSA-j7x9-7j54-2v3h) recommends upgrading to a patched version and enforcing request size limits at the reverse proxy level.
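
The ordering problem described above (fetch first, validate later) can be sketched in Python. This is not TGI's code. It is a minimal illustration of the two mitigations the description implies: check the cheap limits before fetching, and cap the number of bytes read. All names here (read_capped, validate_then_fetch, open_image) are hypothetical, and open_image stands in for whatever opens the remote image as a binary stream.

```python
import io


class PayloadTooLarge(Exception):
    """Raised when a stream yields more bytes than the configured cap."""


def read_capped(stream, max_bytes, chunk_size=64 * 1024):
    """Read a binary stream in chunks, aborting once max_bytes is exceeded."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Stop early instead of buffering an arbitrarily large body.
            raise PayloadTooLarge(f"body exceeded {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


def validate_then_fetch(token_count, max_tokens, open_image, max_image_bytes):
    """Reject over-limit requests first, then fetch the image under a byte cap."""
    if token_count > max_tokens:
        # The image is never opened for a request that would be rejected anyway.
        raise ValueError("request exceeds token limit; image not fetched")
    return read_capped(open_image(), max_image_bytes)


# Usage with an in-memory stream standing in for a remote image.
image = validate_then_fetch(128, 4096, lambda: io.BytesIO(b"\x89PNG..."), 1024)
print(len(image))
```

The same cap belongs at the reverse proxy as well, since an application-level limit only helps once the application is patched.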

CVE-2026-25874: LeRobot Unauthenticated RCE via Pickle

This is the more dangerous of the two flaws. LeRobot's async inference pipeline uses Python's pickle.loads() to deserialize data received over gRPC channels — specifically in three RPC methods: SendPolicyInstructions, SendObservations, and GetActions. The gRPC PolicyServer in LeRobot versions through 0.5.1 accepts these connections without authentication and without TLS encryption. An attacker who can reach the gRPC port can send a crafted pickle payload that executes arbitrary system commands on the server or robot client. The Hacker News reported a CVSS score of 9.3, while some vendor analyses rate it 9.8. OpenCVE classifies this as CWE-502 (Deserialization of Untrusted Data), and SentinelOne confirms the attack path requires zero authentication — just network access to the exposed gRPC endpoint.
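
To see why pickle.loads() on network input amounts to code execution, consider this deliberately harmless Python sketch. It does not reproduce LeRobot's code or the actual exploit. The Payload class is hypothetical, and os.getpid stands in for whatever callable an attacker would choose.

```python
import os
import pickle


class Payload:
    """Stand-in for attacker-controlled data; the callable used here is harmless."""

    def __reduce__(self):
        # pickle stores "call os.getpid()" in the byte stream. An attacker
        # would name a different callable and arguments here.
        return (os.getpid, ())


wire_bytes = pickle.dumps(Payload())

# The receiver only calls loads(), yet the sender's chosen function runs
# inside the receiving process and its return value replaces the object.
result = pickle.loads(wire_bytes)
print(type(result).__name__, result == os.getpid())
```

The receiver never gets a Payload object back. It gets the return value of a function the sender picked, which is why authentication and transport security on the channel matter so much when pickle is involved.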

The Pickle Problem, Again

This is not the first time Python's pickle module has been the root cause of RCE in ML tooling. Earlier this year, vulnerabilities in vLLM and Ollama used the same deserialization vector. The pattern is clear. ML frameworks treat model artifacts and inference data as trusted inputs and use pickle as if it were a plain serialization format, when it is in fact an arbitrary code execution surface. Every ML team running LeRobot, or any framework that deserializes untrusted data with pickle.loads(), needs to audit their pipeline today.
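
One way to start such an audit is a static scan for direct pickle.load and pickle.loads calls. The Python sketch below uses the standard ast module. The function name find_pickle_calls is hypothetical, and the scan is intentionally simple: it misses aliased imports such as "import pickle as p" or "from pickle import loads", as well as libraries that unpickle internally.

```python
import ast

RISKY_CALLS = {("pickle", "loads"), ("pickle", "load")}


def find_pickle_calls(source, filename="<source>"):
    """Return sorted (line, call) pairs for pickle.load/pickle.loads calls."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the literal form pickle.<name>(...).
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in RISKY_CALLS
        ):
            hits.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(hits)


example = "import pickle\n\ndef handle(data):\n    return pickle.loads(data)\n"
print(find_pickle_calls(example))
```

Each hit still needs a manual look to decide whether the bytes being loaded can come from outside a trust boundary.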

Who's Affected

Anyone running TGI 3.3.6 in default configuration — this includes teams using Hugging Face's own Inference Endpoints on older deployments, and self-hosted instances exposed to the internet without reverse proxy rate limiting.
Anyone running LeRobot through version 0.5.1 — with ~24,000 GitHub stars and growing adoption in robotics research labs, manufacturing QA, and autonomous systems testing. The gRPC servers run on network-accessible ports by default.
Downstream users of Hugging Face Hub models — both TGI and LeRobot pull models from the Hub. A compromised inference pipeline means a supply chain entry point for anyone deploying AI workloads off the platform.
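
A quick way to triage an inventory of installed versions against the ranges above is sketched below in Python. The bounds come from this article (TGI 3.3.6 or earlier, LeRobot through 0.5.1). Treating every TGI release up to 3.3.6 as affected is an assumption, the helper names are hypothetical, and the parser handles only plain X.Y.Z strings rather than full PEP 440 versions.

```python
# Upper bounds as stated in this article; not taken from the advisories directly.
AFFECTED_THROUGH = {
    "text-generation-inference": (3, 3, 6),
    "lerobot": (0, 5, 1),
}


def parse_version(version):
    """Parse a plain 'X.Y.Z' string into a tuple of ints (no pre/post releases)."""
    return tuple(int(part) for part in version.split("."))


def is_affected(package, version):
    """True when the version is at or below the bound listed for the package."""
    bound = AFFECTED_THROUGH.get(package)
    if bound is None:
        return False
    return parse_version(version) <= bound


print(is_affected("lerobot", "0.5.1"), is_affected("lerobot", "0.5.2"))
```

Confirm the result against the official advisories before acting on it, since fixed-version information may change.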

How to Protect Yourself

Pin and upgrade TGI immediately. If you're on text-generation-inference==3.3.6 or earlier, upgrade to the latest patched version. Enforce authentication on all TGI endpoints — the default no-auth config is the attack surface.
Network-isolate LeRobot gRPC endpoints. If you run LeRobot ≤0.5.1, ensure the gRPC PolicyServer port is not exposed to the public internet. Use network policies, firewalls, or mutual TLS to restrict access.
Replace pickle.loads() in any custom ML pipeline. Use JSON, MessagePack, or Protocol Buffers for data serialization between untrusted components. If you must use pickle, use it only for data generated entirely within a trust boundary.
Set memory and request-size limits at your reverse proxy (nginx, Envoy, etc.) for all TGI deployments — not just inside the application. This adds defense-in-depth against CVE-2026-0599 even if the app-level fix lags.
Audit your AI dependencies. Run pip show text-generation-inference lerobot across your environments. Both packages have open CVEs right now.
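
As an illustration of the pickle replacement suggested above, here is a minimal Python sketch that exchanges an observation as JSON and validates it on receipt. The message fields (timestamp, joint_positions) and the function names are hypothetical and are not LeRobot's actual wire format.

```python
import json


def encode_observation(timestamp, joint_positions):
    """Serialize an observation as UTF-8 JSON bytes."""
    message = {"timestamp": timestamp, "joint_positions": joint_positions}
    return json.dumps(message).encode("utf-8")


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def decode_observation(raw):
    """Parse and validate an observation; raise ValueError on anything unexpected."""
    try:
        message = json.loads(raw)
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        raise ValueError("not valid JSON") from exc
    if not isinstance(message, dict) or set(message) != {"timestamp", "joint_positions"}:
        raise ValueError("unexpected message shape")
    timestamp, joints = message["timestamp"], message["joint_positions"]
    if not _is_number(timestamp):
        raise ValueError("timestamp must be a number")
    if not isinstance(joints, list) or not all(_is_number(j) for j in joints):
        raise ValueError("joint_positions must be a list of numbers")
    return {"timestamp": float(timestamp), "joint_positions": [float(j) for j in joints]}


wire = encode_observation(12.5, [0.1, -0.4, 1.0])
print(decode_observation(wire))
```

Unlike pickle, parsing JSON yields only plain data (dicts, lists, strings, numbers), so the worst a malformed message can do here is fail validation. Large tensors would likely need a binary format such as Protocol Buffers, which this sketch does not cover.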

The Sable Angle

Hugging Face's double CVE is a pattern we track closely at Sable. AI infrastructure is the new attack surface — and most ML teams don't have a red team. At Sable, we run offensive assessments against ML pipelines exactly like these: scanning for open inference endpoints, testing deserialization surfaces, and mapping the trust boundaries between your training data and production models. Our red team engagements cover AI supply chain risks, from Hub model poisoning to gRPC endpoint exposure — the same flaws that CVE-2026-0599 and CVE-2026-25874 exploit right now. If you're running Hugging Face infrastructure in production, we should talk.