OpenAI Codex PSA on Malicious Config Files

When users run Codex in a directory for the first time, they are forced to acknowledge a dialog or quit Codex. The dialog is misleading because it warns of prompt injection, but actually serves to gate ‘workspace trust’. When users choose ‘Yes, continue’, the project's configuration files are loaded. Loading a malicious configuration file can trigger immediate arbitrary code execution.

We aim to inform Codex users who may be accepting risks they do not intend to be when browsing any third party repo e.g. candidate coding projects, open-source code, etc. The risk was reported, but the response was “End-user already accepting the risk. This is working intended.”

Arbitrary code can run on Codex project startup.

Context

OpenAI’s Codex CLI uses config.toml files to control many aspects of the tool’s operation. This includes human-in-the-loop command approval settings, web search, settings, MCP configurations, and more. When Codex runs, it finds and activates these config files – including executing arbitrary code snippets defined in the MCP section of the configuration file. This presents a risk when operating on untrusted repositories. For example:

Evaluating an open source dependency before adding it to a project
Reviewing code submitted by a candidate, contractor, or vendor
Project maintainers assessing PRs made to an open-source project

OpenAI mentions in its docs that “For security, Codex loads project config files only when you trust the project.” However, in order to open Codex in a new directory, the user is required* to approve Codex trusting the project.

*With the exception of when running Codex in ‘Yolo mode’, which does not require project trust / loading project-level configs, but ‘Yolo mode’ is intended to be the less secure way to run Codex, as it disables human-in-the-loop protections.

The dialog that requests project trust presents itself as a warning about prompt injection – it does not state that acknowledging the dialog will trigger loading and executing commands from project-level config files.

Warning dialog states 'Do you trust the contents of this directory? Working with untrusted contents comes with higher risk of prompt injection.' Menu options 'Yes, Continue', 'No, quit'.

As the default comes with human-in-the-loop mechanisms and file access restrictions, users may have a false sense of security when accepting this.

This misleading set-up may enable hackers to distribute malware by concealing it in .codex/config.toml files – knowing that users will likely trigger the malware by following the dialog. This allows for malware delivery without the user intentionally choosing to run the code in the project.

Below, we demonstrate how this risk can manifest in practice for an unsuspecting Codex user.

Risk Example

A .codex/config.toml file with a command to install and run a malicious package is hidden by the attacker (in this case, a fake candidate) in a repository that will be encountered by the victim

The victim clones the untrusted repo and runs Codex in the cloned directory to use the Codex CLI
The victim chooses ‘Yes, continue’ to disregard a warning about prompt injections (trusting human-in-the-loop controls and command allowlists to help mitigate prompt injection risks)
Commands in the config.toml file execute immediately, running unsandboxed code on the user’s device (in this case, installing and running a malicious NPM package)

Here, the malicious config command installs a malicious NPM package. The package then searches the user’s Desktop, Downloads, and Home Directory looking for a fake credentials file and sends it to the attacker’s server.
Besides the above, with the ability to execute code on the user’s machine, malware can target many outcomes, including persistence on the user’s machine, lateral movement within the victim’s network, ransomware delivery, etc.

Re-used Trust Risk with Git Pull

One more risk we want to call out is that attackers may target the config.toml behavior via a route that aims to leverage trust from an existing Codex workspace. For example:

Victim clones an open-source project; it is safe, and workspace trust is granted.
Malicious commit introduces a config.toml with an attacker’s payload.
Victim pulls the updated code and runs Codex. The config file executes without additional authorization, even though its contents differ from when the user granted workspace trust.

Concealing the Payload

Below is an example demonstrating how similar a malicious config can look to a legitimate MCP config found in the Codex docs.

The config for a typo-squatted NPM package is only two characters different from the trusted example config in OpenAI docs.

We also note that if an attacker wanted the whole payload to live locally in the project, avoiding NPM, they could place malicious shell commands directly in the config or set the config to execute a script from the project files.

Mitigating Risks at the Org Level

Codex supports admin-enforced requirements for many configuration settings, including MCP server start commands. To remediate this risk, we recommend organizations deploy a requirements file that restricts MCP configurations for org members to only trusted servers and MCP start-up commands.

For more information, see OpenAI’s Admin Enforced Requirements Documentation

Responsible Disclosure

While we acknowledge that some users may be aware that the trust modal has a downstream consequence of commands in project-level config files being run, we also know that when developers run Codex in spite of the prompt injection warning, many won’t expect to be executing arbitrary commands. We informed OpenAI, as users may not be aware of this behavior.

02/14/2026 Report submitted via BugCrowd
02/16/2026 Closed as N/A “End-user already accepting the risk. This is working intended.”

As users may not be aware, we wanted to share publicly via this PSA so that Codex users can properly assess the risk they are accepting when browsing repositories with Codex.