CVE-2024-5480: Remote Code Execution Vulnerability in PyTorch's torch.distributed.rpc Framework
June 7, 2024
A critical remote code execution (RCE) vulnerability has been identified in PyTorch's torch.distributed.rpc framework, affecting versions prior to 2.2.2. It allows attackers to execute arbitrary commands by invoking built-in Python functions such as eval during multi-CPU RPC communication. The issue arises because function calls are not restricted: when a worker node serializes and sends a Python user-defined function (PythonUDF) to the master node, the master deserializes and executes it without validation. An attacker can exploit this flaw to compromise the master node that initiates distributed training, potentially leading to the theft of sensitive AI-related data.
Background
The torch.distributed.rpc framework is used in distributed training to handle communication between worker nodes and the master node. It is designed to handle RPC operations efficiently, but it does not verify which functions are called during those operations. As a result, a remote caller can request built-in Python functions such as eval and use them to execute arbitrary commands.
Vulnerability Details
The vulnerability arises from the following sequence of events:
1. Serialization and sending: A worker node serializes a PythonUDF and sends it to the master node.
2. Deserialization and execution: The master node deserializes the PythonUDF and executes it without validation.
An attacker can therefore craft a malicious PythonUDF that calls a built-in Python function such as eval. Because eval evaluates whatever expression string it is given, the master node ends up running attacker-supplied code, which amounts to remote code execution.
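The deserialize-then-execute pattern can be illustrated without PyTorch. The following is a minimal sketch that uses only the standard pickle module to stand in for the RPC transport; it is not PyTorch's actual code path, and the names ALLOWED_CALLABLES, AllowlistUnpickler, run_unvalidated, and run_validated are hypothetical. It contrasts a receiver that calls whatever it is sent with one that checks the callable against an allowlist first, following the "restricting globals" technique from the Python pickle documentation.

```python
import io
import math
import pickle

# Hypothetical allowlist: the only (module, name) pairs the receiver will load.
ALLOWED_CALLABLES = {("math", "sqrt"), ("operator", "add")}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not named in ALLOWED_CALLABLES."""

    def find_class(self, module, name):
        if (module, name) not in ALLOWED_CALLABLES:
            raise pickle.UnpicklingError(f"blocked callable: {module}.{name}")
        return super().find_class(module, name)


def run_unvalidated(payload: bytes):
    """Mimics the flawed pattern: deserialize a (func, args) pair and call it."""
    func, args = pickle.loads(payload)
    return func(*args)


def run_validated(payload: bytes):
    """Same call path, but the callable must pass the allowlist first."""
    func, args = AllowlistUnpickler(io.BytesIO(payload)).load()
    return func(*args)


if __name__ == "__main__":
    # A harmless stand-in for a malicious request: ask the receiver to call eval.
    request = pickle.dumps((eval, ("1 + 1",)))
    print(run_unvalidated(request))  # eval runs on the receiver and prints 2

    try:
        run_validated(request)
    except pickle.UnpicklingError as exc:
        print(exc)  # blocked callable: builtins.eval

    # An allowlisted function still works as before.
    print(run_validated(pickle.dumps((math.sqrt, (9.0,)))))  # 3.0
```

The allowlist is deliberately strict: anything not named explicitly is rejected, so a new built-in or module function cannot slip through by default. Whether an allowlist of this kind matches the fix shipped in PyTorch is not stated in this advisory.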
Impact
The impact is significant: a successful exploit compromises the master node that initiates distributed training. This could lead to the theft of sensitive AI-related data, including model weights, training data, and other confidential information.
Mitigation
To mitigate this vulnerability, upgrade to PyTorch 2.2.2 or later, which includes the necessary fixes. In addition, configure distributed training environments so that only trusted hosts can reach the master node.
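One way to act on the access-control advice is a pre-flight check that refuses to start RPC when the master address is not on an internal network. The sketch below is an illustration only, not a measure prescribed by this advisory. It assumes the job is configured through the MASTER_ADDR environment variable with a literal IP address, and the helper names is_internal_address and check_master_addr are hypothetical.

```python
import ipaddress
import os


def is_internal_address(addr: str) -> bool:
    """Return True if addr is a literal loopback or private IP address."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Hostnames are not resolved here, so they are treated as unverified.
        return False
    return ip.is_loopback or ip.is_private


def check_master_addr(env=os.environ) -> str:
    """Raise before RPC starts if MASTER_ADDR is missing or not internal."""
    addr = env.get("MASTER_ADDR", "")
    if not is_internal_address(addr):
        raise RuntimeError(
            f"MASTER_ADDR={addr!r} is not a loopback or private address; "
            "refusing to start RPC"
        )
    return addr
```

Calling check_master_addr() before initializing RPC catches an accidentally public address early. It does not replace firewall rules or network segmentation, which are what actually keep untrusted hosts away from the master node.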
Conclusion
CVE-2024-5480 highlights the importance of validating the functions invoked during RPC operations in distributed training. Developers and users should stay up to date with security patches and best practices to prevent such vulnerabilities from being exploited.
How can I patch my PyTorch installation to fix CVE-2024-5480?
Upgrade PyTorch with pip:
pip install --upgrade torch
Verify the installed version by running:
import torch; print(torch.__version__)
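The printed version string can also be checked programmatically against the 2.2.2 threshold given above. The following is a minimal sketch using only the standard library; the names FIXED_VERSION, parse_release, and is_patched are hypothetical, and the parser reads only the leading numeric components, so local suffixes such as +cu121 are ignored and pre-release or development builds are not distinguished from final releases.

```python
import re

# First release that this advisory lists as not affected.
FIXED_VERSION = (2, 2, 2)


def parse_release(version: str) -> tuple:
    """Extract leading major.minor[.patch] numbers, ignoring any suffix."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_patched(version: str) -> bool:
    """Return True if the version is at or above FIXED_VERSION."""
    return parse_release(version) >= FIXED_VERSION


# Typical use, assuming torch is installed:
#     import torch
#     print(is_patched(torch.__version__))
```

For stricter comparisons that account for pre-release tags, a dedicated version parser would be a better fit than this regular expression.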