VU#252619: Multiple deserialization vulnerabilities in PyTorch Lightning 2.4.0 and earlier versions

Overview

PyTorch Lightning versions 2.4.0 and earlier do not verify that model files are safe before loading them. Users of PyTorch Lightning should exercise caution when loading models from unknown or unmanaged sources.

Description

PyTorch Lightning, a high-level framework built on top of PyTorch, is designed to streamline deep learning model training, scaling, and deployment. PyTorch Lightning is widely used in AI research and production environments, often integrating with various cloud and distributed computing platforms to manage large-scale machine learning workloads.

PyTorch Lightning contains multiple vulnerabilities related to deserialization of untrusted data (CWE-502). These vulnerabilities arise from unsafe use of torch.load(), which deserializes model checkpoints, configurations, and sometimes metadata. While torch.load() provides an optional weights_only=True parameter that restricts what may be deserialized and thereby mitigates the risk of arbitrary code execution, PyTorch Lightning does not require or enforce this safeguard.
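For context, the difference between the two loading modes can be seen directly at the torch.load() call site. The sketch below is illustrative only; the checkpoint path is a placeholder, and the file is assumed to come from an untrusted source:

    import torch

    CKPT = "model.ckpt"  # placeholder path; assume an untrusted origin

    # Unsafe with the defaults of the affected era: torch.load() performs
    # full pickle deserialization, so a crafted checkpoint can execute
    # arbitrary code the moment it is loaded.
    state = torch.load(CKPT)

    # Safer: restrict deserialization to tensors and primitive containers.
    # This raises an UnpicklingError if the file smuggles in arbitrary
    # Python objects instead of plain weights.
    state = torch.load(CKPT, weights_only=True)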

Kasimir Schulz of HiddenLayer identified and reported the following five vulnerabilities:

  1. The DeepSpeed integration in PyTorch Lightning loads optimizer states and model checkpoints without enforcing safe deserialization practices. It does not validate the integrity or origin of serialized data before passing it to torch.load(), allowing deserialization of arbitrary objects.
  2. The PickleSerializer class directly uses Python’s pickle module for data serialization and deserialization. Because pickle inherently allows execution of embedded code during deserialization, any untrusted or manipulated input processed by this class can lead to code execution (see the illustration after this list).
  3. The _load_distributed_checkpoint component is responsible for handling distributed training checkpoints. It processes model state data across multiple nodes, but it does not include safeguards to verify or restrict the content being deserialized.
  4. The _lazy_load function is designed to defer loading of model components for efficiency. However, it does not enforce security controls on the serialized input, allowing for the potential deserialization of unverified objects.
  5. The Cloud_IO module facilitates storage and retrieval of model files from local and remote sources. It provides multiple deserialization pathways, such as handling files from disk, from remote servers, and from in-memory byte streams, without applying constraints on how the serialized data is interpreted.
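To make the pickle risk concrete, the following minimal, hypothetical sketch (not taken from the advisory) shows how a pickled object can run an arbitrary command when loaded. The Malicious class and the echo command are illustrative only; pickle invokes an object’s __reduce__ hook during deserialization, which lets an attacker substitute any callable:

    import os
    import pickle

    class Malicious:
        # pickle calls __reduce__ to decide how to reconstruct the object,
        # so returning (os.system, (command,)) runs the command on load.
        def __reduce__(self):
            return (os.system, ("echo pwned",))

    payload = pickle.dumps(Malicious())

    # Any code path that unpickles attacker-controlled bytes executes the
    # embedded command; no method on the object ever needs to be called.
    pickle.loads(payload)  # prints "pwned"

The same hazard applies wherever torch.load() is invoked without weights_only=True, since torch.load() uses pickle internally.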

Impact

A user could unknowingly load a malicious file from a local or remote source; embedded code in that file would then execute within the system’s context, potentially leading to full system compromise.

Solution

To reduce the risk of deserialization-based vulnerabilities in PyTorch Lightning, users and organizations can implement the following mitigations at the system and operational levels:

  1. Verify that files to be loaded come from trusted sources and carry valid signatures (a digest-check sketch follows this list);
  2. Use sandboxed environments to prevent execution of arbitrary commands when untrusted models or files are being used or tested;
  3. Perform static and dynamic analysis of files to be loaded to verify that the ensuing operations remain restricted to the data-processing needs of the environment;
  4. Disable unnecessary deserialization features by ensuring that torch.load() is always used with weights_only=True when the files to be loaded are model weights.
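As one way to implement the first mitigation, a checkpoint’s digest can be checked against a known-good value before the file ever reaches a deserializer. The following is a minimal sketch using only the Python standard library; the path and the expected digest are placeholders, not values from this advisory:

    import hashlib

    def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
        """Stream the file through SHA-256 so large checkpoints need not fit in memory."""
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    EXPECTED = "replace-with-published-digest"  # placeholder known-good value
    ckpt = "model.ckpt"                         # placeholder path

    if sha256_of(ckpt) != EXPECTED:
        raise RuntimeError(f"{ckpt} does not match the expected digest; refusing to load")
    # Only after this check passes should the file be passed to
    # torch.load(..., weights_only=True).

A digest check only establishes that the file is the one the publisher intended; it does not make an inherently unsafe file safe, so it should be combined with the other mitigations above.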

We have not received a statement from Lightning AI at this time. Please check the Vendor Information section for updates as they become available.

Acknowledgements

Thanks to the reporter, Kasimir Schulz [kschulz@hiddenlayer.com] from HiddenLayer. Thanks to Matt Churilla for verifying the vulnerabilities. This document was written by Renae Metcalf, Vijay Sarvepalli, and Eric Hatleback.