# PyTorch Training Optimization Using TensorDict Memory Mapping 🚀

Optimizing PyTorch training by wrapping a `torch.utils.data.Dataset` with `tensordict.TensorDict`/`MemoryMappedTensor` so the data is memory-mapped, pinned, and loaded onto an Nvidia GPU, then feeding the wrapped dataset to `torch.utils.data.DataLoader` to boost model training speed.

## Description

Welcome to the **PyTorch Training Optimization Using TensorDict Memory Mapping** repository! This project improves the training efficiency of PyTorch models by using memory-mapped tensors on Nvidia GPUs with TensorDict. By memory-mapping the dataset once and pinning it for fast host-to-device transfer, it provides a streamlined approach to model training that is faster and more resource-efficient.

## Table of Contents

- [Features](#features)
- [Getting Started](#getting-started)
- [Installation](#installation)
- [Usage](#usage)
- [Contributing](#contributing)
- [License](#license)
- [Contact](#contact)
- [Releases](#releases)
- [Acknowledgments](#acknowledgments)
- [Conclusion](#conclusion)

## Features

- **Optimized Memory Usage**: Efficiently manage memory with memory-mapped tensors.
- **GPU Acceleration**: Leverage the power of Nvidia GPUs for faster computations.
- **Easy Integration**: Seamlessly integrate with existing PyTorch workflows.
- **Enhanced Performance**: Reduce training time and improve model performance.
- **Documentation**: Comprehensive guides and examples to get you started.

## Getting Started

To begin using this repository, you need to set up your environment. Follow the steps below to get started quickly.

### Prerequisites

Ensure you have the following installed:

- Python 3.6 or later
- PyTorch 1.8 or later (recent tensordict releases may require a newer PyTorch 2.x)
- An Nvidia GPU with CUDA support
- TensorDict
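
If you only want the two core libraries rather than the full demo environment installed below via `requirements.txt`, they can be installed directly from PyPI (assuming you pick a CUDA-enabled PyTorch build for GPU support):

```bash
pip install torch tensordict
```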

### Installation

Clone the repository to your local machine:

```bash
git clone https://github.com/Kuenoz/pytorch_training_optimization_using_tensordict_memory_mapping.git
cd pytorch_training_optimization_using_tensordict_memory_mapping
```

Install the required packages:

```bash
pip install -r requirements.txt
```

To run the included demo:

```bash
python run_demo.py
```

## Usage

After setting up your environment, you can start using the features of this repository. Below is a simple example to get you started.

### Basic Example

```python
import torch
from tensordict import TensorDict, MemoryMappedTensor

# Create a memory-mapped tensor: the values are copied into a file on disk
# and paged in on demand rather than held resident in process memory
data = torch.randn(1000, 1000, dtype=torch.float32)
mapped_tensor = MemoryMappedTensor.from_tensor(data, filename="input.memmap")

# Create a TensorDict keyed by field name, batched along the first dimension
tensor_dict = TensorDict({"input": mapped_tensor}, batch_size=[1000])

# Use the tensor in your model
# model = YourModel()
# output = model(tensor_dict["input"])
```
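
The repository's core technique goes a step further: materialize a whole `torch.utils.data.Dataset` into a `TensorDict`, memory-map it, pin it, move it onto the GPU, and then hand the TensorDict itself to `torch.utils.data.DataLoader`. The sketch below is a minimal illustration of that pipeline, not the repo's actual code: `ToyDataset` and the two-layer model are placeholders, and it assumes tensordict's `memmap_()`/`pin_memory()` APIs and the `collate_fn=lambda b: b` DataLoader pattern shown in the tensordict documentation.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset
from tensordict import TensorDict


class ToyDataset(Dataset):
    """Stand-in for any map-style dataset returning (input, label) pairs."""

    def __init__(self, n: int = 10_000):
        self.x = torch.randn(n, 128)
        self.y = torch.randint(0, 10, (n,))

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]


dataset = ToyDataset()
device = "cuda" if torch.cuda.is_available() else "cpu"

# One-time cost, roughly one epoch of __getitem__ calls: materialize the
# Dataset into a TensorDict and memory-map it (to temporary storage here,
# or under a directory passed as `prefix`).
pairs = [dataset[i] for i in range(len(dataset))]
td = TensorDict(
    {
        "x": torch.stack([x for x, _ in pairs]),
        "y": torch.stack([y for _, y in pairs]),
    },
    batch_size=[len(dataset)],
).memmap_()

# Pin the host memory for fast transfer, then load everything onto the GPU
# if it fits in VRAM, so each batch is just an on-device slice.
td = td.pin_memory().to(device)

# A TensorDict can be passed to DataLoader directly; the identity collate_fn
# keeps each batch as a TensorDict instead of a collated list.
loader = DataLoader(td, batch_size=256, shuffle=True, collate_fn=lambda b: b)

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).to(device)
optim = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()

for batch in loader:  # each batch is a TensorDict slice already on `device`
    loss = loss_fn(model(batch["x"]), batch["y"])
    optim.zero_grad()
    loss.backward()
    optim.step()
```

Because every batch is already a device-resident slice, the per-batch cost drops to indexing rather than per-item Python calls and collation, which is where the training-speed gains come from.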

For more detailed examples and advanced usage, please refer to the documentation.

### Performance

The original project's benchmark compared one training epoch via a plain `torch.utils.data.Dataset` against one epoch via the memory-mapped TensorDict wrapper: TensorDict memory mapping boosts training speed, and the initial wrapping runtime is approximately equal to one epoch of the plain `torch.utils.data.Dataset`, so the one-time wrapping cost is amortized after roughly the first epoch.
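
To see the effect on your own data, a rough timing harness along the following lines can compare the two loaders. This is a sketch under the same assumptions as above (recent torch/tensordict with the identity `collate_fn` pattern), not the repo's benchmark code:

```python
import time

import torch
from torch.utils.data import DataLoader, TensorDataset
from tensordict import TensorDict

x = torch.randn(10_000, 128)
y = torch.randint(0, 10, (10_000,))

# Baseline: plain map-style Dataset with per-item __getitem__ and collation.
plain_loader = DataLoader(TensorDataset(x, y), batch_size=256)

# Memory-mapped TensorDict with batched indexing through the DataLoader.
td = TensorDict({"x": x, "y": y}, batch_size=[10_000]).memmap_()
td_loader = DataLoader(td, batch_size=256, collate_fn=lambda b: b)

for name, loader in [("plain Dataset", plain_loader), ("TensorDict", td_loader)]:
    start = time.perf_counter()
    for _batch in loader:
        pass  # a real benchmark would run the forward/backward pass here
    print(f"{name}: one epoch in {time.perf_counter() - start:.3f}s")
```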

## Contributing

We welcome contributions to this project! If you would like to contribute, please follow these steps:

1. Fork the repository.
2. Create a new branch (`git checkout -b feature/YourFeature`).
3. Make your changes and commit them (`git commit -m 'Add new feature'`).
4. Push to the branch (`git push origin feature/YourFeature`).
5. Open a pull request.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Contact

For questions or feedback, please reach out to the project maintainer:

- **Name**: Kuenoz
- **Email**: kuenoz@example.com

## Releases

For the latest updates and versions, please visit our [Releases](https://github.com/Kuenoz/pytorch_training_optimization_using_tensordict_memory_mapping/releases) section. Download the necessary files and execute them to start optimizing your PyTorch training.

## Acknowledgments

We would like to thank the PyTorch community for their contributions and support. Special thanks to the developers of TensorDict for their amazing work on memory management.

## Conclusion

This repository provides a powerful tool for optimizing PyTorch model training using memory-mapped tensors on Nvidia GPUs. With easy integration and efficient memory management, it aims to enhance your machine learning projects significantly. Explore the features, contribute, and help us improve this tool further!

For more information, please check our [Releases](https://github.com/Kuenoz/pytorch_training_optimization_using_tensordict_memory_mapping/releases) section.