DeepSpeed
Model Setup
Training Setup
Argument Parsing
Training Initialization
Distributed Initialization
Configuration
DeepSpeed Configuration
Configurations
Extending Configurations
Training API
Training API
Forward Propagation
Backward Propagation
Optimizer Step
Gradient Accumulation
Checkpointing API
Model Checkpointing
Loading Training Checkpoints
Saving Training Checkpoints
Activation Checkpointing
Configuring Activation Checkpointing
Using Activation Checkpointing
Configuring and Checkpointing Random Seeds
Transformer Kernel API
Transformer Kernels
DeepSpeed Transformer Config
DeepSpeed Transformer Layer
Pipeline Parallelism
Pipeline Parallelism
Model Specification
Training
Extending Pipeline Parallelism
Indices and tables
Index
Module Index
Search Page