Reduce Memory Needed to Train Deep Learning Models

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

checkmate

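checkmate helps break the GPU memory wall by letting researchers train large state-of-the-art models that would otherwise not fit in GPU memory. It applies optimal tensor rematerialization (as detailed in the paper) to trade space for time: some intermediate tensors are discarded after the forward pass and recomputed when they are needed again, which lowers peak memory at the cost of extra computation.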
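To make the space/time trade-off concrete, here is a minimal pure-Python sketch of rematerialization on a toy chain of scalar layers. It is not checkmate's API and not its optimal schedule: it uses a simple fixed-interval checkpointing heuristic, and every name in it (`layer`, `grad_store_all`, `grad_rematerialize`, `every`) is hypothetical and chosen only for illustration.

```python
import math


def layer(w, x):
    # One toy "layer": a scalar tanh unit (illustrative only).
    return math.tanh(w * x)


def layer_grad(w, x):
    # Derivative of layer(w, x) with respect to its input x.
    return w * (1.0 - math.tanh(w * x) ** 2)


def grad_store_all(weights, x0):
    # Baseline: keep every layer input alive for the backward pass.
    inputs = []
    x = x0
    for w in weights:
        inputs.append(x)
        x = layer(w, x)
    grad = 1.0
    for w, x_in in zip(reversed(weights), reversed(inputs)):
        grad *= layer_grad(w, x_in)
    # Returns: gradient, peak stored activations, forward evaluations.
    return grad, len(inputs), len(weights)


def grad_rematerialize(weights, x0, every):
    # Forward pass: keep only the input of every `every`-th layer.
    n = len(weights)
    checkpoints = {}
    x = x0
    forward_evals = 0
    for i, w in enumerate(weights):
        if i % every == 0:
            checkpoints[i] = x
        x = layer(w, x)
        forward_evals += 1

    grad = 1.0
    peak = len(checkpoints)
    # Backward pass: walk segments back to front, recomputing the
    # discarded inputs of one segment at a time from its checkpoint.
    for start in sorted(checkpoints, reverse=True):
        end = min(start + every, n)
        seg_inputs = [checkpoints[start]]
        for i in range(start, end - 1):
            seg_inputs.append(layer(weights[i], seg_inputs[-1]))
            forward_evals += 1
        # All checkpoints plus the recomputed inputs are live here.
        peak = max(peak, len(checkpoints) + len(seg_inputs) - 1)
        for i in range(end - 1, start - 1, -1):
            grad *= layer_grad(weights[i], seg_inputs[i - start])
    return grad, peak, forward_evals


weights = [0.9 + 0.01 * i for i in range(16)]
print(grad_store_all(weights, 0.5))
print(grad_rematerialize(weights, 0.5, every=4))
```

With 16 layers and a checkpoint every 4 layers, both functions return the same gradient, but the rematerializing version holds at most 7 activations instead of 16 while performing 28 layer evaluations instead of 16. Rather than using a fixed interval, checkmate chooses which tensors to keep and which to recompute by solving an optimization problem over the whole computation graph, as described in the paper.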