I am training a scgen model using the following command:
model.train(
    max_epochs=100,
    batch_size=32,
    early_stopping=True,
    early_stopping_patience=25,
    enable_checkpointing=True,
)
The integration takes a long time, and the job is ultimately cancelled before it finishes. However, I do have the saved checkpoint file. How can I restart the model training using the information saved in the checkpoint?