Why are there multiple settings for `actor_rollout_ref.model.enable_gradient_checkpointing`? Is this a deliberate design choice? #35