Hi, from this issue, it says VecNormalize's gamma should match the gamma of RL algorithm (e.g., gamma=0.99 should be consistent in both PPO2 and VecNormalize) to ensure consistent sliding window size. However, it seems the normalization arguments used in create_env are always the default one read from .yml file (i.e., gamma=0.99 as default):
|
env = VecNormalize(env, **normalize_kwargs) |
although gamma has different candidates in hyperparams_opt.py:
|
gamma = trial.suggest_categorical('gamma', [0.9, 0.95, 0.98, 0.99, 0.995, 0.999, 0.9999]) |
The same applies for rl-baselines3-zoo. Is this a bug? Should create_env consider gamma change in initiating VecNormalize per trial? Please give me some hint if I missed anything, thank you!