lelis-research/Common-Benchmarks-Undervalue-the-Generalization-Power-of-Programmatic-Policies

Common Benchmarks Undervalue the Generalization Power of Programmatic Policies

This repository contains the implementation for the paper "Common Benchmarks Undervalue the Generalization Power of Programmatic Policies". For more information, visit the project page. It serves as an umbrella for all code and experimental results related to our research. The work is organized into four dedicated submodules, each focusing on a specific environment or set of experiments:

  • SparsePolicies: Core implementations and experiments related to Karel, SparseMaze, Cartpole, Quad, and ParallelPark environments.
  • SparsePolicies_Torcs: Experiments related to the Torcs environment. This submodule uses a Dockerized version of the Torcs server (with an Apptainer version for Compute Canada).
  • SparsePolicies_ParallelPark: Additional experiments for the ParallelPark environment.
  • FunSearch: Experiments that use the FunSearch approach to find programmatic policies for the SparseMaze environment.

Usage

After cloning this super-project, initialize and update all submodules with:

git submodule update --init --recursive

Each submodule can then be accessed and run according to its own documentation and setup instructions.
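
To confirm that the update populated every submodule, one option is to inspect the output of `git submodule status`, which prefixes each uninitialized submodule with `-`. The helper below is a minimal sketch and not part of this repository: the function name `list_uninitialized` is hypothetical, and it assumes submodule paths contain no spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository).
# Reads `git submodule status` output on stdin and prints the path of every
# submodule that is not yet initialized (status lines that start with '-').
# Assumes submodule paths contain no whitespace.
list_uninitialized() {
    awk '/^-/ { print $2 }'
}

# Usage, from the root of the super-project:
#   git submodule status --recursive | list_uninitialized
```

If the helper prints nothing, every submodule has been initialized. As an alternative to the two-step setup, `git clone --recurse-submodules <repository-url>` clones the super-project and initializes its submodules in one command.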


Cite the paper

@misc{rajabpour2025commonbenchmarksundervaluegeneralization,
     title={Common Benchmarks Undervalue the Generalization Power of Programmatic Policies}, 
     author={Amirhossein Rajabpour and Kiarash Aghakasiri and Sandra Zilles and Levi H. S. Lelis},
     year={2025},
     eprint={2506.14162},
     archivePrefix={arXiv},
     primaryClass={cs.LG},
     url={https://arxiv.org/abs/2506.14162}, 
}

Authors

Amirhossein Rajabpour, Kiarash Aghakasiri, Sandra Zilles, Levi Lelis
