@ksuderman
Contributor

Summary

Adds a new job runner for Google Cloud Batch, enabling Galaxy to execute jobs on Google Cloud's managed batch computing service.

Features:

  • Container-based execution with automatic tool container discovery
  • NFS volume mounting for shared data access
  • CVMFS support for reference data
  • Dynamic CPU/memory allocation from job requirements
  • Job monitoring, recovery, and cancellation
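Allocating CPU and memory from job requirements means translating request strings such as `"2"` and `"4Gi"` into the units Google Cloud Batch uses for compute resources, which are milli-CPUs and MiB. A minimal sketch of that conversion follows; the helper names `cpu_to_milli` and `memory_to_mib` are hypothetical and are not taken from the runner.

```python
import re

# Binary suffixes expressed in bytes; a bare number is treated as bytes.
_UNIT_BYTES = {"": 1, "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def cpu_to_milli(value: str) -> int:
    """Convert a CPU request such as '2', '0.5' or '500m' to milli-CPUs."""
    value = value.strip()
    if value.endswith("m"):
        return int(value[:-1])
    return int(round(float(value) * 1000))


def memory_to_mib(value: str) -> int:
    """Convert a memory request such as '4Gi' or '512Mi' to whole MiB."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(Ki|Mi|Gi|Ti)?", value.strip())
    if match is None:
        raise ValueError(f"Unrecognised memory quantity: {value!r}")
    number, unit = match.groups()
    total_bytes = float(number) * _UNIT_BYTES[unit or ""]
    return int(total_bytes // 1024**2)


if __name__ == "__main__":
    print(cpu_to_milli("2"), memory_to_mib("4Gi"))
```

Rounding down to whole MiB is a design choice in this sketch; a real runner might instead round up so that a job never receives less memory than it requested.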

Configuration

Add the runner to job_conf.yml:

runners:
  gcp_batch:
    load: galaxy.jobs.runners.gcp_batch:GoogleCloudBatchJobRunner
    workers: 4
    project_id: my-gcp-project
    region: us-central1
    nfs_server: 10.0.0.2
    nfs_path: /galaxy/server/database
    nfs_mount_path: /mnt/nfs
    container_image: quay.io/galaxyproject/galaxy-min:25.1

destinations:
  gcp_batch:
    runner: gcp_batch
    params:
      machine_type: n2-standard-4
      requests_cpu: "2"
      requests_memory: "4Gi"
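
For orientation, the three `nfs_*` settings plausibly correspond to the parts of an NFS volume in the Batch REST API, where a volume carries an `nfs` object (`server`, `remotePath`) and a `mountPath`. The sketch below builds that JSON fragment from the runner parameters. The function name `nfs_volume` is hypothetical, and the mapping of `nfs_path` to `remotePath` is an assumption; this description does not show how the runner assembles its request.

```python
import json


def nfs_volume(params: dict) -> dict:
    """Build a Batch-style volume entry from runner parameters (assumed mapping)."""
    return {
        "nfs": {
            "server": params["nfs_server"],      # NFS server IP or hostname
            "remotePath": params["nfs_path"],    # exported directory on the server
        },
        "mountPath": params["nfs_mount_path"],   # where the VM/container sees it
    }


runner_params = {
    "nfs_server": "10.0.0.2",
    "nfs_path": "/galaxy/server/database",
    "nfs_mount_path": "/mnt/nfs",
}

if __name__ == "__main__":
    print(json.dumps(nfs_volume(runner_params), indent=2))
```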

Required GCP setup:

  1. Enable the Batch API (batch.googleapis.com)
  2. Create a service account with Batch Admin and Compute Admin roles
  3. Configure NFS (Filestore) accessible from Batch VMs on the same VPC
  4. Authenticate via GOOGLE_APPLICATION_CREDENTIALS or service_account_file parameter

Authentication: The service_account_email parameter is optional and primarily useful for development/testing. In production, use a workload identity pool for secure, keyless authentication.
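
One way to picture the authentication options is as a lookup order over the available credential sources. The sketch below is illustrative only: the function `credential_source` is hypothetical, and the precedence it encodes (explicit `service_account_file` parameter, then the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, then whatever ambient credentials the environment provides) is an assumption rather than the runner's documented behaviour.

```python
import os
from typing import Mapping, Optional, Tuple


def credential_source(
    params: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> Tuple[str, Optional[str]]:
    """Return (source_name, key_file_path_or_None) using an assumed precedence."""
    environ = os.environ if environ is None else environ

    # 1. An explicit key file configured on the runner wins.
    key_file = params.get("service_account_file")
    if key_file:
        return ("service_account_file", key_file)

    # 2. Otherwise fall back to the standard environment variable.
    env_file = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if env_file:
        return ("GOOGLE_APPLICATION_CREDENTIALS", env_file)

    # 3. No key file at all: rely on ambient (keyless) credentials.
    return ("ambient", None)


if __name__ == "__main__":
    print(credential_source({"service_account_file": "/etc/galaxy/key.json"}, {}))
```

The third branch is the keyless case recommended for production: no key file is read, and credentials come from the environment the Galaxy process runs in.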

Key parameters:

Parameter                        Description         Default
project_id                       GCP project ID      Auto-detected
region                           GCP region          us-central1
machine_type                     VM machine type     n2-standard-4
container_image                  Default container   quay.io/galaxyproject/galaxy-min:25.1
nfs_server                       NFS server IP       Required
requests_cpu / requests_memory   Resource requests   1 / 2Gi
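
The defaults listed above can be read as a base layer that runner and destination settings override. A minimal sketch of that resolution follows, assuming destination `params` take precedence over runner-level settings, which in turn take precedence over the defaults. The function `resolve_params` is hypothetical, and `project_id` is left out because its auto-detection is not described here.

```python
# Defaults copied from the parameter list; nfs_server has none and is required.
DEFAULTS = {
    "region": "us-central1",
    "machine_type": "n2-standard-4",
    "container_image": "quay.io/galaxyproject/galaxy-min:25.1",
    "requests_cpu": "1",
    "requests_memory": "2Gi",
}
REQUIRED = ("nfs_server",)


def resolve_params(runner_params: dict, destination_params: dict) -> dict:
    """Merge defaults, runner settings and destination params (assumed order)."""
    merged = {**DEFAULTS, **runner_params, **destination_params}
    missing = [key for key in REQUIRED if not merged.get(key)]
    if missing:
        raise ValueError(f"Missing required parameter(s): {', '.join(missing)}")
    return merged


if __name__ == "__main__":
    resolved = resolve_params(
        {"nfs_server": "10.0.0.2"},
        {"requests_cpu": "2", "requests_memory": "4Gi"},
    )
    print(resolved["machine_type"], resolved["requests_cpu"], resolved["requests_memory"])
```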

Test plan

  • Unit tests pass: pytest test/unit/app/jobs/test_gcp_batch_runner.py
  • Manual testing with GCP Batch infrastructure

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

Member

@nuwang left a comment

Just some preliminary comments.

@ksuderman requested a review from jmchilton January 15, 2026 15:21
@ksuderman
Contributor Author

I have been asked why we did not use the new Pulsar enhancements introduced in #20862:

  1. We did not develop this in isolation; we consulted with @jmchilton before starting work.
  2. The GCP Batch job runner is optimized for deployment on AnVIL, or at least for Galaxy instances deployed to GCP.
  3. The two approaches to dispatching jobs to Google Batch are not mutually exclusive. I had Claude write a short summary of the pros and cons of each approach. TL;DR: it depends, and if Galaxy can be configured to use Google buckets for its object store, any differences in I/O are negligible.
  4. The Pulsar GCP runner has some bugs and missing features that mean we can't use it on AnVIL in its current state. I will open PRs with fixes and updates, but this runner is tested and working now.

@ksuderman marked this pull request as ready for review January 25, 2026 02:12
@jmchilton
Member

Can you fix the remaining linting issues?

@guerler requested a review from nuwang January 28, 2026 11:20
@guerler merged commit 777c9a4 into galaxyproject:dev Jan 28, 2026
58 of 59 checks passed
@guerler added the release-testing-26.0 label (PRs marked for testing for the 26.0 release and issues stemming from release testing) Jan 29, 2026
@guerler changed the title from "Google Batch job runner" to "Add Google Batch job runner" Jan 29, 2026
Labels

area/jobs, area/testing, kind/feature, release-testing-26.0

4 participants