Skip to content

Conversation

@caffeinated92
Copy link
Collaborator

This pull request introduces significant enhancements and configurability to the Restic backup integration, focusing on flexible tagging, purge grouping, and new API endpoints for snapshot management. The changes allow users to customize how backups are tagged and grouped, and provide new REST endpoints to list and restore Restic snapshots. Several internal helper functions and configuration settings were added to support these features, and existing backup flows were refactored to use the new tagging mechanism.

Restic Tagging and Purge Grouping Improvements:

  • Added configuration options BackupResticTagCategories and BackupResticPurgeGroupBy to the Config struct, with corresponding CLI flags and setters, allowing users to customize backup tag categories and purge grouping behavior. [1] [2] [3] [4]
  • Implemented helper functions for normalizing and parsing tag categories and group-by values, ensuring only allowed values are used and providing user-friendly error messages.
  • Refactored all Restic backup invocations to use the new BuildResticTags method, which generates tags based on the configured categories, improving consistency and flexibility. [1] [2] [3] [4] [5]

Restic Snapshot Management API:

  • Added new API endpoints to list (/api/clusters/{clusterName}/restic/ls/{snapshotID}) and restore (/api/clusters/{clusterName}/restic/restore/{snapshotID}) Restic snapshots, with corresponding handler implementations and request/response types. [1] [2] [3]
  • Implemented cluster methods for snapshot restore, listing, dumping, mounting, and unmounting, enabling the new API endpoints and improving backup management workflows.

Restic Purge Enhancements:

  • Updated the purge logic to use the normalized groupBy value from configuration, with validation and fallback to default on error, allowing for more granular retention policies. [1] [2]

Miscellaneous:

  • Added necessary imports (io, strings) to support new functionality.

These changes collectively provide a more flexible and powerful Restic backup experience, making it easier to manage, tag, and restore backups according to organizational needs.

…ionForm components for enhanced backup management UI
- Implemented ResticListSnapshot and ResticRestoreSnapshot methods in the Cluster struct.
- Added corresponding API handlers in the ReplicationManager for listing and restoring snapshots.
- Enhanced ConfirmActionForm component to handle snapshot restoration with specific paths and target directories.
- Updated SnapshotTables component to include restore functionality for snapshots.
- Introduced utility functions for normalizing paths and filtering entries based on base paths.
- Added Redux actions and service methods for interacting with the new snapshot listing and restoration features.
@claude
Copy link

claude bot commented Jan 13, 2026

PR Review: Enhanced Restic Backup Management

Summary

This PR adds significant enhancements to Restic backup management, including flexible tagging, purge grouping configuration, and new REST API endpoints for snapshot listing and restoration. The changes are well-structured and comprehensive.

✅ Strengths

  1. Comprehensive Test Coverage

    • Excellent addition of restic_test.go with 634 lines covering core functionality
    • Tests cover lifecycle operations: create, list, restore, mount, dump
    • Good use of test helpers and temporary directories for isolation
    • Tests validate edge cases (missing snapshots, invalid paths, etc.)
  2. Clean API Design

    • New endpoints (/restic/ls/{snapshotID}, /restic/restore/{snapshotID}) follow REST conventions
    • Request/response types are well-defined
    • Token middleware applied consistently for security
  3. Configuration Flexibility

    • BackupResticTagCategories and BackupResticPurgeGroupBy provide user customization
    • normalizeResticTagCategories() and normalizeResticPurgeGroupBy() functions validate inputs with clear error messages
    • Fallback to defaults on validation errors (defensive programming)
  4. Code Organization

    • Cluster wrapper methods (ResticListSnapshot, ResticRestoreSnapshot, etc.) provide clean separation
    • Tag building logic (BuildResticTags, parseResticTagCategories) is modular and testable
    • Error handling in CheckResticErrors() properly categorizes task-specific errors

⚠️ Issues & Recommendations

1. Path Traversal Security Concern [HIGH]

In cluster_bck.go:160-164, the normalizeResticPurgeGroupBy() function:

normalized = strings.ReplaceAll(normalized, "_", "")  // Line 689

The function allows underscores to be stripped, converting "path_name" → "pathname". While this appears intentional for flexibility, there's no validation that the resulting value is in the allowed set before it's used.

Recommendation: The validation check on line 697 protects against this, so this is actually safe—but it could be clearer. Consider adding a comment explaining the underscore normalization.

2. Race Condition in CancelTask() [MEDIUM]

restic.go:1036-1056:

func (repo *ResticManager) CancelTask(taskId int) {
    // Line 1051 has a logic bug:
    repo.TaskQueue = append(repo.TaskQueue[:taskToCancel.ID], 
                           repo.TaskQueue[taskToCancel.ID+1:]...)
}

This indexes by taskToCancel.ID instead of finding the actual index. If task IDs are non-sequential, this will delete the wrong task. The i variable from the loop should be used instead.

Recommendation: Fix this to use the loop index:

for i, task := range repo.TaskQueue {
    if task.ID == taskId {
        repo.TaskQueue = append(repo.TaskQueue[:i], repo.TaskQueue[i+1:]...)
        // ...
        return
    }
}

3. Missing Input Validation [MEDIUM]

In API handlers (api_cluster.go:6938-7015):

  • payload.TargetDir is validated but not checked for path traversal (e.g., ../../etc/passwd)
  • sourcePath parameter accepts any string without sanitization

Recommendation: Add path validation:

if filepath.IsAbs(payload.TargetDir) {
    // Ensure it's within allowed boundaries
}

4. Inefficient Task ID Generation

restic.go:313-317:

func (repo *ResticManager) GenerateTaskID() int {
    repo.taskID++
    return repo.taskID
}

Task IDs are not thread-safe. If called concurrently without holding repo.Mutex, two calls could return the same ID.

Recommendation: Either lock the mutex or use atomic operations:

func (repo *ResticManager) GenerateTaskID() int {
    repo.Mutex.Lock()
    defer repo.Mutex.Unlock()
    repo.taskID++
    return repo.taskID
}

5. Goroutine Leak in RestoreSnapshot()

restic.go:490-510:
The restoreSnapshot() implementation spawns goroutines for streaming output but doesn't properly handle early returns. If GetCanFetch() returns false, the function returns without waiting for output streams to complete.

Recommendation: Ensure all goroutines complete or have proper cancellation.

6. Missing CLI Flags for New Configuration

config/config.go and server/server.go add:

  • BackupResticTagCategories
  • BackupResticPurgeGroupBy

But I don't see CLI flag definitions in the diff. These should be added to server/server.go:AddFlags() for CLI access.

7. Incomplete Error Handling in RunCommand()

restic.go:1170:

repo.Printf(logrus.ErrorLevel, prefix+"Error reading output:", err)

This doesn't format the error properly (missing %v format specifier). The error is concatenated as a string rather than properly formatted.

Recommendation: Use proper format specifier:

repo.Printf(logrus.ErrorLevel, prefix+"Error reading output: %v", err)

🧪 Test Coverage Notes

Excellent:

  • TestRestoreDumpMountUnmount() comprehensively tests all snapshot operations
  • Mount/unmount with proper cleanup and timeout handling
  • JSON parsing from restic ls command verified

Missing:

  • No test for BuildResticTags() with different category configurations
  • No negative tests for invalid purge group-by values
  • No test for concurrent task operations (race condition testing)
  • No test for empty snapshot IDs or invalid snapshot references

📝 Documentation

The PR description is clear and well-organized. Consider adding:

  • CHANGELOG entry documenting the new configuration options
  • Migration guide if these settings replace old behavior
  • Example curl commands for the new REST endpoints

🔒 Security Observations

  1. Token Middleware: All new endpoints properly protected with validateTokenMiddleware
  2. Input Sanitization: Path parameters could benefit from stricter validation
  3. Command Injection: Restic commands are built from user input but properly escaped via argument array ✅

Summary of Required Changes

Issue Severity Type
CancelTask() array indexing bug HIGH Bug
Path traversal in restore endpoint MEDIUM Security
GenerateTaskID() thread safety MEDIUM Concurrency
Error format string in RunCommand LOW Bug
Add CLI flags for new config options LOW Feature

Recommendations for Follow-up

  1. Add integration tests for tag building with various config combinations
  2. Add stress tests for task queue operations
  3. Document the purge group-by and tag category options in config documentation
  4. Add validation for absolute paths in restore operations

Overall Assessment: The PR introduces valuable functionality with mostly solid implementation. The main concerns are the task ID indexing bug in CancelTask() and missing input validation in the restore endpoint. These should be addressed before merging. The test suite is comprehensive, but could benefit from additional edge case coverage.

@caffeinated92 caffeinated92 marked this pull request as draft January 13, 2026 13:25
@caffeinated92
Copy link
Collaborator Author

@claude review my pull request

@claude
Copy link

claude bot commented Jan 14, 2026

Claude encountered an error after 26s —— View job


I'll analyze this and get back to you.

@caffeinated92
Copy link
Collaborator Author

@claude review my pull request

@claude
Copy link

claude bot commented Jan 15, 2026

Claude encountered an error after 17s —— View job


I'll analyze this and get back to you.

- Implemented TestFailoverResticDump, TestResticDumpRestore, and TestResticSnapshotAndDump in regtest/test_failover_restic_dump.go to validate restic dump and restore processes.
- Added API endpoint for dumping restic snapshots to MySQL in server/api_cluster.go.
- Created SnapshotDumpForm component in ConfirmActionForm.jsx for user input on server and file path during dump operations.
- Enhanced SnapshotTables.jsx to include a button for initiating the dump to MySQL action.
- Updated Maintenance component to handle snapshot dump actions and manage state for dump files.
- Modified ConfirmModal to accept a size prop for better UI handling.
- Improved styling in ConfirmModal's styles.module.scss for better visual hierarchy.
- Added resticDumpToMysql action in clusterSlice.js to handle the dump request.
- Updated globalClustersSlice.js to maintain pending thunk states.
- Integrated resticDumpToMysql service in clusterService.js for API interaction.
- Enhanced ClusterDBTabContent and DBServerGrid components to include backupRestic configuration.
- Implemented advanced backup modal in ServerMenu for selecting backup type (logical/physical) and line (default/adhoc).
- Added retention days and Restic toggle options for ad-hoc backups.
- Updated Redux actions to support options for physical and logical backups.
- Modified clusterService to build query parameters for backup requests.
- Enhanced ResticManager to return snapshot IDs and handle backup command output more efficiently.
- Added tests for SnapshotDumpForm to ensure proper functionality and user interactions.
- Add backup line selection (default/adhoc) to reseed operations
- Implement Advanced Reseed modal with reseed type, source, and line selection
- Add Point-In-Time Recovery (PITR) modal with backup selection and timestamp
- Update backend API to accept 'line' query parameter for reseed endpoints
- Maintain backward compatibility with existing reseed functionality
Backend changes:
- server/api_database.go: Parse backup line options for reseed operations
Frontend changes:
- services/clusterService.js: Add PITR service and update reseed services
- redux/clusterSlice.js: Add PITR thunk and update reseed thunks
- ServerMenu.jsx: Implement Advanced Reseed and PITR modals
- Implemented a new API endpoint for reseeding a server from a RESTIC snapshot in `api_database.go`.
- Added a new handler `handlerMuxServerReseedRestic` to process RESTIC reseed requests, including parameter validation and error handling.
- Updated the React dashboard to include RESTIC snapshot options in the server menu, allowing users to select snapshots and specify restore modes.
- Introduced Redux actions and service functions for handling RESTIC reseed operations in `clusterSlice.js` and `clusterService.js`.
- Enhanced the RESTIC manager with a synchronous restore function to facilitate the new reseed process.
…eseed

- Added BackupID field to BackupRunOptions in backup_helpers.go.
- Introduced reseedLogicalBackupByID method in srv_job.go for logical backup reseeding by ID.
- Updated API to handle backup ID in api_backup_options.go and api_database.go.
- Enhanced ServerMenu component in ServerMenu.jsx to support ad-hoc backup selection by ID.
- Modified clusterSlice.js and clusterService.js to include useSourcePath parameter for restic reseed operations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants