Skip to content

Conversation

@AidanBeltonS
Copy link
Contributor

Motivation

The current CI occationally is flakey with failures due to UB, and correctness issues. This change addresses some of those issues in the memory tests

Technical Details

hipDrvMemcpy2DUnaligned.cc
* Reset parameters before each SECTION
* Free pinned host memory with hipHostFree not hipFree
* srcH and srcD need to check, that they are not nullptr's after malloc. Implemented this using RAII class LinearAllocGuard
hipDrvMemcpy3DAsync.cc
* Passes actually invalid pitch, i.e. MaxPitch + 1
hipDrvMemcpy3DAsync_old.cc
* Handle peer access enable/disable
* Allocate 3D memory correctly, currently missing depth we are just allocating 2D
* Reset parameters between SECTION's
* Construct stream, I think previously it was left uninitialized and all work was on the default stream
* Fix memory leak
hipDrvPtrGetAttributes.cc
* Type mismatchs
* remove incorrect nvidia test, with buffer overflow

JIRA ID

Partial solution: ROCM-1617

Test Plan

NA

Submission Checklist

@AidanBeltonS AidanBeltonS requested a review from a team as a code owner January 26, 2026 15:32
Copilot AI review requested due to automatic review settings January 26, 2026 15:32
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR resolves various issues in HIP driver memory copy tests including undefined behavior, memory leaks, and correctness problems that were causing CI flakiness.

Changes:

  • Fixed memory allocation/deallocation issues (incorrect free functions, missing nullptr checks, memory leaks)
  • Added proper parameter resets between test sections to avoid state pollution
  • Corrected 3D memory allocation to include depth dimension (was allocating 2D instead of 3D)
  • Implemented peer access enable/disable handling and stream initialization
  • Fixed type mismatches and removed invalid test cases

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
memcpy2d_tests_common.hh Added peer access tracking and proper cleanup after device-to-device operations
hipDrvPtrGetAttributes.cc Fixed type mismatches (unsigned int to int/size_t) and removed invalid NVIDIA-specific test with buffer overflow
hipDrvMemcpy3D_old.cc Fixed 3D memory allocation, added parameter reset between sections, peer access handling, nullptr checks, and memory leak fixes
hipDrvMemcpy3DAsync_old.cc Similar fixes as hipDrvMemcpy3D_old.cc plus stream initialization and improved memory management
hipDrvMemcpy3DAsync.cc Corrected invalid pitch test to actually pass MaxPitch + 1 instead of MaxPitch
hipDrvMemcpy2DUnaligned.cc Fixed incorrect hipFree calls (should be hipHostFree), added parameter resets, and replaced raw pointers with RAII LinearAllocGuard
hip_test_common.hh Added PeerAccessGuard RAII class for safe peer access management

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants