-
Notifications
You must be signed in to change notification settings - Fork 69
Energy profiling tools: Core infrastructure with timing tool and export capabilities #299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Energy profiling tools: Core infrastructure with timing tool and export capabilities #299
Conversation
JBludau
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first pass
profiling/energy-profiler/kokkos-tools/kp_energy_kernel_timer.cpp
Outdated
Show resolved
Hide resolved
profiling/energy-profiler/kokkos-tools/kp_energy_kernel_timer.cpp
Outdated
Show resolved
Hide resolved
|
@dalg24 @masterleinad Hello! I would need some review on this PR, being the baseline blocks needed the daemon system and energy measurement tools (that would also need review #300) |
profiling/energy-profiler/kokkos-tools/kp_energy_kernel_timer.cpp
Outdated
Show resolved
Hide resolved
profiling/energy-profiler/kokkos-tools/kp_energy_kernel_timer.cpp
Outdated
Show resolved
Hide resolved
profiling/energy-profiler/kokkos-tools/kp_energy_kernel_timer.cpp
Outdated
Show resolved
Hide resolved
profiling/energy-profiler/kokkos-tools/kp_energy_kernel_timer.cpp
Outdated
Show resolved
Hide resolved
JBludau
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are getting there
profiling/energy-profiler/kokkos-tools/kp_energy_kernel_timer.cpp
Outdated
Show resolved
Hide resolved
| // Stack-based timing for robust region/kernel tracking | ||
| void start_region(const std::string& name, RegionType type, uint64_t id = 0); | ||
| void end_region(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine but noting that if you can't correlate start and end of a region via some identifier then it won't be threadsafe.
a9c54ce to
5d9082f
Compare
| # - Tool interface definitions | ||
| # - Basic kernel timer tool | ||
|
|
||
| add_subdirectory(kokkos-tools) No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This red circle means "no newline at end of file".
Please fix it by adding a newline character.
Same comment everywhere.
| namespace KokkosTools { | ||
| namespace EnergyProfiler { | ||
|
|
||
| std::string generate_prefix() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| namespace KokkosTools { | |
| namespace EnergyProfiler { | |
| std::string generate_prefix() { | |
| std::string KokkosTools::EnergyProfiler::generate_prefix() { |
| namespace KokkosTools { | ||
| namespace EnergyProfiler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| namespace KokkosTools { | |
| namespace EnergyProfiler { | |
| namespace KokkosTools::EnergyProfiler { |
profiling/energy-profiler/kokkos-tools/kp_energy_kernel_timer.cpp
Outdated
Show resolved
Hide resolved
6fb522b to
f2f0606
Compare
5fbccdf to
2629fad
Compare
| std::lock_guard<std::mutex> lock(state.get_mutex()); | ||
| state.get_active_regions().push_back(region); | ||
| } catch (const std::exception& e) { | ||
| std::cerr << "Error in start_region: " << e.what() << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm ... afaik that is not how exceptions are supposed to be used ... especially since you are not handling the exception but just printing and not rethrowing. So this would not lead to an abort.
if any of the functions in the try block throw, you are silencing that but also not restoring a valid state so that the program can continue. I would remove the try-catch
| TimingInfo region; | ||
| region.name = name; | ||
| region.type = type; | ||
| region.start_time = std::chrono::high_resolution_clock::now(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when starting, you should take the time at the end and then use it to update the last region in the dqueue. This way you don't measure the construction or the lock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I am writing this: It of course does not hold for nested regions ... but that is too much for now
d1bae2d to
049a99f
Compare
…ling in start and end region functions
049a99f to
6fcf78e
Compare
This PR introduces the foundational infrastructure for energy profiling tools in Kokkos:
Features:
TimingInfostructure and region trackingEnergyProfilerStatesingleton for managing active/completed timing regionstiming_exportmodule for exporting timing data to CSV files with summary statisticsImplementation:
kp_energy_profiler- Core profiling functionalitytiming_utils.hpp/cpp- State management and helper functionstiming_export.hpp/cpp- CSV export and summary generationArchitecture:
This infrastructure provides a clean separation between timing collection, state management, and data export, making it easy for future energy monitoring providers to integrate their functionality. The design supports both synchronous and asynchronous profiling patterns (if combined with #300).