Overview
We have identified an opportunity to improve the current [audio-to-text](https://github.com/livepeer/go-livepeer/pull/3078/) pipeline on the Livepeer AI Network by enabling [flash-attention](https://arxiv.org/abs/2307.08691/), which will speed up the pipeline significantly and allow faster, near-realtime operation. We are seeking the support of the community and bounty hunters to implement this optimisation quickly so it becomes available to developers working with Livepeer.
Problem
Flash Attention has not yet been enabled for the audio-to-text models in the Livepeer AI Network, so the pipeline runs slower than it could.
Desired Solution
Improved model execution speed for the audio-to-text pipeline.
Bounty Requirements
- Enable memory-efficient Flash Attention on the [existing pipeline](https://github.com/livepeer/ai-worker/blob/main/runner/app/pipelines/audio_to_text.py/).
- Ensure that devices that don't yet support the optimisation safely fall back to the working [Scaled Dot-Product Attention (SDPA)](https://pytorch.org/docs/main/generated/torch.nn.functional.scaled_dot_product_attention/) implementation.
- Create a separate Docker container image, similar to the Segment Anything 2 pipeline image (ai-runner#185), to avoid dependency issues with other pipelines.
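The fallback requirement above could be handled with a small capability check before the model is loaded. The sketch below is an assumption about how the runner might select between the two backends, not code from the pipeline itself; `"flash_attention_2"` and `"sdpa"` are the identifiers that Hugging Face `transformers` accepts for its `attn_implementation` argument, and the compute-capability threshold reflects Flash Attention 2's requirement of an Ampere-class (sm_80) or newer NVIDIA GPU.

```python
def pick_attn_implementation(has_cuda: bool, compute_capability: tuple) -> str:
    """Return the attn_implementation string to pass to from_pretrained().

    Hypothetical helper: checks that the flash-attn package is installed and
    that the device is new enough, otherwise falls back to PyTorch SDPA.
    """
    try:
        import flash_attn  # noqa: F401  # flash-attn package must be installed
    except ImportError:
        return "sdpa"
    # Flash Attention 2 kernels target Ampere (sm_80) and newer GPUs.
    if has_cuda and compute_capability >= (8, 0):
        return "flash_attention_2"
    # Older or non-CUDA devices fall back to scaled_dot_product_attention.
    return "sdpa"
```

In the pipeline this string would then be passed when loading the Whisper model, e.g. `from_pretrained(model_id, attn_implementation=pick_attn_implementation(...))`.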
Applicant Requirements
- Proven experience working with deep learning frameworks such as PyTorch, particularly in implementing attention mechanisms and optimising model performance.
- Strong experience with [Python](https://www.python.org/).
Scope Exclusions
- None. All areas related to the issue are within scope.
Implementation Tips
- Consult the [PyTorch documentation](https://pytorch.org/docs/main/generated/torch.nn.functional.scaled_dot_product_attention/) on flash-attention to better understand how to enable it in the audio-to-text pipeline.
- Validate performance improvements in the Flash Attention-enabled pipeline and ensure proper fallback behaviour on unsupported devices.
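When validating the fallback, it helps to remember that Flash Attention and SDPA compute the same mathematical operation, `softmax(QKᵀ/√d)V`; the fused kernels only avoid materialising the full score matrix. A minimal pure-Python reference of that operation (useful as a correctness oracle for small inputs, not as production code) could look like:

```python
import math

def sdpa(q, k, v):
    """Reference scaled dot-product attention: softmax(q k^T / sqrt(d)) v.

    q, k, v are lists of row vectors. Both the SDPA fallback and the
    flash-attention kernel should produce (numerically close to) this result.
    """
    d = len(q[0])
    out = []
    for qi in q:
        # Attention scores for this query against every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Output row is the weight-averaged value vectors.
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out
```

Comparing this reference against both backends on identical inputs is a cheap way to confirm the fallback produces matching outputs before benchmarking speed.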
How to Apply
- Express Your Interest: Fill out [this form](https://www.notion.so/13f0a34856878045ba5be0218bc28d3f?pvs=21), making sure to specify the bounty you are interested in.
- Wait for Review: Our team will review expressions of interest and select the best candidate.
- Get Assigned: If selected, we'll contact you and assign the bounty to you.
- Start Working: Dive into your task! If you need assistance or guidance, join the discussions in the #developer-lounge channel on our [Discord server](https://discord.gg/livepeer).
- Submit Your Work: Create a pull request in the relevant repository and request a review.
- Notify Us: Ping us on Discord when your pull request is ready for review.
- Receive Your Bounty: We'll arrange the bounty payment once your pull request is approved.
- Gain Recognition: Your valuable contributions will be showcased in our project's [changelog](https://livepeer-ai.productlane.com/changelog).
Contact Information
For questions or clarifications, please contact: [[email protected]](mailto:[email protected])