While there have been efforts by AMD over the years to make it easier to port codebases targeting NVIDIA’s CUDA API to run atop HIP/ROCm, it still requires work on the part of developers. The tooling has improved such as with HIPIFY to help in auto-generating but it isn’t any simple, instant, and guaranteed solution — especially if striving for optimal performance. Over the past two years AMD has quietly been funding an effort though to bring binary compatibility so that many NVIDIA CUDA applications could run atop the AMD ROCm stack at the library level — a drop-in replacement without the need to adapt source code. In practice for many real-world workloads, it’s a solution for end-users to run CUDA-enabled software without any developer intervention. Here is more information on this “skunkworks” project that is now available as open-source along with some of my own testing and performance benchmarks of this CUDA implementation built for Radeon GPUs.

From several years ago you may recall ZLUDA that was for enabling CUDA support on Intel graphics. That open-source project aimed to provide a drop-in CUDA implementation on Intel graphics built atop Intel oneAPI Level Zero. ZLUDA was discontinued due to private reasons but it turns out that the developer behind that (and who was also employed by Intel at the time), Andrzej Janik, was contracted by AMD in 2022 to effectively adapt ZLUDA for use on AMD GPUs with HIP/ROCm. Prior to being contracted by AMD, Intel was considering ZLUDA development. However, they ultimately turned down the idea and did not provide funding for the project.

Andrzej Janik spent the past two years bringing ZLUDA to Radeon GPUs and it works: many CUDA software can run on HIP/ROCm without any modifications — or other processes… Just run the binaries as you normally would while ensuring that the ZLUDA library replacements to CUDA are loaded. For reasons unknown to me, AMD decided this year to discontinue funding the effort and not release it as any software product. But the good news was that there was a clause in case of this eventuality: Janik could open-source the work if/when the contract ended.

Andrzej Janik reached out and provided access to the new ZLUDA implementation for AMD ROCm to allow me to test it out and benchmark it in advance of today’s planned public announcement. I’ve been testing it out for a few days and it’s been a positive experience: CUDA-enabled software indeed running
Read More