Is adding an optimization pass for AMD OpenCL different from writing an LLVM pass, as described in Writing an LLVM Pass? What additional knowledge do I need for this? Do we need additional libraries to optimize an OpenCL kernel?
I got an answer to this on the AMD Forums (updated link).