Static initialization of shared variables is illegal in CUDA. The problem is that the semantics of how each thread should handle the static initialization of shared memory are undefined in the programming model. What stream should write? What happens if the value is not evenly between threads? How should the compiler emit code for such a case and how to run it?
- .