Task #2695

Updated by Szilárd Páll almost 2 years ago

The timing implementation is straightforward but not critical: because of the buggy cudaEvent timing facilities, kernels cannot be timed when work is launched concurrently (i.e. in multiple streams).

Host-side (launch) timing should also be added to avoid leaking time into "Rest".
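
One way to approach the host-side part is sketched below. It is only an illustration under assumptions: `LaunchTimer` and `timedLaunch` are hypothetical names, not existing code from the project, and the sketch uses plain `std::chrono` rather than any project timing utilities. It measures only the wall-clock time the host spends inside the launch call. Since kernel launches are asynchronous, this is not the kernel execution time.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical host-side timer (not an existing project class).
// Accumulates the wall-clock time the CPU spends inside launch calls.
class LaunchTimer
{
public:
    using Clock = std::chrono::steady_clock;

    // Marks the start of one launch region.
    void start()
    {
        assert(!running_);
        running_ = true;
        begin_   = Clock::now();
    }

    // Marks the end of the region and adds the elapsed time to the total.
    void stop()
    {
        assert(running_);
        total_ += Clock::now() - begin_;
        running_ = false;
        ++count_;
    }

    // Total accumulated host-side time in seconds.
    double totalSeconds() const
    {
        return std::chrono::duration<double>(total_).count();
    }

    // Number of completed start/stop pairs.
    std::uint64_t count() const { return count_; }

private:
    Clock::time_point begin_{};
    Clock::duration   total_{ Clock::duration::zero() };
    std::uint64_t     count_   = 0;
    bool              running_ = false;
};

// Wraps any callable that enqueues work (e.g. a kernel launch) so that the
// host-side cost is accounted in its own counter.
template<typename Launch>
void timedLaunch(LaunchTimer& timer, Launch&& launch)
{
    timer.start();
    launch();
    timer.stop();
}
```

In use, the call that enqueues the kernel would be passed as a lambda to `timedLaunch`, and `totalSeconds()` would be reported as a separate launch entry so that this time no longer ends up in "Rest".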
