CUDA - What happens when multiple kernels are sent to the device to be executed?


Suppose I send two consecutive kernel calls to the device. Will it wait for the first one to complete, or will it execute them concurrently? If they are executed in parallel, can they interfere with each other, for instance in memory access? What paradigm is used for such a case in CUDA?

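Kernels launched into the same stream run in launch order, so the second one waits for the first. Two consecutive kernel launches to the same CUDA device can run concurrently if:

  1. They are launched into the same CUDA context.
  2. They are executed on different CUDA streams.
  3. The device supports concurrent kernel execution (compute capability 2.0 and later).
  4. There are sufficient resources (registers, shared memory, thread blocks) to support thread blocks of both kernels simultaneously.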
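
The conditions above can be sketched as follows. This is a minimal illustration, not a benchmark: the kernel `scale`, the helper `run_two_streams`, the buffer sizes, and the use of device 0 are all assumptions made for the example. It launches the same kernel on two separate buffers in two different streams, which makes the launches eligible to overlap. Whether they actually overlap depends on the device and on the free resources at launch time.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies every element of a buffer by `factor`.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Launches two independent kernels in two streams and returns the number
// of elements that do not hold the expected value afterwards.
int run_two_streams() {
    const int n = 1 << 16;
    const size_t bytes = n * sizeof(float);
    std::vector<float> h_a(n, 1.0f), h_b(n, 1.0f);

    float *d_a = nullptr, *d_b = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMemcpy(d_a, h_a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b.data(), bytes, cudaMemcpyHostToDevice);

    // Condition 2: each kernel gets its own non-default stream.
    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    // The kernels touch disjoint buffers, so no ordering between them is needed.
    scale<<<blocks, threads, 0, s1>>>(d_a, n, 2.0f);
    scale<<<blocks, threads, 0, s2>>>(d_b, n, 3.0f);

    // Wait for all work in both streams before reading the results.
    cudaDeviceSynchronize();
    cudaMemcpy(h_a.data(), d_a, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(h_b.data(), d_b, bytes, cudaMemcpyDeviceToHost);

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (h_a[i] != 2.0f) ++mismatches;
        if (h_b[i] != 3.0f) ++mismatches;
    }

    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(d_a);
    cudaFree(d_b);
    return mismatches;
}

int main() {
    // Condition 3: ask the runtime whether the device can overlap kernels.
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    printf("concurrentKernels = %d\n", prop.concurrentKernels);
    printf("mismatches = %d\n", run_two_streams());
    return 0;
}
```

If the first kernel's grid is large enough to occupy the whole device, the second kernel will typically start only as blocks of the first one retire, which is condition 4 in practice. A profiler timeline is the usual way to confirm whether any overlap took place.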

For more information, see this section in the CUDA C Programming Guide.

As sgar91 commented, if these kernels share global memory, it is the programmer's responsibility to write a correctly synchronized program that avoids race conditions. If the two kernels only read the same memory, there can be no race condition.
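
One way to get that synchronization when two kernels in different streams share a buffer is to order them with an event. The sketch below is only an illustration under assumptions: the kernels `produce` and `consume` and the helper `run_ordered_streams` are hypothetical names, and the event approach is one option among several (launching both kernels into the same stream would also serialize them). The second stream is made to wait until the first kernel has finished writing.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical producer: writes buf[i] = i.
__global__ void produce(int *buf, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = i;
}

// Hypothetical consumer: reads the producer's output and writes out[i] = in[i] + 1.
__global__ void consume(const int *in, int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] + 1;
}

// Runs producer and consumer in different streams, ordered by an event.
// Returns the number of output elements with an unexpected value.
int run_ordered_streams() {
    const int n = 1 << 16;
    const size_t bytes = n * sizeof(int);

    int *d_buf = nullptr, *d_out = nullptr;
    cudaMalloc(&d_buf, bytes);
    cudaMalloc(&d_out, bytes);

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // The event is used only for ordering, so timing is disabled.
    cudaEvent_t produced;
    cudaEventCreateWithFlags(&produced, cudaEventDisableTiming);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    produce<<<blocks, threads, 0, s1>>>(d_buf, n);
    // Mark the point in s1 at which the shared buffer is fully written.
    cudaEventRecord(produced, s1);
    // Work queued in s2 after this call will not start before that point.
    cudaStreamWaitEvent(s2, produced, 0);
    consume<<<blocks, threads, 0, s2>>>(d_buf, d_out, n);

    // s2 depends on s1, so waiting for s2 covers both kernels.
    cudaStreamSynchronize(s2);

    std::vector<int> h_out(n);
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (h_out[i] != i + 1) ++mismatches;
    }

    cudaEventDestroy(produced);
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(d_buf);
    cudaFree(d_out);
    return mismatches;
}

int main() {
    printf("mismatches = %d\n", run_ordered_streams());
    return 0;
}
```

Without the `cudaEventRecord` / `cudaStreamWaitEvent` pair, `consume` could start reading `d_buf` while `produce` is still writing it, which is exactly the kind of race the answer warns about.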

