Explicitly grab first work item before entering asynchronous loop to prevent apparent HW errors when first starting due to stale data on the GPU.