Testing pools that are down can be slow to respond, which holds up the watchdog thread before it reaches its device testing code. This can lead to false detection of a GPU not checking in, and can substantially delay auto GPU/auto fan management, leading to overheating. Move pool watching to its own thread.
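
For illustration only, here is a minimal sketch of the separation described above, assuming POSIX threads. Every name in it (struct pool, probe_pool, watchpool_thread, watchdog_thread, run_demo) is a hypothetical stand-in rather than the project's actual code, and the slow pool is simulated with a sleep. The point is that a blocking probe only stalls the pool-watching thread, so the watchdog loop keeps reaching its device checks.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* All names below are hypothetical stand-ins, not the project's real ones. */
struct pool {
    atomic_bool idle;   /* true while the pool is considered down */
    int probe_delay_ms; /* simulated time a probe of this pool blocks */
};

static struct pool slow_pool = { true, 500 };
static atomic_bool stop_pool_watch = false;
static atomic_int probes_done = 0;   /* completed pool probes */
static atomic_int device_checks = 0; /* completed watchdog rounds */

static void sleep_ms(int ms)
{
    struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Simulated probe: blocks for the pool's delay, then reports it alive. */
static bool probe_pool(struct pool *p)
{
    sleep_ms(p->probe_delay_ms);
    return true;
}

/* Pool watching in its own thread: a slow probe only blocks this loop. */
static void *watchpool_thread(void *arg)
{
    struct pool *p = arg;

    while (!atomic_load(&stop_pool_watch)) {
        if (atomic_load(&p->idle)) {
            bool alive = probe_pool(p);

            atomic_fetch_add(&probes_done, 1);
            if (alive)
                atomic_store(&p->idle, false);
        }
        sleep_ms(10);
    }
    return NULL;
}

/* Watchdog loop: no pool testing here, so device checks run on schedule. */
static void *watchdog_thread(void *arg)
{
    int rounds = *(int *)arg;

    for (int i = 0; i < rounds; i++) {
        /* Device health checks and fan/clock management would go here. */
        atomic_fetch_add(&device_checks, 1);
        sleep_ms(5);
    }
    return NULL;
}

/* Runs both threads; reports how far each got when the watchdog finished. */
void run_demo(int rounds, int *checks, int *probes_at_watchdog_exit)
{
    pthread_t pool_thr, dog_thr;
    int rc;

    rc = pthread_create(&pool_thr, NULL, watchpool_thread, &slow_pool);
    assert(rc == 0);
    rc = pthread_create(&dog_thr, NULL, watchdog_thread, &rounds);
    assert(rc == 0);
    (void)rc;

    pthread_join(dog_thr, NULL);
    *checks = atomic_load(&device_checks);
    *probes_at_watchdog_exit = atomic_load(&probes_done);

    atomic_store(&stop_pool_watch, true);
    pthread_join(pool_thr, NULL);
}
```

Calling run_demo(5, &checks, &probes) from a test harness should show the watchdog finishing all of its rounds while the simulated 500 ms probe is still in flight. Build with -pthread.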