PyTorch CUDA slow, test_cuda_assert_async (__main__