Summary: I think this will be it. For one, the previous test was meaningless: it was returning the thread ID instead of the sample index, and the sample index is the thing whose ordering is actually enforced. Simply raising the number of threads from 4 to 10 made this very obvious. I also think there is a race condition, which may or may not have surfaced: nothing stopped a single worker from fetching multiple batches, which would break the whole ordering logic. I've added a barrier struct so that workers wait until all workers have entered the `get_batch` function before actually doing anything.

Fixes https://github.com/pytorch/pytorch/issues/14002

ezyang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/14038
Differential Revision: D13088132
Pulled By: goldsborough
fbshipit-source-id: 4bded63756c6a49502ee07ef8709a03073e7e05f
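For context, a barrier of the kind described can be sketched as below. This is a minimal single-use barrier built on `std::mutex` and `std::condition_variable`; the struct name, member names, and layout are assumptions for illustration, not the code that actually landed in the PR.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical sketch of a single-use barrier (not the PR's actual code).
// Exactly `count` threads must call wait(); each blocks until all arrive.
struct Barrier {
  explicit Barrier(std::size_t count) : remaining_(count) {}

  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    if (--remaining_ == 0) {
      // Last thread to arrive releases everyone.
      cv_.notify_all();
    } else {
      // Earlier arrivals sleep until the count reaches zero.
      cv_.wait(lock, [this] { return remaining_ == 0; });
    }
  }

 private:
  std::size_t remaining_;
  std::mutex mutex_;
  std::condition_variable cv_;
};
```

With a `Barrier` shared across workers and sized to the worker count, each worker would call `barrier.wait()` at the top of its `get_batch` implementation, so no worker can pull a second batch before every worker has pulled its first, which is what makes the ordering assertion in the test sound.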