3D Medical Imaging - 12x faster


Machine config: 8 x NVIDIA A100-SXM4-40GB


Model architecture: UNet


  • With DLOP:

    • Configuration: {batch_size: 1028, prefetch_factor: 8, num_workers: 16}

    • Average throughput: 242.69 imgs/sec


  • Without DLOP:

    • Configuration: {batch_size: 1028, prefetch_factor: 2, num_workers: 32}

    • Average throughput: 18.87 imgs/sec
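The source does not say how the average throughput figures were measured. As a minimal sketch only, one common approach is to divide the total number of items processed by the wall-clock time of the run. The names `average_throughput`, `step`, and `clock` below are hypothetical and are not part of DLOP.

```python
import time


def average_throughput(batches, step, clock=time.perf_counter):
    """Return items processed per second across all batches.

    Hypothetical helper: `batches` is any iterable of sized batches,
    `step` is the per-batch training function, and `clock` returns
    the current time in seconds.
    """
    total_items = 0
    start = clock()
    for batch in batches:
        step(batch)                 # run one training step on this batch
        total_items += len(batch)   # count every item in the batch
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return total_items / elapsed
```

Passing the clock as a parameter keeps the helper easy to check with a fake timer. In a real run the default `time.perf_counter` would be used, and warm-up iterations would usually be excluded from the measurement.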



Speedup: roughly 12x (242.69 / 18.87 ≈ 12.9) over regular training without DLOP.
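The speedup is the ratio of the two average throughputs reported above. A short sketch of that arithmetic follows. The function name `speedup` is illustrative, and the calculation assumes both figures use the same unit.

```python
def speedup(optimized: float, baseline: float) -> float:
    """Ratio of optimized throughput to baseline throughput."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return optimized / baseline


# Throughput figures taken from the results above.
ratio = speedup(242.69, 18.87)
print(f"{ratio:.2f}x")  # prints "12.86x"
```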


