AMD Continues Enhancing AMDGPU/AMDKFD Drivers For Checkpoint/Restore
[AMD] 5 Minutes Ago
CRIU, short for Checkpoint/Restore in Userspace, makes it possible to freeze a running container or application, save its state to disk, and later restore that running workload. A few years ago we saw AMD working on checkpoint/restore support for running ROCm workloads. In what appears to be the first work on the matter in a while for the AMDGPU/AMDKFD kernel drivers, there are some new CRIU elements coming for Linux 6.18.
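For readers unfamiliar with the freeze/save/restore cycle, here is a minimal sketch of how the `criu` command lines for a plain CPU process might be assembled. The helper names, the PID, and the image directory are hypothetical and only for illustration; the sketch builds the argument lists without running them, since CRIU needs root privileges and a suitable kernel. ROCm workloads additionally depend on CRIU's AMDGPU plugin, which is not covered here.

```python
import shlex


def criu_dump_cmd(pid: int, images_dir: str, leave_running: bool = False) -> list[str]:
    """Build a `criu dump` command line (hypothetical helper, not part of CRIU)."""
    cmd = [
        "criu", "dump",
        "--tree", str(pid),          # checkpoint the process tree rooted at this PID
        "--images-dir", images_dir,  # directory where the checkpoint images are written
        "--shell-job",               # the process was started from an interactive shell
    ]
    if leave_running:
        # Keep the original process alive after the checkpoint is taken.
        cmd.append("--leave-running")
    return cmd


def criu_restore_cmd(images_dir: str) -> list[str]:
    """Build the matching `criu restore` command line (hypothetical helper)."""
    return ["criu", "restore", "--images-dir", images_dir, "--shell-job"]


if __name__ == "__main__":
    # Assumed values: PID 1234 and /tmp/ckpt are placeholders.
    print(shlex.join(criu_dump_cmd(1234, "/tmp/ckpt")))
    print(shlex.join(criu_restore_cmd("/tmp/ckpt")))
```

The resulting lists could be passed to `subprocess.run` on a system where CRIU is installed and the caller has the required privileges.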