Hard fail if restore-file not found
Created by: suchenzang
We currently start training from scratch if restore-file is not found. This is not ideal: passing a restore file indicates an intention to resume from a previous checkpoint, so a missing checkpoint calls for manual intervention rather than a silent restart.
We should fail the run in these cases so that it does not consume resources that would be wasted on training from scratch again.
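
A minimal sketch of the proposed behavior, assuming the restore file is a path on the local filesystem. The helper name `resolve_restore_file` and its argument are hypothetical and only illustrate the check; they are not taken from the existing checkpoint-loading code.

```python
import os
from typing import Optional


def resolve_restore_file(restore_file: Optional[str]) -> Optional[str]:
    """Return the checkpoint path to resume from, or None to train from scratch.

    Hypothetical helper: raises instead of silently falling back to a
    fresh run when a restore file was requested but does not exist.
    """
    if not restore_file:
        # No restore file was passed, so training from scratch is intended.
        return None

    if not os.path.isfile(restore_file):
        # A restore file was passed but is missing: fail hard so the run
        # does not spend resources retraining from scratch.
        raise FileNotFoundError(
            f"restore-file not found: {restore_file!r}; "
            "refusing to start training from scratch"
        )

    return restore_file
```

Raising before any training setup keeps the failure cheap and visible. A launcher could call this helper once at startup and let the exception terminate the job.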