Knowledge Base

Article ID: 1589 - Last Modified:

I submitted a Glide docking job to our cluster a couple of weeks ago, and we just had a system failure that unmounted our NFS disks and shut down the queuing system on our cluster. Is there a way to restart the jobs?

Distributed Glide jobs can be restarted at a coarse level: that is, incomplete subjobs have to be started from the beginning, but any completed subjobs do not have to be rerun. For a given Glide job with multiple subjobs, first check to see what state Job Control thinks the job is in:

$SCHRODINGER/jobcontrol -list -c JobId

where JobId is the Schrodinger Job ID of the Glide job, visible in the Monitor panel or near the top of the jobname.log file. You may find that some subjobs are in 'stranded' status, which happens when Job Control on the launch machine loses track of the superintending Job Control processes for the backends on the compute nodes. If there are 'running' subjobs when you know they really aren't running, try

$SCHRODINGER/jobcontrol -ping -c JobId

to have Job Control refresh their statuses. Next, try

$SCHRODINGER/jobcontrol -recover -c JobId

to see if Job Control can recover any files from the compute nodes. It could be that the Glide backends on the compute nodes continued running and were able to produce pv or lib files.

Once the job has been cleaned up from Job Control's perspective and the main Glide driver job is in 'died' status, you can try to restart the job:

$SCHRODINGER/glide -RESTART jobname.in

Glide will rerun any 'died' or 'killed' subjobs, plus any subjobs that didn't get run in the original job, and then combine all the old and new results together.
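If you need to repeat this procedure for several jobs, the steps above could be wrapped in a small shell script. The following is only a sketch: the function names and the DRY_RUN variable are hypothetical helpers introduced here, not part of the Schrodinger software, and the script uses only the commands and flags shown in this article. It runs the -ping step unconditionally, which is an assumption; in the procedure above that step is needed only when subjobs are wrongly reported as 'running'. The restart is kept as a separate function so that you can inspect the final -list output and confirm that the driver job is in 'died' status first.

```shell
#!/bin/sh
# Sketch of the recovery procedure; helper names and DRY_RUN are hypothetical.

# Run a command, or just print it when DRY_RUN=1 (useful for checking the
# sequence before touching real jobs).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Clean up a job from Job Control's perspective.
# $1 = Schrodinger Job ID of the Glide job
cleanup_glide_job() {
    job_id=$1
    run "$SCHRODINGER/jobcontrol" -list -c "$job_id"      # check current state
    run "$SCHRODINGER/jobcontrol" -ping -c "$job_id"      # refresh subjob statuses
    run "$SCHRODINGER/jobcontrol" -recover -c "$job_id"   # retrieve files from compute nodes
    run "$SCHRODINGER/jobcontrol" -list -c "$job_id"      # show state after cleanup
}

# Restart the job once the driver job is confirmed to be in 'died' status.
# $1 = Glide input file, e.g. jobname.in
restart_glide_job() {
    input_file=$1
    run "$SCHRODINGER/glide" -RESTART "$input_file"
}
```

After sourcing the script, you might call cleanup_glide_job with the Job ID, review the statuses it prints, and only then call restart_glide_job jobname.in from the directory that contains the input file. Setting DRY_RUN=1 beforehand prints the commands without running them.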

Keywords: Glide, restart

Related Articles:

#656: I started a job to screen a million molecules, and it failed before finishing. How can I recover my results and finis...
#1054: How can I restart a Glide docking calculation for which some subjobs failed?
#1634: My serial Glide job was interrupted. Can I resume the job without losing the results of ligands already docked in the...

To ask a question or get help, please submit a support ticket or email us at help@schrodinger.com.