To use Gaussian 09 (g09) on hpc2, please note the following.
- On hpc2, G09 is NOT supported across nodes with Slurm, so a single Gaussian job runs on one node and can use at most 24 CPUs. The -N option in the Slurm script should therefore be 1. The %NProcShared value in the Gaussian input file should be the same as the -n option in the Slurm script and must not be greater than 24.
- The local disk is used as the scratch directory for g09 jobs to save disk space and improve disk I/O performance. The checkpoint file will be copied from the local disk to the working directory either when the job completes or when the running time exceeds the walltime (timeout). The checkpoint file will be named in the form $HOSTNAME.<checkpoint file>, where HOSTNAME is the name of the compute node running the job.
  - If the job completes within the walltime, the checkpoint file will be found in the working directory.
  - If the job cannot complete within the walltime, the checkpoint file will be found in <SLURM_JOBID>-chkfiledir, a subdirectory of the working directory, where <SLURM_JOBID> is the job ID.
- Please place your Gaussian input file and job submission script in the working directory. If the job needs an existing checkpoint file, place it in the same directory as well.
- Users must be in the special system group “g09” in order to use all the features described on this page. To check whether you are in that group, type “groups”, which lists all the groups you belong to. If you are in the group, there will be a “g09” entry in the output. If you would like to use Gaussian 09 and are not yet in the “g09” group, please email hpcadmin@ust.hk for assistance.
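The group check described above can also be scripted. The following is only a minimal sketch: the helper name in_group is made up for illustration and is not part of any hpc2 tooling.

```shell
# Succeed if group $1 appears in the space-separated group list $2.
# in_group is a hypothetical helper, not a site-provided command.
in_group() {
    for g in $2; do
        [ "$g" = "$1" ] && return 0
    done
    return 1
}

# "groups" prints the groups of the current user on one line.
if in_group g09 "$(groups)"; then
    echo "You are in the g09 group."
else
    echo "You are not in the g09 group; email hpcadmin@ust.hk for assistance."
fi
```

Matching whole words matters here: a plain substring search would also accept a group whose name merely contains "g09".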
The sample job submission script with Slurm, hpc2_slurm_g09.txt, can be downloaded from here.
Assuming you have a g09 working directory in your hpc2 home, place your Gaussian input file, hpc2_slurm_g09.txt and any existing checkpoint file (if applicable) there. In the script, update the names of the input, output and checkpoint files and the partition in which you want to run the job, then submit the job as follows.
sbatch hpc2_slurm_g09.txt
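Before submitting, it may be worth confirming that the %NProcShared value in the input file stays within the 24-CPU limit. The sketch below is not part of the site's scripts. The function names nproc_shared and check_nproc and the input file name water.com are hypothetical, and it assumes the keyword is written as %NProcShared=N at the start of a line.

```shell
# Print the first %NProcShared value found in a Gaussian input file.
# grep -i is used so that %nprocshared and %NProcShared both match.
# (nproc_shared and check_nproc are hypothetical helper names.)
nproc_shared() {
    grep -i '^%nprocshared=' "$1" | head -n 1 | cut -d= -f2 | tr -d '[:space:]'
}

# Succeed only if a value was found and it does not exceed 24.
check_nproc() {
    n=$(nproc_shared "$1")
    [ -n "$n" ] && [ "$n" -le 24 ]
}

# Usage (water.com is a hypothetical input file name):
#   check_nproc water.com && sbatch hpc2_slurm_g09.txt
```

The check does not read the -n option from the Slurm script, so that value still has to be compared by hand.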
To check your job status,
squeue -u $USER
To cancel the job,
scancel <jobid>
Special Case: cancel the job and collect the checkpoint file
Sometimes the user may want to cancel the running job but collect the checkpoint file for future use. This can be done manually:
- Update the job ID, checkpoint file name and email address in the sample bash script hpc2_extra_slurm_g09.txt and execute it.
sh hpc2_extra_slurm_g09.txt
The output will show the time arrangement for collecting the checkpoint file.
- Cancel the running job if there is no error message in the output of the previous step.
scancel <jobid>
The job is then cancelled and the checkpoint file is copied to <SLURM_JOBID>-chkfiledir. An email notification will be sent when the copy is done.
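Since the checkpoint file can end up either in the working directory or in <SLURM_JOBID>-chkfiledir, a small helper can look in both places. This is a sketch under assumptions: the function name find_chk is hypothetical, and it assumes the copied file carries the $HOSTNAME. prefix in both locations.

```shell
# Print the path of the first file named <something>.<chk name>, looking
# in the working directory first and then in <jobid>-chkfiledir.
# find_chk is a hypothetical helper; arguments: workdir, job ID, chk name.
find_chk() {
    workdir=$1
    jobid=$2
    chk=$3
    for dir in "$workdir" "$workdir/$jobid-chkfiledir"; do
        for f in "$dir"/*."$chk"; do
            # An unmatched glob stays literal, so test that the file exists.
            if [ -f "$f" ]; then
                echo "$f"
                return 0
            fi
        done
    done
    return 1
}

# Usage (12345 and water.chk are hypothetical values):
#   find_chk "$PWD" 12345 water.chk
```

Requiring a prefix before the checkpoint name means an original checkpoint file placed in the working directory before the run is not mistaken for the copied one.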