Set-up for High Performance Computing on Windows
i) Download and install WinSCP. WinSCP can be downloaded from https://winscp.net/eng/download.php.
ii) Download PuTTY and install it in C:\Program Files (x86). It can be downloaded from http://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html.
iii) Open WinSCP.
iv) Enter hostname “halstead.rcac.purdue.edu”.
v) Enter your username and password.
vi) Go to your directory on Halstead “scratch/Halstead/(first letter of username)/username”.
vii) Make a new directory “Abaqus”.
viii) Create two text files, abaqus_v6.env and script.sh.
ix) abaqus_v6.env should contain the following three lines:
mp_mode = MPI
mp_mpirun_options = '-UDAPL '
memory = '115 gb'
x) script.sh should contain the following lines:
#PBS -l walltime=30:00:00,nodes=3:ppn=8
#PBS -q standby
#PBS -j oe
#PBS -N abaqus
#PBS -m e
#PBS -n
# NP=20 #!How many procs/nodes do I have?
source /etc/profile
#module load unsupported
#module load abaqus/6.10-1
module load abaqus/6.14-6
#export MPI_REMSH=ssh
cd $PBS_O_WORKDIR
#abq6101 input=RR1000_80_80_80_res2 \
abq6146 input=6006024R \
job=6006024R cpus=24 interactive scratch=${PWD}
date #! print date and time
xi) Transfer abaqus_v6.env and script.sh to the “Abaqus” directory.
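If you prefer a command line to a text editor, the steps for abaqus_v6.env can be sketched as below. This is only an illustration: WORKDIR is a hypothetical placeholder that defaults to a temporary directory so the sketch can be tried safely, and you would point it at your own scratch directory on Halstead. The three settings written to the file are the ones listed in step ix.

```shell
#!/bin/bash
# Sketch only: WORKDIR is a hypothetical placeholder. It defaults to a
# temporary directory; set it to your own scratch directory for real use.
WORKDIR="${WORKDIR:-$(mktemp -d)}"

# Create the Abaqus directory if it does not exist yet.
mkdir -p "$WORKDIR/Abaqus"

# Write the three settings from step ix. The quoted 'EOF' stops the shell
# from expanding anything inside the file body.
cat > "$WORKDIR/Abaqus/abaqus_v6.env" <<'EOF'
mp_mode = MPI
mp_mpirun_options = '-UDAPL '
memory = '115 gb'
EOF

# Show where the file went and what it contains, for a check by eye.
echo "Wrote $WORKDIR/Abaqus/abaqus_v6.env:"
cat "$WORKDIR/Abaqus/abaqus_v6.env"
```

Note the straight single quotes around the values; curly quotes copied from a word processor are a different character and may not be accepted.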
Run Jobs on Halstead
i) Open WinSCP.
ii) Enter hostname “halstead.rcac.purdue.edu”.
iii) Enter your username and password.
iv) Go to the Abaqus folder in your account directory: “scratch/Halstead/(first letter of username)/username/Abaqus”.
v) Transfer the Abaqus input file to the “Abaqus” directory.
vi) Open script.sh and change the walltime. Halstead kills the job if its running time exceeds the walltime.
vii) Change the number of nodes (machines). Maximum: 20, Recommended: 10.
viii) Enter ppn (processors per node). Maximum: 20.
ix) Set the input file name and the job name in the abq6146 command, as described in the next two steps.
x) Enter the Abaqus input file name (in the third-to-last line: abq6146 input=(input file name)).
xi) Enter the job name (job=(job name)).
xii) Enter the number of cpus, which must equal nodes × ppn (for example, nodes=3 and ppn=8 give cpus=24).
xiii) Open a console (PuTTY).
xiv) Type “cd /scratch/Halstead/(first letter of your username)/username/Abaqus”.
xv) Type “qsub -q cmsc script.sh” to submit a job.
xvi) Type “qstat -u username” to check the job status.
xvii) Type “qdel (job id)” to cancel a job. The job id is the job number assigned by Halstead.
xviii) Transfer your result files to a local directory for post-processing.
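The edits to script.sh in steps vi to xii have to stay consistent with each other, in particular cpus = nodes × ppn. As a minimal sketch, the following prints the two script.sh lines that depend on those values. NODES, PPN, INPUT and JOB are example values taken from the sample script above, not required names; replace them with your own.

```shell
#!/bin/bash
# Sketch only: example values from the sample script.sh; replace with yours.
NODES=3            # number of nodes (machines)
PPN=8              # processors per node
INPUT=6006024R     # Abaqus input file name as written in script.sh
JOB=6006024R       # job name

# cpus must equal nodes * ppn.
CPUS=$((NODES * PPN))

# Print the two script.sh lines that depend on these values.
# \${PWD} is escaped so it is printed literally, as it appears in script.sh.
echo "#PBS -l walltime=30:00:00,nodes=${NODES}:ppn=${PPN}"
echo "abq6146 input=${INPUT} job=${JOB} cpus=${CPUS} interactive scratch=\${PWD}"
```

With the values shown, CPUS evaluates to 24, which matches cpus=24 in the sample script.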
Please consult clusters101.pdf, High_Performance_Computing_Resources.pdf, and rcac_cluster_reference.pdf (all uploaded by Wenbin Yu) for more details.
Additional Comments
- When a job requests cores on more than one node, the cluster reserves each of those nodes entirely for that job. A job submitted with 8 nodes and 6 cores per node therefore ties up 160 cores (8 nodes × 20 cores per node), which are locked out of our queue, even though only 48 cores (8 × 6) do any work. For that case it would be much better to submit a job with 2 nodes and 20 cores per node. At the very least, we should not submit multi-node jobs that use fewer than 20 cores per node.
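The arithmetic behind this comment can be checked with a short sketch. It assumes 20 cores per node (the ppn maximum given in the steps above) and that every node of a multi-node request is reserved whole; report_cores is a hypothetical helper name used only for illustration.

```shell
#!/bin/bash
# Sketch only: CORES_PER_NODE=20 is the ppn maximum stated in the steps above.
CORES_PER_NODE=20

# For a multi-node request, print the cores doing work, the cores reserved
# (whole nodes), and the cores left idle.
report_cores() {
    local nodes=$1 ppn=$2
    local used=$((nodes * ppn))
    local locked=$((nodes * CORES_PER_NODE))
    echo "nodes=${nodes} ppn=${ppn}: used=${used} locked=${locked} idle=$((locked - used))"
}

report_cores 8 6     # the wasteful request described above
report_cores 2 20    # the suggested alternative
```

The first call reports 48 cores used out of 160 locked; the second reports 40 used out of 40 locked, so no reserved core sits idle.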
- The number of nodes affects the memory available to your job. With one node, the usable memory is 128 GB minus system usage; with two or more nodes, the total usable memory is multiplied accordingly. Neither the cluster nor Abaqus on the cluster seems to manage memory smartly: as soon as one node runs out of memory, the job is killed. Using more nodes reduces the memory required on each node, which makes the out-of-memory error I encountered less likely.
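As a rough illustration of this point, the sketch below estimates the smallest number of nodes for a given total memory need. It rests on two assumptions that may not hold in practice: that the job's memory is spread evenly across nodes, and that each node can give Abaqus 115 GB (the memory setting in abaqus_v6.env). min_nodes and TOTAL_GB are hypothetical names, and 300 GB is a made-up example value.

```shell
#!/bin/bash
# Sketch only. Assumptions: memory need is spread evenly across nodes, and
# each node offers PER_NODE_GB to Abaqus (115, as set in abaqus_v6.env).
PER_NODE_GB=115

# Smallest node count whose even per-node share fits within PER_NODE_GB.
min_nodes() {
    local total_gb=$1
    echo $(( (total_gb + PER_NODE_GB - 1) / PER_NODE_GB ))   # ceiling division
}

TOTAL_GB=300   # hypothetical estimate of the whole job's memory need
NODES=$(min_nodes "$TOTAL_GB")
echo "estimated need ${TOTAL_GB} GB -> at least ${NODES} nodes"
```

Treat the result as a lower bound only; if the load is uneven, a single node can still run out of memory and the job will be killed.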