Occasionally, a group of serial jobs needs to be run on TSCC. Rather than submitting each job individually, you can group the jobs together and submit them all with a single batch script, using a procedure such as the one described below.
Although it's preferable to run parallel codes whenever possible, sometimes that is not cost-effective, or the tasks are simply not parallelizable. In that case, using a procedure like this can save time and effort by organizing multiple serial jobs into a single input file and submitting them all in one step.
The process is illustrated below with a very simple example that uses basic shell commands as the serial tasks. You can substitute your own serial tasks for those commands, modify the scripts accordingly, and run everything from your own home directory.
Note that the /home filesystem on TSCC uses autofs. Under autofs, filesystems are not always visible to the ls command. If you cd to the /home/beta directory, for example, it will be mounted and become accessible. You can also reference it explicitly, e.g. ls /home/beta, to verify its availability.
Autofs is used to minimize the number of mounts visible to active nodes. All users have their own filesystem for their home directory.
The following is an example script that can be modified to suit users with similar needs. This file is named submit.qsub.
#!/bin/sh
#
#PBS -q hotel
#PBS -m e
#PBS -o outfile
#PBS -e errfile
#PBS -V
###################################################################
### Update the below variables with correct values
### Name your job here again
#PBS -N jobname
### Put your node count and time here
#PBS -l nodes=1:ppn=5
#PBS -l walltime=00:10:00
### Put your notification E-mail ID here
#PBS -M username@some.domain
### Set this to the working directory
cd /home/beta/scripts/bundling
####################################################################
## Run my parallel job
/opt/openmpi_pgimx/bin/mpirun -machinefile $PBS_NODEFILE -np 5 \
    ./my_script.pl jobs-list
The above batch script refers to this script file, named my_script.pl.
#!/usr/bin/perl
#
# This script executes one command from a list file,
# based on the current MPI id.
#
# Last modified: Mar/11/2005
#
# Call getid to get the MPI id number.
($myid, $numprocs) = split(/\s+/, `./getid`);
$file_id = $myid;
$file_id++;    # ranks are 0-based; list lines are 1-based

# Open the list file and read down to the line for this rank.
$file_to_use = $ARGV[0];
open(INPUT_FILE, $file_to_use) or &showhelp;
for ($i = 1; $i <= $file_id; $i++) {
    $buf = <INPUT_FILE>;
}
system("$buf");
close(INPUT_FILE);

sub showhelp {
    print "\nUsage: my_script.pl <filename>\n\n";
    print "<filename> should contain a list of executables,";
    print " one-per-line, including the path.\n\n";
    exit(1);
}
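The core of my_script.pl is the mapping from MPI rank to list line: rank N executes line N+1 of the jobs list. The same selection can be sketched with standard shell tools; the file name and rank value below are invented purely for illustration.

```shell
#!/bin/sh
# Demo jobs list (made up for this sketch).
cat > /tmp/jobs-list.demo <<'EOF'
echo task-one
echo task-two
echo task-three
EOF

myid=1                 # pretend this came from getid
line=$((myid + 1))     # ranks are 0-based; list lines are 1-based
cmd=$(sed -n "${line}p" /tmp/jobs-list.demo)
eval "$cmd"            # prints: task-two
```

Each MPI process runs this same logic with a different rank, so the five processes launched by mpirun each pick a different line.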
The batch script refers to this input file, named jobs-list.
hostname; date
hostname; ls
hostname; uptime
uptime
uptime > line-5
Running the above script writes output like the following to the file outfile. Notice that the lines from the different tasks are interleaved rather than sequential, because most of the tasks write to the same shared file.
12:20:53 up 3 days, 5:41, 0 users, load average: 0.92, 1.00, 0.99
compute-0-51.local
compute-0-51.local
Wed Aug 19 12:20:53 PDT 2009
12:20:53 up 3 days, 5:41, 0 users, load average: 0.92, 1.00, 0.99
compute-0-51.local
getid getid.c jobs-list line-5 my_script.pl submit.qsub
The fifth task in jobs-list writes output like this to the file line-5. Because it redirects its output itself, that output does not appear in the shared output file.
12:20:53 up 3 days, 5:41, 0 users, load average: 0.92, 1.00, 0.99
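If interleaved output is a problem, each line of the jobs list can redirect its own output to a separate file, just as the fifth task above does. A hypothetical variant of jobs-list along those lines (the task-N.out file names are illustrative, not part of the TSCC example):

```shell
hostname > task-1.out 2>&1; date >> task-1.out 2>&1
hostname > task-2.out 2>&1; ls >> task-2.out 2>&1
hostname > task-3.out 2>&1; uptime >> task-3.out 2>&1
uptime > task-4.out 2>&1
uptime > task-5.out 2>&1
```

With this variant the shared outfile stays empty of task output, and each task's results can be inspected separately.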
A modification of this procedure that matches the number of scripts to the number of available processors, for cases where more scripts are being run than processors are available, can be obtained from TSCC User Support (member-only list).
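One common way to handle more scripts than processors is to assign list lines to ranks round-robin, so that rank N runs lines N+1, N+1+numprocs, and so on. The following is only a hedged sketch of that idea, not the script TSCC distributes; the demo file, rank, and process count are invented for illustration.

```shell
#!/bin/sh
# Demo jobs list: five one-line commands (invented for this sketch).
cat > /tmp/jobs.demo <<'EOF'
echo a
echo b
echo c
echo d
echo e
EOF

myid=0        # pretend MPI rank (would come from getid)
numprocs=2    # pretend process count

# Select every numprocs-th line, starting at line myid+1, and run each.
result=$(awk -v id="$myid" -v np="$numprocs" \
    'NR % np == (id + 1) % np' /tmp/jobs.demo |
    while read -r cmd; do eval "$cmd"; done)
echo "$result"    # rank 0 of 2 runs lines 1, 3, 5: prints a, c, e
```

Rank 1 would run lines 2 and 4 of the same file, so every line is executed exactly once across the ranks.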
It should also be possible to modify this script to run parallel jobs. Feel free to try it or ask support for help through the Discussion List.