Next: condor_userprio Up: 5. Command Reference Manual Previous: condor_status

condor_submit

Queue jobs for execution on remote machines

Synopsis

condor_submit [-] [-v] [-n schedd_name] [-r schedd_name] submit-description file

Description

condor_submit is the program for submitting jobs to Condor. condor_submit requires a submit-description file, which contains commands to direct the queuing of jobs. One description file may contain specifications for queuing many Condor jobs at once. All jobs queued by a single invocation of condor_submit must share the same executable, and are referred to as a "job cluster". It is advantageous to submit multiple jobs as a single cluster.

SUBMIT DESCRIPTION FILE COMMANDS

Each Condor job description file describes one cluster of jobs to be placed in the Condor execution pool. All jobs in a cluster must share the same executable, but they may have different input and output files, different program arguments, and so on. The submit-description file is then passed to condor_submit on the command line.

The submit-description file must contain one executable command and at least one queue command. All of the other commands have default actions.

The commands which can appear in the submit-description file are:

executable = <name>
The name of the executable file for this job cluster. Only one executable command may be present in a description file. If submitting into the Standard Universe, which is the default, then the named executable must have been re-linked with the Condor libraries (such as via the condor_compile command). If submitting into the Vanilla Universe, then the named executable need not be re-linked and can be any process which can run in the background (shell scripts work fine as well).

input = <pathname>
Condor assumes that its jobs are long-running, and that the user will not wait at the terminal for their completion. Because of this, the standard files which normally access the terminal (stdin, stdout, and stderr) must refer to files. Thus, the filename specified with input should contain any keyboard input the program requires (i.e., this file becomes stdin). If not specified, the default value of /dev/null is used.

output = <pathname>
The output filename will capture any information the program would normally write to the screen (i.e. this file becomes stdout). If not specified, the default value of /dev/null is used. More than one job should not use the same output file, since this will cause one job to overwrite the output of another.

error = <pathname>
The error filename will capture any error messages the program would normally write to the screen (i.e. this file becomes stderr). If not specified, the default value of /dev/null is used. More than one job should not use the same error file, since this will cause one job to overwrite the errors of another.
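As a sketch of how input, output, and error are typically set together, the fragment below gives each job in a cluster its own output and error files so that no job overwrites another's. The file names are illustrative only, and $(Process) is the pre-defined macro described under Macros:
        input  = job.in
        output = job.out.$(Process)
        error  = job.err.$(Process)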

arguments = <argument_list>
List of arguments to be supplied to the program on the command line.

initialdir = <directory-path>
Used to specify the current working directory for the Condor job. Should be a path to a preexisting directory. If not specified, condor_submit will automatically insert the user's current working directory at the time condor_submit was run as the value for initialdir.

requirements = <ClassAd Boolean Expression>
The requirements command is a boolean ClassAd expression which uses C-like operators. In order for any job in this cluster to run on a given machine, this requirements expression must evaluate to true on the given machine. For example, to require that whatever machine executes your program has at least 64 megabytes of RAM and a MIPS performance rating greater than 45, use:
        requirements = Memory >= 64 && Mips > 45
Only one requirements command may be present in a description file. By default, condor_submit appends the following clauses to the requirements expression:
1. Arch and OpSys are set equal to the Arch and OpSys of the submit machine. In other words: unless you request otherwise, Condor will give your job machines with the same architecture and operating system version as the machine running condor_submit.
2. Disk > ExecutableSize. To ensure there is enough disk space on the target machine for Condor to copy over your executable.
3. VirtualMemory >= ImageSize. To ensure the target machine has enough virtual memory to run your job.
4. If Universe is set to Vanilla, FileSystemDomain is set equal to the submit machine's FileSystemDomain.
You can view the requirements of a job which has already been submitted (along with everything else about the job ClassAd) with the command condor_q -l; see the command reference for condor_q. Also, see the Condor Users Manual for complete information on the syntax and available attributes that can be used in the ClassAd expression.
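For instance, a job built for a platform other than the submit machine's might name Arch and OpSys itself. This is only a sketch: it reuses the attribute values from Example 2, and it assumes, following the "unless you request otherwise" wording above, that naming these attributes takes the place of the defaults:
        requirements = Arch == "SGI" && OpSys == "IRIX6" && Memory >= 64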

rank = <ClassAd Float Expression>
A ClassAd Floating-Point expression that states how to rank machines which have already met the requirements expression. Essentially, rank expresses preference. A higher numeric value equals better rank. Condor will give the job the machine with the highest rank. For example,
        requirements = Memory > 60
        rank = Memory
asks Condor to find all available machines with more than 60 megabytes of memory and give the job the one with the most memory. See the Condor Users Manual for complete information on the syntax and available attributes that can be used in the ClassAd expression.

priority = <priority>
Condor job priorities range from -20 to +20, with 0 being the default. Jobs with higher numerical priority will run before jobs with lower numerical priority. Note that this priority is on a per-user basis; setting the priority will determine the order in which your own jobs are executed, but will have no effect on whether or not your jobs will run ahead of another user's jobs.
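A minimal sketch, with illustrative argument values, in which the second run is given a higher priority than the first and so is ordered ahead of it among the same user's jobs:
        arguments = 15 2000
        priority  = 0
        queue
        arguments = 30 2000
        priority  = 10
        queue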

notification = <when>
Owners of Condor jobs are notified by email when certain events occur. If when is set to Always, the owner will be notified whenever the job is checkpointed, and when it completes. If when is set to Complete (the default), the owner will be notified when the job terminates. If when is set to Error, the owner will only be notified if the job terminates abnormally. Finally, if when is set to Never, the owner will not be mailed, regardless of what happens to the job.

notify_user = <email-address>
Used to specify the email address to use when Condor sends email about a job. If not specified, Condor will default to using:
        job-owner@UID_DOMAIN
where UID_DOMAIN is specified by the Condor site administrator. If UID_DOMAIN has not been specified, Condor will send the email to:
        job-owner@submit-machine-name
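As a sketch, the two notification-related commands might be combined so that mail is sent only on abnormal termination, and to an address other than the default. The address shown is a placeholder, not a real account:
        notification = Error
        notify_user  = jdoe@example.com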

getenv = <True | False>
If getenv is set to True, then condor_submit will copy all of the user's current shell environment variables at the time of job submission into the job ClassAd. The job will therefore execute with the same set of environment variables that the user had at submit time. Defaults to False.

environment = <parameter_list>
List of environment variables of the form:
        <parameter> = <value>
Multiple environment variables can be specified by separating them with a semicolon (";"). These environment variables will be placed into the job's environment before execution. The length of all characters specified in the environment is currently limited to 4096 characters.
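As a rough sketch of the form described above, the following line sets two variables. The variable names and values are illustrative only:
        environment = LANG=C;TZ=UTC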

log = <pathname>
Use log to specify a filename where Condor will write a log file of what is happening with this job cluster. For example, Condor will log into this file when and where the job begins running, when the job is checkpointed and/or migrated, when the job completes, etc. Most users find specifying a log file to be very handy; its use is recommended. If no log entry is specified, Condor does not create a log for this cluster.

universe = <vanilla | standard | pvm | scheduler>
Specifies which Condor Universe to use when running this job. The Condor Universe specifies a Condor execution environment. The standard Universe is the default, and tells Condor that this job has been re-linked via condor_compile with the Condor libraries and therefore supports checkpointing and remote system calls. The vanilla Universe is an execution environment for jobs which have not been linked with the Condor libraries. Note: use the vanilla Universe to submit shell scripts to Condor. The pvm Universe is for a parallel job written with PVM 3.3, and scheduler is for a job that should act as a metascheduler. See the Condor User's Manual for more information about using Universe.
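A minimal sketch of submitting a shell script through the vanilla Universe, as the note above suggests. The script and output file names are hypothetical:
        universe   = vanilla
        executable = myscript.sh
        output     = myscript.out
        queue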

image_size = <size>
This command tells Condor the maximum virtual image size to which you believe your program will grow during its execution. Condor will then execute your job only on machines which have enough resources (such as virtual memory) to support executing your job. If you do not specify the image size of your job in the description file, Condor will automatically make a (reasonably accurate) estimate about its size and adjust this estimate as your program runs. If the image size of your job is underestimated, it may crash due to an inability to acquire more address space (e.g., malloc() fails). If the image size is overestimated, Condor may have difficulty finding machines which have the required resources. size must be in kbytes; e.g., for an image size of 8 megabytes, use a size of 8000.
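Following the same convention, a program expected to grow to about 16 megabytes (an illustrative figure) would be described as:
        image_size = 16000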

machine_count = <min..max>
If machine_count is specified, Condor will not start the job until it can simultaneously supply the job with min machines. Condor will continue to try to provide up to max machines, but will not delay starting of the job to do so. If the job is started with fewer than max machines, the job will be notified via a usual PvmHostAdd notification as additional hosts come on line. Important: use machine_count if and only if submitting into the PVM Universe. At this time, machine_count must be used only with a parallel PVM application.
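A sketch for a PVM job that needs at least two machines before it can start and can make use of up to four. The executable name is hypothetical:
        universe      = pvm
        executable    = pvm_master
        machine_count = 2..4
        queue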

coresize = <size>
Should the user's program abort and produce a core file, coresize specifies the maximum size in bytes of the core file which the user wishes to keep. If coresize is not specified in the command file, the system's user resource limit "coredumpsize" is used (except on HP-UX).

nice_user = <True | False>
Normally, when a machine becomes available to Condor, Condor decides which job to run based upon user and job priorities. Setting nice_user equal to True tells Condor not to use your regular user priority, but that this job should have last priority amongst all users and all jobs. Jobs submitted in this fashion therefore run only on machines which no other non-nice_user job wants: a true "bottom-feeder" job! This is very handy if a user has some jobs they wish to run, but does not wish to use resources that could instead be used to run other people's Condor jobs. Jobs submitted in this fashion have "nice-user." prepended to the owner name when viewed from condor_q or condor_userprio. The default value is False.

kill_sig = <signal-number>
When Condor needs to kick a job off of a machine, it will send the job the signal specified by signal-number. signal-number needs to be an integer which represents a valid signal on the execution machine. For jobs submitted to the Standard Universe, the default value is the number for SIGTSTP which tells the Condor libraries to initiate a checkpoint of the process. For jobs submitted to the Vanilla Universe, the default is SIGTERM which is the standard way to terminate a program in UNIX.

+<attribute> = <value>
A line which begins with a '+' (plus) character instructs condor_submit to simply insert the following attribute into the job ClassAd with the given value.
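As a sketch, the line below would add a string-valued attribute to the job ClassAd. The attribute name and value are invented for illustration, and the double quotes assume the usual ClassAd convention for string values:
        +Department = "chemistry"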

queue [number-of-procs]
Places one or more copies of the job into the Condor queue. If desired, new input, output, error, initialdir, arguments, nice_user, priority, kill_sig, coresize, or image_size commands may be issued between queue commands. This is very handy when submitting multiple runs into one cluster with one submit file; for example, by issuing an initialdir between each queue command, each run can work in its own subdirectory. The optional argument number-of-procs specifies how many times to submit the job to the queue, and defaults to 1.
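The initialdir technique just described can be sketched as follows. The directory paths are hypothetical and must already exist; the first queue command submits one job and the second submits three, for four jobs in the cluster:
        executable = foo
        initialdir = /home/jdoe/run_1
        queue
        initialdir = /home/jdoe/run_2
        queue 3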

In addition to commands, the submit-description file can contain macros and comments:

Macros
Parameterless macros in the form of $(macro_name) may be inserted anywhere in Condor description files. Macros can be defined by lines in the form of
        <macro_name> = <string>
Two pre-defined macros are supplied by the description file parser. The $(Cluster) macro supplies the number of the job cluster, and the $(Process) macro supplies the number of the job. These macros are intended to aid in the specification of input/output files, arguments, etc., for clusters with lots of jobs, and/or could be used to supply a Condor process with its own cluster and process numbers on the command line.
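As a sketch, a user-defined macro can be combined with the two pre-defined ones to build distinct file names for every job. The macro name run_label and its value are hypothetical:
        run_label = trial
        output    = $(run_label).$(Cluster).$(Process).out
        error     = $(run_label).$(Cluster).$(Process).err
If the cluster happened to be numbered 42 (an assumed value), process 0 would write its standard output to trial.42.0.out.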

Comments
Blank lines and lines beginning with a '#' (pound-sign) character are ignored by the submit-description file parser.

Options

Supported options are as follows:

-
Accept the command file from stdin.
-v
Verbose output: display the created job ClassAd.

-n schedd_name
Submit to the specified schedd. This option is used when there is more than one schedd running on the submitting machine.

-r schedd_name
Submit to a remote schedd. The jobs will be submitted to the schedd on the specified remote host, and their owner will be set to "nobody".

Exit Status

condor_submit will exit with a status value of 0 (zero) upon success, and a non-zero value upon failure.

Examples

Example 1: The example below queues three jobs for execution by Condor. The first will be given command line arguments of '15' and '2000', and will write its standard output to 'foo.out1'. The second will be given command line arguments of '30' and '2000', and will write its standard output to 'foo.out2'. Similarly, the third will have arguments of '45' and '6000', and will use 'foo.out3' for its standard output. Standard error output (if any) from the three jobs will appear in 'foo.err1', 'foo.err2', and 'foo.err3', respectively.

      ####################
      #
      # Example 1: queueing multiple jobs with differing
      # command line arguments and output files.
      #                                                                      
      ####################                                                   
                                                                         
      Executable     = foo                                                   
                                                                         
      Arguments      = 15 2000                                               
      Output  = foo.out1                                                     
      Error   = foo.err1
      Queue                                                                  
                                                                         
      Arguments      = 30 2000                                               
      Output  = foo.out2                                                     
      Error   = foo.err2
      Queue                                                                  
                                                                         
      Arguments      = 45 6000                                               
      Output  = foo.out3                                                     
      Error   = foo.err3
      Queue

Example 2: This submit-description file example queues 150 runs of program 'foo', which must have been compiled and linked for Silicon Graphics workstations running IRIX 6.x. Condor will not attempt to run the processes on machines which have less than 32 megabytes of physical memory, and will prefer machines which have at least 64 megabytes if such machines are available. Stdin, stdout, and stderr will refer to "in.0", "out.0", and "err.0" for the first run of this program (process 0). Stdin, stdout, and stderr will refer to "in.1", "out.1", and "err.1" for process 1, and so forth. A log file containing entries about where and when Condor runs, checkpoints, and migrates processes in this cluster will be written into file "foo.log".

      ####################                                                    
      #                                                                       
      # Example 2: Show off some fancy features including                            
      # use of pre-defined macros and logging.                                
      #                                                                       
      ####################                                                    
                                                                          
      Executable     = foo                                                    
      Requirements   = Memory >= 32 && OpSys == "IRIX6" && Arch == "SGI"
      Rank           = Memory >= 64
      # image_size is given in kbytes: 28 megabytes
      Image_Size     = 28000
                                                                          
      Error   = err.$(Process)                                                
      Input   = in.$(Process)                                                 
      Output  = out.$(Process)                                                
      Log = foo.log                                                                       
                                                                          
      Queue 150

General Remarks

See Also

Condor User Manual

Author

Condor Team, University of Wisconsin-Madison

Copyright

Copyright © 1990-1998 Condor Team, Computer Sciences Department, University of Wisconsin-Madison, Madison, WI. All Rights Reserved. No use of the Condor Software Program is authorized without the express consent of the Condor Team. For more information contact: Condor Team, Attention: Professor Miron Livny, 7367 Computer Sciences, 1210 W. Dayton St., Madison, WI 53706-1685, (608) 262-0856 or miron@cs.wisc.edu. U.S. Government Rights Restrictions: Use, duplication, or disclosure by the U.S. Government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of The Rights in Technical Data and Computer Software clause at DFARS 252.227-7013 or subparagraphs (c)(1) and (2) of Commercial Computer Software-Restricted Rights at 48 CFR 52.227-19, as applicable, Condor Team, Attention: Professor Miron Livny, 7367 Computer Sciences, 1210 W. Dayton St., Madison, WI 53706-1685, (608) 262-0856 or miron@cs.wisc.edu.

See the Condor Version 6.1.2 Manual for additional notices.


condor-admin@cs.wisc.edu