Start a program (with arguments arg1, arg2, ...) in the cluster environment. The
mpirun command exits as soon as it has passed all the information needed to start
the job to the wimps server.
OPTIONS:
-N |
force Wimpy to start and keep the process on the specified node. If NODE_ID is not
supplied, the new job is started on the predefined node with the most memory. See the
list of available nodes with a brief description of their
specific hardware parameters. Do not send jobs to nodes that have no special features
explicitly specified in the list. |
-n |
NICE: start the program with the given nice value. If not specified, the default
value NICE=12 is used. |
LONG OPTIONS:
--stdin=FILE |
redirect standard input to a regular file named FILE. Don't use pipes or sockets!
Only regular files in the /home directory are accessible across the
whole cluster and can be used safely. If this option is not used, standard
input is redirected to /dev/null. |
--stdout[=FILE] |
redirect standard output to a regular file. If FILE is not specified, the name
'PID.stdout' in the current working directory is used. If FILE ends with a slash,
the name 'FILE/PID.stdout' is used. If the option is omitted, standard output
is redirected to /dev/null. |
--stderr[=FILE] |
redirect standard error output to a regular file. If FILE is not specified, the name
'PID.stderr' is used. If FILE ends with a slash, the name 'FILE/PID.stderr'
is used. If the option is omitted, standard error output is redirected to
/dev/null. |
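The file-naming rules for --stdout and --stderr can be summarised in a short Python sketch. This is only an illustration of the rules as described above, not Wimpy's own code: the function name resolve_output_path and its parameters are hypothetical, and it assumes that the default 'PID.stderr' file is also created in the current working directory, which the text states explicitly only for 'PID.stdout'.

```python
def resolve_output_path(file_arg, pid, suffix, cwd="."):
    """Return the path the documented rules would select.

    file_arg: None if the option is omitted, "" if the option is given
              without =FILE, otherwise the FILE string.
    suffix:   "stdout" or "stderr".
    (Hypothetical helper; names are not part of the Wimpy tools.)
    """
    default_name = f"{pid}.{suffix}"
    if file_arg is None:
        # Option omitted: output is discarded.
        return "/dev/null"
    if file_arg == "":
        # Option given without FILE: default name in the working directory.
        return f"{cwd}/{default_name}"
    if file_arg.endswith("/"):
        # FILE names a directory: default name inside it.
        return file_arg + default_name
    # FILE names a regular file: use it unchanged.
    return file_arg
```

For example, resolve_output_path("logs/", 4711, "stderr") yields 'logs/4711.stderr', while omitting the option yields '/dev/null'.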
Send a signal to the running job with the given PID. Wimpy daemons forward the signal
across the cluster. Users who rely on fine-grained signal handling should keep in mind
that the process is signalled by a root process (the Wimpy daemon).
SIGNALs (only numeric values are accepted):
1 | SIGHUP |
9 | SIGKILL |
10 | SIGUSR1 |
12 | SIGUSR2 |
15 | SIGTERM - default value |
18 | SIGCONT - resume suspended process. Signal is handled by Wimpy daemons and is not delivered to the process |
19 | SIGSTOP - suspend running process. Signal is handled by Wimpy daemons and is not delivered to the process |
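Because only numeric signal values are accepted, a wrapper script may want to validate the number before calling mpikill. The following Python sketch copies the values from the table above and builds an argument list in the 'mpikill -s SIGNAL PID' form used elsewhere in this text. The names SIGNALS, HANDLED_BY_DAEMON and mpikill_argv are hypothetical, and the sketch does not run the command itself.

```python
# Signal numbers accepted by mpikill, copied from the table above.
SIGNALS = {
    1: "SIGHUP",
    9: "SIGKILL",
    10: "SIGUSR1",
    12: "SIGUSR2",
    15: "SIGTERM",
    18: "SIGCONT",
    19: "SIGSTOP",
}
DEFAULT_SIGNAL = 15

# These two are handled by the Wimpy daemons and never reach the process.
HANDLED_BY_DAEMON = {18, 19}


def mpikill_argv(pid, signal=DEFAULT_SIGNAL):
    """Build the argument list for 'mpikill -s SIGNAL PID' (hypothetical helper)."""
    if signal not in SIGNALS:
        raise ValueError(f"unsupported signal number: {signal}")
    return ["mpikill", "-s", str(signal), str(pid)]
```

The resulting list could be passed to subprocess.run on a machine where mpikill is installed.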
Alias for mpikill -s 19 PID. Suspend (checkpoint and kill) the process
to an image file. This can be used, for example, to free resources for other
processes when the cluster is heavily overloaded.
Alias for mpikill -s 18 PID. Resume the suspended process from the image file.
Change the nice value of the process identified by PID. The new nice value must be
larger than the old one. Users can change the nice value of their own processes only.
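The two restrictions on changing the nice value can be written down as a small check. This Python sketch only restates the rules above; how Wimpy enforces them internally is not described here, and the function name check_renice and its parameters are hypothetical.

```python
def check_renice(old_nice, new_nice, job_owner, caller):
    """Validate a nice change against the two documented rules.

    Hypothetical helper: raises if the request would be refused,
    otherwise returns the new nice value.
    """
    if caller != job_owner:
        # Users can change the nice value of their own processes only.
        raise PermissionError("process belongs to another user")
    if new_nice <= old_nice:
        # The new value must be strictly larger than the old one.
        raise ValueError("new nice value must be larger than the old one")
    return new_nice
```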
Print brief cluster statistics and a list of running jobs. By default, each job is
listed with the following fields: the PID of the process, the effective USER id, the
value of NICE, the "number" of the NODE where the process is currently running, the
STATus, the used MEMory, the consumed (normalized) CPU TIME and the name of the
executable (COMMAND).
Possible states are:
R | running job |
S | sleeping job (i.e. not consuming CPU time) |
CH | job suspended by the user to an image file |
CH+ | job suspended by the system (probably due to the PID conflict) |
EXIT | exited job (it may stay in the table for some time, but definitely less than one minute) |
LOST | lost job (e.g. due to lost communication with the node). This status should never occur under normal circumstances; the process can usually be recovered from the last checkpoint |
OPTIONS:
-a | show full IP address of the nodes running the jobs (by default, only last byte of the address is shown) |
-c | show full command lines |
-d | show the difference between the consumed CPU time and the value targeted by the scheduler |
-m | show number of process migrations |
-q | don't print header |
-s | don't print cluster statistics |
-S | show per-user usage rather than the full list of nodes |
-t | show the consumed CPU time in seconds (default format is minutes:seconds) |
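Scripts that post-process the job list may need to convert between the default minutes:seconds CPU time format and the plain seconds shown with -t. The Python sketch below shows one way to do that. It assumes whole seconds and a two-digit seconds field (for example 125:07); the exact formatting of the real output is not specified here, and both function names are hypothetical.

```python
def cpu_time_to_seconds(text):
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def seconds_to_cpu_time(total):
    """Convert a number of seconds to 'minutes:seconds'.

    Zero-padding the seconds to two digits is an assumption.
    """
    minutes, seconds = divmod(total, 60)
    return f"{minutes}:{seconds:02d}"
```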