Submitted by periv4 on
These are the most commonly used LSF commands:
bjobs
Displays the job ID and current status of your jobs. Add the -w flag for wide output.
Example of bjobs output:
JOBID   USER   STAT  QUEUE     FROM_HOST    EXEC_HOST    JOB_NAME  SUBMIT_TIME
431956  user1  RUN   gpu-v100  bmiclusterp  4*bmi-r740-  bash      Feb 21 11:12
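Because the bjobs listing is column-oriented, it is easy to filter in a script. The following is a minimal sketch, assuming the column order shown in the example (JOBID first, STAT third); the function name jobs_with_status is hypothetical and not part of LSF.

```shell
#!/usr/bin/env bash
# Print the JOBID of every job whose STAT column matches the given status.
# Reads bjobs output on stdin. The function name is hypothetical, and the
# column positions are an assumption based on the default bjobs layout.
jobs_with_status() {
    local status="$1"
    # NR > 1 skips the header line; column 1 is JOBID, column 3 is STAT.
    awk -v want="$status" 'NR > 1 && $3 == want { print $1 }'
}
```

To list the IDs of your running jobs, pipe the real output through the function: bjobs -w | jobs_with_status RUN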
bjobs -l <job id> gives a detailed description of the job, for example:
bjobs -l 431956
Job <431956>, User <user1>, Project <default>, Status <RUN>, Queue <gpu-v100>,
Interactive pseudo-terminal shell mode, Job Priority <50>,
Command <bash>, Esub <set-defaults dynamic-reject>
Tue Feb 21 11:12:06: Submitted from host <bmiclusterp2>, CWD <$HOME>, 4 Process
ors Requested, Requested Resources <span[hosts=1] rusage[m
em=128000] order[cpuf:-mem]>, Requested GPU <num=1>;
Tue Feb 21 11:12:06: Started on 4 Hosts/Processors <4*bmi-r740-02>, Execution H
ome </users/user1>, Execution CWD </users/user1>;
Wed Feb 22 14:02:39: Resource usage collected.
The CPU time used is 183 seconds.
MEM: 302 Mbytes; SWAP: 0 Mbytes; NTHREAD: 56
PGID: 34918; PIDs: 34918
RUNLIMIT
2880.0 min
MEMLIMIT
125 G
MEMORY USAGE:
MAX MEM: 302 Mbytes; AVG MEM: 300 Mbytes; MEM Efficiency: 0.24%
CPU USAGE:
CPU PEAK: 0.08 ; CPU Efficiency: 2.02%
SCHEDULING PARAMETERS:
           r15s  r1m  r15m  ut  pg  io  ls  it  tmp  swp  mem
loadSched  -     -    -     -   -   -   -   -   -    -    -
loadStop   -     -    -     -   -   -   -   -   -    -    -
EXTERNAL MESSAGES:
MSG_ID  FROM   POST_TIME     MESSAGE              ATTACHMENT
0       user1  Feb 21 11:12  bmi-r740-02:gpus=1;  N
RESOURCE REQUIREMENT DETAILS:
Combined: select[(ngpus>0) && (type == local)] order[cpuf:-mem] rusage[mem=128
000.00:ngpus_physical=1.00] span[hosts=1]
Effective: select[( (ngpus>0)) && (type == local)] order[cpuf:-mem] rusage[mem
=128000.00,ngpus_physical=1.00] span[hosts=1]
GPU REQUIREMENT DETAILS:
Combined: num=1:mode=shared:mps=no:j_exclusive=no:gvendor=nvidia
Effective: num=1:mode=shared:mps=no:j_exclusive=no:gvendor=nvidia
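The MEMORY USAGE and CPU USAGE lines in the long listing show how much of the requested resources a job actually uses. The following is a minimal sketch that pulls out just those two efficiency figures, assuming the "MEM Efficiency:" and "CPU Efficiency:" labels appear exactly as in the example above (they may be absent on other LSF versions); the function name job_efficiency is hypothetical.

```shell
#!/usr/bin/env bash
# Extract the memory and CPU efficiency figures from bjobs -l output on stdin.
# The function name is hypothetical; the labels matched here are taken from
# the example listing and are an assumption for other LSF installations.
job_efficiency() {
    awk '
        # Strip everything up to and including the label, keep the percentage.
        /MEM Efficiency:/ { sub(/.*MEM Efficiency: */, ""); print "mem " $0 }
        /CPU Efficiency:/ { sub(/.*CPU Efficiency: */, ""); print "cpu " $0 }
    '
}
```

Usage: bjobs -l <job id> | job_efficiency. Very low percentages, as in the example, suggest the job requested far more memory or CPU than it needs.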
bqueues
This command displays the queues available on the system. Add the -w flag for wide output or the -l flag for detailed information.
For example:
bqueues -w
QUEUE_NAME  PRIO  STATUS       MAX  JL/U  JL/P  JL/H  NJOBS  PEND  RUN  SUSP
private1    60    Open:Active  -    -     -     -     0      0     0    0
docker      60    Open:Active  -    -     -     -     0      0     0    0
upgrade     60    Open:Active  -    -     -     -     0      0     0    0
gpu-v100    60    Open:Active  -    -     -     -     6      0     6    0
gpu-a100    60    Open:Active  -    -     -     -     12     0     12   0
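To see at a glance which queues currently hold jobs, the bqueues listing can be filtered on its NJOBS column. This is a minimal sketch, assuming NJOBS is the eighth column as in the example output; the function name busy_queues is hypothetical.

```shell
#!/usr/bin/env bash
# Print "queue_name njobs" for every queue that currently holds jobs.
# Reads bqueues output on stdin. The function name is hypothetical, and the
# position of NJOBS (column 8) is an assumption based on the example listing.
busy_queues() {
    # NR > 1 skips the header line; $8 > 0 keeps queues with at least one job.
    awk 'NR > 1 && $8 > 0 { print $1, $8 }'
}
```

Usage: bqueues -w | busy_queues. With the listing above this would report gpu-v100 and gpu-a100 only.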
bkill
This command will terminate a job. Usage:
bkill <job id>
bstop
This command will suspend your job; a suspended job can be continued later with bresume. Usage:
bstop <job id>
bresume
This command will resume a suspended job. Usage:
bresume <job id>
bmod
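This command changes the parameters of a submitted job. Most parameters are controlled by the administrators, but you can change the wall-time (run) limit. The new limit is the current run limit plus the extra time you need, given in minutes or as hours:minutes:
bmod -W <current run limit + extension> <job id>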
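The value passed to bmod -W is the total new limit, not just the extension, so the two have to be added first. The following is a minimal sketch of that arithmetic, assuming both values are given in minutes (the RUNLIMIT line of bjobs -l reports minutes, e.g. 2880.0 min); the function name extended_runlimit is hypothetical.

```shell
#!/usr/bin/env bash
# Compute a new run limit in hours:minutes form for use with bmod -W.
# Arguments: current limit in minutes, extension in minutes.
# The function name is hypothetical and not part of LSF.
extended_runlimit() {
    local current_min="${1%.*}"   # drop a fractional part such as ".0" in "2880.0"
    local extra_min="$2"
    local total=$(( current_min + extra_min ))
    # Print as hours:minutes, with the minutes zero-padded to two digits.
    printf '%d:%02d\n' $(( total / 60 )) $(( total % 60 ))
}
```

For example, extending the 2880-minute limit of the job above by 12 hours (720 minutes) gives 3600 minutes, so extended_runlimit 2880.0 720 prints 60:00, and the limit could then be changed with bmod -W 60:00 431956.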
bhosts
This command will display all compute nodes (hosts) on the HPC cluster and their status.
lsload
This command will display the current load on the compute nodes. Add the -w flag for wide output.
