
== Torque ==

 * Write a PBS script
{{{
jazz@bio037:~$ cat myscript
#!/bin/bash
### Job name
#PBS -N mytest
### Output files (stderr and stdout)
#PBS -e /home/jazz/mytest.err
#PBS -o /home/jazz/mytest.log
###================================================
# Show directory and time information
echo "Working directory is $PBS_O_WORKDIR"
cd "$PBS_O_WORKDIR"
echo "Running on host $(hostname)"
echo "Time is $(date)"
echo "Directory is $(pwd)"
# Run the program
date
}}}
 * Submit the job
{{{
jazz@bio037:~$ qsub < myscript
30.bio037
}}}
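Because qsub prints the identifier of the new job (here `30.bio037`) on standard output, a wrapper script can capture it and pass the numeric part to tracejob. The following is only a minimal sketch, assuming identifiers have the `number.server` form shown above; the helper name `job_number` is hypothetical and not part of Torque.

```shell
#!/bin/sh
# job_number: strip the ".server" suffix from a Torque job identifier.
# Hypothetical helper; assumes IDs look like "30.bio037" as in the output above.
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Typical use (requires a working Torque installation):
#   jobid=$(qsub < myscript)            # e.g. 30.bio037
#   tracejob "$(job_number "$jobid")"
```

The parameter expansion removes everything from the first dot onward, so it also works when the server part is a fully qualified host name.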
 * Trace the job's execution
{{{
jazz@bio037:~$ tracejob 30
/var/spool/torque/mom_logs/20091015: No matching job records located

Job: 30.bio037

10/15/2009 00:38:59 S enqueuing into batch, state 1 hop 1
10/15/2009 00:38:59 S Job Queued at request of jazz@bio037, owner = jazz@bio037, job name = mytest, queue =
                      batch
10/15/2009 00:38:59 S Job Modified at request of Scheduler@bio037
10/15/2009 00:38:59 S Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=0kb
                      resources_used.vmem=0kb resources_used.walltime=00:00:00
10/15/2009 00:38:59 L Job Run
10/15/2009 00:38:59 S Job Run at request of Scheduler@bio037
10/15/2009 00:38:59 A queue=batch
10/15/2009 00:38:59 A user=jazz group=jazz jobname=mytest queue=batch ctime=1255538339 qtime=1255538339
                      etime=1255538339 start=1255538339 owner=jazz@bio037 exec_host=bio013/0
                      Resource_List.neednodes=1 Resource_List.nodect=1 Resource_List.nodes=1
10/15/2009 00:38:59 A user=jazz group=jazz jobname=mytest queue=batch ctime=1255538339 qtime=1255538339
                      etime=1255538339 start=1255538339 owner=jazz@bio037 exec_host=bio013/0
                      Resource_List.neednodes=1 Resource_List.nodect=1 Resource_List.nodes=1 session=5366
                      end=1255538339 Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=0kb
                      resources_used.vmem=0kb resources_used.walltime=00:00:00
10/15/2009 00:39:07 S Post job file processing error
10/15/2009 00:39:07 S dequeuing from batch, state COMPLETE
}}}
 * For any job, you can use its job ID to find out which hosts ran it: they are listed in the exec_host field of the tracejob output
{{{
etime=1255538339 start=1255538339 owner=jazz@bio037 exec_host=bio013/0
}}}
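To pull only the execution hosts out of a long trace, the tracejob output can be filtered for the `exec_host=` field. This is a rough sketch rather than an official Torque tool: the function name `exec_hosts` is hypothetical, and it assumes a grep that supports `-o` (GNU and BSD grep both do).

```shell
#!/bin/sh
# exec_hosts: read tracejob output on stdin and print each distinct
# exec_host value once. Hypothetical helper, not part of Torque.
exec_hosts() {
    grep -o 'exec_host=[^ ]*' | sed 's/^exec_host=//' | sort -u
}

# Typical use (requires a working Torque installation):
#   tracejob 30 | exec_hosts
```

The `sort -u` step is there because the same exec_host value appears in more than one accounting record of a single trace, as in the output above.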
 * The error message at the top of the tracejob output shows that each pbs_mom node records the jobs it has run in /var/spool/torque/mom_logs/<date>, one file per day named YYYYMMDD
{{{
/var/spool/torque/mom_logs/20091015
}}}
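Since the log file name is just the date, the path for a given trace line can be derived from its timestamp. The sketch below assumes the `MM/DD/YYYY` date format that tracejob printed above and the log directory from this installation; the helper name `mom_log_path` is hypothetical.

```shell
#!/bin/sh
# mom_log_path: map a tracejob date (MM/DD/YYYY) to the pbs_mom log file
# for that day. Hypothetical helper; the directory is the one shown above
# and may differ on other installations.
mom_log_path() {
    mm=${1%%/*}        # month: text before the first slash
    rest=${1#*/}       # remainder: DD/YYYY
    dd=${rest%%/*}     # day
    yyyy=${rest#*/}    # year
    printf '/var/spool/torque/mom_logs/%s%s%s\n' "$yyyy" "$mm" "$dd"
}

# Typical use, run on the execution host (bio013 in this example);
# reading the log may require root privileges:
#   grep '30.bio037' "$(mom_log_path 10/15/2009)"
```

The "No matching job records located" message above is presumably printed because tracejob was run on bio037, whose own mom log has no entry for a job that ran on bio013.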