Hi,
When I try to submit a job, I get this error message: qsub: cannot connect to server centos7 (errno=113)
The qstat command returns the same message.
Can anyone help me, please?
pbs_hostn -v centos7 returns:
primary name: centos7.home (from gethostbyname())
aliases: -none-
address length: 4 bytes
address: 192.168.1.97 (1627498688 dec) name: centos7.home
/etc/hostname contains only: localhost
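As a side note, the decimal value that pbs_hostn prints next to the address appears to be the four address bytes read as a little-endian integer, which is what one would expect on an x86 machine. That reading is inferred from the arithmetic, not stated in the output, and the function name ip_to_le_dec is made up for this sketch.

```shell
# Convert a dotted-quad IPv4 address to the integer obtained by reading
# its four bytes in little-endian order (first octet is the low byte).
ip_to_le_dec() {
    # Split the argument on dots into the positional parameters $1..$4.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( $1 + $2 * 256 + $3 * 65536 + $4 * 16777216 ))
}

ip_to_le_dec 192.168.1.97   # prints 1627498688
```

The result matches the value in the pbs_hostn output, so the number carries no information beyond the address itself.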
Please add the line below to the /etc/hosts file:
192.168.1.97 centos7.home
Then restart the PBS services and check the output of the command below:
ps -ef | grep pbs_
My /etc/hosts file now contains:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.97 centos7.home
Then I restarted my PBS server:
sudo ./pbs_server
The command ps -ef | grep pbs_ returns:
root 6554 1 0 15:11 ? 00:00:00 /opt/pbs/sbin/pbs_ds_monitor monitor
root 6836 1 0 15:12 ? 00:00:00 /opt/pbs/sbin/pbs_server.bin
nekcorp 8159 20891 0 15:12 pts/4 00:00:00 grep --color=auto pbs_
When I try to submit a job, I get this message:
No route to host
qsub: cannot connect to server centos7 (errno=113)
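For context, errno 113 on Linux is EHOSTUNREACH, whose message is the "No route to host" shown above: the client could not reach the address that centos7 resolves to. A rough way to test this without PBS is sketched below. It assumes bash and the coreutils timeout command are available and that pbs_server listens on its default port 15001; can_connect is a hypothetical helper name.

```shell
# Return 0 if a TCP connection to host:port succeeds within a timeout.
# Assumes bash (for its /dev/tcp redirection) and coreutils `timeout`.
# $1 = host, $2 = port, $3 = timeout in seconds (default 3).
can_connect() {
    host=$1
    port=$2
    secs=${3:-3}
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example use (15001 is the default pbs_server port):
# can_connect centos7 15001 && echo "reachable" || echo "not reachable"
```

If this fails for centos7 but succeeds for the machine's real address, the problem is more likely name resolution than PBS itself.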
Please follow the steps below to restart the services.
For example, on my system:
pbs_server is pid 1793
pbs_sched is pid 1523
pbs_comm is 1507
Check which services are up and running:
ps -ef | grep pbs_
netstat -tunap | grep pbs
Please share the contents of /etc/pbs.conf.
I followed your recommendation to restart the services, but I get the same message when I use the qsub or qstat command:
No route to host
qstat: cannot connect to server centos7 (errno=113)
No route to host
qsub: cannot connect to server centos7 (errno=113)
The /etc/init.d/pbs status command returns:
pbs_server is pid 9043
pbs_mom is pid 4803
pbs_sched is pid 4869
pbs_comm is 4686
The ps -ef | grep pbs_ command returns:
root 4686 1 0 15:08 ? 00:00:00 /opt/pbs/sbin/pbs_comm
root 4803 1 0 15:08 ? 00:00:00 /opt/pbs/sbin/pbs_mom
root 4869 1 0 15:08 ? 00:00:00 /opt/pbs/sbin/pbs_sched
root 5531 1 0 15:08 ? 00:00:00 /opt/pbs/sbin/pbs_ds_monitor monitor
root 9043 1 0 15:08 ? 00:00:00 /opt/pbs/sbin/pbs_server.bin
nekcorp 30691 22213 0 15:21 pts/2 00:00:00 grep --color=auto pbs_
The netstat -tunap | grep pbs command returns:
tcp 0 0 0.0.0.0:17001 0.0.0.0:* LISTEN 4686/pbs_comm
tcp 0 0 0.0.0.0:15002 0.0.0.0:* LISTEN 4803/pbs_mom
tcp 0 0 0.0.0.0:15003 0.0.0.0:* LISTEN 4803/pbs_mom
tcp 0 0 0.0.0.0:15004 0.0.0.0:* LISTEN 4869/pbs_sched
tcp 0 1 192.168.1.49:89 192.168.1.97:17001 SYN_SENT 4803/pbs_mom
tcp 0 1 192.168.1.49:88 192.168.1.97:17001 SYN_SENT 4869/pbs_sched
The contents of /etc/pbs.conf are:
PBS_SERVER=centos7
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=1
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/bin/scp
The firewall is disabled, and so is SELinux.
To be honest, I do not understand what I am doing or what the problem is.
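One detail in the netstat output above may be worth a look: pbs_mom and pbs_sched are connecting from local address 192.168.1.49 to 192.168.1.97:17001 and are stuck in SYN_SENT, while pbs_hostn resolved centos7 to 192.168.1.97. That would be consistent with the name pointing at an address this machine does not own. A small sketch of that check follows; addr_is_local is a hypothetical helper, and the live-system commands in the comment assume getent and hostname -I are available.

```shell
# Succeed if the address in $1 appears in the whitespace-separated list $2.
addr_is_local() {
    for a in $2; do
        [ "$a" = "$1" ] && return 0
    done
    return 1
}

# Values taken from the pbs_hostn and netstat outputs in this thread:
addr_is_local 192.168.1.97 "192.168.1.49" \
    || echo "centos7 resolves to an address this host does not own"

# On a live system, something like this could supply the two arguments:
# addr_is_local "$(getent hosts centos7 | awk '{print $1}')" "$(hostname -I)"
```

If the check fails, correcting the address for centos7 in /etc/hosts is the likely fix.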
I do not know why, but in my /etc/pbs.conf file the IP address was not correct, and neither was the hostname. I changed them, and now qsub and qstat no longer show the error message.
But when I use qstat I get this:
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
57.centos7        OPTISTRUCT12     nekcorp           00:00:01 E optistruct
This is an old job from before I got the errno=113 error message.
When I try to kill it with qdel 57.centos7, I get this message:
No route to host
qdel: cannot connect to server centos7 (errno=113)
Thank you for sharing your findings; much appreciated. Could you please update this line in /etc/hosts to:
192.168.1.97 centos7.home centos7
Then restart the PBS services.
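To confirm that both the long and the short name map to the same address after the edit, something like the following could be used. It is only a sketch: hosts_lookup is a made-up helper that reads the file directly, whereas real name resolution also depends on /etc/nsswitch.conf and DNS.

```shell
# Print the first address that a hosts-format file maps to the given name.
# $1 = name to look up, $2 = file to read (defaults to /etc/hosts).
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                 # drop comments
        { for (i = 2; i <= NF; i++)        # field 1 is the address
              if ($i == name) { print $1; exit } }
    ' "${2:-/etc/hosts}"
}

# Both names should print the same address after the edit:
# hosts_lookup centos7.home
# hosts_lookup centos7
```

The short name matters here because PBS_SERVER in /etc/pbs.conf is set to centos7, not centos7.home.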
Please delete the old job as shown below, restart the PBS services, and then submit a job:
qdel -W force 57.centos7
qsub -- /bin/hostname
If it errors out, then capture the strace output of qstat and qsub:
strace -o qstat_strace.txt -tt -f -s 8192 qstat
strace -o qsub_strace.txt -tt -f -s 8192 qsub -- /bin/hostname
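The strace files can be long. Assuming the failure is a connect() call returning -1, as errno=113 suggests, a filter along these lines may help find the relevant lines (failed_connects is a hypothetical helper name):

```shell
# Print the failed connect() calls recorded in a strace output file.
# $1 = path to the file written by `strace -o`.
failed_connects() {
    grep -E 'connect\(.*= -1 ' "$1"
}

# Example use with the files produced above:
# failed_connects qstat_strace.txt
# failed_connects qsub_strace.txt
```

The address and port inside each matching connect() line show which endpoint the client was actually trying to reach.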
Thanks a lot for your help, everything works.