pbsnodes: cannot connect to server, error=111, and Failed to start PBS dataservice

Hi Team,
I am using PBS Pro CE 19.1.1 on a Rocks cluster with a master node and compute-0-0.
It was working fine until yesterday. Today, when I ran pbsnodes -aSj, I got the error below.

[root@master ~]# pbsnodes -aSj
Connection refused
pbsnodes: cannot connect to server master.calligotech.com, error=111

Then I checked the PBS status and restarted the service:

[root@master ~]# service pbs status
pbs_server is pid 5189
pbs_mom is pid 2362
pbs_sched is pid 2374
pbs_comm is 2333
[root@master ~]# service pbs restart
Restarting PBS
Stopping PBS
Killing Server.
PBS server - was pid: 5189
PBS mom - was pid: 2362
PBS sched - was pid: 2374
PBS comm - was pid: 2333
Waiting for shutdown to complete
Starting PBS
/share/apps/platform/pbs/sbin/pbs_comm ready (pid=30142), Proxy Name:master.calligotech.com:17001, Threads:4
PBS comm
PBS mom
Creating usage database for fairshare.
PBS sched
**Connecting to PBS dataservice..Failed to start PBS dataservice:[2020-07-25 10:24:47 ISTFATAL:  could not create lock file "/var/run/postgresql/.s.PGSQL.15007.lock": No such file or directory]**
.Failed to start PBS dataservice
.Failed to start PBS dataservice
/etc/init.d/pbs: line 268: 30203 Terminated              ${PBS_EXEC}/sbin/pbs_server
pbs_server startup failed, exit 143 aborting.
[root@master ~]# 

Below are my PBS configuration details:

[root@master ~]# cat /etc/pbs.conf
PBS_EXEC=/share/apps/platform/pbs
PBS_HOME=/var/spool/pbs
PBS_SERVER=master.calligotech.com
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=1
PBS_CORE_LIMIT=unlimited
PBS_SCP=/bin/scp
[root@master ~]# cat /etc/hosts
# Added by rocks report host #
#        DO NOT MODIFY       #
#  Add any modifications to  #
#    /etc/hosts.local file   #

127.0.0.1       localhost.localdomain   localhost

10.1.50.254     compute-0-0.local       compute-0-0 c0
10.1.50.1       master.local    master
192.168.0.50    master.calligotech.com master

I also checked the datastore logs:

[root@master ~]# cat /var/spool/pbs/datastore/pg_log/postgresql-Fri.log
2020-07-24 10:27:44 ISTLOG:  database system was shut down at 2020-07-23 23:20:20 IST
2020-07-24 10:27:44 ISTLOG:  autovacuum launcher started
2020-07-24 10:27:44 ISTLOG:  database system is ready to accept connections
2020-07-24 10:59:40 ISTLOG:  received fast shutdown request
2020-07-24 10:59:40 ISTLOG:  aborting any active transactions
2020-07-24 10:59:40 ISTLOG:  autovacuum launcher shutting down
2020-07-24 10:59:40 ISTLOG:  shutting down
2020-07-24 10:59:41 ISTLOG:  database system is shut down
2020-07-24 12:16:50 ISTLOG:  database system was shut down at 2020-07-24 10:59:41 IST
2020-07-24 12:16:50 ISTLOG:  autovacuum launcher started
2020-07-24 12:16:50 ISTLOG:  database system is ready to accept connections
2020-07-24 12:37:50 ISTLOG:  database system was interrupted; last known up at 2020-07-24 12:16:50 IST
2020-07-24 12:37:51 ISTLOG:  database system was not properly shut down; automatic recovery in progress
2020-07-24 12:37:51 ISTLOG:  redo starts at 0/19426E8
2020-07-24 12:37:51 ISTLOG:  record with zero length at 0/1944460
2020-07-24 12:37:51 ISTLOG:  redo done at 0/1944430
2020-07-24 12:37:51 ISTLOG:  last completed transaction was at log time 2020-07-24 12:16:56.710097+05:30
2020-07-24 12:37:51 ISTLOG:  autovacuum launcher started
2020-07-24 12:37:51 ISTLOG:  database system is ready to accept connections
2020-07-24 13:00:23 ISTLOG:  database system was interrupted; last known up at 2020-07-24 12:42:51 IST
**2020-07-24 13:00:25 ISTLOG:  database system was not properly shut down; automatic recovery in progress**
2020-07-24 13:00:25 ISTLOG:  record with zero length at 0/1946168
2020-07-24 13:00:25 ISTLOG:  redo is not required
2020-07-24 13:00:25 ISTLOG:  autovacuum launcher started
2020-07-24 13:00:25 ISTLOG:  database system is ready to accept connections
2020-07-24 18:02:55 ISTLOG:  received fast shutdown request
2020-07-24 18:02:55 ISTLOG:  aborting any active transactions
2020-07-24 18:02:55 ISTLOG:  autovacuum launcher shutting down
2020-07-24 18:02:55 ISTLOG:  shutting down
2020-07-24 18:02:57 ISTLOG:  database system is shut down
[root@master ~]#

The reason seems to be that the database was not shut down properly. Please help me resolve this issue.

Regards,
Zain

The hostname/FQDN assigned to PBS_SERVER is the problem here. Please use the correct resolvable short hostname (alias) and make sure there are no duplicates.
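
To illustrate the "no duplicates" point, here is a minimal Python sketch that lists names mapped to more than one address in /etc/hosts-style text. The function name `names_with_multiple_addresses` is made up for this example, and the sample text is the /etc/hosts content posted above (comment lines omitted). It only reads the text; it does not decide which entry is the right one.

```python
from collections import defaultdict


def names_with_multiple_addresses(hosts_text):
    """Return {name: [ip, ...]} for names that appear with more than one IP."""
    addresses = defaultdict(list)
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            if ip not in addresses[name]:
                addresses[name].append(ip)
    return {name: ips for name, ips in addresses.items() if len(ips) > 1}


# Sample copied from the /etc/hosts shown earlier in this thread.
HOSTS = """\
127.0.0.1       localhost.localdomain   localhost
10.1.50.254     compute-0-0.local       compute-0-0 c0
10.1.50.1       master.local    master
192.168.0.50    master.calligotech.com master
"""

print(names_with_multiple_addresses(HOSTS))
# {'master': ['10.1.50.1', '192.168.0.50']}
```

In the posted file, the short name master is attached to both 10.1.50.1 and 192.168.0.50.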

Please check the PBS server logs while you run pbsnodes -aSj to find out through which IP/hostname the commands reach the PBS server. You can also check with strace pbsnodes -aSj.
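
If the strace output is long, a small script can pull out just the TCP connection attempts. This is only a sketch: it assumes strace prints AF_INET connect() calls in the format seen later in this thread, and the helper name `inet_connects` is invented for the example.

```python
import re

# Matches lines such as:
# connect(3, {sa_family=AF_INET, sin_port=htons(15001),
#             sin_addr=inet_addr("192.168.0.50")}, 16) = -1 ECONNREFUSED (...)
CONNECT_RE = re.compile(
    r'connect\(\d+, \{sa_family=AF_INET, sin_port=htons\((\d+)\), '
    r'sin_addr=inet_addr\("([\d.]+)"\)\}, \d+\)\s+=\s+(-?\d+)(?: (\w+))?'
)


def inet_connects(strace_text):
    """Return a list of (address, port, result) for each AF_INET connect()."""
    results = []
    for match in CONNECT_RE.finditer(strace_text):
        port, addr, _ret, errno_name = match.groups()
        # A successful connect has no errno name after the return value.
        results.append((addr, int(port), errno_name or "OK"))
    return results


SAMPLE = (
    'connect(3, {sa_family=AF_INET, sin_port=htons(15001), '
    'sin_addr=inet_addr("192.168.0.50")}, 16) = -1 ECONNREFUSED (Connection refused)'
)

print(inet_connects(SAMPLE))
# [('192.168.0.50', 15001, 'ECONNREFUSED')]
```

This shows which address and port the client actually tried, which can then be compared with the PBS_SERVER setting and the server log.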

Please check the below section of the PBS Professional Installation and Upgrade guide
https://www.altair.com/pdfs/pbsworks/PBSInstallGuide19.2.3.pdf

Hi,
Whenever we start the master node, the PBS service starts automatically, but after some time it fails with the error below.

[root@master ~]# pbsnodes -aSj
Connection refused
pbsnodes: cannot connect to server master.calligotech.com, error=111

My server logs:

[root@master ~]# cat /var/spool/pbs/server_logs/20200727
07/27/2020 12:20:19;0002;Server@master;Svr;Log;Log opened
07/27/2020 12:20:19;0002;Server@master;Svr;Server@master;pbs_version=19.1.1
07/27/2020 12:20:19;0002;Server@master;Svr;Server@master;pbs_build=mach=N/A:security=N/A:configure_args=N/A
07/27/2020 12:20:19;0002;Server@master;Svr;Server@master;hostname=master.calligotech.com;pbs_leaf_name=N/A;pbs_mom_node_name=N/A
07/27/2020 12:20:19;0002;Server@master;Svr;Server@master;ipv4 interface lo: localhost
07/27/2020 12:20:19;0002;Server@master;Svr;Server@master;ipv4 interface eno1: master
07/27/2020 12:20:19;0002;Server@master;Svr;Server@master;ipv4 interface eno2: master
07/27/2020 12:20:19;0002;Server@master;Svr;Server@master;ipv4 interface virbr0: master.calligotech.com
07/27/2020 12:20:19;0002;Server@master;Svr;Server@master;ipv6 interface lo: master.calligotech.com
07/27/2020 12:20:19;0002;Server@master;Svr;Server@master;ipv6 interface eno1: master.calligotech.com
07/27/2020 12:20:19;0002;Server@master;Svr;Server@master;ipv6 interface eno2: master.calligotech.com
07/27/2020 12:20:19;0006;Server@master;Fil;Server@master;Version 19.1.1, started, initialization type = 1
07/27/2020 12:20:21;0002;Server@master;Svr;Server@master;pbs_status_db exit code 1
07/27/2020 12:20:21;0002;Server@master;Svr;Server@master;Starting PBS dataservice
07/27/2020 12:20:27;0002;Server@master;Svr;Server@master;connected to PBS dataservice@master.calligotech.com
07/27/2020 12:20:27;0086;Server@master;Svr;pbs_python_ext_quick_start_interpreter;--> Python Interpreter quick started, compiled with version:'2.7.5 (default, Aug  4 2017, 00:39:18)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-16)]' <--
07/27/2020 12:20:27;0086;Server@master;Svr;pbs_python_ext_quick_start_interpreter;--> Inserted Altair PBS Python modules dir '/share/apps/platform/pbs/lib/python/altair' <--
07/27/2020 12:20:27;0002;Server@master;n/a;setup_env;read environment from /var/spool/pbs/pbs_environment
07/27/2020 12:20:28;0c06;Server@master;TPP;Server@master(Main Thread);TPP leaf node names = 192.168.0.50:15001,127.0.0.1:15001,10.1.50.1:15001,192.168.0.50:15001,192.168.122.1:15001
07/27/2020 12:20:28;0002;Server@master;Svr;Server@master;Server pid = 3021 ready;  using ports Server:15001 Scheduler:15004 MOM:15002 RM:15003
07/27/2020 12:20:28;0c06;Server@master;TPP;Server@master(Thread 0);Thread ready
07/27/2020 12:20:28;0c06;Server@master;TPP;Server@master(Thread 0);Registering address 192.168.0.50:15001 to pbs_comm
07/27/2020 12:20:28;0c06;Server@master;TPP;Server@master(Thread 0);Registering address 10.1.50.1:15001 to pbs_comm
07/27/2020 12:20:28;0c06;Server@master;TPP;Server@master(Thread 0);Registering address 192.168.122.1:15001 to pbs_comm
07/27/2020 12:20:28;0c06;Server@master;TPP;Server@master(Thread 0);Connected to pbs_comm master.calligotech.com:17001
07/27/2020 12:50:19;0002;Server@master;Svr;Server@master;Stopping PBS dataservice
07/27/2020 12:50:24;0002;Server@master;Svr;Server@master;Server shutdown completed
07/27/2020 12:50:24;0002;Server@master;Svr;Log;Log closed
07/27/2020 12:53:44;0002;Server@master;Svr;Log;Log opened
07/27/2020 12:53:44;0002;Server@master;Svr;Server@master;pbs_version=19.1.1
07/27/2020 12:53:44;0002;Server@master;Svr;Server@master;pbs_build=mach=N/A:security=N/A:configure_args=N/A
07/27/2020 12:53:44;0002;Server@master;Svr;Server@master;hostname=master.calligotech.com;pbs_leaf_name=N/A;pbs_mom_node_name=N/A
07/27/2020 12:53:45;0002;Server@master;Svr;Server@master;ipv4 interface lo: localhost
07/27/2020 12:53:45;0002;Server@master;Svr;Server@master;ipv4 interface eno1: master
07/27/2020 12:53:45;0002;Server@master;Svr;Server@master;ipv4 interface eno2: master
07/27/2020 12:53:45;0002;Server@master;Svr;Server@master;ipv4 interface virbr0: master.calligotech.com
07/27/2020 12:53:45;0002;Server@master;Svr;Server@master;ipv6 interface lo: localhost
07/27/2020 12:53:45;0002;Server@master;Svr;Server@master;ipv6 interface eno1: master.calligotech.com
07/27/2020 12:53:45;0002;Server@master;Svr;Server@master;ipv6 interface eno2: master.calligotech.com
07/27/2020 12:53:45;0006;Server@master;Fil;Server@master;Version 19.1.1, started, initialization type = 1
07/27/2020 12:53:45;0002;Server@master;Svr;Server@master;pbs_status_db exit code 1
07/27/2020 12:53:45;0002;Server@master;Svr;Server@master;Starting PBS dataservice
07/27/2020 12:53:57;0002;Server@master;Svr;Server@master;PBS dataservice not running:[Connection:  failed: could not connect to server: Connection refused
        Is the server running on host "192.168.0.50" and accepting
        TCP/IP connections on port 15007?]
07/27/2020 12:53:58;0002;Server@master;Svr;Server@master;pbs_status_db exit code 1
07/27/2020 12:54:00;0002;Server@master;Svr;Server@master;Starting PBS dataservice
07/27/2020 12:54:00;0006;Server@master;Svr;Server@master;Failed to start PBS dataservice
07/27/2020 12:54:01;0002;Server@master;Svr;Server@master;pbs_status_db exit code 1
07/27/2020 12:54:05;0002;Server@master;Svr;Server@master;Starting PBS dataservice
07/27/2020 12:54:05;0006;Server@master;Svr;Server@master;Failed to start PBS dataservice
07/27/2020 12:54:05;0002;Server@master;Svr;Server@master;pbs_status_db exit code 1
07/27/2020 12:54:10;0002;Server@master;Svr;Server@master;Starting PBS dataservice
07/27/2020 12:54:22;0002;Server@master;Svr;Server@master;PBS dataservice not running:[Connection:  failed: could not connect to server: Connection refused
        Is the server running on host "192.168.0.50" and accepting
        TCP/IP connections on port 15007?]
07/27/2020 12:54:23;0002;Server@master;Svr;Server@master;pbs_status_db exit code 1
07/27/2020 12:54:30;0002;Server@master;Svr;Server@master;Starting PBS dataservice
07/27/2020 12:54:30;0006;Server@master;Svr;Server@master;Failed to start PBS dataservice
07/27/2020 12:54:31;0002;Server@master;Svr;Server@master;pbs_status_db exit code 1
07/27/2020 12:54:39;0002;Server@master;Svr;Server@master;Starting PBS dataservice
07/27/2020 12:54:51;0002;Server@master;Svr;Server@master;PBS dataservice not running:[Connection:  failed: could not connect to server: Connection refused
        Is the server running on host "192.168.0.50" and accepting
        TCP/IP connections on port 15007?]

As you suggested, I changed the /etc/pbs.conf file and added:

PBS_SERVER_HOST_NAME=master.calligotech.com

The strace pbsnodes -aSj output is below:

mmap(NULL, 105051, PROT_READ, MAP_PRIVATE, 4, 0) = 0x7f9345488000
close(4)                                = 0
open("/lib64/libnss_sss.so.2", O_RDONLY|O_CLOEXEC) = 4
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\25\0\0\0\0\0\0"..., 832) = 832
fstat(4, {st_mode=S_IFREG|0755, st_size=37096, ...}) = 0
mmap(NULL, 2131056, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 0) = 0x7f9344268000
mprotect(0x7f9344270000, 2093056, PROT_NONE) = 0
mmap(0x7f934446f000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 4, 0x7000) = 0x7f934446f000
close(4)                                = 0
mprotect(0x7f934446f000, 4096, PROT_READ) = 0
munmap(0x7f9345488000, 105051)          = 0
fstat(-1, 0x7ffdc24018c0)               = -1 EBADF (Bad file descriptor)
socket(AF_LOCAL, SOCK_STREAM, 0)        = 4
fcntl(4, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
fcntl(4, F_GETFD)                       = 0
fcntl(4, F_SETFD, FD_CLOEXEC)           = 0
connect(4, {sa_family=AF_LOCAL, sun_path="/var/lib/sss/pipes/nss"}, 110) = -1 ENOENT (No such file or directory)
close(4)                                = 0
close(3)                                = 0
munmap(0x7f93454c0000, 4096)            = 0
socket(AF_LOCAL, SOCK_STREAM, 0)        = 3
fcntl(3, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
fcntl(3, F_GETFD)                       = 0
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
connect(3, {sa_family=AF_LOCAL, sun_path="/var/lib/sss/pipes/nss"}, 110) = -1 ENOENT (No such file or directory)
close(3)                                = 0
open("/etc/pbs.conf", O_RDONLY)         = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=204, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93454c0000
read(3, "PBS_EXEC=/share/apps/platform/pb"..., 4096) = 204
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7f93454c0000, 4096)            = 0
open("/etc/pbs.conf", O_RDONLY)         = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=204, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93454c0000
read(3, "PBS_EXEC=/share/apps/platform/pb"..., 4096) = 204
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7f93454c0000, 4096)            = 0
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
socket(AF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 4
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(4)                                = 0
socket(AF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 4
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(4)                                = 0
open("/etc/host.conf", O_RDONLY|O_CLOEXEC) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=9, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93454c0000
read(4, "multi on\n", 4096)             = 9
read(4, "", 4096)                       = 0
close(4)                                = 0
munmap(0x7f93454c0000, 4096)            = 0
futex(0x7f9344c47a30, FUTEX_WAKE_PRIVATE, 2147483647) = 0
open("/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=73, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93454c0000
read(4, "search local calligotech.com\nnam"..., 4096) = 73
read(4, "", 4096)                       = 0
close(4)                                = 0
munmap(0x7f93454c0000, 4096)            = 0
open("/etc/hosts", O_RDONLY|O_CLOEXEC)  = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=286, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93454c0000
read(4, "# Added by rocks report host #\n#"..., 4096) = 286
read(4, "", 4096)                       = 0
close(4)                                = 0
munmap(0x7f93454c0000, 4096)            = 0
connect(3, {sa_family=AF_INET, sin_port=htons(15001), sin_addr=inet_addr("192.168.0.50")}, 16) = -1 ECONNREFUSED (Connection refused)
close(3)                                = 0
dup(2)                                  = 3
fcntl(3, F_GETFL)                       = 0x8002 (flags O_RDWR|O_LARGEFILE)
fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93454c0000
write(3, "Connection refused\n", 19Connection refused
)    = 19
close(3)                                = 0
munmap(0x7f93454c0000, 4096)            = 0
write(2, "pbsnodes: cannot connect to serv"..., 69pbsnodes: cannot connect to server master.calligotech.com, error=111
) = 69
exit_group(1)                           = ?
+++ exited with 1 +++
[root@master ~]# 

I am still getting the same issue. Please help me out.

[root@master ~]# pbsnodes -aSj
Connection refused
pbsnodes: cannot connect to server master.calligotech.com, error=111
[root@master ~]# service pbs restart
Restarting PBS
Stopping PBS
Killing Server.
PBS server - was pid: 1340
PBS mom - was pid: 30622
PBS sched - was pid: 30635
PBS comm - was pid: 30594
Waiting for shutdown to complete
Starting PBS
/share/apps/platform/pbs/sbin/pbs_comm ready (pid=28045), Proxy Name:master.calligotech.com:17001, Threads:4
PBS comm
PBS mom
Creating usage database for fairshare.
PBS sched
Connecting to PBS dataservice..Failed to start PBS dataservice:[2020-07-27 14:14:05 ISTFATAL:  could not create lock file "/var/run/postgresql/.s.PGSQL.15007.lock": No such file or directory]
.Failed to start PBS dataservice
.Failed to start PBS dataservice
..Failed to start PBS dataservice
continuing in background.
PBS server
touch: cannot touch '/var/lock/subsys/pbs': No such file or directory
[root@master ~]#

Why does it sometimes work properly and sometimes not? Please help me out.

Regards,
Zain
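
A side note on the restart output above: the FATAL line reports "No such file or directory" while creating a lock file under /var/run/postgresql. When creating a file fails this way, a directory in the path is usually missing, so it may be worth checking whether that directory exists. The sketch below assumes the message format shown in this thread; the function name `missing_lock_dir` and its `exists` parameter are made up for the example.

```python
import os
import re


def missing_lock_dir(message, exists=os.path.isdir):
    """Return the lock file's parent directory if it does not exist, else None.

    `exists` is injectable so the check can be tried without touching
    the real filesystem.
    """
    match = re.search(r'could not create lock file "([^"]+)"', message)
    if match is None:
        return None
    parent = os.path.dirname(match.group(1))
    return None if exists(parent) else parent


# Message copied from the restart output in this thread.
MSG = (
    'FATAL:  could not create lock file '
    '"/var/run/postgresql/.s.PGSQL.15007.lock": No such file or directory'
)

missing = missing_lock_dir(MSG)
if missing:
    print("directory does not exist:", missing)
else:
    print("lock directory exists (or no lock-file error found)")
```

Run on the PBS server host, this only reports whether the directory is present; it does not create or change anything.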

The PBS_SERVER hostname set in /etc/pbs.conf should be resolvable (forward and reverse) from both the PBS Server and the compute nodes.

Please run this command on the PBS Server host:
pbs_hostn -v master.calligotech.com

Please run this command from the compute node hosts:
pbs_hostn -v master.calligotech.com
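
For reference, the same forward/reverse idea can be sketched in Python with the standard socket module. This is not what pbs_hostn does internally, just a rough equivalent; the function name `check_resolution` and the injectable `forward`/`reverse` parameters are assumptions for this example.

```python
import socket


def check_resolution(hostname,
                     forward=socket.gethostbyname,
                     reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve that IP back, and compare.

    Returns (ip, primary_name, consistent), where consistent is True when
    the original hostname is the primary name or one of its aliases.
    """
    ip = forward(hostname)
    primary, aliases, _addresses = reverse(ip)
    consistent = hostname in {primary, *aliases}
    return ip, primary, consistent


if __name__ == "__main__":
    # Hostname taken from PBS_SERVER in this thread; run on each host.
    try:
        print(check_resolution("master.calligotech.com"))
    except OSError as err:
        print("resolution failed:", err)
```

Running it on both the server and the compute nodes should give the same address and name on every host if resolution is consistent.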

For the dataservice error:
source /etc/pbs.conf
cd $PBS_HOME/datastore
Edit pg_hba.conf and find this line:
host all all 0.0.0.0/0 md5

and update it to:
host all all 0.0.0.0/0 trust
Then restart the PBS services and check with ps -ef | grep pbs_
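
The pg_hba.conf edit above can also be expressed as a small text transformation. This is a sketch only: it works on a string rather than the real file, the function name `set_hba_method` is invented, and it assumes the line has exactly the five fields shown above.

```python
def set_hba_method(hba_text, new_method="trust"):
    """Change the auth method on 'host all all 0.0.0.0/0 <method>' lines."""
    out = []
    for line in hba_text.splitlines():
        fields = line.split()
        # Only touch the exact rule named in the post; comments and other
        # rules are passed through unchanged.
        if len(fields) == 5 and fields[:4] == ["host", "all", "all", "0.0.0.0/0"]:
            fields[4] = new_method
            line = " ".join(fields)
        out.append(line)
    return "\n".join(out)


before = "# sample rule\nhost all all 0.0.0.0/0 md5"
print(set_hba_method(before))
```

Keeping a copy of the original pg_hba.conf before editing makes it easy to revert.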

Otherwise, please try the following.
Update /etc/hosts on the PBS Server host to:
10.1.50.1 master.local master
#192.168.0.50 master.calligotech.com master

and in /etc/pbs.conf set
PBS_SERVER=master
and remove
PBS_SERVER_HOST_NAME
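
As a quick way to see the effect of the /etc/pbs.conf change, here is a minimal Python sketch that parses KEY=VALUE text and applies the two edits suggested above. The helper names `parse_pbs_conf` and `apply_suggestion` are made up, and the sample text is the pbs.conf from this thread with the PBS_SERVER_HOST_NAME line that was added later.

```python
def parse_pbs_conf(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        conf[key.strip()] = value.strip()
    return conf


def apply_suggestion(conf):
    """Return a copy with PBS_SERVER=master and no PBS_SERVER_HOST_NAME."""
    updated = dict(conf)
    updated["PBS_SERVER"] = "master"
    updated.pop("PBS_SERVER_HOST_NAME", None)
    return updated


CONF = """\
PBS_EXEC=/share/apps/platform/pbs
PBS_HOME=/var/spool/pbs
PBS_SERVER=master.calligotech.com
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=1
PBS_CORE_LIMIT=unlimited
PBS_SCP=/bin/scp
PBS_SERVER_HOST_NAME=master.calligotech.com
"""

for key, value in apply_suggestion(parse_pbs_conf(CONF)).items():
    print(f"{key}={value}")
```

The other settings are left untouched; only the server name changes and the extra host name entry is dropped.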

Hi,

Actually, we rebooted the server, and after that pbsnodes -aSj worked fine.

The strace pbsnodes -aSj output is below:

SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=13691, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
wait4(13691, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 13691
getsockopt(3, SOL_TCP, TCP_NODELAY, [0], [4]) = 0
setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
write(3, "+2+12+58+4root+0+0+0", 20)    = 20
poll([{fd=3, events=POLLIN}], 1, 10800000) = 1 ([{fd=3, revents=POLLIN}])
read(3, "+2+1+0+0+6+2+3+6master2+212+17+3"..., 1024) = 1024
poll([{fd=3, events=POLLIN}], 1, 10800000) = 1 ([{fd=3, revents=POLLIN}])
read(3, "2+372+19resources_available+1+4h"..., 1024) = 620
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0476d5b000
write(1, "                                "..., 88                                                        mem       ncpus   nmics   ngpus
) = 88
write(1, "vnode           state           "..., 94vnode           state           njobs   run   susp      f/t        f/t     f/t     f/t   jobs
) = 94
write(1, "--------------- --------------- "..., 97--------------- --------------- ------ ----- ------ ------------ ------- ------- ------- -------
) = 97
write(1, "master          free            "..., 92master          free                 0     0      0  126gb/126gb   16/16     0/0     0/0 --
) = 92
write(1, "compute-0-0     free            "..., 92compute-0-0     free                 0     0      0  126gb/126gb   16/16     0/0     0/0 --
) = 92
write(3, "+2+12+59+4root", 14)          = 14
read(3, "", 1)                          = 0
close(3)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++
[root@master ~]# pbsnodes -aSj
                                                        mem       ncpus   nmics   ngpus
vnode           state           njobs   run   susp      f/t        f/t     f/t     f/t   jobs
--------------- --------------- ------ ----- ------ ------------ ------- ------- ------- -------
master          free                 0     0      0  126gb/126gb   16/16     0/0     0/0 --
compute-0-0     free                 0     0      0  126gb/126gb   16/16     0/0     0/0 --
[root@master ~]# 

The pbs_hostn output is:

[root@master ~]# pbs_hostn -v master.calligotech.com
primary name: master.calligotech.com (from gethostbyname())
aliases: master
address length: 4 bytes
address: 192.168.0.50 (838904000 dec) name: master.calligotech.com
[root@master ~]# ssh compute-0-0
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Last login: Mon Jul 27 17:14:05 2020 from gateway
Rocks Compute Node
Rocks 7.0 (Manzanita)
Profile built 05:43 21-Jul-2020

Kickstarted 05:53 21-Jul-2020
[root@compute-0-0 ~]# pbs_hostn -v master.calligotech.com
primary name: master.calligotech.com (from gethostbyname())
aliases: master
address length: 4 bytes
address: 192.168.0.50 (838904000 dec) name: master.calligotech.com
[root@compute-0-0 ~]#

I will try the suggested changes below and update you:
Otherwise please try:
update your /etc/hosts on the PBS Server host , to :
10.1.50.1 master.local master
#192.168.0.50 master.calligotech.com master

Thanks for support Adarsh.

Regards,
Zain

Nice! Thank you @zainul1114

Hi Adarsh, I have a similar issue. Could you please help me?