aVrise
September 12, 2022, 10:34am
1
Hi all,
I am trying to install OpenPBS on a single node for task management.
However, I get an error when I run qstat after starting PBS. Do you have any idea what the problem might be? Thank you!
I have disabled the firewall:
$ firewall-cmd --state
not running
$ grep -v '#' /etc/hosts
127.0.0.1 localhost
10.96.44.240 J35
$ sudo /etc/init.d/pbs status
pbs_server is not running
pbs_mom is pid 11297
pbs_sched is pid 11310
pbs_comm is 11287
$ cat /etc/pbs.conf
PBS_SERVER=J35
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=1
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/bin/scp
adarsh
September 12, 2022, 11:53am
2
Please run the qstat command again and share its output, along with the corresponding entries from the server log at $PBS_HOME/server_logs/YYYYMMDD.
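A minimal sketch of collecting that information, assuming PBS_HOME is set in /etc/pbs.conf as in the first post. The helper name server_log_path is hypothetical and not part of PBS; it only builds the YYYYMMDD log path.

```shell
# Build the path of the server log for a given day.
# $1 = PBS_HOME, $2 = date as YYYYMMDD (defaults to today's date).
server_log_path() {
    printf '%s/server_logs/%s\n' "$1" "${2:-$(date +%Y%m%d)}"
}

# Usage on the PBS server host (reading the log usually needs root):
#   . /etc/pbs.conf
#   qstat
#   tail -n 50 "$(server_log_path "$PBS_HOME")"
```

Running qstat first and then reading the tail of the log keeps the relevant entries close to the end of the file.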
aVrise
September 12, 2022, 3:09pm
3
Thanks for your reply. Here are the outputs.
$ qstat
Connection refused
qstat: cannot connect to server J35 (errno=15010)
$ vi /var/spool/pbs/server_logs/20220912
09/12/2022 18:27:07;0002;Server@j35;Svr;Log;Log opened
09/12/2022 18:27:07;0002;Server@j35;Svr;Server@j35;pbs_version=20.0.0
09/12/2022 18:27:07;0002;Server@j35;Svr;Server@j35;pbs_build=mach=N/A:security=N/A:configure_args=N/A
09/12/2022 18:27:07;0002;Server@j35;Svr;Server@j35;hostname=j35;pbs_leaf_name=N/A;pbs_mom_node_name=N/A
09/12/2022 18:27:07;0002;Server@j35;Svr;Server@j35;ipv4 interface lo: localhost
09/12/2022 18:27:07;0002;Server@j35;Svr;Server@j35;ipv4 interface enp0s20f0u7u2c2: J35
09/12/2022 18:27:07;0002;Server@j35;Svr;Server@j35;ipv4 interface eno2: J35
09/12/2022 18:27:07;0002;Server@j35;Svr;Server@j35;ipv6 interface lo: localhost
09/12/2022 18:27:07;0002;Server@j35;Svr;Server@j35;ipv6 interface enp0s20f0u7u2c2: J35
09/12/2022 18:27:07;0002;Server@j35;Svr;Server@j35;ipv6 interface eno2: J35
09/12/2022 18:27:07;0006;Server@j35;Fil;Server@j35;Version 20.0.0, started, initialization type = 1
09/12/2022 18:27:07;0002;Server@j35;Svr;Server@j35;pbs_status_db exit code 1
09/12/2022 18:27:07;0002;Server@j35;Svr;Server@j35;Starting PBS dataservice
09/12/2022 18:27:11;0002;Server@j35;Svr;Server@j35;connected to PBS dataservice@j35
09/12/2022 18:27:11;0d80;Server@j35;TPP;Server@j35(Main Thread);TPP authentication method = resvport
09/12/2022 18:27:11;0c06;Server@j35;TPP;Server@j35(Main Thread);TPP leaf node names = 10.96.44.240:15001,127.0.0.1:15001,169.254.3.1:15001,10.96.44.240:15001
09/12/2022 18:27:11;0d80;Server@j35;TPP;Server@j35(Main Thread);Initializing TPP transport Layer
09/12/2022 18:27:11;0d80;Server@j35;TPP;Server@j35(Main Thread);Max files allowed = 16384
09/12/2022 18:27:11;0d80;Server@j35;TPP;Server@j35(Main Thread);TPP initialization done
09/12/2022 18:27:11;0d80;Server@j35;TPP;Server@j35(Main Thread);Connecting to pbs_comm J35:17001
09/12/2022 18:27:11;0c06;Server@j35;TPP;Server@j35(Thread 0);Thread ready
09/12/2022 18:27:11;0c06;Server@j35;TPP;Server@j35(Thread 0);Registering address 10.96.44.240:15001 to pbs_comm J35:17001
09/12/2022 18:27:11;0c06;Server@j35;TPP;Server@j35(Thread 0);Registering address 169.254.3.1:15001 to pbs_comm J35:17001
09/12/2022 18:27:11;0c06;Server@j35;TPP;Server@j35(Thread 0);Connected to pbs_comm J35:17001
09/12/2022 18:27:11;0002;Server@j35;n/a;setup_env;read environment from /var/spool/pbs/pbs_environment
09/12/2022 18:27:11;0000;Server@j35;Svr;Server@j35;Supported authentication method: resvport
09/12/2022 18:27:11;0002;Server@j35;Svr;Server@j35;Stopping PBS dataservice
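The "Connection refused" from qstat suggests that nothing is accepting connections on the server port, which would fit the log ending at "Stopping PBS dataservice". A rough way to check, assuming the server port is 15001 (the port in the TPP log lines above) and that the ss utility is installed. port_listening is a hypothetical helper, not a PBS command.

```shell
# Succeed if the given TCP port appears as a listening local address
# in `ss -ltn`-style output read from stdin.
# $1 = port number
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage on the PBS host:
#   ss -ltn | port_listening 15001 && echo "listening" || echo "nothing on 15001"
```

If nothing is listening, the open firewall ports do not matter, because the refusal comes from the host itself rather than from packet filtering.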
aVrise
September 13, 2022, 2:30am
5
Hi adarsh. Everything is set and I have restarted PBS, but it still doesn't work.
$ sestatus
SELinux status: disabled
$ sudo firewall-cmd --state
not running
$ sudo firewall-cmd --list-ports
FirewallD is not running
$ sudo iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-A INPUT -p tcp -m tcp --dport 17001 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 15009 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 15008 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 15007 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 15006 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 15005 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 15004 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 15003 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 15002 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 15001 -j ACCEPT
$ grep -v '#' /etc/hosts
127.0.0.1 localhost
10.96.44.240 J35 j35
$ qstat -Bf
Connection refused
qstat: cannot connect to server J35 (errno=15010)
$ vi /var/spool/pbs/server_logs/20220913
09/13/2022 10:25:14;0002;Server@j35;Svr;Log;Log opened
09/13/2022 10:25:14;0002;Server@j35;Svr;Server@j35;pbs_version=20.0.0
09/13/2022 10:25:14;0002;Server@j35;Svr;Server@j35;pbs_build=mach=N/A:security=N/A:configure_args=N/A
09/13/2022 10:25:14;0002;Server@j35;Svr;Server@j35;hostname=j35;pbs_leaf_name=N/A;pbs_mom_node_name=N/A
09/13/2022 10:25:14;0002;Server@j35;Svr;Server@j35;ipv4 interface lo: localhost
09/13/2022 10:25:14;0002;Server@j35;Svr;Server@j35;ipv4 interface enp0s20f0u7u2c2: J35
09/13/2022 10:25:14;0002;Server@j35;Svr;Server@j35;ipv4 interface eno2: j35
09/13/2022 10:25:14;0002;Server@j35;Svr;Server@j35;ipv6 interface lo: localhost
09/13/2022 10:25:14;0002;Server@j35;Svr;Server@j35;ipv6 interface enp0s20f0u7u2c2: J35
09/13/2022 10:25:14;0002;Server@j35;Svr;Server@j35;ipv6 interface eno2: J35
09/13/2022 10:25:14;0006;Server@j35;Fil;Server@j35;Version 20.0.0, started, initialization type = 1
09/13/2022 10:25:14;0002;Server@j35;Svr;Server@j35;pbs_status_db exit code 1
09/13/2022 10:25:14;0002;Server@j35;Svr;Server@j35;Starting PBS dataservice
09/13/2022 10:25:17;0002;Server@j35;Svr;Server@j35;connected to PBS dataservice@j35
09/13/2022 10:25:17;0d80;Server@j35;TPP;Server@j35(Main Thread);TPP authentication method = resvport
09/13/2022 10:25:17;0c06;Server@j35;TPP;Server@j35(Main Thread);TPP leaf node names = 10.96.44.240:15001,127.0.0.1:15001,169.254.3.1:15001,10.96.44.240:15001
09/13/2022 10:25:17;0d80;Server@j35;TPP;Server@j35(Main Thread);Initializing TPP transport Layer
09/13/2022 10:25:17;0d80;Server@j35;TPP;Server@j35(Main Thread);Max files allowed = 16384
09/13/2022 10:25:17;0d80;Server@j35;TPP;Server@j35(Main Thread);TPP initialization done
09/13/2022 10:25:17;0d80;Server@j35;TPP;Server@j35(Main Thread);Connecting to pbs_comm J35:17001
09/13/2022 10:25:17;0c06;Server@j35;TPP;Server@j35(Thread 0);Thread ready
09/13/2022 10:25:17;0c06;Server@j35;TPP;Server@j35(Thread 0);Registering address 10.96.44.240:15001 to pbs_comm J35:17001
09/13/2022 10:25:17;0c06;Server@j35;TPP;Server@j35(Thread 0);Registering address 169.254.3.1:15001 to pbs_comm J35:17001
09/13/2022 10:25:17;0c06;Server@j35;TPP;Server@j35(Thread 0);Connected to pbs_comm J35:17001
09/13/2022 10:25:17;0002;Server@j35;n/a;setup_env;read environment from /var/spool/pbs/pbs_environment
09/13/2022 10:25:17;0000;Server@j35;Svr;Server@j35;Supported authentication method: resvport
09/13/2022 10:25:17;0002;Server@j35;Svr;Server@j35;Stopping PBS dataservice
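Since pbs.conf uses J35 while the server logs its hostname as j35, it may be worth confirming that both spellings map to the same address in the hosts file. A small sketch, where hosts_addr is a hypothetical helper that only inspects the file and does not go through the system resolver:

```shell
# Print the address(es) that a hosts-format file maps to the given name.
# $1 = host name (compared exactly, case-sensitive)
# $2 = hosts file (defaults to /etc/hosts)
hosts_addr() {
    awk -v name="$1" '
        /^[[:space:]]*#/ { next }            # skip whole-line comments
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # stop at a trailing comment
                if ($i == name) print $1
            }
        }
    ' "${2:-/etc/hosts}"
}

# Usage:
#   hosts_addr J35
#   hosts_addr j35
#   getent hosts J35 j35    # the same question, asked of the system resolver
```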
aVrise
September 13, 2022, 2:39am
6
Indeed, there are some errors from ln when I run sudo /opt/pbs/libexec/pbs_postinstall (see below). Is this a big problem?
$ sudo /opt/pbs/libexec/pbs_postinstall
*** PBS Installation Summary
*** Postinstall script called as follows:
*** /opt/pbs/libexec/pbs_postinstall ‘’
*** Existing configuration file found: /etc/pbs.conf
*** Saving /etc/pbs.conf as /etc/pbs.conf.pre.20.0.0.20220913103220
*** Replacing /etc/pbs.conf with /etc/pbs.conf.20.0.0
*** /etc/pbs.conf has been modified.
*** The original contents have been saved to /etc/pbs.conf.pre.20.0.0.20220913103220
*** Registering PBS as a service.
ln: failed to create symbolic link ‘/etc/rc.d/rc0.d/K10pbs’: No such file or directory
ln: failed to create symbolic link ‘/etc/rc.d/rc1.d/K10pbs’: No such file or directory
ln: failed to create symbolic link ‘/etc/rc.d/rc2.d/K10pbs’: No such file or directory
ln: failed to create symbolic link ‘/etc/rc.d/rc3.d/S90pbs’: No such file or directory
ln: failed to create symbolic link ‘/etc/rc.d/rc4.d/K10pbs’: No such file or directory
ln: failed to create symbolic link ‘/etc/rc.d/rc5.d/S90pbs’: No such file or directory
ln: failed to create symbolic link ‘/etc/rc.d/rc6.d/K10pbs’: No such file or directory
*** PBS_HOME is /var/spool/pbs
*** Existing environment file left unmodified: /var/spool/pbs/pbs_environment
*** The PBS server has been installed in /opt/pbs/sbin.
*** The PBS scheduler has been installed in /opt/pbs/sbin.
*** The PBS communication agent has been installed in /opt/pbs/sbin.
*** The PBS MOM has been installed in /opt/pbs/sbin.
*** The PBS commands have been installed in /opt/pbs/bin.
*** End of /opt/pbs/libexec/pbs_postinstall
By the way, the system is CentOS Stream 9. Could there be a compatibility problem?
$ cat /etc/os-release
NAME="CentOS Stream"
VERSION="9"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="9"
PLATFORM_ID="platform:el9"
PRETTY_NAME="CentOS Stream 9"
ANSI_COLOR="0;31"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:centos:centos:9"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"
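The "No such file or directory" errors from ln most likely mean that the SysV runlevel directories under /etc/rc.d are absent on this system, so the postinstall script has nowhere to put its K10pbs and S90pbs links. A minimal sketch for confirming that, where missing_rc_dirs is a hypothetical helper and not part of PBS:

```shell
# Print each SysV runlevel directory (rc0.d .. rc6.d) that is missing
# under the given rc.d root.
# $1 = rc.d root (defaults to /etc/rc.d)
missing_rc_dirs() {
    root="${1:-/etc/rc.d}"
    for level in 0 1 2 3 4 5 6; do
        [ -d "$root/rc$level.d" ] || printf '%s\n' "$root/rc$level.d"
    done
}

# Usage:
#   missing_rc_dirs
#   systemctl status pbs    # assumption: a systemd unit named pbs exists on this install
```

If all seven directories are reported missing, the failed links would only affect starting PBS at boot through SysV runlevels. They would not by themselves explain why pbs_server stops after startup.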
adarsh
September 13, 2022, 11:14am
7
I have not tested it on CentOS Stream 9. If you compiled it from source, it should work.
Thanks for posting the diagnostics. Other members might comment on this.
aVrise
September 14, 2022, 2:46am
8
Alright. Thank you so much!