PBS_PRIMARY/PBS_SECONDARY vs PBS_LEAF_NAME

Hi,
Can you tell me whether I understand the meaning of the PBS_PRIMARY and PBS_SECONDARY parameters correctly when PBS_LEAF_NAME is also set, as in this table:

parameter in pbs.conf   on server1 (primary)   on server2 (secondary)
PBS_SERVER              h1.domain              h2.domain
PBS_LEAF_NAME           comm_h1                comm_h2
PBS_PRIMARY             comm_h1                comm_h1
PBS_SECONDARY           comm_h2                comm_h2

And I need to add the alias comm_h1 to /etc/hosts on server1 and comm_h2 to the same file on server2, yes?

Regards!

PBS_PRIMARY is the primary PBS Server in the PBS Failover configuration and PBS_SECONDARY is the secondary PBS server.

The value of PBS_LEAF_NAME should be the hostname of the interface over which you want PBS Pro to communicate, not the domain name. Unless you have multiple NICs, you should not need to set this value at all. Please try removing PBS_LEAF_NAME from /etc/pbs.conf and try starting PBS Pro.

Make sure the /etc/hosts files are identical on the primary and the secondary, and that the hostnames in them resolve from both.

The /etc/pbs.conf files of the primary and the secondary are as below, respectively:

Primary:
PBS_EXEC=/opt/pbs
PBS_SERVER=/mnt/file_locking_enabled_storage/pbsworks
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=0
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/usr/bin/scp
PBS_PRIMARY=primary_server
PBS_SECONDARY=secondary_server

Secondary:

PBS_EXEC=/opt/pbs
PBS_SERVER=primary_server # edited for correctness
PBS_START_SERVER=1
PBS_START_SCHED=0
PBS_START_COMM=1
PBS_START_MOM=0
PBS_HOME=/mnt/file_locking_enabled_storage/pbsworks
PBS_CORE_LIMIT=unlimited
PBS_SCP=/usr/bin/scp
PBS_PRIMARY=primary_server
PBS_SECONDARY=secondary_server
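
As a quick sanity check, something like the sketch below could compare the two files and confirm that PBS_PRIMARY and PBS_SECONDARY carry the same values on both machines. This is only an illustration: the function names are made up for this post, the sample text is trimmed to a few lines, and the parser assumes plain KEY=VALUE lines with whole-line # comments.

```python
def parse_pbs_conf(text):
    """Parse KEY=VALUE lines; skip blank lines and lines starting with '#'."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def failover_mismatches(conf_a, conf_b, keys=("PBS_PRIMARY", "PBS_SECONDARY")):
    """Return the keys that are missing or differ between two parsed files."""
    return [k for k in keys
            if conf_a.get(k) is None or conf_a.get(k) != conf_b.get(k)]


# Trimmed sample contents (hypothetical; read the real files in practice).
primary_conf = parse_pbs_conf("""
PBS_START_SCHED=1
PBS_PRIMARY=primary_server
PBS_SECONDARY=secondary_server
""")
secondary_conf = parse_pbs_conf("""
PBS_START_SCHED=0
PBS_PRIMARY=primary_server
PBS_SECONDARY=secondary_server
""")

# An empty list means both failover names agree on the two machines.
print(failover_mismatches(primary_conf, secondary_conf))
```

In practice you would read /etc/pbs.conf from each machine and pass the contents to parse_pbs_conf instead of the inline strings.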

I understand it.

You mean the hostname in /etc/hosts, yes?

I did not provide a domain name in my previous example; I entered the short hostname from /etc/hosts. Is this OK?

I have multiple NICs and I want to enable a failover configuration. So is my previous example OK or not? On the primary server, the value of PBS_LEAF_NAME is comm_h1 (the hostname from /etc/hosts for one of my internal networks), and the value of PBS_LEAF_NAME on the secondary server is set analogously.

Why did you enter a path here? It should be a hostname, yes? I entered the FQDN here in my example.

Did you enter the short hostnames of your primary and secondary hosts here, as set in /etc/hosts?

Maybe, to make my question clearer, here is the /etc/hosts content that goes with my example above:

  • primary server hosts file
    10.10.10.1 h1.domain h1 comm_h1

  • secondary server hosts file
    10.10.10.2 h2.domain h2 comm_h2

Here, for example, h1.domain is the FQDN, h1 is the short hostname, and comm_h1 is an additional hostname for the communication daemon, used as the PBS_LEAF_NAME value in pbs.conf.

Are those settings good, or am I misunderstanding how to set up failover?

That was my mistake; I have edited and updated it. Thank you.

PBS_PRIMARY=h1
PBS_SECONDARY=h2

Please check whether they are resolvable by running pbs_hostn -v h1 from h2 and from all the nodes, and do the same for h2.
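
If you want to script a similar check, a rough sketch using Python's standard resolver is below. It is not a replacement for pbs_hostn; it only does a forward lookup followed by a reverse lookup on each address. The function name lookup is made up for this post, and the demo uses localhost so it runs anywhere; on the cluster you would put h1 and h2 in the tuple.

```python
import socket


def lookup(name):
    """Forward-resolve a name, then reverse-resolve each address it maps to."""
    try:
        canonical, aliases, addresses = socket.gethostbyname_ex(name)
    except OSError:
        return None  # the name does not resolve on this host
    reverse = {}
    for addr in addresses:
        try:
            reverse[addr] = socket.gethostbyaddr(addr)[0]
        except OSError:
            reverse[addr] = None  # no reverse mapping for this address
    return {
        "canonical": canonical,
        "aliases": aliases,
        "addresses": addresses,
        "reverse": reverse,
    }


# Replace ("localhost",) with ("h1", "h2") when running on the cluster nodes.
for host in ("localhost",):
    print(host, lookup(host))
```

A result of None means the name did not resolve at all on that machine. A reverse entry that differs from the name you expected is worth a closer look in /etc/hosts.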

Start the PBS services, try to submit jobs, and check whether the jobs run fine; if they do, you are all set. If for some reason (according to the respective logs) the daemons are communicating on a different IP on the NIC, then tell PBS to use the correct hostname using PBS_LEAF_NAME.

Thanks for the quick response.

This whole topic is about how to tell PBS to use the correct hostname using PBS_LEAF_NAME, so if I have to do this, I need to change PBS_PRIMARY and PBS_SECONDARY respectively to:
PBS_PRIMARY=comm_h1
PBS_SECONDARY=comm_h2
in pbs.conf on both machines, yes?

Correct, in case the intended name was not picked.

Correct

In a failover configuration, that is wrong!

on both servers and endpoints:
PBS_PRIMARY=h1.domain
PBS_SECONDARY=h2.domain

on primary:
PBS_LEAF_NAME=h1.domain
on secondary:
PBS_LEAF_NAME=h2.domain
on endpoints:
PBS_LEAF_NAME=hostname.domain
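
To make the pattern in the layout above explicit, here is a small sketch that prints the failover-related lines for each machine: PBS_PRIMARY and PBS_SECONDARY stay the same everywhere and only PBS_LEAF_NAME changes per host. The function name failover_settings is made up for illustration, and the output covers only these three keys, not a complete pbs.conf.

```python
def failover_settings(primary, secondary, this_host):
    """Return the failover-related pbs.conf lines for one machine."""
    return [
        f"PBS_PRIMARY={primary}",      # same value on every machine
        f"PBS_SECONDARY={secondary}",  # same value on every machine
        f"PBS_LEAF_NAME={this_host}",  # this machine's own name
    ]


# Host names taken from the example in this thread.
for host in ("h1.domain", "h2.domain"):
    print(f"# {host}")
    print("\n".join(failover_settings("h1.domain", "h2.domain", host)))
```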