Minor: OpenPBS and node licensing

I’m experimenting with OpenPBS near top-of-tree (commit 072689ab). The scheduler would not start jobs, reporting “No available resources on nodes”. Digging into the scheduler, I traced this to commit b5459cbc (Pull Request #2248, “Node buckets does not check for unlicensed nodes”, by arungrover, in openpbs/openpbs on GitHub). All of my nodes end up with lic_lock == 0, and since that commit the scheduler ignores such nodes.

My question: what mechanism sets lic_lock to a non-zero value on an OSS build from scratch? I may have something configured incorrectly, but I cannot figure out what. There are also local mods involved, so I could have broken something.
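To make the failure mode concrete, here is a minimal Python model of the behavior described above. It is only a sketch, not the scheduler's actual C code: the `Node` class, `eligible_nodes`, and `schedule` are hypothetical names, and the only thing taken from the thread is that nodes with lic_lock == 0 are skipped and that the resulting message is “No available resources on nodes”.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    state: str
    lic_lock: int  # 0 is treated as "unlicensed" in this model (assumption)


def eligible_nodes(nodes):
    """Return the nodes this toy placement step would consider."""
    return [n for n in nodes if n.state == "free" and n.lic_lock != 0]


def schedule(nodes):
    """Mimic the observed outcome: no eligible nodes means no job start."""
    if not eligible_nodes(nodes):
        return "No available resources on nodes"
    return "run"


nodes = [Node("node3", "free", 0), Node("node4", "state-unknown,down", 0)]
print(schedule(nodes))   # every node has lic_lock == 0, so nothing is eligible

nodes[0].lic_lock = 1    # stands in for a local work-around
print(schedule(nodes))   # node3 is now eligible
```

In this model a free node is still invisible to placement while lic_lock stays 0, which matches the symptom: the node looks healthy in pbsnodes but no job starts.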

qstat -Bf | grep -i licen
pbs_license_min = 0
pbs_license_max = 2147483647
pbs_license_linger_time = 31536000
license_count = Avail_Global:1000000 Avail_Local:1000000 Used:0 High_Use:0
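The license_count value packs four counters into one line. A small sketch of splitting it into integers, assuming the space-separated `Key:Value` layout shown above (`parse_license_count` is a hypothetical helper, not part of PBS):

```python
def parse_license_count(value):
    """Split 'Key:Val Key:Val ...' into a dict of integer counters."""
    counts = {}
    for field in value.split():
        key, _, num = field.partition(":")
        counts[key] = int(num)
    return counts


line = "license_count = Avail_Global:1000000 Avail_Local:1000000 Used:0 High_Use:0"
_, _, value = line.partition("=")
counts = parse_license_count(value)
print(counts["Used"], counts["Avail_Local"])  # prints: 0 1000000
```

Used:0 alongside a large available count suggests the server has licenses to hand out but none are assigned to nodes, which is at least consistent with lic_lock staying at 0.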

I have a hack work-around, so this is not a show-stopper.

Thanks.

Could you please share the output of the following commands:
pbsnodes -av
qstat -Bf

$ pbsnodes -av
node3
     Mom = node3.local
     Port = 15002
     pbs_version = 20.0.0_nas_6bcbb
     ntype = PBS
     state = free
     pcpus = 2
     resv = R11.server2
     resources_available.arch = linux
     resources_available.bigmem = False
     resources_available.host = node3
     resources_available.mem = 1014600kb
     resources_available.ncpus = 2
     resources_available.vnode = node3
     resources_assigned.accelerator_memory = 0kb
     resources_assigned.hbmem = 0kb
     resources_assigned.mem = 0kb
     resources_assigned.naccelerators = 0
     resources_assigned.ncpus = 0
     resources_assigned.vmem = 0kb
     resv_enable = True
     sharing = default_shared
     last_state_change_time = Sat May 15 11:20:24 2021
     last_used_time = Sat May 15 11:20:24 2021
     server_instance_id = server2.local:15001

node4
     Mom = node4.local
     Port = 15002
     pbs_version = unavailable
     ntype = PBS
     state = state-unknown,down
     resources_available.host = node4
     resources_available.vnode = node4
     resources_assigned.accelerator_memory = 0kb
     resources_assigned.hbmem = 0kb
     resources_assigned.mem = 0kb
     resources_assigned.naccelerators = 0
     resources_assigned.ncpus = 0
     resources_assigned.vmem = 0kb
     resv_enable = True
     sharing = default_shared
     server_instance_id = server2.local:15001

$ qstat -Bf
Server: server2
    server_state = Active
    server_host = server2.local
    scheduling = True
    total_jobs = 29
    state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0 Begun
	:0 
    managers = dtalcott@*
    default_queue = workq
    log_events = 511
    mailer = /usr/sbin/sendmail
    mail_from = adm
    query_other_jobs = True
    resources_default.ncpus = 1
    resources_default.walltime = 01:00:00
    default_chunk.ncpus = 1
    resources_assigned.mem = 0mb
    resources_assigned.ncpus = 0
    resources_assigned.nodect = 0
    scheduler_iteration = 600
    flatuid = True
    resv_enable = True
    node_fail_requeue = 310
    max_array_size = 10000
    pbs_license_min = 0
    pbs_license_max = 2147483647
    pbs_license_linger_time = 31536000
    license_count = Avail_Global:1000000 Avail_Local:1000000 Used:0 High_Use:0
    pbs_version = 20.0.0_nas_6bcbb
    eligible_time_enable = True
    job_history_enable = True
    job_history_duration = 672:00:00
    max_concurrent_provision = 5
    power_provisioning = False
    max_job_sequence_id = 9999999

This is on CentOS-7.
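For anyone wanting to check many nodes at once, the pbsnodes -av listing can be parsed with a few lines of Python. This is a rough sketch that assumes the layout shown above (node name in column 0, indented `key = value` lines below it) and assumes that a licensed node would report a `license` attribute, which neither node does here. `parse_pbsnodes` is a hypothetical helper, and the sample is a shortened copy of the output above.

```python
def parse_pbsnodes(text):
    """Parse 'pbsnodes -av' style text into {node_name: {attr: value}}."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip():
            continue  # blank line between node records
        if not raw[0].isspace():
            # Unindented line: start of a new node record
            current = nodes.setdefault(raw.strip(), {})
        elif current is not None and " = " in raw:
            key, _, value = raw.strip().partition(" = ")
            current[key] = value
    return nodes


sample = """\
node3
     Mom = node3.local
     state = free
     resv = R11.server2

node4
     Mom = node4.local
     state = state-unknown,down
"""

nodes = parse_pbsnodes(sample)
# Nodes that do not report a 'license' attribute (assumed marker of licensing)
unlicensed = [name for name, attrs in nodes.items() if "license" not in attrs]
print(unlicensed)  # prints: ['node3', 'node4']
```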

Thank you for sharing the information. I might be on a different page here.

Looking at your pbsnodes -av output, one node (node3) is assigned to a reservation (resv = R11.server2) and the other (node4) is down. This might be the reason for the scheduler message.

Good thinking, but it does not apply in this case. The reservation starts way out in July, and my test jobs ask for only 2000 seconds. With a hack that sets lic_lock to 1, the jobs run okay.
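The reservation argument is simple interval arithmetic, sketched below. The exact reservation start is not given in this thread, so `resv_start` is a placeholder (the earliest possible July date), the job start is taken from the node's last_state_change_time purely for illustration, and `conflicts_with_reservation` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def conflicts_with_reservation(job_start, walltime_s, resv_start):
    """True if a job of walltime_s seconds would run into the reservation."""
    return job_start + timedelta(seconds=walltime_s) > resv_start


job_start = datetime(2021, 5, 15, 11, 20, 24)  # illustrative "now"
resv_start = datetime(2021, 7, 1)              # placeholder: actual date unknown
print(conflicts_with_reservation(job_start, 2000, resv_start))  # prints: False
```

A 2000-second job started in mid-May finishes the same day, so a reservation beginning in July cannot be what blocks it.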

My problem is that I cannot figure out how lic_lock is supposed to get set for non-commercial builds of OpenPBS. I tried using qmgr to set the license type, but that is rejected:

/opt/pbs/bin/qmgr -c 'set node node3 license=l'
qmgr obj=node3 svr=default: Undefined attribute 
qmgr: Error (15002) returned from server

I’m not finding anything relevant in the Licensing Guide. Should I be running a license server, even for OpenPBS?

Thanks.

I’ve only seen this issue when installing commercial PBS and forgetting to point it to the right license server, but I was not using the buckets code path. Looking at the code, I think I agree with you: this looks like a bug in OpenPBS. You certainly don’t need to set up a license server for OpenPBS. Do you mind filing a GitHub issue for this (on the Issues page of openpbs/openpbs)?

Thanks.

Issue opened: “Licensing issue with OpenPBS”, Issue #2392 in openpbs/openpbs on GitHub.
