Released a small web app to show cluster status

Hi all

I have released as open source a small web application to show the nodes, queues and jobs running on a cluster which uses PBS Pro. This is useful for administrators to quickly see the status of the nodes and how busy they are, to check how full each queue is and to see what jobs are running and queued. It is also useful for users to see this information.

You can see our working example here: https://hpc.research.uts.edu.au/status/
Here is a screenshot:

The source code is here: https://github.com/UTS-eResearch/pbsweb

Hopefully some of you may find this useful. Suggestions and improvements are welcome.

Best Regards
Mike Lake
eResearch, University of Technology Sydney

Nicely done! :tada:

Thank you very much for supporting the PBS Pro community.

Mike

+1
Much appreciated! Very helpful, this will certainly see widespread use!
Thank you :+1:

Something like this would be good, where the color coding reflects job distribution.

Hello @adarsh, how are you doing?

Which software (pbs cluster status - image that you attached) are you using to monitor your PBS workload?

Thanks

Hi @daniel, I am doing well, thank you. How are you? I have written a Python script that outputs the above HTML page. The script runs as a cron job, updating the page every minute or so.
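
As a rough illustration of that approach (not the actual script, which is not shown in this thread), a cron-driven page generator could look like the sketch below. The function name render_status_page and the sample node data are hypothetical; a real script would first collect the node states from PBS, which is left out here.

```python
import html
from datetime import datetime


def render_status_page(nodes, generated_at=None):
    """Return an HTML page with one table row per node.

    `nodes` maps a node name to a state string. Both are hypothetical
    inputs; collecting them from PBS is not shown here.
    """
    generated_at = generated_at or datetime.now()
    # Escape every value so odd characters cannot break the markup.
    rows = "\n".join(
        f"<tr><td>{html.escape(name)}</td><td>{html.escape(state)}</td></tr>"
        for name, state in sorted(nodes.items())
    )
    return (
        "<!DOCTYPE html>\n"
        "<html><head><title>Cluster status</title></head><body>\n"
        f"<p>Updated {generated_at:%Y-%m-%d %H:%M}</p>\n"
        "<table>\n<tr><th>Node</th><th>State</th></tr>\n"
        f"{rows}\n"
        "</table>\n</body></html>\n"
    )


if __name__ == "__main__":
    # Made-up sample data standing in for the real node states.
    sample = {"node01": "free", "node02": "job-busy"}
    print(render_status_page(sample), end="")
```

A cron entry with the schedule `* * * * *` would run such a script once a minute, with its output redirected to a file that the web server already serves.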

@adarsh
I’m doing well, thanks.

Is it available for download, or can you share it?

Regards

I can share the script, but please let me check and sanitise it first. Also, please share it back to the community if you extend it.

Hi Adarsh

Quite a few users would probably like to use your Python script. We look forward to it if you can release it under an open source license. The screenshot of the page it generates looks very useful. For some users, a simple Python script run by cron would be easier to install than my Python WSGI app.

Mike

Hi all

I have updated my PBSweb app that displays in real time the PBS stats for your HPC. The code is at https://github.com/UTS-eResearch/pbsweb

This release is version 2.0.0. It adds a script to ease the installation, better documentation and bug fixes and, most importantly, it now works with Python 3.8 instead of Python 2.7. See the Change Log and the Install instructions. For other details read the top of this post.

Regards
Mike Lake

Hi Mike,

I am getting the error below:

./swig_compile_pbs.sh

/usr/bin/ld: cannot find -lsec

collect2: error: ld returned 1 exit status

chmod: cannot access ‘_pbs.so’: No such file or directory

Hi Vinay

On my system I can check where this comes from by running ldd on the _pbs.so created by swig_compile_pbs.sh:
$ ldd _pbs.so
… snipped
libpbs.so.0 => /opt/pbs/lib/libpbs.so.0 (0x00007ff94c5f8000)
libssl.so.1.1 => /lib64/libssl.so.1.1 (0x00007ff94bc5b000)
libsec.so.0 => /opt/pbs/lib/libsec.so.0 (0x00007ff94ba58000)
… snipped
So on my system with PBS Pro it comes from /opt/pbs/lib/libsec.so.0, which is from the pbspro-client package. Perhaps OpenPBS does not have this.
Googling says: “Functions in this library provide comparison and manipulation of File Access Control Lists.”

Edit the swig compile script and remove that link requirement (-lsec). I see that on my system it still makes _pbs.so without it, so I will need to look up why I had it there.

Mike

Hi Mike,
Sorry to bother you. I did that and went ahead with install_pbsweb.sh … but then this gives an error that pbs.py and pbs.so are not found, so I commented those out as well.

Now it has permission issues.

The script is convenient, but I guess a lot of setups will have a different architecture.

So is there some way I can get just the Python app, so that I can copy it, modify the WSGI file and host it on the web server I prefer, for example Apache?

Hi Vinay
Did you copy pbs_ifl.h to the pbsweb directory?
After removing the -lsec, did the swig script create the three files pbs.py, pbs_wrap.c and _pbs.so?
Which of those three files did it create?
If it did not create _pbs.so that issue needs to be solved as the application needs that file to work.
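
The check for those three files can be sketched in a few lines of Python. This is only an illustration: the file names are the ones listed above, and the function name missing_swig_outputs is made up for this example. Run it from the pbsweb directory.

```python
from pathlib import Path

# File names taken from the post above.
EXPECTED_FILES = ("pbs.py", "pbs_wrap.c", "_pbs.so")


def missing_swig_outputs(directory="."):
    """Return the expected files that are absent from `directory`."""
    base = Path(directory)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_swig_outputs()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All three files are present.")
```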
Mike

Hi Mike,
I have copied pbs_ifl.h into the pbsweb directory,

but I am stuck with the swig_compile_pbs.sh file.

Screenshot 2022-06-28 at 6.14.25 PM

Also, I am now testing it with PBS Pro 19.1.3. What should I comment out in the swig compile file to avoid this error?

Hi
In the swig script you will see there is a compiling step and a linking step.
The compiling is just this one line:
gcc -c -shared -fpic -I$PYTHON_INCL -I$PBS_EXEC/include pbs_wrap.c
After that line add “exit” into the script so it will stop at this point. Run the script.
This should create three files: pbs_wrap.c, pbs_wrap.o and pbs.py.
Do you get these 3 files being created? What is the output of the swig script when it runs?
There should be no output if this step went OK.
Then we will move on to the linking step.
Mike

Hi Mike,
I was able to go further. Yes, those 3 files were created. Now, after starting the application, when I try to access http://ip_address/statuspbs it gives “Internal Server Error”.

Below is the uwsgi.log

Screenshot 2022-06-28 at 9.27.56 PM

Hi Vinay
The next step is to remove the “exit” you added and run swig again. We know that everything is fine (probably) up to this step. The linking step should create _pbs.so. That’s what the option -o means, i.e. output _pbs.so by linking together the preceding files.
Also check that there is a $PBS_EXEC/lib/libpbs.so
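
A small sketch of that library check, which also reports whether libsec is present since that was the library the linker could not find. The function name find_shared_libs is made up for this example, and /opt/pbs is only a fallback based on the path seen earlier in this thread.

```python
import os
from pathlib import Path


def find_shared_libs(lib_dir, stems=("libpbs", "libsec")):
    """Map each library stem to the matching .so file names in `lib_dir`."""
    lib_dir = Path(lib_dir)
    return {
        stem: sorted(p.name for p in lib_dir.glob(f"{stem}.so*"))
        for stem in stems
    }


if __name__ == "__main__":
    # Fall back to /opt/pbs, the location seen earlier in this thread.
    pbs_exec = os.environ.get("PBS_EXEC", "/opt/pbs")
    for stem, found in find_shared_libs(Path(pbs_exec) / "lib").items():
        print(stem, "->", ", ".join(found) if found else "not found")
```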
Mike

Hi Mike,
After removing the exit from the swig script, it gave the same error as earlier. So I removed the -lsec option from the third line and executed it again. It ran fine and created the file _pbs.so.

gcc -shared -fpic -L/opt/pbs/lib $PBS_EXEC/lib/libpbs.so pbs_wrap.o -lpthread -lcrypto -lssl -o _pbs.so


I went ahead with the other steps after this, but opening http://localaddress/statuspbs still gives “Internal Server Error”.

The uwsgi log file gives the output below:

--- no python application found, check your startup logs for errors ---

[pid: 1498|app: -1|req: -1/7] 10.101.202.77 () {42 vars in 783 bytes} [Wed Jun 29 08:28:07 2022] GET /statuspbs/ => generated 21 bytes in 0 msecs (HTTP/1.1 500) 2 headers in 83 bytes (0 switches on core 0)


But the uwsgi service is running.

Which configuration should I check?

Hi Vinay
Check in the pbsweb.ini that the paths there are all present.
Check your webserver conf setup and its logs.
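
The first check, that the paths in pbsweb.ini exist, could be automated roughly as below. This is a sketch under assumptions: the actual keys in pbsweb.ini are not shown in this thread, so the code simply treats every value that starts with "/" as a path. The function name check_ini_paths is made up for this example. A socket path may legitimately be missing while uwsgi is stopped.

```python
import configparser
from pathlib import Path


def check_ini_paths(ini_file):
    """Return {(section, key, value): exists} for values that look like absolute paths."""
    # interpolation=None keeps literal % characters from raising errors;
    # strict=False tolerates repeated keys.
    parser = configparser.ConfigParser(
        strict=False, allow_no_value=True, interpolation=None
    )
    parser.optionxform = str  # keep key names exactly as written
    parser.read(ini_file)
    results = {}
    for section in parser.sections():
        for key, value in parser.items(section):
            if value and value.startswith("/"):
                results[(section, key, value)] = Path(value).exists()
    return results


if __name__ == "__main__":
    for (section, key, value), exists in check_ini_paths("pbsweb.ini").items():
        print(f"[{section}] {key} = {value}: {'ok' if exists else 'MISSING'}")
```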
Googling that error message turns up a lot of results. I find that Python WSGI is not easy!
I also have a minimal Python WSGI app that just prints some text, which is useful for testing. I don’t seem to be able to upload or attach it here, though.
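
For reference, a minimal WSGI test app of that kind can be as small as the sketch below. This is not the file mentioned above, just a generic example of the standard WSGI interface; the response text is arbitrary.

```python
def application(environ, start_response):
    """Smallest useful WSGI app: always answer with a short plain-text body."""
    body = b"Minimal WSGI test app is working.\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

uWSGI looks for a callable named application by default. If this file is served correctly but pbsweb is not, the problem is more likely in the pbsweb setup than in the web server configuration. It can also be tried without any web server by using wsgiref.simple_server from the Python standard library.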
Mike

No problem, Mike. I’ll update you soon.