Scheduler Maintaining Dedicated Server Connections

suresht · March 4, 2020, 5:49am

All,

I have simplified the proposal and added it to the design document. Please go through it and share any comments.

agrawalravi90 · March 4, 2020, 9:36pm

"Scheduler might choose a different server instance/s to run the jobs than the one who has kicked off the cycle. This brings us the need of having persistent connections among Server/s and Scheduler/s. "

So the need is born out of the performance lag that would otherwise occur, because we’d have to create new connections to different servers, possibly multiple times during a cycle? If yes, can you please mention this in the motivation as well?

"As the connections are persistent, when Scheduler finishes its cycle it needs to have a mechanism through which it can tell the Server that scheduling cycle is finished"

A section titled “Solution” shouldn’t start by talking about the challenges associated with the proposed changes, so this seems kind of misplaced. How about we rename this section “Internal/Technical details”?

"Each Server can mark a flag when it kicks off a scheduling cycle and clears this flag when it receives end of cycle indication from Scheduler."

Can you please explain here what the end of cycle indication from the scheduler will look like? Is it an IS_ message?

suresht · March 5, 2020, 8:39am

Yes, performance is one of the reasons why we should have persistent connections. Sure, I will mention this in the motivation as well.

suresht · March 5, 2020, 8:40am

I am OK with renaming the section to “Technical Details”.

suresht · March 5, 2020, 8:42am

It is just an integer flag that we send on the socket. The Server reads it and understands it as the end of cycle indication.
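A minimal sketch of that exchange, not the actual PBS code: it assumes the flag is a 4-byte integer in network byte order, and the names `END_OF_CYCLE`, `send_end_of_cycle`, and `recv_int` are hypothetical. A local `socket.socketpair()` stands in for the persistent Scheduler/Server connection.

```python
import socket
import struct

# Hypothetical value: the thread only says "an integer flag"; the real value is not given.
END_OF_CYCLE = 0


def send_end_of_cycle(sock):
    """Scheduler side: write one 4-byte integer in network byte order."""
    sock.sendall(struct.pack("!i", END_OF_CYCLE))


def recv_int(sock):
    """Server side: read exactly 4 bytes and decode them as an integer."""
    buf = b""
    while len(buf) < 4:
        chunk = sock.recv(4 - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return struct.unpack("!i", buf)[0]


# socketpair() stands in for the persistent Scheduler/Server connection.
sched_end, server_end = socket.socketpair()
send_end_of_cycle(sched_end)
received = recv_int(server_end)
sched_end.close()
server_end.close()
```

The read loop is there because a single `recv` call on a stream socket may return fewer bytes than requested.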

suresht · March 5, 2020, 9:00am

@agrawalravi90, I have addressed your comments. Please have a look.

agrawalravi90 · March 5, 2020, 2:21pm

Ok, so you are saying that since the scheduler doesn’t send anything else on the secondary connection, when it sends a random integer, the server will take that to mean end of cycle notification? If yes then please also add this explanation to the document.
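If that reading is right, the server-side handling could be sketched roughly as follows. This is only an illustration of my understanding, not the real implementation; the class and method names (`ServerCycleState`, `kick_off_cycle`, `on_secondary_data`) are made up for the example.

```python
class ServerCycleState:
    """Tracks whether this server has a scheduling cycle in flight (hypothetical helper)."""

    def __init__(self):
        self.cycle_in_progress = False

    def kick_off_cycle(self):
        # Mark the flag when the server asks the scheduler to start a cycle.
        self.cycle_in_progress = True

    def on_secondary_data(self, value):
        # Nothing else is sent on the secondary connection, so the value
        # itself is ignored: any integer arriving here ends the cycle.
        self.cycle_in_progress = False


state = ServerCycleState()
state.kick_off_cycle()
during = state.cycle_in_progress   # flag is set while the cycle runs
state.on_secondary_data(12345)     # arbitrary integer from the scheduler
after = state.cycle_in_progress    # flag is cleared again
```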

“Scheduler might choose a different server instance/s to run the jobs than the one who has kicked off the cycle. This brings us the need of having persistent connections among Server/s and Scheduler/s. It also increases the performance as Scheduler can directly talk to the desired Server/s.”

Your phrasing implies that there’s another reason besides performance. From what I understand, functionally we would have been ok without dedicated connections in a multi-server scenario, we’d have just passed IFL the correct server to talk to, right? So isn’t performance the ONLY reason to do this?
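To make the performance point concrete, here is a rough sketch of what I mean. All names are hypothetical and `fake_connect` is only a stand-in for real connection setup; the point is that with persistent connections the setup cost is paid once per server rather than once per request.

```python
class ConnectionCache:
    """Keeps one open connection per server name and reuses it (hypothetical helper)."""

    def __init__(self, connect):
        self._connect = connect      # callable: server name -> connection object
        self._conns = {}
        self.connects_made = 0

    def get(self, server):
        # Open a connection only the first time a server is addressed.
        if server not in self._conns:
            self._conns[server] = self._connect(server)
            self.connects_made += 1
        return self._conns[server]


def fake_connect(server):
    # Stand-in for a real connection setup; returns a placeholder object.
    return {"server": server}


cache = ConnectionCache(fake_connect)
# One cycle that runs five jobs spread across two servers.
for target in ["svr1", "svr2", "svr1", "svr1", "svr2"]:
    cache.get(target)
```

Here five requests result in only two connection setups. Without the cache, each request to a different server would pay the setup cost again.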

suresht · March 6, 2020, 4:42am

Yes, @agrawalravi90. As this is an implementation detail, I don’t think it is necessary to add it. What do you think?

agrawalravi90 · March 6, 2020, 4:52am

I think it’s an important detail; it wasn’t obvious. We should write internal details in the design if they explain the algorithm better. But maybe it just wasn’t obvious to me, so it’s your call. I’m ok with whatever you think is best.

suresht · March 6, 2020, 4:53am

Performance is the only reason we are moving towards this approach, and I have made that clear in the design document. Please take a look.

agrawalravi90 · March 6, 2020, 4:56am

Looks better now, thanks

suresht · March 12, 2020, 10:12am

All,
I have added a new scheduling command, SCH_SVR_IDENTIFIER, and described why it is needed when there are multiple servers. Please look into it and share any comments.

suresht · April 6, 2020, 5:23am

I have added the new and modified log messages to the document. Please take a look.
