Can I use tpp_mcast to send different data to each receiver?

Hi developers,

I am trying to parallelize the tm_spawn function using tpp_mcast and need your expertise.

Here is a brief summary of how tm_spawn currently works. A client that uses the task manager API (for example pbs_dsh) has a list of nodes on which it wants to spawn a task, and for each node it calls tm_spawn, which has the signature below:

int tm_spawn(int argc, char **argv, char **envp, tm_node_id where, tm_task_id *tid, tm_event_t *event)

Note that the client passes an “event”, which is just a number. If there are n nodes, there will be n event numbers. These will be used to track the status of the tm_spawn requests.

Once the mom gets a TM_SPAWN request from the client and finds that the node requested for the spawn is not herself, she sends an IM_SPAWN (inter-mom message) to the sister, forwarding the event number as well. When the mom gets a reply back from the sister, she calls tm_reply to reply to the client. The client learns the status of the requests it sent out by calling tm_poll using the event number.
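
To make the one-event-per-node bookkeeping concrete, here is a minimal simulation of the client side of that flow. It does not call the real TM API: sim_spawn, sim_complete, spawn_on_all and outstanding are hypothetical stand-ins I made up for illustration, and the real tm_spawn takes more arguments, as shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for tm_node_id and tm_event_t (the real ones live in tm.h). */
typedef int sim_node_id;
typedef int sim_event_t;

#define SIM_MAX_EVENTS 64

int sim_done[SIM_MAX_EVENTS]; /* sim_done[e] is 1 once request e has been answered */
int sim_next_event = 1;       /* next unused event number */

/* Stand-in for tm_spawn: registers one request and hands back its event number. */
int sim_spawn(sim_node_id where, sim_event_t *event)
{
    (void)where; /* the simulation does not contact any node */
    assert(event != NULL);
    if (sim_next_event >= SIM_MAX_EVENTS)
        return -1;
    *event = sim_next_event++;
    sim_done[*event] = 0;
    return 0;
}

/* Stand-in for mom relaying a sister's reply for one event. */
void sim_complete(sim_event_t event)
{
    assert(event > 0 && event < sim_next_event);
    sim_done[event] = 1;
}

/* Client loop: one spawn call, and therefore one event, per node.
 * Returns how many requests were issued. */
int spawn_on_all(const sim_node_id *nodes, int n, sim_event_t *events)
{
    int i;
    for (i = 0; i < n; i++) {
        if (sim_spawn(nodes[i], &events[i]) != 0)
            break;
    }
    return i;
}

/* Number of events in the list that have not been answered yet. */
int outstanding(const sim_event_t *events, int n)
{
    int i, count = 0;
    for (i = 0; i < n; i++) {
        if (!sim_done[events[i]])
            count++;
    }
    return count;
}
```

With three nodes the client ends up holding three distinct event numbers, and each reply relayed by the mom clears exactly one of them.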

Now I would like to achieve something like this:

int tm_spawn_multi(int argc, char **argv, char **envp, tm_node_id where[], tm_task_id *tid[], tm_event_t *event[])

so that a client can spawn tasks on multiple nodes at once (note that where has become an array) instead of using a loop. The implementation could use the tpp_mcast functionality: once mom gets the TM_SPAWN request from the client, she could use tpp_mcast_add_strm to add the receivers (sister moms) specified by where[], then im_compose to send all the sisters an IM_SPAWN request at once.

What I could not figure out is how to use events to track the status of these requests. The tpp_mcast suite of functions assumes that the same data is sent to all the streams (sisters). As the signature of im_compose shows, all the streams would be sent the same event number:

int im_compose(int stream, char *jobid, char *cookie, int command, tm_event_t event, tm_task_id taskid, int version)

Is there a way to use tpp_mcast, but send each receiver a different event number?
Or is there a better way to track the status of the tm_spawn requests? Maybe, instead of replying to the client every time a sister completes the spawn, the mom could use a single event to track how many sisters have finished spawning the task, and then send a single reply to the client once all sisters have finished?
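
On the first question, one possibility that leaves tpp_mcast untouched is to multicast a single shared event number and have mom translate it back to the per-node client event when each reply arrives. Below is a rough sketch of such a lookup table. Everything in it (struct event_map, the event_map_* functions, the sketch_* types) is hypothetical and not existing PBS code, and it assumes mom can tell which sister a reply came from, for example from the stream it arrives on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for tm_event_t and tm_node_id. */
typedef int sketch_event_t;
typedef int sketch_node_id;

#define SKETCH_MAX_NODES 16

/* Hypothetical table kept by mom for one multicast spawn request. */
struct event_map {
    sketch_event_t shared_event;                   /* the one event sent to every sister */
    int n;                                         /* number of entries in use */
    sketch_node_id node[SKETCH_MAX_NODES];         /* target node of entry i */
    sketch_event_t client_event[SKETCH_MAX_NODES]; /* event the client holds for that node */
};

void event_map_init(struct event_map *m, sketch_event_t shared)
{
    assert(m != NULL);
    m->shared_event = shared;
    m->n = 0;
}

/* Remember which client event belongs to which node. Returns 0, or -1 if the table is full. */
int event_map_add(struct event_map *m, sketch_node_id node, sketch_event_t client_event)
{
    assert(m != NULL);
    if (m->n >= SKETCH_MAX_NODES)
        return -1;
    m->node[m->n] = node;
    m->client_event[m->n] = client_event;
    m->n++;
    return 0;
}

/* Translate a sister's reply (shared event plus sender) into the client's event.
 * Returns 0 and fills *out on success, -1 if the event or the sender is unknown. */
int event_map_lookup(const struct event_map *m, sketch_event_t shared,
                     sketch_node_id from, sketch_event_t *out)
{
    int i;
    assert(m != NULL && out != NULL);
    if (shared != m->shared_event)
        return -1;
    for (i = 0; i < m->n; i++) {
        if (m->node[i] == from) {
            *out = m->client_event[i];
            return 0;
        }
    }
    return -1;
}
```

The packet stays identical for every sister, so it is still a true multicast, while the client keeps polling on its own per-node events.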

Thanks,

Hi @mliu

tpp_mcast() was designed to multicast (at the application level, not IP multicast), and multicast really means sending the “same” packet to more than one receiver in parallel, right? If we made it separate packets (differing by even one bit), it would become unicast to individual target receivers, wouldn’t it?

tpp_mcast() is also used to send IM_JOIN_JOB requests to sisters, and that is tracked by the MS properly. You could use that kind of semantics: IIRC, the MS uses counts to know whether every sister has responded, in line with the last line of your message.
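
As a rough illustration of that count-based idea (not the actual IM_JOIN_JOB code, which I have not reproduced here), a counter per multi-spawn request could look like the sketch below. struct multi_spawn, multi_spawn_init, multi_spawn_reply and sketch_event_t are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef int sketch_event_t; /* hypothetical stand-in for tm_event_t */

/* Hypothetical per-request record kept by mom for one multi-node spawn. */
struct multi_spawn {
    sketch_event_t client_event; /* the single event the client polls on */
    int pending;                 /* sisters that have not replied yet */
    int failed;                  /* sisters that reported an error */
};

/* Start tracking a request that was multicast to nsisters sisters. */
void multi_spawn_init(struct multi_spawn *ms, sketch_event_t ev, int nsisters)
{
    assert(ms != NULL && nsisters > 0);
    ms->client_event = ev;
    ms->pending = nsisters;
    ms->failed = 0;
}

/* Record one sister's reply (error is nonzero if her spawn failed).
 * Returns 1 when this was the last outstanding reply, meaning the caller
 * should now send the single reply to the client; returns 0 otherwise. */
int multi_spawn_reply(struct multi_spawn *ms, int error)
{
    assert(ms != NULL && ms->pending > 0);
    if (error)
        ms->failed++;
    ms->pending--;
    return ms->pending == 0;
}
```

Mom would call multi_spawn_reply once per sister response and reply to the client only when it returns 1, using failed to decide what status to report.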

Subhasis