PP-288: Asynchronous logging option for the daemons

Hi Bhroam,

We do not call fflush() on the log file in log_record() when PBS_LOG_ASYNC is set. With this change we have seen a remarkable improvement in logging performance, which in turn improves job cycle performance. However, as you said, log messages can intermix when both the parent and its children write to the same log file. One solution would be to append the process ID to each log message, with a proper locking mechanism, similar to the way logging is done in TPP. Since fflush() automatically locks the stream, I think this mechanism is fine. Please share your thoughts.
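To make the behaviour described above concrete, here is a minimal sketch of a log routine that skips fflush() when an async flag is on. The names log_async_enabled and log_record_sketch are hypothetical and are not taken from the PBS source; how PBS_LOG_ASYNC is actually read is not shown here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag: nonzero when PBS_LOG_ASYNC is set.
 * Reading the setting itself is outside this sketch. */
static int log_async_enabled = 0;

/* Hypothetical stand-in for log_record(): write one line, and flush
 * only in the synchronous (default) mode. Returns 0 on success. */
int log_record_sketch(FILE *logfile, const char *text)
{
    assert(logfile != NULL && text != NULL);

    if (fprintf(logfile, "%s\n", text) < 0)
        return -1;

    /* In async mode the data stays in the stdio buffer until the
     * buffer fills or the stream is flushed or closed. */
    if (!log_async_enabled && fflush(logfile) != 0)
        return -1;

    return 0;
}
```

With the flag off, each record reaches the file before the call returns. With it on, records become visible only when stdio decides to write, which is where the intermixing concern comes from.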

As you suggested, we are planning to move all configuration required to achieve the job cycle performance to qmgr and to update the EDD accordingly.

As we are moving this configuration to qmgr, we can modify the EDD to say that this parameter is not turned on by default, and that users who want high throughput can turn it on using qmgr.

This parameter has no effect on syslog logging.

I believe using write() with O_APPEND would address the issue of intermixing of log messages… at least on POSIX filesystems. (On NFS or for pipes, it might only work when the buffer is small enough, e.g., under 512 or 4096 bytes).

Intermixing of messages is only possible if you are using buffered file I/O, i.e., the libc functionality accessed via the FILE * pointer. If you use the Unix system calls directly (the fd-based open, read/write, etc.) along with O_APPEND, it's guaranteed that the messages will not trample over each other.
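A rough sketch of the fd-based approach follows: each record is formatted into a local buffer and handed to the kernel in a single write() on a descriptor opened with O_APPEND. The helper names log_open_append and log_write_record, the 1024-byte record limit, and the "pid;message" layout are assumptions made for illustration only; this is not the PBS log format or API.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open (or create) a log file in append mode. */
int log_open_append(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
}

/* Hypothetical helper: build one complete record, prefixed with the
 * process ID, and emit it with exactly one write() call.
 * Returns 0 on success, -1 on error or short write. */
int log_write_record(int fd, const char *msg)
{
    char buf[1024];   /* assumed maximum record size */
    int len;

    assert(msg != NULL);

    len = snprintf(buf, sizeof(buf), "%ld;%s\n", (long)getpid(), msg);
    if (len < 0)
        return -1;
    if ((size_t)len >= sizeof(buf)) {
        /* Record was truncated: keep it newline-terminated. */
        len = (int)sizeof(buf) - 1;
        buf[len - 1] = '\n';
    }

    /* With O_APPEND the kernel positions each write at end of file,
     * so a whole record lands as one unit. */
    return write(fd, buf, (size_t)len) == (ssize_t)len ? 0 : -1;
}
```

Because no stdio buffer is involved, there is nothing for a parent and child to hold separately; the trade-off is one system call per record, which is the cost the async option is trying to avoid.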

@bhroam, we need your thoughts on the following.

We are just trying to understand whether we are moving the configuration of all PBS daemons to qmgr. If yes, then we think we can move the PBS_LOG_ASYNC parameter to qmgr.

If not, we can still try moving this parameter to qmgr, but then it would apply only to the PBS Server for now. Other daemons like the Scheduler and MOM have their own config files, so for them we would have to depend on pbs.conf anyway. Please let us know which approach is the right one to follow in this case.

Thanks,
Suresh

We are definitely going in the direction of putting configuration into pbs.conf. I understand this is less convenient than using pbs.conf. The daemons can come up and do a stat IFL call to the server to get some configuration data. In this case it’d be to find out if we’d log asynchronously.

Bhroam

I think that’s a typo – the future direction being proposed is to use qmgr for all configuration (not pbs.conf).

doh! Yes it was a typo. We are putting configuration into qmgr. Thanks for catching this!

Bhroam

I think you are going to get intermixed messages unless you are very careful. If you use fprintf(), as with the existing code, but skip the fflush() call after each fprintf(), you risk getting intermixed messages, because fprintf() itself issues a write whenever its stdio buffer fills, and that can happen in the middle of a record.

If you could peek inside the stdio buffering, you could check how full the buffer is getting and issue your own fflush() at a record boundary. But that’s not portable. Instead, you could use setvbuf() to control the size of the buffer and accumulate the results of the fprintf() calls to keep track of how many bytes have been used. Again, call fflush() at a record boundary when your accumulator gets above xx% of the size of your buffer. Your counter will get out of sync with reality if the process forks (because fork flushes stdio buffers), but that just means you’ll do your own flush earlier than you needed to. Then, you’ll be back in sync.
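The setvbuf() plus accumulator idea could look roughly like the sketch below. The struct and function names, the 8192-byte buffer, and the 75% threshold (standing in for "xx%") are all assumptions chosen for illustration; none of them come from the PBS code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define LOG_BUF_SIZE   8192                     /* assumed buffer size */
#define LOG_FLUSH_MARK (LOG_BUF_SIZE * 3 / 4)   /* assumed "xx%" = 75% */

/* Hypothetical wrapper around a log stream. The struct must stay
 * alive until the stream is closed, since stdio uses buf directly. */
struct async_log {
    FILE  *fp;
    size_t pending;             /* bytes we believe are still buffered */
    char   buf[LOG_BUF_SIZE];
};

/* Install our own fully buffered stdio buffer. Must be called before
 * any other I/O on fp. Returns 0 on success. */
int async_log_init(struct async_log *lg, FILE *fp)
{
    assert(lg != NULL && fp != NULL);
    lg->fp = fp;
    lg->pending = 0;
    return setvbuf(fp, lg->buf, _IOFBF, sizeof(lg->buf));
}

/* Write one record; flush at the record boundary once the counter
 * passes the mark. Returns 1 if flushed, 0 if not, -1 on error.
 * Records larger than LOG_BUF_SIZE - LOG_FLUSH_MARK can still be
 * split by stdio; this sketch does not guard against that. */
int async_log_record(struct async_log *lg, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vfprintf(lg->fp, fmt, ap);
    va_end(ap);
    if (n < 0)
        return -1;

    lg->pending += (size_t)n;
    if (lg->pending >= LOG_FLUSH_MARK) {
        if (fflush(lg->fp) != 0)
            return -1;
        lg->pending = 0;
        return 1;
    }
    return 0;
}
```

If the counter overestimates what is in the buffer, the only consequence is an earlier flush than strictly necessary, after which the counter is accurate again.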

I realize I’m fighting a losing battle, but why should every configuration parameter be settable from qmgr? If the parameter is specific to a host, why should it not be in a host-specific file, especially if the setting never changes during the life of the host?

Take this async logging option for example. Whether that makes sense is very dependent on the host where the daemon is running. Thus, it has to be a node-specific option. This means that every time I grab the status for that node, I get an additional 20-30 characters of invariant noise. Multiply that by 10,000 nodes and that’s 1/4 MB of noise that has to be encoded by the server every time someone asks for the state of the cluster. Pure waste.

@dtalcott is right. Scheduler and server level config through qmgr is OK, but for MoMs (hosts) it may not be the right fit. There may be use cases where only a few MoMs need the same configs as the server.

Having said that, qmgr already provides for setting node info. We would need to extend this to forward the configs to the corresponding nodes.

I think most server and scheduler options make sense to put into qmgr. However, I agree with @dtalcott on this one. In this instance of logging, I think it makes sense to keep it in /etc/pbs.conf, since we do not currently have any daemon other than the server getting most of its config from qmgr. I know we do have plans to move the sched config into qmgr, but we don't yet have plans to move the mom and comm into qmgr anytime soon.

I agree there - given that we need this configuration on all the daemons and that the “all configurations via qmgr project” is not yet staffed, it makes sense to address this currently in the pbs.conf file.

Allow me to begin by stating that PBS Pro is an open source community based project. One of the main reasons this forum exists is to ensure the community is in charge. When differences of opinion are encountered between community members we must discuss them openly. We do not want any community member to feel they are fighting a losing battle, or any battle whatsoever. My apologies to @dtalcott if that has been your experience. Let’s work together to address your concerns.

The topic of this thread involves asynchronous logging. However, I believe the source of contention regarding storage of configuration data is actually covered by PP-685. I have opened a new thread to discuss storage of configuration data. I invite and encourage you to participate.

I share your concern @dtalcott, but I think it also depends on the design and implementation of the feature pertaining to configuration data. Because this topic is independent of asynchronous logging, I have started a new thread to discuss it. Please let me know if I have in some way misunderstood your message.

For the time being, I would agree that the configuration data pertaining to this feature should be stored in a local file. My question is whether an administrator might want microsecond logging enabled per daemon. If that is the case, /etc/pbs.conf might not make sense.

@mkaro I already got sign off from you and Bhroam on the following thread of discussion. I hope I can assume your sign off on this one too, because it also relates to logging and, as per the discussion we had, we have decided to go with pbs.conf.

http://community.openpbs.org/t/pp-261-micro-second-time-stamp-for-daemon-logging/?source_topic_id=466

Hi @suresht,
The EDD looks fine to me. Sign off.

Hi @suresht,
EDD looks fine to me, I sign off.

The EDD looks fine to me as well. I sign off.