Events and Messaging

I have been working on having the OpenDAX core send event messages to modules when something in the database changes (or some other event happens), and I am starting to see the problems with using a message queue for communication between modules. Specifically, the message queue creates interface problems between the modules and the library. It works just fine as long as the message queue simply replicates a functional interface. For instance, reading a tag is a function call that amounts to one message sent (the request) and another message received (the data). Once the message queue is used for any kind of asynchronous notification scheme, the problems become apparent.

Basically it requires too much complexity for the module. I'd rather the complexity be in the library and/or core. I've always said that I wanted building modules to be easy. I still think that a shallow learning curve for programming modules will be of great benefit to the OpenDAX project, because eventually the success of the project will depend on the number and quality of modules out there for it.

Let me try to explain a bit more. If a module reads and writes a lot of data, an asynchronous message handler is not really that difficult. Suppose the library is blocked in the msgrcv() function when the OpenDAX core sends an event message. msgrcv() will retrieve the event message, and the wrapper function can call some kind of event handling function within the library to deal with it. The wrapper then calls msgrcv() again and waits for the message that it wanted in the first place. This kind of interface would work well for I/O type modules that are constantly moving data.
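To make that concrete, here is a minimal sketch of such a wrapper. It is not the actual OpenDAX library code. The message layout, the MSG_EVENT and MSG_REPLY type values, and the names wait_for_reply() and count_event() are all made up for illustration, and a real version would also need to deal with things like EINTR.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical message types and payload size; not OpenDAX's real values. */
#define MSG_EVENT 1L
#define MSG_REPLY 2L
#define MSG_DATA_MAX 64

struct dax_msg {
    long mtype;                 /* System V queues require this first member */
    char data[MSG_DATA_MAX];
};

typedef void (*event_handler)(const struct dax_msg *msg, size_t len);

/* Block until a non-event message arrives. Any event messages that show up
 * first are handed to on_event. Returns the payload length of the reply,
 * or -1 if msgrcv() fails. */
ssize_t wait_for_reply(int qid, struct dax_msg *out, event_handler on_event)
{
    for (;;) {
        /* A msgtyp of 0 means "take the first message of any type". */
        ssize_t n = msgrcv(qid, out, sizeof out->data, 0, 0);
        if (n < 0)
            return -1;
        if (out->mtype == MSG_EVENT) {
            if (on_event)
                on_event(out, (size_t)n);
            continue;           /* go back and wait for the message we wanted */
        }
        return n;
    }
}

/* Example handler for illustration: just counts the events it sees. */
static int events_seen = 0;

static void count_event(const struct dax_msg *msg, size_t len)
{
    (void)msg;
    (void)len;
    events_seen++;
}
```

A module that calls wait_for_reply() often gets its events handled promptly. A module that rarely calls it does not, because events are only noticed while the library is sitting inside msgrcv().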

It doesn't work too well for modules like daxc, the command line interface module for OpenDAX. daxc spends most of its time waiting for the user to input data. If the core sends daxc an event, the module would not see the event message until the user entered a command that required a message to be received. Only then would the msgrcv() function retrieve the event message, process it, and then get the message for the user's command. It would be possible to implement threads in daxc to handle this situation, with one thread taking care of the message queue and the other handling the user input. This added complexity on the part of the module is exactly what I am trying to avoid.

I can't help but wonder what kind of interface would be best for the modules. I am starting to think that the library should take more control of the program flow of the modules. For instance, the modbus module is a scanning type module that sleeps for a specified period of time and then wakes up, sends some modbus messages and then goes back to sleep. Is it better for the modbus module to handle this timing or should the message sending routines in the module be callback routines that the library calls in response to timeouts or events received from the core? This might make porting this application to a real-time OS easier. If the library handled these timing details then porting to a real-time OS would be a matter of dealing with new library code and not rewriting a bunch of modules.
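As a rough sketch of what the callback style could look like, assume the library exposes a run loop that owns the timing and the module only supplies a scan routine. Nothing here exists in OpenDAX today: lib_run_periodic(), scan_callback, and modbus_scan() are hypothetical names, and the loop only handles timeouts, not events from the core.

```c
#include <time.h>

/* Module-supplied routine; returns nonzero to ask the library to stop. */
typedef int (*scan_callback)(void *ctx);

/* Hypothetical library entry point: call scan once per period.
 * If max_cycles is positive, stop after that many calls.
 * Returns the number of times scan was called. */
int lib_run_periodic(scan_callback scan, void *ctx, long period_ms,
                     int max_cycles)
{
    struct timespec period;
    int cycles = 0;

    period.tv_sec = period_ms / 1000;
    period.tv_nsec = (period_ms % 1000) * 1000000L;

    while (max_cycles <= 0 || cycles < max_cycles) {
        cycles++;
        if (scan(ctx) != 0)
            break;
        /* The sleep lives in the library. A port to a real-time OS would
         * swap this call out without touching any module code. */
        nanosleep(&period, NULL);
    }
    return cycles;
}

/* Example module callback: counts its scans and stops after three.
 * A real modbus module would send its requests here. */
static int modbus_scan(void *ctx)
{
    int *count = ctx;
    (*count)++;
    return *count >= 3;
}
```

With this shape the module never calls sleep itself, so the timing policy can change in one place.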

Even if the library implemented a purely event driven type interface it might still cause problems for any kind of user input. Most GUI toolkits provide a similar event driven interface of their own, and the two event loops would probably not coexist very well. The library would also have to either implement or interrupt any kind of gets or readline type function calls. I will have to think about this some more.

Part of this makes me start to wonder if I should have used sockets as the IPC method between OpenDAX and the modules. I could then use select() or poll() and make the interface whatever I wanted it to be. I suppose that I could do something similar with the message queue by first checking for the presence of a message and then retrieving it if one is there, and then calling usleep() or something similar to act like select(). The problem with this idea is obviously that it would require two or three system calls to mimic what could be done with a single select() call. I suspect that the kinds of messages that will be commonly sent will be fairly small, so bandwidth may not be as big of an issue as system call overhead. I don't really know the answer to that question.
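Here is a minimal sketch of that select() imitation on a System V message queue. It is only an illustration under some assumptions: queue_wait(), struct poll_msg, and the 1 ms polling step are invented for the example, and msgrcv() with IPC_NOWAIT is used to fold the "check" and "retrieve" steps into one call.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <unistd.h>

#define POLL_DATA_MAX 64
#define POLL_STEP_US 1000       /* hypothetical polling interval: 1 ms */

struct poll_msg {
    long mtype;
    char data[POLL_DATA_MAX];
};

/* Wait up to roughly timeout_ms for any message on the queue.
 * Returns 1 if a message was read (payload length stored in *len),
 * 0 on timeout, and -1 on error, loosely mirroring select(). */
int queue_wait(int qid, struct poll_msg *out, ssize_t *len, long timeout_ms)
{
    long waited_us = 0;

    for (;;) {
        /* IPC_NOWAIT makes msgrcv() fail with ENOMSG on an empty queue
         * instead of blocking. */
        ssize_t n = msgrcv(qid, out, sizeof out->data, 0, IPC_NOWAIT);
        if (n >= 0) {
            *len = n;
            return 1;
        }
        if (errno != ENOMSG)
            return -1;          /* a real error, not just an empty queue */
        if (waited_us >= timeout_ms * 1000L)
            return 0;           /* nothing arrived in time */
        /* usleep() may sleep longer than asked, so the timeout is only
         * approximate. */
        usleep(POLL_STEP_US);
        waited_us += POLL_STEP_US;
    }
}
```

Every pass through the loop that finds the queue empty costs two system calls (msgrcv() and usleep()), which is the overhead a single blocking select() call would avoid.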

Another advantage of sockets is that it makes OpenDAX immediately network aware. It would simply be a difference of what type of socket was opened.
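To illustrate how small that difference could be, here is a sketch of two listener setups that differ only in the address they bind to. Strictly speaking it is the address family (AF_UNIX versus AF_INET) that changes, while both are stream sockets. The function names core_listen_local() and core_listen_tcp() are hypothetical, and the TCP version binds to loopback only to keep the example safe.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Shared tail: bind and listen, closing the socket on failure. */
static int bind_and_listen(int fd, const struct sockaddr *addr, socklen_t len)
{
    if (fd < 0)
        return -1;
    if (bind(fd, addr, len) < 0 || listen(fd, 8) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Local modules: a Unix domain socket at a filesystem path. */
int core_listen_local(const char *path)
{
    struct sockaddr_un addr;

    if (strlen(path) >= sizeof addr.sun_path)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    unlink(path);               /* remove a stale socket file, if any */
    return bind_and_listen(socket(AF_UNIX, SOCK_STREAM, 0),
                           (struct sockaddr *)&addr, sizeof addr);
}

/* Network modules: a TCP socket. Port 0 lets the kernel pick a port.
 * This sketch binds to loopback only; a real core would choose the
 * address to listen on. */
int core_listen_tcp(uint16_t port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    return bind_and_listen(socket(AF_INET, SOCK_STREAM, 0),
                           (struct sockaddr *)&addr, sizeof addr);
}
```

Once either function returns a descriptor, the accept(), read(), write(), and select() code that follows would be the same for both.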

The drawback to using sockets is that modules would not be able to send messages to one another easily without using core resources. Probably not a big deal since I'm having a hard time figuring out why modules would want to send messages to each other. A logging module is really the only thing I can come up with, and that is easily implemented in other ways.

Also, message queues are a little more dependable. If the msgsnd() function returns successfully then the message is on the queue, and the library can assume that if the recipient is alive it will eventually get the message. Sockets are really more dependable than that on the local machine, but if we go with sockets we might as well go all the way and handle the errors as though the other end were on another continent, and that makes them more complicated. I did say that I was okay with complexity in the library though, so I should probably stop being lazy.

I should probably spend a good bit of time re-thinking how the modules interact with the core. I have been designing from the core out and I should probably start with the module/library interface and work my way in. Stay tuned. As always comments are welcome.