I'm back to questioning the method I am using to communicate between the OpenDAX server and the modules. I have been stuck on using a message queue for this communication. I have once again read a book (a rather authoritative one) that said message queues should not be used for new projects. Uggghhh! I guess I probably knew that, and I have questioned the decision since the very beginning of the project. Since I keep questioning my choice of message queues, I've decided to write down what I think the pros and cons of the queue are and solicit comments on what I should be doing.
Pros
- Atomic - Message queues are atomic. When msgsnd() returns successfully, the calling program is guaranteed that the entire message is on the queue.
- Sharing - Several processes can use the queue to send messages to each other or to send them to the core. This can also be a con.
- Fast - Although not the fastest IPC mechanism on some systems, they are pretty fast.
- Record Based - If a structure is put into the queue, that same structure comes out. The memory can simply be copied. I guess some of the other mechanisms will do this too, but it's really easy with the queue.
- Easy - Create a queue and everybody can start reading and writing.
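For reference, here is a minimal sketch of the atomic, record-based behavior described above, using the System V calls msgget(), msgsnd(), msgrcv() and msgctl(). The struct demo_msg record and the queue_roundtrip() helper are made up for illustration and are not OpenDAX's actual message layout or code.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical record for illustration; not OpenDAX's real message layout. */
struct demo_msg {
    long mtype;        /* required first member, must be greater than 0 */
    int  handle;
    char payload[16];
};

/* Send one record through a private queue and read it back.
 * Returns 0 if the record comes out unchanged, -1 on any failure. */
int queue_roundtrip(void)
{
    static const char text[] = "tag update";
    struct demo_msg out, in;
    /* The size passed to msgsnd/msgrcv excludes the leading mtype member. */
    const size_t body = sizeof(struct demo_msg) - sizeof(long);
    int ok = -1;

    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (qid == -1)
        return -1;

    memset(&out, 0, sizeof out);
    memset(&in, 0, sizeof in);
    out.mtype = 1;
    out.handle = 42;
    assert(sizeof text <= sizeof out.payload);
    memcpy(out.payload, text, sizeof text);

    /* The whole structure goes in with one call and comes out with one call. */
    if (msgsnd(qid, &out, body, IPC_NOWAIT) == 0 &&
        msgrcv(qid, &in, body, 1, IPC_NOWAIT) == (ssize_t)body &&
        memcmp(&out, &in, sizeof out) == 0)
        ok = 0;

    /* The queue outlives the process unless it is removed explicitly. */
    if (msgctl(qid, IPC_RMID, NULL) == -1)
        ok = -1;
    return ok;
}
```

Note the explicit msgctl(..., IPC_RMID, ...) at the end: if the process dies before reaching it, the queue stays behind.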
Cons
- Not File Based - The first rule in *nix is that everything is a file. Well, everything except message queues. There are very powerful mechanisms (select, poll) that work with file descriptors but cannot be used on message queues. This is forcing me to "work around" some problems that would normally be quite easy to deal with if it were a socket or a serial port.
- Networking - The message queue's message passing structure is not easily adapted to passing information over a network. With the right error handling routines, any of the file-based IPC mechanisms could easily be converted to work over a network. For binary data the endianness will have to be handled as well.
- Resource Deallocation - Message queues will live on the system whether any process is using them or not. If the program that is tasked with deleting the queue crashes, the queue will live forever (or at least until the user runs ipcrm). The kernel resources for sockets and pipes will be freed once all the processes that have them open are gone.
- Limited size - Everything has limits, but the message queue is quite finite, and since it is shared, a wayward module can ruin it for everyone. Sockets and pipes would be unique to each module (client), so if a wayward module filled up all its buffers it would only be hurting itself. I guess I could have multiple message queues that did the same thing, but then I am losing what few benefits the queue has in the first place.
- Cleanup - If a module dies and other modules are still trying to send it messages on the queue, the messages will not be removed. I was dreaming up some pretty elaborate ways to have the OpenDAX server clean up these messages, but with other forms of communication the kernel will handle all this for me.
- Central Control - With the message queue the system cannot enforce any kind of central control. This really isn't that big of a deal since we will be putting quite a bit of trust in the modules anyway. With sockets or pipes the modules will have to send all their information to the server and the server can then enforce any rules it needs. I guess there really is nothing stopping modules from talking to each other but they'd have to do it outside of the libdax library routines.
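To make the file descriptor and cleanup points concrete, here is a rough sketch using socketpair() as a stand-in for a server/module connection. It shows poll() timing out when nothing has been sent, reporting data when there is some, and reporting end-of-file once the other end is closed, which is what the kernel does automatically when a process dies. The names wait_readable() and socketpair_demo() are invented for this example and are not part of OpenDAX.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable (data or end-of-file), 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    assert(fd >= 0);
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n = poll(&pfd, 1, timeout_ms);
    if (n <= 0)
        return n;
    return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}

/* sv[0] plays the server, sv[1] plays a module.
 * Returns 0 if every step behaves as described, -1 otherwise. */
int socketpair_demo(void)
{
    int sv[2];
    char buf[8];
    int ok = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* Nothing sent yet: poll() times out instead of blocking forever. */
    if (wait_readable(sv[0], 10) != 0)
        goto done;

    /* The "module" sends four bytes; the "server" is told via poll(). */
    if (write(sv[1], "ping", 4) != 4)
        goto done;
    if (wait_readable(sv[0], 100) != 1)
        goto done;
    if (read(sv[0], buf, sizeof buf) != 4 || memcmp(buf, "ping", 4) != 0)
        goto done;

    /* The "module" goes away: its end is closed and the server reads EOF. */
    close(sv[1]);
    sv[1] = -1;
    if (wait_readable(sv[0], 100) != 1)
        goto done;
    if (read(sv[0], buf, sizeof buf) != 0)
        goto done;

    ok = 0;
done:
    close(sv[0]);
    if (sv[1] != -1)
        close(sv[1]);
    return ok;
}
```

With one descriptor per module, the same poll() call could watch an array of them, so a dead or stalled module shows up as an event on its own descriptor rather than as stale messages on a shared queue.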
I doubt that this list is exhaustive but it's what I've come up with so far.
Okay, if I decide to scrap the message queue and replace it with some other form of IPC, which way do I go? I don't know that I understand all the other methods well enough to discuss the differences. If I'm gonna change I want to have at least three things.
- File descriptor type IPC. I want to be able to use select() and poll().
- FAST! At least the local modules need to be seriously fast. The networked modules will go however fast they go.
- Easily networked. This makes me lean toward some kind of socket interface, probably UNIX domain sockets. Then it's a simple matter to open a different socket to a remote machine. Except for that pesky endianness issue.
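One common way to deal with the endianness issue is to convert every multi-byte field to network byte order (big-endian) before it goes on the wire, using htonl() and ntohl(). The sketch below assumes a made-up eight-byte header with a size and a command field; struct demo_hdr, HDR_SIZE, hdr_encode() and hdr_decode() are hypothetical names, not the OpenDAX wire format.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define HDR_SIZE 8  /* two 32-bit fields; hypothetical layout */

/* Hypothetical message header in host byte order. */
struct demo_hdr {
    uint32_t size;     /* total message length in bytes */
    uint32_t command;  /* what the sender is asking for */
};

/* Write the header into buf in network byte order (big-endian). */
void hdr_encode(const struct demo_hdr *h, unsigned char buf[HDR_SIZE])
{
    assert(h != NULL && buf != NULL);
    uint32_t size = htonl(h->size);
    uint32_t command = htonl(h->command);
    memcpy(buf, &size, sizeof size);
    memcpy(buf + 4, &command, sizeof command);
}

/* Read a network byte order header from buf back into host order. */
void hdr_decode(const unsigned char buf[HDR_SIZE], struct demo_hdr *h)
{
    assert(h != NULL && buf != NULL);
    uint32_t size, command;
    memcpy(&size, buf, sizeof size);
    memcpy(&command, buf + 4, sizeof command);
    h->size = ntohl(size);
    h->command = ntohl(command);
}
```

Copying field by field into a byte buffer, rather than writing the struct directly, also sidesteps differences in struct padding between machines. The cost is a conversion on every message, including local ones, unless local modules are treated as a special case.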
This is all starting to resemble the way that other, similar systems work (Postfix or MySQL, for example). Using some kind of underlying socket architecture would make redundancy easier, make networking easier, and make the system more robust. One thing that might be a little different is that most of these systems tend to send a lot of text data. (This is also a *nix thing.) OpenDAX will be more binary in nature, which will be fine on the local machine but may cause problems once we start networking. The biggest drawback is that I have a few thousand lines of code to rewrite. I don't mind if it saves me time in the long run. Comments are welcome.