I am working on a simple client-server program in C in which multiple clients connect to a single server.
Clients submit operations to the server, and the server processes these requests. The operations may be expensive or long-running, so ideally I would like a thread pool on the server that processes requests concurrently rather than blocking the main thread.
In addition, I thought that using poll (I can't use epoll because I need to be POSIX compliant) might perform better than creating a new thread per socket connection (this stems from the C10K problem: https://en.wikipedia.org/wiki/C10k_problem).
So in theory the server might look like the following pseudocode:

    int main()
    {
        // Pretend these are initialized in some manner
        ThreadPool thread_pool;
        Socket server_listening_socket;
        PolledFileDescriptors list_of_polled_fds;

        // The first pollfd will be the listening socket which looks for read events on it
        list_of_polled_fds[0].fd = server_listening_socket;
        list_of_polled_fds[0].events = POLLIN;

        while (true)
        {
            // Call poll on our list of file descriptors with unlimited timeout (-1)
            poll(&list_of_polled_fds, number_of_fds, -1);

            for (int i = 0; i < number_of_fds; i++)
            {
                // We received a read event on this file descriptor
                if (list_of_polled_fds[i].revents & POLLIN)
                {
                    // The listening socket has an event (meaning a new connection was created)
                    if (i == 0)
                    {
                        Socket client_socket = accept(server_listening_socket, NULL, NULL);
                        AddClientConnectionToListOfPollFds(&list_of_polled_fds, client_socket);
                    }
                    // A connected client has an event (data was sent over the socket)
                    else
                    {
                        ThreadPoolTask task = {
                            .argument = list_of_polled_fds[i].fd, // client connected file descriptor
                            .function = SomeFunctionToReadDataFromSocketAndProcessIt
                        };
                        AddTaskToThreadPool(&thread_pool, &task);
                    }
                }
            }
        }
        return 0;
    }

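For concreteness, here is a minimal sketch of what the list of polled descriptors and `AddClientConnectionToListOfPollFds` from the pseudocode could look like. The struct layout, the `int` descriptor type, the growth policy, and the `RemoveFromListOfPollFds` helper are all assumptions made for illustration, not part of the design above.

```c
#include <poll.h>
#include <stdlib.h>

/* Hypothetical growable array of pollfd entries; index 0 is the listening socket. */
typedef struct {
    struct pollfd *fds;
    nfds_t count;
    nfds_t capacity;
} PolledFileDescriptors;

/* Appends client_fd with POLLIN interest. Returns 0 on success, -1 if out of memory. */
static int AddClientConnectionToListOfPollFds(PolledFileDescriptors *list, int client_fd)
{
    if (list->count == list->capacity) {
        nfds_t new_cap = list->capacity ? list->capacity * 2 : 8;
        struct pollfd *grown = realloc(list->fds, new_cap * sizeof(struct pollfd));
        if (grown == NULL)
            return -1;
        list->fds = grown;
        list->capacity = new_cap;
    }
    list->fds[list->count].fd = client_fd;
    list->fds[list->count].events = POLLIN;
    list->fds[list->count].revents = 0;
    list->count++;
    return 0;
}

/* Hypothetical helper: removes an entry by moving the last entry into its slot.
 * Order is not preserved, so it should not be used on index 0. */
static void RemoveFromListOfPollFds(PolledFileDescriptors *list, nfds_t index)
{
    list->fds[index] = list->fds[list->count - 1];
    list->count--;
}
```

With this shape, the call to poll would take `list.fds` and `list.count` rather than the address of the list itself.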
Now with this high level design I have a few concerns.
Single Message Causes Multiple Events
- Suppose the client tries to send the server a message of 10 bytes, but for some reason the bytes get split into 2 TCP packets.
- The first packet will come in on the client socket and this will cause poll to detect an event.
- It will then place this socket into a task on the thread pool which will read and process the data.
- The second packet then comes in and causes poll to do the same thing.
- Now I have 2 tasks in my thread pool that correspond to the same socket and for what should be the same "message".
How should I manage this? Should I just keep track of which sockets are currently being worked on in the thread pool and not add the same socket if a task exists?
If I guard the thread pool from adding the same socket twice, then that means if a single client sends 2 independent requests, I will not be able to process them in parallel. I will have to wait for the first message to finish and then process the next one.
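One way the "don't queue the same socket twice" guard could be sketched is to take the descriptor out of poll's view while a task owns it. This relies on the POSIX rule that poll ignores entries whose fd is negative. The helper names below are hypothetical, and the sketch only covers the bookkeeping, not the hand-off between threads.

```c
#include <assert.h>
#include <poll.h>

/* Park an entry so poll() skips it. Using -fd - 1 (rather than -fd)
 * keeps descriptor 0 representable: 0 becomes -1, 7 becomes -8. */
static void pollfd_park(struct pollfd *p)
{
    assert(p->fd >= 0);
    p->fd = -p->fd - 1;
}

/* Undo pollfd_park() once the worker has finished with the socket. */
static void pollfd_unpark(struct pollfd *p)
{
    assert(p->fd < 0);
    p->fd = -p->fd - 1;
}

/* Nonzero while a task owns the descriptor. */
static int pollfd_is_parked(const struct pollfd *p)
{
    return p->fd < 0;
}

/* The real descriptor number, whether or not the entry is parked. */
static int pollfd_real_fd(const struct pollfd *p)
{
    return p->fd < 0 ? -p->fd - 1 : p->fd;
}
```

In the loop above, the main thread would call `pollfd_park` just before `AddTaskToThreadPool`. A worker should not modify the array while the main thread is blocked in poll, so unparking would presumably need some wake-up mechanism (for example a pipe that is also in the poll set); that part is an assumption and is not shown.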
What is a good mechanism for detecting whether multiple poll events belong to a single client message, so that I can avoid adding redundant tasks to my thread pool while still processing multiple requests from the same client simultaneously?
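Since TCP is a byte stream, poll events do not map onto messages; one common approach is to frame messages explicitly and buffer bytes per connection until a whole frame has arrived. The sketch below assumes a hypothetical wire format (a 2-byte big-endian length followed by the payload) and a hypothetical `Connection` type; neither comes from the design above.

```c
#include <stddef.h>
#include <string.h>

#define CONN_BUF_CAP 4096

/* Hypothetical per-connection state: bytes received but not yet consumed. */
typedef struct {
    unsigned char buf[CONN_BUF_CAP];
    size_t used;
} Connection;

/* Appends freshly received bytes. Returns 0, or -1 if they do not fit. */
static int conn_feed(Connection *c, const void *data, size_t len)
{
    if (len > CONN_BUF_CAP - c->used)
        return -1;
    memcpy(c->buf + c->used, data, len);
    c->used += len;
    return 0;
}

/* Assumed frame: 2-byte big-endian payload length, then the payload.
 * Copies one complete payload into out and returns its length.
 * Returns -1 if no complete frame is buffered yet,
 * or -2 if the payload is larger than out_cap. */
static long conn_next_message(Connection *c, unsigned char *out, size_t out_cap)
{
    if (c->used < 2)
        return -1;
    size_t payload_len = ((size_t)c->buf[0] << 8) | c->buf[1];
    if (c->used < 2 + payload_len)
        return -1;
    if (payload_len > out_cap)
        return -2;
    memcpy(out, c->buf + 2, payload_len);
    /* Shift any bytes belonging to the next frame to the front. */
    size_t frame_len = 2 + payload_len;
    memmove(c->buf, c->buf + frame_len, c->used - frame_len);
    c->used -= frame_len;
    return (long)payload_len;
}
```

With something like this, the thread doing the reads would only hand a task to the thread pool once `conn_next_message` returns a complete payload, so a message split across two packets yields one task, and two back-to-back messages yield two independent tasks. A real implementation would also need to reject declared lengths larger than the buffer can ever hold.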
recv() returned EAGAIN or EWOULDBLOCK. epoll is a more convenient mechanism for such a scenario. recv sets the error to EWOULDBLOCK or EAGAIN, then I place the file descriptor into the poll list and try again to receive the full message?
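The "read until recv reports EAGAIN or EWOULDBLOCK, then go back to poll" idea could look roughly like the following. It assumes the socket has been put into non-blocking mode with `O_NONBLOCK`; `set_nonblocking` and `drain_socket` are hypothetical names used only for this sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Puts fd into non-blocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Reads whatever is currently available on a non-blocking socket, up to cap bytes.
 * Returns the number of bytes read (possibly 0), or -1 on a real error.
 * *peer_closed is set to 1 if the peer shut down its side, else 0.
 * If the return value equals cap, more data may be pending and the
 * caller should call again after consuming the buffer. */
static ssize_t drain_socket(int fd, unsigned char *buf, size_t cap, int *peer_closed)
{
    size_t total = 0;
    *peer_closed = 0;
    while (total < cap) {
        ssize_t n = recv(fd, buf + total, cap - total, 0);
        if (n > 0) {
            total += (size_t)n;
            continue;
        }
        if (n == 0) {               /* orderly shutdown by the peer */
            *peer_closed = 1;
            break;
        }
        if (errno == EINTR)         /* interrupted by a signal: retry */
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                  /* nothing more for now: return to poll */
        return -1;
    }
    return (ssize_t)total;
}
```

The bytes returned here may still be only part of a message, so they would be appended to per-connection state and the descriptor left in the poll list until the rest arrives.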