Linux poll with a thread pool and multiple events

I am working on a simple client-server program in C in which multiple clients will be connected to a single server.

Clients will submit operations/actions to the server, and the server will process these requests. These operations may be expensive and/or long-running, so ideally I would like a thread pool on the server that can process requests concurrently rather than block the main thread.

In addition, I thought that using poll (I can't use epoll because I need to be POSIX compliant) might perform better than creating a new thread per socket connection (this stems from the C10K problem: https://en.wikipedia.org/wiki/C10k_problem).

In theory, the server might look like the following pseudocode:

int main()
{
  // Pretend these are initialized in some manner
  ThreadPool thread_pool;
  Socket server_listening_socket;
  PolledFileDescriptors list_of_polled_fds;

  // The first pollfd will be the listening socket which looks for read events on it
  list_of_polled_fds[0].fd = server_listening_socket;
  list_of_polled_fds[0].events = POLLIN;

  while (true)
  {
    // Call poll on our list of file descriptors with unlimited timeout (-1)
    poll(list_of_polled_fds, number_of_fds, -1);
    for (int i = 0; i < number_of_fds; i++)
    {
      // We received a read event on this file descriptor
      if (list_of_polled_fds[i].revents & POLLIN)
      {
 
        // The listening socket has an event (meaning a new connection was created) 
        if (i == 0)
        {
          Socket client_socket = accept(server_listening_socket, NULL, NULL);
          AddClientConnectionToListOfPollFds(&list_of_polled_fds, client_socket);
        }

        // A connected client has an event (data was sent over the socket)
        else
        {  
          ThreadPoolTask task = {
            .argument = list_of_polled_fds[i].fd, // client connected file descriptor
            .function = SomeFunctionToReadDataFromSocketAndProcessIt
          };
          // Assumes the pool copies the task: this struct goes out of scope at the end of the block
          AddTaskToThreadPool(&thread_pool, &task);
        }
      }
    }
  }

  return 0;
}

Now with this high level design I have a few concerns.

Single Message Causes Multiple Events

Suppose the client tries to send the server a message of 10 bytes, but for some reason the bytes get split into 2 TCP packets.
The first packet will come in on the client socket and this will cause poll to detect an event.
It will then place this socket into a task on the thread pool which will read and process the data.
The second packet then comes in and causes poll to do the same thing.
Now I have 2 tasks in my thread pool that correspond to the same socket and for what should be the same "message".

How should I manage this? Should I just keep track of which sockets are currently being worked on in the thread pool and not add the same socket if a task exists?

If I guard the thread pool from adding the same socket twice, then that means if a single client sends 2 independent requests, I will not be able to process them in parallel. I will have to wait for the first message to finish and then process the next one.

What is a good mechanism for detecting whether multiple poll events belong to a single client message, so that I can avoid adding redundant tasks to my thread pool while still processing multiple requests from the same client simultaneously?
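One way to sketch the "keep track of which sockets are being worked on" idea without a separate lookup table is to disable the socket's pollfd entry while a worker owns it. POSIX poll ignores any entry whose fd is negative and reports revents as 0 for it, so flipping the sign of the descriptor is enough. The helper names below (pollfd_suspend, pollfd_resume, pollfd_is_suspended) are hypothetical and not part of any library; this is a minimal sketch, not a complete design.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical helper: make poll() skip this entry while a worker thread
 * owns the socket. POSIX poll() ignores entries whose fd is negative and
 * sets their revents to 0. */
static void pollfd_suspend(struct pollfd *entry)
{
    assert(entry != NULL);
    if (entry->fd >= 0)
        entry->fd = -entry->fd - 1;   /* maps 0 -> -1, 5 -> -6 */
    entry->revents = 0;
}

/* Hypothetical helper: restore the original descriptor so that poll()
 * reports events for it again. */
static void pollfd_resume(struct pollfd *entry)
{
    assert(entry != NULL);
    if (entry->fd < 0)
        entry->fd = -entry->fd - 1;   /* inverse of the mapping above */
}

/* Returns nonzero while the entry is disabled. */
static int pollfd_is_suspended(const struct pollfd *entry)
{
    assert(entry != NULL);
    return entry->fd < 0;
}
```

In the loop above, the main thread would call pollfd_suspend on list_of_polled_fds[i] just before handing the socket to the pool. Resuming is probably safest in the main thread as well: modifying the array from a worker while poll is blocked on it would be a data race, so a worker could instead report completion through a pipe that is itself part of the polled set. That wake-up channel is an assumption of this sketch and is not shown.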

edited Oct 2, 2023 at 20:34

asked Oct 2, 2023 at 20:22

nick2225


Assign one thread for each event source (fd), not for each event.
– dimich, Oct 2, 2023 at 20:42

@dimich What about the large-scale scenario where there are 10,000 or more connections? It sounds like I'll need 10,000 threads open at the same time, which will probably cause performance problems.
– nick2225, Oct 2, 2023 at 20:48

How does that differ from acquiring one thread per event? The number of threads is limited by your pool capacity. You should use non-blocking sockets and put a descriptor into the poll list only if recv() returned EAGAIN or EWOULDBLOCK. epoll is a more convenient mechanism for such a scenario.
– dimich, Oct 2, 2023 at 20:57

stackoverflow.com/questions/17593699/… <- C10k @ 1 thread per connection on Linux reported as feasible 10 years ago
– teapot418, Oct 2, 2023 at 20:57

@dimich Oh, that is an interesting approach. Part of my data framing is that I send 4 bytes which represent the upcoming message size in bytes, so I just receive 4 bytes to get that size. Are you suggesting that in the main thread I set the sockets as non-blocking and try to recv all the data for a single "message", and only if I can receive the full message do I place it into my thread pool? And if I cannot receive the full message (i.e. recv sets errno to EWOULDBLOCK or EAGAIN), I place the file descriptor into the poll list and try again to receive the full message?
– nick2225, Oct 2, 2023 at 21:09
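
The approach discussed in these comments (non-blocking sockets plus the 4-byte length prefix) could be sketched roughly as follows. Each connection gets a buffer that accumulates bytes until a whole message is present, so a message split across two TCP segments only becomes a task once it is complete. The names conn_buf, set_nonblocking, conn_fill, conn_next_frame and the MAX_FRAME limit are hypothetical, and the sketch assumes the length prefix is sent in network byte order, which the question does not state.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>   /* ntohl, htonl */

#define MAX_FRAME 4096   /* assumed upper bound on one message body */

/* Hypothetical per-connection state: bytes received but not yet consumed. */
struct conn_buf {
    unsigned char data[4 + MAX_FRAME];
    size_t used;
};

/* Put a descriptor into non-blocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Read whatever is available right now without blocking.
 * Returns 1 if the connection is still open, 0 if the peer closed it,
 * and -1 on a real error. */
static int conn_fill(int fd, struct conn_buf *c)
{
    assert(c != NULL);
    while (c->used < sizeof c->data) {
        ssize_t n = recv(fd, c->data + c->used, sizeof c->data - c->used, 0);
        if (n > 0) { c->used += (size_t)n; continue; }
        if (n == 0) return 0;
        if (errno == EINTR) continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK) return 1;
        return -1;
    }
    return 1;   /* buffer full: extract frames before reading again */
}

/* If a complete frame is buffered, copy its body into out (which must hold
 * MAX_FRAME bytes), remove it from the buffer, and return the body length.
 * Returns -1 if the frame is still incomplete and -2 if the declared
 * length exceeds MAX_FRAME. */
static long conn_next_frame(struct conn_buf *c, unsigned char *out)
{
    uint32_t len_be, len;
    assert(c != NULL && out != NULL);
    if (c->used < 4) return -1;
    memcpy(&len_be, c->data, 4);
    len = ntohl(len_be);              /* assumes a big-endian length prefix */
    if (len > MAX_FRAME) return -2;
    if (c->used < 4 + (size_t)len) return -1;
    memcpy(out, c->data + 4, len);
    memmove(c->data, c->data + 4 + len, c->used - 4 - len);
    c->used -= 4 + (size_t)len;
    return (long)len;
}
```

On a POLLIN event the main thread would call conn_fill once and then call conn_next_frame in a loop, submitting every complete body to the thread pool as its own task. A half-received message simply stays in the buffer until the next event, while two complete requests that arrive together become two tasks that can run in parallel. Whether the copying done here is acceptable depends on message sizes and load.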
