Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic socket buffer allocation #200

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

turekt
Copy link

@turekt turekt commented Sep 30, 2022

Hi,

the motivation behind this PR has several elements:

Netlink is not a reliable protocol.  It tries its best to deliver a message to its destination(s), but may drop messages when an out-of-memory condition or other error occurs.  For reliable transfer the sender can  request  an
acknowledgement  from  the  receiver  by setting the NLM_F_ACK flag.  An acknowledgment is an NLMSG_ERROR packet with the error field set to 0.  The application must generate acknowledgements for received messages itself.  The
kernel tries to send an NLMSG_ERROR message for every failed packet.  A user process should follow this convention too.

However, reliable transmissions from kernel to user are impossible in any case.  The kernel can't send a netlink message if the socket buffer is full: the message will be dropped and the kernel and the user-space process  will
no longer have the same view of kernel state.  It is up to the application to detect when this happens (via the ENOBUFS error returned by recvmsg(2)) and resynchronize.

With this in mind, since it was discussed to try and address the ENOBUFS issue in the netlink lib (https://github.com/google/nftables/pull/191/files/0d4369aacbd8b10bc86765a69851d0d01a821fd8#r982856106), I am introducing a PR for discussion which covers:

  • introduces a ReceiveBuffer (ExecuteBuffer) func which receives a BufferAllocationFunc that allocates buffers for the underlying socket, passed by the user (covering issue netlink: add a version of Conn.Receive that doesn't allocate its own buffers #178)
  • introduces a default BufferAllocationFunc (or default allocation strategy) which is similar to the previous peek-loop-allocate but in case ENOBUFS happens, it automatically resizes the socket read and write buffers
  • exposed the ReadBuffer and WriteBuffer methods via bufferGetter interface for easier calculation of buffer size by user applications
  • already implemented functions were minimally changed to retain backwards compatibility

In the end, I am not sure if this is the best approach since applications could still catch the ENOBUFS error themselves, resize the read and write buffers with SetReadBuffer or SetWriteBuffer and then resend the message making this PR unnecessary? We can change the PR as per your feedback.

Let me know what you think.

Introduces resizing of buffers on ENOBUFS | Introduces BufferAllocationFunc for buffer allocation passed by user applications | Exposes ReadBuffer and WriteBuffer
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant