I agree that io_uring should be the default, but there should be a fallback to epoll if our requirements aren't met. Otherwise, your plan looks sound!
I think that, no matter what, you should submit a WiP pull request before August 12, which is your GSoC mid-term. We can keep working on it after...
Introduction
My name is Henrique de Carvalho and I will be working on pgagroal's new I/O layer as part of GSoC 24 -- proposal.
The project involves replacing the I/O Layer of pgagroal, which is currently dependent on the libev library, with pgagroal's own implementation.
The motivation behind this project is that libev is no longer maintained. Therefore, pgagroal needs an efficient (i.e., maintainable, reliable, fast, lightweight, secure and scalable) implementation of an event loop that can be directly maintained by the pgagroal community.
Currently, pgagroal depends on libev to (a) monitor incoming read/write requests from its connections in a non-blocking fashion; (b) launch timer events; and (c) handle signals.
Therefore, I will be writing pgagroal's own I/O layer with support for Linux and FreeBSD. For Linux, I will be using io_uring (introduced in Linux kernel 5.1). For FreeBSD, I will be using kqueue (introduced in FreeBSD 4.1). If, for any reason, the performance requirements aren't met with io_uring, epoll should be used as the fallback on Linux; epoll is Linux-specific, so it is not an option on FreeBSD.
The goal of this discussion is to outline the path to the implementations in both Linux and FreeBSD and to be transparent about the ongoing implementation, allowing for discussions of ideas.
This should be an open space for discussions about this subject, so I encourage you to share your ideas for the project in the comments below. If you would like to contact me directly instead, feel free to email me at decarv.henrique+pgagroal@gmail.com.
I will be updating this document regularly (at least once a week), with developments on the project.
Final Report
The final report can be found here.
Planning
Weekly Updates:
Designed the strategy for performance testing using the Linux perf tool (I will be constantly updating this and this script).
Experimenting with io_uring.
Currently looking into the PostgreSQL protocol.
May 15 - May 28 : Start on the I/O Foundation by designing and implementing a simple io_uring loop for core I/O operations, marking the initial replacement of libev functionalities. Refine the io_uring integration for I/O other than the event loop, and try to identify room for more advanced io_uring techniques.
Weekly Updates:
Currently learning io_uring, and designing and implementing the I/O loop, inspired by this code.
May 30 Update: The I/O loop is being implemented here. I have been using more advanced techniques such as multishot accept / receive and ring buffers. The ring buffers are not working correctly yet, but it is a matter of debugging. This will be finished by the end of the week.
May 29 - June 11 : Finish the I/O Foundation by integrating advanced networking io_uring features.
Weekly Updates:
June 10: Implemented the I/O loop for basic networking calls with advanced io_uring features, such as multishot accept / receive and ring buffers. Implemented support for signals. Implemented some tests to verify functionality. Stress tests are missing. I will make adjustments to the I/O loop as I integrate the loop with pgagroal.
June 12 - June 25 : Design and implement functional tests to ensure functionality and stability. Fix potential issues with the code found during tests. If io_uring does not meet the performance requirements for pgagroal, fall back to an epoll implementation.
Weekly Updates:
June 18: The basic functionality is already implemented. I have started replacing libev in the main code. I am focusing on replacing libev in pgagroal's code right now so I can implement epoll later.
June 25: I decided to implement epoll before porting to pgagroal, so epoll is now implemented and working. I still have to design and implement tests, even if simple ones, to cover basic functionality.
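For readers unfamiliar with the epoll fallback mentioned above, here is a minimal sketch of a readiness loop built on it. It is not pgagroal's code: the names io_watcher, loop_add, loop_run_once and count_bytes are hypothetical and exist only for illustration. It is Linux-only, since epoll is a Linux API.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical callback type: invoked when fd becomes readable. */
typedef void (*io_cb)(int fd, void* data);

/* Hypothetical watcher: one fd, one callback, one user pointer. */
struct io_watcher
{
   int fd;
   io_cb cb;
   void* data;
};

/* Register a watcher for read readiness. Returns 0 on success, -1 on error. */
static int
loop_add(int epfd, struct io_watcher* w)
{
   struct epoll_event ev = {0};

   assert(w != NULL);
   ev.events = EPOLLIN;
   ev.data.ptr = w; /* epoll hands this pointer back when the fd is ready */
   return epoll_ctl(epfd, EPOLL_CTL_ADD, w->fd, &ev);
}

/* Wait once (up to timeout_ms) and dispatch every ready watcher.
 * Returns the number of watchers dispatched, or -1 if epoll_wait failed. */
static int
loop_run_once(int epfd, int timeout_ms)
{
   struct epoll_event events[16];
   int n = epoll_wait(epfd, events, 16, timeout_ms);

   for (int i = 0; i < n; i++)
   {
      struct io_watcher* w = events[i].data.ptr;
      w->cb(w->fd, w->data);
   }
   return n;
}

/* Example callback: read up to 64 bytes and add the count to *data. */
static void
count_bytes(int fd, void* data)
{
   char buf[64];
   ssize_t r = read(fd, buf, sizeof(buf));

   if (r > 0)
   {
      *(size_t*)data += (size_t)r;
   }
}
```

A caller would create the epoll instance with epoll_create1(0), register one watcher per socket with loop_add, and call loop_run_once repeatedly. Unlike io_uring, epoll only reports readiness, so the callback still has to issue the read itself.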
July 10 - September 03 : Port the custom libev to a pgagroal branch.
Weekly Updates:
July 12: The event loop functionality has been successfully implemented using both io_uring and epoll. The custom version of libev, now compiled and tested, works effectively on Rocky Linux 8 and 9. To facilitate this process, Dockerfiles were developed. During the integration of libev into pgagroal's codebase, I identified and implemented enhancements to pgagroal's custom event loop (pgagroal_ev). I expect to finish the integration by the end of this period (July 23). I have to set aside some time to document the code and tests, though.
July 23: Integrating the code has been challenging and, due to how things work with io_uring, part of the current pgagroal code needs to be modified (mainly the function signatures in the pipelines will have to be redesigned). Therefore, I still have to change a couple things before successfully completing this step and diving into FreeBSD's implementation. Nothing too big. I am currently working on this and I will update this forum this week.
July 30: Major refactoring of the ev code because the integration was essentially broken. My implementation had no good way to keep the loop abstraction separate from the watcher abstraction (I had only watchers). This led to difficulties while integrating the code. I guess I should have tried this earlier, during the custom libev implementation. Most of the work is done now for io_uring (the epoll changes will be simpler), and the integration is going smoothly now because the custom libev contains the same abstractions, including the same names: ev_loop, ev_io, ev_periodic, ev_signal, and so on. I will update this document later this week.
I decided to keep both implementations in different branches. The new branch is this one.
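To illustrate the loop/watcher separation described in the July 30 update, here is a minimal sketch. It is not the project's code: the sk_ names are hypothetical, and portable poll(2) stands in for io_uring or epoll so the example stays short. The point is only that the loop owns the set of registered watchers and the running state, while each watcher carries just its fd, callback and user data.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define SK_MAX_WATCHERS 8

struct sk_loop;

/* A watcher says what to observe and what to do. It holds no loop state. */
struct sk_io
{
   int fd;
   void (*cb)(struct sk_loop* loop, struct sk_io* w);
   void* data;
};

/* The loop owns the registered watchers and the running flag. */
struct sk_loop
{
   struct sk_io* watchers[SK_MAX_WATCHERS];
   size_t count;
   int running;
};

static void
sk_loop_init(struct sk_loop* loop)
{
   loop->count = 0;
   loop->running = 0;
}

/* Attach a watcher to a loop. Returns 0 on success, -1 if the loop is full. */
static int
sk_io_start(struct sk_loop* loop, struct sk_io* w)
{
   assert(loop != NULL && w != NULL);
   if (loop->count == SK_MAX_WATCHERS)
   {
      return -1;
   }
   loop->watchers[loop->count++] = w;
   return 0;
}

/* Ask the loop to stop after the current callback returns. */
static void
sk_loop_break(struct sk_loop* loop)
{
   loop->running = 0;
}

/* Run until a callback calls sk_loop_break(). Returns 0, or -1 if poll()
 * fails (this sketch does not retry on EINTR). */
static int
sk_loop_run(struct sk_loop* loop)
{
   struct pollfd pfds[SK_MAX_WATCHERS];

   loop->running = 1;
   while (loop->running)
   {
      size_t n = loop->count;

      for (size_t i = 0; i < n; i++)
      {
         pfds[i].fd = loop->watchers[i]->fd;
         pfds[i].events = POLLIN;
         pfds[i].revents = 0;
      }
      if (poll(pfds, (nfds_t)n, -1) < 0)
      {
         return -1;
      }
      for (size_t i = 0; i < n && loop->running; i++)
      {
         if (pfds[i].revents & POLLIN)
         {
            loop->watchers[i]->cb(loop, loop->watchers[i]);
         }
      }
   }
   return 0;
}

/* Example callback: store one byte in *data, then stop the loop. */
static void
on_readable(struct sk_loop* loop, struct sk_io* w)
{
   char c;

   if (read(w->fd, &c, 1) == 1)
   {
      *(char*)w->data = c;
   }
   sk_loop_break(loop);
}
```

Because callbacks receive the loop as an argument, a watcher can stop the loop or start other watchers without any global state, which is the same shape libev exposes.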
Aug 4: The branch compiles fine. But there are accept / receive issues that I am still debugging. I intend to fix this and update this discussion as soon as possible.
Aug 7: I cannot focus on FreeBSD as long as this does not work fine, so I am giving myself more time to debug. I don't know what is happening. The client reports that the server closed the connection unexpectedly while communicating with the worker, as though many messages were being exchanged. I need to capture the exchanged packets and investigate further to understand where the event loop is failing.
Aug 13: Submitted WiP PR. I'm leaving here a reminder that GitHub CI also needs to be changed.
Aug 27: What is currently under work: (a) implementing configuration option for backend selection; (b) porting the work to other pipelines; (c) CI update; (d) performance benchmarks for the new I/O layer (pgbench).
September 04 - October 1st : Begin working on the I/O foundation for FreeBSD, focusing on integrating and adapting to FreeBSD's system specifics. Finish the I/O Foundation. Search and experiment with advanced networking patterns and usage for kqueue that can benefit the implementation. Design and implement functional tests to ensure functionality and stability. Fix potential issues with the code found during tests.
Sep 8: Development updates: (a) io_uring and epoll working everywhere in the code; (b) pgbench not throwing any errors; (c) ev backend is dynamically set on startup, according to the ev_backend option in pgagroal.conf. TODO: (a) performance evaluation comparison using pgbench; (b) Begin working on FreeBSD.
September 18 - October 01 : Refine performance strategy and tests. Identify code bottlenecks, define a strategy to optimize the code based on the findings, and implement it.
Oct 1: Fixed some problems with the current ev implementation for Linux. Still a bunch of TODOs for Linux, but the basic implementation is there. Still have to test I/O in FreeBSD, but the implementation is also there.
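The startup selection described in item (c) could be sketched roughly as below. This is an assumption-laden illustration, not pgagroal's implementation: the enum, parse_backend and resolve_backend are hypothetical names, and the value strings "io_uring", "epoll" and "kqueue" are assumed to mirror the backend names (only auto is mentioned in this thread). The preference order follows the plan stated here: io_uring first, epoll as the fallback on Linux, kqueue on FreeBSD.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical backend identifiers. */
enum backend
{
   BACKEND_INVALID = -1,
   BACKEND_AUTO,
   BACKEND_IO_URING,
   BACKEND_EPOLL,
   BACKEND_KQUEUE
};

/* Map a configuration string to a backend (value names are assumptions). */
static enum backend
parse_backend(const char* value)
{
   assert(value != NULL);
   if (!strcmp(value, "auto"))
   {
      return BACKEND_AUTO;
   }
   if (!strcmp(value, "io_uring"))
   {
      return BACKEND_IO_URING;
   }
   if (!strcmp(value, "epoll"))
   {
      return BACKEND_EPOLL;
   }
   if (!strcmp(value, "kqueue"))
   {
      return BACKEND_KQUEUE;
   }
   return BACKEND_INVALID;
}

/* Pick the backend to use: honor an explicit request when that backend is
 * available, otherwise fall back in the order io_uring, epoll, kqueue.
 * Returns BACKEND_INVALID when nothing is available. */
static enum backend
resolve_backend(enum backend requested, bool has_io_uring, bool has_epoll,
                bool has_kqueue)
{
   if (requested == BACKEND_IO_URING && has_io_uring)
   {
      return BACKEND_IO_URING;
   }
   if (requested == BACKEND_EPOLL && has_epoll)
   {
      return BACKEND_EPOLL;
   }
   if (requested == BACKEND_KQUEUE && has_kqueue)
   {
      return BACKEND_KQUEUE;
   }
   /* auto, or the requested backend is unavailable on this system */
   if (has_io_uring)
   {
      return BACKEND_IO_URING;
   }
   if (has_epoll)
   {
      return BACKEND_EPOLL;
   }
   if (has_kqueue)
   {
      return BACKEND_KQUEUE;
   }
   return BACKEND_INVALID;
}
```

Whether an unavailable explicit request should silently fall back (as in this sketch) or be rejected with an error is a design choice the sketch does not settle.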
October 02 - October 22 : Final testing of all implementations for both Linux and FreeBSD. Make sure everything is running fine.
Oct 14: The implementation for Linux and FreeBSD is working. CI has been modified and is working for both Linux and Darwin. Two bugs are still present: (a) blocking fds, as the application attempts to read before the fd is ready; (b) closed connections are not handled properly and the application hangs. These will be fixed shortly. Comments from the thread have already been addressed (related to the auto implementation for ev_backend).
Oct 17: Pushed the fixes mentioned above and the auto implementation. I intend to move the event backend configuration to pgagroal's configuration function so that it is easier to handle and to avoid the overhead of setting up the backend every time a connection is started. I'm also currently removing dead code and looking at the huge_pages implementation of the event backend. I am occasionally running pgbench to see if I can find any other bugs. I invite the community to do the same.
Oct 20: Experimented with replacing the I/O calls when io_uring is in use. I managed to get it working, but I still have to merge it into the current implementation and think of a better design for it. While this work is not merged, I will work today and tomorrow on finishing the config details and fixing the bug reported by Luca.
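The two bugs listed in the Oct 14 update (reading before the fd is ready, and closed connections not being detected) both come down to how the result of read(2) is interpreted. The sketch below shows one common way to tell the cases apart on a non-blocking fd. The names set_nonblocking, try_read and the read_status enum are hypothetical and not taken from pgagroal.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Possible outcomes of one read attempt. */
enum read_status
{
   READ_OK,        /* *nread bytes were read */
   READ_NOT_READY, /* no data yet: go back to the event loop and wait */
   READ_CLOSED,    /* peer closed the connection: stop watching the fd */
   READ_ERROR      /* any other failure, see errno */
};

/* Put fd in non-blocking mode. Returns 0 on success, -1 on error. */
static int
set_nonblocking(int fd)
{
   int flags = fcntl(fd, F_GETFL, 0);

   if (flags < 0)
   {
      return -1;
   }
   return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Try to read up to len bytes (len must be > 0) without blocking. */
static enum read_status
try_read(int fd, void* buf, size_t len, ssize_t* nread)
{
   ssize_t r;

   assert(buf != NULL && nread != NULL && len > 0);
   do
   {
      r = read(fd, buf, len);
   }
   while (r < 0 && errno == EINTR); /* retry if interrupted by a signal */

   *nread = r;
   if (r > 0)
   {
      return READ_OK;
   }
   if (r == 0)
   {
      return READ_CLOSED;
   }
   if (errno == EAGAIN || errno == EWOULDBLOCK)
   {
      return READ_NOT_READY;
   }
   return READ_ERROR;
}
```

With this split, READ_NOT_READY means the caller simply returns to the event loop, while READ_CLOSED means the watcher should be stopped and the connection cleaned up rather than polled again.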
Oct 23-28: Tried reproducing the issue reported by Luca here, without success. I have identified another bug that is probably related to what Luca reported (dangling processes), so I will focus on fixing that. I moved the main configuration to the configuration file.