Burying pdflush in the back yard - Jens Axboe's blog
Mar. 12th, 2009
03:43 pm - Burying pdflush in the back yard
So this week I began playing with an alternative approach to buffered writeback. Right now the pdflush threads take care of this, but that scheme has a number of annoying points that made me want to try something else. One is that writeout tends to be very lumpy, which is easily visible in vmstat. Another is that it doesn't play well with the request allocation scheme, since pdflush backs off when a queue becomes congested. And fairness between congested users and blocking users is... well, not there.
Enter bdi threads. The first step was moving the dirty inodes to some place where bdi threads could easily get at them. So instead of putting them on the super_block lists, we put them on the bdi lists of similar names. One upside of this change is that we no longer have to do a linear search for the bdi; we have it up front. The next step is forking a thread per bdi that does IO. My initial approach simply created a kernel thread when the bdi was registered, but I'm sure that lots of people will find that wasteful. So instead it now registers a forker thread on behalf of the default backing device (default_backing_dev_info), which takes care of creating the appropriate threads when someone calls bdi_start_writeback() on a bdi. It'll handle memory pressure conditions as well; find the details in the patch set.
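To make the two ideas above concrete (dirty inodes living on a per-bdi list, and the writeback thread being created lazily on first use rather than at registration), here is a rough userspace sketch. It is a toy model and not the code from the patch set: every toy_* name is made up for illustration, the "thread" is just a flag, and the flush runs inline instead of in a kernel thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all toy_* names are hypothetical, not kernel API. */
struct toy_inode {
    int ino;
    bool dirty;
    struct toy_inode *next;       /* link on the owning bdi's dirty list */
};

struct toy_bdi {
    const char *name;
    struct toy_inode *dirty_head; /* dirty inodes hang off the bdi itself */
    bool has_thread;              /* true once a flusher has been "forked" */
    int written;                  /* total inodes written back so far */
};

/* Queue an inode on its bdi; an already-dirty inode is not queued twice. */
void toy_mark_dirty(struct toy_bdi *bdi, struct toy_inode *inode)
{
    if (inode->dirty)
        return;
    inode->dirty = true;
    inode->next = bdi->dirty_head;
    bdi->dirty_head = inode;
}

/* Flusher body: drain this bdi's list. No search for the bdi is needed,
 * because the list is reached directly through the bdi. */
int toy_flush(struct toy_bdi *bdi)
{
    int n = 0;

    while (bdi->dirty_head) {
        struct toy_inode *inode = bdi->dirty_head;

        bdi->dirty_head = inode->next;
        inode->next = NULL;
        inode->dirty = false;     /* pretend the IO completed */
        n++;
    }
    bdi->written += n;
    return n;
}

/* Stand-in for the role bdi_start_writeback() plays in the post: the
 * flusher is created on first use. Returns true if this call created it. */
bool toy_start_writeback(struct toy_bdi *bdi)
{
    bool created = false;

    if (!bdi->has_thread) {
        bdi->has_thread = true;   /* a forker thread would spawn it here */
        created = true;
    }
    toy_flush(bdi);               /* inline here; asynchronous in a real design */
    return created;
}
```

A bdi that never sees writeback never gets a flusher in this model, which is the point of going through a forker instead of creating a thread at registration time.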
Initial tests look pretty good, though I haven't done a whole lot of testing on this yet. It's still very fresh code. I posted it on lkml today; you can find the individual patches and a complete description there. As always, the patches are also in my git repo. Find them in the writeback branch here.