Re: many processes stuck in "hmrrcm", system unusable

Board: DFBSD_bugs, posted 2009/10/05 02:01, article 4 of 6 in this thread
: Shouldn't we rather try to fix the issue, i.e. make hammer work just a
: little bit performant and capable of concurrent use? I think now that
: the code is stable we should start investigating performance (latency)
: issues and address them.
:
: cheers
:   simon

I think the main culprit here is the background flusher. With UFS any
modifying operations can block the process context responsible for them.
With HAMMER *ALL* modifying operations are asynchronous and do not block
the process context responsible for them. Thus when resources reach their
limit, ANY process trying to make a modification, or even just load a new
inode (hmrrcm), winds up taking a hit instead of the one process that was
responsible for eating up all the resources in the first place.

These limits are quickly hit when rm -rf'ing or tar-extracting tens of
thousands of files, but otherwise typically not hit. In both cases the
disk winds up being banged up, but with UFS it is easier to prevent the
resource starvation from bleeding over into other processes. HAMMER can't
really distinguish between modifying operations belonging to a
heavy-handed process versus modifying operations incidental to processes
which otherwise have a light touch.

I do believe it is possible to solve the problem, but it isn't a quick
fix. Essentially we have to move meta-data modification out of the
backend flusher and into the frontend. This will shift the CPU and buffer
cache burden back onto the processes responsible.

But it isn't easy to do this, because those meta-data buffers cannot be
flushed to the media without first synchronizing the UNDO space.
Synchronizing the UNDO space while still maintaining a pipeline requires
double-buffering dirty meta-data buffers (because new changes to
meta-data already dirtied by a previous operation now undergoing a flush
cannot be made in-place). I would have to abandon using the buffer cache
entirely for meta-data buffers and go with a roll-my-own scheme. That
might make porters happier, but it won't make me happier, as it opens a
whole new can of worms on how to manage the buffer resources.

I would much rather work on the clustering, but if people are going to
constantly complain about HAMMER's performance I will have to take 2-3
months and deal with this issue first, I guess.

-Matt
Matthew Dillon <dillon@backplane.com>
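To make the "shift the burden back onto the responsible process" idea above concrete, here is a minimal, hypothetical C sketch of per-origin dirty-buffer accounting. It is not HAMMER code: the names (struct origin_acct, charge_dirty, retire_dirty) and the PER_ORIGIN_LIMIT value are invented for illustration, and it only models the bookkeeping, not the real sleeping, flushing, or UNDO handling.

#include <stdio.h>
#include <stdbool.h>

/* Dirty buffers one originating process may hold before it must stall (made up). */
#define PER_ORIGIN_LIMIT	64

struct origin_acct {
	const char	*name;
	int		 dirty;		/* buffers currently charged to this origin */
};

/*
 * Charge one dirty meta-data buffer to the process that caused it.
 * Returns true when that process should block and absorb its own
 * backpressure instead of pushing the cost onto everyone else.
 */
static bool
charge_dirty(struct origin_acct *o)
{
	o->dirty++;
	return (o->dirty >= PER_ORIGIN_LIMIT);
}

/* The flusher credits buffers back to the origin as it retires them. */
static void
retire_dirty(struct origin_acct *o, int n)
{
	o->dirty -= n;
	if (o->dirty < 0)
		o->dirty = 0;
}

int
main(void)
{
	struct origin_acct heavy = { "rm -rf", 0 };
	struct origin_acct light = { "editor", 0 };
	int i;

	/* A heavy-handed process dirties tens of thousands of buffers... */
	for (i = 0; i < 10000; i++) {
		if (charge_dirty(&heavy)) {
			/* ...and only *it* would sleep here until the flusher catches up. */
			retire_dirty(&heavy, PER_ORIGIN_LIMIT / 2);
		}
	}
	/* ...while a light-touch process never comes near its limit. */
	for (i = 0; i < 10; i++)
		(void)charge_dirty(&light);

	printf("%s: %d dirty (stalled at its own limit), %s: %d dirty (never stalled)\n",
	    heavy.name, heavy.dirty, light.name, light.dirty);
	return (0);
}

The point of the sketch is simply that a per-origin count lets the rm -rf style workload hit its own ceiling and stall, while a process with a light touch never notices the limit.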
Article ID (AID): #1AoEBdRw (DFBSD_bugs)