One of my colleagues came to me today and asked: judging from "top", no process is using much memory, yet free memory is surprisingly low. Once the kernel starts to use swap, system performance gets poor. Why?
Hmm... how to analyze this problem? My steps were:
- Since we could not see the cause in user space, the kernel itself was probably consuming the memory. We first checked the output of "free". If "cached" is large, the page cache is the culprit and we need fadvise (posix_fadvise) to handle it.
- It turned out... not the case!! OK, next I looked at the kernel's slab caches with "cat /proc/slabinfo". Oops, it showed dentry_cache occupying nearly 2GB!!
- My colleague's program generates a lot of new files, which makes the kernel create a lot of dentries. Linux keeps dentries cached to speed up path lookups, and with its default behavior it did not release them soon enough here. Once we knew the root cause, the solution was easy to find.
- echo 2 > /proc/sys/vm/drop_caches is sufficient. Take a look at this article if you want to know the details.
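The "cat /proc/slabinfo" step can be automated with a small script. The following is only a minimal sketch, assuming the slabinfo 2.x column layout (name, active_objs, num_objs, objsize, ...); the function names parse_slabinfo and top_slab_caches are made up for illustration, and objects times object size is just an approximation of each cache's footprint.

```python
from typing import List, Tuple


def parse_slabinfo(text: str) -> List[Tuple[str, int]]:
    """Return (cache_name, approx_bytes) pairs, largest first.

    Assumes the slabinfo 2.x layout:
    name <active_objs> <num_objs> <objsize> ...
    """
    usage = []
    for line in text.splitlines():
        # Skip blank lines, the version banner, and the column-header comment.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        name = fields[0]
        num_objs = int(fields[2])
        objsize = int(fields[3])
        # Approximation: allocated objects times object size.
        usage.append((name, num_objs * objsize))
    usage.sort(key=lambda item: item[1], reverse=True)
    return usage


def top_slab_caches(path: str = "/proc/slabinfo", limit: int = 5) -> List[Tuple[str, int]]:
    """Read slabinfo (usually requires root) and return the largest caches."""
    with open(path) as f:
        return parse_slabinfo(f.read())[:limit]
```

Calling top_slab_caches() as root returns the biggest slab caches first, so a runaway dentry cache shows up at the top of the list. Depending on the kernel version, that cache may be listed as dentry_cache or dentry.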
Having worked on the Linux kernel for years helps a lot when debugging a system. Fun, fun, fun. :-)
--------------------------------------------------------
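Today a colleague ran into a problem during testing: looking at memory usage in top, no process was taking up much memory, yet free memory was pitifully low, and system performance degraded once swap kicked in. Why does this happen?
How to handle it? Well, my steps, in order, were:
- Memory usage not visible from user space? Then it should be kernel space eating it, right? Check the output of free to see whether cached takes up too much; if it does, handle it with the fadvise technique.
- It turned out not to be... hmm, OK, look at the kernel's internal cache status (cat /proc/slabinfo). Hey, dentry_cache is especially high! It ate nearly 2GB!
- Once the cause is caught, the problem is half solved. It turned out my colleague's test program creates new files and distinct paths at an extreme rate and then reads the files, so dentry_cache accumulates quickly, and with Linux's current default behavior it is not released in time. Running echo 2 > /proc/sys/vm/drop_caches releases it proactively. See this article for a detailed explanation.
Takeaway: after working on the low-level stuff, I have a few more angles for debugging a system. Not bad~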
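For the other branch of the diagnosis, where "cached" in free is the large number, the fadvise technique from the first step might look roughly like this. It is a hedged sketch rather than the exact method used in this case: the helper name drop_file_from_page_cache is hypothetical, and it assumes Linux with Python 3.3 or newer, where os.posix_fadvise is available.

```python
import os


def drop_file_from_page_cache(path: str) -> None:
    """Hint the kernel that the cached pages of `path` are no longer needed."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages cannot be dropped, so flush them to disk first.
        os.fsync(fd)
        # offset=0 and len=0 cover the whole file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

The call is only advice to the kernel, so it is a per-file complement to the system-wide drop_caches knob, not a guarantee that the pages are freed.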