

Showing posts from August, 2012

Drop dentry_cache to release memory in Linux

One of my colleagues came to me today and asked: looking at the output of "top", I found that free memory is surprisingly low even though no process is using much memory. This leads to poor system performance once the kernel starts to swap. Why? Mmm... how do we analyze this problem? My steps were: since we could not see the cause in user space, the kernel itself might have consumed the memory. We looked at the output of "free". If "cached" is large, it means we need to use fadvise to handle it. It turned out... not the case!! OK, I then checked the kernel's cache status with "cat /proc/slabinfo". Oops, it showed that dentry_cache occupied 2GB!! My colleague's program generated a lot of new files, which caused the kernel to create a lot of dentries. Linux cannot predict when to release the dentry_cache, so it keeps the entries around for performance. Once we knew the root cause, the solution was easy to find: echo 2 > /proc/sys/vm/
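To make the diagnosis step concrete, here is a minimal sketch (not from the post) that estimates how much memory the dentry slab is holding by parsing /proc/slabinfo. It assumes the slabinfo 2.x column layout (name, active_objs, num_objs, objsize, ...) and usually needs root to read the file; the function and variable names are mine.

#!/usr/bin/env python
# Rough sketch: estimate dentry slab memory usage from /proc/slabinfo.
# Assumes slabinfo v2.x columns: name active_objs num_objs objsize ...

def dentry_cache_bytes(path="/proc/slabinfo"):
    with open(path) as f:
        for line in f:
            fields = line.split()
            if fields and fields[0] == "dentry":
                num_objs = int(fields[2])   # total allocated dentry objects
                objsize = int(fields[3])    # size of one dentry in bytes
                return num_objs * objsize
    return 0

if __name__ == "__main__":
    mb = dentry_cache_bytes() / (1024.0 * 1024.0)
    print("dentry cache is using about %.1f MB" % mb)
    # To ask the kernel to reclaim dentries and inodes (run as root):
    #   sync; echo 2 > /proc/sys/vm/drop_caches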

How to write a software watchdog in Python?

A software watchdog is essential for monitoring a system's status, especially for programs that run for a very long time, such as servers. The basic idea is simple: construct a trustworthy process or thread to monitor the real working process or thread. However, after googling for a while, I could not find any satisfactory solution in Python, so I think it's time to write one myself. In this article, I will share how I implemented a software watchdog in Python. Before we go on, you may want to pull my code from github to play with. The first thing is to decide which model the watchdog should run as. There are two choices: a thread, or a process. As a thread, we can simplify communication between the watchdog and the working thread. The problem is that if the working thread holds on to some resources, the watchdog might deadlock. Moreover, if the working thread is trapped in some loop, we have no graceful way to kill it. My choice is obvious: let the watchdog and the working one
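The post's own implementation is on github; as a rough sketch of the process-based model it argues for (not the author's code), the following uses a multiprocessing.Queue as a heartbeat channel: the worker reports periodically, and the watchdog terminates and restarts it when the heartbeats stop. The names worker, run_watchdog and HEARTBEAT_TIMEOUT are mine.

# Minimal sketch: a watchdog process that restarts a worker process
# when heartbeats stop arriving.
import time
from multiprocessing import Process, Queue
from queue import Empty

HEARTBEAT_TIMEOUT = 5  # seconds of silence before we consider the worker hung

def worker(heartbeats):
    """The real working process: do work and report liveness periodically."""
    i = 0
    while True:
        heartbeats.put(time.time())   # send a heartbeat
        time.sleep(1)                 # pretend to do one unit of work
        i += 1
        if i == 3:
            time.sleep(60)            # simulate getting stuck in a loop

def run_watchdog():
    while True:
        heartbeats = Queue()
        proc = Process(target=worker, args=(heartbeats,))
        proc.start()
        try:
            while True:
                # raises queue.Empty if no heartbeat arrives in time
                heartbeats.get(timeout=HEARTBEAT_TIMEOUT)
        except Empty:
            print("worker looks hung, restarting it")
            proc.terminate()
            proc.join()

if __name__ == "__main__":
    run_watchdog()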

Why Linux kernel has ACCESS_ONCE()?

I came across a good article on LWN discussing the usage and meaning of the ACCESS_ONCE() macro in the Linux kernel. The article covers a detail you must be careful about when multiple threads share a variable: the variable needs the volatile keyword to avoid problems caused by compiler optimization. The article uses an example from the Linux kernel, but since the issue applies to any multi-threaded program (or any program with context switches), I will illustrate it with a simpler case. Consider a thread with the following code fragment:

for (;;) {
    ...
    if (global_status == ST_MOVE) {
        do_something();
    }
    ...

It decides the program flow based on the value of global_status, and this variable may be modified in another thread function (or in a signal handler), so on each iteration of the loop, do_something() may or may not be executed. The problem is that compiler optimizations tend to happen exactly where the language leaves behavior undefined. Since C does not specify the behavior of variables shared between threads (before C11 it said almost nothing about this), if nothing inside the for loop modifies global_status, the compiler may remove the whole test from the loop:

// if global_status is ST_SLEEP
for (;;) {
    ...
    /* if (global_status == ST_MOVE) {
           do_something();
       } */
    ...

To avoid this, global_status is usually declared volatile. With that in mind, let's look at the definition of ACCESS_ONCE():

#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

Clearly, the intent is to access the variable through a volatile cast, but the question is: why not just declare it volatile from the start

How to debug recursive functions in Python?

A recursive function is the natural way to solve problems when you work with functional programming. However, it is sometimes hard to see the run-time execution flow even when the function is simple. I usually run the Python debugger and use "commands" to investigate the behavior:

(Pdb) l
  1  ->	def iseven(n):
  2  	    return True if n == 0 else isodd(n - 1)
  3
  4  	def isodd(n):
  5  	    return False if n == 0 else iseven(n - 1)
  6
  7  	print(iseven(4))
[EOF]
(Pdb) b 2
Breakpoint 1 at /home/mars/try/python/decorator/use_pdb.py:2
(Pdb) b 5
Breakpoint 2 at /home/mars/try/python/decorator/use_pdb.py:5
(Pdb) commands 1
(com) p n
(com) c
(Pdb) commands 2
(com) p n
(com) c
(Pdb) c
4
> /home/mars/try/python/decorator/use_pdb.py(2)iseven()
-> return True if n ==
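The pdb "commands" session above is the technique from the post; as a complement (not from the post excerpt), here is a minimal sketch of a tracing decorator that prints every recursive call and its return value, indented by depth, applied to the same iseven/isodd example:

# Sketch of a tracing decorator for visualizing recursive call flow.
import functools

def trace(func):
    depth = [0]                      # mutable cell shared by all calls
    @functools.wraps(func)
    def wrapper(*args):
        indent = "  " * depth[0]
        print("%s%s(%s)" % (indent, func.__name__,
                            ", ".join(repr(a) for a in args)))
        depth[0] += 1
        try:
            result = func(*args)
        finally:
            depth[0] -= 1
        print("%s%s -> %r" % (indent, func.__name__, result))
        return result
    return wrapper

@trace
def iseven(n):
    return True if n == 0 else isodd(n - 1)

@trace
def isodd(n):
    return False if n == 0 else iseven(n - 1)

print(iseven(4))

Running it prints each call as the recursion descends and each result as it unwinds, which gives the same picture of the execution flow without stepping through the debugger.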