Hi,
It is always a pleasure to have real-life contributions from colleagues in the industry. This time,
Rubi Dagan, a system architect and senior team leader at
Metacafe, one of the world's largest video sites, shares with us "the mystery of system calls".
Background
Metacafe's software made many calls to time(), and under stress their cost became much more noticeable. For example, in figure 1 you can see that 35% of the syscall time was spent in time().
However, a look at several other servers showed that they were processing all requests without any calls to time() at all (see figure 2). Hint: to get this information, use
strace -cp `ps ax | grep '[h]ttpd' | awk '{ print $1 }' | tr '\n' ',' | sed 's/,$//; s/,/ -p /g'` -f
Solving the mystery...
The solution lies in a BIOS option named HPET (High Precision Event Timer). When it is enabled, the kernel can do fast time lookups without needing the time() syscall; in effect, this mechanism tracks the time instead of the kernel. Please note that HPET support must also be enabled in the kernel.
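On the kernel side, one way to see which time source is in use is to read the clocksource entries that Linux exposes under sysfs. The sketch below is not part of Rubi's original procedure; it assumes a Linux machine with the usual sysfs layout, and the function name read_clocksources is made up for illustration.

```python
from pathlib import Path

# Usual sysfs location on Linux (assumption: sysfs is mounted at /sys).
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources, or (None, []) if not exposed."""
    base = Path(base)
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        # Not Linux, or the files are missing or unreadable.
        return None, []
    return current, available


if __name__ == "__main__":
    current, available = read_clocksources()
    print("current clocksource:", current)
    print("available:", ", ".join(available))
    if "hpet" not in available:
        print("hpet is not listed; it may be disabled in the BIOS or in the kernel")
```

If hpet does not appear in the available list, that is a hint to recheck the BIOS option and the kernel configuration before looking elsewhere.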
That's it: instead of Apache or another program dealing with time through system calls, the HPET mechanism handles it. The bottom line is fewer time() system calls. See also the thread on StackOverflow.
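To confirm the effect on a given server, one can compare the share of time() in the strace -c summary before and after the change. Here is a minimal sketch that pulls the "% time" column out of such a summary; the helper name syscall_shares and the sample numbers are invented for illustration and are not Metacafe's measurements.

```python
def syscall_shares(summary: str) -> dict:
    """Map syscall name -> '% time' value from an `strace -c` summary table."""
    shares = {}
    for line in summary.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            # Data rows start with a number; the header starts with '%'
            # and the ruler lines with '-', so both are skipped here.
            pct = float(fields[0])
        except ValueError:
            continue
        name = fields[-1]
        if name != "total":
            shares[name] = pct
    return shares


# Made-up sample in the layout printed by `strace -c`.
SAMPLE = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 35.00    0.000350           1       350           time
 40.00    0.000400           4       100           read
 25.00    0.000250           5        50         3 open
------ ----------- ----------- --------- --------- ----------------
100.00    0.001000                   500         3 total
"""

if __name__ == "__main__":
    print(syscall_shares(SAMPLE).get("time", 0.0))  # prints 35.0
```

On a server where the fast path is active, time should simply be absent from the summary, so the lookup falls back to 0.0.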
Bottom Line
This new configuration reduced syscall time by roughly 30% or more, which translates into a significant performance gain on Metacafe's servers.
P.S. We'll be glad to present other cases from the industry here. Don't be shy: submit your case and contribute to the community.
Best Regards,
Moshe Kaplan.
RockeTier. The Performance Experts