Wednesday, January 26, 2011

Day 6 - the C10K problem, nginx buffers and chains

The C10K problem page is a must-read for anyone who is really interested in highly scalable event-driven architecture. It's a bit outdated now, and some of the problems it mentions (including the "thundering herd" problem) seem to have been solved a long time ago by our friends working on the kernel. But it took a very long time for the kernel to provide mechanisms that support event-driven user-space software. That is probably also a good reason why nginx could not have existed sooner. The lesson I take from this (and it is definitely one I had never thought about) is: if the thing is asynchronous, don't make it synchronous just because that makes your life easier. Eventually, you will have to bite the bullet. So, the sooner, the better.

Now, to keep you posted on the translation front, I'm in the hardcore part of nginx: buffers and chains of buffers. I must say, I feel like I'm in the middle of the ocean with no land in sight: LOST. The two concepts are closely tied together, but they are also tightly bound to the way content is generated and processed. And that part comes later in the paper I'm translating (or at least, that's what I believe). So, it's like I understand the words but not the meaning of the sentence.

Enough about how I feel. At a high level, a buffer points to a chunk of memory and keeps track of how far nginx has got in "processing" that data. A chain is a succession of such buffers. I think the whole idea behind this is to avoid copying memory back and forth. Let's say you have a constant string you want added at the end of every single response. There is no need to allocate a copy of this string for each request: you build a buffer that points to the constant string (placed somewhere in memory by your compiler) and "chain" this buffer just after the original response buffer. I'm oversimplifying here, but I think this is the idea. At some point (when I am sure I fully understand it), I will have to draw a good diagram. I definitely think it could be useful. That will come when I'm done with the translation. So, I'm @TODOing this then...
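
In the meantime, here is a rough sketch in C of how I picture the constant-string example. This is not nginx's code: nginx's real types are ngx_buf_t and ngx_chain_t, and they carry a lot more than this. Every name below (sketch_buf, sketch_chain, FOOTER and the chain_* functions) is made up for illustration, and the "output" is just a plain memory area standing in for a socket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified buffer: it does not own the bytes. It only
 * points at them and remembers how far processing has advanced. */
typedef struct {
    const char *pos;   /* next byte still to be processed */
    const char *last;  /* one past the final byte of the data */
} sketch_buf;

/* One link of a chain: a buffer plus a pointer to the next link. */
typedef struct sketch_chain {
    sketch_buf           buf;
    struct sketch_chain *next;
} sketch_chain;

/* The constant footer exists exactly once, in static storage. */
static const char FOOTER[] = "\n-- served by sketch --\n";

/* Point a link at existing memory; no bytes are copied. */
void chain_point_at(sketch_chain *link, const char *data, size_t len)
{
    assert(link != NULL && data != NULL);
    link->buf.pos = data;
    link->buf.last = data + len;
    link->next = NULL;
}

/* Hook `tail` after the last link of the chain starting at `head`. */
void chain_append(sketch_chain *head, sketch_chain *tail)
{
    assert(head != NULL);
    while (head->next != NULL) {
        head = head->next;
    }
    head->next = tail;
}

/* Chain the shared footer after a response: every request gets its own
 * small link, but all of them point at the same FOOTER bytes. */
void chain_add_footer(sketch_chain *head, sketch_chain *footer_link)
{
    chain_point_at(footer_link, FOOTER, sizeof(FOOTER) - 1);
    chain_append(head, footer_link);
}

/* Walk the chain and emit up to `cap` bytes into `out`, advancing each
 * buffer's `pos` so that a later call resumes where this one stopped. */
size_t chain_write(sketch_chain *c, char *out, size_t cap)
{
    size_t written = 0;
    for (; c != NULL; c = c->next) {
        size_t n = (size_t)(c->buf.last - c->buf.pos);
        if (n > cap - written) {
            n = cap - written;
        }
        memcpy(out + written, c->buf.pos, n);
        c->buf.pos += n;
        written += n;
        if (c->buf.pos != c->buf.last) {
            break;  /* output is full; the rest stays in the chain */
        }
    }
    return written;
}
```

To use it, you would point one link at the response body with chain_point_at, call chain_add_footer with a second link, then call chain_write until every buffer has pos equal to last. The only copy happens at the very end, when the bytes actually leave; the footer itself is never duplicated per request.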

One more revelation for the day: there seems to be one chain per memory allocation pool. I cannot begin to fathom why, but I'm sure there is a good reason. Maybe future events will shed some light on this.
