Monday, March 21, 2011

Day 37 - the end of the last buffer or should it be the other way round ?

One of the mysteries of nginx I still have not figured out (and I think there are a lot of them) are buffers. Those little things called ngx_buf_t. I got the basics: it's a way to point at an existing zone of memory (or file) that already exist. The main objective (as far I understand it) is to avoid copying stuff (especially big chunks of memory) around. If you look at the Development of modules for nginx I translated you will realise the important fields in the structure are:
    u_char          *pos;
    off_t            file_pos;
    u_char          *last;
    off_t            file_last;

    u_char          *start;         /* start of buffer */
    u_char          *end;           /* end of buffer */
Most of the time, all you'll do is:
  1. Allocate memory (let's call the pointer to this zone p)
  2. Create a buffer pointing to this zone:pos=start=p and last=end=p+size_needed
This is so common, there is even ngx_create_temp_buf doing that for you. So, most of the time you don't even realize that last and end are not the same and the naming probably doesn't help: there is usually nothing left after the end and nobody after the last one... ;) The thing is : with nginx, after the last comes the end.

Now, there is only one situation I saw end != last and that was with very specific buffers: the ones I crafted on Day 28 - POST body buffers... to show you that the body of a request could be split in two buffers. In this very specific situation, the first buffer is actually the one pre-allocated by nginx to read the request method, uri, headers, etc. So, the buffer end is set even before the request starts arriving and its last is determined by how much was read from the network. Hence, you end up with a buffer in which there is room between the last and the end.

Before I let you go with this revelation, I want to tell you about one more thing that surprised me recently regarding buffers. So, I was trying to migrate from my python testing to agentzh's Test::Nginx. As there is no support for multiple requests (in the traditional sense) per test in the current version of Test::Nginx (we're working on it and this might actually be my first contribution to this project, but that's another story) I used pipelined_requests to simulate this. pipelined_requests were intended to test HTTP/1.1 so they send all the requests in pretty much one go. And that caused my rrd module to crash. Why, would you ask ? Pretty simple: I was assuming the body of the request to end with the last of request_body->bufs->buf. And I was wrong! Here the buffer goes all the way up to the last byte read from the network, which in my test happened to be the end of the second pipelined request. So, my module was basically considering the body of request one to be body of request 1+request 2+headers of request 2+body of request 2. Needless to say, inserting this data in my RRD did not work. Of course, I fixed this by refusing to look beyond r->headers_in.content_length_n (the length of the body as announced in the request header for those of you who are not familiar with the nginx requests attributes yet).

One more thing (not related to buffers at all): today I tried to setup my nginx to run some php. I found tons of posts/articles/forums entries telling that PHP had to be recompiled for fastcgi to be enabled on a Fedora 14 (and others as well). I even found people pointing as repositories just for that. And, you know what? The standard install (yum install php) comes with fastcgi installed. Yes, it takes to look at the configuration used by the packagers with php -i. Yes, it takes to Read The Fucking Manual. But if you do so, all you have to do to run you php as a fastcgi server is: php-cgi -b 127.0.0.1:9000. And no need to say that this work perfectly with nginx FastCGI module.

No comments:

Post a Comment