Wednesday, February 23, 2011

Day 26 - content handler vs. request body handler

As we saw in yesterday's episode, a request body handler (called once the body of the POST request has made it to the server) returns nothing, whereas the content handler (called after the headers have been processed) returns a status (NGX_DONE, NGX_AGAIN, etc.) as an ngx_int_t. This difference almost drove me crazy, as you will see in a moment.

So, after yesterday's adventure, I was happy: I could test my module with big POST bodies. I figured I could start returning stuff, like the URL-decoded version of the parameter that was submitted to me in the POST. Nothing really fancy. I had some trouble actually getting the data out of the POST body and into memory, but I'll tell you about this tomorrow. So, I managed to get the response into a memory buffer and started sending it to the client with something like:
void ngx_http_rrd_body_received(ngx_http_request_t *r) {
    [...]
    /* Prepare BIG header and response */
    [...]
    rc = ngx_http_send_header(r);
    if (rc != NGX_OK) {
        ngx_log_error(NGX_LOG_ALERT, log, 0,
                      "pb sending header");
        return NGX_HTTP_INTERNAL_SERVER_ERROR;
    }

    return ngx_http_output_filter(r, out_chain);
}
And that's when things started to get interesting (to say the least):
  • When posting a small body everything worked fine.
  • When posting a big body, the web server sent only the beginning of the response and the client ended up timing out.

I went on and tried to figure out what was wrong. I checked my code ten times to make sure that there was nothing wrong with the memory allocations. I ran it step by step in the debugger, added debug logging for nginx, and even watched the packets fly with Wireshark. I also started to dig deep into the nginx code, with the http_copy_filter, the http_write_filter and all their friends in the output filter chain, and watched them run in my debugger. But all this only got me:
  • A spinning head from realizing how much I don't know
  • The confirmation that nginx was sending only the first 65536 bytes of my response

A break under my fig tree did not help, so I went and explained my problem on the development mailing list. I was pretty sure I was doing something wrong, but I could not figure out what. Maxim Dounin gave me the answer: in the request body handler you should call ngx_http_finalize_request. When you are writing a content handler, you return the output of ngx_http_output_filter and nginx calls ngx_http_finalize_request to do the job for you. Now, in our situation, the content handler just registers a request body handler, so it needs a way to tell nginx not to call ngx_http_finalize_request. It does that by returning NGX_DONE. The counterpart is that, later on, when the request body handler executes, we must call ngx_http_finalize_request ourselves. It is fairly simple once you know it. And it explains why the request body handler does not return anything.
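To make the "who finalizes?" rule concrete, here is a small toy model in plain C. It is not nginx code: every toy_ name, the struct and the status values are stand-ins I made up to mirror the description above, and the real core does far more than this. It only shows that a content handler returning the "done" status leaves finalization to the body handler, which has no return value to hand back.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in status codes: the names echo nginx, the values are illustrative. */
enum { TOY_OK = 0, TOY_DONE = -4 };

typedef struct toy_request toy_request_t;
typedef int  (*toy_content_handler_pt)(toy_request_t *r); /* returns a status */
typedef void (*toy_body_handler_pt)(toy_request_t *r);    /* returns nothing  */

struct toy_request {
    toy_body_handler_pt body_handler; /* registered by the content handler */
    int finalized;                    /* number of finalize calls so far */
    int final_status;                 /* status passed to the last finalize */
};

/* Stand-in for ngx_http_finalize_request: records that the request is over. */
void toy_finalize_request(toy_request_t *r, int rc) {
    assert(r != NULL);
    r->finalized++;
    r->final_status = rc;
}

/* Stand-in for the output filter chain: pretends the response was sent. */
int toy_output_filter(toy_request_t *r) {
    (void) r;
    return TOY_OK;
}

/* Body handler: returns void, so it has to finalize the request itself. */
void toy_body_received(toy_request_t *r) {
    toy_finalize_request(r, toy_output_filter(r));
}

/* Content handler: registers the body handler and tells the core "hands off". */
int toy_content_handler(toy_request_t *r) {
    r->body_handler = toy_body_received;
    return TOY_DONE;
}

/* Toy core: finalizes on behalf of the content handler unless it said DONE. */
void toy_core_run(toy_request_t *r, toy_content_handler_pt handler) {
    int rc = handler(r);

    if (rc != TOY_DONE) {
        toy_finalize_request(r, rc);
    }

    /* Later, once the whole body has arrived, the body handler is invoked. */
    if (r->body_handler != NULL) {
        r->body_handler(r);
    }
}
```

Running toy_core_run on a zero-initialized toy_request_t with toy_content_handler finalizes the request exactly once, from inside the body handler. If toy_body_received forgot the toy_finalize_request call, nobody would ever finalize the request, which matches the stalled response I was seeing.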

Now, if like me you wonder why two different conventions were used for something so close, I don't have the answer. The only comments I got from agentzh were:
Well, just to mind you, the rewrite handler, access handler, and body
handler all use its own convention of return values and they're
different in one way or another ;)

Besides, the specific meaning may change without notice because Igor
Sysoev tends to change his mind :)

If you'd call such things "traps", then I'd say there's tons of them
in the core.

The good thing is that it means nginx still has a lot more surprises in store for me to discover and share with you... ;)

4 comments:

  1. Hi Antoine,
    Thanks for these "days" comments on nginx and module development.

    I am trying to add a module in nginx, which needs to do some kind of b/g work independent of HTTP request.
    I plan to use HTTP push module to update a page on completing the work (as u can see, it is completely async)

    Have gone through evanmiller blog and nginxguts to write my own module. Wrote a hello module which is working fine.

    I extended the hello module and in the http handler, it could start the thread. I have used pthread library for this. Now I want to start a thread of my own along with nginx and not with a http handler.

    1. What modification is required to make it?
    2. How to stop the thread when nginx is exiting?
    3. Is there a different way of doing it other than pthread for a background activity.

    I posted a thread in nginx http://forum.nginx.org/read.php?2,213156. So far no reply. So it is a re-post here.

  2. Hi Gireesh,

    I don't think nginx was designed to do what you are trying to achieve. The lack of answer on the mailing list is probably a good sign that your train of thoughts breaks some unwritten rule in the minds of Igor and Maxim.
    So, a safer route is probably to do the b/g work in a completely separate process (triggered by cron, maybe).
    If you really want all this to happen in nginx, you should rather look into the timers but that's a part of the code I haven't looked at yet...

    Sorry I could not be more helpful.

  3. Hi Antoine:
    Even I felt so.
    Thanks a lot for the reply. :-)

    I will try to benchmark the performance and see if there is any meaning of doing it like this than putting it as a FASTCGI back-end server

  4. Maybe if you provide more insight into the kind of thing you want to do (like uploading video files + transcoding, or whatever it is you want to achieve), people on the mailing list will share their best practices with you.
