• 201005.04

    Using gzip_static in nginx to cache gzip files

    Recently I've been working on speeding up the homepage of beeets.com. Most speed tests say it takes between 4 and 6 seconds to load. Obviously, all of them are somehow fatally flawed. I digress, though.

    Everyone (who's anyone) knows that gzipping your content is a great way to reduce download time for your users. It can cut the size of html, css, and javascript by about 60-90%. Everyone also knows that gzipping can be very cpu intensive. Not anymore.

    I just installed nginx's Gzip Static Module (compile nginx with --with-http_gzip_static_module) on beeets.com. It allows you to serve pre-compressed gzip files instead of compressing on every request. What?

    Let's say you have the file /css/beeets.css. When a request for beeets.css comes through, the static gzip module will look for /css/beeets.css.gz. If it finds it, it will serve that file as gzipped content. This allows you to gzip your static files at the highest compression level (gzip -9) when deploying your site. Nginx then has absolutely no work to do besides serving the static gzip file (and it's very good at serving static content).

    Wherever you have a gzip section in your nginx config, you can do:

    gzip_static on;

    That's it. Note that you will have to create the .gz versions of the files yourself, and the docs mention that it's better if the original and the .gz file have the same timestamp, so it may be a good idea to "touch" the files after both are created. It's also a good idea to turn the gzip compression down (gzip_comp_level 1..3). This will minimally compress dynamic content without putting too much strain on the server.
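
    For example, something like this at deploy time takes care of both steps: it pre-compresses everything with gzip -9 and copies each original's timestamp onto its .gz (the paths and extensions here are just examples):

    # pre-compress static files at max compression; keep the originals
    for f in css/*.css js/*.js; do
        gzip -9 -c "$f" > "$f.gz"   # writes foo.css.gz next to foo.css
        touch -r "$f" "$f.gz"       # give the .gz the same timestamp as the original
    done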

    This is a great way to get the best of both worlds: gzipping (faster downloads) without the extra load on the server. Once again, nginx pulls through as the best thing since multi-cellular life. Keep in mind that this only works on static content (css, javascript, etc etc). Dynamic pages can and should be gzipped, but with a lower compression ratio to keep load off the server.
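
    For reference, the relevant chunk of config ends up looking something like this (just a sketch; the types and compression level are examples, and nginx always compresses text/html by default):

    # static .gz files get served as-is; everything else gets light on-the-fly gzip
    gzip on;
    gzip_comp_level 2;
    gzip_types text/css application/x-javascript application/json;
    gzip_static on;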

    Comments
  • 201002.04

    NginX as a caching reverse proxy for PHP

    So I got to thinking. There are some good caching reverse proxies out there, maybe it's time to check one out for beeets. Not that we get a ton of traffic or we really need one, but hey what if we get digged or something? Anyway, the setup now is not really what I call simple. HAproxy sits in front of NginX, which serves static content and sends PHP requests back to PHP-FPM. That's three steps to load a fucking page. Most sites use apache + mod_php (one step)! But I like to tinker, and I like to see requests/second double when I'm running ab on beeets.

    So, I'd like to try something like Varnish (sorry, Squid) but that's adding one more step in between my requests and my content. Sure it would add a great speed boost, but it's another layer of complexity. Plus it's a whole nother service to ramp up on, which is fun but these days my time is limited. I did some research and found what I was looking for.

    NginX has made me cream my pants every time I log onto the server since the day I installed it. It's fast, stable, fast, and amazing. Wow, I love it. Now I read that NginX can cache FastCGI requests based on response caching headers. So I set it up, modified the beeets api to send back some Cache-Control junk, and voilà...a 2800% speed boost on some of the more complicated functions in the API.
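
    The PHP side of that is just a matter of sending a Cache-Control header before the output. Something like this (a sketch, not the actual beeets code; the max-age and the $events variable are made up):

    <?php
    // cacheable endpoint: let nginx's fastcgi cache hold this response for 5 minutes
    header('Cache-Control: public, max-age=300');

    // for user-specific responses, opt out so they never come from the cache:
    // header('Cache-Control: private, no-store');

    echo json_encode(array('events' => $events));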

    Here's the config I used:

    # in http {}
    fastcgi_cache_path /srv/tmp/cache/fastcgi_cache levels=1:2
                               keys_zone=php:16m
                               inactive=5m max_size=500m;
    # after our normal fastcgi_* stuff in server {}
    fastcgi_cache php;
    fastcgi_cache_key $request_uri$request_body;
    fastcgi_cache_valid any 1s;
    fastcgi_pass_header Set-Cookie;
    fastcgi_buffers 64 4k;

    So we're giving it a 500MB cache. The fastcgi_cache_valid line says any response gets cached for 1 second, but that gets overridden by the Cache-Control headers sent back by PHP. I'm using $request_body in the cache key because in our API, the actual request is sent through like:

    GET /events/tags/1 HTTP/1.1
    Host: ...
    {"page":1,"per_page":10}

    The params are sent through the HTTP body even in a GET. Why? I spent a good amount of time trying to get the API to accept the params through the query string, but decided that adding $request_body to one line in an NginX config was easier than re-working the structure of the API. So far so good.

    That's NginX acting as a caching reverse proxy for FastCGI. Ideally in our setup, HAproxy would be replaced by a reverse proxy cache like Varnish, and NginX would just stupidly forward requests to PHP like it was earlier today...but I like HAproxy. Having a health-checking load-balancer on every web server affords some interesting failover opportunities.

    Anyway, hope this helps someone. NginX can be a caching reverse proxy. Maybe not the best, but sometimes, just sometimes, simple > faster.

    Comments
  • 200901.16

    Amazon S3

    Very cool service. I updated beeets to pull all images from images.beeets.com, an S3 bucket. Also, all css files now go through /css/css.php/file.css which rewrites

    url(/images/...)

    to

    url(http://images.beeets.com/images/...)
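
    For the curious, a css.php like that doesn't need to be anything fancy. Here's a minimal sketch (hypothetical, not the actual beeets code), assuming the requested filename comes in via PATH_INFO:

    <?php
    // requested as /css/css.php/file.css, so "/file.css" shows up in PATH_INFO
    $file = basename($_SERVER['PATH_INFO']);   // basename() keeps requests inside this directory
    $path = dirname(__FILE__) . '/' . $file;

    if (substr($file, -4) !== '.css' || !is_file($path)) {
        header('HTTP/1.1 404 Not Found');
        exit;
    }

    header('Content-Type: text/css');

    // rewrite relative image paths to the S3 bucket
    $css = file_get_contents($path);
    echo str_replace('url(/images/', 'url(http://images.beeets.com/images/', $css);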

    And guess what, it all works. I had some bad experiences with the S3Fox Firefox plugin in the past, but it's since been updated and I've been using it regularly.

    Also, using S3.php, all profile images now go directly onto images.beeets.com. Wicked.

    So what does this mean? A few things:

    1. Less bandwidth & work - beeets will spend more of its time serving HTML, CSS, and JS instead of images.
    2. Safer - We were already backing up profile images to S3 indirectly, but S3 is far less likely to go down than our hosting.
    3. Worse image caching - Before, I had .htaccess controlling all the caching for static files. I liked it that way. S3 doesn't do this very well at all. Apparently it's configurable, but I don't know how...any ideas?

    All in all, it should be better for beeets. Maybe we'll actually let users have images bigger than 10x10 now ;)

    Thumbs up to S3 (and probably all other Amazon web services).

    Comments