Squid
2007-12-28
Efficient use of Resources
This afternoon, I finished replacing Apache httpd with Cherokee on Inigo's main server. Squid now handles all incoming HTTP requests first, with Pound managing the backend servers.
There really wasn't much need for Apache httpd, as we don't use most of its features, and without tuning it takes up a lot of memory just to do rewrites, HTTP proxying for the backends and logging.
Squid uses much less memory, and is very fast. Kagesenshi is working on tighter integration with CacheFu for the Plone sites. Even without that, you will find that sites like http://foss.org.my are now snappy (1.02 seconds total according to Firebug). It has also improved the speed of http://mirror.inigo-tech.com, which is served from a slow external USB drive.
The performance bottleneck now is actually memory. While FreeBSD's virtual memory does an awesome job (we're using 1201MB of swap at time of writing), we are now running 4 separate virtual servers, each running its own self-contained services. There's not much more we can optimize now. Long-idle processes, of course, take several seconds to swap back in. So adding another 1GB of memory (total 2GB) will give quite a bit of breathing space and get rid of that lag.
For those who are curious, all this, including http://www.apdip.net which used to sit on its own server, is running on RM2.5K worth of hardware.
2007-12-25
Reducing complexity and resource usage for Zope front-end
Keeping things simple is important. Keeping things simple, however, does not necessarily mean things are dumbed down. There is a Unix mantra, which is to do one thing and to do it well. Take bzip2, for example: it does its job well, which is to compress things, and it doesn't do anything else. Yes, it has a lot of options, but they're all related to compressing data.
When programs do too many things, they sometimes end up bloated and complicated to set up. So archiving http://www.apdip.net to a virtual server was a good opportunity to simplify things and reduce resource usage without reducing functionality or performance.
So I've reduced the backend setup to a chain of:
squid -> pound -> zope
               -> cherokee
- Squid here does what it does best, which is to cache requests
- Pound to load balance between the application servers and the httpd server
- Cherokee as a lightweight httpd server
Except for Squid, they all do their job really well with simple, small configuration files. With Squid, keeping its role strictly to that of a caching server also simplifies the configuration and makes it less prone to errors.
Proper articles later, but hopefully some tips here will help you on your way.
Squid
Start here: http://wiki.squid-cache.org/Squid_Faq/ReverseProxy
Set Squid to listen on port 80 and also deal with named vhost requests
http_port 80 accel vhost
Set Squid to go to Pound to manage redirections and load balance the backend services
cache_peer 127.0.0.1 parent 81 0 originserver default
We needed separate logs for the different vhosts, and that was not too difficult. First we set the ACLs
acl apdip dstdomain www.apdip.net
http_access allow apdip
acl stats dstdomain stats.apdip.net
http_access allow stats
Then we split the logs by referring to the ACLs, so that each site has its own logs
access_log /var/log/httpd/www.apdip.net/access.log combined apdip
access_log /var/log/httpd/stats.apdip.net/access.log combined stats
Pound
We then configure Pound to deal with the backend services. Just run man pound; the man page is all you need to set up different priorities of servers, redirect requests to different servers, set timeouts on backend pools etc. It's quite simple, because well.. that's what Pound is supposed to do, and do well.
Here it's listening on port 81, and redirecting the right requests to the right servers.
User "www"
Group "www"
Client 300
ListenHTTP
    Address 127.0.0.1
    Port 81
End
Service
    HeadRequire "Host:.*stats.apdip.net.*"
    Backend
        TimeOut 120
        Address 127.0.0.1
        Port 8081
    End
End
Service
    Backend
        TimeOut 120
        Address 127.0.0.1
        Port 8080
    End
End
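To make the routing above a bit more concrete, here is a small model of it in Python. This is only my sketch of the behaviour, not Pound's actual code: it assumes services are tried in the order they appear, that HeadRequire is a case-insensitive regex searched against each request header line, and that a service without HeadRequire matches everything. SERVICES and pick_backend are made-up names for illustration.

```python
import re

# Same order as the Service blocks in the Pound config above.
# head_require=None stands for a service with no HeadRequire line.
SERVICES = [
    {"head_require": r"Host:.*stats.apdip.net.*", "backend": ("127.0.0.1", 8081)},
    {"head_require": None, "backend": ("127.0.0.1", 8080)},
]


def pick_backend(header_lines):
    """Return the (address, port) of the first service that accepts the request."""
    for service in SERVICES:
        pattern = service["head_require"]
        if pattern is None:
            # Catch-all service: takes whatever the earlier ones did not.
            return service["backend"]
        if any(re.search(pattern, line, re.IGNORECASE) for line in header_lines):
            return service["backend"]
    return None


if __name__ == "__main__":
    print(pick_backend(["Host: stats.apdip.net"]))
    print(pick_backend(["Host: www.apdip.net"]))
```

The point is that the order matters: the catch-all service has to come last, otherwise it would swallow the stats.apdip.net requests too.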
Cherokee

Love it.. small and simple configuration files. vhosts, cgi etc. are a snap to set up. The configs are even set up Debian style (sites-enabled), brilliant. It also uses much less memory than Apache, and since we don't use any of the modules, PHP support etc., it makes much more sense. I got introduced to this by Alvaro some time back, and since he wrote it, I trust it's as good as he says it is. :)
I won't paste the config files here (you can check them out yourself), but all I can say is that it was simple. Just go to the documentation on the website to see examples for common uses.
