phpBB and Postgres Performance, AWStats & logrotate

Posted by Jason Tue, 28 Feb 2006 06:13:00 GMT

It’s been a fun little webmaster week for me as I’ve solved two of the nagging issues I’ve had with the site.

I host a message board here, and after a couple months of use, performance was terrible. Watching top on any request would show the bottleneck was obviously Postgres. Although I didn’t think to try doing an EXPLAIN on any of the queries phpBB uses, I did eventually try doing a vacuum on the database, and it worked well. The speed wasn’t great, but it was usable. Well, just this week I tried doing a full vacuum and it gave another great speed boost. So now the forum performs at the level you’d expect, as long as the load on the Linode is at a reasonable level. All it took was one entry in crontab:

0 4 * * * vacuumdb --full --all --verbose --analyze

It may seem like overkill to do a full vacuum everyday, but I figure it’s worth it. With sites like this blog where content is cached, a full vacuum won’t affect viewing. And for a site the forum, it’s worth it for the site to inaccessible for a few minutes during a dead hour if it keeps the site snappy during peak hours.

The interesting thing was the possible causes of the slowdown. It seems that a few indexes would get really crufty though normal phpBB uses. When the vacuum was performed hundreds of index rows would be removed, and this was on the forums table! Considering the forums table has less than 10 rows, it’s possible that dropping that index might give a speed boost. But at the same time, I know Postgres uses MVCC, and that mean every update creates a new row. Since that forums table gets updated on every post, that probably results in hundreds of rows in the forums table.

Since I don’t really know exactly what the problem is, this would be an interesting experiment:
  • Create a script (preferably with a load testing tool like JMeter) that hits a phpBB instance like so – hit the front page, log in, view a forum, post a message, post 10 replies, log out
  • Run it a thousand times or so, and see how performance degrades
  • Start again, but this time drop one of the indexes that gets a lot of action during a vacuum.
  • Run it again the same number of times and see if the performance degradation is worse or better.
  • Repeat for other suspect indexes.
  • Repeat the experiment, with all the original indexes, but this time, pause the test after 100 runs, run a vacuum, then resume the test.

I would do it, but the full vacuum is a working, automated solution for the time being. If the forum’s database gets too big, it may be worth trying.

The other thing I did this week was I got my Apache log files under the control of logrotate, and the analysis of those logs under the control of AWStats. At first I followed the AWStats documentation and set up AWStats as a CGI program. Got that working, but I wasn’t happy with it because it was a little slow, and I’ve had some automated attacks on AWStats hit me before. (I was fairly safe against them – I’d put it on it’s own virtual host and password protected it.) So instead, I changed it to generate static reports whenever my logs are rotated. It works great. It’s secure, accessing the reports is fast, and I don’t need real time reports or the other features that CGI gives you.

So here’s how I did it… This is the relevant logrotate entry:


/var/log/www/*.log {
  missingok
  daily
  notifempty
  # /var/log/www is NOT world readable, but /var/log/www_old is, so
  # awstats will always be looking at /var/log/www_old/HOSTNAME_access.log.1
  olddir /var/log/www_old
  sharedscripts
  postrotate
        /bin/kill -HUP `cat /opt/apache/logs/httpd.pid 2>/dev/null` 2> /dev/null || true; su - awstats -l -c 'awstats-update.sh' > /dev/null 2>&1
  endscript
}

And here’s the awstats-update.sh script that gets called after the logs are rotated:


#!/bin/sh

STATS_PAGE="/var/www/awstats/stats/index.html" 
echo "<html><body><ul>" > $STATS_PAGE

for AWSTATS_CONF in `find /etc/awstats/ -type f | perl -e 'while ( <STDIN> ) { chomp; my ( $config ) = ( m|/etc/awstats/awstats\.(.*)\.conf$| ); print "$config\n"; }'`; do
  awstats_buildstaticpages.pl -config=${AWSTATS_CONF} -update -awstatsprog=$HOME/bin/awstats/awstats.pl -dir=/var/www/awstats/stats
  echo "<li><a href=\"awstats.${AWSTATS_CONF}.html\">${AWSTATS_CONF}</a></li>" >> $STATS_PAGE
done

echo "</ul></body></html>" >> $STATS_PAGE

Done deal. It creates a little index page for easy navigation. If I add new virtual hosts, I just add the awstats configuration file in /etc/awstats and the script picks it up.

Tomorrow I am off to Japan to see my brother. VERY EXCITED.

3 comments | no trackbacks

JAY DEE DROP THE BEAT

Posted by Jason Sun, 12 Feb 2006 06:13:00 GMT

James Yancey aka Jay Dee aka J Dilla died last Friday. It was a very bad way to end a bad week for me. I’m not usually deeply affected by deaths of people I’ve never met, but the death of Jay Dee got to me. Ever since Like Water For Chocolate was released, I’ve considered him the best producer in the game. In the past couple years, it seemed like he was finally finding his place in the industry. And finally getting the recognition he deserved.

It was obvious he’s been sick since Champion Sound was released. The weight swings, the hospitalizations, those European shows he did in a wheelchair… It’s just hard to hear that he was hiding lupus for so long.

I wanted to put together a Jay Dee compilation, but it’s already been done much better by other people (such as Mental Combat). But I will still offer a couple downloads:

Two Can Win from the new Donuts LP. I actually haven’t bought or listened to Donuts yet – this is the only track from the album I’ve listened to. If this track is any indication, Donuts will be great. Gonna buy the album on Monday.

Common – Heat from Common’s Like Water For Chocolate. LWFC is one of my favourite albums. I know I’ve spent a fair bit of time defending it. “Heat” is my favourite beat from the album.

It’s Like That from Welcome 2 Detroit. This is the most underrated Dilla beat ever. It follows the formula that Dilla used to win my ears – really simple drum patterns with the volume of the drums turned WAY UP. This beat isn’t much more than than one drum loop and a chopped bass sample. Who else can keep it this simple and make it this hot? Listen to this song in the car and turn up the volume.

http://jaydee.ytmnd.com/ – I made this about a year and a half ago. Click. It features the song “Welcome 2 Detroit”.

For the past few weeks, this has been my main CD stack…

The flash washed it out, but there are two other CDs in the stack with Jay Dee production.

Detroit Free Press article on Dilla’s death

Dilla’s section on the Stones Throw page

no comments | no trackbacks

Programmers MUST Read This

Posted by Jason Fri, 10 Feb 2006 06:13:00 GMT

Programmers Need To Learn Statistics Or I Will Kill Them All by Zed Shaw

I’ve done a little bit of load testing before, and I couldn’t say I believed my results told the complete story. Zed’s article definitely clears up a lot for me.

I will admit that I did pretty much the bare minimum in the area of Statistic in my University career. I slept through the first year course because it was pretty much a course for Arts students. After that, I picked up enough in compsci courses to calculate the number of runs required for a Monte Carlo/Las Vegas algorithm to be correct with a given certainty.

If I were to go back to school, Statistics is something I’d like to revisit. As one of my favourite professors says, Statistics is the field that’s most applicable to real life. (Okay, Al’s wording is much more elegant than mine.)

no comments | no trackbacks