
Phpbb3 import error: bbcode_uid truncation

January 17th, 2008 by daryl

I recently upgraded an install of phpbb to phpbb3. Shortly thereafter, I moved the site that the forum runs on to different hardware after several days of downtime on the original hardware (and an unresponsive vendor). To move to the new hardware, I dumped the database to a text file, compressed it, and shot the database and all site files across the network to the new hardware. Then I uncompressed the database and slurped it into mysql. Simple enough. What I hadn’t considered in advance was the fact that I was moving from mysql4 to mysql5. Accordingly, some weird things started happening when I started testing the site on the new hardware. I googled around a bit to discover that some of the problems were a result of the mysql upgrade, and I finally found this script, which purports to solve the problems by modifying the database structure. The script seemed to work just fine. The problems I had seen went away, and I figured the migration was a success.

But then somebody in the forums pointed out that bbcode throughout the site was messed up. And sure enough, all posts that had been imported had weird extra characters appended to bbcode blocks, which kept the bbcode from being converted into the appropriate html. For example, a block of bbcode might look like this: [quote="username"scd]stuff[/quote:scd]. But the characters were never consistent across posts. A bit more googling turned up the cause. phpbb has a field called bbcode_uid that is supposed to allow eight characters, but either during the move from mysql4 to mysql5 or as part of that nifty script I ran (I’m not sure which), the field was truncated to five characters. That lops off the last three characters of an eight-character bbcode_uid, which ultimately results in the weird display we found.

What’s going on is that parsing nested tags (e.g. “[quote][b][url][/url][/b][/quote]”) can become laborious for the server, especially when tags don’t get closed properly. To make it more surefire and to simplify the process, phpbb appends a bbcode_uid to any bbcode inserted. So when you type “[url]http://daryl.learnhouston.com[/url]”, what actually gets inserted into the database is something like “[url:d98cJ1pv]http://daryl.learnhouston.com[/url:d98cJ1pv]”. This makes it so that you’re not having to figure out arbitrary nesting, because every opening tag has a corresponding unique end tag; you don’t have to find a beginning tag’s mate by parsing a string recursively, in other words. It’s a really cool idea. Of course, to remove the bbcode_uids from posts as a page is built, you need to store the bbcode_uid associated with a given post, so that it can be stripped out once tags are matched to one another. This is the bbcode_uid field in the posts table. And this field has just been truncated to five characters by the database move. Which means that when phpbb tries to find the bbcode_uid value within a given post, it finds and strips out only the first five characters, which results in three weird characters being appended to bbcode tags and the improper display of bbcode. In every single post and every single signature of your forums, which in my case was nearly 200,000 posts.
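To make the mechanism concrete, here is a minimal sketch in Python of the tag-and-strip idea described above. It is not phpbb’s actual code: the function names tag_with_uid and strip_uid, the simplified tag pattern, and the placement of the uid just before the closing bracket are all assumptions made for illustration.

```python
import re

# Simplified stand-in for a bbcode tag: [name], [/name] or [name=value].
# Assumption: real bbcode parsing in phpbb is considerably more involved.
TAG = re.compile(r"\[(/?[a-z]+(?:=[^\]]*)?)\]")


def tag_with_uid(text: str, uid: str) -> str:
    """Append ':uid' inside every tag, e.g. [url] -> [url:uid]."""
    return TAG.sub(lambda m: f"[{m.group(1)}:{uid}]", text)


def strip_uid(text: str, uid: str) -> str:
    """Remove every ':uid' marker, as would happen when a page is built."""
    return text.replace(f":{uid}", "")


if __name__ == "__main__":
    stored = tag_with_uid("[url]http://daryl.learnhouston.com[/url]", "d98cJ1pv")
    print(stored)

    # With the full eight-character uid, the original bbcode comes back.
    print(strip_uid(stored, "d98cJ1pv"))

    # With the uid truncated to five characters, three stray characters
    # are left behind inside every tag.
    print(strip_uid(stored, "d98cJ"))
```

Stripping with the truncated value leaves the last three characters of the uid stuck inside each tag, which is the kind of debris that keeps the tags from being recognized.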

The fix is rather daunting to implement. Basically, you have to script something that looks at every single post and every single signature, finds bbcode_uids therein, matches the first five characters to the bbcode_uid field in the posts table (just as a check), and then updates the bbcode_uid for each post to the match found (this is after altering your table to make the bbcode_uid column accommodate eight characters, of course). If you get this wrong, you’ve basically wrecked your whole database, and bbcode for posts in the past will never render correctly. Of course, if you’ve discovered this problem before anybody has posted to your site, then you can alter the database and reimport the data, but this isn’t an option if people have been using the site for a few days before the issue was reported. Luckily, I was able to come up with a pretty simple script to fix the issue. Of course I was terrified to push the start button, so to speak, but push it I did, and it worked.
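The matching step described above can be sketched roughly as follows. This is not the script I ran, just an illustration in Python: the names recover_uid and plan_updates are hypothetical, and it assumes that uids are alphanumeric, eight characters long, and always sit right before a tag’s closing bracket. It only computes the new values; actually writing them back to the database (after widening the column) is left out.

```python
import re
from typing import List, Optional, Tuple

UID_LENGTH = 8  # the length the bbcode_uid column is supposed to allow


def recover_uid(post_text: str, truncated_uid: str) -> Optional[str]:
    """Find the full uid in post_text that starts with truncated_uid.

    Returns None when nothing matches or when the match is ambiguous,
    so that doubtful posts can be reviewed by hand instead of guessed.
    """
    missing = UID_LENGTH - len(truncated_uid)
    if not truncated_uid or missing <= 0:
        return None
    # Assumption: a uid appears as ':<uid>]' at the end of a tag.
    pattern = re.compile(
        r":(" + re.escape(truncated_uid) + r"[0-9A-Za-z]{%d})\]" % missing
    )
    candidates = set(pattern.findall(post_text))
    return candidates.pop() if len(candidates) == 1 else None


def plan_updates(
    posts: List[Tuple[int, str, str]]
) -> Tuple[List[Tuple[int, str]], List[int]]:
    """Take (post_id, post_text, stored_uid) rows and split them into
    (post_id, full_uid) updates and a list of post ids that were skipped."""
    updates, skipped = [], []
    for post_id, text, stored_uid in posts:
        full_uid = recover_uid(text, stored_uid)
        if full_uid is None:
            skipped.append(post_id)
        else:
            updates.append((post_id, full_uid))
    return updates, skipped


if __name__ == "__main__":
    rows = [
        (1, "[b:d98cJ1pv]hi[/b:d98cJ1pv]", "d98cJ"),
        (2, "no bbcode here", "abcde"),
    ]
    print(plan_updates(rows))
```

Keeping the planning separate from the writing means you can look over the proposed updates and the skipped posts, ideally against a backup copy, before touching the live tables.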

If you’re having the same issue, you can try my fix at your own risk.

Making an Honest Man of Myself

February 21st, 2006 by daryl

For a long time now, I’ve been very pleased to use open source software. To the uninitiated, what that means is that all of the software I use on a daily basis is free. There’s much more nuance to the culture surrounding open source software, but that’s what’s really significant about it to me with respect to daily use.

For example, when I was a Windows user, I never ponied up and paid for a license for Microsoft Word. If it didn’t come on my computer, I used an old CD to install it, probably using an install key I found on the internet. Open source software lets me have great (often comparable) software at no cost and with no guilt. I simply download OpenOffice and use that as my document software now. My use of and later involvement with the production and marketing of open source software have bled in recent years into another significant area of my daily life.

As a programmer, I often find myself listening to music while I work. My tastes vary. Sometimes I’ll pop in 20th century Estonian classical and sometimes Eminem. Sometimes bluegrass, sometimes Chemical Brothers. Sometimes the Statler Brothers or Aaron Neville and sometimes Radiohead or Nine Inch Nails. Much of the music I’ve listened to over the past few years has been pirated. I had one employer that kept a music library selected by its employees. That is, each month, each employee got to pick an album that the company bought and had available for the employees to listen to. Another company had some people in it who happened to toss mp3s of a lot of their songs onto a server. I’ve derived much enjoyment and distraction from songs I copied from these sources.

And lately, I’ve been feeling pretty guilty about it, largely because I’ve also felt guilty about stealing software as I used to do. A few times recently, I’ve declined to share my music with friends who got iPods. It’s hard to do that, to know you’re going to come off as some sort of ninny for being so rigid about not wanting to facilitate music theft. At the same time, though, I’ve kept and enjoyed the music I’ve stolen over the years. It was pretty hypocritical. I’ve justified it in part by saying “well, it’s already stolen, and I can’t unsteal it.” But that’s bullshit. Any justification is bullshit.

So today, I deleted it. There are probably a few tunes I inadvertently skipped (you try going quickly through a few thousand songs to delete the ones you didn’t buy), but I made a good faith effort to purge anything I haven’t either personally bought or been given a legitimately purchased copy of. I’m now an honest man, no longer a hypocrite. And it feels good.

If you’re a friend and you’ve asked for music, please don’t take personally my disinclination to give it to you. And don’t think I’m judging you. I don’t care who does what. I just know that I personally felt guilty for using music I hadn’t paid for that people expect to be paid for, so I’m not doing it anymore. I also feel guilty when I accidentally kill a bug (I prefer to let them out the door), but that doesn’t mean I’m looking at you askance if you bust out a flyswatter.

I’ll miss a lot of the music I deleted, though a lot of it was junk I’ve never listened to very much (so I really lose two sorts of burden as a result of the deletion). The upside is that I can purchase much of it back and be legitimate. Radiohead doesn’t seem to have anything on iTunes, and that’s perhaps my biggest woe. But it’s a pretty small price to pay in exchange for not feeling like a hypocrite.