Archive for January 17th, 2008

Phpbb3 import error: bbcode_uid truncation

January 17th, 2008 by daryl

I recently upgraded an install of phpbb to phpbb3. Shortly thereafter, I moved the site that the forum runs on to different hardware after several days of downtime on the original hardware (and an unresponsive vendor). To move to the new hardware, I dumped the database to a text file, compressed it, and shot the database and all site files across the network to the new hardware. Then I uncompressed the database and slurped it into mysql. Simple enough. What I hadn’t considered in advance was the fact that I was moving from mysql4 to mysql5. Accordingly, some weird things started happening when I started testing the site on the new hardware. I googled around a bit to discover that some of the problems were a result of the mysql upgrade, and I finally found this script, which purports to solve the problems by modifying the database structure. The script seemed to work just fine. The problems I had seen went away, and I figured the migration was a success.

But then somebody in the forums pointed out that bbcode throughout the site was messed up. Sure enough, every imported post had weird extra characters appended to its bbcode tags, which kept the bbcode from being converted into the appropriate html. For example, a block of bbcode might look like this: [quote="username"scd]stuff[/quote:scd]. The extra characters differed from post to post. A bit more googling turned up the cause: phpbb has a field called bbcode_uid that is supposed to allow eight characters, but either during the move from mysql4 to mysql5 or as part of that nifty script I ran (I’m not sure which), the field got truncated to five characters. That lops off the last three characters of an eight-character bbcode_uid, which ultimately results in the weird display we found.

What’s going on is that parsing nested tags (e.g. “[quote][b][url][/url][/b][/quote]”) can become laborious for the server, especially when tags don’t get closed properly. To make it more surefire and to simplify the process, phpbb appends a bbcode_uid to any bbcode inserted. So when you type “[url]http://daryl.learnhouston.com[/url]”, what actually gets inserted into the database is something like “[url:d98cJ1pv]http://daryl.learnhouston.com[/url:d98cJ1pv]”. This means you don’t have to figure out arbitrary nesting, because every opening tag has a corresponding unique end tag; in other words, you don’t have to find a beginning tag’s mate by parsing a string recursively. It’s a really cool idea. Of course, to remove the bbcode_uids from posts as a page is built, you need to store the bbcode_uid associated with a given post, so that it can be stripped out once tags are matched to one another. This is the bbcode_uid field in the posts table. And this field has just been truncated to five characters by the database move. That means that when phpbb tries to find the bbcode_uid value within a given post, it finds and strips out only the first five characters, which leaves three weird characters appended to bbcode tags and breaks the display of bbcode. And it does so in every single post and every single signature of your forums, which in my case meant nearly 200,000 posts.
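To make the mechanism concrete, here is a toy sketch in Python. It is not phpbb’s actual code; the function names tag_bbcode and strip_uid are made up for illustration, and the regex only handles simple tags. It shows the same post being tagged with an eight-character uid, stripped cleanly with the full uid, and stripped badly with a uid truncated to five characters.

```python
import re

def tag_bbcode(text, uid):
    """Toy model of storage: append ':uid' inside every bbcode tag."""
    # Matches [tag], [/tag] and [tag=value]; inserts ":uid" before the "]".
    return re.sub(r"\[(/?\w+(?:=[^\]]*)?)\]",
                  lambda m: "[" + m.group(1) + ":" + uid + "]",
                  text)

def strip_uid(stored, uid):
    """Toy model of display: remove every ':uid' from the stored text."""
    return stored.replace(":" + uid, "")

uid = "d98cJ1pv"  # the eight-character example uid from the post
stored = tag_bbcode("[url]http://daryl.learnhouston.com[/url]", uid)

clean = strip_uid(stored, uid)       # full uid: tags come back intact
broken = strip_uid(stored, uid[:5])  # truncated uid: three characters stay behind

print(stored)
print(clean)
print(broken)
```

With the truncated uid, the leftover “1pv” stays glued to the tag name, so the tag no longer looks like bbcode and never gets converted to html.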

The fix is rather daunting to implement. Basically, you have to script something that looks at every single post and every single signature, finds bbcode_uids therein, matches the first five characters to the bbcode_uid field in the posts table (just as a check), and then updates the bbcode_uid for each post to the match found (this is after altering your table to make the bbcode_uid column accommodate eight characters, of course). If you get this wrong, you’ve basically wrecked your whole database, and bbcode for posts in the past will never render correctly. Of course, if you’ve discovered this problem before anybody has posted to your site, then you can alter the database and reimport the data, but this isn’t an option if people have been using the site for a few days before the issue was reported. Luckily, I was able to come up with a pretty simple script to fix the issue. Of course I was terrified to push the start button, so to speak, but push it I did, and it worked.
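The approach described above can be sketched roughly as follows. This is not the script linked from this post, just a minimal illustration in Python under a few assumptions: that uids are alphanumeric, that a uid always sits right before a closing “]”, and that rows arrive as (post_id, truncated_uid, post_text) tuples. The names recover_uid and plan_updates are hypothetical. It only works out which updates would be safe; it writes nothing to a database.

```python
import re

TRUNCATED_LEN = 5  # length the column was cut down to, per the post
FULL_LEN = 8       # length the uid is supposed to have, per the post

def recover_uid(truncated, text):
    """Return the full uid found in text, or None if it can't be determined."""
    if len(truncated) != TRUNCATED_LEN:
        return None
    # Assumption: uids are alphanumeric and are followed directly by "]".
    tail = "[0-9A-Za-z]{" + str(FULL_LEN - TRUNCATED_LEN) + "}"
    pattern = re.compile(":(" + re.escape(truncated) + tail + r")\]")
    candidates = set(pattern.findall(text))
    # Only trust the result when every tag in the post agrees on one uid.
    return candidates.pop() if len(candidates) == 1 else None

def plan_updates(rows):
    """Return (post_id, full_uid) pairs for rows that can be repaired safely."""
    updates = []
    for post_id, truncated, text in rows:
        full = recover_uid(truncated, text)
        if full is not None:
            updates.append((post_id, full))
    return updates

rows = [
    (1, "d98cJ", "[url:d98cJ1pv]http://daryl.learnhouston.com[/url:d98cJ1pv]"),
    (2, "abcde", "a post with no bbcode at all"),
    (3, "abcde", "[b:abcde123]x[/b:abcde123] [i:abcde999]y[/i:abcde999]"),
]
updates = plan_updates(rows)
print(updates)
```

Posts with no bbcode, or with conflicting candidates, are skipped rather than guessed at, which seems like the safer default given how much damage a wrong update could do. Signatures would need the same treatment. Back up the database and widen the bbcode_uid column to eight characters before applying anything like this.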

If you’re having the same issue, you can try my fix at your own risk.

His Dark Materials, Zadie Smith

January 17th, 2008 by daryl

I spent much of my 11-day holiday break either horizontal or wishing I was horizontal thanks to a back strain that’s still giving me fits. I took advantage of the time to get some reading done. Mleeka and I had gone to see The Golden Compass, and in anticipation of it, she purchased and read the trilogy of which that movie composes roughly one third. She was somewhat disappointed in how the movie chopped off the end of the first book and how it left out some of the back story about Dust. Having not read the books yet, I thought it was a pretty engaging movie, if a little slow at times (especially when Kidman was onscreen). In any case, watching the movie and hearing Mleeka talk about the books prompted me to read them. Steinbeck they’re not, but I enjoyed the whole set. Oddly, where Mleeka found the second book to suffer from what she calls second book syndrome (wherein a second book in a set serves primarily to set up the more involved politics and relationships that drive subsequent books but is of limited interest on its own in terms of actually moving the plot along), I found it to be pretty interesting. On the whole, not a bad bunch of books for a quick read.

I had heard about an author named Zadie Smith. She made waves a few years ago with her first novel (published when she was 24, I think), and I’ve been meaning for a while to pick up some of her stuff. Her On Beauty was on my amazon wish list, and Ashley got it for me for Christmas. It was a good book, though somewhat different from what I had expected based on comparisons of her work to other authors I like. It felt a bit like a modern day take on the old comedies of manners. I don’t mean to pigeon-hole Smith here in the almost patronizing way it’ll probably come off, but the book felt a bit like Pride and Prejudice or Sense and Sensibility for the 20th century (which probably isn’t terribly flattering given that I find those sorts of books tedious and dull and light). And yet it wasn’t tedious or dull or light, and in fact, there was much to appreciate. Smith writes really great dialogue, and especially argument dialogue. So the book wasn’t quite what I expected, but it was well-done enough that I decided to get her first novel, White Teeth. This I finished reading last night, and it’s the sort of book I expected based on what I had read about Smith. It was different than any of the old white guy fiction I’ve read, and it dealt with its subjects in what felt like really honest, informed ways across cultures, religions, races, genders, and ages. And it did so with wit and beauty and absurdity and sometimes sadness. On Beauty isn’t a book I’ll likely read a great many times in my life, but White Teeth I can imagine myself re-reading every few years, as I do with most of my favorites. (Uh, which is not to detract from the gift itself of the former book; had I not read that one, I might never have gotten around to reading the other.) If you happen to like reading contemporary literary fiction, this one should go on your list.

Next up I think is George R. R. Martin. I’ve never been much on sci-fi or fantasy, and I guess he’s a fantasy author. Three or four people have separately recommended him to me even knowing that I’m not much of a fantasy reader. Mleeka gobbles the stuff up, so I got her the first in his big series for Christmas, and she dug it and has since read the rest of the series that’s been published to date. She seems to validate what others have told me about him, so I’m thinking I might broaden my horizons a bit and see what I think of his books.