Phpbb3 import error: bbcode_uid truncation

I recently upgraded an install of phpbb to phpbb3. Shortly thereafter, I moved the site that the forum runs on to different hardware after several days of downtime on the original hardware (and an unresponsive vendor). To move to the new hardware, I dumped the database to a text file, compressed it, and shot the database and all site files across the network to the new hardware. Then I uncompressed the database and slurped it into mysql. Simple enough. What I hadn’t considered in advance was the fact that I was moving from mysql4 to mysql5. Accordingly, some weird things started happening when I started testing the site on the new hardware. I googled around a bit to discover that some of the problems were a result of the mysql upgrade, and I finally found this script, which purports to solve the problems by modifying the database structure. The script seemed to work just fine. The problems I had seen went away, and I figured the migration was a success.

But then somebody in the forums pointed out that bbcode throughout the site was messed up. And sure enough, all posts that had been imported had weird extra characters appended to bbcode blocks, which kept the bbcode from being converted into the appropriate html. For example, a block of bbcode might look like this: [quote="username"scd]stuff[/quote:scd]. But the characters were never consistent across posts. A bit more googling turned up the fact that phpbb has a field called bbcode_uid that is supposed to allow eight characters, but either when moving from mysql4 to mysql5 or as part of that nifty script I ran (I’m not sure which), the field gets truncated to five characters, which lops off the last three characters of an eight-character bbcode_uid, which ultimately results in the weird display we found.

What’s going on is that parsing nested tags (e.g. “[quote][b][url][/url][/b][/quote]“) can become laborious for the server, especially when tags don’t get closed properly. To make it more surefire and to simplify the process, phpbb appends a bbcode_uid to any bbcode inserted. So when you type “[url]http://daryl.learnhouston.com[/url]“, what actually gets inserted into the database is something like “[url:d98cJ1pv]http://daryl.learnhouston.com[/url:d98cJ1pv]“. This makes it so that you’re not having to figure out arbitrary nesting, because every opening tag has a corresponding unique end tag; you don’t have to find a beginning tag’s mate by parsing a string recursively, in other words. It’s a really cool idea. Of course, to remove the bbcode_uids from posts as a page is built, you need to store the bbcode_uid associated with a given post, so that it can be stripped out once tags are matched to one another. This is the bbcode_uid field in the posts table. And this field has just been truncated to five characters by the database move. Which means that when phpbb tries to find the bbcode_uid value within a given post, it finds and strips out only the first five characters, which results in three weird characters being appended to bbcode tags and the improper display of bbcode. In every single post and every single signature of your forums, which in my case was nearly 200,000 posts.

The fix is rather daunting to implement. Basically, you have to script something that looks at every single post and every single signature, finds bbcode_uids therein, matches the first five characters to the bbcode_uid field in the posts table (just as a check), and then updates the bbcode_uid for each post to the match found (this is after altering your table to make the bbcode_uid column accommodate eight characters, of course). If you get this wrong, you’ve basically wrecked your whole database, and bbcode for posts in the past will never render correctly. Of course, if you’ve discovered this problem before anybody has posted to your site, then you can alter the database and reimport the data, but this isn’t an option if people have been using the site for a few days before the issue was reported. Luckily, I was able to come up with a pretty simple script to fix the issue. Of course I was terrified to push the start button, so to speak, but push it I did, and it worked.

If you’re having the same issue, you can try my fix at your own risk.

phpMyAdmin and Designer mode

phpMyAdmin Designer ViewAnybody who’s done any open source development knows about this nifty tool called phpMyAdmin that lets you manipulate mysql databases through a web interface. For Luddites like me, using the command line interface is usually preferable to using any sort of GUI tool that requires pesky mouse moves and clicks, so I’ve generally deployed phpMyAdmin to be used by others who needed to screw with databases but haven’t been comfortable with (or had access to) the command line.

One of my tasks for today is drawing up some diagrams to show relationships between tables in a big fancy dimensionally modeled database I’m trying to blunder my way through designing. The problem is that I utterly hate all the tools I’ve tried for this on Linux. What I wanted was something that would read in a mysql schema file and populate the basic diagram for me so that all I had left to do was to map out the relationships. There’s a UML-diagram application called Umbrello that does entity-relationship diagrams of the type I need to deliver, and a user has contributed a script that will read in a schema file (though it’s finicky). I don’t like the application itself very much, but it fit my basic requirements. This morning, it started crashing, though, and was unusable. Luckily, I hadn’t done much manipulation of the model in it yet. So the quest for a better tool was back on.

I tried various versions of DBDesigner4 and MySQL Workbench, even going so far as to try running the Windows versions in wine when I experienced problems with the Linux versions, but nothing worked out.

Finally, I decided to take another look at phpMyAdmin to see if it had been enhanced with any sort of modeling capabilities. And it has! In recent versions, you can enable a “Designer” view by uncommenting a couple of lines in the config file and slurping in the tables for the phpmyadmin database in the scripts directory. It’s not a perfect tool by any means, but when I click the “Designer” tab in the app, it shows me a nice DHTML view of the tables that lets me drag them around, specify relationships, toggle to show or hide tables that have no relationships defined, etc. It comes with a nifty little palette and a toggleable sidebar to handle these operations, and it’s really a pretty elegant little piece of work. What’s more, and what makes this really useful for my purposes, is that I can save the frame (phpMyAdmin keeps a navigation frame open on the left) to my local disk, zip it up, and send it to somebody, who can then perform the same DHTML manipulations I was able to perform, making it ideal for sending along a complex schema that can have portions of it disabled for ease of viewing. (To clarify, they can save none of the information back to the database, but there’s some degree of flexibility with respect to how they can control the static view.) And to top if all off, if I want to make changes to tables, I can do it at the command line or right there in phpMyAdmin, and there’s no re-importing of a schema — the Designer view will be up to date the next time I reload it. In Umbrello, it’s my impression that any changes I made after import (e.g. drawing relationships) could not be exported back out in a useful way for porting back to the database, so I would constantly have been updating the schema, importing, and redrawing relationships.

This is a great tool for my purposes, and of course phpMyAdmin’s core features are also very useful in many environments.

Flexible Drupal surveys

A few weeks ago, my company needed to publish a survey with a pretty flexible layout. Had we been constrained to the one-field-per-row constraint that Drupal’s survey module allows for, we would have had a very long an ugly form when in fact what we wanted was a nice tidy grid of small form controls that was much less imposing for users to consider filling out. So I hacked our old version of the survey module (for Drupal 4.6) to add a “layout” field to the fields tab for a survey. In the layout field, those who can administer surveys can specify markup and drop form fields into the markup using numeric placeholders wrapped in curly braces. Surveys for which no layout is defined use the default layout with numeric field weighting. I’ve submitted a patch to incorporate this functionality, and you can find the bug report with attached patch here if you’re in need of such functionality before the patch gets review or if the survey module developer declines to integrate it. The patch is for Drupal 5.x.

On a related note, I created a patch for the forms module that lets you add HTML between form elements. This hack/patch arose out of a need to stick a quick text snippet between two fields, where the snippet couldn’t be contained gracefully in the “description” line of the topmost of the fields. I add a form field type “html” that spits out whatever HTML you specify. I wrote this code (or the old version of it; this patch too is for Drupal 5.x) when my company needed explanatory HTML between two fields but the need to manipulate the form fields themselves hadn’t arisen yet. This functionality is good for insertion of quick snippets of HTML, where the first patch I mention above is best for layout overhauls.

Both patches probably represent security risks on sites that allow non-administrators to create forms or surveys, as I’m not (as yet) filtering content, so anybody who can create a survey can add arbitrary HTML. So apply the patch with that in mind.

Styling Drupal 5.x search forms

I’m working on a project that requires me to apply a fancy pants style to a Drupal search form. I thought this would be simple enough, as it’s pretty easy to override default themes for pretty much everything else in Drupal, but it turns out either that I’m a dolt or that there’s not much clarity out there on this topic. After screwing around with a lot of things (e.g. poring over debug_backtrace() output, writing die() statements all over the place, temporarily hacking the search and node modules, etc.), I searched Drupal’s site and found a promising link that turned out not to be the solution I needed (it simply didn’t work).

At long last, I tried creating a theme function named mytheme_search_form(). The search module has a function named search_form(), but in all my hair-pulling, I never saw anything that indicated that you could override this function by prepending your theme name to it (I would have expected to find calls to “theme(‘search_form’, $args)” somewhere). At any rate, I ultimately created the above-named function and gave it the following definition:

function mytheme_search_form($form){
        return _phptemplate_callback('search_theme_form', array('form' => $form));
}

Then I created a file in my template directory named “search_theme_form.tpl.php” and built a custom form.

Next up was adding additional search fields to the form. To do this, I looked at node.module, where the function node_form_alter() adds fields to the default form. It didn’t seem necessary to jump through that hoop since my form was mostly hard-coded anyway, so I just added some radio buttons with the name “category” so that I could filter search results by taxonomy. Simple enough. But the search never actually filtered on my results. Here I did more hair-pulling and weird debugging. Finally, I went back to the hook_form_alter() functionality, having noticed a “processed_keys” key in the $form array. So to make my form honor my category search, I added the following things to my template.php file. It’s not clear to me how much of this is necessary, and I rather suspect I’m doing something stupid here, but it seems to work and I’m on deadline, so I’m rolling with it.

function mytheme_search_keys($type = null){
        $keys = search_get_keys();
        if($type){
                $keys = search_query_insert($keys, 'type', $type);
        }
        if($_POST['category']){
                $keys = search_query_insert($keys, 'category', $_POST['category']);
        }
        return $keys;
}

/**
 * Taking over this function so that I can call mytheme_search_validate to do the advanced search.
 */
function blog_form_alter($form_id, &$form){
        if($form_id == 'search_form'){
                $form['#validate']['mytheme_search_validate'] = array();
        }
}

/**
 * Need to call this to add category to the processed_keys array item so that
 * the category actually gets searched in the mini-advanced form we generate
 * in mytheme_search_form().
 */
function mytheme_search_validate($form_id, $form_values, &$form){
        $keys = $form_values['processed_keys'];
        $keys = mytheme_search_keys($form_values['module']);
        form_set_value($form['basic']['inline']['processed_keys'], $keys);
}

function faq_search($op = 'search', $keys){
        switch($op){
                case 'name':
                        return t('content');
                default:
                        $keys = mytheme_search_keys('faq');
                        return node_search('search', $keys);
        }
}

function forum_search($op = 'search', $keys){
        switch($op){
                case 'name':
                        return t('content');
                default:
                        $keys = mytheme_search_keys('forum');
                        return node_search('search', $keys);
        }
}

function blog_search($op = 'search', $keys){
        switch($op){
                case 'name':
                        return t('content');
                default:
                        $keys = mytheme_search_keys('blog');
                        return node_search('search', $keys);
        }
}

Now for an explanation. The “mytheme_search_keys()” function is a helper that lets me make sure I’m limiting the search to a given node type. It’s not clear to me that this is absolutely necessary, but things seem not to work if I don’t add the “type:” string to the search, so I’m leaving it in. Note that this function also looks for $_POST['category'] and adds it to the keys. If I wanted to add other search fields, I’d add them here as well. I suppose that since mytheme_search_validate() calls this function to set keys, I could eliminate the extra function and just do the same work in mytheme_search_validate().

Next up, blog_form_alter(). The hook_form_alter() functions are associated with modules, so I chose one I knew my site would be using that didn’t have a form_alter hook defined already. It feels kind of hacky, but it seems to work. The idea here is that we need to make the form run a validation function in order to add the keys we’re pulling in from mytheme_search_keys(). It was when I added this code that the category filter actually started working, so it seems to be a crucial bit. The key seems to be adding the keys to the $form['basic']['inline']['processed_keys'] array item, which seems to handle adding the search criteria to the URL and to the search itself.

Finally, I added the three _search() functions, which again feel a little extraneous, but the thing doesn’t work unless I add them, so they’re staying put for now. All we’re doing in these functions is adding the node type to the keys being searched (the search code extracts things like “type:blog” and “category:3″ from the query to do advanced searches) and then executing the node_search() function with the revised $keys value.

So, there you have it, the long way around to having a custom-themed Drupal search form with additional filters based on node type and category. Hope this saves somebody a few of the 10 or so hours I spent staring incredulously at my screen as solution after solution failed to work.

Inviting JSON to the Table

I’m doing some preliminary work on a project for which it’s been suggested that I consider using JSON rather than XML as a data transport. “JSON?” you ask. “What’s that?” It stands for JavaScript Object Notation, and for those of us who’ve spent a lot of time writing javascript, it’ll look very familiar. It’s a subset of the javascript language and can be described as the convention whereby one represents object members as name/value pairs. In short, it’s a form of serialization native to javascript and is therefore understood by all modern browsers out of the box and by many other programming languages either natively or by simple extension. A javascript function can eval a JSON text string with no additional parsing needed and can then use the decoded values directly. This can be beneficial to web applications in at least two potentially notable ways:

  • It eliminates the need to parse a verbose XML document into an object and then perform operations on the object.
  • The format can be (though isn’t necessarily) less verbose than XML.

Transfer time and processing overhead can therefore be optimized when using JSON in some circumstances. Furthermore, for some uses, JSON might actually save programming effort required on the server side to generate XML from objects or on the client side going in the other direction.

Those advantages notwithstanding, I was originally hesitant to give JSON more than a passing glance. For most web applications that feature the sort of functionality I had in mind (including, as far as I was aware at the time, the one I’m doing R&D for), existing AJAX toolkits fit the bill, and I was inclined to use an existing AJAX toolkit rather than to implement JSON for the sheer novelty of doing so. Consider an editable grid table, for example. You have a text field with an onchange event. On change, you send a small piece of data to the server and you get a small piece of data back that tells the client how to provide UI feedback. None of the three benefits of JSON I mentioned above really apply here, as the data in both directions is small, requires almost no processing, and need not be an otherwise usable object. It’s text out, text in, and minor DOM manipulation. In such a case, JSON provides no real advantage, and you might as well go with a standard AJAX toolkit.

A colleague working on the project with me pointed out some other possible use cases, however, that might render JSON worth further investigation. For example, if the data comes down as an object, it can be sorted and have calculations performed on it more readily on the client side without a round-trip to the server and back per operation. There’s something very appealing to me about this. So I’ll be doing more diligence on JSON.

As my first foray into coding with JSON, I wanted to test the example my colleague brought up. Doing so required me to grab a few libraries, and I haven’t packaged it all up nicely, but it’d be reasonably easy to assemble these things and test this out for yourself if you’re interested. Here’s what you need to grab:

So, in a web sandbox, save the PEAR class as JSON.php and create a file named “process.php” with the following contents:

<?php

include("JSON.php");

$rows = array();
$cols = array();

$colcount=20;

for($i=0; $i<$colcount; $i++){
array_push($cols, "col $i");
}

for($i=0; $i<100; $i++){
$row=array("count"      => $i);
for($j=0; $j<$colcount; $j++){
$row[$cols[$j] ] = substr(md5(microtime() . $cols[$colcount]),0,8);
}
array_push($rows, $row);
}

$response=array(
"error"         => "0",
"message"       => "success",
"payload"       => $rows
);

$json = new Services_JSON();
$output = $json->encode($response);
print($output);

?>

This script generates 100 rows of 20 columns of junk data and returns it as a JSON object. In a real-world application, this would presumably be a data set returned from a database. The XMLHttpRequest issued from your client calls this script and handles the data. Now on to that part of the code. Create a file named index.html and populate it as follows:

<html>
<head>
<title>JSON Demo</title>
<script type="text/javascript" xsrc="sortable.js" mce_src="sortable.js"   ></script>
<script type="text/javascript" xsrc="json.js" mce_src="json.js"   ></script>
<script type="text/javascript" xsrc="xmlrpc.js" mce_src="xmlrpc.js"   ></script>
<style type="text/css">
/* Sortable tables */
table.sortable a.sortheader {
background-color:#eee;
color:#666666;
font-weight: bold;
text-decoration: none;
display: block;
}
table.sortable span.sortarrow {
color: black;
text-decoration: none;
}
</style>
</head>
<body>
<div id="container">
<input type="text" id="thefield" name="thefield" value="" />
<input type="button" id="thebutton" name="thebutton" value="The Button" onclick="json_request(document.getElementById('thefield').value)" />
</div>
</body>
</html>

Note the javascript includes at the top, and be sure to name the downloaded libraries appropriately or to change the file. Now create json.js and populate it as follows:

function json_request(txt){

var myOnComplete = function(responseText, responseXML){
var obj = eval('(' + responseText + ')');
var container=document.getElementById('container');
container.appendChild(make_table(obj.payload));
sortables_init();
}

var myOnLog = function(msg){
alert(msg);
}

var provider = new oyXMLRPCProvider();
provider.onComplete = myOnComplete;
provider.onError = myOnLog;

provider.submit("process.php?txt=" + escape(txt));
}

function make_table(data){
var table = document.createElement('table');
table.className='sortable';
var tbody = document.createElement('tbody');
for(var i=0; i<data.length; i++ ){
var tr = document.createElement('tr');
//Create headers on first iteration.
if(i==0){
for(var field in data[i]){
var th = document.createElement('th');
th.innerHTML=field;
tr.appendChild(th);
}
tbody.appendChild(tr);
//Be sure to start a new row.
var tr = document.createElement('tr');
}
for(var field in data[i]){
var td = document.createElement('td');
td.innerHTML=data[i][field];
td.className=field;
tr.appendChild(td);
}
tbody.appendChild(tr);
}
table.appendChild(tbody);
table.id="table_" + Math.floor ( Math.random ( ) * 100 );
return table;
}

If you get everything linked up correctly, the result should be that when you press the button on the main page, a JSON object is pulled down asynchronously from the server and appended to the page as a table of data. Each such table is independently sortable without round trips to the server and back (and without your having to write sorting validation code to prevent SQL injection attacks, etc.). Of course, this does degrade poorly for browsers in which javascript is disabled. In any case, I’ve modeled one bit of functionality that’s pretty painless to implement using JSON, and I suspect that further work in this direction will turn up even more interesting results.

For more information on JSON, be sure to hit the JSON site.

technorati tags: , ,

PHP5

Brushing up a little on some of the things available in PHP5. Here’s something that made me weep happy tears. Given the XML file:

<people>
    <person>
        <name>Daryl L. L. Houston></name>
        <coolness>100%></coolness>
    </person>
    <person>
        <name>Steve Urkel></name>
        <coolness>-100%></coolness>
    </person>
</people>

And the code:

<?php

$people = simplexml_load_file('test.xml');
foreach($people->person as $person){
        print $person->name . ' (' . $person->coolness . ')' . "\n";
}

?>

You get the output:

Daryl L. L. Houston (100%)
Steve Urkel (-100%)

No iterating over arrays, no finding the right PEAR class and making sure you’ve got the right versions of dependent PEAR classes installed. It Just Works™.

I’ve more or less avoided PHP5 to date, as much of the code I write wouldn’t benefit a great deal from many of the things I understand PHP5 to provide, but this alone makes it worth a second glance.

Xajax in Drupal

I’ve taken a little break from tech postings here because most of my tech postings of late had been about Flock, and I wanted to group those elsewhere. Here’s another tech posting that’s relevant to Flock because I wrote this code for a project for Flock, but it doesn’t exactly belong at that blog because it’s not specifically about Flock. So feel free to skip over this one, dear family members.

Xajax is a php library that makes ajax development a cinch. I’ve looked at a few ajax libraries, and there are better implementations than what xajax provides on the javascript side, but the ease with which one correlates php function calls to javascript function calls in xajax is well worth the reduced functionality in many cases. For example, xajax doesn’t (as far as I’ve investigated, at least) provide elegant drag and drop functionality. But for simple updating of content upon request completion, it handles the basics without forcing you to write your own javascript callback functions (which can be a pain and which can make your source code look really nasty). Here’s a quick primer on how it works:

  • Write php functions to do server-side data manipulation and to use xajax calls to manipulate the page upon request return.
  • Register these functions in your php script. This gets the correlating javascript functions added to the javascript include.
  • Include the javascript in your source, tell xajax to process requests, etc. (all this is outlined in detail at the xajax site).
  • Add the javascript function calls to your source. If your php function was named “delete_member,” your javascript call is named “xajax_delete_member.”

That’s it. Arguments passed to the js function are passed straight through to the php function, and you have to do no special mumbo jumbo to make the thing work. The out-of-the-box functionality is pretty simple when compared to the libraries available at, for example, script.aculo.us, but xajax is reportedly extensible, so you’re certainly not limited to what you get out of the box.

Now, on to why this is ideal for Drupal development. The more I use Drupal, the more appealing I find its API. It’s pretty simple to extend the software. For example, I recently wanted to add a little snippet of code next to the title of any sort of node. Lucky for me, the Drupal developers thought to include a nodeapi hook, which lets you do just what I wanted to do. For all its niceness in some areas, though, it’s painful to do some things in Drupal. Adding global ajax support seemed a likely candidate for such pain because Drupal’s got menu hooks, a permissions system, etc., that I thought might make for tedious hacking.

I started by simply including the xajax code in one module I was working on (as proposed here), but when I wanted to include the same sort of functionality in another module, I started getting collisions. The library code was called more than once, so I found myself getting php errors. This prompted me to try to find a way to make xajax work globally across a Drupal install. And it turned out to be dead simple. I simply included the xajax library as in the example linked to above. If the node is enabled, this adds the javascript include to the template globally. Then I added some code that checks all installed Drupal modules for a function named MODULE_xajax_init (where “MODULE” is the module name). This init function for relevant modules performs the function registration calls. The routine for adding ajax support to any module in Drupal, then, now goes as follows:

  • Install and enable xajax.module.
  • For the module you’re writing, define MODULE_xajax_init and include function registration calls.
  • Within your module, define the php functions you’ve registered.
  • Add javascript calls to your UI.

That’s it. Voila. Magic, easy ajax support in Drupal.

Xajax is especially suited to Drupal because it handles everything using simple php code. There’s no writing of nasty javascript callbacks and embedding those into your templates or, worse, into your module itself. The appeal of using this xajax module is that it works globally across your Drupal install with no “function already defined” type errors and, well, it’s done and it’s free. You can download the module here.