Archive for the ‘Tech’ Category

Knoxville meetup confirmed

July 25th, 2006 by daryl

It’s confirmed: Knoxville’s second Flock meetup will take place at 7:00 p.m. on Wednesday (tomorrow) at the Barnes & Noble on Kingston Pike (yes, we do know about books here in the south). We’ll gather in the cafe area. We’re a small group so far (but bigger than last time), and anybody in the area is welcome to stop by. I’ve got some nifty buttons to give out and, FedEx willing, I’ll have one or two tee-shirts that people can fight over. Other than that, we’ll just talk Flock. I hope to have a chance to give an overview of things to come and to answer any questions I’m able to about where we are with the software. Admittedly, since I’m on the web end of things now rather than the client side, my knowledge on that front is more limited than in the past. In any case, it should be a good event. Naturally, unless it’s a real snoozer, I’ll report on how it went. Consider this an invitation to other community members to hold meetups and to do status reports afterward. You’ll have to check this with Community Ambassador Will Pate, but I gather we’re getting much closer now to being ready for spread-like campaigns, and meetups seem to me like as good a way as any to participate.

Knoxville Meetup

July 19th, 2006 by daryl

Back in March, I more or less presided over a small Flock meetup in Knoxville. We’ve come a long way since March, and given the recent releases, I thought it might be a good time to hold another meetup event, this time with hopefully a slightly broader reach. Including myself, I can count on four participants this time and may be able to garner a fifth. If I break six, I’ll be pretty happy; Knoxville isn’t exactly browseropolis, you know. If you happen to be in the Knoxville area and are interested in meeting some other Flock users or just want to find out more, please let me know by email (daryl at flock dot com), and I’ll fill you in on the details as I firm up plans. Tentatively, I’m looking at finding a book store or coffee shop with wifi on Wednesday or Thursday evening next week.

There’s no set agenda, but I imagine we’ll talk some about where the browser’s been, where it’s come, and where it’s going. I fully anticipate the airing of some beefs with the browser, and I hope we’ve also given reason for some kudos to be awarded as well.

Quick and dirty apache logfile analysis

June 28th, 2006 by daryl

Dilemma: I had a bunch of rotated apache log files that I wanted to check traffic patterns in to see if some link changes I had made to a site were affecting traffic. Specifically, for the domain in question, I had tried to route requests to certain urls over to another domain by changing links in the html, and I wanted to see if it was actually impacting traffic. So I wrote the following little bash script to iterate over a set of log files and print the date and the line count for specified search strings.

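#!/bin/bash
# Print the number of matching GET requests in each rotated log file.

if [ -z "$1" ]
then
    echo "Usage: $0 search_string [log_file_prefix] [log_directory]"
    echo "log_file_prefix defaults to 'access_log.'"
    echo "log_directory defaults to '.'"
    exit 1
fi

SEARCH=$1
# Positional parameters can't be assigned to, so use named variables with defaults.
PREFIX=${2:-access_log.}
DIR=${3:-.}

for FILE in "$DIR/$PREFIX"*
do
    # Skip the unexpanded pattern if no files match.
    [ -e "$FILE" ] || continue
    # Everything after the prefix is the rotation timestamp (seconds since the epoch).
    EPOCH=${FILE##*"$PREFIX"}
    # strftime is a gawk extension; other awks may not provide it.
    DATE=$(echo "$EPOCH" | awk '{print strftime("%c", $1)}')
    COUNT=$(grep -c "GET $SEARCH" "$FILE")
    echo "Looking for $SEARCH in log for $DATE: $COUNT"
    echo ""
done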

So say I execute the command as follows: “./log_parse /about access_log_www. www”. It would scan for requests beginning with “/about” in all log files whose names begin with “access_log_www.” in the directory “www”. The script assumes that rotated log files are suffixed with a timestamp, and it writes the time based on that timestamp. Output looks something like this:

[root@www logs]# ./log_parse /about access_log_www. www
Looking for /about in log for Thu 08 Jun 2006 07:00:00 PM CDT: 726

Looking for /about in log for Fri 09 Jun 2006 07:00:00 PM CDT: 681

Looking for /about in log for Sat 10 Jun 2006 07:00:00 PM CDT: 28

It’s certainly not a full-service solution for log analysis, but it makes one-off checks of traffic patterns over time pretty easy.
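The timestamp handling is the least obvious part of the script, so here is a minimal sketch of it in isolation. The helper names (epoch_from_name, format_epoch) and the sample file name are made up for illustration, and the date call is a swapped-in alternative to the awk strftime approach, assuming either GNU or BSD date is available.

```shell
#!/bin/sh
# Hypothetical helpers; the script in the post does this inline.

# Strip everything up to and including the prefix, leaving the epoch suffix.
epoch_from_name() {
    file=$1
    prefix=$2
    printf '%s\n' "${file##*"$prefix"}"
}

# Format an epoch as a UTC timestamp.
# GNU date accepts -d @SECONDS; BSD date accepts -r SECONDS.
format_epoch() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S' 2>/dev/null ||
        date -u -r "$1" '+%Y-%m-%d %H:%M:%S'
}

# Sample rotated log name (illustrative).
epoch=$(epoch_from_name "www/access_log_www.1149811200" "access_log_www.")
echo "$epoch -> $(format_epoch "$epoch")"
```

Printing in UTC sidesteps the server's local timezone, which is why the sample output above shows 7:00 PM CDT for logs that rotate at midnight UTC.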

Flock

June 23rd, 2006 by daryl

So, people who know me know I work for Flock, which has something to do with computers. We actually produce a web browser built on the platform that Firefox is built on. When we first started talking big a little less than a year ago, we sang out about how we were going to revolutionize browsing. Talking big got us in a little trouble because we were over-hyped too early. We’ve since changed our message a bit. Rather than being the saviors of the internets, we’re just trying to make the web a little easier to use for some people. If you like to blog, share your bookmarks, and use photo services like flickr and photobucket, maybe Flock is something you’d be interested in. Otherwise, it’s probably not. Here’s some of what I personally like about our product; these things are what would make me think about using Flock if I didn’t feel bound to because I’m an employee.

Built-in news/feeds integration. Most blogs and many news sites you see nowadays display little orange radio-wave icons and advertise things like “RSS 2.0” and “Atom.” These are just techie ways of saying “here’s a list of recent articles that your software may be able to keep track of.” The idea is that you can plug these feeds into your feed reading software, which will notify you of updates so that you don’t have to go out each day and click through to the dozen sites you like to keep up with. There have long been Firefox extensions and standalone applications that would consume these feeds, but I’ve never liked any of them. In the case of the extensions, the interface has always bugged me. In the case of applications, well, I really don’t need another application open on my desktop. Flock has feed-reading built in. I subscribe to the feeds for the blogs I like to read. When one of them has new content, a button in my browser lights up to let me know. So no more going out and visiting the site to see what’s new. When I click the news button, a sidebar opens up that gives me a tree view of my feeds (sort of like an email view, with folders and items on the side), which I can scan easily for updated content. This feature really makes my life easier and is probably my favorite thing in the browser.

Next, there’s the star button. It’s a little button connected to the urlbar. If you’re at a site that you like and would like to save for later, just click the star button. It’ll turn orange, and when you later visit sites you’ve starred, it’ll turn orange to let you know you’ve already starred them. If you’re adventurous, you can make an advanced options box appear that’ll let you apply tags (just think of them as ad hoc labels that are easier to use than nested folders) to the things you star. We call the things you star “favorites.” And there’s a favorites manager that lets you see a list view of the things you’ve starred. You can search through them by tag, sort by recent, organize them into collections (which you can make appear in your toolbar), and go back to their advanced options to change settings. What’s more, if you’re so inclined, you can set your favorites to be synced up with an online bookmarks service like delicious or Shadows so that your favorites are available to you anywhere, any time. I like our favorites system because it lets me do some level of organization of my links without making me manipulate a bunch of nested folders; I can quickly apply multiple labels that let me get to my links later in several ways without having to file them redundantly in multiple folders.

Favorites interact with the search box, which widget can be very useful but is something I often overlook. If you’ve used Firefox before, you’re already familiar with the search box embedded in the browser interface. You can select from among several search engines to direct your searches to, and you can easily perform searches without first having to navigate to the search page itself. I have muscle memory that makes me hit my home button and search straight from Google, so the search box hasn’t historically been something I’ve used (hence my often overlooking it now). But there’s one thing that makes it very handy in Flock: As you type search strings, it searches through your browser history and your favorites and populates a flyout with relevant searches. In other words, it’s a quick interface to search for relevance among things you’ve already flagged as relevant by visiting or starring. And when I say it searches through your browser history, I don’t mean it looks for strings in the url. It does a full text search on pages you’ve visited. So even if you didn’t star that page about platypuses that you found so interesting, if you want to get back to it without combing through thousands of results from Google, you can just start typing “platypus” in the search box, and sites you’ve visited that contained the word will magically appear in the flyout.

Next, there’s photo integration. We have this thing called a topbar that’s a little slice of screen that pops open between the browser buttons and the content window. It’s a pretty good size for displaying thumbnails of photos, and it’s got a neat little slidey interaction that makes it easy to zoom through photos it contains. The photos currently can be set up to pull from accounts at flickr and photobucket, and you can use the topbar to view your own photos or those of other users. The topbar will also notify you by lighting up a browser button when your contacts post new photos, so there’s no more running out to click through your contacts’ pages to see what’s new — Flock just brings it all to you with no effort on your part. This functionality is of pretty limited use to me because I don’t follow photos very closely, and the ones I tend to care about usually get posted to kodakgallery, which Flock doesn’t yet support and which sends me email invitations to see photos anyway. What is useful to me to the extent that I use it is the fact that Flock comes with a photo uploader that makes it pretty easy for me to send photos to flickr. As a Linux user, I’ve always had to do it the painful way, using file upload widgets and navigating through files by name to upload. This is definitely not a good way to batch upload photos. Now I can just open the uploader, browse files or drag files to the window, rotate and crop, and upload. If I were inclined to use photo services very frequently, this would be a great tool for me, especially since most of the tools out there that simplify photo uploading don’t work in Linux. Rumor has it that you can also now drag photos into textareas in the browser and they’ll get uploaded and posted automatically, but I haven’t tested that functionality out yet and so can’t verify that it works.

The last two features that stand out to me are things that used to excite me but that now completely underwhelm me. The first is the blog editor. It’s built right into Flock and lets you edit and save posts locally before pushing to your blog service. One nice thing is that it’s got drag and drop goodness, and I’ll bet that by this time, if you drag photos into it, they get pushed to your photo hosting service. Generally, I’m about as happy editing blog posts in my blog’s user interface, and as far as blog editing usability goes, I prefer the extension produced by Performancing to our blog editor. This is really sort of a shame, as blog integration was one of the things we wanted to address best and first, and in my opinion, we’ve regressed a bit on that front. The other feature I used to like that I now find almost useless is what we’re calling web snippets. This used to be a little sidebar or floaty window that you could drag photos and text snippets to for later use in blog posts. We later added the ability to add notes to the web snippets, and I found this useful as sort of a blogging to-do list. But we crammed the tool down into a little area at the bottom of the screen and made it pretty much unusable because of the way it uses its real estate. In theory, it helps you assemble things to put into blog posts, but in reality, for me, at least, it sits down in the status bar unused. These two features were very appealing to me when we started working on them around a year ago, but they’re now the least useful things we’ve added to the browser.

Which is ok, because, as I think I’ve indicated, we’ve added other useful things. The news functionality alone makes me want to use Flock. I think we’ve also produced a beautiful browser. As much credit as Firefox is due for helping to bring choice back to the web browser industry, the browser’s default theme is heinous. Any time I run Firefox now to test something, I’m shocked at how clunky the buttons and toolbars look, and I find it a little oppressive. Using Flock just feels smoother to me because our interface looks smoother. This is of course purely a matter of taste, and it should be noted that both Flock and Firefox can be re-skinned to look entirely different and better (or worse). I think that for the time being, Flock provides a better experience out of the box, though.

So, that’s what I’ve been doing for the last 18 months. I do primarily web programming and system administration these days, though I’ve been involved at some point in pretty much every part of the business. Last week, we released our first public beta (we’ve been in developer alpha for months), and feedback so far is pretty good. We’re still recommending that people try Flock out but not be disappointed if we’re not quite ready for prime time or if the browser’s a little buggy. If any of the features I’ve described sound useful to you, I invite you to try the browser out. You can download it here. If you’re not game, no hard feelings; as I said at the beginning of this post, Flock isn’t for everybody.

Would you like to make your life on the web easier?

May 15th, 2006 by daryl

So, let’s just pretend for a moment that there was a link you wanted to send to someone that had an insufferably long URL. So you try to paste it into your email program, but the person you were trying to send it to couldn’t view it because their stupid email program cut the link off. And what ensues is a ridiculous six-email-long conversation in which you say things like “no, backspace but then add a percent sign, and then copy (yeah, control C) and then paste (I think it’s control P)... oh, crap, you did a backspace?” And so imagine that you could just say “go to http://shrtlnk.com/daryl/beach” to show off your beach photos otherwise available at an oppressively long url. Would that be useful?

There’s at least one site that provides a similar service (tinyurl.com). Unfortunately, it shackles you to its random url scheme. That is, instead of the short and semantically useful http://shrtlnk.com/daryl/beach (or, if you’re logged in, http://shrtlnk.com/beach), it might force you to use something like http://tinyurl.com/g8lrz. But something like http://shrtlnk.com/daryl/beach might be more useful to you and your friends. And something like http://shrtlnk.com/daryl might be useful as a way of organizing the things that your friend daryl is linking to. Is this something you might find useful?

If you were a Firefox or a Flock user and it were easy for you to click a button, apply a label, and passively share a long-url’d link with a friend, would you do it?

Backing up Windows Documents on Ubuntu Linux

May 3rd, 2006 by daryl

I wrote earlier about getting a hard drive added to one of my Linux boxes so that I could do backups. I’ve been meaning to get backups in order for many years, but it’s been more urgent since about 10 months ago (yeah yeah, must not’ve been too urgent if it waited that long), when I lost all the data on my laptop and had to recover what I could from a backup I had manually made some months before. More precious than most of the data I might have lost in that case are the photos and videos we’ve accumulated of Lennie over the past two years. When I had to send Mleeka’s laptop in to the shop a few months ago, I made an effort at dumping her files onto a Linux fileshare, but it wasn’t terribly successful, it wasn’t easy, and it wasn’t a long-term, self-sustaining solution. My project tonight was to put such a solution in place.

Backing up from Linux to Linux is simple, and I’ve had that going for several days. You just run rsync as a daemon on the fileshare, create a module, put the client’s public ssh key on the server, and run a cron job on the client that pushes to the module. All I had to do today to make this work on the new disk was move the files from one disk on the server to the other, edit the module in /etc/rsyncd.conf to reflect the new path, and restart rsync. Nothing to it.

Backing up the Windows box has been somewhat more challenging. I spent a few hours wrangling with a couple of different options for running rsync clients and daemons on the Windows laptop. I kept running into problems with weird Windows path issues, Windows firewall issues, other connection issues, permissions issues, and so on. It was very frustrating. Finally, I read a suggestion that one just mount the Windows directory on the Linux fileshare and then rsync locally. Simple and brilliant!

Ubuntu Linux appears not to ship with smbfs enabled, so after getting errors with the mount, I issued “apt-get install smbfs” to get that module. Then I did the following to create a mountpoint and a share:

mkdir /mnt/laptop_mleeka
mount -t smbfs //server/sharename /mnt/laptop_mleeka

This is of course after right-clicking properties on the My Documents folder on the Windows laptop and sharing the folder as “sharename.”

Once that was mounted, I issued the following:

rsync -a /mnt/laptop_mleeka/* houston@localhost::mleeka_bak

In this case, “mleeka_bak” at the end of the command is an rsync module I set up that manages permissions and file locations for the sync. The initial backup is running as I type. I understand that samba mounts are pretty slow, so it’s taking a while to haul the 9 GB of data across even my local network. Luckily, rsync is incremental, so in the future, it’ll sync only differences between the file systems.

All that remains is to add the mount command to /etc/fstab so that the server tries to remount the drive in the unlikely event of a reboot and to toss the rsync command itself into a cron job so that the backup takes place nightly.
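If the laptop is off or the share fails to mount, the nightly job would run against an empty mount point. Here is a rough sketch of a guard for that case. The function names, the script path in the crontab comment, and the 2 a.m. schedule are all assumptions for illustration, not what is actually running on my server.

```shell
#!/bin/sh
# Sketch only: share_has_files and run_backup are illustrative names.

# Succeeds only if the directory exists and contains at least one entry,
# a rough stand-in for "the Windows share is actually mounted".
share_has_files() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Run the given command only when the mount point looks populated.
run_backup() {
    mount_point=$1
    shift
    if share_has_files "$mount_point"; then
        "$@"
    else
        echo "skipping backup: $mount_point is empty or missing" >&2
        return 1
    fi
}

# Intended use (commented out so the sketch has no side effects):
# run_backup /mnt/laptop_mleeka rsync -a /mnt/laptop_mleeka/* houston@localhost::mleeka_bak
#
# A nightly crontab entry for a script containing the above might look like:
# 0 2 * * * /usr/local/bin/backup_mleeka.sh
```

The check is deliberately crude. It can't tell a mounted but empty share from an unmounted one, but for a My Documents folder full of photos that distinction shouldn't matter.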

Adding a hard drive to a Ubuntu Linux box

May 3rd, 2006 by daryl

Dilemma: I’ve got an old desktop system with less than 20 GB of disk space. On that system, I wanted to store backups of my laptop and our photos of Lennie. I also wanted to use it as a machine that will allow me to build software from time to time (which can take up a few GB of disk space). Without even starting backing up the many GBs of photos yet, I’ve nearly filled up the 20GB, and that’s after deleting a bunch of stuff. I happen to have another desktop system that has been on its last legs for a while, but it’s got a 65 GB hard drive.

Objective: Put the 65 GB hard drive in the working desktop system and make it operational.

Dilemma 2: I don’t really know anything about hardware. I forged ahead anyway, though, and I got it working. Here’s how.

Disclaimer: I did this on a Ubuntu Linux box. I have no idea how it’d go on any non-Linux box (so don’t ask). If you try these steps and you or your computer blow up or become otherwise harmed or incapacitated, note well that by reading further, you’re asserting that you’ve proceeded at your own risk. :)

  1. Remove hard drive from dying computer. (This is pretty straightforward and doesn’t merit more detailed instructions.)
  2. Read the back edge of the hard drive (the part with the pins) to figure out how to make it a slave so that the computer doesn’t get confused when it tries to boot up with two drives. This usually involves pulling off a little plastic tab that slips over two pins and sliding it back down onto two other pins. I learned this the hard way by installing the drive, having the computer fail to boot, and remembering that something like this had to be done, whereupon I consulted the back edge of the hard drive.
  3. Install the hard drive in a free bay in the working computer. This is pretty straightforward and, like initial removal, will require a screwdriver.
  4. Cross your fingers that you’ve got the necessary cables available in your working computer’s case. If you don’t, you should probably consult some other guide. I was lucky. One cable is a ribbon cable with a plastic component that will receive the long array of pins on the back edge of the disk drive. The other is a little bundle of colored cables capped by a white plastic piece. It’s got four or five holes in the back for receiving pins. These plug into the obvious spots on the back edge of the disk drive.
  5. Now boot up. The computer should boot as it did previously.
  6. Get a terminal window open. If you don’t know how to do this, you probably shouldn’t go any further.
  7. Find out what the disk is named. The following command will give you some output about your disks and partitions. You should look for one that’s the same size as the disk you installed and one that’s not listed in the output of the command “df”. If you don’t see one, something went wrong with the hardware install and I probably can’t help you.
       sudo fdisk -l
  8. Make a directory to mount the drive on. I used /bak:
       sudo mkdir /bak
  9. Now give your user permission on that directory. You might have to do something more involved if you need to allow more than one user to access the disk. This works for me, though.
       sudo chown -R houston /bak
  10. Now partition the disk using fdisk. This is weird if you don’t know what you’re doing. I’ll walk you through my steps, which may or may not be the best to have taken.
    1. First do the following command, which takes you into a little interactive prompt (where /dev/hdb is the disk; note that it differs from /dev/hdb1, which I guess represents the partition on that disk; also note that you should substitute your disk identifier as above):
         sudo fdisk /dev/hdb
    2. At the prompt, enter “d” and then “1” to delete the first partition on the disk. If there are other partitions, delete them as well.
    3. Then press “w” to write the changes to the disk and exit. This’ll probably take a few seconds or minutes.
    4. Next do sudo fdisk /dev/hdb again (substituting your disk identifier)
    5. Press “n” to add a partition. Chances are that you want just one, in which case you can just accept the defaults in the ensuing prompts.
    6. Then press “w” to write the changes to disk and exit.
  11. Now format the new disk as an ext3 disk. Note that the /dev/hdb1 below is the same as what I discovered in step 7 to be my new disk. Substitute your value there.
       sudo mkfs -t ext3 /dev/hdb1
  12. Ok, now you’ve got an ext3-formatted disk with one partition. All that’s left is to mount it so that it can be used. To do this, edit /etc/fstab and add the following line to the end (again, substituting the drive identifier and directory the disk should be mounted on in columns one and two):
       /dev/hdb1 /bak ext3 defaults 0 0
  13. Finally, typing the following command should mount the drive so that you can begin writing to it (substitute your path if needed):
       sudo mount /bak
  14. Since we added the entry to /etc/fstab, the drive should come back mounted on reboot as well.
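Since deleting partitions and formatting are destructive, it may be worth scripting the “not listed in df” check from step 7 before going anywhere near fdisk. This is a minimal sketch. The function name device_in_df and the sample df output are invented for illustration, and /dev/hda and /dev/hdb are just the device names from my machine.

```shell
#!/bin/sh
# Reads df-style output on stdin. Succeeds if any filesystem name starts with
# the given device, so /dev/hdb also matches its partition /dev/hdb1.
device_in_df() {
    awk -v dev="$1" 'index($1, dev) == 1 { found = 1 } END { exit !found }'
}

# Made-up df output with only the original disk mounted.
sample='Filesystem 1K-blocks Used Available Use% Mounted on
/dev/hda1 19228276 17000000 1251444 94% /'

if printf '%s\n' "$sample" | device_in_df /dev/hdb; then
    echo "/dev/hdb is in use; do not partition or format it"
else
    echo "/dev/hdb is not mounted"
fi

# Against the real system, pipe df in directly:
# df | device_in_df /dev/hdb
```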

News, Feeds, RSS

May 2nd, 2006 by daryl

This is hopefully a precursor to something I’ll eventually blog at my work blog. I wanted to solicit feedback from a less technical crowd than the one that tends to read that blog, though to be honest, of the five of you who read me on a semi-regular basis, a solid two are at least as technically-inclined as I am, so this isn’t the best sampling either. Anyway, I’d love to have some responses to a couple of questions I’m about to pose. Email me or add a comment to this post. Respond without Googling around to try to find an answer. If you’re a geek, please don’t add any spoilers to the comments for those to whom this stuff may not be second nature. So, the questions:

What do you think “news,” “feeds,” or “RSS” are in the context of web content? If you have at least a vague idea of what these are, do you find the idea useful?

Thanks for any feedback.

Multi-group, multi-join, no-union, no-view mysql query

April 24th, 2006 by daryl

Chances are that unless you googled to find this post, both the title and the post will be pretty much nonsense. If so, don’t continue reading on my account. There are no hidden gems unless you’re looking for an answer to a mysql question.

So, my dilemma was that in a Drupal module I’m writing, I was asked to provide data sorting that requires groupings on multiple columns in different tables. Three tables are involved. Table “node” holds content; “votes” holds ratings users have given nodes; and “extension_downloads” holds download counts for nodes of a given type. The challenge was to get average votes, vote counts, download counts, and node information for all nodes of a given type, sortable on each of these columns, several of which are aggregates that require grouping.

There are several possible approaches. The least favorite to my mind was post-processing. That is, I could have gotten all node info, all download info, and all voting info, tossed it into a multi-dimensional array and sorted on various keys in the array. Downsides to this approach include the fact that I have to retrieve all potentially relevant data from the database (rather than just the subset I need), and then there’s processing time to do the sorting. I can see this scaling terribly. It also keeps me from using the nice hooks Drupal has in place for paged queries, sorting, etc., and that just seems like flagrant waste.

Another approach is trying to do some complex multi-join table, but that turned out to have weird results. You have to use a left outer join in order to get nodes that have neither votes nor downloads, but grouping on the two satellite tables as you must in order to get counts and sums out of them throws the numbers off. If you group by multiple columns, you get redundant data (ie, nodes listed twice, once for the match in the downloads table and once for the match in the votes table). If you group by just the common key, strange things happen as well. In some cases, something like an unexpected multiplication (across columns!) seemed to be happening as a result of grouping. I never quite figured that one out. In any case, I couldn’t find a single multi-join query that would work.
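A plausible explanation for the multiplication, though nothing here confirms it against the real tables, is join fan-out: when both satellite tables are joined to node on the same key, every vote row for a node gets paired with every download row for that node, so sums and counts are inflated by the size of the other table’s match. The sketch below imitates that with the standard join utility on two tiny made-up files rather than with mysql itself.

```shell
#!/bin/sh
# Illustrative data only: node 1 has two votes and three downloads.
tmp=$(mktemp -d)
printf '1 5\n1 3\n' > "$tmp/votes"            # columns: nid vote
printf '1 a\n1 b\n1 c\n' > "$tmp/downloads"   # columns: nid download_id

# join(1) pairs every votes line with every downloads line sharing the key,
# much as joining both tables to node on nid in one query would.
joined_rows=$(join "$tmp/votes" "$tmp/downloads" | wc -l | tr -d ' ')
vote_sum=$(join "$tmp/votes" "$tmp/downloads" | awk '{ s += $2 } END { print s }')

echo "joined rows: $joined_rows"    # 2 votes x 3 downloads
echo "summed votes: $vote_sum"      # each vote is counted once per download row
rm -rf "$tmp"
```

The true vote total for the node is 8, but summing over the joined rows gives three times that, which would look like a multiplication across columns.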

Yet another approach is to try to use a union. A union lets you daisy-chain the results of multiple queries together. Column names and types are taken from the first query in the union, and as long as there are no major incompatibilities (e.g. different numbers of columns in subsequent queries), the results of the queries are dumped out as one result set. I was hoping I could use a union to combine output from the download count query and the vote calculation query, but a union collapses only identical rows. If I selected NULL for the download count in the voting query and NULL for the voting calculation columns in the download count query, I did get back all the results I wanted, but there were two rows of data per content node, because the two rows for a given node differed (one carried the vote figures, the other the download count) and so were not merged. This would require further post-processing.

And finally, I considered using either temporary tables (select both result sets into temporary tables and then join the two to the nodes table to select the results I wanted) or views. Temporary tables seem messy and inefficient, so I was reluctant to use those, and views require a more recent version of mysql than I’m using, so that was out.

Now on to the solution. I had heretofore used sub-selects only to get scalar lists of ids to select from: “SELECT foo from bar WHERE id IN (SELECT id from other_table)”. It turns out that you can get whole result sets from sub-selects. What you’re doing in this case is in essence to select some results and define them as a table to select from within your wrapping query. (Really, I guess it’s a temporary table scenario, though it seems less hacky and possibly somehow more efficient/optimized than issuing CREATE, INSERT, SELECT, and DROP statements per request to get some data.) Here’s what I arrived at to solve my problem:

SELECT node.*, votes.vote, votes.vote_count, votes.vote_ratio,
downloads.downloads
FROM node,

(SELECT node.nid, SUM(votes.vote) vote, COUNT(votes.vote) vote_count, (SUM(votes.vote) / COUNT(votes.vote)) vote_ratio
FROM node
LEFT OUTER JOIN votes
ON votes.content_id = node.nid
WHERE node.type = "extension"
GROUP BY node.nid) votes,

(SELECT node.nid, COUNT(extension_downloads.eid) downloads
FROM node
LEFT OUTER JOIN extension_downloads
ON extension_downloads.eid = node.nid
WHERE node.type = "extension"
GROUP BY node.nid) downloads

WHERE votes.nid = downloads.nid
AND node.type = "extension"
AND node.nid = votes.nid;

I get everything from the node table. Then I do my first sub-query, which left outer joins the votes table on the node table to get all ids and relevant vote stats. I alias that sub-select as “votes” so that in the wrapping query, I can refer to its columns using “votes” as a prefix. Next I do a similar query on the downloads table. Finally, I constrain my wrapping query by node type and id. Since I’m doing left outer joins in my sub-queries, the row count for all three “tables” I’m selecting from is the same, and a simple nid = nid correspondence makes my data line up.

I may look into improving this further. It’s possible that I can reduce my data transfer and query burden by avoiding the left outer joins in the sub-queries and doing a single left outer join onto the two virtual tables in the wrapping query. I’m not sure whether aliasing the sub-queries will allow this or not.

Update: I was able to tweak the query as I speculated I might be able to above. Here’s the new query:

SELECT node.*, votes.vote, votes.vote_count, votes.vote_ratio, downloads.downloads
FROM node

LEFT OUTER JOIN
    (SELECT votes.content_id nid, SUM(votes.vote) vote, COUNT(votes.vote) vote_count, (SUM(votes.vote) / COUNT(votes.vote)) vote_ratio
    FROM node, votes
    WHERE node.nid = votes.content_id AND node.type = "extension"
    GROUP BY votes.content_id) votes
ON votes.nid = node.nid

LEFT OUTER JOIN
    (SELECT extension_downloads.eid nid, COUNT(extension_downloads.eid) downloads
    FROM extension_downloads
    GROUP BY eid) downloads
ON downloads.nid = node.nid

WHERE node.type = "extension";

What’s going on now is that I’m joining on the results of the sub-queries rather than within the sub-queries. If I join within, then each sub-query returns as many rows as there are relevant nodes. If I join outside the sub-queries, each sub-query returns a number of results equal to the subset of nodes for which there is relevant data. Say I’ve got 500 nodes and that 300 of them have download counts and 250 of them have been voted on. In the original query, each sub-query returned 500 rows of data. In the new query, the sub-queries return 300 and 250 rows, which are then merged back into the 500 rows selected from the node table. The change stands to provide a significantly more efficient query that will scale better as more content nodes are added over time. What was really at question at the end of my original posting was whether or not aliasing the sub-query and using that alias in the join would work (I’ve had issues with aliasing in joins before), and it did.

Inviting JSON to the Table

March 7th, 2006 by daryl

I’m doing some preliminary work on a project for which it’s been suggested that I consider using JSON rather than XML as a data transport. “JSON?” you ask. “What’s that?” It stands for JavaScript Object Notation, and for those of us who’ve spent a lot of time writing javascript, it’ll look very familiar. It’s a subset of the javascript language and can be described as the convention whereby one represents object members as name/value pairs. In short, it’s a form of serialization native to javascript and is therefore understood by all modern browsers out of the box and by many other programming languages either natively or by simple extension. A javascript function can eval a JSON text string with no additional parsing needed and can then use the decoded values directly. This can be beneficial to web applications in at least two potentially notable ways:

  • It eliminates the need to parse a verbose XML document into an object and then perform operations on the object.
  • The format can be (though isn’t necessarily) less verbose than XML.

Transfer time and processing overhead can therefore be optimized when using JSON in some circumstances. Furthermore, for some uses, JSON might actually save programming effort required on the server side to generate XML from objects or on the client side going in the other direction.
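To make that concrete, here's a minimal sketch of decoding a JSON string in javascript. The sample text and variable names are invented for illustration. I'm assuming a JSON.parse function is available (current browsers ship one natively); where it isn't, eval on trusted input does the same job.

```javascript
// Hypothetical JSON text, as it might arrive from a server.
var text = '{"error": "0", "message": "success", ' +
  '"payload": [{"count": 0, "name": "a"}, {"count": 1, "name": "b"}]}';

// Decode the string into a plain object. JSON.parse is assumed to exist;
// eval('(' + text + ')') is the older equivalent for trusted input.
var response = JSON.parse(text);

// Members are ordinary properties, so there is no document tree to walk.
var firstName = response.payload[0].name;
var rowCount = response.payload.length;
console.log(response.message, rowCount, firstName);
```

The point is just that the decoded value is immediately usable as an object, with no intermediate parsing step of your own.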

Those advantages notwithstanding, I was originally hesitant to give JSON more than a passing glance. For most web applications that feature the sort of functionality I had in mind (including, as far as I was aware at the time, the one I’m doing R&D for), existing AJAX toolkits fit the bill, and I was inclined to use an existing AJAX toolkit rather than to implement JSON for the sheer novelty of doing so. Consider an editable grid table, for example. You have a text field with an onchange event. On change, you send a small piece of data to the server and you get a small piece of data back that tells the client how to provide UI feedback. None of the three benefits of JSON I mentioned above really apply here, as the data in both directions is small, requires almost no processing, and need not be an otherwise usable object. It’s text out, text in, and minor DOM manipulation. In such a case, JSON provides no real advantage, and you might as well go with a standard AJAX toolkit.

A colleague working on the project with me pointed out some other possible use cases, however, that might render JSON worth further investigation. For example, if the data comes down as an object, it can be sorted and have calculations performed on it more readily on the client side without a round-trip to the server and back per operation. There’s something very appealing to me about this. So I’ll be doing more diligence on JSON.
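As a rough sketch of what my colleague had in mind, here's how a decoded data set might be sorted and totaled entirely on the client. The rows, the field names, and the helper functions (sortBy, sumOf) are hypothetical; this only handles numeric fields.

```javascript
// Hypothetical rows, shaped like the payload of a decoded JSON response.
var rows = [
  { count: 2, price: 4.5 },
  { count: 0, price: 10 },
  { count: 1, price: 7.25 }
];

// Sort a copy by a numeric field, leaving the original order intact.
function sortBy(data, field) {
  return data.slice().sort(function (a, b) {
    return a[field] - b[field];
  });
}

// Total a numeric field across all rows.
function sumOf(data, field) {
  var total = 0;
  for (var i = 0; i < data.length; i++) {
    total += data[i][field];
  }
  return total;
}

var byPrice = sortBy(rows, 'price');
var totalPrice = sumOf(rows, 'price');
```

Neither operation touches the server, which is the appeal: once the object is on the client, re-sorting or re-totaling costs nothing in network time.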

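As my first foray into coding with JSON, I wanted to test the example my colleague brought up. Doing so required me to grab a few libraries, and I haven’t packaged it all up nicely, but it’d be reasonably easy to assemble these things and test this out for yourself if you’re interested. Here’s what you need to grab:

  • The PEAR JSON class (Services_JSON), used by the PHP script below.
  • A sortable-table script that provides sortables_init(), saved as sortable.js.
  • An XMLHttpRequest wrapper that provides oyXMLRPCProvider, saved as xmlrpc.js.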

So, in a web sandbox, save the PEAR class as JSON.php and create a file named “process.php” with the following contents:

<?php

include("JSON.php");

$rows = array();
$cols = array();

$colcount=20;

for($i=0; $i<$colcount; $i++){
array_push($cols, "col $i");
}

for($i=0; $i<100; $i++){
$row=array("count"      => $i);
for($j=0; $j<$colcount; $j++){
$row[$cols[$j] ] = substr(md5(microtime() . $cols[$j]),0,8);
}
array_push($rows, $row);
}

$response=array(
"error"         => "0",
"message"       => "success",
"payload"       => $rows
);

$json = new Services_JSON();
$output = $json->encode($response);
print($output);

?>

This script generates 100 rows of 20 columns of junk data and returns it as a JSON object. In a real-world application, this would presumably be a data set returned from a database. The XMLHttpRequest issued from your client calls this script and handles the data. Now on to that part of the code. Create a file named index.html and populate it as follows:

<html>
<head>
<title>JSON Demo</title>
<script type="text/javascript" src="sortable.js"></script>
<script type="text/javascript" src="json.js"></script>
<script type="text/javascript" src="xmlrpc.js"></script>
<style type="text/css">
/* Sortable tables */
table.sortable a.sortheader {
background-color:#eee;
color:#666666;
font-weight: bold;
text-decoration: none;
display: block;
}
table.sortable span.sortarrow {
color: black;
text-decoration: none;
}
</style>
</head>
<body>
<div id="container">
<input type="text" id="thefield" name="thefield" value="" />
<input type="button" id="thebutton" name="thebutton" value="The Button" onclick="json_request(document.getElementById('thefield').value)" />
</div>
</body>
</html>

Note the javascript includes at the top, and be sure either to name the downloaded libraries to match or to change the src attributes in this file. Now create json.js and populate it as follows:

function json_request(txt){

var myOnComplete = function(responseText, responseXML){
var obj = eval('(' + responseText + ')');
var container=document.getElementById('container');
container.appendChild(make_table(obj.payload));
sortables_init();
}

var myOnLog = function(msg){
alert(msg);
}

var provider = new oyXMLRPCProvider();
provider.onComplete = myOnComplete;
provider.onError = myOnLog;

provider.submit("process.php?txt=" + encodeURIComponent(txt));
}

function make_table(data){
var table = document.createElement('table');
table.className='sortable';
var tbody = document.createElement('tbody');
for(var i=0; i<data.length; i++ ){
var tr = document.createElement('tr');
//Create headers on first iteration.
if(i==0){
for(var field in data[i]){
var th = document.createElement('th');
th.innerHTML=field;
tr.appendChild(th);
}
tbody.appendChild(tr);
//Be sure to start a new row.
var tr = document.createElement('tr');
}
for(var field in data[i]){
var td = document.createElement('td');
td.innerHTML=data[i][field];
td.className=field;
tr.appendChild(td);
}
tbody.appendChild(tr);
}
table.appendChild(tbody);
table.id="table_" + Math.floor ( Math.random ( ) * 100 );
return table;
}

If you get everything linked up correctly, the result should be that when you press the button on the main page, a JSON object is pulled down asynchronously from the server and appended to the page as a table of data. Each such table is independently sortable without round trips to the server and back (and without your having to write sorting validation code to prevent SQL injection attacks, etc.). Of course, this does degrade poorly for browsers in which javascript is disabled. In any case, I’ve modeled one bit of functionality that’s pretty painless to implement using JSON, and I suspect that further work in this direction will turn up even more interesting results.

For more information on JSON, be sure to hit the JSON site.
