Archive for July, 2013

Flurish

Of Databases and Character Encodings

Gimlet is old. Its roots date back to mid-2006, when it started life as a Ruby on Rails 1.1.6 program. Along the way, we’ve built up a bit of cruft here and there; last night, we cleaned a bunch of it out when we fixed the character encodings in our database. This paves the way for us to upgrade to newer and better tools.

Feeling like a nerd? Read on.

Hey what’s a character encoding?

Since the days of Braille, Morse code, and telegraphs, people have been using binary codes to represent letters and numbers. The idea is simple: as long as everyone agrees on the letters and numbers and their codes (say, ‘01000001’ means ‘A’), writing a program to turn those 1s and 0s into human-readable text is easy.
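For the curious, here is a minimal sketch of that idea in Python. The language and the helper name `decode_bits` are our picks for illustration; nothing in Gimlet looks like this.

```python
def decode_bits(bits: str) -> str:
    """Turn a string of 1s and 0s into the character it stands for."""
    code = int(bits, 2)  # read the digits as a base-2 number: '01000001' is 65
    return chr(code)     # look up character number 65, which is 'A'


print(decode_bits("01000001"))
```

Both sides only need to agree on the table that maps 65 to ‘A’. That agreed-upon table is the character encoding.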

For many years, the “standard” character encoding was called ASCII (or the closely related Latin-1) – which covered all the characters you see on an American keyboard plus a few extras. However! There’s no way to represent ב in Latin-1. Can’t do it. It’s not a Latin character. Because people really want to store documents in different languages, it was time for a new standard. After some fits and starts, most of the world settled on a standard called UTF-8. Instead of just being able to represent all the letters, numbers, and symbols used in American English, UTF-8 has enough capacity to represent all the letters, numbers, and symbols in every language written by humans.
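You can watch Latin-1 give up on ב in a couple of lines of Python. This is only a sketch using Python's built-in codecs; the variable names are ours.

```python
letter = "\u05d1"  # the Hebrew letter bet (ב), Unicode code point U+05D1

try:
    letter.encode("latin-1")
    fits_in_latin1 = True
except UnicodeEncodeError:
    # Latin-1 only has room for 256 characters, and this is not one of them.
    fits_in_latin1 = False

utf8_bytes = letter.encode("utf-8")  # UTF-8 spends two bytes on this letter

print(fits_in_latin1)
print(utf8_bytes)
```

Plain ASCII letters still take a single byte in UTF-8, which is part of why it was an easy standard to adopt.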

As you might imagine, that’s the encoding we’re using in Gimlet.

OK so what did we do last night?

As we mentioned, Gimlet is old. Old enough that UTF-8 hadn’t completely taken over the world when we started working on it. But, being forward-thinking, we created the database with UTF-8-encoded tables and served our web pages with a UTF-8 encoding. Everything looked great – until we upgraded the software that connected to our database.

Suddenly, where we expected to see Gimlét, we saw GimlÃ©t – and a variety of similar oddities. Upon closer investigation, it turned out that the old database connector had been talking to the database in Latin-1. We were taking UTF-8 data from the Web, sending it over a connection that labeled it as Latin-1, and storing it in UTF-8 tables, so every non-ASCII character ended up encoded twice. When we got it back out, the database connector kindly re-converted the data so it looked nice to everyone. The new database connector wasn’t having any of this funny business, so all non-ASCII characters looked broken to everyone. We asked our new database connector to deal in Latin-1, and the problem went away, though we were still storing “broken” data in our database. An uneasy stalemate was reached – until we went to upgrade Ruby.
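Here is a rough Python re-enactment of how the data got mangled. It assumes Python's `latin-1` codec is a fair stand-in for MySQL's latin1 character set; the two disagree on a handful of bytes, but not on the ones in this example.

```python
original = "Giml\u00e9t"  # "Gimlét"

# The web app hands over perfectly good UTF-8 bytes: é is 0xC3 0xA9.
utf8_bytes = original.encode("utf-8")

# The connection claims those bytes are Latin-1, so each byte is read
# as its own character: 0xC3 becomes 'Ã' and 0xA9 becomes '©'.
misread = utf8_bytes.decode("latin-1")

# The table is UTF-8, so the misread text is stored as UTF-8, and the
# single é now takes four bytes instead of two.
stored_bytes = misread.encode("utf-8")

print(misread)
print(len(utf8_bytes), len(stored_bytes))
```

Reading the stored bytes back over a Latin-1 connection reverses the last step, which is why nobody noticed until the connector changed.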

Ruby 1.9 brooked none of this silliness. It steadfastly refused to cooperate in our decode-recode scheme, dutifully keeping the data in UTF-8 format all the way from the database to the screen, with predictably broken-looking results. People on forums suggested hacks, but they were ugly hacks. It was time to fix the data.

Fortunately, it was easy to see the problem in a standalone database management program:

select name from accounts where id = 1;
GimlÃ©t

We needed to convert the text to UTF-8. Of course, the database already thought the text *was* in UTF-8, so this didn’t work:

select name, convert(name using utf8) from accounts where id = 1;
GimlÃ©t   GimlÃ©t

The trick was to do what our old database connector had been doing: take the data, convert it to Latin-1, then re-interpret the resulting bytes as UTF-8. Because MySQL is “smart” about character encodings, we also had to tell it to treat the result of the Latin-1 conversion as a plain string of bytes. Finally, we told it to decode those bytes as real UTF-8:

select name, convert(cast(convert(name using latin1) as binary) using utf8) from accounts where id = 1;
GimlÃ©t   Gimlét
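The same three-step dance can be sketched in Python. The `repair` helper is hypothetical, and its safety net (leaving text alone when it does not look double-encoded) is our addition; the SQL above has no such guard.

```python
def repair(text: str) -> str:
    """Undo one round of 'UTF-8 bytes misread as Latin-1'."""
    try:
        # Like convert(... using latin1) plus the cast to binary:
        # get back the raw bytes the web app originally sent.
        raw = text.encode("latin-1")
        # Like convert(... using utf8): read those bytes as real UTF-8.
        return raw.decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not double-encoded, so leave it untouched.
        return text


print(repair("Giml\u00c3\u00a9t"))  # the mangled "GimlÃ©t"
```

Feeding it text that is already correct, such as a properly stored Gimlét, falls into the `except` branch and comes back unchanged.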

And from there, we wrote a little script to fix all the text columns in the database, and all was well.
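We won't reproduce the script itself, but a sketch of the general shape might look like the Python below. It only prints SQL. The function name and the column list are made up for illustration, with `accounts.name` borrowed from the query above.

```python
def repair_statement(table: str, column: str) -> str:
    """Build an UPDATE that re-decodes one double-encoded text column."""
    return (
        f"UPDATE `{table}` SET `{column}` = "
        f"CONVERT(CAST(CONVERT(`{column}` USING latin1) AS BINARY) USING utf8);"
    )


# Hypothetical (table, column) pairs; a real run would list every text column.
TEXT_COLUMNS = [("accounts", "name")]

for table, column in TEXT_COLUMNS:
    print(repair_statement(table, column))
```

A conversion like this is unlikely to be safe to run twice on the same column, so anything along these lines wants a fresh backup and a trial run on a copy of the data first.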

July 30th, 2013  |  Published in Gimlet

Flurish

Planned Gimlet outage: July 29, 11:00 PM-Midnight Central Time

As the title suggests, we’re planning some system updates the night of Monday, July 29. We’re planning to be down for about one hour from 11:00 PM to Midnight, Central Time.

July 24th, 2013  |  Published in Gimlet

Flurish

One more exhibition musing

Next time you’re at a conference, look at the big booths. They’re paying between $15 and $25 per square foot for space. And $200-ish per hour (plus materials costs) to build their displays. And they’re paying for all of the staff they have on hand to talk to you, run informational sessions, and whatnot. It’s pretty easy to come up with a ballpark figure on what that exhibition costs.

Interestingly, one vendor filed for Chapter 11 bankruptcy while exhibiting at possibly the biggest booth at the conference. Wow.

July 23rd, 2013  |  Published in Uncategorized

Flurish

Exhibiting at ALA: Redux

A few weeks ago, Eric and I took Gimlet to Chicago for the annual meeting of the American Library Association. When we got back, our friends and family all wanted to know:
Was it worth it?

Short answer: Probably. It’s not an obvious direct financial win, but meeting our customers is priceless.

First off: exhibiting at a conference is expensive. A 10×10 space alone is more than $2200. Electricity, carpeting, and seating add nearly $1000. Travel expenses (airfare, gas, tolls, parking, hotels, food) come out above $1600. Printing banners, our sweet Gimlet coasters, and postcards cost about $500. Some of those things might be reused later – but we’ll want to update them as Gimlet (and our messaging) evolves. Add some miscellaneous costs, and going to ALA represents about $5500 in direct costs. While this doesn’t break the bank, it’s not chump change, either.

We’re charging $120 per year per library branch, so for us to financially break even on the conference, we’re looking for about 46 library branches to sign up as a result of meeting us at ALA. That’s certainly not impossible, but it’s no surefire bet, either.
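For anyone who wants to check the math, here it is in a few lines of Python. The itemized figures are the rounded ones quoted above, so treat the miscellaneous line as an approximation rather than an accounting entry.

```python
import math

# Rounded figures from the post, in US dollars.
itemized = {
    "booth space": 2200,
    "electricity, carpeting, seating": 1000,
    "travel": 1600,
    "printing": 500,
}
total_direct_cost = 5500          # the post's total, including miscellaneous costs
price_per_branch_per_year = 120

miscellaneous = total_direct_cost - sum(itemized.values())
break_even = total_direct_cost / price_per_branch_per_year
branches_needed = math.ceil(break_even)  # you can't sign up a fraction of a branch

print(miscellaneous)
print(round(break_even, 1), branches_needed)
```

That works out to a little under 46 first-year subscriptions, in line with the ballpark figure above. It also counts only the first year of revenue, so a branch that renews pays back more than once.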

On top of that, exhibiting takes time. Lots of time. Gimlet is a two-person shop: Eric and I are the whole deal. The hours we spend designing banners and coasters, registering for conference space, shopping for hotels and airfare and banner stands and what-have-you, and (of course) traveling and exhibiting are hours we aren’t programming or designing or upgrading servers or sending invoices or answering support questions or any of the other hundred things that keep Gimlet alive.

On the other side of the equation are the positives. We talked to lots of people. Hundreds of people. It was a pleasure – and hugely helpful – to talk to each and every person who stopped by our booth. At our price, we’re not going to send a sales team to a bunch of site visits, so ALA is our best chance to sit down with existing and prospective clients. We love to hear about the challenges of running busy public service points and how better stats help people make informed decisions about staffing and programming. It’s great hearing about how well Gimlet works for people. It’s invaluable hearing about the ways we could make Gimlet better for library staff and directors alike.

But the best thing about exhibiting is meeting the people we talk to over the internet. It’s easy to run an internet service and not know, really, who your clients are. Sitting down and talking to you reminds us why we’re doing this: We love librarians and libraries. We want to help you evolve and thrive through whatever the future holds, so our kids can grow up loving librarians and libraries just like we do.

See you in Las Vegas.

 

July 16th, 2013  |  Published in Gimlet