The paper analog

Yesterday’s EFF Deeplinks blog picked up our DRM exploit post from March and linked it with a paper from Microsoft security engineers arguing that DRM is doomed to failure.

Now might be a good time to take another look at just how big this analog hole is since, however big it is, Amazon’s new 9.7″ screen is about to make it a lot bigger.

Posted in In the News | Comments closed

Building in Parallel

We’ve talked about other efforts to digitize books before, but now things are getting a lot closer to home.

Yesterday, a group of three grad students posted an Instructable on how to build a book scanning device using a similar V-shaped cradle and the same camera model, for about the same price as our design. They even wrote free software to do the image processing! Our two efforts have a lot to offer each other, and there is great interest on both sides in pooling our knowledge to come up with a best-of-both-worlds design, so there should be some great new stuff once the relevant grad schools are done with finals.

Posted in Uncategorized | Comments closed

6 Ways to Save Publishing

Michael Tamblyn, the CEO of BookNet Canada, presents 6 ways technology can improve the world of paper book publishing. For me, the most interesting tidbit in that video is that 99.5% of Canadian book sales are paper books (I suppose this is as opposed to digital or audio books). And if you’re interested in how book publishers survive as that percentage decreases, there are some good thoughts in this presentation.

Here are the 6 ideas:

  1. The ability to pull metadata from the cloud
  2. More and better markup. “An XML workflow that doesn’t suck”
  3. Sexy mobile reading devices
  4. Smarter, individualized comprehensive sales information in catalogs
  5. Online browsing that makes you want to buy
  6. Embrace startup culture

And the video:

Via Boing2

Posted in Uncategorized | Comments closed

Grouping

We’ve got a Google group!

Now that a number of people are in the process of building their own book rippers, we need a place where we can pull together the multiple discussions we’ve been having over email and in other forums. The Google group should do that very nicely. Come on by.

For those of you keeping track at home, we went with a Google group rather than the mailing list capabilities built into Launchpad, our code host, for a number of reasons. Most come down to the fact that everything, from the mailing list itself to whether or how you can use OpenID accounts (as our wiki does), is impossible to find when using Launchpad. Google makes the important things the easiest to find.

Posted in Uncategorized | Comments closed

Universal e-book DRM exploit discovered!

Sort of.

There is a common joke in e-book and book digitization circles that paper is the original, and most effective, DRM available for books. This is mostly true, though anyone with a sheet-fed scanner quickly learns that it is a book’s binding that stops it from being easily digitized, not its paper. Yet wherever things currently stand in the race between publishers releasing things in restricted formats and the public breaking those formats open to get at the tasty text inside, we can be sure of this: getting around the DRM on an e-book will never be more difficult than treating it like paper and photographing it normally.

When I tried just that using my book ripping camera mount, I found that the OCR rates were almost identical to those of the matching paper book, the page turning effort was reduced to almost nothing, and the speed was a comfortable ~750 pages/hour, where a “page” is the screen size of your e-reader.
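To put that rate in perspective, here is a rough back-of-the-envelope sketch in Python. The ~750 pages/hour figure is the one measured above; the function names and the 1,500-screen book are made up purely for illustration and are not measurements.

```python
# Back-of-the-envelope throughput math for photographing an e-reader screen.
# RIP_RATE comes from the post (~750 screen "pages" per hour); everything
# else here is a hypothetical example, not a measured result.

RIP_RATE = 750  # screen pages per hour


def seconds_per_page(pages_per_hour=RIP_RATE):
    """Average time budget for one screen: page turn plus photo."""
    return 3600 / pages_per_hour


def hours_to_rip(screen_pages, pages_per_hour=RIP_RATE):
    """Time to photograph a whole e-book, counted in e-reader screens."""
    return screen_pages / pages_per_hour


if __name__ == "__main__":
    print(f"{seconds_per_page():.1f} seconds per screen")
    # Hypothetical e-book that takes 1,500 screens to page through.
    print(f"{hours_to_rip(1500):.1f} hours for a 1,500-screen book")
```

In other words, that pace leaves a little under five seconds per screen, so even a long e-book is an afternoon’s work at most.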

So to anyone out there working on e-book DRM: I’ve got enough paper books (with bindings!) to scan already. Why not spend your effort on something more productive? Because this paper-DRM is only going to get even worse.

Posted in Uncategorized | Comments closed

Paperback testing begins in earnest

The book ripper used in our demo video, the one that created the test images and matching OCR results, is made from a single 24″x12″ piece of plexi with a bend in the middle. This turns out to be a good general-purpose size for scanning the books in my library, which are mostly hardcovers and large paperbacks.

If you have a similar collection of books, or just want to put together a book ripper capable of handling a wide variety of book sizes, I recommend the 24″x12″ size, or even a 28″x12″ one for very large books. I’ve tested both sizes on full books and am quite happy with the results.

If, on the other hand, you only want to rip small works, you might be able to pick up some ripping speed by using a smaller device. At least that is the theory I’m going to start testing this week with two new rippers: one a shrunken 18″x8″ version of the standard book ripper design, and one a new design built with off-the-shelf plexi sheets and webcams.

Posted in Uncategorized | Comments closed

The Future of Books

Why destroy your books just to scan them when there are so many more interesting ways to destroy them?

Posted in Uncategorized | Comments closed

When publishers were gods

Ars Technica has a great article by John Siracusa on the history of ebooks and their running failure to penetrate down to the consumer level over the last decade.

John makes a number of great points, including this bit where he attacks the greatest paper tiger of ebooks, that “no one wants to read from a screen”:

People are clearly willing to read text off screens. Plain, old, often awful screens with tiny, ugly text and large pixels. Vast amounts of text, read over extended periods of time. Up to 40 hours a week at work alone, in the case of most office workers who sit in front of a computer all day. And more at home for pleasure. Hell, you’re likely doing it right now (unless you printed the PDF version of this article or are being paid to read it).

The whole article is worth a read for anyone interested in the history of selling ebooks and the arguments leveled against them. Unfortunately, the article seems to be premised on the idea that the success of ebooks is tied directly to having a healthy market in selling ebooks, which a quick look at the history of computer audio files will show is not true.

MP3s did not become the format of choice for music because there was a healthy market in selling them, or any market at all. MP3s took off because people were able to easily convert the CD-based music collections that they already owned. Only now, years after the format switched for consumers, have retailers started changing their distribution format to match.

If digital music had been stuck waiting on publishers, it might be no further along than the current ebook market, swamped under competing DRM’d formats with absurd pricing policies designed to prevent cannibalization of physical media sales. After all, that was an accurate description of the digital music selling business until just the last year or two.

Thankfully, the publishers are not the only ones with books; they simply print some of the new ones. Once you realize that, you come face to face with the biggest real roadblock to ebooks: getting digital text out of the paper books that people already own.

This, naturally, is where we come in.

Posted in In the News | Comments closed

Test Images

If you’re interested in seeing what the software can do, there’s no need to actually build a rig.
Grab version 0.1.82 and the test images and get to work!

Posted in Uncategorized | Comments closed

The software is done*

*for certain values of “done”.

Today we’re pushing out the 0.1 release of our page image processing software. There are some limitations, most notably that it only runs on Linux, but the core rotate and crop functions are working and we’re targeting Windows and OS X for the 0.2 release later this winter. If you are interested in working on the software, now is a great time.

With last week’s documentation, we’re also ready to declare the book ripper itself 0.1! The device works well enough to produce some great output, and there should now be enough information to make building one yourself a simple project. If you run into any problems, let us know! We’re happy to put up whatever information or advice is needed.

Posted in Announcements | Comments closed