Why DRM really, truly sucks

November 29th, 2007 by Dan Fuhry

Today in a class I’m taking there was someone a couple of desks down who has dyslexia. We have a bit of reading to do for the class, so he had gone and bought the e-book form of the book. Naturally the file was delivered in a DRM-protected format.

The fact is that this guy uses TextHelp Read and Write Gold for screen-reading.  The way it works is, you select text, and the software reads the text to you in a way that’s a bit better than “Microsoft Sam” (you know, the speech engine that says “crow’s nest” when you feed it the word “crotch”?).

The problem with this e-book, and most others, is that copying text is not allowed. You see, according to the publisher, if you select text, you’re trying to copy it so that you can share the book with others, which violates the publisher’s rights and makes the author starve to death because he has to choose between paying the electric bills and eating dinner.

When this guy tried to select text so that Read and Write Gold could read it, he received a message from Acrobat Reader informing him that the “publisher of the book doesn’t permit copying text.” Read and Write Gold didn’t work for him, and now reading this book is going to be about 5 times as hard.

Folks, we have a serious moral issue here. About 1 person out of 30 will have some sort of reading difficulty, most commonly dyslexia. It is absolutely imperative that all technology that we develop is accessible - and that means without DRM. Developing technology that restricts people with learning differences under the claim of copyright protection is cruel and immoral, not to mention just complete bullcrap. It’s absolutely impossible for a company like Adobe to provide test cases for absolutely every scenario where DRM would restrict use. Therefore, it should be quite obvious that we need to just drop DRM completely. Developing it costs money, licensing it costs money, implementing it costs money, and when your customers start seeing its effects and stop buying DRM-protected formats, that’ll cost a buttload of money.

Posted in Uncategorized | No Comments »

How to spend a sick day…

November 28th, 2007 by Dan Fuhry

I just caught a nasty cold and decided to stay home from work today. It wasn’t an easy decision because I obviously got my dad’s workaholic gene. Mom always said he’d be going into work unless he was on his death bed. Well I’m exactly the same way.

So I’m feeling raunchy today with crap coming out of my nose and ears (no, not actual crap), an unrelenting cough, a sore throat, and I can’t even think straight. What should I spend the day doing? Maybe release Coblynau? Start the Christmas shopping? Take a 3-hour-long shower? Sleep all day?

*yawns*


Enano’s new search engine: an inside look

November 21st, 2007 by Dan Fuhry

Coblynau is scheduled to be released in about two days. But let’s rewind a little bit and take a look at what’s been going on the past day or two.

Two days ago I booted Nighthawk into Vista because Fedora was making my sound card queasy and I needed to do some testing of Enano under the OS from “that company out west” anyway.

I’ll be the first to admit that while I’m competent with MySQL, I’m not the best-coded stored procedure in the database, so to speak, when it comes to the dinosaur open source DBMS. I’m still trying to get MySQL’s weird syntax down (SHOW INDEXES FROM table anyone?), and I just realized how clueless I was when I set up the FULLTEXT index that comes on Enano’s page_text table.

The problem became apparent to me when I realized how difficult it was to hook into the search system convincingly. I was fed up with MySQL’s indexing functionality, and the built-in search engine in Enano blows because it’s just incredibly slow. There was never any central result list that you could tap into and manipulate, so plugins like Decir and Snapr were pretty much doomed to either having their own search pages or segregating search results from different tables or types of pages for the same query. Blech.

It occurred to me last night, what if I just selected the indexed words from search_index and matched that against the query? You know, something like “SELECT page_names FROM search_index WHERE word='some_term';”? The index is already automatically updated each time a page is saved, so I decided that it was a go.
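Here’s a rough sketch of that lookup idea. It’s in Python with an in-memory SQLite database rather than Enano’s actual PHP and MySQL, so treat it as an illustration only. The table and column names come from the query above, but the demo rows, the helper names build_demo_index() and pages_for_word(), and the comma-separated storage of page_names are all assumptions made for the example.

```python
import sqlite3


def build_demo_index():
    """Create a throwaway in-memory index with a couple of made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE search_index (word TEXT PRIMARY KEY, page_names TEXT)"
    )
    # Assumption: page IDs for one word are stored as a comma-separated list.
    conn.executemany(
        "INSERT INTO search_index (word, page_names) VALUES (?, ?)",
        [
            ("enano", "ns=Article;pid=Main_Page,ns=Article;pid=About"),
            ("search", "ns=Article;pid=Main_Page"),
        ],
    )
    return conn


def pages_for_word(conn, word):
    """Return the list of page IDs indexed under a single word."""
    row = conn.execute(
        "SELECT page_names FROM search_index WHERE word = ?", (word,)
    ).fetchone()
    if row is None:
        return []  # the word is not in the index at all
    return row[0].split(",")
```

The query is parameterized instead of being glued together from strings, which keeps a search term containing a quote from breaking the SQL.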

I started rewriting the code around 10:30 PM. By 1:30 AM, I had finally ironed out the 4 or 5 PHP and SQL syntax errors, and the newborn algorithm (hackers will want to know that it’s a function called perform_search()) was returning an array with trimmed/clipped/highlighted page info. I also had a little file called search-test.php that displayed the results in an increasingly human-friendly way.

The way it all works is, you have two arrays, $scores and $page_data. Each page has a unique string assigned to it in the format of “ns=Article;pid=Main_Page”. $scores is an associative array containing one value for each page found. The value is incremented by 1 each time another search term is found on the page. $page_data just contains the unique ID, page text, page name, and the size of the page in bytes.
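To make the two arrays a bit more concrete, here’s a minimal sketch of the scoring step. It’s Python, not the real PHP inside perform_search(), and the function names (build_page_data, score_pages, rank) and the plain dict standing in for the word index are assumptions for illustration. It also assumes a term repeated in the query only counts once per page.

```python
def build_page_data(pages):
    """Build the page_data structure from a mapping of page ID -> (name, text)."""
    return {
        page_id: {
            "id": page_id,
            "name": name,
            "text": text,
            "size": len(text.encode("utf-8")),  # size of the page in bytes
        }
        for page_id, (name, text) in pages.items()
    }


def score_pages(index, terms):
    """Add 1 to a page's score for every distinct search term found on it.

    `index` maps a lowercase word to the page IDs that contain it.
    """
    scores = {}
    # dict.fromkeys() removes duplicate terms while keeping their order.
    for term in dict.fromkeys(t.lower() for t in terms):
        for page_id in index.get(term, ()):
            scores[page_id] = scores.get(page_id, 0) + 1
    return scores


def rank(scores):
    """Return page IDs sorted by score (highest first), ties broken by ID."""
    return sorted(scores, key=lambda page_id: (-scores[page_id], page_id))
```

With an index where “enano” appears on Main_Page and About and “search” appears only on Main_Page, a query for both terms scores Main_Page at 2 and About at 1, so Main_Page ranks first.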

The reason this approach works so well is that you can easily have a plugin hook into the search algorithm and inject its own results, scoring them appropriately. The algorithm also was designed to consume almost no memory, and it’s working pretty well on my development site on Scribus.
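A plugin hook along those lines could look something like the sketch below. Again this is Python rather than Enano’s PHP, and none of it is Enano’s real plugin API: register_search_hook(), apply_search_hooks(), the forum hook, and the “ns=ForumPost;pid=42” page ID are all hypothetical names invented for the example.

```python
search_hooks = []  # hypothetical registry of plugin callbacks


def register_search_hook(hook):
    """Register a callable taking (terms, scores, page_data)."""
    search_hooks.append(hook)
    return hook


def apply_search_hooks(terms, scores, page_data):
    """Let every registered plugin add its own results to the shared arrays."""
    for hook in search_hooks:
        hook(terms, scores, page_data)
    return scores, page_data


@register_search_hook
def forum_hook(terms, scores, page_data):
    """Example plugin: score made-up forum posts the same way pages are scored."""
    posts = {"ns=ForumPost;pid=42": ("Search tips", "how to search the forum")}
    wanted = {t.lower() for t in terms}
    for page_id, (name, text) in posts.items():
        hits = len(wanted & set(text.lower().split()))
        if hits:
            # One point per distinct matching term, like the core scoring.
            scores[page_id] = scores.get(page_id, 0) + hits
            page_data[page_id] = {
                "id": page_id,
                "name": name,
                "text": text,
                "size": len(text.encode("utf-8")),
            }
```

Because the plugin writes into the same scores and page_data structures as the core search, its results end up in the one central result list instead of on a separate page.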

An additional benefit, of course, was that we got to try something that’s never before happened in Enano history: during your upgrade, the search_cache table is actually going to be dropped. That’s right. The algorithm is so fast (it processed an 11-term query in 0.13 seconds, whereas the old algorithm took a whopping 10.4 seconds) that I decided a caching system wasn’t necessary. Only time and a heavy server load will tell, but so far Enano’s been performing very well with the new search code.
