Calibre Aces Ebook Conversion and Management

August 30, 2011

Anyone who reads eBooks knows how challenging managing a growing library can be. To solve this annoying problem, a new program has entered the market: Calibre, an eBook management tool. With so many different file formats and an equally wide range of eReaders available, it’s nice to finally have a central command post to sort through it all.

The concept was born of an avid eBook enthusiast and reader who was unhappy with the software available for eBook management and file conversion. Calibre, as it is today, is a work in progress that aims to meet the demands of busy eReading folk. As the website explains,

Today Calibre is a vibrant open-source community with half a dozen developers and many, many testers and bug reporters. It is used in over 200 countries and has been translated into a dozen different languages by volunteers. Calibre has become a comprehensive tool for the management of digital texts, allowing you to do whatever you could possibly imagine with your e-book library.

Perhaps the best feature of Calibre is its ability to convert all types of files, making it possible to download an eBook in any format and then miraculously send it to the eReader of choice. Voila! As one Calibre fan wrote in the article “Best Ebook Library Manager: Calibre” on Book Sprung, “Calibre’s secret weapon is that it’s got crazy ninja formatting skills, and can convert all sorts of files into all sorts of other files. For Kindle owners, this means you can convert unusable file formats into the .mobi format that Kindle likes.”
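For readers who prefer to script the process, here is a minimal sketch of that conversion workflow, assuming Calibre is installed and its ebook-convert command-line tool is on the system PATH (the file names are placeholders):

```python
import subprocess

def convert_for_kindle(source_path: str, output_path: str) -> None:
    """Convert an eBook to MOBI by shelling out to Calibre's ebook-convert tool.

    Assumes Calibre is installed and ebook-convert is on the system PATH.
    """
    subprocess.run(["ebook-convert", source_path, output_path], check=True)

if __name__ == "__main__":
    # Hypothetical file names: turn an EPUB into a Kindle-friendly MOBI file.
    convert_for_kindle("my_book.epub", "my_book.mobi")
```

Calibre generally infers the target format from the output file extension, so the same approach covers eReaders other than the Kindle.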

We look forward to seeing what else Calibre can pull out of its hat, and more importantly, if the eBook providers of the world will play nice with the newest teacher’s pet.

Catherine Lamsfuss, August 30, 2011

Sponsored by Pandia.com

The Internet Means Search and Email

August 24, 2011

We were a bit underwhelmed. Though social media is gaining ground, one survey found that it has a long way to go to overtake the number one use of the internet: searching for information. As discussed in “Who Uses Search Engines? 92% of Adult U.S. Internet Users [Study]”, the research center Pew Internet found that searching is the single most popular use of the internet, with email coming in second.

The survey found that the number of people searching on a regular basis has grown over the past 10 years in every demographic. Now 92 percent of internet users utilize search engines, with 59 percent of them doing so on a regular basis. Email has similar numbers. The younger, the wealthier, and the more educated are the most likely to search and use email on a daily basis.

This leaves people wondering: what is happening with social media?

It’s certainly true that social sites are growing rapidly. Since 2004, when Pew Internet started looking at social media usage among those surveyed, social sites have risen from 11 percent usage to 65 percent usage. The growth started slowing in 2009, but is continuing a gradual climb.

I think it is safe to say that social media popularity will continue to grow, but it will never reach the numbers associated with searching and email. The likes of Facebook and Twitter simply are not alternatives to a search engine. People are always going to need and seek out information, which will safely secure the top spot for companies like Google and Yahoo.

Jennifer Wensink, August 24, 2011

Sponsored by Pandia.com

Aggregation: A Brave New World?

August 24, 2011

As I’m typing this article on my computer, I must confess, I love pen and paper, the smell of a new book, the sound a newspaper makes when its pages are turned. Unfortunately, these physical things are slowly becoming extinct thanks to the internet. Though I stubbornly resist the allure of Kindle, I can see the writing on the wall, or the tablet.

The article “How the Internet Has All But Destroyed the Market for Films, Music and Newspapers,” from the UK’s The Guardian, argues that the impending death of physical newspapers, among other media outlets, is due to the lack of laws governing, and enforced on, the internet. According to the piece, as long as information can be easily pirated and transmitted to others for free, those footing the bill for creating the movies, music, and news will continue to see sharp declines in profits.


Image source: http://www.sreweb.com/weekend_emails/sept_10_2010/

To understand how the internet is killing the newspaper star, one must first understand why newspapers have worked so well for so long. It’s all about aggregation and curation. Aggregation is simply the gathering of ‘stuff’; in a newspaper’s case, that stuff is news stories, sports scores, horoscopes, classified ads, and so on. Curation is the culling out of unnecessary ‘stuff’.

Newspapers have created brands for themselves through their unique aggregating and curating. For hundreds of years, if someone liked a column in a specific newspaper, they were forced to buy the entire paper to read the one column of interest. The newspaper hoped that the reader would also find the other articles interesting, but it didn’t really matter, because the price of the newspaper was the same whether a reader liked one article or all of them.


Google Plus Demographics

August 14, 2011

Here at the Beyond Search goose pond, we pay more attention to the less zippy aspects of search. The notion of asking someone and getting an answer is a method we learned at our orientation class at Halliburton NUS 40 years ago. The training went something like this.

When you need to know where the diagrams for the ECCS are, you ask the duty officer.

Not too fancy, but the method worked despite government and plant operator bureaucratic “efficiency.” Moving questions to another communication medium seems pretty understandable to us. Searching the digital artifacts is an obvious step. We can even get our tiny minds around the notion of knowing who asked whom, what, and when.

When we think about Google Plus, we see a new service that is changing, though the changes are coming less quickly than we anticipated. Google seems to be putting considerable effort into the new service. Once a person provides the who, what, why, and when for routine communications, one has a very interesting commercialization opportunity.

“Study Google+ Winning over Suburban Parents, Losing College Kids and Cafe Dwellers” caught our attention on August 13, 2011. The write up provides some early data about the demographics of the 20 million plus Google Plus users. (Am I the only one who eschews using the plus sign because of its role as an operator in some search systems?)

Here’s the passage we noted:

Google+ seems to be falling out of favor among the “colleges and cafes” crowd, generally younger people without children. However, it’s seeing an increase in interest from the “kids and cabernet” segment — defined as “prosperous, middle-aged married couples living child-focused lives in affluent suburbs.” That’s a group that hasn’t embraced Facebook as much as the rest of the population, according to the Experian Hitwise data.

My hunch is that Google is going to want hundreds of millions of users of all demographic stripes and hues. The inclusion of games is an obvious first step in a consumerizing move. The video stuff also points down market to me, but I am 67 and not too keen on the boob tube, whether implemented on a big screen TV, a mobile device, or some intermediate gizmo like an iPad. A wasteland is a wasteland to me.

The more consumerized a service, the less utility that service has to me. Facebook is the ultimate consumer “space”, and I don’t spend much time in that service. (A couple of the goslings are working on a Facebook implementation for Augmentext.com, but I just watch and learn. I don’t “do.”) Google Plus seems more appropriate to me, but if it goes down-market, then I will drift away. LinkedIn has already become a crazy “hire me” and “I am an expert” place, and I am not too keen on that digital watering hole either. I am willing to be semi-flexible, but since I can’t touch my toes, I don’t know how far I can go in this down-market type environment.

Stephen E Arnold, August 14, 2011

ReVerb: The Whole Language Movement

August 12, 2011

ReVerb, a new search method, presents an optimistic future for search engines and intelligence levels. Projecting what Web search engines will look like in ten years, ReVerb should hope that the whole language movement doesn’t make a comeback in schools. Requiring users to input an “argument” and a “predicate,” this program automatically identifies and extracts binary relationships from English sentences—so users need to know the basic parts of a sentence.

Created by the University of Washington’s Turing Center as part of the KnowItAll project, ReVerb currently offers 15 million extractions for academic use. The program has blown similar ones out of the water.

The paper entitled “Identifying Relations for Open Information Extraction” asserts the following:

“[ReVerb] more than doubles the area under the precision-recall curve relative to previous extractors such as TextRunner and WOE-pos. More than 30% of ReVerb’s extractions are at precision 0.8 or higher— compared to virtually none for earlier systems.”

The creators are confident that ReVerb will be useful for queries where target relations cannot be specified in advance and speed is important. Currently, there is a demo available.
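To make the idea of binary relation extraction concrete, here is a toy sketch that pulls (argument, relation phrase, argument) triples out of a sentence with a naive pattern. It is emphatically not the ReVerb algorithm, which relies on part-of-speech patterns and lexical constraints; it only illustrates the shape of the output such an extractor produces:

```python
import re

# Toy illustration of the (argument, relation phrase, argument) idea.
# Arguments are runs of capitalized words; the relation phrase is the lowercase
# text between them. This is NOT the ReVerb system, only a sketch of the output shape.
PATTERN = re.compile(
    r"(?P<arg1>(?:[A-Z]\w+ ?)+)"   # e.g. "ReVerb"
    r"(?P<rel>(?:[a-z]+ )+)"       # e.g. "was created by the"
    r"(?P<arg2>(?:[A-Z]\w+ ?)+)"   # e.g. "Turing Center"
)

def extract_triples(sentence: str):
    """Return crude (argument, predicate, argument) triples found in a sentence."""
    return [
        (m.group("arg1").strip(), m.group("rel").strip(), m.group("arg2").strip())
        for m in PATTERN.finditer(sentence)
    ]

print(extract_triples("ReVerb was created by the Turing Center"))
# [('ReVerb', 'was created by the', 'Turing Center')]
```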

Is this the next big thing in search or another public relations push? Will this generate sympathetic vibrations within the Google?

Megan Feil, August 11, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search

Is Thomson Reuters Chasing after LegalZoom?

August 9, 2011

Here’s another “me too!” development. Taume reports, “Thomson Reuters Launches Westlaw Form Builder.” LegalZoom offers a client the forms required to create a limited liability company for less than $100. A lawyer may charge quite a bit more.

Completing an unending stream of forms is a time-consuming aspect of work in any legal office, and Thomson Reuters hopes its online tool will spell efficiency for clients. The press release explains what the company hopes will distinguish its product from the competition:

Attorneys can access more than 20,000 official and lawyer-tested forms anytime and anywhere they have an Internet connection. Westlaw editors continually update the forms to ensure they are current, eliminating the need to download upgrades or verify citations. Unlike static forms, Westlaw Form Builder allows users to customize forms, making them specific to a given client and case. And every Westlaw Form Builder plan includes links to any cited authority or commentary on WestlawNext without incurring additional charges, helping users understand the legal context surrounding a particular form.

Completed forms are downloadable, and client data is stored, saving time on re-entry.

Thomson Reuters provides information management tools to clients around the globe in fields from finance and law to science and health care. And, of course, the company is a respected source of world news coverage. But Thomson is targeting attorneys who are increasingly cost sensitive. Maybe attorneys are using LegalZoom too? The search system works. Oh, and LegalZoom looks like a pretty good bargain. Buying legal information from an outfit like Thomson Reuters? Well, it can be more expensive in my experience.

Cynthia Murrell, August 9, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search

Delightful Irony: Human Crashes Google Car

August 7, 2011

This morning my Overflight information service overflowed with Google-related information. There were coveys of quales [Latin and not a misspelling, gentle reader] about Google and patents. There was another Googley shutdown story; the idea is that you should just Google a word, and who cares about a “real” dictionary entry? I find the reference appropriate because who cares about a “real” anything, including an azure chip consulting company with a penchant for becoming an authority in ANSI standard controlled term lists. I found a tardy response to the feline-centric “How Do I Hate Google? Let Me Count the Ways”, which had precious little of the Elizabeth Barrett Browning gentleness born of her pain and suffering.

Consider this EBB passage:

First time he kissed me, he but only kissed
The fingers of this hand wherewith I write;
And, ever since, it grew more clean and white.

Now evaluate the budding wordsmith Brian S. Hall’s passage:

David Drummond, you are [lame]. Larry, Sergey, you are [lame]. And I know why you’re [lame]. I know why you have monopoly profits in one business, use them to *destroy* other businesses, dominate the newest business (smartphones) and still whine.

Now who should be the focus for legions of soon to be unemployed English majors?

But what caught my attention was this item: “Google Blames a Human for its Robo-Car Crash.” My take: Algorithm good. Human bad.

Now what happens if Google’s next big product initiative, such as a relaunch of the fascinating Google TV product line or a fully integrated, graphically consistent interface for Android mobile devices, flops?

Maybe algorithm good, human bad? Amusing to me because humans, not algorithms, are actually making decisions at the Googleplex. So a failure at Google boils down to “Human bad.” Seems logical.

Stephen E Arnold, August 7, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search

Are Privacy Issues Still Plaguing Google?

July 31, 2011

It’s hard to believe that Google keeps putting consumer privacy in question; you would think the company would learn. While I’m all for a good Google roast, this is borderline overkill. Matthew DeCarlo’s TechSpot article “Google’s Street View Cars Collect Locations of Wi-Fi Devices” is an interesting look at what appears to be Google’s latest trouble.

Google’s Street View Cars collected information about the Wi-Fi locations of many European users and their “previous locations”. We learned:

“For instance, someone could use the data to show you were at a specific place during a specific time, and that’s something you might not want to share with the world.”

What I don’t get is how this is any different from the check-in applications that millions of people already use on their Facebook and Twitter accounts. It is also no different from what hundreds of millions of social media users do when they post their whereabouts in status updates and feeds worldwide.

For users to cry “privacy infringement,” the data should have been private in the first place. But it sure looks as if privacy issues, whether grounded or not, are an albatross around Googzilla’s neck. It is tough to search if one cannot connect, in our opinion.

Leslie Radcliff, July 31, 2011

Sponsored by Pandia.com, an outfit which published my new study of enterprise search and a chapter that provides some of my analysis of the Google Search Appliance.

Ardentia Search Now Connexica

July 29, 2011

Short honk: We were updating the Overflight links today and noted that Ardentia Search, which had positioned itself as a “business intelligence company,” is now redirecting to Connexica.com. The About Us page references Ardentia Search. The managing director of the company is Richard Lewis. Here’s the important bit:

As CTO at Ardentia, [Richard Lewis] was responsible for the development of BI and Data Warehouse products which are now used in over 100 NHS organizations as well as providing analysis and extract services for the National Program for IT. Richard founded Connexica in 2006 by buying the IPR for his latest BI and search product from Ardentia.

If you don’t recall the Ardentia system, here’s a block diagram I unearthed from the Overflight archive:

[Ardentia system overview block diagram]

A number of search and content processing companies are repositioning, not disappearing.

Stephen E Arnold, July 29, 2011

Freebie

Exclusive Interview with Margie Hlava, Access Innovations

July 19, 2011

Access Innovations has been a leader in the indexing, thesaurus, and value-added content processing space for more than 30 years. Margie Hlava’s company has worked for most of the major commercial database publishers, the US government, and a number of professional societies.


See www.accessinn.com for more information about MAI and the firm’s other products and services.

When I worked at the database unit of the Courier-Journal & Louisville Times, we relied on Access Innovations for a number of services, including thesaurus guidance. Her firm’s MAI system and its supporting products deliver what most of the newly minted “discovery” systems need: indexing that is accurate, consistent, and makes it easy for a user to find the information needed to answer a research or consumer-level question. What few realize is that the value of the systems and methods developed by the taxonomy experts at Access Innovations lies in standards. Specifically, the Access Innovations approach generates an ANSI-standard term list. Without getting bogged down in details, an ANSI-compliant controlled term list embodies logical consistency and adherence to strict technical requirements; see the Z39.19 ANSI/NISO standard. Most of the 20-somethings hacking away at indexing fall far short of the quality of the Access Innovations implementations. Quality? Not in my book. Give me the Access Innovations (Data Harmony) approach.
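As a rough illustration of what tagging against a controlled term list buys you, here is a minimal sketch in which variant terms are mapped to preferred terms so that every record is indexed consistently. It is not Access Innovations’ MAI or Data Harmony software, and the vocabulary entries are invented:

```python
# A minimal sketch of indexing against a controlled vocabulary: variant (non-preferred)
# terms map to preferred terms, so records are tagged consistently. The entries below
# are invented examples, not a real thesaurus.
CONTROLLED_VOCABULARY = {
    # non-preferred term -> preferred term (the USE/UF relationship in Z39.19 terms)
    "e-book": "electronic books",
    "ebook": "electronic books",
    "search engine": "search engines",
    "taxonomies": "taxonomy",
}

def tag_record(text: str) -> set:
    """Return the preferred index terms whose variants appear in the text."""
    lowered = text.lower()
    return {
        preferred
        for variant, preferred in CONTROLLED_VOCABULARY.items()
        if variant in lowered
    }

print(tag_record("This ebook explains how a search engine ranks documents."))
# {'electronic books', 'search engines'}  (set order may vary)
```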

Care to argue? I think you need to read the full interview with Margie Hlava in the ArnoldIT.com Search Wizards Speak series. Then we can interact enthusiastically.

On a rare visit to Louisville, Kentucky, on July 15, 2011, I was able to talk with Ms. Hlava about the explosion of interest in high-quality content tagging, the New Age word for indexing. Our conversation covered everything from the roots of indexing to the future of systems which will be available from Access Innovations in the next few months.

Let me highlight three points from our conversation, interview, and enthusiastic discussion. (How often do I in rural Kentucky get to interact with one of the, if not the, leading figure in taxonomy development and smart, automated indexing? Answer: Not often enough.)

First, I asked how her firm fits into the landscape of search and retrieval.

She said:

I have always been fascinated with logic and the application of it to the search algorithms was a perfect match for my intellectual interests. When people have an information need, I believe there are three levels to the resources which will satisfy them. First, the person may just need a fact checked. For this they can use encyclopedia, dictionary etc. Second, the person needs what I call “discovery.” There is no simple factual answer and one needs to be created or inferred. This often leads to a research project and it is certainly the beginning point for research. Third, the person needs updating, what has happened since I last gathered all the information available. Ninety five percent of search is either number one or number two. These three levels are critical to answering properly the user questions and determining what kind of search will support their needs. Our focus is to change search to found.

Second, I probed why indexing is such a hot topic.

She said:

Indexing, which I define as the tagging of records with controlled vocabularies, is not new. Indexing has been around since before Cutter and Dewey. My hunch is that librarians in Ephesus put tags on scrolls thousands of years ago. What is different is that it is now widely recognized that search is better with the addition of controlled vocabularies. The use of classification systems, subject headings, thesauri and authority files certainly has been around for a long time. When we were just searching the abstract or a summary, the need was not as great because those content objects are often tightly written. The hard sciences went online first and STM [scientific, technical, medical] content is more likely to use the same terms worldwide for the same things. The coming online of social sciences, business information, popular literature and especially full text has made search overwhelming, inaccurate, and frustrating. I know that you have reported that more than half the users of an enterprise search system are dissatisfied with that system. I hear complaints about people struggling with Bing and Google.

Third, I queried her about her firm’s approach, which I know to be anchored in personal service and obsessive attention to detail to ensure the client’s system delivers exactly what the client wants and needs.

She said:

The data processed by our systems are flexible and free to move. The data are portable. The format is flexible. The interfaces are tailored to the content via the DTD for the client’s data. We do not need to do special programming. Our clients can use our system and perform virtually all of the metadata tasks themselves through our systems’ administrative module. The user interface is intuitive. Of course, we would do the work for a client as well. We developed the software for our own needs, and that includes needing to be up, running, and in production on a new project very quickly. Access Innovations does not get paid for down time. So our staff are trained. The application can be set up, fine-tuned, and deployed in production mode in two weeks or less. Some installations can take a bit longer. But as soon as we have a DTD, we can have the XML application up in two hours. We can create a taxonomy really quickly as well. So the benefits are: fast, flexible, accurate, high quality, and fun!

You will want to read the complete interview with Ms. Hlava. Skip the pretend experts in indexing and taxonomy. The interview answers the question, “Where’s the beef in the taxonomy burger?”

Answer: http://www.arnoldit.com/search-wizards-speak/access-innovations.html

Stephen E Arnold, July 19, 2011

It pains me to say it, but this is a freebie.
