Hakia: Taking on Google

April 12, 2009

I find these stories about search systems that will challenge Google fascinating. One of the more recent ones I saw was an April 6, 2009, article “Is Hakia.com the Search Engine That Is Going to Challenge Google?” which appeared in My Questions, a South African Web log here. The story provides a useful summary of the features of the Hakia semantic system. I ran an interview with one of the Hakia founders, Riza Berkan, in August 2008. You can read that exclusive interview here. The point that jumped out at me in the My Questions’ write up was this comment:

The results are ranked according to the relevant site and the categories that they belong to.

There is a growing interest in the authority of a source. The role that a subject matter expert, a Ph.D. committee, or a reference librarian once played has to make its way into software. The present financial climate and the inefficiency of finding a reliable way to validate a source make human methods highly variable. Software, with its machine-like consistency, seems to offer a solution. Hakia has probed this issue and includes this component in its search results ranking.

Another comment that caught my attention was:

Hakia is a very good search engine but it still has a lot of ground to cover before it can take over much of the market the Google has. We will only have to see with time how the market receives it.

I think Hakia has much to commend it. My recollection is that the company’s processing of health and medical information was quite useful. In my experience, semantic processes often work more quickly and reliably when processing content that is about a specific subject area. But technology continues to improve and some vendors, like Autonomy, emphasize that their systems can adapt to a changing flow of content. I have been around a long time, and I think that “drift” remains a challenge for many search and content processing vendors.

The effort of carpetbaggers and azure chip consultants to sell taxonomy as a silver bullet is pragmatic. With a managed list of terms or categories, the content can be put in a pigeonhole. There may be drift, but the categories act as a red herring for other indexing flaws.

With the deteriorating financial climate, many search vendors will be forced to retrench or exit the business. Each week I hear rumors about companies that are either for sale, seeking investors, or preparing to close their doors. I will have to follow up with Hakia to see if the company still wants to challenge Google.

Stephen Arnold, April 12, 2009

YAGG: Google Japan Scrambled Results

April 12, 2009

Network World reported the Google Blogoscoped story about “Google Japan Bug Showed Gibberish Results” here. For me, the key sentence in the news story was:

According to sources cited by Asiajin.com, the garbled pages sent to Japanese users (and some users of other languages) did not have a correct UTF8 character encoding…

Is this Yet Another Google Glitch? Hard to say but apparently some of the Google users in Japan thought the results were in Turkish.
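
For readers wondering how an encoding slip can make Japanese text look like some other language, here is a minimal sketch. It is my own illustration, not a reconstruction of the Google bug: the function names are hypothetical, and the choice of ISO-8859-9 (the Latin alphabet variant used for Turkish) as the "wrong" codec is purely an assumption made to show the effect. The idea is simply that UTF-8 bytes read with a single-byte codec turn into a run of accented Latin letters and control characters.

```python
# Illustrative sketch only. The codec choice (ISO-8859-9) is an assumption,
# not a claim about what Google's servers actually sent to users.

def garble(text: str, wrong_codec: str = "iso8859_9") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)


def repair(garbled: str, wrong_codec: str = "iso8859_9") -> str:
    """Undo garble(): reverse the wrong decode, then decode as UTF-8."""
    return garbled.encode(wrong_codec).decode("utf-8")


if __name__ == "__main__":
    query = "検索結果"  # "search results" in Japanese
    mangled = garble(query)
    print(repr(mangled))    # each kanji becomes three single-byte characters
    print(repair(mangled))  # the original text is recoverable in this toy case
```

Plain ASCII passes through untouched, which may be why a glitch like this would hit Japanese pages much harder than English ones.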

Stephen Arnold, April 12, 2009

Appearances Are Deceiving

April 11, 2009

Peter Kafka has a wonderful post here. The title tells the tale: “AP Exec: ‘To the Untrained Eye It Looks Like We’re Stupid.’” Do you think? In my opinion, the most interesting comment in the article was this paragraph:

On the confusing message that the AP presented to the world this week: Guilty as charged, says Kennedy [an Associated Press senior manager]. But he argues that his group has indeed given some thought to what it’s doing, even if it hasn’t communicated that clearly to date.

Wow. No, I don’t think appearances are deceiving in this case. What you see is what you get.

Stephen Arnold, April 11, 2009

Wired Explains Why the Children of Publishers Are a Problem

April 11, 2009

Now the remaining hands at the downsized Wired did not say that. I wrote a headline that expresses why the dead tree crowd is paddling against the current at Niagara Falls. First, click to this Wired story here: “Teens Love Aggregation and ‘Free’, Newspaper Study Finds”. Second, consider this snippet from the article:

“Not only are teens not rushing to pay for content, but they also struggle to envision in what realm they would need to pay for content,” said the study, conducted for the NAA by Northwestern University’s Media Management Center. They are less interested in news brands than a site’s usability and depth of content. “Ask teens where they find news, and they typically say Yahoo!, Google, AOL or MSN,” the study said. “Sometimes, they mean Yahoo! and other times they mean Yahoo! News; sometimes they mean Google, the search bar, and other times they mean Google News or iGoogle. And sometimes they say MSN but mean MSNBC.com.”

The problem seems to be what I call demographic. The children of the traditional media giants are the termites in the old media’s business model. I wonder how the media companies will deal with that. Ground them and cut off online access? I heard a rumor that William Gates banned Apple iPhones and iPods from his house. I suppose that works too.

Stephen Arnold, April 11, 2009

Search Costs: Clouds Come Lower

April 10, 2009

IT-Analysis published Laurie McCabe’s “Will CPAs Bring the Cloud to Earth for SMBs?” You can read the story here. The hook for the story was CPA chatter. I would imagine that “chatter” to a CPA is fairly tame stuff, but I may be wrong. MBAs were once considered harmless, but since the financial meltdown, MBAs are downright lethal. The write up is about two accounting groups’ decision to support Intacct for their customers. I never heard of Intacct, but I assume QuickBooks has. Ms. McCabe wrote:

Not only does this alliance pose a strong threat to Intuit QuickBooks’ dominance in the small business accounting market, it has the potential to pull SMBs into cloud computing in vast numbers. Intacct, AICPA and CPA2Biz did a lot of homework beforehand, including research that showed online accounting solutions boost productivity by as much as 50%. By dramatically reducing the need for travel, and the necessity of exchanging paper and email files, CPAs have more time to spend providing guidance to clients to help them improve financial performance and decision-making.

Too bad for QuickBooks, but the green eyeshade set believes that cloud-based applications like accounting make financial sense. Do you think? When the bean counters figure out how to save money, it makes little difference what the info tech folks say. Blossom.com, one of the most successful cloud search vendors, is probably quite happy with the CPAs’ newfound ability to see the clouds.

Stephen Arnold, April 10, 2009

Google Apps: Googzilla’s Fangs

April 10, 2009

ComputerWorld has an important story here. The url is a Dusie so click quick. I sense a 404 in your future if you delay. The title “Google Working to Add Every Last Service to Apps” is a categorical affirmative. If you recall your college logic class, categorical affirmatives are tough to make stick, particularly when these are applied to the GOOG. The subtitle is the ballpeen hammer: “Exec Offers Up Plans in Colorful Tweet.” Google reveals that it will attack the enterprise with the muscular App Engine, not the kick-sand-in-its-face Google Search Appliance. Instead of a news conference in New York, the GOOG sends out a Twitter message. I think it is safe to say that the GOOG is banking on the demographics of the Twitter generation to get the message. ComputerWorld’s writers quoted various gurus as allegedly saying:

“While this strategy creates a certain ‘shock and awe’ factor in the developer and geek world, this still leaves certain large enterprise requirements unanswered, such as role-based administration and records management capabilities,” he said. “I think this strategy strengthens Google Apps within its core constituency — the [small and midsize business] market. SMBs will love the increasingly Swiss Army knife capabilities of Google Apps.”

My thought is that Google’s enterprise search group knew exactly what it was doing. Furthermore, Google’s demographic card is a component of the surround and seep strategy. Traditional marketers are not Googley. In my opinion, Google is content to blaze its own trail to the enterprise and the crown jewels of IBM, Microsoft, and Oracle. Just my opinion.

Stephen Arnold, April 10, 2009

Digg Search Changes

April 10, 2009

I don’t pay much attention to Digg.com. I checked the site last week, found an annoying toolbar, and exited the site. The Digg.com Web log here reported that Digg.com has a new search system. I can’t determine whether this is a home grown system, a home grown system plus Microsoft, or a Microsoft based system. If you know the ingredients, let me know. The new system, according to the Digg.com Web log, is “99.987% Less Suck.” Among the enhancements are support for bound phrases, facets, Boolean NOT, and better performance. I ran a couple of queries and observed:

  • For my query “Google patents”, the results relevance ranking was odd. My Google patent index was listed fourth. What troubled me was that the top ranked results dated from 2006. I could not see a one click method to sort by date. Google Products offers this feature as does my Exalead index of Google Web logs.
  • I ran a query for the blog tool du jour, Squarespace. There were results, but the top ranked result dated from 2005. Inspection revealed that there were meatier and potentially more relevant results deep in the results list.
  • I ran a query for a story from 48 hours ago–Eric Schmidt and the speech before the newspaper association. I ran this as a free text query: eric schmidt newspaper association. Bingo. Happiness.

My conclusions are that this search implementation is okay. I will probably stick with free text queries and skip the Boolean. With tweaking, Digg.com’s “less suck” search solution strikes me as pretty good. I will check it out in four or five months. Now about that toolbar?

Stephen Arnold, April 10, 2009

Mixed Digital Arts Show Down: Google vs Photographer

April 10, 2009

The Telegraph’s story “Google Street View Cameraman in Row with Photographer” is a classic in my opinion. You can read the dead tree full text version here. The nub of the story is that a Googler was allegedly driving one of those Googley camera vehicles. The idea was to take pictures for Google’s Street View. The Googler allegedly saw a person with a camera photographing the Googzilla-mobile. The Googler driving the car spoke with the individual who was taking pictures. The Telegraph reported:

The Google driver then proceeded to shout at the photographer and said: “Don’t you take pictures of me, mate.” He then asked the photographer to blur his face out of the pictures as Google does in its Street View images. The photographer managed to get about six to eight photographers of the car which had a pole-mounted revolving camera protruding from the top.

Wow. I wonder if I will need a bodyguard when I give a talk that describes what I have learned about Google from open source information. Perhaps I can wear my bunny rabbit ears? No one picks on a bunny.

Stephen Arnold, April 10, 2009

Bob Boiko, Exclusive Interview

April 9, 2009

The J Boye Conference will be held in Philadelphia, May 5 to May 7, 2009. Attendees can choose from a number of special interest tracks. These include strategy and governance, Intranet, Web content management, SharePoint, user experience, and eHealth. You can get more information about this conference here.

One of the featured speakers is Bob Boiko, author of Laughing at the CIO and a senior lecturer at the University of Washington iSchool. Peter Sejersen spoke with Mr. Boiko about the upcoming conference and information management today.

Why is it better to talk about “Information Management” than “Content Management”?

Content is just one kind of information. Document management, records management, asset management and a host of other “managements” including data management all deal with other worthy forms of information. While the objects differ between managements (CM has content items, DM has files, and so on) the principles are the same. So why not unite as a discipline around information rather than fracture because you call them records and I call them assets?

Who should be responsible for the information management in the organization?

That’s a hard question to answer outside of a particular organizational context. I can’t tell you who should manage information in *your* organization. But it seems to me in general that we already have *Information* Technology groups and Chief *Information* Officers, so they would be a good place to start. The real question is: are the people with the titles ready to really embrace the full spectrum of activities that their titles imply?

What is your best advice to people working with information management?

Again, advice has to vary with the context. I’ve never found two organizations that needed the same specific advice. However, we can all benefit from this simple idea. If, as we all seem to believe, information has value, then our first requirement must be to find that value and figure out how to quantify it in terms of both user information needs and organizational goals.  Only then should we go on to building systems that move information from source to destination because only then will we know what the right sources and destinations are.

Your book “Laughing at the CIO” has a catchy title, but have you ever laughed at your CIO yourself?

I don’t actually. But it is always amazing to me how many nervous (and not so nervous) snickers I hear when I say the title. The sad fact is that a lot of the people I interact with don’t see their leadership as relevant. Many (but definitely not all) IT leaders forget or never knew that there is an I to be led as well as a T. It’s not malicious; it has just never been their focus. I gave the book that title in an attempt to make it less ignorable to IT leaders. Once a leader (or would be leader) picks the book up, I hope it helps them build a base of strength and power based on the strategic use of information as well as technology.

Why are you speaking at a Philadelphia web conference organized by a company based in Denmark?

Janus and his crew are dynamite organizers. They know how to make a conference much more than a series of speeches. They have been connecting professionals and leaders with each other and with global talent for a long time. Those Danes get it and they know how to get you to get it too.

Peter Sejersen, J Boye. April 9, 2009

Why Traditional Media Companies Cannot Innovate

April 9, 2009

Eric Schmidt suggested that newspapers innovate to generate revenue. A reader sent me a link to an essay called “Startup #119: Why Startup Innovation Kicks Corporate Booty” by Joseph Ansanelli here. I found this write up quite good. I downloaded it and printed it out. I think I will be able to reference it in my upcoming iBreakfast talk about Google’s newspaper “issue” in New York on April 23, 2009.

Mr. Ansanelli hits the nail on the head. Technology is not the problem. What is? Mr. Ansanelli identifies three factors: people, freedom, and failure. In my opinion, a large media company is a political animal fueled by soft skills. Instead of figuring out technology, the media wizards talk about color. Color is important, as is design. The problem is that innovation in a media company is different from the type of innovation one finds in an engineering centric start up.

Mr. Ansanelli wrote:

You need to invest money in lots of projects and only a few will succeed.  Corporations cannot typically afford to do this.  Which is why the most common route for successful innovation for large corporations is through acquisition of these companies.  It is far less expensive and risky to acquire an ongoing business that has proven itself then to invest in the 50 different ideas to try and find one that works.

With newspapers starved for cash, in my opinion, innovation is going to find itself starving for cash too. With no money, innovation will die from a lack of oxygen.

Stephen Arnold, April 9, 2009
