Egyptian Startup Kngine Bets on Semantic Search

June 27, 2011

The Next Web has a couple of interesting recent articles regarding startups in Egypt. First, they announce that “Four Egyptian Startups Are US-Bound for Funding.” The fledgling companies include a couple of mobile services providers, a hardware accelerator enterprise, and semantic search engine Kngine.

According to the write up:

Sawari Ventures, an international venture capital firm, is behind the concept, and is supporting the four Egyptian startup companies as part of its efforts ‘to identify, serve, and provide capital for extraordinary entrepreneurs who are determined to change the MENA [Middle East/North Africa] region.’

We applaud the effort and wish all the startups luck; nothing boosts stability like successful businesses.

We, however, are particularly interested in Kngine, a semantic search provider, said to have already attracted an international following.

We keep asking, “Is semantic search the next big thing?”

Investors and influential blogs like The Next Web track the space closely; for example, see the excellent write up “Semantic Web, Meet Middle East. Middle East, Meet Kngine!”

Revealing that Kngine is the first Middle Eastern semantic search engine, the article voices confidence in the product:

“The engine, while Middle Eastern focused, also works great on various global and international topics and can provide on-the-spot suggestions, related results and even calculates the average city weather per month. While it’s no WolframAlpha, Kngine has been entirely created by a two-person team. It could be a great Google/Wiki search alternative if you’re looking for quick and fast information, especially if it’s Middle Eastern related.”

Especially impressive are the robust support of complex queries and the ability to recognize Arabic. Though the results won’t be displayed in that language for another six months, the engine can connect Arabic words with their English equivalents.

You can take a tour of the service here.

Cynthia Murrell, June 26, 2011

You can read more about enterprise search and retrieval in The New Landscape of Enterprise Search, published by Pandia in Oslo, Norway, in June 2011.

D4 and RiverGlass Join eDiscovery Forces

June 27, 2011

As announced on PRWeb in “D4, LLC, Partners with RiverGlass, Inc. Enabling Progressive Enhancements to D4’s eDiscovery Service Offerings,” the two companies have signed an agreement to form a strategic partnership for D4 to distribute, install and host the RiverGlass solutions.

D4 focuses on litigation support and eDiscovery services to law firms and corporate law departments. RiverGlass, Inc. is a provider of advanced information collection and analysis solutions focusing on government agencies, as well as eDiscovery and risk management applications to major corporations. The write up said:

D4’s highly technical method to eDiscovery and digital forensics leverages the maximum benefits available from the RiverGlass application.

With the solution:

Customers can harvest from many different types of data stores and ingest ESI in native format without having to have it processed. This includes network stores, SharePoint sites, websites, social media as well as structured databases.

This type of eDiscovery is blurring the lines between search and text analytics, creating a powerful tool for lawyers. It markedly improves the labor-intensive and mistake-prone legal discovery process.

Will eDiscovery go the way of customer support? What looks like a trivial exercise in using traditional search and retrieval for customer support is tough. Some of the vendors chasing customers in this segment are learning that customer support is more difficult than it appears. eDiscovery strikes me as having a higher level of complexity.

It is interesting to watch the shape shifting that is underway in the content processing sector.

Stephen E Arnold, June 27, 2011


ROI, Google, and the Revenue Imperative

June 27, 2011

I had another conversation with the owner of a Web site which has been slammed by Google’s Panda updates. Google’s cash machine is based on an idea that originated at Idealab’s GoTo.com years ago. When GoTo.com made its début, I was interested in the impact of paying for traffic. It struck me in 1998 that relevance as defined by the university information retrieval PhDs was a gone goose. Forget precision and recall. Sell an advertiser a click which would be displayed on the screen of an “average Internet user.” Close enough for horseshoes. Most Web searchers in 2001, when GoTo.com was floating its secondary offering, would not know how to figure out the provenance of a Web site. A search for “car rental” was good enough if it displayed links or ads to Avis or Hertz. Easy, quick, and, as I said, “Good enough.”

Flash forward to the world since Google. In the US, most consumers of digital information continue to struggle with figuring out if a hit is a straight arrow or bent like a bonsai tree.

image

It sure looks natural, but the entire tree is artifice. The same applies to “free” Web search results and content.

You will be surprised to learn that I am not writing about Panda. I am writing about why Panda is important. [a] Panda is designed to clean up the Google index so that ads become more useful; advertisers grouse when their ads are lost in a Mississippi flood of clutter. [b] Google is not having much success generating significant revenue from its other products and services. You don’t need me to point out that Android is predicated on Google having a bobsled run to display search results – actually ads – to the millions of mobile users. If you think determining provenance of an alleged “news story” or “white paper” is tough on a desktop device, the mobile device makes the exercise even more difficult. In fact, our work on The New Landscape of Search makes it clear that even for purpose-built search systems, users are pretty inept when it comes to finding and knowing how to separate the knowledge goose feathers from the giblets. (I don’t remember who coined that memorable phrase.)

Read more


Gannett a Gone Goose?

June 26, 2011

Physorg.com announces that “USA Today Publisher Gannett Cuts 700 Jobs.” Just what we need now, hundreds more unemployed.

Despite the headline, USA Today itself is not involved. Gannett is the largest news chain in the U.S., and the layoffs will hit a multitude of local papers. The article states:

Like other US newspapers, Gannett has been grappling with declining print advertising revenue, falling circulation and the migration of readers to free news online. Robert Dickey, the head of Gannett’s US Community Publishing division, said in a memo to employees that the layoffs were necessary to ‘align our costs with the current revenue trends.’

Yeah, we’ve heard it before. I’d like to see what the executives make. Also, why spare the national rag?

Stephen E Arnold, a former vice president at the “old” Pulitzer Prize-winning Courier Journal, hinted that the Courier Journal was once one of the top 25 newspapers in the world.

Since Gannett’s purchase of the paper, the CJ has tumbled from its lofty perch. Steve thought that Barry Senior would have gone ballistic. In his own genteel way he would have put the quality back in the paper. He would have motivated staff, officers, and others by reminding everyone that quality was what made the paper great. He would have passed the word to others, including the US president, various elected officials, and his pals at the New York Times.

When large corporations gobble up local media, the quality goes out and the buzz dies.

Now the future of Gannett becomes visible. Business wizards with systems that release stories online without the value adding that extends, complements and enhances old fashioned high value writing. Where did those databases go? Good question. The gone goose’s gone geese.

Cynthia Murrell, June 25, 2011

From the leader in next-generation analysis of search and content processing, Beyond Search.

Search Gets Its Own Letterman List

June 26, 2011

We’re delighted, though not surprised, to see that our own Stephen E. Arnold has made Norconex’s list of the “Top 10 Enterprise Search Must Reads.”

The write-up explains,

“Looking to hone up your enterprise search skills, but having a hard time figuring out where to turn? Well, maybe we can help. . . . We’ve been through just about all of [the general enterprise search books in print], so we thought we’d put together a list of our favorites, for colleagues and clients alike. A few of these books aren’t specifically enterprise search related, but we feel are still quite relevant to the field.”

Start shopping now or tuck the list in your Useful Information folder for later. The list contains some solid jokes too. Will the list make it to Letterman?

Cynthia Murrell, June 26, 2011


Visual Search with 3D from GE

June 25, 2011

A new Google search technology could change the way people use the Internet. According to the Search Engine Watch article “Visual Health Search,” Google recently launched its Google Body Browser.

Users have the ability to view a 3D layered model of the body. The article asserted:

“The body can be turned, manipulated, and literally “dissected” down to the vascular level to see how its functions work and connect. This kind of detailed information gives users access to the human body that’s never been available before and goes a long way in promoting a level of understanding that can help people make better informed decisions about their health.”

Healthline Networks in conjunction with GE Healthymagination and Visible Productions have also introduced Healthline BodyMaps. We learned: 

“This new tool layers search on top of a 3D anatomical model, and allows people to navigate male and female anatomy, view systems, and organs and explore how the body works.”

Not only can patients get a more in-depth look at and understanding of medical conditions, but physicians and other health care providers will be able to use it to help explain medical conditions, procedures, and so on to patients. Visual Search and 3D could be the new dream team.

A couple of thoughts. Google seems to have terminated its electronic medical record project. And didn’t GE design the Fukushima nuclear facility? Visual search might be less challenging and have a higher upside.

Cynthia Murrell, June 25, 2011


Beyond but Focused on the Semantic Web

June 25, 2011

Hmm, here’s another “beyond search,” but not from us.

Ronnieo5’s Blog entry “Semantic Web: Internet beyond Search and Social” explains a couple of problems with the semantic Web trend:

Semantic web leverages ontologies and meta-data to build paradigms of user online behavior and customizes the internet experience according to the user. Thus Semantic web moves users very quickly toward a world in which the Internet is showing us what it thinks users to see, but not necessarily what users need to see.

He continued:

Thus, the same Google search performed by two different users could turn up entirely different results, as the search giant tweaks its suggestions on each individual’s behavior. Personalization can also require sacrificing privacy: customization works best when users are willing to hand over data about what they click, how long they spend reading it, what sites they follow, and more.

We are not sure semantics is the future, and these are two reasons why. Searches that return different information depending on who and where you are can keep you from seeing the whole picture. Also, privacy sacrifices are a sore subject worldwide.

Won’t there be blowback on both counts as semantic Web searches spread? If Google testifies before Congress, how will the company explain its semantic and predictive methods? Just search won’t do the job any longer. Algorithms now require “social” graces.

Cynthia Murrell, June 25, 2011


Google Begins the Path of Penance: A Page Is Turned

June 24, 2011

Countries can be so annoying. The purpose of 21st century American business is to generate revenue. The other ideas about what business is about are like scabby knaves. The sleeping policemen in America are elected officials, hired aides, and assorted hangers-on.

I read in my hard copy of the Wall Street Journal, which actually arrived dry and early this morning, the story “FTC to Serve Google with Subpoenas in Broad Antitrust Probe.” For a short time, you may be able to read this lengthy write up online. Wait too long, and you will either have to pay to access the story or schlep to the local library to see if it has sufficient funds in the post-Google world to have a subscription to Mr. Murdoch’s version of the New York Times. One interesting passage, in my opinion, is:

Google is quickly expanding its array of services that seek to directly answer users’ queries, departing from its original strategy of sending them quickly to the most relevant site. Since 2009, for example, Google has directed people who search for mortgages or credit cards to its own marketplace for such offers.

The key point in the write up was that some folks seem to think that Google fiddles search results, intentionally or unintentionally, to make its one trick pony leap through a ring of fire and walk backwards. These are allegations. News aggregators are overflowing with comments from hither and yon, from other media giants to the search engine optimization crowd.

I have written three monographs about Google, along with posts in this blog and columns in KMWorld and Enterprise Technology Management, and I have referenced Google in my lectures. I declined to be interviewed by Ken Auletta, which earned me the accolade “gruff.” I played word games with Viacom’s legal eagles until the top dog figured out I wasn’t going to talk about the GOOG.

I will stick to my policy but I can offer three observations which you are free to ignore, consider, or comment upon in the comments section of this blog.

First, other than creating a distraction from significant economic challenges in the US and global issues in more than 60 countries, will the “investigation” deliver something other than:

  • A news moment
  • A fee fest for attorneys and advisors
  • Endless explanations to our elected officials who use Google daily to find out what’s happening without understanding how Google works?

My view is, “Nope.”

Second, will those who allege Google is like one of those crooked card dealers in 1950s westerns get their traffic back, generate more revenue from Google AdWords, or shine more brightly in the SEO firmament?

My view is, “Nope.”

Third, will Google reveal its systems and methods, explain how sites stumble through an algorithmic tripwire, or point out how a “janitor” tidies certain values in a table for a Google process?

My view is, “Nope.”

Google, like any corporate entity, has a life cycle. Google is at the rite of passage for young warrior companies. The firm and its youthful chief must endure the Months of Interrogation followed by the Path of Penance.

At the end of the journey, the youthful chief will emerge stronger and wiser. And what about the aggrieved Webmasters, SEO charlatans, and legal eagles? Pretty much the same.

Stephen E Arnold, June 24, 2011

Sponsored by ArnoldIT.com, the resource for enterprise search information and current news about data fusion

Five Reasons Why SEO Is Going to Lead to Buying Traffic

June 24, 2011

This week I have engaged in five separate conversations with super-bright 30-somethings. The one theme that made these conversations like a five-act Shakespearean comedy was SEO or search engine optimization. The focus is on getting traffic, not building a brand or contributing to a higher value conversation.

Google continues to entertain search circus goers with its trained Pandas. These Pandas do some interesting things; for example, the gentle mouthing of the word “panda” causes heart palpitations among the marketers whose jobs depend on boosting Web traffic. Let’s face it. Most Web sites don’t get too much traffic. One company which I am reluctant to name was excited to tell me it had 800 unique visitors in May 2011.

image

Move the world? Maybe. Move a nail salon’s Web traffic? Probably a tough job.

Okay. No problem if the 800 visitors were the global market for the firm’s product. But the 800 included robots, employees, consultants, and the occasional person looking for this firm’s specific type of archiving software.

With Web site costs creeping upwards, bean counters want to know what the money is delivering. The answer in many cases is, “More costs.”

Not good news for expensive, essentially unvisited Web sites. The painful fact of life is that among the billions of Web pages, micro sites, blogs, and whatever else has a URL, most get lousy traffic.

Archimedes, by way of Yale, said, “Give me a lever big enough and I’ll move the world.” The world? Maybe. Traffic to a vacuum cleaner repair shop in Prospect, Kentucky? Not a chance.

Pumping up traffic to a tire store or a nail salon or even a whizzy Internet marketing company is a tough job. I gave up on traffic after we did The Point (Top 5% of the Internet) right before we sold the property to CMGI. What the heck was traffic? What could or should one count? Robots? Inadvertent clicks?

That experience contributed to my skepticism about reports about how many visitors a site has.

“Google Quietly Launches Panda Update Version 2.2” is a good write up about the fearsome Panda. Like A Nightmare on Elm Street, the Panda keeps on coming, wrecking weekends for traffic-crazed marketers. Bummer. I learned:

Supposedly, one thing Google was going to address with Panda 2.2 is the issue of scraper sites – websites that republish other people’s content on their own site, usually making money from Google AdSense in the process – outranking content originators. As Frank Watson noted, "Google created the mechanism that clogs its own data centers and overwhelms its own spam battlers."

Ah, Google as the prime mover and its nemesis.

Now the five reasons:

  1. Google will offer sites a way to get traffic. Buy more AdWords. Simple.
  2. Traditional Web sites are not the preferred way to get information in some demographic segments; for example, those under 20.
  3. Social networks are not only better than results lists; social networks are curated. Selection is better than relevance determined by tricks.
  4. Content is proliferating, so brute force indexes are having to take shortcuts to generate outputs. Those outputs are becoming less and less useful because other methods of finding are fresher and more likely to be on target.
  5. Users don’t know or care about the provenance of certain types of content. Accuracy? Who has time to double or triple check? Uncurated results can be spoofed.

A tip of the hat to the SEO experts. Most of the relevance problems in the major brute force indexes are directly attributable to both the indexing companies and the SEO professionals.

So what about the users? Eureka. Ask one’s social network, Facebook.

Stephen E Arnold, June 23, 2011

