The Arnold Columns: May 2011

May 3, 2011

It is that time again. Four columns this month and for cash money. Every time I get a check I think of PT Barnum. The topics I tackled this month required research, thought, and some wordsmanship. This blog, on the other hand, is a record of the items that strike me as interesting. I have help converting my snips into write ups. If you want to know who works on this Beyond Search blog, check out the new Author tab available from the Beyond Search splash page.

So what did real publishers instruct me to cover or, in some cases, allow me to explore? Here’s the line up. Keep in mind that you will have to either get a hard copy of the publishers’ outputs or find my work on the publishers’ Web site. In one case, that could take you a day or two. Search is really easy when folks responsible for search don’t use their own search system. Such is life.

  • ETM (Enterprise Technology Management, published by ISIGlobal.com), “Google’s Management Change and the Enterprise”. The idea is that Google is making significant management changes and, either intentionally or unintentionally, sending signals that indicate the enterprise unit is not part of Larry Page’s inner circle. I hope I am wrong, but if enterprise were the key to the firm’s future, I think the management shake up would have added an olive and a dash of bitters to the enterprise group. What I saw was several squirts of cold water.
  • Information Today, which is technically a newspaper, “When Key Words Fail, Will Predictive Search Deliver?”. The write up uses Recorded Future, funded by the CIA and Google, as a case example. The main idea is that semantic technology has to step up because the volume of data facing a worker and the worker’s diminished appetite for research require software to be smarter.
  • KMWorld, “SharePoint Governance: Is Semantic Technology the Answer?”. My team has been immersed in things semantic. What our work revealed is that the baloney word governance really means indexing and editorial policies. The article provides some links to useful resources and then reminds the reader that putting the information horse back in the barn when the barn is on fire can be tough.
  • Online Magazine, “Rob ROI: Open Source and Technology Costs.” I apologize for the literary license and for my assumption that readers will know about Sir Walter, the Waverley novels, and Rob Roy. The thrust of the write up is that open source software reduces some costs but not every cost. As a result, poor budgeting for open source software can yield the same ROI killing overruns that plague commercial software. Don’t agree with me? Sigh.
  • Smart Business Network, a series of city business magazines and a Web site, “Coupon Monsoon: Downpours of Digital Deals.” The focus of the write up is the deluge of deals, coupons, and discounts. The problem with most of these services is building an audience and delivering offers that make sense to customers and merchants. I answer the question, “Should your business use coupons?”

Every two or three years I gather up these for-fee outputs and slap them in the ArnoldIT.com archive. However, you cannot rely on me to be much of an information professional. I can barely write these outputs. Organizing and archiving—beyond my skill set. Subscribe to these publications. The information in my for-fee columns is different from the Web log’s.

Stephen E Arnold, May 3, 2011

Not free. I am paid for columns so this write up is a shameless commercial promotion.

Exalead Embraces SWYM or “See What You Mean”

May 3, 2011

In late April 2011, I spoke with Francois Bourdoncle, one of the founders of Exalead. Exalead was acquired by Dassault Systèmes in 2010. The French firm is one of the world’s premier engineering and technology products and services companies. I wanted to get more information about the acquisition and probe the next wave of product releases from Exalead, a leader in search and content processing. Exalead introduced its search based applications approach. Since that shift, the firm has experienced a surge in sales. Organizations such as the World Bank and PriceWaterhouseCoopers (IBM) have licensed the Exalead Cloudview platform.

I wanted to know more about Exalead’s semantic methods. In our conversation, Mr. Bourdoncle told me:

We have a number of customers that use Exalead for semantic processing. Cloudview has a number of text processing modules that we classify as providing semantic processing. These are: entity matching, ontology matching, fuzzy matching, related terms extraction, categorization/clustering and event detection among others. Used in combination, these processors can extract arbitrary sentiment, meaning not just positive or negative, but also along other dimensions as well. For example, if we were analyzing sentiment about restaurants, perhaps we’d want to know if the ambiance was casual or upscale or the cuisine was homey or refined.

When I probed about future products and services, Mr. Bourdoncle stated:

I cannot pre-announce future product plans, I will say that Dassault Systèmes has a deep technology portfolio. For example, it is creating a prototype simulation of the human body. This is a non-trivial computer science challenge. One way Dassault describes its technology vision is “See-What-You-Mean”. Or SWYM.

For the full text of the April 2011 interview with Mr. Bourdoncle, navigate to the ArnoldIT.com Search Wizards Speak subsite. For more information about Exalead, visit www.exalead.com.

Stephen E Arnold, May 3, 2011

No money but I was promised a KYFry the next time I was in Paris.

Shakespeare, a Real Trendsetter

May 1, 2011

This short item is not strictly about search, but it provides some insight into language which wizards are working overtime to get computers to understand.

The old saying “history repeats itself” continues to hold true. According to the Phrases.org article “135 Phrases Coined by William Shakespeare” the famous poet can actually be called a trendsetter. We learn:

“Barry Manilow may claim to write the songs, but it was William Shakespeare who coined the phrases. He contributed more phrases and sayings to the English language than any other individual – and most of them are still in daily use.”

Who would have guessed that the popular singer Manilow has a little Shakespeare in him? Phrases such as “All that glitters is not gold,” “As dead as a doornail” and “Good riddance” can be found in some form in books, songs and even in movies. Even if Shakespeare is not actually responsible for coming up with all of those lines, the fact that he was able to leave his mark on so many people, even those who have never read or even heard of his literary works, is amazing in itself. Imitation is the truest form of flattery.

Alice Holmes, May 1, 2011

Freebie unlike a college text containing Will’s complete works

SEO Revealed. Exclusive Interview with Peter Niemi

April 26, 2011

An interesting challenge faces search engine optimization experts. Charging hefty sums, SEO experts now have to cope with demanding clients and Google’s increasingly aggressive efforts to improve the relevance of its search services. After my talk in Manhattan at the end of March 2011, I was able to interview Peter Niemi, founder of GHG Interactive, the marketing arm of Gray Interactive. In our hour long conversation, Mr. Niemi said:

The SEO experts are reeling from Google’s crack down on gaming the Google relevance system. Some SEO professionals react poorly to evidence that they may not be the smartest guys in the room as we saw…The majority of Web sessions commence with search. That’s where the eyeballs start, but not where they end. That’s the first problem, the lack of persistence. How do you get a customer if the customer never comes back or forgets you in a second or two?

He continued:

The second problem is that SEO is a commodity. Everyone is doing it to some degree, from the smallest blog to the biggest consumer brand site. SEO requires constant managing to achieve consistent success. In the last couple of years, more and more effort seems to be needed to keep one’s head above water. Market forces, competition, and changing technology require marketing professionals to revisit our campaigns more and more often. Search media agencies charge nice monthly fees to perpetuate what I call a “search arms race.” Google makes $28 billion a year off search engine marketing. In my experience, neither Google nor the marketers are motivated to challenge the status quo. Like investment banks, they make a good living off the status quo and change is not in their best interests.

With some Web sites struggling to reverse declining traffic, SEO is in the spotlight. To read the full text of the interview with Mr. Niemi, read “Google Squeezes SEO Experts: The Panda Choke Hold”.

If you wish to comment on this insightful interview, please, use the comments function for this Web log.

Stephen E Arnold, April 26, 2011

Freebie unlike some SEO mid-tier inputs

Ducks and Alphas: Wolfram Alpha and DuckDuckGo Unite

April 25, 2011

“Wolfram|Alpha and DuckDuckGo Partner on API binding and Search Integration,” touts Wolfram Alpha’s own blog. Both organizations have brought something unique to the Search universe, so we’re interested to see what comes of this. Will it be more agile than a cross between Google and Godzilla would be? (Googzilla?)

Wolfram|Alpha’s Computational Knowledge Engine not only retrieves data but crunches it for you—very useful, if you phrase your query well. Play with that here.

DuckDuckGo’s claim to fame is that they don’t track us; privacy champions like that. A lot. The site provides brief info, say from a dictionary or Wikipedia, as well as related topics at the top of the results page. It’s also blissfully free of advertising clutter. Check that out here.

According to the Wolfram Alpha blog, they are combining the Wolfram|Alpha functionality with the DuckDuckGo search:

So what does this new partnership mean for you? If you are a DuckDuckGo user, you’ll start to notice expanded Wolfram|Alpha integration. DuckDuckGo will start adding more Wolfram|Alpha functionality and datasets based on users’ suggestions. If there’s a specific topic area you’d like to see integrated into DuckDuckGo, your suggestions are welcome.

And for developers, DuckDuckGo will maintain the free Wolfram Alpha API Perl binding. With that, you can integrate Wolfram|Alpha into your application. Keep in mind that InQuira and Attensity are “products” of similar tie ups.
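For readers who do not work in Perl, here is a minimal sketch in Python of what integrating Wolfram|Alpha into an application can start with: building a request URL. It is not the DuckDuckGo-maintained Perl binding. The endpoint and the `appid` and `input` parameter names are assumptions based on Wolfram|Alpha's public v2 query API and should be checked against the current documentation; the function name and the app ID placeholder are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public Wolfram|Alpha v2 query API; verify before use.
WOLFRAM_QUERY_ENDPOINT = "http://api.wolframalpha.com/v2/query"


def build_query_url(question, app_id):
    """Build a query URL; app_id is the developer key issued by Wolfram|Alpha."""
    params = {"appid": app_id, "input": question}
    # urlencode escapes the query text (spaces become '+').
    return WOLFRAM_QUERY_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # "YOUR-APP-ID" is a placeholder, not a working key.
    print(build_query_url("population of France", "YOUR-APP-ID"))
```

The sketch stops at the URL on purpose. Fetching and parsing the response depends on the output format you request and on the terms attached to your developer key.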

We’ll enjoy watching the progress of this hybrid beast.

Cynthia Murrell, April 25, 2011

Freebie

LexisNexis Unveils Semantic Search

April 19, 2011

LexisNexis, a legal search engine, has added semantic search technology to its search engine, according to the Read Write Web article “LexisNexis Introduces Semantic Search.” The article states:

The next-generation semantic search technology identifies the meaning of multiple concepts within a single search query to help users zero in on core concepts faster and make fewer revisions to their search queries.

Semantic search works by utilizing the science of meaning in language to produce quality and relevant search results. The TotalPatent service will help legal services to do important patent research as well as detailed analysis of their results. The Visualize and Compare Tool is a notable addition that allows users to compare and analyze any two or three result sets or lists of patents, regardless of the underlying search mechanism.

The legal search engine system has received a surprising yet much needed powerful boost from a somewhat unexpected source. This powerful technology could drastically improve productivity. However, the expensive price tag is a huge road block and makes this new technology unapproachable for a lot of legal heads.

We did ask about pricing. The LexisNexis contact could not comment about pricing. We did ask about the source of the technology. The LexisNexis contact could not comment about the source of the technology.

Our take. LexisNexis is rolling out another service that may be out of reach of most users. LexisNexis has some interesting pricing models and fees. Will semantics get LexisNexis back on the revenue trajectory of the era before lawyers sued their universities and big firms cut back on their hiring? Reed Elsevier probably hopes this semantic technology will be a huge financial winner. Reed Elsevier (Ticker: REN) is about $9.50 a share. Believers may want to boost their holdings.

April Holmes, April 19, 2011

Freebie

The Semantic Web as it Stands

April 16, 2011

Semantic search for the enterprise is here, but the semantic web remains  the elusive holy grail.  “Semantic Web:  Tools you can use” gives an overview of the existing state of semantic technology and what is needed to get it off the ground as a true semantic web technology.

Tim Berners-Lee was the first one to articulate what the semantic web would be like, and his vision of federated search is still sorely missing from reality.  Federated search searches several disparate resources simultaneously (like when you search several different library databases at once).  Windows 7 supports federated search, but it is still not common throughout the web.  The W3C (World Wide Web Consortium) has developed standards to support semantic web infrastructure, including SPARQL, RDF, and OWL, and Google, Yahoo and Bing are starting to use semantic metadata and support W3C standards like RDF.
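A minimal Python sketch of the federated search idea described above: one query goes to several separate sources at the same time and the hits are merged into one list. The three in-memory sources and every title in them are hypothetical stand-ins for real library databases, and the substring match is a placeholder for whatever search each real source would run.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "sources" standing in for separate library databases.
SOURCES = {
    "catalog": ["Semantic Web Primer", "Search Engines in Practice"],
    "journals": ["Federated Search Review", "Semantic Web Journal"],
    "archives": ["Letters on Cataloging"],
}


def search_source(name, query):
    """Return (source, title) pairs whose title contains the query, ignoring case."""
    q = query.lower()
    return [(name, title) for title in SOURCES[name] if q in title.lower()]


def federated_search(query):
    """Send the same query to every source concurrently and merge the hits."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = [pool.submit(search_source, name, query) for name in SOURCES]
        merged = []
        for future in futures:
            merged.extend(future.result())
    # Sort so the merged list has a stable order regardless of which source answers first.
    return sorted(merged)


if __name__ == "__main__":
    for source, title in federated_search("semantic"):
        print(f"{source}: {title}")
```

The hard part in practice is not the fan-out shown here but reconciling results from sources that describe the same thing in different vocabularies, which is where the W3C standards come in.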

Semantic software is able to analyze and describe the meaning of data objects and their inter-relationships, while resolving language ambiguities such as homonyms or synonyms, as long as standards are followed.  This has practical applications with things like shopping comparisons.  If standards are followed and semantic metadata provided by the merchants themselves, online shoppers can compare products without all the inaccuracies and out-of-date information currently plaguing third-party shopping comparison sites.

There are some tools, platforms, prewritten components, and services currently available to make semantic deployment easier and somewhat less expensive.  Jena is an open-source Java framework for building semantic Web applications, and Sesame is an open-source framework for storing, inferencing and querying RDF data.  Lexalytics produces a semantic platform that contains general ontologies that can then be fine-tuned by service provider partners for specific business domains and applications.  Revelytix sells a knowledge-modeling tool called Knoodl.com, a wiki-based framework that helps a wide variety of types of users to collaboratively develop a semantic vocabulary for domain-specific information residing on different web sites.  Sinequa’s semantic platform, Context Engine, provides semantic infrastructure that includes a generic semantic dictionary that can translate between various languages and can also be customized with business-specific terms.  Thomson Reuters provides Machine Readable News, which collects, analyzes, and scores online news for sentiment (public opinion), relevance, and novelty, and OpenCalais, which creates open metadata for submitted content.
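Frameworks such as Jena and Sesame store and query RDF data as subject, predicate, object statements. The sketch below is plain Python, not the Jena or Sesame API, and it only illustrates that triple pattern applied to the shopping comparison case. The merchants, offers, products, and prices are all hypothetical.

```python
# Hypothetical merchant-published facts as (subject, predicate, object) triples.
TRIPLES = [
    ("shopA:offer1", "product", "camera-x100"),
    ("shopA:offer1", "price", 299.0),
    ("shopB:offer7", "product", "camera-x100"),
    ("shopB:offer7", "price", 279.0),
    ("shopB:offer8", "product", "tripod-t5"),
    ("shopB:offer8", "price", 45.0),
]


def match(subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def offers_for(product):
    """List (price, offer) pairs for one product, cheapest first."""
    offers = [s for (s, _, _) in match(predicate="product", obj=product)]
    priced = []
    for offer in offers:
        for (_, _, price) in match(subject=offer, predicate="price"):
            priced.append((price, offer))
    return sorted(priced)


if __name__ == "__main__":
    for price, offer in offers_for("camera-x100"):
        print(offer, price)
```

The comparison works only because both hypothetical merchants use the same predicate names and the same product identifier, which is the shared-vocabulary requirement in miniature.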

Despite all these advances for the use of the semantic web in the enterprise, general, widespread use of the semantic web remains elusive, and no one can predict exactly when that will change:

“In a 2010 Pew Research survey of about 895 semantic technology experts and stakeholders, 47% of the respondents agreed that Berners-Lee’s vision of a semantic Web won’t be realized or make a significant difference to end users by the year 2020. On the other hand, 41% of those polled predicted that it would. The remainder did not answer that query.”

Semantic technology for the enterprise is not only here today, but is growing by about 20% a year according to IDC.  That kind of semantic technology is a much smaller beast to tame.  When it comes to the World Wide Web, there is still not widespread support of W3C standards and common vocabularies, which is why more people said no than yes in the survey mentioned above.  Generalized web searches are difficult because each site has its own largely proprietary ontology instead of a shared and open taxonomy.  Sometimes even within an enterprise it is difficult to overcome differences in different sectors of the same business.

However, certain industries are starting to come under pressure from customers or industry and have responded by creating standardized ontologies.  GoodRelations is one such e-commerce ontology used by BestBuy.com, Overstock.com, and Google.  This kind of technique has not become widespread because of the costs and slow payoff involved.  This is a catch-22 where businesses don’t want to jump on the bandwagon because there is not a critical mass yet, but the real benefits won’t start until there is a large number of businesses participating.  Things like product categories are often unique to a business and getting some kind of universal standardization is akin to a nightmare, but there still needs to be consensus on using some type of W3C standards of categorization to satisfy customers.  And, with more and more bogus information proliferating on the web, semantics become not only convenient, but essential for finding the right information.

I think the fundamental question that this article leaves us with is whether or not we have the standards we need or whether the current standards are the stepping off point to something new.  SGML was fine in its day, but it didn’t get very far.  HTML cherrypicked some of the basic ideas of SGML and added linking and the World Wide Web was born.  Now HTML 5 is re-introducing some of the ideas of SGML that were lost.  Maybe HTML can continue to evolve, or maybe someone will cherrypick its best ideas and create something (almost) entirely new.  Another issue is all the work that it takes to create all the metadata, no matter what the standards.  Flickr and Facebook have made user tagging into a fun activity, but for the semantic web to really function, machines need to do most of the work.  Will this all be figured out by 2020?  Survey says no, but who knows?

Alice Wasielewski, April 16, 2011

Libraries Embrace Semantics

April 15, 2011

We came across a quite interesting article about semantics in the library market.

The world has become very dependent on search engine sites such as Google, but sites such as this offer very limited results. According to the Semanticweb.com article “Semantics in the Public Library,” introducing semantic Web technology into public libraries can help to bridge the information gap and build a new and better web. The article said:

“The worldwide web is very vocabulary dependent. Today’s Web search engines do not group web pages, pull out concepts, or understand them. There is no access to the deep Web.”

Though Google produces a seemingly unlimited number of results, it leaves the job half done. The semantic web can do more with the information and handle more complex databases as well as produce more structured results. Scopus is a semantic web search engine configured to handle a variety of complex queries and produce structured and easy to understand results. Semantics, though it seems like the perfect technology, is not yet a perfect science, and implementing the new technology is definitely easier said than done.

April Holmes, April 15, 2011

Freebie

Content with Intent Delivers Search and Sales Impact

March 28, 2011

Millions of content creators on the Internet must now tighten their output or face obscurity. As a result of a recent change in Google’s quality grading, writers and bloggers are scrambling. Luckily, something can be done. Stephen E Arnold, ArnoldIT.com, will be one of the speakers at “Google Changes the Rules” on March 30, 2011, in Manhattan at iBreakfast. The “content with intent” tag line is one that Mr. Arnold has used since his work on the Threat Open Source Intelligence Gateway, funded by an interesting government entity in the fall of 2001. He has refined the system and method for a number of clients worldwide. To see an example of the technique, navigate to Google, run the query “taxodiary” or “inteltrax” and follow the links. Your product or company can achieve similar sales and marketing impact in as little as one month. Unlike SEO, the content with intent method persists. Run a query on Google.com for “ssnblog”. This demo site has not been updated since April 30, 2011 and the content continues to be easily findable. Keep in mind that the Web site for each of these examples is one way to access the information. The method touches hundreds of findability services, including real time and social systems.

Most SEO delivers an expensive, often problematic, failure for clients with unrealistic expectations for an expensive, low traffic Web site. Source: http://www.lifepurposediscoverysystem.com/blog/uploaded_images/fear-of-failure-768216.gif

The shift that ArnoldIT’s “content with intent” approach manifests is an innovation driven by a high volume of lower quality online content and increasingly heavy handed SEO tactics.

Stephen E Arnold’s “content with intent” method works in a manner similar to a series of bursts, a digital MIRV. Source: http://www.rolfkenneth.no/NWO_review_Sutton_Soviet.html

In what appear to be increasingly desperate attempts to generate traffic to a Web site, search engine optimization experts have forced Google and other search systems like Blekko.com to take action. Going forward, search vendors will, like a strict teacher, scrutinize, “grade”, and flunk some online information.

Arnold says, “In effect, Google is like a college composition teacher. Grades of C, D, and F are not acceptable. Deliver A or B content or suffer the consequences.” “Does Google have an emotional investment in great writing?” asks Arnold. He answers his own question this way, “No, Google cares about ad revenue and lousy content could harm Google cash flow.”

The relationship between content producers and Google sounds grim at best. Fortunately, Steve Arnold, author of Google: A Digital Gutenberg and managing director of Arnold IT, recently provided four tips for moving out of “SEO hell”, where guessing and shoddy content are likely to yield decreasing traffic from major search engines like Google and systems which federate its outputs:

Read more

Attensity Europe Wins Innovation Award

March 25, 2011

In a news release, “Attensity Wins IT Innovation Award”, Attensity announced that its European unit (Attensity Europe GmbH) received recognition from Initiative Mittelstand. This group is focused on the advancement of pioneering information technologies and the firms responsible for their production.

The product singled out was Attensity Analyze for German. The application mines and organizes data from a broad selection of sources including media outlets, telecommunication records, and social content. One use of the system is to identify upsides and downsides of products or a company’s marketing programs. Glückwünsche Attensity!

Micheal Cory, March 25, 2011
