Beyond but Focused on the Semantic Web

June 25, 2011

Hmm, here’s another “beyond search,” but not from us.

Ronnieo5’s Blog entry “Semantic Web: Internet beyond Search and Social” explains a couple of problems with the semantic Web trend:

Semantic web leverages ontologies and meta-data to build paradigms of user online behavior and customizes the internet experience according to the user. Thus Semantic web moves users very quickly toward a world in which the Internet is showing us what it thinks users want to see, but not necessarily what users need to see.

He continued:

Thus, the same Google search performed by two different users could turn up entirely different results, as the search giant tweaks its suggestions on each individual’s behavior. Personalization can also require sacrificing privacy: customization works best when users are willing to hand over data about what they click, how long they spend reading it, what sites they follow, and more.

We are not sure semantics is the future, and these are two reasons why. Searches that return different information depending on who and where you are can keep you from seeing the whole picture. Also, privacy sacrifices are a sore subject worldwide.

Won’t there be blow-back against both of these concerns as semantic Web searches spread? If Google testifies before Congress, how will the company explain its semantic and predictive methods? Just search won’t do the job any longer. Algorithms now require “social” graces.

Cynthia Murrell, June 25, 2011

From the leader in next-generation analysis of search and content processing, Beyond Search.

Semagix: Off the Semantic Radar?

June 19, 2011

A reader sent us a question about Semagix, a semantic search engine. Here’s what we’ve found.
One of the founders was Amit Sheth, whose bio reads:

“Amit Sheth is a Professor at the University of Georgia and CTO of Semagix, Inc. He started the LSDIS lab at Georgia in 1994. Earlier he served in R&D groups at Bellcore, Unisys, and Honeywell. He founded his second company, Taalee, in 1999 based on technology developed at the LSDIS lab, and managed it as CEO until June 2001. Following Taalee’s acquisition/merger, he currently serves as CTO and a co-founder of Semagix, Inc. His research has led to three significant commercial products, several deployed applications and over 150 publications”

According to Bloomberg Businessweek, the company is alive and kicking:

“As of June 28, 2006, Semagix, Inc. was acquired by Fortent, Inc. Semagix, Inc. provides semantic metadata management solutions to enterprises and government institutions. It provides know your customer and due diligence products. The company’s know your customer solution allows financial institutions to identify, verify, and enhance due diligence processes. It has offices in the United States and Europe. Semagix was founded in 1996 as Protégé, Ltd. and changed its name to Semagix, Inc. in 2002.”

The company’s Web site is at www.semagix.com.
If anyone has more info, please use the comments section of this blog to amplify these sparse findings.

Cynthia Murrell, June 19, 2011

Sponsored by ArnoldIT.com, the resource for enterprise search information and current news about data fusion

Need Your Own Mini Watson?

June 18, 2011

Tony Pearson at IBM’s developerWorks explains “IBM Watson—How to Build Your Own ‘Watson Jr.’ in Your Basement.” Well, if we can build Watson in our basement, why do we need IBM?

This famous innovation is uniquely positioned for copying. Unlike many other prototypes in the tech universe, Watson:

. . . was made mostly from commercially available hardware, software and information resources. As several have noted, the 1TB of data used to search for answers could fit on a single USB drive that you buy at your local computer store.

Pearson actually describes how to build what he calls “Watson Jr.”, a scaled-down version of the Jeopardy-winning A.I. The servers for the original, though as efficient as possible, require a LOT of electricity. To make the project more manageable for hobbyists, this version sacrifices game strategy and search optimization and makes speech synthesis optional.

Step-by-step directions describe exactly how to develop your own little answer machine. You can put it right next to that Zenith STOL CH 801 airplane you built and could not get out of your workshop.

Cynthia Murrell, June 18, 2011


Expert System Is on the Move

June 15, 2011

The way consumers and enterprises are accessing information is changing. Not only is there a need to access and manage information stored in the traditional internal sources, but organizations must be able to effectively manage and capture intelligence from the streams of information coming in from every direction. Without semantic technology, traditional enterprise search is unable to extract value from the stream, which means leaving a great deal of critical information behind. We learned from a recent Expert System news release:

“With the overwhelming amount of information available today, there is an unprecedented need to be able to cut through the noise and capture the information that is most important to you,” said Luca Scagliarini, VP of Strategy and Business Development at Expert System. “Semantic is the only technology that can really help companies take advantage of all the information available via the real-time web, and it’s the only technology that will be able to filter the noise for the conversations, the patterns and sentiment that is important to you.”

Expert System is positioning itself as a way to deliver enterprise search by intercepting the critical and relevant items from all the available streams of information. By combining semantic tagging and semantic-based text comprehension, Cogito SEE allows the enterprise to leverage all the information it has access to and needs to drive business strategies. New features include:

  • A point of access to structured and unstructured information including newsfeeds, social networks and other internet sources.
  • An interface that enables intuitive, visual navigation of tags and facets, as well as interaction with search results to discover new connections and data.
  • Semantic search capability for multilanguage content.
  • Automatic and customizable report generation to monitor and share evolving search details and results.

For more information, visit www.expertsystem.net.

Derek Clark, June 15, 2011

Saplo Releases Semantic Technology Platform

June 13, 2011

We noted the story “Better Text Analysis With Saplo’s New API.” Regarding this open source resource, the article explains:

The Swedish startup Saplo has released a new text analysis API today that should help developers build tools to tap into the company’s semantic technology platform. This should allow people to build their own personalization or recommendation apps based on contextual meaning rather than solely on tags or keywords.

The API (Application Programming Interface) can be set up to work with different languages; right now English and Swedish are in the works. Improvements have also been made to documentation and libraries.

Definitely worth a look. The Web site is at www.saplo.com.

Cynthia Murrell, June 13, 2011


The SIREn Call for Semi Structured Data

June 7, 2011

SIREn, the new patch for Lucene, has, according to its Web site, found a better way to handle semi-structured data at large scale. Searching graph-structured data, or RDF, has commonly been handled with dedicated triplestores. However, triplestores don’t have the same scalability as the new SIREn patch, and they fail to offer the more consumer-friendly features that a typical Web search engine provides.

Triplestores are inefficient when searching across fields, and multi-valued fields cannot be handled properly. They can’t differentiate between terms and the fields they belong to. We learned:

The content query operators are the only ones that access the term content of the table, and are orthogonal to the structure operators. They include extended Boolean operations such as Boolean operators (intersection, union, difference), proximity operators (phrase, near, before, after, etc.) and fuzzy or wildcard operators. These operations allow to express complex keyword queries for each cell of the table. Interestingly, it is possible to apply these operators not only on literals, but also on URIs (subject, predicate and object).
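To make the operators in that passage concrete, here is a minimal sketch in Python. It is not SIREn's or Lucene's API. The triples, the function names, and the choice to index only the object cell are all assumptions made for illustration. Boolean intersection, union, and difference are expressed with Python's set operators on the row numbers each query returns.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# Hypothetical toy data: each row is one (subject, predicate, object) triple.
TRIPLES = [
    ("http://example.org/alice", "name", "Alice Smith"),
    ("http://example.org/alice", "interest", "semantic web search"),
    ("http://example.org/bob", "name", "Bob Jones"),
    ("http://example.org/bob", "interest", "enterprise search"),
]


def tokenize(text):
    """Lowercase a cell's text and split it on whitespace."""
    return text.lower().split()


def build_index(triples):
    """Map each term to the set of row numbers whose object cell contains it."""
    index = defaultdict(set)
    for row, (_subject, _predicate, obj) in enumerate(triples):
        for token in tokenize(obj):
            index[token].add(row)
    return index


def term(index, word):
    """Rows whose object cell contains the exact term."""
    return set(index.get(word.lower(), set()))


def wildcard(index, pattern):
    """Rows whose object cell contains a term matching a glob pattern such as 'sem*'."""
    rows = set()
    for token, postings in index.items():
        if fnmatchcase(token, pattern.lower()):
            rows |= postings
    return rows


def phrase(triples, words):
    """Rows whose object cell contains the words next to each other, in order."""
    wanted = [w.lower() for w in words]
    rows = set()
    for row, (_subject, _predicate, obj) in enumerate(triples):
        tokens = tokenize(obj)
        for start in range(len(tokens) - len(wanted) + 1):
            if tokens[start:start + len(wanted)] == wanted:
                rows.add(row)
                break
    return rows


index = build_index(TRIPLES)
both = term(index, "search") & term(index, "semantic")       # intersection
either = term(index, "search") | term(index, "alice")        # union
without = term(index, "search") - term(index, "enterprise")  # difference
```

The same functions could be pointed at the subject or predicate cell instead of the object cell, which is the spirit of the remark about applying the operators to URIs as well as literals.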

SIREn offers the capability to search large semi-structured content collections, such as those with different schemas, something the original Lucene retrieval system failed to do. With the patch, however, Lucene can now index and search RDF and text-based documents with less confusion and better results.

Leslie Radcliff, June 7, 2011


New Landscape of Enterprise Search Details Available

May 18, 2011

Stephen E Arnold’s new report about enterprise search will be shipping in two weeks. The New Landscape of Enterprise Search: A Critical Review of the Market and Search Systems provides a fresh perspective on a fascinating enterprise application.

The centerpiece of the report is a set of new analyses of search and retrieval systems offered by:

Unlike the “pay to play” analyses from industry consultants and self-appointed “experts,” Mr. Arnold’s approach is based on his work in developing search systems and researching search systems to support certain inquiries into systems’ performance and features.

, to focus on the broad changes which have roiled the enterprise search and content processing market. Unlike his first “encyclopedia” of search systems and his study of value added indexing systems, this new report takes an unvarnished look at the business and financial factors that make enterprise search a challenge. Then he uses a historical base to analyze the upsides and downsides of six vendors’ search solutions. He puts each firm’s particular technical characteristics in sharp relief. A reader gains a richer understanding of what makes a particular vendor’s system best suited for specific information access applications.

Other features of the report include:

  • Diagrams of system architecture and screen shots of exemplary implementations
  • Lists of resellers and partners of the profiled vendors
  • A comprehensive glossary which attempts to cut through the jargon and marketing baloney which impedes communication about search and retrieval
  • A ready-reference table for more than 20 vendors’ enterprise search solutions
  • An “outlook” section which offers candid observations about the attrition and financial health of the hundreds of companies offering search solutions.

More information about the report is available at http://goo.gl/0vSql. You may reserve your copy by writing seaky2000 @ yahoo dot com. Full ordering information and pricing will be available in the near future.

Donald C Anderson, May 18, 2011

Post paid for by Stephen E Arnold

Blekko Gets Discovered

May 17, 2011

We have known about Blekko for quite a while. The New York Times, more recently, figured out that the search engine’s 30 percent growth in the last few months is no accident.

Now that the New York Times writes about Blekko, it is official. Move over Google and Bing, there’s a new search engine on the block. In the New York Times’ “An Engine’s Tall Order: Streamline the Search,” writer Damon Darlin explains the problems with Google search results and Blekko.com’s solution. We learned:

“While you may get them (Google search results) very rapidly, they may not be all that useful and dependable.” Various search engine optimization efforts help to artificially move a Web page or article to the top of Google’s search results page. “Web pages are created specifically to fool Google’s search algorithm in order to get a higher ranking.”

Industry veteran Rich Skrenta is behind Blekko.com, which “uses a search algorithm like Google’s or Bing’s but also gets humans, mostly volunteers, to identify the sites they know, trust and visit most often and to put those at the top of the search results.” Coverage in the New York Times helps put Blekko on the mainstream radar.
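The idea of layering human judgment on top of an algorithmic ranking can be sketched in a few lines of Python. This is only an illustration and not Blekko's actual method. The function name, the URLs, and the set of trusted hosts are hypothetical. The sketch keeps the algorithm's ordering but moves results from human-vetted hosts to the top.

```python
from urllib.parse import urlparse


def rerank(results, trusted_hosts):
    """Put results from trusted hosts first.

    sorted() is stable, so within the trusted group and within the
    remaining group the original algorithmic order is preserved.
    The key is False (sorts first) when the host is trusted.
    """
    return sorted(results, key=lambda url: urlparse(url).hostname not in trusted_hosts)


# Hypothetical output of an ordinary search algorithm, best match first.
algorithmic_results = [
    "http://spam.example/a",
    "http://trusted.example/b",
    "http://other.example/c",
    "http://trusted.example/d",
]

# Hypothetical list of hosts that volunteers have marked as trustworthy.
reranked = rerank(algorithmic_results, {"trusted.example"})
```

A stable sort matters here: the human signal only decides which group a page lands in, while the algorithm still decides the order inside each group.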

About time.

Rita Safranek, May 17, 2011

Freebie, unlike the hard copy of the New York Times

Digital Reasoning Continues to Expand

May 16, 2011

Move over Palantir and i2 Ltd. Digital Reasoning is expanding due to its rapid growth. As reported in MSN’s “Digital Reasoning Introduces Federal Advisory Board,” the data analytics leader has created a board to guide its push into the federal market. We learned:

With the federal government’s increased focus on cloud computing, (Digital Reasoning’s) flagship product Synthesys® provides a unique Entity Oriented Analytics solution that enables government agencies to tap into the power of big data. The Advisory Board represents a team with unique insight into the requirements of Big Data, text analytics and intelligence solutions for government agencies.

The board members are: Gen. William T. Hobbins, who retired as Commander, U.S. Air Forces in Europe; Bob Flores, founder and president of Applicology Inc., who spent 31 years in the US intelligence community; Anita K. Jones, who managed the Department of Defense’s science and technology program; Capt. Nick Buck, who spent 15 years in National Security Space, including 10 years in the National Reconnaissance Office; and Mike Miller, currently president of M4 Associates and previously VP of Juniper Networks’ Public Sector Division where he was responsible for all business with Juniper’s Public Sector customers in the US. This kind of talent should be valuable in guiding Digital Reasoning’s federal sector strategy.

We have tracked this Franklin, Tennessee, company since its inception. To get some insight into the firm’s approach, you may want to read these two interviews ArnoldIT.com, the owner of this news service, conducted with Tim Estes, the founder of Digital Reasoning. The February 2010 interview explores the core technology of the firm and how it differs from other vendors’ methods. The December 2010 interview probes the new version of the firm’s flagship technology.

Stephen E Arnold, May 16, 2011

Freebie

Pragmatech Semantic Search, Upgraded

May 4, 2011

More semantic activity. SearchBlog’s write-up “Semantic Search Engine Gets Ad, News Application Layers” reports on new facets of semantic search at Pragmatech. The two-year-old company is a subsidiary of the United Development Company based in Doha, Qatar.

Ctrl news is a free news-filtering service that distills articles according to user preferences, using context instead of keywords. The new feature is also able to generate a summary based on semantic analysis.

Also new is an advertising test site, which will identify the most profitable sites on which to place clients’ pitches. The article explains:

[R&D team leader Walid] Saba says the advertiser will give the company an ad, along with a variety of URLs. The technology will identify the best publisher sites to run specific ads based on the content. The technology will match the Web site with the ad content based on the contextual relevance of the advertisement, even if the ad and Web page are in two different languages.

With talent from across the globe, the team has confidence in its technology despite the crowded semantic field. We’ll see how they do as time progresses. As important is what seems to be a quickening innovation cycle outside of the US of A.

Cynthia Murrell, May 4, 2011

Freebie
