Altova Noses into XML Semantics

March 27, 2012

IT Jungle’s Alex Woodie recently announced some good news for IBM DB2/400 fans in the article “Altova Adds Support for DB2/400 Logical Files in MissionKit.”

According to the article, Altova has added support for DB2/400 logical files in the latest release of MissionKit, called 2012r2. The support has been added to the XMLSpy, MapForce, UModel, DatabaseSpy, and DiffDog products, which already supported DB2/400.

Woodie writes:

MissionKit includes eight handy utilities that allow IT professionals to accomplish a range of XML, data, and unified modeling language (UML)-related tasks. Anchoring the kit is its popular XML editor, called XMLSpy. MapForce, meanwhile, provides data conversion and related capabilities, UModel allows developers to visually design their application flows in UML, while DatabaseSpy allows users to design, query, and compare multiple databases. Rounding out the suite are StyleVision, DiffDog, SchemaAgent, and SemanticWorks.

These new features are bound to attract IBM i customers because of MissionKit’s powerful data manipulation tools. For more information and free trial downloads, check out www.altova.com. Since I am no longer receiving spam from MarkLogic and AtomicPR, I am not sure how that XML-centric company is responding.

Jasmine Ashton, March 27, 2012

Sponsored by Pandia.com

Hakia Rolls Out Meaning Based Search App

March 16, 2012

App Appeal recently reviewed the new meaning based search engine app called Hakia in the article “Hakia Review – Meaning Based Web Search.”

According to the article, Hakia was founded in 2004. Rather than counting how often a keyword is used, the application searches for meaning. Hakia, like most search engines, is a free service, and search results are displayed as text links. Where Hakia differs is that it breaks the results down into categories.

When distinguishing Hakia from other search engines, the article states:

Search engines are not uncommon today, however search engines that do not focus on keyword popularity are. Hakia offers a unique way to seek content based on meaning. Users can try different ways to search for topics, people, events and anything else imaginable. The search engine has some issues, but may prove to be an interesting alternative to the traditional search engine. It may not be a practical replacement, but it is a nice supplemental tool.

While Hakia is a cool alternative to keyword search engines like Google, it isn’t getting a ton of attention yet. This could be due to its lack of accuracy and cumbersome approach to search.

Jasmine Ashton, March 16, 2012

Google and Semantic Search

March 15, 2012

The Wall Street Journal certainly has a scoop if one has been ignoring Google’s actions over the last five or six years. For a traditional “real” news publication owned by News Corp., the newspaper knows how to generate what I call “faux excitement.” The for fee version of the Wall Street Journal story is at http://goo.gl/DnRrP although the link may go dead in a New York minute.

You will want to snag a copy of the dead tree edition of the March 15, 2012, newspaper. Turn to Section B1 and read “Google Gives Search a Refresh.” If you don’t have an online subscription to Mr. Murdoch’s favorite newspaper, click here.

I found the write up bittersweet. An era has ended at the Google. Google is moving into the choppy waters of “smart” search. Others have been in the kayaks trying to navigate meaning for a long time. Perhaps the best known player is Autonomy, which is now the “baby tiger” at Hewlett Packard. Google wants to skip the baby tiger metaphor and jump to the semantic shark.

My research suggests that Google has been grinding away at semantic search for a while, at least a decade. There were signals about Google wanting to get beyond the “clever” linking method and the semantic techniques of Oingo (Applied Semantics) a decade ago. (Notice the word “semantics” in the company name?)

Then Google took a couple of steps forward when it landed the Transformics technologies and hired Dr. Ramanathan Guha. You can get the run down on Dr. Guha’s semantic leanings when you work through the hits for this query on Google: Ramanathan Guha semantic Web. No quotes required. Dr. Guha is the wizard behind the Programmable Search Engine, which I described at some length in Google Version 2.0: The Calculating Predator, published by the UK outfit Infonortics five years ago. The monograph may still be in print, and if you can snag a copy, you will see how Google’s wizard explains a system and method to populate “fact tables” and perform other feats of semantic legerdemain. The Wall Street Journal focuses on Google’s acquisition of Metaweb Technologies, which is more along the lines of a complementary content or fact generating system. Google has a tendency to “glue” technologies together, not toss the shark technologies out with the bathwater.

The write up is one of those fear-uncertainty-doubt maneuvers which technology companies enjoy. “Real” journalists are too savvy to fall for the shiny lures. The persistent reader will learn that there is no release date for the new Google search. This surprised me because I was sure I read and later heard that Google version 2.0 was Google Plus, not plain old search with some WolframAlpha.com like touches and Blekko nuances stirred in for enhanced flavor. I must admit I was confused about a news story written in the present tense which is really about some search advances which will arrive at an indeterminate time in the future, maybe tomorrow, maybe in September when the leaves turn.

The story suggests that Google is making changes because of Microsoft Bing, Apple’s voice search, or Facebook, which has no search service of much consequence. My hunch is that Google is making changes to search for one reason: ad revenue via traditional browser based search is softening. This is bad news for anyone dependent on online advertising revenue to pay for airplanes, Davos visits, and massive television and print advertising. Forget the competitors. Google has to do something that works to pump up margins and generate massive revenue. After more than a decade of trying to diversify its revenue, Google is under the gun. If Google’s magic touch were actually working, then the company should be rolling in dough from multiple revenue streams. Where is the payoff from appliances, enterprise sales, and me-too services which have essentially zero impact on companies like Apple, Facebook, and Microsoft?

Google’s PR thrust to focus attention on how it will improve search comes too quickly after Google got “real” journalists to believe that Google 2.0 was the “social” services. Well, how has that worked out for Google? I wrote about James Whittaker’s explanation of “Why I Left Google”. If you haven’t read the Whittaker write up, click here. The passage I noted was:

I couldn’t even get my own teenage daughter to look at Google+ twice, “social isn’t a product,” she told me after I gave her a demo, “social is people and the people are on Facebook.” Google was the rich kid who, after having discovered he wasn’t invited to the party, built his own party in retaliation. The fact that no one came to Google’s party became the elephant in the room.

Net net: Google has been in the semantic game a long time. Semantic technology is now in operation at Google, just as plumbing. Now Google wants to expose the pipes and drains.

The reason?

Google hopes semantics will give it more hooks on which to hang advertising messages. Without something new, revenue growth at Google may degrade at a time when Apple, Facebook, and Microsoft continue to grow. The unthinkable? Nope, the reality.

Stephen E Arnold, March 15, 2012

Ontoprise GmbH: Multiple Issues Says Wikipedia

March 3, 2012

Now Wikipedia is a go-to resource for Google. I heard from one of my colleagues that Wikipedia turns up as the top hit on a surprising number of queries. I don’t trust Wikipedia, but then I don’t trust any encyclopedia produced by volunteers. Volunteers often participate in a spoofing fiesta.

[Image: SEO danger symbol]

Note: I will be using this symbol when I write about subjects which trigger associations in my mind about use of words, bound phrases, and links to affect how results may be returned from Exalead.com, Jike.com, and Yandex.ru, among other modern Web indexing services either supported by government entities or commercial organizations.

I was updating my list of Overflight companies. We have added five companies to a new Overflight service called, quite imaginatively, Taxonomy Overflight, and we are going through the process of figuring out if the outfits are in business or putting on a vaudeville act for paying customers.

The first five companies are:

  1. Millenium
  2. Mondeca
  3. Nuance
  4. Synaptica
  5. Visual Mining
  6. Wand

We will be adding to the Taxonomy Overflight another group of companies on March 4, 2012. I have not yet decided how to “score” each vendor. For enterprise search Overflight, I use a goose method. Click here for an example: Overflight about Autonomy. Three ducks. Darned good.

I wanted to mention one quite interesting finding. We came across a company doing business as Ontoprise. The firm’s Web site is www.ontoprise.de. We are checking to see which companies have legitimate Web sites, no matter how sparse.

We noted that the Wikipedia entry for Ontoprise carried this somewhat interesting “warning”:

[Image: Wikipedia “multiple issues” notice on the Ontoprise entry]

The gist of this warning is to give me a sense of caution, if not wariness, with regard to this company, which offers products that deliver “ontologies.” The company’s research is called “Ontorule”, which has a faintly ominous sound to me. Consider the product naming at a firm such as Convera before it experienced financial stress: Convera’s names were like science fiction but less dogmatic than Ontoprise’s language choice. So I cannot correlate Convera and Ontoprise on other than my personal “semantic” baloney detector. But Convera went south in a rather unexpected business action.

Blekko is Moving Forward

February 28, 2012

Here comes the new search engine Blekko. Online Media Daily reports, “Blekko Begins Testing Search Ads.” Writer Laurie Sullivan reports:

Blekko has begun to test search ads on its site through feeds from Google and Bing. It works with brands through ad networks, but does not yet have direct ad relationships. Google and Bing built up an inventory of search ads so search start-ups like Blekko can tap into the pool through the AdSense search feed.

As of yet, most ads are not linked to users’ search terms. That could make for some refreshingly impersonal advertising.

The article notes that, among others, Yandex is helping to bankroll Blekko. That development may have some interesting implications for Google.

Founded in 2007, Blekko aims to provide better search results than its predecessors by tapping into a stash of three billion trusted websites and avoiding content farms. It also uses slashtags to filter and categorize searches. Its hope is that custom sorted searches will reduce spam. However, the company should be careful; doing too many things means that one cannot master them all. If the core competency slips, trouble looms. Behind Blekko stand the wizards from Yandex. Yep, Russian mathematicians, computer scientists, and systems engineers.

Cynthia Murrell, February 28, 2012

Semantics Fuel Need for Analytics

February 22, 2012

Here’s a different approach to the “next big thing.” Network Computing insists, “Semantic Technology Key to Mastering Data Growth, Analysis.” The article examines the recent InformationWeek report titled Database Discontent.

It used to be that data analysis parameters were defined manually. However, says the report’s co-author David Read, that is becoming less and less feasible. Writer Chris Talbot explains:

With the significant depth and breadth of data contained inside and outside the enterprise, in addition to the high volume of transactions that are continually generating more data, there is no reasonable way for people to know where to look when seeking out actionable knowledge, Read said. Predictive analytics will likely outpace reporting and traditional business intelligence efforts in the future, and they will be used to inform SMEs [Subject Matter Experts] about where to invest their business intelligence efforts, he added.

SQL systems are fine for analyzing uniform data, he adds, but not the growing mounds of unstructured data. The report sees semantic technology as the answer to the problem. Talbot notes that these tools have both improved and come down in price over the last few years. The way things are going, that’s a very good thing.

Cynthia Murrell, February 22, 2012

Interview with John Wang: Creator of Sensei

February 17, 2012

Back in October we wrote a Beyond Search story about Q-Sensei, a multi-dimensional information management company that seemed more versatile than Autonomy, Exalead, and Apache Lucene combined. Now, several months later, the company has once again crossed our radar. This time, they are advocating for the new open-source search software called Sensei.

The Sematext blog post “Sensei: Distributed, Realtime, Semi-Structured Database” shares an interview with LinkedIn’s John Wang, search architect and Sensei project lead.

Wang states:

Sensei is an open-source, elastic, real time, distributed database with native support for searching and navigating both unstructured text and structured data. Sensei is designed to handle complex semi-structured queries on very large, and rapidly changing datasets. It was created to support LinkedIn Homepage and Signal. The core engine is also used for LinkedIn search properties, e.g. people search, recruiter system, job and company search pages.

After describing the Sensei system, Wang goes into great depth answering questions regarding his reasoning for writing Sensei, what types of companies it would benefit, and its potential pitfalls.

It is exciting to see the growth of open source search software like Sensei to help meet the needs of an increasingly diverse customer base.

Jasmine Ashton, February 17, 2012

Inforbix Cracks Next Generation Search for SolidWorks Users

February 13, 2012

Search means advertising to most Google users. In an enterprise—according to the LinkedIn discussions about enterprise search—the approach is anchored in the 1990s. The problem is that finding information requires a system which can handle content types that are of little interest to lawyers, accountants, and MBAs running a business today.

Without efficient access to such content as engineering drawings, specifications, quality control reports, and run-of-the-mill office information, costs go up. What’s worse is that more time is needed to locate a prior version of a component or locate the supplier who delivered work to the specification, on time and on budget. So expensive professionals end up performing what I call Easter egg hunt research. The approach involves looking for colleagues, paging through lists of file names, and the “open, browse, close” approach to information retrieval.

Not surprisingly, the so called experts steer clear of pivotal information retrieval problems. Most search systems pick the ripe apples which are close to the ground. This means indexing Word documents, the versions of information in a content management system, or email.

I learned today that Inforbix, a company we have been tracking because it takes search to the next level, has rolled out two new products. These innovations are data apps which seamlessly aggregate product data from different file types, sources, and locations. The new Inforbix apps will help SolidWorks users get more out of their product data and become more productive while improving decision-making. Plus, Inforbix said that it would expand the data access to SolidWorks EPDM, making it possible for SolidWorks customers to get more from data managed by their PDM system.

The two products are Inforbix Charts and Inforbix Dashboard. Both complement Inforbix Tables which was released in October 2011.

Oleg Shilovitsky, founder of Inforbix, told me:

Manufacturing companies are drowning in the growing amount of product data generated and found within different file types, sources, and company data-silos. They are increasingly using a mix of vendor packages and solutions, all which generate, contain, manage, or store product data, creating a hodgepodge of resources to be combed through. Product data generated in a typical manufacturing company can be both unstructured (valuable BOM and assembly information spread out across different CAD drawings) and structured (CAD drawings within a PDM or PLM system). Our apps are tools that address specific product data tasks such as finding, re-using, and sharing product data. Inforbix can access product data within PDM systems such as ENOVIA SmarTeam and Autodesk Vault and make it available in meaningful ways to CAD and non-CAD users.

When I reviewed the system, I noted that Inforbix’s apps utilize product data semantic technology that automatically infers relationships between disparate sources of data. For example, Inforbix can semantically connect or link a SolidWorks CAD assembly found within EPDM with a related Excel file containing a BOM table stored on a file server in another department.

Inforbix Charts visualizes and presents data saved from Inforbix Tables. The product data is presented in charts that include information to help engineers better manage and run processes by identifying trends and patterns and improving data control. For example, Inforbix Charts visually presents the approval statuses of CAD and ECO documents by author, date approved, last modified date, etc.

Inforbix Dashboard dynamically collects and presents important statistics about engineering and manufacturing data and processes, such as how many versions of a particular CAD drawing currently exist, how many design revisions it took to complete a CAD drawing, or the number of ECOs processed on time. Easy and intuitive to use, Inforbix Dashboard is an ideal tool for project managers.

SolidWorks users can access Inforbix apps and their product data online. Current Inforbix customers can immediately begin using the Inforbix iPad app, available for free on the Apple App Store at http://www.inforbix.com/inforbix-mobile-search-for-cad-and-product-data-on-the-ipad/. Account access taps existing Inforbix credentials. New users are encouraged to register with Inforbix to enable the iPad app to access product data within their company. The apps soon will be available on Android devices.

A video preview of the iPad app is posted at http://www.inforbix.com/inforbix-ipad-app-first-preview/. For more information on Inforbix apps, visit http://www.inforbix.com.

Inforbix is a company on the move.

Stephen E Arnold, February 13, 2012

Expert System Italy

February 9, 2012

In 1989, Marco Varone, along with Stefano Spaggiari and Paolo Lombardi, founded Expert System Italy. The three wanted to develop semantic software to extract knowledge from text by replicating human processes. Varone is the father of the company’s Cogito technology.

Unlike traditional technologies based on keywords and statistics, which can only guess the content of a text, Cogito reads and interprets knowledge trapped in unstructured text, finding hidden relationships, trends, and events. It relies on deep linguistic analysis and semantic disambiguation to ensure a complete understanding of a text. The technology can be used on files, e-mails, articles, reports, and Web pages.

After developing Cogito, Expert System partnered with Microsoft and integrated the linguistic and semantic technologies into Microsoft Office. The Cogito Categorizer is also integrated into the SharePartXXL Taxonomy Extension for Microsoft SharePoint by the SharePartXXL Cogito Connector. In April 2011, the company was awarded a US patent for the Cogito semantic platform.

Products include Cogito Semantic Search, Cogito Semantic Advertiser, Cogito Answers, and Cogito Intelligence Platform. Expert System positions Cogito Semantic Advertiser as an alternative to Google’s AdSense search keyword ad management tool. The company applies semantic technologies to its contextual ad formula, discerning greater meaning from the text in an article to provide more relevant ads. Cogito Answers can be used to improve customer service, combining semantic analysis of sentiment and customer satisfaction monitoring with advanced natural language customer interaction features.

Profitable from the start and with recent growth at a compound annual growth rate of 50%, Expert System has a client list that encompasses a variety of industries. Customers include Vodafone, Eni Group, Pirelli, Telecom Italia, the Italian Ministry of Defense, RIM and CVS Pharmacy. Competitors are Google, Cisco, Flurry, Nuance Communications, and RAMP. Expert System has a strong following in the mobile search space.

Rita Safranek, February 9, 2012

Semantic Wranglers to Tame Media Content

February 6, 2012

When the prolificacy of the media scape overwhelms, it is semantic technology to the rescue. So declares ReadWriteWeb in “Semantic Tech the Key to Finding Meaning in the Media.” Writer Chris Lamb maintains that today’s deluges of information have made attention span the prize, and delivering relevancy the key. Strategies have included tapping readers’ social graphs, profiles, and preferences to filter news content. Lamb writes:

These current approaches are doomed. With respect to social graph curation, people have different roles at during different times. On the weekend, a reader might be interested in arts, entertainment and sports news based on a friends and family. During the week, this same person may be interested in business news based on recommendations from trading partners in the capital markets. How do readers seamlessly reconcile this?

Lamb doesn’t have the answer, but says he does know what technologies will underlie the eventual solutions: tagging, semantic extraction, disambiguation, and linked data structures (including cloud data). See the write up for the reasoning behind each.

Semantic technology can perform useful functions. Rich media pose some special challenges. Among them are the issues of data volume and available processing power, latency, and variability in indexable content. What about a silent movie? What about a program which features interviews with individuals with a substance abuse problem who speak colloquially with a mumble?

Cynthia Murrell, February 6, 2012
