Oracle and Its Semantic Technologies Center

April 7, 2010

In the course of a research project for a client, we came across the Oracle Semantic Technologies Center. We had heard that Oracle had some testing underway with Siderean Software (now on hiatus) in the semantic technology field. According to Oracle:

Oracle Spatial 11g introduces the industry’s first open, scalable, secure and reliable RDF management platform. Based on a graph data model, RDF triples are persisted, indexed and queried, similar to other object-relational data types. The Oracle 11g RDF database ensures that application developers benefit from the scalability of Oracle 11g to deploy scalable and secure semantic applications.
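For readers who want a feel for what "persisted, indexed and queried" can mean for triples, here is a minimal Python sketch. It is a toy in-memory store and not Oracle's implementation. The class name `TripleStore`, its methods, and the sample life sciences facts are all hypothetical, invented for illustration.

```python
from collections import defaultdict


class TripleStore:
    """Toy in-memory store for (subject, predicate, object) triples.

    Hypothetical illustration only; this is not how Oracle 11g works internally.
    """

    def __init__(self):
        self.triples = set()
        # One index per triple position, so any bound term narrows the search.
        self.by_subject = defaultdict(set)
        self.by_predicate = defaultdict(set)
        self.by_object = defaultdict(set)

    def add(self, subject, predicate, obj):
        triple = (subject, predicate, obj)
        self.triples.add(triple)
        self.by_subject[subject].add(triple)
        self.by_predicate[predicate].add(triple)
        self.by_object[obj].add(triple)

    def query(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        matches = set(self.triples)
        bound = (
            (self.by_subject, subject),
            (self.by_predicate, predicate),
            (self.by_object, obj),
        )
        for index, term in bound:
            if term is not None:
                # .get avoids creating empty index entries for unknown terms.
                matches &= index.get(term, set())
        return sorted(matches)


# Made-up sample facts, in the spirit of a life sciences data set.
store = TripleStore()
store.add("aspirin", "treats", "headache")
store.add("aspirin", "is_a", "drug")
store.add("ibuprofen", "treats", "headache")

print(store.query(predicate="treats", obj="headache"))
```

The point of the three indexes is that a query with any one term bound never has to scan every triple. A production system would add persistence, security, and inference on top, which is where the Oracle pitch comes in.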

There are links to a life sciences “platform” which offers a presentation, some white papers, and product brochures. One of the more interesting documents explains the Oracle “platform”. You can download the presentation (validated on April 6, 2010) here. The system includes references to known technologies like table spaces and to some methods that seem to be getting long in the tooth, for example, hand crafted rules for Oracle streams. The presentation has a date of 2006, which makes me wonder if Oracle has pulled resources from this initiative. The presentation includes information about Oracle Secure Enterprise Search, another product that makes us wonder about Oracle’s investment and commitment. The information about vast quantities of data does not include information about the hardware scaling products based on Sun Microsystems’ technologies. Despite its 2006 date, the presentation is quite complete and is an excellent yardstick against which to measure the more up to date information about Oracle. Despite the marketing, Oracle’s platform strikes us as mostly unchanged in the last few years. Your view may differ, of course.

The semantic splash page does provide links to information about Oracle 11g. The September 2009 write up “Oracle Semantic Technologies Inference Best Practices with RDFS/OWL” focuses on system performance. The white paper raises the question, “If the performance referenced in this white paper is as stated, why does an Oracle customer need the new generation of Exadata Storage Servers?”

After working through the information available from the Semantic Technologies Center, I formed an impression of broad marketing statements and specific detail about the performance of the Oracle systems. What seemed to be missing were updates to some critical documents and a way to figure out how the many moving parts fit together to solve a specific semantic problem.

Worth bookmarking if you are working with Oracle search or semantic systems, however.

Stephen E Arnold, April 7, 2010

No one paid us to write this article.

A Googler Comments on the Apple iPad

April 6, 2010

I know one Googler does not represent the whole of Google. I did find the observation by Matt Cutts in his article “Mini-Review of the iPad” quite interesting, particularly in view of the comment about the Android online store as a “flea market.” Here’s the passage that caught my attention:

But the iPad isn’t for me. I want the ability to run arbitrary programs without paying extra money or getting permission from the computer manufacturer. Almost the only thing you give up when buying an iPad is a degree of openness, and tons of people could care less about that if they get a better user experience in return. I think that the iPad is a magical device built for consumers, but less for makers or tinkerers. I think the world needs more makers, which is why I don’t intend to buy an iPad. That said, I think the typical consumer will love the iPad.

These distinctions between open – closed, consumer – computer whiz may be the knife edge on which the battle for online revenues will be fought. If online advertising softens, the importance of these distinctions may increase. Google is drawing lines in the digital sand, it seems. On one side is Google and on the other China and Australia. Now there is a line between Google and Apple. Fascinating strategy. Apple, via its closed platform, has become the inclusionary glue for certain content. Microsoft, via its willingness to find a way to work in other countries and their laws, is emerging as less dogmatic.

Stephen E Arnold, April 6, 2010

No one paid me to write this.

Harsh Words for Android Online Store

April 6, 2010

I don’t cover eCommerce unless I have a news item about one of the vendors providing structured search systems to online retailers. I want to make an exception for Barry O’Neill’s Recombu article. In “App Friday: Android’s ‘Flea Market’ Needs Urgent Attention,” I learned that

Apple‘s iPhone and iTunes combination represents a near-perfect convergence of concept, design, usability, technology and commerce in a highly polished, well executed package.

I found out:

Android has the potential to dominate if executed properly. Google’s launch has been heavily flawed and in my view Android is still in “public beta.” Android has had explosive growth, but alarm bells ring when you look at the inclination of users to purchase apps. Just 21% of Android users purchase one or more paid apps per month, compared with 50% of iPhone users. Where this gets unusual is that of the 21% of Android users purchasing one or more apps, the average number of apps purchased is 5. That is 1.4 more apps per month than the equivalent iPhone user!
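A back-of-the-envelope check of those figures helps show why the author hears alarm bells. The sketch below spreads purchases across all users, buyers and non-buyers alike. The percentages and the figure of 5 apps come from the quote; the 3.6 apps per iPhone buyer is my own derivation (5 minus 1.4), and the variable and function names are made up for illustration.

```python
# Figures quoted in the Recombu article.
android_buyer_share = 0.21    # share of Android users buying paid apps each month
iphone_buyer_share = 0.50     # share of iPhone users buying paid apps each month
android_apps_per_buyer = 5.0  # average paid apps per Android buyer per month
buyer_gap = 1.4               # Android buyers purchase 1.4 more apps than iPhone buyers

# Derived, not quoted: implied paid apps per iPhone buyer per month.
iphone_apps_per_buyer = android_apps_per_buyer - buyer_gap


def paid_apps_per_user(buyer_share, apps_per_buyer):
    """Average paid apps per month across all users, buyers and non-buyers."""
    return buyer_share * apps_per_buyer


android_avg = paid_apps_per_user(android_buyer_share, android_apps_per_buyer)
iphone_avg = paid_apps_per_user(iphone_buyer_share, iphone_apps_per_buyer)

print(f"Android: {android_avg:.2f} paid apps per user per month")
print(f"iPhone:  {iphone_avg:.2f} paid apps per user per month")
```

On these numbers the average Android user buys about 1.05 paid apps a month against about 1.8 for the average iPhone user. The Android buyers who do get through purchase more, but there are far fewer of them.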

The catch is:

I conclude therefore that a large proportion of Android users simply cannot purchase and download paid for apps to their phone. I blame Google and its appallingly poor management of the Android market. Despite having a Google Checkout account I still cannot see, never mind purchase paid apps. Of the free apps I see, many infringe the IP of others and many more simply fail to download or don’t work on my specific device. Internet forums are flooded with similar stories. Fragmentation and technical issues abound, content discovery is difficult, and billing doesn’t work properly. It’s 2006 all over again. I’m all for an “open” market, but Google’s “completely open” policy has resulted in the Android market being a flea market, when compared to Apple’s upmarket mall.

My take away from this write up is that Google’s push into the consumer sector has invited some interesting adjectives. I had never thought of Google as a company running a “flea market”. If the iPhone continues to hold its own in the mobile phone market and the iPad contributes some new consumers, the label “flea market” may have a significant impact on Google’s image if the label sticks and goes viral.

Stephen E Arnold, April 6, 2010

No one sponsored this article.

Splunk and Real Time Search

April 6, 2010

My column for Information World Review addressed the issue of latency in what marketers call real-time search. I am not sure when the article goes on the Information World Review Web site at http://www.iwr.co.uk/, but I can hit the three points in the write up.

  1. Real time means different things in different contexts.
  2. The services which return results with less latency are specialist vendors such as Collecta, Surchur, and Twitter, among others.
  3. The real time results in the Big Three’s search systems are uniformly disappointing.

When I read “Splunk Goes Real-Time, Eliminates Latency from IT Data Search,” I wondered what I missed. After working through the write up, I realized that “real time search” was not defined. The assumption that a buzzword makes sense to a casual reader like myself is a common practice.

The write up said:

With a major upgrade, Splunk eliminates the latency by opening the doors to real-time search, analysis and monitoring for live streaming data. The company offered a glimpse by allowing me to go into the site and conduct a random search so that I could see my own search appear in real-time data, just as an IT admin might see it.

Splunk is a company that specializes in log management. Logs are important for such applications as search engine optimization and certain security-related tasks. Here’s how the company describes itself:

Splunk is software that provides unique visibility across your entire IT infrastructure from one place in real time. Only Splunk enables you to search, report, monitor and analyze streaming and historical data from any source. Now troubleshoot application problems and investigate security incidents in minutes instead of hours or days, monitor to avoid service degradation or outages, deliver compliance at lower cost and gain new business insights from your IT data.

The addition of a search function that indexes in real time is a potentially big improvement over traditional log file analysis. The system includes a function to post Splunk saved search results to Twitter. You can get the script here.

The ZDNet write up includes a diagram for “Machine-Generated IT Data Contains a Categorical Record of Activity and Behavior”.

Splunk is a low latency search system that indexes certain types of content. “Real time” is a murky concept, and in my experience, every system exhibits latency to some degree.

Stephen E Arnold, April 6, 2010

This is an unsponsored post.

The Importance of YouTube.com

April 6, 2010

Seeking Alpha’s “YouTube Much More Important Than Gmail for Google” reminded me that I don’t place sufficient emphasis on YouTube.com, Google’s controversial rich media service. I know that Google has a number of initiatives in rich media, including its recent acquisition of Episodic. The main point of the Seeking Alpha story struck me as:

For YouTube, we estimate that revenue per 1,000 page views increased from about 40 cents in 2005 to about $2.40 in 2009. We expect YouTube’s revenue per 1,000 page views to increase to nearly $10 by the end of the Trefis forecast period.
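To put the quoted estimate in perspective, here is a small sketch that converts the two data points into a compound annual growth rate. The 40 cent and $2.40 figures are from the quote. The function names are hypothetical, and the second calculation rests on a loud assumption of mine: that the 2005 to 2009 pace simply continues, which the article does not claim.

```python
import math

# Estimates quoted from the Seeking Alpha article (dollars per 1,000 page views).
rpm_2005 = 0.40
rpm_2009 = 2.40
years = 2009 - 2005


def compound_annual_growth(start, end, periods):
    """Constant yearly growth rate that turns start into end over the periods."""
    return (end / start) ** (1 / periods) - 1


def years_to_reach(current, target, annual_rate):
    """Years needed to grow from current to target at a constant annual rate."""
    return math.log(target / current) / math.log(1 + annual_rate)


growth = compound_annual_growth(rpm_2005, rpm_2009, years)

# Hypothetical scenario: assumes the historical rate holds going forward.
years_to_ten = years_to_reach(rpm_2009, 10.0, growth)

print(f"Implied growth 2005-2009: {growth:.1%} per year")
print(f"Years to reach $10 at that pace: {years_to_ten:.1f}")
```

The sixfold increase works out to roughly 57 percent a year. At that pace $10 would be a bit more than three years away, but the quote does not say how long the Trefis forecast period runs, so I cannot say whether the forecast assumes faster or slower growth.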

Google will have to continue its efforts to take advantage of the YouTube.com revenue opportunity. If text ads begin to deteriorate in the face of advertisers jumping to Facebook.com, Google may have to hurry its efforts to pump up YouTube.com’s financial performance.

The Seeking Alpha article presents some tasty charts, but I wonder, “Is time running out for Google in rich media?” The problem is not the iPad, which may or may not be a factor for Google. The challenge is the many different issues that Google now faces. These range from the interesting Viacom legal matter to Google’s role as a champion of uncensored Internet results. Toss in the continued interest in Facebook and the softness in certain economic data. With many complexities interacting, the uncertainty for Google may be at its highest point in the firm’s 11 year history.

YouTube.com is important, and rich media will be the making or breaking of some companies in the online space. Google wants to be on the upside of this shift from text to video, from keyword search to social information acquisition.

Stephen E Arnold, April 5, 2010

This post talks about law and international affairs. Which entity has oversight of uncompensated write ups? I will report non payment to the manager of the Northern Regional Research Lab, south of Chicago.

Cognition Technologies Added to Overflight

April 6, 2010

We received a clipping about Cognition Technologies’ beefing up its management team.

Stephen J. Lief will assume responsibility for Cognition’s legal eDiscovery business. According to the Cognition announcement on dBusinessNews, “His role is to expand the company’s growth by developing partnerships with eDiscovery vendors, law firms and enterprise legal departments.” One interesting aspect of his background is that he has served as the founding editor of a number of publications, including the Legal Tech Newsletter.

Cognition develops semantic / natural language processing technology. The company describes itself this way:

Cognition’s Semantic NLP, the Company’s patented linguistic meaning-based text processing technology, is able to simultaneously deliver significantly higher levels of precision and recall than is possible with currently used NLP and Search technologies….Unlike all of the popular text reading and search in use today which utilize mathematically-based pattern-matching technology (i.e. they search for a particular word pattern based upon the user query), Cognition’s Semantic NLP understands the meaning within the context of the text it is processing. Therefore, the end benefit delivered to Cognition’s clients is simultaneously more precise and relevant understanding of their customers’ actions and intent.
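The claim of higher precision and recall at the same time is worth unpacking, because in many search systems improving one tends to cost the other. The sketch below shows the standard definitions of the two measures. The document IDs and the result set are made up for illustration, and nothing here reflects how Cognition measures its own system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved results.

    precision = share of retrieved documents that are relevant
    recall    = share of relevant documents that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four documents are truly relevant to a query.
relevant_docs = {"d1", "d2", "d3", "d4"}

# A made-up result set that finds two of them plus three off-target documents.
keyword_results = {"d1", "d2", "d7", "d8", "d9"}

precision, recall = precision_recall(keyword_results, relevant_docs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Returning more documents usually lifts recall and drags precision down, so a vendor asserting gains on both is making a strong claim that deserves test data.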

Here at the goose pond, there has been an uptick in questions about semantic search. We have added Cognition to our Overflight service.

Stephen E Arnold, April 6, 2010

This is a freebie.

Wolfram Alpha Reloads

April 5, 2010

I enjoy firing queries into the Wolfram Alpha system. The challenge for me is figuring out exactly how to get the system to respond. Addled geese are notoriously bad searchers, and I get a fair share of “Wolfram Alpha isn’t sure how to compute an answer from your input.” If I can do an input, then Wolfram Alpha can’t do an output. I don’t have that problem when I fire a query into Bing.com or Google.com. Those systems display something, often anything. For most users this approach is exactly what’s needed. In my experience, some people looking for information don’t know what they don’t know. Hence, any information is likely to be relevant in my opinion.

I learned when I read “Wolfram Alpha Tries Again: New Mobile Site, Big Refunds,” that the company is “making a new start” with its mobile search app for the iPhone. The article revealed:

The company has also drastically cut the price of the Wolfram/Alpha App for the iPhone and iPod touch to $1.99, down from the previous rather unfeasible $49.99.  It’s offering refunds to anyone that bought at the old price, here – apparently 10,000 people did. It’s a particularly generous move on the company’s part, as it means Wolfram/Alpha is covering the cut Apple took on that price.

Is Wolfram Alpha blazing a trail for the many iPhone application developers who want to cash in on the Apple craze for shiny, touchy gizmos? Publishers of “real” content are among the most interesting segments of app developers. I am looking forward to seeing the results of their software / content initiatives. Refunds would be even more unwelcome than making no app sales in my opinion.

Stephen E Arnold, April 5, 2010

No one paid us to write this.

FBI and Microsoft Fast Question

April 5, 2010

I read “FBI System Modernization Faulted” and realized that the Information Week write up might be touching upon a portion of the elephant. From my polluted pond in Kentucky, I have no access to juicy details that may be floating around watering holes in downtown DC, but maybe additional information will surface. What I noted in the write up was this interesting passage:

The Department of Justice’s inspector general has expressed new concerns about the progress of the agency’s $451 million-plus case management system, less than two weeks after FBI director Robert Mueller told Congress that the bureau had again suspended work on certain parts of the system’s deployment. “After more than 3 years and $334 million expended on the development and maintenance of Sentinel, the cost to Sentinel is rising, the completion of Sentinel has been delayed, and the FBI does not have a current schedule or cost estimate for completing the project,” the report says.

The search component seems to be chugging along:

Lockheed has begun to move administrative case data from the mainframe to the core Sentinel system itself. This will enable the FBI and Lockheed to deploy enhanced search capabilities based on Microsoft’s FAST enterprise search products. With this — and the migration of further data from ACS to Sentinel over the next phase of the project — will come the ability for users to customize their profiles, enabling them to receive automatic alerts whenever information about certain individuals, topics, and locations gets posted or uploaded to Sentinel.

Now the elected officials are monitoring the information system:

“It’s terribly frustrating that we’re in this position again,” Sen. Chuck Grassley, R-Iowa, said in a statement last week after learning of the most recent delays. “This is the third go at modernizing the FBI’s computer system and hundreds of millions of dollars have been wasted. I plan to hold the FBI’s feet to the fire until this project is back on track and completed.”

One question: “What happens when the system does not work?” I think this is an unlikely situation, but something is causing ace integrator Lockheed and software giant Microsoft to look a bit off center. History has a habit of repeating itself. Describing software is so much easier than getting software to deliver on certain marketing assertions in my experience.

Stephen E Arnold, April 5, 2010

No one paid me to capture these thoughts.

Exorbyte and Sell By Search

April 5, 2010

I read the German language article “Die Suche von Morgen: Exorbyte startet mit SellBySearch eine intelligente Suche für Online-Shops” on the Online Press Web site and learned about a new Exorbyte initiative, Sell By Search.

The idea is that the Exorbyte structured search system is available as a hosted solution. The buzzword fans will recognize this as a cloud-computing service. According to the write up, the solution is competitively priced and makes it easy for online merchants to provide site visitors with a powerful search solution. Response time is measured in milliseconds. The new service also includes detailed statistics on the search behavior of their customers, from the most frequently requested items to search trends.

If you are not familiar with Exorbyte, you can read an interview with one of the firm’s founders in the ArnoldIT.com Search Wizards Speak interviews. Click here to read the interview with Exorbyte’s founder, Benno Nieswand. You can get additional information about the company from its Web site at http://www.exorbyte.com/.

Stephen E Arnold, April 5, 2010

No one paid me to write this.

TextDigger and Open Web Service

April 5, 2010

The flurry of semantic activity reminded me that I had not updated my files on TextDigger, a semantic player based in San Jose, California. In February 2009, the company received an infusion of $4.3 million from True Ventures (San Francisco, CA) with a follow on investment from Intel Capital (Santa Clara, CA), CBS Interactive, and some private investors. The company says:

TextDigger was founded by a group of former CNET employees and executives who developed patented linguistic technologies that, today, are used to auto-generate thousands of natural language texts posted on CNET’s award winning websites. Several members of TextDigger also founded SmartShop, which was acquired by CNET Networks in 2002. SmartShop developed a feature-rich comparison shopping engine distinguished by its ability to merge catalog data (product specs and features) from different sources, normalizing their divergent syntax and semantics, and presenting a unified catalog.

The company has an interesting Open Web Service angle. In effect, the firm’s automated semantic tagging tool can be used by anyone who wants to tap the company’s technology to add to the Community Semantics term list. TextDigger’s technology is the top rated service in this project, which carries more weight for me than the comments from azure chip consultants, poobahs, and carpetbaggers. TextDigger offers a version of its system for search engine optimization. You can sign up for a demo of the service on the TextDigger Web site here.

I read a paper by Tim Musgrove and Robin Walsh a couple of years ago. My recollection is that TextDigger’s technology had an interesting “more like this” function. My memory is hazy, but these types of functions can be computationally hungry. As Intel’s chips get more horsepower, semantic operations become less expensive. Intel has shown interest in other text-related investments, including Endeca. The commoditization of search and spare CPU cycles may open the door to some content related processes turning up in silicon. I am tracking the use of “open” as a differentiator, and like the word “search”, “open” has many shades of meaning.

Stephen E Arnold, April 5, 2010

A freebie.
