Attivio Springs to the Defense of MarkLogic
December 16, 2013
Through their blog, Attivio weighs in on the HealthCare.gov debate: “Could IBM or Oracle Have Been the Miracle Cure for Healthcare.gov?” The telling subtitle reads, “if you believe that, then I have a bridge to sell you.” Yes, Attivio comes out against pinning all the blame on a refusal to go with the tried and true (or the outdated and limited, depending on one’s perspective).
Senior Attivio marketing VP (and blogger) MaryAnne Sinville observes that the latest trend in the finger-pointing crusade is to assert that the site’s database component should have gone to an old stalwart like IBM, Oracle, or Microsoft instead of to the NoSQL firm MarkLogic. Not because those databases are better suited to the project, necessarily, but because it is easier to find technicians familiar with those systems.
Sinville writes:
“Does anyone really believe a better solution to a project involving many disparate sources of information, complex logic, and a dynamic interface, which must be built in a very short timeframe would have been to select IBM, Microsoft or Oracle? The idea that legacy mega-vendors have the agility required for a project of this scope is absurd, as the states of Oregon, Pennsylvania and the US Air Force have all recently learned the hard way.
Let’s take a look at the real issues at play here. Selecting a NoSQL database like MarkLogic, or more precisely in this case, an XML database, means that all of the Healthcare.gov data sources would have to be converted to XML. Of course that’s a monumental task, but it’s no more difficult and time consuming than the arduous extract, transform and load (ETL) processes required by traditional relational databases because of their fixed schema. The enormous time and cost associated with ETL is precisely why new technologies are emerging.”
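Sinville’s contrast between up-front relational ETL and document-style ingestion can be sketched in a few lines of Python (the record fields and data sources below are invented for illustration):

```python
# Two hypothetical data sources with different, overlapping fields.
irs_record = {"ssn": "123-45-6789", "agi": 52000}
state_record = {"ssn": "123-45-6789", "household_size": 3, "state": "KY"}

# Relational ETL: every record must be transformed to fit one fixed schema;
# fields the schema did not anticipate are dropped or force a migration.
SCHEMA = ("ssn", "agi", "household_size")

def to_row(record):
    """Force a record into the fixed relational schema."""
    return tuple(record.get(col) for col in SCHEMA)

# Document approach: each record is stored as-is, schema differences and all.
documents = [irs_record, state_record]

rows = [to_row(r) for r in (irs_record, state_record)]
print(rows[1])  # note the 'state' field silently vanished from the relational row
```

The sketch overstates neither side: the relational row is uniform and easy to query, but the transformation loses anything the schema did not anticipate, which is the cost Sinville is pointing at.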
For a nation that prides itself on innovation, we seem to have a lot of folks afraid of progress. Granted, Attivio has a stake in encouraging organizations to break away from traditional database providers. Still, I agree that a project this size called for the most up-to-date approach available. Let us turn our accusatory gaze from MarkLogic, which after all represents a small fraction of the vendors involved with this website, to where it belongs: our government’s unwieldy and outdated procurement process. Admittedly, addressing that will be much tougher than assigning a scapegoat, but the approach has a singular advantage: it might actually fix a problem currently poised to cause us trouble for years to come.
Cynthia Murrell, December 16, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Connotate Still Going Strong
December 16, 2013
We see that Connotate continues to grow. In the News section of their Web site, the company announces, “Connotate Sees Growth and Expands Global Footprint in Q3.” The press release states that, over the last quarter, the Web-data firm has won new clients, renewed existing ones, and formed partnerships that extend their presence around the world.
We learn:
“AT&T, Experian and APCOA are among the new client wins; renewals include Thomson Reuters, Dow Jones and ADP. All of these top companies rely on Connotate to monitor and collect precise and complex data, at scale, to advance business capabilities….
“Connotate continues to grow its partner network both locally and abroad. Connotate added US-headquartered partners Ntrepid and Basis Technology to its partner network, as well as Shenzhen Plan Software, a China-based reseller focused on satisfying demand for big data solutions in Asia.”
The write-up also notes that FetchCheck, the company’s background check solution, processed more than four million transactions in 2013, a substantial number in that field. Based in New Brunswick, New Jersey, Connotate was founded in 2000. The company strives to simplify web-data extraction and monitoring, providing clients with strong business insights through a user-friendly platform.
Cynthia Murrell, December 16, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Artificial General Intelligence: Batting the Knowledge Ball toward IBM and Google
December 15, 2013
If you are interested in “artificial intelligence” or “artificial general intelligence,” you will want to read “Creative Blocks: The Very Laws of Physics Imply That Artificial Intelligence Must Be Possible. What’s Holding Us Up?” Artificial general intelligence is the discipline that seeks to replicate the human brain in a computing device.
Dr. Deutsch asserts:
I cannot think of any other significant field of knowledge in which the prevailing wisdom, not only in society at large but also among experts, is so beset with entrenched, overlapping, fundamental errors. Yet it has also been one of the most self-confident fields in prophesying that it will soon achieve the ultimate breakthrough.
Opinions about making a machine’s brain work like a human’s have, says Dr. Deutsch:
split the intellectual world into two camps, one insisting that AGI was none the less impossible, and the other that it was imminent. Both were mistaken. The first, initially predominant, camp cited a plethora of reasons ranging from the supernatural to the incoherent. All shared the basic mistake that they did not understand what computational universality implies about the physical world, and about human brains in particular. But it is the other camp’s basic mistake that is responsible for the lack of progress. It was a failure to recognize that what distinguishes human brains from all other physical systems is qualitatively different from all other functionalities, and cannot be specified in the way that all other attributes of computer programs can be. It cannot be programmed by any of the techniques that suffice for writing any other type of program. Nor can it be achieved merely by improving their performance at tasks that they currently do perform, no matter by how much.
One of the examples Dr. Deutsch invokes is IBM’s game show “winning” computer Watson. He explains:
Nowadays, an accelerating stream of marvelous and useful functionalities for computers are coming into use, some of them sooner than had been foreseen even quite recently. But what is neither marvelous nor useful is the argument that often greets these developments, that they are reaching the frontiers of AGI. An especially severe outbreak of this occurred recently when a search engine called Watson, developed by IBM, defeated the best human player of a word-association database-searching game called Jeopardy. ‘Smartest machine on Earth’, the PBS documentary series Nova called it, and characterized its function as ‘mimicking the human thought process with software.’ But that is precisely what it does not do. The thing is, playing Jeopardy — like every one of the computational functionalities at which we rightly marvel today — is firmly among the functionalities that can be specified in the standard, behaviorist way that I discussed above. No Jeopardy answer will ever be published in a journal of new discoveries. The fact that humans perform that task less well by using creativity to generate the underlying guesses is not a sign that the program has near-human cognitive abilities. The exact opposite is true, for the two methods are utterly different from the ground up.
IBM surfaces again with regard to chess, a game its machines conquered years ago:
Likewise, when a computer program beats a grandmaster at chess, the two are not using even remotely similar algorithms. The grandmaster can explain why it seemed worth sacrificing the knight for strategic advantage and can write an exciting book on the subject. The program can only prove that the sacrifice does not force a checkmate, and cannot write a book because it has no clue even what the objective of a chess game is. Programming AGI is not the same sort of problem as programming Jeopardy or chess.
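Deutsch’s chess point can be made concrete: the engine’s “choice” is exhaustive search over numeric scores, with no representation of strategy, sacrifice, or purpose at all. A toy minimax sketch in Python (the stand-in “game” and evaluation function are invented for illustration, not any real engine):

```python
# Toy minimax: the "player" is just recursive search over scored positions.
# There is no concept of knights, sacrifices, or strategy -- only numbers.
def minimax(position, depth, maximizing, moves, evaluate):
    children = moves(position)
    if depth == 0 or not children:
        return evaluate(position)
    scores = (minimax(c, depth - 1, not maximizing, moves, evaluate)
              for c in children)
    return max(scores) if maximizing else min(scores)

# A stand-in "game": positions are integers, a move adds or subtracts one,
# and the evaluation is simply the position's value.
moves = lambda p: [p + 1, p - 1] if abs(p) < 3 else []
evaluate = lambda p: p

print(minimax(0, 2, True, moves, evaluate))  # prints 0: the opponent's reply
# cancels the first move's gain -- a fact found by search, not understood
```

Nothing in the function knows it is playing a game; swap in a different `moves` and `evaluate` and the identical code “plays” chess, checkers, or anything else, which is exactly why its competence says nothing about understanding.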
After I read Dr. Deutsch’s essay, I refreshed my memory about Dr. Ray Kurzweil’s view. You can find an interesting essay by this now-Googler in “The Real Reasons We Don’t Have AGI Yet.” The key assertions are:
The real reasons we don’t have AGI yet, I believe, have nothing to do with Popperian philosophy, and everything to do with:
- The weakness of current computer hardware (rapidly being remedied via exponential technological growth!)
- The relatively minimal funding allocated to AGI research (which, I agree with Deutsch, should be distinguished from “narrow AI” research on highly purpose-specific AI systems like IBM’s Jeopardy!-playing AI or Google’s self-driving cars).
- The integration bottleneck: the difficulty of integrating multiple complex components together to make a complex dynamical software system, in cases where the behavior of the integrated system depends sensitively on every one of the components.
Dr. Kurzweil concludes:
The difference between Deutsch’s perspective and my own is not a purely abstract matter; it does have practical consequence. If Deutsch’s perspective is correct, the best way for society to work toward AGI would be to give lots of funding to philosophers of mind. If my view is correct, on the other hand, most AGI funding should go to folks designing and building large-scale integrated AGI systems.
These discussions are going to be quite important in 2014. As search systems do more thinking for the human user, disagreements that appear to be theoretical will have a significant impact on what information is displayed for a user.
Do users know that search results are shaped by algorithms that “think” they are smarter than humans? Good question.
Stephen E Arnold, December 15, 2013
Math, Proofs, and Collaboration
December 15, 2013
I know that the search engine optimization folks are already on top of this idea, but for the mere mortals of the “search” world, check out “Voevodsky’s Mathematical Revolution.” Vladimir Voevodsky is a Fields Medal winner who has been casting about for fresh challenges. He hit upon one: using a computer to verify proofs. The write-up explains that the advantage of his new foundation is that “the fundamental concepts are much closer to where ordinary mathematicians do their work.”
The comment I noted pertains to mathematical proofs. As you know, creating a proof is, for many, the essence of mathematics. Verifying proofs, however, is tough work. Voevodsky predicts:
“I can’t see how else it will go,” he said. “I think the process will be first accepted by some small subset, then it will grow, and eventually it will become a really standard thing. The next step is when it will start to be taught at math grad schools, and then the next step is when it will be taught at the undergraduate level. That may take tens of years, I don’t know, but I don’t see what else could happen.”
The consequence of verification tools like the Coq proof assistant is even more interesting:
He also predicts that this will lead to a blossoming of collaboration, pointing out that right now, collaboration requires an enormous trust, because it’s too much work to carefully check your collaborator’s work. With computer verification, the computer does all that for you, so you can collaborate with anyone and know that what they produce is solid. That creates the possibility of mathematicians doing large-scale collaborative projects that have been impractical until now.
Interesting.
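For readers who have not met a proof assistant, a machine-checked proof is simply code that the compiler type-checks; if the file compiles, every proof in it has been mechanically verified. A minimal sketch in Lean, a system in the same family as Coq (theorem names here are invented for illustration):

```lean
-- A theorem is a program; `rfl` asks the compiler to check the two
-- sides compute to the same value.
theorem two_plus_two : 2 + 2 = 4 := rfl

-- A collaborator's lemma can be reused without re-checking it by hand;
-- here we invoke the library's verified proof that addition commutes.
theorem swap_sum (m n : Nat) : m + n = n + m := Nat.add_comm m n
```

This is the mechanism behind Voevodsky’s collaboration point: trusting `swap_sum` requires trusting the checker, not the colleague who wrote it.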
Stephen E Arnold, December 15, 2013
Big Data Still Faces a Few Hitches
December 15, 2013
Writer Mellisa Tolentino assesses the state of big data in, “Big Data Economy: The Promises + Hindrances of BI, Advanced Analytics” at SiliconAngle. Pointing to the field’s expected $50 billion in revenue by 2017, she says the phenomenon has given rise to a “Data Economy.” The article notes that enterprises in a number of industries have been employing big data tech to increase their productivity and efficiency.
However, there are still some wrinkles to be ironed out. One is the cumbersome process of pulling together data models and curating data sources, a real time suck for IT departments. This problem, though, may find resolution in nascent services that will take care of all that for a fee. The biggest issue may be the debate about open source solutions.
The article explains:
“Proponents of the open-source approach argue that it will be able to take advantage of community innovations across all aspects of product development, that it’s easier to get customers especially if they offer fully-functioning software for free. Plus, they say it is easier to get established partners that could easily open up market opportunities.
Unfortunately, the fully open-source approach has some major drawbacks. For example, the open-source community is often not united, making progress slower. This affects the long-term future of the product and revenue; plus, businesses that offer only services are harder to scale. As for the open core approach, though it has the potential to create value differentiation faster than the open source community, experts say it can easily lose its value when the open-source community catches up in terms of functionality.”
Tolentino adds that vendors can find themselves in a reputational bind when considering open source solutions: If they eschew the open core approach, they may be seen as refusing to support the open source community. However, if they do embrace open source solutions, some may accuse them of taking advantage of that community. Striking the balance while doing what works best for one’s company is the challenge.
Cynthia Murrell, December 15, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Business Intelligence: Free Pressures For-Fee Solutions
December 14, 2013
I read “KB Crawl sort la tête de l’eau” (“KB Crawl gets its head above water”), published by 01Business. The hook for the article is that KB Crawl, a company harvesting Internet content for business intelligence analyses, has emerged from bankruptcy. Good news for KB Crawl, whose parent company is reported to be KB Intelligence.
The write-up contained other interesting information as well.
First, the article points out that business intelligence services like KB Crawl are perceived as costs, not revenue producers. If this is accurate, the same problem may be holding back once promising US vendors like Digital Reasoning and Ikanow, among others.
Second, the article seems to suggest that for-fee business intelligence services are in direct competition with free services like Google. Although Google’s focus on ads continues to affect the relevance of Google results, users may be comfortable with information provided by free services. Will the same preference for free impact the US business intelligence sector?
Third, the article identifies a vendor (Ixxo) as facing some financial headwinds, writing:
D’autres éditeurs du secteur connaissent des difficultés, comme Ixxo, éditeur de la solution Squido. (“Other vendors in the sector are experiencing difficulties, such as Ixxo, publisher of the Squido solution.”)
But the most useful information in the story is the list of companies that compete with KB Crawl. Some of the firms are:
- AMI Software. www.amisw.com. This company has roots in enterprise search and touts 1,500 customers.
- Data Observer. www.data-observer.com. The company is a tie up between Asapspot and Data-Deliver. The firm offers “an all-encompassing Internet monitoring and e-reputation services company.”
- Digimind. www.digimind.com. The firm makes sense of social media.
- Eplica. A possible reference to a San Diego employment services firm.
- iScop. Unknown.
- Ixxo. www.ixxo.fr. The firm “develops innovative software applications to boost business responsiveness when faced with unstructured data.”
- Pikko. www.pikko-software.com. A visualization company.
- Qwam. www.qwamci.com. Another “content intelligence” company.
- SindUp. www.sindup.fr. The company offers a monitoring platform for strategic and e-reputation information.
- Spotter. www.spotter.com. A company that provides the “power to understand.”
- Synthesio. www.synthesio.com. The company says, “We help brands and agencies find valuable social insights to drive real business value.”
- TrendyBuzz. www.trendybuzz.com. The company lets a client measure “Internet visibility units.”
My view is that 01Business may be identifying a fundamental problem in the for-fee business intelligence, open source harvesting, and competitive intelligence sector.
Information about business and competitive intelligence that I see in my TRAX Overflight service is mostly of the “power of positive thinking” variety. Companies like Palantir capture attention because the firms are able to raise astounding amounts of funding. Less visible are the financial pressures on the companies trying to generate revenue with systems aimed at commercial enterprises.
If the 01Business article is on the money, which US vendors are likely to have their heads under water in 2014? Use the comments section of this blog to identify the stragglers in the North American market.
Stephen E Arnold, December 14, 2013
BA Insight Makes Deloitte Fast 500 List
December 14, 2013
It looks like BA Insight is growing and growing. Yahoo Finance shares, “BA Insight Ranked Number 393 Fastest Growing Company in North America on Deloitte’s 2013 Technology Fast 500 (TM).” The list ranks the 500 fastest-growing tech, media, telecom, life sciences, and clean tech companies on the continent. The evaluation is based on percentage fiscal-year revenue growth from 2008 to 2012. (See the article for the conditions contenders must meet.)
We learn:
“BA Insight’s Chief Executive Officer, Massood Zarrabian credits the emergence of Big Data and the market demand for search-driven applications for the company’s revenue growth. He said, ‘We are honored to be ranked among the fastest growing technology companies in North America. BA Insight has been focused on developing the BAI Knowledge Integration Platform that enables organization to implement powerful search-driven applications rapidly, at a fraction of the cost, time, and risk of traditional alternatives. Additionally, we have partnered with visionary organizations to transform their enterprise search engines into knowledge engines giving them full access to organizational knowledge assets.'”
The press release notes that BA Insight has grown 193 percent over five years. Interesting: while other firms are struggling, BA Insight has nearly tripled its revenue. But from what to what? The write-up does not say.
BA Insight has set out to redefine enterprise search to make it more comprehensive and easier to use. Founded in 2004, the company is headquartered in Boston and keeps its technology center in New York City. Some readers may be interested to know that the company is currently hiring for the Boston office.
Cynthia Murrell, December 14, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Content Intelligence: 2003 and the Next Big Thing
December 13, 2013
I have been working through some of the archives in my personal file about search vendors. I came across a wonderfully amusing article from DMReview: “The Problem with Unstructured Data.”
Here’s the part I circled in 2003, a decade ago, about the next big thing:
Content intelligence is maturing into an essential enterprise technology, comparable to the relational database. The technology comes in several flavors, namely: search, classification and discovery. In most cases, however, enterprises will want to integrate this technology with one or more of their existing enterprise systems to derive greater value from the embedded unstructured data. Many organizations have identified high-value, content intelligence-centric applications that can now be constructed using platforms from leading vendors. What will make content intelligence the next big trend is how this not-so-new set of technologies will be used to uncover new issues and trends and to answer specific business questions, akin to business intelligence. When this happens, unstructured data will be a source of actionable, time-critical business intelligence.
I can see this paragraph appearing without much of a change in any one of a number of today’s vendors’ marketing collateral.
I just finished an article about the lack of innovation in search and content processing. My focus in that essay was the period from 2007 to the present. I will keep my eyes open for examples of jargon and high-flying buzzwords that reach even deeper into the forgotten past of search and retrieval.
The chit chat on LinkedIn about “best” search system is a little disappointing but almost as amusing as this quote from DM Review. Yep, “content intelligence” was the next big thing a decade ago. I suppose that “maturing” process is like the one used for Kentucky bourbon. No oak barrels, just hyperbole, for the search mavens.
Stephen E Arnold, December 13, 2013
Google and Images for Email: A Different View, Very Different
December 13, 2013
I have watched the comments about Google’s decision to cache images. A notable “this is what those guys are doing” appears in “Gmail Blows Up E-Mail Marketing by Caching All Images on Google Servers.” The focus is on the tracking function that e-mail marketers and various search engine optimization poobahs love to discuss.
Here’s a passage I noted:
Unless you click on a link, marketers will have no idea the e-mail has been seen. While this means improved privacy from e-mail marketers, Google will now be digging deeper than ever into your e-mails and literally modifying the contents. If you were worried about e-mail scanning, this may take things a step further. However, if you don’t like the idea of cached images, you can turn it off in the settings. This move will allow Google to automatically display images, killing the “display all images” button in Gmail. Google servers should also be faster than the usual third-party image host.
So the write up points out an upside (speed) and a downside (tracking).
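The caching described above amounts to URL rewriting: before displaying a message, the mail system points each remote image at its own proxy, so the marketer’s server sees the proxy’s fetch rather than the individual reader’s. A minimal sketch in Python (the proxy host and URL scheme are invented for illustration; Google’s actual rewriting differs in its details):

```python
import re

# Hypothetical caching proxy, standing in for Google's image servers.
PROXY = "https://images.example-mailproxy.com/fetch?url="

def rewrite_images(html):
    """Point every <img src> in an e-mail body at the caching proxy."""
    return re.sub(
        r'(<img\b[^>]*\bsrc=")([^"]+)(")',
        lambda m: m.group(1) + PROXY + m.group(2) + m.group(3),
        html,
    )

email_body = '<p>Sale!</p><img src="http://marketer.test/pixel.gif?user=42">'
print(rewrite_images(email_body))
# The per-recipient tracking pixel now resolves through the proxy, so the
# marketer observes the proxy's request, not the reader's open event.
```

This one substitution explains both effects the write-up mentions: images load from nearby, fast proxy servers, and the sender loses its per-open signal unless the recipient actually clicks a link.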
My view is different. I don’t want to dig into the plumbing at Google. There are some interesting functions that build a knowledgebase. The knowledgebase is not a single entity. There are a number of them; for example, advertisers’ orders, various silos of indexes, and map info. Google is into metadata, so there are silos of information about information. There is no easy way to visualize this architecture, but I like to suggest that Google is fractalizing data, metadata, and information.
My view is that the shift has two other functions.
First, it signals that Google will be monetizing aggressively certain types of system actions. I am not sure if advertisers, marketers, or users will be the ones paying more to tap into Google’s ecosystem. I recall a Google VP’s comment to me in 2006:
There is no free lunch for Google search.
Second, the shift makes it much easier for Google to filter out images that carry some type of explicit ownership statement. The images can now safely be tucked into one of the data structures hinted at in the Programmable Search Engine technology invented by Dr. Ramanathan Guha. The easiest way to build a massive collection of reusable images is to get ‘em and keep ‘em. Will free image search and image access go the way of the dodo? I don’t know.
The shift, therefore, may have more to do with setting the stage for monetizing some user / advertiser actions and building out a Google controlled image service.
I know that thinking differently about Google is much easier in rural Kentucky than in the go-go land of Silicon Valley. Dismiss my observations if you wish. There is more to the shift than meets the eye.
Stephen E Arnold, December 13, 2013
SharePoint and Exchange Service Pack 1 Coming Soon
December 13, 2013
Service Pack 1 is coming to the Microsoft suite: Office, SharePoint, and Exchange. Users are wondering what to expect, and InfoWorld gives some details in its article, “Get Ready for the Office, SharePoint, and Exchange 2013 SP1 Service Packs.”
The article begins:
“Early 2014 will see Service Pack 1 updates for Office 2013, SharePoint 2013, and Exchange 2013 (but apparently not Lync), bringing the on-premises versions of these servers and applications up to par with the then-current Office 365 versions. It appears that issuing periodic service packs is how Microsoft will keep the on-premises versions of its offerings at parity with the cloud-delivered Office 365 versions, whose changes come more incrementally but more often — and automatically.”
Stephen E. Arnold is a longtime leader in search. He follows the latest happenings of SharePoint through his Web service, ArnoldIT. And though many will spend time on the pros and cons of SharePoint, Arnold finds add-ons and customization tools to help you get the most out of your deployment.
Emily Rae Aldridge, December 13, 2013
Sponsored by ArnoldIT.com, developer of Augmentext