March 8, 2014
Ontotext delivers very interesting services to its clients. All of its products center on semantic technology and on utilizing big data to benefit users. On its Web site, the company describes itself as:
“Ontotext develops a unique portfolio of core semantic technologies. Our RDF engine powers some of the biggest world-renowned media sites. Our text-mining solutions demonstrate unsurpassed accuracy across different domains – from sport news to macro-economic analysis, scientific articles and clinical trial reports. We enable the next generation web of data and we can efficiently extract information from today’s structured web – be it recipes, adverts or anything else.”
It offers services for job extraction, hybrid semantics, and semantic publishing for industries such as life sciences, government, recruitment, libraries, publishing, and media. Ontotext has a range of products to help people harness semantic technology. The most interesting to us is the Semantic Biomedical Tagger, which is described as an extraction system that creates semantic annotations in biomedical texts. Ontotext also has the requisite search engine and semantic database. Its product line is fairly robust, and we intend to keep an eye on its offerings.
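Ontotext does not publish the tagger’s internals, but the general idea of semantic annotation is easy to illustrate. Here is a minimal, dictionary-based sketch in Python; the term list and types are our own invention, not Ontotext’s:

```python
# Toy dictionary-based semantic tagger. This is a generic illustration of
# semantic annotation, not Ontotext's Semantic Biomedical Tagger; the term
# list is invented for the example.
BIOMEDICAL_TERMS = {
    "aspirin": "Drug",
    "diabetes": "Disease",
    "BRCA1": "Gene",
}

def annotate(text):
    """Return (start, end, surface form, semantic type) for each known term."""
    annotations = []
    lowered = text.lower()
    for term, semantic_type in BIOMEDICAL_TERMS.items():
        start = lowered.find(term.lower())
        while start != -1:
            end = start + len(term)
            annotations.append((start, end, text[start:end], semantic_type))
            start = lowered.find(term.lower(), end)
    return sorted(annotations)

print(annotate("Patients with diabetes were given aspirin daily."))
# [(14, 22, 'diabetes', 'Disease'), (34, 41, 'aspirin', 'Drug')]
```

A production system would add disambiguation, ontology links, and statistical models, but the output shape, typed spans anchored to text offsets, is the same basic idea.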
March 7, 2014
Linguastat promises to transform big data, using the metaphor of “turning haystacks into gold.” Its Content Transformation Platform was developed for military intelligence with the goal of generating specific, user-defined content. Since its launch, Linguastat has counted ecommerce companies, real estate groups, sports organizations, digital publishers, and others among its clients.
What caught our attention was this bullet point about the Content Transformation Platform:
“Automatically writes optimized and copyrightable content.”
Linguastat states that its platform produces thousands of product descriptions and digital stories a day for its clients. The company also notes that consumers are more likely to make online purchases when rich product content is available; the content is used to inform the consumer about the product. Its clients are in the market for usable content that comes at a low price.
While software is written to be extremely “smart” these days, we have a few doubts about the quality of the platform’s stories. Having never worked with the platform, we can only go on our own experience with automated stories. They often lack the conversational, readable tone that consumers expect, and they tend to list facts in sentences; cohesiveness is lost in automation. It is possible Linguastat has found the magic formula that makes machine-written stories digestible. Then again, it did promise to turn haystacks into gold.
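To make the “listing facts in sentences” complaint concrete, here is a deliberately naive template filler. This is our own toy example, in no way Linguastat’s platform; the product data is invented:

```python
# A deliberately naive template filler, meant to show why machine-written
# product copy can read like a list of facts. Our own toy example, not
# Linguastat's Content Transformation Platform.
product = {
    "name": "Trailblazer 2000 Backpack",
    "capacity": "45 liters",
    "weight": "1.2 kg",
    "material": "ripstop nylon",
}

TEMPLATE = (
    "The {name} has a capacity of {capacity}. "
    "It weighs {weight}. "
    "It is made of {material}."
)

print(TEMPLATE.format(**product))
# The Trailblazer 2000 Backpack has a capacity of 45 liters.
# It weighs 1.2 kg. It is made of ripstop nylon.
```

Every attribute becomes its own flat sentence; stitching the facts into copy that reads as if a human wrote it is the hard part Linguastat claims to have solved.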
February 26, 2014
Is big data the key to boosting Africa’s economic prowess? IBM seems to think so, and it is sending in its AI ambassador Watson to help with the continent’s development challenges. Watson is IBM’s natural language processing system that famously won Jeopardy in 2011. Now, Phys.org announces that “IBM Brings Watson to Africa.” The $100 million initiative is known as Project Lucy, named after the skeleton widely considered the earliest known human ancestor (Australopithecus, to be specific), discovered in Africa in 1974. (I would be remiss if I did not mention that an older skeleton, Ardipithecus, was found in 1994; there is still no consensus on whether this skeleton is really a “human ancestor,” though many scientists believe it is. But I digress.)
The write-up tells us:
“Watson technologies will be deployed from IBM’s new Africa Research laboratory providing researchers with a powerful set of resources to help develop commercially-viable solutions in key areas such as healthcare, education, water and sanitation, human mobility and agriculture.
“To help fuel the cognitive computing market and build an ecosystem around Watson, IBM will also establish a new pan-African Center of Excellence for Data-Driven Development (CEDD) and is recruiting research partners such as universities, development agencies, start-ups and clients in Africa and around the world. By joining the initiative, IBM’s partners will be able to tap into cloud-delivered cognitive intelligence that will be invaluable for solving the continent’s most pressing challenges and creating new business opportunities.”
IBM expects that with the help of its CEDD, Watson will be able to facilitate data collection and analysis on social and economic conditions in Africa, identifying correlations across multiple domains. The first two areas on Watson’s list are healthcare and education, both realms where improvement is sorely needed. The Center will coordinate with IBM’s 12 laboratories around the world and its new Watson business unit. (Wait, Watson now has its own business unit?) See the article for more on this hopeful initiative.
Cynthia Murrell, February 26, 2014
February 21, 2014
If you’re in a position to make decisions about how your company is going to handle Business Intelligence and Enterprise Search needs, you may want to have a look at “Global Big Data Market 2014-2018,” a new market research report offered by ReportLinker. PRNewswire reported on the publication.
The full report presents primary and secondary research conducted by TechNavio’s analysts, who
“forecast the Global Big Data market to grow at a CAGR of 34.17 percent over the period 2013-2018. One of the key factors contributing to this market growth is the need to upgrade business processes and improve productivity. The Global Big Data market has also been witnessing the increase in market consolidation. However, the lack of awareness about the potential of big data could pose a challenge to the growth of this market.”
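For perspective, a 34.17 percent CAGR compounds quickly. A quick back-of-the-envelope calculation, assuming the five-year 2013-2018 window:

```python
# Compound annual growth rate arithmetic for the forecast quoted above.
cagr = 0.3417              # 34.17 percent per year
years = 5                  # 2013 through 2018
multiple = (1 + cagr) ** years
print(f"{multiple:.2f}x")  # about 4.35x: the market more than quadruples
```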
A host of vendors is covered in the full report, which addresses the key challenges of the global Big Data market and the forces driving developments. My guess is that the market’s growing adoption of Hadoop is one of those forces.
Laura Abrahamsen, February 21, 2014
February 20, 2014
Inetsoft released version 11.5 of Style Intelligence, its Business Intelligence platform, late in 2013. According to the Passionned Group post “New Release of Inetsoft Includes Hadoop Connector,” the new data source connector will
“enable users to add data from the Apache Hive data warehouse to their dashboards and analysis in the same easy way they connect any other data source.”
The update also features a complete restyle of the user interface, which is based on a self-service dashboard for data reporting. Style Intelligence is built on an open-standard SOA architecture and delivers data reports from both structured and unstructured sources.
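Inetsoft has not published the connector’s API, but the general pattern of pulling Apache Hive data into a reporting layer looks something like this sketch using the PyHive library; the host, credentials, and table are placeholders, not Inetsoft’s connector:

```python
# A generic illustration of querying Apache Hive from Python via PyHive.
# This is not Inetsoft's connector; host, port, username, and the table
# are placeholders invented for the example.
from pyhive import hive

conn = hive.Connection(host="hive.example.com", port=10000, username="analyst")
cursor = conn.cursor()
cursor.execute("SELECT region, SUM(revenue) FROM sales GROUP BY region")
for region, total in cursor.fetchall():
    print(region, total)
conn.close()
```

The selling point of a packaged connector is hiding exactly this plumbing behind the same dialog users see for any other data source.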
In other words, Inetsoft has seen that Hadoop is the way Big Data is going, and it wants to make sure its own product can work with what is fast becoming the industry standard.
Laura Abrahamsen, February 20, 2014
February 19, 2014
The press release on ThomsonReuters.com titled “Thomson Reuters Cortellis Data Fusion Addresses Big Data Challenges by Speeding Access to Critical Pharmaceutical Content” announces the embrace of big data by revenue-hungry Thomson Reuters. The new addition to the suite of drug development technologies will offer users a more intuitive interface through which they can analyze large volumes of data. Chris Bouton, General Manager at Thomson Reuters Life Sciences, is quoted in the article:
“Cortellis Data Fusion gives us the ability to tie together information about entities like diseases and genes and the connections between them. We can do this from Cortellis data, from third party data, from a client’s internal data or from all of these at the same time. Our analytics enable the client to then explore these connections and identify unexpected associations, leading to new discoveries… driving novel insights for our clients is at the very core of our mission…”
The changes at Thomson Reuters are the result of the company’s acquisition of Entagen, according to the article. That company is a leader in the field of semantic search and has been working with biotech and pharmaceutical companies, offering both development services and navigation software. Cortellis Data Fusion promises greater insights and better control over the data Thomson Reuters holds, while maintaining enterprise information security to keep the data safe.
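Thomson Reuters does not disclose how Cortellis Data Fusion models these connections, but the idea of tying diseases, genes, and drugs into an explorable graph can be sketched with networkx; the entities and relations below are invented purely for illustration:

```python
# Toy entity graph in the spirit of the Cortellis description: nodes are
# diseases, genes, and drugs; edges are published associations. The data
# is invented for the example; this is not Cortellis.
import networkx as nx

g = nx.Graph()
g.add_edge("TP53", "lung cancer", relation="mutated_in")
g.add_edge("TP53", "MDM2", relation="regulated_by")
g.add_edge("MDM2", "nutlin-3", relation="inhibited_by")

# "Unexpected associations": indirect paths between entities that are not
# directly connected, e.g. a drug two hops away from a disease.
for path in nx.all_simple_paths(g, "lung cancer", "nutlin-3", cutoff=3):
    print(" -> ".join(path))
# lung cancer -> TP53 -> MDM2 -> nutlin-3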
Chelsea Kerwin, February 19, 2014
February 19, 2014
The announcement from Centrifuge titled “Centrifuge Systems Strengthens Big Data Discovery and Security” promotes the release of Centrifuge 2.10. The new features of the link analysis and visualization software include the ability to block or grant access for specific individuals, a more flexible method of login validation, and the ability to “define hidden data sources, data connections and connection parameters.” Stan Dushko, Chief Product Officer at Centrifuge, explains the upgrades and the reasoning behind them:
“With organizations steadily gathering vast amounts of data and much of it proprietary or sensitive in nature, exposing it within visualization tools without proper security controls in place may have unforeseen consequence…Can we really take the chance of providing open access to data we haven’t previously reviewed? Not knowing what’s in the data, is all the more reason to enforce proper security controls especially when the data itself is used to grant access or discover its existence altogether.”
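The controls Dushko describes, per-user grants and hidden data sources, follow a familiar access-control pattern. A generic sketch, not Centrifuge’s implementation; the rules and names are invented:

```python
# Generic per-user access control over data sources, in the spirit of the
# features described above. Structure and names are invented for the
# example; this is not Centrifuge's API.
ACL = {
    "phone_records": {"allow": {"alice"}, "hidden": True},
    "sales_dashboard": {"allow": {"alice", "bob"}, "hidden": False},
}

def visible_sources(user):
    """Hidden sources are not even listed for users without access."""
    return [name for name, rule in ACL.items()
            if user in rule["allow"] or not rule["hidden"]]

def can_read(user, source):
    return user in ACL.get(source, {}).get("allow", set())

print(visible_sources("bob"))            # ['sales_dashboard']
print(can_read("bob", "phone_records"))  # False
```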
The Big Data business intelligence software provider promises customers peace of mind and total confidence in its technology. The company believes its system goes above and beyond the dashboard management systems of “traditional business intelligence solutions” because its displays can be reorganized in a more interactive way. Speaking of organization, you may notice that finding Centrifuge Systems in Google is an interesting exercise.
Chelsea Kerwin, February 19, 2014
February 9, 2014
It is a new year and, as usual, there are big plans for big data. Instead of looking ahead, however, let’s travel back to the July 4, 2012 Squiz and Funnelback European User Summit. On that day, Ben Pottier gave a talk on “Big Data Mining With Funnelback.” Essentially it is a sales pitch for the company, but it is also a primer on understanding big data and how people use data.
At the beginning of the talk, Pottier mentions a quote from the International Data Corporation:
“The total amount of global data is expected to grow almost 3 zettabytes during 2012.”
That is a lot of ones and zeroes. How much did it grow in 2013, and what is expected for 2014? However much global data has grown, Pottier emphasizes that most of Funnelback’s clients have around 75,000 documents, and as that volume grows, organizations need to address how to manage it. Beyond the basic explanation, Pottier explains that the single biggest issue for big data is finding enterprise content. In the last five minutes, he discusses the importance of data mining and how it can automate work that used to be done manually.
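For a sense of what the quoted IDC figure means, a quick scale check:

```python
# Back-of-the-envelope scale check for the 3 zettabyte figure quoted above.
zettabyte = 10 ** 21                  # bytes
global_data = 3 * zettabyte
terabyte_drive = 10 ** 12             # bytes
print(f"{global_data // terabyte_drive:,} one-terabyte drives")
# 3,000,000,000 one-terabyte drives
```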
Pottier also explains that search is a vital feature for big data. Ha! It is interesting how search is stretched to cover just about any content-related function. Maybe instead of big data it should be called big search.
Whitney Grace, February 09, 2014
February 6, 2014
Tucked in “The Morning Ledger: Companies Seek Help Putting Big Data to Work” was a quote attributed to an executive at SAP, the enterprise software vendor. The quote:
David Ginsberg, chief data scientist at SAP, said communication skills are critically important in the field, and that a key player on his big-data team is a “guy who can translate Ph.D. to English. Those are the hardest people to find.”
I have been working through patent documents from some interesting companies involved in Big Data. The math is somewhat repetitive, but it seems the combination of numerical ingredients makes the “invention.”
One common thread runs through the information I have reviewed in preparation for my lectures in Dubai in early March 2014. Fancy software needs humans to do the following (a toy sketch appears after the list):
- Verify the transforms are within acceptable limits
- Configure thresholds
- Specify outputs, often using old-fashioned methods like SQL and Boolean
- Figure out what the outputs “mean.”
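Here is that toy sketch of the first two items; the acceptable range and threshold are invented analyst inputs, which is exactly the point:

```python
# A minimal sketch of the human-configured checks from the list above.
# The acceptable limits and threshold are analyst-supplied assumptions.
ACCEPTABLE_RANGE = (0.0, 1.0)   # humans decide what "within limits" means
ALERT_THRESHOLD = 0.85          # humans configure the cutoff

def verify_transform(scores):
    """Flag any transformed score outside the analyst-approved range."""
    lo, hi = ACCEPTABLE_RANGE
    return [s for s in scores if not (lo <= s <= hi)]

def over_threshold(scores):
    return [s for s in scores if s >= ALERT_THRESHOLD]

scores = [0.12, 0.91, 1.07, 0.44]
print(verify_transform(scores))  # [1.07] -- outside limits, needs a human
print(over_threshold(scores))    # [0.91, 1.07]
```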
With search and content processing vendors asserting that their systems make it easy for end users to tap the power of Big Data, I have some doubts. With most “analysts” working in Excel, a leap to the types of systems disclosed in open source patent documents will be at the outer edge of end users’ current skills.
Big Data requires much of skilled humans. When there are too few human Big Data experts, Big Data may not deliver much, if any, value to those looking for a silver bullet for their business.
Stephen E Arnold, February 6, 2014
February 5, 2014
The article titled “Lawrence Livermore Explores the Shape of Data, Expanding Query-Free Analytics” on GCN delves into the work of Lawrence Livermore National Laboratory in partnership with Ayasdi Inc. Using homegrown technology, the lab tackles big data to analyze research areas such as climate change, national security, and biological defense. Recently the lab’s work has begun to incorporate topology, the study of shapes.
The article explains the connection:
“‘The fundamental idea is that topological methods act as a geometric approach to pattern or shape recognition within data,’ says a September 2013 article in the journal Science co-authored by Ayasdi CEO Gurjeet Singh. It allows ‘exploration of the data, without first having to formulate a query or hypothesis.’ That is, researchers can find things they did not know they were looking for. For instance, in a database of billions upon billions of phone records, scientists could make sense of who was talking to whom.”
Such complicated shapes are almost impossible for people living in 3D to even imagine, but the practical applications seem endless. Stanford University began the research that resulted in topological data analysis (TDA) in the 1970s and received $10 million from NSF and DARPA in 2003. Five years later, Ayasdi was founded to commercialize the TDA software, which is offered as a cloud-based service.
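Ayasdi’s pipeline is proprietary, but the flavor of query-free, shape-based exploration can be hinted at with a toy example: connect nearby data points into a graph and count the connected components, with no hypothesis supplied up front. This is a crude stand-in for TDA, not Ayasdi’s algorithm, and the data is invented:

```python
# A toy, query-free look at the "shape" of data: connect points closer than
# a distance threshold and report connected components. This only hints at
# topological data analysis; it is not Ayasdi's method.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
# Two invented clusters the analyst has not asked about in advance.
points = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])

g = nx.Graph()
g.add_nodes_from(range(len(points)))
for i in range(len(points)):
    for j in range(i + 1, len(points)):
        if np.linalg.norm(points[i] - points[j]) < 1.0:
            g.add_edge(i, j)

# The component count is a crude stand-in for topological features: the data
# has "two pieces" even though no query asked about clusters.
print(nx.number_connected_components(g))  # typically 2
```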
Chelsea Kerwin, February 05, 2014