October 9, 2013
I have begun to put up early drafts of profiles I have written over the years. These are descriptions and commentary about vendors of search, content processing, and analytics systems.
The first profile to go live is one of my early analyses of Convera, a vendor which has largely dropped out of sight and out of mind. This is the famous Excalibur Technologies, which reinvented itself as Convera. Anyone remember ConQuest Software? That was absorbed into Convera and made maintaining word lists and controlled vocabularies an interesting task.
You can access the Convera profile at www.xenky.com/vendor-profiles. If you want to argue about one of the comments in this draft profile, use the comments section to this blog post.
The profiles will not be updated or maintained. I am providing the information because some students may find the explanations, diagrams, and comments of interest. The information is provided on an “as is” basis. If you want to use this for commercial purposes, please, contact me at seaky2000 at yahoo dot com.
Remember. I am almost 70 years old and some of the final versions of these profiles commanded hefty fees. Enjoy the tales of search systems that sometimes work okay and sometimes don’t work.
Stephen E Arnold, October 9, 2013
October 8, 2013
After reading Arxis Cloud’s article, “Prospective Cloud Converts Await Predictive Analytics Advancements,” I learned that many businesses are on the edge of cloud deployment but still have not hightailed it upwards. Some still cite security, service integrity, and scalability as the main reasons for holding back, but most of those issues have been fixed. The big problem right now is predictive analytics.
Predictive analytics will be the key player in the future, because it gives companies an insightful, competitive edge needed in today’s market.
“The value of predictive analytics isn’t restricted to any one market segment either. Organizations with complex, globally distributed supply chains (i.e. electronics manufacturers) seem to have the most to gain from staying ahead of the curve, according to EBOnline, but those same core capabilities can also help a startup intelligently ride out its first few quarters. Whether it’s estimating the efficacy of a campaign run through CRM software solutions or conducting what-if analyses through ERP platforms, companies with an advanced view of the future are the ones poised to thrive in it.”
But the algorithms for cloud predictive analytics are complex and involve whole new design requirements. The cloud is seen as the best tool for the future, but predictive analytics is even more useful. The analytics component needs to be automated for success, otherwise it will not be a cloudy day for companies.
Whitney Grace, October 08, 2013
October 4, 2013
Bitext.com recently reported on an exciting new partnership in the news release “Actuate and Bitext Announce Collaboration to Deliver Text Analytics Engines and Sentiment Analysis for Big Data Through BIRT.”
According to the article, Bitext, a leader in sentiment analysis, is teaming up with Actuate, a business intelligence software creator, to produce a new and improved text and semantics analytics solution. Additionally, Bitext has announced that it will also be creating a solution with Salesforce.com.
The article states:
“Our collaboration with Bitext – providers of advanced semantic solutions for social media, search, and more – extends the types of analysis that can be performed with Actuate’s commercial BIRT developer and end-user platform or solution, by adding the ability to score sentiment toward products and services,” said Josep Arroyo, VP of Analytic Solutions at Actuate. “Users of Actuate with Bitext can now tap more than just negative or positive sentiment analysis. They can also visualize anticipated risks, opportunities and threats for personalized insights, in a single display on any device.”
This new solution will be a huge asset to marketing professionals, as well as customer support specialists looking to use predictive analytics techniques to gain valuable insights into their customer base. For more information about Bitext, navigate to www.bitext.com.
Jasmine Ashton, October 4, 2013
October 2, 2013
To call the term “big data” a huge buzzword is clearly an understatement. Yet, in all the hype, one crucial point may get overlooked: collecting silos of data means nothing without good data analysis. Network Computing reminds us of this fact in “Text Analytics Key to Unlocking Big Data Value.”
Blogger David Hill was inspired to write this article after attending the Text and Social Media Analytics Summit in Cambridge, Massachusetts. He cites a talk by Harvard’s Gary King who, after citing examples, declares analysis to be more important than data itself. Hill mostly agrees, but notes:
“Although King makes a strong point, the answer is that both data and analytics are important. All the analytics in the world will be of no help if the data does not exist or you cannot access the data for use. Still, King’s thesis really speaks to the need for creativity in the use of analytics to take advantage of data.”
The post goes on to discuss the integration of structured and unstructured data. Hill also mentions some examples of text analytics’ practical uses. See the article for details. The piece concludes:
“We have been subject to an application-driven software intelligence perspective of IT. . . for most of our lives. So a data-driven software intelligence perspective such as big data, where value in IT is squeezed from the data itself, is not only unfamiliar and hard to comprehend but also a little uncomfortable. Yet the world of data-driven software intelligence is the world of text analytics and will transform our view of how to get value from the IT infrastructure.”
As we have mentioned before, big data (and the analysis thereof) is not necessarily important for every business. But for many, especially large corporations, it can be a useful tool indeed. Companies should take as much care with their analysis strategy as they do with their data collection, and they should start by identifying their business’s particular needs.
Cynthia Murrell, October 02, 2013
September 28, 2013
I read “Palantir Just Raised a Massive $196M, Filing Shows.” The factoid I noted was:
The rumored value of Palantir is at over $8 billion, and its chief executive, Alex Karp, told Forbes that it’s likely to close $1 billion in contracts next year.
What I found interesting is that none of the write ups I scanned mentioned “Berto Jongman: U.S. command in Afghanistan gives Army 60 days to fix or replace intel network [meanwhile, Palantir spends millions buying legislative intervention].” Perhaps, this report about issues with Palantir is off base. My hunch is that getting more information is going to be difficult.
Investors may need Joe Rogan supplements to assist them in the Olympic revenue races ahead. Only the heat winners get to compete for after-tax profits.
Stephen E Arnold, September 28, 2013
September 28, 2013
An article on Semanticweb.com titled “German Engineers Developing a Semantic Music Analysis Engine” reports on a project being undertaken in Germany that will allow for greater understanding of patterns and influences in music. The article takes Shakira as an example, explaining that the Franz Liszt Music Conservatory’s compilation will enable them to see the direct influence of traditional music on her pop music. In her case, a great amount of influence comes from Colombian slave music of the colonial era. The article explains,
“For the past two years, Brazilian music ethnologists have been working together with other experts from Europe, Asia, Latin America and Africa on a semantic search engine that automatically recognises basic musical attributes such as tone and rhythm. ‘We are creating something that is independent of the global industry. The current search engines are only capable of finding identical musical pieces from huge databases,’ explained project worker Philip Kueppers. ‘We synthesize basic elements from rhythm in order to deliver general musical information to users.’”
With more of an emphasis on the music than on the artists themselves, there should be little to no confusion as to what category or genre a piece of music falls into. Ultimately the system is able to compare pieces of music in a way that until now was not possible. Of course this musical technology prompts the question, would it work for business English?
Chelsea Kerwin, September 28, 2013
September 21, 2013
In my view, artificial intelligence continues to capture attention. In actual use—particularly in search and content processing—AI evokes from me, “Aiiiiiiii.”
I read “The Unexpected Places Where Artificial Intelligence Will Emerge.” For investors who have pumped cash into various inventions that understand meaning, the article may surprise them. The future of AI is war, Google, Netflix, Amazon, spam, surveillance, robot space explorers, and financial trading.
The only challenge for AI is its lack of consistency. Smart systems work in certain circumstances and fail miserably in others. In my ISS lectures next week, I profile a number of systems which are alleged to be incredibly smart. The reality is that the systems are often rigged to generate expected outputs. The problem of “you don’t know what you don’t know” plagues the developers of these gee-whiz systems.
Will artificial intelligence improve search? Well, AI makes search easier for those who are happy to accept system outputs. For those who need to dig deeper, AI systems often produce results which do little to provide fine-grained detail or make it easy to identify suspect results.
For a good example of AI in action, look at Google search results when you are logged in. Examine Amazon recommendations closely. Better yet, watch the TV shows and films recommended for you by Netflix.
Stephen E Arnold, September 21, 2013
September 18, 2013
It seems like state-of-the-art analysis tools would be a priority in the data-rich field of finance. That’s why it is startling to learn that the technology being used by economic analysts and consultants seems to be stuck in the era of Windows 95. About Data shares a data-loving former economist’s lament in, “Bridging Economics and Data Science.”
Blogger Sam Bhagwat majored in economics because he was intrigued by innovative uses of data in that field; for example, a professor of his had gleaned conclusions about European patent law from a set of 19th century industrial-fair records. As he progressed, though, Bhagwat came to the disappointing realization that his field still relies on technology for which “outdated” is putting it mildly. He writes:
“When I graduated, the questions had changed, but the fundamental tools of analysis remained constant. Half of my classmates, including me, were headed to consulting or investment banking. These are ‘spreadsheet monkey’ positions analyzing client financial and operational data.
“In terms of relationship-building, this is great. Joining high strategy or high finance, you walk through the halls of power and learn to feel comfortable there. But in terms of technical skill-set, not so great. You begin to specialize in spreadsheets, a tool which hasn’t significantly improved since 1995.
“For someone like me, who wants to solve the most interesting problems out there, dealing with gigabytes and terabytes of data, realizing this was bitter medicine. Computational data analysis has changed a lot in the last twenty years, but my career track — economics, consulting, finance — hadn’t.”
So that is how one inquiring mind decided to make the leap from economics to data science. Bhagwat says he taught himself programming so he could pursue work he actually found challenging. I wonder, though—will he use his dual expertise to help bridge the gap between the two disciplines, or has he moved on, never to look back?
Cynthia Murrell, September 18, 2013
September 18, 2013
Processing big data is slow and requires companies to depend heavily on their IT departments to compile reports. What if there were a way to make big data faster with a self-service system? Is someone programming castles in the sky? According to InfoWorld, the answer is no, and the article “Platfora CEO: How We Deliver Sharper Analytics Faster” has some information that may support that answer. After a brief rundown of big data’s history, it gets to the interesting part: developers were not supposed to build data warehouses until they knew what questions to ask of their data. The problem is that questions change, and analysts then cannot get the exact data they need.
Analysts and developers face big challenges at the edge of big data: the amount and variety of data are growing at an exponential rate, nobody can know in advance all the exact questions they need to ask, and companies must stay competitive while answering unanticipated questions. The biggest item the article claims analysts need is self-service.
Platfora then steps up to the plate with its new business intelligence platform:
“The integrated platform we developed to support a new era of self-service analytics helps to remove the obstacles to business intelligence described earlier by enabling an ‘interest-driven pipeline’ of data controlled by the end-user. The end-user — typically a business analyst — can access raw data directly from Hadoop, which is then transformed into interactive, in-memory business intelligence. There is no need for a data warehouse or for separate ETL (extract, transfer, load) software and the headaches described above.”
Self-empowerment and the ability to find new data patterns all on your own. Are we seeing the next big data phase? Platfora is making analytics “sharp.” Cue the ZZ Top music.
Whitney Grace, September 18, 2013
September 11, 2013
Author J.K. Rowling recently learned firsthand how sophisticated analytics software has become. It was a linguistic analysis of the text of The Cuckoo’s Calling that unmasked her as the popular crime novel’s author, “Robert Galbraith.” (These tools were originally devised to combat plagiarism.) Now, I Programmer tells us in “Anonymouth Hides Identity” that open-source software is being crafted to foil such tools and give writers “stylometric anonymity.”
Whether a wordsmith just wants to enjoy a long-lost sense of anonymity, as the wildly successful author of the Harry Potter series attempted to do, or has more high-stakes reasons to hide behind a pen name, a team from Drexel University has the answer. The students from the school’s Privacy, Security, and Automation Lab (PSAL) just captured the Andreas Pfitzmann Best Student Paper Award at this year’s Privacy Enhancing Technologies Symposium for their paper on the subject. The article reveals:
The idea behind Anonymouth is that stylometry can be a threat in situations where individuals want to ensure their privacy while continuing to interact with others over the Internet. A presentation about the program cites two hypothetical scenarios:
*Alice the Anonymous Blogger vs. Bob the Abusive Employer
*Anonymous Forum vs. Oppressive Government. . . .
The JStylo-Anonymouth (JSAN) framework is work in progress at PSAL under the supervision of assistant professor of computer science, Dr. Rachel Greenstadt. It consists of two parts:
*JStylo – authorship attribution framework, used as the underlying feature extraction employing a set of linguistic features
*Anonymouth – authorship evasion (anonymization) framework, which suggests changes that need to be made.
The admittedly very small study discussed in the paper found that 80 percent of participants were able to produce anonymous documents “to a limited extent.” It also found certain constraints: it was more difficult to anonymize existing documents than new creations, for example. Still, this is an interesting development, and I am sure we will see more efforts in this direction.
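For readers who want a feel for what “a set of linguistic features” might look like in practice, here is a minimal sketch of authorship attribution. It is not JStylo’s actual feature set or API. The function-word list, the scaled average word length, and the names style_vector, cosine, and closest_author are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for this
# sketch; real stylometry tools draw on far richer feature sets).
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "is", "was"]


def style_vector(text):
    """Return function-word relative frequencies plus a scaled average word length."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return [0.0] * (len(FUNCTION_WORDS) + 1)
    counts = Counter(words)
    total = len(words)
    freqs = [counts[w] / total for w in FUNCTION_WORDS]
    avg_len = sum(len(w) for w in words) / total
    # Divide by 10 so word length sits on roughly the same scale as the frequencies.
    return freqs + [avg_len / 10.0]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def closest_author(unknown_text, known_samples):
    """Pick the author whose known sample is stylistically nearest to unknown_text.

    known_samples maps an author name to a sample of that author's writing.
    """
    target = style_vector(unknown_text)
    return max(
        known_samples,
        key=lambda author: cosine(target, style_vector(known_samples[author])),
    )
```

An evasion tool works the other side of this street: it would suggest edits that push a document’s feature vector away from the writer’s usual profile. With only ten words and one length measure, this sketch is far too crude to unmask anyone, but it shows why habits as small as how often you type “the” can give you away.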
Cynthia Murrell, September 11, 2013