English Majors Rejoice: WolframAlpha Does Willy
April 11, 2012
I know that quite a few search engine optimization wizards, most MBAs, and probably two-thirds of the attorneys love William Shakespeare, from the wonderful days in those teen years all the way through English 410 at a top-notch school like the University of Phoenix. Willy’s passion is that which passes show to the glass of fashion, text mining. Ah, analytics, how use doth breed a habit in a man.
Well, not the entire corpus of Shakespeare. “Rape of Lucrece” warrants a “WolframAlpha doesn’t understand your query.” So for my own part, it was Greek to me.
Navigate to “To Compute or Not to Compute—WolframAlpha Analyzes Shakespeare’s Plays.” I thought immediately about Vivisimo’s academic vertical search demonstrations. These were great fun, but I am not sure that academic subjects hit the Instagram jackpot. The service may be useful to those trying to figure out which character was Desdemona’s mother’s maid, and I think the service helps educate some graduate students about the virtues of doing close reading by scanning outputs from a set of algorithms little understood. Here’s the passage in the write up I noted:
Entering a play into Wolfram|Alpha, like A Midsummer Night’s Dream, brings up basic information, such as number of acts, scenes, and characters. It also provides more in-depth info like longest word, most frequent words, number of words and sentences, and more. It’s also easy to find more specific information about a particular act or scene with queries like “What is the longest word in King Lear?”, “What is the average sentence length of Macbeth?”, and “How many unique words are there in Twelfth Night?”.
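The statistics named in that passage are simple enough to approximate by hand. What follows is a minimal sketch in Python, not a description of how Wolfram|Alpha works: the function name text_stats, the regular expressions used to split words and sentences, and the sample text are all assumptions made for illustration.

```python
import re
from collections import Counter


def text_stats(text: str, top_n: int = 3) -> dict:
    """Compute simple word and sentence statistics for a block of text.

    Hypothetical helper for illustration only; the tokenizing rules are
    assumptions, not the rules any particular service applies.
    """
    # Words: runs of ASCII letters, allowing internal apostrophes.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    # Sentences: split on runs of ., ! or ? and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    counts = Counter(words)
    return {
        "word_count": len(words),
        "unique_words": len(counts),
        "sentence_count": len(sentences),
        "average_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "longest_word": max(words, key=len) if words else "",
        "most_frequent": counts.most_common(top_n),
    }


sample = "To be, or not to be, that is the question. Whether 'tis nobler in the mind to suffer."
stats = text_stats(sample)
print(stats)
```

A different rule for what counts as a word or a sentence will produce different numbers, which is the point about outputs from algorithms little understood.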
Literature teachers will face essays in which words fly up. What is below is a numerical recipe. And close reading? We have heard the chimes at midnight.
Stephen E Arnold, April 11, 2012
Sponsored by Pandia.com
Woopra Educates Users about Web Analytics
April 11, 2012
We paid attention in statistics class. The quality of the data and the questions one frames are key to making any analytics exercise work. Punching buttons and generating pretty charts and graphs are not too helpful if the underlying data and the questions are off base, incorrect, or training wheels for a hassled MBA.
Quora recently posed the question “Web Analytics: Most companies don’t use the full potential of their Web Analytics tools. What do you think?” and received eight answers. The most thought-provoking response came from Natalie Issa, the Marketing Director of the web analytics company Woopra.
According to Issa, there are four key points one should keep in mind when tackling the web analytics challenge. These are: Google Analytics, marketing vs. developers, large companies vs. small, and educating users.
When discussing the importance of educating users, Issa writes:
“Web analytics services need to invest and devote resources to educating users. The company I work for, Woopra, is tackling this head on by hiring individuals whose job it is to teach and create learning materials for our users to make sure they have all the support they need (if this sounds like your kind of job, feel free to message me :)). We’re also seeing more and more marketing firms and consultants helping small and medium size businesses with their web analytics needs.”
Does this mean that customers are making decisions without an appropriate understanding of what the math behind the system actually delivers? Our view: training wheels on analytics can produce some interesting consequences.
The reality is that analytics will not solve problems reliably unless the users understand the data and frame the correct question. Statistics 101. It is not the training; it is the fact that people want a silver bullet, not mental effort. Just our view, of course. Marketers have a different goal, and it is not education, is it?
Jasmine Ashton, April 11, 2012
Lucid Imagination Lands Government Big Data Deal
April 9, 2012
Give me an “L,” give me a “U,” give me a…well if I ask you to yell out all those letters on the computer screen, people might begin to think you’re a few bytes short of a complete memory drive. The reason for my cheering, though, was inspired by MarketWatch’s article on “Intelligent Software Solutions Partners With Lucid Imagination to Tackle Government Big Data Challenges.”
Intelligent Software Solutions (ISS) and Lucid Imagination have entered into a two-year joint business development deal, where both companies will create big data search and analysis solutions. The government has been having trouble with processing “teraquads” of structured and unstructured data. The article said:
“ ‘Our relationship with Lucid gets us to a higher level of proficiency on this technology by putting experts on our team to work jointly with us to build tailored solutions for our customers,’ said Wes Caldwell, chief architect, Global Enterprise Solutions Division, Intelligent Software Solutions. ‘We are in the business of providing software solutions for our customers that allow them to mine, manage and analyze large amounts of data and to derive critical knowledge from that data. The promise of harnessing big data to deliver actionable intelligence from an often diverse and vast amount of data is quickly becoming a requirement for many of our customers. By leveraging Lucid Imagination’s enterprise open source search platform, we can deliver better value to our customer base. This partnership is a key component to that strategy and further strengthens our position in that area.’”
The main force behind the joint venture is Lucid Imagination’s open source enterprise search technology that allows companies to build their own search products. ISS will add Lucid’s people to their team and together they will deliver training and support services for their government clientele. Lucid Imagination is proving to the world how powerful and useful open source software can be. Is this the start of a new trend or will it pass by quickly?
By the way, “teraquads” sounds big.
Whitney Grace, April 10, 2012
Open Source Analytics Information Service Now Available
April 9, 2012
ArnoldIT has rolled out The Trend Point information service. Published Monday through Friday, the information service focuses on the intersection of open source software and next-generation analytics. The approach will be for the editors and researchers to identify high-value source documents and then encapsulate these documents into easily digested articles and stories. In addition, critical commentary, supplementary links, and important facts from the source document are provided. Unlike a news aggregation service run by automated agents, librarians and researchers use the ArnoldIT Overflight tools to track companies, concepts, and products. The combination of human-intermediated research with Overflight provides an executive or business professional with a quick and easy way to keep track of important developments in open source analytics. There is no charge for the service.
Stories include:
- White House Orders Big Data Solutions
- Public and Private Sectors Combine for Big Savings
- Analytic Revolution Looks Different from 90s Dotcom Boom
According to the publisher, Stephen E Arnold:
We believe that commercial abstracting and indexing services have become untenable for the busy professional. We have combined traditional indexing, literature reviews, and critical commentary which help reduce the time required to pinpoint the meaningful information in this exploding open source analytics field.
Our business model is to provide high value information without a fee. Individuals, law firms, and private equity firms wanting additional information about the people, companies, and products we cover are free to contact us. Like other professional services firms, we rely on motivated individuals with an information need to tap into our full-scale, in-depth research.
What sets TheTrendPoint and other ArnoldIT.com information services apart is that their approach is similar to that used by commercial information services such as Medline and Disclosure, two information services designed to make reference services more useful.
At this time, TheTrendPoint.com is designed to complement the finding services which ArnoldIT.com publishes. ArnoldIT.com is one of the leading sources of information on subjects ranging from search and content processing to next-generation intelligence systems.
New content is added to the service Monday to Friday. For more information about the service, contact the publisher at seaky2000 at yahoo dot com.
Kenneth Toth, April 9, 2012
Protected: Exclusive Interview: David B. Camarata, IKANOW
April 9, 2012
MarkLogic Adds Big Data to Its Line Up
April 7, 2012
MarkLogic Corporation has specialized in XML databases for years, but now it has turned its attention to Big Data. Marketwatch.com reports in “Big Data Takes Center Stage at MarkLogic World 2012” that on May 1-3, 2012, in the Ronald Reagan Building in our nation’s capital, Big Data leaders and MarkLogic experts will be gathered in one place. The conference presents an excellent opportunity to meet and network with the experts, but it is also a chance to learn about industry trends, new ideas, and tips and techniques. We noted:
MarkLogic World 2012 will be keynoted by retired Adm. Mike Mullen, who was chairman of the Joint Chiefs of Staff from 2007 to 2011. Mullen will discuss “The Intersection of National Security and the Global Economy.” In his keynote, Mullen will talk about the challenges he faced while serving as the top military adviser to the president and the secretary of defense through two administrations. Mullen will also discuss the challenges facing America, looking at economic growth, infrastructure, education, and foreign and military policy.
Other big names are three leading research analysts: Matt Aslett, research manager, 451 Research; Mark Beyer, research vice president, Gartner; and Noel Yuhanna, principal analyst, Forrester, who will give a rundown on major trends in Big Data at their panel. An award ceremony will also be held to honor leaders and innovators in the field. Conferences are always the best tools, outside of LinkedIn and other professional social networking web sites, to connect with potential collaborators and get ideas for future projects. However, this conference surprised us: is it a marketing or technological reconfiguration of our favorite XQuery system with proprietary extensions?
The defense flavor is interesting. With the US budget gripping the scissors for some defense spending, is MarkLogic aware of a funding windfall in this sector? With the harsh actions taken toward inappropriate General Services Administration spending, the US government market may face as much turmoil as commercial sectors like book, magazine, and newspaper funding.
Is the notion of big data the next golden goose? The farmyard is getting crowded. The number of azure chip consultants on the program is interesting as well. With MarkLogic a leader in XML, enterprise search, and big data, the company seems to be poised to grow rapidly. We’re looking for hard data about gross sales, margins, and market share in the company’s core markets.
Whitney Grace, April 4, 2012
Attivio Identifies a “Not Right”
April 5, 2012
Attivio Claims “Something Is Not Right” In Unified Information Access
Turn back to your yesteryears and take a cue from Attivio’s blog, which uses the famous children’s literature character Madeline to explain the problem with hard evidence vs. gut instinct: “ ‘Something Is Not Right’—Don’t Ignore Your Gut When Analyzing Information.” The author, Mike Urbonas, uses the famous line of Miss Clavel, Madeline’s caregiver, about trusting her instincts when something is wrong with her charge. Urbonas relates that in hospitals, healthcare professionals are worried about notifying doctors when they sense something is wrong with their cardiac patients because they do not have hard data. If they had gone with their gut, more patients would have survived.
We totally agree with Urbonas when he leads into a unified information access argument:
“What I find very exciting is that unified information access (UIA) is playing a vital role in empowering managers and leaders to connect those dots between data and other silos of information to realize those critical new insights. UIA integrates, joins and presents all related information — structured data and unstructured content to complete the informational picture and significantly expand what organizations ‘know’ to determine with confidence whether ‘Something is not right.’”
This creative metaphor breaks up the monotony of most IT articles, but our favorite is Ikanow’s open source approach to analytics. Our concern is that as systems get improved “training wheels”, the rider may not recognize a risky situation.
Whitney Grace, April 5, 2012
Datameer Has a New Analytics Toy
April 5, 2012
According to Marketwatch.com, Datameer, Inc, a provider of Apache Hadoop-based end-user analytics solutions, announced the release of Datameer 1.4 in “Datameer Releases a Major New Version of Analytics Platform.” Datameer 1.4 improves functionality in data management and in user and data security, and it expands support for data source adaptors, Hadoop, Cloudera, and IBM. We learned:
“The new features in Datameer 1.4 demonstrate that Datameer is committed to delivering what customers want with an emphasis on quality and ease of use,” stated David Cornell, Software Development Manager at SophosLabs. “We are particularly excited to see support for partitioning which will dramatically enhance report generation performance.”
Datameer 1.4 was released to meet the growing demands of the company’s clients. As the only Apache Hadoop analytics solution, Datameer builds solutions that give businesses linear scalability and cost-effectiveness when they analyze, integrate, and visualize structured and unstructured data. Datameer is a company that relies on open source software and is working hard to make a name for itself in the business world.
The hook for this new release may be performance. Speed, more than fancy analytics, is becoming more important.
Whitney Grace, April 5, 2012
MapR Expands Hadoop Connectors
April 4, 2012
This MapR move signals more options for Hadoop users. Talkin’ Cloud reports, “MapR Announces Broad Data Connection Options for Hadoop.” Writer Brian Taylor specifies:
The data connections, according to the press release, enable a ‘wide range of data ingress and egress alternatives for customers,’ including direct file-based access using standard tools; direct database connectivity; Hadoop-specific connectors via Sqoop, Flume and Hive; and access to popular data warehouses and applications using custom connectors.
Sqoop, Flume, and Hive are all open source projects at Apache; the first two are still in incubation.
MapR is getting a hand on this project from tech providers Pentaho and Talend, who will supply direct integration with MapR Distribution. In addition, Tableau Software is helping to promote the new data connection options.
Co-founded by Xoogler M.C. Srivas, MapR has built on the work of developers behind the open source Hadoop, making it “more reliable, more affordable, more manageable and significantly easier to use.” MapR boasts that its innovations help its customers get the most out of the big data phenomenon.
Watch for our forthcoming open source analytics blog. Roll out is April 9, 2012.
Cynthia Murrell, March 29, 2012
Digital Reasoning and Semantic Research Tie Up
April 2, 2012
Digital Reasoning and Semantic Research today announced that they have integrated Digital Reasoning’s Synthesys big data analytics solution with Semantica data fusion and analysis software.
The integrated solution combines unstructured text analytics at scale with visualization. In addition, the tie up provides licensees with analytical workflow tools to deliver a unique solution for automatically understanding people, places, and hidden relationships in big data.
The ability to manipulate information with these tools facilitates the understanding of content without an analyst manually reading it. Information from social networks, supply-chain networks, terrorist networks, financial networks, and government networks, among others, can yield new insights. Navigate to http://www.digitalreasoning.com/SemSynDemo to check out a video of some of the features and functions available.
Tim Estes, founder and CEO of Digital Reasoning, told us:
There is no other solution that provides massively scalable unstructured data analytics with auto-populating of visualizations and workflows tailored for the Intelligence Community. The solution we are delivering together has the ability to address key big data analytics challenges in the enterprise and government markets alike.
For more information about Digital Reasoning, point your browser at www.digitalreasoning.com. The firm provides automated understanding for Big Data. “Automated understanding” analyzes unstructured and structured data to reveal the hidden and potentially valuable relationships between people and organizations in space and time. Digital Reasoning’s flagship product, Synthesys, uncovers insights and accelerates the time to actionable intelligence.
Semantic Research (www.semanticresearch.com) is redefining the way users visualize, interact with, and understand data and information within the Department of Defense, Intelligence and Law Enforcement communities.
This looks like a promising tie up.
Stephen E Arnold, April 2, 2012