Big Data Challenges Explained
September 24, 2013
According to the recent InfoWorld story “Big Data Means Big Challenges in Lifecycle Management,” just when we thought that managing data was an old, familiar challenge, more Big Data challenges are appearing on the horizon.
As the article explains, integrated lifecycle management faces a whole new set of problems when it comes to tackling big data. The issues addressed have to do with volume, velocity, and variety.
The article highlights the issues surrounding big data at scale:
“Big data does not mean that your new platforms support infinite volume, instantaneous velocity, or unbounded varieties. The sheer magnitudes of new data will make it impossible to store most of it anywhere, given the stubborn technological and economic constraints we all face. This reality will deepen big data managers’ focus on tweaking multi-temperature storage management, archiving, and retention policies. As you scale your big data environment, you will need to ensure that ILM requirements can be supported within your current constraints of volume (storage capacity), velocity (bandwidth, processor, and memory speeds), and variety (metadata depth).”
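For readers wondering what “tweaking multi-temperature storage management, archiving, and retention policies” might look like in practice, here is a minimal sketch in Python. The tiers, age thresholds, and the DataSet record are illustrative assumptions of mine, not anything from the article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical tiers and thresholds -- illustrative only, not from the article.
HOT_MAX_AGE = timedelta(days=30)      # recent data, queried constantly
WARM_MAX_AGE = timedelta(days=365)    # older data, queried occasionally
# Anything older falls through to cold/archive storage.

@dataclass
class DataSet:
    name: str
    created: datetime
    last_accessed: datetime

def storage_tier(ds: DataSet, now: datetime) -> str:
    """Assign a dataset to a storage 'temperature' based on age and recency of use."""
    age = now - ds.created
    idle = now - ds.last_accessed
    if age <= HOT_MAX_AGE or idle <= timedelta(days=7):
        return "hot"    # fast, expensive storage
    if age <= WARM_MAX_AGE:
        return "warm"   # cheaper disk
    return "cold"       # archive tier: object storage or tape

if __name__ == "__main__":
    now = datetime(2013, 9, 24)
    sample = DataSet("clickstream_q1",
                     created=datetime(2013, 2, 1),
                     last_accessed=datetime(2013, 3, 15))
    print(sample.name, "->", storage_tier(sample, now))   # clickstream_q1 -> warm
```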
As data continues to grow in size and become more ephemeral, tech companies must keep up by creating software to tackle it.
Jasmine Ashton, September 24, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
The Risks of Data Fragmentation
September 20, 2013
Our big-data age has spawned a new challenge: “data fragmentation.” ITProPortal examines the growing problem in “Shadow IT: The Struggle to Protect Corporate Information in the Face of Growing Data Fragmentation.” The extensive interview with Mimecast’s chief strategy officer, Matthew Ravden, covers the issue from defining the problem to offering advice on how to deal with it. (Note that Mimecast offers services to combat fragmentation.)
The more fragmented (or widely dispersed) a company’s data is, the harder it is to control who can access it. The problem lies largely in the cloud, but also with information distributed across a company’s network. Complicating the issue are workers who skirt their IT department and its fussy rules, storing data however and wherever they see fit. Ravden explains:
“Ultimately, the employee is at the heart of this issue; using multiple applications and devices, often without the IT manager’s knowledge. You can understand why they do it; they want to be able to use the same applications and embrace the same ‘sharing’ culture at work that they do in their personal lives. They also sometimes feel forced to use consumer-grade tools because of the restrictions placed on them by IT, including the size of files that can be sent via the corporate email system. Of course, most employees are not conscious of the risk – they just want to use a fast and easy service which will help them get their job done. As well as identifying the potential third-party services used, IT managers need to educate users on the risks involved, in order to ensure corporate policies are respected.”
The interview discusses the business and security risks of fragmentation, the roles cloud services and email play, and the steps businesses can take to fight the problem (including educating workers on the importance of the issue). It even touches on the responsibility of cloud vendors. The piece does conclude with a plug for Mimecast, but that should not deter one from reading the article. Check it out for more information on this uniquely modern issue.
Cynthia Murrell, September 20, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Speeding Up Big Data With Platfora
September 18, 2013
Processing big data is slow and requires companies to depend heavily on their IT departments to compile reports. What if there were a way to make big data faster with a self-service system? Is someone programming castles in the sky? According to InfoWorld, the answer is no, and the article “Platfora CEO: How We Deliver Sharper Analytics Faster” has some information to back that up. After a brief rundown of big data’s history, it gets to the interesting part: developers were told not to build data warehouses until they knew what questions to ask of their data. The problem is that questions change, and analysts then cannot get the exact data they need.
Analysts and developers face big challenges at the leading edge of big data: the amount and variety of data is growing at an exponential rate, nobody can know in advance all the exact questions they will need to ask, and they must stay competitive while answering unanticipated questions. The biggest thing the article claims analysts need is self-service.
Platfora then steps up to the plate with its new business intelligence platform:
“The integrated platform we developed to support a new era of self-service analytics helps to remove the obstacles to business intelligence described earlier by enabling an “interest-driven pipeline” of data controlled by the end-user. The end-user — typically a business analyst — can access raw data directly from Hadoop, which is then transformed into interactive, in-memory business intelligence. There is no need for a data warehouse or for separate ETL (extract, transfer, load) software and the headaches described above.”
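Platfora’s stack is proprietary, but the general pattern the quote describes, querying raw files in place rather than staging them through a warehouse and separate ETL jobs, can be sketched with open tools. The snippet below is a hypothetical illustration using PySpark; the HDFS path and column names are invented, and this is not Platfora’s actual pipeline.

```python
# A minimal sketch of "self-service" analysis over raw files, assuming PySpark
# is installed; the HDFS path and column names are invented for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("self_service_sketch").getOrCreate()

# Read raw event logs straight from storage -- no warehouse, no separate ETL step.
events = spark.read.json("hdfs:///raw/clickstream/2013/09/*.json")

# Keep the data in memory so follow-up, ad hoc questions come back quickly.
events.cache()

# A first question: events per day by country.
(events.groupBy("country", "event_date")
       .count()
       .orderBy("event_date")
       .show())

# A second, unanticipated question against the same in-memory data.
events.filter(events.page == "/checkout").groupBy("country").count().show()

spark.stop()
```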
Self-empowerment and the ability to find new data patterns all on your own. Are we seeing the next big data phase? Platfora is making analytics “sharp.” Cue the ZZ Top music.
Whitney Grace, September 18, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
Bold Assertions about Big Data Security Threats
September 17, 2013
Big Data comes with its own slew of security problems, but could it actually be used to keep track of them? Using big data to catch security threats is a novel idea and a big claim to stand behind. PR Newswire lets us know that “AnubisNetworks’s Big Data Intelligence Platform Analyses Millions Of Cyber Security Threat Events.” AnubisNetworks is a well-known name in IT security risk management software and cloud solutions, and its newest product to combat cyber threats is StreamForce, a real-time intelligence platform that detects and analyzes millions of cyber security threat events per second.
StreamForce de-duplicates events to help ease the big data storage burden, which is one of the biggest challenges big data security faces.
“Within the new “big-data” paradigm – the exponential growth, availability and use of information, both structured and unstructured – is presenting major challenges for organizations to understand both risks as well as seizing opportunities to optimize revenue. StreamForce goes to the core of dealing with the increasingly complex world of events, across a landscape of distinct and disperse networks, cloud based applications, social media, mobile devices and applications. StreamForce goes a step further than traditional “after-the event” analysis, offering real-time actionable intelligence for risk analysts and decision makers, enabling quick reaction, and even prediction of threats and opportunities.”
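Whatever StreamForce does under the hood, de-duplicating a high-volume event stream is a well-established technique. Here is a generic sketch in Python, not AnubisNetworks’ design: repeat events are dropped by a fingerprint of their content, with a bounded cache so memory stays flat on an unbounded stream.

```python
import hashlib
from collections import OrderedDict

class StreamDeduplicator:
    """Drop events already seen recently, keyed by a hash of their content.

    A bounded, insertion-ordered cache keeps memory flat on an unbounded
    stream; this is a generic sketch, not AnubisNetworks' design.
    """

    def __init__(self, max_entries: int = 1_000_000):
        self.max_entries = max_entries
        self._seen = OrderedDict()

    def is_new(self, event: bytes) -> bool:
        key = hashlib.sha256(event).hexdigest()
        if key in self._seen:
            return False          # duplicate -- do not store or forward it
        self._seen[key] = None
        if len(self._seen) > self.max_entries:
            self._seen.popitem(last=False)   # evict the oldest fingerprint
        return True

if __name__ == "__main__":
    dedup = StreamDeduplicator(max_entries=10_000)
    raw = [b'{"src":"10.0.0.1","alert":"port_scan"}',
           b'{"src":"10.0.0.1","alert":"port_scan"}',   # exact duplicate
           b'{"src":"10.0.0.2","alert":"brute_force"}']
    unique = [e for e in raw if dedup.is_new(e)]
    print(len(unique), "unique events out of", len(raw))   # 2 unique events out of 3
```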
AnubisNetworks pitches StreamForce as the ideal tool for banks, financial institutions, telecommunications companies, and government intelligence and defense agencies. Fast and powerful is what big data users need, but can StreamForce really stand behind its claims? Security threats are hard to detect for even the most tested security software. Can a data feather duster really do the trick and make the difference?
Whitney Grace, September 17, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
Sorting Through Big Data the Right Way with Web Analytics
September 11, 2013
On B2C, an article titled “5 Web Analytics Truths for Smart Digital Marketing” outlines different approaches to finding the data relevant to your business. The first suggestion is to cater to the staff on hand: people at different levels have different focuses, and an open conversation about what they want to learn from the data at hand might be invaluable. At the same time, how you view the data in powerful tools such as Google Analytics can make all the difference to the impression it leaves on you. The article explains,
“Today’s analytics platforms… are very powerful and allow us the ability to go beyond simplistic hit collection, and really dive into rich data and patterns. You can easily report and derive insights with visitor segmentation, have quick visibility into buyer or non-buyer behavior, group content by asset type, measure gated or ungated content consumption, and relatively easily run a cohort analysis. These are just a few views that could be utilized when segmenting your data.”
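Cohort analysis, one of the views mentioned above, is less exotic than it sounds. The snippet below is a small, hypothetical example using pandas with invented sample data: visitors are grouped by the month they first appeared, and distinct visitors are counted for each month since that first visit.

```python
import pandas as pd

# Invented sample of visit events: one row per visitor per visit month.
visits = pd.DataFrame({
    "visitor": ["a", "a", "a", "b", "b", "c", "c", "d"],
    "month":   ["2013-06", "2013-07", "2013-08",
                "2013-06", "2013-08",
                "2013-07", "2013-08",
                "2013-08"],
})
visits["month"] = pd.PeriodIndex(visits["month"], freq="M")

# Cohort = the month of each visitor's first visit.
first_visit = visits.groupby("visitor")["month"].min().rename("cohort")
visits = visits.join(first_visit, on="visitor")

# Months elapsed since the cohort month.
visits["period"] = ((visits["month"].dt.year - visits["cohort"].dt.year) * 12
                    + (visits["month"].dt.month - visits["cohort"].dt.month))

# Retention table: distinct visitors per cohort per elapsed month.
cohort_table = (visits.groupby(["cohort", "period"])["visitor"]
                      .nunique()
                      .unstack(fill_value=0))
print(cohort_table)
```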
Another piece of advice is to optimize while tracking everything you can. Keeping technology up to date is imperative, yes, but only if you are using it to its full potential. Altogether, the article provides a handful of the painful truths about the reality of smart digital marketing.
Chelsea Kerwin, September 11, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Reading Minds with Big Data
September 6, 2013
I thought it might come down to this eventually: big data being used to read people’s minds. It is only a pipe dream at the moment, but ReadWrite takes a look at the process in the article “Dyson: Big Data Driven Thought Control Is Here.” It starts out with the doomsday prophecy that big data can anticipate human behavior, and that governments and businesses could use it to predict our actions and invade our privacy. Possible? Yes, according to science historian and author George Dyson. He is concerned that the NSA will use these measures against people under the guise of tracking terrorists.
On its face, the doomsday prophecy is not scary; humans are, after all, unpredictable creatures. But that unpredictability does not matter:
“It’s not that a machine can understand exactly what we’re thinking at any given point in time. It doesn’t have to. As Dyson explains, ‘A reasonable guess at what you are thinking is good enough.’”
Individual data is being turned into metadata, which big data systems mine to create an analytical profile of who you are. That is the scary part. If actions can be predicted, then thoughts are on their way to being punished. There is still that spark of unpredictability, though; humans can change in an instant. Plus, there is the technological problem: even if thoughts can be predicted, how will these systems connect to a human head to get the “real-time” data?
Whitney Grace, September 06, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
Open Source Says Line Them All Up
September 5, 2013
If you ever wanted to visualize data sets containing up to one million rows of data, the impossible just became a reality without a commercial license. PRWeb has the good news: “Tableau Software Extends Tableau Public To 1 Million Rows Of Data.” Tableau Software is a data specialization company that helps its users share, analyze, and visualize their data. The company’s free public offering, Tableau Public, also lets users share their content on blogs and personal Web sites. Users asked for the row limit to be raised, and Tableau Software has now extended Tableau Public to one million rows.
“Since Tableau Public launched in 2010, we’ve seen an explosion in the number of data sets available on the web for public consumption,” said Tableau Public Product Marketing Manager Ben Jones. “It’s becoming more common for these data sets to exceed one hundred thousand records, so this change allows users of our software to share interactive visualizations of these larger data sets with their readers.”
Some of the big public data sets already out there include airline on-time statistics and delay causes, US Medicare payments to hospitals, and hourly historical weather station records. As the Internet grows, the amount of data needing a home will grow proportionally, and perhaps faster. Wonder when they will allow a trillion rows.
Whitney Grace, September 05, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
Attivio Teams up with Capax Global
September 4, 2013
Attivio has signed up another partner, this time a leader in search. PR Newswire reveals, “Capax Global and Attivio Announce Strategic Reseller Partnership.” The move will help Capax Global’s customers smoothly shift from conventional enterprise search to the more comprehensive unified information access (UIA) approach. The press release quotes Capax Global CEO and managing director John Baiocco:
“We have seen a natural shift towards UIA as our enterprise search customers contend with massive volumes of information, coming from multiple sources, in different formats. Traditional approaches are no longer adequate in dealing with the scale and complexity of enterprise information. Attivio leads the industry in addressing the demands of big data volume, variety, and velocity that our customers face.”
David Schubmehl, research director at analysis firm IDC, also weighs in on the importance of UIA:
“Unified information access is the next logical progression beyond enterprise search as companies face unprecedented volumes of disparate information, of which 85 percent or more is unstructured. Because UIA platforms can integrate large volumes of information across disconnected silos, technologies like AIE have become a key enabler for big data analytics and decision support.”
Founded in 2007 and headquartered in Massachusetts, Attivio also has offices in other U.S. states, the U.K., Germany, and Israel. The company’s award-winning Active Intelligence Engine integrates structured and unstructured data, making it easier to translate information assets into useful business insights.
Capax Global celebrates its 20th birthday this year, making it a veteran in the search field. The privately-held company, based in New York, offers consulting services, custom implementations, and cloud-hosting services. An emphasis on its clients’ unique business objectives is no doubt part of its appeal for its many customers, which include Fortune 500 companies and major organizations around the world.
Cynthia Murrell, September 04, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Big Data Analytics Proves Invaluable In Tax Fraud Investigation
September 3, 2013
The article on ComputerWeekly.com titled “Big Data Journalism Exposes Offshore Tax Dodgers” reports on the findings of Offshore Leaks, the result of the work of an international group of journalists. The fascinating story of offshore tax evasion by over 100,000 owners and founders of companies and trusts begins in Australia with a hard drive containing 260 GB of corporate files and unfiltered (and unorganized) personal emails. The article explains,
“Processing and publishing the leaked data brought to the US from Australia took over 18 months to bring to fruition, and is still continuing. As the largest ever big data project tackled by journalists, the investigation faced technical problems and errors from the start, took blind alleys, and encountered problems in collaboration, as well as pioneering effective new methods…
The first wave of reporting of Offshore Leaks stories began in the UK’s Guardian in November 2012, followed by a global relaunch in April 2013.”
The G8 summit in Lough Erne, Northern Ireland addressed the issues raised by the stories this year; David Cameron even requested that offshore company records be published. It was not until the Australian company Nuix offered the journalists its text retrieval software that the unstructured data began to yield relevant leads. Without such analytics software, the project might have gone nowhere.
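Nuix’s product is commercial, but the core capability it lent the project, making a mass of unstructured documents keyword-searchable, rests on a simple structure: the inverted index. The toy sketch below (documents and query are invented) shows that structure in a few lines of Python; a production system layers on far more.

```python
import re
from collections import defaultdict

def build_index(documents: dict) -> dict:
    """Map each lower-cased token to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(name)
    return index

def search(index: dict, query: str) -> set:
    """Return documents containing every term in the query (AND semantics)."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    results = index.get(terms[0], set()).copy()
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

if __name__ == "__main__":
    # Invented example documents standing in for leaked corporate files.
    docs = {
        "memo_001.txt": "Transfer approved for the BVI holding trust.",
        "memo_002.txt": "Quarterly report, no offshore activity noted.",
        "email_114.txt": "Please route the trust payment through the BVI account.",
    }
    idx = build_index(docs)
    print(search(idx, "BVI trust"))   # {'memo_001.txt', 'email_114.txt'}
```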
Chelsea Kerwin, September 03, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Kofax Purchases Kapow for Analytics and Integration Software
September 3, 2013
Reuters reported on July 31 the news “Kofax Buys Kapow Technologies for $47.5 Million.” Kofax, an Irvine, California-based company, acquired the Palo Alto-based Kapow for its analytics and data integration software. The purchase moves Kofax into the big data analytics sphere. The Kapow integration product is lauded for its user-friendly interface and its simplicity (Kapow uses a subscription model, which makes installation and testing unnecessary). The article “Kofax Adds Integration, Big Data Analytics in Kapow Acquisition” on eWeek explains,
“Thus Kofax… is combining all its newly acquired software IP to provide the basis for a significant big-data software package that will enable large organizations to access data–particularly the hard-to-get data that sits behind apps with no APIs (application programming interfaces)–faster and more cost-effectively…
Kapow Katalyst provides near real-time application integration and process automation, offering traditional API level integration capabilities as well as what it terms a “synthetic API” approach, which provides business users with a point-and-click interface, the company said.”
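The “synthetic API” idea, wrapping an application that exposes no API so other systems can pull structured data from it, is essentially disciplined scraping. The sketch below is a generic, hypothetical illustration in Python using requests and BeautifulSoup, not anything from Kapow; the markup, URL, and field names are invented.

```python
# A generic sketch of a "synthetic API": scrape an application that exposes
# no API and hand back structured records. The URL, markup, and field names
# are hypothetical; requires the third-party packages `requests` and `beautifulsoup4`.
import json

import requests
from bs4 import BeautifulSoup

def parse_orders(html: str) -> list:
    """Turn an order-status page into a list of dicts."""
    soup = BeautifulSoup(html, "html.parser")
    orders = []
    for row in soup.select("table#orders tr")[1:]:          # skip the header row
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) >= 3:
            orders.append({"order_id": cells[0],
                           "customer": cells[1],
                           "status": cells[2]})
    return orders

def fetch_orders(status_url: str) -> list:
    """The 'synthetic API' call: fetch the page, return structured data."""
    return parse_orders(requests.get(status_url, timeout=10).text)

if __name__ == "__main__":
    # Inline sample markup so the sketch runs without a live intranet page.
    sample = """
    <table id="orders">
      <tr><th>ID</th><th>Customer</th><th>Status</th></tr>
      <tr><td>1001</td><td>Acme GmbH</td><td>shipped</td></tr>
      <tr><td>1002</td><td>Globex</td><td>pending</td></tr>
    </table>"""
    print(json.dumps(parse_orders(sample), indent=2))
```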
Kapow Kapplets are the apps that implement the data integration made possible by Kapow Katalyst. Kapow customers include AstraZeneca, Audi, and Zurich Insurance Group. Kofax’s chief executive applauded Kapow for its consistent revenue growth over the last four fiscal years. The most surprising aspect of the deal may be the low price, assuming $47.5 million is correct.
Chelsea Kerwin, September 03, 2013
Sponsored by ArnoldIT.com, developer of Augmentext