Big Data Used to Confirm Bad Science

November 30, 2017

I had thought we had moved beyond harnessing big data and were now focusing on AI and machine learning, but Forbes has some possible new insights in, “Big Data: Insights Or Illusions?”

Big data is a tool that can generate new business insights, or it can simply reinforce a company's worst assumptions.  The article consists of an interview with Christian Madsbjerg of ReD Associates.  It opens with Madsbjerg and his colleagues studying credit card fraud by living like fraudsters for a while.  They learned some tricks and call their approach contextual analytics.  This leads to an important discussion topic:

Dryburgh: This is really interesting, because it seems to me that big data could be a very two-edged sword. On the one hand you can use it in the way that you’ve described to validate hypotheses that you’ve arrived at by very subjective, qualitative means. I guess the other alternative is that you can use it simply to provide confirmation for what you already think.

Madsbjerg: Which is what’s happening, and with the ethos that we’ve got a truth machine that you can’t challenge because it’s big data. So you’ll cement and intensify the toxic assumptions you have in the company if you don’t use it to challenge and explore, rather than to confirm things you already know.

This topic is not new.  We are seeing unverified news stories reach the airwaves and circulate the Internet purely to generate views and profit.  Corporate entities do the same when they care more about churning money into their coffers than about their workers or their actual customers.  It is also like Hollywood executives making superhero movies based on comic book heroes when they have no idea about the medium's integrity.

In other words, do not forget context and the human factor!

Whitney Grace, November 30, 2017

Analytics Tips on a Budget

November 23, 2017

Self-service analytics is another way to say "analytics on a budget."  Many organizations, especially non-profits, do not have the funds to invest in a big data plan and technology, so they take on the task themselves.  With the right person behind the project, self-service analytics is a great way to save a few bucks.  IT Pro Portal shares some ways to improve an analytics project in, "Three Rules For Adopting Self-Service Analytics."  Another benefit of self-service analytics is that, in theory, anyone in the organization can make use of the data and find some creative outlet for it.  The tips come with this warning label:

Any adoption of new technology requires a careful planning, consultation, and setup process to be successful: it must be comprehensive without being too time-consuming, and designed to meet the specific goals of your business end-users. Accordingly, there’s no one-size-fits-all approach: each business will need to consider its specific technological, operational and commercial requirements before they begin.

What are the three tips?

  1. Define your business requirements
  2. Collaborate and integrate
  3. Create and implement a data governance policy

All I can say to this is, duh!  These are standard tips that apply not only to self-service analytics but also to BI plans and any IT plan.  Maybe a few of the tips are geared directly at the analytics field, but we would prefer fewer listicles and more practical handbooks.  Was this a refined form of clickbait?

Whitney Grace, November 23, 2017

Healthcare Analytics Projected to Explode

November 21, 2017

There are many factors driving the growing demand for healthcare analytics: pressure to lower healthcare costs, demand for more personalized treatment, the emergence of advanced analytic technology, and the impact of social media.  PR Newswire takes a look at how the market is expected to explode in the article, "Healthcare Analytics Market To Grow At 25.3% CAGR From 2013 To 2024: Million Insights."  Other important factors that influence healthcare costs are errors in medical products, workflow shortcomings, and, possibly the biggest, the need for cost-effective measures that do not compromise care.
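For a sense of scale, here is a minimal sketch of what a 25.3% compound annual growth rate implies. The article cites only the growth rate, so the 2013 base value below is a hypothetical stand-in:

```python
# A 25.3% CAGR compounds a base value year over year.
def project(base, cagr, years):
    """Compound a base market size at a fixed annual growth rate."""
    return base * (1 + cagr) ** years

base_2013 = 5.0  # hypothetical base, in billions of dollars (not from the article)
for year in (2018, 2024):
    print(year, round(project(base_2013, 0.253, year - 2013), 1))
# At 25.3% a year, the market is roughly 12x its 2013 size by 2024.
```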

Analytics are supposed to be able to help and/or influence all of these issues:

Based on the component, the global healthcare analytics market is segmented into services, software, and hardware. The services segment held a lucrative share in 2016 and is anticipated to grow at a steady rate during the forecast period. The services segment was dominated by the outsourcing of data services. Outsourcing of big data services saves time and is cost effective. Moreover, outsourcing also enables access to skilled staff, thereby eliminating the requirement of training staff.

Cloud-based delivery is anticipated to grow and become the most widespread analytics platform for healthcare.  It allows remote access, avoids complicated infrastructure, and offers real-time data tracking.  Adopting analytics platforms could help curb the problems, from cost to workforce to treatment, that the healthcare industry faces now and will face in the future.  While these systems are being implemented, the harder part is making sure workers are properly trained to use them.

Whitney Grace, November 21, 2017

Need Better Charts and Graphs?

October 27, 2017

If you want to move beyond the vanilla charts and graphs in Excel and PowerPoint, you will want to read “The 15 Best Data Visualization Tools.” Don’t forget to make sure the data you present are accurate, timely, and germane to the point your snappy graphic will make. (Keep in mind that some folks are happy with snazzy visuals. Close enough for horseshoes.)

Stephen E Arnold, October 27, 2017

Uber vs DC Subway: Fancy Math but No Fires

October 23, 2017

I know I am supposed to focus on search and online content processing. But when I read "Metrorail vs Uber: Travel Time and Cost," I decided to highlight this example of local government fancy math. The write up explains when it makes sense to take the DC subway (referred to by those who live in Washington, DC as "the Metro") and when to take Uber.

The analysis uses graphs and logic to prove that the DC subway is the better bet for commuting. I noted this passage:

It is unclear how long Uber prices will remain this low. Several news outlets have reported that Uber subsidizes its rides with money from investors, meaning current fares might not reflect the full cost of a ride.

My take is that when Uber prices go up, the DC subway becomes the better choice for moving around the throbbing heart of government.

But there are the fires, the breakdowns, and the complexity of the transfer bus system to delight the visitor from out of town and the long-suffering Red Line riders trying to get from Shady Grove to Pentagon City.

Nifty illustration of what one can do with spare time and a somewhat superficial analysis. Now, what about those dead elevators, or what I call the hassle factor? For added entertainment, watch a person from another country try to buy a ticket to ride the DC subway. Great fun!

Stephen E Arnold, October 23, 2017

Skepticism for Google Micro-Moment Marketing Push

October 13, 2017

An article at Street Fight, "The Fallacy of Google's 'Micro-Moment' Positioning," calls out Google's "micro-moments" for the gimmick they are. Here's the company's definition of the term it made up: "an intent-rich moment when a person turns to a device to act on a need—to know, go, do, or buy." In other words, any time a potential customer has a need and picks up a smartphone looking for a solution. For Street Fight's David Mihm and Mike Blumenthal, this emphasis seems like a distraction from the failure of Google's analytics to provide a well-rounded view of the online consumer. In fact, such oversimplification could hurt businesses that buy into the hype. In their dialogue format, they write:

David:[The term “micro-moments”] reduces all consumer buying decisions to thoughtless reflexes, which is just not reality, and drives all creative to a conversion-focused experience, which is only appropriate for specific kinds of keywords or mobile scenarios.  It’s totally IN-appropriate for display or top-of-funnel advertising. I also think it’s intended to create a bizarre sense of panic among marketers — “OMG, we have to be present at every possible instant someone might be looking at their phone!” — which doesn’t help them think strategically or make the best use of their marketing or ad spend.

Mike: I agree. If you don’t have a sound, broad strategy no micro management of micro moments will help. To some extent I wonder if Google’s use of the term reflects the limits of their analytics to yet be able to provide a more complete picture to the business?

David: Sure, Google is at least as well-positioned as Amazon or Facebook to provide closed-loop tracking of purchase behavior. But I think it reflects a longstanding cultural worldview within the company that reduces human behavior to an algorithm. “Get Notification. Buy Thing.” or “See Ad. Buy Thing.”  That may work for the “head” of transactional behavior but the long tail is far messier and harder to predict. Much as Larry Page would like us to be, humans are never going to be robots.

Companies that recognize the difference between consumers and robots have a clear edge in this area, no matter how Google tries to frame the issue. The authors compare Google’s blind spot to Amazon’s ease-of-use emphasis, noting the latter seems to better understand where customers are coming from. They also ponder the recent alliance between Google and Walmart to provide “voice-activated shopping” with a bit of skepticism. See the article for more of their reasoning.

Cynthia Murrell, October 13, 2017

Twitch Incorporates ClipMine Discovery Tools

September 18, 2017

Gameplay-streaming site Twitch has adapted the platform of its acquisition ClipMine, originally developed for adding annotations to online videos, into a metadata generator for its users. (Twitch is owned by Amazon.) TechCrunch reports the development in, "Twitch Acquired Video Indexing Platform ClipMine to Power New Discovery Features." Writer Sarah Perez tells us:

The startup’s technology is now being put to use to translate visual information in videos – like objects, text, logos and scenes – into metadata that can help people more easily find the streams they want to watch. Launched back in 2015, ClipMine had originally introduced a platform designed for crowdsourced tagging and annotations. The idea then was to offer a technology that could sit over top videos on the web – like those on YouTube, Vimeo or DailyMotion – that allowed users to add their own annotations. This, in turn, would help other viewers find the part of the video they wanted to watch, while also helping video publishers learn more about which sections were getting clicked on the most.

Based in Palo Alto, ClipMine went on to make indexing tools for the e-sports field and to incorporate computer vision and machine learning into their work. Their platform’s ability to identify content within videos caught Twitch’s eye; Perez explains:

Traditionally, online video content is indexed much like the web – using metadata like titles, tags, descriptions, and captions. But Twitch’s streams are live, and don’t have as much metadata to index. That’s where a technology like ClipMine can help. Streamers don’t have to do anything differently than usual to have their videos indexed, instead, ClipMine will analyze and categorize the content in real-time.
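As a rough illustration of the approach the article describes, real-time indexing of a live stream might sample frames, run a vision model over each, and periodically refresh the stream's searchable tags. This is only a sketch; ClipMine's actual pipeline is not public, and `classify_frame` below is a hypothetical stand-in for its vision model:

```python
from collections import Counter

def index_stream(frames, classify_frame, window=30):
    """Aggregate per-frame labels into rolling stream-level tags."""
    tags = Counter()
    for i, frame in enumerate(frames):
        # classify_frame stands in for a vision model returning
        # (label, confidence) pairs, e.g. [("Overwatch", 0.97), ("lobby", 0.85)]
        for label, score in classify_frame(frame):
            if score > 0.8:  # keep only confident detections
                tags[label] += 1
        if (i + 1) % window == 0:
            # Periodically refresh the discovery metadata for the stream
            yield tags.most_common(5)
```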

ClipMine’s technology has already been incorporated into stream-discovery tools for two games from Blizzard Entertainment, “Overwatch” and “Hearthstone;” see the article for more specifics on how and why. Through its blog, Twitch indicates that more innovations are on the way.

Cynthia Murrell, September 18, 2017

AI to Tackle Image Reading

September 11, 2017

The new frontier in analytics might just be pictures. Known to baffle even the most advanced AI systems, the ability to break pictures into recognizable parts and then use them to derive meaning has long been a quest for many. It appears that Disney Research, in cahoots with UC Davis, believes it is near a breakthrough.

Phys.org quotes Markus Gross, vice president at Disney Research, as saying,

We’ve seen tremendous progress in the ability of computers to detect and categorize objects, to understand scenes and even to write basic captions, but these capabilities have been developed largely by training computer programs with huge numbers of images that have been carefully and laboriously labeled as to their content. As computer vision applications tackle increasingly complex problems, creating these large training data sets has become a serious bottleneck.

A perfect example of an application is MIT's attempt to use AI to provide recipes and nutritional information just by viewing a picture of food. The sky is the limit when it comes to possibilities if Disney and MIT can help AI over its current limitations.

Catherine Lamsfuss, September 11, 2017

Yet Another Digital Divide

September 8, 2017

Recommind sums up what happened at a recent technology convention in the article, "Why Discovery & ECM Haven't, Must Come Together (CIGO Summit 2017 Recap)."  Author Hal Marcus explains that he has long challenged anyone who claims to offer a complete information governance solution.  He recently spoke at CIGO Summit 2017 about how to make information governance a feasible goal for organizations.

The problem with information governance is that there is no one simple solution, and projects tend to be self-contained with only one goal: data collection, data reduction, etc.  In his talk, he explained the five main reasons there is no single comprehensive solution: projects take a long time to define their parameters, data can come from multiple streams, mass-scale indexing is challenging, analytics only help if humans interpret the data, and risk and cost both put a damper on projects.

Yet we are closer to a solution:

Corporations seem to be dedicating more resources for data reduction and remediation projects, triggered largely by high profile data security breaches.

Multinationals are increasingly scrutinizing their data sharing and retention practices, spurred by the impending May 2018 GDPR deadline.

ECA for data culling is becoming more flexible and mature, supported by the growing availability and scalability of computing resources.

Discovery analytics are being offered at lower, all-you-can-eat rates, facilitating a range of corporate use cases like investigations, due diligence, and contract analysis.

Tighter, more seamless and secure integration of ECM and discovery technology is advancing and seeing adoption in corporations, to great effect.

And it always seems farther away.

Whitney Grace, September 8, 2017

Natural Language Queries Added to Google Analytics

August 31, 2017

Data analysts are valuable members of any company and do a lot of good, but in many instances average employees – not versed in analyst-ese – need to find valuable data. Rather than bother the analysts with mundane questions, Google has upgraded its analytics to include natural language queries, much like its search function.

Reporting on this upcoming change, ZDNet explains what it will mean for businesses:

Once the feature is available, users will have the ability to type or speak out a query and immediately receive a breakout of analyzed data that ranges from basic numbers and percentages to more detailed visualizations in charts and graphs. Google says it’s aiming to make data analysis more accessible to workers across a business, while in turn freeing up analysts to focus on more complex research and discovery.
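To make the idea concrete, here is a minimal sketch of how a natural-language question might be mapped onto a canned analytics query. Google has not published its implementation; the keyword tables below and the use of Google Analytics metric names like ga:users are illustrative assumptions only:

```python
# Hypothetical keyword-to-query mapping; not Google's actual approach.
METRICS = {"users": "ga:users", "sessions": "ga:sessions", "pageviews": "ga:pageviews"}
RANGES = {"last week": ("7daysAgo", "today"), "last month": ("30daysAgo", "today")}

def parse_question(question):
    """Match metric and date-range keywords in a plain-English question."""
    q = question.lower()
    metric = next((m for word, m in METRICS.items() if word in q), None)
    start, end = next((r for phrase, r in RANGES.items() if phrase in q),
                      ("7daysAgo", "today"))  # default window
    return {"metric": metric, "start": start, "end": end}

print(parse_question("How many users did we get last week?"))
# -> {'metric': 'ga:users', 'start': '7daysAgo', 'end': 'today'}
```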

While in theory this seems like a great idea, it may still cause issues for users who pose questions without understanding the underlying data, the analytic method, or the appropriate prior knowledge. Unfortunately, data analysts are still the best resource when trying to glean information from analytics reports.

Catherine Lamsfuss, August 31, 2017
