Big Data Used to Confirm Bad Science

November 30, 2017

I had thought we had moved beyond harnessing big data and were now focusing on AI and machine learning, but Forbes has some possible new insights in, “Big Data: Insights Or Illusions?”

Big data is a tool that can generate new business insights or it can reinforce a company’s negative aspects.  The article consists of an interview with Christian Madsbjerg of ReD Associates.  It opens with how Madsbjerg and his colleagues studied credit card fraud by living like a fraudster for a while.  They learned some tricks and called their experience contextual analytics.  This leads to an important discussion topic:

Dryburgh: This is really interesting, because it seems to me that big data could be a very two-edged sword. On the one hand you can use it in the way that you’ve described to validate hypotheses that you’ve arrived at by very subjective, qualitative means. I guess the other alternative is that you can use it simply to provide confirmation for what you already think.

Madsbjerg: Which is what’s happening, and with the ethos that we’ve got a truth machine that you can’t challenge because it’s big data. So you’ll cement and intensify the toxic assumptions you have in the company if you don’t use it to challenge and explore, rather than to confirm things you already know.

This topic is not new.  We are seeing unverified news stories reach airwaves and circulate the Internet for the pure sake of generating views and profit.  Corporate entities do the same when they would rather funnel more money into their coffers than think of their workers or their actual customers.  It is also like Hollywood executives making superhero movies based on comic heroes when they have no idea about the medium’s integrity.

In other words, do not forget context and the human factor!

Whitney Grace, November 30, 2017

AI Changes Power Delivery

November 13, 2017

Artificial intelligence has already changed the way our lives progress daily, and will continue to do so the more advances are made in that field.  What is amazing is how AI concepts can be applied to nearly every industry in the world.  T&D World takes a look at how it has affected the power grid in “Artificial Intelligence Is Changing The Power-Delivery Grid.”  The author introduces the article by explaining that he has noticed people saying Thomas Edison would be familiar with today’s modern innovations, and that he considers this a put-down to the industry.

In truth, Edison would be hard pressed to rationalize today’s real-time mechanics and AI structure.  AI is an important development in all fields from medical to finance, but it plays an important role in the modern power grid:

Some of us call situational awareness technology machine learning, but most of us use the more common term of artificial intelligence (AI). AI is being used on the grid right now and it is more widespread than you might think. Case in point: In June, the U.S. Department of Energy held its 2017 Transmission Reliability Program in Washington, D.C., and there were several AI presentations. The presentation that caught my attention was about using advanced machine learning with synchrophasor technology. Synchrophasor technology generates a great deal of data using phasor measurement units (PMUs) to measure voltage and current of the grid, which can be used to calculate parameters such as frequency and phase angle.

Edison would not feel at home; instead, he would want to get his toolset and tear the power grid apart to figure out how it worked.  Geniuses may be ahead of their time, but they are still products of their own.

Whitney Grace, November 13, 2017

Treating Google Knowledge Panel as Your Own Home Page

November 8, 2017

Now, this is interesting. Mike Ramsey at AttorneyAtWork suggests fellow lawyers leverage certain Google tools  in, “Three Reasons Google Is Your New Home Page.” He points out that Google now delivers accessibility to many entities directly on the results page, reducing the number of clicks potential clients must perform to contact a firm. He writes:

[Google] has rolled out three products that provide potential clients with information about your law firm before they get to your site:

*Messages (on mobile)

*Questions and Answers (on mobile)

*Optional URLs for booking appointments (both mobile and desktop)


This means that Google search results are becoming your new ‘home page.’ All three products — Messages, Questions and Answers and URLs for appointments — are accessible from your Google My Business dashboard. They appear in your local Knowledge Panel in Google. If Google really is becoming your home page, but also giving you a say in providing potential clients with information about your firm, you will definitely want to take advantage of it.

The article explains how to best leverage each tool. For example, Messages lets you incorporate text messages into your Knowledge Panel; Ramsey notes that customers prefer using text messages to resolve customer service issues. Questions and Answers will build an FAQ-like dialogue for the panel, while optional URLs allow clients to schedule appointments right from the results page. Ramsey predicts it should take about an hour to set up these tools for any given law firm, and emphasizes it is well worth that investment to make it as easy as possible for potential clients to get in touch.

Cynthia Murrell, November 8, 2017

Privacy Is Lost in Translation

October 30, 2017

Online translation tools are a wonder!  Instead of having to rely on limited dictionaries and grammars, online translation tools deliver real-time, nearly accurate translations of documents and other text.  It is usually good to double check the translation because sometimes the tools do make mistakes.  Translation tools, however, can make mistakes that lose privacy in translation.  Quartz tells an alarming story in, “If You Value Your Privacy, Be Careful With Online Translation Tools.”

Norwegian state oil company Statoil used a free online translation service to translate sensitive company documents.  One would think that would not be a problem, except that the service stored the data in the cloud.  The sensitive documents included dismissal letters, contracts, workforce reduction plans, and more.  News traveled fast in Norway, resulting in the Oslo Stock Exchange blocking employees’ access to the service and to Google Translate.

It was dubbed a massive privacy breach as private documents from other organizations and individuals were discovered.  The company behind the service views the incident differently:

[The company] sees things a little differently, however, saying it was straight with users about the fact that it was crowdsourcing human translations to improve on machine work. In a Sept. 6 blog post responding to the news reports, the company explained that in the past, they were using human volunteer translators to improve their algorithm, and during that time, had made documents submitted for translation public so that any human volunteers could easily access them. ‘As a precaution, there was a clear note on our homepage stating: “All translations will be sent to our community to improve accuracy.”’

The company also offered to remove any documents upon request, but sensitive documents were still available when the Quartz article was written.  Maria Burud, the company’s vice president of sales, pointed out that it offers paid translation software intended for businesses that need to maintain their privacy.  Burud notes that anything translated using a free web tool is bound to have privacy issues, but that there is a disclaimer on her company’s Web site.  It is up to the user to de-identify the information or watch what they post in a translation box.

In other words, watch what you translate and post online.  It will come back to haunt you.

Whitney Grace, October 30, 2017

Learn About Machine Learning

August 30, 2017

For an in-depth look at the technology behind Google Translate, turn to Stats and Bots’ write-up, “Machine Learning Translation and the Google Translate Algorithm.” Part of a series that aims to educate users about the technology behind machine learning (ML), the illustrated article delves into the details behind Google’s deep learning translation tools. Writer  Daniil Korbut explains the factors that make it problematic to “teach” human language to an AI, then describes Long Short-Term Memory (LSTM) networks, bidirectional RNNs, sequence-to-sequence models, and how Google put those tools together. See the article for those details that are a bit above this writer’s head. There’s just one thing missing—any acknowledgment of the third parties that provide Google with language technology. Oh well.

Another valuable resource on machine learning, found at YCombinator, is Google researcher Jeff Dean’s Lecture for YC AI. The post includes a video that is over an hour long, but it also shares the informative slides from Dean’s presentation. They touch on scientific and medical applications for machine learning, then examine sequence-to-sequence models,  automated machine learning, and “higher performance” ML models. One early slide reproduces a Google blog post in which Dean gives a little history (and several relevant links):

Allowing computers to better understand human language is one key area for our research. In late 2014, three Brain team researchers published a paper on Sequence to Sequence Learning with Neural Networks, and demonstrated that the approach could be used for machine translation. In 2015, we showed that this this approach could also be used for generating captions for images, parsing sentences, and solving computational geometry problems. In 2016, this previous research (plus many enhancements) culminated in Brain team members worked closely with members of the Google Translate team to wholly replace the translation algorithms powering Google Translate with a completely end-to-end learned system (research paper). This new system closed the gap between the old system and human quality translations by up to 85% for some language pairs. A few weeks later, we showed how the system could do “zero-shot translation”, learning to translate between languages for which it had never seen example sentence pairs (research paper). This system is now deployed on the production Google Translate service for a growing number of language pairs.

These surveys of Google’s machine translation tools offer a lot of detailed information for those interested in the topic. Just remember that Google is not (yet?) the only game in town.

Cynthia Murrell, August 30, 2017


Time to Ditch PowerPoint?

August 23, 2017

For decades, Microsoft PowerPoint has been used for making presentations. That is all set to change as a recent study indicates that PowerPoint presentations are ineffective.

An article published by Quartz, titled “The Scientific Reason No One Wants to See Your PowerPoint Presentation,” says:

Because the human brain processes information both visually (using shapes and colors) and spatially (using location and distance), the researchers said, ZUI helps audiences by locating the information in a place, allowing them to mentally retrieve it later.

The problem with the study is that it appears to be too promotional. For instance, the article says tools like Prezi are better for making presentations because they offer a lot of animated options. Why not use Gifographics or stock videos, then?

The effectiveness of a presentation mostly depends on the person presenting it. Many speakers completely do away with any type of tools so that their audience can concentrate on what the speaker says. Moreover, the presentation can be made effective if the slides are designed professionally. Don’t be surprised if, in the near future, all presentations are made using VR headsets for that truly immersive experience.

Vishal Ingole, August 23, 2017

Fake News Is Here to Stay

August 22, 2017

Morphed pictures and videos were once the realm of experts. New tools, however, are making it easier for people with average computer skills to create hyper-realistic content.

Mashable reports in an article titled “This Scary Video Tool Makes Fake News Look Legit”:

Researchers at the University of Washington recently announced a new video-editing tool that they used to superimpose audio — with realistic lip movements — onto a video of former U.S. president Barack Obama, making it appear as though he’s saying whatever they want him to.

The intention of making this tool was to help special effects artists in the entertainment industry. However, as is the case with any other tool, it can be put to other uses: its test run was to create fake news content.  Couple this tool with other available tools like Google DeepMind AI and Lyrebird, and a single person sitting in a dungeon could produce any number of fake videos.

Social media platforms are already fighting the menace of fake news. However, such tools make their tasks tougher. Facebook, for instance, employs an army of analysts to weed out fake news. It seems the problem of fake news or information is only going to get worse.

Vishal Ingole, August 22, 2017

Tidy Text the Best Way to Utilize Analytics

August 10, 2017

Even though text mining is nothing new, natural language processing seems to be the hot new analytics craze. In an effort to understand the value of each, along with the difference, and (most importantly) how to use either efficiently, O’Reilly interviewed text miners Julia Silge and David Robinson to learn about their approach.

When asked what advice they would give those drowning in data, they replied,

…our advice is that adopting tidy data principles is an effective strategy to approach text mining problems. The tidy text format keeps one token (typically a word) in each row, and keeps each variable (such as a document or chapter) in a column. When your data is tidy, you can use a common set of tools for exploring and visualizing them. This frees you from struggling to get your data into the right format for each task and instead lets you focus on the questions you want to ask.

The duo admits text mining and natural language processing overlap in many areas, but both are useful tools for different issues. They relegate text mining to statistical analysis and natural language processing to the relationship between computers and language. The difference may seem minute, but with data mines exploding and companies drowning in data, such advice is crucial.
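The tidy text format described in the quote can be illustrated with a short sketch. It is written in Python with only the standard library rather than with the interviewees' own tools, and the function name `tidy_tokens`, the sample documents, and the simple regular-expression tokenizer are all assumptions made for illustration.

```python
import re
from collections import Counter


def tidy_tokens(documents):
    """Return one (document, token) row per word, lowercased.

    Hypothetical helper: a real tokenizer would handle punctuation,
    numbers, and non-English text far more carefully.
    """
    rows = []
    for name, text in documents.items():
        for token in re.findall(r"[a-z']+", text.lower()):
            rows.append((name, token))
    return rows


# Made-up sample data, one entry per document.
documents = {
    "chapter_1": "Big data is a tool. Big data is not a truth machine.",
    "chapter_2": "Tidy data keeps one token in each row.",
}

rows = tidy_tokens(documents)

# With one token per row, a word count is just a tally over the rows.
counts = Counter(rows)
for (document, token), n in counts.most_common(3):
    print(document, token, n)
```

Because every row has the same shape, the same tally works for any question that can be asked of the rows, such as counting words per document or filtering to a single chapter, with no reshaping in between.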

Catherine Lamsfuss, August 10, 2017

Big Data Visualization the Open Source Way

August 10, 2017

Though Big Data was hailed in a big way, it has yet to gain full steam because of a shortage of talent. Companies working in this domain are taking another swipe by offering visualization tools for free.

The Customize Windows says in an article titled “List of Open Source Big Data Visualization Tools”:

There are some growing number of websites which write about Big Data, cloud computing and spread wrong information to sell some others paid things.

Many industries have tried the freemium route to attract talent and promote the industry. For instance, Linux OS maker Penguin Computing offered its product for free to users. This move sparked interest among users who wanted to try something other than Windows and Mac.

The move created a huge user base of Linux users and also attracted talent to promote research and development.

Big Data players, it seems, are following the same strategy by offering data visualization tools for free, which they will monetize later. All that is needed now is patience.

Vishal Ingole, August 10, 2017

Wield Buzzwords with Precision

July 10, 2017

It is difficult to communicate clearly when folks don’t agree on what certain words mean. Nature attempts to clear up confusion around certain popular terms in, “Big Science Has a Buzzword Problem.” We here at Beyond Search like to call jargon words “cacaphones,” but the more traditional “buzzwords” works, too. Writer Megan Scudellari explains:

‘Moonshot’, ‘road map’, ‘initiative’ and other science-planning buzzwords have meaning, yet even some of the people who choose these terms have trouble defining them precisely. The terms might seem interchangeable, but close examination reveals a subtle hierarchy in their intentions and goals. Moonshots, for example, focus on achievable, but lofty, engineering problems. Road maps and decadal surveys (see ‘Alternate aliases’) lay out milestones and timelines or set priorities for a field. That said, many planning projects masquerade as one title while acting as another.

Strategic plans that bear these lofty names often tout big price tags and encourage collaborative undertakings…. The value of such projects is continually debated. On one hand, many argue that the coalescence of resources, organization and long-term goals that comes with large programmes is crucial to science advancement in an era of increasing data and complexity. … Big thinking and big actions have often led to success. But critics argue that buzzword projects add unnecessary layers of bureaucracy and overhead costs to doing science, reduce creativity and funding stability and often lack the basic science necessary to succeed.

In order to help planners use such terms accurately, Scudellari supplies definitions, backgrounds, and usage guidance for several common buzzwords: “moonshot,” “roadmap,” “initiative,” and “framework.” There’s even a tool to help one decide which term best applies to any given project. See the article to explore these distinctions.

Cynthia Murrell, July 10, 2017
