Learn About Machine Learning

August 30, 2017

For an in-depth look at the technology behind Google Translate, turn to Stats and Bots’ write-up, “Machine Learning Translation and the Google Translate Algorithm.” Part of a series that aims to educate users about the technology behind machine learning (ML), the illustrated article delves into the details behind Google’s deep learning translation tools. Writer Daniil Korbut explains the factors that make it problematic to “teach” human language to an AI, then describes Long Short-Term Memory (LSTM) networks, bidirectional RNNs, sequence-to-sequence models, and how Google put those tools together. See the article for those details, which are a bit above this writer’s head. There’s just one thing missing: any acknowledgment of the third parties that provide Google with language technology. Oh well.

Another valuable resource on machine learning, found at YCombinator, is Google researcher Jeff Dean’s Lecture for YC AI. The post includes a video that is over an hour long, but it also shares the informative slides from Dean’s presentation. They touch on scientific and medical applications for machine learning, then examine sequence-to-sequence models, automated machine learning, and “higher performance” ML models. One early slide reproduces a Google blog post in which Dean gives a little history (and several relevant links):

Allowing computers to better understand human language is one key area for our research. In late 2014, three Brain team researchers published a paper on Sequence to Sequence Learning with Neural Networks, and demonstrated that the approach could be used for machine translation. In 2015, we showed that this approach could also be used for generating captions for images, parsing sentences, and solving computational geometry problems. In 2016, this previous research (plus many enhancements) culminated in Brain team members worked closely with members of the Google Translate team to wholly replace the translation algorithms powering Google Translate with a completely end-to-end learned system (research paper). This new system closed the gap between the old system and human quality translations by up to 85% for some language pairs. A few weeks later, we showed how the system could do “zero-shot translation”, learning to translate between languages for which it had never seen example sentence pairs (research paper). This system is now deployed on the production Google Translate service for a growing number of language pairs.

These surveys of Google’s machine translation tools offer a lot of detailed information for those interested in the topic. Just remember that Google is not (yet?) the only game in town.

Cynthia Murrell, August 30, 2017


Time to Ditch PowerPoint?

August 23, 2017

For decades, Microsoft PowerPoint has been the default tool for making presentations. That may be about to change, as a recent study indicates that PowerPoint presentations are ineffective.

An article published by Quartz, titled “The Scientific Reason No One Wants to See Your PowerPoint Presentation,” says:

Because the human brain processes information both visually (using shapes and colors) and spatially (using location and distance), the researchers said, ZUI helps audiences by locating the information in a place, allowing them to mentally retrieve it later.

The problem with the study is that it appears to be too promotional. For instance, the article says tools like Prezi are better for making presentations because they offer a lot of animation options. Why not use gifographics or stock videos, then?

The effectiveness of a presentation mostly depends on the person presenting it. Many speakers do away with tools altogether so that their audience can concentrate on what they say. Moreover, a presentation can be made more effective if the slides are designed professionally. Don’t be surprised if, in the near future, all presentations are made using VR headsets for that truly immersive experience.

Vishal Ingole, August 23, 2017

Fake News Is Here to Stay

August 22, 2017

Doctored pictures and videos were once the realm of experts. New tools, however, are making it easier for people with average computer skills to create hyper-realistic content.

Mashable reports in an article titled “This Scary Video Tool Makes Fake News Look Legit”:

Researchers at the University of Washington recently announced a new video-editing tool that they used to superimpose audio — with realistic lip movements — onto a video of former U.S. president Barack Obama, making it appear as though he’s saying whatever they want him to.

The tool was intended to help special effects artists in the entertainment industry. However, as is the case with any other tool, it can be misused; even its test run produced what amounts to fake news content. Couple this tool with other available tools like Google DeepMind’s AI and Lyrebird, and a single person sitting in a dungeon could produce any number of fake videos.

Social media platforms are already fighting the menace of fake news. However, such tools make their task tougher. Facebook, for instance, employs an army of analysts to weed out fake news. It seems the problem of fake news and false information is only going to get worse.

Vishal Ingole, August 22, 2017

Tidy Text the Best Way to Utilize Analytics

August 10, 2017

Even though text mining is nothing new, natural language processing seems to be the hot new analytics craze. In an effort to understand the value of each, along with the difference, and (most importantly) how to use either efficiently, O’Reilly interviewed text miners Julia Silge and David Robinson to learn about their approach.

When asked what advice they would give those drowning in data, they replied,

…our advice is that adopting tidy data principles is an effective strategy to approach text mining problems. The tidy text format keeps one token (typically a word) in each row, and keeps each variable (such as a document or chapter) in a column. When your data is tidy, you can use a common set of tools for exploring and visualizing them. This frees you from struggling to get your data into the right format for each task and instead lets you focus on the questions you want to ask.
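To make the one-token-per-row idea concrete, here is a minimal sketch in plain Python. It is only an illustration of the layout described in the quote, not the authors' own tooling; the function name `tidy_tokens`, the simple regular-expression tokenizer, and the sample chapter texts are all assumptions made up for this example.

```python
import re
from collections import Counter


def tidy_tokens(documents):
    """Return one row per token, with the document name kept as a column."""
    rows = []
    for doc_name, text in documents.items():
        # Assumption: a token is a lowercase run of letters or apostrophes.
        words = re.findall(r"[a-z']+", text.lower())
        for position, word in enumerate(words, start=1):
            rows.append({"document": doc_name, "position": position, "word": word})
    return rows


# Hypothetical sample data, invented for illustration.
sample = {
    "chapter_1": "The cat sat. The cat slept.",
    "chapter_2": "A dog barked.",
}

rows = tidy_tokens(sample)

# Because every row has the same shape, one generic tool can count words per document.
word_counts = Counter((row["document"], row["word"]) for row in rows)

for (document, word), count in sorted(word_counts.items()):
    print(document, word, count)
```

Once the rows share a single shape, counting, filtering, and plotting all work from the same table, which is the convenience the interviewees describe.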

The duo admits text mining and natural language processing overlap in many areas, but both are useful tools for different issues. They relegate text mining to statistical analysis and natural language processing to the relationship between computers and language. The difference may seem minute, but with data mines exploding and companies drowning in data, such advice is crucial.

Catherine Lamsfuss, August 10, 2017

Big Data Visualization the Open Source Way

August 10, 2017

Though Big Data was hailed in a big way, it has yet to gain full steam because of a shortage of talent. Companies working in this domain are taking another swing at it by offering visualization tools for free.

The Customize Windows, in an article titled “List of Open Source Big Data Visualization Tools,” says:

There are some growing number of websites which write about Big Data, cloud computing and spread wrong information to sell some others paid things.

Many industries have tried the freemium route to attract talent and promote the industry. For instance, Linux OS maker Penguin Computing offered its product for free to users. This move sparked interest among users who wanted to try something other than Windows and Mac.

The move created a huge user base of Linux users and also attracted talent to promote research and development.

Big Data players, it seems, are following the same strategy by offering data visualization tools for free, which they will monetize later. All that is needed now is patience.

Vishal Ingole, August 10, 2017

Wield Buzzwords with Precision

July 10, 2017

It is difficult to communicate clearly when folks don’t agree on what certain words mean. Nature attempts to clear up confusion around certain popular terms in “Big Science Has a Buzzword Problem.” We here at Beyond Search like to call jargon words “cacaphones,” but the more traditional “buzzwords” works, too. Writer Megan Scudellari explains:

‘Moonshot’, ‘road map’, ‘initiative’ and other science-planning buzzwords have meaning, yet even some of the people who choose these terms have trouble defining them precisely. The terms might seem interchangeable, but close examination reveals a subtle hierarchy in their intentions and goals. Moonshots, for example, focus on achievable, but lofty, engineering problems. Road maps and decadal surveys (see ‘Alternate aliases’) lay out milestones and timelines or set priorities for a field. That said, many planning projects masquerade as one title while acting as another.

Strategic plans that bear these lofty names often tout big price tags and encourage collaborative undertakings…. The value of such projects is continually debated. On one hand, many argue that the coalescence of resources, organization and long-term goals that comes with large programmes is crucial to science advancement in an era of increasing data and complexity. … Big thinking and big actions have often led to success. But critics argue that buzzword projects add unnecessary layers of bureaucracy and overhead costs to doing science, reduce creativity and funding stability and often lack the basic science necessary to succeed.

In order to help planners use such terms accurately, Scudellari supplies definitions, backgrounds, and usage guidance for several common buzzwords: “moonshot,” “roadmap,” “initiative,” and “framework.” There’s even a tool to help one decide which term best applies to any given project. See the article to explore these distinctions.

Cynthia Murrell, July 10, 2017

Wall Street Can Learn from Google

May 30, 2017

Ruth Porat, CFO of Alphabet, told the Economic Club of New York that Wall Street should have an open culture like Google’s, which has helped the company keep profit levels high and investors happy.

CNBC, in its news piece titled “Ruth Porat Suggests Financial Crisis Could’ve Been Avoided If Wall Street Acted More Like Google,” said:

Ruth Porat, the former veteran Morgan Stanley executive who’s now chief financial officer of Alphabet, suggested Monday that the financial crisis could have been prevented — or at least made less severe — if Wall Street had operated with the same transparency as Google’s parent company.

Google has no employee stock options at present. According to Porat, this eliminates the possibility of employees rigging the financial numbers or engaging in financial engineering. For Google, the greatest threat is the pace of innovation.

The company holds a weekly meeting, TGIF, at which employees ask executives tough questions on any aspect of the company. Porat feels this practice has helped Alphabet maintain transparency, and that Wall Street has something to learn from it.

Vishal Ingole, May 30, 2017

Malware Infected USB Sticks on the Loose

May 18, 2017

Oops. We learn from TechRepublic that “IBM Admits it Sent Malware-Infected USB Sticks to Customers.”

The article cites the company’s support advisory announcing the problem, a resource anyone who has received an IBM Storwize V3500, V3700, or V5000 USB drive should check for the affected models and serial numbers. The recommended fix: destroy the drive and, if you have already inserted it, perform a malware purge on your computer.

Writer Conner Forrest describes:

So, what does the infected drive actually do to a system? ‘When the initialization tool is launched from the USB flash drive, the tool copies itself to a temporary folder on the hard drive of the desktop or laptop during normal operation,’ the IBM post said. Then, a malicious file is copied to a temporary folder called %TMP%\initTool on Windows or /tmp/initTool on Linux or Mac. It is important to note that, while the file is copied onto a machine, it isn’t actually executed during the initialization process, the post also said. As reported by ZDNet’s Danny Palmer, the malware was listed by Kaspersky lab as a member of the Reconyc Trojan malware family, which is primarily used in Russia and India.
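For readers who want to look for the folder named in that quote, the following is a minimal sketch in Python. It only reports whether a folder called initTool exists at the two locations the quote mentions; it does not detect or remove malware, and an existing folder shows only that the initialization tool was launched, not that anything malicious ran. The function names are made up for this example, and resolving %TMP% through the TMP environment variable is an assumption. IBM’s advisory remains the authority on what to do next.

```python
import os
from pathlib import Path


def candidate_inittool_dirs():
    """Build the folder locations named in the quoted IBM text."""
    candidates = [Path("/tmp") / "initTool"]  # Linux or Mac location
    tmp_env = os.environ.get("TMP")  # assumption: %TMP% comes from this variable
    if tmp_env:
        candidates.append(Path(tmp_env) / "initTool")  # Windows location
    return candidates


def find_inittool_dirs(candidates=None):
    """Return the candidate folders that actually exist on this machine."""
    if candidates is None:
        candidates = candidate_inittool_dirs()
    return [path for path in candidates if path.is_dir()]


if __name__ == "__main__":
    found = find_inittool_dirs()
    if found:
        for path in found:
            print(f"Found {path}; consult the IBM advisory for cleanup steps.")
    else:
        print("No initTool folder found in the checked locations.")
```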

It might be understandable if this were the first time this had happened, but IBM also unwittingly distributed infected USB drives back in 2010, at the AusCERT conference in Australia. Let us hope there is not a third time; customers rightly expect more vigilance from such a prominent company.

Cynthia Murrell, May 18, 2017

Some Web Hosting Firms Overwhelmed by Scam Domains

January 27, 2017

An article at Softpedia should be a wakeup call to anyone who takes the issue of online security lightly—“One Crook Running Over 120 Tech Support Scam Domains on GoDaddy.” Writer Catalin Cimpanu explains:

A crook running several tech support scam operations has managed to register 135 domains, most of which are used in his criminal activities, without anybody preventing him from doing so, which shows the sad state of Web domain registrations today. His name and email address are tied to 135 domains, as MalwareHunterTeam told Softpedia. Over 120 of these domains are registered and hosted via GoDaddy and have been gradually registered across time.

The full list is available at the end of this article (text version here), but most of the domains look shady just based on their names. Really, how safe do you feel navigating to ‘security-update-needed-sys-filescorrupted-trojan-detected[.]info’? How about ‘personal-identity-theft-system-info-compromised[.]info’?

Those are ridiculously obvious, but it seems that GoDaddy’s abuse department is too swamped to flag and block even these flagrant examples. At least that hosting firm does have an abuse department; many, it seems, can only be reached through national CERT teams. Other hosting companies, though, respond with the proper urgency when abuse is reported; Cimpanu holds up Bluehost and PlanetHoster as examples. That is something to consider for anyone who thinks the choice of hosting firm is unimportant.

We are reminded that educating ourselves is the best protection. The article links to a valuable tech support scam guide provided by veteran Internet security firm Malwarebytes, and suggests studying the wikis or support pages of other security vendors.

Cynthia Murrell, January 27, 2017

Google Needs a Time-Out for Censorship, But Who Will Enforce Regulations?

January 26, 2017

The article at U.S. News and World Report titled “The New Censorship” offers a list of the ways in which Google is censoring its content, and builds a compelling argument for increased regulation of Google. Certain items on the list, such as pro-life music videos being removed from YouTube, might have you rolling your eyes, but the larger point is that Google simply has too much power over what people see, hear, and know. The most obvious problem is Google’s ability to squash a business simply by changing its search algorithm, but the myriad ways that it has censored content are really shocking. The article states,

No one company, which is accountable to its shareholders but not to the general public, should have the power to instantly put another company out of business or block access to any website in the world. How frequently Google acts irresponsibly is beside the point; it has the ability to do so, which means that in a matter of seconds any of Google’s 37,000 employees with the right passwords or skills could laser a business or political candidate into oblivion…

At times the article sounds like a sad conservative annoyed that the most influential company in the world tends toward liberal viewpoints. Hearing white male conservatives complain about discrimination is always a little off-putting, especially when you have politicians like Rand Paul still defending the right of businesses to refuse service based on skin color. But from a liberal standpoint, just because Google often supports left-wing causes like gun control or the pro-choice movement doesn’t mean that it deserves a free ticket to decide what people are exposed to. Additionally, the article points out that the supposed “moral stands” made by Google are often revealed to be moneymaking or anticompetitive schemes. Absolute power corrupts no matter who wields it, and companies must be scrutinized to protect the interests of the people.

Chelsea Kerwin, January 26, 2017
