Wisdom from the First O’Reilly AI Conference

November 28, 2016

Forbes contributor Gil Press nicely collects and summarizes the insights he found at September’s inaugural O’Reilly AI Conference, held in New York City, in his article, “12 Observations About Artificial Intelligence from the O’Reilly AI Conference.” He begins:

At the inaugural O’Reilly AI conference, 66 artificial intelligence practitioners and researchers from 39 organizations presented the current state-of-AI: From chatbots and deep learning to self-driving cars and emotion recognition to automating jobs and obstacles to AI progress to saving lives and new business opportunities. … Here’s a summary of what I heard there, embellished with a few references to recent AI news and commentary.

Here are Press’ 12 observations; check out the article for details on any that spark your interest: “AI is a black box—just like humans”; “AI is difficult”; “The AI driving driverless cars is going to make driving a hobby. Or maybe not”; “AI must consider culture and context”; “AI is not going to take all our jobs”; “AI is not going to kill us”; “AI isn’t magic and deep learning is a useful but limited tool”; “AI is Augmented Intelligence”; “AI changes how we interact with computers—and it needs a dose of empathy”; “AI should graduate from the Turing Test to smarter tests”; “AI according to Winston Churchill”; and “AI continues to be possibly hampered by a futile search for human-level intelligence while locked into a materialist paradigm.”

It is worth contemplating the point Press saved for last—are we even approaching this whole AI thing from the most productive angle? He ponders:

Is it possible that this paradigm—and the driving ambition at its core to play God and develop human-like machines—has led to the infamous ‘AI Winter’? And that continuing to adhere to it and refusing to consider ‘genuinely new ideas,’ out-of-the-dominant-paradigm ideas, will lead to yet another AI Winter? Maybe, just maybe, our minds are not computers and computers do not resemble our brains?  And maybe, just maybe, if we finally abandon the futile pursuit of replicating ‘human-level AI’ in computers, we will find many additional–albeit ‘narrow’–applications of computers to enrich and improve our lives?

I think Press is on to something. Perhaps we should admit that anything approaching Rosie the Robot is still decades away (according to conference presenter Oren Etzioni). At this early date, we may do well to accept and applaud specialized AIs that do one thing very well but are completely ignorant of everything else. After all, our Roombas are unlikely to attempt conquering the world.

Cynthia Murrell, November 28, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Machine Learning Does Not Have All the Answers

November 25, 2016

Despite our broader knowledge, we still believe that if we press a few buttons and hit enter, computers can do all the work for us.  The advent of machine learning and artificial intelligence has not dispelled this belief; instead, big data vendors rely on this image to sell their wares.  Big data, though, has its weaknesses, and before you deploy a solution you should read Network World’s “6 Machine Learning Misunderstandings.”

Drawing on his experience, Juniper Networks security intelligence software engineer Roman Sinayev explains some of the pitfalls to avoid when implementing big data technology.  It is important to take into consideration all the variables, including the unexpected ones; otherwise, one forgotten factor could wreak havoc on your system.  Also, do not forget to actually understand the data you are analyzing and its origin.  Pushing forward on a project without understanding the data’s background is a guaranteed failure.

Other practical advice is to build a test model and add more data when the model does not deliver, but some of the advice was new even to us:

One type of algorithm that has recently been successful in practical applications is ensemble learning – a process by which multiple models combine to solve a computational intelligence problem. One example of ensemble learning is stacking simple classifiers like logistic regressions. These ensemble learning methods can improve predictive performance more than any of these classifiers individually.

Employing more than one algorithm?  It makes sense and is practical advice; why did that not cross our minds?  The rest of the advice is general stuff that can be applied to any project in any field; just change the lingo and the expert providing it.
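
To make the quoted stacking idea concrete, here is a minimal hand-rolled sketch. Everything in it is invented for illustration: the toy data, the single-feature base classifiers, and the bare-bones gradient-descent trainer. A real project would use a library such as scikit-learn, and proper stacking trains the meta-model on held-out base predictions rather than on the training set.

```python
import math
import random

random.seed(0)

# Toy 2-D data: the label is 1 when x0 + x1 > 1, so neither
# feature alone is enough to classify well.
X = [(random.random(), random.random()) for _ in range(400)]
y = [1 if a + b > 1 else 0 for a, b in X]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(feats, labels, lr=0.5, epochs=200):
    """Hand-rolled logistic regression trained by plain SGD."""
    w = [0.0] * len(feats[0])
    b = 0.0
    for _ in range(epochs):
        for f, t in zip(feats, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            g = p - t                      # gradient of log loss
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def proba(model, f):
    w, b = model
    return sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)

# Two "simple classifiers": each base model sees only one feature.
base0 = train_logreg([[a] for a, _ in X], y)
base1 = train_logreg([[b] for _, b in X], y)

# Stacking: a meta-model is trained on the base models' outputs.
meta_feats = [[proba(base0, [a]), proba(base1, [b])] for a, b in X]
meta = train_logreg(meta_feats, y)

def accuracy(preds):
    return sum((p > 0.5) == bool(t) for p, t in zip(preds, y)) / len(y)

acc0 = accuracy([proba(base0, [a]) for a, _ in X])
acc1 = accuracy([proba(base1, [b]) for _, b in X])
acc_stacked = accuracy([proba(meta, f) for f in meta_feats])
print(acc0, acc1, acc_stacked)
```

The meta-model learns how much to trust each weak base model, which is why the stacked accuracy tends to beat either base classifier alone, just as the quoted passage claims.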

Whitney Grace, November 25, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph


The Noble Quest Behind Semantic Search

November 25, 2016

A brief write-up at the ontotext blog, “The Knowledge Discovery Quest,” presents a noble vision of the search field. Philologist and blogger Teodora Petkova observed that semantic search is the key to bringing together data from different sources and exploring connections. She elaborates:

On a more practical note, semantic search is about efficient enterprise content usage. As one of the biggest losses of knowledge happens due to inefficient management and retrieval of information. The ability to search for meaning not for keywords brings us a step closer to efficient information management.

If semantic search had a separate icon from the one traditional search has it would have been a microscope. Why? Because semantic search is looking at content as if through the magnifying lens of a microscope. The technology helps us explore large amounts of systems and the connections between them. Sharpening our ability to join the dots, semantic search enhances the way we look for clues and compare correlations on our knowledge discovery quest.

At the bottom of the post is a slideshow on this “knowledge discovery quest.” Sure, it also serves to illustrate how ontotext could help, but we can’t blame them for drumming up business through their own blog. We actually appreciate the company’s approach to semantic search, and we’d be curious to see how they manage the intricacies of content conversion and normalization. Founded in 2000, ontotext is based in Bulgaria.

Cynthia Murrell, November 25, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Keeping Current with Elastic.co

November 24, 2016

Short honk. If you want to keep up with Elastic and Elasticsearch, the company’s “This Week in Elasticsearch and Apache Lucene” may be of interest. The weekly posting includes information about commits, releases, and training. Unlike the slightly crazed, revenue-challenged open source search vendors, Elastic.co provides factual information about the plumbing for the search and retrieval system. We found the “Ongoing Changes” section useful and interesting. The idea is that one can keep track of certain features, methods, and issues by scanning a list. The short description of an issue, for instance, includes a link to additional information. Highly recommended for those hooked on Elastic.co’s free and open source solution or the for-fee products and services the company offers.

Stephen E Arnold, November 24, 2016

Do Not Forget to Show Your Work

November 24, 2016

Showing your work is a messy but necessary step to prove how one arrived at a solution.  Most of the time it is never reviewed, but with big data, people wonder how computer algorithms arrive at their conclusions.  Engadget explains that computers are being forced to prove their results in “MIT Makes Neural Networks Show Their Work.”

Understanding neural networks is extremely difficult, but MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has developed a way to map these complex systems.  CSAIL figured the task out by splitting networks into two smaller modules: one extracts text segments and scores them according to their length and coherence, while the second predicts each segment’s subject and attempts to classify it.  The mapping modules sound almost as complex as the actual neural networks.  To alleviate the stress and add a giggle to their research, CSAIL had the modules analyze beer reviews:

For their test, the team used online reviews from a beer rating website and had their network attempt to rank beers on a 5-star scale based on the brew’s aroma, palate, and appearance, using the site’s written reviews. After training the system, the CSAIL team found that their neural network rated beers based on aroma and appearance the same way that humans did 95 and 96 percent of the time, respectively. On the more subjective field of “palate,” the network agreed with people 80 percent of the time.

One set of data is as good as another to test CSAIL’s network mapping tool.  CSAIL hopes to fine-tune the machine learning project and use it in breast cancer research to analyze pathology data.
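
The two-module division of labor can be illustrated with a toy sketch. This is not CSAIL’s code: the real system trains two neural modules jointly, while this stand-in substitutes keyword heuristics (the aspect words, sentiment word lists, review text, and scoring formula are all invented) just to show how an extraction module feeds a prediction module, so the extracted text doubles as the explanation.

```python
# Module 1: extract candidate text segments -- here, the sentences
# of a review that mention the aspect of interest.
def extract_rationale(review, aspect_words):
    sentences = [s.strip() for s in review.split(".") if s.strip()]
    return [s for s in sentences
            if any(w in s.lower() for w in aspect_words)]

# Module 2: predict a score from ONLY the extracted segments, so
# the extraction explains what the prediction was based on.
POSITIVE = {"great", "rich", "wonderful", "pleasant"}
NEGATIVE = {"weak", "flat", "stale", "bland"}

def predict_score(segments):
    words = " ".join(segments).lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 3 + 2 * (pos - neg) / max(pos + neg, 1)  # crude 1-to-5 scale

review = ("Great rich aroma. The palate is weak and flat. "
          "Appearance is wonderful.")
print(extract_rationale(review, {"aroma"}),
      predict_score(extract_rationale(review, {"aroma"})))
```

Asking for the “aroma” score returns both a number and the sentence it came from, which is the whole point of making a network show its work.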

Whitney Grace, November 24, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Dawn of Blockchain Technology

November 24, 2016

Blockchain technology currently powers Bitcoin and other cryptocurrencies, but soon it might find takers in mainstream commercial activities.

Blockgeeks, in an in-depth guide titled “What Is Blockchain Technology? A Step-by-Step Guide for Beginners,” says:

The blockchain is an incorruptible digital ledger of economic transactions that can be programmed to record not just financial transactions but virtually everything of value.

Without getting into how the technology works, it would be interesting to know how and where this revolutionary technology can be utilized. Because it is resistant to tampering and not centralized, blockchain has numerous applications in banking, remittances, the shared economy, crowdfunding, and many more areas; the list is just endless.
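
The “incorruptible ledger” in the quoted definition comes down to a simple data structure: each block stores the hash of the block before it, so altering any historical record invalidates everything after it. A minimal sketch (the transaction strings are made up, and real blockchains add timestamps, consensus, and proof-of-work on top of this):

```python
import hashlib
import json

def block_hash(block):
    # Hash the block's canonical JSON form.
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def add_block(chain, data):
    # Each new block commits to the hash of the previous block.
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"data": data, "prev": prev})

def is_valid(chain):
    # The chain is valid only if every link still matches.
    return all(chain[i]["prev"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain = []
add_block(chain, "Alice pays Bob 5")
add_block(chain, "Bob pays Carol 2")
print(is_valid(chain))                     # the untouched chain checks out
chain[0]["data"] = "Alice pays Bob 500"    # tamper with history...
print(is_valid(chain))                     # ...and the chain breaks
```

Rewriting one old transaction changes that block’s hash, so the next block’s stored `prev` no longer matches; that mismatch is what makes the ledger tamper-evident.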

The technology will be especially helpful for people who transact over the Web. As the article points out:

Goldman Sachs believes that blockchain technology holds great potential especially to optimize clearing and settlements, and could represent global savings of up to $6bn per year.

Governments and commercial establishments, however, are apprehensive, as blockchain might end their control over a multitude of things, simply because blockchain never stores data in one location. This is also the reason Bitcoin has yet to gain full acceptance. But can a driving force like blockchain technology, which will empower the actual users, be stopped?

Vishal Ingole, November 24, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Writing That Is Never Read

November 23, 2016

It was inevitable that in college you were forced to write an essay.  Writing an essay usually requires citing various sources from scholarly journals.  As you perused the academic articles, the thought probably crossed your mind: who ever reads this stuff?  Smithsonian Magazine tells us who in the article “Academics Write Papers Arguing Over How Many People Read (And Cite) Their Papers.”  In other words: the academics themselves.

Academic articles are read mostly by their authors, journal editors, and students forced to cite them for assignments.  In perfect scholarly fashion, many academics do not believe that their work has a limited scope.  So what do they do?  They write about it, and have done so for twenty years.

Most academics are not surprised that most written works go unread.  The common belief is that it is better to publish something rather than nothing, and publishing can also be a requirement for keeping one’s position.  As they are prone to do, academics complain about the numbers and their accuracy:

It seems like this should be an easy question to answer: all you have to do is count the number of citations each paper has. But it’s harder than you might think. There are entire papers themselves dedicated to figuring out how to do this efficiently and accurately. The point of the 2007 paper wasn’t to assert that 50 percent of studies are unread. It was actually about citation analysis and the ways that the internet is letting academics see more accurately who is reading and citing their papers. “Since the turn of the century, dozens of databases such as Scopus and Google Scholar have appeared, which allow the citation patterns of academic papers to be studied with unprecedented speed and ease,” the paper’s authors wrote.

Academics always need something to argue about, no matter how minuscule the topic. This particular article concludes on the note that someone should get the numbers straight so academics can move on to another item to argue about.  Going back to the original thought, a student forced to write an essay with citations probably also thought: the reason this stuff does not get read is that it is so boring.

Whitney Grace, November 23, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Exit Shakespeare, for He Had a Coauthor

November 22, 2016

Shakespeare is regarded as the greatest writer in the English language.  Many studies, however, are devoted to the theory that he did not pen all of his plays and poems.  Some attribute them to Francis Bacon, Edward de Vere, Christopher Marlowe, and others.  Whether Shakespeare was a singular author or one of many, two facts remain: he was a dirty old man, and it could be said he plagiarized his ideas from other writers.  Shall he still be regarded as the figurehead of English literature?

Philly.com takes up the Shakespeare authorship question in the article “Penn Engineers Use Big Data To Show Shakespeare Had Coauthor On ‘Henry VI’ Plays.”  Editors of a new edition of Shakespeare’s complete works listed Marlowe as a coauthor on the Henry VI plays due to a recent study at the University of Pennsylvania.  Alejandro Ribeiro realized that his experience researching networks could be applied to the Shakespeare authorship question using big data.

Ribeiro learned that Henry VI was among the works for which scholars thought Shakespeare might have had a co-author, so he and lab members Santiago Segarra and Mark Eisen tackled the question with the tools of big data.  Working with Shakespeare expert Gabriel Egan of De Montfort University in Leicester, England, they analyzed the proximity of certain target words in the playwright’s works, developing a statistical fingerprint that could be compared with those of other authors from his era.

Two other research groups reached the same conclusion with different analytical techniques.  The results from all three studies were enough to convince Gary Taylor, the lead general editor of the New Oxford Shakespeare, who decided to list Marlowe as a coauthor of Henry VI.  More research has been conducted to identify other potential Shakespeare coauthors, and six more writers will be credited in the New Oxford editions.

Ribeiro and his team created “word-adjacency networks” that captured patterns in the writing styles of Shakespeare and six other dramatists.  They discovered that many scenes in Henry VI were not written in Shakespeare’s style, enough to indicate a coauthor.
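
A toy version of a word-adjacency network can convey the idea. The Penn team built weighted networks over function words and compared authors with relative entropy; the sketch below is a loose stand-in that uses a short invented function-word list, a five-word window, and plain L1 distance between the normalized edge weights.

```python
from collections import defaultdict

# A tiny, invented function-word list; the actual study used a much
# larger set of common function words.
FUNCTION_WORDS = {"the", "and", "to", "of", "a", "in", "that", "is"}

def adjacency_profile(text, window=5):
    """Directed edge weights: how often function word u is followed
    (within `window` tokens) by function word v, normalized so the
    weights out of each u sum to 1 -- a stylistic fingerprint."""
    tokens = [t.strip(".,;:!?'\"").lower() for t in text.split()]
    counts = defaultdict(float)
    totals = defaultdict(float)
    for i, u in enumerate(tokens):
        if u not in FUNCTION_WORDS:
            continue
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            v = tokens[j]
            if v in FUNCTION_WORDS:
                counts[(u, v)] += 1
                totals[u] += 1
    return {edge: c / totals[edge[0]] for edge, c in counts.items()}

def distance(p, q):
    """L1 distance between two profiles (the study used relative
    entropy; this simpler measure stands in for it)."""
    edges = set(p) | set(q)
    return sum(abs(p.get(e, 0.0) - q.get(e, 0.0)) for e in edges)
```

Given profiles built from texts of known authorship, a disputed scene would be attributed to whichever author’s fingerprint it sits closest to.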

Some Shakespeare purists remain opposed to the theory that Shakespeare did not pen all of his plays, but big data analytics supports many of the theories that academics have advanced for generations.  The dirty old man was not alone as he wrote his ditties.

Whitney Grace, November 22, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph


All the Things Watson Could Do

November 21, 2016

One of our favorite artificial intelligence topics has made the news again: Watson.  Technology Review focuses on Watson’s job descriptions and its emergence in new fields in “IBM’s Watson Is Everywhere-But What Is It?”  We all know that Watson won Jeopardy and has been deployed as the ultimate business intelligence solution, but what exactly does Watson do for a company?

The truth is that very little of the technology behind Watson’s Jeopardy appearance is in use today. In reality, Watson is an umbrella name IBM applies to an entire group of its machine learning and artificial intelligence technologies.  The Watson brand is employed in a variety of ways, from medical disease interpretation to creating new recipes via experimentation.  The technology can be used in many industries and applied to a variety of scenarios; it all depends on what the business needs resolved.  There is another problem:

Beyond the marketing hype, Watson is an interesting and potentially important AI effort. That’s because, for all the excitement over the ways in which companies like Google and Facebook are harnessing AI, no one has yet worked out how AI is going to fit into many workplaces. IBM is trying to make it easier for companies to apply these techniques, and to tap into the expertise required to do so.

IBM is experiencing problems of its own, but beyond those, another consideration is Watson’s expense.  Businesses are usually eager to incorporate new technology if the benefit is huge.  However, they are reluctant to make the initial payout, especially if the technology is still experimental and not yet standard.  Nobody wants to be a guinea pig, but someone needs to set the pace for everyone else.  So who wants to deploy Watson?

Whitney Grace, November 21, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Hacking the Internet of Things

November 17, 2016

Readers may recall October’s DDoS attack against internet-performance-management firm Dyn, which disrupted web traffic at popular sites like Twitter, Netflix, Reddit, and Etsy. As it turns out, the growing Internet of Things (IoT) facilitated that attack; specifically, thousands of cameras and DVRs were hacked and used to bombard Dyn with page requests. CNet examines the issue of hacking through the IoT in “Search Engine Shodan Knows Where Your Toaster Lives.”

Reporter Laura Hautala informs us that it is quite easy for those who know what they’re doing to access any and all internet-connected devices. Skilled hackers can do so using search engines like Google or Bing, she tells us, but tools created for white-hat researchers, like Shodan, make the task even easier. Hautala writes:

While it’s possible hackers used Shodan, Google or Bing to locate the cameras and DVRs they compromised for the attack, they also could have done it with tools available in shady hacker circles. But without these legit, legal search tools, white hat researchers would have a harder time finding vulnerable systems connected to the internet. That could keep cybersecurity workers in a company’s IT department from checking which of its devices are leaking sensitive data onto the internet, for example, or have a known vulnerability that could let hackers in.

Even though sites like Shodan might leave you feeling exposed, security experts say the good guys need to be able to see as much as the bad guys can in order to be effective.

Indeed. Like every tool ever invented, the impacts of Shodan depend on the intentions of the people using it.

Cynthia Murrell, November 17, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
