Featured

Machine Learning Frameworks: Why Not Just Use Amazon?

A colleague sent me a link to “The 10 Most Popular Machine Learning Frameworks Used by Data Scientists.” I found the write up interesting despite the author’s failure to define the word “popular” and the bound phrase “data scientists.” But few folks in an era of “real” journalism fool around with my quaint notions.

According to the write up, the data come from an outfit called Figure Eight. I don’t know the company, but I assume their professionals adhere to the basics of Statistics 101. You know, the boring stuff like sample size, objectivity of the sample, sample selection, data validity, etc. Like much information in our time of “real” news and “real” journalists, these annoying aspects of churning out data in which an old geezer like me can have some confidence tend to get glossed over. You know, like the 70 percent accuracy of some US facial recognition systems. Close enough for horseshoes, I suppose.

Miss Sort of Accurate

Here’s the list. My comments about each “learning framework” appear in italics after each “learning framework’s” name, and a short sketch after the list shows how several of these pieces stack up in practice:

  1. Pandas — an open source, BSD-licensed library
  2. Numpy — a package for scientific computing with Python
  3. Scikit-learn — another BSD licensed collection of tools for data mining and data analysis
  4. Matplotlib — a Python 2D plotting library for graphics
  5. TensorFlow — an open source machine learning framework
  6. Keras — a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano
  7. Seaborn — a Python data visualization library based on matplotlib
  8. Pytorch & Torch — open source machine learning libraries, with PyTorch built on the Torch tensor library
  9. AWS Deep Learning AMI — infrastructure and tools to accelerate deep learning in the cloud. Not to be annoying, but defining AMI as Amazon Machine Image might be useful to some
  10. Google Cloud ML Engine — neural-net-based ML service with a typically Googley line up of Googley services.
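
To make the Python-on-Python point concrete, here is a minimal sketch, not production code, showing how several of the listed pieces typically stack: numpy supplies the arrays, pandas wraps them in a table, scikit-learn fits a model, and matplotlib draws the result. The toy data, column names, and model choice are my own illustration, not anything from the write up or the Figure Eight survey.

    # numpy: raw numeric arrays; pandas: tabular wrapper around them
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LinearRegression  # scikit-learn: the model
    import matplotlib.pyplot as plt                    # matplotlib: the picture

    rng = np.random.default_rng(42)
    df = pd.DataFrame({"x": rng.uniform(0, 10, 100)})
    df["y"] = 3.0 * df["x"] + rng.normal(0, 1, 100)    # toy linear data with noise

    model = LinearRegression().fit(df[["x"]], df["y"])  # fit y ~ 3x

    plt.scatter(df["x"], df["y"], s=8)
    plt.plot(df["x"], model.predict(df[["x"]]), color="red")
    plt.show()

Each layer in that little stack is itself one of the “frameworks” on the list, which is the self-referential quality noted below.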

Stepping back, I noticed a handful of what I am sure are irrelevant points which are of little interest to “real” journalists creating “real” news.

First, notice that the list is self-referential in its Python love. Frameworks depend on other Python-loving frameworks. There’s nothing inherently bad about this self-referential approach to whipping up a list, and it makes the list a heck of a lot easier to create in the first place.

Second, the information about Amazon is slightly misleading. In my lecture in Washington, DC on September 7, I mentioned that Amazon’s approach to machine learning supports Apache MXNet and Gluon, TensorFlow, Microsoft Cognitive Toolkit, Caffe, Caffe2, Theano, Torch, PyTorch, Chainer, and Keras. I found this approach interesting, but it seems to be of little interest to those creating a survey or developing an informed list about machine learning frameworks. Amazon is, for example, executing a quite clever play. In bridge, I think the phrase “trump card” suggests what the Bezos momentum machine has cooked up. Notice the past tense: this Amazon stuff has been chugging along in at least one US government agency for about four to four and one half years.

Third, Google brings up the rear, dead last. What about IBM? What about Microsoft and its CNTK? Ah, another acronym, but I, as a non “real” journalist, will reveal that this acronym means Microsoft Cognitive Toolkit. More information is available in Microsoft’s wonderful prose at this link. By the way, the Amazon machine learning spinning momentum thing supports CNTK. Imagine that? Right, I didn’t think so.

Net net: The machine learning framework list may benefit from a bit of refinement. On the other hand, just use Amazon and move down the road to a new type of smart software lock-in. Want to know more? Write benkent2020 @ yahoo dot com and inquire about our for-fee Amazon briefing about machine learning, real-time data marketplaces, and a couple of other mostly off the radar activities. Have you seen Amazon’s facial recognition camera? It’s part of the Amazon machine learning initiative, and it has some interesting capabilities.

Stephen E Arnold, September 16, 2018

Interviews

Bitext: Exclusive Interview with Antonio Valderrabanos

On a recent trip to Madrid, Spain, I was able to arrange an interview with Dr. Antonio Valderrabanos, the founder and CEO of Bitext. The company has its primary research and development group in Las Rosas, the high-technology complex a short distance from central Madrid. The company has an office in San Francisco and a number of computational linguists and computer scientists in other locations. Dr. Valderrabanos worked at IBM in an adjacent field before moving to Novell and then making the jump to his own start up. The hard work required to invent a fundamentally new way to make sense of human utterance is now beginning to pay off.

Antonio Valderrabanos of Bitext

Dr. Antonio Valderrabanos, founder and CEO of Bitext. Bitext’s business is growing rapidly. The company’s breakthroughs in deep linguistic analysis solve many difficult problems in text analysis.

Founded in 2008, the firm specializes in deep linguistic analysis. The systems and methods invented and refined by Bitext improve the accuracy of a wide range of content processing and text analytics systems. What’s remarkable about the Bitext breakthroughs is that the company supports more than 40 different languages, and its platform can support additional languages with sharp reductions in the time, cost, and effort required by old-school systems. With the proliferation of intelligent software, Bitext, in my opinion, puts the digital brains in overdrive. Bitext’s platform improves the accuracy of many smart software applications, ranging from customer support to business intelligence.

In our wide-ranging discussion, Dr. Valderrabanos made a number of insightful comments. Let me highlight three and urge you to read the full text of the interview at this link. (Note: this interview is part of the Search Wizards Speak series.)

Linguistics as an Operating System

One of Dr. Valderrabanos’ most startling observations addresses the future of operating systems for increasingly intelligent software and applications. He said:

Linguistic applications will form a new type of operating system. If we are correct in our thought that language understanding creates a new type of platform, it follows that innovators will build more new things on this foundation. That means that there is no endpoint, just more opportunities to realize new products and services.

Better Understanding Has Arrived

Some of the smart software I have tested is unable to understand what seems to be very basic instructions. The problem, in my opinion, is context. Most smart software struggles to figure out the knowledge cloud which embraces certain data. Dr. Valderrabanos observed:

Search is one thing. Understanding what human utterances mean is another. Bitext’s proprietary technology delivers understanding. Bitext has created an easy to scale and multilingual Deep Linguistic Analysis or DLA platform. Our technology reduces costs and increases user satisfaction in voice applications or customer service applications. I see it as a major breakthrough in the state of the art.

If he is right, the Bitext DLA platform may be one of the next big things in technology. The reason? As smart software becomes more widely adopted, the need to make sense of data and text in different languages becomes increasingly important. Bitext may be the digital differential that makes the smart applications run the way users expect them to.

Snap In Bitext DLA

Advanced technology like Bitext’s often comes with a hidden cost. The advanced system works well in a demonstration or a controlled environment. When that system has to be integrated into “as is” systems from other vendors or from a custom development project, difficulties can pile up. Dr. Valderrabanos asserted:

Bitext DLA provides parsing data for text enrichment for a wide range of languages, for informal and formal text and for different verticals to improve the accuracy of deep learning engines and reduce training times and data needs. Bitext works in this way with many other organizations’ systems.

When I asked him about integration, he said:

No problems. We snap in.
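
The interview does not document Bitext’s actual programming interfaces, so the following Python sketch is strictly hypothetical: it illustrates the general “snap in” pattern Dr. Valderrabanos describes, in which raw text is enriched with linguistic annotations (lemmas, parts of speech) before a downstream learning engine ever sees it. The Token class, the enrich function, and the toy lexicon are my own stand-ins, not Bitext code.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Token:
        surface: str  # the word as written
        lemma: str    # dictionary form
        pos: str      # part-of-speech tag

    def enrich(text: str, language: str = "en") -> List[Token]:
        # Hypothetical stand-in for a deep-linguistic-analysis service call.
        # A real integration would send the text out for analysis; here a
        # tiny lookup table fakes the lemmas and tags.
        fake_lexicon = {"cars": ("car", "NOUN"), "broke": ("break", "VERB")}
        tokens = []
        for word in text.lower().split():
            lemma, pos = fake_lexicon.get(word, (word, "X"))
            tokens.append(Token(word, lemma, pos))
        return tokens

    # Downstream engines consume enriched features instead of raw strings.
    features = [(t.lemma, t.pos) for t in enrich("Both cars broke down")]
    print(features)  # [('both', 'X'), ('car', 'NOUN'), ('break', 'VERB'), ('down', 'X')]

The point of the pattern is that the downstream engine trains on normalized features rather than raw surface strings, which is where the reduced training times and data needs mentioned above would come from.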

I am interested in Bitext’s technical methods, and the company’s traction suggests customers are too: in the last year, Dr. Valderrabanos has signed deals with companies like Audi, Renault, a large mobile handset manufacturer, and an online information retrieval company.

When I thanked him for his time, he was quite polite. But he did say, “I have to get back to my desk. We have received several requests for proposals.”

Las Rosas looked quite a bit like Silicon Valley when I left the Bitext headquarters. Despite the thousands of miles separating Madrid from the US, interest in Bitext’s deep linguistic analysis is surging. Silicon Valley has its charms, and now it has a Bitext US office for what may be the fastest-growing computational linguistics and text analysis system in the world. Worth watching this company, I think.

For more about Bitext, navigate to the firm’s Web site at www.bitext.com.

Stephen E Arnold, April 11, 2017

Latest News

HSSCM Methods: Hey, Enough of This Already

I read an allegedly “real journalism” story called “Google Suppresses Memo Revealing Plans to Closely Track Search Users in China.” I won’t call attention... Read more »

September 22, 2018

Modern Journalism: Fact or Fiction or Something Quite New

I read the information on this New York Times’ Web page. The intent is to explain how the NYT can learn a “secret” from a helpful reader. I noted this statement: Each... Read more »

September 21, 2018

Amazon: Device Proliferation and One Interesting Use Case

The technology “real news” channels are stuffed with Amazon gizmo news. Interesting stuff if one considers that these devices may snap into the eCommerce company’s... Read more »

September 21, 2018

Deep Learning Helps Bing Spotlight Aggregate Breaking News

News aggregators sift through the vast number of news stories out there to focus on the content users want to see (and lead to filter bubbles, but that is another... Read more »

September 20, 2018

Google: Stomping Out Bad Music Types

Google has a lot of content to lord over. And with that responsibility comes the need to police that content when it is misused. Perhaps nowhere does this happen... Read more »

September 20, 2018

IBM Lock In Approach Modified and Given New Life

I read “Alphabet Backs GitLab’s Quest to Surpass Microsoft’s GitHub.” The write up explains that Microsoft bought GitHub. Google invests in GitLab.... Read more »

September 20, 2018

Amazon: Will Its Brick and Mortar Ambitions Succeed?

I read “The CEO of Macy’s Says It’s Harder for an E-commerce Giant to Conquer Offline Retail Than the Other Way Around.” The main idea is that Amazon will... Read more »

September 19, 2018

Bing: No More Public URL Submissions

Ever wondered why some Web site content is not indexed? Heck, ever talk to a person who cannot find their Web site in a “free” Web index? I know that many people... Read more »

September 19, 2018

Factualities for September 18, 2018

Believe ‘em or not: Zero. The change in the average age of the IBM workforce after reductions in workforce. Source: Poughkeepsie Journal. $115. The cost of a holiday... Read more »

September 19, 2018

Technical Debt: A Bit of a Misunderstanding of the Iron Maiden Effect

Let’s go back in time. It is 1979 and Lockheed Martin has nosed into the commercial database business. The system was designed around IBM mainframe technology... Read more »

September 18, 2018

