Sure, Computers Are Psychic

June 12, 2019

Psychics, mentalism, divination, and other ways to communicate with the dead or see the future are not real. These so-called gifts are actually ancient arts rooted in human behavior, psychology, and nature. With practice and skill, anyone can learn how to manipulate and predict someone’s future movements. That is basically all algorithms are doing. According to Entrepreneur, humans leave bread crumb trails online that algorithms watch and then use to predict an individual’s behavior: “How Algorithms Can Predict Our Intentions Faster Than We Can.”

While artificial intelligence (AI) and natural language processing (NLP) are still developing technologies, they are advancing quickly. Simply by tracking an individual’s Web activities, AI and NLP can learn behavior patterns and “predict” intentions, thoughts, and even our next move.
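A rough sketch of what “learning behavior patterns” from a click trail can look like. This is only an illustration, not the method the Entrepreneur article describes: it assumes a simple first-order Markov model, and the function names and click trails are invented.

```python
from collections import Counter, defaultdict

def build_transitions(sessions):
    """Count how often each page is followed by each other page."""
    transitions = defaultdict(Counter)
    for pages in sessions:
        for current, following in zip(pages, pages[1:]):
            transitions[current][following] += 1
    return transitions

def predict_next(transitions, page):
    """Return the most frequently observed next page, or None if the page was never left."""
    followers = transitions.get(page)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical click trails, invented for illustration.
sessions = [
    ["home", "shoes", "cart", "checkout"],
    ["home", "shoes", "cart"],
    ["home", "blog", "shoes", "reviews"],
]
transitions = build_transitions(sessions)
print(predict_next(transitions, "shoes"))
```

The “prediction” is nothing more than the most common thing people did next. No psychic powers required.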

Social media is a big predictor of future events too. Take the 2016 election of Hillary Clinton vs. Donald Trump, or Brett Kavanaugh’s hearings and his confirmation to the Supreme Court. When Paul Nemirovsky’s dMetrics analyzed unstructured social media data, they found that the data was skewed in favor of Kavanaugh’s confirmation to the court. Later this came to pass. On the positive side of things, this could mean better investment outcomes, improved marketing messaging, higher customer satisfaction, and deeper insights into anything we choose.

Algorithms are literally dumb pieces of code. They only do what they are programmed to do. In order for them to understand user data, algorithms need NLP:

“Natural Language Processing, or NLP, is a neuro-network that essentially teaches itself the way we say things. By being exposed to different conversational experiences, the machine learns. Simply put, once you tell the machine what each sentence means, it records each meaning in order to process it in the future. By processing this information, it learns the skills to better understand our intentions than we do.”

NLP is not magic and needs to be programmed like any piece of software. Predictive analytics are, and will remain, a work in progress for some time because of costs, applications, and ethical violations. Will predictive analytics powered by AI and NLP be used for evil? Er, yeah. They will also be used for good, like cars, guns, computers, and putting words in the mouths of people who never made a particular statement.

Whitney Grace, June 12, 2019

Google: Can Semantic Relaxing Display More Ads?

June 10, 2019

For some reason, vendors of search systems shudder if a user’s query returns a null set. The idea is that a user sends a query to a system or, more correctly, an index. The terms in the query do not match entries in the database. The system displays a message which says, “No results match your query.”

For some individuals, that null set response is high value information. One can bump into null sets when running queries on a Web site; for example, send the anti fungicide query to the Arnold Information Technology blog at this link. Here’s the result:


From this response, one knows that there is no content containing the search phrase. That’s valuable for some people.

To address this problem, modern systems “relax” the query. The idea is that the user did not want what he or she typed in the search box. The search system then changes the query and displays those results to the stupid user. Other systems take action and display results which the system determines are related to the query. You can see these relaxed results when you enter the query shadowdragon into Google. Here are the results:


Google ignored my spelling and displays information about a video game, not the little known company Shadowdragon. At least Google told me what it did and offers a way to rerun the query using the word I actually entered. But the point is that the search was “relaxed.”
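To make the relaxing idea concrete, here is a minimal sketch. It is not how Google or any vendor does it. It assumes a toy index keyed by exact phrase and uses Python’s standard `difflib` to find a close spelling only when the exact query returns a null set. The index contents and the 0.7 cutoff are invented.

```python
import difflib

def search(index, query, relax=True):
    """Try an exact match; optionally relax to the closest indexed phrase."""
    term = query.lower()
    exact = index.get(term, [])
    if exact or not relax:
        # Either something matched, or the user asked for the query as typed.
        return {"query_used": term, "relaxed": False, "results": exact}
    close = difflib.get_close_matches(term, list(index), n=1, cutoff=0.7)
    if not close:
        # Honest null set: nothing in the index resembles the query.
        return {"query_used": term, "relaxed": False, "results": []}
    return {"query_used": close[0], "relaxed": True, "results": index[close[0]]}

# Hypothetical index, invented for illustration.
index = {
    "shadow dragon": ["video game page"],
    "fungicide": ["garden article"],
}

print(search(index, "shadowdragon"))
print(search(index, "shadowdragon", relax=False))
```

The `relaxed` flag and the `relax=False` switch mirror the two courtesies noted above: tell the user the query was changed, and offer a way to rerun it verbatim.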

The purpose of semantic expansion is a variation of Endeca’s facets. The idea is that a key word belongs to a category. If a system can identify a category, then the user can get more results by selecting the category and maybe finding something useful. Endeca’s wine demonstration makes this function and its value clear.
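A bare-bones sketch of the facet idea, assuming each record carries its category values as plain fields. The wine records and function names are made up for illustration and are not taken from Endeca’s demonstration.

```python
from collections import Counter

def facet_counts(records, field):
    """Count how many records fall under each value of one category field."""
    return Counter(record[field] for record in records if field in record)

def refine(records, field, value):
    """Keep only the records in the category the user selected."""
    return [record for record in records if record.get(field) == value]

# Hypothetical catalog entries, invented for illustration.
wines = [
    {"name": "Wine A", "type": "red", "region": "Napa"},
    {"name": "Wine B", "type": "red", "region": "Bordeaux"},
    {"name": "Wine C", "type": "white", "region": "Napa"},
]

print(facet_counts(wines, "type"))
print([wine["name"] for wine in refine(wines, "region", "Napa")])
```

The counts are what a user sees next to each category. Selecting a category is just a filter.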

A Math Cheat Sheet with 212 Pages

May 30, 2019

When I was in high school, there was one student who wrote on his arm. Fortunately I just remembered the information. I wonder if this person will be interested in the “Mathematics Cheat Sheet.” The “sheet” contains 200 plus pages. I assume that if one could write tiny numbers and letters, a page or two might be recorded on an arm, the back of one’s hand, one’s palm, and maybe another body part. On the other hand, it is probably easier to use a smart phone and look for the information surrounded by ads for one of those “help you children learn” services. If you fancy a cheat “sheet” for math which will consume three fifths of a ream of paper (plus or minus a percent or two), enjoy. (I must confess that I browsed the “sheet” and was stunned to learn how much I have forgotten. Power? When did I confront this equation, when I was 14? Maybe 15?


But at age 75, I am lucky if I can remember how to get money from an automatic teller machine which asks me which language I prefer. Still thinking. Thinking.)

Stephen E Arnold, May 30, 2019

Chain of Failure: A Reminder about Logic

May 26, 2019

I spotted a reference to a blog post on Yodaiken. Its title is “Von Neumann’s Critique of Automata Theory and Logic in Computer Science.” Do we live in a Von Neumann world? I was delighted to be reminded of the observations in this passage. Here’s the snippet I circled in yellow highlighter:

In a sufficiently long chain of operations the cumulative effect of these individual probabilities of failure may (if unchecked) reach the order of magnitude of unity-at which point it produces, in effect, complete unreliability.

Interesting. Perhaps failure is part of the DNA of smart software?
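A back-of-the-envelope way to see the point in the quoted passage, under a simplifying assumption that is mine and not Von Neumann’s: every operation fails independently with the same small probability p. The chance that a chain of n operations has at least one failure is then 1 - (1 - p)^n. The function names below are invented.

```python
import math

def chain_failure_probability(p, n):
    """Probability of at least one failure in n independent steps, each failing with probability p."""
    return 1.0 - (1.0 - p) ** n

def steps_until(p, threshold):
    """Smallest chain length at which the failure probability reaches the threshold."""
    return math.ceil(math.log(1.0 - threshold) / math.log(1.0 - p))

# A one-in-a-thousand failure rate per step looks harmless for short chains.
for n in (10, 100, 1000, 10000):
    print(n, round(chain_failure_probability(0.001, n), 4))

print(steps_until(0.001, 0.5))
```

With a per-step failure rate of 0.001, the chain is more likely than not to contain a failure well before it reaches 1,000 steps.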

Stephen E Arnold, May 26, 2019

Data Science Book: Free for Now

May 24, 2019

We spotted a post by Capri Granville which points to a free data science book. The post also provides a link to other free books. The Microsoft Research India book is “Foundations of Data Science” by Ravi Kannan. You can as of May 24, 2019, download the book without charge at this link. Cornell charges students about $55,188 for an academic year. DarkCyber believes that “free” may not be an operative word where the Theory Center used to love those big IBM computers. No, they were not painted Azure.

Stephen E Arnold, May 24, 2019

Predictions and Experts: Maybe Ignore Them or Just Punish Them?

May 13, 2019

I read “The Peculiar Blindness of Experts” with this subtitle:

Credentialed authorities are comically bad at predicting the future. But reliable forecasting is possible.

The write up reminded me of an anthologized essay in freshman English 101. I suggest you take a look at the original. There is a subtext chugging along in this lengthy write up. To whet your appetite, consider this passage which I circled in True Blue marker:

Unfortunately, the world’s most prominent specialists are rarely held accountable for their predictions, so we continue to rely on them even when their track records make clear that we should not.

Is the message “Get it wrong and get punished”? Outputs from Recorded Future or horse race touts could possibly be altered.

There is a bit of hope for those who can learn:

The best forecasters, by contrast, view their own ideas as hypotheses in need of testing. If they make a bet and lose, they embrace the logic of a loss just as they would the reinforcement of a win. This is called, in a word, learning.

Is smart software like a hedgehog or a fox?

I won’t predict your response.

Stephen E Arnold, May 13, 2019

China: Patent Translation System

May 10, 2019

Patents are usually easily findable documents. However, reading a patent once found is a challenge. Up the ante if the patent is in a language the person does not read. “AI Used to Translate Patent Documents” provides some information about a new system available from the Intellectual Property Publishing House. According to the article in China Daily:

The system can translate Chinese into English, Japanese and German and vice versa. Its accuracy in two-way translation between Chinese and Japanese has reached 95 percent, far more than the current industry average, and the rest has topped 90 percent…

The system uses a dictionary, natural language processing algorithms, and a computational model. In short, this is a collection of widely used methods tuned over a decade by the Chinese organization. In that span, Thomson Reuters dropped out of the patent game, and just finding patents, even in the US, can be a daunting task.

Translation has been an even more difficult task for some lawyers, researchers, analysts, and academics.

If the information in the China Daily article is accurate, China may have an intellectual property advantage. The write up offers some details, which sound interesting; for example:

  • Translation of a Japanese document: five seconds
  • Patent documents record 90 percent of a country’s technology and innovation
  • China has “a huge database of global patents”.

And the other 10 percent? Maybe other methods are employed.

Stephen E Arnold, May 10, 2019

Algorithms: Thresholds and Recycling Partially Explained

April 19, 2019

Five or six years ago I prepared a lecture about the weaknesses in widely used algorithms. In that talk, which I delivered to intelligence operatives in Western Europe and the US, I made two points which were significant to me and my small research team.

  1. There are about nine or 10 algorithms which are used again and again. One example is k-means. The reason is that the procedure is a fixture in many university courses, and the method is good enough.
  2. Quite a bit of the work on smart software relies on cutting and pasting. In 1962, I discovered the value of this approach when I worked on a small project at my undergraduate university. Find a code snippet that does the needed task, modify it if necessary, and bingo! Today this approach remains popular.
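For readers who have not bumped into k-means, here is a minimal sketch of the standard textbook loop (assign each point to its nearest centroid, recompute the centroids, repeat). It is the kind of snippet that gets cut and pasted. The naive choice of the first k points as starting centroids and the sample points are my assumptions, for illustration only.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of numbers; returns (centroids, clusters)."""
    centroids = [points[i] for i in range(k)]  # naive start: the first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-D points forming two obvious groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]
centroids, clusters = kmeans(points, 2)
print(centroids)
```

Note how much rides on choices the user never sees: the value of k, the starting centroids, and the distance measure. Change any of them and the “answer” can change.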

I thought about my lectures and these two points when I read another part of the mathy series “Untold History of AI: Algorithmic Bias Was Born in the 1980s.” IEEE Spectrum does a reasonable job of explaining one case of algorithmic bias. The story is similar to the experience Amazon had with one of its smart modules. The math produced wonky results. The word “bias” is okay with me, but the outputs from systems which happily chug away and deliver “outputs” to clueless MBAs, lawyers, and marketers may be incorrect.

Several observations:

  1. The bias in methods goes back before I showed up at the university computer center to use the keypunch machines. Way back in fact.
  2. Developers today rely on copy and paste, open source, and the basic methods taught by professors who may be thinking about their side jobs as consultants.
  3. Training data may be skewed, and no one wants to spend the money or take the time to create training data. Why bother? Just use whatever is free, cheap, or already on a storage device. Close enough for horseshoes.
  4. Users do not know what’s going on behind the point and click interfaces, nor do most users care. As a result, a good graphic is “correct.”

The chatter about the one percent focuses on money. There is another, more important one percent in my opinion. The one percent who take the time to look at a sophisticated system will find the same nine or 10 algorithms, the same open source components, and some recycled procedures that few think about. Quick question: How many smart software systems rely on Thomas Bayes’ methods? Give up? Lots.
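For those who gave up on the quick question, a small sketch of Bayes’ rule itself: posterior = likelihood × prior / evidence. The scenario and the numbers are hypothetical, chosen only to show how a detector that sounds accurate can still be wrong most of the time when the thing it flags is rare.

```python
def bayes_posterior(prior, hit_rate, false_alarm_rate):
    """P(condition | flag) from P(condition), P(flag | condition), and P(flag | no condition)."""
    evidence = hit_rate * prior + false_alarm_rate * (1.0 - prior)
    return hit_rate * prior / evidence

# Hypothetical detector: 1 percent of items are truly "bad", the system flags
# 90 percent of the bad ones and wrongly flags 5 percent of the good ones.
posterior = bayes_posterior(prior=0.01, hit_rate=0.90, false_alarm_rate=0.05)
print(round(posterior, 3))
```

Under these made-up numbers, most flagged items are false alarms, which is the sort of “accuracy” question a good graphic tends to hide.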

I don’t have a remedy for this problem, and I am not sure too many people care or want to talk about the “accuracy” of a smart system’s outputs. That’s a happy thought for the weekend. Imagine bad outputs in an autonomous drone or a smart system in a commercial aircraft? Exciting.

Stephen E Arnold, April 19, 2019

Facial Recognition: An Important Technology Enters Choppy Waters

April 8, 2019

I wouldn’t hold my breath: The Electronic Frontier Foundation (EFF) declares, “Governments Must Face the Facts About Face Surveillance, and Stop Using It.” Writers Hayley Tsukayama and Adam Schwartz begin by acknowledging reality—the face surveillance technology business is booming, with the nation’s law enforcement agencies increasingly adopting it. They write:

EFF supports legislative efforts in Washington and Massachusetts to place a moratorium on government use of face surveillance technology. These bills also would ban a particularly pernicious kind of face surveillance: applying it to footage taken from police body-worn cameras. The moratoriums would stay in place, unless lawmakers determined these technologies do not have a racial disparate impact, after hearing directly from minority communities about the unfair impact face surveillance has on vulnerable people. We recently sent a letter to Washington legislators in support of that state’s moratorium bill.

EFF’s communications may be having some impact.

DarkCyber noted that Amazon will be allowing shareholders a vote about sales of the online bookstore’s facial recognition technology, Rekognition. “AI Researchers Tell Amazon to Stop Selling Facial Recognition to the Police” does not explain how Amazon can remove its FAR from those entities which have licensed the technology.

DarkCyber believes that the US is poised to become a procurement innovation center. Companies and their potential customers have to figure out how to work together without creating political, legal, and financial disruptions.

A failure to resolve what seems to be a more common problem may allow vendors in other countries to capture leading engineers, major contracts, and a lead in an important technology.

Stephen E Arnold, April 8, 2019

Content Management: Now a Playground for Smart Software?

March 28, 2019

CMS or content management systems are a hoot. Sometimes they work; sometimes they don’t. How does one keep these expensive, cranky databases chugging along in the zip zip world of content utilities which are really inexpensive?

Smart software and predictive analytics?

Managing a website is not what it used to be, and one of the biggest changes to content management systems is the use of predictive analytics. The Smart Data Collective discusses “The Fascinating Role of Predictive Analytics in CMS Today.” Reporter Ryan Kh writes:

“Predictive analytics is changing digital marketing and website management. In previous posts, we have discussed the benefits of using predictive analytics to identify the types of customers that are most likely to convert and increase the value of your lead generation strategy. However, there are also a lot of reasons that you can use predictive analytics in other ways. Improving the quality of your website is one of them. One of the main benefits of predictive analytics in 2019 is in improving the performance of content management systems. There are a number of different types of content management systems on the market, including WordPress, Joomla, Drupal, and Shopify. There are actually hundreds of content management systems on the market, but these are some of the most noteworthy. One of the reasons that they are standing out so well against their competitors is that they use big data solutions to get the most value for their customers.”

The author notes two areas in which predictive analytics are helping companies’ bottom lines: fraud detection and, of course, marketing optimization; the latter through capacities like more effective lead generation and content validation.

Yep, CMS with AI. The future with spin.

Cynthia Murrell, March 28, 2019
