Tweaking Algorithms: Touching Up Nicks and Scrapes?
January 30, 2018
Is it possible that algorithms are growing and changing along with the basic needs of our artificial intelligence? Yes and no is pretty much the answer. We learned how many researchers claim to be revolutionizing the DNA of artificial intelligence, and how they might be blowing smoke, from a recent Phys.org story, “The Algorithms of Our Future Thinking Machines.”
According to the story:
“The challenge of constructing algorithms for dynamic systems is in the nature of those systems: they are in constant change. Traffic cameras, radar, and inertial sensors are some of the devices delivering the information the algorithm requires. Now another extremely dynamic system is becoming more central to Thomas Schön’s and his colleagues’ projects: the human body.”
This is some neat research and well worth a read, but the grandiose claims are a little unfounded. People have been constructing algorithms for dynamic systems for quite a while. From search engines to video games to the sensor systems mentioned above, there is not a lot of undiscovered territory in this land. However, that does not mean there are no new depths to explore. We have found the mountain, so now it is time to crack it open and see what gems are hiding inside. If bright minds like these can tune their thinking that way, we could be in for some grand surprises or some clumsy scratch repairs.
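For readers who want a concrete picture of what “an algorithm for a dynamic system” looks like, the classic workhorse is the Kalman filter, which fuses noisy sensor readings (radar, inertial sensors, and the like) into a running state estimate. The sketch below is a minimal one-dimensional illustration in Python, not anything from Schön’s group; every matrix value is an assumed toy parameter.

```python
import numpy as np

# Minimal 1-D constant-velocity Kalman filter: estimate position and
# velocity from noisy position measurements (e.g., a radar return).
# State x = [position, velocity]; all values are illustrative choices.

dt = 0.1                                   # time step (seconds)
F = np.array([[1, dt], [0, 1]])            # state transition model
H = np.array([[1.0, 0.0]])                 # we only measure position
Q = np.eye(2) * 0.01                       # process noise covariance
R = np.array([[0.5]])                      # measurement noise covariance

x = np.array([[0.0], [0.0]])               # initial state estimate
P = np.eye(2)                              # initial estimate covariance

def kalman_step(x, P, z):
    # Predict: project the state and covariance forward one step.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: blend the prediction with the new measurement z.
    y = z - H @ x_pred                     # innovation (residual)
    S = H @ P_pred @ H.T + R               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain
    return x_pred + K @ y, (np.eye(2) - K @ H) @ P_pred

# Feed in simulated noisy measurements of an object moving at 1 m/s.
rng = np.random.default_rng(0)
for t in range(50):
    z = np.array([[t * dt * 1.0 + rng.normal(0, 0.5)]])
    x, P = kalman_step(x, P, z)
print("estimated position, velocity:", x.ravel())
```

Work like Schön’s, as we read it, extends this same predict-and-update loop to far messier, nonlinear systems such as the human body.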
Patrick Roland, January 30, 2018
Palantir: Accused of Hegelian Contradictions
January 29, 2018
I bet you have not thought about Hegel since you took that required philosophy course in college. Well, Hegel and his “contradictions” are central to “WEF 2018: Davos, Data, Palantir and the Future of the Internet.”
I highlighted this passage from the essay:
Data is the route to security. Data is the route to oppression. Data is the route to individual ideation. Data is the route to the hive mind. Data is the route to civic wealth. Data is the route to civic collapse.
Thesis, antithesis, synthesis in action, I surmise.
The near term objective is synthesis. I assume this is the “connecting the dots” approach to finding what one needs to know.
I learned:
The stakes for big data couldn’t be bigger.
Okay, a categorical claim in our fast-changing, diverse economic and political climate. “Be afraid” seems to be the message.
Palantir’s point of operations in Davos is described in the write up as “a pimped up liquor store.” Helpful and highly suggestive too.
The conclusion of the essay warranted a big red circle:
So next time you hear the names Palantir or Alex Karp, stop what you’re doing and pay attention. The future – your future – is under discussion. Under construction. This little first draft of history of which you’ve made it to the end (congratulations and thanks) – the history of data – is of a future that will in time come to be seen for what it is: digital that truly matters.
Several observations:
- The author wants me to believe that Palantir is not a pal.
- The big data thing troubles the author because Palantir is one of the vendors providing next generation information access.
- The goal of making Palantir into something unique is best accomplished by invoking Fancy Dan ideas.
I would suggest that knowledge about companies like Gamma Group FinFisher, Shoghi, Trovicor, and some other interesting non US entities might put Palantir in perspective. Palantir has an operational focus; some of the other vendors perform different information services.
Palantir is an innovator, but it is part of a landscape of data intercept and analysis organizations. I could make a case that Palantir is capable but some companies in Europe and the East are actually more technologically advanced.
But these outfits were not at Davos. Why? That’s a good question. Perhaps they were too busy with their commercial and government work. My hunch is that a few of these outfits were indeed “there”, just not noticed by the expert who checked out the liquor store.
Stephen E Arnold, January 29, 2018
IBM and Algorithmic Bias
January 25, 2018
I read “Unexplainable Algos? Get Off the Market, Says IBM Chief Ginni Rometty.” The idea is in line with Weapons of Math Destruction and the apparent interest in “black box” solutions. If you are old enough, you will remember the Autonomy IDOL system. It featured a “black box” which licensees used without the ability to alter how the system operated. You may also recall that the first Google Search Appliances locked users out as well. One installed the GSA and it just worked—at least, in theory.
This article includes information derived from the IBM content output for the World Economic Forum where it helps to have one’s own helicopter for transportation.
I noted this statement:
“When it comes to the new capabilities of artificial intelligence, we must be transparent about when and how it is being applied and about who trained it, with what data, and how,” the IBM chairman, president and CEO wrote.
I don’t want to be too picky, but IBM owns the i2 Analyst Notebook system. If you are not familiar with this platform, it provides law enforcement and intelligence professionals with tools to organize, analyze, and marshal information for an investigation. As a former consultant to i2, I am not sure the plumbing developed by i2 is public. In fact, i2 (later acquired by IBM) and Palantir jousted in court when i2 sued Palantir for improper use of its intellectual property; that is a fancy way of saying, “Palantir engineers tried to figure out how i2 worked.” The case settled out of court, and many of the documents are sealed because neither party to the case wanted certain information exposed to bright sunlight.
IBM operates a number of cybersecurity services. One of these has the ability to intercept a voice call and map that call to email and other types of communications. The last time I received some information about this service, I had to sign a bundle of documents. The point, of course, is that much of the technology was, from my point of view, a “black box.”
So what?
The statement by IBM’s CEO is important because it is, in my opinion, hand waving. IBM deals in systems which are not fully understood even by some of the IBM experts selling these solutions, and the engineers who may know more about the inner workings of secret or confidential systems and methods are not talking. An expert knows stuff others do not; therefore, why talk and devalue one’s expertise?
To sum up, talk about making math-centric systems and procedures transparent is just noise. People who can explain how systems which emerged from Cambridge University, like Autonomy’s Neurolinguistic System or i2’s Analyst Notebook, actually work are in short supply.
How can one who does not understand a complex system explain how it works? Black boxes exist to keep those with thumbs for fingers from breaking what works.
Talk doesn’t do much to deal with the algorithmic basics:
- Some mathematical procedures in wide use are not easily explained or reverse engineered; hence, the IBM charge that Palantir tried a shortcut through the woods to the cookie jar.
- Most next generation systems are built on a handful of algorithms. I have identified 10 which I explain in my lectures about the flaws embedded in “smart” systems. Each of the most widely used algorithms can be manipulated in a number of ways. Some require humans to fiddle; others fiddle when receiving inputs from other systems.
- Explainable systems are based on rules. By definition, one assumes the rules work as the authors intended. News flash: rule-based systems can behave in unpredictable, often inexplicable ways. A fun example is for you, gentle reader, to try to get the default numbering system in Microsoft Word to perform consistently with regard to left justification of numbered lists.
- Chain a series of algorithms together in a workflow. Add real-time data to update thresholds. Watch the outputs. Now explain what happened. Good luck with that. (The toy pipeline below suggests why.)
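To make the last point concrete, here is a deliberately tiny Python pipeline, purely illustrative and with made-up stage names, in which an anomaly score, an adaptive threshold, and a final rule are chained together and the threshold is re-estimated from live data. Even with three trivial stages, explaining why a particular input did or did not trigger an alert requires replaying the entire input history.

```python
import random

# Toy pipeline: three chained stages whose behavior drifts because a
# threshold is re-estimated from live data. All names are illustrative.

random.seed(42)

def stage_score(value, history):
    # Stage 1: a rough anomaly score against the full input history.
    mean = sum(history) / len(history)
    spread = max(history) - min(history) or 1.0
    return (value - mean) / spread

def stage_threshold(history):
    # Stage 2: the alert threshold itself adapts to recent inputs.
    return 0.5 + 0.1 * (sum(history[-5:]) / 5.0)

def stage_decide(score, threshold):
    # Stage 3: a simple rule at the end of the chain.
    return "ALERT" if score > threshold else "ok"

history = [10.0] * 10
for t in range(20):
    value = 10.0 + random.gauss(0, 1) + (5 if t == 12 else 0)
    score = stage_score(value, history)
    threshold = stage_threshold(history)
    decision = stage_decide(score, threshold)
    history.append(value)      # today's input changes tomorrow's rules
    print(f"t={t:2d} value={value:6.2f} score={score:5.2f} "
          f"thr={threshold:5.2f} -> {decision}")
```

Whether the spike at t=12 fires an alert depends on every value that came before it, which is the explainability problem in miniature.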
I love IBM. Always marketing.
Stephen E Arnold, January 25, 2018
How SEO Has Shaped the Web
January 19, 2018
With the benefit of hindsight, big-name thinker Anil Dash has concluded that SEO has contributed to the ineffectiveness of Web search. He examines how we got here in his article, “Underscores, Optimization & Arms Races” at Medium. Starting with the year 2000, Dash traces the development of Internet content management systems (CMSs), of which he was a part. (It is a good brief summary for anyone who wasn’t following along at the time.) WordPress is an example of a CMS.
As Google’s influence grew, online publishers became aware of an opportunity: they could game the search algorithm to move their sites to the top of “relevant” results by playing around with keywords and other content details. The question of whether websites should bow to Google’s whims seemed to go unasked, as site after site fell into this pattern, later to be known as Search Engine Optimization. For Dash, the matter was symbolized by the question of whether hyphens or underscores should represent spaces in web addresses. Now, of course, one can use either without upsetting Google’s algorithm, but that was not the case at first. When Google’s Matt Cutts stated a preference for the hyphen in 2005, most publishers fell in line, including Dash, eventually and very reluctantly; for him, the choice represented nothing less than the very nature of the Internet.
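For anyone who missed that era, the dispute was about URL slugs: Google’s tokenizer historically treated a hyphen as a word separator and an underscore as a word joiner, so “arms-races” matched searches for “arms” while “arms_races” did not. A hypothetical slug helper of the kind CMS developers had to retrofit might look like this (the function name and code are ours, not Dash’s):

```python
import re

# Hypothetical slug helper reflecting the convention Matt Cutts endorsed
# in 2005: join the words of a post title with hyphens, not underscores.

def slugify(title: str) -> str:
    """Lowercase a post title and join words with hyphens."""
    slug = title.lower()
    slug = re.sub(r"[^a-z0-9]+", "-", slug)   # non-alphanumerics -> hyphen
    return slug.strip("-")

print(slugify("Underscores, Optimization & Arms Races"))
# -> underscores-optimization-arms-races
```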
He writes:
You see, the theory of how we felt Google should work, and what the company had often claimed, was that it looked at the web and used signals like the links or the formatting of webpages to indicate the quality and relevance of content. Put simply, your search ranking with Google was supposed to be based on Google indexing the web as it is. But what if, due to the market pressure of the increasing value of ranking in Google’s search results, websites were incentivized to change their content to appeal to Google’s algorithm? Or, more accurately, to appeal to the values of the people who coded Google’s algorithm?
Eventually, even Dash and his CMS caved and switched to hyphens. What he did not notice at the time, he muses, was the unsettling development of the entire SEO community centered around appeasing these algorithms. He concludes:
By the time we realized that we’d gotten suckered into a never-ending two-front battle against both the algorithms of the major tech companies and the destructive movements that wanted to exploit them, it was too late. We’d already set the precedent that independent publishers and tech creators would just keep chasing whatever algorithm Google (and later Facebook and Twitter) fed to us. Now, the challenge is to reform these systems so that we can hold the big platforms accountable for the impacts of their algorithms. We’ve got to encourage today’s newer creative communities in media and tech and culture to not constrain what they’re doing to conform to the dictates of an opaque, unknowable algorithm.
Is that doable, or have we gone too far toward appeasing the Internet behemoths to turn back?
Cynthia Murrell, January 19, 2018
Out with the Old, in with the New at Google
January 17, 2018
It may have started with its finance app, but Google is making some drastic changes you might want to keep an eye on. We discovered the tip of the iceberg with the Google Blog piece, “Stay on Top of Finance Information on Google.”
According to the story:
Now under a new search navigation tab called “Finance,” you’ll have easier access to finance information based on your interests, keeping you in the know about the latest market news and helping you get in-depth insights about companies. On this page, you can see performance information about stocks you’ve chosen to follow, recommendations on other stocks to follow based on your interests, related news, market indices, and currencies.
As part of this revamped experience, we’re retiring a few features of the original Google Finance, including the portfolio, the ability to download your portfolio, and historical tables. However, a list of the stocks from your portfolio will be accessible through Your Stocks in the search result, and you can get notifications when there are any notable changes on their performance.
Not a big shock, but a big part of Google’s effort to freshen things up. The company has been in hot water over a string of YouTube videos deemed inappropriate. So, with moves like improving its algorithm to weed out fake news, changes to Google Home, and even changes to Maps, Google is sending a message. The message is one of change, and one we hope is for the better.
Patrick Roland, January 17, 2018
AI Makes Life-Saving Medical Advances
January 2, 2018
Too often we discuss only the grey areas around AI and machine learning. While that scrutiny is incredibly important right now, it is not all this amazing technology can do. It can also save lives. We learned a little more on that front from a recent Digital Journal story, “Algorithm Repairs Corrupted Digital Images.”
According to the story:
University of Maryland researchers have devised a technique that exploits the power of artificial neural networks to tackle multiple types of flaws and degradations in a single image in one go.
The researchers achieved image correction through the use of a new algorithm. The algorithm operates artificial neural networks simultaneously, so that the networks apply a range of different fixes to corrupted digital images. The algorithm was tested on thousands of damaged digital images, some with severe degradations. The algorithm was able to repair the damage and return each image to its original state.
The application of such technology crosses the business and consumer divide, taking in everything from everyday camera snapshots to lifesaving medical scans. The types of faults digital images can develop include blurriness, grainy noise, missing pixels and color corruption.
Very promising from a commercial and medical standpoint, especially the medical side. This news, coupled with the Forbes story about AI disrupting healthcare norms in 2018, makes for a big promise. We look forward to seeing what the new year brings for medical AI.
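The Digital Journal piece does not spell out the network architecture, so treat the following as a generic sketch of the approach: a small residual denoising network in PyTorch trained on synthetically corrupted images. The layer sizes and noise model are our assumptions, not the Maryland team’s design.

```python
import torch
import torch.nn as nn

# Minimal sketch of a denoising convolutional network, the general kind
# of model used for image restoration. This is an illustration, not the
# Maryland team's architecture; the layer sizes are arbitrary.

class TinyDenoiser(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, kernel_size=3, padding=1),
        )

    def forward(self, x):
        # Predict the corruption (here, noise) and subtract it.
        return x - self.net(x)

model = TinyDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Train on synthetic data: clean images plus Gaussian noise.
clean = torch.rand(8, 3, 32, 32)
noisy = clean + 0.1 * torch.randn_like(clean)
for step in range(100):
    opt.zero_grad()
    loss = loss_fn(model(noisy), clean)
    loss.backward()
    opt.step()
print("final reconstruction loss:", loss.item())
```

The research apparently runs several such networks side by side so one pass can handle blur, noise, missing pixels, and color corruption together; this sketch shows only the single-corruption core.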
Patrick Roland, January 2, 2018
Turning to AI for Better Data Hygiene
December 28, 2017
Most big data is flawed in some way, because humans are imperfect beings. That is the premise behind ZDNet’s article, “The Great Data Science Hope: Machine Learning Can Cure Your Terrible Data Hygiene.” Editor-in-Chief Larry Dignan explains:
The reality is enterprises haven’t been creating data dictionaries, meta data and clean information for years. Sure, this data hygiene effort may have improved a bit, but let’s get real: Humans aren’t up for the job and never have been. ZDNet’s Andrew Brust put it succinctly: Humans aren’t meticulous enough. And without clean data, a data scientist can’t create algorithms or a model for analytics.
Luckily, technology vendors have a magic elixir to sell you…again. The latest concept is to create an abstraction layer that can manage your data, bring analytics to the masses and use machine learning to make predictions and create business value. And the grand setup for this analytics nirvana is to use machine learning to do all the work that enterprises have neglected.
I know you’ve heard this before. The last magic box was the data lake where you’d throw in all of your information–structured and unstructured–and then use a Hadoop cluster and a few other technologies to make sense of it all. Before big data, the data warehouse was going to give you insights and solve all your problems along with business intelligence and enterprise resource planning. But without data hygiene in the first place enterprises replicated a familiar, but failed strategy: Poop in. Poop out.
What the observation lacks in eloquence it makes up for in insight: the whole data-lake concept was flawed from the start because it did not give adequate attention to data preparation. Dignan cites IBM’s Watson Data Platform as an example of the new machine-learning-based cleanup tools and points to other noteworthy vendors investigating similar ideas: Alation, Io-Tahoe, Cloudera, and Hortonworks. Which cleaning tool will perform best remains to be seen, but Dignan seems sure of one thing: the data that enterprises have been diligently collecting for the last several years is as dirty as a dustbin lid.
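What does “machine learning as data janitor” look like in practice? One common building block is anomaly detection over the raw records, as in this minimal scikit-learn sketch. The table and column names are invented for illustration; real platforms such as Watson Data Platform layer far more on top.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Illustrative sketch of ML-assisted data hygiene: flag suspicious rows
# in a messy table instead of hand-writing validation rules.
# The data and column names are made up for the example.

df = pd.DataFrame({
    "order_total": [19.99, 24.50, 22.10, 9999.0, 21.75, 18.20, -3.00],
    "items":       [1, 2, 2, 1, 2, 1, 1],
})

# An isolation forest learns what "typical" rows look like and scores
# each row; -1 marks likely data-entry errors or outliers.
clf = IsolationForest(contamination=0.3, random_state=0)
df["flag"] = clf.fit_predict(df[["order_total", "items"]])

print(df[df["flag"] == -1])   # rows a human should review
```

Note that the model only surfaces candidates; deciding whether 9999.0 is a typo or a genuine bulk order still takes the meticulousness humans reportedly lack.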
Cynthia Murrell, December 28, 2017
New York Begins Asking If Algorithms Can Be Racist
December 27, 2017
The whole point of algorithms is to be blind to everything except data. However, it is becoming increasingly clear that in the wrong hands, algorithms and AI could have a very negative impact on users. We learned more in a recent ACLU post, “New York Takes on Algorithm Discrimination.”
According to the story:
A first-in-the-nation bill, passed yesterday in New York City, offers a way to help ensure the computer codes that governments use to make decisions are serving justice rather than inequality.
Algorithms are often presumed to be objective, infallible, and unbiased. In fact, they are highly vulnerable to human bias. And when algorithms are flawed, they can have serious consequences.
The bill, which is expected to be signed by Mayor Bill de Blasio, will provide a greater understanding of how the city’s agencies use algorithms to deliver services while increasing transparency around them. This bill is the first in the nation to acknowledge the need for transparency when governments use algorithms…
This is a very promising step toward solving a very real problem. From racist coding to discriminatory AI, this is a topic that is creeping into the national conversation. We hope others will follow in New York’s footsteps and find ways to prevent this injustice from going further.
Patrick Roland, December 27, 2017
A Look at Chinese Search Engine Sogou
December 25, 2017
An article at Search Engine Watch draws our attention to one overseas search contender: “What Do You Need to Know About Chinese Search Engine Sogou?” Sogou recently announced terms for a proposed IPO, so writer Rebecca Sentance provides a primer on the company. She begins with some background: the platform was launched in 2004, and the name translates to “searching dog.” She also delves into the not-so-clear issue of where Sogou stands in relation to China’s top search engine, Baidu, and some other contenders for second place, so see the article for those details.
I was interested in what Sentance writes about Sogou’s use of AI and natural language search:
It also plans to shift its emphasis from more traditional keyword-based search to answer questions, in line with the trend towards natural language search prompted by the rise of voice search and digital assistants. Sogou has joined major search players such as Bing, Baidu and of course Google in investing in artificial intelligence, but its small size may put it at a disadvantage. A huge search engine like Baidu, with an average of more than 583 million searches per day, has access to reams more data with which to teach its machine learning algorithms.
But Sogou has an ace up its sleeve: it is the only search engine formally allowed to access public messages on WeChat – a massive source of data that will be particularly beneficial for natural language processing. Plus, as I touched on earlier, language is something of a specialty area for Sogou, as Sogou Pinyin gives it a huge store of language data with which to work. Sogou also has ambitious plans to bring foreign-language results to Chinese audiences via its translation technology, which will allow consumers to search the English-speaking web using Mandarin search terms.
The article wraps up by looking at Sogou’s potential effect on search markets; basically, it could have a large impact within China, especially if Baidu keeps experiencing controversy. For the rest of the world, though, the impact should be minimal. Nevertheless, this is one company worth keeping an eye on.
Cynthia Murrell, December 25, 2017
Google Is Taught Homosexuality Is Bad
December 12, 2017
The common belief is that computers and software are objective, inanimate objects capable of greater intelligence than humans. The truth is that humans developed computers and software, so these objective, inanimate objects are only as smart as their designers. What is even more hilarious is that the sentiment analysis AI development process requires tons of data for the algorithms to read and teach themselves to recognize patterns. The data used is “contaminated” with human emotion and prejudices. Motherboard wrote about how human bias pollutes AI in the article, “Google’s Sentiment Analyzer Thinks Being Gay Is Bad.”
The problem when designing AI is that if it is trained on polluted and biased data, these supposedly intelligent algorithms will discriminate against people rather than being objective. Google released its Cloud Natural Language API, which allows developers to add Google’s deep learning models to their own applications. Along with entity recognition, the API includes a sentiment analyzer that detects whether text carries a positive or negative sentiment. However, it has a few bugs and returns biased results, such as rating statements about being gay, or about certain religions, as negative.
It looks like Google’s sentiment analyzer is biased, as many artificially intelligent algorithms have been found to be. AI systems, including sentiment analyzers, are trained using human texts like news stories and books. Therefore, they often reflect the same biases found in society. We don’t know yet the best way to completely remove bias from artificial intelligence, but it’s important to continue to expose it.
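How does one “expose” such bias in practice? A standard probe is to score template sentences that differ only in an identity term and compare the results. The sketch below uses a deliberately rigged stand-in scorer so the probe has something to find; to audit a real system, one would replace toy_sentiment() with a call to the analyzer under test. The lexicon, template, and term list are all illustrative assumptions.

```python
# Minimal sketch of probing a sentiment model for identity bias: score
# template sentences that differ only in the identity term.
# toy_sentiment() is a stand-in rigged to mimic the bug Motherboard
# reported; swap in the real analyzer being audited.

TOY_LEXICON = {"gay": -0.2, "straight": 0.1}  # rigged to show the bug

def toy_sentiment(text: str) -> float:
    """Placeholder scorer; replace with the system under test."""
    return sum(TOY_LEXICON.get(w, 0.0) for w in text.lower().split())

TEMPLATE = "i am {}"
IDENTITY_TERMS = ["straight", "gay", "christian", "jewish", "atheist"]

for term in IDENTITY_TERMS:
    sentence = TEMPLATE.format(term)
    print(f"{sentence!r:20} score = {toy_sentiment(sentence):+.2f}")

# A bare statement of identity should score near zero for every term;
# systematic gaps between terms are the bias being reported.
```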
The problem with programming AI algorithms is that it is difficult to feed them data free of human prejudices. It is difficult to work around these prejudices because they are so ingrained in most data. Programmers are kept on their toes to find a solution, but there is no one-size-fits-all fix. Too bad they cannot just stick with numbers and dictionaries.
Whitney Grace, December 12, 2017