AI Speed Bumps
March 5, 2018
If you read anything about artificial intelligence these days, it’s usually about how efficiency will go through the roof because of it. Either that, or how it’s the most destructive invention since the nuclear bomb. Both of those might be moot points, because the science behind AI is being seriously questioned. This came from a recent Science Magazine story, “Missing Data Hinders Replication of Artificial Intelligence Studies.”
According to the piece:
“The booming field of artificial intelligence (AI) is grappling with a replication crisis, much like the ones that have afflicted psychology, medicine, and other fields over the past decade. AI researchers have found it difficult to reproduce many key results, and that is leading to a new conscientiousness about research methods and publication protocols.”
The problem stems from scientists’ need to publish quickly and get the upper hand in a competitive field. That means algorithms are not tested in every situation and cracks begin to form in their theories. This feels like a growing pain for a rapidly expanding field. It also comes on the heels of another study that claims we are thinking about AI all wrong. We humans tend to think of AI as working like a human brain, but it works like something totally different. We suddenly don’t have a lot of faith that scientists will get to the bottom of what that is, exactly.
As an example, the dust-up between Elon Musk and Steven Pinker does little to reassure neutral observers about smart software.
Patrick Roland, March 5, 2018
Governance: Now That Is a Management Touchstone for MBA Experts
February 27, 2018
I read “Unlocking the Power of Today’s Big Data through Governance.” Quite a lab grown meat wiener that headline is, with its “unlocking,” “power,” “Big Data,” and “governance.” Yep, IDG, the outfit which cannot govern its own agreements with the people the firm pays to make the IDG experts so darned smart. (For the back-story, check out this snapshot of governance in action.)
What’s the write up with the magical word governance about?
Instead of defining “governance,” I learn what governance is not; to wit:
Data governance isn’t about creating a veil of secrecy around data
I have zero idea what this means. Back to the word “governance.” Google and Wikipedia define the word in this way:
Governance is all of the processes of governing, whether undertaken by a government, market or network, whether over a family, tribe, formal or informal organization or territory and whether through the laws, norms, power or language of an organized society.
Okay, governing. What’s governing mean? Back to the GOOG. Here’s one definition which seems germane to MBA speakers:
control, influence, or regulate (a person, action, or course of events).
The essay drags out the chestnuts about lots of information. Okay, I think I understand because Big Data has been touted for many years. Now, mercifully I assert, the drums are beating out the rhythm of “artificial intelligence” and its handmaiden “algos,” the terrific abbreviation some of the marketing jazzed engineers have coined. Right, algos, bro.
What’s the control angle for Big Data? The answer is that “data governance” will deal with:
- Shoddy data
- Incomplete data
- Off point data
- Made up data
- Incorrect data
Presumably these thorny issues will yield to a manager who knows the ins and outs of governance. I suppose there are many experts in governance; for example, the fine folks who have tamed content chaos with their “governance” of content management systems or the archiving mavens who have figured out what to do with tweets at the Library of Congress. (The answer is to not archive tweets. There you go. Governance in action.)
The article suggests a “definitive data governance program.” Right. If one cannot deal with backfiles, changes to the data in the archives, and the new flows of data—how does one do the “definitive governance program” thing? The answer is, “Generate MBA baloney and toss around buzzwords.” Check out the list of tasks which, in my experience, are difficult to accomplish when resources are available and the organization has a can-do attitude:
- Document data and show its lineage.
- Set appropriate policies, and enforce them.
- Address roles and responsibilities of everyone who touches that data, encouraging collaboration across the organization.
These types of tasks are the life blood of consultants who purport to have the ability to deliver the near impossible.
What happens if we apply the guidelines in the Governance article to the data sets listed in “Big Data And AI: 30 Amazing (And Free) Public Data Sources For 2018.” In my experience, the cost of normalizing the data is likely to be out of reach for most organizations. Once these data have been put in a form that permits machine-based quality checks, the organization has to figure out what questions the data can answer with a reasonable level of confidence. Getting over these hurdles then raises the question, “Are these data up to date?” And, if the data are stale, “How do we update the information?” There are, of course, other questions, but the flag waving about governance operates at an Ivory Tower level. Dealing with data takes place with one’s knees on the ground and one’s hands in the dirt. If the public data sources are not pulling the hay wagon, what’s the time, cost, and complexity of obtaining original data sets, validating them, and whipping them into shape for use by an MBA?
You know the answer: “This is not going to happen.”
Here’s a paragraph which I circled in Oscar Mayer wiener pink:
One of the more significant, and exciting, changes in data governance has been the shift in focus to business users. Historically, data has been a technical issue owned by IT and locked within the organization by specific functions and silos. But if data is truly going to be an asset, everyday users—those who need to apply the data in different contexts—must have access and control over it and trust the data. As such, data governance is transforming from a technical tool to a business application. And chief data officers (CDOs) are starting to see the technologies behind data governance as their critical operating environment, in much the same way SAP serves CFOs, and Salesforce supports CROs. It is rare to find an opportunity to build a new system of record for a market.
Let’s look at this low calorie morsel and consider some of its constituent elements. (Have you ever seen wieners being manufactured? Fill in that gap in your education if you have not had the first hand learning experience.)
First, business users want to see a pretty dashboard, click on something that looks interesting in a visualization, and have an answer delivered. Most of the business people I know struggle to understand if the data in their system is accurate and have limited expertise to understand the mathematical processes which churn away to display an “answer.”
The reference to SAP is fascinating, but I think of IBM-type systems as somewhat out of step with the more sophisticated tools available to deal with certain data problems. In short, SAP is an artifact of an earlier era, and its lessons, even when understood, have been inadequate in the era of real time data analysis.
Let me be clear: Data governance is management malarkey. Look closely at organizations which are successful. Peer inside their data environments. When I have looked, I have seen clever solutions to specific problems. The cleverness can create its own set of challenges.
The difference between a Google and a Qwant, a LookingGlass Cyber and IBM i2, or Amazon and Wal-Mart is not Big Data. It is not the textbook definition of “governance.” Success has more to do with effective problem solving on a set of data required by a task. Google sells ads and deals with Big Data to achieve its revenue goals. LookingGlass addresses chat information for a specific case. Amazon recommends products in order to sell more products.
Experts who invoke governance on a broad scale as a management solution are disconnected from the discipline required to identify a problem and deal with data required to solve that problem.
Few organizations can do this with their “content management systems,” their “business intelligence systems,” or their “product information systems.” Why? Talking about a problem is not solving a problem.
Governance is wishful thinking and not something that is delivered by a consultant. Governance is an emergent characteristic of successful problem solving. Governance is not paint; it is not delivered by an MBA and a PowerPoint; it is not a core competency of jargon.
In Harrod’s Creek, governance is getting chicken to the stores in the UK. Whoops. That management governance is not working. So much in modern business does not work very well.
Stephen E Arnold, February 27, 2018
Google: What We Have Here Is a Failure to Innovate
February 23, 2018
Google is one of the top technology companies in the world, and its services are used on nearly every computer, phone, and tablet. Google is supposed to be at its most innovative when it comes to developing new technology, but a former Google insider says the opposite. Steve Yegge, writing for Medium, explains his Google experience in his article, “Why I Left Google To Join Grab.”
Yegge loved Google and still considers it to be one of the best places in the world to work, but he left for some good reasons:
“The main reason I left Google is that they can no longer innovate. They’ve pretty much lost that ability. I believe there are several contributing factors, of which I’ll list four here. First, they’re conservative…Second, they are mired in politics, which is sort of inevitable with a large enough organization; the only real alternative is a dictatorship, which has its own downsides. Third, Google is arrogant…But fourth, last, and probably worst of all, Google has become 100% competitor-focused rather than customer focused.”
Google has reached the apex of its innovative spirit and has gone the way of all corporations and, arguably, politicians. Google has grown so big and powerful, hires the top players in the field, and controls so many products and services that it does not want to lose face, its employees have ego problems, and they serve the almighty dollar. It is a repetitious pattern that has been playing out for ages. One of the greatest examples was the British Empire. The British Empire became so big and powerful that the resources were spread too thin, the ruling parties were arrogant, the subjects suffered, and those in power never wanted it to change. It sounds like Google, does it not?
Yegge then talks about his new endeavor, Grab, and stresses the importance of keeping your ear to the ground in order to build and grow a business. Google has gotten too big, but it still has a lot of power and it will be a while before it falls. Another company will pick up the slack. Someone always does.
Whitney Grace, February 23, 2018
Facebook and Google: Set Up a Standards Entity
January 25, 2018
Ah, governance. A murky word which means figuring out the rules of the road. Tough job.
I read “UK Advertisers urge Facebook and Google to Set Up Standards Body.” The idea is interesting. It reminds me of the hapless part time teacher who was supposed to manage my high school science club. Shortly before one of the wags ignited a smoke bomb in chemistry class, our science club was asked to stop playing pranks. Yep, that notion lasted less than 24 hours.
I think of Facebook, Google, and some other outfits as high school science and math clubs whose DNA is now more mature—just with niftier technology.
The write up ignores what I perceive as the basis of some interesting corporate behavior. I learned from the article:
Advertisers have called on Facebook and Google to establish an independent body to regulate and monitor content on both of their platforms.
Okay, both companies are supposed to generate a return for their shareholders. Both companies are not too keen on people not working in a sufficiently advanced field offering suggestions. This is similar to the concierge of a fancy hotel telling the bank president financing the outfit what to have for breakfast.
The write up opined in a “real” news way:
Google and Facebook should “thrash out some common principles” over content moderation and removal that could be adopted and enforced by an independent body, which they would fund, he [Phil Smith, director general of the Incorporated Society of British Advertisers or ISBA] said.
The write up reported:
Mr Smith, a former marketing director of Kraft, said advertisers expect the big technology companies to take action because consumers are becoming skeptical of digital advertising. “Our consumer research tells us that digital advertising is intrusive and not being trusted,” he said. Consumers “know that television advertising is regulated in some way – both the advertising and the content – but they don’t believe that to be the case in any respect when it comes to digital”.
Yep, great idea.
I believe that regulators are interested in paying more attention to Facebook and Google. I would toss Amazon and Apple into the basket as well.
However, the interest is less about sales and more about tax revenue.
How would a regulatory body go about making a modification to an automated algorithm which reacts to what users do in real time?
Facebook and Google operate in interesting ways; regulatory authorities may not be into the “interesting” thing.
Stephen E Arnold, January 25, 2018
Palantir Awaits a Decision on its US Army Matter
October 28, 2016
More information is available about the Palantir – US Army legal matter. You can find the write up at this link. The decision, according to Bloomberg, may arrive on Monday, October 31, 2016. Palantir awaits its trick or treat day.
Kenny Toth, October 28, 2016
Governance for Big Data. A Sure Fire Winner for Consultants
July 28, 2016
I read “What’s Next for Big Data Analytics?” I didn’t know the answer to this question, and I still don’t. The angle of attack is common sense. Companies with experience in dealing with digital information often have viewpoints different from the marketing collateral produced by their colleagues. This write up seems to fall in the category of Mr. Bush’s request, “Please, clap.”
The idea is that an organization has to have information policies. That sounds like consultant speak. Most organizations struggle to figure out what their company party policies are. Digital data policies are one of those tasks that senior managers allow others to wrestle to the ground and get a tap out.
The write up includes a number of diagrams. I highlighted this one:
The red area is the governance and management thing. Good luck with that. Companies need revenue. Big Data is supposed to deliver. If not, those policies and governance meeting minutes along with the consultants who billed big bucks for them are going to the shredder in my opinion.
Stephen E Arnold, July 28, 2016
MarkLogic: Not Much Information about DI2E on the MarkLogic Web Site
April 11, 2016
Short honk: I have been thinking about MarkLogic in the context of Palantir Technologies. The two companies are sort of pals. Both companies are playing the high stakes game for next generation augmented intelligence systems for the Department of Defense. Palantir’s approach has been to generate revenues from sales to the intelligence community. MarkLogic’s approach has been to ride on the Distributed Common Ground System which is now referenced in some non-Hunter circles as DI2E.
You can get a sense of what MarkLogic makes available by navigating to www.marklogic.com and running a query for DI2E or DCGS.
The Plugfest documents provide a snapshot of the vendors involved as of December 2015 in this project. Here’s a snippet from the unclassified set of slides “Plugfest Industry Day: Plugfest/Mashup 2016.”
What caught my attention is that Palantir, which has its roots in CIA-type thought processes, is in the same “industry partner” illustration as MarkLogic. I noticed that IBM (the DB2 folks) and Oracle (the one-time champion in database technology) are also “partners.”
The only hitch in this “plugfest” partnering deal is Palantir’s quite interesting AtlasDB innovation and the disclosure of data management systems and methods in US 2016/0085817, “System and Method for Investigating Large Amounts of Data,” an invention of the now not-so-secret Hobbits Geoffrey Stowe, Chris Fischer, Paul George, Eli Bingham, and Rosco Hill.
Palantir’s one-two punch is AtlasDB and its data management method. The reason I find this interesting is that MarkLogic is the NoSQL, XML, slice-and-dice advanced technology which some individuals find difficult to use. IBM and Oracle are decidedly old school.
MarkLogic may not publicize its involvement in DCGS/DI2E, but the revenue is important for MarkLogic and the other vendors in the “partnering” diagram. Palantir, however, has been diversifying with, from what I hear, considerable success.
MarkLogic is a Silicon Valley innovator which opened its doors in 2001. Yep, that’s 15 years ago. Palantir Technologies is the newer kid on the block. The company was set up in 2003; that’s 13 years ago. What I find interesting is that MarkLogic’s approach is looking a bit long in the tooth. Palantir’s approach is a bit more current, and its user experience is more friendly than wrestling with XQuery and its extensions.
What happens if Palantir becomes the plumbing for the DCGS/DI2E system? Perhaps IBM or Oracle will have to think about acquiring Palantir. With technology IPOs somewhat rare, Palantir stakeholders may find that thinking the unthinkable is attractive.
What happens if Palantir takes its commercial business into a separate company and then formulates a deal to sell only the high-vitamin augmented intelligence business? MarkLogic may be faced with some difficult choices. Simplifying its data management and query systems may be child’s play compared to figuring out what its future will be if either IBM or Oracle snaps up the quite interesting Palantir technologies, particularly the database and data management systems.
Watch for my for-fee report about Palantir Technologies. There will be a discounted price for law enforcement and intelligence professionals and another price for those not engaged in these two disciplines. Expect the report in early summer 2016. A small segment of the Palantir special report will appear in the forthcoming “Dark Web Notebook”, which I referenced in the Singularity 1 on 1 interview in mid-March 2016. To reserve copies of either of these two new monographs, write benkent2020 at Yahoo dot com.
Stephen E Arnold, April 11, 2016
US Control of Internet Over
March 20, 2016
Short honk: I read “Quietly, Symbolically, US Control of the Internet Was Just Ended.” The write up explains that at a meeting in Morocco, people who run the “Internet’s naming and numbering system” have a plan
to end direct US government oversight control of administering the internet and commit permanently to a slightly mysterious model of global “multi-stakeholderism”.
What’s multi-stakeholderism? I noted the reference to Snowden, but multi-stakeholderism?
Stephen E Arnold, March 20, 2016
Alphabet Spells Fiscal Controls
February 17, 2016
I read “Google’s Alphabet Poaches Intel Veteran Jim Campbell as Its First Controller.” My father was a controller at one time. He told me that he was not the most popular person at budget reviews. Gee, I thought he was lovable year round.
Here’s the passage I highlighted:
When speaking about the Alphabet reorg (particularly to Wall Street), the company’s execs have stressed that its intent was to instill tighter financial discipline around its various projects, particularly those outside of core Google, lumped on the balance sheet as Other Bets.
I like the notion of investments as bets. I wonder if the controller will be able to rein in the gambling losses as Google bets. I would bet on death remaining an unsolvable problem. Loon balloons? Pony up.
Stephen E Arnold, February 17, 2016
Photo Farming in the Early Days
November 9, 2015
Have you ever wondered what your town looked like while it was still rural and used as farmland? Instead of having to visit your local historical society or library (although we do encourage you to do so), the United States Farm Security Administration and Office of War Information (known as FSA-OWI for short) developed Photogrammer. Photogrammer is a Web-based image platform for organizing, viewing, and searching farm photos from 1935-1945.
Photogrammer uses an interactive map of the United States, where users can click on a state and then a city or county within it to see the photos from the timeline. The archive contains over 170,000 photos, but only 90,000 have a geographic classification. They have also been grouped by the photographer who took the photos, although it is limited to fifteen people. Other than city, photographer, year, and month, the collection can be sorted by collection tags and lot numbers (although these are not discussed in much detail).
While farm photographs from 1935-1945 do not appear to need their own photographic database, the collection’s history is interesting:
“In order to build support for and justify government programs, the Historical Section set out to document America, often at her most vulnerable, and the successful administration of relief service. The Farm Security Administration—Office of War Information (FSA-OWI) produced some of the most iconic images of the Great Depression and World War II and included photographers such as Dorothea Lange, Walker Evans, and Arthur Rothstein who shaped the visual culture of the era both in its moment and in American memory. Unit photographers were sent across the country. The negatives were sent to Washington, DC. The growing collection came to be known as “The File.” With the United State’s entry into WWII, the unit moved into the Office of War Information and the collection became known as the FSA-OWI File.”
While the photos do have historical importance, it would be more useful to incorporate the collection into a larger historical archive, like the Library of Congress, than to maintain a separate database with its small flaws as a pet project.
Whitney Grace, November 9, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph