A Snapshot of American Innovation Today
May 23, 2016
Who exactly are today’s innovators? The Information Technology & Innovation Foundation (ITIF) performed a survey to find out and shares a summary of its results in “The Demographics of Innovation in the United States.” The write-up sets the context before getting into the findings:
“Behind every technological innovation is an individual or a team of individuals responsible for the hard scientific or engineering work. And behind each of them is an education and a set of experiences that impart the requisite knowledge, expertise, and opportunity. These scientists and engineers drive technological progress by creating innovative new products and services that raise incomes and improve quality of life for everyone….
“This study surveys people who are responsible for some of the most important innovations in America. These include people who have won national awards for their inventions, people who have filed for international, triadic patents for their innovative ideas in three technology areas (information technology, life sciences, and materials sciences), and innovators who have filed triadic patents for large advanced-technology companies. In total, 6,418 innovators were contacted for this report, and 923 provided viable responses. This diverse, yet focused sampling approach enables a broad, yet nuanced examination of individuals driving innovation in the United States.”
See the summary for results, including a helpful graphic. Here are some highlights: Unsurprisingly to anyone who has been paying attention, women and U.S.-born minorities are woefully underrepresented. Many of those surveyed are immigrants. The majority of survey-takers have at least one advanced degree (many from MIT), and nearly all majored in a STEM subject as undergrads. Large companies contribute more than small businesses do, while innovations are clustered in California, the Northeast, and close to sources of public research funding. And take heart, anyone over 30, for despite the popular image of 20-somethings reinventing the world, the median age of those surveyed is 47.
The piece concludes with some recommendations: We should encourage both women and minorities to study STEM subjects from elementary school on, especially in disadvantaged neighborhoods. We should also lend more support to talented immigrants who wish to stay in the U.S. after they attend college here. The researchers conclude that, with targeted action from the government on education, funding, technology transfer, and immigration policy, our nation can tap into a much wider pool of innovation.
Cynthia Murrell, May 23, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
A Romp through Google History
May 22, 2016
If you are unsure of Google’s history with the folks who allege the Alphabet Google thing fiddles search results, you will want to read and save “Why Google’s Monopoly Abuse Case in Europe Will Run and Run.” The main point of the write up is that legal processes can drag. I came away from the write up with this thought, “Lawyers involved in the legal issues will have quite a bit of work.”
The challenge the regulators in Europe have is that Google has become the go to solution for many online activities. As with Facebook and Amazon, the online service seems to operate like a monopoly. Users like the predictability of having a familiar way to perform certain tasks.
As the management of Alphabet Google changes, the drift toward pervasive functions continues. Individuals may not be aware of how incremental decisions impact other organizations.
The write up points out:
critics have argued that the corporate rejig and change of leadership potentially gives the search titan plausible deniability: if regulators conclude that Google has harmed competition, then Pichai could say this hadn’t happened on his watch.
Interesting. The problem in my opinion is that the Google has been rolling down the same highway for more than a decade. I am not sure when a rest stop or breakdown will occur. How have Qwant and Quaero fared?
Stephen E Arnold, May 21, 2016
Facebook Biased? Social Media Are Objective, Correct?
May 21, 2016
I am shattered. Imagine. Facebook delivering information services which are subjective. Facebook is social media at its finest. There are humans who “vote” or “like” something. That’s crowdsourcing. A person has built a career on the wisdom of crowds.
My illusionary social world crumbled before the information presented in “The Real Bias Built In at Facebook.” The author is an academic and opinion writer. I am confident that the students of Zeynep Tufekci are able to navigate the reefs and shoals of social media because the truth is that algorithms are set up by humans who are, as some people believe, biased. (Note that you may have to pay money to read the write up by the compensated opinion writer Zeynep Tufekci. There is no bias in this approach to information. Heck, there is no bias at the New York Times, correct?)
What is Facebook doing? Here’s a passage I circled in stunned scarlet:
On Facebook the goal is to maximize your engagement with the site and keep it ad friendly.
This suggests that algorithms are set up to deliver these payoffs to Facebook. It follows that algorithms which do not deliver the required outcome are changed by programmers who:
- Either tune or structure the numerical processes to bring home the bacon
- Engage in ongoing tinkering until the suite of algorithms pumps out the likes, the clicks, and the revenue.
My thought is that chatter about “algorithms” is a bit trendy, just like the railroad cars stuffed with baloney explaining how artificial intelligence is the now now now big thing. Big Data, it seems, has fallen to second place in the marketing marathon.
I prefer to believe that Facebook, Google, and the other combines are really trying to be objective. When someone suggests that Google results are not in line with my query or that my deceased dog’s Facebook page displays a stream of relevant information, there is no bias.
My world is a happier place. I like searching for a restaurant when I am standing in front of it. When I look for that restaurant on my smartphone, the restaurant does not appear.
That’s objectivity in action. I know I don’t need to know where the restaurant is. I am in front of it. That’s objectivity in action.
Stephen E Arnold, May 21, 2016
Search Sink Hole Identified and Allegedly Paved and Converted to a Data Convenience Store
May 20, 2016
I try to avoid reading more than one write up a day about alleged revolutions in content processing and information analytics. My addled goose brain cannot cope with the endlessly recycled algorithms dressed up in Project Runway finery.
I read “Ryft: Bringing High Performance Analytics to Every Enterprise,” and I was pleased to see a couple of statements which resonated with my dim view of information access systems. There is an accompanying video in the write up. I, as you may know, gentle reader, am not into video. I prefer reading, which is the old fashioned way to suck up useful factoids.
Here’s the first passage I highlighted:
Any search tool can match an exact query to structured data—but only after all of the data is indexed. What happens when there are variations? What if the data is unstructured and there’s no time for indexing? [Emphasis added]
The answer to the question is increasing costs for sales and marketing. The early warnings of amped up baloney are the presentations given at conferences and pumped out via public relations firms. (No, Buffy, no, Trent, I am not interested in speaking with the visionary CEO who hired you.)
I also highlighted:
With the power to complete fuzzy search 600X faster at scale, Ryft has opened up tremendous new possibilities for data-driven advances in every industry.
I circled the 600X. Gentle reader, I struggle to comprehend a 600X increase in content processing. Dear Mother Google has invested to create a new chip to get around the limitations of our friend Von Neumann’s approach to executing instructions. I am not sure Mother Google has this nailed because Mother Google, like IBM, announces innovations without too much real world demonstration of the nifty “new” things.
I noted this statement too:
For the first time, you can conduct the most accurate fuzzy search and matching at the same speed as exact search without spending days or weeks indexing data.
Okay, this strikes me as a capability I would embrace if I could get over or around my skepticism. I was able to take a look at the “solution” which delivers the astounding performance and information access capability. Here’s an image from Ryft’s engineering professionals:
Notice that we have Spark and pre built components. I assume there are myriad other innovations at work.
The hitch in the git along is that in order to deal with certain real world information processing challenges, the inputs come from disparate systems, each generating substantial data flows in real time.
Here’s an example of a real world information access and understanding challenge, which, as far as I know, has not been solved in a cost effective, reliable, or usable manner.
Image source: Plugfest 2016 Unclassified.
This unclassified illustration makes clear that the little things in the sky pump out lots of data into operational theaters. Each stream of data must be normalized and then converted to actionable intelligence.
The assertion about 600X sounds tempting, but my hunch is that the latency in normalizing, transferring, and processing will not meet the need for real time, actionable, accurate outputs when someone is shooting at a person with a hardened laptop in a threat environment.
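For the gentle reader who wants the exact versus fuzzy distinction in the quoted passages made concrete, here is a minimal Python sketch. It says nothing about how Ryft’s system works. It is a brute force scan with no index, built on the standard library’s difflib; the function names, the sample records, and the 0.8 similarity threshold are my own assumptions for illustration.

```python
from difflib import SequenceMatcher


def exact_search(records, query):
    """Return records that contain the query string verbatim."""
    return [record for record in records if query in record]


def fuzzy_search(records, query, threshold=0.8):
    """Return records containing a word that is merely similar to the query.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    wanted = query.lower()
    hits = []
    for record in records:
        for word in record.lower().split():
            # ratio() is 1.0 for identical strings and falls toward 0.0
            # as the two strings share fewer matching characters.
            if SequenceMatcher(None, wanted, word).ratio() >= threshold:
                hits.append(record)
                break
    return hits


# Hypothetical records; the second one contains a misspelling.
records = [
    "Shipment routed through Louisville",
    "Shipment routed through Louisvile",
    "Shipment routed through Lexington",
]

exact_hits = exact_search(records, "Louisville")
fuzzy_hits = fuzzy_search(records, "Louisville")
```

The exact search misses the misspelled record; the fuzzy search catches it. The price is that every word in every record is compared on every query, which is the work an index normally avoids.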
In short, perhaps the spark will ignite a fire of performance. But I have my doubts. Hey, that’s why I spend my time in rural Kentucky where reasonable people shoot squirrels with high power surplus military equipment.
Stephen E Arnold, May 20, 2016
CBS Jargon Meistering
May 20, 2016
I don’t pay much, if any, attention to the antics of network television giants. I noted this headline “CBS Chief Leslie Moonves Takes Aim at Competitors Dubious Ratings Claims,” and I read the article. Perhaps the CBS top dog was referring to outfits like Yahoo?
I highlighted these words and phrases as “interesting.”
- out-of-context data points
- scatter market
- out-of-the-box swing
- stock-in-trade brand.
I am uncertain of the meaning of these phrases, but I understood this statement:
“We see money coming back to network, not that it ever left.” But when it comes to digital, he added, “The bloom is off the rose.”
Ah, a reference to Robert Burns. That I understood. I also understand bologna.
Stephen E Arnold, May 20, 2016
The Kardashians Rank Higher Than Yahoo
May 20, 2016
I avoid the Kardashians and other fame chasers, because I have better things to do with my time. I never figured that I would actually write about the Kardashians, but the phrase “never say never” comes into play. Vanity Fair’s “Marissa Mayer Vs. ‘Kim Kardashian’s Ass’: What Sunk Yahoo’s Media Ambitions?” tells a bleak story about the current happenings at Yahoo.
Yahoo has ended many of its services, let go fifteen percent of staff, and there are very few journalists left on the team. The remaining journalists are not worried about producing golden content; they have to compete with a lot already on the Web, especially “Kim Kardashian’s ass” as they say.
When Marissa Mayer took over Yahoo as the CEO in 2012, she was determined to carve out Yahoo’s identity as a tech company. Mayer, however, also wanted Yahoo to be a media powerhouse, so she hired many well-known journalists to run specific niche projects in popular areas from finance to beauty to politics. It was not a successful move and now Yahoo is tightening its belt one more time. The Yahoo news algorithm did not mesh with the big name journalists. The hope was that their names would soar above popular content such as Kim Kardashian’s ass. They did not.
Much of Yahoo’s current worth comes from its Alibaba stake. The result is:
“But the irony is that Mayer, a self-professed geek from Silicon Valley, threw so much of her reputation behind high-profile media figures and went with her gut, just like a 1980s magazine editor—when even magazine editors, including those who don’t profess to “get” technology, have long abandoned that practice themselves, in favor of what the geeks in Silicon Valley are doing.”
Mayer was trying to create a premier media company, but lower quality content is more popular than the work of top of the line journalists. The masses prefer junk food in their news.
Whitney Grace, May 20, 2016
The Guardian Adheres to Principles
May 20, 2016
In the 1930s, Britain’s newspaper the Guardian, founded in 1821, was placed in a trust, through a generous family’s endowment, built on the ideas of an unfettered press and free access to information. In continued pursuit of these goals, the publication has maintained a paywall-free online presence, despite declining online-advertising revenue. That choice has cost it, we learn from the piece “Guardian Bet Shows Digital Risks” at USA Today. Writer Michael Wolff explains:
“In order to underwrite the costs of this transformation, most of the trust’s income-producing investments have been liquidated in recent years in order to keep cash on hand — more than a billion dollars.
“In effect, the Guardian saw itself as departing the newspaper business and competing with new digital news providers like BuzzFeed and Vox and Vice Media, each raising ever-more capital from investors with which to finance their growth. The Guardian — unlike most other newspapers that are struggling to make it in the digital world without benefit of access to outside capital — could use the interest generated by its massive trust to indefinitely deficit-finance its growth. At a mere 4% return, that would mean it could lose more than $40 million a year and be no worse for wear.
“But … the cost of digital growth mounted as digital advertising revenue declined. And with zero interest rates, there has been, practically speaking, no return on cash. Hence, the Guardian’s never-run-out endowment has plunged by more than 12% since the summer and, suddenly looking at a finite life cycle, the Guardian will now have to implement another transition: shrinking rather than expanding.”
The Guardian’s troubles point to a larger issue, writes Wolff: no one has been able to figure out a sustainable business model for digital news. For its part, the Guardian still plans to avoid a paywall, but will try to coax assorted fees from its users. We shall see how that works out.
Cynthia Murrell, May 20, 2016
Big Data and Value
May 19, 2016
I read “The Real Lesson for Data Science That is Demonstrated by Palantir’s Struggles · Simply Statistics.” I love write ups that plunk the word statistics near simple.
Here’s the passage I highlighted in money green:
… What is the value of data analysis?, and secondarily, how do you communicate that value?
I want to step away from the Palantir Technologies’ example and consider a broader spectrum of outfits tossing around the jargon “big data,” “analytics,” and synonyms for smart software. One doesn’t communicate value. One finds a person who needs a solution and crafts the message to close the deal.
When a company and its perceived technology catch the attention of allegedly informed buyers, a bandwagon effect kicks in. Talks inside an organization lead to mentions in internal meetings. The vendor whose products and services are the subject of these comments begins to hint at bigger and better things at conferences. Then a real journalist may catch a scent of “something happening” and write an article. Technical talks at niche conferences generate wonky articles usually without dates or footnotes which make sense to someone without access to commercial databases. If a social media breeze whips up the smoldering interest, then a fire breaks out.
A start up should be so clever, lucky, or tactically gifted to pull off this type of wildfire. But when it happens, big money chases the outfit. Once money flows, the company and its products and services become real.
The problem with companies processing a range of data is that there are some friction inducing processes that are tough to coat with Teflon. These include:
- Taking different types of data, normalizing it, indexing it in a meaningful manner, and creating metadata which is accurate and timely
- Converting numerical recipes, many with built in threshold settings and chains of calculations, into marching band order able to produce recognizable outputs.
- Figuring out how to provide an infrastructure that can sort of keep pace with the flows of new data and the updates/corrections to the already processed data.
- Generating outputs that people in a hurry or in a hot zone can use to positive effect; for example, in a war zone, not get killed when the visualization is not spot on.
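The first item on that list can be sketched in a few lines of Python. This is a toy under stated assumptions: the two feeds, their field names, and the helper functions are hypothetical, and real world inputs are far messier than two tidy dictionaries.

```python
from collections import defaultdict
from datetime import datetime, timezone


def normalize(raw, source):
    """Map a source-specific record onto one common shape with metadata."""
    if source == "feed_a":
        # Hypothetical feed A: epoch seconds and a "body" field.
        stamp = datetime.fromtimestamp(raw["epoch"], tz=timezone.utc)
        text = raw["body"]
    elif source == "feed_b":
        # Hypothetical feed B: ISO 8601 string with a UTC offset and a "message" field.
        stamp = datetime.fromisoformat(raw["when"]).astimezone(timezone.utc)
        text = raw["message"]
    else:
        raise ValueError(f"unknown source: {source}")
    return {"source": source, "timestamp": stamp.isoformat(), "text": text.strip()}


def build_index(docs):
    """Build a tiny inverted index: lowercase token -> set of document positions."""
    index = defaultdict(set)
    for position, doc in enumerate(docs):
        for token in doc["text"].lower().split():
            index[token].add(position)
    return index


# Two made-up records describing the same moment in different formats.
docs = [
    normalize({"epoch": 1463659200, "body": " Convoy delayed at checkpoint "}, "feed_a"),
    normalize({"when": "2016-05-19T08:00:00-04:00", "message": "Checkpoint reopened"}, "feed_b"),
]
index = build_index(docs)
```

Even in this toy, each new source needs its own branch, and every format change upstream breaks something. Multiply that by dozens of feeds and the friction becomes clear.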
The write up focuses on a single company and its alleged problems. That’s okay, but it understates the problem. Most content processing companies run out of revenue steam. The reason is that the licensees or customers want the systems to work better, faster, and more cheaply than predecessor or incumbent systems.
The vast majority of search and content processing systems are flawed, expensive to set up and maintain, and really difficult to use in a way that produces high reliability outputs over time. I would suggest that the problem bedevils a number of companies.
Some of those struggling with these issues are big names. Others are much smaller firms. What’s interesting to me is that the trajectory content processing companies follow is a well worn path. One can read about Autonomy, Convera, Endeca, Fast Search & Transfer, Verity, and dozens of other outfits and discern what’s going to happen. Here’s a summary for those who don’t want to work through the case studies on my Xenky intel site:
Stage 1: Early struggles and wild and crazy efforts to get big name clients
Stage 2: Making promises that are difficult to implement but which are essential to capture customers looking actively for a silver bullet
Stage 3: Frantic building and deployment accompanied by heroic exertions to keep the customers happy
Stage 4: Closing as many deals as possible either for additional financing or for licensing/consulting deals
Stage 5: The early customers start grousing and the momentum slows
Stage 6: Sell off the company or shut down like Delphes, Entopia, Siderean Software and dozens of others.
The problem is not technology, math, or Big Data. The force which undermines these types of outfits is the difficulty of making sense out of words and numbers. In my experience, the task is a very difficult one for humans and for software. Humans want to golf, cruise Facebook, emulate Amazon Echo, or, like water, find the path of least resistance.
Making sense out of information when someone is lobbing mortars at one is a problem which technology can only solve in a haphazard manner. Hope springs eternal and managers are known to buy or license a solution in the hopes that my view of the content processing world is dead wrong.
So far I am on the beam. Content processing requires time, humans, and a range of flawed tools which must be used by a person with old fashioned human thought processes and procedures.
Value is in the eye of the beholder, not in zeros and ones.
Stephen E Arnold, May 19, 2016
Facebook: The Telegraph Newspaper Thinks You Have Social Media by the Throat
May 19, 2016
I read “Sorry Google, We Just Don’t Want to Be Friends with You.” What struck me is that Facebook, which is not the focal point of the write up, is one beneficiary of this write up. The centerpiece of the article is a series of comments about Google’s social media belly flops: Wave, Buzz, Google +. The article omitted the wonderful Orkut service, which was embraced by certain users in Brazil.
Google is not the sole company to be dumped in a bourbon barrel filled with sour mash. Apple and even a nod to Facebook’s slops make the focus on Google less eye watering.
But the winner is Facebook. The Telegraph states:
Facebook has neutralized threats such as Instagram and WhatsApp by buying them before they really surface.
Okay, Facebook is the big dog. The write up makes one last snap at the GOOG:
Perhaps it should just accept that we don’t want to be friends with Google.
I am not sure I want to be friends with the Telegraph.
Stephen E Arnold, May 19, 2016
Signs of Life from Funnelback
May 19, 2016
Funnelback has been silent as of late, according to our research, but the search company has emerged from the tomb with eyes wide open and a heartbeat. The Funnelback blog has shared some new updates with us. The first bit of news, “Searchless In Seattle? (AKA We’ve Just Opened A New Office!),” explains that Funnelback opened a new office in Seattle, Washington. The search company already has offices in Poland, the United Kingdom, and New Zealand, but now it wants to establish a branch in the United States. Given its successful track record with the finance, higher education, and government sectors in the other countries, it stands a chance to offer more competition in the US. Seattle also has a reputable technology center, and Funnelback will not have to deal with the Silicon Valley group.
The second piece of Funnelback news deals with “Driving Channel Shift With Site Search.” Channel shift is the process of creating the most efficient and cost effective way to deliver information access and usage to users. It can be difficult to implement a channel shift, but increasing the effectiveness of a Web site’s search can have a huge impact.
Being able to locate information on a Web site quickly and effectively not only saves time, it can also drive sales, further a reputation, etc.
“You can go further still, using your search solution to provide targeted experiences; outputting results on maps, searching by postcode, allowing for short-listing and comparison baskets and even dynamically serving content related to what you know of a visitor, up-weighting content that is most relevant to them based on their browsing history or registered account.
“Couple any of the features above with some intelligent search analytics, that highlight the content your users are finding and importantly what they aren’t finding (allowing you to make the relevant connections through promoted results, metadata tweaking or synonyms), and your online experience is starting to become a lot more appealing to users than that queue on hold at your call centre.”
I have written about it many times, but a decent Web site search function can make or break a site. Not only does a poor one demonstrate that the Web site is not professional, it also fails to inspire confidence in a business. It is a very big rookie mistake to make.
Whitney Grace, May 19, 2016