October 22, 2016
I followed a series of links to three articles about IBM Watson. Here are the stories I read:
- Watson’s the Name, Data’s the Game, dated October 10, 2016
- Milestones Along the Way in Watson’s Colorful History, October 10, 2016
- It’s (Not) Elementary: How Watson Works, October 10, 2016
The publication running these three write ups is Computerworld, which I translated as “ComputerWatson.”
Intrigued by the notion of “news,” I learned:
Watson uses some 50 technologies today, tapping artificial-intelligence techniques such as machine learning, deep learning, neural networks, natural language processing, computer vision, speech recognition and sentiment analysis.
But IBM does not like the idea of artificial intelligence even though I have spotted such synonyms in other “real” news write ups; for example, “augmented intelligence.”
There are factoids like “Watson can read more than 800 pages a second.” Figure 125 words per “page” and that works out to 100,000 words per second which is a nice round number. Does Watson perform this magic on a basic laptop? Probably not. What are the bandwidth and storage requirements? Oh, not a peep.
Computerworld—I mean ComputerWatson—provides a complete timeline of the technology too. The future begins in 1997. Imagine that. Boom. Watson wins at chess.
The “history” of Watson is embellished with a fanciful account of how IBM trained Watson via humans assembling information. How much does corpus assembly cost? ComputerWatson (oh, I meant “Computerworld”) does not dive into investment.
To make Watson’s inner workings clear, the “real” news write up provides a link to an IBM video. Here’s an example of the cartoonish presentation:
These three write ups strike me as a public relations exercise. If IBM paid Computerworld to write and run these stories, the three articles are advertising. Who wrote these “news stories”? The byline is Katherine Noyes, who describes herself as “an ardent geek.” Her beat? Enterprise software, cloud computing, big data, analytics, and artificial intelligence.
Remarkable stuff but I had several thoughts:
- Not much “news” is included in the articles. It seems to me that the information has appeared in other writings.
- IBM Watson is working overtime to be recognized as the leader in the smart software game. That’s okay, but IBM seems to be pushing big markets with no easy way to monetize its efforts; for example, education, cancer, and game show participation.
- The Computerworld IBM Watson content party strikes me as eroding the credibility of both outfits.
Oh, I remember. Dave Schubmehl, the fellow who tried to sell on Amazon reports containing my research without my consent, was hooked up with IDG. I have lost track of the wizard, but I do recall the connection. More information is here.
Yep, credibility for possible content marketing and possible presentation of “news” as marketing collateral. Fascinating. Perhaps I should ask Watson: “What’s up?”
Stephen E Arnold, October 22, 2016
October 21, 2016
I love the Gray Lady. The Bits column is chock full of technology items which inspire, excite, and sometimes implant silly ideas in readers’ minds. That’s real journalism.
Navigate to “Daily Report: Explaining Yahoo’s Unexpected Rise in Traffic.”
The write up pivots on the idea that Internet traffic can be monitored in a way that is accurate and makes sense. A click is a click. A packet is a packet. Makes sense. There are the “minor” points of figuring out which clicks are from humans and which clicks are from automated scripts performing some function like probing for soft spots. There are outfits which generate clicks for various reasons including running down a company’s advertising “checkbook.” There are clicks which ask such questions as, “Are you alive?” or “What’s the response time?” You get the idea because you have a bit of doubt about traffic generated by a landing page, a Web site, or even an ad. The counting thing is difficult.
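To make the counting problem concrete, here is a minimal sketch of sorting hits into “automated” and “possibly human” piles. Everything in it is an assumption for illustration: the marker list, the sample user-agent strings, and the function names are invented, and substring matching is far cruder than what a real traffic measurement outfit would use. A script that borrows a browser’s user-agent string sails straight through.

```python
from collections import Counter

# Hypothetical markers chosen for this sketch; real bot detection is much harder.
BOT_MARKERS = ("bot", "crawler", "spider", "curl", "monitor")


def classify(user_agent: str) -> str:
    """Label one hit by its user-agent string using naive substring checks."""
    ua = user_agent.lower()
    if not ua:
        return "unknown"
    if any(marker in ua for marker in BOT_MARKERS):
        return "automated"
    # A script can send a browser-like string, so this is only "possibly" human.
    return "possibly human"


def tally(user_agents):
    """Count hits per label."""
    return Counter(classify(ua) for ua in user_agents)


# Invented sample hits, not real traffic data.
SAMPLE_HITS = [
    "Mozilla/5.0 (Windows NT 10.0) Firefox/49.0",
    "ExampleBot/1.0",
    "curl/7.50.1",
    "",
    "UptimeMonitor/2.3",
    "Mozilla/5.0 (iPhone) Safari/602.1",
]

counts = tally(SAMPLE_HITS)
for label, n in counts.items():
    print(label, n)
```

Even in this toy sample, half the “traffic” is self-declared automation, and nothing guarantees the rest is people.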
The write up in the Gray Lady assumes that these “minor” points are irrelevant in the Yahoo scheme of things; for example:
an increased number of people were drawn to Yahoo in September. The reason may have been Yahoo’s disclosure that month that hackers stole data on 500 million users in 2014.
“People”? How do we know that the traffic is people?
The Gray Lady states:
Yahoo’s traffic has been declining for a long time, overtaken by more adept, varied and apparently secure places to stay on the internet.
Let’s think about this. We don’t know if the traffic data are counting humans, software scripts, or utility functions. We do know that Yahoo has been on a glide path to a green field without rocks and ruts. We know that Yahoo is a bit of a hoot in terms of management.
My hunch is that Yahoo’s traffic is pretty much what it has been; that is, oscillating a bit but heading in for a landing, either hard or soft.
Suggesting that Yahoo may be growing is interesting but unfounded. That traffic stuff is mushy. What’s the traffic to the New York Times’s pay walled subsite? How does the Times know that a click is a human from a “partner” and not a third party scraping content?
And maybe the traffic spike is a result of disenchanted Yahoo users logging in to change their password or cancel their accounts.
Stephen E Arnold, October 21, 2016
October 11, 2016
Publishers are not happy. Sci-Hub, a Dark Web portal, provides free access to 58 million academic papers and articles that usually are sold through costly subscriptions and pay walls in the real world.
In an article that appeared on ExpressVPN titled “9 Must-See .onion Sites from the Depths of the Dark Web,” the author says:
This (Sci-Hub) gives underfunded scientific institutions, as well as individuals, unprecedented access to the world’s collective knowledge, something certain to boost humankind’s search for an end to diseases, droughts, and hunger.
Sci-Hub is the brainchild of Alexandra Elbakyan, a Kazakh girl who wanted free access to academic literature without having to worry about money.
According to Science Magazine, everybody from students, scholars, and researchers to underfunded universities is accessing the pirated academic literature.
How will publishers respond? We assume there will be meetings, legal actions, more meetings, hand waving, and attempts to convince Ms. Elbakyan to do her online system the old fashioned way: Charge universities as much as humanly possible. If these procedures fail, Ms. Elbakyan may want to be accompanied by former Kazakh Olympic wrestlers and at least one legal eagle as she wends her way through life.
Vishal Ingole, October 11, 2016
September 29, 2016
I like the idea of researching technology and companies. I like to know something about the founders, but I am not too interested in their hobbies, the name of their dog, or how they spend their vacation days.
I read “MuckRock & Vice Announce Fellowship to Investigate Peter Thiel.” If the write up is accurate, which, for the purposes of this blog post, is the operative assumption, I have a question: “Will this effort backfire?”
I understand that law enforcement and certain government agencies need to develop profiles and bubble gum cards about people of interest. When a person runs for a political office, journalists like to dig into the candidate’s past. But a lawyer and entrepreneur? Interesting.
The write up informed me:
I’m [author of the article cited above] not so sure how much Thiel-related info is really FOIA-able, this may put to the test Thiel’s stated claim that he wasn’t against journalism that made him look bad, in funding lawyer Charles Harder to sue Gawker into oblivion, but rather to “send a message” about protecting privacy. Of course, when you try to silence the press, there’s always a chance that the press decides to turn an even bigger spotlight on you.
Fascinating maneuver by MuckRock and Vice. I wonder if these outfits understand how tools like Palantir Gotham work, the tools’ capabilities, and the unintended consequences of collecting information about one of the beloved professionals involved in PayPal?
Worth monitoring from afar. Those lucky fellowship winners may learn quite a bit from the exercise. Did I mention that I wanted to monitor the trajectory of this “real news” adventure from afar? Really afar.
Stephen E Arnold, September 29, 2016
September 22, 2016
I know that online facilitates many functions. One can look up information. One can make up information and disseminate it so the information becomes “accurate.” One can take money and combine many functions in one glorious paean to academic integrity and scholarly research.
Consider fat and sugar. The Harvard crowd prefers kale and spring water, but for the moment consider these two essential components of many university commissaries.
Why am I linking online and the complements of chocolate and salt? The answer is my reaction to “Sugar Industry Secretly Paid for Favorable Harvard Research.” For the moment, let’s assume that this article is spot on. Hey, if something is online, that something is accurate, factual, and dead right. Well, that is what Jasper and Olli, along with the rest of the Beyond Search barn yard crew, believe.
The write up informed me:
As nutrition debates raged in the 1960s, prominent Harvard nutritionists published two reviews in a top medical journal downplaying the role of sugar in coronary heart disease. Newly unearthed documents reveal what they didn’t say: A sugar industry trade group initiated and paid for the studies, examined drafts, and laid out a clear objective to protect sugar’s reputation in the public eye.
Hmmm. “Paid for” means content marketing. Search engine optimization undermines precision and recall. The mobile crowd is not into either of these yardsticks. But folks who like Twinkies can relate to sugar and fat. Now it seems that fat may be slightly less problematic for the waistline than sugar.
The write up told me:
Kearns [an expert who found the pay for brains’ play] said the papers, which the trade group later cited in pamphlets provided to policymakers, aided the industry’s plan to increase sugar’s market share by convincing Americans to eat a low-fat diet.
Yep, death. I suppose that if a few people die because of flawed research data, that’s okay. Harvard has many initiatives to help those who have issues. However, Harvard does like to take care of itself and its available cash and assorted reserves.
Another maven’s comment received the fatty yellow highlight for this passage:
Marion Nestle, a nutrition expert at New York University who was not involved in the paper, said she’s still not convinced by those who argue that “sugar is poison” — a person’s total calorie consumption could matter more. But she called the UCSF findings a “smoking gun” — rare, hard evidence of the food industry meddling in science. “Science is not supposed to work this way,” she wrote in an accompanying commentary. “Is it really true that food companies deliberately set out to manipulate research in their favor? Yes, it is, and the practice continues,” Nestle added, noting that Coca-Cola and candy makers have both tried recently to influence nutrition research.
I am confident that Harvard can explain its venturing into the esteemed field of content marketing. I love that Harvard athletic program too. But I am even more fond of Harvard research than I was before learning about pay to play. I need a kale sandwich and a bottle of spring water. 23 skidoo to integrity in academe.
Stephen E Arnold, September 22, 2016
September 15, 2016
I read “Europe Announces That All Scientific Papers Should Be Free by 2020.” Sounds exciting for the major STEM publishers operating in the business-friendly European Community. The problem is the “should be”. Quite parental.
The write up calls attention to an outfit called the Competitiveness Council. The idea is that STEM content, funded by the government I presume, should be free, but one never knows in EC land.
I noted this passage:
Indeed, at the present time, the council has provided scarce details related to how countries can expect to make the full transition to open access and meet the deadline, which is less than four years away. That’s not too surprising, as the announcement was only just made a few days ago. But given the current state of scientific literacy, and the sad state of science communication, well, the sciences need all the help they can get. So this commitment is almost unanimously welcomed by those working in STEM and associated careers.
Google may want to commiserate with some of the publishers likely to be impinged upon if this adult idea becomes a reality. No, hold that thought. Some European publishers don’t like Google and would use a meet up with the Googlers as an opportunity to talk about getting compensated for their content.
Stephen E Arnold, September 15, 2016
September 9, 2016
I am not too sure about the information in some British newspapers. Nevertheless, I find some of the stories amusing. A good example of an online frolic is a write up designed to suck in clicks and output blogger and podcast commentaries. Case in point: Beyond Search just helped out the Daily Mail’s traffic. Wikipedia, another always spot-on source of information, points to a statement about the newspaper’s “institutional racism.”
The headline which caught my attention was “Hacking Fears over Clinton server: FBI reveal Hillary Was Sent ‘Phishing’ Email with Porn Links and ‘Dark Web Browser’ Was Used to Access Another Account.” I am frightened I guess.
The write up asserts:
An unknown individual used an anonymous web browsing tool often used to access the dark web to get into an email account on the Clinton family server, the FBI revealed [on September 2, 2016].
The Daily Mail explains the bad stuff about the Dark Web. Then there is a leap:
In another incident that raised hacking fears, Clinton received a phishing email, purportedly sent from the personal email account of a State official. She responded to the email: ‘Is this really from you? I was worried about opening it!’.
And for a third cartwheel, the estimable newspaper stated:
In a separate incident, Abedin sent an email to an unidentified person saying that Clinton was worried ‘someone [was] hacking into her email’. She had apparently received an email from a known associate ‘containing a link to a website with pornographic material’ at the time, but there is no additional information as to why she would believe she had been hacked.
Fascinating. I did not see anyone in the pictures accompanying the write up wearing a baseball cap with the phrase:
Make journalism great again.
Everything I read online is accurate. Plus, I believe absolutely everything I read on my computing device’s screen. We try to remain informed about online here in rural Kentucky.
Stephen E Arnold, September 9, 2016
August 29, 2016
A scoop maybe. Navigate to “98 Personal Data Points That Facebook Uses to Target Ads to You.” The list-tickle becomes news because real newspapers report real news. For the full list, visit the estimable Washington Bezos. Sorry, Washington Post.
Here are some signals I found amusing:
- How much money user is likely to spend on next car. Doesn’t that depend on fashion, the deal, or what my spouse wants to drive?
- Users who have created a Facebook event. I don’t know what a Facebook “event” is.
- Users who investor (divided by investment type). For a real journalism outfit, I am puzzled by the phrase “who investor”.
- Types of clothing user’s household buys. Another grammatical gem.
- Users who are “heavy” buyers of beer, wine or spirits. I assume “heavy” means obese. Perhaps I am incorrect.
- Users who are interested in the Olympics, fall football, cricket or Ramadan. What about other sports like Ramadan?
All in all, a fine list. An ever more better finest scrumptious article from a real journalistic outfit, the Washington Bezos. Darn, there I go again. I mean the Washington Post.
Stephen E Arnold, August 29, 2016
July 9, 2016
I read another of those digitally informed grousing write ups from the London Guardian newspaper. This essay, which is not what I would call news from my vantage point in Harrod’s Creek, is titled “Few News Providers Will Now Be Liking Facebook.” I thought the title I came up with was more accurate; to wit: Few print centric news providers will be liking Facebook. But, hey, I live in rural Kentucky where print means the replacement for cursive. I noted this passage:
In her recent Humanitas lecture at Cambridge, for example, Columbia University’s Emily Bell pointed out that, for the first time in history, major news organizations had lost control of how their content was distributed. And George Brock, of City University, spotted that in becoming a major distributor of journalistic content, Facebook was implicitly acquiring editorial responsibilities, responsibilities that it neither acknowledged nor welcomed. But to desperate editors, faced with declining circulations and ad revenues, these seemed like theoretical considerations: however much they might dislike or fear Facebook, they had to deal with it because it was where their audiences were increasingly to be found.
Okay, Facebook with its billion plus users is more powerful than real “journalism” outfits. I would wager that Facebook is not likely to toss out its publishing system and embrace MarkLogic type technology either. How is that slicing and dicing working out?
I highlighted in red ink these sentences as well:
Social media are powerful engines for creating digital echo chambers, which is one reason why our politics is becoming so partisan. Brexiters speak only unto Brexiters. And Remainers ditto… We all inhabit echo chambers now and all Facebook has done is to increase the level of insulation on those inhabited by its users.
I think the Guardian missed the TED talk about “filter bubbles” and discovered the notion of an echo chamber itself.
My thought is that the flow of online data has washed away the foundations of the traditional approach to print on paper publishing. The white shoes are wet and muddy. The arbiters of taste and thought now have to recognize Facebook as the big dog.
Since the digital revolution is decades old now, I am delighted that real journalists are realizing that the clay tablets of yore are losing favor among some folks. You know. The young folks who do the mobile phone thing for affection, acceptance, and news.
Stephen E Arnold, July 9, 2016
June 28, 2016
Different sources suggest varying levels of malicious activity on Tor. Tech Insider shared an article responding to recent claims about Tor made by CloudFlare. The article, entitled “Google Search has a secret feature that shouts animal noises at you,” offers information about CloudFlare’s perspective and that of the Tor Project. CloudFlare reports that most requests from Tor, 94 percent, are “malicious,” and the Tor Project has responded by requesting evidence to justify the claim. Those involved in the Tor Project have a hunch that the 94 percent figure stems from CloudFlare attributing the label of “malicious” to any IP address that has ever sent spam. The article continues:
“We’re interested in hearing CloudFlare’s explanation of how they arrived at the 94% figure and why they choose to block so much legitimate Tor traffic. While we wait to hear from CloudFlare, here’s what we know: 1) CloudFlare uses an IP reputation system to assign scores to IP addresses that generate malicious traffic. In their blog post, they mentioned obtaining data from Project Honey Pot, in addition to their own systems. Project Honey Pot has an IP reputation system that causes IP addresses to be labeled as “malicious” if they ever send spam to a select set of diagnostic machines that are not normally in use. CloudFlare has not described the nature of the IP reputation systems they use in any detail.”
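The Tor Project’s hunch is easy to illustrate with a toy calculation. The sketch below is not CloudFlare’s or Project Honey Pot’s method, which the quoted passage says has not been described in detail. It simply assumes a rule of “an address is malicious if it ever sent spam” and compares the share of requests that rule flags with the share of requests that were actually spam. The addresses come from the reserved documentation range, and every count is invented.

```python
from dataclasses import dataclass


@dataclass
class ExitAddress:
    ip: str              # documentation-range address, not a real Tor exit
    total_requests: int  # invented count of all requests seen from this address
    spam_requests: int   # invented count of requests that were spam


def share_flagged_by_ip_history(addresses):
    """Share of requests flagged when an address is 'malicious' if it EVER sent spam."""
    total = sum(a.total_requests for a in addresses)
    flagged = sum(a.total_requests for a in addresses if a.spam_requests > 0)
    return flagged / total


def share_actually_spam(addresses):
    """Share of requests that were themselves spam."""
    total = sum(a.total_requests for a in addresses)
    spam = sum(a.spam_requests for a in addresses)
    return spam / total


# Toy numbers for illustration only; they are not measurements of Tor traffic.
TOY = [
    ExitAddress("192.0.2.1", 1000, 10),
    ExitAddress("192.0.2.2", 1000, 1),
    ExitAddress("192.0.2.3", 1000, 0),
    ExitAddress("192.0.2.4", 1000, 5),
]

print(f"flagged by address history: {share_flagged_by_ip_history(TOY):.1%}")
print(f"actually spam:              {share_actually_spam(TOY):.1%}")
```

Because many people share one exit address, a single spam message can tar every later request from that address. In the toy data, three quarters of requests end up flagged while well under one percent were spam. Whether anything like this explains the 94 percent figure is exactly what the Tor Project asked CloudFlare to show.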
This article raises some interesting points, but it also alludes to more universal problems with making sense of any information published online. Building an epistemology of technology, as in many areas of study, is like chasing a moving target. Knowledge about technology is complicated by the relationship between technology and information dissemination. The important questions are: What does one know about Tor, and how does one know it?
Megan Feil, June 28, 2016