Innovation Is Not Reheated Pizza. Citation Analysis Is Still Fresh Pizza.

April 22, 2016

Do you remember Eugene Garfield? He was the go-to person in the field of citation analysis. His work on identifying who cited which journal article picked up old school jargon like bibliometrics. Dr. Garfield founded the Institute for Scientific Information. He sold ISI to Thomson (now Thomson Reuters) in 1992. I mention this because this write up explains an “innovation” which strikes me as recycled Garfield.

Navigate to “Who’s Hot in Academia? Semantic Scholar Dives More Deeply into the Data.” The write up explains:

If you’re in the “publish-or-perish” game, get ready to find out how you score in acceleration and velocity. Get ready to find out who influences your work, and whom you influence, all with the click of a mouse. “We give you the tools to slice and dice to figure out what you want,” said Oren Etzioni, CEO of the Allen Institute for AI, a.k.a. AI2.

My recollection is that there were a number of information professionals who could provide these types of data to me decades ago. Let’s see if I can recall some of the folks who could wrangle these types of outputs from the pre-Cambridge Scientific Abstracts version of Dialog:

  • Marydee Ojala, former information wrangler at the Bank of America and now editor of Online
  • Barbara Quint, founder of Searcher and a darned good online expert
  • Reva Basch, who lived a short distance from me in Berkeley, California, when I did my time in Sillycon Valley
  • Ann Mintz, former information wrangler at Forbes before the content marketing kicked in
  • Ruth Pagell, once at the Wharton Library and then head of the business library at Emory University.

And there were others.

The system described in the write up makes certain types of queries easier. That’s great, but it is hardly the breathless revolution which caught the attention of the article.

In my experience, it takes a sharp online specialist to ask the correct question and then determine if the outputs are on the money. Easier does not translate directly into accurate outputs. Is the set of journals representative for a particular field; for example, thorium reactor technology? What about patent documents? What about those crazy PDF versions of pre-publication research?

I know my viewpoint shocks the mobile device generation. Try to look beyond software that does the thinking for the user. Ignoring who did what, how, when, and why puts some folks in a disadvantaged viewshed. (Don’t recognize the terms? Well, look them up. It’s just a click away, right?) And recognize that today’s innovations are often little more than warmed over pizza. The user experience I have had with reheated pizza is that it is often horrible.

Stephen E Arnold, April 22, 2016

New York Times: Editorial Quality in Action

April 22, 2016

On April 14, 2016, I flipped through my dead tree copy of the New York Times. You know. The newspaper which is struggling to sell more copies than McPaper. What first caught my eye was this advertisement for a dead tree book called “The New York Times Manual of Style and Usage: The Official Style Guide Used by the Writers and Editors of the World’s Most Authoritative News Organization.” I assume this manual was produced by “real” journalists and editors. I am not familiar with this book, although I was aware of its existence. The addled goose uses the style set forth in the classic Tressler Christ circa 1958. Oh, you may be able to read a version of the New York Times story at this link. Keep in mind that you may have to pay pay pay.

I noted in the very same dead tree edition of the New York Times this write up about a football (soccer) match. I know that the “real” journalists working in Midtown are probably not into the European Cup if there is a Starbucks nearby.

I noted this interesting stylistic touch:

image

I spotted two paragraphs which are mostly the same. I assume that the new edition of the Style and Usage volume is okay with duplicate passages. It is tough to determine which is the “correct” paragraph.

Tressler Christ, as I recall, suggested that writing the same passage twice in a row was not a good move in 1958. The reality of the cost conscious New York Times may be that it is okay to pontificate and then duplicate content.

Nifty. I will try this some time.

Nifty. I will try this some time.

Nifty. I will try this some time.

Nifty. I will try this some time.

See. Not annoying annoying annoying at all.

Stephen E Arnold, April 22, 2016

Google Nest: A Nice Cafe and an Improving Culture

April 22, 2016

Working at whizzy Silicon Valley start ups has got to be rewarding. I know the shift to mobile is shaking up some assumptions about the Alphabet Google thing. I know that Google is trying to sell its robot outfit. I know that legal eagles are keeping the sun from some volleyball games. But I was delighted to learn that Google Nest has an “incredibly nice new cafe” which serves “Asian noodles.” Slam dunk.

I read “Nest CEO Tony Fadell Went to Google’s All-Hands Meeting to Defend Nest. Here’s What He Said.” I learned that Nest garnered some “damning articles.” I had not noticed because I don’t pay too much attention to home automation in general and thermostats in particular.

I learned that one “real” journalistic outfit wrote about a “corrosive culture” in another Alphabet Google operation. I am not sure what a corrosive culture is, but I think the idea is that some folks are not happy. What’s new? Anyone ever listened to a group of GS 12s discuss the efficacy of lateral transfers from Fish & Wildlife to the Postal Service? Grumpy, grumpy.

The Google is on top of employee satisfaction. There are tools to obtain feedback. There are senior managers who are managing. The passage in the write up I noted and circled in arugula green was this one:

I do respect the Nest employees. I do respect the Google employees. I respect the Alphabet employees. We try to work very hard together and partner in many different areas around the different companies. I also respect ex-Nesters, ex-Googlers, those kind of things. So when I read those things that say we don’t respect people, or I don’t, it’s absolutely wrong and that is not how I believe because I want to be treated with respect. And I give respect because I want to get respect.

My assumption was that respect at the old Google came from doing things that worked and mattered. I am a little fuzzy on the people side of the equation. The reason is that I heard long ago that the reason a certain big wheel media titan launched a multi year, very expensive legal dispute with the Google was a direct consequence of [a] senior Googlers not arriving at the meeting on time. Since the meeting was at Google Mountain View, the big wheel media person was not happy, [b] a certain founder of Google did not look at the media titan. The founder focused on his Mac laptop and ignored the media giant, [c] another Google founder arrived after the first Google founder, perspiring because his rollerblading session ran long. Now I was not at this meeting, and this may be one of those apocryphal stories about why the Google and Viacom were not best friends for many years.

One thing the passage about respect did was trigger a memory of this anecdote. My source was a person familiar with the matter, and I gained some dribs and drabs to confirm the anecdote after the event. I assume the event and this remarkable presentation ran like a smart thermostat.

Yep, respect and Asian noodles, and the loss of a Glass executive. (Glass reports to Nest.)

Stephen E Arnold, April 22, 2016

Watson Lacks Conversation Skills and He Is Not Evil

April 22, 2016

When I was in New York last year, I was walking on the west side when I noticed several other pedestrians moving out of the way of a man mumbling to himself. Doing as the natives do, I moved aside and heard the man rumble, “The robots are taking over and soon they will be ruling us. You all are idiots for not listening to me.” Fear of a robot apocalypse has been constant since computer technology gained prominence, and we can also thank science fiction for perpetuating it. Tech Insider says in “Watson Can’t Actually Talk To You Like In The Commercials” that Elon Musk, Bill Gates, Stephen Hawking, and other tech leaders have voiced their concerns about creating artificial intelligence so advanced that it can turn evil.

IBM wants people to believe otherwise, which explains its recent PR campaign with commercials that depict Watson carrying on conversations with people. The idea is that people will think AI is friendly, here to augment our jobs, and helpful overall. There is some deception on IBM’s part, however. Watson cannot actually carry on a conversation with a person. People communicate with it through a UI, usually a program on a desktop or tablet. Also, there is more than one Watson; each is programmed for a different function, like diagnosing diseases or cooking.

“So remember next time you see Watson carrying on a conversation on TV that it’s not as human-like as it seems… ‘Humor is a great way to connect with a much broader audience and engage on a personal level to demystify the technology,’ Ann Rubin, Vice President IBM Content and Global Creative, wrote in an email about the commercials. ‘The reality is that these technologies are being used in our daily lives to help people.’”

If artificial intelligence does become advanced enough that it is capable of thought and reason comparable to a human’s, it is worrisome. It might require that certain laws be put into place to maintain control over the artificial “life.” That day is a long time off, however. Until then, embrace robots helping to improve life.

Whitney Grace, April 22, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Local News Station Produces Dark Web Story

April 22, 2016

The Dark Web continues to emerge as a subject of media interest for growing audiences. An article from Chicago CBS, “Dark Web Makes Illegal Drug, Gun Purchases Hard To Trace,” also appears to have been aired as a news segment recently. Offering some light education on the topic, the story explains the anonymity possible for criminal activity using the Dark Web and Bitcoin. The post describes how these tools are typically used:

“Within seconds of exploring the deep web we found over 15,000 sales for drugs including heroin, cocaine and marijuana. In addition to the drugs we found fake Illinois drivers licenses, credit card and bank information and dangerous weapons. ‘We have what looks to be an assault rifle, AK 47,’ said Petefish. That assault rifle AK 47 was selling for 10 bitcoin which would be about $4,000. You can buy bitcoins at bitcoin ATM machines using cash, leaving very little trace of your identity. Bitcoin currency along with the anonymity and encryption used on the dark web makes it harder for authorities to catch criminals, but not impossible.”

As expected, this piece touches on the infamous Silk Road case along with some nearby cases involving local police. While the Dark Web and cybercrime have been on our radar for quite some time, it appears mainstream media interest in the topic is slowly growing. Perhaps those at risk of being affected, such as businesses, government, and law enforcement agencies, will also continue catching on to the issues surrounding the Dark Web.

Megan Feil, April 22, 2016

Data Intake: Still a Hassle

April 21, 2016

I read “Big Data’s Biggest Problem: It’s Too Hard to Get the Data In.” Here’s a quote I noted:

According to a study by data integration specialist Xplenty, a third of business intelligence professionals spend 50% to 90% of their time cleaning up raw data and preparing to input it into the company’s data platforms. That probably has a lot to do with why only 28% of companies think they are generating strategic value from their data.

My hunch is that with the exciting hyperbole about Big Data, the problem of normalizing, cleaning, and importing data is ignored. The challenge of taking file A in a particular file format and converting to another file type is indeed a hassle. A number of companies offer expensive filters to perform this task. The one I remember is Outside In, which sort of worked. I recall that when odd ball characters appeared in the file, there would be some issues. (Does anyone remember XyWrite?) Stellent purchased Outside In in order to move content into that firm’s content management system. Oracle purchased Stellent in 2006. Then Kapow “popped” on the scene. The firm promoted lots of functionality, but I remember it as a vendor who offered software which could take a file in one format and convert it into another format. Kofax (yep, the scanner oriented outfit) bought Kapow to move content from one format into one that Kofax systems could process. Then Lexmark bought Kofax and ended up with Kapow. With that deal, Palantir and other users of the Kapow technology probably had a nervous moment or are now having a nervous moment as Lexmark marches toward a new owner. Entropy, a French outfit, was a file conversion outfit. It sold out to Salesforce. Once again, converting files from Type A to another desired format seems to have been the motivating factor.

Let us not forget the wonderful file conversion tools baked into software. I can save a Word file as an RTF file. I can import a comma separated file into Excel. I can even fire up Framemaker and save a Dot fm file as RTF. In fact, many programs offer these import and export options. The idea is to lessen the pain of having a file in one format which another system cannot handle. Hey, for fun, try opening a macro filled XyWrite file in Framemaker or Indesign. Just change the file extension to one the system thinks it recognizes. This is indeed entertaining.

The write up is not interested in the companies which have sold for big bucks because their technology could make file conversion a walk in the Hounz Lane Park. (Watch out for the rats, gentle reader.) The write up points out three developments which will make the file intake issues go away:

  1. The software performing file conversion “gets better.” Okay, I have been waiting for decades for this happy time to arrive. No joy at the moment.
  2. “Data preparers become the paralegals of data science.” Now that’s a special idea. I am not clear on what a “data preparer” is, but it sounds like a task that will be outsourced pretty quickly to some country far from the home of NASCAR.
  3. “Artificial intelligence” will help cleanse data. Excuse me, but smart software has been operative in file conversion methods for quite a while. In my experience, the exception files keep on piling up.

What is the problem with file conversion? I don’t want to convert this free blog post into a lengthy explanation. I can highlight five issues which have plagued me and my work in file conversion for many years:

First, file types change over time. Some of the changes are not announced. Others like the Microsoft Word XML thing are the subject of months long marketing. The problem is that unless the outfit responsible for the file conversion system creates a fix, the exception files can overrun a system’s capacity to keep track of problems. If someone is asleep at the switch, data in the exception folder can have an adverse impact on some production systems. Loss of data is interesting but trashing the file structure is a carnival. Who does not pay attention? In my experience, vendors, licensees, third parties, and probably most of the people responsible for a routine file conversion task.
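For the curious, here is a minimal sketch of keeping an eye on an exception folder before it overruns anything. It assumes the conversion system drops failed files into a single directory tree. The function name `exception_backlog`, the `limit` threshold, and the folder layout are hypothetical and not taken from any particular product:

```python
from collections import Counter
from pathlib import Path


def exception_backlog(folder, limit=1000):
    """Count files in an exception folder, grouped by extension.

    `folder` and `limit` are illustrative assumptions; a real system
    will have its own layout and its own idea of "too many".
    """
    files = [p for p in Path(folder).rglob("*") if p.is_file()]
    by_ext = Counter(p.suffix.lower() or "(none)" for p in files)
    return {
        "total": len(files),
        "by_extension": dict(by_ext),
        "over_limit": len(files) > limit,
    }
```

A sudden jump in one extension's count is one possible hint that a file type changed without an announcement. Someone still has to look at the report.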

Second, the thrill of XML is that it is not particularly consistent. Somewhere along the line, creativity takes precedence over well formed. How does one deal with a couple hundred thousand XML files in an exception folder? What do you think about deleting them?
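Before anyone deletes anything, a rough triage is possible. The sketch below uses Python's standard `xml.etree.ElementTree` parser to split a folder of `.xml` files into those that parse and those that do not, keeping the parser's complaint for each failure. The function name `triage_xml` and the flat folder layout are assumptions for illustration. The check covers well-formedness only and says nothing about whether a file matches the schema a downstream system expects:

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def triage_xml(folder):
    """Split *.xml files into well formed and broken.

    Returns (well_formed_names, [(broken_name, parser_error), ...]).
    """
    well_formed, broken = [], []
    for path in sorted(Path(folder).glob("*.xml")):
        try:
            ET.parse(path)  # raises ParseError if not well formed
            well_formed.append(path.name)
        except ET.ParseError as err:
            broken.append((path.name, str(err)))
    return well_formed, broken
```

Grouping the error messages often shows that a couple hundred thousand failures come from a handful of causes, which is a better starting point than the delete key.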

Third, the file conversion software works as long as the person creating a document does not use Fancy Dan “inserts” in the source document. Problems arise from videos, certain links, macros, and odd ball formatting of the source document. Yep, some folks create text in Excel and wonder why the resulting text is a bit of a mess.

Fourth, workflows get screwed up. A file conversion system is semi smart. If a process creates a file with an unrecognized extension, the file conversion system fills the exception folder. But what if one valid extension is changed to a supported but incorrect extension? Yep, XML users be aware that there are proprietary XML formats. The files converted and made available to a system are “sort of right.” Unfortunately sort of right in mission critical applications can have some interesting consequences.
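One partial defense against the wrong extension problem is to look at the first few bytes of a file instead of trusting its name. The sketch below is an assumption laden illustration: the signature table is tiny, the names `SIGNATURES`, `sniff`, and `extension_mismatch` are made up for this example, and an XML file that lacks a declaration or starts with a byte order mark will not be recognized:

```python
from pathlib import Path

# A few well known leading byte signatures. Far from complete.
SIGNATURES = {
    b"%PDF-": ".pdf",
    b"PK\x03\x04": ".zip",  # also the container for .docx, .xlsx, .pptx
    b"{\\rtf": ".rtf",
    b"<?xml": ".xml",
}

ZIP_BASED = {".zip", ".docx", ".xlsx", ".pptx"}


def sniff(path):
    """Return the extension suggested by the leading bytes, or None."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    for magic, ext in SIGNATURES.items():
        if head.startswith(magic):
            return ext
    return None


def extension_mismatch(path):
    """True when the leading bytes contradict the file extension."""
    claimed = Path(path).suffix.lower()
    actual = sniff(path)
    if actual is None:
        return False  # unknown content: no verdict either way
    if actual == ".zip":
        return claimed not in ZIP_BASED
    return claimed != actual
```

This catches a PDF renamed to `.xml`. It does not catch one proprietary XML dialect passed off as another, because both look like XML at the byte level. That case needs a check of the root element or schema, which is where “sort of right” usually sneaks in.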

Fifth, attention to detail is often less popular than fiddling with one’s mobile phone or reading Facebook posts. Human inattention can make large scale data conversion fail. I have watched as a person of my acquaintance deleted the folder of exception files. Yo, it is time for lunch.

So what? Smart software makes certain assumptions. At this time, file intake is perceived as a problem which has been solved. My view is that file intake is a core function which needs a little bit more attention. I do not need to be told that smart software will make file intake pain go away.

Stephen E Arnold, April 21, 2016

Artificial Intelligence Algorithms Want Wilde Byrons

April 21, 2016

I read “The Next Hot Job in Silicon Valley Is for Poets.” The idea is that English majors, among others of this ilk, will be contributors to more human artificial intelligence systems. The write up informed me:

As in fiction, the AI writers for virtual assistants dream up a life story for their bots. Writers for medical and productivity apps make character decisions such as whether bots should be workaholics, eager beavers or self-effacing. “You have to develop an entire backstory — even if you never use it,” Ewing [a Hollywood writer] said.

With the apparent boom in smart software, English majors and other word oriented creative types may see an end to their employment problems. Now what can be done about unemployed lawyers? Perhaps one can ask IBM Watson?

Stephen E Arnold, April 21, 2016

Google Removes Pirate Links

April 21, 2016

A few weeks ago, YouTube was abuzz with discontent from some of its most popular stars. Their channels had been shut down due to copyright claims by third parties, even though the content in question fell under the Fair Use defense. YouTube is not the only one who has to deal with copyright claims. TorrentFreak reports that “Google Asked To Remove 100,000 ‘Pirate Links’ Every Hour.”

Google handles on average two million DMCA takedown notices from copyright holders about pirated content. TorrentFreak discovered that the number has doubled since 2015 and quadrupled since 2014. The amount breaks down to one hundred thousand per hour. If the rate continues, Google will deal with one billion DMCA notices this year, while it had previously taken a decade to reach this number.
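A quick back of the envelope check on those figures, as a sketch. It takes the 100,000 per hour number from the headline at face value and assumes a flat rate for a 365 day year, which is a simplification:

```python
LINKS_PER_HOUR = 100_000  # figure from the TorrentFreak headline

per_day = LINKS_PER_HOUR * 24   # flat rate assumed
per_year = per_day * 365        # 365 day year assumed

print(f"{per_day:,} per day")    # 2,400,000 per day
print(f"{per_year:,} per year")  # 876,000,000 per year
```

At a flat rate the yearly total lands a bit under one billion, so the one billion projection appears to assume that the hourly rate keeps climbing.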

“While not all takedown requests are accurate, the majority of the reported links are. As a result many popular pirate sites are now less visible in Google’s search results, since Google downranks sites for which it receives a high number of takedown requests.  In a submission to the Intellectual Property Enforcement Coordinator a few months ago Google stated that the continued removal surge doesn’t influence its takedown speeds.”

Google does not take broad sweeping actions, such as removing entire domain names from search indexes, as it does not want to become a censorship board.  The copyright holders, though, are angry and want Google to promote only legal services over the hundreds of thousands of Web sites that pop up with illegal content.   The battle is compared to an endless whack-a-mole game.

Pirated content does harm the economy, but the numbers are far lower than the huge copyright holders claim. The smaller people who launch DMCA takedowns are hurt more. YouTube stars, on the other hand, are the butt of an unfunny joke, and it would be wise to revise the rules.

Whitney Grace, April 21, 2016

Digging for a Direction of Alphabet Google

April 21, 2016

Is Google trying to emulate BAE Systems’ NetReveal, IBM i2, and systems from Palantir? An older article from Search Engine Watch, “How the Semantic Web Changes Everything for Search,” may provide insight. At the time, Knowledge Graph had launched, and with it came a wave of communications generating buzz about a new era of search moving from string-based queries to a semantic approach organized by “things.” The write-up explains:

“The cornerstone of any march to a semantic future is the organization of data and in recent years Google has worked hard in the acquisition space to help ensure that they have both the structure and the data in place to begin creating “entities”. In buying Wavii, a natural language processing business, and Waze, a business with reams of data on local traffic and by plugging into the CIA World Factbook, Freebase and Wikipedia and other information sources, Google has begun delivering in-search info on people, places and things.”

The article noted that this semantic approach would let Google deliver stronger, more relevant advertising. Even today, we see the Alphabet Google thing continuing to shift from search to other interesting information access functions in order to sell ads.

Megan Feil, April 21, 2016

Yahoo: An Interesting Hatchet Job

April 20, 2016

I think that the Huffington Post is part of America Online, which is part of Verizon, which is supposed to be interested in buying poor old Yahoo.

I thought about that chain of dependent clauses right after I read “The One-Time Ruler of the Web Has Lost More Than Its Mojo — A Lesson for Us All.” The write up does a good job of pointing out that Yahoo has been lost in space for a long, long time. I highlighted this statement:

One researcher tracked Yahoo!’s “boilerplate,” the block of text that describes a company’s self-description found at the bottom of most press releases. In 24 years the boilerplate changed 24 times!

What’s interesting in this mélange of popular and academic statements about the home of the Yahooligans is that the company has lacked vision. Ah, the vision thing. This argument is supported by a statement allegedly made by Helen Keller:

It is a terrible thing to see and have no vision.

Yahoo strikes me as an interesting company. One could argue that Yahoo did try to change to adapt to technology, competitors, and market opportunities. The effort, unlike Google’s approach, lacked the steady flow of cash produced by Google’s online advertising model.

Where did that Google online advertising model originate? GoTo.com which became Overture. Overture became a Yahoo property. The Google was “inspired” by that model. Perhaps one can view Google as a sibling of Yahoo, just a younger, sighted relative?

Stephen E Arnold, April 20, 2016
