European Union: Yes, Russia Warrants Some Attention
August 4, 2020
With so many smart people wrestling with the Google and cage fighting with England, I was surprised to read “EU, in First Ever Cyber Sanctions, Hits Russian Intelligence.” The allegedly accurate write up states:
Four members of Russia’s GRU military intelligence agency were singled out. The EU accuses them of trying to hack the wifi network of the Netherlands-based Organization for the Prohibition of Chemical Weapons, which has probed the use of chemical weapons in Syria. The 2018 attack was foiled by Dutch authorities.
In addition, two individuals described as “Chinese nationals” found themselves in the sanction target area.
There are several ways to look at this action. First, the Google is a bigger deal than the EU’s friend to the East. Second, the Brexit fishing rights thing distracted EU officials from mere intelligence and trans-national security matters. Third, maybe someone realized that cyber espionage and cyber attacks are something to think about. A couple of years or more seems pretty snappy compared to other EU projects.
Stephen E Arnold, August 3, 2020
Free Content: Like Technology, Now a Political Issue
August 3, 2020
Free content is interesting. It seems to represent a loss when compared to content that costs money. But are these two options the only ones? Nope, digital information has a negative cost. I think that’s a fair characterization of the knowledge road many are walking.
For seven years, I have produced “content” and made it available without charge to law enforcement and intelligence professionals in the US and to US allies. When I embarked on this approach, I met with skepticism and questions like “What’s the catch?”
I learned quickly that “free” means hook, trick, or sales ploy. Intrigued by the reaction, I persisted. Over time, my approach was, to a small number of people, somewhat helpful. In a few weeks, I will be 77, and I don’t plan on changing what I do, terminating the researchers who assist me, or telling those who want me to give a talk or write up a profile about one of the companies I follow to get lost.
I thought about my approach when I read “The Truth Is Paywalled But The Lies Are Free.” The title annoyed me because what I do is free. I could identify an interesting organization which has recently availed itself of one of my free reports. My team and I tried to assemble hard-to-find and little-known information and package it into a format that was easy to understand. Yep, the document was free, and it has found its way into several groups focused on chasing down bad actors.
The write up in Current Affairs, now an online information service, states:
This means that a lot of the most vital information will end up locked behind the paywall… The lie is more accessible than its refutation.
I think I understand. The majority of free content has spin. For-fee content is, therefore, delivered with less spin or without spin.
Is this true?
The reports I prepare describe specific characteristics of a particular technology. In my opinion and that of the researchers who assist me, we make an effort to identify consistent statements, present information for which there is a document like a technical specification, and use cases that are verifiable.
I suppose the fact that I maintain profiles of companies of little interest to most “real” journalists and pundits creates an exception. What I do can be set aside with the comment, “Yeah, but who really cares about the Polish company Datawalk?”
The write up states:
More reason to have publications funded by the centralized free-information library rather than through subscriptions or corporate sponsorship. Creators must be compensated well. But at the same time we have to try to keep things that are important and profound from getting locked away where few people will see them. The truth needs to be free and universal.
I think I see the point. However, my model is different. The content I produce is a side product of what I do. If someone pays me to produce a product or service, I use that money to keep my research group working.
Money can be generated and a portion of it can be designated to an information task. The challenge is finding a way to produce money and then allocating the funds in a responsible way. Done correctly, there is no need to beg for dollars, snarl at Adam Carolla for selling a “monthly nut,” or criticize information monopolies.
These toll booths for information are a result of choice, a failure of regulatory authorities, the friction of established institutions wedded to “things the way they have to be” thinking, and selfish agendas.
In short, the lack of high value “free” information is distinctly human. I want to point out that even with information paywalls, there are several ways to obtain useful information:
- Talking to people, preferably in person but email works okay
- Obtaining publicly accessible documents; for example, patent applications
- Comments posted in discussion groups; for instance, the worker at a large tech company who lets slip a useful factoid
- Information recycled by wonky news services; for example, the GoCurrent outfit.
The real issue is that “free” generally triggers fear, doubt, and uncertainty. Paying for something means reliable, factual, and true.
Put my approach aside. Step back from the “create a universal knowledge bank which anyone can access” idea. Forget the paying-the-author angle.
High-value information exists in the flows of data. Knowledge can be extracted from deltas; that is, changes in the data themselves. The trick is point of view and noticing. The more one tries to survive by creating information, the more likely it is that generating cash will be difficult if not impossible.
Therefore, high value content can be the result of doing other types of knowledge work. Get paid for that product or service, then generate information and give it away.
That’s what I have been doing, and it seems to work okay for me. For radicals, whiners, monopolists fearful of missing a revenue forecast — do some thinking, then come up with a solution.
What’s going on now seems to be a dead end to me. Ebsco and LexisNexis live in fear of losing a “big customer.” Therefore, prices go up. Fewer people can afford the products. The knowledge these companies have becomes more and more exclusive. I get it.
But these firms, and to some extent government agencies which charge for data assembled and paid for with taxpayer dollars, are accelerating intellectual loss.
The problem is a human and societal one. I am going to keep chugging along, making my content free. I think the knowledge economy seems to be one more signal that the datasphere is not a zero sum game. Think in terms of a negative number. We now have a positive (charging for information), free (accessing information for nothing), and what I call the “data negative” or D-neg (the loss of information and by extension being “informed”).
In my experience, D-neg accelerates stupidity. That’s a bigger problem than many want to think about. Arguing about the wrong thing seems to be the status quo; that is, generating negatives.
Stephen E Arnold, August 3, 2020
Spotify: Implementing the Hapsburg Hustle
August 3, 2020
I read “Spotify CEO Daniel Ek Says Working Musicians May No Longer Be Able to Release Music Only Once Every Three to Four years.” I also read the comments in Hacker News related to this write up. The Fader article states:
Ek claimed that a “narrative fallacy” had been created and caused music fans to believe that Spotify doesn’t pay musicians enough for streams of their music. “Some artists that used to do well in the past may not do well in this future landscape,” Ek said, “where you can’t record music once every three to four years and think that’s going to be enough.”
Motivated by financial and psychological factors, the big M (Mozart) hit a fatal D minor in 1791. He was 35. DarkCyber’s view is that music MBAs want output.
Nothing new. Will Spotify Bach off? Unlikely. Content needed.
Stephen E Arnold, August 3, 2020
Do Not Gamble. Own the Casino. The Google Way?
August 3, 2020
I read “Google’s Top Search Result?” What a surprise? No, not the fact that Google presents Google-centric results at the top of mobile search results. The surprise is that until July 28, 2020, no one knew that Google’s magical algorithmic, math-is-objective, super duper relevance scooper got more Google goodies than any other “content producer.” Amazing.
In the good old days of big desktop anchor computers and monitors, there was screen real estate. Google filled the screen with objective results and, of course, some advertisements.
That was then; this is now. Mobile screens are mostly squint-generators. In order to be seen and generate clicks, the Google has to work overtime.
The challenges include:
- Traffic, eyeballs, and individuals who will go ga-ga over that which is Googley.
- Sizzle that will burn the greedy fingertips of competitors who want to be placed front and center.
- Useful information for consumers. Yep, what Google displays eliminates the need to think. Advertisers who want to be listed on a Google Map? Something can be worked out.
A number of organizations have groused about Google’s magical algorithmic, math-is-objective, super duper relevance scooper.
What’s fascinating is that it has taken two decades for some people to understand the wisdom embedded in the observation, “Own the casino.”
Pretty good advice and someone at the GOOG took it.
Stephen E Arnold, August 3, 2020
WhatsApp: Expiring Messages
August 3, 2020
DarkCyber noted “WhatsApp Is Working on Message Deletion Feature.” Encrypted messaging is a communications channel with magnetism. I pointed out in one of my recent lectures that:
messaging can provide many of the functions associated with old style Dark Web sites.
Messaging applications permit encrypted groups, in-app ecommerce, and links which effectively deliver digital content to insiders or customers.
Facebook, owner of WhatsApp, according to the article:
is working on an Expiring Messages feature.
The idea is that the Facebook system will:
“automatically delete a particular message for the sender and the receiver after a particular time.”
The innovation, if it is under development, raises the question, “Will Facebook retain copies of the deleted content?”
Stephen E Arnold, August 3, 2020
Search and Predicting Behavior
August 3, 2020
DarkCyber is interested in predictive analytics. Bayesian and other “statistical methods” are a go-to technique, and they find their way into many of the smart software systems. Developers rarely explain that systems share many features and functions. Marketers, usually kept in the dark like mushrooms, are free to formulate an interesting assertion or two.
I read “Google Searches During Pandemic Hint at Future Increase in Suicide,” and I was not sure about the methodology. Nevertheless, the write up provides some insight into what can be wiggled from Google search data.
Specifically, Columbia University experts have concluded that financial distress is “strongly linked to suicide.”
Okay.
I learned:
The researchers used an algorithm to analyze Google trends data from March 3, 2019, to April 18, 2020, and identify proportional changes over time in searches for 18 terms related to suicide and known suicide risk factors.
What algorithm?
The method is described this way:
The proportion of queries related to depression was slightly higher than the pre-pandemic period, and moderately higher for panic attack.
Perhaps the researchers looked at the number of searches and noted the increase? So comparing raw numbers? Tenure tracks and grants await! Because that leap between search and future behavior…
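The write up does not name the algorithm, so here is a minimal sketch of the distinction I am poking at: a change in raw search counts versus a proportional change in the share of all searches. This is not the Columbia method. The `PeriodCounts` class, the function names, and every number below are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class PeriodCounts:
    """Hypothetical query counts for one time window (invented numbers)."""
    term_queries: int   # searches for the term of interest
    total_queries: int  # all searches in the same window


def share(period: PeriodCounts) -> float:
    """Fraction of all queries in the window that are for the term."""
    return period.term_queries / period.total_queries


def raw_change(before: PeriodCounts, after: PeriodCounts) -> float:
    """Relative change in the raw number of term searches."""
    return (after.term_queries - before.term_queries) / before.term_queries


def proportional_change(before: PeriodCounts, after: PeriodCounts) -> float:
    """Relative change in the term's share of all searches."""
    return (share(after) - share(before)) / share(before)


# Invented data: overall search volume grows between the two windows.
before = PeriodCounts(term_queries=1_000, total_queries=1_000_000)
after = PeriodCounts(term_queries=1_300, total_queries=1_250_000)

raw = raw_change(before, after)
prop = proportional_change(before, after)
print(f"raw change in searches: {raw:+.1%}")
print(f"proportional change in share: {prop:+.1%}")
```

With these made-up figures the raw count rises 30 percent while the share of all searches rises about 4 percent. Neither number says anything about future behavior; that leap remains the researchers’ to justify.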
Stephen E Arnold, August 3, 2020
NLP: A Time for Reflection or a Way to Shape Decades of Hyperbole and Handwaving?
August 2, 2020
The most unusual GoCurrent.com online information service published “The Field of Natural Language Processing Is Chasing the Wrong Goal.” The article comments about the Association for Computational Linguistics Conference held in July 2020.
The point of the write up is to express concern about the whither and why of NLP; for example:
My colleagues and I at Elemental Cognition, an AI research firm based in Connecticut and New York, see the angst as justified. In fact, we believe that the field needs a transformation, not just in system design, but in a less glamorous area: evaluation.
Evaluation?
Yep, the discipline appears to be chasing benchmarks. DarkCyber believes this is a version of the intra-squad rivalries as players vie to start the next game.
The write up raises this question:
How did the NLP community end up with such a gap between on-paper evaluations and real-world ability? In an ACL position paper, my colleagues and I argue that in the quest to reach difficult benchmarks, evaluations have lost sight of the real targets: those sophisticated downstream applications. To borrow a line from the paper, the NLP researchers have been training to become professional sprinters by “glancing around the gym and adopting any exercises that look hard.”
The answer, in part, is for NLP developers to follow this path:
But our argument is more basic: however systems are implemented, if they need to have faithful world models, then evaluations should systematically test whether they have faithful world models.
DarkCyber’s view is that NLP, like other building blocks of content analysis and access systems, has some characteristics which cause intra-squad similarities; that is, the players are more similar than even they understand:
- Reliance on methods widely taught in universities. Who wants to go in a new direction, fail, and, therefore, be perceived as a dead ender?
- Competing with one’s team mates, peers, and fellow travelers is comfortable. Who wants to try and explain why NLP from A is better than NLP from B when the results are more of the same?
- NLP, like other content functions, is positioned as the big solution to tough content challenges. The reality is that language is slippery and often less fancy methods deliver good enough results. Who wants to admit that a particular approach is “good enough”? It is better to get out the pink wrapping paper and swathe the procedures in colorful garb.
NLP can be and is useful in many situations. The problem is that making sense of human utterances remains a difficult challenge. DarkCyber is suspicious of appeals emitted by the Epstein-funded MIT entity.
Jargon is jargon. NLP is one of those disciplines which works overtime to deliver on promises that have been made for many years. Does NLP pay off? This is like MIT asking, “Epstein who?”
Stephen E Arnold, August 2, 2020
French Computer Terminology
August 1, 2020
This is a helpful resource. However, the term for “spreadsheet” is not included. If you want that spreadsheet holding a summary of your electricity bills, be sure to know the word “tableur.” You can find the collection of terms at this link. The compilation is not une faute passible d’un coup franc, but let’s check with the video assisted referee to be sure.
Stephen E Arnold, August 1, 2020