Amazon and Open Source: A Me Too Spin on Microsoft and Its Extinguish Tactic?
September 26, 2022
I heard that Amazon — the lovable online bookstore — is thinking about open source software in general and open source search specifically. This is just a hunch, based on comments bandied about in the vendors’ area at a recent law enforcement conference. The attendees may not think much about Amazon as an ecosystem for bad actors but the vendors with whom I talk are:
Aware
Eager to use the AWS platform
Expressing varying degrees of concern.
Were these vendors representative of the cyber security community? Are you kidding? Were the conference attendees a cross section of the more than 100 US enforcement agencies? Nope.
So why do I mention this impression? Three reasons:
- Amazon, like Microsoft, provides plumbing for a number of government entities and for some darned interesting cyber security vendors in the US and elsewhere (Hello, Israel?)
- The US government is not a cohesive entity. One of the regulatory agencies, which I shall not name, is thinking hard thoughts about the friendly online bookstore. I have heard that third party seller activity (Amazon’s and some sellers’), Amazon’s human centric management approach, and some of Amazon’s surfing on data generated by resellers, vendors, and possibly home shoppers are topics of interest.
- Years ago, Amazon hired some Lucid Imagination open source search professionals and plopped the wizards in the Bezos Bulldozer’s Burlingame office. Evolving from that “lucid” input, the venerable online bookstore engaged in a game of fork you with Elastic, a company associated with the open source Elasticsearch, for fee services, and a digital animal dubbed ELK.
These reasons cause me to recall one of the principal conclusions my team and I formulated when we wrote “Open Source Search Report” for a mid tier consulting firm. (Unsurprisingly the company changed hands and the study was split apart with individual chapters going for $3,000 each on... guess what online bookstore? Give up? It was Amazon.)
I reflected on the conclusion in our monograph: Open source is the domain of large corporate entities. Why? Open source was pretty much free and could be changed. Plus, unpaid open source enthusiasts would find and fix software problems.
One of the reasons enterprise search in general and content processing in particular has been a company killer is that search is not an “application.” Search is weirdly personal, and each enterprise search client wanted a system that would work for the many silos within an organizational structure.
The information super highway is littered with search road kill. Many of the names are long forgotten. When was the last time you longed for the francophone centric Delphis or the enterprise powerhouse Entopia?
Why am I thinking about Amazon and open source search?
I read “Open Source Bait and Switch” with the fetching and click magnet subtitle “When OSS advocacy goes too far & corporate greed takes over, free software is used as a tool to destroy competition and hurt the developer community.”
I noted these statements in the article, which is in step with our 2011 research. (Yep, more than a decade ago, which I find interesting.)
Let me highlight a couple of statements from the article which arrested my attention this morning (Monday, September 26, 2022).
Take Elastic search. They were open source and killing it. But AWS was forking and not really helping their bottom line. So Elastic changed their license to block AWS. AWS started their own fork. Some people vilify Elastic in this story but those people probably never had to fight Amazon for the survival of their business. In this case, both sides weaponised open source in a business fight.
Also:
I love open source and think it’s remarkably important. That’s why we shouldn’t let corporations weaponize it.
And:
Major corporations use open source as a weapon to fight each other, we seem to benefit in the short term. But as they win the corporate mindset takes over and they double down on control.
What’s shaking at Amazon? Based on my vantage point and my limited viewshed, I will hazard several observations:
- Amazon wants to dominate via search and retrieval because it is a utility that is essential for next generation search based applications.
- Amazon wants to strike at its competitors, which are estimable organizations obviously, and deprive them of any advantage these firms may be perceived to have when it comes to findability. Could these be great outfits like Google and Microsoft as well as annoying start ups like Algolia and the almost laughable Gulliver of search in Canada as well as an interesting entity morphing as I write this essay? (Want names? Sorry, not in a free blog, you silly goose.)
- Amazon lacks imagination, and it is — in my opinion — manifesting the old Microsoft method of embrace, extend, and extinguish. Yep, extinguish. In my view, Amazon is showing other outstanding for profit entities how to attack competitors, community minded open source developers, and users of Amazon AWS simultaneously. None of the “special operation” thinking that has been in the news lately. Amazon is operating strategically and tactically with a single minded purpose. Split up the bookstore and each part will grow bigger than it is today.
Should I worry that my eBook won’t arrive or that the French bulldog’s winter coat will fail to show up tomorrow? Nah. What about open source, the community thing, the free thing? Yep, worry is good.
Stephen E Arnold, September 26, 2022
Microsoft and Linux: All Your Base Belong to Us
August 9, 2022
Microsoft has traditionally been concerned about Linux and has never hidden its indigestion, at least until the original top dogs went to the kennel. Microsoft actually hates all open source software, and former CEO Steve Ballmer once said, “Linux is a cancer that attaches itself in an intellectual property sense to everything it touches.” Wow! It sounds like someone wants to enforce a monopoly on technology, prevent innovation, and rake in dollars for personal gain. In other words, Ballmer is power and greed at its worst. Open source, on the other hand, inspires innovation and sharing technology. The Lunduke Journal of Technology, run by Bryan Lunduke, details his experience with controlling Microsoft heads and how Bill Gates’s company has slowly decimated Linux: “Microsoft’s Growing Control Of Linux.”
Lunduke recounted that he heard Ballmer’s hatred for Linux and even had the CEO’s spittle on his face from an open source rage. Microsoft has slowly gained control over important parts of Linux and open source as a whole. This includes GitHub, the largest host of source code in the world; Linux conferences; Linux organizations (Microsoft is a “Premium Sponsor” of the Open Source Initiative and holds a “Platinum Membership” in the Linux Foundation); and the hiring of prominent Linux developers.
Here is what Lunduke heard during a past Linux conference:
“During that keynote, the Microsoft executive (John Gossman) made a few statements worth noting:
‘You do not generally want your developers to understand how the licenses all work. If you’re a larger company, you’re very likely to have a problem of controlling all of the open source activity that’s going on … it can be bad for the company, it can be bad for the community, it can be bad lots of different ways.’
You don’t want developers to understand licenses? Not having corporate control of open source is bad? Not exactly pro-open source statements, eh?”
Microsoft does use Linux, for example in Azure and through Ubuntu on Windows, and both make the company’s offerings stronger. This Linux thing will be an interesting challenge. MSFT “owns” GitHub. MSFT wants to sell subscriptions, and maybe what the subscription is for does not matter. Open source may be antithetical to MSFT subscriptions. Open source Linux? How about a subscription to MSFT Linux centric solutions?
Now that’s an idea.
Whitney Grace, August 9, 2022
Open Source Software: Is the Golden Age Unwinding?
July 1, 2022
I spotted a modest, probably inconsequential, article called “Tech War: China Doubles Down on Domestic Operating Systems to Cut Reliance on Windows, MacOS from the US.” Two thoughts struck me: First, is “Kylinsoft” pronounced “kill them softly”? Second, with Chinese contributions to open source creeping upwards, will the Kylinsoft thrust gut some useful open source software projects? (I suppose I could ask myself, “Gee, perhaps the clean code goes to the Kylinsoft thrust and the poisoned stuff flows into non-China approved repositories?” I am not going to ask that question. Why would a nation state take such a nefarious approach to the free and open community minded approach to code?)
The write up takes the approach that China wants to be free of non-China software. The future is digital; therefore, a free future requires free software Chinese entities can trust. Also, the article uses the word “war” several times. That’s interesting. A software war fought on free and open source software governed by crystal clear rules of the information superhighway.
What entity is nudging Kylinsoft forward? The write up answers the question this way:
Kylinsoft, a subsidiary of state-owned China Electronics Corp, last week joined forces with more than 10 Chinese entities, including the National Industrial Information Security Development Research Centre, to set up an open-source code community.
Probably no big deal, right? Killin’ ‘em softly with love. A swan song?
Stephen E Arnold, July 1, 2022
ACM Opens Computing Literature Archive
May 30, 2022
The history of computers is fascinating. It starts thousands of years ago with some of history’s brilliant intellects, staggers, and then quickly advances in the twentieth century. We now have humanity’s collected knowledge in the palm of our hands…if the data or Wi-Fi connections work. The Association for Computing Machinery has documented the development of modern computing since 1947, and the organization has opened an archive: ACM Digital Library. Associations Now explains why ACM opened its archives in “‘The Way Things Were’: How The Association For Computing Machinery Is Opening The Doors To Its Archives.”
ACM wants people to realize how far the computing industry has come and, for its seventy-fifth anniversary, is opening up its archives to the public. In the past, these records were locked behind a paywall; now they are free. More than 117,500 articles from 1951-2000 are readable by anyone. The archive is part of a greater ACM initiative:
“Vicki L. Hanson, the group’s CEO, noted that the ACM Digital Library initiative is part of a broader effort to make its archives available via open access by 2025. ‘Our goal is to have it open in a few years, but there’s very real costs associated with [the open-access work],’ Hanson said. ‘We have models so that we can pay for it.’ While the organization is still working through its open-access effort, it saw an opportunity to make its “backfile” of materials available, timed to the organization’s 75th anniversary.”
Hanson continued that opening the archive was not a big challenge, because ACM already had a system designed for public consumption. ACM wanted a creative way to announce the archive, so it used its seventy-fifth anniversary.
Organizations need to make money to support their research, but too much scientific information is kept behind paywalls. ACM’s move to share its research is a step more organizations should take.
Whitney Grace, May 30, 2022
On Mitigating Open-Source Vulnerabilities
May 16, 2022
Open-source software has saved countless developers from reinventing the proverbial wheel so they can instead spend their time creating new ways to use existing code. That’s great! Except for one thing: Now that open-source components make up about 90% of most applications, they pose tempting opportunities for hackers. Perhaps the juiciest targets lie in the military and intelligence communities. US counter-terrorism ops rely heavily on the likes of Palantir Technologies, a heavy user of and contributor to open-source software. Another example is the F-35 stealth fighter, which operates using millions of lines of code. A team of writers at War on the Rocks explores “Dependency Issues: Solving the World’s Open-Source Software Security Problem.” Solve it? Completely? Right, and there really is a tooth fairy. The article relates:
“The problem is that the open-source software supply chain can introduce unknown, possibly intentional, security weaknesses. One previous analysis of all publicly reported software supply chain compromises revealed that the majority of malicious attacks targeted open-source software. In other words, headline-grabbing software supply-chain attacks on proprietary software, like SolarWinds, actually constitute the minority of cases. As a result, stopping attacks is now difficult because of the immense complexity of the modern software dependency tree: components that depend on other components that depend on other components ad infinitum. Knowing what vulnerabilities are in your software is a full-time and nearly impossible job for software developers.”
So true. Still, writers John Speed Meyers, Zack Newman, Tom Pike, and Jacqueline Kazil sound optimistic as they continue:
“Fortunately, there is hope. We recommend three steps that software producers and government regulators can take to make open-source software more secure. First, producers and consumers should embrace software transparency, creating an auditable ecosystem where software is not simply mysterious blobs passed over a network connection. Second, software builders and consumers ought to adopt software integrity and analysis tools to enable informed supply chain risk management. Third, government reforms can help reduce the number and impact of open-source software compromises.”
The article describes each part of this plan in detail. It also does a good job explaining how we got so dependent on open-source software and describes ways hackers are able to leverage it. The writers submit that, by following these suggestions, entities both public and private can safely continue to benefit from open-source collaboration. If the ecosystem is made even a bit safer, we suppose that is better than nothing. After all, ditching open-source altogether seems nigh impossible at this point.
Cynthia Murrell, May 16, 2022
Open Source: Dietary Insights
May 5, 2022
One of the more benign news briefs about Russia these days concerns the eating habits of the country’s secret police. The Verge explains how delivery apps revealed Russian law enforcement’s food preferences: “Data Leak From Russian Delivery App Shows Dining Habits Of The Secret Police.” A massive data leak from Yandex Food, a large food delivery service in Russia, contained names, addresses, phone numbers, and delivery instructions related to the secret police.
Yandex Food is a subsidiary of the Russian search engine of the same name. The data leak occurred on March 1 and Yandex blamed it on the bad actions of one of its employees. The leak did not include users’ login information. The Roskomnadzor, the Russian government agency responsible for mass media, threatened Yandex with a 100,000 ruble fine and it also blocked a map containing citizen and secret police data.
Bellingcat researchers were investigating leads on the poisoning of Alexey Navalny, the Russian opposition leader. They searched the Yandex Food database collected from a prior investigation and discovered a person who was in contact with Russia’s Federal Security Service (FSB) to plan Navalny’s poisoning. The individual used his work email to register with Yandex Food. They also searched for phone numbers linked to Russia’s Main Intelligence Directorate (GRU). Bellingcat found interesting information in the leak:
“Bellingcat uncovered some valuable information by searching the database for specific addresses as well. When researchers looked for the GRU headquarters in Moscow, they found just four results — a potential sign that workers just don’t use the delivery app, or opt to order from restaurants within walking distance instead. When Bellingcat searched for FSB’s Special Operation Center in a Moscow suburb, however, it yielded 20 results. Several results contained interesting delivery instructions, warning drivers that the delivery location is a military base. One user told their driver “Go up to the three boom barriers near the blue booth and call. After the stop for bus 110 up to the end,” while another said ‘Closed territory. Go up to the checkpoint. Call [number] ten minutes before you arrive!’”
The most scandalous information leaked from the Yandex Food breach was information about Putin’s former mistress and their “suspected daughter.”
While it is hilarious to read about Russian law enforcement’s eating habits, it is alarming when the situation is applied to the United States. Imagine all of the information DoorDash, Grubhub, Uber Eats, and other delivery services collect on customers. There was a DoorDash data leak in 2019 that affected 4.9 million people and it was much larger than the Yandex Food leak.
Whitney Grace, May 5, 2022
OSINT for Amateurs
January 13, 2022
Today I had a New Year chat with a person whom I met at specialized services conferences. I relayed to my friend the news that Robert David Steele, whom I had known since 1986, died in the autumn of 2021. Steele, a former US government professional, was described as one of the people who pushed open source intelligence down the bobsled run to broad use in government entities. Was he the “father of OSINT”? I don’t know. He and I talked via voice and email each week for more than 30 years. Our conversations explored the value of open source intelligence and how to obtain it.
After the call I read “How to Find Anyone on the Internet for Free.”
Wow, shallow. Steele would have had sharp words for the article.
The suggestions are just okay. Plus it is clear that a lack of awareness about OSINT exists.
My suggestion is that anyone writing about this subject spend some time learning about OSINT. There are books from professionals like Steele as well as my CyberOSINT: Next Generation Information Access. Also, attending a virtual conference about OSINT offered by those who have a background in intelligence would be useful. Finally, there are numerous resources available from intelligence gathering organizations. Some of these “lists” include a description of each site, service, or system mentioned.
For me and my team’s part, we are working to create 60 second videos which we will make available on Instagram-type services. Each short profile of an OSINT resource will appear under the banner “OSINT Radar.” These will be high value OSINT resources. Some of this information will also be presented in a new series of short articles and videos that Meg Coker, a former senior telecommunications executive, and I will create. Look for these in LinkedIn and other online channels.
Hopefully the information from OSINT Radar and the Coker-Arnold collaboration will provide practical data about OSINT resources which are useful and effective. Free and OSINT can go together, but the hard reality is that an increasing number of OSINT resources charge for the information on offer.
OSINT, unfortunately, is getting more difficult to obtain. Examples include China’s cut offs of technology information and the loss of shipping and train information from Ukraine. And there are more choke points; for example, Iran and North Korea. This means that OSINT is likely to require more effort than previously. The mix of machine and human work is changing. Consequently more informed and substantive information about OSINT will be required in 2022. The OSINT for amateurs approach is an outdated game.
Coker and Arnold are playing a new game.
Stephen E Arnold, January 13, 2022
Cherche: A Neural Search Pipeline
January 10, 2022
For fans of open source search, Cherche is available. The GitHub write up states:
Cherche is meant to be used with small to medium sized corpora. Cherche’s main strength is its ability to build diverse and end-to-end pipelines.
The “neural search” module includes Elasticsearch. The programming team for Cherche consists of Raphaël Sourty and François-Paul Servant. Beyond Search has not fired up the system and run it against our test corpus. We did have in our files a paper called “Knowledge Base Embedding by Cooperative Knowledge Distillation.” That paper states:
Given a set of KBs, our proposed approach KDMKB, learns KB embeddings by mutually and jointly distilling knowledge within a dynamic teacher-student setting. Experimental results on two standard datasets show that knowledge distillation between KBs through entity and relation inference is actually observed. We also show that cooperative learning significantly outperforms the two proposed baselines, namely traditional and sequential distillation.
The idea is that instead of retrieving strings, broader tags (concepts and classifications) appear to provide an advantage, pushing “beyond” old school search.
Stephen E Arnold, January 10, 2022
Microsoft: A Legitimate Point about Good Enough
October 20, 2021
A post by Stefan Kanthak caught my attention. The reason was an assertion that highlights what may be the “good enough” approach to software. The article is “Defense in Depth — the Microsoft Way (Part 78): Completely Outdated, Vulnerable Open Source Component(s) Shipped with Windows 10&11.” I am in the ethical epicenter of the US not too far from some imposing buildings in Washington, DC. This means I have not been able to get one of my researchers to verify the information in the Stefan Kanthak post. I, therefore, want to point out that it may be horse feathers.
Here’s the point I noted in the write up:
Most obviously Microsoft’s processes are so bad that they can’t build a current version and have to ship ROTTEN software instead!
What’s “rotten”?
The super security conscious outfit is shipping outdated versions of two open source software components: Curl.exe and Tar.exe.
If true, Stefan Kanthak may have identified another example of the “good enough” approach to software. If not true, Microsoft is making sure its software is really super duper secure.
Stephen E Arnold, October 20, 2021
Quote to Note: An Open Source Developer Speaks Truth
August 10, 2021
Navigate to “Lessons Learned from 15 Years of SumatraPDF, an Open Source Windows App.” Please, read the article. It is excellent and applicable to commercial software as well.
Here’s the quote I circled and enhanced with an exclamation point:
… changing things takes effort and the path of least resistance is to do nothing.
Keep this statement in mind when Microsoft says it has enhanced the security of its updating method or when Google explains that it has improved its search algorithm.
The author of “Lessons Learned…” quotes Jeff Bezos (the cowboy hat wearing multi billionaire who sent interesting images which were stunning I have heard) as saying:
There will never be a time when users want bloated and slow apps so being small and fast is a permanent advantage.
I would add that moving data rapidly out of an AWS module evokes an Arnold corollary:
Speed costs more, often a lot more.
The essay is a good one, and I recommend that you read it, not just the quotes I reproduced in this positive comment about the content.
Stephen E Arnold, August 10, 2021