Bing: Censoring Because China Buys Lots of MSFT Products and Services
May 27, 2022
If you think restrictions in one country do not extend to those on the other side of the world, think again. Motherboard reports, “Microsoft’s Chinese Bing Censorship Impacts United States Too, Researchers Say.” We do not find this to be a surprise since big outfits do what’s necessary to sell in a nation state and reduce their costs of operation. Do the filtering once, then do it for everyone. Efficient. Writer Joseph Cox tells us:
“Microsoft-owned search engine Bing censors content that is politically sensitive to the Chinese government for users who are using the search engine from the United States, researchers claim in a new report. The research shows how censorship efforts in one country can bleed over and impact users in others. The findings come after Bing censored image searches for the infamous ‘tank man‘ even from the United States last June. At the time, Microsoft blamed that issue on an ‘accidental human error.’ The new research indicates more widespread censorship of politically sensitive searches, and especially names of certain people. ‘Using statistical techniques, we preclude politically sensitive Chinese names in the United States being censored purely through random chance. Rather, their censorship must be the result of a process disproportionately targeting names which are politically sensitive in China,’ the report, written by researchers from the University of Toronto’s Munk School of Global Affairs Citizen Lab, reads. Microsoft operates Bing in China with limitations in place so it falls in line with Chinese law. Much of that involves heavy censorship around certain topics, events, and people, often resulting in those areas being undiscoverable via Bing searches conducted from within China.”
And from without, apparently. Specifically, the study looked at Bing’s auto-complete function. See the write-up for its methodology and findings. For its part Microsoft insists not seeing a relevant autosuggestion is “largely” based on each query and local user behavior and “does not mean it has been blocked.” We note that is not quite an outright denial. Globalization! How is that working out?
Cynthia Murrell, May 27, 2022
Online Platforms Fail to Prevent Circulation of Buffalo Shooting Video
May 27, 2022
Once again, the Internet proves that even the vilest content cannot be contained. It is a fact racist terrorists have learned to exploit to spread fear, hatred, and inspiration for more violence. The Washington Post reports, “Only 22 Saw the Buffalo Shooting Live. Millions Have Seen it Since.” Writers Drew Harwell and Will Oremus tell us:
“When the Buffalo gunman broadcast the shooting in real time Saturday on the live-streaming site Twitch, only 22 people were watching, and company officials said they’d removed it with remarkable speed — within two minutes of the first gunshots. But all it took was for one viewer to save a copy and redistribute it online. A jumble of video-hosting sites, extremist message boards and some of Silicon Valley’s biggest names did the rest, ensuring millions of people would view the video. One copy made its way onto the little-known video site Streamable, where, thanks to links posted on much larger sites, it was viewed more than 3 million times before it was removed. One link to that copy on Facebook received more than 500 comments and 46,000 shares; Facebook did not remove it for more than 10 hours. ‘Terrorism is theater,’ said Emerson T. Brooking, senior fellow at the Atlantic Council’s Digital Forensic Research Lab, which researches how information spreads online. ‘The purpose of terrorism is always to reach the greatest number of people possible with the most horrific or spectacular attack that you can perform.'”
Several ideas have been floated to combat the spread of this grisly propaganda. Some are trying to wield digital fingerprint technology for good by using it to filter content. Then there was the “hate clusters” concept. Major tech companies even set up a Global Internet Forum to Counter Terrorism, a system meant to automatically block terrorist videos that was based on tech originally used to block footage of child sexual abuse. Clearly, none of these measures are working. Will the answer be found, and implemented, before the next attack?
Cynthia Murrell, May 27, 2022
Cybersecurity: Are the Gloves Off?
May 26, 2022
Cybersecurity has been a magnet for investments. Threats are everywhere! Threats are increasing! Ransomware destroys businesses and yours will be next? One thousand bad actors attack in the SolarWinds’ misstep, right? The sky is falling!
Frightened yet?
Changes are evident. Let me offer two examples:
Lacework
The cybersecurity outfit Lacework has just allowed about 20 percent of their workforce to find their future elsewhere. Uber, perhaps? Piece work via Fiverr.com? A for-fee blog on Substack, the blog platform with real journalists, experts, pundits, wizards, etc.?
“Cloud Security Firm Lacework Lays Off 20% of Staff” reports:
A well-funded startup in the cybersecurity industry, Lacework, has become the latest tech firm to disclose a major round of layoffs amid fears of a broader economic slowdown. In a statement provided to Protocol, Lacework confirmed that the layoffs impacted 20% of its employees, in connection with what it called a “decision to restructure our business.”
Is the number of future hunters let loose in the datasphere accurate? The article points out that Lacework used the outstanding Twitter to say 20 percent was a “significant overestimate.” Whom does one believe? In today’s world, I have to hold two contradictory statements in my mind because I sure as heck don’t know why a hot sector with a well-funded company is making more parking available and reducing demand for the ping pong table.
Cybersecurity Does Not Work
The second example: I noted an advertisement in my dead tree version of the Wall Street Journal. Here’s the ad from the May 26, 2022, publication:
The text of the Tanium advertisement declares that cybersecurity systems fail their customers. The idea is that there are many cybersecurity vendors, and each offers pretty good barriers to a couple of threats. The customers of these firms’ products have to buy multiple solutions. The fix? License Tanium, a “best place to work.”
Stepping Back
The first example provides a hint that certain companies in the cybersecurity market are taking steps to reduce costs. Nothing works quite as well as winnowing the herd. My hunch is that Lacework is like a priest in ancient Greece poking at a sacrificial lamb and declaring, “Prepare for the pestilence and the coming famine. Have a good day.”
The second example may signal that the policy of cybersecurity vendors not criticizing one another is over. Tanium is criticizing a pride of cyber lions. My hunch is that the gloves will be coming off. Saying that no other vendor can deal with cyber threats in the Wall Street Journal is a couple of levels above making snarky comments in a security trade show booth.
Net Net
Bad actors can add some of the Lacework castoffs to their virtual crimeware teams hiding behind the benign monikers of front companies in Greece and Italy, among other respected countries. The Tanium ad copy offers proof that existing cyber defense may have some gaps. The information will encourage bad actors to keep chipping away at juicy online targets. Change has arrived.
Stephen E Arnold, May 26, 2022
Google: Quantumly Supreme in PR That Is
May 26, 2022
It looks like Google’s PR department is working overtime. An article at The Independent declares: “‘The Game Is Over’: Google’s DeepMind Says It Is on Verge of Achieving Human-Level AI.” The lofty claim comes from DeepMind’s lead researcher, but is he getting ahead of himself? Writer Anthony Cuthbertson reveals:
“Dr Nando de Freitas said ‘the game is over’ in the decades-long quest to realise artificial general intelligence (AGI) after DeepMind unveiled an AI system capable of completing a wide range of complex tasks, from stacking blocks to writing poetry. Described as a ‘generalist agent’, DeepMind’s new Gato AI needs to just be scaled up in order to create an AI capable of rivaling human intelligence, Dr de Freitas said. Responding to an opinion piece written in The Next Web that claimed ‘humans will never achieve AGI’, DeepMind’s research director wrote that it was his opinion that such an outcome is an inevitability. ‘It’s all about scale now! The Game is Over!’ he wrote on Twitter. ‘It’s all about making these models bigger, safer, compute efficient, faster at sampling, smarter memory, more modalities, innovative data, on/offline… Solving these challenges is what will deliver AGI.’ When asked by machine learning researcher Alex Dimikas how far he believed the Gato AI was from passing a real Turing test – a measure of computer intelligence that requires a human to be unable to distinguish a machine from another human – Dr de Freitas replied: ‘Far still.'”
So… not quite there yet after all. A reality we are keenly aware of as we ponder standing in front of a Google self-driving car surrounded by traffic cones and temporary lane dividers. Or, on a less perilous but annoying note, consider those oh-so-relevant YouTube ads. But sure, general AI is right around the corner.
Cynthia Murrell, May 26, 2022
Google Can Now Auto Summarize Content With AI: Accurate? Probably Close Enough for Horseshoes
May 26, 2022
Technology has not yet advanced to the point where computers can write decent novels or hold detailed conversations with humans, but AI is getting better at summarizing, translating foreign languages, text to speech, and natural language processing. Google’s I/O 2022 conference was still low-key because it was not in person. Google announced the Pixel 6a and Pixel Watch, but for some, it was not as exciting as the new auto summarization announcement. Android Police details the low-grade but key advancement in: “Auto-Generated Summaries Might Be The Most Important I/O 2022 Announcement.”
Google spent years advancing its AI through machine learning to understand human sentiment, content, and meaning. We see the development of this technology through digital assistants, but when it comes to reading and writing, the technology is still in the development phase:
“The lineage of Google AI summaries reaches back to the “transformer” model devised by its researchers in 2017, which led to language processing and generation systems like GPT-2 that are capable of self-supervised pre-training. That means much of the early AI training happens without human oversight, so it’s faster and more adaptable than older methods like recurrent neural networks. More recently, Google refined these techniques for the Pegasus text summarization model in 2020.”
Transformers paired with Pegasus provided AI with data in the form of news articles and documents to train it on how to fill in information gaps. It took years to achieve the current model that will debut in Google Docs, but the AI got smarter and smarter as it processed more data 24/7.
Google wants to use its auto summarization for Google Chat, Google Meet, and other applications. The tool has many uses across Google’s technology domain. However, it is kind of scary imagining Google summarizing everything without checks and balances.
Whitney Grace, May 26, 2022
Near: A Complement to ClearView AI?
May 26, 2022
“Data Intelligence Startup Near, with 1.6B anonymized User IDs, Lists on NASDAQ via SPAC at a $1B Market Cap; Raises $100M” is an interesting story. On one hand, in the midst of some financial headwinds, the outfit Near is a unicorn. That’s exciting for some. The most significant part of the short item is this passage: Near offers
anonymised, location-based profiles of users based on a trove of information that Near sources and then merges from phones, data partners, carriers and its customers. It claims the database has been built “with privacy by design.”
The word merging as in “merging data from different sources” is not jargony enough. The Near write up uses the term “stitching” as in “threads which hold the parts of a football together.” I prefer the term “federating” as in “federating data.”
The idea is a good one. Take information from different sources, index it (assign tags today, of course) and group information about a person under that entity’s “name.” This is a useful workflow, and my hunch is that the system works best for individuals leaving digital footprints and crumbs of ones and zeros behind as these “entities” go about their business.
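The workflow in the paragraph above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how Near or anyone else actually does it: the source names (`phone_sdk`, `carrier`), the `id` field, and the `federate` function are all hypothetical, and the sketch assumes every source already shares a common identifier, which is the hard part in practice.

```python
from collections import defaultdict


def federate(sources):
    """Group records from several sources under one profile per entity id.

    `sources` maps a (hypothetical) source name to a list of dict records.
    Each record is assumed to carry an "id" key identifying the entity.
    """
    profiles = defaultdict(lambda: {"sources": set(), "attributes": {}})
    for source_name, records in sources.items():
        for record in records:
            profile = profiles[record["id"]]
            # Tag the profile with where the information came from.
            profile["sources"].add(source_name)
            for field, value in record.items():
                if field != "id":
                    # The first source to report a field wins; later values are ignored.
                    profile["attributes"].setdefault(field, value)
    return dict(profiles)


# Made-up example data, for illustration only.
sources = {
    "phone_sdk": [{"id": "u1", "city": "Austin"}],
    "carrier": [{"id": "u1", "plan": "prepaid"}, {"id": "u2", "city": "Boise"}],
}
profiles = federate(sources)
```

The "first source wins" rule is just one possible choice for conflicting values; a real system would need a policy for deciding which source to trust.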
The successful merging and profiling will give Near a competitive advantage. As ClearView and many other companies have shown, scraping and licensing commercial datasets can produce a valuable data asset.
On the downside, as ClearView has learned as it explained its business to legal eagles, some concerns for privacy can arise. Assurances of privacy have created some issues for firms performing similar work for government agencies. Law enforcement and intelligence professionals are likely to show some interest in Near’s products and services.
Successfully navigating marketing to commercial outfits and selling to government agencies is like sailing into an unfamiliar port with a very large boat.
Kudos to Near for its funding. Now it will be interesting to watch the firm’s management walk the marketing tightrope over the Niagara Falls of cash flow as legal eagles circle.
Stephen E Arnold, May 26, 2022
DuckDuckGo: A Duck May Be Plucked
May 25, 2022
Metasearch engines are not understood by most Internet users. Here’s my simplified take: A company thinks it can add value to the results output from an ad-supported search engine. Maybe the search engine is a for-fee outfit? Either way, the metasearch system gets the okay to send queries and get results. The results stream back to the metasearch outfit and the value-adding takes place.
One of the better metasearch systems was the pre-IBM Vivisimo. This outfit sent out queries to an ad-supported search engine, accepted the results, and then clustered them. The results appeared to the Vivisimo user as a results list with some folders in a panel. The idea was that the user could scan the folders and the results list. The user could decide to click on a folder and see what results it contained or just click on a link. The magic, as I understood it, was that the clustering took place in near real time. Plus, the original pre-IBM Vivisimo system could send the user’s query to multiple Web search engines. The results from each search system would be de-duplicated. An interesting factoid from the 2000s is that search systems returned overlapping results 70 percent or more of the time. Dumping the duplicates was helpful. There were other interesting metasearch systems as well, but I am just using Vivisimo as an example of a pretty good one.
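The de-duplication step is easy to picture in code. What follows is a rough sketch, assuming each engine hands back a ranked list of URLs; it is not Vivisimo's method, which I have not seen. The `normalize` and `merge_results` functions and the example URLs are hypothetical, and the clustering step is left out entirely.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Build a comparison key: lowercase host without 'www.', path without a trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge_results(result_lists):
    """Interleave ranked lists from several engines, keeping the first copy of each URL.

    Returns the merged list and the number of duplicates that were dropped.
    """
    seen = set()
    merged = []
    total = 0
    longest = max((len(results) for results in result_lists), default=0)
    for rank in range(longest):
        for results in result_lists:
            if rank < len(results):
                total += 1
                key = normalize(results[rank])
                if key not in seen:
                    seen.add(key)
                    merged.append(results[rank])
    return merged, total - len(merged)


# Made-up result lists from two hypothetical engines.
engine_a = ["https://www.example.com/a/", "https://example.org/b"]
engine_b = ["http://example.com/a", "https://example.net/c"]
merged, dropped = merge_results([engine_a, engine_b])
```

Interleaving by rank keeps each engine's top hits near the top of the merged list. The normalization here ignores the scheme and query string, which is a simplification that would merge some pages a real system should keep apart.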
Privacy, like security, is a tricky concept to explain.
Using privacy to sell a free Web search system raises a number of questions; for example:
- What does privacy mean in the specific context of the metasearch engine?
- Where is the money coming from to keep the lights on at the metasearch outfit?
- What about log files?
- What about legal orders to reveal data about users?
- What’s the quid pro quo with the search engine or engines whose results the metasearch system uses?
- What part of the search chain captures data, inserts trackers, bugs, cookies, etc. into the user’s query?
None of these questions catch the attention of the real news folks, nor do most users know what answering the questions requires. The metasearch engines typically do not become chatty Cathies when someone like me shows up to gather information about metasearch systems. I recall the nervousness of the New York City wizard who cooked up Ixquick and the evasiveness of the owner of the Millionshort services.
Now we come to the notion that a duck can be plucked. My hunch is that plucking a duck is a messy affair for both duck and duck plucker.
“DuckDuckGo Browser Allows Microsoft Trackers Due to Search Agreement” presents information which appears to suggest that the “privacy” oriented DuckDuckGo metasearch system is not so private as some believed. The cited article states:
The privacy-focused DuckDuckGo browser purposely allows Microsoft trackers on third-party sites due to an agreement in their syndicated search content contract between the two companies.
You can read the cited article to get more insight into the assertion that DuckDuckGo has been plucked in the feathered hole of privacy.
Am I surprised? No. Search is without a doubt one of the most remarkable business segments for soft fraud. How do I know? My partners and I created The Point in 1994, and even though you don’t remember it, I sure remember what I learned about finding information online. Lycos (CMGI) bought our curated search business, and I wrote several books about search. You know what? No one wants to think about search and soft fraud. Maybe more people should?
Net net: Free comes at a cost. One does not know what one does not know.
Stephen E Arnold, May 25, 2022
IBM: Fueling the Quantum Computing PR Push
May 25, 2022
I wonder if your mom wants a quantum computer in her tablet. Who wants a quantum computer? Many people, and IBM wants to deliver. The issue is that the fungible quantum computer is a bit of a specialty item. IBM wants to be perceived as the Big Dog in the sector.
“IBM Plans to Deliver 4 000+ Qubit System” explains that IBM will produce a quantum computing Big Blue Great Dane of a system. (Did you know that there are more than a dozen dogs which have blue coats? In addition to the Great Dane, there is the blue Chihuahua and the blue Lacy.) The important word to me is “plans.” The 4,000 qubit giant is not ready for the quantum computer market yet. But it is coming. Soon. And the road map will be updated in 2023.
The write up says:
IBM has announced the expansion of its roadmap for achieving large-scale, practical quantum computing. This roadmap details plans for new modular architectures and networking that will allow IBM quantum systems to have larger qubit-counts – up to hundreds of thousands of qubits.
Note that the numerical leap is from 4,000 qubits to hundreds of thousands of qubits.
And IBM’s innovations are not Google-style pronouncements of quantum supremacy. IBM, according to the article:
will leverage three pillars: robust and scalable quantum hardware; cutting-edge quantum software to orchestrate and enable accessible and powerful quantum programs; and a broad global ecosystem of quantum-ready organizations and communities.
In addition, IBM will roll out IBM Condor, not a Big Blue dog but a condor, which is a Big Bird. I noted this statement:
On the hardware front, IBM intends to introduce IBM Condor, the world’s first universal quantum processor with over 1 000 qubits.
The Big Blue Great Dane and the Big Bird are part of the road map. Is IBM Watson available to answer a question about this forward leaning quantum computing PR announcement about “plans”?
Stephen E Arnold, May 25, 2022
Google: Embrace, Control, and Sell Advertising?
May 25, 2022
Google claims to support open source technology and contributes some of its code libraries and projects, except for the black box search algorithm, to the public. The Verge shares Google’s new open source initiative in the article, “Google Will Start Distributing A Security-Vetted Collection Of Open-Source Software Libraries.” Google wants to add its branding and stamp of approval to open source software.
Google wants to control its portion of the open source community by curating and distributing security-vetted software to Google Cloud customers. The new initiative is called Assured Open Source Software. Andy Chang is a Google Cloud Group Product Manager for Security and Privacy, and he said there were challenges in securing open source software:
“‘There has been an increasing awareness in the developer community, enterprises, and governments of software supply chain risks,’ Chang wrote, citing last year’s major log4j vulnerability as an example. ‘Google continues to be one of the largest maintainers, contributors, and users of open source and is deeply involved in helping make the open source software ecosystem more secure.’”
The Assured Open Source Software will allow Google Cloud customers to use the same software auditing process as Alphabet Inc. The open source packages are the same ones the company uses and are managed by regular scanning and vulnerability analysis. Currently, Google monitors 550 libraries on GitHub, and they can be downloaded independently of Google. These same libraries will be available via Google Cloud later in 2022.
Google’s Assured Open Source Software is part of an industry-wide push to secure the open source software supply chain. The Biden administration supports the endeavor.
Open source does need to be secure, but is putting a tech giant, notorious for collecting and selling user data, in charge the right way to go? Sure it is, it is Google approved!
Whitney Grace, May 25, 2022
Synthetic Data: Cheap, Like Fast Food
May 25, 2022
Fabricated data may well solve some of the privacy issues around healthcare-related machine learning, but what new problems might it create? The Wall Street Journal examines the technology in, “Anthem Looks to Fuel AI Efforts with Petabytes of Synthetic Data.” Reporter Isabelle Bousquette informs us Anthem CIO Anil Bhatt has teamed up with Google Cloud to build the synthetic data platform. Interesting choice, considering the health insurance company has been using AWS since 2017.
The article points out synthetic data can refer to either anonymized personal information or entirely fabricated data. Anthem’s effort involves the second type. Bousquette cites Bhatt as well as AI and automation expert Ritu Jyoti as she writes:
“Anthem said the synthetic data will be used to validate and train AI algorithms that identify things like fraudulent claims or abnormalities in a person’s health records, and those AI algorithms will then be able to run on real-world member data. Anthem already uses AI algorithms to search for fraud and abuse in insurance claims, but the new synthetic data platform will allow it to scale. Personalizing care for members and running AI algorithms that identify when they may require medical intervention is a more long-term goal, said Mr. Bhatt. In addition to alleviating privacy concerns, Ms. Jyoti said another advantage of synthetic data is that it can reduce biases that exist in real-world data sets. That said, she added, you can also end up with data sets that are worse than real-world ones. ‘The variation of the data is going to be very, very important,’ said Mr. Bhatt, adding that he believes the variation in the synthetic data will ultimately be better than the company’s real-world data sets.”
The article notes the use of synthetic data is on the rise. Increasing privacy and reducing bias both sound great, but that bit about potentially worse data sets is concerning. Bhatt’s assurance is pleasant enough, but how will we know whether his confidence pans out? Big corporations are not exactly known for their transparency.
Cynthia Murrell, May 25, 2022