DuckDuckGo and Filtering

April 18, 2022

I read “DuckDuckGo Removes Pirate Websites from Search Results: No More YouTube-dl?” The main thrust of the story is:

The private search engine, DuckDuckGo, has decided to remove pirate websites from its official search results.

DuckDuckGo is a metasearch engine. These are systems which may do some focused original spidering, but may send a user’s query to partner indexes. Then the results are presented to the user (which may be a human or a software robot). Some metasearch systems like Vivisimo invested some intellectual cycles in de-duplicating the results. (A helpful rule of thumb is to assume a 50 to 70 percent overlap in results from one Web search system to another.) IBM bought Vivisimo, and I have to admit that I have no idea what happened to the de-duplicating technology because … IBM.
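For readers who want to see the plumbing, here is a minimal Python sketch of the send-the-query-out, collect, and de-duplicate idea. The two partner indexes, their canned result lists, and the URL normalization rule are assumptions made up for illustration. This is not how DuckDuckGo or Vivisimo actually work.

```python
from urllib.parse import urlsplit

# Hypothetical partner indexes: canned results stand in for real API calls.
PARTNER_RESULTS = {
    "index_a": [
        "https://example.com/pizza",
        "https://example.org/pizza-guide/",
        "https://example.net/slices",
    ],
    "index_b": [
        "http://www.example.com/pizza/",
        "https://example.org/pizza-guide",
        "https://example.edu/dough",
    ],
}


def normalize(url: str) -> str:
    """Reduce a URL to host plus path so trivially different forms match."""
    parts = urlsplit(url.lower())
    host = parts.netloc.removeprefix("www.")
    return host + parts.path.rstrip("/")


def merge_results(results_by_index: dict[str, list[str]]) -> list[str]:
    """Merge partner lists in order, keeping the first URL seen per normalized key."""
    seen = set()
    merged = []
    for urls in results_by_index.values():
        for url in urls:
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged


def overlap(a: list[str], b: list[str]) -> float:
    """Share of the results in list a that also appear in list b."""
    keys_a = {normalize(u) for u in a}
    keys_b = {normalize(u) for u in b}
    return len(keys_a & keys_b) / len(keys_a)


merged = merge_results(PARTNER_RESULTS)
print(len(merged), "unique results")
print(f"overlap: {overlap(PARTNER_RESULTS['index_a'], PARTNER_RESULTS['index_b']):.0%}")
```

In this toy data, two of the three results from the first index also turn up in the second, which lands inside the 50 to 70 percent rule of thumb. A real system would need a far less naive notion of "same result" than host plus path.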

There are more advanced metasearch systems. One example is Silobreaker, a system influenced by some Swedish wizards. The difference between a DuckDuckGo and an industrial strength system, in my opinion, is significant. Web search is an opaque service. Many behind-the-scenes actions take place, and some of the most important are not publicly disclosed in a way that makes sense to a person looking for pizza.

My questions: “Is DuckDuckGo actively filtering?” And, “Why did this take so long?” And, “Is DuckDuckGo virtue signaling after its privacy misstep, or is the company snagged in a content marketing bramble?”

I don’t know. My thoughts are:

  1. The editorial policies of metasearch systems should be disclosed; that is, we do this and we do that.
  2. Metasearch systems should disclose that many results are recycled and that the provenance, age, and accuracy of the results are unknown to the metasearch provider.
  3. Metasearch systems should make clear exactly what the benefits of using the metasearch system are and why the providers of some search results are not as beneficial to the user; for example, which result is an ad (explicit or implicit), sponsored, etc.

Will metasearch systems embrace some of these thoughts? Nah. Those who use “free” Web search systems are in a cloud of unknowing.

Stephen E Arnold, April 18, 2022

A Question about Robot Scientist Methods

April 13, 2022

I read “Robot Scientist Eve Finds That Less Than One Third of Scientific Results Are Reproducible.” The write up makes a big deal that Eve (he, her, it, them) examined in a semi automated way 12,000 research papers. From that set 74 were “found” to be super special. Now of the 74, 22 were “found” to be reproducible. I think I am supposed to say, “Wow, that’s amazing.”
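The arithmetic behind the headline is easy to check. A minimal sketch in Python, using only the three counts quoted in the write up (the variable names are mine):

```python
PAPERS_SCREENED = 12_000   # papers examined, per the write up
PAPERS_SELECTED = 74       # papers singled out from that set
PAPERS_REPRODUCED = 22     # papers "found" to be reproducible

selected_share = PAPERS_SELECTED / PAPERS_SCREENED
reproduced_share = PAPERS_REPRODUCED / PAPERS_SELECTED

print(f"Selected from the screened set: {selected_share:.2%}")
print(f"Reproducible among the selected: {reproduced_share:.1%}")
```

So "less than one third" means 22 of 74, or just under 30 percent of the selected papers. It does not mean one third of the 12,000.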

I am not ready to be amazed because one question arose:

Can Eve’s (her, her, it, them) results be replicated? What about papers about Shakespeare, what about high energy physics, and what about SAIL Snorkel papers?

Answers, anyone?

I have zero doubt that peer reviewed, often wild and crazy research results were from one of these categories:

  1. Statistics 101 filtered through the sampling, analytic, and shaping methods embraced by the researcher or researchers.
  2. A blend of some real life data with synthetic data generated by a method prized at a prestigious research university.
  3. A collection of disparate data smoothed until suitable for a senior researcher to output a useful research finding.

Why are data from researchers off the track? I believe the cause is the quest for grants, tenure, pay back to advisors, or just a desire to be famous at a conference attended by people who are into the arcane research field for which the studies are generated.

I want to point out that one third being sort of reproducible is a much better score than the data output from blue chip and mid tier consulting firms about mobile phone usage, cyber crime systems, and the number of computers sold in the last three month period. Much of that information is from the University of the Imagination. My hunch is that quite a few super duper scholars have a degree in marketing or maybe an MBA.

Stephen E Arnold, April 13, 2022

IBM: Still Buzzwording after All These Years

April 8, 2022

I read “IBM Unveils Industry’s First Quantum-Safe System, IBM z16.” I have no doubt the machine is capable and certainly better than the IBM dog to which I had access in 1962. I loved standing in line to sign up for a card punch machine. I loved standing in line to drop off my pathetic card deck. I loved getting the green bar paper and the deck back days later. What’s not to like? Today’s system is super duper. The write up explains that the “new” mainframe can prevent a quantum issue from a computer yet to be deployed as a functional encryption/decryption equipped quantum computer. That’s a pretty good wild and crazy idea: Protect against a future thing not yet in existence. Wow!

However the write up uses more buzzwords than I have seen in the patents filed by an outfit called Kyndi (if you don’t know, this is another enterprise search company with jargonized patent documents). Here’s a short list of some of the gems used to describe a mainframe. Keep in mind this is a mainframe, not a zippy Apple M chip powered gizmo. A mainframe. The words:

Quantum safe system. (Frankly I am not sure what a quantum computer will actually do once the cost, applications, cooling, etc. are figured out.)

Inference requests. (Years ago there was a Web search system called Inference. Today I am not exactly sure what an inference request is. Maybe a query requiring fancy predictive math? The IBM approach is to deliver latency optimized inferencing. I think this means latency reduced inference but maybe not. The number presented without any supporting data is 300 billion inference requests per day. Is this eight hours or 24 hours?)

Integrated on chip AI accelerator. (And what’s AI mean? Probably machine learning but the on chip AI is snappy. How big is this “artificial intelligence” conceptual umbrella? I assume IBM used the word “all” in a previous draft of this buzzwordy phrase.)

Near future threats. (After SolarWinds the threats are here and now and will persist because the attack surface is like the paved parking lots in Paramus, New Jersey. What’s near future? Like tomorrow?)

Cyber resiliency posture. (My hunch is that this means that executives at Microsoft struggling with Azure and Exchange security will sit up straight after 1,000 bad actors working for a nation state use off the shelf exploits to attack those Softies’ systems and software.)

CEX8S. (Is the acronym pronounced like the word for biological actions related to progeny creation or like the breakfast cereal one ate for breakfast? Has the acronym been influenced by Tesla’s cutesy auto labels: Model S, Model 3, and Model X, the one with long lasting performance?)

Quantum-safe cryptographic technology. (At least Kyndi spelled “quantum” this way: Quantom. IBM couldn’t be bothered to nose into Kyndi’s spelling innovation. IBM’s invocation approach may relate to the firm’s experiments with quantum computing which have allegedly ripped the crown of quantum supremacy from the scaled head of Googzilla.)
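About that 300 billion inference requests per day: whether the "day" is eight hours or 24 changes the implied rate by a factor of three. A back-of-the-envelope sketch in Python follows. The only input taken from the write up is the 300 billion figure. The two operating windows are my assumptions.

```python
REQUESTS_PER_DAY = 300_000_000_000  # figure quoted in the IBM write up


def requests_per_second(total_requests: int, hours_of_operation: float) -> float:
    """Average rate needed to serve total_requests within the given window."""
    return total_requests / (hours_of_operation * 3600)


# Assumed windows: a full 24 hour day versus an eight hour working day.
for hours in (24, 8):
    rate = requests_per_second(REQUESTS_PER_DAY, hours)
    print(f"{hours:>2} hour day: about {rate:,.0f} requests per second")
```

That works out to roughly 3.5 million requests per second around the clock, or about 10.4 million per second if the work is squeezed into eight hours. Neither number comes with supporting data.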

Wow. This is a mainframe, and it works pretty much like its predecessors. Why not emphasize compatibility, methods of exporting data to lesser systems, and exactly what legacy software will run on the beastie?

Not zippy enough? Certainly not for the IBM marketers. Quantum AI inferencing CEX8S are much zippier. Let’s ask the part of Watson that hasn’t been sold? Here’s the answer I think Watson will output:

IBM deliberately misclassified mainframe sales to enrich execs, lawsuit claims

That seems like a Watson like answer to me.

Stephen E Arnold, April 8, 2022

Do The Google AI Claims Grow Like a Pinocchio Body Part?

April 6, 2022

“Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance” is a variant of the Google quantum supremacy announcement. Bigger, better, faster, more powerful, able to leap problems with a single tap on the Enter key. The graphic in the Google AI Blog post does grow. Didn’t Carlo Collodi cook up a dummy? The chief feature, other than teaching some how not to lie, was that the marketing was handled by Walt Disney. Like IBM’s humorous announcement that a mainframe could defeat a quantum computer’s ability to crack encryption, a claim pointed at something not invented yet is interesting. Are those marketing people at Google and IBM mentally enervated by swigs of Five Hour Energy?

Like a certain fictional character’s nose and the anigif in the blog post, the claims continue to grow.

[Image: animated graphic from the Google AI Blog post]

I looked at this graphic closely. I noted a few omissions; for example:

  1. A mechanism to report the incidence of outliers or exceptions between the baseline system and the state of the system after iterating over a period of a month
  2. Any reference to bias identification and amelioration. This is Dr. Timnit Gebru territory, and this landscape is one that Google appears to ignore, at least in public. In private negotiations and legal chambers, maybe the Google addresses the baked in biases? Maybe not?
  3. Any reference to the handling of images, content, videos that are related to sexual harassment; for instance, allegations about personnel issues at Google and DeepMind themselves?
  4. Data about the accuracy of the outputs? Are we in 95 percentile territory or close enough for horse shoes and ad matching?

The write up uses a number of buzzwords, some Google jargon, and quite a few links to other Google documents and experts at Microsoft and NVidia. I am convinced. I believe everything I read on the Internet and Google’s blogs.

Three observations:

First, what’s at stake in my opinion is dominance if possible of off the shelf smart methods. Consolidation is the name of the game, and Google wants to beat out Amazon, Microsoft, assorted China backed outfits, and any other challengers who want to go a different direction. Not every company wants to SAIL down a certain flow of methods.

Second, Google is — bless its single revenue stream — embracing Madison Avenue techniques to convince people that it is the Big Dog in smart methods: New, improved, money back guarantee, and free trial sell toothpaste. Why not Google AI?

Third, Google — despite the alleged monopoly position — is struggling with the what’s next? Legal hassles, management practices, competition from nuisance companies like Amazon, competition for technical talent, hard to control costs — These are real issues at the Alphabet Google YouTube construct.

At the end of a Silicon Valley day, some in Mountain View see Google as a one trick pony. It seems far fetched, but it looks as if Steve Ballmer may have been spot on with that one-trick pony metaphor. And there is Pinocchio’s nose.

Stephen E Arnold, April 6, 2022

PR or Reality? Only the Cyber Firms Know the Answer

April 6, 2022

Cyber crimes are on the rise. Businesses and individuals are the targets of malware bad actors. IT Online details how cyber security firms handle attacks: “What Happens Inside A Cybercrime War-Room?” As a major business player in Africa, South Africa fends off many types of cyber attacks: coin miner modules, viruses downloaded with bad software, self-spreading crypto mining malware, and ransomware.

The good news about catching cyber criminals is that white hat experts know how their counterparts work and can use technology like automation and machine learning against them. Carlo Bolzonello is the country manager for Trellix’s South Africa branch. He said that cyber crime organizations are run like regular businesses, except their job is to locate and target IT vulnerable environments. Once the bad business has the victim in its crosshairs, the bad actors exploit it for money or other assets for exploration or resale.

Bolzonello continued to explain that while it is important to understand how the enemy works, it is key that organizations have a security operations center armed with various tools that can pull information about possible threats into one dashboard:

“That single dashboard can show where a threat has emerged, and where it has spread to, so that action can be taken, immediately. It can reveal whether ransomware has gained access via a “recruitment” email sent to executives, whether a “living off the land” binary has taken hold via a download of an illicit copy of a movie, or whether a coin miner module has inserted itself via pirated software. Having this information to hand helps the SOC design and implement a quick and effective response, to stop the attack spreading further, and to prevent it costing money for people and businesses.”

Having a centralized dashboard allows organizations to respond more quickly and keep their enemies in check. Black hat cyber organizations actually might have a reverse of a security operations center that allows them to locate vulnerabilities. PR or reality? A bit of both perhaps?

Whitney Grace, April 6, 2022

Anti-Drone Measures: A Bit Like Enterprise Cyber Security?

April 5, 2022

The big news is that whatever anti-drone technology is being used by “the West”, it is not working at 100 percent efficiency. The Wall Street Journal published, on April Fool’s Day, the story “Drones Evade West’s Air Defense.” I could not spot the exact write up in my online resources, but this particular item is in the dead tree edition. If you go to an office which has humanoids who subscribe to the hard copy, you can check out the story on Page A-9. Story locations vary by edition because… advertising.

There is an online version with the jazzy title “NATO Investigates How Russian and Ukrainian Drones Bypassed Europe’s Air Defense System.” You might be able to view the article at this link, but you probably will either have to pay or see a cheerful 404 error. These folks are in the money business. News — mostly like the Ford 150 — is cargo, and it has a cult I believe.

The point of both write ups is that both Russian and Ukrainian drones have not been interdicted by anti-drone systems. How did those in neighboring countries know that Russian and Ukrainian drones were entering their air space and zipping through their anti-droned borders?

Drones crashed. People walked up and noted, “Okay, explosives on that one.” Another person spots a drone in a field and says, “Looks like this one has cameras, not bombs.”

Countries whose borders have been subject to drone incursions include Romania, Croatia, and Poland. There may be others, but some of the countries have areas which are difficult to reach, even for an Eva Zu Beck type of person.

NATO is looking into the anti drone measures. That makes sense, since most vendors of military grade anti drone systems have PowerPoint decks which make it clear, “Our system works.” Should I name vendors? Nah, remember Ubiquiti and Mr. Krebs. (That sounds like a children’s program on a PBS station to me.) Slide decks become the reality until a drone with explosives plops down near a pre-school.

My immediate reaction to these Wall Street Journal stories was, “Maybe the anti-drone defense vendors operate with the same reliability as the vendors of enterprise cyber security systems?” The PowerPoint decks promise the same efficacy. There are even private YouTube videos which show drone defense vendors’ systems EMPing, blasting, or just knocking those evil constructs out of the sky. (Check out Anduril’s offering in this collision centric method, please.)

For several years I followed drone technology for an investment outfit. I learned that the information about the drones described devices best suited for science fiction. I read patents which were not in the fiction section of my local library. I watched YouTube videos with nifty DaVinci Fusion video effects.

The reality?

NATO is now investigating.

My point is that it is easy to sell certain government types advanced technology with PowerPoints and slick videos. This generalization applies to hardware and to software cyber systems.

I don’t need to invoke the SolarWinds’ misstep. I don’t need to recycle the information in the Wall Street Journal stories or the somewhat unusual content in Perun’s drone video.

Is procurement to blame? Partially. I think that Parkinson’s Law (1958) gets closer to the truth, particularly when combined with the observations in the Peter Principle (1971). Universals are at work with the assistance of fast talkers, PowerPoints, and video “proof”.

Stephen E Arnold, April 4, 2022

Google: The Quantum Supremacy Turtling

April 1, 2022

Okay, April Fool’s Day.

“Google Wants to Win the Quantum Computing Race by Being the Tortoise, Not the Hare” explains that the quantum supremacy “winner” which captured “time crystals” has a new angle:

it’s clear that Google — or, to be more accurate, its parent company Alphabet — has its sights set on being the world’s premiere quantum computing organization.

Machines? Nah, think cloud, gentle reader. Google has it together, but the non Googley may struggle to get the picture. The write up says:

Parent company Alphabet recently starbursted its SandboxAQ division into its own company, now a Google sibling. It’s unclear exactly what SandboxAQ intends to do now that it’s spun out, but it’s positioned as a quantum-and-AI-as-a-service company. We expect it’ll begin servicing business clients in partnership with Google in the very near-term.

But? The write up says:

We can safely assume we haven’t seen the last of Google’s quantum computing research breakthroughs, and that tells us we could very well be living in the moments right before the slow-and-steady tortoise starts to make up ground on the speedy hare.

Maybe turtle? An ectotherm like Googzilla? Eye glass frames with a relevant Google product review? So many questions.

Stephen E Arnold, April 1, 2022

Marketing Is Getting Harder And Experts Sell Whatever

March 30, 2022

Selling in a digital landscape is harder now than at any time before. In order to succeed, sellers require marketing to get attention for their goods and services. Marketing has become complex, and ReadWrite details how in “Marketing Is Getting More Difficult. Here’s Why.” The basics of marketing are related to the amount of time, money, and effort people place into their campaigns. When marketing becomes more difficult, it consumes more time and resources. It is also harder for beginners to pick up concepts, and it is harder for anyone to stand out from the competition.

People are bombarded with ads everywhere, especially on the Internet. They are often ignored and mostly annoying. Old marketing techniques do not work anymore. Modern consumers prefer organic, personalized content that tells a story. Technology is another factor that makes marketing harder, including the expense, learning said technology, misusing it, and misleading metrics.

It is not impossible to be successful, only harder, and it is good to remain adaptable, diversify strategies, and stay agile to avoid competition:

“Marketing is getting more difficult. That much is certain. But you can at least find solace in the fact that it’s not just getting harder for you; it’s getting harder for all of us. We are entering a new era of marketing and advertising, and that’s not necessarily a bad thing. It just means you need to remain adaptable and attentive if you want your tactics to generate a meaningful return.”

The article gives fresh, positive advice. Fortunately it does not lead into a magic wand sales pitch for a specific marketing service. Experience, research, and giving campaigns a try are how to get marketing done.

Agencies are ready to help. For a price. And it is easy… to send invoices that is.

Whitney Grace, March 30, 2022

Automated Censorship: What Could Go CENSORED with the CENSORED System?

March 28, 2022

Automated censorship: Silent, 24×7, no personnel hassles, no vacations, no breakdowns, and no worries.

Okay, a few may have worries, but these are very small, almost microscopic, worries. The reason? If one can’t find the information, then whatever the information discusses does not exist for many people. That’s the elegance of censorship. A void. No pushback. One does not know.

“How AI Is Creating a Safer Online World” does not worry about eliminating information. The argument is “more safety.” Who can disagree? Smart people understand that no information yields safety, right?

The write up states:

By using machine learning algorithms to identify and categorize content, companies can identify unsafe content as soon as it is created, instead of waiting hours or days for human review, thereby reducing the number of people exposed to unsafe content.

A net positive. The write up assumes that safe content is good. Smart software can recognize unsafe content. The AI can generate data voids which are safe.

The write up does acknowledge that there may be a tiny, probably almost insignificant issue. The article explains with brilliant prose:

Despite its promise, AI-based content moderation faces many challenges. One is that these systems often mistakenly flag safe content as unsafe, which can have serious consequences.

Do we need examples? Sure, let’s point out that the old chestnuts about Covid and politics are presented to close the issue. How are those examples playing out?

How does the write up conclude? Logic that would knock Stephen Toulmin for a loop? A content marketing super play that will make the author and publisher drown in fame?

Nah, just jabber like this:

AI-assisted content moderation isn’t a perfect solution, but it’s a valuable tool that can help companies keep their platforms safe and free from harm. With the increasing use of AI, we can hope for a future where the online world is a safer place for all.

Does a “safer place” suggest I will be spared essays like this in the future? Oh, oh. Censorship practiced by a human: Ignoring content hoo hah. The original word I chose to characterize the analysis has been CENSORED.

Stephen E Arnold, March 28, 2022

Google Management: Fame and the F1 Crowd

March 23, 2022

With a grip on online advertising, what better way to cement exciting weekends than hanging with the in crowd at chi chi F1 venues? Will the happy Google colors find their way onto the next McLaren road rocket? Are those McLaren confections climate friendly?

What does the Google team contribute to McLaren? “McLaren Racing Announces Multi-Year Partnership with Google” says:

McLaren will use 5G-enabled Android devices and Chrome browser across its operations during practice sessions, qualifying and races to support the drivers and team, with the goal of improving on-track performance.

According to the write up, a McLaren wheel will feature a Google logo. Perfect for the Google store or the ultimate booth give-away.

But those parties? The gear heads? Will Google executives abandon their daily drivers for a McLaren?

Nope, this is a high school science club interest which has been observed at Northern Light and Autonomy years ago.

Stephen E Arnold, March 23, 2022
