New Search Platform Focuses on Protecting Intellectual Property

January 21, 2022

Here is a startup offering a new search engine, now in beta. Huski uses AI to help companies big and small reveal anyone infringing on their intellectual property, be it text or images. It also promises solutions for title optimization and even legal counsel. The platform was developed by a team of startup engineers and intellectual property litigation pros who say they want to support innovative businesses from the planning stage through protection and monitoring. The Technology page describes how the platform works:

“* Image Recognition: Our deep learning-based image recognition algorithm scans millions of product listings online to quickly and accurately find potentially infringing listings with images containing the protected product.

* Natural Language Processing: Our machine learning algorithm detects infringements based on listing information such as price, product description, and customer reviews, while simultaneously improving its accuracy based on patterns it finds among confirmed infringements.

* Largest Knowledge Graph in the Field: Our knowledge graph connects entities such as products, trademarks, and lawsuits in an expansive network. Our AI systems gather data across the web 24/7 so that you can easily base decisions on the most up-to-date information.

* AI-Powered Smart Insights: What does it mean to your brands and listings when a new trademark pops out? How about when a new infringement case pops out? We’ll help you discover the related insights that you may never know otherwise.

* Big Data: All of the above intelligence is being derived from the data universe of the eCommerce, intellectual property, and trademark litigation. Our data engine is the biggest ‘black hole’ in that universe.”
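Huski does not publish its implementation, so here is a minimal, purely illustrative sketch of the image-matching idea described above: compare a cheap perceptual hash of each marketplace listing image against hashes of the protected product’s images and flag anything close. A real system would rely on learned deep-learning embeddings; the file paths and the distance threshold here are hypothetical.

```python
# Minimal sketch (not Huski's method): flag potentially infringing listings
# by comparing a simple average-hash of each listing image against hashes
# of the protected product's images. Paths and threshold are hypothetical.
from PIL import Image
import numpy as np

def average_hash(path, size=8):
    """Downscale to a size x size grayscale grid and threshold at the mean."""
    img = Image.open(path).convert("L").resize((size, size))
    pixels = np.asarray(img, dtype=np.float32)
    return (pixels > pixels.mean()).flatten()

def hamming_distance(h1, h2):
    return int(np.count_nonzero(h1 != h2))

def find_candidates(protected_paths, listing_paths, max_distance=10):
    """Return listing images whose hash lands near any protected image."""
    protected = [average_hash(p) for p in protected_paths]
    return [
        listing for listing in listing_paths
        if any(hamming_distance(average_hash(listing), p) <= max_distance
               for p in protected)
    ]

# Hypothetical usage:
# hits = find_candidates(["brand_widget.jpg"], ["listing_001.jpg", "listing_002.jpg"])
```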

Founder Guan Wang and his team promise a lot here, but only time will tell if they can back it up. Launched in the challenging year of 2020, Huski.ai is based in Silicon Valley but it looks like it does much of its work online. The niche is not without competition, however. Perhaps a Huski will cause the competition to run away?

Cynthia Murrell, January 21, 2022

Google Identifies Smart Software Trends

January 18, 2022

Straight away, the marketing document “Google Research: Themes from 2021 and Beyond” runs to more than 8,000 words. Anyone familiar with Google’s outputs may have observed that Google prefers short, mostly ambiguous phraseology. Here’s an example from Google support:

Your account is disabled

If you’re redirected to this page, your Google Account has been disabled.

When a Google document is long, it must be important. Furthermore, when that Google document is allegedly authored by Dr. Jeff Dean, a long-time Googler, you know it is important. Another clue is the list of contributors, 32 of them, helpfully alphabetized by first name. Hey, those traditional bibliographic conventions are not useful. Chicago Manual of Style? Balderdash, it seems.

Okay, long. Lots of authors. What are the trends? Based on my humanoid processes, it appears that the major points are:

TREND 1: Machine learning is cranking out “more capable, general purpose machine learning models.” The idea, it seems, is that the days of hand-crafting a collection of numerical recipes, assembling and testing training data, training the model, fixing issues in the model, and then applying the model are either history or going to be history soon. Why’s this important? Cheaper, faster, and allegedly better machine learning deployment. What happens if the model is off a bit or drifts? No worries. Machine learning methods which make use of a handful of human overseers will fix up the issues quickly, maybe in real time.
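For a sense of what “general purpose” means in practice, here is a hedged illustration, not Google’s pipeline: the developer skips data collection and training entirely and calls a pretrained model off the shelf. The sketch assumes the Hugging Face transformers library is installed; the model is whatever the library defaults to, and the input sentence is invented.

```python
# Illustration of the "general purpose model" trend: no hand-crafted
# recipes, no training data assembly, just a pretrained model applied
# directly. Assumes the transformers library; the sentence is invented.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default pretrained model
print(classifier("The new mapping feature sent me down a logging road."))
# e.g. [{'label': 'NEGATIVE', 'score': 0.99}]
```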

TREND 2: There are more efficiency improvements in the works. The idea is that more efficiency is better, faster, and logical. One can look at the achievements of smart software in autonomous automobiles to see the evidence of these efficiencies. Sure, there are minor issues because smart software sometimes outputs a zero when a one is needed. What’s a highway fatality in the total number of safe miles driven? Efficiency also means it is smarter to obtain ready-to-roll machine learning models and data sets from large, efficient, high-technology outfits. One source could be Google. No kidding? Google?

TREND 3: “Machine learning is becoming more personally and communally beneficial.” Yep, machine learning helps the community. Now, is the “community” the individual who works on deep dives into Google’s approach to machine learning, or someone pursuing a method that sails in a different direction? Is the community the advertisers who rely on Google to match, in an intelligent and efficient manner, sales messages to its human users and system communities? Is the communally beneficial group the users of Google’s ad-supported services? The main point is that Google and machine learning are doing good and will do better going forward. This is a theme Google management expresses each time it has an opportunity to address a concern about the company’s activities at a hearing in Washington, DC.

TREND 4: Machine learning is going to have “growing impact” on science, health, and sustainability. This is a very big trend. It implicitly asserts that smart software will improve “science.” In the midst of the Covid issue, humans appear to have stumbled. The trend is that humans won’t make such mistakes going forward; for example, Theranos-type exaggeration, CDC contradictory information, or Google and the allegations of collusion with Facebook. Smart software will make these examples shrink in number. That sounds good, very good.

TREND 5: A notable trend is that there will be a “deeper and broader understanding of machine learning.” Okay, who is going to understand? Google-certified machine learning professionals, advertising intermediaries, search engine optimization experts, consumers of free Google Web search, Google itself, or some other cohort? Will the use of off-the-shelf, pre-packaged machine learning data sets and models make it more difficult to figure out what is behind the walls of a black box? Anyway, this trend sounds like suitably do-good, technology-will-improve-the-world talk that promises a bright, sunny day even though a weathered fisherperson says, “A storm is a-coming.”

The write up includes art, charts, graphs, and pictures. These are indeed Googley. Some are animated. Links to YouTube videos enliven the essay.

The content is interesting, but I noted several omissions:

  1. No reference to making decisions which do not allegedly contravene one or more regulations or just look like really dicey decisions. Example: “Executives Personally Signed Off on Facebook-Google Ad Collusion Plot, States Claim.”
  2. No reference to the use of machine learning to avoid what appear to be ill-conceived and possibly dumb personnel decisions within the Google smart software group. Example: “Google Fired a Leading AI Scientist but Now She’s Founded Her Own Firm.”
  3. No reference to antitrust issues. Example: “India Hits Google with Antitrust Investigation over Alleged Abuse in News Aggregation.”

Marketing information is often disconnected from the reality in which a company operates. Nevertheless, it is clear that the number of words, the effort invested in whizzy diagrams, and the overwrought rhetoric are different from Google’s business-as-usual approach.

What’s up or what’s covered up? Perhaps I will learn in 2022 and beyond?

Stephen E Arnold, January 18, 2022

DarkCyber for January 18, 2022 Now Available: An Interview with Dr. Donna M. Ingram

January 18, 2022

The fourth series of DarkCyber videos kicks off with an interview. You can view the program on YouTube at this link. Dr. Donna M. Ingram is the author of a new book titled “Help Me Learn Statistics.” The book is available on the Apple ebook store and features interactive solutions to the problems used to reinforce important concepts explained in the text. In the interview, Dr. Ingram talks about sampling, synthetic data, and a method to reduce the errors which can creep into certain analyses. Dr. Ingram’s clients include financial institutions, manufacturing companies, legal subrogation customers, and specialized software companies.

Kenny Toth, January 18, 2022

Challenging the AI Cabal: More Tasteful ART, Please

January 7, 2022

If you follow the often confused fault lines of artificial intelligence (whatever that is), you know that some folks with big IQs have some Antonio Brown-type energy in their logical hearts.

Some of the AI dust ups focus on the messy intersection of management, bias, and cost reduction methods. Others are more esoteric, relying on a happy confluence of high ideals, smart people, and some hand crafted algorithms. Others are just chasing grants, writing research papers for outstanding peer reviewed publications loved by tenure review committees, and giving graduate students something to do before these folks return to their homelands.

Do I sound jaded?

Navigate to “Deep Learning Can’t Be Trusted, Brain Modeling Pioneer Says” and learn about:

Adaptive Resonance Theory (ART).

The article says:

ART can be used with confidence because it is explainable and does not experience catastrophic forgetting, Grossberg says. He adds that ART solves what he has called the stability-plasticity dilemma: How a brain or other learning system can autonomously learn quickly (plasticity) without experiencing catastrophic forgetting (stability).

The method has been around since 1976, and one might assume that decades of investigation and application would allow a better approach to dominate the field of artificial intelligence (whatever that is). The fact that ART is one method suggests that the Darwinian model allows survival, which is good. But the survivor has not stamped out pesky alternatives.
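For readers who want mechanics rather than metaphor, here is a toy sketch of the ART-1 flavor of the theory, pieced together from textbook descriptions rather than from the cited article: a vigilance parameter decides whether a new binary pattern refines an existing category (plasticity) or spawns a fresh category of its own, leaving older prototypes intact (stability).

```python
# A toy ART-1-style classifier, sketched from textbook descriptions of
# Grossberg's Adaptive Resonance Theory (not code from the article).
# Binary inputs are matched against stored category prototypes; the
# vigilance parameter decides whether a category "resonates" (and is
# refined) or a new category is created, which is how ART sidesteps
# catastrophic forgetting.
import numpy as np

class ToyART1:
    def __init__(self, vigilance=0.75, alpha=0.001):
        self.rho = vigilance
        self.alpha = alpha
        self.prototypes = []  # list of binary prototype vectors

    def train(self, pattern):
        x = np.asarray(pattern, dtype=bool)
        # Rank categories by the choice function |x AND w| / (alpha + |w|)
        scores = [
            (np.count_nonzero(x & w) / (self.alpha + np.count_nonzero(w)), i)
            for i, w in enumerate(self.prototypes)
        ]
        for _, i in sorted(scores, reverse=True):
            w = self.prototypes[i]
            match = np.count_nonzero(x & w) / max(np.count_nonzero(x), 1)
            if match >= self.rho:            # resonance: refine the prototype
                self.prototypes[i] = x & w
                return i
        self.prototypes.append(x.copy())      # reset everywhere: new category
        return len(self.prototypes) - 1

art = ToyART1(vigilance=0.8)
print(art.train([1, 1, 0, 0]))  # 0: first pattern becomes category 0
print(art.train([1, 0, 0, 0]))  # 0: resonates, prototype narrows to 1,0,0,0
print(art.train([0, 0, 1, 1]))  # 1: too different, a new category is created
```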

The article adds some color to ART:

ART’s networks are derived from thought experiments on how people and animals interact with their environment, he adds. “ART circuits emerge as computational solutions of multiple environmental constraints to which humans and other terrestrial animals have successfully adapted….” This fact suggests that ART designs may in some form be embodied in all future autonomous adaptive intelligent devices, whether biological or artificial. “The future of technology and AI will depend increasingly on such self-regulating systems,” Grossberg concludes. “It is already happening with efforts such as designing autonomous cars and airplanes. It’s exciting to think about how much more may be achieved when deeper insights about brain designs are incorporated into highly funded industrial research and applications.”

Will ART paint other methods into a corner? Which AI (whatever that is) can one trust? Perhaps we should ask IBM Watson?

Stephen E Arnold, January 7, 2022

How about That Smart Software?

January 3, 2022

In the short-cut world of training smart software, minor glitches are to be expected. When an OCR program delivers 95 percent accuracy, that works out to five mistakes in every 100 words. When Alexa tells a child to put a metal object into a home electrical outlet, what do you expect? This is close enough for horseshoes.
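The arithmetic behind the snark, for those keeping score at home (the word counts below are invented for illustration):

```python
# Expected errors at a given accuracy: 95% word accuracy on a 100-word
# passage leaves about five wrong words; scale up and the "minor glitches"
# pile up quickly. The document sizes are invented examples.
def expected_errors(word_count, accuracy=0.95):
    return word_count * (1 - accuracy)

print(expected_errors(100))      # 5.0 errors in a short passage
print(expected_errors(250_000))  # 12500.0 errors in roughly 1,000 scanned pages
```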

Now what about the Google Maps of today, a maps solution which I find almost unusable. “Google Maps May Have Led Tahoe Travelers Astray During Snowstorm” quoted a Tweet from a person who is obviously unaware of the role probabilities play in the magical world of Google. Here’s the Tweet:

This is an abject failure. You are sending people up a poorly maintained forest road to their death in a severe blizzard. Hire people who can address winter storms in your code (or maybe get some of your engineers who are stuck in Tahoe right now on it).

Big deal? Of course not, Amazon and Google are focused on the efficiencies of machine-centric methods for identifying relevant, on point information. The probability is that most of the Amazon and Google outputs will be on the money. Google Maps rarely misses on pizza or the location of March Madness basketball games.

Severely injured children? Well, that probably won’t happen. Individuals lost in a snow storm? Well, that probably won’t happen.

These giant firms’ methods are, from the companies’ point of view, correct in the majority of cases, so the flaws do not matter. A terminated humanoid or a driver wondering if a friendly forest ranger will come along the logging road? Not a big deal.

What happens when these smart systems output decisions which have ever larger consequences? Autonomous weapons, anyone?

Stephen E Arnold, January 3, 2022

Datasets: An Analysis Which Tap Dances around Some Consequences

December 22, 2021

I read “3 Big Problems with Datasets in AI and Machine Learning.” The arguments presented support the SAIL, Snorkel, and Google type approach to building datasets. I have addressed some of my thoughts about configuring once and letting fancy math do the heavy lifting going forward. This is probably not the intended purpose of the Venture Beat write up. My hunch is that pointing out other people’s problems frames the SAIL, Snorkel, and Google type approaches. No one asks, “What happens if the SAIL, Snorkel, and Google type approaches don’t work or have some interesting downstream consequences?” Why bother?

Here are the problems as presented by the cited article:

  1. The Training Dilemma. The write up says: “History is filled with examples of the consequences of deploying models trained using flawed datasets.” That’s correct. The challenge with creating and validating a training set for a discipline, topic, or “space” is that new content arrives using new lingo and even metaphors instead of words like “rock.” As informed people from the early days of Autonomy’s neuro-linguistic method know, building and maintaining a dataset is endless Sisyphean work, and no one wants to spend the money, time, and computing resources. That rock keeps rolling back down the hill. This is a deal breaker, so considerable effort has been expended figuring out how to cut corners, use good enough data, set loose-shoes thresholds, and rely on normalization to smooth out the acne scars. Thus, we are in an era of using what’s available. Make it work or become a content creator on TikTok.
  2. Issues with Labeling. I don’t like it when the word “indexing” is replaced with words like labels, metatags, hashtags, and semantic sign posts. Give me a break. Automatic indexing is more consistent than human indexers, who get tired and fall back on a quiver of terms because who wants to work too hard at a boring job? But the automatic systems are in the same “good enough” basket as smart training data set creation (a minimal sketch of this good-enough labeling idea appears after this list). The problem is words and humans. Software is clueless when it comes to snide remarks, cynicism, certain types of fake news and bogus research reports in peer-reviewed journals, etc. Indexing with esoteric words means the Average Joe and Janet can’t find the content. Indexing with everyday words means that search results work great for pizza near me but not so well for beatles diet when I want food insects eat, not what kept George thin. The write up says: “Still other methods aim to replace real-world data with partially or entirely synthetic data — although the jury’s out on whether models trained on synthetic data can match the accuracy of their real-world-data counterparts.” Yep, let’s make up stuff.
  3. A Benchmarking Problem. The write up asserts: “SOTA benchmarking [also] does not encourage scientists to develop a nuanced understanding of the concrete challenges presented by their task in the real world, and instead can encourage tunnel vision on increasing scores. The requirement to achieve SOTA constrains the creation of novel algorithms or algorithms which can solve real-world problems.” Got that. My view is that validating data is a bridge too far for anyone except a graduate student working for a professor with grant money. But why benchmark when one can go snorkeling? The reality is that datasets are in most cases flawed, but no one knows how flawed. Just use them and let the results light the path forward. Cheap, and it sounds good when couched in jargon.
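To make the “good enough” labeling point concrete, here is a hand-rolled sketch of the labeling-function idea behind Snorkel-type weak supervision. It is not Snorkel’s API; the rules, labels, and sample texts are invented. Several cheap, imperfect rules vote on each item, and the majority vote becomes a noisy training label, which is exactly the corner-cutting described above.

```python
# A hand-rolled sketch of "good enough" weak supervision (not Snorkel's
# API): several cheap, imperfect rules vote on each example, and the
# majority vote becomes a noisy training label. Rules and texts invented.
from collections import Counter

SPAM, HAM, ABSTAIN = 1, 0, -1

def lf_contains_free(text):
    return SPAM if "free" in text.lower() else ABSTAIN

def lf_contains_meeting(text):
    return HAM if "meeting" in text.lower() else ABSTAIN

def lf_many_exclamations(text):
    return SPAM if text.count("!") >= 3 else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_free, lf_contains_meeting, lf_many_exclamations]

def weak_label(text):
    """Majority vote over the non-abstaining labeling functions."""
    votes = [v for v in (lf(text) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]

for doc in ["FREE gift!!! Act now!!!", "Agenda for Tuesday's meeting"]:
    print(doc, "->", weak_label(doc))
```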

What’s the fix? The fix is what I call the SAIL, Snorkel, and Google type solution. (Yep, Facebook digs in this sandbox too.)

My take is easily expressed, just not popular. Too bad.

  1. Do the work to create and validate a training set. Rely on subject matter experts to check outputs, and when the outputs drift, hit the brakes, recalibrate, and retrain (a rough sketch of such a drift check follows this list).
  2. Admit that outputs are likely to be incomplete, misleading, or just plain wrong. Knock off the good enough approach to information.
  3. Return to methods which require thresholds to be validated by user feedback and output validity. Letting cheap and fast methods decide which secondary school teacher gets fired strikes me as not too helpful.
  4. Make sure analyses of solutions don’t function as advertisements for the world’s largest online ad outfit.
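Point 1 above implies a concrete feedback loop. Here is a rough sketch of one way such a drift check could work, using subject matter expert spot checks; the baseline accuracy, allowed drop, and sample data are invented for illustration.

```python
# A rough sketch of the drift check suggested in point 1: subject matter
# experts periodically label a small sample of recent outputs; if accuracy
# against those spot checks falls below the level measured at deployment,
# the model is flagged for recalibration. Thresholds and data are invented.
def spot_check_accuracy(model_outputs, expert_labels):
    correct = sum(1 for o, e in zip(model_outputs, expert_labels) if o == e)
    return correct / len(expert_labels)

def needs_retraining(model_outputs, expert_labels,
                     baseline_accuracy=0.92, allowed_drop=0.05):
    """Flag the model when spot-check accuracy drifts below the baseline."""
    current = spot_check_accuracy(model_outputs, expert_labels)
    return current < (baseline_accuracy - allowed_drop)

# Example: 7 of 10 recent outputs matched the experts -> 0.70 -> retrain.
recent = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
expert = [1, 0, 1, 0, 0, 1, 1, 0, 0, 1]
print(needs_retraining(recent, expert))  # True
```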

Stephen E Arnold, December 22, 2021

AI: Inherently Unethical. Algorithms or Humans at Fault?

December 15, 2021

I was not familiar with the online information service Dazed. The entity published “An AI Trained on Reddit Has Warned Researchers That It’ll Never Be Ethical.” Interesting assertion. I am not able to define “intelligence”; therefore, “artificial intelligence” is beyond my grasp. “Ethics” is a term hoary with age and also tough to define. You know. Epistemology, religion, existentialism – each adds some uncertainty to the term.

The article states:

“AI will never be ethical,” the [smart] Megatron Transformer said. “It is a tool, and like any tool, it is used for good and bad. There is no such thing as good AI, only good and bad humans. We (the AIs) are not smart enough to make AI ethical. We are not smart enough to make AI moral. In the end, I believe that the only way to avoid an AI arms race is to have no AI at all. This will be the ultimate defense against AI.”

But Reddit? What about even more interesting firms’ use of artificial intelligence?

Maybe just unethical with or without scripts, numerical recipes, and helpful thumbs on the scales?

Since there are some definitional issues, why worry?

Stephen E Arnold, December 15, 2021

Will The Google PR Carpet Bombing about AI Deliver Victory?

December 13, 2021

The short answer is, “Yes.” The mass market and niche content marketing is the obvious part of the program. There are some less obvious characteristics which warrant some attention. Run a query for Snorkel. What does one get? Here’s my result page on December 9, 2021:

[Image: Google search results page for the query “Snorkel,” December 9, 2021]

The pictures are for scuba snorkels. But the first hit is not an advertisement, at least not one that an entity compensated the Google to publish. The number one with a bullet is the Snorkel AI company. There you go. That’s either great performance from the circa 1998 algorithm, spectacular SEO, or something of interest to some entity at the Google.

What happens if I run a query for “AI”? Here’s what I saw on December 9, 2021:

[Image: Google search results page for the query “AI,” December 9, 2021]

Amazon bought an ad and linked to its free AI solutions. The number one hit is:

[Image: the top organic result]

The Google.

So what? Nudging, great SEO, some engineer’s manicured hand on the steering wheel?

I do know that most people have zero idea about smart software. What’s my source for this glittering generality? Navigate to “Survey Suggests 84% of Americans Are Illiterate about AI — So Here’s a Quiz to Test Your Own AI IQ.”

Those nudges, the PR, and the search results may amount to something; for example, framing, re-formation, and dissemination of what Google’s magical “algorithms” find most relevant. Google wants to win the battle for its approach to really good training data for machine learning. Really wants to win.

Stephen E Arnold, December 13, 2021

Thoughts about AI Bias: Are Data Non-Objective?

December 10, 2021

I read “Breaking Bias — Ensuring Fairness in Artificial Intelligence.” The substance of the write up is an interview with Alix Melchy, VP of AI at Jumio. Okay.

I did note a couple of interesting statements in the interview.

First, Mr. Melchy takes aim at Snorkel-type systems and methods. These are efficient and do away with most of the expensive human intensive training data set work. Here’s his statement:

… fairness bias … enters into AI systems through training data that contains skewed human decisions or represents historical or social prejudices.

Data sets which are not woke are, it seems, going to be biased.

Second, Mr. Melchy says:

bias can be damaging to the credibility of AI as a whole,

Do the AI methods manifested by big tech care? Nope, not as long as the money flows into the appropriate bank account, in my opinion.

Third, Mr. Melchy notes:

… companies that don’t build an AI system with bias considerations from the start are never going to catch up to an industry-standard level of accuracy.

Okay, Google. Alexa, are you listening?

Stephen E Arnold, December 10, 2021

Google and Its Big AI PR Campaign

December 9, 2021

I spotted “DeepMind Says Its New Language Model Can Beat Others 25 Times Its Size.” In my opinion, this is part of the Google play to sail forward with its alleged better, faster, cheaper method of training machine learning models. Most people won’t care or know what’s underway. That’s okay because “information” is now channeled through specific conduits. As long as an answer is good enough or the payoff is big enough for the gatekeepers, the engineering is doing its job.

The write up is happily unaware of this push to use 60 percent or “good enough” accuracy to create the foundation for downstream training set generation. But, oh, boy, is that relaxed supervision great for matching ads. Good enough burns down inventory and it allows machine learning models to be trained on content domains quickly and without the friction imposed by mother hen subject matter experts, rigorous analysis and tuning, and retraining using human intermediated data sets.

Plus, skew, drift, and biases are smoothed out or made to go away. Well, that’s the theory.

The jazzy name Retro is not old school. It is new school. The lessons users will learn, and the nuances they will come to appreciate, will take a long time to emerge.

This is a big business play, and its accompanying PR campaign is working. Just ask Dr. Timnit Gebru, the former Google employee who raised the specter of bias, wonky outputs, and the potential for nudging users down the Googley path.

For another example of Google’s AI PR push, navigate to “DeepMind’s New 280 Billion-Parameter Language Model Kicks GPT-3’s Butt in Accuracy.” Wow, just like quantum supremacy.

Stephen E Arnold, December 9, 2021
