Facebook Targets Paginas Amarillas: Never Enough, Zuck?

October 14, 2021

Facebook is working to make one of its properties more profitable. The Next Web reports, “WhatsApp Reinvents the ‘Yellow Pages’ and Proves there Are No New Ideas.” The company will test a new business directory feature in São Paulo, Brazil, where local users will be able to search for “businesses nearby” through the app. Writer Ivan Mehta reports:

“For years, Facebook and Instagram have been trying to connect you to businesses and make you shop through their platforms. While the WhatsApp Business app has been around, you couldn’t really search for businesses using the app, unless you’ve interacted with them previously. WhatsApp already offers payment services in Brazil. So it makes sense for it to provide discovery services for local businesses, so you can shop for goods in person, and pay through the platform. The chat app doesn’t have any ads, unlike Facebook and Instagram, so business interactions and transactions are one of the biggest ways for Facebook to earn some moolah out of it. In June, the company integrated its Shops feature in WhatsApp. So, we can expect more business-facing features in near future.”

India and Indonesia are likely next on the list for the project, according to Facebook’s Matt Idema. We are assured the company will track neither users’ locations nor the businesses they search for. Have we heard similar promises before?

Cynthia Murrell, October 14, 2021

Ex-Googlers Work On Biased NLP Solutions

October 6, 2021

Google is on top of the world when it comes to money and technology. Google is the world’s most used search engine, its Chrome Web browser is used by two-thirds of users, and about 29% of 2021 digital advertising was Google ads. Fast Company asks and investigates important questions about Google’s product quality in: “It’s Not Just You. Google Search Really Is Getting Worse.”

Over 80% of the revenue of Alphabet Inc., Google’s parent company, comes from advertising, and about 85% of the world’s search engine traffic flows through Google. Google controls a lot of users’ screen time. Researchers who have studied the search engine’s results have learned that very few users scroll past the “fold” (all of the available content on a screen). Advertising space at the top of search results is therefore incredibly valuable. It also means that users are forced to scroll further and further to reach non-paid results.

Alphabet Inc. has another revenue generating platform, YouTube. A huge portion of videos include multiple ads. Users can avoid ads by paying for a premium subscription, but very few do.

Google does want to improve its search quality. Currently a lot of the information needed to answer a query is distributed across multiple Web sites. Google wants to condense everything:

“Google is working on bringing this information together. The search engine now uses sophisticated “natural language processing” software called BERT, developed in 2018, that tries to identify the intention behind a search, rather than simply searching strings of text. AskJeeves tried something similar in 1997, but the technology is now more advanced.

BERT will soon be succeeded by MUM (Multitask Unified Model), which tries to go a step further and understand the context of a search and provide more refined answers. Google claims MUM may be 1,000 times more powerful than BERT, and be able to provide the kind of advice a human expert might for questions without a direct answer.”

Google controls a huge portion of the Internet and how users utilize it. Alphabet Inc. is here to stay for a long time, but there are alternatives such as Bing, DuckDuckGo, Ecosia, and Tor browsers. Google, however, will one day fade. Sears Roebuck, Blockbuster, Kmart, cassettes, etc. were all household names, until they became obsolete.

Whitney Grace, October 6, 2021

Data Federation: Sure, Works Perfectly

June 1, 2021

How easy is it to snag a dozen sets of data, normalize them, parse them, extract useful index terms, and assign classifications and other useful hooks? “Automated Data Wrangling” provides an answer sharply different from what marketers assert.

A former space explorer, now marooned on a beautiful dying world, explains that the marketing assurances of dozens upon dozens of companies are baloney. Here’s a passage I noted:

Most public data is a mess. The knowledge required to clean it up exists. Cloud based computational infrastructure is pretty easily available and cost effective. But currently there seems to be a gap in the open source tooling. We can keep hacking away at it with custom rule-based processes informed by our modest domain expertise, and we’ll make progress, but as the leading researchers in the field point out, this doesn’t scale very well. If these kinds of powerful automated data wrangling tools are only really available for commercial purposes, I’m afraid that the current gap in data accessibility will not only persist, but grow over time. More commercial data producers and consumers will learn how to make use of them, and dedicate financial resources to doing so, knowing that they’ll reap financial rewards. While folks working in the public interest trying to create universal public goods with public data and open source software will be left behind struggling with messy data forever.

Marketing is just easier than telling the truth about what’s needed in order to generate information which can be processed by a downstream procedure.
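
To make the “custom rule-based processes” in the quoted passage concrete, here is a minimal sketch of what such hand-written wrangling tends to look like. Everything in it is an assumption for illustration: the field aliases, the two sample records, and the helper names are hypothetical and do not come from the cited write up.

```python
import re

# Hypothetical column aliases seen across sources; every name is illustrative.
FIELD_ALIASES = {
    "plant_name": "name",
    "facility": "name",
    "capacity_mw": "capacity_mw",
    "cap (kw)": "capacity_kw",
}


def clean_name(raw):
    """Collapse whitespace and normalize case so duplicates can be matched."""
    return re.sub(r"\s+", " ", raw).strip().title()


def normalize(record):
    """Apply hand-written rules to one raw record and return a canonical dict."""
    out = {}
    for key, value in record.items():
        field = FIELD_ALIASES.get(key.strip().lower())
        if field is None:
            continue  # unknown column: someone has to write another rule
        if field == "name":
            out["name"] = clean_name(value)
        elif field == "capacity_kw":
            out["capacity_mw"] = float(value) / 1000.0  # convert kW to MW
        else:
            out["capacity_mw"] = float(value)
    return out


# Two made-up sources describing the same facility in different ways.
source_a = {"Plant_Name": "  big   sandy ", "Capacity_MW": "1060"}
source_b = {"Facility": "BIG SANDY", "Cap (kW)": "1060000"}

print(normalize(source_a))
print(normalize(source_b))
```

Each new source with a new column name or unit needs another rule, which is the scaling problem the passage describes.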

Stephen E Arnold, June xx, 2021

More about BERT: Will TikTok Videos Be Next?

May 28, 2021

Google asserts its new AI model will deliver significant improvements. SEO Hacker discusses “Google MUM: New Search Technology.” We are told MUM, or Multitask Unified Model, is like BERT but much more powerful. We learn:

“They are built on the same Transformer architecture, but MUM is 1000x more powerful than its predecessor. … Another difference between MUM and BERT is that MUM is trained across 75 languages – not just one language (usually English). This enables the search engine, through the use of MUM, to connect information from all around the world without going through language barriers. Additionally, Google mentioned that MUM is multimodal, so it understands and processes information from modalities such as text and images. They also brought up the possibility for MUM to expand to other modalities such as videos and audio files.”

For an example of how the new model will work, see either the SEO Hacker write-up or Google’s blog post on the subject. The illustration involves Mt. Fuji. Naturally, the Search Engine Optimization site ponders how the change might affect SEO. Writer Sean Si predicts MUM’s understanding of 75 languages means non-English content will find much wider audiences. The revised algorithm will also serve up more types of content, like podcasts and videos, alongside text-based resources. Both of those sound like positives, at least for searchers. Other ramifications on the field remain to be seen, but Si anticipates SEO pros will have to develop entirely new approaches. Of course, producing quality content relevant to one’s site should remain the top recommendation.

Cynthia Murrell, May 28, 2021

UCF Cracks Sarcasm: With a Crocodile Smile?

May 18, 2021

I read some big news from Big News. The story “Researchers Develop A.I. That Can Detect Sarcasm” explains that smart software has the ability to parse text so that the degree of non-smarty writing can be detected. The article states:

The team taught the computer model to find patterns that often indicate sarcasm and combined that with teaching the program to correctly pick out cue words in sequences that were more likely to indicate sarcasm. They taught the model to do this by feeding it large data sets and then checked its accuracy.

Presumably the hand-crafting of the training set is able to keep pace with the language of those seeking customer support. I have commented about the brilliance and responsiveness of the customer support available from major companies; for example, Microsoft and Verizon. Improving upon the clarity of information available from these organizations is difficult for me to envision. The excellent handling of SolarWinds by Microsoft and the management acumen demonstrated by Verizon with regard to Yahoo chisel a benchmark in marketing effectiveness.

The write up adds:

The multi-head self-attention module aids in identifying crucial sarcastic cue-words from the input, and the recurrent units learn long-range dependencies between these cue-words to better classify the input text.

Mix in sentiment analysis, and the simplicity of the method is evident.
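
For readers who want to see what “self-attention” means in practice, here is a minimal sketch under loud assumptions. It shows a single head (not the multi-head module the researchers describe), uses no learned projections and no recurrent units, and the two-dimensional token vectors are invented for illustration. It is not the UCF team’s model.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(vectors[0])
    weights, outputs = [], []
    for query in vectors:
        # Score this token against every token in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of all token vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(row, vectors)) for i in range(dim)]
        )
    return weights, outputs


tokens = ["oh", "great", "another", "outage"]
# Made-up embeddings: "great" and "outage" are placed close together on purpose.
embeddings = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]]

weights, outputs = self_attention(embeddings)
for token, row in zip(tokens, weights):
    print(token, [round(w, 3) for w in row])
```

In a trained model the vectors and projections are learned from labeled data, which is where the cost of maintaining the training set comes in.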

I noted this statement:

Sarcasm detection in online communications from social networking platforms is much more challenging.

It seems that one of the final frontiers of human utterance has been crossed. Sarcasm has been cracked. As I write this I manifest a crocodile smile. The reason? The time and cost of maintaining the training set so it reflects what TikTok and Dread users “do” with language may be a sticking point. Then the rules must be updated in near real time, assuming that the data flows are related to crime, war fighting, or financial fraud.

A big crocodile? Yes, and a big smile. But research grants and graduate students are eager to contribute because… degree.

Stephen E Arnold, May 18, 2021

GitHub: Amusing Security Management

April 8, 2021

I got a kick out of “GitHub Investigating Crypto-Mining Campaign Abusing Its Server Infrastructure.” I am not sure if the write up is spot on, but it is entertaining to think about Microsoft’s security systems struggling to identify an unwanted service running in GitHub. The write up asserts:

Code-hosting service GitHub is actively investigating a series of attacks against its cloud infrastructure that allowed cybercriminals to implant and abuse the company’s servers for illicit crypto-mining operations…

In the wake of the SolarWinds’ and Exchange Server “missteps,” Microsoft has been making noises about the tough time it has dealing with bad actors. I think one MSFT big dog said there were 1,000 hackers attacking the company.

The main idea is that attackers allegedly mine cryptocurrency on GitHub’s own servers.

This is post SolarWinds and Exchange Server “missteps”, right?

What’s the problem with cyber security systems that monitor real time threats and uncertified processes?

Oh, I forgot. These aggressively marketed cyber systems still don’t work it seems.

Stephen E Arnold, April 8, 2021

DarkCyber for January 12, 2021, Now Available

January 12, 2021

DarkCyber is a twice-a-month video news program about online, the Dark Web, and cyber crime. You can view the video on Beyond Search or at this YouTube link.

The program for January 12, 2021, includes a featured interview with Mark Massop, DataWalk’s vice president. DataWalk develops investigative software which leapfrogs such solutions as IBM’s i2 Analyst’s Notebook and Palantir Gotham. In the interview, Mr. Massop explains how DataWalk delivers analytic reports with two or three mouse clicks, federates or brings together information from multiple sources, and slashes training time from months to several days.

Other stories include DarkCyber’s report about the trickles of information about the SolarWinds’ “misstep.” US Federal agencies, large companies, and a wide range of other entities were compromised. DarkCyber points out that Microsoft’s revelation that bad actors were able to view the company’s source code underscores the ineffectiveness of existing cyber security solutions.

DarkCyber highlights remarkable advances in smart software’s ability to create highly accurate images from poor imagery. The focus of DarkCyber’s report is not on what AI can do to create faked images. DarkCyber provides information about how and where to determine if a fake image is indeed “real.”

The final story makes clear that flying drones can be an expensive hobby. One audacious drone pilot flew in restricted air zones in Philadelphia and posted the exploits on a social media platform. And the cost of this illegal activity? Not too much. Just $182,000. The good news is that the individual appears to have avoided one of the comfortable prisons available to authorities.

One quick point: DarkCyber accepts zero advertising and no sponsored content. Some have tried, but begging for dollars and getting involved in the questionable business of sponsored content is not for the DarkCyber team.

Finally, this program begins our third series of shows. We have removed DarkCyber from Vimeo because that company insisted that DarkCyber was a commercial enterprise. Stephen E Arnold retired in 2017, and he is now 77 years old and not too keen to rejoin the GenX and Millennials in endless Zoom meetings and what he calls “blatant MBA craziness.” (At least that’s what he told me.)

Kenny Toth, January 12, 2021

Ah, Chatbots. Unfortunately, Inevitable Because Who Wants to Support Customers?

December 2, 2020

Lest one think AI is here to make our lives easier, one should think again. Though the technology may bring new capabilities and insights, users must put in work and surmount frustration to get results. Bizcommunity.com discusses “The Unsuspected Stumbling Blocks of AI for Customer Experience.” Writer Mathew Conn specifically examines the use of chatbots here. He writes:

“While chatbots successfully enable one-to-one conversations with customers through automated interfaces and are a great way to deliver immediate responses, they are not right for any and all customer interactions. The first, and possibly most important failure of chatbots, is a direct result of the organization in question not identifying what customer interactions are right for enhancement with chatbots. … Because chatbots use open source libraries, most won’t be customized to the organization’s specific industry or customers. Pre-trained bots will be limited to their pre-programmed decision path and are limited by the designer or programmer’s understanding of customer behaviors and requests. While chatbots don’t reason, smarter bots can cope better with some language nuances; however, without human judgment, chatbot accuracy will always be limited. Pre-trained chatbots follow a structured conversation plan and can lose the flow fairly easily. With more access to customer history and data, smarter chatbots can ‘learn’ customer preferences. However, to keep context, chatbots need every possible response to every possible customer request.”

The more complex the interaction, the more likely customers will want to converse with a human. It can be useful to begin interactions with a chatbot then shift to a human worker, but a problem can occur when such a shift means changing platforms from a chat window to phone or email. If the company does not maintain consistency across all its channels, the customer must restart their explanation from the beginning. This does not make for a happy customer or, by extension, a good reputation for the business.
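
The point about pre-programmed decision paths and hand-offs can be illustrated with a small sketch. It is an assumption-laden toy, not any vendor’s product: the keywords, canned replies, and the `Conversation` and `handoff_summary` names are hypothetical. The idea shown is that the bot answers only what its decision path covers and, when it gives up, passes the customer’s words along so the human agent does not ask for them again.

```python
from dataclasses import dataclass, field

# Hypothetical decision path: keyword -> canned reply.
DECISION_PATH = {
    "hours": "We are open 9 to 5, Monday through Friday.",
    "reset password": "Use the 'Forgot password' link on the sign-in page.",
}

HANDOFF_REPLY = "Let me connect you with a person."


@dataclass
class Conversation:
    customer_id: str
    transcript: list = field(default_factory=list)
    handed_off: bool = False


def reply(conv, message):
    """Answer from the decision path, or flag the conversation for a human."""
    conv.transcript.append(("customer", message))
    text = message.lower()
    for keyword, answer in DECISION_PATH.items():
        if keyword in text:
            conv.transcript.append(("bot", answer))
            return answer
    # Nothing in the decision path matched: escalate instead of guessing.
    conv.handed_off = True
    conv.transcript.append(("bot", HANDOFF_REPLY))
    return HANDOFF_REPLY


def handoff_summary(conv):
    """What the human agent sees, so the customer does not have to start over."""
    return [msg for who, msg in conv.transcript if who == "customer"]


conv = Conversation(customer_id="c-1")
print(reply(conv, "What are your hours?"))
print(reply(conv, "My invoice is wrong and I was double billed"))
print(handoff_summary(conv))
```

If the summary is dropped when the customer moves from chat to phone or email, the explanation starts over, which is the failure the paragraph above describes.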

Chatbots are not the only AI function that is less of a panacea than vendors would like us to believe. Before investing in any AI solution, businesses should do their research and make certain they understand what they are getting, whether it will truly address their unique needs, and how to make the most of it.

Just cut costs and move on.

Cynthia Murrell, December 2, 2020

NLP Survey: Grains of Salt Helpful

November 30, 2020

Curious about the “state” of natural language processing? Surveys dependent on participants who self-recruit or receive a questionnaire as a result of signing up for a newsletter have to be consumed with a grain of salt and bottle of monosodium glutamate. You can get a copy of a survey sponsored by John Snow Labs via this url. This is a Medium content object, so be prepared to provide information of value to certain large organizations.

The principal findings from the survey of 571 respondents include:

  • People are spending money for entity recognition and document classification
  • Spark NLP and spaCy are popular
  • One third of those responding use an indexing “helper” tool.

Data about budgets are scant. Percentages are not what fuel a sales person’s interest.

For Beyond Search, the single most important finding is that four cloud services do the heavy lifting for those into NLP: AWS, Azure, Google, and IBM. Which cloud service is most popular among the NLP crowd? Give up? The survey says, “Google.”

Not surprisingly cost and complexity are holding back NLP adoption and expansion. And what is John Snow Labs? An NLP outfit. Index term: Marketing.

Stephen E Arnold, November 30, 2020

Google: Poetry Creation Made Eneasy

November 25, 2020

I spotted “Google’s Verse by Verse AI Can Help You Write in the Style of Famous Poets.” The subtitle illustrates why this Google innovation is probably going to find some Silicon Valley Shakespeares:

Quoth the Bugdroid, “Nevermore.”

The write up guides the reader to this url. Then the page displays:


Okay, let’s write a poem with the Google smart software. I am skeptical because Google set out to solve death. So far, no luck with that project. For poetic style, I quite like the approach of William Alabaster, who wrote a remarkable tribute to Queen Elizabeth called Elisaeis, Apotheosis poetica in Latin when he was trying to avoid arrest for religious heresy. (For more info on William Alabaster, navigate to your local university library and chase down Vol. 76, No. 5, Texts and Studies, 1979. “The Elisæis” of William Alabaster (Winter, 1979). Oh, the poem is a tribute to Elizabeth the First. Did I mention the poem was an epic, thousands upon thousands of lines? In Latin too. Hot stuff.)

Well, bummer. Mr. Alabaster is not listed as a stylistic choice on the Google write a poem Web site. I thought AI was smart. Well, let us sally forth with the clever and sometimes interesting Edwin Arlington Robinson who wrote:

Miniver loved the Medici,
Albeit he had never seen one;
He would have sinned incessantly
Could he have been one.

Yep, sin. But I had to pick other poets with which the smart Google AI is familiar. Trepidatiously I selected the fave of elderly literature teachers: Henry Wadsworth Longfellow. Plus in a nod to the Rona and rising infection rates, I plunked my mouse cursor on the liquor-loving and raven-loving Edgar Allan Poe. Yep, I noted the “nevermore” in the article’s subtitle. Then I clicked “Next.”

I specified a quatrain in iambic pentameter with the rhyming scheme AB AB.

Google’s smart software wanted a chunk of poesy as a “seed” for the smart software. I provided:

Whoa, teenaged mind, cause no sorrow or pain

I want to point out that this is the first line of a poem my junior class English teacher Edwardine Sperling required us to write. (She loved cardinals, the bird, not the baseball team.) My poetic flight of fancy at age 15 on this line motivated Ms. Sperling to try to get me expelled from high school. No sense of humor had she. (The compromise proposed by the assistant principal was that Ms. Sperling could ban me from the National Honor Society as a result of my inappropriate writing, and I had to sit outside the class in the hallway for the remainder of the semester.)

And what was my “Spirit of Nature” poem about? Nothing much. Just sitting in the woods on a sunny day in early autumn. Then the Spirit of Nature emerged from a pile of leaves. I explained that my Spirit of Nature was the October 1959 Playmate of the Month from Playboy magazine. I elaborated via metaphors (terrible metaphors I must confess) how the Spirit of Nature or Miss October helped me move away from “sorrow or pain.” I will leave the details to your imagination. My poem was a hoot. But I got the boot.

Back to the Google smart poetry writer, a system which I hypothesized would have zero imagination and would have been an A student in dear Ms. Sperling’s literature class.

I clicked the Next button again. Magic. After some prompting, Google’s fine system spit out a quatrain. The first line, which I provided, is in red. Google goodness is in blue:

Whoa, teenaged mind, cause no sorrow or pain
Enlife a phantom of an idle love;
Yet in a fancy I could now attain
Look on the beauty of that world above!

Great stuff those words in blue crafted sharp and true by Lord Google.

Ms Sperling would have relished the “enlife” word. The prefix “en” leads to many coinages; for example, enbaloney, enstupid, and enmarketing. Maybe enAI? Sure. But no Playboy bunnies. No filthy innuendo. No double entendre. The meaning thing eludes me, but, hey, Google couldn’t solve death either. The GOOG is not doing too well in poesie either I opine. Any questions about Google’s query ad matching semantic system? Good.

Stephen E Arnold, November 24, 2020
