Google, Smart Software, and Prime Mover for Hyperbole

May 17, 2022

In my experience, the cost of training smart software is a very big problem. The bigness does not become evident until the licensee of a smart system realizes that training the smart software must take place on a regular schedule. Why is this a big problem? The reason is that the effort required to assemble valid training sets is significant. Language, data types, and info peculiarities change over time; for example, new content is fed into a smart system, and the system cannot cope with the differences between the training set that was used and the info flowing into the system now. A gap grows, and the fix is to assemble new training data, reindex the content, and get ready to do it again. A failure to keep the smart software in sync with what is processed is a tiny bit of knowledge not explained in sales pitches.
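One rough way to see the gap grow is to count how much of the incoming text the training set never contained. The sketch below is an illustration only, not how any particular vendor measures drift: the function names, the sample documents, and the 15 percent retraining threshold are all assumptions made up for the example.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def out_of_vocabulary_rate(training_docs: list[str], incoming_docs: list[str]) -> float:
    """Return the fraction of incoming tokens that never appear in the training set."""
    vocab: set[str] = set()
    for doc in training_docs:
        vocab.update(tokenize(doc))
    incoming = [tok for doc in incoming_docs for tok in tokenize(doc)]
    if not incoming:
        return 0.0
    unseen = sum(1 for tok in incoming if tok not in vocab)
    return unseen / len(incoming)


# Hypothetical documents, invented for this illustration.
training = [
    "quarterly sales report for the retail group",
    "sales forecast for retail stores",
]
incoming = [
    "quarterly sales report for retail",
    "metaverse token sales forecast",
]

RETRAIN_THRESHOLD = 0.15  # assumed cutoff; a real system would tune this

rate = out_of_vocabulary_rate(training, incoming)
needs_retraining = rate > RETRAIN_THRESHOLD
print(f"unseen token rate: {rate:.2f}, retrain: {needs_retraining}")
```

A crude token count like this says nothing about meaning, but even so it shows why the cost recurs: every time the rate creeps past the cutoff, someone has to assemble and test a new training set.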

Accountants figure out that money must be spent on a cost not in the original price data. Search systems return increasingly lousy results. Intelligence software outputs data which make zero sense to a person working out a surveillance plan. An art history major working on a PowerPoint presentation cannot locate the version used by the president of the company for last week’s pitch to potential investors.

The accountant wants to understand overruns associated with smart software, looks into the invoices and time sheets, and discovers something new: Smart software subject matter experts, indexing professionals, interns buying third-party content from an online vendor called Elsevier. These are not what CPAs confront unless there are smart software systems chugging along.

The big problem is handled in this way: Those selling the system don’t talk too much about how training is a recurring cost which increases over time. Yep, reindexing is a greedy pig and those training sets have to be tested to see if the smart software gets smarter.

The fix? Do PR about super duper even smarter methods of training. Think Snorkel. Think synthetic data. Think PowerPoint decks filled with jargon that causes clueless MBAs to do high fives because the approach is a slam dunk. Yes! Winner!

I read “DeepMind’s Astounding New ‘Gato’ AI Makes Me Fear Humans Will Never Achieve AGI” and realized that the cloud of unknowing has not yet yielded to blue skies. The article states:

Just like it took some time between the discovery of fire and the invention of the internal combustion engine, figuring out how to go from deep learning to AGI won’t happen overnight.

No kidding. There are gotchas beyond training, however. I have a presentation in hand which I delivered in 1997 at an online conference. Training cost is one dot point; there are five others. Can you name them? Here’s a hint for another big issue: An output that kills a patient. The accountant understands the costs of litigation when that smart AI makes a close enough for horseshoes output for a harried medical professional. Yeah, go catscan, go.

Stephen E Arnold, May 17, 2022

Written by Stephen E. Arnold · Filed Under AI, News

Differences Between Data Science And Business Intelligence

May 17, 2022

Data science is an encompassing term that is hard to define: an umbrella field that splinters in many directions. The Smart Data Collective explains the difference between two types of data science in “The Difference Between Business Intelligence And Real Data Science.” According to the article, real data science is combining old and new data, analyzing it, and applying it to current business practices. Business intelligence (BI) focuses more on applications, such as creating charts, graphs, and reports.

Companies are interested in employing real data science and business intelligence, but it is confusing to distinguish the two. Data scientists and BI analysts are different jobs with specialized expertise. Data scientists are experts in predicting future outcomes by styling various models and discovering correlations. BI analysts know how to generate dashboards for historic data based on a set of key performance metrics.

Data scientists’ role is not based on guesswork. They are required to be experts in predictive and prescriptive analyses. Their outcomes need to be reasonably accurate for businesses’ success. BI needs advance planning to combine data sources into useful content; data science, meanwhile, can be done instantly.

There are downsides to both:

“As you cannot get the data transformation done instantly with BI, it is a slow manual process involving plenty of pre-planning and comparisons. It needs to be repeated monthly, quarterly or annually and it is thus not reusable. Yet, the real data science process involves creating instant data transformations via predictive apps that trigger future predictions based on certain data combinations. This is clearly a fast process, involving a lot of experimentation.”

Business intelligence and real data science are handy for any business. Understanding the difference is key to utilizing them.

Whitney Grace, May 17, 2022

Written by Stephen E. Arnold · Filed Under Business process, News

Big Tech, Big Winners: Good or Bad

May 17, 2022

Science fiction and many different types of smart people have informed us that technology and related information are dangerous if unregulated and left in the hands of a few individuals. Engadget focuses on the current reasons why big tech companies are dangerous in the article, “Hitting the Books: US Regulators Are Losing The Fight Against Big Tech.” Meta (formerly Zuckbook), Amazon, Google, and Apple control the technology space and consume…er…purchase startups before they can become competitors. The government used to regulate the technology marketplace and, according to some written laws, it still does. The current advancement in technology has overwhelmed the government’s capacity to govern it.

Oxford professor Viktor Mayer-Schönberger and author Thomas Ramge, who wrote Access Rules: Freeing Data From Big Tech For a Better Future, agree that Big Tech companies are hoarding information and there needs to be a more equitable way of accessing it. Biden’s administration has attempted to address Big Tech’s monopolies, but its efforts aren’t effective.

Biden appointed Tim Wu to the National Economic Council as a special assistant to the president for technology and competition policy. Wu favors breaking up Big Tech companies, and the appointment was a sign that Biden leaned this way. Another signal of Biden’s leanings was the appointment of Lina Khan as the Federal Trade Commission chair. Khan favors regulating Big Tech like utilities, similar to electricity and AT&T before telecom deregulation. The Big Tech monopolies are not good, because they are preventing future innovation, but politicians are arguing over how to solve a convoluted issue. There are antitrust laws, but are they enforceable? The complicated issue is:

“And yet it’s questionable that well-intentioned activist regulators bolstered by broad public support will succeed. The challenge is a combination of the structural and the political. As Lina Khan herself argued, existing antitrust laws are less than useful. Big Tech may not have violated them sufficiently to warrant breaking them up. And other powerful measures, such as declaring them utilities, require legislative action. Given the delicate power balance in Congress and hyper-partisan politics, it’s likely that such bold legislative proposals would not get enough votes to become enacted. The political factions may agree on the problem, but they are far apart on the solution. The left wants an effective remedy, while the right insists on the importance of market forces and worries about antitrust action micromanaging economic activity. That leaves a fairly narrow corridor of acceptable incremental legislative steps, such as “post-acquisition lockups.” This may be politically palatable, but insufficient to achieve real and sustained success.”

The Big Tech people, politicians, and other involved parties are concerned with short-term gains. The long game is being ignored in favor of the present benefits, while the future is left to deteriorate. Europe has better antitrust laws in action against Big Tech companies. To plan for a better future, the US should copy Europe.

Whitney Grace, May 17, 2022

Written by Stephen E. Arnold · Filed Under Government, News, Technology

Does Google Have Search Fear?

May 16, 2022

I can hear the Googlers at a search engine optimization conference saying this:

Our recent investments in search are designed to provide a better experience for our users. Our engineers are always seeking interesting, new, and useful ways to make the world’s information more accessible.

What these code words mean to me is:

Yep, the ancient Larry and Sergey thing. Not working. Oh, my goodness. What are we going to do? Buy Neeva, Kagi, Seekr, and Wecript? Let’s let Alphabet invest and we can learn and maybe earn before more people figure out our results are not as good as Bing and DuckDuckGo’s.

Even Slashdot is running items which make clear that Google and search do not warrant the title of “search giant.”

Source: Slashdot at https://bit.ly/3PkBOGt

I crafted this imaginary dialog when I read “This Germany-based AI Startup is Developing the Next Enterprise Search Engine Fueled by NLP and Open-Source.” That write up said:

Deepset, a German startup, is working to add to Natural Language Processing by integrating a language awareness layer into the business tech stack, allowing users to access and interact with data using language. Its flagship product, Haystack, is an open-source NLP framework that enables developers to create pipelines for a variety of search use-cases.

But here’s the snappy part of the article:

The Haystack-based NLP is typically implemented over a text database like Elasticsearch or Amazon’s OpenSearch branch and then connects directly with the end-user application through a REST API. It already has thousands of users and over 100 contributors. It uses transformer models to let developers create a variety of applications, such as production-ready question answering (QA), semantic document search, and summarization. The company has also introduced Deepset Cloud, an end-to-end platform for integrating customized and high-performing NLP-powered search systems into your application.

In theory, this is an open source, cloud centric super app, a meta play, a roll up of what’s needed to make finding information sort of work.

The kicker in the story is this statement:

The Berlin-based company has raised $14M in Series A funding led by GV, Alphabet’s venture capital arm.

Yep, the Google is investing. Why? Check that which applies:

( ) Its own innovation engines are the equivalent of a Ford Pinto racing a Tesla Model S Plaid?

( ) Google search is no longer the world’s largest Web site?

( ) Amazon gets more product searches than Google does?

( ) Users are starting to complain about how Google ignores what users key in the search box?

( ) Large sites are not being spidered in a comprehensive or timely manner?

( ) All of the above.

Stephen E Arnold, May 16, 2022

Written by Stephen E. Arnold · Filed Under Google, News, Search

On Mitigating Open-Source Vulnerabilities

May 16, 2022

Open-source software has saved countless developers from reinventing the proverbial wheel so they can instead spend their time creating new ways to use existing code. That’s great! Except for one thing: Now that open-source components make up about 90% of most applications, they pose tempting opportunities for hackers. Perhaps the juiciest targets lie in the military and intelligence communities. US counter-terrorism ops rely heavily on the likes of Palantir Technologies, a heavy user of and contributor to open-source software. Another example is the F-35 stealth fighter, which operates using millions of lines of code. A team of writers at War on the Rocks explores “Dependency Issues: Solving the World’s Open-Source Software Security Problem.” Solve it? Completely? Right, and there really is a tooth fairy. The article relates:

“The problem is that the open-source software supply chain can introduce unknown, possibly intentional, security weaknesses. One previous analysis of all publicly reported software supply chain compromises revealed that the majority of malicious attacks targeted open-source software. In other words, headline-grabbing software supply-chain attacks on proprietary software, like SolarWinds, actually constitute the minority of cases. As a result, stopping attacks is now difficult because of the immense complexity of the modern software dependency tree: components that depend on other components that depend on other components ad infinitum. Knowing what vulnerabilities are in your software is a full-time and nearly impossible job for software developers.”

So true. Still, writers John Speed Meyers, Zack Newman, Tom Pike, and Jacqueline Kazil sound optimistic as they continue:

“Fortunately, there is hope. We recommend three steps that software producers and government regulators can take to make open-source software more secure. First, producers and consumers should embrace software transparency, creating an auditable ecosystem where software is not simply mysterious blobs passed over a network connection. Second, software builders and consumers ought to adopt software integrity and analysis tools to enable informed supply chain risk management. Third, government reforms can help reduce the number and impact of open-source software compromises.”

The article describes each part of this plan in detail. It also does a good job explaining how we got so dependent on open-source software and describes ways hackers are able to leverage it. The writers submit that, by following these suggestions, entities both public and private can safely continue to benefit from open-source collaboration. If the ecosystem is made even a bit safer, we suppose that is better than nothing. After all, ditching open-source altogether seems nigh impossible at this point.
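The “components that depend on other components” problem the writers describe is easy to underestimate. The sketch below is a toy illustration, not a tool from the article: the package names and the dependency map are invented, and a real audit would read them from a lock file or a software bill of materials. It walks a dependency map and counts everything an application pulls in indirectly.

```python
from collections import deque


def transitive_dependencies(graph: dict[str, list[str]], root: str) -> set[str]:
    """Return every package reachable from root, direct or indirect, excluding root."""
    seen: set[str] = set()
    queue = deque(graph.get(root, []))
    while queue:
        package = queue.popleft()
        if package in seen:
            continue  # already visited; this also guards against dependency cycles
        seen.add(package)
        queue.extend(graph.get(package, []))
    seen.discard(root)  # a cycle could lead back to the root itself
    return seen


# Hypothetical dependency map: each key lists the packages it imports directly.
DEPENDENCIES = {
    "app": ["web", "logging"],
    "web": ["http", "templates"],
    "http": ["tls"],
    "logging": ["formatting"],
    "templates": [],
    "formatting": [],
    "tls": ["http"],  # deliberate cycle to show the walk still terminates
}

direct = DEPENDENCIES["app"]
everything = transitive_dependencies(DEPENDENCIES, "app")
print(f"direct: {len(direct)}, total: {len(everything)}")
print(sorted(everything))
```

In this made-up case the developer chose two packages and inherited six. Each inherited package is one more place a weakness can hide, which is the point the War on the Rocks writers are making.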

Cynthia Murrell, May 16, 2022

Written by Stephen E. Arnold · Filed Under cybersecurity, News, Open source

Meta Logo Mimic?

May 16, 2022

Suing the Big Tech company Meta is like tilting at windmills. Most of these lawsuits are dismissed, but sometimes the little guy has a decent chance at beating the giant. Fast Company details one of Meta’s latest fiascos: “Meta Faces Lawsuit Over Logo.” The Swiss blockchain nonprofit organization Dfinity filed a trademark infringement lawsuit against Meta in a Northern California court. Dfinity’s lawsuit alleges that Meta’s new logo, shaped like an M infinity sign, bears a striking resemblance to their infinity logo.

Dfinity claims that Meta’s logo puts their reputation at stake, because Meta has a horrible record of violating people’s privacy and the association would prevent them from attracting users. The infinity symbol is in the public domain. Only original variations on it, such as the Meta and Dfinity logos, can be trademarked. Dfinity probably does not have a leg to stand on or curve to rest on with their lawsuit, but they might win. The infinity symbol is not unique to Dfinity, so if Meta had purloined the “Dfinity” name there would be a better case.

While Dfinity’s case could be dismissed, it could mean something worse:

“But even if Dfinity fails to prove its case, the lawsuit could jeopardize Meta’s attempts to earn trademark protection for its logo. That’s because it could highlight how unremarkable the logo really is. (Meta filed for trademark protection in March.) Says Lee: ‘The U.S. Patent and Trademark Office might find that [Meta’s logo] is not inherently distinctive on its own and require more evidence that consumers associate the symbol with a single company.’”

Lately, Meta is not getting decent press. The little guy will not likely win, especially if Meta has a good legal team. Meta could ultimately lose control of their logo and it could be a fiasco as bad as Disney losing the copyright on Mickey Mouse.

Whitney Grace, May 16, 2022

Written by Stephen E. Arnold · Filed Under Facebook, Legal matters, News

Why Stuff Is Stupid: Yep, Online Is One Factor

May 16, 2022

I read “IQ Scores Are Falling and Have Been for Decades, New Study Finds.” Once again academic research has verified what anyone asking a young person to make change at a fast food restaurant knows: Ain’t happening.

The article reports:

IQ scores have been steadily falling for the past few decades, and environmental factors are to blame, a new study says. The research suggests that genes aren’t what’s driving the decline in IQ scores…

What, pray tell and back up with allegedly accurate data from numerous sources? I learned:

“The causes in IQ increases over time and now the decline is due to environmental factors,” said Rogeberg [Ole Rogeberg, a senior research fellow at the Ragnar Frisch Center for Economic Research in Norway], who believes the change is not due to genetics. “It’s not that dumb people are having more kids than smart people, to put it crudely. It’s something to do with the environment, because we’re seeing the same differences within families,” he said. These environmental factors could include changes in the education system and media environment, nutrition, reading less and being online more, Rogeberg said.

Ah, ha. Media and online.

Were not these innovations going to super charge learning?

I know how it is working out when I watch a teen struggling to calculate that 57 cents from $1.00 is $5.00 and 43 cents. Yes!

I am not sure what to make of another research study. “Why Do Those with Higher IQs Live Longer? A New Study Points to Answers” reveals:

“The slight benefit to longevity from higher intelligence seems to increase all the way up the intelligence scale, so that very smart people live longer than smart people, who live longer than averagely intelligent people, and so on.” The researchers … found an association between childhood intelligence and a reduced risk of death from dementia and, on a smaller scale, suicide. Similar results were seen among men and women, except for lower rates of suicide, which had a correlation to higher childhood intelligence among men but not women.

My rule of thumb is not to stand in front of a smart self-driving automobile.

Stephen E Arnold, May 16, 2022

Written by Stephen E. Arnold · Filed Under News, Online (general), Technology

Distraction Attraction: The Twitter Bot Tactic

May 13, 2022

I have noted in this Web log some corporate actions which I think are like magician tricks. Distract the audience and pull a card from a sleeve. Wow! Magic. The distraction attraction tactic works on a larger scale than a kids’ birthday party with a rent-a-clown and a down-at-the-heels Houdini. Google said it was the holder of the coveted crown of quantum supremacy. At the same time, the lovable Google was busy finding ways to make life interesting for insiders who disagreed with the Google’s version of its technical achievements. Then there was the Windows 11 announcement, which showed the Windows Weekly crew that their access to insiders was going the way of Vista; that is, nowhere fast. I still think that the SolarWinds’ misstep motivated this distraction attraction. Who needs Windows 11? It seems fewer than 10 percent of the Windows 10 users, but who knows if this number is real or fake?

This morning (Friday the 13th as luck has it) I was greeted with headlines saying that the wizard of electric cars which have a parts supply chain problem may not buy Twitter. A representative story is from the UK Daily Star (an estimable publication): “Elon Musk Says £36 Billion Twitter Deal Is ‘On Hold’ Following Spam Bot Revelation.” Imagine a story about a service popular in the US in San Francisco/Silicon Valley and Manhattan/Brooklyn with other users dotted in hippity dippity locations like Austin and Nashville appearing in London underground riders’ hands.

That’s publicity, and visibility is the life force of distraction attraction.

I am fighting my keen interest in identifying the issues associated with this deal. Okay, I will mention one: How’s the value of Tesla stock holding up?

Is there a connection between the on hold and actually buying what may be the ultimate nest for legal eagles?

Yep, surfing the TikTok-type concentration with a T shirt that says, “News cycle? Yes!” is one of the joys of distraction attraction. What’s important? The current financial turmoil? The coming dust up with Finland if some Russian duma members have their way? Covid’s chugging along? The black hole at the center of the galaxy which has an immediate and direct contribution to my understanding of astrophysicists’ Kodak moments?

Distraction attraction: A tactic which creates a reality housing other substantive issues requiring a bit of thought, not a tweet.

Stephen E Arnold, May 13, 2022

Written by Stephen E. Arnold · Filed Under Business strategy, Marketing, News

True or False: Google and Dangerous Functionality

May 13, 2022

I want to be clear: I cannot determine if security-related announcements are PR emissions, legitimate items of data, or clickbait craziness. I am on the fence with the information in “Google Cloud Apparently Has a Security Issue Even Firewalls Can’t Stop.”

The write up presents as real news:

A misconfiguration in Google Cloud Platform has been found which could give threat actors full control over a target virtual machine (VM) endpoint…

These virtual machines are important cogs in some bad actors’ machinery. Sure, legitimate outfits rely on the Google for important work as well. Therefore, the announcement points some bad actors toward a new opportunity to poke around and outfits engaged in ethically informed activities to batten down their digital hatches.

The write up points out that the Google agreed that “misconfiguration could bypass firewall settings.”

And the Google, being Googley, semi-agrees. Does this mean that the Google Cloud is just semi-vulnerable?

Stephen E Arnold, May 13, 2022

Written by Stephen E. Arnold · Filed Under Cloud computing, News, Security

A Really Tiny Issue with Language AIs

May 13, 2022

Why not rely on Google Docs’s linguistic AI instead of thinking for yourself? Well, there may be some issues with that approach. The Hustle advises, “Don’t Rely on Google Docs’ New Writing Tool to Do Your Work for You.” Writer Juliet Bennett Rylah explains:

“AI is fun when it eats horror movies and churns out Netflix’s Mr. Puzzles Wants You to Be Less Alive. But sometimes the lack of context is an issue. Take Google Docs’ new ‘assistive writing’ feature, which makes suggestions as you write (e.g., switching from passive to active voice or deleting repetitive words). This makes your writing more accessible, which is great. It may also suggest more inclusive language, while flagging words that could be deemed inappropriate. That’s cool, in theory, but many users have found the suggestions to be, well, a little weird. Motherboard tested out several text excerpts. While the tool suggested more gender-inclusive phrasing (e.g., ‘policemen’ to ‘police officers’), it also flagged the word ‘Motherboard.’ And while it suggested ‘property owner’ in lieu of ‘landlord,’ it didn’t flag anything in a slur-laden interview with ex-KKK leader David Duke.”

Is that good? As always, it comes down to one inconvenient fact: An AI is only as good as its machine-learning materials. It seems those are as rife with bias as ever, and language tools are far from immune. Alas, it will be a while before we can confidently outsource our writing to an algorithm. Meanwhile, Rylah suggests this guide for humans wishing to employ more inclusive language.

Cynthia Murrell, May 13, 2022

Written by Stephen E. Arnold · Filed Under AI, News


Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.
