Cyber Security: The Stew Is Stirred

October 12, 2022

Cyber security, in my opinion, is often an oxymoron. Cyber issues go up; cyber vendors’ marketing clicks up a notch. The number of companies with cyber security issues keeps pace. Who wins this cat-and-mouse ménage à trois? The answer: the bad actors and the stakeholders in the cyber security vendors with the best marketing.

Now the game is changing from cyber roulette, which has been mostly unwinnable, to digital poker.

Here’s how the new game works if the information in “With Security Revenue Surging, CrowdStrike Wants to Be a Broader Enterprise IT Player” is on the money. I have to keep reminding myself that if there is cheating in competitive fishing, chess, and poker, there might be some Fancy Dancing at the cyber security hoedown.

The write up points out that CrowdStrike, a cyber security vendor, wants to pull a “meta” play; that is, the company’s management team wants to pop up a level. The idea is that cyber security is a platform. The “platform” concept means that other products and services should and will plug into the core system. Think of an oil rig which supports the drill, the pumps, spare parts, and the mess hall. Everyone has to use the mess hall and other essential facilities.

The article says:

Already one of the biggest names in cybersecurity for the past decade, CrowdStrike now aspires to become a more important player in areas within the wider IT landscape such as data observability and IT operations…

Google and Microsoft are outfits which may have to respond to the CrowdStrike “pop up a level” tactic. Google’s full page ads in the dead tree version of the Wall Street Journal and Microsoft’s on-going security laugh parade may not be enough to prevent CrowdStrike from:

  1. Contacting big companies victimized by lousy security provided by some competitors (Hello, Microsoft client. Did you know….)
  2. Getting a group of executives hurt in the bonus department by soaring cyber security costs
  3. Closing deals which automatically cut into both the big competitors’ and the small providers’ deals with these important clients.

The write up cites a mid tier consulting firm as a source of high value “proof” of the CrowdStrike concept. The write up offers this:

IDC figures have shown CrowdStrike in the lead on endpoint security market share, with 12.6% of the market in 2021, compared to 11.2% for Microsoft. CrowdStrike’s growth of 68% in the market last year, however, was surpassed by Microsoft’s growth of nearly 82%, according to the IDC figures.

CrowdStrike’s approach is to pitch a “single agent architecture.” Is this accurate? Sure, it’s marketing, and marketing matters.

Our research suggests that cyber security remains a “reaction” game. Something happens or a new gaffe is exploited, and the cyber security vendors react. The bad actors then move on. The result is that billions in revenue are generated for cyber security vendors who sell solutions after something has been breached.

Is there an end to this weird escalation? Possibly, but that would require better engineering from the get-go, government regulations for vendors whose solutions are not secure, and stronger enforcement action at the point of distribution. (Yes, ISPs and network providers, I am talking about you.)

Net net: Cyber security will become a market sector to watch. Some darned creative marketing will be on display. Meanwhile, as the English majors write copy, the bad actors will be exploiting old and new loopholes.

Stephen E Arnold, October 12, 2022

Wonderful Statement about Baked In Search Bias

October 12, 2022

I was scanning the comments related to the Hacker News post for this article: “Google’s Million’s of Search Results Are Not Being Served in the Later Pages Search Results.”

Sailfast made this comment at this link:

Yeah – as someone that has run production search clusters before on technologies like Elastic / open search, deep pagination is rarely used and an extremely annoying edge case that takes your cluster memory to zero. I found it best to optimize for whatever is a reasonable but useful for users while also preventing any really seriously resource intensive but low value queries (mostly bots / folks trying to mess with your site) to some number that will work with your server main node memory limits.

The comment outlines a facet of search which is not often discussed.
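The deep pagination point deserves a concrete illustration. What follows is a minimal sketch, assuming a local Elasticsearch 8.x node, an imagined “articles” index, and made-up “published” and “doc_id” fields: classic from/size paging runs into the index.max_result_window cap (10,000 documents by default), and search_after is the usual cursor-style workaround for walking deeper without buffering every prior page.

```python
# Minimal sketch (hypothetical index and field names). Offset paging forces
# every data node to score and buffer from + size documents, which is why
# clusters cap it at index.max_result_window (10,000 docs by default).

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
query = {"match": {"body": "cyber security"}}

# Classic offset paging: fine for the first few pages, rejected (or very
# expensive) once from_ + size pushes past the 10,000-document window.
page = es.search(index="articles", query=query, from_=9_990, size=10)

# The usual workaround: search_after walks the result set cursor-style,
# reusing the sort values of the last hit instead of a numeric offset.
sort_spec = [{"published": "desc"}, {"doc_id": "asc"}]  # doc_id: unique tiebreaker
results = es.search(index="articles", query=query, sort=sort_spec, size=100)
while results["hits"]["hits"]:
    last_sort = results["hits"]["hits"][-1]["sort"]
    results = es.search(
        index="articles",
        query=query,
        sort=sort_spec,
        search_after=last_sort,
        size=100,
    )
```

In other words, the commenter’s “optimize for whatever is reasonable but useful” is baked into the plumbing itself, which leads to the observations below.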

First, the search plumbing imposes certain constraints. The idea of “all” information is one that many carry around like a trusted portmanteau. What are the constraints of the actual search system available or in use?

Second, optimization is a fancy word that translates to one or more engineers deciding what to do; for example, change a Bayesian prior assumption, trim content based on server latency, filter results by domain, etc.

Third, manipulation of the search system itself by software scripts or “bots” forces engineers to figure out which signals are okay and which are not. It is possible to inject poisoned numerical strings or phrases into a content stream and manipulate the search system. (Hey, thank you, search engine optimization researchers and information warfare professionals. Great work.)

When I meet a younger person who says, “I am a search expert”, I just shake my head. Even open source intelligence experts reveal that they live in a cloud of unknowing about search. Most of these professionals are unaware that their “research” comes from Google search and maps.

Net net: Search and retrieval systems manifest bias, from the engineers, from the content itself, from the algorithms, and from user interfaces themselves. That’s why I say in my lectures, “Life is easier if one just believes everything one encounters online.” Thinking in a different way is difficult, requires specialist knowledge, and a willingness to verify… everything.

Stephen E Arnold, October 12, 2022

Elastic: Bouncing Along

October 12, 2022

It seems like open-source search is under pressure. We learn from SiliconAngle that “Elastic Delivers Strong Revenue Growth and Beats Expectations, but Its Stock is Down.” For anyone unfamiliar with Elastic, writer Mike Wheatley describes the company’s integral relationship with open-source software:

“The company sells a commercial version of the popular open-source Elasticsearch platform. Elasticsearch is used by enterprises to store, search and analyze massive volumes of structured and unstructured data. It allows them to do this very quickly, in close to real time. The platform serves as the underlying engine for millions of applications that have complex search features and requirements. In addition to Elasticsearch, Elastic also sells application observability tools that help companies to track network performance, as well as threat detection software.”

Could it be that recent concerns about open-source security issues are more important to investors than fiscal success? The write-up shares some details from the company’s press release:

“The company reported a loss before certain costs such as stock compensation of 15 cents per share, coming in ahead of Wall Street analysts’ consensus estimate of a 17-cent-per-share loss. Meanwhile, Elastic’s revenue grew by 30% year-over-year, to $250.1 million, beating the consensus estimate of $246.2 million. On a constant currency basis, Elastic’s revenue rose 34%. Altogether, Elastic posted a net loss of $69.6 million, more than double the $34.4 million loss it reported in the year-ago period.”

Elastic emphatically accentuates the positive—like the dramatic growth of its cloud-based business and its flourishing subscription base. See the source article or the press release for more details. We are curious to see whether the company’s new chief product officer Ken Exner can find a way to circumvent open-source’s inherent weaknesses. Exner used to work at Amazon overseeing AWS Developer Tools. Founded in 2012, Elastic is based in Mountain View, California.

Cynthia Murrell, October 12, 2022

Surprise. Flawed Software Gums Up the Works

October 12, 2022

Did you ever hear the quote, “A man is only as good as his tools”? The quote usually applies to skilled laborers, doctors, athletes, teachers, etc. It can also apply to anyone who relies on a computer hooked up to a network for work. Jacob Kaplan-Moss of the Jacobian blog posted a piece entitled “Quality Is Systemic” which discusses how poorly designed software is the result of a poorly designed system. He disputes the idea that individual performance has a strong impact on the quality of a system.

Kaplan-Moss suggests that mediocre programmers working within a structure designed to produce quality software will do better than a group of phenomenal programmers working in a system with other goals in mind. He defines quality as documented, well-factored, and edited codebases; well-designed testing harnesses; easy-to-use, high-fidelity development and staging environments; a blameless workplace; and no toxic work relationships. He continues that both human and technical factors are important to establish a virtuous cycle for systemic quality:

“Great tests catch errors before they become problems, but those tests don’t magically come into existence; they require a structure that affords the time and space to write tests.

  • That structure works because engineers are comfortable speaking up when they need some extra time to get the tests right.
  • Engineers are comfortable speaking up because they work in an environment with high psychological safety.
  • That environment exists in part because they know that production failures are seen as systemic failures, and individuals won’t be punished, blamed, or shamed.
  • Outages are treated as systemic because most of them are. That’s because testing practices are so good that individual errors are caught long before they become impactful failures.”
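The chain above is abstract, but its first link is easy to make concrete. Here is a trivial, purely hypothetical illustration of “great tests catch errors before they become problems”: a tiny pagination helper plus pytest-style tests that pin its behavior down so a careless change fails in review rather than in production.

```python
# A trivial, hypothetical illustration: the tests document the intended
# behavior, so a "quick fix" that breaks the rounding-up rule fails in CI,
# not in front of users.

import pytest


def paginate(total_items: int, page_size: int) -> int:
    """Return the number of pages needed to show total_items, page_size at a time."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return -(-total_items // page_size)  # ceiling division


def test_partial_pages_round_up():
    assert paginate(101, 10) == 11


def test_exact_multiples_do_not_add_a_page():
    assert paginate(100, 10) == 10


def test_nonsense_page_size_is_rejected():
    with pytest.raises(ValueError):
        paginate(10, 0)
```

The test itself is trivial; the point is the structure that guarantees someone had the time and safety to write it.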

The post ends with suggestions to reevaluate workplace toxicity and to stop concentrating on hiring the best; instead, focus on building a system that produces great results and encourages individual performance. Kaplan-Moss’s suggestions are ideal for any workplace, but they are almost too good to be true for the US.

Whitney Grace, October 12, 2022

Google: Business Intelligence, Its Next Ad Business

October 11, 2022

Google has been a busy beaver. One example popped out of a ho hum write up about Google management’s approach to freebies. The write up “Google’s CEO Faced Intense Pushback from Employees at a Town Hall. His 2-Sentence Response was Smart Leadership” contains a rather startling point, if the article is accurate. Here’s the passage which is presumably a direct quote from Sundar Pichai, the top Googler:

Look, I hope all of you are reading the news, externally. The fact that you know, we are being a bit more responsible through one of the toughest macroeconomic conditions underway in the past decade, I think it’s important that as a company, we pull together to get through moments like this.

Did you see the crazy admission: “being a bit more responsible”? Doesn’t this mean that the company has been irresponsible prior to this announcement? I find that amusing: more responsible. Does responsibility extend beyond Foosball and into transparency about alleged online ad fraud or the handling of personnel matters such as the Dr. Timnit Gebru example?

But to the business at hand: business intelligence. As with enterprise search and artificial intelligence, I am not exactly sure what business intelligence means. To the people who use spreadsheets like Microsoft Excel, rows and columns of data are “business intelligence.” But there must be more than redos of Lotus 1-2-3?

Yes, there are different ways to “do” business intelligence. These range from listening in a coffee shop to buying data from a third party provider and stuffing the information into Maltego to spot previously unnoticed relationships. And there are, of course, companies eager to deliver search based applications to make finding a competitor’s proposal to a government agency easier than figuring out which Google Dork to use.

“Google Says It’s Cracked the Code to Business Intelligence” explains that the Google is going to make BI, as business intelligence is known to those in the know, the King of the Mountain. I noted this passage:

In business intelligence [BI], “there was always this idea of governing BI and of self-service, and there was no reconciliation of the degree of trust and the degree of flexibility,” Google’s Gerrit Kazmaier told reporters last week, ahead of the Google Cloud Next conference. “At Google, I think we have cracked that code to how you get trust and confidence of data with the flexibility and agility of self-service.”

This buzzword infused statement raises several fascinating ideas. Let’s look at a couple of them, shall we?

First, the idea of “governance.” That is a term about which I can say only that I don’t know what the heck it means. But the notion is that “governance” and “trust”, two glittering generalities, are somehow what Google has “cracked.” I must ask, “What’s the meaning, Gerrit Kazmaier?”

Second, I noted four buzzwords strung together like faux silver skulls on a raver’s necklace: trust, confidence, flexibility, and agility. To me, these words mean that more users want a point-and-click solution to answer a question about a competitor or the downstream impacts of an event like sanctions on China. The reality is that, like the first buzzword, these words don’t communicate; they evoke. The intention is that Mother Google will deliver business intelligence.

The solution, however, is not one Google crafted. The company’s professionals could not develop a business intelligence solution. Google had to buy one. Thus, the code cracking was purchased in the form of a company called Looker. The appeal of the Looker solution is that the user does not have to figure out data sources, determine if the data are valid, wrestle to get the data normalized, run tests to determine if the data set meets the requirements of a first year statistics class problem, and figure out what one needs to know. Google will make these steps invisible and reduce knowledge work to clicking an icon. There you go. To be fair, other companies have similar goals. These range from well known US companies to small firms in Armenia. Everyone wants to generate money from easy business intelligence.
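For a sense of what those invisible steps look like in practice, here is a toy sketch with a hypothetical file and hypothetical column names; it is not Looker’s or Google’s pipeline, just the sort of validate, normalize, and sanity-check work a one-click BI tool promises to hide.

```python
# Toy sketch with hypothetical data: the load / validate / normalize /
# sanity-check steps that point-and-click BI tools promise to make invisible.

import pandas as pd

df = pd.read_csv("competitor_sales.csv")  # hypothetical third-party data feed

# Validate: drop rows with missing or impossible values.
df = df.dropna(subset=["region", "quarter", "revenue", "fx_rate_to_usd"])
df = df[df["revenue"] >= 0]

# Normalize: one casing convention, one currency.
df["region"] = df["region"].str.strip().str.title()
df["revenue_usd"] = df["revenue"] * df["fx_rate_to_usd"]

# First-year-statistics sanity check: flag regions with too few observations
# or wildly dispersed figures before anyone drops them into a dashboard.
by_region = df.groupby("region")["revenue_usd"].agg(["count", "mean", "std"])
suspect = by_region[(by_region["count"] < 30) | (by_region["std"] > by_region["mean"])]
print(suspect)
```

None of these steps is exotic, but each one embeds a judgment call, which is exactly where the bias and the ill-informed decisions mentioned below can creep in.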

Google is an online advertising business. The company wants to knock Microsoft off its perch as the default vendor to business and government. The Department of Defense is going to embrace the Google Cloud. I am not sure that some DoD analysts will release their grip on Microsoft PowerPoint, however.

Can a company trust Google? Does Google have a mechanism for governance for data handling, managing its professional staff (hello, Dr. Gebru), and ensuring that automated advertising systems are straight and true? Does Google abandon projects without thinking too much about consequences (hello, Stadia developers and customers)?

My hunch is that reducing business intelligence from a craft to a mouse click sets the stage for:

  1. Potential embedded and intentional data bias
  2. Rapid ill-informed decisions by users
  3. A way to inject advertising and personalization into a service application.

Will the days of the free car washes return to the Google parking lot? Will having meetings in a tree house in the London office become a thing again? Will Google displace other vendors delivering search based applications which engage the user in performing thoughtful analyses?

Time will provide the answer, or rather Looker will provide the answer. Google will collect the money.

Stephen E Arnold, October 11, 2022

Gmail Is for the Googley

October 11, 2022

I spotted an interesting Twitter thread about Google and its beneficial two factor authentication system. You can in theory view the sequence of tweets at this url. The prime mover is Twitter user @chadloder.

The main point is that the Google requires account verification several times a year. Individuals whose life circumstances pivot on free phones (called Obamaphones in the string of tweets) can lose their accounts. The phones are lost, broken, stolen, and replaced in many cases. However, the replacement phones often come with a different phone number.

The result is that these individuals cannot provide the “verification” that Google requires. One of @chadloder’s tweets states:

Not only do many of these benefits sites fail to function properly on mobile devices, but if you lose access to your GMail account, your caseworker will close your case for non-response and you have to start all over again.

Let’s look at this issue from a different point of view. I hypothesize the following:

  1. Google’s executives did not think about homeless Gmail users as individuals
  2. The optimal Gmail user consumes Google advertising
  3. Individuals who do not have a home are not the targets of Google’s advertising system
  4. Those who cannot verify are not part of the desired user cluster.

To sum up, when one is Googley, these problems do not manifest themselves. Advertisers want the plump targets with money to spend.

Stephen E Arnold, October 11, 2022

TikTok E-Commerce a Success—In China, That Is

October 11, 2022

Douyin, TikTok’s predecessor and home-nation counterpart, made a very fruitful decision to emphasize e-commerce in 2020. As owner ByteDance has sought to export that success via TikTok, however, the effort has been less lucrative. In an effort to understand why, Rest of World‘s Rui Ma takes a step back and examines “How TikTok Became an e-Commerce Juggernaut in China.” One key factor was live stream shopping events, an arena Douyin dominates despite entering a year after rival Kuaishou and several years after e-commerce titan Alibaba. The interloper chose to focus on brands themselves and smaller sellers instead of major influencers whose audiences could evaporate with a single PR blunder. Ma considers:

“So how does Douyin actually make money from live streaming e-commerce? If you guessed ‘by commission,’ you would only be half-correct, as the platform actually charges very little — typically 1%–5% of sales value, depending on the category of goods being sold. The take rate is low, partly because of the stiffly competitive environment, and partly because this helps boost turnover as more sellers are encouraged to use the platform. But in order to succeed, most of those sellers will have to pay Douyin in other ways, via different forms of advertising. Sound familiar? That’s right — much like how Amazon sellers pay to show up in top search results, Douyin allows you to advertise your live stream in users’ feeds. TikTok has just one option for creators to have paid posts (straightforwardly called ‘Promote’). But Douyin has at least two more, targeted towards boosting the live streams of business accounts. Together, these are believed to be a significant revenue stream for Douyin, and presumably, still part of the playbook TikTok hopes to bring overseas. Since Douyin requires live stream e-commerce transactions to be completed on the platform instead of being redirected elsewhere, this all forms a ‘closed loop,’ where the user never strays from the app. It’s the ideal flywheel, and the envy of platform companies everywhere.”

Then there is Douyin Partners, an imitation of Alibaba’s Taobao program. Third-party partners will set up and operate a seller’s account, from advertising strategy to storefront to logistics. We are told ByteDance has not yet tried to insert Partners into TikTok. Why did step one, the live streaming e-commerce approach, fail in Europe and the US? We are not sure, but it does not look like ByteDance is ready to throw in the global aspiration towel just yet. Stay tuned.

Cynthia Murrell, October 11, 2022

Gee, A Button Does Not Work? Does It Have Something to Do with Ads?

October 11, 2022

YouTube’s Interactive Rating Buttons Do Not Work

Oh, YouTube! What mistakes are being made on the video-hosting platform now? According to The Verge, YouTube’s newest changes to its likes and dislikes features do not work. YouTube runs on a series of complex algorithms that rely on user feedback. The feedback tells the algorithms whether or not a user enjoys suggested content, and the algorithms are supposed to learn what videos users like and curate individualized content.

It is not working.

Mozilla researchers discovered that the YouTube buttons “dislike,” “not interested,” “stop recommending channel,” and “remove from watch history” do not reliably remove unwanted videos. Even after clicking them, users still see more than half of the videos they do not want to see. Mozilla researchers collected their data with volunteer help:

“Mozilla researchers enlisted volunteers who used the foundation’s RegretsReporter, a browser extension that overlays a general “stop recommending” button to YouTube videos viewed by participants. On the back end, users were randomly assigned a group, so different signals were sent to YouTube each time they clicked the button placed by Mozilla — dislike, not interested, don’t recommend channel, remove from history, and a control group for whom no feedback was sent to the platform.

Using data collected from over 500 million recommended videos, research assistants created over 44,000 pairs of videos — one “rejected” video, plus a video subsequently recommended by YouTube. Researchers then assessed pairs themselves or used machine learning to decide whether the recommendation was too similar to the video a user rejected.”
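The article does not spell out the machine learning model the researchers used to judge those pairs, but the basic operation — scoring a rejected video against a later recommendation for similarity — can be sketched with a stand-in approach. The snippet below is illustrative only, using TF-IDF cosine similarity over hypothetical title text and an arbitrary threshold; it is not Mozilla’s classifier.

```python
# Illustrative stand-in only: Mozilla's actual pair-assessment model is not
# documented here. This sketch scores a rejected video against a later
# recommendation using TF-IDF cosine similarity over text metadata, then
# applies an arbitrary threshold.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def too_similar(rejected_text: str, recommended_text: str, threshold: float = 0.5) -> bool:
    """Return True if the recommendation looks like the video the user rejected."""
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(
        [rejected_text, recommended_text]
    )
    score = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
    return score >= threshold


# Hypothetical pair: a rejected clip and a follow-up recommendation.
print(too_similar(
    "True crime documentary: the unsolved case, full episode",
    "Unsolved true crime cases explained, documentary compilation",
))
```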

It turns out that the “dislike” and “not interested” buttons were “marginally effective,” preventing only 12% of poor recommendations. The “don’t recommend channel” and “remove from history” buttons did better, at 43% and 29% respectively.

Elena Hernandez, a YouTube spokesperson, explained that these buttons are not meant to block all content about a topic. Hernandez criticized the Mozilla team’s report because it did not take into consideration how the algorithms work or that the buttons are deliberately designed not to create echo chambers. She did state, however, that YouTube welcomes academic research, which is why YouTube expanded its Data API through the YouTube Researcher Program.

TikTok and Instagram have similar feedback tools, and user response is similar to what the Mozilla researchers found out about YouTube. Google, YouTube’s parent company, and the other video platforms are not interested in keeping users happy. They want to keep users engaged and clicking on the platform. It is a known Internet fact that when people are upset they are glued to the screen more. Are YouTube, TikTok, and Instagram purposely frustrating users?

Whitney Grace, October 11, 2022

Waking Up to a Basic Fact of Online: Search and Retrieval Is Terrible

October 10, 2022

I read “Why Search Sucks.” The metadata for the article is, and I quote:

search-web-email-google-streaming-online-shopping-broken-2022-4

I spotted the article in a newsfeed, and I noticed it was published in April 2022 maybe? Who knows. Running a query on Bing, Google, and Yandex for “Insider why search sucks” yielded links to the original paywalled story. The search worked. The reason has more to do with search engine optimization, Google’s prioritization of search-related information, and the Sillycon Valley source.

Why was there no “$” to indicate a paywall? Why was the date of publication not spelled out in the results? I have no idea. Why did one result identify Savanna Durr as the author when the article itself said Adam Rogers was the author?

So, for this one query and for billions of users, do free, ad-supported Web search engines work so darned well? Free and good enough are the reasons I mention. (Would you believe that some Web search engines have a list of “popular” queries, bots that look at Google results, and workers who tweak the non-Google systems to behave sort of like Google? No. Hey, that’s okay with me.)

The cited article “Why Search Sucks” takes the position that search and retrieval is terrible. Believe me. The idea is not a new one. I have been writing about information access for decades. You can check out some of this work on the Information Today Web site or in the assorted monographs about search that I have written. A good example is the three editions of the “Enterprise Search Report.” I have been consistent in my criticism of search. Frankly not much has changed since the days of STAIRS III and the Smart System. Over the decades, bells and whistles have been added, but to find what one wants online requires consistent indexing, individuals familiar with sources and their provenance, systems which allow the user to formulate a precise query, and online systems which do not fiddle the results. None of these characteristics is common today unless you delve into chemical structure search and even that is under siege.

The author of the “Why Search Sucks” article focuses on some use cases. These are:

  • Email search
  • Social media search (Yep, the Zuckbook properties and the soon to be a Tesla fail whale)
  • Product search (Hello, Amazon, are you there?)
  • Streaming search.

The write up provides the author’s or authors’ musings about Google and those who search. The comments are interesting, but none moves the needle.

Stepping back from the write up, I formulated several observations about the write up and the handling of search and its suckiness.

First, search is not a single thing. Specific information retrieval systems and methods are needed for certain topics and specific types of content. I referenced chemical structures intentionally because the retrieval systems must accept visual input, numerical input, words, and controlled term names. A quite specific search architecture and user training are required to make certain queries return useful results. Give Inconel a whirl if you have access to a structured search system. The idea that there is a “universal search” is marketing and just simple minded. Believe it or not, one of today’s Googlers complained vociferously on a conference call with a major investment bank about my characterization of Google and the then almost useless Yahoo search.

Second, the pursuit of “good enough” is endemic among researchers and engineers in academic institutions and search-centric vendors. Good enough means that the limits of user capability, system capacity, budget, and time are balanced. Why not fudge how many relevant results exist for a user looking for a way to convert a link into a dot point on a slide in a super smart and busy executive’s PowerPoint for a luncheon talk tomorrow? Trying to deliver something that works and meets measurable standards of precision and recall (a toy calculation appears after these observations) is laughable to some in the information retrieval “space” today.

Third, the hope that “search startups” will deliver non-sucking search is amusing. Smart people have been trying to develop software which delivers on-point results with near real time information for more than 50 years. The cost and engineering required to implement this type of system are enormous, and the handful of organizations capable of putting up the money, assembling the technical team, and getting the plumbing working is shrinking. Start ups. Baloney.
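Since the second observation leans on precision and recall, here is the textbook arithmetic with a made-up result list; the documents and counts are hypothetical, but the two measures are the standard ones.

```python
# Textbook measures with made-up data:
# precision = relevant results retrieved / results retrieved
# recall    = relevant results retrieved / relevant results that exist


def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 10 documents returned, 4 of them on point,
# out of 8 on-point documents in the collection.
retrieved = {f"doc{i}" for i in range(1, 11)}
relevant = {"doc2", "doc4", "doc7", "doc9", "doc12", "doc15", "doc18", "doc21"}
print(precision_recall(retrieved, relevant))  # (0.4, 0.5)
```

Good enough, in practice, means accepting numbers like these without ever telling the user how much of the relevant material never surfaced.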

Net net: I find it interesting that more articles express dismay and surprise that today’s search and retrieval systems suck. After more than half a century of effort, that’s where we are. Fascinating it is that so many self-proclaimed search experts are realizing that their self-positioning might be off by a country mile.

Stephen E Arnold, October 10, 2022

Proposed EU Rule Would Allow Citizens to Seek Restitution for Harmful AI

October 10, 2022

It looks like the European Commission is taking the potential for algorithms to cause harm seriously. The Register reports, “Europe Just Might Make it Easier for People to Sue for Damage Caused by AI Tech.” Vice-president for values and transparency Věra Jourová frames the measure as a way to foster trust in AI technologies. Apparently EU officials believe technical innovation is helped when the public knows appropriate guardrails are in place. What an interesting perspective. Writer Katyanna Quach describes:

“The proposed AI Liability Directive aims to do a few things. One main goal is updating product liability laws so that they effectively cover machine-learning systems and lower the burden-of-proof for a compensation claimant. This ought to make it easier for people to claim compensation, provided they can prove damage was done and that it’s likely a trained model was to blame. This means someone could, for instance, claim compensation if they believe they’ve been discriminated against by AI-powered recruitment software. The directive opens the door to claims for compensation following privacy blunders and damage caused by poor safety in the context of an AI system gone wrong. Another main aim is to give people the right to demand from organizations details of their use of artificial intelligence to aid compensation claims. That said, businesses can provide proof that no harm was done by an AI and can argue against giving away sensitive information, such as trade secrets. The directive is also supposed to give companies a clear understanding and guarantee of what the rules around AI liability are.”

Officials hope such clarity will encourage developers to move forward with AI technologies without the fear of being blindsided by unforeseen allegations. Another goal is to build the current patchwork of AI standards and legislation across Europe into a cohesive set of rules. Commissioner for Justice Didier Reynders declares citizen protection top priority, stating, “technologies like drones or delivery services operated by AI can only work when consumers feel safe and protected.” Really? I’d like to see US officials tell that to Amazon.

Cynthia Murrell, October 10, 2022
