Algolia: A New Approach to Relevance?

February 21, 2022

Algolia is a company providing search and retrieval services to a number of companies. A call for résumés provides some interesting assertions about the company, its philosophy, and its goal.

The goal interests me. The posting on the Algolia Web site says:

Our mission is to return relevant search results at all times and give companies the ability to tailor those results to their own specific needs. We are tasked with reimagining a core piece of Algolia’s technology: how search results are ranked. To that end, we are still early in our building, and we are looking for someone who can help us perform experiments and manage the technical aspects of our pilot program, including building clients for users to test our work and tools to evaluate the impact of our changes.

I like the idea of tailoring “search,” which is certainly okay if someone knows what he or she is looking for. I like the idea of ranking because relevance is, to some people, helpful. I like the idea that the company is “early in building.” The right person with the right stuff will make an impact. I like the idea of measuring results, which works reasonably well when the people involved know what they need to find.

There are several challenges in delivering or finding better ways to rank search results.

First, today the idea of knowing the corpus and using old-fashioned measures like precision and recall is not as sexy as capsule network or caps net methods.
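For readers who have not bumped into the old-fashioned measures, here is a minimal sketch of how precision and recall are computed for a single query. The function name and the document IDs are hypothetical, made up for illustration; nothing here comes from Algolia.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document IDs the search system returned
    relevant:  document IDs a human judged relevant to the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set  # relevant documents that were actually returned

    # Precision: what fraction of the returned documents were relevant?
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    # Recall: what fraction of the relevant documents were returned?
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical IDs: four documents returned, three judged relevant, two in common.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

The catch is the second argument: someone has to know the corpus well enough to say which documents are relevant.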

Second, users who want to formulate complex search queries like those required to extract semi useful information from Google or a Dataminr feed of social media are rare birds. I heard at one big search outfit that fewer than three percent of queries are a result of complex search statements; for example, site: or filetype:. Serving experts, analysts, and intelligence professionals is different from serving the ingredients for a Sicilian pizza.

Third, the now threadbare truism of lots of data, changing rapidly, and incorporating different content types and a veritable fun house of metadata requires some innovation. So far the best efforts of some bright folks have led to outright failure (Autonomy, Fast Search & Transfer, et al) or recycling endlessly with minor variations the functionality of everyone’s favorite fighter of Amazon, Elastic.

I noted some interesting supportive information in the write up; for instance, the candidate with the right stuff must have grit (the sort of effort required to get an advanced degree from MINES ParisTech or Université Paris Saclay or the toughness required to deal with a wealthy family or a generational link to the Capetians). Other ingredients in the “right stuff” trois étoiles cannelés of Bordeaux:

  • Trust
  • Care
  • Candor
  • Humility

I am eager to explore the new approach to relevance. But I harbor an abiding affection for a clear explanation of the content indexed and good old Boolean logic. Snorkels, caps nets, and a 21st century approach to relevance? Meh.

Stephen E Arnold, February 21, 2022

A Google Dork for Everyone

February 21, 2022

In my lectures about open source intelligence for law enforcement and other government professionals, I mention Google Dorks. I won’t go into detail, but the “dork” is a fancy way of saying that an information professional with a knowledge of specialized commands can get semi-on-point results from the online ad outfit. See for example this link. Do Googlers wear T shirts emblazoned with the phrase “Don’t be evil”? I saw such a shirt with the message “Don’t be Google,” but I may have misread.

What’s interesting is that Google Dorking is finding its way into the mainstream of the people who perceive themselves as “experts in online research.” Yep, the expertise is often similar to mastering an automatic teller machine, but that’s possibly a characteristic of our Covid era.

“Google Search Is Dying” has undergone a number of updates. The write up states:

Google still gives decent results for many other categories, especially when it comes to factual information. You might think that Google results are pretty good for you, and you have no idea what I’m talking about. What you don’t realize is that you’ve been self-censoring yourself from searching most of the things you would have wanted to search. You already know subconsciously that Google isn’t going to return a good result.

The punch line is “Google is dying.” Yeah, no kidding. When the wizard from Verity and Yahoo got involved, it was not dying. It was gifted a MOAB (that is the mother of all bombs or a disconnect from a query and stuff like precision and recall).

So what’s the fix?

A Google Dork.

Enter a query and stick “reddit” in the query. The idea is that some entity (bot or humanoid) will have posted more useful, authentic, relevant information on that service. One can be sporty and try wiki at the end of a query as well.
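The trick can be sketched in a few lines. This is an illustration only: the helper names are hypothetical, and the URL pattern (the public search page with a `q` parameter) is an assumption about how one would hand the query to a browser, not an official Google API.

```python
from urllib.parse import urlencode


def build_dork(terms, suffix=None, site=None, filetype=None):
    """Assemble a query string from plain terms plus optional add-ons.

    suffix:   a bare word stuck on the end, such as "reddit" or "wiki"
    site:     adds a site: operator, e.g. site:reddit.com
    filetype: adds a filetype: operator, e.g. filetype:pdf
    """
    parts = [terms.strip()]
    if suffix:
        parts.append(suffix)
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def search_url(query):
    """Wrap the query in a search URL (assumed pattern, see the note above)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# The everyman's dork: tack "reddit" onto the end of the query.
query = build_dork("best espresso grinder", suffix="reddit")
print(query)
print(search_url(query))

# The sportier variant uses an explicit operator instead of a bare word.
print(build_dork("best espresso grinder", site="reddit.com"))
```

The bare-word version only nudges the ranking; the `site:` operator restricts results to the named domain.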

Google Dorking for everyone, even the self proclaimed experts in online information search and retrieval! The challenge is that Google advertising is pumping cash, and that plus the bonuses for senior management is what makes Google search the outstanding service it is.

Stephen E Arnold, February 21, 2022

First Apple, Then the Google, and Now a Young Person: Facebook Faces Phrastic Fault Finding

February 21, 2022

I am not sure I fully understand “What Does A Platform Look Like When It’s Dying?” The write up strikes me as somewhat mean spirited. Name a bad thing Facebook has done? Used corrosive information flows to rip apart social structures? Hey, what about the tweeter thing?

Created a marketplace for contraband? Hey, the Dark Web has been in that game for a decade.

Fostered human trafficking and child sex crime? Definitely not a pioneer in this area.

Overall the Facebook or Zuckbook is a manifestation of what’s possible online: Monopolies, ecosystems of idiosyncratic behavior like “No, you can’t change your icons”, and getting paid anytime a user or an advertiser clicks. Online is a fine, tidy, well-lit place (sorry, Ernest, I can’t do the “lighted” word form).

The write up states:

Well, two days ago, Meta, which is what Facebook calls itself now for some reason, tried to hold a post-Super Bowl Foo Fighters concert in its new VR platform that no one wants, but users couldn’t figure out how to actually access it and the ones that did said it looked like shit and sucked. Also, yesterday, Mark Zuckerberg announced that Facebook’s corporate values now include the bizarre tagline “Meta, metamates, me.” And over on Instagram, Reels, the video feed on Meta’s once-cool photo app, is filling up with silent auto-playing one-second video memes everyone hates. Meanwhile, TikTok’s owner ByteDance reported last month that their 2021 sales grew by 70%. So, you know, you connect the dots there.

I think this means that the China-linked TikTok is the big dog of social media and video now.

Bad for Facebook? Yep. Bad for YouTube? Yep. Bad for identifying susceptible individuals who can be coerced to cooperate with a foreign power? Nope.

Stephen E Arnold, February 21, 2022

Department of Defense: Troubling News about Security

February 21, 2022

It looks like a lack of resources and opaque commercial cloud providers are two factors hampering the DOD’s efforts to keep the nation cyber-safe. Breaking Defense discusses recent research from the Pentagon’s Director of Operational Test and Evaluation (DOT&E) in, “Pentagon’s Cybersecurity Tests Aren’t Realistic, Tough Enough: Report.” We encourage anyone interested in this important topic to check out the article and/or the report itself. Reporter Jaspreet Gill summarizes:

“[The report] states DoD should refocus its cybersecurity efforts on its cyber defender personnel instead of focusing primarily on the technology associated with cyber tools, networks and systems, and train them to face off against more real threats earlier in the process. For now, cybersecurity ‘Red Teams’ are stretched too thin and the ones that do test military systems are doing it with one hand tied behind their back compared to what actual adversaries would do, the report said.”

Enabling these teams to do their best work would mean giving them more time on the network to test vulnerabilities, more extensive toolsets, realistic rules of engagement, and better end-to-end planning, the report explains. In addition, it states, cyber security training must be expanded to include mission defense teams, system users, response-action teams, commanders, and network operators. We also learn that current funding practices effectively prohibit setting up offices dedicated to cyber technology effectiveness and training. Seriously? See the write-up for more recommendations that should be obvious.

The following bit is particularly troubling in this age of increasing privatization and corporate power. Gill informs us:

“The assessment also found DoD’s cyber concerns increasingly mirror those in the commercial sector due to increasing reliance on commercial products and infrastructure, especially with cloud services. The report recommends the Pentagon renegotiate contracts with commercial cloud providers and establish requirements for future contracts. ‘The DOD increasingly uses commercial cloud services to store highly sensitive, classified data, but current contracts with cloud vendors do not allow the DOD to independently assess the security of cloud infrastructure owned by the commercial vendor, preventing the DOD from fully assessing the security of commercial clouds. Current and future contracts must provide for threat-realistic, independent security assessments by the DOD of commercial clouds, to ensure critical data is protected.’”

Well yes—again that seems obvious. Public-private partnerships should be enacted with a dash of common sense. Unfortunately, that can be difficult to come by amidst bureaucracy.

Cynthia Murrell, February 21, 2022

Dumpster Fire Has Been Replaced

February 20, 2022

Hats off to jkhendrickson for creating a useful way to describe an intentionally flawed system. The phrase lurks within one of the case examples of an interesting Google YouTube enforcement action.

Here’s the phrase:

the byzantine garbage fire

The phrase begs for an acronym, a form loved by millennials, GenXers, and the military; therefore, we have:

BGF

What caused the BGF? Nothing much. Google YouTube unilaterally decided to delete videos.

Hey, free means outfits can do what they want when they want.

Nevertheless:

BGF

The “b” means byzantine garbage fire.

Stephen E Arnold, February 20, 2022

Smart Software and the Cloud, Google, How Is That Working Out?

February 19, 2022

I read “Google Drive Is Flagging Some MacOS Files for Copyright Violation.” The flagging is using Google’s smart software. The copyright violations concern the outfit Google pays a billion or so each year to make Google search the right choice for iPhone users. Yep, the right choice because Google has smart software. Smart software can connect to automated systems which send legal sounding letters which threaten fines and more to alleged offenders.

The write up states:

A disgruntled Reddit user recently reported that a ‘.DS_Store’ file on their Google Drive was flagged by the search giant for violating its copyright infringement policy. Apparently, this isn’t the first time this issue has been encountered as MacOS users also reported experiencing similar problems last month.

This is a small sample and the flagging may have been just some fantasy moment in the metaverse.

I noted this follow on statement:

A similar incident occurred recently when Google Drive accidentally flagged almost empty files containing just a few numbers for violating the company’s copyright infringement files.

Are violators able to call a Googley humanoid to provide input? Sure. Plus Google is working on a fix. A job for an intern? Maybe.

Stephen E Arnold, February 19, 2022

Blue Chip Firm May Have Put Its Finger on the Roulette Wheel

February 18, 2022

The Financial Times, protecting the orange newspaper’s content with a paywall, published an interesting item about McKinsey & Company. The outfit is allegedly the big dog of consulting firms. Its super sharp consultants, however, engineered the firm into a corner, if the orange newspaper’s report is accurate.

“US Appeals Court Reinstates Racketeering Claim Against McKinsey” recounts an allegation made by Jay Alix, whose AlixPartners competes with the Blue Chip Big Dog. The article works in references to McKinsey’s advice to purveyors of opioid variants, but McKinsey was betting on bankruptcies to generate revenue.

The Alix matter,

alleged that McKinsey violated the Racketeer Influenced and Corrupt Organizations (Rico) Act, accusing the firm of filing misleading disclosure statements to the bankruptcy court in order to secure consulting appointments worth tens of millions of dollars. AlixPartners lost business as a result, he alleged.

McKinsey, acting in the optimal precepts of agile management, has ousted its managing partner. Kevin Sneader uttered a pithy truism in 2018:

Sorry.

Does this story have legs? Not for those outside the rarefied atmosphere of the Blue Chip consulting firms. PR mastery? Money can’t buy love, but it can buy some things. Compare the news coverage of Facebook’s quarterly zuck up or the NSO Group’s software. The McKinsey matter, which may involve a far more impactful series of actions, is not of much interest. That’s too bad. Ethical compasses have to be manufactured somewhere.

McKinsey asserts that Alix’s allegations are untrue. Okay.

Stephen E Arnold, February 18, 2022

Bloomberg and the Japan Times on the Plight of Man: A TikTok Video to Come?

February 18, 2022

I read “‘Sapiens’? Humans Aren’t Wise, Just Too Smart for Our Own Good.” Bloomberg is the firm providing the trading system to many of Wall Street’s brightest minds. Japan is the country which has created the management actions of Toshiba and the Toyota subscription to remote starting. What I noted in the write up was this passage:

The late B.K.S. Iyengar, a yogi, once said that intelligence, like money, is a good servant but a bad master. Even science has explored why and how smart people can be so foolish. In a nutshell, it comes down to a cocktail of egocentrism, narcissism and arrogance that overpowers everything else — or what the ancient Greeks called hubris.

From the assertion that spy chips were on motherboards to ways to make life interesting for automobile owners, it is interesting to think about hubris. And the yogi. Was he talking about those who think technology solves mankind’s problems?

Stephen E Arnold, February 18, 2022

Another Example of the Corrosive Function of Digital Information

February 18, 2022

“In Praise of Search Tools” contains an interesting statement. Here it is:

the shaping-up of the book that Duncan describes as he charts the advent of modern search tools might also be seen as a pulling-apart of the book. The alphabetical table that is the index “breaks down a book into its constituents.” Its structure is entirely independent from the structure of the work, sacrificing the latter for the reader’s better convenience. The alphabetical order used by the indexer breaks texts up into so many word-sized bits, but the dismemberment at issue in the culture of indexing was sometimes literal, as when concordance-makers took scissors to the pages whose words they were regrouping. In a 1919 article on the making of a concordance to the poetry of William Wordsworth, a Cornell professor describes how the eight volumes of the Oxford edition were transmuted by his team into 210,944 paper slips: records of each appearance of each of the poet’s keywords.

Interesting and in line with my ASIS Eagleton Lecture given in the mid 1980s.

Stephen E Arnold, February 18, 2022

Google Joke: A Googler Walks into a Coffee Shop with a Regulator and…

February 17, 2022

I read an amusing write up called “Google Keeps Android Ad Tool Into At Least 2024, Exploring Other Options.” I think the writer of the article is serious, not crafting a joke for Joe Rogan’s much admired “Man Show” comments. Here is the passage I found semi amusing:

Google said it would give “substantial notice” before axing what is known as AdId. But it will immediately begin seeking feedback on its proposed alternatives, which Google said aim to better protect users’ privacy and curb covert surveillance.

But better than what? What happens if there are technical issues in 2024? A Googler walks into a coffee shop with a regulator and says, “We need more time to better protect users’ privacy and curb covert surveillance.”

The regulator laughs out loud because he was thinking of Apple marginalizing Facebook. Perhaps the Google is delivering some Meta-Aid. Whoops. I meant to type Meta AdID.

Stephen E Arnold, February 17, 2022
