Taking Time for Search Vendor Limerance

April 18, 2018

Life is a bit hectic. The Beyond Search and the DarkCyber teams are working on the US government hidden Web presentation scheduled this week. We also have final research underway for the two Telestrategies ISS CyberOSINT lectures. The first is a review of the DarkCyber approach to deanonymizing Surface Web and hidden Web chat. The second focuses on deanonymizing digital currency transactions. Both sessions provide attendees with best practices, commercial solutions, open source tools, and the standard checklists which are a feature of my LE and intel lectures.

However, one of my associates asked me if I knew what the word “limerance” meant. This individual is reasonably intelligent, but the bar for brains is pretty low here in rural Kentucky. I told the person, “I think it is psychobabble, but I am not sure.”

The fix was a quick Bing.com search. The wonky relevance of the Google was the reason for the shift to the once indomitable Microsoft.

Limerance, according to Bing’s summary of Wikipedia, means “a state of mind which results from a romantic attraction to another person typically including compulsive thoughts and fantasies and a desire to form or maintain a relationship and have one’s feelings reciprocated.”


Upon reflection, I decided that limerance can be liberated from the woozy world of psychologists, shrinks, and wielders of water witches.

Consider this usage in the marginalized world of enterprise search:

Limerance: The state of mind which causes a vendor of key word search to embrace any application or use case which can be stretched to trigger a license to the vendor’s “finding” system.



About That Google Question Answering: Books, Scholar, and Open Source at Its Talon Tips

April 17, 2018

Googzilla prides itself on consuming search queries. Answering those questions? That’s a matter for discussion. Note that here in Harrod’s Creek we understand that if Google does not point to an entity, Web site, or factoid—that entity, Web site, or factoid does not exist. Who knew that those in Harrod’s Creek were into epistemology?

However, Pagal Parrot found “10 Questions Even Google Can’t Answer.” Let us take a look at the write up’s exemplary 10 questions:

“1. Why does a round pizza come in a square box?

2. Why are boxing rings square?

3. What is Satan’s last name?

4. Why do we press harder on a remote control when we know the batteries are flat?

5. Why is Google not the most translated website?

6. Why do banks charge a fee on ‘insufficient funds’ when they know there is not enough?

7. Why is it that people say they ‘slept like a baby’ when babies wake up, like, every two hours?

8. Why do Baidu lead Google in China?

9. Do Atheist also swear by the Bible /Quran when they go to court?

10. Why do people get angry each time another passenger sits beside them in a seat?”

These questions also beg another question: Do people spend time trying to dumbfound Google? It appears that the answer is, “Folks do try to bedevil the GOOG.”

The article is mostly for giggles, but there are definitely more than 10 questions Google cannot answer. Here is one: When will Google answer questions with precision and recall balanced for relevance and “accuracy”? Would advertisers respond to the functionality?
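For readers who want the precision and recall terms pinned down, here is a minimal sketch of how the two numbers are usually computed for a single query. The document ids and relevance judgments are made up for illustration, and the F1 score is just one common way to balance the two. Nothing here reflects how Google scores results.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: list of document ids the engine returned
    relevant:  set of document ids a human judged relevant
    """
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    # Precision: what share of the returned documents were relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: what share of the relevant documents were returned.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: four documents returned, three of them relevant,
# and one relevant document (d9) missed entirely.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3", "d4", "d9"}

p, r = precision_recall(retrieved, relevant)
print(p, r, f1(p, r))
```

A system can push precision up by returning fewer, safer results, or push recall up by returning everything in sight. The hard part is holding both up at once.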

Whitney Grace, April 17, 2018

Google Argues With Russia About Website Rankings

April 10, 2018

Amidst its employee petitions and the increasing concern about YouTube videos for children, Google is annoyed with Russia.

Google fiddled with its ranking algorithm to stop the dissemination of fake news and Russia believes it is biased against two of its news agencies. Reuters describes more of the argument in the story, “Google Seeks To Defuse Row With Russia Over Website Rankings.” Roskomnadzor called out Alphabet Inc. and its popular search engine Google, when it claimed that Google pushed Russian media sites Sputnik and Russia Today into lower search results.

Eric Schmidt claimed that Google would not be deleting those links; instead, they would be pushed lower in search results. Russia claimed Google discriminated against Russia Today and Sputnik and said it would take action if necessary. Google responded:

“ ‘We’d like to inform you that by speaking about ranking of web-sources, including the websites of Russia Today and Sputnik, Dr. Eric Schmidt was referring to Google’s ongoing efforts to improve search quality,’ Google said in a letter posted on Roskomnadzor’s website… ‘We don’t change our algorithm to re-rank,’ it added. A Google spokeswoman confirmed the letter had been sent by the company but provided no further comment.”

Years ago Mr. Brin’s trip to space fizzled. Now the search giant is finding fault with a country known to use interesting methods to solve problems.

Whitney Grace, April 10, 2018

Mondeca: Another Semantic Search Option

April 9, 2018

Mondeca, based in France, has long been focused on indexing and taxonomy. Now they offer a search platform named, simply enough, Semantic Search. Here’s their description:

“Semantic search systems consider various points including context of search, location, intent, variation of words, synonyms, generalized and specialized queries, concept matching and natural language queries to provide relevant search results. Augment your SolR or ElasticSearch capabilities; understand the intent, contextualize search results; search using business terms instead of keywords.”

A few details from the product page caught my eye. Let’s begin with the Search functionality; the page succinctly describes:

“Navigational search – quickly locate specific content or resource. Informational search – learn more about a specific subject. Compound term processing, concept search, fuzzy search, simple but smart search, controlled terms, full text or metadata, relevancy scoring. Takes care of language, spelling, accents, case. Boolean expressions, auto complete, suggestions. Disambiguated queries, suggests alternatives to the original query. Relevance feedback: modify the original query with additional terms. Contextualize by user profile, location, search activity and more.”

The software includes a GUI for visualizing the semantic data, and features word-processing tools like auto complete and a thesaurus. Results are annotated, with key terms highlighted, and filters provide significant refinement, complete with suggestions. Results can also be clustered by either statistics or semantic tags. A personalized dashboard and several options for sharing and publishing round out my list. See the product page for more details.

Established in 1999, Mondeca delivers pragmatic semantic solutions to clients in Europe and North America, and is proud to have developed their own, successful semantic methodology. The firm is based in Paris. Perhaps the next time our beloved leader, Stephen E Arnold, visits Paris, the company will make time to speak with him. Previous attempts to set up a meeting were for naught. Ah, France.

Cynthia Murrell, April 9, 2018

Video Search: Still a Challenge

April 6, 2018

As MIT Technology Review describes in its article, “The Next Big Step for AI? Understanding Video,” artificial intelligence still tends to have trouble correctly interpreting video. A recent slew of new jobs at YouTube (owned by Google) underscores this flaw—“YouTube is Hiring 10,000 People to Police Offensive Videos,” reports the New York Post. When it comes to objectionable content, algorithms just don’t get it. Yet. Meanwhile, the PR machine keeps running.

MIT Tech editor Will Knight discusses some promising solutions in the above article, beginning close to home with a collaboration between MIT and IBM. He writes:

“MIT and IBM this week released a vast data set of video clips painstakingly annotated with details of the action being carried out. The Moments in Time Dataset includes three-second snippets of everything from fishing to break-dancing. ‘A lot of things in the world change from one second to the next,’ says Aude Oliva, a principal research scientist at MIT and one of the people behind the project. ‘If you want to understand why something is happening, motion gives you lot of information that you cannot capture in a single frame.’” … “The MIT-IBM project is in fact just one of several video data sets designed to spur progress in training machines to understand actions in the physical world. Last year, for example, Google released a set of eight million tagged YouTube videos called YouTube-8M. Facebook is developing an annotated data set of video actions called the Scenes, Actions, and Objects set.”

Knight also mentions Twenty Billion Neurons, which, he notes:

“… Created a custom data set by paying crowdsourced workers to perform simple tasks. One of the company’s cofounders, Roland Memisevic, says it also uses a neural network designed specifically to process temporal vision information.”

So, we should not be surprised if, soon, AI can comprehend what it “sees.” Meanwhile, sites that host video content would do well to employ the judgment of humans.

Cynthia Murrell, April 6, 2018

Build an Alternative Google: How To Wanted

April 6, 2018

Hacker News presented an interesting question, “How would you build an internet scale web crawler?” We have been talking with companies which have developed Internet search systems that are not available for free Web search. Those conversations have produced some fascinating information. Some of the data will be included in my upcoming lecture for a government agency and then in my two presentations at the June 2018 Telestrategies ISS Conference in Prague.

What was interesting about this question was that few people responded. That is interesting because my team’s research for my new presentations on deanonymizing encrypted chat and deanonymizing digital currency transactions pivots on comprehensive Internet indexing. In fact, more companies are indexing Internet content than at any time in the last 10 years.

The second issue the post triggered was a realization that only a handful of people jumped on the topic. This low response to the question in itself is interesting. With more activity in indexing, why aren’t more people helping out JustinGarrison? That’s a question worth thinking about.

Third, one of the responses to the Hacker News question was a pointer to the YaCy.net open source project. We once included this technology in our Internet Research for Law Enforcement training program. My recollection of the system is fuzzy, so I will get one of my team to take a look.

The final thought the Hacker News story triggered was, “Have people just accepted Bing, Google, Qwant, and a handful of metasearch systems as too dominant to challenge?” My view is that an opportunity exists to create a public facing Internet search and retrieval system. The reason? Outstanding alternatives to Bing, Google, and Qwant are available for those who qualify as customers and who are willing to pay the license fees.

My hunch is that just as enterprise search has coalesced around the open source Lucene/Solr technologies, free Web search has become “game over” because the ad supported model has won.

The problem, of course, is that a person looking for information usually does not realize that free Web search results are not comprehensive, timely, or objective.

I hope individuals like JustinGarrison get the information needed to seize an opportunity in Internet search.
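For anyone curious what the Hacker News question looks like at its very smallest, here is a rough sketch of the core loop of a crawler: a frontier queue, a set of already seen URLs, and a page budget. It is not how YaCy or any commercial system works. The link graph is hypothetical and held in memory so the example runs offline. A crawler at Internet scale would also need politeness rules, robots.txt handling, distributed storage, and much more.

```python
from collections import deque


def crawl(seed_urls, fetch_links, max_pages):
    """Breadth-first crawl. Returns URLs in the order they were fetched.

    fetch_links: callable taking a URL and returning its outgoing links.
                 A real crawler would download and parse the page here;
                 it is injected so this sketch needs no network access.
    max_pages:   stop after this many pages have been fetched.
    """
    frontier = deque()
    seen = set()
    for url in seed_urls:
        if url not in seen:
            seen.add(url)
            frontier.append(url)

    fetched = []
    while frontier and len(fetched) < max_pages:
        url = frontier.popleft()
        fetched.append(url)
        for link in fetch_links(url):
            # Queue each URL only once, no matter how many pages link to it.
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return fetched


# Hypothetical link graph standing in for the Web.
LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": ["a.example"],
    "d.example": [],
}

order = crawl(["a.example"], lambda url: LINKS.get(url, []), max_pages=10)
print(order)
```

The seen set is what keeps the loop from circling forever on pages that link back to each other. The page budget is a stand-in for the cost limits any real indexing operation faces.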

Stephen E Arnold, April 6, 2018

Google and Search: More Churn Turmoil

April 4, 2018

I read “John Giannandrea, Head of Google’s Cornerstone Web-Search Unit, Steps Down.” I found the phrase “steps down” amusing. I think the wizard went to the Apple orchard. Since Mr. Giannandrea ran search, Google search has become less useful to me. Now I have to use multiple search systems to locate what I think are slam dunk queries. Nope. I get some pretty off the wall Google search results.

Two points jumped out of this story for me.

First, Google is forced to go back to one of the early Googlers from the AltaVista.com team. (I did some work for an outfit called PersimmonIT, which was a provider to AltaVista.com.) What’s interesting is that Jeff Dean is one of the really old Google guard. I know he’s bright and capable but that begs this question: “Aren’t there younger, smarter, and as or more capable professionals to get the over hyped Google artificial intelligence operation underway?” I can suggest at least one candidate from the DeepMind team. But, hey, who really cares?

Second, search must be pretty broken. The job has fallen to another old timer at the GOOG. Same question: “Aren’t there younger, more with it technical wizards who can handle the massively complex, software wrapped, advertising centric systems?” (Yep, systems because there is “regular” search and “mobile” search. Two search systems are part of the index puzzle Google has built over the years. Plus, do you remember Google’s “universal” search which, as a Bear Stearns legend has it, was cooked up over a weekend to deal with a PR problem triggered by an analyst’s report to which yours truly contributed? You know “universal.” One query gets you blog content, new Web sites, Google Books, Google Scholar, yada yada. That doesn’t exist and probably will never come to pass for some pretty good reasons. But saying something is just as good as delivering, I assume.)

Net net: Google is now a mature company. The founders have distanced themselves from the legal troubles in which the company is mired. The company is caught in the Silicon Valley backlash. The Oracle Java thing is a Freddy Krueger thing for the GOOG. Management change is a companion to the craziness which seems to characterize some units of the company.

I wonder if a query launched from a desktop computer will return on point results in the near future. I sure hope so.

Stephen E Arnold, April 4, 2018

Hidden Webs May Be a Content Escape Hatch

March 28, 2018

Beyond Search and the DarkCyber research team discussed a topic which raised some concern among the team. Censorship may be nudging some individuals to the hidden Webs; for example, the Dark Web, i2p, ZeroWeb, etc.

In the wake of several US school shootings, the outcry for more control over gun sales has grown louder. Many organizations have begun to distance themselves from firearms related topics, such as YouTube, which recently removed all of its firearms content. The response has created a strange subculture, as we discovered in this recent NPR story, “Restricted by YouTube, Gun Enthusiasts are Taking Their Videos to Pornhub.”

According to the story:

“InRangeTV, which has some 144,000 subscribers on its YouTube channel, has chosen to publish videos on an adult website called Pornhub…InRangeTV also recently wrote on Facebook that it is defending “Why are we seeing continuing restrictions and challenges towards content about something demonstrably legal yet not against that which is clearly illegal?” It then posted links to YouTube videos on synthesizing meth and other illicit acts.”

This is an odd place for a freedom of speech battle to take place, but not completely. It seems right in line with something Larry Flynt would have pursued. Conversely, as far right leaning content moves closer and closer to the dark web (pornography is not the dark web, but it feels like that’s the direction this is heading), the dark web is beginning to try to take down YouTube with rightwing trolling at an extreme level. What all this means for average citizens is that search is going to get more complicated, no matter what you are hunting for.

We also noted that a site dedicated to off color content has become the new home for those who are interested in weaponry. We think the shift may be gaining momentum. How does one “find” these types of content? Perhaps encrypted chat or old fashioned word of mouth messaging. Worth watching this possible shift.

Patrick Roland, March 28, 2018

Million Short: A Metasearch Option

March 22, 2018

An interview at Forbes delves into the story behind Million Short, an alternative to Google for Internet Search. As concerns grow about online privacy, information accuracy, and filter bubbles, options that grant the user more control appeal to many. Contributor Julian Mitchell interviews Million Short founder and CEO Sanjay Arora in his piece, “This Search Engine Startup Helps You Find What Google Is Missing.” Mitchell informs us:

“Founded in 2012, Million Short is an innovative search engine that takes a new and focused approach to organizing, accessing, and discovering data on the internet. The Toronto-based company aims to provide greater choices to users seeking information by magnifying the public’s access to data online. Cutting through the clutter of popular searches, most-viewed sites and sponsored suggestions, Million Short allows users to remove up to the top one million sites from the search set. Removing ‘an entire slice of the web’, the company hopes to balance the playing field for sites that may be new, suffer from poor SEO, have competitive keywords, or operate a small marketing budget. Million Short Founder and CEO Sanjay Arora shares the vision behind his company, overthrowing Google’s search engine monopoly, and his insight into the future of finding information online.”

The subsequent interview gets into details, like Arora’s original motivation for creating Million Short: search is too important to be dominated by just a few companies, he insists. The pair explores both advantages and challenges the company has seen, as well as a look to the future. See the article for more.
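The “remove the top sites” idea is simple enough to sketch in a few lines. What follows is an illustration only, not Million Short’s implementation. The popularity ranks, the host names, and the `drop_top_sites` function are all hypothetical.

```python
from urllib.parse import urlparse


def drop_top_sites(results, site_rank, cutoff):
    """Keep results whose host is not among the `cutoff` most popular sites.

    results:   list of result URLs in the engine's original order
    site_rank: dict mapping host -> popularity rank (1 = most popular)
    cutoff:    how many top-ranked sites to remove from the result set
    """
    kept = []
    for url in results:
        host = urlparse(url).netloc.lower()
        rank = site_rank.get(host)  # None means the host has no rank
        if rank is None or rank > cutoff:
            kept.append(url)
    return kept


# Hypothetical popularity ranks and search results.
site_rank = {
    "bigportal.example": 1,
    "megastore.example": 2,
    "smallblog.example": 250000,
}
results = [
    "https://bigportal.example/pizza-boxes",
    "https://smallblog.example/why-square-boxes",
    "https://megastore.example/boxes",
    "https://unranked.example/box-history",
]

print(drop_top_sites(results, site_rank, cutoff=100))
print(drop_top_sites(results, site_rank, cutoff=1_000_000))
```

With a cutoff of 100, the two big sites drop out and the small blog moves to the top. With a cutoff of one million, only the unranked site survives. In this sketch the filter does nothing more than that: it changes which sites get seen, not how relevance is computed.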

Cynthia Murrell, March 22, 2018

Digital Antique Coca Cola Signs for Search

March 21, 2018

In a turn that is just about the most human thing we’ve ever heard, just as the world is on the cusp of an AI revolution, many are starting to look backward toward simpler times. We got a sideways glance at our fear of change from a PC Magazine story, “Download Your Entire Google Search History.”

The story is primarily about why on Earth anyone would want to see everything they have ever searched for. But it also touches on our desire for nostalgia in this lightning quick era:

“Users can now download their entire saved search history ‘to see a list of the terms you’ve searched for,’ the company said. ‘This gives you access to your data when and where you want’…For safety’s sake, don’t download past searches on a public computer—at the library, an Internet cafe, or even a friend’s house. Save the curiosity for home.”

This, oddly, isn’t the only place where nostalgia and AI are blending. Remember Nokia, the flip phone people? They are back and reintroducing a line of old school not-smart phones. On top of that, the company is dabbling in new tech like AI, which leads us to wonder where these two can possibly intersect. It’s an interesting move and one that will likely have antique hunters quivering.

Patrick Roland, March 21, 2018
