Breaking Relevance: The TrackMeKnot Method
July 18, 2011
Okay, with ProQuest, Cambridge Scientific, and Dialog about to jump into the statistical fog of relevance, I feel pretty glum. Most old school searchers prefer to type in explicit commands; for example:
b 15
ss cc=77? AND cc=76?? AND esop
When the new “fuzzified” version of the commercial search system arrives for ProQuest, Cambridge Scientific, and Dialog-type users, good luck with that. In the old commercial systems, the old school, brute force, Boolean approach would return consistent results search in and search out. Take it to the bank.
Change is afoot so queries will return somewhat unpredictable results depending on what pointers get jiggled in an index update.
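To make the "consistent results" point concrete, here is a minimal Python sketch of a strict Boolean AND search over a handful of toy records. Everything in it is an assumption for illustration: the records, the category codes, and the helper names are invented, and a trailing ? is treated as a plain prefix match, which is a simplification and not a claim about Dialog's exact truncation rules.

```python
# Hypothetical toy records. The "cc" field mirrors the example query above,
# but the codes and text are invented for illustration.
RECORDS = {
    1: {"cc": ["7701", "7612"], "text": "esop plan for employees"},
    2: {"cc": ["7702"], "text": "esop valuation"},
    3: {"cc": ["7755", "7699"], "text": "pension funding"},
    4: {"cc": ["7790", "7600"], "text": "ESOP tax treatment"},
}


def matches_prefix(values, pattern):
    """True if any value starts with the pattern minus its trailing '?' marks."""
    prefix = pattern.rstrip("?")
    return any(value.startswith(prefix) for value in values)


def boolean_and(records, cc_patterns, keyword):
    """Return sorted ids of records that satisfy every condition (strict AND)."""
    hits = []
    for rec_id, rec in records.items():
        cc_ok = all(matches_prefix(rec["cc"], p) for p in cc_patterns)
        kw_ok = keyword.lower() in rec["text"].lower().split()
        if cc_ok and kw_ok:
            hits.append(rec_id)
    return sorted(hits)


# Same query, same records, same answer on every run.
print(boolean_and(RECORDS, ["77?", "76??"], "esop"))
```

There is no scoring step and no signal to jiggle, so a record is either in the answer set or it is not. That is the property a statistical relevance layer gives up.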
If we shift to the free Web search engines, the notion of relevance is based on lots of “signals”. A signal is something that allows the search system to disambiguate or add context to an action. If you are running around an airport, the mobile search wizards want to look at your search history and hook those signals to your wandering GPS input. The result is search done for you.
Why is relevance lousy? Well, search engine optimization is to blame. The focus on selling targeted ads is a contributor. And there are some interesting software tools that aim to confuse certain traffic analysis systems. So far, no one wants to confuse the ProQuest, Cambridge Scientific, and Dialog-type systems, but the Web search world is like catnip.
One of our readers alerted us to TrackMeNot, which is obfuscation software designed to defeat certain types of usage tracking. Here’s what the developers say:
TrackMeNot runs in Firefox as a low-priority background process that periodically issues randomized search-queries to popular search engines, e.g., AOL, Yahoo!, Google, and Bing. It hides users’ actual search trails in a cloud of ‘ghost’ queries, significantly increasing the difficulty of aggregating such data into accurate or identifying user profiles. To better simulate user behavior TrackMeNot uses a dynamic query mechanism to ‘evolve’ each client (uniquely) over time, parsing the results of its searches for ‘logical’ future query terms with which to replace those already used.
If you want to cover your search clicks, give it a whirl. Obfuscation methods, if used by lots of people, may have an adverse impact on relevance, particularly when personalization is enabled. Lucky me.
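For readers who want to see the general idea, the ghost query approach the developers describe can be sketched in a few lines of Python. This is a toy simulation under stated assumptions, not TrackMeNot's actual code: it sends nothing over the network, and the names `ghost_queries`, `blend`, and the `fake_results` lookup (a stand-in for parsing real result pages) are hypothetical.

```python
import random


def ghost_queries(seed_terms, fake_results, rounds, rng):
    """Yield decoy queries, replacing each used term with a word from its 'results'."""
    pool = list(seed_terms)
    for _ in range(rounds):
        idx = rng.randrange(len(pool))
        query = pool[idx]
        yield query
        # Stand-in for parsing a results page: look up words "seen" for this
        # query and swap one in, so the pool drifts over time.
        candidates = [w for w in fake_results.get(query, []) if w not in pool]
        if candidates:
            pool[idx] = rng.choice(candidates)


def blend(real, decoys, rng):
    """Mix real and ghost queries into one shuffled stream of (query, label) pairs."""
    stream = [(q, "real") for q in real] + [(q, "ghost") for q in decoys]
    rng.shuffle(stream)
    return stream


# Invented example data; a real tool would draw terms from live sources.
seeds = ["weather", "recipes", "football"]
fake_results = {
    "weather": ["forecast", "radar"],
    "recipes": ["chicken", "pasta"],
    "football": ["scores", "league"],
}
rng = random.Random(2011)
decoys = list(ghost_queries(seeds, fake_results, rounds=6, rng=rng))
for query, label in blend(["enterprise search vendors"], decoys, rng):
    print(label, query)
```

An observer who sees only the blended stream has to guess which entries came from the user. The labels exist here only so the output is readable.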
Stephen E Arnold, July 18, 2011
Sponsored by Pandia.com (www.pandia.com), publishers of The New Landscape of Enterprise Search.
Facebook Face Play No Big Surprise
June 14, 2011
You might be living under a rock if you haven’t heard about Facebook’s newest addition to its social network–facial recognition software. That’s right – the beloved social network is building a database of its users’ faces and telling us it’s all to make our lives easier. As discussed in “Facebook Quietly Switches on Facial Recognition Tech by Default,” the controversial feature allows users “to automatically provide tags for the photos uploaded” by recognizing facial features of your friends from previously uploaded photos. Yet again, Facebook finds itself under fire for its laissez-faire attitude towards privacy.
This latest Facebook technology is being vilified. It has been called “creepy,” “disheartening,” and even “terrifying.” These are words that would usually be reserved for the likes of Charles Manson or Darth Vader, not an online social network. The biggest backlash seems to come from the fact that the company didn’t “alert its international stalkerbase that its facial recognition software had been switched on by default within the social network.” This opt-out, instead of opt-in, attitude is what is upsetting the masses. Graham Cluley, a UK-based security expert, says that “[y]et again, it feels like Facebook is eroding the online privacy of its users by stealth.”
To be fair, Facebook released a notice on The Facebook Blog in December 2010 that the company was unleashing its “tag suggestions” to United States users, and when you hear them describe the technology it seems to be anything but Manson-esque. In fact, it invokes thoughts of Happy Days. They say that since people upload 100 million tagged photos every day, they are simply helping “you and your friends relive everything from that life-altering skydiving trip to a birthday dinner where the laughter never stopped.” They go as far as to say that photo tags are an “essential tool for sharing important moments” and facial recognition just makes that easier.
Google has also been working on facial recognition technology in the form of a smartphone app known as Google Goggles and celebrity recognition. However, now Google is claiming to have halted the project because, as Google Chairman Eric Schmidt said, “[p]eople could use this stuff in a very, very bad way as well as in a good way.” See “Facebook’s Again in Spotlight on Privacy”.
So who’s right? Facebook by moving forward or Google by holding up its facial recognition technology?
It seems to me that Google is just delaying the inevitable. Let’s face it. As a Facebook user my right to my privacy may be compromised the second I sign up in exchange for what Facebook offers.
Technology, like the facial recognition software, is changing the social media landscape, and I suppose I should not be surprised when the company implements its newest creation even when it puts my privacy at risk.
Is it creepy?
Probably, and users should be given an opportunity to opt in, not out. Is it deplorable? No. It’s our option to join and Facebook is taking full advantage of it.
Jennifer Wensink, June 14, 2011
Sponsored by ArnoldIT.com, the resource for enterprise search information and current news about data fusion
About that Cloud Security?
May 21, 2011
Let’s assume the Bloomberg story “Amazon Server Said to Be Used in Sony Attack” is accurate. If one cloud-based service can be used to attack another cloud-based service, does the owner of the service used in the attack have an obligation to prevent the attack? Bloomberg reports that Sony is concerned. No kidding, but what about the customers? Bloomberg says:
…the breach at Amazon is likely to call attention to concerns some businesses have voiced over the security of computing services delivered via others’ remote servers, referred to as cloud computing. Cloud security is Amazon’s top priority, Chief Executive Officer Jeff Bezos said at an event sponsored by Consumer Reports magazine this week.
Will substantive, timely action be taken to address the issues associated with this type of alleged use of cloud services? I suppose that the companies involved will try to slap on a patch. When the dust settles, will there be significant change? My hunch is that the quest for revenues will come first. The costs associated with figuring out problems * before * they occur are just too high.
We’re still in the react mode when it comes to online. Learning to live with unknown risks just adds spice to the online stew.
Stephen E Arnold, May 21, 2011
Freebie
Tracking: Does It Matter?
May 11, 2011
A news story broke this week that was more difficult for many to ignore; it seems our beloved iPhones and iPads are paying us the same attention we lavish on them. It turns out these Apple devices keep an internal log of every cell tower or hot spot they connect to, in essence creating a map of the user’s movements for as long as ten months. It gets better. The log file is highly visible and unencrypted, making it accessible to anyone with your device in their hands.
Getting the scent. Source: http://www2.journalnow.com/news/2011/feb/07/wsweat01-beagle-found-in-a-jiffy-by-tracking-dogs-ar-760887/
This news stems from a couple of British programmers who stumbled upon said “secret” location file. In the midst of the melee that ensued from outraged consumers and lawmakers alike, I was directed to a Bloomberg article titled “Researcher: iPhone Location Data Already Used By Cops”.
Interestingly enough, a rendition of this same story was covered by the press months ago, only featured in a different light courtesy of an individual studying forensic computing. Per the write-up: “In a post on his blog, he explains that the existence of the location database—which tracks the cell phone towers your phone has connected to—has been public in security circles for some time.
While it’s not widely known, that’s not the same as not being known at all. In fact, he has written and presented several papers on the subject and even contributed a chapter on the location data in a book that covers forensic analysis of the iPhone.”
The FTC, Google and the Buzz
March 30, 2011
I read “Google Will Face Privacy Audits For The Next 20 Long Years.” The Federal Trade Commission has under its umbrella the mechanism to trigger privacy audits of Google’s practices for the next 20 years. Okay. Two decades. The matter fired off the launch pad in February 2010 and, if the story is spot on, landed with a ruling in March 2011. Here’s the passage that caught my attention:
As the FTC put it, “Although Google led Gmail users to believe that they could choose whether or not they wanted to join the network, the options for declining or leaving the social network were ineffective.”
I think this means that Google’s judgment was found lacking. The notion of just doing something and apologizing if that something goes wrong works in some sectors. The method did not seem to work in this particular situation, however.
I noted this passage in the article:
Google has formally apologized for the whole mess, saying “The launch of Google Buzz fell short of our usual standards for transparency and user control—letting our users and Google down.”
Yep. Apologies. More about those at the Google blog. Here’s the passage of Google speak I found fascinating:
User trust really matters to Google.
For sure. No, really. You know. Really. Absolutely.
I am not sure I have an opinion about this type of “decision”. What strikes me is that if a company cannot do exactly what it wants, that company may be hampered to some degree. On the other hand, a government agency which requires a year to make a decision seems to be operating at an interesting level of efficiency.
What about the users? Well, does either of the parties to this legal matter think about the user? My hunch is that Google wants to get back to the business of selling ads. The FTC wants to move on to weightier matters. The user will continue with behaviors that fascinate economists and social scientists.
In a larger frame, other players move forward creating value. Web indexing, ads, and US government intervention may ultimately have minimal impact at a 12 month remove. Would faster, more stringent action have made a more significant difference? Probably but not now.
Maybe Google and the FTC will take Britney Spears’s advice:
“My mum said that when you have a bad day, eat ice-cream. That’s the best advice,”
A modern day Li Kui for sure. For sure. No, really.
Stephen E Arnold, March 30, 2011
Freebie unlike some of life’s activities
Will DuckDuckGo Ruffle Feathers?
January 8, 2011
Search engine DuckDuckGo’s new marketing campaign, summarized in Search Engine Journal’s “DuckDuckGo Pitches Private Search,” says that what differentiates them from Google is privacy—they don’t store personal Internet data or associate it with a user account.
The heavy-handed marketing maneuver is being touted by DuckDuckGo founder and sole employee Gabriel Weinberg in a Search Engine Land report as an educational tactic. “I am trying to make the privacy aspects of search engines understandable to the average person who doesn’t have a lot of background knowledge on the more technical aspects.”
We are interested to see if Weinberg’s approach ruffles the feathers of the average searcher. Will they sit up and take notice of the privacy issue or does the attempt fly south?
Christina Sheley, January 8, 2011
Freebie
Serendipity or Snooping?
November 18, 2010
Barry Levine reports on Eric Schmidt’s presentation at the TechCrunch Disrupt conference in “Eric Schmidt Sees Devices Running Your Life for You.” With a couple of brief nods to privacy concerns, the Google CEO touted his fantasy of a utopian future where computers anticipate your every move. Levine gives an example: “You’re walking down the street and your smartphone reminds you of your appointments, notes nearby sales of those shoes you’ve been searching for, and points out that your ex-girlfriend is in the restaurant on the corner.” This might send shivers of excitement down Schmidt’s spine, but it makes my hair stand on end. I don’t particularly care to have ex-boyfriends or anyone else know what restaurant I’m in. For anyone who’s been stalked this sounds more like a nightmare than a dream. Even if by some miracle we could assume everyone’s best intentions in this scenario, there’s a reason the Panopticon was a prison, not a luxury resort. And with recent backlashes against Facebook’s privacy controls, I think I’m not the only one who is still concerned about the openness of personal information online.
Alice Wasielewski, November 18, 2010
Freebie
How to Cope with Google: Change Your Name, Just Move
October 26, 2010
I find Math Club folks darned entertaining. I recall learning from someone that Google’s top dog suggested that one could deal with privacy issues by changing one’s name. No problem, but not exactly practical. Today (October 25, 2010) several people mentioned to me Dr. Schmidt’s suggestion regarding Street View’s imaging one’s home. The recommendation was, according to “Schmidt: Don’t Like Google Street View Photographing Your House? Then Move,” even more impractical than changing one’s name. In today’s real estate market, most folks struggle to make payments. The cost of moving is out of reach even if there were a compelling reason to uproot oneself. The idea of moving because Google is making snaps of one’s domicile is either pretty funny (my view) or pretty crazy (the view of one of the people in my office).
So which is it? Colbert Report material or an answer that could get you stuck in a hospital’s psychiatric ward for observation?
I side with the Math Club. Dr. Schmidt was just joking.
What’s not so funny is the mounting legal friction that Google faces. My concern is that the push back could impair Google’s ability to do deals. The issue is partially trust and partially mind share. With lawyers wanting discovery and depositions, the two Ds can get even the A student in Math Club in academic hot water. That’s bad for Google, its partners, and its stakeholders. Competitors know Google has lots of cash, but with Apple and Facebook surging, Google can no longer rely on controlled chaos to converge on a solution. Lawyers are into procedures and often lack a sense of humor.
Just move. Man, that’s a hoot. Getting a cow on top of a university bell tower will not elicit a chuckle from me. But “just move.” I am in stitches. Absolutely hilarious. But there is that other point of view… the hospital… the observation thing. Hmmm.
Stephen E Arnold, October 26, 2010
Google: No More Never Complain, Never Explain
October 23, 2010
The Straits Times reported “Google Sorry for Lapses.” Is this a change in method? I recall learning from one of my college professors at the cow town school to which I was admitted, “Never complain, never explain.” Now Google is apologizing, which combines complaining and explaining. If the write up is accurate, the Google may now be recognizing that it has created the equivalent of a ceramic brake slowing the Googlemobile to a snail’s pace. For a Googzilla, getting smoked by a snail is painful indeed. I opine that such friction may be worse than sitting out the senior prom in high school to work on a problem in partial derivatives.
Here’s the passage that caught my attention:
Mr Eustace [Google wizard and adult in charge of rocket science] provided Google’s most detailed description yet of the private data on unsecured wireless networks scooped up by Street View cars as they cruised through cities around the world taking pictures. ‘While most of the data is fragmentary, in some instances entire emails and URLs were captured, as well as passwords,’ he said. ‘We want to delete this data as soon as possible, and I would like to apologize again for the fact that we collected it in the first place. ‘We are mortified by what happened, but confident that these changes to our processes and structure will significantly improve our internal privacy and security practices for the benefit of all our users,’ Mr Eustace said.
Several observations:
- What about that phrase “most of the data is fragmentary”? “Data” is a plural but that matters not to the Google. The “most”? Well, that is more problematic and apparently ambiguous.
- With so many smart lads and lasses, how can such a mistake get propagated across the years and multiple versions of the scampering little data gobbling vehicles? Interesting to me, but I am not mortified. Google is. Ooops.
- After 12 years, a couple of alleged stalkers, and an Odwalla beverage delivery truck full of legal hassles, the Google is fixing up its “internal privacy and security practices.” I do like the categorical affirmative. Too bad the multiple exceptions create a bit of a logical issue for this goose.
In short, complaining and explaining perhaps?
Stephen E Arnold, October 23, 2010
Freebie
Google Waffles Backwards
October 21, 2010
Canada is annoyed at the Google. My view is that Google is mostly indifferent to legal hassles from countries. I mean when an enterprise can blow off the world’s largest market, what’s the difference when the likes of maple leaf lovers get annoyed? But there is an interesting item in the story “Google Ditches All Street View Wi-Fi Scanning.” Here’s the passage that caught my attention:
Google has no plans to resume using its Street View cars to collect information about the location of Wi-Fi networks, a practice that led to a flurry of privacy probes after the company said it unintentionally captured fragments of unencrypted data. The disclosure appeared in a report on Street View released today by Canadian privacy commissioner Jennifer Stoddart, who said that “collection is discontinued and Google has no plans to resume it.” Assembling an extensive list of the location of Wi-Fi access points can aid in geolocation, especially in areas where connections to cell towers are unreliable. Instead, Stoddart said that, based on her conversations with headquarters in Mountain View, Ca., “Google intends to obtain the information needed to populate its location-based services database” from “users’ handsets.”
No problem in my opinion. My thought is that the Math Club had a plan, a rogue engineer’s code, and some surprised customers. Now the GOOG seems to be doing the type of thinking one expects from a mere MBA. Is this progress? Depends on one’s point of view, right?
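Why would a list of Wi-Fi access point locations help with geolocation? One common textbook approach is a weighted centroid: average the known positions of the access points a device can hear, giving stronger signals more weight. The Python sketch below illustrates that idea only. It is not a description of how Google's system works, and the function name, the BSSIDs, and the coordinates are invented for the example.

```python
def estimate_position(observations, ap_locations):
    """Weighted centroid of known access points, weighted by received signal power.

    observations: {bssid: rssi_dbm} as heard by the device (hypothetical input).
    ap_locations: {bssid: (lat, lon)} from a previously assembled list.
    Returns (lat, lon), or None when no observed access point is in the list.
    """
    total = lat_sum = lon_sum = 0.0
    for bssid, rssi_dbm in observations.items():
        if bssid not in ap_locations:
            continue  # unknown access point, contributes nothing
        weight = 10 ** (rssi_dbm / 10.0)  # convert dBm to linear milliwatts
        lat, lon = ap_locations[bssid]
        lat_sum += weight * lat
        lon_sum += weight * lon
        total += weight
    if total == 0.0:
        return None
    return lat_sum / total, lon_sum / total


# Invented access point list and one scan from a handset.
known_aps = {
    "aa:aa:aa:aa:aa:01": (45.4200, -75.7000),
    "aa:aa:aa:aa:aa:02": (45.4210, -75.6990),
}
scan = {"aa:aa:aa:aa:aa:01": -50, "aa:aa:aa:aa:aa:02": -70, "ff:ff:ff:ff:ff:ff": -60}
print(estimate_position(scan, known_aps))
```

The estimate is only as good as the list behind it, which is why it matters who collects that list and how: cars driving past, or the users' own handsets.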
Stephen E Arnold, October 21, 2010
Freebie