Google: A Week with Low Power Fireworks

July 3, 2015

I don’t pay too much attention to Google. I find the endless squabbling over the notion that Google is a monopoly, a bad guy, the new Microsoft, yada yada boring. I did notice this morning (July 3, 2015) three unrelated news items mentioning Google. Let me point you to the sources of this information or maybe disinformation and invite you to determine if Google is in the midst of a pre-Fourth of July fireworks test.

Item 1: “Former Google Worker Pleads Guilty to Raping Woman Inside his East Village Home.” My reaction is surprise because I thought Googlers remained at work. Is this Raping Woman story a caboose to  “Prostitute Pleads Guilty in Google Executive Heroin Overdose Death on Yacht”?

Item 2:  “Google’s Niantic Labs Sorry Over Death Camps In Smartphone Game” makes clear that Google allegedly included the concentration camps Dachau and Sachsenhausen in a game. According to the item, a Googler was apologetic.

Item 3: “Google Apologizes after Photo Software Tags Black People as ‘Gorillas’.” If true, Google’s algorithms display “drift” when performing metatagging functions.

I don’t know if these items are accurate or false. It is interesting to note that each can be a trigger for questions about Google’s management and Google’s algorithms. Perhaps the European Commission will perceive these items as germane to its investigation of Google’s business practices. Tess is heartbroken. She has deactivated the alert function on her Apple Watch until the news about Google is less disturbing for her. Calm down, Tess.

Stephen E Arnold, July 3, 2015

Meatbags Prevent Google’s Self-Driving Car

July 2, 2015

Driving is a privilege, not a right, for humans, and Google wants that privilege for its self-driving cars. Google, however, is still in the testing phase for its self-driving cars and has announced that it will publish results of the study on a monthly basis. The first report recently came out, and it says that Google cars were in twelve accidents when they were on real roads. The Register takes a snarky, informative approach to self-driving cars in “Google: Our Self-Driving Cars Would Be Tip-Top If You Meatheads Didn’t Crash Into Them.”

Google has twenty-three Lexus SUVs that have driven 1,011,338 miles with the self-driving software and 796,250 miles with a human behind the wheel. Many of the cars have taken to the real road, but nine are still restricted to the private track.

Google blames all twelve of the accidents on human error, not the software: either the human driver in the autonomous car or the driver in the other car was at fault. Seven accidents involved Google cars being rear-ended for driving too slowly. One accident was due to the Google car braking to avoid a collision, and two more happened when non-Google cars failed to obey traffic signs. The worst accident occurred when a Google car was driving at 63 mph and was sideswiped by a car changing lanes. No one was hurt. The last two accidents were the fault of Google’s employees: both resulted in Google cars rear-ending the cars in front of them.

Google is quick to point out the software’s positive aspects:

“The report also highlighted some of the smarter aspects of the cars’ software. Google cars can identify emergency vehicles, for example, and automatically give way in a fashion many fleshy drivers are irritatingly unwilling to do.  The other example given was Google cars dealing with cyclists who didn’t obey the rules of the road. One cyclist veered in front of the car at night, and the software was clever enough to stop immediately to avoid a crash.”

Google will have its cars drive ten thousand miles a week on the software.  A recent luxury car ad campaign was critical of the self-driving car, saying people want the luxury of driving themselves with all the benefits of said luxury car.  It will be the TV vs. radio battle again, but the one thing holding back the self-driving car will be human error.  Stupid, stupid humans.

Whitney Grace, July 2, 2015

Sponsored by, publisher of the CyberOSINT monograph

Google, Search, and Swizzled Results

July 1, 2015

I am tired of answering questions about the alleged blockbuster revelations from a sponsored study and an academic Internet legal eagle wizard. To catch up on the swizzled search results “news”, I direct your attention, gentle reader, to these articles:

I don’t have a dog in this fight. I prefer the biases of, the wonkiness of Qwant, the mish mash of iSeek, and the mixed outputs of

I don’t look for information using my mobile devices. I use my trusty MacBook and various software tools. I don’t pay much, if any, attention to the first page of results. I prefer to labor through the deeper results. I am retired, out of the game, and ready to charge up my electric wheel chair one final time.

Let me provide you with three basic truths about search. I will illustrate each with a story drawn from my 40 year career in online, information access, and various types of software.

Every Search Engine Provides Tuning Controls

Yep, every search system with which I have worked offers tuning controls. Here’s the real life story. My colleagues and I got a call in our tiny cubicle in an office near the White House. The caller told us to make sure that the then vice president’s Web site came up for specific queries. We created for the Fast Search & Transfer system a series of queries which we hard wired into the results display subsystem. Bingo. When the magic words and phrases were searched, the vice president’s Web page with content on that subject came up. Why did we do this? Well, we knew the reputation of the vice president and I had the experience of sitting in a meeting he chaired. I strongly suggested we just do the hit boosting and stop wasting time. That VP was a firecracker. That’s how life goes in the big world of search.

Key takeaway: Every search engine provides easy or hard ways to present results. These controls are used for a range of purposes. The index just does not present must see benefits information when an employee runs an HR query or someone decides that content is not providing a “good user experience.”
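For readers who have never seen a hard-wired result, here is a minimal sketch in Python of the general idea. It is not the Fast Search & Transfer configuration described above; the `PINNED` table, the URLs, and the `fake_engine` stand-in are all made up for illustration.

```python
# Hypothetical hard-wired boosts: normalized query -> URL to pin at the top.
PINNED = {
    "energy policy": "https://example.gov/vp/energy",
}


def normalize(query):
    """Lowercase the query and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def search_with_pins(query, organic_search):
    """Return the engine's results with any hard-wired hit placed first."""
    results = list(organic_search(query))
    pinned = PINNED.get(normalize(query))
    if pinned is None:
        return results
    # Drop the pinned URL if the engine already returned it, then put it on top.
    return [pinned] + [url for url in results if url != pinned]


def fake_engine(query):
    """Stand-in for a real engine's ranked output (assumption for the demo)."""
    return [
        "https://example.org/a",
        "https://example.gov/vp/energy",
        "https://example.org/b",
    ]


boosted = search_with_pins("  Energy   POLICY ", fake_engine)
plain = search_with_pins("tax policy", fake_engine)
```

The magic phrase moves the pinned page to the first slot; any other query passes through untouched, which is why nobody outside the engineering group tends to notice.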

Engineers Tailor Results Frequently

The engineers have to deal with the weirdness of content indexing, the stuff that ends up in the exception file, a broken relevance function when an external synonym list is created, whatever. These issues have to be fixed one by one. No one talking about the search system knows or cares about this type of grunt work. The right fix is the one that works with the least hassle. If one tries to explain why certain content is not in the index, a broken conversion filter is not germane to the complainer’s conversation. When the exclusions are finally processed, these may be boosted in some way. Hey, people were complaining so weight these content objects so they show up. This works with grumpy advertisers, cranky Board members, and clueless new hires. Here’s the story. We were trying to figure out why a search system at a major trade association did not display more than half of the available content. The reason was that the hardware and memory were inadequate for the job. We fiddled. We got the content in the index. We flagged it so that it would appear at the top of a results list. The complaining stopped. No one asked how we did this. I got paid and hit the road.

Key takeaway: In real world search, there are decisions made to deal with problems that Ivory Tower types and disaffected online ecommerce sites cannot and will not understand. The folks working on the system put in a fix and move on. There are dozens and dozens of problems with every search system we have encountered since my first exposure to STAIRS III and BRS. Search sucked in the late 1960s and early 1970s, and it sucks today. To get relevant information, one has to be a very, very skilled researcher, just like it was in the 16th century.

New Hires Just Do Stuff

Okay, here’s a fact of life that will grate on the nerves of the Ivy League MBAs. Search engineering is grueling, difficult, and thankless work. Managers want precision and recall. MBAs often don’t understand that which they demand. So why not hard wire every darned query from this ivy bedecked whiz kid? Ask Jeeves took this route and it worked until the money for humans ran out. Today new hires come in to replace the experienced people like my ArnoldIT team who say, “Been there done that. Time for cyberOSINT.” The new lads and lasses grab a problem and solve it. Maybe a really friendly marketer wants Aunt Sally’s home made jam to be top ranked. The new person just sets the controls and makes an offer of “Let’s do lunch.” Maybe the newcomer gets tired of manual hit boosting and writes a script to automate boosting via a form which any marketer can complete. Maybe the script kiddie posts the script on the in-house system. Bingo. Hit boosting is the new black because it works around perceived relevance issues. Real story: At a giant drug company, researchers could not find their content. The fix was to create a separate search system, indexed and scored to meet the needs of the researchers, and then redirect every person from the research department to the swizzled search system. Magic.

Key takeaway: Over time functions, procedures, and fixes get made and managers, like prison guards, no longer perform serious monitoring. Managers are too busy dealing with automated meeting calendars or working on their own start up. When companies in the search business have been around for seven, ten, or fifteen years, I am not sure anyone “in charge” knows what is going on with the newcomers’ fixes and workarounds. Continuity is not high on the priority list in my experience.

What’s My View of the Wu-velations?

I have three observations:

  1. Search results boosting is a core system function; it is not something special. If a search system does not include a boosting function, programmers will find a way to deliver boosting even if it means running two queries and posting results to a form with the boosted content smack in the top spot.
  2. Google’s wildly complex and essentially unmanageable relevance ranking algorithms do stuff that is perplexing because they are tied into inputs from “semantic servers” and heaven knows what else. I can see a company’s Web site disappearing or appearing because no one understands the interactions among the inputs in Google’s wild and crazy system. Couple that with hit boosting and you have a massive demonstration of irrelevant results.
  3. Humans at a search company can reach for a search engineer, make a case for a hit boosting function, and move on. The person doing the asking could be a charming marketer or an errant input system. No one has much, if any, knowledge of actions of a single person or a small team as long as the overall system does not crash and burn.
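The “two queries” workaround in the first observation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor’s method: `merge_boosted` is a hypothetical helper, and the document IDs are invented.

```python
def merge_boosted(boosted_hits, regular_hits, limit=10):
    """Put hits from the boosted query first, then the regular hits, without duplicates."""
    seen = set()
    merged = []
    for doc_id in list(boosted_hits) + list(regular_hits):
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc_id)
    # Trim to one page of results.
    return merged[:limit]


# Query 1 returns the content someone wants on top; query 2 is the normal search.
page = merge_boosted(["jam-123"], ["doc-1", "jam-123", "doc-2"])
```

The engine’s own relevance ranking never changes. The boost lives entirely in the glue code, which is one reason these fixes are easy to add and hard to audit later.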

I am far more concerned about the predictive personalization methods in use for the display of content on mobile devices. That’s why I use

It is the responsibility of the person looking for information to understand bias in results and then exert actual human effort, time, and brain power to figure out what’s relevant and what’s not.

Fine, beat up on the Google. But there are other folks who deserve a whack or two. Why not ask yourself, “Why are results from Bing and Google so darned similar?” There’s a reason for that too, gentle reader. But that’s another topic for another time.

Stephen E Arnold, July 1, 2015

The Google Cloud: Low Ceiling, Visibility Limited

June 30, 2015

I read “Google Cloud Platform: Google Execs Speak.” I highlighted one passage. In response to a question about recent Google cloud service price cuts, the Googler Brian Stevens said:

Our [pricing] is, to be honest, completely driven by measurable infrastructure improvements. So the numbers that you’re seeing aren’t even looking at the competition. They’re looking at the efficiencies. We actually can cost out all of our ongoing infrastructure for our platform, which we actually charge back to the group… We actually modeled those [costs]. We built our plans for next year. We have a set of goals around infrastructure efficiencies that we’re going to drive next year as well. Those [costs] are mapped right back into further and further discounts. So the model, for us, will continue.

I assume that Amazon will remain competitive with Google as both companies try to create value adding services. How low will Google cloud prices go? The suggestion that Google pays little attention to the actions of its competitors strikes me as interesting. I am sensitive to the words “honest” and “actually.”

Stephen E Arnold, June 30, 2015

Loon Advances. Search Grounded.

June 29, 2015

I read “Google Talks Project Loon: 14 Different Prototypes, Leaks Solved by Using Fluffy Socks.” Loon is one of Google’s research projects. From what I can figure out, Google wants to provide Internet access to everyone, yep, categorical affirmative. Facebook has this idea too. The write up reports:

During Google I/O, he [Google wizard Astro Teller] said “We knew we had a lot to learn, but we misestimated how much we had to learn.” For example, the balloons are so large, they have to be stood on by engineers, and tests were carried out to see which socks caused fewer leaks. Fluffy ones, apparently, worked best. The research and perseverance has paid off. Project Loon’s balloons now stay in the air for six months, then steered around the world and positioned to within 500 yards of the intended target area. That’s way beyond the 100-day minimum flight time estimates, that Google says will make Project Loon a viable solution to provide Internet to the billions of people who cannot get it using traditional means.

One question: Will ads persist as long as a balloon remains aloft? Nah, longer. Search remains on the ground, with feet of clay.

Stephen E Arnold, June 29, 2015

Google: Is This an X Lab for Real Journalists?

June 23, 2015

I have a colleague who retired. The newspaper for which he worked continued to make life interesting for those over the age of 55. I assume that other real journalists have discovered that the appetite for those born after 1950 is changing. Bring on the younger journalism grads. YouTube savvy? Great. A high traffic blog about veganism? Come on down. A Web site which is a magnet for Python programmers? Hey, want to work for us?

When I read “Introducing the News Lab,” I had four different thoughts:

  1. What a great idea
  2. Quite a pool of unemployed, under employed, and want to be professionals to tap
  3. How many publishers are like hungry bass in a big lake at a fishing tournament?
  4. How many journalists know how to make Google’s system sing and dance like a top billing at a vaudeville show?

According to the write up:

It’s hard to think of a more important source of information in the world than quality journalism. At its best, news communicates truth to power, keeps societies free and open, and leads to more informed decision-making by people and leaders. In the past decade, better technology and an open Internet have led to a revolution in how news is created, distributed, and consumed. And given Google’s mission to ensure quality information is accessible and useful everywhere, we want to help ensure that innovation in news leads to a more informed, more democratic world.

There you go. What about the right to be forgotten, filtering, predictive search results, and ads? Once again I am mashing up the math club’s manifesto with reality.

The idea is that the journalists embracing the GOOG will use the GOOG to produce content. I learned:

There’s a revolution in data journalism happening in newsrooms today, as more data sets and more tools for analysis are allowing journalists to create insights that were never before possible. To help journalists use our data to offer a unique window to the world, last week we announced an update to our Google Trends platform. The new Google Trends provides journalists with deeper, broader, and real-time data, and incorporates feedback we collected from newsrooms and data journalists around the world. We’re also helping newsrooms around the world tell stories using data, with a daily feed of curated Google Trends based on the headlines of the day, and through partnerships with newsrooms on specific data experiments.

The attentive reader will notice that I have removed the numerous links in the article. Clicking around in the middle of an important article is not something I do nor encourage.

Will the News Lab deliver the benefits journalists expect and the benefit some folks need? Will Google “put wood behind” this initiative or will it suffer the same fate as Web Accelerator? Will the service generate more magnetism than the many news efforts nosing into the datasphere? Will publishers jump with glee because Google empowers new content?

No answers yet.

Stephen E Arnold, June 23, 2015

Chrome Restricts Extensions amid Security Threats

June 22, 2015

Despite efforts to maintain an open Internet, malware seems to be pushing online explorers into walled gardens, akin to the old AOL setup. The trend is illustrated by a story at PandoDaily, “Security Trumps Ideology as Google Closes Off its Chrome Platform.” Beginning this July, Chrome users will only be able to download extensions for that browser from the official Chrome Web Store. This change is on the heels of one made in March: apps submitted to Google’s Play Store must now pass a review. Extreme measures to combat an extreme problem with malicious software.

The company tried a middle-ground approach last year, when it imposed the our-store-only policy on all users except those using Chrome’s development build. The makers of malware, though, are adaptable creatures; they found a way to force users into the development channel, then slip in their pernicious extensions. Writer Nathaniel Mott welcomes the changes, given the realities:

“It’s hard to convince people that they should use open platforms that leave them vulnerable to attack. There are good reasons to support those platforms—like limiting the influence tech companies have on the world’s information and avoiding government backdoors—but those pale in comparison to everyday security concerns. Google seems to have realized this. The chaos of openness has been replaced by the order of closed-off systems, not because the company has abandoned its ideals, but because protecting consumers is more important than ideology.”

Better safe than sorry? Perhaps.

Cynthia Murrell, June 22, 2015

Google: India Disconnects

June 16, 2015

I read “Google’s Big Project to Sell Super Cheap Phones in India Appears to Be Failing.” When I read the title, I wondered how Google’s other “pump up the channel” plays are working out. The shift from desktop search and advertising to the mobile platform may be leaving Adwords behind. But Google has, I thought, dozens of super wizards who hail from… wait for it … India. If any outfit could figure out how to do a slam dunk, it should be Google.

According to the write up:

The three Indian phone manufacturers that were initially involved in producing the low-cost devices have no plans to create future versions of the smartphone, The Economic Times reports.

I highlighted in pale pink this statement:

Sanjay Kalirona, who heads up the mobile phones unit at Intex, told The Economic Times: “Everything was finalized, the product was ready but market response was not there, so we dropped the idea.”

How bad were the numbers?

Sales data for Indian Android One phones shows that the devices accounted for between 2% to 2.5% of smartphone sales in India from September 2014 until May 2015. And sales estimates from Convergence Catalyst estimates the total number of Android One handsets sold in India since launch at less than 1 million units.

That strikes me as Microsoft and Amazon phone territory.

Poor Google. First, Europe, then China, and now India are not feeling the Google vibe. I love the GOOG. Disappointment. Perhaps some resources will flow into the Google Search Appliance? Or, what about more Loon balloons?

Stephen E Arnold, June 16, 2015

Wisdom of Verizon AOL Deal Questioned

June 16, 2015

Sarah Lacy, founder and editor-in-chief at PandoDaily, is highly skeptical of the official rationale behind Verizon’s recent acquisition of AOL. She posits, “Can’t We All Agree the Justifications for this AOL/Verizon Deal are Bat#### Insane?” The post begins:

“What is it about AOL mergers that make no sense?

“I’ve spent the morning intermittently reading various reports by the financial press about Verizon’s surprise/not surprise acquisition of AOL. Early on, they seem divided on whether it was about buying ad tech or content, with many pundits saying Verizon was going the Comcast route… and then it became clear that AOL’s biggest media asset, the Huffington Post, would likely be spun off. The press was similarly divided on whether or not Armstrong was long shopping this company or simply got wowed by how awesome Verizon is during a meeting at Sun Valley.

“But everyone — including the company– insists this deal was about two buzzwords: Mobile. Video. AOL put out some dizzying justifications and everyone nodded like they totally understood.

“Wait, what?”

Lacy doesn’t buy the idea that Verizon acquired AOL for its mobile and video chops (she has a point there). In fact, it quickly becomes clear that the writer’s main problem is with AOL chairman and ex-Googler Tim Armstrong, for she spends much virtual ink delineating his errors, past and present. (She’s especially critical of his handling of the Huffington Post.) Lacy also refutes official statements about this deal one by one, comparing the whole situation to a nonsensical Lewis Carroll scene. See the article if you, too, think this deal is fishy (or if, for some reason, you desire ammo against Mr. Armstrong).

Cynthia Murrell, June 16, 2015

Exorbyte Offers Entity Detection

June 16, 2015

Germany-based Exorbyte is a leading European solutions company for search and analysis in structured/unstructured data. Business On tells us that Exorbyte has released a feature to help users manage their email inboxes: “Input Management: Exorbyte Automates Identity Determination.” Using Google Translate to give us the details, the article explains that Exorbyte now offers a Full Page Entity Detect, a tool that extracts the identity data from full-text documents and compares them with reference databases.

Full Page Detect is advertised as taking out the guess work in figuring out where data originates in documents.  The process is described as:

“The identity data can be extracted directly from the digitized full-text  documents such as letters, faxes and e-mails and efficiently compared with reference databases – virtually independent of language. It doesn’t matter whether the data in question is incorrect or incomplete.  Exorbyte’s Full Page Detect Entity is able to read the valid data and organize it without fail for customers.”

Full Page Detect’s main selling point is that it can recognize information in documents no matter where it is placed in the document.  It uses Exorbyte’s leading Matchmaker technology, which is extremely reliable in detecting errors and keeping analysis on track.
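To make the idea of matching misspelled identity data against a reference database concrete, here is a minimal Python sketch. It is not Exorbyte’s Matchmaker technology, whose internals the article does not describe; it swaps in the standard library’s `difflib` similarity matching, and the names, the `detect_entities` helper, and the 0.8 cutoff are assumptions made for the example.

```python
import difflib
import re

# Made-up reference database of known identities.
REFERENCE = ["Anna Schmidt", "Jonas Becker", "Maria Hoffmann"]


def detect_entities(text, reference=REFERENCE, cutoff=0.8):
    """Find reference names anywhere in the text, tolerating small spelling errors."""
    # Keep alphabetic words only; digits and punctuation are ignored.
    words = re.findall(r"[^\W\d_]+", text)
    found = []
    for name in reference:
        size = len(name.split())
        # Slide a window with the same word count as the name across the text.
        windows = [
            " ".join(words[i:i + size]).lower()
            for i in range(len(words) - size + 1)
        ]
        if difflib.get_close_matches(name.lower(), windows, n=1, cutoff=cutoff):
            found.append(name)
    return found


letter = "Dear Sirs, I am writing on behalf of Ana Schmitd regarding invoice 4411."
hits = detect_entities(letter)
```

The sliding window is what lets a name be found no matter where it sits in the document, and the similarity cutoff is what forgives the typo. A production system would need something far faster than comparing every window against every reference record.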

Exorbyte offers a useful service for people trying to summarize their emails without having to open every single one.  It streamlines the email process and makes it more efficient.

Whitney Grace, June 16, 2015
