Wanna Be a Googler? Consider The Questions If Not the Grammar
March 27, 2016
I see a lot of saucisson each day. I was amused by “Interview Questions Google Pulled Down Because They Were Impossibly Difficult to Answer.” Like the fabled Google Labs Aptitude Test, Google tries, like the addled goose, to winnow the goose feathers from the giblets. The addled goose focuses on the world of search and content processing. Google tries to perform the same trick with humans. You remember, humans, the reason Google’s self-driving cars have accidents.
I found this question interesting:
How many times a day does a clock’s hands overlap?
What does this sentence suggest about those creating exam questions? Hint: Do not ask a Googler.
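For the record, the arithmetic behind the question is easy to sketch. The minute hand gains on the hour hand at 330 degrees per hour (360 minus 30), so the hands line up once every 12/11 hours. The Python below is an illustration only, not something from the write up. The function name overlap_times is hypothetical, and the sketch assumes a "day" runs from midnight up to, but not including, the next midnight.

```python
from fractions import Fraction

def overlap_times(hours=24):
    """Return the times (in hours after midnight) at which the hands overlap.

    Assumption: the day is the half-open interval [0, hours), so the
    overlap at the closing midnight belongs to the next day.
    """
    # The minute hand laps the hour hand once every 360/330 = 12/11 hours.
    period = Fraction(12, 11)
    times = []
    k = 0
    while k * period < hours:
        times.append(k * period)
        k += 1
    return times

if __name__ == "__main__":
    times = overlap_times()
    print(len(times))        # number of overlaps in one day
    print(float(times[1]))   # first overlap after midnight, in hours
```

Under those assumptions the count is 22, not 24, and the first overlap after midnight arrives a bit after 1:05.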
Stephen E Arnold, March 27, 2016
SEO Consultants Face Google Reality
March 27, 2016
The entire search engine optimization boomlet annoyed me. The idea that individuals could trick algorithms into displaying content where it should not appear goes against my old-fashioned notions of relevance, precision, and recall.
I blame lots of people for this destruction of on point search and retrieval. Consultants, vendors, the wonderful Google—yep, sorry. I want a search system to deliver information directly germane to my query. I don’t want search systems to think for me.
I read an amusing write up called “Google ch-ch-ch-changes. How They’re Affecting Publishers and SEOs.” The focus is not on the users’ needs for relevant information. Nah, the focus is on publishers and the members of the class SEO.
The write up bemoans the fact that Google no longer has a wizard to explain how to fool Google’s algorithms. That’s a positive in my opinion. Next the write up points out that Google wants to use even smarter algorithms to determine what is and is not relevant. Does Google’s notion of relevance match mine? Nah, I don’t care about advertising, but my hunch is that Google cares a great deal about money with relevance a consideration. But the goal is money.
The part of the article I liked was the section labeled “SEO Is Dead.” Good. The result was a surprise. The article points out that Facebook is a better place to get information. I highlighted in social scarlet this statement:
More and more, I go to Facebook for answers because I can no longer find them in Google. Google uses AI to throw me a kitchen sink when it is not sure, and that kitchen sink rarely has much in it that’s useful for me.
How does one find on point information from a Web search engine dependent on advertising? The write up dodges the question and suggests:
If you are using informational search, SEO hasn’t gotten harder — it has just become much more irrelevant. Whereas Google used to be very good at returning exact query results, AI goes with the “broad net” approach. If Google does not have a specific “thing” it can return, it will often return a set of more general results, leaving words out of the query set. Often, the word it leaves out is the most relevant modifier.
Sound like baloney?
Stephen E Arnold, March 27, 2016
Search as a Framework
March 26, 2016
A number of search and content processing vendors suggest their information access system can function as a framework. The idea is that search is more than a utility function.
If the information in the article “Abusing Elasticsearch as a Framework” is spot on, a non-search vendor may have taken an important step toward turning an assertion into a reality.
The article states:
Crate is a distributed SQL database that leverages Elasticsearch and Lucene. In it’s infant days it parsed SQL statements and translated them into Elasticsearch queries. It was basically a layer on top of Elasticsearch.
The idea is that the framework uses discovery, master election, replication, etc., along with the Lucene search and indexing operations.
Crate, the framework, is a distributed SQL database “that leverages Elasticsearch and Lucene.”
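To make the “layer on top of Elasticsearch” idea concrete, here is a minimal sketch of what translating a trivial SQL statement into an Elasticsearch search body might look like. This is not Crate’s code. The function select_to_es and its parameters are hypothetical, the SQL is assumed to be already parsed into a column list and a set of equality conditions, and the output uses only the standard Elasticsearch query DSL pieces (a bool query with term filters, plus _source and size).

```python
import json

def select_to_es(columns, equals, limit=10):
    """Build an Elasticsearch search body for a statement shaped like
    SELECT <columns> FROM t WHERE k1 = v1 AND k2 = v2 LIMIT <limit>.

    Assumption: the SQL text was parsed elsewhere; this sketch only
    handles the translation step.
    """
    return {
        "_source": list(columns),   # which fields to return (the SELECT list)
        "size": limit,              # LIMIT
        "query": {
            "bool": {
                # Each equality condition becomes an exact-match term filter.
                "filter": [{"term": {field: value}} for field, value in equals.items()]
            }
        },
    }

if __name__ == "__main__":
    # SELECT name FROM users WHERE status = 'active' LIMIT 5
    body = select_to_es(["name"], {"status": "active"}, limit=5)
    print(json.dumps(body, indent=2))
```

A real SQL layer would also have to handle joins, aggregations, and type mapping, which is presumably why Crate moved beyond a thin translation layer after its “infant days.”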
Stephen E Arnold, March 26, 2016
Advertising and Search Confidence: Google As Government
March 26, 2016
I read “US State Department Emails: Google Wanted in 2012 to Help Syria’s Rebels Overthrow Assad.” The story might be a load of horse feathers. I stopped and read the article and noted this passage:
Messages between former secretary of state Hillary Clinton’s team and one of the company’s executives detailed the plan for Google to get involved in the region. “Please keep close hold, but my team is planning to launch a tool … that will publicly track and map the defections in Syria and which parts of the government they are coming from,” Jared Cohen, the head of what was then the company’s “Google Ideas” division, wrote in a July 2012 email to several top Clinton officials.
Perhaps this is Palantir envy? Clever folks are confident of their abilities. And here is a See Also reference.
Stephen E Arnold, March 26, 2016
Ixquick and StartPage Become One
March 25, 2016
Ixquick was created by a person in Manhattan. Then the system shifted from the USA to Europe. I lost track. I read “Ixquick Merges with StartPage Search Engine.” Web search is a hideously expensive activity to fund. Costs can be suppressed if one just passes the user’s query to Bing, Google, or some other Web indexing search system. The approach delivers what is called a value-added opportunity. Vivisimo used the approach before it morphed into a unit of IBM and emerged not as a search federation system but a Big Data system. Most search traffic flows to the Alphabet Google advertising system. Those who use federated search systems often don’t know the difference and, based on my observations, don’t care.
According to the write up:
The main difference between StartPage and the current version of Ixquick is that the former is powered exclusively by Google search results while the latter aggregates data from multiple search engines to rank them based on factors such as prominence and quantity. Both search engines are privacy orientated, and the merging won’t change the fact. IP addresses are not recorded for instance, and data is not shared with third-parties.
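The “prominence and quantity” ranking mentioned in the quote can be sketched in a few lines. The Python below is an assumption about how such a federated merge might work, not Ixquick’s actual method. The function merge_results is hypothetical; “quantity” is taken to mean the number of engines returning a URL, and “prominence” is approximated as the sum of reciprocal positions.

```python
from collections import defaultdict

def merge_results(result_lists):
    """Merge ranked URL lists from several engines into one ranking.

    result_lists maps an engine name to its ordered list of URLs.
    Assumed scoring: more engines first (quantity), then a higher sum
    of 1/position (prominence), then the URL itself as a stable tie-break.
    """
    votes = defaultdict(int)
    prominence = defaultdict(float)
    for results in result_lists.values():
        for position, url in enumerate(results, start=1):
            votes[url] += 1
            prominence[url] += 1.0 / position
    return sorted(votes, key=lambda u: (-votes[u], -prominence[u], u))

if __name__ == "__main__":
    engines = {
        "engine_a": ["x.com", "y.com"],
        "engine_b": ["y.com", "z.com"],
    }
    print(merge_results(engines))
```

In this toy input, y.com rises to the top because both engines return it, even though only one ranks it first.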
Like DuckDuckGo.com, Ixquick.com and StartPage.com “protect the user’s privacy.” My thought is that I am not confident Tor sessions are able to protect a user’s privacy. A general interest search engine which delivers on this assertion is interesting indeed.
If you want to use the Ixquick function that presents only Google results, navigate to www.ixquick.eu. There are other privacy oriented systems; for example, Gibiru and Unbubble.
Sorry, I won’t/can’t go into the privacy angle. You may want to poke around to learn how secure a VPN session, Tails, and Tor are. The exploration may yield some useful information. Make sure your computing device does not have malware installed, please. Otherwise, the “privacy” issue is off the table.
Stephen E Arnold, March 25, 2016
Some News, Maybe None That Is Not Sort of True?
March 25, 2016
I read “Proposed Truthfulness Law Spooks Russian News Aggregators.” I came away a little puzzled. My perception is that the “news,” regardless of country, is a weird amalgam of infotainment, bias, and theater (political, social, and William Wycherley fare). Whenever the notions of “real,” “accurate,” “objective,” and “true” enter from stage right or left, I wonder what these folks’ definitions of the glittering generalities are.
According to the write up, “Russia has tight media controls that include a requirement to make sure all print, broadcast and online news is true.”
A new bill (not yet a law, gentle reader) “would effectively say that news aggregators are the same as mass media operations.” News aggregators like Yandex and the Alphabet Google thing:
would become liable if they spread false information and state agencies complain about it.
The write up, from a “real” journalism outfit, observes:
Although the law would create a handy way of further restricting information flows, when the bill came out, the Russian communications ministry indicated it was not keen on the idea. That said, the Kremlin has already been making life hard for big online players, particularly by mandating that they store users’ personal data on servers in Russia.
May I suggest a quick romp through Jacques Ellul’s Propaganda: The Formation of Men’s Attitudes?
Stephen E Arnold, March 25, 2016
Not So Weak. Right, Watson?
March 25, 2016
I read an article which proved to be difficult to find. None of my normal newsreaders snagged the write up called “The Pentagon’s Procurement System Is So Broken They Are Calling on Watson.” Maybe it is the singular Pentagon hooked with the plural pronoun “they”? Hey, dude, colloquial writing is chill.
Perhaps my automated systems’ missing the boat was the omission of the three impressive letters “IBM”? If you follow the activities of US government procurement, you may want to note the article. If you are tracking the tension between IBM i2 and Palantir Technologies, the article adds another flagstone to the pavement that IBM is building to support its augmented intelligence activities in the Department of Defense and other US government agencies.
Let me highlight a couple of comments in the write up and leave you to explore the article at whatever level you choose. I noted these “reports”:
The Air Force is currently working with two vendors, both of which have chosen Watson, IBM’s cognitive learning computer, to develop programs that would harness artificial intelligence to help businesses and government acquisitions officials work through the mind-numbing system.
The write up identifies one of the vendors working on IBM Watson for the US Air Force. The company is Applied Research.
I circled this quote: The Pentagon’s procurement system is the “perfect application for Watson.”
The goslings and I love “perfect” applications.
How does Watson learn about procurement? The approach is essentially the method used in the mid 1990s by Autonomy IDOL. Here’s a passage I highlighted:
But first Watson must be trained. The first step is to feed it all the relevant documents. Then its digital intellect will be molded by humans, asking question after question, about 5,000 in all, to help understand context and the particular nuance that comes with federal procurement law.
How does this IBM deal fit into the Palantir versus IBM interaction? That’s a good question. What is clear is that the US Air Force has embraced a solution which includes systems and methods first deployed two decades ago.
What’s that about the pace of technology?
Stephen E Arnold, March 25, 2016
Play Search the Game
March 25, 2016
Within the past few years, gamers have had the privilege to easily play brand new games as well as the old classics. Nearly all of the games ever programmed are available through various channels, from Steam to simulators to system emulators. It is easy to locate a game if you know the name, main character, or even the gaming system, but with the thousands of games available maybe you want to save time and not have to use a general search engine. Good news, everyone!
Sofotex, a free software download Web site, has a unique piece of freeware that you will probably want to download if you are a gamer. Igrulka is a search engine app programmed to search only games. Here is the official description:
Igrulka is a unique software that helps you to search, find and play millions of games in the network.
“Once you download the installer, all you have to do is go to the download location on your computer and install the app.
Igrulka allows you to search for the games that you love either according to the categories they are in or by name. For example, you get games in the shooter, arcade, action, puzzle or racing games categories among many others.
If you would like to see more details about the available games, their names as well as their descriptions, all you have to do is hover over them using your mouse as shown below. Choose the game you want to play and click on it.”
According to the description, it looks like Igrulka searches through free games and perhaps the classics from systems. In order to find out what Igrulka can do, download and play search results roulette.
Whitney Grace, March 25, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Bigger Picture Regarding Illegal Content Needed
March 25, 2016
Every once in a while an article on the Dark Web comes along that takes a step back from the latest action on Tor and offers a deep dive on the topic at large. “Delving into the World of the Dark Web” was recently published on Raconteur, for example. In this article, we learned the definition of darknets: networks only accessible through particular software, such as Tor, and trusted peer authorization. The article continues,
“The best known, and by far the most popular, darknet is the Onion Router (Tor), which was created by the US Naval Research Labs in the 90s as an enabler of secure communication and funded by the US Department of Defense. To navigate it you use the Tor browser, similar to Google Chrome or Internet Explorer apart from keeping the identity of the person doing the browsing a secret. Importantly, this secrecy also applies to what the user is looking at. It is because servers hosting websites on the Tor network, denoted by their .onion (dot onion) designation, are able to mask their location.”
Today, the Dark Web is publicly available to be used anonymously by anyone with darknet software and home to a fair amount of criminal activity. Researchers at King’s College London scraped the .onion sites and results suggested about 57 percent of Tor sites host illegal content. We wonder about the larger context; for example, what percent of sites viewed on mainstream internet browsers host illegal content?
Megan Feil, March 25, 2016
DocPoint and Concept Searching: The ONLY Choice. Huh?
March 24, 2016
DocPoint is a consulting and services firm focusing on the US government’s needs. The company won’t ignore commercial firms’ inquiries, but the line up of services seems to be shaped for the world of GSAAdvantage users.
I noted that DocPoint has signed on to resell the Concept Searching indexing system. In theory, the SharePoint search service performs a range of indexing functions. In actual practice, like my grandmother’s cookies, many of the products are not cooked long enough. I tossed those horrible cookies in the trash. The licensees of SharePoint don’t have the choice I did when eight years old.
DocPoint is a specialist firm which provides what Microsoft cannot or no longer chooses to offer its licensees. Microsoft is busy trying to dominate the mobile phone market and doing bug fixes on the Surface product line.
The scoop about the DocPoint and Concept Searching deal appears in “DocPoint Solutions Adds Concept Searching To GSA Schedule 70.” The Schedule 70 reference means, according to WhatIs.com:
a long-term contract issued by the U.S. General Services Administration (GSA) to a commercial technology vendor. Award of a Schedule contract signifies that the GSA has determined that the vendor’s pricing is fair and reasonable and the vendor is in compliance with all applicable laws and regulations. Purchasing from pre-approved vendors allows agencies to cut through red tape and receive goods and services faster. A vendor doesn’t need to win a GSA Schedule contract in order to do business with U.S. government agencies, but having a Schedule contract can cut down on administrative costs, both for the vendor and for the agency. Federal agencies typically submit requests to three vendors on a Schedule and choose the vendor that offers the best value.
To me, the deal is a way for Concept Searching to generate revenue via a third party services firm.
In the write up about the tie up, I highlighted this paragraph, which contains an amazing assertion:
A DocPoint partner since 2012, Concept Searching is the only [emphasis added] company whose solutions deliver automatic semantic metadata generation, auto-classification, and powerful taxonomy tools running natively in all versions of SharePoint and SharePoint Online. By blending these technologies with DocPoint’s end-to-end enterprise content management (ECM) offerings, government organizations can maximize their SharePoint investment and obtain a fully integrated solution for sharing, securing and searching for mission-critical information.
Note the statement “only company whose solutions deliver…” “Only” means, according to the Google define function:
No one or nothing more besides; solely or exclusively.
Unfortunately the DocPoint assertion about Concept Searching as the only firm appears to be wide of the mark. Concept Searching is one of many companies offering the functions set forth in the content marketing “news” story. In my files, I have the names of dozens of commercial firms offering semantic metadata generation, auto-classification, and taxonomy tools. I wonder if Layer2 or Smartlogic have an opinion about “only”?
Stephen E Arnold, March 24, 2016