Tag Boosting and Hybrid Cloud Environments from Microsoft Azure
March 29, 2015
The article titled Microsoft Azure Rolls Out Improved Search and New Hybrid Test Environments on Tech Week Europe touts the new direction of Microsoft Azure, namely a focus on “tag boosting” and hybrid cloud environments. The cloud environments are for playing around with Azure features with local internet connections. As for “tag boosting,” the ability to rank search results using tags that developers define will hopefully help narrow the definition of “relevant” searches. Liam Cavanagh, Senior Program Manager at Microsoft, discusses the work being done,
“Let’s say you have customers that purchase items from you regularly. For each customer, you track their top 3 or 4 brands they buy the most often. Now what you’d like to do is to boost documents in search results when those documents represent products of the preferred brands. Note that this is contextual; each user would have a different set of top-K brands they prefer. In our experimental API… we’re introducing a new scoring function called “tag” to handle this scenario.”
This “tag” can be assigned manually to each customer, or assigned to clusters of similar shoppers. Azure continues to collect feedback on the results, making it a work in progress. Search does seem to be in progress most of the time at Microsoft.
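The scenario Cavanagh describes can be illustrated with a toy sketch. This is not Azure's API; the Product class, the tag_boosted_ranking function, the boost factor, the brand names, and the base scores are all hypothetical, chosen only to show how the same query can rank differently for users with different preferred brands.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    brand: str
    base_score: float  # relevance from the text query alone (made-up values)


def tag_boosted_ranking(products, preferred_brands, boost=2.0):
    """Rank products, multiplying the score of any product whose brand
    is in the user's preferred set by `boost` (a hypothetical factor)."""
    preferred = {b.lower() for b in preferred_brands}

    def score(p):
        return p.base_score * (boost if p.brand.lower() in preferred else 1.0)

    return sorted(products, key=score, reverse=True)


# Hypothetical catalog; the brands are invented for illustration.
catalog = [
    Product("Trail shoe", "Acme", 1.0),
    Product("Road shoe", "Zenith", 0.8),
    Product("Court shoe", "Borealis", 0.6),
]

# Two users issue the same query but have different top-K brands.
for user, brands in {"alice": ["Zenith"], "bob": ["Borealis", "Acme"]}.items():
    ranked = tag_boosted_ranking(catalog, brands)
    print(user, [p.name for p in ranked])
```

The boost is contextual: the preferred brands travel with the user, not with the documents, which is why the two users see different orderings of the same catalog.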
Chelsea Kerwin, March 29, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com
Facebook Users Lack Understanding of Filters: No Big Surprise
March 29, 2015
Let me be clear. I am not a Facebook user. One of the goslings configured the Beyond Search blog to send content to a Facebook page. I, however, do not need a stream of information about my high school and college classmates. At my last reunion, the 50th, I saw only two mobile phones: My wife’s and mine. Obviously central Illinois is not a technology hot spot for the over 70 set.
I read “Many, Many Facebook Users Still Don’t Know That Their News Feeds Are Filtered by an Algorithm.” Big whoop. Most of the MBAs I know are clueless about Google’s personalization functions and don’t have much appetite for understanding that what you see may not be what is available. For these cohorts, a little learning is just fine. Drinking from a spring is okay as long as the water comes from an authentic source like Dasani. Isn’t that Coca Cola’s outfit?
The write up reveals what strikes me as a no brainer type factoid:
But a majority of everyday Facebook users in a recent study had no idea that Facebook constructs their experience, pushing certain posts into their stream and leaving others out. And worse, many participants blamed themselves, not Facebook’s software, when friends or family disappeared from their news feeds.
The article reports:
While some participants were upset by the idea that Facebook was changing their social experience, more than half of the study participants “came to appreciate the algorithm over the course of the study.” Most came to think that the filtering and ranking software was actually doing a decent job. “Honestly I have nothing to change which I’m surprised!” one said. “Because I came in like ‘Ah, they’re screwing it all!’”
Sigh. Is there a remedy for this lack of understanding? Nope.
Do most online “experts” care? Nah, but some of them charge windmills with their iPad Airs as a shield.
The reality is that a comprehensive understanding of a particular content domain requires good, old fashioned research. The idea is to read, talk to informed individuals, gather additional primary data, analyze what you collect, and then figure out who knows what about a topic.
We are doing this type of grunt work about one facet of the Dark Web. Early results are in. Most of the people writing about the Dark Web are not doing a particularly good job of explaining where the “dark” content lives, how to find it, or what the content reveals about a fundamental shift in online usage for a small but important and interesting group of users worldwide.
If one cannot understand what Facebook is doing, the Dark Web is of zero consequence. If a Google user accepts search results as objective, I am not sure there is much hope for remedial intervention.
Net net: At a time when ease, convenience, short cuts, and distractions are of primary importance, thinking about information is not of much interest to many people.
“Hey, after the NCAA games, let’s binge watch Breaking Bad. We can post our comments on Facebook too!”
Sound fun? Oh, wait. I have to take this call, send an SMS, and post a picture of our pizza to Facebook. Cool.
Stephen E Arnold, March 29, 2015
Remotely Search Your Files
March 28, 2015
While it is a pain to switch between apps to complete tasks, it is an even bigger pain to securely search your laptop or desktop computer for files from your mobile device. Sure, there are cloud storage services and the ability to log into your computer via remote Web apps. The problem remains that you have to log on and connect with your computer. X1 Mobile Search takes care of that problem, and TechWorld has an oldie but goodie review of the app: “X1 Mobile Search Review.”
For a mere fifteen dollars, you download the X1 Mobile Search app on your computer and mobile device and then you can not only search for your files, but also edit them from within the app. It sounds too good to be true, but the X1 works. The application must be downloaded on both devices and connected to the Internet.
TechWorld says the mobile app is a worthy investment:
“Unlike some other programs that allow you to share files between mobile devices and PC and Macs, this one is designed for searching the whole computer, rather than just sharing specific files or pieces of information. You’ll find it a great complement to other programs such as Evernote and SugarSync.”
Give it a whirl.
Whitney Grace, March 28, 2015
Sponsored by ArnoldIT.com, publisher of CyberOSINT
Watson Goes Blekko
March 28, 2015
I read “Goodbye Blekko: Search Engine Joins IBM’s Watson Team.” According to the write up, “Blekko’s home page says its team and technology are now part of IBM’s Watson technology.” I would not know this. I do not use the service. I wrestled with the implementation of Blekko on a news service and then wondered if Yandex was serious about the company. Bottom line: Blekko is not one of my go to search systems, and I don’t cover it in my Alternatives to Google lectures for law enforcement and intelligence professionals.
The write up asserts:
Blekko came out of stealth in 2008 with Skrenta promising to create a search engine with “algorithmic editorial differentiation” compared to Google. Its public search engine finally opened in 2010, launching with what the site called “slashtags” — a personalization and filtering tool that gave users control over the sites they saw in Blekko’s search results.
Another search system becomes part of the puzzling Watson service. How many information access systems does IBM require to make Watson the billion dollar revenue generator or at least robust enough to pay the rent for the Union Square offices?
IBM “owns” the Clementine system which arrived with the SPSS purchase. IBM owns Vivisimo, which morphed into a Big Data system in the acquisition news release, iPhrase, and the wonky search functions in DB2. Somewhere along the line, IBM snagged the Illustra system. From its own labs, IBM has Web Fountain. There is the decades old STAIRS system which may still be available as Service Master. And, of course, there is the Lucene system which provides the dray animals for Watson. Whew. That is a wealth of information access technology, and I am not sure it is comprehensive.
My point is that Blekko and its razzle dazzle assertions now have to provide something that delivers a payoff for IBM. On the other hand, maybe IBM Watson executives are buying technology in the hopes that one of the people “acquihired” or the newly bought zeros and ones will generate massive cash flows.
Watson has morphed from a question answering game show winner into all manner of fantastic information processing capabilities. For me, Watson is an example of what happens when a lack of focus blends with money, executive compensation schemes, and a struggling $100 billion outfit.
Lots of smoke. Not much revenue fire. Stakeholders hope it will change. I am looking forward to a semantically enriched recipe for barbeque sauce that includes tamarind and other spices not available in Harrod’s Creek, Kentucky. Yummy. A tasty addition to the quarterly review menu: Blekko with revenue and a piquant profit sauce.
Perhaps IBM next will acquire Pertimm and the Qwant search system which terrifies Eric Schmidt? Surprises ahead. I prefer profitable, sustainable revenues however.
Stephen E Arnold, March 28, 2015
Semantic Search Becomes Search Engine Optimization: That Is Going to Improve Relevance
March 27, 2015
I read “The Rapid Evolution of Semantic Search.” It must be my age or the fact that it is cold in Harrod’s Creek, Kentucky, this morning. The write up purports to deliver “an overview of the history of semantic search and what this means for marketers moving forward.” I like that moving forward stuff. It reminds me of Project Runway’s “fashion forward.”
The write up includes a wonky graphic that equates via an arrow Big Data and metadata, volume, smart content, petabytes, data analysis, vast, structured, and framework. Big Data is a cloud with five little arrows pointing down. Does this mean Big Data is pouring from the sky like yesterday’s chilling rain?
The history of the Semantic Web begins in 1998. Let’s see, that is 17 years ago. The milestone, in the context of the article, is the report “Semantic Web Road Map.” I learned that Google was less than a month old. I thought that Google was Backrub and that the work on what was named Google began a couple, maybe three years, earlier. Who cares?
The Big Idea is that the Web is an information space. That sounds good.
Well, in 2012, something Big happened. According to the write up, Google figured out that 20 percent of its searches were “new.” Aren’t those pesky humans annoying? The article reports:
long tail keywords made up approximately 70 percent of all searches. What this told Google was that users were becoming interested in using their search engine as a tool for answering questions and solving problems, not just looking up facts and finding individual websites. Instead of typing “Los Angeles weather,” people started searching “Los Angeles hourly weather for March 1.” While that’s an extremely simplified explanation, the fact is that Google, Bing, Facebook, and other internet leaders have been working on what Colin Jeavons calls “the silent semantic revolution” for years now. Bing launched Satori, a knowledge storehouse that’s capable of understanding complex relationships between people, things, and entities. Facebook built Knowledge Graph, which reveals additional information about things you search, based on Google’s complex semantic algorithm called Hummingbird.
Yep, a new age dawned. The message in the article is that marketers have a great new opportunity to push their message in front of users. In my book, this is one reason why running a query on any of the ad supported Web search engines returns so much irrelevant information. In my just submitted Information Today column, I report how a query for the phrase “concept searching” returned results littered with a vendor’s marketing hoo-hah.
I did not want information about a vendor. I wanted information about a concept. But, alas, Google knows what I want. I don’t know what I want in the brave new world of search. The article ignores the lack of relevance in results, the dust binning of precision and recall, and the bogus information many search queries generate. Try to find current information about Dark Web onion sites and let me know how helpful the search systems are. In fact, name the top TOR search engines. See how far you get with Bing, Google, and Yandex. (DuckDuckGo and Ixquick seem to be aware of TOR content by the way.)
So semantic in the context of this article boils down to four points:
- Think like an end user. I suppose one should not try to locate an explanation of “concept searching.” I guess Google knows I care about a company with a quite narrow set of technology focused on SharePoint.
- Invest in semantic markup. Okay, that will make sense to the content marketers. What if the system used to generate the content does not support the nifty features of the Semantic Web? OWL, who? RDF what?
- Do social. Okay, that’s useful. Facebook and Twitter are the go to systems for marketing products I assume. Who on Facebook cares about cyber OSINT or GE’s cratering petrochemical business?
- And the keeper, “Don’t forget about standard techniques.” This means search engine optimization. That SEO stuff is designed to make relevance irrelevant. Great idea.
Net net: The write up underscores some of the issues associated with generating buzz for a small business like the ones INC Magazine tries to serve. With write ups like this one about Semantic Search, INC may be confusing their core constituency. Can confused executives close deals and make sense of INC articles? I assume so. I know I cannot.
Stephen E Arnold, March 27, 2015
Google, Safari, and the Europe Problem
March 27, 2015
I read “Google Loses Safari Web Tracking Court of Appeal Case.” The write up is less amusing than Loon balloons or contemplating the future of Glass. I assume the write up is accurate. I read:
UK consumers have been granted the right to take Google to court over revelations from 2012 that it bypassed security settings in Apple’s Safari browser to track users.
The write up included this paragraph:
Dan Tench, a partner at law firm Olswang, acting for the claimants, said that the decision was vital as it stops Google “evading or trivializing these very serious intrusions into the privacy of British consumers”.
Is this accurate?
My hunch is that Google may face additional legal scrutiny in Europe in 2015 despite this statement from the article:
Jonathan Hawker, who set up the Google Action Group regarding the Safari tracking issue, said that anyone who used an Apple iPhone, iPod or iPad between summer 2011 and spring 2012 could be entitled to compensation and should come forward. “Anyone who used the Safari browser during the relevant period now has the right to join our claim against Google. We urge all Safari users to join us in this battle to hold Google to account for its actions in the only way it understands,” he said.
My hunch is that Google’s legal eagles (maybe solicitor sparrows?) will seek additional legal processes. I do know that the GOOG is not keen on having its dreams thwarted. But I am not sure what Google understands although some people are confident in their grasp of the X Lab crowd.
Stephen E Arnold, March 27, 2015
Google Cloud Launcher and a Sci Fi TV Show
March 27, 2015
Google has diverse interests. First, the most important high school science and math club project.
Google will it seems get into the entertainment biz. You can learn more in “Google Takes Its Web Game to TV.” Yep, I know. Balloons, Glass, and a $70 million hire. All in a day’s work.
Now the less important news. Google posted “Deploy Popular Software Packages Using Cloud Launcher.” For those with search-oriented eyes, Lucene in the form of Elasticsearch (Bitnami) and Solr in the form of Bitnami’s “infrastructure” solution are available.
Lucene and Solr start at $6.46 per month. Bitnami is a cloud services company, which is much loved by Amazon’s Werner Vogels.
Amazon is responding with unlimited storage for $60 per year.
A number of observations seem to be warranted:
- Google and Amazon are offering what seem to be bargain basement prices.
- Both companies seem to be competing to become the WalMart of services companies.
- Google wants to be in the entertainment business.
Perhaps the companies will follow the path blazed by Kraft and Heinz. Would a Googlezon simplify life for customers who want tech, toys, and entertainment in a single, easy to use bundle? Competition is less efficient and therefore not logical, right?
Stephen E Arnold, March 27, 2015
Partnership Between Twitter and IBM Showing Results
March 27, 2015
The article on TechWorld titled IBM Boosts BlueMix and Watson Analytics with Twitter Integration investigates the fruits of the partnership between IBM and Twitter, which began in 2014. IBM Bluemix now has Twitter available as one of the services in the cloud based developer environment. Watson Analytics will also be integrated with Twitter for the creation of visualizations. Developers will be able to grab data from Twitter for better insights into patterns and relationships.
“The Twitter data is available as part of that service so if I wanted to, for example, understand the relationship between a hashtag on pizza, burgers or tofu, I can go into the service, enter the hashtag and specify a date range,” said Rennie. “We [IBM] go out, gather information and essentially calculate what is the sentiment against those tags, what is the split by location, by gender, by retweets, and put it into a format whereby you can immediately do visualisation.”
From the beginning of the partnership, Twitter gave IBM access to its data and the go-ahead to use Twitter with the cloud based developer tools. Watson looks like a catch all for data, and the CMO of Brandwatch Will McInnes suggests that Twitter is only the beginning. The potential of data from social media is a vast and constantly rearranging field.
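The workflow Rennie describes (pick a hashtag, specify a date range, split sentiment by location) can be sketched in a few lines. This is only an illustration, not the IBM or Twitter API; the records, the sentiment values, and the sentiment_by_location function are made up for the example.

```python
from collections import defaultdict
from datetime import date

# Made-up records: (hashtag, date, location, sentiment score in [-1, 1]).
tweets = [
    ("#pizza", date(2015, 3, 1), "London", 0.5),
    ("#pizza", date(2015, 3, 2), "Leeds", -0.25),
    ("#pizza", date(2015, 3, 3), "London", 0.25),
    ("#tofu", date(2015, 3, 2), "London", 0.75),    # different hashtag
    ("#pizza", date(2015, 4, 1), "London", -1.0),   # outside the date range
]


def sentiment_by_location(records, hashtag, start, end):
    """Average the sentiment of matching tweets, grouped by location."""
    buckets = defaultdict(list)
    for tag, day, location, sentiment in records:
        if tag == hashtag and start <= day <= end:
            buckets[location].append(sentiment)
    return {loc: sum(vals) / len(vals) for loc, vals in buckets.items()}


summary = sentiment_by_location(
    tweets, "#pizza", date(2015, 3, 1), date(2015, 3, 31)
)
print(summary)
```

The same grouping could be keyed on gender or retweet count instead of location; the output is a plain dictionary that a charting tool could consume directly.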
Chelsea Kerwin, March 27, 2015
Painting an IT Worker’s House Requires an NDA
March 27, 2015
You would not think that contractors, gardeners, painters, plumbers, and electricians would have to sign a non-disclosure agreement before working on someone’s home, but according to the New York Times it is happening all over Silicon Valley. “For Tech Titans, Sharing Has Its Limits” explains how home and garden maintenance workers now have to sign NDAs for big name tech workers just as they do for celebrities. Most of the time, workers do not even know who they are working for or recognize the names. This has made it hard to gather information on how many people require NDAs, but a recent lawsuit against Mark Zuckerberg sheds some light on why they are being used. He goes to great lengths to protect his privacy, but ironically the tech people who use NDAs are the ones who profit from personal information disclosures.
“The lawsuit against Mr. Zuckerberg involves a different residence, 35 miles south in Palo Alto. In it, a part-time developer named Mircea Voskerician claims that he had a contract to buy a $4.8 million house adjoining Mr. Zuckerberg’s residence, and offered to sell a piece of the property to Mr. Zuckerberg. He says that in a meeting at Facebook headquarters in Menlo Park, he discussed a deal to sell his interest in the entire property to Mr. Zuckerberg. In exchange, he says, Mr. Zuckerberg would make introductions between him and powerful people in Silicon Valley, potential future business partners and clients. Mr. Voskerician passed up a better offer on the house, the suit contends, but Mr. Zuckerberg did not follow through on the pledge to make introductions.”
Voskerician said he signed the NDA only as a condition of the proposed agreement, but Zuckerberg’s legal representation says the NDA covers all information related to him. On a related note, Facebook is making more privacy rules so only certain people can see user information. That does not change how much big name IT workers want their own information kept private. It seems sharing is good as long as it is done according to a powerful company’s definition of sharing.
Whitney Grace, March 27, 2015
Need to Remove SharePoint Results?
March 26, 2015
I read “SharePoint 2013 Items Removed with Search Result Removal Return from the Dead!” The article explains how to remove results from a user’s search results. If a user cannot locate specific information, that is a benefit, right? The write up includes links to two Microsoft documents that provide more detail. Are your search results comprehensive? Heh, heh, heh.
Stephen E Arnold, March 26, 2015