Online Shopping Is Too Hard

June 10, 2015

Online shopping is supposed to drive physical stores out of business, but that might not be the case if online shopping is too difficult. The Ragtrader article “Why They Abandon” explains that 45 percent of Australian consumers will not make an online purchase if they experience Web site difficulties. Instead, these consumers return to physical stores to make the purchase. The article mentions that 44 percent believe that traditional shopping is quicker if they know what to look for and 43 percent prefer in-store service.

The research comes from a Rackspace survey to determine shopping habits in New Zealand and Australia.  The survey also asked participants what other problems they experienced shopping online:

“42 percent said that there were too many pop-up advertisements, 34 percent said that online service is not the same as in-store and 28 percent said it was too time consuming to narrow down options available.”

These are understandable issues. People don’t want to be hounded to purchase other products when they have a specific item in mind, and thousands of options are overwhelming to search through. A digital wall is also daunting for people who prefer interpersonal relationships when they shop. The survey may pinpoint online shopping weaknesses, but it also helps online stores determine the best ways to improve.

“This survey shows that not enough retailers are leveraging powerful and available site search and navigation solutions that give consumers a rewarding shopping experience.”

People shop online for convenience, variety, lower prices, and deals.  Search is vital for consumers to narrow down their needs, but if they can’t navigate a Web site then search proves as useless as an expired coupon.


Whitney Grace, June 10, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Free Version of InetSoft Style Scope Agile Edition Available

June 10, 2015

The article titled “InetSoft Launches Style Scope Agile Edition for Dashboarding and Visual Analytics” on PRWeb tells of a free version of InetSoft’s application for visual analytics. Business users will gain access to an interactive dashboard with an easy-to-use drag-and-drop sensibility. The article offers more details about the launch:

“Advanced visualization types ideal for multi-dimensional charting and point-and-click controls like selection lists and range sliders give greater abilities for data exploration and performance monitoring than a simple spreadsheet offers. Any dashboard or analysis can be privately shared with others using just a browser or a mobile device, setting the application apart from other free BI tools… Setting up the software will be straightforward for anyone with power spreadsheet skills or basic knowledge of their database.”

Drawbacks to the free version are mentioned, such as being limited to two concurrent users. Of course, the free version is meant to “showcase” the company’s technology, according to CMO Mark Flaherty. There is a demo available to check out the features of the free application. InetSoft has been working since 1996 to bring users intuitive solutions to business problems. This free version is specifically targeted at smaller businesses that might be unable to afford the full application.

Chelsea Kerwin, June 10, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

A Loon Survivor: Facebook Lands Its Satellites

June 9, 2015

I know the quest to create a walled garden stimulates would-be AOLs’ thinking. I read “Facebook Has Scrapped Its Secret Plan to Build a $500 Million Satellite to Provide Cheap Internet in the Developing World.” It does appear to the addled goose that a person with some math sense calculated that operating Facebook satellites would be expensive. Facebook seems to be focusing its efforts on what the article called “ridiculously large drones.”

For me, maybe Google and its Loon balloons are a better deal. There is the problem of control, of course. Balloons drift, a fact that is evident at Kentucky Derby time when errant balloons come down in places not designed to accommodate large bags charged with hot air from open flames. I would be happier if some of this effort went into better information access, relevance, and useful information delivered to users looking for data.

Yahoo and AOL never had an opportunity to do the boom boom thing. What happens if a Facebook drone collides with a Loon balloon? Could a Jeff Bezos rocket take out both a drone and a Loon balloon? Who needs international dust ups? Corporations have to defend their turf, right?

Stephen E Arnold, June 9, 2015

Sentiment Analysis: The Progeny of Big Data?

June 9, 2015

I read “Text Analytics: The Next Generation of Big Data.” The article provides a straightforward explanation of Big Data, embraces unstructured information like blog posts in various languages, email, and similar types of content, and then leaps to the notion of text analytics. The conclusion to the article is that we are experiencing “The Coming of Age of Text Analytics—The Next Generation of Big Data.”

The idea is good news for the vendors of text analytics aimed squarely at commercial enterprises, advertisers, and marketers. I am not sure the future will match up to the needs of the folks at the law enforcement and intelligence conference I had just left.

There are three reasons:

First, text analytics are not new, and the various systems and methods have been in use for decades. One notable example is BAE Systems’ use of its home brew tools and Autonomy’s technology in the 1990s; i2 (pre IBM) made similar efforts even earlier.

Second, the challenges of figuring out what structured and unstructured data mean require more than determining if a statement is positive or negative. Text analytics is, based on my experience, blind to such useful data as real time geospatial inputs and video streamed from mobile devices and surveillance devices. Text analytics, like key word search, makes a contribution, but it is in a supporting role, not the Beyoncé of content processing.

Third, the future points to the use of technologies like predictive analytics. Text analytics are components in these more robust systems whose outputs are designed to provide probability-based outputs from a range of input sources.

There was considerable consternation a year or so ago when I spoke with a team involved with text analytics at a major telecommunications company. The grousing was that the outputs of the system did not make sense, and it was difficult for those reviewing the outputs to figure out what the data meant.

At the LE/intel conference, the focus was on systems which provide actionable information in real time. My point is that vendors have a tendency to see the solutions in terms of what is often a limited or supporting technology.

Sentiment analysis is a good example. Blog posts urging readers to join ISIS read as positive to some and negative to others. The point is that the reader’s point of view determines whether a message is positive or negative.

The only way to move beyond this type of superficial and often misleading analysis is to deal with context, audio, video, intercept data, geolocation data, and other types of content. Text analytics is one component in a larger system, not the solution to the types of problems explored at the LE/intel conference in early June 2015. Marketing often clouds reality. In some businesses, no one knows that the outputs are not helpful. In other endeavors, the outputs have far higher import. A sentiment score for a recruiting video with a moving nasheed underscoring the good guys dispatching the bad guys is off kilter. Is it important to know that the video is happy or sad? In fact, it is silly to approach the content in this manner.
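To make the point concrete, here is a toy Python sketch of a lexicon-based polarity scorer. The word lists, sample text, and scoring rule are invented for this illustration; no vendor’s system works exactly this way, but the blind spot is the same: the score reflects the words, not the reader.

```python
# Minimal lexicon-based polarity scorer (illustrative only).
# The word lists and scoring rule are invented for this sketch;
# real sentiment systems are far more elaborate, but share the
# same blind spot: they score words, not the reader's context.

POSITIVE = {"join", "brave", "glorious", "victory", "happy"}
NEGATIVE = {"defeat", "fear", "loss", "sad"}

def polarity(text: str) -> float:
    """Return a score in [-1, 1] based only on word counts."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

# A recruiting message can score as "positive" even though most
# readers -- and an analyst -- would judge it very differently.
sample = "Join the brave fighters and share in a glorious victory."
print(polarity(sample))  # 1.0: the lexicon sees only cheerful words
```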

Stephen E Arnold, June 9, 2015

Differing Focuses for OneDrive and SharePoint Online

June 9, 2015

Microsoft is unveiling a new OneDrive for Business and hopes that it offers a secure and sanctioned alternative to the lightweight solutions increasingly preferred by users, such as Box, Dropbox, and Google Drive. Search Content Management covers the story in its article, “OneDrive for Business and SharePoint Fill Different Niches.”

The article says:

“Microsoft has recognized users’ preference for lightweight systems, and that preference may explain the recent success of OneDrive for Business (ODB), a cloud file-sharing service that is part of the Office 365 suite. But Microsoft also has SharePoint, its heavier, more traditional content/collaboration platform, which also supports integration with a version of ODB.”

It seems that Microsoft is putting OneDrive up in the battle against others in the cloud file-sharing arena, while leaving SharePoint to handle more structured collaboration. It will be interesting to see how customers and enterprise managers market the two to their users. Stephen E. Arnold also has good coverage on both solutions for those who are looking for more information. His Web service, ArnoldIT.com, offers a good go-to SharePoint feed to keep users updated on the latest SharePoint tips, tricks, and news.

Emily Rae Aldridge, June 9, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

IBM Elevates Tape Storage to the Cloud

June 9, 2015

Did you think we left latency and bad blocks behind with tape storage? Get ready to revisit them, because “IBM Cloud Will Reach Back to Tape for Low-Cost Storage,” according to ComputerWorld. We noticed tape storage was back on the horizon earlier this year, and now IBM has made it official at its recent Edge conference in Las Vegas. There, the company was slated to present a cloud-archiving architecture that relies on different storage media, including tape, depending on an organization’s needs. Reporter Stephen Lawson writes:

“Enterprises are accumulating growing volumes of data, including new types such as surveillance video that may never be used on a regular basis but need to be stored for a long time. At the same time, new big-data analytics tools are making old and little-used data useful for gleaning new insights into business and government. IBM is going after customers in health care, social media, oil and gas, government and other sectors that want to get to all of their data no matter where it’s stored. IBM’s system, which it calls Project Big Storage, puts all tiers of storage under one namespace, creating a single pool of data that users can manage through folders and directories without worrying about where it’s stored. It incorporates both file and object storage.”

A single pool of data is good. The inclusion of tape storage in this mix is reportedly part of an attempt to undercut IBM’s cloudy competitors, including AWS and Google Cloud. Naturally, the service can be implemented onsite, as a cloud service, or as a hybrid. IBM hopes Project Big Storage will make cloud pricing more predictable, though complexity there seems inevitable. Tape storage is slower to deliver data, but according to the plan only “rarely needed” data will be stored there, courtesy of IBM’s own Spectrum Scale distributed storage software. Wisely, IBM is relying on the tape-handling experts at Iron Mountain to run the tape-based portion of Project Big Storage.
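To picture what a single namespace over mixed tiers means in practice, here is a toy Python sketch. The class, tier names, and promotion policy are assumptions made for illustration; they are not IBM’s Project Big Storage or Spectrum Scale interfaces.

```python
# Toy sketch of a single namespace over tiered storage.
# Tier names, the promotion policy, and the in-memory "backend"
# are invented for illustration; they are not IBM APIs.

import time

class TieredNamespace:
    def __init__(self):
        # One logical path space; each entry records its tier and payload.
        self.catalog = {}  # path -> {"tier": str, "data": bytes, "last_read": float}

    def put(self, path: str, data: bytes, tier: str = "disk") -> None:
        self.catalog[path] = {"tier": tier, "data": data, "last_read": 0.0}

    def get(self, path: str) -> bytes:
        entry = self.catalog[path]
        if entry["tier"] == "tape":
            time.sleep(0.01)        # stand-in for a slow tape recall
            entry["tier"] = "disk"  # promote on access
        entry["last_read"] = time.time()
        return entry["data"]

    def demote_cold(self, max_age_seconds: float) -> None:
        # Move rarely read objects down to tape to cut storage cost.
        now = time.time()
        for entry in self.catalog.values():
            if entry["tier"] == "disk" and now - entry["last_read"] > max_age_seconds:
                entry["tier"] = "tape"

# Callers see only paths; which tier holds the bytes is the system's concern.
ns = TieredNamespace()
ns.put("/surveillance/2015/cam01.mp4", b"...", tier="tape")
print(len(ns.get("/surveillance/2015/cam01.mp4")))  # recalled from tape, now on disk
```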

Cynthia Murrell, June 9, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph


Social Media Listening on Facebook

June 9, 2015

The article on Virtual-Strategy Magazine titled “NUVI and Datasift Join Forces to Offer Clients Access to Anonymized and Aggregated Facebook Topic Data” explains the latest news from NUVI. NUVI is a growing platform for social media “listening,” allowing companies to combine and visualize data from a variety of social media sites, including Facebook, Twitter, Instagram, Reddit, and more. NUVI is also the exclusive partner of Berkshire Hathaway subsidiary Business Wire. NUVI is now partnering with Datasift, which gives it access to aggregated and anonymized Facebook topic data, including such information as the brands being discussed and the events being held on Facebook. The article states,

“Access to this information gives marketers a deeper understanding of the topics people are engaging in on the world’s largest social platform and the ability to turn this information into actionable insights. With NUVI’s visually intuitive custom dashboards, customers will be able to see aggregate and anonymized insights such as age ranges and gender… “Our partnership with DataSift is reflective of our desire to continue to provide access to the valuable information that our customers want and need,” said CEO of NUVI.”

Tim Barker, Chief Product Officer of Datasift, also chimes in with his excitement about the partnership, while mentioning that the business value of the deal will not affect the privacy of Facebook users. At least the information businesses glean will not contain a specific user’s private data, just the posts users probably have no clue are of value beyond the number of likes they get.

Chelsea Kerwin, June 9, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph


Semantic Search Hoohah: Hakia

June 8, 2015

My Overflight system snagged an updated post to an article written in 2006 about Hakia. Hakia, as you may know, was a semantic search system. I ran an interview with Riza C. Berkan in 2008. You can find that Search Wizards Speak interview here.

Hakia went quiet months ago. The author of “Hakia Is a True Semantic Search Engine” posted a sentence that said: “Hakia, unfortunately, failed and went out of business.”

I reviewed that nine-year-old article this morning and highlighted several passages. These are important because these snippets illustrate how easy it is to create a word picture which does not match reality. Search engine developers find describing their vision a heck of a lot easier than converting talk into sustainable revenues.

Let’s run down three of the passages proudly displaying my blue highlighter’s circles and arrows. The red text is from the article describing Hakia and the blue text is what the founder of Hakia said in the Search Wizards Speak interview.

Passage One

So a semantic search engine doesn’t have to address every word in the English language, and in fact it may be able to get by with a very small core set of words. Let’s say that we want to create our own semantic search engine. We can probably index most mainstream (non-professional) documents with 20-30,000 words. There will be a few gaps here and there, but we can tolerate those gaps. But still, the task of computing relevance for millions, perhaps billions of documents that use 30,000 words is horrendously monumental. If we’re going to base our relevance scoring on semantic analysis, we need to reduce the word-set as much as possible.

This passage is less about Hakia and more about the author’s perception of semantic search. Does this explanation resonate with you? For me, many semantic methods are computationally burdensome. As a result, the systems are often sluggish and unable to keep pace with new and updated content.
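To make the passage concrete, here is a hypothetical Python sketch of what scoring relevance over a reduced word set might look like: collapse surface words onto a small concept vocabulary, then score a document by how many query concepts it covers. The concept map and the scoring rule are my own illustration, not Hakia’s method.

```python
# Illustrative sketch of scoring relevance over a reduced word set.
# The concept map and overlap score are invented for this example;
# they are not Hakia's QDexing method.

CONCEPTS = {
    # surface word -> concept label (a stand-in for a ~20-30,000 word core set)
    "car": "vehicle", "automobile": "vehicle", "sedan": "vehicle",
    "physician": "doctor", "doctor": "doctor", "md": "doctor",
    "cheap": "low_cost", "inexpensive": "low_cost", "affordable": "low_cost",
}

def to_concepts(text: str) -> set:
    """Collapse a text onto the small concept vocabulary, dropping the rest."""
    words = [w.strip(".,").lower() for w in text.split()]
    return {CONCEPTS[w] for w in words if w in CONCEPTS}

def relevance(query: str, document: str) -> float:
    """Fraction of query concepts the document covers (0.0 to 1.0)."""
    q, d = to_concepts(query), to_concepts(document)
    return 0.0 if not q else len(q & d) / len(q)

print(relevance("cheap automobile", "An affordable sedan for city driving."))  # 1.0
print(relevance("cheap automobile", "A physician reviews new treatments."))    # 0.0
```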

Here’s how Dr. Riza C. Berkan, a nuclear engineer and math whiz, explained semantics:

With semantic search, this is not a problem. We will extract everything of the 500 words that is relevant content. That is why Google has a credibility problem. Google cannot guarantee credibility because its system relies on link statistics. Semantic methods do not rely on links. Semantic methods use the content itself. For example, hakia QDexed approximately 10 million PubMed pages. If there are 100 million questions, hakia will bring you to the correct PubMed page 99 percent of the time, whereas other engines will bring you perhaps 25 percent of the time, depending on the level of available statistics. For certain things, the big players do not like awareness. Google has never made, and probably never will make, credibility important. You can do advanced search and do “site: sitename” but that is too hard for the user; less than 0.5% of users ever use advanced search features.

Passage Two

What I believe the founders of Hakia have done is borrow the concept of Lambda Calculus from compiler theory to speed the process of reducing elements on pages to their conceptual foundations. That is, if we assume everyone writes like me, then most documents can be reduced to a much smaller subset of place-holders that accurately convey the meaning of all the words we use.

Okay, but in my Search Wizards Speak interview, the founder of Hakia said:

We can analyze 70 average pages per second per server. Scaling: The beauty of QDexing is that QDexing grows with new knowledge and sequences, but not with new documents. If I have one page, two pages or 1,000 pages of the OJ Simpson trial, they are all talking about the same thing, and thus I need to store very little of it. The more pages that come, the more the quality of the results increase, but only with new information is the amount of QDex stored information increased. At the beginning, we have a huge steep curve, but then, processing and storage are fairly low cost. The biggest cost is the storage, as we have many many QDex files, but these are tiny two to three Kb files. Right now, we are going through news, and we are showing a seven to 10 minute lag for fully QDexing news.

No reference to a type of calculus that thrills Googlers. In fact, a review of the patent shows that well-known methods are combined in what appears to be an interesting way.
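The founder’s claim that storage grows with new knowledge rather than with new documents can be illustrated, loosely, as de-duplicating repeated statements: store each distinct statement once, no matter how many pages repeat it. The sketch below is my simplification for the sake of the argument, not the QDex method described in the patent.

```python
# Loose illustration of an index that grows with new statements,
# not with new documents. A simplification for the argument only,
# not the QDex method from Hakia's patent.

import hashlib

class StatementStore:
    def __init__(self):
        self.statements = {}  # fingerprint -> canonical sentence

    def ingest(self, document: str) -> None:
        for sentence in document.split("."):
            sentence = sentence.strip().lower()
            if not sentence:
                continue
            key = hashlib.sha1(sentence.encode()).hexdigest()
            self.statements.setdefault(key, sentence)

    def size(self) -> int:
        return len(self.statements)

store = StatementStore()
# 1,000 pages repeating the same coverage add almost nothing new.
for _ in range(1000):
    store.ingest("The trial began Monday. The verdict is expected soon.")
print(store.size())  # 2 distinct statements, regardless of document count
```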

Passage Three

Documents can still pass value by reference in a semantic index, but the mechanics of reference work differently. You have more options, so less sophisticated writers who don’t embed links in their text can have just as much impact on another document’s importance as professional search optimization copywriters. Paid links may not become a thing of the past very quickly, but you can bet your blog posts that buying references is going to be more sophisticated if this technology takes off. That is what is so exciting about Hakia. They haven’t just figured out a way to produce a truly semantic search engine. They have just cut through a lot of the garbage (at a theoretical level) that permeates the Web. Google AdSense arbitragers who rely on scraping other documents to create content will eventually find their cash cows drying up. The semantic index will tell Hakia where the original content came from more often than not.

Here’s what the founder says in the Search Wizards Speak interview:

With semantic search, this is not a problem. We will extract everything of the 500 words that is relevant content. That is why Google has a credibility problem. Google cannot guarantee credibility because its system relies on link statistics. Semantic methods do not rely on links. Semantic methods use the content itself. For example, hakia QDexed approximately 10 million PubMed pages. If there are 100 million questions, hakia will bring you to the correct PubMed page 99 percent of the time, whereas other engines will bring you perhaps 25 percent of the time, depending on the level of available statistics. For certain things, the big players do not like awareness. Google has never made, and probably never will make, credibility important. You can do advanced search and do “site: sitename” but that is too hard for the user; less than 0.5% of users ever use advanced search features.

The key fact is that Hakia failed. The company tried to get traction with health and medical information. The vocabulary for scientific, technical, and medical content is less poetic than the writing in business articles and general blog posts. Nevertheless, the customers and users did not bite.

Notice that the author of the article did not come to grips with the specific systems and methods used by Hakia. The write up “sounds good” but lacks substance. The founder’s explanation reveals his confidence in what “should be,” not what was and is.

My point: Writing about search is difficult. Founders see the world one way; those writing about search interpret the descriptions in terms of their knowledge.

Where can one get accurate, objective information about search? The options are limited and have been for decades. Little wonder that search remains a baffler to many people.

Stephen E Arnold, June 8, 2015

Information Theory: An Eclectic Approach

June 8, 2015

I love information theory. If you want to get some insight into selected themes in this discipline, check out “Journey into Information Theory.” The course or “program” falls into two sections: Ancient Information Theory and Modern Information Theory. What is interesting is that the Ancient category zips right along from written language to Morse code. Where the subject becomes troublesome for me is the Modern section. The program moves from “symbol rate” to Markov chains. To my eye, there are some omissions. But it appears that the course or program is free.
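For a taste of the material between “symbol rate” and Markov chains, here is a small Python sketch that computes the entropy of a memoryless symbol source and the entropy rate of a two-state Markov source. The probabilities are invented for the example.

```python
# Small illustration of two ideas the program covers: the entropy of a
# memoryless symbol source, and the entropy rate of a two-state Markov
# source. The probabilities are invented for this example.

from math import log2

def entropy(probabilities) -> float:
    """Shannon entropy in bits per symbol."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

# Memoryless source: four symbols with skewed probabilities.
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits/symbol

# Two-state Markov source: entropy rate = sum_i pi_i * H(row_i),
# where pi is the stationary distribution of the transition matrix.
transitions = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.5, "B": 0.5}}
# Stationary distribution for this chain works out to pi_A = 5/6, pi_B = 1/6.
stationary = {"A": 5 / 6, "B": 1 / 6}
rate = sum(stationary[s] * entropy(transitions[s].values()) for s in transitions)
print(rate)  # about 0.56 bits/symbol: memory lowers the rate below log2(2) = 1
```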

Stephen E Arnold, June 8, 2015

Data From eBay Illustrates Pricing Quirk

June 8, 2015

The next time you go to sell or buy an item, pay attention to the message the price is sending. Discover reports that “The Last Two Digits of a Price Signal Your Desperation to Sell.” Researchers at UC Berkeley’s business school recently analyzed data from eBay, tracking original prices and speed of negotiations. Writer Joshua Gans shares a chart from the report, and explains:

“The chart shows that when the posted initial price is of a round number (the red dots), like $1,000, the average counteroffer is much lower than if it is a non-round number (the blue circles), like $1,079. For example, the graph suggests that you can actually end up with a higher counteroffer if you list $998 rather than $1,000. In other words, you are better off initially asking for a lower price if price was all you cared about. [Researchers] Backus et al postulate that what is going on here is ‘cheap talk’ – that is, an easy-to-make statement that may be true or untrue with no consequences for dishonesty – and not an otherwise reliable signal. There are some sellers who don’t just care about price and, absent any other way of signaling that to buyers, they set their price at a round number. Alternatively, you can think that the more patient sellers are using non-round numbers to signal their toughness. Either way, the last two digits of the price is cheap talk.”

Gans notes that prices ending in “99” are apparently so common that eBay buyers treat them the same as those ending in round numbers. The team performed a similar analysis on real estate sales data and found the same pattern: properties priced at round numbers sell faster. According to the write-up, real estate agents are familiar with the tendency and advise clients accordingly. Now you, too, can send and receive signals through prices’ last two digits.
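As a toy version of the comparison the researchers ran, here is a short Python sketch that splits listings into round and non-round asking prices and compares average counteroffers. The listing data is fabricated for illustration; it is not the Backus et al eBay data.

```python
# Toy version of the round-number comparison described above.
# The listing data below is fabricated for illustration; it is not
# the eBay data analyzed by Backus et al.

def is_round(price: int) -> bool:
    """Treat prices ending in 00 as 'round' signals."""
    return price % 100 == 0

listings = [
    # (asking price, first counteroffer)
    (1000, 720), (1000, 750), (900, 640),
    (1079, 860), (998, 815), (1045, 840),
]

def average_counteroffer(round_numbers: bool) -> float:
    offers = [c for p, c in listings if is_round(p) == round_numbers]
    return sum(offers) / len(offers)

print(average_counteroffer(round_numbers=True))   # round askers draw lower offers
print(average_counteroffer(round_numbers=False))  # non-round askers draw higher offers
```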

Cynthia Murrell, June 8, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
