Two Pundits and Their Punditry

March 31, 2012

I find the notion of pundits fascinating. The US in 2012 pivots on a news hook, the Warhol fame thing, and a desire to share viewpoints with Flipboard and Pulse users.

This morning I was listening to the crackle of small arms fire in rural Kentucky. Dawn had not yet extended its crepuscular reach to my hollow but two write ups did. Neither is one of those magnum loads squirrel hunters desire here in the Commonwealth. Nope, these were birdshot, but each write up is interesting nonetheless.

Both indirectly concern search and retrieval. Both found their way into my “gems of the poobahs” folder.

First, I noted the digital Atlantic’s write up “The Advertising Industry’s Definition of ‘Do Not Track’ Doesn’t Make Sense.” What caught my attention was the juxtaposition of the word “advertising” with the phrase “doesn’t make sense.” Advertising making sense? The Atlantic “real” journalist has not watched television with a 67 year old. More than half of the TV commercials which I find embedded in basketball games every four minutes don’t make sense. Advertising is about creating a demand for must-have products. Advertising is part of the popular culture and an engine of growth for companies unable to generate sales without the craft and skill of psychological tactics. Check out an advertisement for Kentucky bourbon. Does this headline make sense?

“Honk if you’re proud to be a redneck?”

As a resident of Kentucky, I am not sure I know what a redneck is, but I bet those folks in Boston do. But what’s the “making sense” part? What advertising does is tickle the brain to make some folks want to drink. And we all know how important it is to imbibe whiskey, engage in “real” journalism, and ferry children to soccer practice. Yep, makes “sense” to me.

But here’s the passage which caught my attention:

Stanford’s Aleecia McDonald found that 61 percent of people expect that clicking a Do Not Track button should shut off *all* data collection. Only 7 percent of people expected that websites could collect the same data before and after clicking a ‘Do Not Track’ button. That is to say, 93 percent of people do not understand the industry’s definition of DNT. Which totally makes sense! Who would ever think saying, “Do not track me,” actually means, “It’s fine to collect data on me, but don’t show me any signs that you’re doing so.” Simply because the industry itself has defined ‘Do Not Track’ in an idiosyncratic way doesn’t mean their self-serving decision should be the basis for all policy and practice in this field.

Almost any redneck would understand this passage, the implications of persistent cookies, and the distinction between various types of tracking, including my favorite, the iFrames-based method.

Second, I read “Debunking Senator Al Franken On Google, The Internet & Privacy.” This screed is from a “real” journalist and favorite source of juicy quotes on the subject of search and retrieval. The point of the write up is that despite the author’s affection for a US senator as a comedian, the US senator does not know beans about tracking, Google, and, by extension, search and retrieval. Now “search” does not mean find. To the “real” journalist, I believe, search means using methods to generate traffic to a Web site. I define “search” differently, but the good part in my opinion is this passage:

Ya think? But I mean, Facebook kind of does sell my friends. I can export all of them out to Yahoo and Bing, because Facebook and Yahoo and Bing all have deals. I can’t export them to Google, because, you know, they aren’t friends. Would you call that selling to the highest bidder? When I go over to search on Bing, by default, all my Facebook friends are being used to personalize my search results. Oh, I can opt-out, but you know how hard that is. Since that’s part of a Bing-Facebook deal, is that a line that’s crossed?

Please, read the entire “real” journalistic analysis of a talk by a US senator. I must admit I don’t relate to the questions and analytic points in this paragraph. I recognize the names of the companies mentioned, but “the deal” baffles me.

Why do I care? Three points:

  1. I sense the emotion in these write ups. Passion is good for advertising and good for capturing attention. However, I am struggling to figure out what the problem is. Advertising seems to be what America is. Untangling the warp and woof of this fabric is difficult for me.
  2. The ad hominem method and charged language cause me to think that the lingo of advertising has become the common parlance of “real” journalists.
  3. I struggle to unravel the meaning of certain parts of these two write ups. Am I alone?

Net net: technology and advertising are an interesting compound. Now “real” journalism is quite similar. To quote one “real” journalist, “Ya think?” Well, not much.

Stephen E Arnold, March 31, 2012

Sponsored by Pandia.com

SAS Gets More Visual

March 31, 2012

Inxight (now owned by BusinessObjects, part of the SAP empire) is history at SAS or almost history. Now the company is moving in a different direction.

Jaikumar Vijayan writes about a new visual analytics application recently unveiled by SAS in his article “SAS Promises Pervasive BI with New Tool.” Einstein is believed to have once said “computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.” We noted this passage from Mr. Vijayan’s write up:

Unlike many purely server-based enterprise analytics technologies, Visual Analytics gives business users a full range of data discovery, data visualization and querying capabilities from desktop and mobile client devices, the company said.

The initial version of the new tool allows iPad users to view reports and download information to their devices. Future versions will support other mobile devices as well, SAS added. The Einstein quote is actually a good description of the concept that underlies Visual Analysis. The process uses analytic reasoning to detect specific information in massive amounts of data. For example, a clothing manufacturer might use it to determine current trends in ladies’ fashions. The results are presented in charts and graphs to the users, who can fine-tune the parameters until their specific queries are answered.

SAS is known for its statistical functionality, its programming language, and its need for SAS-savvy cow pokes to ride herd on the bits and bytes. Will SAS be able to react to the trend for the consumerization of business intelligence?

While the technology is impressive, SAS may be a little late to the game. Palantir and Digital Reasoning have already introduced applications that offer clients powerful Visual Analysis capabilities. Time will tell if SAS is able to catch up to some competitors’ approach. We are interested in Digital Reasoning, Ikanow and Quid.

Stephen E Arnold, March 31, 2012

A Road Map for Censorship

March 31, 2012

David Bamman, Brendan O’Connor, and Noah A. Smith present some interesting facts based on a study they wrote about in their article, “Censorship and Deletion Practices in Chinese Social Media.” Their study touches on a variety of different aspects regarding how China allegedly controls the intake and outflow of information.

The Chinese government’s methods are far different from the United States’ approach. My understanding of the situation is that China takes censorship to extremes and infringes on the freedom of its citizens using the GFW (Great Firewall of China), which filters key phrases and words, preventing access to sites like America’s Facebook and Google. However, Sina Weibo is the Chinese equivalent of Facebook where bloggers post and pass information presumably in a way the officials perceive as more suitable for the Middle Kingdom.

Sina Weibo is monitored and as long as members stay within the boundaries or disguise their information, posts go unnoticed. If any of the outlawed phrases are entered, the user’s post is deleted and anyone searching for the information is met with the phrase ‘Target weibo does not exist’. If the user properly masks the phrase or words used, the information will get through, showing that there is the possibility of future change regarding the censorship practices in China.

The GFW will catch obvious outgoing information such as the names of political figures, which the study monitored. The article asserted:

In late June/early July 2011, rumors began circulating in the Chinese media that Jiang Zemin, general secretary of the Communist Party of China from 1989 to 2002, had died. These rumors reached their height on 6 July, with reports in the Wall Street Journal, Guardian and other Western media sources that Jiang’s name had been blocked in searches on Sina Weibo (Chin, 2011; Branigan, 2011). If we look at all 532 messages published during this time period that contain the name Jiang Zemin, we note a striking pattern of deletion: on 6 July, the height of the rumor, 64 of the 83 messages containing that name were deleted (77.1 percent); on 7 July, 29 of 31 (93.5 percent) were deleted.
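The percentages in that passage follow directly from the message counts. Here is a minimal Python sketch of the arithmetic, using only the counts quoted above. The function name `deletion_rate` is my own label, not something from the study.

```python
def deletion_rate(deleted: int, total: int) -> float:
    """Return the share of messages deleted, as a percentage rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * deleted / total, 1)


# Counts taken from the quoted passage: (deleted, total) per day
counts = {"6 July": (64, 83), "7 July": (29, 31)}

for day, (deleted, total) in counts.items():
    rate = deletion_rate(deleted, total)
    print(f"{day}: {deleted} of {total} deleted = {rate} percent")
```

The counts reproduce the 77.1 percent and 93.5 percent figures the authors report.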

No firewall is perfect, but according to the studies done on searches, blogs and texts containing prohibited information, China has a pretty impressive figure. It may not seem reasonable by American standards, but by filtering anything it deems politically sensitive, China protects the privacy of the country, preventing global rumors and interference.

On one level, censorship makes sense, in particular regarding the business world. The Chinese government makes its corporations responsible for their employees, meaning if an employee is blogging instead of working and puts in illegal information, the company itself is fined, or worst case scenario, shut down. Thus Chinese factories have a high rate of productivity because their workers are actually doing their job.

How is China’s alleged position relevant to the US? There may be little relevance, but to officials in other countries, the article’s information may be just what one needs to check into a Holiday Inn of censorship.

Jennifer Shockley, March 31, 2012

Love Lost between Stochastic and Google AppEngine

March 30, 2012

Stochastic Technologies’ Stavros Korokithakis has some very harsh words for Google’s AppEngine in “Going from Loving AppEngine to Hating it in 9 Days.” Is the Google shifting its enterprise focus?

Stochastic’s service Dead Man’s Switch got a huge publicity boost from its recent Yahoo article, which drove thousands of new visitors to the site. Preparing for just such a surge, the company turned months ago to Google’s AppEngine to manage potential customers. At first, AppEngine worked just fine. The hassle-free deployments while rewriting and the free tier were just what the company needed at that stage.

Soon after the Yahoo piece, Stochastic knew they had to move from the free quota to a billable status. There was a huge penalty, though, for one small mistake: Korokithakis entered the wrong credit card number. No problem, just disable the billing and re-enable it with the correct information, right? Wrong. Billing could not be re-enabled for another week.

Things only got worse from there. Korokithakis attempted to change settings from Google Wallet, but all he could do was cancel the payment. He then found that, while he was trying to correct his credit card information, the AppEngine Mail API had reached its daily 100-recipient email limit. The limit would not be removed until the first charge cleared, which would take a week. The write up laments:

At this point, we had five thousand users waiting for their activation emails, and a lot of them were emailing us, asking what’s wrong and how they could log in. You can imagine our frustration when we couldn’t really help them, because there was no way to send email from the app! After trying for several days to contact Google, the AppEngine team, and the AppEngine support desk, we were at our wits’ end. Of all the tens of thousands of visitors that had come in with the Yahoo! article, only 100 managed to actually register and try out the site. The rest of the visitors were locked out, and there was nothing we could do.

Between sluggish payment processing and a bug in the Mail API, it actually took nine days before the Stochastic team could send emails and register users. The company undoubtedly lost many potential customers to the delay. In the meantime, to add charges to injury, the AppEngine task queue kept retrying to send the emails and ran up high instance fees.
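To put the email cap in perspective, here is a back-of-the-envelope sketch in Python. It assumes the 100-recipient daily limit stays in force and that each waiting user needs exactly one activation email. The function name is hypothetical, and nothing here calls AppEngine.

```python
import math


def days_to_clear_backlog(pending_emails: int, daily_limit: int) -> int:
    """Days needed to send every pending email when at most daily_limit go out per day."""
    if daily_limit <= 0:
        raise ValueError("daily_limit must be positive")
    return math.ceil(pending_emails / daily_limit)


# Figures from the write up: about 5,000 users waiting, 100 recipients per day
print(days_to_clear_backlog(5000, 100))
```

Under those assumptions the backlog would take 50 days to clear, so waiting out the quota was never a realistic option.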

It is no wonder that Stochastic is advising us all to stay away from Google’s AppEngine. Our experiences with Google have been positive. Perhaps this is an outlier’s experience?

Cynthia Murrell, March 30, 2012

Michael Moody Joins Lucid Imagination

March 30, 2012

Market Watch recently reported on Lucid Imagination, the commercial company for Apache Lucene and Solr search technology, in the article “Lucid Imagination Names Software Development Luminary Michael Moody Senior Vice President of Engineering.”

According to the article, Michael Moody brings more than 30 years of software engineering to the search technology company. He has held senior positions in several different companies, including Spigit, Jaspersoft, and Portal Software.

Mr. Moody said:

“Thanks to Lucid Imagination, companies will be able to meet the challenge of analyzing their big data before the rapid adoption leads to operational chaos, lost opportunities, and reduced competitiveness,” said Moody. “We have the technology, business model and people in place to help drive a complete transformation of enterprise search and retrieval that will lead to phenomenally better and faster decision making.”

My colleagues and I are very excited to see Michael Moody’s addition to the Lucid Imagination team.

I speak for the ArnoldIT team when I assert that we are confident that his expertise will help the company come up with even better ways to overcome the challenges of enterprise search and big data access.

We have noticed that a number of open source search vendors are touting performance enhancements, fail over methods, and value added indexing advantages which Lucid’s search system allegedly does not provide. Assertions are easy. Real world deployments are different from talking about delivering cost savings and improved efficiencies to a customer.

We have just completed a fly over of open source search vendors. In our view, Lucid’s search system outdistanced the other Lucene-based search systems we examined.

We try to avoid Mac vs. PC type hassles, but the key difference among open source search vendors boils down to who can deliver efficiencies to the licensee, offer financial stability, and provide 24×7 engineering support and services. Measured against our “real world” yardstick, Lucid Imagination earns our trust. There is more to the company than a single entrepreneur working nights and weekends to compete. Just our view. Maybe our Overflight report will become publicly available. Who knows?

In the meantime, navigate to www.lucidimagination.com and learn more about the company.

Jasmine Ashton, March 30, 2012

Attensity Sallies into the Insurance Sector

March 30, 2012

Insurance Networking News recently reported on a new application that is designed to enable insurers to analyze unstructured data in the article “Insurance – Specific Social Analytics Software Launched.”

According to the article, Attensity, a provider of social analytics and engagement solutions, is working to help insurance companies make their claims processes more efficient by assisting them with the analysis of data gathered from a variety of sources, including claim forms, adjuster notes, and customer feedback from social media, surveys, emails and other sources.

The article states:

The software builds on the company’s text analytics application with out-of-the-box category sets, topics, reports and dashboards tailored specifically for the insurance industry. The new solution enables insurance carriers to spot fraudulent patterns, identify customer pain points early, respond to customer service requests proactively as well as analyze the data of customers that switch providers.

This is just one more example of text analytics software providers helping other industries get a better feel for what their consumers are saying. Will insurance have the same appetite as the intelligence community for Attensity’s system and method of extracting nuggets of information?

Jasmine Ashton, March 30, 2012

Improving Governance Compliance in SharePoint

March 30, 2012

Jeremy Thake addresses the important issue of governance compliance in “SharePoint Gone Wild: When Governance Lacks Compliance.” Many organizations employ multiple repositories for sensitive content, and an out-of-the-box SharePoint system makes it hard to enforce the guidelines of where content should go. Thake explains:

“The out-of-the-box auditing features in SharePoint 2010 have some key limitations in this space, specifically regarding the storage of this data over a prolonged period of time (most acts seem to be approximately seven years) as well as the ease of producing a report of an individual user’s activity and attached content. The most common format followed by customers with whom I work is Concordance, which is supported by LexisNexis. But more importantly, from a content perspective, the attached content should be exactly what the user viewed, modified, or created at that point in time so versioning here is the key.”

SharePoint 2010 has many added improvements to address some of the out-of-the-box compliance gaps. But compliance is an area that needs a comprehensive solution. To really extend your SharePoint capabilities and get the most out of your enterprise search investments, look to Mindbreeze.

No matter where your sensitive information is stored, on-premise or in the Cloud, Mindbreeze connects users to the right information while maintaining strict security standards. Here you can read about the cost-efficient solution:

“Fabasoft Mindbreeze Enterprise finds every scrap of information within a very short time, whether document, contract, note, e-mail or calendar entry, in intranet or internet, person- or text-related. The software solution finds all required information, regardless of source, for its users. Get a comprehensive overview of corporate knowledge in seconds without redundancy or loss of data.”

Check out their full suite of solutions to see what will work for you.

Philip West, March 30, 2012

Newspapers Losing Revenue: Time for a Change

March 30, 2012

Newspaper acquisition time? I was surprised by a headline I landed on while browsing Business Week; an article titled, “Newspapers Lose $10 in Print for Every Digital $1” grabbed my attention.

According to the article, newspapers in the United States lost $10 in print advertising revenue in 2011 for every dollar gained online. The article cites a study by Pew Research and blames the 7.9 percent ad revenue loss on competition from tech intermediaries. Newspapers are hurting tremendously in the online arena. Paid news sites and print copies are declining in revenue because consumers want their news fast and free, usually via mobile apps and free news blogs.

Newspaper groups have failed to capitalize on the volume of personalized data available online in the face of increased competition from companies including Google (GOOG) and Facebook, which are selling advertising targeted to consumers based on their interests and demographics, typically at higher ad rates, Rosenstiel said. Newspapers have slowly shifted their businesses online, led in part by the recent success of New York Times Co. (NYT)’s plan to charge readers for access to its newspapers’ websites. Pew’s study estimates as many as 100 newspapers are expected to offer a digital subscription model in the coming months.

No matter how one exercises ingenuity, the newspapers have a broken business model and a customer base indifferent to old information in print or online. Users are not likely to pay subscription fees, even for traditional and trusted organizations, if the material is available elsewhere for free. News groups should consider new business models or partnerships with data-driven companies, or else it could be sell off and go fishing time.

Andrea Hayden, March 30, 2012

Law Firms Learn Staff Can Be Repurposed

March 30, 2012

I know there is considerable enthusiasm for smart software. Most of the eDiscovery vendors suggest that humans and whizzy new systems can coexist. Now, a new chapter in justifiable staff reductions may be upon us. Navigate to “A New View of Review: Predictive Coding Vows to Cut E-Discovery Drudgery” to learn that recently released research from an Ivory Tower-type says that a “predictive coding approach can do a better job of sifting through more than 800,000 documents than humans.”

For many law school graduates, scouring documents for material of value to a case has long been a secure if somewhat tedious means of entering the legal profession. This will no longer be true, however, if a new type of software lives up to its creator’s claims. Known as predictive coding, it can supposedly do the same job, faster, cheaper, and as well as humans. But lawyers live to bill, so perhaps software may force law firms to get rid of staff and trust the algorithms.

We learn:

There has been a long-standing myth in the legal field that exhaustive manual review is the gold standard, or nearly perfect, but that has been shown to be a fallacy, according to Maura R. Grossman, a New York City attorney. Research has shown that, under the best circumstances, manual review will identify about 70 percent of the responsive documents in a large data collection. Some technology-assisted approaches have been shown to perform at least as well as that, if not better, at far less cost.

Attorneys, paralegals, unpaid interns, and experts in India will miss 30 percent of the pertinent documents. Smart software is the path to the future.
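The 70 percent figure is what search folks call recall: the share of responsive documents a review actually finds. Here is a quick Python sketch of what that means in document counts. The 10,000 responsive documents are a made-up number for illustration, not a figure from the article, and the function name is my own.

```python
def missed_documents(responsive_total: int, recall: float) -> int:
    """Number of responsive documents a review fails to find at the given recall."""
    if not 0.0 <= recall <= 1.0:
        raise ValueError("recall must be between 0 and 1")
    found = round(responsive_total * recall)
    return responsive_total - found


# Hypothetical collection: 10,000 responsive documents, reviewed at 70 percent recall
print(missed_documents(10_000, 0.70))
```

The same function works for any recall claim a vendor makes, which is handy when comparing manual review with a technology-assisted approach on equal terms.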

Some observers worry about the legal defensibility of predictive coding. But such concerns are unfounded, so long as both sides agree to its use. That’s according to Craig Carpenter, a marketer for Recommind, a software development firm focused on the legal and corporate market.

But even sophisticated programs don’t actually think. Without that capacity, they cannot understand the subtle nuances and informal connections that underlie written documents. It’s unlikely that predictive coding will live up to the sales hype surrounding it. But what’s new about search vendors’ marketing? Reality is often different from Spock’s world on Star Trek.

Stephen E Arnold, March 30, 2012
