Big Data Shockers: Not Big

September 14, 2015

I read “Big Data Doesn’t Exist.” Now “data” is plural, but why get involved in grammar. This is the mobile, thumb typing era.

The write up states:

I’ve found it’s a good rule of thumb to assume a company has one one-thousandth of the data they say they do.

Yep, the data perception anomaly is alive and well. The folks who have too much data are too busy as well. Many of the individuals with whom I come in contact have no time to think new thoughts, complete projects on time, return phone calls, or answer email. Quartz offers “We’re Not Actually That Busy, But We’re Great at Pretending We Are.”

The factors causing the razzle dazzled view of an organization’s data and busy-ness are similar. The inability to look at information or tasks from an informed vantage point creates uncertainty. The easiest way to escape criticism for a strategic failure is to embrace a crazy generalization and protest too much about work that must be completed.

Hence, there is a boom in time management and automatic scheduling. I hear, “My calendar is full.” No kidding. That tells me the person has abdicated responsibility.

The statement that “we have too much data” underscores the individual’s inability to think about information in a way that is helpful. The consequence is the mad dash to software that does the thinking for a professional. There are visualization tools. These make it easy to see what the data allegedly say.

Baloney.

Both the craziness about Big Data and the too much to do approach to work are cover ups.

The issue is rooted deep within many individuals who are unable to cope with the mundane activities of life in the 21st century. The fix is within individuals. Stated another way, there is no fix when there is little or no incentive or desire to take responsibility for work.

I asked a Kentucky Fried Chicken store clerk, “Why do I have to wait for the biscuits to be cooked before I can have two pieces of chicken for my beloved boxers?” The boxers don’t eat biscuits on my watch.

The answer, “That’s what I was told.” Judgment and the instinct to use common sense are absent in the executive suite just as they are at retail fast food outlets.

No sale. Many professionals want a short cut and no responsibility. That’s a mushy foundation for a digital work ethic. Analytics will miss this important nuance when it processes declining revenues.

Stephen E Arnold

Yahoo: Jelly for the Peanut Butter Memo

September 14, 2015

In 2006, I learned that a Yahooligan wrote what is findable in Google as the “peanut butter manifesto.” The alleged author of the peanut butter analysis left Purpleville but thoughtfully updated his write up in 2013. The points which stick to the roof of my mind were: [a] Yahoo was doing too much with too few resources and [b] Yahooligans leaked information outside of Purpleville. Interesting to some, but the Yahoo is not germane to what I do unless the company makes wild and crazy assertions about its excellence in search, its semantic research, and the other topics I keep in the room with my favorite hobby horse.

I read “Straight Outta Sunnyvale: Yahoo Manager Gone after Racially Charged E-Mail.” It seems that another Yahooligan wrote an internal document and revealed truths about the Purple monster. I am one of those individuals who is easily confused. I assumed that the hipsters at Yahoo were in step with the trends.

I noted this passage in the “Straight Outta Sunnyvale” article:

Meghna Virick, a professor of management at San Jose State University, said Mr. Shen’s [former Yahooligan and alleged Straight Outta memo author] prompt departure from Yahoo was “harsh” and a missed opportunity to have a broader discussion at the company about what is permissible. “Yes, it’s embarrassing, and yes, it’s humiliating, but it’s sometimes good to let this stuff surface,” Prof. Virick said. “It’s important to have discussions about it, to treat this as an opportunity to talk about it with the rest of the Yahoo community. Because if [Mr. Shen] felt comfortable documenting it by e-mail, there’s a likelihood that there could be a culture of disrespect.”

Disrespect? Interesting.

Yahoo may not be able to generate robust organic growth, but its staff can crank out the internal documents which contribute to my appreciation of the Sillycon Valley business environment. I also like the meme power of their memos. Peanut butter and straight outta Sunnyvale. Very clever writing in my view.

Asterisks can be powerful. Ah, dear, old Yahoo. “So don’t be a punk.” I am not sure what that means but the phrase speaks to some at Yahoo. I wonder if the injunction will improve the company’s information access technologies. Dog food?

Stephen E Arnold, September 14, 2015

Free InetSoft Data Tools for AWS Users

September 14, 2015

Users of AWS now have access to dashboard and analytics tools from data intelligence firm InetSoft, we learn from “InetSoft’s Style Scope Agile Edition Launched on Amazon Web Services for No Extra Cost Cloud-based Dashboards and Analytics” at PRWeb. The press release announces:

“Installable directly from the marketplace into an organization’s Amazon environment, the application can connect to Amazon RDS, Redshift, MySQL, and other data sources. Its primary limitation is a limit of two simultaneous users. In terms of functionality, the enterprise administration layer with granular security controls is omitted. The application gives fast access to powerful KPI reporting and multi-dimensional analysis, enabling the private sharing of dashboards and visualizations ideally suited for individual analysts, data scientists, and small teams in any departmental function. It also provides a self-service way of evaluating much of the same technology available in InetSoft’s commercial offerings, applications suitable for enterprise-wide deployment or embedding into other cloud-based solutions.”

So now AWS users can pick up free tools with this Style Scope Agile Edition, and InetSoft may pick up customers for its commercial version of Style Scope. The company emphasizes that its product does not require users to re-architect data warehouses, and that its data access layer, based on MapReduce principles, boosts performance. Founded in 1996, InetSoft is based in New Jersey.

Cynthia Murrell, September 14, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Computers Learn Discrimination from Their Programmers

September 14, 2015

One of the greatest lessons one can learn from the Broadway classic South Pacific is that children aren’t born racist; rather, they learn about racism from their parents and other adults. Computers are supposed to be infallible, objective machines, but according to Gizmodo’s article, “Computer Programs Can Be As Biased As Humans.” In this case, computers are “children” and they observe discriminatory behavior from their programmers.

As an example, the article explains how companies use job application software to sift through prospective employees’ resumes. Algorithms search for keywords related to experience and skills, with the goal of being unbiased with respect to sex and ethnicity. The algorithms could also be used to sift out resumes that contain certain phrases and other information.
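To make the keyword screening idea concrete, here is a minimal sketch in Python. It is an illustration only, not the software the Gizmodo article describes: the function names, the single-word keyword matching, and the `min_hits` threshold are all assumptions made for this example.

```python
import re


def keyword_hits(resume_text, keywords):
    """Return the subset of single-word keywords found in the resume text."""
    # Lowercase the text and split it into simple word tokens.
    words = set(re.findall(r"[a-z0-9+#]+", resume_text.lower()))
    return {kw for kw in keywords if kw.lower() in words}


def screen(resumes, required, min_hits=2):
    """Return the names whose resumes contain at least min_hits keywords.

    resumes is a dict mapping a (hypothetical) applicant name to resume text.
    """
    passed = []
    for name, text in resumes.items():
        if len(keyword_hits(text, required)) >= min_hits:
            passed.append(name)
    return passed


if __name__ == "__main__":
    sample = {
        "Applicant A": "Python and SQL developer with reporting experience",
        "Applicant B": "Retail sales associate",
    }
    print(screen(sample, {"python", "sql"}))
```

A screen like this only sees the words on the page, which is exactly why the choice of keywords can carry a bias the programmer never intended.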

“Recently, there’s been discussion of whether these selection algorithms might be learning how to be biased. Many of the programs used to screen job applications are what computer scientists call machine-learning algorithms, which are good at detecting and learning patterns of behavior. Amazon uses machine-learning algorithms to learn your shopping habits and recommend products; Netflix uses them, too.”

The machine learning algorithms are mimicking the discriminatory habits of humans. To catch these computer generated biases, other machine learning algorithms are being implemented to keep the first set in check. Another option is to reload the data in a different manner so the algorithms do not fall into the old habits. From a practical standpoint it makes sense: if something does not work the first few times, change the way it is done.

Whitney Grace, September 14, 2015

Search and Find Love but Maybe Not

September 13, 2015

My trusty alert service delivered me this search gem: “13 Apps 17 Dates 30 Days: I Tried 13 Dating Apps in 30 Days in Search of Love.” Shadows of Ashley Madison could not obscure this clickable topic.

Here’s what I learned:

  • There are sites with interesting names with which I was not familiar. Here are two examples: Jack’D and Scruff.
  • Dating apps can be used to deliver ads. Love for someone I assume.
  • Finding love takes “time and energy.” Yep, just like my notions for information access.
  • Some love search apps ask users to involve their Twitter followers. Now that’s a great idea for some folks.

Main point of the write up: Love apps don’t work. Wow, revelation.

Stephen E Arnold, September 13, 2015

Social Consensus: Control Becomes a Big Thing

September 13, 2015

I read “The Cable Industry Faces the Perfect Storm: Apps, App Stores, and Apple.” I think the idea is a valid one. I am not sure about the Apple thing.

Let’s go to the Web page. (Shades of Warner Wolf.)

The write up states:

the average US consumer is spending 198 minutes per day inside apps compared to 168 minutes on TV. Please note that the 198 minutes per day spent inside apps on smart phones and tablets don’t include time spent in the mobile browser. In fact, if we add that time, the total time spent on mobile devices by the average US consumer is now 220 minutes (or 3 hours and 40 minutes) per day…
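The arithmetic in the passage can be checked with a few lines of Python. This is a quick sketch that uses only the three figures quoted above; the variable names are mine, and the browser figure is derived by subtraction rather than stated in the source.

```python
def to_hours_minutes(minutes):
    """Convert a minute count into an (hours, minutes) pair."""
    return divmod(minutes, 60)


APP_MINUTES = 198      # time inside apps, per the quoted write up
TV_MINUTES = 168       # time watching TV, per the quoted write up
MOBILE_TOTAL = 220     # apps plus mobile browser, per the quoted write up

# Derived values: not stated directly in the source.
browser_minutes = MOBILE_TOTAL - APP_MINUTES
app_lead_over_tv = APP_MINUTES - TV_MINUTES

if __name__ == "__main__":
    hours, minutes = to_hours_minutes(MOBILE_TOTAL)
    print(f"Mobile total: {hours} h {minutes} min")
    print(f"Implied mobile browser time: {browser_minutes} min")
    print(f"Apps lead TV by: {app_lead_over_tv} min")
```

The quoted total of 3 hours and 40 minutes holds up, and the implied mobile browser time is 22 minutes per day.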

In the good old days, people were supposed to be watching the fire burn in their caves. Then folks listened to Jack Benny on Sunday night. When I was a wee lad, we had a black and white television which sort of worked. My progeny had color TV to watch. Today lots of people look at tiny screens, checking Facebook or looking for pizza via Google or Alphabet or whatever the company is.

Bad news for cable companies it seems.

Forget the cable folks. My view is that the bad news is what I call the consensus problem. Shared experiences are blockbusters in the James Twitchell sense of the word in Adcult USA.

Cohesiveness comes from the Super Bowl and similar constructs. The implications of this tiny screen shift are significant. Losers will be the organizations constructed to serve the mass markets of mass media.

Apple, bless its innovative heart, makes gizmos. The powerhouses are the outfits which deliver micro-content and micro-experiences to the Ore-Idas walking around or sitting in coffee shops with their mobile devices.

Search and retrieval? A loser. Sustained concentration? A loser. Consensus? Interesting about that.

Stephen E Arnold, September 13, 2015

GIGO: A Reminder That Your Statistics 101 Class Was Important

September 12, 2015

You are familiar with GIGO, aren’t you? This is an old school acronym which is allegedly shorthand for “garbage in, garbage out.” Do not use this acronym with a civil engineer with a minor in wastewater treatment. Make a joke about a nuclear engineer.

Plus, gentle reader, I assume you remember your Statistics 101 class. The yap about sample size, data quality, various validity checks, and other assorted disturbances of an otherwise normal academic journey.

Now navigate to “The Most Important Thing to Know About Big Data: It’s Not About the Tools.” The write up in a rather pleasant way reminds me that software tools are less important than dealing with current information, accurate data, and complete, normalized data sets.

I know the “data lake” crowd dismisses these issues as trivial, irrelevant, or old fashioned (just like me).

I learned:

So handling big data isn’t really at all about the tools but, instead, it’s about using them as a part of the process to arrive at the right decisions to meet the organization’s needs. Anyone who doubts that would do well to bear in mind the 10/90 rule put forward nearly 10 years ago by Google co-founder and digital evangelist Avinash Kaushik. He suggested that for every $10 invested in data analytics tools a business should invest $90 in people to actually extract value from the data.

I am not sure most of the Big Data hypesters agree.
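For the arithmetic minded, the 10/90 rule quoted above reduces to a one-line split. The sketch below is an illustration under my own assumptions: the function name and the sample budget are hypothetical, and the 10 percent share is the only figure taken from the quoted passage.

```python
def split_budget(total, tool_share=0.10):
    """Split an analytics budget into (tools, people) per the 10/90 rule.

    tool_share is the fraction spent on tools; the remainder goes to people.
    """
    tools = round(total * tool_share, 2)
    people = round(total - tools, 2)
    return tools, people


if __name__ == "__main__":
    # Hypothetical budget of $250,000, chosen only for illustration.
    tools, people = split_budget(250_000)
    print(f"Tools: ${tools:,.2f}  People: ${people:,.2f}")
```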

Stephen E Arnold, September 12, 2015

New York Times and the Guardian: Best Buds of the Alphabet Google

September 12, 2015

I read “The New York Times and The Guardian Sign On for Google and Twitter’s Instant Article Push.” The main idea is that dead tree outfits are getting with the real time content thing.

Now don’t think about PointCast and the Rupert Murdoch dalliance. Don’t think about push and pushlets. Don’t think about BackWeb and Active Channel.

Think about instant articles. These enable a mobile user “to quickly pull up stories,” presumably with split infinitives.

I am not a fan of the notification approach to information access. My approach is to use a variety of Stone Age tools to locate information. I am one of the folks in Kentucky who thinks Homo naledi is a cutting edge group.

My hunch is that the traditional publishers see the Google thing and the Facebook thing as a new source of revenue.

My hunch is that these hip, online outfits want to extend their control over certain types of content.

Control may be more important than money to the Alphabet outfit. Facebook is the new CompuServe and emulates its walled garden approaches.

How will this end? How does one spell “Google”?

Stephen E Arnold, September 12, 2015

Lexmark Chases New Revenue: Printers to DTM

September 11, 2015

I know what a printer is. The machine accepts instructions and, if the paper does not jam, outputs something I can read. Magic.

I find it interesting to contemplate my printers and visualize them as an enterprise content management system. In the late 1990s, my team and I worked on a project involving a Xerox DocuTech scanner and printer. The idea was that the scanner would convert a paper document to an image with many digital features. Great idea, but the scanner gizmo was not talking to the printer thing. We got them working and shipped the software, the machines, and an invoice to the client. Happy day. We were paid.

The gap between that vision from a Xerox unit and the reality of the hardware was significant. But many companies have stepped forward to convert knowledge resident systems relying on experienced middle managers to hollowed out outfits trying to rely on software. My recollection is that Fulcrum Technologies nosed into this thorn bush with DOCSFulcrum a decade before the DocuTech was delivered by a big truck to my office. And, not to forget our friends to the East, the French have had a commitment to this approach to information access. Today, one can tap Polyspot or Sinequa for business process centric methods.

The question is, “Which of these outfits is making enough money to beat the dozens of outfits running with the other bulls in digital content processing land?” (My bet is on the completely different animals described in my new study CyberOSINT: Next Generation Information Access.)

Years later I spoke with an outfit called Brainware. The company was a reinvention of an earlier firm, which I think was called SER or something like that. Brainware’s idea was that its system could process text which had been scanned or was in a common file format. The index allowed a user to locate text matching a query. Instead of looking for words, Brainware’s system used trigrams (sequences of three letters) to locate similar content.

Similar to the Xerox idea. The idea is not a new one.
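For readers who have not bumped into trigram matching, here is a minimal sketch of the general idea in Python. This is not Brainware’s method, which I have not seen; the Jaccard overlap score, the function names, and the `threshold` value are all assumptions chosen to illustrate how three-letter sequences can locate similar text even when the words do not match exactly.

```python
def trigrams(text):
    """Return the set of overlapping three-character sequences in text."""
    # Keep letters, digits, and spaces; lowercase; collapse runs of whitespace.
    cleaned = "".join(ch.lower() for ch in text if ch.isalnum() or ch.isspace())
    cleaned = " ".join(cleaned.split())
    return {cleaned[i:i + 3] for i in range(len(cleaned) - 2)}


def similarity(a, b):
    """Jaccard overlap of two trigram sets, from 0.0 (disjoint) to 1.0 (same)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def search(query, documents, threshold=0.3):
    """Return documents scoring at or above threshold, best match first."""
    scored = [(similarity(query, doc), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored if score >= threshold]


if __name__ == "__main__":
    # "lnvoice" imitates a scanning error: a lowercase L read for an I.
    docs = ["lnvoice", "printer", "invoices"]
    print(search("invoice", docs))
```

The appeal for scanned documents is that a single misread character spoils only the trigrams that touch it, so “lnvoice” still scores well against “invoice” while a word-for-word match would fail.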

I read two write ups about Lexmark, which used to be part of IBM. Lexmark is just down the dirt road from me in Lexington, Kentucky. Its financial health is a matter of interest for some folks in these here parts.

The first write up was “How Lexmark Evolved into an Enterprise Content Management Contender.” The main idea pivots on my knowing what content management is. I am not sure what this buzzword embraces. I do know that organizations have minimal ability to manage the digital information produced by employees and contractors. I also know that most organizations struggle with what their employees do with social media. Toss in the penchant units of a company have for creating information silos, and most companies look for silver bullets which may solve a specific problem in the firm’s legal department but leave many other content issues flapping in the wind.

According to the write up:

Lexmark is "moving from being a hardware provider to a broader provider of higher-value solutions, which are hardware, software and services," Rooke [a Lexmark senior manager] said.

Easy to say. The firm’s financial reports suggest that Lexmark faces some challenges. Google’s financial chart for the outfit displays declining revenues and profits:

[Image: Google’s financial chart for Lexmark]

The Brainware, ISYS Search Software, and Kofax units have not been able to provide the revenue boost I expected Lexmark to report. HP and IBM, which have somewhat similar strategies for their content processing units, have also struggled. My thought is that it may be more difficult for companies which once were good at manufacturing fungible devices to generate massive streams of new revenue from fuzzy stuff like software.

The write up does not have a hint of the urgency and difficulty of the Lexmark task. I learned from the article:

Lexmark is its own "first customer" to ensure that its technologies actually deliver on the capabilities and efficiency gains promoted by the company, Moody [Lexmark senior manager] said. To date, the company has been able to digitize and automate incoming data by at least 90 percent, contributing to cost reductions of 25 percent and a savings of $100 million, he reported. Cost savings aside, Lexmark wants to help CIOs better and more efficiently incorporate unstructured data from emails, scanned documents and a variety of other sources into their business processes.

The sentiment is one I encountered years ago. My recollection is that the precursor of Convera explained this approach to me in the 1980s when the angle was presented as Excalibur Technologies.

The words today are as fresh as they were decades ago. The challenge, in my opinion, remains.

I also read “How to Build an Effective Digital Transaction Management Platform.” This article is also from eWeek, the outfit which published the “How Lexmark Evolved” piece.

What does this listicle state about Lexmark?

I learned that I need a digital transaction management system. A what? A DTM looks like workflow and information processing. I get it. Digital printing. Instead of paper, a DTM allows a worker to create a Word file or an email. Ah, revolutionary. Then a DTM automates the workflow. I think this is a great idea, but I seem to recall that many companies offer these services. Then I need to integrate my information. There goes the silo even if regulatory or contractual requirements suggest otherwise. Then I can slice and dice documents. My recollection is that firms have been automating document production for a while. Then I can use esignatures which are trustworthy. Okay. Trustworthy. Then I can do customer interaction “anytime, anywhere.” I suppose this is good when one relies on innovative ways to deal with customer questions about printer drivers. And I can integrate with “enterprise content management.” Oh, oh. I thought enterprise content management was sort of a persistent, intractable problem. Well, not if I include “process intelligence and visibility.” Er, what about those confidential documents relative to a legal dispute?

The temporal coincidence of a fluffy Lexmark write up and the listicle suggests several things to me:

  1. Lexmark is doing the content marketing that public relations and advertising professionals enjoy selling. I assume that my write up, which you are reading, will be an indication of the effectiveness of this one-two punch.
  2. The financial reports warrant some positive action. I think that closing significant deals and differentiating the Lexmark services from those of OpenText and dozens of other firms would have been higher on the priority list.
  3. Lexmark has made a strategic decision to use the rocket fuel of two ageing Atlas systems (Brainware and ISYS) and one Saturn system (Kofax’s Kapow) to generate billions in new revenue. I am not confident that these systems can get the payload into orbit.

Net net: Lexmark is following a logic path already stomped on by Hewlett Packard and IBM, among others. In today’s economic environment, how many federating, digital business process, content management systems can thrive?

My hunch is that the Lexmark approach may generate revenue. Will that revenue be sufficient to compensate for the decline in printer and ink revenues?

What are Lexmark’s options? Based on these two eWeek write ups, it seems as if marketing is the short term best bet. I am not sure I need another buzzword for well worn concepts. But, hey, I live in rural Kentucky and know zero about the big city views crafted down the road in Lexington, Kentucky.

Stephen E Arnold, September 11, 2015

Big Data: The McKinsey Way

September 11, 2015

I read “6 Observations from a New Survey on the State of Big Data Analytics.” The data come from a study underwritten by a magazine outfit, a blue chip consulting firm, and a company selling storage and related bright and shiny things.

I found the write up suggestive. The first finding was a bomb shell.

The hype gone, big data is alive and doing well.

Aside from the subject-verb error (“data is” when “data” is the plural of “datum”), the information is revolutionary. Big Data is no longer subject to hyperbole. I did not know that. Topsy.com tallied 3,154 tweets about Big Data in the 24 hours of September 8, 2015. For comparison, Big Data is in a dead heat with the tweets about the Bentley Bentayga SUV. Good company. FYI: Katy Perry managed only 1,468 tweets in the same time period. Nevertheless, in Harrod’s Creek, Big Data, expensive autos, and a musical 30 year old are buzz machines.

The write up reports:

No matter how many times you say “data-driven,” decisions are still not based on data. Sounds familiar? 51% of executives said that adapting and refining a data-driven strategy is the single biggest cultural barrier and 47% reported putting big data learning into action as an operational challenge.

Yikes. More consulting is needed to get this cultural change thing underway.

Other findings that underpin the article are:

  • If the CEO is into Big Data, the company is into Big Data…mostly. If the CEO is like the airline executives in the news, the CEO may have other interests.
  • I love this: “Even if you have top leadership sponsorship and the right culture, getting data to drive action and strategy is a challenge.  48% of executives surveyed regard making fact-based business decisions based on data as a key strategic challenge, and 43% cite developing a corporate strategy as a significant hurdle.” Maybe Big Data is not the slam dunk consultants and journalists wish it to be?
  • Brontobyte data. Hey, we have perfectly useful words to suggest unimaginably large quantities. I like yottabyte. The study sponsors seem to be okay with the brontobyte coinage. Very hip, but I would have created a variant of Diplodocus. More colorful for sure.
  • There is a shortage of “big data miners.” Okay, I understand. The user friendly analytics tools are just not too helpful unless a company has someone who actually paid attention in statistics classes.

The only thing missing from this write up is links to the sponsors’ product pages. By the way, the article pumps up Big Data. Amusing stuff.

Stephen E Arnold, September 11, 2015
