
Bye-Bye Enterprise Storage

October 19, 2015

Storage is a main component of the enterprise system.  Silos store data and eventually the entire structure transforms into a legacy system, but BusinessWire says in “MapR Extends Support For SAS To Deliver Big Data Storage Independence” it is time to say good-bye to old enterprise storage.  MapR is trying to make enterprise storage obsolete with its new extended service support for SAS, a provider of business software and services.  The new partnership between the two companies allows advanced analytics with easy data preparation and integration in legacy systems, improved security, data compliance, and assurance of service level agreements.

The entire goal is to give SAS and MapR clients better flexibility for advanced analytics within Hadoop as well as to help customers get the most use out of their data.

Here is a rundown of the partnership between SAS and MapR:

“The collaboration makes available the full scope of technologies in the SAS portfolio, including SAS® LASR™ Analytic Server, SAS Visual Analytics, SAS High-Performance Analytics, and SAS Data Loader for Hadoop. Complete MapR integration delivers security and full POSIX compliance for use in “share everything architectures,” as well as enables SAS Visual Analytics to easily and securely access all data. With SAS Data Loader for Hadoop, users can prepare, cleanse and integrate data inside MapR for improved performance and then load that data in-memory into SAS LASR for visualization or analysis, all without writing code.”

Breaking away from legacy systems with old onsite storage is one of the new trends for enterprise systems.  Legacy systems are clunky, don’t necessarily comply with new technology, and have slow information retrieval.  A new enterprise system using SAS and MapR’s software will last for some time, until the next trend buzzes through town.

Whitney Grace, October 19, 2015

Sponsored by, publisher of the CyberOSINT monograph

US General Services Administration: Changes Ahead

October 17, 2015

I read “David Shive: GSA Ramps Up IT Consolidation through Acquisition Process Updates, Analytics Adoption.” Then I read “Mary Davie: GSA Updates Federal Acquisition Gateway Platform.” Ah, memories of FAR. You are familiar with the rich, informative compendium known as the Federal Acquisition Regulation.

The write ups indicate that changes are afoot at the GSA, which is busy inventing point-and-click Web site services and cloud computing.

According to the Shive write up:

“We are looking at our data management strategies so we can effectively coalesce that data, and putting good predictive analytics on top of that so that we can make good decisions about things that are happening, and predicting things that are going to happen and drive down costs for things like maintenance of infrastructure,” Shive told the station [part of ExecutiveGov maybe?] in an interview.

Efficiency in government is welcomed by those who have the opportunity to interact with the professionals at their stations. One innovation is interesting:

GSA is also working to implement a statement of work library for multiple procurement categories and the a click-and-pay service on the site.

No word about the search system, and not much information about who pays whom and for what.

Stephen E Arnold, October 17, 2015

The Tweet Gross Domestic Product Tool

October 16, 2015

Twitter can be used to figure out your personal income.  Twitter was not designed to be a tool to tally a person’s financial wealth; instead, it is a communication tool built on one-hundred-forty-character messages for short, concise delivery.  Twitter can be used to chat with friends, stars, business executives, etc., follow news trends, and even advertise products to a tailored audience.  According to Red Orbit in the article “People Can Guess Your Income Based On Your Tweets,” Twitter has another application.

Other research done on Twitter has revealed your age, location, political preferences, and disposition to insomnia, but your tweet history also reveals your income.  Apparently, if you tweet less, you make more money.  The write up discusses the controls and variables for the experiment: 5,191 Twitter accounts with over ten million tweets were analyzed, and only accounts with a user’s identifiable profession were used.

Users with a high follower and following ratio had the most income and they tended to post the least.  Posting throughout the day and cursing indicated a user with a lower income.  The content of tweets also displayed a plethora of “wealth” information:

“It isn’t just the topics of your tweets that’s giving you away either. Researchers found that “users with higher income post less emotional (positive and negative) but more neutral content, exhibiting more anger and fear, but less surprise, sadness and disgust.” It was also apparent that those who swore more frequently in their tweets had lower income.”

Twitter uses the information to tailor ads for users: those who share neutral posts get targeted ads for expensive items, while the cursers get less expensive ad campaigns.  The study also suggests that it is important to monitor your Twitter profile, so you are posting the best side of yourself rather than shooting yourself in the foot.

Whitney Grace, October 16, 2015


Whitepaper: Plan for Holiday Sales Now

October 16, 2015

Marketing pros and retailers take note: semantic tech firm ntent offers a free whitepaper to help you make the most of the upcoming holiday season, titled “Step-By-Step Guide to Holiday Campaign Planning.” All they want in return are your web address, contact info, and the chance to offer you a subscription to their newsletter, blog, and updates. (That checkbox is kindly deselected by default.) The whitepaper’s description states:

“Halloween candy and costumes are already overflowing on retail stores shelves. You know what that means, don’t you? It’s time for savvy marketers to get serious about their online retail planning for the impending holidays, if they haven’t already started. Why is it so important to take the time to coordinate a solid holiday campaign? Because according to the National Retail Federation [PDF] the holiday season can account for more than 20–40% of a retailer’s annual sales. And if that alone isn’t enough to motivate you, Internet Retailer reported that online retail sales this year are predicted to reach $349.06 billion a 14.2% YoY increase—start planning now to get your piece of the pie! Position your business for online success, more sales and more joy as you head into 2016 using these easy-to-follow, actionable tips!”

The paper includes descriptions of tactics and best practices, as well as a monthly to-do list and a planning worksheet. Founded in 2010, ntent leverages their unique semantic search technology to help clients quickly find the information they need. The company currently has several positions open at their Carlsbad, California, office.

Cynthia Murrell, October 16, 2015


Technology Fear News Flash: Search Not in the Top 10

October 15, 2015

I read one of those out-of-the-blue research study summaries. The information appears in Network World, a corporate family member of my favorite mid tier consulting firm IDC. The write up is titled with a zippy angle: Fear; to wit, “Technology Scares the Hell Out of People, University Survey Finds.”

I found the article a fiesta of take-it-to-the-bank information.

The snappy graphic caught my eye. Each of the Top 10 fears warrants a cartoon treatment. Here’s an example for running out of money in the future, Fear Number Nine.


Source: Network World which used a cartoon from Chapman University. Academia and cartoons. Interesting.

I like the human carrying a weight (at first glance it looks like a debt bomb) up the pile of what appears to be unsold back issues of the print version of IDC reports. Adult Swim may do a feature based on this fear. That will be a winner.

On to another gem from the article. I highlighted this passage in the write up:

Technology-related concerns account for 3 of the top 5 biggest fears among Americans surveyed recently by Chapman University of Orange, Calif. — and a couple of the other concerns on the top 10 list could be considered tech-related worries as well.

And the tech fears are:

  • Cyber terrorism
  • Corporate tracking of personal information
  • Government tracking of personal information

The write up adds:

Numbers 7 (Identity theft) and #10 (Credit card fraud) could also be classified as tech-related worries.

Quite a payload of fear. The write up does not include any details about the sample size, the methodology, or the folks doing the work, who could be undergraduates or adjuncts for all I know.

Stepping back, let’s think about technology and analytics. On the surface, those in the sample are not exactly comfortable with what I call the Silicon Valley way. Thinking more deeply, the fears suggest that trust is not part of the warp and woof of the lives of the lucky folks in the sample.

My hunch is that if we polled some government officials, big time technology company CEOs, a couple of hundred top one percenters, and 20 somethings looking for a job in Palo Alto, the results might be different. I look forward to a report from IDC on this topic. I hope the author is my favorite IDC expert Dave Schubmehl. He is not afraid of technology based on my experience.

Stephen E Arnold, October 15, 2015

Can Online Systems Discern Truth and Beauty or All That One Needs to Know?

October 14, 2015

Last week I fielded a question about online systems’ ability to discern loaded or untruthful statements in a plain text document. I responded that software is not yet very good at figuring out whether a specific statement is accurate, factual, right, or correct. Google pokes at the problem in a number of ways; for example, assigning a credibility score to a known person. The higher the score, the more likely the person is to be “correct.” I am simplifying, but you get the idea: Recycling a variant of PageRank and the CLEVER method associated with Jon Kleinberg.

There are other approaches as well, and some of them (dare I suggest, most of them) use word lists. The idea is pretty simple. Create a list of words which have positive or negative connotations. To get fancy, you can work a variation on the brute force Ask Jeeves method; that is, cook up answers or statements of fact “known” to be spot on. The idea is to match the input text with the information in these word lists. If you want to get fancy, call these lists and compilations “knowledgebases.” I prefer lists. Humans have to help create the lists. Humans have to maintain the lists. Get the lists wrong, and the scoring system will be off base.

There is quite a bit of academic chatter about ways to make software smart. A recent example is “Sentiment Diffusion of Public Opinions about Hot Events: Based on Complex Network.” In the conclusion to the paper, which includes lots of fancy math, I noticed that the researchers identified the foundation of their approach:

This paper studied the sentiment diffusion of online public opinions about hot events. We adopted the dictionary-based sentiment analysis approach to obtain the sentiment orientation of posts. Based on HowNet and semantic similarity, we calculated each post’s sentiment value and classified those posts into five types of sentiment orientations.

There you go. Word lists.
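To make the word list idea concrete, here is a minimal Python sketch of dictionary-based scoring. It is not the researchers’ method: the tiny lexicon, the weights, and the five bucket cutoffs are all invented for illustration, and HowNet and semantic similarity are not used at all.

```python
import re

# Hypothetical toy lexicon. The words and weights are made up for this
# sketch; a real system would use a curated resource maintained by humans.
LEXICON = {
    "great": 2, "good": 1, "helpful": 1,
    "bad": -1, "slow": -1, "awful": -2, "fraud": -2,
}


def sentiment_score(text):
    """Sum the lexicon weights of every word found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


def orientation(score):
    """Map a numeric score onto five buckets (the cutoffs are arbitrary)."""
    if score >= 2:
        return "strongly positive"
    if score == 1:
        return "positive"
    if score == 0:
        return "neutral"
    if score == -1:
        return "negative"
    return "strongly negative"


def classify(text):
    """Score a post against the word list and name its orientation."""
    return orientation(sentiment_score(text))
```

Note that a post reading “good but slow” nets out to neutral, and a post using words missing from the list scores zero no matter how angry it is. Get the list wrong, and the buckets are wrong.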

My point is that it is pretty easy to spot a hostile customer support letter. Just write a script that looks for words appearing on the “nasty list”; for example, consumer protection violation, fraud, sue, etc. There are other signals as well; for example, capital letters, exclamation points, underlined words, etc.
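A script of that sort might look like the following Python sketch. The three nasty list entries come from the examples above; the function names, the scoring rule, and the threshold are my own assumptions. Underlining does not survive in plain text, so that signal is left out.

```python
import re

# Entries taken from the examples in the text; a real list would be longer
# and would need a human to maintain it.
NASTY_LIST = ["consumer protection violation", "fraud", "sue"]


def hostility_signals(letter):
    """Collect the simple signals: nasty terms, shouting, exclamation points."""
    lowered = letter.lower()
    # Whole-word matching so that "issue" does not trigger "sue".
    nasty_hits = [
        term for term in NASTY_LIST
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    ]
    words = re.findall(r"[A-Za-z]+", letter)
    # Count words of three or more letters written entirely in capitals.
    shouted = [w for w in words if len(w) >= 3 and w.isupper()]
    return {
        "nasty_terms": nasty_hits,
        "shouted_words": len(shouted),
        "exclamation_points": letter.count("!"),
    }


def looks_hostile(letter, threshold=2):
    """Flag a letter when enough signals fire (the threshold is arbitrary)."""
    signals = hostility_signals(letter)
    score = len(signals["nasty_terms"])
    score += 1 if signals["shouted_words"] > 0 else 0
    score += 1 if signals["exclamation_points"] > 1 else 0
    return score >= threshold
```

Calling looks_hostile("This is FRAUD! I will sue you!!") returns True, while a polite note about a resolved issue does not trip the list. A calm, well-written letter full of falsehoods sails through, which is the limitation in a nutshell.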

The point is that distorted, shaped, weaponized, and just plain bonkers information can be generated. This information can be gussied up in a news release, posted on a Facebook page, or sent out via Twitter before the outfit reinvents itself.

The researcher, the “real” journalist, or the hapless seventh grader writing a report will be none the wiser unless big time research is embraced. For now, what can be indexed is presented as if the information were spot on.

How do you feel about that? That’s a sentiment question, gentle reader.

Stephen E Arnold, October 14, 2015

Watson Weekly: A Connectors Festival

October 14, 2015

I read “IBM Adds to Watson Analytics with Expert Storybooks, Connectors to Oracle, Salesforce, Microsoft Azure, AWS.” Watson is no longer a game show winning, recipe making, and cancer curing search system. Watson does analytics, which means that various IBM acquisitions’ technology can be used to count, calculate, and predict. Well, Watson is not search anymore, gentle reader. Watson is the brand, the new IBM, the revolution in cognitive computing for which you and I have been longing.

The write up reports:

IBM today is announcing new ways for business users to easily explore and visualize company data with its Watson Analytics cloud-based big data analytics tool.

The cloud. Big data. Yes.

I learned:

IBM put together the Expert Storybooks — for working with data on sports, weather, marketing, social media, and finance — in partnership with AriBall, The Weather Co., OgilvyOne, Twitter, American Marketing Association, Nucleus Research, MarketShare, and Intangent.

Expert story books. Yes. Yes.

And what makes this solution hum is revealed in this way:

The connectors make it possible to hook up with the Redshift data warehouse service form the Amazon Web Services (AWS) public cloud, the Microsoft Azure public cloud (IBM’s SoftLayer competes with both Azure and AWS), IBM dashDB (competes with AWS Redshift) MySQL, Oracle, Microsoft SQL Server, Sybase, PostgreSQL, IBM DB2 (which competes with SQL Server, Oracle, and Sybase), Cloudera Impala, Apache Hive, Box (now a prominent IBM partner) Pivotal Greenplum, Salesforce, and Twitter, among others.

Isn’t Greenplum part of EMC Dell? Why not just use the Greenplum tools? Why not use the EMC import tools? Frankly I am not sure what IBM is offering. With some time and scripting ability, it may be possible to do analytics and “story books” with the systems to which IBM is “hooking up.”

My take is that IBM is trying really hard to make Watson into a significant revenue generating machine. Pitching as a benefit 100 million lines of code and telling readers of the New York Times about dozens of APIs are interesting approaches to making sales.

Connecting to competitive services, I must agree, is a master stroke on a par with the Watson business strategy. But IBM faces a long par five with Watson, the cognitive computing beastie.

Stephen E Arnold, October 14, 2015

Meg Whitman, President of HP, Gets Flak for Partial Follow-Through on Ultimatum

October 14, 2015

The article titled “HP Didn’t Actually Fire All the Employees It Threatened to Cut” on Business Insider details the management teachings from Hewlett Packard. To summarize, HP recently delivered an ultimatum to several hundred employees that they had to shift off HP’s payroll and become contract workers for significantly lower pay with HP’s partner Ciber. If they refused, they would be let go. Except that the employees mutinied and complained, resulting in HP negotiating for higher salaries from Ciber as well as holding on to a few employees who refused the deal. The article states,

“On top of that, HP is also shipping most of the jobs in this business unit offshore. Whitman wants 60% of the Enterprise Services division jobs to be in low-cost areas of the world, compared to less than 40% today. Employees in this unit fully expect HP to line up more take-it-or-leave it contract jobs, they tell us, so we’ll see how HP handles the next one if it does materialize.”

This is all in the midst of HP’s massive layoffs of over 80,000 employees, 51,000 of whom have already been let go. Morale must be under the building. The non-negotiable ultimatum strategy did not seem to work, and at any rate it is bad business, especially when the ultimatum is later overturned in a handful of instances.

Chelsea Kerwin, October 14, 2015


Predictive Analytics: Five Ideas for Business

October 13, 2015

A mid tier consulting firm is expressing its reservations about analytics used incorrectly. But the cheerleading for fancy math is tough to ignore. I read “5 Ways Just about Anyone Should Be Leveraging Predictive Analytics.” I quite like the parental “should” too.

What are the ways? Let me count them:

  1. Customer so you can figure out lifetime value
  2. Marketing so you can figure out purchasing intent
  3. Websites and applications so you can perform content optimization
  4. Risk so you can figure out fraud and pricing
  5. Operations so you can do network optimization.

Predictive analytics appears to be applicable to many different corporate tasks. The write up omits just one minor point: How.

Why should those with an interest in marketing get involved in the type of detail required to make a predictive system useful? Next up? An international conference on how to make predictive analytics really easy.

Stephen E Arnold, October 13, 2015

The State Department Delves into Social Media

October 13, 2015

People and companies that want to increase communication between people create social media platforms.  Facebook was invented to take advantage of the digital real-time environment to keep people in contact and form a web of contacts.  Twitter was founded for a quicker, more instantaneous form of communication based on short one-hundred-forty-character blurbs.  Instagram shares pictures and Pinterest connects ideas via pictures and related topics.  Using analytics, the social media companies and other organizations collect data on users and use that information to sell products and services as well as to understand the types of users on each platform.

Social media contains a variety of data that can benefit not only private companies, but government agencies as well.  According to the GCN article “State Starts Development On Social Media And Analytics Platform,” the State Department is developing a platform to collaborate and contribute in real time and to schedule and publish across many social media platforms, and it will also be mobile-enabled.  The platform will also be used to track analytics on social media:

“For analytics, the system will analyze sentiment, track trending social media topics, aggregate location and demographic information, rank of top multimedia content, identify influencers on social media and produce automated and customizable reports.”

The platform will support twenty users and track thirty million mentions each year.  The purpose behind the social media and analytics platform is still vague, but the federal government has proven to be behind in understanding and developing modern technology.  This appears to be a step forward for the government to upgrade itself, so it does not get left behind.  But a social media platform that analyzes data should have been implemented years ago at the start of this big data phenomenon.

Whitney Grace, October 13, 2015

