Gleaning Insights and Advantages from Semantic Tagging for Digital Content

September 22, 2016

The article titled Semantic Tagging Can Improve Digital Content Publishing on the Aptara Corp. blog reveals the importance of indexing. The article waves the flag of semantic tagging at the publishing industry, which has been pushed into digital content kicking and screaming. The difficulties involved in compatibility across networks, operating systems, and devices are quite a headache. Semantic tagging could help, if only anyone understood what it is. The article enlightens us,

Put simply, semantic markups are used in the behind-the-scene operations. However, their importance cannot be understated; proprietary software is required to create the metadata and assign the appropriate tags, which influence the level of quality experienced when delivering, finding and interacting with the content… There have been many articles that have agreed the concept of intelligent content is best summarized by Ann Rockley’s definition, which is “content that’s structurally rich and semantically categorized and therefore automatically discoverable, reusable, reconfigurable and adaptable.

The application to the publishing industry is obvious when put in terms of increasing searchability. Any student who has used JSTOR knows the frustrations of searching digital content. It is a complicated process that indexing, if administered correctly, will make much easier. The article points out that authors are competing not only with each other, but also with the endless stream of content being created on social media platforms like Facebook and Twitter. Publishers need to take advantage of semantic markups and every other resource at their disposal to even the playing field.

Chelsea Kerwin, September 22, 2016
Sponsored by, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link:

HonkinNews for September 20, 2016 Available

September 20, 2016

Stories in the Beyond Search weekly video news program “HonkinNews” include LinkedIn’s censorship of a former CIA professional’s post about the 2016 election. Documentum, founded in 1990, has moved to the frozen wilds of Canada. A Microsoft and Nvidia sponsored online beauty contest may have embraced algorithmic bias. Google can write a customer’s ad automatically and may be able to alter users’ thoughts and actions. Which vendors of intelligence-centric software may be shown the door to the retirement home? The September 20, 2016, edition of “HonkinNews,” filmed with old-fashioned technology in the wilds of rural Kentucky, is online at this link.

Kenny Toth, September 20, 2016

Lousy Backlog? Sell with Interesting Assertions.

September 19, 2016

If you are struggling to fill the sales pipeline, you will feel some pressure. If you really need to make sales, marketing collateral may be an easy weapon to seize.

I read “Examples of False Claims about Self-Service Analytics.” The write up singles out interesting sales assertions and offers them up in a listicle. I loved the write up. I lack the energy to sift through the slices of baloney in my enterprise search files. Therefore, let’s highlight the work of the brave person who singled out eight vendors’ marketing statements as containing what the author called “false claims.” Personally I think each of these claims is probably rock solid when viewed from the point of view of the vendors’ legal advisers.

Here are three examples of false claims about self service analytics. (For the other five, consult the article cited in the preceding paragraph.) Keep in mind that I find these statements as good as the gold for sale in the local grocery in Harrod’s Creek. Come to think of it the gold is foil wrapped around a lousy piece of ersatz chocolate. But gold is gold sort of.

Example 1 from Information Builders. Information Builders loves New York. The company delivers “integrated solutions for data management.” Here’s the item from the article which contains a “false claim.”

Self-service BI and analytics isn’t just about giving tools to analysts; it’s about empowering every user with actionable and relevant information for confident decision-making. (link). Self-service Analytics for Everyone…Who’s Everyone? Your entire universe of employees, customers, and partners. Our WebFOCUS Business Intelligence (BI) and Analytics platform empowers people inside and outside your organization to attain insights and make better decision.

I see a subject verb agreement error. I see a semicolon which puts me on my rhetorical guard. I see the universal “everyone”. I see the fact that WebFOCUS empowers.

What’s not to like? Information Builders is repeating facts which I accept. The fact that the company is in New York enhances the credibility of the statements. Footnotes, evidence? Who needs them?

Example 2 from SAP, the outfit that delivered R3 to Westinghouse and Trex to the enterprise search market. Here’s the “false assertion” which looks as solid as a peer reviewed journal containing data related to drug trials. Remember. This quote comes from the source article. I believe absolutely whatever SAP communicates to me. Don’t you?

This tool is intended for those who need to do analysis but are not Analysts nor wish to become them.

Why study math, statistics, and related disciplines? Why get a degree? I know that I can embrace the SAP way (which is a bit like the IBM way) and crunch numbers until the cows return to my pasture in Harrod’s Creek. Who needs to worry about data integrity, sample size, threshold settings, and algorithmic sequencing? Not me. Gibraltar does not stand on such solid footing as SAP’s tool for those who eschew analysts and do not want to wake up like Kafka’s protagonist as an analyst.

Example 3 from ZoomData, a company which has caught the attention of some folks in the DC area. I love those cobblestones in Reston too.

ZoomData brings the power of self-service BI to the 99%—the non-data geeks of the world who thirst for a simple, intuitive, and collaborative way to visually interact with data to solve business problems.

To me this looks better than the stone tablets Moses hauled down from the mountain. I love the notion of non geeks who thirst for pointing and clicking. I would have expressed the idea as drink deep of data’s Empyrean spring, but I am okay with the split infinitive “to visually interact” because the statement is a fact. I tell you that it is a fact.

For the other five allegedly false assertions, please consult the original article. I have to take a break. When my knowledge is confirmed in these brilliant assertions, I need a moment to congratulate myself on my wisdom. Wait. I am an addled goose. Maybe these examples really are hog wash? Because I live in rural Kentucky, I will step outside and seek inputs from Henrietta, my hog.

Stephen E Arnold, September 19, 2016

OpenText: Content Marketing or Real News?

September 18, 2016

When I knew people at the original Malcolm Forbes Forbes, I learned that stories were meticulously researched and edited. I read “Advanced Analytics: Insights Produce New Wealth.” I was baffled, but, I thought, that’s the point.

The main point of the write up pivots on the assertion that an “insight” converts directly to “wealth.” I am not sure about the difference between old and new wealth. Wealth is wealth in my book.

The write up tells me:

Data is the foundation that allows transformative, digital change to happen.

The logic escapes me. The squirrels in Harrod’s Creek come and go. Change seems to be baked into squirreldom. The focus is “the capitalist tool,” and I accept that the notion of changing one’s business can be difficult. The examples are easy to spot: IBM is trying to change into a Watson powered sustainable revenue machine. HP is trying to change from a conglomeration of disparate parts into a smaller conglomeration of disparate parts. OpenText is trying to change from a roll up of old school search systems into a Big Data wealth creator. Tough jobs indeed.

I learned that visualization is important for business intelligence. News flash. Visualization has been important since a person has been able to scratch animals on a cave’s wall. But again I understand. Predictive analytics from outfits like Spotfire (now part of Tibco) provided a wake up call to some folks.

The write up informs me:

While devices attached to the Internet of Things will continue to throw out growing levels of structured data (which can be stored in files and databases), the amount of unstructured data being produced will also rise. So the next wave of analytics tools will inevitably be geared to dealing with both forms of information seamlessly, while also enabling you to embed the insights gleaned into the applications of your choosing. Now that’s innovation.

Let’s recap. Outfits need data to change. (Squirrels excepted.) Companies have to make sense of their data. The data come in structured and unstructured forms. The future will be software able to handle structured and unstructured data. Charts and graphs help. Once an insight is located, founded, presented by algorithms which may or may not be biased, the “insights” will be easy to put into a PowerPoint.

BAE Systems’ “Detica” was poking around in this insight in the 1990s. There were antecedents, but BAE is a good example for my purpose. Palantir Technologies provided an application demo in 2004 which kicked the embedded analytics notion down the road. A click would display a wonky circular pop up, and the user could perform feats of analytic magic with a mouse click.

Now Forbes’ editors have either discovered something that has been around for decades or been paid to create a “news” article that reports information almost as timely as how Lucy died eons ago.

Back to the premise: Where exactly is the connection between insight and wealth? How does a roll up of unusual search vendors like Information Dimension, BRS, Nstein, Recommind, and my favorite old time Fulcrum Technologies produce evidence for the insight to wealth argument? If I NOT out these search vendors and focus on the Tim Bray SGML search engine, I still don’t see the connection. Delete Dr. Bray’s invention. What do we have? We have a content management company which sells content management as governance, compliance, and other consulting friendly disciplines.

Consultants can indeed amass wealth. But the insight comes not from Big Data. The wealth comes from selling time to organizations unable to separate the giblets from the goose feathers. Do you know the difference? The answer you provide may allow another to create wealth from that situation.

One doesn’t need Big Data to market complex and often interesting software to folks who require a bus load of consultants to make it work. For Forbes, the distinction between giblets and goose feathers may be too difficult to discern.

My hunch is that others, not trained in high end Manhattan journalism, may be able to figure out which one can be consumed and which one can ornament an outfit at an after party following a Fashion Week showing.

Stephen E Arnold, September 18, 2016

In-Q-Tel Wants Less Latency, Fewer Humans, and Smarter Dashboards

September 15, 2016

I read “The CIA Just Invested in a Hot Startup That Makes Sense of Big Data.” I love the “just.” In-Q-Tel investments are not like bumping into a friend in Penn Station. Zoomdata, founded in 2012, has been making calls, raising venture funding (more than $45 million in four rounds from 21 investors), and staffing up to about 100 full time equivalents. With its headquarters in Reston, Virginia, the company is not exactly operating from a log cabin west of Paducah, Kentucky.

The write up explains:

Zoom Data uses something called Data Sharpening technology to deliver visual analytics from real-time or historical data. Instead of a user searching through an Excel file or creating a pivot table, Zoom Data puts what’s important into a custom dashboard so users can see what they need to know immediately.

What Zoomdata does is offer hope to its customers for less human fiddling with data and faster outputs of actionable intelligence. If you recall how IBM i2 and Palantir Gotham work, humans are needed. IBM even snagged Palantir’s jargon of AI for “augmented intelligence.”

In-Q-Tel wants more smart software with less dependence on expensive, hard to train, and often careless humans. When incoming rounds hit near a mobile operations center, it is possible to lose one’s train of thought.

Zoomdata has some Booz, Allen DNA, some MIT RNA, and protein from other essential chemicals.

The write up mentions Palantir, but does not make explicit the need to reduce to some degree the human-centric approaches which are part of the major systems’ core architecture. You have nifty cloud stuff, but you have less nifty humans in most mission critical work processes.

To speed up the outputs, software should be the answer. An investment in Zoomdata delivers three messages to me here in rural Kentucky:

  1. In-Q-Tel continues to look for ways to move along the “less wait and less weight” requirement of those involved in operations. “Weight” refers to heavy, old-fashioned systems. “Wait” refers to the latency imposed by manual processes.
  2. Zoomdata and other investments apply whips to the flanks of the BAE Systems, IBMs, and Palantirs chasing government contracts. The investment focuses attention not on scope changes but on figuring out how to deal with the unacceptable complexity and latency of many existing systems.
  3. In-Q-Tel has upped the value of Zoomdata. With consolidation in the commercial intelligence business rolling along at NASCAR speeds, it won’t take long before Zoomdata finds itself going to big company meetings to learn what the true costs of being acquired are.

For more information about Zoomdata, check out the paid-for reports at this link.

Stephen E Arnold, September 15, 2016

Big Data Processing Is Relative to Paradigm of Today

September 7, 2016

The size and volume that characterize an information set as big data, and the tools used to process it, are relative to the current era. A story from NPR reminds us of this as it asks, Can Web Search Predict Cancer? Promise And Worry Of Big Data And Health. In 1600s England, a statistician essentially founded demography by compiling details of death records into tables. Today, trends from big data are drawn through a combination of assistance from computer technology and people’s analytical skills. Microsoft scientists conducted a study showing that Bing search queries may hold clues to a future diagnosis of pancreatic cancer.

The Microsoft scientists themselves acknowledge this [lack of comprehensive knowledge and predictive abilities] in the study. “Clinical trials are necessary to understand whether our learned model has practical utility, including in combination with other screening methods,” they write. Therein lies the crux of this big data future: It’s a logical progression for the modern hyper-connected world, but one that will continue to require the solid grounding of a traditional health professional, to steer data toward usefulness, to avoid unwarranted anxiety or even unnecessary testing, and to zero in on actual causes, not just correlations within particular health trends.”

As the producers of data points in many social-related data sets, and as the original analyzers of big data, it makes sense that people remain a key part of big data analytics. While this may be especially pertinent in matters related to health, it may be more intuitively understood in this sector in contrast to others. Whether health or another sector, can the human variable ever be taken out of the data equation? Perhaps such a world will give rise to whatever is beyond the current buzz around the phrase big data.

Megan Feil, September 7, 2016

Government Seeks Sentiment Analysis on Its PR Efforts

September 6, 2016

Sentiment analysis is taking off, and government agencies are using it for PR purposes. Next Gov released a story, Spy Agency Wants Tech that Shows How Well Its PR Team Is Doing, which covers the National Geospatial-Intelligence Agency’s request for information about sentiment analysis. The NGA hopes to use this technology to assess its PR efforts to increase public awareness of the agency and communicate its mission, especially to groups such as college students, recruits and those in the private sector. Commenting on the bigger picture, the author writes,

The request for information appears to be part of a broader effort within the intelligence community to improve public opinion about its operations, especially among younger, tech-savvy citizens. The CIA has been using Twitter since 2014 to inform the public about the agency’s past missions and to demonstrate that it has a sense of humor, according to an Nextgov interview last year with its social media team. The CIA’s social media director said at the time there weren’t plans to use sentiment analysis technology to analyze the public’s tweets about the CIA because it was unclear how accurate those systems are.

The technologies used in sentiment analysis, such as natural language processing and computational linguistics, are attractive in many sectors for PR and other purposes, and the government is no exception. Especially now that the CIA and other organizations are using social media, the space is certainly ripe for government sentiment analysis. Still, we must echo the question posed by the CIA’s social media director with regard to accuracy.

Megan Feil, September 6, 2016

Google Enables Users to Delete Search History, Piece by Piece

August 31, 2016

The article on CIO titled Google Quietly Brings Forgetting to the U.S. draws attention to Google having enabled Americans to view and edit their search history. Simply visit My Activity and log in to witness the mind-boggling amount of data Google has collected in your search career. To delete, all you have to do is complete two clicks. But the article points out that to delete a lot of searches, you will need an afternoon dedicated to cleaning up your history. And afterward you might find that your searches are less customized, as are your ads and autofills. But the article emphasizes a more communal concern,

There’s something else to consider here, though, and this has societal implications. Google’s forget policy has some key right-to-know overlaps with its takedown policy. The takedown policy allows people to request that stories about or images of them be removed from the database. The forget policy allows the user to decide on his own to delete something…I like being able to edit my history, but I am painfully aware that allowing the worst among us to do the same can have undesired consequences.

Of course, by “the worst among us” he means terrorists. But for many people, the right to privacy is more important than the hypothetical ways that terrorists will potentially suffer within a more totalitarian, Big Brother state. Indeed, Google’s claim that the search history information is entirely private is already suspect. If Google personnel or Google partners can see this data, doesn’t that mean it is no longer private?

Chelsea Kerwin, August 31, 2016

The Equivalent of a Brexit

August 31, 2016

Britain’s historic vote to leave the European Union has set a precedent. What is the precedent, however? Is it the choice to leave an organization? The choice to maintain independence? Or is it a basic example of the right to choose? The Brexit will be used as a metaphor for any major upheaval for the next century, so how can it be used in a technology context? BA Insight gives us the answer with “Would Your Users Vote ‘Yes’ For Sharexit?”

SharePoint is Microsoft Office’s collaborative content management program. It can be used to organize projects, build Web sites, store files, and allow team members to communicate. Office workers across the globe also spurn it due to its inefficiencies. To avoid a Sharexit in your organization, the article offers several ways to improve a user’s SharePoint experience. One of the easiest ways to keep SharePoint is to build an individual user interface that handles little tasks to make a user’s life easier. Personalizing the individual SharePoint user experience is another method, so the end user does not feel like another cog in the system but rather that SharePoint was designed for them. Two other suggestions are plain, simple advice: take user feedback and actually use it, and make SharePoint the go-to information center for the organization by putting everything on it.

Perhaps the best advice is making information easy to find on SharePoint:

Documents are over here, discussions over there, people are that way, and then I don’t know who the experts really are.  You can make your Intranet a whole lot smarter, or dare we say “intelligent”, if you take advantage of this information in an integrated fashion, exposing your users to connected, but different, information.  You can connect documents to the person who wrote them, then to that person’s expertise and connected colleagues, enabling search for your hidden experts. The ones that can really be helpful often reduce chances for misinformation, repetition of work, or errors. To do this, expertise location capabilities can combine contributed expertise with stated expertise, allowing for easy searching and expert identification.

Developers love SharePoint because it is easy to manage and to roll out information or software to every user. End users hate it because it creates more problems than it resolves. If developers take the time to listen to what the end users need from their SharePoint experience, they can avoid a Sharexit.

Whitney Grace, August 31, 2016

Smart Software Pitfalls: A List-Tickle

August 26, 2016

Need page views? Why not try a listicle or, as we say here in Harrod’s Creek, a “list-tickle.”

In order to understand the depth of thought behind “13 Ways Machine Learning Can Steer You Wrong,” one must click 13 times. I wonder if the managers responsible for this PowerPoint approach to analysis handed in their college work on 5X8 inch note cards and required that the teacher ask for each individually.

What are the ways machine learning can steer one into a ditch? As Ms. Browning said in a single poem on one sheet of paper, “Let me count the ways.”

  1. The predictions output by the Fancy Dan system are incorrect. Fancy that.
  2. One does not know what one does not know. This reminds me of a Donald Henry Rumsfeld koan. I love it when real journalists channel the Rumsfeld line of thinking.
  3. Algorithms are not in line with reality. Mathematicians and programmers are reality. What could these folks do that does not match the Pabst Blue Ribbon beer crowd at a football game? Answer: Generate useless data unrelated to the game and inebriated fans.
  4. Biased algorithms. As I pointed out in this week’s HonkinNews, numbers are neutral. Humans, eh, not often.
  5. Bad hires. There you go. Those LinkedIn expertise graphs can be misleading.
  6. Cost lots of money. Most information technology projects cost a lot of money even when they are sort of right. When they are sort of wrong, one gets Hewlett Packard-Autonomy like deals.
  7. False assumptions. My hunch is that this is Number Two wearing lipstick.
  8. Recommendations unrelated to the business problem at hand. This is essentially Number One with a new pair of thrift store sneakers.
  9. Click an icon, get an answer. The Greek oracles required supplicants to sleep off a heady mixture of wine and herbs in a stone room. Now one clicks an icon when one is infused with a Starbucks tall, no fat latte with caramel syrup.
  10. GIGO or garbage in, garbage out. Yep, that’s what happens when one cuts the statistics class when the professor talks about data validity.
  11. Looking for answers the data cannot deliver. See Number Five.
  12. Wonky outcomes. Hey, the real journalist is now dressing a Chihuahua in discarded ice skating attire.
  13. “Blind Faith.” Isn’t this a rock and roll band? When someone has been using computing devices since the person was four years old, that person is an expert and darned sure the computer speaks the truth like those Greek oracles.

Was I more informed after clicking 13 times? Nope.

Stephen E Arnold, August 26, 2016
