If an IDC Tweet Enters the Social Stream, Does Anyone Care? I Do Not.

December 31, 2014

This is a good question. The Twitter messages output by Beyond Search are automated. We know that most of these produce nothing substantive. But what about Tweets by an IDC search expert like Dave Schubmehl? You may recognize the name because he sold a report with my name on it for $3,500 on Amazon without my permission. Nifty. I don’t think of myself as a brand or fame surfboard, but it appears that he does.

My Overflight system noted that since September 22, 2014, Mr. Schubmehl or an IDC software script generated 198 tweets if I counted correctly. There were quite a few tweets about BA Insight, a search vendor anchored in Microsoft SharePoint. I ask, “Is BA Insight paying for IDC to promote the brand?” I know that there may have been some brushes with IDC in the past. Whether for free or for fee, Mr. Schubmehl mentions BA Insight a half dozen times.

But Mr. Schubmehl is fascinated with IBM. He generated tweets about Watson, IBM “insights”, and IBM training 149 times. Perhaps IDC and Mr. Schubmehl should apply to be listed in the TopSEOs’ list?

Do McKinsey, Bain, and BCG consultants hammer out tweets about Watson? I suppose if the client pays. Are IDC and search expert Mr. Schubmehl in the pay-to-play business? If not, he has considerable affection for IBM and its Watson system, which is supposed to be a $10 billion business in four or five years. I wonder how that will work out in a company that is playing poker with its financial guidance for the next fiscal year.

Stephen E Arnold, December 31, 2014

Patent Search Needs to Be Semantic

December 31, 2014

An article published on Innography called “Advanced Patent Search” brings to attention how default search software might miss important search results, especially if one is researching patents. It points out that some patents are purposefully phrased to hide their meaning and relevance so that they slip under the radar.

Deeper into the article, it transforms into a press release highlighting Innography’s semantic patent search. It describes how the software searches descriptive text across product descriptions, keywords, and patent abstracts. This is not anything too exciting, but this makes the software more innovative:

“Innography provides fast and comprehensive metadata analysis as another method to find related patents. For example, there are several “one-click” analyses from a selected patent – classification analysis, citation mining, invalidation, and infringement – with a user-selected similarity threshold to refine the analyses as desired. The most powerful and complete analyses utilize all three methods – keyword search, semantic search, and metadata analysis – to ensure finding the most relevant patents and intellectual property to analyze further.”
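The “user-selected similarity threshold” idea in the quote can be illustrated with a minimal sketch. This is not Innography’s method, which the article does not describe. It assumes that similarity is measured as Jaccard overlap between sets of classification codes, and the patent IDs, codes, and function names are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_patents(selected_codes, candidates, threshold=0.5):
    """Return (patent_id, score) pairs whose score meets the threshold.

    `candidates` maps a hypothetical patent ID to its classification codes.
    Results are sorted from most to least similar.
    """
    scored = [
        (patent_id, jaccard(selected_codes, codes))
        for patent_id, codes in candidates.items()
    ]
    kept = [(pid, score) for pid, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical data for illustration only.
selected = {"G06F16/33", "G06F16/35", "G06N20/00"}
candidates = {
    "PATENT-A": {"G06F16/33", "G06F16/35"},
    "PATENT-B": {"H04L9/32"},
    "PATENT-C": {"G06F16/33", "G06N20/00", "G06Q50/18"},
}

matches = related_patents(selected, candidates, threshold=0.5)
for patent_id, score in matches:
    print(f"{patent_id}: {score:.2f}")
```

Raising the threshold narrows the result list; lowering it widens the net at the cost of more noise, which is presumably why the vendor leaves the setting to the user.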

Innography’s patent search serves as an example of how search software needs to compete with comparable products. A simple search is not enough anymore, not in the world of big data. Users demand analytics, insights, infographics, ease of use, and accurate results.

Whitney Grace, December 31, 2014
Sponsored by ArnoldIT.com, developer of Augmentext

Drop Everything and Learn These New Tips for Semantic Search

December 31, 2014

IT developers are searching for new ways to manipulate semantic search, but according to Search Engine Journal in “12 Things You Need To Do For Semantic Search,” they are all trying to figure out what the user wants. The article offers twelve tips to get back to basics and use semantic search as a tool to drive user adoption.

Some of the tips are quite obvious, such as think like a user, optimize SEO, and harness social media and local resources. Making a Web site stand out requires taking the obvious tips and using a bit more. The article recommends that it is time to learn more about Google Knowledge Graph and how it applies to your industry. Schema markup is also important, because search engines rely on it for richer results and it shapes how users see your site in a search engine.

Here is some advice on future-proofing your site:

“Work out how your site can answer questions and provide users with information that doesn’t just read like terms and conditions. Pick the topics, services and niches that apply to your site and start to optimize your site and your content in a way that will benefit users. Users will never stop searching using specific questions, but search engines are actively encouraging them to ask a question or solve a problem so get your services out there by meeting user needs.”
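One way to combine the schema markup tip with the advice about answering user questions is to publish question-and-answer content as schema.org structured data. The sketch below is an illustration, not something taken from the Search Engine Journal article. It builds a JSON-LD `FAQPage` block, and the helper name `faq_jsonld` and the sample question and answer are made up.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage structure from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Hypothetical content for illustration only.
faq = faq_jsonld([
    (
        "How do I choose a patent search tool?",
        "Compare keyword, semantic, and metadata search features.",
    ),
])

# JSON-LD is embedded in a page inside a script tag of this type.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(faq, indent=2)
    + "\n</script>"
)
print(snippet)
```

Whether a search engine actually shows a richer result for such markup is up to the search engine; the markup only makes the question-and-answer structure machine readable.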

More tips include seeing how results are viewed on search engines other than Google, keeping up with trends, befriending a thesaurus, and being aware that semantic search requires A LOT of work.

Whitney Grace, December 31, 2014
Sponsored by ArnoldIT.com, developer of Augmentext

Mastering Data Quality Requires Change

December 31, 2014

Big data means big changes for data management and ensuring its quality. Computer users, especially those ingrained in their ways, have never been keen on changing their habits. Add training sessions and meetings, and you have a general idea of what it takes to instill data acceptance. Dylan Jones at SAS’s Data Roundtable wrote an editorial, “Data Quality Mastery Depends On Change Management Essentials.”

Jones writes that data management is still viewed as a strict IT domain and data quality suffers from it. It requires change management to make other departments understand the necessity of the changes.

Change management involves:

• “Ownership and leadership from the top

• Alignment with the overall strategy of the organization

• A clear vision for data quality

• Constant dialogue and consultation”

Jones notes that leaders are difficult to work with when it comes to change implementation, because they do not see what the barriers are. It translates to a company’s failure to adapt and learn. He recommends having an outside consultant, with an objective perspective, help when trying to make big changes.

Jones makes good suggestions, but he lacks any advice on how to feasibly accomplish a task. What he also needs to consider is that data quality is constantly changing as new advances are made. Is he aware that some users cannot keep up with the daily changes?

Whitney Grace, December 31, 2014
Sponsored by ArnoldIT.com, developer of Augmentext

Useful Reading/Viewing List for Smart Software

December 30, 2014

The “Deep Learning Reading List” is a useful roundup of information sources. The content is grouped by free online books, courses, videos, papers, tutorials, Web sites, datasets, frameworks, and miscellaneous. Useful.

Stephen E Arnold, December 30, 2014

Losing the Past Online

December 30, 2014

I read “WWWTXT: The Oldest Internet Archive.” The write up makes clear that archival online content is tough to find. I like the idea that online history is lost. The idea, one might say, is that lack of awareness of the past makes everything new again. Here’s a quote I noted:

(Rehn’s archive was acquired from the now-defunct Deja News, which was acquired by Google in 2001.) These days, the majority of new content he gets is from old BBS archives, either given to him, or found on old floppy disks.

When experts in search are clueless about early information retrieval systems, I thought it was a failure on the part of the expert. Now I see. Those folks have no past to which to refer. Hence, old stuff is innovative. Good to know.

Stephen E Arnold, December 30, 2014

Temis Attends University

December 30, 2014

Despite budget cuts in academic research with print materials, higher education is clamoring for more digital content. You do not need Google Translate to understand that means more revenue for companies in that industry. Virtual Strategy writes that someone wants in on the money: “With Luxid Content Enrichment Platform, Cairn.info Automates The Extraction Of Bibliographic References And The Linking To Corresponding Article.”

Temis is an industry leader in semantic content enrichment solutions for the enterprise, and it has signed a license and service agreement with CAIRN.info. CAIRN.info is a publishing portal for social sciences and humanities, providing students with access to the usual research fare.

Taking note of the changes in academic research, CAIRN.info wants to upgrade its digital records for a more seamless user experience:

“To make its collection easier to navigate, and ahead of the introduction of an additional 20.000 books which will consolidate its role of reference SSH portal, Cairn.info decided to enhance the interconnectedness of SSH publications with semantic enrichment. Indeed, the body of SSH articles often features embedded bibliographic references that don’t include actual links to the target document. Cairn.info therefore chose to exploit the Luxid® Content Enrichment Platform, driven by a customized annotator (Skill Cartridge®), to automatically identify, extract, and normalize these bibliographic references and to link articles to the documents they refer to.”
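The identify, extract, normalize, and link steps named in the quote can be pictured with a toy sketch. It says nothing about how the Luxid platform or its Skill Cartridge works. It assumes citations appear in a simple “Author (Year)” form, and the regular expression, catalog, and example.org URLs are hypothetical.

```python
import re

# Hypothetical catalog: normalized (surname, year) -> target document URL.
CATALOG = {
    ("bourdieu", "1979"): "https://example.org/doc/bourdieu-1979",
    ("durkheim", "1897"): "https://example.org/doc/durkheim-1897",
}

# Matches a capitalized surname followed by a four-digit year in parentheses.
CITATION = re.compile(r"([A-Z][A-Za-z]+) \((\d{4})\)")


def extract_references(text):
    """Identify citations and normalize them to (lowercase surname, year)."""
    return [(m.group(1).lower(), m.group(2)) for m in CITATION.finditer(text)]


def link_references(text):
    """Map each normalized reference to a catalog URL, or None if unknown."""
    return {key: CATALOG.get(key) for key in extract_references(text)}


sample = (
    "As Bourdieu (1979) argued, and contrary to Durkheim (1897), "
    "see also Martin (2003)."
)

links = link_references(sample)
for reference, url in links.items():
    print(reference, "->", url or "no match in catalog")
```

Real bibliographic references are far messier than this pattern allows (multiple authors, abbreviated titles, inconsistent punctuation), which is presumably why a publisher licenses a dedicated annotator instead of writing a regular expression.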

A round of applause for Cairn.info, realizing that making research easier will help encourage more students to use its services. If only academic databases would take ease of use into consideration and upgrade their UI dashboards.

Whitney Grace, December 30, 2014
Sponsored by ArnoldIT.com, developer of Augmentext

Garbling the Natural Language Processors

December 30, 2014

Natural language processing is becoming a popular analytical tool as well as a quicker way for search and customer support. Dragon Nuance is on the tip of everyone’s tongue when NLP enters a conversation, but there are other products with their own benefits. Code Project recently reviewed three NLP products in “A Review Of Three Natural Language Processors, AlchemyAPI, OpenCalais, And Semantria.”

Rather than sticking readers with plain product reviews, Code Project explains what NLP is used for and how it accomplishes it. While NLP is used for vocal commands, it can do many other things: improve SEO, knowledge management, text mining, text analytics, content visualization and monetization, decision support, automatic classification, and regulatory compliance. NLP extracts entities (aka proper nouns) from content, then classifies and tags them and assigns each entity a sentiment score to give it a meaning.

In layman’s terms:

“…the primary purpose of an NLP is to extract the nouns, determine their types, and provide some “scoring” (relevance or sentiment) of the entity within the text.  Using relevance, one can supposedly filter out entities to those that are most relevant in the document.  Using sentiment analysis, one can determine the overall sentiment of an entity in the document, useful for determining the “tone” of the document with regards to an entity — for example, is the entity “sovereign debt” described negatively, neutrally, or positively in the document?”
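The filtering and tone steps described in the quote can be sketched in a few lines. This does not call AlchemyAPI, OpenCalais, or Semantria. It assumes an NLP service has already returned entities with a type, a relevance score from 0 to 1, and a sentiment score from -1 to 1. The `Entity` class, score ranges, thresholds, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    text: str
    entity_type: str
    relevance: float  # assumed range: 0.0 (irrelevant) to 1.0 (central)
    sentiment: float  # assumed range: -1.0 (negative) to 1.0 (positive)


def filter_relevant(entities, min_relevance=0.5):
    """Keep only the entities that meet the relevance cutoff."""
    return [e for e in entities if e.relevance >= min_relevance]


def tone(entity, neutral_band=0.1):
    """Label an entity's sentiment as negative, neutral, or positive."""
    if entity.sentiment > neutral_band:
        return "positive"
    if entity.sentiment < -neutral_band:
        return "negative"
    return "neutral"


# Hypothetical output of an NLP service for one document.
entities = [
    Entity("sovereign debt", "Topic", relevance=0.82, sentiment=-0.60),
    Entity("Greece", "Country", relevance=0.64, sentiment=-0.05),
    Entity("Tuesday", "Date", relevance=0.12, sentiment=0.00),
]

relevant = filter_relevant(entities)
for e in relevant:
    print(f"{e.text} ({e.entity_type}): {tone(e)}")
```

The cutoff values are arbitrary here. In practice they would need tuning per service, since the three reviewed products do not necessarily share a scoring scale.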

NLP categorizes the human element in content. Its usefulness will become more apparent in future years, especially as people rely more and more on electronic devices for communication, consumerism, and interaction.

Whitney Grace, December 30, 2014
Sponsored by ArnoldIT.com, developer of Augmentext

Microsoft Confirms Drop of Public Website SharePoint Feature

December 30, 2014

Microsoft has confirmed the rumors that everyone has feared – the Public Website feature of SharePoint is being discontinued. Customers are being encouraged to move to third party options that integrate with SharePoint. ZDNet breaks the news and covers the details in their article, “Microsoft Confirms it is Dropping Public Website Feature from SharePoint Online.”

The article discusses how the transition will occur:

“New customers signing up for Office 365 as of January next year won’t have access to Public Websites in SharePoint Online, Microsoft officials acknowledged in a new Knowledge Base support article published on December 19. Existing customers using SharePoint Online Public Website will continue to have access to this feature for a minimum of two years following the changeover date, Microsoft execs said.”

Interested parties will not be surprised by the news, as rumors have swirled for some time. However, it is a difficult transition for those who relied on the feature. It seems that SharePoint went through a season of trying to be all things to all people, but that did not seem to pan out the way they anticipated, and now they are scaling back. Stephen E. Arnold keeps a close eye on SharePoint on his Web service, ArnoldIT.com. Keep an eye on his SharePoint feed to see what feature may be next on the Microsoft chopping block.

Emily Rae Aldridge, December 30, 2014

Machine Learning: The Future, Get It?

December 29, 2014

I enjoyed “The Future Is Machine Learning, Not Programs or Processes.” I am down with the assertion, but I did chuckle. The machine learning systems described in my new monograph “CyberOSINT: Next Generation Information Access” (available in early 2015) are composed of programs and processes.

The article does not really mean that machine learning exists as an island, which, as John Donne correctly pointed out, is not an ideal situation for a reasonably intelligent clergyman to seek.

The write up states:

Why do I then say that the future of BPM is in this obscure AI arena? First, machine learning is not about beating humans at some task. And second, I see machine learning ideal for work that can’t be simply automated. My well-documented opposition in this blog to orthodox BPM is caused by the BPM expert illusion that a business and the economy are designed, rational structures of money and things that can be automated. In reality they are social interactions of people that can never be encoded in diagrams or algorithms. I have shown that machine learning can augment the human ability to understand complex situations and improve decision making and knowledge sharing.

I also noted the reference to IBM Watson, one of my favorite touch points for the usefulness of open source software applied to the creation of recipes for barbeque sauce. Tamarind? Why didn’t my local BBQ joint think of that? Here in Kentucky, the humans at Texican just use bourbon. No imagination?

The write up invokes Google but omits Google’s investment in Recorded Future. I find that interesting because Recorded Future is one of the more important companies in the NGIA market.

The article concludes:

We are just at the starting point of using machine learning for process management. IT and business management are mostly not ready for such advanced approaches because they lack the understanding of the underlying technology. We see it as our goal to dramatically reduce the friction, as Sankar calls it, between the human and the machine learning computer. Using these technologies has to become intuitive and natural. The ultimate benefit is an increase in the quality of the workforce and thus in customer service.

There you go. The future.

Stephen E Arnold, December 29, 2014
