More Metadata: Not Needed Metadata

November 21, 2014

I find the metadata hoo hah fascinating. Indexing has been around a long time. If you want to dig into the complexities of metadata, you may find the table from InfoLibCorp.com helpful:

[Image: metadata table from InfoLibCorp.com]

Mid tier consulting firms often do not use the products or systems their “experts” recommend. Consultants in indexing do create elaborate diagrams that make my eyes glaze over.

Some organizations generate metadata without considering what is required. As a result, outputs from the systems can present mind-bogglingly complex options to the user. A report displaying multiple layers of metadata can be difficult to understand.

My thought is that before giving the green light to promiscuous metadata generation, some analysis and planning may be useful. The time lost trying to figure out which metadata is relevant to a particular issue can be critical.

But consultants and vendors are indeed impressed with flashy graphics. Too many times no one has a clue what the graphics are trying to communicate. The worst offenders are companies that sell visual sizzle to senior managers. The goal is a gasp from the audience when the Hollywood style visualizations are presented. Pass the popcorn. Skip the understanding.

Stephen E Arnold, November 21, 2014

Is This How Right to Be Forgotten Works?

November 21, 2014

I read “This Is How Google Handles Right to Be Forgotten Requests.” I must admit that the process strikes me as impressive. To explain hitting delete takes about 1,000 words.

One of Google’s problems, though, is that the process is often one-sided because a decision is based on information supplied by one person using a simple Web form.

The article explains what the GOOG allegedly does. There is one omission. Some content has disappeared, and it is difficult to figure out if a form was filed. For information about this, navigate to “Telegraph Stories Affected by EU ‘Right to Be Forgotten’”.

For a “real” journalism outfit, Computerworld is presenting the world as it understands it. Does the description match the Google reality? Good question.

Stephen E Arnold, November 21, 2014

DBpedia Makes Wikipedia Part Of The Semantic Web

November 21, 2014

SemanticWeb.com posted an article called “Retrieving And Using Taxonomy Data From DBpedia” with an interesting introduction. It explains that DBpedia is a crowd-sourced Internet community whose entire goal is to extract structured information from Wikipedia and share it. The introduction continues that DBpedia already has over three billion facts expressed in the W3C standard RDF data model and ready for application use.

The taxonomy data is expressed in the SKOS vocabulary, a W3C standard used by the New York Times, the Library of Congress, and other organizations for their own taxonomies and subject headers. Users can extract the data and use it in their own RDF applications, with the goal of adding value to their own data.
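To make the retrieval step concrete, here is a minimal Python sketch of asking DBpedia for the direct SKOS children of a category. It is an illustration, not the query from the SemanticWeb.com article. The function names are hypothetical, the public endpoint at `https://dbpedia.org/sparql` may be slow or unavailable, and the sketch assumes category resources carry `skos:broader` and `skos:prefLabel` properties.

```python
import json
import urllib.parse
import urllib.request

# Public DBpedia SPARQL endpoint (assumption: reachable and not rate limited).
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_subcategory_query(category: str, limit: int = 20) -> str:
    """Return a SPARQL query for the direct SKOS children of a DBpedia category."""
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?child ?label WHERE {\n"
        f"  ?child skos:broader <http://dbpedia.org/resource/Category:{category}> ;\n"
        "         skos:prefLabel ?label .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def parse_bindings(results: dict) -> list:
    """Extract (uri, label) pairs from a SPARQL JSON results document."""
    return [
        (row["child"]["value"], row["label"]["value"])
        for row in results["results"]["bindings"]
    ]


def fetch_subcategories(category: str, limit: int = 20) -> list:
    """Run the query over HTTP and return (uri, label) pairs. Needs network access."""
    params = urllib.parse.urlencode({"query": build_subcategory_query(category, limit)})
    request = urllib.request.Request(
        f"{DBPEDIA_ENDPOINT}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_bindings(json.load(response))
```

Calling `fetch_subcategories("Horror_films")` would return whatever narrower categories Wikipedia editors have filed under that heading, which is exactly where the quality caveat about Wikipedia data applies.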

DBpedia is doing a wonderful service for users so they do not have to rely on proprietary software to deliver rich taxonomies. The taxonomies can be retrieved under the open source community’s terms and used to improve content immediately. There is one caveat:

“Remember that, for better or worse, the data is based on Wikipedia data. If you extend the structure of the query above to retrieve lower, more specific levels of horror film categories, you’d probably find the work of film scholars who’ve done serious research as well as the work of nutty people who are a little too into their favorite subgenres.”

Remember Wikipedia is a good reference tool to gain an understanding of a topic, but you still need to check more verifiable resources for hard facts.

Whitney Grace, November 21, 2014
Sponsored by ArnoldIT.com, developer of Augmentext

Hedgehogs Digging For Content

November 21, 2014

Have you ever heard of Hedgehogs.net? According to a short introduction video it is:

“Hedgehogs.net is a social application framework that supports and empowers collaboration within the alternate investments community. Who is it for? It targets four main user groups: investment professionals, service providers, subject matter experts, fund investors. What does it do? Hedgehogs.net uses the power of social media and Web 2.0 technology to facilitate collaborative communication in the virtual world. It is designed to be a disruptive technology and economic force that has the commercial interests of the investment community at its core.”

Neat idea: a community that harnesses the power of social media and Web content to discuss alternate ways to invest money and ways to grow businesses! Browsing through the Web site, you will be bombarded by a pop-up that asks you to take a survey, and it gets annoying fast. Reading the content raises an eyebrow, too. Most of the blog posts seem centered on a random topic tagged with many keywords. Perusing them shows that there is a collective stream of consciousness that requires more than a second glance to understand. Hedgehogs.net also acts as a news aggregator similar to Fark.com and Reddit.com.

Hedgehogs.net’s purpose is buried somewhere on its page. Whatever it is and however it is used, it certainly drives a lot of traffic.

Whitney Grace, November 21, 2014

Larry Page: Quotes from an Icon

November 20, 2014

I read “14 Quotes That Reveal How Larry Page Built Google Into The World’s Most Important Internet Company.” I suppose it is good to try and stay on the good side of the GOOG. I will be interested to see how that works out.

To the matter at hand. There is one quote that I highlighted as a candidate for recycling. Here it is:

On robots replacing humans: “The idea that everyone should slavishly work so they do something inefficiently so they keep their job – that just doesn’t make any sense to me. That can’t be the right answer.”

Makes perfect sense as long as a person is able to work at Google, be a consultant to Google, or partner successfully with Google. Now what about others?

Just become an entrepreneur, innovate, work harder, etc.

Stephen E Arnold, November 20, 2014

Mozilla and Search Changes: Meh

November 20, 2014

You can read the crashing waves of opinions about Mozilla and its falling out of love with the GOOG. “Firefox Drops Google as Default Search Engine…” presents the new, “real” journalism approach; to wit:

Firefox has lost market share in recent years but is still used by roughly 17 percent of web goers.

Juicy factoid. Small percentage in a world in which traffic and eyeballs matter.

You can get the search engine optimization/inside scoop viewpoint in “Mozilla CEO: It Wasn’t Money — Yahoo Was The Better Strategic Partner For Firefox.” I noted this:

The official line from the Mozilla blog post about the deal helps parse what being a good strategic partner seems to be. It praises Yahoo as being “aligned with our values of choice and independence” — which suggests that Firefox was feeling that Google had become too controlling or wanted more control about what was happening within Firefox. Or, perhaps Mozilla felt Google has been less about supporting the web and more about supporting itself than in the past.

My view is not just tepid; it is indifferent. Monopolistic behaviors are the order of the day. Yahoo is no monopoly. Yandex may have a shot as long as it stays on the right side of certain governmental authorities. Baidu is the best of the bunch, but one misstep and I would suggest that life could be viewed through a filter.

As the browser becomes the new operating system, if you are not running what’s mainstream, there may be some challenges ahead. Do you still have an Eagle desktop computer? If so, dig it out, plug in your DEC Rainbow, and let me know how you read this blog post.

Oh, and what about search? It seems to rank right along with the Mozilla attitude toward money in my opinion.

Stephen E Arnold, November 20, 2014

SAS Releases a New Component of Enterprise Miner: SAS Text Miner

November 20, 2014

The product article for SAS Text Miner on SAS Products offers some insight into the new element of SAS Enterprise Miner. SAS acquired Teragram and that “brand” has disappeared. Some of the graphics on the Text Miner page are reminiscent of SAP Business Objects’ Inxight look. The overview explains,

“SAS Text Miner provides tools that enable you to extract information from a collection of text documents and uncover the themes and concepts that are concealed in them. In addition, you can combine quantitative variables with unstructured text and thereby incorporate text mining with other traditional data mining techniques. SAS Text Miner is a component of SAS Enterprise Miner. SAS Enterprise Miner must be installed on the same machine.”

New features and enhancements for the Text Miner include support for English and German parsing and new functionality. For more information about the Text Miner, visit the Support Community available for users to ask questions and discover the best approaches for the analysis of unstructured data. SAS was founded in 1976 after the software was created at North Carolina State University for agricultural research. As the software developed, various applications became possible, and the company gained customers in pharmaceuticals, banks and government agencies.

Chelsea Kerwin, November 20, 2014

The Use of Semantic Enrichment for Scholarly Publishers

November 20, 2014

The article titled “The Power of Semantics” on Research Information investigates the advancements in semantic enrichment tools. Scholarly publishers are increasingly interested in enabling their users to browse the vast quantity of data online and find the most relevant information. Semantic enrichment is the proposed solution for guiding knowledge-seekers to the significant material while weeding out the unnecessary and unrelated. Phil Hastings of Linguamatics, Daniel Mayer of Temis and Jake Zarnegar of Silverchair were all quoted at length in the article on the current uses of semantic enrichment and its future. The article states,

“Daniel Mayer, VP product and marketing at TEMIS, gave some examples of the ways this approach is being used: ‘Semantic enrichment is helping publishers make their content more compelling, drive audience engagement and content usage by providing metadata-based discoverability features such as search-engine optimisation, improved search, taxonomy/faceted navigation, links to structured information about topics mentioned in content, “related content”, and personalisation.’”

Clearly, Temis is emphasizing semantics. Mayer and the others also gave their opinions on how publishers in the market for semantic enrichment might go about picking their partners. Some suggestions included choosing a partner with expertise within the field, an established customer base and the ability to share best practices.

Chelsea Kerwin, November 20, 2014

Channel 9 Offers Office 365 REST Endpoint Training

November 20, 2014

With all the intricacies of SharePoint, continued training and education are important. Short training videos are getting easier to find, so users don’t have to subscribe to large training programs or hire someone to come in. It is worth giving these short tutorials a shot. We found an interesting one on Channel 9 called, “Azure, Office 365, and SharePoint Online has REST endpoints with Mat Velloso.”

The summary says:

“Mat Velloso explains how to create applications and services in Azure that get permission to access OTHER applications like SharePoint! We’ll dig into the URL Structure of these services, see how to get events when things are updated, and figure out how ODATA and REST fit into these cloud building blocks.”
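For readers who have not seen the URL structure the summary mentions, here is a minimal Python sketch of how a SharePoint Online REST address for list items can be assembled with OData query options. It is not taken from the video. The helper name `list_items_url` and the `contoso` site address are hypothetical, and authentication (an OAuth access token sent with the request) is deliberately left out.

```python
from urllib.parse import quote


def list_items_url(site_url: str, list_title: str, select=None, top=None) -> str:
    """Build a SharePoint REST URL for the items in a list, with optional OData options."""
    # A single quote inside an OData string literal is escaped by doubling it.
    safe_title = quote(list_title.replace("'", "''"))
    url = f"{site_url.rstrip('/')}/_api/web/lists/getbytitle('{safe_title}')/items"

    options = []
    if select:
        # $select limits the fields returned for each item.
        options.append("$select=" + ",".join(select))
    if top is not None:
        # $top caps the number of items returned.
        options.append(f"$top={top}")
    return url + ("?" + "&".join(options) if options else "")


if __name__ == "__main__":
    # Hypothetical tenant and list names, for illustration only.
    print(list_items_url(
        "https://contoso.sharepoint.com/sites/team",
        "Shared Documents",
        select=["Title", "Modified"],
        top=5,
    ))
```

The REST part is the resource path under `/_api/`; the OData part is the `$select` and `$top` query options appended to it.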

Stephen E. Arnold of ArnoldIT.com pays a good amount of attention to training and continuing education regarding SharePoint. His web service, ArnoldIT.com, is devoted to all things search, including a large SharePoint feed that helps users and managers stay on top of the latest tips, tricks, and news that may affect their implementation. Keep an eye out for further learning opportunities.

Emily Rae Aldridge, November 20, 2014

Enterprise Search to Walk a Rocky Road in 2015

November 19, 2014

The Arnold IT team is working on a new report that focuses on what my team is calling Next Generation Information Access or NGIA. The Information Today column for the next issue (maybe January 2015) addresses one of the several hurdles that enterprise search vendors will have to get over in order to close some of the larger information access deals available today.

In this write up, I want to identify five other issues that bedevil enterprise search. Please understand that I am not talking about Web search, which is essentially a different application of string matching and online advertising.

Here’s the partial list with a short comment:

  1. A suite of components, not a one shot system like a download of Lucene.
  2. An architecture that allows the licensee flexibility when integrating, scaling, or modifying certain operations of the system. The black box is an intriguing notion, just unsuited to today’s working environment.
  3. A suite of components that have been designed to inter-operate without extensive custom scripting or silly explanations about the difficulty of making Component A work with Component B of the same vendor’s software. Incompatible Lego blocks don’t fly in kindergarten, and they certainly don’t work in high risk information settings.
  4. Connectors that work and support the file types commonly in use in fast moving, flexible work situations. The notion that the licensee can just code up a widget makes sense only when the vendor lacks the expertise or the resources to do this job before selling an information access system.
  5. Native support for languages in which the licensee’s content resides. Again, telling the licensee that he or she can just connect a system to a third party translation system is a feeble answer to today’s global content environment.

There are other hurdles as well. I will be identifying more and mapping out the specific differences between some of the aggressively marketed “charade” information access systems and solutions from vendors who have been operating at a different level in today’s three dimensional information chess game.

Playing checkers is not the same as 3D chess in my view.

Stephen E Arnold, November 19, 2014
