Is the End Approaching for Commercial Metadata Vendors?

April 26, 2012

This is a very interesting move, one that may have implications for the organizations which sell library metadata. Joho the Blog reports, “‘Big Data for Books’: Harvard Puts Metadata for 12M Library Items into the Public Domain.” We learn from the write up:

Harvard University has today put into the public domain (CC0) full bibliographic information about virtually all the 12M works in its 73 libraries. This is (I believe) the largest and most comprehensive such contribution. The metadata, in the standard MARC21 format, is available for bulk download from Harvard. The University also provided the data to the Digital Public Library of America’s prototype platform for programmatic access via an API. The aim is to make rich data about this cultural heritage openly available to the Web ecosystem so that developers can innovate, and so that other sites can draw upon it.

Wow. Now, Harvard does ask users to respect community norms, like attributing sources of metadata. Blogger David Weinberger notes that licensing issues have held up the release of library metadata, and that this move makes the metadata of many, many of the most-used library items accessible.

What will happen next? Will the sellers of library metadata fight back?

Cynthia Murrell, April 26, 2012

Sponsored by PolySpot

Breaking Down SharePoint

April 26, 2012

Tim Anderson breaks down the nuts and bolts of SharePoint, what it is and what it is not, in “Making Sense of SharePoint 2010.”  Anderson gives an overview:

Microsoft calls SharePoint a ‘business collaboration platform,’ a suitably vague description for a multi-faceted product. SharePoint can be a content management system for an internal or external website, a document management system, a business search portal, and more.  So what is SharePoint really? Technically, it is an ASP.NET application which runs on Internet Information Services (IIS), Microsoft’s web server, and which stores most of its data in a SQL Server database. Conceptually, it is the outcome of Microsoft’s efforts over many years to create a web storage system, a document repository accessible via a web browser.

SharePoint has become ubiquitous, and almost obligatory, yet little writing is dedicated to what SharePoint is at its heart.  Instead, much talk is devoted to customization options. But if we see it for what it is, perhaps we can also recognize that there are other options.  There are alternatives to installing an overwhelming SharePoint infrastructure and then spending countless resources on customization processes.

Check out Fabasoft Mindbreeze for instance.  The Mindbreeze solutions are more than search, extending into mobile, web site, and enterprise realms.

Fabasoft Mindbreeze Enterprise understands you, or to be more precise, understands what the most important information is for you at any precise moment in time. It is the center of excellence for your knowledge and simultaneously your personal assistant for all questions. The information pairing technology brings enterprise and Cloud data together.

Perhaps most valuable, Fabasoft Mindbreeze ensures that all of its solutions work alone, or as a complement to an existing SharePoint infrastructure.  Mix and match Mindbreeze offerings to get the customization your organization wants without all the headaches and extra work.

Emily Rae Aldridge, April 26, 2012

Sponsored by Pandia.com

Siemens’ Newest Software Focuses on Data Management

April 26, 2012

In a constantly changing world economy the need for product lifecycle management (PLM) for almost every industry is becoming more and more evident.  This need is driving PLM providers to continually change and improve their PLM solutions to meet the needs of clients.  Siemens PLM, one of the leaders in the industry, has recently released Teamcenter 9 with vital components to better support the onslaught of data that can choke unprepared companies.

A recent Market Watch article, “Siemens PLM Software Introduces Teamcenter 9; Enables Better Decision Making in Product Development”, explains the changes to Siemens’ newest Teamcenter version and how it relates to end users.

“…(B)ecause today’s products often have multiple options and variations, Teamcenter content management supports configuration-driven documentation that reuses common components of text, graphics and meta-data. This provides efficient, context-based multi-channel publishing to support the need for multi-media delivery on different devices and in multiple languages to support global markets.”

The need for content management is not new.  In fact other PLM providers have focused their entire software lines on such data management issues.  Inforbix, another PLM provider with an excellent reputation in customer support, believes that enabling clients to find, share and reuse data is the key to PLM success.  By basing their software solutions on this principle they have grown into a highly respected and influential leader in the PLM community.

Catherine Lamsfuss, April 26, 2012

Are Google Display Ads Losing Magnetism?

April 26, 2012

Business Insider recently reported on the latest in Google advertising in the article “Here’s One Reason Why Google’s Ad Prices May be Dropping.”

According to the article, Google’s ad prices are down by 12 percent from the previous year and this is the second quarter in a row that this has happened. The reason for this? One unnamed person who is familiar with Google’s ad business has a theory.

While Google has added new display advertising inventory, the search giant is having trouble getting small and mid-sized businesses to buy it.

The article states:

“With Google’s traditional search advertising, which still makes up the vast majority of its ad sales, a lot of small and mid-size businesses bid on keywords. This tends to drive prices up.

But display ads still have fewer bidders — mostly larger companies, since smaller companies don’t see the immediate and obvious return on investment that they do with search. It’s harder to track, for instance, the many times a user has seen a display ad before they actually click to buy something.”

It looks like Google needs to do a better job of encouraging smaller businesses to purchase display ads if it wants to continue its reign as search king.

Jasmine Ashton, April 26, 2012

Sponsored by PolySpot

CERN Embraces Yandex in Science, Search

April 26, 2012

Physicists at the European Organization for Nuclear Research, or CERN, have embraced Yandex to assist in an experiment involving high-energy particle collisions.

From an article published in Bloomberg Businessweek, “With Yandex at CERN, Search and Science Collide,” we learn about the CERN project and Yandex’s involvement. Yandex is working on the project for free, and the custom-built search engine is used by more than 700 physicists working on the experiment. The search engine provides instant results from the data and can be tailored by 600 different criteria. The article asserts:

“… About 13 percent of the computing power for Golutvin’s experiment is supplied by the Moscow-based company. Andrey Ustyuzhanin, a Yandex researcher, headed the search company’s five-person team, which created the CERN tool in three months. The software crawled tens of thousands of files spread across CERN’s servers, working at night while the scientists slept. Only a portion of CERN’s existing records have been crawled, but Ustyuzhanin wants to index all of the 20 billion or so particle collisions recorded this year—a number that exceeds the total volume of indexed Web pages.”

This pro-bono work by Yandex is good marketing as the company tries to hold its ground against Google, which now controls 26 percent of the Russian search market. The branding impact from this project is huge, because people will likely be impressed that Yandex is working on such an experiment and will want to become involved with the search vendor as well.

Andrea Hayden, April 26, 2012

Sponsored by PolySpot

ElasticSearch Support from Sematext

April 26, 2012

Sematext has achieved a first, PRWeb announces in “Sematext Int’l First to Offer ElasticSearch Tech Support.” The company already offers Tech Support for Apache projects Solr and Lucene. The press release explains:

“ElasticSearch is a highly scalable open-source search solution that is rapidly being adopted by enterprises world-wide. Over the last 12 months Sematext has witnessed an increase in demand for ElasticSearch and has helped a number of clients build ElasticSearch-based search solutions. Some of Sematext clients are using ElasticSearch on a truly massive scale, ingesting many millions of documents per day and serving hundreds of queries per second.”

The company responded to the growing need, and ElasticSearch Technical Support was born. Sematext founder Otis Gospodnetić stated that his company is in the best position to offer such support, having captured more ElasticSearch engagements (with huge data and query volumes) than anyone. The information gathered at Beyond Search suggests that Lucid Imagination’s Lucene/Solr is the most widely used at this time, however. Lucid offers comprehensive technical and engineering support services via full time staff and a number of partners around the world; for example, Lemur Consulting/Flax.

With many high-profile client organizations around the world, Sematext provides search and data analytics products based on a variety of open source projects. The company started out as Lucene Consulting in 2005, but branched into Solr-based services the very next year. They are proud to have never taken on debt or external funding.

ElasticSearch has been resource constrained. Hopefully the tie-up with Sematext will smooth some rough edges. Support is important. As with most Lucene variant vendors, support and engineering services are needed.

Cynthia Murrell, April 26, 2012

Sponsored by PolySpot

IBM Buys Vivisimo Allegedly for Its Big Data Prowess

April 25, 2012

Big data. Wow. That’s an angle only a public relations person with a degree in 20th century American literature could craft. Vivisimo is many things, but a big data system? News to me for sure.

IBM has been a strong consumer and integrator of open source search solutions. Watson, the game show winner, used Lucene with IBM wrapper software to keep the folks in Jeopardy post production on their toes.

A screen shot of the Vivisimo Velocity system displaying search results for the RAND organization. Notice the folders in the left hand panel. The interface reveals Vivisimo’s roots in traditional search and retrieval. The federating function operates behind the scenes. The newest versions of Velocity permit a user to annotate a search hit so the system will boost it in subsequent queries if the comment is positive. A negative rating on a result suppresses that result.

I learned that IBM allegedly purchased Vivisimo, a company which I have covered in my various monographs about search and content processing. Forbes ran a story which was at odds with my understanding of what the Vivisimo technology actually does. Here’s the Forbes title: “IBM To Buy Vivisimo; Expands Bet On Big Data Analytics.” Notice the phrase “big data analytics.”

Why do I point out the “big data” buzzword? The reasons include:

  • Vivisimo has a clustering method which takes search results and groups them, placing similar results identified by the method in “folders”
  • Vivisimo has a federating method which, like Bright Planet’s and Deep Web Technologies’, takes a user’s query and sends the query to two or more indexing systems, retrieves the results, and displays them to the user
  • Vivisimo has a clever de-duplication method which makes the results list present each duplicated item only once. This is important when one encounters a news story which appears on multiple Web sites.
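
For readers who want to see how small these three functions are in principle, here is a minimal Python sketch of federating, de-duplicating, and foldering. It is not Vivisimo's method, which the company has not published in this form. The two source functions, the sample hits, and the fixed folder labels are all hypothetical, and a real clustering engine would derive its labels from the result text rather than from a preset list.

```python
from collections import defaultdict

# Hypothetical folder labels. A real clustering method would derive
# these from the result text instead of using a fixed list.
FOLDER_LABELS = ["biology", "battery", "prison"]


def search_news(query):
    """Hypothetical indexing system #1 (returns canned hits)."""
    return [
        {"title": "Cell biology breakthrough", "url": "http://a.example/1"},
        {"title": "New cell battery design", "url": "http://a.example/2"},
    ]


def search_wire(query):
    """Hypothetical indexing system #2 (returns canned hits)."""
    return [
        {"title": "Cell Biology Breakthrough", "url": "http://b.example/9"},
        {"title": "Prison cell overcrowding", "url": "http://b.example/7"},
    ]


def normalize(title):
    """Lowercase and collapse whitespace so near-identical titles match."""
    return " ".join(title.lower().split())


def federate(query, sources):
    """Send the query to every source and concatenate the results."""
    hits = []
    for source in sources:
        hits.extend(source(query))
    return hits


def deduplicate(hits):
    """Keep only the first hit for each normalized title."""
    seen = set()
    unique = []
    for hit in hits:
        key = normalize(hit["title"])
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique


def cluster(hits):
    """Place each hit in the folder whose label appears in its title."""
    folders = defaultdict(list)
    for hit in hits:
        words = normalize(hit["title"]).split()
        label = next((l for l in FOLDER_LABELS if l in words), "other")
        folders[label].append(hit["title"])
    return dict(folders)


def federated_search(query, sources):
    """Federate, then de-duplicate, then group into folders."""
    return cluster(deduplicate(federate(query, sources)))


folders = federated_search("cell", [search_news, search_wire])
for label, titles in folders.items():
    print(label, titles)
```

The same story arrives from both sources under slightly different capitalization, and the de-duplication step keeps only the first copy. None of this requires a "big data" platform, which is the point.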

According to the write up in Forbes, a “real” news outfit:

IBM this morning said it has agreed to acquire Vivisimo, a Pittsburgh-based provider of big data access and analysis tools.

Okay, but in Beyond Search we have documented that Vivisimo followed this trajectory in its sales and marketing efforts since the company opened for business in 2000. In fact, the Wikipedia write up about Vivisimo says this:

Vivisimo is a privately held enterprise search software company in Pittsburgh that develops and sells software products to improve search on the web and in enterprises. The focus of Vivisimo’s research thus far has been the concept of clustering search results based on topic: for example, dividing the results of a search for “cell” into groups like “biology,” “battery,” and “prison.” This process allows users to intuitively narrow their search results to a particular category or browse through related fields of information, and seeks to avoid the “overload” problem of sorting through too many results.

Basho Riak Gets Developer Love: Syslog Indexing

April 25, 2012

If you are not familiar with Basho Riak, you can work through the www.basho.com Web site, or you can navigate to www.opensearchnews.com and request our profile of the company. (Click on the “Profile” link at the top of the page.) You may want to check out “Full Text Indexing of Syslog Messages with Riak.” The article describes a tool called riak-syslog. The utility sucks up syslog messages and allows the user to search those messages using the Riak full text search system. The write up has a post which points to indexing syslog messages with Solr. Useful.
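
To make the idea concrete, here is a minimal sketch of full text indexing over syslog lines in plain Python. It does not use riak-syslog, Riak, or Solr, and it makes no claim about how those systems work internally. It only illustrates the general inverted-index technique that full text search relies on. The sample log lines and function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical syslog lines, used only for illustration.
SAMPLE_LOG = [
    "Apr 25 10:01:02 web1 sshd[311]: Failed password for root from 10.0.0.5",
    "Apr 25 10:01:09 web1 cron[412]: job backup started",
    "Apr 25 10:02:44 db1 sshd[977]: Accepted password for deploy from 10.0.0.8",
]


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(messages):
    """Map each term to the set of message ids that contain it."""
    index = defaultdict(set)
    for msg_id, message in enumerate(messages):
        for term in tokenize(message):
            index[term].add(msg_id)
    return index


def search(index, messages, query):
    """Return the messages that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    matching = set.intersection(*(index.get(t, set()) for t in terms))
    return [messages[i] for i in sorted(matching)]


index = build_index(SAMPLE_LOG)
for line in search(index, SAMPLE_LOG, "sshd password"):
    print(line)
```

A production system adds persistence, distribution across nodes, and ranking, which is where a store like Riak or an engine like Solr earns its keep.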

Stephen E Arnold, April 25, 2012

Sponsored by PolySpot

Is Amazon Building the Next Big Thing?

April 25, 2012

The Network Thinkers (TNT) blog believes it has discovered “The Next Big Thing:” social via Amazon. The write up points to the information Amazon gathers from Kindle readers, which goes beyond “customers who bought this item also bought. . .” to include highlights and notes folks have made in their e-copies. The article asserts:

“It is what we specifically find interesting and useful in those books that reveals deep similarities between people — the hi-lites, bookmarks and the notes will be the connectors.  Our choices reveal who we are, and who we are like! Today, Amazon introduces you to similar books.  Tomorrow, they will introduce you to similar readers.”

Intriguing. What makes this post more interesting, though, are the comments; ideas presented as new strike some as covering old ground. “Anonymous” notes:

“Eh, not really that under the radar? Kindle.amazon has been recommending readers with similar profiles for quite some time. But more people take photos or have jobs than read books, so the scale will be less?”

Though his or her voice is tenuous, Anonymous makes a good point: book readers seem to be a dwindling breed (sigh), so the Kindleverse is unlikely to rival Facebook or LinkedIn anytime soon. Myspace, maybe.

Cynthia Murrell, April 25, 2012

Sponsored by PolySpot

The Lifecycle of SharePoint

April 25, 2012

Bjorn Furuknap is a longstanding SharePoint blogger and expert.  His piece, “Could SharePoint 2013 be SharePoint 2012?” is not notable so much for its predictions as it is for its explanation of the SharePoint lifecycle of development.  Furuknap’s hypothesis, which he ultimately refutes, is that the projected release of SharePoint 2013 could in fact be a SharePoint 2012 release, flouting the traditional three-year cycle.

He explains:

I know the cycle at Microsoft says it should be three years between a major Office release, but with the state of completion of Windows 8, and with the new Metro interface making an appearance, perhaps Microsoft aims to get Office 15 out as soon as possible, maybe even in 2012.  It wouldn’t make any sense to leave SharePoint behind then. Microsoft would want the Office client suite to take advantage of the latest and greatest, and that leads me to believe that if Office comes out named 2012, then SharePoint will be so too.

From this we see that the Microsoft web is tightly woven with many interconnecting parts.  With Furuknap’s article a bit dated, we are now more certain that SharePoint 2013 will indeed be SharePoint 2013, even if released in 2012.  But what is interesting is the tight lid that Microsoft keeps, rigidly scheduling updates even when other competing technologies have passed them by.

Take Fabasoft Mindbreeze for instance.  Fabasoft Mindbreeze is a third-party vendor offering a suite of solutions including Fabasoft Mindbreeze Enterprise.  Standing alone or working in tandem with an existing SharePoint infrastructure, Mindbreeze works hard to meet the intuitive needs of its users.  Updates are released quarterly for on-site installations, the latest edition being Spring 2012.  Updates are even more frequent for Cloud users.

If third party vendors are responding to customers’ needs quickly, releasing updates to keep pace with competing and complementary technologies, we think Microsoft could make it a point to do the same thing.  In the meantime, explore the offerings by Fabasoft Mindbreeze and see if they can provide the efficiency and flexibility your organization needs.

Emily Rae Aldridge, April 25, 2012

Sponsored by Pandia.com
