MapReduce: A Summary

May 19, 2012

Want to know about MapReduce? Here you go:

Remember. Think batch processing.
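
For readers who want the batch idea in concrete form, here is a minimal, single-machine sketch of the map, shuffle, and reduce steps applied to a word count. It is an illustration only and not taken from the summary referenced above; the function names and the sample batch are hypothetical, and a real framework such as Hadoop would distribute each phase across many machines.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)

def run_batch(documents):
    """Run the whole job over a fixed batch of input; nothing streams in later."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical sample batch, made up for this sketch.
BATCH = ["think batch processing", "batch jobs not real time queries"]
counts = run_batch(BATCH)
```

The whole input is known before the job starts and results appear only after the reduce step finishes, which is why the batch label fits.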

Stephen E Arnold, May 19, 2012

Sponsored by IKANOW

Arnold Columns: Update May 2012

May 9, 2012

We have continued to produce Stephen E Arnold’s for-fee columns. Due to some minor health excitement involving Mr. Arnold, his monthly update about what and for whom he has been writing for money has been on hold. The content continued to flow. Here’s a rundown by publication of the for-fee columns submitted through May 8, 2012:

Enterprise Technology Management, IMI Publishing, London, UK. ETM publishes my Google column which originally appeared in KMWorld.

  • January 2012, “Google Enterprise: The Berkeley Analysis.” The article discusses why a noted university chose Google’s apps, not Microsoft’s. The point is that price cutting is playing a major role in information technology decisions.
  • February 2012, “Google Enterprise: Is There a Poison Apple in Paradise?” The column reviews the new version of the Google Search Appliance. The question becomes, “Could Apple pose an alternative to Google, an alternative Google is not anticipating?”
  • March 2012, “Google Privacy and Enterprise Licensing.” This write up explores how recent revelations about Google’s approach to privacy may put barriers in place which could slow or block some Google enterprise license deals.
  • April 2012, “Google’s Cloud: Building and Threatening.” The essay considers that Google has been left in the starting blocks by Amazon’s cloud services. Google may catch up, but the pricing of cloud services, regardless of vendor, can be slippery to estimate.
  • May 2012, “The Google Myth: Poetics and Glass.” The story considers Mr. Page’s role with Wall Street and Mr. Brin’s assignment to promote Google’s virtual reality “glasses.” Will these modern day Romulus and Remus billionaires continue to coexist in a positive relationship?

Information Today, Information Today, Inc. The Information Today column covers search-related topics for information specialists, competitive intelligence researchers, and database publishing professionals.

  • January 2012, “Augmented Reality: I’ll Be Back”. Autonomy, best known for enterprise search and content processing, has emerged as a leader in augmented reality or AR. The column discusses Aurasma, the company’s AR solution.
  • February 2012, “By Jingo: Search Catchphrases 2012.” This article considers the role and implications of marketing phrases used by enterprise search vendors. The majority of the buzzwords have more to do with competitive jockeying than communication to an organization looking for a findability solution.
  • March 2012, “Health and Medical Research: Drying Up the Bones.” Web-accessible, public medical information is tough to use. The essay looks at several services, including Quertle.
  • April 2012, “Are Analytics the New Way to Search?” Most users don’t search particularly well. Some do not want to formulate search queries. The write up considers the question, “Can analytics deliver search results without asking the user to formulate a query?”
  • May 2012, “Google and Microsoft: Interface Flipperoos.” The story points out that the new Google interface looks more like Excite 1996 than Google in 2007. Microsoft, on the other hand, looks almost exactly like Google.com’s interface in 2007. Are flips like this the new approach to search interface innovation?

KMWorld, Information Today, Inc. The column for KMWorld discusses enterprise information from the angle of semantic technology.

  • January 2012, “Insight from the Information Tsunami.” The column discusses Microsoft SharePoint and BA Insight, a software complement to SharePoint designed to address some of the “issues” associated with Microsoft’s flagship content management system.
  • February 2012, “Bitext: Engaging in the Semantic Arena.” The article profiles Madrid-based Bitext, a company emerging as a leader in the enterprise semantic market.
  • March 2012, “Xyte and Insight into Online Behaviors.” The write up talks about Xyte’s approach to market research and discloses some interesting findings about Facebook. These items suggest Facebook is a more potent online force than some believe.
  • April 2012, “Consumerizing Knowledge Management.” The essay considers that analytics programs with training wheels deliver some benefits to enterprise users. However, acting on auto-generated reports without understanding the assumptions behind the report can lead to bad decisions.
  • May 2012, “Big Data, Cows, and Cadastres.” The write up looks at specific business pay offs from the analysis of big data. The biggest benefits come from analysts who understand the data and the math behind a particular numerical recipe.

Online Magazine (published six times a year). Information Today, Inc. The features written for Online Magazine focus on open source search in the enterprise. For more than a year, Mr. Arnold’s column has explored a range of subjects related to open source search.

  • February 2012, “Open Source Search: Clarity with Lucid Works.” The feature discusses Lucid Imagination’s newest release of Lucid Works Enterprise 2.0.
  • April 2012, “Open Source: Fascinating Uncertainty.” The feature takes a look at some of the jockeying which takes place in the open source world involving “foundations.”

If you are a public relations person, an azure chip consultant, or an unemployed middle school teacher, Mr. Arnold does not accept story suggestions for these for fee writings. His policy is to contact people with regard to a question or issue. Mr. Arnold is not a journalist. In a previous life, he indexed medieval sermons in Latin. He does not understand “real” journalism, marketing, public relations, investment bankers, private equity firm owners, and sales people.

These articles are available from the publishers who purchased work for hire. At some point, Mr. Arnold’s staff may post versions of some of the essays on one of the reference Web sites Mr. Arnold operates. For copies of these articles, please, contact the publishers. For a briefing on one of the topics addressed in Mr. Arnold’s for fee writings, please, contact us at seaky2000 at yahoo dot com.

Donald C. Anderson, May 9, 2012

Sponsored by PolySpot

Scholarpedia a Valuable Resource

May 1, 2012

We’d like to share a useful resource we’ve come across: Scholarpedia.org is a searchable, peer-reviewed online scientific encyclopedia. Its contributors are respected authorities in their fields, including an impressive list of Nobel Laureates and Fields Medalists. The areas covered include: Dynamical Systems, Physics, Applied Mathematics, Computational Neuroscience, and Touch.

Articles are curated by prominent authorities who take responsibility for the contents. As trusted custodians, these shepherds are also able to sponsor new articles.

The format of Scholarpedia should look familiar. The site’s About page explains:

“Scholarpedia feels and looks like Wikipedia – the free encyclopedia that anyone can edit. Indeed, both are powered by the same program — MediaWiki. Both allow visitors to read and modify articles simply by clicking on the edit this article link. However, Scholarpedia differs from Wikipedia in some very important ways.”

Ways like a strict modification approval process, the selection of elite authors, and the curator review system. The statement goes on to emphasize the advantages the online community brings to the traditional scholarly paper:

“. . . Articles are not frozen and outdated, but dynamic, subject to an ongoing process of improvement moderated by their curators. This allows Scholarpedia to be up-to-date, yet maintain the highest quality of content.”

The site is well worth checking out for all you science types.

Cynthia Murrell, May 1, 2012

Sponsored by Augmentext

Open Source Search Profiles Available

April 25, 2012

OpenSearchNews.com, the new information service from ArnoldIT, has rolled out a new profile service. The first profile describes the Basho Riak Search system. Although proprietary, the Basho team has made the Riak search system open source. You can request a copy of the Basho Riak profile, which is available without charge, from the Open Source Search Profiles link.

Stephen E Arnold, publisher of OpenSearchNews, said:

Consulting firms specializing in open source search have been slow on the trigger when it comes to vendors who offer an alternative to proprietary, “closed” search systems. My team has completed analyses of a dozen open source search vendors and will post a fresh profile every seven to 10 days. The profiles follow the same type of format which we used in such monographs as The Google Legacy, Beyond Search (published by the “old” Gilbane Group), Enterprise Search Report, Successful Enterprise Search Management, and The New Landscape of Enterprise Search. Instead of paying hundreds, maybe thousands of dollars, ArnoldIT is making the information available without charge to facilitate greater understanding and discussion of open source search options.

Profiles contain:

  • Background of the company
  • Principal features and functions of the systems
  • The upside and downside of the system
  • An ArnoldIT “net net” which puts the system in context.

The content of the profiles is intended for individuals, students, and teachers. Libraries are free to use the content without seeking permission. Any other use requires written permission from Stephen E Arnold.

For a complete collection of the 12 profiles, an introduction to open source search, and a summary of where open source search is gaining traction, contact us by writing to seaky2000 at yahoo dot com. The information is available in the form of an online or on site briefing. There is a charge for the complete set of information and/or the briefing.

For up-to-date information about open source search solutions built on Lucene, Solr, and Xapian, among others, check out OpenSearchNews.com. You can, of course, wait for one of the azure chip consultants, unemployed Webmasters, or newly minted search experts to recycle ArnoldIT content. However, the profiles are current and will be available without charge. Enjoy.

Donald C Anderson, April 24, 2012

Sponsored by ArnoldIT, your source for strategic information services

Open Search News: News about Open Source Search

April 23, 2012

As technology continues to evolve, a space has been created for more news sources than ever before. Even more importantly, each news site is able to operate within its own niche. OpenSearchNews.com is a free service published Monday through Friday by ArnoldIT, the publisher of Beyond Search and an expert in search and content processing.

The microsite’s content includes critical commentary, information about products, and highlights of additional sources of information about open source search.

Emily Aldrich, the information service’s editor, said:

“Open source search has become a fast-growing segment of the enterprise search and big data markets. The number of companies competing in this segment is growing. The phenomenon is global with solutions available from Canada, the Danish Library, and entrepreneurs in Russia. We are reporting on the companies, trends, and products which offer an alternative to the seven figure solutions from proprietary enterprise search solutions.”

Open source search is a relatively new development that is transforming enterprise search as we know it. Open Search News is one of the best sources for news on this burgeoning field.

Jasmine Ashton, April 23, 2012

Sponsored by PolySpot

Federated Search: A Definition

April 11, 2012

The phrase “federated search” means subtly different things to different people, and we have noted confusion occasionally arising because of this. It is therefore good to see a new article on the Search Technologies Web site clarifying matters. The article defines the task of federated search as:

Deploying a search over distributed and possibly heterogeneous data sets, and receiving in return a unified search results list.

Not only does the piece provide a clear definition of the alternative strategies for implementing federated search, it also lists the fundamental pros and cons of the different approaches. Read on at Federated Search: The Options.
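
To make the definition concrete, here is a minimal sketch of one federated approach: send the query to each source at search time, put the scores on a common scale, and merge the hits into a single list. It is an illustration under stated assumptions, not the method described in the Search Technologies article. The two sources, their record shapes, and their scoring scales are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Two hypothetical sources with different record shapes (heterogeneous data).
WIKI = [{"title": "Federated search basics", "rank": 0.9},
        {"title": "Batch processing notes", "rank": 0.4}]
CRM = [("Federated search vendor call", 72), ("Pricing memo", 55)]

def search_wiki(query):
    """Return (title, score in 0..1, source) for matching wiki pages."""
    q = query.lower()
    return [(d["title"], d["rank"], "wiki") for d in WIKI if q in d["title"].lower()]

def search_crm(query):
    """CRM scores run 0..100 here, so rescale to 0..1 before merging."""
    q = query.lower()
    return [(title, score / 100, "crm") for title, score in CRM if q in title.lower()]

def federated_search(query, sources):
    """Query every source in parallel and merge the hits into one ranked list."""
    with ThreadPoolExecutor() as pool:
        result_sets = list(pool.map(lambda source: source(query), sources))
    merged = [hit for hits in result_sets for hit in hits]
    return sorted(merged, key=lambda hit: hit[1], reverse=True)

results = federated_search("federated", [search_wiki, search_crm])
```

The rescaling step is where much of the difficulty hides. Scores from different engines are rarely comparable out of the box, which is one reason some teams index all sources into one engine instead of querying them separately.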

Iain Fletcher, April 11, 2012

Sponsored by Pandia.com

Search Infrastructure Advice from PolySpot

April 11, 2012

We think highly of PolySpot. The open source search experts have published a new white paper titled “How Agile Enterprise Search Infrastructure Can Help CIOs.” The paper is a must-read for any organization working to bridle its data. The summary states:

“Implementing advanced enterprise search applications yields significant [benefits] and helps solve major technical and business issues:

  • Facilitate access to valuable information through a single gateway or profiled pushed content
  • Deliver comprehensive information, not fragmented information
  • Increase employees satisfaction thanks to higher relevance and findability
  • Ease expertise finding
  • Enhance access to legacy information systems
  • Provide access to time sensitive information such as pricing, regulations, and procedures
  • Federate legacy document management content locked in proprietary systems
  • Reduce harmonization costs and contain software license fees”

Founded in 2001 and headquartered in Paris, PolySpot designs search and information access solutions that help clients around the world boost efficiency even as big data multiplies at an astounding rate. Their tools offer universal connectivity, covering all business needs and ensuring that organizations can access the data they need, regardless of its structure, format, or origin. They handle structured data with aplomb, of course, but pride themselves on their innovative cross-functional solution to unstructured data.

Cynthia Murrell, April 11, 2012

Sponsored by Pandia.com

Open Source Analytics Information Service Now Available

April 9, 2012

ArnoldIT has rolled out The Trend Point information service. Published Monday through Friday, the information service focuses on the intersection of open source software and next-generation analytics. The approach will be for the editors and researchers to identify high-value source documents and then encapsulate these documents into easily digested articles and stories. In addition, critical commentary, supplementary links, and important facts from the source document are provided. Unlike a news aggregation service run by automated agents, librarians and researchers use the ArnoldIT Overflight tools to track companies, concepts, and products. The combination of human-intermediated research with Overflight provides an executive or business professional with a quick, easy, and free way to keep track of important developments in open source analytics. There is no charge for the service.

Stories include:

According to the publisher, Stephen E Arnold:

We believe that commercial abstracting and indexing services have become untenable for the busy professional. We have combined traditional indexing, literature reviews, and critical commentary which help reduce the time required to pinpoint the meaningful information in this exploding open source analytics field.

Our business model is to provide high value information without a fee. Individuals, law firms, and private equity firms wanting additional information about the people, companies, and products we cover are free to contact us. Like other professional services firms, we rely on motivated individuals with an information need to tap into our full-scale, in-depth research.

What sets TheTrendPoint and other ArnoldIT.com information services apart is that their approach is similar to that used by commercial information services such as Medline and Disclosure, two information services designed to make reference services more useful.

At this time, TheTrendPoint.com is designed to complement the finding services which ArnoldIT.com publishes. ArnoldIT.com is one of the leading sources of information on subjects ranging from search and content processing to next-generation intelligence systems.

New content is added to the service Monday to Friday. For more information about the service, contact the publisher at seaky2000 at yahoo dot com.

Kenneth Toth, April 9, 2012

Sponsored by Pandia.com

Google in the World of Academic Research

April 5, 2012

Librarians, teachers, and college professors all press their students not to use Google to research their projects, papers, and homework, but it is a losing battle. All students have to do is type in a few key terms and millions of results are displayed. The average student or person, for that matter, is not going to scour through every single result. If they do not find what they need, they simply rethink their initial key words and hit the search button again.

In “Of Google and Scholarly Search,” The Hindu recently wrote about the troubles researchers face when they use only Google and made several suggestions for alternate search engines and databases.

Google has tried to combat its “low academic quality” results with Google Scholar and Annotum. Google Scholar is the equivalent of a regular academic database, except it doesn’t always return full text articles. Annotum takes a different approach by changing the search configuration altogether. It is a scholarly blog platform, where experts can share their knowledge without being bogged down by personal opinion, rants, and other social networking content (Annotum was preceded by Knol, but Google is eliminating that service).

There are other tools to help the wayward researcher. The search engines Hakia, Kngine, Sensebot, and DuckDuckGo use semantic search technology instead of the usual Google formula. While they are not strictly research search engines, they do provide you with a more logical approach to search than returning every web site where the key term pops up. One semantic search engine that eliminates the usual everyman search is Deepdyve. You won’t be able to look for pop culture references with it, but it will give you more authoritative sources than Google.

If one needs information specifically on the sciences, Web of Science and SciVerse ScienceDirect are university-approved databases that host millions of articles and abstracts from scientific journals, track research data, and connect users with other researchers. Another topic that is of current interest in the IT world is patents. Google, Apple, and Microsoft are all racing to create the next big technological craze, but they research patents to make sure their competitors haven’t gotten there first. Micropatent, SumoBrain, and Relecura are the top patent databases on the web used by industry and business heads.

While Google may provide the easiest way to access information, it is hardly the best for research. Use the above search engines and web sites to improve research quality and not just receive quantity from Google.

Whitney Grace, April 5, 2012

Sponsored by Pandia.com

Amazon Web Services Explained

April 2, 2012

You can make the Amazon cloud work for you if you attend to the type of information we found here: Digg presents “Cracking the Cloud: an Amazon Web Services Primer.” The article notes:

It’s safe to say that Amazon Web Services (AWS) has become synonymous with cloud computing; it’s the platform on which some of the Internet’s most popular sites and services are built. But just as cloud computing is used as a simplistic catchall term for a variety of online services, the same can be said for AWS—there’s a lot more going on behind the scenes than you might think.

Writer Matthew Braga goes on to elaborate in detail on the workings of AWS. He defines and explains Elastic Compute Cloud (EC2), Elastic Load Balancing (ELB), Elastic Block Store (EBS), and Simple Storage Service (S3). Braga emphasizes that these are just the core components, and that there are many other features of AWS that he doesn’t have space to cover here. What he does describe, though, is quite useful information for the tech reader.

Cynthia Murrell, April 2, 2012

Sponsored by Pandia.com
