Search and Virtualization
March 1, 2011
Quick. What enterprise search vendors’ systems permit virtualization? The answer is that the marketing professional from any search firm will say, “We do.” However, the technology professional who rarely speaks to customers will say, “Well, that is an interesting question.”
Virtualization is turning big honking servers into lots of individual machines or servers. Virtualization is easy to talk about as search vendors tout their systems’ capabilities as business intelligence services. But in our experience, it remains both science and art. Another way to describe virtualization and search is “research project.”
Our contributing writer Sarah Rogers reports:
The commercial climate for virtualization is changing. Business intelligence (BI) represents just one force exerting its influence. As the needs of numerous businesses reach levels where accessing, housing and reviewing information are yesterday’s problems, the new focus becomes how to maximize efficiency without renting secondary office space to handle the servers required. Many are turning to virtualization.
But virtualization isn’t all perks, as examined in “Are SQL Server BI systems compatible with virtualization?”. Systems operating under the BI umbrella will not always function at full capacity when running on a virtualized platform. Contemporary BI tools build detail-heavy analytical models in memory on demand. These analytical systems are often designed to hold vast amounts of data in memory, which can create access bottlenecks on a virtualized platform. Another issue is what is described as overcommitment, where a host promises its guests more memory in total than it physically has. A fine idea, though again analytical systems may exceed their allotment and degrade performance.
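To make the overcommitment point concrete, here is a minimal sketch in Python. Every number and guest name in it is hypothetical and chosen only for illustration; none comes from the article cited above. It adds up the memory assigned to each guest and compares the total with what the host physically has.

```python
# Hypothetical figures for illustration only; none come from the cited article.
HOST_PHYSICAL_GB = 64

# Memory assigned to each guest (virtual machine), in GB. Names are made up.
guest_assigned_gb = {
    "bi-analytics": 48,
    "search-index": 24,
    "web-frontend": 8,
}


def overcommit_ratio(assigned_gb, physical_gb):
    """Return total assigned guest memory divided by host physical memory."""
    return sum(assigned_gb.values()) / physical_gb


ratio = overcommit_ratio(guest_assigned_gb, HOST_PHYSICAL_GB)

# How far short the host falls if every guest claims its full allotment at once.
shortfall_gb = max(0, sum(guest_assigned_gb.values()) - HOST_PHYSICAL_GB)

print(f"Overcommit ratio: {ratio:.2f}")
print(f"Shortfall if every guest uses its full allotment: {shortfall_gb} GB")
```

A ratio above 1.0 means the host is betting that its guests will not all ask for their memory at the same moment. An in-memory analytical system is exactly the sort of guest that may call that bet.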
Though traditional databases are suited to disambiguate these compatibility issues, they seem to be struggling, awash in the flood of their in-memory counterparts. At least that is one opinion floating about. It is clear that other variables exist that will spoil the math when looking to pass through to the other side. So here is another opinion: the physical database does still have a viable role. Why not keep your options open?
Sarah Rogers, March 1, 2011
Freebie
Sophia Search Co-Founder Speaks
March 1, 2011
Sophia Search offers an alternative to key word retrieval. What’s the secret behind this new system? The Search Wizards Speak series provides some insight into Sophia Search with its most recent interview with Dr. David Patterson.
You can read an exclusive interview with the co-founder of the Belfast, Northern Ireland-based enterprise search vendor Sophia Search on ArnoldIT.com. Dr. Patterson explains his search system’s use of semiotics to discern the meaning of textual information. The result is that a user finds the information required more quickly, thus reducing the need to run multiple queries or plow through a long laundry list of query results.
In the interview, Dr. Patterson said:
I prefer to call Sophia a “contextual discovery engine.” Sophia can automatically disambiguate the different meanings of words based on their context within a document. In short, Sophia searches by the meaning of what the user is looking for as opposed to just the key words they use in their query. Sophia enables users to discover contextually relevant information they were previously unaware of, and it increases the users’ understanding of their content. One of the benefits of our technical approach is that Sophia operates without human guidance or training, and it does not require taxonomies, ontologies or thesauri.
He added:
Conventional search tools and systems don’t address the discovery component of search. How can the user query for information they don’t know exists? Finally, we were fascinated by solving what we call “the context problem”. Most systems simply do not understand the context of information. Therefore, most search and retrieval systems provide a lot of irrelevant hits to the user. Sophia is all about context and providing users with relevant information in the right context. It is about understanding the meaning of what the user is looking for, not simply returning lists of documents just because they contain the user’s query terms.
You can examine a screen shot of the Sophia Search output along with Dr. Patterson’s comments about the system and method used in this enterprise search system.
You can get the full text of the interview at this link.
Stephen E Arnold, March 1, 2011
Freebie
Arnold Columns for March 2011
March 1, 2011
The for-fee columns for March 2011 cover a range of topics. If you want to access the full text of these documents, please contact the publishers who own the rights to the versions of the write ups I submitted this month.
ETM (published by IMI Publishing in London), “Choice: A Growing Problem in Enterprise Search.” Google seems to be trimming its product line but other vendors are expanding theirs.
Information Today (published by Information Today), “Autonomy’s Surprise Move into Health Care.” Who would have thought that Autonomy would dive into improving what a health care worker does for a patient and toss in access to third party journal papers?
Information World Review (published by Bizmedia in London), “Search 2011: Shape Shifting Accelerates.” This year promises to be one in which search is subjected to some potentially severe earthquakes or user pushbacks.
KMWorld (published by Information Today), “Sophia and Semiotics for Enterprise Search.” Semantics is yesterday. Semiotics is the future at Sophia Search. Don’t know what semiotics is? Read the KMWorld story when it comes out in a couple of months.
Online Magazine (published by Information Today), “Tracking Solr Activity” reviews some places to locate information about open source search and some of the cost factors to consider when deciding whether to go open source or proprietary search.
Smart Business Network, possibly available online and in print. I am not really sure at this time. The article is “StumbleUpon: A Dark Horse in the Web Traffic Race”. The point is that one can advertise on StumbleUpon and use a free promotional service provided by the online service.
The research and writing in the ArnoldIT.com for-fee work are more detailed than the information that appears in the Beyond Search blog, on the ArnoldIT.com Web site, and in our data fusion blog IntelTrax. An explanation of the differences is at this link.
If you want content for your technical or business blog, write us at seaky20000 at yahoo dot com. Our team of writers is able to produce high-quality writing at a competitive price.
We will be announcing two new Web logs in the next few weeks.
Stephen E Arnold, March 1, 2011
Freebie, but the publishers listed in this story pay me to write articles for their readers. The total circulation of these publications is in the range of 100,000 readers per issue across all six publications.
Capacity Planning in SharePoint Server 2010
March 1, 2011
“Storage and SQL Server Capacity Planning and Configuration (SharePoint Server 2010)” explains how to plan for and configure the storage and Microsoft SQL Server database tier in your Microsoft SharePoint Server 2010 environment. The article states:
“Because SharePoint Server often runs in environments in which databases are managed by separate SQL Server database administrators, this document is intended for joint use by SharePoint Server farm implementers and SQL Server database administrators. It assumes significant understanding of both SharePoint Server and SQL Server.”
With that as a given, the capacity planning is outlined through several steps. There is a summary of the databases installed with SharePoint Server 2010 and directions for estimating the IOPS. The article recommends that you run your environment on the Enterprise Edition of SQL Server 2008 or SQL Server 2008 R2.
The write up advises you on choosing a storage architecture, disk types, and RAID types. There is a table of guidelines to estimate memory requirements and some advice on network topology requirements. The “Configure SQL Server” section advises that SharePoint Server 2010 was meant to run on several medium-sized servers rather than a couple of large ones. The final point in the article provides general guidance for monitoring the performance of your system.
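For a feel of the arithmetic behind this kind of planning, here is a rough sketch in Python of a content database size estimate. The formula is the one we recall from the Microsoft article: documents times versions times average document size, plus about 10 KB of metadata per list item and per document version. Treat both the formula and the sample inputs as assumptions and check them against the original before you order disks. The function and parameter names are our own.

```python
def content_db_size_kb(documents, versions, avg_doc_kb, list_items, metadata_kb=10):
    """Estimate content database size in KB.

    Assumed formula (verify against the Microsoft article):
        ((D x V) x S) + (metadata_kb x (L + (V x D)))
    where D = documents, V = versions per document,
    S = average document size in KB, L = list items.
    """
    document_storage = documents * versions * avg_doc_kb
    metadata_storage = metadata_kb * (list_items + versions * documents)
    return document_storage + metadata_storage


# Sample inputs are assumptions, not measurements from a real farm.
size_kb = content_db_size_kb(
    documents=200_000,
    versions=2,
    avg_doc_kb=250,
    list_items=600_000,
)
size_gb = size_kb / 1024 ** 2

print(f"Estimated content database size: {size_kb:,} KB (about {size_gb:.0f} GB)")
```

The estimate covers the content database only. It says nothing about IOPS, memory, or the other SharePoint databases, each of which needs its own estimate.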
This article makes me glad that I am not a database administrator. With the huge volumes of data that are found on SharePoint, it can be difficult enough to wield as a front-end user. It reminds me that the more data one has, the more important indexing and semantics become for navigating the wealth of information that someone else plans how to store. Keep in mind that Search Technologies can assist you with your SharePoint capacity planning from the perspective of searchability.
Stephen E Arnold, March 1, 2011
For Search Technologies
Protected: SharePoint Excitement: The Content API Fizzles
March 1, 2011