August 7, 2012
Do you need to pull SharePoint Fast Search crawl logs? We do. We read with interest an item on Microsoft’s TechNet Web site. “Get SharePoint Search Crawl Logs” provides an almost ready-to-run script which will accept a search service name and display the associated crawl logs. If there is a crawl log with an error, the script flags that instance. The script can be edited so that it returns different information from the crawl logs. To make this tweak, edit $crawlLogFilters.
SharePoint Fast usually does an excellent job of processing content. However, some documents can be malformed or an unexpected network issue can arise. As a result, certain content can be skipped or ignored. A visual inspection of crawl logs is not practical when SharePoint is processing large volumes of content.
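To illustrate why automated filtering beats visual inspection, here is a minimal sketch in Python. It is not the TechNet script. It assumes the crawl log has already been exported to CSV text, and the column names (Url, Status, Message), the sample rows, and the function name flag_problem_rows are all hypothetical.

```python
import csv
import io

# Hypothetical export: these column names and rows are assumptions made for
# illustration, not the output format of the TechNet script.
SAMPLE_EXPORT = """Url,Status,Message
http://intranet/docs/a.docx,Success,Crawled
http://intranet/docs/b.pdf,Error,The item could not be parsed
http://intranet/docs/c.xlsx,Warning,Item truncated
http://intranet/docs/d.pptx,Error,Network timeout
"""


def flag_problem_rows(csv_text, statuses=("Error",)):
    """Return the log rows whose Status matches one of the given values."""
    wanted = {s.lower() for s in statuses}
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["Status"].lower() in wanted]


errors = flag_problem_rows(SAMPLE_EXPORT)
for row in errors:
    # Print one line per flagged item so skipped content is easy to spot.
    print(f"{row['Status']}: {row['Url']} ({row['Message']})")
```

Passing statuses=("Error", "Warning") widens the filter, which is roughly the kind of change the post describes making through the script's filter settings.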
If you want to view the crawl logs, TechNet provides a wealth of information. A good place to begin your investigation is in the TechNet Library. If you want to export the SharePoint 2010 search crawl logs, you will find a useful PowerShell script in Dave Mc’s Blog in the article “Export the SharePoint 2010 Search Crawl Log.” MSDN also provides information about exporting SharePoint 2010 search crawl logs. To access this information, navigate to the SharePoint Escalation Team’s blog.
Search Technologies’ team of experienced engineers can provide automation tools which eliminate the need to search for solutions to common problems. To learn more about our SharePoint and Fast Search implementation services, navigate to http://www.searchtechnologies.com/microsoft-search.html or contact us at email@example.com.
Iain Fletcher, August 7, 2012
Sponsored by Augmentext
July 30, 2012
Apparently a SWAT Team is just what the doctor ordered in the case of Microsoft’s consumer initiatives.
Microsoft is calling upon former Clinton advisor and PR maven Mark Penn to lead a “SWAT Team” focused on consumer initiatives and developing strategic development and branding to meet consumers’ changing needs. The first target of the team will be Microsoft’s search engine, Bing.
“‘Mark has an incredible background in research, demographics, marketing and positioning and a proven history in developing unique insights that drive success,’ [Microsoft CEO Steve] Ballmer said in a statement. ‘With a strong set of products and an exciting pipeline for the next year, Mark’s experience and out-of-the-box thinking will help us more effectively reach new consumers and grow market share.’”
Despite Microsoft’s best efforts, Bing is still holding steady at second in popularity to search king Google. Searching has become synonymous with “Googling,” and Penn has his work cut out for him. I wonder if someone in Microsoft management used a decision engine to answer the question, “How do we catch Google in search?” Perhaps Penn will be the answer.
Andrea Hayden, July 30, 2012
Sponsored by IKANOW
July 26, 2012
There is a very enlightening source of reading references to be found in Jeff Huang’s “Best Paper Awards in Computer Science.” He conveniently provided a list of informative papers neatly categorized by area of expertise, like artificial intelligence or human computer interaction.
While scrolling down the list, two interesting papers seemed to jump right out.
The first, “Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections,” describes its approach as:
“A novel approach for inducing unsupervised part-of-speech taggers for languages that have no labeled training data, but have translated text in a resource-rich language. Our method does not assume any knowledge about the target language (in particular no tagging dictionary is assumed), making it applicable to a wide array of resource-poor languages. We use graph-based label propagation.”
The second paper, “How does search behavior change as search becomes more difficult?”, describes research on search behavior and concludes:
“When having difficulty in finding information, users start to formulate more diverse queries, they use advanced operators more, and they spend a longer time on the search result page as compared to the successful tasks. The results complement the existing body of research focusing on successful search strategies.”
Researchers are consistently developing models to predict and understand changes in text entry. Sadly, most of the models fail to account for varying system parameters, the ever changing human factor, or their evolving relationship.
The latter explains the dumbing of search…but they were interesting reads.
Jennifer Shockley, July 26, 2012
July 19, 2012
In a $40 million deal, Taume announces, “IHS Acquires Trident Capital’s Invention Machine.” The write up also notes IHS’ previously announced purchase of GlobalSpec for $135 million. The company expects that Invention Machine’s Goldfire business intelligence chops and GlobalSpec’s vertical search, product information, and global access point capabilities will combine to:
“. . . transform our existing engineering specifications and standards business to long-term double-digit growth, and accelerate the IHS Product Design business by increasing the value we offer to engineers, researchers and scientists by connecting innovation to knowledge workers,” said Jerre Stead, IHS chairman and chief executive officer. “With Invention Machine’s Goldfire as the front-end, we will bring together all IHS content, insight and tools into an innovative solution that will address many of the unsolved problems facing engineers. This will enable greater productivity, accuracy and design quality, and help customers accelerate innovation and deliver superior products and services.”
Invention Machine makes its home in Boston, with offices in London, Frankfurt, Paris, Tokyo, and Minsk, Belarus. They call their Goldfire “the optimal decision engine,” created to help make clients more productive. Trident Capital is a venture capital and private equity firm founded in 1993 that specializes in business service and I.T. investments.
Designed with engineers in mind, GlobalSpec supplies its customers with domain-specific vertical search engines. The firm is headquartered in East Greenbush, N.Y.
Headquartered in Englewood, Colorado, IHS operates in over 30 countries and employs over 5,500 workers. This information powerhouse was founded back in 1959 as a provider of product catalog databases on microfilm for aerospace engineers. Wow, who here remembers microfilm? Kudos to IHS for keeping up with the times!
Cynthia Murrell, July 19, 2012
Sponsored by PolySpot
July 19, 2012
SharePoint developers are eagerly waiting for SharePoint 2013. A blogger at the Microsoft Blogs wrote “SharePoint 2013-Initial Take On Changes To Search” and he has been viewing a lot of slideshows on the new version. His favorites are at SharePoint 2012: Presentation: IT Pro Training and all are easy to download. He takes a look at Module 7: SharePoint Search 2013, which takes an in-depth view into enterprise search, including architectural changes to physical and logical topologies and configuration details on crawling, content, and query.
Fast Search functionality is behind much of the SharePoint 2013 enterprise search capability:
“From the SharePoint 2013 slides, it’s pretty clear that the rumors have played out and core components of SharePoint Search (particularly the Indexing pipeline) effectively got replaced by the Fast Search pipeline… although it will maintain the ‘SharePoint Search’ moniker (Disclaimer: I’m not a marketing guy and have no idea what the licenses will be, so this is just my observation).”
There is a lot of content to digest from the presentations, but the article pulls out the very detailed and informative diagrams to show how Fast Search has changed and will change the search architecture for SharePoint. With more than 30,000 consultant days of Fast implementation experience at Search Technologies, we will be gearing up early to support SharePoint 2013 Search rollouts.
Iain Fletcher, July 19, 2012
Sponsored by Search Technologies
July 12, 2012
The SharePoint Blog contained a very informative explanation of SharePoint “refiners.” Refiners, according to Microsoft, “enable end-users to drill down into their search results based on managed properties that are associated with the indexed search items, such as creation date, author, and company names.”
Custom SharePoint 2010 Search Refiner – Displaying Range of Choices is a presentation of information which originally appeared in the ShareMuch blog. The write up is, in my opinion, quite useful. The information provides a streamlined explanation of how to implement a refiner in a SharePoint 2010 installation. The write up provides an XML snippet which makes the addition of a refiner quick and easy.
The article explains:
MappedProperty maps to an actual managed property that you must define or is already defined in search service application. The SortBy defines, in this case, a custom filter right below the category. The CustomFilters node’s MappingType property means we’ll have a custom filter. In our case, we’re using a range mapper, meaning that whatever value are going to be in the managed property, our filter will display UI based on the range of those and let user toggle the display based on that range. I hope this makes sense. The DataType has only 3 types, so please don’t make the same mistake I did and try to guess the value, it’s limited to “Numeric”, “DateTime”, “String”. The CustomValue inside CustomFilter specifies the user friendly value and the OriginalValue defines the range. In our example, the “Size” property is measured in Bytes so “..1” means range anywhere from 0 bytes to 1 byte. It happens that list items and lists in search results are less than 1 byte in size which means that we can refine by list items and lists results by capturing items with size less than 1 byte. Everything else is a document.
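The range idea in the quoted passage can be sketched in a few lines of Python. This is only an illustration of "low..high" range strings such as “..1”, not SharePoint's implementation and not the XML from the article. The filter labels, the function names, and the choice to treat the upper bound as exclusive are assumptions.

```python
def parse_range(spec):
    """Split a 'low..high' string into (low, high); an empty side is unbounded."""
    low, _, high = spec.partition("..")
    return (float(low) if low else None, float(high) if high else None)


def in_range(value, spec):
    """Check a value against a range string (upper bound exclusive by assumption)."""
    low, high = parse_range(spec)
    if low is not None and value < low:
        return False
    if high is not None and value >= high:
        return False
    return True


# Hypothetical filters: a user friendly label paired with a range string,
# loosely mirroring the CustomValue / OriginalValue pairing described above.
FILTERS = [
    ("List items and lists", "..1"),
    ("Documents", "1.."),
]


def label_for(size_bytes, filters=FILTERS):
    """Return the label of the first filter whose range contains the size."""
    for label, spec in filters:
        if in_range(size_bytes, spec):
            return label
    return None


print(label_for(0))     # an item reported with size 0
print(label_for(2048))  # a 2 KB document
```

Under these assumptions, anything reported with a size below 1 byte lands in the first bucket and everything else is treated as a document, which is the split the article describes.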
Search Technologies implements “refiners” as well as other advanced features of SharePoint. If you want to extend SharePoint and make the system deliver even greater value to your users, contact Search Technologies.
Iain Fletcher, July 12, 2012
July 10, 2012
InetSoft Technology is mashing search technologies together with the availability of new custom data connectors for popular enterprise applications. This big mixing bowl will add connectors to the list of supported third-party data sources that do not already have open standards based connectivity according to Times Union’s article, “InetSoft Adds Google Analytics, AdWords, and Microsoft SharePoint as Data Sources for BI Dashboarding”.
The new InetSoft technology provides a smooth mix with an efficient transformation, as:
“Style Intelligence is a full-featured business intelligence solution for dashboard reporting that includes a powerful data mash-up engine. End-users get visually compelling, highly interactive access to data, and IT gets a highly customizable, easy to learn and quick to deploy business intelligence toolset and information delivery platform. Data mash-up capabilities allow for the integration of disparate data sources, enabling agile development and providing maximum self-service, while the application’s SOA architecture and open standards-based technology make for an ideal embedding and integration-ready application for dashboards, production reporting, and visualization.”
Those who take advantage of this new quick mix technology will get compelling visuals, along with highly interactive access to data. The IT department will find the technology easy to learn and highly customizable with a convenient business intelligence toolset and information delivery platform. The end result: InetSoft has mashed up search technologies into a big mixing bowl of efficiency.
Jennifer Shockley, July 10, 2012
June 22, 2012
The concept of “virtual documents” will be a familiar one for many search engine professionals. Simply put, it means assembling an indexable record in a search engine from constituent parts that otherwise exist in different places. A recently posted staff blog on Search Technologies’ Web site provides an excellent example of how virtual documents can directly address a business need.
The perspective of the searcher is often not well served by existing content structures.
The “people search” issue described by the article is a common one, and the case study shown clearly illustrates the value of virtual documents.
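A minimal sketch of the idea, assuming a people search scenario in which facts about a person live in two separate systems. The source names, field names, and records below are hypothetical and are not taken from the Search Technologies case study.

```python
from collections import defaultdict

# Hypothetical source systems: an HR directory and a project tracker.
HR_RECORDS = [
    {"employee_id": "e1", "name": "Ann Lee", "department": "Legal"},
    {"employee_id": "e2", "name": "Raj Patel", "department": "Engineering"},
]
PROJECT_RECORDS = [
    {"employee_id": "e1", "project": "Contract review"},
    {"employee_id": "e2", "project": "Search migration"},
    {"employee_id": "e2", "project": "Crawl log audit"},
]


def build_virtual_documents(people, projects):
    """Assemble one indexable record per person from the two sources."""
    projects_by_person = defaultdict(list)
    for entry in projects:
        projects_by_person[entry["employee_id"]].append(entry["project"])

    documents = []
    for person in people:
        documents.append({
            "id": person["employee_id"],
            "name": person["name"],
            "department": person["department"],
            # Fields gathered from the second system become part of the same record.
            "projects": projects_by_person.get(person["employee_id"], []),
        })
    return documents


virtual_docs = build_virtual_documents(HR_RECORDS, PROJECT_RECORDS)
for doc in virtual_docs:
    print(doc)
```

Each resulting dictionary is the kind of record a search engine could index as a single document, even though no such document exists in either source system.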
Read on at Virtual Documents, Search the Impossible Search.
Iain Fletcher, June 22, 2012
Sponsored by Search Technologies
June 13, 2012
“SharePoint Search, Synonyms, Thesaurus, and You” provides a useful summary of Microsoft SharePoint’s native support for controlled term lists. Today, the buzzwords taxonomy and ontology are used to refer to term lists which SharePoint can use to index content. Term lists may consist of company-specific vocabulary, the names of people and companies with which a firm does business, or formal lists of words and phrases with “Use for” and “See also” cross references.
The importance of a controlled term list is often lost when today’s automated indexing systems process content. Almost any search system benefits when the content processing subsystem can use a controlled term list as well as the automated methods baked into the indexer.
In this TechGrowingPains write up, the author says:
A little known, and interesting, feature in SharePoint search is the ability to create customized thesaurus word sets. The word sets can either be synonyms, or word replacements, augmenting search functionality. This ability is not limited to single words, it can also be extend into specific phrases.
The article explains how controlled term lists can be used to assist a user in formulating a query. The method is called “replacement words”. The idea of suggesting terms is a good one which many users find a time saver when doing research. The synonym expansion function is mentioned as well. SharePoint can insert broader terms into a user’s query which increases or decreases the size of the result set.
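The difference between replacement and expansion can be sketched in Python. This is a simplified illustration of the two word set types, not SharePoint's thesaurus format or the recipe from the article. The term lists and the function name rewrite_query are hypothetical, and the sketch handles single words only, not phrases.

```python
# Hypothetical word sets for illustration only.
REPLACEMENTS = {"ms": "microsoft"}               # the typed word is swapped out
EXPANSIONS = {"car": ["automobile", "vehicle"]}  # synonyms are added alongside


def rewrite_query(query):
    """Apply replacements first, then append synonyms for each resulting term."""
    terms = []
    for word in query.lower().split():
        word = REPLACEMENTS.get(word, word)
        terms.append(word)
        terms.extend(EXPANSIONS.get(word, []))
    return terms


print(rewrite_query("MS car"))
```

A replacement changes what the user searched for, while an expansion only adds alternatives, so expansion tends to enlarge the result set.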
The centerpiece of the article is a recipe for activating this functionality. A helpful code snippet is included as well.
If you want additional technical support, let us know. Our Search Technologies team has deep experience in Microsoft SharePoint search and customization. We can implement advanced controlled term features in almost any SharePoint system.
Iain Fletcher, June 13, 2012
May 10, 2012
Security is a topic which is getting increased attention, particularly in the SharePoint community. I want to call attention to “Microsoft SharePoint and LinkedIn Data at Risk from Framesniffing Attacks” from ITWire.com. The Safari, Chrome, and Internet Explorer Web browsers are inadvertently allowing hackers to steal information from private Microsoft SharePoint Web sites and mine data from public Web sites like LinkedIn.
A Framesniffing Attack occurs when a hidden HTML frame loads a target Web site in the hacker’s Web page to mine information about the content and structure of the framed pages. The hacker can then overcome browser securities and read the sensitive information.
As explained in the ITWire.com article:
“Paul Stone, senior security consultant at Context said, ‘Using Framesniffing, it’s possible for a malicious Web page to run search queries for potentially sensitive terms on a SharePoint server and determine how many results are found for each query. For example, with a given company name it is possible to establish who their customers or partners are; and once this information has been found, the attacker can go on to perform increasingly complex searches and uncover valuable commercial information.’”
The problem involves the X-Frame-Options header, which tells a Web browser not to display a page inside a frame. SharePoint does not send this header by default. Microsoft has stated that the next SharePoint version will set the X-Frame-Options header, but until then, SharePoint gurus, it is up to you to find a solution. If your organization discovers a way to keep its information from prying eyes, you will still need a way to find the data.
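To show what setting the header looks like in practice, here is a minimal sketch of a generic Python WSGI wrapper that adds X-Frame-Options to every response. It is not a SharePoint or IIS configuration, which would be done differently. The names add_frame_options and demo_app are hypothetical, and SAMEORIGIN is chosen only as an example policy.

```python
def add_frame_options(app, policy="SAMEORIGIN"):
    """Wrap a WSGI app so every response carries an X-Frame-Options header."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Drop any existing value so the header appears exactly once.
            headers = [(k, v) for k, v in headers if k.lower() != "x-frame-options"]
            headers.append(("X-Frame-Options", policy))
            return start_response(status, headers, exc_info)
        return app(environ, patched_start)
    return wrapped


def demo_app(environ, start_response):
    """Stand-in application that returns a plain text body."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"search results"]


protected_app = add_frame_options(demo_app)
```

With SAMEORIGIN, a browser that honors the header will refuse to render the page inside a frame hosted on another site, which removes the hidden frame the attack depends on.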
Search Technologies implements solutions which are secure and do not impede findability or system performance. For more information, navigate to www.searchtechnologies.com.
Iain Fletcher, May 10, 2012