Specialized Search Engine Helps Diagnose Rare Diseases

April 3, 2013

A recent piece from the MIT Technology Review that examines “The Rare Disease Search Engine That Outperforms Google” compares apples with oranges. The real takeaway is much bigger than a swipe at Google—that technical innovation is being used to help humanity.

Rare diseases are notoriously difficult to diagnose, and medical professionals have been using an Internet search engine, usually Google, to help with the process for years. Of course, Google was not designed for that use, so researchers have created a tailor-made engine to streamline this difficult but essential task. The article informs us:

“Radu Dragusin at the Technical University of Denmark and a few pals unveil an alternative. These guys have set up a bespoke search engine dedicated to the diagnosis of rare diseases called FindZebra, a name based on the common medical slang [“zebra”] for a rare disease. After comparing the results from this engine against the same searches on Google, they show that it is significantly better at returning relevant results.”

Is this supposed to be a surprise? Google does ads, not rare diseases. Ah well, the important thing is that doctors have a powerful new tool to help folks with diseases that stoutly defy accurate identification. How did the team from the Technical University of Denmark do it? The write-up goes on to say:

“The magic sauce in FindZebra is the index it uses to hunt for results. These guys have created this index by crawling a specially selected set of curated databases on rare diseases. . . . They then use the open source information retrieval tool Indri to search this index via a website with a conventional search engine interface. The result is FindZebra.”
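
The idea in that passage, a small index built only from curated sources and then queried, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not FindZebra's code: the three documents are made-up stand-ins for crawled pages, and the scoring is a simple TF-IDF sum rather than the retrieval model Indri uses.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stand-ins for pages crawled from curated rare-disease sources.
DOCS = {
    "fabry": "Fabry disease causes burning pain in hands and feet and kidney failure",
    "marfan": "Marfan syndrome causes tall stature long limbs and aortic aneurysm",
    "pompe": "Pompe disease causes muscle weakness and respiratory failure",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term count}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, n_docs, query):
    """Rank documents by a simple TF-IDF sum over the query terms."""
    scores = Counter()
    for term in set(tokenize(query)):
        postings = index.get(term, {})
        if not postings:
            continue
        # Terms found in fewer documents carry more weight.
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return [doc_id for doc_id, _ in scores.most_common()]
```

Because the index holds nothing but the curated pages, a symptom query such as "burning pain kidney failure" can only match rare-disease entries, which is the advantage the researchers claim over a general Web index.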

Though the zebra engine is still an in-progress research project, the team has made it publicly available at www.findzebra.com. Medical professionals can already use the innovation to help patients who might otherwise be doomed to years of painful frustration. Hooray, progress!

Cynthia Murrell, April 03, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

It Is Movie Search Time

March 25, 2013

Google, Bing, and DuckDuckGo are the primary search engines users turn to for locating information. One of the problems, even with advanced search options, is sifting through the search results. Any search expert will tell you that if the desired information is not on the first or second page of results, users move on. Does this call for specialization in search engines? It just might for a subject as all-encompassing as movies. MoreFlicks searches the popular video streaming Web sites Hulu, Netflix, Vudu, Fox, Crackle, and BBC iPlayer for movies and TV shows.

It takes a page out of Google’s book by displaying basic facts about a movie or show (summary, genre, and release date) along with where it can be viewed online. Search results can be sorted by genre, most popular, new arrivals, and what is soon expiring. It will come in handy when you are searching for an obscure title. One downside is that it only browses through legal channels, and YouTube has been given the boot for these results. MoreFlicks is a niche search engine, possibly the lovechild of Google and IMDB, but how long it stays depends on content relevance or until Google snaps it up. Zeus eating Athena anyone?

Whitney Grace, March 25, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

Experts Are Only A Search Away

February 18, 2013

Have you ever heard of Funnelback? Probably not, unless you are a search expert or come from the land down under. While the search experts are at work, allow me to explain a bit more about Funnelback. It is an enterprise and Web site search system whose ranking algorithm can be tuned. It offers customizable search results, editable search parameters, and a development platform for multimedia, e-commerce, e-mail media alerts, and plagiarism detection systems. The last option is one of the reasons Funnelback has gained a huge following in Australian universities.

If you are searching for a prime Funnelback experience, check out the University of Melbourne’s Web site with its “Find An Expert” search engine. The Funnelback “Find An Expert” searches through the university’s staff and faculty directory and retrieves experts related to the user’s keyword. In our example, a search for “politics” yielded 232 experts. The results can be filtered by department, type of politics, and topic. What makes Funnelback more entertaining than Google is that it creates a “capability map,” a visual representation of the search results and how they connect with each other. The capability map can be manipulated by filtering out or including other results.

Funnelback demonstrates that search can be entertaining and intelligent. When will Google add this to their search results page?

Whitney Grace, February 18, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

SOLR Relevancy Tuning from Search Technologies

February 11, 2013

Search Technologies introduced “Solr Lucene Relevancy Tuning.” Search Technologies will supply services to improve the relevancy of results within an existing Solr/Lucene implementation. If the service works as advertised, this could be a boon to many organizations awash with extraneous data. The announcement explains:

“This engagement will provide powerful relevancy ranking improvements in an existing Solr installation. This includes setting up a basic system for relevancy evaluation, based on a set of sample queries, so that improvements can be quantitatively measured. Additions to the default relevancy formula in Solr Lucene can dramatically improve search results, solving many of the most thorny relevancy problems including:

  • Reducing the impact of peripheral content (sidebars, ads, tangential discussions, etc.)
  • Automatically handling word phrases in a flexible manner, reducing the need to use complex query constructions to obtain good search results.”
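
The announcement mentions a relevancy evaluation system built on sample queries so that improvements can be measured. A minimal sketch of that idea follows, using precision at k as the metric. The metric choice, the query strings, and the document IDs are all assumptions made for illustration; the announcement does not say which measure Search Technologies uses.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that were judged relevant."""
    top_k = ranked_ids[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / len(top_k)


def mean_precision_at_k(run, judgments, k):
    """Average precision@k over every query in the sample set."""
    total = sum(
        precision_at_k(run.get(query, []), relevant, k)
        for query, relevant in judgments.items()
    )
    return total / len(judgments)


# Hypothetical relevance judgments for two sample queries.
JUDGMENTS = {"solr tuning": {"d1", "d3"}, "phrase search": {"d7"}}

# Hypothetical result lists before and after a relevancy change.
BASELINE = {"solr tuning": ["d2", "d1", "d5"], "phrase search": ["d4", "d7", "d9"]}
TUNED = {"solr tuning": ["d1", "d3", "d2"], "phrase search": ["d7", "d4", "d9"]}
```

Running `mean_precision_at_k` on both result sets with the same judgments gives one number per configuration, which is what makes a "before and after" comparison quantitative rather than anecdotal.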

The Search Technologies’ solution changes the default Solr/Lucene functionality, which can overemphasize document size and term frequency. Search Technologies’ new Parameterized Document Similarity Function provides more control over these formulas through configurable parameters. The company’s Gradient Proximity Boost operator eliminates the need to tweak Solr/Lucene’s default “hard window,” the term-proximity parameters which can trigger a document boost. The method does this by measuring the density and completeness of terms across each document, gradually boosting documents in which terms cluster.
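
To make the contrast between a "hard window" and a gradual boost concrete, here is a rough Python sketch. It is not Search Technologies' formula, which the announcement does not publish. The function names, the window size, and the way completeness and density are combined are all assumptions chosen for illustration.

```python
def min_span(tokens, terms):
    """Length of the shortest token window containing every term, or None."""
    needed = set(terms)
    if not needed:
        return None
    best = None
    counts = {}
    left = 0
    for right, tok in enumerate(tokens):
        if tok in needed:
            counts[tok] = counts.get(tok, 0) + 1
        # Shrink the window from the left while it still covers all terms.
        while len(counts) == len(needed):
            width = right - left + 1
            if best is None or width < best:
                best = width
            left_tok = tokens[left]
            if left_tok in counts:
                counts[left_tok] -= 1
                if counts[left_tok] == 0:
                    del counts[left_tok]
            left += 1
    return best


def hard_window_boost(tokens, terms, window=5, boost=2.0):
    """All-or-nothing: full boost only if every term fits inside the window."""
    span = min_span(tokens, terms)
    return boost if span is not None and span <= window else 1.0


def gradient_boost(tokens, terms, max_boost=1.0):
    """Gradual: boost grows with completeness and with how tightly terms cluster."""
    wanted = set(terms)
    present = [t for t in wanted if t in tokens]
    if not present:
        return 1.0
    completeness = len(present) / len(wanted)          # share of query terms found
    density = len(present) / min_span(tokens, present)  # 1.0 when terms are adjacent
    return 1.0 + max_boost * completeness * density
```

With the query terms "solr" and "tuning", a document where they sit three tokens apart gets the full hard-window boost, while one where they sit seven tokens apart gets none at all. The gradient version still rewards the second document, just less, so no window size has to be tuned.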

The post identifies the expected engagement tasks and deliverables associated with this software. The only prerequisite listed is the presence of a working Solr/Lucene system with already-indexed documents. The firm promises ongoing maintenance and support services, including an optional round-the-clock support package.

Founded in 2005, Search Technologies bills itself as the largest (independent) IT services company dedicated to search-engine implementation, consulting, and managed services. Staffed with veterans of the search field, the company prides itself on innovation. Search Technologies is headquartered in Herndon, Virginia, and maintains two other U.S. offices as well as locations in Berkshire, U.K., and San Jose, Costa Rica.

Ken Toth, February 11, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

Search Technologies Success Lies in Corporate Retreats

January 23, 2013

While many companies may see corporate retreats as an obvious place to cut spending, the co-founder and chief executive of Search Technologies believes retreats are some of the most valuable investments made by the company. In the Washington Post article “Value Added: This Herndon Search Company Found its Perfect Retreat in Costa Rica,” we learn how Kamran Khan of Search Technologies believes corporate retreats are crucial to the success of his growing business. The most recent off-site cost $100,000 in company money and took place in highly educated and tech-savvy Costa Rica.

The article explains the importance:

“Khan, who started Search Technologies in 2005, said it’s the only time when everyone in the company — including the management team — can be in one place. Khan uses the chance to address his 100-person staff, informing them of how the company is doing and outlining the goals for the next year. ‘I prefer to get people together and . . . clarify our strategy, which is very simple: We are going to be experts in the search space.’”

Khan and his team at Search Technologies may be onto something with this plan. Launched in 2005, the company was on track for $18 million in revenue for 2012, and the company’s net profit margin is about 5 percent. The IT services and search implementation software company services the Daily Mail newspaper’s Web site portfolio in Britain and helped Amazon.com launch its new cloud search product. Apparently the secret to success lies in Khan’s philosophy of hiring “good people” and taking beach trips. We have learned that Search Technologies is hiring in anticipation of further growth during 2013.

Andrea Hayden, January 23, 2013

SharePoint 2013 Offers Improvements in Search

January 10, 2013

An overall architecture for SharePoint 2013 Search can be found on the Search Technologies’ Web site.

As new releases tend to do, SharePoint 2013 has made some tweaks that users would do well to explore, we learn in “Search Engine Changes in SharePoint 2013” from iT Pro. SharePoint consultant Veena Sarda details the search-related changes and presents them in a handy chart.

The first thing to note is that FAST Search has now been worked into the SharePoint code base. That means that FAST capabilities like metadata extraction, visual search, and advanced linguistics are now part of the package. Content and analytics processors have been added to the logical architecture, and a specialized Search Administrator now manages these and other search-related components. Also new is a dedicated analysis engine, which performs both search and usage analytics.

Crawling has been improved; it is now possible to crawl HTTP sites anonymously, and the time for the index to merge and present those results has been dramatically shortened. Results rendering has been moved from the server to the client side. Document parsing is now much more refined, relying on a set of new parsing features rather than on file extensions to do the job.

Other welcome improvements affect the user experience. The UI has been revamped to accommodate the new features, with a re-design based on nested layout templates defined in JavaScript and HTML. This change allows for easier extensibility. Furthermore, end users now have an easier time of it; the write-up notes that the platform now provides:

“Direct access to the most granular information inside of sites and documents, and then enables users to act on the results without having to leave the results page. Every search box in every team site offers full access to enterprise-wide search, people search, and other specialized search experiences in addition to the traditional scoped site search.”

Part of this simplified workflow is the new Hover feature, which presents a visual preview of sites, documents, and conversations at the pause of a mouse.

A few more search-related improvements: Authors are identified as experts based on document content, where before they were identified by My Site profiles. People Search (which used to be independent of document search) has been integrated with the core results and can be targeted by name, location, phone number, and other properties.

Perhaps one of the most noteworthy shifts is the new Query Rules feature. SharePoint 2010 only allows for simple queries—one query, one set of results. Sarda writes:

“Query Rules are a new feature in SharePoint 13 that help act upon the ‘intent’ of a query – Query Rules are composed of three top level elements: Query Conditions (i.e. matching rules), Query Actions (i.e. what do you do when you find a match), Publishing Options (i.e. when should this rule be active). Query Rules allows to have search requests from a user trigger multiple queries and multiple result sets.”
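
The three-part structure Sarda describes (conditions, actions, publishing options) can be modeled in a short Python sketch. This is a conceptual illustration only and not the SharePoint API: the `QueryRule` class, the `expand_query` function, the sample rule, and the `type:form` suffix are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Optional


@dataclass
class QueryRule:
    """Hypothetical model of a rule: condition, action, publishing window."""
    condition: Callable[[str], bool]      # does the rule match this query?
    action: Callable[[str], List[str]]    # extra queries to run on a match
    active_from: Optional[date] = None    # publishing option: start date
    active_until: Optional[date] = None   # publishing option: end date

    def is_published(self, today: date) -> bool:
        if self.active_from and today < self.active_from:
            return False
        if self.active_until and today > self.active_until:
            return False
        return True


def expand_query(query: str, rules: List[QueryRule], today: date) -> List[str]:
    """Return the original query plus any queries triggered by matching rules."""
    queries = [query]
    for rule in rules:
        if rule.is_published(today) and rule.condition(query):
            queries.extend(rule.action(query))
    return queries


# Hypothetical rule: queries mentioning "expense" also trigger a forms lookup,
# but only until the end of 2013.
RULES = [
    QueryRule(
        condition=lambda q: "expense" in q.lower(),
        action=lambda q: [q + " type:form"],
        active_until=date(2013, 12, 31),
    )
]
```

Each query returned by `expand_query` would yield its own result set, which is how a single request from a user ends up producing multiple queries and multiple result blocks.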

A welcome addition. For more information on SharePoint 2013, see the “brief functional walk-through” posted at Search Technologies. It contains, among other things, an easy-to-understand flow chart. The SharePoint experts there also promise to post future updates at that link.

Search Technologies leverages search engines to provide business advantages to their clients. With over twenty years of experience in the field, the company asserts that it is the largest IT services company dedicated to search engine implementation, consulting, and managed services. For information on the firm’s SharePoint 2013 Search Services, visit www.searchtechnologies.com. Search Technologies is headquartered in Herndon, Virginia.

Cynthia Murrell, January 10, 2013

In a Game of Duck Duck Goose, Google is the Goose

December 25, 2012

Google undeniably controls Internet search, but the little guy has not given up yet. DuckDuckGo is the newest competitor in the search engine game and it promises to have “fair search.” News Web sites like The New York Times, Search Engine Land, and PCMag.com hail DuckDuckGo as a “long-term threat to Google search dominance.” The newest Internet star was recently covered in the Search Engine Journal in the article, “DuckDuckGo Vs. Google-The War Get Dirty.”

What makes DuckDuckGo a “fair search” is that it does not track your search history or IP address, it draws answers from Web sites more official than Wikipedia, and it can search Facebook, Amazon, YouTube, and many other Web sites.

Google has problems with its rival, though, asserts Gabriel Weinberg, DuckDuckGo’s founder. In the Chrome browser, it is difficult to make DuckDuckGo the default search engine, and to override some of Google’s presets the newbie had to create a Chrome plug-in. The reality is that while Google always wants to be number one, it is not going to hinder the competition:

“Manually setting DuckDuckGo as the default search engine for Chrome is quick, and it can be done in Google Chrome settings. …It is really not that complicated, and it’s a matter of seconds to set up. No unfair treatment from Google against DuckDuckGo – any new search engine would need to go through the same process.

Saying that Google purposely harms a competitor’s search engine, in a time when the search giant is being investigated by the FTC for ‘using its power in the market to smother competitors’, is misleading and unfair. It’s fighting dirty – and bad PR.”

DuckDuckGo just might be a little jealous of Google. Weinberg also might be having startup problems, so blaming it on Google is a way out for him.

Whitney Grace, December 25, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Search Framed as Iterative Discovery

December 17, 2012

Finally, we come across an article that puts search at the forefront of big data. The post “Big Data or Big Noise?” from Chiliad discusses a “new” approach to the process of search.

Relevancy has always been an issue in search, and the article is right to point that out. The post also argues that irrelevant results amount to big noise, and that reading through that noise wastes users’ time.

Chiliad humbly suggests that users do not need to know what they are looking for, where to find it, or how to figure that out:

“In fact, reading is not the next thing I want to do, reading is the last thing I want to do. That is why we approach Big Data as an exploration and provide software that supports an approach we call Iterative Discovery. Iterative Discovery is exactly what it sounds like — I start with a hunch or hypothesis that I wish to validate and that requires exploration and iteration through massive amounts of data.”

The problem we have with this concept is that it does not need as much of an explanation as Chiliad gives it. Iterative discovery is a way of framing search, but it is nothing innovative or out of the box.

Megan Feil, December 17, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

An Intriguing Idea: SharePoint Search Is a Data Access Technology

December 7, 2012

Shortly after the SharePoint 2012 Conference, I had time to think about an interesting and quite intriguing view of search. The idea is that search is another “data access technology.” The idea was explained in “SharePoint Conference 2012: Prominent Role for Search in SharePoint 2013.”

Sanjeev Bhutt gave a tip of the hat to Scot Hillier, who was a speaker at the conference. Mr. Bhutt reported:

In his session on building search-driven applications, Scot Hillier made the point that we should no longer think of search in the limited scope of what occurs when a user types in a search term in a search box and the corresponding results that appear. Rather, we should think of search as a data access technology, in the same vein as CAML, REST and CSOM. In fact, he went as far as to say that search is the data access technology because, as he put it, “Search knows where all the skeletons are buried.” [Emphasis in the original text.]

Since the conference, I have noticed more emphasis on the use of a traditional and faceted search interface as a way to access a wide range of data and information types. Sphinx Search, for example, provides a system which eliminates the need for command line queries for content stored in MySQL databases. Many other vendors are moving in the same direction.

Search Technologies offers a range of services related to SharePoint 2013 search. Of particular relevance is the company’s search architecture design services. The firm’s engineers provide everything from due diligence reviews of existing systems to the detailed planning and costing of new search applications.

If you want to make the shift from search to finding and discovery, you will want to explore a range of technical methods and engineer your SharePoint or other information solution to deliver the results that users want: Information which answers a question without guessing what key words unlock the riches in the organization’s knowledge stores.

For more information about Search Technologies, visit www.searchtechnologies.com.

Iain Fletcher, December 7, 2012

Inflated Investments Can Be Healed with the Right Tools

December 4, 2012

Plenty is being written about HP’s acquisition of Autonomy at the moment. A view from a historical perspective can be found on Search Technologies’ blog in a piece called “HP, Autonomy and Meaning-Based M&A.” Autonomy, a UK-based company, began as an enterprise search vendor. Without doubt, in the years since, Autonomy has proven to be the most successful company in the sector; some of the reasons for that can perhaps be traced back to its 1996 origins and initial engagements with the market.

The article argues that the same factors driving Autonomy’s success in the enterprise search space also drove HP to pay an inflated price. The blog post states:

Autonomy IDOL remains a capable and widely used search engine. It is our collective experience that some customers bought too heavily into the meaning-based vision, and probably paid too much for it. We work with such customers to help them make the most of their investment – and much can be made of it. IDOL is a detailed and capable search engine.

We think the team at Search Technologies provides the information needed to help people make the most of such enterprise search tools. Enterprise search technology customers need to understand their needs and make smart purchases, and it seems Search Technologies can assist in that task.

Andrea Hayden, December 4, 2012

Sponsored by ArnoldIT.com
