50 Niche Search Engines

June 28, 2008

Alisa Miller has compiled a list of 50 niche search engines. You can find the listing on Accredited Degrees here. Ms. Miller groups the search engines, which adds to the usefulness of her list. As I worked my way through the links, two of her finds struck me as useful:

Bookmatch provides search results from 3,300 sources with spam and silliness removed from Web log postings and news aggregators.
Congoo delivers results from news and other sources. The company claims a higher level of information. My test queries returned useful results.

A happy quack to Ms. Miller for her list.

Stephen Arnold, June 29, 2008

Written by Stephen E. Arnold · Filed Under Federated search, News, Online (general), Search | Comments Off on 50 Niche Search Engines

The Whale and the Walrus: Two Views of Sergey and Larry

June 28, 2008

The purpose of this essay is to describe the life trajectory of two technology-centric companies. I don’t want to mention the firms by name, but you may be able to guess which company is the whale and which is the walrus.

The whale is a big creature, a whale of a company. Wherever the whale goes, it gets its way. More accurately, the whale used to get its way. Now the whale is lying on its side near the Seattle waterfront close to upscale boutiques and a Starbucks.

The second is a walrus, now quite old for a semi-leviathan. The walrus prefers to sit on a rock not far from Half Moon Bay, soak up the sun and snag whatever fish get too close. The walrus prefers to conserve its energy. Oh, the walrus will stretch and sometimes roar. Most of the time, the walrus half sits, half reclines, looking, well, disconnected from the world beyond the sand bar. The walrus has some new friends named Sergey and Larry.

Let’s look at three aspects of each creature and then think about the future of each powerful beastie.

The Whale

The whale is the largest mammal. Not surprisingly, the whale is never sure if a sucker fish is tagging along for a free ride. The whale is also not really aware of its surroundings. The whale sings and tries to find other whales, but whales get together only once in a while. Think of it as a Warren Buffett cocktail party with only whales allowed. Otherwise whales think whale thoughts, oblivious to their world.

Our whale knows that tiny creatures can annoy a whale, but tiny creatures rarely hurt a whale. This whale believes it is master of all the known universe. The trick is to stay away from tiny creatures with weapons that can make life difficult. Every once in a while, the whale can gobble a tasty morsel like Fast Search & Transfer. Life has been good, but the whale senses trouble in a restless ocean.

The Walrus

The walrus is tired. The old game of providing tips to lost dolphins and tuna is not working any more. So, the walrus kicks back and thinks about what might have been.

The walrus is old, and the new ways of finding young fish eager to learn the old ways are tiring. This walrus prefers to lie down, make some noise, and wait for the next meal. Think of this walrus living in an assisted-living facility. The real world is too unfamiliar. The walrus has two new friends, Sergey and Larry. Sergey and Larry bring the walrus fish once a day. Getting fish is better than catching fish. The walrus likes not working too hard. The rock is a fine place. The waves lapping the beach in Half Moon Bay soothe the walrus. The walrus changes position but does not move.

Interpreting the Two Stories

The whale is a company that is disconnected from the world beyond the ocean. The whale is, for the first time in its life, unnerved, maybe frightened. Sergey and Larry have a different business model. Customers use software and information and an advertiser pays the bill. The whale wants to swat Sergey and Larry with its tail. Sergey and Larry dance out of the way. The whale is frustrated and getting tired of carrying the old business model into every skirmish and chase.

The walrus is an old timer in the digital world. The spring and bounce have been weighted down by wild and crazy decisions. Walrus friends are leaving the walrus more and more alone. The walrus is isolated. The old ways have lost their zip. The walrus remembers reading about automobiles and buggy whip manufacturers. The walrus believes that he might become a wallet, maybe a pair of shoes. Change, however, is hard at the walrus’ age. The walrus stays where it is, moving to catch the rays of the setting sun. Sergey and Larry will bring another fish today.

The message is clear. The whale is going to fight to survive. The walrus has given up. Sergey and Larry have the ability to deal with both the whale and the walrus with equal aplomb.

Observations

Neither creature has many years left. You have to admire the fighting whale. Too bad its own weight and mass will sap its strength. Not much future unless the whale sheds some pounds like Subway’s Jared, the tuna eater. The walrus has found a new best friend and does not want to work too hard. The walrus will gladly do what Googzilla says. Those free fish are really tasty, thinks the walrus.

And what about Sergey and Larry in their “we’re just guys” outfit? Sergey and Larry want to outthink the whale. The walrus seems happy as long as he gets a couple of fish every day.

In the great theater of business, the whale and the walrus are sushi.

Stephen Arnold, June 28, 2008

Written by Stephen E. Arnold · Filed Under Feature, Microsoft, Online (general) | Comments Off on The Whale and the Walrus: Two Views of Sergey and Larry

Microsoft Powerset: Is There a Role for Amazon?

June 27, 2008

On May 10, 2008, I offered some thoughts about Microsoft’s alleged interest in Powerset. You can find this bit of goose quacking here.

In case you missed the flurry of articles, essays, and opinion pieces, more rumors of a Microsoft Powerset tie up are in the wind. Matt Marshall ignited this story with his write up “Microsoft to Buy Semantic Search Engine Powerset for $100 Million Plus”. You must read this here. The most interesting statement in the essay is:

Google has generally dismissed Powerset’s semantic, or “natural language” approach as being only marginally interesting, even though Google has hired some semantic specialists to work on that approach in limited fashion.

My research for Bear Stearns last year revealed that Google has more than “some specialists” working on semantic issues. Alas, that document, “Google’s Semantic Web: the Radical Change Coming to Search and the Profound Implications to Yahoo! & Microsoft,” is no longer easily available. There is some information about the work of Dr. Ramanathan Guha in my Google Version 2.0 study, but the publisher insists on charging people for the analysis of Dr. Guha’s five patent applications. Each of these comes at pieces of the semantic puzzle in quite innovative ways. If Dr. Guha’s name does not ring a bell, he worked on the documents that set forth the so-called Semantic Web.

So Google is, according to this statement by Mr. Marshall, not too keen on Powerset-style semantics. I agree, and I will get to the reasons in the Observations section of this essay.

The story triggered a wave of comments. You can find very useful link trails at Techmeme.com and Megite.com. The one essay you will want to read is Michael Arrington’s “Microsoft to Buy Powerset? Not Just Yet.” By the time you read this belated write up, there will be more information available. I enjoy Mr. Arrington’s writing, and his point about the Powerset user interface is dead accurate. We must remember that users are creatures of habit, and the user community seems to like typing a couple of words, hitting the enter key, and accepting the first three or four Google results as pretty darn good.

Semantic technology is very important. Martin White and I are working on a new study, and at this point it appears that semantic technology is something that belongs out of sight. Semantic technology can improve the results, but like my late grandmother’s girdle and garters, the direct experience is appropriate only for a select few. Semantic technology seems to share some similarities with this type of best-left-unseen experience from my childhood.

An Amazon Connection?

My interest in a Microsoft Powerset deal pivots around some information that I believe to have a kernel of truth buried in it. Earlier this year, I learned that Microsoft had a keen interest in Amazon’s database technology. Actually, the interest was not in the Oracle database that sits, like a black widow spider, in the center of a Web, but in the wrapper that Amazon allegedly used to prevent direct access to the Oracle tables from creating some technical problems.

Amazon had ventured into new territory, tapping graduate students from the Netherlands, open source, specialist vendors, and internal Amazon wizards to build its present infrastructure. Amazon has apparently succeeded in creating a Google-like infrastructure at a fraction of the cost of Google’s own infrastructure. Amazon also has fewer engineers and more commercial sense than Google.

In the last 18 months, Amazon has pushed into cloud computing, Amazon Web services, and jump starting a wide range of start ups needful of a sugar daddy. I recently wrote about Zoomii.com, one innovator surfing on the Amazon Web services “wave”. You can read that essay here.

Microsoft needs a NASCAR engine for its online business. Microsoft is building data centers. But compared to Amazon and Google, Microsoft’s data centers are a couple of steps behind, based on my research work.

At one meeting in Seattle, I heard that Microsoft was “quite involved” with Amazon. When I probed the speaker for details, the engineer quickly changed the subject.

Powerset, if my sources are correct (which I often doubt), is using Amazon Web services for some of its processing. If true, we have an interesting possibility that Microsoft may be pulled into an even closer relationship with Amazon.

I am one of the people who thought that Microsoft would be better able to compete in the post-Google world if Microsoft bought Amazon. Now let me get to my thinking, and, as always, I invite comments. First, Microsoft would gain Amazon’s revenue and technical know how. Arguably these assets could provide a useful platform for a larger presence in the online world.

Second, Microsoft gains the cloud-based infrastructure that Amazon has up and running. From my point of view, this approach makes more sense than trying to whip Windows Server and SQL Server into shape. The Live.com services could run on Amazon or, alternatively, the whopping big Microsoft data centers could be used to provide more infrastructure for Amazon. An added benefit is that Microsoft’s engineers, despite the company’s spotty reputation for engineering, seem to me to be more disciplined than Amazon’s engineers. I have heard that Amazon pivots on teams that can be fed with a pizza. While good for the lone ranger programmers, the resulting code can be tough to troubleshoot. Each team can do what it needs to do to resolve a problem. The approach may be cheaper in the short run, but in my opinion, may create the risk of a cost time bomb. A problem can be tough to troubleshoot and then fix. Every minute of downtime translates to a loss in credibility or revenue.

Written by Stephen E. Arnold · Filed Under Feature, Microsoft, Online (general), Search | 1 Comment

Google: Big Rice Machine

June 27, 2008

To say that Google is really big is like saying New York City is a little seaside village. It’s a complete oversimplification. Posted in a blog at Managed Networks, the following analogy helped me get a better handle on a number too large to comprehend. Think of a single piece of data as a grain of rice. Google cooks 20 petabytes of rice a day. That’s enough rice for every person on the planet to have 1,600 bowls each of the white stuff for dinner. Now that’s a San Francisco treat. You can read the original rice analysis here.
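A quick sanity check of the analogy can be sketched in Python. The one-byte-per-grain mapping, decimal petabytes, and a 2008 world population of about 6.7 billion are assumptions made for illustration, not figures from the Managed Networks post; the function names are hypothetical.

```python
# Back-of-the-envelope check of the rice analogy.
# Assumptions (not from the post): one byte = one grain of rice,
# decimal petabytes, and a 2008 world population of about 6.7 billion.
BYTES_PER_PETABYTE = 10**15
DAILY_PETABYTES = 20
WORLD_POPULATION = 6_700_000_000
BOWLS_PER_PERSON = 1_600


def grains_per_person(petabytes=DAILY_PETABYTES, population=WORLD_POPULATION):
    """Grains of 'rice' (bytes) per person per day under the assumptions above."""
    return petabytes * BYTES_PER_PETABYTE / population


def implied_grains_per_bowl(bowls=BOWLS_PER_PERSON):
    """Bowl size the analogy implies if each person gets `bowls` bowls."""
    return grains_per_person() / bowls


if __name__ == "__main__":
    print(f"grains per person: {grains_per_person():,.0f}")
    print(f"implied grains per bowl: {implied_grains_per_bowl():,.0f}")
```

Under these assumptions each person gets roughly three million grains a day, which puts a bowl at a bit under 1,900 grains. Change the bytes-per-grain assumption and the bowl count moves with it.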

Jess Bratcher, June 26, 2008

Written by Stephen E. Arnold · Filed Under Google, News, Online (general) | Comments Off on Google: Big Rice Machine

X1 Extracts Search Patent

June 26, 2008

X1 Technologies in Pasadena, Calif., has been “innovating” (X1’s word, not mine) search solutions since 2003. The company has patented an enterprise search solution, touted as quick and efficient, that uses an advanced technique to find search results even as you’re typing your query. X1 calls the innovation “fast-as-you-type” search, and it narrows results as you keep typing. With this patent, X1 wants to make its mark and assert itself in the enterprise search market with its X1 Enterprise Search Suite. The suite searches more than 400 application file types across PCs and servers alike. The company has more patents pending that build on this idea; the firm’s plan is to change how search is “done”, not just to make search faster. You can download 7,370,035, Methods and Systems for Search Indexing, here.
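The general idea of results that narrow with each keystroke can be illustrated with a toy prefix index. This is only a sketch of the generic technique, not X1’s patented method; the `PrefixIndex` class, the `narrow_as_you_type` helper, and the sample terms are all hypothetical.

```python
import bisect


class PrefixIndex:
    """Toy as-you-type index: a sorted list of lowercased terms."""

    def __init__(self, terms):
        self._terms = sorted({t.lower() for t in terms})

    def matches(self, prefix):
        """Return every indexed term that starts with the given prefix."""
        prefix = prefix.lower()
        # In sorted order, all terms sharing a prefix sit next to each other,
        # starting at the insertion point of the prefix itself.
        start = bisect.bisect_left(self._terms, prefix)
        hits = []
        for term in self._terms[start:]:
            if not term.startswith(prefix):
                break
            hits.append(term)
        return hits


def narrow_as_you_type(index, query):
    """Return the result list after each keystroke of the query."""
    return [index.matches(query[:i]) for i in range(1, len(query) + 1)]


if __name__ == "__main__":
    index = PrefixIndex(["patent", "Pasadena", "petabyte", "search", "server"])
    for typed, hits in zip("pat", narrow_as_you_type(index, "pat")):
        print(typed, hits)
```

Each extra character can only shrink the hit list, which is the narrowing behavior described above. A production system would index documents rather than bare terms and would have to rank the hits.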

Jess Bratcher, June 26, 2008

Written by Stephen E. Arnold · Filed Under Enterprise, News, Online (general), Search | Comments Off on X1 Extracts Search Patent

Business Objects: Number One in Business Intelligence… for Now

June 26, 2008

Business intelligence, along with content management and enterprise search, is a mid-sized blob of marketing mercury. The big names in the US are SPSS and SAS Institute. Both work hard to get colleges and universities to teach eager math students how to make these proprietary systems make data walk on their hind legs, roll over, and sit on command. Business Objects, a sales-oriented company, has made inroads into the SPSS and SAS client base, and now the Gartner Group has named Business Objects as the number one business intelligence outfit.

You can read SearchDataManagement.com’s summary of the Gartner research here. You can read the Business Objects news release here. Let’s get to the meat of the Gartner study. For me this was the key point:

Combined, SAP and Business Objects controlled 26.3% of the global BI platform market in 2007, nearly double their nearest competitors. IBM and Cognos held 14.7% market share, followed by the SAS Institute at 14.5%.

So, “combined” makes Business Objects number one. Chop out the SAP part and Business Objects posts nearly $1.0 billion in revenues. Will Business Objects be able to maintain its revenues? Will the company be able to make Inxight Software into more than a content utility? Will superplatforms such as IBM, Microsoft, and Oracle bundle business intelligence with higher value systems, sucking the air out of Business Objects’ growth?

For me, Business Objects means excellent sales management. Could its success come from its competitors’ lack of marketing and sales management expertise, not from its technology?

Stephen Arnold, June 26, 2008

Written by Stephen E. Arnold · Filed Under Enterprise, News, Online (general), Text processing | Comments Off on Business Objects: Number One in Business Intelligence… for Now

One Reason Why Microsoft May Not Make Search a Success

June 26, 2008

The Bill Gates “noise” echoed in Kentucky. I read PCMag.com’s “Exclusive: The Bill Gates Exit Interview” here. The interview merits your attention. I zoned out with references to “the platform” and choked when I encountered this comment: “Everything in computer science is to just write less code.” And I was baffled by references to a “natural user interface”. But I am a Kentucky hillbilly.

I tried to avoid reading about “Bill Gates’ Web Experience”. Michael Krigsman does work I enjoy, but I was hooked. Mr. Krigsman pulled the best bits from a PDF of an email exchange here. I discovered that this was a “flame” among Microsofties. You can read SeattlePI.com’s take on the exchange and learn why the PDF has confidential stamped on it.

I read the emails and ignored the complaints about Mr. Gates’s problems using Windows XP. What’s new?

The email put the PCMag.com interview into perspective for me. Here is the key line in the email thread. One Microsoftie writes, “I am owning the website issues.” [sic].

Now, for me the telling comment is in a response to this person’s attempt to provide leadership, accept responsibility for the mess, and fix the problem. Ready? Here is what a Microsoft employee identified as Mike Beckerman wrote: “I don’t know what it means to ‘own website issues…’”. I have added the emphasis.

Now my observations:

I am no leader, but I recognize that the person stepping forward to assume responsibility is walking and talking like a leader. Leaders are good because good leaders make things happen. For a colleague not to know what it means to “own Web site issues” is snide. In some organizations, the comment would be close to insubordinate.
When colleagues cannot cede control, preferring to keep the status quo, the management process is in danger of veering off track. The email exchange took place in 2003 and now it is 2008. The Yahoo deal flopped. Vista is an issue for some. The enterprise search and Web search initiatives are spinning their wheels. I would assert that these are examples of flawed management and a refusal by colleagues to sort out their differences and find a leader to guide them forward.
Google may have some challenges ahead. But if this email exchange is accurate (it may be a hoax for all I know), Microsoft may have some trouble closing the gap with Google in advertising, search, and cloud-based services. Google is a great many things, but so far it has avoided the headwind caused by employees who disregard a plea for changes from the fellow who founded the company.

Hopefully, I won’t have to read any more about Mr. Gates’s retirement, which, I believe, has him on the Microsoft campus two or three days a week. Oh, the problems identified in the 2003 “flame” emails are still around. No one was able to fix them. Well, there is always next year, which is what IBM said about OS/2.

Stephen Arnold, June 26, 2008

Written by Stephen E. Arnold · Filed Under Microsoft, News, Online (general), Search | Comments Off on One Reason Why Microsoft May Not Make Search a Success

Google: Snuggling with OCLC

June 25, 2008

Digital Document Quarterly, Volume 7, Number 2, 2008 provided this item:

OCLC and Google have agreed to exchange book discovery data. Google will link from Google Book Search to WorldCat, which will drive traffic to online library services. Google will also share digitized book data. WorldCat will represent OCLC member library collections and link books scanned by Google. A user who finds a book in Google Book Search will be able to use WorldCat to find local library copies.

You can read the DDQ at http://home.pacbell.net/hgladney/ddq_7_2.htm. I recommend the publication if you have an interest in the library side of online information and digital documents.

My view of this is that slowly, ever so slowly, Google is encroaching on the traditional database world. I am confident the management gurus at ProQuest, Ebsco Electronic Publishing, Newsbank, and the other firms servicing this important but shrinking market have a GPS device on Googzilla.

A happy quack to H.M. Gladney from the Beyond Search goose.

Stephen Arnold, June 25, 2008

Written by Stephen E. Arnold · Filed Under News, Online (general) | Comments Off on Google: Snuggling with OCLC

IBM Search: A Trial of Patience for Customers

June 25, 2008

A quick question. What is the URL for IBM’s public Web search? Ah, you did not know that IBM had a Web search system. I did. IBM’s crawler paid a quick visit to my Web site years ago. You can use this service yourself. Navigate to http://www.ibm.com/search. The service is called the IBM Planetwide Web.

Let us run a test query. My favorite test query is for an IBM server called the PC704. I once owned two of these four processor Pentium Pro machines. For years I wanted to upgrade the memory to a full gigabyte, so I became a regular Sherlock Holmes as I tried to find memory I could afford.

Here are the results for this query PC704.

The screen shot is difficult to read, but there is one result–a reference in an IBM technical manual. Let us click on the link. We get a link to a manual about storage sub systems. I know that IBM discontinued the PC704, but the fact that there is no archive of technical information about this system is only slightly less baffling than the link to the storage documentation.

Let’s try another query. Navigate to http://www.ibm.com. We are greeted with a different splash screen with an option to “sign in” and a search box. Let’s run a new query “text mining”. The system responds with a laundry list of results. The first five hits are primarily research documents. The second page of the results has links to two IBM text mining systems, IBM TAKMI and IBM Text Mining Server. TAKMI is another research link and the Text Mining Server is on the IBM developer Web site.

I don’t know about you, but I received one hit for PC704 and quite a few research hits for text mining. Where is the product information?

Let us persist. I know that IBM had a product called WebFountain. I want information about that product. I enter the single word, WebFountain, and the IBM system responds with 152 results. The documentation links figure prominently as well as pointers to information about a WebFountain appliance and architecture for a large-scale text analytics system.

Result 13 seemed to be on target. Here is what the Planetwide system showed me:

IBM – WebFountain – United States
WebFountain is a new text analytics technology from IBM’s Research division that analyzes millions of pages of data weekly.
URL: http://www-304.ibm.com/jct03004c/businesscenter/vent…

And here is the Web page this link displays.

Stepping Back

What have these three queries revealed?

Despite the cratering of prices for storage devices, IBM does not maintain an archive of information about its older systems. The single hit for the string PC704 was to a book about storage. The string PC704 probably appears in this technical manual, but the system’s precision and recall disappointed me.
The second query for text mining generated more than 3,000 hits. My inspection of the results suggested to me that IBM was indexing technical information. Some of the documents appeared to be as old as the PC704 that was not available in the index. The results provided no context for the bound phrase, and the results, to me, delivered unsatisfactory precision. Recall was better than the single hit for PC704, however. To me, irrelevant hits are not much better than one hit.
The third query for an IBM product called WebFountain generated hits to research reports, documentation, and a Web site about WebFountain. Unfortunately, the link was active but there were no data displayed on the Web page.
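Precision and recall, the two measures used in these observations, have standard definitions: precision is the share of returned results that are relevant, and recall is the share of relevant documents that were returned. A minimal sketch follows; the document identifiers and relevance judgments are made up for illustration, and nothing here was measured from IBM’s system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers the search system returned
    relevant:  identifiers a human judged relevant to the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical case: one off-target hit, two relevant documents missed.
    print(precision_recall(["storage-manual"], ["pc704-spec", "pc704-memory"]))

    # Hypothetical case: ten hits, two of them among four relevant documents.
    returned = [f"doc{i}" for i in range(10)]
    wanted = ["doc0", "doc1", "doc98", "doc99"]
    print(precision_recall(returned, wanted))
```

The first case scores zero on both measures. The second shows the trade-off noted above: a longer result list can raise recall while most of the hits remain irrelevant.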

All in all, IBM’s Planetwide search is pretty lousy for me. Your mileage may vary, of course.

Written by Stephen E. Arnold · Filed Under Feature, Google, Online (general), Search | 2 Comments

Management Views Search as a Side Issue

June 25, 2008

Dave Valiante’s “Enterprise Search a High Priority for Most Users, But Not for Companies” is an important essay. You can read it on the Wall Street Technology Web site here. The URL is a tricky one: www.wallstreetandtech.com.

He reports on a study that says “many businesses [are] unaware of the importance of findability.” His write up contains a number of interesting statistics from the report based on a survey of 500 business users. The AIIM study triggered a flurry of news items about user dissatisfaction with search, but Mr. Valiante’s essay digs a bit deeper into the results.

The one finding that jumped out at me was:

The survey states that most organizations do not have a strategic approach for enterprise search and shows that 49 percent of respondents have “no formal goal” for enterprise findability within their own organizations.

What a remarkable finding. With search an essential first step in performing work today, the idea that organizations have “no formal goal” is intriguing. Let’s assume that the finding is spot on. Half of the organizations surveyed view search and retrieval as a non-issue. If true, this explains why point solutions for customer support, litigation support, and business intelligence sell throughout an organization. Licensees are either uninterested in systems already installed or, even more likely, indifferent to getting a system that meets very specific needs. Silos are not aberrations. Isolated systems, often containing content already processed by another system in the organization, are standard operating procedure.

No wonder an organization’s information technology department often shows little enthusiasm for a search or content processing system. With systems flowering, existing technical resources may be stretched to the limit. Another related thought I had, again assuming the finding is accurate, is that vendors have little incentive to change their marketing and sales strategies. A vendor can jump from market sector to market sector looking for customers who have a specific problem.

My research reveals user dissatisfaction with search and retrieval. The information in Mr. Valiante’s write up tells me that dissatisfaction is likely to be the norm in many organizations until management understanding matures. Agree? Disagree? Use the comment sections to share your views.

Stephen Arnold, June 25, 2008

Written by Stephen E. Arnold · Filed Under Enterprise, News, Online (general), Search | Comments Off on Management Views Search as a Side Issue

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.
