SurfRay Reloaded
September 14, 2009
A happy quack to the reader who alerted me to the news about the reappearance of SurfRay, a company that dropped off my radar. The firm has announced via PR Newswire a new version of Ontolica. You can read the news release at the PR Newswire Web site. Note that PR Newswire links can go dark, so if this SharePoint compatible product interests you, you may want to do some sleuthing. Asserted in “SurfRay Announces Availability of Ontolica 4.0 for SharePoint, With New Reporting and Analytics Module” are analytics features. Furthermore, existing customers can upgrade for free through October 20, 2009. The Beyond Search team has not had an opportunity to kick the tires of this product although we did request information when rumors of the release reached us in Harrod’s Creek. You can get more information about the company at its Web site or by running this Devilfinder metasearch string. The product appears to compete in the same sector as Interse (also based in Denmark) and BA Insight (US). Some of the functionality asserted by SurfRay may be found in Coveo’s and Exalead’s SharePoint compatible systems. Adhere Solutions (owned by a Beyond Search gosling) offers software that makes it possible to use the Google Search Appliance to search, slice, and dice SharePoint content. With important announcements about Fast ESP (Microsoft’s enterprise search solution for large scale SharePoint installations), organizations with SharePoint have a large number of options to consider. The question that continues to flap around the goose pond is, “How can an organization determine which SharePoint solution is the appropriate one for that particular organization?” Marketing, not technology, seems to be the knife edge at the present time. Little wonder the geese at Beyond Search are addled. What a cornucopia of choices exists for the 100 million happy SharePoint license holders (if we accept the broad market size rumors bruited at conferences).
Stephen Arnold, September 14, 2009
Improving SharePoint Search Relevance
September 13, 2009
If you need some tips about ways to improve the relevance of SharePoint search, you will want to download Robert Mixon’s free “SharePoint Search Improving the Relevancy of Search Results”. This is a PDF file. The paper starts by explaining the difference between an Internet search and an Intranet search. You then learn about Microsoft “search scopes.” To be truthful, I don’t understand the terminology but I am strongly resistant to the invention of new search buzzwords. You get an example of slicing and dicing search results. We have struggled with some interesting challenges with Microsoft search systems. This white paper will provide some basic information. You will need to do further digging, probably quite a bit in my opinion, to find ways to tame the unruly tigers that prowl the SharePoint jungle.
Stephen Arnold, September 13, 2009
Australian Publisher in Bid to Get His Own Chapter in Bartlett’s Quotations
September 13, 2009
What outstanding phraseology. Amazing quotes. You can read a summary in “Publisher: Time to pay up, Google”. Let me give you two examples, but, please, buy a hard copy of the Daily Telegraph Australia. I cannot do justice to this wonderful material.
Quote 1 allegedly crafted by APN News & Media chief executive Brendan Hopkins:
“We don’t need to be reborn, we just need to be paid properly for what we do,” Mr Hopkins told the Pacific Area Newspaper Publishers’ Association (PANPA) conference.
And quote 2, same fellow:
“To use an analogy, I see search engines as breaking into our homes, itemizing the contents, walking out and listing everything for everyone to see. And they get money out of that process,” he said.
Great wordsmithing. I should have remained in publishing. I need to perfect my analytical skills and my writing. Maybe I can nab an internship.
Stephen Arnold, September 13, 2009
A Modest Facebook Hack
September 13, 2009
For you lovers of Facebook, swing on over to Pjf.id.au and read “Dark Stalking on Facebook”. This is search with some jaw power. The key segment was in my opinion:
If a large number of my friends are attending an event, there’s a good chance I’ll find it interesting, and I’d like to know about it. FQL makes this sort of thing really easy; in fact, finding all your friends’ events is on their Sample FQL Queries page. Using the example provided by Facebook, I dropped the query into my sandbox, and looked at the results which came back. The results were disturbing. I didn’t just get back future events my friends were attending. I got everything they had been invited to: past and present, attending or not.
Links and some how to tips. Have fun before the former Googlers and Facebookers hop to it.
Stephen Arnold, September 13, 2009
Pigeons versus Kentucky Broadband
September 13, 2009
A happy quack to the caller today who alerted me to this news flash from Tom’s Hardware. Of course, living near a mine run off pond discourages the phone and cable companies from putting high speed anything in my dank hollow. The story’s headline is enticing: “Pigeon Found to be Faster Than Broadband”. I enjoyed this comment:
ISP Telom said that it couldn’t be held responsible for the slow transfer speeds to the IT company, as it has helped to advise the company in possible improvements, but thus far none have been accepted.
I heard from my local ISP that its system was really fast. Never mind that my Verizon WAN card times out when I try to access my mail via the ISP’s Web interface. “Works for us,” the company wrote. Yep, carrier pigeons. Also in Kentucky.
Stephen Arnold, September 13, 2009
Google Fact Extraction Pokes Out Its Nose
September 13, 2009
The Google fact extraction patents have been ignored. One Microsoftie told me last year that Google patents did not mean anything. Okay, click here and scan the fact extraction patents. Then go to the Internet Stats site and review the samples. Not only did the patent documents explain the invention, some of the documents include examples. I assume the Microsoftie does not see the connection between the patent documents and the demo site. That’s one reason why Microsoft’s Bing is trying to be what Google was, not what Google is. Sort of a problem in my opinion. Denial is a useful tool for mental health, but it does not do much to narrow the gap between Microsoft’s Web search market share and Google’s market share in my opinion.
Stephen Arnold, September 13, 2009
Open Source Metadata Tool
September 12, 2009
I received an interesting question yesterday (September 11, 2009). The writer wanted to know if there was a repository of open source software which served the intelligence community. I have heard of an informal list maintained by some specialized outfits, but I could not locate my information about these sources. I suggested running a Google query. Then I received a link to a Network World story with the title “Powerful Tool to Scour Document Metadata Updated.” Although not exactly the type of software my correspondent was seeking, I found the tool interesting. The idea is that some word processing and desktop software embed user information in documents. The article asserted:
The application, called FOCA (Fingerprinting Organizations with Collected Archives), will download all documents that have been posted on a Web site and extract the metadata, or the information generated about the document itself. It often reveals who created the document, e-mail address, internal IP (Internet Protocol) addresses and much more….FOCA can also identify OS versions and application versions, making it possible to see if a particular computer or user has up-to-date patches. That information is of particular use to hackers, who could then do a spear phishing attack, where a specific user is targeted over e-mail with an attachment that contains malicious software.
Some of the information that is “code behind” what the document shows in the Word edit menu is exciting.
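For readers who want to see the sort of metadata the article is talking about, here is a minimal sketch in Python. It is not FOCA and makes no claim about how FOCA works. It assumes the document is in an Office Open XML format (.docx, .xlsx, .pptx), which is a zip archive that normally carries author details in a part named docProps/core.xml. The function name read_core_metadata and the field labels are my own inventions for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core-properties part of an Office Open XML file.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Illustrative labels (hypothetical names) mapped to the namespaced tags.
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_metadata(source):
    """Return author-related metadata from a .docx/.xlsx/.pptx file.

    `source` is a file path or a binary file-like object.
    Fields that are absent or empty are left out of the result.
    """
    with zipfile.ZipFile(source) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/core.xml"))

    found = {}
    for label, tag in FIELDS.items():
        node = root.find(tag, NS)  # direct child of <cp:coreProperties>
        if node is not None and node.text:
            found[label] = node.text
    return found
```

Calling read_core_metadata("report.docx") on a file of your own returns a small dictionary of whatever the authoring application left behind. Older binary .doc files store their properties differently and are not covered by this sketch.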
Stephen Arnold, September 12, 2009
Google Ordered to Provide Email Info
September 12, 2009
Short honk: The Canadian publication National Post’s “Google Ordered to ID Authors of Emails to York University” caught my attention. If true, privacy watchers may want to note this passage from the news story:
York University has won court orders requiring Google Inc. and Canada’s two largest telecommunications companies to reveal the identities of the anonymous authors of contentious emails that accused the school’s president of academic fraud.
The article suggests that this is an “extraordinary” action. Is it? When the extraordinary becomes ordinary, the meaning of a word and the event to which it applies can confuse me. Would Voltaire or Swift have obtained tenure at York were each alive today? I don’t know what “academic fraud” means either. That is why I am an addled goose, I know.
Stephen Arnold, September 12, 2009
Why Search Is Difficult
September 12, 2009
I read Henry Blodget’s “Danny Sullivan: Carol Bartz Is the Sarah Palin of Search” and recognized a rare bird spotting event. Two fellows with spectacular Google PageRankings illustrate the challenges the notion of “search” presents to analysts, pundits, wizards, search engine optimization mavens, and other assorted search trend watchers. First, you need to read Mr. Blodget’s essay and then follow the links in his article. The idea is what could pass among the online advertising crowd as a 60 second bit on 30 Rock. But the write up called attention, in my opinion, to the significant epistemological issues that stick to the word “search” like a leech to a patient’s chest.
Here are my personal observations:
First, search is not defined. The assumption is made that everyone knows what search means. In the context of Mr. Blodget’s quote rundown and the original Sullivan observation, search means online advertising and getting eyeballs to a public Web site. The problem is that a person with a different angle on search won’t know what the heck the analogy is supposed to illuminate. That’s the problem. No one knows what search means because the folks talking about the concept omit the definition part of the communication process. The analogy of Palin to Bartz is clever but does little to illuminate Yahoo’s present challenges.
Second, a single person cannot make a quick change at an outfit the size of Yahoo within the business processes in play within Yahoo. At this point in time, Yahoo is mired in its business methods, and the president (forget who is running the show at any point in time) looks silly because the business processes themselves are silly. This explains some of the statements by Ms. Bartz. These are no better nor worse than generalizations made by any executive trying to impart change when the flow of decisions works like a bowling ball going down a gutter. Change, as students of that management discipline know, is a tough job. Working to make that change and deal with the need to make public comments ensures statements that are likely to tickle some listeners’ ribs.
Third, when the word “search” is used in a context involving Yahoo, the difficult problem is the Yahoo “as is” technical infrastructure. By “infrastructure” I mean the hardware, software, technical architecture, and in place systems. In order to tame search, one needs to look at what must be done among the handful of companies that are generating positive cash flow in *any* sector of the information retrieval and content processing market. There are not that many companies making money, a fact that is often overlooked. But the characteristics are pretty easy to identify; for example:
- Technical competence that is channeled
- Ability to solve a customer’s problem in a way that does not generate greater costs going forward than the revenue stream can support
- Reasonable cohesion within and among technical teams
- An affordable, repeatable method for getting the word out to potential buyers
- A way to generate money sufficient to pay the bills, produce surplus cash that can be invested in new ways to make money, and leave money around to keep stakeholders happy.
When these components are in balance, the company – no matter how quirky or wacky its management – can succeed in one or more of the business sectors that make up the search market. When misaligned, making money from content processing and information retrieval is tough. When revenues falter or profits collapse, the quirks become hallmarks of ineptitude.
Yahoo is a goner. I don’t think a change in its top management will make any difference whatsoever. Yahoo is a bit like the eastern European countries with big ideas and ways of doing business that failed economically. Yahoo is in that situation, and I don’t see a revolution coming. Yahoo is the end point of an Internet company following a digital entity life cycle. Yahoo is trending downward.
Forget the people at the top. To survive, Yahoo has to undergo a revolution or be subjected to what Japanese management experts call “bunsha”. Without deconstruction and reinvention, Yahoo cannot be other than Yahoo. That’s no joke, and it puts in context why casual chatter about “search” rarely yields a change in a company dependent on “finding” systems.
Stephen Arnold, September 12, 2009
Google Book Download
September 12, 2009
Short honk: I search Google Books every once in a while. I find most of the page image services clunky. When possible, I try to visit a library and check out the real McCoy. If you are not the book pawing kind, you may be interested in the Google Book Downloader. You can obtain the software by navigating to http://googlebookdownloader.codeplex.com/. Its features include:
- Download any book from Google Books marked as ‘Full view’
- Partially download any book from Google Books marked as ‘Limited preview’
- Access to any book available only for US citizens (instructions)
- Searching for hidden pages (not indexed by Google Books)
I have not downloaded the software so I can’t offer a goose honk or quack. Enjoy.
Stephen Arnold, September 12, 2009