Microsoft Fast Questioned by Ayna
January 27, 2010
A happy quack to the reader who sent me a link to a news story in Ayna. The headline was “Ayna Drops Fast Search Engine.” I have no way of knowing if the write up is balanced, but I want to capture the item so it doesn’t get away from me. According to the write up:
Ayna’s CEO, Adonis El Fakih, remarks “FAST solutions, sales, and support are not in sync with the changing tides or the available search choices out there, and wanted to keep the status quo, instead of stepping up to the plate and deliver a competitive product.” Ayna’s decision to drop FAST from its repertoire of technologies, came after months of deliberation with account managers and support services, which exposed the short comings of the platform and business attitude towards competing in the search space. Mr. El Fakih continued to say “…it is lamentable to loose an important investment in our core services, however we had no choice but to cut our loss and move on.”
Ayna is a search system serving users and customers in the Middle East. Among its services are Web indexing and mapping. You can access the Ayna service in English here. The French version is here. The main site is at http://www.ayna.com/index.ar.html.
Stephen E Arnold, January 27, 2010
Tess licked me this morning but I wrote this news item as a way to make a note for myself. I will report this bibliographic intent to the Library of Congress.
Most Fantastic Microsoft Fast Interview Ever
January 27, 2010
I don’t know much about Fierce Media, but I do like the word “fierce”. I read a story / interview produced by Fierce. The write up was “One on One with Jared Spataro of Microsoft.” I noted several interesting (almost unusual) points in the article. Let me highlight each, and urge you to read the interview in its entirety:
- The head of Microsoft Fast enterprise search worked for a “leading content management vendor”.
- “SharePoint 2010 will ship with a fantastic search experience.”
- “FAST Search for SharePoint customers will get a great general productivity search experience that is integrated with SharePoint out-of-the-box, but they’ll also be able to use the advanced capabilities of the platform to build and deploy sophisticated search-enabled applications.”
- “Our top-tier search solutions are all built on the FAST Search core, and over time FAST will become a common foundation for all of our products.”
- “A great enterprise search system needs to connect to everything and be accessible from everywhere. Out-of-the-box integration with SharePoint provides immediate value for many customers, but we’ve designed our products so that they can be embedded in any user experience and can index content living in any location.”
That’s enough. My observations:
- CMS experience is not exactly standard preparatory work for enterprise search in my experience. Most CMS don’t work very well. Well, maybe there will be some transference?
- I love the word “fantastic”. I used it in the headline for this write up.
- I love the word “great”. I love the phrase “out of the box” for a toolkit. I love the “common foundation for all of our products.” That’s a categorical affirmative, and I find it “fantastic” and “great”, just not accurate.
- I love “top tier”. The adjectival phrase sounds so top-tier. But nothing can beat using Fast as “a common foundation for all of our products.” Two categorical affirmatives for a search engine. Wow.
- I love the repetition of “great”. Very poetic. I love “everywhere.” Another categorical. I love “any location.” Another categorical.
Sounds fantastic and great. Oh, “all”, of course.
Stephen E Arnold, January 29, 2010
No one paid me to write about this article, its word choice, and its penchant for categorical affirmatives. I am not sure which US government agency has responsibility for logic. Maybe the GAO? I will report a freebie to that fine group. A means “accountability”, not accounting.
Palantir Describes Lucene Searching with a Twist
January 27, 2010
If you do work in law enforcement, financial services, or intelligence (business or governmental), chances are high that you know about Palantir. The firm provides sophisticated data analysis and analytics tools for industrial-strength information jobs.
In August 2009 and October 2009, the company published a two-part discussion of its approach to search and retrieval. I had occasion to update my file about Palantir technology, and I reviewed these two write ups. Both appeared in the Palantir Web log, and I thought that the information was relevant to some of the issues I am working on in 2010.
The first article is “Palantir: Search with a Twist (Part One: Memory Efficiency).” In that write up, the company points out that it uses the “venerable Java search engine Lucene.” Ah, open source, I thought. Palantir’s engineers encountered some limitations in Lucene and needed to work around these. The article explains that Palantir addressed Lucene’s approach to accumulating search results with a priority queue, streaming through results and inserting into the queue, and returning the set of results in the priority queue. The first article provides a useful summary of the Palantir method.
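For readers who have not looked under the hood of a search engine, the accumulate-in-a-priority-queue pattern mentioned above can be sketched roughly as follows. This is a minimal Python illustration of the general technique, not Lucene’s Java code and not Palantir’s workaround. The function name `top_k`, the document identifiers, and the scores are all invented for the example.

```python
import heapq
from typing import Iterable, List, Tuple

Hit = Tuple[float, str]  # (score, doc_id); both fields are hypothetical


def top_k(scored_hits: Iterable[Hit], k: int) -> List[Hit]:
    """Stream through hits and keep only the k best, returned best first."""
    if k <= 0:
        return []
    heap: List[Hit] = []  # min-heap: the weakest retained hit sits at heap[0]
    for hit in scored_hits:
        if len(heap) < k:
            heapq.heappush(heap, hit)
        elif hit > heap[0]:
            # The new hit beats the weakest retained hit, so swap it in.
            heapq.heapreplace(heap, hit)
    return sorted(heap, reverse=True)


hits = [(0.2, "doc-a"), (0.9, "doc-b"), (0.5, "doc-c"), (0.7, "doc-d")]
print(top_k(hits, 2))
```

The queue never holds more than `k` entries, so memory depends on how many results are requested rather than on how many documents match. That is the reason a memory-efficiency discussion would focus on this step.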
The second article is “Palantir: Search with a Twist (Part Two: Real-Time Indexing and Security).” This write up explains two approaches Palantir explored to deal with what the company calls “leaking information; namely that there’s data on this object that the user making the query is not privy to.” The write up says:
Given this problem, there are two approaches one can take: [1] Store all the information needed to decide which labels are visible to the user running the query and then use only the visible labels when calculating the relevance of a match. Note that is a pretty expensive operation. [2] Don’t use the length of match to compute relevance. Note that skipping a relevance calculation is, obviously, a very cheap thing do. Which do we do? Both.
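The two approaches in the quoted passage can be contrasted in a toy Python sketch. Everything here is an assumption made for illustration: the `Label` class, the `allowed_groups` field, and the length-ratio scoring are hypothetical stand-ins, since the Palantir posts summarized here do not spell out data structures or formulas.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Label:
    text: str
    allowed_groups: Set[str]  # hypothetical access-control field


def score_visible_only(labels: List[Label], query: str, user_groups: Set[str]) -> float:
    """Approach 1 (sketch): compute relevance only from labels the user may see."""
    q = query.lower()
    best = 0.0
    for label in labels:
        if not (label.allowed_groups & user_groups):
            continue  # hidden label: must not influence the score
        if q and q in label.text.lower():
            # Toy relevance: share of the label covered by the query text.
            best = max(best, len(q) / len(label.text))
    return best


def score_flat(labels: List[Label], query: str) -> float:
    """Approach 2 (sketch): ignore length of match; every match scores the same.

    Whether the object may be returned at all is assumed to be decided elsewhere.
    """
    q = query.lower()
    return 1.0 if q and any(q in label.text.lower() for label in labels) else 0.0


labels = [Label("Acme Holdings", {"analysts"}), Label("Acme", {"admins"})]
print(score_visible_only(labels, "acme", {"analysts"}))
print(score_flat(labels, "acme"))
```

In the first function an analyst’s score is unaffected by the admin-only label, at the cost of an access check per label. In the second function no score can hint at hidden data because no score varies.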
I recommend that anyone wrestling with Lucene take a look at these two articles. A third installment has been promised but I have not yet seen it.
Stephen E Arnold, January 27, 2010
A free search engine warrants a free post. No one paid me to write this. I will report this sad fact to the Department of Labor.
Can Search Save YouTube?
January 26, 2010
YouTube.com has been a topic of conversation here at the goose pond today. Several of the goslings commented about the redesign. Another pointed out that the search function was a hit-and-miss affair. I described a couple of patent documents such as US2006/0080238 that I thought were designed to give Google’s grassroots media video service some lift (as in pants on the ground). I don’t think search can save YouTube.com. Money can.
Finding pants on the ground was easy. It’s not so easy finding some other videos.
When I read “You Tube Is Doomed Guy Refuses to Admit He Was Wrong (But YouTube No Longer Doomed)”, I learned that YouTube.com is going to become a pay-per-view operation. The story in Silicon Alley Insider suggests that Google will emulate the Hulu.com model.
The write up presents a summary of some conflicting or maybe just fluid information about the profitability of the YouTube.com service. Google bought YouTube.com in 2006 at about the same time it was working out a deal with dMarc and lifting some other rich media barbells in the Google gymnasium.
The key passage for me was:
Google never figured out how to get advertisers excited about millions of people’s home videos. Benjamin [critic of YouTube.com and CEO of Fliqz.com] thinks Google will continue to chase after premium content, making the site more like Hulu. He also thinks eventually, Google could charge a small fee to upload video to the site. In other words, YouTube isn’t doomed.
The guts of the article is an interview with Benjamin Wayne of Fliqz, and it is worth reading.
The goslings and I were uncertain about YouTube.com. On one hand, it seems to have some challenges in the search department. Finding a video is often most easily accomplished by looking for a link in a write up, not by searching for a video. The ads are indeed annoying, and these may have disappointed both Google and the folks buying ads on YouTube.com videos. On the other hand, does the world need another for-fee video site? These seem to be predicated on the same assumptions one finds in the eBook reader sector. More may not yield a bigger revenue pie.
What is Google’s play in rich media? Perhaps Google has matured sufficiently to realize that there are other business models, but these may not lend themselves to the Googley style of management. Management, not emulating Hulu.com or some other for-fee rich media service, may be the deciding factor for YouTube.com.
Stephen E Arnold, January 26, 2010
A freebie. Someone promised to pay me a pittance in the future, but that faint assertion had nothing to do with the plight of YouTube.com.
Search Market Growth Projection
January 26, 2010
Short honk: The addled goose enjoys collecting market projections. A remarkable one appeared in Silicon Republic in the story “Global Search Market Tipped to Grow 46 Percent.” The article cites the ever-reliable comScore and points out that “Bing is showing the most rapid growth”. Okay.
But the key number to me was in this passage:
The total worldwide search market boasted more than 131 billion searches conducted by people age 15 or older from home and work locations in December 2009, representing a 46-percent increase in the past year.
So, this is the 2008 to 2009 growth. The headline suggests that this is the growth for the future. I am confused about:
- What does “search” mean?
- The method of collecting the data: free or fee? Raw counts or numerical recipes?
- What will be the size at the end of 2011?
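The arithmetic behind the 46 percent figure is easy to check. The sketch below backs out the December 2008 baseline implied by the comScore number and then shows what naive compounding would give for 2011. The compounding step is purely hypothetical: nothing in the article says the 46 percent rate will repeat, and the function names are invented for this note.

```python
def implied_baseline(current: float, growth: float) -> float:
    """Prior-period value implied by a current value and a growth rate."""
    return current / (1.0 + growth)


def compound(current: float, growth: float, years: int) -> float:
    """Naive projection assuming the same growth rate repeats every year."""
    return current * (1.0 + growth) ** years


searches_dec_2009 = 131e9  # from the quoted passage
growth_2008_2009 = 0.46    # from the quoted passage

baseline = implied_baseline(searches_dec_2009, growth_2008_2009)
naive_2011 = compound(searches_dec_2009, growth_2008_2009, 2)  # hypothetical

print(f"Implied December 2008 searches: {baseline / 1e9:.1f} billion")
print(f"Naive December 2011 projection: {naive_2011 / 1e9:.1f} billion")
```

The implied December 2008 figure is about 89.7 billion searches. If the same rate held for two more years, which is an assumption and not a forecast, December 2011 would come in near 279 billion.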
Stephen E Arnold, January 25, 2010
No one paid me to ask this question. I don’t know to whom to report questions. Perhaps I can call the USA hotline and inquire.
CIO Magazine and Its Notions of Enterprise Search
January 26, 2010
Holy Toledo, Batman! I just read “Twenty Companies to Watch in 2010”, published by CIO magazine on January 12, 2010. I don’t know the “judges”, but I do want to learn a bit more about their technical and business training. Perhaps I can learn something from them? I look at some of the same companies and I reach quite different perceptions of what’s to be watched and what’s to be ignored.
Please, read the original article because I am going to comment on just a couple of items, and I don’t want to bias you for or against the insights of the CIO team, which I am confident is a formidable one with deep experience in business and technical fields quite beyond my ken.
The CIO team has identified Google as a company to watch. Quite a shocker. Google is now operating in a geo-political sphere, disrupting business sectors far from search and online advertising, engendering concern among users about privacy, and signaling that the company’s founders are going to sell millions of shares. I may have omitted a couple of Google actions in the telecommunications and enterprise sectors, but I would agree: Google is a firm to watch. In fact, if Google had not been mentioned, I wonder if I would have been able to figure out that it is an important outfit.
The second company on the list that interested me was Endeca. I have written about Endeca’s involvement with the Newssift project that was to breathe new life into the gasping Financial Times online project. Alas, Newssift.com has been removed because I am getting a 404 error from www.newssift.com today. The CIO experts point out that Endeca is “fast becoming the Google of enterprise search for e-commerce sites.” In my modest world of interest, there is a difference between “enterprise search” and “e-commerce search” unless I have been missing something for the last three decades. In addition, Endeca “offers perhaps the friendliest way to do ad hoc BI and customer analytics.” These are also quite different disciplines, and I must admit I did not think of a single firm able to deliver on each of these fronts with the same high level of excellence. Well, that just goes to show you what the addled goose does not know.
A reference in CMSWire’s “CIO Magazine Says These are the Companies to Watch in 2010” added this point to my understanding of Endeca: “Enterprise search for eCommerce websites and partnerships with SAP, Open Text and Nstein are also key.” My view of SAP is that the company is gasping for revenues and having to backpedal on certain price increases. OpenText is a roll up, and the firm’s engineers are working overtime to support multiple systems that offer similar functionality to overlapping customer segments. The OpenText approach to search is to offer a menu of options that range from Endeca to mainframe centric BRS, from structured data via BASIS to the aging Fulcrum approach. The reference to Nstein was fascinating.
I will check back with CIO in 2011 to see how these firms are doing. My hunch is that the Google will be okay. Endeca may have a steeper hill to climb.
Stephen E Arnold, January 24, 2010
I bet you think someone in the magazine industry paid me to write this blog post about the companies to watch in 2010? You are wrong. This was a freebie, ignited by the insights of the CIO editorial team. In fact, the illuminative power of the information is so great that I will report the writing-for-no-money thing to NASA.
Media Factoid: Trouble from the Youngsters
January 26, 2010
Short honk: In the article “21 Things We’re Learning to Live Without”, I noticed a startling factoid. Here is the passage that caught my attention. The topic is use of traditional media among kids:
A study by the Kaiser Family Foundation found that kids between 8 and 18 spend just 38 minutes a day with some form of print media, down from 43 minutes in 2004. That’s out of a total of 7 hours and 38 minutes they spend every day using some form of media.
Kindles and other ebook readers won’t create new readers, I fear.
Stephen E Arnold, January 26, 2010
A freebie. I shall report this fact to the GPO.
Knol Gets Some Publishing Steroids
January 25, 2010
I was shivering in a security line in a country without many vowels when Google published “Write Better Knols with Object Embedding and PicApp.” Here’s the segment that caught my attention:
We think it’s important for a publishing platform like knol to provide people with the best possible tools for expression, so we’ve quietly added a large number of new embeddable objects for maps, docs, spreadsheets, forms, slideshows, presentations, videos, gadgets and more. Embeddable objects help you make better knols. For example, our equation object helps you add richly formatted mathematical expressions right in your knols. We really liked the cleanly embedded equations in this knol from the Public Library of Science. Similarly, our calendar object enables you to easily share details about upcoming dates, like swing dance lessons in Oregon.
I have highlighted the words that I noticed. Implications of this move are set forth in my Google: The Digital Gutenberg.
Stephen E Arnold, January 26, 2010
A blatant commercial for my own monograph. I shall report this egregious action to my publisher if he has not left for Thailand.
Mark Logic Taps Amazon
January 25, 2010
Cloud Computing’s “Mark Logic Leverages Amazon” reported that MarkLogic Server offers a cloud option. The write up said:
The move will obviously let customers use its widgetry on a pay-by-the-hour basis. A native XML database that implements the W3C-standard XML Query (XQuery) language, the server includes full-text and structured (XML) search. The AWS version consists of an Amazon Machine Image (AMI) with the MarkLogic Server pre-installed.
Mark Logic’s technology has demonstrated its versatility in a number of information-centric environments. With a client’s information within the MarkLogic Server environment, repurposing is a snap. In the last year, Mark Logic has emerged as an information infrastructure company that makes big boys like Oracle quite nervous. With the move to the cloud option, Mark Logic is poised for new services. Mark Logic’s technology exerts pressures on companies in business intelligence, enterprise publishing, and information portal services, among others.
When Larry Ellison worries, I take notice. Important step from Mark Logic.
Stephen E Arnold, January 25, 2010
Yes, I was given a free admission to the Mark Logic user conference in Washington, DC. No, I was not paid to point to this write up about the Sys-Con story. Yes, I will beg Mark Logic to throw large sums of money at me and the goslings the next time I see one of the firm’s senior managers or investors. I will report this intent to the FCC via this footnote. Wow, I feel so much better explaining that I am a shameless marketer.
Google, Word Choice and Nexus One
January 25, 2010
Short honk: I don’t have a Nexus One and I don’t plan on buying the device. Mother Google has caught my attention with each of my Google monographs. I try to steer clear. I did read “How Google’s Nexus One Censors Cuss Words.” If accurate, the story suggests that Google sees a dirty word and scrubs it out of the message. Not even IBM in its salad days would make the IBM Selectric so it would prevent the user from typing certain words. What if Charles Bukowski were still alive, keying poems on his Nexus One? Would the output be vintage Bukowski?
drunk and writing poems
at 3 a.m. what counts now
is one more
tight ###### before the light
tilts out
Yep, just what Mr. Bukowski wanted, according to Mother Google and its nannytron.
Google knows more than a creative person. How reassuring. What about the references to off color subjects in Shakespeare’s comedies? What about the humor in the works of Plautus and Terence? Google would have fixed their work as well.
What a nanny society is emerging! If someone buys a device, the person can do what he wants with it in my opinion. Thank goodness I am over 65 and heading for the data center in the sky. I suppose when the Googlers, the TSA, and other nannyites arrive, I will not be allowed to write my opinions using unapproved words.
Stephen E Arnold, January 24, 2010
This is a freebie. I am expressing an opinion. I can envision a day when WordPress will delete any reference to nannies, wings, and guys like Charles Bukowski. Quite a world. I will report this to the National Archives.