The Specter of Open Source Textbooks
September 9, 2010
Have publishers found a new revenue handhold with electronic books? Can publishers create rich media books that are part content, part design and user experience, and part software? Can publishers fend off the push for open source content from “the community”? These are tough questions which I considered after reading the SiliconValley.com news article “Cassidy: Former Sun Chief Scott McNealy’s Better Idea for School Textbooks.” I thought the write up shed light on how Scott McNealy, the co-founder of Sun Microsystems, plans to turn elementary and high school education inside out.
He has created ‘Curriki’, a Web site described as: “a combination of Facebook, Wikipedia, YouTube, MySpace, Twitter for anybody involved in K-12.” It is, he explains, “a free and open source digital compendium of just about everything teachers use to teach – textbooks, worksheets, tests, video presentations, podcasts, you name it.” The project aims to eliminate bulky textbooks and to save schools the cost of printing them. With open source digital textbooks, he argues, “teachers could add, subtract, and change curriculum… comment on each other’s lesson plans. Students would receive instant feedback.” This is a fresh challenge to traditional publishers and the educational bureaucracy, and a source of inspiration for saving our schools and our future.
Oracle is enlisting a flock of legal eagles to deal with the open source pests heading towards the firm’s database business. Is there an Oracle among publishers? I am thinking. I am thinking. What I envision is Freddy Krueger in Nightmare on Sixth Avenue.
Stephen E Arnold, September 9, 2010
Freebie
JantaKhoj: People Search for India
September 9, 2010
The Internet was created to network computers and make knowledge and information easy to share and access. Over the years the network has grown into a massive repository of facts and figures, and data about people is an important part of it. The reasons for looking up a particular person may be personal or business related, but people search has become a popular activity on the Internet.
Many people search sites are popular in America and other countries, and now India has its own first and largest people search portal, JantaKhoj, the literal Hindi translation of ‘People Search’. The site states that it aims to “Power the most comprehensive People Search service covering the residents of India, and democratize the background verification process and bring it within easy reach of every individual.” As further stated on the site, the company claims to apply “the data aggregation technologies to solve the complex problems in People Search and Background Checks that are unique to Indian scenario.”
The task is indeed unique and challenging in a country of 1.17 billion people, “where there is still a long way for the public databases to come online and become accessible to the search engines,” says Tarun of JantaKhoj.com. The site, whose tagline is ‘Search People, Research Background’, has extended its services to include online and offline background checks on prospective domestic help, servants, maids, drivers, tenants, life-partners or employees. The search results are programmed to include blogs and patents in addition to court records, social profiles, pictures, videos, book results, web links, news, and personal address or phone number details.
JantaKhoj demonstrates similarities with the Cluuz Search Engine platform, where the results are displayed through semantic graphs. Just over a year old, it has a long way to go to bring the entire Indian population on to the electronic database.
Leena Singh, September 9, 2010
Freebie
Microsoft FAST and the Solr Wind: A Trend or Sun Spot Consequence
September 9, 2010
I have to be blunt. I find that LinkedIn’s enterprise search group is a service I rarely visit. I responded to one LinkedIn person who wanted advice on creating an enterprise search system from scratch. It seems my observation that the idea was not too good hurt the person’s feelings. I got an impassioned personal email wanting to know why I was so negative. Right. Negative on coding an enterprise search system from scratch. Maybe this was a great idea when STAIRS III, InQuire, and BRS ruled the roost. Not such a great idea today.
I was delighted to read a thread about moving from Microsoft Fast to Solr. The information was interesting, but I can locate vendors with a Google or Bing search. A closer look at the responses provided me with some insight into a potentially interesting “solar wind” metaphor.
Icarus and all that. Source: http://www.mentera.org/2007/01/12/flight-of-icarus/comment-page-1/
I have no idea if you can access this thread, but I will include the link that worked for me. Here you go: Link to Fast to Solr discussion. If it doesn’t work, join LinkedIn and poke around.
The suggestions for integrators who can shift information assets from Microsoft Fast to Solr included:
- Cominvent. This is a company with which I am not familiar.
- Lucid Imagination. This is the outfit who has me working for T shirts on the Lucene Revolution Conference, October 7-8, 2010, in Boston.
- ESR Technologies. This is a company with which I am not familiar. (The logo reminded me of Digital Reasoning, another content processing firm working on a more advanced approach to digital information.)
- Findwise (linked from Findabilityblog.se). This is a company with which I am not familiar.
- New Idea Engineering. This is an outfit with whom I collaborated on an open source and shareware search utility software directory and which I have tapped for some special project work.
- Search Technologies. This is an outfit that participated in my enterprise search podcast for ETM recently.
Scaling Still an Issue?
September 9, 2010
Scaling is still an issue for some people, so the Lexalytics Development Blog’s post “Best Practices for Scaling Salience Engine Use” is an informative guide to using Salience Engine with large volumes of content.
To avoid overloading, the post recommends that you not create a new Salience object for each piece of content. Conservatively, you can recycle Salience sessions once per day. Even if Salience has passed soak testing, it is still advisable to recycle these sessions occasionally.
Salience is thread safe only within a thread, so you should create dedicated threads for particular Salience sessions and use each session only on the thread where it was created.
“When used for processing significant volumes of content, Salience is very CPU-intensive,” so create only as many Salience sessions as you have cores. Check http://dev.lexalytics.com/blog/?cat=1 to learn more.
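The three recommendations (reuse sessions, keep each session on its own thread, and match the session count to the core count) can be sketched in a few lines. This is an illustration only, not the Lexalytics API: `StubSession`, `RECYCLE_AFTER`, `get_session`, and `process_all` are hypothetical names, the "analysis" is a placeholder word count, and the count-based recycle threshold is an assumption standing in for the post's "occasionally."

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


class StubSession:
    """Hypothetical stand-in for a Salience session; not the Lexalytics API."""

    def __init__(self):
        # Remember which thread created this session.
        self.owner = threading.get_ident()
        self.processed = 0

    def analyze(self, text):
        # Enforce the rule from the post: use a session only on its own thread.
        if threading.get_ident() != self.owner:
            raise RuntimeError("session used outside its creating thread")
        self.processed += 1
        return len(text.split())  # placeholder "analysis": a word count


_local = threading.local()   # per-thread storage, so sessions are never shared
RECYCLE_AFTER = 1000         # assumed threshold; the post only says "occasionally"


def get_session():
    """Return this thread's session, creating or recycling it as needed."""
    session = getattr(_local, "session", None)
    if session is None or session.processed >= RECYCLE_AFTER:
        session = StubSession()
        _local.session = session
    return session


def analyze(text):
    # Reuse the calling thread's session instead of building one per document.
    return get_session().analyze(text)


def process_all(documents):
    workers = os.cpu_count() or 1  # one session-owning thread per core
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze, documents))
```

Calling `process_all(list_of_texts)` returns one result per document, in input order. With a real engine binding, only the stub class would change; the per-thread lookup and the pool sizing are the parts that mirror the post's advice.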
Martin Brooke, September 9, 2010
Freebie
Search Vendors: Spot Changing Underway
September 9, 2010
Search vendors are trying to morph into customer support systems. Will it work?
Today it is common for people who have a problem with a product or service to skip the company and instead voice the complaint to their friends through social media. If they are satisfied, they do the same. This observation is from Jamie Beckett’s post “Social Media Spurs Big Changes in Customer Service”.
Comcast’s Martin Marcinczyk and Cisco’s John Hernandez agree on the use of social media in responding to customer complaints. Both companies are looking for better ways to use social media to gather feedback from customers and to respond to customers’ concerns. “I want to be able to look at my overall business and know how to serve each customer,” said Marcinczyk at a live broadcast.
Martin Brooke, September 9, 2010
Freebie
Java Pre Litigation and Java Post Litigation: The Future of Java
September 8, 2010
“Java – It’s not Dead, Folks – It’s Doing Just Fine” is an interesting write up. You will want to read it because it shows what happens when an analysis ignores the craziness that can result from a legal decision. Oracle, in my opinion, wants to make money from open source technology. I don’t think Oracle is too worried about a particular company or product. I do think Oracle wants to use its Sun Microsystems intellectual property to make money. Money can be obtained in many ways.
There is the good old fashioned use of technology to crush opponents and emerge as the winner in a US of A style market battle. Corny Vanderbilt figured this out in the mid 19th century and the method still hath its charms.
Then there is the great idea of the toll road. You can use an Oracle technology, but you need to pay for the permission. Drug patents provide some insight into this method.
Third, the technology patents can be used for some horse trading. I have run across this pragmatic approach a number of times. In effect, a deal makes it possible to operate with confidence within a particular sphere.
One can combine methods and add unique ingredients; for example, protecting a market, creating a shotgun marriage, or obtaining a favorable concession.
The goal is usually the same: money. Get more money, increase revenues in a particular sector such as database technology in the enterprise, or free up money because sales and marketing costs go down. See. Money.
The hitch in the git along is the Oracle Google matter. Pile up examples of what’s happening now. Great exercise. The challenge is to figure out what happens if the legal issue breaststrokes down Sea World Parkway, oops, Oracle Parkway. Sorry. Evidence of proliferation does not change the unpredictable nature of legal eagles, judges, juries, and wacky parties to the matter.
Stephen E Arnold, September 8, 2010
Freebie
Quote to Note: Google as Remote Control for the World
September 8, 2010
Here is a quote to note. With Texas regulators looking to toss a haunch of Googzilla on the grill, statements like the one below may become the sauce for the cook out:
“Google has become the remote control for the world; it’s the first stop, not TV,” said Will Margiloff, CEO of Innovation Interactive, a unit of Dentsu. “More than any other media, that messaging is requested; people are seeking BP’s answers out as opposed to waiting to be told.”
You can get more about the ad industry’s perception of Google in “What Big Brands Are Spending on Google.” I have no idea if the numbers are accurate. What is interesting is that the numbers are in the millions with a spread from $2.0 million to $8.0 million. Equally impressive are the outfits pumping dough into Google’s online advertising systems. I noted that Google’s executives tag AT&T for about $8.0 million.
Not so much for a year of Google, right?
Wrong.
The alleged data are for a single month. On an annualized basis, $8.0 million times 12 is $96.0 million, or about $100.0 million.
Some thoughts from the goose pond this fine morning:
- What traction does a mom-and-pop business (assuming there are any left with ad dollars) get from a Google ad? The magnitude of the spend makes a couple of hundred earmarked bucks for the Google look somewhat modest next to these Chrysler Building scale investments.
- What must AT&T executives think about giving the Google $100.0 million in cash to reach Google’s customers? I can only imagine the joy, warmth, and happiness that spreads across the ad manager’s face.
- What accommodations, intentional or unintentional, accidental or purposeful, must Google make to ensure that big buck advertisers’ messages reach the right eyeballs? My hunch is that algorithms are objective. But are those pesky thresholds and dependencies set by humanoids or another look up table? Fascinating to think about.
Quite a phrase, however, no matter what the answers to these questions are. “Remote control for the world.” That has a ring to it. I can’t work my TV’s remote control either.
Stephen E Arnold, September 8, 2010
Freebie
dtSearch has a New Release
September 8, 2010
dtSearch Corp. announced the release of extensions to its 64-bit developer product line. The new release covers both dtSearch’s enterprise and developer products, including native 64-bit versions. For the developer products, the new release provides expanded sample code for use with Microsoft’s most recent Visual Studio version. For the enterprise products, the new release updates the user interface, providing a wider selection of “look and feel” options for users. The dtSearch product line shares the same core feature set: terabyte indexes, file format and database support, a spider, search features, and international language support. Prices start from $199. If you want to know more about this new product, you may call 1-800-IT-FINDS or visit www.dtsearch.com.
Stephen E Arnold, September 8, 2010
Freebie
Fair Search Rankings: SEO and Its Sins Come Home to Roost
September 7, 2010
You will be reading a lot from the search engine optimization crowd in the coming weeks. SEO means get a site on the first page of Google results no matter what. The “no matter what” part means tricks which Web indexing systems try to de-trick. Both sides are in a symbiotic relationship. The poor goofs with Web sites that pitch a pizza parlor have zero chance to get traffic. An elaborate dance takes place among the engineers who tweak algorithms to make sure that when I enter the query “white house”, I get the “right” white house.
A 1,000 calorie plus Krispy Kreme burger of Texas indigestion is on the menu for the Google if the Associated Press’s story is spot on. Source: http://new.wxerfm.com/blogs/post/bolson/2010/aug/06/krispy-kreme-burger/
You know the one with the President of the country where Google and Microsoft have headquarters. If you are another “white house”, you can hire some SEO azurini and trust that these trial-and-error experts can improve your ranking in Google, Bing, Ask, or other search system. But most of the SEO stuff does not work reliably, so the Web site owner gets to buy ads or pay for traffic. Quite an ecosystem.
Now the game may be officially declared the greatest thing in marketing since the invention of the sandwich board advertising bars in Times Square or be trashed as a scam on hapless Web site owners. The first hint of a potential rainy day is “Texas Opens Inquiry into Google Search Results.” I don’t quote from the AP. The goose is nervous about folks who get too eager to chase feathered fowl with legal eagles. I also am getting more and more careful about my enthusiasm for things Googley.
I don’t have much of a comment and I have only one observation. Add one more Krispy Kreme sized problem to the Paul Allen patent extravaganza, the Oracle dust up, the Facebook chase, and the dissing of the Google TV. I thought Google’s lousy summer was over. Is September 2010 going to trump Google’s June, July, and August 2010? It may. Quite a Labor Day in a state noted for its passion for justice Texas style.
Stephen E Arnold, September 7, 2010
Freebie
Guha Still Going Strong: Spam Prevention in the PSE
September 7, 2010
I still think Ramanathan Guha is a pretty sharp Googler. I met a university professor who did not agree. Tough patooties. Here is the most recent patent application from the guru Guha: US20100223250, “Detecting Spam Related and Biased Contexts for Programmable Search Engines.”
A programmable search engine system is programmable by a variety of different entities, such as client devices and vertical content sites to customize search results for users. Context files store instructions for controlling the operations of the programmable search engine. The context files are processed by various context processors, which use the instructions therein to provide various pre-processing, post-processing, and search engine control operations. Spam related and biased contexts and search results are identified using offline and query time processing stages, and the context files from vertical content providers associated with such spam and biased contexts and results are excluded from processing on direct user queries.
What’s the significance? You will have to wait for one of the azurini to explain Guhaisms. I would note these clues:
- Context
- Entities
- Query time processing stages
But what does an addled goose know? Not much.
Stephen E Arnold, September 7, 2010
Freebie