China and Its Super Filter Idea

January 20, 2011

“New Controls on Text Messages” caught our attention.

Even if we completely disregard legality, how is it possible to filter every text message in a country that houses over a billion people? That could easily be more than a billion text messages every day. Now, I may be a little confused, but I just don’t see how China Unicom is going to make this happen. The story says:

“This is a requirement being imposed by the telecoms [service provider], that these keywords cannot be sent. If they are sent, then they will not be received by the user.”

Though it is quite obvious that the Chinese government seeks to ban the communication of certain words in order to sustain peace and control, it’s also blatantly obvious that it will probably not work. It’s like giving a child a piece of candy and then telling him he has to wait until after dinner to eat it.

But I digress. There have been banned words in the past; however, the list has never been quite so large. All told, around 800 keywords were banned, with references ranging from historical figures to sex, including some English words and important historical dates.
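For the curious, here is a minimal sketch of what a keyword check on a single message might look like. It is not how China Unicom or any carrier actually does it; the keyword list, the function names, and the case-insensitive substring matching are all assumptions made for illustration. Whether something like this holds up across a billion messages a day, and against people who simply misspell things, is a separate question.

```python
import re

# Hypothetical stand-ins for the roughly 800 banned keywords; the real list is not public here.
BLOCKED_KEYWORDS = ["example-banned-word", "1999-01-01"]


def build_filter(keywords):
    """Return a function that reports whether a message contains any keyword."""
    if not keywords:
        # An empty pattern would match every message, so handle this case explicitly.
        return lambda message: False
    # One compiled alternation, so each message is scanned once rather than once per keyword.
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)

    def is_blocked(message):
        return pattern.search(message) is not None

    return is_blocked


is_blocked = build_filter(BLOCKED_KEYWORDS)
```

A message for which `is_blocked` returns `True` would, per the story, simply never reach the recipient.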

I have to wonder, how is this going to affect daily communications? And search? What about search? Objective search results seem to be a notion from the distant past.

Leslie Radcliff, January 20, 2011

Freebie

Self-Service Business Intelligence: McDonaldization of Data

January 20, 2011

Drive into McDonald’s. Hear a recorded message about a special. Issue order. Get Big Mac and lots of questionable commercial food output. Leave.

Yep, business intelligence and “I’m loving it.”

No one pays much attention to the food production system and even less to what happens between the cow’s visit to the feedlot and the All American meal.

Ah, self-service. Convenience. Speed. Ease of use. Yes!

These days we pump our own gas (minus Oregon and NJ) and pour our own soft drinks.  We Google instead of asking the reference librarian (usually).

Is this the future of Business Intelligence?  “Soon Self-Service BI, SaaS to Dominate the Tech World” predicts that 2011 will bring an increase in self-service BI.  According to SiliconIndia News: “Numerous vendors, including IBM, SAP, Information Builders, Tibco Software, QlikTech, and Tableau Software, already offer [self-service BI]  tools, and adoption will accelerate as more companies try to deliver BI capabilities to nontechnical users, business analysts, and others.”  The question becomes: Will the users know what the outputs mean?  The SaaS part of the prediction I heartily agree with, in BI and everywhere else.

What if the data are dirty? Malformed? Selected to present a particular view of the Big Mac world? How about that user experience?

Alice Wasielewski, January 20, 2011

The Large Knowledge Collider

January 20, 2011

Semantic Web reports that “LarKC Turns Up The Volume On Web-Scale Reasoning Systems for the Semantic Web.” LarKC (pronounced “lark”; the name stands for Large Knowledge Collider) has released version 2.0. The program was initially developed as a platform to remove barriers between systems for the Semantic Web. Version 2.0 rebuilds the program from the ground up and lets plug-ins run on any sort of computer, cluster, or Amazon EC2 cloud.

“Now you have a platform that takes care of auto-distribution and auto-parallelization if your plug-in allows this…” With auto-distribution and parallelization capabilities, “you can analyze larger data sets than before…you make use of multiple machines inside the cluster.”

LarKC’s purpose is to let plug-ins be reused by others rather than rewritten from scratch. Once a plug-in is implemented, it can be shared and run on the same platform. LarKC’s last phase is to enter the market and find people interested in using or purchasing the program.

Whitney Grace, January 20, 2011

Freebie

Hit Boosting and Google

January 20, 2011

Well, well, well. The Google watchers have discovered hit boosting. “Hit boosting” is the must-have function in a real-world search system. Forget the research computing lab demos. In the real world, someone like Dick Cheney wants his Web site at the top of a results list. Do you rely on fancy indexing methods or egg head numerical recipes? This goose doesn’t. “Hit boosting” is a short cut. There are many ways to make a certain result, or hits from a specific site or about a certain topic, come up first in a results list. If you want more impact than number one in a results list, the goose can slap a box on the results page with the boosted content front and center. To make the content more visible, the goose can force you to locate a tiny “close” link.

Hit boosting has apparently been discovered by non-search experts. Navigate to “Google Gins Search Formula to Favor Its Own Services.” Here’s the key passage in my opinion:

Harvard professor Ben Edelman and colleague Benjamin Lockwood found that Google’s algorithm links to Gmail, YouTube, and other house brands three times more often than other search engines. Search terms such as “mail”, “email”, “maps”, or “video” all yield top results featuring Google’s services, they found. The practice, which Yahoo! was also found to engage in – albeit less blatantly – puts the search engines’ interests ahead of users’ need for unbiased data about the most useful sites on the web, they warned.

Okay, I wonder if the researchers realize that content tuning is a function of a number of search systems. In fact, if there is no administrative control to “weight” content, teenagers can be pressed into duty to write stored queries or hard wired routines to make sure that the boosted content is at the top of a results list. What if the content is not relevant to the user’s query? It doesn’t matter. Hit boosting is not a particularly user-sensitive method. The content is put where it is required. Period.
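For readers who have never seen the trick, here is a rough sketch of two common flavors of hit boosting: multiplying the score of hits from a favored site, and pinning a specific URL to the top for a given query. This is an illustration only, not Google’s method or any vendor’s API. The `Hit` class, the `BOOSTS` and `PINNED` tables, and the example domains are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    url: str
    score: float  # relevance score from the underlying engine (assumed)


# Hypothetical administrative controls: a per-site multiplier and per-query pinned URLs.
BOOSTS = {"example-house-brand.com": 3.0}
PINNED = {"mail": "https://mail.example-house-brand.com/"}


def boost_hits(query, hits, boosts=BOOSTS, pinned=PINNED):
    """Re-rank hits so boosted sites rise and any pinned URL comes first."""
    rescored = []
    for hit in hits:
        factor = 1.0
        for site, multiplier in boosts.items():
            if site in hit.url:
                factor = max(factor, multiplier)
        rescored.append(Hit(hit.url, hit.score * factor))
    rescored.sort(key=lambda h: h.score, reverse=True)

    # The blunt instrument: a pinned URL goes to the top regardless of relevance.
    pin = pinned.get(query.strip().lower())
    if pin is not None:
        rescored = [h for h in rescored if h.url != pin]
        rescored.insert(0, Hit(pin, float("inf")))
    return rescored
```

Note that nothing in `boost_hits` checks whether the boosted content answers the user’s question. The content is put where it is required.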

Do users know about hit boosting? Nah. The computer is supposed to deliver “objective” results. Talk about cluelessness! Most results lists will contain useful information, but there can be lots of other stuff on a Web page. In fact, the Web page may not be a results list. The hits are in a container surrounded by other containers of quite interesting stuff.

Is Google doing something unusual? Nah. Categorical affirmatives are risky, but I can safely say that hit boosting is the norm in many search deployments. Google is just being logical.

Did Dick Cheney’s content appear at the top of the results list? You bet. Did he notice? No. But his Yale and Harvard helpers were on top of the situation. Just like the boosted content.

Amazing what some folks assume.

Stephen E Arnold, January 20, 2011

Freebie

Yahoo: January 2011

January 19, 2011

Yahoo’s search share is flat. The signals about Flickr and Delicious are mixed. The company is chugging along a familiar path. One of our colleagues opined, “Poor, poor Yahoo.” The reason: anyone old enough to remember the early days might wonder what in the world has happened.

“The Decline & Fall of the Yahoo Empire” lays the blame squarely at the feet of the sadly neglected engineer. The notable quote: “Managers and executives who don’t understand programming want to believe that it’s just plumbing, and that programmers are fungible. This is simply not the case. A good programmer isn’t a mere two- or three-times as productive as a bad one; the multiple is more like 20 or 50, if not higher. Hiring ‘a programmer’ is like signing ‘a football player’; there are millions who play, but only a minority are good enough for a professional team, and only a tiny few can play in the biggest league.” From my perspective, yes, this seems to be the case with Yahoo. Yahoo seems to buy and buy ideas (startups), but then the people who came up with these ideas leave and leave. It doesn’t seem like all that long ago that startups couldn’t wait to be bought out by Yahoo, and yet now they’d rather risk sinking than swim in Yahoo’s big toxic pond. Geocities, Zimbra, AltaVista: the list goes on and on.

Alice Wasielewski, January 19, 2011

Freebie

Broken Search?

January 19, 2011

Beer and cake is not a combination I’ve tried yet, nor is it one that sounds even remotely appetizing. Has Google dropped the ball or turned a blind eye to some clever search engine optimization tactics?

The search giant seems to think beer and cake go together like peas and carrots or peanut butter and jelly. I think it seems more like tuna and a rubber tire but really, who am I to judge?

According to Phil of Phil’s blawg on tumblr.com, several searches for “Cake Central” yielded a top result of Beerby.com. Beerby is a mobile application that lets users track their favorite beers and breweries, as well as the eateries that carry them. Though the issue has since been corrected, why Beerby was the #1 search result when a user queried for cake recipes and decorating tips remains a mystery.

I am beginning to think that it is difficult to get a set of results from a query that is relevant and on target. I hope I am wrong, however.

Leslie Radcliff, January 19, 2011

Freebie

SharePoint Sharing from Attivio

January 19, 2011

Of interest to businesses overwhelmed with voluminous SharePoint content: “Attivio Announces AIE for SharePoint Integration” tells of the Active Intelligence Engine’s new availability to aggregate information across not only SharePoint, but also websites, databases, email, CRM and other information sources.

According to the announcement,

“The difficulty of discovering and delivering timely insight derived from all of these resources exposes gaps in an organization’s ability to integrate and rapidly update information; providing a single method for users to find the information they need, regardless of its origins.  AIE for SharePoint Integration enables secure access to all types of information by unifying diverse datasets, while avoiding the cost and delays of cumbersome legacy integration stages.”

With AIE, companies no longer have to worry about SharePoint silos that are easily accessible only within departments, and can instead maximize insight and collaboration across the entire enterprise. Attivio promises retention of data relationships from text sources, rapid implementation, and tight security, all of which should add up to a competitive advantage. We believe the company has a good approach to a very tough SharePoint challenge.

Alice Wasielewski, January 19, 2011

Semantics. Complexity? What Complexity?

January 19, 2011

We’ve been trying to figure out a decent way to present all the various aspects of web semantics to our readers, so they would see how they all fit together. Some More Individual relieved the pressure from our shoulders with this lovely visualization of “The Common, Layered Semantic Web Technology Stack.” While it may resemble a Rubik’s Cube, each colored bar in the graphic represents a significant part of web semantics. It does appear complex and a comment by the designer Benjamin Nowack explains how he felt about his work:

“At least the concepts can be separated from specific technologies and the application layer has a different angle than before (which I personally think makes more sense).”

One can see how each part of Web semantics is inter-related and builds upon the other. The write-up also includes explanations for each part of the graphic to help one understand each bit’s purpose. If you still find web semantics complicated after reading the article, perhaps building a 3-D model out of Legos or Lincoln logs will help.

Whitney Grace, January 19, 2011

Freebie

Vivisimo Powers New NARA Search

January 19, 2011

“NARA Brings Online Collections Together” announces the National Archives and Records Administration’s new online public access tool, called, creatively, Online Public Access. Okay, snark aside, this is wonderful news for anyone searching some of NARA’s most popular resources. Behind the scenes, Obama’s open government initiative provided the funding and an enterprise search software company provided the technology.

Here is the basic idea: “NARA implemented a search engine from Vivisimo to crawl the Archives.gov, Archival Research Catalog and Access to Archival Databases. The results include all the contextual information of the records including date, type of material whether picture or document or movie and the physical location of the material. Users can click on the picture or movie and view it right there on the screen.”  And, this is only the beginning.  This year, NARA is planning to add about 40 terabytes of records, a zoom function, and sharing through social sites, contingent on funding from Congress.  I hope that the budget fairy is kind to them.

Alice Wasielewski, January 19, 2011

Freebie

IBM, Public Relations, and Rocket Science

January 19, 2011

IBM has time on its hands. The idea that its staff has time, energy, and management support to win a TV game show speaks volumes about the company. We prefer the Google approach of creating a product or service and making it available. There’s not much available about a game show demo.

We did enjoy the write up in It’s Nuts Out There by one of our friends, a former big-time magazine publisher. Navigate to “Artificial Intelligence (AI): Why Didn’t I Think of That?” Here’s the passage we liked:

Watson has been taught (programmed) to buzz and answer, with a simulated masculine voice, in the form of a question. Watson claims to know slang, contemporary jargon, trivia, subtleties of language, puns and riddles.  In a practice game, the category was “Chicks Dig Me.” The answer: “Kathleen Kenyon’s excavation of this city mentioned in Joshua shows that the walls had been repaired 17 times.” Watson buzzed in first with “What is Jericho?” Bingo! Next answer: “He was the 16th president of the United States.” Watson again: What is Jericho? (Joke.) Fear not human beings… Watson doesn’t kill–to the best of the creators’ knowledge– nor does it play Wheel of Fortune. Lucky for us. If this trend of creating artificial intelligence continues, we may not need monkeys any more.

Stephen E Arnold, January 19, 2011

Freebie
