Language Found to Reveal Hierarchies
January 5, 2012
Isn’t it great when technology is used to pursue answers to the burning questions of our day? MIT’s Technology Review announces, “Computer Scientists Create Algorithm That Measures Human Pecking Order.” Cornell University’s Jon Kleinberg, known for his work on the HITS Web page ranking algorithm, and his associates have found that language usage can reveal power differences between people. The article states:
They say the style of language during a conversation reveals the pecking order of the people talking. “We show that in group discussions, power differentials between participants are subtly revealed by how much one individual immediately echoes the linguistic style of the person they are responding to,” say Kleinberg and co.
In particular, the researchers look at function words like articles and conjunctions. It seems that, while top dogs feel no compulsion to copy the speech or writing of others, those lower on the totem pole do. Unconsciously, of course.
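To make the idea of "echoing" concrete, here is a minimal Python sketch. It is our own simplification, not the researchers' actual algorithm: the word lists are abbreviated, and the `classes_used` and `echo_scores` helpers, along with the exchange format, are hypothetical. For each person replying, it compares how often a reply uses a class of function words when the message being answered used that class against how often that person's replies use the class at all.

```python
import re
from collections import defaultdict

# Hypothetical, abbreviated function-word classes; a real study would use fuller lists.
FUNCTION_WORDS = {
    "article": {"a", "an", "the"},
    "conjunction": {"and", "but", "or", "so"},
}


def classes_used(text):
    """Return the set of function-word classes that appear in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return {name for name, vocab in FUNCTION_WORDS.items() if words & vocab}


def echo_scores(exchanges):
    """Score how strongly each replier echoes the style of the messages answered.

    exchanges: list of (speaker, original_text, replier, reply_text).
    For each replier, returns the average over word classes of
    P(reply uses class | original uses class) - P(reply uses class).
    """
    # Per replier, per class: [replies, replies_with_class, originals_with_class, both]
    stats = defaultdict(lambda: defaultdict(lambda: [0, 0, 0, 0]))
    for _speaker, original, replier, reply in exchanges:
        orig_cls, reply_cls = classes_used(original), classes_used(reply)
        for name in FUNCTION_WORDS:
            counts = stats[replier][name]
            counts[0] += 1
            counts[1] += name in reply_cls
            counts[2] += name in orig_cls
            counts[3] += name in orig_cls and name in reply_cls

    scores = {}
    for replier, per_class in stats.items():
        diffs = []
        for total, with_cls, orig_with, both in per_class.values():
            if orig_with:  # the conditional probability is only defined here
                diffs.append(both / orig_with - with_cls / total)
        scores[replier] = sum(diffs) / len(diffs) if diffs else 0.0
    return scores


if __name__ == "__main__":
    sample = [
        ("boss", "Send the report and the slides", "intern",
         "I will send the report and the slides"),
        ("boss", "Fix it now", "intern", "Fixing it now"),
        ("intern", "I finished the draft and a summary", "boss", "Good"),
        ("intern", "Done", "boss", "Thanks"),
    ]
    print(echo_scores(sample))
```

In this invented sample the intern mirrors the boss's articles and conjunctions while the boss does not mirror the intern, so the intern receives the higher score. Real conversations would need far more data before such a score meant anything.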
Though these findings may seem like a simple curiosity, the article points out potential real world ramifications. Companies might analyze email exchanges to determine the leaders among their employees. Also, if done in real time, the technique could influence key conversations like negotiations and interviews.
Perhaps we now have another way to probe private communications and manipulate people?
Cynthia Murrell, January 5, 2012
Sponsored by Pandia.com
Digital Reasoning Connects with TeraDact
January 4, 2012
Big data analytics specialist Digital Reasoning has been a regular topic of discussion here at Beyond Search, most recently for securing Series B funding for a big data intelligence push.
Now we would like to share an exciting new development in the quest to solve the big data problem, announced in the news release “Digital Reasoning and TeraDact Partner to Automatically Remove Sensitive Information from Big Data.”
According to the release, TeraDact Solutions, a software tools and data integration solutions provider, has integrated its TeraDactor Information Identification and Presentation capabilities with Synthesys Cloud, a software-as-a-service data analytics solution.
The news story states:
In conjunction with Synthesys, TeraDactor can automatically assist in appropriately classifying information not recognized by the original data provider. TeraDactor allows participants to push and pull information without waiting for the declassification process, assuring that formerly classified documents may be released without unintended leakages.
The innovative technology that TeraDact Solutions brings to Digital Reasoning’s table demonstrates the power of Synthesys as a cloud-based data analytics tool in building the next generation of Big Data analytic solutions. Kudos to the surging Digital Reasoning organization.
Jasmine Ashton, January 4, 2012
SharePoint Troubleshooting Videos for the Enterprise Weary
January 4, 2012
The GetThePoint Blog is highlighting a series of troubleshooting videos being offered by Microsoft regarding SharePoint and its well-publicized list of common pain points.
As the write-up expresses:
Microsoft’s Technical Readiness team has been building a collection of what they call ‘Break/Fix’ videos that address specific technical issues when using Office 365 . . . The Microsoft SharePoint End-user Content team is investigating creating similar quick videos that address specific pain points for SharePoint, and would apply to both O365 online and on-premises versions.
While we understand the need for online assistance, especially in terms of quick fixes, the real issue lies in the fact that so many of these fixes are needed in the first place. SharePoint has never promised to be a complete solution, an out-of-the-box application that needs no further tweaking. However, the extent to which SharePoint has to be customized and manipulated leads us to believe that a third-party solution might be a less painful enterprise option.
A third-party solution worth a second look is Fabasoft Mindbreeze and its suite of solutions. Mindbreeze, which received the KM World Trendsetting Product Award for the fourth consecutive year in 2011, is often lauded for being customizable yet extremely efficient out of the box.
“Fabasoft Mindbreeze Appliance as a pre-packaged solution (hardware and software) offers a quick and easy way to enjoy a high-end enterprise search solution out-of-the-box. The product is ready to use within a very short timeframe. ‘We make it easy for our customers. We deliver the ready-to-run appliance and configure it together with the customer via an online meeting. Fabasoft Mindbreeze Enterprise is then ready for use.’ – Daniel Fallmann describes some of the advantages of the solution.”
Fabasoft Mindbreeze works as a standalone enterprise solution, or can be used to enhance an existing SharePoint infrastructure through the Mindbreeze Connectors offerings.
Emily Rae Aldridge, January 4, 2012
Protected: The Cloud Yields to a Typo
January 4, 2012
Do Not Put the PLM Cart before the Data Management Horse
January 4, 2012
New-age PLM systems bring much-needed maturity, enabling consistent engineering data management, collaboration, and visualization of different CAD formats. Adopting one prompts a detailed examination of how the enterprise is organized to work with engineering assets and, because PLM addresses data structures, raises questions about product structures.
Creative Tip to Avoid Indexing in SharePoint Fast
January 4, 2012
At his Tech and Me blog, Mikael Svenson provides a unique search tip in “How to Prevent an Item from Being Indexed with FAST for SharePoint.” Keeping an item from being indexed in FAST based on the metadata or text of a file has long been considered next to impossible. Svenson, however, has found a way, and that way is through profanity. Yes, you can use the Offensive Content Filter to your advantage. The article explains:
The thing about the offensive content filter is that it will prevent documents from being indexed if they contain a certain amount of bad language. If you get embarrassed by such words, then skip reading 🙂 . . . So now we have a stage which can drop items, the rest is to assign enough bad words to ‘ocfcontribution’ to get above the threshold it triggers on.
See the write up for a detailed description of how to implement this creative approach.
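As a rough illustration of the threshold idea only, here is a toy Python model. Nothing in it comes from FAST itself: the word list, weights, threshold, and the `noindex` flag are all hypothetical stand-ins, and the real filter's scoring is described in Svenson's write-up, not here.

```python
# Toy model only: the word list, weights, and threshold are hypothetical,
# not FAST's actual values or scoring. Mild stand-in words are used on purpose.
OFFENSIVE_WEIGHTS = {"darn": 1, "heck": 1, "blast": 2}
DROP_THRESHOLD = 5


def offensive_score(text):
    """Sum the weights of the listed words found in the text."""
    return sum(OFFENSIVE_WEIGHTS.get(word, 0) for word in text.lower().split())


def should_index(body, ocf_contribution=""):
    """Index the item unless body plus the extra field reaches the threshold."""
    total = offensive_score(body) + offensive_score(ocf_contribution)
    return total < DROP_THRESHOLD


def contribution_for(item):
    """Hypothetical rule: flagged items get enough listed words to trip the filter."""
    return "blast blast blast" if item.get("noindex") else ""


if __name__ == "__main__":
    flagged = {"body": "quarterly report", "noindex": True}
    normal = {"body": "quarterly report"}
    for item in (flagged, normal):
        print(item["body"], should_index(item["body"], contribution_for(item)))
```

The point of the sketch is that the item's own text stays clean; only the extra contribution field pushes a flagged item over the line.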
Svenson notes one important caveat: if you have any documents containing profanity that you actually want to have indexed, this solution may backfire. Avoid difficulties by tapping the deep search expertise of Search Technologies.
Iain Fletcher, January 4, 2012
The Yahoo Foray into Content Analysis
January 4, 2012
Is it too late, or can Yahoo still get into the content analysis game? On December 21, ReadWrite Enterprise reported, “New Yahoo Content Analysis API Available Today.” Writer David Strom explains:
[The API’s] aim is to rank content by overall relevance, point to particular Wikipedia pages and annotate the results with extensive meta-data. The service is available as a Yahoo Query Language (YQL) table and more information can be found here. You can try out a sample query request and see the XML code that is returned in response, as well as documentation for the particular fields that are part of the interface.
Developed for internal use, the API is now fair game for any developer familiar with YQL. A couple of interesting points: key terms can be extracted from the content stream for ranking purposes. Also, content can be mapped to the Yahoo taxonomy. English and Chinese are currently supported, but more may become available.
Well, Yahoo, perhaps it is better late than never. We are not sure which company warrants closer observation: Hewlett Packard, Research in Motion, or Yahoo. A toss-up, maybe?
Cynthia Murrell, January 4, 2012
Watson Fights Cancer: Talented Search System That
January 4, 2012
As if to keep proving that it can do anything, IBM’s Watson is taking on oncology. eWeek reports, “IBM’s Watson to Help Doctors Diagnose, Treat Cancer.” The AI supercomputer will be working with the Cedars-Sinai cancer center and insurance company WellPoint to evaluate cancer treatment options. Writer Brian T. Horowitz explains:
Using its data analytics and NLP [Natural Language Processing] capabilities, Watson would integrate data such as medical literature, patient histories, clinical trials, side effects and outcomes data to help doctors decide on courses of treatment. . . . Watson would also look at the characteristics of a patient’s cancer and make recommendations on cost-effective treatment that would lead to the best outcome.
Of course, this advice would not replace that of a doctor, but it could become a valuable tool. Other health care organizations have been turning to technology for solutions. For example, Dell just donated an entire cloud infrastructure to the Translational Genomics Research Institute for storing medical trial data on pediatric cancer.
Good to see technology being used for the good of humanity, right? We would like to see IBM put Watson up on a test corpus for the public to use. Wishful thinking I suppose.
Cynthia Murrell, January 4, 2012
Everlasting Metadata?
January 4, 2012
Professional photographers are working to protect their rights in the digital world, as CNET reveals in “Should Metadata Be Permanent?” The groups supporting an initiative to require that metadata be permanently attached to image, text, audio, and video files are understandably focused on protecting copyrights. However, the move could have other repercussions. Writer Alexandra Savvides points out:
Imagine a whistle-blowing case involving photographic evidence, where the metadata clearly reveals who took the photo. The manifesto also doesn’t seem to address issues of data tampering or manipulation. We’ve seen numerous cases where photo-encryption systems have been cracked, showing that an obviously manipulated image is an original file created by a camera in question. There is nothing to stop similar methodologies being developed that could change the metadata to imply that another person created an image.
It’s a thorny question. I sympathize with artists who must protect their work. On the other hand, there’s the law of unintended consequences. There is also the question of “language drift.” If metadata are not up to date, the searcher of the future might not be able to locate the information object because the search term does not match the metadata’s lingo.
Our question, though, is a little more pragmatic: what if the metadata needs to be changed? Hmm. Inconvenient, that.
Cynthia Murrell, January 4, 2012
Google and Marketing: Welcome to 2012
January 3, 2012
Short honk: It is difficult for me to muster much enthusiasm for commenting on Google’s 2012 marketing efforts. I burned out on the GOOG after writing The Google Legacy, Google Version 2.0: The Calculating Predator, and Google: The Digital Gutenberg. Google has allowed itself to fritter away its opportunities.
One comment about “Google’s Jaw Dropping Sponsored Post Campaign for Chrome”: are we surprised that an outfit which uses the Muppets to position information retrieval is pumping out “interesting” content?
Image source: http://chowying.com/a-quick-example-of-local-adaptation-1053.html
There’s a reason Amazon has pulled ahead in cloud computing. There’s a reason Apple is doing a good job with tablets. There’s a reason Microsoft is making points with procurement teams in DC with Office 365. There’s a reason the huge Google DC data center is underutilized. There’s a reason that Web sites are complaining that their search results are unpredictable. There’s a reason I use Yandex for certain searches.
So what happened to Google after 2006? The Muppets. Oh, yes. The image I want in my mind for next-generation search and retrieval. Google certainly has that consumer angle nailed just like dataspaces, the Programmable Search Engine, and predictive search. And content spam. I almost forgot content spam.
Stephen E Arnold, January 3, 2012