Pentaho, Semphonic and Infobright Partner to Tackle Big Data

August 13, 2012

A partnership among three analytics organizations is another one to keep your eye on. The press release, “Pentaho, Semphonic and Infobright Deliver Big Data Web Analytics,” describes a union intended to help organizations maximize the value of their Websites as lead generation and revenue engines.

Pentaho, specializing in business analytics, is teaming up with Web analytics consultancy Semphonic and high-performance database vendor Infobright to tackle big data concerns. The partnership is intended to bring together everything organizations need to understand their online channels better and to act on that knowledge.

Susan Davis, Vice President of Marketing at Infobright, comments on the partnership in the news release:

“This partnership makes it easy for organizations to use their detailed Web analytics data to drive business improvement.  With their expertise in Web analytics and measurement, Semphonic helps organizations understand what data is important and how to use it to their advantage. Using Infobright technology simplifies the process of storing and analyzing the growing volumes of Web data at a fraction of the cost of traditional approaches, while letting companies gain a deep level of data analysis delivered in interactive user-driven visualization and dashboards from Pentaho.”

An interactive demo is available at http://webanalytics.infobright.com/ and will show users how to gain insight from detailed Web data. The trio is sure to help organizations find a way to meaningfully and easily understand digital behavior at the customer level.

Andrea Hayden, August 13, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Japanese Government Uses Social Network Data to Reduce Suicide

August 13, 2012

Technology Review recently reported on behavior analysis through social networks in the article “Spotting Suicidal Tendencies on Social Networks.”

According to the article, a history of abnormally high suicide rates among Japanese men (ages 20 to 44) and women (ages 15 to 34) has caused the Japanese government to invest heavily in suicide research and prevention in hopes of cutting the rate by 20 percent by 2017.

One of the tactics being discussed is identifying people who have regular thoughts of suicide, also known as suicide ideation, through their social networks. At the University of Tokyo, Naoki Masuda and a few others have been studying the popular Japanese social network Mixi, which has over 25 million members.

After identifying user communities that may be more prone to suicide ideation, and comparing them with a control group, Masuda found that the differences were quite subtle. There were no differences in friend numbers, age, or gender between the two groups.

Some differences included:

“People prone to suicide ideation are likely to be members of more community groups than the control group. That may be the result of spending longer online and of a desire to want to interact. But a key indicator seems to be that these people are much less likely to be members of friendship triangles. In other words, they have fewer friends who also friends of each other.  This low density of friendship triangles appears to be a crucial.”
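To make the "friendship triangle" idea concrete, here is a minimal sketch in Python. It is not the researchers' method or data; the tiny friend graph and the function names are made up for illustration. A triangle exists when two of a user's friends are also friends of each other, and the density is the share of a user's friend pairs that close into triangles.

```python
from itertools import combinations

def count_triangles(friends, user):
    """Count friendship triangles that include `user`.

    `friends` maps each user to the set of that user's friends
    (hypothetical data; friendships are assumed to be mutual).
    """
    neighbors = friends.get(user, set())
    return sum(
        1
        for a, b in combinations(sorted(neighbors), 2)
        if b in friends.get(a, set())
    )

def triangle_density(friends, user):
    """Fraction of the user's friend pairs who are friends of each other."""
    k = len(friends.get(user, set()))
    if k < 2:
        return 0.0  # fewer than two friends: no pair can form a triangle
    possible_pairs = k * (k - 1) / 2
    return count_triangles(friends, user) / possible_pairs

# Made-up example: "ann" and "eve" each have three friends,
# but only ann's friends know one another.
friends = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann"},
    "eve": {"fay", "gus", "hal"},
    "fay": {"eve"},
    "gus": {"eve"},
    "hal": {"eve"},
}

for user in ("ann", "eve"):
    print(user, count_triangles(friends, user), triangle_density(friends, user))
```

Both users in the sketch have the same number of friends, which mirrors the finding that friend counts did not differ between groups; only the triangle density separates them.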

This is an interesting application of algorithms. Utilizing social networks to discover the links between online and offline behavior is a burgeoning field, and gaps remain in our understanding.

Jasmine Ashton, August 13, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Informatica Study Reveals Attitudes on Personal Data

August 10, 2012

Informatica provides us with some interesting information on perceptions of the personal data issue in “UK Consumers Rank Top Contributors to Personal Data Deluge.” The data integration company commissioned a survey of over 2,000 consumers in the UK in May 2012 in order to discover their attitudes and behaviors when it comes to sharing personal data with businesses. Not surprisingly, young people were found to be the least reluctant to hand over personal information; they are also the group most ready to accept that supplying such data can result in better service.

Some highlights of the report include:

“*Only 35 per cent of UK adults trust businesses to use their personal data as directed by them. . . .

*59 per cent of 18 to 24 year olds and 48 per cent of 25 to 34 year olds agreed that if businesses provided clearer explanations of why they wanted their personal data, and what it will be used for, they would be more inclined to give it to them.

*Further to that, almost one in ten (9%) of the younger generation (those aged 18 to 34) felt that the more personal information they provide a business with, the better the service they receive as a result.”

Another interesting finding: sixty-one percent of respondents chose their family doctor as the least likely to share their information with a third party. Facebook’s score on that question was the lowest, at thirteen percent. That high?

See the press release for more findings from the survey. Informatica’s take-away is that companies must communicate better with users about the ways in which they will use their data. I wonder, though; if the issue is a matter of trust, how much help will clearer language really be?

Informatica boasts that it is the world’s foremost independent provider of data integration software, with nearly 5,000 organizations using its products. Though the company has offices around the world, its headquarters can be found in Redwood City, California.

Cynthia Murrell, August 10, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Lexmark Touts Brainware as a Global Player

August 6, 2012

In its News Blog, Lexmark praises Brainware’s latest global associations in “Brainware Update: Capturing Relationships Globally.” The piece explains that data capture outfit Brainware is “expanding more than ever before,” with several new partnerships in new regions formed over the last six weeks alone. These new allies include Mexico’s STN Latam, whose specialty is finance resource planning; outsourcing and IT consultants Novosit in the Dominican Republic; and IT services firm Content Concepts, operating in the Asia Pacific market. Nice work. See the write up for more details on each enterprise.

The Lexmark second quarter earnings call also mentioned Brainware, stating:

“Perceptive announced that the University of Kansas plans to expand the use of Perceptive Software solutions to a university-wide contract. This will also include the use of Brainware’s award-winning Distiller software to streamline invoice processing. . . .

“And we are leveraging our MPS enterprise presence along with our new Brainware intelligent capture expertise to help our customers extend their smart MFPs to now scan, classify and extract key content from documents all automatically and deposit the content directly into a core system or process, reducing time and manual labor costs in the process.”

It sounds like Brainware is bringing a lot to Lexmark’s projects. Brainware was formed in 2006 with the buyout of technology from SER Solutions. The company emphasizes that its auto-learning data capture and search solutions are scalable and user-friendly.

Veteran enterprise search technology vendor ISYS Search Software, another Lexmark acquisition, also received a (brief) mention in the earnings call.

Cynthia Murrell, August 6, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Google Fails to Delete Data as Ordered by ICO

August 6, 2012

An article in The Telegraph, “Google: We Failed to Delete All Streetview Data,” reveals another big privacy “whoops” from Google.

In 2010, Google collected data over open WiFi networks during its Streetview mapping in Britain and a number of other countries around the world. Britain recently reopened the investigation as Google has revealed that “human error” prevented the company from deleting all the data it was ordered to destroy by the Information Commissioner.

The article includes a statement from the ICO:

“This data was supposed to have been deleted in December 2010. The fact that some of this information still exists appears to breach the undertaking to the ICO signed by Google in November 2010. […] Google indicated that they wanted to delete the remaining data and asked for the ICO’s instructions on how to proceed. Our response, which has already been issued, makes clear that Google must supply the data to the ICO immediately, so that we can subject it to forensic analysis before deciding on the necessary course of action.”

Google maintains that the collection was unintentional, and apologizes to the ICO and the public. Apparently it is easier to apologize than ask for permission. Cool. If the investigators in this case were of Google caliber, perhaps they would understand “human error” as Google defines it.

Andrea Hayden, August 6, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Open Source Data Treasure Trove

August 2, 2012

Hungering for open source data? The H Open reports, “Data on 500,000 Open Source Projects Available.” The trove comes from Ohloh, a directory of open source projects maintained by Black Duck Software. The data on the listed projects is available under the Creative Commons Attribution 3.0 license. The write up reveals:

“The company also made a RESTful API available that allows information about the projects to be queried. Ohloh analyses projects from around 5,000 repositories, including GitHub, SourceForge, Google Code, kernel.org, Eclipse, Mozilla, and Apache.

“To use the API, users first have to register for a key and they can then query metrics such as the number of active contributors and commits, the number of lines of code, the main programming language used, and licensing information for the project. Black Duck uses the Ohloh database to identify items such as particularly active new projects or licensing trends.”

The folks at Black Duck are also working on a new version of their code search for Ohloh, now in beta; they hope the search will help improve links between project code and metadata.

Since its founding in 2002, Black Duck Software has been dedicated to supplying strategy, products, and services to enable the adoption of open source software on the enterprise scale. In fact, they boast that they are the leading provider of such solutions worldwide.

Cynthia Murrell, August 2, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Disagreement on Value of Big Data

August 1, 2012

Is the Big Data phenomenon good or bad for society? The Pew Research Center’s Internet & American Life Project and Elon University’s Imagining the Internet Center recently performed a study that gathered some pretty strong opinions on both sides of the issue, we learn in MediaPost’s “Pew: Value of ‘Big Data’ Debated.”

The survey asked over a thousand technology pros about Big Data, and more than half of them agreed that, by 2020, it will be a “huge positive for society in nearly all respects.” Researchers noted that:

“Big Data proponents predict continuing development of real-time data analysis and enhanced pattern recognition that could bring revolutionary change to personal life, business, and government.”

Probably so. However, a sizable minority (thirty-nine percent) disagree with the rosy outlook, asserting that, by 2020, Big Data will prove to be “a big negative.” I suspect that a field of (informed) non-technical respondents might have turned up an even larger proportion of naysayers. Writer Mark Walsh tells us these survey takers:

“. . . noted that the people controlling the collection and management of large data sets are typically governments or corporations with their own agendas. Dissenters also pointed to a shortage of human curators with the tools to sort through the glut of data, increasing the possibility that data can be manipulated or misread.”

Also true. Hmm.

The write up is an interesting read, and the opinions that accompany the survey results even more so (if you have the time to go through them). My take—like any powerful invention, Big Data collection and analysis can be employed for weal or woe, depending on who’s using it. Where would our society be if we rejected every technology that could be used nefariously? I’m afraid that individual, corporate, and governmental integrity are still the keys. Yes, even now.

Cynthia Murrell, August 1, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Oracle Text Makes Search Scores Adjustable

July 29, 2012

Oracle Text Search lets you sort search results by score, according to IT Newscast’s article, “Adjusting the Score on Oracle Text search results.”

They explain the process in layman’s terms as:

“In theory, the more relevant the search term is to the document, the higher ranked Score it should receive. But in practice, the relevancy score can seem somewhat of a mystery. It’s not entirely clear how it ranks the importance of some documents over others based on the search term. And often times, once a word appears a certain number of times within a document, the Score simply maxes out at 100 and the top results can be difficult to discern from one another.”
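The tie problem described in the quote can be illustrated with a toy model. This is not Oracle Text's actual scoring formula; the function name, the points-per-hit value, and the document counts below are assumptions chosen only to show how a hard cap at 100 makes very different documents look identical.

```python
def capped_score(term_count, points_per_hit=10, cap=100):
    """Toy relevance score: grows with term frequency, then stops at `cap`.

    Hypothetical model for illustration only; the real Oracle Text
    algorithm is not reproduced here.
    """
    return min(cap, term_count * points_per_hit)

# Made-up term counts for three documents.
term_counts = {"doc_a": 12, "doc_b": 40, "doc_c": 3}

scores = {name: capped_score(count) for name, count in term_counts.items()}

# Documents that hit the cap cannot be told apart by score alone.
tied_at_cap = sorted(name for name, score in scores.items() if score == 100)

print(scores)
print("tied at the cap:", tied_at_cap)
```

In this sketch doc_b mentions the term more than three times as often as doc_a, yet both land on the same maximum score, which is the situation the article sets out to adjust.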

To index, search and analyze text both in the Oracle database and on the web, Oracle Text uses standard SQL. This software is capable of utilizing keyword search, context queries, Boolean operations, mixed thematic queries, HTML/XML and more.

It can also perform linguistic analysis and support multiple languages with its advanced relevance ranking technology. Additional features, like clustering and classification, are available for those who need even more advanced search methods.

Oracle has been a leader in database software for more than three and a half decades. Their knowledge on adjusting search results should not come as a shock. Oracle is one company that will probably remain on top with enterprise grade applications and platform services.

Jennifer Shockley, July 29, 2012

Sponsored by IKANOW

Sinequa Generates Voltage at Siemens

July 28, 2012

The electricity is flowing at Siemens, and Sinequa is generating the voltage. LeMagIt’s article “Siemens Adopts Sinequa for its Retrieval” talks about the company’s new collaborative platform for business information and expertise exchange.

The project is called TechnoSearch, and it makes short work of mass data, combining it with Siemens’ TechnoWeb into an efficient, accessible data source. Siemens will be pulling knowledge and technological know-how from millions of documents, databases and applications and combining it within one unified source. This opens the door to a range of uses and collaborations in the future.

According to Thomas Lackner, director of the project “Open Innovation” at Siemens:

“The traditional document management systems often provide insufficient research capacity. In addition, some information is available only to a select group of employees and some teams work closed using Web 2.0 technologies for communication and knowledge sharing. Our goal was to establish a universal solution for all our employees abroad, which extracts the relevant information from data sources the most diverse and put available to all via a single platform.”

The highly charged current may continue into the future, as Siemens has plans to extend the applications. Currently they are supported on TechnoSearch Sinequa SS8, where unified access is granted to content for those connected.

Open innovation is a bright concept, but with high voltage comes risk. Let us hope Siemens has a surge protector, just in case.

Jennifer Shockley, July 28, 2012

Sponsored by Polyspot

Finally Some Praise for Discovery

July 28, 2012

One of the challenges many businesses face today is how to deal with unstructured data in the ever-increasing big data realm. The answer is not that complicated, according to Business 2 Community’s article “Discovery in the Age of Big Data.” One just has to ‘discover’ the solution.

How do we discover the solutions? Simply put, we search:

“Data exploration and discovery is a critical component of any big data initiative. Search is an important tool in this exploration and discovery process.

“Successful big data implementations take a phased approach and deciding what data to roll into your big data platform is part of this process. This data exploration phase is critical in developing and understanding what data exists, what is missing and how the data ties to the use case scenarios most important to the business.”

Today’s business owners want it all, and with all the tools at their disposal they can. Efficient use of search and discovery can provide a wealth of solutions and relevant information. There are enhanced navigation and visualization tools available that will take users well beyond the fundamental keyword searches commonly used.

Once users start combining search with their existing metadata, extracted entities and dynamic document clustering, the ability to extract value will be greatly increased. The end result is the ‘discovery’ of greater return on investment in the future.

Finally, some praise for discovery.

Jennifer Shockley, July 28, 2012

Sponsored by Polyspot
