Connotate Adds More Team Players

March 1, 2013

One field that weathered the economic downturn and is currently exploding is IT. Connotate, a leader in data monitoring and extraction solutions, wants to be a few steps ahead of the predicted growth for 2013. So-Co-It explains how in the story, “Connotate Expands Senior Management Team.” The new Connotate team members are Frank Hunt, CFO; Jeff Sacks, CMO; and Bogdan Sabac, VP of engineering. The new additions are expected to guide the company in a new direction:

“’In preparation for accelerated growth in 2013, we recruited top executive talent to help the company scale, to clearly communicate with customers and to continue to build relationships with financial partners who want to be part of our expansion,’ said Keith Cooper, CEO of Connotate. ‘Frank, Jeff and Bogdan are experts in their field as well as experienced and aggressive entrepreneurs in their own right. They have already started helping us prepare for an exciting year ahead.’”

As the new VP of engineering, Sabac will take the lead in developing new data collection projects and monitoring solutions. Jeff Sacks is a well-regarded online marketer with experience in health content, financial services, and consumer packaged goods. His plan is to expand and change Connotate’s marketing strategy at all levels. Hunt, as the new CFO, will be responsible for all finance-related activities.

Connotate is gearing up for some big changes, and now we know how it plans to manage them. This is yet another company to follow.

Whitney Grace, March 01, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

Scientists and Businesspeople Work Together for Big Data Research Solution

February 28, 2013

Ah, we’re pleased to see this real-world step with regard to big data. Science Daily informs us, “Solving Big-Data Bottleneck: Scientists Team with Business Innovators to Tackle Research Hurdles.” Researchers from Harvard Medical School, Harvard Business School, and London Business School have partnered to apply the benefits of a commercial crowdsourcing platform to a significant challenge—finding a data-analysis program that can handle the complexities of biological research analysis. The article reveals:

“Partnering with TopCoder, a crowdsourcing platform with a global community of 450,000 algorithm specialists and software developers, researchers identified a program that can analyze vast amounts of data, in this case from the genes and gene mutations that build antibodies and T cell receptors. Since the immune system takes a limited number of genes and recombines them to fight a seemingly infinite number of invaders, predicting these genetic configurations has proven a massive challenge, with few good solutions.

“The program identified through this crowdsourcing experiment succeeded with an unprecedented level of accuracy and remarkable speed.”
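The combinatorics behind that challenge are easy to appreciate: even a modest number of gene segments recombines into a huge space of possible receptors. Here is a toy Python sketch; the segment counts are hypothetical, loosely sized after human heavy-chain genes, and not taken from the study itself:

```python
from itertools import product

def receptor_space(v_genes, d_genes, j_genes):
    """Enumerate every V-D-J combination a toy immune system could form."""
    return [f"{v}-{d}-{j}" for v, d, j in product(v_genes, d_genes, j_genes)]

# Hypothetical segment pools (illustrative sizes only).
v = [f"V{i}" for i in range(1, 45)]   # 44 V segments
d = [f"D{i}" for i in range(1, 24)]   # 23 D segments
j = [f"J{i}" for i in range(1, 7)]    # 6 J segments

combos = receptor_space(v, d, j)
print(len(combos))  # 6072 combinations, before junctional diversity multiplies it further
```

And 6,072 is only the starting point; random insertions and deletions at the segment junctions push the real number far higher, which is why prediction is so hard.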

This is certainly a worthy big-data application, and confirmation that folks from different disciplines can effectively work together to accelerate progress. Before the project could really get going, though, the biologists had to translate their query into less-specialized language for the TopCoder community. After that, the viable suggestions came rolling in, and researchers picked their solution from an array of good choices. (Alas, the article does not disclose which software was selected.)

Researchers see more cross-discipline projects in the future; Harvard Business School’s Karim Lakhani notes that existing platforms and communities can provide a speedy alternative to the creation of custom data-analysis solutions. Yes, let’s hear it for cooperation!

Cynthia Murrell, February 28, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

Open Data Movement Seen as Falling Short in Canada

February 28, 2013

Here’s more good news for the closed data crowd. In his Whimsley blog, writer Tom Slee explains “Why the ‘Open Data Movement’ is a Joke.” The post was spurred by a couple of developments in his native Canada: the country’s inclusion in the international Open Government Partnership, and budget cuts that imperiled jobs at Statistics Canada. Slee writes:

“A government can simultaneously be the most secretive, controlling Canadian government in recent memory and be welcomed into the club of ‘open government’. The announcements highlight a few problems with the ‘open data movement’ (Wikipedia page):

*It’s not a movement, at least in any reasonable political or cultural sense of the word,

*It’s doing nothing for transparency and accountability in government,

*It’s co-opting the language of progressive change in pursuit of what turns out to be a small-government-focused subsidy for industry.”

It is worth noting that Slee’s opinions are Canada-specific. He wishes “open government data” were more of a synonym for “transparent government.” (He excludes the “open scientific data” movement from his criticisms.) He observes:

“There seems to be no link between the government’s actions and the actions of this ‘movement’, and basically that’s because the Open Data Movement is more focused on formats, digitally-accessible data sets, free access to postal codes, and so on than it is focused on actual government transparency around issues that matter. It’s a movement that has had no impact on government accountability.”

See the article for a list of grievances Slee has with the current prime minister and his apparently opaque administration. The write-up encourages Canadian progressives to take a hard look at what is (and is not) actually happening in the name of open government data. To my mind, though, incremental progress is better than no progress at all.

Cynthia Murrell, February 28, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

MarkLogic Takes Olympic Coverage From Probable Nightmare to Practical Success

February 26, 2013

Most people never really think about how news organizations transmit data across continents when there is a big event. For the 2012 Summer Olympics, the Press Association relied on MarkLogic’s XML repository, which can store and query hundreds of thousands of pieces of metadata per second.

In “How PA Cleared The Big Data Hurdle At The London Olympics,” the Press Association’s director of technical architecture, John O’Donovan, gives readers an in-depth look at how the organization coped with more than 50,000 requests per second.

“The problem with that is having to sit down and design a relational database model that can represent everything that’s in the XML. That takes quite a lot of time, you have to build all of your input/output extenders and map XML objects into relational stores.”

At first look it seems an impossible task: organizing all of the photos, biographical information, statistics, and competition results for thousands of athletes and beaming it to televisions, phones, and computers everywhere. But by keeping the data in a native XML store, instead of mapping it into a relational database and then converting it back to XML, the PA made it possible.
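The contrast O’Donovan describes can be sketched in miniature: querying XML metadata directly in its native form, with no relational schema or object-mapping step in between. This toy example uses Python’s standard-library parser rather than MarkLogic’s actual XQuery interface, and the event record is invented:

```python
import xml.etree.ElementTree as ET

# Invented event record; real PA feeds were far richer.
doc = """
<event sport="athletics" venue="Olympic Stadium">
  <athlete name="Runner A" country="JAM"><result rank="1" time="9.63"/></athlete>
  <athlete name="Runner B" country="USA"><result rank="2" time="9.75"/></athlete>
</event>
"""

root = ET.fromstring(doc)
# Query straight against the document tree -- no relational mapping required.
winners = [a.get("country") for a in root.findall("./athlete")
           if a.find("result").get("rank") == "1"]
print(winners)  # ['JAM']
```

The design win is that no one had to sit down in advance and model every sport’s quirks as tables and join keys; the documents carry their own structure.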

The approach simplified the delivery system, cutting the effort to get it off the ground from roughly 100 man-days to 34, and was so successful that the Press Association will be utilizing the new system for all of its wire and output communications.

A big thumbs up to MarkLogic for handling the process and to the PA for finding a new way to utilize an already reliable resource.

Leslie Radcliff, February 26, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

Original Ways To Use Social Media Data

February 18, 2013

I recently heard on a news program that instead of taking cigarette breaks, people are now taking Facebook, Twitter, or other social media breaks. The concept of being constantly connected is integrating into society as a regular, even necessary, habit for some people. As a result, social media creates a lot of data, and organizations want to take advantage of its multiple uses. While social media data provides the standard trends, habits, and so on, some organizations have found interesting ways to harness it. Digimind takes a look at “5 Innovative And Original Uses of Social Media Data.”

The article lists five amazing and practical ways universities have used various social media outlets. The University of Bristol tracked the UK’s public mood and found that negativity rose strongly during poor economic times. The University of Virginia is trying to detect early signs of adverse drug reactions, while Virginia Tech is looking into a project to find vehicle defects for auto manufacturers. Digimind even launched a Web site to track global funding deals in real time. The best, though, involves saving dolphins:

“This is a definite contender for one of the most noble uses of social media ever. Scientists in Australia’s Duke University used data from social media (Twitter, Flickr, Facebook and YouTube) to document ecosystems and development in Western Australia in an impressive bid to protect “the last great marine wilderness left on Earth”.

By using the digital footprint of volunteers to map the state of coastal ecosystems, in particular the snubfin and humpback dolphins, researchers were able to detect where human activities and marine resources overlap and potentially conflict. It’s easy to imagine how a similar social media mapping project could be extended into other areas of conservation to monitor the status of endangered and threatened plants and animals.”
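The underlying idea — bucketing geotagged posts into coarse grid cells and flagging cells where human activity and dolphin sightings coincide — can be sketched as follows. The coordinates and tags here are invented for illustration, not drawn from the study:

```python
# Invented geotagged posts: (latitude, longitude, tag).
posts = [
    (-17.96, 122.21, "boating"),
    (-17.97, 122.22, "snubfin_dolphin"),
    (-16.40, 123.10, "humpback_dolphin"),
    (-17.95, 122.20, "fishing"),
]

def cell(lat, lon, size=0.5):
    """Snap a coordinate to a coarse grid cell."""
    return (round(lat / size) * size, round(lon / size) * size)

human = {cell(lat, lon) for lat, lon, tag in posts if "dolphin" not in tag}
dolphins = {cell(lat, lon) for lat, lon, tag in posts if "dolphin" in tag}
conflicts = human & dolphins  # cells where activity and sightings overlap
print(conflicts)  # {(-18.0, 122.0)}
```

The same bucketing trick extends naturally to any conservation target with a social media footprint, which is exactly the generalization the article suggests.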

Imagine! Using technology to save the Earth instead of destroying it. Social media information holds a lot of potential to do more than track consumer habits. Maybe it even holds the key to world peace.

Whitney Grace, February 18, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

DataFacet Video

February 15, 2013

DataFacet’s stream of news slowed in late 2012. The outfit seems to be quiet; what’s going on over there? While we wait for their next move, check out the interesting video on the DataFacet Web site, which effectively introduces their product. It begins with a good explanation of “taxonomy,” which might be useful to bookmark in case you need to define the term for someone unfamiliar with the field. The video goes on to show someone using parts of the DataFacet system, which gives a much better idea of what it does than any text explanation could. It’s set to a catchy tune, too.

The product description surrounding the video specifies:

DataFacet provides a taxonomy based data model for your enterprise’s unstructured information along with a sophisticated, yet easy to use, set of tools for applying the data model to your content.

It’s an easy three step process:

  1. Choose your foundation taxonomies from the DataFacet library of over 500 topic domains
  2. Customize your taxonomy with DataFacet Taxonomy Manager
  3. Tag your content with DataFacet Taxonomy Server
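Step 3, the tagging itself, can be illustrated with a minimal sketch. This is a toy term-matching tagger, not DataFacet’s actual API; the taxonomy categories and terms are invented:

```python
# A taxonomy maps category labels to the vocabulary terms beneath them.
taxonomy = {
    "Finance": {"invoice", "audit", "revenue", "ledger"},
    "Legal":   {"contract", "liability", "clause", "counsel"},
}

def tag(text, taxonomy):
    """Return the taxonomy categories whose terms appear in the text."""
    words = set(text.lower().split())
    return sorted(cat for cat, terms in taxonomy.items() if words & terms)

print(tag("The audit flagged an unpaid invoice", taxonomy))  # ['Finance']
```

A commercial system would, of course, handle synonyms, phrases, and disambiguation rather than bare word matching, but the data model — content in, category labels out — is the same.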

DataFacet is already available for a number of search and content environments.

DataFacet is actually a joint project, built by taxonomists from WAND and Applied Relevance. Based in Denver, Colorado, WAND has been developing structured multi-lingual vocabularies since 1998. Their taxonomies have been put to good use in online search systems, ad-matching engines, B2B directories, product searches, and within enterprise search engines.

Applied Relevance offers automated tagging to help organizations contextualize their unstructured data. They have designed their user interface using cross-platform JavaScript and HTML5, which gives their application the flexibility to run in a browser, be embedded in a Web page, or be hosted in an Adobe Air desktop application.

Cynthia Murrell, February 15, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

DataStax Attempts Security with NoSQL

February 13, 2013

DataStax, a company built around the Cassandra NoSQL database, has announced the release of DataStax Enterprise 3.0. The new platform is not just an upgrade; it is really an overhaul. Kristin Bent covers the release for CRN in the story, “DataStax Merges Enterprise Security, NoSQL In Big Data Platform.”

The article states:

“Big data applications vendor DataStax said this week it will start shipping its next-generation data management platform on Feb. 25, a release the company says melds the flexibility of NoSQL databases with enterprise-level security. The new platform, dubbed DataStax Enterprise (DSE) 3.0, is targeted at organizations looking to adopt NoSQL databases — a type of next-generation, non-relational database optimized for big data — without sacrificing the robust security features native to more traditional SQL databases, explained Robin Schumacher, vice president of products at DataStax.”

Most NoSQL solutions do not have built-in security, but enterprises have grown used to advanced security features. DataStax hopes to bridge the gap by bringing enterprise security solutions to the NoSQL base. However, some may not trust the first version of such a blended solution. Many will still trust traditional enterprise search solutions built on trusted names, for instance, LucidWorks and its use of Apache Lucene and Solr technologies.

Emily Rae Aldridge, February 13, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

Half Our Medical Treatments May or May Not Work

February 8, 2013

It seems like we should be past this point by now. The Washington Post reports, “Surprise! We Don’t Know if Half Our Medical Treatments Work.” Where are the big-data breakthroughs in this, one of humanity’s most crucial fields?

The British Medical Journal recently undertook a project called simply Clinical Evidence, which examined some 3,000 treatments that have been evaluated in randomized, controlled trials. For fully half, the studies are inconclusive. It is important to note that this does not mean that half the time we are using treatments of unknown effectiveness, but rather that we do not know the worth of half the treatments out there, including those rarely used. Still, that is a disturbing gap in our body of knowledge.

The report says the mystery-value treatments are those “for which there is insufficient data or data of inadequate quality.” That’s the part that is hard for me to wrap my head around. I guess the data management pros have some work to do in this area.

The lack of information can impact very concrete decisions. Writer Sarah Kliff reminds us:

“When health policy wonks talk about ending unnecessary care, they usually mean targeting these types of treatments — the ones where we have no idea whether they’re making us any healthier, but still increase spending.

“There are specific bodies dedicated to figuring out whether these 1,500 treatments actually work. That includes the Patient Centered Outcomes Research Institute, or PCORI, which was created by the health-care law to study comparative effectiveness research. . . . This Clinical Evidence research suggests they’ll have no shortage of medical treatments to study.”

Indeed. Let us hope our lack of intel does not send any hidden medical miracles into the dustbin of time.

Cynthia Murrell, February 08, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

DataStax Enterprise 3.0

February 7, 2013

DataStax hooked itself to the Facebook entity and now pitches its newest version, we learn from the Register’s “DataStax Cranks Up Facebook NoSQL to 3.0 with Enterprise Features.” The article explains what to expect from the latest release of its DataStax Enterprise Edition. It is worth noting that this company also offers a search system.

Writer Timothy Prickett Morgan informs us that DataStax’s raison d’être is to commercialize the open-source Cassandra NoSQL data store created by Facebook. The company does offer a stripped-down Community Edition for free, but the list of features available only in the Enterprise Edition is significant. Version 3.0 tackles perceived security flaws in Hadoop with new features, including some tweaks it is releasing to the open-source community edition of Cassandra. Morgan writes:

“The open source tweaks include internal authentication and internal object permissions, with the same grant/revoke paradigm used by relational databases also being applied to the NoSQL data store – in this case, it is done at a table or column level. Databases also have row-level locking, but there is no analogy to this in a NoSQL data store. DataStax has also added client-to-node encryption based on the familiar SSL protocol to make sure that data being passed between Cassandra and an end user device is encrypted in flight.”
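The grant/revoke paradigm Morgan mentions can be modeled in a few lines. This is a toy illustration of table-level permissions, not DataStax’s actual implementation or Cassandra’s CQL syntax:

```python
# Toy model of table-level grant/revoke semantics.
class PermissionStore:
    def __init__(self):
        self._grants = set()  # (user, table, privilege) triples

    def grant(self, user, table, privilege):
        self._grants.add((user, table, privilege))

    def revoke(self, user, table, privilege):
        self._grants.discard((user, table, privilege))

    def allowed(self, user, table, privilege):
        return (user, table, privilege) in self._grants

perms = PermissionStore()
perms.grant("analyst", "events", "SELECT")
print(perms.allowed("analyst", "events", "SELECT"))  # True
perms.revoke("analyst", "events", "SELECT")
print(perms.allowed("analyst", "events", "SELECT"))  # False
```

The point of the real feature is simply that this familiar relational-style model now applies to a NoSQL store, down to the table and column level.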

Enterprise users can also count on external authentication, encryption for at-rest data inside the stack, data auditing features, and a commercial version of Cassandra (aka the DataStax Enterprise Database Server). Round-the-clock tech-support coverage is thrown in as well. The product is not quite ready for general release, but “early adopter customers” can take it for a spin now. Check back around the end of February for general availability.

Headquartered in San Mateo, California, DataStax was founded in 2010. Their Cassandra-based software implementations are flexible and scalable, and are employed by businesses from young startups to Fortune 100 companies, including such notables as Adobe, eBay, and Netflix.

Cynthia Murrell, February 07, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

Ai-One Touts Intelligent Agent Advantage

February 6, 2013

Is it another breakthrough in the analysis of unstructured text? Ai-one provides a detailed account of its data-analysis platform ai-BrainDocs in “Big Data Solutions: Intelligent Agents Find Meaning of Text.” The write-up begins with a summary of the familiar problems many organizations face when trying to make the most of the vast amounts of data they have collected, particularly the limitations of the keyword approach. Ai-one describes how they have moved beyond those limitations:

“Our approach generates an ‘ai-Fingerprint’ that is a representational model of a document using keywords and association words. The ‘ai-Fingerprint‘ is similar to a graph G[V,E] where G is the knowledge representation, V (vertices) are keywords, and E (edges) are associations. This can also be thought of as a topic model. . . .

“The magic is that ai-one’s API automatically detects keywords and associations – so it learns faster, with fewer documents and provides a more precise solution than mainstream machine learning methods using latent semantic analysis. Moreover, using ai-one’s approach makes it relatively easy for almost any developer to build intelligent agents.”
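The graph G[V,E] described above can be sketched with a simple co-occurrence builder. Ai-one’s actual fingerprinting method is proprietary, so this only illustrates the data structure the article names: keywords as vertices, with sentence-level co-occurrence counts as weighted edges:

```python
from collections import defaultdict
from itertools import combinations

def fingerprint(document, keywords):
    """Build edge weights between keywords that share a sentence.
    Uses naive substring matching and period-splitting -- a sketch only."""
    edges = defaultdict(int)
    for sentence in document.lower().split("."):
        present = sorted(k for k in keywords if k in sentence)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1  # edge weight = co-occurrence count
    return dict(edges)

doc = "Search engines index text. Intelligent agents search text for meaning."
print(fingerprint(doc, {"search", "text", "agents"}))
```

Even this crude version shows why associations carry more signal than keywords alone: “search” and “text” are linked twice, while “agents” attaches to both only once.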

The write-up tells us how to build such “intelligent agents,” delving into the perspectives of both humans and conventional machine learning (including natural language processing and latent analysis techniques). It concludes by describing the creation of their ai-BrainDocs prototype. The article is rich in detail—a worthwhile read for anyone interested in such mechanics.

Founded in Zurich in 2003, ai-one is now headquartered in La Jolla, California, with research in Zurich and European operations in Berlin. The company licenses their software to developers around the world, who embed it in their own products.

Cynthia Murrell, February 06, 2013

Sponsored by ArnoldIT.com, developer of Augmentext
