November 18, 2014
The article on CNN Money titled Varonis Announces Metadata Framework Version 6, Including New Functionality For Four Varonis Solutions explores the new features of Version 6. Varonis, a software provider, focuses on human-generated unstructured data, which might include anything from spreadsheets to emails to text messages. The company counts over 3,000 customers in fields as varied as healthcare, media, and financial services. The Varonis Metadata Framework has been refined over the last decade. The article describes it this way,
“[It is] a single platform on a unifying code base, purpose-built to tackle the many challenges and use cases that arise from the massive volumes of unstructured data files created and stored by organizations of all sizes. Currently powering five distinct Varonis products, the Varonis Metadata Framework intelligently extracts and analyzes metadata from customers’ vast, distributed unstructured data stores, and enables a variety of use cases, including data governance, data security, archiving, file synchronization, enhanced mobile data accessibility, search, and business collaboration.”
Exciting new features in Version 6 include a search API for DatAnswers, “bi-directional permissions visibility” for DatAdvantage to reduce operational overhead, and reduced risk through DatAlert, which now reports where and when malware has struck.
Chelsea Kerwin, November 18, 2014
November 12, 2014
The article titled The Five Rules for Data Discovery on Computerworld touts Enterprise Data Discovery as a tool for reaching relevant information quickly in the pursuit of fast-paced, accurate analytics. The first capability is “governed self-service discovery,” which lets users reformulate their data searches on their own and blend data types, including social media and unstructured data. The article also emphasizes the importance of having a dialogue with the data,
“You also discovered that the spike in sales occurred in the middle of the media campaign and during the time of the spike, there was a major sporting event. This new clue prompts a new question – what could a sporting event have to do with the spike? Again, the data reveals its value by providing a new answer – one of the advertisements from the campaign got additional play at the event. Now, you have something solid to work on.”
According to the article, Enterprise Data Discovery offers a view of the road less travelled, enabling users to approach their discovery with new questions. Of course, the question that arises while reading this article is, who has time for this? The emphasis on self-service is interesting, but it also suggests that users will be spending a good chunk of time manipulating the data on their own.
Chelsea Kerwin, November 12, 2014
November 10, 2014
Depending on one’s field, it may seem like every bit of information in existence is now just an Internet search away. However, as researchers well know, there is a wealth of potentially crucial information that is still difficult to access. In fact, GCN tells us that marketing firm IDC estimates up to 90 percent of “big data” falls into this category. GCN also turns our attention to a potential solution in, “Brown Dog Digs Into the Deep, Dark Web.”
Brown Dog is a project out of the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign. In 2013, the team received a $10 million, five-year award from the National Science Foundation for the project. Already, they have developed two services that facilitate access to uncurated data collections. The write-up reports:
“The first, called Data Access Proxy (DAP), transforms unreadable files into readable ones by linking together a series of computing and translational operations behind the scenes. Similar to an Internet gateway, the configuration of the DAP would be entered into a user’s machine settings. Thereafter, data requests over HTTP would first be examined by the proxy to determine if the native file format is readable on the client device.
“The second tool, the Data Tilling Service (DTS), lets individuals search collections of data, using an existing file to discover similar files in the data. For example, while browsing an online image collection, a user could drop an image of three people into the search field, and the DTS would return images in the collection that also contain three people. If the DTS encounters a file format it is unable to parse, it would use the Data Access Proxy to make the file accessible. It also indexes the data and extracts and appends metadata to files to give users a sense of the type of data they are encountering.”
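The Data Access Proxy’s gatekeeping step described above, checking whether a requested file’s format is readable on the client before deciding to convert it, can be sketched roughly as follows. This is an illustrative toy, not Brown Dog’s actual API: the format lists, converter registry, and function names are all invented, and the real services chain many translation tools behind the scenes.

```python
# Illustrative sketch of a Data-Access-Proxy-style format check.
# Format lists and converters are hypothetical placeholders.

CLIENT_READABLE = {"pdf", "png", "txt", "csv"}

# Hypothetical registry: source format -> (target format, converter function)
CONVERTERS = {
    "dwg": ("pdf", lambda data: b"%PDF-stub"),    # placeholder converter
    "sav": ("csv", lambda data: b"col1,col2\n"),  # placeholder converter
}

def proxy_fetch(filename: str, data: bytes) -> tuple[str, bytes]:
    """Return the file unchanged if the client can read it;
    otherwise route it through a conversion step."""
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext in CLIENT_READABLE:
        return filename, data
    if ext in CONVERTERS:
        target, convert = CONVERTERS[ext]
        base = filename.rsplit(".", 1)[0]
        return f"{base}.{target}", convert(data)
    raise ValueError(f"no conversion path for .{ext}")
```

The point of the sketch is the routing decision: a readable file passes through untouched, while an unreadable one is silently rewritten into a format the client can handle.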
The article notes that Brown Dog’s makers are building on previous software development, and that they hope to “bring together every possible source of automated help already in existence.” That’s some goal! Not surprisingly, the prospective tools have been likened to a time machine of sorts. Kenton McHenry, one of the project’s leaders, reminds us that the world’s first web browser, Mosaic, was also developed at NCSA; his team hopes to leave a similarly significant legacy.
Cynthia Murrell, November 10, 2014
November 5, 2014
Well, this is interesting. The Inquirer reports that the Germans are taking a stand against Google’s practice of consolidating users’ Web-wide data in, “Germany Tells Google to Pause for Permission Before Profiling People.” The Hamburg Data Protection Authority has a particular problem with Google’s one-privacy-policy-fits-all-countries stance. For its part, Google continues to assert that the “simpler, more effective services” it can provide by pulling the threads of our online presences are worth the privacy tradeoff. I’m sure the increased ad revenue is just a nice side-effect.
Reporter Dave Neal quotes Johannes Caspar, the Hamburg commissioner of data protection and freedom:
“On the substantial issue of combining user data across services, Google has not been willing to abide to the legally binding rules and refused to substantially improve the user’s controls. So we had to compel Google to do so by an administrative order. Our requirements aim at a fair balance between the concerns of the company and its users. The issue is up to Google now. The company must treat the data of its millions of users in a way that respects their privacy adequately while they use the various services of the company.”
I suppose we’ll see about that. What will be the next step in the struggle between Google and the world’s privacy advocates?
Cynthia Murrell, November 05, 2014
October 31, 2014
The article on Fortune titled The Company Was In a Death Spiral. She Brought It Back From the Brink lauds the work of Penny Herscher at data analytics firm FirstRain. Herscher took over the company in 2004 after successful work at Cadence Design Systems, Simplex and Texas Instruments. FirstRain was a bankrupt company with a great prototype but no product. Herscher embraced the challenges posed by FirstRain and began her overhaul with a move from New York to California. The article goes on,
“She raised $20 million from new investors and hired a trusted team, including chief operating officer Y.Y. Lee, a mathematician and software engineer… Today, more than 50% of FirstRain’s senior leadership is women. The fledgling company had barely started developing a product when storms began brewing on the horizon. It was 2008. The global economy was beginning to collapse. “The wheels came off the bus,” Herscher says with lament. To survive, the company had to completely change course again…It pulled through.”
But only after major layoffs and structural changes. Today FirstRain’s customers include IBM and Cisco, and the company continues to grow, with new offices in San Mateo. Herscher’s story of success is one of commitment and creative problem-solving.
Chelsea Kerwin, October 31, 2014
October 30, 2014
The information page titled What You Can Do With: Presto on Software AG Products provides an overview of the data-combining software known as JackBe until its acquisition by Software AG. JackBe is now Presto! (Exclamation point optional.) Information flow since March 2014 has been modest. The page describes capabilities such as in-memory analytics, visualization, and data mashing. It states,
“Presto combines data from any source for data visualizations. Accessing the original data—directly from data warehouses, news feeds, social media, existing BI systems, streaming big data, even Excel spreadsheets—lets business users respond to changing conditions as they happen. Presto’s “point-click-connect” assembly tool, Wires, makes it easy to bring together and manipulate data from multiple existing systems into meaningful data visualizations. Simple, powerful data mashing means IT and power users can create new apps and dashboards in hours—even minutes…”
Software AG began in 1969 in Germany and in 2013 acquired JackBe. According to the Company History page, the deal was actually awarded the title of Strategic M&A deal of the Year by the Association for Corporate Growth. Other acquisitions include Apama Complex Event Processing Platform, alfabet AG, and Longjump.
Chelsea Kerwin, October 30, 2014
October 28, 2014
Partnerships offer companies ways to improve product quality and create new products. Semantic Web reports that “Expert System And WAND Partner For A More Effective Management Of Enterprise Information.” Expert System is a semantic technology company, and WAND is known for its enterprise taxonomies. Their new partnership promises businesses a more accurate way to organize data.
Each company brings unique features to the partnership:
“The combination of the strengths of each company, on one side WAND’s unique expertise in the development of enterprise taxonomies and Expert System’s Cogito on the other side with its unique capability to analyze written text based on the comprehension of the meaning of each word, not only ensures the highest quality possible, but also opens up the opportunity to tackle the complexity of enterprise information management. With this new joint offer, companies will finally have full support for a faster and flexible information management process and immediate access to strategic information.”
Enterprise management teams may get excited about how Expert System and WAND will improve taxonomy selection and offer tighter integration with in-place data systems. One way the two will combine their strengths is automatic classification: when a WAND taxonomy is selected, Expert System brings in its semantic-based categorization rules and an engine for automatic categorization.
October 27, 2014
Here’s a new spin on scraping and parsing from Connotate’s blog, Web Data Insider. The recent emphasis on predictive analytics has writer Laura Teller discussing “The Data Supply Chain… and Why You Should Get One.” She reminds us that businesses now do much more with data than they used to. In fact, she asserts, any company that invests in data analytics possesses a critical advantage. Of course, as a prominent web-data extraction firm, Connotate does have a dog in this fight; at the same time, Teller has a point—for many businesses, especially larger ones, data analytics can be an indispensable tool.
Companies put considerable effort into streamlining their supply chains for other resources, so why not data? The article elaborates, and gives us a checklist for investigating our own data-supply needs:
“Once we start conceiving of data as a critical input or a brave new resource, it changes the paradigm of how we think about it, manage it, and leverage it. Data is no longer just an artifact of the ‘real work’ of companies. Rather, it’s something that has to be strategically sourced, managed, and leveraged. Just as companies have supply chains for other raw materials, like sugar, steel, electronic components, etc., they have to think about data in the same way and with the same rigor. They have many decisions to make:
*What to get and where they’ll get it
*How to ensure supply
*How to protect their ability to get it
*Who they’ll source from and how they’ll manage them
*What to pay for it
*How to store it
*How to refine it and add value to it
*How to package it for sale”
Teller notes that her company welcomes this “paradigm shift,” which is no surprise, considering that they are well-positioned to help customers address this burgeoning need. The company’s platform has been named a KMWorld “Trend-Setting Product” a healthy nine times. Based in New Brunswick, New Jersey, Connotate was founded in 2000.
Cynthia Murrell, October 27, 2014
October 13, 2014
The article titled What If Your Data Worked Together on The Woopra Blog makes a plea for normalized and federated information. It also stomps data silos in the dirt for causing frustration in both customers and employees. The call for efficiency in this article does laud certain companies for organizing relevant data, with the example that follows,
“I had ordered a bed from (Overstock.com) and called them a few days later to ask a question about the delivery. The woman who answered my call didn’t ask me for a single piece of information, just “How can I help you?”. She already knew exactly who I was and what I had ordered…she told me that their system automatically gave her my profile based on my phone number.”
This particular example resonated with me, especially after dealing with certain cable companies that seem to keep all of their data in lockboxes and throw away the keys. The article goes on to suggest that data silos hurt companies as much as customers by segmenting data and making it harder to see the whole story behind any single interaction. It ends with a promise of more on data harmony to come, and we can only hope that someone out there is listening.
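The mechanism behind the call-center anecdote above is simple to picture: several siloed systems keyed by the same identifier (here, a phone number) merged into one caller profile at lookup time. The sketch below is a minimal illustration with invented table names and fields, not any vendor’s actual design.

```python
# Minimal sketch of federating siloed records by a shared key
# (a phone number), as in the call-center example. All data and
# field names are invented.

orders = {"+1-555-0100": {"item": "bed", "status": "shipped"}}
support = {"+1-555-0100": {"tickets": []}}

def profile(phone: str) -> dict:
    """Merge per-silo records into one caller profile."""
    merged = {"phone": phone}
    for silo in (orders, support):
        merged.update(silo.get(phone, {}))
    return merged
```

The whole trick is agreeing on the shared key; without that, each silo keeps its own fragment of the story.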
Chelsea Kerwin, October 13, 2014
October 4, 2014
I read “After Legal Threat, Google Says It Removed ‘Tens of Thousands’ of iCloud Hack Pics.” On the surface, the story is straightforward. A giant company gets a ringy dingy from attorneys. The giant company takes action. Legal eagles return to their nests.
However, a question zipped through my mind:
What does remove mean?
If one navigates to a metasearch engine like Devilfinder.com, the user can run queries. A query often generates results with a hot link to the Google cache. Have other services constructed versions of the Google index to satisfy certain types of queries? Are there third parties that have content in Web mirrors? Is content removed from those versions? Does “remove” mean removed from the Fancy Dan pointers to content or from the actual Google or other data structure? (See my write-ups in Google Version 2.0 and The Digital Gutenberg to get a glimpse of how certain content can be deconstructed and stored in various Google data structures.)
Does remove mean a sweep of Google Images? Again, are the objects themselves purged, or are only the pointers deleted?
Then I wondered what happens if Google suffers a catastrophic failure. Will the data and content objects be restored from a backup? Are those backups purged?
I learned in the write up:
The Hollywood Reporter on Thursday published a letter to Google from Hollywood lawyers representing “over a dozen” of the celebrity victims of last month’s leak of nude photos. The lawyers accused Google of failing to expeditiously remove the photos as it is required to do under the Digital Millennium Copyright Act. They also demanded that Google remove the images from Blogger and YouTube as well as suspend or terminate any offending accounts. The lawyers claimed that four weeks after sending the first DMCA takedown notice relating to the images, and filing over a dozen more since, the photos are still available on the Google sites.
What does “remove” mean?
Stephen E Arnold, October 4, 2014