February 25, 2016
Outdated file formats, particularly those with no metadata, are especially difficult to search and utilize. The National Science Foundation (NSF) reports on a new search engine designed to plumb the unstructured Web in “Brown Dog: A Search Engine for the Other 99 Percent (of Data).” With the help of a $10 million award from the NSF, a team at the University of Illinois-based National Center for Supercomputing Applications (NCSA) has developed two complementary services. Writer Aaron Dubrow explains:
“The first service, the Data Access Proxy (DAP), transforms unreadable files into readable ones by linking together a series of computing and translational operations behind the scenes. Similar to an Internet gateway, the configuration of the Data Access Proxy would be entered into a user’s machine settings and then forgotten. From then on, data requests over HTTP would first be examined by the proxy to determine if the native file format is readable on the client device. If not, the DAP would be called in the background to convert the file into the best possible format….
“The second tool, the Data Tilling Service (DTS), lets individuals search collections of data, possibly using an existing file to discover other similar files in the data. Once the machine and browser settings are configured, a search field will be appended to the browser where example files can be dropped in by the user. Doing so triggers the DTS to search the contents of all the files on a given site that are similar to the one provided by the user…. If the DTS encounters a file format it is unable to parse, it will use the Data Access Proxy to make the file accessible.”
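To make the proxy idea concrete, here is a minimal sketch of the decision the quoted passage describes: check whether the client can read the native format and, if not, link together a series of conversions. Everything in it is an assumption for illustration, not Brown Dog's actual API: the names `find_conversion_chain` and `proxy_fetch`, the converter table keyed by (source, target) format pairs, and the choice of "shortest chain" as a stand-in for "best possible format."

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical converter table: (source format, target format) -> function.
Converter = Callable[[bytes], bytes]
Step = Tuple[str, str]


def find_conversion_chain(
    source: str, readable: Set[str], converters: Dict[Step, Converter]
) -> Optional[List[Step]]:
    """Breadth-first search for the shortest chain of conversions that
    turns `source` into any format the client can read."""
    if source in readable:
        return []  # already readable, nothing to convert
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        fmt, path = queue.popleft()
        for src, dst in converters:
            if src != fmt or dst in seen:
                continue
            new_path = path + [(src, dst)]
            if dst in readable:
                return new_path
            seen.add(dst)
            queue.append((dst, new_path))
    return None  # no chain of known converters reaches a readable format


def proxy_fetch(
    data: bytes, fmt: str, readable: Set[str], converters: Dict[Step, Converter]
) -> Tuple[bytes, str]:
    """Return (data, format) in a form the client can read, converting
    behind the scenes when the native format is not readable."""
    chain = find_conversion_chain(fmt, readable, converters)
    if chain is None:
        raise ValueError(f"no conversion path from {fmt!r} to a readable format")
    for step in chain:
        data = converters[step](data)
        fmt = step[1]
    return data, fmt
```

A search service in the spirit of the DTS could call something like `proxy_fetch` whenever it meets a format it cannot parse, which is the fallback the quote describes.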
See the article for more on these services, which NCSA’s Kenton McHenry likens to a DNS for data. Brown Dog conforms to NSF’s Data Infrastructure Building Blocks program, which supports development work that advances the field of data science.
Cynthia Murrell, February 25, 2016
January 26, 2016
Several recent articles have shed light on the dynamics at play in the cybercriminal marketplaces of the Dark Web; “How much is your Uber account worth?,” for example, was recently published by the Daily Mail. Summarizing a report from security researchers at Trend Micro for CNBC, the article explains that this new information extends the research in Intel Security’s earlier report, The Hidden Data Economy. Beyond describing a value hierarchy in which Uber and PayPal logins cost more than Social Security numbers and credit cards, the article shares insights on the bigger picture:
“’Like any unregulated, efficient economy, the cybercrime ecosystem has quickly evolved to deliver many tools and services to anyone aspiring to criminal behavior,’ said Raj Samani, chief technology officer for Intel Security EMEA. ‘This “cybercrime-as-a-service” marketplace has been a primary driver for the explosion in the size, frequency, and severity of cyber attacks.
‘The same can be said for the proliferation of business models established to sell stolen data and make cybercrime pay.’”
Moving past the shock value of the going rates, this article draws our attention to the burgeoning business of cybercrime. Much as Google has expanded the online ecosystem by serving as a connector, marketplaces on the Dark Web appear to be carving out a similar position. Quite the implications when you consider the size of the Dark Web.
Megan Feil, January 26, 2016
July 21, 2015
If media websites take this suggestion from an article at Monday Note, titled “How Linking to Knowledge Could Boost News Media,” there will be no need to search; we’ll just follow the yellow brick links. Writer Frederic Filloux laments the current state of affairs, wherein websites mostly link to internal content, and describes how embedded links could be much, much more valuable. He describes:
“Now picture this: A hypothetical big-issue story about GE’s strategic climate change thinking, published in the Wall Street Journal, the FT, or in The Atlantic, suddenly opens to a vast web of knowledge. The text (along with graphics, videos, etc.) provided by the news media staff, is amplified by access to three books on global warming, two Ted Talks, several databases containing references to places and people mentioned in the story, an academic paper from Knowledge@Wharton, a MOOC from Coursera, a survey from a Scandinavian research institute, a National Geographic documentary, etc. Since (supposedly), all of the above is semanticized and speaks the same lingua franca as the original journalistic content, the process is largely automatized.”
Filloux posits that such a trend would be valuable not only for today’s Web surfers, but also for future historians and researchers. He cites recent work by a couple of French scholars, Fabian Suchanek and Nicoleta Preda, who have been looking into what they call “Semantic Culturonomics,” defined as “a paradigm that uses semantic knowledge bases in order to give meaning to textual corpora such as news and social media.” Web media that keeps this paradigm in mind will wildly surpass newspapers in the role of contemporary historical documentation, because good outside links will greatly enrich the content.
Before this vision becomes reality, though, media websites must be convinced that linking to valuable content outside their site is worth the risk that users will wander away. The write-up insists that a reputation for providing valuable outside links will more than make up for any number of such drifting visitors. We’ll see whether media sites agree.
Cynthia Murrell, July 21, 2015
May 29, 2014
According to a press release from Virtual Strategy, Centrifuge Systems, a company that develops big data software, has created four new data connectors within its visual link analysis software. “Centrifuge Expands Their Big Data Discovery Integration Footprint” explains that with the additional data, users of the software will be able to make better business decisions.
“ ‘Without the ability to connect disparate data – the potential for meaningful insight and actionable business decisions is limited,’ says Stan Dushko, Chief Product Officer at Centrifuge Systems. ‘It’s like driving your car with a blindfold on. We all take the same route to the office every day, but wouldn’t it be nice to know that today there was an accident and we had the option to consider an alternate path.’ ”
The new connectors offer real-time access to ANX file structure, JSON, LDAP, and Apache Hadoop with Cloudera Impala. Centrifuge’s goal is to add more data points that give users a broader and more detailed perspective of their data. Centrifuge likes to think of itself as the business intelligence tool of the future. Other companies, though, offer similar functions with their software. What makes Centrifuge different from the competition?
March 27, 2014
The article on TechWorld titled “Tableau Folds Splunk Data Into Business Analysis” shares information on the new connector enabling the analysis of machine-generated data, developed in partnership by Tableau Software and Splunk. The collaboration allows for a better understanding of product analytics and customer experience, since Splunk’s software collects data on what customers do when they visit a website. The article explains:
“The new driver for Tableau expands the scope of how Splunk data can be used by the enterprise. It imports data captured by Splunk into Tableau’s data processing and visualization environment. As a result, business analysts can merge the event data generated by servers with other sources of data, which would potentially provide new insights into customer behavior or corporate operations…The connector is a ODBC (Open Database Connectivity) driver that is included in the Tableau 8.1.4 maintenance release.”
Splunk’s software was initially used more for finding issues in a system, but with the addition of analysis tools the software’s abilities were broadened. Now, instead of just noting trouble spots on a website, the software is used to discover patterns in customer behavior. The article uses the example of users filling shopping carts on a website but not making purchases. Managers use Splunk’s software to pinpoint the issue causing that lack of follow-through. Whether or not the partnership of Tableau and Splunk will pay off remains to be seen.
Chelsea Kerwin, March 27, 2014
October 18, 2013
Many organizations still see SharePoint as an internal enterprise tool and have yet to take advantage of any opportunity for external data integration. No doubt external integration is trickier and few organizations are willing to take risks. So, many are turning to the Layer2 Business Data List Connector to seamlessly integrate external data streams into an existing SharePoint infrastructure. OpenPR covers the product in their story, “Layer2 Business Data List Connector for SharePoint V5 Released To Close Gaps With External Data Integration.”
The article begins:
“Layer2 has announced version 5 of the SharePoint Business Data List Connector (BDLC) that connects almost any external corporate data source with native SharePoint lists and closes many gaps that still exist with SharePoint data integration.”
Add-ons are all too common when it comes to SharePoint deployments. Many gaps exist, just like the external data integration gap mentioned above. Stephen E. Arnold, of Arnold IT, is a longtime expert in search and a frequent critic of SharePoint. In a recent article, Arnold highlights that SharePoint is missing the mark on its critical functions, including search. Microsoft would do well to listen, but until a major redesign takes place, users will continue to rely on add-ons.
Emily Rae Aldridge, October 18, 2013
August 30, 2013
Specialized hardware vendor MaxxCAT offers a SQL connector, allowing their appliances to directly access SQL databases. We read about that tool, named BobCAT, at the company’s Search Connect page. We would like to note that the company’s web site has made it easier to locate their expanding range of appliances for search and storage.
Naturally, BobCAT can be configured for use with Microsoft SQL Server, Oracle, and MySQL, among other ODBC databases. The connector’s integration with MaxxCAT’s appliances makes it easier to establish crawls and customize output using tools like JSON, HTML and SQL. The write-up emphasizes:
“The results returned from the BobCAT connector can be integrated into web pages, applications, or other systems that use the search appliance as a compute server performing the specialized function of high performance search across large data sets.
“In addition to indexing raw data, The BobCAT connector provides the capability for raw integrators to index business intelligence and back office systems from disparate applications, and can grant the enterprise user a single portal of access to data coming from customer management, ERP or proprietary systems.”
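As a rough illustration of the pattern the write-up describes (crawl rows from a SQL database, index them, and hand results back as JSON for use in a web page or application), consider the sketch below. It does not use MaxxCAT's API. It assumes Python's built-in `sqlite3` module as a stand-in database, and `crawl_table` and `search_json` are hypothetical names invented for this example.

```python
import json
import sqlite3
from collections import defaultdict
from typing import Dict, List, Set


def crawl_table(
    conn: sqlite3.Connection, table: str, key_column: str, text_columns: List[str]
) -> Dict[str, Set[int]]:
    """Crawl one table and build an inverted index: token -> set of row keys.

    Table and column names are interpolated directly into the query, so this
    sketch assumes they come from trusted configuration, not from users.
    """
    index: Dict[str, Set[int]] = defaultdict(set)
    columns = ", ".join([key_column] + text_columns)
    for key, *values in conn.execute(f"SELECT {columns} FROM {table}"):
        for value in values:
            for token in str(value).lower().split():
                index[token].add(key)
    return index


def search_json(index: Dict[str, Set[int]], query: str) -> str:
    """Return the keys of rows containing every query token, as a JSON string."""
    tokens = query.lower().split()
    if not tokens:
        return json.dumps({"query": query, "hits": []})
    # Use .get so that looking up an unknown token does not add it to the index.
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return json.dumps({"query": query, "hits": sorted(hits)})
```

A real connector would also handle scheduling of crawls, incremental updates, and relevance ranking; the sketch covers only the crawl, index, and JSON output steps.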
MaxxCAT does not stop with its SQL connector. Their Lynx Connector facilitates connection to their enterprise search appliances by developers, integrators, and connector foundries. The same Search Connect page explains:
“The connector consists of two components, the input bytestream and a subset of the MaxxCAT API that controls the processing of collections and the appliance.
“There are many applications of the Lynx Connector, including building plugins and connector modules that connect MaxxCAT to external software systems, document formats and proprietary cloud or application infrastructure. Users of the Lynx Connector have a straightforward path to take advantage of MaxxCAT’s specialized and high performance retrieval engine in building solutions.”
Developers interested in building around the Lynx framework are asked to email the company for more information, including a line on development hardware and support resources. MaxxCAT was founded in 2007 to capitalize on the high-performance, specialized hardware corner of the enterprise search market. The company manages to offer competitive pricing without sacrificing its focus on performance, simplicity, and ease of integration. We continue to applaud MaxxCAT’s recently launched program for nonprofits.
Cynthia Murrell, August 30, 2013
February 18, 2013
Big data is exploitable and increasingly necessary for enterprise functionality as organizations become more complex, and it can provide endless opportunities for ROI. However, some organizations have not fully realized their potential to tap into this resource. ZDNet’s article “Big Data: Why Most Businesses Just Don’t Get It” discusses how these organizations want to look at multiple pieces of data across different information sources but lack the technology and manpower required.
Gartner vice president and distinguished analyst Debra Logan offers her insights in the referenced article, stating that 95 to 97 percent of the organizations she knows are currently only exploring possible big data solutions. However, research from Microsoft says 75 percent of organizations are implementing solutions in the next 12 months.
The article quotes Logan:
Software companies in general have no interest in helping you make anything smaller because they make their money from more data and the more disorganised that data is, the more money they make. The most advanced industry in terms of big data is retail. It’s the stuff they do with all the RFID, the supply chain, with loyalty cards. Those are big-data problems.
Enterprise organizations are faced with the very real problem of too much information that is scattered across various departments in silos. However, there are solutions like PolySpot that use connectors to break the barriers of incompatible data types to draw out important knowledge and information.
Megan Feil, February 18, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search.
February 12, 2013
A recent GigaOM article, “Why Big Data Matters and Data-ism Doesn’t,” furthers the conversation amid what some describe as a data backlash. New York Times columnist David Brooks is credited with coining the term data-ism to characterize the common phenomenon in which people reduce everything in the world to statistics, and the GigaOM writer agrees that data-ism is something to stay far away from.
While many data enthusiasts are simply content with lists of data and statistics for the sake of the data, it is important to see beyond the mere data points. Big data and the technological tools available are helping to further the possibilities and opportunities that data offers every field from research to business.
The author of this article states:
If there’s one thing I’ve learned, it’s that the real value of data isn’t just in uncovering statistical realities, but in finding methods for doing so where it was hitherto impossible and in creating entirely new products that change the way we interact with our world. Big data is a technological revolution centered around collecting, storing and processing more data of more types than ever before.
Solutions like PolySpot are one way to bring to the surface the larger connections hidden within the varied, high-volume data points that make up big data. This technology scores big in the realm of connectors, with over one hundred different types.
Megan Feil, February 12, 2013
February 6, 2013
Last year, Gartner forecasted that by 2017, the CMO would be spending more on IT than the CIO. The landscape of office politics has caused what Forbes is calling “Big Data Star Wars: The CMO/CIO Wars Continue.” Money drives decisions in external business affairs, and clearly the same is true for internal issues.
Acknowledging that IT budgets are shrinking and that CMOs’ desire to tap into insights from unstructured data platforms is only increasing, the article presents the trend as a continuing one.
The article states:
Marketers want to do more with big data in 2013, which probably means they will increase the pressure on the IT department or by-pass it with cloud-based resources. More than half of the survey respondents said they have already started implementing real-time data and plan to make greater use of it in 2013 to drive more personalized marketing campaigns, with another 30 percent saying they plan on using it for the first time or consider using it.
Many of the technology resources available to organizations aid in connecting departments across the enterprise with information access to insights churned from big data solutions. One such technology that we have seen make waves in this area is PolySpot. Their library of over one hundred connectors makes decision making for business professionals a cinch.
Megan Feil, February 6, 2013