Big Data Prompts Organizations to Rely on Innovative Information Delivery Solutions
November 21, 2012
The hot topics in any industry are inherently connected to corresponding hot jobs. In technology, big data would not be possible without data scientists. Recently, ZDNet posted a video featuring the musings of IBM data evangelist James Kobielus in the posting, “Why Data Scientists Are In Demand And How They Enable Big Data.” After facing some initial skepticism about whether big data even exists, Kobielus speaks to the reality of its impact on the world.
First, he explains that the concept of big data incorporates traditional technologies such as massively parallel databases, but that it also refers to many innovative open source projects focused on analytics. According to Kobielus, 1.2 million new jobs will be created in the big data analytics sector over the next decade. It is not surprising that many of these will fall under the category of data scientist.
The brief introduction to the clip summarizes the video:
In this two-minute video clip, IBM big data evangelist James Kobielus explains what a data scientist does, the skill set that it requires, how they collaborate with subject matter experts to deliver important insights, and why the role is so important to the future of IT and big data.
Big data means one thing and one thing only for most enterprise organizations: more information for employees to utilize in achieving ROI. Information management solutions such as those from PolySpot enable users to pinpoint the information they need in near real-time.
Megan Feil, November 21, 2012
Sponsored by ArnoldIT.com, developer of Augmentext.
Web Site Redesign Important for Many Reasons
November 21, 2012
When organizations want to redesign their Web presence, reasons like updating content or refreshing the look and feel often top the list. But with the increased news and controversy surrounding the Google algorithm, usability and search engine optimization should make that list as well. Search Engine Watch adds SEO considerations to the topic of Web site redesign in “Website Redesign? Get Some SEO Consultation Before You Launch.”
The author writes that organizations often get to the topic of SEO long after those initial discussions of color, content, and other cosmetics:
It’s typically not until launch is around the corner that folks start asking about SEO. ‘Sometimes’ they have serious discussions about usability . . . Usability and SEO go hand-in-hand. Search engines want to rank websites that provide a quality user experience for the searcher. How that’s defined can be somewhat subjective (every website is unique and its target audience will also be unique).
One way to increase usability and the overall user experience is to incorporate effective search. Fabasoft Mindbreeze Insite offers a cloud-based search service that requires no installation or maintenance and recognizes the semantics that are important to you and your users. Fabasoft Mindbreeze as a company has been an important and growing leader in enterprise services, offering a wide range of enterprise solutions. Consider adding Insite to your Web redesign plans to ensure that SEO and usability are addressed just as thoroughly as look and feel.
Emily Rae Aldridge, November 21, 2012
Sponsored by ArnoldIT.com, developer of Augmentext.
Big Data Profiling Hits Hadoop
November 21, 2012
IT News Online revealed some of the latest news featuring a leader in open source software: “Talend Simplifies Big Data Further with New Release of Enterprise Open Source Integration Platform.” Talend released version 5.2 of its next-generation integration platform, which is the only one that provides a unified environment for managing entire lifecycles for data, application, and process integration requirements. Version 5.2 adds support for NoSQL databases and data profiling for Hadoop. Talend’s biggest focus has been on making the Big Data process easier:
“In its mission to democratize big data, Talend has focused extensively on solutions that make deploying and managing Apache Hadoop and related technologies simple, without requiring specific expertise in these areas. With version 5.2, Talend has taken its big data strategy a step further by adding big data profiling for Hadoop, providing companies with the ability to discover and understand data in Hadoop clusters. Among the typical problems associated with data quality are duplication, incompleteness and inconsistency, which create inefficiencies in data processing. Talend Platform for Big Data includes new capabilities for visibility into big data in all its forms and locations.”
Version 5.2 also includes upgrades for products that use Talend’s Unified Platform. Big Data is very complex, and products like those from Talend make it easier to leverage the data and reap the benefits. LucidWorks has a Big Data search tool that was designed to find the hidden data in Big Data, making it another great tool for the Big Data handler. LucidWorks also has the trusted name and the customer support that others cannot boast.
Whitney Grace, November 21, 2012
Sponsored by ArnoldIT.com, developer of Augmentext
dtSearch Rolls Out New Filters
November 21, 2012
Dr. Dobb’s software development website recently reported on new proprietary search features that cover online and offline data types in the article, “New dtSearch Document Filter Products.”
According to the article, dtSearch, a text retrieval software company that allows users to instantly search terabytes of text, has announced the latest release of its product line. Version 7.70 sees improved document filters embedded across the entire dtSearch product line.
The article states:
“The new version extends the document filters to add image support to Word (.doc/.docx), PowerPoint (.ppt/.pptx), Excel (.xls/.xlsx), Access (.mdb/accdb), RTF, and email files including Thunderbird (mbox/.eml) and Outlook (.pst/.msg) files. The release displays these formats showing highlighted hits in context with both text and images. The release also adds support for Japanese Ichitaro documents.
dtSearch’s proprietary document filters support a broad range of data types from “Office” documents: MS Office, OpenOffice, RTF, PDF to emails and also MS Exchange, Outlook, Thunderbird — all with nested attachments.”
This company has received impressive reviews for its search power and indexing abilities. We can only assume that dtSearch 7.70 will be even better.
Jasmine Ashton, November 21, 2012
Sponsored by ArnoldIT.com, developer of Augmentext
Autonomy and HP: That Was Fast
November 20, 2012
I like to point out that making money via search and content processing is a challenge. Not long ago, an investment bank’s 30-somethings told me, “We can figure this search stuff out.”
Don’t hold your breath.
I just learned that HP is not too happy with Autonomy’s accounting or business model. Hmmm. Isn’t the time to be unhappy before one writes a check for $10 billion? Call me old-fashioned, but caveat emptor.
Navigate to “HP Takes $8.8B Charge on Autonomy ‘Improprieties’”. The write up asserts:
Hewlett-Packard took a massive charge related to its purchase of Autonomy and indicated that it bought the company based pumped-up and fraudulent accounting. In its fourth-quarter earnings report, HP recorded a charge of $8.8 billion in its software unit.
When I read this, I thought of the old chestnut attributed to either a cartoon character or a baseball professional, “It’s deja vu all over again.”
Interesting.
I never buy printer ink unless I check first. I assume the same prudence might apply to buying a search and content processing vendor.
Is there a fix? Yep, the goose floats quietly in the pond awaiting the call.
Stephen E Arnold, November 20, 2012
Investigate Google? Nah
November 20, 2012
Short honk: I am in a miserable, dark city listening to the second, make that the third, false fire alarm in a crummy hotel. I checked the news and saw this item, “Google Should Not Be Accused of ‘Unfair’ Acts: Lawmakers.” (This may be a wonky link, so you may have to hunt around.) I suppose this sounds like a pretty good idea. Why ask questions? Pointless? In today’s world, doing the ostrich thing seems popular. Why search, which requires formulating a query? Let a predictive system tell you what you need to know. The new world of information retrieval has arrived. Politics as an AdWord.
Stephen E Arnold, November 20, 2012
Renaissance in the Enterprise Calls for Proven Features
November 20, 2012
A general partner at venture capital firm Andreessen Horowitz stated the obvious at a recent conference: there are a lot of changes going on in the enterprise. Enough changes, says Andreessen Horowitz partner Peter Levine, that it could be considered a renaissance and an entirely new generation of creativity in the enterprise. According to the TechCrunch article “Andreessen Horowitz General Partner Peter Levine: There’s an Enterprise Renaissance Going On,” Levine compares the enterprise renaissance to the one that occurred in the city-states of Italy.
The article states:
“[…]Levine said, there is lots of proof that the renaissance is underway — well illustrated in the shift from the personal computer to mobile. The infrastructure has to change in this shift; the applications will have to be built natively to the mobile device. Services out of the back-end will need to be secured. The devices are getting more powerful and will have to integrate with distributed infrastructures around the world. Data platforms are just emerging. The development is just starting.”
The changes and the Big Data renaissance call for new ways of dealing with and addressing data. We recommend Intrafind, which offers some renaissance features that are tried and true due to their maturity in the enterprise search market. We look forward to the changes in the new age and believe businesses should prepare with the right tools to help them learn and collaborate in the emerging market.
Andrea Hayden, November 20, 2012
Sponsored by ArnoldIT.com, developer of Augmentext
No Industry Escapes the Big Data Revolution
November 20, 2012
Those seeking objective and statistically produced facts know that big data is a major fuel source of such information. Every so often it is refreshing to take a peek at how big data influences many key fields of study and industry. The Guardian article “Big Data: Revolution by Numbers” outlines each of these areas and discusses the impact big data has had on each.
Everything from sports to medicine has been revolutionized, or is in the process of being revolutionized, by the use of big data. Powerful technologies and ideas have transformed daily operations and even yielded life-changing outcomes. For example, Cambridge researchers stopped an MRSA outbreak affecting 12 babies in the Rosie Hospital by rapidly sequencing the genome of the bacteria.
The article delves into some examples of data-intensive projects in scientific research:
The data recorded by each of the big experiments at the Large Hadron Collider (LHC) at Cern in Geneva is enough to fill around 100,000 DVDs every year. Or take the Sloan Digital Sky Survey, which is measuring 500 distinct attributes for each of 100m galaxies, 100m stars and 1m quasars. The result: three terabytes of data, where a terabyte is 1,000 gigabytes. Analysing that volume of data is beyond the capacity of humans, so it has to be done by computers.
Enterprise organizations used to deal with the same overwhelming amounts of data stored and managed using legacy software. Fortunately, the influx of even more data has prompted many innovative software vendors such as PolySpot to develop information delivery solutions.
Megan Feil, November 20, 2012
Sponsored by ArnoldIT.com, developer of Augmentext.
Metalogix Announces Migration Tool for SharePoint 2013 Upgrade
November 20, 2012
Metalogix recently launched Content Matrix 6.0, a new take on the former Migration Manager product. Details of the tool can be read in “Speeding, Easing SharePoint Migration.” Content Matrix aims to provide power, speed, and flexibility for upgrading to SharePoint 2013 from any previous version. The product is explained:
Content Matrix 6.0 is designed to simplify an organization’s content experience, including moving to the cloud. In addition to SharePoint migration, architects and administrators can migrate file shares and documents from legacy enterprise content management (ECM) systems and keep SharePoint content organized in a high fidelity and ongoing basis. Further, content owners have more control over content directly from the SharePoint user interface.
The company is also introducing StoragePoint 4.0, which offers the ability to automatically and continuously back up unstructured SharePoint content. The tool may be worth looking into as a migration option. You may also want to consider a more comprehensive solution, especially for reducing content storage sprawl and adding structure to your vast unstructured data. Fabasoft Mindbreeze integrates knowledge from all sections of a company into a uniform, linked hub of business information. With the added benefit of a SharePoint connector, Mindbreeze snaps seamlessly into existing systems to extend capabilities and efficiently create relevant business knowledge.
Philip West, November 20, 2012
Sponsored by ArnoldIT.com, developer of Augmentext
Print is Dead but Journalism is Still Alive
November 20, 2012
Print media is going down, while digital media continues to grow. Big news moguls have been commenting that anyone with a computer or phone can be a reporter, which leads into the quantity vs. quality argument. With the digital media onslaught, however, new tools have entered the journalism world that make the field better. Computerworld makes note of one such tool in “Open Source Spotlight: How DocumentCloud Adds Depth to Digital Journalism.”
DocumentCloud is an open source product designed for the Internet journalist or college student. It provides bibliographic information, annotation tools, and a cloud where primary source documents can be added. DocumentCloud was made for journalists by journalists, and it is already being used by many top news Web sites. As an open source project, DocumentCloud is powered by:
“Behind the scenes the project is driven by software including Apache’s Solr/Lucene search platform. DocumentCloud also uses the Tesseract OCR engine developed by HP and open sourced in 2005. “We, in turn, have been giving back to the open source community as well,” Pilhofer says.”
Since the open source community prides itself on sharing, DocumentCloud shares every line of code. Journalism and technology have always worked hand in hand, though print and digital fight each other. DocumentCloud lowers the barrier for reluctant technology users. The DocumentCloud team takes advantage of Apache Lucene search, much as LucidWorks does for its search applications. LucidWorks uses Apache Lucene to power its trusted enterprise and Big Data search products.
Whitney Grace, November 20, 2012
Sponsored by ArnoldIT.com, developer of Augmentext