May 18, 2013
It is no surprise that 97 percent of state and local IT professionals expect their data to grow by more than 50 percent over the next two years. However, more than 75 percent of them are only somewhat or not very familiar with the term big data. These findings come from a recent MeriTalk report, and GCN did a nice write-up on the study’s implications in “Is Big Data Big Trouble for State, Local Governments?”
The respondents in “The State and Local Big Data Gap” were 150 state and local government CIOs and IT managers surveyed in November and December 2012.
The article lists more of the statistics gleaned from the study:
“Seventy-nine percent of responding agencies said it will be at least three years before they are able to take full advantage of big data, even though they see it improving overall efficiency (57 percent); increasing the speed and accuracy of the decision-making process (54 percent); and providing a greater understanding of citizens’ needs (37 percent). And although 79 percent said they were just somewhat or not very familiar with the term, they do report having the kind of problems that big data techniques are intended to solve.”
Are state and local governments able to tap the alleged power of big data? Maybe not yet. That is certainly the conclusion the numbers point to.
Megan Feil, May 18, 2013
May 17, 2013
MapR Technologies, which specializes in Hadoop for Big Data, announced a new partnership with LucidWorks to bring full-text search and discovery to the platform. Read about the benefits to customers in the KM World article, “MapR taps LucidWorks for Hadoop.”
The announcement begins:
“MapR Technologies has announced distribution of LucidWorks Search with the MapR Platform for Apache Hadoop. MapR says the move allows customers to perform predictive analytics, full search and discovery, as well as conduct advanced database operations, on a single platform. The MapR/LucidWorks enterprise-class search capability works directly on Hadoop data but can also index and search standard files without having to perform any conversion or transformation.”
LucidWorks is a known leader in full-text search for the enterprise. Its LucidWorks Search and LucidWorks Big Data solutions are built on the sturdy open source infrastructure of Apache Lucene/Solr. Partnering with LucidWorks adds functionality to MapR solutions and shows a willingness to do what is best for the customer. LucidWorks’ strong track record only adds to MapR’s reputation and legitimacy.
Emily Rae Aldridge, May 17, 2013
May 16, 2013
Dan Kuznetsky is a trusted authority in enterprise search. He brings his expertise to the topic of search in Big Data in his latest article for ZDNet, “Evolution of Search in Big Data as Told by LucidWorks.”
After a discussion of how LucidWorks is contributing to the open source community through its participation in the Apache Software Foundation, Kuznetsky goes on to explore this interesting development model:
“LucidWorks is one of a growing number of technology companies that are building products based upon open-source software that was created in products hosted by the Apache Software Foundation. It is fascinating how they are cooperating to build the basic technology and then focusing on different competitive market niches. Each time I mentioned what I thought was a competitor, the folks from LucidWorks pointed out that those companies are partners that are trying to use their individual strengths together to serve the market. This is an area that is worth watching.”
Kuznetsky hit on the strength of LucidWorks and the rest of the value-added open source market. Each company finds a niche that makes it both profitable and useful. In this way, innovation and open source development are encouraged, and users benefit from continuous improvement and support for the solutions in which they invest. Sounds like a win-win.
Emily Rae Aldridge, May 16, 2013
May 16, 2013
With the flood of interest in big data solutions and technology that can chop the masses of unstructured content down to size, we are also seeing a good deal of VC funding go towards startups in this area. Linux Today reports that “LucidWorks Pulls in $10M to Turn Open Source Data into ‘Business Gold’”.
LucidWorks started as Lucid Imagination in 2008 and focused on providing support, training and consulting for the open source search technologies Lucene and Solr. The company then saw an opportunity to make open source search more accessible by entering the big data market.
Providing a quick rundown of LucidWorks’ current technology offerings, the article tells us:
“LucidWorks product suite contains two development platforms that enable organizations to search, discover, and analyze their data. LucidWorks Search is built on top of Apache Lucene/Solr open-source search project and seeks to simplify and improve the process of building embedded search applications. The other product, LucidWorks Big Data, then helps businesses make sense of the data.”
We have been following LucidWorks since before its name changed, and the exciting news comes as no surprise. As one of the leaders in open source enterprise search technology, the company will undoubtedly remain near the top of our list to follow.
Megan Feil, May 16, 2013
May 14, 2013
This week the Text Radar big data and content intelligence blog covered a set of interesting topics pertinent to anyone interested in harnessing the power of big data insights.
“Data Analytical Decisions are More Definitive at Adding Insightful and Valuable Content” explains how important raw data is to business success. The use of this data, however, can be difficult to manage without experts to advise.
The article explains:
“This view is held even more firmly in the manufacturing, energy and government sectors, and 65 percent assert that more and more management decisions are based on ‘hard analytic’ information.
The research shows that organizations are increasingly moving towards evidence-based decision making, but at the same time, face significant challenges in managing and leveraging the ever-increasing volumes of data not only from a technology perspective but also as an organization.”
Another article, “Big Data Analysis Not a Simple Data Collection Technique,” dispels some of the rumors surrounding big data. It explains that big data mining is far more than simple data collection.
The article provides this example:
“Taking an influential paper on economics and intelligence efforts around the Boston bombing suspects as background, wherein a few missing rows in Excel and a misspelling of Boston Marathon bombing suspect Tamerlan Tsarnaev’s name, Wise points out that ‘data management tools (i.e., the FBI’s systems and Excel) were undone by fairly simple errors,’ with terrible results. In other words, as much as we may believe Big Data is as simple as ‘Input data into Hadoop, outcome insights!’ the reality depends heavily on the people querying that data.”
Managing data without the appropriate skill set can lead to the failure of any company. Used appropriately, big data can be especially helpful in recruiting, as “Mining Data for Finding Talent for Hire” shows. Gild helps companies find “diamonds in the rough,” individuals who have slipped through the cracks of traditional recruiting methods, by mining social media sites.
The article provides the thoughts of Gild’s chief scientist, Vivienne Ming:
“Dr. Ming doesn’t suggest eliminating human judgment, but she does think that the computer should lead the way, acting as an automated vacuum and filter for talent. The company has amassed a database of seven million programmers, ranking them based on what it calls a Gild score — a measure, the company says, of what a person can do. Ultimately, Dr. Ming wants to expand the algorithm so it can search for and assess other kinds of workers, like Web site designers, financial analysts and even sales people at, say, retail outlets.”
As you can see, data can be used to find the answers you’ve been searching for as long as you have the right tools. A company leading the way with text analytics tools is Smartlogic. Its suite of tools can join data with content and apply content analytics to that information, providing content intelligence that supports sound, reliable decision making in any environment.
Jasmine Ashton, May 14, 2013
May 14, 2013
A Business Wire press release caught our eye recently as it announced the distribution of LucidWorks Search with the MapR Platform for Apache Hadoop. “MapR Technologies Distributes Enterprise-Grade Search with Hadoop Platform” reports that customers will now have predictive analytics, search, discovery and advanced database operations at their fingertips on a single platform.
Integrating LucidWorks technology with MapR builds on the value that LucidWorks Search already offers in security, connectivity and user management. Additionally, MapR announced that the M7 Edition is available; it combines unprecedented Hadoop and NoSQL capabilities in one platform.
According to Ben Woo, managing director, Neuralytix:
“Integrating search capabilities into Hadoop is an important milestone for the industry and represents tremendous opportunity for customers to find new insight and derive value from Big Data. This is an enormous step forward especially in time-sensitive processes such as fraud detection where Big Data must be searched as it streams into the enterprise.”
MapR’s chief application architect tells us that using search and big data is not just about analyzing social media content and Web traffic. We wonder…big data and search: has the holy grail (or one of them) been found?
Megan Feil, May 14, 2013
May 8, 2013
Next week I am doing an invited talk in London. My subject is search and Big Data. I will be digging into this notion in this month’s Honk newsletter and adding some business intelligence related comments at an Information Today conference in New York later this month. (I have chopped the number of talks I am giving this year because at my age air travel and the number of 20-somethings at certain programs make me jumpy.)
I want to highlight one point in my upcoming London talk; namely, the financial challenge which companies face when they embrace Big Data and then want to search the information in the system and search the Big Data system’s outputs.
Here are the simplified curves:
Notice that precision and recall have not improved significantly over the last 30 years. I anticipate that many search vendors will tell me that their systems deliver excellent precision and recall. I am not convinced. The data which I have reviewed show that over a period of 10 years most systems hit the 80 to 85 percent precision and recall level for content which is about a topic. Content collections composed of scientific, technical, and medical information where the terminology is reasonably constrained can do better. I have seen scores above 90 percent. However, for general collections, precision and recall have not been improving relative to the advances in other disciplines; for example, converting structured data outputs to fancy graphics.
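For readers who want the two measures pinned down: precision is the share of returned documents that are relevant, and recall is the share of relevant documents that were returned. The short Python sketch below uses the standard set-based definitions. The document IDs and the `precision_recall` helper are made up for illustration and do not come from any vendor's system or from the data behind the curves.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query using set-based definitions."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents the system actually returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: the system returns five documents and five are truly relevant.
retrieved_docs = ["d1", "d2", "d3", "d4", "d5"]
relevant_docs = ["d1", "d2", "d3", "d4", "d6"]

precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={precision:.0%} recall={recall:.0%}")
```

In this toy case four of the five returned documents are relevant and four of the five relevant documents are found, so both measures come out at 80 percent, the low end of the range mentioned above.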
May 8, 2013
While there is some controversy over whether Hadoop is the only necessary tool to mine opportunities from big data, Hadoop and insights from big data seem to be synonymous according to Datamation’s recent article. They give us the rundown on “Seven Hot Hadoop Startups that Will Tame Big Data.”
According to this article, the current Hadoop ecosystem market is worth around $77 million. With growth, the value is projected to reach $813 million by 2016. The article notes that Hadoop has not proven completely effective in the enterprise world. Queries are still a weak point.
The article discusses seven startups, such as Alpine Data Labs, that intend to see Hadoop through to maturity. The following excerpt explains why Alpine Data is on the list:
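To put the two market figures in perspective, here is a rough Python sketch of the growth they imply. The $77 million and $813 million values come from the article; the number of years between them is not stated here, so the three-year and five-year spans below are assumptions chosen only for illustration.

```python
def growth_multiple(start, end):
    """How many times larger the end value is than the start value."""
    return end / start


def implied_cagr(start, end, years):
    """Compound annual growth rate needed to go from start to end in `years` years."""
    return (end / start) ** (1 / years) - 1


START_MILLIONS = 77   # market size cited in the article
END_MILLIONS = 813    # projected market size for 2016

print(f"Growth multiple: {growth_multiple(START_MILLIONS, END_MILLIONS):.1f}x")

# The span in years is an assumption, not a figure from the article.
for years in (3, 5):
    rate = implied_cagr(START_MILLIONS, END_MILLIONS, years)
    print(f"Implied annual growth over {years} years: {rate:.0%}")
```

Whichever span is used, the projection amounts to roughly a tenfold increase, which means the market would have to grow by well over half each year to get there.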
“According to Alpine Data, part of the problem is that it’s much too difficult to get real insights out of Hadoop and other parallel platforms. Most companies don’t know what to do with massive datasets, and few have gotten any further with Hadoop than batch processing and basic querying. Alpine Data set out to simplify machine-learning methods and make them available on petabyte-scale datasets. Their tools make these methods available in a lightweight web application with a code-free, drag-and-drop interface.”
With the amount of attention Hadoop has received over the years, Hadoop startups are no rarity. A list featuring a selection of the new ones to watch is much appreciated. Check out the full list of hot Hadoop startups.
Megan Feil, May 08, 2013
May 7, 2013
This week, the Text Radar news service covered some interesting stories that are pertinent to the world of big data analytics.
According to “Big Data Will Improve Decision Making and The Bottom Line,” four in five IT managers in India are making big data a priority in 2013.
The article states:
“Network traffic is doubling and tripling, driven by mobile devices, business applications, video, and Big Data – Almost half of IT managers surveyed in India (46 per cent) estimated networks loads to double over the next two years; while one in four (26 per cent) felt that this would triple in the next two years. However, only two out five surveyed (41 per cent) report they are ready for the surge in network traffic.”
In addition to helping IT professionals, big data is also helping solve crime. According to “Tax Information Made Public Thanks to Big Data,” big data has helped zero in on tax evaders.
Big data has particularly helped with self-employed people who have been less than upfront about their earnings:
“The NSO says there are over three million of them and the SSS has over 600,000 voluntary registrants who declare themselves as self-employed. We want to increase the average annual tax payment from P33,000 to P200,000 minimum which is reasonable as this means a monthly income of just around P50,000. If we’re able to increase the 400,000 tax filers to 1.5 million and the average payment to P200,000, that’s P300 billion pesos. At our current GDP, that’s three percent of GDP.”
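The arithmetic in that quote can be checked in a few lines of Python. All of the figures come from the quote itself. The one assumption is that the current P33,000 average payment applies to the current 400,000 filers, which the quote implies but does not state outright.

```python
# Figures taken from the quoted passage (amounts in Philippine pesos).
CURRENT_FILERS = 400_000
CURRENT_AVG_PAYMENT = 33_000      # assumed to apply to the current filers
TARGET_FILERS = 1_500_000
TARGET_AVG_PAYMENT = 200_000
GDP_SHARE_PERCENT = 3             # "three percent of GDP"

current_revenue = CURRENT_FILERS * CURRENT_AVG_PAYMENT
target_revenue = TARGET_FILERS * TARGET_AVG_PAYMENT

# If the target is three percent of GDP, the implied GDP is target / 0.03.
implied_gdp = target_revenue * 100 // GDP_SHARE_PERCENT

print(f"Current collections: P{current_revenue / 1e9:.1f} billion")
print(f"Target collections:  P{target_revenue / 1e9:.0f} billion")
print(f"Implied GDP:         P{implied_gdp / 1e12:.0f} trillion")
```

The target works out to P300 billion, matching the quote, against roughly P13.2 billion under the current numbers. The three percent figure implies a GDP of about P10 trillion.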
Another sector that has been greatly improved by big data technology is sports. According to “Big Data Technology in Sports Not Just About Performance But Also Improves Safety Factors,” baseball and other sports are using big data to maximize athletes’ potential success and to generate ongoing game statistics.
The article states:
“Professional Sport teams are taking action with Big Data and the information that can be attained is most impressive, and it is not just about performance. During the 2012 Olympic Games in London, real-time situational awareness was made available from sensors that improved safety factors. Also, Big Data was key to managing information during this year’s Super Bowl.”
As you can see, big data is impacting nearly every industry, not just us techies. But don’t worry, there are plenty of tools available to help you tap into your unstructured data. One tool that we highly recommend is Smartlogic’s Semaphore Content Intelligence Platform. It uses semantic software to make information easily accessible to its users.
Jasmine Ashton, May 02, 2013
April 30, 2013
This week, the Text Radar content intelligence, compliance, and big data news service covered quite a few interesting stories.
The first that I would like to highlight is, “Smartphone Data Used to Better Serve Customers.” According to the article, thanks to smartphones, app stores can tap into a wide range of data sources about user preferences and activity.
The article states:
“This ‘big data’ available within an app store can significantly help to tailor the user experience and offerings. For example, a user who lives in NYC and just landed in London might be interested in the ‘TimeOut: London’ app or ‘Booking.com’ app for booking a hotel. A user who posted a video on Facebook of the latest Knicks game may be interested in the ‘New York Knicks Official App,’ and a user who listens to Coldplay a lot, might want to download some Coldplay wallpapers.”
Another story explains how big data has brought the IT and marketing communities together. “Creating a Customer Centric Culture with Big Data Analytics” advocates the use of big data to create a customer centric corporate culture.
A study found:
“* 40% of marketers and 51% of IT executives said it’s critical for improved decision making.
* 36% of marketers and 23% of IT execs said data drives the ability to personalize customer experiences.”
The final story that I would like to highlight for this week’s issue involves big data’s impact on the health care industry. “Turning Unstructured Data into Healthcare Improvements” explains how doctors can find value using data from your mobile phone and other devices.
The author provides this example:
“For example, she said, an app could process data from a mobile carrier to determine whether new supplements for early-stage arthritis are actually helping a patient. If the patient is checking her phone earlier in the morning and moving around more frequently, that could indicate that the medicine it’s doing its job.
Service providers may balk at the prospect of releasing their troves of user activity data – and Estrin acknowledged that they would likely worry about PR headaches and privacy issues.”
It is important to understand the various ways big data can benefit your company. Smartlogic’s Semaphore Content Intelligence Platform runs on semantic technology, giving your organization’s information rich value and your users a better experience.
Jasmine Ashton, April 30, 2013