AI Is Key to Unstructured Data

October 5, 2017

Companies are now inclined to keep every scrap of data they create or collect, but what use is information that remains inaccessible? An article at CIO shares “Three Ways to Make Sense Out of Dark Data.” Contributor Sanjay Srivastava writes:

Most organizations sit on a mountain of ‘dark’ data – information in emails and texts, in contracts and invoices, and in PDFs and Word documents – which is hard to automatically access and use for descriptive, diagnostic, predictive, or prescriptive automations. It is estimated that some 80 percent of enterprise data is dark. There are three ways companies can address this challenge: use artificial intelligence (AI) to unlock unstructured data, deploy modular and interoperable digital technologies, and build traceability into core design principles.

Naturally, the piece elaborates on each of these suggestions. For example, we’re reminded AI uses natural language processing, ontology detection, and other techniques to plumb unstructured data. Interoperability is important because new processes must be integrated into existing systems. Finally, Srivastava notes that AI challenges the notion of workforce governance, and calls for an “integrated command and control center” for traceability. The article concludes:

Digital technologies such as computer vision, computational linguistics, feature engineering, text classification, machine learning, and predictive modeling can help automate this process.  Working together, these digital technologies enable pharmaceutical and life sciences companies to move from simply tracking issues to predicting and solving potential problems with less human error. Interoperable digital technologies with a reliable built-in governance model drive higher drug quality, better patient outcomes, and easier regulatory compliance.

Cynthia Murrell, October 5, 2017

European Tweets Analyzed for Brexit Sentiment

September 28, 2017

The folks at Expert System demonstrate their semantic intelligence chops with an analysis of sentiments regarding Brexit, as expressed through tweets. The company shares their results in their press release, “The European Union on Twitter, One Year After Brexit.” What are Europeans feeling about that major decision by the UK? The short answer—fear. The write-up tells us:

One year since the historical referendum vote that sanctioned Britain’s exit from the European Union (Brexit, June 23, 2016), Expert System has conducted an analysis to verify emotions and moods prevalent in thoughts expressed online by citizens. The analysis was conducted on Twitter using the cognitive Cogito technology to analyze a sample of approximately 160,000 tweets in English, Italian, French, German and Spanish related to Europe (more than 65,000 tweets for #EU, #Europe…) and Brexit (more than 95,000 tweets for #brexit…) posted between May 21 – June 21, 2017. Regarding the emotional sphere of the people, the prevailing sentiment was fear followed by desire as a mood for intensely seeking something, but without a definitive negative or positive connotation. The analysis revealed a need for more energy (action), and, in an atmosphere that seems to be dominated by a general sense of stress, the tweets also showed many contrasts: modernism and traditionalism, hope and remorse, hatred and love.

The piece goes on to parse responses by language, tying priorities to certain countries. For example, those tweeting in Italian often mentioned “citizenship”, while tweets in German focused largely on “dignity” and “solidarity.” The project also evaluates sentiment regarding several EU leaders. Expert System was founded back in 1989, and their Cogito office is located in London.

Cynthia Murrell, September 28, 2017

Millennials Want to Keep Libraries

September 22, 2017

Many people think that libraries are obsolete and are only for senior citizens who want to read old paperbacks.  The Pew Research Center says otherwise in the article, “Most Americans-Especially Millennials-Say Libraries Can Help Them Find Reliable, Trustworthy Information.”

Sensationalism in the news is not new, but it has reached extraordinary new heights with the Internet and mass information consumption. In order to gain audiences, news outlets (if some of them can be called that) are doing anything they can, and this has led to an outbreak of fake news.

The Pew Research Center conducted a survey to see whether adults would like to be taught how to recognize fake information and discovered that 61% said they would. They also discovered that 78% of adults feel that libraries can help them find trustworthy information. An even more amazing fact is that Millennials are the biggest supporters of libraries.

A large majority of Millennials (87%) say the library helps them find information that is trustworthy and reliable, compared with 74% of Baby Boomers (ages 52 to 70) who say the same. More than eight-in-ten Millennials (85%) credit libraries with helping them learn new things, compared with 72% of Boomers. And just under two-thirds (63%) of Millennials say the library helps them get information that assists with decisions they have to make, compared with 55% of Boomers.

People also use libraries to receive technology training and gain confidence in these skills. Another interesting finding is that women are more likely than men to say that libraries help them find reliable information. Hispanic people also love the library and see it as an essential tool to cope with the busy world. Also, those without a high school diploma say that libraries help them in more than one way.

Libraries are far from obsolete. Libraries are epicenters for technology training and for finding reliable and trustworthy information in a world hooked on sensationalism.

Whitney Grace, September 22, 2017


AI Will Build Better Chatbots

September 21, 2017

For better or worse, chatbots have well and truly supplanted the traditional customer service role. Sure, one can still reach a human at many companies with persistence, but it is the rare (and appreciated!) business that assigns a real person to handle point-of-contact. Geektime ponders, “What is the Future of Chatbot Development and Artificial Intelligence?” Writer Damian Wolf surveys chatbots as they now exist, and asserts it is AI that will bridge the gap between these simple systems and ones that can realistically replicate human responses. He writes:

The future of AI bots looks promising and exciting at the same time. The limitation in regards to accessing big data can be eradicated by using AI techniques. The ultimate aim for the futuristic chatbot is to be able to interact with users as a human would. Computationally, it is a hard problem. With AI evolving every day, the chances of success are already high. The Facebook AI chatbot is already showing promises as it was able to come up with negotiation skills by creating new sentences. E-Commerce will also benefit hugely with a revolution in AI chatbots. The key here is the data  collection and utilization. Once done correctly, the data can be used to strengthen the performance of highly-efficient algorithm, which in turn, will separate the bad chatbots from the good ones. … Automation is upon us, and chatbots are leading the way. With a fully-functional chatbot, e-commerce, or even a healthcare provider can process hundreds of interactions every single minute. This will not only save them money but also enable them to understand their audience better.

In order for this vision to be realized, Wolf insists, companies must invest in machine learning infrastructure. The article is punctuated with informative links like those in the quotation above; one I’m happy to see is this guide for non-technical journalists who wish to write accurately about AI developments (also good for anyone unfamiliar with the field). See the article for more useful links, and for more on chatbots as they currently exist.

Cynthia Murrell, September 21, 2017

Natural Language Queries Added to Google Analytics

August 31, 2017

Data analysts are valuable members of any company and do a lot of good, but in many instances, average employees – not versed in analyst-ese – need to find valuable data. Rather than bother the analysts with mundane questions, Google has upgraded their analytics to include natural language queries, much like their search function.

Reporting on this upcoming change, ZDNet explains what this will mean for businesses:

Once the feature is available, users will have the ability to type or speak out a query and immediately receive a breakout of analyzed data that ranges from basic numbers and percentages to more detailed visualizations in charts and graphs. Google says it’s aiming to make data analysis more accessible to workers across a business, while in turn freeing up analysts to focus on more complex research and discovery.

While this seems like a great idea in theory, it may still cause problems for users who lack the prior knowledge of the data or the analytic methods needed to ask the right questions. Unfortunately, data analysts are still the best resource when trying to glean information from analytics reports.

Catherine Lamsfuss, August 31, 2017

Google Is Rewiring Internet, Again

August 25, 2017

Google revolutionized the Internet by downloading the web's data onto its servers and offering fast search results. The search engine giant plans to do it again by introducing new network infrastructure to make search faster.

The Next Platform, in an article titled “How Google Wants to Rewire the Internet,” says:

Running a fast, efficient, hyperscale network for internal datacenters is not sufficient for a good user experience, and that is why Google has created a software defined networking stack to do routing over the public Internet, called Espresso.

The whole exercise of creating an extra layer or network infrastructure is to enhance user experience. As Google today generates 25% of the global Internet traffic, it is becoming difficult for the search engine giant to keep the results relevant.

Google used custom-developed routers and switches to implement this program. Hopefully, people will now be able to find what they are looking for without getting lost in a maze of sponsored advertisements.

Vishal Ingole, August 25, 2017

Where Your Names Intersect

August 21, 2017

Google Maps might be the top navigational app in the world, but some apps like can help its users find intersections across the US with a choice of their names.

According to an article published by Forbes titled, “A New Search Engine Finds Quirky Intersections Across the U.S“, the author says: can search for intersections anywhere in the country by name. Plug in two names – say, yours and your spouse’s – and you’ll likely find at least a handful of crossroads somewhere between Hawaii and Florida.

The app is really just a novelty, or perhaps a tool for an investigator who wants to find out how many intersections with a particular name exist in the country. Apart from a couple of fancy functions for a very niche audience, the app offers no real utility. Moreover, only a handful of players have so far been able to monetize their navigational apps. Thus, its long-term viability is still in question.

Vishal Ingole, August 21, 2017

Demanding AI Labels

August 16, 2017

Artificial intelligence has become a staple in technology-driven societies. It feels like that statement should still belong only in science fiction, but artificial intelligence is a daily occurrence in developed nations. We just do not notice it. When something becomes standard practice, one thing we like to do is give it labels. Guess what Francesco Corea did over at Medium in his article, “Artificial Intelligence Classification Matrix”? He created terminology to identify companies that specialize in machine intelligence.

Before we delve into his taxonomy, note his caveat: if the framework for labeling machine intelligence companies is too narrow, it is counterproductive to the sector’s purpose of maintaining flexibility. Corea came up with four ways to classify machine intelligence companies:

i) Academic spin-offs: these are the more long-term research-oriented companies, which tackle problems hard to break. The teams are usually really experienced, and they are the real innovators who make breakthroughs that advance the field.

ii) Data-as-a-service (DaaS): in this group are included companies which collect specific huge datasets, or create new data sources connecting unrelated silos.

iii) Model-as-a-service (MaaS): this seems to be the most widespread class of companies, and it is made of those firms that are commoditizing their models as a stream of revenues.

iv) Robot-as-a-service (RaaS): this class is made by virtual and physical agents that people can interact with. Virtual agents and chatbots cover the low-cost side of the group, while physical world systems (e.g., self-driving cars, sensors, etc.), drones, and actual robots are the capital and talent-intensive side of the coin.

There is also a chart included in the article that explains the differences between high vs. low STM and high vs. low defensibility.  Machine learning companies obviously cannot be categorized into one specific niche.  Artificial intelligence can be applied to nearly any field and situation.

Whitney Grace, August 16, 2017

Tidy Text the Best Way to Utilize Analytics

August 10, 2017

Even though text mining is nothing new, natural language processing seems to be the hot new analytics craze. In an effort to understand the value of each, the difference between them, and (most importantly) how to use either efficiently, O’Reilly interviewed text miners Julia Silge and David Robinson to learn about their approach.

When asked what advice they would give those drowning in data, they replied,

…our advice is that adopting tidy data principles is an effective strategy to approach text mining problems. The tidy text format keeps one token (typically a word) in each row, and keeps each variable (such as a document or chapter) in a column. When your data is tidy, you can use a common set of tools for exploring and visualizing them. This frees you from struggling to get your data into the right format for each task and instead lets you focus on the questions you want to ask.
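To make the one-token-per-row idea concrete, here is a minimal sketch in plain Python. It is an illustration only, not Silge and Robinson's own tooling: the sample documents, the `to_tidy_rows` helper, and the simple word-splitting rule are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical sample documents, invented for this illustration.
documents = {
    "doc1": "Tidy data makes text mining easier.",
    "doc2": "Text mining and tidy tools work together.",
}


def to_tidy_rows(docs):
    """Return one (document, position, token) row per word.

    The document name and the position are the 'variables' kept in
    columns; each row holds exactly one token.
    """
    rows = []
    for name, text in docs.items():
        # Assumed tokenizer: lowercase runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        for position, token in enumerate(tokens, start=1):
            rows.append((name, position, token))
    return rows


tidy = to_tidy_rows(documents)

# Once the text is tidy, ordinary tools apply directly,
# for example counting how often each word appears overall.
word_counts = Counter(token for _, _, token in tidy)

for row in tidy:
    print(row)
```

Because every row has the same shape, the same table can feed counting, filtering, or plotting without being reshaped for each task, which is the benefit the quotation describes.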

The duo admits text mining and natural language processing overlap in many areas, but both are useful tools for different issues. They relegate text mining to statistical analysis and natural language processing to the relationship between computers and language. The difference may seem minute, but with data volumes exploding and companies drowning in data, such advice is crucial.

Catherine Lamsfuss, August 10, 2017

Big Data Too Is Prone to Human Bug

August 2, 2017

Conventional wisdom says Big Data, being a realm of machines, is immune to human behavioral traits like discrimination. Data scientists, however, say otherwise.

An article published by PHYS.ORG titled “Discrimination, Lack of Diversity, and Societal Risks of Data Mining Highlighted in Big Data” says:

Despite the dramatic growth in big data affecting many areas of research, industry, and society, there are risks associated with the design and use of data-driven systems. Among these are issues of discrimination, diversity, and bias.

The crux of the problem is the way data is mined and processed and the way decisions are made. At every step, humans need to be involved to tell machines how each of these processes is executed. If the person guiding the system is biased, those biases are bound to seep into the subsequent processes in some way.

Apart from decisions like granting credit, human resources, which is also being automated, may have diversity issues. The underlying principle is the same in this case too.

Big Data was touted as the next big thing and may turn out to be so, but most companies have yet to figure out how to utilize it. Streamlining the processes and making them efficient would be the next step.

Vishal Ingole, August 2, 2017
