Archive: January 2015
On Exalead’s blog, in the post “Build Customer Interaction For Tomorrow,” the company examines how service-focused startups such as Airbnb, Uber, and online banks have found success. The reason: they have made customer service a priority by using the Internet and applications that make customer support an easy experience. That focus allowed the startups to enter an oversaturated market and become viable competition.
They have been able to make customer service a priority because they have eliminated the barriers that stand between clients and companies.
“First of all, they have to communicate with agility inside the company. When you have numerous colleagues, all specialized in a particular function, the silos have to break down. Nothing can be accomplished without efficient cooperation between teams. The aim: transform internal processes and then boost customer interaction.
Next, external communication, headed by the customer. Each firm has to know its clients in order to respond to their needs. The first step was to develop Big Data technologies. Today we have to go further: create a real 360° view of the customer by enriching data. It’s the only way to answer customer challenges, especially in the multi-channel era.”
The startups have changed the tired, old business model that has been in use since the 1980s. The 1980s were solid for shoulder pads and Aqua Net, along with an arguably prosperous economy, but technology and customer relations have changed. Customers want to feel like they are not just another piece of information. They want to connect with a real person and have their problems resolved. New ways to organize information and harness data provide many solutions for customer service, but there are still industries that are forgetting to make the customer the priority.
Whitney Grace, January 16, 2015
Sponsored by ArnoldIT.com, developer of Augmentext
Organizing uploaded content is a pain in the rear. In order to catalog the content, users either have to add tags manually or use an automated system that requires several tedious fields to be filled out. CMSWire explains the difficulties with document organization in “Stop Pulling Teeth: A Better Way To Classify Documents.” Manual tagging is the longer of the two processes, and if no one has created a set of tagging standards, tags will rain down from the cloud in a content mess. Automated fields are not that bad to work with if you have one or two documents to upload, but if you have many files to process, you are more prone to enter the wrong information just to finish the job.
Apparently there is a happy medium:
“Encourage users to work with documents the way they normally do and use a third party tool such as an auto classification tool to extract text based content, products, subjects and terms out of the document. This will create good, standardized metadata to use for search refinement. It can even be used to flag sensitive information or report content detected with code names, personally identifiable information such as credit card numbers, social security numbers or phone numbers.”
While the suggestion is sound, we thought that auto-classification tools were normally built into collaborative content platforms like SharePoint. Apparently not. Third-party software to improve enterprise platforms once more saves the day for the digital paper pusher.
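The article keeps the advice at the conceptual level. Purely as an illustration of what such a third-party pass might do, here is a minimal Python sketch that pulls candidate terms out of a document and flags personally identifiable information; the patterns, stopword list, and term limit are my own assumptions, not anything from the article or a specific vendor.

```python
import re

# Illustrative patterns only; a real auto-classification tool would layer
# dictionaries, entity extraction, and tuned classifiers on top of this.
PII_PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){15,16}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for", "is", "on"}

def classify_document(text: str) -> dict:
    """Return simple metadata: candidate terms plus any PII flags."""
    words = re.findall(r"[a-zA-Z]{4,}", text.lower())
    terms = sorted({w for w in words if w not in STOPWORDS})[:10]
    flags = [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]
    return {"terms": terms, "pii_flags": flags}

print(classify_document("Invoice for ACME widgets. Card: 4111 1111 1111 1111"))
```

Crude as it is, output like this gives search refinement something standardized to work with, which is the point the CMSWire piece is making.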
Whitney Grace, January 16, 2015
Sponsored by ArnoldIT.com, developer of Augmentext
Written by Stephen E. Arnold · Filed Under News, Tools | Comments Off on Organizing Content is a Manual or Automated Pain
If you are building your personal knowledge base about smart software, I suggest you read “A Brief Overview of Deep Learning.” The write up is accessible, which is not something I usually associate with outputs from Cal Tech wonks.
I highlighted this passage with my light blue marker:
In the old days, people believed that neural networks could “solve everything”. Why couldn’t they do it in the past? There are several reasons.
- Computers were slow. So the neural networks of past were tiny. And tiny neural networks cannot achieve very high performance on anything. In other words, small neural networks are not powerful.
- Datasets were small. So even if it was somehow magically possible to train LDNNs, there were no large datasets that had enough information to constrain their numerous parameters. So failure was inevitable.
- Nobody knew how to train deep nets. Deep networks are important. The current best object recognition networks have between 20 and 25 successive layers of convolutions. A 2 layer neural network cannot do anything good on object recognition. Yet back in the day everyone was very sure that deep nets cannot be trained with SGD, since that would’ve been too good to be true!
It’s funny how science progresses, and how easy it is to train deep neural networks, especially in retrospect.
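The quoted points about depth and SGD are easier to appreciate with a toy example. The following is a minimal sketch, not anything from the write up: a small feed-forward network trained with plain gradient descent in numpy, where the layer sizes, learning rate, and XOR task are my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: learn XOR, which a linear model cannot fit but a small deep net can.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Two hidden layers: tiny by modern standards, "deep" by 1980s standards.
sizes = [2, 8, 8, 1]
weights = [rng.normal(0, 0.5, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros((1, b)) for b in sizes[1:]]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(5000):
    # Forward pass, keeping activations for backprop.
    activations = [X]
    for W, b in zip(weights, biases):
        activations.append(sigmoid(activations[-1] @ W + b))

    # Backward pass: plain gradient descent on squared error
    # (full batch here, since the toy dataset has only four points).
    delta = (activations[-1] - y) * activations[-1] * (1 - activations[-1])
    for i in reversed(range(len(weights))):
        grad_W = activations[i].T @ delta
        grad_b = delta.sum(axis=0, keepdims=True)
        if i > 0:
            delta = (delta @ weights[i].T) * activations[i] * (1 - activations[i])
        weights[i] -= lr * grad_W
        biases[i] -= lr * grad_b

print(np.round(activations[-1], 2).ravel())  # should approach [0, 1, 1, 0]
```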
Highly recommended.
Stephen E Arnold, January 15, 2015
Written by Stephen E. Arnold · Filed Under AI, News, Technology | Comments Off on Deep Learning in 4000 Words
Well, one of the Star Trek depictions is closer to reality. Google announced a new, Microsoft-maiming translate app. You can read about this Bing body blow in “Hallo, Hola, Ola to the New, More Powerful Google Translate App.” Google has more translation goodies in its bag of floppy discs. My hunch is that we will see them when Microsoft responds to this new Google service.
The app includes an image translation feature. From my point of view, this is helpful when visiting countries that do not make much effort to provide US English language signage. Imagine that! No English signs in Xi’an or Kemerovo Oblast.
The broader impact will be on the industrial-strength, big-buck translation systems available from the likes of BASIS Tech and SDL. These outfits will have to find a way to respond, not to the functions, but to the Google price point. Ouch. Free.
Stephen E Arnold, January 15, 2015
Written by Stephen E. Arnold · Filed Under Google, News, Translation | Comments Off on Captain Page Delivers the Google Translator
The big-data field has recently seen a boom in technology that collects location-related information. The ability to quickly make good use of that data, though, has lagged behind our capacity to collect it. That gap is now being addressed, according to IT World’s piece, “Startup Rethinks Databases for the Real-Time Geospatial Era.” SpaceCurve, launched in 2009 and based in Seattle, recently released their new database system (also named “SpaceCurve”) intended to analyze geospatial data as it comes in. Writer Joab Jackson summarizes some explanatory tidbits from SpaceCurve CEO Dane Coyer:
“Traditional databases and even newer big data processing systems aren’t really optimized to quickly analyze such data, even though most all systems have some geospatial support. And although there are no shortage of geographic information systems, they aren’t equipped to handle the immense volumes of sensor data that could be produced by Internet-of-things-style sensor networks, Coyer said.
“The SpaceCurve development team developed a set of geometric computational algorithms that simplifies the parsing of geographic data. They also built the core database engine from scratch, and designed it to run across multiple servers in parallel.
“As a result, SpaceCurve, unlike big data systems such as Hadoop, can perform queries on real-time streams of data, and do so at a fraction of the cost of in-memory analysis systems such as Oracle’s TimesTen, Coyer said.”
Jackson gives a brief rundown of ways this data can be used. Whether these examples illustrate mostly positive or negative impacts on society I leave for you, dear readers, to judge for yourselves. The piece notes that SpaceCurve can work with data that has been packaged with REST, JSON, or ArcGIS formats. The platform does require Linux, and can be run on cloud services like Amazon Web Services.
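The article does not show SpaceCurve’s actual query syntax, so the following Python sketch is only a generic illustration of the kind of bounding-box filter one might run over a stream of geotagged sensor readings; every name and number in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

@dataclass
class Reading:
    sensor_id: str
    lat: float
    lon: float
    value: float

def in_bounding_box(r: Reading, south: float, west: float,
                    north: float, east: float) -> bool:
    """True if the reading falls inside the lat/lon box."""
    return south <= r.lat <= north and west <= r.lon <= east

def filter_stream(stream: Iterable[Reading], box: tuple) -> Iterator[Reading]:
    """Lazily yield readings inside the box as they arrive."""
    for reading in stream:
        if in_bounding_box(reading, *box):
            yield reading

# Example: readings roughly within greater Seattle.
seattle_box = (47.4, -122.5, 47.8, -122.2)
sample = [
    Reading("a1", 47.61, -122.33, 12.5),   # inside the box
    Reading("b2", 40.71, -74.01, 9.9),     # New York, outside
]
print([r.sensor_id for r in filter_stream(sample, seattle_box)])  # ['a1']
```

The pitch for a purpose-built engine is that it does this sort of filtering, plus joins and aggregations, across many servers in parallel rather than one generator at a time.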
Naturally, SpaceCurve is not the only company to notice the niche springing up around geospatial data. IBM, for example, markets its InfoSphere Streams as able to handily analyze large chunks of such information.
Cynthia Murrell, January 15, 2015
Sponsored by ArnoldIT.com, developer of Augmentext
Written by Stephen E. Arnold · Filed Under Analytics, Big data, News | Comments Off on Startup SpaceCurve Promises Speedy Geospatial Data Analysis
In another week everyone will be tired of the “year in review” articles. However, for now, there is still useful information to be gleaned. Check out the latest installment, the CMSWire article “CMSWire’s Top 20 Hits of 2014: SharePoint.”
The article begins:
“You’ve all heard of Ground Hog Day, right? Well, how about Ground Hog Year? Looking back at the SharePoint landscape over the past 12 months, that’s certainly what it looks like. In 2013, the conversation was dominated by 1) SharePoint Online 2) SharePoint and Yammer and 3) SharePoint in Office 365. In 2014, the conversation was dominated by … well, you guessed it: 1) SharePoint Online 2) SharePoint and Yammer and 3) SharePoint in Office 365.”
And while SharePoint was pretty unoriginal in 2014, there are rumors of things brewing in 2015. Stay tuned to Stephen E. Arnold at ArnoldIT.com. His Web service is devoted to all things search, including enterprise. His SharePoint feed is a great way to filter out the noise and home in on all things relevant to SharePoint users and managers.
Emily Rae Aldridge, January 15, 2015
Written by Stephen E. Arnold · Filed Under News, SharePoint | Comments Off on SharePoint Year in Review
I enjoy the IBM marketing hoo hah about Watson. Although it perhaps lags behind the silliness of some other open source search repackagers, it is among my top five most enjoyable emissions about information access.
I read “IBM Debuts New Mainframe in a $1 Billion Bet on Mobile.” I love IBM mainframes, particularly the older MVS TSO variety for which we developed the Bellcore MARS billing system. Ah, those were the days. Using Information Dimensions BASIS and its wonderful little “exit and run this routine” capability, we did some nifty things.
Furthermore, the mainframe is still a good business. Just think of the banks running IBM mainframes. Those puppies need TLC, and most of the new whiz kids are amazed at keyboards with lots and lots of function keys. Fiddle with a running process and make an error. Let me tell you, that produces billable hours for the unsnarlers.
IBM has a “new” mainframe. Please, no oxymoron emails. Dubbed the z13—you know, alpha and omega, so with omega taken—z is the ultimate. The first models required hard wiring and caution when walking amidst the DASDs. Not today. These puppies are pretty much like tame mainframes with a maintenance dependency. z13s are not iPads.
The blue bomber has spent $1 billion on this new model. Watson received big buck love too, but mainframes are evergreen revenue. Watson is sort of open sourcey. The z13 is not open sourcey. That’s important because proprietary means recurring revenue.
Companies with ageing mainframes are not going to shift to a stack of Mac Minis bought on eBay. Companies with ageing mainframes are going to lease—wait for it—more mainframes. Try to find a recent comp sci grad and tell her to port the interbank transfer system to a Mac Mini. How eager will that lass be?
Now to the write up. Here’s the passage I highlighted in pink this morning:
The mainframe is one of IBM’s signature hardware products that will help sell related software and services, and it’s debuting at a critical time for the Armonk, New York-based company. Chief Executive Officer Ginni Rometty is trying to find new sources of revenue growth from mobile offerings, cloud computing and data analytics as demand for its legacy hardware wanes.
There you go. The mainframe does mobile. The new version also does in-line, real-time fraud detection. The idea is that the z13 prevents money from moving from one account to another if there is a hint, a mere sniff, of fraud.
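IBM’s actual fraud logic is not described in the article. As a purely illustrative toy, and nothing more, an in-line check on a transfer might look like the following Python sketch; the features, weights, and threshold are invented.

```python
from dataclasses import dataclass

@dataclass
class Transfer:
    amount: float
    country: str
    account_age_days: int
    transfers_last_hour: int

def fraud_score(t: Transfer) -> float:
    """Toy rule-based score in [0, 1]; real systems use trained models."""
    score = 0.0
    if t.amount > 10_000:
        score += 0.4
    if t.country not in {"US", "CA", "GB"}:
        score += 0.2
    if t.account_age_days < 30:
        score += 0.2
    if t.transfers_last_hour > 5:
        score += 0.2
    return min(score, 1.0)

def approve(t: Transfer, threshold: float = 0.5) -> bool:
    """Block the transfer in-line if the score hints at fraud."""
    return fraud_score(t) < threshold

print(approve(Transfer(15_000, "RU", 10, 7)))  # False: held for review
```

The selling point is doing this check on the same box that clears the transaction, before the money moves, rather than in an after-the-fact batch job.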
My view is that it will be some time before Amazon, Facebook, and Google port their mobile systems to the z13, but for banks? This is possibly a good thing.
Will the z13 allow me to view transaction data on a simulated green screen? Will there be a Hummingbird widget to convert this stuff to a 1980s interface?
I am delighted I don’t have to come up with ideas to generate hundreds of millions in new revenue for IBM. This is a very big task, only marginally more difficult than converting Yahoo into the next Whatsapp.
No word on pricing for a z13 running Watson.
Stephen E Arnold, January 14, 2015
IBM’s Watson has some open-source competition. As EE Times reports in “DARPA Offers Free Watson-Like Artificial Intelligence,” DARPA’s DeepDive is now a freely available alternative to the famous machine-learning AI. Both systems have their roots in the same DARPA-funded project. According to DeepDive’s primary programmer, Christopher Re, while Watson is built to answer questions, DeepDive’s focus is on extracting a wealth of structured data from unstructured sources. Writer R. Colin Johnson informs us:
DeepDive incorporates probability-based learning algorithms as well as open-source tools such as MADlib, Impala (from Oracle), and low-level techniques, such as Hogwild, some of which have also been included in Microsoft’s Adam. To build DeepDive into your application, you should be familiar with SQL and Python.
“Underneath the covers, DeepDive is based on a probability model; this is a very principled, academic approach to build these systems, but the question for use was, ‘Could it actually scale in practice?’ Our biggest innovations in Deep Dive have to do with giving it this ability to scale,” Re told us.
For the future, DeepDive aims to be proven in other domains. “We hope to have similar results in those domains soon, but it’s too early to be very specific about our plans here,” Re told us. “We use a RISC processor right now, we’re trying to make a compiler, and we think machine learning will let us make it much easier to program in the next generation of DeepDive. We also plan to get more data types into DeepDive.”
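DeepDive’s own pipeline is not shown in the article, so here is only a rough illustration of the structured-extraction idea it describes: turning unstructured sentences into rows in a SQL table. The pattern, schema, and example text are mine, not DeepDive’s.

```python
import re
import sqlite3

# Toy pattern: "<Person> founded <Company> in <Year>".
PATTERN = re.compile(r"(?P<person>[A-Z][a-z]+ [A-Z][a-z]+) founded "
                     r"(?P<company>[A-Z]\w+) in (?P<year>\d{4})")

documents = [
    "Jane Doe founded Acme in 1999 and later sold it.",
    "The weather was mild.",  # no extractable fact
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE founders (person TEXT, company TEXT, year INTEGER)")

for doc in documents:
    for match in PATTERN.finditer(doc):
        conn.execute("INSERT INTO founders VALUES (?, ?, ?)",
                     (match["person"], match["company"], int(match["year"])))

print(conn.execute("SELECT * FROM founders").fetchall())
# [('Jane Doe', 'Acme', 1999)]
```

The SQL-plus-Python shape of the sketch echoes the article’s note about what skills a DeepDive integrator needs; the probabilistic machinery that makes the real system interesting is exactly what this toy leaves out.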
It sounds like the developers are just getting started. DeepDive and its installation instructions are available for download.
Cynthia Murrell, January 14, 2015
Sponsored by ArnoldIT.com, developer of Augmentext
Written by Stephen E. Arnold · Filed Under News, Open source, Technology | Comments Off on Open Source DeepDive Now Available
It is now possible to map regional unemployment estimates based solely on social-media data. That’s the assertion of a little write-up posted by Cornell University Library titled, “Social Media Fingerprints of Unemployment.” Researchers Alejandro Llorente, Manuel Garcia-Harranze, Manuel Cebrian, and Esteban Moro reveal:
“Recent wide-spread adoption of electronic and pervasive technologies has enabled the study of human behavior at an unprecedented level, uncovering universal patterns underlying human activity, mobility, and inter-personal communication. In the present work, we investigate whether deviations from these universal patterns may reveal information about the socio-economical status of geographical regions. We quantify the extent to which deviations in diurnal rhythm, mobility patterns, and communication styles across regions relate to their unemployment incidence. For this we examine a country-scale publicly articulated social media dataset, where we quantify individual behavioral features from over 145 million geo-located messages distributed among more than 340 different Spanish economic regions, inferred by computing communities of cohesive mobility fluxes. We find that regions exhibiting more diverse mobility fluxes, earlier diurnal rhythms, and more correct grammatical styles display lower unemployment rates.”
The team used these patterns to create a model they say paints an accurate picture of regional unemployment incidence. They assure us that these results can be produced at low cost using publicly available data from social media sources. The team’s paper on the subject is available as a PDF.
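The paper’s model is more involved, but the basic move, correlating region-level behavioral features with unemployment rates, can be sketched in a few lines of Python. The numbers below are invented for illustration; only the direction of the expected correlations follows the paper.

```python
import numpy as np

# Hypothetical region-level features derived from geo-located messages:
# each row is a region, columns are (mobility diversity, fraction of
# messages posted before 9 a.m., fraction with misspellings).
features = np.array([
    [0.82, 0.31, 0.04],
    [0.55, 0.18, 0.09],
    [0.71, 0.27, 0.05],
    [0.40, 0.12, 0.12],
    [0.66, 0.22, 0.07],
])
unemployment = np.array([0.14, 0.24, 0.17, 0.29, 0.20])  # invented rates

names = ["mobility diversity", "early-morning share", "misspelling rate"]
for name, column in zip(names, features.T):
    r = np.corrcoef(column, unemployment)[0, 1]
    print(f"{name}: r = {r:+.2f}")
# In this toy setup the first two come out negative and the third positive,
# matching the direction of the paper's reported relationships.
```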
Cynthia Murrell, January 14, 2015
Sponsored by ArnoldIT.com, developer of Augmentext
Written by Stephen E. Arnold · Filed Under Data, News, Social Media | Comments Off on Divining Unemployment Patterns from Social Media Data
I received a call about Zaizi and the company’s search and content services. The firm’s Web site is at www.zaizi.com. Based on the information in my files, the company appears to be open source centric and an integrator of Lucene/Solr solutions.
What’s interesting is that the company has embraced Mondeca/Smartlogic jargon; for example, content intelligence. I find the phrase interesting and an improvement over the Semantic Web lingo.
The idea is that via indexing, one can find and make use of content objects. I am okay with this concept; however, what’s being sold is indexing, entity extraction, and classification of content.
The issue facing Zaizi and the other content intelligence vendors is that “some” content intelligence and slightly “smarter” information access is not likely to generate the big bucks needed to compete.
Firms like BAE and Leidos, as well as the Google/In-Q-Tel backed Recorded Future, offer considerably more than indexing. The need is to process automatically, analyze automatically, and generate outputs automatically. The outputs are automatically shaped to meet the needs of one or more human consumers or one or more systems.
Think in terms of taking the outputs of a next-generation information access system and feeding the “discoveries” or “key items” into another system. The idea is that action can be taken automatically or handed to a human who can make a low-risk, high-probability decision quickly.
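As a purely hypothetical sketch of that hand-off, an automated pipeline might extract entities, score them, and forward high-confidence items to a downstream system with no human search step; every function, pattern, and threshold here is invented for illustration.

```python
import re
from typing import Dict, List

def extract_entities(text: str) -> List[str]:
    """Stand-in extractor: grab capitalized multi-word phrases."""
    return re.findall(r"(?:[A-Z][a-z]+ ){1,3}[A-Z][a-z]+", text)

def score(entity: str, text: str) -> float:
    """Toy relevance score: frequency of the entity in the source text."""
    return text.count(entity) / max(len(text.split()), 1)

def route(findings: List[Dict]) -> None:
    """Push high-confidence findings to a (hypothetical) downstream system."""
    for finding in findings:
        if finding["score"] > 0.01:
            print(f"-> forwarding {finding['entity']!r} to alerting system")

document = ("Acme Global Holdings wired funds twice this week. "
            "Acme Global Holdings is a new counterparty.")
findings = [{"entity": e, "score": score(e, document)}
            for e in set(extract_entities(document))]
route(findings)
```

The point of the sketch is the shape of the flow, extract, score, forward, not the components; a next-generation system would swap in real entity extraction, real analytics, and a real downstream consumer.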
The notion that a 20-something is going to slog through facets, keyword search, and the mind-numbing scan-the-results, open-the-documents, look-for-the-info approach is decidedly old fashioned.
You can learn more about what the next big thing in information access is by perusing CyberOSINT: Next Generation Information Access at www.xenky.com/cyberosint.
Stephen E Arnold, January 14, 2015
Written by Stephen E. Arnold · Filed Under Indexing, News, NGIA | Comments Off on Zaizi: Search and Content Consulting