Elasticsearch in Under a Minute

May 30, 2013

Elasticsearch is generating a lot of buzz of late. For those who are curious, JavaWorld is offering a brief step-by-step tutorial entitled, “ElasticSearch on EC2 in less than 60 seconds.” It begins:

“Curious to see what all the ElasticSearch hubbub is about? Wanna see it in action without a lot of elbow grease? Then look no further, friend – in less than 60 seconds, I’ll show you how to install ElasticSearch on an AWS AMI. You’ll first need an AWS account along with an SSH key pair. If you don’t already have those two steps done, go ahead and do that. The steps that follow suggest a particular AMI; however, you are free to select the instance type. Micro instance types are free to use; consequently, you can get up and running with ElasticSearch in less than a minute for free.”

The instructions continue, but only a developer or programmer would be able to follow along. That’s okay, because they are the audience. However, for those who are interested in an out-of-the-box solution that satisfies both developers and end-users, LucidWorks would be worth investigating. On the market longer than Elasticsearch, LucidWorks is a more trusted solution that works for enterprise search and Big Data. Most importantly, LucidWorks’ security track record is so strong that other companies, including MapR, seek out LucidWorks as a partner in order to provide security and peace of mind to customers.

Emily Rae Aldridge, May 30, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

Properly Licensed Software Delivers a Competitive Advantage

May 29, 2013

You may want to check out “Competitive Advantage: the Economic Impact of Properly Licensed Software.” This is a report prepared for BSA | The Software Alliance (www.bsa.org). According to the information available at http://portal.bsa.org/insead/index.html:

A new study from BSA | The Software Alliance and INSEAD, one of the world’s leading business schools, finds that increasing the amount of properly licensed software in use globally by one percent would add an estimated $73 billion to the world economy, compared to $20 billion from pirated software — meaning there is a $53 billion advantage associated with using licensed software. The study also finds the greatest potential for economic gains are in emerging markets where piracy is most common today.

Beefy statement. The news release uses the word “groundbreaking.” You will have to judge for yourself.

Stephen E Arnold, May 29, 2013

Sponsored by Augmentext

WinterGreen Translation Companies and Services

May 29, 2013

We came across a 4,600 word news release about the language translation software market. The study has more than 400 pages and covers a wide range of topics, including mobile phone translation systems. We worked on the Topeka Capital Markets’ Google voice report. We are biased because Google seems to have a significant technology and resource edge. As we worked through the news release we did see a list of the firms which WinterGreen discusses.

A notable translation helper, the Rosetta Stone. A happy quack to the British Museum at www.britishmuseum.org.

I want to snag the list because it had some surprises as well as both familiar and unfamiliar firms in the inventory. Here’s what I noticed in the news release:

ABBYY Lingvo (http://www.lingvo-online.ru/en)
Alchemy CATALYST (http://www.alchemysoftware.com/)
AppTek HMT (now a unit of SAIC. http://www.saic.com)
Babylon (free)
Bitext (www.bitext.com)
CallMiner (http://www.callminer.com/)
Cloudwords (http://www.cloudwords.com/)
Cognition Technologies (www.cognition.com)
Duolingo (more of a learning system. http://duolingo.com/)
Google (ah, the GOOG)
Hewlett Packard (maybe www.autonomy.com)
IBM WebSphere Translation Server (try http://goo.gl/hGS2R)
Kilgray Translation Technologies (http://kilgray.com/)
KudoZ (http://www.proz.com/kudoz/)
Language Engineering (http://www.lec.com)
Language Weaver (Now part of SDL. See http://goo.gl/IH3mg)
Lingo24 (An agency. See http://www.lingo24.com/)
Lingotek (http://www.lingotek.com/)
Lionbridge (crowdsourcing and integrator at http://www.lionbridge.com/)
MT@EC (http://ec.europa.eu/isa/actions/02-interoperability-architecture/2-8action_en.htm)
Mission Essential Personnel (humans for rent at http://www.lionbridge.com/)
Moravia (http://www.moravia.com/)
MultiCorpora (http://www.multicorpora.com/en/products/)
Nuance (http://www.nuance.com)
OpenAmplify (http://www.openamplify.com/)
Plunet BusinessManager (A management system at http://www.plunet.com/us/)
Proz.com (humans for rent at http://www.proz.com)
RWS Legal Translation (http://www.rws.com/EN/)
Reverso (Free. See http://www.reverso.net/text_translation.aspx?lang=EN)
SDL Trados (Part of SDL. See http://www.trados.com/en/)
Sail Labs (http://www.sail-labs.com/)
Softissimo (Services and software. http://www.softissimo.com/softissimo.asp?lang=IT)
Symbio Software (http://www.symbio.com/)
Systran (http://www.systransoft.com/)
Translations.com (Services and software. http://www.translations.com/)
Translators without Borders (Humans for rent. http://translatorswithoutborders.org/)
Veveo (More semantics than translation. http://corporate.veveo.net/)
Vignette (Open Text. http://www.opentext.com)
Word Magic Technology (I could not locate.)
WorldLingo (Rent a human. http://goo.gl/dhiu)

Of these 40 companies, some struck me as a surprise. Hewlett Packard, for example, owns Autonomy. I suppose that other units of Hewlett Packard have translation capabilities, but were these licensed or home grown? Also, the inclusion of Vignette is interesting. I must admit that I don’t hear much about Vignette as a translation system. The list makes translation look robust. The key players boil down to a handful of companies. I did not spot firms in the translation services or software business in China, India, Japan, or Russia, but I may have missed these firms in the WinterGreen news release describing the report.

If you want to buy a copy of the report, which I assume has paragraphs unlike the news release, point your browser at http://goo.gl/97e2s and have your credit card ready. The report is about US$7,500.

Stephen E Arnold, May 29, 2013

Sponsored by Augmentext

What is Happening with Natural Language Processing?

May 29, 2013

“Why Are We Still Waiting for Natural Language Processing,” an article in The Chronicle of Higher Education, explores the failure of the 21st century to produce Natural Language Processing, or NLP: the ability of computers to process natural human language. The steps required are explained in the article:

“In the 1980s I was convinced that computers would soon be able to simulate the basics of what (I hope) you are doing right now: processing sentences and determining their meanings.

To do this, computers would have to master three things. First, enough syntax to uniquely identify the sentence; second, enough semantics to extract its literal meaning; and third, enough pragmatics to infer the intent behind the utterance, and thus discerning what should be done or assumed given that it was uttered.”

Currently, typing a question into Google can return exactly the opposite of the information you are seeking. This is because the search engine is unable to infer intent, whereas natural conversation is full of gaps and assumptions that we are all trained to leap across without fail. According to the article, the one company that seemed to be coming close to this technology was Powerset in 2008. After making a deal with Microsoft, however, its site now only redirects to Bing, a Google clone. Maybe NLP, like Big Data, business intelligence, and predictive analytics, is just a buzzword with marketing value.

Chelsea Kerwin, May 29, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

LinkedIn Finds Friends You Did Not Even Know You Had

May 29, 2013

The article “LinkedIn: The Creepiest Social Network,” on Interactuality, discusses the strange connections pushed on users. The author seems very surprised to see certain relations and old friends suggested in the People You May Know bar when first logging into LinkedIn. These included his girlfriend’s stepfather, a Twitter follower he had never met, and his high school girlfriend. The article explains,

“After perusing my LinkedIn settings, I found three different areas where Privacy Controls are listed. If you go to your Settings page and click on Profile, you will see privacy controls for a variety of profile related issues. If you click on the Account tab, you can adjust privacy controls for advertising. However, I hadn’t noticed (mainly because I didn’t think to look for privacy controls in more than one place) the privacy options under the Groups, Companies & Applications tab.”

However, the article also notes that a partnership with Twitter and/or Facebook is not mentioned in any of the privacy setting options for any of the three sites. How LinkedIn knows to suggest certain acquaintances is still a mystery, since even after contacting customer service the article’s author received only an emailed reiteration of the site’s blurb. So is LinkedIn creepy? No more so than any other person-centric online service focused on marketing, data, and revenue.

Chelsea Kerwin, May 29, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

Twitter Needs To Watch Its Tweets: Google Plus Is Catching Up

May 29, 2013

When Google joined the social networking circle, users bemoaned that it was too late to join the ride and was simply hanging on the coattails of Facebook and Twitter. Quite the opposite appears to be happening, however, according to the Business Insider article, “Google Plus Is Outpacing Twitter.” GlobalWebIndex reports that Google Plus has outranked Twitter as the number two social media service. Google Plus continues to add users at a high rate, most likely because Google has streamlined its services: you log into one and you are signed into all.

Google Plus has become more of a social meeting environment, like the AOL chat rooms of days of yore. Facebook is better to use to maintain connectivity with established friends. Google is taking advantage of this offering and hopes to expand its offerings:

“’We’re extremely happy with our progress so far, and one of our main goals is to transform the overall Google experience and make all of the services people already love faster, more relevant, and more reliable,’ Google said.”

Not many people have Google Plus accounts, yet everyone seems to have a Facebook account. Google Plus is still in that phase between societal acceptance and select-few usage. Give it another year and time for Facebook to go down the tube more and it will catch on. Twitter may have reason to fear, but not enough to stop chirping.

Whitney Grace, May 29, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

StackSearch Launches Search as a Service

May 29, 2013

Search as a service subscription models are becoming more popular with the increasing need for scalable, stackable enterprise solutions. A new option is now on the market as StackSearch released their Qbox search as a service. Read more in the press release at Wall Street Journal entitled, “Stacksearch Launches Qbox Search-as-a-Service.”

The release begins:

“StackSearch, Inc. today announced the availability of Qbox.io search-as-a-service. Qbox, available via a tiered monthly subscription model, was built ‘by developers, for developers’ and empowers developers to incorporate supported and fully managed ElasticSearch indexes into their apps without having to worry about the hassle of installing, maintaining, and scaling on-premise search infrastructure.”

While the concepts StackSearch built on seem relevant, some organizations may hesitate to adopt. First, security and other issues often arise with new companies. StackSearch was founded in 2012 and has yet to be vetted across the industry. Also, the new solution depends on ElasticSearch, a new company itself, and one that has already encountered security and reliability concerns. While agility and scalability are features that enterprises find enticing, security reigns supreme. So organizations are still likely to turn to a tried-and-true company like LucidWorks for their valuable enterprise search and Big Data needs, without sacrificing innovation and creativity.

Emily Rae Aldridge, May 29, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

Bitext Delivers a Breakthrough in Localized Sentiment Analysis

May 29, 2013

Identifying user sentiment has become one of the most powerful analytic tools provided by text processing companies, and Bitext’s integrative software approach is making sentiment analysis available to companies seeking to capitalize on its benefits while avoiding burdensome implementation costs. A few years ago, Lexalytics merged with Infonics. Since that time, Lexalytics has been marketing aggressively to position the company as one of the leaders in sentiment analysis. Exalead also offered sentiment analysis functionality several years ago. I recall a demonstration which generated a report showing how those writing reviews of a restaurant expressed their satisfaction.

Today vendors of enterprise search systems have added “sentiment analysis” as one of the features of their systems. The phrase “sentiment analysis” usually appears cheek-by-jowl with “customer relationship management,” “predictive analytics,” and “business intelligence.” My view is that the early text analysis vendors, such as TREC participants in the early 2000s, recognized that key word indexing was not useful for certain types of information retrieval tasks. Go back and look at the suggestions for the benefit of sentiment functions within natural language processing, and you will see that the idea is a good one but it has taken a decade or more to become a buzzword. (See, for example, Y. Wilks and M. Stevenson, “The Grammar of Sense: Using Part-of-Speech Tags as a First Step in Semantic Disambiguation,” Journal of Natural Language Engineering, 1998, Number 4, pages 135–144.)

One of the hurdles to sentiment analysis has been the need to add yet another complex function which has a significant computational cost to existing systems. In an uncertain economic environment, additional expenses are looked at with scrutiny. Not surprisingly, organizations which understand the value of sentiment analysis and want to be in step with the data implications of the shift to mobile devices want a solution which works well and is affordable.

Fortunately Bitext has stepped forward with a semantic analysis program that focuses on complementing and enriching systems, rather than replacing them. This is bad news for some of the traditional text analysis vendors and for enterprise search vendors whose programs often require a complete overhaul or replacement of existing enterprise applications.

I recently saw a demonstration of Bitext’s local sentiment system that highlights some of the integrative features of the application. The demonstration walked me through an online service which delivered an opinion and sentiment snap in, together with topic categorization. The “snap in” or cloud based approach eliminates much of the resource burden imposed by other companies’ approaches, and this information can be easily integrated with any local app or review site.

The Bitext system, however, goes beyond what I call basic sentiment. The company’s approach processes content from user generated reviews as well as more traditional data such as information in a CRM solution or a database of agent notes, as they do with the Salesforce marketing cloud. One important step forward for Bitext’s system is its inclusion of trends analysis. Another is its “local sentiment” function, coupled with categorization. Local sentiment means that when I am in a city looking for a restaurant, I can display the locations and consumers’ assessments of nearby dining establishments. While a standard review consists of 10 or 20 lines of text and an overall star score, Bitext can add to that precisely which topics are touched on in the review, with associated sentiments. For a simple review like, “the food was excellent but the service was not that good”, Bitext will return two topics and two valuations (food, positive +3; service, negative -1).
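To make the restaurant example concrete, here is a minimal sketch in Python of how topic-level results like these might be represented and rolled up. It is not Bitext’s API or output format, which I have not seen documented; the `TopicSentiment` class, the `overall_score` and `by_polarity` helpers, and the idea of summing topic scores into a single number are all assumptions made for illustration. The topic values are entered by hand from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicSentiment:
    """One topic found in a review, with its valuation (hypothetical structure)."""
    topic: str
    polarity: str  # "positive" or "negative"
    score: int


# Hand-written stand-in for what a topic-level sentiment service might return
# for the review quoted above. This is NOT output from Bitext's system.
review = "the food was excellent but the service was not that good"
topics = [
    TopicSentiment("food", "positive", 3),
    TopicSentiment("service", "negative", -1),
]


def overall_score(items):
    """Sum the topic scores into one number (an assumed aggregation rule)."""
    return sum(item.score for item in items)


def by_polarity(items):
    """Group topic names under their polarity label."""
    grouped = {"positive": [], "negative": []}
    for item in items:
        grouped[item.polarity].append(item.topic)
    return grouped


print(overall_score(topics))
print(by_polarity(topics))
```

The point of the sketch is that a single star rating collapses this review into one number, while the topic-level view keeps “food” and “service” separate so each can be tracked on its own.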

A tap displays a detailed list of opinions, positive and negative. This list is automatically generated on the fly. The Bitext addition includes a “local sentiment score” for each restaurant identified on the map. The screenshot below shows how location-based data and publicly accessible reviews are presented.

Bitext’s system can be used to provide deep insight into consumer opinions and developing trends over a range of consumer activities. The system can aggregate ratings and complex opinions on shopping experiences, events, restaurants, or any other local issue. Bitext’s system can enrich reviews from such sources as Yelp, TripAdvisor, Epinions, and others in a multilingual environment.

Bitext boasts social media savvy. The system can process content from Twitter, Google+ Local, FourSquare, Bing Maps, and Yahoo! Local, among others, and easily integrates with any of these applications.

The system can also rate products, customer service representatives, and other organizational concerns. Data processed by the Bitext system includes enterprise data sources, such as contact center transcripts or customer surveys, as well as web content.

In my view, the Bitext approach goes well beyond the three stars or two dollar signs approach of some systems. Bitext can evaluate topics or “aspects”. The system can generate opinions for each topic or facet in the content stream. Furthermore, Bitext’s use of natural language provides qualitative information and insight about each topic, revealing a more accurate understanding of specific consumer needs that purely quantitative rating systems lack. Unlike other systems I have reviewed, Bitext presents an easy to understand and easy to use way to get a sense of what users really have to say, and in multiple languages, not just English!

For those interested in analytics, the Bitext system can identify trending “places” and topics with a click.

Stephen E Arnold, May 29, 2013

Sponsored by Augmentext

It Is About Time We Start Data Mining Mobile Phones

May 28, 2013

One of the main areas that companies are failing to collect data on is mobile phones. Interestingly enough, Technology Review has this article to offer the informed reader: “Released: A Trove Of Cell Phone Data-Mining Research.” Cell phone data offers a plethora of opportunity, one that is only starting to be used to its full potential. It is not just the more developed countries that can use the data; developing countries could benefit as well. It has been noted that cell phones could be used to redesign transportation networks and even create some eye-opening situations in epidemiology.

There is a worldwide endeavor to understand the ramifications of cell phone data:

“Ahead of a conference on the topic that starts Wednesday at MIT, a mother lode of research has been made public about how to use this data. For the past year, researchers around the world responded to a challenge dubbed Data for Development, in which the telecom giant Orange released 2.5 billion records from five million cell-phone users in Ivory Coast. A compendium of this work is the D4D book, holding all 850 pages of the submissions. The larger conference, called NetMob (now in its third year), also features papers based on cell phone data from other regions, described in this book of abstracts.”

Before you get too excited, take note that privacy concerns are an important issue. No one has found a reasonable way to disassociate users from their cell phone data. It will only be a matter of time before that happens; until then, we can revel in the possibilities.

Whitney Grace, May 28, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

Keep These Skills Off Your Resume

May 28, 2013

There comes a time in every worker’s career when the skills they once spent years mastering no longer have any function in the job market. This is especially true for IT professionals as the digital workplace reaches a thirty-year mark. Brush up your resume and take a hint of advice from ReadWrite about how to improve your job prospects in, “10 Technology Skills That Will No Longer Help You Get A Job.” Some of the skills are obvious, Windows XP and Adobe Flash Developer among them, but others raise the question: why?

SEO specialist is going by the wayside as search results get more polluted with social media information. Quality assurance specialists are very low on the totem pole, and the job is farmed out to all workers instead. Any technology that is bound to get old will, and that requires you to keep learning:

“Here’s the bottom line: Since so much technology is fairly new to everyone, why should a company invest in experienced candidates – rather than someone just starting out?…It’s not just about the money, of course. To justify any salary, it’s not only about what you know – now – but what you can learn going forward. The key to a long career in Silicon Valley, or anywhere in the tech world, is showing that you can learn and adapt – and master – constant change.”

Be prepared to not get complacent. Keep up with the industry trends and keep mastering your skills. It is the only thing to do these days.

Whitney Grace, May 28, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search
