December 12, 2014
Analytics outfit Lexalytics is going all-in on their European expansion. The write-up, “Lexalytics Expands International Presence: Launches Pain-Free Text Mining Customization” at Virtual-Strategy Magazine tells us that the company has boosted the language capacity of their recently acquired Semantria platform. The text-analytics and sentiment-analysis platform now includes Japanese, Arabic, Malay, and Russian in its supported-language list, which already included English, French, German, Chinese, Spanish, Portuguese, Italian, and Korean.
Lexalytics is also setting up servers in Europe. Because of upcoming changes to EU privacy law, we’re told companies will soon be prohibited from passing data into the U.S. Thanks to these new servers, European clients will be able to use Semantria’s cloud services without running afoul of the law.
Last summer, the company courted Europeans’ attention by becoming a sponsor of the 2014 Enterprise Hackathon in Prague. The press release tells us:
“All participants of the Hackathon were granted unlimited access and support to the Semantria API during the event. Nearly every team tried Semantria during the 36 hours they had to build a program that could crunch enough data to be used at the enterprise level. Redmore says, “We love innovative, quick development events, and are always looking for good events to support. Please contact us if you have a hackathon where you can use the power of our text mining solutions, and we’ll talk about hooking you up!”
Lexalytics is proud to have been the first to offer sentiment analysis, auto theme detection, and Wikipedia integration. Designed to integrate with third-party applications, their text analysis software chugs along in the background at many data-related organizations. Founded in 2003, Lexalytics is headquartered in Amherst, Massachusetts.
Cynthia Murrell, December 12, 2014
December 6, 2014
ROI is the end goal for many big data and enterprise-related projects, so it is refreshing to see published information on whether companies actually achieve it, as in a recent Smart Data Collective article, “Text Analytics, Big Data and the Keys to ROI.” According to a study released last year (discussed further in “Text/Content Analytics 2011: User Perspectives on Solutions and Providers”), the reason many businesses do not see positive returns lies in the planning phase: many report that they never started with a clear plan for getting there.
The author shares an example from his full-time work in text analytics. One of his clients, who sifts through masses of social media data and government application data looking for suspicious activity, needed a solution for a text-heavy application. The author suggested a selective cross-lingual process: one that worked with the text in its native language, and only on the text relevant to the topic of interest.
The following happened after the author’s suggestion:
Although he seemed to appreciate the logic of my suggestions and the quality benefits of avoiding translation, he just didn’t want to deal with a new approach. He asked to just translate everything and analyze later – as many people do. But I felt strongly that he’d be spending more and getting weaker results. So, I gave him two quotes. One for translating everything first and analyzing later – his way, and one for the cross-lingual approach that I recommended. When he saw that his own plan was going to cost over a million dollars more, he quickly became very open minded about exploring a new approach.
It sounds like the author could have suggested any of several similar semantic processing solutions. For example, the Cogito Intelligence API enhances the ability to extract meaning and insights from a multitude of content sources, including social media and unstructured corporate data. The point is that ROI is out there, and innovative companies like Expert System are enabling it.
Megan Feil, December 6, 2014
December 3, 2014
The article titled “Semantic Technology Provider Ontotext Announces Strategic Hires for Ontotext USA,” posted on PRWeb, discusses the expansion of Ontotext in North America. Tony Agresta, Brad Bogle, and Tom Endyke have joined Ontotext as Senior VP of Worldwide Sales, Director of Marketing, and Director of Solutions Architecture, respectively. Ontotext, the semantic search and text-mining leader, has laid out several main focuses for the near future, including the growth of worldwide marketing efforts and the development of relationships. The article quotes Tony Agresta on Ontotext’s product development:
“Our flagship product, GraphDB™ (formerly OWLIM) has been deployed across the globe and is widely known as a highly scalable enterprise RDF triplestore… But what makes Ontotext truly unique are three other essential elements: 1) a full complement of semantic enrichment, integration, curation and authoring tools that extend our platform approach, 2) a large critical mass of semantic engineers, professional services and support teams that represent the most experienced professionals in the world and 3) S4, the Self Service Semantic Suite.”
Ontotext has provided semantic solutions for such companies as the BBC, AstraZeneca, John Wiley & Sons, and The British Museum. Their recent expansion efforts are an attempt to reach more semantic technology users on that continent.
Chelsea Kerwin, December 03, 2014
November 28, 2014
As the Internet grows and evolves, the features users expect from search and content management systems are changing. SearchContentManagement addresses the shift in “Semantic Technologies Fuel the Web Experience Wave.” As the title suggests, writer Geoffrey Bock sees this shift as opening a new area with a new set of demands — “web experience management” (WEM) goes beyond “web content management” (WCM).
The inclusion of metadata and contextual information makes all the difference. For example, the information displayed by an airline’s site should, he posits, be different for a user working at their PC, who may want general information, and someone using their phone in the airport parking lot, where they probably need to check their gate number or see whether their flight has been delayed. (Bock is disappointed that none of the airlines’ sites yet work this way.)
The article continues:
“Not surprisingly, to make contextually aware Web content work correctly, a lot of intelligence needs to be added to the underlying information sources, including metadata that describes the snippets, as well as location-specific geo-codes coming from the devices themselves. There is more to content than just publishing and displaying it correctly across multiple channels. It is important to pay attention to the underlying meaning and how content is used — the ‘semantics’ associated with it.
“Another aspect of managing Web experiences is to know when you are successful. It’s essential to integrate tracking and monitoring capabilities into the underlying platform, and to link business metrics to content delivery. Counting page views, search terms and site visitors is only the beginning. It’s important for business users to be able to tailor metrics and reporting to the key performance indicators that drive business decisions.”
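Bock’s airline example hints at what that added intelligence looks like in practice. Here is a minimal sketch, purely our own illustration rather than anything from the article: the field names follow schema.org’s Flight vocabulary, but the values and the selection helper are invented. A site could serialize flight metadata as JSON-LD and choose which fields to surface based on the visitor’s context:

```python
import json

# A hypothetical flight record annotated with schema.org's Flight vocabulary,
# serialized as JSON-LD. All values here are invented for illustration.
flight_metadata = {
    "@context": "https://schema.org",
    "@type": "Flight",
    "flightNumber": "UA123",
    "departureAirport": {"@type": "Airport", "iataCode": "BOS"},
    "arrivalAirport": {"@type": "Airport", "iataCode": "SFO"},
    "departureTime": "2014-11-28T16:20:00-05:00",
    "departureGate": "B22",
}

def fields_for(context: str) -> dict:
    """Choose which metadata to surface for a given user context."""
    if context == "mobile-at-airport":
        # The parking-lot user mostly needs the gate and timing.
        keys = ("flightNumber", "departureGate", "departureTime")
    else:
        # The desktop user wants the general itinerary.
        keys = ("flightNumber", "departureAirport", "arrivalAirport",
                "departureTime")
    return {key: flight_metadata[key] for key in keys}

print(json.dumps(fields_for("mobile-at-airport"), indent=2))
```

The same record drives both experiences; only the context-aware selection differs, which is the point of layering semantics under the content rather than building separate sites per device.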
Bock supplies an example of one company, specialty-plumbing supplier Uponor, that is making good use of such “WEM” possibilities. See the article for more details on Bock’s strategy for leveraging the growing potential of semantic technology.
Cynthia Murrell, November 28, 2014
November 25, 2014
Computers are only as smart as the humans who program them, and they lack the spontaneity that humans possess in droves. That does not mean computers are not getting “smarter”; in fact, according to Market Wired, their comprehension levels have just increased. Market Wired reports on “Expert Systems Extends The Cogito API Portfolio: To Fashion, Advertising, Intelligence, And Media And Publishing Applications.” Expert System is one of the world’s leaders in semantic technology, and its Cogito API has been designed to increase an organization’s use of unstructured data.
” ‘Companies want to better exploit the ever growing amounts of internal and external information,’ said Marco Varone, President and CTO, Expert System. ‘Cogito API is the perfect match for these needs and we’re thrilled that the community of developers and all the organizations can leverage our semantic technology to increase in a significant way the value of their information across any sector, whether that is entering new markets, extending their customer reach, or creating innovative products and services for market intelligence, decision making and strategic planning.’ “
Cogito is available as part of the CORE or PACK packages. Expert System promises that its technology can be tailored to any industry, providing an array of semantic technology solutions.
November 21, 2014
SemanticWeb.com posted an article called “Retrieving And Using Taxonomy Data From DBpedia” with an interesting introduction. It explains that DBpedia is a crowd-sourced community project whose entire goal is to extract structured information from Wikipedia and share it. The introduction continues that DBpedia already holds over three billion facts in the W3C-standard RDF data model, ready for application use.
The taxonomy data is expressed in SKOS, a W3C-standard vocabulary used by the New York Times, the Library of Congress, and other organizations for their own taxonomies and subject headings. Users can retrieve the data and implement it in their own RDF applications, giving their data more value.
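To make the retrieval concrete, here is a minimal sketch of our own (not a query from the SemanticWeb.com article) that asks DBpedia’s public SPARQL endpoint for the SKOS sub-categories of a Wikipedia category. The endpoint URL and the skos:broader predicate are the standard DBpedia/SKOS ones; the helper functions are hypothetical illustrations:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

def category_query(category: str) -> str:
    """Build a SPARQL query for the immediate SKOS sub-categories
    of a DBpedia category, e.g. 'Horror_films'."""
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?narrower WHERE {\n"
        f"  ?narrower skos:broader "
        f"<http://dbpedia.org/resource/Category:{category}> .\n"
        "} LIMIT 20"
    )

def fetch_subcategories(category: str) -> list:
    """Send the query to DBpedia and return the matching category URIs.
    (Needs network access; results track live Wikipedia edits.)"""
    params = urlencode({
        "query": category_query(category),
        "format": "application/sparql-results+json",
    })
    with urlopen(f"{DBPEDIA_ENDPOINT}?{params}") as response:
        results = json.load(response)
    return [row["narrower"]["value"] for row in results["results"]["bindings"]]

print(category_query("Horror_films"))
```

Descending further into the taxonomy is just a matter of chaining skos:broader patterns, which is exactly where the caveat below about Wikipedia’s crowd-sourced categories starts to matter.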
DBpedia is doing users a wonderful service: they do not have to rely on proprietary software to deliver rich taxonomies. The taxonomies can be retrieved under open community licenses and put to immediate use improving content. There is one caveat:
“Remember that, for better or worse, the data is based on Wikipedia data. If you extend the structure of the query above to retrieve lower, more specific levels of horror film categories, you’d probably find the work of film scholars who’ve done serious research as well as the work of nutty people who are a little too into their favorite subgenres.”
Remember, Wikipedia is a good reference tool for gaining an understanding of a topic, but you still need to check more verifiable resources for hard facts.
November 20, 2014
The article titled “The Power of Semantics,” on Research Information, investigates advancements in semantic enrichment tools. Scholarly publishers are increasingly interested in enabling their users to browse the vast quantity of data online and find the most relevant information. Semantic enrichment is the proposed solution for guiding knowledge-seekers to significant material while weeding out the unnecessary and unrelated. Phil Hastings of Linguamatics, Daniel Mayer of Temis, and Jake Zarnegar of Silverchair were all quoted at length on the current uses of semantic enrichment and its future. The article states:
“Daniel Mayer, VP product and marketing at TEMIS, gave some examples of the ways this approach is being used: ‘Semantic enrichment is helping publishers make their content more compelling, drive audience engagement and content usage by providing metadata-based discoverability features such as search-engine optimisation, improved search, taxonomy/faceted navigation, links to structured information about topics mentioned in content, “related content”, and personalisation.’”
Clearly, Temis is emphasizing semantics. Mayer and the others also gave their opinions on how publishers in the market for semantic enrichment might go about picking their partners. Some suggestions included choosing a partner with expertise within the field, an established customer base and the ability to share best practices.
Chelsea Kerwin, November 20, 2014
October 28, 2014
Partnerships offer companies ways to improve product quality and create new products. Semantic Web reports that “Expert System And WAND Partner For A More Effective Management Of Enterprise Information.” Expert System is a leading semantic technology company, and WAND is known for its enterprise taxonomies. Their new partnership will give businesses a better, more accurate way to organize data.
Each company brings unique features to the partnership:
“The combination of the strengths of each company, on one side WAND’s unique expertise in the development of enterprise taxonomies and Expert System’s Cogito on the other side with its unique capability to analyze written text based on the comprehension of the meaning of each word, not only ensures the highest quality possible, but also opens up the opportunity to tackle the complexity of enterprise information management. With this new joint offer, companies will finally have full support for a faster and flexible information management process and immediate access to strategic information.”
Enterprise management teams will be excited about how Expert System and WAND improve taxonomy selection and offer more native integration with in-place data systems. One way the two will combine their strengths is the new automatic classification: once a WAND taxonomy is selected, Expert System brings in its semantic-based categorization rules and an engine for automatic categorization.
October 26, 2014
You can find an interesting discussion of the Semantic Web on Hacker News. Semantic Web search engines have had a difficult time capturing the imagination of the public. The write-up and the comments advance the notion that the Semantic Web is alive and well, just invisible.
I found the statement from super Googler Peter Norvig a window into how Google views the Semantic Web. Here’s the snippet:
Peter Norvig put it best: “The semantic web is the future of the web, and always will be.” (For what it’s worth, the startup school video that quote comes from is worth watching: http://youtu.be/LNjJTgXujno?t=20m57s)
There are references to “semantic search” companies that have failed; for example, Ontoprise. There are links to clever cartoons.
The statement I highlighted was:
The underlying data just doesn’t necessarily map very well into the seem-web representations, so duplicates occur and possible values explode in their number of valid permutations even though they all mean the same handful of things. And it’s the read-only semantic-web, so you can’t just clean it, you have to map it.. Which is why I’m always amazed that http://www.wolframalpha.com/ works at all. And hopefully one day https://www.freebase.com/ will be a thing. I remember being excited about http://openrefine.org/ for “liberating” messy data into clean linked data… but it turns out that you really don’t want to curate your information “in the graph”; it seems obvious, but traditional relational datasets are infinitely more manageable than arbitrarily connected nodes in a graph. So, most CMS platforms are doing somewhat useful things in marking up their content in machine-readable ways (RDFa, schema.org [as evil as that debacle was], HTTP content-type negotiation and so on) either out-of-the-box or with trivially installed plugins.
Ah, content management systems. Now that’s the model for successful information access as long as one does not want engineering drawings, videos, audio, binaries, and a host of proprietary data types like i2 Analyst Notebook files.
Worth checking out the thread in my view.
Stephen E Arnold, October 26, 2014
October 22, 2014
In April 2014, I cited a report that suggested Hakia was moving forward. It now appears that the Hakia Web site has gone dark. Information about Hakia’s semantic system is available in this interview with Riza C. Berkan.
Stephen E Arnold, October 22, 2014