January 19, 2015
Fujitsu has joined many other companies by taking Hadoop and creating its own software from it to leverage big data. IT Web Open Source’s article, “Fujitsu Makes It Easy For Customers To Reap The Benefits Of Big Data With PRIMEFLEX For Hadoop” divulges the details about the new software.
The new Hadoop application is part of Fujitsu’s PRIMEFLEX software line of workload-specific integrated systems. Its purpose is similar to that of many other big data products on the market: harness big data and make use of actionable analytics. Fujitsu describes it as a wonder software:
“Fujitsu has developed PRIMEFLEX for Hadoop to simplify and tame big data. The powerful, dedicated all-in-one hardware cluster is designed to integrate with existing hardware infrastructures, introducing distributed parallel processing based on Cloudera Enterprise Hadoop. This is an open-source software framework which gathers, processes and analyses data from various sources, then puts together and presents the big picture on how to act on the information gathered.”
Fujitsu is a recognized and respected brand, but the big data market is saturated with other companies that offer comparable software. Other companies also started with a Hadoop-based application as part of their software line-up. Fujitsu is entering the Hadoop analytics market a little late.
January 16, 2015
Organizing uploaded content is a pain in the rear. In order to catalog the content, users either have to add tags manually or use an automated system that requires several tedious fields to be filled out. CMS Wire explains the difficulties with document organization in “Stop Pulling Teeth: A Better Way To Classify Documents.” Manual tagging is the longer of the two processes and if no one created a set of tagging standards, tags will be raining down from the cloud in a content mess. Automated fields are not that bad to work with if you have one or two documents to upload, but if you have a lot of files to fill out you are more prone to fill out the wrong information to finish the job.
Apparently there is a happy medium:
“Encourage users to work with documents the way they normally do and use a third party tool such as an auto classification tool to extract text based content, products, subjects and terms out of the document. This will create good, standardized metadata to use for search refinement. It can even be used to flag sensitive information or report content detected with code names, personally identifiable information such as credit card numbers, social security numbers or phone numbers.”
While the suggestion is sound, we thought that auto-classification tools were normally built into collaborative content platforms like SharePoint. Apparently not. Third-party software to improve enterprise platforms once more saves the day for the digital paper pusher.
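For the curious, the sensitive-information flagging that the CMS Wire quote describes can be sketched in a few lines of Python. This is only a rough illustration: the pattern names, the regular expressions, and the `flag_sensitive` function are our own assumptions, not anything taken from a real auto-classification product, and production tools use far stricter validation.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
# They assume US-style formats and do no checksum or range validation.
PATTERNS = {
    "credit_card": re.compile(r"\b\d{4}[ -]\d{4}[ -]\d{4}[ -]\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def flag_sensitive(text):
    """Return {label: [matched strings]} for every pattern that hits."""
    hits = {}
    for label, pattern in PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[label] = found
    return hits


if __name__ == "__main__":
    sample = "Call 555-867-5309 about card 4111 1111 1111 1111."
    print(flag_sensitive(sample))
```

A document that returns an empty dictionary would pass through untouched, while any hit could be written into the metadata as a flag for a human reviewer.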
January 8, 2015
The article titled 15 Website Personalization and Recommendation Software Tools on Smart Insights contains a roundup of personalization software. Think of Amazon.com. Groups of customers see vastly different suggestions from the store, all based on what they have bought or looked at in the past and what other people who bought or looked at similar items also considered. But in the last few years personalization software has become even more tailored to specific pursuits. The article explains the winning brands in one category, B2B and publisher personalization tools:
“Evergage is mentioned as tool that fits best in this category. WP Greet Box is a personalisation plug-in used by WordPress blogging users, including me once, to deliver a welcome message to first time users depending on their referrers. It’s amazing this approach isn’t used more on commercial sites. WP Marketing Suite is another WordPress plugin that has been featured in the comments.”
The article also explores the best in the category of Commerce management systems. The article states that “both Sitecore and Kentico have built in tools to personalize content based on various rules, such as geo-location, search terms…” This is in addition to the more widely understood personalization based on user behavior. The idea behind all of these companies is to improve search for consumers.
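Rule-based personalization of the sort described above (referrers, geo-location, search terms) boils down to checking a visitor against an ordered list of rules. Here is a minimal sketch in Python. The rule list, the visitor fields, and the messages are all hypothetical and do not reflect how Sitecore, Kentico, or WP Greet Box actually implement the feature.

```python
from urllib.parse import urlparse

# Hypothetical referrer hosts that trigger a tailored greeting.
TWITTER_HOSTS = {"twitter.com", "www.twitter.com"}

# Hypothetical rules as (name, predicate, message); the first match wins.
RULES = [
    ("search term",
     lambda v: "pricing" in v.get("search_terms", []),
     "Looking for pricing? See our plans."),
    ("referrer",
     lambda v: urlparse(v.get("referrer", "")).netloc in TWITTER_HOSTS,
     "Welcome, Twitter reader!"),
    ("geo-location",
     lambda v: v.get("country") == "DE",
     "Willkommen! Shipping to Germany is free."),
]

DEFAULT_MESSAGE = "Welcome!"


def personalize(visitor):
    """Return the message of the first rule the visitor satisfies."""
    for _name, matches, message in RULES:
        if matches(visitor):
            return message
    return DEFAULT_MESSAGE


if __name__ == "__main__":
    print(personalize({"referrer": "https://twitter.com/some/status"}))
```

Ordering the rules by priority keeps the logic easy to audit: a visitor who satisfies several rules always gets the message of the earliest one.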
Chelsea Kerwin, January 08, 2015
December 30, 2014
Despite budget cuts for print materials in academic research, higher education is clamoring for more digital content. You do not need Google Translate to understand that means more revenue for companies in that industry. Virtual Strategy writes that someone wants in on the money: “With Luxid Content Enrichment Platform, Cairn.info Automates The Extraction Of Bibliographic References And The Linking To Corresponding Article.”
Temis, an industry leader in semantic content enrichment solutions for the enterprise, has signed a license and service agreement with CAIRN.info. CAIRN.info is a publishing portal for social sciences and humanities, providing students with access to the usual research fare.
Taking note of the changes in academic research, CAIRN.info wants to upgrade its digital records for a more seamless user experience:
“To make its collection easier to navigate, and ahead of the introduction of an additional 20.000 books which will consolidate its role of reference SSH portal, Cairn.info decided to enhance the interconnectedness of SSH publications with semantic enrichment. Indeed, the body of SSH articles often features embedded bibliographic references that don’t include actual links to the target document. Cairn.info therefore chose to exploit the Luxid® Content Enrichment Platform, driven by a customized annotator (Skill Cartridge®), to automatically identify, extract, and normalize these bibliographic references and to link articles to the documents they refer to.”
A round of applause for Cairn.info for realizing that making research easier will help encourage more students to use its services. If only academic databases would take ease of use into consideration and upgrade their UI dashboards.
December 26, 2014
The interesting tool called WikiSummarizer presents summaries of Wikipedia articles, which is particularly useful for students and consultants. Rather than reading the full text of a Wikipedia article (which is, yes, already a condensed text) you can now search for a summarized article to get the headlines of a given subject. The FAQs for WikiSummarizer explain:
“WikiSummarizer automatically summarizes the Wikipedia articles. The program identifies the most important keywords and ranks them by relevancy. For each keyword the most significant sentences in the original text are presented to the reader. You instantly get the headlines with the most important sentences and keywords. The blending of visualization with summarization, knowledge browsing, mind mapping provides you with a wide range of means to explore relevant content. At a glance, without much reading, you immediately spot the key information chunks.”
Perhaps someday soon, we will be able to read nothing at all and know… the “chunks.” For example, when you search the keyword Hamlet (the play), what Wikipedia decides to promote as the most relevant information is when Shakespeare wrote it and what the story was based on. This is followed by several blurbs summarizing the play itself and then a brief description of the critical reception among Romantics, providing what reads as a SparkNote of a SparkNote. WikiSummarizer offers visual summary maps, visual trees, and word clouds connected to the Wikipedia Knowledge base.
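The approach the FAQ describes (rank keywords, then show the most significant sentences for each) can be approximated with a toy extractive summarizer. The sketch below is our own guess at the general technique, not WikiSummarizer's actual algorithm: it uses raw word frequency as a stand-in for relevancy, and the stopword list and function names are assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not WikiSummarizer's).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "was",
             "it", "by", "on", "for", "with", "as", "that"}


def split_sentences(text):
    """Split on whitespace that follows ., ! or ? (a naive heuristic)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def summarize(text, top_keywords=2, per_keyword=1):
    """Map each top keyword to its highest-scoring sentences."""
    sentences = split_sentences(text)
    counts = Counter(w for s in sentences for w in tokenize(s))
    # Rank keywords by frequency; ties are broken alphabetically.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))[:top_keywords]
    summary = {}
    for keyword in ranked:
        matching = [s for s in sentences if keyword in tokenize(s)]
        # Score a sentence by the summed frequency of its words.
        matching.sort(key=lambda s: -sum(counts[w] for w in tokenize(s)))
        summary[keyword] = matching[:per_keyword]
    return summary


if __name__ == "__main__":
    article = ("Hamlet is a tragedy by Shakespeare. "
               "Shakespeare wrote Hamlet around 1600. "
               "The play is set in Denmark.")
    for keyword, sentences in summarize(article).items():
        print(keyword, "->", sentences)
```

Summing word frequencies favors longer sentences, so a real system would likely normalize the score; the point here is only the keyword-then-sentence structure.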
Chelsea Kerwin, December 26, 2014
December 11, 2014
We’ve learned that Sail Labs has put out the next iteration of its Media Mining Indexer from the company’s post, “Sail Labs Announces Availability of Release Version 2014-2 and Media Mining Indexer 6.3.” The refreshingly straightforward press release offers bulleted lists of new features and major changes to be found throughout the new version. For the indexer, it lists:
- Support for sentiment analysis, i.e. classification of text segments into positive, negative, neutral or mixed sentiment
- Currently supported languages: US and International English, German and Russian
- Support for continuous intermittent result output, without final XML result, which increases performance in cases where collective results are not required.
- Support for licensing using a central license manager/server (LiMa), which is intended for use with cloud based use cases.
- Script-based building of language models using lmtscript.
For those not already familiar with Media Mining Indexer, it processes speech from multiple sources into XML, which can then be uploaded into a range of digital-asset-management systems for subsequent search and retrieval. The software boasts automatic speech recognition, speaker ID, speaker change detection, story detection, and topic classification.
Sail Labs specializes in high-end software for speech and multimedia analysis for vertical markets. Its name derives from “Speech Artificial Intelligence Language Laboratories.” Sail Labs is located in Vienna, Austria, and was founded in 1999.
Cynthia Murrell, December 11, 2014
December 8, 2014
It is harder than ever for young graduates and seasoned workers to find a job. Yet according to the FitFrnd blog, Silicon Valley is having trouble finding good employees. The post “Silicon Valley’s Best-Kept Secret: How AngelList Is Slowly Disrupting The Hiring Industry” explains that rather than relying on “old-fashioned” job search engines, AngelList is proving to be more reliable in finding talent.
AngelList is primarily a crowdfunding Web site used by startups to raise money for new endeavors. AngelList, however, is proving to be a new resource to find a job or locate someone to fill a position. Other career Web sites fail to attract the right talent. The post explains how FitFrnd had trouble finding a blogger/content marketer:
“We finally decided to give AngelList a serious try. We had tried it before, but our efforts had been half-hearted. This time we improved our copy, added information such as why the company is such an amazing place to work (it is!), details about salary and equity ranges, and even screenshots of the app. Within a few days, we have received about 80 resumes, including some really compelling candidates.”
What makes AngelList different is that it allows applicants to apply privately and know the salary up front. It also cuts out the middleman. While the information is searchable, you have to join AngelList. Joining is free for now; it eventually might cost something, but then you would be paying for a service that works…for the moment.
December 3, 2014
The article titled Semantic Technology Provider Ontotext Announces Strategic Hires for Ontotext USA on PRWeb discusses the expansion of Ontotext in North America. Tony Agresta, Brad Bogle and Tom Endyke joined Ontotext as Senior VP of Worldwide Sales, Director of Marketing and Director of Solutions Architecture, respectively. Ontotext, the semantic search and text-mining leader, has laid out several main focuses for the near future, including the growth of worldwide marketing efforts and the development of relationships. The article quotes Tony Agresta on Ontotext’s product development:
“Our flagship product, GraphDB™ (formerly OWLIM) has been deployed across the globe and is widely known as a highly scalable enterprise RDF triplestore… But what makes Ontotext truly unique are three other essential elements: 1) a full complement of semantic enrichment, integration, curation and authoring tools that extend our platform approach, 2) a large critical mass of semantic engineers, professional services and support teams that represent the most experienced professionals in the world and 3) S4, the Self Service Semantic Suite.”
Ontotext has provided semantic solutions for such organizations as the BBC, AstraZeneca, John Wiley & Sons, and The British Museum. Its recent expansion efforts in North America are an attempt to reach more semantic technology users on this continent.
Chelsea Kerwin, December 03, 2014
November 25, 2014
Ontopia has been silent since August 1, 2013. Prior to that outdated update, Ontopia used to share news three or four times a year. Ontopia was developed as a community for open source tools for building, maintaining, and deploying topic maps-based applications. Topic maps are knowledge structures that directly connect information to a source. The process is also called information mapping or mind mapping, which is a concept that has been played around with by many developers. An old Mashable article has a list: “Twenty Four Essential Mind Mapping And Brainstorming Tools.”
A look at the Ontopia Web page finds it still in the throes of Web 1.0, with only a few features that could pass as a modern Web site. Even the product’s description, in all its simplicity, is dated:
“Ontopia is a set of tools which contains everything you need to build a full Topic Maps-based application. Using Ontopia you can design your ontology, populate the topic map manually and/or automatically, build the user interface, show graphical visualizations of the topic map, and much more.
The core of Ontopia is the engine, which stores and maintains the topic maps, and has an extensive Java API. On top of it are built a number of additional components, as shown in the diagram below. More information about these components can be found on the right.
Ontopia is 100% Java, and runs on any operating system which has Java 1.5. It is fully open source and can be used without any restrictions beyond those in the Apache 2.0 license.”
The last time Ontopia updated, it posted that version 5.3.0 had just been released and that the details were available on the wiki. Has Ontopia been sequestered in a closet working on the latest version, or has it become an abandoned open source project?
October 16, 2014
Microsoft is adding a new big data piece to its Office 365 lineup. And in a bit of a change of direction for the company, Microsoft has sought to make this element aesthetically pleasing as it points out patterns of likes and dislikes. Read more about Microsoft Delve in the InfoWorld article, “Microsoft’s Delve: The Office 365 Spy You Just Might Love.”
The article says:
“Microsoft’s Delve is an intriguing new offering for Office 365 business customers. Previously known as Oslo, Delve brings a concierge, Instagram-like pulse to business environments, as curated by Office Graph, sophisticated machine-learning technology that maps relationships between people, content, and activity across Office 365 accounts. Delve pulls content from within your organization’s OneDrive, SharePoint, and Yammer accounts, serving it up to users in a card-based interface reminiscent of Pinterest.”
The jury is still out as to how helpful the product will really be in the business environment. It does behave within existing permissions, showing users only that which they are granted permission to see. Stephen E. Arnold is a longtime leader in search and reports on the latest news in his SharePoint feed. Since Delve may have helpful implications for SharePoint, keep an eye on ArnoldIT.com for all the latest tips and tricks.
Emily Rae Aldridge, October 16, 2014