SAS Gets More Visual

March 31, 2012

Inxight (now owned by BusinessObjects, part of the SAP empire) is history at SAS, or almost history. Now the company is moving in a different direction.

Jaikumar Vijayan writes about a new visual analytics application recently unveiled by SAS in his article “SAS Promises Pervasive BI with New Tool.” Einstein is believed to have once said “computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.” We noted this passage from Mr. Vijayan’s write up:

Unlike many purely server-based enterprise analytics technologies, Visual Analytics gives business users a full range of data discovery, data visualization and querying capabilities from desktop and mobile client devices, the company said.

The initial version of the new tool allows iPad users to view reports and download information to their devices. Future versions will support other mobile devices as well, SAS added. The quote is actually a good description of the concept that underlies Visual Analysis. The process uses analytic reasoning to detect specific information in massive amounts of data. For example, a clothing manufacturer might use it to determine current trends in ladies’ fashions. The results are presented in charts and graphs to the users, who can fine-tune the parameters until their specific queries are answered.
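
To make the slice-chart-adjust loop concrete, here is a minimal sketch in Python with pandas and matplotlib. The data and column names are invented, and this is an illustration of the general workflow, not SAS Visual Analytics.

# A hedged, minimal sketch of the workflow described above: slice a sales
# table, re-aggregate, chart the result, then adjust the filter and look
# again. The data and column names are invented; this is not SAS software.
import pandas as pd
import matplotlib.pyplot as plt

sales = pd.DataFrame({
    "category": ["dresses", "dresses", "outerwear", "outerwear", "accessories"],
    "season":   ["spring",  "fall",    "spring",    "fall",      "spring"],
    "units":    [1200,      800,       300,         950,         450],
})

# First pass: which categories are moving this spring?
spring = sales[sales["season"] == "spring"]
spring.groupby("category")["units"].sum().plot(kind="bar", title="Spring units by category")
plt.show()

# "Fine-tune the parameters": switch the filter to fall and look again.
fall = sales[sales["season"] == "fall"]
fall.groupby("category")["units"].sum().plot(kind="bar", title="Fall units by category")
plt.show()

The point is the loop: filter, aggregate, look at the picture, change the filter, and look again until the question is answered.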

SAS is known for its statistical functionality, its programming language, and its need for SAS-savvy cowpokes to ride herd on the bits and bytes. Will SAS be able to react to the trend toward the consumerization of business intelligence?

While the technology is impressive, SAS may be a little late to the game. Palantir and Digital Reasoning have already introduced applications that offer clients powerful Visual Analysis capabilities. Time will tell if SAS is able to catch up with its competitors’ approaches. We are interested in Digital Reasoning, Ikanow and Quid.

Stephen E Arnold, March 31, 2012

Sponsored by Pandia.com

Law Firms Learn Staff Can Be Repurposed

March 30, 2012

I know there is considerable enthusiasm for smart software. Most of the eDiscovery vendors suggest that humans and whizzy new systems can coexist. Now, a new chapter in justifiable staff reductions may be upon us. Navigate to “A New View of Review: Predictive Coding Vows to Cut E-Discovery Drudgery” to learn that recently released research from an Ivory Tower-type says that a “predictive coding approach can do a better job of sifting through more than 800,000 documents than humans.”

For many law school graduates, scouring documents for material of value to a case has long been a secure if somewhat tedious means of entering the legal profession. This will no longer be true, however, if a new type of software lives up to its creator’s claims. Known as predictive coding, it can supposedly do the same job faster, cheaper, and as well as humans. But lawyers live to bill, so perhaps software may force law firms to get rid of staff and trust the algorithms.

We learn:

There has been a long-standing myth in the legal field that exhaustive manual review is the gold standard, or nearly perfect, but that has been shown to be a fallacy, according to Maura R. Grossman, a New York City attorney. Research has shown that, under the best circumstances, manual review will identify about 70 percent of the responsive documents in a large data collection. Some technology-assisted approaches have been shown to perform at least as well as that, if not better, at far less cost.

Attorneys, paralegals, unpaid interns, and experts in India will miss 30 percent of the pertinent documents. Smart software is the path to the future.
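
For readers who want a feel for what sits under the hood, here is a minimal, hedged sketch of the kind of supervised classifier that technology-assisted review relies on. The documents, labels, and library choice (scikit-learn) are ours, not Recommind’s; a production system adds sampling, validation, and iterative training with attorney feedback.

# Minimal sketch of "predictive coding": train a classifier on a small set
# of attorney-labeled documents, then rank the unreviewed collection by
# predicted responsiveness. Documents and labels below are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labeled_docs = [
    ("Email discussing the disputed supply contract terms", 1),
    ("Quarterly cafeteria menu and parking announcements", 0),
    ("Draft amendment to the supply agreement with attorney comments", 1),
    ("Company picnic photos and RSVP list", 0),
]
texts, labels = zip(*labeled_docs)

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

model = LogisticRegression()
model.fit(X, labels)

# Score unreviewed documents; humans review only the top-ranked ones.
unreviewed = [
    "Counsel's notes on supply contract negotiations",
    "Reminder: update your emergency contact information",
]
scores = model.predict_proba(vectorizer.transform(unreviewed))[:, 1]
for doc, score in sorted(zip(unreviewed, scores), key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {doc}")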

Some observers worry about the legal defensibility of predictive coding. But such concerns are unfounded, so long as both sides agree to its use. That’s according to Craig Carpenter, a marketer for Recommind, a software development firm focused on the legal and corporate market.

But even sophisticated programs don’t actually think. Without that capacity, they cannot understand the subtle nuances and informal connections that underlie written documents. It’s unlikely that predictive coding will live up to the sales hype surrounding it. But there is nothing new about search vendors’ marketing: reality is often different from Spock’s world on Star Trek.

Stephen E Arnold, March 30, 2012

Sponsored by Pandia.com

New Alert Feature for Clarabridge Social Media Analytics

March 28, 2012

Social functions are refining their role in the business intelligence niche. The BrainYard reports, “Clarabridge Adds Alerts to Social Media Analytics.”

So what do you do with all that information your business collects from social media? A backlog in the analysis of time-sensitive data could cost a company in lost opportunities. Clarabridge now addresses this problem with automatic alerts. Writer David F. Carr explains:

Clarabridge 5.0 provides tools for collaborating around an analysis. By configuring more proactive notifications, Clarabridge users might also configure the system to automatically send alerts to the correct regional manager–or product manager, or department head–making it more likely that the organization will take action immediately after detecting a specific problem or opportunity. . . . ‘If somebody just tweeted, “I went into Kohl’s and slipped and fell, so now I’m going to sue,” if you’re Kohl’s you want to know that,’ [Clarabridge VP Sidra] Berman said.

Collaboration is the focus of Clarabridge 5.0, formally released March 20, 2012. After all, much of this data points to challenges that require action from multiple departments. Though the alert function is a useful tool, it is important to remember that it will take skillful action to make the most of the new feature.
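
A toy sketch of the routing idea in the quote above shows how little machinery the concept requires. The rules, regions, and addresses are invented, and this is not Clarabridge’s API.

# Illustrative sketch only -- not Clarabridge's software. It mimics the
# idea in the quote: when incoming social content matches a problem
# pattern, alert the manager responsible for that region or product.
ALERT_RULES = [
    {"keywords": ["slipped", "fell", "sue", "injury"], "topic": "liability"},
    {"keywords": ["rude", "terrible service"], "topic": "service"},
]
REGIONAL_MANAGERS = {"midwest": "midwest.manager@example.com",
                     "northeast": "northeast.manager@example.com"}

def route_alert(post_text, region):
    """Return (recipient, topic) if the post trips a rule, else None."""
    lowered = post_text.lower()
    for rule in ALERT_RULES:
        if any(word in lowered for word in rule["keywords"]):
            return REGIONAL_MANAGERS.get(region), rule["topic"]
    return None

print(route_alert("I went into the store, slipped and fell, now I'm going to sue", "midwest"))
# ('midwest.manager@example.com', 'liability')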

Clarabridge aims to delve deeper into the meaning behind each piece of content than the competition. Having spent years developing its sentiment and text analytics technology, the company boasts that it is uniquely positioned to support enterprise-scale customer feedback initiatives.

Cynthia Murrell, March 28, 2012

Sponsored by Pandia.com

Connotate Acquires Fetch Technologies

March 27, 2012

I know, “Who? Bought what?”

Connotate is a data fusion company which uses software bots (agents) to harvest information. Fetch Technologies, founded more than a decade ago, processes structured data. The deal comes on the heels of some executive ballroom dancing. Connotate snagged a new CEO, Keith Cooper, according to New Jersey Tech Week. Fetch also uses agent technology.

Founded in 1999, Fetch Technologies enables organizations to extract, aggregate and use real-time information from Web sites. Fetch’s artificial intelligence-based technology allows precise data extraction from any Web site, including the so-called Deep Web, and transforms that data into a uniform format that can be integrated into any analytics or business intelligence software.

The company’s technology originated at the University of Southern California’s Information Sciences Institute. Fetch’s founders developed the core artificial intelligence algorithms behind the Fetch Agent Platform while they were faculty members in Computer Science at USC. Fetch’s artificial intelligence solutions were further refined through years of research funded by the Defense Advanced Research Projects Agency (DARPA), the National Science Foundation (NSF), the U.S. Air Force, and other U.S. Government agencies.
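
For readers new to the category, here is a bare-bones illustration of the general technique: pull structured records out of Web page markup and normalize them for an analytics tool. The HTML snippet and field names are hypothetical, and this is nothing like Fetch’s agent platform in sophistication.

# A minimal illustration of the general technique the announcement
# describes (extracting structured records from Web pages and normalizing
# them), not Fetch's AI-based extraction. HTML and fields are invented.
from bs4 import BeautifulSoup

page = """
<table id="products">
  <tr><td class="name">Widget A</td><td class="price">$19.99</td></tr>
  <tr><td class="name">Widget B</td><td class="price">$4.50</td></tr>
</table>
"""

soup = BeautifulSoup(page, "html.parser")
records = []
for row in soup.select("#products tr"):
    name = row.find("td", class_="name").get_text(strip=True)
    price = row.find("td", class_="price").get_text(strip=True)
    # Normalize into a uniform record ready for a BI or analytics tool.
    records.append({"name": name, "price_usd": float(price.lstrip("$"))})

print(records)
# [{'name': 'Widget A', 'price_usd': 19.99}, {'name': 'Widget B', 'price_usd': 4.5}]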

The Connotate news release said:

“Fetch is very excited to combine our information extraction, integration, and data analytics solution with Connotate’s monitoring, collection and analysis solution,” said Ryan Mullholland, Fetch’s former CEO and now President of Connotate. “Our similar product and business development histories, but differing go-to-market strategies creates an extraordinary opportunity to fast-track the creation of world-class proprietary ‘big data’ collection and management solutions.”

Okay, standard stuff. But here’s the paragraph that caught my attention:

“Big data, social media and cloud-based computing are major drivers of complexity for business operations in the 21st century,” said Keith Cooper, CEO of Connotate. “Connotate and Fetch are the only two companies to apply machine learning to web data extraction and can now take the best of both solutions to create a best-of-breed application that delivers inherent business value and real-time intelligence to companies of all sizes.”

I am not comfortable with the assertion of “only two companies to apply machine learning to Web data extraction.” In our coverage of the business intelligence and text mining market in Inteltrax.com, we have written about many companies which have applied such technologies and generated more market traction. Examples range from Digital Reasoning to Palantir, among others.

The deal is going to deliver on a “unified vision.” That may be true; however, saying and doing are two different tasks. As I write this, unification is the focus of activities from big dogs like Autonomy, now part of Hewlett Packard, to companies which have lower profiles than Connotate or Fetch.

We think that the pressure open source business intelligence and open source search are exerting will increase. With giants like IBM (Cognos, i2 Group, SPSS) and Oracle working to protect their revenues, more mergers like the Connotate-Fetch tie up are inevitable. You can read a July 14, 2010, interview with Xoogler Mike Horowitz, Fetch Technologies at this link.

Will the combined companies rock the agent and data fusion market? We hope so.

Stephen E Arnold, March 27, 2012

Sponsored by Pandia.com

The Invisibility of Open Source Search

March 27, 2012

I was grinding through my files and I noticed something interesting. After I abandoned the Enterprise Search Report, I shifted my research from search and retrieval to text processing. With this blog, I tried to cover the main events in the post-search world. The coverage was more difficult than I anticipated, so we started Inteltrax, which focuses on systems, companies, and products which “find” meaning using numerical recipes. But that does not do enough, so we are contemplating two additional free information services about “findability.” I am not prepared to announce either of these at this time. We have set up a content production system with some talented professionals working on our particular approach to content. We are also producing some test articles.


Until we make the announcement, I want to reiterate a point I made in my talks in London in 2011 about open source search and content processing:

Most reports about enterprise search ignore open source search solution vendors. A quiet revolution is underway, and for many executives, the shift is all but invisible.

We think that the “invisible” nature of the open source search and content processing options is due to four factors:

Most of the poobahs, self-appointed experts and former home economics majors have never installed, set up, or optimized an open source search system. Here at ArnoldIT we have that hands-on experience; a short code sketch after the fourth point below shows just how approachable a basic setup has become. And we can say that open source search and content processing solutions are moving from the desks of Linux wizards to more mainstream business professionals.

Next, we see more companies embracing open source, contributing to the overall community with bug fixes and new features and functions. At the same time, the commercial enterprises are “wrapping” open source with proprietary, value-added features and functions. The leader in this movement is IBM. Yep, good old Big Blue is an adherent of open source software. Why? We will try to answer this in our new information services.

Third, we think the financial pressure on organizations is greater than ever. CNBC and the Murdoch-outfitted Wall Street Journal are cheering for the new economic recovery. We think that most organizations are struggling to make sales, maintain margins, and generate new opportunities. Open source search and content solutions promise some operating efficiencies. We want to cover some of the business angles of the open source search and content processing shift. Yep, open source means money.

Finally, the big solutions vendors are under a unique type of pressure. Some of it comes from licensees who are not happy with the cost of “traditional” solutions. Other pressure comes from the data environment itself. Let’s face it: certain search systems, such as your old and dusty version of IBM STAIRS or Fulcrum, won’t do the job in today’s data- and information-rich environment. New tools are needed. Why not solve a new information problem without dragging the costs, methods, and license restrictions of traditional enterprise software along for the ride? We think change is in the wind, just like the smell of sweating horses a couple of months before the Kentucky Derby.
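
As promised above, here is that sketch: a small taste of how approachable the open source option has become. It uses Whoosh, a pure-Python search library; Lucene and Solr offer the same ideas at enterprise scale. The schema and documents are invented.

# Index two documents and run a query with Whoosh, an open source,
# pure-Python search library. Schema and documents are hypothetical.
import os
from whoosh.index import create_in
from whoosh.fields import Schema, TEXT, ID
from whoosh.qparser import QueryParser

schema = Schema(doc_id=ID(stored=True), body=TEXT(stored=True))
os.makedirs("indexdir", exist_ok=True)
ix = create_in("indexdir", schema)

writer = ix.writer()
writer.add_document(doc_id="1", body="Open source search moves into the mainstream.")
writer.add_document(doc_id="2", body="Proprietary license fees are under pressure.")
writer.commit()

with ix.searcher() as searcher:
    query = QueryParser("body", ix.schema).parse("open source")
    for hit in searcher.search(query):
        print(hit["doc_id"], hit["body"])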

Our approach to information in our new services will be similar to that taken in Beyond Search. We want to provide pointers to useful write ups and offer some comments which put certain actions and events in a slightly different light. Will you agree with the information in our new services? We hope not.

Stephen E Arnold, March 27, 2012

Sponsored by Pandia.com

Publishers Pose Threats to Text Mining Expansion

March 26, 2012

Text mining software is all the rage these days due to its ability to make significant connections by quickly scanning through thousands of documents. This software can recognize, extract and index scientific information from vast amounts of plain text, allowing computers to read and organize a body of knowledge that is expanding too fast for any human to keep up with. However, Nature.com recently reported on some issues that have developed in this growing industry in the article “Trouble at the Text Mine.”
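
A toy illustration of the “recognize, extract and index” step appears below, hedged heavily: the pattern and sample papers are invented, and real pipelines are far more sophisticated than a single regular expression.

# Pull DNA-like sequences out of article text and record where they occur.
# This is an invented, minimal example of extraction plus indexing, not
# any publisher's or researcher's actual pipeline.
import re

papers = {
    "paper_001": "We amplified the region using primer ACGTGGTTACCAGTACGT as described.",
    "paper_002": "No sequence data were reported in this methods section.",
}

DNA_PATTERN = re.compile(r"\b[ACGT]{12,}\b")  # runs of A/C/G/T, 12+ bases

index = {}
for paper_id, text in papers.items():
    for match in DNA_PATTERN.finditer(text):
        index.setdefault(match.group(), []).append(paper_id)

print(index)
# {'ACGTGGTTACCAGTACGT': ['paper_001']}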

According to the article, text mining programmers Max Haeussler and Casey Bergman have run into trouble trying to get science publishers to agree to let them mine their content.

The article asserts:

Many publishers say that they will allow their subscribers to text-mine, subject to contract and the text-miners’ intentions, and point to a number of successful agreements. But like many early advocates of the technology, Haeussler and Bergman complain that publishers are failing to cope with requests, and so are holding up the progress of research. What is more, they point out, as text-mining expands, it will be impractical for individual academic teams to spend years each working out bilateral agreements with every publisher.

While some publishers are getting on board the text mining train, many are still trying to work out how to take advantage of the commercial value before signing on. Too bad it takes more than a degree in English to make text mining deliver useful results. Bummer.

Jasmine Ashton, March 26, 2012

Sponsored by Pandia.com

Quid: Another Analytics Player

March 24, 2012

There’s a new player in content processing. Quid enters the market with big names and solid financing, too. The product description specifies:

Quid software is used by decision-makers running companies, NGOs, banks, and funds. It captures data, structures it, and enables people to visualize and interact with the information, to understand the global technology landscape. Teams can immerse themselves in and play with the data, optimizing decision-making about what to build and where to invest or partner. Quid software augments your ability to perceive this complex world.

Sounds like a valuable tool for those looking to invest in the next big thing. The software provides the ability to: map emerging technology sectors and identify rising stars; track tech R&D and breakthroughs; analyze white spaces for opportunities; and discern co-investment relationships in order to craft solid investment strategies.

We admire the company’s Origami-inspired way of explaining math and analytics. Very creative. Also, the “Life at Quid” page is well designed to entice potential employees.

Quid is one to watch as the company continues to move forward.

Cynthia Murrell, March 24, 2012

Sponsored by Pandia.com

Another Poobah Insight: Marketing Is an Opportunity

March 21, 2012

Please, read the entire write up “Marketing Is the Next Big Money Sector in Technology.” When you read it, you will want to forget the following factoids:

  • Google has been generating significant revenue from online ad services for about a decade
  • Facebook is working to monetize every single one of its 800 million plus users with a range of marketing services
  • Start ups in and around marketing are flourishing as the scrub brush search engine optimizers of yore bite the dust. A good example is the list of exhibitors at this conference.

The hook for the story is a quote from an azure chip consultancy. The idea is that as traditional marketing methods flame out, crash, and burn, digital marketing is the future. So the direct mail of the past will become the spam email of the future, I predict. Imagine.

Marketing will chew up an organization’s information technology budget. The way this works is that since “everyone” will have a mobile device, the digital pitches will know who, what, where, why, and how a prospect thinks, feels, and expects. The revolution is on its way, and there’s no one happier than a Madison Avenue executive who contemplates the riches from the intersection of technology, hapless prospects, and good old fashioned hucksterism. The future looks like a digital P.T. Barnum, I predict.


Text Analytics Gurus Discuss the State of the Industry

March 19, 2012

Text Analytics News recently reported on an interview with Seth Grimes, the president of Alta Plana Corporation, and Tom H.C. Anderson, managing partner of OdinText-Anderson Analytics, in the article “Infinite Possibilities of Text Analytics.”

According to the article, in preparation for the 8th Annual Text Analytics Summit East in Boston, Text Analytics News reached out to these influential thinkers in the text mining field and asked them some questions regarding the state of the industry.

In response to a question regarding the changes in the approach of analysis software for unstructured data, Grimes said:

The big changes in text analytics are the embrace of and by Big Data, the development of ever-more sophisticated algorithms, and a shift in the way users invoke the technologies. Enterprises understand that a high proportion of Big Data is unstructured: Variety is one of Big Data’s three “Vs.” Text analytics providers know they have to meet challenges presented by the other two “Vs:” Volume and Velocity.

Stephen E Arnold, publisher of Beyond Search, will discuss the implications of “near term, throw forward” algorithms. Mr. Arnold will describe how injections of content can distort the outputs of certain analytic methods. At the fall 2011 conference, Mr. Arnold’s presentation provided a reminder that “objective” outputs may not be.
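
A toy example of the distortion point: the same simple calculation gives a different answer once planted content enters the stream. The posts and the scoring method are invented for illustration only.

# An "objective" metric computed from a content stream can be skewed by
# injected content. Posts and the naive scoring are invented for the example.
POSITIVE = {"great", "love", "excellent"}
NEGATIVE = {"awful", "broken", "terrible"}

def score(post):
    words = set(post.lower().split())
    return len(words & POSITIVE) - len(words & NEGATIVE)

organic = ["The new model is great", "Battery life is awful", "I love the screen"]
print(sum(score(p) for p in organic) / len(organic))   # modestly positive

# Inject a burst of planted posts and the same method tells a different story.
injected = organic + ["Terrible awful broken"] * 5
print(sum(score(p) for p in injected) / len(injected))  # swings negative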

This is an interesting interview that would be worth checking out for those who are interested in attending the conference or just finding out a little more information about how content is analyzed. For registration information visit the Text Analytics website.

Jasmine Ashton, March 19, 2012

Sponsored by Pandia.com

Google and Semantic Search

March 15, 2012

The Wall Street Journal certainly has a scoop if one has been ignoring Google’s actions over the last five or six years. For a traditional “real” news publication owned by News Corp., the newspaper knows how to generate what I call “faux excitement.” The for-fee version of the Wall Street Journal story is at http://goo.gl/DnRrP although the link may go dead in a New York minute.

You will want to snag a copy of the dead tree edition of the March 15, 2012, newspaper. Turn to Section B1 and read “Google Gives Search a Refresh.” If you don’t have an online subscription to Mr. Murdoch’s favorite newspaper, click here.

I found the write up bittersweet. An era has ended at the Google. Google is moving into the choppy waters of “smart” search. Others have been in the kayaks trying to navigate meaning for a long time. Perhaps the best known player is Autonomy, which is now the “baby tiger” at Hewlett Packard. Google wants to skip the baby tiger metaphor and jump to the semantic shark.

My research suggests that Google has been grinding away at semantic search for a while, at least a decade. There were signals about Google wanting to get beyond the “clever” linking method and the semantic techniques of Oingo (Applied Semantics) a decade ago. (Notice the word “semantics” in the company name?)

Then Google took a couple of steps forward when it landed the Transformics technologies and hired Dr. Ramanathan Guha. You can get the rundown on Dr. Guha’s semantic leanings when you work through the hits for this query on Google: Ramanathan Guha semantic Web. No quotes required. Dr. Guha is the wizard behind the Programmable Search Engine, which I described at some length in Google Version 2.0: The Calculating Predator, published by the UK outfit Infonortics five years ago. The monograph may still be in print, and if you can snag a copy, you will see how Google’s wizard explains a system and method to populate “fact tables” and perform other feats of semantic legerdemain. The Wall Street Journal focuses on Google’s acquisition of Metaweb Technologies which is more along the lines of a complementary content or fact generating system. Google has a tendency to “glue” technologies together, not toss the shark technologies out with the bathwater.
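
To give a flavor of what a fact-table approach buys, here is a toy sketch: assertions stored as subject-predicate-object triples and answered directly, instead of returning ten blue links. It is an illustration only, not Google’s or Metaweb’s implementation, and the facts are stand-in examples.

# A toy "fact table": subject-predicate-object assertions plus a lookup,
# to give a flavor of entity-oriented, semantic answering. Illustration
# only; not Google's or Metaweb's system.
FACTS = [
    ("Lake Tahoe", "located_in", "Sierra Nevada"),
    ("Lake Tahoe", "max_depth_m", "501"),
    ("Sierra Nevada", "type", "mountain range"),
]

def ask(subject, predicate):
    """Answer a question from the fact table instead of listing documents."""
    return [obj for s, p, obj in FACTS if s == subject and p == predicate]

print(ask("Lake Tahoe", "max_depth_m"))   # ['501']
print(ask("Lake Tahoe", "located_in"))    # ['Sierra Nevada']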

The write up is one of those fear-uncertainty-doubt maneuvers which technology companies enjoy. “Real” journalists are too savvy to fall for the shiny lures. The persistent reader will learn that there is no release date for the new Google search. This surprised me because I was sure I read and later heard that Google version 2.0 was Google Plus, not plain old search with some WolframAlpha.com-like touches and Blekko nuances stirred in for enhanced flavor. I must admit I was confused about a news story written in the present tense which is really about some search advances which will arrive at an indeterminate time in the future, maybe tomorrow, maybe in September when the leaves turn.

The story suggests that Google is making changes because of Microsoft Bing, Apple’s voice search, or Facebook, which has no search service of much consequence. My hunch is that Google is making changes to search for one reason: ad revenue via traditional browser based search is softening. This is bad news for anyone dependent on online advertising revenue to pay for airplanes, Davos visits, and massive television and print advertising. Forget the competitors, Google has to do something that works to pump up margins and generate massive revenue. After more than a decade of trying to diversify its revenue, Google is under the gun. If Google’s magic touch were actually working, then the company should be rolling in dough from multiple revenue streams. Where is the payoff from appliances, enterprise sales, and me-too services which have essentially zero impact on companies like Apple, Facebook, and Microsoft?

Google’s PR thrust to focus attention on how it will improve search comes too quickly after Google got “real” journalists to believe that Google 2.0 was the “social” services. Well, how has that worked out for Google? I wrote about James Whittaker’s explanation of “Why I Left Google”. If you haven’t read the Whittaker write up, click here. The passage I noted was:

I couldn’t even get my own teenage daughter to look at Google+ twice, “social isn’t a product,” she told me after I gave her a demo, “social is people and the people are on Facebook.” Google was the rich kid who, after having discovered he wasn’t invited to the party, built his own party in retaliation. The fact that no one came to Google’s party became the elephant in the room.

Net net: Google has been in the semantic game a long time. Semantic technology is now in operation at Google, but as plumbing. Now Google wants to expose the pipes and drains.

The reason?

Semantics, it is hoped, will give Google more hooks on which to hang advertising messages. Without something new, revenue growth at Google may degrade at a time when Apple, Facebook, and Microsoft continue to grow. The unthinkable? Nope, the reality.

Stephen E Arnold, March 15, 2012

Sponsored by Pandia.com
