Exclusive Interview with Ana Athayde, Spotter SA

August 16, 2011

I have been monitoring Spotter SA, a European software development firm specializing in business intelligence, for several years. A lengthy interview with the founder, Ana Athayde, appears in the Search Wizards Speak section of the ArnoldIT.com Web site.

The company has offices throughout Europe, the Middle East, and in the United States. The firm offers solutions in market sentiment, reputation management, risk assessment, crisis management, and competitive intelligence.

In the wide-ranging interview, Ms. Athayde mentioned that she had been recognized as an exceptional manager, but she was quick to give credit to her staff and her chief technical officer, who was involved in the forward-looking Datops SA content analytics service, now absorbed into the LexisNexis organization.

I asked her what pulled her into the vortex of content processing and analytics. She told me:

My background is business and marketing management in the sports field. In my first professional experience, I had to face major challenges in communication and marketing working for the International Olympic Committee. The amount of information published on those subjects was so huge that the first challenge was to solve the infoglut: not only to search for relevant information and build a list, but to understand opinions and assess reputation at an international level….I decided to found a company to deliver a solution that could make use of information in textual form, what most people call unstructured data. But I knew that the information had to be presented in a way that a decision maker could actually use. Data dumps and row after row of numbers usually mean no one can tell what’s important without spending minutes, maybe hours deciphering the outputs.

I asked her about the firm’s technical plumbing. She replied:

The architecture of our own crawling system is based on proprietary methods to define and tune search scenarios. The “plumbing” is a fully scalable architecture which distributes tasks to schedulers. The content is processed, and we syndicate results. We use what we call “a source monitoring approach” which makes use of standard Web scraping methods. However, we have developed our own methods to adjust the scraping technology to each source in order to search all available documents. We extract metadata and relevant content from each page or content object.  Only documents which have been assessed as fresh are processed and provided to users. This assessment is done by a proprietary algorithm based on rules involving such factors as the publication date. This means that each document collected by Spotter’s tracking and monitoring system is stamped with a publication date. This date is extracted by the Web scraping technology, from the document content. The type of behavior of the source; that is, the source has a known update cycle. We analyze the text content of the document. And we use the date and time stamp on the document itself.
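Spotter's algorithm is proprietary, so the following is only a rough sketch of how a rule combining a publication date with a source's known update cycle might look. The `Document` class, the `is_fresh` function, and the `slack` parameter are hypothetical names invented for illustration; none of them come from Spotter.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Document:
    """Hypothetical record for one collected page (not Spotter's schema)."""
    url: str
    published: Optional[datetime]   # date extracted from the page content, if any
    source_update_cycle: timedelta  # how often this source is known to publish


def is_fresh(doc: Document, now: datetime, slack: float = 1.5) -> bool:
    """Assumed rule: a document is fresh if it was published within
    `slack` update cycles of `now`. Undated or future-dated documents
    are rejected because their time context cannot be trusted."""
    if doc.published is None or doc.published > now:
        return False
    return now - doc.published <= doc.source_update_cycle * slack


now = datetime(2011, 8, 16, 12, 0)
daily = timedelta(days=1)
recent = Document("http://example.com/a", datetime(2011, 8, 16, 6, 0), daily)
stale = Document("http://example.com/b", datetime(2011, 8, 10, 6, 0), daily)
undated = Document("http://example.com/c", None, daily)

print(is_fresh(recent, now), is_fresh(stale, now), is_fresh(undated, now))
```

A real system would presumably weigh more signals, such as the text analysis Ms. Athayde mentions, but even this toy rule shows why a reliable publication date has to be extracted before anything else can be decided.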

Anyone who has tried to use the dates provided in some commercial systems realizes that, absent accurate time context, much information is essentially useless without additional research and analysis.

To read the complete interview with Ms. Athayde, point your browser to the full text of our discussion. More information about Spotter SA is available at the firm’s Web site www.spotter.com.

Stephen E Arnold, August 16, 2011

Freebie but you may support our efforts by buying a copy of The New Landscape of Enterprise Search

Holodeck: For Your Spies Only

July 14, 2011

Wired announces, “Spy Geeks Want Holodeck Tech for Intel Analysts.” Yes, finally! Wait, intelligence analysis?

The U.S. intelligence community’s research group, IARPA, is working on the Synthetic Holographic Observation (SHO) program, which will allow intelligence analysts to use holographic displays to collaborate. Oh. I guess that’s cool too.

Though we’re still a long way from the Holodeck as envisioned in Star Trek, writer Adam Rawnsley emphasizes that this is a step in that direction. More importantly for the current point in history, it could become an indispensable tool for our intelligence officers. The article asserts:

The program is aimed at generating 3-D displays that let analysts get a better feel for the mountains of imagery that the intelligence community collects. In particular, SHO needs to render conventional imagery and LIDAR (light detection and ranging) into holographic light fields. . . .SHO needs to be able to let multiple analysts work together on the same image at the same time. To do that, it has to be interactive. IARPA is asking prospective builders to make a hologram that analysts can navigate and manipulate in ways that regular maps don’t allow.

Sounds like a great idea. I look forward to learning more. We think a phase change in search and information access is underway.

Cynthia Murrell, July 14, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search

Visual Search with 3D from GE

June 25, 2011

A new Google search technology could change the way people use the Internet. According to the Search Engine Watch article “Visual Health Search,” Google recently launched its Google Body Browser.

Users have the ability to view a 3D layered model of the body. The article asserted:

“The body can be turned, manipulated, and literally “dissected” down to the vascular level to see how its functions work and connect. This kind of detailed information gives users access to the human body that’s never been available before and goes a long way in promoting a level of understanding that can help people make better informed decisions about their health.”

Healthline Networks in conjunction with GE Healthymagination and Visible Productions have also introduced Healthline BodyMaps. We learned: 

“This new tool layers search on top of a 3D anatomical model, and allows people to navigate male and female anatomy, view systems, and organs and explore how the body works.”

Not only can patients get a more in-depth look at and understanding of medical conditions, but physicians and other health care providers will be able to use it to help explain medical conditions, procedures, and so on to patients. Visual Search and 3D could be the new dream team.

A couple of thoughts. Google seems to have terminated its electronic medical record project. And didn’t GE design the Fukushima nuclear facility? Visual search might be less challenging and have a higher upside.

Cynthia Murrell, June 25, 2011

From the leader in next-generation analysis of search and content processing, Beyond Search.

Augmenting the Future with Aurasma

June 24, 2011

I somehow missed the earlier reporting until now.  Straight out of science fiction, “Aurasma App Is Augmented Reality, Augmented” gives a glimpse of the next app to potentially saturate the market.

Aurasma describes itself as an augmented reality platform. With an equipped smartphone, one can point the window at the world and conjure up an associated “aura”, or online video content. The content is created by anyone and stored in an infrastructure provided by the brains of this innovation, Autonomy. The article stated:

The idea is that media companies can use Aurasma to recognize printed matter – street posters, newspapers, magazines – to call up compelling video and online content they have made themselves or from TV stations and movie studios. … It’s making the world browsable.

Okay, the given examples of animating the assembly instructions of flat-pack furniture, talking newspapers and bringing advertisements to life will be at least momentarily entertaining.  But these applications toe the gimmick line and can lose their appeal as quickly as animated ads have ruined the internet. 

Let’s consider some of the more useful extensions. In engineering, for one: imagine issuing equipment specs or construction issue drawings with associated 3-D models rather than typical 2-D likenesses. Sure, 3-D CAD files exist, but not everyone can afford those licenses merely for viewing. Highlighting all of the most important and oft-ignored drawing details with Aurasma animations would be another option. Any industry based on communicating through flat images could benefit greatly from this service.

Further and a bit closer to home, making the world “browsable” is tantamount to making it searchable. If the technology sticks, it will not take long before search can be leveraged through Autonomy’s brilliant platform. It sounds like another layer is being added to the prospering location-based services business. If we could view it through Aurasma, we would probably see some cheerfully dancing dollar signs.

Sarah Rogers, June 23, 2011

ArnoldIT.com, the resource for enterprise search information and current news about data fusion

Maps on Steroids

May 17, 2011

Here is an interesting link from the people behind Mapsys.info: “Public Data Visualization with Google Maps and Fusion Tables”.

“Visualizing” public data basically means mapping information that is relevant to a community.  A good working example mentioned in the posting is San Francisco’s Bay Area bike accident tracker.  The map’s legend decodes the various colored dots as the type of accident and how it came to be recorded.

Image source: http://mapsys.info/

A screenshot of the code needed to display a map with personalized details is offered in the posting. The star of the show is the integration with a Fusion Table, a tool offered by Google to house data sets to be presented on a map. Added functionality is included by using “SQL-like query syntax” and leveraging “the Python libraries Google provides for query generation and API calls”. This allows you to pick smaller data sets out of the Fusion Table.
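To give a feel for what picking a smaller data set out of a table with a SQL-like query means, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in and does not call the Fusion Tables API or Google's libraries. The `accidents` table, its columns, the sample rows, and the `select_by_kind` helper are all invented for illustration.

```python
import sqlite3

# In-memory table standing in for a hosted data set (made-up sample rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accidents (id INTEGER, kind TEXT, lat REAL, lng REAL)")
conn.executemany(
    "INSERT INTO accidents VALUES (?, ?, ?, ?)",
    [
        (1, "bike-car", 37.77, -122.42),
        (2, "bike-pedestrian", 37.78, -122.41),
        (3, "bike-car", 37.80, -122.27),
    ],
)


def select_by_kind(connection, kind):
    """Return (lat, lng) pairs for one accident type, ready to plot as map markers."""
    cursor = connection.execute(
        "SELECT lat, lng FROM accidents WHERE kind = ? ORDER BY id", (kind,)
    )
    return cursor.fetchall()


for lat, lng in select_by_kind(conn, "bike-car"):
    print(lat, lng)
```

The point is the shape of the workflow: the full data set lives in one table, and a short query narrows it to the subset each map layer needs.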

So behind the scenes, this looks like another example of search moving beyond the token keyword.  You won’t hear any complaints out of us. I remember creating maps using old fashioned methods when I was working on my engineering degree. This method delivers accuracy and time savings.

Sarah Rogers, May 17, 2011

Freebie

Visualization Components

May 15, 2011

David Galles, of the Computer Science department at the University of San Francisco, gives us a useful collection of visualization components in his “Data Structure Visualizations” list. The structures and algorithms addressed include the Basics, Indexing, Sorting, Heap-like Data Structures, Graph Algorithms, Dynamic Programming, and “Others.”

In his page discussing visualizations, Galles explains,

The best way to understand complex data structures is to see them in action. We’ve developed interactive animations for a variety of data structures and algorithms. Our visualization tool is written in JavaScript using the HTML5 canvas element, and run in just about any modern browser — including iOS devices like the iPhone and iPad, and even the web browser in the Kindle! (The frame rate is low enough in the Kindle that the visualizations aren’t terribly useful, but the tree-based visualizations — BSTs and AVL Trees — seem to work well enough).

Galles also provides a tutorial for creating one’s own visualizations. Check it out if you’re wrestling with your own complex data structures. As search vendors thrash and flail, business intelligence looks like a promising market sector. Nothing sells business intelligence like hot graphics. Just ask Palantir.

Cynthia Murrell, May 15, 2011

As Search and Analytics Merge: Visualizations Surge

May 7, 2011

“6 Great Data Visualization Applications” provides some interesting screen shots and links to exemplary graphical presentations of result sets. The drivers for visualization are MBAs looking to add sizzle to their otherwise narcotized PowerPoints and big data. When running a query against petabytes of data, a laundry list is essentially useless. With top results distorted by spam and SEO machinations, I find it difficult to pinpoint what I need to answer a question. I find myself falling back on traditional research methods such as notecards, looking up information in books (printed and digital), and talking to people who allegedly know their stuff.

Assume you want some snappy visualizations. The article from Techlozenge will help you out. You get a screen shot and a brief description of six tools. These include:

Of this group, I found the Newsmap and Google Chart Tools links the most useful. You may want to take a look at these examples. Keep in mind that these are not industrial strength toolsets like those provided with Palantir and Digital Reasoning. The idea is to provide some examples.

Stephen E Arnold, May 7, 2011

Freebie

Visualization Tools for Data Analysis

May 2, 2011

Just like consumers, companies often compare the products on the market. Companies gather loads of data to help them meet the needs of their clients and stay productive. Visualization is an important data analysis tool that transforms text into graphics in order to make the data easier to comprehend. Users can then study the graphics and look for trends or patterns.

The hefty price tag for visualization tools often makes them seem unattainable for many. The Computerworld article “22 Free Tools For Data Visualization and Analysis” provides an in-depth review of 22 free tools that can be used in data visualization and analysis. Statistical analysis, data cleaning, and mapping are just a few of the kinds of tools available.

With data analysis such an important part of staying competitive in the business world, companies must have the tools needed to do the job effectively. With nothing to lose but maybe a little time, this deal seems too good to pass up. This is worth downloading and tucking away. Useful article.

April Holmes, May 2, 2011

Freebie

Automated Understanding: Digital Reasoning Cracks the Information Maze

March 4, 2011

I learned from one reader that the presentation by Tim Estes, the founder of Digital Reasoning, caused some positive buzz at a recent conference on the west coast. According to my source, this was a US government sponsored event focused on where content processing was going. The surprise was that as other presenters talked about the future, a company called Digital Reasoning displayed a next generation system. Keep in mind that i2 Ltd. is a solid analyst’s tool with technology roots that stretch back 15 years. (I did some work for the founder of i2 a few years ago and have a great appreciation for the case value of the system for law enforcement.) Palantir has some useful visualization tools, but the company continues to attract attention from litigation and brushes with outfits with some interesting sales practices. Beyond Search covered this story here and here.

ArnoldIT.com sees Digital Reasoning’s Synthesys as solving difficult information puzzles quickly and efficiently because it eliminates most of the false path or trial-and-error of traditional systems. Solving the information maze of real world flows is now possible in our view.

The shift was from semi-useful predictive numerical recipes and overlays or augmented outputs to something quite new and different. The Digital Reasoning presentation focused on real data and what the company called “automated understanding.”

For a few bucks last year, one of my colleagues and I got a look at the automated understanding approach of the Synthesys 3 platform. Tim Estes explained that real data poses major challenges to systems that lack an ability to process large flows, discern nuances, and apply what Mr. Estes described as “entity oriented analytics.”

Our take at ArnoldIT.com is that Digital Reasoning moves “beyond search” in a meaningful way. The key points we recall from our briefing were that a modular approach eliminates the need for a massive infrastructure build and that the analytics reflect what is happening in a real time flow of unstructured information. My personal view is that historical research is best served by key word systems. The more advanced methods deliver actionable information and better decisions by focusing on the vast amounts of “now” data. A single Twitter message can be important. A meaningful analysis of a flow of Twitter messages moves insight to the next level.

Free Visio Stencil Art for SharePoint Planning

December 27, 2010

Why get lost in designing search architectures when you could reference a map? Microsoft recently released a handy collection of Visio shapes created specifically for generating diagram models of server deployment. These shapes prove useful for the Microsoft 2010 versions of Office, SharePoint Server, Project Server, Search Server and SharePoint Foundation.

It only takes seconds to grab this 1-MB .zip file from the website and extract its contents into your Visio shape folder. The system requirements and download instructions are clearly posted on the webpage, making the whole process a snap. Microsoft was even thoughtful enough to provide several examples of sensible ways to employ the custom shapes; the IT pro content publishing team put together a smattering of SharePoint Server and Office 2010 technical diagrams as guidelines. Now your own SharePoint installations can quickly become a matter of record, making the path easier to follow in the future.

Sarah Rogers, December 27, 2010

Freebie
