An Interview with Stefan Andreasen
Data integration is an issue that can derail information projects, and search is not the only activity affected. Data mining, migrating data from one system to another, and preparing information objects for a NoSQL data management system can all stall over data integration issues.
Some trade publications refer to data integration as data fusion, mashups, or ETL (jargon for extracting, transforming, and loading data from one system to another). The label is less important than the cost and time impacts of data integration.
Kapow Software (which recently changed its name from Kapow Technologies) has been active in data integration for more than a decade. With an interest in reducing costs by giving users a single interface to business information, Kapow has been rapidly expanding its product and service offerings. The company offers technology for extraction, content production, and mobile delivery, among other areas.
I spoke with the founder of Kapow Software on my recent trip to Palo Alto. The full text of my interview with Stefan Andreasen, Kapow Software’s founder and chief technology officer, appears in the ArnoldIT.com Search Wizards Speak mini-site at this link.
Where did the idea of Kapow originate? Was there a seminal project that triggered the idea of a content processing and enabling system?
In the late 1990s in Denmark, I founded Kapow.net, which grew into Europe's largest online marketplace for luxury goods. As part of building Kapow.net, we developed a platform for internal use that automated data extraction from more than 5,000 Web sites while simultaneously normalizing and cleansing the data. It was built on a disruptive assumption: a Web browser could be used as the interface for viewing, extracting, and even integrating data, an approach heavily influenced by my prior experience with flow-chart data visualization tools and object-oriented layouts. This platform allowed the Kapow.net team to perform work that would have been impossible with any other technology on the market.
What's your background? When did you become interested in data integration, text and content processing? What experience / professor / project pulled you into this technical space when you were in school?
I graduated from the Technical University of Denmark, the nation's top technical academic institution. After university, I spent five years with data visualization leader Advanced Visual Systems (Nasdaq: MUZE), where I developed a number of advanced C++ and Java data visualization products.
As soon as we started building the foundational technology at Kapow.net in Denmark, I knew we were on to something special with broad applicability far beyond that company. For one, the Web was evolving rapidly from an information hub to a transaction hub, and businesses needed to consolidate and automate millions of cross-application transactions in a scalable way. Also, Fortune 1000 companies were then, and even more so today, turning to outsourced consultants and hordes of manual workers to do work that this innovation could do instantly.
When was this?
In 2001, I sold Kapow.net to the largest bank in Denmark and formed Kapow Technologies Solutions Inc., recently renamed Kapow Software. We now focus solely on the data extraction browser technology.
Very early on in Kapow’s history, our current CEO John Yapaola joined the company to assist in moving the headquarters to the U.S. John comes from a long history of managing and growing high-tech startups. Prior to joining Kapow, he was CEO of Let’s Think Wireless, a wireless infrastructure design, engineering and installation company.
What's Kapow Katalyst and how does it differ from other data access and integration systems?
That’s a very important question. But let me first explain the problem Katalyst solves.
The explosion of enterprise Web applications without documented APIs, both on the Web and in the cloud, has created major challenges for data integration projects. The situation has become especially critical because rapid data delivery is emerging as the top strategic priority for most businesses.
Traditional integration methods can take months or years: rewriting applications, waiting for APIs from the data provider, customizing published APIs, or manually cutting and pasting data between different sources. This standard process makes it impossible to extract, transform, and integrate the data efficiently where and when it is needed. This is where Kapow Katalyst comes into play.
Now to your answer: Kapow Katalyst is the first and only data integration platform that automatically extracts, transforms, integrates and migrates data to and from any application and any device without requiring an API.
How does this method work?
At the core of Katalyst is the first-of-its-kind extraction browser that automatically extracts data from virtually any source, transforms data into meaningful information using business rules, then integrates and migrates data into any database, Web site, enterprise app, cloud app, SaaS app, or mobile app.
So with the Kapow Extraction Browser no API is required because the powerful Kapow robots can automatically access and extract data directly through any layer of the data stack.
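Kapow's extraction browser is proprietary, but the general idea of presentation-layer extraction can be sketched in a few lines: instead of calling an API, the extractor reads the same HTML a human sees and pulls structured records out of it. Everything below (the sample page, the field names, the `PriceExtractor` class) is hypothetical and for illustration only, not Kapow's implementation.

```python
# Illustrative sketch of presentation-layer extraction: pulling structured
# records out of rendered HTML instead of calling an API. Uses only the
# Python standard library; the page and field names are invented.
from html.parser import HTMLParser

class PriceExtractor(HTMLParser):
    """Collects (name, price) records from a hypothetical product listing."""
    def __init__(self):
        super().__init__()
        self._field = None          # which labeled cell we are inside, if any
        self._current = {}
        self.records = []

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "")
        if tag == "span" and classes in ("name", "price"):
            self._field = classes

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        if tag == "li" and self._current:   # one record per list item
            self.records.append(self._current)
            self._current = {}

page = """
<ul>
  <li><span class="name">Widget</span> <span class="price">9.99</span></li>
  <li><span class="name">Gadget</span> <span class="price">24.50</span></li>
</ul>
"""

extractor = PriceExtractor()
extractor.feed(page)
print(extractor.records)
# [{'name': 'Widget', 'price': '9.99'}, {'name': 'Gadget', 'price': '24.50'}]
```

The point of the sketch is that the "API" is the page layout itself: if a user can see the data, an extractor working through the presentation layer can reach it too.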
Everyone assumes that data fusion and "smart information" is the way information retrieval will work in an organization. What's your view of how the Kapow systems and software fit into the business professionals' work life?
The success of any enterprise largely depends on its ability to adapt to changing business conditions and customer expectations. In this respect, the ability to access, integrate and respond to real-time data and intelligence is arguably the number one business requirement to achieve this.
Kapow recognized the gap between the data business needs and IT’s inability to deliver it quickly and accurately. Critical data is housed everywhere: The Web, in legacy applications, out in the cloud. But it’s difficult to get at because most applications and Web sites do not have published APIs.
Companies could wait weeks for the manual coding needed to integrate a new data source while the competition passes them by. Kapow's platform takes the APIs, and the dependencies and substantial information technology lag time that come with them, out of the equation, giving the business an agile infrastructure with the real-time data it needs to streamline processes and make snap decisions.
Would you give me an example of a customer use case?
Of course. In one use case, Audi, the automobile manufacturer, was able to eliminate dependencies, streamline its engineering process, and minimize time-to-market on its new A8 model.
Audi employs Katalyst to integrate data for their state of the art navigation system, called MMI, which combines Google Earth with real-time data about weather, gas prices, and other travel information, customizing the driver’s real-time experience according to their location and taste preferences. In developing the navigation system, Audi had relied on application providers to write custom real-time APIs compatible with the new Audi system.
After months of waiting for the APIs and just two weeks away from the car launch date, Audi sought Kapow’s assistance. Katalyst was able to solve their problem quickly, wrapping their data providers’ current web applications into custom APIs and enabling Audi to meet their target launch date. By employing Kapow, Audi is now able to quickly launch the car in regional markets because Katalyst enables the Audi engineers to easily change and integrate new data sources for each market, in weeks rather than months.
Without divulging your firm's methods, will you characterize a representative technical enhancement you have made to leverage the Kapow technical capability?
Yes, that’s a great question. Brand new to Kapow Katalyst is a Web-based management console that improves cross-departmental and cross-functional collaboration on strategic integration projects. In addition, the management console is 100 percent compliant with today’s enterprise information technology infrastructure and security requirements. Our approach allows Kapow Katalyst users to leverage their current technology and system investments. The new console also enables the information technology department to continuously monitor the status of the organization’s entire production system across the enterprise to ensure maximum performance, governance and security.
Companies shifting business functionality to the cloud has had a profound impact on data integration. How is Kapow positioned to take advantage of this shift?
A company’s decision to move to the cloud is most often driven by a desire to reduce information technology infrastructure costs. On the other hand, companies are also driven to Software-as-a-Service (SaaS) in order to meet unfulfilled line of business needs.
When you shift your applications to the cloud or SaaS, you are moving to an environment which is always Web-based, often lacking database access. Adding to the access and integration challenge, applications and data will then be even more widely distributed among internal/private cloud, external cloud and various SaaS clouds.
Is this the disruption you mentioned a moment ago?
Yes, Kapow is disrupting the way companies solve these data delivery challenges with an end-to-end platform that uniquely integrates with any layer in the application stack: the presentation layer (always present in the cloud), the application layer, or the database layer. Kapow automates the processes of extracting, transforming, integrating and migrating data from virtually any source in any of those layers to virtually any destination. We know this approach gives users unprecedented agility in moving data wherever they need it: inside their firewall, across the Web or in the cloud.
One challenge to those involved with squeezing useful elements from large volumes of content is the volume of content AND the rate of change in existing content objects. What does your firm provide to customers to help them deal with the amount of new information available across their organization?
For data volumes up to roughly a quarter million records, Kapow has proven through customer successes to be a viable and cost-effective way to work with large, internal datasets. Only for internal, extremely high-volume, low-latency data integration would a more traditional data integration approach be a better choice.
The Kapow Extraction Browser gets its instructions from programmatic robots.
Software robots, right?
Yes, that’s our nomenclature for powerful scripted agents that automatically collect data and/or repeat data entry and business processing in real time. IT staff can build robots with click-and-drag simplicity, no custom coding required, that perform any number of automated processes in parallel to collect, transform, integrate and migrate data quickly and accurately. Once these robots (or advanced extract-transform-load scripts) are put into action, they automatically synchronize or update changes in the data in real time or at scheduled intervals. If the business needs a new source of data, the IT department can build a new robot with the same click-and-drag simplicity and very little lag time from IT complications.
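A robot of this kind boils down to a scheduled extract-transform-load cycle. The following is a minimal, hypothetical sketch of that cycle in Python; the source data, business rules, and destination store are stand-ins of my own invention, not Kapow components.

```python
# Hypothetical sketch of what a Kapow-style "robot" automates: an
# extract-transform-load cycle that can be re-run on a schedule.
def extract(source):
    """Pull raw records from a source (a canned list standing in for a Web app)."""
    return source

def transform(raw):
    """Apply business rules: normalize field names and types."""
    return [{"customer": r["Name"].title(), "amount": float(r["Amt"])} for r in raw]

def load(records, destination):
    """Synchronize into the target store (here, a dict keyed by customer)."""
    for r in records:
        destination[r["customer"]] = r["amount"]
    return destination

web_app_data = [{"Name": "acme corp", "Amt": "125.00"},
                {"Name": "globex", "Amt": "87.50"}]
store = {}
load(transform(extract(web_app_data)), store)
print(store)   # {'Acme Corp': 125.0, 'Globex': 87.5}

# In production, a scheduler would simply repeat the cycle, e.g.:
# while True:
#     load(transform(extract(web_app_data)), store)
#     time.sleep(3600)
```

The "click-and-drag" tooling described above would generate the equivalent of these three steps without hand-written code; the sketch only shows the shape of the work being automated.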
In real-world terms, Kapow allows AT&T to automatically synchronize data from 5,000 transactions per day across multiple customer-service applications.
Another challenge, particularly in professional intelligence operations, is moving data from point A to point B; that is, information enters a system but it must be made available to an individual who needs that information or at least must know about the information. What features have you introduced to allow Kapow clients to work without having to hunt or guess what search query unlocks the information treasure chest?
Yes, and this is really one of Kapow Katalyst’s sweet spots.
Business analysts rarely request data they don’t already know how to pull up on their screen via its original source. Since Kapow’s Extraction Browser allows a user to search for and extract the data exactly the same way, through the same presentation layer the analyst uses, our system completely eliminates the guesswork on how to access and deliver actionable data. The analyst can simply “show” the information technology staff where the data lives and the IT department can then quickly build a robot that can extract, transform, integrate or migrate that data exactly the way the analyst requested, with virtually no lag time.
There has been a surge in interest in putting "everything" in a repository and then manipulating the indexes to the information in the repository. On the surface, this seems to be gaining traction because network resident information can "disappear" or become unavailable. What's your view of the repository versus non-repository approach to content processing?
When it comes to the epic repository debate, I have to enter Kapow as a neutral party. Kapow is a data delivery platform and can deliver the data in any file format, database, Web-service interface (REST/SOAP), or programmatic API (Java or Dot NET). We make it easy to deliver, transform or extract data from your preferred repository, but Kapow does not include its own repository nor do we favor any particular method. We can support your preferred data repository method, providing the same data extraction, transformation, and integration our customers have come to depend on.
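Delivering the same extracted records to different repositories is, at bottom, a serialization choice. A minimal sketch of that idea, using made-up records and only the Python standard library (this does not depict Kapow's actual delivery connectors):

```python
# Sketch of format-agnostic delivery: the same extracted records rendered
# as JSON, CSV, or XML depending on what the target repository expects.
import csv
import io
import json
from xml.etree.ElementTree import Element, SubElement, tostring

records = [{"name": "Widget", "price": "9.99"}]

def to_json(rows):
    """Render records for a JSON-accepting store or REST endpoint."""
    return json.dumps(rows)

def to_csv(rows):
    """Render records for flat-file or spreadsheet delivery."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

def to_xml(rows):
    """Render records for an XML repository."""
    root = Element("records")
    for row in rows:
        rec = SubElement(root, "record")
        for key, value in row.items():
            SubElement(rec, key).text = value
    return tostring(root, encoding="unicode")

print(to_json(records))
print(to_xml(records))
```

Because the extraction and transformation steps are independent of the output format, supporting a new repository type means adding one more renderer, which is consistent with the neutral, "any repository" stance described above.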
Rich media is a hot content type. What is Kapow doing with regards to audio, video, imagery?
Rich media formats are not a passing phase and will only become an increasingly important data format. Inherent to our vision, Kapow Katalyst supports high-volume streaming of any media format directly from a Web application or site to your preferred destination file system, whether it is internal or in the cloud. Kapow has prioritized Katalyst optimization with the latest rich media formats, ensuring simple data delivery by maintaining 100 percent compatibility with the latest media format standards throughout our product roadmap.
I am on the fence about the merging of retrieval within other applications. What's your take on the "new" method which some people describe as "search enabled applications"?
I think we will see a great deal more “search-enabled applications” in the coming days, simply because end users will continue to demand their time-saving benefits. In what I think is indicative of their growing popularity, we actually have several customers employing Kapow Katalyst for federated search, in which they leverage our single-sign on ability to access multiple account-related data sources.
There seems to be a popular perception that the world will be doing computing via iPad devices and mobile phones. My concern is that serious computing infrastructures are needed and that users are "cut off" from access to more robust systems? How does Kapow see the computing world over the next 12 to 18 months?
Nearly every company faces growing demand from customers, partners and employees to deliver Web content, services and applications to mobile devices such as smart phones, tablets, e-readers, and navigation systems, which I think we can argue are becoming more “robust” themselves every day. Extending all relevant existing applications and data sources to the mobile platform would take developers years to do by traditional methods of rewriting applications.
Kapow Katalyst can wrap a client’s existing Web site into mobile-ready APIs without changing the systems already in place. Our approach reduces overhead costs and time-to-market for companies looking to enter the mobile arena. I expect the mobile market to do nothing but grow, and we plan to continue to enable enterprise access to the mobile devices where audiences are increasingly spending time online.
For example, a leading European bank has used Kapow Katalyst to wrap their existing consumer-facing banking Web site into a set of REST services, and added a modern mobile platform on top. This has cut down the time-to-market from more than 24 months to just three months. Our approach spared the company from the considerable consultancy costs that would have been required to hastily rewrite their existing Web banking applications.
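Wrapping an existing site behind REST services can be pictured as routing URL paths to extraction functions. The sketch below is hypothetical throughout (the account data and the `scrape_balance` function are invented stand-ins for an extraction robot), and a real deployment would serve these routes over HTTP rather than call a dispatcher directly.

```python
# Hypothetical sketch of wrapping screen-scraped banking data behind
# REST-style endpoints. scrape_balance() stands in for an extraction robot.
import json

def scrape_balance(account_id):
    """Stand-in for a robot that extracts a balance from the existing Web site."""
    fake_site = {"12345": "1,204.55"}        # canned data for the sketch
    return float(fake_site[account_id].replace(",", ""))

def handle(path):
    """Tiny REST-style dispatcher: /accounts/<id>/balance -> JSON payload."""
    parts = path.strip("/").split("/")
    if len(parts) == 3 and parts[0] == "accounts" and parts[2] == "balance":
        return json.dumps({"account": parts[1],
                           "balance": scrape_balance(parts[1])})
    return json.dumps({"error": "not found"})

print(handle("/accounts/12345/balance"))
# {"account": "12345", "balance": 1204.55}
```

The design point matches the bank example: the mobile layer talks to clean REST resources, while the messy work of reading the legacy Web application is confined to the extraction functions behind them.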
If you look out across the regulatory landscape, do you see more or less government intervention in information processing?
Tough question. Right now I see increasing regulatory oversight on the horizon, certainly a development for which Kapow intends to optimize its data delivery methods. Not surprisingly, given the recent economic turbulence and bank bailouts, the financial sector is facing a new era of regulation. We have already worked with Fiserv, a financial technology provider, to API-enable financial accounts across more than 300 banks in over 10 countries in order to meet regulatory asset reporting requirements for the treasury department. By employing Kapow, the whole project was implemented in three months and successfully delivers the real-time reporting necessary to ensure compliance.
With an eye to improving regulatory capabilities around data access and integration, the Kapow Katalyst platform includes a key component for organizational governance: the Management Console. One of the newest Katalyst features, the Management Console supports the creation of different administrator roles and access levels for personnel across enterprise operations. This allows business rules to be established securely within the programming, ensuring that the transformation and delivery of data complies with corporate and legal regulations. For instance, a network administrator or manager could have access to server monitoring and statistics tools, an application developer would have access to all code and APIs, while another user role might have access only to the metrics needed to identify non-compliant data reporting. From the Management Console, IT managers can easily dictate the access allowed, the information displayed, and how and where it is displayed.
What are the most significant technologies that you see affecting our business?
I would say there are three evolutionary trends that I have been following closely for their ties to our roadmap. The first is the enterprise shift to the cloud and SaaS applications, which is something Kapow Katalyst has been simultaneously evolving to address.
Another important technology trend is the development of Web standards, such as HTML5, which will have a dramatic effect on the Web development landscape and, therefore, on the way we access Web data.
My third and final technological observation is the fruition of the semantic Web: one where Web data is created so that machines, and not just humans, can understand it.
With all three trends progressing and changing every day, our product road map emphasizes our ability to stay abreast of these technological developments and standards, continuing our product innovation while maintaining 100 percent compatibility.
Where does a reader get more information about Kapow?
Our Web site is a good place. Its address is www.kapowsoftware.com.
Our investigation of Kapow Software revealed a number of interesting features. Among those we noted were the company’s ability to manipulate unstructured information for use in XML repositories and to migrate data from a traditional relational database to a NoSQL data management system.
Worth a look.
Stephen E Arnold, December 14, 2010