Exegy Delivers Ultra High Performance Hosted Service
February 3, 2010
With the buzz about real time content processing and outfits like Thomson Reuters delivering really fast throughput, I was not surprised to read in Wall Street & Technology that Exegy has gunned its engine and driven into the low latency hosted content processing service business. “Exegy Deploys Ultra Low Latency Ticker Plant on Options PIPE Platform” reports that Exegy has teamed with Options IT to make its Ticker Plant available on the Options IT platform. If you are not familiar with these firms, both support customers who require low latency access to information. The article said:
The Option PIPE platform is a fully optimized and managed, software-vendor-neutral, global technology infrastructure, providing clients with the efficiencies of a hosted technology service delivered with the scalability, strength and security of an enterprise solution. The hosted Exegy Ticker Plant is the first hardware-accelerated market data appliance built from the ground up to ensure high-frequency traders continuously have the best view of the electronic markets.
Exegy has engineered its hardware, firmware, and software to chop latency from content processing. For more information about Exegy navigate to http://www.exegy.com. For information about Options IT, point your browser to http://www.options-it.com/.
In a drag race, which vendor would win? I would lean toward the Exegy team. Serious invention from that crowd in my opinion. I described Exegy in a couple of my studies of next generation content processing vendors because the company distinguished itself with low latency crunching for the Wall Street crowd that has been thinned, along with me, in the economic meltdown.
Stephen E Arnold, February 3, 2010
No one paid me to write this short article. I will report non-payment to the IRS, which cares about me. No, it really cares. For me. For you. For everyone.
Exclusive Interview: Digital Reasoning
February 2, 2010
Tim Estes, the youthful founder and chief technologist of Digital Reasoning, a search and content processing company based in Tennessee, reveals the technology that is driving the company’s growth. Mr. Estes, a graduate of the University of Virginia, tackled the problem of information overload with a fresh approach. You can learn about Digital Reasoning’s approach, which delivers a system that “deeply, conceptually searches within unstructured data, analyzes it and presents dynamic visual results with minimal human intervention. It reads everything, forgets nothing and gets smarter as you use it.”
Mr. Estes explained:
Digital Reasoning’s core product offering is called “Synthesys.” It is designed to take an enterprise from disparate data silos (both structured and unstructured), ingest and understand the data at an entity level (down to the “who, what, and wheres” that are mentioned inside of documents), make it searchable, linkable, and provide back key statistics (BI type functionality). It can work in an online/real-time type fashion given its performance capabilities. Synthesys is unique because it does a really good job at entity resolution directly from unstructured data. Having the name “Umar Farouk Abdul Mutallab” misspelled somewhere in the data is not a big deal for us – because we create concepts based on the patterns of usage in the data and that’s pretty hard to hide. It is necessarily true that a word grounds its meaning to the things in the data that are of the same pattern of usage. If it wasn’t the case no receiving agent could understand it. We’ve figured out how to reverse engineer that mental process of “grounding” a word. So you can have Abdulmutallab ten different ways and it doesn’t matter. If the evidence links in any statistically significant way – we pull it together.
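Mr. Estes does not reveal how Synthesys does this under the hood, but the flavor of spelling-tolerant entity matching is easy to sketch. The Python fragment below is a minimal illustration of the general idea, not Digital Reasoning’s algorithm: it scores name variants by character trigram overlap, so “Abdulmutallab” rendered ten different ways still clusters into one candidate entity.

```python
# A minimal sketch of spelling-tolerant entity matching -- an
# illustration of the general idea, NOT Digital Reasoning's algorithm.
from itertools import combinations

def trigrams(name: str) -> set[str]:
    """Normalize a name and break it into character trigrams."""
    s = "".join(ch for ch in name.lower() if ch.isalpha())
    return {s[i:i + 3] for i in range(len(s) - 2)}

def similarity(a: str, b: str) -> float:
    """Jaccard overlap of trigram sets: 1.0 = identical, 0.0 = disjoint."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

variants = [
    "Umar Farouk Abdul Mutallab",
    "Umar Farouk Abdulmutallab",
    "Omar Farook Abdul Mutallib",
    "John Smith",
]

# Pairs above a similarity threshold become candidates for one entity.
for a, b in combinations(variants, 2):
    score = similarity(a, b)
    if score > 0.5:
        print(f"{a!r} ~ {b!r}: {score:.2f}")
```

Real systems add context, the “patterns of usage” Mr. Estes describes, rather than relying on spelling alone; the sketch shows only the string side of the problem.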
You can read the full text of this exclusive interview with Tim Estes on the ArnoldIT.com site in the Search Wizards Speak series. You can get more information about Digital Reasoning from the company’s Web site.
The Search Wizards Speak series provides the largest collection of free, detailed information about major enterprise search systems. Why pay the azure-chip consultants for sponsored listings, write ups prepared by consultants with little or no hands-on experience, and services that “sell” advertorials? You hear in the developers’, founders’, and CEOs’ own words what a system does and how it solves content-related problems.
Stephen E Arnold, February 2, 2010
No one paid me to write about my own Web site. I will report this charitable act to the head of the Red Cross.
Thomson Reuters Redefines Real Time
January 29, 2010
“Real time” is one of those phrases that is so easy to say but so, so difficult to deliver. Exalead has demonstrated to me a latency of 12 to 15 minutes. This means that when a change is made to the location of a package, that datum becomes available to a user of the client’s search enabled application within 12 to 15 minutes. In my experience, that’s fast. The old Excite.com (Architext) indexing system would grind for hours to update Adverworld pages. A mainstream search system labored for hours to update several million Web pages. But real time means no latency. Zero. Zip. Nada.
Thomson Reuters’ approach is explained in “Thomson Reuters Delivers Microsecond Access To News In London And Chicago.” Real time means that in London and Chicago, certain content is available in microseconds. The write up said:
Rich Brown, Global Business Manager, Machine Readable News, Thomson Reuters, said: “Being first to act on this information can dramatically affect a firm’s profit and loss. The launch of NewsScope Direct, the market’s fastest machine readable news service, into London and Chicago reflects our commitment to delivering the market moving information our clients need at the speed required by their high performance trading strategies.”
Some questions:
- Are the data numeric or text?
- What is the latency for the information prior to its being received at a Thomson Reuters’ data center?
- What does “microsecond” mean? (A short timing sketch follows this list.)
- What part of the system delivers “microsecond” access?
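On the “microsecond” question, a sense of scale helps. One microsecond is a millionth of a second; an in-memory lookup costs a small fraction of one, while a trip through a typical kernel network stack can eat tens of them. This back-of-the-envelope Python sketch (figures vary by machine; it is a sanity check, not a benchmark) shows how small the unit is:

```python
# Sanity check on the scale of a microsecond: time an in-memory
# dictionary lookup with Python's nanosecond clock. Results vary
# by machine -- this is a sketch, not a benchmark.
import time

table = {f"SYM{i}": i for i in range(100_000)}  # toy symbol table

start = time.perf_counter_ns()
for _ in range(1_000_000):
    _ = table["SYM42"]
elapsed_us = (time.perf_counter_ns() - start) / 1_000  # ns -> microseconds
print(f"~{elapsed_us / 1_000_000:.3f} microseconds per lookup")
```

If the quoted “microseconds” are measured inside the data center, the number says little about what a trading client in London actually sees.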
Until I know more, I think this is a marketing and PR play to differentiate Thomson Reuters from other financial trading data vendors. I wonder whether Thomson Reuters can beat the pants off Exegy, another outfit with speedy systems for the financial services industry.
Stephen E Arnold, January 29, 2010
A post I wrote whilst watching Tyson shiver in front of the fire. I will report his chill and my lack of compensation to the sharp eyed folks at the SEC.
Financeoid Pushes Business News Aggregation Forward
January 27, 2010
Business information is important but difficult to index. On the surface, business information appears to be a less-than-demanding type of content. Some publishers embed ticker symbols. Others stuff in metatags. The problem, however, boils down to language. Marketing mavens are quick to invent new words, spell names in a weird way, and cook up bizarre coinages to create consulting buzz. (A good example of this appeared in the Wall Street Journal on January 25, 2010, page B7 in the article “Strategic Plans Lose Favor.” That was a buzzword fest in my opinion.)
Now there is Financeoid.com. I admit that I am not crazy about the name, but I can see that the service makes certain business information easily accessible. I have already dropped the hard copy subscription to the Financial Times, and I think my local newspaper subscription is next. If Financeoid shows some muscle, maybe I will drop the Wall Street Journal hard copy subscription. I find its information increasingly stale and feature oriented. Not what I want with my McVitie’s biscuit in the morning.
I suggest you take a look at a financial news aggregation service that pulls “financial news, tips, and advises [sic] from 15,000 financial/ business blogs.” Although still in shakedown cruise mode, the service makes pretty clear that the traditional financial media may have to shift their hate gaze from Google to other online innovators. The service Financeoid.com is at http://www.financeoid.com.
The site calculates “karma” via a proprietary algorithm. The goslings and I think this is a quite interesting aggregation service. The company promises that it will offer additional aggregations in the future. Worth a look.
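Financeoid does not publish the formula, so any reconstruction is pure guesswork. A hypothetical karma scorer might blend engagement with freshness along these lines (every field name and weight below is invented for illustration):

```python
# A purely hypothetical "karma" scorer. Financeoid's algorithm is
# proprietary; the signals and weights here are invented.
import math
from datetime import datetime, timezone

def karma(links_in: int, comments: int, published: datetime,
          now: datetime) -> float:
    """Blend engagement signals with an exponential freshness decay."""
    age_hours = (now - published).total_seconds() / 3600
    freshness = math.exp(-age_hours / 48)  # decays over a couple of days
    return (2.0 * links_in + 1.0 * comments) * freshness

post_time = datetime(2010, 1, 26, 9, 0, tzinfo=timezone.utc)
check_time = datetime(2010, 1, 27, 9, 0, tzinfo=timezone.utc)
print(round(karma(links_in=12, comments=30, published=post_time,
                  now=check_time), 2))  # ~32.75
```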
Stephen E Arnold, January 27, 2010
A free write up. I will report this to the Bureau of Labor. I am a slave to this blog. If I were younger, I could turn myself in for employee overwork.
Crime and Timelines
January 27, 2010
I had a conversation yesterday (January 25, 2010) with a colleague. The question was the ability of online systems to date and time stamp events. If you are familiar with the i2.co.uk technology, you know that one of its features is the ability to generate a timeline. Events can be plotted by hours, days, or other intervals. This type of information display is quite useful in certain types of research.
The i2 company is a specialized outfit, and it is not well known outside of a narrow market sector. The company, in my opinion, was a pioneer in the machine processing of data and generating useful time displays. But the usefulness of timelines remains an area of interest to specialists.
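To make the timeline idea concrete: the mechanical core of any such display is bucketing time-stamped events into intervals. Here is a bare-bones Python sketch (an illustration of the general technique, not i2’s implementation; the events are made up):

```python
# Bare-bones event timeline: bucket time-stamped events by hour and
# print a crude histogram. Illustrative only -- not i2's product.
from collections import Counter
from datetime import datetime

events = [
    ("wire transfer",   datetime(2010, 1, 25, 9, 15)),
    ("phone call",      datetime(2010, 1, 25, 9, 40)),
    ("border crossing", datetime(2010, 1, 25, 14, 5)),
]

# Truncate each timestamp to its hour to form the buckets.
buckets = Counter(ts.replace(minute=0, second=0, microsecond=0)
                  for _, ts in events)

for hour in sorted(buckets):
    print(f"{hour:%Y-%m-%d %H:00}  {'#' * buckets[hour]}")
```

Swap the hour truncation for days or weeks, and the same structure drives the coarser views.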
Google has done an excellent job with its Google News Timeline. If you have not explored that service, you can access it at http://newstimeline.googlelabs.com/. I don’t want to veer into a discussion of Google’s significant work in time and historical systems and methods.
Image source: http://www.fetch.com/products/footprint/fetchcheck/howitworks/
I do want to highlight a company I mentioned in my conversation referenced above: Fetch Technologies. In 2008, I saw a demonstration of the company’s FootPrint technology. You can get some basic information about the FootPrint system from the Fetch Technologies Web site. I thought the system was forward looking in 2008 and I have the same opinion today.
In a nutshell:
Fetch FootPrint’s DaaS gives background screening/pre employment professionals accurate real time criminal history data culled from hundreds of online sources without having to purchase dedicated software, pay upfront license agreements or make internal IT integration investments. “We leveraged our AI tools to develop our DaaS delivery model to eliminate the expense of court runners and the time consuming process of looking up online records manually.”
The system processes information from hundreds of online information sources. The focus is on real time updates, not content that may be weeks, months, or years old; for example, certain publicly accessible repositories of registration information that some information providers do not update.
You can get more information from the company about some of its approaches by contacting the company directly.
As I said to my colleague, this is worth a look. Fetch has a number of sophisticated real-time information services. Several of these push the envelope in real time information acquisition and analysis. Before embracing a “real time” solution from a content management vendor or content processing vendor jumping into real time because it is trendy, check out the Fetch approach.
Stephen E Arnold, January 27, 2010
A freebie. I don’t think I have ever been to El Segundo for fun or money. I will report this to the governor of California.
SharePoint and 100 Percent CPU Usage
December 29, 2009
We love SharePoint and are mesmerized by its search functionality. We spotted the tweet about 100 percent CPU use during indexing. After some clicking, we located “Search and CPU Usage” on MSGroups. The writer wanted to know how to troubleshoot 100 percent CPU usage. The answer delighted us. We find that fresh content and near real time indexing are getting more popular. But if your SharePoint system is under-resourced, the fix is easy. Index less frequently. Here’s what the plea for help elicited:
You need check indexing interval. WSS 3.0 default crawling interval is every 5 minute. So if you have many content and indexing can not complete in 5 minute. It’s possible 100% CPU continuously. Yu can change indexing interval on central admin page….from every 5 minutes to every day.
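The arithmetic behind that advice is worth spelling out: if a crawl runs longer than the interval that launches it, the indexer never goes idle. A toy model in Python (the eight-minute crawl time is an assumed number, not a measurement from any SharePoint farm):

```python
# Toy model: if a crawl takes longer than the interval that triggers
# it, indexing work never stops. The 8-minute crawl duration is an
# assumption for illustration, not a SharePoint measurement.
def busy_fraction(crawl_minutes: float, interval_minutes: float) -> float:
    """Fraction of wall-clock time the indexer spends crawling."""
    return min(1.0, crawl_minutes / interval_minutes)

for interval in (5, 30, 60, 1440):  # every 5 minutes ... once a day
    pct = busy_fraction(crawl_minutes=8, interval_minutes=interval) * 100
    print(f"interval {interval:>4} min -> indexer busy {pct:5.1f}% of the time")
```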
Yep, fresh results in near real time. Obvious solution. Timely information in an index is obviously irrelevant.
Stephen E. Arnold, December 29, 2009
A freebie. Think of this as a late Boxing Day present. I will report this to the IRS.
Ambient Makes a Reappearance
December 21, 2009
Every few moon cycles, search ideas morph and surface with fresh lipstick. There was nomadic search, and there was ambient information. Ambient is back, and the idea is to take the information from RSS, Twitter, geolocation, and updates to Web pages and snap the Lego blocks together. Great idea, and it is one that outfits like Bright Planet, Deep Web Technologies, Fetch Technologies, Jack Be, Kapow, and a number of intelligence service systems are doing. Sure, some of the outfits are farther along (intel services); others have nifty technology (Fetch Technologies); and others have updated established content acquisition systems (Bright Planet).
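Snapping those Lego blocks together is, at bottom, a merge of time-ordered streams. A minimal Python sketch of the plumbing (the feeds, timestamps, and items are invented for illustration):

```python
# Minimal "ambient stream": interleave several already-sorted feeds
# into one timeline. Feeds and items are invented for illustration.
import heapq

rss    = [(1, "rss: new blog post"), (4, "rss: page updated")]
tweets = [(2, "twitter: $GOOG chatter"), (3, "twitter: link shared")]
geo    = [(5, "geo: check-in near the NYSE")]

# heapq.merge lazily interleaves sorted iterables, comparing on the
# leading timestamp of each tuple.
for ts, item in heapq.merge(rss, tweets, geo):
    print(ts, item)
```

The hard part is not the merge; it is getting access to all the streams, which is exactly the point the write up cited below makes.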
If you want to read more about ambient search, navigate to “Beyond Real-time Search: The Dawning Of Ambient Streams” by Edo Segal. The write up is interesting and contains a Venn diagram to pinpoint the opportunity for those who want to know about one of the “next big things”.
The Google is slogging away in this space, and it has been since it bought a Seattle start up and hired Dr. Ramanathan Guha. Other Google wizards are coding their souls’ insights to make information more timely, smart, useful, and “federated”.
For me, the most interesting comment in the write up was:
The challenges we face in terms of making real progress stems from the fact that the overarching goal is one that requires a multi-disciplinary approach across a myriad of data sets. While there are many companies executing in each of the quadrants few are in a position to access the full scope of data and therefore the ability to create the Holy Grail of filters is limited.
That’s an area of interest for the Google. I don’t think it will carry humans beyond search. Moving beyond search requires a leapfrog play and a willingness to leave behind the popular notion of intentional information retrieval.
Stephen E. Arnold, December 22, 2009
I disclose this is a freebie, designed to provide some context for what appears to be a breakthrough insight. For reporting what seems to be new but is actually not new, I am monitored by the Bureau of Industry and Security. Keeping industry safe from learning about the past is my motto.
Google Search Appliance Speaks Tweet
December 11, 2009
In one of those odd Google technical lurches, the Google Search Appliance now speaks tweet. For the unknowing, a tweet is a Twitter message. If I think real hard, I understand that social communication is the “new” thing. It follows that the Google Search Appliance should index tweets. I think it would be nice to invest a bit of time in security, connectors, and access to structured data. Google’s wizards obviously don’t agree, finding tweet content more important. I think organizations have some pretty useful structured data, but I assume that the post 1994 crowd finds that type of corporate problem trivial, irrelevant, or (most damning) “not interesting.” If you want to know more about the speak tweet movement, you will want to read “Google Search Appliance Goes Tweet Crazy” in the Washington Post’s version of the original TechCrunch article.
Stephen Arnold, December 11, 2009
I feel compelled to alert the MARC train schedulers that I was not paid to report on this timely enhancement of the Google Search Appliance. All aboard for real time search in the enterprise. No database access allowed.
Googzilla Switches Its Tail and Imperils Some Real Time Search Services
December 8, 2009
One thing about scale is that it is often big. Big in data is good. The addled goose will leave it to you to read Google’s “Relevance Meets the Real-Time Web”. You can wade through the punditry. For this goose, here is the key paragraph in today’s (December 7, 2009) announcement:
Our real-time search features are based on more than a dozen new search technologies that enable us to monitor more than a billion documents and process hundreds of millions of real-time changes each day. Of course, none of this would be possible without the support of our new partners that we’re announcing today: Facebook, MySpace, FriendFeed, Jaiku and Identi.ca — along with Twitter, which we announced a few weeks ago.
Several comments:
- Folks pumping dough into the many real time indexing operations are likely to have second thoughts and will be asked some tough questions like “So, how is your service better than Google’s?”
- Users will get convenience. This is like the 7-11 approach to shopping. The “cost” is irrelevant. The service makes the experience valuable.
- Forget the UX or user experience. The Google delivers data. Microsoft will have to come up with a hat and then find a rabbit in it.
Now what does this mean for “real” news? Interesting question. I will wait until a pundit, satrap, wizard, or azure chip consultant elucidates. I am a reactive goose.
Stephen Arnold, December 8, 2009
Oyez, oyez, I want to disclose to the Greenwich Observatory that I was not paid to make this comment. Lots of tourists in Greenwich.
Will Google Ring Its Digital Cash Register for Thee and TV
December 3, 2009
One of the media mavens wrote “Is YouTube Ready for Primetime? Google Wants to Stream TV, for a Fee” and illustrated the write up with pictures of my relatives from Harrod’s Creek cooking over a wood fire. The maven also chose a picture that showed my relatives watching the flames burning instead of sitting in the wooden hovel telling stories about the deer who got away.
The point of this write up, for me, is summed up in this passage:
YouTube already lets users watch a smattering of TV shows for free, with advertising. Now it envisions something similar to what Apple and Amazon already offer: First-run shows, without commercials, for $1.99 an episode, available the day after they air on broadcast or cable. Sources say the site’s negotiations with the networks and studios that own the shows are preliminary. But both sides seem optimistic, since models for such deals already exist. No comment from YouTube.
I thought that YouTube.com was 99 percent crap and one percent good content. I thought Google lacked the business acumen to do “something” with YouTube.com. I thought YouTube.com was a giant mistake that would function like a giant, digital albatross and make Google into a current version of the ancient mariner. Guess I was misunderstanding previous punditry?
I don’t have too much to add to the maven’s analysis. I won’t even question this statement:
But while Web users have an insatiable appetite for video, they’ve yet [to] demonstrate much interest in paying for it. If any of this is going to work, that will have to change.
I think I will raise some questions that might be considered:
- What are the core technologies on which this alleged new initiative of Google rests? When were they developed? What other functions do these have, assuming the technologies do indeed exist?
- What are the costs of the alleged new service? How will Google’s present business model benefit from such a push if such a push takes place?
- What other rich media could be affected by such a push assuming the push is extended?
- Why haven’t other companies built a sufficiently significant competitive barrier to entry? What are the weaknesses of the existing services? Is Google’s service competitively flawed? If so, in what way? Is it superior? In what way?
I can generate some other questions that pundits, mavens, azure chip consultants, Google watchers, and other experts should be considering. I wonder if these wizards know the Pope has a TV channel on Google? Digital Gutenberg, anyone?
Stephen Arnold, December 3, 2009
I wish to disclose to American Society of Composers, Authors and Publishers that I was not paid to point out that questions about Google are yet to be answered. Maybe the truth is within one of the many Sergey-and-Larry-eat-pizza books?