<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Clustify: Identifying Similar Documents</title>
	<atom:link href="http://arnoldit.com/wordpress/2008/03/09/clustify-identifying-similar-documents/feed/" rel="self" type="application/rss+xml" />
	<link>http://arnoldit.com/wordpress/2008/03/09/clustify-identifying-similar-documents/</link>
	<description>by Stephen E. Arnold</description>
	<lastBuildDate>Sun, 21 Mar 2010 14:15:46 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Steve</title>
		<link>http://arnoldit.com/wordpress/2008/03/09/clustify-identifying-similar-documents/comment-page-1/#comment-56571</link>
		<dc:creator>Steve</dc:creator>
		<pubDate>Fri, 19 Jun 2009 13:21:48 +0000</pubDate>
		<guid isPermaLink="false">http://arnoldit.com/wordpress/2008/03/09/clustify-identifying-similar-documents/#comment-56571</guid>
		<description>Interesting article.  Came across it on a Google search for clustering.  Have you done other research into this area with other tools?  There are several tools in the marketplace today, but I would be interested in how Clustify worked against the other tools.</description>
		<content:encoded><![CDATA[<p>Interesting article.  Came across it on a Google search for clustering.  Have you done other research into this area with other tools?  There are several tools in the marketplace today, but I would be interested in how Clustify worked against the other tools.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stephen E. Arnold</title>
		<link>http://arnoldit.com/wordpress/2008/03/09/clustify-identifying-similar-documents/comment-page-1/#comment-3112</link>
		<dc:creator>Stephen E. Arnold</dc:creator>
		<pubDate>Mon, 10 Mar 2008 17:35:36 +0000</pubDate>
		<guid isPermaLink="false">http://arnoldit.com/wordpress/2008/03/09/clustify-identifying-similar-documents/#comment-3112</guid>
		<description>Dmitri, you can see some of the features at www.magportal.com.

Stephen Arnold, March 10, 2008, 1:35 pm Eastern</description>
		<content:encoded><![CDATA[<p>Dmitri, you can see some of the features at <a href="http://www.magportal.com" rel="nofollow">http://www.magportal.com</a>.</p>
<p>Stephen Arnold, March 10, 2008, 1:35 pm Eastern</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dmitri</title>
		<link>http://arnoldit.com/wordpress/2008/03/09/clustify-identifying-similar-documents/comment-page-1/#comment-3106</link>
		<dc:creator>Dmitri</dc:creator>
		<pubDate>Mon, 10 Mar 2008 16:31:35 +0000</pubDate>
		<guid isPermaLink="false">http://arnoldit.com/wordpress/2008/03/09/clustify-identifying-similar-documents/#comment-3106</guid>
		<description>Very interesting; too bad there is no online demo for this. Mining the concepts from text and tagging documents with them sounds like a very reasonable approach, as it gives the end user a quick grasp of a large body of documents (we are doing a similar thing). Regarding their legal implementation, I wonder if it is all based on derived tags, though; I think that a system like this also needs to have a dictionary with specific terms and semantic constructs pertinent to the legal field. Glad I came across your blog, Stephen; will keep following it.</description>
		<content:encoded><![CDATA[<p>Very interesting; too bad there is no online demo for this. Mining the concepts from text and tagging documents with them sounds like a very reasonable approach, as it gives the end user a quick grasp of a large body of documents (we are doing a similar thing). Regarding their legal implementation, I wonder if it is all based on derived tags, though; I think that a system like this also needs to have a dictionary with specific terms and semantic constructs pertinent to the legal field. Glad I came across your blog, Stephen; will keep following it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stephen E. Arnold</title>
		<link>http://arnoldit.com/wordpress/2008/03/09/clustify-identifying-similar-documents/comment-page-1/#comment-3105</link>
		<dc:creator>Stephen E. Arnold</dc:creator>
		<pubDate>Mon, 10 Mar 2008 15:57:26 +0000</pubDate>
		<guid isPermaLink="false">http://arnoldit.com/wordpress/2008/03/09/clustify-identifying-similar-documents/#comment-3105</guid>
		<description>Thanks for posting, Rob. I have looked at Carrot2, and it looks to me as though Carrot2 can handle this job. You would have to write the middleware and hook into Carrot2. The stream function caught my eye with Clustify. Also, Vivisimo has a good clustering engine, and it works on the fly. I don&#039;t know, however, if Vivisimo makes that available as a standalone product.

Stephen Arnold, March 10, 2008, noon Eastern</description>
		<content:encoded><![CDATA[<p>Thanks for posting, Rob. I have looked at Carrot2, and it looks to me as though Carrot2 can handle this job. You would have to write the middleware and hook into Carrot2. The stream function caught my eye with Clustify. Also, Vivisimo has a good clustering engine, and it works on the fly. I don&#8217;t know, however, if Vivisimo makes that available as a standalone product.</p>
<p>Stephen Arnold, March 10, 2008, noon Eastern</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rob Young</title>
		<link>http://arnoldit.com/wordpress/2008/03/09/clustify-identifying-similar-documents/comment-page-1/#comment-3059</link>
		<dc:creator>Rob Young</dc:creator>
		<pubDate>Sun, 09 Mar 2008 22:14:20 +0000</pubDate>
		<guid isPermaLink="false">http://arnoldit.com/wordpress/2008/03/09/clustify-identifying-similar-documents/#comment-3059</guid>
		<description>I wonder how easy it would be to write a contender for this using the &lt;a href=&quot;http://project.carrot2.org/&quot; rel=&quot;nofollow&quot;&gt;Carrot2&lt;/a&gt; clustering engine. The search could be managed identically with Lucene. So far I have only heard of Carrot2 being used for clustering result sets (i.e., comparatively tiny sets), but it does accept a stream as input, so it may be possible to shape up a good contender.
Thoughts?</description>
		<content:encoded><![CDATA[<p>I wonder how easy it would be to write a contender for this using the <a href="http://project.carrot2.org/" rel="nofollow">Carrot2</a> clustering engine. The search could be managed identically with Lucene. So far I have only heard of Carrot2 being used for clustering result sets (i.e., comparatively tiny sets), but it does accept a stream as input, so it may be possible to shape up a good contender.<br />
Thoughts?</p>
]]></content:encoded>
	</item>
</channel>
</rss>
