<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Semantics, Search and Big Honking Databases</title>
	<atom:link href="http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/feed/" rel="self" type="application/rss+xml" />
	<link>http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/</link>
	<description>Gerry Campbell's View on the emergence of Technology and Business</description>
	<lastBuildDate>Fri, 01 Jan 2010 07:29:01 -0700</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: FluidDB: The next web paradigm? &#124; Provoking: The blog of Filip Dousek</title>
		<link>http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/comment-page-1/#comment-352</link>
		<dc:creator>FluidDB: The next web paradigm? &#124; Provoking: The blog of Filip Dousek</dc:creator>
		<pubDate>Mon, 13 Apr 2009 16:18:54 +0000</pubDate>
		<guid isPermaLink="false">http://luckyrobot.com/?p=155#comment-352</guid>
		<description>[...] a weekend reading everything I could find on the net (@terrycojones, his blog, notable articles by Gerry Campbell, Paul Erb and the original firestarter) and exchange a few long emails with Terry. Interest [...]</description>
		<content:encoded><![CDATA[<p>[...] a weekend reading everything I could find on the net (@terrycojones, his blog, notable articles by Gerry Campbell, Paul Erb and the original firestarter) and exchange a few long emails with Terry. Interest [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lucky Robot - Semantics, Search &#38; Big Honking Databases &#171; Collecta.com Blog</title>
		<link>http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/comment-page-1/#comment-328</link>
		<dc:creator>Lucky Robot - Semantics, Search &#38; Big Honking Databases &#171; Collecta.com Blog</dc:creator>
		<pubDate>Fri, 06 Mar 2009 01:30:43 +0000</pubDate>
		<guid isPermaLink="false">http://luckyrobot.com/?p=155#comment-328</guid>
		<description>[...] Lucky Robot - Semantics, Search &amp; Big Honking&#160;Databases December 5, 2008   http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/ [...]</description>
		<content:encoded><![CDATA[<p>[...] Lucky Robot &#8211; Semantics, Search &amp; Big Honking&nbsp;Databases December 5, 2008   <a href="http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/" rel="nofollow">http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/</a> [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Maxim</title>
		<link>http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/comment-page-1/#comment-318</link>
		<dc:creator>Maxim</dc:creator>
		<pubDate>Mon, 23 Feb 2009 15:26:14 +0000</pubDate>
		<guid isPermaLink="false">http://luckyrobot.com/?p=155#comment-318</guid>
		<description>I read your post and was amazed at how close our work is to what you describe here as semantic technology. We have developed a technology for semantic search and text analysis that leverages Wikipedia knowledge to derive concept meanings and relationships. By now, Wikipedia has grown into a massive, up-to-date database of such relationships. We would like to show you our technology, as it implements nearly everything that you described in your post: disambiguation, semantic tagging, semantic similarity to find related content and concepts, and more. Could you please email me at &lt;a href=&quot;mailto:maxim@grinev.net&quot; rel=&quot;nofollow&quot;&gt;maxim@grinev.net&lt;/a&gt; and I will reply with more details. Thank you.</description>
		<content:encoded><![CDATA[<p>I read your post and was amazed at how close our work is to what you describe here as semantic technology. We have developed a technology for semantic search and text analysis that leverages Wikipedia knowledge to derive concept meanings and relationships. By now, Wikipedia has grown into a massive, up-to-date database of such relationships. We would like to show you our technology, as it implements nearly everything that you described in your post: disambiguation, semantic tagging, semantic similarity to find related content and concepts, and more. Could you please email me at <a href="mailto:maxim@grinev.net" rel="nofollow">maxim@grinev.net</a> and I will reply with more details. Thank you.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kingsley Idehen</title>
		<link>http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/comment-page-1/#comment-268</link>
		<dc:creator>Kingsley Idehen</dc:creator>
		<pubDate>Fri, 19 Dec 2008 20:37:11 +0000</pubDate>
		<guid isPermaLink="false">http://luckyrobot.com/?p=155#comment-268</guid>
		<description>How about the burgeoning cloud of RDF based Linked Data?&lt;br&gt;&lt;br&gt;Links:&lt;br&gt;1. &lt;a href=&quot;http://virtuoso.openlinksw.com/images/dbpedia-lod-cloud.html&quot; rel=&quot;nofollow&quot;&gt;http://virtuoso.openlinksw.com/images/dbpedia-l...&lt;/a&gt;&lt;br&gt;2. &lt;a href=&quot;http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData&quot; rel=&quot;nofollow&quot;&gt;http://esw.w3.org/topic/SweoIG/TaskForces/Commu...&lt;/a&gt;&lt;br&gt;3. &lt;a href=&quot;http://dbpedia.org/resource/Linked_Data&quot; rel=&quot;nofollow&quot;&gt;http://dbpedia.org/resource/Linked_Data&lt;/a&gt; - cross linked with Freebase and many other structured data spaces</description>
		<content:encoded><![CDATA[<p>How about the burgeoning cloud of RDF based Linked Data?</p>
<p>Links:<br />1. <a href="http://virtuoso.openlinksw.com/images/dbpedia-lod-cloud.html" rel="nofollow">http://virtuoso.openlinksw.com/images/dbpedia-l...</a><br />2. <a href="http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData" rel="nofollow">http://esw.w3.org/topic/SweoIG/TaskForces/Commu...</a><br />3. <a href="http://dbpedia.org/resource/Linked_Data" rel="nofollow">http://dbpedia.org/resource/Linked_Data</a> &#8211; cross linked with Freebase and many other structured data spaces</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rob Mapstead</title>
		<link>http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/comment-page-1/#comment-260</link>
		<dc:creator>Rob Mapstead</dc:creator>
		<pubDate>Wed, 10 Dec 2008 00:01:54 +0000</pubDate>
		<guid isPermaLink="false">http://luckyrobot.com/?p=155#comment-260</guid>
		<description>With the 2010 Census on the horizon, I&#039;m thinking about applying for a job with the Census just to see if I can help make sense of it all.  Wouldn&#039;t it be great if the Census data actually provided us with data that all Americans could actually benefit from?  Your discussion of tagging data is extremely important in this regard.&lt;br&gt;&lt;br&gt;As it relates to tagging words, isn&#039;t this just XML?  And don&#039;t we also need to tag whole phrases and not just words?</description>
		<content:encoded><![CDATA[<p>With the 2010 Census on the horizon, I&#39;m thinking about applying for a job with the Census just to see if I can help make sense of it all.  Wouldn&#39;t it be great if the Census data actually provided us with data that all Americans could actually benefit from?  Your discussion of tagging data is extremely important in this regard.</p>
<p>As it relates to tagging words, isn&#39;t this just XML?  And don&#39;t we also need to tag whole phrases and not just words?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: hymanroth</title>
		<link>http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/comment-page-1/#comment-259</link>
		<dc:creator>hymanroth</dc:creator>
		<pubDate>Mon, 08 Dec 2008 15:24:01 +0000</pubDate>
		<guid isPermaLink="false">http://luckyrobot.com/?p=155#comment-259</guid>
		<description>Gerry, I would say the goal is to *infer* context rather than codify it.&lt;br&gt;&lt;br&gt;For example, in a document that is tagged as about MSFT, references to Gates are statistically more likely to refer to the person than to the object. So, instead of tagging (codifying) each individual reference to Gates in the document, context can be inferred from one single tag, and hence the ambiguity resolved.</description>
		<content:encoded><![CDATA[<p>Gerry, I would say the goal is to *infer* context rather than codify it.</p>
<p>For example, in a document that is tagged as about MSFT, references to Gates are statistically more likely to refer to the person than to the object. So, instead of tagging (codifying) each individual reference to Gates in the document, context can be inferred from one single tag, and hence the ambiguity resolved.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: gerry campbell</title>
		<link>http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/comment-page-1/#comment-258</link>
		<dc:creator>gerry campbell</dc:creator>
		<pubDate>Mon, 08 Dec 2008 15:08:55 +0000</pubDate>
		<guid isPermaLink="false">http://luckyrobot.com/?p=155#comment-258</guid>
		<description>and can&#039;t we use co-occurrence, etc to establish that relatedness...</description>
		<content:encoded><![CDATA[<p>and can&#39;t we use co-occurrence, etc to establish that relatedness&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: gerry campbell</title>
		<link>http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/comment-page-1/#comment-257</link>
		<dc:creator>gerry campbell</dc:creator>
		<pubDate>Mon, 08 Dec 2008 15:07:49 +0000</pubDate>
		<guid isPermaLink="false">http://luckyrobot.com/?p=155#comment-257</guid>
		<description>Does the nature of the task (and this discussion) change if we talk about it as codifying *relationships*? That&#039;s really where I am going. &lt;br&gt;&lt;br&gt;I am not sure it makes any difference at all WHAT the thing is, it&#039;s more about the interrelatedness of one word to other words. In that case, the ambiguity is represented in a set of linkages that are more or less exclusive. &lt;br&gt;&lt;br&gt;For example - the linkages to gates the thing vs gates the person would be different. Even in the case of that double entendre, the two sets could be statistically separable.</description>
		<content:encoded><![CDATA[<p>Does the nature of the task (and this discussion) change if we talk about it as codifying *relationships*? That&#39;s really where I am going. </p>
<p>I am not sure it makes any difference at all WHAT the thing is, it&#39;s more about the interrelatedness of one word to other words. In that case, the ambiguity is represented in a set of linkages that are more or less exclusive. </p>
<p>For example &#8211; the linkages to gates the thing vs gates the person would be different. Even in the case of that double entendre, the two sets could be statistically separable.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: hymanroth</title>
		<link>http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/comment-page-1/#comment-256</link>
		<dc:creator>hymanroth</dc:creator>
		<pubDate>Mon, 08 Dec 2008 07:53:02 +0000</pubDate>
		<guid isPermaLink="false">http://luckyrobot.com/?p=155#comment-256</guid>
		<description>Gerry&#039;s right about &#039;semantics&#039; being an overly used expression.&lt;br&gt;&lt;br&gt;Terry&#039;s argument regarding the objective definition of meaning refers to the term&#039;s traditional philosophical usage, whereas Gerry and direwolf are talking about contextual ambiguity.&lt;br&gt;&lt;br&gt;I believe pursuing semantics (philosophical) in computing is a futile endeavor until machines are able to feel the wind on their faces.&lt;br&gt;&lt;br&gt;Resolving contextual ambiguity, however, is a much more attainable and in many ways more useful goal. How many times have you Googled something only to get back hundreds of pages with the &#039;other&#039; use of your key word?&lt;br&gt;&lt;br&gt;Whether progress is made via changes in representation, better algorithms or even some sort of stochastic analysis is largely irrelevant (to me). &lt;br&gt;&lt;br&gt;The key point is that whoever makes progress in this space will, as the VCs like to say, take away a lot of pain.</description>
		<content:encoded><![CDATA[<p>Gerry&#39;s right about &#39;semantics&#39; being an overly used expression.</p>
<p>Terry&#39;s argument regarding the objective definition of meaning refers to the term&#39;s traditional philosophical usage, whereas Gerry and direwolf are talking about contextual ambiguity.</p>
<p>I believe pursuing semantics (philosophical) in computing is a futile endeavor until machines are able to feel the wind on their faces.</p>
<p>Resolving contextual ambiguity, however, is a much more attainable and in many ways more useful goal. How many times have you Googled something only to get back hundreds of pages with the &#39;other&#39; use of your key word?</p>
<p>Whether progress is made via changes in representation, better algorithms or even some sort of stochastic analysis is largely irrelevant (to me). </p>
<p>The key point is that whoever makes progress in this space will, as the VCs like to say, take away a lot of pain.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: terrycojones</title>
		<link>http://luckyrobot.com/2008/12/05/semantics-search-and-big-honking-databases/comment-page-1/#comment-255</link>
		<dc:creator>terrycojones</dc:creator>
		<pubDate>Mon, 08 Dec 2008 05:03:17 +0000</pubDate>
		<guid isPermaLink="false">http://luckyrobot.com/?p=155#comment-255</guid>
		<description>Hi again Gerry&lt;br&gt;&lt;br&gt;I wasn&#039;t being very nuanced in my original comments. That&#039;s partly due to lack of time, partly due to liking a more colorful debate. So here are a few more thoughts, and some pointers.&lt;br&gt;&lt;br&gt;Consider Artificial Intelligence and its pursuit of intelligence. We once thought it took real intelligence to play chess (for example). But as we got better and better at engineering, and we thought up smarter (but completely mechanical and non-mysterious) algorithms, we moved the goalposts. I.e., we decided that actually you didn&#039;t need to be &quot;intelligent&quot; to play chess after all.&lt;br&gt;&lt;br&gt;I don&#039;t believe that &quot;intelligence&quot; corresponds to any &quot;thing&quot; either, just like I think &quot;meaning&quot; and &quot;understanding&quot; are also just words. What I do believe however is in engineering and tool-building. We&#039;re primates, and primates are pretty good tool builders. So I often suggest to people that they spend less time (and investment monies :-)) on chasing abstract words and more time on building tools.&lt;br&gt;&lt;br&gt;The lesson of AI seems clear. If your tools are good enough, you can give the illusion of intelligence up to and beyond (i.e., beyond grandmaster) where it matters in any practical sense. The computer plays chess so well that you might as well say it&#039;s intelligent, or not - it just doesn&#039;t matter anymore.&lt;br&gt;&lt;br&gt;And I believe the same is true of semantics, and going after meaning and understanding. Those things can perfectly well not really exist while at the same time we can practically achieve them (i.e., the convenient and practical illusion, as with intelligence for the purposes of chess playing) by just focusing on engineering and tools.&lt;br&gt;&lt;br&gt;Make sense?&lt;br&gt;&lt;br&gt;From that POV, I argue that huge strides can be made by improving representation. 
If you get representation right, things that look like problems can simply go away. If you get the representation right, you may not even need a clever algorithm. Can you do an end-run around Google&#039;s armies of PhDs by changing representation? I.e., don&#039;t challenge them on the algorithm front, where you&#039;re bound to lose, but change the ground under them. You won&#039;t be surprised to hear that I think the answer is yes. I&#039;m not talking about &quot;beating&quot; Google as a company, but about taking search - and how we work with information in general - to a new level.&lt;br&gt;&lt;br&gt;I wrote about this at some length, back before it was so fashionable to be me :-)&lt;br&gt;&lt;br&gt;The main posting is &lt;a href=&quot;http://www.fluidinfo.com/terry/2007/03/19/why-data-information-representation-is-the-key-to-the-coming-semantic-web/&quot; rel=&quot;nofollow&quot;&gt;http://www.fluidinfo.com/terry/2007/03/19/why-d...&lt;/a&gt;&lt;br&gt;&lt;br&gt;And there are several others, including some that give very simple examples of why representation is so important, at &lt;a href=&quot;http://www.fluidinfo.com/terry/category/representation/&quot; rel=&quot;nofollow&quot;&gt;http://www.fluidinfo.com/terry/category/represe...&lt;/a&gt;&lt;br&gt;&lt;br&gt;In summary, I don&#039;t think the words matter much. I think we can achieve amazing results (things that look like real intelligence, real understanding, that somehow capture meaning, etc) simply by focusing on engineering. My best bet about where to focus is on representation. What are the implications of the various new ways of representing information that we&#039;re exploring? I&#039;ve been pondering that for over a decade! :-) My own bet, via Fluidinfo, definitely has some strong advantages and some strong weaknesses. It&#039;s a tradeoff, like so many things in computer science. Other approaches represent different tradeoffs. It&#039;s far from clear what will &quot;win&quot;. 
But as I said in my earlier comment, it&#039;s a vast space we&#039;re starting to explore, and, I like to imagine, there&#039;s plenty of value to go around.&lt;br&gt;&lt;br&gt;I hope that&#039;s a clearer and a more useful answer.</description>
		<content:encoded><![CDATA[<p>Hi again Gerry</p>
<p>I wasn&#39;t being very nuanced in my original comments. That&#39;s partly due to lack of time, partly due to liking a more colorful debate. So here are a few more thoughts, and some pointers.</p>
<p>Consider Artificial Intelligence and its pursuit of intelligence. We once thought it took real intelligence to play chess (for example). But as we got better and better at engineering, and we thought up smarter (but completely mechanical and non-mysterious) algorithms, we moved the goalposts. I.e., we decided that actually you didn&#39;t need to be &#8220;intelligent&#8221; to play chess after all.</p>
<p>I don&#39;t believe that &#8220;intelligence&#8221; corresponds to any &#8220;thing&#8221; either, just like I think &#8220;meaning&#8221; and &#8220;understanding&#8221; are also just words. What I do believe however is in engineering and tool-building. We&#39;re primates, and primates are pretty good tool builders. So I often suggest to people that they spend less time (and investment monies <img src='http://luckyrobot.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> ) on chasing abstract words and more time on building tools.</p>
<p>The lesson of AI seems clear. If your tools are good enough, you can give the illusion of intelligence up to and beyond (i.e., beyond grandmaster) where it matters in any practical sense. The computer plays chess so well that you might as well say it&#39;s intelligent, or not &#8211; it just doesn&#39;t matter anymore.</p>
<p>And I believe the same is true of semantics, and going after meaning and understanding. Those things can perfectly well not really exist while at the same time we can practically achieve them (i.e., the convenient and practical illusion, as with intelligence for the purposes of chess playing) by just focusing on engineering and tools.</p>
<p>Make sense?</p>
<p>From that POV, I argue that huge strides can be made by improving representation. If you get representation right, things that look like problems can simply go away. If you get the representation right, you may not even need a clever algorithm. Can you do an end-run around Google&#39;s armies of PhDs by changing representation? I.e., don&#39;t challenge them on the algorithm front, where you&#39;re bound to lose, but change the ground under them. You won&#39;t be surprised to hear that I think the answer is yes. I&#39;m not talking about &#8220;beating&#8221; Google as a company, but about taking search &#8211; and how we work with information in general &#8211; to a new level.</p>
<p>I wrote about this at some length, back before it was so fashionable to be me <img src='http://luckyrobot.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>The main posting is <a href="http://www.fluidinfo.com/terry/2007/03/19/why-data-information-representation-is-the-key-to-the-coming-semantic-web/" rel="nofollow">http://www.fluidinfo.com/terry/2007/03/19/why-d...</a></p>
<p>And there are several others, including some that give very simple examples of why representation is so important, at <a href="http://www.fluidinfo.com/terry/category/representation/" rel="nofollow">http://www.fluidinfo.com/terry/category/represe...</a></p>
<p>In summary, I don&#39;t think the words matter much. I think we can achieve amazing results (things that look like real intelligence, real understanding, that somehow capture meaning, etc) simply by focusing on engineering. My best bet about where to focus is on representation. What are the implications of the various new ways of representing information that we&#39;re exploring? I&#39;ve been pondering that for over a decade! <img src='http://luckyrobot.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  My own bet, via Fluidinfo, definitely has some strong advantages and some strong weaknesses. It&#39;s a tradeoff, like so many things in computer science. Other approaches represent different tradeoffs. It&#39;s far from clear what will &#8220;win&#8221;. But as I said in my earlier comment, it&#39;s a vast space we&#39;re starting to explore, and, I like to imagine, there&#39;s plenty of value to go around.</p>
<p>I hope that&#39;s a clearer and a more useful answer.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
