<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: CMIS &#8211; Is XPath Just A Bit Too Tricksy?</title>
	<atom:link href="http://jonontech.com/2009/04/09/cmis-is-xpath-just-a-bit-too-tricksy/feed/" rel="self" type="application/rss+xml" />
	<link>http://jonontech.com/2009/04/09/cmis-is-xpath-just-a-bit-too-tricksy/</link>
	<description>Just a nerd trying to save the publishing industry. Again.</description>
	<lastBuildDate>Fri, 11 May 2012 17:43:33 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
	<item>
		<title>By: anon_anon</title>
		<link>http://jonontech.com/2009/04/09/cmis-is-xpath-just-a-bit-too-tricksy/#comment-4562</link>
		<dc:creator>anon_anon</dc:creator>
		<pubDate>Wed, 25 Nov 2009 22:36:03 +0000</pubDate>
		<guid isPermaLink="false">http://jonontech.com/?p=445#comment-4562</guid>
		<description>You might want to look at vtd-xml for the best possible XPath query performance

&lt;a href=&quot;http://vtd-xml.sf.net&quot; rel=&quot;nofollow&quot;&gt;vtd-xml&lt;/a&gt;</description>
		<content:encoded><![CDATA[<p>You might want to look at vtd-xml for the best possible XPath query performance</p>
<p><a href="http://vtd-xml.sf.net" rel="nofollow">vtd-xml</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon Marks</title>
		<link>http://jonontech.com/2009/04/09/cmis-is-xpath-just-a-bit-too-tricksy/#comment-165</link>
		<dc:creator>Jon Marks</dc:creator>
		<pubDate>Thu, 09 Apr 2009 21:40:16 +0000</pubDate>
		<guid isPermaLink="false">http://jonontech.com/?p=445#comment-165</guid>
		<description>Al,

Thanks again for clarifying. And yes, I am thinking of the case where Color is a property of many types with no common type ancestor. I don&#039;t really like the idea of creating type inheritance trees for this. And even if I did, most of the CMS products I touch don&#039;t entertain the notion of subtypes (which I don&#039;t really care for) or aspects/mixins (which I would love).

Although it might sound strange to have many content types with common attributes, it seems to happen a lot. Sometimes, we might even have two or more content types with *identical* properties, for example &quot;News&quot; and &quot;Press Release&quot;. Now these should really be the same type, but for the quirks of a product that mean you need a different type to apply different workflows, security or other policies. But I digress.

I&#039;m looking forward to seeing 1.0. After receiving comments on this posting, I&#039;ve found so much more to read ... :-) I wish I was a vendor so I could try to implement it!

Thanks again,
Jon</description>
		<content:encoded><![CDATA[<p>Al,</p>
<p>Thanks again for clarifying. And yes, I am thinking of the case where Color is a property of many types with no common type ancestor. I don&#8217;t really like the idea of creating type inheritance trees for this. And even if I did, most of the CMS products I touch don&#8217;t entertain the notion of subtypes (which I don&#8217;t really care for) or aspects/mixins (which I would love).</p>
<p>Although it might sound strange to have many content types with common attributes, it seems to happen a lot. Sometimes, we might even have two or more content types with *identical* properties, for example &#8220;News&#8221; and &#8220;Press Release&#8221;. Now these should really be the same type, but for the quirks of a product that mean you need a different type to apply different workflows, security or other policies. But I digress.</p>
<p>I&#8217;m looking forward to seeing 1.0. After receiving comments on this posting, I&#8217;ve found so much more to read &#8230; <img src='http://jonontech.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  I wish I was a vendor so I could try to implement it!</p>
<p>Thanks again,<br />
Jon</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Al Brown</title>
		<link>http://jonontech.com/2009/04/09/cmis-is-xpath-just-a-bit-too-tricksy/#comment-163</link>
		<dc:creator>Al Brown</dc:creator>
		<pubDate>Thu, 09 Apr 2009 16:58:03 +0000</pubDate>
		<guid isPermaLink="false">http://jonontech.com/?p=445#comment-163</guid>
		<description>For CMIS, Color would have to be on Document.  If Color was on Invoice, then you could do “SELECT * FROM Invoice WHERE ‘green’ = ANY Color”.

If you have many types that have Color, but there is not a common type ancestor that has Color, then best practice would be to create such a type ancestor.

Sometimes adhering to that best practice is problematic, especially if a system desires multiple inheritance.  One of the discussions in CMIS is around mixins/aspects.  This is one of the more forward-looking proposals.  It would allow you to add an aspect, e.g., Colorable, to each class/object instance as appropriate.  You could then treat Colorable as a type to search against.

This proposal is being discussed in the TC and John Newton is leading it.  It is unclear whether or not it will make it into 1.0.</description>
		<content:encoded><![CDATA[<p>For CMIS, Color would have to be on Document.  If Color was on Invoice, then you could do “SELECT * FROM Invoice WHERE ‘green’ = ANY Color”.</p>
<p>If you have many types that have Color, but there is not a common type ancestor that has Color, then best practice would be to create such a type ancestor.</p>
<p>Sometimes adhering to that best practice is problematic, especially if a system desires multiple inheritance.  One of the discussions in CMIS is around mixins/aspects.  This is one of the more forward-looking proposals.  It would allow you to add an aspect, e.g., Colorable, to each class/object instance as appropriate.  You could then treat Colorable as a type to search against.</p>
<p>This proposal is being discussed in the TC and John Newton is leading it.  It is unclear whether or not it will make it into 1.0.</p>
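<p>A minimal sketch of how the quantified comparison in the query above can be read, assuming a toy in-memory list of objects rather than a real CMIS repository. The helper names (<code>any_equals</code>, <code>select_where_any</code>) and the sample data are hypothetical and only illustrate that <code>'green' = ANY Color</code> matches when at least one value of the multi-valued property equals the literal.</p>

```python
def any_equals(value, multi_valued):
    """True if at least one entry of a multi-valued property equals `value`."""
    # A missing (None) property never matches in this toy model.
    return multi_valued is not None and value in multi_valued


def select_where_any(objects, type_name, prop, value):
    """Toy reading of: SELECT * FROM <type_name> WHERE <value> = ANY <prop>."""
    return [
        obj for obj in objects
        if obj["type"] == type_name and any_equals(value, obj.get(prop))
    ]


# Hypothetical sample objects; only Invoice carries the Color property here.
objects = [
    {"id": 1, "type": "Invoice", "Color": ["green", "blue"]},
    {"id": 2, "type": "Invoice", "Color": ["red"]},
    {"id": 3, "type": "Claim"},
]

green_invoices = select_where_any(objects, "Invoice", "Color", "green")
```

<p>In this sketch the type in the FROM clause fixes which properties can be tested, which is why Color has to be defined on the type being queried.</p>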
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon Marks</title>
		<link>http://jonontech.com/2009/04/09/cmis-is-xpath-just-a-bit-too-tricksy/#comment-161</link>
		<dc:creator>Jon Marks</dc:creator>
		<pubDate>Thu, 09 Apr 2009 16:41:58 +0000</pubDate>
		<guid isPermaLink="false">http://jonontech.com/?p=445#comment-161</guid>
		<description>Oh, okay. I got that wrong too then. Seems the only thing I was right about in this post was being wrong about everything :-) I think I&#039;ve also proved that it is dangerous to comment on a spec based on reading it alone. You need to play with it to understand it properly.

Just to make sure I&#039;ve got this right, then: I can do &quot;SELECT * FROM Document WHERE ‘green’ = ANY Color&quot; even if Color is not a property of the base type, but is of a few subtypes? I understood from the &lt;a href=&quot;http://jonontech.com/wp-content/uploads/2009/04/cmisqueryscope.jpg&quot; rel=&quot;nofollow&quot;&gt;Search Query Scope Diagram&lt;/a&gt; that I could only do the query you suggested if Color was a property of the base type. I presume that &quot;SELECT *&quot; will only return the properties in Document (as each subtype will have different properties).

I do really appreciate someone like yourself that knows CMIS so well answering my questions! I&#039;ll repay with beers any time :-)

Jon</description>
		<content:encoded><![CDATA[<p>Oh, okay. I got that wrong too then. Seems the only thing I was right about in this post was being wrong about everything <img src='http://jonontech.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  I think I&#8217;ve also proved that it is dangerous to comment on a spec based on reading it alone. You need to play with it to understand it properly.</p>
<p>Just to make sure I&#8217;ve got this right, then: I can do &#8220;SELECT * FROM Document WHERE ‘green’ = ANY Color&#8221; even if Color is not a property of the base type, but is of a few subtypes? I understood from the <a href="http://jonontech.com/wp-content/uploads/2009/04/cmisqueryscope.jpg" rel="nofollow">Search Query Scope Diagram</a> that I could only do the query you suggested if Color was a property of the base type. I presume that &#8220;SELECT *&#8221; will only return the properties in Document (as each subtype will have different properties).</p>
<p>I do really appreciate someone like yourself that knows CMIS so well answering my questions! I&#8217;ll repay with beers any time <img src='http://jonontech.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Jon</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adriaan Bloem</title>
		<link>http://jonontech.com/2009/04/09/cmis-is-xpath-just-a-bit-too-tricksy/#comment-160</link>
		<dc:creator>Adriaan Bloem</dc:creator>
		<pubDate>Thu, 09 Apr 2009 15:54:57 +0000</pubDate>
		<guid isPermaLink="false">http://jonontech.com/?p=445#comment-160</guid>
		<description>I suspect (with far more reservations on my own ability to understand this than you have) it would be a lot harder to get XPath to work on these rather heterogeneous repositories than it is to implement CSQL. (Or at the very least, it&#039;d be a lot harder to squeeze any kind of performance out of it).

But while I agree -- and I&#039;d much prefer XPath to SQL -- it may not just be a case of it being too hard for the vendors. I think it might also be too hard on many developers. That&#039;s probably why it has seen some, but not great, uptake (much the same as XSLT).

So if you want to set a standard that everybody already sort of knows, you go with SQL. Not because it&#039;s the best, or the most appropriate, but because it&#039;ll have the most success, which in the end is pretty important for a standard :)</description>
		<content:encoded><![CDATA[<p>I suspect (with far more reservations on my own ability to understand this than you have) it would be a lot harder to get XPath to work on these rather heterogeneous repositories than it is to implement CSQL. (Or at the very least, it&#8217;d be a lot harder to squeeze any kind of performance out of it).</p>
<p>But while I agree &#8212; and I&#8217;d much prefer XPath to SQL &#8212; it may not just be a case of it being too hard for the vendors. I think it might also be too hard on many developers. That&#8217;s probably why it has seen some, but not great, uptake (much the same as XSLT).</p>
<p>So if you want to set a standard that everybody already sort of knows, you go with SQL. Not because it&#8217;s the best, or the most appropriate, but because it&#8217;ll have the most success, which in the end is pretty important for a standard <img src='http://jonontech.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Florent Guillaume</title>
		<link>http://jonontech.com/2009/04/09/cmis-is-xpath-just-a-bit-too-tricksy/#comment-159</link>
		<dc:creator>Florent Guillaume</dc:creator>
		<pubDate>Thu, 09 Apr 2009 15:42:52 +0000</pubDate>
		<guid isPermaLink="false">http://jonontech.com/?p=445#comment-159</guid>
		<description>In CMIS, Document is the mandatory base type for all document types. Querying on Document is required to do the query on all its subtypes. Ok, actually only subtypes that have includeInSuperTypeQuery=true, but a repository that doesn&#039;t have that for the direct subtypes of Document would be considered subpar (even though I think the spec allows it).

I don&#039;t agree that the difference between CMIS and JCR is SQL vs XPath, especially when you consider JCR 2. For me the difference between the two runs much deeper. CMIS is higher level &amp; simpler; it describes protocols (on top of a unique model) and not a language binding, and is supported by more vendors. See the points I made in http://asserttrue.blogspot.com/2009/04/hell-freezes-over-as-big-ecm-vendors.html?showComment=1238880000000#c3031100695560533822</description>
		<content:encoded><![CDATA[<p>In CMIS, Document is the mandatory base type for all document types. Querying on Document is required to do the query on all its subtypes. Ok, actually only subtypes that have includeInSuperTypeQuery=true, but a repository that doesn&#8217;t have that for the direct subtypes of Document would be considered subpar (even though I think the spec allows it).</p>
<p>I don&#8217;t agree that the difference between CMIS and JCR is SQL vs XPath, especially when you consider JCR 2. For me the difference between the two runs much deeper. CMIS is higher level &amp; simpler; it describes protocols (on top of a unique model) and not a language binding, and is supported by more vendors. See the points I made in <a href="http://asserttrue.blogspot.com/2009/04/hell-freezes-over-as-big-ecm-vendors.html?showComment=1238880000000#c3031100695560533822" rel="nofollow">http://asserttrue.blogspot.com/2009/04/hell-freezes-over-as-big-ecm-vendors.html?showComment=1238880000000#c3031100695560533822</a></p>
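<p>A minimal sketch of the subtype rule described above, assuming a toy type tree rather than a real CMIS client. The <code>TypeDef</code> class, the <code>query_scope</code> helper and the sample types are hypothetical; the sketch also assumes that a subtype with includeInSuperTypeQuery=false hides its own descendants from a query on the ancestor.</p>

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TypeDef:
    """Toy stand-in for a type definition (not a real client API)."""
    name: str
    include_in_supertype_query: bool = True
    children: list[TypeDef] = field(default_factory=list)


def query_scope(root: TypeDef) -> list[str]:
    """Names of the types a query on `root` covers in this toy model."""
    scope = [root.name]
    for child in root.children:
        # Assumption: an excluded subtype also hides its descendants.
        if child.include_in_supertype_query:
            scope.extend(query_scope(child))
    return scope


# Hypothetical type tree: Document -> Invoice, Document -> Draft -> DraftMemo.
invoice = TypeDef("Invoice")
draft = TypeDef("Draft", include_in_supertype_query=False,
                children=[TypeDef("DraftMemo")])
document = TypeDef("Document", children=[invoice, draft])

scope_for_document = query_scope(document)
```

<p>Querying the excluded type directly still reaches it and its subtypes in this sketch; the flag only affects queries issued against an ancestor.</p>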
]]></content:encoded>
	</item>
	<item>
		<title>By: Al Brown</title>
		<link>http://jonontech.com/2009/04/09/cmis-is-xpath-just-a-bit-too-tricksy/#comment-158</link>
		<dc:creator>Al Brown</dc:creator>
		<pubDate>Thu, 09 Apr 2009 15:42:41 +0000</pubDate>
		<guid isPermaLink="false">http://jonontech.com/?p=445#comment-158</guid>
		<description>Jon,

This is a very interesting article. Your choice of search metaphor (XPath vs SQL) is based on where you start your search from.  Most ECM systems start search from the point of type.  I am looking for a document (invoice, claim, loan, etc) that meets the following criteria.  The criteria could be on the properties or its location.  It is not necessarily clear that those objects exist in a folder structure.  They could.  That&#039;s a customer&#039;s choice.

With XPath, you start with a hierarchy.  I want to find items under this folder that match certain criteria.  This implies that customers always use the hierarchy and have organized their content that way.

Based on the paradigm you start with, you add in access to the rest of the vendor&#039;s models, such as /root/unfiled hacks. It is probable companies did not implement these extensions the same way as the standard and thus the cost to support increases.

Also, the issue with XPath was that, since many vendors did not natively support that style, adding support for XQuery became problematic when XPath limitations were encountered (and they were).

CMIS is based on the idea of standardizing the 80% of what all the vendors do and do well.  It is backward looking by design.  This decreases the cost and makes it an easier decision on whether to support the effort or not.  This was proven by the interoperability plugfest where most vendors created their prototypes with a couple of man-months of effort.  That is impressive for a standard that provides significant capability.

I think JCR is a good standard, and a forward looking one, and is the result of a lot of good work by very many talented people.  Unfortunately, times change, and a big drawback of a Java standard is typically the lack of MSFT support.  SharePoint, like it or not, has an impact on this space and it is important to include them in interoperability.</description>
		<content:encoded><![CDATA[<p>Jon,</p>
<p>This is a very interesting article. Your choice of search metaphor (XPath vs SQL) is based on where you start your search from.  Most ECM systems start search from the point of type.  I am looking for a document (invoice, claim, loan, etc) that meets the following criteria.  The criteria could be on the properties or its location.  It is not necessarily clear that those objects exist in a folder structure.  They could.  That&#8217;s a customer&#8217;s choice.</p>
<p>With XPath, you start with a hierarchy.  I want to find items under this folder that match certain criteria.  This implies that customers always use the hierarchy and have organized their content that way.</p>
<p>Based on the paradigm you start with, you add in access to the rest of the vendor&#8217;s models, such as /root/unfiled hacks. It is probable companies did not implement these extensions the same way as the standard and thus the cost to support increases.</p>
<p>Also, the issue with XPath was that, since many vendors did not natively support that style, adding support for XQuery became problematic when XPath limitations were encountered (and they were).</p>
<p>CMIS is based on the idea of standardizing the 80% of what all the vendors do and do well.  It is backward looking by design.  This decreases the cost and makes it an easier decision on whether to support the effort or not.  This was proven by the interoperability plugfest where most vendors created their prototypes with a couple of man-months of effort.  That is impressive for a standard that provides significant capability.</p>
<p>I think JCR is a good standard, and a forward looking one, and is the result of a lot of good work by very many talented people.  Unfortunately, times change, and a big drawback of a Java standard is typically the lack of MSFT support.  SharePoint, like it or not, has an impact on this space and it is important to include them in interoperability.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon Marks</title>
		<link>http://jonontech.com/2009/04/09/cmis-is-xpath-just-a-bit-too-tricksy/#comment-157</link>
		<dc:creator>Jon Marks</dc:creator>
		<pubDate>Thu, 09 Apr 2009 14:59:14 +0000</pubDate>
		<guid isPermaLink="false">http://jonontech.com/?p=445#comment-157</guid>
		<description>Thanks so much for your reply. The link you posted is exactly what I was looking for. I was Googling things like &quot;CMIS XPath SQL&quot;, whereas searching for &quot;iECM XPath SQL&quot; is far cleverer. I guess the CMIS discussions didn&#039;t want to rehash the iECM discussions too much. I&#039;ll read this properly and add my thoughts.

On your points:
- I still think an XPath query makes sense for multi-filed documents. Unfiled documents make slightly less sense; you would probably need to hack in some new virtual /root/unfiled/ node.
- I probably wasn&#039;t clear here. My example would be searching across content types such as NEWS_ARTICLE, PRESS_RELEASE or WHITE_PAPER that may have different but overlapping attributes. In your example, is &quot;Document&quot; a reserved word that means all content types? Could I do this in one CSQL query as you have? Or have I misunderstood/misread the spec?
- Agree. I probably shouldn&#039;t have mentioned the breadcrumb notion at all. I mentioned it as it is something that always needs consideration but isn&#039;t really relevant to the discussion at hand.
- I guess I&#039;ve got some more reading to do here too. You know a lot more than I do ...

Would you agree, though, that the main difference between CMIS and JCR is SQL v. XPath? Or do you think the other differences are more fundamental?

Thanks again
Jon</description>
		<content:encoded><![CDATA[<p>Thanks so much for your reply. The link you posted is exactly what I was looking for. I was Googling things like &#8220;CMIS XPath SQL&#8221;, whereas searching for &#8220;iECM XPath SQL&#8221; is far cleverer. I guess the CMIS discussions didn&#8217;t want to rehash the iECM discussions too much. I&#8217;ll read this properly and add my thoughts.</p>
<p>On your points:<br />
- I still think an XPath query makes sense for multi-filed documents. Unfiled documents make slightly less sense; you would probably need to hack in some new virtual /root/unfiled/ node.<br />
- I probably wasn&#8217;t clear here. My example would be searching across content types such as NEWS_ARTICLE, PRESS_RELEASE or WHITE_PAPER that may have different but overlapping attributes. In your example, is &#8220;Document&#8221; a reserved word that means all content types? Could I do this in one CSQL query as you have? Or have I misunderstood/misread the spec?<br />
- Agree. I probably shouldn&#8217;t have mentioned the breadcrumb notion at all. I mentioned it as it is something that always needs consideration but isn&#8217;t really relevant to the discussion at hand.<br />
- I guess I&#8217;ve got some more reading to do here too. You know a lot more than I do &#8230;</p>
<p>Would you agree, though, that the main difference between CMIS and JCR is SQL v. XPath? Or do you think the other differences are more fundamental?</p>
<p>Thanks again<br />
Jon</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Florent Guillaume</title>
		<link>http://jonontech.com/2009/04/09/cmis-is-xpath-just-a-bit-too-tricksy/#comment-156</link>
		<dc:creator>Florent Guillaume</dc:creator>
		<pubDate>Thu, 09 Apr 2009 14:37:53 +0000</pubDate>
		<guid isPermaLink="false">http://jonontech.com/?p=445#comment-156</guid>
		<description>Mainly, XPath has not been used because the repository vendors aren&#039;t too fond of it (most of them are SQL guys). Note also that in JCR 2 (JSR-283) SQL has become the primary mandatory query language, and XPath is deprecated.

It seems to me that XPath is mostly used by XML fans, and is only applicable if you can have a natural mapping of your data into the XML basic infoset. That&#039;s sometimes hard.

Also, have a look at John Newton&#039;s article from 2006 about SQL vs XPath, which is still relevant: http://newton.typepad.com/content/2006/09/sql_vs_xpath_vs.html

Regarding some of the bulleted points you made:
- XPath is more natural for a hierarchy but the hierarchy aspects of CMIS only apply to the folders, the documents themselves can be multi-filed or unfiled so XPath would have a harder time applying to them,
- If you want to find all documents that are green you can do SELECT * FROM Document WHERE &#039;green&#039; = ANY Color. It&#039;s hard to see a use case where you would want to query Documents and Folders at the same time, or Documents and Relations for instance,
- Folder breadcrumb is an ill-defined notion if you have multi-filing or unfiling, and that&#039;ll be quite common in some CMIS repositories. Anyway you can do getObjectParents to get all the containing folders, then getFolderParent with returnToRoot=true on each,
- Pagination functionality is expressed in the model but may appear differently in the actual protocols, for instance AtomPub specifies RFC 5005 for paging.</description>
		<content:encoded><![CDATA[<p>Mainly, XPath has not been used because the repository vendors aren&#8217;t too fond of it (most of them are SQL guys). Note also that in JCR 2 (JSR-283) SQL has become the primary mandatory query language, and XPath is deprecated.</p>
<p>It seems to me that XPath is mostly used by XML fans, and is only applicable if you can have a natural mapping of your data into the XML basic infoset. That&#8217;s sometimes hard.</p>
<p>Also, have a look at John Newton&#8217;s article from 2006 about SQL vs XPath, which is still relevant: <a href="http://newton.typepad.com/content/2006/09/sql_vs_xpath_vs.html" rel="nofollow">http://newton.typepad.com/content/2006/09/sql_vs_xpath_vs.html</a></p>
<p>Regarding some of the bulleted points you made:<br />
- XPath is more natural for a hierarchy but the hierarchy aspects of CMIS only apply to the folders, the documents themselves can be multi-filed or unfiled so XPath would have a harder time applying to them,<br />
- If you want to find all documents that are green you can do SELECT * FROM Document WHERE &#8216;green&#8217; = ANY Color. It&#8217;s hard to see a use case where you would want to query Documents and Folders at the same time, or Documents and Relations for instance,<br />
- Folder breadcrumb is an ill-defined notion if you have multi-filing or unfiling, and that&#8217;ll be quite common in some CMIS repositories. Anyway you can do getObjectParents to get all the containing folders, then getFolderParent with returnToRoot=true on each,<br />
- Pagination functionality is expressed in the model but may appear differently in the actual protocols, for instance AtomPub specifies RFC 5005 for paging.</p>
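<p>A minimal sketch of the breadcrumb approach from the third point above, assuming a toy in-memory repository. The functions <code>get_object_parents</code> and <code>get_folder_parents_to_root</code> are hypothetical stand-ins named after getObjectParents and getFolderParent with returnToRoot=true; they are not a real client API, and the folder and document identifiers are made up.</p>

```python
# Hypothetical folder tree: each folder maps to its single parent (None for root).
FOLDER_PARENT = {"root": None, "finance": "root", "2009": "finance", "archive": "root"}

# Hypothetical filing table: a document may sit in several folders, or in none.
DOCUMENT_PARENTS = {"invoice-42": ["2009", "archive"], "loose-note": []}


def get_object_parents(doc_id):
    """Stand-in for getObjectParents: all folders that contain the document."""
    return list(DOCUMENT_PARENTS[doc_id])


def get_folder_parents_to_root(folder_id):
    """Stand-in for getFolderParent with returnToRoot=true: ancestors, nearest first."""
    chain = []
    parent = FOLDER_PARENT[folder_id]
    while parent is not None:
        chain.append(parent)
        parent = FOLDER_PARENT[parent]
    return chain


def breadcrumbs(doc_id):
    """One breadcrumb trail per containing folder; an empty list if unfiled."""
    trails = []
    for folder in get_object_parents(doc_id):
        ancestors = get_folder_parents_to_root(folder)
        trails.append("/" + "/".join(reversed([folder] + ancestors)))
    return trails


multi_filed = breadcrumbs("invoice-42")
unfiled = breadcrumbs("loose-note")
```

<p>A multi-filed document yields several trails and an unfiled one yields none, which is why a single breadcrumb is ill-defined in this model.</p>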
]]></content:encoded>
	</item>
</channel>
</rss>

