Thursday, March 24, 2011

XPath query and time

Hi, this is the content of my XML file:

<?xml version="1.0" encoding="ISO-8859-1"?>
<mainNode>
    <sub time="08:00">
     <status id="2">On</status>
     <status id="3">Off</status>
    </sub>
    <sub time="13:00">
     <status id="4">On</status>
     <status id="7">On</status>
    </sub>
    <sub time="16:00">
     <status id="5">On</status>
     <status id="6">On</status>
     <status id="7">Off</status>
     <status id="8">On</status>
    </sub>
    <sub time="20:00">
     <status id="4">Off</status>
     <status id="7">On</status>
    </sub>
    <sub time="23:59">
     <status id="4">On</status>
     <status id="7">On</status>
    </sub>
</mainNode>

My program gets the current time. If it is 15:59, I must retrieve all the status ids of the next sub time (16:00):

<sub time="16:00">
     <status id="5">On</status>
     <status id="6">On</status>
     <status id="7">Off</status>
     <status id="8">On</status>
    </sub>

With a second XPath query I must get all the status ids of the previous sub time (13:00). How can I do this? I know SQL but I'm quite new to XPath. Links to serious XPath resources are welcome too, if any. Thanks!

From stackoverflow
  • If you generate the XML yourself, you could store the time attribute as an integer value (ticks, for example). Then you could do a simple numerical comparison using something like

    //sub[@time > 1389893892]
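
As a rough sketch of that idea, the snippet below rewrites each HH:MM attribute as minutes since midnight, so that a numeric predicate such as //sub[@time > 959] becomes possible. The helper name to_minutes and the choice of minutes as the unit are assumptions made for illustration; the answer itself only suggests "an integer value".

```python
import xml.etree.ElementTree as ET


def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' string to minutes since midnight (hypothetical helper)."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


# Shortened sample in the shape of the question's document.
SAMPLE = """<mainNode>
    <sub time="08:00"/>
    <sub time="16:00"/>
</mainNode>"""

root = ET.fromstring(SAMPLE)
for sub in root.iter("sub"):
    # Replace the textual time with its integer form.
    sub.set("time", str(to_minutes(sub.get("time"))))

converted = [sub.get("time") for sub in root.iter("sub")]
print(converted)
print(to_minutes("15:59"))
```

The current time would be converted with the same helper before it is placed in the XPath predicate.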
    
  • Well, as long as the time is HH:MM, something like the following should work (please excuse my syntax, since I'm just dabbling without running it; consider this pseudo-XPath):

    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    
    //sub[fn:compare(@time,'12:59') > 0][1]/status
    

    This should select the sub elements whose time is greater than 12:59, take the first of them, and return its status children.

    You could also pass the value '12:59' as an external parameter into the xpath evaluation.

    Dimitre Novatchev : It is simpler to just use "lt" and "gt" when comparing two xs:time values.
  • Here is the ugly XPath 1.0 solution:

    sub[number(substring-before(@time, ':')) * 60 + number(substring-after(@time, ':')) > 959][1]
    

    Note that 959 = 15 * 60 + 59, which I'm sure you can compute in your calling code.

    Given that node, the previous node can be accessed as:

    preceding-sibling::sub[1]
    

    However, a pragmatic, common-sense solution would be to load the XML data into a set of data structures and use a language better suited to this task to look the items up.

    Tomalak : For an XPath solution, that's the way to go. It's not even especially ugly, IMHO. +1
    AnthonyWJones : @Tomalak: I must admit I was expecting it to be much uglier, but you're right, it's not actually that bad.
    Dimitre Novatchev : No need to calculate the total seconds -- see my answer.
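
The pragmatic route mentioned in this answer, loading the XML into ordinary data structures and doing the lookup in the host language, might look roughly like the sketch below. It uses only the Python standard library. The names to_minutes, load_schedule and next_and_previous are hypothetical, and Python itself is an assumption, since the question does not say which language the calling program uses.

```python
import bisect
import xml.etree.ElementTree as ET

XML = """<mainNode>
    <sub time="08:00"><status id="2">On</status><status id="3">Off</status></sub>
    <sub time="13:00"><status id="4">On</status><status id="7">On</status></sub>
    <sub time="16:00"><status id="5">On</status><status id="6">On</status>
                      <status id="7">Off</status><status id="8">On</status></sub>
    <sub time="20:00"><status id="4">Off</status><status id="7">On</status></sub>
    <sub time="23:59"><status id="4">On</status><status id="7">On</status></sub>
</mainNode>"""


def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def load_schedule(xml_text):
    """Return (sorted minute keys, list of {status id: state} dicts in the same order)."""
    root = ET.fromstring(xml_text)
    subs = sorted(root.findall("sub"), key=lambda s: to_minutes(s.get("time")))
    keys = [to_minutes(s.get("time")) for s in subs]
    statuses = [{st.get("id"): st.text for st in s.findall("status")} for s in subs]
    return keys, statuses


def next_and_previous(keys, statuses, now):
    """Return the status dicts of the first later and the last earlier sub (or None)."""
    t = to_minutes(now)
    nxt = bisect.bisect_right(keys, t)      # first index whose key is > t
    prev = bisect.bisect_left(keys, t) - 1  # last index whose key is < t
    next_entry = statuses[nxt] if nxt < len(keys) else None
    prev_entry = statuses[prev] if prev >= 0 else None
    return next_entry, prev_entry


keys, statuses = load_schedule(XML)
upcoming, previous = next_and_previous(keys, statuses, "15:59")
print(upcoming)
print(previous)
```

Sorting once and using binary search keeps each lookup cheap when the program asks for the neighbouring time slots repeatedly.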
  • Here are two solutions:

    I. XPath 1.0

    This is one pair of XPath 1.0 expressions that select the required nodes:

    /*/*
        [translate(@time, ':','') 
        > 
         translate('15:59',':','')
        ][1]
    

    selects the first sub node with time later than 15:59.

    /*/*
        [translate(@time, ':','') 
        < 
         translate('15:59',':','')
        ][last()]
    

    selects the last sub node with a time earlier than 15:59, that is, the previous sub time.

    We can include these in an XSLT transformation and check that the wanted result is indeed produced:

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output omit-xml-declaration="yes"/>
    
        <xsl:template match="/">
          First time after 15:59: 
          <xsl:copy-of select=
           "/*/*
              [translate(@time, ':','') 
             > 
               translate('15:59',':','')
              ][1]
          "/>
    
          First time before 15:59: 
          <xsl:copy-of select=
           "/*/*
              [translate(@time, ':','') 
             &lt; 
               translate('15:59',':','')
              ][last()]
          "/>
      </xsl:template>
    </xsl:stylesheet>
    

    When the above transformation is applied on the originally provided XML document:

    <mainNode>
        <sub time="08:00">
         <status id="2">On</status>
         <status id="3">Off</status>
        </sub>
        <sub time="13:00">
         <status id="4">On</status>
         <status id="7">On</status>
        </sub>
        <sub time="16:00">
         <status id="5">On</status>
         <status id="6">On</status>
         <status id="7">Off</status>
         <status id="8">On</status>
        </sub>
        <sub time="20:00">
         <status id="4">Off</status>
         <status id="7">On</status>
        </sub>
        <sub time="23:59">
         <status id="4">On</status>
         <status id="7">On</status>
        </sub>
    </mainNode>
    

    the wanted result is produced:

      First time after 15:59: 
    
    
    <sub time="16:00">
         <status id="5">On</status>
         <status id="6">On</status>
         <status id="7">Off</status>
         <status id="8">On</status>
    </sub>
    
      First time before 15:59: 
    
     <sub time="13:00">
         <status id="4">On</status>
         <status id="7">On</status>
     </sub>
    

    Do note the following:

    1. The use of the XPath translate() function to get rid of the colons

    2. The use of the last() function in the second expression

    3. There is no need to convert the time to seconds before the comparison

    4. When used as part of an XML document (such as an XSLT stylesheet), the < operator must be escaped as &lt;.
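
To see why stripping the colon is enough, the short Python sketch below mimics what the two XPath 1.0 expressions do: translate() removes the colon, and the > or < operator then converts both strings to numbers before comparing. The function name strip_colon is hypothetical. The trick relies on the minutes always being written with two digits, as they are in the sample document.

```python
def strip_colon(hhmm: str) -> float:
    # Mirrors translate(@time, ':', '') followed by the implicit
    # string-to-number conversion that XPath 1.0 applies for > and <.
    return float(hhmm.replace(":", ""))


times = ["08:00", "13:00", "16:00", "20:00", "23:59"]  # document order
now = "15:59"

later = [t for t in times if strip_colon(t) > strip_colon(now)]
earlier = [t for t in times if strip_colon(t) < strip_colon(now)]

first_later = later[0]       # corresponds to [1] in the first expression
last_earlier = earlier[-1]   # corresponds to [last()] in the second expression
print(first_later, last_earlier)
```

Because 16:00 becomes 1600 and 15:59 becomes 1559, the numeric order of the stripped values matches the chronological order of the times.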

    II. XPath 2.0

    In XPath 2.0 we can use the following two expressions to select the desired nodes:

    /*/*[xs:time(concat(@time,':00')) 
        gt 
         xs:time('15:59:00')
        ][1]
    

    selects the first sub node with time later than 15:59.

    /*/*[xs:time(concat(@time,':00')) 
       lt 
         xs:time('15:59:00')
        ][last()]
    

    selects the last sub node with a time earlier than 15:59, that is, the previous sub time.

    We can include these in an XSLT 2.0 transformation and check that the wanted result is indeed produced:

    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xsl:output omit-xml-declaration="yes"/>
    
        <xsl:template match="/">
          First time after 15:59: 
          <xsl:copy-of select=
           "/*/*[xs:time(concat(@time,':00')) 
               gt 
                 xs:time('15:59:00')
                 ][1]
          "/>
    
          First time before 15:59: 
          <xsl:copy-of select=
           "/*/*[xs:time(concat(@time,':00')) 
               lt 
                 xs:time('15:59:00')
              ][last()]
          "/>
        </xsl:template>
    </xsl:stylesheet>
    

    When the above transformation is applied on the originally provided XML document (the same as in the first solution), the same wanted result is produced.

    Do note the following:

    1. In XPath 2.0, xs:time is a native data type. However, in order to construct an xs:time from the values in the XML document, we have to append the missing seconds part to them.
    2. In XPath 2.0, xs:time values can be compared with the "atomic-value comparison operators" such as lt or gt.
    Dimitre Novatchev : @vyger Thanks for pointing this -- some weird bug in the "code" button. The indexes in the XSLT 2 code were shown as 7 and 8. Just edited it, if this continues, I will edit further and will not use the code button.
    Dimitre Novatchev : @vyger See: http://tinyurl.com/dd4tsy This is about XSLT books, but they cover by necessity XPath very well. My favourite is Michael Kay and his latest book on both XSLT 2.0 and XPath 2.0. BTW, if you think an answer is what you wanted, you can select it (click the check mark)
    AnthonyWJones : +1. Nice answer. Good to see you on SO as well ;)
    Dimitre Novatchev : @AnthonyWJones Thanks, SO is a very nice Q&A site.
