I am trying to remove duplicate entries that follow the &#x00a7; entity. If an entry contains a comma, I tokenize it, and any token that starts with a round bracket should get the leading part of the first token prepended. For example, (17200(b)(2), (4)–(6)) should become (<p>17200(b)(2)</p><p>17200(b)(4)–(6)</p>).
Input XML

<root>
    <p>CC &#x00a7;1(a), (b), (c)</p>
    <p>Civil Code &#x00a7;1(a), (b)</p>
    <p>CC &#x00a7;&#x00a7;2(a)</p>
    <p>Civil Code &#x00a7;3(a)</p>
    <p>CC &#x00a7;1(c)</p>
    <p>Civil Code &#x00a7;1(a), (b), (c)</p>
    <p>Civil Code &#x00a7;17200(b)(2), (4)–(6), (8), (12), (16), (20), and (21)</p>
</root>

Expected Output

<root>
   <sec specific-use="CC">
      <title content-type="Sta_Head3">CIVIL CODE</title>
      <p>1(a)</p>
      <p>1(b)</p>
      <p>1(c)</p>
      <p>2(a)</p>
      <p>3(a)</p>
      <p>17200(b)(2)</p>
      <p>17200(b)(4)–(6)</p>
      <p>17200(b)(8)</p>
      <p>17200(b)(12)</p>
      <p>17200(b)(16)</p>
      <p>17200(b)(20)</p>
      <p>17200(b)(21)</p>
   </sec>
</root>

XSLT Code

<xsl:template match="root">
    <xsl:copy>
        <xsl:for-each-group select="p[(starts-with(., 'CC ') or starts-with(., 'Civil Code'))]" group-by="replace(substring-before(., ' &#x00a7;'), 'Civil Code', 'CC')">
            <xsl:text>&#x0A;</xsl:text>
            <sec specific-use="{current-grouping-key()}">
                <xsl:text>&#x0A;</xsl:text>
                <title content-type="Sta_Head3">CIVIL CODE</title>
                <xsl:for-each-group select="current-group()" group-by="replace(substring-after(., '&#x00a7;'), '&#x00a7;', '')">
                    <xsl:sort select="replace(current-grouping-key(), '[^0-9.].*$', '')" data-type="number" order="ascending"/>
                    <xsl:for-each 
                        select="distinct-values(
                        current-grouping-key() ! 
                        (let $tokens := tokenize(current-grouping-key(), ', and |, | and ') 
                        return (head($tokens), tail($tokens) ! (substring-before(head($tokens), '(') || .)))
                        )" expand-text="yes">
                        <p>{.}</p>
                    </xsl:for-each>
                </xsl:for-each-group>
            </sec>
        </xsl:for-each-group>
    </xsl:copy>
</xsl:template>

1 Answer

You could do it in two steps: first expand every citation into a flat list of p elements, then use a for-each-group to remove the duplicates.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <xsl:variable name="listP">
        <xsl:apply-templates select="root/p"/>
    </xsl:variable>
    
    <!-- Group the expanded p elements by their text so each entry is emitted once -->
    <xsl:for-each-group select="$listP/p" group-by=".">
        <p><xsl:value-of select="current-grouping-key()"/></p>
    </xsl:for-each-group>
  </xsl:template>
  
  <xsl:template match="p">
    <xsl:variable name="input" select="replace(substring-after(.,'&#x00a7;'),'&#x00a7;','')"/>
    <xsl:variable name="chapter" select="substring-before($input,'(')"/>
    <xsl:for-each select="tokenize(substring-after($input, $chapter),',')">
        <p><xsl:value-of select="concat($chapter,replace(replace(.,' ',''),'and',''))"/></p>    
    </xsl:for-each>
  </xsl:template>
  
</xsl:stylesheet>

See it working here: https://xsltfiddle.liberty-development.net/gVrvcxQ

4 Comments

Please check: the (b) is missing in the last entries, e.g. it should be 17200(b)(4)–(6). Also, I would appreciate a solution based on my existing code.
@Sam What is the rule that determines that 17200(b) keeps its first parentheses before the other subsections are appended, while the other cases don't? For example, 1(a), (b) gives 1(b) and not 1(a)(b).
I think that, after tokenizing, I can match the end of the first token with '(.*)(.*?)$' and then replace it with group 1.
@Sam OK, then you have the solution. I think my code goes far enough; you can modify it to suit your specific requirements.
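
One possible reading of the rule discussed in these comments is "take the first token, drop its last parenthesized group, and prepend what is left to every following token". Under that reading, 1(a) gives the prefix 1 and 17200(b)(2) gives the prefix 17200(b). The stylesheet below is only a sketch under that assumption, not the answerer's code. It also assumes that every entry starts with a section number, that all matching p elements belong in a single hard-coded CC section, and that the variable names entries, tokens and prefix are free to choose.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="root">
    <xsl:copy>
      <!-- Assumption: "CC" and "Civil Code" both map to this one section -->
      <sec specific-use="CC">
        <title content-type="Sta_Head3">CIVIL CODE</title>

        <!-- Step 1: expand every citation into a flat sequence of full entries -->
        <xsl:variable name="entries" as="xs:string*">
          <xsl:for-each select="p[starts-with(., 'CC ') or starts-with(., 'Civil Code ')]">
            <!-- Drop everything up to the last section sign, then split on ", and ", ", " or " and " -->
            <xsl:variable name="tokens"
                select="tokenize(replace(., '^.*&#x00a7;', ''), ',\s+and\s+|,\s+|\s+and\s+')"/>
            <!-- Prefix = first token without its last parenthesized group -->
            <xsl:variable name="prefix"
                select="replace(head($tokens), '\([^()]*\)$', '')"/>
            <xsl:sequence select="head($tokens), tail($tokens) ! ($prefix || .)"/>
          </xsl:for-each>
        </xsl:variable>

        <!-- Step 2: one group per distinct entry, sorted by the leading section number.
             The sort is stable, so entries with the same number keep their source order. -->
        <xsl:for-each-group select="$entries" group-by=".">
          <xsl:sort select="number(replace(current-grouping-key(), '\D.*$', ''))"/>
          <p><xsl:value-of select="current-grouping-key()"/></p>
        </xsl:for-each-group>
      </sec>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Collecting all entries before grouping is what removes duplicates across different p elements. In the question's code, distinct-values only sees the tokens of one grouping key at a time, so 1(c) from "CC &#x00a7;1(c)" and from "CC &#x00a7;1(a), (b), (c)" would both be kept.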
