I have the following SQL:

WITH q1 (Tdata, Key) AS (
  SELECT XMLtype(pint.transportdata, nls_charset_id('AL32UTF8')),
         pint.paymentinterchangekey
  FROM   bph_owner.paymentinterchange pint
  WHERE  pint.transporttime >= to_date('2024-01-11 00:00:00', 'yyyy-mm-dd hh24:mi:ss')
  AND    pint.transporttime <  to_date('2024-04-12 00:00:00', 'yyyy-mm-dd hh24:mi:ss')
  AND    LENGTH(pint.transportdata) > 0
  AND    pint.fileformat LIKE 'pain%'
)
SELECT C1.column_value, q1.Key
FROM   q1,
       XMLTABLE('//*' PASSING q1.Tdata) C1
--WHERE regexp_count(C1.column_value, '<') < 2
;

This works. Sample output:

COLUMN_VALUE
<Id xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02"><PrvtId><Othr><Id>IE57ZZZ05327119</Id><SchmeNm><Prtry>SEPA</Prtry></SchmeNm></Othr></PrvtId></Id>
<PrvtId xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02"><Othr><Id>IE57ZZZ05327119</Id><SchmeNm><Prtry>SEPA</Prtry></SchmeNm></Othr></PrvtId>
<Othr xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02"><Id>IE57ZZZ05327119</Id><SchmeNm><Prtry>SEPA</Prtry></SchmeNm></Othr>
<Id xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02">IE57ZZZ05327119</Id>
<SchmeNm xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02"><Prtry>SEPA</Prtry></SchmeNm>
<Prtry xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02">SEPA</Prtry>

But if I uncomment the REGEXP_COUNT filter, I get:

ORA-19011: Character string buffer too small

If I narrow the predicate to a smaller date range, it works:

COLUMN_VALUE
<FinInstnId xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02"/>

Any idea how I can get this to work with a larger date range?

1
  • Please edit the question to include a minimal reproducible example with: the CREATE TABLE and INSERT statements for your sample data to allow us to replicate the problem. (We cannot work out how to get your desired output if we don't know what we are starting with.) Commented Apr 25, 2024 at 8:19

2 Answers

1

C1.column_value is (implicitly) an XMLType. You are passing that into regexp_count(), which expects a character value. That means your XMLType is implicitly converted, and by default that will use getstringval(). Once you get a column_value that is more than 4000 bytes (or maybe 32k depending on your version/settings) you'll see this error. That's why it sometimes works - with different date ranges you just happen to avoid the longer values.

To pass it in as a CLOB instead, explicitly convert it:

regexp_count( C1.column_value.getclobval(),'<')
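
In the context of the question's query, that conversion might look like the following sketch. It reuses the table, columns, and date range from the question unchanged; only the WHERE clause on the XMLTABLE output is new, and the `< 2` comparison has to stay in place for the predicate to be valid.

```sql
-- Sketch: the question's query with the XMLType explicitly converted to a CLOB,
-- so REGEXP_COUNT no longer relies on the implicit getstringval() conversion.
WITH q1 (Tdata, Key) AS (
  SELECT XMLTYPE(pint.transportdata, NLS_CHARSET_ID('AL32UTF8')),
         pint.paymentinterchangekey
  FROM   bph_owner.paymentinterchange pint
  WHERE  pint.transporttime >= DATE '2024-01-11'
  AND    pint.transporttime <  DATE '2024-04-12'
  AND    LENGTH(pint.transportdata) > 0
  AND    pint.fileformat LIKE 'pain%'
)
SELECT C1.column_value,
       q1.Key
FROM   q1
       CROSS JOIN XMLTABLE('//*' PASSING q1.Tdata) C1
-- The comparison (< 2) is required; REGEXP_COUNT alone is not a condition.
WHERE  REGEXP_COUNT(C1.column_value.getclobval(), '<') < 2;
```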

But don't do this anyway; do what @MT0 said...

2 Comments

Many thanks. I did try that before, but for some reason it doesn't like: SELECT C1.column_value , q1.Key from q1, XMLTABLE( '//*' PASSING q1.Tdata ) C1 where regexp_count( C1.column_value.getclobval(),'<') ; I get: [Error] Execution (92: 54): ORA-00920: invalid relational operator. But @MT0's suggestion is working. Thanks for your time and help.
@Peterwarren - you seem to have lost the < 2 part...
1

Your filter:

where regexp_count( C1.column_value,'<') < 2

Appears to be matching leaf elements of the XML that contain no text.

If that is the logic that you are after then don't use a regular expression; instead, search for those conditions directly in the XPath expression when you parse the XML.

  • //*[not(*)] will find elements with no children;
  • //*[not(text())] will find elements with no text; and
  • you can combine them both to //*[not(*)][not(text())]

Like this:

SELECT  C1.column_value,
        p.paymentinterchangekey AS Key
FROM    bph_owner.paymentinterchange p
        CROSS JOIN XMLTABLE(
          '//*[not(*)][not(text())]'
           PASSING XMLtype(p.transportdata)
        ) C1
WHERE   p.TRANSPORTTIME >= DATE '2024-01-11'
AND     p.TRANSPORTTIME <  DATE '2024-04-12'
AND     LENGTH(p.transportdata)>0
AND     p.FILEFORMAT like 'pain%';

Which, for the sample data:

CREATE TABLE paymentinterchange (
  transportdata,
  paymentinterchangekey,
  transporttime,
  fileformat
) AS
SELECT EMPTY_CLOB() || '<Id xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02"><PrvtId><Othr><Id>IE57ZZZ05327119</Id><SchmeNm><Prtry>SEPA</Prtry></SchmeNm></Othr></PrvtId></Id>',
       1,
       DATE '2024-01-11',
       'paint'
FROM   DUAL UNION ALL
SELECT EMPTY_CLOB() || '<FinInstnId xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02"/>',
       2,
       DATE '2024-01-11',
       'paint'
FROM   DUAL;

Outputs:

COLUMN_VALUE KEY
<FinInstnId xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02"/> 2
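
To see how the three XPath predicates listed earlier differ, each can be run against a small inline document. This is only an illustrative sketch: the XML literal is made up (it is not from the question's data) and deliberately has no namespace and no whitespace between elements.

```sql
-- Hypothetical test document: <b> has text, <c/> and <e/> are empty leaves,
-- <a> and <d> have child elements but no text of their own.

-- Elements with no child elements: b, c and e.
SELECT x.column_value
FROM   XMLTABLE('//*[not(*)]'
         PASSING XMLTYPE('<a><b>text</b><c/><d><e/></d></a>')) x;

-- Elements with no text node children: a, c, d and e.
SELECT x.column_value
FROM   XMLTABLE('//*[not(text())]'
         PASSING XMLTYPE('<a><b>text</b><c/><d><e/></d></a>')) x;

-- Both conditions together (empty leaf elements): c and e.
SELECT x.column_value
FROM   XMLTABLE('//*[not(*)][not(text())]'
         PASSING XMLTYPE('<a><b>text</b><c/><d><e/></d></a>')) x;
```

Only the combined predicate reproduces what the regular-expression filter was approximating, which is why it is the one used in the query above.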

3 Comments

Many thanks. I have it working, though I needed to amend it slightly: PASSING XMLtype(p.transportdata, nls_charset_id('AL32UTF8')). Much appreciated!
Being cheeky here: is there a way to remove the namespace from the output? Thanks again.
@Peterwarren you can look at stackoverflow.com/q/24308225/1509264 or stackoverflow.com/q/29354318/1509264 - if they don't answer your question then you should probably ask a new question
