Oracle 11g gets all matched regex occurrences

Question

I am using Oracle 11g and would like to use REGEXP_SUBSTR to return all occurrences that match a given pattern. For example:

SELECT REGEXP_SUBSTR('Txa233141b Ta233141 Ta233142 Ta233147 Ta233148', '(^|\s)[A-Za-z]{2}[0-9]{5,}(\s|$)') "REGEXP_SUBSTR" FROM DUAL;

returns only the first match, Ta233141, but I would like to return the other occurrences that match the regular expression as well, i.e. Ta233142, Ta233147 and Ta233148.

+8

oracle regex

florins Jul 11 '13 at 14:40

5 answers

How about adding a function that loops through the matches and returns all of them?

 create or replace function regexp_substr_mr (
   p_data clob,
   p_re   varchar2
 ) return varchar2
 as
   v_cnt     number;
   v_results varchar2(4000);
 begin
   -- count the matches first ('m' treats the input as multiple lines)
   v_cnt := regexp_count(p_data, p_re, 1, 'm');
   if v_cnt <= 25 then
     -- append each occurrence, terminated by CR/LF
     for i in 1 .. v_cnt loop
       v_results := v_results || regexp_substr(p_data, p_re, 1, i, 'm') || chr(13) || chr(10);
     end loop;
   else
     v_results := 'WARNING more than 25 matches found';
   end if;
   return v_results;
 end;
 /

Then just call the function as part of the select query.
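A call might look like the sketch below. It assumes the regexp_substr_mr function from this answer has already been compiled in the current schema. The '[^ ]+' pattern and the all_matches alias are chosen here only for illustration.

```sql
-- Sketch: call the helper from plain SQL.
-- '[^ ]+' matches each run of non-space characters, so every token is returned,
-- each followed by CR/LF as built inside the function.
SELECT regexp_substr_mr(
         'Txa233141b Ta233141 Ta233142 Ta233147 Ta233148',
         '[^ ]+'
       ) AS all_matches
FROM   dual;
```

Because the function returns a single string limited to 4000 characters, it suits short match lists better than bulk extraction.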

+1

Philj May 15, '14 at 16:10

It's a bit late, but I needed basically the same thing and could not find a good snippet. I had to search a free-text table column for certain terms and collect them. Since this may be useful to others, I have included a version based on this question. Although REGEXP_SUBSTR returns only one value, Oracle also provides REGEXP_COUNT to tell you how many matches there are in a given row, so you can join this against a list of indexes and select each match in turn, as follows (the input is reduced to a single row at the top as "source_table"):

 WITH source_table AS (
   SELECT 'Txa233141b Ta233141 Ta233142 Ta233147 Ta233148' AS free_text
   FROM dual
 )
 , source AS (
   SELECT cnt
        , free_text
   FROM (
     SELECT RegExp_Count(free_text, '(^|\s)[A-Za-z]{2}[0-9]{5,}(\s|$)') AS cnt
          , free_text
     FROM source_table
   )
   WHERE cnt > 0
 )
 , iota AS (
   SELECT RowNum AS idx
   FROM dual
   CONNECT BY RowNum <= ( SELECT Max(cnt) FROM source )
 )
 SELECT UNIQUE RegExp_SubStr(s.free_text, '(^|\s)[A-Za-z]{2}[0-9]{5,}(\s|$)', 1, i.idx) AS result
 FROM source s
 JOIN iota i ON ( i.idx <= s.cnt )
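One caveat may be worth checking with this pattern. Each match consumes its trailing whitespace, and Oracle resumes the search after the end of the previous match, so a token that directly follows a match may have no leading delimiter left to match against. The query below is only a diagnostic sketch and is not part of the answer above. It compares the match count on the raw string with the count on a copy in which every space is doubled, so that neighbouring tokens no longer share one delimiter. The column aliases cnt_raw and cnt_padded are made up for the example.

```sql
-- Sketch: does the delimiter-consuming pattern skip adjacent tokens?
WITH source_table AS (
  SELECT 'Txa233141b Ta233141 Ta233142 Ta233147 Ta233148' AS free_text
  FROM   dual
)
SELECT RegExp_Count(free_text,
                    '(^|\s)[A-Za-z]{2}[0-9]{5,}(\s|$)') AS cnt_raw
       -- same pattern, but every single space replaced by two spaces
     , RegExp_Count(Replace(free_text, ' ', '  '),
                    '(^|\s)[A-Za-z]{2}[0-9]{5,}(\s|$)') AS cnt_padded
FROM   source_table;
```

If cnt_padded comes out larger than cnt_raw, matches are being skipped. Running the extraction against the padded string and applying TRIM to each result is one possible way around that.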

+1

Steven cochran May 10, '17 at 13:07

I adapted @Alex Poole's answer to support multiple source rows and to run faster:

 with templates as (
   select '\w+' regexp from dual
 )
 select regexp_substr(str, templates.regexp, 1, level) substr
 from (
   select 1 id, 'Txa233141b Ta233141 Ta233142 Ta233147 Ta233148' as str from dual
   union
   select 2 id, '2 22222222222222Ta233141 2Ta233142 2Ta233147' as str from dual
   union
   select 3 id, '3Txa233141b 3Ta233141 3Ta233142' as str from dual
 )
 join templates on 1 = 1
 connect by id = connect_by_root id
        and regexp_instr(str, templates.regexp, 1, level) > 0
 order by id, level

Source rows:

 ID STR
 -- ----------------------------------------------
  1 Txa233141b Ta233141 Ta233142 Ta233147 Ta233148
  2 2 22222222222222Ta233141 2Ta233142 2Ta233147
  3 3Txa233141b 3Ta233141 3Ta233142

Result:

 Txa233141b
 Ta233141
 Ta233142
 Ta233147
 Ta233148
 2
 22222222222222Ta233141
 2Ta233142
 2Ta233147
 3Txa233141b
 3Ta233141
 3Ta233142

0

David E. Veliev Dec 9 '16 at 20:59

Below is a simple solution for your question.

 SELECT REGEXP_SUBSTR('Txa233141b Ta233141 Ta233142 Ta233147 Ta233148', '([a-zA-Z0-9]+\s?){1,}') "REGEXP_SUBSTR" FROM DUAL;

-1

Mohammed zubair Dec 04 '15 at 19:02

Alex Poole · Accepted Answer · 2013-07-11T15:08:31+0000

REGEXP_SUBSTR returns only one value. You can turn your string into a pseudo-table and then query that for matches. There is an XML-based way of doing this that escapes me at the moment, but CONNECT BY works as long as you only have one source string:

 SELECT REGEXP_SUBSTR(str, '[^ ]+', 1, LEVEL) AS substr
 FROM (
   SELECT 'Txa233141b Ta233141 Ta233142 Ta233147 Ta233148' AS str
   FROM DUAL
 )
 CONNECT BY LEVEL <= LENGTH(REGEXP_REPLACE(str, '[^ ]+')) + 1;

... gives you:

 SUBSTR
 --------------------
 Txa233141b
 Ta233141
 Ta233142
 Ta233147
 Ta233148

... and you can filter that with a slightly simpler version of the original pattern:

 SELECT substr
 FROM (
   SELECT REGEXP_SUBSTR(str, '[^ ]+', 1, LEVEL) AS substr
   FROM (
     SELECT 'Txa233141b Ta233141 Ta233142 Ta233147 Ta233148' AS str
     FROM DUAL
   )
   CONNECT BY LEVEL <= LENGTH(REGEXP_REPLACE(str, '[^ ]+')) + 1
 )
 WHERE REGEXP_LIKE(substr, '^[A-Za-z]{2}[0-9]{5,}$');

 SUBSTR
 --------------------
 Ta233141
 Ta233142
 Ta233147
 Ta233148

It is not very pretty, but it returns each match in its own row rather than packing several values into one field.
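If a single space-separated string is wanted after all, as in the question, the filtered rows can be stitched back together. The sketch below assumes Oracle 11g Release 2 or later, where LISTAGG is available. The pos and matches aliases are introduced here for illustration and are not part of the answer above.

```sql
-- Sketch: split on spaces, keep only tokens that fit the pattern,
-- then aggregate them back into one string in their original order.
SELECT LISTAGG(substr, ' ') WITHIN GROUP (ORDER BY pos) AS matches
FROM (
  SELECT REGEXP_SUBSTR(str, '[^ ]+', 1, LEVEL) AS substr,
         LEVEL AS pos  -- token position, used to preserve the order
  FROM (
    SELECT 'Txa233141b Ta233141 Ta233142 Ta233147 Ta233148' AS str
    FROM DUAL
  )
  CONNECT BY LEVEL <= LENGTH(REGEXP_REPLACE(str, '[^ ]+')) + 1
)
WHERE REGEXP_LIKE(substr, '^[A-Za-z]{2}[0-9]{5,}$');
```

Keeping the token position alongside each substring is what lets the aggregate preserve the order in which the matches appeared.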
