select records where regex equals other column

Question

I've got a table with integers in column A and strings in column B, like:

+---------+-----------------+
| columnA | columnB         | 
+---------+-----------------+
| 32      | 1,8,12,32       | <--
| 16      | 1,1,2,9,2,7     | 
| 321     | 3,10,56,111,321 | <--
+---------+-----------------+

Is there a simple way to select rows where columnB ends with the value from columnA?

NO. Most of the solutions, including the "dup", would incorrectly allow "21" to match the last case. Suggesting reopening.
– Rick James, Commented Mar 20, 2020 at 3:33

Rick James · Accepted Answer · 2020-03-20 03:49:40Z

I agree with Gordon's rant against storing a list that way.

FIND_IN_SET() checks for the integer being anywhere in the commalist.

RIGHT() won't check for a suitable boundary. So, "21" would match "3,10,56,111,321". As I understand the Question, only "321" should match.

RIGHT(), plus prefixing with a ',', would have "321" match "3,10,56,111,321" but fail with "321" (a list holding a single item).
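The three pitfalls described so far can be sketched outside the database. This is a minimal Python illustration, not MySQL itself: the helper names `find_in_set`, `right_equals`, and `right_with_comma` are hypothetical stand-ins that only approximate the SQL functions on plain strings.

```python
def find_in_set(needle: str, csv: str) -> bool:
    """Rough stand-in for FIND_IN_SET(needle, csv) > 0: true anywhere in the list."""
    return needle in csv.split(",")


def right_equals(needle: str, csv: str) -> bool:
    """Rough stand-in for RIGHT(csv, LENGTH(needle)) = needle: a plain suffix test."""
    return csv.endswith(needle)


def right_with_comma(needle: str, csv: str) -> bool:
    """Suffix test with a ',' prefixed to the needle."""
    return csv.endswith("," + needle)


cases = [
    ("32", "1,8,12,32"),        # genuine last item
    ("2", "1,1,2,9,2,7"),       # present, but not last
    ("21", "3,10,56,111,321"),  # only a suffix of the last item
    ("321", "321"),             # single-item list
]

for needle, csv in cases:
    print(
        needle,
        csv,
        find_in_set(needle, csv),
        right_equals(needle, csv),
        right_with_comma(needle, csv),
    )
```

On these four cases each helper gets at least one wrong: the membership test accepts "2", the plain suffix test accepts "21", and the comma-prefixed test rejects the single-item list.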

Before 8.0, "[[:<:]]321$" could be constructed to use as a regexp; \\b cannot be used.

MySQL 8.0 would not like the above regexp, but could use "\\b321$".

So...

Plan A: Combine some of the above tests and hope you have not left out an edge case.

Plan B: Do as Gordon suggested: fix the schema.

OK, I think this might be the shortest:

WHERE SUBSTRING_INDEX(colB, ',', -1) = colA

mysql> SELECT SUBSTRING_INDEX('321', ',', -1) = '321';
+-----------------------------------------+
| SUBSTRING_INDEX('321', ',', -1) = '321' |
+-----------------------------------------+
|                                       1 |
+-----------------------------------------+
+----------------------------------------+
| SUBSTRING_INDEX('321', ',', -1) = '21' |
+----------------------------------------+
|                                      0 |
+----------------------------------------+
+-------------------------------------------+
| SUBSTRING_INDEX('7,321', ',', -1) = '321' |
+-------------------------------------------+
|                                         1 |
+-------------------------------------------+
+----------------------------------------------+
| SUBSTRING_INDEX('7,321,11', ',', -1) = '321' |
+----------------------------------------------+
|                                            0 |
+----------------------------------------------+
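For readers who want to check the logic without a MySQL session, here is a small Python sketch of the same idea. The function names are hypothetical, and `rsplit` is only an approximation of SUBSTRING_INDEX(colB, ',', -1). It also compares strings exactly, whereas MySQL may coerce types when comparing a string with an integer column, so this assumes the list items carry no whitespace or leading zeros.

```python
def substring_index_last(csv: str, delim: str = ",") -> str:
    """Approximate SUBSTRING_INDEX(csv, delim, -1): text after the final delimiter.

    If the delimiter does not occur, the whole string is returned.
    """
    return csv.rsplit(delim, 1)[-1]


def ends_with_value(col_a: int, col_b: str) -> bool:
    """True when the last list item of col_b equals col_a (string comparison)."""
    return substring_index_last(col_b) == str(col_a)


rows = [
    (32, "1,8,12,32"),
    (16, "1,1,2,9,2,7"),
    (321, "3,10,56,111,321"),
    (21, "3,10,56,111,321"),
]
selected = [row for row in rows if ends_with_value(*row)]
print(selected)
```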

Thanks for the answer. This edge case applied to just 170 of around 1.3M results, but it is still VALID, and by the way, query execution time dropped from ~1000 sec to ~60 sec.

marcin.qxv · Accepted Answer · 2020-03-19 17:30:38Z

While writing this question I got an idea for an answer using concatenation.

SELECT * FROM myTable t WHERE columnB REGEXP CONCAT(t.columnA, '$');

1 Comment

Rick James Over a year ago

But "21" would satisfy the last case.
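The false positive is easy to reproduce with any regex engine. The sketch below uses Python's `re` module as a stand-in, so it is not MySQL's REGEXP, and the function names are hypothetical. It contrasts the pattern built by concatenation with one that adds a word boundary in front, in the spirit of the "\\b321$" form mentioned in the other answer.

```python
import re


def unanchored_suffix(col_a: int, col_b: str) -> bool:
    """Mirrors columnB REGEXP CONCAT(columnA, '$'): only the end is anchored."""
    return re.search(f"{col_a}$", col_b) is not None


def boundary_suffix(col_a: int, col_b: str) -> bool:
    """Same pattern with a word boundary in front of the number."""
    return re.search(rf"\b{col_a}$", col_b) is not None


col_b = "3,10,56,111,321"
print(unanchored_suffix(21, col_b))  # 21 matches the tail of 321
print(boundary_suffix(21, col_b))    # the boundary rejects it
print(boundary_suffix(321, col_b))
```

Because a comma is not a word character, the boundary falls exactly between "," and the first digit of the last item, which is what keeps "21" from matching inside "321".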

Gordon Linoff · Accepted Answer · 2020-03-19 17:32:11Z

You should fix your data model! Do not store numbers as strings! Do not store multiple values in a string column!

That said, sometimes we are stuck with other people's really, really bad decisions. MySQL has a useful function, find_in_set(), for this case:

where find_in_set(columnA, columnB) > 0

4 Comments

marcin.qxv Over a year ago

Thanks for the answer, I'm working on translating an old database.

Rick James Over a year ago

OP wants to check for a match at the end of the list only.

user3408541 May 24 at 2:18

I don't agree with "Do not store multiple values in a string column". There is nothing wrong with storing a delimited list of values, especially in cases where there can be a large or arbitrary number of values. The only alternative would be to have tables with HUGE numbers of columns, and that will waste a LOT of memory and be difficult to manage, especially if only a few rows use all the columns. I would prefer a delimited set of values over a table with COLUMN1 ... COLUMN100 in it. In the delimited set you only select the one column, then split the data on the delimiter. Should be safe.

Gordon Linoff May 29 at 15:42

I don’t see why you need a lot of columns. Another table with id for the row, and value.
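A minimal sketch of that second-table layout, using Python's built-in `sqlite3` as a stand-in for MySQL. The table and column names (`parent`, `parent_value`, `position`) are hypothetical, and the `position` column is an assumption added here so that "last item" stays well defined once the list is split into rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parent (
        id       INTEGER PRIMARY KEY,
        column_a INTEGER NOT NULL
    );
    CREATE TABLE parent_value (
        parent_id INTEGER NOT NULL REFERENCES parent(id),
        position  INTEGER NOT NULL,  -- order of the item in the original list
        value     INTEGER NOT NULL,
        PRIMARY KEY (parent_id, position)
    );
    """
)

# Split each delimited string into one row per item.
rows = [(32, "1,8,12,32"), (16, "1,1,2,9,2,7"), (321, "3,10,56,111,321")]
for row_id, (col_a, col_b) in enumerate(rows, start=1):
    conn.execute("INSERT INTO parent (id, column_a) VALUES (?, ?)", (row_id, col_a))
    conn.executemany(
        "INSERT INTO parent_value (parent_id, position, value) VALUES (?, ?, ?)",
        [(row_id, pos, int(v)) for pos, v in enumerate(col_b.split(","), start=1)],
    )

# Rows whose last list item equals column_a: a plain integer comparison.
matches = conn.execute(
    """
    SELECT p.column_a
    FROM parent AS p
    JOIN parent_value AS v ON v.parent_id = p.id
    WHERE v.value = p.column_a
      AND v.position = (
          SELECT MAX(position) FROM parent_value WHERE parent_id = p.id
      )
    ORDER BY p.id
    """
).fetchall()
print(matches)  # [(32,), (321,)]
```

With one value per row, the number of items per parent can grow without adding columns, and the "ends with" test no longer needs any string parsing.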

halfer · Accepted Answer · 2025-06-28 14:23:36Z

I am a little embarrassed to say I couldn't find a way to do this correctly with pure SQL. It seems completely obvious that I would want to use the data from one field as a regex in another, but there seems to be no straightforward way to do this if it is possible at all.

I had a similar problem with this question and chose a similar path: "How can I store a regex pattern in a MySQL field and check an input against it?"

I tried the other solutions and they seemed to work for simple cases, but I am not sure how they would scale. Also many of them had the problem mentioned above where 21 would also match 321 etc. This type of thing is trivial to solve correctly in Perl. The regex to match this is only...

$columnB =~ /,$columnA$/;

columnA has to be anchored at the end of columnB with the delimiter , in front. That way it won't mistakenly match substrings. Note that this pattern needs a comma before the value, so a columnB holding only one value would not match. So I decided to hand-write a solution in Perl myself. The script first drops the table if it is there, creates the table, inserts the data, then runs the regex check on each row from the database using DBI.
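A variant of this check that also accepts a columnB with a single item can be sketched by letting the value follow either the start of the string or a comma. The example below is in Python rather than Perl, the function name is hypothetical, and `re.escape` is only a precaution here since columnA is an integer.

```python
import re


def ends_with_item(col_a: int, col_b: str) -> bool:
    """True when col_a is the last comma-separated item of col_b."""
    pattern = rf"(?:^|,){re.escape(str(col_a))}$"
    return re.search(pattern, col_b) is not None


for col_a, col_b in [
    (321, "3,10,56,111,321"),
    (21, "3,10,56,111,321"),
    (321, "321"),
    (2, "2,1,3,5,8"),
]:
    print(col_a, col_b, ends_with_item(col_a, col_b))
```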

Here is the code...

#!/usr/bin/perl
use strict;
use warnings;

use DBI;
# Pass all database connection details as command-line arguments.
my ($database, $host, $port, $user, $pass) = @ARGV;
my $dsn = "DBI:mysql:database=$database;host=$host;port=$port";
my $dbh = DBI->connect($dsn,$user,$pass) or die "Connection Error: $DBI::errstr\n";

my $createTableSql = "
    CREATE TABLE `numberCheck` (
      `columnA` int(32) NOT NULL,
      `columnB` varchar(256) NOT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;";

#I inserted a few test rows for debugging
my $insertRowsSql = "INSERT INTO `numberCheck` (`columnA`, `columnB`)
    VALUES
        (32,  '1,8,12,32'  ),
        (16,  '1,1,2,9,2,7'),
        (321, '3,10,56,111,321'),
        (21,  '3,10,56,111,321'),
        (1,   '2,1,7,4,1'),
        (2,   '2,1,3,5,8'),
        (3,   '2,5,7,7,3'),
        (4,   '4,4,2,6,7'),
        (5,   '1,5,3,2,5'),
        (6,   '1,1,3,4,3'),
        (7,   '2,6,7,1,7');
";

# Drop the table first if it already exists, then recreate it below.
$dbh->do("DROP TABLE IF EXISTS numberCheck");
$dbh->do($createTableSql);
$dbh->do($insertRowsSql);

my @matchingRows;

# now retrieve data from the table.
my $sth = $dbh->prepare("SELECT * FROM numberCheck");
$sth->execute();
while (my $ref = $sth->fetchrow_hashref()) {
  my $columnA = $ref->{columnA};
  my $columnB = $ref->{columnB};
  if($columnB =~ /,$columnA$/ ){
    print "MATCH: columnA: $ref->{columnA} columnB: $ref->{columnB}\n";
    push(@matchingRows,$ref->{columnA});
  } else{
    print "\tDID NOT MATCH: columnA: $ref->{columnA} columnB: $ref->{columnB}\n";
  }
}

my $matchingRowsSql = "select * from numberCheck where columnA in (" . join(',',@matchingRows) . ");";
print "\nQuery to get matching rows:\n$matchingRowsSql\n\n";

my $nonMatchingRowsSql = $matchingRowsSql =~ s/in/not in/r;
print "Query to get non matching rows:\n$nonMatchingRowsSql\n";

$sth->finish();
$dbh->disconnect();

Seems to be working. Here is the output...

$ perl run.regex.on.column.with.data.from.another.column.pl DATABASE HOST PORT USER PASS

MATCH: columnA: 32 columnB: 1,8,12,32
    DID NOT MATCH: columnA: 16 columnB: 1,1,2,9,2,7
MATCH: columnA: 321 columnB: 3,10,56,111,321
    DID NOT MATCH: columnA: 21 columnB: 3,10,56,111,321
MATCH: columnA: 1 columnB: 2,1,7,4,1
    DID NOT MATCH: columnA: 2 columnB: 2,1,3,5,8
MATCH: columnA: 3 columnB: 2,5,7,7,3
    DID NOT MATCH: columnA: 4 columnB: 4,4,2,6,7
MATCH: columnA: 5 columnB: 1,5,3,2,5
    DID NOT MATCH: columnA: 6 columnB: 1,1,3,4,3
MATCH: columnA: 7 columnB: 2,6,7,1,7

Query to get matching rows:
select * from numberCheck where columnA in (32,321,1,3,5,7);

Query to get non matching rows:
select * from numberCheck where columnA not in (32,321,1,3,5,7);

I ran the resulting queries in mysqlsh and got the following result...

+---------+-----------------+
| columnA | columnB         |
+---------+-----------------+
|      32 | 1,8,12,32       |
|     321 | 3,10,56,111,321 |
|       1 | 2,1,7,4,1       |
|       3 | 2,5,7,7,3       |
|       5 | 1,5,3,2,5       |
|       7 | 2,6,7,1,7       |
+---------+-----------------+
+---------+-----------------+
| columnA | columnB         |
+---------+-----------------+
|      16 | 1,1,2,9,2,7     |
|      21 | 3,10,56,111,321 |
|       2 | 2,1,3,5,8       |
|       4 | 4,4,2,6,7       |
|       6 | 1,1,3,4,3       |
+---------+-----------------+

If I run across an easier way to do it I will update the answer. I saw some stuff using user-defined variables that looks promising, but I didn't get anything working that way yet.
