1

I've got a table with integers in column A and strings in column B, like:

+---------+-----------------+
| columnA | columnB         | 
+---------+-----------------+
| 32      | 1,8,12,32       | <--
| 16      | 1,1,2,9,2,7     | 
| 321     | 3,10,56,111,321 | <--
+---------+-----------------+

Is there a simple way to select rows where columnB ends with the value from columnA?

3
  • WHERE columnB LIKE CONCAT('%', t.columnA) Commented Mar 19, 2020 at 17:32
  • See also stackoverflow.com/questions/3653462/… Commented Mar 19, 2020 at 17:41
  • NO. Most of the solutions, including the "dup", would incorrectly allow "21" to match the last case. Suggesting reopening. Commented Mar 20, 2020 at 3:33

4 Answers 4

2

I agree with Gordon's rant against storing a list that way.

FIND_IN_SET() checks for the integer being anywhere in the commalist.

RIGHT() won't check for a suitable boundary. So, "21" would match "3,10,56,111,321". As I understand the Question, only "321" should match.

RIGHT(), plus prefixing with a ",", would have "321" match "3,10,56,111,321" but fail when the whole list is just "321".
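
To make the boundary problem concrete, here is a small Python sketch that mimics the two RIGHT()-style checks outside the database. The function names are made up for illustration, and it assumes the list has no spaces around the commas.

```python
def plain_suffix(a: int, b: str) -> bool:
    # Mimics RIGHT(columnB, LENGTH(columnA)) = columnA: no boundary check.
    return b.endswith(str(a))


def comma_suffix(a: int, b: str) -> bool:
    # Mimics the same test with a ',' put in front of the value.
    return b.endswith("," + str(a))


print(plain_suffix(21, "3,10,56,111,321"))   # True: false positive
print(comma_suffix(21, "3,10,56,111,321"))   # False
print(comma_suffix(321, "3,10,56,111,321"))  # True
print(comma_suffix(321, "321"))              # False: single-item list is missed
```

Neither check alone covers both edge cases, which is why a boundary-aware test is needed.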

Before 8.0, "[[:<:]]321$" could be constructed to use as a regexp; \\b cannot be used there.

MySQL 8.0 would not like the above regexp, but could use "\\b321$".

So...

Plan A: Combine some of the above tests and hope you have not left out an edge case.

Plan B: Do as Gordon suggested: fix the schema.

OK, I think this might be the shortest:

WHERE SUBSTRING_INDEX(colB, ',', -1) = colA

mysql> SELECT SUBSTRING_INDEX('321', ',', -1) = '321';
+-----------------------------------------+
| SUBSTRING_INDEX('321', ',', -1) = '321' |
+-----------------------------------------+
|                                       1 |
+-----------------------------------------+
mysql> SELECT SUBSTRING_INDEX('321', ',', -1) = '21';
+----------------------------------------+
| SUBSTRING_INDEX('321', ',', -1) = '21' |
+----------------------------------------+
|                                      0 |
+----------------------------------------+
mysql> SELECT SUBSTRING_INDEX('7,321', ',', -1) = '321';
+-------------------------------------------+
| SUBSTRING_INDEX('7,321', ',', -1) = '321' |
+-------------------------------------------+
|                                         1 |
+-------------------------------------------+
mysql> SELECT SUBSTRING_INDEX('7,321,11', ',', -1) = '321';
+----------------------------------------------+
| SUBSTRING_INDEX('7,321,11', ',', -1) = '321' |
+----------------------------------------------+
|                                            0 |
+----------------------------------------------+
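
For checking the same logic outside MySQL, the last-item comparison can be sketched in Python. `last_item` and `ends_with_value` are hypothetical helpers that only imitate SUBSTRING_INDEX(str, ',', -1); they are not part of any MySQL API. The sketch compares strings, so it assumes the list contains no spaces or leading zeros.

```python
def last_item(csv: str) -> str:
    # Everything after the final comma; the whole string if there is no comma.
    return csv.rsplit(",", 1)[-1]


def ends_with_value(a: int, b: str) -> bool:
    # True only when the last list item is exactly the value.
    return last_item(b) == str(a)


print(ends_with_value(321, "321"))       # True
print(ends_with_value(21, "321"))        # False
print(ends_with_value(321, "7,321"))     # True
print(ends_with_value(321, "7,321,11"))  # False
```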

1 Comment

Thanks for the answer. This edge case applied to just 170 of around 1.3M results, but it is still valid. By the way, query execution time dropped from ~1000 sec to ~60 sec.
0

While writing this question I got an idea for an answer using concatenation.

SELECT * FROM myTable t WHERE columnB REGEXP CONCAT(t.columnA, '$');

1 Comment

But "21" would satisfy the last case.
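
The comment's point can be checked outside the database with Python's `re` module. This is only a sketch: `unanchored` and `bounded` are illustrative names, and the `(?:^|,)` prefix is one possible way to add the missing left boundary, not something taken from the answer.

```python
import re


def unanchored(a: int, b: str) -> bool:
    # Same idea as REGEXP CONCAT(columnA, '$'): anchored on the right only.
    return re.search(f"{a}$", b) is not None


def bounded(a: int, b: str) -> bool:
    # Require start-of-string or a comma immediately before the value.
    return re.search(rf"(?:^|,){re.escape(str(a))}$", b) is not None


print(unanchored(21, "3,10,56,111,321"))  # True: false positive
print(bounded(21, "3,10,56,111,321"))     # False
print(bounded(321, "3,10,56,111,321"))    # True
print(bounded(321, "321"))                # True: single-item list still matches
```

In MySQL the same pattern could presumably be built with CONCAT('(^|,)', t.columnA, '$'), though that has not been tested here.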
0

You should fix your data model! Do not store numbers as strings! Do not store multiple values in a string column!

That said, sometimes we are stuck with other people's really, really bad decisions. MySQL has a useful function, find_in_set(), for this case:

where find_in_set(columnA, columnB) > 0
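
As a rough illustration of what this test does and does not check, here is a Python imitation of the function. It is a simplified sketch (it ignores NULL handling and other MySQL details), written only to show the behaviour on lists like the ones in the question.

```python
def find_in_set(needle: str, csv: str) -> int:
    # Rough imitation of MySQL's FIND_IN_SET: 1-based position of the
    # first occurrence, or 0 if the value is not in the list.
    items = csv.split(",")
    return items.index(needle) + 1 if needle in items else 0


print(find_in_set("32", "1,8,12,32"))        # 4
print(find_in_set("12", "1,8,12,32"))        # 3: found, although it is not last
print(find_in_set("21", "3,10,56,111,321"))  # 0: no partial matches
print(find_in_set("1", "2,1,7,4,1"))         # 2: first occurrence, not the last
```

So the test avoids the "21" versus "321" problem, but it matches a value anywhere in the list, and with duplicates the returned position is not a reliable way to tell whether the value is the final item.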

4 Comments

Thanks for the answer, I'm working on translating an old database.
OP wants to check for a match at the end only.
I don't agree with "Do not store multiple values in a string column". There is nothing wrong with storing a delimited list of values, especially in cases where there can be a large or arbitrary number of values. The only alternative would be to have tables with HUGE numbers of columns, which would waste a LOT of memory and be difficult to manage, especially if only a few rows use all the columns. I would prefer a delimited set of values over a table with COLUMN1 ... COLUMN100 in it. With the delimited set you only select the one column, then split the data on the delimiter. Should be safe.
I don’t see why you need a lot of columns. Another table with id for the row, and value.
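
The child-table idea from the last comment can be sketched with Python's built-in sqlite3 module. The table and column names (`main_row`, `list_item`, `row_id`, `pos`, `value`) are invented for this example, and the `pos` column is an added assumption: without it there is no way to say which value is "last". SQLite stands in for MySQL here only so the sketch is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_row (id INTEGER PRIMARY KEY, columnA INTEGER NOT NULL);
    CREATE TABLE list_item (
        row_id INTEGER NOT NULL REFERENCES main_row(id),
        pos    INTEGER NOT NULL,   -- order of the value within the old list
        value  INTEGER NOT NULL,
        PRIMARY KEY (row_id, pos)
    );
""")

# Sample rows from the question, split into one value per child row.
samples = [(32, "1,8,12,32"), (16, "1,1,2,9,2,7"), (321, "3,10,56,111,321")]
for row_id, (a, b) in enumerate(samples, start=1):
    conn.execute("INSERT INTO main_row VALUES (?, ?)", (row_id, a))
    conn.executemany(
        "INSERT INTO list_item VALUES (?, ?, ?)",
        [(row_id, pos, int(v)) for pos, v in enumerate(b.split(","), start=1)],
    )

# Rows whose highest-position value equals columnA.
matches = conn.execute("""
    SELECT m.columnA
    FROM main_row AS m
    JOIN list_item AS li ON li.row_id = m.id
    WHERE li.value = m.columnA
      AND li.pos = (SELECT MAX(pos) FROM list_item WHERE row_id = m.id)
    ORDER BY m.id
""").fetchall()
print(matches)  # [(32,), (321,)]
```

With integers stored one per row, the comparison is a plain equality and the "21" versus "321" problem cannot occur.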
0

I am a little embarrassed to say I couldn't find a way to do this correctly with pure SQL. It seems completely obvious that I would want to use the data from one field as a regex in another, but there seems to be no straightforward way to do this if it is possible at all.

I had a similar problem with this question and chose a similar path: "How can I store a regex pattern in a MySQL field and check an input against it?"

I tried the other solutions and they seemed to work for simple cases, but I am not sure how they would scale. Also many of them had the problem mentioned above where 21 would also match 321 etc. This type of thing is trivial to solve correctly in Perl. The regex to match this is only...

$columnB =~ /,$columnA$/;

columnA has to be anchored at the end of columnB with the delimiter , in front. That way it won't mistakenly match substrings. So I decided to hand write a solution in Perl myself. The script first drops the table if it is there, creates the table, inserts the data, then runs the regex check on each row from the database using DBI.

Here is the code...

#!/usr/bin/perl

use strict;
use warnings;
use DBI;
#pass all database information as command line argument
my ($database,$host,$port,$user,$pass) = (shift,shift,shift,shift, shift);
my $dsn = "DBI:mysql:database=$database;host=$host;port=$port";
my $dbh = DBI->connect($dsn,$user,$pass) or die "Connection Error: $DBI::errstr\n";

my $createTableSql = "
    CREATE TABLE `numberCheck` (
      `columnA` int(32) NOT NULL,
      `columnB` varchar(256) NOT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;";

#I inserted a few test rows for debugging
my $insertRowsSql = "INSERT INTO `numberCheck` (`columnA`, `columnB`)
    VALUES
        (32,  '1,8,12,32'  ),
        (16,  '1,1,2,9,2,7'),
        (321, '3,10,56,111,321'),
        (21,  '3,10,56,111,321'),
        (1,   '2,1,7,4,1'),
        (2,   '2,1,3,5,8'),
        (3,   '2,5,7,7,3'),
        (4,   '4,4,2,6,7'),
        (5,   '1,5,3,2,5'),
        (6,   '1,1,3,4,3'),
        (7,   '2,6,7,1,7');
";

#drop table numberCheck first; IF EXISTS keeps this from failing when the table is not there yet
$dbh->do("DROP TABLE IF EXISTS numberCheck");
$dbh->do($createTableSql);
$dbh->do($insertRowsSql);

my @matchingRows;

# now retrieve data from the table.
my $sth = $dbh->prepare("SELECT * FROM numberCheck");
$sth->execute();
while (my $ref = $sth->fetchrow_hashref()) {
  my $columnA = $ref->{columnA};
  my $columnB = $ref->{columnB};
  if($columnB =~ /,$columnA$/ ){
    print "MATCH: columnA: $ref->{columnA} columnB: $ref->{columnB}\n";
    push(@matchingRows,$ref->{columnA});
  } else{
    print "\tDID NOT MATCH: columnA: $ref->{columnA} columnB: $ref->{columnB}\n";
  }
}

my $matchingRowsSql = "select * from numberCheck where columnA in (" . join(',',@matchingRows) . ");";
print "\nQuery to get matching rows:\n$matchingRowsSql\n\n";

# \b keeps the substitution from touching "in" inside another word
my $nonMatchingRowsSql = $matchingRowsSql =~ s/\bin\b/not in/r;
print "Query to get non matching rows:\n$nonMatchingRowsSql\n";

$sth->finish();

Seems to be working. Here is the output...

$ perl run.regex.on.column.with.data.from.another.column.pl DATABASE HOST PORT USER PASS

MATCH: columnA: 32 columnB: 1,8,12,32
    DID NOT MATCH: columnA: 16 columnB: 1,1,2,9,2,7
MATCH: columnA: 321 columnB: 3,10,56,111,321
    DID NOT MATCH: columnA: 21 columnB: 3,10,56,111,321
MATCH: columnA: 1 columnB: 2,1,7,4,1
    DID NOT MATCH: columnA: 2 columnB: 2,1,3,5,8
MATCH: columnA: 3 columnB: 2,5,7,7,3
    DID NOT MATCH: columnA: 4 columnB: 4,4,2,6,7
MATCH: columnA: 5 columnB: 1,5,3,2,5
    DID NOT MATCH: columnA: 6 columnB: 1,1,3,4,3
MATCH: columnA: 7 columnB: 2,6,7,1,7

Query to get matching rows:
select * from numberCheck where columnA in (32,321,1,3,5,7);

Query to get non matching rows:
select * from numberCheck where columnA not in (32,321,1,3,5,7);

I ran the resulting queries in mysqlsh and got the following result...

+---------+-----------------+
| columnA | columnB         |
+---------+-----------------+
|      32 | 1,8,12,32       |
|     321 | 3,10,56,111,321 |
|       1 | 2,1,7,4,1       |
|       3 | 2,5,7,7,3       |
|       5 | 1,5,3,2,5       |
|       7 | 2,6,7,1,7       |
+---------+-----------------+
+---------+-----------------+
| columnA | columnB         |
+---------+-----------------+
|      16 | 1,1,2,9,2,7     |
|      21 | 3,10,56,111,321 |
|       2 | 2,1,3,5,8       |
|       4 | 4,4,2,6,7       |
|       6 | 1,1,3,4,3       |
+---------+-----------------+

If I run across an easier way to do it I will update the answer. I saw some stuff using user-defined variables that looks promising, but I didn't get anything working that way yet.
