I've been banging my head against a wall for hours now and am looking for some help. To simplify my issue, I have two arrays, one which contains wildcards, and the other which makes use of those wildcards:

$WildCardArray = @("RED-*.htm", "*.yellow", "BLUE!.txt", "*.green", "*.purple")
$SpelledOutArray = @("RED-123.htm", "456.yellow", "BLUE!.txt",  "789.green", "purple.102", "orange.abc")

I cannot get PowerShell to recognize that these match.

My end goal is to have an output that tells me that purple.102 and orange.abc are not in $WildCardArray.

Seems super simple! Some of the things I've tried:

$WildCardArray = @("RED-*.htm", "*.yellow", "BLUE!.txt", "*.green", "*.purple")
$SpelledOutArray = @("RED-123.htm", "456.yellow", "BLUE!.txt",  "789.green", "purple.102", "orange.abc")
foreach($Item in $SpelledOutArray)
{
$item | where {$wildcardarray -contains $item}
}

I get BLUE!.txt as a result because it's my control with no wildcards. If I change that to -notcontains, I get all of the results returned except BLUE!.txt. I've tried contains, match, equals, like and all of their opposites, compare-object, and nothing works. I get no errors, I just do not get the expected results.

I tried replacing "*" with [a-zA-Z] and other combinations, but it replaces it literally, and not as a wildcard. I'm not sure what I'm doing wrong.... PSVersion 5.1 Win 10

Does anybody know the logic behind WHY like/match/contains doesn't work, and what I can do to make it work? It doesn't have to be pretty, it just needs to work.

  • -contains looks for an exact match, and I'm pretty sure it treats * as a character without any special meaning. Commented Dec 12, 2016 at 18:32
  • $WildCardArray | ForEach-Object {$Wildcard = $_ ; $SpelledOutArray | Where-Object {$_ -like $WildCard}} Commented Dec 12, 2016 at 18:42
  • @beatcracker that gives the wrong output. Commented Dec 12, 2016 at 19:06
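To illustrate the first comment, here is a minimal sketch of how the two operators treat the same strings. The variable names other than $WildCardArray are made up for this illustration; the point is only that -contains compares elements for equality, while -like reads its right-hand operand as the wildcard pattern.

```powershell
$WildCardArray = @("RED-*.htm", "*.yellow", "BLUE!.txt", "*.green", "*.purple")

# -contains tests element equality, so "*" is just an ordinary character here
$containsWildcardHit = $WildCardArray -contains "RED-123.htm"   # False
$containsLiteralHit  = $WildCardArray -contains "BLUE!.txt"     # True

# -like interprets the RIGHT-hand operand as the wildcard pattern
$likeHit = "RED-123.htm" -like "RED-*.htm"                      # True

# Swapped operands: "RED-*.htm" is now the plain input string, so no match
$likeReversed = "RED-*.htm" -like "RED-123.htm"                 # False
```

So the pattern has to sit on the right of -like, and each item has to be tested against each pattern individually.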

3 Answers

banging my head against a wall for hours [..] Seems super simple!

That's probably a hint that it's not super simple. You're trying to cross match two lists: red to red, yellow, blue...; then yellow to red, yellow, blue...; then blue to red, yellow, blue... With 6 items and 5 wildcards that is 30 comparisons, but you only have 6 loop iterations happening.

You need more.

$WildCardArray = @("RED-*.htm", "*.yellow", "BLUE!.txt", "*.green", "*.purple")
$SpelledOutArray = @("RED-123.htm", "456.yellow", "BLUE!.txt",  "789.green", "purple.102", "orange.abc")

# Loop once over the spelled out items
foreach($Item in $SpelledOutArray)
{
    # for each one, loop over the entire WildCard array and check for matches
    $WildCardMatches = foreach ($WildCard in $WildCardArray)
    { 
        if ($item -like $WildCard) {
            $Item
        }
    }

    # Now see if there were any wildcard matches for this SpelledOut Item or not
    if (-not $WildCardMatches)
    {
        $Item 
    }
}

The inner loop over $WildCardArray can become a filter, but you have to filter the array, not the individual item as your code does.

$WildCardArray = @("RED-*.htm", "*.yellow", "BLUE!.txt", "*.green", "*.purple")
$SpelledOutArray = @("RED-123.htm", "456.yellow", "BLUE!.txt",  "789.green", "purple.102", "orange.abc")

foreach($Item in $SpelledOutArray)
{
   $WildCardMatches = $wildcardarray | Where { $item -like $_ }

   if (-not $WildCardMatches)
   {
       $Item 
   }
}

And I guess you could mash that down into an unclear double-where filter if you had to.

$WildCardArray = @("RED-*.htm", "*.yellow", "BLUE!.txt", "*.green", "*.purple")
$SpelledOutArray = @("RED-123.htm", "456.yellow", "BLUE!.txt",  "789.green", "purple.102", "orange.abc")

$SpelledOutArray |Where {$item=$_; -not ($WildCardArray |Where {$item -like $_}) }

4 Comments

This works perfectly, thank you! Do you know, how does your solution hold up under large amounts of data? If for example, the $WildCardArray were to contain 1000 items and $SpelledOutArray upwards of 100k?
@Nick It would work, but probably slowly; it's O(N*M) runtime. You might be better off if you could combine all your wildcards into one regex, e.g. @("RED-123.htm", "456.yellow", "BLUE!.txt", "789.green", "purple.102", "orange.abc") -notmatch '^(RED-.*\.htm|.*\.yellow|BLUE!\.txt|.*\.green|.*\.purple)$' does it for your example (the parentheses make ^ and $ apply to every alternative, not just the first and last). But you said your example was simplified, so that might not be easy / possible for whatever you're really doing.
@TessellatingHeckler: Funny you should mention it; that's what I went with in my answer ;-)
Taking a list of file names/extensions from a website via invoke-webrequest, storing that in an array (WildCardArray), and then querying thousands of shares recursively (SpelledOutArray), possibly millions of files to see if any files match anything from the webrequest all in a foreach -parallel workflow... so when I said simplified I meant super simplified :( As long as it works I'm not super worried about time.
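For the larger data sets discussed in these comments, one variation on the nested loop is to build the pattern objects once and stop at the first match. The sketch below is not from either answer: it assumes the .NET class System.Management.Automation.WildcardPattern (the wildcard type that ships with PowerShell) and uses made-up variable names ($patterns, $unmatched). Whether it is actually faster on real data was not measured here.

```powershell
$WildCardArray   = @("RED-*.htm", "*.yellow", "BLUE!.txt", "*.green", "*.purple")
$SpelledOutArray = @("RED-123.htm", "456.yellow", "BLUE!.txt", "789.green", "purple.102", "orange.abc")

# Build one WildcardPattern object per wildcard, once, up front (assumption: PS 5+ for ::new)
$patterns = foreach ($wildcard in $WildCardArray) {
    [System.Management.Automation.WildcardPattern]::new(
        $wildcard,
        [System.Management.Automation.WildcardOptions]::IgnoreCase)
}

# Collect every item that matches none of the patterns
$unmatched = foreach ($item in $SpelledOutArray) {
    $hit = $false
    foreach ($pattern in $patterns) {
        if ($pattern.IsMatch($item)) { $hit = $true; break }   # stop at the first match
    }
    if (-not $hit) { $item }
}

$unmatched   # purple.102 and orange.abc
```

This is still O(N*M) in the worst case; the early break only helps for items that do match something.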

Your wildcard array is effectively a list of patterns to look for. You can turn this into a single regex and match against that:

$WildCardArray = @("RED-*.htm", "*.yellow", "BLUE!.txt", "*.green", "*.purple")
$SpelledOutArray = @("RED-123.htm", "456.yellow", "BLUE!.txt",  "789.green", "purple.102", "orange.abc")

# Turn wildcards into regexes
# First escape all characters that might cause trouble in regexes (leaving out those we care about)
$escaped = $WildcardArray -replace '[ #$()+.[\\^{|]','\$&' # list taken from Regex.Escape; | included so it cannot split the combined regex
# replace wildcards with their regex equivalents
$regexes = $escaped -replace '\*','.*' -replace '\?','.'
# combine them into one regex
$singleRegex = ($regexes | %{ '^' + $_ + '$' }) -join '|'

# match against that regex
$SpelledOutArray -notmatch $singleRegex

This has the potential to be faster than checking everything in a loop, although I didn't test it. Extraordinarily long regexes may cause trouble as well.
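The same idea can be wrapped in a function for reuse. The following is a rough sketch rather than the code above: the function name ConvertTo-SingleRegex is hypothetical, and instead of a hand-written character class it leans on [regex]::Escape and then turns the escaped \* and \? back into their regex equivalents. It only handles * and ?; bracket wildcards such as [a-z] end up escaped as literal text.

```powershell
# Hypothetical helper: turn a list of simple wildcards into one anchored regex
function ConvertTo-SingleRegex {
    param([string[]] $Wildcard)

    $alternatives = foreach ($w in $Wildcard) {
        # Escape everything, then map the escaped wildcards back to regex
        [regex]::Escape($w) -replace '\\\*', '.*' -replace '\\\?', '.'
    }

    # One non-capturing group so ^ and $ apply to every alternative
    '^(?:' + ($alternatives -join '|') + ')$'
}

$WildCardArray   = @("RED-*.htm", "*.yellow", "BLUE!.txt", "*.green", "*.purple")
$SpelledOutArray = @("RED-123.htm", "456.yellow", "BLUE!.txt", "789.green", "purple.102", "orange.abc")

$singleRegex = ConvertTo-SingleRegex -Wildcard $WildCardArray
$notMatched  = $SpelledOutArray -notmatch $singleRegex
$notMatched   # purple.102 and orange.abc
```

Because the dots are escaped, a wildcard like *.yellow only matches names that really end in ".yellow".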

3 Comments

I've been messing around with this solution and have a question. It doesn't seem to take the "."'s literally, meaning that a wildcard of *.yellow will pull "123.greenyellow" . Is this a quirk of regex? I looked up the wildcards but can't seem to figure it out.
@Nick: Sorry, my escaping replacement actually removed all those characters instead of escaping them. I fixed that now.
Brilliant! Thanks for this pattern. TIP: This is an easy pattern to wrap in a function for easier reuse. Feel free to comment here if you have a library or example implementation of $ArrayToCheck -notmatch ToSingleRegex($ComparisonStringArray).

$WildCardArray = @("RED-*.htm", "*.yellow", "BLUE!.txt", "*.green", "*.purple")
$SpelledOutArray = @("RED-123.htm", "456.yellow", "BLUE!.txt",  "789.green", "purple.102", "orange.abc")

$WildCardArray | %{$str=$_; $SpelledOutArray | ? {$_ -like $str}  }

Another solution, though not as short:

$WildCardArray | 
   %{$current=$_; $SpelledOutArray | %{ [pscustomobject]@{wildcard=$current; value=$_ }}} | 
        where {$_.value -like $_.wildcard } 

1 Comment

This gives the wrong output... "My end goal is to have an output that tells me that purple.102 and orange.abc are not in $WildCardArray." but your code outputs RED-*.htm *.yellow BLUE!.txt *.green
