4

The following are two arrays of strings:

arr1=("aa" "bb" "cc" "dd" "ee")
echo "${#arr1[@]}"  # output => 5
arr2=("cc" "dd" "ee" "ff")
echo "${#arr2[@]}"  # output => 4

The difference of the two arrays is arr_diff=("aa" "bb" "ff"). I can get the difference using the following and other methods from Stack Overflow:

arr_diff=$(echo ${arr1[@]} ${arr2[@]} | tr ' ' '\n' | sort | uniq -u)
# or
arr_diff=$(echo ${arr1[@]} ${arr2[@]} | xargs -n1 | sort | uniq -u)
echo ${arr_diff[@]}  # output => aa bb ff

The point is not to print the difference of the arrays but to get the size of the difference array, so that I can check whether the two arrays have the same elements. However, when I query the size of the difference array, I get the wrong answer:

echo "${#arr_diff[@]}"  # output => 1

I always get 1 as the output, irrespective of the size of the difference array (even when the size should be zero, i.e. when arr1 and arr2 have the same elements).

2
  • 2
    I'm not sure what you're expecting to get here; arr_diff is a plain variable, not an array at all. If you try to treat it as an array, it essentially functions like an array with just a single element (which might be any of "0", "1", "2", etc). So, to the extent that the number of elements in it is defined, it's always going to be 1. What did you expect it to be? Commented Feb 5 at 18:51
  • 1
    By contrast, readarray -t diff_items < <(comm -3 <(printf '%s\n' "${arr1[@]}" | sort -u) <(printf '%s\n' "${arr2[@]}" | sort -u)) will do the right thing as long as none of your array elements contain newlines. (Note that some of the entries in diff_items will have whitespace prepended; this indicates whether the item was only in arr1 or only in arr2). Commented Feb 5 at 19:42
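
The two comments above can be illustrated with a short sketch. It assumes bash 4 or later (for readarray), that no element contains a newline, and that neither array is empty; the names scalar_diff and diff_items are only illustrative. The last step strips the leading tab that comm uses to mark items found only in the second input.

```bash
#!/usr/bin/env bash
arr1=("aa" "bb" "cc" "dd" "ee")
arr2=("cc" "dd" "ee" "ff")

# Command substitution yields one string, so treating it as an
# array gives a length of 1 regardless of its content.
scalar_diff=$(printf '%s\n' "${arr1[@]}" "${arr2[@]}" | sort | uniq -u)
echo "${#scalar_diff[@]}"

# readarray -t stores one array element per input line.
# comm -3 drops the lines common to both sorted inputs.
readarray -t diff_items < <(
  comm -3 <(printf '%s\n' "${arr1[@]}" | sort -u) \
          <(printf '%s\n' "${arr2[@]}" | sort -u)
)

# Items unique to the second input are prefixed with a tab; remove it.
diff_items=("${diff_items[@]#$'\t'}")
echo "${#diff_items[@]}"
```

With the sample arrays, the first echo reports 1 and the second reports 3.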

3 Answers

6

To get the elements that differ between the two arrays, you can use this awk command:

arr1=("aa" "bb" "cc" "dd" "ee")
arr2=("cc" "dd" "ee" "ff")

awk 'FNR == NR {
   arr[$1]
   next
}
{
   if ($1 in arr)
      delete arr[$1]
   else
      print $1
}
END {
   for (i in arr)
      print i
}' <(printf '%s\n' "${arr1[@]}") <(printf '%s\n' "${arr2[@]}")

ff
aa
bb

To store the difference in an array, use:

read -ra diffarr < <(awk -v ORS=' ' 'FNR == NR {arr[$1]; next} {if ($1 in arr) delete arr[$1]; else print $1} END{for (i in arr) print i}' <(printf '%s\n' "${arr1[@]}") <(printf '%s\n' "${arr2[@]}"))

# check diffarr content
declare -p diffarr
declare -a diffarr=([0]="ff" [1]="aa" [2]="bb")

# print number of elements in diffarr

echo "${#diffarr[@]}"
3
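
Since the goal in the question is to check whether both arrays hold the same elements, the count can be used directly in a test. The following is a minimal sketch, not part of the answer above: it swaps read -ra for mapfile -t (bash 4+) and keys awk on the whole line ($0) instead of $1, so elements containing spaces stay intact. It assumes no element contains a newline and that arr2 has no repeated elements; the awk array name seen is illustrative.

```bash
#!/usr/bin/env bash
arr1=("aa" "bb" "cc" "dd" "ee")
arr2=("cc" "dd" "ee" "ff")

# Same awk logic as above, but comparing whole lines and
# emitting one result per line for mapfile to collect.
mapfile -t diffarr < <(
  awk 'FNR == NR { seen[$0]; next }
       $0 in seen { delete seen[$0]; next }
       { print }
       END { for (k in seen) print k }' \
    <(printf '%s\n' "${arr1[@]}") <(printf '%s\n' "${arr2[@]}")
)

# An empty difference means both arrays hold the same elements.
if (( ${#diffarr[@]} == 0 )); then
  echo "same elements"
else
  echo "arrays differ by ${#diffarr[@]} element(s)"
fi
```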

1

I think a good old bash loop also deserves a place. The following code:

# return 0 if the first argument equals any of the remaining arguments
args_contain() {
  local i
  for i in "${@:2}"; do
    if [[ "$i" == "$1" ]]; then
      return 0
    fi
  done
  return 1
}

# return 0 if first argument, which is an array, contains the second argument
arr_contains() {
  local i="$1[@]"
  args_contain "$2" "${!i}"
}

# append to the first array argument
# elements of second array argument 
# which are not in the third array argument
arr_diff() {
  local i="$2[@]"
  for i in "${!i}"; do
    if ! arr_contains "$3" "$i"; then
       eval "$1+=(\"\$i\")"
    fi
  done
}

arr1=("aa" "bb" "cc" "dd" "ee")
arr2=("cc" "dd" "ee" "ff")
arr_diff=()
arr_diff arr_diff arr1 arr2
arr_diff arr_diff arr2 arr1
declare -p arr_diff

outputs:

declare -a arr_diff=([0]="aa" [1]="bb" [2]="ff")

1 Comment

Using namerefs (declare -n) would let you avoid the eval
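
What the comment suggests could look roughly like the sketch below, which uses namerefs (local -n, available from bash 4.3) in place of indirect expansion and eval. The function name arr_diff_nameref and the underscore-prefixed local names are hypothetical; the prefix is there to reduce the chance of a clash with the caller's variable names, which namerefs are sensitive to.

```bash
#!/usr/bin/env bash
# Append to the array named by $1 every element of the array named
# by $2 that does not occur in the array named by $3.
arr_diff_nameref() {
  local -n _out=$1 _from=$2 _excl=$3
  local item other found
  for item in "${_from[@]}"; do
    found=0
    for other in "${_excl[@]}"; do
      if [[ $item == "$other" ]]; then
        found=1
        break
      fi
    done
    # Only keep items that were not found in the excluded array.
    (( found )) || _out+=("$item")
  done
}

arr1=("aa" "bb" "cc" "dd" "ee")
arr2=("cc" "dd" "ee" "ff")
result=()
arr_diff_nameref result arr1 arr2
arr_diff_nameref result arr2 arr1
declare -p result
echo "${#result[@]}"
```

With the sample arrays, result ends up holding aa, bb and ff, so the final echo reports 3.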
-1

The problem occurs because of how arr_diff is assigned. When you use command substitution $(...), the result is assigned as a single string, not as an array. That's why echo ${#arr_diff[@]} returns 1, because it treats the entire output as a single element. Try this one:

arr_diff=($(echo ${arr1[@]} ${arr2[@]} | tr ' ' '\n' | sort | uniq -u))
echo "${arr_diff[@]}"

1 Comment

Boo, hiss -- this has its own bugs; if your array has arr1=( "New York" "New Amsterdam" ), it'll be parsed with New as one word existing twice even if arr2 doesn't contain either of those entries!
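
One way to sidestep the word-splitting problem described in this comment is to pass elements line by line and read them back with mapfile. This is a sketch under stated assumptions: bash 4+, no newlines inside elements, and neither array empty. Each array is deduplicated on its own first, so that an element repeated within a single array is not mistaken for one shared by both.

```bash
#!/usr/bin/env bash
arr1=("New York" "New Amsterdam" "cc")
arr2=("cc" "ff")

# printf emits one element per line, so embedded spaces survive.
# uniq -u keeps only the lines that occur in exactly one array.
mapfile -t arr_diff < <(
  {
    printf '%s\n' "${arr1[@]}" | sort -u
    printf '%s\n' "${arr2[@]}" | sort -u
  } | sort | uniq -u
)

declare -p arr_diff
echo "${#arr_diff[@]}"
```

Here the difference has three elements ("New York", "New Amsterdam" and "ff"); their order depends on the locale used by sort.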
