I have many rows in Excel in the following format. Each row holds image file names separated by a "|".

Based on the raw file sample, a row should be split by its unique first strings (the text before the underscore), of which there are three in this case:

  • imagefile1
  • imagefile2
  • imagefile3

All file names that start with the same first string before the underscore should be placed together on a new line. A single row can have 50+ unique first strings.
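
For reference, the grouping rule described above can be sketched outside Excel. This is only a minimal Python illustration for checking expected results, not one of the posted solutions; the function name group_by_prefix is hypothetical, and the sample string is the test input used in the VBA answer further down.

```python
def group_by_prefix(line, sep="|", mark="_"):
    """Group file names by the text before the first underscore.

    Groups come back in first-seen order, one joined string per prefix.
    """
    groups = {}  # dicts keep insertion order in Python 3.7+
    for name in line.split(sep):
        if not name:
            continue  # skip empty tokens, e.g. from a trailing separator
        key = name.split(mark, 1)[0]
        groups.setdefault(key, []).append(name)
    return [sep.join(names) for names in groups.values()]


sample = ("imagefile1_1.jpg|imagefile1_2.jpg|imagefile2_1.jpg|"
          "imagefile3_2.jpg|imagefile3_1.jpg")
for row in group_by_prefix(sample):
    print(row)
```

Each printed line corresponds to one output row in the worksheet.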

6 Answers

Use TEXTSPLIT to create an array, then FILTER and TEXTJOIN:

=LET(_z,TEXTSPLIT(A1,"_","|"),_u,UNIQUE(TAKE(_z,,1)),TRANSPOSE(BYROW(_u,LAMBDA(_y,TEXTJOIN("|",,FILTER(TAKE(_z,,1)&"_"&TAKE(_z,,-1),TAKE(_z,,1)=_y))))))

5 Comments

I've been trying to figure it out with TEXTJOIN, but I found out it's not available in Excel 2010. :(
With 2010, VBA will be the best method.
Ah. I've used VBA before. Sorry for having such ancient software. Maybe it's time to upgrade.
@user3075601 please don't change the existing tags, because we have taken some time to post the answers. Let the Excel Formula tag stay; anyone interested can post a VBA solution for you. However, it is best to show some code or attempts of your own so that whoever is interested can help you more easily. Thanks!
Wow +:) Side note: I have taken up your suggestion to use VBA 2010, demonstrating a tricky use of Match.

If I have understood correctly, the following formulas should do what you are aiming for. Note that they work exclusively with MS365, per the tags posted.

• Method One:

=LET(
     a, TEXTSPLIT(B2,,"|"),
     b, TEXTBEFORE(a,"_"),
     DROP(GROUPBY(b,a,LAMBDA(x,TEXTJOIN("|",1,x)),,0),,1))

• Method Two:

=LET(
     a, TEXTSPLIT(B2,,"|"),
     b, TEXTBEFORE(a,"_"),
     c, UNIQUE(b),
     MAP(c,LAMBDA(x,TEXTJOIN("|",1,FILTER(a,b=x)))))

Or, with an eta-reduced LAMBDA (TAKE assigned to the name b):

=LET(
     a, TEXTSPLIT(B2,"_","|"),
     b, TAKE,
     DROP(GROUPBY(b(a,,1),b(a,,1)&"_"&b(a,,-1),LAMBDA(x,TEXTJOIN("|",,x)),,0),,1))

• For multiple rows of data, use REDUCE():

=DROP(REDUCE("",B2:B3,LAMBDA(x,y,
 LET(z, TEXTSPLIT(y,"_","|"), 
     w, TAKE, VSTACK(x,
     DROP(GROUPBY(w(z,,1),w(z,,1)&"_"&w(z,,-1),
     LAMBDA(v,TEXTJOIN("|",1,v)),,0),,1))))),1)

The formula goes in A4. In it, no holds the numbers extracted from the file names, and tsep is the array of file names:

=LET(sep,TEXTSPLIT(A1,"imagefile"),
no,DROP(MID(sep,1,SEARCH("_",sep)-1),,1),
tsep,TEXTSPLIT(A1,"|"),
fg,BYROW(SEQUENCE(MAX(VALUE(no))),LAMBDA(a,TEXTJOIN("|",TRUE,IF(a=VALUE(no),tsep,"")))),
fg)
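
As a rough cross-check of what this formula computes, here is a small Python rendering of the same idea. It is only a sketch under assumptions: the function name group_by_number is hypothetical, and it assumes every name starts with the literal stem imagefile followed by a number and an underscore. It also shows one consequence of iterating over 1..MAX(no): a number that does not occur in the input yields an empty entry, just as TEXTJOIN returns an empty string for that row.

```python
import re


def group_by_number(line, stem="imagefile", sep="|"):
    """Bucket names by the number that follows the stem (assumed layout)."""
    names = line.split(sep)
    pattern = re.compile(re.escape(stem) + r"(\d+)_")
    # Assumes every name matches the pattern; a non-matching name would raise.
    numbers = [int(pattern.match(name).group(1)) for name in names]
    # One output entry per number from 1 to the maximum, like SEQUENCE(MAX(...)).
    return [
        sep.join(n for n, k in zip(names, numbers) if k == bucket)
        for bucket in range(1, max(numbers) + 1)
    ]


print(group_by_number("imagefile1_1.jpg|imagefile3_1.jpg"))
# prints ['imagefile1_1.jpg', '', 'imagefile3_1.jpg']
```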

Assuming the input text is in cell A1, the output will begin in cell A3.

Microsoft documentation:

Dictionary object

Range.Resize property (Excel)

Split function

Option Explicit
Sub Demo()
    Const SEP = "|"
    Dim objDic As Object, rngData As Range
    Dim i As Long, sKey As String
    Set objDic = CreateObject("scripting.dictionary")
    Dim sTxt As String: sTxt = Range("A1")
    Dim aTxt: aTxt = Split(sTxt, SEP)
    For i = LBound(aTxt) To UBound(aTxt)
        sKey = Split(aTxt(i), "_")(0)
        If objDic.exists(sKey) Then
            objDic(sKey) = objDic(sKey) & SEP & aTxt(i)
        Else
            objDic(sKey) = aTxt(i)
        End If
    Next
    If objDic.Count > 0 Then
        Range("A3").Resize(objDic.Count, 1) = Application.Transpose(objDic.items)
    End If
End Sub

1 Comment

Admire this ingeniously short & simple approach +:) - Fyi you might be interested in my alternative approach using an unorthodox Match function.

Formula in A3:

=TOCOL(REGEXEXTRACT(A1,"\b([^_|]+)(_(?1)+\b)(?:\|\1(?2))*",1))

1 Comment

Smart & instructive solution +:)

Tricky array matching with VBA (Excel 2010)

As one option suggested by the OP, and following Scott Craner's advice, I wanted to demonstrate some tricky and possibly instructive array techniques using only VBA as available in Excel 2010:

  • Tokenizing and regrouping array elements by successive splitting (→ 1.1 Main function getElems())

  • Split combined with negative filtering to isolate the base names

  • Match comparing an array with itself, instead of iteratively searching for single elements within a given array (→ 1.2 Helper function FirstPos())

Furthermore, this approach allows different filename prefixes and also handles empty or error inputs.

I don't intend, however, to show a better or even shorter approach than @taller's ingeniously short & simple solution (via a dictionary object).
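
To make the "match an array against itself" idea concrete before the VBA, here is a small Python sketch of the same logic. It is an illustration only, not the code of this answer: the names first_positions and regroup are hypothetical, and list.index stands in for the worksheet MATCH function. Like the VBA, it starts a new group only at the first occurrence of a base name, so it assumes that names sharing a base name are adjacent in the input.

```python
def first_positions(names, mark="_"):
    """For each name, return the 1-based position of the first name
    with the same base name (text before the first underscore)."""
    bases = [n.split(mark, 1)[0] for n in names]
    return [bases.index(b) + 1 for b in bases]


def regroup(line, sep="|"):
    """Start a new group wherever an element is its own first match."""
    names = line.split(sep)
    groups = []
    for i, pos in enumerate(first_positions(names), start=1):
        if pos == i or not groups:
            groups.append([names[i - 1]])    # first occurrence: new group
        else:
            groups[-1].append(names[i - 1])  # otherwise extend current group
    return [sep.join(g) for g in groups]


sample = ("imagefile1_1.jpg|imagefile1_2.jpg|imagefile2_1.jpg|"
          "imagefile3_2.jpg|imagefile3_1.jpg")
print(first_positions(sample.split("|")))  # prints [1, 1, 3, 4, 4]
print(regroup(sample))
```

With non-adjacent input such as a_1|b_1|a_2, the last name stays in the group that precedes it, which is the price of splitting the original string in place instead of collecting names per key.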

1.1 Main function getElems()

Function getElems(ByVal s As Variant)
'Purp: Group file names by splitting
'Site: https://stackoverflow.com/questions/79228137/excel-separate-a-line-of-text
'Auth: https://stackoverflow.com/users/6460297/t-m
'a) tokenize elements
    Dim elems: elems = Split(s, "|")
'b) find & mark unique start file
    Dim pos
    For Each pos In FirstPos(s)     ' << see helper function FirstPos()
        If IsNumeric(pos) Then s = Replace(s, "|" & elems(pos - 1), "$" & elems(pos - 1))
    Next
'c) rebuild elements including subordinated file names
    getElems = Split(s, "$")
End Function

1.2 Helper function FirstPos()

Function FirstPos(ByVal s As Variant)
'a) get base names (file name before "_")
    Dim tmp: tmp = Replace(s, "_", "|_")
    Dim arr: arr = Filter(Split(tmp, "|"), "_", False)
'b) return matched base name positions
'   via tricky MATCH() function comparing array elements with itself
    FirstPos = Application.Match(arr, arr, 0)
End Function

2. Example call

Note on A) Get input: to read from a range value instead, comment out the test assignment s = ... and uncomment the alternative line below it.

Sub DemoTM()
    Const ERROR = "ERROR!": Const NOTHG = "EMPTY!"    ' at least 1 character (or 1 blank!)
'A) get input
    Dim s
    s = "imagefile1_1.jpg|imagefile1_2.jpg|imagefile2_1.jpg|imagefile3_2.jpg|imagefile3_1.jpg"
    ' 'alternatively:
    ' s = Sheet1.Range("A1").Value2 ' << change worksheet as needed
'B) treat exceptions
    If IsError(s) Then s = ERROR
    If s = "" Then s = NOTHG            ' simulate blank instead of empty element
'C) get regrouped elements
    Dim elems: elems = getElems(s)      ' << see function getElems()
'D) display output in Immediate Window and/or target range
    Debug.Print Join(elems, vbNewLine)
    Sheet1.Range("D3").Resize(UBound(elems) + 1, 1) = Application.Transpose(elems)  ' << change sheet as needed
End Sub

Output in VB Editor's Immediate window:

imagefile1_1.jpg|imagefile1_2.jpg
imagefile2_1.jpg
imagefile3_2.jpg|imagefile3_1.jpg
