Python list in recursion

Question

I want to find all links in a div, for example:

<div>
  <a href="#0"></a>
  <a href="#1"></a>
  <a href="#2"></a>
</div>

So I wrote a function as follows:

def get_links(div):
    links = []
    if div.tag == 'a':
        links.append(div)
        return links   
    else:
        for a in div:
            links + get_links(a)
        return links

Why is the result [] rather than [a, a, a]?

I suspect this is a list reference issue. Could you explain in some detail?

This is the complete module:

import lxml.html


def get_links(div):
    links = []
    if div.tag == 'a':
        links.append(div)
        return links   
    else:
        for a in div:
            links + get_links(a)
        return links


if __name__ == '__main__':

    fragment = '''
        <div>
          <a href="#0">1</a>
          <a href="#1">2</a>
          <a href="#2">3</a>
        </div>'''
    fragment = lxml.html.fromstring(fragment)
    links = get_links(fragment)    # <---------------

Try changing links + get_links(a) to links += get_links(a)
– vaultah, Commented Jan 5, 2015 at 8:04
Yes, this is the right way. Thanks. I meant to write += but forgot, and I believed I had written +=, so I did not spot the error. I assumed it was a list reference problem.
– zwidny, Commented Jan 5, 2015 at 8:26

6502 · Accepted Answer · 2015-01-05 08:13:35Z

2

List addition in Python returns a new list obtained by concatenating the arguments; it does not change them:

x = [1, 2, 3, 4]
print(x + [5, 6])  # displays [1, 2, 3, 4, 5, 6]
print(x)           # here x is still [1, 2, 3, 4]

To modify the list in place, you can use the extend method:

x.extend([5, 6])

or +=:

x += [5, 6]

The latter is, in my opinion, a bit "strange" because it is a case where x = x + y is not the same as x += y: for lists, += extends the existing object in place. I therefore prefer to avoid it and make the in-place extension explicit with extend.

For your code,

links = links + get_links(a)

would also work, but remember that it does something different: it allocates a new list holding the concatenation and then rebinds the name links to it. It does not change the original object that links referenced:

x = [1, 2, 3, 4]
y = x
x = x + [5, 6]
print(x)   # displays [1, 2, 3, 4, 5, 6]
print(y)   # displays [1, 2, 3, 4]

but with +=, both names still refer to the same, now extended, list:

x = [1, 2, 3, 4]
y = x
x += [5, 6]
print(x)   # displays [1, 2, 3, 4, 5, 6]
print(y)   # displays [1, 2, 3, 4, 5, 6]
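The same distinction shows up when a list is passed to a function, which is the situation in recursive code. The following is only a minimal sketch; the function names add_with_plus and add_in_place are made up for illustration and do not come from the question.

```python
def add_with_plus(target, extra):
    # Builds a new list and rebinds only the local name;
    # the caller's list is left untouched.
    target = target + extra
    return target


def add_in_place(target, extra):
    # Extends the very list object the caller passed in.
    target += extra
    return target


original = [1, 2]
result_plus = add_with_plus(original, [3])
after_plus = list(original)  # snapshot of the caller's list after the + version

result_in_place = add_in_place(original, [3])

print(result_plus, after_plus)      # [1, 2, 3] [1, 2]
print(result_in_place is original)  # True
print(original)                     # [1, 2, 3]
```

In the question's code neither variant is used: the result of links + get_links(a) is computed and then discarded, so links never changes.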

edited Jan 5, 2015 at 8:13

answered Jan 5, 2015 at 8:07


1 Comment

zwidny Over a year ago

Yes, this is the right way. Thank you! I meant to write += but forgot, and I believed I had written +=, so I did not spot the error. I assumed it was a list reference problem.

Mariusz Jamro · Accepted Answer · 2015-01-05 08:08:08Z

1

If the tag is not 'a', your code effectively does this:

# You create an empty list
links = []
for a in div:
    # You combine links with the result of get_links(), but the new list is never assigned to anything
    links + get_links(a)
# So you return the still-empty list
return links

You should replace + with +=:

links += get_links(a)

Or use extend():

links.extend(get_links(a))
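Putting the fix together, a complete corrected version might look like the sketch below. As an assumption to keep it self-contained, it uses the standard library's xml.etree.ElementTree instead of lxml.html; both expose .tag and iteration over child elements, which is all the function relies on.

```python
import xml.etree.ElementTree as ET


def get_links(element):
    """Return every <a> element at or below `element`, in document order."""
    if element.tag == 'a':
        return [element]
    links = []
    for child in element:
        # extend() mutates links in place, so each recursive result is kept
        links.extend(get_links(child))
    return links


fragment = ET.fromstring('''
    <div>
      <a href="#0">1</a>
      <a href="#1">2</a>
      <a href="#2">3</a>
    </div>''')

hrefs = [link.get('href') for link in get_links(fragment)]
print(hrefs)  # ['#0', '#1', '#2']
```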

answered Jan 5, 2015 at 8:08


1 Comment

zwidny Over a year ago

Yes, this is the right way. I meant to write += but forgot, and I believed I had written +=, so I did not spot the error. I assumed it was a list reference problem.

Vivek Sable · Accepted Answer · 2015-01-05 08:15:42Z

0

Another option is to use the xpath method to get all a tags inside the div at any level.

Code:

from lxml import etree

# content is the HTML fragment string from the question
root = etree.fromstring(content)
print(root.xpath('//div//a'))

Output:

[<Element a at 0xb6cef0cc>, <Element a at 0xb6cef0f4>, <Element a at 0xb6cef11c>]
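If lxml is not available, a similar lookup can be sketched with the standard library. This is an assumption-laden variant, not the code from this answer: xml.etree.ElementTree supports only a limited XPath subset, and the sample content string below is invented for illustration, with one link nested deeper to show that depth does not matter.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; the second link is nested inside a <p>
content = '<div><a href="#0">1</a><p><a href="#1">2</a></p></div>'
root = ET.fromstring(content)

# './/a' matches <a> descendants at any depth below root
links = root.findall('.//a')
hrefs = [a.get('href') for a in links]
print(hrefs)  # ['#0', '#1']
```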

edited Jan 5, 2015 at 8:15

answered Jan 5, 2015 at 8:10


2 Comments

xiaofeng.li Over a year ago

Your code only returns a tags that are direct children of the div tag. '//div//a' is better.

Vivek Sable Over a year ago

@infgeoax: Yes, agreed. I updated the code to get a tags from the div at any level. Thanks.
