WEBVTT 00:00:00.001 --> 00:00:04.220 If you run into a problem with some API or Python code, what do you do to solve it? 00:00:04.220 --> 00:00:08.800 Well, personally, I throw a few keywords into Google, sometimes before even checking the full 00:00:08.800 --> 00:00:14.440 docs. It works great. But why does it work so well? Because invariably, an excellent conversation 00:00:14.440 --> 00:00:19.280 and answer from Stack Overflow comes back as the top result, and it's usually just what I needed. 00:00:19.280 --> 00:00:24.600 This week, you'll meet Martijn Pieters, one of the top Python contributors at Stack Overflow, 00:00:24.600 --> 00:00:30.720 with over 16,500 questions answered and a reputation of over half a million. 00:00:30.720 --> 00:00:36.060 This is Talk Python To Me, episode 86, recorded November 2nd, 2016. 00:00:49.200 --> 00:01:05.680 Welcome to Talk Python To Me, a weekly podcast on Python, the language, the libraries, the 00:01:05.680 --> 00:01:10.300 ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter, 00:01:10.300 --> 00:01:15.200 where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm, 00:01:15.200 --> 00:01:21.300 and follow the show on Twitter via @talkpython. This episode has been sponsored by Rollbar and 00:01:21.300 --> 00:01:27.180 GoCD. Thank them both for supporting the podcast by checking out what they're offering during their 00:01:27.180 --> 00:01:33.260 segments. Martijn, welcome to Talk Python. Hi there. This is exciting. Yeah, it's an honor to have you 00:01:33.260 --> 00:01:39.660 here. You've done some really amazing work in the Python space on Stack Overflow, and we're going to 00:01:39.660 --> 00:01:44.940 spend a lot of time talking about what you've done there, what your experiences were, but also kind of a 00:01:44.940 --> 00:01:50.400 survey. 
Some really cool questions that you've pulled out for us to talk about that are representative 00:01:50.400 --> 00:01:57.220 of good work and basically the Python community at Stack Overflow. But of course, before we get to 00:01:57.220 --> 00:02:00.500 any of that, let's start with your story. How did you get into Python and programming? I started 00:02:00.500 --> 00:02:07.260 programming quite early on. My father was a computer engineer. He fixed computers the way you fix a bike. 00:02:07.260 --> 00:02:13.660 You find the part that breaks and solder a new one in back in the 70s. So in the 80s, I finally got my own 00:02:13.660 --> 00:02:22.840 computer. I was given a MSX. And I basically haven't looked back since. Python. I started Python in 1999, I think, 00:02:22.840 --> 00:02:31.560 when I was a web developer and I found a new platform called Zope. It lets you think of web pages as components 00:02:31.560 --> 00:02:38.260 and as objects, which was interesting. And that called me into Python at the same time. My first project, I gave a time estimate 00:02:38.260 --> 00:02:45.140 and beat my own time estimate, which blew away my manager because that never happened before. I always go over time. 00:02:45.140 --> 00:02:52.960 That's all engineers do. And I haven't... That was my start with Python. It really was a love at first sight, I think. 00:02:53.300 --> 00:02:56.500 Yeah, that's great. And what kind of project was that? That Zope project? 00:02:56.500 --> 00:03:05.820 It was for a befriended company that was doing software on the web. And when you sell software on the web, you need a web shop. 00:03:06.480 --> 00:03:15.100 So I built a web shop. I see. And because their software was already built in Python, it was very easy to incorporate their license generator into it too. 00:03:15.100 --> 00:03:22.080 Oh, of course. Yeah. So they were selling Python software. You needed to run their license thing. 
It's just like, yeah, we'll just, you know, import that right here. 00:03:22.080 --> 00:03:22.440 Just plug it in? 00:03:22.440 --> 00:03:31.460 Yeah, yeah. Beautiful. That's great that it was quick and easy. But I think building e-commerce sites today is quite a bit easier than it was back in 1999. 00:03:31.460 --> 00:03:33.540 What do you think? Have you built one recently? 00:03:33.540 --> 00:03:40.180 I tend to stay away from shops. I've mostly done, in my career, I've mostly done content management. 00:03:40.180 --> 00:03:49.660 So think of an intranet website where everybody has to be able to give their input as to what the company is doing or whatever. Share between teams, that kind of stuff. 00:03:49.660 --> 00:03:57.280 All right. So mostly content management stuff, which is really, really valuable and helpful. What do you do today? You work at Facebook, right? 00:03:57.280 --> 00:04:07.500 Today I work at Facebook and I do very different things. I don't so much work on the website anymore. As you know, Facebook does use PHP for that. Strange language. 00:04:07.500 --> 00:04:16.580 I work on the backend side, on the source control side where I hack on Mercurial, an open source project that lets you do distributed source control. 00:04:16.900 --> 00:04:24.560 Oh, that's cool. Yeah. So Mercurial is kind of a cousin to Git, similar type of source control. What's Mercurial written in? 00:04:24.560 --> 00:04:25.720 Mercurial is written in Python. 00:04:25.720 --> 00:04:26.800 Interesting. 00:04:26.800 --> 00:04:36.120 Mercurial is a Python project. And because it's a Python project, we were able to make it scale. So at Facebook, everything is turned to 11. Everything is crazy. 00:04:36.660 --> 00:04:50.240 Everything is just over-the-top high numbers and scales off the chart. And to be able to follow all the developers and all the changes they make every day, we couldn't actually make Git scale there. We couldn't make Git follow. And Mercurial did follow. 
00:04:50.480 --> 00:05:00.020 So we can, because Mercurial is so written in Python and highly extensible and highly tweakable, we can make it work at the crazy levels that Facebook is at. 00:05:00.020 --> 00:05:02.640 That's really, I'm sure that's so exciting to work at Facebook. 00:05:02.640 --> 00:05:04.820 Absolutely. It's a great place to be. 00:05:04.820 --> 00:05:07.700 Do you work remote or is there a place there in Cambridge? 00:05:07.700 --> 00:05:14.560 I work in London. So I commute four days a week up to London to work and one day a week I stay at home. 00:05:14.560 --> 00:05:24.060 How much source code does Facebook have? Is it like, so when you talk about scaling it, is it the number of people using the system? Is it the amount of data? What's the scale look like? 00:05:24.060 --> 00:05:38.800 It's everything. So we put everything into one monorepo. Repo means that we try and put everything into one place, one repository, so that you can facilitate all the sharing and moving. And that really helps speed things up. 00:05:38.800 --> 00:05:49.320 But at the same time, you can easily come into the file counts of six figures. So trying to get a full check out of that is not always the best thing you want to do. 00:05:49.320 --> 00:05:58.640 So we have issues with the number of files, the number of directories, the number of commits each day, the number of developers that work on it. All those things play. 00:05:58.640 --> 00:06:03.060 Yeah, I can imagine that. Wow. Yeah. And Google also does this monorepo thing. 00:06:03.060 --> 00:06:12.240 This is a really interesting idea to me when you have that many files. Have you open sourced any of these tweaks or anything written about them? 00:06:12.240 --> 00:06:23.360 Lots of this stuff is open sourced. So Facebook has a repository with their own extensions in there. They are not perhaps immediately something you can use, but there is a lot of inspiration there. 
00:06:23.360 --> 00:06:44.480 We also contribute straight back to the Mercurial project itself. Until recently, the founder of Mercurial, Matt Mackall, worked at Facebook. He recently decided that after 11 years of Mercurial work, he wanted to do something different. So he moved on to other things. And the Mercurial project is now trying to make its way without him. 00:06:44.480 --> 00:06:54.140 Wow. Yeah. So you had the guy there to help you build this up. That's really interesting. Let's talk about Stack Overflow. Maybe tell the world what Stack Overflow is. The one person who doesn't know it. 00:06:54.480 --> 00:07:17.100 Stack Overflow is a site for programmers. And the goal here is to help programmers find answers to their common problems. It was born from frustration with forums. You know, the general forum where someone says that they had a problem with this code and can someone help them, and 20 people pipe in that they have the problem too and they don't know the solution. And then maybe on page 15, someone has got the solution. 00:07:17.480 --> 00:07:32.740 And then it disappears into a huge pile of posts about different things and you never find it again. Stack Overflow is very much aimed at trying to surface the question and the answers. So you have just a question. 00:07:33.100 --> 00:07:53.160 And people, they use gamification to get the best answers to the top. So the general idea is that people vote. This was a question that is worth, that is well asked, that is complete. And these are the answers. And this answer is the best because that gets the most votes on it. The idea is that you get an encyclopedia of knowledge, of programming knowledge. 00:07:53.160 --> 00:08:00.240 And they, you know, Jeff Atwood and Joel Spolsky, the two founders, they have nailed that in some special way, right? 00:08:00.240 --> 00:08:01.780 They found a secret sauce. 
00:08:01.780 --> 00:08:16.280 They found, they definitely found a secret sauce and it's working really, really well. Like pretty much any programming question you have, if it has technical detail to it, there's probably a Stack Overflow result in the top five Google results. 00:08:16.280 --> 00:08:19.980 Exactly. For the most common programming problems, you will find a solution there. 00:08:19.980 --> 00:08:31.380 Yeah. And it's not just, well, you should put this curly brace here or whatever. It's also, it's much deeper than that, right? I think, I think there's a lot of really interesting questions as we'll see. 00:08:31.380 --> 00:08:39.660 Absolutely. Absolutely. There's such a wide range of things that answer there, provided you stay on topic, which is sometimes a little harder. 00:08:39.660 --> 00:08:52.500 Yeah, it can be. But yeah, it's, and they're very, very much careful about making sure things stay on target and focused and don't rant and basically become the forum that they're trying to escape from. 00:08:52.500 --> 00:08:53.580 So let's start. 00:08:53.580 --> 00:08:53.940 Exactly. 00:08:54.200 --> 00:09:06.580 Yeah. So let's start with what makes a good question on Stack Overflow. So I guess before you answer that, tell us, I was looking at your, your user profile and there's some pretty astounding statistics there. 00:09:07.120 --> 00:09:21.060 So for people who know you have about 500,000, a little over 500,000 for reputation, but you have 16,000 plus answers. And it says you've reached 22.9 million people. That's really amazing. 00:09:21.060 --> 00:09:35.900 The numbers, they keep on growing because I like answering. I like finding solutions to problems. It's, it is kind of addictive. It is addictive with the fact that the feedback that you get and the, the, you learn so much yourself from this. 00:09:35.900 --> 00:09:57.000 I love finding the key to the problem. 
So if someone has a particular view on the problem, they have a certain idea about how things should work or not work. And they clearly don't understand something. And I love finding the key that helps them suddenly see, oh, that's why this is how it works. So I'd prefer not just to give you the solution. I also prefer to tell you why. 00:09:57.000 --> 00:09:59.220 Yeah. I noticed that from your answers. Yeah. 00:09:59.220 --> 00:10:04.960 In doing that, you learn yourself whether or not, you know why. And that helps me grow as a programmer as well. 00:10:05.120 --> 00:10:31.800 Yeah. I think that's a really good point because it's one thing to say, this code doesn't run as I expected. You could say, well, change it to this and it would. It's another to say, well, the reason your code doesn't work is if you really look inside, like say the CPython interpreter, it's doing this. And when it does that, here's what it means. And so here's how you fix it. Like that, that both enriches the person's understanding, but yours as well, like you say. 00:10:31.800 --> 00:10:35.100 Exactly. I get to learn and apply it elsewhere. 00:10:35.100 --> 00:10:42.100 I can use this in my own daily life. My own work is informed by what I learned from answering on Stack Overflow. 00:10:42.100 --> 00:10:47.520 Yeah, absolutely. I'm sure it is. I'm sure it is. Let's start with what makes a good question on Stack Overflow. 00:10:48.080 --> 00:11:03.620 So Stack Overflow is very strict about what it accepts for questions. Over the years, we've learned certain things don't work. So certain things are not on the topic. Stay away from those certain things, like asking what is the best web framework? 00:11:04.080 --> 00:11:17.200 What is the best is very subjective and it's very hard to answer without a lot more detail about what you're doing. Most people don't give that detail. 
And what then also happens is that the spammers come along, the people that have something to sell and they say, ours is the best. 00:11:17.420 --> 00:11:28.600 So those kinds of things are off topic. The same thing for opinion-based things. And we don't want to go on. We don't want to end up writing books either. 00:11:28.600 --> 00:11:41.500 So if you want to know about the best practices for using object-oriented programming and when to use one technique or another technique, that would rapidly be closed as too broad. 00:11:41.500 --> 00:11:47.360 Now, once you narrow it down to very specific programming problems, then you already start being on topic. 00:11:47.360 --> 00:11:59.960 And then what you definitely need to do is, first of all, realize that these questions are not just for you. We want them to be there for everybody. So someone else coming Googling later on, need to be able to recognize that they have the same problem. 00:11:59.960 --> 00:12:08.520 And the other thing is that if you want answers, you need to help the people that answer you. So you want to show the context of research that you've done. 00:12:09.280 --> 00:12:22.420 And you want to narrow down your program to the essentials. So if you come in and say, I have this XML document that I want to parse and get this information out and write it to a file, but I want to transform this information this way and that way. 00:12:22.420 --> 00:12:25.700 How do I do this? You're going to end up disappointed. 00:12:25.700 --> 00:12:35.800 Yeah. And the XML file is like a thousand lines long with all sorts of complicated stuff. And they haven't focused in on just the essence of the problem. They're basically like, solve this problem for me, right? 00:12:35.800 --> 00:12:49.500 Exactly. So solve this problem for me. What you can do at such a point is you need to break this down into separate issues. Perhaps you already succeeded in loading the XML file and parsing it. 
You've already maybe made choices there. Tell us about those. 00:12:49.500 --> 00:12:58.680 So now you've narrowed it down to perhaps only finding specific information in that file. Leave out the part where you're going to transform it and write it to something else. Those are separate problems. Those are separate steps. 00:12:59.060 --> 00:13:14.280 And once you've narrowed it down to one of these steps and shown us what you put in there, what you expected to come out and what happened instead, with the error message and the full traceback and everything that you see, then we can see it too. 00:13:14.280 --> 00:13:16.820 And often you will come to a very quick answer. 00:13:17.200 --> 00:13:31.040 Or it might be, again, I cannot emphasize this enough. You have to have done your research because it could be that the question was already asked before and answered before and your question will be closed as a duplicate. 00:13:31.040 --> 00:13:41.640 Yeah. So Stack Overflow is actually pretty good at helping you find answers even as you're kind of ignoring the fact that they might exist. I guess is the way to think of it. 00:13:41.640 --> 00:13:56.340 We don't hate duplicates either. We don't dislike duplicates. The internet is, of course, very impersonal. You can't see the other person's smile or feel any warmth. So people often see this as a personal put-down of their question. 00:13:56.340 --> 00:14:09.480 And duplicates, sometimes people actually go, I'm sorry, I asked a duplicate. And to that I say, it's all right. Duplicates happen. You can't always know exactly what you're searching for. We can help with that. 00:14:09.980 --> 00:14:17.140 And duplicates actually are valuable to Stack Overflow as well because what happens is that you have now added more keywords for people to search on. 00:14:17.140 --> 00:14:25.040 So when you created the duplicate, you asked the question. It was well asked, but it was a duplicate. 
Google still has indexed your question. 00:14:25.040 --> 00:14:32.420 And when someone clicks on it on Google, we actually redirect new visitors straight to the canonical question that you were a duplicate of. 00:14:32.420 --> 00:14:36.920 So at that moment, you have helped someone else find the same solution. 00:14:36.980 --> 00:14:39.260 Oh, that's interesting. I didn't really think of it that way. 00:14:39.260 --> 00:14:46.800 So the research part is really important. I suspect it really affects the way that you feel about answering the question. 00:14:46.800 --> 00:14:56.500 If like somebody has just put zero effort or what appears to be zero effort into finding their solution and they're just like, I don't know, let's put it on Stack Overflow, see if somebody will do this for me. 00:14:56.500 --> 00:15:00.660 Probably your willingness to give a good solid answer goes down, right? 00:15:00.980 --> 00:15:12.980 There's a certain type, certain classes of questions that we've seen so many times now, so often asked, and there's so many times that there's a duplicate, these common errors, that some people may lose patience a little bit. 00:15:12.980 --> 00:15:16.340 And then you get, haven't you done your research? 00:15:16.340 --> 00:15:19.540 You could have found this in the first five results on Google. 00:15:19.540 --> 00:15:20.840 Right. I took your title. 00:15:20.840 --> 00:15:21.920 I put it into Google. 00:15:21.920 --> 00:15:26.980 And here's the Stack Overflow answer that actually has 10 answers and 100 upvotes and whatnot. 00:15:26.980 --> 00:15:27.780 Exactly. 00:15:27.780 --> 00:15:28.800 Interesting. 00:15:28.800 --> 00:15:30.760 So it is the same kind of thing. 00:15:30.760 --> 00:15:41.300 I think there was a funny video online, what if Google was an actual person, where you see someone in an office and people come in and ask, how is baby formed to this person's face? 
00:15:41.300 --> 00:15:49.920 The person behind the desk, being a comedian, does a great job of looking frustrated at having been asked this question 100 times already. 00:15:49.920 --> 00:15:51.260 Yeah, exactly. 00:15:51.260 --> 00:15:52.580 Nice. 00:15:52.580 --> 00:15:53.020 Okay. 00:15:53.020 --> 00:15:58.420 So I think everyone's got a good sense of more or less what makes a good question. 00:15:58.420 --> 00:16:03.020 Let's talk about some of the ones that you found that are noteworthy or whatever. 00:16:03.020 --> 00:16:07.400 First of all, how many Python questions are on Stack Overflow? 00:16:07.400 --> 00:16:08.020 Do you know? 00:16:08.020 --> 00:16:12.560 Oh, if I look it up very quickly, it's about 650,000. 00:16:12.560 --> 00:16:14.800 650,000. 00:16:14.800 --> 00:16:19.580 650,000 questions today that have been tagged with Python. 00:16:19.580 --> 00:16:20.880 There can be more. 00:16:20.880 --> 00:16:26.900 There are several version-specific sub-tags, so Python 2.7 or Python 3.5. 00:16:26.900 --> 00:16:32.440 And not everybody uses the main tag when they ask about a specific version. 00:16:32.440 --> 00:16:32.920 Right. 00:16:32.920 --> 00:16:36.180 It might be just tagged Django, but it would also count as Python. 00:16:36.440 --> 00:16:45.080 There might just be library-specific questions, so Django, Flask, or Pandas, or NumPy, those kinds of things. 00:16:45.080 --> 00:16:48.240 They're all Python questions, too, but they might lack the tag. 00:16:48.240 --> 00:16:49.560 So there might be more. 00:16:49.560 --> 00:16:50.000 All right. 00:16:50.000 --> 00:16:50.460 There probably are. 00:16:50.460 --> 00:16:52.100 Yeah, it's probably a subset. 00:16:52.100 --> 00:16:53.320 Okay, cool. 00:16:53.980 --> 00:17:00.400 So you've, out of the 650,000-plus questions, chosen a few that are really good. 00:17:00.400 --> 00:17:06.420 And you said that some of the questions that are really interesting are these WAT questions. 
00:17:06.420 --> 00:17:07.660 W-A-T. 00:17:07.660 --> 00:17:09.260 I think W-A-T. 00:17:09.260 --> 00:17:16.140 This comes from a talk by a man named Gary Bernhardt, who did a very short and very funny talk about WAT moments. 00:17:16.140 --> 00:17:19.080 He goes, WAT, complete with memes. 00:17:19.080 --> 00:17:23.120 Those moments where you go, it really shouldn't be doing that. 00:17:23.120 --> 00:17:25.260 That is such a crazy response to the code. 00:17:25.260 --> 00:17:25.800 Yeah. 00:17:25.800 --> 00:17:26.300 Usually it is. 00:17:26.300 --> 00:17:30.720 It's bugs or very obscure corner cases of the language. 00:17:30.720 --> 00:17:40.160 They are, for me personally, the most interesting ones because they definitely will cover areas that I know nothing about. 00:17:40.900 --> 00:17:42.360 Or very little about. 00:17:42.360 --> 00:17:45.380 If they make me go WAT, then it's definitely something interesting. 00:17:45.380 --> 00:17:46.220 Yeah, absolutely. 00:17:46.220 --> 00:17:51.480 And depending on the language you're working with, there's more or there's fewer. 00:17:51.480 --> 00:17:56.460 Like the example that Gary chose was JavaScript, which is full of wonderful WATs. 00:17:56.460 --> 00:17:57.040 Exactly. 00:17:57.040 --> 00:18:00.640 Python, on the whole, has a very, very low WAT count. 00:18:00.640 --> 00:18:03.800 But there are a few, and usually they are due to bugs. 00:18:03.800 --> 00:18:07.740 So Python is a very consistent language. 00:18:08.080 --> 00:18:18.280 The developers do an awesome job of making sure that Python is a consistent and predictable language, unlike JavaScript, or even a few Ruby examples as well. 00:18:18.280 --> 00:18:19.960 But there are a few of those. 00:18:19.960 --> 00:18:30.060 Recently, very recently, I think last week, there was one question that was asked where someone built a set. A set contains unique elements. 
00:18:30.540 --> 00:18:36.300 But if they built a set from a list, they got a different result than when they built a set from a set literal. 00:18:36.300 --> 00:18:38.540 So they had multiple elements in it. 00:18:38.540 --> 00:18:42.120 And they listed the elements in the same order. 00:18:42.120 --> 00:18:49.900 But if you used a set literal, they got a different result from building a list first and then passing that to the set callable. 00:18:49.900 --> 00:18:51.620 That should not happen. 00:18:51.620 --> 00:18:52.760 That should not happen. 00:18:53.620 --> 00:18:54.360 It shouldn't. 00:18:54.360 --> 00:19:01.260 So if you say set, open parenthesis, and you pass an iterable thing, like a list or something, you get one result. 00:19:01.260 --> 00:19:14.380 If you say curly brace, thing, comma, thing, comma, thing, comma, thing, close curly brace, and you put the same things in that you pass to your set initializer, you actually get different sets. 00:19:15.680 --> 00:19:18.460 With the properly constructed inputs. 00:19:18.460 --> 00:19:21.160 So at that moment, I definitely go, that shouldn't happen. 00:19:21.160 --> 00:19:22.200 That is weird. 00:19:22.200 --> 00:19:23.800 I'm intrigued. 00:19:23.800 --> 00:19:24.220 I'm hooked. 00:19:24.220 --> 00:19:27.740 And the next hour is going to be spent figuring out what's going on. 00:19:27.740 --> 00:19:28.940 Where did it come from? 00:19:28.940 --> 00:19:29.680 Why? 00:19:29.680 --> 00:19:30.340 Why? 00:19:30.340 --> 00:19:30.900 Why? 00:19:30.900 --> 00:19:32.440 In this case, it was a bug. 00:19:32.440 --> 00:19:33.620 It was a bug. 00:19:33.620 --> 00:19:34.060 Okay. 00:19:34.060 --> 00:19:37.000 And I'm looking at your answer here. 00:19:37.000 --> 00:19:42.180 And of course, everybody who's listening, all these questions and answers will be added as links in the show notes. 
00:19:42.520 --> 00:19:48.040 So I'm looking at your answer here, and you're like, all right, so here I'm trying a few things, and we're trying to understand it. 00:19:48.040 --> 00:19:54.680 And all right, let's just open up the dis module and do a straight disassembly on this thing. 00:19:54.680 --> 00:20:00.140 Yeah, because at that point, you're going to have to dig in, what is the Python interpreter doing? 00:20:00.140 --> 00:20:05.540 So you need the bytecode, and the bytecode, which is built from the source code, drives the interpreter. 00:20:05.540 --> 00:20:09.880 And then you can start looking at the interpreter source code and see what is going on. 00:20:10.420 --> 00:20:20.660 And for me, it was actually hard to figure out what was going on because I looked at the most recent version of Python from source control, so from the Mercurial repository. 00:20:20.660 --> 00:20:24.580 And the bug had been fixed very recently as well. 00:20:24.580 --> 00:20:30.360 So if you do this with a released version of Python, 00:20:30.360 --> 00:20:31.480 you will see the bug. 00:20:31.480 --> 00:20:35.820 If you actually look at the source code as it is today, you won't see the bug. 00:20:36.040 --> 00:20:36.300 I see. 00:20:36.300 --> 00:20:38.880 And did you go to the source code to try to understand it? 00:20:38.880 --> 00:20:40.360 Like, if I can do it? 00:20:40.360 --> 00:20:41.260 I went to the source code. 00:20:41.260 --> 00:20:47.980 It looked to me, my hypothesis was that the set literal is parsing these things in the wrong direction, in the wrong order. 00:20:47.980 --> 00:20:50.280 That it's doing this in the wrong order. 00:20:50.280 --> 00:20:53.500 But in the source code, I could see that it was parsing it in the right order. 00:20:54.380 --> 00:20:57.640 So at that moment, I'm trying to figure out what is going on. 00:20:57.640 --> 00:20:58.720 I could reproduce the issue. 
00:20:58.720 --> 00:21:00.700 It clearly wasn't matching what the code was. 00:21:00.700 --> 00:21:09.280 So I actually went to search for issues in the bug tracker, found it, and then quickly realized that I was looking at a fixed version of the code because it already had been fixed. 00:21:09.940 --> 00:21:21.640 So what was happening is that for all currently released Python versions, if you use a set literal, each element in that literal is pushed onto the stack. 00:21:21.640 --> 00:21:24.040 And then the stack is taken in reverse to build the set. 00:21:24.040 --> 00:21:28.560 So you always get the items from last to first added to the set. 00:21:28.560 --> 00:21:31.020 Now, a set tests for equality. 00:21:31.680 --> 00:21:43.580 So if two objects test equal, even though they look different to you as a developer and they are different types, they still say that they are equal, then the first one in the set will win and will stay in the set. 00:21:43.580 --> 00:21:44.700 And the other ones are rejected. 00:21:44.700 --> 00:21:45.320 They are not added. 00:21:45.320 --> 00:21:58.460 So in this specific case, someone was using zero and then the complex number zero, so zero j, and an instance of the decimal class from the decimal module, decimal zero. 00:21:59.240 --> 00:22:10.240 And because the decimal module in Python 2 doesn't support equality tests with complex numbers, only with integers, it would test equal against zero but not against complex zero. 00:22:10.240 --> 00:22:15.500 And so depending on the order in which they go in there, it would, yeah, change it. 00:22:15.500 --> 00:22:20.560 Accept one or the other as being equal to what's already in there and reject it, and therefore you get different results. 00:22:20.560 --> 00:22:20.980 Yeah. 00:22:20.980 --> 00:22:21.860 Wow. 00:22:21.860 --> 00:22:23.240 These are definitely corner cases. 00:22:23.240 --> 00:22:25.980 This is obscure stuff, and most people won't come across these. 
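The order dependence described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: IntOnlyZero is a hypothetical class, not the real Decimal type, modelled on the behaviour described for Python 2 (equal to the integer zero, no equality support for complex numbers), and build is a made-up helper that adds items one at a time the way set() consumes an iterable. The dis call at the end shows the bytecode for a set display; the exact listing varies between Python versions, so no output is shown for it.

```python
import dis


class IntOnlyZero:
    """Hypothetical zero that only knows how to compare itself to ints."""

    def __eq__(self, other):
        if isinstance(other, int):
            return other == 0
        return NotImplemented  # e.g. complex numbers: no equality support

    def __hash__(self):
        return hash(0)  # objects that compare equal must share a hash

    def __repr__(self):
        return "IntOnlyZero()"


def build(*items):
    """Add items first to last; the first of several equal items stays."""
    result = set()
    for item in items:
        result.add(item)
    return result


z = IntOnlyZero()
forward = build(0, 0j, z)   # 0 goes in first; 0j and z both test equal to it
backward = build(z, 0j, 0)  # z goes in first; 0j is not equal to z, so it is added

print(len(forward))   # 1
print(len(backward))  # 2


def literal(a, b, c):
    return {a, b, c}  # a set display with non-constant elements


# The elements are pushed onto the stack, then a BUILD_SET instruction
# consumes them to create the set.
dis.dis(literal)
```

The same three values give a one-element set or a two-element set depending only on insertion order, which is why a set literal that consumed its elements in reverse could disagree with set() called on a list.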
00:22:26.540 --> 00:22:29.640 But that's the kind of question that I personally enjoy. 00:22:29.640 --> 00:22:33.760 Well, one, really interesting to think about why that would even be possible and what's happening. 00:22:33.760 --> 00:22:39.220 But also, you actually found a bug or the person found a bug, and you verified the bug, I guess. 00:22:39.220 --> 00:22:40.760 In this case, I confirmed the bug. 00:22:40.760 --> 00:22:41.060 Yes. 00:22:41.060 --> 00:22:42.700 Exactly, exactly. 00:22:42.700 --> 00:22:44.400 How often has that happened? 00:22:44.400 --> 00:22:45.760 It doesn't happen that often. 00:22:45.760 --> 00:22:48.960 And also for packages, there are bugs and corner cases. 00:22:49.340 --> 00:22:51.540 Python is a pretty solid piece of software. 00:22:51.540 --> 00:22:56.260 Usually, when I do find the bugs, they are usually very minor. 00:22:56.260 --> 00:23:04.660 There might be documentation bugs, or one of the modules in the standard library might have some unexpected behavior that could be improved. 00:23:04.660 --> 00:23:12.580 The fundamental issue, like the order of set literals being processed, that's pretty rare. 00:23:12.580 --> 00:23:17.500 Yeah, I suspect most of the bugs that are found are up above the standard library. 00:23:17.500 --> 00:23:20.900 They're in some external package or something like that, right? 00:23:20.900 --> 00:23:21.680 Exactly. 00:23:21.680 --> 00:23:27.660 So usually, if far fewer people look at it, it might be a different project that has a different standard for testing. 00:23:27.660 --> 00:23:32.880 And so, more likely to find them there than I would find them in Core Python. 00:23:32.880 --> 00:23:33.620 Yeah, absolutely. 00:23:33.620 --> 00:23:33.720 Absolutely. 00:23:33.720 --> 00:23:51.580 This portion of Talk Python To Me has been brought to you by Rollbar. 00:23:51.580 --> 00:23:55.100 One of the frustrating things about being a developer is dealing with errors. 
00:23:55.100 --> 00:24:03.280 Ah, relying on users to report errors, digging through log files trying to debug issues, or a million alerts just flooding your inbox and ruining your day. 00:24:03.280 --> 00:24:10.520 With Rollbar's full-stack error monitoring, you'll get the context, insights, and control that you need to find and fix bugs faster. 00:24:10.520 --> 00:24:12.220 It's easy to install. 00:24:12.220 --> 00:24:16.460 You can start tracking production errors and deployments in eight minutes or even less. 00:24:16.460 --> 00:24:25.820 Rollbar works with all the major languages and frameworks, including the Python ones, such as Django, Flask, Pyramid, as well as Ruby, JavaScript, Node, iOS, and Android. 00:24:25.820 --> 00:24:30.680 You could integrate Rollbar into your existing workflow, send error alerts to Slack or HipChat, 00:24:30.840 --> 00:24:35.020 or even automatically create issues in Jira, Pivotal Tracker, and a whole bunch more. 00:24:35.020 --> 00:24:38.380 Rollbar has put together a special offer for Talk Python To Me listeners. 00:24:38.380 --> 00:24:44.080 Visit rollbar.com slash Talk Python To Me, sign up, and get the bootstrap plan free for 90 days. 00:24:44.080 --> 00:24:47.140 That's 300,000 errors tracked all for free. 00:24:47.140 --> 00:24:50.860 But hey, just between you and me, I really hope you don't encounter that many errors. 00:24:51.300 --> 00:24:56.860 Loved by developers at awesome companies like Heroku, Twilio, Kayak, Instacart, Zendesk, Twitch, and more. 00:24:56.860 --> 00:24:58.480 Give Rollbar a try today. 00:24:58.480 --> 00:25:01.080 Go to rollbar.com slash Talk Python To Me. 00:25:08.860 --> 00:25:14.180 The next one that you said was really interesting, and I agree, is it's so simple. 00:25:14.180 --> 00:25:22.760 It's like eight or nine words, the question, but it has 3,200 upvotes. 00:25:22.760 --> 00:25:28.160 And the question is, what is a metaclass in Python, and what do you use them for? 
00:25:28.500 --> 00:25:33.220 So this, I think, is an example of a really awesome answer. 00:25:33.220 --> 00:25:39.180 So I'm not talking about the one that's marked as accepted, but the one below it that has the most upvotes. 00:25:39.180 --> 00:25:45.900 This is one user that has learned that for certain subjects, it is worth it to write an essay, to write a full answer. 00:25:45.900 --> 00:25:51.500 And his name is e-satis, and his answer for metaclasses is absolutely epic. 00:25:51.500 --> 00:25:54.220 This is why we have Stack Overflow. 00:25:54.220 --> 00:25:56.920 This is what needs to flow to the top. 00:25:57.960 --> 00:26:01.260 Metaclasses are an unfamiliar concept to most people. 00:26:01.260 --> 00:26:05.580 Metaclasses are the type that creates classes. 00:26:05.580 --> 00:26:10.280 So classes are a factory, a thing that you use to make instances. 00:26:10.280 --> 00:26:11.880 But what makes classes? 00:26:11.880 --> 00:26:13.540 Now, that's a metaclass. 00:26:13.540 --> 00:26:14.920 A metaclass can produce a class. 00:26:14.920 --> 00:26:19.720 It can be a mind-boggling and hard-to-grasp concept. 00:26:19.720 --> 00:26:26.060 But e-satis does an absolutely marvelous job of explaining this anyway. 00:26:26.320 --> 00:26:41.780 So he starts with explaining what classes are, what objects are, and this goes deeper and deeper and deeper and includes not only what metaclasses are and how they work, but also includes excellent advice as to when you want to use them, which usually is not. 00:26:42.260 --> 00:26:45.720 But when you do need them, metaclasses are a very powerful concept. 00:26:45.720 --> 00:26:49.080 And he does, I think, one of the best jobs of explaining that. 00:26:49.080 --> 00:26:52.160 Yeah, it's like a little book chapter almost that he wrote. 00:26:52.160 --> 00:26:56.120 I think it's like 15 pages, you know, based on my screen size or whatever. 00:26:56.120 --> 00:26:58.200 That's pretty impressive.
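A minimal sketch of the idea described here, that classes make instances and a metaclass makes classes. It is not taken from the Stack Overflow answer; the names Registry, Plugin, Plain and Dynamic are hypothetical and chosen only for illustration.

```python
# Hypothetical names throughout; this only illustrates "a metaclass makes classes".

class Registry(type):
    """A metaclass: its instances are classes."""

    created = []  # names of every class built by this metaclass

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.created.append(name)
        return cls


class Plugin(metaclass=Registry):
    pass


class Plain:
    pass


print(type(Plain))       # the class of an ordinary class is the built-in `type`
print(type(Plugin))      # the class of Plugin is Registry
print(Registry.created)  # Registry saw Plugin being created

# Calling type() with three arguments builds a class at runtime.
Dynamic = type("Dynamic", (object,), {"answer": 42})
print(Dynamic().answer)
```

Frameworks tend to hide this kind of hook behind a base class, which is why most code only ever consumes metaclasses.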
00:26:58.480 --> 00:27:01.280 And I really do think it's a great answer. 00:27:01.280 --> 00:27:04.340 Yeah, metaclasses are definitely interesting and useful. 00:27:04.340 --> 00:27:06.460 But I think we mostly consume them, right? 00:27:06.460 --> 00:27:10.160 More than create them as average, everyday programmers. 00:27:10.160 --> 00:27:12.800 Average, everyday programmers, you mostly consume them. 00:27:12.800 --> 00:27:14.340 Most people don't know that they even exist. 00:27:14.340 --> 00:27:16.360 This is something that frameworks use. 00:27:17.020 --> 00:27:30.940 The Django framework or the WTForms framework use metaclasses heavily because they get to have access to all the attributes you just set on a class and transform them and manipulate them and make them do something magic. 00:27:30.940 --> 00:27:33.800 Yeah, I was thinking of SQLAlchemy's declarative base. 00:27:33.800 --> 00:27:36.360 Yeah, SQLAlchemy is very similar to that. 00:27:36.360 --> 00:27:44.440 Again, it's a model where each of the fields defines something about the model that is a little bit more than just the object you just put in there. 00:27:45.240 --> 00:27:49.840 So you define a class that also models the columns in a table. 00:27:49.840 --> 00:27:53.580 You usually want those attributes to behave differently in different contexts. 00:27:53.580 --> 00:28:01.940 Like when you are creating a new instance or you are just putting stuff in from the database, that moment, that class has to behave differently. 00:28:01.940 --> 00:28:07.300 And that complexity, it can be neatly encapsulated and hidden by using a metaclass. 00:28:07.300 --> 00:28:09.520 Yeah, it's really, really wonderful. 00:28:09.520 --> 00:28:10.740 Awesome. 00:28:10.740 --> 00:28:11.140 Okay. 00:28:11.140 --> 00:28:13.780 How about accepting, next question. 00:28:14.200 --> 00:28:17.100 Accepting the user input until you get a valid response. 00:28:17.100 --> 00:28:17.820 Yeah.
00:28:17.820 --> 00:28:24.160 I picked that one because it's a very good example of, again, what the Python community around Stack Overflow does. 00:28:24.160 --> 00:28:30.760 So Stack Overflow itself is a site, but next to that we have a chat site as well. 00:28:30.760 --> 00:28:40.420 So there's a chat.stackoverflow.com where there's various different rooms where people, members of the community come together to hang out or to chat about all sorts of weird things and sometimes even about Python. 00:28:40.420 --> 00:28:41.880 And there's a Python chat room. 00:28:41.880 --> 00:28:50.000 And because people talk about and are connected to Stack Overflow, that room sees lots of patterns happening. 00:28:50.000 --> 00:28:53.840 So questions that keep coming up, certain problems that keep coming up. 00:28:53.940 --> 00:28:55.900 And there's sometimes a class of them. 00:28:55.900 --> 00:29:02.860 Like using raw input or input in your Python program to get a response from a user. 00:29:02.860 --> 00:29:08.900 Most homework starts with take an integer from the user and put it into something. 00:29:09.560 --> 00:29:15.440 And that is something many new programmers fall over and keep stumbling into. 00:29:15.440 --> 00:29:27.260 So this particular question was written by a member of the Python chat room together with an answer based on their experience helping people again and again and again with the same issues, the same related issues. 00:29:27.900 --> 00:29:29.360 What do you do to validate that input? 00:29:29.360 --> 00:29:33.240 What do you do when the user has got it wrong and you want to repeat it? 00:29:33.240 --> 00:29:37.180 All the pitfalls all collected into one place and as a self-answer. 00:29:37.180 --> 00:29:45.440 So this was both the question and the answer have been written by one person at the start and then refined later on by the members of the community. 00:29:45.440 --> 00:29:46.180 Interesting. 
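The pattern that question covers, asking again until the reply is valid, can be sketched roughly as follows. This is one possible shape, not the text of the community wiki answer; the function name ask_int and its read parameter are hypothetical, and read exists only so the loop can be exercised without a keyboard.

```python
def ask_int(prompt, read=input):
    """Keep asking until the reply parses as an integer.

    `read` defaults to the built-in input(); it is a hypothetical hook so
    the loop can be driven by canned replies.
    """
    while True:
        reply = read(prompt)
        try:
            return int(reply)
        except ValueError:
            # int() rejects empty strings and non-numeric text alike.
            print(f"{reply!r} is not a whole number, please try again.")


# Usage with canned replies instead of a real user:
replies = iter(["abc", "", "42"])
age = ask_int("Your age: ", read=lambda prompt: next(replies))
print(age)
```

Keeping the conversion inside the try block means every kind of bad reply takes the same retry path.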
00:29:46.760 --> 00:29:51.680 So this one, they made the answer, their own answer, a community wiki. 00:29:51.680 --> 00:29:53.000 Exactly. 00:29:53.000 --> 00:30:04.000 I've seen that you can make an answer a community wiki or you can sort of keep it to yourself and you get reputation if you keep it to yourself and you don't on a community wiki. 00:30:04.000 --> 00:30:05.000 You want to talk about that? 00:30:05.000 --> 00:30:05.320 Exactly. 00:30:05.320 --> 00:30:06.500 Yeah. 00:30:06.500 --> 00:30:12.900 The community wiki is definitely aimed to be able to lower the bar of improving a post. 00:30:13.080 --> 00:30:20.300 So in this case, the community definitely felt this is something we as a community want to do because it's pretty much needed and it's too scattered around the site. 00:30:20.300 --> 00:30:24.840 And we don't want it to be owned by one person specific. 00:30:24.840 --> 00:30:29.420 So you should feel more, you should feel freer to actually edit this and improve it. 00:30:29.420 --> 00:30:30.960 That's why it's made a community wiki. 00:30:30.960 --> 00:30:40.300 And the moment you actually then invite more people to edit it, then you don't no longer know who can earn more than other people if there's upvotes on this. 00:30:40.300 --> 00:30:41.860 So that's completely taken away. 00:30:41.860 --> 00:30:42.720 Yeah, that makes sense. 00:30:42.720 --> 00:30:44.660 Also see that it's protected. 00:30:44.660 --> 00:30:46.000 The question is protected. 00:30:46.000 --> 00:30:46.800 What does that mean? 00:30:46.800 --> 00:30:47.160 Yes. 00:30:47.160 --> 00:30:50.420 That is just something, a trick we use as moderators. 00:30:50.420 --> 00:30:52.980 But high rep users can do the same thing. 00:30:52.980 --> 00:31:02.060 When a post like this becomes so very popular, what can happen is that lots and lots of new people, so people with low reputation come in and go, 00:31:02.840 --> 00:31:03.680 my solution is unique. 
00:31:03.680 --> 00:31:04.320 My solution is unique. 00:31:04.320 --> 00:31:09.100 And they don't have the experience to actually see that their solution is perhaps not that unique. 00:31:09.100 --> 00:31:11.980 Or they say, I don't quite understand. 00:31:11.980 --> 00:31:15.640 They don't quite understand yet how Stack Overflow works and they don't understand the answer. 00:31:16.060 --> 00:31:23.720 So they might post in an answer below it another question where they request that someone explains how the whole thing works. 00:31:23.720 --> 00:31:25.520 Well, answers are not questions. 00:31:25.520 --> 00:31:27.140 So that is the wrong place to post this. 00:31:28.740 --> 00:31:33.020 So a question like this can attract lots of those kinds of answers. 00:31:33.420 --> 00:31:47.020 So if I have enough reputation that I can see not only the answers that are posted, which currently says 10 at the top, but most people can only see the first three, four, six posts. 00:31:47.020 --> 00:31:57.960 So there's four more on that same page that have been posted before that are either someone not having understood that their solution is not unique and that they have basically repeated the same thing or explained it wrong. 00:31:57.960 --> 00:31:59.580 Or are questions. 00:31:59.580 --> 00:32:00.780 This one doesn't work. 00:32:00.780 --> 00:32:01.140 How does it? 00:32:01.140 --> 00:32:01.980 Yeah, interesting. 00:32:02.280 --> 00:32:07.120 So what kind of powers do you get as you go up in reputation there? 00:32:07.120 --> 00:32:09.680 Full disclosure, my reputation is like 2,000. 00:32:09.680 --> 00:32:11.000 I've answered some questions. 00:32:11.000 --> 00:32:16.200 I've asked some questions, but nowhere near that I would have any of the exposure to the stuff that you get. 00:32:16.200 --> 00:32:19.060 There's quite a wide range of things that we can do. 
00:32:19.060 --> 00:32:30.100 Mostly what we like to see is that people start, when you start getting invested in Stack Overflow and you start gaining reputation, you get experience with how things work and we trust you more and more. 00:32:30.100 --> 00:32:32.500 And at that moment we can rope you in to help. 00:32:32.500 --> 00:32:44.380 So a lot of these, once you get past the initial kind of points where you can start voting and start adding comments, which, to keep spammers out, we don't give to you immediately. 00:32:44.380 --> 00:32:48.840 You start getting into what we call community moderation privileges. 00:32:48.840 --> 00:32:54.480 So you can start helping the elected community moderators keep the site clean. 00:32:54.480 --> 00:32:57.220 Most of the site is kept clean by the community. 00:32:57.220 --> 00:33:02.440 So you can start doing like, you're allowed to edit posts at 2,000 points. 00:33:02.440 --> 00:33:05.240 We allow you to edit posts without review. 00:33:05.240 --> 00:33:09.000 So you can go to any other posts and start fixing things. 00:33:09.420 --> 00:33:10.860 If you see a typo, fine. 00:33:10.860 --> 00:33:11.860 You have 2,000 points or more. 00:33:11.860 --> 00:33:14.080 We trust you that you can make that change. 00:33:14.080 --> 00:33:18.060 At some point you can start helping with gardening the tags. 00:33:18.060 --> 00:33:21.420 The tags could be marked as synonyms of one another, say. 00:33:21.420 --> 00:33:24.680 The Python tag has got several synonyms. 00:33:24.680 --> 00:33:27.520 Like Pythonic was not seen as something that needed a separate tag. 00:33:27.520 --> 00:33:29.620 It's made a synonym of the Python tag. 00:33:29.620 --> 00:33:32.060 At 2,500 points you can start helping with that. 00:33:32.700 --> 00:33:35.380 And then you can start helping close posts or put them on hold. 00:33:35.380 --> 00:33:38.900 And also reopen them again if you feel that it shouldn't stay closed.
00:33:38.900 --> 00:33:40.220 Maybe it was improved and fixed. 00:33:40.220 --> 00:33:41.720 So now you can reopen it again. 00:33:41.720 --> 00:33:45.620 Tag wikis, the things that we keep a bit of documentation about each tag. 00:33:45.620 --> 00:33:47.180 Is it called a tag wiki and that's that? 00:33:47.180 --> 00:33:50.180 At some point you're allowed to edit those without review as well. 00:33:50.180 --> 00:33:53.620 Et cetera, et cetera, all the way to 20,000 points. 00:33:53.620 --> 00:33:58.220 And that's the highest amount of, oh no, sorry, 25,000 points. 00:33:58.220 --> 00:34:00.380 It's the highest level that you can get to. 00:34:00.660 --> 00:34:04.260 And we give you access to internal data, site statistics. 00:34:04.260 --> 00:34:07.940 Because then we think you're so invested into this, you really want to know how well we're doing. 00:34:07.940 --> 00:34:09.060 That's really cool. 00:34:09.060 --> 00:34:12.040 Yeah, and you're a little bit past 25,000 these days. 00:34:12.040 --> 00:34:14.080 So you've had those privileges for a while. 00:34:14.080 --> 00:34:15.120 But that's really great. 00:34:15.120 --> 00:34:18.160 I think I might be a little past this, yeah. 00:34:18.160 --> 00:34:22.700 Exactly, like 400, I don't know, hundreds of thousands past that. 00:34:22.700 --> 00:34:23.080 That's awesome. 00:34:23.080 --> 00:34:28.600 So another question that you put up here has to do with creating type level variables 00:34:28.600 --> 00:34:36.880 and trying to initialize one of them with a list comprehension from another or something like this, right? 00:34:36.880 --> 00:34:37.560 Yeah. 00:34:37.720 --> 00:34:43.820 This is one of those wah moments, again, where I learned such a lot about the Python programming language, 00:34:43.820 --> 00:34:51.400 about how scopes interact, but also a little bit about how the Python developers definitely see the language as something living 00:34:51.400 --> 00:34:54.400 and will make changes if that makes sense. 
00:34:54.960 --> 00:35:02.320 So in this case, this is a weird interaction between scopes for classes and functions and list comprehensions. 00:35:02.320 --> 00:35:08.500 Now, normally when you have a function, that's a scope, and we have the global scope. 00:35:08.500 --> 00:35:09.980 So we have a local and a global scope. 00:35:09.980 --> 00:35:11.640 The things are simple. 00:35:11.640 --> 00:35:12.500 You just have functions. 00:35:12.680 --> 00:35:18.400 But as soon as you start nesting functions or you start using functions in classes, you introduce more scopes, 00:35:18.400 --> 00:35:23.780 and you have to start figuring out how you access names that exist outside of the function. 00:35:23.780 --> 00:35:30.160 So normally in a function, if you use a global, it is available because it can be found in the global namespace, 00:35:30.160 --> 00:35:31.320 and that's part of the normal search. 00:35:31.320 --> 00:35:36.380 But as soon as you start using classes, a class object is also a scope. 00:35:36.820 --> 00:35:42.240 So the class definition of building your class object has a separate scope of names, 00:35:42.240 --> 00:35:44.520 and different rules apply there. 00:35:44.520 --> 00:35:50.060 Let's say you have a class that defines a class attribute, and then you define a method in that. 00:35:50.060 --> 00:35:55.340 The class attribute is not a parent scope, and it's not available to the function in the class 00:35:55.340 --> 00:36:00.020 because you normally would bind that function as a method, and then you work with an instance over the class, 00:36:00.020 --> 00:36:01.380 and there might be multiple of these. 00:36:01.380 --> 00:36:05.980 And therefore, you need to separate out the lifetime of the instance in the class, 00:36:06.580 --> 00:36:08.060 and defining it. 00:36:08.060 --> 00:36:15.220 So the names in a class scope are not normally available to the functions that you define inside of the class. 00:36:15.220 --> 00:36:16.160 That comes later. 
00:36:16.160 --> 00:36:19.640 You can access those via the class object that has been created. 00:36:19.640 --> 00:36:22.380 How does it play with list comprehensions, you ask? 00:36:22.380 --> 00:36:29.220 Well, the Python developers started out seeing list comprehensions as very simple constructs, 00:36:29.220 --> 00:36:31.240 no different than a for loop. 00:36:31.240 --> 00:36:35.600 If you put the for loop in a function, all the names you use in that for loop, 00:36:35.600 --> 00:36:37.620 including the target for the iteration. 00:36:38.120 --> 00:36:48.860 So the for name in an iterable, each time you iterate through the for loop, the name that you've used for that gets assigned the next value 00:36:48.860 --> 00:36:51.140 in the sequence. 00:36:51.140 --> 00:36:56.100 That name lives in the same scope everything else lives in for that function. 00:36:56.100 --> 00:36:57.200 So this is the local name. 00:36:57.600 --> 00:37:00.880 So after your for loop is done, you see that local name, and you can still access it. 00:37:00.880 --> 00:37:04.720 So for i in range(10), you'll make your loop run from 0 to 9. 00:37:04.720 --> 00:37:08.900 At the end of it, i equals 9 is still available to the rest of the code. 00:37:08.900 --> 00:37:11.100 List comprehensions were seen exactly the same. 00:37:11.100 --> 00:37:13.800 Just a local thing in your scope. 00:37:14.000 --> 00:37:16.260 So the for loop target of that is available. 00:37:16.260 --> 00:37:20.500 The Python developers then next created generator expressions. 00:37:20.500 --> 00:37:23.220 A generator expression is evaluated lazily. 00:37:23.220 --> 00:37:26.160 So you can sort of see it as a function. 00:37:26.160 --> 00:37:27.940 You first create it. 00:37:27.940 --> 00:37:33.400 It is an object that will behave in a certain way, but only when you ask it to later on. 00:37:33.400 --> 00:37:35.500 So you have deferred execution. 00:37:36.020 --> 00:37:36.200 Right.
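The two behaviours just described, a for loop target that outlives its loop and a generator expression that does nothing until asked, can be sketched like this. The function names loop_demo, generator_demo and record are hypothetical.

```python
def loop_demo():
    for i in range(10):
        pass
    return i  # still bound after the loop; the last value assigned was 9


def generator_demo():
    calls = []

    def record(n):
        calls.append(n)
        return n

    gen = (record(n) for n in range(3))  # created, but record() has not run yet
    before = list(calls)                 # snapshot taken before any iteration
    first = next(gen)                    # runs just far enough to yield one value
    return before, first, calls


print(loop_demo())
print(generator_demo())
```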
00:37:36.200 --> 00:37:42.620 So if you want to try to, say, access the iteration variable in a generator, a generator expression, 00:37:42.620 --> 00:37:48.880 it may have been created, but it may not have been created, depending on whether you've actually executed the generator, right? 00:37:48.880 --> 00:37:49.480 Exactly. 00:37:49.480 --> 00:37:54.680 So you actually, at that point, you haven't executed your generator. 00:37:54.680 --> 00:37:56.220 So these variables don't even exist yet. 00:37:56.220 --> 00:37:59.400 So generators were given a hidden function scope. 00:37:59.400 --> 00:38:00.580 They behave just like functions. 00:38:00.580 --> 00:38:04.600 Everything inside it is local to a new function, and that's hidden. 00:38:04.880 --> 00:38:07.120 So that you can execute it just like a function. 00:38:07.120 --> 00:38:16.740 And then the local variables are part of the generator object, like a function would be, and are cleaned up separately once a generator object is done. 00:38:16.740 --> 00:38:23.220 And the Python developers, as they built this, then realized that this also really applies to list comprehensions. 00:38:23.220 --> 00:38:28.740 List comprehensions and generator expressions are just basically the same thing, except one of them is immediately executed. 00:38:28.740 --> 00:38:29.760 The other one is deferred. 00:38:29.760 --> 00:38:33.020 So in Python 3, list comprehensions were given a scope too. 00:38:33.020 --> 00:38:34.460 Now, why is this important? 00:38:34.740 --> 00:38:41.140 I already talked about how classes have a scope, and then the methods in the class can't access the class scope. 00:38:41.140 --> 00:38:42.600 They will have to go via the class. 00:38:42.600 --> 00:38:46.780 This applies to generator expressions and list comprehensions too. 00:38:46.780 --> 00:38:47.740 Well, at least 00:38:47.740 --> 00:38:49.400 they do in Python 3.
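A small sketch of that class-scope interaction, assuming Python 3. The class names Works and Fails are hypothetical. Only the outermost iterable of a comprehension is evaluated in the class body's scope; everything else runs in the comprehension's own scope, which cannot see class-level names.

```python
class Works:
    base = 10
    # The outermost iterable, range(base), is evaluated in the class scope,
    # so `base` is visible here.
    values = [n for n in range(base)]


try:
    class Fails:
        base = 10
        # The element expression runs in the comprehension's own scope,
        # which skips the class body when looking up `base`.
        values = [base + n for n in range(3)]
except NameError as exc:
    error = exc
else:
    error = None

print(len(Works.values))
print(type(error).__name__)
```

Code inside a method avoids the problem by going through the class object, for example Works.base.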
00:38:49.400 --> 00:38:55.520 In Python 2, where list comprehensions are still sort of simple things without their own scope, that behaves differently. 00:38:55.520 --> 00:39:00.340 So there was a discrepancy there between Python 2 and Python 3 where things changed. 00:39:00.340 --> 00:39:07.440 And most of the time, people write simple classes and write class attributes and write their methods. 00:39:07.440 --> 00:39:12.760 And they don't think about trying to access things that they just defined for class attributes. 00:39:12.760 --> 00:39:22.540 But when you're writing, say, a list comprehension to make an attribute of your class, it's certainly very surprising that you can't access anything else you already defined for that class. 00:39:22.540 --> 00:39:24.580 Because it's a scope that is not available to you. 00:39:24.580 --> 00:39:30.080 I think this is a really interesting example of understanding why this was changed. 00:39:30.080 --> 00:39:40.980 Do you know if the Python core developers are paying attention to Stack Overflow and possibly making changes to the next version of Python based on things that appear there? 00:39:41.040 --> 00:39:43.700 Or does it have to bubble up farther through some other channel? 00:39:43.700 --> 00:39:52.260 Stack Overflow is one of the inputs, one of the sources where you see there's a long tradition of the Python developers looking at how Python is used. 00:39:52.260 --> 00:39:59.180 There's a longstanding mailing list called Python Tutor where you can ask questions and people will answer those. 00:39:59.180 --> 00:40:04.360 It's like a mailing list forum that answers Python beginner questions. 00:40:05.120 --> 00:40:15.020 And I have seen plenty of evidence that that source over the years has already informed the core developers on how they do things, that they see these common problems.
00:40:15.020 --> 00:40:22.900 A lot of the core developers are themselves consultants that teach Python, teachers or consultants that help companies use Python better. 00:40:22.900 --> 00:40:25.440 And they use those inputs too. 00:40:25.440 --> 00:40:27.500 And Stack Overflow is just another input in that way. 00:40:27.500 --> 00:40:30.520 So I definitely think that they're using it. 00:40:30.520 --> 00:40:35.440 I know that several of the Python core developers are regularly answering on Stack Overflow as well. 00:40:35.440 --> 00:40:38.040 So they definitely get that input too. 00:40:38.040 --> 00:40:38.920 Yeah, that's great. 00:40:38.920 --> 00:40:39.720 I'm sure they are. 00:40:39.720 --> 00:40:47.620 But this type of question seems like it can really clearly pull out some of those decisions. 00:40:47.620 --> 00:40:52.160 And maybe if it came earlier, before that was changed, maybe highlight some of the problems. 00:40:52.160 --> 00:40:53.360 Exactly. 00:40:53.360 --> 00:40:53.420 Exactly. 00:40:53.420 --> 00:41:12.340 This portion of Talk Python To Me is brought to you by GoCD from ThoughtWorks. 00:41:12.340 --> 00:41:16.980 GoCD is the on-premise, open-source, continuous delivery server. 00:41:16.980 --> 00:41:23.800 With GoCD's comprehensive pipeline model, you can model complex workflows for multiple teams with ease. 00:41:23.800 --> 00:41:29.620 And GoCD's value stream map lets you track changes from commit to deployment at a glance. 00:41:29.620 --> 00:41:34.600 GoCD's real power is in the visibility it provides over your end-to-end workflow. 00:41:34.600 --> 00:41:39.560 You get complete control of and visibility into your deployments across multiple teams. 00:41:39.560 --> 00:41:44.040 Say goodbye to release day panic and hello to consistent, predictable deliveries. 00:41:44.300 --> 00:41:48.960 Commercial support and enterprise add-ons, including disaster recovery, are available. 
00:41:48.960 --> 00:41:54.920 To learn more about GoCD, visit talkpython.fm/gocd for a free download. 00:41:54.920 --> 00:41:58.240 That's talkpython.fm/gocd. 00:41:58.240 --> 00:41:59.320 Check them out. 00:41:59.320 --> 00:42:00.340 It helps support the show. 00:42:08.620 --> 00:42:16.640 Speaking of newer versions of Python, the next question is, why is Python 3's super function magic or super method? 00:42:16.640 --> 00:42:21.500 Yeah, but this definitely came also from feedback on how things work and don't work for Python. 00:42:21.500 --> 00:42:31.460 So the super function, I'm going to introduce it real quick, is the way that you can access methods on parent classes. 00:42:31.800 --> 00:42:44.820 So when you override in a subclass a method and you still want to use the functionality of the original, you can use super to access methods on classes in your method resolution order. 00:42:44.820 --> 00:42:49.540 So the linearization of all your base classes. 00:42:50.300 --> 00:42:56.360 The linearization means that we put all your classes in a specific order and that order is always fixed and predictable. 00:42:56.360 --> 00:43:06.720 So when you create a subclass and you want to put an __init__ method in it and you want to initialize a few more attributes that your more specialized subclass uses, 00:43:06.720 --> 00:43:10.700 but you still want to use the original __init__ method, then you use super to get there. 00:43:10.920 --> 00:43:16.000 Now, the way Python inheritance works, I already mentioned the method resolution order. 00:43:16.000 --> 00:43:23.080 You want to be able to find which class is next in line so you can call its __init__ method. 00:43:23.080 --> 00:43:28.440 When you're using multiple inheritance and diamond patterns, it's not always that clear, but Python solves that for you. 00:43:28.440 --> 00:43:35.600 And so you don't have to worry about it if you use the super method because it will take care of finding the next class for you.
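A minimal sketch of the diamond case mentioned here, written with the Python 3 spelling of super. The class names Base, Left, Right and Child are hypothetical; the point is only that each __init__ hands off to whichever class is next in the method resolution order.

```python
class Base:
    def __init__(self):
        self.trail = ["Base"]


class Left(Base):
    def __init__(self):
        super().__init__()          # next in line for a Child instance is Right
        self.trail.append("Left")


class Right(Base):
    def __init__(self):
        super().__init__()          # next in line is Base
        self.trail.append("Right")


class Child(Left, Right):
    def __init__(self):
        super().__init__()          # next in line is Left
        self.trail.append("Child")


print([cls.__name__ for cls in Child.__mro__])
print(Child().trail)
```

Note that Left never names Right, yet Right.__init__ still runs for a Child instance, because the lookup follows the instance's linearization rather than Left's own base class.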
00:43:35.600 --> 00:43:41.920 Right, there's a predictable, well-known order in which it will search for the thing you're trying to call on the super, yeah. 00:43:41.920 --> 00:43:42.720 Exactly. 00:43:42.720 --> 00:43:46.000 So we call it the method resolution order. 00:43:46.000 --> 00:43:48.720 That's the order in which the next one will come about. 00:43:48.720 --> 00:43:56.960 But for a super to be able to know where it is, because your class could be subclassed and subclassed and subclassed further, 00:43:56.960 --> 00:44:02.540 for it to know where on the method resolution order for the current object, 00:44:03.180 --> 00:44:07.800 because that matters, it's not just the current class, it's for the current instance that may use a subclass, 00:44:07.800 --> 00:44:13.760 it has to know where on that line you are at this point, this method. 00:44:13.760 --> 00:44:19.840 So if you have a foo.__init__ method and you want to call the next one in line, 00:44:19.840 --> 00:44:22.500 you need to tell super, I am foo. 00:44:22.500 --> 00:44:24.260 So you can search past that point. 00:44:24.260 --> 00:44:28.580 You can find foo in the line of other classes in the whole method resolution order 00:44:28.580 --> 00:44:30.800 and find the next one in line. 00:44:30.800 --> 00:44:38.900 So just like your current type, the type of object you are, isn't enough because you could have called down into some intermediate level of inheritance, 00:44:38.900 --> 00:44:42.760 which is then calling super or something that can get really complicated, right? 00:44:42.760 --> 00:44:43.340 Exactly. 00:44:43.340 --> 00:44:49.420 Foo could have been subclassed by bar and bar could have called foo.__init__ via the super trick. 00:44:49.920 --> 00:44:53.960 And then self, the type of self is bar and therefore you can't use that. 00:44:53.960 --> 00:44:57.720 You can't say type of self is bar and therefore I'm going to start searching from there.
00:44:57.720 --> 00:45:01.640 Because you're no longer in the bar class, you're now in the foo class. 00:45:01.640 --> 00:45:06.660 So you explicitly in Python 2 have to tell it, start searching here. 00:45:06.660 --> 00:45:09.260 That's the first argument is the current class. 00:45:09.260 --> 00:45:11.920 Second argument is the current instance. 00:45:11.920 --> 00:45:14.360 So you can find the MRO from that. 00:45:14.360 --> 00:45:17.460 That means you constantly are repeating yourself. 00:45:17.460 --> 00:45:19.660 So you're writing a class foo. 00:45:19.660 --> 00:45:24.540 And then in all the methods that need to use super, you keep saying, this is class foo. 00:45:24.540 --> 00:45:25.380 This is class foo. 00:45:25.380 --> 00:45:26.180 This is class foo. 00:45:26.180 --> 00:45:29.960 So that violates the do not repeat yourself principle, the DRY principle. 00:45:29.960 --> 00:45:37.740 Then there are other tricks in Python, like using a class decorator or simply storing your class in a different name. 00:45:37.740 --> 00:45:44.740 Classes are objects, so you can use a new variable, a new global variable, call it spam and assign foo to it. 00:45:44.740 --> 00:45:47.640 And then you can call spam and suddenly it is called spam. 00:45:47.640 --> 00:45:51.300 So you cannot also at that moment guarantee that the old name still exists. 00:45:51.300 --> 00:45:57.700 So what might happen if you have a class foo and you use super(foo, self) everywhere in there? 00:45:57.700 --> 00:46:05.000 It may be that at the moment that you call foo.__init__, the class is no longer available under the name foo. 00:46:05.000 --> 00:46:09.360 It might be called something different or something else might have been assigned to foo. 00:46:09.360 --> 00:46:16.300 So now you have a problem because when you created the class, it was called one thing and now it's called something else and the names no longer match. 00:46:16.300 --> 00:46:17.860 And then you have problems.
00:46:17.860 --> 00:46:18.640 Yeah, absolutely. 00:46:18.640 --> 00:46:25.700 So super is super, super tied to the class existing under the same name. 00:46:26.100 --> 00:46:28.940 And you have to constantly repeat yourself. 00:46:28.940 --> 00:46:33.900 So the Python developers thought long and hard as to how they were going to solve this. 00:46:33.900 --> 00:46:36.980 How can they avoid having to repeat yourself? 00:46:36.980 --> 00:46:51.780 And how can they somehow record the current class on which you just defined this method, so that it no longer depends on runtime circumstances where the class is still available under the same name? 00:46:51.780 --> 00:46:57.020 And that's where Python 3 super comes from, where you no longer have to pass in anything anymore. 00:46:57.020 --> 00:46:59.080 You call super without arguments. 00:46:59.080 --> 00:47:01.780 And that moment, something magic happens. 00:47:02.460 --> 00:47:09.620 And that magic, some people feel, goes against the grain of certain Python principles. 00:47:09.620 --> 00:47:13.540 Like there should be obvious things happening. 00:47:13.540 --> 00:47:16.620 That Python is very explicit about what it does. 00:47:16.620 --> 00:47:25.540 As Tim Peters once said in a very famous little recorded Python philosophy, there should be one obvious way to do it. 00:47:25.740 --> 00:47:28.100 And explicit is better than implicit. 00:47:28.100 --> 00:47:30.100 Super in Python 3 is implicit. 00:47:30.100 --> 00:47:33.680 When you don't pass in arguments, it implicitly finds its arguments. 00:47:33.680 --> 00:47:35.500 So that's why people call this magic. 00:47:35.500 --> 00:47:39.200 But the reason why they're doing it is to avoid having to repeat yourself. 00:47:39.200 --> 00:47:49.120 There's a lot of extra things going on under the hood in that when you create a method in a class, then a closure is created specifically for it.
00:47:49.120 --> 00:47:57.940 So it is as if the function has been defined in a context where the name __class__ is available in a parent context. 00:47:57.940 --> 00:48:02.940 Basically, it means it captures its setup when it gets created. 00:48:02.940 --> 00:48:06.360 And then it can't really be influenced from the outside. 00:48:06.360 --> 00:48:08.320 It can no longer be influenced from the outside. 00:48:08.320 --> 00:48:12.060 And you no longer have to repeat yourself in passing it in. 00:48:12.060 --> 00:48:18.900 It was also a common source of problems, a common source of errors where people misunderstood what needs to be passed in. 00:48:18.900 --> 00:48:26.520 I already said you explicitly have to call, name foo and not take the current type of self because that may no longer be foo. 00:48:26.520 --> 00:48:27.580 There might be a subclass. 00:48:28.380 --> 00:48:36.640 And at the time, I did a search on code search engines and found tens of thousands of errors of people passing in either self.__class__ 00:48:36.640 --> 00:48:41.140 or passing in type(self), thinking that's good enough. 00:48:41.140 --> 00:48:44.460 I don't have to repeat myself here and type foo again. 00:48:44.460 --> 00:48:46.300 I'll just take it from self. 00:48:46.300 --> 00:48:53.280 And that leads to problems because when you have a subclass that also uses super, you get an infinite loop. 00:48:53.280 --> 00:49:00.940 Because then you say foo is subclassed by bar, bar calls foo, and then foo passes in bar each time. 00:49:00.940 --> 00:49:05.180 And therefore, you go back up in the circle and we keep calling the same thing. 00:49:05.180 --> 00:49:07.320 Do you actually get a stack overflow exception? 00:49:07.320 --> 00:49:09.120 You get an infinite recursion error. 00:49:09.120 --> 00:49:09.440 Okay. 00:49:09.440 --> 00:49:11.860 It protects you from stack overflow. 00:49:11.860 --> 00:49:12.600 Yeah, good. 00:49:12.600 --> 00:49:14.180 Nice.
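The pitfall and the fix described above can be sketched as follows, assuming Python 3. The class names are hypothetical. BrokenBar passes type(self) to super, which only works until someone subclasses it; Bar uses the zero-argument form, which reads the defining class from the hidden __class__ cell.

```python
class Foo:
    def greet(self):
        return "Foo"


class BrokenBar(Foo):
    def greet(self):
        # Bug: type(self) is the instance's class, not the class this
        # method was defined in.
        return super(type(self), self).greet() + " > BrokenBar"


class BrokenBaz(BrokenBar):
    def greet(self):
        return super().greet() + " > BrokenBaz"


class Bar(Foo):
    def greet(self):
        # Zero-argument form: the compiler records the defining class
        # in a closure cell named __class__.
        return super().greet() + " > Bar"


print(BrokenBar().greet())   # works only because the instance is a BrokenBar
try:
    BrokenBaz().greet()      # BrokenBar.greet keeps finding itself
except RecursionError:
    print("RecursionError")
print(Bar().greet())
print(Bar.greet.__code__.co_freevars)
```

For a BrokenBaz instance, super(type(self), self) inside BrokenBar.greet searches past BrokenBaz and lands on BrokenBar.greet again, so the call never reaches Foo.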
00:49:14.180 --> 00:49:14.540 All right. 00:49:14.540 --> 00:49:16.160 So that is really cool. 00:49:16.160 --> 00:49:29.800 And, you know, I think what's really interesting is, as you mentioned in the beginning, but as we go through these, it's really clear how much you learn by going through this exercise of reading the question and the answer, or answering it yourself. 00:49:29.800 --> 00:49:37.500 It's not just what is the question, what is the answer, but there's so many additional things and techniques and practices that come through. 00:49:37.500 --> 00:49:38.580 I think it's great. 00:49:38.580 --> 00:49:47.760 I really learned to understand what the decimal due does and how that works and how the Python evaluation loop works purely from answering on Stack Overflow. 00:49:47.760 --> 00:49:49.740 Yeah, I can see that. 00:49:49.740 --> 00:49:50.520 All right. 00:49:50.520 --> 00:49:54.480 So another one is about Python 3's range objects. 00:49:54.480 --> 00:50:02.340 Really two questions, and sort of relative to Python 2's, which just turns them into a list of the things, a list of the integers. 00:50:02.340 --> 00:50:15.120 Yeah, so in Python 2, you have range, which is just a function that produces a list of integers, and xrange, which is a separate object that models the sequence, but it's quite simple and it's quite limited. 00:50:15.120 --> 00:50:19.740 In Python 3, they renamed xrange to range because you don't really need that as a function. 00:50:19.740 --> 00:50:24.980 You can just call list on the range object and then get the same results. 00:50:24.980 --> 00:50:29.560 So range in Python 3 is a separate sequence object. 00:50:29.560 --> 00:50:30.760 And I love the object. 00:50:30.760 --> 00:50:31.940 I think it's a great idea. 00:50:31.940 --> 00:50:41.280 I wonder why it took so long to make this such a first class object and to make it something that is part of Python core the way range is now in Python 3.
00:50:41.280 --> 00:50:43.120 Ranges are very lightweight. 00:50:43.120 --> 00:50:49.300 You just tell it, this is the start, this is the end, and this is the step size, how we get from start to end. 00:50:49.300 --> 00:50:50.840 And that's all it records. 00:50:50.840 --> 00:50:56.200 It doesn't need to know all the numbers in between because you can very quickly just calculate those. 00:50:56.640 --> 00:51:07.500 If you know that you start at 1 and you end at 10 and your step size is 2, then you know that 3 and 5 and 7 and 9 are all part of that sequence. 00:51:07.500 --> 00:51:09.380 You can just calculate that, right? 00:51:09.380 --> 00:51:10.160 It's just math. 00:51:10.780 --> 00:51:12.300 So why would you have to keep that in memory? 00:51:12.300 --> 00:51:16.920 So most people still don't realize that range objects are smart like that. 00:51:17.700 --> 00:51:20.260 So you do see lots of questions about them. 00:51:20.260 --> 00:51:32.780 These are two example questions about someone being surprised, so surprised that if you make the range really large, you think that if you start at 1 and you put the endpoint at 10 million, that's numbers, right? 00:51:32.860 --> 00:51:37.120 The human mind will see that that's 10 million numbers in that object. 00:51:37.120 --> 00:51:38.140 That must take a lot of memory. 00:51:38.140 --> 00:51:44.760 That must be really slow to figure out whether or not the number 10 million and 1 is part of that range. 00:51:44.760 --> 00:51:45.440 But it isn't. 00:51:45.440 --> 00:51:46.420 It's just math. 00:51:46.420 --> 00:51:47.700 It's very easy to calculate. 00:51:48.140 --> 00:51:56.160 Yeah, and it gives you a good look inside it because the person asking the question says, well, I could write one with generators and yields and so on. 00:51:56.160 --> 00:51:58.560 But it takes it forever. 00:51:58.560 --> 00:52:04.060 And so your answer to it was basically, well, let's look at the actual implementation. 
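A short sketch of the "it's just math" point, assuming Python 3.6 or later. The helper in_range_by_math is hypothetical and handles only positive steps; it illustrates the arithmetic and is not CPython's actual implementation. The size comparison reflects CPython behavior and may differ on other interpreters.

```python
import sys

small = range(1, 10, 2)        # start 1, stop 10 (exclusive), step 2
big = range(1, 10_000_001)     # ten million integers, none stored


def in_range_by_math(value, start, stop, step):
    """Membership by arithmetic alone (simplified: positive step only)."""
    return start <= value < stop and (value - start) % step == 0


print(list(small))                      # the elements are computed on demand
print(7 in small, 8 in small)           # membership without scanning
print(10_000_000 in big, 10_000_001 in big)
print(big[-1], len(big))                # indexing and length are computed too
print(in_range_by_math(7, 1, 10, 2))

# In CPython both objects report the same size, because only start, stop
# and step are recorded, not the numbers in between.
print(sys.getsizeof(small) == sys.getsizeof(big))
```

A hand-written generator, by contrast, has to produce every value up to the one being tested, which is why the generator version mentioned in the question feels so slow.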
00:52:04.060 --> 00:52:09.080 And it's not just something that generates a sequence but can be smarter. 00:52:09.080 --> 00:52:14.400 Like it can implement __contains__ and __getitem__ and __len__ and all these different things that have actual. 00:52:14.600 --> 00:52:18.420 It behaves just like other sequences, just like lists and tuples and strings. 00:52:18.420 --> 00:52:25.020 You can ask for any of the elements at any point in that range, but it doesn't have to keep them up front. 00:52:25.020 --> 00:52:26.900 It can calculate them as you need them. 00:52:26.900 --> 00:52:28.020 Yeah, yeah, that's cool. 00:52:28.020 --> 00:52:31.540 It's really, again, a really nice look inside of things. 00:52:31.540 --> 00:52:40.600 And I guess the last example we want to look at is about the order or lack thereof for sets and dictionaries, which is slightly changing, right? 00:52:40.600 --> 00:52:42.980 That's exactly what's happening. 00:52:44.000 --> 00:52:50.480 For many beginners, it is surprising that collections like dictionaries and sets don't have a set order. 00:52:50.480 --> 00:52:55.540 That's an implementation detail of how Python has implemented dictionaries. 00:52:55.540 --> 00:52:59.500 And that comes from – it usually happens to people also new to programming. 00:52:59.500 --> 00:53:10.200 So they don't just come to Python as new – it's also usually the first thing that they come to as a programmer, to programming as a whole. 00:53:10.200 --> 00:53:18.800 Because if you, for example, have done systems programming in C or C++, most people that come from that kind of background already realize this. 00:53:18.800 --> 00:53:30.340 But to make a dictionary efficient, so to be able to map a key to a value, you want to be able to store this efficiently and quickly find things.
00:53:30.780 --> 00:53:43.720 So once you start working on this, you realize that it looks like there's an order, but the dictionary can't promise any order because the order that the keys are put in is just a property of the storage medium that we put things in. 00:53:44.560 --> 00:53:54.420 Because we are putting a bunch of keys into a much smaller space and then quickly being able to find if a key exists or not in such a table. 00:53:55.160 --> 00:54:02.560 Because we base this on what we call the hash, the property of any of these keys to be reduced to a simple number. 00:54:02.560 --> 00:54:06.380 Explaining that is surprisingly often needed. 00:54:06.380 --> 00:54:09.660 Once people understand it, they can – oh, okay, right, fine. 00:54:09.660 --> 00:54:14.100 And we can go to another type and we can point to things like the OrderedDict that also exists in Python. 00:54:14.560 --> 00:54:18.200 Or tell them that in Python 3.6, the implementation changed. 00:54:18.200 --> 00:54:27.240 So now dictionaries actually do remember order just because someone found a more efficient way of building the same kind of data type. 00:54:27.240 --> 00:54:37.920 So Python 3.6 makes dictionaries ordered, not because everybody wants them to be ordered, but because it was a more efficient way of storing information. 00:54:37.920 --> 00:54:43.080 Python 3.6 makes dictionaries an order of magnitude smaller in certain cases. 00:54:43.080 --> 00:54:49.540 And that's actually super important because that's in some sense the backing store for all instances of objects. 00:54:49.540 --> 00:54:49.960 Exactly. 00:54:49.960 --> 00:54:50.320 Yeah. 00:54:50.320 --> 00:54:50.740 Exactly. 00:54:50.740 --> 00:54:54.000 Lots and lots and lots and lots and lots and lots of stuff in Python uses dictionaries. 00:54:54.000 --> 00:54:57.460 So you can end up with quite a number of those. 00:54:57.460 --> 00:54:59.820 So making them smaller is a very good idea. 00:54:59.820 --> 00:55:00.140 Yeah.
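A small sketch of the dictionary points above. The slot_for helper is a hypothetical toy model of "reduce the key to a number, then pick a slot" and is far simpler than CPython's real table. Note that insertion order was an implementation detail of CPython 3.6 and only became a language guarantee in Python 3.7, so this example assumes 3.7 or later.

```python
from collections import OrderedDict


def slot_for(key, table_size=8):
    """Toy model: hash the key to a number, then map it to a table slot."""
    return hash(key) % table_size


# Small integers hash to themselves, so the slot is easy to predict.
# Slot order (2, then 3) has nothing to do with insertion order.
print(slot_for(10), slot_for(3))

# A plain dict remembers insertion order on Python 3.7+.
fruit = {}
fruit["pear"] = 1
fruit["apple"] = 2
fruit["fig"] = 3
print(list(fruit))

# Plain dicts ignore order when compared; OrderedDict does not.
print({"a": 1, "b": 2} == {"b": 2, "a": 1})
print(OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1))
```

Sets are not covered by the ordering change, so code that needs a stable order should still not rely on set iteration order.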
00:55:00.140 --> 00:55:08.020 So this question is really like you need to understand data structures to understand what is happening here because it doesn't make sense. 00:55:08.020 --> 00:55:10.240 Like, well, why wouldn't I just want it to have the order? 00:55:10.240 --> 00:55:16.180 But of course, you know, at least in the older implementations, you want it to be fast more than you want it to be ordered. 00:55:16.180 --> 00:55:16.700 Exactly. 00:55:17.300 --> 00:55:29.620 And it also helped implement a few other new things in Python 3.6, like the keyword argument order can be important in certain cases or the class attribute definition order can be important in certain cases. 00:55:29.760 --> 00:55:34.620 And they are now, as a property of this new dictionary implementation, also implemented. 00:55:34.620 --> 00:55:44.080 So suddenly when you create a class, all your attributes are kept in the same order that you define them, which can help form-building libraries, for example. 00:55:44.080 --> 00:55:44.800 Yeah. 00:55:44.800 --> 00:55:45.060 Yeah. 00:55:45.060 --> 00:55:47.680 Currently, you have to do crazy tricks to keep it ordered. 00:55:47.680 --> 00:55:48.840 Yeah, absolutely. 00:55:48.840 --> 00:55:49.400 Absolutely. 00:55:49.400 --> 00:55:58.740 And MongoDB, for example, has its own dictionary type thing because it doesn't want to reorder and entirely rewrite big parts of stuff. 00:55:58.740 --> 00:56:02.600 And there's a lot of areas where this order actually would be kind of nice to have. 00:56:02.600 --> 00:56:04.160 So very cool. 00:56:04.260 --> 00:56:06.520 So those were some great questions. 00:56:06.520 --> 00:56:08.820 And I feel like I've really learned a lot. 00:56:08.820 --> 00:56:11.780 I hope people have as well just hearing you talk about them. 00:56:11.780 --> 00:56:14.300 But why – like this is a lot of work. 00:56:14.300 --> 00:56:17.040 Why do you contribute so much time and energy to Stack Overflow?
00:56:17.040 --> 00:56:19.980 I already mentioned that I learned a lot from this. 00:56:19.980 --> 00:56:24.020 I didn't start on Stack Overflow because I wanted to learn so much. 00:56:24.020 --> 00:56:27.800 I started because I was part of the Plone management community. 00:56:27.960 --> 00:56:39.480 And we discovered Stack Overflow early on as a good way to support people that wanted to learn about Plone and wanted to use Plone, and then I sort of fell into answering Python questions as well. 00:56:39.480 --> 00:56:57.460 But I've since heard an interview with another Stack Overflow luminary and creator of the C# compiler, Eric Lippert, who once had an interview, I think with Stack Overflow actually, giving his reasoning as to why he started with Stack Overflow. 00:56:57.680 --> 00:56:58.920 And answering questions. 00:56:58.920 --> 00:57:04.140 And that a manager actually told him that he should become an expert in certain things. 00:57:04.140 --> 00:57:07.820 It would be helpful for him to become an expert in a certain subject. 00:57:07.820 --> 00:57:15.640 And that the best method of becoming an expert in anything was to find a source of questions and start answering them. 00:57:15.640 --> 00:57:24.180 Because then you really discover all the things you don't know yet and start learning those yourself to try and be able to answer these things. 00:57:24.180 --> 00:57:25.800 And that really resonates with me. 00:57:25.880 --> 00:57:29.220 That's really why I'm answering on Stack Overflow so much. 00:57:29.220 --> 00:57:33.240 It's because it keeps teaching me new things. 00:57:33.240 --> 00:57:40.660 It keeps pushing me into new directions and new areas of expertise that I didn't know I could do before. 00:57:40.660 --> 00:57:41.620 Yeah, that makes a really… 00:57:41.620 --> 00:57:42.540 I really enjoy that. 00:57:42.620 --> 00:57:45.240 Yeah, that makes a lot of sense to me, your answer there. 00:57:45.240 --> 00:57:47.000 About the expert especially.
00:57:47.460 --> 00:57:55.380 I think one of the things that makes somebody an expert is not that they've sat down and done programming for five years or ten years or whatever. 00:57:55.820 --> 00:57:58.200 It's really how you spend that five or ten years. 00:57:58.200 --> 00:58:11.660 If you just do the same thing over and over and don't run into many challenges and don't force yourself to continue to dig deeper, then you're not that much more of an expert than what you learned in the beginning. 00:58:11.660 --> 00:58:16.440 But it's more the challenges and the problems and the edge cases that have hit you. 00:58:16.440 --> 00:58:20.740 And I feel like Stack Overflow is that concentrated. 00:58:21.280 --> 00:58:24.860 Yes, and it is a constant source of new challenges. 00:58:24.860 --> 00:58:28.700 And in that way, it can be sort of addictive as well. 00:58:28.700 --> 00:58:30.060 So you have to be careful. 00:58:30.060 --> 00:58:30.700 I'm sure. 00:58:30.700 --> 00:58:45.400 So speaking of addictive, what if I was really convinced that 2,000 for my reputation was insufficient and I decided the next week I'm going to take eight hours a day and just start answering questions and asking questions and just going crazy? 00:58:45.400 --> 00:58:50.280 Obviously, my reputation would go up, but would there be other effects? 00:58:50.280 --> 00:58:53.060 Would people start contacting me for jobs? 00:58:53.060 --> 00:58:56.320 Like, hey, I saw you're doing great on Stack Overflow. 00:58:56.320 --> 00:58:57.660 And answer these questions. 00:58:57.660 --> 00:58:59.320 Do you want to come work for us or things like that? 00:58:59.320 --> 00:59:04.620 People have used my Stack Overflow presence as a reason to contact me. 00:59:04.620 --> 00:59:06.880 But I think that's not really the norm. 00:59:06.880 --> 00:59:11.180 So I'm also looking at this from the point of view of hiring people. 00:59:11.180 --> 00:59:12.820 How would I use Stack Overflow? 
00:59:12.820 --> 00:59:18.780 If I see someone that has a good reputation on Stack Overflow, it works just like a blog. 00:59:18.780 --> 00:59:27.280 It gives me a great insight into how you think and how you work and how you look at code and how proficient you are in those things. 00:59:27.280 --> 00:59:30.900 Because you have this body of things out there that we can look at. 00:59:31.180 --> 00:59:33.700 Or just like open source work, open source projects that you did. 00:59:33.700 --> 00:59:44.020 A Stack Overflow profile with a good body of answers can be a real asset to let potential employers know that you know your stuff. 00:59:44.020 --> 00:59:48.480 That you know a thing or two about what you're talking about because you can actually demonstrate this. 00:59:48.480 --> 00:59:54.060 Which can give you a huge advantage because a lot of times people applying for jobs can't demonstrate their skills. 00:59:54.160 --> 00:59:54.460 Exactly. 00:59:54.460 --> 00:59:55.580 Exactly. 00:59:55.580 --> 01:00:06.420 So Stack Overflow is actually really building on top of this because they give you the ability to develop your CV, or developer story as they call it now, on the site. 01:00:06.420 --> 01:00:17.860 And they actually have a nice side business, or actually their main business, next to it: getting employers connected with those people that have filled in a profile there. 01:00:17.940 --> 01:00:18.080 Right. 01:00:18.080 --> 01:00:19.940 The whole Stack Overflow careers thing, right? 01:00:19.940 --> 01:00:21.840 The whole Stack Overflow careers thing. 01:00:21.840 --> 01:00:28.700 And that can give you a lot of insight into people, how they work and how they think and see code. 01:00:28.700 --> 01:00:35.580 So up to a certain level, giving answers on Stack Overflow, provided you, of course, know your stuff a bit, can help you there.
01:00:35.580 --> 01:00:42.340 If I see someone with my level of reputation, I would start asking, are they spending too much time there? 01:00:42.340 --> 01:00:46.840 Will they actually do work for me or will they just come to work and answer Stack Overflow questions? 01:00:47.740 --> 01:00:48.860 Interesting. 01:00:48.860 --> 01:00:51.100 I said, keep people interested, right? 01:00:51.100 --> 01:00:51.460 Yeah. 01:00:51.460 --> 01:00:54.740 You would learn a lot, I think. 01:00:54.740 --> 01:01:05.260 If you seek out questions, there is such a thing, it's the fastest gun in the West where someone already knows the answer, can type faster than you and gets the first answer in and then that gets upvoted. 01:01:05.720 --> 01:01:17.480 So if you're in it for the reputation and the upvotes, you're going to have to wait a little while because you're going to probably be a little slower than the experienced hands, the people that can type out the answer quickly. 01:01:17.480 --> 01:01:17.960 Sure. 01:01:18.440 --> 01:01:24.740 But at the same time, at that moment, you can compare your answer to theirs and see, did I miss something? 01:01:24.740 --> 01:01:25.720 Did I miss a detail? 01:01:25.720 --> 01:01:27.040 Did I miss a trick? 01:01:27.040 --> 01:01:32.140 Did I use something interesting out of the standard library that I didn't know about yet? 01:01:32.600 --> 01:01:36.680 So next to that, you're having to figure out how I'm answering this. 01:01:36.680 --> 01:01:37.440 Do I know the answer? 01:01:37.440 --> 01:01:39.800 You also get to see how other people are answering. 01:01:40.420 --> 01:01:44.420 So that's something different than you're trying to find the solution to a problem you have. 01:01:44.420 --> 01:01:49.540 You are actually looking at new questions as they come in and see how other people are answering them at that moment. 01:01:49.540 --> 01:01:51.100 It's a slightly different angle. 
01:01:51.100 --> 01:01:53.900 You get to see how other people think about their problem. 01:01:53.900 --> 01:01:55.700 And you might learn a thing or two about that. 01:01:55.700 --> 01:01:56.080 Okay. 01:01:56.080 --> 01:01:56.800 Very interesting. 01:01:56.800 --> 01:02:02.760 I probably won't take a week and go work on Stack Overflow, but it's interesting to think about what would happen if I did. 01:02:02.760 --> 01:02:07.860 It is certainly an interesting thing you can do for half an hour when you're maybe waiting for the bus or something like that. 01:02:07.860 --> 01:02:08.100 Yep. 01:02:08.100 --> 01:02:08.560 Absolutely. 01:02:08.560 --> 01:02:09.720 All right. 01:02:09.860 --> 01:02:12.100 So let me ask you one final Stack Overflow question. 01:02:12.100 --> 01:02:19.780 And I've seen a couple of articles lately that felt like Stack Overflow was unfriendly to beginners or to newcomers. 01:02:19.780 --> 01:02:21.840 And you had an interesting thought on that. 01:02:21.840 --> 01:02:22.980 There's two angles here. 01:02:22.980 --> 01:02:29.720 First of all, there's often a mismatch of expectations what Stack Overflow is for. 01:02:29.720 --> 01:02:38.660 Stack Overflow is often seen as a personal help desk, something you can quickly put a question on and they'll help me fix my problem. 01:02:39.300 --> 01:02:57.880 And then get upset because their question might get put on hold or downvoted because they didn't meet the expectations of what Stack Overflow has, which is to build a knowledge base, to build a repository of good questions and even better answers that help future visitors. 01:02:57.880 --> 01:03:15.240 So if you forgot to explain what inputs came out of it or what the full error message is or you show a clear lack of having actually done the research about how to solve your problem, then you might actually find Stack Overflow very disappointing. 
01:03:15.560 --> 01:03:25.040 And then there's the other thing is that people say that Stack Overflow is only for professionals and not for newcomers and for new people, newbies new to programming or new to our programming language. 01:03:25.040 --> 01:03:26.380 Again, that's not true. 01:03:26.380 --> 01:03:33.760 It's usually what happens at the same time is that people are new to asking questions and they don't know how to ask a good question. 01:03:33.860 --> 01:03:36.600 And that then really fires back on them. 01:03:36.600 --> 01:03:48.620 So, and then there's, of course, the fact that because we are trying to, Stack Overflow is so hugely popular, we get maybe a few thousand of such people come in every day and asking questions. 01:03:49.100 --> 01:03:56.920 So the community, on the other hand, is running out of steam and running out of power to keep helping each and every one of those learn about Stack Overflow. 01:03:56.920 --> 01:04:01.440 So a lot of people don't get the hand-holding that they might expect. 01:04:01.440 --> 01:04:02.640 Well, I'm new to this site. 01:04:02.640 --> 01:04:03.800 Why didn't you tell me? 01:04:03.800 --> 01:04:10.260 Well, Stack Overflow tries to automate this, tries to give you the help up front. 01:04:10.260 --> 01:04:13.380 You get a lot of information about how to ask a good question. 01:04:13.380 --> 01:04:14.840 There's lots of information in the help center. 01:04:15.420 --> 01:04:19.760 But when you have a programming problem and your homework is due tomorrow, not everybody will reach that. 01:04:19.760 --> 01:04:22.700 And then they skip by that very quickly. 01:04:22.700 --> 01:04:25.980 And then what they see is what... 01:04:25.980 --> 01:04:26.180 Yeah. 01:04:26.180 --> 01:04:34.180 I feel a huge difference based on whether or not the person is legitimately trying to solve the problem and needs help or if they're just being lazy. 01:04:34.180 --> 01:04:34.680 You know? 
01:04:34.680 --> 01:04:36.540 I do want to avoid the term lazy. 01:04:36.540 --> 01:04:38.660 It is often just a misunderstanding. 01:04:38.660 --> 01:04:42.720 And a certain sense of urgency can... 01:04:42.720 --> 01:04:50.400 Because you have this problem and you're so into this problem and trying to solve it that you can forget to see the larger picture around it. 01:04:50.400 --> 01:04:53.680 It's not necessary that you're being lazy always. 01:04:53.680 --> 01:05:01.640 Sometimes there are lazy people that do post their homework on the site or even their exam questions as they sit in the exam. 01:05:02.360 --> 01:05:12.400 I have literally seen posts where that consists of nothing more than a photograph of a paper that's clearly taken underneath the desk. 01:05:12.400 --> 01:05:14.940 No, no, this is not an exam. 01:05:14.940 --> 01:05:17.400 This is a trial question. 01:05:17.400 --> 01:05:19.500 But can you give it to me in the next 10 minutes? 01:05:21.000 --> 01:05:24.540 I'm going to need that before 9 o'clock because I've got to turn this in. 01:05:24.540 --> 01:05:26.640 I'm going to turn this in. 01:05:26.640 --> 01:05:36.200 Some people are lazy but most people just don't understand what makes a good question and why Stack Overflow is there. 01:05:36.200 --> 01:05:39.020 And that clash can lead to frustration. 01:05:39.020 --> 01:05:39.400 Yeah. 01:05:39.400 --> 01:05:47.020 So I think this conversation has definitely helped everyone who's listened to it understand what makes a good question and so on. 01:05:47.020 --> 01:05:59.240 So hopefully we've done a small part to reduce the frustration on both sides, the people answering the questions, being frustrated with the unprepared homework folks coming in. 01:05:59.240 --> 01:06:02.400 I guess we'll leave it there for Stack Overflow. 01:06:02.400 --> 01:06:05.360 But, Martin, that was very, very interesting. 
01:06:05.820 --> 01:06:12.100 And I think reading through your answers and other people's answers and questions is definitely enlightening. 01:06:12.100 --> 01:06:13.420 Thanks for the work on that. 01:06:13.420 --> 01:06:13.860 That's cool. 01:06:13.860 --> 01:06:17.200 Before I let you go, two final questions I always ask my guests. 01:06:17.200 --> 01:06:19.520 First of all, favorite PyPI package? 01:06:19.520 --> 01:06:22.140 What would you recommend to people out there that they might not know about? 01:06:22.140 --> 01:06:23.660 This is one I discovered. 01:06:23.660 --> 01:06:26.400 Someone else used it in an answer and I love it. 01:06:26.400 --> 01:06:28.100 I have been using it ever since. 01:06:28.820 --> 01:06:32.440 It's called ftfy, or Fix This For Me, For You. 01:06:32.440 --> 01:06:33.660 Fix This For You. 01:06:33.660 --> 01:06:41.600 It is a library to fix mojibake, or text encoding errors. 01:06:41.600 --> 01:06:50.860 So if you ever try to make sense of UTF-8 encoded text decoded as Latin-1 and then re-encoded as something else, 01:06:51.560 --> 01:06:54.660 the Fix This For You library, the ftfy library, will do it for you. 01:06:54.660 --> 01:07:00.960 It automatically detects when an encoding was misapplied and fixes it for you, 01:07:00.960 --> 01:07:04.940 as well as some other common noise in incoming text. 01:07:04.940 --> 01:07:07.980 I love this for content management. 01:07:07.980 --> 01:07:12.980 I love it for handling badly decoded webpages, anything like that. 01:07:12.980 --> 01:07:16.520 Fix This For You has saved my backside a couple of times already. 01:07:16.520 --> 01:07:17.440 That's fantastic. 01:07:17.440 --> 01:07:19.140 All right, favorite editor? 01:07:19.140 --> 01:07:21.240 If you're going to write some Python code, what do you open up? 01:07:21.440 --> 01:07:23.000 If I'm on a terminal, Vim. 01:07:23.000 --> 01:07:26.120 And generally on my desktop, it's Sublime Text 3. 01:07:26.120 --> 01:07:26.580 Nice.
01:07:26.580 --> 01:07:28.820 Yeah, I was just using Sublime Text earlier. 01:07:28.820 --> 01:07:29.520 I like it a lot. 01:07:29.520 --> 01:07:31.380 All right, how about a final call to action? 01:07:31.380 --> 01:07:34.220 What should people do to get more involved with Stack Overflow? 01:07:34.220 --> 01:07:36.000 Come on in and try and answer stuff. 01:07:36.000 --> 01:07:40.980 There are always niche tags that need more people answering. 01:07:40.980 --> 01:07:45.700 Things like Python may be overflowing with people that know their things. 01:07:45.700 --> 01:07:50.780 But if you have a specific expertise in programming, there usually is a tag for you. 01:07:51.320 --> 01:07:54.220 And finding more answers to these questions is always helpful. 01:07:54.220 --> 01:07:55.120 All right, excellent. 01:07:55.120 --> 01:07:59.100 Well, thank you for all the work you've done on Stack Overflow around Python. 01:07:59.100 --> 01:08:00.140 It's really amazing. 01:08:00.140 --> 01:08:01.400 And thanks for being on the show. 01:08:01.400 --> 01:08:02.360 Great to talk to you. 01:08:02.360 --> 01:08:02.600 Cool. 01:08:02.600 --> 01:08:03.400 Great. 01:08:03.400 --> 01:08:03.860 Thanks. 01:08:03.860 --> 01:08:04.240 Bye. 01:08:05.400 --> 01:08:08.320 This has been another episode of Talk Python To Me. 01:08:08.320 --> 01:08:11.260 The guest today has been Martin Peters. 01:08:11.260 --> 01:08:14.280 This episode has been sponsored by Rollbar and GoCD. 01:08:14.280 --> 01:08:16.180 Thank you both for supporting the show. 01:08:16.180 --> 01:08:19.000 Rollbar takes the pain out of errors. 01:08:19.220 --> 01:08:26.720 They give you the context and insight you need to quickly locate and fix errors that might have gone unnoticed until your users complain, of course. 01:08:27.000 --> 01:08:33.880 As Talk Python To Me listeners, track a ridiculous number of errors for free at rollbar.com slash Talk Python To Me.
01:08:33.880 --> 01:08:38.880 GoCD is the on-premise, open-source, continuous delivery server. 01:08:38.880 --> 01:08:43.020 Want to improve your deployment workflow but keep your code and builds in-house? 01:08:43.440 --> 01:08:49.360 Check out GoCD at talkpython.fm/G-O-C-D and take control over your process. 01:08:49.360 --> 01:08:51.940 Are you or a colleague trying to learn Python? 01:08:51.940 --> 01:08:56.600 Have you tried books and videos that just left you bored by covering topics point by point? 01:08:56.600 --> 01:09:05.240 Well, check out my online course, Python Jumpstart, by building 10 apps at talkpython.fm/course to experience a more engaging way to learn Python. 01:09:05.240 --> 01:09:12.560 And if you're looking for something a little more advanced, try my Write Pythonic Code course at talkpython.fm/Pythonic. 01:09:12.920 --> 01:09:19.340 You can find the links from this episode at talkpython.fm/episodes slash show slash 86. 01:09:19.340 --> 01:09:21.560 Be sure to subscribe to the show. 01:09:21.560 --> 01:09:23.760 Open your favorite podcatcher and search for Python. 01:09:23.760 --> 01:09:25.000 We should be right at the top. 01:09:25.000 --> 01:09:34.320 You can also find the iTunes feed at /itunes, Google Play feed at /play, and direct RSS feed at /rss on talkpython.fm. 01:09:34.320 --> 01:09:39.420 Our theme music is Developers, Developers, Developers by Corey Smith, who goes by Smix. 01:09:39.420 --> 01:09:42.400 Corey just recently started selling his tracks on iTunes. 01:09:42.400 --> 01:09:46.100 So I recommend you check it out at talkpython.fm/music. 01:09:46.100 --> 01:09:51.440 You can browse his tracks he has for sale on iTunes and listen to the full-length version of the theme song. 01:09:51.440 --> 01:09:53.540 This is your host, Michael Kennedy. 01:09:53.540 --> 01:09:54.820 Thanks so much for listening. 01:09:54.820 --> 01:09:56.020 I really appreciate it. 01:09:56.020 --> 01:09:58.160 Smix, let's get out of here. 
01:10:06.160 --> 01:10:18.480 I'll see you next time. 01:10:18.480 --> 01:10:18.880 Bye. 01:10:18.880 --> 01:10:19.360 Bye. 01:10:19.360 --> 01:10:19.880 Bye.