00:00:00 It's time to look deep within the machine and understand what really happens when your Python code executes.
00:00:06 We're code walking through the CPython code base and visualizing it at pythontutor.com.
00:00:11 This is episode number 22 with Philip Guo, recorded Monday, August 3rd, 2015.
00:00:17 Welcome to Talk Python To Me, a weekly podcast on Python, the language, the libraries, the
00:00:47 ecosystem, and the personalities.
00:00:49 This is your host, Michael Kennedy.
00:00:51 Follow me on Twitter where I'm @mkennedy, and keep up with the show and listen to past
00:00:56 episodes at talkpython.fm.
00:00:58 Be sure to follow the show on Twitter where it's @talkpython.
00:01:02 This episode is brought to you by Hired and Codeship.
00:01:06 Thank them both for supporting the show on Twitter via @HiredHQ and @Codeship.
00:01:11 Now let me introduce Philip.
00:01:14 Philip Guo is an assistant professor of computer science at the University of Rochester in New York.
00:01:19 He researches human-computer interaction with a focus on user interfaces for online learning.
00:01:24 He's especially interested in studying how to better train software engineers and data scientists.
00:01:30 He created a free web-based visualization tool for learning programming called Online Python Tutor
00:01:35 at pythontutor.com, which has been used by over 1.2 million people in over 165 countries to visualize over 11 million pieces of code.
00:01:45 Philip, welcome to the show.
00:01:46 My pleasure.
00:01:48 Yeah, it's really exciting to have you here.
00:01:50 We're going to talk a lot about many things.
00:01:53 We're going to talk about CPython and a really cool project that you put on your website and on YouTube called CPython, a 10-hour code walk.
00:02:02 And so we'll be digging into CPython.
00:02:04 And we're also going to talk about this thing called Python Tutor at pythontutor.com that you are working to help people understand the internals of Python better.
00:02:12 So that's going to be great stuff.
00:02:14 Cool.
00:02:14 I'm looking forward to it.
00:02:16 Yeah.
00:02:16 Before we get into the details, though, you know, everyone likes to know how people got into programming and how they got started in Python.
00:02:22 What's your story?
00:02:23 So my story was I was always interested in computers as a kid, like many people who got into computer science.
00:02:29 But I never really had a strong programming background until I went to college.
00:02:34 So I tried to learn QBasic by myself when I was 10.
00:02:39 And that, you know, I had a book and then I failed after a few weeks because I had no one teaching me.
00:02:43 I took an AP computer science course in high school.
00:02:46 That was in C++.
00:02:46 And that was really fun.
00:02:48 And that was kind of my first introduction to really doing programming.
00:02:52 And in college, I decided to major in electrical engineering and computer science.
00:02:57 And that's when I started just learning programming formally.
00:03:01 But really, the Python relevance is I didn't actually start hacking for fun until about my senior year of college.
00:03:09 And the first language that I learned for programming for fun and not just because I had to do it for class was actually Python.
00:03:15 So the first kinds of programs I wrote were scripts to manage my photos and, you know, kind of manipulate and manage my own personal photo gallery and, you know, put it up on a simple website.
00:03:28 So that was where I got started getting hooked on Python.
00:03:30 That was, you know, it was about 10 years ago.
00:03:32 That was around 2005.
00:03:33 That was like Python 2.4 or something like that.
00:03:36 Yeah, that's a great way to get started.
00:03:38 I think a lot of people have interesting stories like that, you know, just they have some small problem they're trying to solve.
00:03:44 And, you know, it leads you down this path.
00:03:47 And all of a sudden, you discover this world where, hey, there's this great thing, you know, programming or Python or whatever.
00:03:52 Yep, that's exactly right.
00:03:54 So I see you're calling in from Seattle, right?
00:03:57 What are you doing up there?
00:03:58 So I am currently an assistant professor of computer science at the University of Rochester in upstate New York.
00:04:05 So that's nowhere near Seattle.
00:04:06 That's what I was going to say.
00:04:07 You're not – it's not at all in Seattle.
00:04:08 So I get to – one of the real benefits of being a professor is that your summers are free to do research or to travel or to do other sorts of scholarly work.
00:04:19 So most professors in most terms, they stay on campus in the summers and they do research full time for three months.
00:04:27 What I decided to do this summer since I had some colleagues at Microsoft was to spend most of my summer at Microsoft Research doing research and both in software engineering and in online education at the lab in Seattle.
00:04:41 And I came here because I actually was an intern here a long time ago when I was back in grad school.
00:04:47 So I'm actually back interning in the same group.
00:04:49 So it's sort of a homecoming of sorts.
00:04:51 Back to the future.
00:04:52 That's excellent.
00:04:53 Yeah, I've done some work with some of the guys up at Microsoft.
00:04:55 It's a cool place up there.
00:04:57 So excellent.
00:04:57 Is this related to PythonTutor.com?
00:05:00 No, not really.
00:05:01 I mean, this is just a completely separate sort of research project.
00:05:04 So there's nothing Python related in the work here, unfortunately.
00:05:08 All right.
00:05:09 Cool.
00:05:10 All right.
00:05:11 So let's talk about your CPython internals class.
00:05:16 This was a class you did at University of Rochester, right?
00:05:19 2014, I think.
00:05:21 At least the recorded version was 2014.
00:05:23 Yep.
00:05:24 So this was a class I taught in fall 2014.
00:05:27 And the name of the course was Dynamic Languages and Software Development.
00:05:32 So I actually inherited this course from another professor who was taking a leave and teaching another class that term.
00:05:41 And that class was originally in Ruby.
00:05:43 So it was sort of a graduate-level programming languages class about these sorts of dynamically typed languages.
00:05:48 And originally he did it in Ruby.
00:05:50 But since I knew Python a lot better, I revamped the class to be in Python and basically turned it into what the videos are online.
00:05:59 So I'd be happy to talk about that in detail.
00:06:02 Just for everyone listening, the videos are online.
00:06:05 And I actually spent like the last week going through your class.
00:06:08 So I feel like I've had like some super intense summer course or something, you know, doing like 10 lectures.
00:06:14 And people can find those on your website at pgbovine.net/cpython-internals.htm.
00:06:21 And I actually went through, unrelated to this conversation or maybe preceding this whole having you on the show, I just saw your videos and thought they were awesome.
00:06:30 And I put them into a YouTube playlist at bit.ly slash cpython walk.
00:06:34 So both of those work well.
00:06:35 What was the main goal of the class?
00:06:38 Sort of get people to understand what happens when you actually run dynamic code like Python?
00:06:43 Yeah, I think that was basically the philosophy.
00:06:48 So a lot of programming languages classes are taught from more of a theoretical perspective.
00:06:54 Right.
00:06:54 So it's usually kind of some formal syntax and semantics and maybe doing some proofs.
00:07:01 And it's very, you know, kind of formalism-heavy.
00:07:04 And I thought it would be interesting to do a very different sort of class for graduate students from the opposite side, which is something extremely applied, saying, you know, here is a piece of Python code.
00:07:15 Let's start with hello world or a simple for loop or a simple function call.
00:07:20 And what actually happens throughout all the steps between that code being parsed and then the output appearing on your screen, let's say.
00:07:29 So I wanted to dive into the interpreter and show students how everything worked under the hood and how there's really, you know, by deconstructing, you can show that there's really no magic here.
00:07:39 There's just a lot of C code behind the scenes that keeps track of a lot of stuff.
00:07:43 And eventually your program runs.
00:07:45 So we don't do the parsing stage because I think parsing is fairly standard.
00:07:50 And that's covered by most kind of introductory compilers classes.
00:07:54 You write a grammar, and a parser generator and some code give you something like an AST.
00:08:00 And then that gets walked to turn into some kind of bytecode.
00:08:04 So the class actually starts with assuming you have a bunch of Python bytecode.
00:08:07 How does the bytecode actually get interpreted step by step by the interpreter runtime system to carry out your program's operations?
00:08:16 Yeah, that's really cool.
00:08:17 And I think, you know, if I think about how like C code runs and then my intuition about how that C code actually executes, if you understand a little bit about registers and memory addresses and pointers, your intuition more or less will carry the day.
00:08:32 I think with interpreted languages, all bets are off.
00:08:36 Right.
00:08:37 I mean, you have some concept of the programming language doing things, but then the way that happens, you really have to look inside.
00:08:44 Right.
00:08:45 Yeah, exactly.
00:08:47 Because these interpreted languages are often not implemented like you would conceptually think of it.
00:08:53 Right.
00:08:53 You think of something as you have frames and variables and pointers to each other.
00:08:57 But really, these bytecodes, at least the Python one, are for this sort of stack-based virtual machine.
00:09:04 I think Java, the Java virtual machine is that way, too, but I forgot the exact semantics.
00:09:08 But it's not something you would think about normally, but they do it that way.
00:09:11 One, because it's really compact, and it kind of leads to really compact code and sort of easy-to-understand code for the implementer.
00:09:18 But yeah, but that's very different than the conceptual model in your head at the very high level, how a program ought to work.
00:09:24 And we can talk about that later when we talk about the Python tutor as well, because that kind of leads into that other tool.
00:09:30 So we can keep talking about the CPython stuff first.
00:09:32 Sure. So one of the things I thought was interesting was in your very first session, you have kind of a cool whiteboarding thing you're doing with a Microsoft Surface and like a pen where you can kind of draw on the code.
00:09:43 And that's cool.
00:09:44 You do a cool little sketch about what actually happens when you type python space somefile.py.
00:09:50 And I mean, on one level, I knew it.
00:09:54 On the other, it was a little surprising to me to see
00:09:56 that the first step is compilation.
00:09:58 Can you maybe like talk just briefly about like what happens when I run my Python code before we get into the interpreter itself?
00:10:05 Yeah.
00:10:06 So many people are surprised when there's a compilation step in Python or in these sorts of dynamic or what people call scripting languages, because usually you think of running Python space, whatever, or Perl space, whatever, Ruby space, whatever.
00:10:20 And it just runs, right?
00:10:21 Right.
00:10:22 I just thought, here we go with the interpreter.
00:10:24 And now it's interpreting, right?
00:10:26 Right.
00:10:27 So with Java or C or C#, you have a compilation step and then you run a compiled binary.
00:10:33 And there's two separate steps.
00:10:34 But with Python, as with many other languages, the compilation happens before the execution.
00:10:40 So what happens is as a standard kind of a front end to a compiler, it takes the source code.
00:10:46 It does the lexical analysis.
00:10:48 It does the parsing.
00:10:49 It creates an AST or abstract syntax tree from that.
00:10:52 And then it walks that tree and creates a bunch of bytecode.
00:10:56 So the Python bytecode language, you can read in the documentation, it has, I don't know, a few dozen operations like add, load, store, and also some operations that are a little bit more Python specific, like build a list, build a dictionary, build a tuple, function call, those sorts of things.
00:11:16 So the compilation step really takes your source code, which is in human readable, somewhat human readable form, and turns it into a linear stream of instructions.
00:11:26 Very much like assembly language, except you can think of bytecode as an assembly language for a Python virtual computer.
00:11:35 Right. That was kind of the impression I got as well, like a much, much richer assembly language where you have operations like build class and call method, push and pop stuff off stacks and so on.
00:11:46 Yep, exactly.
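The front-end steps described above (parse to an AST, then walk the tree to emit bytecode) can be sketched with the standard library. This is only an illustration: the two-line `source` string is made up, and the exact opcode names printed vary between CPython versions.

```python
import ast
import dis

# Hypothetical two-line program used only for illustration.
source = "x = 1 + 2\ny = [x, x * 2]\n"

# Front end: lexing and parsing produce an abstract syntax tree.
tree = ast.parse(source, "<example>", "exec")

# Back end: walking the tree produces a code object that holds the bytecode.
code = compile(tree, "<example>", "exec")

# The bytecode is a flat sequence of instructions, a bit like assembly.
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(type(tree).__name__)
print(opnames)

# Running the code object executes those instructions.
namespace = {}
exec(code, namespace)
```

After `exec`, `namespace["x"]` and `namespace["y"]` hold the values the program computed, and the printed opcode list should include one of the Python-specific "build" operations mentioned above.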
00:11:47 If we want to go work with this, right, we can go to python.org and download the code and decompress it or untar it or whatever.
00:11:56 And it's just, it literally is a bunch of C code, right?
00:11:59 The C in CPython is, here's your C implementation of this interpreter, right?
00:12:03 That's right.
00:12:04 So if you go to, this is what I do on the first day of class.
00:12:07 We have everybody download the C interpreter source code from, sorry, the CPython source code from python.org and unzip it and do configure and make.
00:12:19 Now, part of the class, I didn't require students to actually run the interpreter if they didn't want,
00:12:26 because most of the class was actually reading through the code and walking through it.
00:12:29 Now, the students who were a bit more adventurous, they could try to compile the interpreter themselves.
00:12:34 And then try to, you know, put in debug statements or print statements to see how it works behind the scenes.
00:12:40 But actually, compiling the interpreter itself might not be easy if you're on, say, a Windows machine, which doesn't have a lot of development tools and compilers.
00:12:49 So I'm usually on Linux and Mac machines.
00:12:51 If you install the standard developer tool chain with GCC and make and configure and all that stuff.
00:12:58 In theory, right, building is always hard.
00:13:01 But in theory, if you do ./configure and then you type make, you'll actually call the C compiler on your machine.
00:13:10 And it will compile all the .c and .h files in the CPython directory.
00:13:18 And in the end, it will produce a binary executable file called Python.
00:13:22 And that Python you can just run.
00:13:24 And that is the Python interpreter that you just compiled from C source code.
00:13:29 So most of the class, what we do is we go over what a lot of those C files actually do and see.
00:13:34 Maybe you could give us like a 10,000 foot view of what are the interesting parts of that source code and what is just noise and details.
00:13:44 So there's like Objects, and then there's Include, there's ceval.c.
00:13:49 There's like a few really common parts that you come back to over and over and over.
00:13:53 And then there's a bunch of details.
00:13:54 Yeah.
00:13:55 So on the Web site with all the videos, I actually show the files that they reference.
00:14:00 But really, the core file that I keep on going back to, like you're saying, is Python/ceval.c.
00:14:08 And that file at its core is the main interpreter loop.
00:14:12 So conceptually, how Python executes code is that bytecode is just a bunch of instructions.
00:14:20 It's just a list of instructions.
00:14:22 Each one is add or subtract or build list or function call or so forth.
00:14:28 And all the interpreter does is just go through one instruction at a time: take it off the list of instructions, do something, then move to the next instruction, do something, move to the next instruction, and then do something else.
00:14:41 And it might jump around the stream of instructions if you have, say, a function call or a loop.
00:14:46 But really, the main interpreter loop in ceval.c, all it does is it's just a big while-true infinite loop that just...
00:14:54 Yeah, there's like a huge switch statement.
00:14:56 And it is huge, right?
00:14:57 That's right.
00:14:58 Yeah, there's like a 3000 or whatever line switch statement.
00:15:02 There's a fun fact in there.
00:15:03 I don't know if it's in all the versions, but at least in some of the versions I saw, there's some kind of comment in there saying that they needed to like break up the switch statement in some weird way.
00:15:15 Because some C compilers just can't take switch statements that are that big.
00:15:21 So they had to actually break up the code into pieces because, you know, it wouldn't compile on some kind of computers because that code was just too giant.
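The loop described above (fetch an instruction, dispatch on it, move on, occasionally jump) can be sketched as a toy stack machine. This is not CPython's implementation: the opcode names, the `(op, arg)` tuple format, and the `run` function are invented for illustration, and the `if`/`elif` chain stands in for the large C switch statement.

```python
def run(instructions):
    """Toy stack-based interpreter loop; the opcodes here are hypothetical."""
    stack = []
    variables = {}
    pc = 0  # index of the next instruction to execute
    while True:
        op, arg = instructions[pc]
        pc += 1
        if op == "LOAD_CONST":
            stack.append(arg)
        elif op == "LOAD_NAME":
            stack.append(variables[arg])
        elif op == "STORE_NAME":
            variables[arg] = stack.pop()
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "COMPARE_LT":
            right = stack.pop()
            left = stack.pop()
            stack.append(left < right)
        elif op == "JUMP_IF_TRUE":
            # A loop is just a jump backwards in the instruction stream.
            if stack.pop():
                pc = arg
        elif op == "RETURN":
            return variables
        else:
            raise ValueError(f"unknown opcode: {op}")


# Hand-written program: total = 0; i = 0; repeat total += i; i += 1 while i < 4
program = [
    ("LOAD_CONST", 0), ("STORE_NAME", "total"),
    ("LOAD_CONST", 0), ("STORE_NAME", "i"),
    ("LOAD_NAME", "total"), ("LOAD_NAME", "i"), ("ADD", None), ("STORE_NAME", "total"),
    ("LOAD_NAME", "i"), ("LOAD_CONST", 1), ("ADD", None), ("STORE_NAME", "i"),
    ("LOAD_NAME", "i"), ("LOAD_CONST", 4), ("COMPARE_LT", None),
    ("JUMP_IF_TRUE", 4),
    ("RETURN", None),
]

result = run(program)
print(result)
```

The real loop handles many more opcodes plus error paths, but the shape is the same: one instruction at a time, with jumps for loops and calls.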
00:15:28 Yeah, that's pretty funny.
00:15:30 It's like a 3000 line switch statement.
00:15:31 It's pretty cool.
00:15:32 But those are more or less the steps that have all the opcodes.
00:15:37 And so if I look at Python, it's not necessarily mapping one to one the Python code I write to these opcodes, which is a good thing for Python programmers, right?
00:15:50 That means you're working in a high level language.
00:15:52 You're not working like down in the detail, right?
00:15:54 But it also means it's hard for me to understand if I write, you know, create a class and I say, you know, T equals new test class.
00:16:01 What does that actually mean?
00:16:03 Like, how do I line that up?
00:16:04 And so you had a cool way to disassemble that, right?
00:16:07 And look at it.
00:16:37 Currently, candidates receive five or more offers in just the first week and there are no obligations ever.
00:16:42 Sounds pretty awesome, doesn't it?
00:16:45 Well, did I mention there's a signing bonus?
00:16:47 Everyone who accepts a job from Hired gets a $2,000 signing bonus.
00:16:51 And as Talk Python listeners, it gets way sweeter.
00:16:55 Use the link Hired.com slash Talk Python To Me and Hired will double the signing bonus to $4,000.
00:17:04 Opportunity's knocking.
00:17:04 Visit Hired.com slash Talk Python To Me and answer the call.
00:17:18 Right, right.
00:17:19 So the disassembler actually comes in the standard Python library.
00:17:25 So if you do python -m dis followed by the name of your Python file, it'll actually run the main function in the dis module.
00:17:41 And what that will do is it'll actually print out a somewhat human-readable representation of the bytecode.
00:17:46 And the cool thing about that is that it shows the line number of which line of your Python source code compiles into which bytecode.
00:17:55 And as you mentioned, it's not a one-to-one mapping.
00:17:57 So one line usually compiles to several bytecodes because the bytecode is at a lower level.
00:18:03 So you can run that dis command.
00:18:05 And the dis module, if you search on your favorite search engine for Python space dis, you should see the documentation for this disassembler module.
00:18:16 And that is in the standard library, and that gives you all of the stuff.
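As a rough sketch of what the disassembler shows, the same listing that `python -m dis yourfile.py` prints can be produced from inside Python. The three-line `source` program is a made-up example, and the opcode names in the listing depend on the CPython version.

```python
import dis
import io

# Hypothetical three-line program used only for illustration.
source = "total = 0\nfor n in range(3):\n    total += n\n"
code = compile(source, "<example>", "exec")

# Capture the human-readable listing instead of printing it directly.
buffer = io.StringIO()
dis.dis(code, file=buffer)
listing = buffer.getvalue()
print(listing)

# One source line usually compiles to several bytecode instructions.
instructions = list(dis.get_instructions(code))
print(len(source.splitlines()), "source lines ->", len(instructions), "instructions")
```

The left-hand column of the listing is the source line number, which is how you can see that the mapping from lines to bytecodes is one-to-many.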
00:18:20 So now that said, though, that only prints out the instructions.
00:18:24 There was somebody who made a library called byteplay, which is B-Y-T-E-P-L-A-Y.
00:18:31 And that library actually is an enhanced version of the disassembler that lets you get the disassembled bytecode into objects.
00:18:39 You can actually play with it yourself.
00:18:41 You can manipulate it.
00:18:42 You can, you know, take it apart.
00:18:44 You can analyze it.
00:18:45 So this byteplay library, I haven't used it myself personally, but I know people who really like playing with it.
00:18:51 Yeah, that's cool.
00:18:52 A little more powerful.
00:18:53 One thing about the dis module is it's super easy to look at just sort of flat code in Python files.
00:19:02 But if I want to look at the functions or I've got nested functions and classes, it's a little more work to do that, right?
00:19:07 Yeah.
00:19:09 So the default with the dis module is it just disassembles the top level of your program.
00:19:15 So all the top level says is that if you define a function, it'll just say function definition.
00:19:20 And then what you have to do is you actually have to go inside that function and disassemble that function itself.
00:19:27 So it is a little bit more hairy.
00:19:29 And I don't know if byteplay handles all that out of the box, but it might.
00:19:34 But the idea is that the dis module, if you just run it by default, it will just disassemble the top level program.
00:19:39 And any functions will not be disassembled automatically.
00:19:42 You have to actually grab the code of those functions and go in there and call dis on that.
00:19:47 So it is a little bit more tricky to do that.
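One way to "go inside" nested functions, sketched under the assumption that each function body is stored as a code object among the constants of its enclosing code object (which is how CPython 3 lays it out). The `walk_code` helper and the `outer`/`inner` example are hypothetical names. Newer Python 3 releases of `dis.dis` may already recurse into nested code objects on their own, so treat this as an illustration of the structure rather than a required workaround.

```python
import dis
import types

# Hypothetical module source with a function nested inside another function.
source = '''
def outer():
    def inner():
        return 42
    return inner()
'''


def walk_code(code, depth=0):
    """Yield (depth, code object) for a code object and everything nested in it."""
    yield depth, code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from walk_code(const, depth + 1)


top = compile(source, "<example>", "exec")
names = [(depth, co.co_name) for depth, co in walk_code(top)]

for depth, co in walk_code(top):
    count = sum(1 for _ in dis.get_instructions(co))
    print("  " * depth + f"{co.co_name}: {count} instructions")
```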
00:19:50 Sure.
00:19:51 The other thing I thought was interesting is if I've got a function, let's say foo, in Python, I could say, what is it?
00:19:59 Foo.func underscore bytecode.
00:20:03 How do I – the bytecode is actually there on the function.
00:20:06 And you can look at it in its encoded form, which is kind of some binary string type thing.
00:20:12 And then you can also disassemble that as well, right?
00:20:15 That's right.
00:20:16 And that's what I think we're just leading into that.
00:20:18 So the idea is that dis itself, if you just run it, it disassembles the bytecode of the, I guess, of the top level file.
00:20:26 But each function itself has its own code.
00:20:29 And like you said, I think it's – it's actually different in Python 2 and 3, the name of it.
00:20:34 But I think in one version it's like the function object's .func_code.
00:20:39 The other one is just like dot code or something like that.
00:20:43 But the idea is that the code of the function just appears inside of it as a binary string of data.
00:20:50 So if you actually print it out, it just looks like some garbled string.
00:20:53 But if you run it through some – you can run it through some pretty printing function or through dis.
00:20:58 And it actually shows you the bytecode of the function.
00:21:01 Because all a function object is that it's some context plus an actual string of bytecode that represents what the instructions are that the function is supposed to execute when you run it.
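A short sketch of looking at the bytecode stored on a function. In Python 3 the attribute is `__code__`; Python 2 also exposed it as `func_code`. The function `foo` is just an example.

```python
import dis


def foo(a, b):
    return a + b


code = foo.__code__   # the code object attached to the function
raw = code.co_code    # the bytecode itself, as an opaque bytes string
print(raw)            # looks garbled when printed directly

# dis turns the same bytes into readable instructions.
opnames = [ins.opname for ins in dis.get_instructions(foo)]
print(opnames)

# The code object also carries context such as argument names.
print(code.co_varnames, code.co_argcount)
```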
00:21:14 Yeah, the other thing I thought was pretty cool is – or interesting to understand is that sort of compile step that you talk about, right?
00:21:22 When I run python on my Python file, I first get a compile step to bytecode and then the dynamic interpreted execution.
00:21:30 But all those functions and stuff, that bytecode is there and ready to roll.
00:21:34 It's just not kind of wired together until it gets to the interpreter, right?
00:21:39 That's right.
00:21:40 So you can actually compile – I think it's just the Python interpreter does the compiling and running all at the same time.
00:21:48 But I think there's actually a mode in Python that you can just compile to – you can just compile the bytecode and not actually run it yet.
00:21:57 I'm not sure exactly which flags are that one.
00:21:59 But sometimes people actually ship pre-compiled Python bytecode instead of the source code.
00:22:06 So there's – I don't know what reason people do this because you can just run the source code.
00:22:12 And some people like to obfuscate their bytecode maybe, but I don't know how well that actually works because you can kind of reverse engineer it.
00:22:20 But yeah, so the compile step is completely separate from the running step.
00:22:25 And like you said, once you compile, it's just a bunch of – instead of a text file, a .py file, it's called, I think, a .pyo file or something.
00:22:33 It's just a bunch of garbled stuff.
00:22:35 And then that garbled stuff, you can just run through the interpreter and it'll do your – it'll run with your program.
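A minimal sketch of compiling without running, using the standard `py_compile` module (also available from the command line as `python -m py_compile yourfile.py`). The file name `hello.py` is hypothetical. CPython typically writes these files with a `.pyc` extension, and a file produced this way can usually be passed straight to the interpreter.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "hello.py")  # hypothetical example file
    with open(src, "w") as f:
        f.write("print('hello')\n")

    # Compile only: this writes bytecode to disk without executing hello.py.
    pyc_path = py_compile.compile(
        src, cfile=os.path.join(tmp, "hello.pyc"), doraise=True
    )

    with open(pyc_path, "rb") as f:
        data = f.read()

# The file starts with a version-specific magic number, followed by other
# header fields and the serialized code object.
has_magic = data[:4] == importlib.util.MAGIC_NUMBER
print(has_magic, len(data), "bytes")
```

Because the magic number is tied to the interpreter version, a compiled file generally needs to be run by the same Python version that produced it.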
00:22:40 Yeah, it's really interesting to see how it's all coming together.
00:22:45 What do you think some of the main reasons for studying Python at this level are?
00:22:49 Like how does it make you a better programmer, do you think?
00:22:51 That's a great question.
00:22:54 I think that studying Python at this level of the implementation level, it kind of makes you – I feel like it makes you a better programmer in that you kind of, one, build a really good mental model of what goes on behind the scenes.
00:23:09 And you see that these languages are just tools made by people.
00:23:14 I think there's something really powerful in that.
00:23:15 I feel this is a very kind of systems perspective of programming.
00:23:20 So one analogy is that why do people study, say, operating systems or study compilers?
00:23:26 That's a good example.
00:23:27 Like the kind of classic thing in college is that a lot of people have to take an operating systems course where they build a very simple sort of OS kernel in C and maybe some assembly.
00:23:39 And their kernel kind of runs and it does a simple hello world.
00:23:42 Or you do a compilers course where you build a compiler using some basic building blocks.
00:23:49 And the idea there is that it's not that you're going to ever build an operating system or a compiler in real life or a new programming language.
00:23:56 You're not – most people are not going to implement a new kind of programming language.
00:23:59 But by studying the principles behind how it works, I feel like – I think it makes you a better programmer in that you kind of understand how large complex code bases are organized and logically broken down.
00:24:12 So I view this class like you've seen with these videos as more of like a code reading or literature exercise in a way.
00:24:18 Because we're actually reading through dozens of – actually not that many.
00:24:23 Maybe a dozen really core complex files and seeing how they – the pieces fit together.
00:24:30 So it's sort of like dissecting, you know, kind of a large piece of code.
00:24:35 I think that's really interesting in its own right.
00:24:38 Yeah.
00:24:39 A lot of people when they're in school at least studying this stuff, it's all very – I don't know, like you said, abstract or maybe not – it's not quite what I'm looking for.
00:24:48 But like it doesn't have the nitty-gritty details of the real world applied to it.
00:24:54 So all the error conditions that are so bizarre and all the optimizations, you don't necessarily have to deal with that.
00:24:58 And so when you do finally get to a real world complex code base, it's super hard to feel comfortable.
00:25:05 And I think, you know, you kind of helped your students do that a lot in there.
00:25:08 So that was cool.
00:25:08 Yeah.
00:25:10 I think that's – and like you mentioned, there's always a tradeoff, right?
00:25:13 So even in my choice of what to cover in this class, if you notice, I only cover maybe a dozen or so files.
00:25:19 I mean, the Python code base has hundreds or thousands of source code files.
00:25:24 And obviously, I don't have – one, I don't have time to cover all that.
00:25:27 And two, I feel like this dozen is really the conceptual core of the interpreter.
00:25:31 A lot of the files are just modules, right?
00:25:33 A lot of the files are just like here's how strings are implemented.
00:25:37 Here's how, you know, the socket class is implemented.
00:25:41 Here's how, you know, memory-mapped I/O is implemented.
00:25:43 Those are all, I feel, auxiliary things.
00:25:45 But whereas the core thing is, you know, what is an object?
00:25:48 What is, you know, a class?
00:25:50 What is a function?
00:25:51 What is the interpreter?
00:25:52 So – and even as you notice from watching the videos, I don't go over every single line in excruciating detail.
00:25:58 I basically gloss over things and say, look, this block happens if there's some kind of error.
00:26:02 You run out of memory.
00:26:03 So, you know, look at that in spare time.
00:26:05 Exactly.
00:26:05 But here's like conceptually what happens.
00:26:07 So, it is a balance of, you know, exposing students to the nitty-gritty, like you said, but also not too nitty-gritty because there's so much complexity in the code that isn't core to the lessons in the class.
00:26:20 So, it's a balance.
00:26:21 Yeah.
00:26:22 A lot of times as programmers, we are – to be effective, we have to kind of zoom in, look at the tree, zoom out, look at the forest, zoom back in on another tree, zoom back in.
00:26:31 And that skill of like in and out is pretty awesome.
00:26:34 We talked about the opcodes.
00:26:36 That's one – and that eval C – ceval.c function or class where it has the main eval loop running around and around.
00:26:44 That's one of the main architectural pieces of CPython.
00:26:48 Another one was – that struck me was everything is this type of C object called PyObject.
00:26:55 Yep.
00:26:56 Everything is a PyObject, right?
00:26:58 Pretty much.
00:27:00 So, numbers, strings, custom classes, those all kind of make sense.
00:27:06 But even the class definition itself, functions, methods.
00:27:11 So, that was really interesting to me.
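The point that classes, functions, and methods are objects just like numbers and strings can be checked from Python itself. The following is a minimal sketch; the names `greet`, `Point`, and `samples` are made up for illustration and do not come from the episode.

```python
def greet():
    return "hi"


class Point:
    def method(self):
        return 0


# A number, a string, an instance, the class itself, a function,
# a function looked up on the class, and a bound method.
samples = [42, "text", Point(), Point, greet, Point.method, Point().method]

# Every one of them is an instance of `object`, the root of the type hierarchy.
all_objects = all(isinstance(s, object) for s in samples)

# Each one also carries a type, which is itself an object.
type_names = [type(s).__name__ for s in samples]

print(all_objects)
print(type_names)
```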
00:27:14 And then we have derivatives of those, like things that have PyObject kind of as their base class, like PyIntObject for int, PyListObject for lists, and so on.
00:27:24 But C is not an object-oriented language.
00:27:28 So, how does that work?
00:27:29 Right.
00:27:30 So, like you mentioned, the PyObject structure, I guess, in C is the base of how everything is implemented.
00:27:39 All the objects in Python are implemented on top of it.
00:27:41 And what that contains is – that contains actually really few sorts of basic data.
00:27:47 And I think the most basic, I'm trying to remember off the top of my head, is one is a reference count of how many pointers are pointing to this object at once.
00:27:55 And it's because Python implements garbage collection by doing reference counting.
00:28:00 So, if you have nobody pointing to you, then you get garbage collected and your memory gets reclaimed.
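Reference counting can be observed from Python with `sys.getrefcount`. This is a rough sketch that assumes the standard CPython interpreter; the absolute numbers are an implementation detail (the call itself temporarily holds one extra reference), so only the differences are meaningful. The names `data`, `alias`, `Thing`, and `probe` are illustrative.

```python
import sys
import weakref

data = []
before = sys.getrefcount(data)       # includes the temporary reference held by the call

alias = data                         # a second name now points at the same list
after_alias = sys.getrefcount(data)  # one higher than before

del alias                            # drop that reference again
after_del = sys.getrefcount(data)    # back to the starting count


class Thing:
    pass


t = Thing()
probe = weakref.ref(t)               # a weak reference does not keep the object alive
del t                                # the last strong reference goes away
reclaimed = probe() is None          # in CPython the object is reclaimed immediately

print(after_alias - before, after_del - before, reclaimed)
```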
00:28:04 So, everything is conceptually a subclass of PyObject.
00:28:10 So, if you want to make an integer object, it's a PyIntObject.
00:28:13 Or if you want to make a string, it's a PyStringObject.
00:28:15 If you want to make a function object, it's a PyFunctionObject.
00:28:18 And like you mentioned, C is not an object-oriented language.
00:28:21 So, there's no inheritance in the language.
00:28:23 But really, you can fake it by basically doing what's called structural inheritance or structural subtyping.
00:28:31 What that really does is it's a hack where you basically create a struct that is – where the first few elements of the struct are exactly the same as the base class.
00:28:42 So, basically, the PyIntObject – I don't have the code in front of me.
00:28:45 But the PyIntObject, the first whatever – What is it?
00:28:49 Type?
00:28:50 Like the class, the original type, and then the ref count like you're saying, right?
00:28:53 That's right.
00:28:54 So, those are the – yeah, those are the two things in PyObject.
00:28:58 That's right.
00:28:58 So, there's a pointer to the – a tag saying what type it is.
00:29:01 And then there's the number of references.
00:29:04 So, every struct that represents some kind of a Python class – all the names are getting mixed up – starts with those two things.
00:29:16 And the cool thing there is that if you have C code that expects a PyObject *, a PyObject pointer, and operates on it, it knows that the first thing it accesses in memory is the type.
00:29:29 And the second thing, I think, is the reference count.
00:29:31 So, all of your code will work perfectly fine if it's an int object or a long object or a string object if the function you're passing it into expects just a base class of PyObject.
00:29:42 So, basically, conceptually, it's just subclassing or subtyping.
00:29:46 But that's how it ends up being implemented in C.
00:29:49 And, actually, how C++ does subtyping, I think, in its most basic form, is basically that.
00:29:56 Because C++ is meant to be compiled to be somewhat backwards compatible with C.
00:30:01 So, this idea of piling another class on top of another one structurally with the fields in the same places is a pretty classic technique.
00:30:09 Yeah.
00:30:10 You kind of – yeah, absolutely.
00:30:12 You kind of have to really understand C pointers pretty well to get it.
00:30:16 But once you do, it's pretty straightforward, right?
00:30:18 Because when you say pointer and you dereference that pointer and you say a name, that really just maps to, like, an offset from the base address.
00:30:25 And as long as they all have the same shape in memory up to that point, you basically have inheritance, right?
00:30:31 That's cool.
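The struct-prefix trick described above can be imitated in Python with `ctypes`, which lays out fields the way a C compiler would. This is only a sketch under assumptions: `BaseObject`, `IntObject`, `type_tag`, and `describe` are hypothetical stand-ins, not CPython's real definitions. In CPython's own headers the default object header stores the reference count field (`ob_refcnt`) before the type pointer (`ob_type`), and the sketch follows that order.

```python
import ctypes


class BaseObject(ctypes.Structure):
    # The shared header: every "subclass" struct must start with these fields.
    _fields_ = [("refcount", ctypes.c_ssize_t),
                ("type_tag", ctypes.c_int)]


class IntObject(ctypes.Structure):
    # Same leading fields in the same order, followed by the extra payload.
    _fields_ = [("refcount", ctypes.c_ssize_t),
                ("type_tag", ctypes.c_int),
                ("value", ctypes.c_long)]


def describe(base_ptr):
    """Code written against the base struct only; it never sees `value`."""
    base = base_ptr.contents
    return (base.refcount, base.type_tag)


number = IntObject(refcount=1, type_tag=7, value=42)

# Reinterpret a pointer to the larger struct as a pointer to the base struct.
as_base = ctypes.cast(ctypes.pointer(number), ctypes.POINTER(BaseObject))
header = describe(as_base)

# The cast is safe only because the shared fields sit at identical offsets.
same_offsets = (BaseObject.refcount.offset == IntObject.refcount.offset
                and BaseObject.type_tag.offset == IntObject.type_tag.offset)

# Writing through the base pointer changes the original struct's memory.
as_base.contents.refcount += 1

print(header, same_offsets, number.refcount)
```

A field name such as `refcount` is resolved to a fixed offset from the start of the struct, which is why code that knows only the base layout keeps working on the larger struct.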
00:30:46 This episode is brought to you by CodeShip.
00:30:48 CodeShip has launched organizations: create teams, set permissions for specific team members, and improve collaboration in your continuous delivery workflow.
00:30:57 Maintain centralized control over your organization's projects and teams with CodeShip's new organizations plan.
00:31:03 And as Talk Python listeners, you can save 20% off any premium plan for the next three months.
00:31:09 Just use the code TALKPYTHON, all caps, no spaces.
00:31:12 Check them out at CodeShip.com and tell them thanks for supporting the show on Twitter, where they're @CodeShip.
00:31:18 Yep, exactly.
00:31:24 And that's another kind of a side effect of studying this sort of – studying implementation.
00:31:29 Because most implementations are usually in C.
00:31:32 So you get to kind of see these interesting C tricks and see how other languages are built on top of that, like object-oriented programming.
00:31:40 Yeah, it's cool.
00:31:41 I definitely have a better appreciation for macros after spending 10 hours looking through that idea.
00:31:48 Because I did a lot of C++, but not a lot of pure C.
00:31:51 So, you know, some of the tricks you might do differently in C++, you know, almost, you know, have really nice macro solutions.
00:31:59 So that's cool.
00:31:59 Your other project, Python Tutor at PythonTutor.com, what's the relationship to this?
00:32:08 I mean, certainly PythonTutor.com helps you understand that sort of in-memory what's happening inside your Python code.
00:32:16 So I kind of see these things as somewhat related, these two projects that you had.
00:32:20 Maybe you could just introduce Python Tutor for everyone and then we could talk a bit about it.
00:32:24 Sure.
00:32:25 So Python Tutor at PythonTutor.com is a – it's a web-based tool where you can write Python code.
00:32:32 And actually now you can write code in a lot of other languages.
00:32:34 So you can write code.
00:32:35 It supports Python, Java, JavaScript, TypeScript, which is a Microsoft version of JavaScript with types, which works really well.
00:32:43 And also Ruby now.
00:32:45 What you do is you write code in your browser and then you run it.
00:32:48 And it actually goes – it sends your code to a server to run in a sandbox.
00:32:52 So it actually runs a real version of the language and not some kind of JavaScript-y simulation of it.
00:32:59 So it runs the code.
00:33:01 It sends back the execution trace, which is everything that happened when your code ran.
00:33:06 You know, what it did at every step, what it printed out, what variables there are, what data structures there are.
00:33:12 And then it produces a visualization for you that you can step through.
00:33:15 So it produces a visualization of every step of the code execution.
00:33:20 And then you can use a slider to go through it and see that, you know, the variables being created, the function stack frames being created, the pointers that are pointing to each other.
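The idea of an execution trace, a record of the local variables at every step, can be approximated with the standard `sys.settrace` hook. This is a simplified sketch and not Python Tutor's actual backend; `record_trace` and `demo` are hypothetical names used only for illustration.

```python
import sys


def record_trace(func, *args):
    """Run func(*args) and record (relative line number, locals) before each line."""
    steps = []
    code = func.__code__

    def tracer(frame, event, arg):
        # Only record line events that belong to the function being traced.
        if frame.f_code is code and event == "line":
            relative_line = frame.f_lineno - code.co_firstlineno
            steps.append((relative_line, dict(frame.f_locals)))
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # always remove the hook again
    return result, steps


def demo(n):
    total = 0
    for i in range(n):
        total += i
    return total


result, steps = record_trace(demo, 3)

# Each entry is a snapshot taken just before the given line runs.
for relative_line, local_vars in steps:
    print(relative_line, local_vars)
```

A visualizer could then render each snapshot as one frame and let a slider move between them.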
00:33:29 And what that lets you do is that lets beginners especially build up a mental model of what is kind of going on inside their program.
00:33:38 Because even for code – for experienced programmers, we actually build up this model ourselves.
00:33:43 We look at a piece of Python code and we think in our heads, oh, there's a variable here that's pointing to something else here and that's pointing to this other thing here.
00:33:50 And then we call a function and that function points to the same thing we do.
00:33:53 But those structures are really hard for beginners to build up in their heads.
00:33:57 And this tool has just been really helpful for a lot of people to build up that model.
00:34:02 And the relationship between that and the CPython stuff is actually really interesting because the CPython stuff is really for advanced learners who want to learn how things really work behind the scenes.
00:34:12 And like we mentioned earlier, the Python Tutor, for most people, I think, it's more useful because it's really what happens.
00:34:18 It draws the pictures of what happens at the conceptual level, right?
00:34:22 That you're actually – conceptually, all you want to think about is you run every line of code and something happens.
00:34:27 You don't need to know about the bytecode or the stack or the main interpreter loop or PyObjects or everything.
00:34:34 So I think those two are really complementary.
00:34:36 One is for advanced kind of programmers who want to study internals, whereas the Python Tutor is for beginners who are just learning the language.
00:34:45 Yeah, that's for sure.
00:34:47 I kind of saw it the same way.
00:34:49 I feel like there's sort of this understanding of the thing that is CPython.
00:34:55 And Python Tutor is this great way to help beginners kind of form good mental models.
00:35:00 And your CPython walk is really good at actually showing a super deep understanding.
00:35:05 But they kind of give like two perspectives of the same thing.
00:35:07 So even though I've been doing Python for a long time and I know C really well or C++ anyway, I still thought that just looking at the stuff that was going on in Python Tutor,
00:35:15 like it has some really great visualizations for showing basically like scope, variable scope and things like that, because that can be kind of hard to understand for beginners.
00:35:26 Those kinds of things, right?
00:35:28 Like because it's not just, well, it's in the curly braces.
00:35:30 And so when it leaves the curly braces, this variable is gone, right?
00:35:33 There's a whole different mechanism for finding what's defined where and so on.
00:35:37 Right.
00:35:37 That's right.
00:35:38 And also with nested scopes and closures in Python, that gets even more tricky.
00:35:44 So the Python Tutor has a way of visualizing kind of your parent frame.
00:35:48 So, for example, the classic case: if you define a function within a function, that inner function has access to the outer function's variables as well as the global variables.
00:35:59 And it gets even trickier when, you know, you have a function foo and inside of foo, you define bar and bar accesses something within foo.
00:36:06 But then foo returns bar to its caller and foo is – the stack frame of foo is gone.
00:36:12 But when you call bar again, you can actually still get back to the variables that foo had, even though foo has finished executing.
00:36:19 And the Python Tutor and these sorts of tools visualize that for you.
00:36:23 And it's been used by quite a few classes, especially to teach these things like nested functions and closures, which are not as obvious, you know, and they're, they're more advanced concepts.
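The foo and bar case described above looks roughly like this in code. It is a small sketch; the variable `count` and the use of `nonlocal` are illustrative choices, not something spelled out in the conversation.

```python
def foo():
    count = 0  # a local variable of foo

    def bar():
        nonlocal count  # bar reads and updates foo's variable
        count += 1
        return count

    return bar  # foo finishes here and its stack frame goes away


bar = foo()
first = bar()   # still reaches foo's `count`, even though foo has returned
second = bar()  # and the variable keeps its state between calls

# The captured variable lives on in a cell attached to the function object.
free_names = bar.__code__.co_freevars
cell_value = bar.__closure__[0].cell_contents

print(first, second, free_names, cell_value)
```

The cell held in `bar.__closure__` plays the role of the parent frame that a visualizer draws for the inner function.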
00:36:33 Yeah.
00:36:34 I do professional like training for Python and other, other technologies as well.
00:36:39 And I was thinking I'd probably pull that up when it gets to the scope stuff for students, just, you know, because, you know, I'm teaching a lot of guys who have done C++ or .NET or something like that.
00:36:48 And just their mental model is not appropriate.
00:36:50 Right.
00:36:51 And just like seeing it is a lot easier than spending five minutes talking about it, writing some demos.
00:36:56 So I think that's really cool.
00:36:57 I think it can help, help a lot in those areas as well.
00:37:00 Yeah, definitely.
00:37:01 Please, please use it.
00:37:03 And, and, and let me know if you have issues.
00:37:05 I mean, it's pretty, it's pretty robust at this point.
00:37:08 I mean, the thing is, it does require an expert such as yourself to guide people through.
00:37:12 I mean, it's helpful for people by themselves.
00:37:14 But if you just, what people do as instructors, like yourself, is you just pull up a browser and start writing code and start running it and start explaining the code to the students one step at a time.
00:37:25 And that's a lot more useful, I think, than starting a terminal.
00:37:28 Right.
00:37:28 Because the alternative now is you start a terminal, write a, write a function or a nested function or whatever, and then put a bunch of print statements inside.
00:37:35 And then you just run the terminal and just print a bunch of stuff.
00:37:37 And you're like, okay, I got to explain why it's printing this.
00:37:40 But whereas in the Python Tutor, it's printing it to the web terminal.
00:37:43 But then also every step you see, oh, it's printing this because X is now pointing to this.
00:37:48 And now X points to something else and it's printing that.
00:37:50 It's extremely clear.
00:37:51 Yeah, it is very clear.
00:37:53 And it's like, you know, if you were to do your terminal example and then go over to the whiteboard and sketch out what's really happening as you try to describe it, like Python Tutor just does that drawing for you, right?
00:38:02 Exactly.
00:38:04 So the exact use case is what you said.
00:38:06 It really replaces a combination of a terminal.
00:38:11 It really replaces a text editor, you know, an interpreter, a terminal session, like a REPL, and a separate whiteboard all in one.
00:38:21 And I thought it was really interesting you mentioned the .NET kind of the C slash .NET developers switching the mental model of Python.
00:38:28 A funny story about this is recently I wanted to learn Ruby.
00:38:31 I've always wanted to learn Ruby for a while.
00:38:33 I've never done it before.
00:38:34 And I felt a good way for me to learn Ruby is to actually write my own Ruby backend for the Python Tutor.
00:38:42 So the Python Tutor is actually, it's a very platform, it's a language-independent interface.
00:38:47 If you notice the visualizations, nothing about the visualizations has Python.
00:38:50 They're just variables and stack frames and functions and lists and objects with attributes and stuff.