Commit 00a19e3

Merge pull request ictar#9 from ictar/pw_245
add 3 raw posts from python weekly #245
2 parents c66f283 + 4b386f9 commit 00a19e3

5 files changed

Lines changed: 2777 additions & 1 deletion
Lines changed: 138 additions & 0 deletions
@@ -0,0 +1,138 @@
1+
Original: [Python Weekly Issue 245](http://us2.campaign-archive2.com/?u=e2e180baf855ac797ef407fc7&id=bb4672538f&e=148158c7b4)
2+
3+
---
4+
5+
Welcome to issue 245 of Python Weekly. Let's get straight to what this week has to offer.
6+
7+
# From Our Sponsor
8+
9+
[![](https://gallery.mailchimp.com/e2e180baf855ac797ef407fc7/images/711a53fa-d9a3-4b1d-897c-853ccb078c96.png)](https://software.intel.com/en-us/intel-sdp-home)
10+
11+
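Intel is a proud sponsor of PyCon 2016! Visit booth #533 to learn (and win cool prizes) how Intel is helping the Python community through code contributions, performance solutions with [Python plus native code profiling](https://software.intel.com/en-us/python-profiling) and a [Python distribution](https://software.intel.com/en-us/python-distribution), PyPy acceleration, and data analytics platforms.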
12+
13+
14+
# News
15+
16+
[PyCon Ireland Call for Proposals](https://python.ie/pycon-2016/call-proposals/)
17+
18+
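The call for proposals for PyConIE 2016 is now open. The format will consist of two talk tracks and two workshop tracks. Please submit your proposals by July 18, 2016.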
19+
20+
21+
# Articles, Tutorials and Talks
22+
23+
[Demand forecasting with BigQuery and TensorFlow](http://nbviewer.jupyter.org/github/GoogleCloudPlatform/training-data-analyst/blob/blog_20160513/CPB100/lab4a/demandforecast.ipynb)
24+
25+
In this notebook, we will develop a machine learning model to predict taxi demand in New York.
26+
27+
[Episode #60: Scaling Python to 1000's of cores with Ufora](https://talkpython.fm/episodes/show/60/scaling-python-to-1000-s-of-cores-with-ufora)
28+
29+
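You have heard me talk on this show before about scaling Python and Python performance. In this episode, I bring you a very interesting project that raises the upper bound of Python performance for a certain class of applications. You will meet Braxton McKee from Ufora. They have developed an entirely new Python runtime that focuses on scaling Python applications horizontally across thousands of CPU cores and even GPUs. They describe it as "compiled, automatically parallel Python for data science".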
30+
31+
[Python 101: An Intro to Benchmarking your code](http://www.blog.pythonlibrary.org/2016/05/24/python-101-an-intro-to-benchmarking-your-code/)
32+
33+
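What does it mean to benchmark your code? The main idea behind benchmarking or profiling is to figure out how fast your code executes and where the bottlenecks are. The main reason to do this is optimization. You will run into situations where your code needs to run faster because your business needs have changed. When that happens, you will need to figure out which parts of your code are slowing things down. This chapter only covers how to profile your code using a variety of tools. It does not go into actually optimizing your code. Let's get started!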
34+
35+
[Getting Started with the Slack API Using Python and Flask](https://realpython.com/blog/python/getting-started-with-the-slack-api-using-python-and-flask)
36+
37+
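In this post, we will see how to use Slack through the API and the official SlackClient Python helper library. We will grab an API access token and write some Python code to list, retrieve, and send data through the API. Let's dig in!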
38+
39+
[Podcast.__init__ Episode 58 - Wagtail with Tom Dyson](http://pythonpodcast.com/tom-dyson-wagtail.html)
40+
41+
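If you run a website that needs to publish and manage content on a regular basis, a CMS (content management system) is the obvious choice for reducing your workload. There are plenty of options available, but if you are looking for a flexible solution that leverages the power of Python, you should take a serious look at Wagtail. In this episode, Tom Dyson explains how Wagtail came to be, how it differs from the alternatives, and when you should use it for your projects.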
42+
43+
[Moving Away from Python 2](https://asmeurer.github.io/blog/posts/moving-away-from-python-2/)
44+
45+
[Reverse Engineering A Mysterious UDP Stream in My Hotel](http://wiki.gkbrk.com/Hotel_Music.html)
46+
47+
[Raspberry Pi LCD Setup and Programming in Python](https://www.youtube.com/watch?v=zC3i3CbKZfw)
48+
49+
[Conditional Python Dependencies](https://hynek.me/articles/conditional-python-dependencies/)
50+
51+
[Web Scraping with Python: Scraping Digital Comics Information from Comixology](http://felipegalvao.com.br/blog/2016/05/24/web-scraping-with-python-scraping-digital-comics-information-from-comixology/)
52+
53+
54+
# Interesting Projects, Tools and Libraries
55+
56+
[Mycroft Core](https://github.com/MycroftAI/mycroft-core)
57+
58+
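Mycroft Core is the primary module that makes up the Mycroft artificial intelligence platform. Mycroft makes use of the Adapt Intent Parser, Speech-to-Text software, and Text-to-Speech. The idea behind the platform is to voice-enable any device and turn it into a smart personal assistant capable of performing a number of tasks.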
59+
60+
[OpenWPM](https://github.com/citp/OpenWPM)
61+
62+
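OpenWPM is a web privacy measurement framework that makes it easy to collect data for privacy studies on a scale of thousands to millions of sites. OpenWPM is built on top of Firefox, with automation provided by Selenium. It includes several hooks for data collection, including a proxy, a Firefox extension, and access to Flash cookies.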
63+
64+
[http-prompt](https://github.com/eliangcs/http-prompt)
65+
66+
HTTP Prompt is an interactive command-line HTTP client featuring autocomplete and syntax highlighting, built on HTTPie and prompt_toolkit.
67+
68+
[aima-python](https://github.com/aimacode/aima-python)
69+
70+
Python implementation of the algorithms from Russell and Norvig's "Artificial Intelligence - A Modern Approach".
71+
72+
[stack](https://github.com/RyanKung/stack)
73+
74+
stack is a Python version of stack, a cross-platform program for developing Python projects.
75+
76+
[textX](https://github.com/igordejanovic/textX)
77+
78+
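textX is a meta-language for building Domain-Specific Languages (DSLs) in Python. In a nutshell, textX helps you build your own textual language in an easy way. You can invent your own language or build support for an existing textual language or file format.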
79+
80+
[tesserocr](https://github.com/sirfz/tesserocr)
81+
82+
A simple, Pillow-friendly wrapper around the tesseract-ocr API for Optical Character Recognition (OCR).
83+
84+
[flask-ask](https://github.com/johnwheeler/flask-ask)
85+
86+
Easily write Amazon Echo apps with Python, Flask, and the Alexa Skills Kit.
87+
88+
[DQN-tensorflow](https://github.com/devsisters/DQN-tensorflow)
89+
90+
TensorFlow implementation of Human-Level Control through Deep Reinforcement Learning.
91+
92+
[micropython-redis](https://github.com/dwighthubbard/micropython-redis)
93+
94+
A redis client implementation designed for use with micropython.
95+
96+
[say_what](https://github.com/joshnewlan/say_what)
97+
98+
Using speech-to-text to fully check out during conference calls.
99+
100+
[mlbgame](https://github.com/zachpanz88/mlbgame)
101+
102+
A Python API for retrieving and reading MLB GameDay XML data.
103+
104+
[Meson](https://github.com/mesonbuild/meson/)
105+
106+
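Meson is a cross-platform build system designed to be as fast and as user friendly as possible. It supports many languages and compilers, including GCC, Clang and Visual Studio. Its build definitions are written in a simple non-Turing-complete DSL.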
107+
108+
[RhodeCode](https://rhodecode.com/download/community)
109+
110+
Centralized control for distributed repositories. A unified tool for working with Mercurial, Git, and Subversion.
111+
112+
[GooPyCharts](https://github.com/Dfenestrator/GooPyCharts)
113+
114+
A Google Charts API for Python. It is meant to be used as an alternative to matplotlib.
115+
116+
[callbot](https://github.com/makaimc/callbot)
117+
118+
A Slack bot that makes phone calls using Twilio.
119+
120+
121+
# New Releases
122+
123+
[Scrapy 1.1](https://blog.scrapinghub.com/2016/05/25/data-extraction-with-scrapy-and-python-3/)
124+
125+
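Scrapy 1.1, with Python 3 support, has been officially released! Python 3 support is not the only good news in this release. It also comes with a number of other features and improvements.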
126+
127+
[Django 1.10 alpha 1](https://www.djangoproject.com/weblog/2016/may/20/django-110-alpha-1-released/)
128+
129+
130+
# Upcoming Events and Webinars
131+
132+
[PyData Paris 2016](http://pydata.org/paris2016/)
133+
134+
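PyData conferences are a gathering of users and developers of data analysis tools in Python. The goal is to give Python enthusiasts a place to share ideas and learn from each other about how best to apply the language and its tools to the ever-evolving challenges in the vast realm of data management, processing, analytics, and visualization.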
135+
136+
[Webinar: Introduction to the SciPy Ecosystem](http://www.oreilly.com/pub/e/3714)
137+
138+
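Python has a large and active scientific programming community, and additional tools are being developed all the time. Facing this new world can be confusing. Join Ben Root as he gives a high-level overview of the SciPy ecosystem and highlights some of his favorite tools to get you started with SciPy.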

Python Weekly/README.md

Lines changed: 2 additions & 1 deletion
@@ -3,4 +3,5 @@
33
This directory contains the Chinese translation of Python Weekly, starting from Issue 243.
44

55
- [Issue 243](./Python Weekly Issue 243.md)
6-
- [Issue 244](./Python Weekly Issue 244.md)
6+
- [Issue 244](./Python Weekly Issue 244.md)
7+
- [Issue 245](./Python Weekly Issue 245.md)

raw/Demand forecasting with BigQuery and TensorFlow.md

Lines changed: 676 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 149 additions & 0 deletions
@@ -0,0 +1,149 @@
1+
Original: [Reverse Engineering A Mysterious UDP Stream in My Hotel](http://wiki.gkbrk.com/Hotel_Music.html)
2+
3+
---
4+
5+
Hey everyone, I have been staying at a hotel for a while. It's one of those
6+
modern ones with smart TVs and other connected goodies. I got curious and
7+
opened Wireshark, as any tinkerer would do.
8+
9+
I was very surprised to see a huge amount of UDP traffic on port 2046. I
10+
looked it up but the results were far from useful. This wasn't a standard
11+
port, so I would have to figure it out manually.
12+
13+
At first, I suspected that the data might be a television stream for the TVs,
14+
but the packet length seemed too small, even for a single video frame.
15+
16+
### Grabbing the data
17+
18+
The UDP packets weren't sent to my IP and I wasn't doing ARP spoofing, so
19+
these packets were sent to everyone. Upon closer inspection, I found out that
20+
these were **Multicast** packets. This basically means that the packets are
21+
sent once and received by multiple devices simultaneously. Another thing I
22+
noticed was the fact that all of those packets were the same length (634
23+
bytes).
24+
25+
I decided to write a Python script to save and analyze this data. First of
26+
all, here's the code I used to receive Multicast packets. In the following
27+
code, _234.0.0.2_ is the IP I got from Wireshark.
28+
29+
```python
import socket
import struct

# Plain UDP socket bound to the port seen in Wireshark
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('', 2046))

# Join the multicast group on the default interface. The request is a
# struct ip_mreq: the group address followed by the local interface address.
mreq = struct.pack("4s4s", socket.inet_aton("234.0.0.2"), socket.inet_aton("0.0.0.0"))
s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

# Print every packet as it arrives
while True:
    data = s.recv(2048)
    print(data)
```
46+
47+
On top of this, I also used
48+
[binascii](https://docs.python.org/3.5/library/binascii.html) to convert this
49+
to hex in order to make reading the bytes easier. After watching thousands of
50+
these packets scroll through the console, I noticed that the first ~15 bytes
51+
were the same. These bytes probably indicate the protocol and the
52+
packet/command ID but I only received the same one so I couldn't investigate
53+
those.
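
As a rough sketch of that inspection step, the snippet below hex-dumps packets with `binascii.hexlify` and measures how many leading bytes they share. The helper names and the sample packets are made up for illustration; they are not captured data.

```python
import binascii


def to_hex(data):
    """Return the bytes as a lowercase hex string."""
    return binascii.hexlify(data).decode("ascii")


def common_prefix_len(packets):
    """Count how many leading bytes are identical across all packets."""
    n = 0
    for column in zip(*packets):
        # column holds the byte at the same position in every packet
        if len(set(column)) != 1:
            break
        n += 1
    return n


# Made-up sample packets (hypothetical, not from the hotel network)
samples = [
    b"\x01\x02\x03\x04AAAA",
    b"\x01\x02\x03\x04BBBB",
    b"\x01\x02\x03\x04CCCC",
]
n = common_prefix_len(samples)
print(n, to_hex(samples[0][:n]))  # 4 01020304
```

With real traffic, `samples` would be a list of packets collected from `s.recv(2048)`.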
54+
55+
### Audio is so LAME
56+
57+
It also took me an embarrassingly long time to see the string
58+
`LAME3.91UUUUUUU` at the end of the packets. I suspected this was MPEG
59+
compressed audio data, but saving one packet as test.mp3 failed to play with
60+
mplayer and the _file_ utility only identified this as `test.mp3: data`. There
61+
was obviously data in this packet and _file_ should know when it sees MPEG
62+
Audio data, so I decided to write another Python script to save the packet
63+
data with offsets. This way it would save the file `test1` skipping 1 byte
64+
from the packet, `test2` skipping 2 bytes and so on. Here's the code I used
65+
and the result.
66+
67+
```python
data = s.recv(2048)
# Save 25 copies of the packet; test<i> skips the first i bytes
for i in range(25):
    with open("test{}".format(i), "wb") as f:
        f.write(data[i:])
```
74+
75+
After this, I ran `file test*` and voilà! Now we know we have to skip 8 bytes
76+
to get to the MPEG Audio data.
77+
78+
```text
$ file test*
test0:  data
test1:  UNIF v-16624417 format NES ROM image
test10: UNIF v-763093498 format NES ROM image
test11: UNIF v-1093499874 format NES ROM image
test12: data
test13: TTComp archive, binary, 4K dictionary
test14: data
test15: data
test16: UNIF v-1939734368 format NES ROM image
test17: UNIF v-1198759424 format NES ROM image
test18: UNIF v-256340894 format NES ROM image
test19: UNIF v-839862132 format NES ROM image
test2:  UNIF v-67173804 format NES ROM image
test20: data
test21: data
test22: data
test23: DOS executable (COM, 0x8C-variant)
test24: COM executable for DOS
test3:  UNIF v-1325662462 format NES ROM image
test4:  data
test5:  data
test6:  data
test7:  data
test8:  MPEG ADTS, layer III, v1, 192 kbps, 44.1 kHz, JntStereo
test9:  UNIF v-2078407168 format NES ROM image
```
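
The same offset could probably be found without writing 25 files, by scanning the packet for an MPEG audio frame sync: a `0xFF` byte followed by a byte whose top three bits are set. This is only a sketch; `find_mpeg_sync` is a hypothetical helper, the sample packet is made up, and a two-byte check can produce false positives that `file` would reject with its more thorough tests.

```python
def find_mpeg_sync(data, limit=64):
    """Return the first offset that looks like an MPEG audio frame header, or -1."""
    for i in range(min(limit, len(data) - 1)):
        # Frame sync is 11 set bits: 0xFF, then the top 3 bits of the next byte
        if data[i] == 0xFF and (data[i + 1] & 0xE0) == 0xE0:
            return i
    return -1


# Made-up packet: 8 header bytes followed by something shaped like an MP3 frame
packet = b"\x00" * 8 + b"\xff\xfb\xb0\x64" + b"\x00" * 16
print(find_mpeg_sync(packet))  # 8
```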
108+
109+
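```python
import sys

# Uses the multicast socket `s` from the first listing.
# Drop the 8-byte header of each packet and stream the MP3 data to stdout.
while True:
    data = s.recv(2048)
    sys.stdout.buffer.write(data[8:])
```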
116+
117+
Now all we need to do is continuously read packets, skip the first 8 bytes,
118+
write them to a file and it should play perfectly.
119+
120+
But what was this audio? Was this a sneakily placed bug that listened to me?
121+
Was it something related to the smart TVs in my room? Something related to the
122+
hotel systems? Only one way to find out.
123+
124+
```text
$ python3 listen_2046.py > test.mp3
* wait a little to get a recording *
^C

$ mplayer test.mp3
MPlayer (C) 2000-2016 MPlayer Team
224 audio & 451 video codecs

Playing test.mp3.
libavformat version 57.25.100 (external)
Audio only file format detected.
=====
Starting playback...
A: 3.9 (03.8) of 13.0 (13.0) 0.7%
```
142+
143+
### The Revelation/Disappointment
144+
145+
What the hell? I can't believe I spent time on this. It's just elevator
146+
music. It is played in the hotel corridors around the elevators. Oh well, at
147+
least I can listen to it from my room now.
148+
149+
