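This post was inspired by a new project, A short guide on features of Python 3 for data scientists, which lists the Python 3 features its author uses. As it happens, I had been meaning to write a post about distinctive Python 3 usage (specifically Python 3.6+) myself. Let's get started!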

The pathlib module

pathlib is a module new in Python 3 (added in 3.4) that makes working with filesystem paths much more convenient.

In : from pathlib import Path
In : Path.home()
Out: PosixPath('/Users/dongweiming') # the user's home directory
In : path = Path('/user')
In : path / 'local' # very intuitive
Out: PosixPath('/user/local')
In : str(path / 'local' / 'bin')
Out: '/user/local/bin'
In : f = Path('example.txt')
In : f.write_bytes('This is the content'.encode('utf-8'))
Out: 19
In : with f.open('r', encoding='utf-8') as handle: # open is now a method
....:     print('read from open(): {!r}'.format(handle.read()))
....:
read from open(): 'This is the content'
In : p = Path('touched')
In : p.exists() # many common operations are built in
Out: False
In : p.touch()
In : p.exists()
Out: True
In : p.with_suffix('.jpg')
Out: PosixPath('touched.jpg')
In : p.is_dir()
Out: False
In : p.joinpath('a', 'b')
Out: PosixPath('touched/a/b')
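
The session above touches the current directory. As a minimal sketch of the same ideas in a throwaway location, the following builds a nested file under a temporary directory and inspects its parts. The directory layout and file names are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: <tmp>/reports/2018/summary.txt
    report = Path(tmp) / 'reports' / '2018' / 'summary.txt'

    # Create all missing parent directories in one call.
    report.parent.mkdir(parents=True, exist_ok=True)

    # write_text/read_text open, write or read, and close the file for us.
    report.write_text('total: 42\n', encoding='utf-8')
    content = report.read_text(encoding='utf-8')

    # Common path components are available as attributes.
    parts = (report.name, report.stem, report.suffix, report.parent.name)

    # iterdir() lists the entries of a directory.
    children = sorted(p.name for p in report.parent.iterdir())

print(content, parts, children)
```

Using a temporary directory keeps the example free of side effects; everything under it is deleted when the with block exits.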

Extended iterable unpacking

In : a, b, *rest = range(10) # easy to grasp if you know Lisp: *rest means "everything else"
In : a
Out: 0
In : b
Out: 1
In : rest
Out: [2, 3, 4, 5, 6, 7, 8, 9]
In : *prev, next_to_last, last = range(10)
In : prev, next_to_last, last
Out: ([0, 1, 2, 3, 4, 5, 6, 7], 8, 9)
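
Star targets are handy beyond the interactive prompt. Here is a small sketch of splitting a record into a head and a variable-length tail, and of using a star target directly in a for loop. The function split_record and the sample data are hypothetical.

```python
def split_record(line):
    """Split 'name score score ...' into (name, [scores])."""
    name, *scores = line.split()          # scores is always a list, possibly empty
    return name, [int(s) for s in scores]


records = ['alice 90 85 77', 'bob 60', 'carol']
parsed = [split_record(r) for r in records]

# A star target also works in a for loop header.
spans = []
for first, *middle, last in [(1, 2, 3, 4), (5, 6)]:
    spans.append((first, middle, last))

print(parsed)
print(spans)
```

Note that the starred name always receives a list, even when nothing is left over for it.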

Keyword-only arguments

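Keyword-only arguments often express intent more clearly than positional ones and make code more readable. To force certain parameters to be passed by keyword, put them after a *args-style parameter or after a bare *: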

In : def recv(maxsize, *, block):
....:     pass
....:
In : recv(1024, True)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-49-8e61db2ef94b> in <module>()
----> 1 recv(1024, True)
TypeError: recv() takes 1 positional argument but 2 were given
In : recv(1024, block=True)
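
Both forms mentioned above can be sketched side by side. The function connect and its parameters are invented for illustration; recv mirrors the example above, with a default added so it can also be called with one argument.

```python
def connect(host, *ports, timeout=5.0, retries=3):
    """Parameters after *ports can only be passed by keyword."""
    return {'host': host, 'ports': ports, 'timeout': timeout, 'retries': retries}


def recv(maxsize, *, block=True):
    """A bare * accepts no extra positionals; block is keyword-only."""
    return maxsize, block


# Extra positional values are collected into ports, never into timeout.
cfg = connect('example.org', 80, 443, timeout=1.5)

# Passing block positionally is rejected.
try:
    recv(1024, False)
except TypeError as exc:
    error = str(exc)

print(cfg)
print(error)
```

Keyword-only parameters may have defaults or not; without a default, as in the original recv, the caller must always spell out block=... explicitly.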

The ** wildcard

As we all know, glob in Python 2 could not match directories recursively, so you had to write something like this:

import glob

found_images = \
glob.glob('/path/*.jpg') \
+ glob.glob('/path/*/*.jpg') \
+ glob.glob('/path/*/*/*.jpg') \
+ glob.glob('/path/*/*/*/*.jpg') \
+ glob.glob('/path/*/*/*/*/*.jpg')

The Python 3 version (3.5+) is much cleaner:

import glob
found_images = glob.glob('/path/**/*.jpg', recursive=True)

An even better option is pathlib:

import pathlib
found_images = pathlib.Path('/path/').glob('**/*.jpg')  # returns a generator, not a list
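
To check that the recursive forms really find files at every depth, here is a small self-contained sketch that builds a temporary tree and compares glob.glob with Path.glob and Path.rglob. The file names are made up for illustration.

```python
import glob
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Hypothetical tree with images at three different depths.
    for rel in ('a.jpg', 'x/b.jpg', 'x/y/c.jpg', 'x/y/notes.txt'):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()

    # '**' matches zero or more directory levels when recursive=True.
    via_glob = sorted(
        os.path.relpath(p, tmp)
        for p in glob.glob(os.path.join(tmp, '**', '*.jpg'), recursive=True)
    )

    # Path.glob understands '**' directly; rglob('*.jpg') is shorthand for it.
    via_pathlib = sorted(str(p.relative_to(root)) for p in root.glob('**/*.jpg'))
    via_rglob = sorted(str(p.relative_to(root)) for p in root.rglob('*.jpg'))

print(via_glob)
```

All three approaches return the same three images, including a.jpg at the top level, because ** also matches zero directories.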

print

In Python 3, print is a function, which makes it much more flexible:

In : print(*[1, 2, 3], sep='\t')
1 2 3
In : [x if x % 3 else print('', x) for x in range(10)]
 0
 3
 6
 9
Out: [None, 1, 2, None, 4, 5, None, 7, 8, None]
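
Besides sep, print accepts end and file keyword arguments, and because it is an ordinary function it can be wrapped like any other callable. A minimal sketch, writing to an in-memory buffer instead of the terminal (the helper name log is hypothetical):

```python
import io
from functools import partial

buffer = io.StringIO()

# sep replaces the default ' ' separator; file redirects the output.
print('epoch', 'loss', sep=',', file=buffer)
for epoch, loss in enumerate([0.9, 0.5], start=1):
    print(epoch, loss, sep=',', file=buffer)

# print can be partially applied like any other function.
log = partial(print, '[log]', file=buffer)   # hypothetical helper
log('rows written')

# end replaces the default trailing newline.
print('done', end='', file=buffer)

captured = buffer.getvalue()
print(repr(captured))
```

None of this was possible with the Python 2 print statement without special syntax such as print >>f.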

Formatted string literals (f-strings)

In : name = 'Fred'
In : f'My name is {name}'
Out: 'My name is Fred'
In : from datetime import datetime
In : date = datetime.now().date()
In : f'{date} was on a {date:%A}'
Out: '2018-01-17 was on a Wednesday'
In : def foo():
....:     return 20
....:
In : f'result={foo()}'
Out: 'result=20'
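
The part after the colon in an f-string is a regular format spec, the same mini-language that str.format uses. A short sketch of a few common specs, with sample values chosen only for illustration:

```python
value = 3.14159
count = 1234567
name = 'Fred'
width = 6

formatted = [
    f'{value:.2f}',          # fixed-point with two decimals
    f'{count:,}',            # thousands separator
    f'{name!r}',             # !r applies repr() before formatting
    f'{name:>8}|',           # right-align in a field of width 8
    f'{0.256:.1%}',          # percentage with one decimal
    f'{value:{width}.3f}',   # the spec itself may contain nested fields
]

for line in formatted:
    print(line)
```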

Stricter comparison rules

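All of the following are invalid in Python 3 and raise TypeError, whereas Python 2 silently accepted them and returned an arbitrary result: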

3 < '3'
2 < None
(3, 4) < (3, None)
(4, 5) < [4, 5]
sorted([2, '1', 3])
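
When mixed-type data really has to be sorted, the ordering must be stated explicitly with a key function. The following sketch shows the error and two possible orderings; the helper try_compare and the sample lists are hypothetical.

```python
def try_compare(a, b):
    """Return a < b, or a description of the TypeError it raises."""
    try:
        return a < b
    except TypeError as exc:
        return f'TypeError: {exc}'


outcome = try_compare(3, '3')

mixed = [10, '9', 2]
by_number = sorted(mixed, key=int)   # compare everything as integers
by_text = sorted(mixed, key=str)     # compare everything as strings

# One way to sort values that may be None: push None to the end.
with_none = sorted([3, None, 1], key=lambda v: (v is None, v if v is not None else 0))

print(outcome)
print(by_number, by_text, with_none)
```

The key function only decides the order; the elements in the result keep their original types.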

Unified Unicode handling

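This is one of the things people most often criticize about Python 2. For example, the following gives odd results in Python 2, because str holds raw bytes and (assuming a UTF-8 terminal) these two characters take six bytes: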

In : s = '您好'
In : print(len(s))
6
In : print(s[:2])
?

Python 3 makes this straightforward:

In : s = '您好'
In : print(len(s))
2
In : print(s[:2])
您好
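
The Python 2 behaviour can be reproduced in Python 3 by encoding the string explicitly, which makes the difference between characters and bytes visible. A minimal sketch, assuming UTF-8 as the encoding:

```python
s = '您好'
encoded = s.encode('utf-8')          # bytes, comparable to a Python 2 str

char_count = len(s)                  # counts characters
byte_count = len(encoded)            # counts bytes; each character takes 3 here

# Slicing on a character boundary decodes cleanly.
first_char = encoded[:3].decode('utf-8')

# Slicing in the middle of a character, as Python 2's s[:2] did, is not valid UTF-8.
try:
    encoded[:2].decode('utf-8')
    broken = False
except UnicodeDecodeError:
    broken = True

print(char_count, byte_count, first_char, broken)
```

In Python 3, str is always text and bytes is always binary data, and converting between the two requires an explicit encode or decode.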

Merging dictionaries

In : x = dict(a=1, b=2)
In : y = dict(b=3, d=4)
In : z = {**x, **y}
In : z
Out: {'a': 1, 'b': 3, 'd': 4}
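
Two details are worth spelling out: later entries win when keys collide, and the source dictionaries are left untouched. The sketch below also shows the older copy-and-update spelling for comparison. The names defaults, overrides and merge_old_style are made up for illustration.

```python
def merge_old_style(a, b):
    """Equivalent merge without unpacking: copy, then update."""
    merged = a.copy()
    merged.update(b)
    return merged


defaults = {'host': 'localhost', 'port': 8000, 'debug': False}
overrides = {'port': 9000, 'debug': True}

# Unpacking (Python 3.5+) builds a new dict; extra keys can be added inline.
settings = {**defaults, **overrides, 'workers': 4}

same_as_old = merge_old_style(defaults, overrides)

print(settings)
print(defaults)   # unchanged
```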

Dictionaries preserve insertion order

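Since Python 3.6, dicts preserve insertion order (an implementation detail of CPython 3.6, guaranteed by the language from 3.7), so in many cases you no longer need OrderedDict: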

In : {str(i):i for i in range(5)}
Out: {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4}
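
A short sketch of what insertion order means in practice, and of two things OrderedDict still does differently. It assumes Python 3.7+ (or CPython 3.6); the sample keys are arbitrary.

```python
from collections import OrderedDict

d = {}
for key in ('banana', 'apple', 'cherry'):
    d[key] = len(key)
initial_order = list(d)

d['banana'] = 0          # updating a value keeps the key's position
del d['apple']
d['apple'] = 5           # deleting and re-inserting moves the key to the end
final_order = list(d)

# Plain dicts ignore order when compared; OrderedDicts do not.
plain_equal = {'a': 1, 'b': 2} == {'b': 2, 'a': 1}
ordered_equal = OrderedDict([('a', 1), ('b', 2)]) == OrderedDict([('b', 2), ('a', 1)])

# OrderedDict also offers reordering helpers such as move_to_end.
od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
od.move_to_end('a')
moved_order = list(od)

print(initial_order, final_order, plain_equal, ordered_equal, moved_order)
```

So OrderedDict remains useful when order should take part in equality checks or when keys need to be reordered in place.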