Python Deep Dive II, Section 8 – Iteration Tools (76~83)

sky · 2022-09-29 12:21

This week's progress 8: Iteration Tools (middle part)

Before we start:

Predicate: any function that given an input returns True or False is called a predicate.

In other words, any function that returns a Boolean value is a predicate. The usual Chinese term is 「斷言」 ("assertion").

itertools: summary tables from the official Python documentation

Infinite iterators:

(Chinese term: 無窮迭代器)

Iterator	Arguments	Results	Example
`count()`	start, [step]	start, start+step, start+2*step, …	`count(10) → 10 11 12 13 14 ...`
`cycle()`	p	p0, p1, … plast, p0, p1, …	`cycle('ABCD') → A B C D A B C D ...`
`repeat()`	elem [,n]	elem, elem, elem, … endlessly or up to n times	`repeat(10, 3) → 10 10 10`

Iterators terminating on the shortest input sequence:

Iterators that stop when the shortest input sequence is exhausted (quite a mouthful, and hard to parse)

Iterator	Arguments	Results	Example
`accumulate()`	p [,func]	p0, p0+p1, p0+p1+p2, …	`accumulate([1,2,3,4,5]) → 1 3 6 10 15`
`chain()`	p, q, …	p0, p1, … plast, q0, q1, …	`chain('ABC', 'DEF') → A B C D E F`
`chain.from_iterable()`	iterable	p0, p1, … plast, q0, q1, …	`chain.from_iterable(['ABC', 'DEF']) → A B C D E F`
`compress()`	data, selectors	(d[0] if s[0]), (d[1] if s[1]), …	`compress('ABCDEF', [1,0,1,0,1,1]) → A C E F`
`dropwhile()`	pred, seq	seq[n], seq[n+1], starting when pred fails	`dropwhile(lambda x: x<5, [1,4,6,4,1]) → 6 4 1`
`filterfalse()`	pred, seq	elements of seq where pred(elem) is false	`filterfalse(lambda x: x%2, range(10)) → 0 2 4 6 8`
`groupby()`	iterable[, key]	sub-iterators grouped by value of key(v)
`islice()`	seq, [start,] stop [, step]	elements from seq[start:stop:step]	`islice('ABCDEFG', 2, None) → C D E F G`
`pairwise()`	iterable	(p[0], p[1]), (p[1], p[2])	`pairwise('ABCDEFG') → AB BC CD DE EF FG`
`starmap()`	func, seq	func(seq[0]), func(seq[1]), …	`starmap(pow, [(2,5), (3,2), (10,3)]) → 32 9 1000`
`takewhile()`	pred, seq	seq[0], seq[1], until pred fails	`takewhile(lambda x: x<5, [1,4,6,4,1]) → 1 4`
`tee()`	it, n	it1, it2, … itn splits one iterator into n
`zip_longest()`	p, q, …	(p[0], q[0]), (p[1], q[1]), …	`zip_longest('ABCD', 'xy', fillvalue='-') → Ax By C- D-`

Combinatoric iterators:

(Chinese term: 排列組合迭代器, "permutation and combination iterators")

Iterator	Arguments	Results
`product()`	p, q, … [repeat=1]	cartesian product, equivalent to a nested for-loop
`permutations()`	p[, r]	r-length tuples, all possible orderings, no repeated elements
`combinations()`	p, r	r-length tuples, in sorted order, no repeated elements
`combinations_with_replacement()`	p, r	r-length tuples, in sorted order, with repeated elements

Examples	Results
`product('ABCD', repeat=2)`	`AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD`
`permutations('ABCD', 2)`	`AB AC AD BA BC BD CA CB CD DA DB DC`
`combinations('ABCD', 2)`	`AB AC AD BC BD CD`
`combinations_with_replacement('ABCD', 2)`	`AA AB AC AD BB BC BD CC CD DD`

Source: official Python documentation, itertools

76~77. Selecting and Filtering

In common:

import: from itertools import xxx (except the built-in function filter)

return：returns a lazy iterator

functions	Syntax	Notes
filter	filter(predicate or None, iterable)	predicate: a function; may be None
	(item for item in iterable if pred(item))	keeps the items of iterable for which the predicate is truthy
	f(x) → x	when predicate is None, the identity function is used
	(item for item in iterable if item)
filterfalse	filterfalse(predicate or None, iterable)	same syntax as filter, but keeps the items for which the predicate is falsy
compress	compress(data, selectors)	filters data, keeping the items whose matching selector is truthy
takewhile	takewhile(pred, iterable)	while pred(item) is truthy, take items
		stops at the first item that fails the predicate, regardless of whether later items satisfy it
dropwhile	dropwhile(pred, iterable)	while pred(item) is truthy, drop items
		once an item fails the predicate, that item and everything after it are yielded

functions	Examples
filter	filter(lambda x: x < 4, [1, 10, 2, 10, 3, 10])
	▌1, 2, 3 (returns a lazy iterator, not the list [1, 2, 3])
	filter(None, [0, '', 'hello', 100, False])
	▌'hello', 100
filterfalse	filterfalse(lambda x: x < 4, [1, 10, 2, 10, 3, 10])
	▌10, 10, 10
	filterfalse(None, [0, '', 'hello', 100, False])
	▌0, '', False

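How Python does this internally (approximate equivalents, not the actual source code)

filter

# Adapted from the filterfalse equivalent; not taken from the official Python documentation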
def filter(predicate, iterable):
    # filter(lambda x: x%2, range(10)) --> 1 3 5 7 9
    if predicate is None:
        predicate = bool
    for x in iterable:
        if predicate(x):
            yield x

filterfalse

def filterfalse(predicate, iterable):
    # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
    if predicate is None:
        predicate = bool
    for x in iterable:
        if not predicate(x):
            yield x
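
As a quick check of the notes above, here is a small sketch that runs the built-in filter and itertools.filterfalse on the same inputs, once with a predicate and once with None. The helper name is_small is only an illustrative assumption.

```python
from itertools import filterfalse


def is_small(x):
    """Example predicate (hypothetical name): True for numbers below 4."""
    return x < 4


numbers = [1, 10, 2, 10, 3, 10]
mixed = [0, '', 'hello', 100, False]

# With a predicate: filter keeps the truthy results, filterfalse the falsy ones.
small = list(filter(is_small, numbers))
not_small = list(filterfalse(is_small, numbers))

# With None: each item's own truthiness is used.
truthy = list(filter(None, mixed))
falsy = list(filterfalse(None, mixed))

print(small, not_small, truthy, falsy)
```

Both calls return lazy iterators, so list() is used here only to make the results visible.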

functions	Examples
compress	data =	[ 'a',	'b',	'c',	'd',	'e' ]
	selectors =	[ True,	False,	1,	0 ]	None (no selector for 'e', so it is dropped)
compress(data, selectors) →		a		c

How Python does this internally (approximate equivalents, not the actual source code)

compress

def compress(data, selectors):
    # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
    return (d for d, s in zip(data, selectors) if s)
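
The table example, where selectors is shorter than data, can be reproduced with a minimal sketch like the one below. It relies only on itertools.compress; the variable names are illustrative.

```python
from itertools import compress

data = ['a', 'b', 'c', 'd', 'e']
selectors = [True, False, 1, 0]  # one item shorter than data

# compress pairs data with selectors and stops at the shorter input,
# so 'e' has no selector and is never yielded.
selected = list(compress(data, selectors))
print(selected)
```

Because the pairing works like zip, a missing selector behaves the same as a falsy one.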

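functions	Examples
takewhile	takewhile(lambda x: x < 5, [1, 3, 5, 4, 2])
	▌1, 3 (stops at 5, the first item that fails the predicate, even though 4 and 2 after it satisfy it)
dropwhile	dropwhile(lambda x: x < 5, [1, 3, 5, 4, 2])
	▌5, 4, 2 (once 5 fails the predicate, everything from 5 onward is yielded, even though 4 and 2 satisfy it)

How Python does this internally (approximate equivalents, not the actual source code)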

takewhile

def takewhile(predicate, iterable):
    # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
    for x in iterable:
        if predicate(x):
            yield x
        else:
            break

dropwhile

def dropwhile(predicate, iterable):
    # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
    iterable = iter(iterable)
    for x in iterable:
        if not predicate(x):
            yield x
            break
    for x in iterable:
        yield x
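
A short sketch, using the same data as the examples table, shows that takewhile and dropwhile split a sequence at the first item that fails the predicate. The predicate name below_five is an illustrative assumption.

```python
from itertools import dropwhile, takewhile

data = [1, 3, 5, 4, 2]


def below_five(x):
    """Example predicate (hypothetical name)."""
    return x < 5


# takewhile yields the leading run that satisfies the predicate...
head = list(takewhile(below_five, data))
# ...and dropwhile yields everything from the first failure onward.
tail = list(dropwhile(below_five, data))

print(head, tail)
```

With the same predicate, the two results put back together give the original list.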

Code Exercises

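The instructor's Jupyter Notebook source: python-deepdive/Part 2/Section 08 - Iteration Tools/03 - Selecting and Filtering.ipynb at main · fbaptiste/python-deepdive · GitHub

English vocabulary review for this section

predicate: 斷言、斷定 (to assert, to affirm)

identity: 身份識別 (identity; in "identity function" it means a function that returns its argument unchanged)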

78~79. Infinite Iterators

In common:

import: from itertools import xxx

return: returns a lazy iterator

A recap of the official table (same as the summary at the top, with the layout slightly changed):

Iterator	Arguments	Results	Example
`count()`	start [,step]	start, start+step, start+2*step, …	count(10)
	start=0, step=1	default values	▌10 11 12 13 14 …
`cycle()`	p	p0, p1, … plast, p0, p1, …	cycle('ABCD')
		plast means "p last", the last item of p	▌A B C D A B C D …
`repeat()`	elem [,n]	elem, elem, elem, … endlessly or	repeat(10, 3)
		up to n times	▌10 10 10

From Fred's lecture notes:

functions	Syntax	Notes
count	count(start, step)	similar to range → has start and step
		different from range → no stop
	start & step can be any numeric type	float, complex, Decimal, bool
cycle	cycle(p)	loops over a finite iterable indefinitely
		Important: if p is an iterator, cycle keeps yielding p's items even after p itself has been exhausted
repeat	repeat(data, n)	yields the same value n times
	if n is omitted, repeats forever	yields the same value indefinitely
		Important: every repeated item is the same object; all of them refer to the same address in memory

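functions	Examples
count	count(10, 2)
	▌10, 12, 14, …
	count(10.5, 0.1)
	▌10.5, 10.6, 10.7, … (nominally; float values are subject to rounding)
	takewhile(lambda x: x < 10.8, count(10.5, 0.1))
	▌10.5, 10.6, 10.7 (nominally; float rounding can let a value such as 10.799999999999999 slip under the cutoff)
cycle	cycle(['a', 'b', 'c'])
	▌'a', 'b', 'c', 'a', 'b', 'c', …
repeat	repeat('spam')
	▌'spam', 'spam', 'spam', 'spam', …
	repeat('spam', 3)
	▌'spam', 'spam', 'spam'

How Python does this internally (approximate equivalents, not the actual source code)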

count

def count(start=0, step=1):
    # count(10) --> 10 11 12 13 14 ...
    # count(2.5, 0.5) -> 2.5 3.0 3.5 ...
    n = start
    while True:
        yield n
        n += step
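
Since start and step can be any numeric type, one way to sidestep float rounding in a bounded count is to use Decimal. The following is only a sketch under that assumption, with takewhile supplying the stop condition that count itself lacks.

```python
from decimal import Decimal
from itertools import count, takewhile

# Decimal steps are exact, so the cutoff behaves as written.
steps = count(Decimal('10.5'), Decimal('0.1'))
bounded = list(takewhile(lambda x: x < Decimal('10.8'), steps))

print(bounded)
```

The infinite iterator is never materialized; takewhile stops pulling values as soon as the bound is reached.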

cycle

def cycle(iterable):
    # cycle('ABCD') --> A B C D A B C D A B C D ...
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element
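
The point about iterators can be checked with a small sketch: cycle is given an iterator, islice (from the summary table) takes a finite slice, and the original iterator is inspected afterwards. Variable names are illustrative.

```python
from itertools import cycle, islice

source = iter(['a', 'b', 'c'])  # an iterator, consumable only once
looped = cycle(source)

# Take more items than the source holds; cycle replays its saved copies.
first_seven = list(islice(looped, 7))

# The source itself is now exhausted, yet cycle could keep going.
leftover = list(source)

print(first_seven, leftover)
```

This matches the equivalent code above: items are saved on the first pass and replayed from the saved list afterwards.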

repeat

def repeat(object, times=None):
    # repeat(10, 3) --> 10 10 10
    if times is None:
        while True:
            yield object
    else:
        for i in range(times):
            yield object

A common use of repeat:

list(map(pow, range(10), repeat(2)))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
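
The reminder that repeat hands out the same object every time matters mostly for mutable values. A minimal sketch of the pitfall, with a list comprehension shown as one possible alternative when independent objects are wanted:

```python
from itertools import repeat

# Three references to ONE list object.
shared = list(repeat([], 3))
shared[0].append('x')           # the change is visible through every reference
all_same_object = all(item is shared[0] for item in shared)

# A comprehension creates a fresh list on each pass instead.
independent = [[] for _ in range(3)]
independent[0].append('x')

print(shared, independent)
```

For immutable values such as numbers or strings the sharing is harmless, which is why the map/pow example above is safe.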

Chris added a note on repeat vs. tee in his reply; many thanks!

Code Exercises

The instructor's Jupyter Notebook source: python-deepdive/Part 2/Section 08 - Iteration Tools/04 - Infinite Iterators.ipynb at main · fbaptiste/python-deepdive · GitHub

80~81. Chaining and Teeing

In common:

import: from itertools import xxx

return: returns a lazy iterator (tee returns a tuple of iterators)

A recap of the official table (same as the summary at the top, slightly modified):

Iterator	Arguments	Results	Example
`chain(*iterables)`	p, q, …	p0, p1, … plast, q0, q1, …	`chain('ABC', 'DEF') → A B C D E F`
`chain.from_iterable()`	iterable	p0, p1, … plast, q0, q1, …	`chain.from_iterable(['ABC', 'DEF']) → A B C D E F`
`tee()`	it, n	it1, it2, … itn splits one iterator into n	Note: here "it" is short for iterable

From Fred's lecture notes:

functions	Syntax	Notes
chain	chain(*args)	much like sequence concatenation, e.g. str3 = str1 + str2
	Difference 1	works with any iterables (including iterators)
	Difference 2	the chain is itself a lazy iterator
chain.from_iterable	chain.from_iterable(it)	an alternative "constructor" for chain
		"it" is short for iterable
		Use case: one iterable contains several iterables, e.g. list1 = [iter1, iter2, iter3]. chain(list1) only iterates over the outer list1, not over the inner iterN
		Option 1: unpack with chain(*list1), but unpacking is eager, not lazy.
		Option 2: use chain.from_iterable(list1), which steps through each iterN in list1 lazily.
tee	tee(iterable, n)	for processing an iterator several times, or in parallel
		loosely comparable to, but not the same as, sequence multiplication: "ha" * 3 = "hahaha"
		tee(iterable, 10) → (iter1, iter2, …, iter10)
		Reminder: the original iterable and the iterators in the returned tuple are different objects

functions	Examples
chain	print(list(chain([1, 4, 7], [2, 5, 8], [3, 6, 9])))
	▌[1, 4, 7, 2, 5, 8, 3, 6, 9]
chain.from_iterable	print(list(chain.from_iterable([[1, 4, 7], [2, 5, 8], [3, 6, 9]])))
	▌[1, 4, 7, 2, 5, 8, 3, 6, 9]
tee	for i in tee(['a', 'b', 'c', 'd'], 2):
	    print(list(i))
	▌['a', 'b', 'c', 'd']
	▌['a', 'b', 'c', 'd']

How Python does this internally (approximate equivalents, not the actual source code)

chain

def chain(*iterables):
    # chain('ABC', 'DEF') --> A B C D E F
    for it in iterables:
        for element in it:
            yield element

chain.from_iterable

def from_iterable(iterables):
    # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
    for it in iterables:
        for element in it:
            yield element
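
The eager vs. lazy difference between chain(*iterables) and chain.from_iterable is easier to see with a trace. In this sketch, groups() and the produced list are hypothetical helpers that record when the outer iterable hands out each inner list.

```python
from itertools import chain

produced = []  # records when each inner iterable is handed out


def groups():
    """Hypothetical outer iterable: yields inner lists one at a time."""
    for group in ([1, 4, 7], [2, 5, 8]):
        produced.append(group[0])
        yield group


# Unpacking runs groups() to completion before chain() is even called.
eager = chain(*groups())
produced_by_unpacking = list(produced)

produced.clear()

# from_iterable only stores the outer iterator; nothing is produced yet.
lazy = chain.from_iterable(groups())
produced_before_iterating = list(produced)

first = next(lazy)  # pulls just the first inner list
produced_after_first = list(produced)
rest = list(lazy)

print(produced_by_unpacking, produced_before_iterating, produced_after_first)
```

Both forms yield the same items in the end; the difference is only when the outer iterable gets consumed, which matters if it is large or infinite.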

tee

import collections

def tee(iterable, n=2):
    it = iter(iterable)
    deques = [collections.deque() for i in range(n)]
    def gen(mydeque):
        while True:
            if not mydeque:             # when the local deque is empty
                try:
                    newval = next(it)   # fetch a new value and
                except StopIteration:
                    return
                for d in deques:        # load it to all the deques
                    d.append(newval)
            yield mydeque.popleft()
    return tuple(gen(d) for d in deques)

When to use tee:

Iterators can only be iterated once in python.
After that they are “exhausted” and don’t return more values.

tee() takes an iterator and gives you two or more, allowing you to use the iterator passed into the function more than once.

Source: python - tee() function from itertools library - Stack Overflow
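
A minimal sketch of that use case: a generator can be consumed only once, so two aggregate passes over it need tee. The names source, a, and b are illustrative.

```python
from itertools import tee

source = (x * x for x in range(4))  # a generator: single-use

# Split it into two independent iterators over the same values.
a, b = tee(source, 2)

total = sum(a)     # first pass
largest = max(b)   # second pass, unaffected by the first

print(total, largest)
```

After calling tee, the original iterator should no longer be used directly, because advancing it would make the tee'd iterators miss those values.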

Code Exercises

The instructor's Jupyter Notebook source: python-deepdive/Part 2/Section 08 - Iteration Tools/05 - Chaining and Teeing Iterators.ipynb at main · fbaptiste/python-deepdive · GitHub

82~83. Mapping and Reducing

In common:

import: built-in functions (map, sum, min, max)

import: from functools import reduce

import: from itertools import xxx (starmap, accumulate)

return: map, starmap, and accumulate return a lazy iterator; sum, min, max, and reduce return a single value

Not in itertools: map, sum, min, max, reduce (functools).

In itertools: starmap, accumulate.

Iterator	Arguments	Results	Example
`starmap()`	func, seq	func(seq[0]), func(seq[1]), …	`starmap(pow, [(2,5), (3,2), (10,3)]) → 32 9 1000`
`accumulate()`	p [,func]	p0, p0+p1, p0+p1+p2, …	`accumulate([1,2,3,4,5]) → 1 3 6 10 15`

From Fred's lecture notes:

functions	Syntax	Notes
	Mapping	applying a callable to each element of an iterable
map	map(fn, iterable)
	Accumulation	reducing an iterable down to a single value
sum	sum(iterable)	returns the sum of the items in iterable
min	min(iterable)	returns the smallest item in iterable
max	max(iterable)	returns the largest item in iterable
reduce	reduce(fn, iterable, [initializer])	fn is a function of two arguments, applied cumulatively to the items of iterable from left to right
		after reduce, compare it with accumulate
accumulate	accumulate(iterable, fn)	1. Listed as having no initializer; in testing, an initial value does work (the keyword-only initial argument, Python 3.8+)
		2. reduce returns the final value; accumulate returns a lazy iterator
		3. The argument order differs: reduce(fn, iterable) vs. accumulate(iterable, fn)
		4. fn is optional and defaults to addition
starmap	starmap(fn, iterable)	similar to map
		1. Unpacks each sub-element of the iterable into the function's arguments
		2. Useful for mapping an iterable of iterables onto a multi-argument function (much as chain.from_iterable() relates to chain)

functions	Examples
map(fn, iterable)	map(lambda x: x**2, [1, 2, 3, 4])
	a generator expression also works: maps = (fn(item) for item in iterable)
	▌1, 4, 9, 16 (lazy iterator)
sum(iterable)	list1 = [1, 2, 3, 4]
min(iterable)	sum(list1), min(list1), max(list1)
max(iterable)	▌(10, 1, 4)
reduce(fn, iterable, [initializer])	reduce(lambda x, y: x + y, list1)
	▌10 (((1 + 2) + 3) + 4)
	reduce(lambda x, y: x + y, list1, 100)
	▌110 ((((100 + 1) + 2) + 3) + 4)
	reduce(lambda x, y: x * y, list1)
	▌24 (((1 * 2) * 3) * 4)
	reduce(operator.mul, list1)
	▌24 (((1 * 2) * 3) * 4)
	compare with the reduce calls above:
accumulate(iterable, fn)	accumulate(list1, operator.mul)
	▌1, 2, 6, 24 (returns a lazy iterator)
starmap	list2 = [ [1, 2], [3, 4] ]
	starmap(operator.mul, list2)
	a generator expression also works: (operator.mul(*item) for item in list2)
	▌2, 12 (1 * 2, 3 * 4)
	list3 = [ [1, 2, 3], [10, 20, 30], [100, 200, 300] ]
	starmap(lambda x, y, z: x + y + z, list3)
	▌6, 60, 600 (1+2+3, 10+20+30, 100+200+300)

How Python does this internally (approximate equivalents, not the actual source code)

starmap

def starmap(function, iterable):
    # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
    for args in iterable:
        yield function(*args)
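
To see the unpacking in action, this sketch runs starmap on the same nested lists as the examples table and compares it with an equivalent generator expression. Variable names are illustrative.

```python
import operator
from itertools import starmap

pairs = [[1, 2], [3, 4]]
triples = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]

# Each inner list is unpacked into the function's positional arguments.
products = list(starmap(operator.mul, pairs))
sums = list(starmap(lambda x, y, z: x + y + z, triples))

# The same mapping written as a generator expression.
products_genexp = list(operator.mul(*item) for item in pairs)

print(products, sums)
```

Plain map would pass each inner list as a single argument, which is why starmap is the better fit when the arguments arrive pre-grouped.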

accumulate

import operator

def accumulate(iterable, func=operator.add, *, initial=None):
    'Return running totals'
    # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
    # accumulate([1,2,3,4,5], initial=100) --> 100 101 103 106 110 115
    # accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120
    it = iter(iterable)
    total = initial
    if initial is None:
        try:
            total = next(it)
        except StopIteration:
            return
    yield total
    for element in it:
        total = func(total, element)
        yield total
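
The reduce vs. accumulate comparison from the lecture notes can be sketched side by side. This assumes Python 3.8 or later for the keyword-only initial argument of accumulate.

```python
import operator
from functools import reduce
from itertools import accumulate

list1 = [1, 2, 3, 4]

# reduce collapses the iterable to one final value...
final_product = reduce(operator.mul, list1)
# ...while accumulate yields every intermediate result lazily.
running_products = list(accumulate(list1, operator.mul))

# Starting values: positional initializer vs. keyword-only initial.
final_sum = reduce(operator.add, list1, 100)
running_sums = list(accumulate(list1, initial=100))

print(final_product, running_products, final_sum, running_sums)
```

In both pairs, the last item produced by accumulate equals the value returned by reduce; note also the swapped argument order between the two functions.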

Code Exercises

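The instructor's Jupyter Notebook source: python-deepdive/Part 2/Section 08 - Iteration Tools/06 - Mapping and Reducing.ipynb at main · fbaptiste/python-deepdive · GitHub

English vocabulary review for this section

accumulation: 累積 (the act of accumulating)

cumulatively: 累積 (in an accumulating manner)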

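ChrisWei · 2022-09-30 15:28

Tonight you said something seemed to be missing from the repeat part, but you couldn't recall what. My guess is that you were thinking of an important reminder the instructor gave in the course:

Every object produced by repeat is the same reference, so be careful: if the referenced value is changed along the way, all of the objects produced by repeat change together.
By contrast, tee makes real clones of a single iterable object, so it does not have repeat's problem: a change to one of the iterators produced by tee does not cause the other iterators produced by tee to change as well.

ChrisWei · 2022-09-30 15:29

Thank you very much for sharing this session~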