Python Deep Dive II, Section 2 – Sequence Types (17~24)

sky · 2022-08-02 11:25

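Note: Udemy renumbers lectures whenever a new one is inserted. The numbers below are as of 2022-08-02 and may change later; for future reference, go by the lecture titles.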

17. In-Place Concatenation and Repetition

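Background:

Before looking at in-place concatenation, let's review plain concatenation.

Concatenation uses the + operator: a = a + b

In-place concatenation uses the += operator: a += b

Here "in-place" means the same memory address: the result of the operation is stored in the original object.

Are a = a + b and a += b the same?

+ and += are two different operators.

If a and b are immutable objects, such as numbers, strings, or tuples, in-place concatenation cannot change a itself. Executing a += b then actually executes a = a + b, and the result is stored in a new object (at a different memory address).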

We cannot mutate an immutable container! What happens is that += is not actually defined for the tuple, and so Python essentially executed this code:
t1 = t1 + t2
which, as we already know, always creates a new object.

Let's start with an example (concatenation vs. in-place concatenation):
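The following is a minimal sketch of that comparison, not the course's original notebook. The names list_a, list_b, and tuple_a are made up for illustration; an alias is kept for each object so that `is` can tell whether a name still refers to the original object.

```python
# Lists are mutable: += modifies the existing object in place.
list_a = [1, 2, 3]
list_a_alias = list_a
list_a += [4, 5]
list_in_place = list_a is list_a_alias        # still the same object

# a = a + b always builds a new list and rebinds the name.
list_b = [1, 2, 3]
list_b_alias = list_b
list_b = list_b + [4, 5]
list_rebound = list_b is not list_b_alias     # a different object now

# Tuples are immutable: += falls back to tuple_a = tuple_a + (4, 5).
tuple_a = (1, 2, 3)
tuple_a_alias = tuple_a
tuple_a += (4, 5)
tuple_rebound = tuple_a is not tuple_a_alias  # a different object now

print(list_in_place, list_rebound, tuple_rebound)
```

Note that list_a_alias sees the appended items, while list_b_alias and tuple_a_alias still hold the original three values.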

table for memory address

list += any iterable

After each of the following operations, the address of list1 stays the same:

list1 = [1, 2, 3, 4]
tuple1 = 5, 6, 7
print(id(list1), list1)
print(id(tuple1), tuple1, "\n")

# list += tuple
list1 += tuple1
print(id(list1), list1, "\n")

# list += range
list1 += range(8, 11)
print(id(list1), list1, "\n")

# iterable non-sequence types
# list += set
list1 += {11, 12, 13}
print(id(list1), list1)

In-Place Repetition

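In-place repetition (*=) works on the same principle as in-place concatenation: list1 *= 2 stores the result in the memory of the original object list1.

For an immutable object such as a tuple, the result is stored in a new object at a new address.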
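A short sketch of the difference, assuming nothing beyond built-in lists and tuples; the alias names are only there so the identity check does not depend on comparing id() values of objects that may already be gone.

```python
# Mutable list: *= repeats the contents inside the same object.
list1 = [1, 2]
list1_alias = list1
list1 *= 2
list_same_object = list1 is list1_alias      # the list was mutated in place

# Immutable tuple: *= builds a new tuple and rebinds the name.
tuple1 = (1, 2)
tuple1_alias = tuple1
tuple1 *= 2
tuple_same_object = tuple1 is tuple1_alias   # the name now refers to a new tuple

print(list_same_object, list1_alias)
print(tuple_same_object, tuple1_alias)
```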

Supplement: String Concatenation

Supplement: forgotten

19. Assignments in Mutable Sequences

Several ways to set values in mutable sequences:

append
insert
extend
in-place concatenation
slicing

This lecture shows how to implement these operations with slicing: modify, insert, and delete.

Method	Syntax	Notes
indexing	[ i ]	Sets the value at [i]; shown for comparison.
slicing	[ i:j ]	slice(i, j)
extended slicing	[ i:j:k ]	slice(i, j, k)

Modify / replacement

We can replace part of a sequence with any iterable, even if the lengths differ. For example:

1. Slicing modify example

list1 = [1, 2, 3, 4, 5]
print(id(list1), list1)

list1[0:3] = ['a', 'b', 'c', 'd']
print(id(list1), list1)

2. Extended slice modify example

list1e = [1, 2, 3, 4, 5]
print(id(list1e), list1e)

list1e[::2] = ['a', 'c', 'e']  # the replacement and the replaced slice must have the same len()
print(id(list1e), list1e, "\n")

Important: with an extended slice, the replacement and the slice being replaced must have the same length; otherwise Python raises a ValueError.
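The length rule can be checked with a small sketch like the one below. The variable names plain and error_message are illustrative only; the lists mirror the examples above.

```python
# A plain slice accepts a replacement of any length.
plain = [1, 2, 3, 4, 5]
plain[0:3] = ['x']              # three items replaced by one

# An extended slice needs exactly as many items as it selects.
list1e = [1, 2, 3, 4, 5]
try:
    list1e[::2] = ['a', 'b']    # the slice selects 3 items, but only 2 are supplied
except ValueError as exc:
    error_message = str(exc)
else:
    error_message = None

print(plain)
print(error_message)
print(list1e)                   # unchanged, because the assignment failed
```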

Delete

Deleting is simply replacing part of the sequence with an empty iterable.

list3 = [1, 2, 3, 4, 5]
print(id(list3), list3)

list3[0:2] = []
print(id(list3), list3)

Insert

Inserting is the opposite: an iterable replaces an empty slice of the sequence (the slice is empty, but it still specifies an index position).

list4 = [1, 2, 3, 4, 5]
print(id(list4), list4)

print(list4[1:1])  # DEBUG: confirming it's an empty slice
list4[1:1] = 'abc'
print(id(list4), list4, "\n")

21. Custom Sequences - Part 2

Part 1 covered immutable custom sequences; in this lecture we learn about mutable custom sequences.

Implementing concatenation (+) and in-place concatenation (+=) basically means overloading these operators.

The same goes for repetition (*) and in-place repetition (*=).

Operator	Implemented by	Description (what we expect)
+	`__add__`	obj1 + obj2
		obj1 and obj2 are of the same type
		result is a new object also of the same type
+=	`__iadd__`	obj1 += obj2
		obj2 is any iterable
		result is the original obj1 memory reference (obj1 was mutated)
*	`__mul__`	obj1 * n
		n is a non-negative integer
		result is a new object of the same type as obj1
*=	`__imul__`	obj1 *= n
		n is a non-negative integer
		result is the original obj1 memory reference (obj1 was mutated)

Note: the i in __iadd__ and __imul__ stands for in-place.

add: addition

mul: multiplication

References:

overload: see the Chinese-English glossary of programming terms (中英程式譯詞對照) and search for "overload".

Implementing other syntax

Operators	Methods	Description
seq	`__getitem__`	seq[n], seq[i:j], seq[i:j:k]
	`__setitem__`	slice → assign an iterable
		extended slices → len(slice) and len(iterable) must be equal
in	`__contains__`
del	`__delitem__`
n * seq	`__rmul__`	seq * n uses `__mul__`
append
extend
pop

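Does the r in __rmul__ stand for right or reverse? The Python data model documentation describes these as methods for reflected (swapped) operands: they are called on the right-hand operand when the left-hand operand does not support the operation.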
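To see when each method is called, here is a small sketch. Repeater is a hypothetical class invented for this illustration (it is not from the course), and its calls list exists only to record which method Python invoked.

```python
class Repeater:
    """Hypothetical class used only to illustrate __mul__ vs. __rmul__."""

    def __init__(self, items):
        self.items = list(items)
        self.calls = []   # records which special method was invoked

    def __mul__(self, n):
        self.calls.append('__mul__')
        return Repeater(self.items * n)

    def __rmul__(self, n):
        self.calls.append('__rmul__')
        return Repeater(self.items * n)


r = Repeater([1, 2])
left = r * 2     # Repeater is the left operand, so r.__mul__(2) is called
right = 2 * r    # int.__mul__ returns NotImplemented, so Python tries r.__rmul__(2)

print(r.calls)
print(left.items, right.items)
```

Both methods return a new object here, so r itself is left untouched by either expression.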

22. Custom Sequences - Part 2a - Coding

In Custom Sequences - Part 1 we implemented __getitem__ and __len__. Next we practice these operators:

concatenation (+)
in-place concatenation (+=)
repetition (*)
in-place repetition (*=)
index assignment (seq[i]=val)
slice assignment (seq[i:j]=iter and seq[i:j:k]=iter)
append, extend, in, del, pop

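From simple to complex

We start with the simplest thing, printing output, and move on to the next step only after the current one has been verified.

Some operators may be explained in several steps.

The + and += Operators

The lecture begins with simple tests using print; that part is skipped here.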

class MyClass:
    def __init__(self, name):
        self.name = name
        
    def __repr__(self):
        return f'MyClass(name={self.name})'
    
    def __add__(self, other):
        return MyClass(self.name + ' ' + other.name)
        
    def __iadd__(self, other):
        self.name += ' ' + other.name
        return self
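A usage sketch for the class above, repeated here so the snippet runs on its own. The instance names and the alias variable are assumptions added for the check; the point is that + yields a new object while += hands back the same one.

```python
class MyClass:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f'MyClass(name={self.name})'

    def __add__(self, other):
        # concatenation: build and return a new object
        return MyClass(self.name + ' ' + other.name)

    def __iadd__(self, other):
        # in-place concatenation: mutate self and return it
        self.name += ' ' + other.name
        return self


c1 = MyClass('instance 1')
c2 = MyClass('instance 2')

c3 = c1 + c2          # new object; c1 and c2 are unchanged
c1_alias = c1
c1 += c2              # c1 is mutated and the name keeps pointing at it

print(c3, c3 is c1_alias)
print(c1, c1 is c1_alias)
```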

The * and *= Operators

    # continued from the code above
    def __mul__(self, n):
        return MyClass(self.name * n)
        
    def __imul__(self, n):
        self.name *= n
        return self

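The code above handles c1 * 3 and c1 *= 4 without trouble, but the reversed form 2 * c1 raises an error:

TypeError: unsupported operand type(s) for *: 'int' and 'MyClass'

The fix is simple: implement __rmul__. It should return a new object, as __mul__ does, rather than mutate self.

    # continued from the code above
    def __rmul__(self, n):
        # n * obj must not modify obj, so delegate to __mul__
        return self.__mul__(n)

The in Operator

The in operator is also easy: just implement __contains__.

    # continued from the code above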
    def __contains__(self, value):
        return value in self.name

23/24. Custom Sequences - Part 2b/2c - Coding

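The next few methods are demonstrated with a polygon made up of points.

This example has two classes: a point (Point, made of x and y coordinates) and a polygon (Polygon, made of any number of points). To demonstrate custom sequences, though, we will not use a named tuple for the implementation.

First, the lecture shows how to test for a real number: isinstance(n, numbers.Real)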
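A quick sketch of what that check accepts and rejects. The sample values are chosen for illustration; the only library pieces used are the standard numbers, fractions, and decimal modules.

```python
import numbers
from decimal import Decimal
from fractions import Fraction

samples = [10, 2.5, Fraction(1, 3), True, Decimal('2.5'), 1 + 2j, '10']

# int, float, and Fraction are real numbers; bool counts too, being a subclass of int.
# Decimal is registered only as numbers.Number, so it is not a numbers.Real.
is_real = [isinstance(value, numbers.Real) for value in samples]

for value, flag in zip(samples, is_real):
    print(repr(value), flag)
```

One consequence for the Point class below: Point(True, 1) would be accepted, while Point(Decimal('2.5'), 1) would be rejected.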

class Point

The x and y coordinates should be real numbers only
Point instances should be a sequence type so that we can unpack it as needed in the same way we were able to unpack the values of a named tuple.

import numbers

class Point:
    def __init__(self, x, y):
        if isinstance(x, numbers.Real) and isinstance(y, numbers.Real):
            self._pt = (x, y)
        else:
            raise TypeError('Point co-ordinates must be real numbers.')
            
    def __repr__(self):
        return f'Point(x={self._pt[0]}, y={self._pt[1]})'
    
    def __len__(self):
        return 2  # a point always has exactly two coordinates
    
    def __getitem__(self, s):
        return self._pt[s]
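The sketch below repeats the Point class so it can run on its own and then exercises the sequence behaviour. The variable names are illustrative. Unpacking works because Python falls back to calling __getitem__ with 0, 1, 2, ... until an IndexError is raised.

```python
import numbers


class Point:
    def __init__(self, x, y):
        if isinstance(x, numbers.Real) and isinstance(y, numbers.Real):
            self._pt = (x, y)
        else:
            raise TypeError('Point co-ordinates must be real numbers.')

    def __repr__(self):
        return f'Point(x={self._pt[0]}, y={self._pt[1]})'

    def __len__(self):
        return 2

    def __getitem__(self, s):
        return self._pt[s]   # delegate indexing and slicing to the tuple


pt = Point(10, 2.5)
x, y = pt                    # unpacking, as with a named tuple
coords = list(pt)            # iteration via __getitem__
reversed_coords = pt[::-1]   # slicing returns a plain tuple

try:
    Point('a', 1)
    rejected = False
except TypeError:
    rejected = True          # non-numeric coordinates are refused

print(x, y, coords, reversed_coords, rejected)
```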

class Polygon

Next we design the first version of class Polygon: a mutable sequence of points

# version 1
class Polygon:
    def __init__(self, *pts):
        if pts:
            self._pts = [Point(*pt) for pt in pts]
        else:
            self._pts = []
            
    def __repr__(self):
        return f'Polygon({self._pts})'

This appears to work, but the current __repr__ shows the points inside a list, square brackets included, whereas the constructor takes multiple arguments, not a single iterable.

# version 2
class Polygon:
    def __init__(self, *pts):
        if pts:
            self._pts = [Point(*pt) for pt in pts]
        else:
            self._pts = []
            
    def __repr__(self):
        pts_str = ', '.join(self._pts)  # raises TypeError:
        # sequence item 0: expected str instance, Point found
        return f'Polygon({pts_str})'

    def __repr__(self):
-       return f'Polygon({self._pts})'
+       pts_str = ', '.join(self._pts)
+       return f'Polygon({pts_str})'

This approach has a problem, though, because the join method expects an iterable of strings, and here we are passing it an iterable of Point objects. Converting each Point to a string first fixes it:

# version 3
class Polygon:
    def __init__(self, *pts):
        if pts:
            self._pts = [Point(*pt) for pt in pts]
        else:
            self._pts = []
            
    def __repr__(self):
        pts_str = ', '.join([str(pt) for pt in self._pts])  # OK this time
        return f'Polygon({pts_str})'

Next we implement the most basic parts of a custom sequence, __len__ and __getitem__.

Once that is done, both indexing and slicing work correctly.

Notice how we are simply delegating those methods to the ones supported by lists since we are storing our sequence of points internally using a list!

# version 4
    # continued from the code above
    def __len__(self):
        return len(self._pts)
    
    def __getitem__(self, s):
        return self._pts[s]
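Putting versions 3 and 4 together gives the runnable sketch below. It is assembled from the fragments above, with the constructor shortened (an empty pts already yields an empty list); the variable names in the usage part are illustrative.

```python
import numbers


class Point:
    def __init__(self, x, y):
        if isinstance(x, numbers.Real) and isinstance(y, numbers.Real):
            self._pt = (x, y)
        else:
            raise TypeError('Point co-ordinates must be real numbers.')

    def __repr__(self):
        return f'Point(x={self._pt[0]}, y={self._pt[1]})'

    def __len__(self):
        return 2

    def __getitem__(self, s):
        return self._pt[s]


class Polygon:
    def __init__(self, *pts):
        self._pts = [Point(*pt) for pt in pts]

    def __repr__(self):
        pts_str = ', '.join(str(pt) for pt in self._pts)
        return f'Polygon({pts_str})'

    def __len__(self):
        return len(self._pts)    # delegate to the internal list

    def __getitem__(self, s):
        return self._pts[s]      # an int gives a Point, a slice gives a list


p = Polygon((0, 0), (1, 1), (2, 2))
first = p[0]
head = p[0:2]
print(len(p), first, head)
print(p)
```

Because __getitem__ delegates directly to the list, a slice such as p[0:2] comes back as a plain list of Point objects rather than a Polygon.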

Implementing concatenation and in-place concatenation

When verifying, check that the memory address is unchanged after in-place concatenation.

# version 5
    # continued from the code above
    def __add__(self, other):
        if isinstance(other, Polygon):
            new_pts = self._pts + other._pts
            return Polygon(*new_pts)
        else:
            raise TypeError('can only concatenate with another Polygon')

    def __iadd__(self, pt):
        if isinstance(pt, Polygon):
            self._pts = self._pts + pt._pts
            return self
        else:
            raise TypeError('can only concatenate with another Polygon')

Both + and += now work, but the right-hand operand must be a Polygon; other iterables are not accepted. Running p1 += [(2,2), (3,3)] raises:

TypeError: can only concatenate with another Polygon

# version 6
    # continued from the code above
    def __iadd__(self, pts):
        if isinstance(pts, Polygon):
            self._pts = self._pts + pts._pts
        else:
            # assume we are being passed an iterable containing Points
            # or something compatible with Points
            points = [Point(*pt) for pt in pts]
            self._pts = self._pts + points
        return self

Next we implement append, extend, and insert:

# version 7
    # continued from the code above
    def append(self, pt):
        self._pts.append(Point(*pt))
        
    def extend(self, pts):
        if isinstance(pts, Polygon):
            self._pts = self._pts + pts._pts
        else:
            # assume we are being passed an iterable containing Points
            # or something compatible with Points
            points = [Point(*pt) for pt in pts]
            self._pts = self._pts + points
            
    def insert(self, i, pt):
        self._pts.insert(i, Point(*pt))

Notice how we used almost the same code for __iadd__ and extend ? The only difference is that __iadd__ returns the object, while extend does not - so let’s clean that up a bit:

# version 8
    # continued from the code above
    def extend(self, pts):
        if isinstance(pts, Polygon):
            self._pts = self._pts + pts._pts
        else:
            # assume we are being passed an iterable containing Points
            # or something compatible with Points
            points = [Point(*pt) for pt in pts]
            self._pts = self._pts + points
    
    def __iadd__(self, pts):
        self.extend(pts)
        return self
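To verify the version 8 behaviour, a self-contained sketch could look like this. The classes are trimmed to the methods needed for the check, and p1_alias is an illustrative name used to confirm that += keeps the same Polygon object.

```python
import numbers


class Point:
    def __init__(self, x, y):
        if isinstance(x, numbers.Real) and isinstance(y, numbers.Real):
            self._pt = (x, y)
        else:
            raise TypeError('Point co-ordinates must be real numbers.')

    def __len__(self):
        return 2

    def __getitem__(self, s):
        return self._pt[s]


class Polygon:
    def __init__(self, *pts):
        self._pts = [Point(*pt) for pt in pts]

    def __len__(self):
        return len(self._pts)

    def extend(self, pts):
        if isinstance(pts, Polygon):
            self._pts = self._pts + pts._pts
        else:
            # assume an iterable of Points or Point-compatible items
            points = [Point(*pt) for pt in pts]
            self._pts = self._pts + points

    def __iadd__(self, pts):
        self.extend(pts)
        return self              # returning self keeps the same object


p1 = Polygon((0, 0), (1, 1))
p1_alias = p1

p1 += [(2, 2), (3, 3)]                    # a list of tuples
p1 += Polygon((4, 4))                     # another Polygon
p1.extend((n, n) for n in range(5, 7))    # any iterable, here a generator

same_object = p1 is p1_alias
print(same_object, len(p1))
```

The internal list is rebound on each call, but the Polygon object itself is the same one throughout, which is what the identity check looks at.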

Next, the __setitem__ method, so that we can support index and slice assignments.

We start with slice assignments:

# version 9
    # continued from the code above
    def __setitem__(self, s, value):
        # value could be a single Point (or compatible type) for s an int
        # or it could be an iterable of Points if s is a slice
        # let's start by handling slices only first
        self._pts[s] = [Point(*pt) for pt in value]

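With this version slice assignment works, but assigning to a single index (for example p[0] = Point(10, 10)) fails, because the comprehension iterates over the coordinates of the point:

TypeError: type object argument after * must be an iterable, not int

So next we handle index assignments:

# version 10
    # continued from the code above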
    def __setitem__(self, s, value):
        # value could be a single Point (or compatible type) for s an int
        # or it could be an iterable of Points if s is a slice
        # we could do this:
        if isinstance(s, int):
            self._pts[s] = Point(*value)
        else:
            self._pts[s] = [Point(*pt) for pt in value]

This sets values correctly, but assigning a single Point to a slice raises an error:

p[0:2] = Point(10, 10)

TypeError: type object argument after * must be an iterable, not int

Also, the error raised when assigning an iterable of points to an index is misleading:

p[0] = [Point(10, 10), Point(20, 20)]

TypeError: Point co-ordinates must be real numbers.

# version 11
    # continued from the code above
    def __setitem__(self, s, value):
        # we first should see if we have a single Point
        # or an iterable of Points in value
        try:
            rhs = [Point(*pt) for pt in value]
            is_single = False
        except TypeError:
            # not a valid iterable of Points
            # maybe a single Point?
            try:
                rhs = Point(*value)
                is_single = True
            except TypeError:
                # still no go
                raise TypeError('Invalid Point or iterable of Points')
        
        # reached here, so rhs is either an iterable of Points, or a Point
        # we want to make sure we are assigning to a slice only if we 
        # have an iterable of points, and assigning to an index if we 
        # have a single Point only
        if (isinstance(s, int) and is_single) \
            or isinstance(s, slice) and not is_single:
            self._pts[s] = rhs
        else:
            raise TypeError('Incompatible index/slice assignment')
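The four combinations that version 11 distinguishes can be exercised with the sketch below. The classes are trimmed copies of the ones above, and try_assign is a hypothetical helper added only to capture the outcome of each assignment.

```python
import numbers


class Point:
    def __init__(self, x, y):
        if isinstance(x, numbers.Real) and isinstance(y, numbers.Real):
            self._pt = (x, y)
        else:
            raise TypeError('Point co-ordinates must be real numbers.')

    def __len__(self):
        return 2

    def __getitem__(self, s):
        return self._pt[s]


class Polygon:
    def __init__(self, *pts):
        self._pts = [Point(*pt) for pt in pts]

    def __getitem__(self, s):
        return self._pts[s]

    def __setitem__(self, s, value):
        try:
            rhs = [Point(*pt) for pt in value]   # an iterable of Points?
            is_single = False
        except TypeError:
            try:
                rhs = Point(*value)              # a single Point?
                is_single = True
            except TypeError:
                raise TypeError('Invalid Point or iterable of Points')
        if (isinstance(s, int) and is_single) \
                or (isinstance(s, slice) and not is_single):
            self._pts[s] = rhs
        else:
            raise TypeError('Incompatible index/slice assignment')


def try_assign(polygon, key, value):
    """Hypothetical helper: return 'ok' or the TypeError message."""
    try:
        polygon[key] = value
        return 'ok'
    except TypeError as exc:
        return str(exc)


p = Polygon((0, 0), (1, 1), (2, 2))
results = {
    'index <- Point': try_assign(p, 0, Point(10, 10)),
    'slice <- iterable': try_assign(p, slice(0, 2), [(5, 5), (6, 6)]),
    'slice <- Point': try_assign(p, slice(0, 2), Point(9, 9)),
    'index <- iterable': try_assign(p, 0, [Point(1, 1), Point(2, 2)]),
}
for case, outcome in results.items():
    print(case, '->', outcome)
```

The two mismatched cases now fail with the same explicit message instead of the misleading errors shown earlier.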

Finally, the del keyword and the pop method.

# version 12: `del` keyword
    # continued from the code above
    def __delitem__(self, s):
        del self._pts[s]

# version 13: `pop` method
    # continued from the code above
    def pop(self, i):
        return self._pts.pop(i)

Extra: how are these English words pronounced?

concatenation