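Note: as before, to keep things short, this post groups several brief output tests together.
As the instructor said at the outset:
An Annotated type hint simply attaches metadata to an existing type. Python itself does nothing with that metadata, so the underlying type behaves exactly as before.
Third-party libraries, however, can read the metadata and use it for their own purposes.
(It is an elegant design. My reaction on seeing it was: wow, so you can do that!)
Annotated type hints are not specific to Pydantic; third-party tools such as mypy and IDEs all benefit from the design. Below are some reference notes on PEP 3107, which introduced annotations. (typing.Annotated itself came later, with PEP 593 in Python 3.9.)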
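▌1. About Annotations (Type Hints)
Annotations are a general Python feature and are not limited to Pydantic.
Annotations (type hints) were introduced in Python 3.0 (released December 3, 2008); PEP 3107 defines the syntax.
Use cases described in PEP 3107
These can be read as an answer to "What problem are annotations trying to solve?"
The bracketed numbers are the reference numbers from the original document; they have not been renumbered.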
- Providing typing information
- Other information
  - Documentation for parameters and return values ([23])
A few example scenarios
- Annotations on function (or method) parameters, used as help messages.
# Annotation for arguments of functions (or methods) as help message
def compile(source: "something compilable",
            filename: "where the compilable thing comes from",
            mode: "is this a single statement or a suite?"):
    ...
- Annotations on function (or method) parameters, used as type hints.
# Annotation for arguments of functions (or methods) as Type Hint
def haul(item: Haulable, *vargs: PackAnimal) -> Distance:
    ...
- Annotations on nested parameters
# Annotation for nested parameters (as written in PEP 3107; tuple parameters
# were removed by PEP 3113, so this form is not valid Python 3 syntax)
def foo((x1, y1: expression),
        (x2: expression, y2: expression)=(None, None)):
    ...
- Annotations on return values
# Annotation for return value
def sum() -> expression:
    ...
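A reminder: Python itself attaches no particular meaning or significance to annotations.
Syntax
Nothing special here; see the original specification (PEP 3107).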
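A minimal sketch of that point, using only the standard library. The function name `repeat` and its deliberately mismatched annotations are hypothetical, chosen to show that annotations are stored on the function but never checked when it is called.

```python
def repeat(text: int, times: "how many copies") -> list:
    # The annotations above are deliberately "wrong": Python stores them
    # but does not enforce them at call time.
    return text * times


# Tools can read the stored annotations...
print(repeat.__annotations__)

# ...yet a str is accepted where `int` is annotated, and a str is returned
# where `list` is annotated.
result = repeat("ab", 2)
print(result)  # abab
```

Anything that gives annotations meaning (type checking, validation, documentation) comes from a tool that reads `__annotations__`, not from the interpreter.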
▌2. Pydantic and Annotated Types
Use Annotated to attach metadata x to a type T: Annotated[T, x]. For example:
from typing import Annotated
MyType = Annotated[int, "metadata"]  # general form: Annotated[T, x]
2-1. Inspecting Annotated metadata: get_args
from typing import Annotated
from typing import get_args
SpecialInt = Annotated[int, "metadata 1", [1, 2, 3], 100]
get_args(SpecialInt)
(int, 'metadata 1', [1, 2, 3], 100)
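Libraries usually meet Annotated types on function parameters or class attributes rather than as bare aliases. The sketch below shows one way the metadata can be read back in that situation, using `typing.get_type_hints` with `include_extras=True`. The alias `Meters`, the function `climb`, and the string metadata are hypothetical.

```python
from typing import Annotated, get_args, get_origin, get_type_hints

# Hypothetical metadata: a plain string describing a unit.
Meters = Annotated[float, "unit: m"]


def climb(height: Meters) -> None:
    ...


# By default get_type_hints() strips the Annotated wrapper.
plain = get_type_hints(climb)
# include_extras=True keeps it, so the metadata can be read back.
full = get_type_hints(climb, include_extras=True)

base_type, *metadata = get_args(full["height"])
print(plain["height"])       # <class 'float'>
print(base_type, metadata)   # <class 'float'> ['unit: m']
```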
2-2. Simplifying repeated field definitions
Pydantic makes extensive use of Annotated types; they are especially useful for creating reusable types that keep our code DRY (Don't Repeat Yourself).
from typing import Annotated
from pydantic import BaseModel, Field, ValidationError
+BoundedInt = Annotated[int, Field(gt=0, le=100)]
class Model(BaseModel):
-    x: int = Field(gt=0, le=100)
-    y: int = Field(gt=0, le=100)
-    z: int = Field(gt=0, le=100)
+    x: BoundedInt
+    y: BoundedInt
+    z: BoundedInt
# Confirm that both spellings produce the same fields
Model.model_fields
Both spellings yield the same result, and BoundedInt can be reused outside this class as well.
{'x': FieldInfo(annotation=int, required=True, metadata=[Gt(gt=0), Le(le=100)]),
'y': FieldInfo(annotation=int, required=True, metadata=[Gt(gt=0), Le(le=100)]),
'z': FieldInfo(annotation=int, required=True, metadata=[Gt(gt=0), Le(le=100)])}
2-3. Validation still works
Model(x=10, y=20, z=30)
try:
    Model(x=0, y=10, z=103)
except ValidationError as ex:
    print(ex)
Model(x=10, y=20, z=30)
2 validation errors for Model
x
Input should be greater than 0 [type=greater_than, input_value=0, input_type=int]
For further information visit https://errors.pydantic.dev/2.7/v/greater_than
z
Input should be less than or equal to 100 [type=less_than_equal, input_value=103, input_type=int]
For further information visit https://errors.pydantic.dev/2.7/v/less_than_equal
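Because the constraints live on the type itself, the same alias can also be validated without a model. The following is a minimal sketch assuming Pydantic v2, where `TypeAdapter` validates a bare type; the variable names are illustrative only.

```python
from typing import Annotated

from pydantic import Field, TypeAdapter, ValidationError

BoundedInt = Annotated[int, Field(gt=0, le=100)]

# TypeAdapter validates a bare type; no BaseModel is required.
adapter = TypeAdapter(BoundedInt)

ok = adapter.validate_python(50)

try:
    adapter.validate_python(103)
    rejected = False
    error_type = None
except ValidationError as ex:
    rejected = True
    # Same error type as in the Model output above.
    error_type = ex.errors()[0]["type"]

print(ok, rejected, error_type)
```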
2-4. Annotated can also be used inline in a Model
from typing import Annotated
from pydantic import BaseModel, Field, ValidationError
class Model(BaseModel):
    field_1: Annotated[int, Field(gt=0)] = 1
    field_2: Annotated[str, Field(min_length=1, max_length=10)] | None = None
Model()
Model(field_1=10)
Model(field_2="Python")
try:
    Model(field_1=-10, field_2="Python" * 3)
except ValidationError as ex:
    print(ex)
Model(field_1=1, field_2=None)
Model(field_1=10, field_2=None)
Model(field_1=1, field_2='Python')
2 validation errors for Model
field_1
Input should be greater than 0 [type=greater_than, input_value=-10, input_type=int]
For further information visit https://errors.pydantic.dev/2.7/v/greater_than
field_2
String should have at most 10 characters [type=string_too_long, input_value='PythonPythonPython', input_type=str]
For further information visit https://errors.pydantic.dev/2.7/v/string_too_long
▌3. Annotated Types and Type Variables
3-1. TypeVar: type variables / generics
from typing import SupportsAbs, TypeVar
T = TypeVar('T')  # Can be anything
S = TypeVar('S', bound=str)  # Can be any subtype of str
A = TypeVar('A', str, bytes)  # Must be exactly str or bytes
U = TypeVar('U', bound=str|bytes)  # Can be any subtype of the union str|bytes
V = TypeVar('V', bound=SupportsAbs)  # Can be anything with an __abs__ method
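To make the declarations above concrete, here is a small sketch of how such type variables are typically used in function signatures. The functions `first` and `concat` are hypothetical; the relationships they express are checked by a static type checker such as mypy, not by Python at runtime.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")              # unconstrained: can be anything
A = TypeVar("A", str, bytes)  # constrained: exactly str or bytes


def first(items: Sequence[T]) -> T:
    """Return the first element; the result type follows the element type."""
    return items[0]


def concat(left: A, right: A) -> A:
    """For a type checker, both arguments must be the same one of str / bytes."""
    return left + right


print(first([1, 2, 3]))    # 1
print(concat("ab", "cd"))  # abcd
```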
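In 2-2 we used Annotated types to simplify a reusable custom type.
What if we want not only to simplify the custom type, but also to extend it to more element types?
For example, the type below is a list of int; how do we extend it to float and str?
from pydantic import BaseModel, Field, ValidationError
from typing import Annotated
-BoundedListInt = Annotated[list[int], Field(max_length=10)]
class Model(BaseModel):
    field_1: BoundedListInt = []
    field_2: BoundedListInt = []
-BoundedListFloat = Annotated[list[float], Field(max_length=10)]
-BoundedListString = Annotated[list[str], Field(max_length=10)]
The instructor first deliberately used an approach that is not a good fit: Any
from typing import Any
-BoundedList = Annotated[list[Any], Field(max_length=10)]
The problem: Any accepts every type, whereas what we actually want is a list whose elements all share one declared type (a list of integers, a list of strings, and so on).
In this case we can use a type variable (which also works with custom types): TypeVar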
from typing import TypeVar
T = TypeVar('T')
BoundedList = Annotated[list[T], Field(max_length=10)]
BoundedList[int]
BoundedList[str]
typing.Annotated[list[int], FieldInfo(annotation=NoneType, required=True, metadata=[MaxLen(max_length=10)])]
typing.Annotated[list[str], FieldInfo(annotation=NoneType, required=True, metadata=[MaxLen(max_length=10)])]
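The substitution shown above is a feature of `typing` itself, not of Pydantic. As a sketch, the same alias can be built with only the standard library; here a plain string stands in for `Field(max_length=10)`, which is an assumption made so the example runs without Pydantic installed.

```python
from typing import Annotated, TypeVar, get_args

T = TypeVar("T")

# A plain string stands in for Field(max_length=10) in this sketch.
BoundedList = Annotated[list[T], "max_length=10"]

# Subscripting the alias replaces T and keeps the metadata untouched.
IntList = BoundedList[int]
StrList = BoundedList[str]

print(get_args(IntList))  # (list[int], 'max_length=10')
print(get_args(StrList))  # (list[str], 'max_length=10')
```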
As in 2-4, we can also use it in a Model.
class Model(BaseModel):
    integers: BoundedList[int] = []
    strings: BoundedList[str] = []
Model()
Model(integers=[1.0, 2.0], strings=["abc", "def"])
Model(integers=[], strings=[])
Model(integers=[1, 2], strings=['abc', 'def'])
Unlike Any, a list declared with int elements rejects a float that has a fractional part (whole-valued floats such as 1.0 are coerced to int, as shown above).
This is exactly the result we want.
try:
    Model(integers=[0.5])
except ValidationError as ex:
    print(ex)
1 validation error for Model
integers.0
Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=0.5, input_type=float]
For further information visit https://errors.pydantic.dev/2.7/v/int_from_float
▌4. String Constraints
4-1. Setting constraints in Field
Earlier sections covered string constraints: we can set limits on a string in Field, for example:
from pydantic import BaseModel, Field, ValidationError
class Model(BaseModel):
    name: str = Field(min_length=2, max_length=5)
4-2. StringConstraints
If we want to do more (strip whitespace, convert to upper or lower case), the approach above is not enough. Use StringConstraints instead.
Name | Type | Description |
---|---|---|
strip_whitespace | `bool \| None` | Whether to strip whitespace from the string. |
to_upper | `bool \| None` | Whether to convert the string to uppercase. |
to_lower | `bool \| None` | Whether to convert the string to lowercase. |
strict | `bool \| None` | Whether to validate the string in strict mode. |
min_length | `int \| None` | The minimum length of the string. |
max_length | `int \| None` | The maximum length of the string. |
pattern | `str \| Pattern[str] \| None` | A regex pattern that the string must match. |
from typing import Annotated
from pydantic import BaseModel, StringConstraints, ValidationError
StandardString = Annotated[
    str,
    StringConstraints(to_lower=True, min_length=2, strip_whitespace=True)
]
class Model(BaseModel):
    code: StandardString | None = None
Model()
Model(code="ABC ")
try:
    Model(code=" a ")
except ValidationError as ex:
    print(ex)
Model(code=None)
Model(code='abc')
1 validation error for Model
code
String should have at least 2 characters [type=string_too_short, input_value=' a ', input_type=str]
For further information visit https://errors.pydantic.dev/2.5/v/string_too_short
▌5. Project
5-1. BoundedString & BoundedList
Create an annotated type named BoundedString that defines a string of at least 2 and at most 50 characters.
Create an annotated type named BoundedList that uses a type variable to define a list of elements, with at least 1 and at most 5 elements.
from typing import Annotated, TypeVar
from pydantic import BaseModel, Field, ValidationError
BoundedString = Annotated[str, Field(min_length=2, max_length=50)]
T = TypeVar('T')
BoundedList = Annotated[list[T], Field(min_length=1, max_length=5)]
Before putting these into the original project, run a few tests to confirm they meet the specification above.
class Test(BaseModel):
    field1: BoundedString
# Test 1: a valid value
Test(field1="abc")
# Test 2: string shorter than the lower bound
try:
    Test(field1="a")
except ValidationError as ex:
    print(ex)
# Test 3: string longer than the upper bound
try:
    Test(field1="a" * 51)
except ValidationError as ex:
    print(ex)
# Result 1
Test(field1='abc')
# Result 2
1 validation error for Test
field1
String should have at least 2 characters [type=string_too_short, input_value='a', input_type=str]
For further information visit https://errors.pydantic.dev/2.7/v/string_too_short
# Result 3
1 validation error for Test
field1
String should have at most 50 characters [type=string_too_long, input_value='aaaaaaaaaaaaaaaaaaaaaaaa...aaaaaaaaaaaaaaaaaaaaaaa', input_type=str]
For further information visit https://errors.pydantic.dev/2.7/v/string_too_long
class Test(BaseModel):
    my_list: BoundedList[int]
# Test 1: a valid value
Test(my_list=[1, 2, 3])
# Test 2: fewer elements than the lower bound
try:
    Test(my_list=[])
except ValidationError as ex:
    print(ex)
# Test 3: more elements than the upper bound
try:
    Test(my_list=[1, 2, 3, 4, 5, 6])
except ValidationError as ex:
    print(ex)
# Result 1
Test(my_list=[1, 2, 3])
# Result 2
1 validation error for Test
my_list
List should have at least 1 item after validation, not 0 [type=too_short, input_value=[], input_type=list]
For further information visit https://errors.pydantic.dev/2.7/v/too_short
# Result 3
1 validation error for Test
my_list
List should have at most 5 items after validation, not 6 [type=too_long, input_value=[1, 2, 3, 4, 5, 6], input_type=list]
For further information visit https://errors.pydantic.dev/2.7/v/too_long
Once those tests pass, we use BoundedString as the element type of BoundedList.
class Test(BaseModel):
    my_list: BoundedList[BoundedString]
# Test 1: a valid value
Test(my_list=['aa', 'bb', 'cc'])
# Test 2: BoundedList has fewer elements than the lower bound
try:
    Test(my_list=[])
except ValidationError as ex:
    print(ex)
# Test 3: a BoundedString is too short (the BoundedList length is valid)
try:
    Test(my_list=['a', 'bb', 'cc'])
except ValidationError as ex:
    print(ex)
# Test 4: a BoundedString is too long (the BoundedList length is valid)
try:
    Test(my_list=['a' * 51, 'bb', 'cc'])
except ValidationError as ex:
    print(ex)
# Result 1
Test(my_list=['aa', 'bb', 'cc'])
# Result 2
1 validation error for Test
my_list
List should have at least 1 item after validation, not 0 [type=too_short, input_value=[], input_type=list]
For further information visit https://errors.pydantic.dev/2.7/v/too_short
# Result 3
1 validation error for Test
my_list.0
String should have at least 2 characters [type=string_too_short, input_value='a', input_type=str]
For further information visit https://errors.pydantic.dev/2.7/v/string_too_short
# Result 4
1 validation error for Test
my_list.0
String should have at most 50 characters [type=string_too_long, input_value='aaaaaaaaaaaaaaaaaaaaaaaa...aaaaaaaaaaaaaaaaaaaaaaa', input_type=str]
For further information visit https://errors.pydantic.dev/2.7/v/string_too_long
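Once the tests pass, put the types into last week's project.
Goal 1
Change the types of the following fields in the Automobile model to annotated types (highlighted green in the solution; shown below with a leading +):
- manufacturer
- series_name
- vin
- registration_country
- license_plate
Goal 2
The part highlighted red in the code (shown below with a leading -):
- The field name is top_features.
- It is placed before the vin field (the serialization/deserialization order must stay fixed).
- It is both deserialized from and serialized to topFeatures (the key used in the test data).
- It is a BoundedList whose elements are BoundedString, with a string length of at least 2 and at most 50 (exactly what we implemented and tested above).
- The field is optional, with a default value of None.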
from datetime import date
from enum import Enum
from uuid import uuid4
from pydantic import BaseModel, ConfigDict, Field, field_serializer
from pydantic.alias_generators import to_camel
from pydantic import UUID4
# AutomobileType is not shown here; it comes from last week's project.
# BoundedString and BoundedList are the annotated types defined above.
class Automobile(BaseModel):
    model_config = ConfigDict(
        extra="forbid",
        str_strip_whitespace=True,
        validate_default=True,
        validate_assignment=True,
        alias_generator=to_camel,
    )
    id_: UUID4 | None = Field(alias="id", default_factory=uuid4)
+    manufacturer: BoundedString
+    series_name: BoundedString
    type_: AutomobileType = Field(alias="type")
    is_electric: bool = False
    manufactured_date: date = Field(validation_alias="completionDate", ge=date(1980, 1, 1))
    base_msrp_usd: float = Field(
        validation_alias="msrpUSD",
        serialization_alias="baseMSRPUSD"
    )
-    top_features: BoundedList[BoundedString] | None = None
+    vin: BoundedString
    number_of_doors: int = Field(
        default=4,
        validation_alias="doors",
        ge=2,
        le=4,
        multiple_of=2,
    )
+    registration_country: BoundedString | None = None
+    license_plate: BoundedString | None = None
    @field_serializer("manufactured_date", when_used="json-unless-none")
    def serialize_date(self, value: date) -> str:
        return value.strftime("%Y/%m/%d")
Then run it against the test data provided by the instructor (omitted here); if the output matches, the exercise is complete.