diff --git a/_posts/zmediumtomarkdown/2024-08-13-755509180ca8.md b/_posts/zmediumtomarkdown/2024-08-13-755509180ca8.md new file mode 100644 index 000000000..7e7317e61 --- /dev/null +++ b/_posts/zmediumtomarkdown/2024-08-13-755509180ca8.md @@ -0,0 +1,996 @@ +--- +title: "iOS Vision framework x WWDC 24 Discover Swift enhancements in the Vision framework Session" +author: "ZhgChgLi" +date: 2024-08-13T08:10:37.015+0000 +last_modified_at: 2024-08-14T12:07:49.774+0000 +categories: "KKday Tech Blog" +tags: ["ios-app-development","vision-framework","apple-intelligence","ai","machine-learning"] +description: "Vision framework 功能回顧 & iOS 18 新 Swift API 試玩" +image: + path: /assets/755509180ca8/1*NqN-_MAE4tt11n6MnUQWxQ.jpeg +render_with_liquid: false +--- + +### iOS Vision framework x WWDC 24 Discover Swift enhancements in the Vision framework Session + +Vision framework 功能回顧 & iOS 18 新 Swift API 試玩 + + +![Photo by [BoliviaInteligente](https://unsplash.com/@boliviainteligente?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash){:target="_blank"}](/assets/755509180ca8/1*NqN-_MAE4tt11n6MnUQWxQ.jpeg) + +Photo by [BoliviaInteligente](https://unsplash.com/@boliviainteligente?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash){:target="_blank"} +#### 主題 + + +![跟 Vision Pro 的關係就跟熱狗跟狗的關係一樣,毫無關係。](/assets/755509180ca8/1*ebqm2jzCK1GSrDDY0XtrUA.png) + +跟 Vision Pro 的關係就跟熱狗跟狗的關係一樣,毫無關係。 +### Vision framework + +Vision framework 是 Apple 整合機器學習的圖像辨識框架,讓開發者可以簡單快速地實現常見的圖像辨識功能;Vision framework 早在 iOS 11\.0\+ \(2017/ iPhone 8\) 就已推出,期間不斷地迭代優化,並完善與 Swift Concurrency 的特性整合提升執行效能,並且從 iOS 18\.0 提供全新的 Swift Vision framework API 發揮 Swift Concurrency 最大效果。 + +**Vision framework 特色** +- 內建眾多圖片辨識、動態追蹤方法 \(iOS 18 為止一共 31 種\) +- On\-Device 單純使用手機晶片運算,辨識過程不依賴雲端服務,快速又安全 +- API 簡單好操作 +- Apple 全平台均支援 iOS 11\.0\+, iPadOS 11\.0\+, Mac Catalyst 13\.0\+, macOS 10\.13\+, tvOS 11\.0\+, visionOS 1\.0\+ +- 已發布多年 \(2017~今\) 且不斷更新 +- 整合 Swift 語言特性提升運算效能 + + + +> **_6 年前曾經小玩過: [Vision 初探 — APP 頭像上傳 自動識別人臉裁圖 \(Swift\)](../9a9aa892f9a9/)_** + + + + + +> _這次搭配 [WWDC 24 Discover Swift enhancements in the Vision framework Session](https://developer.apple.com/videos/play/wwdc2024/10163/){:target="_blank"} 重新回顧並結合新的 Swfit 特性再玩一次。_ + + + + +#### CoreML + +Apple 還有另外一個 Framework 叫 [CoreML](https://developer.apple.com/documentation/coreml){:target="_blank"} ,也是基於 On\-Device 晶片的機器學習框架;但他可以讓你自己訓練想辨識的物件、文件模型,並將模型放到 App 中直接使用,有興趣的朋友也可以玩看看。\(e\.g\. [即時文章分類](../793bf2cdda0f/) 、即時 [垃圾訊息檢測](https://apps.apple.com/tw/app/%E7%86%8A%E7%8C%AB%E5%90%83%E7%9F%AD%E4%BF%A1-%E5%9E%83%E5%9C%BE%E7%9F%AD%E4%BF%A1%E8%BF%87%E6%BB%A4/id1319191852){:target="_blank"} …\) +#### p\.s\. + +[**Vision**](https://developer.apple.com/documentation/vision/){:target="_blank"} **v\.s\. [VisionKit](https://developer.apple.com/documentation/visionkit){:target="_blank"} :** + + +> [**_Vision_**](https://developer.apple.com/documentation/vision/){:target="_blank"} _:主要用於圖像分析任務,如臉部識別、條碼檢測、文本識別等。它提供了強大的 API 來處理和分析靜態圖像或視頻中的視覺內容。_ + + + + + +> [**_VisionKit_**](https://developer.apple.com/documentation/visionkit){:target="_blank"} _:專門用於處理與文件掃描相關的任務。它提供了一個掃描儀視圖控制器,可以用來掃描文檔,並生成高質量的 PDF 或圖像。_ + + + + + +Vision framework 在 M1 機型上無法跑在模擬器,只能接實體手機測試;在模擬器環境執行會拋出 `Could not create Espresso context` Error,查 [官方論壇討論,沒找到解答](https://forums.developer.apple.com/forums/thread/675806){:target="_blank"} 。 + + +> _因手邊沒有實體 iOS 18 裝置進行測試,所以本文中的所有執行結果都是使用舊的 \(iOS 18 以前\) 的寫法結果; **如新寫法有出現錯誤再麻煩留言指教** 。_ + + + + +### WWDC 2024 — Discover Swift enhancements in the Vision framework + + +![[Discover Swift enhancements in the Vision framework](https://developer.apple.com/videos/play/wwdc2024/10163/?time=45){:target="_blank"}](/assets/755509180ca8/1*8N5GtY1uqxP-4iAAAticOA.png) + +[Discover Swift enhancements in the Vision framework](https://developer.apple.com/videos/play/wwdc2024/10163/?time=45){:target="_blank"} + + +> _本文是針對 WWDC 24 — [Discover Swift enhancements in the Vision framework](https://developer.apple.com/videos/play/wwdc2024/10163/?time=45){:target="_blank"} Session 的分享筆記,跟一些自己實驗的心得。_ + + + + +### Introduction — Vision framework Features +#### 人臉辨識、輪廓識別 + + +![](/assets/755509180ca8/1*RNGfE_EeaQhiKAPdJeFYQw.png) + + + +![](/assets/755509180ca8/1*iMdzeLm2aWjATVV6_Kvrjg.png) + +#### 圖像內容文字辨識 + +截至 iOS 18 為止,支援 18 種語言。 + + +![](/assets/755509180ca8/1*kU_OYn5w368h-ahDYU4lDw.png) + +```swift +// 支援的語系列表 +if #available(iOS 18.0, *) { + print(RecognizeTextRequest().supportedRecognitionLanguages.map { "\($0.languageCode!)-\(($0.region?.identifier ?? $0.script?.identifier)!)" }) +} else { + print(try! VNRecognizeTextRequest().supportedRecognitionLanguages()) +} + +// 實際可用辨識語言以這為主。 +// 實測 iOS 18 輸出以下結果: +// ["en-US", "fr-FR", "it-IT", "de-DE", "es-ES", "pt-BR", "zh-Hans", "zh-Hant", "yue-Hans", "yue-Hant", "ko-KR", "ja-JP", "ru-RU", "uk-UA", "th-TH", "vi-VT", "ar-SA", "ars-SA"] +// 未看到 WWDC 提到的 Swedish 語言,不確定是還沒推出還是跟裝置地區、語系有關聯 +``` +#### 動態動作捕捉 + + +![](/assets/755509180ca8/1*6TfyCcszdD1NdId0bdM16Q.gif) + + + +![](/assets/755509180ca8/1*8y_XXdH36uKpfP0p6BCJQA.gif) + +- 可以實現人、物件動態捕捉 +- 手勢補捉實現隔空簽名功能 + +#### What’s new in Vision? \(iOS 18\)— 圖片評分功能 \(品質、記憶點\) +- 可對輸入圖片得計算出分數,方便篩選出優質照片 +- 分數計算方式包含多個維度,不只是畫質,還有光線、角度、拍攝主體、 **是否有讓人感到的記憶點** …等等 + + + +![](/assets/755509180ca8/1*XwjeaHcB6arxJhIR7cFsWg.png) + + + +![](/assets/755509180ca8/1*YdhZlZBlTaIZd4nLxhBtaQ.png) + + + +![](/assets/755509180ca8/1*IhMDFdk6DWwTv1qIG0Gi0Q.png) + + +WWDC 中給了以上三張圖片做說明\(相同畫質之下\),分別是: +- 高分的圖片:取景、光線、有記憶點 +- 低分的圖片:沒有主體、像是隨手或不小心拍的 +- 素材的圖片:技術上拍的很好但是沒有記憶點,像是作為素材圖庫用的圖片 + + +**iOS ≥ 18 New API: [CalculateImageAestheticsScoresRequest](https://developer.apple.com/documentation/vision/calculateimageaestheticsscoresrequest){:target="_blank"}** +```swift +let request = CalculateImageAestheticsScoresRequest() +let result = try await request.perform(on: URL(string: "https://zhgchg.li/assets/cb65fd5ab770/1*yL3vI1ADzwlovctW5WQgJw.jpeg")!) + +// 照片分數 +print(result.overallScore) + +// 是否被判定為素材圖片 +print(result.isUtility) +``` +#### What’s new in Vision? \(iOS 18\) — 身體+手勢姿勢同時偵測 + + +![](/assets/755509180ca8/1*A9320aRV-jdccgiXrmSrJw.png) + + +以往只能個別偵測人體 Pose 和 手部 Pose,這次更新可以讓開發者同時偵測身體 Pose \+ 手部 Pose,合成同一個請求跟結果,方便我們做更多應用功能開發。 + +**iOS ≥ 18 New API: [DetectHumanBodyPoseRequest](https://developer.apple.com/documentation/vision/detecthumanbodyposerequest){:target="_blank"}** +```swift +var request = DetectHumanBodyPoseRequest() +// 一併偵測手部 Pose +request.detectsHands = true + +guard let bodyPose = try await request.perform(on: image). first else { return } + +// 身體 Pose Joints +let bodyJoints = bodyPose.allJoints() +// 左手 Pose Joints +let leftHandJoints = bodyPose.leftHand.allJoints() +// 右手 Pose Joints +let rightHandJoints = bodyPose.rightHand.allJoints() +``` +### New Vision API + +Apple 在這次的更新當中提供了新的 Swift Vision API 封裝給開發者使用,除了基本的包含原本的功能支援之外,主要針對加強 Swift 6 / Swift Concurrency 的特性,提供效能更優、寫起來更 Swift 的 API 操作方式。 +### Get started with Vision + + +![](/assets/755509180ca8/1*mv9g5jmqrS6YScxoGYJemQ.png) + + + +![](/assets/755509180ca8/1*iidNN7nKHoskh_tcjfuHKQ.png) + + +這邊講者又重新介紹了一次 Vision framework 的基礎使用方式,Apple 已經封裝好了 [31 種](https://developer.apple.com/documentation/vision/visionrequest){:target="_blank"} \(截至 iOS 18\)常見的圖像辨識請求「Request」與對應回傳的「Observation」物件。 +1. **Request:** DetectFaceRectanglesRequest 人臉區域識別請求 +**Result:** FaceObservation +之前的文章「 [Vision 初探 — APP 頭像上傳 自動識別人臉裁圖 \(Swift\)](../9a9aa892f9a9/) 」就是用這對請求。 +2. **Request:** RecognizeTextRequest 文字辨識請求 +**Result:** RecognizedTextObservation +3. **Request:** GenerateObjectnessBasedSaliencyImageRequest 主體物件辨識請求 +**Result:** SaliencyImageObservation + +### 全部 31 種請求 Request: + +[VisionRequest](https://developer.apple.com/documentation/vision/visionrequest){:target="_blank"} 。 +[https://gist.github.com/zhgchgli0718/e169c248763964841468cc9efc4c5f0f](https://gist.github.com/zhgchgli0718/e169c248763964841468cc9efc4c5f0f){:target="_blank"} +- 前面補上 VN 就是舊的 API 寫法 \(iOS 18 以前的版本\) + + +講者提到了幾個常用的 Request,如下。 +#### ClassifyImageRequest + +辨識輸入的圖片,得到標籤分類與置信度。 + + +![](/assets/755509180ca8/1*8NSQEjxGejujKLbXcILmxQ.jpeg) + + + +![\[遊記\] 2024 二訪九州 9 日自由行,經釜山→博多郵輪入境](/assets/755509180ca8/1*f1rNoOIQbE33M9F9NmoTXg.png) + +\[遊記\] 2024 二訪九州 9 日自由行,經釜山→博多郵輪入境 +```swift +if #available(iOS 18.0, *) { + // 新的使用 Swift 特性的 API + let request = ClassifyImageRequest() + Task { + do { + let observations = try await request.perform(on: URL(string: "https://zhgchg.li/assets/cb65fd5ab770/1*yL3vI1ADzwlovctW5WQgJw.jpeg")!) + observations.forEach { + observation in + print("\(observation.identifier): \(observation.confidence)") + } + } + catch { + print("Request failed: \(error)") + } + } +} else { + // 舊的寫法 + let completionHandler: VNRequestCompletionHandler = { + request, error in + guard error == nil else { + print("Request failed: \(String(describing: error))") + return + } + guard let observations = request.results as? [VNClassificationObservation] else { + return + } + observations.forEach { + observation in + print("\(observation.identifier): \(observation.confidence)") + } + } + + let request = VNClassifyImageRequest(completionHandler: completionHandler) + DispatchQueue.global().async { + let handler = VNImageRequestHandler(url: URL(string: "https://zhgchg.li/assets/cb65fd5ab770/1*3_jdrLurFuUfNdW4BJaRww.jpeg")!, options: [:]) + do { + try handler.perform([request]) + } + catch { + print("Request failed: \(error)") + } + } +} +``` + +**分析結果:** +```r + • outdoor(戶外): 0.75392926 + • sky(天空): 0.75392926 + • blue_sky(藍天): 0.7519531 + • machine(機器): 0.6958008 + • cloudy(多雲): 0.26538086 + • structure(結構): 0.15728651 + • sign(標誌): 0.14224191 + • fence(柵欄): 0.118652344 + • banner(橫幅): 0.0793457 + • material(材料): 0.075975396 + • plant(植物): 0.054406323 + • foliage(樹葉): 0.05029297 + • light(光): 0.048126098 + • lamppost(燈柱): 0.048095703 + • billboards(廣告牌): 0.040039062 + • art(藝術): 0.03977703 + • branch(樹枝): 0.03930664 + • decoration(裝飾): 0.036868922 + • flag(旗幟): 0.036865234 +....略 +``` +#### RecognizeTextRequest + +辨識圖片中的文字內容。\(a\.k\.a 圖片轉文字\) + + +![[\[遊記\] 2023 東京 5 日自由行](../9da2c51fa4f2/)](/assets/755509180ca8/1*XL40lLT774PfO60rCIfnxA.jpeg) + +[\[遊記\] 2023 東京 5 日自由行](../9da2c51fa4f2/) +```swift +if #available(iOS 18.0, *) { + // 新的使用 Swift 特性的 API + var request = RecognizeTextRequest() + request.recognitionLevel = .accurate + request.recognitionLanguages = [.init(identifier: "ja-JP"), .init(identifier: "en-US")] // Specify language code, e.g., Traditional Chinese + Task { + do { + let observations = try await request.perform(on: URL(string: "https://zhgchg.li/assets/9da2c51fa4f2/1*fBbNbDepYioQ-3-0XUkF6Q.jpeg")!) + observations.forEach { + observation in + let topCandidate = observation.topCandidates(1).first + print(topCandidate?.string ?? "No text recognized") + } + } + catch { + print("Request failed: \(error)") + } + } +} else { + // 舊的寫法 + let completionHandler: VNRequestCompletionHandler = { + request, error in + guard error == nil else { + print("Request failed: \(String(describing: error))") + return + } + guard let observations = request.results as? [VNRecognizedTextObservation] else { + return + } + observations.forEach { + observation in + let topCandidate = observation.topCandidates(1).first + print(topCandidate?.string ?? "No text recognized") + } + } + + let request = VNRecognizeTextRequest(completionHandler: completionHandler) + request.recognitionLevel = .accurate + request.recognitionLanguages = ["ja-JP", "en-US"] // Specify language code, e.g., Traditional Chinese + DispatchQueue.global().async { + let handler = VNImageRequestHandler(url: URL(string: "https://zhgchg.li/assets/9da2c51fa4f2/1*fBbNbDepYioQ-3-0XUkF6Q.jpeg")!, options: [:]) + do { + try handler.perform([request]) + } + catch { + print("Request failed: \(error)") + } + } +} +``` + +**分析結果:** +```makefile +LE LABO 青山店 +TEL:03-6419-7167 +*お買い上げありがとうございます* +No: 21347 +日付:2023/06/10 14.14.57 +担当: +1690370 +レジ:008A 1 +商品名 +税込上代数量税込合計 +カイアック 10 EDP FB 15ML +J1P7010000S +16,800 +16,800 +アナザー 13 EDP FB 15ML +J1PJ010000S +10,700 +10,700 +リップパーム 15ML +JOWC010000S +2,000 +1 +合計金額 +(内税額) +CARD +2,000 +3点御買上げ +29,500 +0 +29,500 +29,500 +``` +#### DetectBarcodesRequest + +偵測圖片中的條碼、QRCode 數據。 + + +![](/assets/755509180ca8/1*Z72y9rIwIKQCmnnuwsq0uQ.png) + + + +![泰國當地人推薦鵝牌清涼膏](/assets/755509180ca8/1*s3V1UQRIqto-iG1e30PK7Q.jpeg) + +泰國當地人推薦鵝牌清涼膏 +```swift +let filePath = Bundle.main.path(forResource: "IMG_6777", ofType: "png")! // 本地測試圖片 +let fileURL = URL(filePath: filePath) +if #available(iOS 18.0, *) { + // 新的使用 Swift 特性的 API + let request = DetectBarcodesRequest() + Task { + do { + let observations = try await request.perform(on: fileURL) + observations.forEach { + observation in + print("Payload: \(observation.payloadString ?? "No payload")") + print("Symbology: \(observation.symbology)") + } + } + catch { + print("Request failed: \(error)") + } + } +} else { + // 舊的寫法 + let completionHandler: VNRequestCompletionHandler = { + request, error in + guard error == nil else { + print("Request failed: \(String(describing: error))") + return + } + guard let observations = request.results as? [VNBarcodeObservation] else { + return + } + observations.forEach { + observation in + print("Payload: \(observation.payloadStringValue ?? "No payload")") + print("Symbology: \(observation.symbology.rawValue)") + } + } + + let request = VNDetectBarcodesRequest(completionHandler: completionHandler) + DispatchQueue.global().async { + let handler = VNImageRequestHandler(url: fileURL, options: [:]) + do { + try handler.perform([request]) + } + catch { + print("Request failed: \(error)") + } + } +} +``` + +**分析結果:** +```makefile +Payload: 8859126000911 +Symbology: VNBarcodeSymbologyEAN13 +Payload: https://lin.ee/hGynbVM +Symbology: VNBarcodeSymbologyQR +Payload: http://www.hongthaipanich.com/ +Symbology: VNBarcodeSymbologyQR +Payload: https://www.facebook.com/qr?id=100063856061714 +Symbology: VNBarcodeSymbologyQR +``` +#### RecognizeAnimalsRequest + +辨識圖片中的動物與置信度。 + + +![](/assets/755509180ca8/1*5zF3gA3WB1Q0-_cgt6mTCw.png) + + + +![[meme Source](https://www.redbubble.com/i/canvas-print/Funny-AI-Woman-yelling-at-a-cat-meme-design-Machine-learning-by-omolog/43039298.5Y5V7){:target="_blank"}](/assets/755509180ca8/1*KZ7mdE8fobP-_oj7tJf_Ww.jpeg) + +[meme Source](https://www.redbubble.com/i/canvas-print/Funny-AI-Woman-yelling-at-a-cat-meme-design-Machine-learning-by-omolog/43039298.5Y5V7){:target="_blank"} +```swift +let filePath = Bundle.main.path(forResource: "IMG_5026", ofType: "png")! // 本地測試圖片 +let fileURL = URL(filePath: filePath) +if #available(iOS 18.0, *) { + // 新的使用 Swift 特性的 API + let request = RecognizeAnimalsRequest() + Task { + do { + let observations = try await request.perform(on: fileURL) + observations.forEach { + observation in + let labels = observation.labels + labels.forEach { + label in + print("Detected animal: \(label.identifier) with confidence: \(label.confidence)") + } + } + } + catch { + print("Request failed: \(error)") + } + } +} else { + // 舊的寫法 + let completionHandler: VNRequestCompletionHandler = { + request, error in + guard error == nil else { + print("Request failed: \(String(describing: error))") + return + } + guard let observations = request.results as? [VNRecognizedObjectObservation] else { + return + } + observations.forEach { + observation in + let labels = observation.labels + labels.forEach { + label in + print("Detected animal: \(label.identifier) with confidence: \(label.confidence)") + } + } + } + + let request = VNRecognizeAnimalsRequest(completionHandler: completionHandler) + DispatchQueue.global().async { + let handler = VNImageRequestHandler(url: fileURL, options: [:]) + do { + try handler.perform([request]) + } + catch { + print("Request failed: \(error)") + } + } +} +``` + +分析結果: +```csharp +Detected animal: Cat with confidence: 0.77245045 +``` +#### 其他: +- 偵測圖像中的人體:DetectHumanRectanglesRequest +- 偵測人、動物的 Pose 動作 \(3D or 2D 都可以\):DetectAnimalBodyPoseRequest、DetectHumanBodyPose3DRequest、DetectHumanBodyPoseRequest、DetectHumanHandPoseRequest +- 檢測並追蹤物件的運動軌跡\(在影片、動畫不同的偵中\):DetectTrajectoriesRequest、TrackObjectRequest、TrackRectangleRequest + +#### **iOS ≥ 18 Update Highlight:** +```rust +VN*Request -> *Request (e.g. VNDetectBarcodesRequest -> DetectBarcodesRequest) +VN*Observation -> *Observation (e.g. VNRecognizedObjectObservation -> RecognizedObjectObservation) +VNRequestCompletionHandler -> async/await +VNImageRequestHandler.perform([VN*Request]) -> *Request.perform() +``` +### WWDC Example + +WWDC 官方影片以超市商品掃描器為例。 +#### 首先大多數的商品都有 Barcode 可供掃描 + + +![](/assets/755509180ca8/1*YT_Uf8eEi36Iv7zcOrmP4A.png) + + + +![](/assets/755509180ca8/1*J9uIwRKubLoJoC7i096AdQ.png) + + + +![](/assets/755509180ca8/1*gKg-NfHYqy7uBqe5hxzBSw.png) + + +我們可以從 `observation.boundingBox` 取得 Barcode 所在位置,但不同於常見 UIView 座標系, `BoundingBox` 的相對位置起點是從左下角,值的範圍落在 0~1 之間。 +```swift +let filePath = Bundle.main.path(forResource: "IMG_6785", ofType: "png")! // 本地測試圖片 +let fileURL = URL(filePath: filePath) +if #available(iOS 18.0, *) { + // 新的使用 Swift 特性的 API + var request = DetectBarcodesRequest() + request.symbologies = [.ean13] // 如果只要掃描 EAN13 Barcode,可直接指定,提升效能 + Task { + do { + let observations = try await request.perform(on: fileURL) + if let observation = observations.first { + DispatchQueue.main.async { + self.infoLabel.text = observation.payloadString + // 標記顏色 Layer + let colorLayer = CALayer() + // iOS >=18 新的座標轉換 API toImageCoordinates + // 未經測試,實際可能還需要計算 ContentMode = AspectFit 的位移: + colorLayer.frame = observation.boundingBox.toImageCoordinates(self.baseImageView.frame.size, origin: .upperLeft) + colorLayer.backgroundColor = UIColor.red.withAlphaComponent(0.5).cgColor + self.baseImageView.layer.addSublayer(colorLayer) + } + print("BoundingBox: \(observation.boundingBox.cgRect)") + print("Payload: \(observation.payloadString ?? "No payload")") + print("Symbology: \(observation.symbology)") + } + } + catch { + print("Request failed: \(error)") + } + } +} else { + // 舊的寫法 + let completionHandler: VNRequestCompletionHandler = { + request, error in + guard error == nil else { + print("Request failed: \(String(describing: error))") + return + } + guard let observations = request.results as? [VNBarcodeObservation] else { + return + } + if let observation = observations.first { + DispatchQueue.main.async { + self.infoLabel.text = observation.payloadStringValue + // 標記顏色 Layer + let colorLayer = CALayer() + colorLayer.frame = self.convertBoundingBox(observation.boundingBox, to: self.baseImageView) + colorLayer.backgroundColor = UIColor.red.withAlphaComponent(0.5).cgColor + self.baseImageView.layer.addSublayer(colorLayer) + } + print("BoundingBox: \(observation.boundingBox)") + print("Payload: \(observation.payloadStringValue ?? "No payload")") + print("Symbology: \(observation.symbology.rawValue)") + } + } + + let request = VNDetectBarcodesRequest(completionHandler: completionHandler) + request.symbologies = [.ean13] // 如果只要掃描 EAN13 Barcode,可直接指定,提升效能 + DispatchQueue.global().async { + let handler = VNImageRequestHandler(url: fileURL, options: [:]) + do { + try handler.perform([request]) + } + catch { + print("Request failed: \(error)") + } + } +} +``` + +**iOS ≥ 18 Update Highlight:** +```less +// iOS >=18 新的座標轉換 API toImageCoordinates +observation.boundingBox.toImageCoordinates(CGSize, origin: .upperLeft) +// https://developer.apple.com/documentation/vision/normalizedpoint/toimagecoordinates(from:imagesize:origin:) +``` + +**Helper:** +```swift +// Gen by ChatGPT 4o +// 因為照片在 ImageView 是設定 ContentMode = AspectFit +// 所以要多計算上下因 Fit 造成的空白位移 +func convertBoundingBox(_ boundingBox: CGRect, to view: UIImageView) -> CGRect { + guard let image = view.image else { + return .zero + } + + let imageSize = image.size + let viewSize = view.bounds.size + let imageRatio = imageSize.width / imageSize.height + let viewRatio = viewSize.width / viewSize.height + var scaleFactor: CGFloat + var offsetX: CGFloat = 0 + var offsetY: CGFloat = 0 + if imageRatio > viewRatio { + // 圖像在寬度方向上適配 + scaleFactor = viewSize.width / imageSize.width + offsetY = (viewSize.height - imageSize.height * scaleFactor) / 2 + } + + else { + // 圖像在高度方向上適配 + scaleFactor = viewSize.height / imageSize.height + offsetX = (viewSize.width - imageSize.width * scaleFactor) / 2 + } + + let x = boundingBox.minX * imageSize.width * scaleFactor + offsetX + let y = (1 - boundingBox.maxY) * imageSize.height * scaleFactor + offsetY + let width = boundingBox.width * imageSize.width * scaleFactor + let height = boundingBox.height * imageSize.height * scaleFactor + return CGRect(x: x, y: y, width: width, height: height) +} +``` + +**輸出結果** +```makefile +BoundingBox: (0.5295758928571429, 0.21408638121589782, 0.0943080357142857, 0.21254415360708087) +Payload: 4710018183805 +Symbology: VNBarcodeSymbologyEAN13 +``` +#### 部分商品無 Barcode,如散裝水果只有商品標籤 + + +![](/assets/755509180ca8/1*jeZhLtg9j11kgOAvKZmevg.jpeg) + + + +![](/assets/755509180ca8/1*YNokMMUewMA2kzjoGmMJPw.png) + + +因此我們的掃瞄器也需要同時支援掃描純文字標籤。 +```swift +let filePath = Bundle.main.path(forResource: "apple", ofType: "jpg")! // 本地測試圖片 +let fileURL = URL(filePath: filePath) +if #available(iOS 18.0, *) { + // 新的使用 Swift 特性的 API + var barcodesRequest = DetectBarcodesRequest() + barcodesRequest.symbologies = [.ean13] // 如果只要掃描 EAN13 Barcode,可直接指定,提升效能 + var textRequest = RecognizeTextRequest() + textRequest.recognitionLanguages = [.init(identifier: "zh-Hnat"), .init(identifier: "en-US")] + Task { + do { + let handler = ImageRequestHandler(fileURL) + // parameter pack syntax and we must wait for all requests to finish before we can use their results. + // let (barcodesObservation, textObservation, ...) = try await handler.perform(barcodesRequest, textRequest, ...) + let (barcodesObservation, textObservation) = try await handler.perform(barcodesRequest, textRequest) + if let observation = barcodesObservation.first { + DispatchQueue.main.async { + self.infoLabel.text = observation.payloadString + // 標記顏色 Layer + let colorLayer = CALayer() + // iOS >=18 新的座標轉換 API toImageCoordinates + // 未經測試,實際可能還需要計算 ContentMode = AspectFit 的位移: + colorLayer.frame = observation.boundingBox.toImageCoordinates(self.baseImageView.frame.size, origin: .upperLeft) + colorLayer.backgroundColor = UIColor.red.withAlphaComponent(0.5).cgColor + self.baseImageView.layer.addSublayer(colorLayer) + } + print("BoundingBox: \(observation.boundingBox.cgRect)") + print("Payload: \(observation.payloadString ?? "No payload")") + print("Symbology: \(observation.symbology)") + } + textObservation.forEach { + observation in + let topCandidate = observation.topCandidates(1).first + print(topCandidate?.string ?? "No text recognized") + } + } + catch { + print("Request failed: \(error)") + } + } +} else { + // 舊的寫法 + let barcodesCompletionHandler: VNRequestCompletionHandler = { + request, error in + guard error == nil else { + print("Request failed: \(String(describing: error))") + return + } + guard let observations = request.results as? [VNBarcodeObservation] else { + return + } + if let observation = observations.first { + DispatchQueue.main.async { + self.infoLabel.text = observation.payloadStringValue + // 標記顏色 Layer + let colorLayer = CALayer() + colorLayer.frame = self.convertBoundingBox(observation.boundingBox, to: self.baseImageView) + colorLayer.backgroundColor = UIColor.red.withAlphaComponent(0.5).cgColor + self.baseImageView.layer.addSublayer(colorLayer) + } + print("BoundingBox: \(observation.boundingBox)") + print("Payload: \(observation.payloadStringValue ?? "No payload")") + print("Symbology: \(observation.symbology.rawValue)") + } + } + + let textCompletionHandler: VNRequestCompletionHandler = { + request, error in + guard error == nil else { + print("Request failed: \(String(describing: error))") + return + } + guard let observations = request.results as? [VNRecognizedTextObservation] else { + return + } + observations.forEach { + observation in + let topCandidate = observation.topCandidates(1).first + print(topCandidate?.string ?? "No text recognized") + } + } + + let barcodesRequest = VNDetectBarcodesRequest(completionHandler: barcodesCompletionHandler) + barcodesRequest.symbologies = [.ean13] // 如果只要掃描 EAN13 Barcode,可直接指定,提升效能 + let textRequest = VNRecognizeTextRequest(completionHandler: textCompletionHandler) + textRequest.recognitionLevel = .accurate + textRequest.recognitionLanguages = ["en-US"] + DispatchQueue.global().async { + let handler = VNImageRequestHandler(url: fileURL, options: [:]) + do { + try handler.perform([barcodesRequest, textRequest]) + } + catch { + print("Request failed: \(error)") + } + } +} +``` + +**輸出結果:** +``` +94128s +ORGANIC +Pink Lady® +Produce of USh +``` + +**iOS ≥ 18 Update Highlight:** +```swift +let handler = ImageRequestHandler(fileURL) +// parameter pack syntax and we must wait for all requests to finish before we can use their results. +// let (barcodesObservation, textObservation, ...) = try await handler.perform(barcodesRequest, textRequest, ...) +let (barcodesObservation, textObservation) = try await handler.perform(barcodesRequest, textRequest) +``` +#### iOS ≥ 18 [performAll\( \)](https://developer.apple.com/documentation/vision/imagerequesthandler/performall(_:)?changes=latest_minor){:target="_blank"} 方法 + + +![](/assets/755509180ca8/1*z0364eYD4F4On194EgQ1kQ.png) + + +前面的 `perform(barcodesRequest, textRequest)` 處理 Barcode 掃描跟文字掃描的方式需要等到兩個 Request 都完成才能繼續執行;iOS 18 開始提供新的 `performAll()` 方法,將回應方式改為串流,在收到其中一個 Reqeust 結果是就能做對應處理,例如掃描到 Barcode 就直接響應。 +```swift +if #available(iOS 18.0, *) { + // 新的使用 Swift 特性的 API + var barcodesRequest = DetectBarcodesRequest() + barcodesRequest.symbologies = [.ean13] // 如果只要掃描 EAN13 Barcode,可直接指定,提升效能 + var textRequest = RecognizeTextRequest() + textRequest.recognitionLanguages = [.init(identifier: "zh-Hnat"), .init(identifier: "en-US")] + Task { + let handler = ImageRequestHandler(fileURL) + let observation = handler.performAll([barcodesRequest, textRequest] as [any VisionRequest]) + for try await result in observation { + switch result { + case .detectBarcodes(_, let barcodesObservation): + if let observation = barcodesObservation.first { + DispatchQueue.main.async { + self.infoLabel.text = observation.payloadString + // 標記顏色 Layer + let colorLayer = CALayer() + // iOS >=18 新的座標轉換 API toImageCoordinates + // 未經測試,實際可能還需要計算 ContentMode = AspectFit 的位移: + colorLayer.frame = observation.boundingBox.toImageCoordinates(self.baseImageView.frame.size, origin: .upperLeft) + colorLayer.backgroundColor = UIColor.red.withAlphaComponent(0.5).cgColor + self.baseImageView.layer.addSublayer(colorLayer) + } + print("BoundingBox: \(observation.boundingBox.cgRect)") + print("Payload: \(observation.payloadString ?? "No payload")") + print("Symbology: \(observation.symbology)") + } + case .recognizeText(_, let textObservation): + textObservation.forEach { + observation in + let topCandidate = observation.topCandidates(1).first + print(topCandidate?.string ?? "No text recognized") + } + default: + print("Unrecongnized result: \(result)") + } + } + } +} +``` +### Optimize with Swift Concurrency + + +![](/assets/755509180ca8/1*LgxxMOVS6is3n6EqPWqA6Q.png) + + + +![](/assets/755509180ca8/1*80CFJpkb-gjy3bJs4jAC2A.png) + + +假設我們有一個圖片牆列表,每張圖片都需要自動裁切出物件主體;這時候可以善用 Swift Concurrency 增加載入效率。 +#### **原始寫法** +```swift +func generateThumbnail(url: URL) async throws -> UIImage { + let request = GenerateAttentionBasedSaliencyImageRequest() + let saliencyObservation = try await request.perform(on: url) + return cropImage(url, to: saliencyObservation.salientObjects) +} + +func generateAllThumbnails() async throws { + for image in images { + image.thumbnail = try await generateThumbnail(url: image.url) + } +} +``` + +一次只執行一個,效率、效能緩慢。 +#### **優化 \(1\) — TaskGroup** Concurrency +```swift + +func generateAllThumbnails() async throws { + try await withThrowingDiscardingTaskGroup { taskGroup in + for image in images { + image.thumbnail = try await generateThumbnail(url: image.url) + } + } +} +``` + +將每個 Task 都加入 TaskGroup Concurrency 執行。 + + +> **_問題:圖片辨識、截圖操作非常消耗記憶體性能,如果無節制狂加並行任務,可能造成使用者卡頓、OOM 閃退問題。_** + + + + +#### 優化 \(2\) — TaskGroup Concurrency \+ 限制並行數量 +```swift +func generateAllThumbnails() async throws { + try await withThrowingDiscardingTaskGroup { + taskGroup in + // 最多執行數量不得超過 5 + let maxImageTasks = min(5, images.count) + // 先填充 5 個 Task + for index in 0..