-
-
Notifications
You must be signed in to change notification settings - Fork 243
Refactor extracting audio metadata into standalone service #1429
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request refactors audio metadata extraction logic from the Book Core Data model class into a new standalone BookMetadataService. The refactoring improves code organization, makes the extraction logic extensible to different audio formats, and modernizes the codebase by migrating to async/await patterns.
- Introduces
BookMetadataServicewith support for extracting metadata and chapters from multiple audio formats (M4B, MP3 with Overdrive tags, FLAC/Vorbis, and ID3 chapters) - Updates
LibraryServicemethods to be async and use the new metadata service for book creation - Removes metadata extraction logic from the
Bookmodel class, leaving only a simplifiedgetBookTitlehelper method
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| Shared/Services/BookMetadataService.swift | New service implementing comprehensive audio metadata and chapter extraction for various formats |
| Shared/Services/AudioMetadataService.swift | Removed old service (renamed and expanded into BookMetadataService) |
| Shared/Services/LibraryService.swift | Migrated to async/await, integrated BookMetadataService, added helper methods for book creation and chapter storage |
| Shared/CoreData/Backed-Models/Book+CoreDataClass.swift | Removed metadata extraction logic, retaining only simplified title helper method |
| Shared/Artwork/AVAudioAssetImageDataProvider.swift | Updated to use async/await APIs and AVURLAsset |
| Shared/CoreData/DatabaseInitializer.swift | Added BookMetadataService instantiation and setup |
| BookPlayerWatch/ExtensionDelegate.swift | Added BookMetadataService instantiation and setup |
| BookPlayer/AppDelegate.swift | Added BookMetadataService factory method and updated service initialization |
| BookPlayer/Hardcover/Network/HardcoverService.swift | Updated to use BookMetadataServiceProtocol instead of AudioMetadataServiceProtocol |
| BookPlayer/Settings/Storage/StorageViewModel.swift | Updated to handle async book creation |
| BookPlayer/Library/ItemList/LibraryRootView.swift | Updated to await async insertItems call |
| BookPlayer/Coordinators/DataInitializerCoordinator.swift | Updated to await async insertItems call |
| BookPlayer/Profile/Profile/ProfileSyncTasksSectionView.swift | Added BookMetadataService setup in preview |
| BookPlayer/Profile/Profile/ProfileListenedSectionView.swift | Added BookMetadataService setup in preview |
| BookPlayer/Library/ItemList/Views/BookView.swift | Added BookMetadataService setup in preview |
| BookPlayerTests/Services/LibraryServiceTests.swift | Updated test methods to async and added BookMetadataService setup |
| BookPlayerTests/PerformanceTests/PlaybackPerformanceTests.swift | Updated test methods to async and added BookMetadataService setup |
| BookPlayerTests/ImportOperationTests.swift | Added BookMetadataService setup |
| BookPlayerTests/DataManagerTests.swift | Added BookMetadataService setup |
| BookPlayerTests/Support/StubFactory.swift | Added BookMetadataService setup |
| BookPlayer.xcodeproj/project.pbxproj | Updated file references from AudioMetadataService to BookMetadataService |
Comments suppressed due to low confidence (1)
Shared/Services/LibraryService.swift:623
- The async
insertItemsmethod performs metadata extraction (line 599) outside of a Core Data context.perform block while using Core Data objects. The metadata extraction can take significant time, during which the Core Data context and its objects are being held without proper isolation. This could lead to threading issues if the context is accessed from multiple threads. Consider either: 1) Using a background context with proper perform blocks, or 2) Ensuring the context is only accessed on its designated queue.
func insertItems(from files: [URL], parentPath: String? = nil) async -> [SimpleLibraryItem] {
let context = dataManager.getContext()
let library = getLibraryReference()
var processedFiles = [SimpleLibraryItem]()
for file in files {
let libraryItem: LibraryItem
if let attributes = try? FileManager.default.attributesOfItem(atPath: file.path),
let type = attributes[.type] as? FileAttributeType,
type == .typeDirectory
{
libraryItem = Folder(from: file, context: context)
/// Handle folder contents and wait for completion to ensure proper metadata extraction
await self.handleDirectory(file)
} else {
// Extract metadata FIRST (includes chapters)
let metadata = await bookMetadataService.extractMetadata(from: file)
// Create Book with metadata
let book = createBook(from: file, metadata: metadata, context: context)
libraryItem = book
// Create chapters immediately if available
if let chapters = metadata?.chapters {
storeChapters(chapters, for: book, context: context)
}
}
libraryItem.orderRank = getNextOrderRank(in: parentPath)
if let parentPath,
let parentFolder = getItemReference(with: parentPath) as? Folder
{
parentFolder.addToItems(libraryItem)
/// update details on parent folder
} else {
library.addToItems(libraryItem)
}
processedFiles.append(SimpleLibraryItem(from: libraryItem))
dataManager.saveContext()
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for (index, chapter) in chapters.enumerated() { | ||
| let chapterDuration: TimeInterval | ||
|
|
||
| if index == chapters.endIndex - 1 { |
Copilot
AI
Jan 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition index == chapters.endIndex - 1 is incorrect. For an array, you should use index == chapters.count - 1 instead. The endIndex property returns the position one past the last valid index, making this comparison off by one.
| if index == chapters.endIndex - 1 { | |
| if index == chapters.count - 1 { |
| /// Extract metadata from an AVAsset | ||
| /// - Parameter asset: The AVAsset to extract metadata from | ||
| /// - Returns: AudioMetadata if extraction succeeds, nil otherwise | ||
| func extractMetadata(from asset: AVAsset) async -> AudioMetadata? |
Copilot
AI
Jan 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The public protocol method extractMetadata(from asset: AVAsset) lacks documentation. Consider adding a doc comment explaining its parameters and return value, similar to the documentation provided for the URL-based overload.
| private var keychain: KeychainServiceProtocol! | ||
| private let graphQL = GraphQLClient(baseURL: "https://api.hardcover.app/v1/graphql") | ||
| private var audioMetadataService: AudioMetadataServiceProtocol! | ||
| private var audioMetadataService: BookMetadataServiceProtocol! |
Copilot
AI
Jan 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable name audioMetadataService is inconsistent with the actual service type BookMetadataService. While the old service was called AudioMetadataService, the new refactored service is BookMetadataService, so the variable name should be bookMetadataService to maintain consistency with the type name and other parts of the codebase.
| title: data.name ?? "", | ||
| start: start, | ||
| duration: chapterDuration, | ||
| index: index + 1 | ||
| ) | ||
|
|
||
| chapters.append(chapter) | ||
| } | ||
|
|
||
| return chapters.isEmpty ? nil : chapters | ||
| } | ||
|
|
||
| private func extractID3Chapters(from metadata: [AVMetadataItem], duration: TimeInterval) async -> [ChapterMetadata]? { | ||
| var chapterData: [(start: Double, title: String)] = [] | ||
|
|
||
| for item in metadata { | ||
| guard let identifier = item.identifier?.rawValue, | ||
| identifier.hasPrefix("id3/CHAP") else { continue } | ||
|
|
||
| // Extract chapter start time and title from CHAP frame | ||
| // Note: AVFoundation may not fully expose CHAP frame timing data currently, | ||
| // but we attempt to load what's available for future compatibility. | ||
| if let dateValue = try? await item.load(.dateValue) { | ||
| let startTime = dateValue.timeIntervalSince1970 | ||
| let title = (try? await item.load(.stringValue)) ?? "" | ||
| chapterData.append((start: startTime, title: title)) | ||
| } | ||
| } | ||
|
|
||
| // Sort chapters by start time | ||
| chapterData.sort { $0.start < $1.start } | ||
|
|
||
| var chapters: [ChapterMetadata] = [] | ||
| for (index, data) in chapterData.enumerated() { | ||
| let chapterDuration: TimeInterval | ||
|
|
||
| // Calculate duration | ||
| if index < chapterData.count - 1 { | ||
| chapterDuration = chapterData[index + 1].start - data.start | ||
| } else { | ||
| chapterDuration = duration - data.start | ||
| } | ||
|
|
||
| let chapter = ChapterMetadata( | ||
| title: data.title, | ||
| start: data.start, | ||
| duration: chapterDuration, | ||
| index: index + 1 | ||
| ) | ||
|
|
||
| chapters.append(chapter) | ||
| } | ||
|
|
||
| return chapters.isEmpty ? nil : chapters | ||
| } | ||
|
|
||
| private func extractOverdriveChapters(from metadata: [AVMetadataItem], duration: TimeInterval) async -> [ChapterMetadata]? { | ||
| guard let txxxItem = metadata.first(where: { $0.identifier?.rawValue == "id3/TXXX" }), | ||
| let overdriveMetadata = try? await txxxItem.load(.stringValue) | ||
| else { return nil } | ||
|
|
||
| let matches = overdriveMetadata.matches(of: /<Marker>(.+?)<\/Marker>/) | ||
| var chapters: [ChapterMetadata] = [] | ||
|
|
||
| for (index, match) in matches.enumerated() { | ||
| let (_, marker) = match.output | ||
|
|
||
| guard let (_, timeMatch) = marker.matches(of: /<Time>(.+?)<\/Time>/).first?.output else { | ||
| continue | ||
| } | ||
|
|
||
| let start = TimeParser.getDuration(from: String(timeMatch)) | ||
| let title: String | ||
|
|
||
| if let (_, nameMatch) = marker.matches(of: /<Name>(.+?)<\/Name>/).first?.output { | ||
| title = String(nameMatch) | ||
| } else { | ||
| title = "" | ||
| } | ||
|
|
||
| let chapter = ChapterMetadata( | ||
| title: title, | ||
| start: start, | ||
| duration: 0, // Will be calculated below | ||
| index: index + 1 | ||
| ) | ||
|
|
||
| chapters.append(chapter) | ||
| } | ||
|
|
||
| // Overdrive markers do not include the duration, we have to parse it from the next chapter over | ||
| var finalChapters: [ChapterMetadata] = [] | ||
| for (index, chapter) in chapters.enumerated() { | ||
| let chapterDuration: TimeInterval | ||
|
|
||
| if index == chapters.endIndex - 1 { | ||
| chapterDuration = duration - chapter.start | ||
| } else { | ||
| chapterDuration = chapters[index + 1].start - chapter.start | ||
| } |
Copilot
AI
Jan 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The chapter duration calculation is duplicated across three different chapter extraction methods: extractVorbisChapters (lines 263-271), extractID3Chapters (lines 308-315), and extractOverdriveChapters (lines 367-373). This logic could be extracted into a helper function to improve maintainability and reduce code duplication.
Shared/Services/LibraryService.swift
Outdated
| book.relativePath = url.relativePath(to: DataManager.getProcessedFolderURL()) | ||
| book.remoteURL = nil | ||
| book.artworkURL = nil | ||
| book.title = metadata?.title ?? url.lastPathComponent.replacingOccurrences(of: "_", with: " ") |
Copilot
AI
Jan 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fallback logic for the title is incomplete. The extractTitle method returns an empty string when no title is found, but at line 636, the code only checks if metadata?.title is nil using the nil-coalescing operator ??. An empty string returned by extractTitle will not trigger the fallback to the filename. This means files with no metadata will have an empty title instead of using the filename. Consider checking for empty strings: (metadata?.title?.isEmpty == false) ? metadata!.title : url.lastPathComponent.replacingOccurrences(of: "_", with: " ") or have extractTitle return nil instead of an empty string.
| book.title = metadata?.title ?? url.lastPathComponent.replacingOccurrences(of: "_", with: " ") | |
| let fallbackTitle = url.lastPathComponent.replacingOccurrences(of: "_", with: " ") | |
| if let title = metadata?.title, !title.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty { | |
| book.title = title | |
| } else { | |
| book.title = fallbackTitle | |
| } |
Copilot
AI
Jan 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new BookMetadataService with its chapter extraction logic lacks test coverage. The refactored metadata extraction functionality should have dedicated unit tests to verify correct extraction of titles, artists, artwork, and chapters from various audio formats (MP3, M4B, FLAC with Vorbis chapters, Overdrive chapters, etc.). This is especially important since the chapter extraction logic was moved from the Book model and includes complex parsing for different metadata formats.
Shared/Services/LibraryService.swift
Outdated
| book.remoteURL = nil | ||
| book.artworkURL = nil | ||
| book.title = metadata?.title ?? url.lastPathComponent.replacingOccurrences(of: "_", with: " ") | ||
| book.details = metadata?.artist ?? "voiceover_unknown_author".localized |
Copilot
AI
Jan 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the title field issue, the extractArtist method returns an empty string when no artist metadata is found. However, at line 637, the fallback logic uses metadata?.artist ?? "voiceover_unknown_author".localized, which only checks for nil, not empty strings. This means books with no artist metadata will have an empty string as the artist instead of the localized "Unknown Author" text. Consider checking for empty strings or having extractArtist return nil instead of an empty string.
| book.details = metadata?.artist ?? "voiceover_unknown_author".localized | |
| let artist = metadata?.artist?.trimmingCharacters(in: .whitespacesAndNewlines) | |
| book.details = (artist?.isEmpty == false ? artist : nil) ?? "voiceover_unknown_author".localized |
Copilot
AI
Jan 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The metadata extraction in extractMetadata loads asset metadata twice: once on line 76 for basic metadata extraction and again on line 152 within extractChapters. This is inefficient as it requires two separate async loads of the same data. Consider passing the already-loaded metadata array to extractChapters to avoid the duplicate load operation.
| } | ||
| } | ||
| public class func getBookTitle(from fileURL: URL) -> String { | ||
| let asset = AVAsset(url: fileURL) |
Copilot
AI
Jan 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The getBookTitle method uses AVAsset(url:) instead of AVURLAsset(url:). For consistency with the refactored code in BookMetadataService and AVAudioAssetImageDataProvider, this should be changed to AVURLAsset which is the appropriate subclass when working with URL-based assets. Additionally, this method uses the old synchronous API (asset.metadata) instead of the new async load API, which could block the calling thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Extract chapter start time and title from CHAP frame | ||
| // Note: AVFoundation may not fully expose CHAP frame timing data currently, | ||
| // but we attempt to load what's available for future compatibility. | ||
| if let dateValue = try? await item.load(.dateValue) { | ||
| let startTime = dateValue.timeIntervalSince1970 |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ID3 chapter extraction logic is using dateValue.timeIntervalSince1970 which interprets the date as an absolute timestamp (seconds since Jan 1, 1970). However, ID3 CHAP frames typically contain relative timestamps (seconds from the start of the audio file), not absolute Unix timestamps. This will result in incorrect chapter start times, likely causing negative or very large duration values when calculated against the actual audio duration.
| // Extract chapter start time and title from CHAP frame | |
| // Note: AVFoundation may not fully expose CHAP frame timing data currently, | |
| // but we attempt to load what's available for future compatibility. | |
| if let dateValue = try? await item.load(.dateValue) { | |
| let startTime = dateValue.timeIntervalSince1970 | |
| // Extract chapter start time and title from CHAP frame. | |
| // CHAP timestamps are relative offsets from the start of the audio, | |
| // so we prefer time / numeric values rather than absolute dates. | |
| var startTime: TimeInterval? | |
| if let timeValue = try? await item.load(.time), | |
| timeValue.isValid, | |
| !timeValue.isIndefinite, | |
| timeValue.timescale != 0 { | |
| startTime = CMTimeGetSeconds(timeValue) | |
| } else if let numberValue = try? await item.load(.numberValue) { | |
| startTime = numberValue.doubleValue | |
| } | |
| if let startTime = startTime, startTime >= 0 { |
| } | ||
| } | ||
|
|
||
| return allChapters.isEmpty ? nil : allChapters |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The standard chapter extraction iterates through all available chapter locales and appends chapters from each to allChapters. If the same chapter information is available in multiple locales, this will result in duplicate chapters being extracted. Consider either selecting only the first/preferred locale, or ensuring duplicates are filtered based on start time and duration before returning the results.
| return allChapters.isEmpty ? nil : allChapters | |
| if allChapters.isEmpty { | |
| return nil | |
| } | |
| // Deduplicate chapters by start time and duration, keeping the first occurrence | |
| var uniqueChapters: [ChapterMetadata] = [] | |
| var seenKeys = Set<String>() | |
| for chapter in allChapters { | |
| let key = "\(chapter.start)-\(chapter.duration)" | |
| if !seenKeys.contains(key) { | |
| seenKeys.insert(key) | |
| uniqueChapters.append(chapter) | |
| } | |
| } | |
| return uniqueChapters.isEmpty ? nil : uniqueChapters |
| private func createBook(from url: URL, metadata: AudioMetadata?, context: NSManagedObjectContext) -> Book { | ||
| let entity = NSEntityDescription.entity(forEntityName: "Book", in: context)! | ||
| let book = Book(entity: entity, insertInto: context) | ||
|
|
||
| book.relativePath = url.relativePath(to: DataManager.getProcessedFolderURL()) | ||
| book.remoteURL = nil | ||
| book.artworkURL = nil | ||
| let title = metadata?.title ?? "" | ||
| book.title = title.isEmpty ? url.lastPathComponent.replacingOccurrences(of: "_", with: " ") : title | ||
| let artist = metadata?.artist ?? "" | ||
| book.details = artist.isEmpty ? "voiceover_unknown_author".localized : artist | ||
| book.duration = metadata?.duration ?? 0 | ||
| book.originalFileName = url.lastPathComponent | ||
| book.isFinished = false | ||
| book.type = .book | ||
|
|
||
| return book | ||
| } |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The createBook private helper method lacks documentation explaining its parameters and purpose. Since there are now two methods with similar names (the public async one and this private one), adding a docstring would help clarify that this is a helper for creating a Book entity with provided metadata, not extracting metadata itself.
| private func storeChapters(_ chapters: [ChapterMetadata], for book: Book, context: NSManagedObjectContext) { | ||
| for chapterMeta in chapters { | ||
| let chapter = Chapter(context: context) | ||
| chapter.title = chapterMeta.title | ||
| chapter.start = chapterMeta.start | ||
| chapter.duration = chapterMeta.duration | ||
| chapter.index = Int16(chapterMeta.index) | ||
| book.addToChapters(chapter) | ||
| } | ||
| } | ||
|
|
||
| /// Overload for backwards compatibility when we need to query by relativePath | ||
| private func storeChapters(_ chapters: [ChapterMetadata], for relativePath: String, context: NSManagedObjectContext) { | ||
| guard let book = getItem(with: relativePath, context: context) as? Book else { | ||
| return | ||
| } | ||
| storeChapters(chapters, for: book, context: context) | ||
| } |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The storeChapters helper methods lack documentation. Since there are two overloaded versions (one accepting a Book directly, one accepting a relativePath), adding docstrings would help clarify when to use each variant and their specific purposes.
| // FLAC/Vorbis chapters (CHAPTER tags) | ||
| if identifiers.contains(where: { $0.contains("CHAPTER") && !$0.contains("NAME") }) { |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Vorbis chapter detection condition $0.contains("CHAPTER") && !$0.contains("NAME") will incorrectly match identifiers that contain "CHAPTER" anywhere in the string but don't contain "NAME". This could match unrelated metadata tags. Consider using a more specific pattern like checking for identifiers that match the regex pattern CHAPTER\d+$ to ensure it only matches actual chapter time markers.
| // FLAC/Vorbis chapters (CHAPTER tags) | |
| if identifiers.contains(where: { $0.contains("CHAPTER") && !$0.contains("NAME") }) { | |
| // FLAC/Vorbis chapters (CHAPTER tags: time markers look like CHAPTER01, CHAPTER02, etc.) | |
| if identifiers.contains(where: { $0.range(of: #"CHAPTER\d+$"#, options: .regularExpression) != nil }) { |
| await self.handleDirectory(file) | ||
| } else { | ||
| libraryItem = Book(from: file, context: context) | ||
| // Extract metadata FIRST (includes chapters) |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment says "Extract metadata FIRST (includes chapters)" but chapters are stored separately in the conditional block below. Consider updating the comment to be more accurate, e.g., "Extract metadata including chapters if available" to clarify that chapters are part of the metadata but handled separately.
| // Extract metadata FIRST (includes chapters) | |
| // Extract metadata first (including chapters, if available) |
Purpose