Conversation

@jleinenbach jleinenbach commented Dec 23, 2025

PR Description

This PR introduces a major architectural overhaul to support concurrent multi-account usage and provides a centralized Ephemeral Identity (EID) Resolver API tailored for Bluetooth scanners like [Bermuda BLE Trio](https://github.com/agittins/bermuda).

It moves the integration from a singleton-based approach to a fully entry-scoped architecture, allowing multiple Google accounts to track devices simultaneously while enabling cross-account device resolution for local Bluetooth scanners.

🚀 Key Features

1. Global EID Resolver API (Bermuda Support)

  • Introduced GoogleFindMyEIDResolver exposed via hass.data[DOMAIN][DATA_EID_RESOLVER].
  • Cross-Account Resolution: The resolver aggregates identity keys from all loaded config entries. This allows local BLE scanners (Bermuda) to resolve "FMDN RAW" payloads for devices owned by any configured account (including shared devices), not just the primary one.
  • Performance: Implements pre-computation of EIDs and caching mechanisms to handle high-frequency BLE advertisements without blocking the event loop.
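
A minimal sketch of the lookup shape this API implies: precompute the EIDs for each identity key at registration time so that the hot path, called once per BLE advertisement, is a plain dictionary lookup. Method names and the derivation stub here are illustrative, not the integration's actual code; the real FMDN EID derivation is cryptographic and time-based.

```python
class GoogleFindMyEIDResolver:
    """Sketch: maps ephemeral IDs (EIDs) to devices across all accounts."""

    def __init__(self):
        self._eid_to_device = {}  # precomputed EID -> (entry_id, device_id)

    def register_entry(self, entry_id, identity_keys):
        # Precompute EID variants per key up front so BLE callbacks stay
        # pure dict lookups and never block the event loop with crypto.
        for device_id, key in identity_keys.items():
            for eid in self._derive_eids(key):
                self._eid_to_device[eid] = (entry_id, device_id)

    def resolve_eid(self, eid):
        # Hot path: one dictionary lookup per advertisement.
        return self._eid_to_device.get(eid)

    @staticmethod
    def _derive_eids(identity_key):
        # Placeholder derivation; the real scheme derives time-windowed
        # EIDs from the identity key.
        return [f"eid:{identity_key}:{slot}" for slot in range(3)]
```

Because entries from every account register into the same map, a scanner resolving an EID gets a hit regardless of which account owns the tracker.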

2. True Multi-Account Support

  • Entry-Scoped State: Refactored TokenCache, Coordinator, and FCMReceiver to operate strictly within the scope of a ConfigEntry.
  • Isolation: Multiple Google accounts can now coexist. Keys, tokens, and device lists are isolated per account.
  • Duplicate Prevention: Added logic to detect and prevent adding the same Google account twice, raising a standard Repair issue if detected.

3. Device Registry & Unique ID Migration

  • Namespaced Identifiers: Migrated entity and device unique IDs to include the entry_id (e.g., entry_id:device_id). This prevents collisions when the same physical tracker appears in multiple accounts (e.g., Owner and Shared User).
  • Self-Healing: Added routines to detect and repair orphaned device registry entries or incorrect parent/service-device links.
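
The namespacing rule described above can be sketched as a small, idempotent helper (illustrative only, not the integration's actual migration code):

```python
def migrate_unique_id(unique_id: str, entry_id: str) -> str:
    """Prefix a legacy unique_id with its config entry id, idempotently."""
    prefix = f"{entry_id}:"
    if unique_id.startswith(prefix):
        return unique_id  # already migrated; running twice is safe
    return prefix + unique_id
```

Idempotence matters here: a migration that runs on every start must not double-prefix IDs that were already migrated.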

🐛 Bug Fixes & Stability Improvements

  • FCM Receiver Logic: Implemented reference counting for the shared FCM receiver. It now correctly attaches the Home Assistant context (attach_hass) to enable owner-index fallback routing.

  • Coordinator Robustness:

      • Fixed race conditions during startup and reloads.

      • Improved handling of ConfigEntryAuthFailed to correctly trigger the re-auth flow without spamming the logs.

      • Added explicit handling for RAW SPY scenarios in which keys had previously not been loaded into the resolver.

  • API Error Handling: Better mapping of upstream NovaAuthError and HTTP 401/403 responses to HA authentication states.

  • Cleanup: Proper shutdown routines (async_unload_entry) to flush caches and stop background tasks/timers, preventing memory leaks on reloads.
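
The reference-counted receiver lifecycle mentioned above can be sketched as follows (a simplified illustration; the real FCM receiver and attach_hass wiring are more involved):

```python
class SharedFCMReceiver:
    """Sketch: one push receiver shared by several config entries."""

    def __init__(self):
        self._refs = 0
        self.running = False

    def acquire(self):
        self._refs += 1
        if self._refs == 1:
            self.running = True  # first entry starts the receiver

    def release(self):
        self._refs = max(0, self._refs - 1)
        if self._refs == 0:
            self.running = False  # last entry stops it
```

With this shape, unloading one of several accounts leaves the shared receiver running; only the last unload tears it down.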

⚠️ Breaking Changes / Notes

  • Unique IDs: Internal Unique IDs for entities are migrated. This should be transparent to the user, but ensures uniqueness across multiple accounts.

Bermuda

To use the EID Resolver with Bermuda, install this fork:
https://github.com/jleinenbach/bermuda/releases/tag/v0.6.8-GoogleFindMy-HA-5


@jleinenbach jleinenbach marked this pull request as draft December 23, 2025 16:56
@jleinenbach jleinenbach changed the title from "1.7.0 3" to "1.7.0-3 feat: Global EID Resolver API & Stability Overhaul" Dec 23, 2025
@jleinenbach jleinenbach marked this pull request as ready for review December 23, 2025 17:07
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines 7762 to 7767

```python
resolver = bucket.get(DATA_EID_RESOLVER)
if isinstance(resolver, GoogleFindMyEIDResolver):
    if not entries_bucket:
        resolver.stop()
        bucket.pop(DATA_EID_RESOLVER, None)
    else:
```

P1: Avoid stopping global EID resolver on failed unload

During parent unload, the code pops the entry from entries_bucket and immediately stops/removes the shared GoogleFindMyEIDResolver when the bucket becomes empty. If a later step in the same function fails (for example async_unload_platforms returns False or a subentry unload raises), the entry runtime data is reinserted and the unload returns False, but the resolver is never recreated. That leaves the entry still active while the global resolver stays stopped/absent, so BLE EID resolution is permanently disabled until Home Assistant is restarted or the integration is reloaded successfully. The stop/remove should be deferred until unload success is confirmed, or the resolver should be restarted when the entry is restored.
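
A deferred-teardown shape along the lines the review suggests might look like this (simplified; `unload_entry`, `unload_platforms`, and the bucket keys are placeholders, not the integration's real code):

```python
def unload_entry(entry_id, entries_bucket, bucket, unload_platforms):
    """Stop the shared resolver only after the unload is known to succeed."""
    removed = entries_bucket.pop(entry_id, None)
    if not unload_platforms():
        # Unload failed: restore the entry and leave the resolver running,
        # so BLE EID resolution keeps working for the still-active entry.
        if removed is not None:
            entries_bucket[entry_id] = removed
        return False
    # Only now, with unload confirmed, stop the resolver if nobody is left.
    if not entries_bucket:
        resolver = bucket.pop("eid_resolver", None)
        if resolver is not None:
            resolver.stop()
    return True
```

The key ordering change is that `resolver.stop()` moves after the success check, so a failed unload can no longer strand an active entry without a resolver.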


Fixed.

@jleinenbach jleinenbach marked this pull request as draft December 23, 2025 17:15
@jleinenbach jleinenbach marked this pull request as ready for review December 23, 2025 18:02
jleinenbach and others added 24 commits January 4, 2026 22:58
…ls-Sq3Sp

fix: Re-raise NovaAuthError to trigger re-authentication
When a tracker is shared between multiple Google accounts, each account
has its own canonical_id but they share the same identity_key. Previously,
location updates were only applied to the device that received them directly.

This fix implements location propagation:
- Added _identity_key_to_devices mapping to track which devices share trackers
- Added _register_identity_key() to update the mapping when identity_keys are seen
- Added _propagate_location_to_shared_devices() to propagate location updates
- Location updates are now propagated to all devices with the same identity_key
- Only newer timestamps are propagated (prevents stale data overwriting fresh data)
- Includes infinite loop guard (_propagating_location flag)

Updated tests to include new coordinator attributes.
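
The propagation rule in this commit (forward updates to every device sharing an identity_key, but only when the incoming timestamp is newer) can be sketched as follows; the function shape is illustrative, the names mirror the commit message:

```python
def propagate_location(identity_key_to_devices, device_data, identity_key, update):
    """Apply `update` to every device sharing `identity_key`, newest wins."""
    for device_id in identity_key_to_devices.get(identity_key, ()):
        current = device_data.get(device_id)
        if current is None or update["last_seen"] > current.get("last_seen", 0):
            device_data[device_id] = dict(update)  # newer data replaces older
        # else: keep the fresher data already stored for this device
```

The timestamp comparison is what prevents a late-arriving stale report from one account overwriting fresher data already held by another account's device.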
…-7H3OJ

Propagate location updates to all devices sharing same identity_key
When the TokenCache is closed (e.g., after a failed reload or during
shutdown race conditions), the integration enters an invalid state.
Previously this resulted in cryptic "TokenCache is closed; writes are
disallowed" errors that didn't trigger re-authentication.

Now this error is recognized and converted to ConfigEntryAuthFailed,
which prompts the user to re-authenticate and forces a clean
re-initialization of the integration.

This addresses the issue where users with new secrets.json files
would still see errors due to stale cache references.
…ls-Sq3Sp

fix: Handle TokenCache closed error as auth failure
The _get_entry_cache method now checks if the TokenCache._closed flag
is True before returning the cache instance. When a cache is closed
(e.g., after a failed reload or during shutdown), returning it would
cause writes to fail with "TokenCache is closed" errors.

By returning None for closed caches, the reauth flow can proceed
properly and self-heal, as the entry will be fully reloaded with a
fresh cache instance when async_update_reload_and_abort() completes.

This prevents scenarios where:
1. Entry enters auth-failed state
2. Cache gets closed but runtime_data is not cleared
3. Reauth flow attempts to use the closed cache
4. Writes fail, preventing recovery even with valid new secrets
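
The guard described in this commit can be illustrated like this (a sketch, not the integration's actual `_get_entry_cache`): never hand out a cache whose closed flag is set, so the reauth flow rebuilds a fresh one instead of failing on writes.

```python
class TokenCache:
    """Minimal stand-in for the real entry-scoped token cache."""

    def __init__(self):
        self._closed = False

    def close(self):
        self._closed = True

def get_entry_cache(runtime_data):
    """Return the entry's cache, or None if it is missing or closed."""
    cache = runtime_data.get("token_cache")
    if cache is None or cache._closed:
        return None  # caller treats this as "needs re-initialization/reauth"
    return cache
```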
…-Yx4yJ

fix: Return None from _get_entry_cache when cache is closed
When Bermuda detects an FMDN beacon, this now propagates the detection data
(area, rssi, distance, floor) to ALL devices sharing the same physical tracker.

Changes:
- bermuda_listener: Added _async_propagate_bermuda_data_to_shared_devices()
  - Resolves EID to all matching devices via resolve_eid_all()
  - Sends Bermuda data (area, rssi, distance, floor) to each device's coordinator
  - Extracts additional Bermuda attributes (distance, floor)

- coordinator: Extended _propagate_location_to_shared_devices()
  - Added Bermuda-specific fields to propagation (bermuda_area, bermuda_rssi, etc.)
  - Updated timestamp comparison to handle bermuda_last_seen separately
  - Data propagates even without coordinates (Bermuda-only updates)

This ensures that when Bermuda detects a shared tracker via EID, all accounts
see the same Bermuda sensor data (Area, Distance, Floor, RSSI).
The test was scanning .venv313/lib/python3.13/site-packages and flagging
issues in external dependencies (aifc, tqdm, pip, homeassistant, etc.)
that are outside our control.

Added to SKIP_PARTS:
- .venv313 (virtual environment)
- site-packages (external packages)
- lib, lib64 (Python lib directories)
…-7H3OJ

Claude/debug tracker sharing 7 h3 oj
- Endpoint UploadLocationReports confirmed via PCAPdroid network capture
- QUIC traffic observed to spot-pa.googleapis.com during FMDN activity
- Enabled FMDN_UPLOAD_ENABLED flag
- Implemented _upload_via_spot_request with async_spot_request(cache=...)

Tests: ruff ✓, mypy --strict ✓, pytest (942 passed) ✓
feat(fmdn): Enable FMDN upload endpoint after PCAPdroid confirmation
Complete rewrite of bermuda_listener.py to work with actual Bermuda entities:

**Previous (broken) approach:**
- Expected sensor.bermuda_*fmdn* entities with fmdn_eid attribute
- These entities don't exist - Bermuda doesn't expose EID

**New (correct) approach:**
- Listen for device_tracker.*_bermuda_tracker* state changes
- Find GoogleFindMy device via HA Device Registry (same device)
- Get current EID from GoogleFindMy coordinator/eid_generator
- Upload semantic location (area) on area change

Architecture:
- Bermuda creates device_tracker for GoogleFindMy devices (shared HA device)
- Entity pattern: device_tracker.{device_name}_bermuda_tracker*
- Area attribute contains semantic location (e.g., "Windfang")
- EID computed from identity_key + time using generate_eid_variant()

Tests updated to match new architecture.

Tests: ruff ✓, mypy --strict ✓, pytest (945 passed) ✓
…licates

Android devices can have multiple canonical IDs (historical IDs from
updates, factory resets, etc.). Previously, get_canonic_ids() returned
ALL IDs for each device, causing duplicate device entries in Home Assistant.

Now it only returns the first (primary) canonical ID per device, matching
the behavior in fcm_receiver_ha.py. This prevents smartphones from
appearing as multiple duplicate devices.
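
The dedup rule in this commit reduces to taking the first (primary) canonical ID per device; a sketch (input shape assumed for illustration):

```python
def get_canonic_ids(devices):
    """Keep only the primary canonical id per device.

    devices: list of (device_name, [canonical_id, ...]) pairs, where the
    list may contain historical ids from updates or factory resets.
    """
    return [(name, ids[0]) for name, ids in devices if ids]
```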
feat(fmdn): FMDN upload via Bermuda + fix device duplication
Simplify device matching for Bermuda FMDN integration:

- For FMDN devices, Bermuda uses the SAME identifiers as GoogleFindMy
  via the "congealment" mechanism (see bermuda/entity.py)
- This means identifier matching works because both share the same
  (DOMAIN, device_id) tuples in the HA Device Registry
- Removed incorrect MAC and name matching strategies (MAC rotates,
  name matching is fragile)
- Added comprehensive documentation explaining the architecture

The matching now relies on Bermuda's proper FMDN integration where
metadevices have fmdn_device_id pointing to the GoogleFindMy HA device.

References:
- jleinenbach/bermuda fork FMDN integration
- docs/google_find_my_support.md (EID Resolver API)
feat(bermuda_listener): Add multi-strategy device matching
- Add detailed DEBUG logging for successful uploads with response size
- Add INFO log with semantic location on successful upload
- Update device_tracker semantic_name attribute after successful FMDN upload
- Trigger coordinator refresh to update entity state
- Pass google_device_id and coordinator through the upload chain

After successful upload, the device_tracker will show the semantic
location (e.g., "Büro", "Wohnzimmer") in its location_name attribute.
feat(fmdn): Add upload response logging and semantic_name update
- bermuda_listener: Add 30-second area stabilization before upload
  - Prevents rapid "jumping" room changes from triggering uploads
  - AreaDebounceState dataclass tracks debounce state per entity
  - Only uploads after area is stable for AREA_STABILIZATION_SECONDS

- location_uploader: Fix protobuf structure for semantic locations
  - Use correct LocationReport fields (semanticLocation, geoLocation, status)
  - Set status=SEMANTIC for Bermuda area-based uploads
  - Include semanticLocation.locationName with area name
  - Add encrypted GPS data in geoLocation.encryptedReport

- location_uploader: Improved throttling for semantic uploads
  - Add MIN_UPLOAD_INTERVAL_SEMANTIC (60s) for area changes
  - Allow uploads when semantic_area changes (even without GPS change)
  - Track semantic_area in UploadCacheEntry for throttle comparison
  - Add "Upload allowed" info logging with reason
Add early return when new_area == old_area to prevent unnecessary
task creation. This fixes test_bermuda_state_change_ignores_unchanged_area
which expects no async_create_task call when area is unchanged.
Implement entity name pattern matching as primary strategy for finding
GoogleFindMy devices from Bermuda tracker entities:

- Strategy 1: Entity name matching (NEW)
  - Bermuda: device_tracker.moto_tag_jens_schlusselbund_bermuda_tracker_2
  - GoogleFindMy: device_tracker.moto_tag_jens_schlusselbund
  - Extract base name via regex, lookup GoogleFindMy entity directly

- Strategy 2: Device identifier matching (fallback)
  - Original congealment-based matching via device registry identifiers

Fixes: "No GoogleFindMy device found" error when Bermuda and GoogleFindMy
devices exist separately in HA device registry.
Simplify device matching to use Bermuda's congealment mechanism:
- Bermuda and GoogleFindMy entities share the SAME HA device
- Find all entities for the HA device via entity registry
- Look for GoogleFindMy device_tracker among them
- Extract Google device ID from entity unique_id

Removes incorrect entity name pattern matching which assumed
separate HA devices for Bermuda and GoogleFindMy.
…tion

- Add comprehensive ASCII diagram documenting Bermuda's congealment
  mechanism in bermuda_listener.py module docstring
- Add 4 new tests for shared device matching via entity registry:
  - test_find_googlefindmy_device_via_congealment
  - test_find_googlefindmy_device_no_gfm_entity_on_device
  - test_find_googlefindmy_device_extracts_device_id_from_unique_id
  - test_find_googlefindmy_device_ignores_non_device_tracker_entities
- Add async_entries_for_device stub to conftest.py for test compatibility
- Fix test patch paths to use homeassistant.helpers.entity_registry

Addresses: device matching via shared HA device (congealment) between
Bermuda and GoogleFindMy integrations - NOT via entity names which
users can rename.
claude and others added 30 commits January 14, 2026 11:49
- Format 3 files with ruff format
- Fix 3 auto-fixable lint issues
- Remove unused import (math in test_logic_safeguards.py)
Align test_decoder_accuracy_regression.py with the new "sanitize-and-set"
strategy where invalid accuracy is SET to DEFAULT_ACCURACY_FALLBACK (2000.0m)
instead of being removed from the dict.
…tection-j0bkS

fix: "Phantom Update" Bug (Google FMDN accuracy=0.0)
…ation

User feedback: A 2km fallback radius is useless for finding a tracker.
Since trackers use Bluetooth (~100m range) + GPS error margin (~100m),
200m is physically plausible and still useful for map navigation.

Math validation: 200m fallback loses to real GPS (20-50m) in fusion:
- Weight ratio: 200²/20² = 100x lower weight for fallback data
- Self-healing: Real GPS data quickly overrides corrupted values

Changes:
- PRIVACY_ACCURACY_FALLBACK = 200.0m (new canonical constant)
- DEFAULT_ACCURACY_FALLBACK = alias for backward compatibility
- Updated docstrings and test assertions to reflect 200m value
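
The weighting argument above can be checked directly, assuming inverse-variance fusion where a measurement's weight scales as 1/accuracy²:

```python
def relative_weight(acc_a: float, acc_b: float) -> float:
    """How much more weight measurement A gets than B under 1/accuracy² weighting."""
    return (acc_b ** 2) / (acc_a ** 2)

# A 20 m GPS fix outweighs the 200 m fallback by a factor of 100,
# so real data quickly dominates the fallback in fusion.
```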
…distance-SwpOu

Fix tracker location fallback distance strategy
Changed exception handling to differentiate between connection errors
and unexpected errors:

- Connection errors (TimeoutError, ClientConnectionError, ClientError)
  are now logged at WARNING level for visibility
- Unexpected errors are now logged at ERROR level with full stack trace
  (exc_info=True) to help diagnose bugs or API changes

Previously, all errors were logged at DEBUG level, making the system
completely opaque when locate/sound operations failed silently.
…ations

Changes in decrypt_locations.py:
- Invalid/missing timestamp: DEBUG → WARNING (includes raw value for debugging)
- Decrypted payload not bytes: DEBUG → WARNING (includes type info)
- Exception handling: Split into expected errors (WARNING) and unexpected
  errors (ERROR with full stack trace via exc_info=True)
- No locations found: DEBUG → INFO (informational, not an error)
- Protobuf parse failure: DEBUG → WARNING
- Invalid coordinates: DEBUG → WARNING

Changes in polling.py:
- Google Home spam filter: DEBUG → WARNING
- No location data available: DEBUG → WARNING

Previously, many location drops were logged at DEBUG level, making the
system opaque when troubleshooting missing tracker data. These changes
ensure that dropped locations are visible in normal Home Assistant logs.
asyncio.TimeoutError is just an alias for the builtin TimeoutError,
so catching both is redundant.
…on-9droF

Fix tracker data missing in integration
- Add NovaLogicError and NovaProtobufDecodeError exceptions for specific
  error categories (API logic errors vs. decode failures)
- Upgrade retry logs from INFO to WARNING level for better visibility
  in Home Assistant logs
- Add try/except DecodeError protection around all ParseFromString calls
  in decoder.py (parse_device_list_protobuf, parse_device_update_protobuf,
  parse_location_report_upload_protobuf)
- Add explicit exception handling for new exceptions in api.py
  (async_get_basic_device_list, async_get_device_location)

This ensures users can immediately see in the logbook:
- Network problems (Timeout/DNS) via WARNING-level retry logs
- Auth rejection (401/403) via existing error handling
- API logic errors (invalid device ID, etc.) via NovaLogicError
- Decode failures via NovaProtobufDecodeError

No more silent failures.
Google's Nova API returns errors in Protobuf format (google.rpc.Status),
not as HTML/text. This caused silent failures and unreadable error messages.

Changes:
- Add vendored google.protobuf.Any and google.rpc.Status Protobuf definitions
  (avoiding googleapis-common-protos dependency conflicts)
- Implement _decode_error_response() in nova_request.py with smart fallback:
  1. Try parsing as google.rpc.Status Protobuf (primary)
  2. Fall back to text/HTML parsing for load balancer errors
- Add explicit exception handling in locate.py for Nova-specific errors:
  - NovaAuthError: Authentication/permission issues (WARNING level)
  - NovaRateLimitError: 429 Too Many Requests (WARNING level)
  - NovaHTTPError: Server errors 5xx (WARNING level)
  - NovaLogicError: API logic errors (WARNING level)
  - NovaProtobufDecodeError: Decode failures (WARNING level)

Now errors appear clearly in logs, e.g.:
"Nova API request failed: HTTP 400 - RPC 3: Request contains an invalid argument"

Instead of silent failures or HTML parsing errors.
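
The two-step fallback described above (structured google.rpc.Status decode first, then plain text for load-balancer errors) can be sketched with the protobuf parse stubbed out; only the control flow shown here reflects the commit, the function signature is an assumption:

```python
def decode_error_response(body: bytes, parse_status) -> str:
    """Decode an error body; parse_status returns (code, message) or raises ValueError."""
    try:
        code, message = parse_status(body)  # primary: google.rpc.Status protobuf
        return f"RPC {code}: {message}"
    except ValueError:
        # Fallback: load balancers may answer with HTML/text, not protobuf.
        text = body.decode("utf-8", errors="replace").strip()
        return text[:200] or "<empty response body>"
```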
Replace manually-created protobuf implementations with proper protoc-generated
code. The previous manual implementations used an incorrect approach with
explicit FieldDescriptors and Reflection API.

Changes:
- Install protoc (3.21.12) and regenerate Any_pb2.py and RpcStatus_pb2.py
- Fix import path in RpcStatus.proto to use full package path
- Add ruff noqa comments for protoc-specific patterns (dynamic globals)
- Format generated code to pass ruff linting

The protoc-generated code uses serialized FileDescriptorProto and the
standard builder pattern, which is the correct approach for proto3.
- Change RpcStatus.proto import to relative path (ProtoDecoders/Any.proto)
- Fix Python import in RpcStatus_pb2.py to use full HA module path
- Regenerate proto files with correct proto_path
- Add comprehensive ruff noqa comments for protoc-generated code
- Add Any_pb2.pyi and RpcStatus_pb2.pyi for mypy compatibility
- Fix mypy strict compliance test
- Update existing .pyi files with ruff formatting fixes
- All 1964 tests pass
- Add ruff: noqa headers to Common_pb2.py, DeviceUpdate_pb2.py, LocationReportsUpload_pb2.py
- Add decoder.py to pyproject.toml per-file-ignores with documented justifications
- All ruff checks now pass for ProtoDecoders/
…ors-c4Svf

Refactor NOVA API error handling and transparency
- Change 401 handling logs to be more descriptive and use appropriate levels
  - Step 1 (ADM refresh): INFO level with clear token expiration message
  - Step 2 (AAS+ADM refresh): INFO level with AAS invalidation context
  - Step 3 (long cooldown): WARNING level
  - Step 4 (permanent failure): ERROR level without retry counts
- Add start/success logs for ADM and AAS token generation
- Change retry counting from "attempt X/Y" to "retry X/Y" semantics
  - First attempt is not counted as a retry
  - Subsequent failures show "retry 1/2", "retry 2/2" etc.
- Make inner token generation flows transparent in logs
- Remove unused max_auth_retries variable (ruff F841)
- Add pytest-homeassistant-custom-component to dev dependencies
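
The "retry X/Y" semantics described above (the first attempt is not a retry) reduce to a small labeling rule; a sketch:

```python
def retry_label(attempt: int, max_retries: int) -> str:
    """attempt 0 is the initial try; later attempts are counted as retries."""
    if attempt == 0:
        return "initial attempt"
    return f"retry {attempt}/{max_retries}"
```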
…ion-VsUs6

refactor: token expiration flow for ADM and AAS
Reduce log verbosity by demoting successful EID resolution (HIT) messages
to DEBUG level. MISS messages remain at INFO to highlight unresolved EIDs.
…gging-B4xgZ

refactor: change EID resolver HIT logging from INFO to DEBUG
Previously, HTTP 500 and other retry-eligible errors only logged the
status code, hiding potentially useful diagnostic information from the
server response. The response body was already being decoded but wasn't
included in the log output.

Now the warning and error messages include the decoded server response
(RPC status message or text/HTML content), making it much easier to
diagnose transient server failures.
Improved error logging throughout the codebase to include actual error
messages and server responses:

- nova_request.py: Include server response in retry warnings and
  final error messages for HTTP 500/502/503/504/429 errors

- start_sound_request.py, stop_sound_request.py: Propagate all errors
  to caller instead of returning None, enabling proper error logging

- api.py: Improve Play/Stop Sound failure messages to indicate when
  server returns empty response

- location_request.py: Reference previous warnings in FCM token
  failure message

- aas_token_retrieval.py: Include exception details in all token
  generation failure logs (errors and warnings)

- adm_token_retrieval.py: Include exception details in all token
  generation failure logs, including InvalidAasTokenError cases

Users will now see meaningful error details like HTTP status codes,
RPC error messages, and authentication failure reasons instead of
generic "failed" messages.
…sponse-mI7Zg

fix: include server response details in retry error logs
The location_accuracy property was missing the _is_location_stale()
guard that latitude and longitude have. This caused a race condition
where HA's zone.async_active_zone() would receive valid coordinates
but None for accuracy, triggering a TypeError in zone comparison.

The bug manifested intermittently because all three properties
(latitude, longitude, location_accuracy) are called separately,
and without consistent guards they could return inconsistent data.
…guard-IEW35

fix: add stale location guard to location_accuracy property
…cation_name

Fixes TypeError crash when HA's zone engine receives valid coordinates but
None for location_accuracy. This occurred when device data had lat/lon but
no accuracy value, causing zone.async_active_zone() to fail with:
  TypeError: '<' not supported between instances of 'float' and 'NoneType'

Changes:
- latitude/longitude: Return None if accuracy is missing, ensuring all three
  values (lat, lon, accuracy) are consistently present or absent together
- location_name: Add _is_location_stale() guard for consistency with
  coordinate properties, preventing stale semantic labels from showing
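
The consistency rule in these two commits (report latitude, longitude, and accuracy all together or not at all, and never when stale) can be sketched as one helper; the dict shape here is an assumption for illustration:

```python
def location_triplet(data: dict, is_stale: bool):
    """Return (lat, lon, accuracy) as a unit, or (None, None, None)."""
    lat = data.get("latitude")
    lon = data.get("longitude")
    acc = data.get("accuracy")
    if is_stale or lat is None or lon is None or acc is None:
        # Incomplete or stale: suppress all three so zone matching never
        # sees valid coordinates paired with a None accuracy.
        return (None, None, None)
    return (lat, lon, acc)
```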
…g-5oURC

fix: add accuracy completeness guard to lat/lon and stale guard to lo…
Individual connection failures during retry attempts are now logged at
DEBUG level instead of ERROR. The ERROR level is reserved for the final
failure after all retry attempts are exhausted. This reduces log noise
during transient network issues since the retry mechanism handles these
cases gracefully.

Also fixed typo: "Could not connected" → "Could not connect"
…errors-Js9Qf

fix: reduce log level for transient FCM connection errors during retry