Commit fa94c8f

committed
Back-up meeting notes
1 parent 9009614 commit fa94c8f

19 files changed

+1343
-1
lines changed

.github/workflows/ci.yml

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,4 +11,3 @@ jobs:
1111
uses: lycheeverse/lychee-action@v2.4.1
1212
with:
1313
fail: true
14-

lychee.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
exclude_path = ["meeting-notes"]

meeting-notes/2022-10-06.md

Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,41 @@
1+
OpenPodcastSync API meeting notes
2+
===
3+
4+
### Participating projects
5+
6+
podfriend, gpodder 4 Nextcloud, antennapod, kasts, funkwhale
7+
8+
9+
### What are the problems we are trying to solve?
10+
11+
Try to get the big picture around the various issues.
12+
13+
Subscriptions.
14+
15+
Problems identified with gpodder:
16+
- Multi-device support is confusing to users. Gpodder stores each device as an entity and allows you to link devices to sync them. Users find this confusing and don't understand why content isn't synced properly across non-linked devices.
17+
- This is only implemented for subscriptions, not for episodes. This inconsistency is confusing for users.
18+
- The database often overflows due to a large dataset being stored. All actions are stored and never cleaned up, and all episode actions can only be stored once. E.g.:
19+
- If you listen to an episode once and then listen again, an action such as "new" is only sent once.
20+
- The exact same play position cannot be stored twice.
21+
- Duplicate episodes/subscriptions are an issue. They use the media URL as an identifier for an episode, but if the file changes due to reupload or something else this creates a brand new entry. Syncing these changes is difficult.
22+
- User documentation is lacking. e.g.:
23+
- If podcast creators change GUID and URL for an episode, there isn't an agreed-upon behavior for the API or for clients consuming the episodes.
24+
- If an action is stored locally, and a conflicting action is received from the server at a later stage, what happens on sync? Can take inspiration from ListenBrainz scrobbles.
25+
- Subscription lists can duplicate due to URLs not being updated reliably.
26+
- There is no agreed-upon way to handle updating URLs, and this is mostly being handled by clients
27+
- We need to be able to synchronize a queue of episodes in the correct order between devices
28+
- We need to handle multiple queues, and have graceful handling for syncing with clients/servers that cannot handle multiple queues
29+
30+
People would expect all their data, queues and progress to be synced across all their apps, using a single online identity.
31+
How do we handle a server shutting down? Would we need some export/import features? Like an extended OPML? Or can we rely on clients as 'intermediaries' (sync data, log out from server, log in to other server)?
32+
Switching from mobile (home/commute) to web/desktop app (at work) is a common use case amongst us.
33+
34+
What would be our Minimum Viable Product?
35+
36+
Next steps?
37+
- split the list into component problems
38+
- asynchronous discussions
39+
- organize meetings when needed on specific matters
40+
41+
###### tags: `project-management` `meeting-notes` `OpenPodcastAPI`

meeting-notes/2023-03-14.md

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
Meeting 2023-03-14
2+
===
3+
4+
Participants:
5+
* Sporiff
6+
* keunes
7+
* gcrkrause
8+
9+
# Who has authority over the GUID
10+
11+
* In the first place the RSS feed
12+
* If that's not available the server might *optionally* ask podcastindex.org
13+
* The client may send a `guid` in the `POST` request **only** if it is obtained from the RSS feed. The server accepts sent `guid` information as authoritative
14+
* The client already has the GUID from the feed
15+
* The server (project) may decide to be as slim as possible, to the extent that it doesn't do any RSS fetching
16+
* The server MUST return a `guid` immediately. This can either be the `guid` sent by the client **or** a generated `guid` if nothing is sent. An asynchronous task CAN fetch the RSS feed to check for a `guid` if one was generated, store an updated `guid` and put an 'updated since' flag to tell clients on next connect to update this data.
17+
* In case a user subscribes to the same podcast through different feed URLs while there is no `guid` that connects the two, or if a server is unresponsive and this causes issues, it is accepted that this can lead to duplicate subscriptions.
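
The "return a `guid` immediately" rule above can be sketched as follows. This is only an illustration: the function name `assign_guid` and the use of a random UUIDv4 for generated values are assumptions, not something decided in the meeting.

```python
import uuid
from typing import Optional, Tuple


def assign_guid(client_guid: Optional[str]) -> Tuple[str, bool]:
    """Return (guid, was_generated) for a new subscription.

    A guid sent by the client is treated as authoritative. If none is sent,
    a placeholder is generated so the server can answer immediately; an
    asynchronous task could later replace it with the guid from the RSS feed.
    """
    if client_guid:
        return client_guid, False
    # Assumption: a random UUID is good enough as a temporary identifier.
    return str(uuid.uuid4()), True
```

The `was_generated` flag is what a server could use to decide whether a background feed fetch is worth scheduling.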
18+
19+
# Deletion process
20+
21+
* The `DELETE` verb should actually remove data as a cascade
22+
* The server should keep a record **only** of the GUID and mark it as deleted
23+
* The API should return a `410 GONE` status for any deleted entries
24+
* The `PATCH` unsubscribe request marks all entries as **unsubscribed**
25+
* The server should not remove any data associated with **unsubscribed** subscriptions unless they are deleted
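
A minimal in-memory sketch of the deletion rules above, assuming hypothetical names (`SubscriptionStore`, `Subscription`) and assuming `404` for GUIDs the server has never seen, which the notes do not specify:

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Subscription:
    guid: str
    feed_url: Optional[str]
    is_subscribed: bool = True
    is_deleted: bool = False


class SubscriptionStore:
    def __init__(self) -> None:
        self._items: Dict[str, Subscription] = {}

    def add(self, guid: str, feed_url: str) -> None:
        self._items[guid] = Subscription(guid, feed_url)

    def unsubscribe(self, guid: str) -> None:
        # PATCH unsubscribe: only flip the flag, keep all associated data.
        self._items[guid].is_subscribed = False

    def delete(self, guid: str) -> None:
        # DELETE: drop the data, keep only the GUID marked as deleted.
        self._items[guid] = Subscription(guid, None, False, True)

    def get(self, guid: str) -> Tuple[int, Optional[Subscription]]:
        sub = self._items.get(guid)
        if sub is None:
            return 404, None  # assumption: unknown GUID
        if sub.is_deleted:
            return 410, None  # 410 GONE for deleted entries
        return 200, sub
```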
26+
27+
# Tasks until next time
28+
29+
- [ ] Update specs @Ciaran
30+
- [ ] [Setup Hosted OpenAPI specs](https://github.com/OpenPodcastAPI/api-specs/issues/13) @Georg
31+
- [ ] Setup Sphinx @Ciaran
32+
- [ ] Reference Implementation @Georg
33+
- [ ] Check that Ciarán isn't speaking nonsense in client behavior spec @keunes
34+
35+
###### tags: `project-management` `meeting-notes` `OpenPodcastAPI`

meeting-notes/2023-04-11.md

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
Open Podcast API 11/04/2023
2+
===
3+
4+
present: Ciarán (FW), Jonathan (GfN), Keunes (AP) and Frederik ([MusicPod](https://github.com/ubuntu-flutter-community/musicpod))
5+
6+
Ciarán to update:
7+
8+
* Fetch logic:
9+
* All timestamp fields must be checked against the `since` parameter in the call (`subscription_changed`, `guid_changed`)
10+
* Deletion logic:
11+
* `is_deleted` boolean field should be replaced with a timestamp field that is included in fetch calls to inform clients of deletions
12+
* A deleted subscription should be reinstated by a client adding a new subscription with the same GUID. The `subscription_changed` and `guid_changed` fields should reflect the date that the subscription is reinstated. The `deleted` timestamp field should be NULLed
13+
* On receipt of a deleted subscription, the client should present the user with the option to **remove** their local data or **send** their local data to the server to reinstate the subscription details
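
The fetch logic above (check every timestamp field against `since`, including the `deleted` timestamp) could look roughly like this. Representing subscriptions as dicts and using a strict "later than" comparison are assumptions made for the sketch.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional

# Timestamp fields named in the notes; every one is compared with `since`.
TIMESTAMP_FIELDS = ("subscription_changed", "guid_changed", "deleted")

Record = Dict[str, Optional[object]]


def changed_since(subscriptions: List[Record], since: datetime) -> List[Record]:
    """Return subscriptions where any timestamp field is later than `since`."""
    result = []
    for sub in subscriptions:
        stamps = [sub.get(name) for name in TIMESTAMP_FIELDS]
        if any(ts is not None and ts > since for ts in stamps):
            result.append(sub)
    return result
```

Because `deleted` is part of the check, a client that fetches with `since` learns about deletions without a separate call.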
14+
15+
Keunes to add a project goal/description to the [Index page](https://github.com/OpenPodcastAPI/api-specs/blob/main/docs/index.md) directly in the PR (use [MyST formatting](https://myst-parser.readthedocs.io/en/latest/)).
16+
17+
We'll call the specs 'pre-release' or 'ALPHA' until we have implemented all specs that we deem as 'required' for all servers. Ciarán will add a banner at the top of the pages to warn readers of this.
18+
19+
JonOfUs to add a GitHub Actions workflow for PRs to create and publish a preview of them (template [here](https://github.com/OpenPodcastAPI/api-specs/issues/28))
20+
21+
Once the above changes are reflected, we should merge the subscriptions endpoint spec to have something on the site.
22+
23+
We can use some Creative Commons license for this specification (tbd). Reference implementations can pick their own license (gPodder for Nextcloud & Funkwhale will have AGPL).
24+
25+
Ciarán will be on a podcast in early May; it would be good to have the Subscriptions endpoint merged by then.
26+
27+
## Future discussion
28+
29+
* Ensure that user data is separated by user ID
30+
* Outline what data can be shared and what is per-user data
31+
* Reflect these rules in the spec for multi-tenant and single-tenant servers
32+
* What calls are core/required; which ones are 'feature' ([GH discussion](https://github.com/orgs/OpenPodcastAPI/discussions/16))
33+
* Declaring versions & supported endpoints (well-known/other way; [Matrix](https://spec.matrix.org/v1.6/client-server-api/#capabilities-negotiation) e.g. does this at `$prefix/v1/capabilities`)
34+
35+
###### tags: `meeting` `project-management` `OpenPodcastAPI`

meeting-notes/2023-05-30.md

Lines changed: 153 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,153 @@
1+
2023-05-30 9pm in the middle of the night
2+
===
3+
4+
## Endpoints
5+
* `GET/PUT /episodes`
6+
* returns only episodes changed
7+
* parameter `since`
8+
* ~~`GET/PUT /episodes/{guid-hash}`~~
9+
* Don't allow this endpoint to prevent problems with duplicate GUIDs
10+
* `GET /subscriptions/{guid}/episodes`
11+
* parameter `since`
12+
* parameter `guid`?
13+
* `GET/PUT /subscriptions/{guid}/episodes/{fetch-hash}` (hash: SHA1?)
14+
* if fetch-hash clash, server expected to return BAD REQUEST
15+
* Hash here, because GUIDs can be any String
16+
17+
18+
We want to explain in the specs why we have endpoints 'under' subscriptions, and why we might refuse updates. (i.e. how this will help avoid gPodder API pitfalls.)
19+
20+
## Episode endpoint
21+
22+
The episode endpoint is required to synchronize playback positions and played status for specific episodes. At a minimum, the endpoint should accept and return the following:
23+
24+
1. The episode's **Podcast GUID** (most recent)
25+
2. The episode's **GUID** (sent by the client if found in the RSS feed, or generated by the server if not): String (not necessarily GUID/URL formatted).
26+
4. A **Status** field containing lifecycle statuses. E.g.:
27+
* `New`
28+
* `Played`
29+
* `Ignored`
30+
* `Queued`
31+
6. A **Playback position** marker, updated by a PUT request
32+
7. A **timestamp** of the last time the episode was played/paused (used for conflict resolution on the playback position)
33+
8. A **Favorite** field to mark episodes
34+
9. A **timestamp** for the last time some metadata (except playback position) was updated
35+
36+
We discussed whether it makes sense to use episode numbers, but it's not part of the feed, so we don't have this information and don't need it anyway.
37+
38+
https://www.rssboard.org/rss-specification#ltguidgtSubelementOfLtitemgt
39+
40+
41+
### Episode identification
42+
#### Fetch-hash vs GUID
43+
Discussion on whether to generate a new (static?) identifier per episode and use that for synchronisation (clients would have to store it additionally per episode?) or to use existing GUIDs as sync identifier and generate them if none is present (one endpoint then needs the GUIDs to be passed by their hash/base64 for REST compliance)
44+
45+
#### Fetch-hash
46+
Fetch-hash creation: SHA1/MD5 hash of
47+
1. `<guid>` https://www.rssboard.org/rss-specification#ltguidgtSubelementOfLtitemgt
48+
49+
x. `<link>` https://www.rssboard.org/rss-specification#hrelementsOfLtitemgt
50+
x. `<enclosure>` (aka media file URL) https://www.rssboard.org/rss-specification#ltenclosuregtSubelementOfLtitemgt
51+
52+
Priority of latter 2 tbd: `<link>` might be less likely to be unique, while `<enclosure>` might be less stable (more likely to change).
53+
54+
Consideration: why not BASE64? (REST-compliant, can be "unhashed", so hash wouldn't have to be stored on the server)
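
To make the hash-versus-Base64 trade-off concrete, here is a small sketch. The function names are hypothetical, SHA1 is just one of the candidates mentioned, and trying `<link>` before `<enclosure>` is an arbitrary choice since that priority is still tbd.

```python
import base64
import hashlib
from typing import Optional


def fetch_hash(guid: Optional[str], link: Optional[str],
               enclosure: Optional[str]) -> Optional[str]:
    """SHA1 hex digest of the first non-empty field, or None if all are empty."""
    # Assumption: guid first, then link, then enclosure (order of last two tbd).
    for value in (guid, link, enclosure):
        if value:
            return hashlib.sha1(value.encode("utf-8")).hexdigest()
    return None


def encode_identifier(value: str) -> str:
    """URL-safe Base64: usable in a path segment and reversible."""
    return base64.urlsafe_b64encode(value.encode("utf-8")).decode("ascii")


def decode_identifier(token: str) -> str:
    """Recover the original string, so nothing extra needs to be stored."""
    return base64.urlsafe_b64decode(token.encode("ascii")).decode("utf-8")
```

The hash has a fixed length but must be stored to be looked up later; the Base64 form grows with the input but can be decoded on the fly.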
55+
56+
Good practice/required: store all 3 (GUID, link, media file URL). This will allow for later matching of episodes if one or two of these are missing. For example, if a totally new client is connecting to a server, and an episode doesn't have a GUID and the `<link>` has changed, matching would still be possible based on media file URL. (If we don't do this, finding the right episode locally might be hard when receiving a fetch-hash that's not unique, or a GUID that's missing. We know the podcast and within each podcast there'll be only a limited set of 'wrong' episodes, so a client would only have to create hashes for a few episodes in order to find a match. But still, not very economical.)
57+
58+
<details>
59+
<summary markdown="span">Matching proposal in pseudo-code (click to expand)</summary>
60+
61+
```pseudo-code
are_episodes_equal(client-episode c, server-episode s):
    // this filters out any potential GUID duplicates
    if c.podcast_guid != s.podcast_guid then
        return False

    // if GUID is present, decide exclusively according to it
    if c.guid not empty then
        return c.guid == s.guid

    // if enclosure matches, probably the same (since they share the media file)
    if c.enclosure not empty && c.enclosure == s.enclosure then
        return True

    // case: no media file
    if c.enclosure empty then
        // no guid, enclosure or link -> not matchable
        if c.link empty then
            return False

        // no media file, but episode URL matches - very probably the same
        // (how large is the error here?)
        if c.link == s.link then
            return True

    // All other cases: not matching
    return False
```
89+
</details><br>
90+
91+
?? Each field that is empty/not present in the RSS is stored & sent empty. ~~The fetch-hash is only used when sending a request about a specific episode.~~ (that wouldn't work well in case of batch updates - see below) Payloads don't contain fetch-hashes, only the three separate fields.
92+
93+
Two options for identifying episodes in communication:
94+
[I don't think these are the only options, see [here](#Fetch-hash-vs-GUID)]
95+
* For each episode (e.g. in queue; batch update), all three fields/tags are included. Lot of (unnecessary) data exchange.
96+
* Each episode gets a calculated fetch-hash, which is used for communication. Clients can decide to store or generate on the fly. (Generating on-the-fly is dangerous, episode identifier should be static even if episode changes)
97+
98+
Server creates fetch-hash, similar to creation of Podcast GUID, based on the logic described above.
99+
100+
Why do we trust the server to create the hash, more than the client? Because for each person, there's probably just 1 server in the game, more likely multiple clients. So if the server messes it up, there's still a single outcome for each user.
101+
102+
#### GUID
103+
Why shouldn't the server just create a GUID (seed: available payloads or whole episode, can also be just random) and send this back to the client? (the client would map using `<enclosure>` and `<link>` and then store this GUID)
104+
[Advantage: less payload fields, only `<enclosure>`, `<link>` and `<guid>` and after first sync only `<guid>` (`guid-hash` only for `PUT /subs../{guid}/epis../{guid-hash}`)]
105+
[Further advantage: easier to implement for clients, they probably already have an `episode_guid` field in their DB]
106+
107+
Only create GUID if none is present, otherwise use existing one.
108+
Identify episode always by `podcast_guid`+`episode_guid` (e.g. when referencing queue items, settings, ...)
109+
[PodcastIndex seems to handle this [the same way](https://podcastindex-org.github.io/docs-api/#get-/episodes/byguid)]
110+
111+
The workflow if a new client connects could then be:
112+
1. Get subscriptions & fetch feeds
113+
2. Get episodes
114+
3. Feed with GUIDs: map by GUID
115+
4. Feed without GUIDs: map by matching algorithm [[above](#Matching-proposal-in-pseudo-code)], then store GUID from sync server
116+
117+
#### Deduplication
118+
119+
Two options:
120+
a. agree on a deduplication logic as part of the spec which is to be executed at server level (hard to 'enforce')
121+
b. let clients figure out deduplication, and spec the calls that will allow clients to merge episodes.
122+
123+
To be discussed further. Latter is easier for us :-)
124+
Latter should be in the spec in either case, so that we don't have to change the whole spec if some podcast feeds mess up in a way we never anticipated. Clients can adapt a lot faster.
125+
126+
#### New GUID/Fetch-hash logic
127+
Necessary for changing GUIDs, can also be used for deduplication?
128+
129+
Options:
130+
1. `PUT /episodes` with additional field `old_fetch-hash` (or `old_guid`)
131+
2. `PUT /subscriptions/{guid}/episodes/{guid-/fetch-hash}` with additional field `new_fetch-hash` (or `new_guid`)
132+
133+
Case where both episodes are contained in the feed (episode didn't change, but podcasters published twice): To mark duplicate, additional boolean `is_duplicate` so that the server handles `fetch-hash`/`guid` of both as aliases (tombstoning one, if one of them is requested, return aliases in field/array `aliases`/`duplicate_fetch-hashes/guids`)
134+
135+
In both cases, server changes fetch-hash/GUID of episode entry, sets `fetch-hash/GUID_changed` timestamp and creates tombstone for old value
136+
[On `GET /episodes`, old value is in `fetch-hash`/`guid` and new value in `new_fetch-hash/new_guid`, same behaviour as in Subscriptions]
137+
138+
Case to handle:
139+
1. Client 1 marks {`fetch-hash2`/`guid2`} as new guid of {`fetch-hash1`/`guid1`}
140+
2. Client 2 receives & stores this
141+
3. Client 2 marks {`fetch-hash1`/`guid1`} as new guid of {`fetch-hash2`/`guid2`}
142+
143+
(could happen through e.g. slightly different podcast feed, e.g. one feed contains MP3s, the other AACs, but podcast GUID is the same)
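
The notes do not settle how to resolve this case. As one possible starting point, a server could at least detect that a rename would close a loop before storing it. The sketch below assumes renames are kept as a simple old-to-new mapping; all names are hypothetical.

```python
from typing import Dict


class AliasCycleError(ValueError):
    """Raised when a rename would make two identifiers point at each other."""


def reaches(renames: Dict[str, str], start: str, target: str) -> bool:
    """True if following renames from `start` ever arrives at `target`."""
    seen = set()
    current = start
    while current not in seen:
        if current == target:
            return True
        seen.add(current)
        if current not in renames:
            return False
        current = renames[current]
    return False


def record_rename(renames: Dict[str, str], old_id: str, new_id: str) -> None:
    """Store old_id -> new_id unless that would create a rename loop."""
    if reaches(renames, new_id, old_id):
        raise AliasCycleError(f"{old_id} -> {new_id} would create a loop")
    renames[old_id] = new_id
```

With `guid1 -> guid2` already stored, Client 2's request to record `guid2 -> guid1` is rejected instead of being applied; what the server should answer in that situation is still open.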
144+
145+
146+
## Excursus Database Schema in the specs
147+
148+
* We should focus on the format of the communications, not how the database is stored
149+
* We have all field data types specified anyways in the API endpoint specification
150+
* We can leave the proposed database schema as an example
151+
152+
153+
###### tags: `project-management` `meeting-notes` `OpenPodcastAPI`

meeting-notes/2023-07-11.md

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,37 @@
1+
2023-07-11 20:00
2+
===
3+
4+
## Episode identification
5+
6+
Possible way forward for selecting ideal ['identification' (ID) for episodes](https://pad.funkwhale.audio/oCfs5kJ6QTu02d_oVHW7DA): write up test cases (examples of data gaps) > what satisfies all our test cases?
7+
8+
* rss feed without episode guids
9+
* rss feed with 2 duplicate guids
10+
* guid changes for a given episode in the rss feed
11+
* ...
12+
13+
Then make table.
14+
15+
We should probably add a warning, reminding that these cannot be used as the only indices in a database in a multi-user environment (users have different playback positions).
16+
17+
## Data
18+
19+
1. The episode's **Podcast GUID** (most recent)
20+
2. The episode's **GUID** (sent by the client if found in the RSS feed, or generated by the server if not): String (not necessarily GUID/URL formatted).
21+
4. A boolean **played** field / or a field (e.g. nested json) **state** containing information about the state this episode is currently in (like played, in_queue, ignored, ...)
22+
a. What is 'played' differs between clients (e.g. in AntennaPod you can set as played even if 20 seconds at end is skipped)
23+
b. Interaction with other potential states? (e.g. 'ignored') E.g. 'notified' (to avoid getting notifications on multiple devices). Need a list of statuses (& combinations) to keep track of, and then see which options (boolean, integer, nested booleans, etc) are best.
24+
c. Solution: define a set of states and explain those well
25+
6. Liked/Favourited
26+
7. A **Playback position** marker, updated by a PUT request
27+
8. A **time_played** counter, containing the total amount of seconds this episode was played
28+
9. A **timestamp** of the last time the episode was played/paused
29+
10. To resolve sync conflicts: dedicated timestamp for each of the fields? Or single timestamp for whole episode.
30+
a. Two timestamps: **last_played** (for conflict resolution on the playback position) and **metadata_changed** (for conflict resolution on all other episode information)
31+
~~b. One timestamp for everything~~
32+
~~c. Separate timestamps for each field~~ [too complicated]
33+
11. Episode length? (gpodder.net had this) TBD (cases with media files shorter like 30 sec when abroad, or when media files have ads removed after x-thousand downloads because podcaster gets paid only for first 10k)
34+
12. Any other markers (e.g. bookmarked playback positions; timed annotations)
35+
13. Ratings/Reviews (probably better as separate endpoint, referencing the episode)
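
The two-timestamp option (**last_played** for the playback position, **metadata_changed** for everything else) can be sketched as a merge function. The `EpisodeState` shape, the choice of fields, and the "newest wins, local wins ties" rule are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class EpisodeState:
    playback_position: int      # seconds
    last_played: datetime       # governs playback_position
    played: bool
    favorite: bool
    metadata_changed: datetime  # governs all other fields


def merge(local: EpisodeState, remote: EpisodeState) -> EpisodeState:
    """Resolve a sync conflict using the two timestamps independently."""
    # Playback position follows whichever side was played most recently.
    pos = local if local.last_played >= remote.last_played else remote
    # All other fields follow whichever side changed metadata most recently.
    meta = local if local.metadata_changed >= remote.metadata_changed else remote
    return EpisodeState(
        playback_position=pos.playback_position,
        last_played=pos.last_played,
        played=meta.played,
        favorite=meta.favorite,
        metadata_changed=meta.metadata_changed,
    )
```

This shows why two timestamps are enough for the common case: marking an episode as a favorite on one device does not overwrite a newer playback position from another.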
36+
37+
###### tags: `project-management` `meeting-notes` `OpenPodcastAPI`
