This started out as an attempt to tell people how to build their own literature bots in Microsoft Flow. But to be honest, the process is so complex and so prone to errors that I think it's crazy to even try to start from scratch. Thus, the notes are incomplete, but I'll leave them here anyway just in case they're useful one day.
So, dlvr.it has massively constrained its free option. In a hunt for another free option, I thought that Microsoft Power Automate might suffice. Many academics will have access to Microsoft Power Automate through an institutional subscription to Office 365, so here's a set of instructions for getting going.
They're a little complicated, but the principle is simple.
First, we use Power Automate to check each RSS feed periodically.
Then, we filter out just the new papers from each feed.
Finally, we post the new papers over the next time period, evenly spaced.
NB: I hate Power Automate, and you may come to hate it too. It's like programming without access to anything useful. It's worse than the Lego drag-and-drop programming thing that my kids and I use on the iPad.
NEVERTHELESS, here we're going to make a 'Flow'. Take a deep breath...
- Go to Power Automate.
- Click on "Create" > "Scheduled cloud flow."
- Name your flow "literature_bot_phypapers" or whatever the hell you like
- Set it to run every Day and click "Create."
NB: Pubmed gets updated once every 24 hours, and the rest of this flow assumes you only check it once every 24 hours. If you check it more often you'll get a lot of duplicate posts.
We'll set the variables that different people will want to change right at the top. This will make it easier to adapt this to different RSS feeds and/or people.
- Click on "+" and "Add an Action", then search for the "Initialize variable" action and select it. Set it up as follows:
  - Name: `BlueskyUsername` (e.g. `phypapers.bsky.social`)
  - Type: String
  - Value: Enter your Bluesky username.
- Add another "Initialize variable" action and set it up as follows:
  - Name: `BlueskyAPIPassword` (should be something with alphanumeric characters in the form `xxxx-xxxx-xxxx-xxxx`)
  - Type: String
  - Value: Enter your Bluesky API password.
- Add another "Initialize variable" action and set it up as follows:
  - Name: `RssURL`
  - Type: String
  - Value: Enter the URL for your RSS feed (e.g. mine is `https://pubmed.ncbi.nlm.nih.gov/rss/search/1pSbSzklLaRDgrBBecLaHXjj_NtDB256CbB-lTk3MQA9gZRkc4/?limit=100&utm_campaign=pubmed-2&fc=20240525000654`)
- Click on "+" and "Add an Action"
- Search for "RSS" and select "List all feed items."
- Configure the action:
  - Feed URL: click the lightning bolt and select the variable `RssURL` which you set earlier
- Click on "+" and "Add an Action"
- Search for "Filter array" and select it.
- Configure the action:
  - Name: Call it `FilterArray`
  - From: Select `body` from the "List all feed items" action.
  - Condition:
    - In the left box, click the `fx` and paste this into the text box: `formatDateTime(item()?['publishDate'], 'yyyy-MM-dd')`
    - Choose `is greater or equal to` for the operator.
    - In the right box, click the `fx` and paste this into the text box: `formatDateTime(addDays(utcNow(), -1), 'yyyy-MM-dd')`
This keeps only the papers in the RSS feed that have been added to it in the last 24 hours, which stops us double posting (the feed will always have 100 items, but not all of them will necessarily be new each day).
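For anyone who finds the filter condition easier to read as ordinary code, here is a minimal Python sketch of the same comparison. It assumes each feed item is a dict with an ISO 8601 `publishDate` string that carries a timezone offset; `keep_recent` and the sample items are hypothetical names for illustration, not part of Power Automate.

```python
from datetime import datetime, timedelta, timezone


def keep_recent(items, now=None):
    """Keep items whose publish date (UTC, day precision) is yesterday or later."""
    now = now or datetime.now(timezone.utc)
    # Mirrors formatDateTime(addDays(utcNow(), -1), 'yyyy-MM-dd')
    cutoff = (now - timedelta(days=1)).strftime("%Y-%m-%d")
    kept = []
    for item in items:
        # Assumption: timestamps are ISO 8601 with an offset; a trailing "Z" is normalised.
        published = datetime.fromisoformat(item["publishDate"].replace("Z", "+00:00"))
        published_day = published.astimezone(timezone.utc).strftime("%Y-%m-%d")
        # yyyy-MM-dd strings sort in date order, so a string comparison is enough
        if published_day >= cutoff:
            kept.append(item)
    return kept


if __name__ == "__main__":
    sample = [
        {"title": "new paper", "publishDate": "2024-05-25T01:00:00+00:00"},
        {"title": "old paper", "publishDate": "2024-05-20T09:00:00+00:00"},
    ]
    fixed_now = datetime(2024, 5, 25, 12, 0, tzinfo=timezone.utc)
    print([item["title"] for item in keep_recent(sample, fixed_now)])
```

Because both sides are formatted down to the day, the comparison works on calendar dates rather than an exact 24-hour window, so an item published early yesterday still passes.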
Let's aim to post everything we've got within 23 hours.
- Add a "Compose" action after filtering the RSS feed.
  - Name: `PostCount`
  - Inputs: click the blue `fx` and enter this in the text box: `length(body('FilterArray'))`
- Add a "Compose" action after getting the post count.
  - Name: `MinutesBetweenPosts`
  - Inputs: click the blue `fx` and enter this in the text box: `div(1380, outputs('PostCount'))`
This will allow us to trickle out our posts over a ~23 hour period.
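The spacing arithmetic is just 23 hours (1380 minutes) divided by the number of new papers. A minimal Python sketch follows; `minutes_between_posts` is a hypothetical helper, and the guard for zero posts is an assumption added here, since the `div` expression above has no such check and a day with no new papers would mean dividing by zero.

```python
def minutes_between_posts(post_count, window_minutes=1380):
    """Integer minutes to wait between posts so they fill the window evenly."""
    if post_count <= 0:
        # Assumption: nothing to post, so no delay is needed
        return 0
    # Integer division, matching div() on two integers
    return window_minutes // post_count


if __name__ == "__main__":
    for count in (1, 7, 10, 100):
        print(count, "posts ->", minutes_between_posts(count), "minutes apart")
```

With 10 new papers this gives 138 minutes between posts; with 100 it gives 13.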
- Add an "HTTP" action and call it `GetAccessToken`
  - Method: POST
  - URI: `https://bsky.social/xrpc/com.atproto.server.createSession`
  - Headers:
    - Content-Type: application/json
  - Body: `{ "identifier": "@{variables('BlueskyUsername')}", "password": "@{variables('BlueskyAPIPassword')}" }`
- Add a "Parse JSON" action.
  - Content: click the lightning bolt and choose `body` of `GetAccessToken`
  - Schema: `{ "type": "object", "properties": { "accessJwt": { "type": "string" }, "refreshJwt": { "type": "string" } } }`
- Add an "Initialize variable" action and call it `AccessToken`
  - Type: String
  - Value: use the lightning bolt and select `Body accessJwt` from Parse JSON
- Add an "Initialize variable" action and call it `RefreshToken`
  - Type: String
  - Value: use the lightning bolt and select `Body refreshJwt` from Parse JSON

We need these tokens later to post to Bluesky.
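The login step is a single JSON POST followed by picking two fields out of the response. Here is a minimal Python sketch using only the standard library; the helper names `build_create_session_request` and `extract_tokens` are hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

CREATE_SESSION_URL = "https://bsky.social/xrpc/com.atproto.server.createSession"


def build_create_session_request(username, app_password):
    """Build (but do not send) the POST that the GetAccessToken action makes."""
    payload = json.dumps({"identifier": username, "password": app_password})
    return urllib.request.Request(
        CREATE_SESSION_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_tokens(response_body):
    """Pull the two tokens out of the JSON response, as the Parse JSON step does."""
    session = json.loads(response_body)
    return session["accessJwt"], session["refreshJwt"]


if __name__ == "__main__":
    request = build_create_session_request("phypapers.bsky.social", "xxxx-xxxx-xxxx-xxxx")
    print(request.get_method(), request.full_url)
    # To actually log in you would pass the request to urllib.request.urlopen()
    # and hand the response body to extract_tokens().
```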
- Add an "Apply to each" action.
  - Value: use the lightning bolt to select the FilterArray `body`
  - Name: `PostToBluesky`
- Inside the "Apply to each" action, add a "Compose" action.
  - Name: `CurrentPaper`
  - Inputs: use the lightning bolt to select the PostToBluesky `Current Item`
- Next, add a "Compose" action to get the title.
  - Name: `Title`
  - Inputs: select the blue `fx` and in the code box put `item()?['title']`
- Next, add a "Compose" action to strip HTML tags from the title.
  - Name: `CleanTitle`
  - Inputs: select the blue `fx` and in the code box put `join(xpath(xml(concat('<root>', outputs('Title'), '</root>')), '//text()'), '')`
- Next, add a "Compose" action to truncate the title if it's longer than 260 characters.
  - Name: `ShortTitle`
  - Inputs: select the blue `fx` and in the code box put `if(greater(length(outputs('CleanTitle')), 260), substring(outputs('CleanTitle'), 0, 260), outputs('CleanTitle'))`
- Next, add a "Compose" action to get the link.
  - Name: `Link`
  - Inputs: select the blue `fx` and in the code box put `item()?['primaryLink']`
- Next, add a "Compose" action to take the crud off the link.
  - Name: `CleanLink`
  - Inputs: select the blue `fx` and in the code box put `split(outputs('Link'), '?')[0]`
- Inside the "PostToBluesky" loop, add a "Compose" action after the `ShortTitle` and `CleanLink` actions.
  - Name: `PostContent`
  - Inputs: `"@{concat(outputs('ShortTitle'),' ',outputs('CleanLink'))}"`
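The chain of Compose actions above (strip tags, truncate, clean the link, join) can be sketched in a few lines of Python. This is an illustration only: the regex tag-stripper is a stand-in for the `xml`/`xpath` trick rather than an exact equivalent, and the function names are hypothetical.

```python
import re
from html import unescape


def clean_title(raw_title):
    """Drop HTML tags such as <i>...</i> and decode entities like &amp;."""
    without_tags = re.sub(r"<[^>]+>", "", raw_title)
    return unescape(without_tags).strip()


def shorten(title, limit=260):
    """Truncate to at most `limit` characters, as the ShortTitle step does."""
    return title[:limit]


def clean_link(link):
    """Remove the query string (tracking parameters) from the link."""
    return link.split("?")[0]


def post_content(raw_title, link):
    """Join the shortened title and cleaned link with a single space."""
    return f"{shorten(clean_title(raw_title))} {clean_link(link)}"


if __name__ == "__main__":
    print(post_content(
        "Effects of <i>in vivo</i> loading &amp; rest",
        "https://pubmed.ncbi.nlm.nih.gov/12345/?utm_source=feed",
    ))
```

The 260-character cap on the title leaves room for the space and a short link within Bluesky's post length limit.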
Access tokens don't last for long, so we need to refresh the token each time we post.
- Inside the "PostToBluesky" loop, add an "HTTP" action.
  - Name: `RefreshAccessToken`
  - Method: POST
  - URI: `https://bsky.social/xrpc/com.atproto.server.refreshSession`
  - Headers:
    - Accept: application/json
    - Authorization: `Bearer @{variables('RefreshToken')}`
- Add a "Parse JSON" action.
  - Name: `ParseRefreshResponse`
  - Content: click the lightning bolt and choose `body` of `RefreshAccessToken`
  - Schema: `{ "type": "object", "properties": { "accessJwt": { "type": "string" } } }`
- Inside the "PostToBluesky" loop, add an "HTTP" action after the `PostContent` action.
  - Name: `PostToBlueskyAPI`
  - Method: POST
  - URI: `https://bsky.social/xrpc/com.atproto.repo.createRecord`
  - Headers:
    - Content-Type: application/json
    - Authorization: `Bearer @{variables('AccessToken')}`
  - Body: `{ "collection": "app.bsky.feed.post", "repo": "@{variables('BlueskyUsername')}", "record": { "$type": "app.bsky.feed.post", "text": "@{outputs('PostContent')}", "facets": [ { "index": { "byteStart": @{add(length(outputs('ShortTitle')), 1)}, "byteEnd": @{length(outputs('PostContent'))} }, "features": [ { "$type": "app.bsky.richtext.facet#link", "uri": "@{outputs('CleanLink')}" } ] } ], "createdAt": "@{utcNow()}" } }`
- Add a "Set variable" action to update the access token.
  - Name: `AccessToken`
  - Value: click the lightning bolt and choose `body accessJwt` of `ParseRefreshResponse`
- Add a "Delay" action.
  - Count: select the blue lightning bolt and choose the `Outputs` of the `MinutesBetweenPosts` action
  - Unit: Minute
This will make the bot wait, so the papers trickle out over ~23 hours.
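The facet in the `PostToBlueskyAPI` body marks which part of the post text is the link. Bluesky facet offsets count UTF-8 bytes, not characters, so a title containing accented or other non-ASCII characters needs byte lengths for the link to land in the right place. Here is a minimal Python sketch of the record; `build_post_record` is a hypothetical helper, and it assumes the link is described by a `features` entry of type `app.bsky.richtext.facet#link`.

```python
from datetime import datetime, timezone


def build_post_record(title, link, now=None):
    """Build a post record whose facet covers exactly the bytes of the link."""
    text = f"{title} {link}"
    # Offsets are measured in UTF-8 bytes: title bytes plus one for the space
    byte_start = len(title.encode("utf-8")) + 1
    byte_end = len(text.encode("utf-8"))
    created = (now or datetime.now(timezone.utc)).isoformat().replace("+00:00", "Z")
    return {
        "$type": "app.bsky.feed.post",
        "text": text,
        "facets": [
            {
                "index": {"byteStart": byte_start, "byteEnd": byte_end},
                "features": [
                    {"$type": "app.bsky.richtext.facet#link", "uri": link}
                ],
            }
        ],
        "createdAt": created,
    }


if __name__ == "__main__":
    record = build_post_record("Café study", "https://example.org/1")
    index = record["facets"][0]["index"]
    # Slicing the encoded text with the facet offsets should give back the link
    print(record["text"].encode("utf-8")[index["byteStart"]:index["byteEnd"]].decode("utf-8"))
```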