Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add api endpoint to add citations to an item by libraryID and itemID #148

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

Dominic-DallOsto
Copy link
Collaborator

@Dominic-DallOsto Dominic-DallOsto commented Jan 6, 2022

Fixes #144

Adds the endpoint /cita/citation/add by which citations (listed by itemKey) can be added to an item (identified by libraryID and itemKey)

An example request in python:

requests.post('http://127.0.0.1:23119/cita/citation/add', json={'libraryID':'1','itemKey':"5ZKBTFW2",'citedItemKeys':["HQPJ4N8V","SCN78LKP"]})

src/zoteroOverlay.jsx Outdated Show resolved Hide resolved
@cboulanger
Copy link

cboulanger commented Jan 7, 2022

Unfortunately, I am having CORS issues: trying to access the connector endpoint from a python script running in a virtual machine (Windows WSL2) fails because the connector server only seems to accept connections from localhost. Similarly, I cannot connect to the connector server from a script running in another window served from a different port. I fear that's probably nothing that can be changed, since the Zotero people will probably not add a more liberal CORS header.

@retorquere
Copy link

Web browsers cannot connect, but a python script should be fine, I do it all the time.

@cboulanger
Copy link

Web browsers cannot connect, but a python script should be fine, I do it all the time.

My python cgi script tries to connect from 127.0.1.1 (WSL2 Ubuntu VM) to the Zotero server listening on 127.0.0.1 and it refuses the connection. Maybe it is a firewall issue, though. I'll switch to a Mac where it should work.

@retorquere
Copy link

Try setting another agent header

@cboulanger
Copy link

cboulanger commented Jan 8, 2022

Ok, I got it working but soon discovered that the existing connector API does not provide the methods I need for programmatically adding the reference data that I want to link with cita (or I didn't find them in docs or in the source), such as methods to lookup items in order to avoid duplicates, or to create them if they don't exist already.

Would you consider adding the endpoints that I put here to your PR? I tested them by pasting the code into the "Run JavaScript" console, which works well. Although they do what they should, they are not 100% ready yet because I want to test them in my workflow first (maybe I need another method or additional data). But maybe you can already have a look now.

@cboulanger
Copy link

@Dominic-DallOsto One comment regarding parameter names: Maybe it would be better to rename itemID => itemKey and citedItemsIDs => citedItemsKeys because internally, Zotero uses ID for the internal primary ids stored in the Sqlite database, whereas the alphanumeric "keys" are part of the item data.

@cboulanger
Copy link

@Dominic-DallOsto I have added more connector endpoints: https://github.com/cboulanger/excite-docker/blob/main/zotero/zoteroOverlay.jsx - maybe it makes sense that I create a PR to add them after your PR has been merged. They add capabilities that are essential to work programmatically with the cita plugin from outside Zotero.

@Dominic-DallOsto
Copy link
Collaborator Author

@Dominic-DallOsto One comment regarding parameter names: Maybe it would be better to rename itemID => itemKey and citedItemsIDs => citedItemsKeys because internally, Zotero uses ID for the internal primary ids stored in the Sqlite database, whereas the alphanumeric "keys" are part of the item data.

That makes sense!

@Dominic-DallOsto I have added more connector endpoints: https://github.com/cboulanger/excite-docker/blob/main/zotero/zoteroOverlay.jsx - maybe it makes sense that I create a PR to add them after your PR has been merged. They add capabilities that are essential to work programmatically with the cita plugin from outside Zotero.

Thanks, that looks great! Actually I was thinking maybe this would make more sense as a separate extension, because most of the functionality is for working with Zotero data generally and it might be something others who don't necessarily use Cita would be interested in? Then I think anything specific to citations should be part of Cita. What do you think? Would you like to do up an extension? Otherwise I can quickly do it tomorrow.

@cboulanger
Copy link

Thanks, that looks great! Actually I was thinking maybe this would make more sense as a separate extension, because most of the functionality is for working with Zotero data generally and it might be something others who don't necessarily use Cita would be interested in? Then I think anything specific to citations should be part of Cita. What do you think? Would you like to do up an extension? Otherwise I can quickly do it tomorrow.

You are right that these features go beyond Cita and should really be part of a more general set of query features for the connector (and there are lots more that I would wish to be available). I found the prerequisites of plugin development a bit involved and have shied away from it so far, so if you would set up the infrastructure for a separate plugin, that would be really great and I would be happy to contribute to it!

@retorquere
Copy link

If you don't mind typescript, I have a yeoman generator for zotero plugins which scaffolds a plugin plus the infrastructure to do releases.

@Dominic-DallOsto
Copy link
Collaborator Author

Thanks! I'll whip this up quickly.

@cboulanger
Copy link

If you don't mind typescript, I have a yeoman generator for zotero plugins which scaffolds a plugin plus the infrastructure to do releases.

That sounds wonderful! I really like TypeScript because it catches so many bugs in advance and is self-documenting.

@cboulanger
Copy link

Thanks! I'll whip this up quickly.

Thank you. We can then open up issues discussing which endpoints should be available and what data they should return.

@Dominic-DallOsto
Copy link
Collaborator Author

If you don't mind typescript, I have a yeoman generator for zotero plugins which scaffolds a plugin plus the infrastructure to do releases.

I just posted this issue retorquere/generator-zotero-plugin#20 I'm having with running the plugin

@Dominic-DallOsto
Copy link
Collaborator Author

@cboulanger There's an initial version up here. Hopefully I didn't break anything. And it should be possible to tidy up a lot of the checks with Typescript.

@cboulanger
Copy link

@Dominic-DallOsto Really great job - so fast and beautiful code, too! Looking forward to trying it out and I'll certainly contribute some more endpoints. It is great to have a place to add them. I have always been missing a fast and reliable way of scripting Zotero in a cross-platform way. The zotero.org web API is great and everything, but this is much more versatile (and extensible, too).

@Dominic-DallOsto Dominic-DallOsto force-pushed the Dominic-DallOsto/issue144 branch from 585af77 to 19d96ae Compare January 22, 2022 14:49
@Dominic-DallOsto
Copy link
Collaborator Author

Updated now with new methods - see what you think / if it works for your workflow

@Dominic-DallOsto Dominic-DallOsto marked this pull request as ready for review January 26, 2022 10:01
@Dominic-DallOsto
Copy link
Collaborator Author

Ok, I think this is finished and ready for review now. The only additional change is that when adding a citation to a Zotero item, that Zotero item will be automatically linked in the citation.

@diegodlh
Copy link
Owner

diegodlh commented Jan 26, 2022

Hi all! Thanks for the work you've put into this. I wasn't aware of the extensibility of the Zotero HTTP server, and I like to see that Cita can be integrated into automated workflows through it. And I really like the brand new zotero-api-endpoint plugin. Congratulations!

I have reviewed the changes in the pull request and I generally like them. I do have some comments, though:

First, I like that you've put the initialization code in the init function of the zoteroOverlay object (src/zoteroOverlay.jsx). It seems to be the earliest point in the extension's bootstrapping process where we have access to our modules.

Then, I think that the specifics of the addItemCitations and deleteItemCitations should be implemented elsewhere, in the SourceItemWrapper class, where they can be reused in the future, for example to address #39. I think we should move most of the code to a couple of functions addCitationsByKey and deleteCitationsByIndex there.

Regarding the addCitationsByKey method, it creates Citation objects and adds them to the citing item using its addCitations method. However, because these citations are linked to Zotero items by passing their key to the Citation constructor, rather than using the Citation's linkToZoteroItem method, the citing and cited items are not marked as "related" in Zotero. We may call the Citation's linkToZoteroItem method (instead of passing the cited item key to the Citation constructor), but I think doing this before adding the Citations to the source/citing item (via addCitations) may be a bad idea. This is probably a design flaw in Cita (I'll post a separate issue: #156)

To solve this, I would suggest that we change the SourceItemWrapper class' addCitations method to:

  1. create a temporary array of Zotero keys in the Citation objects being added, and
  2. use these keys collected to mark citing and cited items as related (if not yet marked so).

This would also work for links that may have been established by manually editing (discouraged) the Citations note attachment (instead of using the GUI). I'll post a separate issue for this too: #157.

On the other hand, regarding the deleteCitationsByIndex method, I see that it uses the citing item's deleteCitation method. This is OK. However, when this method is called with sync=true it also removes the remote citation link in Wikidata (if known). This may be unexpected by the user. For example, when citations are removed using the GUI, a confirmation dialog is shown asking the user whether they want to remove the remote citation link as well. I think that in this case we should make explicit that we are calling deleteCitation with sync=false.

Regarding the endpoints, I think we should:

  • have a single endpoint for citations (/cita/citations) accepting GET and POST (and DELETE?) methods
  • specify source/citing item's library ID and Zotero key as query string parameters: ?libraryID=xx&itemKey=xx
  • to get citations for an item one would GET /cita/citations and specify the library ID and item key as query string parameters. We may have an additional format parameter indicating the citation list format desired. The current implementation may be the cita format, but future additions may include bibtex, etc.
  • to add citations for an item one would POST the citations to be added to /cita/citations, again specifying library ID and (citing) item key as query string parameters. The citations to be added may still be specified in the body of the request (as in your current implementation). In the future we may use the same endpoint to add citations in other ways, depending on the parameters provided in the request body. For example, POSTing cited item full metadata, instead of cited item Zotero key.
  • to delete citations I thought of using a DELETE request, but I understand that whether DELETE requests should have request bodies or not is debatable. Instead, we may use a POST request as for adding citations (above) and maybe have an action query string parameter indicating whether we want to add or delete citations. In this case, the indices of the citations to be deleted may be sent in the request body (as in your current implementation), leaving room in the future for other ways of indicating citations to be deleted (e.g., by cited item Zotero key).

Finally, do you think we may provide basic API documentation in a root /cita endpoint? Ideally I was thinking of something like Swagger, but I don't know if this is easy or even possible. At least, I would include some basic documentation in the README, so these API features are somehow discoverable.

I think that pretty much covers everything I thought while reviewing this PR. In short:

  • Move addCitationsByKey and deleteCitationsByIndex to SourceItemWrapper
  • Update Zotero item relations inside SourceItemWrapper's addCitations
  • Call SourceItemWrapper's deleteCitations with explicit sync=false
  • Update endpoint paths
  • Basic API documentation

I'll try to reply to future comments and changes as soon as possible! Thank you again!!

@cboulanger
Copy link

@diegodlh Thank you for your thoughful review, I think it is a very good idea to integrate the target feature better into the existing cita infrastructure. As to your suggestion to align the requests better with the HTTP methods, that is certainly better practice than putting everything into the POST data. However, in terms of API documention/service discovery I think separating request data into a GET querystring and POST data makes validation and documentation more complicated.

In zotero-api-endpoint we use some wizardry to convert the typescript information into json-schema files which can then be used to validate the input and could further be transformed into OpenApi/Swagger information with little cost. I wonder if that's something that could be interesting for Cita, although it might be an overkill for three service methods.

@cboulanger
Copy link

cboulanger commented Jan 26, 2022

Thinking about it some more, the question for me is if the effort of making the API RESTful, given a server that doesn't let you parse out dynamic URI components (such as /cita/citations/{libraryID}/{itemKey}) really provides major advantages over treating the enpoints more like RPC endpoints which receive POSTed JSON data without caring so much about the HTTP protocol. Then all you have to care about is the structure of the JSON data, which makes validation much easier. In Cita, and in the case of the present PR it's really not so much of deal, since we're talking about three endpoints, but I am thinking about it because it's a very relevant question for the zotero-api-endpoint add-on, which potentially could have a larger number of endpoints exposing major parts Zotero's javascript API.

@diegodlh
Copy link
Owner

diegodlh commented Jan 27, 2022

I think separating request data into a GET querystring and POST data makes validation and documentation more complicated.

Regarding this, I also liked the idea of using /cita/citations/{libraryID}/{itemKey} endpoints, but I also assumed it wouldn't be possible with Zotero's builtin endpoint registration.

it might be an overkill for three service methods.

Yes, regarding documentation I agree that doing something very complicated would not be urgent right now. I would at least provide some docs in the Readme or in a landing page at the root /cita endpoint.

the endpoints are separated from the implementations, so the API itself can be versioned and improved.

I'm quoting part of your email yesterday, concerning the zotero-api-endpoint plugin, I believe. I think it applies here too. I think I prefer an initial API that looks more like how we want it to be in the long term. However, if making some of the changes I suggested would delay merging this PR too long, I guess we may open a separate issue and address these in the future.

@cboulanger
Copy link

It does in fact not add the owl:related relationship between citing and cited items. Since I am already processing data with it, it would be good if

  • this relationship could be added programmatically with a separate endpoint
  • or if re-linking an item was possible without creating a new link (which currently happens), just adding the relationship

@Dominic-DallOsto
Copy link
Collaborator Author

Dominic-DallOsto commented Jan 30, 2022

Thanks for the super detailed review - I'll go through step by step:

Then, I think that the specifics of the addItemCitations and deleteItemCitations should be implemented elsewhere, in the SourceItemWrapper class, where they can be reused in the future, for example to address #39. I think we should move most of the code to a couple of functions addCitationsByKey and deleteCitationsByIndex there.

That makes a lot of sense, yep!

Regarding the addCitationsByKey method, it creates Citation objects and adds them to the citing item using its addCitations method. However, because these citations are linked to Zotero items by passing their key to the Citation constructor, rather than using the Citation's linkToZoteroItem method, the citing and cited items are not marked as "related" in Zotero. We may call the Citation's linkToZoteroItem method (instead of passing the cited item key to the Citation constructor), but I think doing this before adding the Citations to the source/citing item (via addCitations) may be a bad idea. This is probably a design flaw in Cita (I'll post a separate issue: #156)

Yep - will follow up in #157. A solution that works when passing in Zotero keys to the constructor would be most idiomatic if possible I think.

On the other hand, regarding the deleteCitationsByIndex method, I see that it uses the citing item's deleteCitation method. This is OK. However, when this method is called with sync=true it also removes the remote citation link in Wikidata (if known). This may be unexpected by the user. For example, when citations are removed using the GUI, a confirmation dialog is shown asking the user whether they want to remove the remote citation link as well. I think that in this case we should make explicit that we are calling deleteCitation with sync=false.

That makes sense as well!

  • have a single endpoint for citations (/cita/citations) accepting GET and POST (and DELETE?) methods
  • specify source/citing item's library ID and Zotero key as query string parameters: ?libraryID=xx&itemKey=xx

I'm not sure if it's possible to mix query string parameters and post data. I'll investigate this more here (Dominic-DallOsto/zotero-api-endpoint#14) - but initially reading the docs, it looks like we can only get either the query string (if it's a GET request) or the post data (if it's a POST request) but not both. I'm not sure if DELETE requests are supported - I will have to check.

  • to get citations for an item one would GET /cita/citations and specify the library ID and item key as query string parameters. We may have an additional format parameter indicating the citation list format desired. The current implementation may be the cita format, but future additions may include bibtex, etc.

That sounds good. Would relate to #4 #125.

Finally, do you think we may provide basic API documentation in a root /cita endpoint? Ideally I was thinking of something like Swagger, but I don't know if this is easy or even possible. At least, I would include some basic documentation in the README, so these API features are somehow discoverable.

Definitely docs would be a good idea. We'll think about what this should look like.

@cboulanger
Copy link

I'm not sure if it's possible to mix query string parameters and post data. I'll investigate this more here (Dominic-DallOsto/zotero-api-endpoint#14) - but initially reading the docs, it looks like we can only get either the query string (if it's a GET request) or the post data (if it's a POST request) but not both. I'm not sure if DELETE requests are supported - I will have to check.

Given the limitations of the Zotero server, I would suggest that we stick to GET/POST as it is now and improve in the case that the Zotero devs improve the server to process requests in a more RESTful way.

@diegodlh
Copy link
Owner

diegodlh commented Feb 7, 2022

or if re-linking an item was possible without creating a new link (which currently happens), just adding the relationship

@cboulanger, this should be addressed with #157.

I think we should move most of the code to a couple of functions addCitationsByKey and deleteCitationsByIndex there.

I think I was partly wrong here: deleteCitationsByIndex is already a source item method (deleteCitation). Maybe we should just make it accept multiple indices. Calling it multiple times may have unexpected consequences, because citation indices may change after a citation has been deleted.

I'm not sure if it's possible to mix query string parameters and post data.

Apparently it is. See my last comment in Dominic-DallOsto/zotero-api-endpoint/issues/14

I'm not sure if DELETE requests are supported - I will have to check.

Just checked: they are not supported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants