layout | title | nav_order | permalink |
---|---|---|---|
default | ADR-0018 | 21 | records/0018/ |
- Status: accepted
- Decider(s):
  - Andrew Berger
  - Infrastructure team
- Date(s):
  - Accepted: 2023-09-01
This work is motivated by the need to improve performance in Argo, especially for cocina objects with large numbers of files. A cocina object with 5000 files took 12 seconds to render in Argo. The contributing factors are:
- The time taken to retrieve the object from the database (timed at 3.5 seconds).
- The time to instantiate and validate the cocina object in both DSA and Argo.
- The time to send the cocina object over the network.
- Argo making 5 requests for a cocina object to render a single item details page.

The decision drivers are:
- The performance improvement for rendering cocina objects in Argo.
- The amount of new machinery / infrastructure that must be put into place to address the problem.
- The opportunity to reuse new machinery / infrastructure to address other use cases.

The options considered are:
- Add a GraphQL endpoint to DSA to support partial retrieval.
- Extend the existing REST endpoints to support partial retrieval.
- Cache cocina objects in Argo.
Partial retrieval improves performance by retrieving the large parts of the cocina object (viz., the structural) only when they are needed. A partial retrieval strategy requires changes to the cocina models gem to allow cocina objects to omit required attributes.
Caching improves performance by reducing the number of times that a cocina object must be retrieved.
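
To make the idea of omitting required attributes concrete, here is a minimal Ruby sketch under stated assumptions. It does not use the cocina models gem: `REQUIRED_ATTRIBUTES`, `build_object`, the `partial:` flag, and the example druid are all hypothetical names chosen for illustration only.

```ruby
# Hypothetical, simplified stand-in for cocina validation; not the cocina models API.
REQUIRED_ATTRIBUTES = %i[external_identifier label structural].freeze

class MissingAttributesError < StandardError; end

# Returns a frozen copy of attrs after checking required attributes.
# With partial: true, required attributes (such as the large structural) may be omitted.
def build_object(attrs, partial: false)
  missing = REQUIRED_ATTRIBUTES - attrs.keys
  if missing.any? && !partial
    raise MissingAttributesError, "missing required attributes: #{missing.join(', ')}"
  end

  attrs.dup.freeze
end

# A partial object that carries only what a summary view might need.
summary = build_object(
  { external_identifier: 'druid:bc123df4567', label: 'Example item' },
  partial: true
)
puts summary.key?(:structural)
```

The same call without `partial: true` raises `MissingAttributesError`, which mirrors why a strict model would reject a partially retrieved object unless it is changed to tolerate omissions.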
The chosen option is to add a GraphQL endpoint to DSA, because:
- Partial retrieval is a core feature of GraphQL.
- GraphQL has a well-supported Rails engine (graphql-ruby), so adding it to the DSA Rails app was not a significant burden.
- GraphQL can potentially be reused to address other use cases.
Using a GraphQL endpoint reduces the time to render a cocina object with a large number of files to under a second.
An additional component was added to DSA, which is already a large codebase.
A GraphQL endpoint allows a client to request only the parts of a cocina object that it needs. In turn, the GraphQL server retrieves only the requested parts from the database.
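
As an illustration of requesting only part of an object, the sketch below builds the JSON body that a client might POST to a GraphQL endpoint. The query shape, the `cocinaObject` field, its `externalIdentifier` argument, and the example druid are assumptions and do not reflect DSA's actual schema. Only the `query` and `variables` keys follow the common GraphQL-over-HTTP convention.

```ruby
require 'json'

# Hypothetical query: the field and argument names are assumptions, not DSA's schema.
# The large structural part is simply not requested.
QUERY = <<~GRAPHQL
  query($externalIdentifier: String!) {
    cocinaObject(externalIdentifier: $externalIdentifier) {
      externalIdentifier
      label
      version
    }
  }
GRAPHQL

# Builds the JSON request body for the given druid.
def request_body(druid)
  JSON.generate({ query: QUERY, variables: { externalIdentifier: druid } })
end

puts request_body('druid:bc123df4567')
```

Because the selection set names only small fields, a server that resolves fields lazily has no reason to load the file listing for this request.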
- Pro: Partial retrieval is a core feature of GraphQL.
- Pro: GraphQL has a well-supported Rails engine.
- Pro: In the future, GraphQL may address other possible use cases, including patching cocina objects (instead of the current approach of updating an entire cocina object) and providing a facade over multiple APIs / services (to reduce the number of calls that are required to retrieve all of the data for a cocina object, reduce client complexity, and make it easier to split out services).
- Con: Requires adding an additional component to DSA.
- Con: The Infra team is not familiar with GraphQL.
- Con: graphql-ruby is split into free and pro versions. As our usage increases, it may be necessary to license the pro version (similar to our licensing of Sidekiq Pro).

For the option of extending the existing REST endpoints to support partial retrieval:
- Pro: Does not require adding additional components to DSA.
- Con: Does not address other possible use cases.
Rails integrates with several caching solutions, which provide ways to cache cocina objects in Argo and so minimize the number of requests made to DSA.
- Pro: Confines changes to Argo.
- Pro: Uses machinery that is already part of Rails.
- Con: Because of the way that Argo is deployed (multiple processes on multiple servers), the recommended approach is to use memcache. However, a memcache system is significant infrastructure and is not supported by Ops.
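
The effect of caching on the number of requests can be sketched in plain Ruby. This is only an illustration under stated assumptions: `TinyCache` is a hypothetical in-process stand-in that exposes the same fetch-with-block shape as `Rails.cache.fetch`, and `CountingLoader` is a hypothetical substitute for a request to DSA.

```ruby
# Minimal in-process cache; a hypothetical stand-in for a Rails cache store.
class TinyCache
  def initialize
    @store = {}
  end

  # Returns the cached value for key, or runs the block once and stores its result.
  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

# Hypothetical stand-in for a request to DSA; counts how often it is called.
class CountingLoader
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def call(druid)
    @calls += 1
    { external_identifier: druid }
  end
end

cache = TinyCache.new
loader = CountingLoader.new
druid = 'druid:bc123df4567'

# Five lookups, as when rendering a single item details page.
5.times { cache.fetch(druid) { loader.call(druid) } }
puts loader.calls # prints 1
```

A cache like this lives inside one process, so it would not be shared across Argo's multiple processes and servers. That limitation is what leads to the shared memcache recommendation noted above.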