Make our CSRD processing code easier to make plugins from #70
This PR is the result of me rethinking how we use the Arelle XBRL parsing library in the carbon.txt validator, with a view to making it much easier to build plugins that query other CSRD datapoints that might be of interest.
The big change is that rather than there being a single CSRD processor which contains a bunch of hard-coded datapoints (i.e. "our" datapoints for tracking green energy), we now have:

- `ArelleProcessor`, which is a simplified wrapper around Arelle's Python API.
- A `GreenwebCSRDProcessor` class that makes much more explicit use of the methods offered by the `ArelleProcessor` directly. We're only really querying for a few datapoints, so we don't really need / want full access to what Arelle offers us.
- A `Protocol` for other Processors to satisfy, in order to be considered compatible with this use of the logic encapsulated in the `ArelleProcessor`.

The thinking here is that others who want to make use of the CSRD parsing logic need only write a relatively small class similar to the `GreenwebCSRDProcessor` themselves, one that also consumes the `ArelleProcessor` methods directly.

I've used a Protocol and composition here instead of inheritance. The intention is that ultimately it's better to have a bit of upfront friction creating a new class yourself, one that you understand fully and that uses a small, well-defined API on a separate object, than to subclass something you don't fully understand and then have to guess at which methods are "yours" vs inherited from the class hierarchy.
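To make the Protocol-plus-composition idea concrete, here is a minimal sketch of the shape a third-party plugin might take. Every name in it (`StubArelleWrapper`, `values_for_code`, `ProcessorProtocol`, `get_datapoints`, `WaterUseProcessor`, and the `example:` datapoint code) is a hypothetical stand-in chosen for illustration; none of them are the actual method names or signatures of `ArelleProcessor` or `CSRDProcessorProtocol` in this PR.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical record for a single reported value."""

    code: str
    value: str


class StubArelleWrapper:
    """Stand-in for a small wrapper object such as ArelleProcessor.

    The real wrapper talks to Arelle; this stub only holds a dict so the
    sketch runs without any XBRL files. The method name is an assumption.
    """

    def __init__(self, facts: dict[str, list[str]]) -> None:
        self._facts = facts

    def values_for_code(self, code: str) -> list[DataPoint]:
        return [DataPoint(code, value) for value in self._facts.get(code, [])]


@runtime_checkable
class ProcessorProtocol(Protocol):
    """Hypothetical contract a plugin has to satisfy (structural typing)."""

    def get_datapoints(self) -> list[DataPoint]: ...


class WaterUseProcessor:
    """Example plugin: no base class, it just holds a wrapper and uses it."""

    # Illustrative code only, not a real ESRS datapoint identifier.
    codes = ("example:WaterConsumption",)

    def __init__(self, wrapper: StubArelleWrapper) -> None:
        self.wrapper = wrapper  # composition: the wrapper is a collaborator

    def get_datapoints(self) -> list[DataPoint]:
        found: list[DataPoint] = []
        for code in self.codes:
            found.extend(self.wrapper.values_for_code(code))
        return found


def run_processor(processor: ProcessorProtocol) -> list[DataPoint]:
    """Caller code depends only on the protocol, not on a concrete class."""
    return processor.get_datapoints()
```

Because the plugin only satisfies a protocol, nothing is inherited from a parent class: every method on `WaterUseProcessor` is visibly its own, and the only external surface it touches is the wrapper passed to its constructor.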
Anyway - the ML generated summary is below too.
This pull request includes significant changes to the `carbon_txt` module, particularly focusing on refactoring the CSRD processing classes and updating related tests. The main changes involve replacing the `CSRDProcessor` with the new `GreenwebCSRDProcessor` and `ArelleProcessor`, as well as introducing a protocol for CSRD processing plugins.

Refactoring and Class Renaming:
- `src/carbon_txt/process_csrd_document.py`: Replaced `CSRDProcessor` with `GreenwebCSRDProcessor` for processing CSRD documents. [1] [2]
- `src/carbon_txt/processors.py`: Renamed `CSRDProcessor` to `ArelleProcessor`, and introduced `GreenwebCSRDProcessor` and `CSRDProcessorProtocol` to handle CSRD processing with a more modular approach. [1] [2]

Code Simplification and Cleanup:
- […] `ArelleProcessor`. [1] [2]

Test Updates:

- […] `GreenwebCSRDProcessor` and `ArelleProcessor` instead of `CSRDProcessor`. [1] [2] [3] [4] [5] [6] [7]

Configuration Adjustments:
- […] `src/carbon_txt/web/config/settings/base.py` to improve readability and ensure proper environment variable handling. [1] [2] [3]