Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented Processor class and refactored some preliminary processing #106

Merged
merged 1 commit into from
Oct 21, 2024

Conversation

QuanMPhm
Copy link
Contributor

@QuanMPhm QuanMPhm commented Sep 25, 2024

Closes #99. Somewhat dependant on #100. A Processor class has been added, subclassing from the Invoice class. This is the first step to refactor invoicing in order to seperate the processing and filtering/exporting functionalities of our current Invoice subclasses. Subclasses of Processor should only process invoices and manipulate its internal data, while subclasses of Invoice should only perform filtering and exporting, never changing any data itself.

In addition to implementing Processor, two of its subclasses, ValidatePIAliasProcessor and AddInstitutionProcessor has been added to perform some preliminary processing steps.

@QuanMPhm QuanMPhm force-pushed the 99/implement_processor branch 2 times, most recently from d72d59b to d9b838c Compare September 25, 2024 18:32
lenovo_inv = lenovo_invoice.LenovoInvoice(
name=args.Lenovo_file, invoice_month=invoice_month, data=merged_dataframe.copy()
name=args.Lenovo_file, invoice_month=invoice_month, data=add_institute_proc.data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we not passing a copy of the dataframe here anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an honest mistake. Good catch!

)
nonbillable_inv = nonbillable_invoice.NonbillableInvoice(
name=args.nonbillable_file,
invoice_month=invoice_month,
data=merged_dataframe.copy(),
data=add_institute_proc.data,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question

)
nonbillable_inv = nonbillable_invoice.NonbillableInvoice(
name=args.nonbillable_file,
invoice_month=invoice_month,
data=merged_dataframe.copy(),
data=add_institute_proc.data.copy(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better if, after you are done with processing (for example on line 217), you assign the dataframe of the last Processor to a variable with a meaningful name like processed_data.

Passing over add_institute_proc.data requires

  • The continued existence of the add_institute_proc object instance so that you can access its dataframe
  • Knowing that the reason why you are passing that specifically here is because add_institute_proc is the last processor and therefore that is the fully processed dataframe, which requires you to scroll up to figure out that is the case.
  • In the case of adding another Processor, you need to change all instances of add_institute_proc.data to their actual new processor name.


@dataclass
class ValidatePIAliasProcessor(processor.Processor):
alias_map: dict
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's inconsistency in how you're initializing the 2 processors.

  • ValidatePIAliasProcessor takes in a dictionary which contains the alias_map, whereas
  • AddInstitutionProcessor loads its own file to get at the institution map.

I don't think you should change this logic now, but this is something that you should be aware when you implement further processors.

A `Processor` class has been added, subclassing from the `Invoice` class.
This is the first step to refactor invoicing in order to seperate the
processing and filtering/exporting functionalities of our current
Invoice subclasses. Subclasses of `Processor` should only process invoices
and manipulate its internal data, while subclasses of `Invoice` should
only perform filtering and exporting, never changing any data itself.

In addition to implementing `Processor`, two of its subclasses,
`ValidatePIAliasProcessor` and `AddInstitutionProcessor` has been
added to perform some preliminary processing steps.
@QuanMPhm QuanMPhm merged commit 6276e84 into CCI-MOC:main Oct 21, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement a Processor class to refactor invoice processing
3 participants