Replies: 4 comments 1 reply
-
|
Just to clarify, Digest implementation could be something along the lines of: class Digest:
# ...
def write_to_disk(file_path):
with open(file_path) as f:
f.write(self.digest_service.summarize(self.newsletters.to_text())) |
Beta Was this translation helpful? Give feedback.
-
|
Hi @nestak, thanks for starting this discussion! Let's start with a quick recap. Recap
Oh yes. When I started this project, my goal was to create a simple tool for myself that would summarize emails and remove noise and marketing BS from them. In the beginning, I had one class
Of course, it worked, but it was hard to read and develop. Then I split this one big class into smaller ones that have clear responsibilities. In the end, I have a lot of "doer" classes, as you mentioned. In the C # implementation, I tried to follow more or less clean architecture principle. Then, I refactored the app to a Python app. Now you started this discussion and this is perfect timing, because the app is really small and we can consider some options and implement them easily. 🎉 Operation-centric naming vs. domain-centric namingAs I see it we talk about operation-centric names ("doers") vs. domain-centric naming First approach focuses more on implementation details, the second one is more conceptual and I see it can be more flexible. I have some concerns and open questions about how the second approach will work when we add new features to the app. Let's see! Design considerations1. How might the redesign handle adding new LLM providers? Would you need to modify
|
Beta Was this translation helpful? Give feedback.
-
|
Wow, thanks for this writeup, you raised great points! |
Beta Was this translation helpful? Give feedback.
-
|
Hey, my answers and some "food for thought":
My idea was to define interface in If by extending you mean inheriting from
More concrete example: class DigestNetworkService(DigestService):
def __init__(self, user_input):
self.user_input = user_input # "user_input" is just args
# this can be done in different ways
# in this example I am using dictionary
self.backends: {
'ollama': self.__ollama,
'foo': self.__foo
}
# could be a static method
def available_backends(self):
return list(self.backends.keys())
def summarize(self, text):
backed = self.user_input.summarizer
if not text:
return ""
elif self.backends[backend]:
return self.backends[backend](text)
else:
raise NotImplementedError(
f"No such backend found: {backend}")
def __ollama(self, text):
# http call implementation
# ...# main.py
backends = DigestNetworkService().available_backends()
parser.add_argument("--summarizer", choices=backends, default=backends[0],
For now, lets think of "user_input" as just renamed In my examples, I pass the whole args object, instead of primitives. I should have explained that right away, sorry about that.
Why such concern? We will always be forced to modify some class or calling code to "add"/"register" a new LLM backend (all those if statements in a factory, or main function). What should not change, is existing functionality and interface. I guess that's what you meant, but correct me if I am wrong. Regarding passing in a providerI don't like this approach, because client code needs to have all those if statements checking Regarding using factoryI like this approach better. This is similar to example I given above, but separates implementations into another classes. On inheritance
I thought input and output of
I am not sure about this, why "standardize" errors?
I think the core is text in, text out. On new features
I think we basically need two methods –
Probably yes, having above methods fetch() and to_text().
I previously mentioned, that I would merge EmailService with NewsletterCollection, so it could be a type of "ContentSource".
I am not sure yet, I would need to look into specifics. On proposalHmm, looks like you changed one set of "doers" to another set: "summarizers" vs "providers" :) Regarding "DigestOrchestrator"... yeah, I don't like it :) I would stick to "Digest" or "Summary". Other than that, like before, it leaves to client code (main) to handle all the logic with choosing to instantiate correct classes. I think we could explore other directrions, and hide all the if statements in a factory or something similar. Make the objects do something useful for us (delegate), and not be tied to knowing which class to instantiate. If I understood examples correctly, anytime the arguments are different, client code needs to handle that. What would you say if, there would be only one composition in client code? |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Hey,
first of all, thank you for the initiative and the invitation to discuss ideas!
I've read the implementation of the NitroDigest and have some questions and suggestions regarding it.
Thorough the code I see a lot of "doer" classes, for example "writer", "summarizer", "processor", etc.
I think this happens, because usually we are constrained on time, and focus on implementation. So, a writing procedure becomes an object which is a "writer" with a "write" method, or a "summarizer" object with a "summary" method.
So was naming intended, or it came out of usual conventions we see being copied everywhere (controller, provider, writer, etc.)?
Interesting reading in regard of the above, could be what Carlo Pescio wrote on his blog: Your Coding Conventions Are Hurting You. I think he has some good points.
I thought it would be nice to do something about it, so I experimented a bit on naming (see below). Let me know what do you think about this direction.
Redesign attempt
I asked myself a few questions:
My first (not good) attempt was:
Above is only a rough draft, and a lot is to be desired. What I didn't touch yet, was html_extractor, prompt, and summary_writer classes. Also, dealing with user input would be nice, and having some plan for the main.py file, which now acts as a controller (micro-managing, looping).
Second attempt
After thinking some more, I came up with this:
Then, we use above classes:
More changes
Usage:
Beta Was this translation helpful? Give feedback.
All reactions