Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added tests for filters, formatters, handlers #206

Merged
merged 5 commits into from
Oct 11, 2016

Conversation

voith
Copy link
Contributor

@voith voith commented Sep 20, 2016

List of things to be done as part of this PR:

  • Add tests for formatters
  • Add tests for filters
  • Add tests for Handlers

Besides adding tests, I have removed the duplicate code for DateTimeEncoder from json.py and used the one in utils. I have also added python-json-logger to requirements as JSONFormatter needs it.

Update:
There was a bug in PlainValuesFilter which I have fixed now. It would give the following error on python 3.x:

def filter(self, record):
        if isinstance(record.msg, dict):
            for field_name in self.excluded_fields:
                setattr(record, field_name, record.msg.get(field_name, ''))
            record.msg = self.separator.join([format_str(value)
>                                             for key, value in record.msg.items()
                                              if key not in self.excluded_fields])
E           TypeError: sequence item 0: expected str instance, bytes found

update: 22-09-2016
Besides adding tests for handlers, I've fixed a bug RedisListHandler. some arguments to the constructor were not ordered properly. so I added the name of the argument instead.
Also, I have enabled redis ontravis and I've the redis module to requirements

The motivation behind this PR is #144. I have added tests in order to better understand formatters, filters and handlers. On completing this PR I will add docs for them in a separate PR.

@codecov-io
Copy link

codecov-io commented Sep 20, 2016

Current coverage is 72.35% (diff: 100%)

Merging #206 into master will increase coverage by 1.89%

@@             master       #206   diff @@
==========================================
  Files            70         68     -2   
  Lines          4563       4500    -63   
  Methods           0          0          
  Messages          0          0          
  Branches        596        589     -7   
==========================================
+ Hits           3215       3256    +41   
+ Misses         1208       1097   -111   
- Partials        140        147     +7   

Powered by Codecov. Last update 6eb8e06...fff2338

@voith voith changed the title [WIP] Add tests for filters, formatters, handlers Added tests for filters, formatters, handlers Sep 22, 2016
@voith
Copy link
Contributor Author

voith commented Sep 22, 2016

This PR is now ready for review. I have added tests for formatters, filters and handlers in 3 different commits respectively. @sibiryakov Can you please review this?

Copy link
Member

@sibiryakov sibiryakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @voith, thank you for another contribution! This time you touched a bit forgotten, less frequently used area, and we need to think what's best way to deal with these components.

self.r.flushall()
del self.logger

def initialise_handler(self, use_to=False, list_name='test_list', max_messages=0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a misspelling: initialize_handler

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wont be needed if we are removing the redis handler

reset=colors['reset']))


class TestHandlerConsoleManager(TestHandlerConsole):
Copy link
Member

@sibiryakov sibiryakov Sep 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check that _MANAGER, _BACKEND, _DEBUGGING and _EVENTS is used somewhere. As I remember they don't. May be it's easier to remove this handlers than writing a tests?

Last time when I was refactoring the logging system, I switched to per-class loggers and their configuration is meant to be done by means of logging.config package (e.g. adding handlers). The handlers above and other components I left because they appeared useful to me, but I never used them, and not aware of somebody else using them. Now, the biggest question for me as a reviewer: does it make sense to test them, or better to remove them?

If we leave them, then yes we need tests and much much more we need a document explaining they exists and how to use them.

Color logging is nice feature, but useful only when running in CMD line. JSON formatter is useful for sure. Filters? Probably yes, mainly to filter events by type, or URLs. Redis? Do you think it's a good idea to store log in redis? Why not setup a local logging to file with rollback? Overall, it needs thought.

I don't want to demotivate you, but this is how it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sibiryakov I totally get what you're trying to say. I myself went to through a similar dilemma when I wrote these tests, whether this code was really useful. The reason why I wrote them anyways was because I got a chance to learn. I don't mind removing the code altogether and also the tests that I've added.

Please check that _MANAGER, _BACKEND, _DEBUGGING and _EVENTS is used somewhere.

I had checked this when I had written the tests. This code actually should have been removed as part of commit d4d668b where FrontierLogger and EventLogManager were removed. This code doesn't make much sense without the FrontierLogger and EventLogManager. I think this code should be removed and while adding docs we can provide examples of different use cases of the ColorFormatter.

Color logging is nice feature, but useful only when running in CMD line. JSON formatter is useful for sure.

I like them too and I think we should keep them. They can be useful.

Filters? Probably yes, mainly to filter events by type, or URLs

The Filters look like they were designed specially to work along with ColorFormatter. If we are keeping ColorFormatter than I think we should keep them too.

Redis? Do you think it's a good idea to store log in redis?

I'm not very sure if the redis handler is useful. May be helpful in a distributed setup so that all logs can be retrieved from one common place. But personally I don't think the RedisListHandler in frontera is very useful. Looks like it was copied from https://github.com/jedp/python-redis-log and slightly modified. I think the code at there is much better and if we ever feel the need for redis lets use the library rather. I'd say lets remove the RedisListHandler.

Can you reply with your second thoughts. So that I know exactly what to do.

I don't want to demotivate you, but this is how it is.

I don't mind any suggestions. As long as you review my code regularly expect a lot more contributions from me :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so let's remove _MANAGER, _BACKEND, _DEBUGGING and _EVENTS handlers and Redis-related stuff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't remove CONSOLE handler as is it is used SW and DBW. https://github.com/scrapinghub/frontera/blob/master/frontera/worker/db.py#L15

@voith
Copy link
Contributor Author

voith commented Sep 26, 2016

@sibiryakov Is there any bug or feature you'd like to get in frontera which you think is of high priority? I would like to help out. Although I have my own ideas for places to contribute, It would be helpful to know what is of high priority. I would rather like to work on something that would interest you more. Don't worry about the topic. If it's something that I don't know than I don't mind studying :)

@sibiryakov
Copy link
Member

sibiryakov commented Sep 26, 2016

@voith Documentation needs an articles on such topics as

  • distributed backend run mode quickstart,
  • interaction between components (lots of people don't understand how Scrapy and Frontera interact and why some of the settings set in Scrapy doesn't work.)
  • ZeroMQ preserving order of components run.

If you have time please have a look at and get work done
#136 (may be it became better now) or
#128 or consider implementing Web UI for Frontera.

@voith
Copy link
Contributor Author

voith commented Oct 6, 2016

@sibiryakov Can we move forward with this?
I'll definitely work on the todo's list mentioned above

@sibiryakov
Copy link
Member

@voith it looks good! is it ready for a merge?

@voith
Copy link
Contributor Author

voith commented Oct 7, 2016

@sibiryakov Its ready for merge. I'll add docs in a follow up PR.

@sibiryakov sibiryakov merged commit 7c2319b into scrapinghub:master Oct 11, 2016
@sibiryakov
Copy link
Member

cool 🍰

@voith voith deleted the test-logger branch October 11, 2016 10:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants