Ensure partitioning is consistent across message bus and queue #284

Open
wants to merge 1 commit into
base: master

isra17
Contributor

@isra17 isra17 commented Jun 21, 2017

The current partitioning scheme cannot be easily customized and is inconsistent throughout the code base. For instance, the DBWorker might use fingerprint partitioning while the queue is actually partitioned by hostname (and can't be changed to do otherwise). In my case, I have multiple spider partitions, partitioned by fingerprint, but the queue ends up placing everything in a single partition based on the hostname. So unless a spider from that partition asks for a batch, the other spiders keep receiving empty batches.
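To make the mismatch concrete, here is a minimal, self-contained sketch. It does not use Frontera's actual classes: `fingerprint_partition`, `hostname_partition`, the partition ids, and the sample URLs are hypothetical names chosen for illustration, and the hashing choices (SHA-1 of the URL, CRC32 of the hostname) are assumptions.

```python
import hashlib
import zlib
from urllib.parse import urlparse

PARTITIONS = [0, 1, 2, 3]  # hypothetical spider partition ids


def fingerprint_partition(url, partitions=PARTITIONS):
    """Pick a partition from a hash of the whole URL (assumed: SHA-1)."""
    fingerprint = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return partitions[int(fingerprint, 16) % len(partitions)]


def hostname_partition(url, partitions=PARTITIONS):
    """Pick a partition from a CRC32 of the hostname only."""
    hostname = urlparse(url).hostname or ""
    return partitions[zlib.crc32(hostname.encode("utf-8")) % len(partitions)]


# A crawl that stays on one site: 200 URLs, all on the same host.
urls = ["http://example.com/page/%d" % i for i in range(200)]

# Same host for every URL, so hostname partitioning uses exactly one partition.
by_hostname = {hostname_partition(u) for u in urls}
# Fingerprints differ per URL, so requests spread over several partitions.
by_fingerprint = {fingerprint_partition(u) for u in urls}

print("hostname partitions used:   ", sorted(by_hostname))
print("fingerprint partitions used:", sorted(by_fingerprint))
```

If one component assigns work with the first function while another assumes the second, spiders attached to the unused partitions have nothing to fetch, which is the empty-batch behaviour described above.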

This PR adds two settings: SPIDER_FEED_PARTIONER and SPIDER_LOG_PARTITIONER. With these, users can switch between fingerprint and hostname partitioning, or plug in their own custom partitioner.
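As a rough illustration of what a custom partitioner could look like, the sketch below defines a standalone class and resolves it from a settings mapping. The class name `DomainSuffixPartitioner`, its `partition(key)` method, and the plain `settings` dictionary holding a class object are all assumptions made for this example and are not taken from the PR's diff; the real setting may well expect a different value format, such as a dotted import path.

```python
import zlib


class DomainSuffixPartitioner:
    """Hypothetical custom partitioner: groups hosts by their last two labels."""

    def __init__(self, partitions):
        self.partitions = list(partitions)

    def partition(self, key):
        # "news.example.com" and "example.com" share the suffix "example.com".
        suffix = ".".join(key.lower().split(".")[-2:])
        index = zlib.crc32(suffix.encode("utf-8")) % len(self.partitions)
        return self.partitions[index]


# Assumed configuration shape: the setting names the partitioner to use.
settings = {"SPIDER_FEED_PARTIONER": DomainSuffixPartitioner}

# Both the queue and the message bus build their partitioner from the same
# setting, so neither can silently fall back to a different scheme.
queue_partitioner = settings["SPIDER_FEED_PARTIONER"](partitions=range(4))
bus_partitioner = settings["SPIDER_FEED_PARTIONER"](partitions=range(4))

host = "news.example.com"
print(queue_partitioner.partition(host) == bus_partitioner.partition(host))
```

The design point is that the partitioning choice lives in one place, so the component that fills the queue and the component that routes batches to spiders always agree.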

QUEUE_HOSTNAME_PARTITIONING has been deprecated since SPIDER_FEED_PARTIONER can be used instead.

@isra17 isra17 force-pushed the fix-consistent-partitionning branch from 2c68f59 to 1da2d73 Compare June 21, 2017 17:12
@codecov-io

codecov-io commented Jun 21, 2017

Codecov Report

Merging #284 into master will increase coverage by 0.03%.
The diff coverage is 78.15%.

@@            Coverage Diff             @@
##           master     #284      +/-   ##
==========================================
+ Coverage   70.11%   70.14%   +0.03%     
==========================================
  Files          68       68              
  Lines        4718     4733      +15     
  Branches      632      629       -3     
==========================================
+ Hits         3308     3320      +12     
- Misses       1273     1275       +2     
- Partials      137      138       +1

| Impacted Files | Coverage Δ | |
|---|---|---|
| frontera/contrib/backends/memory/__init__.py | 96.61% <100%> (-0.1%) | ⬇️ |
| frontera/settings/default_settings.py | 100% <100%> (ø) | ⬆️ |
| frontera/worker/db.py | 64.31% <40%> (+1.13%) | ⬆️ |
| frontera/contrib/backends/hbase.py | 71.17% <60%> (+0.62%) | ⬆️ |
| frontera/contrib/backends/sqlalchemy/components.py | 67.32% <62.5%> (-0.82%) | ⬇️ |
| frontera/contrib/backends/sqlalchemy/revisiting.py | 80.61% <66.66%> (-1.21%) | ⬇️ |
| frontera/contrib/backends/partitioners.py | 86.04% <71.42%> (-13.96%) | ⬇️ |
| frontera/core/components.py | 75.3% <75%> (-0.02%) | ⬇️ |
| frontera/contrib/backends/sqlalchemy/__init__.py | 72.78% <77.77%> (+0.52%) | ⬆️ |
| frontera/contrib/messagebus/zeromq/__init__.py | 83.33% <81.25%> (-1.01%) | ⬇️ |
| ... and 3 more | | |

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1a6ecd6...1da2d73. Read the comment docs.
