Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TDL-18879: Clear offset when max_skip error is encountered and return 0 if skip is greater than 250k in the current state. #29

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

hpatel41
Copy link
Contributor

@hpatel41 hpatel41 commented May 3, 2022

Description of change

TDL-18879: Clear offset when max_skip error is encountered and return 0 if skip is greater than 250k in the current state.

  • Updated the code to clear offset written in the state file when we encounter max_skip error.
  • Updated the code to return 0 if the current state contains an offset greater than 250k when syncing.

Manual QA steps

Risks

Rollback steps

  • revert this branch

if not date_window:
LOGGER.warning("Invalid value of date window is passed: \'{}\', using default window size of 15 days.".format(ctx.config.get("date_window")))
date_window = 15
LOGGER.warning("Invalid value of date window is passed: \'{}\', using default window size of 5 days.".format(ctx.config.get("date_window")))
Copy link

@dbshah1212 dbshah1212 May 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LOGGER.warning("Invalid value of date window is passed: \'{}\', using default window size of 5 days.".format(ctx.config.get("date_window")))
LOGGER.warning("Invalid value of date window is passed: \'{}\', using default window size of {} days.".format(ctx.config.get("date_window")),DATE_WINDOW_SIZE)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

except ValueError:
LOGGER.warning("Invalid value of date window is passed: \'{}\', using default window size of 15 days.".format(ctx.config.get("date_window")))
LOGGER.warning("Invalid value of date window is passed: \'{}\', using default window size of 5 days.".format(ctx.config.get("date_window")))
Copy link

@dbshah1212 dbshah1212 May 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LOGGER.warning("Invalid value of date window is passed: \'{}\', using default window size of 5 days.".format(ctx.config.get("date_window")))
LOGGER.warning("Invalid value of date window is passed: \'{}\', using default window size of {} days.".format(ctx.config.get("date_window")), DATE_WINDOW_SIZE)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

date_window = int(ctx.config.get("date_window", 15))
# if date_window is 0, '0' or None, then set default window size of 15 days
date_window = int(ctx.config.get("date_window", DATE_WINDOW_SIZE))
# if date_window is 0, '0' or None, then set default window size of 5 days

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# if date_window is 0, '0' or None, then set default window size of 5 days
# if date_window is 0, '0' or None, then set the default window size to DATE_WINDOW_SIZE(5 days)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Copy link
Contributor

@luandy64 luandy64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading the card this PR is written to address, I don't think this is the approach we should take to debug this issue for the customer

@hpatel41 hpatel41 changed the title TDL-18879: Update default date window size to 5 for activities stream TDL-18879: Clear offset when max_skip error is encountered and return 0 if skip is greater than 250k in the current state. May 5, 2022
@@ -46,7 +46,10 @@ def set_bookmark(self, path, val):

def get_offset(self, path):
off = bks_.get_offset(self.state, path[0])
return (off or {}).get(path[1])
value = (off or {}).get(path[1])
if path[0] == "activities" and value > 250000:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment here regarding a specific scenario like when the state reached this condition for the customer with the previous version and this will reduce skip to 0 for date_windowing and all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment

ctx.write_state()
if 'max_skip = ' in str(e):
if tap_stream_id == IDS.ACTIVITIES:
LOGGER.warning("The skip cannot be greater than 250000, please reduce the date window size and try again.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LOGGER.warning("The skip cannot be greater than 250000, please reduce the date window size and try again.")
LOGGER.warning("Hit max_skip error so clearing skip offset, please reduce the date window size and try again.")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

# clear offset
ctx.clear_offsets(tap_stream_id)
ctx.write_state()
raise

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to add the message to reduce the date window.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated error message

ctx.write_state()
if 'max_skip = ' in str(e):
if tap_stream_id == IDS.ACTIVITIES:
LOGGER.warning("Hit max_skip error so clearing skip offset, please reduce the date window size and try again.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be an error and must be raised with this message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@hpatel41 hpatel41 requested a review from KrisPersonal May 5, 2022 09:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants