@SourceryAI

Thanks for starring sourcery-ai/sourcery ✨ 🌟 ✨

Here's your pull request refactoring your most popular Python repo.

If you want Sourcery to refactor all your Python repos and incoming pull requests, install our bot.

Review changes via command line

To manually merge these changes, make sure you're on the master branch, then run:

git fetch https://github.com/sourcery-ai-bot/data-science-from-scratch master
git merge --ff-only FETCH_HEAD
git reset HEAD^

Comment on lines -182 to +183
- if years_experience < 3.0: return "paid"
- elif years_experience < 8.5: return "unpaid"
- else: return "paid"
+ if years_experience < 3.0 or years_experience >= 8.5: return "paid"
+ else: return "unpaid"

Function predict_paid_or_unpaid refactored with the following changes:
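
As a quick check that merging the branches preserves behavior, here is a minimal sketch comparing the two versions. The names `predict_original` and `predict_refactored` are hypothetical, chosen only for this comparison; the thresholds are the ones shown in the diff.

```python
def predict_original(years_experience: float) -> str:
    """Three-branch version shown before the refactoring."""
    if years_experience < 3.0:
        return "paid"
    elif years_experience < 8.5:
        return "unpaid"
    else:
        return "paid"


def predict_refactored(years_experience: float) -> str:
    """Two-branch version proposed in the diff."""
    if years_experience < 3.0 or years_experience >= 8.5:
        return "paid"
    else:
        return "unpaid"


# Sample points on both sides of each threshold, including the boundaries.
samples = [0.0, 2.9, 3.0, 5.0, 8.4, 8.5, 10.0]
mismatches = [x for x in samples if predict_original(x) != predict_refactored(x)]
```

For ordinary numbers `mismatches` stays empty. One caveat: a NaN input takes different branches in the two versions, because every comparison with NaN is false, so the original falls through to "paid" while the refactored version returns "unpaid".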

  document_lengths = map(len, documents)

- distinct_words = set(word for document in documents for word in document)
+ distinct_words = {word for document in documents for word in document}

Lines 181-181 refactored with the following changes:
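
A minimal sketch of this change, assuming a small made-up `documents` list (the sample data is illustrative and not taken from the repository). Both spellings build the same set; the comprehension simply avoids passing a generator to `set()`.

```python
# Hypothetical sample data: each document is a list of words.
documents = [
    ["python", "data", "science"],
    ["data", "statistics"],
    ["python"],
]

# Before: a generator expression passed to set().
via_set_call = set(word for document in documents for word in document)

# After: a set comprehension.
via_comprehension = {word for document in documents for word in document}
```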

Comment on lines -38 to +45

  if n % 2 == 1:
      # if odd, return the middle value
      return sorted_v[midpoint]
- else:
-     # if even, return the average of the middle values
-     lo = midpoint - 1
-     hi = midpoint
-     return (sorted_v[lo] + sorted_v[hi]) / 2
+ # if even, return the average of the middle values
+ lo = midpoint - 1
+ hi = midpoint
+ return (sorted_v[lo] + sorted_v[hi]) / 2

Function median refactored with the following changes:
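
To show the effect of dropping the `else` after an early `return`, here is a sketch of the whole function in its refactored shape. The diff only shows the branch, so the definitions of `n`, `sorted_v` and `midpoint` below are assumptions about the surrounding code.

```python
from typing import List


def median(v: List[float]) -> float:
    # Assumed setup: the diff does not show how these three names are defined.
    n = len(v)
    sorted_v = sorted(v)
    midpoint = n // 2

    if n % 2 == 1:
        # if odd, return the middle value
        return sorted_v[midpoint]

    # if even, return the average of the middle values
    lo = midpoint - 1
    hi = midpoint
    return (sorted_v[lo] + sorted_v[hi]) / 2
```

Because the odd case returns immediately, the even case can sit at function level without changing which code runs.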

Comment on lines -129 to +140
- c1, c2 = min([(cluster1, cluster2)
-               for i, cluster1 in enumerate(clusters)
-               for cluster2 in clusters[:i]],
-               key=lambda p: cluster_distance(p[0], p[1], distance_agg))
+ c1, c2 = min(
+     (
+         (cluster1, cluster2)
+         for i, cluster1 in enumerate(clusters)
+         for cluster2 in clusters[:i]
+     ),
+     key=lambda p: cluster_distance(p[0], p[1], distance_agg),
+ )


  # remove them from the list of clusters
- clusters = [c for c in clusters if c != c1 and c != c2]
+ clusters = [c for c in clusters if c not in [c1, c2]]

Function bottom_up_cluster refactored with the following changes:
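
The sketch below exercises both changes on toy data. It assumes clusters are tuples of numbers and uses a hypothetical two-argument `cluster_distance` (the real function also takes `distance_agg`, which is omitted here).

```python
from typing import List, Tuple

Cluster = Tuple[float, ...]


def cluster_distance(a: Cluster, b: Cluster) -> float:
    """Hypothetical stand-in: smallest gap between any two members."""
    return min(abs(x - y) for x in a for y in b)


# Made-up one-dimensional clusters for illustration.
clusters: List[Cluster] = [(0.0,), (1.0,), (5.0,), (5.5,)]

# Generator form: min() consumes the pairs without building a list first.
c1, c2 = min(
    (
        (cluster1, cluster2)
        for i, cluster1 in enumerate(clusters)
        for cluster2 in clusters[:i]
    ),
    key=lambda p: cluster_distance(p[0], p[1]),
)

# The two ways of removing the merged clusters, side by side.
remaining_old = [c for c in clusters if c != c1 and c != c2]
remaining_new = [c for c in clusters if c not in [c1, c2]]
```

The membership test compares by identity first and then by equality, so for objects with a conventional `==` the two filters select the same clusters.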

Comment on lines -180 to +181
- if years_experience < 3.0: return "paid"
- elif years_experience < 8.5: return "unpaid"
- else: return "paid"
+ if years_experience < 3.0 or years_experience >= 8.5: return "paid"
+ else: return "unpaid"

Function predict_paid_or_unpaid refactored with the following changes:

Comment on lines -132 to +134
- for i in range(10):
+ for _ in range(10):
      binary.append(x % 2)
-     x = x // 2
+     x //= 2

Function binary_encode refactored with the following changes:
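
For context, a sketch of how the refactored loop might sit inside the function. Only the loop appears in the diff; the signature, the initial list and the return statement are assumptions.

```python
from typing import List


def binary_encode(x: int) -> List[int]:
    binary: List[int] = []  # assumed initialisation, not shown in the diff

    # The loop index is never used, so `_` replaces `i`.
    for _ in range(10):
        binary.append(x % 2)  # least significant bit first
        x //= 2               # in-place floor division, same as x = x // 2

    return binary  # assumed return value, not shown in the diff
```

Under these assumptions, `binary_encode(10)` yields the ten bits of 10 starting from the least significant one.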

Comment on lines -156 to +160

# training data
xs = [[0., 0], [0., 1], [1., 0], [1., 1]]
ys = [[0.], [1.], [1.], [0.]]


Function main refactored with the following changes:

  document_lengths = [len(document) for document in documents]

- distinct_words = set(word for document in documents for word in document)
+ distinct_words = {word for document in documents for word in document}

Lines 208-208 refactored with the following changes:

  import tqdm

- for iter in tqdm.trange(1000):
+ for _ in tqdm.trange(1000):

Lines 252-252 refactored with the following changes:

Comment on lines -63 to +92

  num_epochs = 10000
  random.seed(0)

  guess = [random.random(), random.random()]  # choose random value to start

- learning_rate = 0.00001
-
  with tqdm.trange(num_epochs) as t:
+     learning_rate = 0.00001
+
      for _ in t:
          alpha, beta = guess

          # Partial derivative of loss with respect to alpha
          grad_a = sum(2 * error(alpha, beta, x_i, y_i)
                       for x_i, y_i in zip(num_friends_good,
                                           daily_minutes_good))

          # Partial derivative of loss with respect to beta
          grad_b = sum(2 * error(alpha, beta, x_i, y_i) * x_i
                       for x_i, y_i in zip(num_friends_good,
                                           daily_minutes_good))

          # Compute loss to stick in the tqdm description
          loss = sum_of_sqerrors(alpha, beta,
                                 num_friends_good, daily_minutes_good)
          t.set_description(f"loss: {loss:.3f}")

          # Finally, update the guess
          guess = gradient_step(guess, [grad_a, grad_b], -learning_rate)


Function main refactored with the following changes:

  for i in t:
      # i is prime if no smaller prime divides it.
-     i_is_prime = not any(i % p == 0 for p in primes)
+     i_is_prime = all(i % p != 0 for p in primes)

Function remove_projection_from_vector.main.primes_up_to refactored with the following changes:
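
The rewrite from `not any(...)` to `all(...)` is an application of De Morgan's law. The sketch below checks the equivalence and shows the refactored test in a simplified `primes_up_to`; the starting list, the loop bounds, and the plain `range` standing in for the tqdm iterator `t` are assumptions.

```python
from typing import List


def primes_up_to(n: int) -> List[int]:
    """Return the primes below n (bounds are an assumption of this sketch)."""
    primes = [2]  # assumed starting list
    for i in range(3, n):  # plain range stands in for the tqdm iterator `t`
        # i is prime if no smaller prime divides it.
        i_is_prime = all(i % p != 0 for p in primes)
        if i_is_prime:
            primes.append(i)
    return primes


# Compare the old and new tests for a range of candidates.
known_primes = primes_up_to(50)
agree = all(
    (not any(i % p == 0 for p in known_primes if p < i))
    == all(i % p != 0 for p in known_primes if p < i)
    for i in range(3, 50)
)
```

Both forms stop at the first prime that divides `i`, so the short-circuit behavior is unchanged as well.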
