26 changes: 26 additions & 0 deletions pgeocode.py
@@ -387,6 +387,32 @@ def query_location(

        return pd.DataFrame(columns=self._data.columns)

    def query_geocode(self, lat: float, lon: float):
Member

Let's call it query_coordinates, maybe.

Also, I think it would be better to add a top_k: int = 1 parameter and return a DataFrame rather than a Series, similar to how query_location works. Please also add the return type annotation.

        """Get locations information from geo points

        Args:
            lat (float): latitude
            lon (float): longitude
Member

This should use a NumPy-style docstring.
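For reference, a minimal sketch of what a NumPy-style docstring could look like here. The name query_coordinates and the top_k parameter come from the suggestion above; the stand-alone function form, the wording of each description, and the "decimal degrees" unit are assumptions for illustration, not the final API.

```python
import pandas as pd


def query_coordinates(lat: float, lon: float, top_k: int = 1) -> pd.DataFrame:
    """Return the locations closest to a geographic point.

    Parameters
    ----------
    lat : float
        Latitude of the query point (assumed to be in decimal degrees).
    lon : float
        Longitude of the query point (assumed to be in decimal degrees).
    top_k : int, default=1
        Number of nearest locations to return.

    Returns
    -------
    pandas.DataFrame
        The ``top_k`` nearest rows, ordered by increasing distance.
    """
    # Hypothetical stub: only the docstring layout is being illustrated.
    raise NotImplementedError("docstring layout sketch only")
```

The section names are underlined with dashes of the same length, and each parameter gets a "name : type" line followed by an indented description.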


        Returns:
            pandas.DataFrame: a DataFrame with the relevant information
        """
        distances = self._data_frame.apply(
            lambda row: haversine_distance(
                np.array([[lat, lon]]), np.array([[row["latitude"], row["longitude"]]])
            ),
            axis=1,
        )
Member

You can build the array of coordinates from _data_frame and pass it to haversine_distance directly (it should work with arrays of N coordinate pairs); that would be faster than calling it once per row.
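A rough sketch of the vectorized approach, under stated assumptions: haversine_km below is a hypothetical stand-in written for this example, not the library's haversine_distance, and the three-row DataFrame is made-up sample data. The point is only that the coordinates are extracted once as an (N, 2) array and the distance is computed for all rows in a single NumPy call.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(point, coords):
    """Great-circle distance in km from one (lat, lon) point to N (lat, lon) rows.

    Hypothetical helper for illustration only.
    """
    lat1, lon1 = np.radians(point)
    lat2 = np.radians(coords[:, 0])
    lon2 = np.radians(coords[:, 1])
    a = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    # Clip guards against tiny floating-point overshoot above 1.
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


# Made-up sample data standing in for the postal code table.
df = pd.DataFrame(
    {
        "place_name": ["Paris", "Lyon", "Marseille"],
        "latitude": [48.8566, 45.7640, 43.2965],
        "longitude": [2.3522, 4.8357, 5.3698],
    }
)

# Build the (N, 2) coordinate array once, then compute all distances at once.
coords = df[["latitude", "longitude"]].to_numpy()
distances = haversine_km((48.85, 2.35), coords)
```

Compared with DataFrame.apply(axis=1), this avoids a Python-level function call and two small array allocations per row.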


        # Get the index of the row with the smallest distance
        nearest_index = distances.idxmin()

        # Return the nearest row
        nearest_row = self._data_frame.loc[nearest_index].copy()
Member

Here you can argsort the distances and return the top_k rows.
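Something along these lines might work, as a sketch only. The helper name nearest_rows, the distance_km column name, and the sample data are all hypothetical; the idea is to sort positions by distance, keep the first top_k, and select those rows positionally so the result is a DataFrame even when top_k is 1.

```python
import numpy as np
import pandas as pd


def nearest_rows(data: pd.DataFrame, distances, top_k: int = 1) -> pd.DataFrame:
    """Return the top_k rows of data with the smallest distances (hypothetical helper)."""
    distances = np.asarray(distances)
    # Positions of the top_k smallest distances, in increasing order.
    order = np.argsort(distances)[:top_k]
    # iloc with an array of positions always yields a DataFrame.
    result = data.iloc[order].copy()
    result["distance_km"] = distances[order]
    return result


# Made-up sample data for illustration.
sample = pd.DataFrame({"place_name": ["A", "B", "C", "D"]})
sample_distances = [12.5, 0.8, 3.1, 40.0]

top_two = nearest_rows(sample, sample_distances, top_k=2)
```

Using positional indexing with iloc also sidesteps any mismatch between the DataFrame's index labels and the positions returned by argsort.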

        nearest_row["calculated_distance"] = distances[nearest_index][0]

        return nearest_row

    def _str_contains_search(self, text: str, col: str) -> pd.DataFrame:
        match_mask = self._data[col].str.lower().str.contains(text.lower())
        match_mask.fillna(False, inplace=True)