japanese-addresses


Parses Japanese addresses into prefecture, city, and street components.

Installation

pip install japanese-addresses

Examples

from japanese_addresses import separate_address

parsed_address = separate_address('宮城県仙台市泉区市名坂字東裏97-1')

print(parsed_address)
"""
ParsedAddress(prefecture='宮城県', city='仙台市泉区', street='市名坂')
"""

parsed_address = separate_address('鹿児島県志布志市志布志町志布志')

print(parsed_address)
"""
ParsedAddress(prefecture='鹿児島県', city='志布志市', street='志布志町志布志')
"""

To use it together with pandas:

import pandas as pd
from japanese_addresses import separate_address

df = pd.read_csv('sample.csv')
df.head()
"""
	address
0	宮城県仙台市泉区市名坂字東裏97-1
1	鹿児島県志布志市志布志町志布志
2	東京都 神津島村284番
"""
target_col = 'address'

# https://stackoverflow.com/questions/16236684/apply-pandas-function-to-column-to-create-multiple-new-columns
def get_separate_address(address):
    parsed_address = separate_address(address)
    return parsed_address.prefecture, parsed_address.city, parsed_address.street


df['prefecture'], df['city'], df['street'] = zip(*df[target_col].map(get_separate_address))
df.head()
"""
	address	prefecture	city	street
0	宮城県仙台市泉区市名坂字東裏97-1	宮城県	仙台市泉区	市名坂
1	鹿児島県志布志市志布志町志布志	鹿児島県	志布志市	志布志町志布志
2	東京都 神津島村284番	東京都	神津島村
"""

Testing

pip install poetry
poetry install
poetry run pytest

License

japanese_addresses is licensed under the MIT License.

prefecture2city2street.pkl is a modified derivative work of geolonia/japanese-addresses.

It was created using csv_to_dict.py.

Information on the original work

Here's the link to the original work.

https://geolonia.github.io/japanese-addresses/

geolonia/japanese-addresses - GitHub

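The following attribution is translated from the Japanese description of the original work.

Title

Geolonia 住所データ (Geolonia Address Data)

Source

This data is updated monthly by Geolonia, based on the following data.

Sponsor

一般社団法人 不動産テック協会 (Real Estate Tech Association, a general incorporated association)

License

CC BY 4.0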