Maintenance/gtfs network refactor w unit tests #86

Merged
merged 28 commits into from
Apr 29, 2021
e8a2bb0
update contribution guidelines
sablanchard Apr 1, 2021
ed05642
minor formatting, updates to docstrings, and prints for clarity
sablanchard Apr 1, 2021
823a80b
replace '{}/{}'.format() -> os.path.join()
sablanchard Apr 1, 2021
02a2016
address TODO to simplify read HDF5 store key print
sablanchard Apr 1, 2021
e0e86b0
add prints for saving and loading HDF5 files and minor updates to pri…
sablanchard Apr 1, 2021
295cdc9
address YAMLLoadWarning by replacing yaml.load(f) -> yaml.safe_load(f)
sablanchard Apr 1, 2021
56d6602
ensure lists and DFs are returned in correct sort order for unit tests
sablanchard Apr 1, 2021
8201d7d
minor formatting, print updates for simplification, and docstring upd…
sablanchard Apr 1, 2021
c436e5a
minor formatting, prints, and docstring updates
sablanchard Apr 1, 2021
b87161b
move time range value check to its own function _check_time_range_for…
sablanchard Apr 1, 2021
597709c
dont allow overwrite_existing_stop_times_int and use_existing_stop_ti…
sablanchard Apr 1, 2021
0f08cc5
add prints to clarify when overwrite_existing_stop_times_int or use_e…
sablanchard Apr 1, 2021
174ba1e
only print if applicable
sablanchard Apr 1, 2021
b5ced24
add specific ValueError when interpolator sees duplicate stop_sequenc…
sablanchard Apr 1, 2021
a97f92b
refactor section that uses _check_if_index_name_in_cols() for clarity…
sablanchard Apr 1, 2021
b84c46f
update docstring
sablanchard Apr 1, 2021
e229887
refactor edge_impedance_by_route_type(): simplify function, update to…
sablanchard Apr 1, 2021
03227ab
refactor save_processed_gtfs_data(): simplify function, add prints, a…
sablanchard Apr 1, 2021
832de7a
refactor load_processed_gtfs_data(): simplify function, add prints, a…
sablanchard Apr 1, 2021
106c422
improve print, add TODO
sablanchard Apr 1, 2021
122e71b
remove TODO as print is accurate in what its counting
sablanchard Apr 1, 2021
5ed3813
add new unit tests to gtfs.network.gtfs_network, update existing, exp…
sablanchard Apr 1, 2021
01ccf72
fix minor typos
sablanchard Apr 2, 2021
b2b5065
pycodestyle fixes and unit test update
sablanchard Apr 2, 2021
9295484
debug travis, add run time profile to test
sablanchard Apr 2, 2021
566a6ba
debug travis
sablanchard Apr 2, 2021
0a7ca7d
fix travis
sablanchard Apr 2, 2021
ef3482f
debug travis py3.5 issue
sablanchard Apr 2, 2021
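
Commit 823a80b replaces `'{}/{}'.format()` path building with `os.path.join()`. As a minimal sketch of why that change is useful (the folder and file names here are hypothetical and do not come from the PR), string formatting doubles the separator when the folder already ends in one, while `os.path.join()` does not:

```python
import os

# Hypothetical values for illustration only; they are not taken from the PR.
data_folder = 'data/'  # note the trailing separator
filename = 'processed_gtfs.h5'

# String formatting glues the parts together blindly and doubles the slash.
formatted = '{}/{}'.format(data_folder, filename)

# os.path.join() only inserts a separator where one is missing.
joined = os.path.join(data_folder, filename)

print(formatted)
print(joined)
```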
1 change: 1 addition & 0 deletions .travis.yml
@@ -21,6 +21,7 @@ install:
- conda create -n test-environment python=$TRAVIS_PYTHON_VERSION pyyaml --file requirements-dev.txt
- source activate test-environment
- conda info --all
- pip install 'numpy>=1.18'
- pip install .
- pip list
- pip show urbanaccess
68 changes: 60 additions & 8 deletions CONTRIBUTING.md
@@ -1,20 +1,68 @@
Thanks for using UrbanAccess!

This is an open source project that's part of the Urban Data Science Toolkit. Development and maintenance is a collaboration between UrbanSim Inc, U.C. Berkeley's Urban Analytics Lab, and other contributors.

## If you have a problem:

- Take a look at the [open issues](https://github.com/UDST/urbanaccess/issues) and [closed issues](https://github.com/UDST/urbanaccess/issues?q=is%3Aissue+is%3Aclosed) to see if there's already a related discussion

- Open a new issue describing the problem -- if possible, include any error messages, a full reproducible example of the code that generated the error, the operating system and version of Python you're using, and versions of any libraries that may be relevant

## Feature proposals:

- Take a look at the [open issues](https://github.com/UDST/urbanaccess/issues) and [closed issues](https://github.com/UDST/urbanaccess/issues?q=is%3Aissue+is%3Aclosed) to see if there's already a related discussion

- Post your proposal as a new issue, so we can discuss it (some proposals may not be a good fit for the project)

## Contributing code:

- Create a new branch of `UDST/urbanaccess/dev`, or fork the repository to your own account

- Make your changes, following the existing styles for code and inline documentation

- Add [tests](https://github.com/UDST/urbanaccess/tree/dev/urbanaccess/tests) if possible
- We use the test suite: Pytest

- Run tests and address any issues that may be flagged. If flags are raised that are not due to the PR, note that in a new comment in the PR
- Run Pytest test suite: `py.test`
- UrbanAccess currently supports Python 2.7, 3.5, 3.6, 3.7, 3.8. Tests will be run in these environments when the PR is created, and any flags raised in these environments should also be addressed
- UrbanAccess also uses a series of integration tests to test entire workflows
- Run:
```
cd demo
jupyter nbconvert --to python simple_example.ipynb
cd ../urbanaccess/tests/integration
python remove_nb_magic.py -in simple_example.py -out simple_example_clean.py
cd ../../../demo
python simple_example_clean.py
cd ../urbanaccess/tests/integration
python integration_madison.py
python integration_sandiego.py
```
- Run pycodestyle Python style guide checker: `pycodestyle --max-line-length=100 urbanaccess`

- Open a pull request to the `UDST/urbanaccess` `dev` branch, including a writeup of your changes -- take a look at some of the closed PRs for examples

- Current maintainers will review the code, suggest changes, and hopefully merge it and schedule it for an upcoming release

## Updating the documentation:

- See instructions in `docs/README.md`

## Preparing a release:

- Make a new branch for release prep

- Update the version number and changelog
- Update the version number and changelog:
- `CHANGELOG.md`
- `setup.py`
- `urbanaccess/__init__.py`
- `docs/source/conf.py`
- `docs/source/index.rst`

- Make sure all the tests are passing, and check if updates are needed to `README.md` or to the documentation

- Open a pull request to the master branch to finalize it

- After merging, tag the release on GitHub and follow the distribution procedures below
- Open a pull request to the `dev` branch to finalize it and wait for a PR review and approval

- After the PR has been approved, it can be merged to `dev`. Then a release PR can be created from `dev` to merge into `master`. Once merged, tag the release on GitHub and follow the distribution procedures below:

## Distributing a release on PyPI (for pip installation):

@@ -24,17 +72,21 @@

- Run `python setup.py sdist bdist_wheel --universal`

- This should create a `dist` directory containing two package files -- delete any old ones before the next step
- This should create a `dist` directory containing a gzip package file -- delete any old ones before the next step

- Run `twine upload dist/*` -- this will prompt you for your pypi.org credentials

- Check https://pypi.org/project/osmnet/ for the new version
- Check https://pypi.org/project/urbanaccess/ for the new version


## Distributing a release on Conda Forge (for conda installation):

- The [conda-forge/urbanaccess-feedstock](https://github.com/conda-forge/urbanaccess-feedstock) repository controls the Conda Forge release
- The [conda-forge/urbanaccess-feedstock](https://github.com/conda-forge/urbanaccess-feedstock) repository controls the Conda Forge release, including which GitHub users have maintainer status for the repo

- Conda Forge bots usually detect new releases on PyPI and set in motion the appropriate feedstock updates, which a current maintainer will need to approve and merge

- Maintainers can add on additional changes before merging the PR, for example to update the requirements or edit the list of maintainers

- You can also fork the feedstock and open a PR manually. It seems like this must be done from a personal account (not a group account like UDST) so that the bots can be granted permission for automated cleanup

- Check https://anaconda.org/conda-forge/urbanaccess for the new version (may take a few minutes for it to appear)
8 changes: 4 additions & 4 deletions urbanaccess/config.py
@@ -4,7 +4,7 @@

def _format_check(settings):
"""
Check the format of a urbanaccess_config object.
Check the format of an urbanaccess_config object.

Parameters
----------
@@ -84,7 +84,7 @@ def __init__(self,
def from_yaml(cls, configdir='configs',
yamlname='urbanaccess_config.yaml'):
"""
Create a urbanaccess_config instance from a saved YAML configuration.
Create an urbanaccess_config instance from a saved YAML configuration.

Parameters
----------
@@ -108,7 +108,7 @@ def from_yaml(cls, configdir='configs',
yaml_file = os.path.join(configdir, yamlname)

with open(yaml_file, 'r') as f:
yaml_config = yaml.load(f)
yaml_config = yaml.safe_load(f)

settings = cls(data_folder=yaml_config.get('data_folder', 'data'),
logs_folder=yaml_config.get('logs_folder', 'logs'),
@@ -143,7 +143,7 @@ def to_dict(self):
def to_yaml(self, configdir='configs', yamlname='urbanaccess_config.yaml',
overwrite=False):
"""
Save a urbanaccess_config representation to a YAML file.
Save an urbanaccess_config representation to a YAML file.

Parameters
----------
Expand Down
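
The `yaml.load(f)` to `yaml.safe_load(f)` change above addresses the `YAMLLoadWarning` mentioned in commit 295cdc9. As a rough sketch of the difference (the config text and the helper name `is_rejected` are assumptions made for illustration, not part of UrbanAccess), `yaml.safe_load` builds only plain Python types and refuses tags that would construct arbitrary Python objects:

```python
import yaml

# Hypothetical config text; the keys mirror the from_yaml() defaults shown in
# the diff, but this string itself is not taken from the PR.
CONFIG_TEXT = """
data_folder: data
logs_folder: logs
"""

# safe_load() returns plain dicts, lists, strings, and numbers.
config = yaml.safe_load(CONFIG_TEXT)
data_folder = config.get('data_folder', 'data')


def is_rejected(text):
    """Return True if yaml.safe_load() refuses to parse the given text."""
    try:
        yaml.safe_load(text)
    except yaml.YAMLError:
        return True
    return False


# A tag that asks the loader to call a Python function is not constructed.
unsafe_text = "!!python/object/apply:os.getcwd []"
print(config, is_rejected(unsafe_text))
```

This requires PyYAML, which the Travis install step above already adds to the test environment.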
60 changes: 18 additions & 42 deletions urbanaccess/gtfs/headways.py
@@ -4,6 +4,7 @@
import logging as lg

from urbanaccess.utils import log
from urbanaccess.gtfs.utils_validation import _check_time_range_format
from urbanaccess.gtfs.network import _time_selector

warnings.simplefilter(action="ignore", category=FutureWarning)
@@ -68,26 +69,25 @@ def _headway_handler(interpolated_stop_times_df, trips_df,
Parameters
----------
interpolated_stop_times_df : pandas.DataFrame
interpolated stop times dataframe for stop times within the time range
interpolated stop times DataFrame for stop times within the time range
trips_df : pandas.DataFrame
trips dataframe
trips DataFrame
routes_df : pandas.DataFrame
routes dataframe
routes DataFrame
headway_timerange : list
time range for which to calculate headways between as a
list of time 1 and time 2 where times are 24 hour clock strings
such as:
['07:00:00', '10:00:00']
time range for which to calculate headways between in a list with time
1 and time 2 as strings. Must follow format of a 24 hour clock for
example: 08:00:00 or 17:00:00

Returns
-------
headway_by_routestop_df : pandas.DataFrame
dataframe of statistics of route stop headways in units of minutes
DataFrame of statistics of route stop headways in units of minutes
with relevant route and stop information
"""
start_time = time.time()

# add unique trip and route id
# add unique trip and route ID
trips_df['unique_trip_id'] = (
trips_df['trip_id'].str.cat(
trips_df['unique_agency_id'].astype('str'), sep='_'))
@@ -105,7 +105,7 @@ def _headway_handler(interpolated_stop_times_df, trips_df,

trips_df = trips_df[columns]

# add unique route id
# add unique route ID
routes_df['unique_route_id'] = (
routes_df['route_id'].str.cat(
routes_df['unique_agency_id'].astype('str'), sep='_'))
@@ -138,7 +138,7 @@ def _headway_handler(interpolated_stop_times_df, trips_df,
headway_by_routestop_df['unique_stop_id'].str.cat(
headway_by_routestop_df['unique_route_id'].astype('str'), sep='_'))

log('headway calculation complete. Took {:,.2f} seconds'.format(
log('Headway calculation complete. Took {:,.2f} seconds.'.format(
time.time() - start_time))

return headway_by_routestop_df
@@ -153,9 +153,9 @@ def headways(gtfsfeeds_df, headway_timerange):
gtfsfeeds_df : object
gtfsfeeds_dfs object with all processed GTFS data tables
headway_timerange : list
time range for which to calculate headways between as a list of
time 1 and time 2 where times are 24 hour clock strings such as:
['07:00:00', '10:00:00']
time range for which to calculate headways between in a list with time
1 and time 2 as strings. Must follow format of a 24 hour clock for
example: 08:00:00 or 17:00:00

Returns
-------
@@ -164,39 +164,15 @@ def headways(gtfsfeeds_df, headway_timerange):
route stop headways in units of minutes
with relevant route and stop information
"""

time_error_statement = (
'{} starttime and endtime are not in the correct format. '
'Format should be a 24 hour clock in following format: 08:00:00 '
'or 17:00:00'.format(headway_timerange))
if not isinstance(headway_timerange, list) or len(headway_timerange) != 2:
raise ValueError('timerange must be a list of length 2')
if headway_timerange[0].split(':')[0] > headway_timerange[1].split(':')[0]:
raise ValueError('starttime is greater than endtime')

for t in headway_timerange:
if not isinstance(t, str):
raise ValueError(time_error_statement)
if len(t) != 8:
raise ValueError(time_error_statement)
if int(headway_timerange[1].split(':')[0]) - int(
headway_timerange[0].split(':')[0]) > 3:
long_time_range_msg = (
'WARNING: Time range passed: {} is a {} hour period. Long periods '
'over 3 hours may take a significant amount of time to process.')
log(long_time_range_msg.format(headway_timerange,
int(str(
headway_timerange[1][0:2])) - int(
str(headway_timerange[0][0:2]))),
level=lg.WARNING)
_check_time_range_format(headway_timerange)

if gtfsfeeds_df is None:
raise ValueError('gtfsfeeds_df cannot be None')
raise ValueError('gtfsfeeds_df cannot be None.')
if gtfsfeeds_df.stop_times_int.empty or gtfsfeeds_df.trips.empty or \
gtfsfeeds_df.routes.empty:
raise ValueError(
'one of the gtfsfeeds_dfs objects: stop_times_int, trips, '
'or routes were found to be empty.')
'One of the following gtfsfeeds_dfs objects: stop_times_int, '
'trips, or routes were found to be empty.')

headways_df = _headway_handler(
interpolated_stop_times_df=gtfsfeeds_df.stop_times_int,
Expand Down
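
The diff above moves the inline time range checks in `headways()` into `_check_time_range_format`, whose body is not shown on this page. The following is a hypothetical sketch of what such a helper could look like, reconstructed from the removed lines; it is not the actual UrbanAccess implementation. The name `check_time_range_format`, the `warn_over_hours` parameter, the returned hour span, and the use of the standard `logging` module in place of `urbanaccess.utils.log` are all assumptions. Unlike the removed code, it validates types before splitting strings and compares hours as integers:

```python
import logging

logger = logging.getLogger(__name__)


def check_time_range_format(timerange, warn_over_hours=3):
    """Hypothetical stand-in for _check_time_range_format.

    Validates a ['HH:MM:SS', 'HH:MM:SS'] list and returns the span in whole
    hours (the return value is a convenience of this sketch).
    """
    if not isinstance(timerange, list) or len(timerange) != 2:
        raise ValueError('timerange must be a list of length 2')

    format_error = (
        '{} starttime and endtime are not in the correct format. '
        'Format should be a 24 hour clock in following format: 08:00:00 '
        'or 17:00:00'.format(timerange))
    for t in timerange:
        # Check type and length before any string operations are attempted.
        if not isinstance(t, str) or len(t) != 8:
            raise ValueError(format_error)

    # Compare hours as integers rather than as strings.
    start_hour = int(timerange[0].split(':')[0])
    end_hour = int(timerange[1].split(':')[0])
    if start_hour > end_hour:
        raise ValueError('starttime is greater than endtime')

    hours = end_hour - start_hour
    if hours > warn_over_hours:
        logger.warning(
            'Time range passed: %s is a %s hour period. Long periods over '
            '%s hours may take a significant amount of time to process.',
            timerange, hours, warn_over_hours)
    return hours
```

Under these assumptions, `check_time_range_format(['07:00:00', '10:00:00'])` passes silently, a six hour range logs a warning, and a reversed range raises `ValueError`.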