GTFS network refactor (#86)
sablanchard authored Apr 29, 2021
1 parent 44b3a6a commit 531c394
Showing 19 changed files with 1,852 additions and 637 deletions.
1 change: 1 addition & 0 deletions .travis.yml
@@ -21,6 +21,7 @@ install:
- conda create -n test-environment python=$TRAVIS_PYTHON_VERSION pyyaml --file requirements-dev.txt
- source activate test-environment
- conda info --all
- pip install 'numpy>=1.18'
- pip install .
- pip list
- pip show urbanaccess
68 changes: 60 additions & 8 deletions CONTRIBUTING.md
@@ -1,20 +1,68 @@
Thanks for using UrbanAccess!

This is an open source project that's part of the Urban Data Science Toolkit. Development and maintenance is a collaboration between UrbanSim Inc, U.C. Berkeley's Urban Analytics Lab, and other contributors.

## If you have a problem:

- Take a look at the [open issues](https://github.com/UDST/urbanaccess/issues) and [closed issues](https://github.com/UDST/urbanaccess/issues?q=is%3Aissue+is%3Aclosed) to see if there's already a related discussion

- Open a new issue describing the problem -- if possible, include any error messages, a full reproducible example of the code that generated the error, the operating system and version of Python you're using, and versions of any libraries that may be relevant

## Feature proposals:

- Take a look at the [open issues](https://github.com/UDST/urbanaccess/issues) and [closed issues](https://github.com/UDST/urbanaccess/issues?q=is%3Aissue+is%3Aclosed) to see if there's already a related discussion

- Post your proposal as a new issue, so we can discuss it (some proposals may not be a good fit for the project)

## Contributing code:

- Create a new branch of `UDST/urbanaccess/dev`, or fork the repository to your own account

- Make your changes, following the existing styles for code and inline documentation

- Add [tests](https://github.com/UDST/urbanaccess/tree/dev/urbanaccess/tests) if possible
- We use the test suite: Pytest

- Run tests and address any issues that may be flagged. If flags are raised that are not due to the PR, note that in a new comment in the PR
- Run Pytest test suite: `py.test`
- UrbanAccess currently supports Python 2.7, 3.5, 3.6, 3.7, 3.8. Tests will be run in these environments when the PR is created but any flags raised in these environments should also be addressed
- UrbanAccess also uses a series of integration tests to test entire workflows, run the integration tests:
- Run:
```
cd demo
jupyter nbconvert --to python simple_example.ipynb
cd ../urbanaccess/tests/integration
python remove_nb_magic.py -in simple_example.py -out simple_example_clean.py
cd ../../../demo
python simple_example_clean.py
cd ../urbanaccess/tests/integration
python integration_madison.py
python integration_sandiego.py
```
- Run pycodestyle Python style guide checker: `pycodestyle --max-line-length=100 urbanaccess`
- Open a pull request to the `UDST/urbanaccess` `dev` branch, including a writeup of your changes -- take a look at some of the closed PRs for examples
- Current maintainers will review the code, suggest changes, and hopefully merge it and schedule it for an upcoming release
## Updating the documentation:
- See instructions in `docs/README.md`
## Preparing a release:
- Make a new branch for release prep
- Update the version number and changelog
- Update the version number and changelog:
- `CHANGELOG.md`
- `setup.py`
- `urbanaccess/__init__.py`
- `docs/source/conf.py`
- `docs/source/index.rst`
- `docs/source/conf.py`
- Make sure all the tests are passing, and check if updates are needed to `README.md` or to the documentation
- Open a pull request to the master branch to finalize it

- After merging, tag the release on GitHub and follow the distribution procedures below
- Open a pull request to the `dev` branch to finalize it and wait for a PR review and approval
- After the PR has been approved, it can be merged to `dev`. Then a release PR can be created from `dev` to merge into `master`. Once merged, tag the release on GitHub and follow the distribution procedures below:
## Distributing a release on PyPI (for pip installation):
@@ -24,17 +72,21 @@
- Run `python setup.py sdist bdist_wheel --universal`
- This should create a `dist` directory containing two package files -- delete any old ones before the next step
- This should create a `dist` directory containing a gzip package file -- delete any old ones before the next step
- Run `twine upload dist/*` -- this will prompt you for your pypi.org credentials
- Check https://pypi.org/project/osmnet/ for the new version
- Check https://pypi.org/project/urbanaccess/ for the new version
## Distributing a release on Conda Forge (for conda installation):
- The [conda-forge/urbanaccess-feedstock](https://github.com/conda-forge/urbanaccess-feedstock) repository controls the Conda Forge release
- The [conda-forge/urbanaccess-feedstock](https://github.com/conda-forge/urbanaccess-feedstock) repository controls the Conda Forge release, including which GitHub users have maintainer status for the repo
- Conda Forge bots usually detect new releases on PyPI and set in motion the appropriate feedstock updates, which a current maintainer will need to approve and merge
- Maintainers can add on additional changes before merging the PR, for example to update the requirements or edit the list of maintainers
- You can also fork the feedstock and open a PR manually. It seems like this must be done from a personal account (not a group account like UDST) so that the bots can be granted permission for automated cleanup
- Check https://anaconda.org/conda-forge/urbanaccess for the new version (may take a few minutes for it to appear)
8 changes: 4 additions & 4 deletions urbanaccess/config.py
@@ -4,7 +4,7 @@

def _format_check(settings):
"""
Check the format of a urbanaccess_config object.
Check the format of an urbanaccess_config object.
Parameters
----------
@@ -84,7 +84,7 @@ def __init__(self,
def from_yaml(cls, configdir='configs',
yamlname='urbanaccess_config.yaml'):
"""
Create a urbanaccess_config instance from a saved YAML configuration.
Create an urbanaccess_config instance from a saved YAML configuration.
Parameters
----------
@@ -108,7 +108,7 @@ def from_yaml(cls, configdir='configs',
yaml_file = os.path.join(configdir, yamlname)

with open(yaml_file, 'r') as f:
yaml_config = yaml.load(f)
yaml_config = yaml.safe_load(f)

settings = cls(data_folder=yaml_config.get('data_folder', 'data'),
logs_folder=yaml_config.get('logs_folder', 'logs'),
@@ -143,7 +143,7 @@ def to_dict(self):
def to_yaml(self, configdir='configs', yamlname='urbanaccess_config.yaml',
overwrite=False):
"""
Save a urbanaccess_config representation to a YAML file.
Save an urbanaccess_config representation to a YAML file.
Parameters
----------
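
Note on the `yaml.load` to `yaml.safe_load` change in `from_yaml`: the sketch here illustrates why the swap is generally preferred. It assumes PyYAML is installed; the config text and the unsafe document are made up for illustration and are not taken from the repository. `safe_load` builds only plain Python types (dicts, lists, strings, numbers) and rejects tags that would construct arbitrary objects, and it also avoids the missing-`Loader` deprecation warning that recent PyYAML versions emit for a bare `yaml.load(f)`.

```python
import yaml

# Hypothetical config text standing in for urbanaccess_config.yaml.
config_text = """
data_folder: data
logs_folder: logs
"""

yaml_config = yaml.safe_load(config_text)

# Mirror the .get() fallbacks used in from_yaml().
data_folder = yaml_config.get('data_folder', 'data')
logs_folder = yaml_config.get('logs_folder', 'logs')

# A document that asks the loader to call a Python function.
# safe_load refuses to construct it and raises a YAMLError subclass.
unsafe_text = "!!python/object/apply:os.system ['echo unsafe']"
try:
    yaml.safe_load(unsafe_text)
    rejected = False
except yaml.YAMLError:
    rejected = True
```

For a config file that holds only scalars, lists, and mappings, as this one appears to, the loaded result should be the same as before the change.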
60 changes: 18 additions & 42 deletions urbanaccess/gtfs/headways.py
@@ -4,6 +4,7 @@
import logging as lg

from urbanaccess.utils import log
from urbanaccess.gtfs.utils_validation import _check_time_range_format
from urbanaccess.gtfs.network import _time_selector

warnings.simplefilter(action="ignore", category=FutureWarning)
@@ -68,26 +69,25 @@ def _headway_handler(interpolated_stop_times_df, trips_df,
Parameters
----------
interpolated_stop_times_df : pandas.DataFrame
interpolated stop times dataframe for stop times within the time range
interpolated stop times DataFrame for stop times within the time range
trips_df : pandas.DataFrame
trips dataframe
trips DataFrame
routes_df : pandas.DataFrame
routes dataframe
routes DataFrame
headway_timerange : list
time range for which to calculate headways between as a
list of time 1 and time 2 where times are 24 hour clock strings
such as:
['07:00:00', '10:00:00']
time range for which to calculate headways between in a list with time
1 and time 2 as strings. Must follow format of a 24 hour clock for
example: 08:00:00 or 17:00:00
Returns
-------
headway_by_routestop_df : pandas.DataFrame
dataframe of statistics of route stop headways in units of minutes
DataFrame of statistics of route stop headways in units of minutes
with relevant route and stop information
"""
start_time = time.time()

# add unique trip and route id
# add unique trip and route ID
trips_df['unique_trip_id'] = (
trips_df['trip_id'].str.cat(
trips_df['unique_agency_id'].astype('str'), sep='_'))
@@ -105,7 +105,7 @@ def _headway_handler(interpolated_stop_times_df, trips_df,

trips_df = trips_df[columns]

# add unique route id
# add unique route ID
routes_df['unique_route_id'] = (
routes_df['route_id'].str.cat(
routes_df['unique_agency_id'].astype('str'), sep='_'))
@@ -138,7 +138,7 @@ def _headway_handler(interpolated_stop_times_df, trips_df,
headway_by_routestop_df['unique_stop_id'].str.cat(
headway_by_routestop_df['unique_route_id'].astype('str'), sep='_'))

log('headway calculation complete. Took {:,.2f} seconds'.format(
log('Headway calculation complete. Took {:,.2f} seconds.'.format(
time.time() - start_time))

return headway_by_routestop_df
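
Note on the unique ID columns in `_headway_handler`: the hunks above build composite keys with `Series.str.cat`, joining a feed-level ID to `unique_agency_id` with an underscore. A minimal sketch of that pattern follows, assuming pandas is installed; the table contents are invented for illustration, and only the column names come from the diff.

```python
import pandas as pd

# Hypothetical trips table: the same trip_id appears under two agencies.
trips_df = pd.DataFrame({
    'trip_id': ['t1', 't1', 't2'],
    'unique_agency_id': ['agency_a', 'agency_b', 'agency_a'],
})

# Same construction as in the diff: trip_id + '_' + unique_agency_id.
trips_df['unique_trip_id'] = trips_df['trip_id'].str.cat(
    trips_df['unique_agency_id'].astype('str'), sep='_')

unique_ids = trips_df['unique_trip_id'].tolist()
# unique_ids is ['t1_agency_a', 't1_agency_b', 't2_agency_a']
```

The composite key keeps rows distinct when two agencies reuse the same `trip_id`, which matters once feeds from several agencies are merged into one table.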
@@ -153,9 +153,9 @@ def headways(gtfsfeeds_df, headway_timerange):
gtfsfeeds_df : object
gtfsfeeds_dfs object with all processed GTFS data tables
headway_timerange : list
time range for which to calculate headways between as a list of
time 1 and time 2 where times are 24 hour clock strings such as:
['07:00:00', '10:00:00']
time range for which to calculate headways between in a list with time
1 and time 2 as strings. Must follow format of a 24 hour clock for
example: 08:00:00 or 17:00:00
Returns
-------
@@ -164,39 +164,15 @@ def headways(gtfsfeeds_df, headway_timerange):
route stop headways in units of minutes
with relevant route and stop information
"""

time_error_statement = (
'{} starttime and endtime are not in the correct format. '
'Format should be a 24 hour clock in following format: 08:00:00 '
'or 17:00:00'.format(headway_timerange))
if not isinstance(headway_timerange, list) or len(headway_timerange) != 2:
raise ValueError('timerange must be a list of length 2')
if headway_timerange[0].split(':')[0] > headway_timerange[1].split(':')[0]:
raise ValueError('starttime is greater than endtime')

for t in headway_timerange:
if not isinstance(t, str):
raise ValueError(time_error_statement)
if len(t) != 8:
raise ValueError(time_error_statement)
if int(headway_timerange[1].split(':')[0]) - int(
headway_timerange[0].split(':')[0]) > 3:
long_time_range_msg = (
'WARNING: Time range passed: {} is a {} hour period. Long periods '
'over 3 hours may take a significant amount of time to process.')
log(long_time_range_msg.format(headway_timerange,
int(str(
headway_timerange[1][0:2])) - int(
str(headway_timerange[0][0:2]))),
level=lg.WARNING)
_check_time_range_format(headway_timerange)

if gtfsfeeds_df is None:
raise ValueError('gtfsfeeds_df cannot be None')
raise ValueError('gtfsfeeds_df cannot be None.')
if gtfsfeeds_df.stop_times_int.empty or gtfsfeeds_df.trips.empty or \
gtfsfeeds_df.routes.empty:
raise ValueError(
'one of the gtfsfeeds_dfs objects: stop_times_int, trips, '
'or routes were found to be empty.')
'One of the following gtfsfeeds_dfs objects: stop_times_int, '
'trips, or routes were found to be empty.')

headways_df = _headway_handler(
interpolated_stop_times_df=gtfsfeeds_df.stop_times_int,
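
Note on the move to `_check_time_range_format`: the body of that helper is not part of the hunks shown here, so the function in this sketch is a hypothetical stand-in, not the code in `urbanaccess/gtfs/utils_validation.py`. It reuses the error messages from the removed lines and makes two adjustments that seem worth considering. As the removed block reads, the hour strings are split and compared before each element is checked to be a string, and the comparison is done on strings rather than integers. The sketch validates types and format first, then compares hours as integers.

```python
import re

# HH:MM:SS with two digits per field, as in '08:00:00' or '17:00:00'.
_TIME_PATTERN = re.compile(r'\d{2}:\d{2}:\d{2}')


def check_time_range_format_sketch(timerange):
    """Hypothetical validator; returns the span between the two hours."""
    if not isinstance(timerange, list) or len(timerange) != 2:
        raise ValueError('timerange must be a list of length 2')

    # Check type and format before touching the string contents.
    for t in timerange:
        if not isinstance(t, str) or not _TIME_PATTERN.fullmatch(t):
            raise ValueError(
                '{} starttime and endtime are not in the correct format. '
                'Format should be a 24 hour clock in following format: '
                '08:00:00 or 17:00:00'.format(timerange))

    # Compare hours numerically rather than as strings.
    start_hour = int(timerange[0].split(':')[0])
    end_hour = int(timerange[1].split(':')[0])
    if start_hour > end_hour:
        raise ValueError('starttime is greater than endtime')

    return end_hour - start_hour


hours = check_time_range_format_sketch(['07:00:00', '10:00:00'])
# hours is 3
```

Returning the hour span is a design choice of this sketch only. It lets a caller decide whether to log the "over 3 hours" warning that the removed code emitted inline.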