Orbiting Carbon Observatory (OCO) Publications Tool

The OCO Publications Tool gathers, ingests, organizes, reports, and highlights publications related to the OCO missions. The software can be used for any subject matter, though; it does not have to be limited to a specific project, mission, or topic. However, because it was created for the OCO missions, you may see examples on pages and their forms that are OCO-specific. Feel free to change them as needed.

Overview

The OCO Publications Tool was built using Python's Bottle library, with a MySQL database backend. The site employs very basic styling from Bootstrap 5, so you are left to style the site the way you would like.

Google Scholar Alerts

The data on publications for this tool comes from Google Scholar. You will need to set up alerts for whatever keywords you are interested in, and have them mailed to an inbox that lets you download the messages as HTML files, which can then be parsed on your local system. See utils/parse_alerts.py for an example of how you can parse these email messages and ingest them into the database. Alert messages arrive as Google detects new publications, so check often for new messages.
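As a rough illustration of the parsing step, the sketch below pulls title and link pairs out of a saved alert message using only the standard library. It is not the code in utils/parse_alerts.py. The CSS class name gse_alrt_title is an assumption about how Google marks up result titles and may need adjusting, and the names AlertParser and parse_alert are hypothetical.

```python
from html.parser import HTMLParser


class AlertParser(HTMLParser):
    """Collect (title, url) pairs from one saved alert message."""

    # Assumption: result titles are links carrying this CSS class.
    TITLE_CLASS = "gse_alrt_title"

    def __init__(self):
        super().__init__()
        self.entries = []   # finished (title, url) pairs
        self._href = None   # href of the title link being read, if any
        self._text = []     # text fragments collected inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.TITLE_CLASS in classes:
            self._href = attrs.get("href") or ""
            self._text = []

    def handle_data(self, data):
        # Only keep text while we are inside a title link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace left over from the email layout.
            title = " ".join("".join(self._text).split())
            self.entries.append((title, self._href))
            self._href = None


def parse_alert(html_text):
    """Return a list of (title, url) tuples found in the alert HTML."""
    parser = AlertParser()
    parser.feed(html_text)
    parser.close()
    return parser.entries
```

Each returned pair could then be written to the review table for approval; how that insert is done depends on your schema.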

The Website

The website is powered by Python's Bottle library and is started by launching the main.py script. Remember to set debug=False if using it in production. Any credentials should be in the config.json file. You may also need to configure the site to run over HTTPS.
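One way to keep debug off by default is to derive the server options from config.json rather than hard-coding them. The sketch below is only an assumption about how that could look: the "server" section and its host, port, and debug keys are hypothetical and are not taken from the repository's config.json.

```python
import json


def load_config(path="config.json"):
    """Read the JSON configuration file into a dictionary."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def run_options(config):
    """Build server options from a parsed config, defaulting to safe values."""
    # Hypothetical layout: {"server": {"host": ..., "port": ..., "debug": ...}}
    server = config.get("server", {})
    return {
        "host": server.get("host", "127.0.0.1"),
        "port": int(server.get("port", 8080)),
        # Debug stays off unless the file explicitly sets it to true.
        "debug": server.get("debug", False) is True,
    }
```

The resulting dictionary uses the same keyword names as Bottle's run() function (host, port, debug), so it could be passed along as run(**run_options(load_config())).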

The front end of the tool includes the following pages:

Homepage: The main page that lists all available publications in the database, giving a total count at the top of the page. It shows how many publications were published in each year, in addition to publications that are in press or in review. You can also filter by year.
Review: This page shows what new publications have been ingested into the database from the Google Scholar Alerts. You can approve or reject entries here. If 'approved,' the publication is added to the database and immediately displayed on the website. If 'rejected,' the publication is removed from the database. You can also add comments about why the publication was approved or rejected, if desired. All data for this page is stored in the newPublications table in the database, which is meant to hold only this temporary data.
Add: Here you can manually add information on a publication if it was not detected by a Google Scholar Alert. You can import a publication by its DOI and verify the data it populates (pulled from CrossRef), or enter the information yourself.
Update: This page shows all existing entries in the database. If you wish to update an entry, click on the 'UPDATE CITATION' link and you can edit its information. Note that there is no 'delete' function through the website. Deletions must be manually done through the database.
Highlights: This section of the site is used to select any publications you think should be highlighted, and you can assign team members to create and submit highlight slides.
- Pending Highlights: All newly entered publications appear on this pending page. If you think a publication is worth highlighting, assign it a rank and a team member to review it. That team member should receive an email with their assignment.
- In-Progress Highlights: This page shows all highlights that were assigned to team members. Here, team members can submit their highlight slides, which are uploaded to the slides directory in the webroot. Slides should be in .pdf or .pptx format.
- Complete Highlights: List of all completed highlights, with links to their slides.
- Graveyard Highlights: List of publications that were not highlighted. If you decide that one of these is actually worth highlighting, you can move it back to the In-Progress page.
Reports: This page has links to API endpoints that generate CSV reports.
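For the DOI import described under Add, the sketch below shows one way metadata could be pulled from the public CrossRef REST API and flattened for a form. It is an illustration under assumptions, not the site's actual import code: the function names and the keys of the returned dictionary are hypothetical, while the CrossRef fields read here (DOI, title, container-title, issued, author) are part of CrossRef's works response.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_WORKS = "https://api.crossref.org/works/"


def fetch_crossref(doi, timeout=10):
    """Return the CrossRef 'message' object for a DOI (needs network access)."""
    url = CROSSREF_WORKS + urllib.parse.quote(doi, safe="/")
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)["message"]


def to_form_fields(message):
    """Flatten a CrossRef message into fields a form could pre-fill."""
    # Authors may lack a given name (for example, a consortium).
    authors = [
        " ".join(part for part in (a.get("given"), a.get("family")) if part)
        for a in message.get("author", [])
    ]
    # 'issued' holds nested date parts such as [[2021, 5, 14]].
    date_parts = message.get("issued", {}).get("date-parts", [[None]])
    year = date_parts[0][0] if date_parts and date_parts[0] else None
    return {
        "doi": message.get("DOI", ""),
        # CrossRef returns titles and journal names as lists.
        "title": (message.get("title") or [""])[0],
        "journal": (message.get("container-title") or [""])[0],
        "year": year,
        "authors": authors,
    }
```

A caller would run to_form_fields(fetch_crossref(doi)) and still show the result to a person for verification, since CrossRef records can be incomplete.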

The Database Backend

As noted, the database backend is powered by MySQL. The database schema used for the site can be found in db.sql. It will generate a database with empty tables.

Example Files

The reports/ folder contains an example report that could be generated by the site.
The slides/ folder contains a blank slide. This is the folder you would want to upload any highlight slides to.
The utils/ folder includes an example email parser and a function that alerts assignees that they have new highlights to review.
The config.json file is an example of what credentials are needed.
The db.sql file includes the structure of the database.

Note that there are places in various scripts that call for a URL or email addresses. You will have to update these to your website's URL and a list of folks you want to email.

Flowcharts

Adding a Publication to the System

flowchart TD;
    A[Google Alert of new publications];
    A --> B[System adds to Review page];
    B --> C[SUPERUSER Approves or Rejects based on relevance];
    C --> D[Rejected: Publication removed from system];
    C --> E[Approved: CURATOR notified];
    E --> F[CURATOR adds publication to website] --> |optional| G[USER updates a publication];

Highlights

flowchart TD;
    A[Added publication becomes a pending highlight];
    A --> B[SUPERUSER assigns highlight to user or rejects as a highlight] --> C[Rejected: Highlight sent to Graveyard];
    B --> D[Assigned: Highlight moved to In-Progress];
    D --> E[USER reviews highlight] --> C;
    E --> F[Accepted: Highlight slide uploaded by USER];
    F --> G[Highlight moved to Complete];
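
The highlight flow above can be read as a small state machine. The sketch below encodes it as a transition table, including the move from the Graveyard back to In-Progress mentioned in the page descriptions. The state names and the advance function are hypothetical and do not reflect how the database stores highlight status.

```python
# Allowed moves between highlight states, read off the flowchart.
TRANSITIONS = {
    "pending": {"in_progress", "graveyard"},      # assigned or rejected
    "in_progress": {"complete", "graveyard"},     # slide uploaded or rejected on review
    "graveyard": {"in_progress"},                 # revived after a second look
    "complete": set(),                            # terminal state
}


def advance(current, target):
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move highlight from {current!r} to {target!r}")
    return target
```

Keeping the allowed moves in one table makes it easy to check a requested change before updating a record.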

Roles

Roles are defined in the database:

USERS are responsible for reviewing and adding highlights
SUPERUSERS are responsible for approving/rejecting new publications and assigning/rejecting new highlights
CURATORS are responsible for adding newly approved publications to the website

A Note on Author Names

Author names are a particular problem since multiple people can share the same name. We attempt to assign a unique ID (authors.authorID in the database) to each unique author name we encounter. We then store a string of these IDs in publications.fullAuthors, separated by '#' characters. There are probably better ways to do this, such as storing a single long string of the authors and doing away with the authors table entirely, though you would still want to make sure all author names entered for a publication conform to some standard so that all your citations are consistent. Another way would be to use ORCIDs.
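
To make the '#'-separated format concrete, here is a minimal sketch of packing and unpacking such a string. It assumes the author IDs are integers, which the text above does not state, and the function names are hypothetical.

```python
SEPARATOR = "#"


def encode_author_ids(author_ids):
    """Join author IDs, in citation order, into one '#'-separated string."""
    # Assumption: IDs are integers; int() rejects anything else early.
    return SEPARATOR.join(str(int(author_id)) for author_id in author_ids)


def decode_author_ids(full_authors):
    """Split a '#'-separated string back into a list of integer IDs."""
    return [int(part) for part in full_authors.split(SEPARATOR) if part]


def author_names(full_authors, names_by_id):
    """Look up display names for a stored string, preserving author order."""
    return [names_by_id[author_id] for author_id in decode_author_ids(full_authors)]
```

For example, encode_author_ids([12, 7, 30]) gives "12#7#30", and decoding that string returns the IDs in the same order, which matters because author order is part of the citation.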

A Note on requirements.txt

You may need to remove the mysqlclient requirement from the file, and install it separately through pip or conda.
