Commit

Merge pull request #71 from GLAM-Workbench/v3
Bump version
wragge authored Sep 15, 2024
2 parents 61a63fa + fd4c491 commit 20b6de1
Showing 3 changed files with 640 additions and 628 deletions.
8 changes: 4 additions & 4 deletions .zenodo.json
@@ -5,7 +5,7 @@
"related_identifiers": [
{
"scheme": "url",
"identifier": "https://github.com/GLAM-Workbench/trove-newspapers/tree/v1.3.4",
"identifier": "https://github.com/GLAM-Workbench/trove-newspapers/tree/v2.0.0",
"relation": "isDerivedFrom",
"resource_type": "software"
},
@@ -22,7 +22,7 @@
"resource_type": "other"
}
],
"version": "v1.3.4",
"version": "v2.0.0",
"upload_type": "software",
"keywords": [
"digital humanities",
@@ -31,13 +31,13 @@
"newspapers",
"GLAM Workbench"
],
"publication_date": "2022-06-26",
"publication_date": "2024-09-15",
"creators": [
{
"orcid": "0000-0001-7956-4498",
"name": "Sherratt, Tim"
}
],
"access_right": "open",
"description": "<p>Current version: <a href=\"https://github.com/GLAM-Workbench/trove-newspapers/releases/tag/v1.3.4\">v1.3.4</a></p> <p>This repository contains Jupyter notebooks to work with data from Trove’s newspapers zone. For more information see the <a href=\"https://glam-workbench.net/trove-newspapers/\">Trove Newspapers</a> section of the GLAM Workbench.</p> <h2 id=\"notebook-topics\">Notebook topics</h2> <h3 id=\"trove-newspapers-in-context\">Trove newspapers in context</h3> <ul> <li><strong>Visualise the total number of newspaper articles in Trove by year and state</strong> – explore how Trove’s newspaper articles are distributed over time, and by state</li> <li><strong>Analyse rates of OCR correction</strong> – explore patterns in OCR text correction; how many corrections are there and where have they been made?</li> <li><strong>Finding non-English newspapers in Trove</strong> – use automated language detection to identify non-English language newspapers in Trove</li> <li><strong>Beyond the copyright cliff of death</strong> – find newspapers with content published after 1954</li> <li><strong>Gathering historical data about the addition of newspaper titles to Trove</strong> – find when newspaper titles were added to Trove by extracting lists from web archives</li> </ul> <h3 id=\"visualising-searches\">Visualising searches</h3> <ul> <li><strong>QueryPic</strong> – simple app to visualise newspaper searches over time, this is the latest version with many new features</li> <li><strong>QueryPic Deconstructed</strong> – an older version of QueryPic that lets you build queries using keywords, states, or newspapers</li> <li><strong>Visualise Trove newspaper searches over time</strong> – use facets to slice up newspaper search results and visualise over time</li> <li><strong>Map Trove newspaper results by state</strong> – create a choropleth map to visualise search results by state</li> <li><strong>Map Trove newspaper results by place of publication</strong> – 
links newspapers to their place of publication and maps the results</li> <li><strong>Map Trove newspaper results by place of publication over time</strong> – adds a time dimension to the example above</li> </ul> <h3 id=\"harvesting-data\">Harvesting data</h3> <p>See the <a href=\"https://glam-workbench.net/trove-harvester/\">Trove Newspaper and Gazette Harvester</a> if you want to harvest all the articles from a search.</p> <ul> <li><strong>Harvest information about newspaper issues</strong> – get information about available issues for each newspaper from the Trove API</li> <li><strong>Harvest the issues of a newspaper as PDFs</strong> – harvest available issues of a newspaper as PDFs</li> <li><strong>Harvest Australian Women’s Weekly covers (or the front pages of any newspaper)</strong> – harvest the front pages of any newspaper, including covers from the Australian Women’s Weekly</li> </ul> <h3 id=\"useful-tools\">Useful tools</h3> <ul> <li><strong>Save a Trove newspaper article as an image</strong> – grabs the page on which an article was published, and then crops the page image to the boundaries of the article to create a complete, intact image of the article as it was originally published</li> <li><strong>Download a page image</strong> – a simple app that lets you download page images as complete, high-resolution JPG files</li> <li><strong>Generate an article thumbnail</strong> – generate a nice square thumbnail image for a newspaper article</li> <li><strong>Upload Trove newspaper articles to Omeka-S</strong> – steps through the process of uploading Trove newspaper articles to your own Omeka-S instance via the API</li> </ul> <h3 id=\"tips-and-tricks\">Tips and tricks</h3> <ul> <li><strong>Today’s news yesterday</strong> – uses the <code>date</code> index and the <code>firstpageseq</code> parameter to find articles from exactly 100 years ago that were published on the front page</li> <li><strong>Create a Trove OCR corrections ticker</strong> – uses the 
<code>has:corrections</code> parameter to get the total number of newspaper articles with OCR corrections</li> <li><strong>Get a list of Trove newspapers that doesn’t include government gazettes</strong> – workaround for a problem with the <code>newspaper/titles</code> endpoint of the API</li> <li><strong>Get the page coordinates of a digitised newspaper article from Trove</strong> – demonstrates how to find the coordinates of a newspaper article on a digitised page</li> </ul> <h3 id=\"get-creative\">Get creative</h3> <ul> <li><strong>Make composite images from lots of Trove newspaper thumbnails</strong> – creates thumbnails from a search and compiles them into a mega image</li> <li><strong>Create ‘scissors and paste’ messages from Trove newspaper articles</strong> – snip words out of page images and compile them into the message of your choice</li> <li><strong>Create large composite images from snipped words</strong> – harvest multiple versions of a list of words and compile them all into one big image</li> </ul> <p>See the <a href=\"https://glam-workbench.github.io/trove-newspapers/\">GLAM Workbench for more details</a>.</p> <h3 id=\"data-files\">Data files</h3> <ul> <li>CSV formatted lists of newspaper titles in Trove <ul> <li><a href=\"trove_newspaper_titles_2009_2021.csv\">trove_newspaper_titles_2009_2021.csv</a> – complete dataset of captures and titles</li> <li><a href=\"trove_newspaper_titles_first_appearance_2009_2021.csv\">trove_newspaper_titles_first_appearance_2009_2021.csv</a> – filtered dataset, showing only the first appearance of each title / place / date range combination</li> <li>There is also an <a href=\"https://gist.github.com/wragge/7d80507c3e7957e271c572b8f664031a\">alphabetical list of newspaper titles</a>, showing approximately when they first appeared in Trove.</li> </ul></li> <li><a href=\"data/aww-issues.csv\">CSV formatted list of Australian Women’s Weekly issues, 1933 to 1982</a></li> <li><a 
href=\"https://cloudstor.aarnet.edu.au/plus/s/NaKjoKNFOGXXDNN\">Australian Women’s Weekly front covers, 1933 to 1982</a> (2,566 images on Cloudstor) For easy browsing, I’ve compiled the images into a set of PDF files, one for each decade, available from Dropbox: <ul> <li><a href=\"https://www.dropbox.com/s/0j6zpeuw6tbey5k/aww-1933-1939.pdf?dl=0\">1933 to 1939</a></li> <li><a href=\"https://www.dropbox.com/s/y1he8dd6h655weu/aww-1940-1949.pdf?dl=0\">1940 to 1949</a></li> <li><a href=\"https://www.dropbox.com/s/i9gp9i51nofmlqo/aww-1950-1959.pdf?dl=0\">1950 to 1959</a></li> <li><a href=\"https://www.dropbox.com/s/2of63tovcnphijo/aww-1960-1969.pdf?dl=0\">1960 to 1969</a></li> <li><a href=\"https://www.dropbox.com/s/f2yxpg8u4dx5uf2/aww-1970-1979.pdf?dl=0\">1970 to 1979</a></li> <li><a href=\"https://www.dropbox.com/s/xanohtas1fi7eu4/aww-1980-1982.pdf?dl=0\">1980 to 1982</a></li> </ul></li> <li><a href=\"https://gist.github.com/wragge/9aa385648cff5f0de0c7d4837896df97\">Trove newspapers with non-English language content</a></li> <li><a href=\"newspapers_post_54.csv\">Trove newspapers with articles published after 1954</a></li> </ul> <h2 id=\"cite-as\">Cite as</h2> <p>See the GLAM Workbench or <a href=\"https://doi.org/10.5281/zenodo.3521724\">Zenodo</a> for up-to-date citation details.</p> <hr /> <p>This repository is part of the <a href=\"https://glam-workbench.github.io/\">GLAM Workbench</a>.<br /> If you think this project is worthwhile, you might like <a href=\"https://github.com/sponsors/wragge?o=esb\">to sponsor me on GitHub</a>.</p>"
"description": "<p>CURRENT VERSION: v2.0.0</p> <p>A GLAM Workbench repository</p> <p>For more information and documentation see the <a href=\"https://glam-workbench.net/trove-newspapers\">Trove newspapers</a> section of the <a href=\"https://glam-workbench.net\">GLAM Workbench</a>.</p> <h2 id=\"notebooks\">Notebooks</h2> <ul> <li>Upload Trove newspaper articles to Omeka-S</li> <li>Today’s news yesterday</li> <li>Map Trove newspaper results by state</li> <li>Harvesting Australian Women’s Weekly covers (or the front pages of any newspaper)</li> <li>Beyond the copyright cliff of death</li> <li>Create a Trove OCR corrections ticker</li> <li>Create ‘scissors and paste’ messages from Trove newspaper articles</li> <li>Make composite images from lots of Trove newspaper thumbnails</li> <li>QueryPic</li> <li>Map Trove newspaper results by place of publication</li> <li>Harvest information about newspaper issues</li> <li>Harvest the issues of a newspaper as PDFs</li> <li>None</li> <li>Corrections of OCRd text in Trove’s newspapers</li> <li>Create large composite images from snipped words</li> <li>Save a Trove newspaper article as an image</li> <li>Visualise the total number of newspaper articles in Trove by year and state</li> <li>Gathering historical data about the addition of newspaper titles to Trove</li> <li>Get the page coordinates of a digitised newspaper article from Trove</li> <li>Finding non-English newspapers in Trove</li> <li>Visualise Trove newspaper searches over time</li> <li>Map Trove newspaper results by place of publication over time</li> <li>Download a page image</li> <li>Generate a thumbnail image from a Trove newspaper article</li> <li>QueryPic deconstructed</li> </ul> <h2 id=\"associated-datasets\">Associated datasets</h2> <ul> <li><a href=\"https://github.com/GLAM-Workbench/trove-newspapers-data-post-54\">trove-newspapers-data-post-54</a></li> <li><a 
href=\"https://github.com/GLAM-Workbench/trove-newspaper-titles-web-archives\">trove-newspaper-titles-web-archives</a></li> <li><a href=\"https://github.com/GLAM-Workbench/aww-data\">aww-data</a></li> <li><a href=\"https://github.com/GLAM-Workbench/trove-newspapers-corrections/\">trove-newspapers-corrections</a></li> <li><a href=\"https://github.com/GLAM-Workbench/trove-newspapers-non-english\">trove-newspapers-non-english</a></li> <li><a href=\"https://doi.org/10.5281/zenodo.12547036\">zenodo.12547036</a></li> </ul> <hr /> <p>Created by <a href=\"https://timsherratt.au\">Tim Sherratt</a> for the <a href=\"https://glam-workbench.net\">GLAM Workbench</a></p>"
}
2 changes: 2 additions & 0 deletions README.md
@@ -1,5 +1,7 @@
# trove-newspapers

CURRENT VERSION: v2.0.0

A GLAM Workbench repository

For more information and documentation see the [Trove newspapers](https://glam-workbench.net/trove-newspapers) section of the [GLAM Workbench](https://glam-workbench.net).
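
This commit bumps the version string in several places: the `version` field, the `isDerivedFrom` identifier and the description in `.zenodo.json`, and the `CURRENT VERSION:` line in `README.md`. A minimal sketch of a consistency check for those locations might look like the following. The function name `check_versions` is hypothetical and not part of this repository, and the field layout is assumed to match what the diff above shows.

```python
import json
import re


def check_versions(zenodo_text: str, readme_text: str) -> list[str]:
    """Return a list of mismatches; an empty list means everything agrees.

    Hypothetical helper: assumes .zenodo.json has the fields shown in the
    diff (version, description, related_identifiers).
    """
    meta = json.loads(zenodo_text)
    version = meta["version"]
    problems = []

    # The description is assumed to embed "CURRENT VERSION: vX.Y.Z".
    if f"CURRENT VERSION: {version}" not in meta.get("description", ""):
        problems.append(f"description does not mention {version}")

    # The isDerivedFrom identifier is assumed to end in /tree/<version>.
    for rel in meta.get("related_identifiers", []):
        if rel.get("relation") == "isDerivedFrom":
            if not rel.get("identifier", "").endswith(f"/tree/{version}"):
                problems.append(f"isDerivedFrom does not point at {version}")

    # The README is assumed to carry the version on a line of its own.
    match = re.search(r"^CURRENT VERSION: (\S+)$", readme_text, flags=re.MULTILINE)
    if match is None or match.group(1) != version:
        problems.append(f"README does not state {version}")

    return problems
```

To try it against a checkout, pass in the contents of the two files, for example `check_versions(Path(".zenodo.json").read_text(), Path("README.md").read_text())` with `pathlib.Path` imported.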
0 comments on commit 20b6de1
