Getting statistics on Usage

Naomi Dushay edited this page Oct 12, 2022 · 6 revisions

You can fetch usage statistics by running commands on the Rails console or Rake tasks on the server. These commands are intended for developers and the Repository Manager, not for general users. Note that in the descriptions below, "project" == "BatchContext" in our code. When you fill in the form in pre-assembly, you create a new "BatchContext", which represents an accessioning project. A single project can then spawn both discovery reports and pre-assembly jobs using those parameters (sometimes multiple jobs, if there are errors the first time). So a single project produces one or more jobs.

Rails Console Commands

The following commands need to be run on the Rails console on the server, so SSH into the server and start the console:

ssh preassembly@sul-preassembly-prod.stanford.edu
cd pre-assembly/current
bundle exec rails c -e production

Total number of unique projects (i.e. batch_contexts, includes all pre-assembly and discovery report derived jobs):

BatchContext.count

Total number of Job Runs (both pre-assembly and discovery report):

JobRun.count

Total number of pre-assembly jobs:

JobRun.where(job_type: 'preassembly').count

Total number of discovery report jobs:

JobRun.where(job_type: 'discovery_report').count

Total number of projects that used a file manifest:

BatchContext.where(using_file_manifest: true).count

Total number of projects that did not use a file manifest:

BatchContext.where(using_file_manifest: false).count

Total number of distinct users who have created projects:

User.count

Number of projects by user, descending:

BatchContext.joins(:user).group(:sunet_id).count.sort_by {|_key, value| -value}.to_h
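
Calling `count` on a grouped relation returns a plain Hash keyed by `sunet_id`, so the sorting step can be tried outside the console. The following is a minimal sketch in plain Ruby; the SUNet IDs and counts are made up for illustration and do not come from the real database.

```ruby
# Hypothetical stand-in for the Hash returned by
# BatchContext.joins(:user).group(:sunet_id).count
counts = { 'user_a' => 3, 'user_b' => 12, 'user_c' => 7 }

# sort_by orders ascending, so negate the count to get descending order.
# Ruby Hashes preserve insertion order, so to_h keeps the sorted order.
by_user_desc = counts.sort_by { |_sunet_id, count| -count }.to_h

by_user_desc.each { |sunet_id, count| puts "#{sunet_id}: #{count}" }
```

Sorting ascending and then calling `.reverse` on the resulting array of pairs gives an equivalent ordering when no two users have the same count.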

Number of projects by user in a time frame, descending:

BatchContext.where("batch_contexts.updated_at > ?", DateTime.now.utc - 1.years).joins(:user).group(:sunet_id).count.sort_by {|_key, value| value}.reverse!

Users who use file_manifest.csv, with the number of projects each has created that use one:

BatchContext.where(using_file_manifest: true).joins(:user).group(:sunet_id).count.sort_by {|_key, value| value}.to_h

Number of projects by user within a given calendar year:

year = 2022
BatchContext.joins(:user).group(:sunet_id).where('batch_contexts.created_at >= ? and batch_contexts.created_at < ?', Time.zone.parse("#{year}/01/01"), Time.zone.parse("#{year + 1}/01/01")).count.sort_by {|_key, value| value}.to_h
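
Year boundaries are easy to get slightly wrong: an upper bound of midnight on December 31 leaves out anything created during that last day. A half-open interval from January 1 of the year up to (but not including) January 1 of the next year avoids this. The sketch below uses plain Ruby `Time` in UTC rather than Rails' `Time.zone`, and the helper names `year_bounds` and `in_year?` are hypothetical, chosen only for illustration.

```ruby
# Plain-Ruby sketch (no Rails): start of the year and start of the next year, in UTC.
def year_bounds(year)
  [Time.utc(year, 1, 1), Time.utc(year + 1, 1, 1)]
end

# True when the timestamp falls in the half-open interval [Jan 1, next Jan 1).
def in_year?(timestamp, year)
  start_of_year, start_of_next_year = year_bounds(year)
  timestamp >= start_of_year && timestamp < start_of_next_year
end

# A project created on the afternoon of December 31 still counts for that year.
late_december = in_year?(Time.utc(2022, 12, 31, 15, 30), 2022)
# Midnight on January 1 of the following year does not.
next_new_year = in_year?(Time.utc(2023, 1, 1), 2022)

puts late_december
puts next_new_year
```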

Rake Tasks

The following is run as a Rake task, so NOT on the Rails console, but from the shell on the server, in the directory where the app is installed. The task iterates over all discovery report jobs in the database and outputs statistics for every job whose JSON report is still available on disk. The output is a CSV file with the following headers: num_objects, num_files, num_errors, runtime_minutes, user, report_date

ssh preassembly@sul-preassembly-prod.stanford.edu
cd pre-assembly/current
RAILS_ENV=production bundle exec rake reports:discovery
less tmp/discovery_report_stats.csv
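
Once the CSV exists, it can be summarized with Ruby's standard `csv` library instead of reading it by eye. The sketch below is illustrative only: the sample rows are invented, and the value formats (numeric `runtime_minutes`, ISO-style `report_date`) are assumptions, since only the header names are documented above.

```ruby
require 'csv'

# Hypothetical sample using the headers the rake task writes.
sample = <<~CSV
  num_objects,num_files,num_errors,runtime_minutes,user,report_date
  10,120,0,2.5,user_a,2022-03-01
  4,16,1,0.5,user_b,2022-03-02
  25,300,2,6.0,user_a,2022-04-10
CSV

# Each row becomes a Hash keyed by header name.
rows = CSV.parse(sample, headers: true).map(&:to_h)
# Against the real file, something like this should work instead:
# rows = CSV.read('tmp/discovery_report_stats.csv', headers: true).map(&:to_h)

total_objects = rows.sum { |row| row['num_objects'].to_i }
total_files = rows.sum { |row| row['num_files'].to_i }
reports_with_errors = rows.count { |row| row['num_errors'].to_i.positive? }
mean_runtime = rows.sum { |row| row['runtime_minutes'].to_f } / rows.size
reports_by_user = rows.map { |row| row['user'] }.tally

puts "objects: #{total_objects}, files: #{total_files}"
puts "reports with errors: #{reports_with_errors}"
puts "mean runtime (minutes): #{mean_runtime}"
puts "reports by user: #{reports_by_user}"
```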