Commit

Merge pull request #1 from cul/aws_checksum_verification_and_websocket_server_implementation

Add aws checksum verification logic and websocket server implementation
elohanlon authored Jul 23, 2024
2 parents 0593c28 + 46a6110 commit 8e60e51
Showing 33 changed files with 992 additions and 12 deletions.
9 changes: 9 additions & 0 deletions .rubocop.yml
@@ -19,3 +19,12 @@ AllCops:

Lint/MissingCopEnableDirective:
  Enabled: false

Metrics/MethodLength:
  Exclude:
    - lib/check_please/aws/object_fixity_checker.rb
    - app/controllers/fixity_checks_controller.rb

RSpec/VerifiedDoubles:
  Exclude:
    - spec/check_please/aws/object_fixity_checker_spec.rb
4 changes: 4 additions & 0 deletions Gemfile
@@ -16,6 +16,8 @@ gem 'best_type', '~> 1.0'
gem 'bootsnap', require: false
# Add CRC32C support to the Ruby Digest module
gem 'digest-crc', '~> 0.6.5'
# Client library for connecting to a websocket endpoint
gem 'faye-websocket', '~> 0.11.3'
# Google Cloud Storage SDK
gem 'google-cloud-storage', '~> 1.49'
# Use JavaScript with ESM import maps [https://github.com/rails/importmap-rails]
@@ -70,6 +72,8 @@ gem 'omniauth-cul', '~> 0.2.0'
group :development, :test do
# See https://guides.rubyonrails.org/debugging_rails_applications.html#debugging-with-the-debug-gem
gem 'debug', platforms: %i[mri windows]
# json_spec for easier json comparison in tests
gem 'json_spec'
# Rubocul for linting
gem 'rubocul', '~> 4.0.11'
# gem 'rubocul', path: '../rubocul'
13 changes: 13 additions & 0 deletions Gemfile.lock
@@ -164,6 +164,7 @@ GEM
drb (2.2.1)
ed25519 (1.3.0)
erubi (1.12.0)
eventmachine (1.2.7)
factory_bot (6.4.6)
activesupport (>= 5.0.0)
factory_bot_rails (6.4.3)
@@ -173,6 +174,9 @@ GEM
faraday-net_http (>= 2.0, < 3.2)
faraday-net_http (3.1.0)
net-http
faye-websocket (0.11.3)
eventmachine (>= 0.12.0)
websocket-driver (>= 0.5.1)
ffi (1.16.3)
globalid (1.2.1)
activesupport (>= 6.1)
@@ -227,6 +231,9 @@ GEM
activesupport (>= 5.0.0)
jmespath (1.6.2)
json (2.7.1)
json_spec (1.1.5)
multi_json (~> 1.0)
rspec (>= 2.0, < 4.0)
jwt (2.8.1)
base64
language_server-protocol (3.17.0.3)
@@ -361,6 +368,10 @@ GEM
sinatra (>= 0.9.2)
retriable (3.1.2)
rexml (3.2.6)
rspec (3.13.0)
rspec-core (~> 3.13.0)
rspec-expectations (~> 3.13.0)
rspec-mocks (~> 3.13.0)
rspec-core (3.13.0)
rspec-support (~> 3.13.0)
rspec-expectations (3.13.0)
@@ -518,9 +529,11 @@ DEPENDENCIES
devise
digest-crc (~> 0.6.5)
factory_bot_rails
faye-websocket (~> 0.11.3)
google-cloud-storage (~> 1.49)
importmap-rails
jbuilder
json_spec
omniauth
omniauth-cul (~> 0.2.0)
puma (~> 6.0)
19 changes: 14 additions & 5 deletions README.md
@@ -2,8 +2,10 @@

DLST app for performing fixity checks on cloud storage files.

+ ## Development

+ ### First-Time Setup

- **First-Time Setup (for developers)**
Clone the repository.
`git clone git@github.com:cul/check_please.git`

@@ -16,8 +18,15 @@ Set up config files.
Run database migrations.
`bundle exec rake db:migrate`

- Seed the database with necessary values for operation.
- `rails db:seed`
+ Start the application using `bundle exec rails server`.
+ `bundle exec rails s -p 3000`

+ ## Testing

+ Run: `bundle exec rspec`

+ ## Deployment

+ Run: `bundle exec cap [env] deploy`

- Start the application using `rails server`.
- `rails s -p 3000`
+ NOTE: Only the `dev` environment deploy target is fully set up at this time.
14 changes: 14 additions & 0 deletions app/channels/application_cable/connection.rb
@@ -2,5 +2,19 @@

module ApplicationCable
  class Connection < ActionCable::Connection::Base
    identified_by :uuid

    def connect
      authenticate! # reject connections that do not successfully authenticate
      self.uuid = SecureRandom.uuid # assign a random uuid value when a user connects
    end

    private

    def authenticate!
      return if request.authorization&.split(' ')&.at(1) == CHECK_PLEASE['remote_request_api_key']

      reject_unauthorized_connection
    end
  end
end
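
`authenticate!` compares the presented key with `==`, while `ApiController#authenticate_request_token` in this same commit uses `ActiveSupport::SecurityUtils.secure_compare`. The following is a minimal, Rails-free sketch of the same header check done in constant time. The helper names `bearer_token`, `constant_time_equal?`, and `authorized?` are hypothetical and not part of the commit; inside the app, the ActiveSupport helper would likely be the more natural choice.

```ruby
require 'digest'

# Hypothetical helpers (not part of the commit) illustrating a constant-time API key check.

# Returns the second whitespace-separated element of an Authorization header value
# (e.g. "Bearer abc123" -> "abc123"), mirroring the split(' ').at(1) logic in Connection.
def bearer_token(authorization_header)
  authorization_header&.split(' ')&.at(1)
end

# Compares two strings without stopping at the first differing byte.
def constant_time_equal?(expected, given)
  return false if expected.nil? || given.nil?

  # Hash both values first so the byte-wise comparison always runs over equal-length input.
  expected_digest = Digest::SHA256.digest(expected)
  given_digest = Digest::SHA256.digest(given)

  difference = 0
  expected_digest.bytes.zip(given_digest.bytes) { |a, b| difference |= a ^ b }
  difference.zero?
end

# True only when the header carries a token equal to the configured key.
def authorized?(authorization_header, api_key)
  constant_time_equal?(api_key, bearer_token(authorization_header))
end
```

A connection would then be rejected whenever `authorized?(request.authorization, key)` is false, including when the header is missing altogether.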
46 changes: 46 additions & 0 deletions app/channels/fixity_check_channel.rb
@@ -0,0 +1,46 @@
# frozen_string_literal: true

class FixityCheckChannel < ApplicationCable::Channel
  FIXITY_CHECK_STREAM_PREFIX = "#{CHECK_PLEASE['action_cable_stream_prefix']}fixity_check:".freeze

  # A websocket client subscribes by sending this message:
  # {
  #   "command" => "subscribe",
  #   "identifier" => { "channel" => "FixityCheckChannel", "job_identifier" => "cool-job-id1" }.to_json
  # }
  def subscribed
    return if params[:job_identifier].blank?

    stream_name = "#{FIXITY_CHECK_STREAM_PREFIX}#{params[:job_identifier]}"
    Rails.logger.debug "A client has started streaming from: #{stream_name}"
    stream_from stream_name
  end

  def unsubscribed
    # Any cleanup needed when channel is unsubscribed
    return if params[:job_identifier].blank?

    stream_name = "#{FIXITY_CHECK_STREAM_PREFIX}#{params[:job_identifier]}"
    Rails.logger.debug "A client has stopped streaming from: #{stream_name}"
    stop_stream_from stream_name
  end

  # A websocket client runs this action by sending this message
  # (Action Cable routes channel actions through the "message" command;
  # the action name is carried in "data"):
  # {
  #   "command" => "message",
  #   "identifier" => { "channel" => "FixityCheckChannel", "job_identifier" => "cool-job-id1" }.to_json,
  #   "data" => {
  #     "action" => "run_fixity_check_for_s3_object", "bucket_name" => "some-bucket",
  #     "object_path" => "path/to/object.png", "checksum_algorithm_name" => "sha256"
  #   }.to_json
  # }
  def run_fixity_check_for_s3_object(data)
    Rails.logger.debug("run_fixity_check_for_s3_object action received with job_identifier: #{params[:job_identifier]}")
    job_identifier = params[:job_identifier]
    bucket_name = data['bucket_name']
    object_path = data['object_path']
    checksum_algorithm_name = data['checksum_algorithm_name']

    AwsCheckFixityJob.perform_later(job_identifier, bucket_name, object_path, checksum_algorithm_name)
  end
end
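
The channel comments describe the frames a client sends. As a rough sketch of the client side, the helpers below build those frames and decode a frame pushed back by the server. All helper names are hypothetical; only the JSON shapes come from the channel and job code. Two assumptions are baked in: Action Cable dispatches channel actions through the `message` command, with the action name inside `data`, and because the job broadcasts an already-serialized JSON string, the `message` value in a received frame may need a second `JSON.parse`.

```ruby
require 'json'

# Hypothetical client-side helpers; only the JSON shapes are taken from the channel and job code.

# The identifier is itself a JSON-encoded string nested inside the outer frame.
def channel_identifier(job_identifier)
  { 'channel' => 'FixityCheckChannel', 'job_identifier' => job_identifier }.to_json
end

def subscribe_message(job_identifier)
  { 'command' => 'subscribe', 'identifier' => channel_identifier(job_identifier) }.to_json
end

# Assumption: channel actions travel in the "message" command, with the
# action name inside the JSON-encoded "data" payload.
def run_fixity_check_message(job_identifier, bucket_name, object_path, checksum_algorithm_name)
  {
    'command' => 'message',
    'identifier' => channel_identifier(job_identifier),
    'data' => {
      'action' => 'run_fixity_check_for_s3_object',
      'bucket_name' => bucket_name,
      'object_path' => object_path,
      'checksum_algorithm_name' => checksum_algorithm_name
    }.to_json
  }.to_json
end

# Decodes a frame pushed by the server. When "message" is a String it is assumed
# to hold JSON (as broadcast by the job) and is parsed a second time.
def parse_broadcast(raw_frame)
  frame = JSON.parse(raw_frame)
  message = frame['message']
  message = JSON.parse(message) if message.is_a?(String)
  message
end
```

A client would send `subscribe_message(id)` first, then `run_fixity_check_message(...)`, and pass each incoming frame to `parse_broadcast`, switching on the resulting `type` value.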
36 changes: 36 additions & 0 deletions app/controllers/api_controller.rb
@@ -0,0 +1,36 @@
# frozen_string_literal: true

class ApiController < ActionController::API
  include ActionController::HttpAuthentication::Token::ControllerMethods

  rescue_from ActiveRecord::RecordNotFound do
    render json: errors('Not Found'), status: :not_found
  end

  private

  # Returns 406 status if format requested is not json. This method can be
  # used as a before_action callback for any controllers that only respond
  # to json.
  def ensure_json_request
    return if request.format.blank? || request.format == :json

    head :not_acceptable
  end

  # Renders with an :unauthorized status if no request token is provided or if the
  # provided token is invalid. This method should be used as a before_action callback
  # for any controller actions that require authorization.
  def authenticate_request_token
    authenticate_or_request_with_http_token do |token, _options|
      ActiveSupport::SecurityUtils.secure_compare(CHECK_PLEASE['remote_request_api_key'], token)
    end
  end

  # Generates JSON with errors
  #
  # @param String|Array json response describing errors
  def errors(errors)
    { errors: Array.wrap(errors).map { |e| { message: e } } }
  end
end
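
To make the shape of the `errors` helper's return value concrete, here is a plain-Ruby approximation. `errors_payload` is a hypothetical name, and `Kernel#Array` stands in for ActiveSupport's `Array.wrap`; the two behave the same for a single string or an array of strings, which is all this sketch covers. The messages in the second call are illustrative only.

```ruby
require 'json'

# Plain-Ruby approximation of ApiController#errors (hypothetical name).
# Kernel#Array is used in place of ActiveSupport's Array.wrap.
def errors_payload(errors)
  { errors: Array(errors).map { |e| { message: e } } }
end

# A single message and a list of messages both produce a list of { message: ... } entries.
puts errors_payload('Not Found').to_json
puts errors_payload(['first problem', 'second problem']).to_json
```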
44 changes: 44 additions & 0 deletions app/controllers/fixity_checks_controller.rb
@@ -0,0 +1,44 @@
# frozen_string_literal: true

class FixityChecksController < ApiController
  before_action :authenticate_request_token

  # POST /fixity_checks/run_fixity_check_for_s3_object
  def run_fixity_check_for_s3_object
    bucket_name = fixity_check_params['bucket_name']
    object_path = fixity_check_params['object_path']
    checksum_algorithm_name = fixity_check_params['checksum_algorithm_name']

    checksum_hexdigest, object_size = CheckPlease::Aws::ObjectFixityChecker.check(
      bucket_name, object_path, checksum_algorithm_name
    )

    render plain: {
      bucket_name: bucket_name, object_path: object_path, checksum_algorithm_name: checksum_algorithm_name,
      checksum_hexdigest: checksum_hexdigest, object_size: object_size
    }.to_json
  rescue StandardError => e
    render plain: {
      error_message: e.message,
      bucket_name: bucket_name, object_path: object_path, checksum_algorithm_name: checksum_algorithm_name
    }.to_json, status: :bad_request
  end

  private

  # Builds the JSON success payload. Not currently called by the action above.
  def fixity_check_response(bucket_name, object_path, checksum_algorithm_name, checksum_hexdigest, object_size)
    {
      bucket_name: bucket_name, object_path: object_path, checksum_algorithm_name: checksum_algorithm_name,
      checksum_hexdigest: checksum_hexdigest, object_size: object_size
    }.to_json
  end

  def fixity_check_params
    params.require(:fixity_check).tap do |fixity_check_params|
      fixity_check_params.require(:bucket_name)
      fixity_check_params.require(:object_path)
      fixity_check_params.require(:checksum_algorithm_name)
    end
  end
end
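
For illustration, a client request to this endpoint might be assembled as follows. This is a sketch under stated assumptions: the host name and API key are placeholders, `build_fixity_check_request` is a hypothetical helper, and the `Bearer` scheme is assumed to be accepted by Rails' HTTP token authentication (the `Token token="..."` form is the other documented option). The request is built but not sent.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Builds (but does not send) a POST request for the fixity check endpoint.
# base_url and api_key are placeholders supplied by the caller.
def build_fixity_check_request(base_url, api_key, bucket_name, object_path, checksum_algorithm_name)
  uri = URI.join(base_url, '/fixity_checks/run_fixity_check_for_s3_object')
  request = Net::HTTP::Post.new(uri)
  request['Authorization'] = "Bearer #{api_key}"
  request['Content-Type'] = 'application/json'
  # The controller requires these three keys nested under "fixity_check".
  request.body = {
    fixity_check: {
      bucket_name: bucket_name,
      object_path: object_path,
      checksum_algorithm_name: checksum_algorithm_name
    }
  }.to_json
  request
end

request = build_fixity_check_request(
  'https://check-please.example.org', 'example-api-key', 'some-bucket', 'path/to/object.png', 'sha256'
)
# To send it:
# Net::HTTP.start(request.uri.hostname, request.uri.port, use_ssl: true) { |http| http.request(request) }
```

Because the action renders with `render plain:`, a client should expect to parse the response body as JSON itself rather than rely on the response content type.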
67 changes: 67 additions & 0 deletions app/jobs/aws_check_fixity_job.rb
@@ -0,0 +1,67 @@
# frozen_string_literal: true

class AwsCheckFixityJob < ApplicationJob
  queue_as CheckPlease::Queues::CHECK_FIXITY

  def perform(job_identifier, bucket_name, object_path, checksum_algorithm_name)
    response_stream_name = "#{FixityCheckChannel::FIXITY_CHECK_STREAM_PREFIX}#{job_identifier}"

    checksum_hexdigest, object_size = CheckPlease::Aws::ObjectFixityChecker.check(
      bucket_name,
      object_path,
      checksum_algorithm_name,
      on_chunk: progress_report_lambda(response_stream_name)
    )

    # Broadcast message when job is complete
    broadcast_fixity_check_complete(
      response_stream_name, bucket_name, object_path, checksum_algorithm_name, checksum_hexdigest, object_size
    )
  rescue StandardError => e
    broadcast_fixity_check_error(response_stream_name, e.message, bucket_name, object_path, checksum_algorithm_name)
  end

  def broadcast_fixity_check_complete(
    response_stream_name, bucket_name, object_path, checksum_algorithm_name, checksum_hexdigest, object_size
  )
    ActionCable.server.broadcast(
      response_stream_name,
      {
        type: 'fixity_check_complete',
        data: {
          bucket_name: bucket_name, object_path: object_path,
          checksum_algorithm_name: checksum_algorithm_name,
          checksum_hexdigest: checksum_hexdigest, object_size: object_size
        }
      }.to_json
    )
  end

  def broadcast_fixity_check_error(
    response_stream_name, error_message, bucket_name, object_path, checksum_algorithm_name
  )
    ActionCable.server.broadcast(
      response_stream_name,
      {
        type: 'fixity_check_error',
        data: {
          error_message: error_message, bucket_name: bucket_name,
          object_path: object_path, checksum_algorithm_name: checksum_algorithm_name
        }
      }.to_json
    )
  end

  def progress_report_lambda(response_stream_name)
    lambda do |_chunk, _bytes_read, chunk_counter|
      return unless (chunk_counter % 100).zero?

      # We periodically broadcast a message to indicate that the processing is still happening.
      # This is so that a client can check whether a job has stalled.
      ActionCable.server.broadcast(
        response_stream_name,
        { type: 'fixity_check_in_progress' }.to_json
      )
    end
  end
end
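
The job passes an `on_chunk` callback that receives the chunk, the running byte count, and a chunk counter. `CheckPlease::Aws::ObjectFixityChecker` itself is not shown in this part of the diff, so the following is only a local analogue under assumptions: `chunked_checksum` is a hypothetical function that reads from any IO object, hard-codes SHA-256, and invokes the callback with the same three arguments. It is meant to show how a throttled progress callback fits into a streaming checksum, not how the real checker is written.

```ruby
require 'digest'
require 'stringio'

# Hypothetical local analogue of a chunked fixity check. The real
# CheckPlease::Aws::ObjectFixityChecker is not shown in this part of the diff.
# Returns [hexdigest, total_bytes_read], the same pair the job destructures.
def chunked_checksum(io, chunk_size: 5 * 1024 * 1024, on_chunk: nil)
  digest = Digest::SHA256.new
  bytes_read = 0
  chunk_counter = 0

  # IO#read with a length returns nil at end of input, which ends the loop.
  while (chunk = io.read(chunk_size))
    digest.update(chunk)
    bytes_read += chunk.bytesize
    chunk_counter += 1
    on_chunk&.call(chunk, bytes_read, chunk_counter)
  end

  [digest.hexdigest, bytes_read]
end

# Record progress on every 100th chunk only, as progress_report_lambda does.
progress_reports = []
reporter = lambda do |_chunk, bytes_read, chunk_counter|
  return unless (chunk_counter % 100).zero?

  progress_reports << bytes_read
end

data = 'a' * 1_000
hexdigest, size = chunked_checksum(StringIO.new(data), chunk_size: 4, on_chunk: reporter)
```

With 1,000 bytes read four at a time, the callback runs 250 times but records progress only at chunks 100 and 200, which is the throttling behaviour the job relies on to keep broadcast volume low.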
3 changes: 2 additions & 1 deletion config/deploy.rb
@@ -26,7 +26,8 @@
'config/gcp.yml',
'config/permissions.yml',
'config/redis.yml',
- 'config/resque.yml'
+ 'config/resque.yml',
+ 'config/cable.yml'

# Default value for linked_dirs is []
append :linked_dirs,
2 changes: 1 addition & 1 deletion config/deploy/dev.rb
@@ -1,5 +1,5 @@
# frozen_string_literal: true

- server 'check-please-dev.library.columbia.edu', user: fetch(:remote_user), roles: %w[app db web]
+ server 'fixity-test-2.svc.cul.columbia.edu', user: fetch(:remote_user), roles: %w[app db web]
# Current branch is suggested by default in development
ask :branch, `git rev-parse --abbrev-ref HEAD`.chomp
2 changes: 1 addition & 1 deletion config/deploy/prod.rb
@@ -1,5 +1,5 @@
# frozen_string_literal: true

- server 'check-please.library.columbia.edu', user: fetch(:remote_user), roles: %w[app db web]
+ server 'not-available-yet.library.columbia.edu', user: fetch(:remote_user), roles: %w[app db web]
# In test/prod, suggest latest tag as default version to deploy
ask :branch, proc { `git tag --sort=version:refname`.split("\n").last }
2 changes: 1 addition & 1 deletion config/deploy/test.rb
@@ -1,5 +1,5 @@
# frozen_string_literal: true

- server 'check-please-test.library.columbia.edu', user: fetch(:remote_user), roles: %w[app db web]
+ server 'not-available-yet.library.columbia.edu', user: fetch(:remote_user), roles: %w[app db web]
# In test/prod, suggest latest tag as default version to deploy
ask :branch, proc { `git tag --sort=version:refname`.split("\n").last }
2 changes: 2 additions & 0 deletions config/environments/deployed.rb
@@ -44,6 +44,8 @@
# config.action_cable.mount_path = nil
# config.action_cable.url = 'wss://example.com/cable'
# config.action_cable.allowed_request_origins = [ 'http://example.com', /http:\/\/example.*/ ]
# Allow Action Cable access from any origin.
config.action_cable.disable_request_forgery_protection = true

# Force all access to the app over SSL, use Strict-Transport-Security, and use secure cookies.
# config.force_ssl = true
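
With request forgery protection disabled, Action Cable no longer checks the `Origin` header, which leaves the API key check in `ApplicationCable::Connection` as the only gate on websocket connections. Should an origin allowlist be wanted later, the commented `allowed_request_origins` line above shows its shape: a list of strings and regular expressions. The sketch below illustrates that kind of matching in isolation. `origin_allowed?` and the sample allowlist are made-up names and values, and this is not Action Cable's actual code.

```ruby
# Hypothetical stand-alone origin check. Case equality (===) lets String entries
# match exactly while Regexp entries match by pattern.
def origin_allowed?(origin, allowed_origins)
  return false if origin.nil?

  allowed_origins.any? { |allowed| allowed === origin }
end

# Illustrative allowlist only, modelled on the commented example in the config file.
sample_allowlist = ['http://example.com', %r{\Ahttp://example.*}]

exact_match = origin_allowed?('http://example.com', sample_allowlist)
pattern_match = origin_allowed?('http://example.org', sample_allowlist)
no_match = origin_allowed?('https://other.test', sample_allowlist)
```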
2 changes: 1 addition & 1 deletion config/environments/development.rb
@@ -71,7 +71,7 @@
# config.action_view.annotate_rendered_view_with_filenames = true

# Uncomment if you wish to allow Action Cable access from any origin.
- # config.action_cable.disable_request_forgery_protection = true
+ config.action_cable.disable_request_forgery_protection = true

# Raise error when a before_action's only/except options reference missing actions
config.action_controller.raise_on_missing_callback_actions = true
4 changes: 4 additions & 0 deletions config/environments/test.rb
@@ -63,4 +63,8 @@

# Raise error when a before_action's only/except options reference missing actions
config.action_controller.raise_on_missing_callback_actions = true

# Allow Action Cable access from any origin (so that it works in Capybara tests)
config.action_cable.disable_request_forgery_protection = true
# config.action_cable.allowed_request_origins = ['https://rubyonrails.com', %r{http://ruby.*}]
end
5 changes: 5 additions & 0 deletions config/routes.rb
@@ -25,4 +25,9 @@

# Defines the root path route ("/")
root 'pages#home'

post '/fixity_checks/run_fixity_check_for_s3_object', to: 'fixity_checks#run_fixity_check_for_s3_object'

# Mount ActionCable Websocket route
mount ActionCable.server => '/cable'
end