This repository has been archived by the owner on Oct 8, 2024. It is now read-only.

plymouthsoftware/athens

 
 


Athens

Athens is a wrapper around the standard AWS Athena SDK, with a much simpler interface for executing queries and processing results.

Installation

Add this line to your application's Gemfile:

gem 'athens'

And then execute:

$ bundle

Or install it yourself as:

$ gem install athens

Usage

Quickstart

Athens has two main classes: Connection and Query. First, "open" a connection to the database:

conn = Athens::Connection.new(database: 'sample')

Then start a query:

query = conn.execute("SELECT * FROM mytable")

That kicks off an Athena query in the background. If you want, you can simply wait for it to finish:

query.wait
# or
query.wait(5) # Wait 5 seconds at most

When your query is done, grab the results as an array:

results = query.to_a
# [
#   ['column_1', 'column_2', 'column_3'],
#   [15, 'data', true],
#   [20, 'foo', false],
#   ...
# ]

Or as a hash (which is really an array where each row is a hash):

results = query.to_h
# [
#   {'column_1': 15, 'column_2': 'data', 'column_3': true},
#   {'column_1': 20, 'column_2': 'foo', 'column_3': false},
#   ...
# ]
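The two shapes are closely related: the header row of the array form supplies the keys of the hash form. Here is a minimal sketch in plain Ruby, using the sample data above rather than a live query. The string keys are an assumption made for illustration; this is not necessarily how Athens builds its hashes internally.

```ruby
# Sample data in the shape shown for query.to_a (header row first).
rows = [
  ['column_1', 'column_2', 'column_3'],
  [15, 'data', true],
  [20, 'foo', false]
]

# Split off the header row, leaving only the data rows.
header, *data = rows

# Pair each value with its column name to get one hash per row.
records = data.map { |row| header.zip(row).to_h }
```

The same destructuring (header, *data = ...) is a handy way to drop the header row when working with to_a output directly.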

Results are also available as unbuffered enumerators of row arrays:

query.rows.each {|row| ...}
# ['column_1', 'column_2', 'column_3']
# [15, 'data', true]
# [20, 'foo', false]
# ...

Or hashes:

query.records.each {|record| ...}
# {'column_1': 15, 'column_2': 'data', 'column_3': true}
# {'column_1': 20, 'column_2': 'foo', 'column_3': false}
# ...
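Because these are enumerators, they combine with Ruby's lazy enumeration so that you stop reading rows as soon as you have what you need. The sketch below uses a hand-built Enumerator as a stand-in for query.records (the data is hypothetical, and string keys are an assumption) so that it runs without an AWS connection.

```ruby
# Stand-in for query.records: an Enumerator that yields one hash at a time.
# Hypothetical data; a real query would pull rows from Athena instead.
records = Enumerator.new do |y|
  y << { 'column_1' => 15, 'column_2' => 'data', 'column_3' => true }
  y << { 'column_1' => 20, 'column_2' => 'foo',  'column_3' => false }
  y << { 'column_1' => 25, 'column_2' => 'bar',  'column_3' => true }
end

# lazy + first(n) stops pulling records once n matches have been found,
# so later records are never requested from the enumerator.
first_match = records.lazy.select { |r| r['column_3'] }.first(1)
```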

Athens attempts to parse the SQL data types into their Ruby equivalents, although there is currently no support for the more complex Array/Map types.

Configuration

Configure your AWS settings in an Athens.configure block (in Rails, put this in config/initializers/athens.rb):

Athens.configure do |config|
  config.output_location = "s3://my-bucket/my-folder/athena/results/"  # Required
  config.aws_access_key      = 'access'     # Optional
  config.aws_secret_key      = 'secret'     # Optional
  config.aws_profile         = 'myprofile'  # Optional
  config.aws_region          = 'us-east-1'  # Optional
  config.wait_polling_period = 0.25         # Optional - How often to poll while waiting for the query to complete
  config.result_encryption   = nil          # Optional, see below
end

The AWS parameters are all "optional", in that you can omit them in favor of any of the standard AWS configuration mechanisms (e.g. IAM roles, environment variables, .aws/credentials files).

You can also override the AWS client configuration on a per-connection basis:

conn = Athens::Connection.new(aws_client_override: {})

Take a look at the AWS Athena SDK for a list of all the available options.

The result_encryption option controls how query results are encrypted at the output_location. If you don't set it at all, it defaults to Amazon S3 server-side encryption (SSE):

{ encryption_option: "SSE_S3" }

If you set it to nil, it'll default to the bucket encryption settings. You can also use a customer KMS key; see https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/Athena/Types/EncryptionConfiguration.html for the correct format.
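For the KMS case, the AWS EncryptionConfiguration type takes an encryption_option plus a kms_key. A sketch of what that hash might look like follows; the key ARN is a made-up placeholder, and the constant name is only for illustration.

```ruby
# Hypothetical KMS key ARN; substitute the ARN (or key ID) of your own key.
KMS_RESULT_ENCRYPTION = {
  encryption_option: "SSE_KMS",  # alternatives in the AWS SDK: "SSE_S3", "CSE_KMS"
  kms_key: "arn:aws:kms:us-east-1:123456789012:key/00000000-0000-0000-0000-000000000000"
}.freeze

# Inside the Athens.configure block this would be assigned as:
#   config.result_encryption = KMS_RESULT_ENCRYPTION
```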

Advanced Usage

Providing a database name to the connection is optional. If you omit the name, you'll have to specify it in your query:

conn = Athens::Connection.new(database: 'sample')
query = conn.execute("SELECT * FROM mytable")

# or

conn = Athens::Connection.new
query = conn.execute("SELECT * FROM sample.mytable")

While waiting for a query to finish, you could get one of two exceptions:

conn = Athens::Connection.new(database: 'sample')
query = conn.execute("SELECT * FROM mytable")

begin
  query.wait
rescue Athens::QueryFailedError => qfe
  # Query returned a failure message, qfe.message has details
rescue Athens::QueryCancelledError => qce
  # Query was canceled (usually by the user), qce.message has details
end
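If you'd rather branch on a return value than rescue at every call site, the two exceptions can be folded into a small helper. The sketch below is one possible shape, not part of Athens: wait_with_status and FailingQuery are hypothetical names, and the stand-in error classes exist only so the example runs without the gem installed.

```ruby
# Stand-ins so this sketch runs without the gem; when Athens is loaded,
# the real error classes are used and this block is skipped.
unless defined?(Athens)
  module Athens
    class QueryFailedError < StandardError; end
    class QueryCancelledError < StandardError; end
  end
end

# Hypothetical helper: waits on a query and returns [status, detail]
# instead of raising.
def wait_with_status(query)
  query.wait
  [:ok, nil]
rescue Athens::QueryFailedError => e
  [:failed, e.message]
rescue Athens::QueryCancelledError => e
  [:cancelled, e.message]
end

# Fake query used only to exercise the helper; it always fails.
FailingQuery = Struct.new(:reason) do
  def wait
    raise Athens::QueryFailedError, reason
  end
end

status, detail = wait_with_status(FailingQuery.new('SYNTAX_ERROR: line 1'))
```

With a real connection you would pass the object returned by conn.execute in place of the fake.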

When a query is running you can do a few things:

conn = Athens::Connection.new(database: 'sample')
query = conn.execute("SELECT * FROM mytable")

query.state  # Returns one of QUEUED, RUNNING, SUCCEEDED, FAILED, or CANCELLED (https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/Athena/Types/QueryExecutionStatus.html#state-instance_method)
query.state_reason  # Further details from AWS about the state
query.query_execution_id  # The id of the query returned from AWS
query.cancel   # Attempts to cancel an in-progress query, returns true or false (if the query has already finished this will return false)

query.to_a(header_row: false)  # If you want your query results returned without a header row of column names
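As an alternative to blocking in wait, you can poll query.state yourself and do other work between checks. The following is a rough sketch under the assumption that query.state reflects the latest status each time it is called; finished? and poll_state are hypothetical helpers, not part of Athens.

```ruby
# Of the five states listed above, these three mean the query is done;
# QUEUED and RUNNING mean it is still in progress.
TERMINAL_STATES = %w[SUCCEEDED FAILED CANCELLED].freeze

def finished?(state)
  TERMINAL_STATES.include?(state)
end

# Hypothetical helper: checks the state up to max_polls times, sleeping
# between checks. Returns the terminal state, or nil if the query was
# still in progress after the last check.
def poll_state(query, interval: 0.25, max_polls: 20)
  max_polls.times do
    state = query.state
    return state if finished?(state)
    sleep interval
  end
  nil
end

# Usage with a real query object would look like:
#   final_state = poll_state(query, interval: 1, max_polls: 30)
#   query.cancel if final_state.nil?
```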

The execute method also optionally supports the request_token and work_group parameters:

conn = Athens::Connection.new(database: 'sample')
query = conn.execute("SELECT * FROM mytable", request_token: single_use_token, work_group: my_work_group)
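The single_use_token above is left for you to supply. One straightforward way to generate one is a UUID from Ruby's standard library, as sketched below. This assumes Athens forwards request_token to Athena's client request token, which AWS documents as a 32 to 128 character string used to make the request idempotent.

```ruby
require 'securerandom'

# A fresh UUID per logical request. A UUID string is 36 characters long,
# which fits within the 32-128 character range for client request tokens.
single_use_token = SecureRandom.uuid

# Passed to execute as in the example above:
#   conn.execute("SELECT * FROM mytable", request_token: single_use_token)
```

Reusing the same token when retrying a request is what makes the retry safe; generate a new one for each distinct query you intend to run.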

Development

After checking out the repo, run bin/setup to install dependencies. You can also run bin/console for an interactive prompt that will allow you to experiment.

If you prefer, you can use Vagrant instead. There's already a Vagrantfile, so a simple vagrant up should get you set up.

If you want to use Docker, grab a Ruby image and boot into the console. Then install git and bash, and you can run the bin/setup script from there:

docker run --rm -it -v $(pwd):/app -w /app ruby:2.7-alpine /bin/sh -c 'apk add bash git;/bin/bash'

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/getletterpress/athens. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

The gem is available as open source under the terms of the WTFPL License.

Code of Conduct

Everyone interacting in the Athens project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.
