DynamoDB
DynamoDB is Amazon's distributed key-value store, optimized for fast reads and writes of data. Like many other distributed key-value stores, its query language does not support joins.
Reads can be either strongly consistent (similar to HBase) or eventually consistent (like Cassandra). Eventually consistent reads are the default, and strong consistency is requested per read.
When using AWS managed services, the boto library for Python (boto3 is the current version) is a good tool for connecting to and setting up AWS services.
To install boto3, run the command:
pip install boto3
To keep your AWS credentials out of your source code, boto lets you store them in a file, ~/.boto, which you have to create. boto3 also reads the shared credentials file ~/.aws/credentials.
## copy and paste the three lines (modified with your AWS keys) into ~/.boto
[Credentials]
aws_access_key_id=XXXXX
aws_secret_access_key=XXXxxxXXX/XXXxxxXX
Using boto, a simple script to create a table in DynamoDB looks like this:
import boto3

dynamodb = boto3.resource('dynamodb')

table = dynamodb.create_table(
    TableName='myTable',
    KeySchema=[
        {
            'AttributeName': 'myKey',
            'KeyType': 'HASH'
        }
    ],
    AttributeDefinitions=[
        {
            'AttributeName': 'myKey',
            'AttributeType': 'S'
        }
    ],
    # pricing determined by ProvisionedThroughput
    ProvisionedThroughput={
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5
    }
)
The key feature of this script is that it defines a key, 'myKey' in this example, which is specified as a HASH key and is a String (hence the S type). Note that you do not need to specify a schema for the other attributes of each row, which is why DynamoDB is sometimes referred to as a "schemaless database".
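Note that the last settings, 'ReadCapacityUnits' and 'WriteCapacityUnits', set the provisioned throughput: how many reads and writes per second the table supports. One read capacity unit covers one strongly consistent read per second of an item up to 4 KB, and one write capacity unit covers one write per second of an item up to 1 KB.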
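To get a feel for how the capacity settings relate to a workload, here is a minimal sketch that estimates the units needed for a steady request rate. It assumes the standard sizing rules (a read unit covers up to 4 KB, a write unit up to 1 KB, and an eventually consistent read costs half a strongly consistent one) and treats 1 KB as 1024 bytes. The function names are hypothetical helpers, not part of boto.

```python
import math


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate the ReadCapacityUnits needed for a steady read rate."""
    # Assumption: each read consumes one unit per 4 KB block of the item.
    units_per_read = math.ceil(item_size_bytes / 4096)
    total = units_per_read * reads_per_second
    # Assumption: eventually consistent reads cost half as much.
    if not strongly_consistent:
        total = total / 2
    return math.ceil(total)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate the WriteCapacityUnits needed for a steady write rate."""
    # Assumption: each write consumes one unit per 1 KB block of the item.
    units_per_write = math.ceil(item_size_bytes / 1024)
    return units_per_write * writes_per_second


if __name__ == "__main__":
    # Small items at 5 requests per second fit the settings used in the script.
    print(read_capacity_units(3000, 5))
    print(write_capacity_units(500, 5))
```

Under these assumptions, the values of 5 used in the table definition are enough for roughly five small reads and five small writes per second. Larger items or higher request rates need proportionally more units.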
An example script to write data to the table we created follows. Note that update_item creates each item if it does not already exist:
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('myTable')

# write 100 entries, keyed "key0" through "key99"
for i in range(100):
    curKey = "key" + str(i)
    curValue = "Hello World! This is data entry " + str(i)
    response = table.update_item(
        Key={
            "myKey": curKey
        },
        # set (or overwrite) the attribute myValueString on this item
        UpdateExpression="set myValueString = :val",
        ExpressionAttributeValues={
            ':val': curValue
        },
        # return only the attributes that were just updated
        ReturnValues="UPDATED_NEW"
    )
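To check that the writes landed, the entries can be read back with get_item. The following is a minimal sketch, assuming the table and attribute names from the scripts above. The helper read_entry is a hypothetical name introduced here for illustration. It takes the boto3 Table object as an argument, and its ConsistentRead flag switches between eventually consistent and strongly consistent reads.

```python
def read_entry(table, key, consistent=False):
    """Return the myValueString attribute stored under `key`, or None if absent.

    `table` is expected to be a boto3 DynamoDB Table resource,
    for example boto3.resource('dynamodb').Table('myTable').
    """
    response = table.get_item(
        Key={"myKey": key},
        # False requests an eventually consistent read,
        # True requests a strongly consistent read
        ConsistentRead=consistent,
    )
    # get_item omits the 'Item' entry when no item has this key
    item = response.get("Item")
    if item is None:
        return None
    return item.get("myValueString")


# Usage against a real table (requires AWS credentials and a configured region):
#   import boto3
#   table = boto3.resource('dynamodb').Table('myTable')
#   print(read_entry(table, 'key0', consistent=True))
```

A strongly consistent read reflects all writes acknowledged before it. An eventually consistent read may briefly return older data but consumes less read capacity.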