ethan-pritchard/aws-qbusiness-java-boilerplate

aws-qbusiness-java-boilerplate

A Java boilerplate for AWS Q Business infrastructure using a custom data source connector.

Getting Started

This boilerplate has two todos:

Workflows

Batch Processing Workflow

An AWS Step Functions state machine that, when executed:

  • Manages the AWS Q Business data source sync job lifecycle (start, stop)
  • Calls BatchDocument, which fetches documents from your persistence layer, transforms them into AWS Q Business Documents, and indexes them iteratively with qbusiness:BatchPutDocument
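The iterative indexing step can be sketched as splitting the transformed documents into fixed-size batches and making one request per batch. This is a minimal sketch, not the repository's code: the class name, the `batchIndexer` callback (a stand-in for the real qbusiness:BatchPutDocument call), and the batch size of 10 are assumptions; check the API's documented limit before relying on it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Sketch: split documents into batches and hand each batch to an indexer. */
final class BatchIndexingSketch {

    // Assumption: at most 10 documents are sent per BatchPutDocument request.
    static final int MAX_DOCUMENTS_PER_BATCH = 10;

    /** Splits the input into consecutive sublists of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /** Calls the (hypothetical) indexer once per batch and returns the number of calls made. */
    static <T> int indexAll(List<T> documents, Consumer<List<T>> batchIndexer) {
        List<List<T>> batches = partition(documents, MAX_DOCUMENTS_PER_BATCH);
        batches.forEach(batchIndexer);
        return batches.size();
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 23; i++) {
            docs.add("doc-" + i);
        }
        // The lambda stands in for the real BatchPutDocument request.
        int calls = indexAll(docs, batch -> System.out.println("indexing " + batch.size() + " documents"));
        System.out.println(calls + " batch requests");
    }
}
```

In a real handler the callback would build and send the BatchPutDocument request for each batch, and would likely inspect the response for failed documents.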

This boilerplate demonstrates its flexibility by adding pagination support with the following modifications:

  • Added the parameters maxPages, nextToken, startDate, and endDate
  • Modified BatchDocument to:
    • Consume nextToken, startDate, and endDate from the request
    • Output a nextToken if applicable
    • Pass startDate and endDate through into the response
  • Modified the batch processing state machine to:
    • Increment currentPage for each BatchDocument request
    • Recursively call BatchDocument with the refreshed nextToken until currentPage exceeds maxPages or nextToken is ""
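The BatchDocument changes listed above amount to a small request/response contract. The following is a hedged sketch of that contract only: the record and interface names, the field types, and the `DocumentSource` abstraction are assumptions for illustration and are not taken from the repository.

```java
import java.util.List;

/** Sketch of the BatchDocument request/response contract described above. */
final class BatchDocumentSketch {

    /** Request fields named in the README; the Java types are assumptions. */
    record Request(String nextToken, long startDate, long endDate) {}

    /** Response: token for the next page ("" when exhausted) plus the dates passed through. */
    record Response(String nextToken, long startDate, long endDate, int documentCount) {}

    /** Hypothetical page of results from your persistence layer. */
    record Page(List<String> documents, String nextToken) {}

    /** Hypothetical paginated source; replace with your own persistence query. */
    interface DocumentSource {
        Page fetch(String nextToken, long startDate, long endDate);
    }

    static Response handle(Request request, DocumentSource source) {
        // Consume nextToken, startDate, and endDate from the request.
        Page page = source.fetch(request.nextToken(), request.startDate(), request.endDate());

        // Transforming and indexing page.documents() would happen here.

        // Output a nextToken if applicable, otherwise the empty string.
        String next = page.nextToken() == null ? "" : page.nextToken();

        // Pass startDate and endDate through so the next iteration can reuse them.
        return new Response(next, request.startDate(), request.endDate(), page.documents().size());
    }
}
```

Passing the dates through unchanged lets the state machine feed one response straight into the next request without keeping separate state for the timeframe.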

For larger workflows, I recommend the AWS Step Functions Parallel state instead of recursion.

How To Use

Trigger TriggerBatchProcessingStateMachineFunction with this input:

{
    "applicationId": "Q Business application Id",
    "dataSourceId": "Q Business data source Id",
    "indexId": "Q Business index Id",
    "maxPages": Max nextTokens to consume,
    "nextToken": "" OR Paginated next token,
    "startDate": -1 OR Valid timestamp,
    "endDate": -1 OR Valid timestamp
}

There are four configurations of parameters for the TriggerBatchProcessingStateMachineFunction:

  • nextToken is "" and startDate/endDate are -1
    • Starts at the beginning of the pagination API
  • nextToken is "" and startDate/endDate are timestamps
    • Starts at the beginning of the timeframe-filtered pagination API
  • nextToken is a paginated next token and startDate/endDate are -1
    • Starts at that next token of the pagination API
  • nextToken is a paginated next token and startDate/endDate are timestamps
    • Starts at that next token of the timeframe-filtered pagination API
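The four configurations can be read as two independent switches: whether a next token is supplied, and whether a timeframe is supplied. Below is a small sketch of that decision. The class, enum, and method names are hypothetical, and rejecting a half-specified timeframe (only one of startDate/endDate set) is an assumption, since the README does not list that case.

```java
/** Sketch: map the trigger input onto the four configurations listed above. */
final class TriggerInputSketch {

    // The README uses -1 to mean "no date filter".
    static final long NO_DATE = -1L;

    enum StartMode {
        BEGINNING,
        BEGINNING_OF_TIMEFRAME,
        NEXT_TOKEN,
        NEXT_TOKEN_IN_TIMEFRAME
    }

    static StartMode classify(String nextToken, long startDate, long endDate) {
        boolean hasToken = nextToken != null && !nextToken.isEmpty();
        boolean hasStart = startDate != NO_DATE;
        boolean hasEnd = endDate != NO_DATE;
        if (hasStart != hasEnd) {
            // Assumption: a half-specified timeframe is not one of the four listed configurations.
            throw new IllegalArgumentException("startDate and endDate must both be -1 or both be timestamps");
        }
        if (hasToken) {
            return hasStart ? StartMode.NEXT_TOKEN_IN_TIMEFRAME : StartMode.NEXT_TOKEN;
        }
        return hasStart ? StartMode.BEGINNING_OF_TIMEFRAME : StartMode.BEGINNING;
    }

    public static void main(String[] args) {
        System.out.println(classify("", -1L, -1L));
        System.out.println(classify("some-token", 1700000000L, 1700003600L));
    }
}
```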

Optionally, add automation using AWS EventBridge Scheduler to trigger TriggerBatchProcessingStateMachineFunction on a schedule.

The BatchProcessingStateMachine will recursively call the pagination API until:

  • maxPages is reached, or
  • nextToken output from BatchDocumentLambda is ""
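The two stop conditions can be modeled outside Step Functions as a plain loop. This is one interpretation under stated assumptions, not the state machine definition itself: the class name and the `batchDocument` operator (which takes the current token and returns the next one) are hypothetical, and maxPages is treated here as the maximum number of BatchDocument calls.

```java
import java.util.function.UnaryOperator;

/** Sketch: the state machine's pagination loop expressed as ordinary Java. */
final class PaginationLoopSketch {

    /**
     * Calls batchDocument repeatedly, feeding each returned token into the next call.
     * Stops when maxPages calls have been made or an empty token is returned.
     * Returns the number of pages processed.
     */
    static int run(String initialToken, int maxPages, UnaryOperator<String> batchDocument) {
        String nextToken = initialToken;
        int currentPage = 0;
        while (currentPage < maxPages) {
            nextToken = batchDocument.apply(nextToken);
            currentPage++;
            if (nextToken.isEmpty()) {
                break; // the source has no more pages
            }
        }
        return currentPage;
    }

    public static void main(String[] args) {
        // Fake source with five pages; tokens are "1".."4", then "" after the fifth page.
        UnaryOperator<String> fiveToks = token -> {
            int page = token.isEmpty() ? 0 : Integer.parseInt(token);
            return page + 1 < 5 ? String.valueOf(page + 1) : "";
        };
        System.out.println(run("", 3, fiveToks) + " pages processed with maxPages=3");
        System.out.println(run("", 10, fiveToks) + " pages processed with maxPages=10");
    }
}
```

Capping the loop with maxPages keeps a single execution bounded even if the source keeps returning tokens; a later execution can resume by passing the last nextToken as input.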

Deployment

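  • aws s3api create-bucket --bucket <bucket name> (for regions other than us-east-1, also pass --create-bucket-configuration LocationConstraint=<region>)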
  • aws cloudformation package --template-file ./cfn/template.json --s3-bucket <bucket name> --output-template-file ./target/template.json
  • aws cloudformation deploy --template-file ./target/template.json --capabilities "CAPABILITY_NAMED_IAM" --stack-name <stack name> (optionally add --parameter-overrides IdCInstanceArn=<idc instance arn>)

License

LICENSE