Possible corrupted table cache info on application start #3520

Open
1 task
Sussumu opened this issue Oct 19, 2024 · 0 comments
Assignees
Labels
bug This issue is a bug. dynamodb needs-reproduction This issue needs reproduction. p2 This is a standard priority issue


Sussumu commented Oct 19, 2024

Describe the bug

We recently hit a bug in production where the application could not load any document, failing with an error stating that the number of hash keys was not exactly one. The application had been running for a few months with no changes whatsoever, so we initially suspected an unwanted infrastructure change. After a restart, everything returned to normal.

I didn't spend much time investigating the AWS SDK code, but from what I could see, it checks that the number of hash keys declared for the table is exactly one. It gets this data from a previously cached value, which may come either from a DescribeTable call or from the code itself, depending on the value of DisableFetchingTableMetadata. Our code didn't set this property explicitly, so the data presumably came from a DescribeTable call. Please correct me if I'm wrong.

Is it possible that this call returned corrupt data?
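
One way to see what DescribeTable currently reports for the table is to call it directly and count the hash keys in the returned key schema. This is only a diagnostic sketch, not the SDK's internal check: the table name, endpoint, region, and credentials below are placeholder assumptions and need to be replaced with real values.

```csharp
using System;
using System.Linq;
using Amazon.DynamoDBv2;
using Amazon.Runtime;

// Hypothetical placeholder: replace with the real table name.
const string tableName = "my-table";

// Assumption: a local endpoint with dummy credentials; adjust for a real account.
var client = new AmazonDynamoDBClient(
    new BasicAWSCredentials("local", "local"),
    new AmazonDynamoDBConfig
    {
        ServiceURL = "http://localhost:8000/",
        AuthenticationRegion = "us-west-1"
    });

// Ask the service for the table description and read its key schema.
var response = await client.DescribeTableAsync(tableName);
var keySchema = response.Table.KeySchema;

// Split the key schema into hash (partition) and range (sort) keys.
var hashKeys = keySchema
    .Where(k => k.KeyType == KeyType.HASH)
    .Select(k => k.AttributeName)
    .ToList();
var rangeKeys = keySchema
    .Where(k => k.KeyType == KeyType.RANGE)
    .Select(k => k.AttributeName)
    .ToList();

Console.WriteLine($"Hash keys ({hashKeys.Count}): {string.Join(", ", hashKeys)}");
Console.WriteLine($"Range keys ({rangeKeys.Count}): {string.Join(", ", rangeKeys)}");
```

Logging this output at application start would show whether the service ever returns a key schema without exactly one hash key.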

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

The application was supposed to load a document by its partition key and sort key, as it had been doing for a few months.

Current Behavior

System.InvalidOperationException: Must have one hash key defined for the table <<TABLE_NAME>>
at Amazon.DynamoDBv2.DataModel.DynamoDBContext.MakeKey(Object hashKey, Object rangeKey, ItemStorageConfig storageConfig, DynamoDBFlatConfig flatConfig)
at Amazon.DynamoDBv2.DataModel.DynamoDBContext.LoadHelperAsync[T](Object hashKey, Object rangeKey, DynamoDBOperationConfig operationConfig, CancellationToken cancellationToken)
at Amazon.DynamoDBv2.DataModel.DynamoDBContext.LoadAsync[T](Object hashKey, Object rangeKey, DynamoDBOperationConfig operationConfig, CancellationToken cancellationToken)

We inject an IDynamoDBContext and load the document like this:

await _context.LoadAsync<TModel>(partitionKey, sortKey, configuration);

Since the restart, we haven't seen this error again.

Reproduction Steps

I've copied only the most important parts. There's nothing special about this configuration, and we basically copy/paste it into other projects with no problems. I can't reproduce the error now. Perhaps the same error could be triggered if a background call, such as the DescribeTable call mentioned above, returned altered data.

using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.DataModel;
using Amazon.Runtime;

const string serviceUrl = "http://localhost:8000/";
const string authenticationRegion = "us-west-1";

var localstackCredentials = new BasicAWSCredentials("local", "local");
var dynamoDbConfig = new AmazonDynamoDBConfig
{
    ServiceURL = serviceUrl,
    AuthenticationRegion = authenticationRegion
};

var dynamoDbClient = new AmazonDynamoDBClient(localstackCredentials, dynamoDbConfig);
var dynamoDbContext = new DynamoDBContext(dynamoDbClient);

var configuration = new DynamoDBOperationConfig
{
    Conversion = DynamoDBEntryConversion.V2,
    ConsistentRead = true,
    RetrieveDateTimeInUtc = true
};

// The exception is thrown here
// The document exists in DynamoDB
// The load is against the table itself, not a GSI or LSI
var document = await dynamoDbContext.LoadAsync<Model>("partitionKey", "sortKey", configuration);

// Table name is correct
// We don't configure any other attribute like [DynamoDBHashKey]
[DynamoDBTable(TableNames.SOME_CONSTANT)]
public class Model
{
    public string PartitionKey { get; set; }
    public string SortKey { get; set; }
}

Possible Solution

As I said, I think it's related to the underlying DescribeTable call. I assume that setting DisableFetchingTableMetadata to true and manually specifying the keys may fix this, since it's one less moving part.
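
A minimal sketch of that idea, assuming the installed SDK version exposes DisableFetchingTableMetadata on DynamoDBContextConfig: the context is told not to fetch table metadata, and the keys are declared on the model with attributes instead. The table name, endpoint, region, and credentials are placeholder assumptions, and I haven't verified that this avoids the error.

```csharp
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.DataModel;
using Amazon.Runtime;

// Assumption: a local endpoint with dummy credentials; adjust for a real account.
var client = new AmazonDynamoDBClient(
    new BasicAWSCredentials("local", "local"),
    new AmazonDynamoDBConfig
    {
        ServiceURL = "http://localhost:8000/",
        AuthenticationRegion = "us-west-1"
    });

// Skip the DescribeTable lookup; key information then comes from the attributes below.
var contextConfig = new DynamoDBContextConfig
{
    DisableFetchingTableMetadata = true
};
var context = new DynamoDBContext(client, contextConfig);

var document = await context.LoadAsync<Model>("partitionKey", "sortKey");

// Hypothetical table name; the real code uses a constant.
[DynamoDBTable("my-table")]
public class Model
{
    // Partition key declared explicitly instead of being discovered via DescribeTable.
    [DynamoDBHashKey]
    public string PartitionKey { get; set; }

    // Sort key declared explicitly as well.
    [DynamoDBRangeKey]
    public string SortKey { get; set; }
}
```

The trade-off is that the model attributes become the only source of key information, so they have to be kept in sync with the actual table definition by hand.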

Additional Information/Context

The bug started after a Kubernetes pod restart following a node change. All other pods, including others that query DynamoDB on the same account, were restarted as well, but only this one hit the bug.

AWS .NET SDK and/or Package version used

AWSSDK.DynamoDBv2 Version="3.7.300.12"
AWSSDK.Extensions.NETCore.Setup Version="3.7.300"
AWSSDK.SecretsManager Version="3.7.301.11"
AWSSDK.SecurityToken Version="3.7.300.22"

Targeted .NET Platform

.NET 7.0

Operating System and version

Custom Alpine x64 image

@Sussumu Sussumu added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Oct 19, 2024
@bhoradc bhoradc added needs-reproduction This issue needs reproduction. dynamodb p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Oct 21, 2024
@bhoradc bhoradc self-assigned this Oct 21, 2024