Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BedrockAgentClient::startIngestionJob results in "Invalid timestamp value passed to DateTimeResult::fromTimestamp" #2967

Closed
Saeven opened this issue Jul 25, 2024 · 10 comments · Fixed by #2974
Assignees
Labels
bug This issue is a bug. investigating This issue is being investigated and/or work is in progress to resolve the issue. p2 This is a standard priority issue

Comments

@Saeven
Copy link

Saeven commented Jul 25, 2024

Describe the bug

BedrockAgentClient::startIngestionJob results in "Invalid timestamp value passed to DateTimeResult::fromTimestamp"

Expected Behavior

That the SDK not crash with a date parsing exception

Current Behavior

#0 /vendor/aws/aws-sdk-php/src/Api/Parser/JsonParser.php(60): Aws\Api\DateTimeResult::fromTimestamp('2024-07-25T20:5...', 'iso8601')
#1 /vendor/aws/aws-sdk-php/src/Api/Parser/JsonParser.php(27): Aws\Api\Parser\JsonParser->parse(Object(Aws\Api\TimestampShape), '2024-07-25T20:5...')
#2 /vendor/aws/aws-sdk-php/src/Api/Parser/JsonParser.php(27): Aws\Api\Parser\JsonParser->parse(Object(Aws\Api\StructureShape), Array)
#3 /vendor/aws/aws-sdk-php/src/Api/Parser/RestJsonParser.php(34): Aws\Api\Parser\JsonParser->parse(Object(Aws\Api\StructureShape), Array)
#4 /vendor/aws/aws-sdk-php/src/Api/Parser/AbstractRestParser.php(63): Aws\Api\Parser\RestJsonParser->payload(Object(GuzzleHttp\Psr7\Response), Object(Aws\Api\StructureShape), Array)
#5 /vendor/aws/aws-sdk-php/src/WrappedHttpHandler.php(126): Aws\Api\Parser\AbstractRestParser->__invoke(Object(Aws\Command), Object(GuzzleHttp\Psr7\Response))
#6 /vendor/aws/aws-sdk-php/src/WrappedHttpHandler.php(93): Aws\WrappedHttpHandler->parseResponse(Object(Aws\Command), Object(GuzzleHttp\Psr7\Request), Object(GuzzleHttp\Psr7\Response), Array)
#7 /vendor/guzzlehttp/promises/src/Promise.php(204): Aws\WrappedHttpHandler->Aws\{closure}(Object(GuzzleHttp\Psr7\Response))
#8 /vendor/guzzlehttp/promises/src/Promise.php(153): GuzzleHttp\Promise\Promise::callHandler(1, Object(GuzzleHttp\Psr7\Response), NULL)
#9 /vendor/guzzlehttp/promises/src/TaskQueue.php(48): GuzzleHttp\Promise\Promise::GuzzleHttp\Promise\{closure}()
#10 /vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php(118): GuzzleHttp\Promise\TaskQueue->run()
#11 /vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php(145): GuzzleHttp\Handler\CurlMultiHandler->tick()
#12 /vendor/guzzlehttp/promises/src/Promise.php(248): GuzzleHttp\Handler\CurlMultiHandler->execute(true)
#13 /vendor/guzzlehttp/promises/src/Promise.php(224): GuzzleHttp\Promise\Promise->invokeWaitFn()
#14 /vendor/guzzlehttp/promises/src/Promise.php(269): GuzzleHttp\Promise\Promise->waitIfPending()
#15 /vendor/guzzlehttp/promises/src/Promise.php(226): GuzzleHttp\Promise\Promise->invokeWaitList()
#16 /vendor/guzzlehttp/promises/src/Promise.php(269): GuzzleHttp\Promise\Promise->waitIfPending()
#17 /vendor/guzzlehttp/promises/src/Promise.php(226): GuzzleHttp\Promise\Promise->invokeWaitList()
#18 /vendor/guzzlehttp/promises/src/Promise.php(62): GuzzleHttp\Promise\Promise->waitIfPending()

Reproduction Steps

$bedrockAgentClient->startIngestionJob([
    'dataSourceId' => $dataSourceId,
    'knowledgeBaseId' => $knowledgebaseId,
]);

Possible Solution

Looks like a bad format of the date stamp is being pushed through your Date parser.

Additional Information/Context

While invoking StartIngestionJob - the command succeeds, but an exception is thrown with the stack trace noted above.

Examining key differences with breakpoints at the affected date parser, we see that successful calls return:
Screenshot 2024-07-25 at 4 54 24 PM

Whereas this failed call returns:
Screenshot 2024-07-25 at 4 55 24 PM

(I've tested other calls, S3, Polly, etc. which all succeed without exceptions).

It's important to note that the call actually triggers the process on the service side. The problem lies in the composer package, which throws an exception in the codebase when the response is being parsed.

SDK version used

3.316.8

Environment details (Version of PHP (php -v)? OS name and version, etc.)

PHP 8.3.9 (cli) (built: Jul 2 2024 14:10:14) (NTS) Copyright (c) The PHP Group Zend Engine v4.3.9, Copyright (c) Zend Technologies with Zend OPcache v8.3.9, Copyright (c), by Zend Technologies

@Saeven Saeven added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jul 25, 2024
@Saeven Saeven changed the title (short issue description) BedrockAgentClient::startIngestionJob results in "Invalid timestamp value passed to DateTimeResult::fromTimestamp" Jul 25, 2024
@Saeven
Copy link
Author

Saeven commented Jul 25, 2024

This issue also affects getIngestionJob - I imagine it may affect all BedrockClientAgent tasks.

@yenfryherrerafeliz yenfryherrerafeliz self-assigned this Jul 26, 2024
@yenfryherrerafeliz yenfryherrerafeliz added the investigating This issue is being investigated and/or work is in progress to resolve the issue. label Jul 26, 2024
@Saeven
Copy link
Author

Saeven commented Jul 29, 2024

@yenfryherrerafeliz Thanks for investigating/fixing! Let me know if you need any extra data!

@stobrien89
Copy link
Member

stobrien89 commented Jul 31, 2024

Hi all,

Apologies for the delay here. I suspect this may be an issue with the timestamp values Bedrock is sending, but I've tested both startIngestionJob and getIngestionJob and there were no issues parsing the iso8601 timestamp values sent by the service. I even manually changed the $value on responses to 2024-07-25T20:55:08.816730762Z (like @Saeven posted) and they were parsed successfully.

For those still affected, would it be possible to post debug logs from a call with a response parsing error? To enable debug logging, set debug to true in the client constructor and please be sure to obscure any sensitive information (such as account numbers) before posting. Thanks!

@stobrien89 stobrien89 added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed needs-triage This issue or PR still needs to be triaged. labels Jul 31, 2024
@bhavya2109sharma bhavya2109sharma added the p2 This is a standard priority issue label Jul 31, 2024
@Saeven
Copy link
Author

Saeven commented Jul 31, 2024

@stobrien89 No sweat - I know that sometimes these things aren't easy to debug. The crux of it is.

  1. The Service returns this JSON
{
  "ingestionJob": {
    "dataSourceId": "OBFUSCATED",
    "ingestionJobId": "OBFUSCATED",
    "knowledgeBaseId": "OBFUSCATED",
    "startedAt": "2024-07-31T18:53:00.544453478Z",
    "statistics": {
      "numberOfDocumentsDeleted": 0,
      "numberOfDocumentsFailed": 0,
      "numberOfDocumentsScanned": 0,
      "numberOfMetadataDocumentsModified": 0,
      "numberOfMetadataDocumentsScanned": 0,
      "numberOfModifiedDocumentsIndexed": 0,
      "numberOfNewDocumentsIndexed": 0
    },
    "status": "STARTING",
    "updatedAt": "2024-07-31T18:53:00.544453478Z"
  }
}

This bubbles down to this small stack:

Screenshot 2024-07-31 at 3 10 20 PM

Which in the end, attempts to call this (which reproduces the problem in one line):

new \Aws\Api\DateTimeResult('2024-07-31T19:05:47.623768151Z');

Calling this directly yields a slightly more informational error, which is otherwise masked by the way the parser is constructed, (specifically in this block):

        try {
            if ($expectedFormat == 'iso8601') {
                try {
                    return self::fromISO8601($timestamp);
                } catch (Exception $exception) {
                    return self::fromEpoch($timestamp); // exception gets trapped, and a less informational one gets thrown here
                }
            } else if ($expectedFormat == 'unixTimestamp') {

Looks like the true exception message is "The timezone could not be found in the database".

Reproducible here:
https://3v4l.org/sr3Bv#v7.4.33

I apologize a bit for the confusion about environment version; our prod box is running a diff environment (7.4). I didn't think too much of it since composer install succeeds on both environments (req >= 7.2.5).

Playing with 3v4l, it looks like parsing this DateFormat fails on 8.0.9:
https://3v4l.org/sr3Bv#v8.0.9

But succeeds with 8.0.10:
https://3v4l.org/sr3Bv#v8.0.10

The changelog for 8.0.10, https://www.php.net/ChangeLog-8.php#8.0.10, notes a compensation for ((DateTimeZone accepting invalid UTC timezones).

Now interestingly - if you reduce to 6 decimal positions, it will parse on all versions:

Valid: https://3v4l.org/sjeLa#v7.4.33

echo (new \DateTime('2024-07-31T19:16:58.543693Z'))->format('Y-m-d H:i:s');

Invalid:

echo (new \DateTime('2024-07-31T19:16:58.623768151Z'))->format('Y-m-d H:i:s');

So you'd perhaps want to modify the service to output only 6 decimal positions, or, compensate in the parser, or, perhaps make the service return a regular old linux TS to address the issue.

Hope this helps!
Alex

@stobrien89
Copy link
Member

Thanks @Saeven,

That's extremely helpful. Seems like we might want to do a version compare and reduce decimal positions on the affected ranges, but I'll ask around to see if there's broad enough impact to warrant a change at the service level. This seems to be PHP-specific, in which case we'll likely get a fix out tomorrow or Friday, but I'll keep you posted.

@Saeven
Copy link
Author

Saeven commented Jul 31, 2024

My pleasure. From years of working in Java, the standard there is six decimal places at most. The RFC doesn't place a limit on the precision, but until calling this service, I've never seen one with more than six. I suppose that the folks at PHP just follow a standard even though the RFC would contend that what you're doing is OK.

That's a hell of a precision though! 🤣. With clock drift and the time it took to crunch those electrons, I wouldn't trust those tail digits too much!

Thanks for the fix, look forward to it!

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Aug 1, 2024
@stobrien89
Copy link
Member

@Saeven,

Just posted #2974, which should resolve this for anywhere from 7 to 9 fractional seconds. It seems like most languages have native support for nanosecond precision and this is the first time (as far as I know) we're seeing this coming from an AWS service. Not sure if we'll ever see picoseconds, but that's a bridge we'll cross when/if we get to it 😃 .

@Saeven
Copy link
Author

Saeven commented Aug 1, 2024

ROFL! Hopefully we're retired by then! 🍻

Thanks for the patch, eagerly look forward to the composer package increment!

Copy link

github-actions bot commented Aug 2, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.

@stobrien89
Copy link
Member

Any time! Thanks for your patience. This should be live in today's release around 2:15PM EDT

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. investigating This issue is being investigated and/or work is in progress to resolve the issue. p2 This is a standard priority issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants