From fab0e3c4794471473440ec2930a5e849820d8818 Mon Sep 17 00:00:00 2001
From: Connor Adams
Date: Mon, 20 May 2024 09:24:06 +0100
Subject: [PATCH 1/5] Clarify AWS Lambda storage

There is ephemeral storage in `/tmp` https://docs.aws.amazon.com/lambda/latest/api/API_EphemeralStorage.html

Which could technically be used if desired `CRAWLEE_STORAGE_DIR=/tmp/crawlee/storage`
---
 website/versioned_docs/version-3.10/deployment/aws-cheerio.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/website/versioned_docs/version-3.10/deployment/aws-cheerio.md b/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
index 0a047ead77bb..735d40cfa0a8 100644
--- a/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
+++ b/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
@@ -9,7 +9,7 @@ Locally, we can conveniently create a Crawlee project with `npx crawlee create`.

 Whenever we instantiate a new crawler, we have to pass a unique `Configuration` instance to it. By default, all the Crawlee crawler instances share the same storage - this can be convenient, but would also cause “statefulness” of our Lambda, which would lead to hard-to-debug problems.

-Also, when creating this Configuration instance, make sure to pass the `persistStorage: false` option. This tells Crawlee to use in-memory storage, as the Lambda filesystem is read-only.
+Also, when creating this Configuration instance, make sure to pass the `persistStorage: false` option. This tells Crawlee to use in-memory storage, as the default storage directory for Crawlee is read-only on the Lambda Filesystem.

 ```javascript title="src/main.js"
 // For more information, see https://crawlee.dev/
@@ -123,4 +123,4 @@ The memory size can greatly affect the execution speed of your Lambda.

 See the [official documentation](https://docs.aws.amazon.com/lambda/latest/operatorguide/computing-power.html) to see how the performance and cost scale with more memory.
-:::
\ No newline at end of file
+:::

From 54d3217195db8a55525319f630f091b62bd0b241 Mon Sep 17 00:00:00 2001
From: Connor Adams
Date: Mon, 20 May 2024 09:26:04 +0100
Subject: [PATCH 2/5] Try to add whitespace back

---
 website/versioned_docs/version-3.10/deployment/aws-cheerio.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/versioned_docs/version-3.10/deployment/aws-cheerio.md b/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
index 735d40cfa0a8..8c67def604c4 100644
--- a/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
+++ b/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
@@ -123,4 +123,4 @@ The memory size can greatly affect the execution speed of your Lambda.

 See the [official documentation](https://docs.aws.amazon.com/lambda/latest/operatorguide/computing-power.html) to see how the performance and cost scale with more memory.

-:::
+:::

From 47d939867c5ff1e34189881cd473df4875dad3c1 Mon Sep 17 00:00:00 2001
From: Connor Adams
Date: Mon, 20 May 2024 09:27:45 +0100
Subject: [PATCH 3/5] Trying to resolve diff

---
 website/versioned_docs/version-3.10/deployment/aws-cheerio.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/versioned_docs/version-3.10/deployment/aws-cheerio.md b/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
index 8c67def604c4..735d40cfa0a8 100644
--- a/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
+++ b/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
@@ -123,4 +123,4 @@ The memory size can greatly affect the execution speed of your Lambda.

 See the [official documentation](https://docs.aws.amazon.com/lambda/latest/operatorguide/computing-power.html) to see how the performance and cost scale with more memory.
-:::
+:::

From d1c4ef43b9ae77ce521d6439713eee7bb35ef9d2 Mon Sep 17 00:00:00 2001
From: Connor Adams
Date: Mon, 20 May 2024 09:30:45 +0100
Subject: [PATCH 4/5] Fix whitespace hopefully

---
 website/versioned_docs/version-3.10/deployment/aws-cheerio.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/versioned_docs/version-3.10/deployment/aws-cheerio.md b/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
index 735d40cfa0a8..bbcaae961dc0 100644
--- a/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
+++ b/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
@@ -123,4 +123,4 @@ The memory size can greatly affect the execution speed of your Lambda.

 See the [official documentation](https://docs.aws.amazon.com/lambda/latest/operatorguide/computing-power.html) to see how the performance and cost scale with more memory.

-:::
+:::
\ No newline at end of file

From 70a4fdd8978b15a9a26432fba4b5e50ea9143182 Mon Sep 17 00:00:00 2001
From: Connor Adams
Date: Sat, 25 May 2024 14:03:22 +0100
Subject: [PATCH 5/5] Put underlying reason for not using file storage

---
 website/versioned_docs/version-3.10/deployment/aws-cheerio.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/website/versioned_docs/version-3.10/deployment/aws-cheerio.md b/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
index bbcaae961dc0..cb9964ed9bd7 100644
--- a/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
+++ b/website/versioned_docs/version-3.10/deployment/aws-cheerio.md
@@ -9,7 +9,7 @@ Locally, we can conveniently create a Crawlee project with `npx crawlee create`.

 Whenever we instantiate a new crawler, we have to pass a unique `Configuration` instance to it. By default, all the Crawlee crawler instances share the same storage - this can be convenient, but would also cause “statefulness” of our Lambda, which would lead to hard-to-debug problems.
-Also, when creating this Configuration instance, make sure to pass the `persistStorage: false` option. This tells Crawlee to use in-memory storage, as the default storage directory for Crawlee is read-only on the Lambda Filesystem.
+Also, when creating this Configuration instance, make sure to pass the `persistStorage: false` option. This tells Crawlee to use in-memory storage, which will also prevent Lambda "statefulness" and its potential issues.

 ```javascript title="src/main.js"
 // For more information, see https://crawlee.dev/
@@ -123,4 +123,4 @@ The memory size can greatly affect the execution speed of your Lambda.

 See the [official documentation](https://docs.aws.amazon.com/lambda/latest/operatorguide/computing-power.html) to see how the performance and cost scale with more memory.

-:::
\ No newline at end of file
+:::
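
For reviewers, the behavior this series documents can be sketched roughly as follows. This is a minimal illustration, not code from the patched file: `createLambdaCrawler` is a hypothetical helper, and the Crawlee classes are injected as arguments so the snippet has no imports of its own. It assumes, as the patched doc text describes, that a crawler accepts a `Configuration` instance and that `persistStorage: false` selects in-memory storage.

```javascript
// Hypothetical helper (not part of Crawlee): builds a fresh crawler for a
// single Lambda invocation. The Crawlee classes are passed in by the caller.
function createLambdaCrawler({ CheerioCrawler, Configuration }, crawlerOptions) {
    // A new Configuration per call means invocations do not share the
    // default global storage, which is the "statefulness" the doc warns about.
    // persistStorage: false asks Crawlee to keep that storage in memory.
    const config = new Configuration({ persistStorage: false });

    // The configuration is passed as the second constructor argument.
    return new CheerioCrawler(crawlerOptions, config);
}
```

In a real handler this would presumably be called once per invocation, for example `createLambdaCrawler(crawlee, { requestHandler })` after `import * as crawlee from 'crawlee'`. The first commit message notes that `/tmp` could be used instead via `CRAWLEE_STORAGE_DIR`; the sketch sticks with the in-memory option that the final patch recommends.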