Redshift inserts taking longer #160
-
Hi all, we are trying to insert close to 500k records per day into Redshift. The data comes from a streaming source with a varying rate of influx: high activity at certain times and low volume at others. We currently use the Redshift connector for this task, but we are facing two problems. First, the connector is only processing about 30k records per hour, which is far below what we need, so we have to find a way to increase throughput. Second, when multiple AWS Lambdas are triggered to insert data, a deadlock occurs. All remaining processes/Lambdas then fail, which disrupts our workflow. The settings below are used by the cursor to load the data.
Please let me know if there is any functionality in the connector that can help us achieve this goal.
Replies: 1 comment 1 reply
-
Hi @rangan-anand, thank you for reaching out. To summarize, it sounds like you are facing two separate issues: slow performance when inserting data, and deadlocks. Regarding the first issue, performance is an area of redshift-connector we are looking to improve. We anticipate a minor performance improvement in our next release for customers who are not using bind parameters. It seems you are using bind parameters, but I figured I'd mention it anyway. For the second issue, could you share the exception/error messages you see when the deadlock occurs? Are these multiple Lambdas modifying the same table? Until then, here is a resource about locks in Redshift that may help narrow down the issue further: https://repost.aws/knowledge-center/prevent-locks-blocking-queries-redshift
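On the throughput side, one option that may be worth trying is batching rows into multi-row INSERT statements, which Amazon Redshift's loading best practices recommend over single-row inserts. The sketch below is only an illustration under assumptions: the thread does not show the cursor settings or the table, so the helper names `chunked` and `build_multirow_insert`, the table `events`, and its columns are all hypothetical. It also assumes the driver's paramstyle uses `%s` placeholders, which should be checked against your configuration.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List, Sequence, Tuple


def chunked(rows: Iterable[Sequence[Any]], size: int) -> Iterator[List[Sequence[Any]]]:
    """Yield successive lists of at most `size` rows (hypothetical helper)."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def build_multirow_insert(
    table: str, columns: Sequence[str], batch: Sequence[Sequence[Any]]
) -> Tuple[str, List[Any]]:
    """Build one INSERT covering every row in `batch`, plus flattened bind values.

    `table` and `columns` are interpolated directly into the SQL text, so they
    must come from trusted code, never from user input.
    """
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = (
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
        + ", ".join([row_placeholder] * len(batch))
    )
    params = [value for row in batch for value in row]
    return sql, params


# Usage sketch (assumes an already-open connection `conn`; not executed here):
#
#   cursor = conn.cursor()
#   for batch in chunked(rows, 500):
#       sql, params = build_multirow_insert("events", ["id", "name"], batch)
#       cursor.execute(sql, params)
#   conn.commit()
```

The batch size of 500 is an arbitrary starting point; drivers may cap the number of bind parameters per statement, so rows per batch times columns per row should stay modest. For sustained volumes, staging the data in Amazon S3 and loading it with COPY is generally the faster path.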