-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GitHub-Issue#2778: Added CloudWatchLogs Sink Config Files #2922
Conversation
4104e08
to
d4ad1b0
Compare
@@ -0,0 +1,42 @@ | |||
.gradle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be at the root of the repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the package path data-prepper-plugins/cwl-sink
. We have someone working on CWL as source as well. So, it probably makes sense to use the path data-prepper-plugins/cloudwatch-logs/
and then you can have data-prepper-plugins/cwl-sink/src/main/java/org/opensearch/dataprepper/plugins/sink/
for this and the source will have data-prepper-plugins/cwl-sink/src/main/java/org/opensearch/dataprepper/plugins/source
@@ -0,0 +1,42 @@ | |||
.gradle | |||
build/ | |||
!gradle/wrapper/gradle-wrapper.jar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file should removed. I think it was not intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those files should be removed now.
@@ -0,0 +1,8 @@ | |||
package org.opensearch.dataprepper; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file should be removed too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
public static final String DEFAULT_BUFFER_TYPE = "in_memory"; | ||
|
||
//TODO: Change this to custom aws config class as its just a data container. | ||
@JsonProperty("aws_config") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use "aws:" for this. Please follow the same convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name has been changed from 'aws_config' to 'aws'.
} | ||
|
||
group = 'org.opensearch.dataprepper' | ||
version = '2.3.0-SNAPSHOT' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need line 6 and 7.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed these lines.
* where the configuration allows the sink to fetch Aws credentials | ||
* and resources. | ||
*/ | ||
public class AwsConfig { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use this and not AwsAuthenticationOptions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made the change to AwsConfig.
@JsonProperty("aws") | ||
@NotNull | ||
@Valid | ||
private AwsAuthenticationOptions awsConfig; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do private AwsConfig awsConfig;
?
Duplicate AwsAuthenticationOptions
code if needed.
We are working on moving this to a common location. For now, it's ok if you duplicate it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed it to AwsConfig.
dependencies { | ||
api project(':data-prepper-api') | ||
implementation project(':data-prepper-plugins:aws-plugin-api') | ||
implementation project(path: ':data-prepper-plugins:s3-sink') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are adding this to import AwsAuthenticationOptions. But we currently do not do like this. Ideally the AwsAuthenticationOptions
should be moved to common directory. But for now, it is OK if you duplicate the code and put it in cloud watch logs directory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I realized that by adding that dependency we are not isolating the separate plugins. I have removed the dependency and proceeded to use my own version of AwsConfig.
|
||
private int backOffTime = DEFAULT_BACKOFF_TIME; | ||
|
||
public int getBatchSize() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might be too many config options for the customer. We also want to avoid a retry storm so maybe we should decide on a retry_count
ourselves.
Up to DP team how they want to proceed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can keep this open for configuration and provide default values. Customer might want to tweak based on the use case. We can use default values and document in Readme. I'm not aware of all the cloudwatch configuration, but we should allow CX to configure value to increase throughput and remove the ones that are not required.
* where the configuration allows the sink to fetch Aws credentials | ||
* and resources. | ||
*/ | ||
public class AwsConfig { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should reuse AwsCredentialsOptions. Will this class be used to construct AwsCredentialsOptions ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, AwsConfig is re-using code from AwsCredentialsOptions as AwsCredentialsOptions is not a general utils class. This would make the CloudWatchLogs plugin rely on a dependency from another plugin if we used AwsCredentialsOptions. I changed it to AwsConfig after talking to Krishna about this, we can discuss this further as well.
public class CwlSinkConfig { | ||
public static final String DEFAULT_BUFFER_TYPE = "in_memory"; | ||
|
||
//Class was utilized from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
incomplete documentation.
|
||
private int backOffTime = DEFAULT_BACKOFF_TIME; | ||
|
||
public int getBatchSize() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can keep this open for configuration and provide default values. Customer might want to tweak based on the use case. We can use default values and document in Readme. I'm not aware of all the cloudwatch configuration, but we should allow CX to configure value to increase throughput and remove the ones that are not required.
@JsonProperty("path_to_credentials") | ||
private String pathToCredentials; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this ? Look like we dont use this parameter in other Sinks ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I normally presented this parameter as an option to allow customers to specify credential files for SDK configuration, but after discussing this further with Mark and it being brought up here, it would be best to remove it. I will have this for the next revision. It is too much control for the customer and insecure as we require the pathing for the file.
@JsonProperty("buffer_type") | ||
private String bufferType = DEFAULT_BUFFER_TYPE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are other options ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently there is only a default option for the buffer, and that is InMemoryBuffer, but this was made an option as in the future we could implement different buffering methods for customers who want to handle that application aspect differently. Similar to how the S3 handles InMemoryBuffering and InFileBuffering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds like Speculative Generality. Is there a plan to add additional buffers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently within the scope of the Design of this plugin, there are no goals for this. But I have added the addition of other Buffering systems such as Kafka and File Buffering as Next Step additions that could benefit the plugin for future usage. The generalization of this Buffer will prove useful as our services grow and customers require variable sources for which to buffer information. It would make it easier for downstream additions to be applied and integrated.
public static final String LOG_GROUP = "testGroup"; | ||
public static final String LOG_STREAM = "testStream"; | ||
public static final String BUFFER_TYPE = "in_memory"; | ||
public static final int BATCH_SIZE = 10; | ||
public static final int MAX_RETRIES = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are these values used ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, it seems that while working on the testing, I had left these values here although they were never used. I will get rid of these for the next revision.
@Valid | ||
private AwsConfig awsConfig; | ||
|
||
@JsonProperty("threshold_config") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename this to threshold
, having config here is redundant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name has been changed for the next revision.
@Size(min = 2, max = 1224, message = "awsStsExternalId length should be between 2 and 1224 characters") | ||
private String awsStsExternalId; | ||
|
||
public int getDEFAULT_CONNECTION_ATTEMPTS() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this used for? If it's not configurable by user you can use the static variable in CwlSink
class directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the default_connections to a static variable, this should follow better standard conventions.
@Size(min = 1, max = 1048576, message = "max_batch_request_size amount should be between 1 and 1048576 bytes") | ||
private int maxRequestSize = DEFAULT_SIZE_OF_REQUEST; | ||
|
||
@JsonProperty("retry_count") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this retry count different from AWS SDK retry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is different from AWS SDK retry as the SDK retry is a built in error retransmission method that can still fail and throw exceptions that are not directly the fault of the client. This can be seen in the documentation for the exceptions thrown by a call to putLogEvents() that can return RetryableException or AwsServiceException.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This parameter is also based on the S3-Sink retry parameter "max_retries" as it attempts to retry recoverable requests.
public class CwlSinkConfigTest { | ||
@Test | ||
void check_null_auth_config_test() { | ||
assertThat(new CwlSinkConfig().getAwsConfig(), equalTo(null)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use this ReflectivelySetField class to configure the options and verify if they're being set correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for letting me in on this. I will re-write the test for the CwlSink to match these. I also used ObjectMapper.
api project(':data-prepper-api') | ||
implementation project(':data-prepper-plugins:aws-plugin-api') | ||
testImplementation platform('org.junit:junit-bom:5.9.1') | ||
implementation platform('software.amazon.awssdk:bom:2.20.56') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't have to define bom dependency here, I think it's configured in root build.gradle. If you have to use latest version update it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed the sdk bom dependecy for the next revision.
import static org.hamcrest.MatcherAssert.assertThat; | ||
import static org.hamcrest.Matchers.equalTo; | ||
|
||
public class ThresholdConfigTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add test to check all default values.
void check_default_values() {
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the default test to ThresholdConfig.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. All the files should include copyright header, you can fix that in next revision.
…pensearch-project#2910) Create Elasticsearch client, implement search and pit apis for ElasticsearchAccessor Signed-off-by: Taylor Gray <tylgry@amazon.com> Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
Added Config Files for CloudWatchLogs Sink. Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
… syntax) Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
Added default params for back_off and log_send_interval alongside test cases for ThresholdConfig. Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
…ith AwsConfig. Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
…mer and params to AwsConfig Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
…tive mapping to tests CwlSink Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
…xRequestSize Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
Signed-off-by: Marcos Gonzalez Mayedo <95880281+MaGonzalMayedo@users.noreply.github.com>
43a0a40
Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
…-project#2922) * Elasticsearch client implementation with pit and no context search (opensearch-project#2910) Create Elasticsearch client, implement search and pit apis for ElasticsearchAccessor Signed-off-by: Taylor Gray <tylgry@amazon.com> Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> * GitHub-Issue#2778: Refactoring config files for CloudWatchLogs Sink (#4) Added Config Files for CloudWatchLogs Sink. Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> * Added fixes from comments to code (including pathing and nomenclature syntax) Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> * Refactoring config (#5) Added default params for back_off and log_send_interval alongside test cases for ThresholdConfig. Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> * Fixed deleted AwsConfig file Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> * Removed the s3 dependency from build.gradle, replaced the AwsAuth.. with AwsConfig. Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> * Added modifiable back_off_timer, added threshold test for back_off_timer and params to AwsConfig Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> * Added fixes to gradle file, added tests to AwsConfig, and used Reflective mapping to tests CwlSink Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> * Added default value test to ThresholdConfig and renamed getter for maxRequestSize Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> * Removed unnecessary imports Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> * Added cloudwatch-logs to settings.gradle Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> * Added a quick fix to the back_off_time range Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> --------- Signed-off-by: Taylor Gray <tylgry@amazon.com> Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com> Signed-off-by: Marcos Gonzalez Mayedo <95880281+MaGonzalMayedo@users.noreply.github.com> Co-authored-by: Taylor Gray <tylgry@amazon.com> Co-authored-by: Marcos <alemayed@amazon.com> Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
Description
This change contains the Configuration files for the CloudWatchLogs Sink. It will act as a basis for reading the
user entered configuration yaml file.
Issues Resolved
This PR will work towards resolving CWL-Sink for Data-Prepper [Issue https://github.com//issues/2778]
Link: #2778 (comment)
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.