NIFI-14337 - Enhance JoltTransformJSON to Support JOLT Transformation… #9785
Thanks for proposing this improvement @Srilatha-ramreddy.
Although having an optional property works, it is not immediately clear that this would alter the behavior. To make the implementation clearer, it would be helpful to introduce an additional strategy property. The property could be named JSON Source and could have values of Attribute or FlowFile using an enum that implements DescribedValue to bound the supported options. The property would be required, and would default to FlowFile, maintaining the current behavior. The new FlowFile Attribute property would depend on this strategy property.
Although that approach means introducing an additional property, it would make the configured behavior of the Processor much clearer. Feel free to raise any questions about this strategy.
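A minimal sketch of the kind of strategy enum described in this comment. The DescribedValue interface below is a local stand-in declared only so the example compiles on its own (in NiFi the real interface is org.apache.nifi.components.DescribedValue), and the display names and descriptions are illustrative assumptions, not text taken from the pull request.

```java
// Local stand-in for org.apache.nifi.components.DescribedValue so the sketch is self-contained.
interface DescribedValue {
    String getValue();
    String getDisplayName();
    String getDescription();
}

// Bounded set of options for the proposed "JSON Source" property.
enum SourceStrategy implements DescribedValue {
    FLOW_FILE("FlowFile", "Transform the JSON held in the FlowFile content."),
    ATTRIBUTE("Attribute", "Transform the JSON held in a named FlowFile attribute.");

    private final String displayName;
    private final String description;

    SourceStrategy(final String displayName, final String description) {
        this.displayName = displayName;
        this.description = description;
    }

    @Override
    public String getValue() {
        // The enum constant name doubles as the stored property value.
        return name();
    }

    @Override
    public String getDisplayName() {
        return displayName;
    }

    @Override
    public String getDescription() {
        return description;
    }
}

class SourceStrategyDemo {
    public static void main(final String[] args) {
        // List each option as it might appear in a configuration dialog.
        for (final SourceStrategy strategy : SourceStrategy.values()) {
            System.out.println(strategy.getValue() + " -> " + strategy.getDisplayName());
        }
    }
}
```

The PropertyDescriptor wiring (required, default of FLOW_FILE, and the dependent attribute property) is left out here because it needs the NiFi API on the classpath.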
.name("Json Attribute")
.displayName("Json Attribute")
The displayName() method is not needed when the name() is the same. I also recommend renaming the property to FlowFile Attribute.
- .name("Json Attribute")
- .displayName("Json Attribute")
+ .name("FlowFile Attribute")
final Map<String, String> attributes = Collections.singletonMap("jsonAttr",
"{\"rating\":{\"primary\":{\"value\":3},\"series\":{\"value\":[5,4]},\"quality\":{\"value\":}}}");
You could simply use Map.of like you do on line 497. Similar comment for line 511.
- final Map<String, String> attributes = Collections.singletonMap("jsonAttr",
- "{\"rating\":{\"primary\":{\"value\":3},\"series\":{\"value\":[5,4]},\"quality\":{\"value\":}}}");
+ final Map<String, String> attributes = Map.of("jsonAttr",
+ "{\"rating\":{\"primary\":{\"value\":3},\"series\":{\"value\":[5,4]},\"quality\":{\"value\":}}}");
+ "When 'Json Source' is set to FLOW_FILE, the FlowFile content is transformed and the modified FlowFile is routed to 'success' relationship. "
+ "When 'Json Source' is set to ATTRIBUTE, the specified attribute's value is transformed and updated in place, with the FlowFile routed to 'success' relationship. "
- + "When 'Json Source' is set to FLOW_FILE, the FlowFile content is transformed and the modified FlowFile is routed to 'success' relationship. "
- + "When 'Json Source' is set to ATTRIBUTE, the specified attribute's value is transformed and updated in place, with the FlowFile routed to 'success' relationship. "
+ + "When 'Json Source' is set to FLOW_FILE, the FlowFile content is transformed and the modified FlowFile is routed to the 'success' relationship. "
+ + "When 'Json Source' is set to ATTRIBUTE, the specified attribute's value is transformed and updated in place, with the FlowFile routed to the 'success' relationship. "
jsonSourceAttributeName = context.getProperty(JSON_SOURCE_ATTRIBUTE).evaluateAttributeExpressions(original).getValue();
final String jsonSourceAttributeValue = original.getAttribute(jsonSourceAttributeName);
if (StringUtils.isBlank(jsonSourceAttributeValue)) {
logger.error("FlowFile attribute value evaluated to null");
StringUtils.isBlank is true not only when a string is null, but also when the string is empty or contains only whitespace.
- logger.error("FlowFile attribute value evaluated to null");
+ logger.error("FlowFile attribute value was blank");
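To illustrate why the message should say blank rather than null, here is a small sketch of the cases involved. The isBlank helper below is a stand-in written with the JDK only; it is assumed to mirror the null, empty, and whitespace-only behavior of the StringUtils.isBlank call used in the processor, which is not imported here.

```java
class BlankCheckDemo {
    // Stand-in helper: true for null, empty, or whitespace-only input.
    static boolean isBlank(final String value) {
        return value == null || value.isBlank();
    }

    public static void main(final String[] args) {
        System.out.println(isBlank(null));          // attribute missing
        System.out.println(isBlank(""));            // attribute present but empty
        System.out.println(isBlank("   "));         // attribute contains only whitespace
        System.out.println(isBlank("{\"a\":1}"));   // attribute holds JSON text
    }
}
```

The first three calls all take the same error branch, so a log message that mentions only null would be misleading for two of them.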
final boolean isSourceFlowFileContent = SourceStrategy.FLOW_FILE == context.getProperty(JSON_SOURCE).asAllowableValue(SourceStrategy.class);
String jsonSourceAttributeName = null;

if (isSourceFlowFileContent ) {
- if (isSourceFlowFileContent ) {
+ if (isSourceFlowFileContent) {
@Test
void testJsonAttributeNotInitialised() throws IOException {
runner.setProperty(JoltTransformJSON.JSON_SOURCE, SourceStrategy.ATTRIBUTE);
runner.setProperty(JoltTransformJSON.JOLT_SPEC, "./src/test/resources/specs/shiftrSpec.json");
I see this value "./src/test/resources/specs/shiftrSpec.json" being used a total of three times by you on this line, line 481, and line 494. Another test uses the same value on line 225. In addition, another form of this same path, "src/test/resources/specs/shiftrSpec.json" without the leading ./, is used on lines 214 and 255. Please make a private static final String variable with one of these values and use it in all six places.
runner.setProperty(JoltTransformJSON.JSON_SOURCE, SourceStrategy.ATTRIBUTE);
runner.setProperty(JoltTransformJSON.JOLT_SPEC, "./src/test/resources/specs/shiftrSpec.json");
runner.setProperty(JoltTransformJSON.JOLT_TRANSFORM, JoltTransformStrategy.SHIFTR);
runner.setProperty(JoltTransformJSON.JSON_SOURCE_ATTRIBUTE, "jsonAttr");
I see this value "jsonAttr" being used 8 times for defining the name of the attribute (this line and on lines 483, 484, 498, 510, 511, 523 and 525). Please make a private static final String variable and use it in each of those places.
@@ -464,6 +464,71 @@ void testJoltSpecInvalidEL() throws IOException {
runner.assertNotValid();
}

@Test
@exceptionfactory I see 4 new unit tests which are configured very similarly. Do you think these four tests should be folded into a JUnit 5 ParameterizedTest and use the MethodSource annotation to define a method which would return the necessary arguments for each of these tests?
7ab8446 to e9ef7ed
@exceptionfactory @dan-s1 Thanks for the feedback. The review comments are now addressed and the PR is ready for review. Thanks
Thanks for the updates @Srilatha-ramreddy, please review the Checkstyle warnings. I will take a closer look at the latest version soon.
return Stream.of(
Arguments.of(JSON_SOURCE_ATTR_NAME, null, SHIFTR_SPEC_PATH,
JoltTransformStrategy.SHIFTR, false, null),
Arguments.of(JSON_SOURCE_ATTR_NAME, Map.of(JSON_SOURCE_ATTR_NAME, INVALID_INPUT_JSON), SHIFTR_SPEC_PATH,
JoltTransformStrategy.SHIFTR, false, null),
Arguments.of("${dynamicJsonAttr}", Map.of("dynamicJsonAttr", JSON_SOURCE_ATTR_NAME, JSON_SOURCE_ATTR_NAME, EXPECTED_JSON), SHIFTR_SPEC_PATH,
JoltTransformStrategy.SHIFTR, true, SHIFTR_JSON_OUTPUT),
Arguments.of(JSON_SOURCE_ATTR_NAME, Map.of(JSON_SOURCE_ATTR_NAME, EXPECTED_JSON), CHAINR_SPEC_PATH,
JoltTransformStrategy.CHAINR, true, CHAINR_JSON_OUTPUT)
);
@Srilatha-ramreddy Thanks for adding the ParameterizedTest. That is what I had in mind. I am requesting one minor change: instead of using Arguments.of, please use Arguments.argumentSet so you can give a meaningful name to each of the tests. I was going to start from the original names of the unit tests you had started with, although I no longer see that commit, as it seems you squashed it. Please note that, in general, once the PR has been submitted no squashes should be done. Thanks!
@dan-s1 Thanks for the feedback. Please review the latest commit. I will also not squash the commits anymore.
Thanks for the updates @Srilatha-ramreddy. I noted a few minor adjustments, and one larger question regarding Expression Language evaluation. As mentioned, support for Expression Language opens up some additional possibilities that should be considered.
@RequiresInstanceClassLoading
public class JoltTransformJSON extends AbstractJoltTransform {

public static final PropertyDescriptor JSON_SOURCE = new PropertyDescriptor.Builder()
.name("Json Source")
JSON should be all uppercase in property names:
- .name("Json Source")
+ .name("JSON Source")
public static final PropertyDescriptor JSON_SOURCE = new PropertyDescriptor.Builder()
.name("Json Source")
.description("Specifies whether the JOLT transformation is applied to FlowFile JSON content or to specified FlowFile JSON attribute.")
- .description("Specifies whether the JOLT transformation is applied to FlowFile JSON content or to specified FlowFile JSON attribute.")
+ .description("Specifies whether the Jolt transformation is applied to FlowFile JSON content or to specified FlowFile JSON attribute.")
.build();

public static final PropertyDescriptor JSON_SOURCE_ATTRIBUTE = new PropertyDescriptor.Builder()
.name("Json Source Attribute")
- .name("Json Source Attribute")
+ .name("JSON Source Attribute")
+ "When 'Json Source' is set to FLOW_FILE, the FlowFile content is transformed and the modified FlowFile is routed to the 'success' relationship. "
+ "When 'Json Source' is set to ATTRIBUTE, the specified attribute's value is transformed and updated in place, with the FlowFile routed to the 'success' relationship. "
This description is duplicative of the property description, so I recommend removing it.
.description("The FlowFile attribute containing JSON to be transformed. "
+ "This property is required only when 'Json Source' is set to ATTRIBUTE.")
This can be specified as a multiline string. However, it is not necessary to include the second sentence, since the documentation rendering automatically describes dependent properties.
logger.error("JSON parsing failed for {}", original, e);
session.transfer(original, REL_FAILURE);
return;
final boolean isSourceFlowFileContent = SourceStrategy.FLOW_FILE == context.getProperty(JSON_SOURCE).asAllowableValue(SourceStrategy.class);
Minor renaming recommendation:
- final boolean isSourceFlowFileContent = SourceStrategy.FLOW_FILE == context.getProperty(JSON_SOURCE).asAllowableValue(SourceStrategy.class);
+ final boolean sourceStrategyFlowFile = SourceStrategy.FLOW_FILE == context.getProperty(JSON_SOURCE).asAllowableValue(SourceStrategy.class);
jsonSourceAttributeName = context.getProperty(JSON_SOURCE_ATTRIBUTE).evaluateAttributeExpressions(original).getValue();
final String jsonSourceAttributeValue = original.getAttribute(jsonSourceAttributeName);
if (StringUtils.isBlank(jsonSourceAttributeValue)) {
logger.error("FlowFile attribute value was blank");
The attribute name should be included in the message:
- logger.error("FlowFile attribute value was blank");
+ logger.error("FlowFile attribute [{}] value is blank", jsonSourceAttributeName);
return;
}
} else {
jsonSourceAttributeName = context.getProperty(JSON_SOURCE_ATTRIBUTE).evaluateAttributeExpressions(original).getValue();
Support for Expression Language raises an important question. Evaluating the expression to return an attribute name could be confusing, since ${attributeName} would return the JSON value, not the attribute name itself. One option is to remove support for Expression Language. The other option is to change the property name to describe the JSON Source itself. This also impacts the JSON Source property options. The options could be FLOW_FILE and SOURCE_REFERENCE or similar, with JSON Source Reference supporting Expression Language. The property naming needs some further consideration, as I'm not sure Source Reference is as clear as it should be. Perhaps JSON Source Content.
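The confusion described in this comment can be sketched with plain JDK code. The evaluate method below is a hypothetical, greatly simplified stand-in for Expression Language evaluation (it only substitutes ${name} references) and is not the NiFi implementation; the attribute names are assumptions chosen to match the test data in this pull request.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExpressionLookupDemo {
    private static final Pattern REFERENCE = Pattern.compile("\\$\\{([^}]+)\\}");

    // Hypothetical simplified evaluator: replaces each ${name} with that attribute's value.
    static String evaluate(final String propertyValue, final Map<String, String> attributes) {
        final Matcher matcher = REFERENCE.matcher(propertyValue);
        final StringBuilder result = new StringBuilder();
        while (matcher.find()) {
            final String value = attributes.getOrDefault(matcher.group(1), "");
            matcher.appendReplacement(result, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    // Mirrors the logic under review: evaluate the property, then treat the result as an attribute name.
    static String resolveJson(final String propertyValue, final Map<String, String> attributes) {
        final String attributeName = evaluate(propertyValue, attributes);
        return attributes.get(attributeName);
    }

    public static void main(final String[] args) {
        final Map<String, String> attributes = Map.of(
                "jsonAttr", "{\"a\":1}",
                "dynamicJsonAttr", "jsonAttr");

        // A literal name is looked up directly.
        System.out.println(resolveJson("jsonAttr", attributes));
        // An expression must evaluate to a name, which requires a second attribute holding that name.
        System.out.println(resolveJson("${dynamicJsonAttr}", attributes));
        // The intuitive form evaluates to the JSON text, which is then used as a name and finds nothing.
        System.out.println(resolveJson("${jsonAttr}", attributes));
    }
}
```

Under these assumptions, the form a user is most likely to type, ${jsonAttr}, is the one that fails, which is the ambiguity the property naming or the removal of Expression Language support would need to resolve.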
@exceptionfactory I agree with you. EL support was only meant to cover edge-case scenarios, and there are workarounds to achieve what is intended. Happy to remove the Expression Language support.
@exceptionfactory Can you please review?
try {
inputJson = jsonUtil.jsonToObject(jsonSourceAttributeValue);
} catch (final Exception e) {
logger.error("JSON parsing failed on FlowFile attribute for {}", original, e);
It might be a good idea to include the name of the attribute like you did above.
- logger.error("JSON parsing failed on FlowFile attribute for {}", original, e);
+ logger.error("JSON parsing failed on attribute '{}' of FlowFile {}", jsonSourceAttributeName, original, e);
logger.info("Transform completed on FlowFile attribute for {}", original);
}
Again, it may be beneficial to name the attribute which was transformed so it is clear in the logs.
- logger.info("Transform completed on FlowFile attribute for {}", original);
- }
+ logger.info("Transform completed on attribute {} of FlowFile {}", jsonSourceAttributeName, original);
+ }
…s on Attributes
Summary
NIFI-14337
Tracking
Please complete the following tracking steps prior to pull request creation.
Issue Tracking
Pull Request Tracking
NIFI-00000
NIFI-00000
Pull Request Formatting
main branch
Verification
Please indicate the verification steps performed prior to pull request creation.
Build
mvn clean install -P contrib-check
Licensing
LICENSE and NOTICE files
Documentation