HIVE-28825: Support conversion of complex types to String #5702

Open: wants to merge 3 commits into base: master

soumyakanti3578
Contributor

What changes were proposed in this pull request?

Support conversion of complex types to String using cast(<value> as string) (GenericUDFToString.java)

Why are the changes needed?

Details in https://issues.apache.org/jira/browse/HIVE-28825

Does this PR introduce any user-facing change?

No, except for the new feature.

Is the change a dependency upgrade?

No

How was this patch tested?

mvn test -pl itests/qtest -Pitests -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile="complex_type_to_string.q"

@@ -0,0 +1,31 @@
create table table1(id int, txt string, num int, flag string);
Contributor

Could you please add some test cases where complex types are nested?

struct {
  a map<string, array<int>>,
  b struct { c decimal(10,3), d double}
}
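
A nested case along these lines could be sketched as follows. This is only an illustration: the literal values and the key name `k1` are made up, and the statement assumes the complex-type-to-string cast proposed in this PR is in place. The resulting string format is not shown because it depends on the implementation.

```sql
-- Builds a struct matching the shape above and casts it to string.
-- Values and the map key 'k1' are hypothetical, chosen for illustration.
SELECT CAST(
  named_struct(
    'a', map('k1', array(1, 2, 3)),
    'b', named_struct('c', CAST(1.5 AS DECIMAL(10,3)), 'd', CAST(2.5 AS DOUBLE))
  ) AS STRING);
```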

Contributor Author

Added a nested test in udf_to_string.q

Comment on lines 15 to 16
select md5(s) csum from (select cast(array(*) as string) s from table1) t;
select md5(s) csum from (select cast(array(*) as string) s from table2) t;
Contributor

These tests pass a string to the md5 function. I don't think it is necessary to test that, because we already have a test for it in udf_md5.q.
Please let me know if there is something else that these tests cover.

Do we want to support md5 with a complex type parameter? I think that is out of scope for this patch.

Contributor Author

I have removed the tests with md5.

Comment on lines 13 to 14
select cast(array(*) as string) from table1;
select cast(array(*) as string) from table2;
Contributor

Could you please add these tests to the udf_to_string.q file? I don't think we need a separate q file. WDYT?

Contributor Author

Added tests for complex types in udf_to_string.q

Comment on lines 5 to 18
SELECT CAST(NULL AS STRING) FROM src tablesample (1 rows);
SELECT CAST(NULL AS STRING) FROM alltypesorc tablesample (1 rows);

SELECT CAST(TRUE AS STRING) FROM src tablesample (1 rows);
SELECT CAST(TRUE AS STRING) FROM alltypesorc tablesample (1 rows);

SELECT CAST(CAST(1 AS TINYINT) AS STRING) FROM src tablesample (1 rows);
SELECT CAST(CAST(-18 AS SMALLINT) AS STRING) FROM src tablesample (1 rows);
SELECT CAST(-129 AS STRING) FROM src tablesample (1 rows);
SELECT CAST(CAST(-1025 AS BIGINT) AS STRING) FROM src tablesample (1 rows);
SELECT CAST(CAST(1 AS TINYINT) AS STRING) FROM alltypesorc tablesample (1 rows);
SELECT CAST(CAST(-18 AS SMALLINT) AS STRING) FROM alltypesorc tablesample (1 rows);
SELECT CAST(-129 AS STRING) FROM alltypesorc tablesample (1 rows);
SELECT CAST(CAST(-1025 AS BIGINT) AS STRING) FROM alltypesorc tablesample (1 rows);

SELECT CAST(CAST(-3.14 AS DOUBLE) AS STRING) FROM src tablesample (1 rows);
SELECT CAST(CAST(-3.14 AS FLOAT) AS STRING) FROM src tablesample (1 rows);
SELECT CAST(CAST(-3.14 AS DECIMAL(3,2)) AS STRING) FROM src tablesample (1 rows);
SELECT CAST(CAST(-3.14 AS DOUBLE) AS STRING) FROM alltypesorc tablesample (1 rows);
SELECT CAST(CAST(-3.14 AS FLOAT) AS STRING) FROM alltypesorc tablesample (1 rows);
SELECT CAST(CAST(-3.14 AS DECIMAL(3,2)) AS STRING) FROM alltypesorc tablesample (1 rows);

SELECT CAST('Foo' AS STRING) FROM src tablesample (1 rows);
SELECT CAST('Foo' AS STRING) FROM alltypesorc tablesample (1 rows);
Contributor

Is the FROM clause relevant in these statements? It seems that we are casting constants.
For example,

SELECT CAST(CAST(-3.14 AS DOUBLE) AS STRING);

should also test GenericUDFToString.

Could you please confirm?

3 participants