content-type in Avro schema contains invalid characters #3394

Closed
amarjandu opened this issue Sep 1, 2021 · 2 comments
Labels
bug [type] A defect preventing use of the system as specified debt [type] A defect incurring continued engineering cost manifests [subject] Generation and contents of manifests orange [process] Done by the Azul team

Comments

@amarjandu
Contributor

# Build the PFB (Avro) schema from the file transformer's field types
from azul.service.manifest_service import PFBManifestGenerator
from azul.plugins.metadata.hca.transform import FileTransformer
from azul.service.avro_pfb import pfb_schema_from_field_types
field_types = FileTransformer.field_types()
pfb_schema = pfb_schema_from_field_types(field_types)
# Inspect the nested record schema named 'files' (output shown below)
pfb_schema['fields'][2]['type'][11]

{'fields': [{'name': 'document_id', 'type': ['string']},
            {'name': 'submission_date', 'type': ['string']},
            {'name': 'update_date', 'type': ['string']},
            {'name': 'content-type', 'type': ['string']},
            {'name': 'indexed', 'type': ['string', 'boolean']},
            {'name': 'name', 'type': ['string']},
            {'name': 'crc32c', 'type': ['string']},
            {'name': 'sha256', 'type': ['string']},
            {'name': 'size', 'type': ['string', 'long']},
            {'default': None,
             'logicalType': 'UUID',
             'name': 'uuid',
             'type': ['string']},
            {'name': 'drs_path', 'type': ['string']},
            {'name': 'version', 'type': ['string']},
            {'name': 'file_type', 'type': ['string']},
            {'name': 'file_format', 'type': ['string']},
            {'name': 'content_description',
             'type': {'items': ['string'], 'type': 'array'}},
            {'name': 'is_intermediate', 'type': ['string', 'boolean']},
            {'name': 'file_source', 'type': ['string']},
            {'name': '_type', 'type': ['string']},
            {'name': 'related_files',
             'type': {'items': {'fields': [{'name': 'document_id',
                                            'type': ['string']},
                                           {'name': 'submission_date',
                                            'type': ['string']},
                                           {'name': 'update_date',
                                            'type': ['string']},
                                           {'name': 'content-type',
                                            'type': ['string']},
                                           {'name': 'name', 'type': ['string']},
                                           {'name': 'crc32c',
                                            'type': ['string']},
                                           {'name': 'sha256',
                                            'type': ['string']},
                                           {'name': 'size',
                                            'type': ['string', 'long']},
                                           {'default': None,
                                            'logicalType': 'UUID',
                                            'name': 'uuid',
                                            'type': ['string']},
                                           {'name': 'drs_path',
                                            'type': ['string']},
                                           {'name': 'version',
                                            'type': ['string']}],
                                'name': 'related_files',
                                'type': 'record'},
                      'type': 'array'}},
            {'name': 'read_index', 'type': ['string']},
            {'name': 'lane_index', 'type': ['string', 'long']},
            {'name': 'matrix_cell_count', 'type': ['string', 'long']}],
 'name': 'files',
 'type': 'record'}

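The "-" character in the field name content-type (which appears both at the top level of the files record and inside related_files) is invalid according to the Avro specification, which requires a name to start with [A-Za-z_] and subsequently contain only [A-Za-z0-9_]: http://avro.apache.org/docs/current/spec.html#names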
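
For illustration only, the naming rule can be checked locally with a small standalone script. This is a minimal sketch, not Azul code: the helper `invalid_names` and the trimmed `sample` schema are hypothetical, and the regular expression is my reading of the name rule in the Avro specification.

```python
import re

# Avro names must start with [A-Za-z_] and continue with [A-Za-z0-9_]
AVRO_NAME = re.compile(r'[A-Za-z_][A-Za-z0-9_]*')


def invalid_names(schema, path=''):
    """Yield (path, name) for each 'name' entry that breaks the Avro name rule.

    Hypothetical helper: walks nested dicts and lists, checking every
    dict that carries a string-valued 'name' key.
    """
    if isinstance(schema, dict):
        name = schema.get('name')
        if isinstance(name, str) and AVRO_NAME.fullmatch(name) is None:
            yield path, name
        here = f'{path}/{name}' if isinstance(name, str) else path
        for key, value in schema.items():
            if key != 'name':
                yield from invalid_names(value, here)
    elif isinstance(schema, list):
        for item in schema:
            yield from invalid_names(item, path)


# Trimmed-down stand-in for the schema printed above (assumption: same shape)
sample = {
    'name': 'files',
    'type': 'record',
    'fields': [
        {'name': 'document_id', 'type': ['string']},
        {'name': 'content-type', 'type': ['string']},
        {'name': 'related_files',
         'type': {'type': 'array',
                  'items': {'name': 'related_files',
                            'type': 'record',
                            'fields': [{'name': 'content-type',
                                        'type': ['string']}]}}},
    ],
}

for where, bad in invalid_names(sample):
    print(f'invalid Avro name {bad!r} under {where}')
```

Run against the real `pfb_schema`, a check along these lines should flag both occurrences of content-type, assuming the schema has the structure shown in the output above.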

@amarjandu amarjandu added the orange [process] Done by the Azul team label Sep 1, 2021
@amarjandu
Contributor Author

This was found using the validator at https://json-schema-validator.herokuapp.com/avro.jsp
[Attached screenshot: Screen Shot 2021-08-31 at 5 33 50 PM]

@melainalegaspi melainalegaspi added bug [type] A defect preventing use of the system as specified code [subject] Production code labels Sep 1, 2021
@melainalegaspi melainalegaspi added debt [type] A defect incurring continued engineering cost and removed bug [type] A defect preventing use of the system as specified labels Sep 9, 2021
@melainalegaspi melainalegaspi added the bug [type] A defect preventing use of the system as specified label Sep 30, 2021
@theathorn theathorn added manifests [subject] Generation and contents of manifests and removed code [subject] Production code labels Nov 6, 2021
@bvizzier-ucsc

@hannes-ucsc: "Made obsolete by: #2693"

@bvizzier-ucsc bvizzier-ucsc closed this as not planned May 2, 2023
4 participants