csvw2rdf tests do not merge embedded metadata #903

@filipjezek

CSV headers are set as the titles of the respective columns, as stated here (steps 7.3.2.2 and 7.3.2.3).

Additionally, this document specifies that

If there is no name property defined on this column, the first titles value having the same language tag as default language, or und or if no default language is specified, becomes the name annotation for the described column.

I believe this means that the default column names (_col.1, _col.2, ...) should only be used when the CSV file contains no header for the column. The CSVW test cases 107, 148, 149, and 278 contain headers, but the expected results ignore them.
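To make that reading concrete, here is a minimal Python sketch of the fallback order. It is not taken from rdf-tabular or from the test suite: `derive_column_names` and its `has_header` parameter are hypothetical names, and percent-encoding the title with `urllib.parse.quote` is an assumption chosen to match the `On%20Street` form in the rdf-tabular output further down.

```python
import csv
import io
from urllib.parse import quote


def derive_column_names(csv_text, has_header=True):
    """Return one column name per column of the first row.

    Hypothetical helper: a column takes its name from its header title
    (percent-encoded) and falls back to "_col.N" only when there is no
    header title for that column.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []

    names = []
    for index, cell in enumerate(rows[0]):
        # Without a header row there is no title to use at all.
        title = cell.strip() if has_header else ""
        if title:
            # Assumption: encode everything that is not an unreserved character.
            names.append(quote(title, safe=""))
        else:
            # Default name, numbered from 1.
            names.append(f"_col.{index + 1}")
    return names


tree_ops = (
    "GID,On Street,Species,Trim Cycle,Inventory Date\n"
    "1,ADDISON AV,Celtis australis,Large Tree Routine Prune,10/18/2010\n"
)
print(derive_column_names(tree_ops))
print(derive_column_names(tree_ops, has_header=False))
```

Under these assumptions the first call yields names based on the headers, and only the second call, which treats the file as headerless, yields `_col.1` through `_col.5`.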

For example, test 107 looks as follows:

test107-metadata.json

{
  "@context": "http://www.w3.org/ns/csvw",
  "rdfs:comment": "tableSchema with invalid value MUST act as if it was an empty object",
  "url": "tree-ops.csv",
  "tableSchema": 1
}

tree-ops.csv

GID,On Street,Species,Trim Cycle,Inventory Date
1,ADDISON AV,Celtis australis,Large Tree Routine Prune,10/18/2010
2,EMERSON ST,Liquidambar styraciflua,Large Tree Routine Prune,6/2/2010

test107.ttl (the provided expected output, which I believe is wrong)

@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

 [
    a csvw:TableGroup;
    csvw:table [
      a csvw:Table;
      rdfs:comment "tableSchema with invalid value MUST act as if it was an empty object";
      csvw:row [
        a csvw:Row;
        csvw:describes [
          <tree-ops.csv#_col.1> "1";
          <tree-ops.csv#_col.2> "ADDISON AV";
          <tree-ops.csv#_col.3> "Celtis australis";
          <tree-ops.csv#_col.4> "Large Tree Routine Prune";
          <tree-ops.csv#_col.5> "10/18/2010"
        ];
        csvw:rownum 1;
        csvw:url <tree-ops.csv#row=2>
      ],  [
        a csvw:Row;
        csvw:describes [
          <tree-ops.csv#_col.1> "2";
          <tree-ops.csv#_col.2> "EMERSON ST";
          <tree-ops.csv#_col.3> "Liquidambar styraciflua";
          <tree-ops.csv#_col.4> "Large Tree Routine Prune";
          <tree-ops.csv#_col.5> "6/2/2010"
        ];
        csvw:rownum 2;
        csvw:url <tree-ops.csv#row=3>
      ];
      csvw:url <tree-ops.csv>
    ]
 ] .

rdf-tabular output

@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

 [
    a csvw:TableGroup;
    csvw:table [
      a csvw:Table;
      csvw:row [
        a csvw:Row;
        csvw:describes [
          <file:fake-path/test107.csv#GID> "1";
          <file:fake-path/test107.csv#Inventory%20Date> "10/18/2010";
          <file:fake-path/test107.csv#On%20Street> "ADDISON AV";
          <file:fake-path/test107.csv#Species> "Celtis australis";
          <file:fake-path/test107.csv#Trim%20Cycle> "Large Tree Routine Prune"
        ];
        csvw:rownum 1;
        csvw:url <file:fake-path/test107.csv#row=2>
      ], [
        a csvw:Row;
        csvw:describes [
          <file:fake-path/test107.csv#GID> "2";
          <file:fake-path/test107.csv#Inventory%20Date> "6/2/2010";
          <file:fake-path/test107.csv#On%20Street> "EMERSON ST";
          <file:fake-path/test107.csv#Species> "Liquidambar styraciflua";
          <file:fake-path/test107.csv#Trim%20Cycle> "Large Tree Routine Prune"
        ];
        csvw:rownum 2;
        csvw:url <file:fake-path/test107.csv#row=3>
      ];
      csvw:url <file:fake-path/test107.csv>
    ];
    prov:wasGeneratedBy [
      a prov:Activity;
      prov:endedAtTime "2025-07-23T20:13:34.627+02:00"^^xsd:dateTime;
      prov:qualifiedUsage [
        a prov:Usage;
        prov:entity <file:fake-path/test107.csv>;
        prov:hadRole csvw:csvEncodedTabularData
      ];
      prov:startedAtTime "2025-07-23T20:13:34.625+02:00"^^xsd:dateTime;
      prov:wasAssociatedWith <https://rubygems.org/gems/rdf-tabular>
    ]
  ] .
