Fix Parquet multi column byte writing (Bears-R-Us#3312)
For Parquet, we were previously writing byte columns for
segmented array values incorrectly: the schema set their
repetition type to "required" when it should have been
"optional". With the schema setup fixed, we now correctly
write the definition level of each value.
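
As a rough illustration of why the repetition type matters here, the sketch below computes the maximum definition and repetition levels for a column path using the general Parquet rule: each non-required node adds one definition level, and each repeated node adds one repetition level. The `Repetition` enum, `Levels` struct, and `MaxLevels` function are hypothetical stand-ins written for this example; they are not part of Arrow or of ArrowFunctions.cpp.

```cpp
#include <vector>

// Hypothetical stand-in for parquet::Repetition, for illustration only.
enum class Repetition { Required, Optional, Repeated };

struct Levels {
  int max_definition;  // 0 means no definition levels are stored for the column
  int max_repetition;  // 0 means no repetition levels are stored for the column
};

// Walk the schema path from the top-level field down to the leaf.
// Every non-required node contributes one definition level;
// every repeated node also contributes one repetition level.
Levels MaxLevels(const std::vector<Repetition>& path) {
  Levels levels{0, 0};
  for (Repetition r : path) {
    if (r != Repetition::Required) ++levels.max_definition;
    if (r == Repetition::Repeated) ++levels.max_repetition;
  }
  return levels;
}
```

Under this rule, a top-level required BYTE_ARRAY column has a maximum definition level of 0, so no per-value definition levels are stored, while the same column declared optional has a maximum of 1 and records a definition level for each value.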
bmcdonald3 authored Jun 11, 2024
1 parent c7ffa24 commit c900224
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/ArrowFunctions.cpp
@@ -952,7 +952,7 @@ std::shared_ptr<parquet::schema::GroupNode> SetupSchema(void* column_names, void
         auto list = parquet::schema::GroupNode::Make("list", parquet::Repetition::REPEATED, {element});
         fields.push_back(parquet::schema::GroupNode::Make(cname_ptr[i], parquet::Repetition::OPTIONAL, {list}, parquet::ConvertedType::LIST));
       } else {
-        fields.push_back(parquet::schema::PrimitiveNode::Make(cname_ptr[i], parquet::Repetition::REQUIRED, parquet::Type::BYTE_ARRAY, parquet::ConvertedType::NONE));
+        fields.push_back(parquet::schema::PrimitiveNode::Make(cname_ptr[i], parquet::Repetition::OPTIONAL, parquet::Type::BYTE_ARRAY, parquet::ConvertedType::NONE));
       }
     }
   }