This repository has been archived by the owner on Oct 29, 2023. It is now read-only.
When the genomic region we want to examine is divided into shards, we use a STRICT shard boundary so that a record spanning a shard boundary is not returned twice, once at the end of the current shard and again at the beginning of the next shard.
This works fine when we are working over an entire chromosome.
But when we shard only a subset of a chromosome, the records that overlap the beginning of the very first shard are filtered out even though they would not be duplicated in any other shard.
Sometimes we do want to make use of those records that overlap the beginning of the shard boundary.
We need a way to use OVERLAPS for the first shard and STRICT for all subsequent shards.
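To make the proposal concrete, here is a minimal Python sketch of the idea, not the project's actual implementation. All names (`Requirement`, `Record`, `make_shards`, `keep`, `assign`) are hypothetical, and it assumes half-open coordinates, that STRICT means "the record starts inside the shard", and that OVERLAPS means "the record overlaps the shard at all".

```python
from dataclasses import dataclass
from enum import Enum


class Requirement(Enum):
    # Assumed semantics, for illustration only.
    STRICT = "strict"      # keep records that start inside the shard
    OVERLAPS = "overlaps"  # keep records that overlap the shard at all


@dataclass(frozen=True)
class Record:
    start: int  # inclusive
    end: int    # exclusive


def make_shards(region_start, region_end, shard_size):
    """Split [region_start, region_end) into consecutive half-open shards."""
    return [(s, min(s + shard_size, region_end))
            for s in range(region_start, region_end, shard_size)]


def keep(record, shard, requirement):
    """Decide whether a record belongs to a shard under the given requirement."""
    shard_start, shard_end = shard
    overlaps = record.start < shard_end and record.end > shard_start
    if requirement is Requirement.OVERLAPS:
        return overlaps
    # STRICT: additionally require the record to start inside the shard.
    return overlaps and record.start >= shard_start


def assign(records, shards, first_requirement):
    """Apply first_requirement to the first shard and STRICT to all others."""
    result = []
    for i, shard in enumerate(shards):
        requirement = first_requirement if i == 0 else Requirement.STRICT
        result.append([r for r in records if keep(r, shard, requirement)])
    return result


# A sub-region of a chromosome, [100, 300), split into two shards.
shards = make_shards(100, 300, 100)
records = [Record(90, 110), Record(150, 160), Record(190, 210), Record(250, 260)]

strict_everywhere = assign(records, shards, Requirement.STRICT)
overlaps_first = assign(records, shards, Requirement.OVERLAPS)
```

With STRICT on every shard, `Record(90, 110)` is dropped because it starts before the region. With OVERLAPS on the first shard only, it is kept, and `Record(190, 210)`, which spans the internal boundary, still appears in exactly one shard.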
Confirm this functionality with a JoinNonVariantSegmentsWithVariants integration test that operates over a small genomic region specified by both normal sharding and SitesToShards.