This repository has been archived by the owner on Oct 29, 2023. It is now read-only.

add support for STRICT intra-shard boundary #201

Open
deflaux opened this issue Jun 14, 2016 · 0 comments
Contributor

deflaux commented Jun 14, 2016

When the genomic region we want to examine is divided into shards, we use a STRICT shard boundary to remove the duplicate data that would otherwise occur both at the end of one shard and at the beginning of the next.

  • This works fine when we are working over an entire chromosome.
  • But when we shard only a subset of a chromosome, we filter out the records that overlap the beginning of the very first shard, even though they would not be duplicated in any other shard.
    • Sometimes we do want to make use of the records that overlap the beginning of the shard boundary.
    • We need a way to use OVERLAPS for the first shard and STRICT for all subsequent shards.
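
A minimal sketch of the requested behavior, in Python for brevity rather than in this project's own code. Every name here (`Boundary`, `Record`, `make_shards`, `boundary_for`, `keep`, `assign`) is hypothetical and made up for illustration; it assumes half-open `[start, end)` coordinates and that STRICT means "the record must start inside the shard".

```python
from enum import Enum
from typing import List, NamedTuple, Tuple


class Boundary(Enum):
    OVERLAPS = "OVERLAPS"  # keep any record that overlaps the shard
    STRICT = "STRICT"      # keep only records that also start inside the shard


class Record(NamedTuple):
    start: int  # inclusive
    end: int    # exclusive


Shard = Tuple[int, int]  # (start inclusive, end exclusive)


def make_shards(region_start: int, region_end: int, shard_size: int) -> List[Shard]:
    """Split [region_start, region_end) into consecutive shards."""
    return [
        (s, min(s + shard_size, region_end))
        for s in range(region_start, region_end, shard_size)
    ]


def boundary_for(shard_index: int) -> Boundary:
    """OVERLAPS for the first shard of the region, STRICT for all later ones."""
    return Boundary.OVERLAPS if shard_index == 0 else Boundary.STRICT


def keep(record: Record, shard: Shard, boundary: Boundary) -> bool:
    shard_start, shard_end = shard
    overlaps = record.start < shard_end and record.end > shard_start
    if boundary is Boundary.STRICT:
        # Drop records that began before this shard: an earlier shard has them.
        return overlaps and record.start >= shard_start
    return overlaps


def assign(records: List[Record], shards: List[Shard]) -> List[List[Record]]:
    """Return, per shard, the records that shard would emit."""
    return [
        [r for r in records if keep(r, shard, boundary_for(i))]
        for i, shard in enumerate(shards)
    ]
```

With a region of `[100, 300)` split into two shards of size 100, a record spanning `[90, 110)` is kept by the first shard under this scheme, while a record spanning `[190, 210)` is still emitted only once, by the first shard. Using STRICT for every shard would drop the `[90, 110)` record entirely, which is the problem described above.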

Confirm this functionality with a JoinNonVariantSegmentsWithVariants integration test that operates over a small genomic region specified by both normal sharding and SitesToShards.

@deflaux deflaux self-assigned this Jun 14, 2016