Allow null values in PFBs (#2462) #5444

Merged

Contributor

@nadove-ucsc nadove-ucsc commented Aug 1, 2023

Connected issues: #2462
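
For readers unfamiliar with how Avro (the container format underlying PFB) expresses optional values, here is a minimal sketch of a nullable field declared as a union with "null". This is not the code from this PR: the helper names `nullable` and `nullable_field` are hypothetical, and the `long` type chosen for `lane_index` is an assumption made for illustration.

```python
import json


def nullable(avro_type):
    """Return an Avro type that also admits null, with "null" listed first."""
    if avro_type == "null":
        return avro_type
    if isinstance(avro_type, list):
        # Already a union: move (or add) "null" to the front so that a
        # null default matches the first branch of the union.
        return ["null", *[t for t in avro_type if t != "null"]]
    return ["null", avro_type]


def nullable_field(name, avro_type):
    """Build a field declaration whose value may be absent (hypothetical helper)."""
    return {"name": name, "type": nullable(avro_type), "default": None}


if __name__ == "__main__":
    # Field name taken from the manifests discussed in this PR; the type is assumed.
    print(json.dumps(nullable_field("lane_index", "long")))
```

With a schema like this, a missing value can be written as null rather than being coerced to an empty string.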

Checklist

Author

  • PR is a draft
  • Target branch is develop
  • Name of PR branch matches issues/<GitHub handle of author>/<issue#>-<slug>
  • PR title references all connected issues
  • PR title matches¹ that of a connected issue or comment in PR explains why they're different
  • For each connected issue, there is at least one commit whose title references that issue
  • PR is connected to all connected issues via ZenHub
  • PR description links to connected issues
  • Added partial label to PR or this PR completely resolves all connected issues

¹ When the issue title describes a problem, the corresponding PR title is "Fix: " followed by the issue title.

Author (reindex, API changes)

  • Added r tag to commit title or this PR does not require reindexing
  • Added reindex label to PR or this PR does not require reindexing
  • PR and connected issue are labeled API or this PR does not modify a REST API
  • Added a (A) tag to commit title for backwards (in)compatible changes or this PR does not modify a REST API
  • Updated REST API version number in app.py or this PR does not modify a REST API

Author (chains)

  • This PR is blocked by previous PR in the chain or this PR is not chained to another PR
  • Added base label to the blocking PR or this PR is not chained to another PR
  • Added chained label to this PR or this PR is not chained to another PR

Author (upgrading)

  • Documented upgrading of deployments in UPGRADING.rst or this PR does not require upgrading
  • Added u tag to commit title or this PR does not require upgrading
  • Added upgrade label to PR or this PR does not require upgrading

Author (operator tasks)

  • Added checklist items for additional operator tasks or this PR does not require additional tasks

Author (hotfixes)

  • Added F tag to main commit title or this PR does not include a permanent fix for a temporary hotfix
  • Reverted the temporary hotfixes for any connected issues or the prod branch has no temporary hotfixes for any connected issues

Author (before every review)

  • Rebased PR branch on develop, squashed old fixups
  • Ran make requirements_update or this PR does not touch requirements*.txt, common.mk, Makefile and Dockerfile
  • Added R tag to commit title or this PR does not touch requirements*.txt
  • Added reqs label to PR or this PR does not touch requirements*.txt
  • make integration_test passes in personal deployment or this PR does not touch functionality that could break the IT

Peer reviewer (after requesting changes)

Uncheck the Author (before every review) checklists.

Peer reviewer (after approval)

  • PR is not a draft
  • Ticket is in Review requested column
  • Requested review from primary reviewer
  • Assigned PR to primary reviewer

Primary reviewer (after requesting changes)

Uncheck the before every review checklists. Update the N reviews label.

Primary reviewer (after approval)

  • Actually approved the PR
  • Labeled connected issues as demo or no demo
  • Commented on connected issues about demo expectations or all connected issues are labeled no demo
  • Decided if PR can be labeled no sandbox
  • PR title is appropriate as title of merge commit
  • N reviews label is accurate
  • Moved ticket to Approved column
  • Assigned PR to current operator

Operator (before pushing the merge commit)

  • Checked reindex label and r commit title tag
  • Checked that demo expectations are clear or all connected issues are labeled no demo
  • PR has checklist items for upgrading instructions or PR is not labeled upgrade
  • Squashed PR branch and rebased onto develop
  • Sanity-checked history
  • Pushed PR branch to GitHub
  • Pushed PR branch to GitLab dev and added sandbox label or PR is labeled no sandbox
  • Pushed PR branch to GitLab anvildev or PR is labeled no sandbox
  • Pushed PR branch to GitLab anvilprod or PR is labeled no sandbox
  • Build passes in sandbox deployment or PR is labeled no sandbox
  • Build passes in anvilbox deployment or PR is labeled no sandbox
  • Build passes in hammerbox deployment or PR is labeled no sandbox
  • Reviewed build logs for anomalies in sandbox deployment or PR is labeled no sandbox
  • Reviewed build logs for anomalies in anvilbox deployment or PR is labeled no sandbox
  • Reviewed build logs for anomalies in hammerbox deployment or PR is labeled no sandbox
  • Deleted unreferenced indices in sandbox or this PR does not remove catalogs or otherwise cause unreferenced indices
  • Deleted unreferenced indices in anvilbox or this PR does not remove catalogs or otherwise cause unreferenced indices
  • Deleted unreferenced indices in hammerbox or this PR does not remove catalogs or otherwise cause unreferenced indices
  • Started reindex in sandbox or this PR does not require reindexing sandbox
  • Started reindex in anvilbox or this PR does not require reindexing anvilbox
  • Started reindex in hammerbox or this PR does not require reindexing hammerbox
  • Checked for failures in sandbox or this PR does not require reindexing sandbox
  • Checked for failures in anvilbox or this PR does not require reindexing anvilbox
  • Checked for failures in hammerbox or this PR does not require reindexing hammerbox
  • Title of merge commit starts with title from this PR
  • Added PR reference to merge commit title
  • Added commit title tags to merge commit title
  • Moved connected issues to Merged column in ZenHub
  • Pushed merge commit to GitHub

Operator (chain shortening)

  • Changed the target branch of the blocked PR to develop or this PR is not labeled base
  • Removed the chained label from the blocked PR or this PR is not labeled base
  • Removed the blocking relationship from the blocked PR or this PR is not labeled base
  • Removed the base label from this PR or this PR is not labeled base

Operator (after pushing the merge commit)

  • Pushed merge commit to GitLab dev or PR is labeled no sandbox
  • Pushed merge commit to GitLab anvildev or PR is labeled no sandbox
  • Pushed merge commit to GitLab anvilprod or PR is labeled no sandbox
  • Build passes on GitLab dev¹
  • Reviewed build logs for anomalies on GitLab dev¹
  • Build passes on GitLab anvildev¹
  • Reviewed build logs for anomalies on GitLab anvildev¹
  • Build passes on GitLab anvilprod¹
  • Reviewed build logs for anomalies on GitLab anvilprod¹
  • Deleted PR branch from GitHub
  • Deleted PR branch from GitLab dev
  • Deleted PR branch from GitLab anvildev
  • Deleted PR branch from GitLab anvilprod

¹ When pushing the merge commit is skipped because the PR is labelled no sandbox, the next build triggered by a PR whose merge commit is pushed determines this checklist item.

Operator (reindex)

  • Deleted unreferenced indices in dev or this PR does not remove catalogs or otherwise cause unreferenced indices
  • Deleted unreferenced indices in anvildev or this PR does not remove catalogs or otherwise cause unreferenced indices
  • Deleted unreferenced indices in anvilprod or this PR does not remove catalogs or otherwise cause unreferenced indices
  • Started reindex in dev or this PR does not require reindexing
  • Started reindex in anvildev or this PR does not require reindexing
  • Started reindex in anvilprod or this PR does not require reindexing
  • Checked for and triaged indexing failures in dev or this PR does not require reindexing
  • Checked for and triaged indexing failures in anvildev or this PR does not require reindexing
  • Checked for and triaged indexing failures in anvilprod or this PR does not require reindexing
  • Emptied fail queues in dev deployment or this PR does not require reindexing
  • Emptied fail queues in anvildev deployment or this PR does not require reindexing
  • Emptied fail queues in anvilprod deployment or this PR does not require reindexing

Operator

  • Unassigned PR

Shorthand for review comments

  • L line is too long
  • W line wrapping is wrong
  • Q bad quotes
  • F other formatting problem

@github-actions github-actions bot added the orange [process] Done by the Azul team label Aug 1, 2023
@nadove-ucsc nadove-ucsc force-pushed the issues/nadove-ucsc/2462-allow-null-values-in-pfbs branch 2 times, most recently from e35ca50 to a01d54e Compare August 2, 2023 01:51
@coveralls

coveralls commented Aug 2, 2023

Coverage Status

coverage: 83.693% (-0.008%) from 83.701% when pulling 0f4b45f on issues/nadove-ucsc/2462-allow-null-values-in-pfbs into 1c7c898 on develop.

@codecov

codecov bot commented Aug 2, 2023

Codecov Report

Merging #5444 (0f4b45f) into develop (1c7c898) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop    #5444      +/-   ##
===========================================
- Coverage    83.67%   83.67%   -0.01%     
===========================================
  Files          152      152              
  Lines        18707    18698       -9     
===========================================
- Hits         15654    15645       -9     
  Misses        3053     3053              
Files Changed                  Coverage Δ
src/azul/service/avro_pfb.py   90.24% <100.00%> (-0.57%) ⬇️
test/service/test_pfb.py       94.28% <100.00%> (+0.16%) ⬆️

Contributor

@dsotirho-ucsc dsotirho-ucsc left a comment

Approved.

@dsotirho-ucsc dsotirho-ucsc marked this pull request as ready for review August 2, 2023 18:29
@nadove-ucsc
Contributor Author

To confirm the successful generation of PFB manifests containing null values, I compared the contents of PFB manifests generated with and without these changes.

https://service.sandbox.dev.singlecell.gi.ucsc.edu/fetch/manifest/files?catalog=dcp3&filters=%7B%0A%20%20%22donorDisease%22%3A%20%7B%0A%20%20%20%20%22is%22%3A%20%5B%0A%20%20%20%20%20%20%22plasma%20cell%20myeloma%22%0A%20%20%20%20%5D%0A%20%20%7D%0A%7D&format=terra.pfb

$ diff <(pfb show -i ~/Desktop/dev.avro | jq .) <(pfb show -i ~/Desktop/pr.avro | jq .)
12c12
<       ""
---
>       null
58c58
<       ""
---
>       null
61c61
<       ""
---
>       null
96c96
<       ""
---
>       null
169c169
<       ""
---
>       null
229c229
<       ""
---
>       null
243c243
<       ""
---
>       null
258c258
<     "estimated_cell_count": ""
---
>     "estimated_cell_count": null
281,282c281,282
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
285,286c285,286
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
342c342
<       ""
---
>       null
388c388
<       ""
---
>       null
391c391
<       ""
---
>       null
426c426
<       ""
---
>       null
505,506c505,506
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
509,510c509,510
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
624,625c624,625
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
628,629c628,629
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
685c685
<       ""
---
>       null
731c731
<       ""
---
>       null
734c734
<       ""
---
>       null
769c769
<       ""
---
>       null
842c842
<       ""
---
>       null
878,879c878,879
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
882,883c882,883
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
997,998c997,998
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
1001,1002c1001,1002
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
1116,1117c1116,1117
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
1120,1121c1120,1121
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
1185,1186c1185,1186
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
1189,1190c1189,1190
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
1304,1305c1304,1305
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
1308,1309c1308,1309
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
1373,1374c1373,1374
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
1377,1378c1377,1378
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
1492,1493c1492,1493
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
1496,1497c1496,1497
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
1561,1562c1561,1562
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
1565,1566c1565,1566
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
1630,1631c1630,1631
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
1634,1635c1634,1635
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
1699,1700c1699,1700
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
1703,1704c1703,1704
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
1818,1819c1818,1819
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
1822,1823c1822,1823
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
1879c1879
<       ""
---
>       null
1925c1925
<       ""
---
>       null
1928c1928
<       ""
---
>       null
1963c1963
<       ""
---
>       null
2042,2043c2042,2043
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
2046,2047c2046,2047
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
2111,2112c2111,2112
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
2115,2116c2115,2116
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
2230,2231c2230,2231
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
2234,2235c2234,2235
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
2349,2350c2349,2350
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
2353,2354c2353,2354
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
2468,2469c2468,2469
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
2472,2473c2472,2473
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
2537,2538c2537,2538
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
2541,2542c2541,2542
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
2606,2607c2606,2607
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
2610,2611c2610,2611
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
2675,2676c2675,2676
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
2679,2680c2679,2680
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
2744,2745c2744,2745
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
2748,2749c2748,2749
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
2813,2814c2813,2814
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
2817,2818c2817,2818
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
2882,2883c2882,2883
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
2886,2887c2886,2887
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
2951,2952c2951,2952
<     "is_intermediate": "",
<     "file_source": "",
---
>     "is_intermediate": null,
>     "file_source": null,
2955,2956c2955,2956
<     "lane_index": "",
<     "matrix_cell_count": ""
---
>     "lane_index": null,
>     "matrix_cell_count": null
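
The diff above is long but uniform: every change replaces an empty string with null. A short script can confirm that programmatically instead of by eye. The following is a minimal sketch, not part of this PR; the function names are hypothetical, and `load_records` assumes the dump contains one JSON record per line.

```python
import json
from collections import Counter

_MISSING = object()  # marks a key that is present on one side only


def classify_changes(old, new, path=""):
    """Yield (path, kind) for every leaf value that differs between old and new."""
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(old.keys() | new.keys()):
            yield from classify_changes(old.get(key, _MISSING),
                                        new.get(key, _MISSING),
                                        f"{path}.{key}")
    elif isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        for i, (o, n) in enumerate(zip(old, new)):
            yield from classify_changes(o, n, f"{path}[{i}]")
    elif old is not new and old != new:
        kind = "empty_to_null" if old == "" and new is None else "other"
        yield path, kind


def summarize(old_records, new_records):
    """Count the kinds of differences between two equally long record lists."""
    if len(old_records) != len(new_records):
        raise ValueError("record counts differ")
    kinds = Counter()
    for i, (old, new) in enumerate(zip(old_records, new_records)):
        kinds.update(kind for _, kind in classify_changes(old, new, f"[{i}]"))
    return kinds


def load_records(path):
    """Read a dump with one JSON record per line (an assumed layout)."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]
```

If `summarize` reports only `empty_to_null` entries and no `other` entries, the two manifests differ solely in how missing values are represented.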

@nadove-ucsc
Contributor Author

To confirm that handover to Terra succeeds for manifests that contain null values, I imported the null-containing manifest generated in the previous step. To do so, I edited the URL generated by the Data Browser so that it points to my personal deployment instead of dev:
https://bvdp-saturn-dev.appspot.com/#import-data?format=PFB&url=https%3A%2F%2Fservice.nadove.dev.singlecell.gi.ucsc.edu%2Fmanifest%2Ffiles%3Fcatalog%3Ddcp3%26format%3Dterra.pfb%26filters%3D%257B%2522donorDisease%2522%253A%2B%257B%2522is%2522%253A%2B%255B%2522plasma%2Bcell%2Bmyeloma%2522%255D%257D%257D%26objectKey%3Dmanifests%252Fbb25db3a-3cc9-5120-b6df-686e75096b62.c951adc1-60cc-52e3-8fc8-7294e9cf5741.avro

Data import was successful. Null values are displayed as "(0 items)".

image
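
The URL edit described above can be error-prone by hand because the manifest URL is percent-encoded inside the `url` parameter of the import link. The following minimal sketch shows one way to swap the host using only the Python standard library. The function name and the example hosts are hypothetical, and it assumes that `url` is the last parameter of the import link, as it is in the link above.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def retarget_import_url(import_url, new_host):
    """Point the embedded manifest URL of an import link at another host."""
    prefix, sep, encoded = import_url.partition("&url=")
    if not sep:
        raise ValueError("import URL has no url parameter")
    # Undo one level of percent-encoding to recover the manifest URL.
    manifest = urlsplit(unquote(encoded))
    retargeted = urlunsplit(manifest._replace(netloc=new_host))
    # Re-encode everything, including ':' and '/', as in the original link.
    return prefix + sep + quote(retargeted, safe="")


if __name__ == "__main__":
    # Hypothetical hosts, for illustration only.
    link = ("https://terra.example.org/#import-data?format=PFB&url="
            + quote("https://service.dev.example.org/manifest/files?catalog=dcp3",
                    safe=""))
    print(retarget_import_url(link, "service.personal.dev.example.org"))
```

Only the host of the embedded URL changes; the path, the query, and any filters that are already encoded inside it are carried over untouched.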

@nadove-ucsc
Contributor Author

Given that the data import was successful, I'm confident that these changes can be merged with minimal risk of disrupting the dev deployment.

@achave11-ucsc achave11-ucsc removed the request for review from hannes-ucsc August 3, 2023 18:57
@nadove-ucsc nadove-ucsc force-pushed the issues/nadove-ucsc/2462-allow-null-values-in-pfbs branch from a01d54e to 72fe0f5 Compare August 3, 2023 19:07
@achave11-ucsc achave11-ucsc force-pushed the issues/nadove-ucsc/2462-allow-null-values-in-pfbs branch from 72fe0f5 to 0f4b45f Compare August 7, 2023 21:57
@achave11-ucsc achave11-ucsc added the sandbox [process] Resolution is being verified in sandbox deployment label Aug 7, 2023
@achave11-ucsc achave11-ucsc merged commit 4c2f0de into develop Aug 8, 2023
8 checks passed
@achave11-ucsc achave11-ucsc deleted the issues/nadove-ucsc/2462-allow-null-values-in-pfbs branch August 8, 2023 23:15
@achave11-ucsc achave11-ucsc removed their assignment Aug 8, 2023
Labels
orange [process] Done by the Azul team sandbox [process] Resolution is being verified in sandbox deployment
4 participants