Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backend for Portfolio Groups #83

Merged
merged 2 commits into from
Dec 18, 2023
Merged

Backend for Portfolio Groups #83

merged 2 commits into from
Dec 18, 2023

Conversation

gbdubs
Copy link
Contributor

@gbdubs gbdubs commented Dec 18, 2023

This PR is the backend half of one I worked on this weekend to build out our portfolio group management capabilities.

@gbdubs gbdubs requested a review from bcspragu December 18, 2023 15:29
"go.uber.org/zap"
)

func populateAll[Source any, TargetID ~string, Target any](
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a clever use of generics, but it's also hard to understand what it's doing, both here and where its called. I'd add a comment or similar.

Also, it looks like we're doing that "populate a pre-existing pointer" thing, which I begrudgingly accepted was useful in a GraphQL system for populating arbitrary recursive responses, but in a standard RESTful server, I can't see the motivation for such gymnastics when populating a static, non-recursive structure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GC on commenting this to make it clearer.

Not sure of the distinction you're drawing on that one: we have recursive structures in our data model here too: Portfolio.MemberOf is a PortfolioGroupMembership,which has a Portfolio field (same with PortfolioGroup). The value of these selsective population things is exactly that we can build a DB layer that is "dumb", and then populate the data we want selectively based on the context.

This is the ~the same pattern we used for Abound, ex, because the data model is actually really similar. Portfolio:Art, Gallery:PortfolioGroup, ex.

cmd/server/pactasrv/populate.go Outdated Show resolved Hide resolved
Comment on lines +14 to +29
const portfolioQueryStanza = `
SELECT
portfolio.id,
portfolio.owner_id,
portfolio.name,
portfolio.description,
portfolio.created_at,
portfolio.holdings_date,
portfolio.blob_id,
portfolio.admin_debug_enabled,
portfolio.number_of_rows,
portfolio_group_membership.portfolio_group_id,
portfolio_group_membership.created_at
FROM portfolio
LEFT JOIN portfolio_group_membership
ON portfolio_group_membership.portfolio_id = portfolio.id
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted below, but the downside of this approach is that we repeatedly load all the portfolio information, once for each group membership. As long as the portfolio metadata is a small known quantity (e.g. no giant blob columns), and we don't expect the cardinality of groups per portfolio to be in the thousands, this is likely fine, but otherwise I'd just batch load membership info in a separate query given all the portfolio IDs.

Same deal for PortfolioGroups

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's the case - the portfolio_group_membership table is just the join table, not the raw portfolio table, so even if we stored a blob on the portfolio table, this wouldn't pull that in.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An example to better illustrate my point:

CREATE TABLE portfolio (
	id TEXT NOT NULL PRIMARY KEY,
	some_giant_blob TEXT NOT NULL
);

CREATE TABLE some_join_table (
	id TEXT NOT NULL PRIMARY KEY,
	portfolio_id TEXT NOT NULL REFERENCES portfolio (id)
);

INSERT INTO portfolio (id, some_giant_blob) VALUES ('p1', 'imagine this is huge');
INSERT INTO some_join_table (id, portfolio_id) VALUES ('sjt1', 'p1'), ('sjt2', 'p1'), ('sjt3', 'p1');

SELECT
	portfolio.id,
	portfolio.some_giant_blob,
	some_join_table.id
FROM portfolio
LEFT JOIN some_join_table
	ON some_join_table.portfolio_id = portfolio.id;

 id |   some_giant_blob    |  id
----+----------------------+------
 p1 | imagine this is huge | sjt1
 p1 | imagine this is huge | sjt2
 p1 | imagine this is huge | sjt3
(3 rows)

See how we load the portfolio once for each relationship in the other table? That's potentially a lot of extra data to transmit and parse if the portfolio is large or there are a lot of groups.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I misunderstood this - I see the point and I know a fix! I'll pursue this afternoon.

if err != nil {
return nil, fmt.Errorf("querying portfolio: %w", err)
}
pvs, err := rowsToPortfolios(rows)
if err != nil {
return nil, fmt.Errorf("translating rows to portfolios: %w", err)
}
return exactlyOne("portfolio", id, pvs)
return exactlyOne("portfolio", id, valuesFromMap(pvs))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had initially complained that it seemed silly to query by primary key with query instead of queryRow, only to then verify we got a single entry. Now I'm doubly complaining because we're loading what we know is zero or one result(s) into a map, to load it into an array, to verify we got a single entry. Not that this is anywhere near a bottleneck, but it would be simpler and more efficient just to use queryRow and check for ErrNoRows or whatever it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the diagnosis of layercaking here, but the benefit IMO outweighs the extra layer. One code pathway for decoding rows allows us to build code that is (a) more resilient to changes (i.e. we couldn't read a field in one case but not in another), (b) less work to maintain (i.e. a change to the data model needs to be made in fewer places)

db/sqldb/portfolio.go Outdated Show resolved Hide resolved
db/sqldb/portfolio.go Show resolved Hide resolved
openapi/pacta.yaml Outdated Show resolved Hide resolved
@gbdubs gbdubs merged commit 3aab8bf into main Dec 18, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants