Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🏗️ Major Refactor to Remove Label Button Services #1086

Merged

Conversation

JGreenlee
Copy link
Collaborator

This PR depends on #1061

We have 3 remaining Angular services that work with user inputs - one for each of the 3 buttons we show in the UI (depending on configuration). These are MultiLabelService, EnketoTripButtonService, and EnketoNotesButtonService.

Each of these services has several functions controlling how inputs are formatted and stored in memory, as well as how they are matched to trips. However, the logic between them is not significantly different. When we expanded the possible range of user inputs to include 'additions' for time use, we did so in a way that is configurable and as generic as possible.
Much of the logic between the 3 services is redundant and more complex than necessary.

Furthermore, the services rely on the strategy of 'populating' trips and places with the inputs that get matched to them - but the React ecosystem prefers a one-way flow of information from parent to child, where objects are passed down from a single source of truth and never mutated.

This PR refactors this significantly so that when user inputs are matched to trips/places, the mappings are stored in a single map rather than being mutated into the trip/place objects. This map is read downstream to access all user inputs.

This removes the last major obstacle to having the trip and place objects be fully typed -- the type definitions for CompositeTrip and ConfirmedPlace will be able to match the server.

e-mission#1014 (comment)
Confirmed and composite trips have section summaries that are computed on the server. We can use these here instead of using the raw 'sections'.
Let's also add type definitions for the section summaries.
@JGreenlee
Copy link
Collaborator Author

This has proven to be one of the more challenging tasks of the migration – I knew we'd have to rework this eventually but I've been chipping away at it for the better part of the week and it's still not there yet. I find myself going back and forth on decisions - let me explain.

The old strategy was to keep 2 fields on the trip/place object for each of (i) user inputs and (ii) additions. user_input and additions are the fields that come from the server, while userInput and additionsList were added on the phone. user_input and additions only had processed inputs, while userInput and additionsList contained the merged processed and unprocessed inputs (with duplicates removed).

It was 1) confusing to have properties with such similar names and 2) not congruent with our goal of having the trip/place objects match up to the server-side types.

So my next idea was to have a single map of trip/place IDs to unprocessed inputs and read from that downstream wherever needed. But this would require us to read processed and unprocessed inputs and perform de-duplication every time we want to access the inputs for that trip/place. That could be bad for performance.

Now, I am realizing the map should contain the merged unprocessed and processed inputs, essentially containing the same information that userInput and additionsList had, but contained in a single object instead of on a trip-by-trip basis.
This way, performance should not be affected - it's still just an object lookup.

@JGreenlee
Copy link
Collaborator Author

JGreenlee commented Oct 28, 2023

I've also realized that the structure of user_input differs depending on configuration

In the ENKETO configuration, on a trip with a processed trip survey response, user_input contains the raw data entry for the survey response:

image

But in the MULTILABEL configuration, on a trip with a label, user_input contains just a string:

image

This will limit our ability to have type definitions for user inputs

@JGreenlee
Copy link
Collaborator Author

JGreenlee commented Oct 28, 2023

I suppose we can just expect that if there is a trip_user_input or place_user_input field, it will be a raw data entry (i.e. an object with data and metadata, etc), but any other field (like mode_confirm) we will expect a string.

Later today I'll check the types in the server code to make sure this will work.

Either way, I think we are going to end up with more stringent typings on the phone than on the server

@JGreenlee
Copy link
Collaborator Author

Looking at the server, I realize that Userlabel and Userinput are two different types. I was confused because user labels are also stored on the user_input field despite not being Userinputs.

Preparing to remove the buttons services (e-mission#1086), the filter functions need to be tweaked. `user_input` will no longer be stored on the trip object, it will be accessible in the LabelTabContext, not from these functions, so it will be passed in as a second param to these functions as userInputForTrip.

Also, we'll remove the surveyOpt variable from LabelTabContext since (i) it can just be read directly from appConfig and (ii) we don't want to keep more variables than necessary in LabelTabContext to keep it from getting too cluttered.

- the INVALID_EBIKE filter (along with invalidCheck) was removed as it is not used
- logic for toLabelCheck was written more concisely
- 'width' is no longer needed on the configuredFilters - this was for the old UI
This is a major refactor of how we handle user inputs that removes the need for having separate button services (e-mission#1086)

Instead of populating fields called "userInput" and "additionsList" on trip/place objects, we will store all inputs of each kind in a single object mapping trip/place IDs to inputs. These will be the single source of truth for what inputs are matched to what trips/places and will remove the need for populating/repopulating trips.
These maps will be accessible in LabelTabContext and read downstream wherever user inputs need to be read.

Specific notes on the changes:
- In LabelTab, matching inputs now happens not as a part of 'populating' trips/places, but in a useEffect so that whenever timelineMap is updated, timelineLabelMap and timelineNotesMap are also updated.
- in timelineHelper, we read unprocessedInputs from local and/or server and instead of returning them from 'get' methods, we'll cache them here in timelineHelper and update those caches with 'update' methods.
- Type definition for UserInput renamed to UnprocessedUserInput to avoid confusion - this only applies to unprocessed inputs and is not necessarily the structure that processed user inputs will have
- in diaryHelper, getBaseModeOfLabeledTrip was removed since it was basically just a shortcut to getBaseModeByValue with a trip as param instead of its userInput. All usages replaced with getBaseModeByValue
These are not needed since f7716cf and 4d48aee
As a result of 4d48aee, trips are not populated with the 'verifiability' property. We can handle this with a function call
The last property on the trip/place objects that differs between phone and server is 'getNextEntry' called from the input matching function.
We can instead pass 'nextEntry' as a second parameter to the input matching functions.

This allows us to significantly simplify our typings and ensure that 'CompositeTrip' and 'ConfirmedPlace' here will be the same as they are on the server. Hooray!

-- While adjusting inputMatcher, I also renamed 'getUserInputForTrip' to 'getUserInputForTimelineEntry' (i) to be more consistent and (ii) in case we support a 'place user input' in the future
On trips with no notes/additions, an empty <View> was being rendered (causing a 10px gap due to padding)
If there were no notes, `timelineNotesMap[trip._id.$oid]` would be `undefined`, which does not equal 0.
We should only show notes if it is defined AND the length is not 0, so we can just say `timelineNotesMap[trip._id.$oid]?.length`.
Replaced one use of `Logger` with `displayError`. Now we don't need to use these Angular services here anymore. The only Angular service still used in this file is TimelineHelper.
Before appConfig is loaded, we we are unsure whether MultilabelButtonGroup or UserInputButton should be shown neither. As long as appConfig is undefined we want to show neither. So a simple if/else is not sufficient here, we should specifically check for the presence of either MULTILABEL or ENKETO
In inputMatcher -> mapInputsToTimelineEntries, adds back the logic for handling processed survey response inputs that are already matched to the trip/place on the server. For user_input, we only use it if there is no unprocessed user input (since unprocessed entries will be newer).
For additions, we have to merge processed+unprocessed and then from this, get the non-deleted, unique entries.

in timelineHelper, when storing unprocessedNotes we should remove duplicates according to their write_ts, not by getUniqueEntries(getNotDeletedCandidates(...)), because we need to retain the "DELETED" entries so they can be matched to any processed entries that they may refer to
-added type def for ConfirmedPlace
-create ObjectId instead of repeating `{ $oid: string }`
-expanded user input typings:
  -- this was tricky because the structure of the user input received from the server differs by key. For labels, we just get a string value for the label chosen. For survey responses, we get a actual raw data entry. For the purpose of typing these, I am differentiating them based on whether the key ends in 'user_input' (used for surveys) or if it ends in 'confirm' (used for labels)
-use these new typings in LabelTab and inputMatcher
Since 8712380, the getDetectedModes function no longer goes through the sections of a trip and sums up distances to get percentages. We now use cleaned_section_summary and inferred_section_summary which are computed on the server for us. Updating the test to reflect this causes it to pass again.
@JGreenlee
Copy link
Collaborator Author

JGreenlee commented Oct 31, 2023

I just fixed diaryHelper.test.ts and it's passing, but LoadMoreButton.test.tsx failed on the last commit due to a timeout.

It passed locally and seems to fail only sporadically on Github - and I noticed it has happened on other branches too

@JGreenlee
Copy link
Collaborator Author

Other than that, I think it's time to get at least an initial review on this since it is substantial

@JGreenlee JGreenlee marked this pull request as ready for review October 31, 2023 18:23
@shankari
Copy link
Contributor

shankari commented Nov 2, 2023

So I have some high-level comments here. When we originally designed the multi-labels, we did so a bit hackily, with one entry for each label. Again, the expectation at that time was that partners would create their own UIs and may have different sets of labels, so I created one type for each label.

Then UNSW implemented the surveys, which did not have a neat structure, so we created trip_confirm for it.

But we never went back and revisited the original mode_confirm and purpose_confirm decision. And now that we are changing this substantially and running into issues, maybe the time to revisit has come.
#1086 (comment)

An obvious fix would be to change the multiple label structure to have a single object as well

So instead of separate manual/mode_confirm, manual/purpose_confirm and manual/replaced_mode objects, we can have an object like

{
"metadata": {"key": "manual/trip_multilabel"},
...
"data": {
    "mode":
    "purpose": 
    "replaced_mode":
}
}

and the user input could then be

{"user_input": {"trip_multilabel": {...}}

Or we could even just store the multilabel as a manual/trip_user_input and distinguish between multilabel and enketo based on the survey name or the absence of xmlResponse or something similar.

A design like this should simplify the codebase significantly, not just on the phone but also on the server.
And it shouldn't be that hard to plumb through, given that we already handle trip_user_input properly.
It will require thinking through and implementing migration + backwards compatibility, but once we are done with the migration, everything will be beautiful 😄

Thoughts?

@JGreenlee
Copy link
Collaborator Author

@shankari I do agree that the proposed approach would be much simpler and more beautiful (I would have a slight preference for manual/trip_multilabel still being separate from manual/trip_user_input because the nature of survey XML response vs. simple label value is still fundamentally different).

But is this something we want to tackle now, or once the JS framework rewrite is done? The backwards compat will require careful planning and significant changes on both phone and server (and I suppose dashboard as well??) Although the inconsistency between the structure of manual/*_user_input and manual/*_confirm puzzled me for a day or two, I was able to get things working in this PR, and I would prefer not to hold up progress on the rewrites.

@shankari
Copy link
Contributor

shankari commented Nov 2, 2023

I am fine with rewriting later. I just thought that if we were still stuck on the difference, we could get unstuck this way 😄

@JGreenlee
Copy link
Collaborator Author

Yes, to clarify, I was able to create a full type definition for user_input on trips / places by differentiating the type based on the key. (some cool Typescript features https://www.typescriptlang.org/docs/handbook/2/template-literal-types.html, https://www.typescriptlang.org/docs/handbook/2/objects.html#index-signatures)

type ConfirmedPlace = {
  ...
  ...
  user_input: {
    /* for keys ending in 'user_input' (e.g. 'trip_user_input'), the server gives us the raw user
      input object with 'data' and 'metadata' */
    [k: `${string}user_input`]: UserInputEntry;
    /* for keys ending in 'confirm' (e.g. 'mode_confirm'), the server just gives us the user input value
      as a string (e.g. 'walk', 'drove_alone') */
    [k: `${string}confirm`]: string;
  };
};

And also in the type definition for timelineLabelMap, we have a different type for SURVEY than we do for MODE, PURPOSE, or REPLACED_MODE

export type TimelineLabelMap = {
  [k: string]: {
    /* if the key here is 'SURVEY', we are in the ENKETO configuration, meaning the user input
      value is a raw survey response */
    SURVEY?: UserInputEntry;
    /* all other keys, (e.g. 'MODE', 'PURPOSE') are from the MULTILABEL configuration
      and use a LabelOption for the user input value */
    MODE?: LabelOption;
    PURPOSE?: LabelOption;
    REPLACED_MODE?: LabelOption;
  };
};

@shankari
Copy link
Contributor

shankari commented Nov 2, 2023

Checks are failing in setup.

This was committed by mistake in 5110845
Prettier was not running on these files because they had syntax errors from bad merge conflicts. Fixes syntax on these and runs Prettier to clean up
-add labelOptionByValue function to make it easier to lookup a labelOption by a label value; use it in other functions that need to perform this lookup
-type getFakeEntry, return undefined if falsy input
-verifiabilityForTrip should only consider 'inferred' true if it has any truthy values; not if all values are undefined
-inferFinalLabels: if no usable inferences, return empty object rather than object with undefined values
@JGreenlee
Copy link
Collaborator Author

Those were some tough merge conflicts, but they are resolved now.

@shankari shankari changed the base branch from service_rewrite_2023 to master November 10, 2023 05:24
@shankari shankari changed the base branch from master to service_rewrite_2023 November 10, 2023 05:24
Something got jumbled when pulling in upstream changes + resolving conflicts
This was also likely due to bad merging from upstream changes
Copy link
Contributor

@shankari shankari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Phew! That was a big change but very clean. I have some comments, mainly questions and high-level and/or long-term improvements. I would suggest working on the new tests first.

Merging this, and moving it to the cleanup pile to unblock others...

www/__tests__/inputMatcher.test.ts Show resolved Hide resolved
www/js/diary/LabelTabContext.ts Show resolved Hide resolved
www/js/types/diaryTypes.ts Show resolved Hide resolved
www/js/types/diaryTypes.ts Show resolved Hide resolved
@@ -6,7 +6,7 @@ import React, { useContext, useState } from 'react';
import moment from 'moment';
import { Modal } from 'react-native';
import { Text, Button, DataTable, Dialog } from 'react-native-paper';
import { LabelTabContext } from '../../diary/LabelTab';
import LabelTabContext from '../../diary/LabelTabContext';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own edification, what does this change do?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved LabelTabContext out of LabelTab.tsx into its own file so anywhere it is used, the import path changed.

LabelTabContext has type definitions now while it used to just be an any, and I didn't want all that cluttering up LabelTab.tsx.

Comment on lines +91 to +99
const notesResults = comboResults.slice(labelsPromises.length).flat(2);
// fill in the unprocessedLabels object with the labels we just read
labelResults.forEach((r, i) => {
if (appConfig.survey_info?.['trip-labels'] == 'ENKETO') {
unprocessedLabels['SURVEY'] = r;
} else {
unprocessedLabels[getLabelInputs()[i]] = r;
}
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at some point, I do want to unify the label handling between ENKETO and MULTILABEL so we can get rid of all this complexity.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

return Promise.all([...labelsPromises, ...notesPromises]).then((comboResults) => {
const labelsConfirmResults = {};
const notesConfirmResults = {};
function updateUnprocessedInputs(labelsPromises, notesPromises, appConfig) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there any new tests for this new function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should test updateUnprocessedInputs / updateLocalUnprocessedInputs / updateAllUnprocessedInputs in Katie's #1085 which I offered to help write tests for anyway

@@ -77,13 +78,6 @@ export function getBaseModeByKey(
return BaseModes[key] || BaseModes.UNKNOWN;
}

export function getBaseModeOfLabeledTrip(trip, labelOptions) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this just unused? We do still use base modes for the labeled trips, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was used in 1 or 2 places but getBaseModeByValue does the same thing

(from 4d48aee):

- in diaryHelper, getBaseModeOfLabeledTrip was removed since it was basically just a shortcut to getBaseModeByValue with a trip as param instead of its userInput. All usages replaced with getBaseModeByValue

Comment on lines +89 to +90
// whenever timelineMap is updated, map unprocessed inputs to timeline entries, and
// update the displayedEntries according to the active filter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this will be called every time the data is reloaded, and also that when we get new user inputs. But it looks like it maps all inputs to timeline entries. Is there a way to only do so for new inputs - i.e. incrementally?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically, I think so. But we'd have to be really careful about it because input matching relies on nextEntry in some cases.

If you load only week 1, the last trip of week 1 has no nextEntry. Once you load week 2, that last trip of week 1 still needs to be re-matched because the input matched to it before might actually belong on the first trip of week 2. If we didn't re-match the last trip of week 1, we could end up with a duplicate or something.

So I think the safest thing to do is just treat input matching holistically.

And I suspect network/ retrieval time is by far the bottleneck for Label tab slowness - any performance savings here are negligible compared to that

@shankari shankari merged commit 4992e8d into e-mission:service_rewrite_2023 Nov 10, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Address code reviews with "future"
Development

Successfully merging this pull request may close these issues.

2 participants