To onboard ULI Pilot participants and test the matching algorithm, we need to be able to ingest each org one at a time and return the possible duplicates, with their confidence scores, in a payload.
The matching algorithm this service uses will also be queried from an interactive search form down the road, so the two should share the same search functions.
ULI Service
Create an endpoint that can ingest new data from a given Provider UOI.
A request to this endpoint would look something like the following:
Each item in the value array is a ULI data structure matching the above schema.
As such, schema validation should be performed to ensure that a proper subset of the above dictionary fields is present and that the records are passed in the value array. If so, respond with a 200; if not, respond with a 400 that tells the caller which field(s) caused the error. Use ajv or Yup for schema validation.
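Until the real ULI schema is pinned down, the check can be sketched without a library; in practice ajv or Yup would replace this with a declarative schema, and the field names below are hypothetical placeholders, not the actual ULI fields:

```javascript
// Dependency-free sketch of the payload check described above.
// KNOWN_FIELDS is a placeholder list; the real ULI schema would drive
// an ajv or Yup schema instead of this hand-rolled check.
const KNOWN_FIELDS = ['FirstName', 'LastName', 'Email', 'LicenseNumber', 'StateLicensedIn'];

function validatePayload(payload) {
  // The payload must carry its records in a "value" array.
  if (!payload || !Array.isArray(payload.value)) {
    return { status: 400, errors: ['Payload must contain a "value" array.'] };
  }
  const errors = [];
  payload.value.forEach((record, i) => {
    // Each record may contain a subset of the known fields, but nothing else.
    const unknown = Object.keys(record).filter((k) => !KNOWN_FIELDS.includes(k));
    if (unknown.length) errors.push(`Record ${i}: unknown field(s): ${unknown.join(', ')}`);
    if (Object.keys(record).length === 0) errors.push(`Record ${i}: no recognized fields present`);
  });
  // 200 when everything checks out; 400 naming the offending fields otherwise.
  return errors.length ? { status: 400, errors } : { status: 200, errors: [] };
}
```

The 400 branch carries the offending field names back to the caller, as required above.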
Sync Orgs with UOI Production Sheet
We should also validate UOIs against the reference sheet. This will mean we'll have a cache of orgs running locally on the API server.
Later on, we'll need the ability to refresh from the UOI sheet. We could take this from the Cert API - just the Sync service and UI. We don't need to do this by the Jan 7 demo though. For now, have a singleton service that populates the cache the first time it's used, and then returns what's there until restarted. If we reuse what's in Cert then we can deal with it that way.
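A minimal sketch of that interim singleton, where fetchOrgsFromSheet is a hypothetical stand-in for however the UOI reference sheet (or the Cert Sync service) gets read:

```javascript
// Interim singleton org cache: populated the first time it's used,
// then served as-is until the process restarts.
// fetchOrgsFromSheet is a hypothetical stand-in for the real sheet fetch.
function createOrgCache(fetchOrgsFromSheet) {
  let orgs = null; // null until the first successful load

  return {
    async getOrgs() {
      if (orgs === null) {
        orgs = await fetchOrgsFromSheet();
      }
      return orgs;
    },
    // Validate an inbound UOI against the cached reference sheet.
    async isValidUoi(uoi) {
      const known = await this.getOrgs();
      return known.some((org) => org.uoi === uoi);
    },
  };
}
```

A refresh endpoint later would just reset `orgs` to null (or call the Cert Sync service directly).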
Once the ingest job has been started, a request to the queue won't return records right away; it will take some time before there are results. We'll need some kind of notification in the future.
Once each record gets pushed, we'll run a scavenging job on it against what's already in the system.
To do this, we'll dynamically form a query based on the information each record contains, according to the matching formula. We'll also support a variable set of weights through a separate index, which would eventually have its own UI in production.
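A sketch of that dynamic query construction, assuming an Elasticsearch-style bool/should query; the field names and weight values here are invented placeholders for what the separate weights index would supply:

```javascript
// Build a weighted match query from whichever fields the inbound record has.
// The weights would really come from the separate, UI-managed weights index;
// these field names and values are illustrative only.
const DEFAULT_WEIGHTS = { LicenseNumber: 3.0, Email: 2.5, LastName: 1.0, FirstName: 0.5 };

function buildMatchQuery(record, weights = DEFAULT_WEIGHTS) {
  // Only fields the record actually carries (and that have a weight) are queried.
  const should = Object.entries(record)
    .filter(([field, value]) => field in weights && value != null)
    .map(([field, value]) => ({
      match: { [field]: { query: value, boost: weights[field] } },
    }));
  return { query: { bool: { should, minimum_should_match: 1 } } };
}
```

Swapping in a different weights document changes the ranking without touching the query-building code.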
In the case where a new ULI can be created or an existing match is suggested, we'll use a format similar to the following, where the ULI itself is a uuid that uses the RESO URN namespace (3.4.3):
The API will classify the matching events into event types such as ULI Assigned and ULI Suggested.
ULI Assigned - in this case, a ULI was assigned since there was no matching record within the configurable confidence threshold. A confidence score will be shown for the item including the fields that were matched on. A ULI can also be assigned through the review process.
ULI Suggested - for each inbound record there may be one or more suggested ULIs pertaining to that record. They will also be shown with their confidence scores and which fields they match on.
Processing - we'll also likely want access to the records currently being processed, so there should be a third event type of Processing, but it shouldn't be returned unless asked for.
There may be other events added in the future, but this is a good start.
As such, we'll want the root path to also take an optional eventType parameter. Not specifying a type returns all items except Processing. The ScoringFactors data should be accessible by having Elastic explain each query; it's not required for MVP, just a nice-to-have.
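The eventType filter could behave like this sketch, with Processing excluded unless it is explicitly asked for, per the above:

```javascript
// Filter queue events by an optional eventType. With no type given,
// everything except Processing is returned; Processing must be
// requested explicitly.
function filterEvents(events, eventType) {
  if (eventType) {
    return events.filter((e) => e.eventType === eventType);
  }
  return events.filter((e) => e.eventType !== 'Processing');
}
```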
Due Date: Demo on Jan 7 2022.