A model's inference definition can reveal personally identifiable information (PII) through the categorical encoding maps stored in it. This is usually not a problem, since the access permissions for reviewing model definitions are the same as for reviewing the training datasets where the PII occurred.
However, there is no reason to store the original categorical strings in the model. For the learning algorithm, it is sufficient to use a distinct representation of each category produced by a cryptographic hash function.
Note that the encodings need to be unique only within the same feature, which reduces the complexity required of the hash function.
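A minimal sketch of this idea in Python, assuming a per-feature encoding map built from truncated SHA-256 digests (the `digest_bytes` truncation parameter and the collision check are illustrative assumptions, not part of the proposal):

```python
import hashlib

def hash_encode(categories, digest_bytes=4):
    """Map each category string to a short hex digest of its SHA-256 hash.

    Since encodings only need to be unique within one feature, a short
    prefix of the digest suffices; lengthen it if a collision occurs.
    """
    encoding = {
        c: hashlib.sha256(c.encode("utf-8")).hexdigest()[: 2 * digest_bytes]
        for c in categories
    }
    # Verify uniqueness within this feature.
    if len(set(encoding.values())) != len(encoding):
        raise ValueError("hash collision within feature; increase digest_bytes")
    return encoding

# The stored model would then map digests, not raw PII strings.
emails = ["alice@example.com", "bob@example.com"]
codes = hash_encode(emails)
```

Note that an unsalted hash of a low-entropy category (such as an email address) is still vulnerable to a dictionary attack; the salted variant discussed below addresses that.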
Both Google and Facebook use cryptographic hash functions to encode PII. Google suggests using SHA-256 with a salt, while Facebook does not explicitly name the hash algorithm it uses.
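The "SHA256 + salt" scheme can be sketched with Python's standard library; using HMAC-SHA256 as the keyed construction is an assumption here (the sources do not specify how the salt is combined with the value):

```python
import hashlib
import hmac
import secrets

# The salt is kept secret and reused across a feature, so the encoding is
# deterministic for that feature but resists dictionary attacks on the
# hashed values.
salt = secrets.token_bytes(16)

def salted_encode(value: str, salt: bytes) -> str:
    # HMAC-SHA256 keyed with the salt, applied to the category string.
    return hmac.new(salt, value.encode("utf-8"), hashlib.sha256).hexdigest()

token = salted_encode("alice@example.com", salt)
```

The same salt must be available at both training and inference time for the encodings to match.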