As of March 2021, Presidio had undergo a revamp to a new version refereed to as V2.
The main changes introduced in V2 are:
-
gRPC replaced with HTTP to allow more customizable APIs and easier debugging
-
Focus on the Analyzer and Anonymizer services.
- Presidio Anonymizer is now Python based and pip installable.
- Presidio Analyzer does not use templates and external recognizer store.
- Image Redactor (formerly presidio-image-anonymizer) is in early beta and is Python based and pip installable.
- Other services are deprecated and potentially be migrated over time to V2 with the help of the community.
-
Improved documentation, samples and build flows.
-
Format Preserving Encryption replaced with Advanced Encryption Standard (AES)
Version V1 (legacy) is still available for download. To continue using the previous version:
- For docker containers, use tag=v1
- For python packages, download version < 2 (e.g. pip install presidio-analyzer==0.95)
!!! note "Note" The legacy V1 code base will continue to be available under branch V1 but will no longer be officially supported.
The move from gRPC to HTTP based APIs included changes to the API requests.
-
Change in payload - moving from structures to jsons.
-
Removing templates from the API - includes flattening the json.
-
Using snake_case instead of camelCase .
Below is a detailed outline of all the changes done to the Analyzer and Anonymizer.
{
"text": "My phone number is 212-555-5555",
"AnalyzeTemplateId": "1234",
"AnalyzeTemplate": {
"Fields": [
{
"Name": "PHONE_NUMBER",
"MinScore": "0.5"
}
],
"AllFields": true,
"Description": "template description",
"CreateTime": "template creation time",
"ModifiedTime": "template modification time",
"Language": "fr",
"ResultsScoreThreshold": 0.5
}
}
{
"text": "My phone number is 212-555-5555",
"entities": ["PHONE_NUMBER"],
"language": "en",
"correlation_id": "213",
"score_threshold": 0.5,
"trace": true,
"return_decision_process": true
}
{
"text": "hello world, my name is Jane Doe. My number is: 034453334",
"template": {
"description": "DEPRECATED",
"create_time": "DEPRECATED",
"modified_time": "DEPRECATED",
"default_transformation": {
"replace_value": {...},
"redact_value": {...},
"hash_value": {...},
"mask_value": {...},
"fpe_value": {...}
},
"field_type_transformations": [
{
"fields": [
{
"name": "FIRST_NAME",
"min_score": "0.2"
}
],
"transfomarion": {
"replace_value": {...},
"redact_value": {...},
"hash_value": {...},
"mask_value": {...},
"fpe_value": {...}
}
}
],
"analyze_results": [
{
"text": "Jane",
"field": {
"name": "FIRST_NAME",
"min_score": "0.5"
},
"location": {
"start": 24,
"end": 32,
"length": 6
},
"score": 0.8
}
]
}
}
{
"text": "hello world, my name is Jane Doe. My number is: 034453334",
"anonymizers": {
"DEFAULT": {
"type": "replace",
"new_value": "val"
},
"PHONE_NUMBER": {
"type": "mask",
"masking_char": "*",
"chars_to_mask": 4,
"from_end": true
}
},
"analyzer_results": [
{
"start": 24,
"end": 32,
"score": 0.8,
"entity_type": "NAME"
},
{
"start": 24,
"end": 28,
"score": 0.8,
"entity_type": "FIRST_NAME"
},
{
"start": 29,
"end": 32,
"score": 0.6,
"entity_type": "LAST_NAME"
},
{
"start": 48,
"end": 57,
"score": 0.95,
"entity_type": "PHONE_NUMBER"
}
]
}
Specific for each anonymization type:
Anonymization name | Legacy format (V1) | New json format (V2) |
---|---|---|
Replace | string newValue = 1; |
{ "new_value": "VALUE" } |
Redact | NONE | NONE |
Mask | string maskingCharacter = 1; |
{ |
Hash | NONE | {"hash_type": "VALUE"} |
FPE (now Encrypt) | string key = 3t6w9z$C&F)J@NcR; |
{"key": "3t6w9z$C&F)J@NcR"} |
!!! note "Note" The V2 API keeps changing please follow the change log for updates.