Google Cloud Data Loss Prevention (DLP) for XML data: an example of invoking the REST API

It took me some trial and error to decipher the Google Cloud documentation for content:deidentify. This is the request that worked for me.

POST :dlp/v2/projects/:project/content:deidentify
content-type: application/json
x-goog-user-project: :project
Authorization: Bearer :token

{
  "inspectConfig": {
    "infoTypes": [ { "name": "EMAIL_ADDRESS" }, { "name": "PHONE_NUMBER" }, { "name": "URL" } ]
  },
  "deidentifyConfig": {
    "infoTypeTransformations": {
      "transformations": [ {
        "infoTypes": [
          {
            "name": "URL"
          }
        ],
        "primitiveTransformation": {
          "characterMaskConfig": {
            "numberToMask": -8,
            "reverseOrder": true
          }
        }
      },
      {
        "infoTypes": [
          {
            "name": "PHONE_NUMBER"
          }
        ],
        "primitiveTransformation": {
          "characterMaskConfig": {
            "numberToMask": -1,
            "reverseOrder": false,
            "charactersToIgnore": [
              {
                "charactersToSkip": ".-"
              }
            ]
          }
        }
      },
      {
        "infoTypes": [
          {
            "name": "EMAIL_ADDRESS"
          }
        ],
        "primitiveTransformation": {
          "characterMaskConfig": {
            "numberToMask": -3,
            "reverseOrder": false,
            "charactersToIgnore": [
              {
                "charactersToSkip": ".@"
              }
            ]
          }
        }
      } ]
    }
  },
  "item": {
    "value": "<doc xmlns=\"urn:932F4698-0A64-49D4-963F-E6615BC399E8\">  <Name>Marcia</Name>  <URL>https://marcia.com</URL>  <Email>marcia@example.com</Email><Phone>412-343-0919</Phone></doc>"
  }
}
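
To script this outside an HTTP client, the request above could be sketched in Python roughly as follows. This is a minimal sketch using only the standard library, not a definitive implementation. It assumes that :dlp expands to https://dlp.googleapis.com and that the transformed text comes back at item.value in the response. The helper names mask, build_request, and deidentify are made up for illustration.

```python
import json
import urllib.request

# Assumption: this is what the :dlp variable expands to in the request above.
DLP_ENDPOINT = "https://dlp.googleapis.com"


def mask(info_type, number_to_mask, reverse_order=False, skip=None):
    """Build one characterMaskConfig transformation for a single infoType."""
    config = {"numberToMask": number_to_mask, "reverseOrder": reverse_order}
    if skip:
        config["charactersToIgnore"] = [{"charactersToSkip": skip}]
    return {
        "infoTypes": [{"name": info_type}],
        "primitiveTransformation": {"characterMaskConfig": config},
    }


def build_request(project, token, xml_text):
    """Assemble the POST request without sending it."""
    url = f"{DLP_ENDPOINT}/v2/projects/{project}/content:deidentify"
    headers = {
        "content-type": "application/json",
        "x-goog-user-project": project,  # quota project; avoids the 403
        "Authorization": f"Bearer {token}",
    }
    body = {
        "inspectConfig": {
            "infoTypes": [
                {"name": "EMAIL_ADDRESS"},
                {"name": "PHONE_NUMBER"},
                {"name": "URL"},
            ]
        },
        "deidentifyConfig": {
            "infoTypeTransformations": {
                "transformations": [
                    mask("URL", -8, reverse_order=True),
                    mask("PHONE_NUMBER", -1, skip=".-"),
                    mask("EMAIL_ADDRESS", -3, skip=".@"),
                ]
            }
        },
        "item": {"value": xml_text},
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def deidentify(project, token, xml_text):
    """Send the request and return the de-identified text (needs network)."""
    with urllib.request.urlopen(build_request(project, token, xml_text)) as resp:
        # Assumption: the response carries the transformed text at item.value.
        return json.load(resp)["item"]["value"]
```

Building the request separately from sending it makes it easy to print and inspect the JSON body before spending quota. The token could come from, for example, gcloud auth print-access-token.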

Notes:

  • You need to specify the header x-goog-user-project: :project (replacing :project with your own project ID); otherwise you will get a 403 error like this:
    {
       "error": {
         "code": 403,
         "message": "Your application is authenticating by using local Application Default Credentials. The dlp.googleapis.com API requires a quota project, which is not set by default. To learn how to set your quota project, see https://cloud.google.com/docs/authentication/adc-troubleshooting/user-creds .",
         "status": "PERMISSION_DENIED",
         "details": [
           {
             "@type": "type.googleapis.com/google.rpc.ErrorInfo",
             "reason": "SERVICE_DISABLED",
             "domain": "googleapis.com",
             "metadata": {
               "service": "dlp.googleapis.com",
               "consumer": "projects/325555555"
             }
           }
         ]
       }
    }
  • You can reference a de-identify template by name instead of supplying the de-identify config inline. An example follows:
    POST :dlp/v2/projects/:project/content:deidentify
    content-type: application/json
    x-goog-user-project: :project
    Authorization: Bearer :token
    
    {
      "inspectConfig": {
        "infoTypes": [ { "name": "EMAIL_ADDRESS" }, { "name": "PHONE_NUMBER" }, { "name": "URL" } ]
      },
      "deidentifyTemplateName": "projects/my-project-name-12345/deidentifyTemplates/3816550063387353440",
      "item": {
        "value": "<doc xmlns=\"urn:932F4698-0A64-49D4-963F-E6615BC399E8\">  <Name>Marcia</Name>  <URL>https://marcia.com</URL>  <Email>marcia@example.com</Email><Phone>412-343-0919</Phone></doc>"
      }
    }
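
The two notes above can be sketched in Python as well. This is a hedged illustration, not part of any Google client library: build_template_body and describe_error are hypothetical helper names, and the error shape is taken only from the 403 payload shown above.

```python
import json


def build_template_body(template_name, xml_text,
                        info_types=("EMAIL_ADDRESS", "PHONE_NUMBER", "URL")):
    """Build the request body that references a de-identify template by name."""
    return {
        "inspectConfig": {"infoTypes": [{"name": n} for n in info_types]},
        "deidentifyTemplateName": template_name,
        "item": {"value": xml_text},
    }


def describe_error(payload):
    """Summarize an error payload in one line, or return None if there is none.

    Assumes the shape of the 403 example above: an "error" object with
    "code", "status", and a "details" list whose entries may carry "reason".
    """
    error = payload.get("error")
    if not error:
        return None
    reasons = [d["reason"] for d in error.get("details", []) if "reason" in d]
    reason_text = ", ".join(reasons) or "no reason given"
    return f'{error.get("code")} {error.get("status")} ({reason_text})'


if __name__ == "__main__":
    body = build_template_body(
        "projects/my-project-name-12345/deidentifyTemplates/3816550063387353440",
        "<doc><Email>marcia@example.com</Email></doc>",
    )
    print(json.dumps(body, indent=2))
```

A summary line such as the one describe_error returns is handy in logs when the x-goog-user-project header has been forgotten.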