Even from Windows, Emacs Tramp mode is Terrific

I’m feeling so thankful for the smart people that built tramp.el for emacs. For those who don’t know, tramp stands for “Transparent Remote Access, Multiple Protocols”, and it allows me to use the emacs running on my home laptop to edit files on a remote machine that I connect to via SSH. So I can edit files on my raspberry pi, or on my cloud shell machine, or on my remote host at nearlyfreespeech. All from the same emacs. It feels pretty magical at first, and then it just sort of fades into routine, like any sufficiently advanced technology. I don’t even think about it while using it.

Recently I wanted to require FIDO key authentication for the ssh connection to a raspberry pi device. The server side is pretty easy – just add the appropriate public key to ~/.ssh/authorized_keys. Generating a key is also pretty easy – follow the instructions from Yubico, or from other sources. Support for retrieving a private key from a security key, like a Yubico key, has been in OpenSSH since version 8.2 (I think). But the OpenSSH on Windows lags a little bit; I think the feature finally worked in v8.9 of the Windows build. The “builtin” OpenSSH that gets installed alongside Powershell is v8.1p1.

PS > c:\windows\System32\OpenSSH\ssh.exe -V
OpenSSH_for_Windows_8.1p1, LibreSSL 3.0.2

Which is not sufficient. What to do?

Upgrade OpenSSH on the Windows machine, of course. You can get releases here. I installed v9.2.2.0. That allowed me to run ssh-keygen from a terminal to generate a key of type ecdsa-sk, and store it on the Yubico key. And I could also run ssh from the terminal to connect to the rpi, after confirming my presence by touching the security key. All good. The only trick here was to ensure I was using the correct version of OpenSSH. The newly installed OpenSSH did not overwrite the version in \windows\system32. Instead it appeared in Program Files:

PS > c:\progra~1\OpenSSH\ssh.exe -V
OpenSSH_for_Windows_9.2p1, LibreSSL 3.7.2
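
For reference, generating the key with the new tooling was roughly this (the output file name is illustrative); ssh-keygen prompts for a touch on the security key during generation:

PS > c:\progra~1\OpenSSH\ssh-keygen.exe -t ecdsa-sk -f $env:USERPROFILE\.ssh\id_ecdsa_sk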

The next trick was persuading tramp.el in emacs to use the appropriate executable. This is done by twiddling the tramp-methods variable. For me, this worked:

(setf
 (car (alist-get "sshx" tramp-methods nil nil #'equal))
 '(tramp-login-program "C:/Progra~1/OpenSSH/ssh.exe"))

And then, after setting up my ~/.ssh/config, I can use a filespec like /sshx:rpi:/home to open a remote file, confirming with a touch on my security key.
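
For completeness, the relevant ~/.ssh/config entry is something like this; the host alias, address, user, and key file name here are all illustrative:

Host rpi
  HostName 192.168.7.100
  User pi
  IdentityFile ~/.ssh/id_ecdsa_sk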

It’s satisfying when things work.

Cloud Run instances may shut down at any time

This was a bit of a surprise to me, so I thought I’d note it down. I’ve been using Cloud Run for serverless hosting of containerized apps. The whole experience is really slick. I can provide a source directory for something like a Java app or a nodejs app, and Cloud Run will build the container for me, when I use something like:

gcloud run deploy my-service --source . --allow-unauthenticated --region us-west1

You can also set minimum and maximum instances with these options:

--min-instances 1
--max-instances 2

But what is “minimum” anyway? What I did not realize is this fact, taken from the Cloud Run documentation:

For Cloud Run services, an idle instance can be shut down at any time, including instances kept warm via a minimum number of instances.

You have been warned.

What this means to me is, I need to design my services so that they can always start up, and re-configure themselves. Any state they maintain in memory needs to be … persisted somewhere, so that in the case of shutdown and restart, that state can be re-applied.

I should not have been surprised by this.

Google Cloud Data Loss Prevention (DLP) for XML data; an example of invoking the REST API

I worked a little bit to decipher the documentation for content:deidentify from Google Cloud. After some trial and error, this is what worked for me.

POST :dlp/v2/projects/:project/content:deidentify
content-type: application/json
x-goog-user-project: :project
Authorization: Bearer :token

{
  "inspectConfig": {
    "infoTypes": [ { "name": "EMAIL_ADDRESS" }, { "name": "PHONE_NUMBER" }, { "name": "URL" } ]
  },
  "deidentifyConfig": {
    "infoTypeTransformations": {
      "transformations": [ {
        "infoTypes": [
          {
            "name": "URL"
          }
        ],
        "primitiveTransformation": {
          "characterMaskConfig": {
            "numberToMask": -8,
            "reverseOrder": true,
          }
        }
      },
      {
        "infoTypes": [
          {
            "name": "PHONE_NUMBER"
          }
        ],
        "primitiveTransformation": {
          "characterMaskConfig": {
            "numberToMask": -1,
            "reverseOrder": false,
            "charactersToIgnore": [
              {
                "charactersToSkip": ".-"
              }
            ]
          }
        }
      },
      {
        "infoTypes": [
          {
            "name": "EMAIL_ADDRESS"
          }
        ],
        "primitiveTransformation": {
          "characterMaskConfig": {
            "numberToMask": -3,
            "reverseOrder": false,
            "charactersToIgnore": [
              {
                "charactersToSkip": ".@"
              }
            ]
          }
        }
      } ]
    }
  },
  "item": {
    "value": "<doc xmlns=\"urn:932F4698-0A64-49D4-963F-E6615BC399E8\">  <Name>Marcia</Name>  <URL>https://marcia.com</URL>  <Email>marcia@example.com</Email><Phone>412-343-0919</Phone></doc>"
  }
}
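
If you prefer curl, the equivalent invocation is roughly the following, saving the JSON payload above to request.json, using gcloud to obtain an access token, and substituting your own project id for my-project-name-12345:

curl -X POST "https://dlp.googleapis.com/v2/projects/my-project-name-12345/content:deidentify" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "x-goog-user-project: my-project-name-12345" \
  -H "Content-Type: application/json" \
  -d @request.json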

Notes:

  • As described here, you need to specify the header x-goog-user-project: :project (obviously replacing :project with your own project name), otherwise you will get a dreaded 403 error message, like this:
    {
       "error": {
         "code": 403,
         "message": "Your application is authenticating by using local Application Default Credentials. The dlp.googleapis.com API requires a quota project, which is not set by default. To learn how to set your quota project, see https://cloud.google.com/docs/authentication/adc-troubleshooting/user-creds .",
         "status": "PERMISSION_DENIED",
         "details": [
           {
             "@type": "type.googleapis.com/google.rpc.ErrorInfo",
             "reason": "SERVICE_DISABLED",
             "domain": "googleapis.com",
             "metadata": {
               "service": "dlp.googleapis.com",
               "consumer": "projects/325555555"
             }
           }
         ]
       }
     }
  • You can also specify the de-identify config as a template. An example follows:
    POST :dlp/v2/projects/:project/content:deidentify
    content-type: application/json
    x-goog-user-project: :project
    Authorization: Bearer :token
    
    {
      "inspectConfig": {
        "infoTypes": [ { "name": "EMAIL_ADDRESS" }, { "name": "PHONE_NUMBER" }, { "name": "URL" } ]
      },
      "deidentifyTemplateName": "projects/my-project-name-12345/deidentifyTemplates/3816550063387353440",
      "item": {
        "value": "<doc xmlns=\"urn:932F4698-0A64-49D4-963F-E6615BC399E8\">  <Name>Marcia</Name>  <URL>https://marcia.com</URL>  <Email>marcia@example.com</Email><Phone>412-343-0919</Phone></doc>"
      }
    }
    

How to Compute an HTTP Signature for Mastodon (and an example in NodeJS)

I am reading this documentation from Mastodon.

And from it, I understand that Mastodon requires an HTTP Signature, signing at least these headers:
(request-target) host date digest

If your client is written in JavaScript and runs on Nodejs, an example for how to build a signature is given on the npmjs site.

But I believe this example is out of date. It does not use the (request-target) pseudo header. So that’s not gonna work.

So what must you do? Go back to the Mastodon documentation. Unfortunately, that, too, is either out of date or confusing. For a POST request, the Mastodon documentation states that you must first compute the “RSA-SHA256 digest hash of your request’s body”. This is not correct. There is no such thing as an “RSA-SHA256 digest”! RSA-SHA256 is not the name of a digest. Message digests include SHA1, SHA256, MD5 (old and insecure at this point) and others. According to my reading of the code, Mastodon supports only SHA-256 digests. The documentation should state that you must compute the “SHA256 digest”. (There is no RSA key involved in computing a digest.)

Regardless of the digest algorithm you use, the computed digest is a byte array. That brings us to the next question: how to encode that byte array as a string, in order to pass it to Mastodon. Some typical options for encoding are: hex encoding (aka base16 encoding), base64 encoding, or base64-url encoding. The documentation does not state which of those encodings is accepted. Helpfully, the example provided in the documentation shows a digest string that appears to be hex-encoded. Unhelpfully, again according to my reading of the code, Mastodon requires a base64-encoded digest!

With these gaps and misleading statements in the documentation, I think it would be nearly impossible for a neophyte to navigate it and successfully implement a client that produces a verifiable signature. Here is what you actually need to do:

  1. produce the POST body
  2. compute the SHA-256 digest of the POST body, including all whitespace and leading or trailing newlines. Try this online tool to help you verify your work.
  3. Encode that computed digest (which is a byte array) with base64. This should produce a string of about 44 characters.
  4. Set the Digest header to be SHA-256=xxxyyyy, where xxxyyyy is the base64 encoding of the SHA-256 digest.
  5. Set the http headers for the pending outbound request to include at least host, date, and digest.
  6. compute the signature following the example from the npmjs.com site, with headers of “(request-target) host date digest”, and using the appropriate RSA key pair.

If it were me, I would also include a (created) and an (expires) field in the http signature.
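
To make that concrete, here is a minimal sketch in nodejs of steps 2 through 6. The keyId, host, and path are placeholders, and the private key is assumed to be an RSA key in PEM format:

const crypto = require('crypto');

function buildSignedHeaders(privateKeyPem, keyId, host, path, body) {
  // steps 2 and 3: SHA-256 digest of the body, base64-encoded
  const digest = crypto.createHash('sha256').update(body).digest('base64');
  const date = new Date().toUTCString();

  // step 6: sign a string covering (request-target), host, date, and digest
  const stringToSign =
    '(request-target): post ' + path + '\n' +
    'host: ' + host + '\n' +
    'date: ' + date + '\n' +
    'digest: SHA-256=' + digest;
  const signature = crypto.createSign('sha256')
    .update(stringToSign)
    .sign(privateKeyPem, 'base64');

  // steps 4 and 5: the headers to set on the outbound POST
  return {
    host: host,
    date: date,
    digest: 'SHA-256=' + digest,
    signature: 'keyId="' + keyId + '",headers="(request-target) host date digest",signature="' + signature + '"'
  };
}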

You can play around with HTTP Signatures using this online tool. That tool does not yet support computing a Digest of a POST body, but I’ll look into extending it to do that too.

Let me know in the comments if any of this is not clear.

I posted a working example for Nodejs as a gist on Github.

It depends only on nodejs and the builtin libraries for crypto and URL to compute the hash/digest and signature. It does not actually send a request to Mastodon; that is left for you to do.

Yarp vs Envoy proxy – build time comparison

I’m doing some self-education these days, and was exploring YARP today. I learned about this via HackerNews some time ago, and only now got around to taking the time to explore in more detail. As Microsoft describes it, YARP is “a library to help create reverse proxy servers that are high-performance, production-ready, and highly customizable.”

It’s not a reverse proxy in its own right, but a library that you can embed into an ASP.NET app to allow it to act as a reverse proxy. The “Yet Another” moniker is completely appropriate; there are many, many reverse proxies out there in various shapes, sizes and configurations. Why Microsoft wanted to build another one when there are good options out there – Envoy, nginx, haproxy, and many others – is perhaps a topic worth exploring (Google, my most recent employer, promotes the open-source Envoy proxy as a general-purpose RP, and also sells an API platform, Apigee, that includes its own reverse proxy). It seems to me that Microsoft has large cloud investments, and wants to have control over this particular critical piece of widely-used infrastructure. Rather than compromise with something that’s already out there, they went and built something that fits the requirements for their massive cloud footprint, as well as for the shops other than Microsoft who are invested in .NET. I don’t think it’s worthy of too much more discussion than that.

With so many available options in reverse proxies, an interested observer might want to have some insight into comparisons between them. Now there are various criteria a person might want to investigate when comparing – features like hot-reload of configuration, the configuration model in general, support for “farms” of proxies all centrally managed, platform availability, performance…. Any reader could probably add two or three more items to that list.

All of that is interesting, but I don’t have time to conduct a thorough comparison right now. But I will offer one quick observation. While exploring Envoy proxy back in November, I built it from source on my MacBook Pro. The build was a bear, and took maybe 90 minutes? Something like that. Basically it built every library that Envoy depends on. I suspect most people don’t do that; they just use the docker container that the Envoy project publishes.

Just for fun I cloned the yarp repo and ran a build. After sorting out some puzzles [1, 2] on my own, the build completed in about 55 seconds. That’s a pleasant surprise.

I know that with Bazel, the envoy build will be much faster on subsequent runs. But even so, building a YARP proxy is much much faster than building Envoy.

nodejs on Google App Engine – forcing HTTPS inbound, via HSTS

How can I force my nodejs app running on Google App Engine to always redirect to HTTPS?

I have a pretty vanilla app that looks like this:
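
Something like the following – the route and the response text are just placeholders:

const express = require('express');
const app = express();

app.get('/', (req, res) => {
  res.send('Hello from App Engine!');
});

// GAE supplies the port via the PORT environment variable
const port = process.env.PORT || 8080;
app.listen(port, () => console.log(`listening on ${port}`));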

This thing is running in Google App Engine (GAE), and I’d like to make sure it listens only on HTTPS. There are standards like HSTS that can help. How can I use them?

This question and answer on Stackoverflow showed me the way. Basically, just add in a tiny module called yes-https. The new code looks like this:
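
Something like this, assuming the middleware’s default behavior (redirect to HTTPS and emit an HSTS header) is what you want:

const express = require('express');
const yes = require('yes-https');
const app = express();

// redirect any http request to https, and send the HSTS header
app.use(yes());

app.get('/', (req, res) => {
  res.send('Hello from App Engine!');
});

const port = process.env.PORT || 8080;
app.listen(port, () => console.log(`listening on ${port}`));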

Redeploying (no change to app.yaml) gets me the always-HTTPS behavior I want. When a client requests my service via http, it receives a 301 redirect pointing to the secure site.

HTTP/1.1 301 Moved Permanently
Date: Wed, 20 Jun 2018 16:27:56 GMT
Transfer-Encoding: chunked
X-Powered-By: Express
Location: https://foo-bar.appspot.com/
Via: 1.1 google

Nice, easy, clear.
Thanks to Justin for this handy module.

Jackson and XmlMapper – reading arbitrary data into a java.util.Map

I like the Jackson library from FasterXML. Really handy for reading JSON, writing JSON. Or I should say “serialization” and “deserialization”, ’cause that’s what the cool kids say. And the license is right. (If you need a basic overview of Jackson, I suggest this one from Eugen at Stackify.)

But not everything is JSON. Sometimes ya just wanna read some XML, amiright?

I work on projects where Jackson is included as a dependency. And I am aware that there is a jackson-dataformat-xml module that teaches Jackson how to read and write XML, using the same simple model that it uses for JSON.

Most of the examples I’ve seen show how to read XML into a POJO – in other words “databinding”. If my XML doc has an element named “Fidget” then upon de-serialization, the value there is used to populate the field or property on the Java object called “Fidget” (subject to name remapping of course).

That’s nice and handy, but like I said, sometimes ya just wanna read some XML. And it’s not known what the schema is. And you don’t have a pre-compiled Java class to hold the data. What I really want is to read XML into a java.util.Map<String,Object> . Very similar to what I would do in JavaScript with JSON.parse(). How can I do that?

It’s pretty easy, actually.
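
A minimal sketch – this assumes jackson-dataformat-xml is on the classpath, and the variable names are arbitrary:

import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import java.util.Map;

// read arbitrary XML into a Map, just as you would read JSON into a Map
XmlMapper xmlMapper = new XmlMapper();
Map map = (Map) xmlMapper.readValue(xmlInput, Object.class);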

This works but there are some problems.

  1. The root element is lost. This is an inadvertent side-effect of using a JSON-oriented library to read XML.
  2. For any element that appears multiple times, only the last value is retained.

What I mean is this:
Suppose the source XML is:

<Root>
  <Parameters>
    <Parameter name='A'>valueA</Parameter>
    <Parameter name='B'>valueB</Parameter>
  </Parameters>
</Root>

Suppose you deserialize that into a map, and then re-serialize it as JSON. The output will be:

{
  "Parameters" : {
    "Parameter" : {
      "name" : "B",
      "" : "valueB"
    }
  }
}

What we really want is to retain the root element and also infer an array when there are repeated child elements in the source XML.

I wrote a custom deserializer, and a decorator for XMLStreamReader, to solve these problems. Using them looks like this:

String xmlInput = "<Root><Messages><Message>Hello</Message><Message>World</Message></Messages></Root>";
InputStream is = new ByteArrayInputStream(xmlInput.getBytes(StandardCharsets.UTF_8));
RootSniffingXMLStreamReader sr = new RootSniffingXMLStreamReader(XMLInputFactory.newFactory().createXMLStreamReader(is));
XmlMapper xmlMapper = new XmlMapper();
xmlMapper.registerModule(new SimpleModule().addDeserializer(Object.class, new ArrayInferringUntypedObjectDeserializer()));
Map map = (Map) xmlMapper.readValue(sr, Object.class);
Assert.assertEquals( sr.getLocalNameForRootElement(), "Root");
Object messages = map.get("Messages");
Assert.assertTrue( messages instanceof Map, "map");
Object list = ((Map)messages).get("Message");
Assert.assertTrue( list instanceof List, "list");
Assert.assertEquals( ((List)list).get(0), "Hello");
Assert.assertEquals( ((List)list).get(1), "World");

And for the earlier Parameters example, the output now looks like this:

{
  "Parameters" : {
    "Parameter" : [
      {
        "name" : "A",
        "" : "valueA"
      },{
        "name" : "B",
        "" : "valueB"
      }
    ]
  }
}

…which is what we wanted.

Find the source code here: https://github.com/DinoChiesa/deserialize-xml-arrays-jackson

Hat tip to Jegan for the custom deserializer.

medialize/URI.js – why’d you go and get all fancy?

I have relied on URI.js from medialize for years.

I downloaded it a long time ago, and it just works. It’s handy for parsing and building URIs from within JavaScript.
I happen to use nodejs often, but I also use a JavaScript engine that runs in the JVM (via Rhino or Nashorn). So I liked URI.js for its usability across those systems.

Recently I decided to download “the latest and greatest” URI.js, and what I found… did not make me jump for joy.

URI.js is no longer “just downloadable”.

Where before I could just download the raw JS file, URI.js now has a builder that allows me to select which options I wish to include. I get the concept, and it’s a nice idea, but when I de-selected every option, I got a minified URI.js that I did not want. When I went to the source tree I found a URI.js that included all the require() statements for punycode, second-level domains, and IPv6 – all stuff I did not want.

*snif*

I couldn’t figure out how to get it to “just work” in nodejs without all of that, so I had to resort to manually changing the code. Basically I just removed all the require() statements for those unneeded / unwanted modules.

And it works.

It’s possible I’m missing something basic, but for sure, it got more complicated to get the simple solution. Seems like a step backward.

It’s that time of year… when people think about exchanging JWT for opaque tokens

Yes, it’s that time of year when people think about RFC7523, which describes how to exchange JWT for opaque OAuth tokens.

Right?

If you’re like me, the waves of acronyms, jargon, and IETF RFCs (see what I did there?) seem to never end. OAuth, JWT, RFC 7523, JTI, claims, RS256, PBKDF2…? I feel your pain.

But there is some good news… here’s something that will help clarify the ideas and use cases around RFC7523. I wrote a quick article, and also created an Apigee Edge API Proxy that implements this for you. It illustrates exactly how to exchange JWT for opaque OAuth tokens, and I even include some commentary in the readme explaining why you’d want to do it. (Spoiler alert: It’s faster to verify opaque OAuth tokens). All available on the Apigee community site.

The way I think about RFC7523 – it is an alternative to the client_credentials “grant type”, described in IETF RFC6749, which is the document that describes the OAuth v2.0 Framework.

OK, I hear you saying it: “back up, Dino… What is this client_credentials thing?” Yes, there is an underscore there. The client_credentials grant type is designed to allow a client app to identify itself to a token dispensary. The client says “here’s my ID, and here’s a secret that only I (the client app) should know.” And the token dispensary can then look at those two pieces of information, and if they are valid (the client_id is not expired or revoked), then the token dispensary can issue a token. It’s like username + password authentication for a person, but client_credentials is used for identifying a client app. This grant type is mostly useful in server-to-server communications, when one service is being used by another service. BUT, some people use client_credentials grants in their mobile apps, so that the API service can trust that the mobile app is who it claims to be. (There are some problems with this; basically the client_secret needs to be embedded in the client code, therefore it is accessible to hackers, and therefore it is not truly “secret”. We can talk about mitigations for this in a future blog post.)

So that’s the client_credentials grant type. As I said, RFC7523 is an alternative to the client_credentials grant. Basically, instead of sending in a client_id and client_secret, under the RFC7523 flow (which has the helpful and easy-to-remember moniker of “JSON Web Token (JWT) Profile for OAuth 2.0 Client Authentication and Authorization Grants”, seriously) the client app self-signs a JWT which includes the client_id as the issuer. The app sends that to the token dispensary. The token dispensary verifies the signature, verifies that the client_id is valid, and then issues an opaque OAuth v2.0 token.
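
In concrete terms, the token request defined by RFC 7523 carries the signed JWT as an assertion, something like the following; the host and the truncated JWT are placeholders:

POST /token HTTP/1.1
Host: token-dispensary.example.com
Content-Type: application/x-www-form-urlencoded

grant_type=urn%3Aietf%3Aparams%3Aoauth%3Agrant-type%3Ajwt-bearer&assertion=eyJhbGciOiJSUzI1NiIs...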

Now, there are some interesting implications to this model. Maybe these are obvious to some of you, but I will state them anyway:

  1. the token dispensary and the client app have to conform to the same JWT signing convention. JWT can be signed with shared-secret (HS256) or with public/private key (RS256). Either way is fine, but the two sides must agree.
  2. regardless of the signing convention, it must be possible for the token dispensary to verify the signature. If HS256 is the agreed convention, this means the token dispensary and the client app must share a secret. (This can be the client_secret itself, if it has sufficient entropy, or it can be a key derived from it via PBKDF2.) If RS256 is the signing convention, it means the two parties must have a shared trust relationship, where the token dispensary has access to the public key of the client app. Bottom line, there is a little bit more overhead for you, setting up a JWT-for-opaque-token exchange mechanism, if you use RS256: specifically you need to provision a new RSA public/private keypair for the client, and the client needs to make the public key available to the token dispensary.
  3. the client app needs some extra intelligence, specifically a library that allows it to create a signed JWT. There are myriad options available regardless of the app platform + language you use, so in practice, this won’t be an obstacle, but it does mean there will be new code you must include in your client.

Once you get past those implications and the extra set-up overhead, the model in RFC 7523 is really nice because it’s extensible. That’s because the request-for-token is encapsulated in a JWT, and the JWT itself is extensible. You, as an API designer, can stipulate any arbitrary (custom) claims that clients must include in the JWT, in order to compose a valid request-for-token. And you can include restrictions on the standard claims or custom claims. Some examples:

  1. a proof-of-work string, something like a HashCash string or similar. Including proof-of-work would be a discouragement for bots.
  2. As another example, you can stipulate that the JWT be short lived. Verification of the JWT might include a proviso that rejects tokens that have a lifetime beyond 180 seconds, for example.
  3. you could institute a one-use policy on such JWT.
  4. you could require a “scopes” claim and validate the strings contained in that claim against the issuer (==client_id)

BTW, the example API Proxy I shared on Github shows how to implement the lifetime and one-use-only controls. (As with everything I publish on github, pull requests are welcomed!) If the inbound JWT that comprises the request-for-opaque-token does not pass these checks, a 401 Unauthorized is sent back.

BTW #2, did you know that Google services like Stackdriver and cloud storage use JWT-for-opaque-token exchange in order to enable service-to-service integration? Google also institutes the lifetime and one-use-only controls. The lifetime of the JWT must be less than 300 seconds.

Say, that reminds me! Speaking of Google, did I mention that Google has acquired Apigee? Yes, I work for Google now! Part of the Apigee team within Google. w00t! I’m pumped, psyched, charged up, amped, and very pleased about this development.

So far, minimal changes for me, except that I got a Chromebook! And yes, I authored this post from that very same device.

As always, I’m interested to hear your feedback on this. Let me know in the comments section.

Finally, I would like to wish all of you a Merry RFC7523 Season; and I wish you many Happy short-lived OAuth Tokens in the new year.

Drupal 7, #states, and mutually exclusive checkboxes

This post will be a bit techy. I confronted and solved a minor problem yesterday, and in the spirit of the internet, thought I’d share the solution, in case anyone else tries something similar.

This is about Drupal forms, and specifically within forms, the #states capability, which is a way that form designers can tell Drupal to do jQuery magic things on the form elements, enabling or disabling some of them based on the state or value of others.

The typical example is a checkbox that, when checked, will either enable (disabled: false) or make visible (css ‘display: block’) a dependent textbox. Simple enough, right? And for that kind of simple case, it works well.

Drupal’s Forms API is described here, and the related drupal_process_states here.

This is what it looks like to configure a Form in Drupal:
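
A minimal sketch of that kind of form array – the field names here are illustrative:

$form['extra_detail'] = array(
  '#type' => 'textfield',
  '#title' => t('Extra detail'),
  '#states' => array(
    'visible' => array(
      ':input[name="wants_detail"]' => array('checked' => TRUE),
    ),
  ),
);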

That says, show the textfield only when the referenced checkbox is checked. The reference to the checkbox is with a jQuery selector. This one works, really straightforward. And, the state is managed by Drupal in both directions. When the referenced checkbox is checked, then the textfield is visible. When the referenced checkbox is unchecked, then the textfield becomes not visible.

But what if you want a set of mutually exclusive checkboxes?

Mutually Exclusive Checkboxes

One approach is to just use the above model, and have each checkbox depend on the other. In other words, something like this:
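
Roughly this kind of thing, where each checkbox gets unchecked whenever the other one is checked (again, the field names are illustrative):

$form['option_a'] = array(
  '#type' => 'checkbox',
  '#title' => t('Option A'),
  '#states' => array(
    'unchecked' => array(
      ':input[name="option_b"]' => array('checked' => TRUE),
    ),
  ),
);
$form['option_b'] = array(
  '#type' => 'checkbox',
  '#title' => t('Option B'),
  '#states' => array(
    'unchecked' => array(
      ':input[name="option_a"]' => array('checked' => TRUE),
    ),
  ),
);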

This will not work. The reason it does not work is that the state is managed by Drupal in both directions. When checkbox #1 is checked, then checkbox #2 becomes unchecked. Which means checkbox #1 gets checked. Which means checkbox #2 becomes unchecked. And if you turn on the Firebug debugger, you can see the logical loop going round and round, endlessly.

There was an approach described here that suggested using two conditions in the array. But that didn’t work for me; I still had the endless loop. After fiddling with this for an hour, searching around for hints, I decided to just do it myself with my own jQuery. The logic was simple to write. And, I didn’t want to fight the Drupal Forms API any longer.

So here’s the solution. Include this JavaScript in your module:
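
A sketch of that JavaScript; the CSS class used to mark the checkboxes, and the behavior name, are mine and purely illustrative:

(function ($) {
  Drupal.behaviors.mutexCheckboxes = {
    attach: function (context, settings) {
      // watch any checkbox carrying the marker class
      $('input.mutex-checkbox', context).change(function () {
        if (this.checked) {
          // affirmatively checked: uncheck the other marked checkbox(es)
          $('input.mutex-checkbox').not(this).removeAttr('checked');
        }
        // when unchecked, do nothing
      });
    }
  };
})(jQuery);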

As you can see, it registers a ‘change’ hook for a specially-marked checkbox. And when the checkbox is affirmatively checked, it unchecks the other checkbox. When the checkbox is unchecked, it does nothing.

How does that JS get loaded? In the Drupal module code, do this:
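
One way to do it is to attach the script in the form builder; the module and file names here are illustrative:

$form['#attached']['js'][] = drupal_get_path('module', 'mymodule') . '/js/mutex-checkboxes.js';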

And finally, how do we set up the checkboxes in the Forms API? Like this:
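
Each checkbox just carries the marker class that the JavaScript looks for (names illustrative, as before):

$form['option_a'] = array(
  '#type' => 'checkbox',
  '#title' => t('Option A'),
  '#attributes' => array('class' => array('mutex-checkbox')),
);
$form['option_b'] = array(
  '#type' => 'checkbox',
  '#title' => t('Option B'),
  '#attributes' => array('class' => array('mutex-checkbox')),
);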

And that gets the desired behavior: It is possible for zero or one of those checkboxes to be checked, but not both.

It took more time to write this post than it took to build the solution shown here! And of course I never did manage to figure out how to do the same using just the Forms API. This is an example of an API – the Forms API in Drupal – that does some things well, and this one thing… not so well. Much easier to just jump out and solve it this way.

Maybe this will help someone else!

By the way, this is included in a Drupal module that allows administrators to verify / validate user registration.