A misleading error when using gRPC with Go and nginx
I wanted to document an issue I ran into while trying to write a gRPC service in Go, in case someone else just so happens to run into the same problem.
The scenario was this:
- I had a web service written in Go
- serving both HTTP traffic and gRPC traffic in plaintext on the same port
- using nginx as a reverse proxy to provide TLS.
This might seem a bit contrived, but it came about fairly naturally. I was working on a web service written in Go, and then I wanted to add a gRPC API to it as well, for use by a mobile client. Rather than setting up a new sub-domain or messing with firewall rules to expose another port, I had the Go service accept gRPC requests on the same port that it was already listening on. Then, some time later I decided it would be easier to let nginx handle TLS rather than figuring out how to set up certificate rotation in the Go service itself, and so I had nginx terminate the TLS connection and proxy requests to the Go service over plain HTTP.
The error message
However, when I attempted this, all my gRPC requests stopped working, and I started seeing errors like this in the nginx logs:
2020/12/13 01:14:04 [error] 29#29: *3 upstream sent too large http2 frame:
4740180 while reading response header from upstream, client: 172.21.0.1,
server: , request: "POST /Service/Method HTTP/2.0", upstream:
"grpc://172.21.0.2:8080", host: "localhost:8080"
My attempts to Google this error message didn’t turn up much. I found a blog post about a similar-sounding error, but I wasn’t using Kubernetes, and changing the proxy_buffers settings in nginx (the recommended fix) didn’t seem to have any effect.
So what was going on here?
I didn’t make much progress towards understanding this until I came across the httputil.DumpRequest() function. When I used it to log an incoming request in my main handler, I saw this:
PRI * HTTP/2.0
Connection: close
I’d never heard of the PRI method, but at least this was something easy enough to search for online. I found that this comes from the HTTP/2 “client connection preface”. The preface is sent by the client at the start of an HTTP/2 connection, and it begins with a fixed byte sequence consisting of:
PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n
(where \r and \n indicate the carriage return and newline characters, respectively). This byte sequence was specifically chosen to look like an invalid HTTP/1.1 request, so that servers which don’t understand HTTP/2 will error out right away.
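As a minimal sketch of how Go's HTTP/1 parser sees these bytes, the program below feeds the preface to http.ReadRequest from the standard library. The helper name parseAsHTTP1 is made up for this illustration; it is not part of my service.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

// clientPreface is the fixed byte sequence that opens an HTTP/2 connection.
const clientPreface = "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"

// parseAsHTTP1 (hypothetical helper) runs raw through the standard library's
// HTTP/1 request parser and returns the method, request URI, and protocol
// version that the parser extracted from the request line.
func parseAsHTTP1(raw string) (method, uri, proto string, err error) {
	req, err := http.ReadRequest(bufio.NewReader(strings.NewReader(raw)))
	if err != nil {
		return "", "", "", err
	}
	return req.Method, req.RequestURI, req.Proto, nil
}

func main() {
	method, uri, proto, err := parseAsHTTP1(clientPreface)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("method=%q uri=%q proto=%q\n", method, uri, proto)
}
```

With the Go versions I am familiar with, this prints method="PRI" uri="*" proto="HTTP/2.0", so the parser accepts the first line of the preface as an ordinary request line rather than rejecting it.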
But why did my Go service start treating this as a request in its own right?
HTTP/2 and Go
When I added the gRPC service on the same port as the rest of the web service, I followed a technique mentioned in the grpc-go documentation. It comes down to a check that the incoming request is using HTTP/2 (as gRPC is implemented using HTTP/2) and has a Content-Type header whose value begins with application/grpc.
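A minimal sketch of that kind of check, using only the standard library, might look like the following. The names isGRPCRequest, mixedHandler, and route are hypothetical, and the two stub handlers stand in for a real gRPC server and a real HTTP mux (a *grpc.Server can be passed as an http.Handler because it has a ServeHTTP method).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// isGRPCRequest reports whether r arrived over HTTP/2 and carries a
// Content-Type that begins with "application/grpc".
func isGRPCRequest(r *http.Request) bool {
	return r.ProtoMajor == 2 &&
		strings.HasPrefix(r.Header.Get("Content-Type"), "application/grpc")
}

// mixedHandler (hypothetical) sends gRPC requests to grpcHandler and all
// other requests to httpHandler.
func mixedHandler(grpcHandler, httpHandler http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if isGRPCRequest(r) {
			grpcHandler.ServeHTTP(w, r)
			return
		}
		httpHandler.ServeHTTP(w, r)
	})
}

// route (hypothetical) builds a fake request with the given protocol major
// version and Content-Type, runs it through mixedHandler with two stub
// handlers, and returns the name of the stub that answered.
func route(protoMajor int, contentType string) string {
	grpcStub := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "grpc")
	})
	httpStub := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "http")
	})
	req := httptest.NewRequest(http.MethodPost, "/Service/Method", nil)
	req.ProtoMajor = protoMajor
	req.Header.Set("Content-Type", contentType)
	rec := httptest.NewRecorder()
	mixedHandler(grpcStub, httpStub).ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(route(2, "application/grpc+proto")) // HTTP/2 and gRPC content type
	fmt.Println(route(1, "application/grpc"))       // gRPC content type, but not HTTP/2
	fmt.Println(route(2, "text/html"))              // HTTP/2, but not gRPC
}
```

Note that the check depends on r.ProtoMajor being 2, which is only true once the server has actually recognized the connection as HTTP/2.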
Unfortunately, I either skipped right over the previous paragraph in the documentation, or I simply forgot about it by the time I tried to move the TLS handling into the nginx reverse proxy:
The provided HTTP request must have arrived on an HTTP/2 connection. When using the Go standard library’s server, practically this means that the Request must also have arrived over TLS.
This is the core of the issue. The HTTP server in Go’s standard library does not support plaintext HTTP/2 by default.
(Why doesn’t Go support this by default? Well, HTTP/2 on the web effectively requires encryption, in the sense that currently all major web browsers will use HTTP/2 only with https:// URLs [1]. If Go’s standard library HTTP server is mainly used for writing web services, then this seems like a reasonable design decision.)
But what does it mean to “support plaintext HTTP/2”? Why does the HTTP/2 protocol care if the underlying transport is encrypted or not?
Starting an HTTP/2 connection
Before all of this, I was only vaguely aware that HTTP/2 existed. I had a basic understanding of HTTP/1 from a networking course in college, but HTTP/2 is quite different. To begin with, while HTTP/1 is a text-based protocol, HTTP/2 is a binary protocol. One implication of this is that HTTP/2 is not backwards-compatible with HTTP/1, and so clients and servers need some way to signal to each other that they support HTTP/2. There are three main ways to do this.
The first way is specific to https. As part of any https connection, the client and server first engage in something called a “TLS handshake” to agree on encryption parameters and other connection settings. When HTTP/2 was standardized, an extension was also added to the TLS protocol called “Application-Layer Protocol Negotiation” (ALPN) to let the client and server decide which protocol to use for the connection. The client can use this extension to indicate that it supports both HTTP/1 and HTTP/2, and then the server will indicate back to the client which of these options it has decided to use.
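To make the negotiation concrete, here is a simplified model of the server's side of ALPN. It is only a sketch: selectProtocol is a hypothetical helper, not how crypto/tls is implemented, and it assumes the server simply picks its most-preferred protocol among those the client offered. The identifiers "h2" and "http/1.1" are the ALPN protocol names for HTTP/2 over TLS and HTTP/1.1.

```go
package main

import "fmt"

// selectProtocol (hypothetical) returns the first protocol in serverPrefs
// that also appears in clientOffers, or "" if the two lists do not overlap.
func selectProtocol(serverPrefs, clientOffers []string) string {
	offered := make(map[string]bool, len(clientOffers))
	for _, p := range clientOffers {
		offered[p] = true
	}
	for _, p := range serverPrefs {
		if offered[p] {
			return p
		}
	}
	return ""
}

func main() {
	// The client advertises both HTTP/2 and HTTP/1.1.
	client := []string{"h2", "http/1.1"}

	// A server that prefers HTTP/2 selects "h2".
	fmt.Println(selectProtocol([]string{"h2", "http/1.1"}, client))

	// A server that only speaks HTTP/1.1 selects "http/1.1".
	fmt.Println(selectProtocol([]string{"http/1.1"}, client))
}
```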
However, for plaintext http, there is no corresponding negotiation step when starting a connection. This brings us to the second way to signal support for HTTP/2: using the connection “upgrade” feature of HTTP/1.1. A client can send an HTTP/1 request that includes the header Upgrade: h2c (along with a few other required headers [2]) to signal that it can speak HTTP/2.
The third way to use HTTP/2 is for a client to have “prior knowledge” that a particular server supports it. Essentially, if some out-of-band signaling mechanism exists, a client can start speaking HTTP/2 directly.
This last method is the one that nginx uses for gRPC requests. As the gRPC protocol is implemented using HTTP/2, this constitutes “prior knowledge” that any gRPC endpoint must support HTTP/2. When nginx connects to the Go service, it immediately begins speaking HTTP/2, starting with the client connection preface.
However, because the Go standard library HTTP server supports only the first method (ALPN) by default, when it receives the HTTP/2 client connection preface over plaintext, it parses this as an HTTP/1 request and responds with an HTTP/1 response. nginx then receives this HTTP/1 response, which leads to the original error.
But why does the error message read “upstream sent too large http2 frame”? What’s that about?
HTTP/2 message framing
To answer that question, we need to understand just a little bit more about the HTTP/2 protocol:
- HTTP/2 messages consist of a number of “frames.”
- Each frame begins with a fixed-length header.
- The first 3 bytes of this header are a Length field.
- Unless otherwise indicated, the maximum value of this Length field is 16,384. (cf. RFC 7540 §4.1)
When nginx starts reading the erroneous HTTP/1 response from the Go service, it will try to parse it as an HTTP/2 message frame. Now, the first line of an HTTP/1 response contains the HTTP version and status code, for example:
HTTP/1.1 400 Bad Request
Trying to parse these bytes as the start of an HTTP/2 frame results in reading the first three bytes (“HTT”) as the frame length. In hexadecimal, these bytes are 0x485454, or 4,740,180 in decimal. Obviously this exceeds the maximum frame length of 16,384, and so this explains why nginx thinks the frame is too large.
The number 4740180 even shows up in the error message, and this could have been an important clue, but I did not realize its significance at the time. (This is the same exact idea as the magic number 1213486160, but with only 3 bytes instead of all 4 of “HTTP”.)
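The arithmetic can be checked with a few lines of Go. This is only a sketch of the length calculation, not nginx's actual parsing code; frameLength and maxFrameSize are names I made up for the illustration.

```go
package main

import "fmt"

// maxFrameSize is the default upper bound on an HTTP/2 frame's Length field.
const maxFrameSize = 16384

// frameLength (hypothetical) interprets the first three bytes of b as the
// 24-bit big-endian Length field of an HTTP/2 frame header.
func frameLength(b []byte) (uint32, error) {
	if len(b) < 3 {
		return 0, fmt.Errorf("need at least 3 bytes, got %d", len(b))
	}
	return uint32(b[0])<<16 | uint32(b[1])<<8 | uint32(b[2]), nil
}

func main() {
	// The start of an HTTP/1 response, misread as an HTTP/2 frame header.
	n, err := frameLength([]byte("HTTP/1.1 400 Bad Request\r\n"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("length = %d (0x%06x), too large: %t\n", n, n, n > maxFrameSize)
	// Prints: length = 4740180 (0x485454), too large: true
}
```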
Putting it all together
To recap:
- When nginx connects to an upstream gRPC server, it begins speaking HTTP/2 directly, starting with the HTTP/2 client connection preface. (The use of gRPC is the “prior knowledge” that the server must support HTTP/2.)
- The connection preface begins with a fixed byte sequence chosen to look like an HTTP/1 request, but with an invalid method and version.
- The Go http standard library package does not recognize plaintext HTTP/2 requests, and so the Go service misinterprets the client connection preface as the start of an HTTP/1 request, and responds with an HTTP/1 response.
- Then nginx attempts to parse the HTTP/1 response as an HTTP/2 frame, which leads to the original error message.
The nginx error message is misleading because it talks about an HTTP/2 frame being too large, but the real problem is that the Go service wasn’t speaking HTTP/2 at all.
So what is the fix?
Luckily, there’s an easy way to add plaintext HTTP/2 support to the Go standard library HTTP server, using the golang.org/x/net/http2/h2c package.
If the default http2.Server options are acceptable, this can be as simple as replacing something like this:
http.ListenAndServe(addr, handler)
with this:
http.ListenAndServe(addr, h2c.NewHandler(handler, &http2.Server{}))
Of course, another option would be to move the gRPC portion of the Go service onto a different port, and use the gRPC library’s built-in server (which recognizes plaintext HTTP/2 requests just fine). If I had simply done that to begin with, I wouldn’t have run into this issue.
Further reading:
- “Introduction to HTTP/2” from Google’s Web Fundamentals series
- HTTP/2 Frequently Asked Questions
- RFC 7540 (the full HTTP/2 specification)
- RFC 7301 (the TLS ALPN extension specification)
- “What Happens in a TLS Handshake?” from Cloudflare
- What’s with the letters ‘PRI’ and ‘SM’ in the client connection preface? “The secret message hidden in every HTTP/2 connection”
[2] Specifically, the HTTP2-Settings and Connection headers (cf. RFC 7540 §3.2).