domingo, 15 de enero de 2017

Starting to love gRPC for interprocess communication (1/2)

In the context of a discussion around programming languages and static typing a colleague said that when you get older you stop caring about fancy technologies and you realize that is way better to just use safe and well probed solutions.  

I'm kind of tired of having been using loosely defined JSON-HTTP interfaces for many years and when I discovered gRPC last year it looked exactly what I was looking for.  I would love to start using it in production as soon as possible so I decided to play with it for a while first and explain how it went.

I will split my comments about gRPC in two posts. This first one about what is gRPC and what advantages provide and the next one on how to use it in our applications.

gRPC embraces the RPC paradigm where the APIs are defined as actions receiving some arguments and replying with a response.  Initially it feels like going 10y back when we started to use SOAP and similar technologies but we have to admit that is much simpler to map those primitives to our client and server code (for example no url path mapping) and it is more strict and explicit on what can and cannot be done for each operation and that usually makes the system more robust.

In gRPC you define your interfaces (methods, arguments and results) in an IDL using the protocol buffers format.   This definition is used to generate the server and client code automatically.   The serialization of the calls is done using the binary protobuf format too.  This makes the communication efficient and the protocol extensible being able to use all the features available in protobuf (for example composition or enum types).

Two of the advantages of this approach are automatic code generation and schema validation.  That can also be done in the "traditional" REST interfaces, but it is more tedious, less efficient and in my experience much easier to make mistakes when you add new features or refactor the code.

The communication in gRPC is based on HTTP2 transport.  This provides all the advantages of the new HTTP version (multiplexing, streaming, compression) while at the same time allows you to keep using existing HTTP infrastructure (nginx or other load balancers for example).

Another special feature of gRPC is the streaming support that is very convenient for some APIs these days.    With gRPC you are able to send a (potentially infinite) sequence of arguments to the server and receive a sequence of results from it.   That is very useful to implement applications more responsive where data can be processed and displayed even if part of it is still not ready.    It is also very useful for APIs based on notifications like in case of a chat application for example.

When compared with other IPC frameworks like Finagle (disclaimer, i'm a fan of it) gRPC is still missing important features client side load balancing (although it is wip) and some other goodies like circuit breakers, retries or service discovery.   In the mean time people is implementing those features on top of the framework.

The other missing piece is browsers support.  Even if there is support for many languages including Javascript, the browsers limitations make it not possible to implement a gRPC compatible web client nowadays.   The community is working on an extension of the protocol to support browsers and in the mean time the only solution seem to be the grpc-gateway proxy that generates a JSON-HTTP to gRPC proxy based on the IDL of the service with some extra annotations.