Agnostic Configuration Service Design

Just for fun, I am developing several web applications using a microservice architecture. I am trying to determine the best way to manage configuration, and I am worried that my planned approach has some huge traps, or that there is something better.

To state the problem: let's say I have an authentication service written in C++, another service written in Rust, analytics services written in Haskell, some middle tier written in Scala, and a front end written in JavaScript. There will also be a corresponding database for each service: an auth DB, an analytics DB (possibly a Redis cache for sessions), etc. I am deploying all of these applications with Docker.

When one of these applications is deployed, it needs to discover all the other applications. Since I am using Docker Swarm, discovery itself is not a problem: all nodes share the required overlay network.

However, each application still needs the host_addr of the services it depends on, possibly a port, credentials for some databases or private services, and so on.

I know that Docker has secrets, which let an application read configuration values from files mounted into the container, but then I would need to write some kind of configuration parser in each language for each service. It seems messy.
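For reference, reading a Docker secret is just reading a file: each secret is mounted as a plain file named after the secret under /run/secrets inside the container. A minimal Python sketch (the secret name and the demo directory here are made up; in a real container you would use the default path):

```python
import os
import tempfile

def read_secret(name, secrets_dir="/run/secrets"):
    """Read a Docker secret: each secret is mounted as a plain file
    named after the secret inside /run/secrets in the container."""
    with open(os.path.join(secrets_dir, name)) as f:
        return f.read().strip()

# Demo with a temporary directory standing in for /run/secrets:
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "auth_db_password"), "w") as f:
    f.write("s3cret\n")
password = read_secret("auth_db_password", secrets_dir=demo_dir)
```

The complaint in the question stands: every service, in every language, needs its own few lines of this glue.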

What I would like instead is a configuration service that holds the knowledge of how to configure all the other services. Each application would then start with a single RPC call to obtain its configuration at run time. Something like:

    int main() {
        AppConfig cfg = configClient.getConfiguration("APP_NAME");
        // do application things... and pass cfg around
        return 0;
    }

AppConfig would be defined in an IDL, so the class would be immediately available in a language-agnostic way.
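As a sketch of what that IDL might look like (assuming gRPC/Protocol Buffers; all message and field names here are made up for illustration):

```protobuf
syntax = "proto3";

// Hypothetical schema for the configuration service described above.
message AppConfig {
  map<string, string> endpoints   = 1;  // service name -> host_addr:port
  map<string, string> credentials = 2;  // e.g. DB credentials
}

message ConfigRequest {
  string app_name = 1;
}

service ConfigService {
  // One coarse-grained call returns the whole configuration.
  rpc GetConfiguration(ConfigRequest) returns (AppConfig);
}
```

Code generation for C++, Rust, Haskell, Scala, and JavaScript would then give each service the same AppConfig type for free.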

This seems like a good solution, but maybe I am missing something here. Even at a scale of tens of thousands of nodes, they could easily be served by a handful of configuration service instances, so I see no scaling problem. Again, this is just a hobby project, but I like to think through the what-if scenarios :)

How is configuration handled in microservice architectures? Does this sound like a reasonable approach? What do major players like Facebook, Google, LinkedIn, AWS, etc. do?

2 answers

Instead of creating a custom configuration management solution, I would use one of the following:

Spring Cloud Config

Spring Cloud Config is a Java-based configuration server that offers an HTTP API for retrieving application configuration parameters. It obviously comes with a Java client and nice Spring integration, but since the server is just an HTTP API, you can use it from any language you like. The configuration server also supports symmetric/asymmetric encryption of configuration values.

Configuration source: the external configuration is stored in a Git repository that must be accessible to the Spring Cloud Config server. The properties in this repository are then served through the HTTP API, so you could even implement a refresh process for configuration properties.

Server location: ideally, you make your configuration server reachable through a domain (for example, config.myapp.io), so that you can implement load balancing and failover if necessary. That way, all you need to give each of your services is this single location (plus some authentication/decryption information).
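Because the server is just HTTP, a non-Java client stays small. A minimal Python sketch, assuming the config.myapp.io domain above and the server's standard /{application}/{profile} endpoint (the function names here are made up; the response shape with a propertySources list is what the server actually returns):

```python
import json
from urllib.request import urlopen

CONFIG_SERVER = "http://config.myapp.io"  # assumed domain from above

def fetch_config(app_name, profile="default"):
    """Fetch properties for an app from a Spring Cloud Config server.

    GET /{application}/{profile} returns JSON containing a list of
    property sources; we flatten them into one dict.
    """
    with urlopen(f"{CONFIG_SERVER}/{app_name}/{profile}") as resp:
        payload = json.load(resp)
    return flatten(payload)

def flatten(payload):
    """Merge property sources; earlier sources take precedence,
    matching the server's ordering."""
    merged = {}
    for source in payload.get("propertySources", []):
        for key, value in source.get("source", {}).items():
            merged.setdefault(key, value)  # first (highest-priority) source wins
    return merged
```

Ten lines of this per language is a far cry from writing a full configuration parser everywhere.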

Getting started: take a look at the Centralized Configuration guide in the Spring docs, or read this quick introduction to Spring Cloud Config.

Netflix Archaius

Netflix Archaius is part of the Netflix OSS stack and is a Java library that provides an API for accessing properties that can change dynamically at runtime. Although it is limited to Java (which does not quite match the polyglot context you describe), the library can use a database as the source of configuration properties.

confd

confd updates local configuration files using data stored in external sources (etcd, consul, dynamodb, redis, vault, ...). After a configuration change, confd can restart the application so that it picks up the updated configuration file.

In the context of your question, it might be worth a try, since confd makes no assumptions about the application and does not require special client code. Most languages and frameworks support file-based configuration, so confd should be fairly easy to add on top of existing microservices that currently use env variables and do not expect decentralized configuration management.
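To give a feel for the setup, a confd template resource is a small TOML file (this one is a sketch for a hypothetical "myapp" service; the paths and keys are made up):

```toml
# /etc/confd/conf.d/myapp.toml
[template]
src        = "myapp.conf.tmpl"          # template in /etc/confd/templates/
dest       = "/etc/myapp/myapp.conf"    # rendered config the app reads
keys       = ["/myapp"]                 # key prefix watched in etcd/consul/...
reload_cmd = "systemctl restart myapp"  # run when the rendered file changes
```

The template itself uses placeholders like {{getv "/myapp/db/host"}}, so the application only ever sees an ordinary config file.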


I do not have a good solution for you, but I can point out some problems for you.

First, your applications will probably need some kind of bootstrap configuration that lets them locate and connect to the configuration service. For example, you mentioned defining the configuration service API in an IDL, which suggests a middleware system that supports remote procedure calls; I assume you mean something like CORBA IDL. That means your bootstrap configuration will contain not only the endpoint to connect to (specified, perhaps, as a stringified IOR or a path/in/naming/service) but also a configuration file for the CORBA product you are using. You cannot download that CORBA product configuration file from the configuration service, because that would be a chicken-and-egg situation. So instead, you will have to manually maintain a separate copy of the CORBA product configuration file for each application instance.

Second, your pseudo-code example assumes you will use a single RPC call to retrieve the entire application configuration in one go. That coarse granularity is good. If instead an application used a separate RPC call to retrieve each name=value pair, it could run into serious scalability problems. To illustrate, suppose an application has 100 name=value pairs in its configuration, so it needs to make 100 RPC calls to retrieve its configuration data. I can foresee the following scalability issues:

  • Each RPC might take, say, 1 millisecond of round-trip time if the application and the configuration server are on the same local network, so your application's startup cost is 100 RPC calls × 1 millisecond = 100 milliseconds = 0.1 seconds. That may seem acceptable. But if you then deploy another instance of the application on another continent with, say, a 50 millisecond round-trip latency, the startup cost for that new instance becomes 100 RPC calls × 50 milliseconds per call = 5 seconds. Ouch!

  • Making only 100 RPC calls to retrieve the configuration data assumes that the application fetches each name=value pair once and caches it, for example, in an object instance variable, and afterwards reads the name=value pair from that local cache. However, sooner or later somebody will call x = cfg.lookup("variable-name") from inside a for-loop, which means the application makes an RPC call on every iteration of the loop. Obviously this slows down that application instance, but if you end up with dozens or hundreds of application instances, your configuration service will be hit with hundreds or thousands of requests per second, and it will become a central performance bottleneck.
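The caching discipline described above can be sketched in a few lines of Python. Everything here is hypothetical: fetch_all stands in for the single coarse-grained RPC the question proposes and just returns canned data, so the rpc_calls counter shows how many round trips the naive for-loop usage would actually cost:

```python
class ConfigClient:
    """Illustrates coarse-grained fetch plus a local cache."""

    def __init__(self):
        self.rpc_calls = 0   # counts simulated round trips to the server
        self._cache = None

    def fetch_all(self, app_name):
        self.rpc_calls += 1  # one round trip, no matter how many keys
        return {"db.host": "auth-db", "db.port": "5432"}  # stand-in payload

    def lookup(self, key):
        if self._cache is None:        # populate the cache on first use...
            self._cache = self.fetch_all("APP_NAME")
        return self._cache[key]        # ...so loops never trigger extra RPCs

client = ConfigClient()
for _ in range(1000):                  # the "lookup inside a for-loop" case
    client.lookup("db.host")           # still only one RPC in total
```

Without the cache check, the same loop would cost 1000 round trips instead of one.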

  • You may eventually write long-lived applications that make their 100 RPCs at startup to retrieve configuration data and then run for hours or days before shutting down. Suppose those applications are CORBA servers that other applications can communicate with via RPC. Sooner or later you will write some command-line utilities to do things like "ping" an application instance to see if it is running; "query" an application instance to get status information; ask an application instance to gracefully shut down; and so on. Each of those command-line utilities is short-lived: when it starts, it uses 100 RPCs to retrieve its configuration data, then does its "real" work by making a single RPC to the server process to ping/query/kill it, and then terminates. Now somebody will write a UNIX shell script that invokes those ping and query commands once per second for each of your tens or hundreds of application instances. That seemingly harmless shell script will spawn tens or hundreds of short-lived processes per second, and each of those short-lived processes will make many RPC calls to the centralized configuration server to retrieve name=value pairs one at a time. A shell script like that can put an enormous load on your centralized configuration server.

I am not trying to dissuade you from developing a centralized configuration server. The points above are just warnings about scalability issues you need to keep in mind. Your plan for an application to retrieve all of its configuration data through a single coarse-grained RPC call will certainly help you avoid the scalability problems mentioned above.

To provide some food for thought, you might consider a different approach. You could store application configuration files on a web server. A wrapper shell script that launches an application could then do the following:

  • Use wget or curl to download the template configuration files from the web server and save them to the local file system. A template configuration file is a normal configuration file, but with placeholders for some values. A placeholder might look like ${host_name}.

  • Also use wget or curl to download a file containing search-and-replace pairs, for example ${host_name}=host42.pizza.com.

  • Perform a global search-and-replace of those pairs in all downloaded template configuration files to produce ready-to-use configuration files. You can use UNIX tools such as sed or a scripting language to perform this global search-and-replace. Alternatively, you could use a template engine such as Apache Velocity.

  • Start the actual application, using a command-line argument to specify the path/to/downloaded/config/files.
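The substitution step above is small enough to sketch. In the real wrapper script the template and the pairs file would come from wget/curl downloads; here they are inlined strings (with made-up contents) so the sketch is self-contained:

```python
def parse_pairs(pairs_text):
    """Parse lines like '${host_name}=host42.pizza.com' into a dict."""
    pairs = {}
    for line in pairs_text.splitlines():
        line = line.strip()
        if line and "=" in line:
            placeholder, value = line.split("=", 1)
            pairs[placeholder] = value
    return pairs

def render(template_text, pairs):
    """Globally replace every placeholder with its value."""
    for placeholder, value in pairs.items():
        template_text = template_text.replace(placeholder, value)
    return template_text

# Stand-ins for the two downloaded files:
template_text = "auth_host=${host_name}\nauth_port=${port}\n"
pairs_text = "${host_name}=host42.pizza.com\n${port}=8443\n"
rendered = render(template_text, parse_pairs(pairs_text))
```

The application then reads the rendered file like any ordinary config file, with no knowledge that a configuration service, web server, or template engine was ever involved.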

