uropa

Declarative configuration for Opa

Design & Architecture

Underlying architecture

Reverse sync

One of the most important features of uropa is reverse-sync, whereby uropa can detect entities that are present in Opa’s database but are not part of the state file. This feature increases the complexity of the project as the code needs to perform a sync in both directions, from the state file to Opa and from Opa to the state file.

Algorithm

Export and Reset

An export or reset of entities is fairly easy to implement. For export, uropa loads all the entities from Opa into memory and then serializes them into a YAML or JSON file. For reset, it instead performs DELETE queries on all the entities.
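
As a rough illustration of how little logic export and reset need, here is a minimal sketch. It is not uropa's actual code: the `Entity` type, the `Export` and `ResetPlan` functions, and the `/entities/<id>` path are all hypothetical names invented for this example, and only JSON output is shown.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Entity is a hypothetical, simplified stand-in for an Opa entity.
type Entity struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// Export serializes entities that were already loaded into memory.
func Export(entities []Entity) ([]byte, error) {
	return json.MarshalIndent(entities, "", "  ")
}

// ResetPlan lists one DELETE call per entity. The "/entities/<id>" path is
// made up for this sketch and is not Opa's real API.
func ResetPlan(entities []Entity) []string {
	plan := make([]string, 0, len(entities))
	for _, e := range entities {
		plan = append(plan, "DELETE /entities/"+e.ID)
	}
	return plan
}

func main() {
	loaded := []Entity{{ID: "1", Name: "svc-a"}, {ID: "2", Name: "svc-b"}}

	out, err := Export(loaded)
	if err != nil {
		fmt.Println("export failed:", err)
		return
	}
	fmt.Println(string(out))

	for _, call := range ResetPlan(loaded) {
		fmt.Println(call)
	}
}
```

Both operations only walk the loaded entities once, which is why neither needs the diff logic described next.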

Diff and Sync

The diff of configuration is performed using the following algorithm:

Read the configuration from Opa and store it in a SQL-like in-memory database.
Read the state file from disk and match the ID of each entity with its counterpart in the in-memory state, if one is present.
Then, for each entity of each type, perform the following:
1. Create: if the entity is not present in Opa, create it.
2. Update: if the entity is present in Opa, check for equality. If the two are not equal, update the entity in Opa. Steps 1 and 2 together are referred to as “forward sync”.
3. Delete: go through each entity in Opa (from the in-memory database) and check whether it is present in the state file. If it is, do nothing. If it is not, delete the entity from Opa’s database. This step is the reverse sync described earlier.
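
The steps above can be sketched in a few lines of Go. This is a simplified illustration and not uropa's implementation: the `Entity`, `Op`, and `Diff` names are hypothetical, a plain map stands in for the SQL-like in-memory database, and entities are assumed to be matched by name, which the description above does not specify.

```go
package main

import "fmt"

// Entity is a hypothetical, simplified stand-in for an Opa entity.
type Entity struct {
	ID   string
	Name string
	Spec string
}

// Op is one planned change: "create", "update" or "delete".
type Op struct {
	Kind string
	Name string
}

// Diff compares the current state (read from Opa) with the desired state
// (read from the state file) and returns the operations needed to converge.
func Diff(current, desired []Entity) []Op {
	// A map stands in for the in-memory database (assumption of this sketch).
	currentByName := make(map[string]Entity, len(current))
	for _, e := range current {
		currentByName[e.Name] = e
	}

	desiredNames := make(map[string]bool, len(desired))
	var ops []Op

	// Forward sync: create what is missing, update what differs.
	for _, want := range desired {
		desiredNames[want.Name] = true
		have, found := currentByName[want.Name]
		if !found {
			ops = append(ops, Op{Kind: "create", Name: want.Name})
			continue
		}
		// Adopt the existing ID so that only real differences count.
		want.ID = have.ID
		if want != have {
			ops = append(ops, Op{Kind: "update", Name: want.Name})
		}
	}

	// Reverse sync: delete what exists in Opa but not in the state file.
	for _, have := range current {
		if !desiredNames[have.Name] {
			ops = append(ops, Op{Kind: "delete", Name: have.Name})
		}
	}
	return ops
}

func main() {
	current := []Entity{
		{ID: "1", Name: "svc-a", Spec: "v1"},
		{ID: "2", Name: "svc-b", Spec: "v1"},
		{ID: "3", Name: "svc-c", Spec: "v1"},
	}
	desired := []Entity{
		{Name: "svc-a", Spec: "v1"}, // unchanged
		{Name: "svc-b", Spec: "v2"}, // changed
		{Name: "svc-d", Spec: "v1"}, // new
	}
	for _, op := range Diff(current, desired) {
		fmt.Println(op.Kind, op.Name)
	}
}
```

For this input the plan is an update for `svc-b`, a create for `svc-d`, and a delete for `svc-c`; `svc-a` produces no operation because it is equal once the IDs are matched.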

Filters such as select-tag or an Opa Enterprise workspace may be applied to the above algorithm, depending on the inputs given to uropa.

Operational outlook

Based on the above algorithm, one can see how uropa can require a large amount of memory and network I/O. While this is true, a few optimizations have been incorporated to ensure good performance:

For network operations, uropa minimizes the number of API calls it makes to Opa to read the state. It uses Opa’s list endpoints with a large page size (1000) for efficiency.
uropa parallelizes Create/Update/Delete operations where it can. If uropa and Opa, or Opa and Opa’s database, are far apart in terms of network latency, parallel operations help speed things up. With smaller installations, the gain might not be measurable.
uropa’s memory footprint can be high if the configuration for Opa is huge. This is usually not a concern because the uropa process is short-lived. For very large installations, it is recommended to configure a subset of the configuration at a time, using a technique referred to as distributed configuration. There are avenues to further reduce uropa’s memory requirements, although we don’t know by how much. uropa’s code is written with a focus on correctness over performance.
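
The first two optimizations can be illustrated with a small sketch. It is written under stated assumptions and is not uropa's code: `FetchAll` and `ApplyAll` are hypothetical helpers, the list endpoint is simulated by a function, and offset/limit pagination is assumed purely for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// FetchAll reads every item from a paginated source using large pages.
// Offset/limit pagination is an assumption made for this sketch.
func FetchAll(pageSize int, fetch func(offset, limit int) []string) []string {
	if pageSize < 1 {
		pageSize = 1
	}
	var all []string
	for offset := 0; ; offset += pageSize {
		page := fetch(offset, pageSize)
		all = append(all, page...)
		if len(page) < pageSize {
			return all // a short page means there is nothing left
		}
	}
}

// ApplyAll runs apply for every op with at most `parallelism` calls in
// flight and returns the errors that occurred.
func ApplyAll(ops []string, parallelism int, apply func(op string) error) []error {
	if parallelism < 1 {
		parallelism = 1
	}
	sem := make(chan struct{}, parallelism) // bounds concurrency
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs []error
	)
	for _, op := range ops {
		wg.Add(1)
		sem <- struct{}{} // wait for a free slot
		go func(op string) {
			defer wg.Done()
			defer func() { <-sem }()
			if err := apply(op); err != nil {
				mu.Lock()
				errs = append(errs, err)
				mu.Unlock()
			}
		}(op)
	}
	wg.Wait()
	return errs
}

func main() {
	const total = 2500 // simulated number of entities in Opa

	// fetch simulates a list endpoint; a real one would be an HTTP call.
	fetch := func(offset, limit int) []string {
		var page []string
		for i := offset; i < offset+limit && i < total; i++ {
			page = append(page, fmt.Sprintf("entity-%d", i))
		}
		return page
	}

	entities := FetchAll(1000, fetch)
	fmt.Println("fetched:", len(entities))

	errs := ApplyAll(entities, 10, func(op string) error { return nil })
	fmt.Println("errors:", len(errs))
}
```

With a page size of 1000, reading 2500 entities takes three list calls instead of one call per entity, and the semaphore keeps the number of concurrent operations bounded so that parallelism helps on high-latency links without flooding Opa.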

Choice of language

uropa is written in Go because:

Go provides good concurrency primitives, which help ensure high performance for uropa.
Go’s compiler produces a statically compiled binary, meaning no other dependencies need to be installed on the system. This gives a very good end-user experience, as downloading and copying a single binary is easy and fast.
uropa’s original goal was much larger than what it is today. If we decide to pursue larger goals (think a control plane for Opa) in the future, Go is probably the best language available for writing that type of software.
The original author was familiar with Go :)