1. Introduction

Open Privacy Vault (OPV): secure, performant, open source PII as a Service. The goal of OPV is to provide an open standard for PII management. With structural PII validation, performant APIs, pluggable databases, an extensible encryption and hashing engine, and flexible policy and authorization, we aim to set the highest open standard for the fair use of PII.

Quick Start

Running Local OPV

Note that all OPV configuration is done through environment variables, for example OPV_DB_DRIVER=mysql. See Env Configuration.

Start with Docker:

# Control Plane

docker run -it \
  -v /tmp/opv_data:/data \
  -p 27999:27999 \
  openprivacyio/opv \
  controlplane

# Data Plane

docker run -it \
  -v /tmp/opv_data:/data \
  -p 28000:28000 \
  openprivacyio/opv \
  dataplane

# Proxy Plane (depends on the Data Plane and a grant token created from the Control Plane)

docker run -it \
  -v /tmp/opv_data:/data \
  -p 28001:28001 \
  -e OPV_PROXY_PLANE_DEFAULT_DP_BASE_URL='http://127.0.0.1:28000' \
  -e OPV_PROXY_PLANE_DEFAULT_DP_GRANT_TOKEN='grant_token_that_can_read_write_facts' \
  openprivacyio/opv \
  proxyplane
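
Any other setting can be passed the same way, with `-e`. As a minimal sketch, the hypothetical helper below starts the data plane with OPV_DB_DRIVER set (defaulting to mysql, the value mentioned above). The `run_dataplane` function and the `DOCKER` override are assumptions for illustration and are not part of OPV; a real MySQL setup will likely need further connection settings, which are not shown here.

```shell
# Hypothetical helper, not part of OPV: start the data plane with a DB driver set.
# DOCKER can be overridden (e.g. DOCKER=echo) to print the command instead of running it.
run_dataplane() {
  "${DOCKER:-docker}" run -it \
    -v /tmp/opv_data:/data \
    -p 28000:28000 \
    -e "OPV_DB_DRIVER=${OPV_DB_DRIVER:-mysql}" \
    openprivacyio/opv \
    dataplane
}
```

Running `DOCKER=echo run_dataplane` prints the resulting command without starting a container, which is a quick way to check which variables would be passed.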

Start from source code:

git clone https://github.com/open-privacy/opv
cd opv
make vendor

# Start the control plane and data plane
make run

# Start the proxyplane (optional)
OPV_PROXY_PLANE_DEFAULT_DP_GRANT_TOKEN='grant_token_that_can_read_write_facts' make run_proxyplane

Try it with the playground:

  • https://playground.openprivacy.io/swagger/index.html
  • A testing grant token for the domain sandbox.example.com ( v1:sandbox.example.com:Iy8TJZcuhicocCklFdwA ) can be used to test the playground. It has permission to access all paths and HTTP methods of the data plane within the domain sandbox.example.com.
{
  "token": "v1:sandbox.example.com:Iy8TJZcuhicocCklFdwA",
  "domain": "sandbox.example.com",
  "allowed_http_methods": ["*"],
  "paths": ["*"]
}
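
The sample token appears to consist of three colon-separated parts: a version, the domain, and a secret. That layout is inferred from the example above and is not a documented guarantee. Under that assumption, a small sketch (the `token_domain` helper is hypothetical) can catch a token being used against the wrong domain before a request is sent:

```shell
# Hypothetical helper: print the domain part of a "version:domain:secret" token.
# The three-part layout is an assumption based on the sample token.
token_domain() {
  rest="${1#*:}"               # strip the leading "version:" part
  printf '%s\n' "${rest%:*}"   # strip the trailing ":secret" part
}
```

For example, `token_domain 'v1:sandbox.example.com:Iy8TJZcuhicocCklFdwA'` prints the domain portion of the sandbox token.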
[Figure: architecture diagram, arch.png]

APIs

For more information, please take a look at

Now you can test the APIs with curl.

# Create a new grant token from the control plane http://127.0.0.1:27999

curl -X POST 'http://127.0.0.1:27999/api/v1/grants' \
--header 'Content-Type: application/json' \
--data-raw '{
        "allowed_http_methods": ["*"],
        "paths": ["*"],
        "domain": "sandbox.example.com"
}'


# The response will give you a grant token for data plane access
# You can pass the token via HTTP header X-OPV-GRANT-TOKEN
{
  "token": "v1:sandbox.example.com:Iy8TJZcuhicocCklFdwA",
  "domain": "sandbox.example.com",
  "allowed_http_methods": ["*"],
  "paths": ["*"]
}
# Store a new fact by calling the data plane http://127.0.0.1:28000
# Please replace the token with the token you just got above

curl -X POST 'http://127.0.0.1:28000/api/v1/facts' \
-H 'Content-Type: application/json' \
-H 'X-OPV-GRANT-TOKEN: v1:sandbox.example.com:Iy8TJZcuhicocCklFdwA' \
--data-raw '{
        "fact_type_slug": "ssn",
        "value": "123-45-6789"
}'
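
To read the fact back, send a GET request to /api/v1/facts/:id on the data plane with the same header. The sketch below assumes the response to the POST contains the fact's ID (for example fact_1LqMuvudjA1xdtqbjd0l); the `DP_BASE_URL` variable and the `fact_url` and `get_fact` helpers are hypothetical names used only for illustration.

```shell
# Base URL of the local data plane (hypothetical variable name).
DP_BASE_URL="${DP_BASE_URL:-http://127.0.0.1:28000}"

# Print the URL of a single fact: fact_url <fact_id>
fact_url() {
  printf '%s/api/v1/facts/%s\n' "$DP_BASE_URL" "$1"
}

# Fetch a fact from the data plane: get_fact <fact_id> <grant_token>
get_fact() {
  curl -sS -H "X-OPV-GRANT-TOKEN: $2" "$(fact_url "$1")"
}
```

With a local data plane running, `get_fact fact_1LqMuvudjA1xdtqbjd0l 'v1:sandbox.example.com:Iy8TJZcuhicocCklFdwA'` would issue the request; replace both arguments with your own fact ID and token.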

One can open the local swagger UI to test the APIs:

One can also test the proxyplane:

Performance

Endpoint Latency

We expect P99 latency below 10 ms for GET requests to /api/v1/facts/:id, the most heavily used endpoint, which retrieves PII from a tokenized fact. We run a simple benchmark (far from rigorous, of course) to keep track of the performance of the GET endpoints.

#!/bin/sh

# Make sure you have vegeta installed: https://github.com/tsenart/vegeta
# Replace the X-Opv-Grant-Token value and the fact ID before running

# printf keeps the \n escape portable across sh implementations
printf 'GET http://127.0.0.1:28000/api/v1/facts/fact_1LqMuvudjA1xdtqbjd0l\nX-Opv-Grant-Token: v1:sandbox.example.com:Iy8TJZcuhicocCklFdwA\n' \
    | vegeta attack -duration=10s | vegeta report

# Example of the result
# $ sh vegeta.sh

# Requests      [total, rate, throughput]         500, 50.11, 50.11
# Duration      [total, attack, wait]             9.978s, 9.977s, 1.17ms
# Latencies     [min, mean, 50, 90, 95, 99, max]  753.995µs, 1.679ms, 1.415ms, 2.564ms, 3.061ms, 6.926ms, 12.096ms
# Bytes In      [total, mean]                     62000, 124.00
# Bytes Out     [total, mean]                     0, 0.00
# Success       [ratio]                           100.00%
# Status Codes  [code:count]                      200:500
# Error Set:
[Figure: benchmark flame graph, benchmark_flamegraph.png]
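
To turn the P99 < 10 ms expectation into an automated check, the text report can be parsed directly. The following is a minimal sketch, assuming the Latencies line has the column order shown in the example output (min, mean, 50, 90, 95, 99, max) and uses ns, µs, ms, or s units; `p99_ms` and `p99_within` are hypothetical helpers, not part of OPV or vegeta.

```shell
# Read a vegeta text report on stdin and print the P99 latency in milliseconds.
p99_ms() {
  awk '
    /Latencies/ {
      sub(/^.*\] */, "")            # drop the label and the bracketed column list
      split($0, v, /, */)           # columns: min, mean, 50, 90, 95, 99, max
      num = v[6]
      sub(/[^0-9.]+$/, "", num)     # numeric part of the P99 value
      unit = substr(v[6], length(num) + 1)
      if (unit == "ms") factor = 1
      else if (unit == "s") factor = 1000
      else if (unit == "ns") factor = 0.000001
      else if (unit == "µs" || unit == "us") factor = 0.001
      else exit 1                   # other duration formats are not handled
      printf "%.3f\n", num * factor
    }'
}

# Succeed when the first argument (ms) is below the second (budget in ms).
p99_within() {
  awk -v p="$1" -v b="$2" 'BEGIN { exit !(p + 0 < b + 0) }'
}
```

Piping the output of `vegeta report` into `p99_ms` and passing the result to `p99_within <value> 10` gives an exit status that a CI job could act on.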

Prometheus Monitoring

By default, Prometheus metrics are enabled for both the data plane and the control plane (controlled by the environment variable OPV_PROMETHEUS_ENABLED). You can connect your Prometheus scraper with the following config:

scrape_configs:
  - job_name: opv-metrics
    metrics_path: /metrics
    static_configs:
      - targets:
          - "opv_controlplane:27999"   # replace with the actual control plane host:port
          - "opv_dataplane:28000"      # replace with the actual data plane host:port
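
Before wiring up a scraper, it can help to confirm that an endpoint actually serves metrics. The sketch below counts sample lines in Prometheus text exposition output, where lines starting with # are HELP/TYPE comments. `count_samples` is a hypothetical helper, and the URL in the usage comment assumes the local control plane from the quick start.

```shell
# Count metric sample lines (non-comment, non-empty) read from stdin.
count_samples() {
  grep -v '^#' | grep -c .
}

# Usage against a local control plane (not run here):
#   curl -s http://127.0.0.1:27999/metrics | count_samples
```

A count of zero suggests that metrics are disabled or that the wrong port is being queried.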

For local benchmarking, we created a docker-compose network to test with Prometheus and Grafana, so that we can closely monitor performance regressions.

cd ./benchmark
docker-compose up --build
[Figure: Prometheus monitoring screenshot, prometheus.png]
