PAP Configuration

This section describes the setup of VarFish behind a PAP (packet filter, application gateway, packet filter) structure.

VarFish stores human genetic data, which is by its very nature highly privacy sensitive. Administrators will thus want to set up VarFish in protected institutional networks that are not accessible from the outside world. However, some data exchange is generally desired, such as connecting two or more VarFish instances via the clinical beacon protocol.

PAP Structure

In such cases, the German Federal Office for Information Security (BSI) recommends the P-A-P structure (link to 2021 edition of their recommendation). The following figure illustrates the structure.

_images/pap-structure.png

Overview of VarFish server behind P-A-P structure.

The structure is as follows:

  • A demilitarized zone (DMZ) network is set up to contain an application gateway. In the case of HTTP(S), this is a reverse proxy.

  • Incoming traffic from the internet passes through a packet filter (in other words, a firewall) before it reaches the gateway.

  • Outgoing traffic from the gateway passes through another packet filter before it reaches the destination server in the protected network.

The reasoning behind the structure is explained in the NET.3.2 document linked above. In the following section, we explain the technical implementation.

Firewall and Network Setup

The German specification NET.3.2.A16 is as follows:

NET.3.2.A16 Aufbau einer “P-A-P” Struktur (S) Eine “Paketfilter - Application-Level-Gateway - Paketfilter”-(P-A-P)-Struktur SOLLTE eingesetzt werden. Sie MUSS aus mehreren Komponenten mit jeweils dafür geeigneter Hard- und Software bestehen. Für die wichtigsten verwendeten Protokolle SOLLTEN Sicherheitsproxies auf Anwendungsschicht vorhanden sein. Für andere Dienste SOLLTEN zumindest generische Sicherheitsproxies für TCP und UDP genutzt werden. Die Sicherheitsproxies SOLLTEN zudem innerhalb einer abgesicherten Laufzeitumgebung des Betriebssystems ablaufen.

Which translates into English roughly as follows:

NET.3.2.A16 Creating a “P-A-P” Structure (S) A “packet filter - application level gateway - packet filter” (P-A-P) structure SHOULD be used. It MUST consist of multiple components, each with suitable hardware and software. For the most important protocols in use, security proxies SHOULD exist on the application layer. For other services, at least generic security proxies for TCP and UDP SHOULD be used. In addition, the security proxies SHOULD run inside a secured runtime environment of the operating system.

A possible implementation looks as follows:

  • The VarFish server runs in the internal network with IP 10.0.10.10.

  • Create a separate VLAN for the PAP structure and give it a small subnet, for example a /30 CIDR prefix (two usable host addresses) or an even smaller network. Only place proxy services there, ideally only one.

    • Example: use 1.2.3.0/30 with IP gateway 1.2.3.1 and application gateway server 1.2.3.2.

  • Configure the firewall to allow incoming traffic via HTTPS (TCP/443) to 1.2.3.2 only.

  • Allow outgoing traffic from the application gateway (1.2.3.2) via the packet filter to 10.0.10.10 via HTTPS (TCP/443) only.
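
As an illustration only, the two firewall rules above could look as follows on a Linux-based packet filter that forwards traffic between the networks. The use of iptables, the FORWARD chain, and the helper name pap_forward_rules are assumptions made for this sketch and are not part of the VarFish setup; adapt them to your firewall product. The function only prints the commands so that they can be reviewed before anything is applied.

```sh
#!/bin/sh
# Addresses taken from the example above.
GATEWAY_IP=1.2.3.2      # application gateway in the DMZ
VARFISH_IP=10.0.10.10   # VarFish server in the internal network

# Hypothetical helper: print (do not apply) the forwarding rules.
pap_forward_rules() {
  # Drop all forwarded traffic that no rule below accepts.
  echo "iptables -P FORWARD DROP"
  # Let replies of already accepted connections pass.
  echo "iptables -A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT"
  # Outer packet filter: internet to gateway, HTTPS only.
  echo "iptables -A FORWARD -p tcp -d $GATEWAY_IP --dport 443 -j ACCEPT"
  # Inner packet filter: gateway to VarFish, HTTPS only.
  echo "iptables -A FORWARD -p tcp -s $GATEWAY_IP -d $VARFISH_IP --dport 443 -j ACCEPT"
}

pap_forward_rules
```

The sketch leaves out everything else the gateway may need, such as outbound DNS and access to the Let's Encrypt servers for certificate issuance.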

The following section describes how to set up a Linux Docker container with the Traefik reverse proxy. To the authors’ best understanding, this fulfills all of the required and optional rules for P-A-P by BSI.

Traefik Reverse Proxy Setup

Traefik is a versatile reverse proxy (and load balancer). It works well with Docker but configuring it can be a bit daunting for beginners. The following describes a straightforward and minimal setup.

Preparation:

  1. Install a modern Linux distribution on the gateway server (1.2.3.2 from above).

  2. On the server, install Docker following the official instructions

  3. Also install Docker Compose following the official instructions.

  4. Set up public DNS (e.g., varfish-ext.example.com) to point to 1.2.3.2 and ensure that public resolvers can resolve it (e.g., Google DNS at 8.8.8.8).

  5. We assume that your internal VarFish instance is available as varfish-int.example.com and that it is set up with a valid TLS certificate.

  6. Collect the public IPs of the hosts on the internet that you want to be able to access your VarFish instance. These might be cluster IPs if the remote servers are behind NAT. In the example below we use the sub network 2.3.4.0/28 and IP 3.4.5.6 as valid sources.

First, create some directories with the following commands:

# mkdir -p /etc/reverse-proxy
# mkdir -p /etc/reverse-proxy/var/traefik
# mkdir -p /etc/reverse-proxy/etc/traefik
# mkdir -p /etc/reverse-proxy/etc/traefik/conf.d

Now, create the file /etc/reverse-proxy/docker-compose.yaml as follows.

/etc/reverse-proxy/docker-compose.yaml
version: "2"

services:
  traefik:
    image: traefik:latest
    restart: always
    ports:
      - "443:443"
    networks:
      - web
    volumes:
      - ./var/traefik:/var/traefik:rw
      - ./etc/traefik:/etc/traefik:ro
    container_name: traefik

networks:
  web:

This will create a new container named traefik with the latest version of Traefik. The container goes into its own network and port 443 is exposed. Via bind mounts, the container can read /etc/reverse-proxy/etc/traefik as /etc/traefik, and read and write /etc/reverse-proxy/var/traefik as /var/traefik. The former contains the configuration; the latter is used for storing the Let's Encrypt certificate generation state.

Next, create /etc/reverse-proxy/etc/traefik/traefik.yaml and /etc/reverse-proxy/etc/traefik/conf.d/dynamic_config.yaml.

/etc/reverse-proxy/etc/traefik/traefik.yaml
entryPoints:
  websecure:
    address: ":443"

providers:
  file:
    directory: /etc/traefik/conf.d
  docker:
    exposedByDefault: false

certificatesResolvers:
  le:
    acme:
      email: youremail@example.com
      storage: /var/traefik/acme.json
      tlsChallenge: true

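This defines the HTTPS entry point on port 443, tells Traefik to load its dynamic configuration from /etc/traefik/conf.d, and configures a certificate resolver named le that obtains Let's Encrypt certificates via the TLS challenge.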

Note

Regarding the use of “legacy” technical language: please note that the term ipwhitelist below is part of the Traefik configuration syntax. Traefik has since introduced ipAllowList as the replacement name, so check which name your Traefik version expects.

/etc/reverse-proxy/etc/traefik/conf.d/dynamic_config.yaml
# (1) TLS store
tls:
  stores:
    default: {}

http:
  # (2) set routing source for reverse proxy
  routers:
    varfish:
      middlewares:
        - varfish-add-prefix
        - varfish-ip-allowlist
      entryPoints:
        - websecure
      service: varfish
      rule: "Host(`varfish-ext.example.com`)"
      tls:
        certresolver: le
  # (3) routing destination for the reverse proxy
  services:
    varfish:
      loadBalancer:
        servers:
          - url: "https://varfish-int.example.com"

  middlewares:
    # (4) expose only beaconsite endpoint
    varfish-add-prefix:
      addprefix:
        prefix: "/beaconsite/endpoint"
    varfish-ip-allowlist:
      ipwhitelist:
        sourcerange: "2.3.4.0/28,3.4.5.6"

This will set up

  1. the TLS store for the certificates,

  2. the routing source and

  3. the routing destination for the reverse proxy,

  4. the automatic addition of the /beaconsite/endpoint prefix so that only the beaconsite endpoint is exposed, and

  5. the restriction of access to the given source addresses.
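
To see which client addresses the source range in the example covers, the following shell sketch checks an IPv4 address against a comma-separated list of CIDR ranges. It is not Traefik's implementation, only an illustration of the matching; the function names ip_to_int, ip_in_range and ip_allowed are made up for this example, and only plain IPv4 addresses without leading zeros are handled.

```sh
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1            # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 lies inside range $2 (a bare address counts as /32).
ip_in_range() {
  case $2 in
    */*) net=${2%/*}; bits=${2#*/} ;;
    *)   net=$2; bits=32 ;;
  esac
  # 4294967295 is 0xFFFFFFFF; keep only the upper $bits bits of it.
  mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
  [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
}

# Succeed if address $1 matches any entry of the comma-separated list $2.
ip_allowed() {
  addr=$1
  old_ifs=$IFS; IFS=,
  set -- $2            # split the list into individual ranges
  IFS=$old_ifs
  for entry in "$@"; do
    ip_in_range "$addr" "$entry" && return 0
  done
  return 1
}

# The /28 from the example spans 2.3.4.0 to 2.3.4.15.
ip_allowed 2.3.4.15 "2.3.4.0/28,3.4.5.6" && echo "2.3.4.15 allowed"
ip_allowed 2.3.4.16 "2.3.4.0/28,3.4.5.6" || echo "2.3.4.16 denied"
```

A check like this can help to confirm that the collected public IPs of the remote sites actually fall into the ranges you configure.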

You can now start up the reverse proxy:

# cd /etc/reverse-proxy
# docker-compose up -d

You can inspect the logs by using docker logs --tail=100 --follow traefik. You can increase the log verbosity by placing the following block at the top of traefik.yaml.

log:
  level: DEBUG
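
Once the container is running, a quick check from outside can confirm that the source restriction works. The following is a minimal sketch that assumes curl is installed on the machine you test from; the helper names probe_status and expect_status are hypothetical.

```sh
#!/bin/sh
# Print the HTTP status code returned for the URL in $1.
# curl prints 000 if no HTTP response was received at all.
probe_status() {
  curl --silent --output /dev/null --max-time 10 --write-out '%{http_code}' "$1"
}

# Compare the status observed for URL $1 with the expected status $2.
expect_status() {
  url=$1 expected=$2
  actual=$(probe_status "$url")
  if [ "$actual" = "$expected" ]; then
    echo "ok: $url returned $actual"
  else
    echo "unexpected: $url returned $actual, expected $expected"
    return 1
  fi
}
```

For example, running expect_status https://varfish-ext.example.com/ 403 from a host outside the allowed source range should succeed, because the IP restriction middleware answers such requests with 403 Forbidden. The status seen from an allowed host depends on the VarFish endpoint itself.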