Post mortem: oauth2-proxy security incident
The use of the trusted-ip flag for oauth2-proxy resulted in skipped authentication steps for various services in the Shivering-Isles infrastructure.
Impact
Multiple endpoints that were considered to be protected by OIDC authentication through the central SI-Auth SSO provider were exposed without authentication.
After a brief investigation regarding the exposed endpoints, it can be confidently said that no PII or otherwise sensitive information has been leaked. Further, no indicators of compromise were found.
While Prometheus, Alertmanager and Forecastle were exposed to the internet, it’s unlikely to be a reason for concern, as their access is read-only in nature.
For the Longhorn web UI, which was exposed to the internet, the attack vectors were limited to adjusting Longhorn configurations, deleting or creating backups, and deleting or creating volumes. As metrics indicate that none of these actions have taken place, it’s assumed that no attacker took advantage of these capabilities.
All other services, including all IoT endpoints, that were affected by this incident, have not been exposed to the internet or untrusted network devices. Therefore it’s assumed that no compromise has taken place.
Root Causes
The trusted-ip flag was introduced under the misconception that it would work similarly to trusted IP/proxy options in other software, where such an option allows trusted headers, like X-Forwarded-For, to be interpreted by oauth2-proxy, providing the original request IP in the request logs.
The actual implementation, however, skips authentication entirely when the request is submitted from an IP address within this range. The configuration option was set to the pod CIDR of the Kubernetes cluster, to allow ingress-nginx Pods to be identified as a trusted entity.
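The effect of that setting can be illustrated with a minimal sketch. It is not oauth2-proxy's code; the CIDR, the Pod IP and the function name are assumptions made up for illustration. It models the allowlist semantics described above: since requests presumably reach oauth2-proxy from an ingress-nginx Pod, their source address lies inside the pod CIDR regardless of where the original client is located.

```python
import ipaddress

# Hypothetical value for illustration; the real pod CIDR is not stated in this post.
POD_CIDR = ipaddress.ip_network("10.42.0.0/16")


def bypasses_auth(remote_addr: str) -> bool:
    """Model of the allowlist semantics: a source address inside the
    trusted range skips authentication entirely."""
    return ipaddress.ip_address(remote_addr) in POD_CIDR


# The address seen by the proxy is that of the ingress Pod, not the client.
ingress_pod_ip = "10.42.3.17"  # hypothetical Pod IP inside the pod CIDR
print(bypasses_auth(ingress_pod_ip))   # True: every proxied request is trusted
print(bypasses_auth("203.0.113.50"))   # False: only a direct external source is checked
```

Under this model, every request relayed by the ingress controller counts as trusted, which matches the behaviour observed during the incident.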
Trigger
The incident itself was triggered by the commit b404d3ca, which rolled out the use of trusted-ip to oauth2-proxy instances across the entire infrastructure.
Resolution
The mistake was fixed by removing the trusted-ip option from the deployment in the commit a500e1ca.
Detection
The issue was detected when reviewing a configuration change in Alertmanager: no authentication screen was triggered before access was granted. Investigation through a “private Firefox window” showed that no authentication was required.
The first investigation expected a problem with ingress-nginx, but access logs from both ingress-nginx and oauth2-proxy confirmed that requests were successfully routed. However, oauth2-proxy would always answer with HTTP Status 202 on the /oauth2/auth endpoint, which is only expected for authenticated users.
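The check performed by hand here can be sketched as a small probe. The function names and the example URL are hypothetical; the only facts taken from this incident are that /oauth2/auth answered 202 without credentials, and the sketch assumes that a correctly protected endpoint answers 401 to an anonymous request.

```python
import urllib.error
import urllib.request


def anonymous_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of a request sent without cookies or tokens."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises 4xx/5xx responses as exceptions; the code is what matters here.
        return err.code


def classify(status: int) -> str:
    """Interpret the status an anonymous client received from the auth endpoint."""
    if status == 401:
        return "enforced"   # anonymous request was rejected, as expected
    if status == 202:
        return "bypassed"   # anonymous request was accepted: authentication is skipped
    return "unknown"        # routing or availability problem, needs a closer look


# Usage with a hypothetical hostname:
# print(classify(anonymous_status("https://alertmanager.example.com/oauth2/auth")))
```

Running such a probe from outside the cluster and from inside the local network would cover both vantage points, since the trust decision depends on the source address.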
Action Items
- Remove the use of trusted-ip from all oauth2-proxy instances
- Put network-level restrictions in place to add an additional layer of security
- Revisit all deployed oauth2-proxy instances and check configuration options
- Add monitoring for expected authentication requests
Lessons Learned
- “Trusted IPs” can have very different meanings depending on software implementations
- Validate endpoints to be actually authenticated on a regular basis
What went well
- Most endpoints protected by oauth2-proxy were also restricted to local networks only as a security-in-depth measure. As a result, they continued to be inaccessible from the internet
- All services that were exposed contained non-critical information
- Even if an attacker had deleted volumes and backups, the second level of backups could have been restored, and by extension all content of the volumes
- Most software was already using its built-in SSO capabilities for OIDC (like Grafana or Minio) and was therefore not vulnerable, even if it was additionally behind an oauth2-proxy
What went wrong
- The Longhorn web UI was exposed, which could have resulted in deleted volumes and deleted backups of those volumes
- The skipped authentication remained unnoticed for 96 days and was only discovered by accident
Where we got lucky
- Noticing this issue was pure luck; it could have stayed unnoticed for further weeks
- The fact that no one decided to mess with the Longhorn web UI was also lucky, preventing actual damage to services
Timeline
Time (Europe/Berlin) | Action |
---|---|
2023-09-26 20:18:55 | Introduction of the trusted-ip configuration option |
2023-12-31 03:58:00 | Noticing the unauthenticated endpoint for Alertmanager |
2023-12-31 04:00:00 | Restrict monitoring endpoints to local networks |
2023-12-31 04:11:00 | Validating configuration and searching for recent bug reports about external authentication with ingress-nginx |
2023-12-31 04:14:00 | Validating configuration and searching for recent bug reports about external authentication with ingress-nginx in combination with oauth2-proxy |
2023-12-31 04:50:00 | Validating issue with trusted-ip flag |
2023-12-31 04:56:00 | Fix disabling trusted-ip lands in GitOps Repository and is deployed to production |
2023-12-31 05:09:00 | Investigating some unrelated problems with oauth2-proxy integration that now show up due to actual authentication taking place |
2023-12-31 05:30:00 | Add monitoring for authenticated Endpoints that validates authentication requirement |
2023-12-31 05:45:00 | Investigating exposure (relevant Endpoints and introduction of trusted-ip setting) and validation of oauth2-proxy logic |
2023-12-31 06:31:00 | Writing post-mortem for incident |
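The exposure window of 96 days mentioned earlier follows from the first two rows of the timeline. A short calculation, assuming the usual Europe/Berlin offsets (UTC+2 in September, UTC+1 in December), shows both the calendar-day difference and the exact elapsed time:

```python
from datetime import datetime, timedelta, timezone

CEST = timezone(timedelta(hours=2))  # Europe/Berlin summer time (assumed for September)
CET = timezone(timedelta(hours=1))   # Europe/Berlin standard time (assumed for December)

introduced = datetime(2023, 9, 26, 20, 18, 55, tzinfo=CEST)
noticed = datetime(2023, 12, 31, 3, 58, 0, tzinfo=CET)

calendar_days = (noticed.date() - introduced.date()).days
elapsed = noticed - introduced

print(calendar_days)  # 96
print(elapsed)        # 95 days, 8:39:05
```

Counted by calendar date the window is 96 days; the exact elapsed time is a little under that.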
Supporting information
Quote from the oauth2-proxy configuration page regarding the trusted-ip setting:
list of IPs or CIDR ranges to allow to bypass authentication (may be given multiple times). When combined with --reverse-proxy and optionally --real-client-ip-header this will evaluate the trust of the IP stored in an HTTP header by a reverse proxy rather than the layer-3/4 remote address. WARNING: trusting IPs has inherent security flaws, especially when obtaining the IP address from an HTTP header (reverse-proxy mode). Use this option only if you understand the risks and how to manage them.
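The warning about header-derived addresses can be made concrete with a small model. This is a sketch of the behaviour the documentation describes, not oauth2-proxy's implementation; the trusted range, the function names and the default header name used here are assumptions for illustration.

```python
import ipaddress

TRUSTED = [ipaddress.ip_network("10.42.0.0/16")]  # hypothetical trusted range


def client_ip(remote_addr: str, headers: dict, reverse_proxy: bool,
              header_name: str = "X-Real-IP") -> str:
    """Pick the address whose trust is evaluated: the header value in
    reverse-proxy mode, otherwise the layer-3/4 remote address."""
    if reverse_proxy and header_name in headers:
        # Headers may carry a comma-separated chain; take the first entry.
        return headers[header_name].split(",")[0].strip()
    return remote_addr


def is_trusted(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in TRUSTED)


# If the header is not overwritten by the reverse proxy, a client can claim any address.
spoofed = {"X-Real-IP": "10.42.0.9"}
print(is_trusted(client_ip("203.0.113.50", spoofed, reverse_proxy=True)))   # True
print(is_trusted(client_ip("203.0.113.50", spoofed, reverse_proxy=False)))  # False
```

In this model, header-based trust is only as reliable as the proxy that sets the header, which is the risk the quoted warning points at.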
Relevant sections in the oauth2-proxy code:
- Relevant function for the /oauth2/auth endpoint
- Relevant function for the session handling
- Relevant function that grants access if the source IP is in the trusted IP range
Further information regarding usage of the external-auth feature with ingress-nginx: