Metadata Aggregation Practice Statement v20220712

Changes to the Metadata Aggregation Practice Statement are announced to the SAFIRE Participants’ Forum.

SAFIRE generates a number of metadata aggregates for various purposes, including inter-federation and its own internal operations. This document gives a broad overview of how the aggregation process works. It is currently non-normative and will be refined over time.

Metadata aggregator

SAFIRE makes use of WAYF’s PHPH (PHederation PHeeder) metadata aggregation software. An overview of the configuration of this aggregator and the aggregates it generates is publicly available at https://phph.safire.ac.za/.

Not all of the aggregates generated by PHPH are intended for public consumption. The definitive source of information about SAFIRE’s metadata is https://safire.ac.za/technical/metadata/.

SAFIRE Federation Registry

SAFIRE’s internal federation registry serves as an input to the aggregation process. The process for taking entities into the Federation is documented in the metadata registration practice statement.

The registry is implemented as a private Git repository hosted on GitHub. To ensure that the aggregator functions correctly in the event of an outage, it maintains a cached copy of this repository and collates entities into a private feed. A webhook ensures that changes pushed to the master repository are automatically pulled by the aggregator when it next runs, thus updating the cache. This means that changes to the federation registry should automatically be reflected in the aggregates on the next publish.

Publication information

All metadata aggregates generated by SAFIRE make use of the SAML-Metadata-RPI-V1.0 metadata extension [1] to indicate that SAFIRE is the publisher. The following is a non-normative example:

<md:Extensions xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata" xmlns:mdrpi="urn:oasis:names:tc:SAML:metadata:rpi">
  <mdrpi:PublicationInfo creationInstant="2017-03-03T08:00:01Z" publisher="https://safire.ac.za"/>
</md:Extensions>
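
A consumer can read the publisher back out of an aggregate with a few lines of Python. The following is a minimal sketch using only the standard library. The function name `publication_info` is illustrative, and the sample document is simply the fragment above wrapped in an `md:EntitiesDescriptor`. It does not verify the aggregate's signature, which a real consumer should do before trusting any of its content.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes as used in the example above.
NS = {
    "md": "urn:oasis:names:tc:SAML:2.0:metadata",
    "mdrpi": "urn:oasis:names:tc:SAML:metadata:rpi",
}

def publication_info(xml_text):
    """Return (publisher, creationInstant) from the first
    mdrpi:PublicationInfo element, or None if there is none."""
    root = ET.fromstring(xml_text)
    info = root.find(".//mdrpi:PublicationInfo", NS)
    if info is None:
        return None
    return info.get("publisher"), info.get("creationInstant")

# Illustrative input only: the non-normative fragment in a minimal wrapper.
SAMPLE = """<md:EntitiesDescriptor
    xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
    xmlns:mdrpi="urn:oasis:names:tc:SAML:metadata:rpi">
  <md:Extensions>
    <mdrpi:PublicationInfo creationInstant="2017-03-03T08:00:01Z"
                           publisher="https://safire.ac.za"/>
  </md:Extensions>
</md:EntitiesDescriptor>"""

print(publication_info(SAMPLE))
# ('https://safire.ac.za', '2017-03-03T08:00:01Z')
```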

Signing

All metadata aggregates produced by SAFIRE are signed at publication using XMLDsig with the rsa-sha256 algorithm. A separate key management practice statement details the processes for handling private keys. Details of the signing keys for individual aggregates are available from https://safire.ac.za/technical/metadata/.

Validation

SAFIRE’s metadata aggregator validates all incoming metadata for XML well-formedness, against the relevant schemas, and using a set of XSLT rules derived from the UK federation’s metadata toolchain [2]. Entities that fail validation MAY be removed from the aggregates SAFIRE publishes.

The validation rules currently in use by SAFIRE are available from SAFIRE’s GitHub repository. The online validator at https://validator.safire.ac.za/ applies a superset of these rules, checks for conformance with the SAFIRE SAML2 Technology Profile, and can be used to debug failing metadata.

Logo caching

To ensure the availability of logos within the Federation hub’s user interface (and in particular, on the user consent page) and on discovery pages, the aggregator caches logos specified by SAML-Metadata-MDUI-V1.0 metadata extensions [3].

The logo cache is inspected during each aggregation run, and cached logos are updated if necessary. Freshness lifetime is calculated from the Expires: and Cache-Control: max-age directives from the logo source in accordance with RFC 7234 [4]. Where the lifetime is known, requests are made with an If-Modified-Since: header. However, the aggregator does not understand ETags, nor does it honour must-revalidate directives. Cached logos do not automatically expire, and may continue to be served in the case where the original source becomes unreachable (even if they’ve surpassed the freshness lifetime).
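
The freshness calculation can be sketched roughly as follows. This is not the aggregator's own code: it is an illustration of the RFC 7234 rule that max-age takes precedence over Expires, and that Expires is measured against the response's Date header. The function names are hypothetical, `headers` is assumed to be a plain dict with canonically cased header names, and any `now` value passed in is assumed to be timezone-aware.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def _parse_http_date(value):
    """Parse an HTTP date, returning an aware datetime or None if invalid."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed

def freshness_lifetime(headers, now=None):
    """Return the freshness lifetime in seconds, or None if unknown."""
    # 1. Cache-Control: max-age wins if present.
    match = re.search(r"max-age\s*=\s*(\d+)",
                      headers.get("Cache-Control", ""), re.IGNORECASE)
    if match:
        return int(match.group(1))

    # 2. Otherwise fall back to Expires minus Date.
    expires = headers.get("Expires")
    if expires is None:
        return None
    expires_at = _parse_http_date(expires)
    if expires_at is None:
        return 0  # an unparseable Expires is treated as already expired
    reference = (_parse_http_date(headers.get("Date", ""))
                 or now or datetime.now(timezone.utc))
    return max(0, int((expires_at - reference).total_seconds()))

print(freshness_lifetime({"Cache-Control": "public, max-age=3600"}))  # 3600
print(freshness_lifetime({
    "Date": "Fri, 03 Mar 2017 08:00:00 GMT",
    "Expires": "Fri, 03 Mar 2017 10:00:00 GMT",
}))  # 7200
```

A return value of None corresponds to the "lifetime is not known" case described above.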

Smaller logos (typically less than 40 kilobytes per eduGAIN Best Current Practice [5]) may be replaced with their equivalent using the RFC 2397 data: URI scheme [6]. Copies of other logos are hosted on the Federation’s content delivery network (CDN) and may be served from there. If appropriate, the corresponding mdui:Logo element will be updated to reflect the cached source. Logos for which there is no cached copy will be served with the original URL, even if that may result in a broken image.
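
As an illustration of the inlining step, a small logo can be turned into an RFC 2397 data: URI as sketched below. The threshold constant and the function name are assumptions made for this example. In particular, reading "40 kilobytes" as 40 × 1024 bytes of raw image data is a guess, not a documented aggregator setting.

```python
import base64

# Assumed threshold: "less than 40 kilobytes", read here as 40 * 1024 raw bytes.
MAX_INLINE_BYTES = 40 * 1024

def to_data_uri(image_bytes, media_type):
    """Return a base64 data: URI for a small logo, or None if the
    logo is too large to inline and should be served from a URL."""
    if len(image_bytes) >= MAX_INLINE_BYTES:
        return None
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{media_type};base64,{encoded}"

print(to_data_uri(b"hello", "image/png"))
# data:image/png;base64,aGVsbG8=
```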

If the dimensions of a cached logo do not match those declared in the mdui:Logo element, the dimensions may be updated to match those of the cached logo.
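
To illustrate how declared and actual dimensions might be reconciled, the sketch below reads the width and height from a PNG's IHDR chunk using only the standard library. It assumes the cached logo is a PNG. Other image formats would need their own parsing, and both function names are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_dimensions(data):
    """Return (width, height) from a PNG's IHDR chunk, or None if
    `data` does not look like a PNG."""
    if (len(data) < 24 or not data.startswith(PNG_SIGNATURE)
            or data[12:16] != b"IHDR"):
        return None
    # Width and height are two big-endian 32-bit integers after "IHDR".
    return struct.unpack(">II", data[16:24])

def reconcile_dimensions(declared, data):
    """Prefer the cached image's real size; keep the declared
    (width, height) when the real size cannot be determined."""
    actual = png_dimensions(data)
    return actual if actual is not None else declared

# A PNG signature plus just enough of an IHDR chunk to hold the dimensions.
header = (PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR"
          + struct.pack(">II", 80, 60))
print(reconcile_dimensions((100, 100), header))  # (80, 60)
```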

Validity & expiry

By default, SAFIRE’s metadata aggregates are published with a validUntil date ten (10) days ahead of the current time. However, if any of the source metadata has an earlier expiry date, the earliest expiry date is used for the entire aggregate (this notably happens with the republication of eduGAIN metadata).

The cacheDuration is set to six (6) hours. This is to encourage providers to refresh their metadata more regularly than the minimum once-per-day specified in the SAML Technical Profile.
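
The validUntil rule amounts to taking the earlier of "now plus ten days" and the earliest expiry among the sources. A minimal sketch follows. The function names are illustrative, datetimes are assumed to be timezone-aware, and a source without an expiry is represented as None.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_VALIDITY = timedelta(days=10)
CACHE_DURATION = "PT6H"  # six hours, written as an xs:duration

def aggregate_valid_until(now, source_expiries):
    """Return now + 10 days, capped by the earliest source expiry."""
    candidates = [now + DEFAULT_VALIDITY]
    candidates.extend(e for e in source_expiries if e is not None)
    return min(candidates)

def format_instant(moment):
    """Format an aware datetime as a UTC xs:dateTime string."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

now = datetime(2022, 7, 12, 8, 0, 0, tzinfo=timezone.utc)
upstream = datetime(2022, 7, 18, 0, 0, 0, tzinfo=timezone.utc)

print(format_instant(aggregate_valid_until(now, [])))
# 2022-07-22T08:00:00Z
print(format_instant(aggregate_valid_until(now, [upstream, None])))
# 2022-07-18T00:00:00Z
```

The second call shows the republication case: a single source that expires sooner shortens the validity of the whole aggregate.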

Timings

Timing of metadata aggregate generation

Metadata aggregates can be generated either automatically or manually. Routine moves, adds and changes to entities are usually only published by the automated process, as are updates from inter-federation.

Automatic metadata aggregate generation is done once per day, starting at 10:00 SAST (UTC+2). The generation time can vary but is typically completed within five minutes.

Manual generation is done on an ad-hoc basis as and when required. This is typically in response to emergency change requests or to correct errors that have been reported in the published aggregates.

Timing of new metadata imports

SAFIRE’s federation hub and identity provider proxies can import new metadata either automatically or manually.

New metadata is automatically fetched once an hour, beginning at seven minutes past the hour. Importation time can vary but typically takes no longer than a couple of minutes. In general, you can safely assume that new metadata will be imported by 10 past the hour.
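
Combining the daily generation schedule with the hourly import gives a rough estimate of when a routine registry change becomes visible on the federation hub. The sketch below is only an estimate under stated assumptions. The five-minute generation time is the typical figure quoted earlier, the three-minute import time is chosen so that imports finish by ten past the hour, and all function names are hypothetical. Manual generation is not modelled.

```python
from datetime import datetime, timedelta, timezone

SAST = timezone(timedelta(hours=2), "SAST")
GENERATION_HOUR = 10                     # automatic run starts at 10:00 SAST
GENERATION_TIME = timedelta(minutes=5)   # typical duration; can vary
IMPORT_MINUTE = 7                        # hourly import begins at seven past
IMPORT_TIME = timedelta(minutes=3)       # assumed; "imported by 10 past"

def next_generation_start(now):
    """Next 10:00 SAST at or after `now` (an aware datetime)."""
    local = now.astimezone(SAST)
    start = local.replace(hour=GENERATION_HOUR, minute=0,
                          second=0, microsecond=0)
    if start < local:
        start += timedelta(days=1)
    return start

def next_import_start(now):
    """Next seven-past-the-hour at or after `now`."""
    start = now.replace(minute=IMPORT_MINUTE, second=0, microsecond=0)
    if start < now:
        start += timedelta(hours=1)
    return start

def estimated_hub_visibility(change_pushed_at):
    """Estimate when a registry change has been published and imported."""
    published = next_generation_start(change_pushed_at) + GENERATION_TIME
    return next_import_start(published) + IMPORT_TIME

pushed = datetime(2022, 7, 12, 15, 0, tzinfo=SAST)
print(estimated_hub_visibility(pushed).isoformat())
# 2022-07-13T10:10:00+02:00
```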

Propagation of changes into eduGAIN takes somewhat longer. It can take anywhere from a few hours to a couple of days for a particular eduGAIN service provider to pick up new metadata. The eduGAIN metadata aggregator first has to fetch the new metadata from SAFIRE [7], then the service’s home federation has to fetch an updated aggregate from eduGAIN, and finally the service provider has to fetch new metadata from their home federation. How long each of these steps takes depends on the individual operator’s own metadata practices (for instance, SAFIRE generally only fetches updated metadata from eduGAIN once a day).

Hosting

Published metadata aggregates are hosted on https://metadata.safire.ac.za/. This web server provides appropriate headers to facilitate caching per RFC 7234 [4].

References

  1. “SAML V2.0 Metadata Extensions for Registration and Publication Information Version 1.0”, 3 April 2012, OASIS Committee Specification 01. http://docs.oasis-open.org/security/saml/Post2.0/saml-metadata-rpi/v1.0/cs01/saml-metadata-rpi-v1.0-cs01.html

  2. “UK federation Metadata Toolchain”, UK Access Management Federation for Education and Research. https://github.com/ukf/ukf-meta/tree/master/mdx/_rules

  3. “Metadata Extensions for Login and Discovery User Interface Version 1.0”, 3 April 2012, OASIS Committee Specification 01. http://docs.oasis-open.org/security/saml/Post2.0/sstc-saml-metadata-ui/v1.0/sstc-saml-metadata-ui-v1.0.html

  4. Fielding, R., Nottingham, M. & Reschke, J., “Hypertext Transfer Protocol (HTTP/1.1): Caching”, RFC 7234, June 2014.

  5. “eduGAIN Best Current Practice”, eduGAIN Steering Group. https://wiki.geant.org/display/eduGAIN/Best+Current+Practice

  6. Masinter, L., “The ‘data’ URL scheme”, RFC 2397, August 1998.

  7. “Metadata Aggregation Practice Statement”, eduGAIN Operations Team. https://wiki.geant.org/display/eduGAIN/Metadata+Aggregation+Practice+Statement

South African Identity Federation