Skip to content

Installing the OSDF Cache by RPM

This document describes how to install an Open Science Data Federation (OSDF) Cache service via RPMs. This service allows a site or regional network to cache data frequently used in Open Science Pool jobs, reducing data transfer over the wide-area network and increasing throughput to jobs.

Before Starting

Before starting the installation process, consider the following requirements:

  • Operating system: A RHEL 8 or RHEL 9 or compatible operating system.
  • User IDs: If it does not exist already, the installation will create the Linux user named xrootd for running daemons.
  • File Systems: The cache should have a partition of its own for storing data and metadata.
  • Host certificate: Required for authentication. See note below.
  • Network ports: The cache service requires the following ports open:
  • Inbound TCP port 8443 for file access via the HTTP(S) and XRoot protocols.
  • (Optional) Inbound TCP port 8444 for access to the web interface for monitoring and configuration; if enabled, access to this port should be restricted to the LAN.
  • Service requirements:
    • A cache serving the OSDF federation as a regional cache should have at least:
      • 8 cores
      • 40 Gbps connectivity
      • 50-200 TB of NVMe disk for the cache partition; you may distribute the disk, e.g., by using an NVMe-backed Ceph pool, if you cannot fit that much disk into a single chassis
      • 24 GB of RAM
    • A cache being used to serve data from the OSDF to a single site should have at least:
      • 8 cores
      • 40 Gbps connectivity
      • 2 TB of NVMe disk for the cache partition
      • 24 GB of RAM
    • The cache should be a mounted filesystem; its mount location is referred to as <CACHE PARTITION> in the documentation below. We suggest that several gigabytes of local disk space be available for log files, although some logging verbosity can be reduced.

As with all OSG software installations, there are some one-time steps to prepare in advance:

Host certificates

Caches are accessed by users through browsers, meaning caches need a certificate from a CA acceptable to a standard browser. Examples include Let's Encrypt or the InCommon RSA CA. Caches without a valid certificate for the browser cannot be added to the OSDF. Note that, unlike legacy grid software, the public certificate file will need to contain the "full chain", including any intermediate CAs (if you're unsure about your setup, try accessing your cache from your browser).

The following locations should be used (note that they are in separate directories):

  • Host Certificate Chain: /etc/pki/tls/certs/pelican.crt
  • Host Key: /etc/pki/tls/private/pelican.key

Installing the Cache

The cache service is provided by the osdf-cache RPM. Install it using one of the following commands:

OSG 24:

root@host # yum install osdf-cache

OSG 23:

root@host # yum install --enablerepo=osg-upcoming osdf-cache

osdf-cache 7.11.1

This document covers versions 7.11.1 and later of the osdf-cache package; ensure the above installation results in an appropriate version.

Configuring the Cache Server

In /etc/pelican/config.d/20-cache.yaml, set Cache.LocalRoot, Cache.DataLocation and Cache.MetaLocation as follows, replacing <CACHE PARTITION> with the mount point of the partition you will use for the cache.

Cache:
  LocalRoot: "<CACHE PARTITION>/namespaces"
  DataLocation: "<CACHE PARTITION>/data"
  MetaLocation: "<CACHE PARTITION>/meta"

Preparing for Initial Startup

  1. The cache identifies itself to the federation via public key authentication; before starting the cache for the first time, it is recommended to generate a keypair.

    root@host$ cd /etc/pelican
    root@host$ pelican generate keygen
    

    The newly created files, issuer.jwk and issuer-pub.jwks are the private and public keys, respectively.

  2. Save these files; if you lose the issuer.jwk, your cache will need to be re-approved.

Validating the Cache Installation

Do the following steps to verify that the cache is functional:

  1. Start the cache using the following command:

    root@host$ systemctl start osdf-cache
    
  2. Download a test file from the OSDF through your cache (replacing CACHE_HOSTNAME with the host name of your cache)

    root@host$ osdf object get -c CACHE_HOSTNAME:8443 /ospool/uc-shared/public/OSG-Staff/validation/test.txt /tmp/test.txt
    root@host$ cat /tmp/test.txt
    
    Hello, World!
    

    If the download fails, rerun the above osdf object get command with the -d flag added; additional debugging information is located in /var/log/pelican/osdf-cache.log. See this page for requesting assistance; please include the log file and the osdf object get -d output in your request.

Joining the Cache to the Federation

The cache must be registered with the OSG prior to joining the data federation. Send mail to help@osg-htc.org requesting registration; provide the following information:

  • Cache hostname
  • Administrative and security contact(s)
  • Institution that the cache belongs to

OSG Staff will register the cache and respond with the Resource Name that the cache was registered as.

Once you have that information, edit /etc/pelican/config.d/15-osdf.yaml, and set XRootD.Sitename:

XRootD:
  Sitename: <RESOURCE NAME REGISTERED WITH OSG>

Then, restart the cache by running

root@host$ systemctl restart osdf-cache

Let OSG Staff know that you have restarted the cache with the updated sitename, so they can approve the new cache.

Managing the Cache Service

Use the following SystemD commands as root to start, stop, enable, and disable the OSDF Cache.

To... Run the command...
Start the cache systemctl start osdf-cache
Stop the cache systemctl stop osdf-cache
Enable the cache to start on boot systemctl enable osdf-cache
Disable the cache from starting on boot systemctl disable osdf-cache

Getting Help

To get assistance, please use the this page.

Back to top