Installing the OSDF Origin
Warning
If you want to run origins for authenticated and unauthenticated data, you must run them on separate hosts. This requires registering a resource for each host.
This document describes how to install an Open Science Data Federation (OSDF) origin service. This service allows an organization to export its data to the data federation.
Note
The OSDF Origin was previously named "Stash Origin" and some documentation and software may use the old name.
Note
The origin must be registered with the OSG before it joins the data federation. You may start the registration process before finishing the installation by using this link and supplying information such as:
- Resource name and hostname
- VO associated with this origin server (which will be used to determine the origin's namespace prefix)
- Administrative and security contact(s)
- Who (or what) will be allowed to access the VO's data
- Which caches will be allowed to cache the VO data
Before Starting
Before starting the installation process, consider the following points:
- Operating system: A RHEL 7 or compatible operating system.
- User IDs: If they do not exist already, the installation will create the Linux user IDs condor and xrootd; only the xrootd user is used by the running daemons.
- Host certificate: The origin service uses a host certificate to authenticate with the caches it serves. The host certificate documentation provides more information on setting up host certificates.
- Network ports: The origin service requires the following ports open:
  - Inbound TCP port 1094 for file access via the XRootD protocol
  - Outbound TCP port 1213 to redirector.osgstorage.org for connecting to the data federation
  - Outbound UDP port 9930 to xrd-report.osgstorage.org and xrd-mon.osgstorage.org for monitoring reports
- Hardware requirements: We recommend that an origin has at least 1 Gbps connectivity and 8 GB of RAM. We suggest that several gigabytes of local disk space be available for log files, although logging verbosity can be reduced.
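The TCP port requirements above can be spot-checked before installation. The following is only a sketch: check_tcp is a hypothetical helper (not part of any OSG package), it assumes bash and the coreutils timeout command are available, and it cannot test the UDP monitoring port because UDP has no connection handshake.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port opens within 5 seconds.
# Relies on bash's /dev/tcp pseudo-device and on coreutils' timeout.
check_tcp() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Outbound check, run on the origin host:
#   check_tcp redirector.osgstorage.org 1213 && echo "redirector reachable"
# Inbound check, run from a machine outside your site
# (origin.example.org is a placeholder for your origin's hostname):
#   check_tcp origin.example.org 1094 && echo "XRootD port reachable"
```

A failed check only shows that the connection could not be opened from that vantage point; it does not say whether a local firewall, a site firewall, or the remote service is responsible.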
As with all OSG software installations, there are some one-time steps to prepare in advance:
- Obtain root access to the host
- Prepare the required Yum repositories
- Install CA certificates
Installing the Origin
The origin service consists of one or more XRootD daemons and their dependencies for the authentication infrastructure. To simplify installation, OSG provides convenience RPMs that install all required software with a single command:
root@host # yum install stash-origin
For this installation guide, we assume that the data to be exported to the federation is mounted at /mnt/stash and owned by the xrootd user and group (xrootd:xrootd).
Configuring the Origin Server
The stash-origin package provides default configuration files in /etc/xrootd/xrootd-stash-origin.cfg and /etc/xrootd/config.d.
Administrators may provide additional configuration by placing files in /etc/xrootd/config.d of the form /etc/xrootd/config.d/1*.cfg (for directives that need to be processed BEFORE the OSG configuration) or /etc/xrootd/config.d/9*.cfg (for directives that are processed AFTER the OSG configuration).
You must configure every variable in /etc/xrootd/config.d/10-common-site-local.cfg and /etc/xrootd/config.d/10-origin-site-local.cfg.
The mandatory variables to configure are:
File | Config line | Description |
---|---|---|
10-common-site-local.cfg | set rootdir = /mnt/stash | The mounted filesystem path to export; this document calls it /mnt/stash |
10-common-site-local.cfg | set resourcename = YOUR_RESOURCE_NAME | The resource name registered with OSG |
10-origin-site-local.cfg | set originexport = /VO | The directory relative to rootdir that is the top of the exported namespace for the origin services |
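After editing the two site-local files, a quick check that each mandatory variable has a value can catch omissions before the services are started. This is a minimal sketch: check_origin_config is a hypothetical helper, not something shipped with the stash-origin package, and it only looks for a non-empty "set NAME = VALUE" line; it cannot tell whether the value is correct for your site.

```shell
# Hypothetical helper: report mandatory "set" variables that are missing from
# the site-local files in a config directory (default: /etc/xrootd/config.d).
check_origin_config() {
    conf_dir=${1:-/etc/xrootd/config.d}
    missing=0
    for entry in \
        "10-common-site-local.cfg:rootdir" \
        "10-common-site-local.cfg:resourcename" \
        "10-origin-site-local.cfg:originexport"
    do
        file=$conf_dir/${entry%%:*}   # part before the colon: file name
        var=${entry#*:}               # part after the colon: variable name
        # Match lines such as "set rootdir = /mnt/stash"; commented-out lines do not count.
        if ! grep -Eq "^[[:space:]]*set[[:space:]]+$var[[:space:]]*=[[:space:]]*[^[:space:]]" "$file" 2>/dev/null; then
            echo "missing: 'set $var = ...' in $file" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_origin_config with no arguments inspects /etc/xrootd/config.d; passing a directory as the first argument checks a staging copy instead.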
For example, if the HCC VO would like to set up an origin server exporting from the mount point /mnt/stash, and HCC's registered namespace is /hcc, then the following would be set in 10-common-site-local.cfg:
set rootdir = /mnt/stash
set resourcename = HCC_OSDF_ORIGIN
And the following would be set in 10-origin-site-local.cfg:
set originexport = /hcc
With this configuration, the data under /mnt/stash/hcc/bio/datasets would be available under the path /hcc/bio/datasets in the OSDF namespace and the data under /mnt/stash/hcc/hep/generators would be available under the path /hcc/hep/generators in the OSDF namespace.
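The mapping in this example amounts to removing the rootdir prefix from the local path. The sketch below illustrates that rule with a hypothetical helper, osdf_path; it is not part of the origin software and it does not check that the result falls under the originexport directory.

```shell
# Hypothetical helper: map a local path under rootdir to its OSDF namespace path.
# $1 = rootdir (e.g. /mnt/stash), $2 = local path of a file or directory
osdf_path() {
    rootdir=${1%/}        # tolerate one trailing slash: /mnt/stash/ -> /mnt/stash
    local_path=$2
    case "$local_path" in
        "$rootdir"/*)
            # Strip the rootdir prefix; the remainder is the namespace path.
            printf '%s\n' "${local_path#"$rootdir"}"
            ;;
        *)
            echo "error: $local_path is not under $rootdir" >&2
            return 1
            ;;
    esac
}

osdf_path /mnt/stash /mnt/stash/hcc/bio/datasets    # prints /hcc/bio/datasets
```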
Warning
The OSDF namespace is a global namespace. Directories you export must not collide with directories provided by other origin servers; this is why the explicit registration is required.
Manually Setting the FQDN (optional)
The FQDN of the origin server that you registered in Topology may be different from its internal hostname (as reported by hostname -f).
For example, this may be the case if your origin is behind a load balancer such as LVS or MetalLB.
In this case, you must manually tell the origin services which FQDN to use for topology lookups.
- Create the file /etc/systemd/system/stash-origin-authfile.service.d/override.conf with the following contents:
    [Service]
    Environment=ORIGIN_FQDN=<Topology-registered FQDN>
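Creating the drop-in can be scripted. The following is a sketch under stated assumptions: write_fqdn_override is a hypothetical helper, origin.example.org is a placeholder FQDN, and the second argument exists only so the helper can be tried against a scratch directory. As with any systemd drop-in, systemd has to re-read its unit files (systemctl daemon-reload) before the change is seen.

```shell
# Hypothetical helper: write the drop-in that sets ORIGIN_FQDN.
# $1 = FQDN registered in Topology
# $2 = systemd configuration root (default: /etc/systemd/system)
write_fqdn_override() {
    fqdn=$1
    dropin_dir=${2:-/etc/systemd/system}/stash-origin-authfile.service.d
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nEnvironment=ORIGIN_FQDN=%s\n' "$fqdn" > "$dropin_dir/override.conf"
}

# As root on the origin host:
#   write_fqdn_override origin.example.org
#   systemctl daemon-reload    # make systemd re-read unit files and drop-ins
```

The new environment variable only applies the next time the unit runs, so the affected services may also need to be restarted.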
Managing the Origin Service
The origin service consists of the following systemd units that you must directly manage:
Service name | Notes |
---|---|
xrootd@stash-origin.service | Performs data transfers (unauthenticated instance) |
xrootd@stash-origin-auth.service | Performs data transfers (authenticated instance) |
These services must be managed with systemctl and may start additional services as dependencies.
As a reminder, here are common service commands (all run as root):
To... | On EL7, run the command... |
---|---|
Start a service | systemctl start <SERVICE-NAME> |
Stop a service | systemctl stop <SERVICE-NAME> |
Enable a service to start on boot | systemctl enable <SERVICE-NAME> |
Disable a service from starting on boot | systemctl disable <SERVICE-NAME> |
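As an illustration of the commands in the table, the sketch below enables and starts one or more units in a single call. It is only an example: start_origin and the DRY_RUN switch are hypothetical, and the unit name xrootd@stash-origin.service for the unauthenticated instance is an assumption based on the xrootd-stash-origin.cfg configuration file name.

```shell
# Hypothetical helper: enable each unit at boot and start it now.
# With DRY_RUN=1 the systemctl commands are printed instead of executed.
start_origin() {
    for unit in "$@"; do
        for action in enable start; do
            if [ "${DRY_RUN:-0}" = 1 ]; then
                echo "systemctl $action $unit"
            else
                systemctl "$action" "$unit" || return 1
            fi
        done
    done
}

# Preview the commands for the unauthenticated instance (unit name assumed):
DRY_RUN=1 start_origin xrootd@stash-origin.service
```

Without DRY_RUN=1 the function must be run as root, since it calls systemctl directly.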
In addition, the origin service automatically uses the following systemd units:
Service name | Notes |
---|---|
cmsd@stash-origin.service | Integrates the origin into the data federation (unauthenticated instance) |
cmsd@stash-origin-auth.service | Integrates the origin into the data federation (authenticated instance) |
stash-origin-authfile.timer | Updates the authorization files periodically |
Verifying the Origin Server
Once your server has been registered with the OSG and started, perform the following steps to verify that it is functional.
Testing availability
To verify that your origin is correctly advertising its availability, run the following command from the origin server:
[user@host ~]$ xrdmapc -r --list s redirector.osgstorage.org:1094
0**** redirector.osgstorage.org:1094
Srv ceph-gridftp1.grid.uchicago.edu:1094
Srv stashcache.fnal.gov:1094
Srv stash.osgconnect.net:1094
Srv origin.ligo.caltech.edu:1094
Srv csiu.grid.iu.edu:1094
The output should list the hostname of your origin server.
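Scanning the list by eye becomes tedious as the federation grows, so the check can be scripted. This is a sketch only: origin_listed is a hypothetical helper that assumes the output layout shown above, with each server on a line of the form "Srv host:port".

```shell
# Hypothetical helper: succeed if the xrdmapc output on stdin lists host:port
# as a server. $1 = origin hostname, $2 = port (default 1094)
origin_listed() {
    awk -v target="$1:${2:-1094}" '
        $1 == "Srv" && $2 == target { found = 1 }
        END { exit !found }
    '
}

# On the origin host:
#   xrdmapc -r --list s redirector.osgstorage.org:1094 \
#       | origin_listed "$(hostname -f)" && echo "origin is advertised"
```

If the FQDN registered in Topology differs from the output of hostname -f, pass the hostname the origin actually advertises instead.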
Testing directory export
To verify that the directories you are exporting are visible from the redirector, run the following command from the origin server:
[user@host ~]$ xrdmapc -r --verify --list s redirector.osgstorage.org:1094 <EXPORTED DIR>
0*rv* redirector.osgstorage.org:1094
>+ Srv ceph-gridftp1.grid.uchicago.edu:1094
? Srv stashcache.fnal.gov:1094 [not authorized]
>+ Srv stash.osgconnect.net:1094
- Srv origin.ligo.caltech.edu:1094
? Srv csiu.grid.iu.edu:1094 [connect error]
Replace <EXPORTED DIR> with the directory the service is supposed to export.
Your server should be marked with a >+ to indicate that it contains the given path and the path was accessible.
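The marker for a single server can be pulled out of the verification output with a short filter. The sketch below assumes the layout shown above (marker, then "Srv", then host:port); export_marker is a hypothetical helper, not an xrdmapc feature.

```shell
# Hypothetical helper: print the status marker that "xrdmapc --verify" output
# on stdin shows for host:port. $1 = origin hostname, $2 = port (default 1094)
export_marker() {
    awk -v target="$1:${2:-1094}" '
        $2 == "Srv" && $3 == target { print $1; found = 1 }
        END { exit !found }
    '
}

# On the origin host (keep <EXPORTED DIR> as in the command above):
#   marker=$(xrdmapc -r --verify --list s redirector.osgstorage.org:1094 <EXPORTED DIR> \
#       | export_marker "$(hostname -f)")
#   [ "$marker" = ">+" ] && echo "export is visible and accessible"
```

The helper exits with a non-zero status when the server does not appear in the output at all, which distinguishes an unlisted origin from one listed with a marker other than >+.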
Testing file access
To verify that you can download a file from the origin server, use the stashcp tool, which is available in the stashcp RPM.
Place a <TEST FILE>, which can be any file, in <EXPORTED DIR>.
Run the following command:
[user@host]$ stashcp <TEST FILE> /tmp/testfile
If successful, there should be a file at /tmp/testfile with the contents of the test file on your origin server.
If unsuccessful, you can pass the -d flag to stashcp for debug info.
You can also test directly downloading from the origin via xrdcp, which is available in the xrootd-client RPM.
Run the following command:
[user@host]$ xrdcp xroot://<origin server>:1094/<TEST FILE> /tmp/testfile
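When the download is run on a host that can also read the original file (for example, the origin itself), the two copies can be compared byte for byte. A minimal sketch, with verify_download as a hypothetical helper built on the standard cmp utility:

```shell
# Hypothetical helper: succeed if the downloaded copy matches the source file.
# $1 = original file on the origin's filesystem, $2 = downloaded copy
verify_download() {
    if cmp -s "$1" "$2"; then
        echo "OK: $2 matches $1"
    else
        echo "MISMATCH: $2 differs from $1 (or one of the files is missing)" >&2
        return 1
    fi
}

# After stashcp or xrdcp has written /tmp/testfile, pass the local path of the
# test file under the exported directory as the first argument:
#   verify_download /mnt/stash/<EXPORTED DIR>/<TEST FILE> /tmp/testfile
```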
Registering the Origin
To be part of the Open Science Data Federation, your origin must be registered with the OSG. The service type is XRootD origin server.
The resource must also specify which VOs it will serve data from.
To do this, add an AllowedVOs list, with each line specifying a VO whose data the resource is willing to host.
For example:
MY_OSDF_ORIGIN:
  Service: XRootD origin server
  Description: OSDF origin server
  AllowedVOs:
    - GLOW
    - OSG
Use the special value ANY to indicate that the origin will serve data from any VO that puts data on it.
In addition to the origin allowing a VO via the AllowedVOs list, that VO must also allow the origin in its DataFederations/StashCache/AllowedOrigins list.
See the page on getting your VO's data into OSDF.
Updating to OSG 3.6
The OSG 3.5 series is reaching end-of-life on May 1, 2022. Admins are strongly encouraged to move their origins to OSG 3.6.
See general update instructions.
Unauthenticated origins (xrootd@stash-origin service) do not need any configuration changes.
Authenticated origins (xrootd@stash-origin-auth service) may need the configuration changes described in the updating to OSG 3.6 section of the XRootD authorization configuration document.
Getting Help
To get assistance, please use this page or contact help@opensciencegrid.org directly.