
Persona: troubleshoot · Owners: networking-team@osg-htc.org · Status: active · Tags: troubleshoot, playbook, diagnostics

🔧 Troubleshooter — Diagnose & Fix Network Issues

Systematic approach to identifying and resolving network and perfSONAR problems.


Quick Start (5 Minutes)

Is it a Network Problem?

  1. Gather facts: Run the Quick Triage Checklist, which collects system info, connectivity, services, and logs

  2. Basic diagnostics: Follow the **[Network Troubleshooting Guide](../../network-troubleshooting.md)** for contact procedures and support escalation

  3. Learn more: The **[ESnet Troubleshooting Guide](https://fasterdata.es.net/performance-testing/troubleshooting/)** covers detailed network investigation

Is it a perfSONAR Problem?


Diagnostic Tools & Guides

On the perfSONAR Host

Check system status:

  • Systemd services: systemctl status 'perfsonar-*' (quote the pattern so the shell does not expand it against local files)

  • Container status: podman ps -a or docker ps -a

  • Container logs: podman logs perfsonar-testpoint or docker logs perfsonar-testpoint
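
The status checks above can be wrapped in a small helper so the same commands work whether the host uses podman or docker. This is a minimal sketch, not part of the official tooling: the function names `detect_runtime` and `host_status` are hypothetical, and the container name `perfsonar-testpoint` is taken from the examples in this guide.

```shell
#!/bin/sh
# Print the first container runtime found on PATH (podman preferred, then docker).
# Prints "none" and returns non-zero when neither is installed.
detect_runtime() {
    for rt in podman docker; do
        if command -v "$rt" >/dev/null 2>&1; then
            echo "$rt"
            return 0
        fi
    done
    echo "none"
    return 1
}

# Show service state, container state, and recent container logs.
# $1: container name (assumed default: perfsonar-testpoint).
host_status() {
    container="${1:-perfsonar-testpoint}"
    if command -v systemctl >/dev/null 2>&1; then
        # The pattern is quoted so systemd, not the shell, matches unit names.
        systemctl --no-pager status 'perfsonar-*' || true
    fi
    runtime=$(detect_runtime) || { echo "no container runtime found"; return 0; }
    "$runtime" ps -a
    "$runtime" logs --tail 50 "$container"
}
```

After sourcing the file, run `host_status` on the perfSONAR host, or `host_status other-name` if the container is named differently.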

Verify network configuration:

Check firewall & security:

Network Path Analysis

ESnet tools: ESnet Troubleshooting Guide

perfSONAR tools:

  • pScheduler: pScheduler documentation

  • Test API: Query test meshes and historical results

  • Measurement archive: Access stored results via web interface


Common Scenarios & Playbooks

Container Won't Start

Playbook: Container Startup Issues (in progress)

Quick checks:

  • Image available: podman images | grep perfsonar

  • Volumes mounted: podman volume ls

  • Ports available: ss -ltnp | grep -E '(443|5001|9000|8080)'

  • Logs: podman logs perfsonar-testpoint
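
The grep pattern in the port check matches substrings, so a listener on port 54430 would also match 443. A stricter variant is sketched below. It is an illustration only: `ports_in_use` is a hypothetical helper, and it assumes the column layout that `ss -ltn` prints (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port).

```shell
#!/bin/sh
# Print each requested port that already has a TCP listener.
# Reads `ss -ltn` output on stdin; the ports to look for are the arguments.
ports_in_use() {
    awk -v ports="$*" '
        BEGIN {
            n = split(ports, want, " ")
            for (i = 1; i <= n; i++) wanted[want[i]] = 1
        }
        $1 == "LISTEN" {
            # Field 4 is the local address; the port follows the last colon.
            p = $4
            sub(/.*:/, "", p)
            if ((p in wanted) && !(p in seen)) { seen[p] = 1; print p }
        }'
}

# Usage on a live host:
#   ss -ltn | ports_in_use 443 5001 9000 8080
```

Any port printed is already bound by another process, which would explain a container failing to start with an "address already in use" error.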

Tests Not Running

Playbook: Tests Not Running (in progress)

Quick checks:

  • pSConfig enrolled: psconfig remote list

  • Mesh connectivity: Can the host reach psconfig.opensciencegrid.org?

  • pSConfig pScheduler agent: systemctl status psconfig-pscheduler-agent

  • Log errors: podman logs perfsonar-testpoint | grep -i error
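
The mesh connectivity and log checks above could be scripted roughly as follows. This is a sketch under stated assumptions: `mesh_reachable` and `count_error_lines` are hypothetical helper names, curl is assumed to be installed, and the mesh host is assumed to answer HTTPS on port 443.

```shell
#!/bin/sh
# Succeed if the pSConfig mesh host accepts an HTTPS connection.
# $1: host to test (defaults to the mesh host named in this guide).
# Any HTTP response counts as reachable; only connection failures return non-zero.
mesh_reachable() {
    host="${1:-psconfig.opensciencegrid.org}"
    curl --silent --head --max-time 10 --output /dev/null "https://$host/"
}

# Count log lines that mention "error" (case-insensitive); reads log text on stdin.
# grep exits non-zero when nothing matches, so `|| true` keeps the exit status clean.
count_error_lines() {
    grep -ci 'error' || true
}

# Usage on a live host:
#   mesh_reachable && echo "mesh reachable" || echo "mesh NOT reachable"
#   podman logs perfsonar-testpoint 2>&1 | count_error_lines
```

A non-zero error count is only a prompt to read the matching log lines; some of them may be benign.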

High Latency / Slow Tests

Playbook: Performance Issues (in progress)

Quick checks:

  • Host tuning: Run fasterdata-tuning.sh in audit mode

  • NIC settings: Check MTU, GRO, GSO, ring buffers

  • Network load: Was the link near peak utilization during the test window?

  • Competing tests: Multiple tests running simultaneously?
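
For the NIC settings check, the values can be read with `ip` and `ethtool`. The sketch below is illustrative: `mtu_from_ip_link` and `nic_report` are hypothetical helpers, the interface name must be supplied by the caller, and `ethtool` is assumed to be installed.

```shell
#!/bin/sh
# Extract the MTU value from `ip -o link show dev IFACE` output on stdin.
mtu_from_ip_link() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "mtu") { print $(i + 1); exit } }'
}

# Print MTU, GRO/GSO state, and ring buffer sizes for one interface.
# $1: interface name, for example the NIC used for perfSONAR tests.
nic_report() {
    iface="$1"
    echo "MTU: $(ip -o link show dev "$iface" | mtu_from_ip_link)"
    # ethtool -k lists offload features; GRO and GSO appear under their long names.
    ethtool -k "$iface" | grep -E 'generic-(receive|segmentation)-offload'
    # ethtool -g shows current and maximum ring buffer sizes.
    ethtool -g "$iface"
}
```

Comparing the "Current hardware settings" against the "Pre-set maximums" in the `ethtool -g` output shows whether the ring buffers have headroom to grow.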

Firewall Blocking Tests

Playbook: Firewall & Network Access (in progress)

Quick checks:

  • Required ports: Security & Firewall Guide

  • Test connectivity: Can the host reach remote perfSONAR instances?

  • Firewall rules and logs: Check both the local host firewall and the campus firewall

  • DNS resolution: Do the remote perfSONAR hostnames resolve?
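
The DNS and connectivity checks above can be combined into one probe per remote host. This is a sketch, not a definitive test: `parse_target` and `check_target` are hypothetical helpers, `ps.example.org` is a placeholder hostname, the default port of 443 is an assumption, and `getent` and `nc` (with the common `-z` and `-w` options) are assumed to be available. Take the actual port list from the Security & Firewall Guide.

```shell
#!/bin/sh
# Split "host[:port]" into "host port", defaulting to port 443.
# Handles hostnames and IPv4 addresses; bracketed IPv6 literals are not supported.
parse_target() {
    case "$1" in
        *:*) echo "${1%%:*} ${1##*:}" ;;
        *)   echo "$1 443" ;;
    esac
}

# Resolve the host, then attempt a TCP connection to the port.
check_target() {
    set -- $(parse_target "$1")
    host="$1"
    port="$2"
    if ! getent hosts "$host" >/dev/null; then
        echo "$host: DNS resolution FAILED"
        return 1
    fi
    if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
        echo "$host:$port reachable"
    else
        echo "$host:$port blocked or closed"
        return 1
    fi
}

# Usage on a live host (placeholder hostname):
#   check_target ps.example.org:443
```

A "blocked or closed" result does not say which firewall dropped the traffic; it only narrows the problem to the path or the remote service.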


Escalation & Support

When to contact support:

Level 1: Self-Service Diagnostics

Level 2: Site-Specific Support

  • Contact your site's network administrator

  • Check local firewall, VLAN, NIC configuration

  • Verify DNS, IP routing, upstream connectivity

Level 3: OSG/WLCG Support

  • OSG sites: GOC Support Ticket

  • Include: hostname, triage checklist results, error messages, logs

  • WLCG sites: GGUS Ticket → "WLCG Network Throughput" or "WLCG perfSONAR support"
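
The items a ticket should include can be gathered into one directory before opening it. The following is a minimal sketch: `collect_ticket_info` is a hypothetical helper, the output file names are arbitrary, and the container name `perfsonar-testpoint` is taken from the examples in this guide. Triage checklist results still need to be added by hand.

```shell
#!/bin/sh
# Collect hostname, timestamp, and recent container logs for a support ticket.
# $1: output directory (defaults to a timestamped directory under the current one).
# Prints the directory path when done.
collect_ticket_info() {
    outdir="${1:-./ps-ticket-$(date +%Y%m%d-%H%M%S)}"
    mkdir -p "$outdir" || return 1
    # Prefer the fully qualified name; fall back to the POSIX node name.
    { hostname -f 2>/dev/null || uname -n; } > "$outdir/hostname.txt"
    date -u > "$outdir/collected-at.txt"
    for rt in podman docker; do
        if command -v "$rt" >/dev/null 2>&1; then
            # Keep going even if the container does not exist; the error text is saved.
            "$rt" logs --tail 200 perfsonar-testpoint > "$outdir/container.log" 2>&1 || true
            break
        fi
    done
    echo "$outdir"
}
```

Review the collected files before attaching them, since logs can contain addresses or hostnames a site may not want to share.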

Level 4: perfSONAR Community


Setup & Installation

Configuration & Optimization

Understanding the System