Utilization Reporting for non-ACCESS allocated resources

Infrastructure Integration Roadmap Task

Task Type(s): Technology
Start by phase: Integration
Complete by phase: Operations
RP role(s): UUtilization data contact(s), System administrator(s)

Summary

ACCESS Metrics provides ACCESS XDMoD and various data reporting and analysis services for ACCESS-allocated resources and for other NSF-funded projects such as the Campus Cyberinfrastructure program.

This task involves setting up the data transfer of utilization data for a compute resource that is not allocated by ACCESS. This includes compute resources that are funded by NSF’s CC* program that are expected to work cooperatively with ACCESS award activities and services.

The data reporting will be via a Federated XDMoD portal that is managed by ACCESS Metrics. This federated XDMoD portal is based on the Open XDMoD software and is configured to be able to report across multiple NSF funded projects and resources. The data will be made available to NSF staff.

Required information:

  1. Resource Specification information including: number of compute nodes, number of compute cores, number of GPU devices, whether resource is node or core allocated, whether compute node sharing is enabled.

  2. Information about the system users: username on compute resource, first name, last name, organization name

  3. Resource manager accounting log data.

Desirable information:

  1. Unique identifiers for system users such as an ORCID iD or ACCESS ID.

  2. Allocation information such as project name, project PI, project field-of-science, funding agency

Prerequisite tasks

  1. Resource Metrics Data Availability Assessment v1

Support Information

For assistance with this task see the Support Information section in the Integration Roadmap Description.

Detailed Instructions

The required log data should be periodically sent to ACCESS Metrics. ACCESS Metrics supports two types of data transfer mechanism. The choice of data transfer mechanism depends on the RP preferences:

  1. Data transfer via REST endpoint (preferred)

    OR

  2. Globus Transfer with scheduled transfers

The REST endpoint data transfer mechanism is preferred for small amounts of log data transfer (less than 1 MB per day). the Globus Transfer mechanism is more appropriate for large file transfers (but can be used for smaller amounts of data if desired).

REST data transfer

To send data to ACCESS via REST you need an API Token. We will generate one and email it to you. Below are examples of how to send the data using curl and how to send data using a python script.

Example of data transfer using curl command line tool:

curl --form file='@FILE_TO_UPLOAD' -H 'Content-Type: text/plain' -H "Authorization: Bearer YOUR_TOKEN_HERE" https://data.ccr.xdmod.org/logs

where YOUR_TOKEN_HERE should be set to the token that was provided to you by ACCESS Metrics and FILE_TO_UPLOAD should be set to the filename of the file to upload (note the ‘@’ character before the filename is required).

Example python script to upload the file using the python requests library:

#!/usr/bin/env python3

# Tool for POSTing data from standard input to the UB CCR XDMoD data endpoint.
# Replace 'YOUR_TOKEN_HERE' below with the API token you received from the
# XDMoD team.

import requests
import sys

def main():
    if len(sys.argv) < 2:
        print("Usage: {} [FILENAME]".format(sys.argv[0]))
        return

    api_token = 'YOUR_TOKEN_HERE'

    file = {'file': open(sys.argv[1], 'rb')}
    response = requests.post(
        'https://data.ccr.xdmod.org/logs',
        files=file,
        headers={
            'content-type': 'text/plain',
            'authorization': 'Bearer ' + api_token,
        },
    )
    print('Response: ', response.status_code, response.text)

if __name__ == '__main__':
    main()

Globus Transfer

To setup Globus Transfer, please provide the name or names of the globus accounts that will be used for data transfer. ACCESS Metrics will create a globus share and give write permissions to the user accounts provided by the RP. Instructions for setting up the periodic transfers using the globus tools are either Globus timer cli tool or the web-based Globus Timers.

Required Log files

Log Type

Description

Filename

Update Frequency

Resource Manager Log Files

Daily dumps of the resource manager log files. Use the format described in the Open XDMoD shredder documentation.

Use the filename direct from the resource manager logs, or if using slurm’s sacct use slurm.log

Daily

Name mapping

Mapping between the username on the resource and a persons full name. Use the format described in the Open XDMoD Names documentation. For Regional Computing resources that support users from multiple different institutions then the User’s instittion should also be included in the file contents as an additional column after the user’s last name.

names.csv

Upload new copy of the file when new users are added to the resource.

Resource Specifications

Information about the number of compute devices (nodes, cores, GPUs) on the resource. The file format supports tracking resource changes over time. Use the format described in the Open XDMoD resource specs

resource_specs.json

Whenever the resource size changes

Document Management

Status: Draft

Official date: TBD

Coordinators: Joseph White, ACCESS Metrics

Last revised date: 2023-09-15

Retired date: