Data Transfer and Sharing on Eagle Using Globus


Theta and ThetaGPU
Updated: 09/17/2021


Overview

Collaborators throughout the scientific community can read data from and write data to Globus Guest Collections on the Eagle filesystem. The project PI needs an active ALCF account to set up the guest collections and share them with collaborators. This form of data sharing gives PIs a natural and convenient storage space for collaborative work.

Globus is a service that makes it easy to move, sync, and share large amounts of data. Globus will manage file transfers, monitor performance, retry failures, recover from faults automatically when possible, and report the status of your data transfer. Globus uses GridFTP for more reliable and high-performance file transfer, and will queue file transfers to be performed asynchronously in the background.

If you are migrating your data from Petrel, please review the migration instructions on this page

Login Screen.png

Logging into Globus with your ALCF Login

ALCF researchers can use their ALCF Login username and password to access Globus. Go to https://www.globus.org/ and click on Log In in the upper right corner of the page.

Type or scroll down to "Argonne LCF" in the Organization box, and then click Continue.

Globus_ALCF_Login.png

You will be taken to a familiar-looking page for ALCF login. Enter your ALCF login username and password.

 

Accessing your Eagle Project Directory

There are two ways for a PI to access their project directory on Eagle. 

 

1. Web Interface: Log in to the Globus web interface directly and navigate to the ALCF Eagle endpoint.

** For PIs with Eagle 'Data-only' projects and no other compute allocations, logging in through Globus is the only way to access their Eagle project directory.

 

File Manager

 

2. POSIX: Log in to the ALCF systems from a terminal window.

** For Eagle Data and Allocation projects, the PI can also log in to the relevant ALCF systems (in addition to the Globus Web Interface) to access their Eagle project directory.

 

terminal window

 

Creating a Guest Collection

You (PI) need to have an 'active' ALCF account in place to create and share guest collections with collaborators. 

There are two ways to create a Guest Collection for your project directory.

Method 1: Using 'Endpoints'

In the Globus application in your browser:

  1. Click 'Endpoints' located in the left panel (or go to https://app.globus.org/endpoints)
  2. In the search box located at the top of the page, type "alcf#dtn_eagle" and click the magnifying glass to search
  3. Click on the Managed Public Endpoint "alcf#dtn_eagle" from the search results
  4. Shared endpoints always remain active. When you select an endpoint to transfer data to/from, you may be asked to authenticate with that endpoint:
    1. Click the ‘Activate Now’ button to activate the endpoint 
    2. Click the ‘Continue’ button to be redirected to the Globus site for authentication.
  5. Select the 'Collections' tab at the top. You may have to provide Authentication/Consent for the Globus web app to manage collections on this endpoint on your behalf by clicking the "Continue" button
  6. Click 'Add a Guest Collection' located at the top right hand corner
  7. Fill out the form:
    1. Choose the directory by clicking the browse button and navigating to your project directory. You can create a guest collection with your project directory as the root, or pick a subdirectory within your project directory to share
    2. Give the collection a descriptive Display Name (e.g. <Eagle_Testing_Entire_ProjFolder> for a collection created from the entire 'Eagle_Testing' project directory)
  8. Click 'Create Collection'
Create-Guest-Collection-Entire.png

Method 2: Using 'File Manager'

In the Globus application in your browser:

  1. Click on 'File Manager' located in the left panel
  2. Type 'alcf#dtn_eagle' in the Collection field and select the collection
  3. Select your project directory or a subdirectory that you would like to share with collaborators as a Guest Collection
  4. Click on 'Share' on the right side of the panel
  5. Click on the 'Add a Guest Collection' button located at the top right hand corner
  6. Give the collection a descriptive Display Name (e.g. <Project_Subfolder_Share_Only> for a collection created from a subfolder of the 'Eagle_Testing' project directory)
  7. Click 'Create Collection'
File-Manager-Subfolder.png
Create-Guest-Collection-Subfolder.png

Sharing Data with Collaborators Using Guest Collections

If your data is on the ALCF systems, you can easily share it with collaborators who are at ALCF or elsewhere. All they need is a (free) Globus account. You have full control over which files your collaborator can access, and whether they have read-only or read-write permissions.

**Note:

  • If collaborators DO NOT have access to the ALCF system, 'Eagle', then the ONLY way for them to get to the shared (guest) collection is by logging in to Eagle from the Globus Web Interface.
  • If collaborators HAVE access to the ALCF system, 'Eagle', then they can use either the Globus Web Interface and/or a Terminal window to get to the shared (guest) collection.

To share data with collaborators (who have either a Globus account or an ALCF account), click on 'Endpoints'. Select your newly created Guest Collection, and go to the 'Permissions' tab. Click on 'Add Permissions - Share With':

Permissions.png

You can share with other Globus users or Globus Groups (for more information on Groups, see the section 'Creating a group'). You can give the collaborators read-only or read+write permissions. Once the options have been selected, click 'Add Permission'.

Permissions_Share_With.png

The PI can also choose to share their data with 'Public', which grants anonymous read access (anonymous write remains disabled). This allows anyone who can reach the data to read and/or download it without authorizing the request.

https-read

You should then see the share and the people you have shared it with. You can repeat this process for any number of collaborators. At any time, you can terminate access to the directory by clicking the trash can next to the user.

 

Permissions-2.png
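
For PIs who prefer to think of a share as data, the choices made in the 'Permissions' tab can be written down as an access-rule document. The following is a minimal Python sketch, assuming the field names of the Globus Transfer API access-rule document (DATA_TYPE, principal_type, principal, path, permissions); check them against the Globus documentation before relying on them. The helper name make_access_rule and the UUID are hypothetical, and the code only builds the document without contacting Globus.

```python
# Sketch only: builds an access-rule document; it does not contact Globus.
# Field names are assumed to match the Globus Transfer API access-rule
# document; make_access_rule and the UUID below are hypothetical.

PRINCIPAL_TYPES = {"identity", "group", "all_authenticated_users", "anonymous"}


def make_access_rule(principal_type, principal, path, permissions):
    """Return a dict describing who may access `path` and how."""
    if principal_type not in PRINCIPAL_TYPES:
        raise ValueError(f"unknown principal_type: {principal_type!r}")
    if permissions not in {"r", "rw"}:
        raise ValueError("permissions must be 'r' (read) or 'rw' (read+write)")
    if not (path.startswith("/") and path.endswith("/")):
        raise ValueError("path must be a directory that starts and ends with '/'")
    if principal_type == "anonymous" and permissions != "r":
        # Mirrors the text: 'Public' sharing is read-only, anonymous write is disabled.
        raise ValueError("anonymous access is read-only")
    if principal_type in {"anonymous", "all_authenticated_users"}:
        principal = ""  # these types do not name a specific user or group
    return {
        "DATA_TYPE": "access",
        "principal_type": principal_type,
        "principal": principal,
        "path": path,
        "permissions": permissions,
    }


# Hypothetical group UUID; a real value would come from the Globus Groups page.
group_rule = make_access_rule("group", "00000000-0000-0000-0000-000000000000", "/", "rw")
public_rule = make_access_rule("anonymous", "", "/", "r")
```

Because a rule applies to a directory path, the sketch rejects anything that is not a directory, which matches the restriction that only directories can be shared.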

Caveats about using Guest Collections:

  • You can only share directories, not individual files.
  • Globus allows directory trees to be shared as either read or read/write. This means that any subdirectories within that tree also have the same permissions.
  • If you don't want collaborators to have access to the entire project directory, create a guest collection for the desired subdirectory and share it with the intended collaborators.
  • When you create a Guest Collection and give access to one or more Globus users, you can select whether each person has read or read/write access. If they have write access, they can also delete files within that directory tree, so you should be careful about providing write access.
  • Guest Collections are created and managed by project PIs. If the PI of a project changes, the new PI will have to create a new Guest Collection and share it with the users. Ownership of a Guest Collection cannot be transferred.
  • Guest Collections are active as long as the project directory is available and the PI's ALCF account is active. If the account goes inactive, the collections become inaccessible to all the users. Access is restored once the PI's account is reactivated.
  • When using Guest Collections, all read/write actions are performed as the PI. If the PI does not have permission to read or write a file or directory, then the Guest Collection users won't either.
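
The interaction between the PI's own filesystem permissions and the permission granted in Globus can be summarized in a few lines. This is an illustrative Python sketch of the rules stated in the caveats above, not Globus code; the function name effective_access and the action names are hypothetical.

```python
def effective_access(pi_can_read, pi_can_write, globus_permissions):
    """Return the actions a guest-collection user can perform on a path.

    pi_can_read / pi_can_write: the PI's POSIX permissions on that path.
    globus_permissions: 'r' or 'rw', as granted on the guest collection.
    """
    if globus_permissions not in {"r", "rw"}:
        raise ValueError("globus_permissions must be 'r' or 'rw'")
    actions = set()
    # Every action runs as the PI, so the PI's POSIX permissions are a ceiling.
    if pi_can_read:
        actions.add("read")
    # Write access in Globus also allows deleting files in the tree.
    if pi_can_write and globus_permissions == "rw":
        actions.update({"write", "delete"})
    return actions


read_only_share = effective_access(True, True, "r")
pi_lacks_write = effective_access(True, False, "rw")
```

In both example calls the result is read access only: in the first because the share is read-only, in the second because the PI cannot write to the path.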

Creating a group

  1. Go to Groups on the left panel
  2. Click on ‘Create a new group’ at the top
  3. Give the group a descriptive name and add a description for more information
  4. Make sure you select the ‘group members only’ radio button
  5. Click on ‘Create Group’
create group

Installing the Globus Client on your Desktop

The Globus Connect Personal client is available for Windows, Mac, or Linux desktop systems. The Globus website has detailed instructions on how to install and configure Globus Connect Personal on each of these platforms.

Transferring data between your desktop and Eagle 

On your desktop system, you will need to have Globus Connect Personal running. Point your web browser to www.globus.org, click on 'Log In', and enter your ALCF username and password on the ALCF login page that follows. After authenticating, you will be taken to the Globus File Manager page.

In the 'Collection' box, type the name of the Eagle managed endpoint (alcf#dtn_eagle). You may need to authenticate again. If so, you will be taken to the Globus authentication page as described above and can authenticate with your ALCF login username and password.

 

Collection-Search.png

By default, you should see the files in your /home on Eagle appear. You can also point to your /projects or another shared area by entering, for example, '/projects/myusername' in the Path box.
You can either download or transfer the data using the respective option. 

Click on 'Download' to download the required file. 

download

Select the 'Transfer or Sync to' option to transfer the file.


Enter the other endpoint, in this case the endpoint name that you gave to your desktop system when you installed Globus. You should now see both endpoints listed in two panes of the Globus window.

lalitha-laptop.png

To transfer files, select a file or directory on one endpoint, and click the blue 'Start' button.

transfer.png

The page will now say that the transfer request was submitted successfully.

success

Click on 'View details' to display task detail information. Statistics are displayed on this page.

Transfer_Complete.png

You will also receive an email when the transfer is complete.
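
For readers who want to see what such a transfer request contains, the following Python sketch builds a transfer document. It assumes the field names of the Globus Transfer API transfer document (DATA_TYPE, submission_id, source_endpoint, destination_endpoint, DATA, and transfer_item entries with source_path, destination_path, recursive); verify them against the Globus documentation. The helper name, the endpoint IDs, the submission ID, and the paths are hypothetical placeholders, and nothing is sent to Globus.

```python
# Sketch only: builds a transfer document; it does not submit anything.
# All IDs and paths below are hypothetical placeholders.


def make_transfer_document(submission_id, source_endpoint, destination_endpoint,
                           items, label=None):
    """Build a transfer document from (source_path, destination_path) pairs."""
    document = {
        "DATA_TYPE": "transfer",
        # In the real API a submission ID is requested from the service first.
        "submission_id": submission_id,
        "source_endpoint": source_endpoint,
        "destination_endpoint": destination_endpoint,
        "DATA": [
            {
                "DATA_TYPE": "transfer_item",
                "source_path": src,
                "destination_path": dst,
                # Convention of this sketch: a trailing '/' marks a directory.
                "recursive": src.endswith("/"),
            }
            for src, dst in items
        ],
    }
    if label:
        document["label"] = label
    return document


doc = make_transfer_document(
    submission_id="PLACEHOLDER-SUBMISSION-ID",
    source_endpoint="PLACEHOLDER-EAGLE-COLLECTION-ID",
    destination_endpoint="PLACEHOLDER-LAPTOP-ENDPOINT-ID",
    items=[
        ("/Eagle_Testing/results/", "/home/user/results/"),
        ("/Eagle_Testing/notes.txt", "/home/user/notes.txt"),
    ],
    label="Eagle to laptop",
)
```

Each selected file or directory in the web interface corresponds to one transfer_item entry, and the two panes correspond to the source and destination endpoints.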


Deleting an Endpoint

To see all endpoints you have shared, go to 'Endpoints' in the left bar, then 'Administered by You'. It is highly recommended that you (PI) delete the endpoint share once your collaborator has finished downloading the data. To do so, select the endpoint in that list and click on 'Delete endpoint'.

Delete_Endpoint.png

What to tell your collaborators  

If you set up a shared endpoint and want your collaborator to download the data, this is what you need to tell them.

First, the collaborator needs to get a Globus account. The instructions for setting up a Globus account are as described above. This account is free. They may already have Globus access via their institution.

If the collaborator is downloading the data to their personal workstation, they need to install the Globus Connect client. Globus Connect clients are available for Mac, Windows or Linux systems and are free.

 

If you clicked on the 'notify users via email' button when you added access for this user, they should have received a message that looks like this:

 

Screen Shot 2021-08-22 at 10.00.17 PM_Touched.png

You can, of course, also send email to your collaborators yourself, telling them you've shared a folder with them. The collaborator should click on the link, which will require logging in with their institutional or Globus login username and password. They should then be able to see the files you shared with them. An external collaborator's view of the shared collection is shown below:

 

collaborator_view.png

They should click on the files they want to transfer, then 'Transfer or Sync to', enter their own endpoint name and desired path and click the 'Start' button near the bottom to start the transfer.

lalitha-laptop

Encryption and Security  

Data can be encrypted during Globus file transfers. In some cases encryption cannot be supported by an endpoint, and Globus Online will signal an error.

For more information, see How does Globus Online ensure my data is secure?

 

In the Transfer Files window, click on 'More options' at the bottom of the two panes. Check the 'encrypt transfer' checkbox in the options.

security_2

Alternatively, you can encrypt the files before transfer using any method on your local system, transfer them using Globus, and then decrypt them on the other end.

Note that encryption and verification will slow down the data transfer.
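
In a scripted transfer, the same choices appear as options on the transfer request. The Python sketch below assumes the Globus Transfer API option names encrypt_data and verify_checksum; confirm them in the Globus documentation. The helper name with_security_options is hypothetical, and the base document is a stripped-down placeholder.

```python
def with_security_options(transfer_document, encrypt=True, verify=True):
    """Return a copy of a transfer document with encryption/verification set."""
    secured = dict(transfer_document)  # shallow copy; the input is left unchanged
    secured["encrypt_data"] = bool(encrypt)    # encrypt the data channel
    secured["verify_checksum"] = bool(verify)  # checksum files after transfer
    return secured


# Placeholder document; a real one would also name endpoints and items.
base = {"DATA_TYPE": "transfer", "DATA": []}
secured = with_security_options(base)
fast = with_security_options(base, encrypt=False, verify=False)
```

Turning both options off, as in the last line, trades the protections for speed, which is the trade-off described in the note above.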

FAQ Categories

General: 

1. What is ALCF’s Eagle?

Eagle is a community file system: a global file system available on all ALCF computational systems. It allows sharing of data between users, systems, and the "outside world".

 

2. What is the difference between Guest, Shared, and Mapped collections?

  • Guest collections: A Guest collection is a logical construct that a PI sets up on their project directory in Globus that makes it accessible by non-ALCF project members. The PI creates a guest collection at or below their project and shares it with the Globus account holders.
  • Shared collection: A guest collection becomes a shared collection when it is shared with a user/group.
  • Mapped Collections: Mapped Collections are created by the endpoint administrators. In the case of Eagle, these are created by ALCF.

 

3. What is the difference between Internal and External Collaborators?

  • Internal collaborators are the ones that have an ALCF account.
  • External collaborators are the ones that do NOT have an ALCF account but they DO need to have a Globus account in place. 

 

4. Who is an Access Manager? 

An Access Manager is a designated user who can act as a Proxy on behalf of the PI to manage the collection. The Access Manager can add users, remove users, and grant or revoke read/write access for those users on that particular shared collection.

 

5. What are Groups? 

Groups are constructs that enable multi-user data collaboration. A PI (and/or an Access Manager) can create new groups, add members to them and share a guest collection with a group of collaborators. Note that members of groups do not need to have a Unix account at ALCF.

 

6. What are some of the Common Errors you see and what do they mean?

  • EndpointNotFound: the endpoint <endpoint> was not found.
  • PermissionDenied: you do not have permission to view or modify the collection on <endpoint>.
  • ServiceUnavailable: the service is down for maintenance.
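
A script that talks to Globus could translate these codes into next steps for the user. The following Python sketch uses only the three codes listed above; the function names, the suggested actions, and the idea that only ServiceUnavailable is worth retrying are assumptions made for illustration.

```python
# Meanings are taken from the list above; the suggested actions are assumptions.
ERROR_MEANINGS = {
    "EndpointNotFound": "The endpoint was not found. Check the endpoint name.",
    "PermissionDenied": "You lack permission to view or modify the collection.",
    "ServiceUnavailable": "The service is down for maintenance. Try again later.",
}

# Assumption: only a maintenance outage is expected to clear up on its own.
RETRYABLE = {"ServiceUnavailable"}


def explain_error(code):
    """Return a human-readable explanation for a known error code."""
    return ERROR_MEANINGS.get(code, f"Unrecognized error code: {code}")


def should_retry(code):
    """Return True if repeating the same request later may succeed."""
    return code in RETRYABLE


message = explain_error("PermissionDenied")
retry = should_retry("ServiceUnavailable")
```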

 

PI: 

1. How does a PI request an Eagle storage allocation?

Interested users can request an allocation using the Director’s Discretionary Allocation Request form provided here: https://accounts.alcf.anl.gov/allocationRequests. The allocations committee reviews the applications and provides the needed approvals. It usually takes 1-2 weeks for the approvals to come through. To request an allocation for an existing project, please email support@alcf.anl.gov with your proposal.

 

2. Is it important for an Eagle PI to have an ALCF account?

Yes.

You (PI) need to have an 'active' ALCF account in place to create and share guest collections with collaborators. 

 

3. How will the PI access project directory on Eagle?

There are two ways for a PI to access it.

  • Web Interface: By logging in to Globus interface directly and navigating to the ALCF Eagle endpoint
  • POSIX: By logging in to the ALCF systems using the Mobile token

 

4. What endpoint will the PI be using?

For data that is located in /eagle (or /lus/eagle), you will use the Eagle endpoint (alcf#dtn_eagle).

 

5. What are the actions an Eagle PI can perform?

  • Create and delete guest collections, groups
  • Create, delete and share the data with ALCF users and external collaborators (who have Globus accounts)
  • Specify someone as a Proxy (Access Manager) for the guest collections
  • Transfer data between the guest collection on Eagle and other Globus endpoints/collections

 

6. How can a PI specify someone as a Proxy on the Globus side?

Go to alcf#dtn_eagle -> collections -> shared collection -> roles -> select 'Access Manager'

Roles

 

Proxy

 

7. What does the overall workflow for onboarding new users onto Eagle look like?

  • PI requests an Eagle allocation project
  • Allocations Committee reviews and approves requests
  • A project with a Unix group, project directory and quota is set up for the approved Eagle allocation project
  • A Globus sharing policy is created for the project with appropriate access controls
  • PI creates a guest collection for the project, using the Globus mapped collection for Eagle.
    • Note: PI needs to have an active ALCF Account and will need to log in to Globus using their ALCF credentials.
    • If PI already has a Globus account, it needs to be linked to their ALCF account
  • PI adds collaborators to the guest collection. Collaborators can be ALCF users and external collaborators (who need to have Globus accounts)
    • Added with Read only or Read-Write permissions

 

8. Should the PI add their ALCF project members to Eagle separately to access Collections?

Not required. ALCF project members already have access to the entire project directory. The main goal of Eagle is to allow sharing of data with outside collaborators who are non-ALCF users (with a Globus account). The PI can share guest collections with an external collaborator's email address using the Globus interface.

 

9. Who has the permissions to create a guest collection?

ONLY the PI can create a guest collection. Both the PI and the Access Manager have permissions to share it with collaborators (with read-only or read-write permissions as needed).
[Go to alcf#dtn_eagle -> File Manager -> projectdir -> folder -> right click and share -> Add a guest collection]

 

10. Who can create groups?

A PI (and an Access Manager) can create new groups, add members to them and share a guest collection with a group of collaborators. Note that members of groups do not need to have a Unix account at ALCF. For more information, please refer to the section 'Creating a group'.

 

11. How can a PI transfer data to Eagle from their laptop?

The PI will need to set up a Globus Connect Personal endpoint on the laptop for such transfers to happen.

 

12. What happens when the PI of a project is changed? What happens to the shared collection endpoint?

The new PI will need to create new shared collections and share them with collaborators again.

 

13. I notice that I am the owner of all the files that were transferred by external collaborators using the shared collection endpoint. Why is that?

If external collaborators are given WRITE permissions on the guest collection, they have the same permissions on the files as the PI. All files that collaborators transfer into or out of the collection are therefore handled "on behalf" of the PI, which is why the PI appears as the owner. External collaborators should be cautious when editing or deleting files in the collection.

 

14. What happens to the endpoint when the PI’s account goes Inactive?

The collections still exist but are dormant: no one, not even the Proxy, can access them until the PI's account is reactivated.

 

15. How long does it take for the endpoint to become accessible to collaborators after PI’s account is activated?

Right away. The page may need to be refreshed; logging out and back in sometimes makes it quicker.

 

Access Manager:

1. What are the prerequisites for an Access Manager?

  1. They should be a Globus user and be set up with the Access Manager role
  2. They should have access to some other Globus endpoint or Globus Connect Personal (laptop) to transfer their data to/from.

 

2. What are the actions an Access Manager can perform?

  1. Access Manager should be able to see the collection under ‘Shared with you’ and ‘Shareable by you’ tabs.
  2. Has permissions to add and/or delete collaborators on the shared collection and restrict their R-W access as needed.
  3. As long as the Access Manager has access to two active endpoints, they can transfer data between them.

 

3. Does an Access Manager need to have an ALCF account?

Not necessarily. However, if they need to manage the membership on the POSIX side, they will need an ALCF account and must be a Proxy on the project.

 

4. What is the difference between an ALCF project Proxy vs Guest Collection Access Manager?

ALCF Project Proxy has permissions to manage project membership on the ALCF filesystem side whereas guest collection Access Manager has permissions to manage the project membership specific to that guest collection shared by the PI.   

5. How will they access project directory on Eagle? 

There are two ways for an Access Manager to access it.     

  1. If they do NOT have an ALCF account, they can access the guest collection shared with them by logging in to the Globus interface directly
  2. If they are ALCF account holders, they can access the project directory on the filesystem by logging in using the Mobile token.

 

6. Can an Access Manager give external collaborators access to the collections that are shared with them on Eagle?

Yes. A proxy (Access Manager) will see the ‘Permissions’ tab at the top of the shared collection page and can share the collection with collaborators and/or a group.

 

7. Can an Access Manager create collections using the shared endpoint?

No. They don’t have access to create ‘New collections’. ONLY a PI has the permissions to do that.

 

8. Can an Access Manager leave a Globus group or withdraw a membership request for collaborators?

Yes. [Go to alcf#dtn_eagle -> Groups -> group_name -> Members -> click on specific user -> Role & Status -> Set the appropriate status]

remove_from_group

 

9. Can an Access Manager delete guest collections created by PI?

No. ONLY a PI can delete those guest collections.
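
The division of responsibilities between a PI and an Access Manager described in this FAQ can be captured as a small lookup table. This is an illustrative Python sketch, not a Globus API; the role and action names are hypothetical labels for the capabilities stated in the answers above.

```python
# Capabilities as described in this FAQ; role and action names are hypothetical.
CAPABILITIES = {
    "pi": {
        "create_collection",
        "delete_collection",
        "create_group",
        "manage_permissions",
        "assign_access_manager",
        "transfer_data",
    },
    "access_manager": {
        "create_group",
        "manage_permissions",
        "transfer_data",
    },
}


def can(role, action):
    """Return True if the given role is allowed to perform the action."""
    return action in CAPABILITIES.get(role, set())


pi_only = sorted(CAPABILITIES["pi"] - CAPABILITIES["access_manager"])
```

Here pi_only lists the actions reserved for the PI: assigning an Access Manager and creating or deleting guest collections.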

 

External Collaborators: 

1. What are the actions an External Collaborator can perform?

An External Collaborator is someone who DOES NOT have an ALCF account but DOES have an active Globus account in place.

  1. Collaborator can read files from a collection *
  2. Collaborator can write to a collection **
  3. Collaborator can delete files in a collection **

*  if the PI has read permissions for those files on the POSIX side
** if the PI has write permissions for those files on the POSIX side AND the collaborator is given write permissions in Globus