Data transfer to and from SURFdrive and ResearchDrive using rclone

This section describes how to transfer files to and from SURFdrive and ResearchDrive, the storage solutions offered by SURF. On ALICE or SHARK, the recommended method is to use rclone. The procedure described below is very similar for ResearchDrive and SURFdrive and is based on the documentation in the ResearchDrive Wiki.

Note for SHARK users: SURFdrive is not available to LUMC users.

Preparations

  • Make sure that you have an account on ResearchDrive and/or SURFdrive

  • Log in to ResearchDrive or SURFdrive

  • In ResearchDrive/SURFdrive, go to your username in the top right corner, choose "Settings" and then go to "Security"

  • In the section "WebDAV passwords", set an app name and generate a password. This will show your username and WebDAV password. Do not click on "Done" at this point, but keep it open until the setup procedure is complete.

Procedure

  • Log in to ALICE or SHARK

  • Load rclone

ALICE

SHARK

Self-Install


On ALICE, rclone is available as a module that you have to load first.

To always load the latest version, run: module load rclone

(Alternatively, install rclone in your user environment; see the Self-Install instructions below.)

You can check which version is available like this:

[me@res-hpc-lo02 ~]$ rclone --version

On SHARK, rclone is automatically available in your path. No need to load it as a module.

You can check which version is available like this:

[me@res-hpc-lo02 ~]$ rclone --version

 

To install rclone yourself, follow the instructions in the step "Fetch and unpack" on the rclone website: Install (rclone.org)

If you want your personal rclone version to always be available when you log in, copy the executable to the “bin” directory in your home directory:
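A minimal sketch of this copy step, assuming that you have already unpacked the rclone archive and that ~/bin is on your PATH (this depends on your shell configuration and is not guaranteed). The function name install_to_home_bin and the example path are illustrative only and are not part of rclone.

```shell
# Copy an executable into ~/bin so that it can be found after login.
# Assumption: ~/bin is on your PATH (check with: echo "$PATH").
install_to_home_bin() {
    src="$1"                                  # path to the unpacked executable
    mkdir -p "$HOME/bin"                      # create ~/bin if it does not exist
    cp "$src" "$HOME/bin/"                    # copy the executable
    chmod 755 "$HOME/bin/$(basename "$src")"  # make sure it is executable
}

# Example (hypothetical path; use the directory created when unpacking):
# install_to_home_bin ./rclone-*-linux-amd64/rclone
```

Afterwards, rclone --version should report the version that you installed, provided ~/bin comes first on your PATH.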

  • Start the rclone configuration process with the command rclone config

  • In the first configuration step, select "n" for "New Remote"

  • Next, rclone asks you to choose a name for the connection. The name is arbitrary and will be used later to identify the remote connection. In this example, we use "RD" for ResearchDrive and "SD" for SURFdrive.

  • Next, rclone will present you with a list of connection types. Find the one that says "WebDAV" and type in the corresponding number. In this example, it is 37, but it might be another one when you run it.

  • In the following step, you have to enter a URL. You can find the required URL when you switch back to your browser where you have the "Security" settings open for your RD or SD account. Copy the URL below "To access your files through WebDAV, please use the following URL:" and paste it into the terminal on ALICE or SHARK where you are configuring rclone.

  • In the next rclone configuration step, select the corresponding number for "OwnCloud". Here it is 2, but it could be another one in your case.

  • Next, rclone will ask you for your SD or RD username. You can also find this in the Security settings for your RD or SD account where you have generated the WebDAV password.

  • In the following step, you need to type in "y" to set your own password. This will be the WebDAV password that you generated on the Security settings of your RD or SD account.

  • Rclone will now ask you for a password. Copy the WebDAV password from your SD or RD account and paste it here. This is *not* your SD/RD account password, but the one that you generated in the Security settings.

  • Next, confirm the default settings for the following two steps.

  • Then confirm with "y" if the settings are correct.

  • Finally, quit the configuration menu with "q" if everything checks out and you do not need to configure anything else.
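As a quick check after the configuration steps above, you can confirm that the new remote is known to rclone. The following is a minimal sketch that assumes the remote was named RD (or SD) as in the example; the helper name remote_is_configured is illustrative and not part of rclone. It relies on rclone listremotes, which prints each configured remote on its own line followed by a colon.

```shell
# Return success if a remote with the given name has been configured.
remote_is_configured() {
    name="$1"
    # Match the whole line exactly, e.g. "RD:"
    rclone listremotes | grep -qxF "${name}:"
}

# Example: check the remote name chosen during configuration
# remote_is_configured RD && echo "RD is configured"
```

If the check fails, run rclone config again and verify the name of the remote.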

Testing

If the setup procedure was successful, you should now be able to access content in your SD or RD storage from ALICE and move data back and forth. Here, we have collected a few examples for you to try out.

One of the first things that you might want to try is to list the top-level content of your RD storage with one of rclone's listing commands (such as rclone lsd RD:).

To list the content of your SD account instead, replace "RD" with "SD".

Whenever you want to access your RD or SD storage, refer to it by the remote name followed by a colon, i.e. RD: or SD:
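The listing step can be sketched as follows, assuming the remotes are named RD and SD as in the configuration example. The wrapper function list_top_level is only an illustration; the rclone subcommands used (lsf, lsd, ls) are standard listing commands.

```shell
# List the top-level files and directories of a configured remote.
# The remote name is whatever you chose during `rclone config`.
list_top_level() {
    remote="$1"
    rclone lsf "${remote}:"     # files and directories, not recursive
}

# Examples:
# list_top_level RD             # top level of ResearchDrive
# list_top_level SD             # top level of SURFdrive
# rclone lsd RD:                # directories only
# rclone ls --max-depth 1 RD:   # top-level files only, with sizes
```

Note that rclone ls without --max-depth lists all files recursively, which can take a long time on a large storage area.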

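You can now try copying data to or from ALICE with rclone copy, which takes a source followed by a destination. It is best to start with a single small file,

either uploading it from ALICE to the remote or downloading it from the remote to ALICE.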
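A minimal sketch of a single-file transfer in both directions, assuming a remote named RD. The function names and all file and directory names are hypothetical; only rclone copy itself is a real rclone command.

```shell
# Upload a single local file into a directory on the remote.
upload_file() {
    local_file="$1"; remote="$2"; remote_dir="$3"
    rclone copy "$local_file" "${remote}:${remote_dir}"
}

# Download a single file from the remote into a local directory.
download_file() {
    remote="$1"; remote_file="$2"; local_dir="$3"
    rclone copy "${remote}:${remote_file}" "$local_dir"
}

# Examples (hypothetical names):
# upload_file ./test.txt RD my-project/test-dir
# download_file RD my-project/test-dir/test.txt ./downloads
```

In both cases the destination is treated as a directory, so the file keeps its original name.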

Best Practices

Rclone has many options for improving the performance of your data transfers. We recommend having a look at the ResearchDrive wiki linked above and the rclone documentation (e.g., "How to upload or download your files" and "rclone performance considerations" on the Research Drive SURF Wiki, surfnet.nl).

Here, we list some best practices:

  • Despite the recommendations on the ResearchDrive wiki, DO NOT increase the number of transfers (the --transfers option) beyond 8. We still need to gather data on how the number of transfers affects the network for other users.

  • For ResearchDrive and SURFdrive, rclone works well for files up to 30 GB each. Above that, transfers might become unstable. For files larger than 100 GB each, other methods have to be used to copy data to ResearchDrive and SURFdrive. Please consult the ResearchDrive and SURFdrive documentation.

  • Use the --timeout option when copying files, setting it to 10 minutes for each GB of the largest file (for example, --timeout 30m if the largest file is 3 GB).

  • Always use the --use-cookies option when you transfer files.
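The best practices above can be combined into one command. The following is a sketch under stated assumptions: 1 GB is taken as 1000000000 bytes and partial GBs are rounded up, and the helper names timeout_for_bytes and copy_with_best_practices are illustrative, not part of rclone. The options --use-cookies, --transfers and --timeout are regular rclone options.

```shell
# Timeout for rclone: 10 minutes per started GB of the largest file.
# Assumption: 1 GB = 1000000000 bytes, rounded up to a whole GB.
timeout_for_bytes() {
    bytes="$1"
    gb=$(( (bytes + 999999999) / 1000000000 ))
    if [ "$gb" -lt 1 ]; then
        gb=1                    # use at least 10 minutes
    fi
    echo "$(( gb * 10 ))m"
}

# Copy with cookies enabled, at most 8 parallel transfers, and a
# timeout derived from the size (in bytes) of the largest file.
copy_with_best_practices() {
    src="$1"; dest="$2"; largest_bytes="$3"
    rclone copy --use-cookies --transfers 8 \
        --timeout "$(timeout_for_bytes "$largest_bytes")" \
        "$src" "$dest"
}

# Example (hypothetical paths; the largest file here is 3 GB):
# copy_with_best_practices ./results RD:my-project/results 3000000000
```

The value 8 is the upper limit from the list above; a lower number of transfers is equally valid.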