ALICE
The instructions below assume that you have already set up and used Miniconda on ALICE. You can run the instructions on the login nodes, outside of a Slurm job.
Because of the disk space required by funannotate, in particular for the databases, we will install it on scratch-shared.
Before you start:
Make sure that you start with a clean module environment.
Make sure that you have already initialized conda and that the channels bioconda and conda-forge are available.
Because the packages that conda needs to download can also fill up your home directory, it can be useful to move the default download location to scratch-shared: https://pubappslu.atlassian.net/wiki/spaces/HPCWIKI/pages/37749421/Installing+Python+packages#Download-directory-for-packages
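The two prerequisites can be checked from the shell. The following is a minimal sketch, assuming that the module command is the usual Environment Modules/Lmod one and that conda is already on your PATH; has_channel is a hypothetical helper introduced only for this example.

```shell
# Start from a clean module environment (skipped if no module command exists).
if type module >/dev/null 2>&1; then
    module purge
fi

# Hypothetical helper: succeeds if the channel named in $1 appears in the
# channel listing read from stdin (e.g. from `conda config --show channels`).
has_channel() {
    grep -qE "^[[:space:]]*-[[:space:]]*$1[[:space:]]*$"
}

if command -v conda >/dev/null 2>&1; then
    for ch in bioconda conda-forge; do
        if conda config --show channels | has_channel "$ch"; then
            echo "channel $ch: configured"
        else
            echo "channel $ch: missing (add with: conda config --add channels $ch)"
        fi
    done
else
    echo "conda not found on PATH; initialize Miniconda first"
fi
```

The check only reads the conda configuration; it does not change it.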
Instructions
Let’s create a directory for funannotate in your user directory on scratch-shared first. The path here is just an example, adjust as needed.
mkdir /data1/$USER/funannotate
Create the conda environment for funannotate in your user directory on scratch-shared. Note that we will not install funannotate right away because we noticed issues with resolving the dependencies:
conda create --prefix=/data1/$USER/funannotate/funannotate_env "python>=3.6,<3.9"
Here, we specified the full path to the environment, so you will need to give the full path every time you activate it. Alternatively, you can move the default location for environments to your user directory on scratch-shared: see https://pubappslu.atlassian.net/wiki/spaces/HPCWIKI/pages/37749421/Installing+Python+packages#Installation-directory
Activate the new environment, install mamba to fix the issue with resolving dependencies, and then install funannotate:
conda activate /data1/$USER/funannotate/funannotate_env
# your prompt should have changed, showing the activated environment in brackets
# now, we can install mamba
conda install mamba
# next, we install funannotate
mamba install funannotate
After the installation has finished, you can keep the environment activated and continue with setting up the databases. First, we create a directory for the databases:
mkdir -p /data1/$USER/funannotate/funannotate_db
Set the environment variable $FUNANNOTATE_DB, which holds the location of the databases. Note that the instructions suggest running the database setup before setting the environment variable. However, we had issues with this, so we will set the environment variable first. For this, you will need the full path to the directory of your funannotate environment:
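# create the activation hook directories in case they do not exist yet
mkdir -p /data1/$USER/funannotate/funannotate_env/etc/conda/activate.d /data1/$USER/funannotate/funannotate_env/etc/conda/deactivate.d
echo "export FUNANNOTATE_DB=/data1/$USER/funannotate/funannotate_db" > /data1/$USER/funannotate/funannotate_env/etc/conda/activate.d/funannotate.sh
echo "unset FUNANNOTATE_DB" > /data1/$USER/funannotate/funannotate_env/etc/conda/deactivate.d/funannotate.sh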
Next, deactivate and activate the conda environment, to make sure that the database location settings are working properly:
conda deactivate
conda activate /data1/$USER/funannotate/funannotate_env
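To confirm that the activation script took effect, you can inspect the variable after reactivating the environment. This is a minimal sketch; check_funannotate_db is a hypothetical helper name, not part of funannotate or conda.

```shell
# Hypothetical helper: report whether FUNANNOTATE_DB is set and points at an
# existing directory. Returns non-zero if something is wrong.
check_funannotate_db() {
    if [ -z "${FUNANNOTATE_DB:-}" ]; then
        echo "FUNANNOTATE_DB is not set"
        return 1
    elif [ ! -d "$FUNANNOTATE_DB" ]; then
        echo "FUNANNOTATE_DB is set to $FUNANNOTATE_DB, but that directory does not exist"
        return 1
    fi
    echo "FUNANNOTATE_DB=$FUNANNOTATE_DB"
}

# Run the check once; `|| true` keeps a failing check from ending the shell
# session when it is run with `set -e`.
check_funannotate_db || true
```

If the variable is reported as not set, check that the two funannotate.sh files were written to the activate.d and deactivate.d directories of the environment.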
Run the command to set up the databases:
funannotate setup -d /data1/$USER/funannotate/funannotate_db
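Once the setup has finished, it can be worth checking that funannotate finds its dependencies. The sketch below assumes the environment is still activated and uses the funannotate check subcommand; verify_funannotate is a hypothetical wrapper added only to give a clearer message when the environment is not active.

```shell
# Hypothetical wrapper: run funannotate's own dependency check, but only if
# the funannotate executable is actually available on PATH.
verify_funannotate() {
    if ! command -v funannotate >/dev/null 2>&1; then
        echo "funannotate not found; is the environment activated?"
        return 1
    fi
    # Lists the versions of the Python modules and external tools it finds.
    funannotate check --show-versions
}
```

Calling verify_funannotate in the activated environment should list the detected tools; some optional external tools may be reported as missing, depending on what you plan to run.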
Create a yml file from the environment, which can be used to recreate it later on. This step is optional but recommended for reproducibility and as a backup of the environment. The yml file can be stored in your home directory, but it should also be copied to external storage for backup or uploaded, for example, to a GitLab repository:
today=$(date +%Y%m%d)
conda env export > $HOME/funannotate_env_$today.yml
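Recreating the environment from such a file could look like the sketch below. It assumes the yml files follow the funannotate_env_YYYYMMDD.yml naming used above; latest_env_yml and recreate_env are hypothetical helper names, and neither function is run until you call it.

```shell
# Hypothetical helper: print the most recent funannotate_env_YYYYMMDD.yml in
# directory $1. The date stamp sorts lexicographically, so the last name wins.
latest_env_yml() {
    ls -1 "$1"/funannotate_env_*.yml 2>/dev/null | sort | tail -n 1
}

# Hypothetical helper: rebuild the environment at prefix $1 from yml file $2.
recreate_env() {
    conda env create --prefix "$1" --file "$2"
}
```

For example, recreate_env /data1/$USER/funannotate/funannotate_env "$(latest_env_yml "$HOME")" would rebuild the environment at the same location from the newest export in your home directory. The databases are not part of the export, so the database directory has to be kept or set up again separately.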
SHARK
For SHARK, you can basically follow the same steps as for ALICE, adjusting the location of the conda environment. Again, we recommend not using your home directory, but an export directory to which you have access.
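One way to keep the steps portable between the two clusters is to derive all paths from a single base directory. This is only a sketch: the variable names are hypothetical, and the default path is a placeholder that you must replace with an export directory you can write to.

```shell
# Hypothetical placeholder; replace with an export directory you can write to,
# or set FUNANNOTATE_BASE in your shell before running this.
FUNANNOTATE_BASE="${FUNANNOTATE_BASE:-/path/to/your/export/dir/funannotate}"

# Derive the environment and database locations from the base directory.
FUNANNOTATE_ENV="$FUNANNOTATE_BASE/funannotate_env"
FUNANNOTATE_DB_DIR="$FUNANNOTATE_BASE/funannotate_db"

echo "environment: $FUNANNOTATE_ENV"
echo "databases:   $FUNANNOTATE_DB_DIR"
```

These two variables would then take the place of /data1/$USER/funannotate/funannotate_env and /data1/$USER/funannotate/funannotate_db in the ALICE commands.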