Data lies at the heart of almost everything we do in computing, and many factors determine where data should be stored and how it should be accessed. To help SCI users manage their data, SCI IT provides several data storage locations. This document describes those locations and their features.

If you need more disk space in a centralized data store (home directory, research project, etc.), would like to set up a new shared directory, or need to recover lost data that you could not find in your snapshots, please contact SCI IT.

Centralized Data Stores

SCI IT provides centralized disk space that is fast, highly available, accessible from any SCI computer, and regularly backed up. There are two main types of centralized storage:

  • Home Directories: the home directory you have when you log into a SCI computer is meant for your personal storage, and may be used however you like. Home directories are accessed via /home/{sci,collab}/<username>.
  • Research and Other Directories: these directories are provided for specific purposes, such as a particular research project or group, departmental web pages, etc. They are usually found under /usr/sci, but other locations also exist.

Quotas

Each of these directories is limited to a set size (a quota) in order to manage costs and ensure availability. For home directories, the quotas are:

  • Faculty: 1TB
  • Staff/Postdoc/Grad: 500GB
  • Undergrad: 100GB
  • Collaborator: 10GB
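
To see roughly how close a directory is to its quota, you can compare its size on disk with the limits above. The following is a minimal sketch, not an SCI IT tool: the function name usage_report is hypothetical, it assumes 1 GB means 1024 x 1024 KiB, and the total reported by du may differ from what the storage servers actually count against your quota.

```shell
# usage_report DIR QUOTA_GB
# Hypothetical helper: print how much of a quota (in GB) DIR is using.
usage_report() {
    dir=$1
    quota_gb=$2
    # du -sk prints the total size in 1 KiB blocks, a tab, then the path.
    used_kb=$(du -sk "$dir" 2>/dev/null | cut -f1)
    # Give up if du produced no total (e.g. the directory does not exist).
    [ -n "$used_kb" ] || return 1
    # Assumption: 1 GB is treated as 1024 * 1024 KiB.
    quota_kb=$((quota_gb * 1024 * 1024))
    # Integer percentage, rounded down.
    percent=$((used_kb * 100 / quota_kb))
    echo "$dir: ${used_kb} KiB used of ${quota_gb} GB quota (${percent}%)"
}
```

For example, a graduate student could run usage_report "$HOME" 500. On a large home directory du can take a while, since it has to walk every file.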

Snapshots

If you accidentally delete data in one of these centralized data stores, you may be able to recover it by copying it out of a snapshot directory. The servers that provide our centralized disk space automatically take snapshots on an hourly, nightly, and weekly basis. Snapshots can be accessed via the .snapshot directory, a special subdirectory that exists in every directory in our centralized storage. Note that the .snapshot directory does not appear in directory listings (not even in ls -la); you must explicitly reference it on the command line.

In the example below, I remove a file and then restore a copy from the nightly snapshot taken on October 2 (02_nightly_sci):

clake@memphis: ~/dev > ls -la
total 36K
drwxr-xr-x 2 clake sci-it 2.0K Oct 8 11:14 ./
drwxr-xr-x 25 clake sci-it 22K Oct 8 11:15 ../
-rw-r--r-- 1 clake sci-it 6.1K Jun 26 14:04 50unattended-upgrades
-rwxr-xr-x 1 clake sci-it 582 Sep 4 17:58 dns_check*
-rwxr-xr-x 1 clake sci-it 4.0K May 31 17:09 harden_sshd*
-rwxr-xr-x 1 clake sci-it 181 Apr 30 17:06 Ubuntu_24.04_customization*

clake@memphis: ~/dev > rm dns_check
rm: remove regular file 'dns_check'? y

clake@memphis: ~/dev > ls -la
total 32K
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 11:15 ./
drwxr-xr-x 25 clake sci-it 22K Oct 8 11:15 ../
-rw-r--r-- 1 clake sci-it 6.1K Jun 26 14:04 50unattended-upgrades
-rwxr-xr-x 1 clake sci-it 4.0K May 31 17:09 harden_sshd*
-rwxr-xr-x 1 clake sci-it 181 Apr 30 17:06 Ubuntu_24.04_customization*

clake@memphis: ~/dev > ls -lt .snapshot/
total 144K
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 11:00 11_hourly_sci/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 10:00 10_hourly_sci/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 09:00 09_hourly_sci/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 08:00 08_hourly_sci/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 07:00 07_hourly_sci/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 06:45 06_hourly_sci_28783/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 06:45 06_45_hourly_sci/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 06:00 06_hourly_sci/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 05:00 05_hourly_sci/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 04:00 04_hourly_sci/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 03:00 03_hourly_sci/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 02:00 02_hourly_sci_28766/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 01:00 01_hourly_sci/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 00:05 08_nightly_sci/
drwxr-xr-x 2 clake sci-it 1.5K Oct 8 00:00 00_hourly_sci_28758/
drwxr-xr-x 2 clake sci-it 1.5K Oct 7 23:00 23_hourly_sci_28755/
drwxr-xr-x 2 clake sci-it 1.5K Oct 7 22:00 22_hourly_sci_28752/
drwxr-xr-x 2 clake sci-it 1.5K Oct 7 21:00 21_hourly_sci_28749/
drwxr-xr-x 2 clake sci-it 1.5K Oct 7 20:00 20_hourly_sci_28746/
drwxr-xr-x 2 clake sci-it 1.5K Oct 7 19:00 19_hourly_sci_28743/
drwxr-xr-x 2 clake sci-it 1.5K Oct 7 18:00 18_hourly_sci_28740/
drwxr-xr-x 2 clake sci-it 1.5K Oct 7 17:00 17_hourly_sci_28737/
drwxr-xr-x 2 clake sci-it 1.5K Oct 7 16:00 16_hourly_sci_28734/
drwxr-xr-x 2 clake sci-it 2.0K Oct 7 15:00 15_hourly_sci_28731/
drwxr-xr-x 2 clake sci-it 2.0K Oct 7 14:00 14_hourly_sci_28728/
drwxr-xr-x 2 clake sci-it 2.0K Oct 7 13:00 13_hourly_sci_28725/
drwxr-xr-x 2 clake sci-it 2.0K Oct 7 12:00 12_hourly_sci_28722/
drwxr-xr-x 2 clake sci-it 2.0K Oct 7 00:05 07_nightly_sci/
drwxr-xr-x 2 clake sci-it 2.0K Oct 6 00:05 2024_10_06_weekly_sci/
drwxr-xr-x 2 clake sci-it 2.0K Oct 5 00:05 05_nightly_sci/
drwxr-xr-x 2 clake sci-it 2.0K Oct 4 00:05 04_nightly_sci/
drwxr-xr-x 2 clake sci-it 2.0K Oct 3 00:05 03_nightly_sci/
drwxr-xr-x 2 clake sci-it 2.0K Oct 2 00:05 02_nightly_sci/
drwxr-xr-x 2 clake sci-it 2.0K Sep 29 00:05 2024_09_29_weekly_sci/
drwxr-xr-x 2 clake sci-it 2.0K Sep 22 00:05 2024_09_22_weekly_sci/
drwxr-xr-x 2 clake sci-it 2.0K Sep 15 00:05 2024_09_15_weekly_sci/

clake@memphis: ~/dev > cp -p .snapshot/02_nightly_sci/dns_check .

clake@memphis: ~/dev > ls -la
total 36K
drwxr-xr-x 2 clake sci-it 2.0K Oct 8 11:15 ./
drwxr-xr-x 25 clake sci-it 22K Oct 8 11:15 ../
-rw-r--r-- 1 clake sci-it 6.1K Jun 26 14:04 50unattended-upgrades
-rwxr-xr-x 1 clake sci-it 582 Sep 4 17:58 dns_check*
-rwxr-xr-x 1 clake sci-it 4.0K May 31 17:09 harden_sshd*
-rwxr-xr-x 1 clake sci-it 181 Apr 30 17:06 Ubuntu_24.04_customization*
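
When it is not obvious which snapshot still holds a deleted file, a small loop can check each one. This is only a sketch: the function name list_snapshot_copies is hypothetical, and it assumes the snapshots appear as subdirectories of .snapshot, as in the listing above.

```shell
# list_snapshot_copies FILE [DIR]
# Hypothetical helper: list every snapshot copy of FILE under DIR/.snapshot.
# DIR defaults to the current directory. Returns 1 if no copy is found.
list_snapshot_copies() {
    file=$1
    dir=${2:-.}
    found=1
    # .snapshot is referenced explicitly because it is hidden from listings.
    for snap in "$dir"/.snapshot/*/; do
        if [ -e "${snap}${file}" ]; then
            # Show size and modification time so copies can be compared.
            ls -ld "${snap}${file}"
            found=0
        fi
    done
    return $found
}
```

Running list_snapshot_copies dns_check in ~/dev would print one line per snapshot that contains the file, in alphabetical order by snapshot name rather than by time. Any of the listed paths can then be passed to cp -p as shown above.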

Local Disk

Desktops

Practically all modern desktop and workstation computers come with locally installed disks that provide far more space than the operating system needs. The standard SCI build makes this extra space available to users as scratch space.

This local disk space is available on the local machine as /scratch or /scratch_{storage_type} (e.g. /scratch_nvme or /scratch_ssd).

The benefits of this scratch space are that there is generally a lot of it and that, when used locally on the machine, it is fast. However, scratch space is not backed up, so in the event of a hardware failure or accidental deletion, SCI IT will be unable to recover the data. As such, it is best used for data that can be recreated or recovered from elsewhere (i.e. data that is also kept in another location).
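
To find out which scratch areas a particular machine has and how much room is left on each, something like the following can help. It is a sketch under assumptions: the function names scratch_free_kb and list_scratch are hypothetical, and the loop simply looks for directories matching the /scratch and /scratch_{storage_type} names mentioned above.

```shell
# scratch_free_kb PATH
# Print the free space, in KiB, on the filesystem that contains PATH.
scratch_free_kb() {
    # df -P produces one line per filesystem; column 4 is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# list_scratch
# Print each scratch directory present on this machine with its free space.
list_scratch() {
    for d in /scratch /scratch_*; do
        # Skip names that do not exist here (including an unmatched glob).
        [ -d "$d" ] || continue
        echo "$d: $(scratch_free_kb "$d") KiB free"
    done
}
```

Running list_scratch prints nothing on a machine that has no scratch directories. Whatever you write there, remember that it exists only on that machine's local disk.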

Laptops

As with desktops, most modern laptops come with a large amount of local disk space. Because laptops are not backed up by SCI IT, any important information on your laptop should be backed up, either by copying it to a centralized data directory or by some other means (e.g. third-party backup software or manual copying to an external disk). If you do back up your laptop outside of SCI, please be mindful of sensitive data and of where and how that data may be copied.

Non-SCI Data Stores

The University of Utah has many other groups and departments that provide data storage services in addition to those provided by SCI. Two of the most relevant are:

UIT

UIT has several storage options, including personal and departmental storage via Box and OneDrive, departmental shares, and Google Workspace. For more information, see the UIT Service Catalog.

CHPC

The Center for High Performance Computing (CHPC) offers several data storage options for users of its resources, including home directories and scratch space. CHPC also provides storage in its Protected Environment for work with sensitive data. For more information, see the CHPC page on Storage Services.