Institute for Geophysics

HSM Storage

Background:

Software on some UTIG servers transparently moves data between disk and tape libraries to give the appearance of limitless storage. Files are backed up to slower disk or tape shortly after they are created. If you don't access a file for a while, its content is removed from the disk, although its name and attributes remain. When you next refer to it, the software reloads its content from the backup disk copy if one is available (this is very fast), or otherwise from tape. Disk copies are retained for only 3-4 weeks; our statistics indicate that data left unused for that long won't be used again for a long while.

If you haven't used a set of files for a while, their content may need to be reloaded from tape. This can be very slow, particularly when many tapes are involved or when you access the files in a different order from the one in which they are stored on tape. The effect is often seen with Geoquest, Landmark and Paradigm datasets, because many files must be available before the content of a dataset can be presented. The wait can be extremely frustrating.

If you know you are going back to a project after a long while (say weeks or months), you could pre-load the project using low-level commands directly. However, if you inadvertently ask for too many tapes or too much data, the tape library can become clogged with tens of thousands of files waiting to be reloaded, representing hours of work. This prevents other reloads from completing with the expected delay of a few minutes, and makes everyone grouchy.

Solution:

"recall" can be used to pre-load a project you haven't used in a while, or to investigate when you refer to some data and find it unacceptably slow. It shows the files you have most recently reloaded from tape, listed under your user name and for one or more storage locations. If you are on a Solaris host and see the message "nfs: file temporarily unavailable on the server, retrying..." on the console or in dmesg output, you can use recall to work out which set of files to reload in order to reduce the delay.

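To confirm that a slowdown is caused by pending tape reloads, you can search captured console or dmesg text for the message quoted above. The following is a minimal Python sketch, not part of recall; the helper name find_nfs_stalls and the sample text are hypothetical, and it assumes you have already saved the log output as text.

```python
import re

# Message text quoted in this page; matched case-insensitively.
NFS_RETRY_MSG = "file temporarily unavailable on the server"

def find_nfs_stalls(log_text):
    """Return the lines of log_text that contain the NFS retry message."""
    pattern = re.compile(re.escape(NFS_RETRY_MSG), re.IGNORECASE)
    return [line for line in log_text.splitlines() if pattern.search(line)]

# Made-up sample input, for illustration only.
sample = "\n".join([
    "unrelated kernel message",
    "nfs: file temporarily unavailable on the server, retrying...",
])
for line in find_nfs_stalls(sample):
    print(line)
```

If the message shows up repeatedly, running recall on the directory you are working in is the next step.
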
utig.ig.utexas.edu% recall

recall advises you how much SAM-FS tape data you might request.
Usage: recall some_files_or_dirs...

recall walks you through the process of recalling large
amounts of data from sam-fs tape storage without the
risk that you will clog the system by asking for too many
tapes or too much data. It works in two stages. The first
stage verifies that you know the size and locations for the data, and
how many tapes the data is distributed across, and the second stage
actually performs the recall

last 5 files recalled for markw from tape:
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/pgsetup
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/pgsetup.cmd
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/xpgsetup.cmd
2007/10/24 17:03:51 00A012 /d1/other/pgclass/TUA_2D/hds/Condor_new/Datastore.HDS
2007/10/24 17:04:55 00A281 /d1/other/pgclass/TUA_2D/hds/Condor_new/map/FileTable.tbl
If I want to check on particular data, I can ask about that:
recall /disk/other/pgclass/TUA_2D/hds/Condor_new

recall advises you how much SAM-FS tape data you might request.

Have you already checked the locations and sizes? n

last 5 files recalled for markw from tape:
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/pgsetup
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/pgsetup.cmd
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/xpgsetup.cmd
2007/10/24 17:03:51 00A012 /d1/other/pgclass/TUA_2D/hds/Condor_new/Datastore.HDS
2007/10/24 17:04:55 00A281 /d1/other/pgclass/TUA_2D/hds/Condor_new/map/FileTable.tbl

full samfs path name for this location is /d1/other/pgclass/TUA_2D/hds/Condor_new

last 5 files recalled for location /disk/other/pgclass/TUA_2D/hds/Condor_new from tape:
2007/10/08 16:50:18 00A463 /d1/other/pgclass/TUA_2D/hds/Condor_new/Intersections2D/Intersections2DDescr.ctl
2007/10/08 16:50:28 00A463 /d1/other/pgclass/TUA_2D/hds/Condor_new/Intersections2D/Intersections2D.ctl
2007/10/08 16:51:56 00A303 /d1/other/pgclass/TUA_2D/hds/Condor_new/Pref/surv_set_Clust
2007/10/24 17:03:51 00A012 /d1/other/pgclass/TUA_2D/hds/Condor_new/Datastore.HDS
2007/10/24 17:04:55 00A281 /d1/other/pgclass/TUA_2D/hds/Condor_new/map/FileTable.tbl

recent tape recall list for /disk/other/pgclass/TUA_2D/hds/Condor_new:
  56 00A431
  26 00A303
   5 00A281
   4 00A463
   2 00A153
   2 00A012
   1 00A411
   1 00A004

cataloging file status and locations (may take a while):

Total size of files to be recalled from /disk/other/pgclass/TUA_2D/hds/Condor_new:
    1.2 MBytes

Number of files from each tape needed for this recall:
  31 00A004
  14 00A012
   2 00A022
   4 00A150
   1 00A153
  76 00A281
   2 00A303
   1 00A322
   3 00A389
   1 00A398
   1 00A407
   3 00A411
  22 00A463

161 files 13 tapes 1.20119 MB will take about      39 minutes

161 files on 13 tapes totaling 1.20119 MB will take about      39 minutes

NOTES:

MOST IMPORTANT: when you ask for too many tapes at once, it takes a very long
time for even multiple tape drives to deliver what was asked. Meanwhile, all
other requests are backed up waiting first-come first-served for your requests.

If this request requires more than a few tapes, it will be serialized, tape by
tape, so your request won't monopolize drives. If a very large number of files
are recalled, tape access becomes less efficient.
The design of a better scheme for such cases is being worked on.

OK to continue with all locations? 

If you answer yes to the last question, the actual recall will be done.
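
Before answering yes, it can help to tally the per-tape listing and see which tapes hold most of the files. The sketch below is an illustration in Python, assuming the "Number of files from each tape" listing shown above has been copied into a string; parse_tape_counts is a hypothetical helper and not part of recall.

```python
import re

# Per-tape listing copied from the sample session above.
LISTING = """\
  31 00A004
  14 00A012
   2 00A022
   4 00A150
   1 00A153
  76 00A281
   2 00A303
   1 00A322
   3 00A389
   1 00A398
   1 00A407
   3 00A411
  22 00A463
"""

def parse_tape_counts(text):
    """Map each tape label to the number of files recall reports for it."""
    counts = {}
    for line in text.splitlines():
        m = re.match(r"^\s*(\d+)\s+(\S+)\s*$", line)
        if m:
            label = m.group(2)
            counts[label] = counts.get(label, 0) + int(m.group(1))
    return counts

counts = parse_tape_counts(LISTING)
total_files = sum(counts.values())
busiest = max(counts, key=counts.get)
print(total_files, "files on", len(counts), "tapes; busiest tape:", busiest)
```

For this listing the tally agrees with the summary line: 161 files on 13 tapes, with tape 00A281 holding 76 of them.
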

"recall" works on all Sun SPARC and Linux systems. Recalling a non-samfs directory is fine; it won't need any tapes. Asking about a directory that you don't have permission to inspect may print out many error messages from a system utility named sfind. If you ask about a very large directory, it will take a while, but in the worst case no more than about 10-15 minutes. You may specify as many locations or individual files as will fit on a command line; paths relative to the current working directory are fine too.

For example:

recall fdpsv geoframe
...

full samfs path name for this location is /d1/staff/markw/fdpsv

last 5 files recalled for location fdpsv from tape:

recent tape recall list for fdpsv:

cataloging file status and locations (may take a while):

Total size of files to be recalled from fdpsv:
  935.8 MBytes

Number of files from each tape needed for this recall:
     40 00A022
     19 00A092
     45 00A115

104 files 3 tapes 935.791 MB will take about    10.9 minutes

full samfs path name for this location is /d1/staff/markw/geoframe

last 5 files recalled for location geoframe from tape:

recent tape recall list for geoframe:

cataloging file status and locations (may take a while):

Total size of files to be recalled from geoframe:
    0.0 MBytes

Number of files from each tape needed for this recall:
      3 00A092

3 files 1 tapes 0.000556946 MB will take about     3.0 minutes

107 files on 4 tapes totaling 935.792 MB will take at -least-    13.9 minutes
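
The final line of this example can be reproduced by adding up the per-location summary lines. The following Python sketch does that arithmetic; the regular expression and the helper names parse_summary and add_summaries are assumptions made for illustration, based only on the output format shown above.

```python
import re

# Pattern for a per-location summary line as it appears in the sample output.
SUMMARY = re.compile(
    r"(\d+) files (\d+) tapes ([\d.]+) MB will take about\s+([\d.]+) minutes"
)

def parse_summary(line):
    """Return (files, tapes, megabytes, minutes), or None if no match."""
    m = SUMMARY.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2)), float(m.group(3)), float(m.group(4))

def add_summaries(lines):
    """Sum files, tapes, megabytes and minutes over all matching lines."""
    files = tapes = 0
    megabytes = minutes = 0.0
    for line in lines:
        parsed = parse_summary(line)
        if parsed is not None:
            files += parsed[0]
            tapes += parsed[1]
            megabytes += parsed[2]
            minutes += parsed[3]
    return files, tapes, megabytes, minutes

session = [
    "104 files 3 tapes 935.791 MB will take about    10.9 minutes",
    "3 files 1 tapes 0.000556946 MB will take about     3.0 minutes",
]
files, tapes, megabytes, minutes = add_summaries(session)
print(files, "files", tapes, "tapes",
      round(megabytes, 3), "MB", round(minutes, 1), "minutes")
```

The sums match the reported total of 107 files, 4 tapes, 935.792 MB and 13.9 minutes. Note that tape 00A092 appears in both listings, so the figure of 4 tapes is the sum of the per-location counts rather than a count of distinct tapes. In the samples on this page the estimates also work out to about three minutes per tape (39 minutes for 13 tapes, 3.0 minutes for 1 tape), with the 935.8 MB transfer adding roughly two minutes more; that is an inference from these samples, not a documented formula.
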