Software on some UTIG servers transparently handles
movement of data between disk and
tape libraries to give the appearance of limitless storage.
Files are backed up to slower disk or
tape a short while after they are created.
If you don't access a file for a while, its content is released
from the disk, although its name and attributes are still
there. When you next refer to it, software reloads its content
from the slower disk copy, if available (this is very fast), or from tape.
Disk copies are retained for only 3-4 weeks. After that
time, our statistics indicate the data is unlikely to be
used again for a long while.
If you haven't used a set of files for a while, their content may need
to be reloaded from tape. This can be very slow, particularly when
many tapes are involved or when you reference the files in a different
order from the one in which they are stored on tape. This effect is often seen
with Geoquest, Landmark and Paradigm datasets because many files
need to be available to present the content of a dataset. This
wait can be extremely frustrating.
If you know you are going back to a project after a long while
(say weeks or months), you could pre-load the project using low-level
commands directly. However, if you inadvertently ask for many tapes or too
much data, the tape library can be clogged with tens of thousands of files
waiting to be reloaded, amounting to hours of work. This prevents other
reloads from completing within the expected delay of a few minutes, and
makes everyone grouchy.
Solution:
"recall" can be used to pre-load a project you haven't used in a while,
or to investigate when you refer to some data and find access unacceptably slow.
It shows you the files you've most recently reloaded, both for your user name
and for one or more storage locations.
If you are on a Solaris host and see this message on the console or in dmesg output:
"nfs: file temporarily unavailable on the server, retrying..."
then you can use recall to investigate which set of files to pre-load to reduce the delay.
utig.ig.utexas.edu% recall
recall advises you how much SAM-FS tape data you might request.
Usage: recall some_files_or_dirs...
recall walks you through the process of recalling large
amounts of data from sam-fs tape storage without the
risk that you will clog the system by asking for too many
tapes or too much data. It works in two stages. The first
stage verifies that you know the size and locations for the data, and
how many tapes the data is distributed across, and the second stage
actually performs the recall
last 5 files recalled for markw from tape:
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/pgsetup
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/pgsetup.cmd
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/xpgsetup.cmd
2007/10/24 17:03:51 00A012 /d1/other/pgclass/TUA_2D/hds/Condor_new/Datastore.HDS
2007/10/24 17:04:55 00A281 /d1/other/pgclass/TUA_2D/hds/Condor_new/map/FileTable.tbl
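The history lines above follow a regular layout: date, time, tape label, and full path. As a minimal sketch (this is not part of recall itself), the Python below groups such lines by tape so you can see which tapes your recent reloads came from. The function name group_by_tape and the regular expression are illustrative assumptions based only on the sample output shown here.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the sample output: date, time, tape label, path.
HISTORY_LINE = re.compile(
    r"^(\d{4}/\d{2}/\d{2})\s+(\d{2}:\d{2}:\d{2})\s+(\S+)\s+(.+)$"
)

def group_by_tape(history_text):
    """Map each tape label to the list of paths recalled from it."""
    tapes = defaultdict(list)
    for line in history_text.splitlines():
        match = HISTORY_LINE.match(line.strip())
        if match is None:
            continue  # skip header lines and anything else that is not a history entry
        _date, _time, tape, path = match.groups()
        tapes[tape].append(path)
    return dict(tapes)

sample = """\
last 5 files recalled for markw from tape:
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/pgsetup
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/pgsetup.cmd
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/xpgsetup.cmd
2007/10/24 17:03:51 00A012 /d1/other/pgclass/TUA_2D/hds/Condor_new/Datastore.HDS
2007/10/24 17:04:55 00A281 /d1/other/pgclass/TUA_2D/hds/Condor_new/map/FileTable.tbl
"""

for tape, paths in sorted(group_by_tape(sample).items()):
    print(tape, len(paths))
```

For the five sample lines this reports three files from 00A249 and one each from 00A012 and 00A281.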
If I want to check on particular data, I can ask about that:
recall /disk/other/pgclass/TUA_2D/hds/Condor_new
recall advises you how much SAM-FS tape data you might request.
Have you already checked the locations and sizes? n
last 5 files recalled for markw from tape:
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/pgsetup
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/pgsetup.cmd
2007/10/24 14:44:21 00A249 /d1/backup/cdroms/Paradigm/sp_epos3SEsh_417/xpgsetup.cmd
2007/10/24 17:03:51 00A012 /d1/other/pgclass/TUA_2D/hds/Condor_new/Datastore.HDS
2007/10/24 17:04:55 00A281 /d1/other/pgclass/TUA_2D/hds/Condor_new/map/FileTable.tbl
full samfs path name for this location is /d1/other/pgclass/TUA_2D/hds/Condor_new
last 5 files recalled for location /disk/other/pgclass/TUA_2D/hds/Condor_new from tape:
2007/10/08 16:50:18 00A463 /d1/other/pgclass/TUA_2D/hds/Condor_new/Intersections2D/Intersections2DDescr.ctl
2007/10/08 16:50:28 00A463 /d1/other/pgclass/TUA_2D/hds/Condor_new/Intersections2D/Intersections2D.ctl
2007/10/08 16:51:56 00A303 /d1/other/pgclass/TUA_2D/hds/Condor_new/Pref/surv_set_Clust
2007/10/24 17:03:51 00A012 /d1/other/pgclass/TUA_2D/hds/Condor_new/Datastore.HDS
2007/10/24 17:04:55 00A281 /d1/other/pgclass/TUA_2D/hds/Condor_new/map/FileTable.tbl
recent tape recall list for /disk/other/pgclass/TUA_2D/hds/Condor_new:
56 00A431
26 00A303
5 00A281
4 00A463
2 00A153
2 00A012
1 00A411
1 00A004
cataloging file status and locations (may take a while):
Total size of files to be recalled from /disk/other/pgclass/TUA_2D/hds/Condor_new:
1.2 MBytes
Number of files from each tape needed for this recall:
31 00A004
14 00A012
2 00A022
4 00A150
1 00A153
76 00A281
2 00A303
1 00A322
3 00A389
1 00A398
1 00A407
3 00A411
22 00A463
161 files 13 tapes 1.20119 MB will take about 39 minutes
161 files on 13 tapes totaling 1.20119 MB will take about 39 minutes
NOTES:
MOST IMPORTANT: when you ask for too many tapes at once, it takes a very long
time for even multiple tape drives to deliver what was asked. Meanwhile, all
other requests are backed up waiting first-come first-served for your requests.
If this request requires more than a few tapes, it will be serialized, tape by
tape, so your request won't monopolize drives. If a very large number of files
are recalled, tape access becomes less efficient.
The design of a better scheme for such cases is being worked on.
OK to continue with all locations?
If you answer yes to the last question, the actual recall will be done.
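Before answering yes, it can help to check the per-tape listing against the summary line. The following is a small sketch, assuming each listing line holds a file count followed by a tape label exactly as in the sample output above; the function name tally is hypothetical and the code is not part of recall.

```python
import re

# Assumed layout from the sample output: a file count, then a tape label such as 00A004.
TAPE_COUNT = re.compile(r"^\s*(\d+)\s+(\S+)\s*$")

def tally(tape_lines):
    """Return (total_files, number_of_tapes) for lines like '31 00A004'."""
    counts = {}
    for line in tape_lines.splitlines():
        match = TAPE_COUNT.match(line)
        if match:
            files, tape = int(match.group(1)), match.group(2)
            counts[tape] = counts.get(tape, 0) + files
    return sum(counts.values()), len(counts)

listing = """\
31 00A004
14 00A012
2 00A022
4 00A150
1 00A153
76 00A281
2 00A303
1 00A322
3 00A389
1 00A398
1 00A407
3 00A411
22 00A463
"""

files, tapes = tally(listing)
print(files, "files on", tapes, "tapes")  # prints: 161 files on 13 tapes
```

The totals match the "161 files on 13 tapes" summary that recall printed for this location.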
"recall" works on all Sun Sparc and Linux systems.
Recalling a non-samfs directory is fine; it won't need any tapes.
Asking about a directory that you don't
have permission to inspect may print out many error messages from
a system utility named sfind. If you ask for information
about a very large directory, it will take a while, but in the worst case
no more than about 10-15 minutes. You may specify as many locations
or individual files as will fit on a command line, and paths relative
to the current working directory are fine too.
For example:
recall fdpsv geoframe
...
full samfs path name for this location is /d1/staff/markw/fdpsv
last 5 files recalled for location fdpsv from tape:
recent tape recall list for fdpsv:
cataloging file status and locations (may take a while):
Total size of files to be recalled from fdpsv:
935.8 MBytes
Number of files from each tape needed for this recall:
40 00A022
19 00A092
45 00A115
104 files 3 tapes 935.791 MB will take about 10.9 minutes
full samfs path name for this location is /d1/staff/markw/geoframe
last 5 files recalled for location geoframe from tape:
recent tape recall list for geoframe:
cataloging file status and locations (may take a while):
Total size of files to be recalled from geoframe:
0.0 MBytes
Number of files from each tape needed for this recall:
3 00A092
3 files 1 tapes 0.000556946 MB will take about 3.0 minutes
107 files on 4 tapes totaling 935.792 MB will take at -least- 13.9 minutes
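The time estimates in the sample runs are consistent with a simple model: a fixed cost per tape plus a transfer cost proportional to the data size. The sketch below is inferred only from the numbers printed above; the constants (3 minutes per tape, roughly 500 MB per minute) and the function name estimate_minutes are assumptions, not recall's documented formula.

```python
def estimate_minutes(tapes, mbytes, minutes_per_tape=3.0, mbytes_per_minute=500.0):
    """Rough recall time: fixed per-tape overhead plus transfer time.

    Both constants are guesses fitted to the sample output, not values
    taken from recall itself.
    """
    return tapes * minutes_per_tape + mbytes / mbytes_per_minute

# (tapes, MB) pairs copied from the per-location summary lines in the samples.
samples = [(13, 1.20119), (3, 935.791), (1, 0.000556946)]
for tapes, mbytes in samples:
    print(tapes, "tapes:", round(estimate_minutes(tapes, mbytes), 1), "minutes")
```

With these assumed constants the model gives about 39, 10.9 and 3.0 minutes, matching the three per-location estimates shown. The final total of 13.9 minutes in the last example appears to be the sum of the two per-location estimates, with tape 00A092 counted once for each location.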