Accessing OSCER's Tape Archive
Questions about the tape archive or how to use it?
Contact us!
The OSCER tape archive is now online.
This tape archive replaces
the /data disk partition.
Access to the tape archive is still somewhat primitive, but
we plan to improve the access software,
to make your use of the tape archive both simpler and
more powerful.
Currently, the only way to access the tape archive is using
sftp (Secure File Transfer Protocol).
On all OSCER systems
(topdawg.oscer.ou.edu, schooner.oscer.ou.edu, condor.oscer.ou.edu),
the sftp command to use is:
sftp archive
So, you can log in to topdawg.oscer.ou.edu or schooner.oscer.ou.edu,
change to the directory that holds your files
(for example, /scratch/yourusername),
and then do the command above to copy files to the tape archive:
cd /scratch/yourusername
sftp archive
NOTES:
-
When you sftp into the tape archive,
you currently start in the directory one level above your own directory,
so if you do an ls command to list
the contents of your current working directory,
you'll see the list of OSCER users.
Therefore,
the first command that you should type
when you sftp into the tape archive is:
cd yourusername
(Replace yourusername with your user name.)
-
We encourage you to set your directory permissions as
tightly as you're able to.
So, the FIRST TIME you log in,
after you do the
cd command,
above,
do this:
chmod 700 .
NOTICE THE PERIOD AT THE END OF THE COMMAND --
IT'S CRUCIAL!!!
This command means:
"Set the permissions on my tape archive directory
so that I can read, write and go into my files and directories,
but nobody else can do anything with my files and directories."
If you also want your files to be
accessible by members of your user group,
you can do:
chmod 750 .
NOTICE THE PERIOD AT THE END OF THE COMMAND --
IT'S CRUCIAL!!!
This command means:
"Set the permissions on my tape archive directory
so that I can read, write and go into my files and directories,
and members of my user group can read and go into
my files and directories,
but nobody else can do anything with my files and directories."
-
You can get a list of Unix-like commands that can be used
within sftp by typing
help
at the sftp prompt.
-
Note that when you want to get files from the tape archive,
there may be a delay, of anywhere from several minutes on up,
before your files become available,
because the tape that contains your files
will have to be
selected by the tape robot,
placed into
one of the tape drives,
and then
fast forwarded or rewound to the appropriate place in the tape.
(Each tape is 400 GB, so this can take a while.)
So, please be patient when using the tape archive.
Also, if there are many people trying to access tapes
at the same time,
this will cause even longer delays.
-
If you had a
/data/yourusername
directory on
topdawg.oscer.ou.edu
(if you don't know whether you had one,
then you didn't have one),
then your files that were on
/data/yourusername
have already been moved to the tape archive.
-
If you have files parked on
/scratch/yourusername,
because you couldn't find
a suitable place to park them elsewhere,
then please move those files to the tape archive.
-
When storing files to the tape archive, and especially when
retrieving the files from the tape archive later,
it's much faster to store a few large files
than many small files.
You can accomplish this by creating one or a few "tar" files.
How to create a tar file is explained below.
Suppose that you have many files that together consume a lot of
disk, but that many of these files are individually quite small.
For example, suppose that the files had a total size of
10 GB, but most of the files were about 1 MB each.
That would mean that you had approximately 10,000 files.
You now have two choices:
-
Save each of the individual small files
to the tape archive.
-
Create a tar file consisting of all of the files
(or a big subset of them),
and save the tar file to the tape archive.
(If you aren't familiar with tar files,
they're like zip files in Windows:
one file can contain many smaller files,
and even a directory structure, inside it;
more below).
When you save a file to the tape archive, here's what happens:
-
When you do the sftp,
your file(s) actually write to a disk
that acts as a cache for the tape archive.
-
At some point after you save your file(s)
to that disk cache,
the tape archive software
automatically copies your file(s) out to tape,
and then erases your file(s) from the disk cache.
Therefore,
the process of saving a file to the tape archive
typically is quite fast,
because disk is much faster than tape.
When you want to retrieve a file from the tape archive,
here's what happens:
-
The tape archive determines
which tape cartridge contains
the file that you want to retrieve,
and which storage slot
that tape cartridge is stored in.
-
If all of our tape drives are busy,
then the tape archive waits patiently
for your tape cartridge's turn.
-
The tape archive robot pulls that tape cartridge
out of its storage slot.
-
The robot carries the tape cartridge
to the tape drive that's ready for your tape cartridge.
-
The robot inserts the tape cartridge into that tape drive.
-
The tape drive winds the tape cartridge
to the place where your file is stored.
-
The tape drive reads your file
and copies it to the disk cache.
-
The tape drive ejects the tape cartridge.
-
The robot removes the tape cartridge from the tape drive.
-
The robot carries the tape cartridge
back to its storage slot.
As you can imagine, this can take a lot of time.
Now,
suppose that you have one big tar file containing
all of your 10,000 little files.
Then this procedure will have to be performed only once.
On the other hand,
suppose that you've saved
each of the 10,000 files individually to the tape archive.
Then this procedure will have to be performed
as many as 10,000 times,
because there's no guarantee that
the individual files are all stored on the same tape.
That's bad.
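If you aren't sure whether one of your directories falls into the "many small files" case, a quick count can help you decide.
The following is only a sketch: it builds a small throwaway directory (the names dir and file1.dat through file5.dat are made up for illustration) and then counts the files and measures the total disk usage.
In practice, you'd set dir to your own directory instead of creating sample files.

```shell
#!/bin/sh
# Hypothetical demo directory with a few sample files.
# In real use, set dir to your own directory and skip the loop.
dir=$(mktemp -d)
for i in 1 2 3 4 5; do
    printf 'sample data %s\n' "$i" > "$dir/file$i.dat"
done

# Count the regular files under the directory (subdirectories included).
file_count=$(find "$dir" -type f | wc -l | tr -d ' ')

# Total disk usage of the directory, in kilobytes.
total_kb=$(du -sk "$dir" | cut -f1)

echo "$file_count files, $total_kb KB total"
```

A large file count combined with a small average file size suggests that bundling the directory into a tar file first is the better choice.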
HOW TO CREATE A TAR FILE
Here's how to create a tar file:
tar zcvf DirectoryName_Date.tgz DirectoryName
For example, suppose that,
in my home directory
/home/yourusername,
I have a subdirectory named
TestSymposium2004,
and suppose that I want to create a tar file of
TestSymposium2004,
on Jan 9 2008,
and have that tar file reside in my scratch directory,
/scratch/yourusername
(which is a great idea).
Then the tar command would be:
cd /home/yourusername
tar zcvf /scratch/yourusername/TestSymposium2004_20080109.tgz TestSymposium2004
Note that
"z" means "gzip"
(that is, compress to a smaller size, much like zipping),
"c" means "create,"
"v" means "verbose"
(tell me what's happening as it happens),
and
"f" means
"the next thing on this command line is
the name of the tar file."
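Here's a small sketch you can try anywhere, without touching your real files.
It assumes a throwaway work area made with mktemp (the directory layout and the sample file names are invented for the demonstration), builds the date part of the file name with the date command, and then uses "t" (for "list") in place of "c" to check what ended up inside the tar file.

```shell
#!/bin/sh
# Throwaway work area standing in for your home and scratch directories
# (hypothetical layout, for demonstration only).
work=$(mktemp -d)
mkdir -p "$work/home/TestSymposium2004" "$work/scratch"
echo "talk abstract" > "$work/home/TestSymposium2004/abstract.txt"
echo "speaker list"  > "$work/home/TestSymposium2004/speakers.txt"

# Name the tar file DirectoryName_Date.tgz, with today's date as YYYYMMDD.
tarfile="$work/scratch/TestSymposium2004_$(date +%Y%m%d).tgz"

# Create the compressed tar file from inside the parent directory,
# so that the stored paths start with TestSymposium2004/.
cd "$work/home"
tar zcvf "$tarfile" TestSymposium2004

# "t" lists the contents of the tar file without extracting anything.
listing=$(tar ztf "$tarfile")
echo "$listing"
```

Listing the contents before you upload is a cheap way to confirm that the tar file holds what you expect.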
Then,
the only file
that I would want to save to the tape archive would be:
/scratch/yourusername/TestSymposium2004_20080109.tgz
For example, the commands I'd use would be:
cd /scratch/yourusername
sftp archive
cd yourusername
put TestSymposium2004_20080109.tgz
quit
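If you store files often, the same sftp commands can be kept in a batch file instead of being typed each time.
The sketch below only writes and prints such a batch file; it does not contact the tape archive.
It assumes that your sftp client is OpenSSH (whose -b option reads commands from a file, and whose command to end a session is quit), and that the archive accepts a login without a typed password, which batch mode needs. Neither assumption is confirmed for the OSCER tape archive, so treat this as something to try, not as the supported method.

```shell
#!/bin/sh
# Write the sftp commands for one upload into a temporary batch file.
# yourusername is a placeholder; replace it with your user name.
batchfile=$(mktemp)
cat > "$batchfile" <<'EOF'
cd yourusername
put TestSymposium2004_20080109.tgz
quit
EOF

# Show what would be sent to sftp.
cat "$batchfile"

# To actually run the transfer (ASSUMPTION: OpenSSH sftp, and a login
# to the archive that needs no typed password):
#   cd /scratch/yourusername
#   sftp -b "$batchfile" archive
```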
And, once I'd saved the tar file to the tape archive,
I could delete the tar file from
/scratch/yourusername:
rm /scratch/yourusername/TestSymposium2004_20080109.tgz
And,
when I wanted the contents of
TestSymposium2004_20080109.tgz,
then I'd retrieve it from the tape archive,
probably into my scratch directory
/scratch/yourusername,
and then extract the individual files from the tar file.
Here's how I'd retrieve the tar file:
cd /scratch/yourusername
sftp archive
cd yourusername
get TestSymposium2004_20080109.tgz
quit
And here's how I'd extract the contents,
creating a subdirectory under
/scratch/yourusername
named
/scratch/yourusername/TestSymposium2004:
cd /scratch/yourusername
tar zxvf /scratch/yourusername/TestSymposium2004_20080109.tgz
Notice that the only differences
between this tar file extraction command
and the tar file creation command (above) are:
-
"x" (for "extract")
replaces
"c" (for "create"),
and
-
we don't bother to say the name of the directory
that's stored inside the tar file,
because that name itself is stored inside the tar file.
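To see creation and extraction working together, here's a sketch of a full round trip in a throwaway work area.
The directory names original and restored, and the sample files, are made up for the demonstration; no tape archive is involved.
It creates a tar file, extracts it somewhere else, and uses diff -r to confirm that the extracted copy matches the original, directory structure included.
The "v" option is left out here only to keep the output short.

```shell
#!/bin/sh
# Throwaway work area (hypothetical layout, for demonstration only).
work=$(mktemp -d)
mkdir -p "$work/original/TestSymposium2004/slides" "$work/restored"
echo "talk abstract" > "$work/original/TestSymposium2004/abstract.txt"
echo "slide 1"       > "$work/original/TestSymposium2004/slides/slide1.txt"

tarfile="$work/TestSymposium2004_20080109.tgz"

# Create: "c" plus the name of the directory to store.
(cd "$work/original" && tar zcf "$tarfile" TestSymposium2004)

# Extract: "x" and no directory name, because the name is in the tar file.
(cd "$work/restored" && tar zxf "$tarfile")

# Compare the original and extracted directories recursively.
if diff -r "$work/original/TestSymposium2004" "$work/restored/TestSymposium2004"; then
    roundtrip=ok
else
    roundtrip=mismatch
fi
echo "round trip: $roundtrip"
```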