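When you create a file in Unix, quite a few things happen. In this lecture, we are going to focus on three components of a file in Unix: the bytes of the file itself, the metadata of the file (which is stored in its inode), and the links that give the file its names.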
UNIX> echo "This is f1.txt" > f1.txt
UNIX> ls -lai
total 20
497584377 drwxr-xr-x  2 plank guest   38 Feb  3 13:54 .
430430681 drwxr-xr-x 51 plank guest 4096 Feb  3  2014 ..
497584369 -rw-r--r--  1 plank loci    15 Feb  3 13:54 f1.txt
497584451 -rw-r--r--  1 plank guest 9896 Feb  3 13:44 lecture.html
UNIX>

We created a file called f1.txt, and that places three things on disk: a disk block that holds the 15 bytes of the file, an inode that holds the file's metadata, and an entry in the current directory that maps the name "f1.txt" to that inode.
You can see the inode for f1.txt (497584369), and how it points to block 0x4ab, which contains the bytes of the file (you don't have access to this information unless you are the operating system -- I just made up the number 0x4ab for the sake of the example). I have included the information in the directory, which is itself a file on disk, and the inode for that file. See how everything links together?
Also, you will note that I haven't put a null character at the end of the string in the disk block. That's because there is no null character -- that is only there when you are using a string inside a C program. When you write it to disk, there is no null character.
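If you want to check the claim about the null character yourself, here is a minimal sketch. It is an illustration rather than part of the original examples: it assumes that your system has mktemp, and it works in a scratch directory so that nothing in your current directory is touched. It writes the same string as above, then counts the total bytes and the null bytes in the file.

```shell
#!/bin/sh
# Work in a throwaway directory (assumes mktemp is available).
scratch=$(mktemp -d)

# Same string as in the example: 14 characters plus the newline that echo adds.
echo "This is f1.txt" > "$scratch/f1.txt"

# Total bytes in the file. The arithmetic expansion strips any padding
# that wc puts in front of the number.
nbytes=$(( $(wc -c < "$scratch/f1.txt") ))

# Delete every byte except the null byte, then count what is left.
nulls=$(( $(tr -cd '\000' < "$scratch/f1.txt" | wc -c) ))

echo "bytes on disk: $nbytes, null bytes: $nulls"

rm -rf "$scratch"
```

This prints "bytes on disk: 15, null bytes: 0", which matches the size of 15 that ls reported for f1.txt.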
When you give the -i flag to ls, it will tell you the inode number, as in the example above.
To use Unix lingo, the way we name a file is by attaching a "link" to the inode. Links are stored in "directories" -- each entry in a directory maps the name of the link to the inode number of the inode that points to the file. Again, you can see that in the example above.
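To make the name-to-inode mapping concrete, the sketch below lists a directory as pairs of inode number and name, and then looks up one name in that listing. The scratch directory and the variable names are my own assumptions for the sake of the illustration; the only commands used are ls, awk and mktemp.

```shell
#!/bin/sh
# Work in a throwaway directory (assumes mktemp is available).
scratch=$(mktemp -d)
echo "This is f1.txt" > "$scratch/f1.txt"

# With -a and -i, ls prints one "inode name" pair per directory entry,
# including the entries . and .. ; -1 forces one entry per line.
entries=$(ls -1ai "$scratch")
echo "$entries"

# Look up the inode number that the directory records for the name f1.txt.
from_dir=$(echo "$entries" | awk '$2 == "f1.txt" { print $1 }')

# Ask ls about the file directly. The number should be the same.
direct=$(ls -i "$scratch/f1.txt" | awk '{ print $1 }')

echo "directory entry says $from_dir, ls -i says $direct"

rm -rf "$scratch"
```

The inode numbers will differ from machine to machine, but the two numbers on the last line should always agree, because both come from the same directory entry.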
We can have more than one link point to a file. Suppose we are in a fresh directory, and we have created the file f1 to contain the bytes "This is f1\n". Moreover, suppose that file has an inode number of 34778. And now we do the following:
UNIX> ln f1 f2

This says to create another link to the file f1, and call it "f2". That link is really an entry in the directory that maps "f2" to inode 34778. What we have now are two pointers to the same metadata and the same bytes on disk. When we do a listing:
UNIX> ls -li f1 f2
34778 -rw-r--r-- 2 plank 11 Sep 16 10:12 f1
34778 -rw-r--r-- 2 plank 11 Sep 16 10:12 f2
UNIX> cat f1
This is f1
UNIX> cat f2
This is f1
UNIX>

We see that the files are exactly the same, except that the links have different names. If we change either of these files (for example, by editing f2 with vi and changing the word "This" to "That"), then the change is seen in both f1 and f2, because they both point to the same bytes on disk:
UNIX> vi f2
...
UNIX> cat f2
That is f1
UNIX> cat f1
That is f1
UNIX> ls -li f1 f2
34778 -rw-r--r-- 2 plank 11 Sep 16 10:14 f1
34778 -rw-r--r-- 2 plank 11 Sep 16 10:14 f2
UNIX>

Note that even though we only modified f2, the file modification time for f1 has changed as well. That is because the file modification time is stored as part of the inode. Thus, when a write through f2 updates the modification time, the new time shows up under f1 as well. The same goes for file protection modes. If we change the protection for f1, then we will see the changes in f2:
UNIX> chmod 0400 f1
UNIX> ls -li f1 f2
34778 -r-------- 2 plank 11 Sep 16 10:14 f1
34778 -r-------- 2 plank 11 Sep 16 10:14 f2
UNIX>

Note the third column of the ls command. It is the number of links to the file. If we make another link to f1, then this column will be updated:
UNIX> ln f1 f3
UNIX> ls -li f1 f2 f3
34778 -r-------- 3 plank 11 Sep 16 10:14 f1
34778 -r-------- 3 plank 11 Sep 16 10:14 f2
34778 -r-------- 3 plank 11 Sep 16 10:14 f3

When we use the "rm" command, we are actually removing links. For example:
UNIX> chmod 0644 f1
UNIX> rm f1
UNIX> ls -li f*
34778 -rw-r--r-- 2 plank 11 Sep 16 10:14 f2
34778 -rw-r--r-- 2 plank 11 Sep 16 10:14 f3
UNIX>

When the last link to a file is removed (and no process still has the file open), the file itself, inode and all, is deleted. As long as there is a link pointing to a file, however, the file remains. It is interesting to see what happens when files with links are overwritten. For example, suppose I do the following:
UNIX> cat > f2
This is now file f2
^D
UNIX> cat f2
This is now file f2
UNIX> cat f3
This is now file f2

By saying you want to redirect output to the file f2, you end up changing f3. This means that when the shell performs output redirection, it opens the file and truncates it, instead of removing the file and creating it anew.
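One way to confirm that redirection truncates in place is to compare the inode number before and after the redirect. The following is a minimal sketch under a few assumptions of mine: it runs in a scratch directory made by mktemp, and inode_of is a hypothetical helper name, not a standard command.

```shell
#!/bin/sh
# Hypothetical helper: print the inode number of the file named by $1.
inode_of() { ls -i "$1" | awk '{ print $1 }'; }

# Work in a throwaway directory (assumes mktemp is available).
scratch=$(mktemp -d)
echo "This is f1" > "$scratch/f2"
ln "$scratch/f2" "$scratch/f3"      # f2 and f3 now name the same inode

before=$(inode_of "$scratch/f2")

# Output redirection opens the existing file and truncates it in place.
echo "This is now file f2" > "$scratch/f2"

after=$(inode_of "$scratch/f2")
f3_text=$(cat "$scratch/f3")

echo "inode before: $before, inode after: $after"
echo "f3 now contains: $f3_text"

rm -rf "$scratch"
```

The two inode numbers are equal, and f3 shows the new text, because the redirect never removed the link f2. It only replaced the bytes that the shared inode points to.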
Instead, suppose you do:
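UNIX> gcc -o f2 ../Stat/src/ls1.c
UNIX> ls -li f*
34779 -rwxr-xr-x 1 plank 24576 Sep 16 10:16 f2
34778 -rw-r--r-- 1 plank    20 Sep 16 10:16 f3
UNIX>

You'll note that the C compiler gcc did the equivalent of "rm f2" before creating f2 as an executable: f2 now has a new inode (34779) with a single link, while f3 still has inode 34778 and its original contents, and its link count has dropped to 1.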
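The remove-then-create behavior can be reproduced without a compiler. The sketch below is an illustration under my own assumptions (a scratch directory from mktemp, and the hypothetical helpers inode_of and links_of). It removes the link f2, creates a new file with the same name, and then checks what happened to f3.

```shell
#!/bin/sh
# Hypothetical helpers: inode number and link count of the file named by $1.
inode_of() { ls -i "$1" | awk '{ print $1 }'; }
links_of() { ls -l "$1" | awk '{ print $2 }'; }

# Work in a throwaway directory (assumes mktemp is available).
scratch=$(mktemp -d)
echo "This is now file f2" > "$scratch/f2"
ln "$scratch/f2" "$scratch/f3"      # two links to one inode

# Remove the link f2, then create a brand-new file under the same name.
rm "$scratch/f2"
echo "new contents" > "$scratch/f2"

f2_inode=$(inode_of "$scratch/f2")
f3_inode=$(inode_of "$scratch/f3")
f3_links=$(links_of "$scratch/f3")
f3_text=$(cat "$scratch/f3")

echo "f2 inode: $f2_inode, f3 inode: $f3_inode"
echo "links to f3: $f3_links, f3 contains: $f3_text"

rm -rf "$scratch"
```

Here f2 ends up with a different inode from f3, f3 is down to one link, and f3 still holds the old text. The old inode cannot be reused for the new f2, because f3 still links to it.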
All directories have at least 2 links:
UNIX> mkdir test
UNIX> ls -li | grep test
34800 drwxr-xr-x 2 plank 512 Sep 16 10:17 test
UNIX>

This is because every directory contains two entries, "." and "..". The first is a link to itself, and the second is a link to the parent directory. Thus, there are two links to the directory file "test": "test" and "test/.". Similarly, suppose we make a subdirectory of test:
UNIX> mkdir test/sub
UNIX> ls -li | grep test
34800 drwxr-xr-x 3 plank 512 Sep 16 10:17 test
UNIX>

Now there are three links to "test": "test", "test/.", and "test/sub/..".
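The pattern above (two links, plus one for each subdirectory) can be watched as subdirectories are added. This is a sketch under assumptions of mine: dir_links is a hypothetical helper, the work happens in a scratch directory from mktemp, and the filesystem is assumed to keep directory link counts in the traditional way. Some filesystems do not, so treat the exact numbers as filesystem-dependent.

```shell
#!/bin/sh
# Hypothetical helper: link count of the directory named by $1.
# The -d flag makes ls describe the directory itself, not its contents.
dir_links() { ls -ld "$1" | awk '{ print $2 }'; }

# Work in a throwaway directory (assumes mktemp is available).
scratch=$(mktemp -d)

mkdir "$scratch/test"
empty=$(dir_links "$scratch/test")      # "test" and "test/."

mkdir "$scratch/test/sub"
one_sub=$(dir_links "$scratch/test")    # plus "test/sub/.."

mkdir "$scratch/test/sub2"
two_subs=$(dir_links "$scratch/test")   # plus "test/sub2/.."

echo "links to test: $empty, then $one_sub, then $two_subs"

rm -rf "$scratch"
```

On a filesystem that follows the convention described above, you should see the count go from 2 to 3 to 4.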
Besides these links, which are created automatically for you, you cannot manually create hard links to directories. Instead, there is a special kind of link called a "symbolic link" (also called a "soft link"), which you make using the command "ln -s". For example, we can create a soft link to the test directory as follows:
UNIX> ln -s test test-soft
UNIX> ls -li | grep test
34800 drwxr-xr-x 3 plank 512 Sep 16 10:17 test
34801 lrwxrwxrwx 1 plank   4 Sep 16 10:18 test-soft -> test
UNIX>

Note that soft links have a different kind of directory listing. Moreover, note that the creation of a soft link to "test" doesn't update the link field of test's inode. That only records regular, or "hard" links.
A soft link is a way of pointing to a file without changing the file's inode. However, soft links can do pretty much everything that hard links can do:
UNIX> cat > f1
This is f1
UNIX> ln -s f1 f2
UNIX> cat f2
This is f1
UNIX> cat > f2
This is f2
UNIX> cat f1
This is f2
UNIX> ls -l f*
-rw-r--r-- 1 plank 11 Sep 16 10:19 f1
lrwxrwxrwx 1 plank  2 Sep 16 10:18 f2 -> f1
UNIX> chmod 0600 f2
UNIX> ls -l f*
-rw------- 1 plank 11 Sep 16 10:19 f1
lrwxrwxrwx 1 plank  2 Sep 16 10:18 f2 -> f1
UNIX>

What is the main difference between hard and soft links then? Well, if the file to which the soft link points gets deleted or moved, then the link becomes unusable:
UNIX> rm f1
UNIX> ls -l f*
lrwxrwxrwx 1 plank 2 Sep 16 10:18 f2 -> f1
UNIX> cat f2
cat: f2: No such file or directory
UNIX>

The link is called "unresolved."
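A script can detect an unresolved link by combining two standard tests: -L asks whether a name is a symbolic link, and -e asks whether the thing it points to exists. The sketch below is an illustration only. The helper name is_dangling is hypothetical, the scratch directory comes from mktemp, and readlink is assumed to be installed (it is on most modern systems).

```shell
#!/bin/sh
# Hypothetical helper: succeeds when $1 is a symbolic link whose target is missing.
is_dangling() { [ -L "$1" ] && [ ! -e "$1" ]; }

# Work in a throwaway directory (assumes mktemp is available).
scratch=$(mktemp -d)
echo "This is f1" > "$scratch/f1"

# The target "f1" is stored as text and resolved relative to the link's directory.
ln -s f1 "$scratch/f2"

if is_dangling "$scratch/f2"; then before=dangling; else before=ok; fi

rm "$scratch/f1"

if is_dangling "$scratch/f2"; then after=dangling; else after=ok; fi

# The link itself survives, and it still stores the name of its old target.
target=$(readlink "$scratch/f2")

echo "before rm: $before, after rm: $after, link text: $target"

rm -rf "$scratch"
```

This prints "before rm: ok, after rm: dangling, link text: f1". The soft link is just a stored name, so removing f1 leaves the link in place with nothing behind it.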
A second difference is that hard links cannot cross filesystems. For example, you cannot do:

UNIX> ln /home/jplank/cs360/notes/Links/lecture.html ~/lecture.html

because your home directory is not on the same filesystem as mine. However, you can make a soft link:
UNIX> ln -s /home/jplank/cs360/notes/Links/lecture.html ~/lecture.html
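If a script does not know in advance whether two paths share a filesystem, one option is to try the hard link first and fall back to a soft link when that fails. This is a minimal sketch of that idea, not a recommendation from the lecture. The function name link_or_symlink is hypothetical, and the demonstration runs inside a single scratch directory, where the hard link will normally succeed.

```shell
#!/bin/sh
# Hypothetical helper: try a hard link first, fall back to a symbolic link.
# $1 is the existing file (use an absolute path), $2 is the new name.
# Prints "hard" or "soft" to say which kind of link was made.
link_or_symlink() {
    if ln "$1" "$2" 2>/dev/null; then
        echo hard
    else
        ln -s "$1" "$2" && echo soft
    fi
}

# Work in a throwaway directory (assumes mktemp is available).
scratch=$(mktemp -d)
echo "hello" > "$scratch/orig"

kind=$(link_or_symlink "$scratch/orig" "$scratch/copy")
copy_text=$(cat "$scratch/copy")

echo "made a $kind link, which reads: $copy_text"

rm -rf "$scratch"
```

The helper expects an absolute path for the existing file, as in the ln -s example above, because a soft link stores the name you give it and a relative name would be resolved from the link's own directory.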