Without any additional knowledge, you might try the code in openunique_1.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>     /* write(), close() */
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>

int main(int argc, char **argv)
{
  struct stat buf;
  int fd;
  char *filename;
  char *string;

  if (argc != 2) {
    fprintf(stderr, "usage: openunique_1 filename\n");
    exit(1);
  }
  filename = argv[1];
  string = "Example string\n";

  /* stat() fails when the file does not exist; only then do we create it. */
  if (stat(filename, &buf) != 0) {
    fd = open(filename, O_WRONLY | O_CREAT, 0);   /* mode 0: no permission bits */
    if (fd < 0) {
      perror(filename);
      exit(1);
    }
    write(fd, string, strlen(string));
    close(fd);
  } else {
    printf("%s already exists -- not opening\n", filename);
  }
  exit(0);
}
This code uses stat() to test whether the given file exists. If it does, then the program says so and exits. Otherwise, it creates the file and writes the string "Example string" to it. Let's run it a few times:
UNIX> ls f1.txt
ls: f1.txt: No such file or directory
UNIX> openunique_1 f1.txt
UNIX> ls -l f1.txt
---------- 1 plank plank 15 Feb 12 11:44 f1.txt
UNIX> cat f1.txt
cat: f1.txt: Permission denied
UNIX> chmod 0644 f1.txt
UNIX> ls -l f1.txt
-rw-r--r-- 1 plank plank 15 Feb 12 11:44 f1.txt
UNIX> cat f1.txt
Example string
UNIX> openunique_1 f1.txt
f1.txt already exists -- not opening
UNIX> rm f1.txt
UNIX> openunique_1 f1.txt
UNIX> ls -l f1.txt
---------- 1 plank plank 15 Feb 12 11:45 f1.txt
UNIX> rm -f f1.txt
UNIX>
The first time we run it, it creates f1.txt. Take a look at the "mode" parameter of the open() call. It is zero. That means that the file will be created so that no one can read, write or execute it. However, the openunique_1 program can write to it, because it was opened for writing. Any subsequent open() call will fail. When we do a long listing on it, we see all those dashes, meaning that we don't have permission to do anything with the file. This is why cat fails on it. We can do the chmod command to give us permissions and then we can cat the file. A second call to openunique_1 fails because the file exists. After we remove it, openunique_1 succeeds.
Now, let's test how well this works when two processes are competing to open the same file. Take a look at openunique_2.c:
int main(int argc, char **argv)
{
  struct stat buf;
  int fd;
  char *filename;
  int iterations, i;
  int successful, unsuccessful, erroneous;

  if (argc != 3) {
    fprintf(stderr, "usage: openunique_2 filename iterations\n");
    exit(1);
  }
  filename = argv[1];
  iterations = atoi(argv[2]);
  successful = 0;
  unsuccessful = 0;
  erroneous = 0;

  for (i = 0; i < iterations; i++) {
    if (stat(argv[1], &buf) != 0) {
      fd = open(filename, O_WRONLY | O_CREAT, 0);
      if (fd < 0) {
        erroneous++;
      } else {
        close(fd);
        remove(filename);
        successful++;
      }
    } else {
      unsuccessful++;
    }
  }
  printf(" Successful: %5d\n", successful);
  printf("Unsuccessful: %5d\n", unsuccessful);
  printf(" Erroneous: %5d\n", erroneous);
  exit(0);
}
This program iterates trying to create a file if it doesn't exist, and it keeps track of three results: it is successful if the file doesn't exist and it is created successfully. In this case, the open file is closed, and the remove() system call removes it. It is unsuccessful if the file already exists. It is erroneous if the stat() call said that the file didn't exist, but when we try to open it, we can't. Why would that happen? It would happen if someone creates the file with no permissions between the stat() and the open() call. Can that really happen? Take a look:
UNIX> ls f1.txt
ls: cannot access f1.txt: No such file or directory
UNIX> openunique_2 f1.txt 10000
 Successful: 10000
Unsuccessful:     0
 Erroneous:     0
UNIX> touch f1.txt
UNIX> openunique_2 f1.txt 10000
 Successful:     0
Unsuccessful: 10000
 Erroneous:     0
UNIX> rm f1.txt
UNIX> openunique_2 f1.txt 10000 & ; openunique_2 f1.txt 10000
[1] 28140
 Successful:  1511
Unsuccessful:  6881
 Erroneous:  1608
UNIX>  Successful:  2440
Unsuccessful:  6079
 Erroneous:  1481
[1] Done openunique_2 f1.txt 10000
UNIX>
The first openunique_2 runs successfully, because f1.txt doesn't exist. The second openunique_2 runs unsuccessfully, because f1.txt does exist. In the third call, we run two openunique_2's simultaneously by putting the first in the background with the ampersand. The output is cluttered because the second openunique_2 finished before the first one and gave us the UNIX prompt back. I typed <RETURN> to get my prompt back. What you see there is that both processes have a significant number of erroneous open calls -- the file was changed between the stat() and open() calls!
If you read the man page for open() ("man -s 2 open"), you'll see the following flag:
O_EXCL    error if create and file exists

If O_EXCL is set with O_CREAT and the file already exists, open()
returns an error. This may be used to implement a simple exclusive
access locking mechanism.
Let's test this out in openunique_3.c:
int main(int argc, char **argv)
{
  int fd;
  char *filename;
  char *string;

  if (argc != 2) {
    fprintf(stderr, "usage: openunique_3 filename\n");
    exit(1);
  }
  filename = argv[1];
  string = "Example string\n";

  /* O_EXCL makes "create only if absent" a single atomic step. */
  fd = open(filename, O_WRONLY | O_CREAT | O_EXCL, 0);
  if (fd < 0) {
    perror(filename);
    exit(1);
  }
  write(fd, string, strlen(string));
  close(fd);
  exit(0);
}
And try some calls:
UNIX> ls f1.txt
ls: cannot access f1.txt: No such file or directory
UNIX> openunique_3 f1.txt
UNIX> ls -l f1.txt
---------- 1 plank loci 15 2010-02-12 12:05 f1.txt
UNIX> openunique_3 f1.txt
f1.txt: File exists
UNIX> rm -f f1.txt
UNIX> openunique_3 f1.txt
UNIX> ls -l f1.txt
---------- 1 plank loci 15 2010-02-12 12:05 f1.txt
UNIX>
The flag works as advertised. We can chase down the proper errno by looking at /usr/include/errno.h, and finding the include file that contains the string "File exists". On my Macintosh, that's /usr/include/sys/errno.h. On my office machine, you have to chase down a bunch of include statements to find that it's in /usr/include/asm-generic/errno-base.h:
UNIX> grep 'File exists' /usr/include/asm-generic/errno-base.h
#define EEXIST 17 /* File exists */
UNIX>
Armed with that knowledge, we write openunique_4.c, which is like openunique_2.c, except it uses the O_EXCL flag instead of stat():
int main(int argc, char **argv)
{
  int fd;
  char *filename;
  int iterations, i;
  int successful, unsuccessful, erroneous;

  if (argc != 3) {
    fprintf(stderr, "usage: openunique_4 filename iterations\n");
    exit(1);
  }
  filename = argv[1];
  iterations = atoi(argv[2]);
  successful = 0;
  unsuccessful = 0;
  erroneous = 0;

  for (i = 0; i < iterations; i++) {
    fd = open(filename, O_WRONLY | O_CREAT | O_EXCL, 0);
    if (fd < 0 && errno == EEXIST) {
      unsuccessful++;              /* someone else has the file right now */
    } else if (fd < 0) {
      perror(filename);            /* any other error is unexpected */
      exit(1);
      erroneous++;                 /* never reached: exit() above ends the program */
    } else {
      close(fd);
      remove(filename);
      successful++;
    }
  }
  printf(" Successful: %5d\n", successful);
  printf("Unsuccessful: %5d\n", unsuccessful);
  printf(" Erroneous: %5d\n", erroneous);
  exit(0);
}
Running two of these simultaneously now works -- the file is never opened erroneously:
UNIX> rm -f f1.txt
UNIX> openunique_4 f1.txt 10000
 Successful: 10000
Unsuccessful:     0
 Erroneous:     0
UNIX> touch f1.txt
UNIX> openunique_4 f1.txt 10000
 Successful:     0
Unsuccessful: 10000
 Erroneous:     0
UNIX> rm -f f1.txt
UNIX> openunique_4 f1.txt 10000 & ; openunique_4 f1.txt 10000
[1] 28790
 Successful:  3895
Unsuccessful:  6105
 Erroneous:     0
 Successful:  3764
Unsuccessful:  6236
 Erroneous:     0
[1] + Done openunique_4 f1.txt 10000
UNIX>
Oddly, this fails on my Macintosh, with the open() call returning -1 and setting errno to ENOENT: "No such file or directory." I can only conclude that this is a bug in the Mac's operating system, and that if you get this, you should retry it. Welcome to the ills of systems programming!
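The man page's remark that O_EXCL "may be used to implement a simple exclusive access locking mechanism" can be sketched as follows. This is only an illustration, not part of the lecture's programs: the function names try_lock() and release_lock() are made up, and the lock file name is whatever the caller chooses. The sketch also ignores the problem of a stale lock file left behind by a process that dies while holding the lock.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to take the lock by creating lockfile exclusively.
   Returns 1 if we now hold the lock, 0 if someone else holds it,
   and -1 on any other error (errno says which). */
int try_lock(const char *lockfile)
{
  int fd = open(lockfile, O_WRONLY | O_CREAT | O_EXCL, 0644);

  if (fd >= 0) {
    close(fd);          /* the existence of the file is the lock */
    return 1;
  }
  return (errno == EEXIST) ? 0 : -1;
}

/* Hypothetical helper: give the lock back by removing the lock file.
   Returns 0 on success, -1 on error. */
int release_lock(const char *lockfile)
{
  return unlink(lockfile);
}
```

A caller would loop on try_lock() (perhaps sleeping between attempts), do its work once it returns 1, and then call release_lock(). Because the existence test and the creation happen inside one open() call, two processes cannot both see a return value of 1 for the same lock file.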
umask() sets the process's file creation mask to mask and
returns the previous value of the mask. The low-order 9
bits of mask are used whenever a file is created, clearing
corresponding bits in the file access permissions. (see
stat(2V)). This clearing restricts the default access to a
file.
The mask is inherited by child processes.
When you call umask from a program, or from the shell, it changes the "file creation mask". This mask consists of 9 bits. Whenever a file is created, for example by open(), creat(), or mkdir(), and a mode m is specified, then the file is created with the mode:
(m & ~umask)
The umask() system call returns the old umask value.
For example, look at the following program (um1.c):
int main()
{
  int fd;
  int old_mask;

  old_mask = umask(0);
  printf("The old mask was 0%o\n", old_mask);

  fd = open("f1", O_WRONLY | O_CREAT | O_TRUNC, 0666);
  close(fd);
  printf("created f1: 0666\n");

  fd = open("f2", O_WRONLY | O_CREAT | O_TRUNC, 0200);
  close(fd);
  printf("created f2: 0200\n");

  umask(022);

  fd = open("f3", O_WRONLY | O_CREAT | O_TRUNC, 0666);
  close(fd);
  printf("created f3: 0%o\n", 0666 & ~022 & 0777);

  fd = open("f4", O_WRONLY | O_CREAT | O_TRUNC, 0777);
  close(fd);
  printf("created f4: 0%o\n", 0777 & ~022 & 0777);

  fd = open("f5", O_WRONLY | O_CREAT | O_TRUNC, 0200);
  close(fd);
  printf("created f5: 0%o\n", 0200 & ~022 & 0777);
}
When we execute um1 and list the five files created, we see the following:
UNIX> um1
The old mask was 022
created f1: 0666
created f2: 0200
created f3: 0644
created f4: 0755
created f5: 0200
UNIX> ls -l f?
-rw-rw-rw- 1 plank plank 0 Feb 12 12:27 f1
--w------- 1 plank plank 0 Feb 12 12:27 f2
-rw-r--r-- 1 plank plank 0 Feb 12 12:27 f3
-rwxr-xr-x 1 plank plank 0 Feb 12 12:27 f4
--w------- 1 plank plank 0 Feb 12 12:27 f5
UNIX>
The first two open() calls created the file with the exact mode specified. The other three had the modes modified by the umask, which was set to 022. In terms of bits, 022 is equal to (000 010 010). Thus, ~022 is equal to (111 101 101). The third open() call specifies a mode of 0666, which is (110 110 110), and if we take the binary AND of the two numbers:
111 101 101
110 110 110
-----------
110 100 100
We get 0644, which corresponds to "-rw-r--r--". Similarly, opening the file with 0777 gets you 0755.
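The bit arithmetic above is easy to check in code. The following is a small sketch of the (m & ~umask) rule; masked_mode() is a made-up name for illustration, and it only computes the result rather than creating any files.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: the permission bits a new file ends up with when it
   is created with mode "requested" while the umask is "mask". */
mode_t masked_mode(mode_t requested, mode_t mask)
{
  return requested & ~mask & 0777;   /* keep only the 9 permission bits */
}
```

With a umask of 022, masked_mode(0666, 022) is 0644 and masked_mode(0777, 022) is 0755, matching f3 and f4 in the listing. A mode of 0200 has none of the masked bits set, so it comes through unchanged, as f5 shows.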
The umask is convenient because it allows the user to get his or her default protection mode. Programs typically call open() with modes of 0666 or 0777 (if the file is executable), and the umask handles the user's preference as to who can see the file. I typically set my umask to 022, which lets others read and execute my files, but not write them. Some people like to set it to 077, which doesn't let others do anything with the files.
The umask value is set per process, not per user. So, if your shell's umask is 022, and you have a program set it to 0, then that does not affect the shell:
UNIX> cat um2.c
main()
{
umask(0);
}
UNIX> umask
22
UNIX> um2
UNIX> umask
22
UNIX>
Read section 4.9 of the book for a more thorough description, including use of the mode bits like S_IRUSR, etc.
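As a quick illustration of those symbolic mode bits, here is a sketch that spells 0644 with the S_I* constants from <sys/stat.h>. The macro name MODE_RW_R_R and the function create_rw_r_r() are invented for this example.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* rw-r--r-- written with symbolic bits; the same value as octal 0644. */
#define MODE_RW_R_R (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)

/* Hypothetical helper: create (or truncate) path, requesting rw-r--r--.
   The umask may still clear some of these bits.
   Returns the open file descriptor, or -1 on error. */
int create_rw_r_r(const char *path)
{
  return open(path, O_WRONLY | O_CREAT | O_TRUNC, MODE_RW_R_R);
}
```

The symbolic form is longer than the octal constant, but it says which class of user gets which permission without making the reader decode the digits.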
The program o1.c has an example that you may want to go over:
int main()
{
  struct stat buf;
  int fd;

  /* O_CREAT requires a mode argument; 0644 is used here. */
  fd = open("f1.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
  sleep(1);
  printf("Doing chmod\n");
  chmod("f1.txt", 0000);
  sleep(1);
  printf("Doing write\n");
  write(fd, "Hi\n", 3);
}
The program opens f1.txt for writing, and after a second, it calls chmod to set the permissions to nothing. One second later, it writes to the open file. Should this write succeed or not? Let's see:
UNIX> rm -f f1.txt
UNIX> o1
Doing chmod
Doing write
UNIX> ls -l f1.txt
---------- 1 plank plank 3 Feb 12 13:36 f1.txt
UNIX> chmod 0644 f1.txt
UNIX> cat f1.txt
Hi
UNIX> chmod 0644 f1.txt
UNIX> echo "New Text" > f1.txt
UNIX> cat f1.txt
New Text
UNIX> chmod 0 f1.txt
UNIX> o1
Doing chmod
Doing write
UNIX> ls -l f1.txt
---------- 1 plank plank 9 Feb 12 13:38 f1.txt
UNIX> chmod 0644 f1.txt
UNIX> cat f1.txt
New Text
UNIX>
The answer is that the program can write to the file even after the chmod call. This is similar to openunique_1.c, which creates a file with no permissions set, yet can still write to it. Permissions are checked when the file is opened, not on each write(). If the open call is successful, then the operating system won't change the open file from under you -- the write is successful.
Now, what happens in the second call to o1? Here, the open() call fails because the file exists and has no permissions set. The chmod() call works, but the write() call fails because fd will be -1. For that reason, f1.txt is unmodified by running o1.
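The first run of o1 can be condensed into one function that checks each return value instead of sleeping and printing. This is a sketch only: write_after_chmod() is a made-up name, and the creation mode of 0644 is an arbitrary choice for the example.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open path for writing, strip all of its permission
   bits, and then write 3 bytes through the descriptor we already hold.
   Returns the number of bytes written, or -1 if any step fails. */
ssize_t write_after_chmod(const char *path)
{
  ssize_t n;
  int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

  if (fd < 0) return -1;             /* e.g. the file exists with mode 0 */
  if (chmod(path, 0) != 0) {
    close(fd);
    return -1;
  }
  n = write(fd, "Hi\n", 3);          /* allowed: access was checked at open() */
  close(fd);
  return n;
}
```

When the file does not exist beforehand, the function returns 3 and leaves behind a 3-byte file with mode 0, like the first o1 run. Calling it a second time as an ordinary user should fail at the open() and return -1, like the second o1 run, except that here the failure is detected rather than passed on to write().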
UNIX> ln f1 f2
f2 has to be a file -- it cannot be a directory.
unlink(char *f1) works like:
UNIX> rm f1
remove(char *f1) works like unlink, but it also works for (empty) directories. unlink fails on directories. Take a look at o2.c:
int main()
{
  int fd;
  char s[11];
  int i;

  strcpy(s, "Fun Fun\n");
  fd = open("f1.txt", O_RDONLY);
  sleep(1);
  printf("Removing f1.txt\n");
  remove("f1.txt");
  sleep(1);
  system("ls -l f1.txt");
  i = read(fd, s, 10);
  if (i >= 0) s[i] = '\0';   /* when read() fails, i is -1: leave s alone */
  printf("Read returned %d: %d %s\n", i, fd, s);
}
This program opens f1.txt for reading, sleeps a second, and then removes f1.txt. It sleeps again, performs a long listing and then tries to read 10 bytes from the open file. The question is -- what happens when we remove f1.txt? Will the read call succeed, or fail because the file is gone?
UNIX> cat > f1.txt
Tramp - You can call me that.
UNIX> cat f1.txt
Tramp - You can call me that.
UNIX> o2
Removing f1.txt
ls: f1.txt: No such file or directory
Read returned 10: 3 Tramp - Yo
UNIX>
The ls command shows that f1.txt is indeed gone after the remove() call. However, the operating system does not delete the file until the last file descriptor to it is closed. For that reason, the read() call succeeds.
Try o2 when f1.txt doesn't exist:
UNIX> rm -f f1.txt
UNIX> o2
Removing f1.txt
ls: f1.txt: No such file or directory
Read returned -1: -1 Fun Fun
UNIX>
What happened? First, the open() call failed and returned -1. Thus, the read() call also failed and returned -1. Since the read call failed, the bytes of s were never overwritten - thus when we printed them out, we got "Fun Fun." Make sure you understand this code and its output. It is deterministic -- we are not getting segmentation violations or random behavior with these calls -- we are simply getting well-defined errors in our system calls.
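Both o2 runs can be captured in a single function that reports errors through its return value. This is a sketch under the same assumptions as o2.c; read_after_unlink() is a made-up name, and the caller is assumed to pass a buffer of at least one byte.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open path, remove its name, and then read from the
   still-open descriptor. Stores a NUL-terminated string in buf on success.
   Returns the number of bytes read, or -1 if any step fails. */
ssize_t read_after_unlink(const char *path, char *buf, size_t bufsize)
{
  ssize_t n;
  int fd = open(path, O_RDONLY);

  if (fd < 0) return -1;               /* the second o2 run: no such file */
  if (unlink(path) != 0) {
    close(fd);
    return -1;
  }
  /* The name is gone, but the data stays until the last descriptor closes. */
  n = read(fd, buf, bufsize - 1);
  if (n >= 0) buf[n] = '\0';
  close(fd);
  return n;
}
```

When the file exists, the function returns the byte count even though the file can no longer be found by name. When it does not exist, the function returns -1 and buf is untouched, which is the "Fun Fun" case above.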
rename(char *f1, char *f2) works just like:
UNIX> mv f1 f2
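A minimal sketch of calling rename() from C follows; move_file() is a made-up wrapper name. One difference from mv worth knowing: rename() only changes names within one file system, so it fails with errno set to EXDEV when the two names are on different file systems.

```c
#include <stdio.h>    /* rename(), perror() */
#include <unistd.h>   /* access(), for checking the result */

/* Hypothetical wrapper: give the file named "from" the new name "to".
   If "to" already exists as a file, it is replaced.
   Returns 0 on success, -1 on error. */
int move_file(const char *from, const char *to)
{
  if (rename(from, to) != 0) {
    perror("rename");
    return -1;
  }
  return 0;
}
```

After a successful call, access(from, F_OK) fails and access(to, F_OK) succeeds, which is a quick way to confirm that the old name is gone and the new one exists.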
UNIX> mkdir ...
UNIX> rmdir ...
Read chapter 4.20 or the man pages.
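For reference, here is a small sketch of the mkdir() and rmdir() system calls that correspond to those shell commands. The function name make_then_remove_dir() and the mode 0755 are choices made for this example only.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create directory path, confirm with stat() that it
   is a directory, and then remove it again.
   Returns 0 if every step worked, -1 otherwise. */
int make_then_remove_dir(const char *path)
{
  struct stat sb;

  if (mkdir(path, 0755) != 0) return -1;    /* the mode is masked by the umask */
  if (stat(path, &sb) != 0 || !S_ISDIR(sb.st_mode)) return -1;
  return rmdir(path);                       /* works only on an empty directory */
}
```

Like open(), mkdir() takes a mode that is filtered through the umask, and like the exclusive open() shown earlier, it fails with EEXIST if the name is already taken.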
UNIX> cd ...
UNIX> pwd
Read chapter 4.22 or the man pages. You will not need these for jtar, but can use them if you'd like.
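Assuming the calls meant here are chdir() and getcwd(), which are the usual C counterparts of cd and pwd, a minimal sketch looks like this. The function name change_and_report() is invented for the example.

```c
#include <unistd.h>

/* Hypothetical helper: change the process's working directory to dir and
   store the resulting absolute path in buf (of the given size).
   Returns 0 on success, -1 on error. */
int change_and_report(const char *dir, char *buf, size_t size)
{
  if (chdir(dir) != 0) return -1;
  return (getcwd(buf, size) != NULL) ? 0 : -1;
}
```

As with the umask, the working directory belongs to the process: a program that calls chdir() does not change the directory of the shell that started it.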