CS360 Lecture notes -- Umask and Other System Calls

This is a catch-all lecture, to go over some system calls that I don't go over in other parts of the class. With the exception of umask, they are straightforward, so if I end up being short on lecture time, I will skip them and just refer you here. Many of them will be useful for you when you write jtar.

Umask

Umask is a system call that handles the "file mode creation mask." Here's a man page (from my Mac in 2018):

NAME
     umask -- set file creation mode mask

SYNOPSIS
     #include <sys/stat.h>

     mode_t
     umask(mode_t cmask);

DESCRIPTION
     The umask() routine sets the process's file mode creation mask to cmask and returns
     the previous value of the mask.  The 9 low-order access permission bits of cmask are
     used by system calls, including open(2), mkdir(2), mkfifo(2), and mknod(2) to turn
     off corresponding bits requested in file mode.  (See chmod(2)).  This clearing allows
     each user to restrict the default access to his files.

     The default mask value is S_IWGRP | S_IWOTH (022, write access for the owner only).
     Child processes inherit the mask of the calling process.

RETURN VALUES
     The previous value of the file mode mask is returned by the call.

ERRORS
     The umask() function is always successful.

The "file creation mask" (which I will call the "umask" out of habit) is a nine-bit number. If a bit in the umask is set, then whenever you make a system call that creates a file, that bit in the protection mode will be turned off.

Formally, when you specify a mode when you open a file, the real protection mode will be:

(mode & ~umask)

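Until you get used to "AND-NOT", it can be confusing. If you have a mask m, then (x & ~m) clears every bit of x that is set in m, and leaves the other bits of x alone.

So, the umask "turns off" protection bits. The point of the umask is to allow programs to create files with permissive protection modes (the shell, for example, uses 0666 for output redirection), and to let the user tailor the resulting protection modes to his or her liking with the umask.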
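
To make the AND-NOT arithmetic concrete, here is a minimal sketch. The helper name apply_umask() is made up for illustration. It only mimics the calculation described above, and it does not create any files or change the real umask.

```c
#include <sys/types.h>

/* Hypothetical helper: the protection mode that a file ends up with when
   a program asks for "mode" and the file creation mask is "mask". */
mode_t apply_umask(mode_t mode, mode_t mask)
{
  return mode & ~mask;    /* Clear every bit of mode that is set in mask. */
}
```

For example, apply_umask(0666, 022) is 0644 (rw-r--r--), and apply_umask(0777, 077) is 0700 (rwx------).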

In the examples that follow, I'm not going to make the system call, but simply use the umask command, which does the same thing, but in the current shell. If I type umask into my shell, then it will tell me the current umask, in octal:

UNIX> umask
22
UNIX> echo "Hi" > f1.txt
UNIX> umask 0
UNIX> echo "Hi" > f2.txt
UNIX> umask 77
UNIX> echo "Hi" > f3.txt
UNIX> umask 777 
UNIX> echo "Hi" > f4.txt
UNIX> ls -l f?.txt
-rw-r--r--. 1 plank loci 3 Feb 13 09:53 f1.txt
-rw-rw-rw-. 1 plank loci 3 Feb 13 09:53 f2.txt
-rw-------. 1 plank loci 3 Feb 13 09:53 f3.txt
----------. 1 plank loci 3 Feb 13 09:54 f4.txt
UNIX> 
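The shell, when it opens a file for output redirection, uses a mode of 0666. As you can see from above, a umask of 022 yields a mode of 0644, a umask of 0 yields 0666, a umask of 077 yields 0600, and a umask of 0777 yields 0000. The same thing is true of directories, which the mkdir command creates with a mode of 0777: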
UNIX> rm -rf f?.txt
UNIX> umask 22
UNIX> mkdir d1
UNIX> umask 0
UNIX> mkdir d2
UNIX> umask 077
UNIX> mkdir d3
UNIX> umask 0777
UNIX> mkdir d4
UNIX> ls -l | grep 'd.$'
drwxr-xr-x. 2 plank loci     6 Feb 13 09:59 d1
drwxrwxrwx. 2 plank loci     6 Feb 13 09:59 d2
drwx------. 2 plank loci     6 Feb 13 09:59 d3
d---------. 2 plank loci     6 Feb 13 10:00 d4
UNIX> rm -rf d?
UNIX> umask 22
UNIX> 
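You'll note that in the umask command, I don't need to include the initial 0 -- it interprets its argument in octal. In the system call, you need to write the argument as an octal constant with a leading 0 (e.g. umask(022)), because C treats 22 as a decimal number.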
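
Here is a minimal sketch of making the system call from C instead of using the shell command. The helper name mode_created_with_umask() is hypothetical. It assumes that the file does not exist yet (the mode argument to open() only matters when the file is created), and that nothing else, such as a default ACL on the directory, alters the mode.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create "path" with open() while the umask is "mask",
   asking for protection mode "mode".  Returns the permission bits that the
   new file actually received, or -1 on error. */
int mode_created_with_umask(const char *path, mode_t mask, mode_t mode)
{
  struct stat buf;
  mode_t old;
  int fd, rv;

  old = umask(mask);                    /* Set the mask and remember the old one. */
  fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
  umask(old);                           /* Restore the caller's mask. */
  if (fd < 0) return -1;                /* E.g. the file already exists. */

  rv = (fstat(fd, &buf) == 0) ? (int) (buf.st_mode & 0777) : -1;
  close(fd);
  return rv;
}
```

With a fresh file name, mode_created_with_umask("f5.txt", 022, 0666) should return 0644, which matches what the shell did with f1.txt above. Note the leading zeros on the octal constants.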

Random File/Inode System calls.

These are sketchy because they are straightforward.

chmod(char *path, mode_t mode) -- Works just like chmod when executed from the shell. E.g. chmod("f1", 0600) will set the protection of file f1 to be "rw-" for you, and "---" for everyone else.

The man page for chmod() ("man -s 2 chmod") shows you a bunch of #define's from <sys/stat.h> that are useful for accessing individual bits of the mode.

Quiz yourself on your understanding of how open() and chmod() interact. Compile and run o1.c:

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>

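int main()
{
  int fd;

  /* O_CREAT requires a third argument: the mode to use if the file gets created. */
  printf("Opening the file:\n");
  fd = open("f1.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
  sleep(1);

  /* Turn off every protection bit.  This does not affect fd, which is already open. */
  printf("Doing chmod\n");
  chmod("f1.txt", 0000);
  sleep(1);

  /* The return values are deliberately unchecked: if open() failed, fd is -1. */
  printf("Doing write\n");
  write(fd, "Hi\n", 3);

  return 0;
}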

UNIX> ./o1
Opening the file:
Doing chmod
Doing write
UNIX> ls -l f1.txt
What will this show as the protection mode and the size of the file?
UNIX> cat f1.txt
What will this do?
UNIX>
The answers are as follows:
UNIX> ls -l f1.txt
----------. 1 plank loci 3 Feb 13 10:07 f1.txt
UNIX>
The file descriptor is a valid file descriptor for writing. The chmod() command did not do anything to the open file, so the process can successfully write with it. That is why the file's size is three, and not zero. The protection mode, of course, was changed by the chmod command.
UNIX> cat f1.txt
cat: f1.txt: Permission denied
UNIX> 
Since the protection mode was "---------", the cat program received an error when it tried to open the file (most likely with fopen(), which calls open()).

Suppose we run o1 again:

UNIX> ./o1
Opening the file:
Doing chmod
Doing write
UNIX> ls -l f1.txt
----------. 1 plank loci 3 Feb 13 10:07 f1.txt
UNIX>
You'll note that the modification time of f1.txt has not changed. This is because the open() call failed and returned -1: the file's protection mode is now 0000, so we do not have permission to open it for writing. The file was not truncated or modified in any way. That's why the modification time is unchanged. The chmod() call succeeded, but the write() system call was given a file descriptor of -1, so it failed too.
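
The second run fails silently because o1.c never checks a return value. Here is a minimal sketch of what checking looks like for the write() call. The helper name write_hi() is hypothetical. The one thing to be careful about is saving errno right away, before another library call can change it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write "Hi\n" to fd.  Returns 0 on success.  On
   failure, it prints the reason on stderr and returns the errno value. */
int write_hi(int fd)
{
  int e;

  if (write(fd, "Hi\n", 3) < 0) {
    e = errno;                          /* Save errno before calling anything else. */
    fprintf(stderr, "write(%d) failed: %s\n", fd, strerror(e));
    return e;
  }
  return 0;
}
```

Calling write_hi(-1) reproduces what happened in the second run of o1: the write() fails, and errno is set to EBADF ("Bad file descriptor").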

Let's kill that file:

UNIX> rm -f f1.txt
UNIX> 

link, unlink, remove, rename: These are straightforward: link(char *f1, char *f2) works just like:

UNIX> ln f1 f2
f2 has to be the name of the new link. Unlike ln, it cannot be an existing directory, because link() fails if f2 already exists.
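
Here is a minimal sketch that shows what link() does to the inode. The helper name link_and_count() is hypothetical, and for convenience it creates f1 if f1 does not exist. After a successful link(), both names refer to the same inode, and the inode's link count goes up by one.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make f2 a hard link to f1, creating f1 first if it
   does not exist.  Returns the file's link count afterward, or -1 if any
   call fails or if the two names do not refer to the same inode. */
int link_and_count(const char *f1, const char *f2)
{
  struct stat b1, b2;
  int fd;

  fd = open(f1, O_WRONLY | O_CREAT, 0644);
  if (fd < 0) return -1;
  close(fd);

  if (link(f1, f2) != 0) return -1;     /* Fails, for example, if f2 already exists. */
  if (stat(f1, &b1) != 0 || stat(f2, &b2) != 0) return -1;
  if (b1.st_ino != b2.st_ino || b1.st_dev != b2.st_dev) return -1;
  return (int) b1.st_nlink;
}
```

If neither file exists beforehand, the first call should return 2. Calling it a second time with the same names should return -1, because f2 now exists.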

unlink(char *f1) works like:

UNIX> rm f1
remove(char *f1) works like unlink(), but it also works for (empty) directories. Unlink() fails on directories. Take a look at o2.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>

int main()
{
  int fd;
  char s[11];
  int i;

  printf("Opening f1.txt and putting \"Fun Fun\" into s.\n");
  strcpy(s, "Fun Fun\n");
  fd = open("f1.txt", O_RDONLY);
  sleep(1);

  printf("Removing f1.txt\n");
  remove("f1.txt");
  sleep(1);

  printf("Listing f1.txt, and reading 10 bytes from the open file descriptor.\n");
  system("ls -l f1.txt");
  i = read(fd, s, 10);
  if (i >= 0) s[i] = '\0';     /* Only terminate s when read() succeeds.  s[-1] is out of bounds. */
  printf("Read returned %d: %d %s\n", i, fd, s);
  return 0;
}

This program opens f1.txt for reading, sleeps a second, and then removes f1.txt. It sleeps again, performs a long listing and then tries to read 10 bytes from the open file. The question is -- what happens when we remove f1.txt? Will the read call succeed, or fail because the file is gone?

UNIX> rm -f f1.txt
UNIX> echo "Jim Plank" > f1.txt
UNIX> ./o2
Opening f1.txt and putting "Fun Fun" into s.
Removing f1.txt
Listing f1.txt, and reading 10 bytes from the open file descriptor.
ls: cannot access f1.txt: No such file or directory
Read returned 10: 3 Jim Plank

UNIX> 
The ls command shows that f1.txt is indeed gone after the remove() call. However, the operating system does not delete the file until the last file descriptor to it is closed. For that reason, the read() call succeeds.
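
The same behavior can be packaged into one function, with the return values checked. This is just a sketch, and the helper name read_after_unlink() is made up. It uses unlink() rather than remove(), which does the same thing for a regular file.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open path, unlink it, and then read up to n bytes
   from the still-open file descriptor into buf.  Returns the number of
   bytes read, or -1 on any error. */
int read_after_unlink(const char *path, char *buf, int n)
{
  int fd, nread;

  fd = open(path, O_RDONLY);
  if (fd < 0) return -1;

  if (unlink(path) != 0) {              /* The name is gone after this call... */
    close(fd);
    return -1;
  }

  nread = (int) read(fd, buf, n);       /* ...but the bytes are still readable. */
  close(fd);                            /* Now the operating system can free the file. */
  return nread;
}
```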

Try o2 again -- since f1.txt was removed, it does not exist now:

UNIX> ./o2
Opening f1.txt and putting "Fun Fun" into s.
Removing f1.txt
Listing f1.txt, and reading 10 bytes from the open file descriptor.
ls: cannot access f1.txt: No such file or directory
Read returned -1: -1 Fun Fun

UNIX>
What happened? First, the open() call failed and returned -1. Thus, the read() call also failed and returned -1. Since the read call failed, the bytes of s were never overwritten - thus when we printed them out, we got "Fun Fun." Make sure you understand this code and its output. It is deterministic -- we are not getting segmentation violations or random behavior with these calls -- we are simply getting well-defined errors in our system calls.


rename(char *f1, char *f2) works just like:
UNIX> mv f1 f2

symlink and readlink: These routines mess with symbolic links. You will need them in your jtar assignment. Go ahead and read the man pages for these.
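
One detail from the readlink() man page that is easy to miss: readlink() does not put a null character at the end of the buffer. Here is a minimal sketch of a wrapper that does. The name read_link_target() is hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read the target of the symbolic link "path" into buf
   as a null-terminated string.  Returns the length of the target, or -1 on
   error, including when the target (plus the null) does not fit in buf. */
int read_link_target(const char *path, char *buf, size_t bufsize)
{
  ssize_t n;

  n = readlink(path, buf, bufsize);
  if (n < 0) return -1;                     /* E.g. path is not a symbolic link. */
  if ((size_t) n >= bufsize) return -1;     /* Possibly truncated, and no room for '\0'. */
  buf[n] = '\0';                            /* readlink() does not do this for us. */
  return (int) n;
}
```

For example, after symlink("some/target", "sl"), calling read_link_target("sl", buf, 64) should return 11 and put "some/target" into buf. The target does not have to exist.
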
mkdir and rmdir: Again straightforward, and like:
UNIX> mkdir ...
UNIX> rmdir ...
Read the man pages.
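
Here is a minimal sketch that exercises mkdir() and rmdir(), and also the difference between unlink() and remove() on directories mentioned earlier. The helper name mkdir_rmdir_demo() is hypothetical, and it assumes that path does not exist beforehand.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create and delete the directory "path" twice.
   Returns 0 if every call behaves as the notes describe, -1 otherwise. */
int mkdir_rmdir_demo(const char *path)
{
  if (mkdir(path, 0755) != 0) return -1;    /* The mode is still subject to the umask. */
  if (unlink(path) == 0) return -1;         /* unlink() should fail on a directory. */
  if (rmdir(path) != 0) return -1;          /* rmdir() works, since it is empty. */

  if (mkdir(path, 0755) != 0) return -1;
  if (remove(path) != 0) return -1;         /* remove() also handles empty directories. */
  return 0;
}
```
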
utime: This system call lets you change the time fields of a file's inode. It looks like it should be illegal (for example, one could write a program to make it look like one has finished one's homework on time...), but it is very handy, especially for writing tar (and jtar). As always, read the man page. When working with time values, you need to be aware of a few data structures, such as time_t and struct tm. Do "man ctime" to see a list of procedures that convert between time_t values, struct tm values and strings. Useful ones are ctime(), localtime(), mktime() (which is really useful when you want to, say, subtract a year from the current time), and strftime().
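
Here is a minimal sketch of both ideas: using mktime() to subtract a year, and using utime() to stamp a file. The helper names one_year_before() and set_file_times() are made up for this example, and the first one assumes that localtime() succeeds.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>
#include <utime.h>

/* Hypothetical helper: the time one calendar year before t, computed by
   converting to a struct tm, changing the year, and converting back. */
time_t one_year_before(time_t t)
{
  struct tm tm;

  tm = *localtime(&t);      /* Copy it, because localtime() returns static storage. */
  tm.tm_year -= 1;
  tm.tm_isdst = -1;         /* Let mktime() work out daylight saving time. */
  return mktime(&tm);
}

/* Hypothetical helper: set both the access time and the modification time
   of path to t.  Returns what utime() returns: 0 on success, -1 on error. */
int set_file_times(const char *path, time_t t)
{
  struct utimbuf ub;

  ub.actime = t;
  ub.modtime = t;
  return utime(path, &ub);
}
```

For example, set_file_times("f1.txt", one_year_before(time(NULL))) would make "ls -l f1.txt" show a modification time of a year ago.
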
chdir, getcwd: These are like the shell commands cd and pwd. You will not need these for jtar, but can use them if you'd like.