C has a very weak form of data encapsulation, provided by the generic void * pointer and the ability to keep a struct definition local to a file. Suppose I want to declare a Stack data type in C and hide its implementation, including its data structures, from users. I can do this by first defining a public file called stack.h that contains my generic Stack data type and the functions that the stack data type supports:
stack.h:

    typedef void *Stack;

    Stack stack_new(int size);
    void stack_free(Stack s);
    void stack_push(Stack s, int value);
    int stack_pop(Stack s);

Note that I have prefaced all my function names with the "stack_" prefix so that I can avoid name conflicts with user-selected names. C++ and Java have ways to avoid these name conflicts and they will be discussed later.
Next I create my stack.c file that contains the implementation for my stack data type:
    #include "stack.h"
    #include <stdlib.h>

    typedef struct {
        int size;
        int *data;
        int top;
    } myStack;

    Stack stack_new(int size) {
        myStack *newStack = (myStack *)malloc(sizeof(myStack));
        newStack->size = size;
        newStack->data = (int *)malloc(sizeof(int) * size);
        newStack->top = 0;
        return (Stack)newStack;  /* cast the myStack * to a (void *) */
    }

    void stack_push(Stack s, int value) {
        myStack *stack = (myStack *)s;
        if (stack->top == stack->size)
            return;  /* should really do error handling */
        stack->data[stack->top] = value;
        stack->top++;
    }
    ...

Since myStack is defined only in stack.c and does not appear in stack.h, its scope is limited to stack.c. Hence only the functions in stack.c can manipulate the myStack data structure. The user is handed a (void *), which effectively hides a stack's implementation because the user has no declaration of myStack to cast the (void *) to. Whenever the user wants to manipulate the stack, the user passes the (void *) to the appropriate stack function. The stack function can cast this (void *) back to a myStack * and manipulate the stack in any way it wishes.
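To see the opaque handle at work, the fragments above can be assembled into one translation unit. The following is only a minimal sketch, written in the subset of C that also compiles as C++. The bodies of stack_pop and stack_free are elided in the listing above, so the versions here are assumptions, as are the NULL checks, the choice to return 0 on underflow, and the hypothetical stack_demo function standing in for user code.

```cpp
#include <assert.h>
#include <stdlib.h>

/* ---- what stack.h would expose to users ---- */
typedef void *Stack;
Stack stack_new(int size);
void stack_free(Stack s);
void stack_push(Stack s, int value);
int stack_pop(Stack s);

/* ---- what stack.c would keep to itself ---- */
typedef struct {
    int size;
    int *data;
    int top;
} myStack;

Stack stack_new(int size) {
    myStack *newStack = (myStack *)malloc(sizeof(myStack));
    if (newStack == NULL) return NULL;
    newStack->size = size;
    newStack->data = (int *)malloc(sizeof(int) * size);
    if (newStack->data == NULL) { free(newStack); return NULL; }
    newStack->top = 0;
    return (Stack)newStack;
}

void stack_push(Stack s, int value) {
    myStack *stack = (myStack *)s;
    if (stack->top == stack->size) return;  /* full: value is dropped */
    stack->data[stack->top] = value;
    stack->top++;
}

/* Assumed behavior: popping an empty stack returns 0. */
int stack_pop(Stack s) {
    myStack *stack = (myStack *)s;
    if (stack->top == 0) return 0;
    stack->top--;
    return stack->data[stack->top];
}

/* Assumed behavior: release the array, then the struct itself. */
void stack_free(Stack s) {
    myStack *stack = (myStack *)s;
    if (stack == NULL) return;
    free(stack->data);
    free(stack);
}

/* ---- hypothetical user code: it only ever handles the Stack handle ---- */
int stack_demo(void) {
    Stack s = stack_new(2);
    if (s == NULL) return -1;
    stack_push(s, 10);
    stack_push(s, 20);
    stack_push(s, 30);           /* ignored, the stack holds only 2 values */
    int first = stack_pop(s);    /* 20 */
    int second = stack_pop(s);   /* 10 */
    stack_free(s);
    return first * 100 + second; /* 2010 */
}
```

In a real project the first part would live in stack.h and the second in stack.c, so the user code in stack_demo would never see the myStack definition at all.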
This form of data encapsulation using void *'s is fairly kludgy but it does allow several files to share their implementation, as long as each file declares its local data structures in exactly the same way. For example, I could spread the stack implementation over two files by declaring a myStack struct locally in both files. The obvious drawback to this approach is that instead of having one central declaration for the stack's data structures I have one declaration per file, which makes it much more difficult and error-prone to change the data structures.
A positive aspect of the void * implementation is that you can hand a binary implementation to a third party without divulging any proprietary implementation knowledge because the third party will only see the void * in the .h file. Hence the third party will not even know what data structures you are using.
The public, protected, and private access specifiers in C++ provide a way to control access to the implementation of a class. Unfortunately, these specifiers are coarse: they let everyone access a member, only the class and its subclasses, or only the class itself. They do not provide a way to say "let classes A, B, and C have access to each other's implementation, but exclude everyone else."
C++'s developers partially address this problem by providing the friend keyword. A class can declare that other classes are its friends, which allows the other classes to access the protected and private members of the class. For example:
    class ListNode {
        friend class List;
        ...
    };

This declaration gives any method in List the ability to examine any variable in ListNode and to call any method in ListNode, regardless of whether or not the access protection is public.
Friendship has a number of clunky disadvantages. First, it is not two-way. When you declare List to be ListNode's friend, ListNode does not become a friend of List. List must explicitly declare ListNode to be a friend before the friendship becomes two-way. Second, subclasses do not inherit a superclass's friendship status. For example, suppose you have the following subclass declaration:
    class DList : public List {
        ...
    };

DList is not considered a friend of ListNode, despite the fact that it is a subclass of a friend of ListNode.
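Both limitations can be seen in a small, compilable sketch. The member names (value, next, head), the push_front, front, and peek methods, and the friend_demo function are all hypothetical and chosen only for illustration. The line that the compiler would reject is left as a comment.

```cpp
#include <cassert>

class List;  // forward declaration, for readability

class ListNode {
    friend class List;  // List may use ListNode's private members
public:
    explicit ListNode(int v) : value(v), next(nullptr) {}
    // One-way only: ListNode gets no access to List's non-public members
    // unless List in turn declares "friend class ListNode;".
private:
    int value;
    ListNode *next;
};

class List {
public:
    List() : head(nullptr) {}
    List(const List &) = delete;
    List &operator=(const List &) = delete;
    ~List() {
        while (head != nullptr) {
            ListNode *n = head;
            head = head->next;   // allowed: List is a friend of ListNode
            delete n;
        }
    }
    void push_front(int v) {
        ListNode *n = new ListNode(v);
        n->next = head;          // allowed: List is a friend of ListNode
        head = n;
    }
    int front() const { return head->value; }  // caller must ensure non-empty
protected:
    ListNode *head;
};

class DList : public List {
public:
    // int peek() const { return head->value; }
    //   error: 'value' is private in ListNode, and DList does not
    //   inherit List's friendship.
    int peek() const { return front(); }  // has to go through List instead
};

int friend_demo() {
    DList d;
    d.push_front(7);
    d.push_front(42);
    return d.peek();  // 42, the most recently pushed value
}
```

DList can see head because it is protected in List, but it still cannot look inside the node that head points to.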
These restrictions are incredibly annoying and really limit the effectiveness of friends in C++. First, if you want classes A, B, and C to share their implementation, you must ensure that all the classes mutually declare each other as friends. Second, if you want their subclasses to also be friends, which invariably you do, then you have to make sure that the subclasses mutually declare each other as friends as well. In general, if you want n classes to be friends, you will need n(n-1) friend declarations, since each of the n classes must name the other n-1. In addition, if you add a new class to a group of n friends, then you must remember to add 2n more friend declarations: n in the new class and one in each of the n existing classes. What a mess!
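For n = 3 the count works out to 3 x 2 = 6 declarations, which the following sketch spells out. The data members, the total methods, and the mutual_demo function are hypothetical and exist only to exercise each friendship once.

```cpp
#include <cassert>

class B;
class C;

class A {
    friend class B;  // declaration 1
    friend class C;  // declaration 2
    int a = 1;
public:
    int total(const B &b, const C &c) const;
};

class B {
    friend class A;  // declaration 3
    friend class C;  // declaration 4
    int b = 2;
public:
    int total(const A &a, const C &c) const;
};

class C {
    friend class A;  // declaration 5
    friend class B;  // declaration 6
    int c = 3;
public:
    int total(const A &a, const B &b) const;
};

// Each class reads the private data of the other two.
int A::total(const B &b, const C &c) const { return a + b.b + c.c; }
int B::total(const A &a, const C &c) const { return a.a + b + c.c; }
int C::total(const A &a, const B &b) const { return a.a + b.b + c; }

int mutual_demo() {
    A a;
    B b;
    C c;
    return a.total(b, c) + b.total(a, c) + c.total(a, b);  // 6 + 6 + 6
}
```

Adding a fourth class D to this group would require three new friend declarations inside D plus one more in each of A, B, and C, for 2 x 3 = 6 additional lines.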
The second module-related concept in C++ is that of namespaces. The namespace keyword allows a programmer to specify that a certain set of variables, functions, and classes belong to the same library or "module". For example, a programmer might write:
    namespace ibm {
        class Stack { ... };
        class List { ... };
        class ListNode { ... };
        class Consult { ... };
        ...
    }

    namespace apple {
        class Stack { ... };
        class List { ... };
        class ListNode { ... };
        class Cut { ... };
        ...
    }

Notice that the same set of names has been reused, but since they are in two different namespaces, that is ok. There are three common ways to access members of a namespace:
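Qualify the name in full:

    ibm::Stack *s = new ibm::Stack();

Import a single name with a using declaration:

    using ibm::Stack;
    Stack *s = new Stack();

Import every name in the namespace with a using directive:

    using namespace ibm;
    Stack *s = new Stack();
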
Importing conflicting names into the same scope is a problem only if you actually try to use one of the conflicting names:
    using namespace ibm;
    using namespace apple;

    Consult *c = new Consult();  // ok: no name conflict
    Stack *s = new Stack();      // compiler error because of a name conflict

Namespaces can span one or more files, so you can still place declarations in a .h file and definitions in a .cpp file. For example, to define the methods for ibm's Stack class one could use any of the following three styles in an ibm.cpp file:
With a using directive:

    #include "ibm.h"
    using namespace ibm;

    int Stack::pop() { ... }
    ...

With fully qualified names:

    #include "ibm.h"

    int ibm::Stack::pop() { ... }
    ...

By reopening the namespace:

    #include "ibm.h"

    namespace ibm {
        int Stack::pop() { ... }
        ...
    }

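The name-conflict rule and two of the definition styles can be tried out in a single file. This is a sketch only: the pop bodies return made-up constants, and the id method and the namespace_demo function are hypothetical additions, since the original class bodies are elided.

```cpp
#include <cassert>

namespace ibm {
    class Stack { public: int pop(); };
    class Consult { public: int id() const { return 1; } };
}

namespace apple {
    class Stack { public: int pop(); };
}

// Definition with a fully qualified name.
int ibm::Stack::pop() { return 100; }  // placeholder value

// Definition by reopening the namespace.
namespace apple {
    int Stack::pop() { return 200; }   // placeholder value
}

int namespace_demo() {
    using namespace ibm;
    using namespace apple;

    Consult c;         // ok: only ibm declares Consult
    // Stack s;        // error: 'Stack' is ambiguous between ibm and apple
    ibm::Stack a;      // explicit qualification removes the ambiguity
    apple::Stack b;

    return c.id() + a.pop() + b.pop();  // 1 + 100 + 200
}
```

Both using directives are accepted without complaint. The compiler objects only at the point where the unqualified name Stack is actually used.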
C++ places its standard library, including the containers and algorithms that originated in the Standard Template Library, in the std namespace. This library provides a number of predefined data structures, such as vectors and lists.
Namespaces solve another of C's problems, which is that all variable, function, and type names end up in the same global namespace. This common grouping can create problems when you combine third-party software from two different vendors who duplicate one or more names, as shown above.
Unfortunately, C++'s developers did not create true modules with the namespace keyword. Unlike Java's packages, C++'s namespaces do not provide a way to share implementation among members of the namespace. If ListNode and List are declared in the same namespace, they still cannot access one another's non-public members without using the friend keyword. It would have been nice if they had also added the concept of package-level access so that one could truly create modules in C++, but they didn't. As a result, Java has a much more powerful module mechanism than C++.
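A short sketch makes the point concrete. The namespace name lists, the value member, the peek method, and the package_demo function are hypothetical. Sharing a namespace gives List nothing by itself; the friend declaration is what grants the access.

```cpp
#include <cassert>

namespace lists {
    class List;  // forward declaration so the friend line names lists::List

    class ListNode {
        // Without this line, List::peek below fails to compile even though
        // both classes live in the same namespace.
        friend class List;
        int value = 5;  // private by default
    };

    class List {
    public:
        int peek(const ListNode &n) const { return n.value; }
    };
}

int package_demo() {
    lists::ListNode node;
    lists::List list;
    return list.peek(node);  // 5
}
```

In Java, by contrast, declaring value with no access modifier would make it visible to every class in the same package with no further declarations.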