Having Fun with K&R C

Nov 24, 2025

Don't want to miss any updates? You can follow this RSS feed or sign up for my newsletter:

A big thanks to Diomidis Spinellis who read both parts in this series, caught *many* mistakes, and provided valuable feedback!

Introduction

This is the successor to a previous article where we learned how to simulate Unix v7 on PDP-11 with OpenSIMH. In this article we will use the C compilers that shipped with Unix v7—namely Dennis Ritchie’s compiler (cc) and the Portable C Compiler (pcc) by Steven C. Johnson—to explore how C was back then, before it became ANSI C. Theoretically you don’t need to go into the troubles of the previous article to follow this article. You can follow along as long as you have a running copy of Ritchie’s compiler. But I find it much more reliable to use Unix v7 on PDP-11.

The Length of Variable Names is Limited to 8 Characters

Even though, as today, variables could have names arbitrarily long (meaning you wouldn’t get a compiler error if you had e.g., a 20-char variable name), only the first 8 characters were considered by the compiler. To see that in practice, create a .c file with:

int main(void) {
  int our_variable_one=111;
  int our_variable_two=222;
  return 0;
}

Now let’s try to compile it with cc:

# cc compiler.c
compiler.c:3: our_vari redeclared
compiler.c:3: Declaration syntax

pcc has the same behavior:

# pcc compiler.c
compiler.c
"compiler.c", line 3: redeclaration of our_vari

Later, when we look at the source code of the compiler, we’ll see exactly why we get this behavior in cc.

Linking Problems

As you might expect, this is how the compiler created the object files too, which means it manifested as a linking problem too. For example, let’s create link1.c:

int global_variable_one=111;

int ret1() {
  return global_variable_one;
}

Then link2.c:

int global_variable_two=222;

int ret2() {
  return global_variable_two;
}

And finally main.c:

int ret1();
int ret2();

int main() {
  int a = ret1() + ret2();
}

Now we do:

# cc link1.c -c
# cc link2.c -c
# cc main.c -c

These three run fine, but then if we try to link them...

# cc link1.o link2.o main.o
_global_: ld:link2.o: Multiply defined

But that’s not a problem caused by the linker. Let’s see the object file:

# nm link1.o
000020 t L1
000006 t L2
000014 t L3
000022 D _global_
000000 T _ret1
       U cret
       U csv
000000 t ~ret1

There are a bunch of interesting things here. First of all, look how small the object file is! This will make more sense if we see the assembly, which we can do with adb (in the last section I will present an easier way).

# adb link1.o

_ret1,10?i
_ret1:          jsr     r5,_ret1
                br      L1

L2:             mov     _global_,r0
                br      L3

L3:             jmp     _ret1

L1:             br      L2
                jmp     *-(pc)
text address not found

We launch link1.o in adb, and then we tell the tool to print 10 instructions starting from _ret1 label. First, _ret1 corresponds to ret1(). It was a convention back then (that is still true in different forms) for the C compiler to prepend C symbols with _ to distinguish them from non-C symbols, like csv (which we will talk about in a bit). The first line is a jump to _ret1, which looks like infinite recursion. But the reason is that in place of _ret1 is an undefined relocation. In other words, the instruction is really like this:

jsr     r5,<undefined>

because _ret1 has not gotten an address yet. What will go to its place is csv, which stands for “C save (registers)”, i.e., it sets up the stack. In any case, we could spend a lot of time talking about linkers, but let’s get back to the problem at hand. As you can see, the symbol was truncated by the compiler and that’s how it made it to the object file. It’s not the linker’s problem. I wanted to test if the linker has this problem, and I tried to write raw assemby and and then assemble it into an object file:

cat > ret1.s << 'EOF'
        .data
        .globl  _global_variable_one

_global_variable_one:
        .word   111

        .text
        .globl  _ret1

_ret1:
        mov     _global_variable_one, r0
        rts     pc                         / return r0
EOF

and we assemble it:

as ret1.s

This produces a.out. Now let’s see the symbols:

# nm a.out
       u .word
000006 D _global_
000000 T _ret1

So apparently it is also a problem in the assembler. I think the only alternative if we want to make sure our object files have the full identifier is to write raw bytes. That is quite a pain, so it is left as an exercise to the reader...

Old-Style Initialization

In the very old C you could initialize variables without the = operator. Here’s init.c:

#include <stdio.h>

int main() {
  int a 43;
  printf("%d\n", a);
}

Then the following works just fine:

# cc init.c
# ./a.out
43

pcc gives us a warning but it still works:

# pcc init.c
init.c
"init.c", line 4: warning: old-fashioned initialization: use =
# ./a.out
43

Non-Existent Keywords

There are many keywords in ANSI C that have been introduced back then. One of them is void. Here’s void.c:

void test() { return; }

int main() {
  test();
}

Compilation fails:

# cc void.c
void.c:1: External definition syntax

const and volatile were also not available. Here’s const.c:

int main() {
  const int a = 2;
}

And again, compilation fails:

# cc const.c
const.c:2: const undefined; func. main
const.c:2: Expression syntax
const.c:3: a undefined

Similarly for volatile:

# cc volatile.c
volatile.c:2: volatile undefined; func. main
volatile.c:2: Expression syntax
volatile.c:4: a undefined

Struct Fields Were Global

Fields of one struct could be accessed from a variable of type of another struct without a cast. Here’s an example in sct.c:

#include <stdio.h>

struct A {
  int x;
  int y;
};

struct B {
  int w;
  int z;
};

int main() {
  struct A a;
  struct B *bp;
  a.x = 1;
  a.y = 2;
  bp = &a;
  printf("%d\n", bp->x);
}

This compiles and runs fine:

# cc sct.c
# ./a.out
1

We did not have to add a cast to assign &a to bp. More surprisingly, even though the compiler knows that bp points to struct B and struct B does not have a field x, it does not complain. This is because struct fields in K&R C were global, not tied to a specific struct declaration. What was saved was only the offset from the struct they were declared to so that you can access them correctly. This is not an edge case; as we will see, it is used all over Dennis Ritchie’s C compiler.

Old-Style Function Definitions

K&R C allowed what we now call “old-style function definition”. Here’s what it looks like:

#include <stdio.h>

int test(x) int x; { return x + 2; }

int main() {
  float x = 1.2;
  int y = test(x);
  printf("%d\n", y);
}

We provide the name of parameters (x) but not its type. The type is provided between ) and {, where nothing goes today. But if the type is added there, it is not type-checked, which is why this program compiles with not even a warning by both cc and pcc. Instead, this should be thought of as a cast. That is, we cast whatever comes in to int. The value we get in both cases is: (a) bogus, and (b) the same across the two compilers.

# cc fdef.c
# ./a.out
16539
# pcc fdef.c
fdef.c
# ./a.out
16539

Now there are two things that are crazy. The first is that all the previous examples do not compile today. However, this example not only does it still compiles today, but it also does so with no warning at all even if you ask for extra warnings!

> gcc test.c -o test -Wall -Wextra && ./test
3
> clang test.c -o test -Wall -Wextra && ./test
3

The only difference is that you get a value you would expect.

The second thing that is crazy is that K&R C in Unix v7 simply did not support the function definitions we know today.

int test(int a, int b) { return a + b; }

int main() {
  int a = test(1 , 2);
}

This does not compile with either compiler.

# cc fdef2.c
fdef2.c:1: Declaration syntax
fdef2.c:1: Declaration syntax
fdef2.c:1: a undefined; func. test
fdef2.c:1: b undefined; func. test
# pcc fdef2.c
fdef2.c
"fdef2.c", line 1: syntax error
"fdef2.c", line 1: warning: old-fashioned initialization: use =
"fdef2.c", line 1: warning: undeclared initializer name a
"fdef2.c", line 1: unknown size
"fdef2.c", line 1: cannot recover from earlier errors: goodbye!

Compiler Internals: The 8-Character Limit

The moment we’ve all been waiting for arrived: we are going to take a sneak peek at the internals of Dennis Richie’s compiler. The source code lives under /usr/src/cmd/c and the directory looks like this:

c0.h
c00.c
c01.c
c02.c
c03.c
c04.c
c05.c
c1.h
c10.c
c11.c
c12.c
c13.c
c2.h
c20.c
c21.c
cvopt.c
makefile
table.s

The compiler has 3 main passes: c0, c1, and c2, and a special pass, cvopt. c1 is what we would call today the “front end”. It does lexing, parsing, and semantic analysis. c2 is what we would call today the “back end”, as it generates assembly from an AST, while also performing some AST optimizations. c2 is a low-level optimizer that works directly on the assembly. Finally, cvopt is also an optimization pass, but doesn’t perfrorm standard compiler optimizations; it’s for compressing the output assembly. The files of each pass (e.g., c00.c-c05.c) don’t seem to have any specific ordering.

Let’s first review the issue of the 8-character length for identifiers. What concerns us is lexing, which is where identifiers should be recognized. Lexing happens in the function symbol() in c00.c. If we ignore some logic dealing with peeking characters, basically this function first calls getchar() from the C standard library to get the next character. It also has a character table, ctab, defined in c05.c, which maps characters to syntactical categories. It looks like this:

char ctab[] {
  EOFC,  INSERT,  UNKN,  UNKN,  UNKN,  UNKN,  UNKN,  UNKN,
  UNKN,  SPACE,  NEWLN,  SPACE,  SPACE,  UNKN,  UNKN,  UNKN,
  ...
  SPACE,  EXCLA,  DQUOTE,  SHARP,  UNKN,  MOD,  AND,  SQUOTE,
  LPARN,  RPARN,  TIMES,  PLUS,  COMMA,  MINUS,  PERIOD,  DIVIDE,
  ...
  DIGIT,  DIGIT,  COLON,  SEMI,  LESS,  ASSIGN,  GREAT,  QUEST,
  UNKN,  LETTER,  LETTER,  LETTER,  LETTER,  LETTER,  LETTER,  LETTER,
  ...
  LETTER,  LETTER,  LETTER,  LBRACE,  OR,  RBRACE,  COMPL,  UNKN
};

A character table is standard practice in front ends, but this is a pretty hardcore way to make one, meaning, by hand. You have to make sure you map each character in the ASCII table correctly, and if you miss one, then it’s all wrong. But it wasn’t the only case in old compilers. LCC did the same. You don’t have to do it this way, though. You can do something like this. Note something subtle in Ritchie’s table: the underscore maps to LETTER.

symbol() then switches on the lookup (still inside symbol()):

switch(ctab[c]) {
...

We are concerned with the case LETTER:

case LETTER:
  sp = symbuf;
  while(ctab[c]==LETTER || ctab[c]==DIGIT) {
    if (sp<symbuf+NCPS)
      *sp++ = c;
    c = getchar();
  }
  while(sp<symbuf+NCPS)
    *sp++ = '\0';
  mossym = 0;
  if (mosflg) {
    mossym = FMOS;
    mosflg = 0;
  }
  peekc = c;
  if ((c=lookup())==KEYW && cval==SIZEOF)
    c = SIZEOF;
  return(c);

The goal of this code is to save the characters of an identifier into symbuf, which is just a buffer of characters (defined in c0.h):

char  symbuf[NCPS+2];

We keep going as long as we have a letter or a digit and then ignoring this mos-related logic,1 we look up the identifier to see if it’s a keyword. There’s a special case for sizeof, I think because it’s the only keyword that can appear arbitrarily inside expressions. So if it is sizeof we return that, otherwise we return what lookup() returned. If lookup() can’t find the identifier in keywords, it will return NAME, which means this code will return NAME for valid identifiers, which is what we want.

But wait a minute... What is this NCPS? Apparently the code saves in symbuf (through sp) only the first NCPS characters. As you might expect, this is the Number of Characters Per Symbol. It’s defined in c0.h:

#define  NCPS  8  /* # chars per symbol */

And this is why our identifiers are truncated to the first 8 characters. This is also why old C code used short names for variables, function names, etc.

Compiler Internals: Optimizations

Surprisingly there is quite a bit of optimization performed by cc. Here we’ll take a look at a two cases.

Constant Folding

In c12.c we can find the following:

constant:
  if (tree->tr1->op==CON && tree->tr2->op==CON) {
    const(op, &tree->tr1->value, tree->tr2->value);
    return(tree->tr1);
  }

The compiler deals with trees internally. As it was common in old compilers, a tree is really a tree node. In this case it has two children, tr1 (the LHS) and tr2 (the RHS). If both are constant, we go to const() to optimize it. In the second argument we pass a pointer because we will modify it to have the folded constant. Let’s take a look at a few cases in the body of the function. Generally, it is what you would expect:

case PLUS:
  *vp =+ v;
  return;

case TIMES:
  *vp =* v;
  return;

case AND:
  *vp =& v;
  return;

However, note that it does not contain any floating-point constant folding (unlike modern compilers; see this). Let’s see that in practice. We first create flt.c:

float foo() {
  double a = 1.2 + 2.4;
}

I use doubles because K&R C did not have float literals (e.g., 1.2f), so all floating-point literals were by default 64-bit doubles.

Now we want to see the assembly code. Instead of following our previous method with adb, we will simply ask the compiler to generate assembly code. To do that, we need to understand better how compiler the driver works. This is /usr/src/cmd/cc.c (which calls the passes). The main() of this file has the following near the beginning:

while(++i < argc) {
  if(*argv[i] == '-') switch (argv[i][1]) {
  default:
    goto passa;
  case 'S':
    sflag++;
    cflag++;

So, if we pass -S, then sflag becomes 1 and also -c is assumed. The other interesting part of this program is this:

if (oflag) {
  av[0] = "c2";
  av[1] = tmp5;
  av[2] = tmp3;
  av[3] = 0;
  if (callsys(pass2, av)) {
    unlink(tmp3);
    tmp3 = assource = tmp5;
  } 
  else
    unlink(tmp5);
}
if (sflag)
  continue;
assemble:
  av[0] = "as";
  av[1] = "-u";
  av[2] = "-o";
  ...
  if (callsys("/bin/as", av) > 1) {

A couple of things happen here. First, c2 is not called unless we pass the -O option. Second, if we pass -S we don’t proceed to assemble the assembly file into an object file. And this is what we want because that way we can both: (a) inspect the assembly, and (b) call c2 manually to see what it does. We’re ready to compile the file:

# cc flt.c -S
# cat flt.s
...
L2:.data
L10000:40231;114631;114631;114632
.text
.data
L10001:40431;114631;114631;114632
.text
movf    L10000,r0
addf    L10001,r0
movf    r0,-16(r5)
...
.globl  fltused
...

I have isolated the interesting parts. Most importantly, in the data segment we have the two FP numbers, and in the text segment (i.e., the body of the function) we have an explicit addf instruction (which was not folded during compilation). We can run c2 on its own, but it gives us basically the same thing:

# /lib/c2 flt.s
.data
L10000:40231;114631;114631;114632
L10001:40431;114631;114631;114632
...
movf    L10000,r0
addf    L10001,r0
movf    r0,-16(r5)
jmp     cret
.globl  fltused

Two things are subtle here. The first is the floating-point encoding. It is not the standard IEEE-754 floating-point format we have today, because that had not even been established yet (that happened in 1985)! It uses the PDP-11 floating-point format.

The other thing you probably noticed is that in both snippets I left the fltused declaration there. The reason is that this symbol is still relevant today for the same reason it was back then. In c10.c we find the following:

/*
  * If any floating-point instructions
  * were used, generate a reference that
  * pulls in the floating-point part of printf.
  */
if (nfloat)
  printf(".globl  fltused\n");

Back then it was a linking optimization, i.e., the linker pulled parts of the standard library that dealt with floating-point numbers only if fltused was defined. Today it is only relevant on Windows, whose linker expects to find the symbol, something one has to deal with if they want to create freestanding executables.

Instruction Combining

This is how LLVM names this kind of optimization, which is why I’m using this name (we could also use the term “peephole optimizations”, but it is more general). I am talking about e.g., turning a*2 to a<<1. Here’s an excerpt from c11.c:

ispow2(atree)
{
  register int d;
  register struct tnode *tree;

  tree = atree;
  if (!isfloat(tree) && tree->tr2->op==CON) {
    d = tree->tr2->value;
    if (d>1 && (d&(d-1))==0)
      return(d);
  }
  return(0);
}

pow2(atree)
struct tnode *atree;
{
  register int d, i;
  register struct tnode *tree;

  tree = atree;
  if (d = ispow2(tree)) {
    for (i=0; (d=>>1)!=0; i++);
    tree->tr2->value = i;
    switch (tree->op) {

    case TIMES:
      tree->op = LSHIFT;
      break;

    case DIVIDE:
      tree->op = ULSH;
      tree->tr2->value = -i;
      break;

There are a couple of interesting things here. First, note that in both functions tree is a pointer to struct tnode. Here’s its declaration in c1.h:

struct  tnode {
  int  op;
  int  type;
  int  degree;
  struct tnode *tr1, *tr2;
};

You may notice that it does not have a value field, yet both functions access one using tree->tr2 (which is also a struct tnode pointer). This because we have already checked that tree->tr2 is a constant, and there is a struct tnode that has an int value field. We explained why this is possible earlier.

Other than that, ispow2() checks whether the RHS is a constant with a value that is a power of 2. If that is the case, pow2() handles many different cases. Here I have two: multiplication to left shift and division to right shift. The way it happens is with a for loop that keeps incrementing i while there are still set bits in d. Note that the loop body is empty! It is just ;. Since d is a power of 2, i.e. 2^e, there is only one set bit and this loop stores the exponent e in i. Today we wouldn’t use a for loop to do that because there is an instruction that counts trailing zeroes in almost all instruction sets (e.g., tzcnt in x86). But PDP-11 did not have such an instruction. Note that today the compiler will optimize the code accordingly only if d is unsigned (example) or if you tell it (in a non-portable way) that d >= 0 (example).

Finally, note the =>> operator. Because of it, the same code cannot be parsed today. However, back then the compound assignment operators like +=, >>=, etc. could be written either way. That is, =>> was equivalent to >>=. This was discontinued in ANSI C.

Conclusion

This concludes this small 2-part series. Hopefully you had fun with PDP-11, Unix v7, and K&R C. I hope it inspired you to go ahead and figure out more on your own!

Don't want to miss any updates? You can follow this RSS feed or sign up for my newsletter:

Footnotes

“mos” stands for “member of structure”. The comment in the definition of symbol() says “mosflg means that the next symbol, if an identifier, is a member of structure or a structure tag”. It’s a global variable that is set in a few places, like strdec() in c03.c which parses struct declarations.↩