LmCast :: Stay tuned in

Curly braces: An evolution of Unix and C

Recorded: May 24, 2026, 4:58 p.m.

Original Summarized

Curly braces: An evolution of UNIX and C | Thalia Archibald’s blog

Thalia Archibald's blog
Curly braces: An evolution of UNIX and C
19 May 2026
How were { } curly braces typed with a Teletype Model 33 on UNIX? These
characters are especially important for C, but absent on this terminal. I was
just asked a similar question 1 and in response, this is a tour of the
coevolution of UNIX and C, from this perspective, featuring “hello, world”
through the ages.
This work is entirely my own (no AI) and the code samples are my construction.
Sources for all inferences are cited.
ASCII 1963
The Teletype Model 33 famously couldn’t write lowercase letters. This
teleprinter was designed around the first edition of the ASCII standard, ASA
X3.4-1963, which hadn’t yet decided lowercase was worth adding. Some in the
committee thought more control characters would be a better use of the limited
encoding space. The standard soon evolved into its modern form, but the Model 33
was the first commercial use of ASCII and wildly popular, so its issues stuck.
In addition to missing lowercase, ASCII 1963 and the Model 33 lacked { }
curly braces, | vertical bar, ` backtick, and ~ tilde, and they had
↑ up arrow instead of ^ caret and ← left arrow instead of _ underscore.
Trigraphs and digraphs
Curly braces are a prominent part of C syntax, used for blocks. For example:
int main(int argc, char *argv[]) {
printf("hello, world!\n");
}

To support character sets without these characters, C89 invented trigraphs, so
{ could be written as ??< and } as ??>:
int main(int argc, char *argv[]) ??<
printf("hello, world!\n");
??>

The trigraph ??/ for \ backslash can be used at the end of a line to produce
a line continuation, which was lexical undefined behavior when within a
universal character name. I encountered this case while writing a static
analysis, but it was later fixed in C++26 2.
Then, C95 introduced nicer-looking digraphs, so { can be written as <% and
} as %>:
int main(int argc, char *argv[]) <%
printf("hello, world!\n");
%>

But, trigraphs were only introduced after the Teletype Model 33 was obsolete.
How did they write C code in the early ’70s?
Terminal drivers
Starting in UNIX V4 in November 1973, the teletype driver would translate
between \( and { and between \) and }:
main(argc, argv)
char *argv[];
\(
printf("hello, world!\n");
\)

This support was added sometime between the nsys kernel 3 in August
1973 and the V4 manual in November 1973 4. The Utah_v4 kernel (June 1974)
5 and Dennis_v5 kernel (November 1974) 6 have support, but
nsys, a pre-release version of V4 before pipes were added back in, does not. The
V2 and V3 kernels, which were written in assembly, did not survive, but the nsys
kernel matches the V3 manual 7 and the V1 kernel 8.
UNIX exposes devices through a common byte stream interface and this character
translation is transparent to user space programs. Programs use the bytes for
ASCII { } and the kernel translates them to \( \) on write to a
Teletype Model 33, or in reverse on read.
This escaping evolved out of the need to delete characters sent by a terminal,
since teleprinters can’t erase text that’s already been printed on paper. The
scheme they used, inherited from Multics, was to process input by lines and
interpret # “erase” as deleting the previous character and @ “kill” as
clearing the current line. Either character can be escaped with backslash to get
the literal character.
For example, this Utah_v4 session writes that program with a Teletype Model 33
and uses @ and # to fix a few mistakes:
% ed hello.c
?
a
main(argc, argv)
char *argv[];
\(
printf("hallo, welt@ printf("hello #, world!\n");
\)
.
w
63
q
% cc hello.c
% a.out
hello, world!

If you signed in with another terminal, you would see:
% cat hello.c
main(argc, argv)
char *argv[];
{
printf("hello, world!\n");
}

What about before UNIX V4? From its start in June 1972 9, C used only
braces. You just needed to use a terminal that could produce braces.
Early C structs
Interestingly, when structs were first added to C in December 1972
10, they used parentheses instead of braces! For a time around
nsys in August 1973, you could even write structs with either parentheses or
braces. It was fully switched to only the modern syntax at the latest by June
1974 11. This definition in nsys uses both styles 12:
struct user {
int u_rsav[2]; /* must be first */
/* ... */
struct (
int u_ino;
char u_name[DIRSIZ];
) u_dent;
/* ... */
} u; /* u = 140000 */

But that’s just one language feature; blocks still required braces.
B
Before C was B, an interpreted language made by Ken Thompson for UNIX. B had no
types—every value was a machine word—, perfect for the PDP-7 that UNIX started
on, with 18-bit words.
A descendant of this early B remained in use for the Honeywell 6070, far after
UNIX B was replaced with C. This machine has 36-bit words, so four characters
fit into a word. The 1973 B language tutorial for the H6070 13 had the
first-ever “hello, world” program, also using curly braces:

main( ) {
extrn a, b, c;
putchar(a); putchar(b); putchar(c); putchar('!*n');
}

a 'hell';
b 'o, w';
c 'orld';

B to NB
But this everything-is-a-word strategy fails on the PDP-11, which UNIX very
quickly transitioned to. This machine has 16-bit words and 8-bit addressing.
Since addresses could then be misaligned, B on the PDP-11 needed a hack where
globals that weren’t word-aligned by the linker would be patched at runtime
14.
So, around May 1972 9, Dennis Ritchie added char and [] pointer
types to the language, calling it “new B”. Note the use of only [], instead of
the later *:

main(argc, argv)
char argv[][]; {
printf("hello, world!\n");
}

NB to C
He then turned it into a compiler that produces machine code instead of the
inefficient threaded code of B, and renamed it to C. The syntax remained the
same:
main(argc, argv)
char argv[][];
{
printf("hello, world!\n");
}

However, the first C compiler in June 1972 had dropped the $( $) escapes for
braces from B and support never returned 15.
But it still retained much of the semantics of B. Functions, arrays, and even
labels were indirected via a writable pointer 16, leading to quirks like
reassignable labels 17 18:
goto init;
init:
ouptr = oubuf;
init = init1;
init1:

This indirection was removed when structs were added, which created a
distinction between pointers and arrays. Pointers are reassignable and arrays
are not. As such, * was introduced, at the latest by August 1973 19:
main(argc, argv)
char *argv[];
{
printf("hello, world!\n");
}

With a compiler and structs, C was fast and expressive enough to rewrite the
kernel in C, culminating in the release of UNIX V4.
PDP-11 B
Backing up from C to B, we can finally use a Teletype Model 33 again! B on the
PDP-11 supported braces, in addition to the following escapes 20:

*0: NUL
*e: End of file
*(: {
*): }
*t: Tab
**: *
*': '
*": "
*n: Line feed

UNIX was ported to the PDP-11 in February 1971 21. Some time between then
and the B reference manual in January 1972 20, the use of { } curly
braces was invented, setting it apart from its predecessors. At the time of the
draft mid-1971 manual 22, they were clearly using the Teletype Model 37, a
newer teleprinter that supported braces 23. If PDP-11 B did not support
{ } from the start, it gained them very shortly thereafter.
The runtime library resembles later C:

main() $(
printf("hello, world!*n");
$)

Unfortunately, no B source code survived from this era. However, the compiled
PDP-11 B runtime from June 1972 survived 24 and has been disassembled, and
a compiler that produces this output has been reconstructed 14.
PDP-7 B
And before the PDP-11, PDP-7 B also supported $( and $) instead of, or in
addition to, braces. But it used two-char printing for the 18-bit word size:

main() $(
write('he'); write('ll'); write('o,');
write(' w'); write('or'); write('ld'); write(041012);
$)

Only two B programs from the PDP-7 era of UNIX have survived, both showing this
syntax 25:

main $(
auto ch;
extrn read, write;

goto loop;
while (ch != 04)
$( if (ch > 0100 & ch < 0133)
ch = ch + 040;
if (ch==015) goto loop;
if (ch==014) goto loop;
if (ch==011)
$( ch = 040040;
write(040040);
write(040040);
$)
write(ch);
loop:
ch = read()&0177;
$)
$)

This syntax is directly borrowed from its predecessor, BCPL.
BCPL to B
B was Ken Thompson’s version of BCPL, simplified to its core, as he so often
did. The name too was a contraction of either BCPL or Bon, an unrelated language
he created during his Multics days 16.
An example in the style of the 1967 BCPL manual, reflecting the state of the
language at the time B forked from it 26:

let Start() be
$( Writech(MONITOR,'h'); Writech(MONITOR,'e'); Writech(MONITOR,'l')
Writech(MONITOR,'l'); Writech(MONITOR,'o'); Writech(MONITOR,',')
Writech(MONITOR,' '); Writech(MONITOR,'w'); Writech(MONITOR,'o')
Writech(MONITOR,'r'); Writech(MONITOR,'l'); Writech(MONITOR,'d')
Writech(MONITOR,'!'); Writech(MONITOR,'*n') $)

Although this manual uses mixed letter case and rich symbols, the canonical
style was uppercase 27. The 1967 manual does not specify an entrypoint,
so I adapted Start from START in the 1979 BCPL book 28.
BCPL only gained { and } for blocks later, in imitation of C 27.
Teletype Model 37
Even before UNIX V4 extended the terminal driver to replace \( and \) for
the Teletype Model 33, UNIX programmers had stopped writing B code with $( and
$). They had moved on from the Model 33 to the Teletype Model 37, its
successor.
The Model 37 was 50% faster and supported the full, modern ASCII character set.
They were no longer limited to the ASCII 1963 subset.
It was the most advanced electromechanical teleprinter ever made, i.e., it
operates purely mechanically without digital logic, but was soon obsolesced by
video terminals.
It had many escape sequences: black and red colors, half-forward and
half-reverse line feeds (useful for sub- and superscripts), reverse line feed,
horizontal and vertical tab setting, and half- and full-duplex 29
30. Last year, Brian Kernighan recounted a humorous use of one of these
features in the UNIX group: Robert Morris Sr. sent an email to Joe Ossanna which
contained a hundred reverse line feeds, making it suck the long fan-fold paper
out the back of the Model 37 and drop it on the floor 31.
Terminals on UNIX
UNIX gained Teletype Model 37 support early on and it quickly became preferred.
PDP-7 UNIX supported only the Model 33 32. But, already before the
UNIX V1 manual was finalized, a draft mid-1971 manual implies that many UNIX
users were already using the Teletype Model 37 23. The V1 kernel
supported the Model 37 33. login from V2 at the latest to V5 would
cycle through speeds and login messages for different terminals, supporting the
TermiNet 300 and Teletype Model 37 34. It grew in V6, once getty was
rewritten in C, to support many more terminals and further in V7, but still
supported the Model 37 35.
No version of the assembly kernel uses braces in its source code, even V1, once
development used the Model 37.
Modern implications
The character set limitations of the Teletype Model 33 have had lasting
influence on modern computing.
UNIX uses lowercase almost exclusively. This is still Ken’s writing style
36.
PDP-7 UNIX sources don’t contain a single underscore (it would have been ← on
the Model 33). Later versions use it sparingly. The core of libc still uses
flatcase naming style.
Identifiers in C were very short to fit within the 7 or 8-byte limit. Many such
functions are still in libc. Though, that’s due to the assembler—a topic worthy
of another post, once I finish my assembler.
Design decisions from 1963 still affect us today, 63 years later!

I collect teletypes and am seeking a Teletype Model 37 37. If you have
any leads on one, please get in touch! And, I hope to eventually acquire a
PDP-11 too.
Appendix: hello, world
All the “hello, world” snippets, together:
// BCPL, circa 1967
let Start() be
$( Writech(MONITOR,'h'); Writech(MONITOR,'e'); Writech(MONITOR,'l')
Writech(MONITOR,'l'); Writech(MONITOR,'o'); Writech(MONITOR,',')
Writech(MONITOR,' '); Writech(MONITOR,'w'); Writech(MONITOR,'o')
Writech(MONITOR,'r'); Writech(MONITOR,'l'); Writech(MONITOR,'d')
Writech(MONITOR,'!'); Writech(MONITOR,'*n') $)

/* PDP-7 B, 1969 */
main() $(
write('he'); write('ll'); write('o,');
write(' w'); write('or'); write('ld'); write(041012);
$)

/* PDP-11 B, 1971 */
main() $(
printf("hello, world!*n");
$)

/* NB, May 1972 */
main(argc, argv)
char argv[][]; {
printf("hello, world!\n");
}

/* Early C, June 1972 */
main(argc, argv)
char argv[][];
{
printf("hello, world!\n");
}

/* C, August 1973 at the latest */
main(argc, argv)
char *argv[];
{
printf("hello, world!\n");
}

/* C and UNIX V4, November 1973 */
main(argc, argv)
char *argv[];
\(
printf("hello, world!\n");
\)

/* C89, 1989 */
int main(int argc, char *argv[]) ??<
printf("hello, world!\n");
??>

/* C95, 1995 */
int main(int argc, char *argv[]) <%
printf("hello, world!\n");
%>

// C99, 2000
int main(int argc, char *argv[]) <%
printf("hello, world!\n");
%>

References

“What Teletype are you using? The ASR33 lacks curly brackets which
makes me skeptical it was used in C development. Yes, there are trigraphs (
https://en.wikipedia.org/wiki/Digraphs_and_trigraphs_(programming) ) but
FWIU these came some time later.”
by Michael Katzmann, 19 May 2026 ↩

P2621R3: “UB? In My Lexer?”
by Corentin Jabot, 6 April 2023 ↩

nsys /usr/sys/dmr/tty.c:canon,
modified and snapshotted 31 August 1973 ↩

UNIX Programmer’s Manual Fourth Edition, dc(4)
“dc – DC-11 communications interfaces”, November 1973 ↩

Utah_v4 /usr/sys/dmr/tty.c:canon
and mptab, modified 10 June 1974,
and /usr/sys/dmr/lp.c:lpcanon,
modified 10 June 1974, snapshotted 12 June 1974 ↩

Dennis_v5 /usr/sys/dmr/tty.c:canon
and mptab, modified 26 November 1974,
and /usr/sys/dmr/lp.c:lpcanon,
modified 26 November 1974, snapshotted 21 March 1975 ↩

UNIX Programmer’s Manual Third Edition, dc(4)
“dc – DC-11 communications interfaces”, February 1973 ↩

UNIX V1 u7.s:canon,
snapshotted 14 September 1972 ↩

The earliest surviving file of any C compiler is Dennis_Tapes
last1120c/sptab.s,
modified 4 June 1972 (some ar archive members may be modified sooner, but
their epoch looks wrong). The earliest surviving .c file is Dennis_Tapes
dmr/cgd/cvft.c, modified 11 June 1972. The only surviving NB .b file is
Dennis_Tapes dmr/fc.b, modified 4 May 1972 (all other .b files are
untyped). Thus, NB evolved into C during May and early June 1972. ↩ ↩2

The C parser in Dennis_Tapes
prestruct-c/c00.c:tdeclare,
modified 6 December 1972, snapshotted 8 December 1972, only accepts the
parenthesis form. ↩

Utah_v4 /usr/c/c02.c:strdec,
modified 10 June 1974, snapshotted 12 June 1974 ↩

nsys /usr/sys/user.h:struct user,
modified and snapshotted 30 August 1973, uses both parenthesis- and
brace-style structs in the same file. ↩

“A Tutorial Introduction to the Language B” by Brian Kernighan,
Bell Laboratories Computing Science Technical Report #8: The Programming Language B,
January 1973 ↩

“The B Programming Language”
by Angelo Papenhoff, first published 2019 ↩ ↩2

$ and \ are unknown characters in the earliest C compiler lexer in
Dennis_Tapes last1120c/nc0/c02.c:statement,
last1120c/nc0/c00.c:symbol,
and last1120c/nc0/c0t.s:ctab,
modified July 1972 ↩

“The Development of the C Language”
by Dennis Ritchie, 1993 ↩ ↩2

“A cursed feature of C in 1972”
by Thalia Archibald, 9 May 2026 ↩

Dennis_Tapes dmr/cgd/cvft.c:putchar and getcha,
modified 11 June 1972 ↩

The earliest surviving C file to use * for pointers is
nsys /usr/sys/ken/iget.c,
modified 30 August 1973 ↩

“Users’ Reference to B”
by Ken Thompson, 7 January 1972 ↩ ↩2

“Hidden Early History of Unix”
by Warner Losh, FOSDEM ‘20, February 2020 ↩

“DRAFT: The UNIX Time-Sharing System
by Dennis Ritchie, circa mid-1971, and transcribed,
documents the b command ↩

“DRAFT: The UNIX Time-Sharing System
by Dennis Ritchie, circa mid-1971, and transcribed,
writes “Currently this signal is generated by typing the ASCII “FS” character
(control \ on model 37 Teletypes)”, implying common use of the Teletype
Model 37, though other terminals are also mentioned. The document itself
uses half line feeds and `, ^, and _ characters, features of the
Model 37. ↩ ↩2

Dennis_Tapes s2 /usr/lib/bilib.a, modified 8 June 1972, and
/usr/lib/libb.a, modified 19 June 1972 ↩

PDP-7 UNIX ind.b
and lcase.b,
pages 2 and 4 of Norman Wilson’s 08-rest.pdf,
undated ↩

“The BCPL Reference Manual”
by Martin Richards, 21 July 1967, and a forward by Dennis Ritchie ↩

“Martin Richards’s BCPL Reference Manual, 1967”,
foreword by Dennis Ritchie ↩ ↩2

BCPL - the language and its compiler
by Martin Richards and Colin Whitby-Stevens, 1979 ↩

“Teletype Model 37 functional specification”
by Thalia Archibald, 2025 ↩

“37 Typing Unit (37P003 and up): Description and Principles of Operation”,
Teletype Corporation, Bell System Practices, Section 574-320-101, Issue 2,
February 1973, section 4.23 ↩

VCF East: “UNIX: A History and a Memoir”
by Brian Kernighan, 15 August 2025, 49:00 and transcribed ↩

PDP-7 UNIX init supported only a generic tty in init.s:init1,
almost certainly the Teletype Model 33, and the machine’s keyboard and display
in init.s:init1. ↩

UNIX V1 u9.s:trcv,
snapshotted 14 September 1972, switched between Teletype Model 37/non-37
parity filtering. ↩

getty from V2 to V5 cycles through speeds and login messages,
starting with TermiNet 300, then Teletype Model 37. Other terminals were
supported by drivers.
Dennis_Tapes s1 getty.s,
modified circa June 1972.
Utah_v4 /usr/source/s1/getty.s,
modified 10 June 1974, snapshotted 12 June 1974.
Dennis_v5 /usr/source/s1/getty.s,
modified 27 November 1974, snapshotted 21 March 1975. ↩

Dennis_v6 /usr/source/s1/getty.c,
modified 13 May 1975, snapshotted 18 July 1975.
Spencer_v7 /usr/src/cmd/getty.c,
modified 5 May 1979, snapshotted 8 June 1979. ↩

“[TUHS] Unix gre, forgotten successor to grep (was: forth on early unix)”
by Ken Thompson, 23 September 2025, and
“[TUHS] A PDP-10 used for UNIX just after the PDP-7?”
by Ken Thompson, 16 January 2026 ↩

“Teletype Model 37 documentation”
by Thalia Archibald, 2025 ↩

This site is open source. Improve this page.

The evolution of UNIX and C is intertwined with the history of character encoding and terminal capabilities, with curly braces serving as a focal point for this development. The initial constraints stemmed from the ASCII 1963 standard and terminals like the Teletype Model 33, which lacked features such as curly braces, vertical bars, and specific arrow characters. To accommodate C syntax requirements for blocks, C89 introduced trigraphs, such as ??< for { and ??> for }, and later C95 introduced digraphs, using such as <% for { and %> for }.

The mechanism for character handling in UNIX involved terminal drivers that translated characters between the internal system representation and the output capabilities of devices like the Teletype Model 33. This escaping system, inherited from Multics, used backslashes to escape control characters like erase (\#) and kill (@), allowing input to be processed line by line. This escaping evolved to handle the limitations of early teleprinters.

Before C, the precursor language B, developed by Ken Thompson, differed in structure and implementation. Early B utilized parentheses for scope, and its handling of memory structures reflected the limitations of early machine word sizes. As UNIX transitioned to the PDP-11, internal hacks were required to manage memory alignment for non-word-aligned globals. Consequently, Dennis Ritchie evolved B into C, where the introduction of structs necessitated the use of braces, marking a significant syntactic shift. The transition involved the introduction of pointers, such as the asterisk, to manage indirection, which further formalized the distinction between arrays and pointers.

The support for curly braces also depended on hardware evolution. While the Teletype Model 33 limited capabilities, the successor, the Teletype Model 37, offered support for the full ASCII character set and various escape sequences, which UNIX programmers eventually adopted. During the development of UNIX, kernels supported drivers that translated between the system's character stream and the terminal's output, relying on the Model 37's capabilities.

The historical context reveals that design decisions from the early 1960s directly affect modern computing practices. For instance, the character set limitations of the Model 33 influenced legacy UNIX sources, which often relied on ASCII subsets. Furthermore, the notation used in early reference materials, such as the "hello, world" examples spanning BCPL, B, and C, illustrates the progression from simple character output to the complex syntax of C, demonstrating how language features were integrated alongside hardware and terminal constraints. The overall trajectory shows a coevolution where linguistic features like braces were developed in response to, and in tandem with, the practical needs of operating systems and hardware.