
A Princeton professor, finding a little time for himself in the summer academic lull, emailed an old friend a couple of months ago. Brian Kernighan said hello, asked how their US visit was going, and dropped off hundreds of lines of code that could add Unicode support for AWK, the text-parsing tool he helped create for Unix at Bell Labs in 1977.
“I’ve tested this a fair amount but clearly more tests are needed,” Kernighan wrote in the email, posted in late May as a sort of pseudo-commit on the onetrueawk repo by longtime maintainer Arnold Robbins. “Once I figure out how … I will try to submit a pull request. I wish I understood git better, but in spite of your help, I still don’t have a proper understanding, so this may take a while.”
Kernighan is the “K” in AWK, a special-purpose language for extracting and manipulating text that was key to Unix’s pipeline features and interoperability between systems. A working awk function (AWK is the language, awk the command to invoke it) is crucial to both the Standard UNIX Specification and IEEE POSIX certification for interoperability. There are numerous variants of awk, including modern derivations with support for Unicode, but “One True AWK,” sometimes known as nawk, is a kind of canonical version based on Kernighan’s 1985 book The AWK Programming Language and his subsequent input.

Copies of The C Programming Language, written by Brian Kernighan and Dennis Ritchie (RIP), in their native campus bookstore setting.
Kernighan is also the “K” in “K&R C,” the foundational 1978 book The C Programming Language he cowrote with Dennis Ritchie that sticks with programmers, mentally and in dog-eared paper form. C’s roots go much deeper. Kernighan had been teaching C to workers at Bell Labs and convinced its creator, Ritchie, to collaborate on a book to spread the knowledge. That book gave birth to “the one true brace style,” the endless debate that goes with it, and the structure underpinning every modern programming language.
Kernighan also named Unix and first demonstrated the “Hello, world” code example. He spoke with Ars Technica’s Richard Jensen for a 50th-anniversary history of Unix.
The onetrueawk repository, where Kernighan appeared in late May, is a relatively quiet place, with 21 contributors, 46 GitHub users watching, and commits coming every few months. As noted by The Register, Kernighan’s Unicode fix came to light mostly because it was mentioned in an interview with the professor by the YouTube channel Computerphile.
https://www.youtube.com/watch?v=GNyQxXw_oMQ
Brian Kernighan speaking with Computerphile about his work, recent and historic.
“It’s always been an embarrassment that AWK only worked with ASCII, or maybe 8-bit inputs, but it doesn’t really handle Unicode at all,” Kernighan tells interviewer professor David Brailsford. “A few months ago, I spent some time working with (laughs) an incredibly old program. I have it at this point where it will actually handle UTF-8 input and output so that you can have regular expressions that, you know, pick up Japanese characters, things like that.”
Kernighan, now 80, offhandedly mentions in the interview that he has also patched something “quick and dirty” to let AWK handle CSV files.