📖 Crafting Interpreters

Nystrom, Robert
  • Many parsing techniques originally came from the AI community, where they were developed to parse human languages. (ch.2)
  • William Wulf coined the term "middle end" (ch.2)
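  • C’s declaration syntax follows "declaration reflects use": a declaration mirrors the expression that uses the variable, e.g. int *p declares that *p is an int (ch.3)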
  • Perl, PHP, and Python all started with reference counting but later added a tracing GC, or at least enough of one to collect object cycles (ch.3)
  • argument—value passed in the function call; parameter—variable holding the value inside function (ch.3)
  • In the lexer, a token's position can be stored as offset + length. Line and column can be recovered later by counting newlines up to the offset. That is slow, but you only need to do it for tokens you actually show to the user (e.g. in error messages). Most tokens never need it, so you save that work for them. (ch.4)
  • Maximal munch: if two rules match, the one that matches more characters wins (ch.4)
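  • The Visitor pattern can be used to work around the expression problem in OOP languages: it lets you add new operations over a fixed set of types without modifying those types (ch.5)
    • In FP, the flipped side of the expression problem (adding new types) can be handled with typeclasses (?)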
  • Novelty Budget (or Strangeness Budget — Steve Klabnik) — balancing familiarity with new features (easy to learn vs more power) (ch.28)
    • that’s why syntactic changes don’t usually pay off
    • similar to idiosyncrasy credit
  • One error recovery technique is adding error productions to the grammar. Since the parser then knows what the user was trying to write, it can show a tailored error message (ch.6, 6.3.2)
  • Early linkers for C only treated the first six characters of external identifiers as meaningful. Everything after that was ignored. If you’ve ever wondered why the C standard library is so enamored of abbreviation—looking at you, strncmp()—it turns out it wasn’t entirely because of the small screens (or teletypes!) of the day. (§20.1.1)
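
The offset + length note (ch.4) above can be sketched roughly as follows. This is a minimal illustration in Python, not the book's code: the Token class and the line_and_column helper are hypothetical names, and the sketch assumes "\n" line endings and 1-based line/column numbers.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Token:
    # Hypothetical token shape: only the offset and length are stored.
    kind: str
    offset: int  # 0-based index of the first character in the source
    length: int


def line_and_column(source: str, offset: int) -> Tuple[int, int]:
    """Return the 1-based (line, column) of a 0-based offset."""
    # The line number is one more than the number of newlines before the offset.
    line = source.count("\n", 0, offset) + 1
    # The column is the distance from the previous newline (rfind gives -1 on line 1).
    last_newline = source.rfind("\n", 0, offset)
    column = offset - last_newline
    return line, column


source = "var a = 1;\nprint b;\n"
token = Token("IDENTIFIER", offset=17, length=1)  # the "b" on the second line
position = line_and_column(source, token.offset)
```

The scan over the source only happens when a position has to be reported, which is the trade-off the note describes.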
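
The maximal munch note (ch.4) above might look like this in a regex-driven scanner. This is only a sketch under assumptions: the RULES table and tokenize function are made up for illustration, and the book's own scanner is hand-written rather than table-driven. When two rules match the same number of characters, this sketch lets the earlier rule win, so the keyword is listed before the identifier rule.

```python
import re

# Hypothetical rule table; every rule is tried and the longest match wins.
RULES = [
    ("LESS", re.compile(r"<")),
    ("LESS_EQUAL", re.compile(r"<=")),
    ("OR", re.compile(r"or")),
    ("IDENTIFIER", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("NUMBER", re.compile(r"[0-9]+")),
]


def tokenize(source):
    """Split source into (kind, text) pairs using maximal munch."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        best = None
        for kind, pattern in RULES:
            match = pattern.match(source, pos)
            # Keep the match that consumes the most characters;
            # the strict ">" means ties go to the earlier rule.
            if match and (best is None or match.end() > best[1].end()):
                best = (kind, match)
        if best is None:
            raise SyntaxError(f"Unexpected character {source[pos]!r} at offset {pos}")
        kind, match = best
        tokens.append((kind, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("orchid <= 3 or x")
```

Here "orchid" is scanned as one identifier rather than the keyword "or" followed by "chid", and "<=" is one token rather than "<" followed by a stray "=".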
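
The Visitor note (ch.5) above can be illustrated with a small sketch. The class names here (Expr, Literal, Binary, Printer, Evaluator) are chosen for illustration and the code is in Python rather than the book's Java. The point is that Printer and Evaluator are two separate operations added without touching the expression classes.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Expr(ABC):
    @abstractmethod
    def accept(self, visitor):
        """Dispatch to the visitor method for this node type."""


@dataclass
class Literal(Expr):
    value: float

    def accept(self, visitor):
        return visitor.visit_literal(self)


@dataclass
class Binary(Expr):
    left: Expr
    operator: str
    right: Expr

    def accept(self, visitor):
        return visitor.visit_binary(self)


class Printer:
    """One operation: render the tree in prefix notation."""

    def visit_literal(self, expr):
        return str(expr.value)

    def visit_binary(self, expr):
        return f"({expr.operator} {expr.left.accept(self)} {expr.right.accept(self)})"


class Evaluator:
    """Another operation, added without editing Literal or Binary."""

    def visit_literal(self, expr):
        return expr.value

    def visit_binary(self, expr):
        left = expr.left.accept(self)
        right = expr.right.accept(self)
        if expr.operator == "+":
            return left + right
        if expr.operator == "*":
            return left * right
        raise ValueError(f"Unknown operator {expr.operator!r}")


tree = Binary(Literal(1), "+", Binary(Literal(2), "*", Literal(3)))
printed = tree.accept(Printer())
value = tree.accept(Evaluator())
```

The cost is the flip side of the expression problem: adding a new node type means adding a visit method to every existing visitor.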
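
The error production note (ch.6) above can be sketched with a toy recursive descent parser. Everything here is an assumption made for illustration: the tiny grammar (no precedence, string tokens), the Parser class, and the wording of the message. The error production is the extra rule in primary() that accepts a binary operator with no left-hand operand, records a specific message, and keeps parsing.

```python
class ParseError(Exception):
    pass


class Parser:
    """Toy parser for: expression -> primary ( ("+" | "*") primary )*"""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.errors = []

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self):
        left = self.primary()
        while self.peek() in ("+", "*"):
            operator = self.advance()
            right = self.primary()
            left = (operator, left, right)
        return left

    def primary(self):
        token = self.peek()
        # Error production: primary -> ("+" | "*") primary
        # The grammar deliberately matches this invalid form so the parser
        # can report exactly what went wrong and then continue.
        if token in ("+", "*"):
            self.advance()
            self.errors.append(
                f"Binary operator '{token}' is missing its left-hand operand."
            )
            return self.primary()
        if token is not None and token.isdigit():
            self.advance()
            return int(token)
        raise ParseError(f"Expected expression, got {token!r}.")


parser = Parser(["*", "2", "+", "3"])
tree = parser.expression()
```

After the run, parser.errors holds the tailored message for the stray "*", while the rest of the input was still parsed into a tree.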