📖Programming Language Pragmatics

Scott, Michael
  • The first C++ compiler was not a preprocessor. It thoroughly analyzed the code and produced C output that was always correct. Any errors were reported by the C++ compiler itself. → The first C++ compiler was not a preprocessor
  • Compiler vs. preprocessor: the difference is in how well each understands the source code. (A preprocessor does not understand the internal structure of the language.)
  • Bootstrapping a compiler; Pascal as a bootstrapped language (p.21)

    • Pascal was bootstrapped via a VM
  • Compilation pass vs. phase. A pass runs to completion on its own: all previous passes have finished before it starts, and the next pass can start only when the current one has finished. Phases carry no such ordering guarantee.
  • BNF was devised for the definition of Algol 60 (p.49)
  • John Backus was also the inventor of Fortran (p.49)
  • Hand-coded scanners are often used in production compilers because they are fast (faster than scanners generated from regexes)
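  • (p.64) In Fortran before Fortran 90:

    ! This is a loop header
    DO 5 I = 1,25
    ! This is an assignment to the variable DO5I
    DO 5 I = 1.25

    Blanks are insignificant, so DO is either a keyword or the start of an identifier. The lexer cannot tell which until it reaches the comma or the period, so it needs arbitrary lookahead.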
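
The ambiguity above can be sketched in a few lines. This is only an illustration, not how any real Fortran scanner is written: the function name classify_do_statement is hypothetical, and the sketch ignores string literals and other complications. It assumes that a comma outside parentheses on the right-hand side of = marks a loop header.

```python
def classify_do_statement(stmt: str) -> str:
    """Classify a fixed-form statement that starts with DO.

    Hypothetical helper: returns "loop" or "assignment".
    """
    # Blanks are insignificant in fixed-form Fortran, so drop them first.
    text = stmt.replace(" ", "").upper()
    if not text.startswith("DO") or "=" not in text:
        raise ValueError("not a DO-like statement")

    # Everything after the first '=' has to be scanned: this is the
    # unbounded lookahead the note refers to.
    rhs = text.split("=", 1)[1]
    depth = 0
    for ch in rhs:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return "loop"        # top-level comma separates the loop bounds
    return "assignment"          # no top-level comma: DO... is an identifier


loop_kind = classify_do_statement("DO 5 I = 1,25")
assign_kind = classify_do_statement("DO 5 I = 1.25")
```

Here loop_kind is "loop" and assign_kind is "assignment"; the two inputs differ in a single character that appears only near the end of the statement.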

  • Bottom-up (LR) parsers were the first to receive the attention of researchers. That’s why the bottom-up form is called “canonical.” (p.72)
  • The class of LR grammars is larger than the class of LL grammars (p.70)
  • Binding time trade-off (p.117)

    • earlier → faster execution
    • later → more flexibility
  • my_proc.x syntax to access a shadowed variable (in Ada)
  • Fortran used to beat C in performance because of aliasing. A C compiler must assume that pointers may alias and thus cannot perform some optimizations. A Fortran compiler can assume there are no aliases. (p.146)
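
A rough illustration of why possible aliasing blocks an optimization. Python lists stand in for C pointers here, which is only an analogy; the function names are hypothetical. The "optimized" version reads b[0] once, which is valid only if a and b are known to be different objects.

```python
def add_twice(a, b):
    """Add b[0] to a[0] twice, re-reading b[0] each time."""
    a[0] += b[0]
    a[0] += b[0]


def add_twice_optimized(a, b):
    """What an optimizer would like to emit: read b[0] only once."""
    a[0] += 2 * b[0]


# Distinct objects: both versions agree.
x, y = [1], [10]
add_twice(x, y)
p, q = [1], [10]
add_twice_optimized(p, q)
same_without_alias = (x[0] == p[0])

# Aliased arguments: the first update changes what the second one reads.
s = [1]
add_twice(s, s)              # 1 + 1 = 2, then 2 + 2 = 4
t = [1]
add_twice_optimized(t, t)    # 1 + 2 * 1 = 3
differs_with_alias = (s[0] != t[0])
```

With distinct arguments both functions produce 21. With the same list passed twice they produce 4 and 3, so a compiler that cannot rule out aliasing has to keep the slower form.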
  • values

                                first-class   second-class   third-class
    passed as param                  +             +              -
    returned from subroutine         +             -              -
    assigned into variable           +             -              -
  • Lisp uses Cambridge Polish Notation. p.225
  • 0.1 is a repeating fraction in binary, so it cannot be stored exactly. For certain values of x, (0.1 + x) * 10.0 and 1.0 + (x * 10.0) can differ by as much as 25%, even when 0.1 and x are of the same magnitude. p.243
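
A small check of the representation issue, using only the Python standard library. It shows that the stored value of 0.1 is not one tenth; it does not attempt to reproduce the 25% figure from the book.

```python
from decimal import Decimal
from fractions import Fraction

# Fraction(float) gives the exact rational value of the stored double.
stored = Fraction(0.1)
is_exact_tenth = (stored == Fraction(1, 10))

# The denominator of any finite double is a power of two, so 1/10
# (whose denominator has a factor of 5) can never be hit exactly.
d = stored.denominator
denominator_is_power_of_two = (d & (d - 1)) == 0

# The familiar consequence: rounding errors show up in simple sums.
sum_matches = (0.1 + 0.2 == 0.3)

# Decimal(float) prints every digit of the stored value.
print(Decimal(0.1))
```

Here is_exact_tenth and sum_matches are both False, while denominator_is_power_of_two is True.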
  • In Ada, “and” and “or” are not short-circuiting. The short-circuiting versions are “and then” and “or else”. p.245
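
The difference can be observed through side effects. The sketch below uses Python rather than Ada, as an analogy only: Python's and short-circuits (like Ada's “and then”), while & applied to two bools evaluates both operands (like Ada's plain “and”). The helper check is hypothetical.

```python
calls = []


def check(name, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return value


# Short-circuiting: the right operand is skipped once the left is False.
calls.clear()
short_result = check("left", False) and check("right", True)
short_circuit_calls = list(calls)

# Non-short-circuiting: both operands are always evaluated.
calls.clear()
eager_result = check("left", False) & check("right", True)
eager_calls = list(calls)
```

short_circuit_calls ends up as ["left"], and eager_calls as ["left", "right"]; both expressions evaluate to False.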
  • Fortran’s DO loops do not allow arbitrary modification of the counter variable. That means the compiler can generate code that pre-computes the number of iterations and iterates faster. (p.264) → Fortran’s specialized for loop can generate faster code

    • Decrement-compare to 0-jump can be performed with a single instruction on many processors.
    • Fortran also requires that the bound be computed only once. (C does not.) (p.265)
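
A sketch of the pre-computed iteration count, written in Python for illustration. The function names are hypothetical. The count follows the rule used for Fortran DO loops, max(int((last - first + step) / step), 0) with truncation toward zero; the loop itself is then driven by a countdown to zero, which is the test that maps onto a single decrement-and-branch instruction.

```python
def trip_count(first, last, step=1):
    """Number of iterations, computed once before the loop starts."""
    if step == 0:
        raise ValueError("step must be nonzero")
    span = last - first + step
    # Divide with truncation toward zero (Python's // floors instead).
    count = abs(span) // abs(step)
    if (span < 0) != (step < 0):
        count = -count
    return max(count, 0)


def do_loop(first, last, step=1):
    """Yield the counter values of a DO-style loop."""
    remaining = trip_count(first, last, step)  # bounds are read exactly once
    i = first
    while remaining != 0:      # the only loop test: compare countdown with 0
        yield i
        i += step
        remaining -= 1


up = list(do_loop(1, 10, 3))
down = list(do_loop(10, 1, -3))
empty = list(do_loop(5, 1))
```

up is [1, 4, 7, 10], down is [10, 7, 4, 1], and empty is []. Because the count is fixed up front, nothing in the loop body can change how many times it runs.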
  • Duff’s device: a C hack that interleaves a loop with a switch statement (by Tom Duff of Lucasfilm)
  • IEEE 754-2008 defines decimal floating-point types. The mantissa and exponent are still binary, but the exponent is a power of 10 (not 2). The decimal type has greater precision but smaller range. p.307
  • pointer != address (p.378)

    • a pointer is a high-level construct that can be implemented as an address (or in some other way)
  • formal parameter—parameter; actual parameter—argument. (p.411)
  • Eiffel has named constructors §10.3.1 p.496
  • Constructor as method on the class (Eiffel / Smalltalk) §10.3

    • (Rust does something similar but does not introduce class as an object.)
  • Reverse assignment (Eiffel’s ?= operator; C# offers the similar as operator) §10.4

    x ?= y;

    x = dynamic_cast<typeof x>(y);

    Assign y to x iff the types are compatible; otherwise, assign null. (The dynamic_cast line is the rough C++ analog, with typeof used as pseudocode.)
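
The same idea as a minimal Python sketch, for comparison only. The helper assign_attempt and the Shape and Circle classes are hypothetical; the sketch checks the dynamic type with isinstance and yields None when the types are not compatible.

```python
from typing import Optional, Type, TypeVar

T = TypeVar("T")


def assign_attempt(target_type: Type[T], value: object) -> Optional[T]:
    """Return value if it is an instance of target_type, else None."""
    return value if isinstance(value, target_type) else None


class Shape:
    pass


class Circle(Shape):
    pass


s: Shape = Circle()                        # static type Shape, dynamic type Circle
c = assign_attempt(Circle, s)              # compatible: c is the same object as s
n = assign_attempt(Circle, Shape())        # incompatible: n is None
```

The check happens at run time, which is why this kind of assignment needs type information to be available in the running program.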

  • fragile base class problem (Java) (p.512)

    If code is compiled against a newer version of the class but executed with the older one, calling old methods might fail if the language uses static dispatch (because member offsets might have changed)

    • (related to pimpl in C++)