
@kuotsanhsu
Last active June 25, 2025 02:43

Formal Studies and Lean 4

  • EWD 288: Concern for Correctness as a Guiding Principle for Program Composition
  • EWD 690: The pragmatic engineer versus the scientific designer

    "Poor Man's Induction": the "proof" that 60 can be divided by all smaller natural numbers: you just try! 1? Yes. 2? Yes. 3? Yes. 4? Yes. 5? Yes! 6? Yes!! OK.... Let us try a random example. 10? Yes!!! 12? Yes!!!!!!! Obviously 60 can be divided by all smaller natural numbers.

  • EWD 898: The threats to computing science
  • EWD 1036: On the cruelty of really teaching computing science
  • EWD 1305: Answers to questions from students of Software Engineering

    Medical researchers are not required to suffer from the diseases they investigate.

  • Leslie Lamport, How to Write Software With Mathematical Perfection
  • Edsger Dijkstra interview on Dutch TV
  • Edsger Dijkstra, Reasoning about Programs

We find that at the heart of STEM is math because math is a common framework for reasoning. Math contains

  1. the language for reasoning,
  2. many useful facts, called theorems, that are generally applicable,
  3. verifications of such facts, called proofs,
  4. a deep context behind each theorem and proof providing motivation, and
  5. ongoing progress, marked by famous unsolved problems and conjectures.

Physicists formulate theories in math and inherit the deep context of existing mathematical knowledge. As the most intense users of math, physicists demand greater and greater advances in math. However, the urgency of certain beta features of math forces physicists to invent hacks. Since important features of math only come out of beta after decades, the temporary hacks become standards in physics. Famous physics textbooks most often don't upgrade their mathematical knowledge until a century later, leading to the sad state of modern physics for each generation. This tension between the development of math and physics is most notably illustrated by Dirac versus von Neumann. As Bruno Latour quipped, "We have never been modern!"

In 1945 Harish-Chandra came to see Dirac in Cambridge. Harish-Chandra became convinced that theoretical physics was not the field he should be in. He had found an error in a proof by Dirac in his work on the Lorentz group. Dirac said "I am not interested in proofs but only interested in what nature does." Harish-Chandra later wrote "This remark confirmed my growing conviction that I did not have the mysterious sixth sense which one needs in order to succeed in physics and I soon decided to move over to mathematics." Dirac did however suggest the topic of Harish-Chandra's thesis, the classification of the irreducible infinite-dimensional representations of the Lorentz group. See Dalitz & Peierls 1986

-- Wikipedia, Representation theory of the Lorentz group

Math was never free of problems itself. The foundation of math (the "language for reasoning" part of math which we call logic nowadays) is to math what math is to physics. Math is generally not rigorous according to logical standards. In comparison, the rigor of physics is almost like alchemy. "Physicists have strong arms not because they invented the atom bomb but because they are constantly hand-waving." Logic is the linguistic and computational aspect of math. Interestingly, this is also what modern computer science is about.

One can claim that math is essentially computation. The origin of math is the counting of things. Development in math is basically the development of the concept of numbers and techniques to compute them.

  1. Integers and the various arithmetics (a form of computation) we learn in elementary school.
  2. Integral ratios (rational numbers) and their arithmetic in the form of Greek geometry, which is a subset of the construction (a form of computation) of new points with compass and straightedge (the tools are realizations of axioms and inference rules).
  3. Real numbers, complex numbers, functions, and their arithmetic really pushed mathematicians to consider the derivation of proofs (a form of computation), because facts about real numbers, complex numbers, and functions can no longer be demonstrated and used concretely, unless we stick to the smaller constructive/computable universe of Bishop/Brouwer.
  4. Abstract algebra and fundamental issues regarding infinity force mathematicians to take the rules of logic seriously: a computational study of the various logics themselves.

So we progress from computing finite concrete objects, to computing abstract representations of finite objects, to computing infinities that are expressible recursively/inductively by finite rules/constructors, and finally to computing the rules of reasoning/logic (which are just another form of computation) themselves. This whole framework of computation is traditionally called logic.
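As a small Lean 4 sketch of that last step: the natural numbers are an infinity generated by two finite constructors, and facts about all of them are verified by finite induction (`N` here is a toy re-definition for illustration, not the standard `Nat`):

```lean
-- A toy copy of the naturals: an infinite totality from two finite constructors.
inductive N where
  | zero : N
  | succ : N → N

-- Addition, computed by recursion on the constructors.
def N.add : N → N → N
  | n, .zero   => n
  | n, .succ m => .succ (N.add n m)

-- A fact about all (infinitely many) naturals, verified by finite induction.
theorem N.zero_add : ∀ n : N, N.add .zero n = n := by
  intro n
  induction n with
  | zero => rfl
  | succ m ih => simp [N.add, ih]
```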

Computer science was initially interested in the specific mechanisations of computation. Prior to CS were human computers, well-versed in abaci and slide rules. Hence, CS used to be the knowledge of programming machines to carry out computation efficiently with only a few guiding principles. Over time, it has become synonymous with logic as

  1. the guiding principles of CS grew in sophistication, and
  2. a mechanisation of logic was discovered under the name of the Curry-Howard isomorphism.
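The Curry-Howard correspondence in item 2 can be seen directly in Lean 4, where proofs literally are programs (a minimal sketch):

```lean
-- A proof of p ∧ q is a program constructing a pair (∧-introduction in NJ).
example (p q : Prop) (hp : p) (hq : q) : p ∧ q := And.intro hp hq

-- ∧-elimination is projection; swapping the pair proves commutativity.
example (p q : Prop) (h : p ∧ q) : q ∧ p := ⟨h.right, h.left⟩
```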

Programmers resemble physicists in their reliance on hacks. Instead of getting their logic straight with proven type-theoretic methods (dependent types, (co)inductive types in Coq, Agda, Lean, etc.), programmers rely on the fragile OOP paradigm, which also lacks expressiveness. Ideally and practically, programs should be certified. A certified program is one whose requirements are formally specified and whose conformance to said requirements is formally verified.
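A certified program in this sense can be tiny. Here is a hedged Lean 4 sketch in which `IsMax` is the formal requirement and `max'_spec` is the machine-checked conformance proof (the names are illustrative, not from any library):

```lean
-- Formal requirement: m is one of a, b and bounds both.
def IsMax (a b m : Nat) : Prop := (m = a ∨ m = b) ∧ a ≤ m ∧ b ≤ m

-- The program.
def max' (a b : Nat) : Nat := if a ≤ b then b else a

-- The certificate: the program meets the requirement, checked by the compiler.
theorem max'_spec (a b : Nat) : IsMax a b (max' a b) := by
  unfold IsMax max'
  split <;> omega
```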

Often, programmers (like physicists) hinder their productivity with less rigor. In the short term, less rigor looks like faster time to ship (software for programmers, papers for physicists). In fact, it merely defers problems to the future (tech debt for programmers, quackery and dead ends for physicists), which only exacerbates existing ones. Particle physics and astrophysics are especially prone to this; recall the sudden surge, then retraction, of papers due to anomalies in LHC data and telescope imaging. Why does this sort of quackery happen? Because those physicists are not doing science, just publishing papers (most understandably, due to the publish-or-perish cut-throat competition for academic positions). At a higher level, it is due to prevalent misunderstanding and misuse of math in physics. Problematic devices like quantum normalization (which physicists proudly cite as an example of not being hindered by useless mathematical rigor) are exactly the culprit of all this nonsense. One cannot rely on logic to know whether to use quantum normalization; one merely guesses, and physics is full of guesses. Guessing is also a core part of mathematical progress, usually communicated as conjectures. The difference between mathematicians and physicists is that mathematicians admit that conjectures are not proven facts and willingly replace unproven methods with proven ones once found. Physics (most significantly, particle physics) stagnates because physicists rely on their often incorrect intuition. Intuition is equally important to mathematicians, but mathematicians take the effort to hone and improve their intuition with proofs and by being honest about what is a conjecture and what is a fact.

The same goes for programmers. Programmers create so many bugs because they don't want to be hampered by mathematical rigor. It is often claimed that programming languages are basically the same. That is because most programmers don't know Coq/Agda/Lean, so most languages look like variations of OOP. Only with certified programming can requirements be communicated through code and checked by compilers. Instead, programmers waste time on unnecessary unit tests, code reviews, and agile/scrum, and still produce tons of bugs. (Not that all of these can be 100% replaced by certified programming, but perhaps 80%.) Such a huge waste of time and effort, and still lots of problems.

Physicists and programmers should simply learn to be rigorous. Rigor means being honest, explicit, and up-front. All are qualities that lead to much faster development at a much lower cost with a much higher quality.

  • Why do physicists hate math but indulge in theoretical physics? Because physicists find theories useful, just not the formal theories they cannot understand. Physicists are lazy and never put in the effort to learn math seriously. Hence, they miss out on a lot of insight and results from math. Instead, physicists reinvent wheels that are merely half-baked hacks. Wow, doesn't that sound like programmers?
  • Why do engineers loathe theories as impractical? Because engineers are as lazy as physicists. Furthermore, engineers are short-sighted from the various menial tasks that don't require deep understanding of anything, or, more likely, their curiosity is compromised by the various business people who take an indifferent stance towards everything that has nothing to do with short-term money making. As the saying goes, "it's not that Oracle doesn't care about code quality. It's that Oracle is indifferent towards anything except profit."

However, papers in theoretical CS (the AI/ML fad doesn't count as theoretical CS) and logic are actually a gold mine of great ideas and results. Linear logic came out as early as 1987! Why haven't programmers heard of it?

Good theories are the culmination of our understanding. Bad theories are useless writings produced just for publication. People without serious mathematical training cannot tell the two apart and are so frustrated by the math that they conveniently declare all theory (even the good ones) useless.

Goal

The good news is that anyone can become mathematically sophisticated by simply taking math seriously. The path of least resistance is to take courses from your university's math department. Even if that is not feasible, the bottom line for achieving mathematical understanding is to study math honestly, which leads to the use of proof assistants. A particularly convenient one is Lean 4. Don't follow the advice of common physicists and programmers; think things through and be serious. Please understand that confusing physics textbooks/papers (like the Bible of mechanics, Goldstein, or even Quantum Fields and Strings, in which Pierre Deligne did his best to aid the other coauthors) and spaghetti code are both due to the authors' lack of mathematical rigor.

  • Readers can claim complete understanding by formalizing the textbooks/papers/programs in Lean 4.
  • Managers can hire anyone of any age and any background, as long as the person can program in Lean 4. This is the ultimate form of equality and productivity.

Static analysis

The fact that programmers can write code to meet requirements at all is due to their ability to statically analyze the whole program before it runs. So a complete static analysis like formal verification is entirely possible; we just need better tools. If the industry adopts certified programming, not only will costs go down, but CS students will also find their education actually useful. So much of a CS education is about acquiring the sophistication to do certified programming. That this effort goes to waste upon entering the industry is a shame.

Graphics is the most underspecified aspect of programming. Take a look at the specifications of OpenType, TrueType, MathML Core, and CSS3: they are specified like Euclid's Elements. However, programmers don't have a way to program graphics in an axiomatic fashion with logical constraints (as in CAD). Logical constraints are required for responsive web design; CSS flex and CSS grid should be formalizable. Instead, programmers rely on unit testing with Selenium, Puppeteer, and the like, which are merely exploratory methods or last resorts.

Approach

Start with classical and proven stuff. Implement those first as preparations.

  • NJ
  • LK
  • System F
  • CoC
  • CiC
  • Hoare logic (TLA+)

Then, move to substructural and higher theories.

  • Linear logic
  • Temporal logic
  • Separation logic
  • HoTT/cubical
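Some of these systems can already be touched in Lean 4 today. A tiny sketch contrasting intuitionistic (NJ-style) reasoning with a step that genuinely needs classical logic:

```lean
-- NJ proves this direction constructively.
example (p : Prop) : p → ¬¬p := fun hp hnp => hnp hp

-- The converse requires classical reasoning (excluded middle).
example (p : Prop) : ¬¬p → p := fun hnp =>
  (Classical.em p).elim id (fun hp' => absurd hp' hnp)
```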

Rigor is honesty

Rigor is honesty, and the scientific method is the honest method. However, there will be a profitable arbitrage as long as enough people are acting honestly. Hence, honesty is practically discouraged (especially in finance). See how many pointless or even fraudulent papers are published each year. See how programmers disregard correctness. See how physicists make all sorts of unverifiable claims. When you think about it, honesty is really a luxury. Most people don't start out dishonest, but they are bent by reality, and they become the reality for generations to come.

Transcribed from "Yes, Prime Minister!":

  • Bernard: I want to have a clear conscience.
  • Sir Humphrey: A clear conscience?
  • Bernard: Yes!
  • Sir Humphrey: When did you acquire this taste for luxuries?

Formal methods are but methods to speak with clarity and precision

However, accuracy comes from integration tests.

Physics experiments are like software testing

Experiments and testing only demonstrate a tentative inability to prove falsity; the claim could very well be false. "The claim" here refers to the assumptions of physical theories and to software correctness.

Knowledge is too vast and too shallow

Oftentimes, what we mean by competence is better knowledge. However, depth of knowledge is often disregarded, although people often dishonestly claim to encourage deep knowledge. What people really want is proficiency in execution. The necessary and sufficient condition for proficient execution is breadth of knowledge that is not necessarily deep. Thus deep knowledge is merely a potential for obtaining broad knowledge.

Generating shallow knowledge from deep knowledge is robust but time-consuming, so a more practical approach is to focus on collecting as much shallow knowledge as possible. Gaining deep knowledge then becomes a curiosity-guided meritocracy which is often deemed optional. Is this a bad thing?

Reflecting on my experience, the real dilemma lies in the fact that deep knowledge is often attained through extensive exposure to shallow knowledge (emphasis on "extensive"). Therefore, we can't shortcut the menial task of collecting shallow knowledge, which implies that I shouldn't take a demeaning attitude toward what I call shallow knowledge; knowledge is never shallow. Formal methods, as a form of deep knowledge, should be applied extensively but not necessarily all-encompassingly.

As mentioned, deep knowledge does not guarantee proficiency in execution, and vice versa. One shouldn't aim for depth mindlessly nor be satisfied by one's proficiency in execution. To quote 王陽明: "知行合一".

Raise the bottom line

Being pragmatic means compromising on the bottom line. Formal methods are a way to raise the bottom line.

Limits of languages

Being proficient in a certain language is the ability to speak one's mind in said language. Ideally, one's mind is not limited by the language one uses, but reality is not ideal. Neuroplasticity decreases as one ages, and memory consolidation is due to neuronal autophagy (which is somewhat irreversible). The use of language is largely memory consolidation. In other words, our minds become ever more limited by the use of language. Wittgenstein touches upon this issue from a philosophical standpoint:

The limits of my language mean the limits of my world.

-- Tractatus Logico-Philosophicus

The same argument applies to programming languages and logic. It is very important to choose a suitable language that allows one to appropriately express the task at hand. It is curious to hear OOP programmers claim that all programming languages are basically the same. What they mean by "all" are the very limited languages they are exposed to which are all designed for OOP, whose similarities are merely superficial. Those OOP programmers claim proficiency in a new language by demonstrating their ability to perform convoluted inheritance and create AbstractSingletonProxyFactoryBeans.

Fuzzing as a Way to Guide Development

Let me describe what a practical development session looks like.

  1. Formally specify portions of the requirements.
  2. Formally construct a prototype that is
    • formally verified to meet the requirements in item 1 and
    • informally meeting the not-yet-formally-specified ("other" for short) requirements.
  3. Write a verifier of the other requirements.
  4. Fuzz the prototype against the requirement verifier.
  5. Let the code coverage of the requirement verifier guide you.
    • The code coverage of the requirement verifier should hint at what is missing in the prototype.
    • Try to formalize the missing parts in the prototype.
  6. Reiterate from item 1 while extending the scope of formalized requirements.

This process is healthy in the following senses:

  1. The up-front design is small; up-front design should neither be totally discarded, as in agile methods that went too far, nor be as grand (but uninformed) as in traditional waterfall methods.
  2. Both the solution and the requirements are informally explored and updated as we progress.
  3. Correctness is guaranteed at each stage with formal methods, while exploration proceeds with test-driven design.

This procedure illuminates the true nature of tests, listed below. Point 3 is particularly pragmatic, just as threat models made security pragmatic.

  1. Tests drive the design.
  2. Formal methods guarantee correctness.
  3. Tests are only used to approximate correctness (by falsification just like the scientific method) when formal methods are too hard to apply.

Cheap Taxonomy

Suppose we have the original conception of "numbers" as natural numbers. Centuries of deeper investigation suggest a more fundamental understanding of "numbers" as a collection of objects associated with two binary operations that are associative, commutative, distributive, etc., owing to the discovery of negative integers, rational numbers, real numbers, complex numbers, and p-adic numbers. One might be tempted to call the more fundamental (general) notion "numbers" as well. However, other people would call you out for bending the definition at will. So the real problem one faces is taxonomy. Instead of calling them all "numbers", one adds essentially meaningless adjectives to the word "number". What is so natural, negative, rational, irrational, real, imaginary, or complex about numbers? They are but cheap taxonomy to keep dogmatic people quiet, and those dogmatic people call those who were forced to invent cheap taxonomy self-indulgent.

However, not all taxonomy is cheap. "Open sets" and "closed sets" are poorly named, as their non-exclusivity must be stated explicitly for clarity, leading to endless confusion ("clopen sets" anyone?) which is a great cost. Ironically, many people prefer to be implicit and ambiguous in this case because making their difference explicit seems "confusing"; how typical of programmers and physicists.

General Availability

Attempting correctness without formal methods is just futile.

https://news.ycombinator.com/item?id=31544600

This is kind of like AI. When something succeeds in general practice, it is no longer "formal verification". Now it's "robust type systems" and "static analysis tools". Those provide formal verification of some aspects of the code. And that's great! It's progress!

https://news.ycombinator.com/item?id=31547769

Don't confuse resignation and ignorance with happiness. People are used to computer systems just breaking and being the root of various problems (e.g., identity theft, privacy leaks, systems that just don't work some days for no apparent reason, and so on). The fact that they accept this flaky and unreliable state as the status quo doesn't mean they're happy with it - they just don't understand that better is actually possible

https://news.ycombinator.com/item?id=31548198

Currently, Best Practices aren't really a "thing," in software development, and it shows. People like Steve McConnell are not really respected, and a general culture of "move fast and break things" is still pervasive. Engineers flit around companies like mayflies, techniques and libraries come and go, and there's an enormous reliance on dependencies with very little vetting. We spend so much time, trying to perfect our tools, without trying to perfect ourselves.

https://www.reddit.com/r/programming/comments/bm91y/if_builders_built_houses_the_way_programmers/

If builders built houses the way programmers build programs, the first woodpecker to come along would destroy civilization.

No More Pseudocode

We have Algol, but who uses Algol as pseudocode?

  1. The high-level description (in common math notation) of algorithms,
  2. their correctness and complexity analysis in a rigorous fashion,
  3. the generation of executable code,
  4. and the distribution of the bundled algorithms, proofs, and executables

can be fully covered by a single language and its compiler, like in Coq/Lean/Agda.

  • No need to re-invent pseudocode that varies in notation and formality.
  • No need to separate proofs from the algorithm specifications where the proofs are mostly repeating the algorithms.
  • No need to implement the algorithms separately which provides no guarantee of conforming to the specifications.
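A minimal Lean 4 sketch of all four points at once: the algorithm, a rigorous proof about it, and executable code live in one artifact checked by one compiler:

```lean
-- The algorithm, in notation as high-level as any pseudocode.
def rev : List α → List α
  | [] => []
  | x :: xs => rev xs ++ [x]

-- Its correctness property, proved rather than hand-waved.
theorem rev_length (xs : List α) : (rev xs).length = xs.length := by
  induction xs with
  | nil => rfl
  | cons x xs ih => simp [rev, ih]

-- The very same definition is directly executable.
#eval rev [1, 2, 3]  -- [3, 2, 1]
```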

Formal Competitive Programming

Let's compare LeetCode (LC) with CodeForces (CF).

  • LC provides input as proper objects in the selected programming language with all the type information; CF provides plain-text input whose structure is informally described and must be consumed through standard input.
  • LC accepts output as a proper object in the selected programming language; CF accepts plain-text standard output whose structure is informally described.
  • Evaluation of solutions is done by submitting the source code to their servers. Compiling and executing arbitrary code is a risk they take.
  • Performance tests are done by measuring actual program executions. No guarantee that the asymptotic bounds are met, as optimizations could circumvent them. Performance measures between languages are not comparable.
  • Overall, LC is a much better experience, but CF has more interesting problems and allows for I/O optimizations.

Consider solving the same algorithmic puzzle with a proof-assistant.

  • The puzzle and all its requirements (including asymptotic bounds, input/output format) are expressed as formal logic that can be consumed directly.

  • Input and output are both proper objects in the proof-assistant with all the type information.

  • The normalized proof is submitted.

    • The server only checks whether the proof is correct, which requires no execution privilege and very few resources. Since the proof covers conformance to the asymptotic bounds, correctness of the proof implies asymptotic performance, which is the only performance criterion of the puzzle.
    • If the underlying type theory of the proof-assistant is strongly normalizing, any solution will be identical after normalization. This prevents any potential malicious side-effects of checking arbitrary proofs; the server simply checks if the submitted proof term is identical to the static solution on the server. This even prevents denial-of-service by blocking requests whose length is different from the solution. Better yet, one can just compare the hashes of the proofs.
  • One solves the problem with complete confidence as long as the solution type-checks without error locally.

  • One enjoys the various guarantees made by procedures in the standard library. E.g., std::sort carries the proof of

    1. the type of the elements of the input having a properly-defined partial ordering,
    2. having $O(n\log n)$ time complexity,
    3. the output being non-decreasing in the partial ordering provided,
    4. the output being a permutation of the input (if sorting results in fewer elements, it's called Stalin sort),
    5. whatever further guarantees like sorting stability, etc.
  • One leverages existing theorems in the standard library. E.g., the master theorem.

  • One can finally solve problems without all those menial distractions in LC or CF.

  • One gets a taste of how programming should be unlike the sad current state in the industry.

  • One can prove to

    1. management,
    2. so called senior programmers that don't know any better,
    3. or actual senior programmers that know a lot better but lack evidence to persuade management

    that "programming can be much more cost-effective with formal methods."
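The guarantees listed above for a certified std::sort could be packaged as a type. A hedged Lean 4 sketch follows: the structure and field names are made up for illustration, though `List.Pairwise` and `List.Perm` do exist in Lean/Mathlib:

```lean
-- A hypothetical certified-sort result: the type itself carries the contract.
structure SortedPerm (le : α → α → Prop) (input : List α) where
  output : List α
  sorted : output.Pairwise le   -- non-decreasing in the provided ordering
  perm   : output.Perm input    -- a permutation of the input: no Stalin sort
```

Time-complexity guarantees would additionally require a cost model, which plain Lean terms do not track; that part of the contract is beyond this sketch.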

Really, with formal methods, it takes less time, less money, and fewer people to maintain or develop a piece of software. Furthermore, the benefits of formal methods compound like interest: the wider the adoption of formal methods, the easier and faster further adoption becomes. Best of all, formal methods can be adopted gradually, like gradual typing in TypeScript, so adopting them impedes no existing plans during the transition. The most direct impact would be far fewer regressions, bugs, tests, and code reviews. Much more time could be spent on the actual design of the product.

Formal methods also enables a new programming paradigm: programming by large scale code transformation. We can finally automate the creation and modification of code itself. This is a true stepping-stone towards low-code/no-code and code-writing AI. In the future, we really don't even have to bother with writing code ourselves if we have a formal foundation for better techniques to build on and eventually replace.

  1. Humans are better at managing formal methods than informal methods. As long as we program manually, we should adopt formal methods quickly.
  2. Machines are potentially better at managing formal methods than humans. Then, we should find ways to delegate programming to machines.
  3. Together, humans and machines evolve better methods for programming. I believe humans and machines will eventually transcend formal methods, but the transcendence can only be achieved if most software is already built with formal methods. Thus, formal methods are necessary during the process.

Never mind point 3, point 1 is enough reason to adopt formal methods.

Distinction, Clarity, Unification

  1. (見山是山) We start thinking in a unified but naive way.
  2. (見山不是山) At some point, ambiguity and confusion will arise. We clarify by making distinctions.
  3. (見山還是山) After we have attained clarity, the imposed distinction will seem artificial and unnecessary, and we will return to the original unified point-of-view but no longer naive.

小衍 大衍 天演

Correctness vs Productivity

"Productivity without correctness" is an oxymoron. The only sensible way to interpret "productivity without correctness" is "damage". "Correctness without guarantee" is another oxymoron; "correctness without guarantee" should be interpreted as "lies". In fact, guaranteed correctness should be the greatest multiplier of productivity. Without guaranteed correctness, time and money are just wasted in endless verification of stated/unstated behaviors and endless patchwork.

People complain about "fighting the compiler". However, if you are not fighting the compiler, you are either lying with your code or fighting your customers, managers, colleagues, and your future self. The possibility of fighting the compiler instead of other alternatives should be considered godsend.

See how insensible the TypeScript team is regarding ownership. No wonder TypeScript is dubbed WAIScript, where WAI stands for working-as-intended. So much JS/TS/Java logic relies on invalidating objects or preventing concurrent access to them.

  • The builder pattern. The intermediate objects returned by the builder methods shouldn't be referenced at all. Every one of its methods, including build, should move itself (invalidating it after consumption). It should be specified like this:
    struct ToBeBuilt;
    struct Builder;
    impl Builder {
      fn add1(mut self, property1: Property1) -> Self { ... }
      fn add2(mut self, property2: Property2) -> Self { ... }
      ...
      fn build(self) -> ToBeBuilt { ... }
    }
  • Model stuff like transferable objects sent to Worker threads
  • Ownership is really much more than memory management. It's the same misconception people have with destructors. Destructors are not only about freeing objects, but about the more general concept of RAII, e.g., closing files and lock guards. Thus, to dismiss destructors in a garbage-collected language is to do programmers a disservice. If we have destructors, do we still need ownership? Yes, the previous items describe some use cases of ownership where destructors cannot help; another prominent case is preventing "double free". Without ownership, destructors can be called multiple times. With Rust, we can move the ownership of the destroyed object to the moon:
    struct Something;
    impl Something {
      fn destruct(self) { ... }
    }
    The pains of C++ come from lacking ownership. Besides calling destructors more than once, consider iterator/reference invalidations that occur when methods like std::vector::push_back are called.

Also, don't think of references/pointers as merely optimizations to prevent copy. Moreover, the lack of value semantics in garbage-collected languages (where everything besides primitive types are references/pointers) is even worse. The terms "value" and "reference" only make sense with regards to ownership. Values own; references borrow.

  • The possibility to prevent copy with references is merely a convenient side-effect. The mindset of preventing copy with references is premature optimization.
  • The lack of value semantics in Java prevents the possibility of const-correctness (deep immutability) and idiomatic deep-copy.

Reference counting is problematic in the face of circular chains of references. To completely eliminate circularity, we need inductive types (or is stringent ownership enough? Think about Rust without the possibility of being unsafe). To introduce circularity in a controlled manner, we need lazy evaluation, as in tying-the-knot in Haskell.

Programming is All about Control

Not only about controlling the machine, but also about controlling the way one constructs programs (which should be the emphasis). We shouldn't complain about fighting the borrow checker in Rust. We should instead rush to create borrow checkers for our daily languages that lack one, because we want a borrow checker to aid us in keeping our programs from going wrong. The so-called high-level modern languages are not only low-level but backwards. Even proof assistants like Coq/Agda/Lean are low-level, considering the amount of type-theoretic primitives one has to deal with even for trivial tasks.

Programming is about avoiding unnecessary work (code quality is subjective)

What is commonly referred to as code quality or readability is very subjective and doesn't really have to do with the actual quality of code; PEP8 anyone? Code reviews often become bikeshedding or even toxic fights.

Objectively, correctness comes first, then performance. Correctness is hard to achieve because code is often doing too many extra things (side-effects/regressions) that we as programmers might not even be aware of; even if we are aware of the side-effects, we might not have a way to reliably avoid them. A program is slow only because it is performing unnecessary tasks, like waiting for a lock when the operations can be made async (sensibly; this doesn't mean forcing everything to be async nor throwing locks away). Algorithm design is to identify and remove all those unnecessary things. Algorithm design is hard because all those unnecessary things are hard to identify, and removing them requires deep understanding of the problem at hand (e.g., it is not quite so easy to prove the cancellation law of multiplication for natural numbers). Hence, avoiding unnecessary work in a guided way is what certified programming is all about. We make sure the restrictions we want to subject ourselves to while programming are reliably enforced by the compiler (proof assistant).
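The cancellation law just mentioned is a one-liner to state in Lean 4; the genuinely hard part is proving it from the Peano axioms alone. The sketch below sidesteps that by leaning on the core lemma `Nat.eq_of_mul_eq_mul_left`:

```lean
-- Cancellation of multiplication on ℕ: if c ≠ 0 and c * a = c * b, then a = b.
example (a b c : Nat) (h : c ≠ 0) : c * a = c * b → a = b :=
  Nat.eq_of_mul_eq_mul_left (Nat.pos_of_ne_zero h)
```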

Formal methods scale correctness whereas unit-testing and code reviews don't scale nor guarantee correctness. To save time and lower cost, consider writing all code in Coq/Lean/Agda first, and only switch to ASM/C/C++/Rust/Ada as a mature optimization. Ada has an edge here, as there is Ada/SPARK that glues the Coq specifications to actual Ada code.

The cost of formal methods is grossly overestimated by people who only apply them as an afterthought. As Dijkstra observed, formal methods should be applied from the design and construction phase onward. That way, proofs will likely grow at most linearly. In my experience, most proofs are either inferred as terms by the compiler (e.g., typeclass instances), merely reflections (e.g., most proofs in HoTT are just refl and id), or very simple inductions.
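
To illustrate the "merely reflections" case, in Lean 4 many facts are closed by `rfl` alone, because the kernel evaluates both sides and sees they agree:

```lean
-- Proofs by reflection: nothing to argue, the checker just computes.
example : 2 + 2 = 4 := rfl
example : List.map (· * 2) [1, 2, 3] = [2, 4, 6] := rfl
```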

What we mean by readability is how easily people reading the code can fully understand the code and reason about it without executing the code. Cosmetic guides like PEP8 could be beneficial to a very limited extent. Ironically, stylistic refactoring could introduce regressions, so there is a risk to even such shallow refactoring. Moreover, no stylistic refactoring can make Dijkstra's shortest path algorithm implemented with Fibonacci heaps and union-find any easier to follow; the core logic is simply non-trivial.

Note that "formally" means being checked by a proof assistant.

Readability is achieved by not leaving out information like previous dead ends, obscure behaviors of third-party libraries, etc. Modern programming is an extremely lossy compression of deep insights into problems down to code. Of course, one can write extensive documentation, but it easily falls out of sync with the code and is not checked by the compiler. Proof assistants could solve this "lossy compression" issue. This is the real readability criterion. Take, for instance, Newton's root-finding algorithm:

  1. Argue in logic with a proof assistant. Instead of pen and paper, which is error-prone and easily lost, let the proof assistant help:

    • Formulate the problem: formally specify what it means to find the root of a differentiable function.
    • Solve the problem at a high level: prove that the tangent-line-root iterations converge to the answer.

    In this way, our math can be version-controlled, verified by a proof assistant, and be reused in the implementation as code.

  2. Require the code that will actually run to be correct, by formally verifying that every aspect of the code meets the formal specification of our mathematical solution. Suppose we implement the naive translation of the tangent-line-root iterations into floating-point code:

    • Show formally that our code is faithful to the high level mathematical solution.
    • Show formally that our floating-point solution has an acceptable error bound, the so-called stability condition.

In fact, if we program in a proof assistant that supports program extraction, most proofs will simply be trivial reflections, i.e., there is not much to prove, because the programs are constructed correctly by following the structure of the formal specification.
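
The artifacts of step 1 can be sketched in Lean 4 as follows. `IsRoot` and `newtonStep` are illustrative names, not from any library; a real convergence proof would be stated over the reals with differentiability hypotheses rather than over `Float`.

```lean
-- What it means to find a root (the specification of the problem).
def IsRoot (f : Float → Float) (x : Float) : Prop :=
  f x = 0

-- One tangent-line-root iteration: follow the tangent at `x` to its root.
def newtonStep (f f' : Float → Float) (x : Float) : Float :=
  x - f x / f' x
```

The high-level convergence theorem of step 1 and the error-bound obligations of step 2 would then be stated against these definitions, version-controlled alongside the code.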

Often, fast but complicated algorithms and data structures are discouraged from being implemented because they can easily become buggy and brittle. That is hocus pocus; just use a proof assistant. The resulting program, as complicated as Dijkstra's shortest path algorithm implemented with Fibonacci heaps and union-find, will be even less buggy and less brittle than a singly-linked list implemented in C. What are modern programmers thinking? Just learn some logic!

Every Aspect of Programming is UX Design

Programming language design is UX design. ABI and API are also UX; this is why ownership, dependent types, and proof-carrying programs matter.

Exponential Explosion

The fact that you can reason about a program before you even run it, which is the definition of programming, doesn't involve exponential explosion at all; otherwise, your brain would crash. In fact, your brain is cleverly applying definition-by-recursion and proof-by-induction, so just write that down as code.
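
A hedged Lean 4 sketch of "just write that down": addition is defined by recursion on its second argument, so `0 + n = n` is proved by the matching induction, one constructor case at a time, with no explosion.

```lean
-- Definition-by-recursion in `Nat.add` mirrors proof-by-induction here:
-- the base case computes, the step case reuses the inductive hypothesis.
theorem zero_add' : ∀ n : Nat, 0 + n = n
  | 0 => rfl
  | n + 1 => congrArg Nat.succ (zero_add' n)
```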

Don't Use Your Brain as a Compiler

Offload the menial accounting to the compiler. In fact, offload as much as possible to the compiler.

A retentive memory may be a good thing, but the ability to forget is the true token of greatness. -- Elbert Hubbard

With formal methods one can truly forget about the implementation and focus on the design. Without formal methods, there are three places where errors can be introduced; formal methods reduce them to just the first:

  1. Mismatch between reality and one's model of reality.
  2. Inconsistency of the model itself.
  3. Infidelity of the implementation of the model in code.

The Actual Memory-Safety that Java(GC) Provides

Only two; everything else Java surfaces as a NullPointerException:

  1. Use-after-free
  2. Double-free

The following are not prevented:

  1. Use-before-initialization; come on, no one really believes that initialization to 0 or null is proper initialization.
  2. Leaks; this is in fact too general to ever be solved.
  3. Anything not related to memory.

Java irony:

  • Java programmers fool themselves into believing they have encapsulation through the proliferation of trivial getters and setters.
  • Java provides us with towers of class hierarchies yet makes null a subtype of every class, which breaks all the SOLID principles.

Necessity of Formal Methods

You don't need to read much more than the comments on Lamport's Why We Should Build Software Like We Build Houses to see evidence of a strong anti-design and anti-specification bent in software engineering practitioners. -- Snark, Chord, and Trust in Algorithms

To designers of complex systems, the need for formal specs should be as obvious as the need for blueprints of a skyscraper. -- Why We Should Build Software Like We Build Houses

Zero-Cost Abstractions

Propositions-as-types is the ultimate zero-cost abstraction. Techniques in C++/Rust are well-intended, albeit poor, approximations.
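
A small Lean 4 illustration of the zero cost (`getChecked` is an illustrative name): the bounds proof is an ordinary argument at the type level, yet proofs are erased during compilation, so the statically checked access costs nothing at runtime.

```lean
-- `xs[i]` demands the proof `i < xs.size`; here it is discharged by the
-- hypothesis `h`. The proof is erased from the compiled program.
def getChecked (xs : Array Nat) (i : Nat) (h : i < xs.size) : Nat :=
  xs[i]
```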

Runtime Assertions

There is no need for runtime assertions. Either they can be discharged as compile-time assertions, or the failure case should be properly handled as an exception.
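
Two compile-time replacements for runtime assertions, sketched in Lean 4 (`Percent` is an illustrative name): `#guard` checks a decidable fact during elaboration, so the build fails rather than the running program, and a subtype bakes an invariant into the type so invalid values cannot be constructed at all.

```lean
-- Fails at build time, not at runtime, if the check is false.
#guard 2 + 2 == 4

-- The invariant `p ≤ 100` is part of the type; no runtime assertion needed.
def Percent := { p : Nat // p ≤ 100 }
```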

Readability

What does code readability mean?

The most important property of a program is whether it accomplishes the intention of its user. – C.A.R. Hoare

Readability is usually used as a poor proxy for correctness. Just aim for formal correctness. If people trust the correctness of your code, they most likely won't read it, and maintenance wouldn't even be required in the first place. That is the real payoff of proving correctness.

Bugs in Specifications

It's great that the only remaining opportunity to introduce incorrectness is in the specification. But how could we debug or test a specification? Physicists and mathematicians have been dealing with this for centuries; most notably, the quantum and relativistic corrections to Newtonian mechanics. Note, however, that these corrections were possible only because the existing mathematical theories were known to be correct. Otherwise, scientists would have been doing the futile work of fixing something that is wrong in BOTH specification and implementation. Programming should be more of a science than an art, so that fundamental principles trump questionable years-of-experience.
