#38 - The Case for Formal Methods: Hillel Wayne

04/11/2019

At one end of the spectrum is the young Zuck encouraging his hackers to “move fast and break things.” And then there’s Hillel Wayne with a very different sort of advice: move a bit slower and get things right. Unsurprisingly, the more mature Mark Zuckerberg of today would now agree with Hillel. “When you build something that you don’t have to fix 10 times, you can move forward on top of what you’ve built,” Zuckerberg told BI.

Hillel’s road to this wisdom was much shorter than Zuck’s. A couple years ago Hillel was working at a web development company that ran into a hairy distributed systems problem. The sheer complexity of it was overwhelming, so he starting looking for a way to make it manageable. That’s when he stumbled on TLA+. Long story short: Hillel fell in love.

Now Hillel is a renowned formal methods consultant, advising and training companies on TLA+, Alloy, and various other formal methods. It’s part of his personal mission to evanalize the benefits of formal methods to everyday programmers.

Most engineers in industry don’t get particularly excited with they hear “formal methods.” They seem like they’d be, well, formal. But that’s not entirely true. Formal methods is a big field, and some parts are more “formal” than others. There are two main categories: specification and verification. And then within each of those two categories there are: design and code. In other words, there’s:

Most of us are familiar with the benefits (and detriments) of a few kinds of code verification: type systems, tests, contracts, etc. Hillel’s schtick is educating the public on the virtues of a more obscure corner of formal methods: design specification.

TLA+ isn’t the first design specification language created, but it was the first Hillel came in contact with, and it’s still his favorite. It was created by Leslie Lamport, better known for LaTex and his seminal work on distributed computing (earning him the 2013 Turing Award). Hillel explains that the main benefit behind TLA+ is codifying all the scattered whiteboardings, UML diagrams, and documentation into a single formal notation that can also be automatically stress-tested for issues. It’s documentation for the broader system’s design, with the added assurances provided for by the TLA+ brute-force model checker. From Wikipedia: “TLA+ has been described as exhaustively-testable pseudocode, and its use likened to drawing blueprints for software systems.”

As Hillel would be the first to tell you, formal methods aren’t going to change your life. If you aren’t getting enough sleep currently, do that first. Then maybe consider formal methods, such as TLA+ or Alloy. If you want less customers getting upset at you for production bugs, and don’t want to ever again spend two weeks of your life crawling through distributed systems logs: formal methods may be right for you.

Full Transcript

Transcript sponsored by repl.it

Corrections to this transcript are much appreciated!

SK:
Hello, and welcome to the Future of Coding. This is Steve Krouse. Today, we have a guest on the podcast that, if you've been listening carefully to the other interviews, has actually been mentioned at least two, maybe more, times by other guests. Hillel Wayne is best known for his work trying to explain and promote TLA+ to a broader audience of more practical engineers, people who might not think that what we call, "formal methods" would apply to, you know, building products, building web applications, building technologies for the startup they work for.
SK:
If you aren't familiar with the term, "formal methods," I think this is a really great podcast to get your foot in the door. We start by contextualizing what formal methods are. We break up the field into four quadrants, and we go kind of quadrant by quadrant, and think about what each of the different techniques is used for, and the practicality of it. Well, I think I might be overselling how ordered this conversation is. Hillel will explain something, and I'll think I understand it, and then maybe 10 minutes later, I'll be like, "Wait a second. How is that different from that other thing we were talking about?" Then he kind of has to backtrack and clarify for me. But I think you'll be able to follow. And I think you'll get a greater sense for this small but active community of research that has a lot to offer to the future of what software engineering could look like.
SK:
Before I bring you Hillel, a quick message from our sponsor, Repl.it.
SK:
Repl.it is an online REPL for over 30 languages. It started out as a code playground, but now it scales up to a full development environment where you can do everything from deploying web servers to training ML models, all driven by the REPL. They're a small startup in San Francisco, but they reach millions of programmers, students, and teachers. They're looking for hackers interested in the future of coding and making software tools more accessible and enjoyable. So email jobs@repl.it if you're interested in learning more.
SK:
Without any further ado, I bring you Hillel Wayne. Welcome, Hillel.
HW:
Thanks for having me on.
SK:
Yeah. It's really great to have you. I'm excited for this conversation.
HW:
Mm-hmm (affirmative).
SK:
I think I originally heard of you as the TLA+ guy. Potentially, I think I may have heard of you for the first time via another interview or two on this podcast. I think we have a few mutual friends in common.
HW:
Yeah. I think it was a couple people. I think it was Kevin and I think James Koppel, both ... I think they were both interviewed by you, and they're sort of, we're sort of in the same circles, so I imagine it's one of those too.
SK:
Yeah, and I think it may have actually been both of them saying-
HW:
Oh, hey.
SK:
... "You really have to talk to-"
HW:
I'm popular.
SK:
Yeah, I think you are. I think you are. At least with the people I talk to. You have the illusion of popularity, given who I talk to. So, given that I know you as the TLA+ guy, I'd be curious to hear about your origin stories as this TLA+ superhero. Were you originally bitten by a radioactive Leslie Lamport, or did it happen some other way?
HW:
God, you sort of put me on the spot there, because now I've got to think of like a really clever comeback to that, but I can't. So actually, it wasn't really anything that interesting. I was doing some work at a web development company, and ran into a really complicated distributed systems problem with their product. And what happened is I was looking for ways to make it a little bit more manageable. I stumbled on TLA+ and it worked out really well in my favor. And I'm like, "Hey, this is really great. Like, this is incredibly useful for my problem, and nobody really expected it to be that way. Why is there so little documentation?" So I figured I'd write some documentation for it, and then I wrote documentation. Then I figured I'd give a talk on it, and then I figured I'd write a book on it, and then it just kept going from there.
SK:
Yeah, okay, so you have, I think the first thing I saw was learntla.com. Was that the documentation you were talking about?
HW:
Yeah. It was supposed to be a tutorial, because there weren't any easy tutorials online that I could find. And then it just kept going from there, but that was the first thing. That was like early 2017.
SK:
Yeah, and then I saw some talks.
HW:
Yeah.
SK:
And then you also have a workshop too?
HW:
Yeah. I also published a book on it, actually just a few months back, Practical TLA+, with Apress.
SK:
Cool, and then I saw that, consulting, that's TLA+ specific? You do trainings? Is that-
HW:
Well, it's a lot of TLA+ but I also do a few other things. I do Alloy, I've done some consulting on Minizinc constraint optimization. Just essentially whatever I feel comfortable with in the formal method space that I feel like I can really teach well. It's a lot of those stuff too.
SK:
Yeah. I just find it fascinating that you came across formal methods in your start-up work, and it seems like you just, "Well, now I just want to do this and only this for my life, for my career." Is that kind of like how it happened? You fell in love with this topic?
HW:
Well, sort of. Because, I mean, it's definitely a really interesting topic. I obviously relate to it. But I think what I really enjoy, sort of technical writing and technical communication. If you see a lot of the writings I do, a lot of it is on formal methods because that's what I think I'm best at. Some of it's on accent analysis, lightweight specification, and the history of programming. I just really like communicating and teaching ideas. And I think formal methods at this stage has the highest sort of strength to obscurity ratio, where it's way more useful to a lot of people than know about it. And that's sort of why I focus on teaching it, for that reason.
SK:
Got it. Got it. That makes a lot of sense. Yeah, I can tell from the way that you write that you enjoy it. Or at least it feels that way, that you enjoy writing. I enjoy your writing and I think it comes across.
HW:
Nobody actually enjoys writing. It's more of a compulsive thing for most people, for most writers.
SK:
Yeah. I liked, in one of your essays you talked about how you asked your editor to be needlessly cruel and that he gets extra points if he makes you cry.
HW:
Yeah. He didn't. I win.
SK:
Did he make you cry?
HW:
No. He didn't. I won.
SK:
You won. Yeah, I feel empathy for you, because, maybe we'll talk about it later, but I think you'd have interesting things to say about this, but I spent a lot of time writing an essay this past week, and I got some terrible reviews that almost but didn't quite bring me to tears. So it is hard getting cruel feedback. Or not cruel, but just harsh feedback.
HW:
Yeah.
SK:
Okay. So let's get into the formal method stuff. I thought it would be useful to start by situating ourselves and defining terms. I think your recent essay, "Why People Don't Use Formal Methods", that was on the front page of Hacker News, I think you did a really wonderful job.
HW:
Thank you.
SK:
Yeah. There are a lot of separate topics, and they all, I think the words that correspond to those topics are pretty good, and I'm glad that there are separate words for all these things. Could you roll off the top of your head? Or I wrote some of them down if you want me to kind of tee them up for you...
HW:
Sure, I can talk a little bit more about this. So one thing to keep in mind is that any sort of field in programming, formal methods is a big field. Saying, "I do formal methods", is kind of like saying, "I do web." It kind of gives somebody an impression, but there's a lot of nuance there. But also given that is that, a lot of fields in programming are sort of very big. For example, with web, there's probably more people who do web development than live in New York, the State. But formal methods is extremely small, and it's also very fractured because people, everybody who sort of is in it often knows one or two things but doesn't really know the whole space of it. And the consequence of that is that there's a lot of ideas in there and some of the ideas overlap, but the people who overlapped with their ideas don't necessarily share the same terms.
HW:
So I ended up inventing a lot of new terms for that essay, not necessarily because I think these are better terms, but just because since, again, not everybody sort of shares the same terminology. It was easier for me to talk to a public audience about it by just inventing terms of being clear, that they were terms that I just invented on the spot to talk about differences.
SK:
Oh. Okay. Yeah, that's a good clarification.
HW:
Yeah. So yeah, I divided into two categories. Again, these are very, very fuzzy categories. There's a lot of overlap and there's things that don't belong to either category, things that belong to both categories of thinking about code and thinking about designs. And then we divide each of those into two separate categories of how do we specify and how do we verify. Specifying being how do we describe what we want to be the case. And verification being how we show that what we want to be the case is the case. And that's pretty much all formal methods is specification and verification to one or two degrees.
SK:
Got it. Okay. So just to recap, so is it like we imagine is four quadrants specifying, verifying on one axis and code and design on the other?
HW:
Yeah.
SK:
So there are four? Yeah.
HW:
Again, yeah, very, very, very, very, broad, probably wrong, but wrong in a very useful way.
SK:
Okay. So yeah, do you want to talk about each of the four quadrants a bit? Or what's the next important distinctions to make?
HW:
Yeah, so actually, I should probably just mention this right now is that, because I realized I didn't actually define the term. Formal methods is sort of the study of how we can show that things are correct in ways that are sort of irrefutable. So for example, you might be familiar with say testing, right?
SK:
Yeah.
HW:
So testing works, but it only shows very limited amounts of sort of verification. If you prove your thing works through inputs one through a hundred, maybe it fails for input 101. So formal verification is a way of sort of saying, "Okay, we're going to test every possible thing, and we're going to show that no matter what you put in, it will always give what we expect."
SK:
Well, so I think you use a few terms there that are interesting. You use, in particular, irrefutable.
HW:
Yeah.
SK:
I think is an interesting word.
HW:
Which is also incredibly misleading.
SK:
Okay.
HW:
Yeah.
SK:
Okay, well, I'll let you go give the high level and then we'll drill down into the specifics in a sec.
HW:
Yeah. So, drill down into irrefutable or give the high level of the rest of the space first? Which would you prefer?
SK:
Yeah, yeah, sorry. Let's give the high level and then we'll drill down into some of the specifics later.
HW:
Okay, so, in those four quadrant-ish things, and again, this sort of a formal methods thing of just always qualifying all my statements. Qualifying, again, this is more of just like a very rough model than anything else. So for code specification, you have a few different things. You have external theorem, which is essentially writing your code and writing in a separate file, essentially, here are the properties of this code. That's very similar to what we call testing, but more rigorous. We have really strong type systems, like dependent types or refinement types. Again, some were static types, but harder to check but more comprehensive. And then we have this thing called logics and conditions, originally called Hoare Logic, but now there's a bunch of different branches, where you essentially say in a function given these inputs, this probably should be true of the outputs. And this corresponds to something called contracts in programming, which is a very powerful verification technique. But of the three ways that we verify code informally, it's the most obscure by far. Essentially, the easiest way to describe it is assertions.
SK:
Yep. Yep. Okay. And so just to, this question probably should have been asked earlier, but you just in the last description informally versus formally.
HW:
Yeah.
SK:
The way you're using that term, informally means kind of like eyeballing it and formally means a computer is checking something?
HW:
Informally, here what I'm using informally to mean it can be automated. In fact, it usually should be automated, but it's automated in a way that doesn't give you complete confidence. Essentially, informal verification is still automated verification, it's just done in a way that's one, much easier and two, not as comprehensive as formal verification.
SK:
Okay, so I guess it's a spectrum-y thing?
HW:
Yes, it's a spectrum-y thing.
SK:
Okay, so formal is at 100 or it's just past 50%?
HW:
It's basically 100%. In fact, that's one of the things that people miss, is that for the most part, formal verification is, while the most powerful sort of way of sort of verifying that stuff is correct, probably not the most productive in most cases. Because to get to 100%, you have to go much, much, much harder than it takes to go to 99%.
SK:
But 99% is informal.
HW:
Yes, ish. I mean, that's why we have to sort of put these things on a spectrum because what does it mean to be 99% Correct versus 98% correct, right?
SK:
Yes. Well, I guess I'm just trying to figure out what the whole field is because, you know, when you talk about formal methods versus informal, I don't know. Maybe this was a bad digression. I'll let you get back to the high level.
HW:
I know. We should probably drill into that digression at some point. So in any case, now the thing is that like all these ways that we can spec, we could test both formally and informally. For example, if I write sort of this is the specification of my function, I could test it using full verification or I could write a million auto manual tests or put it through a really intense code review. But if I want to sort of formally verify it, what I could do is I could write, for example, what's called a proof. Which is essentially a mathematical statement sort of showing from our basic premises how we can conclude that this is going to be correct in a way that a machine can check. Often these days, that's considered really, really hard. So when we can we usually shell out to a solver. For example, a SAT solver or what's called an SMT solver to automate some steps for us. That's pretty much the main way that we verify code is correct formally.
HW:
And it works but it's also really intensive, labor intensive. I think the fastest anybody's ever done it was four lines a day using cutting edge, all the resources of Microsoft combined. And that's one of the reasons why, for the foreseeable future, code verification is probably going to remain in the realm of experts. Code specification is really powerful and I think could be more widely used. But code verification at least formally isn't really on the horizon for mainstream use.
SK:
Got it. Okay. So, okay, so just to recap because I feel like that got a little bit messy there.
HW:
Yeah. No. It's messy topic.
SK:
Yeah. Yeah. And I interrupted with sorts of things.
HW:
No worries.
SK:
I thought, maybe let's just start over in a sense of-
HW:
With an example?
SK:
Well, I was thinking maybe, just a picture in my head to, even just for me, the field is called formal methods.
HW:
Yes.
SK:
And in that field has four subcategories: code verification, design verification, code specifications, design specification. Is that correct?
HW:
Four ways of thinking about it. Usually people do either code or design. So they do both code verification specification or design verification specification. I mostly make that division to make it a big clear, what we're talking about, we talk about sort of like proven correct for specifying correct whatnot.
SK:
So what you're saying usually any given person would only deals with code or design?
HW:
Usually, and I'm waving my hands very dramatically here when I say usually.
SK:
Okay. So does that hold for you too?
HW:
I'm mostly design verification and specification. I mostly do design work.
SK:
Design and specification like the same thing?
HW:
So when I say sort of I do design, most people when they say that they do formal spec, okay, this is weird. Usually if a person says that they primarily work with specification language, they mostly do formal specification, they usually mean that they do design specification and design verification.
SK:
Yep, okay.
HW:
Which you'd not be surprised because I said formal specification why are they doing verification. But you know, again, fuzzy terms, small field, fractured field, lots of different pieces to it.
SK:
Okay. One question that I have here my notes, I wrote down was what's the difference between verification and validation?
HW:
So verification is basically taking a description of how the code should be and proving the code matches that. So for example, let's say I say that this code should always sort a list in descending order, the specification would be that say every two indices if one indice is longer than the other one, it's going to be higher than that one. And then verification is basically showing how that's going to always be true, right? Validation is when you say "Wait, do we actually need this sorting function, maybe we actually need a maximum function. Maybe we are doing the wrong thing entirely." It's sort of at the level of what are the human requirements we need and how do we show we match the human requirements of the total system. So yeah. And that's usually outside the scope of formal methods because that deals a lot more with sort of social systems and understanding customer requirements.
SK:
Okay. So it sounds like you have on one level, you have reality and you're trying to match your specification to reality. And then below that you have a code and trying to match the code to a specification.
HW:
Yeah. That's, I think, a really good way of putting it. And then the formal would be validation. The second one would be verification.
SK:
Okay, got it. Yeah. So we're trying to validate our business specification with the market and then we're trying to verify that our code meets the business specification?
HW:
Exactly.
SK:
Got it. And that involves both code verification and design verification because in order to validate that the specification meets a business needs, we then will do the design verification stuff. And then once the design is verified, then we'll write the code and then try and verify that the code meets the specifications.
HW:
Yeah. In an ideal world with an infinite amount of resources, yes, that's how it would look.
SK:
Got it. Okay. And so today, and what you were alluding to, is that, you know, at fastest you can do four lines of code, potentially, in 100 years will have some improvements in theoretical understanding of this stuff, or just machines will be a lot faster and potentially we'll be able to do all this stuff for every line of code and it'll be just as fast as we write code today?
HW:
I don't know. I'm hesitant to sort of make predictions about 10 years from now. Because again, remember, 50 years ago we first started doing formal verification. We were like, "Oh, yeah, in 10 years, we're going to have everything verified." As it was like, "No, that's crazy. In like 20 years we'll have everything verified." And now it's been 50 years and nothing's verified. So I mean, it's really hard. I mean, look, proofs are hard, validation's hard. We often don't really know how to represent specs, these are all really difficult topics. It's hard to sort of make predictions of how things will go out. I think that there's definitely going to be expansive design of verification. But just because I think that right now we've seen it be really successful. But code verification, I think, for the foreseeable future, will remain sort of a niche topic that's done in special cases by experts. And I don't know if or when it will ever be a thing that everybody's doing.
SK:
Okay. Interesting.
HW:
Yeah.
SK:
I guess to drill down a bit, you talked about irrefutable proofs, things that are checked by the computer.
HW:
Yeah.
SK:
One of the things I found in your writing that relates to this is you were talking about how 20% of published mathematical proofs aren't actually correct. There is an error that the person who wrote it missed and the reviewers missed?
HW:
At least according to that one reference I found. It could be that reference is wrong or could be that reference to understating things.
SK:
Yes. Well, yeah, I agree. You know, for you to question in the spirit of, you know, questioning everything, you have to even question the thing that refutes things.
HW:
Yeah. So when I say the proof is irrefutable, I mean, assuming the following is true: assuming that the prover is correct, assuming that sort of any auxiliary tools that are using with the prover is correct, and finally, assuming that you have all the requirements. At which point we show it's irrefutable for the specific context you're talking about. So a common thing gets sort of said, "You can never actually prove a thing will always work because for all you know you're going to start the equation, somebody's going to hit the server with a baseball bat."
SK:
Yes, of course. And so, I guess that's, I think the word irrefutable is interesting because you could say the same thing about mathematics. A proof is irrefutable if the people who reviewed it for the journal didn't make any mistakes.
HW:
Yeah.
SK:
If you assume that to be true. Because that's essentially what you're doing when you're assuming that the code verifier has no bugs.
HW:
Yeah. Yeah, so I think, I believe that some of them for example, I believe the core for Coq, and don't quote me on this, but the core for the Coq prover has been sort of proven by hand to be correct. So that we know that the core is essentially irrefutable. But all the auxiliary tooling is sort of hodgepodge together as academic projects. So that's less trustworthy.
SK:
Well, so I'm struggling. Why is Coq irrefutably correct? Because it was done by hand?
HW:
So here's the thing, that's what I've heard and this is like what I've heard from people who worked on it, I cannot say exactly how that's the case and I cannot, if you basically put a gun to my head said, "Is this true?" I'll say, "Maybe." I don't know enough about the topic. Yeah, again, I do have to clarify here that one of the things that has been affected by me talking a lot about formal methods and working with it is that I don't really, I'm not really comfortable saying things I don't know are absolutely the case. So given that I've not worked direct with Coq and I haven't looked at the papers, I don't know enough about how they verified it to tell you how it's been verified.
SK:
I see. I see. So I guess what I'm driving towards is that the main difference between formal methods, like computer methods, and mathematical proofs is whether a human or computer is doing the checking?
HW:
Mathematical proofs tend to be less rigorous than formal methods.
SK:
Yeah, why is that?
HW:
Because most mathematical proofs aren't automated. So the thing is, is that if we have sort of a computer checking the things, assuming that we built this all correctly, assuming, assuming, assuming, we can essentially say whether every step is a correct or incorrect type inference, given that we had to break it down for level compute. But with mathematics often, the purpose of mathematical proof is to convince people not to 100% prove something's the case. So there will be things like this one step, we can sort of show what this heuristic argued. And everybody looks at it and goes like, "Yeah, that makes sense." They can sort of skip that one step of it, of the mathematical proof. And that's often done because, you know, you don't want to sort of sit down and sort of say, "Okay, in this context, we can prove that we can associative addition, and this kind of is we're going to prove that we can use induction and that induction is actually a reasonable theorem to have in the first place." Whereas the computer will have to do all that stuff. So it reduces the chance that you will accidentally assume something is possible or easy when it turns out that in this one very particular instance, you can't do it.
SK:
I see. That's quite a claim. I think I've heard it before, but I just want to repeat what you said, that mathematics is about convincing and explaining to other humans, it's not about making sure that you're not fooling yourself. Is that kind of what you're getting at?
HW:
I mean, you're trying to convince other humans who are very, very invested in not fooling themselves.
SK:
I see.
HW:
I think one good example of what I mean, this was something I read, I think it was my Terence Tao, one difference between sort of the recent proof of the ABC conjecture, the proof of Fermat's Last Theorem is that the first five pages of the proof Fermat's Last Theorem, people were getting interesting results from it. So even though it was a really, really, really huge proof, very early on in the proof people were saying, "Oh, this is interesting. This gives us some really cool new machinery to work with. This is already been useful to us." That convinces people that there's something apparently interesting here. Whereas with the ABC proof, that I think has recently been claimed to be invalid, you had to read the entire thousand page document to get any value out of it whatsoever. So that made people less convinced, mathematicians less convinced that it was all correct, that it was a useful document. Does that make sense?
SK:
So I understand the story. I'm not exactly sure how it relates, what I supposed to get out of the story.
HW:
Yeah, basically just the idea that mathematics, any anything else we do is sort of also in addition to being a technical institution is also social institution, it's all about how mathematicians interact and how we all do things as a group. And similarly, formal methods is also social institution as well as a technical institution. One of the consequences of this is that with mathematics as the social institution, some amount of mathematics is in the social act of convincing and rhetoric, which is how it should be given that we're not machines. Whereas with formal verification, often the only thing that we care about is sort of make something pass the formal verification tooling, which means that it's almost entirely in that one context about making sure that every single thing is correct.
SK:
Got it. Okay. Well, I think this is a good point to transition to the difference between theorem provers, which I think most of what we've been talking about, and model checkers.
HW:
So essentially, there's two ways to sort of show something's correct. You can either sort of construct a rigorous argument showing it's correct or you can sort of show how it's impossible to be incorrect by brute force of the entire possibilities. So I guess, here's a simple example of what I'm talking about. Let's say you have sort of something that works over 32 bit floats, right, you have some function that takes two 32 bit floats and returns a float, right?
SK:
Sure.
HW:
So there's only about two/three billion 32 bit floats, right? So you could literally just go by hand and check every single one of those combinations. And if you do that, you can actually just check and brute force and make sure that every single thing does what you expect it to. And that does sound like a bit of a burden, but proving stuffs kind of also a giant like cluster, so... That make sense?
SK:
Yeah, that makes a lot of sense.
HW:
Yeah.
SK:
And then the last kind of thing I'll ask in this abstracting before we get concrete is, I think, a very common programmer question. You know, I don't even have to speak for other programmers, I could just speak very personally. When I hear about specification and verification, I really want these things to be tied to my code. I don't having to duplicate the effort of specifying and then writing code and then having to eyeball or verify. Basically, you know, I wonder if I could just write the specification and then the compiler writes the code for me, or the specification is the code. So I feel like you have terms for that...?
HW:
Yeah. So the term for sort of taking a specification and generating the code from that is called synthesis. This is not something that we can do mainstream, it's still a very niche academic topic, and one that a lot of people are obviously working really hard on. But it turns out that generating code is really hard. I can actually link some stuff, because Nadia Polikarpova is one of the big people doing a lot of the really cool work in this space. And she recently did a talk at Strange Loop about some of the work she did. And it's all really cool stuff. But she's also very clear this is not going to be mainstream in the next 10 years. Not gonna be getting close to mainstream the next 10 years. So yeah, it turns out that that's a lot harder to do than just sitting down and showing that code matches specification, which is itself a lot harder to do than showing that code almost, most likely, matches specification informally. So yeah, there's all these terms of difficulty. And I think one of the things that happens is that people get fixated on sort of the golden mean, the sort of end state, but to the point where they sort of ignore all the really big benefits that we can get in between.
SK:
Okay, yeah. Yeah, I know what you mean. And so I think let's start talking about some of those benefits you can get even now, even in the not end perfect state, in the imperfect world we live in. You have been singing from the rooftops the virtues of using TLA+ for design verification. So let's hear more about that.
HW:
Yes. So one thing I do want to say first, really quickly, we do also get benefits with code verification. For example, a lot of type systems do some partial verification. Rust has a borrow checker and that basically let's you do a lot of verification automatically. So we are making a lot of steps to make certain aspects of verifying code more accessible and we've seen a lot of success for a lot those steps. So it's not just sort of working with designs where we see immediate benefits.
SK:
Got it. So you're saying that, both for designs and for code, even though we're not at the end stage, intermediate or 80/20 versions of these formal methods are useful, both in code and design?
HW:
Yes. But I think what happens is that a lot of sort of the code verification stuff has basically been tied to a language, which is really good. But design verification has not been tied to language, so you don't have to use a particular language in your code base to get the benefits of design verification, which is one of the reasons I think it's so valuable. One of the reasons I think things like TLA+ and Alloy have a lot of really good uses even today.
SK:
Oh, interesting. So if you're using Agda you can get those benefits. But if you're not, then you're kind of screwed. And that's the appeal of something like TLA+, you can use it with any language?
HW:
Yes? Part of the appeal anyway.
SK:
Part of the appeal. I see.
HW:
Yeah.
SK:
Yeah, that definitely makes a lot of sense. And I guess it's similar to test driven development, like unit tests can be done in any language.
HW:
Exactly.
SK:
Yeah. Okay. Same idea. Or Agile, Agile can be done in any language.
HW:
Yeah. And that ends up being really important for the development, socially, a lot of these ideas because if you can start getting the benefits without having to change your entire code base, you're more likely to do it than if you have to sort of rewrite everything from scratch to get some value out of it.
SK:
Yeah, of course. Okay, so let's finally dig into it. TLA+. So for those who aren't familiar, could you do your, you know, whatever, two minute spiel of what TLA+ is, the motivations behind it, how it came about? All that jazz.
HW:
Okay. Sure. It's actually one of the interesting challenges for how do you explain this without demos? I found that the easiest way to describe it is to show people demos, but obviously we can't do that on a podcast. So okay, so obviously when we sort of are designing, basically, we're building systems that involve say, multiple actors or multiple programs or client servers. We have the code, right, that actually embeds all of these thing. But the code is simply how we do these implementations. It doesn't sort of show our high level understanding what should be going on what is going on. For example, imagine that you sort of even have something as simple as say, a web app that has both a front end and a back end and then service in a deployment system. You're sort of looking at a space that can't really be encoded in just a single code base. You're at the very least looking at multiple code bases all interacting with each other. Right?
SK:
Yep.
HW:
So none of this, of the code that you've written, really expresses or is aware of the full design of your system. And because of that, it can't really help you with verifying the design itself. So people sort of implicitly understand this. That's why people do things like whiteboarding or draw UML diagrams or sort of talk about doing acceptance driven development. And sort of this is additional understanding of, "Hey, there's this broader design that has its own challenges beyond just how we're each in the line of code is working or not working." But if we have this idea that we have a larger scale design that we care about, why not specify it and then why not test that specification for issues. And that's sort of a lot of the motivation behind TLA+, which is by Leslie Lamport, the same who did Latex and basically half of distributed computing.
SK:
So it sounds like you can almost think about TLA+ as a direct replacement for documentation or whiteboarding or UML diagrams?
HW:
Augmentation not replacement. Still write your documents, please document stuff.
SK:
Okay. Because the specification isn't understandable, you can't just read a specification and understand the system. You still need documentation.
HW:
Oh, no, you can totally read, often people, you can read a specification and can give you a lot of insight. But I think it was David McKeever who was like, "Sure, caffeine can help you replace sleep, but caffeine isn't sleep." Things like tests and specifications can help you understated a system, but they're not documentation. Documentation exists at a human level, even higher than any specifications you can write. Still write your documentation, write your requirements analysis and then write your specifications.
SK:
Yeah.
HW:
I'm not really selling using TLA+, am I? Basically just going like, "No, it's not that great. It's not a great everyone."
SK:
Well, I guess what I'm reacting to is it seems like we just keep layering things on, you know?
HW:
Yes.
SK:
We have our code and then, okay, well actually, you have to write these tests for code. Oh, I have to do all this Agile stuff to write the code in the right way. And oh, actually, you also need these integration tests. And oh, actually, now you need to document your code. And then also, you have to write now this TLA+ specification for your code. So it would be nice if one of these things could replace some of the other ones so we can simplify some of these other things. It feels like we're just going to keep layering on things, and eventually will all be stuck writing four lines of code a day.
HW:
I mean, there's a reason we're paid a lot of money as engineers. This is hard stuff. I mean, this is really fundamentally hard stuff. And there's a reason we're paid a lot of money do software. Right?
SK:
Well, I think that's an interesting claim. Are we paid a lot of money to do software because it's fundamentally hard or is it, you know, incidentally hard?
HW:
I guess both. Okay. So I guess what you're sort of asking is because just to clarify, it's like you're sort of asking it seems like we have to do all this extra stuff is it worth the effort? Because you're sort of talking about it from a productivity perspective, you're worried about it sort of slowing everything down, right?
SK:
It's not is it all worth it, it's more that it feels like each thing we add on, unit test, integration tests, formal specification, documentation, feels a bit like an ad hoc solution to one part of the problem. And it's not like a unified solution to anything, does that make sense at all?
HW:
It does.
SK:
Yeah. Okay.
HW:
So my thoughts there is that a unified solution would be nice, that sort of solves everything for us. Historically and empirically, almost all the ones we've tried have not worked out. It turns out that coding, it turns out that complicated problems often do unfortunately require complicated solutions.
SK:
Yeah, well, actually, now just hearing myself talk and hearing what you just said, it reminds me of the no silver bullet essay.
HW:
Yeah.
SK:
Which most people misunderstand. But the the central metaphor of that, that I remember, is medicine, how before germ theory we thought there'd be some magical cure, some simple magical cure to diseases. But then once we finally accepted germ theory, we realized that there would be no one big solution, it'd be a lot of tiny little solutions that'd be hard to find.
HW:
Yeah.
SK:
I guess that's kind of what you're saying with software. There's gonna be no unified one solution. It's going to be a bunch of little add on things that we'll have to keep adding on to software to make it better incrementally over time.
HW:
Yeah.
SK:
Just like we have to take a flu shot every year and we also take a tetanus shot and we also take a polio vaccine, there's no one magical shot that will do all of those vaccines. We have to take them all.
HW:
Yeah. And I think that that's true with almost any sort of human system. I think with almost every system you're gonna look at, whether it's sort of software engineering or medicine, and I'm assuming and I could be wrong, as we all know, we do not know other fields very well. With other kinds of engineering and also with [inaudible] such, it's just that there's really complicated problems that there's a million small solutions for and no one ever finds one magical thing that just fixes everything.
SK:
Yep. So I hear that for sure. And then I feel like on the other hand, there are times when you have the geocentric theory and you add epicycles and epicycles and epicycles, and all of a sudden you realize, "Oh, wait, if we just make it a heliocentric theory, we get rid of all those epicycles and everything's more elegant." And we've replaced these ad hoc things with a new elegant foundation. So that happens too sometimes.
HW:
And then you have to start figuring out the perceptions of stuff, then you realize you have to add in general relativity and special relativity to sort of adjust for other things, which are verified but also incredibly complicated.
SK:
Yeah, yeah. Well, I guess that's kind of the beginning of infinity thing. And we'll never quite be able to explain everything.
HW:
Yeah.
SK:
I guess, to go back to my original skepticism, it's really skepticism of ad hoc-ness.
HW:
Can you clarify what you mean by ad hoc-ness?
SK:
Yeah. Ad hoc-ness is a hard thing to define. But I guess what I'm getting at is if I asked you to list all of the practices you would recommend for an engineering team, like unit tests, maybe just list them for example. What would what would be all of the, so writing code, version control, you know, maybe just list some of the things you would recommend.
HW:
I guess some of the ones that I think would be recommended would be formal specification, I think obviously, I've got to sort of say that. Obviously writing code, you have to do writing code. Version control is important. Code review is extremely important. It's one of the few things that we are empirically sure, with multiple studies, is a great idea.
SK:
Code review, probably?
HW:
Yeah, sorry. Code review. Did I say something else?
SK:
Oh, I thought maybe you said, I'm sorry. Nevermind. I'm sure that's what you said.
HW:
Yeah.
SK:
You're just dropping out a it.
HW:
Yeah. My mistake. Yeah, basically, code review, really, really good. Taking time to do stuff, adequate sleep, exercise, good relationships with clients, constant feedback, really careful post mortem system analysis, really careful pre mortems. I realize a lot of this isn't actually in the code level. Do you want what I think would be effective things for coding?
SK:
Oh, well, yes.
HW:
Yeah.
SK:
Yeah, go for it.
HW:
Coding, probably unified style, randomized testing, although that sort of is interesting, because we're not quite sure what are the best test to write. I think a lot of people are really fond of unit tests. I think those are great and you can write them really fast. But there's also other things that are really powerful large scale testing, probably some measure of observability. I'm not sure if this is really supporting your point or mine here.
SK:
Yeah, I'm not sure either, but I kind of like where this is going. I think neither of us really know where it's going.
HW:
Yeah.
SK:
Yeah, I guess maybe this is just how it is. To pick another example, if you were to say if you want to be the best tennis player in the world what are all the things you have to do. I guess the list would kind of be long and complicated. And then someone be like, "No, actually, you have to add this thing too no that we know this feature of our rackets are important." You have to worry about that too. And actually, you know, we didn't really that gluten was bad or whatever, gluten was good or whatever it is.
HW:
Yeah.
SK:
I guess-
HW:
I do see one simplification that I think you would find as interesting as simplification. I think a lot of unit tests, integration tests, not all of them, but a lot of them can be folded into a combination of property tests and contracts.
SK:
Okay.
HW:
That's just my opinion though.
SK:
Yeah, yeah. Tell me more about that.
HW:
So are you familiar with contracts?
SK:
Yes, but let's assume not.
HW:
Okay. So essentially, a contract is an assertion that you make as either usually a pre condition or condition of your function. So say, if you have something that takes the tail of a list, you can make the post condition saying that it will have one less element than the original list. And also if you append to the head of the list to the output you'll get the same thing. So essentially, these are essentially specifications that ride in the code itself. And they can be used for formal verification, but they can also be, and are more commonly used for run time verification. Every time you call the function, you just check the preconditions and post conditions. And if they're wrong, you just sort of stopped execution; raise an error. And it turns out that if you do this, one, it's really effective, but two, you can now start to test by just randomized inputs, pumping it through a system. And just if you have a bug, the appropriate contract will stop and raise the issue. So you start to get really simple integration tests from that.
SK:
Okay, yeah, that's very interesting.
HW:
Yeah.
SK:
So, you know, the chaos monkey approach. Well, chaos monkey, I guess, is more about letting servers do it and stuff.
HW:
Yeah, but essentially, the randomized testing with fine grained responses.
SK:
Well, the randomized testing reminds me of Haskell's QuickCheck, where you generate tests based on types.
HW:
Yeah.
SK:
And then these are runtime assertions, which I guess in a type world would be kind of like dependent types.
HW:
Sort of. So the thing is, is that contracts, so I guess a quick description between contracts and types, a lot of overlap between the two ideas. The main difference is that types aim for what's called legibility. They aim for being able to really easily analyze them statically, while contracts aim for expressivity. They aim for the ability to encode arbitrary assertions. So, for example, if you really wanted to, you could write a contract that says this function is only called by functions that have palindromic names.
SK:
I see. I see.
HW:
Yeah, which is probably not something you want to do, but you can easily do things like say refinement typing, this should only be called with values that are greater zero and will always return values that are less than zero.
SK:
That's more of a type level thing. But you're saying that contracts are much more expressive?
HW:
Yeah.
SK:
It sounds like contracts have access to not only the static AST, but also the actual code. So you can get the the name of the function and also have access to runtime information.
HW:
Yeah.
SK:
Maybe even have access to past runs. If I've been tested three times before then fail?
HW:
I mean, I think that's probably not something you want to be doing. But it's more long line. I guess, here's the more reasonable thing, after this is run, this mutation should happen in this class kind of thing.
SK:
Yeah. Okay. Got it. So, yeah, well, I think that's, I agree that that's something that excites me. I like the idea of simplifying. It's like a mathematical idea. You know, being able to describe the same amount of things or more things with less words. But then, yeah, I also understand the no silver bullet solution of it's a complicated thing, we just have to keep layering on improvements over time.
HW:
Yeah.
SK:
Or maybe, at one point, we'll get a paradigm shift and we can have a new foundation. But yeah, anyway.
HW:
I mean, and also, the thing is that you might end up in a case where things are simplified but on the whole, things are more complicated. It might turn out that we can fold five ideas into one, but also, we need six ideas total. So I'm not doing a great job explaining this. I'm sorry.
SK:
No, no.
HW:
There's a reason I'm much better at writing than public speaking.
SK:
Yeah, well, you're quite a good writer. So well, I guess, let's focus on the things that I think we do a good job of, focusing on the discussion kinds of things. Because your essays, and I've done this a lot in other podcasts too, when I interview people who are good writers, I try and start with their writing as a foundation and then kind of go where I find interesting. And hopefully that's where other people who will read your things and wish that they could have asked you this question, hopefully.
HW:
Oh, I see. Okay. So in that case, let me try changing sort of my answer to that.
SK:
Sure.
HW:
Because, okay, so I think that I'm going to actually shift this about why do we have to keep on layering stuff. And I'm going to sort of shift this in a slightly different direction then. Into something that I think might be more interesting for discussion. Are you familiar with systems theory?
SK:
Let's drill into it regardless.
HW:
Okay. So it's a mixture of really interesting ideas and really cultish ideas that sort of started forming around early last century and sort of developing, which is the idea that often we approach systems, we approach problems and think, "Oh, this really complicated problem, there's actually a simplification to it that we can make, that makes it easy and we can sort of abstract it out." And the systems theory is the group of people going, "Wait a minute, what if that's actually not always the right approach? Maybe there's a better approach, which is what if we sort of look at this complicated problem and say, 'Hey, this complicated problem is actually still a complicated problem. And what we should be doing instead is trying to find all the patterns inside the complicated problems that helps make it complicated. And then try to sort of think of the complicated solutions, or the simple solutions that address all these interconnected issues.'"
HW:
So the idea here is that instead of sort of saying, "Oh, there's an easy answer here," we say, "No, there's no easy answer. And now that we accept that there's no easy answer, how does this help us, to have that revelation?" Does that make sense?
SK:
Yeah. Well, maybe, can you walk us through a concrete example?
HW:
Okay, sure. So here's a concrete example and this is the stuff that I've mostly been interested in applying to a system safety, which is how properties of safety and security arise in a system emergently. So actually, yeah, actually, I think I might have actually had a better idea for an example here, which is future direction, which I think is a little bit easier to talk about. So imagine you have a system where you've got the following. People can sort of sign up and register for an account. And they have to validate the email address, right?
SK:
Yep.
HW:
So they have to validate the email address to basically do stuff. And you also have a check basically in the registration form that they cannot validate an email address if it belongs to say a spammer email address. Right? You know those like things like teleworm and youmail, which lets you get a bunch of free email addresses in bulk?
SK:
Yep, yep.
HW:
Okay. So now you actually have that requirement and now you also have the requirement that if they get an error, basically the email gets lost they can quick upon that resends the validation email, right?
SK:
Okay.
HW:
And now let's imagine we finally have a requirement that you could change your email address. Do you suddenly see the bug that we now have?
SK:
You can give them a good email address and then change it afterwards?
HW:
Yes. And not only that, but often you can end up without, even if we have the requirement that you can still change your email address to a bad email address, they can still register with a good email address, change it to a bad email address and then resend the validation. And this is actually a bug I've encountered in the wild. And then the question is though, is this a bug? Is this actually a bug? Would you say it is or would you say it isn't?
SK:
If they don't resend the validation?
HW:
If they can do this, where they can now validate a bad email address by resending, by basically switching to a bad email address then hitting the resend button.
SK:
It sounds like a bug.
HW:
Okay, so where's the buggy code?
SK:
Oh, yeah. The bug is in the design?
HW:
Yeah. But see, here's the thing, all the codes correct. All the code is satisfying all of it's requirements. But this idea that there's an immersion bug that happens, that we're not even sure if it's a bug or not until we actually talk to, figure out the broader goals the system. This is what we call feature interaction. And it's the idea that multiple working things interacting can lead to a global bug that's not a local bug. That's what I'm sort of talking about by systems theory. This idea that we have to be thinking about all the complexity that emerges from what we do to actually figure out whether things are correct or not. Does that give you a better intuition of sort of what's going on here?
SK:
Yeah a little bit. Maybe, do you want to try one more example or move on?
HW:
I can try one more example. Give me a second to sort of think of a good example here. So this is more-
SK:
So-
HW:
Yeah?
SK:
Sorry. Before you do it, one thing that might be just misleading me is Nicky Case talks about systems and he talks about cyclical systems.
HW:
Yeah.
SK:
And he has these diagrams where you have robins and foxes.
HW:
Feedback loops? Yeah.
SK:
Sorry. And you can model them. I imagine that his systems are maybe the simplified versions that you're saying we should think about things in or more complicated way or is his systems kind of what I should be thinking about?
HW:
Yeah, they're similar. So what he's talking about is stocks and flows. Right? And that's a particular way that people analyze dynamic subsystems. So it's a way that we essentially try to sort of find models of complicated systems that helps capture some properties of them. It's one of the many tools that systems theorists do have. Another one that's really close to things is that state machines are used for systems analysis outside of programming.
SK:
Okay, got it. So state machines is I think a good example. So state machines is used for systems analysis.
HW:
Yes.
SK:
Got it. Okay. So I think I was just a little bit maybe confused when you mentioned that, because you said that sometimes systems theory was kind of a reaction to oversimplifying.
HW:
Yes.
SK:
State machines seem like an oversimplification, but maybe I just haven't seen complicated enough state machines.
HW:
Right, right. But the idea of the state machine is it's a simplification, but it's one that explicitly doesn't capture all the logic of a system and it tries to capture also its more complexity. If that makes sense? It's a way of representing a complicated system in a simplified way, but also doing so without assuming that the system is actually the simplification. The map is not the territory.
SK:
Okay. So to that's the key insight.
HW:
Yeah.
SK:
The map is not the territory.
HW:
Yeah.
SK:
Okay. Sorry. So I feel like I lost the stack. So why were you going into systems theory?
HW:
Because one of the ideas that if things are this complicated, then it sort of suggests that even if we can find the simplifying paradigms of things like programming, or really anything, innately the fact that we're trying to build a system will make it complicated. There's no way to easily simplify the complexity of it being a system. And that means that no matter how we sort of work on local levels to make things easier, we're still going to have to think about the entire battery, the connections, all the various aspects that it means to validate and verify if we want to actually match what the global system is doing to what we need it to do. So what this is sort of suggesting is that even if we do find these ways to replace geocentrism and epicycles with heliocentrism, we're still going to have to deal with a lot of a lot of complicatedness in the new model. Maybe less than the old one, but we're still going to deal with a lot.
SK:
Got it. Okay. I see. Let's go back. I enjoyed this conversation. I just want to go back to some of the maybe more code related stuff.
HW:
Yeah.
SK:
So you alluded to earlier in this conversation that even though we don't have maybe the tools we'd want for formal methods, that we have good enough tools for verifying the important parts of your system. So it sounds like an 80/20 rule kind of thing.
HW:
Yeah.
SK:
So I'd be curious to hear about what your heuristic is for when a given problem... What properties of code or of design kind of begs for formal methods and when you say to stay away, is it a function of, you know, how well funded you are, how big your company is, how long the code has been around? You know, how do you think about when to apply formal methods and went to not.
HW:
Okay, so this is going to be a lot of butt talking right now, because, I mean, so first of all, there's a consequence between formal design and sort of formal coding. And also that there's the fact that I'm going to be majorly biased here on account the fact that, again, this is my job. This is what I do for a living. But essentially, there's a risk of failure and cost of failure, right? So any given system, there's going to be a chance that it's going to go wrong and the probability that when it goes wrong it'll have these effects. So the common example everybody uses to defend formal methods, which is sort of a incorrect example for a lot of reasons, is a pacemaker. A pacemaker goes wrong, someone dies. Right?
SK:
Yeah.
HW:
So you'd want to formally verify that. Turns out, there's almost no formal verification in medical devices. It's not done. So that's actually not the greatest example.
SK:
Why?
HW:
Lots of reasons. But mostly because it's an industry that isn't well regulated in general when it comes to the software aspects of it. And it's kind of behind it all, so there's a lot of different... There's no standardization among medical devices and there's so many different kinds. It's weird and messy. Most industries don't use it [formal methods] that you think would use it.
SK:
That's scary.
HW:
Yeah. But in any case, there was actually, I remember there was actually a FDA advisor that came out a year or two ago saying half a million pacemaker devices can be hijacked if you're close enough to them and you've got a radio. Yeah. Try to exercise, exercise. Exercise is good. We know that works. And if it keeps you from needing a pacemaker, reduces your chance of getting hacked.
SK:
Wow.
HW:
Yeah. But in any case, so. But there's also the lower level costs. For example, it could turn out that if you're system fails, then your clients will be really angry at you and they might threaten to sort of pull your contract. Or if the system fails, you're going to have to scramble and spend two weeks fixing it and trying to figure out where it failed and what the bug is, why is it crashing all of a sudden. Right?
SK:
So yeah.
HW:
So I think my general stance is that with design, specification and verification of designs, we've often found that it is cost effective for reducing the risk of angry clients and frantically spending time fixing fires in production. That's my personal experience. That's the experience of most people I've talked to. It saves a lot of time there. So that's why I think that is more broadly useful. Specification of code, just the specification, not the verification, is a really good way of thinking about it and figuring out what you want to test Informally. And then the verification of code, I think, is challenging enough, at least formally, that it's not something most people should be doing.
SK:
So I'm realizing here that I don't really know what the specification of code is about. I know what verifying code is about. It's showing the code meets to spec. I know what verifying the design is and specifying the design is. What's specifying the code?
HW:
Let's find a good example here. The example I commonly used is sorting. Okay, so how would you test whether something is sorted?
SK:
Well, I guess, so you gave the example of you pick two indices.
HW:
Oh, wow, yeah. I was hoping you'd have forgotten that by this point. Okay, so maybe another example. You've got a sequence of numbers and you need to find the maximum of it. Right?
SK:
Okay.
HW:
What's the maximum? How would you test that?
SK:
How would you test it, that I actually got the maximum?
HW:
Yeah.
SK:
I guess it's the and of less than of all the things. I don't know what words I'm allowed to use in order to specify something.
HW:
Exactly. Yeah, I realized that might also, wow, it's really hard to come up with a good example. I've already burned my sorted one. Yeah, so you basically say it's going to be the one that's less than all the other ones, right?
SK:
Sure. Yep.
HW:
So you have this description of how it works, what it means to be the maximum of a number. Now, most people what they do that, their solution is going to be to write tests, right, to write unit tests to say, "Okay, the max of this list should be this. The max of this list should be this. The max of this should be this." Right?
SK:
Sure.
HW:
And that essentially creates an ad hoc specification and the specification of the function is now for this input, it gives this output, for this input it gives this output, for this input it gives this output. But that doesn't really describe what it means to be the maximum of a list. It's what you actually said, that it's things such that everything else is less than it.
SK:
Yep.
HW:
So essentially the specifications then a way of describing what we actually expect that to do in a level that actually totally encompasses what we want out of it.
SK:
Okay, so I see. And then how is that different than the design specification?
HW:
Design specification is sort of happening at a much higher level where you sort of hand wave away. The example I use here is that if you're trying to sort of write something that figures out whether something's a park or a bird, it's kind of hard to specify that at the code level because you have to sort of figure out how to objectively describe as specification of what a park or a bird is. But at the design level, you're basically just hand-waving saying, okay, let's assume this can definitely figure out what parks or birds are, how does this work we try to get by with their message queuing system and how does it work when we combine with sort of the user constantly trying to cancel stuff. It works at a higher level with much more hand waving and looking at sort of the larger scale interactions between things versus each line of code and how it affects the local... So maybe another way to think about it is local design versus global design.
SK:
I see. Yeah. Yeah. Okay. Got it. Thanks for the reminder. It's hard to keep all this stuff straight.
HW:
Yeah, I mean, again, the reason I'm using sort of a lot of specific terminology here is because a lot of times people sort of how this implicit assumption of how things are with this. And I find that often leads to some confusion when talking with people because they're not sure whether wait, is this formal verification? I thought this was formal verification. So I try to very clearly split into all those different subcategories I said, which-
SK:
Yeah.
HW:
Yeah.
SK:
And then another term, another word that comes up here that I wonder if you tease out for us is the difference between TLA+ and Alloy. Are they used for the same things?
HW:
There's several specification languages. So just as there's multiple programming languages, there's multiple design, there's multiple design languages. And usually if somebody says they work in a specification language, they almost always mean a design language, that is used to write designs. Then you can sort of do things called refinement to go from design to code. But again, that's a more complicated topic. But in any case, so TLA+ is one of the big ones these days and it's the one that I started really working on. Alloy is another one. It uses some different base assumptions on what we want to be specifying and how we do it. And that leads to different design decisions. And it's also really, really good. I'm actually on the board for that and we're working on writing more documentation for that too. Then there's other ones too. Those aren't the only two. But those are the two that I personally get the most excited about.
SK:
Got it. So I feel like I can't resist...
HW:
Go for it.
SK:
I feel like it's a light and I just keep bumping into it, even though you've already kind of told me.
HW:
Go for it, this is discussion, right?
SK:
The idea of specification code it just keeps attracting me. And so I'm going to do this thing, again, that you said, it's kind of not a waste of time, but it's maybe a little distracting right now. So to me, the end state of where we want to go is the abstract specification stuff seems to be the right level of talking about things. And the essential versus incidental complexity distinction, I think that's a good frame, maybe it's not precise enough. But anyways, it feels like specifications are almost by definition, essential complexity. And so if we're trying to get rid of all the incidental complexity, a design language that's based on math seems like a really great way to go. And then of course, there might be reasons why our specification language or the way in which we mathematically model something is incomputable or is incomputable, efficiently. So I feel like the dream is that somehow you have an interactive compiler experience or interactive refinement experience where it, you know, aids you in taking your design and making code that's somehow efficient. It exposes why the specification you wrote isn't efficient and then it gives you ideas and you can work with it.
HW:
So I have three things to say that. First of all, I want to say that the idea of talking about essential versus incidental complexity, I never considered it in those terms, that's a really good way of thinking about it. I really like that. Thank you.
SK:
Oh, sorry. You aren't familiar with the Fred Brooks?
HW:
No, no, I'm familiar with Fred Brooks, I love the Mythical Man Month, I've read it multiple times. I just for some reason, I've never connected the idea of trying to think about specifications as essential verses incidental complexity, they just for some reason never was connection I made in my head.
SK:
Okay.
HW:
So that's what I'm like, hey, thank you for that. That like really works out as a great metaphor and a great model of understanding things.
SK:
Oh, okay.
HW:
Yeah. Hey, I don't know a whole lot. Second thing is, and this is just my opinion, I'm not talking about any party line, this is just my own like philosophical ramblings, I think that has been a thing that we've been doing for a while. Like C is a higher level specification of assembly and then Python is a higher level specification of C. And then prologue can be a higher level specification of all of that. So I think that essentially, as we get better and better at sort of synthesizing specification in code, we start to call them new programming languages and higher level ones.
SK:
Yeah. So in that line of thinking... But TLA+ doesn't compile down....?
HW:
Yeah.
SK:
Right. So it's kind of broken that line ... kind of on purpose. It's saying, "My job isn't to compile down to Python or whatever."
HW:
Yeah. And the reason we do that, and again, like TLA+ isn't the only thing that does it, and it's not even the first thing that does it, but the reason we do that is so that way we can talk about specification of any arbitrary level of complexity. I acquired a specification where I'm talking about, okay, the actors in the system, one of them is our server cluster, another one is our ETL pipeline, the third is our data warehouse, the fourth is all of our users and the fifth is an adversary. And then the sixth is, I don't know, an earthquake. That ability sort of jump between different concepts of what an actor is and what a system is, makes it really easy to specify really complicated properties very concisely. But the consequence is how do you sort of convert your adversary or an earthquake into code?
SK:
I see. I see. Okay.
HW:
Yeah.
SK:
So I guess one way to think about it is if we have this progression of getting more and more abstract high level specification-y languages that compile down, so we're kind of plotting along, getting more and more abstract. And then Leslie Lamport was like, "You know what? You guys keep doing that and eventually you'll get to something like TLA+, but I'm just gonna go ahead and jump all the way over to TLA+ and make this and it won't connect to that pipeline, but it'll just be in its own little world to help you reason about those other things."
HW:
I'd say yes, with two caveats. Leslie Lamport wasn't the first person to do this. I believe, and I can't remember who did this, but one that was definitely much earlier is this one called Zed. And actually TLA+ was in part a way of trying to write a specification language that was simpler than something like Zed, as was Alloy.
SK:
Okay.
HW:
And two, I sort of name drop this a couple of times, but there's this concept called refinement, which basically involves taking a spec and then writing a more fleshed out more complicated spec that matches it. And you keep doing that until you get to a point where you're basically at code and they just translate that into code. That's the concept called refinement. And it's also something you can do, but going from a really high level spec, refining it all the way down to implemented code is really, really hard. It's basically just as hard as proving code correct in the first place.
SK:
I see.
HW:
Yeah.
SK:
Okay. Cool. Well, thanks. Thanks for doing it again, even though we kind of already talked about that. I felt I got more out of it the second time.
HW:
I mean, I will be honest, the way I'm sort of seeing it and us talking about this multiple times is, every single time we talk about this it is evidence to I'm not explaining it well enough. So one of the really fundamental challenges about this space and the reason I write so much on it is that we've not done a really good job of both explaining what it is to people and teaching it to people. Have you tried looking an answer about any documentation about formal methods?
SK:
Yeah, that's what I spent today doing to prepare.
HW:
Yeah, it's not that accessible outside of this discipline. Right?
SK:
Your stuff was pretty good, actually.
HW:
Yeah, first of all, thank you a lot. Second, that's where it's sort of why I write so much because I think that it's both we don't do a good job explaining it, but also it's a hard thing to explain. So that's why I sort of care so much about explaining it well. And that's why every time we come back to this in this podcast of you have a question like, wait, you don't understand one thing, I'm just taking notes here being like, "Okay, how do I explain that thing better? How do I make this more intuitive? How do I be clear about the difference between a and b?" So you should not be the one apologizing here, I should be the one apologizing for explaining it poorly in the first place.
SK:
Well, anyways, thanks for for that. Yeah. I appreciate that you're so concerned with the reader. It shows you a lot of empathy for your students. You're a true teacher.
HW:
Yeah.
SK:
Okay. So I there's one topic that's kind of a meta topic that I want to switch to.
HW:
Go for it.
SK:
But I want to give you an opportunity, okay. Is anything else you want to say about to TLA+ or formal methods before we jump there?
HW:
It's cool. I think it's really cool. It's not a panacea. Nothing is, but it's cool. I think it's interesting. I think it's useful. I think it's powerful. Most people will probably get as much benefit out of exercise and getting adequate sleep than about using any sort of tool no matter what it is in programming. That's my personal stance.
SK:
Wait, so you're saying that if you don't do exercise and get good enough sleep and you don't know formal methods, you should first get exercising and get enough sleep and then do formal methods?
HW:
Yeah.
SK:
Okay, good. Just making sure we have our priorities straight.
HW:
I say this as a person who's permanently sleep deprived. I do not practice what I preach here.
SK:
Okay, that's good to know too. Well, I guess, so speaking about practicing and preaching, the topic that I want to talk about next is related to a really fun Twitter exchange you and I had recently. You know where I'm going?
HW:
I think I know where you're going about this. Do you want to explain, I think you could explain things better than I could.
SK:
Well, I'll explain my part then you explain your part. So I wrote this in 19-tweet rant about the comprehensibility of software and how when you have millions of lines of code, nobody can understand all the code particularly because of the way code is written. And it's it's scary. That was kind of the thesis. And in there, maybe number 12 or something, of my tweets, I made an analogy to how this is much worse than in other kinds of engineering, such as building bridges, where there aren't 12 million lines of code and so other people could understand, via the specification, how things work. And so then Hillel saw this and that one tweet in particular, the comparison to architecture, you kind of ran with it. So maybe you take off and start is from there.
HW:
So at that point, what I did was, so I'll explain why that kind of was the one that I reacted to, but basically I'm like, "Wait a minute." So you basically had this tweet about how you're like, "Well in say a bridge building, if sort of every single person left, they'd still have all these blueprints and all these construction plans. A new group come in and basically start where the old group left off." And what I ended up writing was-
SK:
Yep.
HW:
Yeah.
SK:
Thanks for specifying that out because I think that just to continue specifying out my original point I was talking about how in software, or I was alluding to how in software the word "legacy" is this bad word. Nobody likes legacy code. Nobody likes to interact with old code that nobody else can explain to them. Basically, I was arguing that, in software, most of the knowledge of how something works is stored in the people's heads and it's not actually in the code. And so if all the people die or leave and someone else's forced to... a new team is forced to understand a million lines of code, it's going to take forever. They almost have to start over from scratch or nobody's gonna want to touch it because it's just so scary. And so I thought, or I asserted, but I think you helped show me I was wrong, that it's better than other kinds of engineering.
HW:
Right. So basically what happened is I responded to that and I was like, "I'm not sure this is the case. I'm not an engineer. I'm not an architect. I don't know how does another engineering fields." And then I'm like, "What it looks like, to me, is that it's probably very similar to how it is in programming where so much as in people's heads. And what's not is basically thousands of pages of documentation." And I'm like, "Does anybody else know this who's done both?" And one thing, if there's one thing I'm really good at, it's getting people who are really cool to follow me on Twitter. That's my main skill more than anything else that I think is valuable to me. And I just got a lot of really great Twitter followers. And a lot of them turns out to have been both people who have done programming and done architecture, as professional architects and professional software engineers. And they're all like, "Yeah, they're about the same. They're both equally terrible at restoring the stuff." And then it basically got to this thing, where just a bunch of engineers from different fields were just chatting about how terrible documentation and legacy projects were their fields.
SK:
Yep. So the conclusion that I took, which again, this could be wrong, but the conclusion I took was that potentially the reason why it appears to me that software is so much worse than other fields, well, you know, like you, I don't know anything about architecture, but why software's seems so bad, just in general to me, knowing nothing about other fields, I think is an uncanny valley problem. And the uncanny valley is that when you have graphics that aren't supposed to look like reality, there's no problem. But the closer... this is like an uncanny valley, where if you get graphics that look really almost lifelike, then they look really bad because your point of reference is the difference as opposed to just not having a point of reference. So my new thesis is that programming is so good, or not quite so good, has the potential to be so good that all the places in which it's not good are glaringly obvious.
SK:
And so just to spell that out, one of the your amazing followers commented about how, you know, architecture observability is way worse than in software. For example, we take pipes and we put them behind walls where you can't see what's going on. If you want to see you have to break down a wall and then you have to do all this work to repair it. And in software, Bret Victor will complain about how we don't have observability and you have to run the code with console.log statements and that's a million times better than having to break down a wall.
HW:
Yeah.
SK:
But then of course we do way better than consoled out log statements and it's kind of obvious how we can do better than consoled out log statements, the imagination is so much easier. When how to do better than pipes in the walls, I guess it's like putting sensors on all the pipes, it's less obvious. So maybe the reason why I complained about why software is so bad is because it seems like you can prove it easier.
HW:
You can see how it can be better.
SK:
Yeah, exactly. Does that resonate with you more now or you still want to test that?
HW:
I think it does.
SK:
Yeah.
HW:
I mean, I do want to test that more. But I also tend to agree. I tend to also get really frustrated all the time, where how software could be better than it currently is. So I'm really bias here in your favor.
SK:
I do want to go to the great to the Great Theorem Prover Showdown because I think that that also has a lot of these meta concerns. So I think, yeah, you'd probably tee that up better than I would. I just wanted to start by saying I loved what you did there.
HW:
Thank you.
SK:
A really fascinating kind of challenge.
HW:
So I'm actually running a talk on this because somebody asked me to give a talk on this and I think it would be a lot of fun. So my general stance is as follows, let me just sort of tell you my stance that sort of justified this and where I'm coming across on this. So my general stance is that, well, first of all, at a formal verification level, nobody's really sure what's best. Some people do the stuff with highly procedural stuff for various reasons. Some people do functional stuff. I know, for example, LiquidHaskell, one of its major decisions that it made, was it actually reduces the amount of stuff you can say it expressing it to get deterministic solver times. Which everybody's like, "Oh, that's a great idea. Why didn't we think of that?" But so basically, it's sort of very fuzzy. And then my position is that outside of formal verification it's also really fuzzy. And I think that "what's the best language?" Is sort of a straw man or a side topic for correctness because I think, as I've made the very strong claim multiple times, which language you use probably has less effect on your ability to write correct code than how much sleep you've gotten. Which is a claim I will bet money on happily.
HW:
Yeah. We had a long discussion, for people listening to this podcast, we had a long discussion, I think was cut out, about betting money for what you believe in. But in any case, so one thing that happens a lot is that people sort of have this de facto position that functional is better than imperative, static types is better than dynamic types when it comes to correctness. And I disagree with that for, well, I don't disagree with that core idea, I don't know that that's correct or not. I tend to work primarily in imperative languages and slightly more often dynamic. So obviously I'm biased towards those two. But I firmly believe that we don't necessarily know which is quote unquote better for bugs, simply because we haven't done really careful analysis of what we actually mean by that and really careful studying of that. And most of the studies that we have done are inconclusive.
HW:
So my position is we don't yet have enough information to really make a decision if a decision is even makable. Yet at the same time, I see a lot of people making these claims as obviously true. That you're immoral if you use sort of dynamic typing or that functional is obviously going to be more correct by definition than imperative. And I've often got into these arguments where I'm like, "Look at all this amazing stuff people have done with imperative or dynamic typing that is correct." The response is usually well, it doesn't count for reasons X, Y, and Z. The most common one I hear is that, and the one that kind of trigger this, was an argument where I'm like, "Look at all this amazing stuff people have done with formally verified imperative code." And the response was, "If they did functional programming they wouldn't have needed any of that verification, it would have been obviously correct."
HW:
So that kind of got my hackles going a bit. And I decided to, after this and a few other recent similar arguments, some very enviable, some more aggressive, I decided to basically say, "Okay, enough was enough. I want to actually have everybody put this to the test." And I wrote the Theorem Prover Showdown. Which was the literally just saying, hey, it wasn't even supposed to be a theorem prover showdown. It was just to be like, "Hey, if you really believe functional is going to be more obviously correct than imperative, write these three functions I did imperatively. But also I formally proved them correct with these specific specifications. You have to prove the exact same specifications in your functional code." And it was three relatively simple things. One was left pad, one was getting the unique elements a list, and one was doing what's the fulcrum, which is finding the where you can cut a list in half to minimize the difference between the sum of the left half and the sum of the right half. Minimize the absolute difference, not draw a difference.
SK:
Yep.
HW:
Which is itself an interesting question. SO those are the theories I posted. I did them in this language called Daphne, which is essentially designed to allow for separation logic in imperative coding, very C sharp like. And it basically got a bunch of different people coming in. Some people were sort of trying to argue that while they couldn't do it, here's arguments anyway for why functional is better than imperative. Some people were like, "Hey, this is gonna be easy to to," then they're like, "Wait, this is actually really hard to do." Some people were like, "It doesn't matter. I don't even want to bother doing this. This is too easy to bother doing." And by far, the most interesting people, in my opinion, were the people who did, "Hey, I did it. Here's my solution. Here are my thoughts on this entire thing." Which was really cool because a lot of people have really interesting insights and we also got, and this is where it started to turn into a theorem prover showdown, when actual serious luminaries, Ranjit Jhala and Edwin Brady got on board and started writing explanations and proofs in their own proven languages that they invented. And that was really cool to see.
HW:
It was a lot of fun. I think everybody sort of had a blast with it. I got banned from Twitter like five times. I think for unrelated reasons, but I don't know. And I ended up doing a write up, which got pretty popular. And it also led to this on thing that we should probably link called Let's Prove Left Pad, which is a way that people can compare different, for their own education, different theorem improving languages. Where we just have a bunch of proofs of left pad in different styles.
SK:
So it's the proofs that your comparing, it's almost ToDoMVC, but for theorem provers?
HW:
Well, so the original challenge was literally to like, "Can you do this?" It didn't matter who you were. It didn't matter how long it took you. It was just: can you do this in a functional language? And then the Let's Prove Left Pad was one of the things that came out of that as a way of comparing and contrasting the different theorem prover languages. So sort of out of my attempts to be a huge jerk on the internet, something useful came out of it, which is kind of impossible on the internet, but happened once.
SK:
Yeah, it is kind of cool that, on a meta note, part of what's so fun about this whole story is that in a series of tweets you like spawned all this wonderful internet collaboration about coding.
HW:
Yeah.
SK:
But I think part of what I want to talk about is that I find myself to be, I found the criticisms you made about people who say that function programming and purity is just obviously better, I find those criticisms to be describing me, to some extent. So I spent maybe like an hour today trying to think through if I'm being ideological or in which ways I'm being ideological and how I can better... And is it a question of I'm just not explaining myself well enough (I'm being too dogmatic)? Or is it that my beliefs actually aren't justified and so I have to go do some soul searching and think about what truly is reality versus you know, religion? So anyways, I guess if I can pause there and say thank you.
HW:
One thing I want to clarify, it's fine, yeah. Welcome. So it's fine to have opinions, it's fine to believe something. I'm not saying it's wrong to believe that's a functional is better than imperative. My main criticism was more just people who say that as fact, as obviously fact, and getting mad at people who don't believe the same thing. I guess it was more of an attack on dogmatism, or I guess rabid dogmatism, than beliefs. I mean, I have beliefs about programming too; I'm not going to force them on other people. But I recognize that I have beliefs too.
SK:
Yeah, well, I guess part of why this really struck a cord for me, or maybe struck a nerve would probably be the right way to phrase it, is I think I alluded earlier in this conversation to this essay that I wrote, which really was trying to be not quite a manifesto but an essay that's trying to explain the benefits of functional programming to people who already kind of like functional programming. This essay was trying to show how it has much wider reach than they thought or much more generality than they thought. So that's kind of the purpose of the essay. And I got back basically the opposite of what you're talking about. Basically everyone was like, "Wow, you're taking all these FP things to be true when they're clearly not." So it just struck me that the dogmatism, or basically the communication gulf between people, and religion and dogmatism, I'm not exactly sure where I land at this point, but I just know that it's on my mind.
HW:
That's actually a really interesting question. And I'm going to, once again, refer to history here. If you don't mind?
SK:
Please.
HW:
So have you ever looked at C2, the C2 wiki?
SK:
Yeah.
HW:
So one of the interesting things here is that there's similar equally fervent language wars on the C2 wiki, it happened over the past, from like 1990 to, I think, 2010 was when I was most updated, but they were about things like object oriented verses procedural, or smalltalk verses not smalltalk, or lisp verses not lisp. So it's really interesting because it seems like almost every time you get a community that's different from the mainstream in some way, you get really fervent arguments about which is better. So I don't think it's anything particularly special about imperative verses functional, it's more like between people.
SK:
Yeah. Well, I would agree with that.
HW:
Yeah.
SK:
That it's definitely not specific to just, you know... I think it might just be that in today's, in the air today, functional programming is kind of having its heyday.
HW:
Yeah.
SK:
But not everybody's on board with it. So that's kind of on everybody's mind. But in the past, yeah, it was like object-oriented versus whatever else it had to compete with.
HW:
Yeah. Yeah. So I think one of things they also see is that once you get a smaller community it tends to be a common thing where people only remember the assholes in the community.
SK:
Of course.
HW:
So when I sort of think about imperative verses functional, I'm not thinking about the person who's going, "Well, I think that this is better for coding for these specific reasons and here's all these contexts where I think you're going to have this thing with very careful compare and contrast. And talking about different aspects of functional programming can paint different aspects than imperative and styling." I'm gonna remember the guy who's like, "Imperative programming is immoral and if you do it you're actually a bad person." As happened with a lot of the arguments. That's actually one of the reasons I tend to be so very, very careful about what I say about say formal methods because I don't want to be that person who's the asshole, who's convincing everybody that formal methods is full of assholes.
SK:
Ah, okay. Well, that makes sense.
HW:
Yeah.
SK:
That you feel like you're kind of a representative for a community and so you want to be extra careful that you come across as good.
HW:
Yeah. And I think what happens is that functional versus imperative is so heated not because it's actually that heated of an argument as much as that there's people on both sides that just tend to be huge assholes about it.
SK:
That's fair. I think one of my hunches is that, I think it's related to what you were saying about a small community, when you have a community of people who all agree with each other, the conversations of the inner-community are very high bandwidth conversations because we don't have to explain why we believe a certain thing.
HW:
Yeah.
SK:
We all just have common terms and beliefs. And so when someone who exists in this wonderful community leaves and it has to talk with someone who's not in that community, I feel like that's where you get into trouble. And the internet is ripe for that because it both fosters tiny communities and also is totally open and anyone can just stumble into a place where they don't know what the words mean.
HW:
Yeah and that's also the reason I tend to obsess over really, really strong and very precise clear arguments because I feel those are the ones that are best at convincing people. It's easier to sort of explain to a person why I say recursion is good, if you talk about walking a file system than if you talk about Fibonacci.
SK:
Yeah, well, I think you had a really great post on your blog about different kinds of examples and they're examples where you're trying to explain things versus convince people of things.
HW:
Yeah, I should probably update that post.
SK:
I think that's a really good post for me, actually, because I was writing a post to explain a concept and everyone who read it thought I was trying to convince them. And they were like, "This article does not convince me. I am very much not convinced." I was like, "Uh..." I don't know how to, what should I put in the header to be like, "Don't read this if you already disagree with these things. Read this only if you already agree and want to understand something new"?
HW:
Honestly, just that. I'm actually working on this piece about adversarial modeling and forma methods. It's pretty much going, "Hey, this piece is already assuming you know TLA+ pretty well. It's not going to try to convince you that it's a good thing. It's talking people who already want to do it. So just letting you know."
SK:
Yeah, okay.
HW:
Yeah.
SK:
Yeah, I think, then there you go. I just need a-
HW:
Disclaimer.
SK:
Yeah, a disclaimer at the top.
HW:
Yeah, like you're welcome to read it either way.
SK:
Yeah. That's good. Okay. Well, this was a very fun. I like at the end to give my guests an opportunity to pitch or list any ways they want to be interacted with on the internet, any things they're working on they want help with. Anything like that that you want to share, now is the time.
HW:
So I guess the first thing is that the easiest way to contact me is I'm on Twitter @hillelogram, we'll probably include that link. And I got a website HillelWayne.com. I'm really lazy about thinking about names. I'm just like, "You know what? That's my name. Hillel Wayne, done. Got the website." So yeah, either of those are great ways to find or contact me. If you work for a company that's interested in this stuff, I do consulting. Okay, actually, here's one thing to pitch. So one thing I do really enjoy learning about is both really weird software niches and really interesting aspects of software history. So if you end up having one of those two that you're just burning to talk about, feel free to DM me on Twitter. I love getting this stuff.
SK:
Great. Well, thanks so much for taking the time to do this. It's a lot of fun.
HW:
Yeah. Thank you so much for having me on.
SK:
All right. Bye.
HW:
Bye.
SK:
If you enjoy these conversations, I'd bet you'd fit right into our online community. You can join our Slack group at FutureofCoding.org/slack, where we chat about these topics, share links, feedback and organize in person meetups in various cities, New York, London, now San Francisco, Boston, and also Kitchener-Waterloo in Canada. If you'd like to support these efforts, there are a few ways you can be helpful. I would really appreciate any reviews you could leave for the podcast wherever it is that you listen. If any episode in particular speaks to you, I would encourage you to share it with your friends or on social media. Please tag me if you're sharing it on Twitter, @SteveKrouse, so I can join the conversation. I also accept support directly on Patreon, at patreon.com/SteveKrouse. And as always, I am so thankful for the ultimate form of support, your constructive criticism. So please don't be shy with your advice. Thanks again for listening and I will catch you on the next episode.