Future of Coding

#40 - Joining Logic, Relational, and Functional Programming: Michael Arntzenius

06/13/2019

This episode explores the intersections between various flavors of math and programming, and the ways in which they can be mixed, matched, and combined. Michael Arntzenius, “rntz” for short, is a PhD student at the University of Birmingham building a programming language that combines some of the best features of logic, relational, and functional programming. The goal of the project is “to find a sweet spot of something that is more powerful than Datalog, but still constrained enough that we can apply existing optimizations to it and imitate what has been done in the database community and the Datalog community.” The challenge is combining the key parts of Datalog (simple relational computations without worrying too much about underlying representations) and of functional programming (being able to abstract out repeated patterns) in a way that is reasonably performant.

This is a wide-ranging conversation including: Lisp macros, FRP, Eve, miniKanren, decidability, computability, higher-order logics and their correspondence to higher-order types, lattices, partial orders, avoiding logical paradoxes by disallowing negation (or requiring monotonicity) in self-reference (or recursion), modal logic, CRDTs (which are semi-lattices), and the place for formalism in programming. This was a great opportunity for me to brush up on (or learn for the first time) some useful mathematical and type theory keywords. Hope you get a lot out of it as well – enjoy!

Full Transcript

Transcript sponsored by Repl.it

Corrections to this transcript are much appreciated!

SK:
So, welcome, Michael.
MA:
Hello.
SK:
So, you go by Michael, yeah?
MA:
Yeah.
SK:
Because your online username is @rntz.
MA:
Yes, rntz.
SK:
And that's how it's pronounced, rntz?
MA:
That's how it's pronounced, rntz.
SK:
Where did you undergrad?
MA:
Carnegie Mellon.
SK:
And you studied CS?
MA:
Yeah, computer science.
SK:
And so, when did you get into programming languages specifically in computer science?
MA:
Very shortly after I got into programming.
SK:
Oh, interesting.
MA:
So, I think the thing that I, sort of, vaguely wanted to do when I started programming was make video games, which-
SK:
That's a common one, yeah.
MA:
Yeah. But I, sort of, very quickly got frustrated with the tools available to me, right? And started bouncing through programming languages. What programming language did I start with? It might have been Python. It might have been Visual Basic. It might have been... I don't even remember now. But anyway, I eventually found my way to Lisp and to Scheme. And that was really sort of a revelation. I really enjoyed Lisp and Scheme.
MA:
And I just sort of started going down the building tools for making your own tools rabbit hole, right? Because any time I would try to do something concrete, I would get frustrated that it was hard. And I would think, how could I make it easier to do this thing? I started thinking about building a tool for that thing and if you keep doing that, you end up with a programming language. And I've been going down that rabbit hole for, I guess, more than a decade now.
SK:
You just never stopped yak shaving.
MA:
Yeah. Yeah, sort of. I've like narrowed my scope a lot, right? Which academia will do to you, right? You have to focus if you're going to get anything done.
SK:
Yeah, yeah, well said. So, I find that a lot of us go through various paradigms and topics, like blocks-based programming or structured editors or logic programming or functional programming. Databases, you know. There are a bunch of different ways to improve programming. What was your arc through all those topics? Or was it not as winding? Did you kind of know early on?
MA:
No, I mean, there's been some rambling. So, the first... the place I embarked was Lisp and Scheme. Which is sort of-
SK:
It's a common starting place for programming interested people.
MA:
Right. And Lisp and Scheme had, sort of, a couple of interesting ideas that have stayed with me. Lisp when it first came out had dozens of ideas that other languages didn't have, like garbage collection. But nowadays, garbage collection is really common. So, garbage collection didn't leave a lasting impact on me, other than that, like, yeah. I don't like having to do manual memory management, but that's solved. We know how to do that now.
MA:
But the things that are, even now, not entirely mainstream about it are s-expressions and using s-expressions to represent all of your data, right, or most of your data. Functional programming, that's obviously getting much more traction nowadays, but it's still not entirely mainstream. I don't know. I guess it depends on who you ask. And macros. And so, early on, the one that seemed the most exciting to me and most cool was macros.
SK:
Yeah. And I guess it goes hand-in-hand with the s-expressions thing. I guess it's almost less the s-expressions and more the homoiconicity.
MA:
That's another phrase that I-
SK:
You don't like the phrase?
MA:
Well, I find it kind of ambiguous. People make a big deal out of it, but they can never define exactly what counts as being homoiconic. The thing that I think is important is it has a built-in data structure for representing the syntax of your language, but that's not unique to it. Python also has this. Most people don't know it, but Python has an AST datatype in the standard library.
SK:
Oh, wow.
MA:
But also, this datatype, s-expressions, is sort of the thing used to represent almost everything, right? S-expressions aren't a special-purpose datatype. You use their building blocks everywhere else. You use lists. You use symbols. You use numbers, right? In Python, if I want to understand the AST, I have to go read the documentation specifically for the AST. In Lisp, I'm already... there's very little distance between the tools that you familiarize yourself with for general programming and the tools that you use to write macros.
MA:
And so, that's sort of what I think homoiconicity is: the thing that makes macros easier. It's that the same data structures and concepts you use to write ordinary code are the ones you use to write macros. There's not a huge gulf between them, which makes it really easy to get started writing macros.
SK:
Yeah, that's really well said. I think that captures part of what's really, really powerful about the s-expressions, macros, pairing. You can turn the tool on itself and use it in the same way you've been using it to do other things, but on itself.
MA:
Yeah.
SK:
Yeah, I guess it's quite empowering, because I think it's in the theme of blurring the line between a creator of a tool and a user of a tool.
MA:
Yeah, definitely. I mean, it's kind of intoxicatingly powerful, right? Everybody gets turned on to... I don't know about everybody, but a lot of people get turned on to macros. And then, some never stop trying to use macros to solve every problem, right? It's really fun to write a macro that gives you a little language for solving a particular problem. So, right, oh, and also, the other thing that's relevant: I applied to work at my dream job, which was working on Eve.
SK:
Oh, with Chris Granger and, yeah, yeah.
MA:
With Chris Granger and Jamie Brandon. And I should remember the names of the other people who were working on it, but I don't.
SK:
Corey and... yeah.
MA:
Corey was after I applied.
SK:
Yeah, I also sent an email. I don't know if applied is the right word for, "I want to work for you.", in an email. So, I think we have that in common. I imagine a lot of the listeners to the podcast are, well, saying, "Yeah, yeah, I emailed Chris, too."
MA:
Yeah. Well, I know them. And then, they flew me out to interview.
SK:
Oh, wow. Okay, great.
MA:
And then, turned me down.
SK:
You got farther than I did. I got a, I think, less than a sentence of like, "Sorry, we're not interested." Or like, "Sorry, no."
MA:
Yeah, but talking with them was really cool and gave me a clear idea of what they were trying to do. And part of the core technology they were building on was this Datalog-like stuff.
SK:
Yeah. Oh, that's when you first heard about... You first got into it?
MA:
That's really where I first got interested in the relational algebra.
SK:
No way! That's fascinating. So, I guess, I probably have said this in the intro to the podcast, but you're like really into Datalog, it seems. That's what you're basically about. So, that's a fascinating... yeah, a fascinating little historical tidbit that you got it from the-
MA:
Yeah, my research direction has been determined by getting turned down for my dream job.
SK:
Well, it's just so funny to think that Eve was... which just feels so outside of academia. They took things from academia, but like the fact that they were then able to influence academia, I just find that somehow fascinating and wonderful.
MA:
Yeah, I mean, I think that academia is more open-minded than a lot of people might think.
SK:
Yeah, of course, than their reputation. Yeah. I guess they're just people on like, you know-
MA:
Especially grad students.
SK:
Especially grad students.
MA:
You give less as a person when you get tenure.
SK:
I see. I see.
MA:
I'll regret saying that at some point in my life.
SK:
Well, I guess, there's like a period of time in which you can like choose which various sliver of knowledge you're going to be an expert on. And once you've established that, it's not... You can still change, but once you've established-
MA:
But it's hard, right?
SK:
It's kind of a sunk cost thing, but it's also like... You're already there. You might as well just keep going.
MA:
Yeah. You can think of it as sunk cost, or you can think of it as you have built up an expertise in a very specific area, and it's sort of a matter of your relative advantages, right? You have a lot of knowledge of this area, so you have a relative advantage in working there... starting in a new area is like starting all over again.
SK:
Yeah, yeah, exactly. So, when you're like a grad student, it's very easy to be influenced by things. But then, 10 years from now, you're not going to want to... If the next Chris Granger comes up with a new company in 10 years with a new direction, you're like not going to switch to that thing, you know? It's like a one-time thing, maybe.
MA:
Yeah, or it gets harder.
SK:
It gets harder, yeah.
MA:
Or less common. Yeah.
SK:
Cool.
MA:
I've lost track of where we are.
SK:
I'm curious how you originally got into FRP.
MA:
Right, FRP.
SK:
And how you found that.
MA:
I got-
SK:
I'm still obsessed with it. I've been obsessed with it for years. Since I saw React JS, I was obsessed with it. I was into all the front-end frameworks. Then, finally, I found Conal Elliott's work. And then, I was like... Ahhhhh. And I'm really into it. And now, I'm like annoying, because I'm so into it. And I have like nobody to talk to, because I almost feel like it was like... Or anyways, the people I talk to aren't really interested in it the way that I am.
MA:
Yeah. I mean, I think it's sort of gained a brief moment of being slightly more mainstream, especially with Elm, right? And then, Elm actually kind of abandoned the FRP approach. And there haven't been a lot of attempts to really push it forward since then. I mean, there's been academic work on it, but it's not in the spotlight anymore. And it was never hugely in the spotlight.
MA:
So, I got interested in it, more or less, because, yeah. I think Elm might have been part of it, exposing me to it. And it seemed like a nicer way to write user-facing programs. It still seems like it might be a nicer way to write user-facing programs. Although, I think my attention has turned more generally to the problem of incremental computation. Which FRP, I think, is... Dealing with change is how I would summarize the problem of front-end programming as I see it.
SK:
Yeah. Well, I guess because like events, like, you know, you have some UI. And it mostly stays the same. But as users interact with it, the UI slowly changes.
MA:
Yeah. But also, the external world is changing, right? You're running a website where someone has a shopping basket, right? And maybe you're trying to do the distributed setting. Now, things can change. They can change at different places, at different points in time. And you have to integrate all of that somehow.
SK:
That's interesting. I guess, because I don't... I usually make the distinction in my head between batch programs and then, reactive programs. Reactive programs like respond to the environment. Batch programs just need to process something once. And I guess what a reactive program is is something that changes over time, but it has inertia. It's not a step-function kind of thing. It's usually a smooth kind of changing thing, smoothish. Occasionally, I'll press a button and the entire page will change. But usually, it's like-
MA:
Most changes are small.
SK:
Most changes are small. Anyways, maybe that doesn't make any sense.
MA:
No, I mean, it makes sense. Continuity is sort of a huge theme that connects to everything, if you look into it deep enough. And I don't fully understand it.
SK:
Almost like differentiability.
MA:
Yeah. In a certain sense, only continuous functions are computable. There's this connection with topology and computation that I do not fully understand.
SK:
I see. That's interesting.
MA:
Anyway, yeah. So, I got into FRP, because I was interested in it as a better, or nicer, model of writing UI programs. And why did I end up not so interested in it? I guess it basically got sidelined in my mind by the Datalog and Datafun and relational programming ideas.
SK:
Cool. Well, so, it feels like you first got into the logic programming, then got into relational programming? Or it kind of happened at the same-
MA:
Happened at the same time. I treat them as kind of the same.
SK:
Oh, relational programming and logic programming. Oh, they're kind of synonyms. Because to me, one feels like... Relational, I think SQL databases. And logic, I think Prolog. But I guess part of your thing is you want to unify them.
MA:
Oh, I don't know about whether unifying them fully, but I do think of them as strongly related, right?
SK:
Do most logic programmers think of that in that way?
MA:
I don't know. Certainly Will Byrd and his collaborator, Dan Friedman, refer to their work as relational programming.
SK:
Who's Will Byrd?
MA:
Will Byrd is a... How do I explain Will Byrd? He's an academic. He's, I think, probably mostly known for miniKanren, which is a relational/logic programming language that is distinctly not Prolog. It's built in Scheme. And its, sort of, notable features... well, there are two things I would mention. First of all, there's a mini version of it called microKanren, which is notable for having an implementation that is like 50 lines long.
MA:
And that has been ported to every language under the sun, because it has an implementation that is about 50 lines long. But it captures the essence of miniKanren, so it's a really fun thing to play around with, if you're interested in relational programming. And the interesting thing about miniKanren, beyond that, is that unlike Prolog, its search strategy is complete. So, in Prolog, if you write a particular thing, you can read it as logically specifying something.
MA:
Like, let's say, you write transitive closure. So, you have edges in the graph. You have a relation, a predicate, edge, that takes two arguments and says there's an edge from this node to that node. And you want to find a predicate that gives you reachability, that tells you there's a path from this node to that node. If you do this in the obvious way, which is just there's a path from X to Y, if there's an edge from X to Y. And there's a path from X to Z, if there's a path from X to Y and a path from Y to Z, right? It's edges, but transitive. If you do this and you feed it to Prolog, it will infinite loop and generate nothing of interest.
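(For readers following along: written out in Datalog/Prolog notation, the two rules Michael describes are `path(X, Y) :- edge(X, Y).` and `path(X, Z) :- path(X, Y), path(Y, Z).`)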
SK:
Because it'll just keep generating extra facts that you already know?
MA:
Yeah, so, it'll take the second clause, which says a path can be built from a concatenation of two paths. And if you ask it, "Hey, what are the paths?" It will keep applying that second rule indefinitely.
SK:
I see.
MA:
Because it does, sort of, depth-first search, and the search tree you've given it has infinite branches.
SK:
I see.
MA:
And so, this is really annoying from a pure logic point of view. I've given you the logical definition of this. Why aren't you computing it? This is the promise of logic programming discarded for the sake of a simple implementation. And miniKanren is like, "No, we will not discard that promise." We give the complete search strategy. If you give us some rules, we will give you eventually all of their consequences. No matter how you order the rules, no matter what you do, we will eventually find all the consequences of these rules.
SK:
And they're able to do this, because...?
MA:
They changed the search strategy.
SK:
Oh, it's just an implementation detail, like-
MA:
Well-
SK:
It didn't restrict the way you could-
MA:
There are a couple of features in Prolog that are specifically about the search strategy and are about sort of extralogical things. So, for example, there was the cut operator, or the bang operator, that prevents backtracking. That's about the search strategy. It prevents backtracking.
SK:
Wait, this is in miniKanren?
MA:
No, cut is in Prolog.
SK:
I see.
MA:
It is not in miniKanren.
SK:
So, and miniKanren's almost like higher level? It wouldn't let you-
MA:
Yeah, it would not let you do this.
SK:
Explain the... like, direct the search strategy.
MA:
That's not entirely true. The order in which you put things will affect the order in which the tree gets searched, but it will eventually find... search the whole tree.
SK:
How will it know when to stop, in a way that Prolog doesn't know when to stop?
MA:
So, if you give it an infinite search tree, it will never stop. But if you give Prolog an infinite search tree, it might never stop and also not explore the whole tree. Right? So, it might just get stuck going down one particular branch of the tree and never come back up. Whereas, miniKanren is more like doing... It's not doing a breadth-first search, but it's doing something more similar to a breadth-first search where eventually, it will reach any node in the tree.
SK:
Oh, but it might keep going forever.
MA:
But it might keep going forever if the tree's infinite. Yeah, that's your problem.
SK:
I see.
MA:
Right. Datalog, on the other hand, simply does not allow infinite searches. That's the area I focused on. I focused on very decidable logic programming.
SK:
Okay. Well, let's rewind and unpack some of these terms, because I want to give... I want to use this opportunity to give a good foundation for these topics, because I think a lot... most of us, I think, have heard of these things, Prolog and logic programming and relational things. But anyways, I just want to start on firm foundations. So, when I hear relational, I think of Codd and databases.
SK:
So, maybe give like the brief history. Is that kind of where relational came from?
MA:
Yeah, yeah. It's a perfectly reasonable thing to think of when you hear relational, right? Like, he created relational algebra or relational calculus. I still don't know what the difference between those two is, by the way. And from that, came SQL and most of our modern database work.
SK:
And so, when I think of that, I think of path independence and normalization. That's where my brain goes, but is that... That's not-
MA:
What is path independence?
SK:
The opposite of path independence is when I have like a nested JSON data structure, I realized like, oh crap, I actually want... If I have a list of people and each person has a list of favorite things. I'm like, "Oh crap, actually, I want to know how many distinct favorite things there are." Basically, I know I can get in trouble if I just nest the data structure in the way that I'm going to want the data. And then, I'm like, "Oh crap, I actually want the data a different way." And usually what happens to me is I end up taking that data structure and then, unfurling it into orthogonal lists that point to each other.
MA:
Yeah. Which is very much the, sort of, relational approach, right? Just have a bunch of relations saying how your data relates. Don't think too hard about nesting everything so that it's efficient. Leave that up to the query optimizer and hope you have a good query optimizer.
SK:
What is the relationship between relational algebra and SQL?
MA:
So, relational algebra is this formalism that Codd came up with in the 70's...? Could be the 60's. I could be wrong. But anyway-
SK:
Is it like lambda calculus is to functional programming?
MA:
Kind of, yeah, right? So, SQL can be thought of as an implementation of relational algebra plus some other stuff, except not quite. So, it's relational algebra plus some stuff, but it gives up on some of the simplicity of relational algebra. For example, it has bag semantics, not set semantics. So, it makes a difference whether you have multiples of the same thing, right? And it also adds some stuff to relational algebra that's really important, like aggregations.
MA:
So, anyway, before talking about what it adds, what is relational algebra? Relational algebra is you have a bunch of relations, right? A relation is basically just a set of tuples, right? So, it's a-
SK:
Rows.
MA:
Oh, a set of rows.
SK:
Okay. And a tuple is just like a dictionary or a object? Like, it's key values.
MA:
You can think of it as key values, if you'd like, right? But you think of a relation as having a bunch of columns, right? Like, there might be first name, last name, user ID. The elements of a relation are individual rows, each with a value for first name and a value for last name and a user ID. And that's what a relation is, right, so it's a collection of rows. And all the rows have the same shape. They have values for each column.
SK:
And you say that's a set of tuples. So, it's not a list. It's-
MA:
Yeah, it's not ordered. It does not care about duplicates.
SK:
Okay. And ID's, did we talk about that or not? Not yet.
MA:
No, no, that's not particularly important. I don't even know whether the concept of a primary key is in the relational algebra. It's certainly not-
SK:
Foreign keys.
MA:
Again, that's sort of in my mind, I haven't read any of the original stuff on relational algebra. I've only read, sort of, second sources. I read Wikipedia. But in my mind, that's sort of just a concept layered on top of it that formalizes a pattern of using relational algebra, right? So, you have relations. Now, how do you use relations? And the answer is you have various operators that combine them.
MA:
Some of them are simple filtering. You can say, "Throw out the things in this relation that don't satisfy such and such a condition." Union, if two relations have the same column names, right, or contain the same shape of stuff, you can take their union. And then, the most interesting one, of course, is relational joins.
MA:
And what a join is is... Actually, perhaps before we even talk about joins, we can talk about cross product.
SK:
Unions, I thought that was what joins were.
MA:
No, so a union is just like give me anything that is in either of these sets.
SK:
Oh, yeah, yeah, yeah, yeah.
MA:
Right. So, it's literal set theory union.
SK:
So, a relation like... I usually think of a... In a database, you have a customer table and it's all of the customers. But a relation wouldn't be... Like, I could have a relation of two different subsets of customers and make a union.
MA:
Yeah, you could if you wanted to. And you could do the SQL, too. SQL has unions.
SK:
I see.
MA:
They're not all that commonly used, but they are there.
SK:
I see. So, joins are-
MA:
Joins are like the thing.
SK:
I see.
MA:
All this other stuff is useful and sometimes, necessary. But joins are the single most common operation. And what they are, the way I like to think of them, although this may not be immediately obvious is they're a cross product followed by a filter followed by a projection. So, hold on. What are each of those things? A cross product is just I have two relations. Give me all possible pairings of things from those relations.
MA:
So, if I have a table of customers and I have a table of ice cream flavors, let's say, I have Charlie Coder, user ID 0 and Hilary Hacker, user ID 1. And I have chocolate and strawberry. The cross product will be Charlie Coder user ID 0 strawberry, Charlie Coder, user ID 0 chocolate, Hilary Hacker, user ID 1 strawberry, Hilary Hacker, user ID 1 chocolate. Right?
MA:
All possible combinations. This can get very big. So, okay, why would you want to do that? Well, you can then filter this by some predicates. And I've chosen a bad example, because there's no obvious way those things are connected. Maybe we can say that the parity of somebody's user ID determines whether they like chocolate or strawberry ice cream. Or another way to do it would be you have... a realistic way is you have a table of users. And then, a table of orders. You know, the user ID and then, the order ID, right? So, the table relating user IDs to order IDs and a table relating user IDs to their names. And you want to have the order IDs and the names paired together. So, you can take your cross product, which just gives you every possible user pair with every possible order. And then, you filter it down by requiring that the user IDs match.
MA:
So, that's called an equijoin, because you're requiring the two things to be equal. And then, you, yeah, you throw out the junk, you project. That's not particularly important, right? Because you have two copies of the user ID column and you're requiring them to be equal, so you simply throw out one.
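For readers who want to see the cross-product/filter/projection view of a join concretely, here is a minimal sketch in Haskell, using made-up users and orders tables (the names and values are illustrative, not from the episode):

```haskell
-- Hypothetical tables: (userId, name) and (userId, orderId).
users :: [(Int, String)]
users = [(0, "Charlie Coder"), (1, "Hilary Hacker")]

orders :: [(Int, Int)]
orders = [(0, 101), (1, 102), (1, 103)]

-- An equijoin written as cross product (two generators), filter
-- (the guard requiring the user IDs to match), and projection
-- (keeping only the name and the order ID).
namesWithOrders :: [(String, Int)]
namesWithOrders =
  [ (name, oid)
  | (uid, name) <- users        -- cross product...
  , (uid', oid) <- orders
  , uid == uid'                 -- ...filtered on matching user IDs
  ]
-- namesWithOrders == [("Charlie Coder", 101), ("Hilary Hacker", 102), ("Hilary Hacker", 103)]
```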
SK:
Okay. So, the select is kind of the project. The join is the cross product. And then, the predicate is the join on-
MA:
Yeah, right.
MA:
So, yeah, this is when you have two relations and you want to correlate them somehow. You want to say, "Hey, give me all combinations of things from this relation and that relation that satisfy some predicate.", right, where these things match. And that, to me, that's the relational algebra. You have relations. You have joins. You have a few other things like unions and filters. And you're done. And you can do a whole lot of stuff with this, but not everything.
MA:
For example, if you want to have the sum of something, the relational algebra does not do that. It deals in relations. A sum is a number, not a relation, right? It just does not have aggregations.
SK:
Oh, okay. That's interesting.
MA:
Which, I mean, obviously, this is a limitation. It's not as if anybody has ever thought, oh, that's enough.
SK:
Why would you leave that out? But I guess, when you have an algebra, you have types. And you have operations on those types. So, if you had... like we had algebra for numbers and we can add 1 and 2, and it'll give you the number 3. But let's say I want the word "three". We need the word "three", but like it would never give you the word "three". It would just give you the number 3.
MA:
Right. It's useful to formalize numbers, even without formalizing how to print them as strings. Because, well, you can add that part if you like. But here's how we do numbers. Relational algebra is: here is how we do relations.
SK:
I see.
MA:
And then, you can add extra stuff on top of that. And SQL does and it's useful.
SK:
And aggregations.
MA:
Well, it's interesting, because Datalog goes in a totally different direction. Datalog adds some-
SK:
So, Datalog, maybe give the logic programming background.
MA:
Yeah. So, well, one way of explaining Datalog, so Datalog can be thought of as a logic programming language.
SK:
Or-
MA:
It can be thought of as a database language. It's, sort of, somewhere between the two.
SK:
Okay. So, sorry for interrupting. Keep going with what you were saying.
MA:
Yeah. And one way of thinking about it is it takes relational algebra and it adds something to it, just like SQL, but it adds a completely different thing. It adds the ability to define relations, to construct relations recursively. And so, the classic example of this is what I already gave, transitive closure in a graph. You have the edges. And you want to find all the pairs of nodes which are reachable from one another. So, an edge relation would have a source and a destination column. And in relational algebra...
SK:
Oh, I see.
MA:
Pick any number, N, and I can find you the paths of distance N.
SK:
I see. I see, because-
MA:
I can't find you all the paths.
SK:
I see. I see. Because if you do the cross product of once, that gets you one and then, you filter. That gets you one path. And then, you can do the cross product again. We can do the cross product infinitely or, like, until it doesn't change or something like that. I see.
MA:
Yeah, exactly.
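A minimal sketch of that idea in Haskell (rather than Datalog): start from the edge set and keep adding whatever the recursive rule derives until the set stops changing. Since the pairs of nodes form a finite set, the loop always terminates.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- The two rules from earlier:
--   path(X, Y) :- edge(X, Y).
--   path(X, Z) :- path(X, Y), path(Y, Z).
-- Naive bottom-up evaluation: iterate the second rule to a fixed point.
transitiveClosure :: Ord a => Set (a, a) -> Set (a, a)
transitiveClosure edges = go edges
  where
    go paths
      | next == paths = paths        -- nothing new was derived: done
      | otherwise     = go next
      where
        next = Set.union paths $ Set.fromList
          [ (x, z)
          | (x, y)  <- Set.toList paths
          , (y', z) <- Set.toList paths
          , y == y'
          ]
```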
SK:
Okay. So, yeah. I want to spend a lot of time talking about computability with you, because I feel like... because I think that's something that comes up a lot when people discount logic programming. It's too slow or it'll infinite loop forever. Basically, it's like too abstract. It's like let's stick closer to the bits, because we know that if we're controlling all the bits, we know the program will end, because we have a tight rein on it.
SK:
So, yeah. So, I guess, maybe, let's talk theoretical. What is computability?
MA:
Well, whether something can be computed or not by... usually, we think of a Turing machine, or whatever. It hardly matters.
SK:
Is it related to decidability?
MA:
Yeah, decidability is the same thing, basically. Decidability, strictly speaking, is pose a question with a definite yes/no answer, right. Or think of a question, a class of questions parameterized by something, right, with definite yes/no answers. So, an example would be "are these two numbers equal?" So, that's a class of questions. It's not a specific question. It'd be like, "Does two equal four?"
MA:
And, of course, that can be answered by a machine. Just build a machine that returns no. But it only gets interesting once it's a class of questions. So, it's whether two numbers are equal. So, it has two placeholders, two variables in it, X and Y. You can call them whatever. That question is decidable, if your numbers are natural numbers. It is not decidable if your numbers are real numbers.
SK:
Oh, I see. Because real numbers could be infinite? They could-
MA:
Yeah, real numbers have infinite precision. And you cannot tell in advance how many digits you'll have to look at. You might have a number that you... Let's say you have the number one, two, one, three, four, five, six, seven. And you have another number one, two, one, three, four, five, six, seven. And they keep going. And they keep going forever. How do you know when you're done? How do you know that they really are equal? Maybe, there's a digit that's not equal just beyond where you looked.
SK:
So, decidability and computability have a lot to do with looping forever?
MA:
Yeah, right? To say that a question is decidable is to say there is a Turing machine, or a computer program, that will answer every single question of that form. And for each one, it will answer it in finite time, right, with yes or no. So that it never infinite loops on any particular instance of that problem. If it infinite loops on some instance, then it's not a decision procedure. So, the problem is not decided by that program.
SK:
I see.
MA:
Okay, right. So, the real number equality is an interesting case, because it has the property that if these two numbers are not equal, then the sort of obvious program, just compare the digits one by one, right, will eventually say, "These aren't equal." If two numbers aren't equal, eventually you'll get to a digit where they differ. And you'll be like, "They're not equal."
SK:
It's only when-
MA:
It's only when they are equal that-
SK:
You-
MA:
You won't be able to terminate, yeah.
SK:
Oh, I see. I see. That is interesting.
MA:
Yeah, so that's called either semi-decidability or co-semi-decidability. I never remember which is which.
SK:
Oh, I see. Semi-decidability is when it's decidable in specific cases.
MA:
Semi-decidable is when like you have a program that can... that if the answer is no, it will eventually answer no. But if the answer is yes, it might infinite loop.
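A small Haskell sketch of the digit-by-digit comparison from the real-numbers example, assuming (as a simplification) that each number is given as an infinite stream of digits:

```haskell
-- A real number in [0, 1), represented as an infinite stream of digits.
type Digits = [Int]

-- Compare digit by digit. If the numbers differ, we hit the differing
-- digit after finitely many steps and can answer False. If they are
-- equal, the recursion never ends: there is no point at which it is
-- safe to answer True.
equalReals :: Digits -> Digits -> Bool
equalReals (x : xs) (y : ys)
  | x /= y    = False
  | otherwise = equalReals xs ys
equalReals _ _ = True   -- only reachable if a stream is finite
```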
SK:
Okay. And so, I forget how we got in this tangent. It's related to Datalog?
MA:
You're sort of thinking about decidability and computability and logic programming.
SK:
Oh, okay. I think, let's unroll the stack. And before we continue down this thread, I want to ask you about... You were playing at three things, relations-
MA:
Right. Tables, relations and predicates.
SK:
Okay. And do you remember why... oh, because you said... you were explaining how you can get to Datalog from the relational path.
MA:
Right, yeah.
SK:
Or you can get to it from the logic path. Do you know the history of Prolog, which way it came from? Or was it influenced by both?
MA:
So, I think what's sort of... I'm not sure. Datalog, sort of, rose... It was after the relational algebra. I think it arose mostly in the 80's, but people sort of noticing... I don't know whether they noticed that the syntax looks like Prolog. So, based on that, I imagine they noticed, "Hey, if we limit Prolog in such and such a way, then suddenly it is decidable, right?", which is to say, in Prolog you can ask queries where it will infinite loop. In Datalog, you cannot. Every program in Datalog terminates. Every query in Datalog terminates. It will always answer any question you pose of it.
SK:
Well, what did they remove?
MA:
Basically every predicate, or relation or table, in a Datalog program has to be finite. So, in Prolog for example, you can define a predicate that takes three lists, I'll call them X, Y, and Z, and is true if X appended to Y is Z. And you can run this.
MA:
One of the wonders of logic programming, you can run this relation in any direction. So, you can give it two lists, X and Y, and it will give you, spit back at you, the append of these two lists. But you can also give it one list for Z and it will spit back all the lists, which when appended make that list. In other words, it will find all the ways to split a list into two smaller lists. And these are both the same relation. You write it once and you get both ways of doing it.
SK:
I may have missed it. So, the same relation goes in both... So, maybe give me an example. Maybe give me concrete lists.
MA:
Sure. So, if I say, you know, append list containing 1, list containing 2, Z, right?
SK:
What's Z?
MA:
Where Z is a variable. It's an unknown. And I'm asking it. When you do this, this is called a query. And it's saying, "Give me all the values of Z such that this is true."
SK:
So, it's the-
MA:
Such that the append of one and two is Z.
SK:
In a normal program, it would just say Z equals append one, two?
MA:
Yeah, right, yeah.
SK:
But in Prolog, you-
MA:
You just give the logical expression that you want to be true and it finds the solutions. And in this case, the solution is Z equals one, two. But you can also put variables for the other parts of it. You can say, "Give me X and Y such that append X, Y is one, two." All right?
SK:
I see, okay.
MA:
So, this is like saying one, two equals X plus Y, which in a normal programming language would be a syntax error, probably.
SK:
I see. I see. I see, okay.
MA:
But in Prolog, it will give you back multiple answers. It will say, "Okay, one solution is X is the list one, two and Y is the empty list." Another one is X is the list one and Y is the list two. And the final one is X is the empty list and Y is the list one, two. And the same code, and I can write down the code. Well, I mean, this is a podcast so that's probably not helpful, but I can write down the code for this. It's not very complicated. And it can be used in both directions.
SK:
Yeah, just write down the code and show it to the microphone.
MA:
Yeah.
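For readers following along, the standard Prolog definition being described is `append([], Ys, Ys).` plus `append([X|Xs], Ys, [X|Zs]) :- append(Xs, Ys, Zs).`, and the same two clauses answer queries in either direction. As a rough comparison, the "backwards" direction, enumerating every way to split a list, would be a separate function in a functional language, something like this Haskell sketch:

```haskell
-- Enumerate all ways to split a list into two lists that append back
-- to the original. This is what Prolog's append/3 gives you "for free"
-- when you run it backwards.
splits :: [a] -> [([a], [a])]
splits []       = [([], [])]
splits (x : xs) = ([], x : xs) : [ (x : ys, zs) | (ys, zs) <- splits xs ]
-- splits [1, 2] == [([], [1,2]), ([1], [2]), ([1,2], [])]
```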
SK:
Okay. And so, this is almost like... this can lead us to bad things in Prolog? This can lead to undecidability?
MA:
I mean, on the one hand, this is part of why Prolog is awesome. And it's also dangerous, because it makes your language more powerful and it can lead to programs that infinite loop. This is not particularly any more dangerous than any other Turing-complete programming language. Every programming language that we know can write infinite loops, except for ones that are very, very carefully limited like Datalog.
SK:
Datalog cannot infinite loop.
MA:
Cannot infinite loop.
SK:
In most programming languages, you could just write "while true." Datalog doesn't have "while true".
MA:
No, right. The equivalent of "while true" terminates with false. That's a little bit of an unfair comparison. But there's a concrete example of this, which is in Prolog, you can feed it, not the liar's paradox. So, it's a logic programming language. So, you can actually translate paradoxes into it. But the liar's paradox is this sentence is false. And this is problematic, because if it's false, it's true. And if it's true, it's false.
MA:
But there is another, not exactly paradox, the truth teller's paradox. I'm not sure if that's the standard name, but it's this sentence is true. And this isn't really a paradox, because like you can say it's false. And then, it's false, right? Because it says it's true and it's not true, so it's false, okay? But you can also say it's true, because it says it's true. And it's true, so it's true.
MA:
So, it's unclear what truth value it should have, but it is not paradoxical to assign it a particular truth value. Now, if you feed the equivalent of this to Prolog, you say basically foo holds of the variable X, if foo holds of the variable X. Foo of X, if foo of X. And then, if you ask Prolog, does foo hold of two? It will infinite loop. Now, in Datalog, if you do the equivalent thing, it will simply say, no. No, it's false.
MA:
So, Datalog has an answer to the question "What is the value of this sentence? This sentence is true," and its answer is false. And the reason for this is, basically, Datalog has a, sort of, minimum least fixed point semantics. Or it's sometimes called a minimum model semantics. But what it basically means is that if you don't say something is true, it assumes it is false.
MA:
For example, if you say, "There is an edge from two to three." And then, you end your program, right? You say, "That's all there is in the program: there is an edge from two to three." And then, you ask it, "Is there an edge from three to seven?" No. You didn't write that, so it's not true. So, it infers only the minimum set of things consistent with the program that you've written.
MA:
It will not infer anything that you didn't write. And so, if you say, "Foo of X, if foo of X." It will not infer that foo of two is true, because there's no way to get to that. It's consistent that it be true, just like it's consistent with what you wrote down that there be an edge from four to seven. You didn't explicitly say that it was false. But, sort of, normally, we only write down the things that are true.
MA:
Sort of intuitively, if we're describing something, we say all the things that are true of the situation, not all the things that are false. Because there are too goddamn many. And so, based on that, right, based on that idea, only assume things are true if there's a clear way to prove them. Datalog will give you, sort of, the minimal model. It will not... Yeah.
SK:
I feel like there's a phrase of mathematics that does this, that only proves things that are-
MA:
That's sort of what minimum model or least fixed point is.
SK:
So, we talked through the basis of relational stuff was relational algebra. The basis for logic programming is logic.
MA:
First-order logic, yeah.
SK:
And first-order refers to?
MA:
First-order means you can quantify over objects, but not over sets of objects. So, first-order logic is the kind of logic that we're, sort of, most familiar with, right? We can say things like, "For any number, X, X plus one is greater than X." And calling it first-order means that you can write that "for any number X." So, there's also something less powerful than that, which is propositional logic where you cannot write "for all."
SK:
I see.
MA:
You can take primitive propositions and you can conjunct them. You can say, X and Y. You can disjunct them, X or Y, and so on. But you can't quantify over variables. And then, there's higher-order logics, which let you, effectively, not just quantify over individual things, like numbers, but let you quantify over properties of numbers. For any property P of numbers, there exists a number X which P satisfies. This isn't true.
SK:
I see. I see.
MA:
That's a false proposition, because consider the property that doesn't hold of any number. But it allows you to quantify over even larger stuff.
SK:
And now, I see how it's higher-order. I see how the phrase makes sense. You can have specific propositions about specific numbers. And then, you can have propositions for a bunch of numbers. And then, you could have propositions about propositions. I see.
MA:
Yeah. And the weird thing is... This is a total tangent, but sort of the weird thing about this is, so logicians, more or less, figured out propositional logic, then first-order logic, then higher-order logic. There's obviously still work on each of these things, but sort of that's the order in which they started considering things. Type theorists and programming language theorists figured out the equivalent of propositional logic, very simple type systems.
MA:
And then, they figured out precisely second order type systems, right? Type systems that let you quantify over types, but not over values, over types. And then, we're beginning to figure out... Well, we are figuring out dependent types, which are kind of first-order, as well as higher-order.
SK:
Oh, that's interesting.
MA:
So, it's kind of like we skipped over the just first-order phase. There are type systems that are kind of directly correspondent to first-order logics, but they're kind of weird. The first thing they figured out was, so called, parametric polymorphism, which is where you're allowed to quantify over types. You're allowed to say, "This function has type... For any type alpha, it takes alpha to alpha."
SK:
So, in my head, I have... in Haskell, I'm thinking Int -> Int is like first... is the base.
MA:
That's about... There's many different kinds of higher-orderness in programming languages. And so, one of them is like, "Are your functions higher-order?", which I think is what you were thinking of.
SK:
Yeah, you're right. I don't even have to talk about functions.
MA:
Maybe, I was missing it.
SK:
No, no. I just was over-complicating my example. We have Ints. And then, we have lists of Ints, which is... and like a list of an Int, is that polymorphic at all or higher-order? That's still first-order? Can be parameterized...
MA:
That's like even a different direction. That's talking about types parameterized by other types.
SK:
Is that-
MA:
Well, it's related, but it's not exactly the same thing. So, quantification in a type system is, for example, having a function, take the identity function for example, that works at any type. Or the map function, map's a function. Well, that complicates things, because it also involves lists. But-
SK:
If you can like describe types. If you have a type that admits other types, like subtyping...?
MA:
No.
SK:
Like Num is-
MA:
It's not about subtyping, not really. It's about: what is the type of the identity function? Well, you could say it has the type Int -> Int, because it takes Ints. You can say it has the type Bool -> Bool. But neither of these is really a most general type to give it. The most general type to give it is to say, "For any type A, it is type A -> A." That "for A" I put there, that's the equivalent of a "for all" in logic.
MA:
But what is it quantifying over? It's not quantifying over values. It's not for any value X. It's quantifying over types. It's for any type of X.
SK:
I see.
MA:
And that's what makes it a second order quantification.
SK:
Is it but only useful for the ID function?
MA:
No. It's useful for a lot of other functions. They usually have to involve some sort of data structure. So, an example would be the map function that takes a function and a list and applies the function to every part of that list. It doesn't care whether it's a list of integers or a list of booleans. So, it has the type: for any type A and any type B, give me a function from A to B and give me a list of A. And I'll give you a list of B.
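In Haskell syntax, the quantification Michael is describing looks like this when the usually-implicit forall is written out (standard definitions, shown here just for illustration):

```haskell
{-# LANGUAGE ExplicitForAll #-}

-- "For any type a, it takes a to a."
identity :: forall a. a -> a
identity x = x

-- "For any type a and any type b, give me a function from a to b and
-- a list of a, and I'll give you a list of b."
mapList :: forall a b. (a -> b) -> [a] -> [b]
mapList _ []       = []
mapList f (x : xs) = f x : mapList f xs
```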
SK:
I guess, I normally think about... yeah, it doesn't occur to me that there's an implicit "for any A" and "for any B". The only time I think of the implicit "for any" is if it's like "Show A => ..." you know? Because then, it's like, "Oh, okay, this is for any Show."
MA:
Yeah, because then it's explicit in the syntax.
SK:
Yes, exactly.
MA:
Yeah, but to a type theorist, we think of there being an implicit "for any A".
SK:
For any A. I see. I see. And, okay, that's interesting that we skipped... So, the middle level that we skipped would be-
MA:
First-order, first-order. The equivalent to first-order logic.
SK:
So, what would be a type in first-order?
MA:
Well, the reason that we skipped it is it's very not obvious what it would be, right? Because-
SK:
It's just less useful?
MA:
Well, I don't know about less useful. The obvious example that I could give is dependent types. But dependent types don't really correspond to first-order logic. They correspond to very higher-order logic, like all the bloody layers. There's not a clear correspondence anymore.
SK:
Oh, okay, interesting.
MA:
But the thing that dependent types allow you to do that is like first-order logic is they allow you to quantify over all the values of the given type. So, I can say, "For any natural number, N, this will take a list of length N of, I don't know, integers and return a list of length N of integers." So, I'm not quantifying over a type there. I'm quantifying over natural numbers, right, for the length of the list. That's like what first-order logic lets you do.
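A rough sketch of that example: languages with full dependent types (Agda, Idris) let you quantify over values directly, and Haskell can approximate it with type-level naturals. The details below are just one way to write it, not code from the episode:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Type-level natural numbers.
data Nat = Z | S Nat

-- A list whose length is part of its type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- "For any natural number n, take a list of length n of integers and
-- return a list of length n of integers": the length is preserved by
-- the type itself.
vmapInt :: (Int -> Int) -> Vec n Int -> Vec n Int
vmapInt _ VNil         = VNil
vmapInt f (VCons x xs) = VCons (f x) (vmapInt f xs)
```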
SK:
Okay. Interesting. Okay, so-
MA:
That was a kind of huge tangent.
SK:
Yeah, yeah. Well, I feel like this has been great. I feel like we've been talking about interesting things, but we should probably get to your main project. I think we spent enough time laying the foundations and talking around it. So, yeah, give the quick summary...
MA:
The spiel for Datafun. So, we have Datalog, right, which is this language that can be thought of as logic programming, but limited, right? Limited so that it's no longer Turing-complete. It always terminates. But because of those limitations, we have, for example, much more efficient implementation strategies for it. And, yeah, I mean, that's basically the idea. It lets the implementation strategies be more efficient and do interesting things.
MA:
Or you can think of it as relational algebra plus fixed points, so it's like SQL with extra stuff... Except aggregations are a pain. So, I'll talk more about that later. But anyway, it's between these two cool areas, logic programming and relational programming.
SK:
This is Datalog or-
MA:
Datalog. That's what Datalog is. But what Datalog doesn't let you do is it doesn't let you notice that there's a repeated pattern in your code and break it out into a function. This is an ability that logic programming has, because logic programming doesn't have the limitations of Datalog, right? But once you impose the limitations of Datalog, which are nice, you lose that ability.
MA:
But it's also something that functional programming has, because we have functions. See a repeated pattern? Just write the function that encapsulates that repeated pattern. Take the parts that are varying and make them arguments to the function. And take the parts that are constant and make them the code of the function, right?
MA:
And it seems like this would be a useful ability to have in Datalog. For example, transitive closure, the standard Datalog example.
SK:
You have a lot of graphs in your life.
MA:
Yeah. You can write transitive closure in Datalog, but you cannot write a function that, given a graph, takes its transitive closure.
SK:
It only works for specific graphs.
MA:
Right. You have to hard code. You have to pick a relation that represents the graph that you want to take the transitive closure of and write the thing that takes its transitive closure. And it's hard coded to that graph. You cannot plug in a different graph.
SK:
It's like writing a macro to plug in a different graph, or adding the ability to write functions.
MA:
Right. Or add the ability to write goddamn functions. So, that's kind of what Datafun is. It's an attempt to allow you to write what is effectively Datalog code, but in a functional language so that if you see a repeated pattern in your code, you can just abstract over it. And along the way, we sort of end up adding a bunch of interesting things, because it's easy and natural to add them in the context of a functional language.
MA:
So, for example, we can add types. Datalog is traditionally kind of untyped. There's no particular problem with adding types directly to Datalog. But as long as we're going through a functional language and we know how to use types for that, we add those. So, you can have sum types now, if you want sum types. Also, lattices, so Datalog... How do I explain the use of lattices in logic programming and in Datalog and in Datafun?
SK:
I always forget what a lattice is.
MA:
So, in this case, what I'm actually concerned with are join semi-lattices. People often call them lattices, because saying join semi-lattice every time gets to be a mouthful. But what that means is you have... There's two ways of thinking about it. One way of thinking about it is you have a binary operator that is associative, commutative, and idempotent. So, associative, the parens don't matter.
MA:
Commutative, the order doesn't matter. Swap things around as much as you'd like. Idempotent, doing things twice doesn't matter. X join... the operator is usually called join, which is confusing, because it's not database join. It's a different operator. So, X join X is X. That's what idempotence means. And it has an identity element, a thing that does nothing. So, the classic example of a join semi-lattice is sets under union.
MA:
Union is associative. The parens don't matter. It's commutative. X union Y equals Y union X. Order doesn't matter. It's idempotent. A thing union itself is that thing. Adding a set to itself. It has the same elements.
SK:
I see. I see.
MA:
Right? And the identity element, the thing that does nothing, is the empty set.
SK:
Addition and multiplication are?
MA:
Addition and multiplication are not semi-lattices, because they're not idempotent.
SK:
Oh, if you add a number-
MA:
Two plus two is four.
SK:
I see, yeah, I see.
MA:
But maximum is a semi-lattice on the natural numbers.
SK:
Or minimum, I guess?
MA:
Minimum on the negative numbers would be; I need an identity element. So, let's go through each of these properties. Maximum is associative, yes? It's commutative. X max Y is Y max X. It's idempotent. The thing max itself is itself, but you need an identity element, a thing such that the maximum of X and this identity element is always X.
SK:
And so, zero-
MA:
Right. And so, zero: the maximum of something and zero is always that thing, as long as it's non-negative, right? So, max... different people differ on whether a join semi-lattice needs a zero, whether it needs a... but for me, I always insist that it have... that there be an identity element.
SK:
So, in Haskell, we have type classes, like monads and monoids and traversable. And we have like a typeclassopedia.
MA:
Yeah, semi-lattice would be a type class.
SK:
Okay, great, that was my question. And where would it fit in the tree? And why don't we have it in the tree?
MA:
Because there's too many goddamn mathematical concepts. Having all of them in your standard library would be a bit much.
SK:
Oh, interesting.
MA:
I mean, you can add it, right? I think, more practically, the reason why you don't have it yet is nobody has made a strong enough case for it to be in the standard library. It's not hard to add as your own library, right?
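One way such a type class might look, as a rough sketch (the class and names here are made up for illustration; this is not a standard library class):

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- A join semi-lattice: 'join' should be associative, commutative,
-- and idempotent, with 'bottom' as its identity element.
class Semilattice a where
  bottom :: a
  join   :: a -> a -> a

-- Sets under union, with the empty set as the identity.
instance Ord a => Semilattice (Set a) where
  bottom = Set.empty
  join   = Set.union

-- Natural numbers under maximum, with 0 as the identity.
newtype Max = Max Int deriving (Eq, Show)
instance Semilattice Max where
  bottom               = Max 0
  join (Max x) (Max y) = Max (max x y)
```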
SK:
I see.
MA:
Type classes aren't something limited to the core libraries. You can make your own. And that's exactly what I do when I need to use semi-lattices. But, yeah, what it would have... Well, okay, so first let me talk about the other way to think about the lattices. We've talked about them as an operator, but there's an equally important way to view them, which is as a partial order.
SK:
Yeah, that's another phrase that-
MA:
So, a partial order is just something that acts, sort of, like less than or equal to, except it doesn't have to compare everything. There can be two things, neither of which is less than or equal to the other. So, the classic example here will be sets under sub-setting. So, you say that set X is less than or equal to the set Y, if X is a subset of Y. Not necessarily a proper subset. And so, X is less than or equal to itself. It's a subset of itself, as far as I'm concerned.
MA:
And so, this forms an order, in a sense, in that it's transitive. If X is a subset of Y and Y is a subset of Z, then X is a subset of Z, right? Or for sub-setting you might say "included in": X is included in Y and Y is included in Z. It's reflexive. A thing is included in itself. A set is included in itself. And it's antisymmetric, which is if X is a subset of Y and Y is a subset of X, they are the same set. That's what we mean by partial order, those three things.
SK:
And it's equivalent to a lattice?
MA:
No.
SK:
No, okay.
MA:
Every lattice is a partial order. Not every partial order is a lattice. I'll get to that bit in a bit. But, yeah, it needs to be the thing like less than or equal to that has to be reflexive, transitive and antisymmetric. But it doesn't have to be total. There can be things that are just incomparable. Like, the set containing just one and the set containing just two. Neither of them is a subset of the other.
MA:
Okay. So, when is... A partial order is a lattice, if it has a least element, a thing that is smaller than everything and any two elements have a least upper bound. So, a thing that is bigger than both of them, but smaller than any other thing that's bigger than both of them. So, the example here would be the union of two sets, right? And the least element, obviously, is the empty set. It's smaller than any other set.
MA:
The union of two sets is bigger than both of them. It's a super-set of both of them. And anything else that is a super-set of both of them contains everything in the union, right? So, that's what a least upper bound is. And that's what a semi-lattice is as a partial order. It's a partial order that has least upper bounds and the least element.
MA:
And you can prove these two things, these two views of them, the partial-order view and an operation, which is associative, commutative and idempotent, are equivalent. Because you can prove that the least upper bound operator is associative, commutative and idempotent and that the least element forms its identity, right?
SK:
I'm just curious from a historical perspective, like was there a person, or an event, that joined these things?
MA:
God, I have no idea. This is all really old math. Like, this is... Yeah.
SK:
Was it Hilbert or someone before, like Euler?
MA:
I don't know.
SK:
It wasn't Aristotle, you know?
MA:
No. Yeah. I mean, there's like umpteen zillion variations on this, right? So, like there's semi-lattices. Some people take that to mean not necessarily having a least element. And then, you can talk about semi-lattices with least elements. And then, you can have meet semi-lattices instead of join semi-lattices, which is the same thing flipped around: instead of having a least element and a least upper bound, you have a greatest element and a greatest lower bound, which is, right...
MA:
And then, you can consider having both of these and that's what we usually call a lattice is if you have a least element, a top element, least upper bounds and greatest lower bounds. Subsets of a given set form a lattice, in that sense. The least element is the empty set. The greatest element is the whole set, everything. Least upper bound is union and greatest lower bound is intersection.
MA:
And these structures are very well behaved. They have all sorts of operators. And they all interact nicely. And you can... anyway. So, anyway, yeah. You can view them as having to do with partial orders or you can view them as just being this algebraic structure. They have an operator and it obeys certain laws.
MA:
So, why are semi-lattices interesting? Oh, right, because sets are a semi-lattice, right? And remember how I talked about we have relations and you have tables and you have predicates? Well, here's another thing to add to the list, sets plus tuples. Because a relation, or a table, can be thought of as a set of tuples. A tuple representing a row in the table and the set representing the whole collection.
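In code terms (a Python sketch with made-up data):

```python
# A relation / table as a set of tuples: each tuple is a row,
# and the set is the whole collection. No index or layout is exposed.
parent = {
    ("alice", "bob"),
    ("bob", "carol"),
}
assert ("alice", "bob") in parent   # membership is just set membership
```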
MA:
And so, this is what Datafun does. It takes a functional language. It takes a basic simply typed lambda calculus, which is sort of like your vanilla starter base for functional language design. And it adds to it finite sets as a data type and tuples, which are easy and ordinary, right? And then, with those...
MA:
Okay, so the next question is, okay, you have finite sets. But how do you make them and how do you use them? And the answer that I give is, well, making sets is easy. You just list the things you want to be in the set.
MA:
But how do you manipulate sets? You use set comprehensions, right, which is you can basically say, "For every element X in the set Y do something, right, and give me the union of all those somethings." So, you can say, "For every element X in the set one, two, three, union together the set containing two times X and X plus two."
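That example, written out in Python rather than Datafun syntax (just to show the shape of a comprehension whose body is itself a set):

```python
# For every x in {1, 2, 3}, union together the set {2 * x, x + 2}.
xs = {1, 2, 3}
result = set().union(*({2 * x, x + 2} for x in xs))
assert result == {2, 3, 4, 5, 6}
```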
SK:
So, the body of the comprehension is itself a set. It's not just one element.
MA:
Yes, right. It's not just one element.
SK:
It can be a set containing one element or a set containing the other-
MA:
Yeah, a set containing these elements, but it turns out to be equivalent. You can do it both ways and they're equivalent. The reason being if you just want to give one element. If you want to restrict it to... I mean, this is impossible to describe without pen and paper. So, I'm just not going to try it. But basically, it just lets you write down what you would normally think of, or what a mathematician would normally think of, as a set comprehension, within certain limits.
MA:
So, you can't do infinite sets this way, but you can filter existing sets. You can take the cross product of sets, because you can just comprehend over two sets. For every X in the set A, for every Y in the set B, give me X, Y. And if you can filter and you can do cross products, then you can do relational joins.
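A sketch of that in Python set comprehensions (made-up relations; real Datafun syntax differs):

```python
# Filtering and cross products via comprehensions, which together give joins.
A = {("alice", 1), ("bob", 2)}          # made-up relation: (name, dept_id)
B = {(1, "eng"), (2, "math")}           # made-up relation: (dept_id, dept_name)

cross = {(a, b) for a in A for b in B}                      # cross product: 4 pairs
joined = {(name, dept) for (name, k1) in A
                       for (k2, dept) in B if k1 == k2}     # equijoin on dept_id
assert len(cross) == 4
assert joined == {("alice", "eng"), ("bob", "math")}
```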
SK:
And so, you started from the lambda calculus and added set comprehensions... So you didn't add relational algebra itself.
MA:
Not explicitly.
SK:
But you can maybe prove that it's like equivalent or?
MA:
It can express everything in relational algebra and some other stuff that's not in relational algebra.
SK:
Because functions from-
MA:
Yeah, for example, right.
SK:
But, I guess, you could think of it as you took two algebras, like lambda calculus, or two calculi, and relational calculus, and you-
MA:
Sort of, yeah. I sort of glommed them into one language. That's one way of thinking about it.
SK:
But technically, you just did lambda calculus and then, you added some things.
MA:
Yeah, lambda calculus plus set comprehensions and a few other things. Like, if you want to be able to do an equijoin, you need to be able to test equality. So, add equality tests and booleans, okay. Not a big deal.
SK:
So, in a functional language, before you came along, I was already able to filter. And I guess, set comprehension... what's the equivalent in functional languages that we already have?
MA:
You could have it as list comprehension, right? Set comprehensions are like list comprehensions, except for sets.
SK:
And a list comprehension... Comprehension is a weird word.
MA:
I don't know why that word got there, but that's the word we use.
SK:
I used list comprehensions in Python, but I think in other languages, I just use... It's just map.
MA:
Yeah, list comprehensions can all be done, in terms of map and filter.
SK:
Oh, okay. Map and filter. But then, I've never mapped over two-
MA:
That's not true. You also need monadic join, another join that's different from all the previous joins we've discussed. So, you need, basically, concatMap.
SK:
ConcatMap, yeah, that makes sense.
MA:
ConcatMap and from that, you can get filter. And, yeah, that's it. So, basically, it's concatMap.
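A small Python sketch of that reduction (concat_map is defined here by hand; it plays the role of Haskell's concatMap):

```python
from itertools import chain

# concatMap: map a list-producing function over a list, then concatenate.
def concat_map(f, xs):
    return list(chain.from_iterable(f(x) for x in xs))

# The comprehension [2 * x for x in range(5) if x % 2 == 0],
# expressed with concat_map alone (filtering by returning an empty list):
result = concat_map(lambda x: [2 * x] if x % 2 == 0 else [], range(5))
assert result == [0, 4, 8]
```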
SK:
So, well, we already had this in functional languages before you came along?
MA:
Yes, absolutely.
SK:
So, wait, you just invented a functional language that's just a regular functional language?
MA:
Well, but it's not Turing-complete.
SK:
Oh, okay. And you took a functional language and you just subtracted things.
MA:
Yes.
SK:
That's another way of looking at it.
MA:
Yes, it is less powerful than almost every other functional language. Deliberately less powerful in the hopes that we can apply a bunch of prior work that's been done in the Datalog community and the SQL community on optimizing expressions of this form and optimizing Datalog evaluation and optimizing SQL evaluation. Because if you take that work and you try to apply it in a context of a full-blown higher-order Turing-complete functional language, it is a nightmare.
MA:
But if we limit our language enough, we're hoping and we have some reason to believe that we might have some success in generalizing the existing optimization literature, so that you can write the stupid dumb obvious way of computing a join. And your compiler or your implementation will figure out how to do it efficiently using an index. Which is not something that existing functional languages do, right?
MA:
The best you've got is, sort of, list fusion, which is an impressive optimization, but it's sort of like combining multiple passes over a list into one pass. Like, map, map, filter, map, filter gets combined into one single pass over the list.
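Roughly what fusion does, hand-written in Python as a simplified map/filter/map pipeline (a fusing compiler aims to produce the single-loop version automatically; this sketch is only to show the shape):

```python
# map, filter, map as separate passes (with intermediate lists)
# versus one fused loop over the data.
xs = list(range(10))

step1 = [x + 1 for x in xs]
step2 = [x for x in step1 if x % 2 == 0]
naive = [x * x for x in step2]

fused = []
for x in xs:            # a single pass, no intermediate lists
    y = x + 1
    if y % 2 == 0:
        fused.append(y * y)

assert naive == fused
```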
SK:
I didn't know that that's what happens sometimes.
MA:
Yeah, this is an optimization that a bunch of people have worked on and they've gotten it to be pretty good. And that is table stakes for database query engines. That is just like nobody talks about that, because it's freaking obvious. And this is what happens when you make your language more powerful. Everything gets harder. And so, we're trying to find a sweet spot of something that is more powerful than Datalog, but still constrained enough that we can apply existing optimizations to it and imitate what has been done in the database community and the Datalog community.
SK:
Yeah, well said. Personally, I'm someone who's really excited about the idea of less powerful languages. I don't know if you feel this way, but I feel that like our patron saint is the "Go to considered harmful" article in that like... It was very explicitly calling out that languages are too powerful in this one specific way. And if we get rid of this, it'll actually be an upgrade.
SK:
And I guess, it spawned a whole class of articles. Like, X, Y and Z considered harmful. But particularly in making languages less powerful, so that we get... So, almost like we make them more well behaved, so we get more mathematical properties out of them. Then, we can optimize them more easily and other things like that. I'm a big fan of this theme.
MA:
Yeah, I think I'm also kind of a fan of this. I don't know that it's the... I'm kind of a pluralist. I believe in taking every approach under the sun and seeing how it works out. But I think that trying to make less powerful languages, so that you can do more to the languages, is an under-explored area.
SK:
What's the opposite of a pluralist? Because I feel like I'm someone who's like ideological?
MA:
Unitarian? I don't know.
SK:
I kind of like to do the opposite. I don't like to just kind of try things until I find one that works. From first principles, how do I go? Or, you know-
MA:
So, when I say I'm a pluralist, I say that as a sort of belief about how our research community should function, not a belief about how any individual should do research. I think it totally makes sense to focus on one particular idea, or one particular principle. I just don't think that any one principle is the only one we should be considering.
SK:
Yeah, in the whole world.
MA:
As a community, yeah.
SK:
Yeah, sure, yeah, yeah. Of course. Okay.
MA:
But this is in contrast to people who want to build one language to rule them all, for example. I think that's sort of doomed to failure, because I think that humans have many purposes. And programming languages are going to need to be built for many purposes, too. I think some of the most interesting ideas in programming languages have come from programming languages, which tried to do... tried to say everything is a something, right? Everything is an object. Everything is a function. Everything is a process.
SK:
There's a list of these. There's a list of... for each language X, everything is a Y. I'm just trying to do a verbal set comprehension. I think it's on TiddlyWiki. It's like a list given language and all-
MA:
Right. On the one hand, I think that some extremely interesting research has come out of this, right? Like, functional programming came out of lambda calculus where everything is a function. Object-oriented programming came out of thinking about trying to make everything an object.
SK:
Logic programming.
MA:
There are process calculi, which come out of everything is a process communicating with everything else. Logic programming comes out of everything is first-order logic. But at the end of the day, I don't believe any of it. Like, no, everything is not an object. Everything is not a function. Not everything is a process. Not everything is amenable to being described in terms of first-order logic.
MA:
At the end of the day, the world is complicated. And we need many tools, many approaches, to try and understand it. But the kind of single-minded focus on a single thing is how you find out what the limits of the idea is. Until you commit yourself to fully exploring a particular idea, you probably won't realize just how useful it is. So, this is sort of the sense in which I'm a pluralist.
MA:
I do not believe that any one of these single-minded ideas will rule the world, but I think the world is better for having had lots of people who have tried to push these ideas as far as possible.
SK:
I think, if I had to bet, that's probably the view that would win, and so I feel like it's irrational for me to hold these two views, but I am a "one thing will rule them all" kind of person. Maybe I should work on changing that about myself, because I explicitly notice that it's like a weird thing, or maybe it's just like a hope. I like the idea of... It's almost physics envy, you know?
SK:
You've heard the term physics envy? It's like a derogatory term for fields that aren't physics that try to pretend to be like physics. Like, mathematize things or reductionize things, which is kind of what you're getting at. Everything is just this one... We reduce everything in the complex world to this one simple thing, because it's just more elegant to look at the world that way. So, I hold out hope that we'll figure it out.
SK:
Yeah, is there a... And you're someone who builds on the lambda calculus. Is there a chance that "everything is a function" is the one that wins?
MA:
Well, but here's the thing. Almost every language that builds on the lambda calculus drops "everything is a function." When you're working with natural numbers in functional languages, are they represented as Church-encoded natural numbers? No. They're represented as a bunch of bits representing a natural number in two's complement.
SK:
Well, I think there's... That's kind of looking at implementation detail. Like, the semantics of a language-
MA:
But it's not an implementation detail, right? It's there in the semantics, too. I cannot apply a number to a function and get that function applied that many times, right?
SK:
I see.
MA:
That's what a Church encoded natural number is.
SK:
I see. Yeah, yeah, yeah, yeah.
MA:
Right? There is one language I know of, which sort of tries to take this idea to its logical extreme and, more or less, encode everything. And that is... I hesitate to even mention it, because... Urbit tries to sort of do this. It has an extraordinarily simple virtual machine, which is almost a combinator calculus, right? And then, tries to build everything up on top of that. And if you can't do something efficiently, then rather than introducing a new primitive, they have this concept, I think they might have called it "jets", of writing the code that does it inefficiently and having the compiler recognize that specific code and turn it into something that does it efficiently. Which is a cool idea, but I'm hesitant to mention the project, because it's tangled up with a whole bunch of ideological ideas about how programming should be organized and how distributed systems should be organized.
SK:
And it's also just filled with obscure jargon and it's really hard to decode what they're actually doing. It's funny you say this in this way, because I think it was last night, someone... You may have seen this. Someone I don't know on Twitter reached out like, "What do you think of Urbit?" I click on the new primer. I'm like, "Wow, this is so much better designed than I've seen in the past." And so, I posted it on Slack, because of the graphic design of it. Maybe you haven't seen it new.
MA:
No, I haven't.
SK:
I don't know when it came out. And the first response was basically what you said. Like, there's some useful things here, but there's just so much noise and ideological stuff that like... It's just hard to focus on it. It and I think other projects like it, there are a lot of projects that are just super wacky. But you kind of have to focus on the parts that are worth noticing and then, just talk about those parts. You shouldn't throw the baby out with the bathwater, I guess.
MA:
Yeah. And this idea of writing it the inefficient way and having the compiler recognize it is not unique to Urbit. I think somebody working on another wacky project called Avalon Blue calls it... I wish I could remember their name. They call them "accelerators". So, it's an idea that's going around in the fringes of the programming language design community.
SK:
It's very cool. If you wanted to, you could like fully expand everything and see the inefficient, but recognizable code. But then, under the hood, the optimizations happen.
MA:
Yeah. I'm just skeptical that it'll overall reduce the complexity of your system, because you still have to have that complexity, the inefficient implementation, somewhere. It's just that now it's hidden beneath... It's hidden in your compiler implementation.
SK:
The Out of the Tar Pit paper, I think, did a good job of arguing for something along these lines, of shoving your optimizations away from your more declarative code.
MA:
Yeah, but I think there's a difference. So, like the approach that I'm taking is, again, trying to write declarative code. Write the obvious join algorithm, which is just loop over set A, loop over set B. If some condition is true, yield tuple whatever. Which, if you implement it naively, is at least quadratic in complexity. And then, having... you might have recognized this. But it's like the accelerator approach is you don't just do that. You write the code that implements the naïve version of a join. And then, you have your compiler recognize when it is compiling that specific chunk of code, right, as if it's a Reflections on Trusting Trust attack, only being used to make your code go faster. And then, compile it into something more efficient. Which strikes me as trading... you're gaining one sort of elegance in that you just have one Turing-complete language as your source, but you're losing the simplicity of your compiler. You're not cashing in on the ability to make real abstractions.
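The contrast he's describing, sketched by hand in Python (made-up relations; in the Datafun vision the indexed version would come from the compiler or database engine, not the programmer):

```python
# The obvious quadratic join versus an indexed version.
A = {("alice", 1), ("bob", 2), ("carol", 2)}
B = {(1, "eng"), (2, "math")}

# naive: loop over A, loop over B
naive = {(name, dept) for (name, k1) in A for (k2, dept) in B if k1 == k2}

# indexed: build a hash index on B's key, then a single pass over A
index = {}
for k, dept in B:
    index.setdefault(k, set()).add(dept)
indexed = {(name, dept) for (name, k) in A for dept in index.get(k, ())}

assert naive == indexed == {("alice", "eng"), ("bob", "math"), ("carol", "math")}
```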
SK:
That makes sense. It feels ad hoc.
MA:
Yeah, I don't know.
SK:
So, back to Datafun, one way to look at it is it's a functional language, but we remove some things -- I want to get into what those things are -- in the hope of getting it as performant as a database.
MA:
Yeah, kind of.
SK:
Or it could be used as like a query language.
MA:
Yeah, right. So-
SK:
You have really long, really big sets. Like, huge sets and-
MA:
Yeah, right.
SK:
Because in a programming language, you would never-
MA:
If you were going to use sets that big, you would probably use a database.
SK:
Exactly. That's a good way to put it. So, Datafun is a programming language that can work with sets that are so big that you normally would've used a database for them.
MA:
Well, that's what we hope it can become. The implementation right now is not like that. You would not use it for anything other than toy examples. And part of this is that building a really performant database engine is a lot of work. And I don't... I'm one person with an advisor. And that's not enough people to build a performant database engine. So, the things we can work on are, sort of, the theory, right? Showing that all these optimizations that people use in real database engines and real Datalog implementations can be ported to a functional language like Datafun.
SK:
Cool.
MA:
And so, I haven't actually finished describing Datafun, right? So, we add sets and set comprehensions, but that only gets you to relational algebra. That only gets you to, sort of, SQL-like levels of capability. To do what Datalog can do, you also need a certain sort of recursion, the ability to define sets recursively.
SK:
Yeah, the join thing we talked about.
MA:
Yeah, for example, transitive closure. Transitive closure is defined in terms of itself.
SK:
The fixed point of joins.
MA:
Yeah, the fixed points. Yeah, so we add a certain sort of fixed point to Datafun that allows you to define sets recursively. And that is what allows you to do the things Datalog can do. And one of the interesting things about this, from sort of an academic point of view that maybe is less interesting to other people, is... Well, actually, let's go back to logic. Paradoxes. Datalog allows you to define things recursively.
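A sketch of that naive fixed-point strategy in Python, using transitive closure as the example (the edge relation is made up; Datafun's actual syntax and semantics differ):

```python
# Transitive closure as a least fixed point: reapply a monotone step
# function to the edge relation until the output stops growing.
edges = {("a", "b"), ("b", "c"), ("c", "d")}

def step(r):
    # r, plus one more hop of composition with the base edges; monotone in r
    return r | {(x, z) for (x, y1) in r for (y2, z) in edges if y1 == y2}

closure = set(edges)
while step(closure) != closure:
    closure = step(closure)

assert ("a", "d") in closure
```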
MA:
Now, if you think about logical paradoxes, a lot of them involve self-reference. Like, the liar's paradox: this sentence is false. Or the set of all sets that don't contain themselves, that's also a paradox. So, you might start to worry if you hear that a logic programming language includes recursion. You might start to worry about logical paradoxes. And Datalog does something that prevents you from getting into trouble with logical paradoxes, even though you're allowed to define things recursively.
MA:
And that thing is stratification. And what it means is you can write things that refer to themselves, but not in a negated way. You can refer to yourself, but you cannot refer to the negation of yourself. And if you look at all logical paradoxes, they all involve, not just self-reference, but negation. The set of all sets that don't contain themselves. This sentence is not true, right?
MA:
So, that is what keeps Datalog out of hot water. It can avoid having things that have no clear meaning. The equivalent of that, when you move to a functional language and you start allowing yourself to define things recursively as a fixed point, is that the function you're taking a fixed point of has to be monotone. An increasing input must yield an increasing, or at least non-decreasing, output.
MA:
This is because negation is, sort of, the fundamental non-monotone logical operator. Increase its input from false to true and its output decreases from true to false. Every other logical operator: increase its inputs and its outputs increase. Make some of its inputs true, and the output can only become true, right? Or stay the same, right? So, not is the fundamental non-monotone logical operator.
SK:
I see.
MA:
So, there's this connection between monotonicity and defining things recursively without screwing yourself over, without being inconsistent or, in sort of computational terms, being Turing-complete. And so, we need a type system that guarantees that functions are monotone. And so, part of the more academic side of the Datafun work is a type system for guaranteeing that functions are monotone.
SK:
And this is the modal type stuff?
MA:
The modal type stuff is like version two of that.
SK:
So, it's skipping ahead?
MA:
Yeah. So, the original paper gives a type system that is able to guarantee that certain functions are monotone. And it's somewhat inflexible in certain ways. And so, I've been working on a more flexible version that involves modal types.
SK:
Got it.
MA:
But it's for the same purpose, effectively. It's for guaranteeing things are monotone.
SK:
Got it, got it. So, when you do certain operations and functions, it-
MA:
It tracks.
SK:
Yeah, it tracks.
MA:
The effect they have on the order. For example, if you have, you know, expression one set minus expression two. Find the difference between these sets. Well, that is monotone in the left-hand side and the set that you're subtracting from. But it's anti-tone in the right-hand side, the set that you're-
SK:
I see, I see.
MA:
Right.
SK:
But then, if like another operation does the opposite
MA:
Right, yes.
SK:
Like subtracts again, then it flips.
MA:
Then, it all flips.
SK:
It tracks the flipping.
MA:
Yeah, well, the original one only tracks monotonicity and non-monotonicity. The modal type stuff will track monotonicity, non-monotonicity, anti-tonicity, and I-don't-care-tonicity, which I... or we call it bivariance. So, it tracks like four different ways an operation can care about the order of its arguments. It can say, "Increasing inputs yield increasing outputs." That's monotone. It can say, "I give you nothing. I give you no guarantees." That's non-monotonicity, right? No guarantees. It can say, "Increasing inputs yield decreasing outputs." That's anti-tonicity or anti-monotonicity. And it can say, "You can change that input however you like. My output will increase." That's not quite right...
SK:
How could that be? Right.
MA:
So, the obvious answer is a constant function. Change the input however you like, the output stays the same, which as far as I'm concerned is increasing. So, when I say increasing, I mean weakly increasing. Staying the same or growing.
SK:
Oh, okay. Sure. I'm like, how is that increasing? Weakly increasing.
MA:
Or you can have a multi-argument function, which is constant in one of its arguments, but not in-
SK:
Sure, sure.
MA:
Right. But actually, it's a little more subtle. What it actually is is as long as the input changes to something that it is related to, either by increasing or by decreasing, then the output will stay the same or decrease. So, an example of this might be, let's say you have a type which has two copies of the natural numbers in it. So, it has, say, left zero, left one, left two, left three, blah blah blah.
MA:
And it has right zero, right one, right two, right three, blah blah blah. And the way they're ordered is left zero is below left one is below left two is below left three. And right zero is below right one is below right two is below right three, blah blah blah. But left and rights are never comparable. They're incomparable. So, the kind of function I'm talking about is called a bivariant function.
MA:
What it does basically is, if you change from left to right or right to left, I give you no guarantees. But as long as you stay within left, my output will not... will stay the same or increase, but it basically means it will stay the same. All right? Or if you stay within right, then it will stay the same or increase. This comes up basically nowhere in practice, but it makes certain internals of the system work out.
MA:
It is, in a certain sense, dual to the "I give you no guarantees" notion. So, anyway, I don't know why I got started talking about this.
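A concrete version of the flipping discussed a moment ago, using Python sets (the particular sets are made up; this illustrates the tonalities themselves, not Datafun's type system):

```python
# X - Y is monotone in X and antitone in Y, and subtracting on the right
# twice flips back to monotone (antitone composed with antitone).
X, small_Y, big_Y, Z = {1, 2, 3, 4}, {1}, {1, 2}, {1, 2, 3, 4, 5}

assert (X - big_Y) <= (X - small_Y)              # grow Y, and X - Y shrinks
assert (Z - (X - small_Y)) <= (Z - (X - big_Y))  # grow Y, and Z - (X - Y) grows
```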
SK:
The notion of like a type system that keeps track in this way is cool. It reminds me of this Conal Elliott quote where he says, "Part of what makes algebra so great is that it doesn't keep track. Like, four plus three is just seven. And it's like it doesn't matter. Seven isn't four plus three or five plus two. Seven is just seven." And he says, "If seven wasn't equivalent to three plus four or five plus two, if you had to keep track of five plus two, you would have a tree sort of thing."
SK:
Algebra would be like a tree thing and it would be less elegant. So, is your thing somehow less elegant? Or you don't have to actually keep track of that far in the past, because each thing is, at any point in time, either monotone or anti-tone. And so, you don't have to remember the past too much, not a tree.
MA:
So, there is no explicit past here. What this is really tracking is properties of functions. A function is monotone or it's non-monotone or it's anti-tone, or whatever. And the type system can tell you whether it's monotone or anti-tone or non-monotone.
SK:
I see. And so, plus and minus are functions.
MA:
Plus and minus are functions. Plus would be monotone. Minus would be monotone in one argument and anti-tone in the other.
SK:
And so, what you apply plus to... So, once you have a function that uses plus, that function itself just has a tone, as well.
MA:
That function has a tonality, right.
SK:
Okay, got you. So, there's no tracking really. It's just using code in the types in the same way.
MA:
Yeah, it goes into the types, right? So, the type system-
SK:
I see.
MA:
Tracks this and analyzes your code according to a certain set of rules.
SK:
I guess it tracks in the same way that the plus function tracks like its inputs are ints and so its outputs are ints.
MA:
Yeah, exactly.
SK:
I see. Okay, cool. That's very cool. It seems obvious now that you say it. It seems like an easy thing to do, now you say it. But I would have never thought to add monotonicity into a type system.
MA:
Yeah. I mean, we added it, because we needed it to capture Datalog. I wouldn't have thought of it, otherwise. But it turns out, monotonicity has all sorts of strange applications. So, as I said, monotonicity helps you avoid logical paradox, which is sort of why it's helpful here. It allows us to define things recursively while still having a well-defined best answer. Monotonicity also shows up in distributed and concurrent systems a bunch, too. So, there's this work out of the West Coast, from Berkeley and other places.
SK:
The BOOM?
MA:
Yeah, the BOOM stuff, or specifically the consistency as logical monotonicity work, right? So, I think there's a paper called "consistency as logical monotonicity", or something like that.
SK:
Sounds like a cool paper.
MA:
And it says basically that the things that you can implement in a distributed system without any coordination, without having to get nodes to specifically coordinate, are exactly those things which are monotone, in a certain sense.
SK:
I see. Like a number that can only go like-
MA:
Yeah, a counter that can only go up.
SK:
A counter that can only go up or... What's like the other? Yeah, that's like, I guess, the canonical example. Or append-only set.
MA:
Yeah, append-only lists.
SK:
Append-only, but where ordering doesn't matter.
MA:
Yeah, I mean, yeah. That would work. I think you can also have an append-only list. It's just a little bit more complicated. But, yeah, and this is sort of connected, in a way that I still don't fully understand, to the CRDT work.
SK:
It sounds like it.
MA:
And CRDT's are connected to semi-lattices. Because what are CRDTs? Okay. You keep track of a data structure that represents sort of your state at each node. And there is a way to merge these states when you get one from another node. And this operation, it has to be associative.
SK:
I see.
MA:
It has to be commutative.
SK:
I see. I see.
MA:
It has to be idempotent. And it may as well have a zero value or a starting value.
SK:
I see.
MA:
And that makes it a semi-lattice.
SK:
So funny. I never thought of CRDTs... because the merge function, I never thought of it as a function. But, yeah.
MA:
Right, because it's often not explicit, right, this merge function. That's not usually the way you implement it. But it is there. When you think about it, it's there in your conception of the underlying data structure.
SK:
I guess, because it's the flipped perspective. You think of a CRDT as I have an object. You have an object. And we merge the objects. But what's actually happening is the thing I do to my object and the thing you do to your object are then merged. So, it's like the edit actions that are... that you apply the merge operation to.
MA:
It sort of depends. CRDTs are... There are like a lot of different ways of representing CRDTs. So, the simplest way of representing CRDTs, and the way I'm sort of thinking of it, is, sort of, state-based. So, each node keeps a data structure representing its current state. So, for a grow-only counter, this is simple. It's the counter, the value of the counter. And then, to synchronize with another node, you would just send it your state.
SK:
Oh, okay. You just send it your state. You don't send it the diff.
MA:
You don't send it the diff. You just send it your state. Right. Sending it diffs is sort of an optimization on top of this. Or at least, that's the way I think of it. But once you do that, the connection to semi-lattices becomes a lot more obscure. If I had the time, I would go investigate this area from the viewpoint of, "Hey, semi-lattices are cool. Can we make sense of all of this, in terms of semi-lattices?"
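A sketch of that state-based picture in Python, using a grow-only counter (node names are made up; real CRDT libraries differ in the details):

```python
# A state-based grow-only counter: each node keeps a per-node count, and
# merge is an element-wise max, which is associative, commutative, and
# idempotent, i.e. a join on a semi-lattice.
def increment(state, node):
    new = dict(state)
    new[node] = new.get(node, 0) + 1
    return new

def merge(a, b):
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}

def value(state):
    return sum(state.values())

a = increment({}, "node-a")                       # node A counts once
b = increment(increment({}, "node-b"), "node-b")  # node B counts twice
merged = merge(a, b)                              # exchange whole states, not diffs

assert value(merged) == 3
assert merge(a, b) == merge(b, a)   # commutative
assert merge(merged, a) == merged   # idempotent: re-merging adds nothing
```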
MA:
But I don't have enough time. I'm busy doing Datafun. But, yeah, I mentioned this idea to some people, in particular, Martin Kleppmann who has this idea of using Datalog to implement CRDTs. Which I find super cool, right? So, it seems to me that there is some basic science waiting to be done in this area of CRDTs. But I really am not up-to-date on the area. I only have an outsider's perspective.
SK:
We've been doing a lot of talking about math in this conversation. And I'm having a great time. I love math, but I think I've seen people criticize... people in our community, people trying to improve programming, criticize math, or like using math too much. Not criticizing math for its own sake, but people say that we are like... We make our types too complicated. We're trying to incorporate math in ways that over-complicate things and that programming isn't math. And we shouldn't make it math. We should make it its own thing. How would you respond to that criticism?
MA:
I have somewhat complicated thoughts about this, but... So, the first one, I would think is it definitely can get in the way, if you are trying to learn something quickly and you want to just start, right? So, for people first learning to program, I think that excessive abstraction, including excessive use of type class abstractions, can definitely get in the way.
MA:
I think, for example, probably we should teach people to program in dynamically typed languages, because it is one fewer thing to worry about. I'm not like dogmatic about this. I don't have a strong opinion on this, but that would be my intuition. But I think that these abstractions, like type classes and monads, they come about because they really are recurring patterns in programs.
MA:
And it really does help, especially for programming in the large, to have names for these patterns and ways to reuse them in your code. And how much it helps will depend on what kind of code you're writing and what kind of things you care about. If you really need absolute control of what's going on with all your bits, then abstractions can absolutely get in your way. But a lot of code doesn't need that.
MA:
And a lot of code, it's more important that you be able to grasp at a high level what you are doing and to articulate that at the level at which you are thinking it, for which you need to be able to have abstractions. So, basically, I think abstraction... the level of abstraction at which you work is one part of the trade-off space that is programming. You always have to trade off things against other things. And abstraction pays dividends in the long run, but it requires some investment up front.
SK:
I think that's a really great lens that it's like a short-term, long-term thing. If you want to just get going... And I think a lot of people who want to improve programming, what they really care about is the onboarding. People should be able to learn to code as fast as possible.
MA:
Yeah, and I think that is important.
SK:
And then, there are other people who care about the... once you've learned, how easy it should be. And they're almost opposed. Like, they're almost directly opposed. Like, the easier it is once you know how to do it-
MA:
Well, I think we perceive them as opposed. This is another one of my bugbears. I think we perceive them as opposed, because they are simply different. Right, and both hard.
SK:
There are clearly ways to optimize one that are at the expense of the other.
MA:
Yes, but there are also ways that optimize both of them. And we don't talk about those, because we just do them. If there's an easy way to improve both things-
SK:
Of course, of course.
MA:
Then, we do it.
SK:
I see. Yeah, yeah, yeah.
MA:
There's three classes-
SK:
We only talk about the ones that, yeah, yeah, yeah, aren't opposed. Yeah, of course. Yeah, yeah, yeah, of course. Well said.
MA:
I've forgotten my train of thought.
SK:
Yeah, yeah. The thing that's good for both beginners and experts are just all programming languages.
MA:
Yeah, unless they're hard.
SK:
Unless they're like... Well then it's very clear that you should just change that language.
MA:
Unless it is hard to do them...
SK:
Oh, I see. I see. Yeah, yeah, yeah. I thought you said unless it's a hard language, unless it's just a bad language. Yeah, yeah, yeah. So, I interjected to affirm that I really like the idea that math... So, one way to think, instead of using the word math which has all sorts of weird connotations, you could just use the word "patterns" and recognizing patterns. And when you're first introduced to a subject, you don't see the patterns, because everything is new.
SK:
And so, like to just throw patterns on people right at the beginning, is not a good idea. But the more you do them, then the more you're going to want to use patterns, because you don't want to be repetitive. And so, if you're interested in making programming easy for beginners, then throw away the patterns, because they won't mean anything to them.
MA:
Or build them in as features of your language, so that they don't have to think about them. All right. So, like don't force people to use go-tos. Just have them use structured for loops, right? That is, in some sense, a pattern. For loops are a pattern that you can express in terms of go-tos, but you can turn that pattern into a language feature, so that you can just think at the higher level immediately.
MA:
So, there are some abstractions that pay dividends immediately, even for beginners, like for loops, right? Other ones take a little more effort to learn and maybe it's not worth trying to introduce a beginner to them. So, you said that you can sort of replace the idea of mathematizing programming with the idea of recognizing recurring patterns. But it's also true that there are different sorts of patterns in programming languages, some of which we know how to quantify and exactly capture using math. Some of which are a little fuzzier at the moment, right?
MA:
And I think there's this more general divide in programming between techniques that are useful when you know exactly what the problem you're solving is. And techniques that are useful when the problem that you're solving is sort of fuzzy and not very clearly defined, but still important, right? Sometimes, you see this as a front-end, backend dichotomy. That's how it manifests, sometimes. But I think it's more to do with how precisely you can define what it is you're trying to do.
SK:
How like iterative your process is? Are you changing your... What this brings to mind is waterfall versus agile. Are you a start-up that's iterating really quick, or are you some old company that's rewriting existing software to do the same exact thing?
MA:
Yeah, I think that's another way that this same distinction can manifest. And I think, basically, mathematics is most useful in stuff where you know exactly what it is you want. Mathematics is about formalism. It's about being able to precisely specify things. It is most useful when we know exactly what it is we want and what we're doing. It's less useful when you're doing things that are fuzzier, right? Not that it's not useful, right?
MA:
And I think a lot of the history of the progress in programming science has been being able to more precisely specify what we want for larger and larger components, right? And that's why I think you sort of see an increasing tide of mathematization in programming languages and programming work, because we are building up from the bottom as it were. Building up from the smaller things towards larger and larger things that we know how to precisely specify what we want.
MA:
But we also work down from the top. We work from we want to make a system that does this thing involving humans, right? We want to make a video game that is enjoyable. We don't know how to specify that, right? And so, the tension between those two things, I think, produces a lot of this conflict between the people who are really gung-ho about formal methods and mathematization and so on, and the people who are really not gung-ho about that and think that premature formalization can bog you down in details that you don't want to get bogged down in.
SK:
Yeah, yeah. And part of me wonders if you can have the best of both worlds where you can not get bogged down, but also get the benefits of more formalized stuff. And so, like the way I think that it occurs to me is part of what I like about types is they are automated reminders of like, "Oh, by the way." In certain cases that you didn't handle, things aren't going to go so well.
SK:
But I don't want the types to stop anything. I just want, on the side, as an afterthought, "Just so you know, if you care to know, there's some things. There's some implications that you may not be realizing." That's what I feel like that could maybe unify into the best of both worlds where it doesn't prevent you, doesn't bog you down, but also doesn't leave you in the dark.
MA:
Yeah, and I think that's sort of a compelling vision, which is sort of what the gradual typing work is investigating. And Cyrus's work is sort of investigating this as well, with the holes that allow you to have not fully formed programs, or not fully well typed programs, that can still execute.
SK:
Yeah, I've really had bad experiences with like TypeScript. I don't like it at all. People love TypeScript.
MA:
Yeah, I know someone who really loves TypeScript.
SK:
Everybody loves TypeScript and it's my only experience with gradual typing. It's just like so much worse than just programming JavaScript. And I guess to your point, when I program JavaScript, I just want the code to run. I don't care about the cases that you're talk... I don't. Like, stop forcing me to fix these type things. I just want to like iterate. And I'm like, at the end, I want to hear what you have to say about why my types don't make sense. But you're bugging me right now, you know.
MA:
Interesting. I guess it is true. The gradual typing work is not generally... Cyrus' work is about being able to run incomplete or ill-typed programs with those holes inserted, but most of the gradual typing work is not about that, right? You can have parts of your program that are not typed and you can still run that program. But if you add the types, they better type check; otherwise, no.
SK:
Yeah, yeah, yeah. So, anyway, so how does Datafun compare and contrast with the Eve work that you are such a fan of?
MA:
It's much less ambitious. I'm not trying to reinvent all of programming. I'm just trying to see whether we can combine functional programming and Datalog, right, simple relational programming. So, Eve was trying to be a system which enabled non-programmers, or people who didn't have a lot of programming experience, to build complex, distributed systems. I'm not trying to do that.
MA:
I'm just trying to combine what I like best about Datalog, right? That it lets you do simple relational stuff without worrying too much about how the data is represented behind the scenes, and then run it fairly efficiently, with what I like about functional programming, which is that I can abstract out repeated patterns in my code.
SK:
Okay. And in a dream case for you, is it Datafun itself that goes on to be something that people use or more people take the underlying research and embed it into another language?
MA:
I'm like all for any of these options, right? So like, I think definitely the thing that I think is the most likely dream scenario, right, is that the ideas explored in Datafun will help other people design languages and build systems, right? And I think Datalog has been influential in that way, in the same way relational algebra has been influential in that way.
SK:
Oh, interesting. Datalog influenced-
MA:
Datalog, so there's sort of a community of various industrial people who use various Datalog dialects. So, Semmle uses Datalog to do static analysis of large code bases. LogicBlox, who recently got acquired and no longer exist in any way that matters, but while they were around, they used Datalog to do business analytics.
SK:
Yeah, I've heard of LogicBlox from Jamie Brandon who used to be in Eve.
MA:
Yeah, Eve was influenced by Datalog. So, if they got off the ground, that would be great. And just, I don't know. It might seem that way to me only because I'm doing work on it, but there are people with ideas built around Datalog cropping up in the research community, and I think also leading into certain parts of the industry community as well. It's still sort of on the fringes. Oh, there's the work out in Berkeley. The BOOM work, right?
SK:
"Deadalus, datalog in time and space."
SK:
So, is there a world in which Datafun becomes like a competitor to SQL or a Datalog? Is it kind of equivalent in that way, because I don't enjoy SQL for all the reasons people don't enjoy SQL. But it's like fast and I like relations, so could I like swap out my Postgres database with like a Datafun database?
MA:
In some hypothetical future, yeah. There are all sorts of questions that would have to be answered along the way. But yeah, and in some hypothetical future, Datafun could be a query language for a database. The questions that would have to be answered along the way are, well, obviously the ones we are still working on. Can we port the existing optimizations work from the database community to work on Datafun?
MA:
And also, SQL isn't just a language that lets you do queries. It also lets you do updates and, you know, migrations and all sorts of stuff.
SK:
Yeah, right, migrations, of course.
MA:
And transactions.
SK:
And transactions. Does Datafun-
MA:
Is critical.
SK:
It operates only on an existing database. You can't... There's no-
MA:
It doesn't talk to an existing database at all.
SK:
What I'm getting at is it's immutable. You can't, like, mutate.
MA:
Yeah. It's like the relational algebra. It's just a query language. It just lets you answer questions about some data that already exists, or write expressions that compute things.
SK:
I see. I see. It feels like it, yeah. Now, it's making me think of the Datomic, the Rich Hickey project.
MA:
Yeah, although that also deals with change in some way that we don't, right? They have some story about it being sort of append-only, or always keeping your old data around. But we don't have any notion of time at all in Datafun. It's just sets and computing with them.
SK:
Yeah, what made me think about is they have a query language that feels like Prolog. You can like-
MA:
Yeah, it's Datalog, effectively, right? They sell it as it's a Datalog dialect, I think. It's proprietary, so I haven't been able to actually look at it properly.
SK:
I see. Okay, yeah, yeah, yeah. So, that's another thing to add to the list of things inspired by Datalog.
MA:
By Datalog, yeah.
SK:
Okay. So, in theory, the Datafun work could be a competitor or is it an alternative? Or it's like in the same space as Datomic.
MA:
Yeah, in the same space as Datalog or as SQL. But again, there are a bunch of questions, practical questions, that will have to be answered first.
SK:
Yeah, so, incremental computation is when you change the inputs to a function. You're like reusing some of the old... I'm sorry. Let me ask it again. So, maybe give us a foundation of what incremental computation is.
MA:
I mean, incremental computation is just the idea, basically. You run a function once. You change the input. You want to know what the functions result on the change of the input is, how can you compute that more efficiently than just recomputing the function from scratch on a new input, right? That's one way of thinking about it, right?
SK:
And so, from a programming experience perspective, what I'm curious about is a slightly different perspective on incremental computing. If I keep the inputs the same, I slightly change the function. What can I reuse from the past computation?
MA:
That's, in some sense, the same question in that you have the function that takes the program and the input and gives you the result, and you're changing one of the inputs to that function, mainly the program.
SK:
Well said.
MA:
But it's a much harder problem, because programs are really complicated structured things.
SK:
And so, you're kind of taking the derivative of the input. So, we're taking the derivative of a program.
MA:
Yeah. So, one perspective on incremental computation is that you are taking the "derivative" of the program, or the function, that you want to compute incrementally. Because you want to know how does this function change as its input changes. And that's actually related to the fact that the derivative of a fixed point is the fixed point of its derivative. Basically, I found a rule, in a certain system of incremental computation, for finding the derivative of a fixed point. And it turned out to involve taking the fixed point of the derivative of the function that you were originally taking the fixed point of.
SK:
I see.
MA:
And so, what it was useful for is making Datafun go faster, because Datafun allows you to compute fixed points. And fixed points can be computed naively just by taking the function you want the fixed point of and repeatedly applying it, smacking it on the data until its output equals its input.
MA:
This is kind of annoying, because you have the same function and its input is changing, right? First, its input is, you know, the empty set. Then, you take its output and you use it as its input. So, you have changed its input from the empty set to whatever the output was. And then, you reapply it.
SK:
I see.
MA:
And you keep on doing that.
SK:
Oh, I see.
MA:
And really what you want to do is incrementally recompute the result of the function, because its input has just changed. It just happens that the thing you changed it to was its old output. That's the weird part about fixed points, but it's really just incrementally recomputing the function.
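A sketch of what that incremental recomputation buys you, again using transitive closure in Python (this is the semi-naive idea in spirit, not Datafun's actual implementation; the edge relation is made up):

```python
# Instead of reapplying the step to the whole set each round, only process
# the rows added in the previous round (the "delta").
edges = {("a", "b"), ("b", "c"), ("c", "d")}

closure = set(edges)
delta = set(edges)
while delta:
    new = {(x, z) for (x, y1) in delta for (y2, z) in edges if y1 == y2}
    delta = new - closure      # only genuinely new pairs feed the next round
    closure |= delta

assert ("a", "d") in closure
```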
SK:
Oh, I see.
MA:
And that's where incremental computation and fixed points and derivatives all intersect.
SK:
I see. Cool. If you want to make fixed points more efficient, you can use incremental computation. And then, it's really important that the fixed point of derivatives is the derivative of the fixed point.
MA:
It's important if your fixed points... if you nest fixed points. Because, for example, if I have a function that I want to take the fixed point of, and that function is itself defined in terms of a fixed point, I need the derivative of the function to compute the outer fixed point incrementally, and since the function itself involves a fixed point, I need to know how that fixed point changes.
SK:
Okay, I'm going to have to listen to this a couple times post-production to get it. But that sounds...
MA:
Apologies to anybody listening at double speed.
SK:
Just hit the rewind button a few times. Okay, thank you so much for taking the time. This was a lot of fun.
SK:
Yeah, this was fun.