futureofcoding.org

#34 - The Edges of Representation: Katherine Ye

12/05/2018

Katherine Ye is a PhD student at CMU, where she works on representation, including programming languages, visualizations, notations, and interfaces to enable thinking and creating. She’s been affiliated with MIT CSAIL, Princeton, Distill at Google Brain, and the Recurse Center.

In this conversation we discuss Penrose, her project to democraize visual intuition. Katherine envisions “a magical machine where you can dump in a math textbook and out comes a fully-illustrated math textbook, or more specifically a platform where you can simply type mathematical notation in plain text and automatically get many useful and beautiful diagrams out illustrating the notation.” It’s a fascinating project in the intersection of mathematics, intuition, education, visualization, communication, programming, domain specific languages… basically, all of the interesting topics in one project.

As you’d expect in a conversation about the edges of representation, this is a wide-ranging conversation that I can described by a collection of keywords that came up:

Transcript

Transcript sponsored by repl.it

00:00:00
SK:
Hello, and welcome to The Future of Coding. This is Steve Krouse. Today, we have Katherine Ye on the podcast.
00:00:56
SK:
Now a message from our sponsor. Repl.it is an online REPL for over 30 languages. It started out as a code playground, but now it scales up to a full development environment where you can do everything from deploying web servers to training ML models, all driven by the REPL. They're a small startup in San Francisco, but they reach millions of programmers, students, and teachers. They're looking for hackers interested in the future of coding, in making software tools more accessible and enjoyable. Email jobs@repl.it if you're interested in learning more.
00:01:28
SK:
With that, welcome Katherine.
00:01:30
KY:
It's great to be here, Steve.
00:01:32
SK:
Yeah, it's really great to have you. I like to get these conversations started with a brief history of your background, particularly how you got inspired to work on these problems originally, like who your main influence is.
00:01:44
KY:
From the start I've always been interested in interdisciplinary program languages research, and in designing and using powerful environments. Programming research is funny because it has these two sides of you're very much verifying a thing, and your problem statement is pretty set, you just need to figure it out, or you're synthesizing or designing a thing, which is much more open ended. I think it was good for me to get that foundation in rigor and verification and clarity of thought, and to bring that then to design and synthesis as a grad student.
00:02:30
SK:
I'd be curious, to just talk about Penrose specifically, where the inspiration came for it from. I know when I was reading the project description it very much sounded like Bret Victor-y kind of stuff, particularly his Kill Math projects. Was that an inspiration, or did you find inspiration elsewhere?
00:02:54
KY:
The seed of the project came from a conversation between Keenan and I. Keenan is my co advisor. He is an assistant professor at CMU, in the computer science department, and cross listed with the robotics institute. He does graphics, and he's a really awesome mathematician and geometer. I met with Keenan. The process at CMU is you come as a grad student, and then you meet with tens of faculty to go through this advisor matching process. Keenan was the one I met with, and I was real interested in doing research that was kind of spanning both language and image. I didn't know quite what it would be, and Keenan basically proposed the seed for what would become Penrose on the spot, which was "You have this language expertise and I have this math and graphics expertise, what if we teamed together and we made this thing where you could simply type math notation in plain text and automatically get these beautiful diagrams out?" I was instantly sold. Then we pitched the idea to our collaborators and advisors, Jonathan and Josh. They were sold, and then we started working on it.
00:04:23
SK:
Could you describe your own thinking process? When you think about mathematic or abstract topics, are you thinking in pictures, or are you drawing pictures? Is that important to your process?
00:04:35
KY:
Yeah, absolutely. One thing that is maybe an interesting note is the relationship of the mind's eye to one's personal process and how that may or may not generalize to other people, I think it's a little bit of a pipe theory. I might have heard this from Mike O'Nielson, that people or mathematicians who are very strong at creating visualizations often tend to themselves have a poor mind's eye, so they have to externalize that... And people have different strengths of visualization ability, just in their head, so it's going to differ from person to person. Those who are maybe very strong at visualizing these photorealistic images or whatnot might be less likely to actually create visualizations because they imagine everyone else is able to do so very well.
00:05:50
KY:
In terms of how that relates to me, my mind's eye is actually very poor. I think there's this famous study from Clive Bacon or someone, I can send you the link later, one of the early studies where he just went around asking a bunch of people of various professions this systematic questionnaire of when you visualize the face of somebody you love "is it in the distance?", "is it really big?", "how great is it?", "is it centered?", "are they moving?", and then asking this about various subjects of personal importance, like your room or your boss, things that you normally would be able to recall instantly, and then starting to ask about more abstract subjects like the face of fear, or an imaginary perfect beach, or whatnot, and seeing how what kinds of subjects and situations would change the strength of that visualization.
00:06:56
SK:
Fascinating. I'm trying to do it right now, and it is not easy.
00:07:00
KY:
Imagine the face of a beloved family member, and are they moving? Do you see them wearing their usual outfit? What kind of stride do they have? Are they speaking? Is it easy to hold the face in your mind, or does it just disappear in an instant? These are all different for different people.
00:07:24
SK:
I can't do this very well, or it's not resulting in very clear pictures. It's very abstract, I think, for me.
00:07:36
KY:
Exactly. Back to your original question, "What do I actually do in visualizing things?". Because my mind's eye feels very poor I feel like I have to physically hold the picture in my head. I think that's why my ability to draw things is stronger than my ability to see them, and that's why I draw so many things.
00:08:03
SK:
That's funny. I think there are some people who doubt visualization, or doubt intuition. I feel like you could call them symbol pushers.
00:08:19
KY:
Symbol pushers, that's great, yeah, pushing algebra on the kids.
00:08:27
SK:
I feel like, pedagogically, a lot of people, a lot of elementary school or arithmetic teachers or just people in general who aren't mathematicians themselves or don't have the mathematical thinking or think that or maybe they have it and they don't think other people can have it, they feel like symbol pushing is like a reasonable way to teach things. I feel like it's kind of related to the Copenhagen interpretation of quantum mechanics, just shut up and calculate, don't worry about what it means, you don't have to have a model for it in your head, just be a computer basically and push symbols. Do you feel like you know what I'm talking about, have you seen the same thing when you've explained your project, like oh, only some people need visualizations, other people can just push symbols?
00:09:11
KY:
Have you heard this quote? "Algebra is the offer made by the devil to the mathematician. The devil says I'll give you this powerful machine, it will answer any question you like, all you need to do is give me your soul and give up geometry, and you'll have this marvelous machine."
00:09:30
KY:
I think 2004, I don't have this quote memorized, I just looked it up because it's so apropos.
00:09:35
SK:
That's very funny.
00:09:41
KY:
That basically summarizes my opinion on this.
00:09:44
SK:
Could you spell it out for me?
00:09:47
KY:
Yeah. Symbol pushing is very powerful. It offers rigor that is essential for many purposes. This is why incidentally proof assistants are entirely textual and symbolic, although there are some visual prove assistants, mostly for category theory. I'm not familiar, I haven't used those. Very powerful, but then you lose the understanding for why it's true, and I think in math being able to present your results to other humans and have them have it in their heads and be able to contribute to it is actually almost more important than having a correct result. This is like the problem with the proof, Machazuki's proof of the ABC Conjecture, where he basically did this totally inexplicable marvelous thing but nobody understands it and you can't build on it, so it's just this kind of monument that people stare up at and can't make any sense of.
00:11:04
SK:
I think that it could go either way for that specific example. He could either be just symbol pushing and nobody could understand his symbols, they can follow the symbol pushes but they don't have an intuition for it, or on the other side the way he did it, I don't know, you can tell me, the way he did it maybe was through deep intuitionistic thinking, and he can't explain his intuition somehow, or it's hard for him to visualize it, explain it. Did you know which is the case?
00:11:37
KY:
No. I haven't been able to, and I haven't tried to actually dig through his work, so I don't know. But it sounds like there are two strands here, the importance of intuition in math and the importance of visualization in math. Those often coincide, but often don't.
00:11:58
SK:
Yes. That's well said. I have this quote here from Bhet Victor. He says "The dirty little secret is that the greatest mathematicians don't actually think in symbols. Einstein himself said that he did his thinking with sensations of a kinesthetic or muscular type." That's not visual. That's like body thinking, which Seymour Pappert I think called "anthropomorphic thinking", where you imagine yourself as the turtle, and solve the problems like imagine what you would do with your legs, and I remember when I learned LOGO that that was a transformation for me, learning how to leverage my body knowledge to solve mathematical problems. You're right, that's very different than visual thinking.
00:12:45
KY:
I recall maybe a quote from Feynman’s _Surely You’re Joking_, where he is trying to understand some physical process, and he is just I think lying down on the hotel room floor, rolling around, and I can't recall exactly what happened, maybe people walked in and thought this guy's totally nuts or something, but he had to actually lie down and do this kind of ridiculous seeming physical motion to understand what he was trying to derive the symbols for, so that definitely seems like a common habit in people who are trying to get a deep intuition of something beyond the symbols. It doesn't have to be visual. It could be kinesthetic or an embodied understanding of the intuition.
00:13:52
SK:
They're definitely both very powerful. I think another Bret Victor quote is how we have all these abstract complicated concepts but we're stuck with these plain old monkey brains and these plain old monkey eyes, but we can leverage them through kind of almost like tricks to be able to feel or see the abstract thing somehow. I think Bret Victor references Edward Tufte's work a lot. Have you read some of his books?
00:14:30
KY:
Yeah, I'm familiar with some of Tufte's work.
00:14:32
SK:
Tufte in particular talks about William Playfair, and how his X-Y data plot, he was like the first person to put time on the X axis and then some number on the Y axis, things like that. He talks about how all the sudden our monkey eyes, which can judge small changes in visual height, can all the sudden be leveraged to judge changes in abstract quantities over time, and how something as simple as this new notation visualization helped spark the scientific revolution, which every journal has the X-Y plot. Anyways, I think a lot of people don't realize that that was invented, and the impact it had.
00:15:22
KY:
Yeah, exactly. I mean you know you ask the fish how the water is and they say what water?
00:15:30
SK:
Let's talk about some of the specifics of Penrose. I think just from an implementation perspective you made a lot of interesting choices, the engineering design, and there's a lot of technical challenges that you're working through. Can you maybe back up and describe the vision for the project, and then the design of the system?
00:16:00
KY:
The one liner summary of the vision is a magical machine where you can dump in a math textbook and out comes a fully-illustrated math textbook, or more specifically a platform where you can simply type mathematical notation in plain text and automatically get many useful and beautiful diagrams out illustrating the notation. That's the vision. We want to democratize visual intuition. Making that happen motivates, just that sentence, which has been our sentence from the beginning, that's motivating us as driven several design choices in the system. For example, one fundamental question is do we take a language based approach, or an interaction or image based approach first? The latter could be natural. In fact, if you ask most people how they would design diagramming software they would probably say build a GUI.
00:17:18
KY:
Thinking about here, first, that there already exists a notation that people are familiar with, that people are working with, which is mathematical notation, that motivates our design of the substance language, which is our modeling of mathematical notation, and then thinking about, now we have this one textual language, how can we design a language that enables expert users to specify how those mathematical elements are translated into visual elements systematically? The essence of diagramming, defining a visual language for your domain, how should we do that?
00:18:10
KY:
It seems natural for at least the internal representation of that to be text. Maybe eventually it could be GUI, but internally it simply has to be a text because if you specified the visual presentation otherwise then that would be inherently visual, and then you wouldn't be able to swap out different kinds of visual representations. There's no way we can build this all ourselves for any possible domain. We need to build an extensible system so people can add their own domains as plug-ins. It needs to be flexible, it needs to have mechanisms designed from the start for this. That motivated our design of the language called lement, which is a plug-in a little bit like a module systems type signature, where you say, you just tell us what the fundamental objects in your system are as types, and then the functions and predicates that relate to these objects, so the objects and their relationships.
00:19:21
KY:
This doesn't have to be specific to math. It could be any semantic domain, say chemical reactions. Those three languages, Element, Substance, and Style. Substance is like the HTML of our language, and Style is like CSS, or Substance like pattern matching but really souped up. Those are the textural front end of our system, and then it comes with a powerful numerical solver. This is now motivating the whole graphics design of the system. By the way, in the write up on the Penrose site there's a diagram of the pipeline, and that might clarify anyone who's having trouble understanding my linear description of what's going on.
00:20:21
SK:
Yes.
00:20:21
KY:
Yeah, I mean linear by necessity, because speech is linear, but there's a diagram.
00:20:25
SK:
Did you use Penrose to make this diagram?
00:20:29
KY:
One day. One day.
00:20:30
SK:
Very cool. That was a great summary. One of the analogies I saw in the write up and also you used just now is the HTML and CSS analogy. For me, I feel like it's very much a nitpick, but I feel like a better analogy is how today in web programming we have a model, some like JSON thing, and then we have HTML and CSS that's a pure function from the model. To me, that separation of the model and then the view is kind of like your Substance and Style. Because to me HTML and CSS are both very visual, so I find it hard to make that distinction. Does that make sense, or do you disagree with my characterization?
00:21:21
KY:
Yeah. I think model/view is also a way to understand it. The thing is we need to use an analogy that makes our design familiar to people who don't have necessarily a web programming background. If you say model/view and so on, they might not know what we're talking about.
00:21:41
SK:
Okay. Cool. Yeah. That makes sense. I didn't know if you chose that analogy for that reason or because it was more a better fit.
00:22:00
SK:
One of the things that occurs to me is that this platform definitely makes sense from the perspective of I have a math textbook or I'm a mathematician and I want to add some visualizations. That makes a lot of sense. The story from the other direction of I'm a math student and I'm looking to understand what I'm doing, it's a little bit harder. It reminds me of Seymour Papert's quote like if you taught dance with the notation first nobody would want to be a dancer. You have to learn dance with the movement part first, and then maybe they teach you notation later. With Penrose, you have to start with the notation, and then it gives you the visualization, not the other way around. Do you see there being any hope or promise of starting with visuals and then maybe getting notation? Is that possible at all?
00:22:53
KY:
I think that is not in the scope of our system, and I think the design as it is has a great deal of promise in educational domains. For example, one thing that Penrose can help with for as a student is you have the notation but you simply don't quite know how to illustrate it, and you don't know what the important cases are, and you could simply type that in and then Penrose could help you find the important cases. For example, if you just typed in a bunch of set notation that might be given under a specified diagram, and then Penrose could find "oh, you have these special cases where A and B are equal to C, but you only wrote subset equal", and that might help you understand it, or you had some points and stuff but it wasn't clear whether P would lie in the set or not. Then it could find these sorts of kind of like fuzz in your specification. These are often important corner cases in the notation that trip up learners, and in addition to the corner cases these general cases of what is the least misleading way to illustrate this notation as generally as possible.
00:24:35
SK:
I think I read in the description that you have some maybe direct manipulation, tweaks out of the visualization output at the end of the process. I was wondering how those were saved. Do they is it like a bidirectional thing, like Sketch-n-Sketch, or does it make its way back into style somehow, is it another layer?
00:25:00
KY:
We're not aiming for prodirect or bidirectional manipulation. So, the way that works is you've compiled your three Element Substance and Style programs. You've gotten a diagram that has been generated by our optimizer, and that diagram is specified to with some combination of objectives and constraints. And for the objectives and constraints, so the constraints need to be satisfied, they're hard. And the objectives are just, "make the energy of this function as low as possible." So, try and find something that you like.
00:25:42
KY:
And so, when you drag around these things in the gooey, we re-optimize the diagram right after that, in order to preserve the objectives and constraints on it. So, concretely, like if he had written some point P is in subset A and he dragged that point out of the set, it would get pushed back in, because that violate the mathematics or the Substance specification that was combined with the style. And so, this also ties back to the educational application, because for a student to edit this diagram and then to actually do something that violated what the mathematics meant and have that system have the knowledge of what it means and automatically preserve it, but also do the minimum possible change to preserve it, because that's how the optimizer works, I think that's very powerful. And no other general purpose GUI-based tool can let you do that.
00:26:47
KY:
There's certainly systems with the specialized knowledge built in, for example, Cinderella, which is geometry software, we can drag around things and have the semantics of the math preserved. But those are all custom built software for these constraints.
00:27:06
SK:
I see.
00:27:07
SK:
So, I was curious to talk briefly about constraint-based programming, because it's not something that I have a lot of experience with and it sounds great, it sounds declarative. In particular, I've been curious about how it relates to other paradigms of programming, like functional programming for example. I think the downside I've always seen is that it's kind of unpredictable somehow or the performance characteristics of it aren't... it's kind of opaque. You specify some things, and then who knows what it will come up with on the other end. Yeah. How do you think about constraint-based programming?
00:27:51
KY:
Yeah, so I think the sense in which you're using constraint based programming is different from the sense in which I'm using it, in the sense in which it shows up in Penrose. So, it seems like you're talking about languages like Prologue or miniKanren, which are relational/logistic/constraint based languages. And I think they hook into either SAT solvers or some kind of logical search engine. And I personally don't have very much experience using these systems, so I can't speak to that.
00:28:26
KY:
In Penrose what I mean by constraint-based or solving constraints is that, in the Style program as a style writer, so as an expert user, you might specify the visual representation of some mathematical objects or relationships, in terms of constraints. So, you might say, "for any vector that's in a vector space, I would like you to draw the vector as an arrow whose root is at the origin of the vector space. I will also draw the vector space as a square. And I'd like the head of the arrow to be constrained to be inside the shape of the vector space."
00:29:09
KY:
So naturally, as a person, you don't want to draw the vector going outside of whatever visual representation you had for the vector of space. So, that's just telling your system, apply this constraint function, which is some math that's like, "if the tip of the arrow goes outside the square, then penalize it very heavily in the resulting objective function that we optimize with the resolver." And so, in Style, all of these constraint subjectives are jointly optimized. There's currently no kind of logical search, or proof search, or SAT solving happening.
00:29:52
SK:
I see.
00:29:54
SK:
And so, can I specify the weighting of my preferences?
00:29:57
KY:
Yeah, for sure. So, as a person who is implementing the domain, you can certainly use your objectives and constraints, and add a parameter that's the weight. And then, just say the weight is in the flow. And then, in Style, you can say the weight is ten, or 100, or a thousand, or ten to the negative fifth, or whatever.
00:30:21
SK:
Got it. Very cool.
00:30:24
SK:
And I'm sure you've thought about making visualizations that are also interactive, given your work on Explorable Explanations at Distill. So, has that been built into the Penrose master plan at all yet?
00:30:40
KY:
We'd love to do that one day. We have more interesting questions than we know what to do with. But it's in our roadmap eventually, because we like this capability of Mathematica where you can simply pick a parameter and say, okay, that's a slider and that's automatically parameterized as interactive. So, I think that's something we could add with not so much engineering effort that would actually help a great deal. You could just say like, "bury this parameter in the diagram."
00:31:22
KY:
But with more sophisticated kinds of interactivity, it's hard to think of how to make that automatic. And I think that's not in our road map.
00:31:34
SK:
Yeah, that makes sense. It's a complicated problem.
00:31:37
KY:
Yeah. And other people are tackling that problem head-on as a main research question. It's its own thing. So, I'm not sure if we can simply add it to our road map and do it.
00:31:52
SK:
Could you, if you know them offhand, list some of those people for reference?
00:31:55
KY:
Yeah, so I think Matt Conlan's language Idyll, have you heard of it?
00:32:03
SK:
Yep.
00:32:03
KY:
Yeah, so I haven't used it personally, but I believe they're a Markdown-variant that integrates with web technologies, so you can make scrollable, interactive, web-based explanations. And their work seems very cool.
00:32:26
SK:
Agreed.
00:32:30
SK:
Another interesting question that I imagine is kind of farther into the future, or maybe it's just a speculative question is, so you have the standard math notation to visualization framework. And just to reiterate, you have domain experts specifying the types of objects, and then you have users specifying specific objects in their relationships, and then you have a visualization layer. Part of me is wondering if we could just like chop off the visualization layer for now and we just have like a domain specific language to explain objects, and then specific objects in a different layer, and then kind of turn that into like a replacement for pencil-and-paper in math, particularly for students, but maybe for others, kind of like how Coq is becoming a replacement for proofs.
00:33:24
SK:
I guess another way to phrase this question is... pencil-and-paper math is great because it's flexible, but it's crappy because the feedback loop is so long for students. It's like unclear if they did something right or wrong until they give the paper to the teacher and the teacher marks it up with red marks and gives it back to them like a week later. So, I'd be curious to your thoughts on how to like tighten that feedback loop for students.
00:33:47
KY:
Yeah. So, the Penrose Substance language is not as expressive by design as many of the other languages for expressing mathematics. And I think those are better suited for your purpose. And I think your purpose is definitely one that I care about and a noble one.
00:34:10
KY:
So, have you seen this thesis called the Language of Math?
00:34:16
SK:
No.
00:34:18
KY:
By Mohan Ganesalingam, Cambridge PHD.
00:34:23
SK:
Cool.
00:34:24
KY:
It's very good. It influenced our design of the substance language quite a bit, and he's doing something much more ambitious than we did, which is model study the full breadth and depth of mathematical language in the wild and try to systematically characterize it and write a parser for it. And that's a really great reference for, written by a mathematician, how to model mathematical language and the ways in which existing proof assistants fall short. So, he actually has a section on whole light advisor or something and how can quite do what it needs to do.
00:35:14
KY:
But yeah, so in general just yeah, executable, and precise, and intuitive representations of mathematical expressions are something that I care about and that I find to be important. And, as a long time Coq user, it is pretty amazing. It's not just for proofs. It's pretty amazing how fast the feedback loop is for writing something and then immediately seeing, oh, I can check this line. I can move on, or I can't check this line. I did something wrong. It doesn't type check. What did I misunderstand about the math? That's pretty cool. But certainly too advanced for an elementary school level.
00:36:14
KY:
So, I mean there are lots of games that actually tried to do this. I think DragonBox is one of them, and there might be more, where it's kind of like a gamified algebra thing where the rules are really rarified in the game. So, if you try to, I don't know, divide by zero- I haven't actually played this - but maybe if you tried to divide by zero on both sides, it just doesn't let you, but it'll let you intuitively drag a factor to the other side of the equation, and immediately give you feedback on what that does.
00:36:47
SK:
That's awesome. And yeah, thanks for that thesis or that paper you mentioned about the language of mathematics. That definitely sounds like a good place to follow up with this question.
00:37:01
SK:
I just want to make a note that a few times this conversation you've used the phrase "modeling the notation of mathematics". And I find that phrase to be delightful, because mathematics is the language of trying to model the world or model specific things. And so, we're trying to model a language that models other things. I find that wonderfully meta.
00:37:27
KY:
Oh yeah. Following the model, I hope nobody tries to model us.
00:37:31
SK:
What do you mean? You work for Distill. Isn't that where you guys do?
00:37:36
KY:
Oh boy. Yeah, models on models.
00:37:42
SK:
So, I think one resource that you put together that I really enjoy is your list of notations. It's a really fun list. So, maybe you could summarize what it is and what you've gotten from that list so far.
00:38:00
KY:
Yeah. So, it's a syllabus I put together on how notations influence thought, not necessarily improving it or augmenting it as one might suspect, but just how it can change the way you think. And so, I did this maybe three years ago, four years ago- I can't remember. And, when I made it at the time, I was very surprised that no one had already made it. That's why I made it. It wasn't like I was teaching a class on it at that time or something. And one of the impetuses was this talk I gave at Strange Loop on knot notation, Conway's knot notation.
00:39:08
KY:
So, Conway has this really interesting, wacky notation four knots that involves these very specific operations around flipping a thing diagonally, and then adding it visually to another knot, and then constructing knots in that way. And he made these very bold claims about knot enumeration, like listing all the kinds of knots and how his notation enabled him to do all these powerful things. And I was like, "wow, I need to understand how that happened." So, then I propose to give a talk on it. And then, they accepted my talk, to my great dismay that I had to actually learn it.
00:39:46
KY:
But so, after I gave this talk, I was thinking about generally taking a step back and all the other ways that- knots are not literally knots. Knots are metaphors for things that are hard to describe. So, everything else in this a loop is about things that are hard to describe, and the ways we insist on describing them, and ways that are better or worse for describing them. You mentioned dance notation and dance notation is one of them. Or, for example, I just love Chana Horowitz's series on [inaudible 00:40:33] connectography, where she makes these beautiful visual notations, paintings for this sculpture performance series that never actually happened. And, she's got these paintings describing the- I believe it's like the timing of the lights or the timing of the movements, and it's very precise and very beautiful.
00:41:03
KY:
Or another one is this wonderful write up by Colin Wright on juggling notation. So, I mean, juggling, that seems hard to describe. But apparently, I'm not a juggler myself, but apparently there's a periodic table of juggling notations. And you can discover new juggles by systematically examining the ways that you might describe these sequences.
00:41:27
SK:
So, it sounds like your motivation here was like the Sapir-Whorf Hypothesis, as applied to notation, instead of like spoken language. Is that a way of characterizing it?
00:41:40
KY:
That's a really good question. I haven't thought about how it relates to what I mean. But Sapir Whorf is - I think it's very specific to natural languages, and the folks involved have not made statements about other kinds of descriptions. And so, I'm not sure if their assertion is relevant here.
00:42:28
SK:
I guess I can rephrase it then. It feels like there's an equivalent question for a lot of languages, I guess. Does the spoken language you speak affect the things you think? Does the programming language you program in affect the sorts of programs you write? Do the notations you write in affect the visualizations you see or write? I feel like there's a common question of the representation affecting something more semantic, somehow.
00:43:11
KY:
Yeah. Well, there is your notation affecting the way you think, but there's also the way you think, the culture affecting the representations that are designed around its loop.
00:43:24
SK:
Yeah. Yeah. I'm sure culture plays into it as well.
00:43:31
SK:
So, in this syllabus of studying notation, one of the things that, maybe I just missed it, but that I really wanted to see there is kind of process notes on like if I'm ambitious enough to want to create a notation, how do I go about it? What are the tricks of notation designers?
00:44:00
KY:
I'm going to write that down. That's really interesting.
00:44:10
KY:
Well, I mean the question is what you want to design a notation for? So, the syllabus is not aiming to that - that's going from a syllabus to say a studio class on notation design, which is a little different. But there is a little bit of a hint there, in terms of- I'm thinking of two things. First, the Green's paper on dimensions of cognitive notations, cognitive dimensions of notations, something like that. And I think they discuss more specifically and systematically the elements of a good- "good versus bad notation" and different kinds of cognitive tests that different kinds of notations can help scaffold. So, I think that's one resource.
00:45:17
SK:
Mm-hmm.
00:45:20
KY:
And second, there are some slides in my Strange Loop talk where I was - it's just only a very few at the end - but I was thinking about, more abstractly, how these descriptions map to objects and how objects map to descriptions. The properties of that mapping, whether it's one to one, or many to one, one to many and what that means in terms of intuition and numerability. So, those are two starts. I wonder if there are more out there.
00:45:58
SK:
Well, so the point you made about the mapping, I do really want to talk about that, because I saw that in your talk and that's something that I think about a lot actually. But before we jump to that topic, I just want to say that, it feels to me like the creation notation is very related to the design of programming languages, and the design of like macros in programming languages, or DSLs, or even libraries. It's a very related kind of thing, becuase programming languages is very much like a notation.
00:46:34
SK:
And then, it's also kind of related to user interface design, or really any kind of design where someone wants to interact with it. Like even Gmail, like is it a notation? Is it a programming language? It's a very limited programing language, a very limited notation, and it's like a very visual notation, maybe. I guess that's kind of debatable. It depends on how you define words. But anyways, I just thought I would say that it feels like the practice of being a notation designer would be related to these other activities.
00:47:05
KY:
Yeah, absolutely. And I'm not sure if I would say that I am a notation designer, honestly. Like from scratch, it's not like I've designed bracket notation or whatever, Einstein's Tensor Notation. I haven't designed a notation. And it's not just me. Within Penrose, it's the group of us making these language design decisions.
00:47:39
KY:
But to your point, yeah, it feels like there is certainly a commonality between language design, interface design, visualization design, representation design, API design, all of that. Yeah, things, I guess, mediating the relationship between things seems like the commonality to me.
00:48:06
SK:
So, I want to go back to that topic you brought up of the correspondence between a - how do you say it - like a concept in your head, and then a notation, like an encoded notation in the world. That's how you describe it?
00:48:19
KY:
Mm-hmm.
00:48:20
SK:
And so, yeah, in your talk you have like these two sets, and one set is a concept and the other set below it is notation. And you have arrows between the two. You could have concepts that don't have notation. You could have concepts that map to multiple notations. It's like a mapping thing. You have notations that don't map to any concept, et cetera.
00:48:41
KY:
Mm-hmm.
00:48:45
SK:
So, you were saying, I think, the dream is that you have a certain notation such that there's a one to one mapping. For every concept, there is a notation, and for every notation, there's a concept. Is that the dream?
00:48:59
KY:
I'm not sure.
00:49:00
SK:
Oh, okay. Yeah, I guess it is debatable.
00:49:03
KY:
I think that that's certainly an interesting property, but I think there are many other interesting properties of these mappings. For example, part of the delight of natural language is that it's so, from a PL point of view, natural language is not orthogonal. There are many ways to say the same thing, and that's where ambiguity and that's where you find humor and delight.
00:49:33
SK:
And I guess there are also many things that you can't talk about or that we don't have the words for.
00:49:40
KY:
Exactly. And that's why they are many mediums and media.
00:49:44
SK:
Yeah. And there are also many things you can say that don't have any meaning. They don't map to anything.
00:49:52
KY:
Oh, exactly. Yeah. The edges of language are so delightful for me. These kinds of asemic language- asemic is more used for reading, I guess. Are you familiar with that word?
00:50:00
KY:
Asemic is more used for reading, I guess. Are you familiar with that word?
00:50:04
SK:
Say it again.
00:50:05
KY:
Asemic. I think that's how you say it. I've never heard anyone say it. A S E M I C.
00:50:12
SK:
No, I don't know that word.
00:50:17
KY:
I think one of the best known examples is this - I don't know how to say it - this codex where everyone thinks- like there are these beautiful pictures and then the writing is just nobody knows what language this writing is from. And scholars suspect that it's just someone writing some beautiful script with no actual meaning attached to it. So, any attempts to decipher it or stymied by the fact that there is no actual meaning.
00:50:50
KY:
Yeah. Or like, back to your point about language that doesn't mean anything. If you take some of the things that are really at the edges of language like "Colorless green ideas sleep furiously." Have you heard that?
00:51:17
SK:
No.
00:51:19
KY:
It's, I think, one of Chomsky's examples for language that is not supposed to mean something. But, of course, all these linguists have written papers about how that sentence actually means something.
00:51:35
SK:
That is funny. I can picture it actually.
00:51:39
KY:
Yeah.
00:51:43
SK:
Yeah so, obviously English is great and ambiguity is great for a lot of things. I think my dream, and I think it's a common dream from mathematic/computer science types, is wouldn't it be nice if we could be specific and precise when we want to be? And so, one dream I've had, or one thought I've had, which maybe you could tell me makes no sense, is wouldn't it be neat if for any concept there was one and only one notation for it. So...
00:52:25
KY:
I think that's just inherently not going to be possible, because there are concepts that span many domains.
00:52:35
SK:
Yeah. Yeah. Well, so I guess what I'm getting at is, in a given notation, so I'm defining a language right now, it's called Steve's Notation. And in Steve's Notation, you can encode certain concepts, and each concept you encode and meditation has only one representation. So, not in every notation, just in my notation. You still think it's impossible?
00:53:03
KY:
Well, I mean, is your language going to have arithmetic? Is one plus two going to be different from two plus one?
00:53:09
SK:
Yeah.
00:53:10
KY:
What is equality?
00:53:12
SK:
Yeah, so that's exactly the point. So, yeah, so that's actually where I was going to go next.
00:53:17
SK:
The notation would describe one plus two in such a way that it would be equivalent to two plus one, if that makes sense. So, yeah. So, the notion of one plus two or two plus one, I think, would just resolve to the number three, and that's it. If you mean one plus x or x plus one and it can't be resolved, then I think it would just be stored, not only as one plus x, but it would be stored as x plus one. It would be stored as a pair, or maybe it would be stored as "one plus x, but by the way, addition is community and so, whatever."
00:54:00
SK:
So, basically, yes. The answer to your question is yes. It would try and map all concepts to one notation.
00:54:09
KY:
Okay. So, what I'm hearing is one, some kind of normal form.
00:54:14
SK:
Yes.
00:54:16
KY:
One some kind of evaluation strategy to two, maybe several kinds of well defined, normal forms.
00:54:27
SK:
Yeah, that's a good way to put it.
00:54:30
KY:
Perhaps, but I guess what is the end goal of this?
00:54:39
SK:
So, yeah. That's a good question. And it's kind of vague at this point, but my thinking is... so actually this is kind of related to another topic that I've seen you talk about on the Internet is, it's good to be at the highest level of abstraction possible. Like in coding in the highest level of abstraction possible so you can be as precise as possible about what you mean. Like when you're forced to describe visualizations in the language of like vector shapes, you're not communicating to the computer what these things mean. And so, the computer can't like move things automatically for you and help you with things. If you move a point out of the vector space it's in, it's not going to complain, because it doesn't realize that it's a point in a vector space. It just thinks it is unrelated shapes.
00:55:29
SK:
And so, it's related to that idea. And I guess part of it is refactoring. So, if I describe an algorithm in one way and you describe the same algorithm in a different way and we're trying to like merge our code, it's gonna complain, even though it's the same underlying algorithm. So, it's like we're not being abstract enough about what we're trying to accomplish somehow. And then, it's going to like get in the way of other things we want to do.
00:56:04
SK:
It's very vague. So, it's hard for me to communicate in a short way the problem. Did anything come across there?
00:56:13
KY:
Yeah. I mean, I have an unrelated complaint, which is that you might be like trying to solve the Halting Problem. But also, it sounds like you're tackling one of the most fundamental questions in PL which is, what is equality? What is equivalence?
00:56:30
SK:
Yeah. So, yeah. I didn't realize that was one of the fundamental questions. What are the key words or like paper? Can you give me some related work on this topic?
00:56:44
KY:
Yeah. So, this is definitely more of a theoretical PL concern that I heard many people talk about when I was still in the more PL Theory and verification world, but some keywords, I guess, or "univalence", "homotopy", certainly just "equivalence", and "equality". Let me see if I can find this old paper. There are many papers that are just about definitions of equality in PL. Oh yeah. And like many logics that are about different notions of equality. Okay. Let's see.
00:57:32
KY:
Yeah, type equality. So, Observational Equality Now, is one paper which I have not read, but I remember the title. Oh yeah, also "decidability". Deciding Equivalence With Sums of Empty Type. I'm just looking at papers from the CPL group.
00:57:59
SK:
So, one thing I feel like this is related to is type and the recent work on dependent types, stuff like that. Because the idea of a dependent type is, this function doesn't just take a list and gives you another list. It gives you back a sorted list, which is like the thing that you need to know about. That's like what a sort is. You're giving the computer more information.
00:58:30
KY:
The one you're giving me here sounds more like a liquid type than a dependent type. I think a dependent type is specifically a type that depends on value, but yours is a type that depends on the predicate, which can be...
00:58:43
SK:
Yeah, yeah, yeah. I think you're right. You're right. I'm talking about liquid types.
00:58:48
KY:
Yeah, both seem cool and powerful to me. I don't- I mean I program in Haskell, I've programmed in Cog. I don't spend so much time in Agda or LiquidHaskell, but it sounds like they're doing great work.
00:59:04
SK:
Yeah. I think I saw you worked on this "end of history" proof assistant paper?
00:59:13
KY:
Oh yeah. Yeah. That was one of the things I worked on with Adam.
00:59:15
SK:
It feels relevant to this, like you try to be as high level as possible. And then, that helps with things. Is that related to this vague conversation, or not really?
00:59:33
KY:
Yeah. The idea that you have- Liquid Types - certainly. You have this very rich specification, and specification only, of, say a DNS server. And then, you use various synthesis tactics in the proof assistant to create a correct by construction and implementation of the DNS server from its high level specification.
01:00:06
SK:
That's very cool.
01:00:12
SK:
Here, I have another try of explaining this random idea.
01:00:17
KY:
Okay.
01:00:17
SK:
And then we can move on. So, in imperative programming, if I wanted to sum up the numbers and a list, there are a few different ways of doing it. I could have an iterator and a loop, and I'd add to the accumulator, and then I'd have returned the accumulated thing. But in functional programming, it seems like there is like one way to do it. You're like folding plus starting at zero over a list. It seems somehow more like that's representing more of the essential nature of what you're trying to accomplish when, in imperative programming, you're at too low a level of an abstraction. It's not at all clear what your intent is when you're creating these random variables. It's too random and freeform.
01:01:06
SK:
And so, I'm not suspicious that anytime you can represent the same idea in multiple different ways, we're somehow not being abstract enough about what we're trying to accomplish to the computer. If we were being more specific at a higher level of abstraction, then there would just be one way to say what we're trying to do. But because we're like somehow too low, then I can describe it this way, and you could describe it that way, and they're not equal at that level, but if we picked a higher level, they could be equal.
01:01:33
KY:
The idea of orthogonality in program language design seems very similar here. The idea that the elements of your programming language design, like the grammar of the primitives span are like orthogonal bases of a vector space. And you don't have any bases that are, say, almost parallel, because that would mean that they didn't have quite the expressive power that they need it. And it's a little bit of a controversial idea, but it seems related.
01:02:06
KY:
And the second thing that comes to mind is Crystal Lopez and her work on exercises and style. It's fun. It's essentially expressing the same data munching operation, but in many, many different ways in Python. And she's fighting [inaudible 01:02:32], something like that, from the French where he rewrites the same scene a bunch of times in French. That's a fun to look at.
01:02:44
SK:
That's very funny.
01:02:46
SK:
Yeah. The orthogonality and the basis vector thing is exactly... I hadn't thought of that, but now that you say it, it's very similar to what I'm talking about. That's very cool that that's a thing in programming languages already, that people are talking about.
01:03:00
KY:
Yeah. I can't remember where I picked up the word. I can't remember reading about it in any paper or anything, but it's definitely a thing that people in the community discuss.
01:03:13
SK:
Cool. Yeah. I have been struggling to find the right keywords for this, so thank you. I'll do some Google exploring and get back to you.
01:03:23
KY:
Mm-hmm.
01:03:28
SK:
Cool.
01:03:29
SK:
The last thing that I think is kind of relevant to this audience is, Distill to me, seems like a really noble project. And hopefully, it'll become more of the sort of thing that people respect and - basically, it seems like, in science, we reward people who discover things, not people who explain things. But it's wonderful when people take the time to explain things, so.
01:03:57
KY:
Mm-hmm. Yeah, discovery is fundamentally tied to explanation. You can't separate the two.
01:04:02
SK:
Yeah. When you discover something or invent something, you do write a paper. But going the extra mile and spending so many hours, like hundreds of hours, to go the extra mile and really convey what you're trying to say in a paper to people, I find that very admirable.
01:04:29
KY:
Yeah. One pointer here is Chris's essay on research. That is a great primer on his thinking behind Distill.
01:04:38
SK:
Yeah. Yeah. I should have started there. Yeah.
01:04:39
SK:
So, just to back up, I feel like the story is... Bret Victor wrote his essay on Explorable Explanations, and then Chris Olah took up the mantle when he wrote about this research debt problem. And then Distill is kind of like the baby of Chris, trying to explain machine learning AI stuff to decrease the research debt so that we can do better science faster here. Yeah. So, anyways, that's cool.
01:05:15
SK:
Do you feel like, in your own projects that don't have to do with AI, you will continue to do Explorables about the things you work on? Does that make sense? Is that a relevant question?
01:05:30
KY:
I'm not sure if it's about the explorables. It's about representation, the edges of representation and designing different kinds of mediums for concepts. If an explorable is the way to go, then I would do an explorable. If not, then I wouldn't. I don't think it's part of my platform, per se.
01:06:17
SK:
Interesting.
01:06:19
SK:
I feel like the way I think of explorables is that, like an essay, just like a regular essay or a regular paper, is an explorable that's just like bad or just very flat. And then, anything you add to it is just making it more and more explorable. But I guess, yeah. I guess that's not quite really how the words are used in practice.
01:06:46
SK:
In practice, you can spot and explorable. It's in text, and it's some widgets... it's not quite as much of a spectrum as I thought it was.
01:06:59
KY:
Right. Well, pragmatically the thing about explorables is that readers don't want to explore. And I think there is some stat from the New York Times or something that's like for articles that had interactive components, the engagement for that component tended to be very low compared to when they just surface the insight and move on.
01:07:25
SK:
Yeah. Yeah. It's funny because I, in my own behavior, I've noticed that that's the case. I like the idea of them, but it doesn't actually make me an active reader.
01:07:36
KY:
Mm-hmm.
01:07:41
KY:
Yeah, I would say I'm more interested in a related idea: the idea of software as rhetoric and how an idea can become much more powerful when it's backed up by a prototype, a provocation, an implementation, a demo, and it doesn't have to be an explorable.
01:08:14
KY:
Yeah. So, "software as rhetoric" is something I'm quoting from the new inquiry. It's not a new idea. They released this software called Bail Bloc, which, if you install it, it sits on your machine and minds cryptocurrency to be used to bail out prisoners who are unfairly imprisoned.
01:08:47
SK:
Wow, that's cool.
01:08:52
KY:
I think that's very powerful.
01:08:55
SK:
There have been a lot of projects over the last couple of decades about using your extra compute time to like solve some scientific challenge, like protein folding or whatever. So, I guess this is like a modern incarnation of that with the Blockchain. And then, and it's like a very different social purpose. Yeah.
01:09:18
KY:
Yeah. But it, I believe, was also accompanied by writing. And the combination of the rhetoric and then immediately, the real world impact of that rhetoric, as embodied in code, it was very powerful.
01:09:38
SK:
Yeah. That is cool. I'm trying to think of another example of rhetoric and code that went together like that.
01:09:52
KY:
I would say Building Blocks is even more so an example of this, because they released this big code notebook where you can immediately open it and re-pro the results and change it. That's also code as rhetoric, rhetoric as code.
01:10:11
SK:
Yeah. Interesting.
01:10:12
SK:
As opposed to when you write a paper but you don't release the code at the same time?
01:10:19
KY:
Yeah, I think artifacts can be certainly part of it. Yeah.
01:10:25
SK:
Cool. It's almost like the power of demos are like pretty powerful. Bret Victor's work I guess would fall under this category.
01:10:42
KY:
Right. Yeah. Michael Nielson and I had this very interesting conversation about demos versus prototypes. Maybe it was demos versus products, let me try and find that thread. So, it was something like - also, it wasn't just the two of us, there are other people involved in this conversation - but it was something like trying to tease out wanting to do something like a Sketchpad, so an influential demo, versus wanting to do something like a Photoshop, which is a widely used tool. And my reaction was, it's not a dichotomy. One is often necessary to feed the other.
01:11:39
KY:
This is someone else's example from the thread. There are just so many quotes in here that I can't figure out who was saying what. Or like Windows versus Mother of All Demos, or the example that I gave was Pixar versus Computer Animated Hand. So, like Computer Animated Hand was an influential demo from early computer graphics, and Pixar is a billion dollar company built on widely used tools. And those were actually done by the same person, who is Ed Catmull. I mean, obviously, Pixar is not one person's work, but he's one of the founders.
01:12:13
SK:
Yeah, yeah, yeah. He has a great book, by the way. I feel like you've probably read it too.
01:12:20
KY:
I haven't actually, but I think it might be on my bookshelf.
01:12:23
SK:
Yeah. Yeah, I'd recommend it.
01:12:27
SK:
So anyways, I know you have to run soon. So, I mostly just wanted to ask about your Speculative Nonfiction Newsletter, just to describe what it is for people. I like to end with a place for you to talk about where you could find your work online, email, Twitter, or whatever it is that you want to tell people about what your interface is.
01:12:52
KY:
Okay.
01:12:54
KY:
Yeah, so Speculative Nonfiction is a letter for sure. And it's, I mean, it's only been around for less than a year. So, it's actually the latest one was just, "this is a bunch of stuff I did, because I didn't have time to write anything new for it." So, it's a combination of, "this is a bunch of stuff I did. Please enjoy it." And an attempt to show you the unexpected futures next door. So, to understand the relationship between [inaudible 01:13:34] production, cultural production, and design. I don't know. I don't know how to describe it. It's a letter. That's about it.
01:13:46
SK:
Cool. Yeah, I agree. I also find it hard to describe, but I also find it very enjoyable.
01:13:53
KY:
Thanks.
01:13:54
SK:
Great. Well, thanks again for taking the time. This was really fun talking to you.
01:14:00
KY:
Yeah. This is a really great conversation, and stay in touch.