Listen in your podcast player by searching for Future of Coding, or via Apple Podcasts | Overcast | Google Podcasts | RSS
Ravi Chugh is a (recently-tenured 🎉) prof at the University of Chicago. He’s famous for leading the Sketch-n-Sketch project, an output-directed, bidirectional programming tool that lets you seamlessly jump back and forth between coding and directly manipulating your program’s output. The tool gives you two different projected editing interfaces for the same underlying program, so that you can leverage the different strengths of each. In the interview we talk about the principles of bidirectional editing, the team and history behind the Sketch-n-Sketch project, benchmarks and values that can be used to assess these sorts of novel programming interfaces, possible future directions for Sketch-n-Sketch and the field more broadly, and a bunch more. It’s a long one — almost two and a half hours — but it’s packed with thought and charm.
Now, this episode is a bit of an odd one, though if you’re just listening to the audio you might not notice anything unusual. In fact, it was recorded over a year ago, before the pandemic hit Canada (where I live). I’d planned to release it in the spring, but hit some snags, and then got burnt out, and then the next thing you know it’s March of 2021! An entire year has just… disappeared. Weird, hey? So the episode is coming out, finally, though in a slightly different form than previous episodes. In the past, I’d spend something like 20 hours meticulously editing the transcript to fix errors and make up for the loss of inflection and subtlety of speech, building up a handy list of links to all the things mentioned, and otherwise making this episode page stand alone as a resource independent of the audio. But due to the aforementioned burnout, I won’t be doing much of that anymore. The thing I’m passionate about is the audio, and the thing that’d keep me from releasing episodes at all is the effort it takes to make these episode pages. So if you find yourself reading this, and if you feel up for helping, I warmly invite you to make edits here.
On that note, I need to extend a heartfelt thanks to Kartik Agaram for putting in a ton of work to clean up the transcript over the past few days, and to Ravi for a huge drop of edits and fixes.
Ravi Chugh is here to bake cookies and talk about Sketch-n-Sketch, and I ate all the cookie dough. You can find Ravi on Twitter.
Here is Ravi’s brief research statement that was drafted for a tenure review, and a slightly longer slide deck that covers some of the ideas behind Sketch-n-Sketch (and includes a lot of swanky graphics).
There are lots of other folks in Ravi’s group working on Sketch-n-Sketch and related projects who will be familiar to active members of our community:
Here’s a fantastic presentation of Sketch-n-Sketch by Brian Hempel at UIST 2019.
And here’s the matching UIST paper.
Also fantastic is the 2016 Strange Loop talk by Ravi:
For valuable benchmarks and thoughts about programming by demonstration, check out the book Watch What I Do by Allen Cypher. Benchmarks were also discussed at LIVE 2018, after Brian presented this video.
Transcript sponsored by Repl.it
Corrections to this transcript are much appreciated!
Welcome back to The Future of Coding. I’m Ivan Reese, and my guest today is Ravi Chugh. Ravi is an associate professor at the University of Chicago. He leads the group working on Sketch-n-Sketch, a programming environment that fuses direct manipulation with text-based code in a paradigm called bidirectional editing. You have a text editor on the left, and a graphical canvas on the right. When you draw on the canvas, it generates code. When you edit the code, it updates the drawing. When you manipulate the drawing, it updates the code. You have two representations of the same information, both equal in importance, but different in how you can work with them. Sketch-n-Sketch is starting out life as an environment for working with SVG graphics and HTML webpages, but there’s a lot of promise to this idea expanding into other domains in the future. I’ll let Ravi take it from here.
What Sketch-n-Sketch looks like, what it currently looks like at least, is a pretty traditional two-pane interactive development environment. On the left half of the window, on the left pane, is just an ordinary code editor. A text editor. Really not much different from what you would normally expect. The right side, the right pane, is a completely ordinary canvas on which the output of the program is rendered.
Currently, the Sketch-n-Sketch interactive programming system is tailored to the domain of generating SVG graphics, or also HTML pages. But in the program and the text editor on the left, at the end of the day, the main expression has type SVG or type HTML. And after the user runs that program on the left, the output value — the main SVG or main HTML value — is rendered on the right pane on the canvas graphically, as it would in any other direct manipulation or drawing system.
In that sense, it’s a completely traditional, ordinary programming environment. You write a program, you run it, and it shows the output. The main new features that we’re exploring revolve around making the output of the program editable — manipulable — as opposed to just being the final value that the program spits out, and then it’s inert or disconnected from the program that generated it.
The main goal in Sketch-n-Sketch is to allow the user to actually interact. To change, to drag around things on the output to make changes, and for the system to infer changes to the program to match what interactions the user has performed.
One way to think about it, one way that I like describing it is, in addition to the normal, forward evaluation process in any normal programming language, Sketch-n-Sketch is trying to provide this backwards connection. Mapping changes back from the output, back to the program that generated it.
Just to talk about other interaction modes a little bit, to overview the system: in addition to being a normal programming language in the sense that you start with a program and run it, and then after you’ve viewed the output you can start interacting with it, we try to also provide this backward connection in a new and intuitive way — the system provides the ability to actually add new things, new output values to the output of the program, that were not even in the output before.
Let’s say you had a blank program to start, and then initially there’s actually nothing on your canvas. The Sketch-n-Sketch editor also provides the kind of drawing toolbox that you might expect in an SVG drawing tool, where you can add new shapes to the canvas, and the system will auto-generate or insert definitions into your code. Which, again, if you would run them in the forward direction, would produce hopefully the values or the shapes that you’ve just added in the output.
The goal is to allow both directions, of both programmatically generating but also using direct manipulation drawing tools, to move back and forth between these two directions of authoring.
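To make the forward and backward directions described above concrete, here is a toy sketch in Python. It is not Sketch-n-Sketch's actual implementation (the tool uses an Elm-like language, and its algorithms are not spelled out in this conversation). The names `evaluate`, `backward`, `x0`, and `sep` are hypothetical, and the sketch assumes each output number is a simple weighted sum of program constants.

```python
def evaluate(constants):
    """Forward direction: run the toy program, returning (value, trace) per box.

    Each trace maps a constant name to its coefficient, so that
    value == sum(coeff * constants[name]).
    """
    outputs = []
    for i in range(3):
        # Box i sits at x0 + i * sep, so its trace records both constants.
        trace = {"x0": 1.0, "sep": float(i)}
        value = sum(coeff * constants[name] for name, coeff in trace.items())
        outputs.append((value, trace))
    return outputs


def backward(constants, trace, new_value):
    """Backward direction: candidate program edits that would yield new_value."""
    old_value = sum(coeff * constants[name] for name, coeff in trace.items())
    candidates = []
    for name, coeff in trace.items():
        if coeff == 0:
            continue  # changing this constant cannot move this output
        updated = dict(constants)
        updated[name] = constants[name] + (new_value - old_value) / coeff
        candidates.append(updated)
    return candidates


program = {"x0": 50.0, "sep": 100.0}
outputs = evaluate(program)                # x positions 50, 150, 250
_, third_trace = outputs[2]
options = backward(program, third_trace, 300.0)  # "drag" the third box to x = 300
```

Dragging the third box yields two candidate edits here (shift everything by changing `x0`, or spread the boxes by changing `sep`), which is one small illustration of why a system like this may need to ask the user to choose.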
In the Strange Loop talk that you gave back in 2016, you did a really good job of justifying why this tool needs to exist, and why this bidirectional editing needs to happen. That is: in a normal drawing tool, when you make your drawing, if your drawing has something systematic about it — you give an example of a Ferris wheel with a number of cars around a circle — if you want to change the number of cars, you have to manually go in and move everything around. Whereas if you’re doing procedural graphics, it’s very easy to say, “Oh, I have a number that represents how many cars are around the circle.” But it’s hard to do quick tweaking and adjusting, getting things to look the way you want, if you’re having to use code. You have to make a change and then rerun it, and a change and rerun it. Or you’re iterating your way towards what you want, rather than just doing it.
Sketch-n-Sketch lets you have both. It lets you do the direct drawing to get the image that you want. But, the act of drawing that you go through creates code that is generating the drawing, so you get the benefit of it being a procedural drawing without having to sacrifice the graphical way of working.
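The Ferris wheel example can be sketched procedurally in a few lines. This is an illustrative Python version, not code from the talk; the function name `ferris_wheel` and its parameters are assumptions made for this sketch.

```python
import math


def ferris_wheel(cx, cy, radius, num_cars):
    """Return (x, y) centres for cars evenly spaced around the wheel rim."""
    return [
        (
            cx + radius * math.cos(2 * math.pi * i / num_cars),
            cy + radius * math.sin(2 * math.pi * i / num_cars),
        )
        for i in range(num_cars)
    ]


# Changing the number of cars is a one-number edit; every position follows.
cars = ferris_wheel(200, 200, 100, 5)
```

The trade-off mentioned above shows up immediately: changing `num_cars` is trivial, but nudging the wheel until it looks right means editing numbers and rerunning, which is the part direct manipulation handles better.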
That’s right, and that is the goal: to mix the best of both interaction modes. Direct manipulation (direct interaction, specifying changes directly in the context of what you’re building) is obviously what you want in many situations. But then, there’re other times when some really fine-grained, repetitive parameters of your design are much easier to specify if you have a little bit of support for programming and procedural abstraction.
I think this is a very natural tension that comes up, not just in drawing editors or creative media applications. I think this tension really comes up pretty much in any kind of application software that we use. If we’re doing traditional office kinds of tasks, where we’re writing mostly text in a text editor or a word processor, if we’re doing data manipulation in a spreadsheet with some visualization at the end, if we’re developing a website, if we’re doing a 3-D animation, in all of these domains, there are obviously really terrific, really rich, really sophisticated, GUI editors that allow you to do tons and tons of useful operations in those domains of work. But oftentimes, they don’t provide an escape hatch to a more procedural or programmatic view for doing things that are just not necessarily the best to do inside the GUI itself.
I think this tension between trying to mix the best of both, direct manipulation, GUI applications with general purpose programming, I think is a very natural combination that ought to be useful in a wide variety of domains, if we can actually get it to work as smoothly as we want.
Yeah. That’s definitely something I’m going to come back to a little bit later on in the conversation and look at how this generalizes to other domains. But, to start with, Sketch-n-Sketch is focused on graphics. In this current prototype that you have been iterating on over the last… What’s it been? Six years now, that you’ve been working on this?
It’s been almost five years. It was early 2015 when we really started to think about what it would take to achieve this kind of combination in the setting of graphics. I think we started writing code in April of 2015. So, coming up on five years.
In the current prototype that you’ve built up over that time, are you thinking of it more as a graphics tool or more as a programming tool? Or, if you put those on either end of the spectrum, where’s the spot along that spectrum that you think of Sketch-n-Sketch as fitting in?
It’s a good question. I don’t think we’ve really planted our flag or tried to plant our flag on either end of the spectrum really. I don’t think we’re trying to, at least in the short term, make this the end-all of drawing tools. And certainly, this is not the end-all of programming systems because, again, it’s really designed for very small-scale programming at this point, with a very narrow application domain.
Really, we’ve been using this initial prototype and this initial application domain as a laboratory, a playground to try to study the simplest and the first questions, the first challenges in this goal of connecting programs with the output that they generate. And allowing this bidirectional, back-and-forth connection. Because, as we were talking about, this kind of connection seems like it ought to be useful in a wide variety of domains. What that means is, for these kinds of techniques to be really general-purpose and reusable, there ought to be a core foundation upon which this basic connection between programs and their output can be studied. The approach that we take is to pick just an ordinary traditional programming language; it happens to be a functional language, and it looks like Elm right now. But, in terms of the core ideas, we’re basically studying these ideas in terms of the lambda calculus, the very core foundations of pretty much any programming language. We’re really studying what it means to change the output value of a program in the lambda calculus or in Elm, and how we translate changes of the output back to changes in the program.
And so it’s general in the sense that SVG and graphics didn’t factor into that description at all. But then, when we take the algorithms that we’re designing and implement them in a particular system, then we need to also think about, “Okay, what is the user interface for actually making those changes to the output?” In situations where the algorithms provide multiple options or need more help and ask the user to choose, that’s where we have to really think, “Okay, for this specific application domain, vector graphics, what is the user interface that we draw and how do we surface the results of the algorithms?”
In that sense, it’s saying, “How do we get the UI to look more like a traditional drawing tool?” I would say that currently, the prototype is this mix of very bare bones drawing tool and very bare bones programming language, so that we can explore these initial questions as soon as possible.
But then, in the longer term, I would think that, if the techniques continue to improve and scale to more and more sophisticated classes of programming tasks and design tasks, I don’t see why we wouldn’t want to also try to make this a really full-featured drawing tool that has programming support. Also, on the opposite end of the spectrum, why we wouldn’t try to take these ideas and techniques, and embed them into traditional general-purpose programming IDEs.
So that if you were working on a different domain other than graphics, you could still have the benefit of bidirectional editing and output-directed programming. But then also, on the other side, if you’re doing graphics, you get the benefit of having code representation of your drawing extracted out so that you can drop down and work on it on that level, if that affords you some extra ability.
Exactly. That’s right.
Cool. So, you are thinking about growing in both directions, not just using graphics as a test case, and then going back to the programming side and just focusing on that? You’re also thinking about going at it the other way as well.
Yeah, absolutely. That’s one of the really fun parts of this project, I think. It’s got this really wide range of techniques and challenges that all need to be solved, in order to really provide this long-term goal of mixing, in a fine-grained way, the best of what GUIs and direct manipulation offer and the best of what programming offers.
If one wants to try to combine those in a fine-grained way, that’s going to require both advances in programming languages and program synthesis. It’s going to require advances in UI design, and taking into account this conversation between the programmer and what the system can do. If you look at the directions that we’ve been exploring the past few years, it’s a variety of what you could think of as core PL program synthesis kinds of research questions, as well as core user interface questions.
We’re certainly excited to keep pushing on directions in all of these fronts because I think it’s pretty clear that one needs to think about all these in tandem, and not just from the programmer’s point of view or from the designer, the end user’s point of view.
Before you started working on Sketch-n-Sketch, what did you see in the world that led you down this path? Did you play with an existing tool and think, “I know how to make this better.” Were you chasing a feeling? How did you arrive at the desire to work on this?
It’s a fun question for me to think about because I do think about the original motivations for this project quite often. I think the moments that I look back to are pretty specifically centered around my time in graduate school, where I was doing research on program analysis and type systems. Something really not super related to what I’m working on now.
But, in the process of grad school, I would give conference talks and I would give lectures and things like that. So, I would always use PowerPoint and happily use PowerPoint to create visual, interactive presentations about whatever topics I was talking about. I’m certainly not qualified in graphic design or visual arts, but I always enjoyed the challenge of coming up with really interactive visual ways to explain the ideas about whatever research topic I was talking about.
So, I just really enjoyed how I can do that in PowerPoint because I can really easily try out a bunch of different visual representations. The built in animations often provided a good way to stage transitions and sequence the story of my talk. I always really enjoyed PowerPoint. But, every time I would use it, I would also think, “Okay, there are a whole bunch of operations that are really tedious to perform. And, once I’ve figured out what the basic design of my talk and my visual motif is looking like, it would be much easier if I could make certain changes programmatically instead.”
Instead of making a change early on in my, let’s say, sequence of slides, if I make some fundamental change to the visual motif, instead of having to go back through all the 10, 20, 30, 40 places I copied and pasted and made changes, if I could instead go into a programmatic representation of that sequence and make changes in one or two places, it would be so much easier to build these really complex visual narratives and presentations.
That desire really came up over and over again in grad school. I remember talking with my friends at the time about, “Wouldn’t it be great if PowerPoint was completely programmable?” If under the hood, there was a general purpose programming language that I could use to help interactively build these presentations.
I think, that was the first time that I really thought about this, wanting this combination of, as a programmer, knowing that general purpose programming languages provide all these abstraction capabilities that make certain things easy. But then also, using terrific direct manipulation tools and realizing how they made a different set of interactions very easy.
That’s what I look back to, as one of the original motivations for wanting this combination. But, I think the same combination comes up in pretty much any other GUI application software that I use, and I think probably many other people who are programmers feel the same way. Wouldn’t it be great if you could also get some of the abstraction capabilities that programming provides?
It’s interesting, how you… I imagine you conceptualize yourself as a programmer and not an artist, based on how you qualified your art chops there.
I should probably have qualified my programming chops too. I think I’m equally unqualified.
Yeah. But it’s interesting to me that you came at this from the angle of wanting to add programming into an experience that wasn’t rooted in programming. I’ve had the exact same experience using programming tools and very much wanting to add art capabilities into them. It’s neat to me to think that there are many different roads that lead to this intersection.
Especially early on, when I was thinking about, “Okay, what are the techniques under the hood that would need to be developed in order to support this combination?” I think there’s ways to think about this as like a programming problem and then adding support for direct manipulation and GUIs to it. And then, the other approach is to start with a GUI and try to make it more programmable.
It’s funny how early on, especially, and actually even today, if I’m describing this project and the ideas that we’re working on to different people, people often either pick the programming side of the world and start there and describe how it could be connected to the interactive design and GUI perspective, or whether to start from the opposite side. But, as we talked about earlier, I think if these ideas and techniques turn out to be successful and scalable, the goal is to really bridge this gap so that there really is no divide about whether you’re starting with programming and making it more interactive, or you’re starting with a GUI and making it more programmable. The goal is really for the foundation to be a general-purpose expressive programming system, where you can define these composable GUIs for different tasks. And, you can really mix and match GUIs for common patterns of use, but then also compose them with others and design your own and really eliminate this spectrum of “this system is for programmers” or “this system is for end users”.
On that train of thought, in the Strange Loop talk you said that Sketch-n-Sketch is fundamentally a coding system with a graphics editor added on top, and that there are shortcomings of systems that start with direct manipulation and then add code generation. I’ll quote you here. You said, “No matter how many built-in tools come with a direct manipulation system, there are always going to be limits to what the tool can do well automatically.” Can you expand on that thought?
I think the idea is that what a user interface provides is a bunch of tools, a bunch of building blocks and features for combining them. Certain kinds of features for organizing them, maybe into layers, and applying certain operations. But let’s say you, the author of some artifact, the user of one of these systems, want to add some new capability to the system. Right now, there really isn’t an easy way to do that.
Certainly, many GUI application tools have plugin architectures and things like that, where an expert programmer could go build some plugin that extends the UI in some specific way. But that ability, to extend the UI with new primitives, ought to be much more a part of the main process of using the tool, as opposed to something that you, the author, would have to employ some expert developer to then build for you.
Some of the examples that I like to use, in the Strange Loop talk, in the recent papers that we had at UIST are, think about even the basic drawing tools in your drawing editor. Drawing a shape, drawing a circle, drawing a polygon. Even those, you might want to define variations of those primitives and have them be the tools that are presented in the toolbox. What makes those definitions of shape and polygon the only way that you can imagine a rectangle tool working?
There’s other variations where certain properties of the rectangle would be different from the defaults that are provided for you. If these primitive tools are instead defined as, let’s say, functions in some programming language, then these tools can actually be libraries that are provided to users, but could be swapped in for other libraries for use cases that might not be the use case for every single user of the tool.
Let’s say I’m in some project where the primitives that I want to use are actually these seven-sided stars and snowmen or something. If the GUI system allows these primitive tools to be defined by user- or library-defined functions, if I’m a programmer or if I’m working with my team of designers, I can define these library functions for seven-sided stars and snowman shapes and add those into the toolbox so that we can use those as direct manipulation drawing tools, as if they were the built-ins that had been provided for us.
Clearly, a user interface can’t provide every single tool that every user might possibly want. At some point, a user interface, an application software will have more and more tools, more and more menu items, more and more options. That’s, at the same time, not going to cover every single thing that users might want to do. And, it’s not the most scalable way for a user to use a system, because sometimes, you might really only want five of the 15,000 features that are there. It seems like you want to be able to make user interfaces much more customizable and much more extensible so that different users and use cases can customize the UI to do different things.
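The idea of toolbox entries being ordinary functions that a user or library can swap in might look roughly like this in Python. This is only a sketch under assumptions: the `toolbox` dictionary, the tool signature `(x, y, w, h)`, and the `snowman_tool` function are hypothetical and are not how Sketch-n-Sketch defines its tools.

```python
def rect_tool(x, y, w, h):
    """A built-in style tool: one rectangle filling the dragged-out box."""
    return [f'<rect x="{x}" y="{y}" width="{w}" height="{h}"/>']


def snowman_tool(x, y, w, h):
    """Hypothetical user-defined tool: three stacked circles fitted into the box."""
    r = min(w / 2, h / 6)  # three circles of diameter 2r stacked vertically
    cx = x + w / 2
    return [
        f'<circle cx="{cx}" cy="{y + r * (2 * i + 1)}" r="{r}"/>'
        for i in range(3)
    ]


# The toolbox is just a mapping from tool name to function, so adding a
# custom tool is the same operation as providing a built-in one.
toolbox = {"rect": rect_tool}
toolbox["snowman"] = snowman_tool

# Simulate the user dragging out a 60 x 120 box with the snowman tool selected.
shapes = toolbox["snowman"](0, 0, 60, 120)
```

Because the toolbox holds plain functions, a team could ship its own set of tools as a library, and a user who only needs five tools could load only those five.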
In the Strange Loop talk, one of the demos you give is that you build the lambda logo for Sketch-n-Sketch, using a handful of geometric attributes that were exposed by the graphics editor, like the positions of points, the widths of rectangles, that sort of thing. More recently, in the UIST talk, it looks like you’ve gone much further down the path of exposing geometric attributes. Like the midpoints of lines that you can use to snap another line onto. These additional attributes look like they make it easier to build complex relationships between the shapes. But, at the end of the talk, Brian references this book about programming by demonstration. In that book, there are a number of benchmarks that you can use to evaluate a programming by demonstration system. When he mentioned those benchmarks, he said there’s four of them that Sketch-n-Sketch can currently do perfectly, there’s two that it can kind of do, and then there’s nine more that it can’t really do right now. In order to do some of those nine, a number of features need to be added. One of those features being attaching the end point of one line to an arbitrary position along another line. That sort of feature addition feels like a game of cat and mouse to me. It feels different from what you’ve just talked about, about adding different kinds of preset shapes and that sort of thing. This feels more like needing to change the underlying representation of the graphics or changing the way that the graphics map to the abstractions. Is it something that you feel like you’re working towards making open-ended as part of the system? Or is that something that’s going to be baked in and the end user is not going to be able to add those sorts of additional capabilities to your vector representation?
That’s a really good question. The short answer is yes, I think we can and will want to make those choices and those kinds of features also exposed to users or library or tool builders to customize. If you think about even the simplest widgets that you might draw onto a rectangle, let’s say. So, the completely standard feature that you might draw and allow the user to manipulate the corners of the shape, and maybe the midpoints and the center.
It is the case that currently our editor, Sketch-n-Sketch, draws predefined sets of features or predefined widgets for the different kinds of shapes. But you could certainly imagine even those widgets, even those choices to be defined in a library instead. You could imagine there being a library function that describes what to draw on top of the primitive unadorned, undecorated rectangle in the SVG output. And that library function could choose to draw SVG circles or SVG rectangles or whatever it is, that happen to be exactly at the corners and midpoints and centers of those shapes.
You could imagine then, let’s say, choosing a wrapper around rectangles that doesn’t draw any widgets at all, because let’s say you know that certain rectangles in your output are never going to be interacted with. So, why ever even have the user interface clutter that view with extra widgets? And then, you can imagine in some other part of the design, you know that you’re going to want to be interacting with maybe not even the midpoints, but maybe an arbitrary point on the edge.
You can imagine overlaying the right widgets on the edges of those polygons or on the edges of those shapes, and then hooking those up to what the algorithms under the hood, the ones that connect the output value to the program, know about. So, I think there’s certainly details about connecting what the user-defined functions have chosen to draw on top of the real, the main values and how to map those interactions to what the underlying program synthesis and program repair algorithms can do.
I think there is a little bit of extra metadata and other kinds of things that you’d have to define there, but I certainly think that this approach would allow those user interface widgets and actions to be customizable and changed by users and libraries.
So, to build something like that, it feels like you’d need to get at the core ideas of whatever your output is, the fundamental first principles. In the case of vector graphics, it might be you have to distill everything down to a point, and then the idea that points are connected in lines, and then bootstrap your way up to all the vector graphics from some fundamental seed. Sort of like that. Does that seem like a fair characterization of what would be needed in order to make that idea work? If so, how would you apply that to other domains?
That’s a great point. I do agree that’s a fair way to describe it. I think I mentioned this in the Strange Loop talk as well. I think the way that we see this is, I said that we want this bidirectional connection between programs and their outputs to work in many domains. To do that in a scalable way, it seems that there are going to be certain operations, certain connections, certain changes that ought to be common across whatever application domain you happen to be working on.
But then also for any application domain, there’s going to be custom program analysis, program repair techniques that cater to the kinds of programs that are written in that domain, and also custom user interfaces to expose those capabilities. I think it really is this combination of some set of general purpose tools that are going to be useful, no matter what you’re programming or building. And then, certain tools that are useful, that are specifically designed for a certain domain.
Like you said, for SVG, if one wanted to expose a completely configurable, reconfigurable UI for doing vector graphics, you might want some really general representation, like points. On top of which you could then build basic structured shapes. But then, you would have really the finest grain access to be able to specify constraints over individual points in your output.
I think it is fair to say that for each, let’s say, different type of value in your application domain, or for each application domain, you identify the primitive values in that domain. That defines what the users and libraries can operate over. Currently for SVG, we choose just the normal SVG primitives as our…
As our spec.
Yeah, exactly. The spec is our domain of values. If one wanted to expose really complex constraints over individual sub features of these shapes, you might have something like you described.
That disentangling of the things that are common across domains from the things that are unique to each domain, and coming up with the representation of the things that are unique to a domain in order to expand Sketch-n-Sketch, or a successor or similar tool to work in that domain… That sounds like a very, very hard problem on the level of RDF, or the semantic web, or something like category theory. Or any of these notational systems that are designed to separate out the structure of what constitutes a domain, as opposed to the instances of things that have those structures. Like that sounds like a really big problem. Have you made a beachhead on that part of the problem yet, when building the core of Sketch-n-Sketch?
Yes. Certainly in our current work, there are certain things that seem clearly independent of domain. For example, our programming language really knows nothing about any specific application domain. It’s got a built-in set of types as usual, and users can define new types. And then, you can program with whatever types of values you’re working with. At the end of the day, the connection between the program and the specific domain is the main definition, the main expression that your program computes. That’s the time at which the editor starts to need to translate that main expression, that main value, into the specific domain, in this case, SVG.
That’s the point at which the output value of the program starts to be interpreted in terms of domain-specific primitives. For the evaluation of the program, there’s really not much domain-specific going on. But in fact, the execution trace of the program is being recorded so that within the final output value, different pieces of the output value are tied to different expressions and different intermediate computations. When the user interface displays the output values and intermediate computations as SVG widgets, then each individual tool that operates over SVG gets to look at the evaluation of the program and decide how it’s going to make changes to the program or not.
I guess the way that I’d describe it is, a lot of the programming and the tracing during the evaluation is domain independent. But then, when you want to convert the output into something that is drawn on the screen, and then when interactions with the output are transformed by domain specific transformations, that’s when you take the general purpose information about the program and then decide how to use it.
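To make the tracing idea above concrete, here is a minimal sketch in Python. It is not Sketch-n-Sketch's implementation: the tiny expression language, the `loc` labels, and the `Traced` type are all assumptions made up for illustration. The point is only that a domain-independent evaluator can return, alongside each value, the set of source literals that contributed to it.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Num:
    value: float
    loc: str  # hypothetical label for where this literal sits in the source


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add, Mul]


@dataclass(frozen=True)
class Traced:
    value: float
    sources: FrozenSet[str]  # literals this value was computed from


def evaluate(expr: Expr) -> Traced:
    """Ordinary forward evaluation that also records dependencies."""
    if isinstance(expr, Num):
        return Traced(expr.value, frozenset({expr.loc}))
    if isinstance(expr, (Add, Mul)):
        left = evaluate(expr.left)
        right = evaluate(expr.right)
        if isinstance(expr, Add):
            value = left.value + right.value
        else:
            value = left.value * right.value
        return Traced(value, left.sources | right.sources)
    raise TypeError(f"unknown expression: {expr!r}")


# x = x0 + i * sep, e.g. the x position of the i-th repeated shape
x = Add(Num(50, "x0"), Mul(Num(2, "i"), Num(30, "sep")))
result = evaluate(x)
```

Nothing here knows about SVG. A domain-specific tool would only come in afterwards, looking at `result.sources` to decide which literals it might change when the user drags the corresponding attribute.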
Have you, thus far in the project, done any work on how those transformations from the output domain, back to the code, and from the code to the output domain, how those are specified? Is that something that’s data-driven right now, or is that a part of the project that you’re leaving for a later stage?
So currently the transformations that Sketch-n-Sketch can make, based on interactions with the output, are each coded as arbitrary AST (abstract syntax tree) transformations. A transformation in Sketch-n-Sketch gets to look at the original program, gets to look at the final output of the program, gets to look at the evaluation trace that produced the output, and it gets to look at some set of widgets or interactions that the user has made on the output. But given that knowledge, the transformation gets to do whatever arbitrary AST change it wants to make. There currently isn’t a higher level way of describing desired changes of the program based on desired output interactions. And so currently each one of these transformations is its own standalone transformation. And so sometimes they don’t compose as well or as naturally as you might want them to. And so in the future, we certainly want to define some higher level specification language for defining new transformations.
Even within the domain of SVG, let’s say I am a tool builder and I’ve got some new transformation I want to build in. I’d like to be able to maybe provide examples of when the user performs this action and the code matches the structure, transform the program in this way, so that not every detail about the program execution trace and fiddly details about the lexical structure of the program and the program dependence graph, so that not all of the details have to be manually accounted for in the implementation of the tool. We certainly haven’t done any work yet on making these higher level DSLs for defining new transformations.
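The shape of such a standalone transformation can be sketched as a plain function over four inputs: the program, its output, the trace, and the user's interaction. The Python below is an assumption-laden illustration, not the tool's API. In particular, a dictionary of named numeric literals stands in for a real AST, and `Drag`, `Trace`, and `drag_single_literal` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Optional

Program = Dict[str, float]           # stand-in for an AST: named numeric literals
Output = Dict[str, float]            # output attribute -> computed value
Trace = Dict[str, FrozenSet[str]]    # output attribute -> literals it depends on


@dataclass(frozen=True)
class Drag:
    attribute: str  # which output attribute the user dragged, e.g. "circle.cx"
    delta: float    # how far they dragged it


# A transformation may inspect everything and return any new program,
# or None when it does not apply.
Transformation = Callable[[Program, Output, Trace, Drag], Optional[Program]]


def drag_single_literal(
    program: Program, output: Output, trace: Trace, drag: Drag
) -> Optional[Program]:
    """Handle only the simplest case: the attribute is a copy of one literal."""
    sources = trace.get(drag.attribute, frozenset())
    if len(sources) != 1:
        return None
    (literal,) = sources
    if output[drag.attribute] != program[literal]:
        return None  # the value was computed from the literal, not copied
    updated = dict(program)
    updated[literal] = program[literal] + drag.delta
    return updated


program = {"cx": 100.0, "r": 20.0}
output = {"circle.cx": 100.0, "circle.r": 20.0}
trace = {"circle.cx": frozenset({"cx"}), "circle.r": frozenset({"r"})}
new_program = drag_single_literal(program, output, trace, Drag("circle.cx", 15.0))
```

Because each transformation is just an arbitrary function like this, nothing forces two of them to agree on how they read the trace or rewrite the program, which is one way to see why they may not compose naturally.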
Just for my own personal curiosity. I don’t know if I’ll actually include this in the show or not. When I’m imagining this architecture that you’ve built, I’m thinking of it almost like a bidirectional Multi Pass Compiler like LLVM or something like that, where you can insert your own compiler stages or your own optimization stages into this modular compiler architecture. Is that a little bit like how these transformations work, where the code representation is like your source code and you’re compiling it into this output result. And then that output result also preserves a structure so that you have another reverse compiler that takes that back to the code representation. And so the domain mappings that need to be created are very much like compiler stages where they’re doing like a tree to tree transformation or something like that.
Yeah. So I’m not sure whether there’s many more stages in which one might want to insert, let’s say a new representation or a custom transformation. I think it really is like these two phases for an evaluation and then in some sense, backward evaluation and at least in all the features and transformations interactions that we’ve wanted to provide so far, we can basically collect all the information that we want or need to know based on essentially an ordinary forward evaluation where we’re tracing dependencies between expressions and the values that they compute.
And so the tracing mechanisms that we’ve been using are pretty standard mechanisms that are used for things like omniscient debuggers, where you want explanations of how this value was computed, pretty common to how a provenance tracking analysis works, where you want to again know where did this value come from? And so essentially the evaluation of the program has all of the information that currently an end user or a final program transformation might want to refer to. In many cases, it only needs a small subset of this evaluation trace, but we so far haven’t thought of any other intermediate stages or any other intermediate representations that we might need to expose for customization.
Sure. I just meant more in terms of conceptualizing what it’s like to write one of those translation layers, because as an outsider to this project, knowing one of the ultimate goals is to take it to the point that it can be applied to arbitrary domains, not just vector graphics. I’m here thinking about, let’s say I took your core of your system and I wanted to extend it to a new domain. Like what would it actually feel like to write that domain transformation stage? Would it feel like writing a compiler pass or would it feel like something else?
Actually, that’s a good question. It makes me realize that… So earlier I described the forward evaluation of programs as a domain independent tracing mechanism where we record evaluation as usual and then domain specific transformations, get to look at that trace information when deciding how to transform programs. But actually there are a couple of situations in which our tracing mechanism is doing SVG domain specific tracing. And so, one of the ideas in the most recent UIST paper is to expose user interface widgets for manipulation on not just the final output and sub values of the final output, but also on some of the intermediate computations that didn’t necessarily draw something in the final output. And so an example of that is in the current version of Sketch-n-Sketch, the evaluation of the program looks for expressions of type point.
And so if an expression evaluates to a point, even if that point isn’t in the final output of the program, that point will be shown on the output pane as something that you can interact with and snap things to and use as guides as a helper widget for whatever happens to be in the final output. And so that’s an example of how our forward evaluator actually is looking for in this case, things of type point and recording some trace information about it in a domain specific way. And so to do this for other domains, one might expose maybe the evaluator of the programming language to the tool builder to provide places where they can hook into different parts of the evaluation behavior and instrument it with additional information, additional things to log that might be of interest to the downstream transformations.
And so I think, like you described, it could be what a compiler writer might provide when implementing a new transformation like you described. It could be an API where the system provides the tool builder, something that looks like the intermediate representation that is produced during evaluation and the opportunity to change it or instrument in some way.
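One way to picture the kind of hook being speculated about here is an evaluator that calls tool-builder callbacks whenever it binds a value. This is a sketch under stated assumptions: the `Evaluator` class, its `bind` method, and the `record_points` hook are hypothetical, and the "program" is ordinary Python standing in for an interpreted language.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Point:
    x: float
    y: float


Hook = Callable[[str, Any], None]


class Evaluator:
    """Toy evaluator that lets tool builders observe every binding."""

    def __init__(self) -> None:
        self.hooks: List[Hook] = []

    def bind(self, env: Dict[str, Any], name: str, value: Any) -> Any:
        for hook in self.hooks:
            hook(name, value)
        env[name] = value
        return value


helper_widgets: List[Tuple[str, Point]] = []


def record_points(name: str, value: Any) -> None:
    # Domain-specific instrumentation: remember every point, drawn or not.
    if isinstance(value, Point):
        helper_widgets.append((name, value))


def run_program(ev: Evaluator) -> list:
    env: Dict[str, Any] = {}
    center = ev.bind(env, "center", Point(100, 100))
    radius = ev.bind(env, "radius", 40)
    # "top" is an intermediate point that never appears in the final output.
    ev.bind(env, "top", Point(center.x, center.y - radius))
    return [("circle", center.x, center.y, radius)]


evaluator = Evaluator()
evaluator.hooks.append(record_points)
output = run_program(evaluator)
```

After the run, `helper_widgets` holds both `center` and `top`, so an editor could show `top` as a snapping guide even though only the circle is drawn.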
The talks that I’ve seen so far, the demos have focused on the direct manipulation, the GUI workflows, but not on the generic provenance tracing or the other parts of the engine, presumably my guess is that they don’t demo very well. To me, those internal bits seem like they’re the most interesting. And I feel like you feel similarly. So do you feel that way? Do you feel like those internal bits are the really interesting parts? Or what parts of the project make you think yeah, that’s the good stuff.
Right. So I think the program analysis, program synthesis, bidirectional programming techniques under the hood are certainly really important, obviously critical parts of enabling the user interactions. And when we have those techniques that are worked out clearly and cleanly enough for, like we talked about, a very core traditional programming language, where it would be easy to imagine how to incorporate that algorithm into any other programming setting. Those are the techniques that we’ll publish as a PL paper.
But yeah, those are the things that on their own can’t be easily demoed. Those are the things that are more easily described in terms of like the math that describes the algorithm and comparing what that algorithm does compared to others. And that’s certainly a very cool part of the research and very fun to work on. But then I also think that then figuring out how to deploy those algorithms into at least even a prototype user interface that starts to show what those algorithms enable for certain simple authoring interactions, those are also really, I think, fun milestones for us as well. And so like in the UIST paper, the idea there is to really see how much we can do with the techniques that we’ve built under the hood. I certainly enjoy it when we’re able to hopefully demonstrate some milestones on both of these fronts: what are the technical bits under the hood that enable us to then build some system that is starting to provide interesting user interactions on top. But you’re right, in the demos at least, it’s harder to explain the algorithms that are under the hood.
Yeah, or it’s less enjoyable to watch. Let me rephrase that. It’s more enjoyable to watch somebody demo a really interesting looking GUI that’s doing really interesting things. And you can see that there’s a lot of power behind the GUI. I think that’s more interesting to watch in a live presentation than a slide deck about the technology that’s sitting underneath it. Whereas like you said, a paper is probably the right place to get into the nitty gritty of the internals.
Right, or I would say a well crafted slide deck, I think, can also be an exciting and interesting way to convey even the technical material as well. And so, I think certainly in the research community, I think lots of people try to spend lots of time to give presentations that aren’t just summaries and snippets from figures in a paper. I think oftentimes you try to figure out, okay, given that you’ve got 10 minutes to try to convey the highest level insights about what the new math and the new algorithm is doing. It takes a different vocabulary, both in terms of words, but also in terms of visuals, right? Coming up with some visual notation, some visual motifs, patterns that you can use to explain some of the insights behind the technical ideas to folks that aren’t necessarily going to spend an hour or two or three or four reading the nitty-gritty full details.
I think it’s also a fun and interesting challenge to try to develop visual representations of complex mathematical ideas, like I was describing early on with my initial motivation for a tool like this, a system like this, where in grad school I would spend a lot of time trying to come up with really nice, intuitive visual explanations for more technical concepts. And I think that is like a design challenge, a design task. And I think when done well, it is an interesting combination of visual representation of more mathematical and complex ideas. And I think those… The most effective talks like that I think are really hard to do, take really a lot of work to design well.
And do you feel like you’ve done any talks or made any presentations of your research, whether that’s in a paper or you’ve done some really nice posters for various stages of the project also? Is there anything in that body of work that you would want to point the listeners to, as an example of somewhere where you guys put in a lot of effort to come up with those visual representations of the nuanced ideas that they might want to go and look at?
Yeah. So on the presentation on the talks part of my webpage, we have a bunch of presentations that we’ve given at various conferences and seminars and things like that. And a few months ago, I gave basically a summary talk of all the different research directions that we’ve been pursuing towards this goal of bidirectional programming with direct manipulation. And we tried to summarize all the different… give summaries of the main technical ideas under the hood. And it’s funny that slide deck that I put together actually has been building up over years and years. I don’t think I’ve actually started with a new slide deck in like five years or six years. I think each time it’s like building on the styles and building on the choices that we’ve made before about different colors and shapes to use for this concept and different code examples with different highlighting motifs and things like that.
And so it’s funny, it’s very full circle. It’s the kind of effort that would be really, really painful to go back and create from scratch or to make changes to because so many choices have literally been copied and pasted over into new modules of this slide deck for many, many years. And so it’s certainly the thing that I would hope, these bidirectional tools could make much more pleasant, a much more effective ways of building these presentations in the future. Currently, I just still use PowerPoint, and this talk that I gave a few months ago is covering, I would say four different research directions that we’ve been pursuing under the hood to enable these bimodal editors. And that might be a good way to get a cartoon understanding of some of the main ideas under the hood.
And I’ll include a link to that in the show notes.
Yeah. And it’s funny, this is exactly the artifact that I would really hope to produce in a system that really allowed programming. But yeah, I guess I already said this, so maybe I’ll say it again. Yeah.
Yeah. The Sketch-n-Sketch demos that I’ve watched show a specific workflow with direct manipulation at each step, and that workflow is first you draw. And then you add relationships between the things you’ve drawn and then you do graphical grouping, which serves as an abstraction operation where you extract a function that creates the group drawing that you made, and then you tweak your drawing using that abstraction. Each of those steps, creating or refactoring the code. Is that workflow a gilded path, or are there other ways to work within the tool like importing an existing graphic or starting with existing code?
Yeah. So a lot of those example demonstrations take that workflow to see what can be done, how expressive a design can be built using just direct manipulation and not also interleaving text-based edits or programmatic edits, but you’re right. Of course, an authoring workflow might really mix and match these two modes of use much more freely. For the question about importing existing graphics, currently we don’t have any tools implemented to make that process easy.
You could imagine taking an existing, let’s say SVG definition and in the limit, just inlining that literal into your program, but even better would be to try to automatically identify numbers and properties that appear over and over again, and suggest those as variables and maybe even function boundaries if they are repetitive patterns in the imported file. You can imagine doing that, but we haven’t spent any effort on that yet.
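As a rough illustration of that import idea (which, as noted, has not been built), the sketch below scans an SVG document for numeric attribute values that occur more than once and proposes a variable for each. The function name `suggest_variables`, the `n0`, `n1` naming scheme, and the "at least two uses" threshold are all assumptions; a real tool would need far better heuristics.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Dict


def suggest_variables(svg_text: str, min_uses: int = 2) -> Dict[str, float]:
    """Propose a variable for every number repeated across attributes."""
    root = ET.fromstring(svg_text)
    counts: Counter = Counter()
    for element in root.iter():
        for value in element.attrib.values():
            try:
                counts[float(value)] += 1
            except ValueError:
                pass  # non-numeric attribute such as fill="red"
    repeated = sorted(v for v, n in counts.items() if n >= min_uses)
    return {f"n{i}": v for i, v in enumerate(repeated)}


svg = """
<svg xmlns="http://www.w3.org/2000/svg">
  <rect x="10" y="20" width="50" height="50" fill="red"/>
  <rect x="80" y="20" width="50" height="50" fill="blue"/>
</svg>
"""
suggestions = suggest_variables(svg)
```

Here the shared `y` value and the shared side length are picked out as candidates, while the two distinct `x` values are left as literals.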
More directly about this gilded path through the tool. So, in an initial version of Sketch-n-Sketch and the version that we demoed at Strange Loop, there were many requirements about the syntactic structure of the program that if they weren’t satisfied certain interactions in the output would no longer be available. So a simple example is in that initial milestone, the main expression, the main definition of the program essentially had to be a list literal of shapes. And each of those shapes had to be a top level definition in your program and only then could certain interactions be available to users. And so a lot of the work that Brian has done recently has been to relax those restrictions.
And so by doing more general purpose tracing of the program to support more arbitrary programs in the language while still retaining the connection that, “Oh, this value in the output came from certain locations of the program,” that’s supported in a much more general way. But there are certain times when let’s say you want to create some parameterized function that repeats some design. There are certain times you can actually take multiple paths through the tool. So for instance, you can copy and paste a shape or a design multiple times, and then use a tool called merge that looks for syntactic differences in the definitions that generated them. And we’ll take the differences between those programs and turn those into arguments to a function. Or you can say, well, given just a single shape or group, you can use a tool called abstract, which turns it into a function like we just described.
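The merge idea described above, turning syntactic differences between copies into function arguments, can be sketched as a small structural diff over two trees. This is only an illustration and not the tool's actual algorithm: nested tuples stand in for ASTs, and `Param` and `merge` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class Param:
    name: str  # a hole in the template that becomes a function argument


def merge(a: Any, b: Any, params: Dict[Tuple[Any, Any], Param]) -> Any:
    """Return a template shared by a and b, with differences replaced by Params."""
    if a == b:
        return a
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        return tuple(merge(x, y, params) for x, y in zip(a, b))
    key = (a, b)
    if key not in params:
        # The same pair of differing values reuses the same parameter.
        params[key] = Param(f"p{len(params)}")
    return params[key]


rect1 = ("rect", ("x", 10), ("y", 20), ("w", 50), ("h", 50), ("fill", "red"))
rect2 = ("rect", ("x", 90), ("y", 20), ("w", 50), ("h", 50), ("fill", "blue"))

params: Dict[Tuple[Any, Any], Param] = {}
template = merge(rect1, rect2, params)

# Arguments that reproduce each original copy from the template.
args_for_rect1 = [a for (a, _b) in params]
args_for_rect2 = [b for (_a, b) in params]
```

The template plays the role of the extracted function body, and the two argument lists are the two calls that replace the pasted copies.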
So that’s an example where there’s two different ways you can choose to build one of these parameterized drawings, but there’s other times when you are forced to make a choice about what constraints to add into your program, and then you can’t undo that choice later on. And so there certainly are many times where you do have to pick the right path through the current set of tools, and that certainly needs to be addressed in the future, where you want to allow maybe multiple choices to be propagated downstream. So that later on, when you make some subsequent action, maybe then is the right time to decide whether the structure of your program should be one way or another. Currently, there are times at which you have to make a choice that you can’t undo or revert later on.
It sounds like you’re not storing an edit history on the graphical side, right? Like every change on the graphical side is immediately propagated back to the code?
That’s right. That’s right. So after every interaction, the transformation changes the code and doesn’t store the edits that led up to it. So reasoning about the edits that are being made in the output editor could certainly be a rich source of information for helping to decide or understand what changes to make to the program. And then also keeping track of the history of program edits as well can certainly provide new ways of trying to infer what the user is intending.
Yeah because if you provide a lot of different ways to achieve the same result, it could mean that the structure of the code that you end up with is different, depending on which way you went about achieving that result. And it makes me have flashbacks to… For instance, like in Gmail, when you’re trying to do WYSIWYG editing or, not even Gmail, Slack’s recent text editor change to… WYSIWYG editor had this problem, a lot of WYSIWYG text editors have the problem where when you try to add formatting, there is a representation that is invisible behind the scenes, and you can end up with things like here’s a spot that has… It’s surrounded by white space, but if I position my caret in that spot and type, it’s bold, even though the text on either side is not bold, because there’s an empty bold node at that spot, those sorts of things.
And so it feels like this is a place where I could imagine it being tricky to get the right balance so that… And I suppose showing the code, and if people who are using the tool are expected to be familiar with the code representation, that gives you a lot of benefit because then people can see, “Oh, when I repeat a shape using this approach, it creates that change in the code. Whereas when I repeat a shape using that approach, it makes a different change.” And so that relieves you of the burden of having to use something like fancy edit history tracking like a CRDT or something like that, where you try to merge it down to a canonical representation, no matter how you got there.
It relieves you of the burden of having to make sure that different changes in the output result in the same change in the code, since people can see the code… You’ve moved the burden of correctness and consistency over to the user. And I think that that’s a good thing. Like I think that that’s… You’re giving them leverage rather than foisting complexity on them. Does that feel right to you?
I guess I would say yes and no. I guess so certainly I think having a program in a general purpose programming language be the ground truth, be the artifact that matters, I think that’s a good choice. I think that is maybe not in the long-term the best, but I think it’s a very good medium in which to at least allow users, especially expert users, to make this specific choice about what the representation should be. But with programs, there are oftentimes cases where you write something one way, but you might want to express it in a different way instead.
So a very simple example for this specific domain is let’s say I’m writing a program that generates a rectangle. Oftentimes you’ll decide whether the parameterization for this rectangle should include the location, the point of the top left corner and the width and height of the rectangle. Or sometimes you might decide the parametrization should be the top left corner and the bottom right corner, in which case the width and height would be derived in terms of those two points. And so there are times in which you might prefer the former parameterization, there are times in which you might prefer the latter, but with a program you have to pick one, right. And then when you’ve made that choice, all the subsequent code, that depends on it is not very easy to change if you want to go back and change the structure of the initial parameterization. And so I think exposing a general purpose program is a good way of making explicit exactly what the artifact is, what the representation is. But again, there’s times when programs force you to make these kinds of choices that you would ideally like to have even more deferred control over. So, there’s other intermediate representations you can imagine with like program dependence graphs, and other computation, and only have to turn it into a program, like turn it into an abstract syntax tree when it makes sense.
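The two rectangle parameterizations mentioned above can be written out side by side. This Python sketch is illustrative only, and the type and function names are made up. Each representation can be derived from the other, yet code written against one has to be rewritten to use the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CornerAndSize:
    """Top-left corner plus width and height."""
    x: float
    y: float
    width: float
    height: float


@dataclass(frozen=True)
class TwoCorners:
    """Top-left and bottom-right corners; width and height are derived."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def width(self) -> float:
        return self.x2 - self.x1

    @property
    def height(self) -> float:
        return self.y2 - self.y1


def to_two_corners(r: CornerAndSize) -> TwoCorners:
    return TwoCorners(r.x, r.y, r.x + r.width, r.y + r.height)


def to_corner_and_size(r: TwoCorners) -> CornerAndSize:
    return CornerAndSize(r.x1, r.y1, r.width, r.height)


a = CornerAndSize(x=10, y=20, width=50, height=30)
b = to_two_corners(a)
```

Stretching the right edge is a one-field change in `TwoCorners` (`x2`), but in `CornerAndSize` it means editing `width`, and moving the whole shape is the reverse trade-off. That is the kind of choice a program forces up front.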
Like almost as a rendering step or something like that like export code or something?
Right. And then let’s say that you want to move back to the more general purpose, like bag of constraints, bag of computations, do more interactions. And then at some point, you want to say, okay, at this point I know that I want to do some repetitive operation over my data structure, codified as some AST that looks like normal data structure that I can map and fold over so that I can perform some actions, but then go back to this more general representation of the computation for subsequent edits.
So about that, like at the moment in the user interface, even though in the UIST demo, Brian, the point of the demo was we can do everything that you’re about to see entirely in the graphical side without touching the code. We pushed ourselves as far as we could to do things with direct manipulation of the output. I noticed in that version of Sketch-n-Sketch, there’s still a run button for the code.
And that was something that I saw in the Strange Loop demo earlier on was there’s a run code button, and you have to click that button to update the output representation and further to what we were just talking about about maybe inverting that, where you’re working directly with the output and it’s keeping some bag of constraints or something as a live representation behind the scenes. And then only as needed, do you render that to code that can be edited. How did you arrive at that need for an explicit run code button? And do you see yourself getting to the point where that goes away or where that’s inverted or in terms of that specific slice of the prototype? Like, what are your thoughts and feelings about it?
Yeah, so I guess there are two things that I would say. So certainly having a traditional run button was just a simple, lazy choice. There’s no reason why it couldn’t be a more live, continuous, automatic process. It’s just that when you’re in normal text editing mode, the workflow right now is that you press compile, you press run, like you normally would, but there’s no reason why we couldn’t just implement some automatic rerun feature, like you would expect in a live programming environment, or, going forward, build on the programming-with-holes work that Cyrus and the Hazel project are exploring. There’s no reason why we wouldn’t want that as well. And so that’s just an orthogonal feature that we just haven’t worked on or polished at all; that’s not really an explicit choice.
But your second question about not even showing the code until you really need to or want to for some reason, that’s something that we can imagine doing in the future. While you’re performing interactions through the visual editor, maybe you need to be shown and told what’s going on under the hood at every step, or maybe not. Maybe you do have these choices building up as constraints under the hood, and only when the user knows that they need to do something with finer-grained control, that doesn’t have a mechanism in the visual editor, could the system then say, okay, based on the program that the user previously saw and the changes that have been made since, and the interaction that they want to do next, show them code in a way that makes the most sense for that next step. And maybe only the parts of the code that they need to know about. You can certainly imagine wanting to do this more selective and dynamic code generation as part of the user experience. Currently, to keep things as simple as possible as we’re trying to push on the expressiveness of each of these operations, we choose to make the changes to the code at every step and display them. But there’s no reason that that’s the right, desirable user interface for all use cases.
And that reminds me a bit of the first place, I feel like I encountered anything remotely like what you’re working on is in 3D animation tools like Maya or 3D Studio Max, or other obscure tools from back in the nineties when I was getting into this stuff. When you are working on your 3D scene, every action that you take in the GUI produces a log line that is a little bit of script. And each one of these 3D tools has its own scripting language. One of my favorite tools used Tcl as the scripting language, those logged out scripts are just a little snippet that represents, here’s the object that you had selected, and here’s the transformation that you applied. And the purpose of having that logged out is to assist you in building reusable scripts that you can run to programmatically, edit your scene.
It’s almost like a human in the loop programming by demonstration, where you do the little bit of the action that you want to do enough to generate the structure, or the code that you need because most of the people using this aren’t programmers, they’re artists. And so this is a way of saying, don’t worry about having to look up what function to call or what syntax to use, just do the edit you want, and we’ll log you out a little bit of code for that. And then you’d grab all those code snippets, and you’d put them together into a script and manually parameterize out the parts that you wanted to make into parameters.
And then you could run that script against your scene and feed in the data that you wanted that script to act on. That’s the first place I encountered anything that feels like this. And of course what you’re doing is far more sophisticated and automatic, and it’s doing a lot of… It takes the human out of the loop so that they don’t have to be concerned with doing that abstraction process manually. Did you look at any of these Autodesk style, 3D animation tools or anything like that when doing background research for Sketch-n-Sketch? And if so, what kind of a relationship do you see between what you’re working on and what they’re doing?
Right. So, those tools that you mentioned, and these programming by demonstration tools, like you said, often do record a series of edits that serve as reusable functional units, which certainly is very, very related to the interactions that we’ve been exposing for certain operations. I haven’t played with these tools myself, but we certainly have read a lot about them. And Brian especially has surveyed a bunch of these different programming by demonstration tools. And I think a lot of times… So I guess some more technical differences, maybe not important differences for the vision, but technical differences often include the fact that the languages that these systems are generating programs for or scripts for are imperative languages where there’s a lot of state manipulation. And we didn’t talk about this yet.
I think the language to start with here really doesn’t matter as much. I think certainly functional programs and functional programming languages make certain reasoning the tool has to do easier. There’s less implicit state manipulation, but really, I don’t see like a fundamental reason why these techniques couldn’t also work well with imperative and object-oriented programs as well. But I think these more classic programming by demonstration systems often tend to use languages that heavily use mutation and first order combinations of first order programs. Whereas we were interested in thinking about how to couch these interactions in higher order functional programs where we can then do things like map and fold and higher order combinations of these individual units. I think that’s one thing that we’re trying to push on a little bit.
Earlier on when you were talking about hiding the code representation in some circumstances, and that possibly being of interest to more expert users who would want more control over when they’re just working with the graphic or when they’re also moving over to work with the code. That was the first mention I’ve seen in the context of this project of the idea of an expert user. And that’s something that when I’m evaluating different tools for looking into the future of programming, there’s a universal focus on the beginner experience and on novice programmers and on programming being broadened to reach more people who don’t have a technical background. And sort of programming for the 99%. And I think that that’s a worthwhile goal and it’s very interesting, but it’s so universally talked about that I think that it’s sort of a gimme, and it’s one of the goals that was explicitly stated for Sketch-n-Sketch at the end of, I think it was the UIST talk Brian mentioned that. And what I’m curious about is how much thought have you given to people using this tool to develop expertise and how much do you think about what the experience of using a tool like this would be for somebody who is a master programmer or a master artist or both?
Yeah, so… Sorry. I was thinking this might be a good time to talk about using tools like this as a vehicle for teaching programming and then developing programming expertise. But I think maybe that’ll be a different topic.
Yeah. And just to touch on that, because we can go down that road if you want. But my personal feeling is that it’s a cliche. That everybody who is working on future programming tools is concerned with the beginner experience and that in every interview that I’ve ever heard about people talking about a new programming tool, the beginner experience is brought up almost like a point of hygiene. And it’s the sort of thing where I feel like so much ink has been spilled about that. That there’s not… There might be interesting things to say about it as Sketch-n-Sketch relates to it. Like what is Sketch-n-Sketch doing specifically to help that.
But I feel like it’s sort of… Anybody who looks at this can obviously see how it would be better for beginners. Like it’s so plainly a richer window into programmatic behavior and dynamism, and it so transparently has all of the benefits of live programming that you want, and it has all of the benefits of being graphical or being focused on the product of the thing that you’re making, not the abstract contortions that you have to work through to get to that. It saves you from having to play compiler in your head. Like it’s doing so much that I feel like anybody who is thinking about that will see it in what you have done. Whereas the expert side, I never hear anybody talking about that.
I think really from the beginning of this conversation, we basically took as an axiom that a user will want to do some interactions with text edits and some interactions as direct manipulation interactions. I think right from that point, we’re assuming that there is an expert user that is a programmer. And then wants to dive into the programmatic representation. And I think when an expert user goes to make something, whether it would be an essay or whether they’re going to do some data manipulation, whether they’re going to create some presentation, whether they’re going to create some web application, create some pictures, some visuals, I think the expert user who’s trained in programming always has to make a choice.
Are they going to use a GUI application that is developed for those domains and give up the abstraction capabilities that they know that programming provides, or are they going to pick their favorite programming language or their favorite library that caters to that domain? Oftentimes an expert user might choose to, let’s say, use Beamer to generate slides or use this Racket library called Slideshow to develop their slides. And there’s this clear downside that although you can generate these really complex abstractions and reasonable artifacts, simple changes like, drag this thing a little bit to the right or copy and paste this thing and then change it, are extremely tedious to make when you have to think about where in the program that operation stems from. And so, I think there is this pain that expert programmers realize they go through when they’re already programming. Right? There’s this tedious Edit-Compile-Run cycle, especially when you’re trying to, at the end of the day, generate something that is very visual and interactive and the design process takes lots of iterations, which is common when you’re building something like this. I think expert users run into that pain point and would clearly see how they could benefit from a tool that allows them to do programming, but then also get some of these interactive capabilities for changing the outputs of their programs.
Given that that’s the appeal that an expert would see in a tool like this, let’s look at the other side: what is Sketch-n-Sketch offering beginners, newcomers to programming, or even programmers who are newcomers to graphics that they might not otherwise be able to access?
Right. So, I’m certainly no expert in computer science education or programming education, but it seems intuitive that tools that make programming more interactive ought to help with the teaching, the understanding of programming concepts. And so, one thing that I’m interested in doing is using Sketch-n-Sketch and future versions of Sketch-n-Sketch to teach introductory programming to students that maybe want to learn simple graphic design or generative art.
Because my sense is that students that are interested in, let’s say, design or art would of course learn to use tools like Illustrator and Photoshop and all of those tools. And then some of those students might then later on learn a programming language like Processing or p5.js which cater really well to these domains of programming. Instead, what if you could teach the kinds of features that Illustrator and Photoshop provide in the same environment in which you can learn about variables and functions, and have those different concepts in different interaction paradigms just be the same system and not two disparate systems.
And so, I’ve been looking recently at the Processing community. It seems like they’ve done a lot of really cool curriculum development around Processing and p5.js. There’s folks at NYU in particular that I’ve been looking at their work, Daniel Shiffman, Allison Parrish, at UCLA Lauren McCarthy. They’re developing a lot of really interesting content for basically teaching programming to students that are interested in design and art.
And so, one idea that I’m planning to pursue is think about how to teach programming with tools like Sketch-n-Sketch where, not only do you learn different programming constructs, but then you can interact with the output of these programs. And interacting with the output of the programs can suggest changes to the original program and hopefully motivate and teach why you might want to learn about variables, why you want to learn about functions and things like that.
And so, one idea that I’ve been thinking about is designing some of these curricular exercises and projects as kind of like a game. So, a challenge would be to create some design and maybe make several variations of it with different colors and different sizes and things. And start by using the direct manipulation tools that you would otherwise learn in Photoshop or Illustrator. And use that as an opportunity to reveal several kinds of operations that involve repetitive and tedious edits. Then, when students learn what variables are used for and how using a variable in multiple places then enables the system to map one of your output interactions to other changes in the output as well, you learn to understand why you might want to use variables in your design. Staging these projects as challenges where you want to achieve some design task with, let’s say, the fewest number of like mouse edits or user edits, or the time it takes to carry out one of these tasks, as a way to motivate and explain why different programming constructs and abstraction ideas can be beneficial when creating certain classes of designs.
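To make the point about variables concrete, here is a minimal Python sketch of the idea that one output interaction can map to several changes in the output once a variable is shared. It is a toy model only: the dictionary-based "program" and the names `render` and `drag_radius` are hypothetical and are not Sketch-n-Sketch's actual language or API.

```python
# Hypothetical toy model, not Sketch-n-Sketch's actual language or API.
# A "program" is a dict of named variables plus a function that draws from them.

def render(variables):
    """Produce three circles that all share one radius variable."""
    r = variables["radius"]
    return [{"shape": "circle", "cx": 50 * i, "r": r} for i in range(3)]

def drag_radius(variables, new_r):
    """Map a direct-manipulation edit (resizing ONE circle) back to the
    variable that produced its radius, returning an updated program."""
    updated = dict(variables)
    updated["radius"] = new_r
    return updated

program = {"radius": 10}
before = render(program)
program = drag_radius(program, 25)  # the user resizes a single circle
after = render(program)             # every circle follows, since all share `radius`
```

Without the shared variable, the same design change would take one edit per circle, which is the kind of repetitive work the exercises described above are meant to expose.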
And it’s almost like you’re taking the tedium out of both sides. Like you’re taking the tedious parts of graphics editing, where you have to repeat a lot of shapes and make little changes to each one of them. Or you want to make a change to some shape that’s used all over the place and update all of the places where that’s used. That’s very tedious. And then on the coding side, you’ve done the same thing where if you want to make precise changes to your output, that’s very tedious to do in programming. And I think that’s a fun way of looking at it. And that is something that I’m sure would be appreciated by people who were new to this whole game, or people who are familiar with one side and are new to the other side.
Right. Yeah. That’s certainly the long-term goal. Another thing that I’m interested in is… I think I read somewhere that the name Processing originally came to describe the importance of the process of creating some artwork and not necessarily just the final artwork as well. And so, I think the process of programming is not the ideal process that we would want. We would want it to be much more interactive. And so, it seems very much in the spirit of the goal to build tools that are effective and efficient.
And when you were talking about using Sketch-n-Sketch to introduce programming to people who already maybe have a little bit of familiarity with graphics tools, and I don’t know if this was intentional or not, you said that what might happen in Sketch-n-Sketch is that somebody makes a change to the graphic and that change might suggest a change in the code. Are you suggesting that it wouldn’t be the case that a change to the graphic immediately changes the code, but that it would instead present some sort of an interface saying like, “I see you want to make that kind of change to the graphic. Here’s the way that you would go about doing that by editing the code.” And it’s sort of like a tutelage sort of mode? Or did you mean suggest as in, it just makes that change to the code and they sort of have to look at the code and think, “Oh, I remember how this was before, and I see how it’s different now. And I’m going to kind of relate that to what I understand about what I did over on the graphic side”?
So, I meant the former, but I think both could be useful. So, currently as it is, there are certain interactions that lead to a program transformation right away without any subsequent interaction. But there are other times when currently the tool will give kind of a menu of options and you hover over each of the options and in each case, it’ll show you a preview of the changes to the code and a preview of the changes to the output. Because for many of these interactions, there’s just a huge amount of ambiguity in what change the user wants to effect.
And so, currently we have this very simple interaction paradigm where when we have multiple options, we sometimes use heuristics to try to rank them according to what we think are most likely to be desirable changes. But then the user has to interact with the different choices and pick one. And so, in an educational context, you could really play up that kind of interaction where maybe you could even think about presenting undesirable options and thinking about multiple kinds of edits and see visually what the effect would be on the output.
And have the user have to decide whether or not different code changes actually correspond to what they were hoping to achieve. And so, you can imagine this kind of interactive dialogue with the system being used specifically for the purposes of teaching new ideas and new constructs that maybe they haven’t yet seen. But then you can also imagine modes where, let’s say, once they’ve learned to use some language feature and learned to use some interface feature, then they could configure it in a way that as soon as they make this change, it automatically applies the corresponding code transformation that they’ve already learned to reason about.
That’s extremely cool. I would love to see that as a direction that Sketch-n-Sketch goes. So, it sounds like you are already doing things in the user interface where the code that you see isn’t actually the true underlying representation of what’s happening in the graphic. Because there’s some way that you’re accomplishing showing a preview of what the change to the code would be, depending on a choice that you make in a pop-up context dialogue or something like that.
So, how are you handling that architecturally? Like, is there a true underlying representation, and the code display is a rendering of that, and you might adjust that rendering based on what the user is hovering over? Where’s the disconnect in the system between what’s actually happening in the core model and what’s actually happening in the user interface?
Right. So, the ground truth really is just a single program. That is the main artifact, that is the model, that’s what everything is built off of. But when a user makes some interaction with the output and then invokes one of the transformations that the menu provides, at that point, the result of that transformation may be more than one program; there may be more than one candidate change to the program. And for each of those candidate programs, when the user hovers over it, we’ll show you a preview of that program and also evaluate that program and show you a preview of the output. But in some sense, those are just saying to the user, “Do you want program one, or program two, or program three, or program four?” The user then commits to one of those choices. And that becomes then the ground truth again. And so, at each stage we can suggest multiple possible changes to the program.
And so, how do you avoid the issue of something like a feedback loop, where when you’re previewing the different options, that changes the output representation, which changes the thing that you’re interacting with to generate the preview? When you’re doing the previewing, does that mean that you are not currently affecting the output?
That’s right. So, after a transformation has been invoked, a set of candidate programs are generated and what the user can see is what is the output of those candidate programs, but can’t interact with any of them further-
Until they make a choice as to what program to use.
At least currently, yeah. And so, again, this is a kind of a very simple model right now, where each interaction has to be mapped back to a single choice, and then you proceed. But like we talked about, to support more non-linear authoring workflows, you might want the user to explore multiple different paths and continue to interact with the artifact. And then only later make choices about how those interactions ought to be codified in code. Currently, we don’t do any of this kind of non-linear, multiple path interactions yet.
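The preview-and-commit model described in this exchange can be pictured as a small state machine. What follows is a minimal Python sketch under stated assumptions: the `Session` class, its method names, and the dictionary-based programs are all hypothetical illustrations, not how Sketch-n-Sketch is actually implemented.

```python
class Session:
    """Toy model: one ground-truth program; an ambiguous edit yields
    candidate programs, and one of them must be chosen before continuing."""

    def __init__(self, program, evaluate):
        self.program = program    # the single ground-truth program
        self.evaluate = evaluate  # function: program -> output
        self.candidates = None    # pending candidate programs, if any

    def propose(self, candidates):
        """A transformation produced several candidate programs."""
        if self.candidates is not None:
            raise RuntimeError("resolve the pending choice first")
        self.candidates = list(candidates)

    def preview(self, index):
        """Hovering: return a candidate and its output without committing."""
        candidate = self.candidates[index]
        return candidate, self.evaluate(candidate)

    def commit(self, index):
        """Choosing: the selected candidate becomes the ground truth again."""
        self.program = self.candidates[index]
        self.candidates = None

    def output(self):
        return self.evaluate(self.program)


# The output is x + offset. Dragging it from 15 to 20 is ambiguous:
# either variable could absorb the change.
session = Session({"x": 10, "offset": 5}, lambda p: p["x"] + p["offset"])
session.propose([{"x": 15, "offset": 5}, {"x": 10, "offset": 10}])
previews = [session.preview(i) for i in range(2)]
session.commit(1)
```

Both previews produce the same output here, which is exactly why the choice has to go back to the user: the candidates differ only in how the program changes.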
And so, on that topic, when there are certain moments in time where you are free to manipulate the output and then there are certain moments in time where Sketch-n-Sketch will say, “Okay, the output is temporarily frozen until you make a choice about what the consequence of your last action should be on the program.” There are some visual programming environments where like Sketch-n-Sketch there’s a code representation on one side and a graphic representation on the other side, or they might even be in entirely separate places.
Like you might have a text editor and a browser that’s doing a live refresh every time you save in your text editor or something like that. There are some cases where the live editing experience is, code is in one place and the live preview is in another place. And then Sketch-n-Sketch brings them close enough together. And you get the bidirectional stuff going on where not only are they right next to each other, but a change on either side, basically instantly updates the other. And then there are some projects like, I think some of Sean McDirmid’s work where the canvas contains the code and the code is always represented in the canvas.
And you could imagine if that code were generating some graphical output, that graphical output would live in the same canvas as the code. And the two things would kind of co-exist graphically and you would use the same tools to work on both sides. And so, to me, it feels like there’s sort of another spectrum or another space here: how close together are the representations of the code and the output, and how unified or how distinct are the tools that you use to work with each? And so, what are the things that you’ve sort of thought about when bringing these two sides together, as you have in Sketch-n-Sketch?
And how did you decide to stop at the point where you are, and not go further and say the canonical representation is in the graphic? Because you’ve moved further in that direction recently by adding widgets in the graphic that are a representation of an intermediate execution product in the code, but there’s still a disconnect there. So, I’m wondering if it’s sort of two tectonic plates that are going to gradually move together and one sort of subsumes the other, or do you feel like there’s a natural stopping point?
I think the tectonic plates are going to crash into each other. And I think there’s going to be a big earthquake where it’s all going to be a mix. I don’t think there’s going to be this kind of a hard divide between code lives here and output values live somewhere else. I think we chose that approach because it was the simplest one that we could take. But I certainly imagine like a much more kind of, let’s call it canvas, a freeform canvas where all of your computations live and sometimes you’re viewing the graphical output that it produces or graphical representations of the execution. And sometimes you’re looking at the text view. I imagine a much more fluid mix of those two kinds of interfaces in the future. So, this’ll be hard to describe maybe over the radio but…
But let’s do it.
One kind of UI that I like thinking about is… So, I use PowerPoint a lot, which shows you how great of a designer I am. I do all of my editing and stuff in PowerPoint. So, my wife and I make holiday cards at the end of the year. And she usually comes up with a sketch on paper of an idea. She picks out a few images that might work well and then I’ll go in PowerPoint and prototype some designs. And in PowerPoint, like maybe other design tools as well, you see the space on which your slide lives, but then you can also have objects outside of that range, that focus.
And so, we’ll often have multiple images or other kinds of objects outside of the periphery of the main slide that I’m working with, where I’m trying things. And then only what is in the actual slide is what’s in the final, what’s called main expression of my canvas. And so, I kind of imagine that maybe this kind of freeform canvas kind of approach could be useful where you could sprinkle around a bunch of program snippets, graphical representations of what they produce alongside the main object, the main expression that you’re constructing in the middle.
That’s kind of one toy idea that I imagine we might try sometime in the near future where you can have the kind of helper widgets, the intermediate computations, be close to the object you’re creating so that you don’t have to decide whether or not something is in the final output or not. You have access to all the different kinds of tools that you’re using in kind of a more freeform space.
I love that so much. If that ends up being the direction you go, I’m going to delight in seeing if that turns out to be fruitful and if you pursue it. Because I’m all about that. That’s extremely cool to hear that that’s something that you’ve thought about.
And you can imagine an analogy to kind of normal programming where usually you configure your program to generate some final output. But then other times you will add in a whole bunch of print statements and they’re kind of logging things to have extra information as you’re working on that main expression. And so, in the same way, you can imagine toggling on and off all these intermediate computations, these helper widgets, these paths of programming and design that you were trying out but maybe never made it in the final artifact, but allowing those to co-exist in a way that you can explore and go back and forth through the authoring process.
And you’re already doing that to a certain extent. Like, there’s the, what did you call it, ghost mode or something like that?
That’s right. That was just a very kind of simple layering mechanism. You can think of it that way, where each object that you’re generating can live in its own layer so that you can naturally toggle between layers that you want to show or not in the output pane.
And the layers that you might want to show or not, at least in terms of how I’ve seen you use them, are for things like widgets that let you do advanced changes to the code representation. These widgets aren’t going to be shown as part of the final output graphic, but they’re graphic representations of the structure of the code or of transformations that the user might want to make on the code.
That’s right. They could be, or they could even just be kind of helper shapes that the user has put there because they make relationships between the main shapes easier to specify.
Right. Like an arbitrary center of rotation or something like that?
Right. Yeah. So, I find myself doing that a lot in PowerPoint too. Like temporarily making a shape that has some certain size that I want, because then it’s easier to snap my main shapes to it. And so, I feel like that pattern obviously comes up over and over again. And so, you can naturally keep all these helper shapes in different layers, and rather than having to delete them and then maybe insert them back again if you want to make that change again, you could selectively toggle them on and off in different layers.
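As a rough illustration of that layering idea, here is a small Python sketch in which helper shapes stay in the document and are only filtered out of the rendered output. The shape records and the `visible_shapes` function are hypothetical and are not taken from Sketch-n-Sketch.

```python
# Hypothetical data model: every shape records which layer it lives in.
shapes = [
    {"name": "logo", "layer": "main"},
    {"name": "rotation-center", "layer": "helpers"},
    {"name": "sizing-box", "layer": "helpers"},
]

def visible_shapes(shapes, hidden_layers):
    """Return the shapes to draw; hidden layers are skipped, never deleted."""
    return [s for s in shapes if s["layer"] not in hidden_layers]

final_output = visible_shapes(shapes, hidden_layers={"helpers"})
while_editing = visible_shapes(shapes, hidden_layers=set())
```

Toggling a layer back on restores the helper shapes exactly as they were, so nothing has to be redrawn to repeat an earlier alignment step.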
I work with a team of artists and I’ve seen them do that time and time again, they’ll use the graphics program that they’re working in to construct little scaffolding shapes. Here’s a line that’s a certain length and it’s a certain distance away from this other line. And I’m just going to move this little construction all over the place to help me line up everything on a grid. They make these sort of little scrap pieces of graphic and then use them to add structure to what they’re working on. And so, it’s cool to hear that you’ve also done that kind of thing, working in your art tool of choice, PowerPoint.
This podcast has a transcript, which you are reading right now. And that transcript is brought to you by Repl.it, an online REPL that lets you spin up an environment for working with any number of programming languages. They take all of the pain and hassle away from trying a new language or getting a quick project up and running. The Repl.it code editor is a collaborative environment, so you can have multiple people working on the same code base all at the same time. They have a thriving community of students and professionals who are doing amazing things with the tool all the time. Check out what they’re doing and try it out for yourself. My thanks to Repl.it for sponsoring the transcript and for helping to bring us the Future of Coding.
I should also make clear that this has really been a big team effort. I’ve hardly done any programming on Sketch-n-Sketch in the past year, two years even. I’m looking forward to getting back into it, but really the heavy lifting has been done by this really terrific group of students. Brian Hempel has been doing a great job over many years.
He was crazy enough to join me when I was starting this project. Justin Lubin, a really terrific undergraduate who’s made tons and tons of contributions to the project. Mikaël Mayer has been doing lots of really awesome work. Cyrus joined my group and brought the idea of holes and programming with holes to the group as well. Nick Collins. And so, it’s been a really terrific team effort. I’ve been really lucky to have such great collaborators.
Speak to that a little bit more. Like what have each of those people done? It’d be sort of nice to know for each of those people what they’ve contributed.
Sure. So, Brian’s focus over the past couple of years has been extending the expressiveness of what you can do purely with direct manipulation interactions. And so, he was the driving force behind this most recent UIST paper where using just direct manipulation interactions, he’s able to build complex readable programs for a variety of parameterized drawings including recursive drawings, including drawings that have shapes and groups of shapes repeated over various geometric dimensions.
And to do this has required exposing much more about the execution of the program than just the final value that it computes. And so, recording a lot of information about how intermediate program execution ends up affecting the final output value. All this more general purpose tracing of programs and exposing richer widgets for manipulation, exposing a whole bunch of new tools for transforming programs has really been the focus of his work over the past couple of years.
And he’s now thinking about how to expose similar kinds of interactions, building programs based on output interactions, for other domains where you don’t necessarily have very visual representations of values like you do in vector graphics. So, how could you implement more typical general purpose data structures and data structure manipulation functions by example, by demonstration? How can you do that in this kind of style? So, that’s his kind of current focus right now.
Justin Lubin has made contributions to the project throughout in various aspects. One of the main projects and features that he had led was the design of this text editor interface that we call Deuce. Which is something that actually doesn’t have to do with the bidirectional programming at all. It’s a feature for more traditional text editors or code editors. So, obviously text editors are the main interface through which programmers read and write code.
But of course the program, once it’s parsed into an abstract syntax tree, has a lot more structure than just the underlying linear text buffer that created it. And so, refactoring tools offer a variety of structured transformations, like renaming, extract method, things like that. And then structure editors have lower level AST transformations built into the system so that you don’t have to resort to text editing all the time. And so, a feature that we built for the text editor in Sketch-n-Sketch is what we call Deuce, which tries to overlay the structure information of the program, the AST, on top of the normal flat text representation.
And so, Justin led the effort on designing that interface, where you can hover over the code box. And as you hover over different parts of the program text, it shows you different nodes in the AST that you can select. So, for example, you might select some variable definition and then you might select some white space in between two definitions in your program. And one of the tools that Sketch-n-Sketch will then propose is, do you want to move this definition from this part of the program, to the other part that you’ve selected? And so, these kinds of refactorings or structured transformations can be made while staying within the text-based editor. And so, Justin took the lead on lots of that project.
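The "move this definition to that whitespace" interaction can be sketched as a transformation on a list of definitions rather than on raw text. This is a minimal Python illustration only: the `(name, expression)` pairs and the `move_definition` function are assumptions made for the example, not Deuce's actual AST or implementation.

```python
def move_definition(definitions, name, gap):
    """Move the definition called `name` into whitespace gap `gap`.

    `definitions` is a list of (name, expression) pairs, and gap i is the
    whitespace just before definitions[i] in the ORIGINAL list
    (gap == len(definitions) means the end of the program).
    """
    index = next(i for i, (n, _) in enumerate(definitions) if n == name)
    item = definitions[index]
    rest = definitions[:index] + definitions[index + 1:]
    # Gaps that came after the removed definition shift left by one.
    target = gap - 1 if gap > index else gap
    return rest[:target] + [item] + rest[target:]

def to_text(definitions):
    """Render the structured program back to flat text."""
    return "\n".join(f"{n} = {e}" for n, e in definitions)

program = [("width", "100"), ("height", "50"), ("area", "width * height")]
# Select `width`, then select the whitespace between `height` and `area`.
moved = move_definition(program, "width", 2)
```

This toy version does no scope checking; a real tool would presumably need to confirm that a definition is not moved below its uses before offering the transformation.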
Mikaël Mayer has worked on one of these algorithms under the hood that we referred to earlier. Which is really the core bidirectional evaluator, which allows changes to be made to the output value in a way that are mapped back to changes to the program. And this bidirectional evaluator has really been developed in a generic domain independent way.
But it’s one of the many features under the hood of Sketch-n-Sketch that allows you to, for example, make changes to things like colors and positions of things, and have those edits be mapped back to corresponding program repairs. He’s also been exploring that idea of bidirectional evaluation applied to the domain of HTML applications as well. And he’s actually pursuing a startup to try to build some of those technologies into more usable tools.
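For a feel of what mapping an output edit back to a program repair can look like, here is a deliberately tiny Python sketch of backward evaluation over arithmetic expressions. The tuple encoding, the `update` function, and the "always change the left operand" policy are all assumptions chosen for brevity; the evaluator described above is far more general, and this is not its algorithm.

```python
def evaluate(expr):
    """Forward evaluation of a tiny expression language."""
    kind = expr[0]
    if kind == "num":
        return expr[1]
    if kind == "add":
        return evaluate(expr[1]) + evaluate(expr[2])
    if kind == "mul":
        return evaluate(expr[1]) * evaluate(expr[2])
    raise ValueError(f"unknown expression: {kind}")

def update(expr, new_value):
    """Backward evaluation: return a program shaped like `expr` that
    evaluates to `new_value`. Arbitrary policy for this sketch: always
    push the change into the LEFT operand."""
    kind = expr[0]
    if kind == "num":
        return ("num", new_value)
    left, right = expr[1], expr[2]
    if kind == "add":
        return ("add", update(left, new_value - evaluate(right)), right)
    if kind == "mul":
        factor = evaluate(right)
        if factor == 0:
            raise ValueError("editing the left operand cannot change the output")
        return ("mul", update(left, new_value / factor), right)
    raise ValueError(f"unknown expression: {kind}")

# x position = 2 * 10 + 5; the user drags the shape so that x becomes 45.
program = ("add", ("mul", ("num", 2), ("num", 10)), ("num", 5))
repaired = update(program, 45)
```

Pushing the change into the right operand instead would give a different but equally valid repair, which is one source of the ambiguity discussed earlier in the conversation.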
Yeah. And then so, Cyrus obviously had been working on the Hazel project before he joined my group, and hasn’t been working on Sketch-n-Sketch proper. But the idea of programming with holes and running incomplete programs is something certainly that we’re going to incorporate into future versions of Sketch-n-Sketch. We’ve recently been working on a project for taking these programs with holes, partially evaluating them.
But then taking ideas from bidirectional evaluation, allowing examples of what these partially evaluated programs ought to evaluate to, as a way to synthesize expressions for those missing pieces. So, the idea is to allow a combination of a program sketch and examples of what that program ought to do, to help synthesize, help fill in missing pieces of the program. And that’s a project that Justin and Nick have been spearheading also. That algorithm doesn’t really appear in either Sketch-n-Sketch or Hazel yet.
But it’s sort of a potential future direction.
Yeah. It’s very much one of these kinds of engines under the hood that I think, could be a really useful part of these interactive authoring environments. And then Nick Collins has also been working on aspects of this program sketching plus bidirectional evaluation project, as well as several structure editing kinds of ideas on the Hazel project and has done a little bit of work on the Deuce code editor as well for Sketch-n-Sketch.
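The combination of a program sketch plus examples can be illustrated with a brute-force stand-in. To be clear about the assumptions: the project described above uses partial evaluation and ideas from bidirectional evaluation, whereas this Python sketch simply enumerates a tiny, hypothetical candidate space for the hole and keeps whatever satisfies every example. The names `sketch`, `CANDIDATES`, and `fill_hole` are invented for the illustration.

```python
def sketch(x, hole):
    """A program sketch: `hole` stands for the missing expression."""
    return x * hole(x) + 1

# Hypothetical candidate space for the hole: small constants and the input itself.
CANDIDATES = [(str(c), (lambda x, c=c: c)) for c in range(5)] + [("x", lambda x: x)]

def fill_hole(examples):
    """Return the names of all candidates consistent with every
    (input, expected output) example."""
    return [
        name
        for name, hole in CANDIDATES
        if all(sketch(x, hole) == expected for x, expected in examples)
    ]

one_example = fill_hole([(1, 2)])            # underspecified: several fillings fit
two_examples = fill_hole([(1, 2), (3, 10)])  # a second example narrows it down
```

With a single example, both the constant `1` and the input `x` fit the hole; adding a second example leaves only `x`, which shows how examples act as the specification for the missing piece.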
Cool. Yeah. Thanks for going through each of their contributions. One of my goals in the interviews I’m doing on this podcast is to both learn about the projects but also I’m very interested in learning about the process and how people approach working on these future tools and the sort of the context around it because all of the things that we’re working on are evolving bit by bit. And so, I think it helps to share our processes and our insights into how the tools get made.
And so building on the theme of the process, of working on the project, and building on the example that Brian brought up in the UIST talk about the programming by demonstration book having a bunch of benchmarks in it that you can evaluate your project against, benchmarks that anyone building a tool like Sketch-n-Sketch or like a programming by demonstration system could use to evaluate their project. If I’m working on my own project and I wanted to evaluate its flexibility and its suitability, here are some standardized tests that I could put my program through.
And so I’m curious if those references were collected in the UIST paper or anywhere else that you know of, so that the other people listening to this who are working on their own projects can say, “Oh, I’m curious to see how my tool does against those benchmarks?” Because it’s like it’s tantalizing to know that in a sort of a friendly, competitive sort of way that Sketch-n-Sketch could do four of the benchmarks and then could kind of do two and then there were nine more that it couldn’t do. And that’s a really interesting admission. And so it’s sort of an interesting idea that I hadn’t considered before as almost like a TodoMVC for live bidirectional program directed output tools. And so I’m wondering if you’ve made that collection of references anywhere or if I should just go on to the Slack community and nag Brian for those links.
Oh, so certainly the UIST paper does refer to the sources from which we drew these benchmarks. And so the ones that you mentioned I think are from this Watch What I Do benchmark suite, and it is a very well-known and available resource. I guess one challenge about doing head-to-head comparisons is that there are so many differences in the specific languages that are being used. And some of the tools maybe don’t run because they’re 20 years old and some of the goals of the systems are different. And so it’s really hard to, at this point, identify like these are the SPEC benchmarks of live programming or the SPEC benchmarks of output directed programming.
But it certainly is the kind of thing where, yeah, we certainly want to be able to compare different systems on shared examples, if not exactly the same example, at least different incarnations of the same goal or the same concept across multiple systems. And so I don’t know that there’s a single kind of benchmark suite that already exists, but certainly comparing to all these other examples that people are using seems to be a good first step.
For the people who are working on these sorts of tools, which I think is most of the listeners of this podcast, those sorts of benchmarks would have utility in that they might force you to approach your tool in a way that you weren’t naturally gravitating towards. And so you can sort of use it to test your assumptions and to test your model and to see where your ideas break down. And that’s the appeal of these sorts of benchmarks to me. The competition thing’s more of a… It’s kind of a joke. It’s more that this is a way of… A lot of our community are working independently. And so they might not have the resources to do extensive user testing or they might be at the wrong stage of their process to do user testing. But those benchmarks might serve as ways to help people think about their model and to just read the benchmark and think, “Hmm, is that something that I could even do within my tool?”
And I was going to… I wanted to reference in this context Brian Eno, who is a very significant figure in the history of generative music and generative art. He made a deck of cards called Oblique Strategies that you’re supposed to use when you’re facing a creative crisis, when you’re staring at a blank page and you don’t know what to write, or when you’ve made something and it’s not finished and you’re unhappy with it but you don’t know what to do to continue working on it. And each of these cards in this deck of cards is sort of an open-ended prompt. And many of them are very unusual in what they ask you to do. And they’re meant for interpretation, and they’re sort of meant to shake you out of a creative rut.
And so I’d love to see somebody put together a collection of intellectual stress tests, and it sounds like the examples from the Watch What I Do book are a little bit like this: tests that you can put your ideas through in order to assess their generality or their applicability to the domain, or that will force you to think through the problem space so that you don’t lead yourself into a blind alley. So if you know of anything like that, other than the Watch What I Do benchmarks, that would be a really interesting thing to share with our audience.
Yeah. I agree that it would be valuable to have that sort of thing. I remember at the end of the LIVE 2018 workshop, there was a little bit of discussion about whether there were common benchmarks or reference points that, like you’re describing, one could look to as stress tests and things like that. I don’t remember if anything came of that discussion.
These aren’t examples of the kind that you described. But one thing that I would mention is this work on what’s called the cognitive dimensions of notation, which are a set of heuristics for evaluating user interfaces, programming languages, other kinds of abstractions like that. And they include things like closeness of mapping, how closely does the notation or the user interface represent the notion of what the user is trying to create, what notion in the world the user is trying to create? Another heuristic is, are there hidden dependencies? Are there other things that the system is doing and knows about that are not exposed to the user for understanding or for manipulation?
And so these are properties that generally are good if they hold. They ought to hold. Some of them are mutually unsatisfiable, but you want to be able to satisfy as many of these as possible. They’re not benchmarks or examples in the sense that we were talking about but are useful heuristics to try to understand whether the system that one is building or trying to build satisfies multiple of these goals or not.
Hey, so this is Ivan from the future just cutting in here. I normally adhere to the doctrine that you should not reveal the technical details of how a podcast is made on the podcast unless you are doing so intentionally, so for instance, avoiding letting people hear the glitches that sometimes happen when Skype drops out or one person having a really good sound quality and another person having a really bad sound quality. That kind of behind the curtain stuff is, to my taste, unprofessional.
But I’m going to break my own rule right here and say that at this moment in the conversation, my connection to the internet died. I can’t remember why. But Ravi and I reconnected and picked up the conversation where we left off. I couldn’t salvage this in editing. I’ve had a number of other hiccups like this in my brief time as a podcaster. And I’ve always managed to pull them together through a very deft use of Ableton Live. But this one was just too jarring, and I couldn’t find a way to stitch it together coherently. So, you get to hear me blathering on and apologizing and saying, “I’m sorry. This isn’t how I like to do it. Don’t do me like this.” So anyways, that caveat out of the way, back to the interview.
End user programming and programming by demonstration, other kinds of approaches like that have never really succeeded in the mainstream or into really usable, useful solutions. And so I guess one question is, will this ability to actually mix programming and direct manipulation, will it actually help with novices and less expert users, or will this only be limited to experts because this is just another set of tools that need to be used and mastered?
But I guess I’m hopeful because the success story that many people like to talk about is how widely used spreadsheets are and how many people use formulas and macros and a little bit of programming without really knowing it. And I guess I like to think that, what if the standard toolbox that you see in many different GUI applications has a tool that looks like equals X, where you can introduce just a name for something, and then you can drag that name onto multiple properties and multiple objects on your canvas? I feel like very simple notions of giving names to things and very simple relationships between these things seem like they ought to be simple and general enough to be realistically part of end user user interfaces for a variety of domains. But again, I guess it remains to be seen if that’s really going to play out or not.
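As a rough illustration of the “equals X” idea, here is a minimal Python sketch of one name being shared by properties on several canvas objects. The `Name` and `Shape` classes are hypothetical, invented for this note; they are not taken from Sketch-n-Sketch or any real GUI toolkit.

```python
class Name:
    """A named value that several properties can share (hypothetical)."""

    def __init__(self, label, value):
        self.label = label
        self.value = value


class Shape:
    """A canvas object whose properties are plain numbers or shared Names."""

    def __init__(self, **props):
        self.props = props

    def get(self, key):
        prop = self.props[key]
        # Resolve a shared Name to its current value; pass plain numbers through.
        return prop.value if isinstance(prop, Name) else prop


# Introduce a name once, then "drag" it onto a property of two objects.
x = Name("x", 50)
box = Shape(left=x, width=100)
label = Shape(left=x, width=40)

x.value = 80  # a single edit to the name moves both objects
print(box.get("left"), label.get("left"))
```

The point of the sketch is only that a name adds one level of indirection: objects hold a reference to the name rather than a copy of its value.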
That’s why I brought up Maya and 3D Studio Max and the other Autodesk tools. And I neglected to mention this, but another ability that they have that’s sort of universally present in 3D animation tools is that they allow you to make any property on any object in your scene the result of a function of another property. And the way that you can wire them together is extremely open-ended.
And as an early teenager, knowing nothing about programming, knowing frankly, nothing about computer graphics and not being a very good artist, I was still able to pick up one of these tools and just through playing around with the user interface, figure out how to make, for example, glitch art by using the incident angle of the camera on the surface, which is an xyz-position, be used as the RGB color attribute. And so the color of the surface is based on the angle that you’re looking at that surface from. And those sorts of mappings from one data type to another, or one representation to another, or one property to another, and applying a function on that mapping in those programs is a really core part of the UI. And a lot of the UIs are built in terms of those mappings between properties.
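A minimal Python sketch of that kind of property-to-property mapping, assuming the viewing direction is available as an (x, y, z) vector. The function name `direction_to_rgb` and the choice to remap each component from [-1, 1] to [0, 255] are assumptions made for illustration, not how Maya or 3D Studio Max implement it.

```python
import math


def direction_to_rgb(x, y, z):
    """Map a view direction (x, y, z) to an 8-bit RGB triple (illustrative only)."""
    length = math.sqrt(x * x + y * y + z * z)
    if length == 0:
        raise ValueError("direction must be non-zero")
    # Normalize so each component lies in [-1, 1].
    unit = (x / length, y / length, z / length)
    # Remap [-1, 1] to [0, 255] per channel: x drives red, y green, z blue.
    return tuple(round((c + 1.0) / 2.0 * 255) for c in unit)


# Looking straight along +x gives a mostly red surface color.
print(direction_to_rgb(1, 0, 0))
```

Because the function depends only on the direction, the surface changes color as the camera moves around it, which is the effect described above.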
And it feels like when I use other programs that don’t have that ability, it feels like something’s missing, like I’ve been sort of cheated when I use Photoshop or when I use Illustrator and I can’t do that kind of… It gave me an early taste of what it feels like to be a programmer before I actually learned to be a programmer because I was programming my scene. I was programming my art. And I definitely feel like what you’ve just talked about, about wondering about whether something like Sketch-n-Sketch or something like programming by demonstration would find a home in all of these domains. I really think that it would. I really think that there are examples of where that has happened in existing tools.
And unfortunately, if you pick up a program like Maya or Modo and you open it up, it doesn’t look like Excel. It looks like a nightmare version of Excel where instead of just the one ribbon across the top with way too many tools, it has user interface controls everywhere. It has thousands of features. If you right-click in 3D Studio Max, it brings up four context menus around your mouse. These things are… They’re like the Starship Enterprise. It’s enormous and complicated.
And yet as a 12 year old with no internet at the time and no tutorial books and nothing other than the software on my computer and no expert guidance, I was able to learn how to do it because those interfaces are self-revealing. And so that’s why I wanted to ask about the expert experience and about whether you think about the ramp towards expertise. I think that, with the focus on making things approachable by beginners, a lot of times people will react to that desire by making a tool that is meant to be handled with boxing gloves or with… It’s like safety scissors. They sort of try to distill it down to some sort of simplified core essence so that people can’t hurt themselves or be scared away by the complicated user interface.
And yet I think in making these tools that have that programmability behind the scenes, there’s going to be a need to expose more complexity and expose that complexity in a way that’s tractable, in a way that’s approachable, in a way that is self-revealing and discoverable, to borrow a Ted Nelson term, to make the interface explain itself, not to make it intuitive, but to make it so that somebody who’s playing around with it with no other reference can figure out what it does.
And I think Sketch-n-Sketch looks to be a great approach to doing that. The ability to see the change that you’re making graphically or in whatever output domain turn into the change in code, like that connection between the complexity of the code and the simplicity of the output, I think gives people that… It gives them that ability to discover the way that the tool works. And so I am optimistic about that being something that we can look forward to in the future.
What I am not optimistic about and what I would be interested in hearing your thoughts on if you have any, is what it takes to make a tool like this have a place in the market. Because we’ve had past examples like HyperCard dying an early death, even though it enjoyed phenomenal success. And Flash more recently, Flash was a beloved tool for doing interactive art that died, I would argue, because of mismanagement by Adobe, but that’s a whole other matter. It seems like these tools have a really hard time surviving in the market. And so do you think about that at all? And what do you think you might be able to do to help fight that trend?
Yeah, so market, I guess there’s a couple of ways to think about getting, hopefully getting, these kinds of ideas, technologies, tools into the world. And so one is, can this be built into some system, tool, technology that can be marketed, that can be sold? Another is, can it be implemented and developed and released as open source tools that are used by a variety of people, in a variety of settings?
And so, so far as researchers, a small research group, we’ve chosen, we’ve specifically chosen to do things in a clean slate, prototype, toy little setting because it’s been easiest for us to try out ideas in isolation. But then when we have algorithms and ideas that we think are reusable and generalizable enough, of course we write papers and try to explain the main essence of the idea so that hopefully someday in the future, folks that are building industrial scale languages and editors and environments, hopefully, these kinds of ideas make it into those efforts in the long term.
But in the shorter term, I guess I would say that I think there are certain parts of the techniques, at least like some of the bidirectional algorithms that are making pretty modest changes to the existing program, assuming that a programmer has already written a lot of the high level logic of the program. A lot of those algorithms, I think could be scaled up to practical editors for many languages in the not too distant future.
What about artist tools or mixed art and programming tools like HyperCard or Flash?
Yeah, so I think in some sense, I mean, the bar is higher for building really usable tools for artists or programmers or novices that don’t have as much programming experience. I think the easiest realization of these ideas in the world, in the short term, I think are as developer tools because you can imagine building plugins for existing languages. You can imagine building plugins for existing editors for… You can imagine building plugins for Chrome and Firefox that expose these capabilities. So I think in terms of the short term, some of these ideas I think could appear in developer tools much more easily than they could for more ambitious domains.
Because I think for a system like this to be useful, it really has to provide a lot of the capabilities that so many existing application GUI tools already provide. And so in terms of marketability for these kinds of application domains, I certainly don’t have any good answer. It’s hard for me to imagine the situations in which this kind of thing could be a useful marketable tool.
I guess one idea is, thinking about the kinds of people that maybe are learning programming not because they want to be programmers but because they want to create some artwork, graphic design and things like that, they often times will have to learn programming as part of their workflow and their tool chain. But clearly the state of the world is one where you have tools that are good for direct manipulation and tools that are good at programming, but nothing that combines them. I guess, if there’s the right niche application domain, maybe one could focus on building this up for a domain in which there doesn’t already exist a very good solution for the kind of programming side of things. But I guess this is not a really… Yeah, this is not a very helpful answer.
It’s hard. One thing I thought about in trying to find out the difference between why it seems tractable to offer this sort of tooling to programmers and intractable to offer it to artists is that programmers are in the business of assembling things. And that extends to the tools that they use. And so rare is the programmer that is using a stock configuration of Emacs or is using a framework with no middleware. We tend to kind of endlessly customize everything we do by assembling other pieces together with a core structure.
And artists don’t do that. Artists buy a complete workstation off the shelf. If you’re a musician, you buy Ableton Live, and Ableton Live lets you use audio unit plugins and VST plugins. But it doesn’t let you add in a new way of… You can’t add in a score notation for writing sheet music within the Ableton Live environment. It’s a much more closed tool, and it’s expected that you will use the extension points it offers and that’s it, whereas programming tools, the same philosophy applies. You can only use the extension points that exist. But programmers are coming with a mindset that they want to customize so let’s make lots and lots and lots of extension points. Let’s make it so that things are modularized and composable.
And so I almost feel like what it might take to be able to, as a small, independent research team, to be able to make a contribution to artists tools would be to find a domain where the artist tools are more modular and where there is sort of the invitation in that domain for people to come in and add the extra slice of functionality that they can add. And it makes me wonder, maybe like scientific visualization or medical imaging or something like that, maybe there’s a domain out there where that is more common practice. And that might be a sort of a way to get the foot in the door to the arts rather than just continuing to offer things to other programmers.
Mm-hmm (affirmative). Yeah. Yeah, the thing that scares me about scientific visualization, medical imaging, also 3D animation, games, all of which I think are certainly domains that I would hope these techniques will work for and scale to eventually, what makes me scared in the short term is that the math that goes into the computation of the final artifact is much more complex. And so one of the challenges for connecting programs to their output in both directions is, how well can you invert the operations, right? When you make a change to the output value, how do you invert it and map it back to the program that generated it? And the more and more complex math that you have, the more ambiguity, the harder it is to really get meaningful changes back in the inputs to the computation. It’s certainly not impossible I don’t think, but it’s, I think, more challenging than what we’ve been focusing on so far.
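To make the ambiguity concrete, here is a small Python sketch, under the assumption that the output is simply the sum of two program constants. The function `candidate_updates` is hypothetical and is not the algorithm Sketch-n-Sketch uses; it only shows that even for addition, one change to the output maps back to several different changes to the inputs.

```python
def candidate_updates(a, b, new_sum):
    """List some ways to change inputs a and b so that a + b equals new_sum."""
    delta = new_sum - (a + b)
    return [
        {"a": a + delta, "b": b},                  # push the whole change into a
        {"a": a, "b": b + delta},                  # push the whole change into b
        {"a": a + delta / 2, "b": b + delta / 2},  # split the change evenly
    ]


# The program computes 10 + 20 = 30; the user drags the output to 36.
for update in candidate_updates(10, 20, 36):
    print(update, "->", update["a"] + update["b"])
```

All three candidates reproduce the edited output, so a tool has to pick one or ask the user. With trigonometry or matrix math in the pipeline, the set of plausible candidates grows and becomes harder to enumerate.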
And so in the shorter term, still hoping to find more 2D application domains in which this kind of combination could be beneficial. One tool that I haven’t used much, but I’ve heard a lot of good things about is Figma. And I’ve heard there’s lots of plugins that provide a whole bunch of features for Figma. I’m not sure if you think that might be a fruitful platform: if one could build, let’s say, a Figma plugin that provided some amount of abstraction capabilities via code in a relatively lightweight way, whether or not that could be a useful part of the design and prototyping and mock-up workflow that Figma I think is often used for.
Yep, totally could be. I don’t know. I’ve used it but only in the most casual way. I don’t know if they offer that sort of extension, but that kind of a tool absolutely would be a good fit for this.
Another example, not in this creativity kind of space but more in the, again, what I like to think of as office or everyday kind of spaces: products like Coda and Notion have been exploring interesting new ways of breaking down the barriers between all these existing different office applications into more composable and unified ones. But those two tools, at least, don’t expose code to users. So if one could have the nice integrated composable UI features that Coda and Notion expose but also expose the programmatic representation that is behind it and integrate that into the expected user workflow in some way, that’s one way that I would think is worth trying to go.
Thank you for taking so much time.
Great, yeah, no, thank you for coming on the show and for letting me pick your brain about all these different parts of Sketch-n-Sketch. It’s a tool I’ve had my eyes on since the Strange Loop talk. And it’s very exciting for me to be able to learn where it came from and especially to learn all of the different goals that you have for the project, or if not goals, then potential directions that it could go in the future. I enjoyed hearing about that very much.
Yeah. Thanks a lot. This was a lot of fun. Thanks for taking the time and thanks also to you and Steve for organizing this community and providing so many great resources for all of us. So thanks for that.