Floating little leaves of code
A blog is a regularly updated website or web page, typically one run by an individual or small group, that is written in an informal or conversational style.
Maxwell Anselm

Animatable Content in SwiftUI
2024-07-09

Animations in SwiftUI are pretty magical. For example, say you have a progress bar that you expect to have a big jump in progress:

struct SwiftUIView: View {
    @State var progress = 0.0
    var body: some View {
        Button("0%") { progress = 0.0 }
        Button("100%") { progress = 1.0 }
        ProgressView(value: progress)
    }
}

When the value changes, the progress immediately jumps all the way to 100 or back to 0. Not very sexy. Magically, this simple change fixes it:

ProgressView(value: progress)
    .animation(.default, value: progress)

This tells SwiftUI to animate any changes to the progress bar whenever the progress value changes. But what about that .default parameter? That is the type of animation to use; examples of other built-in values are .bouncy and .easeInOut. Even sexier!

But if you try them out, you’ll see they make no difference compared to .default. What gives?

The problem is that it is up to each view to decide how to animate itself. This is where the magic ends: ProgressView doesn’t pay attention to the type of animation at all. It just checks whether it’s been told to animate or not.

But in this case, we have a potential workaround: we just need the progress value itself to animate. If the progress smoothly transitioned between 0 and 1, we could pass the transitional values to the ProgressView and it would look smooth. In other words, we don’t really need to animate the whole ProgressView, just its content. This is what the Animatable protocol solves.

To fix this, we will wrap the ProgressView in an Animatable view that explicitly says what property to animate:

struct AnimatableProgressView: View, Animatable {
    var value: Double
    var animatableData: Double {
        get { value }
        set { value = newValue }
    }

    var body: some View {
        // prevent negative values (they show a spinner instead)
        ProgressView(value: value < 0.0 ? 0.0 : value)
    }
}

This is still a little magical. animatableData is a requirement of the Animatable protocol: while an animation is in progress, SwiftUI repeatedly sets it to interpolated values, which here flow straight into the value property. Now, the ProgressView itself doesn’t need any animations; it just blindly uses value, which is itself animating.

Note that it is actually important to enforce the lower bound. Animation curves are allowed to overshoot their target, for example .bouncy could briefly animate negative numbers into the value variable when transitioning to 0. Since the default behavior of ProgressView is to show a spinner for a negative progress, that would look bad.

General use

This pattern can be generalized to any view that doesn’t animate as you expect but which you can manually update just by setting new values. Create a new view which is Animatable and which has vars for whatever data you want to animate. Declare animatableData to get/set the data.

Note that the type of animatableData must conform to the VectorArithmetic protocol. This can be accomplished with an extension if the type you want doesn’t already conform. If you have two separate values you want to animate on the same view, you can use AnimatablePair to group them together. If you have more than two values to animate, I recommend putting them all in one struct and making the whole struct conform to VectorArithmetic in the naïve way: component-by-component.
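As a rough sketch of that component-by-component approach (the struct and property names here are made up; the members shown are the operations VectorArithmetic requires on top of AdditiveArithmetic, written as a plain struct so the arithmetic can run without importing SwiftUI):

```swift
// Hypothetical pair of values animated together. In a real view you would
// import SwiftUI, declare `ProgressPair: VectorArithmetic, Equatable`, and
// expose it via animatableData just like the Double example above.
struct ProgressPair {
    var downloaded: Double = 0
    var installed: Double = 0

    // AdditiveArithmetic requirements
    static var zero: ProgressPair { ProgressPair() }

    static func + (lhs: ProgressPair, rhs: ProgressPair) -> ProgressPair {
        ProgressPair(downloaded: lhs.downloaded + rhs.downloaded,
                     installed: lhs.installed + rhs.installed)
    }

    static func - (lhs: ProgressPair, rhs: ProgressPair) -> ProgressPair {
        ProgressPair(downloaded: lhs.downloaded - rhs.downloaded,
                     installed: lhs.installed - rhs.installed)
    }

    // VectorArithmetic requirements: scale each component by the
    // interpolation factor...
    mutating func scale(by rhs: Double) {
        downloaded *= rhs
        installed *= rhs
    }

    // ...and a squared norm, which SwiftUI uses to decide when an
    // animation has settled
    var magnitudeSquared: Double {
        downloaded * downloaded + installed * installed
    }
}
```

The key point is that every operation just applies the corresponding Double operation to each component, which is exactly what “the naïve way” means.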

Stack Overflow is working as intended
2024-01-06

I see Stack Overflow getting a lot of hate these days, especially from less experienced programmers. The common complaint is that the site has a toxic culture that punishes and humiliates newbies, discouraging participation. That is true, but it is also intended. Weirdly, this is also what makes Stack Overflow such a useful site. In this blog post, I will explain why that is as well as how to rake in those sweet, sweet fake Internet points on Stack Overflow.

Instructions unclear

The welcome page for Stack Overflow says:

[This] is a question and answer site for professional and enthusiast programmers.

So that means if you have a programming question, you should click the “Ask Question” button, right?

WRONG.

This misleading introduction is the primary cause of almost every complaint I have seen about Stack Overflow. Yes, the site is in the format of a Q&A, and yes, it is programming focused, but there are a million completely valid programming-related questions out there that will get you downvoted into oblivion. What determines whether a question will win you those Internet points or shame you into quitting programming? The easiest way to know is to first understand what the site really is.

The true goal

Here is what the welcome page should say:

[This] is a database of every programming question that can have an objectively correct answer. Every such question you could ask has already been asked on this site and every reasonable answer has already been given. All questions and answers are already ranked according to their utility to the programming community at large. If you have a question, simply search for the existing instance of it and upvote it, then upvote the answer you found most useful. In the extremely unlikely event that such a question has not been asked or answered, you may attempt to do so.

This explains almost all downvotes you will get on the site. Let’s see why.

Valid questions that get downvoted and why

“How do I learn ABC?” or “Which tool should I use for XYZ?” will always get downvoted. These questions do not have objectively correct answers since they depend on the individual asking and their ultimate goals, which are both impossible to capture in the question. Even if they could be answered, the “right” answer would likely change dramatically over time as programming evolves. This runs counter to the site’s goal of being a definitive database of objectively ranked answers.

“What is the bug in this complete program? (Source code attached)” will almost always get downvoted. It is extremely rare that this has an objectively correct answer. Usually the expected behavior of the program is entirely unclear from the question and the reader is left to infer that from context clues. This makes it hard to understand what the bug even is, so it gets downvoted. Even if the expected behavior is clear, the design goals of the program itself are usually unknown. Is this a toy project meant for learning? Is this a throwaway utility that will be used once and never again? Is this for enterprise software that needs to be understood and maintained years from now? Does it need to run under specific memory/speed constraints? If you manage to clear up all of those details in the question you probably won’t get downvoted. If the resulting question ends up relatively short, you might get an answer and upvotes. Answering these kinds of questions is extremely labor-intensive, so they often get ignored.

“How do I do [COMMON TASK] in [COMMON LANGUAGE]?” will almost always get downvoted; not because it’s a bad question, but because there is probably an existing instance of the question with answers! Want to reverse a string in C#? Want to invert a map in JavaScript? Great! For the love of God do not ask on Stack Overflow. First ask yourself, “Is it likely that a whole bunch of people in the history of programming have tried to do something like this before?” If so, it’s probably already on Stack Overflow. After all, that is the whole point of the site: to be a database of such questions! If you care about imaginary Internet points, you are much better off spending 20 minutes rewording Google searches until you get a Stack Overflow result than asking a new question.

“How do I do [BIZARRE TASK] in [COMMON LANGUAGE]?” will almost always get downvoted. These are questions like “How do I delete the fourth instance of the letter J from a string in Go?” or “How do I treat a UDP socket as if it’s a void heap pointer in C++?” These questions may have objectively correct answers, but they make the reader question the asker’s sanity. If it seems like no one else would ever benefit from such a question being answered, you get downvoted. Such questions, even if objectively answered, would fill the database with noise, making it harder to use Stack Overflow for its intended purpose. Usually these questions are posed by two kinds of people:

  1. Students who were given homework. Homework contains these bizarre tasks specifically so that students cannot easily Google the answers. Filling Stack Overflow with a database of every weird scenario a programming teacher cooked up would qualify as noise, hence the downvotes.
  2. Someone suffering from the XY problem. It is more likely that people will search for ways to do X (rather than the asker’s weird fixation on Y). Thus Stack Overflow punishes questions about Y and rewards only those about X.

Sometimes, more often than I’d like, you will get downvoted even for a question with objectively correct answers, which has never been asked before, which would be helpful to other people. Why? Because it’s the Internet and it’s full of lazy assholes with poor reading comprehension. Usually they will leave a comment explaining why. Don’t argue, just edit your question to address their comment (usually requires adding further clarification), and leave a comment @ing them saying their concern was addressed. Though it may grate on your nerves, doing this is important because potential answerers often ignore questions with downvotes, so the extra effort maximizes your chance of a successful answer. This also feeds into the site’s ultimate goal of becoming the definitive question database since placating the asshole likely resulted in your question becoming even more searchable and even less ambiguous such that any future prospective asker will stumble upon your question before asking.

Valid answers that get downvoted and why

Answers that just link to some external site which has an answer will always get downvoted. Remember, Stack Overflow wants to be the complete database itself, so external links are counter to that goal. An external link could break, making what was once a valuable answer useless. You can include an external link for extra information, but Stack Overflow always wants the answer to work on its own.

Answers that just include code with no explanation will almost always get downvoted. Without an explanation of what the code does or why, it is hard to evaluate the quality of the answer. Very simple self-evident code is sometimes okay.

How to easily get points on Stack Overflow

First, I do know what I’m talking about. I currently have 21,453 points, 5 gold badges, and 50 silver badges on Stack Overflow. So I’m not talking out of my ass.

Remember that it’s all about the database: improve the quality of the Q&A database and you get points. Unintuitively, it’s not about asking a lot of questions or posting a lot of answers!

Fill in Google gaps

If you make what seems like a reasonably clear Google search and get no Stack Overflow hits, and then spend 45 minutes trying different searches until you find your answer, that often indicates an opportunity for points. Take your original Google search and ask it as a question! Even if it gets closed as a duplicate, you will still rake in the points in the long run. Why? Because chances are someone else will make that reasonably clear Google search, see your question, see that it has answers, and upvote it! This makes sense even if it gets closed as a duplicate because your more-easily-Googleable question improves the utility of the database!

A lot of the following tips also tie into this same mechanism: make the Google results for Stack Overflow better and you will be rewarded.

Chase new technology

The main situation where the database is missing questions and answers is for new tech that doesn’t have wide adoption. If you’re an early adopter and need to figure out stuff on your own, ask those questions! Even if you already figured out the answer, ask them anyway (remember that you can answer your own question)! If the new technology takes off, you’ll start raking in points once the Google searches start hitting your questions.

This is actually not that easy because you’ll be racing against millions of other programmers trying to do the same thing. But as the profession expands, you might be able to find a niche where you can be the first to ask.

Turn comments into answers

For some reason, people love to answer questions in the comments. This is bad for the database! Comments have worse visibility and cannot be as easily ranked and voted on as answers. It may seem cheap, but if you ever see someone answer in the comments, or a highly upvoted answer sucks but has better info in the comments, just steal it! Consolidate the valuable comment info into a single, cohesive answer and you’re helping everyone else see that! It’s better for the database, so you will get points for this.

Leave clarifying comments

You don’t get many points for comments, but it can add up. Highly voted answers get high traffic by definition, so see if you can improve them, even in little ways. Is it missing a link to official documentation? Add it in a comment. Does a function have an optional parameter that wasn’t explained? Leave a comment explaining the broader usage of it.

Answer by tag

I almost never look at the front page. Technically you can get crazy points for answering there due to the high visibility but practically it’s extremely difficult. Instead, try to find a few tags that you are very experienced with and follow them. Remember to also set related tags which you know nothing about as “ignored”.

For example, I’m very good with Ruby but I know nothing about Ruby on Rails, so I follow the former and ignore the latter. When I’m in the mood for trying to answer on Stack Overflow, I can click on the Ruby tag and I’m much more likely to see questions in my area of expertise. I find this much easier to produce high quality content for the database rather than spreading myself thin across all of my areas of expertise. If I see nothing interesting for Ruby, I’ll go down my list of followed tags.

By focusing narrowly for answers and comments, you’ll have a much higher chance of being first to respond. Faster responses are valuable to the database, and also to you, since duplicate answers are almost always downvoted.

Wait

Remember that the points represent general utility to the programming community, not volume of work. One of my single most successful days on Stack Overflow was when I posted an accepted answer which quickly racked up 11 upvotes. That’s 135 points! But it was a total quirk: the question was about a very esoteric topic and I think I just got lucky with a bunch of bored programmers browsing the site that day. The question has had literally no activity since.

Don’t waste your time trying to hit highs like this.

Contrast that with a few answers I’ve left on highly voted questions (lots of Google hits) that simply fill in the gaps with more situational approaches not mentioned in the top answers. Just two of these answers (which aren’t even accepted) are worth over 2000 points! The questions are very commonly Googled, so even though my answers are more situational, by pure volume they happen to help out a lot more people! These answers reliably get a couple upvotes a week. By having enough answers like that out there and by waiting, you will eventually develop a passive income on Stack Overflow.

When you see someone with thousands of points like me, remember you’re not necessarily seeing thousands of hours of effort. They could just have a few highly useful contributions.

The value of Stack Overflow

All of the implicit rules certainly create a hostile environment for the clueless newbies attracted to Stack Overflow. But they are also what make it such a precise, highly focused, Q&A machine. That is ultimately the genius of Stack Overflow: if you try to game the system, you just end up making the database better. That’s what all of my previous tips accomplish!

Even in this age of generative AI, we need Stack Overflow around to extract all of that human knowledge and to store it in a highly structured, searchable database so that the large language models can ingest it. That is probably what 50% of ChatGPT-written code is at the end of the day: a glorified Stack Overflow search.

Is it difficult and frustrating to ask and answer questions on Stack Overflow? Absolutely. I wouldn’t have it any other way.

A universal definition of string length
2023-11-14

There are many posts on the Internet discussing the complexity of computing string length these days, but few make a clear recommendation of the right thing to do. In some sense, that’s probably because there is no single right answer.

But when it comes to free-form text (text which can contain non-English characters, emojis, etc.), I would argue that there is one very good answer and a bunch of crappy ones. So I’ll skip straight to giving you that good answer now:

The one good universal definition of free-form string length

When represented as valid Unicode, you should count the number of Unicode scalar values in the string.

You may want to handle control characters in a special way, but we’ll get into that later.

Why do you need a universal definition of string length?

If your software ingests strings that need to be saved somewhere, you should care about string length. Remember that a string can contain the entire collected works of Shakespeare. Think about what happens if all of your strings are that size.

And unless you have the luxury of writing all of your software in a single language, running on a single platform, with a single point of data entry, you need to worry about how the same string can be represented in different ways depending on the circumstances. If you have two pieces of software that can each input a string that gets saved in the same database, you don’t want them counting the length in different ways. Then you risk one piece of software being able to load strings from the database that it won’t let the user write back to it. That would be annoying.

So why is this not trivial? Because everyone wants to measure string length in “characters”. But what is a “character” anyway?

  1. In most software you can count characters in a string by moving a cursor through the string or trying to select portions of the string and seeing what can/cannot be selected. What this actually counts is grapheme clusters: the “horizontally segmentable” parts of the string.
  2. Each grapheme cluster is a sequence of 1 or more graphemes. Annoyingly, the Unicode standard calls these “characters”, which I am highly suspicious of. Most modern software allows users to input grapheme clusters with a single click or tap, and does not clearly subdivide them into graphemes.
  3. Each grapheme is a sequence of 1 or more code points, which are the things you can actually look up in the Unicode tables e.g., U+1F4A9
  4. Some code points (surrogates and noncharacters) are used only in certain encodings or for internal Unicode use and never represent characters in Unicode. All other code points are Unicode scalar values.
  5. Each Unicode scalar value is represented by 1-4 bytes (depending on the encoding)

So there are 5 different ways to count string length: grapheme clusters, graphemes, code points, scalar values, and bytes. Why did I pick scalar values? Anything above code points is a non-starter: graphemes and grapheme clusters can contain an arbitrary number of code points and thus an arbitrary number of bytes! A length limit is pointless if a single “character” can be arbitrarily large. 3 and 5 are also no good because they are encoding-dependent and thus not universal: strings usually have different lengths when encoded in UTF-8 and UTF-16, and may contain different code points!

That’s why counting Unicode scalar values is the one good measure of string length. The nice thing is that most characters in most languages are represented with individual scalar values, so most of the time your users won’t be too confused by this method of counting. They may just need to get used to certain fancy characters like emojis consuming more of their string length than others.

Control codes

C0 and C1 are portions of the Unicode table containing control codes. Technically these are scalar values like any other and can be counted normally, but if you are specifically handling user-created text as opposed to a terminal user interface you probably don’t want them.

Most of them exist purely to support backwards compatibility with cryptic legacy systems e.g., U+009D “operating system command”. Some are useful but have better Unicode alternatives. For example, the ambiguous “new line”, which often uses some combination of the control codes U+000A and U+000D, can be better represented with U+2028 “line separator” and U+2029 “paragraph separator”. In my applications, I simply strip control codes from the text and calculate the string length after.
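As a minimal sketch of that strip-then-count approach in Python (the function name is mine; Unicode category “Cc” covers exactly the C0 and C1 control codes, while U+2028 and U+2029 have categories Zl and Zp and so survive the filter):

```python
import unicodedata

def countable_text(text: str) -> str:
    # drop C0/C1 control codes (category "Cc"); keep everything else,
    # including U+2028 "line separator" and U+2029 "paragraph separator"
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cc")

# count length after stripping; in Python, len() counts scalar values
length = len(countable_text("line one\r\nline two"))
```
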

How to count Unicode scalar values

It is easy but often not obvious how to count scalar values. This happy family will help us out: 👨‍👨‍👦‍👦. This family contains 7 scalar values (the right answer), but 1 grapheme cluster (on most modern software), 11 UTF-16 code units, 22 bytes in UTF-16, and 25 bytes in UTF-8.
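If you want to double-check those numbers yourself, a quick sketch in Python (where len counts scalar values and an explicit encode makes the byte counts visible):

```python
# man, ZWJ, man, ZWJ, boy, ZWJ, boy = 👨‍👨‍👦‍👦
family = "\U0001F468\u200D\U0001F468\u200D\U0001F466\u200D\U0001F466"

assert len(family) == 7                            # scalar values
assert len(family.encode("utf-16-le")) == 22       # bytes in UTF-16
assert len(family.encode("utf-16-le")) // 2 == 11  # UTF-16 code units
assert len(family.encode("utf-8")) == 25           # bytes in UTF-8
```
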

Here’s how to do it right (and wrong) in a bunch of languages. If you know how to do this in a language I didn’t include here, please leave a comment!

C#

str.EnumerateRunes().Count() // 👍
str.Length // ❌ number of UTF-16 code units

Go

len([]rune(str)) // 👍
len(str) // ❌ number of bytes when UTF‑8 encoded
utf8.RuneCountInString(str) // 👍 also works

JavaScript

[...str].length // 👍
str.length // ❌ number of UTF-16 code units

Kotlin (JVM)

str.codePointCount(0, str.length) // 👍
str.length // ❌ number of UTF-16 code units

Python

len(str) # 👍

Ruby

str.length # 👍

Swift

str.unicodeScalars.count // 👍
str.count // ❌ number of grapheme clusters
Android Versions
2023-10-02

So you want to write Android apps? Well first you have to understand the incredibly complicated world of versions in Android. Start by answering these questions.

  1. What version of the Kotlin language do you want to write in? You should definitely be writing in Kotlin. We’ll call this your Kotlin version.
  2. What version of the Java language do you want to write in? You really shouldn’t be writing Java but sometimes you gotta. Just the major version is enough. We’ll call this your Java version.
  3. What’s the oldest Android version you want your app to run on? Great. I don’t care. Instead find the oldest Android API level you want your app to run on (e.g. Android 13 is API level 33). We’ll call this your minimum Android version.
  4. What’s the newest Android version you’re actually going to test your app on? Again, I don’t care. Get that version’s Android API level. That’s your target Android version.

Fill in the basics

Kotlin

Use your Kotlin version as the version of the kotlin gradle plugin applied in your build script:

plugins {
    id("org.jetbrains.kotlin.<...>") version "whatever"
}

<...> is probably gonna be android or multiplatform depending on how cool you are.

A general word of warning: this is the only place the Kotlin version must appear. Other Kotlin plugins/libraries may look like they share this version but that is only a loose convention. If those plugins/libraries have bugs and need to be patched they will intentionally diverge from the language version! Don’t reuse the Kotlin version in your build script for other things.

Java

Use your Java version to fill in these parts of your gradle build script:

android {
    compileOptions {
        sourceCompatibility = JavaVersion.VERSION_WHATEVER
        targetCompatibility = JavaVersion.VERSION_WHATEVER
    }
    kotlinOptions {
        jvmTarget = "WHATEVER"
    }
}

I love how these are totally different types in the script.

These versions should always match! If you need to use a different Java version for some of your code you need to split it into a separate gradle project so it gets its own build script and can get its own versions.

Note that this only affects how you write and build your Java code. Because of Android’s desugaring process, your app can still run on Android versions with older JVMs. The compiler will let you know if you pick a version that’s too new.
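One related detail: language features (lambdas and friends) are desugared automatically, but using newer java.* library APIs (like java.time) on old devices additionally requires core library desugaring to be switched on. A sketch of what that looks like in the build script (the dependency version shown is illustrative; check the latest desugar_jdk_libs release):

```kotlin
android {
    compileOptions {
        // opt in to backporting newer java.* library APIs to old API levels
        isCoreLibraryDesugaringEnabled = true
    }
}

dependencies {
    // version is illustrative; use the latest release
    coreLibraryDesugaring("com.android.tools:desugar_jdk_libs:2.0.4")
}
```
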

Android

Your minimum and target Android versions also go in the build script:

android {
    defaultConfig {
        minSdk = 69
        targetSdk = 69 // can be greater than minSdk
    }
}

The rest

Now you need to pick a bunch of other versions, all constrained ultimately by the answers you chose above.

Android Compile SDK

When you compile your Android app, it will compile it against a certain version of the API. This should be greater than or equal to your target Android version. Technically you can always choose the latest API version, but practically it never needs to be greater than your target version. Choosing a larger version gives you access to the newer APIs earlier, letting you decouple your development cycle slightly from Android’s release cycle.

Whatever version you choose, you must specify it in your gradle build script. While you’re at it, you also get an easy one: you need build tools and those can pretty much always be the latest version since they’re usually backwards compatible, so just look up the latest build tools and specify it. You can use older tools, but not older than the compile SDK version.

android {
    compileSdk = 69 // can be greater than targetSdk
    buildToolsVersion = "420"
}

Technically you can leave out build tools and one will be picked for you but it’s a better idea to be explicit so you can easily pull in a patch if one is released.

Android Gradle Plugin (AGP)

If you’ve watched my talk about gradle you know that it isn’t so much a build tool as it is a tool for making build tools. AGP is the actual build tool for building Android apps. Your script needs to apply either the com.android.application or com.android.library plugin. You want the latest version of this plugin since newer versions can fix build issues and improve build performance. Hopefully there isn’t a reason you can’t use the newest version. Foreshadowing.

plugins {
    id("com.android.application") version "whatever"
}

Gradle

Oh yeah, you needed gradle set up. Why are we only getting to it now? Because you don’t care. You want the latest version of gradle so that your build runs super fast and doesn’t have any compiler issues. Hopefully there isn’t a reason you can’t use the newest version. Foreshadowing.

You should be using gradle wrapper and setting the version in gradle-wrapper.properties:

distributionUrl=https\://services.gradle.org/distributions/gradle-whatever-bin.zip

Android Studio

Yeah you need Android Studio too. Again, latest version.

Dependency Hell

And now all the foreshadowing pays off. Everything is broken! 💥

Well, maybe. Why? Because AGP, Gradle, Android Studio, Kotlin, and Java can all be incompatible with each other. Yay! There’s no simple answer here. You have to go back and start downgrading stuff until it starts working or, if you can’t find a valid combination, change one of your original four answers to make it possible. Good luck.

The most important things to keep consistent here are Kotlin, Java, AGP, and your Android versions. If you get them wrong, you won’t necessarily see a spectacular failure at build time! It may fail…

  • at runtime
  • when running on a particular Android version
  • when building for release
  • any combination of the above

Be very careful when changing the versions of any of these things!

Be sure to test your app thoroughly on your min and target Android versions. It can break even when you didn’t change them!

The AGP release notes are pretty good at capturing possible incompatibilities, but ultimately you just have to try it.

More dependency hell

The story continues with every library you include. Want to grab the latest version of some library? Well that might require you increasing your Java target compatibility, which means you need to update Kotlin, which means you need to update AGP, which means you need to update Gradle, which means… etc.

Easy Animations in DragonRuby with Enumerators
2023-02-08

I’ve always struggled with doing programmatic animations in game engines, and DragonRuby is no exception. The result is always very math heavy, easy to screw up, and hard to understand even after you get it working. Say you want to animate a sprite moving back and forth with the following parts to the animation:

  1. keep the sprite stationary on the left for 0.5 seconds
  2. over the next 1 second, smoothly move the sprite to the right
  3. keep the sprite stationary on the right for 0.5 seconds
  4. over the next 1 second, smoothly move the sprite to the left
  5. repeat

Doing that using the built-in animation functions in DragonRuby involves adding code like this to your tick method:

class Numeric
  # linear interpolate from start to finish as this number varies from 0 to 1
  def lerp start, finish
    ((finish - start) * self + start).to_f
  end
end

# determine total animation length ahead of time
total_duration = 3.seconds
# mod the tick count so animation will repeat
t = args.state.tick_count % total_duration

# first, guess that we're in the left-to-right part
progress = args.easing.ease(
  0.5.seconds, # delay before this part of the animation
  t,           # tick count (modified, since we're looping)
  1.seconds,   # duration of this part of animation
  easing_func  # a proc to smooth the animation (more on this later)
)
args.state.sprite_pos = progress.lerp(320, 960)
# if that part is done, we must be in the right-to-left part
if progress >= 1
  progress = args.easing.ease(
    2.seconds,  # delay before animation, taking into account the previous parts
    t,          # remaining args same as before
    1.seconds,
    easing_func
  )
  args.state.sprite_pos = progress.lerp(960, 320)
end

This sucks.

If you’ve never struggled with this stuff before, I’ll spell it out. It sucks that we have to calculate the overall animation duration ahead of time; it sucks that we have to calculate a cumulative delay for each additional part of the animation; it sucks that we need an explicit mechanism for determining which part of the animation we should be in; it sucks that each frame has to re-compute parts of the animation that aren’t active; it sucks that this code will get harder and harder to reason about as we add more phases to the animation; and above all it sucks that our original human-readable description of the animation is now completely hidden in the code. Good luck reading this tomorrow and understanding how it achieves the desired behavior.

But two months ago, I stumbled upon a blog post about how to solve this sort of problem in Unity. I was curious whether there’s an analogous way to apply the technique to DragonRuby and I finally got the time to figure it out.

Enumerator

First, a brief diversion into a very cool part of Ruby: enumerators. In Ruby we have simple loops:

i = 0
loop do
  puts i
  i = 3 * i + 4
  break if i > 100
end
# outputs 0, 4, 16, 52

But loops are all-or-nothing: once you start running them you have to wait for them to finish… or do you? Enter Enumerator: with it you get fine-grained lazy evaluation:

i = 0
e = Enumerator.new do |yielder|
  loop do
    yielder << i
    i = 3 * i + 4
    break if i > 100
  end
end
e.next # 0
e.next # 4
e.next # 16
e.next # 52
e.next # raises StopIteration

This is magic! It looks misleadingly simple because the Ruby interpreter is doing some cool stuff under the hood for us. That yielder << i line is the real key: that’s where the interpreter suspends execution of the block and saves it to memory. The enumerator remains there, dormant, until we call next and it briefly springs to life until the next yielder << i line is hit. The other key part is that the enumerator is constructed with a block so, like all blocks, it captures its local scope. That means the i variable remains accessible to it. That would be the case even if we were to return the enumerator from this scope: the local variable would be gone but the enumerator asleep on the heap would still have a reference to it until it is done enumerating.
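To make the scope capture concrete, here is a minimal, DragonRuby-free sketch (the helper name make_counter is my own): the local variable outlives the method that created it, because the sleeping enumerator still holds a reference to its binding:

```ruby
def make_counter
  i = 0
  # the block captures i; the enumerator keeps that binding alive
  Enumerator.new do |yielder|
    loop do
      yielder << i
      i = 3 * i + 4
      break if i > 100
    end
  end
end

e = make_counter # the method has returned, but the enumerator still sees i
e.next # 0
e.next # 4
e.next # 16
```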

This is exactly what we need to solve our animation problem.

eease

A simple tool:

def ecount duration, &blk
  enum = (0...duration).each
  return enum unless blk
  enum.lazy.map &blk
end

ecount(30) # construct an enumerator that yields 0..29
ecount(2.seconds) # construct an enumerator that yields 0..119

In DragonRuby, x.seconds == x * 60 since it runs at a constant 60 FPS. Maybe you see where this is going?

# in tick
args.state.animation ||= ecount(2.seconds) { |i|
  puts "Animation is on frame #{i}"
}

begin
  args.state.animation.next
rescue StopIteration
  # animation is done
end

By calling next each frame, this enumerator now represents a real-time animation (currently just “animating” some text into the console, but you get the idea)! Already this solves a lot of our problems:

  • We no longer need to externally manage the duration of the animation. It remembers what duration it was initialized with and it tells us when it is done (via StopIteration)
  • This doesn’t require precomputing an absolute start-time based on the tick count. The animation starts whenever you first call next
  • This no longer wastes time on previous stages of animation. Ruby pauses the enumerator between frames and resumes it right where it left off
  • We can easily construct an array of enumerators to manage a large collection of simultaneous animations, all running independently
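If you want to poke at ecount outside DragonRuby, all you need is a stand-in for Numeric#seconds (assuming the fixed 60 FPS convention); ecount is restated here so the snippet runs on its own:

```ruby
# stand-in for DragonRuby's Numeric#seconds (fixed 60 FPS assumed)
class Numeric
  def seconds
    self * 60
  end
end

# ecount, restated so this snippet is self-contained
def ecount duration, &blk
  enum = (0...duration).each
  return enum unless blk
  enum.lazy.map &blk
end

ecount(30).to_a.length        # 30
ecount(2.seconds).to_a.length # 120
ecount(3) { |i| i * 2 }.to_a  # [0, 2, 4]
```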

But we’re not done yet. Now that we can easily start independent animations, DragonRuby’s convention of basing animations on tick counts is too restrictive. Many animations don’t care how many tick counts came before they started or what tick count it is now, they only care about how far along they should be in their particular state change. So a new tool is needed:

def eease duration, easing_func, &blk
  Enumerator.new do |yielder|
    if duration == 1
      yielder << blk[easing_func[1.0]]
    else
      last_i = (duration - 1).to_f
      (0...duration).each do |i|
        yielder << blk[easing_func[i / last_i]]
      end
    end
  end
end

This is similar to ecount except we always yield a value from 0 to 1, representing the progress through the animation, passed through an easing function. An easing function is a function that takes an input from 0 to 1 and outputs something from 0 to 1, such as GTK::Easing.quad. The idea is that eease is the enumerator version of DragonRuby’s ease.
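If you don't have GTK::Easing handy, any 0-to-1 function will do; a quadratic ease-in is just squaring (a stand-in I'm assuming matches the shape of quad, not DragonRuby's exact implementation):

```ruby
# a quadratic ease-in: slow start, accelerating finish
# (a stand-in; GTK::Easing is only available inside DragonRuby)
quad = ->(t) { t * t }

quad[0.0] # 0.0
quad[0.5] # 0.25
quad[1.0] # 1.0
```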

Instead of taking a start time it starts whenever you first call next, it doesn’t need a tick count because it tracks that internally, it still gets a duration and an easing function, and instead of returning its progress, it yields it so you can encapsulate the entire animation in the construction of the enumerator. Using it looks like this:

args.state.animation ||= eease(2.seconds, GTK::Easing.method(:quad)) { |t|
  # you'll see the progress slowly accelerate over time, due to Easing.quad
  puts "Animation is #{(t * 100).floor}% complete"
}

We have what we need to define the individual pieces of our animation, but now we want to wrap it up in one parent animation, with the parent mainly delegating to the child animations, but still doing some extra work like setting up the initial state and looping the whole thing. This will be handy:

class Enumerator::Yielder
  def run enum
    enum.each { |v| self << v }
  end
end
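Before wiring this into the real animation, here is a tiny self-contained demonstration of the side-quest behavior: the outer enumerator delegates to a child, then picks up right where it left off:

```ruby
class Enumerator::Yielder
  def run enum
    enum.each { |v| self << v }
  end
end

outer = Enumerator.new do |yielder|
  yielder << :intro
  yielder.run [:a, :b].each # side quest: drain the child enumerator
  yielder << :outro
end

outer.to_a # [:intro, :a, :b, :outro]
```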

Now a single enumerator’s yielder can go off on a side quest to run another animation to completion before continuing on to the next stage of animation. Combined with the built-in Enumerator::+ that concatenates enumerators to run one after the other, finally we can rewrite our original animation:

args.state.animation ||= Enumerator.new { |yielder|
  args.state.sprite_pos = 320
  loop do
    yielder.run(
      ecount(0.5.seconds) +
      eease(1.seconds, easing_func) { |t|
        args.state.sprite_pos = t.lerp(320, 960)
      } +
      ecount(0.5.seconds) +
      eease(1.seconds, easing_func) { |t|
        args.state.sprite_pos = t.lerp(960, 320)
      }
    )
  end
}

args.state.animation.next

This now reads very similarly to our original English description without needing any comments! I haven’t even mentioned some of the bonus benefits yet. Now that we can uniformly represent all of our animations as Enumerator objects, we get fun new options like easily delaying frames (skip a call to next) to simulate lag, easily skipping frames (call next twice) to speed up animations after the fact, easily reset or replace animations while they are running (reassign the variable that stores the animation), etc.
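For example, fast-forwarding really is just extra next calls. A sketch (using a bare range enumerator as a stand-in for any animation): driving a 10-frame animation two frames per tick finishes it in 5 ticks:

```ruby
anim = (0...10).each # stand-in for any 10-frame animation enumerator
ticks = 0
loop do
  2.times { anim.next } # two frames per tick; StopIteration ends the loop
  ticks += 1
end
ticks # 5
```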

I’m still playing around with this pattern and exploring the possibilities, but I would love to see something like this added to DragonRuby’s open source module at some point. I may do it myself.

But in the meantime, you can accomplish a lot with just ecount, eease, and Enumerator::Yielder#run. All of the code in this blog post is licensed CC0 so go forth and make stuff!

Maxwell Anselm
Gradle still sucks
2023-01-06
https://silverhammermba.github.io/blog/2023/01/06/gradle

I wrote a blog post a while ago complaining about Gradle, mostly from a position of ignorance: I had never really taken the time to properly learn the tool and I was frustrated while trying to stumble my way through working with it. But luckily, my company encourages learning and experimentation, so I was given an entire week (40 whole working hours) to do nothing but learn Gradle. So I did. And I still hate it, but for different reasons.

How to learn Gradle

During my period of intense frustration with Gradle I found this condescending blog post claiming that anyone who dislikes it is simply using it wrong. So when I finally got the opportunity to learn it, I was adamant that I would learn it the “right way”. No outdated tutorials, no hacks, no workarounds. I wanted to learn only the modern, idiomatic, as-God-intended method of Gradle usage.

To be fair, this is always the best way to learn any tool. Always be wary of learning something through only practical examples, how-to guides, or copy-pasted code, because we programmers are universally terrible at everything we do, and if you learn that way you will invariably end up learning someone else's bad practices. It is always better to go back to first principles: find resources that zoom out and tell you what the tool is for, how it is generally intended to be used, and the overall design patterns that go with it. Ideally, only trust sources closely associated with the development of the tool itself, since second- and third-hand sources can accidentally introduce their own baggage.

Having learned idiomatic Gradle, I can say that almost none of the resources out in the wild are good. The good list is extremely short:

  • Understanding Gradle is a great video series walking through most of the basic pieces of Gradle and what they are for. It doesn't have enough detail to teach you how to get a fully idiomatic build, but at least it gives you a sense of what kind of stuff you should be doing.
  • The official docs on structuring large projects present similar information in text form. I object to the word "large" in the title, since it makes the approach sound exceptional in some way, as if most Gradle projects won't benefit from it. A more accurate title would be "structuring nontrivial projects". If you're doing some toy project in Gradle, sure, you don't need this. But any serious project should be aware of these practices.
  • The idiomatic Gradle repo is a complex project applying all of the best practices. Not super useful when you're first learning because it will look like alien hieroglyphs, but once you start to grok Gradle, it provides lots of interesting examples to consider.

Probably the most surprising discovery for me was just how unhelpful most of the official Gradle docs are. Many of them are outdated, incomplete, or so specialized that they're useless for general learning.

What is Gradle?

I won't try to fully explain Gradle; that's what the links above are for. But I can tell you the one reason you are probably failing to understand Gradle. First off, pretty much all of the common explanations given online are wrong. Well, not exactly wrong, but misleadingly simple.

Here is why you actually don’t get it: Gradle is not a build tool, rather it is a platform for building build tools. As soon as you stop seeing it as a simple means of hooking together tasks and dependencies you’ll start seeing why it is so complex.

Why does it suck?

I love this quote by Peter Bhat Harkins:

One of the most irritating things programmers do regularly is feel so good about learning a hard thing that they don’t look for ways to make it easy, or even oppose things that would do so.

I almost fell for this trap. Once I started actually solving problems with Gradle, I noticed that I was talking about its complexities in a more positive light. But that's stupid. When I took a step back I realized that all of the problems I had with Gradle before still existed; I had simply learned how to work through them to get work done. And the fundamental problem with Gradle is that it simply does not justify these complexities.

A case study

To learn Gradle, I set myself a task of incorporating some custom code generation into our build process. This is the kind of thing I can easily do with shell scripting, but I wanted it to be a proper part of Gradle since it perfectly fits the traditional build-tool model of monitoring when an input file changes to determine when to run a task that generates an output file.

At a high level, the process is this:

  1. Run some command line stuff to prepare the code generation tools
  2. Run a script on some input files to produce intermediate metadata needed for code generation
  3. Run a script on the intermediate metadata to produce the code
  4. Compile the code normally

The goal is that I can run a single build task and it checks if either the original input files or metadata have changed and reruns whatever parts of the process are needed before building my code.

Could I have accomplished this just by opening up build.gradle and writing a bunch of custom tasks? Absolutely. But doing it that way would be like writing your entire program logic inside of main: technically feasible, but not idiomatic.

So here is what I had to do in order to do it the right way:

  1. Since the part of the project relating to input files and metadata is logically separate from the rest of our app, split it into a Gradle subproject to follow best practices
  2. Now that we have separate projects, best practice dictates that each has its own “convention plugin” to distinguish the type of build it is performing, so define a sub-build for storing custom plugins
  3. Develop a custom plugin for the metadata generation. This involved going down a huge rabbit hole of trying to use an open source community plugin since technically some of the code generation stuff I was using could be run on the JVM and thus could be built into gradle
  4. Find out that the community plugin had a critical bug in one of my required use-cases so I had to throw it out and start over. Develop a complete custom Gradle plugin for running the command line steps I had been doing manually before
  5. Best practice is to define custom task types in your convention plugin rather than customizing built-in tasks, so subclass gradle’s Exec task to specify the kind of inputs and outputs needed for each command
  6. Work around the fact that Gradle’s built-in Exec task doesn’t work on Windows
  7. Best practice is to not tightly couple your build script to your plugin’s tasks, so develop a mini DSL extension so that the tasks can change independently of the build script
  8. Now that the convention plugin is so complex, write end-to-end tests that create mini gradle builds that apply the plugin and verify that the task output is correct
  9. Write another convention plugin for turning the metadata into source code
  10. Break your brain trying to figure out how to apply Kotlin plugins to your convention plugin while applying different versions of the same plugins to your actual app build since Gradle bundles its own internal version of Kotlin which is older than the one for your app
  11. Create the code generation task. This is actually the easiest part since you can pretty much write normal code with normal tests, though you still need to figure out the right input/output types for the task since the best practice is that all such types in Gradle are redirected through “provider” types instead of being used directly
  12. Write another mini DSL extension for customizing the code generation task
  13. Figure out how to add generated code to a source set via the Kotlin Gradle plugin. Duh, it’s obviously

    kotlin {
      sourceSets {
        val commonMain by getting {
          kotlin {
            srcDir(...)
          }
        }
      }
    }
    
  14. Finally start setting up the actual Gradle builds. Apply the first convention plugin and configure the custom metadata-producing tasks. Easy. Wait. Shit. How do you get the output of the subproject task to be visible to the other project so that Gradle understands the task dependency between the two projects?
  15. Define a “consumable configuration” in the subproject and add the task output to it by declaring an “artifact”. Oops now your build script is tightly coupled to the plugin task. This seems unidiomatic and unavoidable.
  16. Remember to use flatMap on the task to get its output because task creation is lazy!
  17. Apply the other convention plugin where code generation is needed. Configure the code generation task to get… wait, how do I get the artifact that I declared in the subproject?
  18. Define a “resolvable configuration” here. Declare a dependency which points the resolvable configuration at the consumable configuration from the subproject
  19. Wonder what the hell a configuration is and why I’m declaring them. Check the docs:

    A configuration is a named set of dependencies grouped together for a specific goal.

    Yep that clarifies nothing

  20. Hook up the resolvable configuration to the code generation task. Wait, it needs a file, not a configuration. Arbitrarily grab the first file from the configuration (since we know the subproject only added one file). That feels shitty
  21. Remember to use map on the configuration since it’s lazy!

And, believe it or not, it’s as simple as that.
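For the curious, the consumable/resolvable wiring from steps 15 to 21 looks roughly like the sketch below in the Kotlin DSL. This is not our actual build: the project, task, and property names (GenerateMetadata, outputFile, GenerateCode, inputFile) are hypothetical stand-ins, and both scripts are shown in one block for brevity.

```kotlin
// metadata/build.gradle.kts -- the producer side
val metadataElements by configurations.creating {
    isCanBeConsumed = true  // other projects may depend on this
    isCanBeResolved = false
}

artifacts {
    // flatMap keeps task creation lazy (step 16)
    add("metadataElements", tasks.named<GenerateMetadata>("generateMetadata")
        .flatMap { it.outputFile })
}

// app/build.gradle.kts -- the consumer side
val metadata by configurations.creating {
    isCanBeConsumed = false
    isCanBeResolved = true  // we resolve it; nobody consumes it from us
}

dependencies {
    // point our resolvable configuration at the subproject's consumable one
    metadata(project(":metadata", configuration = "metadataElements"))
}

tasks.named<GenerateCode>("generateCode") {
    // map keeps resolution lazy (step 21); single() grabs the one artifact (step 20)
    inputFile.set(layout.file(metadata.elements.map { it.single().asFile }))
}
```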

tl;dr

What sucks about the above process is not that it’s long. It’s the sheer number of concepts that must be learned and interacted with: subbuilds, subprojects, plugins, tasks, providers, extensions, source sets, configurations, artifacts, dependencies, and all of the quirks and workarounds involved with each of those things.

All of this complexity makes working with Gradle a slog. You can fully, deeply understand every single aspect of your build until you want to do one thing slightly differently and suddenly it doesn’t work because you were supposed to be using some totally different part of Gradle you never heard about before.

The whole idea of using community-developed plugins becomes a minefield because the chance of them applying all of the concepts completely correctly is essentially 0%, meaning you might be okay as long as you’re only doing basic stuff, but as soon as you stray off the beaten path stuff starts breaking and you have almost no chance to fix it properly yourself.

The complexity does come with some small benefits. It’s cool that our custom Gradle plugins have their own tests so we can independently test our code generation process without touching any of our actual app code. Once you understand how Gradle likes to structure things, it does make your build scripts less cluttered and each bit of build logic ends up compartmentalized in a sensible place.

But goddam it feels like there must be an easier way to accomplish this stuff. I’ve learned dozens of different tools over my software development career and compared to them the amount of comfort I feel with Gradle after 40 hours of dedicated study is atrociously low. I have since gone on to further enhance our Gradle build process and it honestly hasn’t felt any easier. The only difference from before is that when working on Gradle stuff I used to spend way more time than I expected, get incredibly frustrated, and give up; now I spend way more time than I expected, get incredibly frustrated, and eventually (somehow) find a fix.

Required features in your new programming language
2022-07-22
https://silverhammermba.github.io/blog/2022/07/22/language

Making cool new programming languages is all the rage these days. I honestly love this, since it shows that people are still trying to think about new ways for humans to interact with computers at a low level. However, many of the designers of these new languages don't seem to realize that the bar for language design is higher than it used to be. We're way past the limited designs of established languages like C, C++, and Java; we know that programming languages can be much more helpful, safe, and expressive than those older languages, and it saddens me to see someone make a brand new language with artificial limitations we already know how to avoid.

I'm not saying that we shouldn't use C, C++, and Java anymore; I'm saying that those languages already exist! Why make a new language if it doesn't materially improve on the status quo? So here is a list of features that I consider essential for a new language if you want it to be taken seriously. In my opinion, the only good reasons to omit any of these from your new language are that

  • you’re making a toy language just for fun, or
  • you’re making a niche language that you don’t intend people to use for general-purpose programming

Project structure == directory structure

In older languages, you can throw your source code files basically anywhere because in order to compile those files into a working program, you need some metadata separate from those files which explains how they should be combined. That metadata could be some kind of project definition file, explicitly listing the source files and how they relate to each other, or the metadata could be some kind of build script that explicitly describes how the compiler will be invoked and with which files.

This is insanity. I have never seen a project that actually requires this degree of flexibility. Practically, every language involving this becomes a headache to manage because you have to manually keep the metadata in sync at all times, and you get a whole host of novel bugs simply related to the metadata being wrong as opposed to the actual source code.

Any new language should have a single, standard project structure. That project structure must be directly expressed through the directory structure of the source code. If a project metadata file is allowed, at most it can specify a custom location for that directory structure; it cannot do things like including extra source files from nonstandard locations, or exempting source files which are in standard locations. The goal is that anyone should be able to view the directory structure of a project and immediately understand roughly how that project works.

Package manager

We live in the age of Open Source. There is no excuse for copy-pasting other people’s code into your project anymore. Any new language should have a built-in official package manager with the following features:

  • Creating, publishing, and downloading packages
  • License information is required when publishing and included when downloading
  • Projects list package dependencies with their required versions
  • Installed packages are pinned to a specific version until explicitly updated in the project
  • All dependencies of a project (including transitive) can be audited (including version and license)

Language manager

If your language is new, it will have bugs. You will need to release new versions, those versions may introduce incompatibilities. Any new language should have an official language version manager such that each project can pin itself to a specific language version, and the manager can download whatever version is required prior to building.

It should be possible to simultaneously build two projects using two different language versions on the same machine i.e. the language version is sandboxed to the individual build, as opposed to some global configuration of the machine.

Turing-incomplete builds

Going hand-in-hand with the previous points, any new compiled language should have a single, standard compilation process. Every step of that process may include a finite set of options for tweaking it, however no step should allow custom code to be run as part of the build.

Again, the goal is that simply by looking at the project structure I should have an intuitive understanding of how those files will be built. It should not be possible for someone to define a custom, wacky build process that works entirely differently. Debugging the build process should at most involve walking through the standard set of steps and determining which ones are going wrong, as opposed to debugging completely custom build scripts written in a Turing-complete language.

If you want to include custom build steps, those must be externally defined per-project and must run before the standard build starts. If the users of your language start to regularly step outside of the standard build process because it is too restrictive, the build process should be expanded just enough to handle those use-cases without giving in and allowing Turing-complete builds. For example, if users of your language start using custom code generators, consider making code generation an official part of your language such that it can be standardized.

Mixed paradigm

Purity of paradigm is worthless. Pure functional languages like Haskell remain extremely niche despite years of advocacy. Virtually all Java successors hack top-level functions into the JVM since we know they are simply the best option in some cases.

Any new language should be mixed paradigm, allowing at least the creation of mutable objects encapsulating both data and methods, as well as first-class functions that can exist independent of an enclosing object and which can be stored and passed around like an object. 99% of good software design is about accurately modeling the problem you are solving. Purity in a language serves only to restrict your models, forcing you to use unintuitive representations in some cases.
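Ruby, for instance, mixes freely; a sketch (names are mine) with a mutable object and a first-class function side by side:

```ruby
# a mutable object encapsulating data and methods...
class Counter
  attr_reader :count

  def initialize
    @count = 0
  end

  def tick
    @count += 1
  end
end

# ...alongside a first-class function, stored and passed like any value
step = ->(counter) { counter.tick }

c = Counter.new
3.times { step.call(c) }
c.count # 3
```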

Everything really is an object

Since your language must be mixed paradigm, for programmer sanity you should ensure that everything in your language behaves like an object and can have methods attached to it. Yes, that means strings, integers, classes, functions, etc. should all allow methods to be defined on them. There should never be cases where something in a variable is of some exceptional type that has no class, or cannot accept method calls. Methods should be definable as scoped extensions to existing classes, so that you can add methods to objects that you do not have control over (if you think it makes more sense than defining a function separate from the object).

Conversely, everything that looks like a method call should be a method call. Constructors should not be a special case, they should be a method call on the class object. Calling a top-level function should be syntactic sugar for a method call on the function object e.g. .invoke().
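Ruby gets close to this ideal and makes a handy reference point (though its class reopening is global by default; the properly scoped version would use refinements):

```ruby
# adding a method to integers by reopening the class
class Integer
  def double
    self * 2
  end
end

5.double              # 10
Integer.is_a?(Object) # true -- the class is itself an object
String.new("hi")      # a "constructor" is just a method call on the class object
m = 5.method(:+)      # and a bound method is an object too
m.call(3)             # 8
```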

If your language is compiled, it should similarly allow interfaces to apply to all of these constructs. If you want to define a class that works like a function, you should be able to make that class and function conform to the same interface such that they can be used interchangeably.

Although I am demanding this, I do acknowledge that this is an extremely difficult requirement for any language design. In fact I don’t know of any language that does this perfectly, since you often end up with increasingly abstract levels of design logic that are hard to reason about. However, any new language should strive to achieve this as much as is practically possible and ensure that any remaining corner cases are well-understood and seem to be acceptable compromises with easy workarounds.

Testing framework

Automated tests are essential for software quality. Any new language should have an official built-in testing framework that supports automated unit and integration tests. Test code should be part of the standard project structure with a clear relationship with the source code. If the users of your language start to regularly rely on nonstandard testing tools, you should expand the official test framework to handle those use-cases.

Test output should have the option of being human-readable or being in a standard machine-parsable format such that CI tools can integrate with the language.

Regex

Any new language should have first-class support for regular expressions both as a built-in type and a literal in source code. If statically typed, regex literals should be evaluated for correctness at compile time.
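Ruby is one of the few mainstream languages that already does this, which makes a nice reference:

```ruby
# a regex literal: its own syntax, its own built-in type
pattern = /\A(\d{4})-(\d{2})-(\d{2})\z/
pattern.class # Regexp

m = pattern.match("2024-07-09")
m[1] # "2024"
# a malformed literal is rejected when the file is parsed,
# before any code runs
```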

Pick one: dynamic or inferred

Any new language must run in one of these two ways:

  1. Fully dynamic, meaning the language has an eval() function that accepts a string at runtime and evaluates it as source code. The language is fully metaprogrammable, allowing all programmer actions to be executed at runtime.
  2. Statically type-inferred, meaning that by default all types are checked at compile time and can be specified in just enough source locations that the compiler can verify through inference that types are being used correctly. Runtime type-checking (such as downcasting) is allowed, but must always be explicitly differentiated from static checking in the source code

Sum types

If your language is statically typed, it must have first-class support for sum types as well as pattern matching for those types to ensure that all cases of the type are handled whenever one is used.

Nullability in the language must be represented as a sum type with two cases: null and non-null. The compiler should be able to enforce that either a variable can never contain null or it can contain null and must be handled with pattern matching.
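Ruby is dynamically typed, so it can only enforce this at runtime rather than at compile time, but its pattern matching shows the shape of the idea: an unhandled case raises NoMatchingPatternError instead of silently passing:

```ruby
# a nil / non-nil "sum type" handled by pattern matching;
# any unmatched value raises rather than falling through
def describe maybe_number
  case maybe_number
  in nil          then "nothing"
  in Integer => n then "the number #{n}"
  end
end

describe(nil) # "nothing"
describe(42)  # "the number 42"
```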

Explicit exceptions

If your language is statically typed, any function that throws an exception must have a distinct type from any function that does not throw. Calling a function that can throw must either require pattern matching to handle the exception or making the caller throw.

Explicit mutability

If your language is statically typed, variables and types must be explicitly marked as mutable or immutable. The compiler verifies that immutable variables are never reassigned and that objects with an immutable type are never modified.

Ideally, immutable types can be given value semantics, meaning that objects of that type with the same value are considered to be identical.
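Again Ruby only enforces this at runtime, via freeze, but it at least shows the ergonomics a compiler could guarantee statically:

```ruby
point = [320, 240].freeze # explicitly immutable from here on

point.frozen? # true
begin
  point << 99 # any mutation is rejected
rescue FrozenError
  # the runtime caught it; a compiler would catch it before running
end
```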

The future

There’s always room for improvement, and while I don’t think we’ve figured out the following areas enough to say that they are essential to language design, I would love to see new languages play around with them to find some new ideal.

Explicit side-effects

If your language is statically typed, any function/method that modifies one of its inputs or changes some global state of the program should be explicitly marked as causing side-effects (distinguishing them from “pure” functions/methods). The compiler should enforce that pure functions cannot call impure ones.

Async programming

It has been interesting watching asynchronous programming evolve over the years. We've gone from explicit multithreading, to green threads, to callback hell, and now to modern concepts like coroutines. It is hard to say where this will settle down, but my gut tells me we haven't reached the end of this journey. Ideally we don't end up in colored-function hell.

Every programmer should care about UI design
2022-07-10
https://silverhammermba.github.io/blog/2022/07/10/ui

A common view among programmers is that UI/UX design is thankfully not any of our business. We have dedicated specialists for that who figure it all out so that we never have to think about it. Unfortunately, this viewpoint is incorrect. Why? As a programmer you do UI design almost every day whether you like it or not.

You disagree? Perhaps you don’t realize who the users are that your interface is for: the other programmers working on your code.

There’s a joke that any code that isn’t yours is spaghetti code. Part of the problem is that it’s generally harder to read code than it is to write it, but I think another major reason is that most programmers don’t realize that their code is a UI, and when they don’t treat it like a UI they end up with a shitty one.

Everyone can recognize bad UI when they use it, and working on bad code has the same problems:

  • Isn’t it frustrating when you have to click through 6 different things to get to the one part of the app you actually care about? Isn’t it frustrating when you have to include a bunch of boilerplate in order to make small functional changes to the code?
  • Isn’t it frustrating when a screen is absolutely covered in information, yet the one thing you want to find isn’t there and it’s not clear how to get to it? Isn’t it frustrating when an object you want to use has 10 public methods, but only 3 of them are useful and if you call them in the wrong order your program crashes?
  • Isn’t it frustrating when you input a bunch of info into an app then an error occurs and it throws it all away? Isn’t it frustrating when your code is super permissive when compiling but then throws tons of runtime errors instead?
  • Isn’t it frustrating when an app labels its UI with meaningless buzzwords like “My Solutions”, “Insights”, etc. making it needlessly difficult to learn how to use it? Isn’t it frustrating when you have to maintain code with names like EntityOperationManager and you have absolutely no idea what it is for because each of those words has several different meanings and none of them are specific?
  • Isn’t it frustrating when a trivial app takes 10 or 20 seconds to load? Isn’t it frustrating when it takes 10 minutes for your simple app to build so you can test your changes?

The similarities between traditional UIs and code are not limited to their users, they also apply to their design processes. Any UI/UX designer worth their salt will tell you that there’s only one way to develop a good UI: test it. Really test it. It’s not enough to ask people “How do you think this UI should work?” It’s not enough to show people pictures of a mockup and ask if they like it. No, you need real users to actually sit down in front of the UI and try to use it.

And yet many software developers I’ve worked with seem to think that they can just dream up a software design on a whiteboard and it will be perfect. Or they use the software design to develop a working feature themselves, so it must be a good software design. But did you test it? Not automated unit tests, did you test the UI of your software design? Did you show it to other developers who were unfamiliar with it and see if they could figure it out? Did you consider all of the different kinds of users who will be interacting with it: the developer bolting on a brand new feature in 3 months who had no hand in the software design, the developer working at 3AM to track down a crash that is somehow related to your changes, the newbie trying to learn the structure of your code so that they can make good design decisions themselves?

Did you give some thought to those people—your future users—when you were creating your software UI? Or did you make selfish decisions by making your software hard to interact with?

How not to design a better software-UI

Making a software design with a good UI is similar in many ways to designing a good API for a library or web service: you need to think hard about things like data formats, stability, understandability, and naming. But this is also a trap. There are other aspects unique to making a good software-UI, and you ignore them at your peril.

The key difference is that while an API is often used by complete strangers who you will never meet, the UI of your software design affects mainly your teammates. It doesn’t need to be nearly as stable, it can (and often must) evolve as the needs of your team/company/product evolve. It can also rely on esoteric domain-specific knowledge, or impose harsh restrictions based on your business needs.

The most frequent mistake I see when developers try to make a good software-UI is that they start thinking about it as an API. “How would we design this if it was an open source project used by thousands of people on GitHub?” This often immediately starts straying into YAGNI territory like, “We can’t make that a hard dependency! What if we need to reuse this component in an environment where the dependency isn’t available?” or “We can’t enforce that at compile time! What if one day we need to reuse this component in a server that needs to reconfigure its components while running?” Avoid the temptation to optimize for flexibility above all else: your goal is to deliver value to your company. Saving on future development cost by having a flexible software design is one way to do that, but it isn’t the only way.

You can deliver much more immediate value by optimizing your design for clarity of understanding, simplicity of use, early and informative error handling; basically, all of the parts of the software design related to how other developers will interact with your code right now, not based on hypothetical unknown use-cases.

A better approach to design

Rather than starting my software designs with an application programming interface (API), I instead like to start developing them with a programmer interface (PI). In other words, don’t try to start by laying out data types and interfaces, instead start by writing the code as if your software design is already done. Imagine you are another developer using your finished software design and try to implement some of your expected use cases.

Let’s call this PI-driven-design, and like test-driven-development it can be a little hard to wrap your head around at first. Just like how TDD starts by writing failing tests before the implementation is done, you start your PIDD by writing code that can’t compile or run because none of the software exists yet. The advantage of this is that it immediately strips away a lot of the complexity that a whiteboard software design introduces. Rather than creating an explosion of different types and interfaces and data structures and algorithms and design patterns, you focus only on the immediate needs of the users of your software design. Is the code readable? Is the intent of the code obvious by looking at the names and types? Do the initialized objects do enough to seem useful without encapsulating too much functionality? How will errors manifest if the code is wrong: compiler error, exception, silent failure? Will the cause of the error be clear to the programmer?
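As a purely invented illustration (every name here is hypothetical), a PIDD session for a small report-export feature might begin by writing the usage code you wish existed, and only afterwards sketching the minimal implementation that lets it run:

```python
# Step 1 of PIDD: write the code you wish you could write. At this
# point Report does not exist yet -- the goal is to judge how the
# interface feels before committing to a design.
def desired_usage():
    report = Report.from_rows([("widget", 3), ("gadget", 5)])
    return report.to_csv()

# Step 2: once the usage feels right, sketch the minimal
# implementation that makes it run.
class Report:
    def __init__(self, rows):
        self._rows = rows  # private: callers never touch this directly

    @classmethod
    def from_rows(cls, rows):
        return cls(list(rows))

    def to_csv(self):
        return "\n".join(f"{name},{qty}" for name, qty in self._rows)

print(desired_usage())  # "widget,3" and "gadget,5" on separate lines
```

The point is that the shape of Report was dictated by its call site, not the other way around.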

You can still think about flexibility, but again focus on how your code will flex to accommodate other programmers as opposed to purely hypothetical use cases. Is the code logically organized such that it is easy to find things later if changes are needed? Is functionality encapsulated in a way that small functional changes can be implemented easily? Let’s discuss two concrete examples to illustrate what too much or too little flexibility looks like in terms of a programmer interface.

1. Flexibility über alles

I have met a lot of programmers who, when adding a property to a class, always add public getters and setters for it. In most cases I would consider this user-unfriendly and a bad PI. The only “benefit” is that it provides the maximum amount of flexibility: anything with access to that object can read it or modify it at any time. But this also makes it much harder for programmers to use: any code modification around that property might require refactoring in many other parts of the code, and any new interaction with that property might introduce regressions in other code which was already using it. A much more user-friendly choice is to do the opposite: make it completely private. Even better, make it private and immutable so that it can only be set once even inside the instance (if your language supports that). This is user-friendly because it is the simplest behavior to understand: if you are working outside of that class you don’t have to think about it at all (encapsulation!), and if you are working inside the class you only have to consider the value it was initialized with (side-effect-free!). Not every software design works with such restrictive properties, but you should fight tooth and nail to give up as little ground as possible. If it must be public, at least make it immutable. Or if it must be mutable, make it only publicly readable, not writable.
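In Python terms, for example, the restrictive end of that spectrum might be sketched like this (the Order class is invented purely for illustration):

```python
from dataclasses import dataclass

# Most restrictive and easiest to reason about: set once at
# construction, never mutated, not writable from anywhere.
@dataclass(frozen=True)
class Order:
    _total: float

    # If the value must be public, expose a read-only view
    # rather than adding a setter.
    @property
    def total(self):
        return self._total

order = Order(9.99)
print(order.total)    # reading is allowed
# order._total = 0.0  # would raise dataclasses.FrozenInstanceError
```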

Making your software design extremely flexible and robust often makes it more verbose and challenging to use and reason about. It is wasteful if that flexibility is based only on personal speculation. You’re spending resources optimizing for a hypothetical situation while ignoring the immediate extra costs you are incurring by making your software harder to use. Most likely your software design needs a programmer interface which hides the flexibility and directly addresses your immediate needs.

2. Useless ease-of-use

I have also met a lot of programmers who frequently use the singleton pattern. When I ask why they are using singletons in their design they usually answer, “This makes it very easy to use my code,” or “We only need one instance of this object.” These justifications sound like they are in line with my previous arguments: taking into account ease of use is very considerate of our users and the singleton pattern is restrictive about instance creation so we’re also favoring simplicity over flexibility!

However, I believe these arguments are twisting the truth. In OOP, singletons are the exception, not the rule: it takes additional work to make a singleton in most languages, and most singleton implementations demand some amount of special treatment when it comes to their use and testing. While those previous arguments are not necessarily false, they omit the key fact that those supposedly PI-friendly benefits come with tradeoffs. To evaluate whether they are truly good in terms of the PI, we must evaluate those tradeoffs.

In terms of ease of use, while singletons are definitely easy to use in implementation they are much harder to use when it comes to testing. In some languages they cannot be mocked at all if you aren’t already using dependency injection, and even when they can be mocked they often require special mocking, which makes it harder to write tests. If you listen to other industry experts, you should have much more test code than application code, so this aspect sounds like it is likely more negative than positive.

How about flexibility? The flexibility argument ignores that while the classic singleton pattern does restrict initialization, it vastly expands access to shared state. Normally if some code initializes an object, it is nontrivial to give unrelated modules of code access to that same object, but not so with singletons. With singletons you open up the possibility that any two areas of code with access to the singleton’s namespace can now have a direct dependency connecting them. Not only that, but this dependency can be an example of tight coupling, since uses of a singleton often assume that it is a singleton and thus rely on the fact that the state is shared. Breaking this assumption can be nontrivial. Thus, while slightly restricting flexibility, using a singleton can introduce a huge amount of potential complexity by making it easy to tightly couple your future code in ways that will be difficult to undo. Once again this aspect sounds more negative in terms of the PI design.

The PI-friendly way to view the singleton pattern (and any other design pattern for that matter) is that it is a potential burden on the design. It should only be included if the PI specifically calls for it and it doesn’t have harmful side-effects. As an example, let’s consider the PIDD for a logging feature for an application. We want it to be really easy to call, something like

log.error("something went wrong")

and I want to be able to write that same code in any file and have it work the same way. Most importantly, I want all of those log messages to end up in the same place so that I get a unified log file for the whole application. This is a PI challenge: I need other programmers to easily access that log object and I want them to be able to easily understand that all logs will be unified. The singleton pattern does solve these problems, but most importantly the tradeoffs it introduces here seem negligible. Most applications don’t bother mocking their logging interface in tests, in fact, you often use the real logs during testing to help with debugging. And the only case in which we wouldn’t want the shared mutable state of logging is if we wanted logs from different components to end up in completely different log files, which seems like a very unlikely hypothetical.
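A minimal sketch of that PI in Python (the _Log class here is invented; in real applications Python’s standard logging module solves the same problem, since logging.getLogger hands every caller the same shared logger objects):

```python
# log.py -- any file can do `from log import log` and all messages
# end up in the same place, giving a unified application log.
class _Log:
    def __init__(self):
        self._messages = []  # the shared state we actually want

    def error(self, message):
        self._messages.append(f"ERROR: {message}")

    def dump(self):
        return list(self._messages)

# A module-level instance: Python initializes a module only once,
# so every importer shares this one object -- a singleton in effect.
log = _Log()

log.error("something went wrong")
print(log.dump())  # ['ERROR: something went wrong']
```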

Unfortunately, I have rarely seen this kind of evaluation occur when deciding what to include in a software design.

Practical use

Practically, using PIDD in your software designs doesn’t require any major changes to your development process. Usually I use it behind the scenes to inform my software designs and to justify design decisions. So while I still include the usual UML diagrams and flowcharts and such since those are useful tools for communicating about the design, they are not what I start with. I start with PIDD in the form of a very rough proof-of-concept, iterating on that several times until I find something that feels great to interact with as a programmer. Only afterwards do I start on my formal software design. But that’s just how I do it, probably there are other viable ways to make your designs programmer-interface-centric.

The important thing is that when you design your software, start with the users: the people on your team! How will they use the software? Remember that in this context “use” means: reading the code, extending the code, bug fixing the code.

Maxwell Anselm
A Paradoxical Error
2022-05-31
https://silverhammermba.github.io/blog/2022/05/31/child-paradox

I recently read GameTek by Geoffrey Engelstein, which includes several interesting examples of how to improve your intuition when it comes to the design and the playing of games. Unfortunately, two of these examples are at best extremely misleading and at worst completely incorrect. However the way in which they are wrong is fascinating and prompted several hours of thought and discussion on my part, so I’d like to share them with you.

The Premises

Engelstein presents two similar examples in the book, which I will reproduce here. First, at the end of Chapter 6, “Feeling the Loss”, we have a presentation of a classic paradox in probability called The Two Child Paradox.

You’re talking to a woman at a park and she says she has two children. Suddenly a boy runs up and grabs her hand and you ask if this is her son. She replies that he is.

What are the chances that her other child is a boy?

Most people say 50 per cent: the fact that one is a boy has no bearing on the gender of the other child. But this is not correct. There are four possible family arrangements that have two children: boy/boy, boy/girl, girl/boy and girl/girl. When I tell you that at least one is a boy, that eliminates one possible family: girl/girl. There are three combinations left: boy/boy, boy/girl, and girl/boy. Each is equally likely, but in only one out of three (and thus a ⅓ chance) is the other child a boy.

Then at the end of Chapter 14, “Tic-Tac-Toe and Entangled Pairs”, we have this other situation which might sound different but is actually intimately connected to the first.

Let’s say four people are playing Bridge. One of them says, ‘I have an Ace,’ and we know she is telling the truth. The chance that she’s holding more than one Ace is about 37 per cent. Later the same player says, again truthfully, ‘I have the Ace of Spades.’ Strangely, the chance that she has more than one Ace is now 56 per cent.

Take a moment to ponder these two examples. They probably feel unintuitive, but all you need to do at this point is understand the premise of each one. When you’re ready, proceed.

The Paradox

I don’t know about you, but I find the first example extremely unintuitive, if not totally wrong. In my view, the key point that Engelstein omits is that the children are distinguishable. Since you have met one of the children at the park, you can imagine labeling that as the “park” child and the other one as the “home” child. Going through the possible configurations we have:

\[\text{boy}_{\text{park}}/\text{boy}_{\text{home}},\text{boy}_{\text{park}}/\text{girl}_{\text{home}},\text{girl}_{\text{park}}/\text{boy}_{\text{home}},\text{girl}_{\text{park}}/\text{girl}_{\text{home}}\]

By discovering that the child in the park is a boy, we are not left with three possibilities as Engelstein argues, but only two, one of which involves two boys. So in my view the intuitive answer of 50% is actually correct. Engelstein’s summary of “at least one is a boy” clashes with his earlier statement that we learned that a specific child is a boy.

So why did Engelstein think it was 1 in 3? Is there even a paradox here? First we should restate the example to remove some ambiguity. Let’s imagine that you and I are playing a game with the following steps:

  1. I flip two fair coins and then hide the coins behind a screen. The random result is fixed but only I can see it
  2. I may reveal some information about the coins to you
  3. I reveal both coins and you win if both coins are Heads

Now consider how your chance of winning changes throughout the steps of the game. After step 1 your chance will always be the same: two fair coins were flipped, so there are four possibilities, so there is a 1 in 4 chance that both are Heads.

HH, HT, TH, TT

Suppose in step 2 that I look at the coins and tell you that at least one of them is Heads. What is your chance of winning? This is the situation that Engelstein was trying for earlier: this is equivalent to me saying that it is not the case that both coins are Tails; in other words, I have only eliminated one of the four possibilities. So by Engelstein’s argument, you have a 1 in 3 chance of winning.

Instead, suppose that in step 2 I pick up one coin from behind the screen and show you that it is Heads. In this case, we can apply my previous argument: the coins are distinguishable (we have the revealed coin and the hidden coin) thus there are only two remaining possibilities and so the chance of winning is 1 in 2.

\[\text{H}_{\text{revealed}}/\text{H}_{\text{hidden}},\text{H}_{\text{revealed}}/\text{T}_{\text{hidden}},\color{red}{\text{T}_{\text{revealed}}/\text{H}_{\text{hidden}},\text{T}_{\text{revealed}}/\text{T}_{\text{hidden}}}\]

Now the paradox: why is there a difference between these situations? In the first case, where at least one coin was Heads, certainly I could have also picked up a coin and showed you that it was Heads. Why does it matter if I actually show you or not? It’s as if, by introducing completely redundant information (physically showing you a Head rather than saying it), your chances of winning are magically increased!

Here is where the two examples from earlier connect. In the Bridge example, I first say “I have an Ace”, then by introducing completely redundant information and saying it’s specifically the Ace of Spades, the chances of having more than one Ace magically increase! Why does the redundant information help? Certainly when I say I have an Ace you can imagine me showing you that it is one of the specific Aces.

So what’s going on? Is Engelstein’s argument correct, or is mine? Or is something else going on entirely?

The Game

To help you understand how to resolve this, we need to put some money on the line. Now you have to pay to play, but there is a prize for winning:

  1. I flip two fair coins and then hide the coins behind a screen. The random result is fixed but only I can see it
  2. I may reveal some information about the coins to you and state both the prize for winning $W and the cost of entry $C
  3. If you pay, I reveal both coins and you win if both coins are Heads

The question is whether the game is worth playing: based on the information I revealed, the prize money, and the cost of entry, do you expect to make money by playing? For example, if you can calculate that the probability of two Heads based on the information revealed is 50%, the prize is $24, and the cost of playing is $10, then

\[\text{expected winnings}=P(\text{win})\times(\$24-\$10)-P(\text{lose})\times(\$10)=0.5\times \$24-\$10=\$2\]

In other words, you expect to make $2 per play on average (assuming you have a big enough budget to play the game repeatedly and overcome some early bad luck). As you can see from this example, it is crucial that you are able to calculate the probability of winning in order to make a profit in this game. So imagine you find me at the carnival offering that exact situation: the prize is $24, the cost $10, and I’m showing you that a coin is Heads. Do you play?
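The break-even arithmetic above is simple enough to script; here is a quick sketch using exact fractions:

```python
from fractions import Fraction

def expected_winnings(p_win, prize, cost):
    # average profit per play: p_win * prize - cost
    return p_win * prize - cost

print(expected_winnings(Fraction(1, 2), 24, 10))  # 2
print(expected_winnings(Fraction(1, 3), 24, 10))  # -2
```

Note that the sign of your profit hinges entirely on which win probability is the right one, which is exactly the question at hand.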

Assuming you found my argument from earlier convincing, your chances of winning are 1 in 2. Based on the previous calculation, you will make a profit! So you play and indeed you win! So you keep playing, following the exact same strategy: when I show you Heads you pay and, to keep things simple, when I don’t show you Heads you decline to pay and simply wait for me to flip the coins again. With this strategy you make a small but reliable profit ($2 per game on average); you’re going to bleed me dry if it takes all day!

Then something changes. Suddenly your strategy starts losing. Convinced it must simply be an unlucky streak, you play on, but eventually I completely drain your funds and you walk away with empty pockets. What happened?

Imagine you’re in my shoes, needing to run this game, and you’ll quickly see what is missing from the picture: when do you decide to show the player a coin? For example, I could follow a procedure like this:

  1. Flip the coins and place one on the left and one on the right
  2. If the left coin is Heads, show them that coin
  3. Otherwise, show them nothing.

In this case, the 1 in 2 calculation is correct: whenever I show you a coin, the only remaining possibilities are

\[\text{H}_{\text{left}}/\text{H}_{\text{right}},\text{H}_{\text{left}}/\text{T}_{\text{right}}\]

So playing is always profitable.

But what if instead I follow this procedure:

  1. Flip the coins and place one on the left and one on the right
  2. If the left coin is Heads, show them that coin
  3. Otherwise, if the right coin is Heads, show them that coin
  4. Otherwise, show them nothing.

Now whenever I show you a coin, it could be

\[\text{H}_{\text{left}}/\text{H}_{\text{right}},\text{H}_{\text{left}}/\text{T}_{\text{right}},\text{T}_{\text{left}}/\text{H}_{\text{right}}\]

and you have no way to distinguish between them. The chance of winning has dropped to 1 in 3. Unfortunately, from your perspective, both procedures look nearly identical. The only difference would be the frequency of me showing nothing, but this is complicated by the fact that I can switch between these procedures at will. If I start with the first procedure I can sucker you in by making you think the game is rigged in your favor. When I switch strategies, your expected winnings flip to −$2 and hopefully for me your greed will cost you all of your profits and then some. I could even be sneakier and switch between these strategies randomly, slightly favoring the one that pays me more.
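Since all four flip outcomes are equally likely, we can check both procedures exactly by enumeration (a quick sketch):

```python
from fractions import Fraction
from itertools import product

def win_rate(reveal):
    """P(both Heads | a coin was shown), given a reveal policy."""
    shown = wins = 0
    for left, right in product("HT", repeat=2):  # four equally likely flips
        if reveal(left, right):
            shown += 1
            wins += left == "H" and right == "H"
    return Fraction(wins, shown)

# First procedure: show the left coin only when it is Heads.
print(win_rate(lambda left, right: left == "H"))           # 1/2
# Second procedure: show a coin whenever either one is Heads.
print(win_rate(lambda left, right: "H" in (left, right)))  # 1/3
```

From your side of the screen the reveal looks identical in both cases, yet the conditional win rate differs.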

If you approach this game naively, you will always lose in the long run. Why? Because, like Engelstein, you will assume that the specific way in which information was revealed tells you something extra about why it was revealed; specifically, that me showing Heads rather than saying “There is at least one Heads” implies something about the choice I made. In general, there may be no connection between these things.

So this is the resolution of the paradox: it is not the case that revealing redundant information can change the chances of an event, rather it is possible to phrase a probability problem in an ambiguous way such that most people will make seemingly obvious (but unsupported) assumptions about it and thus come to incorrect conclusions.

The Resolution

Going back to the original Two Child Paradox. What were the unsupported assumptions we made in order to arrive at the paradox in the first place?

To arrive at the answer of 1 in 2 for the chances of the other child being a boy, we must assume that the gender of the child we met has no bearing on why we met them at the park. If that same child were a girl, we are assuming they would have still been brought to the park. This makes the gender irrelevant, since if they brought a boy, we have boy/girl or boy/boy (1 in 2) and if they brought a girl we have girl/boy or girl/girl (1 in 2). This seems like a reasonable assumption to me; however, it is not stated in the problem.

To arrive at the answer of 1 in 3, we must assume some strange parental behavior, namely that only boys are brought to parks. In that case, the fact that we have met a boy at a park means that this could be a family with two boys, or boy/girl, or girl/boy since all of these equally likely scenarios would result in us meeting a boy. Again this seems like a strange assumption, but it is also not stated and thus cannot be excluded.

Back to Bridge. How do you arrive at the probabilities that Engelstein gave? If I say “I have an Ace” what is the probability that I have two or more Aces? In order to answer this we need to state our assumptions, namely, why did I say I have an Ace? If we assume that whenever I have any Aces I always say “I have an Ace” then indeed you arrive at about 37% chance of me having two or more. This seems like a reasonable assumption to make.

But what if I say that I have the Ace of Spades? How do we arrive at a 56% chance of two or more Aces? That is the probability we get if we assume that I only tell you when I have the Ace of Spades; specifically, if I have other Aces and no Ace of Spades I say nothing. Similar to the Two Child Paradox, this feels like a very strange assumption to make. If we’re interested in finding Aces, why wouldn’t I tell you about the others when I have them? This means that there would be situations where I have two or more Aces (none of them Spades) but I don’t reveal any information. A much more reasonable assumption is that when I have any Aces I tell you the suit of one of them. It’s not hard to see that now I will tell you about an Ace (and its suit) whenever I have any Aces; in other words, we’re back in the first situation. So now even though I am revealing additional information (the suit), the probability of having two or more Aces is still 37%, which is the intuitive result.
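Both of Engelstein’s percentages do check out under those respective assumptions; they can be verified exactly with binomial coefficients:

```python
from math import comb

hands = comb(52, 13)        # all possible 13-card Bridge hands
no_aces = comb(48, 13)      # hands containing zero Aces
one_ace = 4 * comb(48, 12)  # hands containing exactly one Ace

# P(two or more Aces | at least one Ace)
p_given_any = (hands - no_aces - one_ace) / (hands - no_aces)
print(round(p_given_any, 2))    # 0.37

# P(two or more Aces | holding the Ace of Spades): the other 12 cards
# come from the remaining 51, so condition on at least one more Ace.
p_given_spade = 1 - comb(48, 12) / comb(51, 12)
print(round(p_given_spade, 2))  # 0.56
```

The math is fine; the dispute is purely over which conditioning event matches the story.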

My Conclusion

If you only remember one thing from this, it is to be careful in your assumptions when turning word problems into math problems since they can completely change your answer. Engelstein mostly does a good job in his book of debunking common false assumptions such as the idea of “hot streaks” in gambling, but he ironically talks himself back into such bad assumptions with these two examples. Both answers he gives are unintuitive because they make unintuitive assumptions. If you make the more natural assumptions—in my opinion—you get the intuitive answers and no paradox appears.

Maxwell Anselm
Software Designs
2022-04-13
https://silverhammermba.github.io/blog/2022/04/13/plans

No plan of operations extends with any certainty beyond the first encounter with the main enemy forces.

Prussian Field Marshal Helmuth von Moltke the Elder

This is often paraphrased as “no plan survives contact with the enemy”. Wise words.

I’d like to propose a variant of this saying: no software design survives first contact with the compiler. Don’t spend too much time whiteboarding, writing UML, etc. Because as soon as you see how the code actually works, how it actually feels to test it and run it, there is no certainty that your software design will continue to make sense without modification.

Maxwell Anselm