Recently, I was able to get my hands on one of these wishes: The Game Boy Camera and its printer accessory. Released in 1998, this “game” turned your Game Boy into a kinda bad digital camera, but certainly amazing for the time. Today, even though the quality clearly sucks, I feel that there’s a certain charm to it that I can’t quite put into words.
The way saving photos worked is that the cartridge itself can only hold 30 photos, and if you wanted more than that, you needed the Game Boy Printer accessory to bring them into the real world. The problem with the printer, however, is that it uses the same thermal printing technology as receipts, so not only were the photos comically small, they also faded over time.
This is even worse today because, on top of those issues, there’s now the additional problem of how difficult it is to find compatible paper rolls. The Seiko S950 rolls are the only ones I’m aware of that work great, and not only are they hard to find, they’re also quite expensive…
With that in mind, my primary goal after obtaining these products was finding out how to digitize the photos using modern approaches. I was ready to reverse engineer the printer signal to determine how to emulate it via software, but as it turns out, the community has already done all of that!
The Arduino Gameboy Printer Emulator (V3) is an open-source project that does just what the name says. It’s hosted under the GitHub account of Brian Khuu (mofosyne), but from what I could tell, the project is actually the combined work of him and several other devs, primarily Raphaël Boichot, who seems to have done a lot of the work of decoding the printer signals and designing the final PCBs.
By wiring an Arduino in between the camera and the printer, they were able to sniff the signals and make sense of the protocol used by the printer. I’m not going to pretend to be smart enough to explain to you how exactly the protocol works, so I’ll just copy-paste the information they added to the README:
| BYTE POS      | 0         | 1         | 2         | 3           | 4              | 5              | 6 … 6+X-1 | 6+X       | 6+X+1     | 6+X+2     | 6+X+3     |
|---------------|-----------|-----------|-----------|-------------|----------------|----------------|-----------|-----------|-----------|-----------|-----------|
| SIZE          | 1 Byte    | 1 Byte    | 1 Byte    | 1 Byte      | 1 Byte         | 1 Byte         | X Bytes   | 1 Byte    | 1 Byte    | 1 Byte    | 1 Byte    |
| DESCRIPTION   | SYNC_WORD | SYNC_WORD | COMMAND   | COMPRESSION | DATA_LENGTH(X) | DATA_LENGTH(X) | Payload   | CHECKSUM  | CHECKSUM  | DEVICEID  | STATUS    |
| GB TO PRINTER | 0x88      | 0x33      | See Below | See Below   | Low Byte       | High Byte      | See Below | See Below | See Below | 0x00      | 0x00      |
| PRINTER TO GB | 0x00      | 0x00      | 0x00      | 0x00        | 0x00           | 0x00           | 0x00      | 0x00      | 0x00      | 0x81      | See Below |
(X here is the value of the DATA_LENGTH field.) With that sorted out, all that was left was cloning this protocol in software so that an Arduino could emulate the printer and extract the actual image data being printed.
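To make the table above more concrete, here’s a small sketch of my own (not code from the project) of how one such packet could be parsed and its checksum verified; the “See Below” fields are treated as opaque values here:

```python
def parse_gbp_packet(raw: bytes) -> dict:
    """Parse one Game Boy -> Printer packet and verify its checksum."""
    assert raw[0] == 0x88 and raw[1] == 0x33, "bad SYNC_WORD"
    command = raw[2]
    compression = raw[3]
    data_length = raw[4] | (raw[5] << 8)      # low byte first
    payload = raw[6:6 + data_length]
    checksum = raw[6 + data_length] | (raw[7 + data_length] << 8)
    # The checksum covers everything between the sync word and the checksum
    # itself: command, compression, both length bytes, and the payload.
    computed = (command + compression + raw[4] + raw[5] + sum(payload)) & 0xFFFF
    return {
        "command": command,
        "compression": compression,
        "payload": payload,
        "checksum_ok": checksum == computed,
    }
```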
So that meant that there was no software work for me to do, but I still had one challenge to solve myself: how to actually run this thing? I had no intentions of prying open an actual Game Boy cable, so something would have to be built.
Luckily for me, Raphaël also designed PCBs for this project, providing a much more elegant solution than ripping some cables apart. I ordered the Arduino Uno version from JLCPCB, and after waiting a very long time for a third-party GBC port and some pin headers to arrive from China, I was tasked with my first serious soldering job.
I had no expectations that this would actually work, but soldering this project turned out to be easier than I expected, and everything worked first try!
The output of the Arduino is the raw photo data, which you then need to extract and convert into a proper image (there are many community-built solutions listed in the README):
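For the curious, the raw data uses the Game Boy’s standard 2-bits-per-pixel tile format: each 8x8 tile is 16 bytes, two bytes per row, where the first byte holds the low bit and the second the high bit of each pixel. A rough sketch of the decoding step (my own illustration, not code from any of those tools):

```python
def decode_tile(tile: bytes) -> list:
    """Decode one 16-byte Game Boy 2bpp tile into an 8x8 grid of values 0-3."""
    rows = []
    for y in range(8):
        lo, hi = tile[2 * y], tile[2 * y + 1]
        row = []
        for x in range(8):
            bit = 7 - x  # leftmost pixel lives in the highest bit
            row.append(((hi >> bit & 1) << 1) | (lo >> bit & 1))
        rows.append(row)
    return rows
```

Mapping the resulting 0–3 values onto four shades of grey (or one of the console palettes) gives you the final image.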
However, I later found out that one of these solutions (the Game Boy Camera Gallery) can even do the serial reading directly from the browser, so I didn’t even need to pull the serial logs from the Arduino. You can even change the color scheme of the photos to match the palettes of the different Game Boy models the camera could be used with, which I thought was a really neat feature.
Here are some of my favorite recently digitized photos:
In my previous post, I wrote about how AI wouldn't replace software engineering jobs because unlike humans, the models cannot make sense of vague requests. This is still largely true. Although the coding capabilities of the models have evolved dramatically, they still require the user to know what they are doing as those capabilities are only unlocked if you provide clear instructions to the model. This makes the idea of a product manager with zero programming experience vibecoding their way into a billion dollar business just as ridiculous as it was back then.
However, although the job will still exist, it's important to say that what it comprises is radically changing. The archetype of the ultraspecialist who focuses their entire career on a single platform or language, which became very popular around a decade ago when mobile engineering was on the rise, is no longer a good career path. There is little point in becoming a specialist in a particular tool when anyone can now guide an AI model to the same result with just a fraction of the experience.
What's meta now in the engineering industry in the wake of AI is to T-shape into having broad knowledge of all the different systems and products developed by your company. That way, while the AI focuses on the deep aspects of the code, you can focus on what the AI currently cannot do: making sure that what it implements is in line with what's being developed for other aspects of the product, akin to a blend between software engineer and product manager. The Pragmatic Engineer calls this a product engineer.
I predict that companies will soon require everyone to transition to this role, with those who fail to do so being at risk of future layoffs. Those at senior level likely already perform this role in some capacity, but it could be a large barrier to clear for more junior developers.
In my previous post, I mentioned that the bulk of my work happened with Cursor's tab completion, with Claude only being a small percentage for more special cases. This has now flipped around. I would say I nowadays do 95% of my work by prompting Claude, with around 5% manual coding being required only to sometimes make small adjustments to the AI's code. I also stopped using ChatGPT entirely in favor of using Claude for everything.
The reason for this change is that the newest models are so capable that it is now much faster to simply ask the model to do the work than to do it myself. While before the models would make all sorts of mistakes, nowadays they get pretty much everything right if you give them clear instructions. In repos that have well-defined AGENTS.md files and skills, I can even go a step further and use the --dangerously-skip-permissions flag to let the agent do its work and commit + open a PR without any supervision.
It's even scary at times; sometimes I prompt absolute garbage full of typos, but still somehow the model is able to understand exactly what I meant.
Many folks are using all kinds of complex architectures and plugins to interact with agents, but I run a fully vanilla setup without any of that fuss. I find that those special setups are completely unnecessary to get good performance out of the agent.
The use-case for agents that I find the most powerful was and still is making sense of complex codebases. Recently I've been constantly doing tasks that require me to work in multiple codebases I've never seen before, and what would previously be a multi-month task requiring several engineers can now be done in mere minutes by simply asking Claude some questions while having the code cloned locally. Thanks to AI, I've been able to deliver extremely intricate improvements to some of our systems not only completely by myself, but also at an unprecedented speed. It's really incredible how much this technology was able to change the industry.
What draws me to software engineering is the problem solving. I really enjoy trying to make sense of complicated issues. But now that AI does all of that for me, I'm not getting that kick anymore. Yes, I can deliver much faster now, but also I now feel that I don't really understand what I'm delivering anymore. Struggling to solve something was a very important part of the learning process, so with that out of the way, I feel that I'm not really learning anything anymore despite technically being able to deliver much more than I used to. I know many others are also feeling like this.
This makes me confused in regards to what the future of the industry will be. The AI models are smart because they were trained on decades of human content and documentation. But if no one produces this content anymore due to everything now being made by AI, how will future models be trained? They cannot train on their own content. Does this imply that there is an upper-bound to how smart coding AIs can be?
The future is very uncertain right now, but I hope we find ways to continue having fun in the midst of it.
Here I will only cover the actual networking stack; if you want to know about my Home Assistant server and the devices I have connected to it, check out the separate "My Home Automation Setup" post, which goes into more detail on that.
For over a decade, my home network was powered by nothing more than one of those consumer-level ASUS routers you can buy in any electronics store. While they work fine, two things started bothering me over time:
I chose to go all in on Ubiquiti hardware for a simple reason: I wanted something designed for power users, but without too much of a learning curve. While I usually fit the criteria of a mega power user for this type of stuff, modern networks are extremely complicated, so I preferred to stop somewhere in the middle of the curve to avoid the stress of being unable to use the internet due to misconfigurations.
By the way, before starting, I would like to note that this setup is extremely overkill for the average home network! I went for this simply because I thought it was cool, so if you’re reading this because you’re looking for equipment recommendations, keep in mind that something considerably less powerful would most likely already do the trick for you.
For the router, I chose the Cloud Gateway Fiber as this router is a complete bargain for what it brings to the table. Not only does it have three 10 Gbps ports (one of them being RJ45), including for WAN, it even has a PoE+ port that can power an AP. Usually something like this would be extremely expensive, but for some reason it just… isn’t, and I guess the market agrees, because finding one of these is a massive challenge. They are constantly sold out almost everywhere as of writing.
For the AP, I chose the U7 Pro XG, a WiFi 7 access point with support for the 6 GHz band. This thing even has a 10 Gbps uplink for some reason, which I can’t even use as the PoE+ port on the router is “only” 2.5 Gbps, but it was a very cheap upgrade compared to the regular version, so it seemed like a no-brainer. Luckily, I already have WiFi 7 devices around (the newest iPhones), so I’m already making use of it.
Since I had lots of devices connected directly to the router via Ethernet, I chose to also get the Flex 2.5G switch, particularly because it has the same SFP+ port that the router has, allowing me to connect the two via fiber and get a clean 10 Gbps connection between them.
One piece of hardware that was not part of this upgrade but is worth mentioning: I also have a simple Raspberry Pi 3B running Pi-Hole and my own recursive DNS server (via Unbound) for ad-blocking and privacy reasons. However, I only enable it for my phone and computer, to prevent guests from having issues with it, as it makes navigating certain websites slightly harder (Google, for example, wraps almost everything in sponsored ad links, which the Pi-Hole rejects).
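For reference, the usual Pi-Hole + Unbound combination boils down to running Unbound as a local recursive resolver and pointing Pi-Hole’s upstream DNS at it. A minimal config sketch along the lines of the official Pi-Hole guide (paths and port may differ on your system):

```
# /etc/unbound/unbound.conf.d/pi-hole.conf
server:
    # Listen only on localhost; Pi-Hole forwards its queries here
    interface: 127.0.0.1
    port: 5335
    do-ip4: yes
    do-udp: yes
    do-tcp: yes
    # Resolve recursively from the root servers instead of forwarding
    root-hints: "/var/lib/unbound/root.hints"
    prefetch: yes
```

Pi-Hole’s upstream DNS server is then set to 127.0.0.1#5335.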
To be able to access the Pi-Hole remotely, I used my router's native WireGuard functionality to create a VPN that gives me access to my home network regardless of where I am.
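As a rough illustration of what that looks like on the client side, here is a wg0.conf sketch with placeholder keys and addresses (the real values come from the WireGuard server profile the router generates):

```
[Interface]
PrivateKey = <client-private-key>
Address = 10.0.0.2/32
DNS = 192.168.1.53          # the Pi-Hole's LAN address

[Peer]
PublicKey = <router-public-key>
Endpoint = home.example.com:51820
AllowedIPs = 192.168.1.0/24  # route only home-LAN traffic through the tunnel
```

Setting AllowedIPs to the home subnet (instead of 0.0.0.0/0) keeps regular internet traffic outside the tunnel while still letting the device reach the Pi-Hole from anywhere.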
In addition to the Pi-Hole, my Home Assistant server nowadays also doubles as a Plex server (via this unofficial addon for HAOS). When I happen to have the video files available locally, it lets me effortlessly stream ultra-high-quality video straight to my TV, thanks to everything being wired and the high speeds these upgrades made possible.
Now for the other reason I wanted to do this upgrade: allowing for more complex network setups.
Today, my network consists of a combination of “regular” devices (like my phone), servers, and random IoT devices. To connect these effectively and securely, I went for the following setup:
For Wi-Fi, I have two SSIDs:
Some people also set up a third “Guest” VLAN and SSID for visitors, completely isolated from everything else similarly to the IoT one, but I chose not to do so because it breaks things like Chromecast and AirPlay (I’m sure you can set up firewall rules for this, but it seemed too complicated to bother with). It would also add more overhead to the AP, which I’d like to avoid.
In short, I’m very satisfied with my hardware and setup choices. Like I mentioned above, this is all extremely overkill for what I actually use my home network for, but since it resulted in me learning more about networks and in a safer, more powerful, and future-proof network for all of my devices, I’m very fine with that :)
The trick here is that when you have your own domain, most email server providers will have an option to define email aliases, allowing you to "hide" your real address behind a fake one. Although in some providers this is a limited and manual process, in others, such as Google Workspace (the one I use), you can define wildcard or even regex-based aliases:
While this was created primarily to allow companies to catch typos in their inbound mail, you can use it to massively increase your privacy. The idea here is that by having "infinite" email aliases, instead of registering on websites with your real address (like me@bla.com), you can instead give each address its own "dedicated" alias, like:
That way, when one of these websites inevitably sells or leaks your data, you will know exactly who it was based on which of the aliases you're getting spammed from. You can then follow up by either heavily restricting who can reach that alias or simply nuking it entirely in favor of a new one.
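The bookkeeping here is trivial; a toy sketch (the domain and helper names are hypothetical, just for illustration):

```python
DOMAIN = "example.com"  # stand-in for your own domain

def alias_for(service: str) -> str:
    """Dedicated alias to hand out when registering on a given service."""
    return f"{service.lower()}@{DOMAIN}"

def leaked_by(to_address: str) -> str:
    """Given the To: address of a spam mail, name the offending service."""
    local, _, domain = to_address.partition("@")
    assert domain == DOMAIN, "not one of our aliases"
    return local
```

So you would register on Netflix as `netflix@example.com`, and the moment spam arrives at that alias, you know exactly who to blame.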
In other words this is basically what features like the "private email" in Sign in with Apple and other services do under the hood, but completely under your control :)
You might know that aliases also technically exist with "normal" emails by appending data with +, like me+alias@gmail.com. Isn't it the same thing?
No. The problem with + aliases is that some websites intentionally strip the alias portion of the address if you try to use one. I have also encountered cases where some websites would not accept these at all, forcing you to expose your real address. In other words, you can sort of do the above trick with a "normal" email address, but your mileage may vary.
AI has become an important part of my daily software engineering work, so I wanted to write a post sharing how exactly I've been using it, in case you're wondering how to use it to improve your productivity as well!
Before getting to the details of how AI has been helping me to code, I wanted to address the topic of AI replacing software engineers.
Recently, my social media feeds have become full of people making predictions about how in X months/years everything in the software engineering industry will be done by AI, via doomsday-style content about how everyone is going to lose their jobs and everything will fall apart.
If you look at who's writing these kinds of posts, you'll notice something interesting: they are either executives who have never done any kind of software engineering, or beginners with no industry experience. Honestly, that should tell you everything you need to know about these "predictions". But in the interest of being informative, I'll try to explain why they are nonsense.
The problem with these "predictions" is that the people making them for some reason seem to view software engineering as nothing more than coding and closing tasks on JIRA. You arrive at work, pick a task that is very well defined and requires no clarification whatsoever, code it, close it, pick another task that is once again perfectly defined, and repeat that ad infinitum for the entire duration of your career.
But the reality of software engineering is far more complex than that. While there's certainly a good amount of coding, it's extremely rare that the problems a software engineer needs to solve are perfectly defined from the get-go as claimed by the people making these predictions. This means that more often than not the job is not really about coding, but rather figuring out what exactly needs to be coded, by asking yourself questions such as:
The answers to questions like the ones above provide you with context that helps you define how (and when) exactly certain problems should be solved, and is a critical aspect of software engineering even for junior developers. And the interesting part is that the more senior you become, the less coding you do, and the more time you spend answering these types of questions to help your team/company determine which way it should go. This is something I've also written about on my Focus not on the task, but on the problem behind the task blog post.
While AI can be quite good at solving very simple and perfectly defined problems, it is exceptionally bad at handling anything that requires taking this level of context into account, which is something that software engineers constantly have to do. This is very easy to confirm if you have doubts about it: Grab any AI agent and project of your choice (or ask the agent to make a new one), and keep asking it to include more features in your project. While it may do relatively well the first time, it is inevitable that the AI will start confusing itself and destroying the codebase on the subsequent requests. This is because AI today doesn't understand context, and as one user on HackerNews wrote, it's like working with a junior developer that has amnesia.
Thus, while AI today can be amazing as a coding assistant (which I'll go into more detail further below), the thought of it replacing software engineers is frankly hilarious.
One counterargument that some people have is that while this is true today, it doesn't mean that in the future the AI won't be able to understand context and thus be able to do complex tasks. While this is true, what must be noted about this is that an AI capable of understanding context (and gathering it on its own) would be so powerful that it wouldn't just replace software engineers; it would replace all of humanity. If such a thing is achieved then software engineering jobs would be the least of our concerns, so I think it's a sort of weird argument to consider. Our entire lives would change in this scenario.
With that out of the way, I'd like to now present my favorite use cases for AI today!
One thing that AI is very good at, as of writing, is solving very concrete and straightforward problems. So when I have to do menial tasks like changing configuration files or writing a simple function that does X and Y, nowadays I simply ask Cursor to do it for me, sit back, and watch the show.
Even when taking into account that the AI might not get it 100% correct and that I'll still have to patch the code afterward, this still saves me a massive amount of time overall compared to having me do everything by myself and is definitely my favorite use case of AI today. This is especially true when doing (simple) work on languages that I'm not very familiar with, as the AI in this case is also sparing me from having to do multiple trips to StackOverflow. I still need to do so since the AI will sometimes recommend things that are not correct for my case, but again, even when considering these setbacks, I can get the work done at a much faster pace.
It must be noted however that the important keyword here is simple, concrete, and straightforward. As mentioned previously, trying to have the AI solve complex problems that require large amounts of context such as code reviews or designing large features will not work in any meaningful way and is a sure way to waste everyone's time.
Another thing that I've found AI to be amazing at is when I'm working on a repository that I'm not familiar with and I need to figure out how certain things are wired together.
The way I would do this before AI was to spend hours painstakingly reading through the codebase, but now, with the right questions, it's possible for me to get started in a matter of seconds.
Here's a concrete recent example to demonstrate what I mean. I was recently attempting to craft a Build Server Protocol (BSP) implementation that would connect to SourceKit-LSP in order to enable iOS development under my specific non-Xcode conditions.
The problem here is that SourceKit-LSP is a very complex project. Even though I know what I have to do in theory, I have no idea what SourceKit-LSP expects me to do in practice. But nowadays, instead of having to spend weeks trying to figure this out by searching keywords on the codebase, I can simply ask Cursor to explain the project to me!
Similarly to Use case 1, it's to be expected that the explanation provided by the AI will not be 100% accurate. But once again, even when taking this into consideration, the amount of time these explanations save me is mindblowing. Since Cursor in this case provides shortcuts to the relevant parts of the codebase, I am able to very quickly piece together what I am supposed to do / determine which parts of the explanation are correct and which ones aren't.
I find that Google tends to provide good results if you know exactly what you're looking for. But if you don't really know what is it that you're trying to find out, you'll have a hard time with it.
For example, the other day I was trying to find what the _start function in iOS (the first function called when your app is launched) is, where it's defined, and what it does. But if I go now and search for "_start function iOS" on Google, I will not find a straight answer to this question. Google does return somewhat good results (the second search result contains some interesting information about it if you scroll down far enough), but it cannot give me a direct response because I asked the wrong question. I know today that what I should've done is ask it about details of how C programs are linked, but I didn't know this back then, so I couldn't have done that.
AI does not have this problem. If you don't know what you're looking for, you can explain your situation to it and it will point you in the right direction:
In this example, you can see that ChatGPT immediately pointed out that I asked the wrong question before attempting to explain it!
Although the AI's answers won't always be 100% accurate, I find them to be accurate enough to allow me to use Google to find the rest of the information. Just like the previous case, this is not so much about having the AI do everything for me (which it can't), but rather allowing me to save time and get where I want faster.
Even if you know exactly what you're looking for, you may have difficulty using Google if your question is too specific. For example, you cannot search Google on "how to implement X thing in Swift in my app that is using XYZ frameworks in X iOS version with dependency injection bridging an Obj-C type after the user logged in on my TODO list app using Apple Sign-in from Italy during a rainy day in October". For cases like this, usually what you need to do is break your problem into multiple, more generic queries, open many tabs that each help you with a specific part of the problem, and then use all of that combined knowledge to come up with the actual answer to your question.
AI excels at this. You can be as specific as you want and you'll get a relevant answer. Most coding questions fall into this category, although for these specifically nowadays I prefer using Cursor's code gen features directly as mentioned above.
In this case, I could've probably found the answer I was looking for in Google by making a bunch of generic searches about C++ global constructors and good practices, opening a bunch of tabs, and summarizing everything I found. But by asking ChatGPT, I was able to save several hours of my time instead.
It has been getting harder and harder to get fast answers to your questions with Google. Today, it's very unlikely that the answer to a question will lie at the top of a page you've opened. As SEO became more and more important for survival on the web, the amount of stuff you have to endure before getting to the actual content has increased significantly. There will be a lengthy introduction, a pause for sponsors, ten paragraphs about how the question reminds the author of a personal story of how their dog bodyslammed their grandma on Christmas, a call to action for the author's newsletter, some backstory on the question, and only then will you get to the actual content.
I find that there are many cases where this fluff is relevant and worth reading. But there are also many cases when I'm in a hurry and would much rather just get a straight answer to my question.
This is also something that I find AI to be quite good at. It generally doesn't try to educate you on things you didn't ask, it just straight up answers your question.
By asking follow-up questions regarding one or more things it mentioned in its answer, I can get all of the information I need to learn something new considerably faster than if I had used Google instead. Even though I still need to use Google to double-check if the AI didn't hallucinate particular pieces of information, this ability to quickly gather relevant information saves me an absurd amount of time.
In this post, I'd like to lay out exactly what I've done that I believe contributed (and didn't contribute) to this growth, serving as documentation and inspiration for the indie dev community out there.
I cannot overstate the value of having a good grasp of App Store Optimization (ASO). The case is simple: it doesn't matter how good your app is; if you don't get eyes on it, it will never succeed.
ASO refers to being strategic about how you assemble your app's store listing (keywords, name, subtitle, description, screenshots, etc) so that it ranks well when people search for keywords related to your app. In many cases what you actually want to do is avoid popular keywords in the beginning, focusing on less popular ones where you have more of a fighting chance until you get "popular" enough that you can try challenging the real ones. How and when you ask for reviews also plays a big role here as reviews also affect your app's rank.
I strongly recommend Appfigures for learning and applying ASO for your apps. The owner, Ariel, has posted many videos explaining different strategies you can take, and that's how I got to know about it.
In my case, ASO was only time-intensive in the first few weeks following the app's launch. After it picked up some steam and became no.1 in a couple of important keywords, I was able to leave it alone and enjoy full organic growth ever since.
Most indie apps fail because they are trying to solve problems that don't exist. The devs come up with the solution first, and then try to find users who have a problem that matches their solution. This rarely works.
The easiest way to avoid this is to ignore other people and just focus on your own set of problems. If you can manage to build something that would make your own life better, certainly you'll find other people who will also appreciate it.
In my case, I built Burnout Buddy because iOS's default Screen Time feature was too simple for me. I wanted to create more complex scenarios, such as schedule- or location-based conditions, but iOS only allows you to set up simple time limits. You also can't do "strict" conditions where there's no way to disable the block once it goes into effect. I searched for alternatives, but none of them were good enough for me. So I built my own!
Once my problem was solved, I figured out that most likely there were others out there who could also make use of it. I made the app public with zero expectations, and sure enough, there were tons of other people with the same problem I had.
Being my app's primary user also means that I'm perfectly positioned to know which features the app should and shouldn't have. I don't need things like user interviews, because again, I built this for myself. All I have to do is ask myself what I'd like the app to do, and the result is sure to also be a hit with others with the same problem the app aims to solve.
I credit Pieter Levels' MAKE book with helping me understand this concept. It's also a great resource for learning more about indie development and how to create successful products in general!
Another decision that I've made that massively simplified things for me is that everything happens on the client. There are no accounts or backend, and I gather zero data from the users.
This means I have no backend to manage, and most importantly, no monthly server costs. As long as Apple doesn't push iOS updates that break the APIs I use (which unfortunately happens a lot), I can trust that everything is working as it should and focus my attention on other things. People seem to really appreciate this too, since many apps nowadays have accounts for no reason other than wanting to hoard data, which is really shady.
After the first couple of releases, I spent a good amount of time building a good suite of tests and architecting the app so that it would be easy to expand and make it even more testable. This means I very rarely have to worry about whether or not I'll push something that will fundamentally break the app. Having no backend-related code also greatly helped here.
This doesn't mean that the app is bug-free (there are a bunch of SwiftUI issues I can't seem to solve, and Apple somehow manages to break their APIs on every iOS release as mentioned above), but when it comes to the core experience of the app, I can trust that everything works as it should. This saved a lot of testing / debugging time on my end and also made sure I almost never had to deal with support e-mails regarding broken features and such.
Burnout Buddy is a one-time $9.99 purchase. For a long time, it was even just $4.99.
Why does this matter? Because most alternatives are stupidly expensive subscriptions. Most of them also don't have backends and have even fewer features than BB, so why the hell are these apps subscription-based???
Some people justify that subscriptions are necessary even for "simple" apps like BB because of things like recurring support work. While I can see the point, I also think there are other ways to tackle these issues. I for example created a FAQ support page, and that reduced 99.9% of the support requests. I'm not trying to extort my users and I believe this was a strong factor for the app's success.
It would be naive of me to claim that everything went right. I've made a couple of bad decisions that worked against the app's success, and I wanted to document them as well.
Like I mentioned in the ASO section, it doesn't matter how good your app is. You need to get the word out, otherwise it will just not work.
There is a saying in tech that goes "if you build something good, people will follow". Whoever said this has absolutely never attempted to sell something. I'm as much of a tech nerd as it gets, and I can safely say that when it comes to building businesses, marketing is a billion times more important than building the actual product!
Unfortunately for me, I hate doing marketing work. I'm fine with putting a sponsorship section on this blog, but reaching out to journalists and hustling on X / LinkedIn is really not my thing. This means that while thankfully I was able to do just enough of it to get some nice results in the beginning, the app is destined to die a slow death as it drops in ranking in the App Store and other similar apps manage to get their word out better than me. Marketing is something you have to do constantly, but unfortunately for me it's something I just don't want to do, so there will always be a hard cap to how far I can go with any given project alone.
This will sound weird given that I mentioned above that not extorting my users was a positive, so allow me to clarify.
One thing I've learned the hard way is that you need to avoid cheapskates like the plague. These are the people who expect nothing but the highest-quality products, but at the same time are not willing to pay anything for them. You know one when you see one, because they behave like huge assholes and will do everything in their power to extract as much value from you as possible while giving nothing in return, much like the meme of a Karen screaming at the supermarket cashier over some worthless coupon.
When Burnout Buddy was $4.99, my support e-mail was constantly being spammed by such people. They would aggressively complain about different app features and demand refunds, often threatening to download a different app if I didn't help them (...why would I care about that?). A lot of these reports didn't even make sense; they were clearly people just searching for excuses to be an asshole and get free stuff. It was such a waste of my time that I even briefly considered abandoning the project entirely / pulling it from the App Store just so I wouldn't have to deal with them anymore.
It was only when I read someone complaining about the exact same problem on HackerNews that I realized what my issue was. It's not that giving support is a thankless job, it's that the app was too cheap. The cheapskates are attracted by free (or in this case, almost free) products. If you raise the price of your product just slightly, you can filter out these people without driving away the good (and kind) users.
After doing just that, these bizarre e-mails completely vanished without resulting in any loss of revenue. While I of course still get support requests every now and then, they are now all very polite and helpful, which makes everything a breeze!
In other words, the "fail" here is that I should've made the app cost $9.99 from the get-go to filter out the cheapskates from the very beginning.
This is an interesting one because it's both a good and a bad thing depending on how you look at it.
I mentioned above that having no accounts was a good thing because it made things easier on my side and was appreciated by the users. But it also meant that I had no information regarding how users were using the app. This made things harder for me because 1) I couldn't determine which features were more popular / worth expanding upon (and which ones weren't), and 2) when people reported bugs, I had no easy way to trace their steps in order to quickly reproduce the issue (or to confirm they misunderstood the app / were doing something wrong).
If I could go back, I would probably have gone for a solution that allowed me to gather analytics data for the above reasons.
This is mostly off-topic for this post, so I'll keep it short. I decided to use SwiftUI for this project as a learning opportunity, and I sort of regret it. As mentioned in my SwiftUI vs UIKit post, SwiftUI is good for simple apps, but awful for more complex ones. As BB grew and became more intricate, SwiftUI became more and more of an issue. The app today is full of dirty hacks and visual bugs that are impossible to solve (as of writing) because they originate from SwiftUI itself, in ways that are impossible for me to control without ditching the entire framework.
Here I only describe my home automation setup; for information on my networking / homelab hardware, check out my separate "I upgraded all home networking equipment" post where I go into more detail about that.
I use Home Assistant OS like many others. The way I like to describe HA is that it's an Alexa on steroids. With an Alexa, you buy smart devices, link them with the Alexa, and then set up automations to control those devices based on conditions like time, weather, and so on. But the problem with Alexas is that 1) the devices must specifically support Alexa, and 2) the automations themselves are very limited, only allowing you to do simple things.
Home Assistant doesn't have such limitations. HA is open-source and has a thriving community, meaning you can find plugins that enable integrations for pretty much anything you can think of, and if you don't, you can build such plugins yourself assuming you have the programming chops to do it! HA is very extensible, and thus perfect for power users who want to set up complex automations or integrate unusual devices in unusual ways.
For a long time, my setup was a simple Raspberry Pi 3B and an SD card. The HA community tells you that this is a bad idea (the 3B is weak, and SD cards can wear out if you write to them too much), but in my experience it's fine as long as you don't have too many devices / automations / integrations. It's a good starting point, and it did the trick for me for a couple of years until I started wanting to do more complicated integrations.
Nowadays, I've retired the 3B in favor of a Raspberry Pi 5 w/ 8GB RAM, with a 256GB official RPi NVMe SSD, enclosed in the Argon ONE V3 case. This gives HA enough power and cooling to do everything I need with ease.
Today I run the server via a wired connection, but there was a long period of time where I was running it through WiFi. The community says that this is a bad idea for latency reasons, but I never had any such issues. While the wired connection is definitely much faster, the WiFi latency never really bothered me, so I can confirm it's totally fine to run HA over WiFi if you can't run a cable to it.
For voice control, I use an Alexa. The way this works is that HA has a plugin called Emulated Hue which tricks an Alexa into thinking your HA server is a Philips Hue hub, allowing you to expose your devices and scripts to the Alexa and make use of its voice features. You can also pay for HA Cloud and enjoy the official "proper" Alexa integration, but I don't because I want to keep everything running on the local network.
I also have a Sonoff ZBDongle-E USB stick plugged into the server in order to drive my Zigbee devices, which I'll mention in more detail further below.
In addition to all of this, I'm using a generic Huawei Android tablet under my TV to serve not only as a physical dashboard, but also as a digital photo album when idle:
The dashboarding functionality itself is achieved by using Wallpanel to expose the tablet's controls to HA, while the photo album logic is handled via HA's lovelace-wallpanel custom integration. I host the photos on the HA server itself. For added security, since we're talking about a Chinese Android device, I also have a special rule on my router's firewall that prevents this tablet from communicating with devices outside the local network.
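In my case this rule lives in my router's UI, but to illustrate the idea, on a Linux-based router it could look something like the iptables rule below. The tablet's IP address and the subnet are made-up placeholders, so adapt them to your own network:

```shell
# Hypothetical rule: drop any forwarded traffic from the tablet
# (assumed to be 192.168.1.50) that is destined for anything outside
# the local 192.168.1.0/24 subnet, i.e. block all internet access.
iptables -A FORWARD -s 192.168.1.50 ! -d 192.168.1.0/24 -j DROP
```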
Right now I have no mechanism to access my HA instance remotely, as I don't have a good enough reason for one, but I wanted to mention that HA has official support for Tailscale, making this very easy to set up if you don't have access to other options like plain WireGuard.
Currently, I'm running a combination of WiFi and Zigbee devices. Zigbee is an alternative wireless protocol made specifically for IoT devices: it uses less energy and forms a mesh network where devices relay messages through each other (as opposed to WiFi, where everything goes through the router, forming a star network).
The reason I run this mix is simply that I didn't know about Zigbee in the beginning. If I could go back in time, I would have the entire network consist of Zigbee devices, because I think they are just better than WiFi ones overall. They use significantly less energy (many Zigbee devices can run on coin cell batteries), the mesh network allows you to have devices very far away from the server, and best of all: they keep working even when the WiFi is down.
When you buy Zigbee devices, usually the store will say that you need a hub to drive them, which they also sell. It's true that you need a hub, but it doesn't have to be that store's specific hub. When using something like HA, you can use a USB antenna stick like the one mentioned above and that will allow you to control any Zigbee device from any manufacturer via HA.
Here are the IoT devices that I have around my apartment, excluding things that are "smart" by default like TVs and such.
This is a DIY WiFi switch that you hook into "dumb" devices to make them smart, letting you turn them on and off via WiFi. Given a bit of skill with electronics (stripping / crimping wires), these switches are much cheaper and more durable than smart lamps or smart outlets, and I have many of them spread around the apartment!
By default, these require you to expose your device to some awful Chinese cloud server. Luckily for us, you can flash these devices with custom firmware like Tasmota, allowing you to have full control of them. This also requires skills with electronics and some special equipment, so keep that in mind.
As previously mentioned, if I could go back in time, I would have instead bought a Zigbee equivalent to make things easier and better.
These are similar to the above, but in the form of something you plug directly into the wall socket. This allows you to enjoy the same benefits without having to mess with wires. I use these in cases where I'd like to automate something where messing with the wires would be either tricky or outright impossible.
This added flexibility comes at a price: they are much more expensive than the DIY switches. But another cool thing about these wall sockets is that they come with sensors that measure your electrical consumption, so you can get a lot of cool data by having a bunch of these around your home.
If you'd like something that uses WiFi instead, I can recommend the Sonoff S26 R2, which is what I used before discovering this IKEA Zigbee one. Just note that they suffer from the same issue of exposing your network to some random cloud as the DIY switch mentioned above.
This is a WiFi IR blaster that you can configure with IR commands, letting you create automations to control devices that rely on a remote control, like your TV.
In my case it turned out that newer Samsung TVs have some sort of API integration where you can control them over the network, but I used these blasters for a long time before discovering this. These also put your device on some Chinese cloud though, and in this case I'm not sure if custom firmware is available.
A hackable LED matrix that can easily be integrated into Home Assistant. Somewhat expensive for what it does, but works really well. I use it as a pretty clock, to show the weather, and to display relevant notifications. This is achieved by ignoring all the default firmware / app stuff and flashing it with Awtrix 3 (very easy, can be done via USB).
IKEA has lots of IoT devices like buttons, remote controls, motion detectors, temperature thingies, air quality monitors, and more. They are all Zigbee and thus very easy to connect and use. I think only the button and the remote control took a bit more effort, because you need to find out exactly how they work in order to build automations against them in HA, but nothing a simple Google search couldn't solve.
Last but not least, I have the official Home Assistant app installed on several of my devices. This gives HA access to a bunch of stuff like location, battery level and notifications, which gives me a ton of potential input data for automations.
It would be too complicated for me to explain every automation and script that I have, so I'll describe this at a higher level only. When you have all of this stuff connected to HA, you can basically do anything you want. Automations in HA are defined declaratively, and you can drop down to full Python scripts for anything more complex, so you are limited only by your creativity. As mentioned in the introduction, if HA doesn't natively support something that you want to do, then most likely someone has already developed an extension for it, and if for some reason that's not the case, you can code it yourself!
Here are some of my favorite automations:
As you can see, this is no different than coding any software. The biggest hurdle is providing an abstraction that allows HA to gather and send data to things that are not natively supported by the platform, but in my experience, even the most niche devices out there already have plug-and-play solutions developed by the community, which is amazing.
After investigating this, I thought the answer was interesting enough that I felt like writing an article about it.
To answer this question, we need to briefly explain how git works under the hood. There's also a TL;DR at the bottom if you'd like to skip the entire explanation.
It's somewhat commonly believed that git's commits are diffs, but this is not true. Commits are snapshots of your repository, meaning that when you make changes to a file, git will store a full copy of that file on your repository (there is an important exception, but let's keep it simple for now). This is why you can easily switch between commits and branches no matter how old they are; git doesn't need to "replay" thousands of diffs, it just needs to read and apply the snapshot for the commit you're trying to access.
Under the hood, git will store all different versions of your files in the .git/objects folder, and this is something we can play with in order to find out what will happen regarding the main question we're trying to answer.
Let's make a new git repo and add a file called swiftrocks.txt with the Hello World! contents, and commit it:
git init
echo 'Hello World!' > swiftrocks.txt
git add swiftrocks.txt
git commit -m "Add SwiftRocks"
If you now go to .git/objects, you'll see a bunch of folders with encoded files inside of them. The file we just added is there, but which one?
When you add a file to git, git will do the following things:
1) Prepend a header to the file's contents: the object's type ("blob" for regular files), followed by the content's size in bytes and a null byte.
2) Calculate the SHA-1 hash of this header + contents combination, which becomes the object's identifier.
3) Compress the result with zlib and store it inside .git/objects, using the first two characters of the hash as the folder name and the remaining characters as the file name.

We can locate our file in the objects folder by reproducing this process, and luckily for us, we don't have to code anything to achieve this. We can find out what the resulting hash for a given file would be by running git hash-object:
git hash-object swiftrocks.txt
980a0d5f19a64b4b30a87d4206aade58726b60e3
In my case, the hash of the file was 980a0d5f19a64b4b30a87d4206aade58726b60e3, meaning I can find the "stored" version of that file at .git/objects/98/0a0d5f19a64b4b30a87d4206aade58726b60e3. If you do this, however, you'll notice that the file is unreadable because it's compressed. Similarly to the previous case, we don't have to code anything to decompress this file! We just need to run git cat-file -p, and git will do so automatically for us:
git cat-file -p 980a0d5f19a64b4b30a87d4206aade58726b60e3
Hello World!
There it is! Let's now make a change to this file and see what happens:
echo 'Hello World (changed)!' > swiftrocks.txt
git add swiftrocks.txt
git commit -m "Change swiftrocks.txt"
git hash-object swiftrocks.txt
cf15f0bb6b07a66f78f6de328e3cd6ea2747de6b
git cat-file -p cf15f0bb6b07a66f78f6de328e3cd6ea2747de6b
Hello World (changed)!
Since we've made a change to the file, the SHA1 of the compressed contents changed, leading to a full copy of that file being added to the objects folder. As already mentioned above, this is because git works primarily in terms of snapshots rather than file diffs. You can even see that the "original" file is still there, which is what allows git to quickly switch between commits / branches.
git cat-file -p 980a0d5f19a64b4b30a87d4206aade58726b60e3
Hello World! # The original file is still there!
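As an aside, none of this storage process is magic, and we can reproduce it with ordinary command-line tools. The sketch below assumes sha1sum from coreutils (shasum works on macOS) and python3 for the zlib part, and reuses the object path from my run above:

```shell
# git's blob hash is just the SHA-1 of "blob <size>\0<contents>".
# "Hello World!" plus the trailing newline added by echo is 13 bytes:
printf 'blob 13\0Hello World!\n' | sha1sum
# 980a0d5f19a64b4b30a87d4206aade58726b60e3  -

# The stored object is the same data, zlib-deflated; inflating it
# reveals the "blob 13" header, a null byte, and then the contents:
python3 -c "import sys, zlib; sys.stdout.buffer.write(zlib.decompress(open(sys.argv[1], 'rb').read()))" \
    .git/objects/98/0a0d5f19a64b4b30a87d4206aade58726b60e3
```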
Now here's the relevant part: What happens if we change our file back to its original contents?
echo 'Hello World!' > swiftrocks.txt
git add swiftrocks.txt
git commit -m "Change swiftrocks.txt back"
git hash-object swiftrocks.txt
980a0d5f19a64b4b30a87d4206aade58726b60e3
The hash is the same as before! Even though this is a new commit making a new change to the file, the hashing process allows git to determine that the file is exactly the same as the one we had in previous commits, meaning that there's no need to create a new copy. This will be the case even if you rename the file, because the hash is calculated based on the contents, not the file's name.
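We can double-check the claim that names don't matter by hashing the same contents under two different file names (a quick sketch inside the same repo; the copied file name is just for illustration):

```shell
# Identical contents produce the identical object ID, regardless of name
cp swiftrocks.txt renamed.txt
git hash-object swiftrocks.txt renamed.txt
# 980a0d5f19a64b4b30a87d4206aade58726b60e3
# 980a0d5f19a64b4b30a87d4206aade58726b60e3
```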
This is a great finding, but it doesn't fully answer the original question. We now know that renaming files will not result in new copies of those files being added to the objects folder, but what about folders? And how are those files and folders attached to actual commits?
The most useful thing to know right off the bat is that commits are also objects in git. This is why you might have seen other folders / files in .git/objects when first inspecting it; the other files were related to the commits you made when adding the file.
Since commits are also objects, we can read them with git cat-file just like with "regular" files. Let's do it with our latest commit (26d4302 in my case):
git cat-file -p 26d4302
tree 350cef2a8054111568f82dc87bbd683ee14bb1a6
parent 2891fe1393c9e1bff116c1b58a30bcf85e0596a8
author Bruno Rocha <email> 1733136171 +0100
committer Bruno Rocha <email> 1733136223 +0100
Change swiftrocks.txt back
As you can see, a "commit" is nothing more than a small text file containing the following bits of information:
- The hash of the commit's parent(s)
- The author and the committer, alongside their timestamps
- The commit message
- The hash of a "tree" object, which describes what the repository's file system looks like in that commit
In this case, what we're interested in is the last point. Luckily for us, trees are also objects in git. Thus, if we want to see what the file system looks like for that particular commit, we just need to run git cat-file -p against the commit's tree hash:
git cat-file -p 350cef2a8054111568f82dc87bbd683ee14bb1a6
100644 blob 980a0d5f19a64b4b30a87d4206aade58726b60e3 swiftrocks.txt
Like with commits, tree objects are also very simple text files. In this case, the tree states that there's only one file (a blob) in the repository, which is a file called swiftrocks.txt with the 980a0d5f... hash. We've already uncovered that git prevents individual files from being duped, but let's see how this is reflected in the tree object:
(made a commit adding some copies, and did cat-file -p on the new commit / tree)
100644 blob 980a0d5f19a64b4b30a87d4206aade58726b60e3 swiftrocks.txt
100644 blob 980a0d5f19a64b4b30a87d4206aade58726b60e3 swiftrocks2.txt
100644 blob 980a0d5f19a64b4b30a87d4206aade58726b60e3 swiftrocks3.txt
The tree object references the new copies and their different names, but as expected, their hashes all point to the same underlying object under the hood.
If we add folders to our repository, the tree object will include references to other tree objects (related to each of those folders), allowing you to recursively inspect each folder of that commit's snapshot. Here's an example:
100644 blob dd99cb611e0c77b2214392b253ed555fb838d8ee .DS_Store
040000 tree 350cef2a8054111568f82dc87bbd683ee14bb1a6 folder1
040000 tree 11ca8c2fe64b078be34824f071d32a560aba62a7 folder2
100644 blob 980a0d5f19a64b4b30a87d4206aade58726b60e3 swiftrocks.txt
As you can see above, the output directly identifies what each hash is so that you know exactly what you're looking at. (An alternative is to run git cat-file -t, which returns the "type" for a given object hash.)
The important bit to know here is that tree objects (and commits) are calculated and stored just like regular file (blob) objects, meaning they follow the same rules. This means that if the contents of two folders are exactly the same, git will not create a new tree object for those folders; it will simply reuse the hash it had already computed in the past, just like in the case of files:
040000 tree 350cef2a8054111568f82dc87bbd683ee14bb1a6 folder1
040000 tree 350cef2a8054111568f82dc87bbd683ee14bb1a6 folder1 (copy)
However, since tree objects contain references to a folder / file's name, renaming something can result in new tree objects being created for that folder / file's parent tree in order to account for the name change, resulting in new hashes and tree objects recursively all the way up to the root of the repository. This will also be the case when moving files / folders.
The above snippet is one example of this. Even though git was able to avoid duplicating the internal contents of folder1, git still needed to generate a new tree object for its parent in order to account for the fact that a new folder called folder1 (copy) exists. If there are more parents up the chain, they would also require new tree objects.
Whether or not this would be a problem depends on where exactly the change is being made. If the change is too "deep" into the filesystem and / or the affected trees contain a massive number of files then you'd end up with lots of potentially large new tree objects. Still, as you can see, tree objects are quite simple, so you'd need a truly gargantuan repository and / or unfortunate folder setup for this to be an actual problem.
If you do have a setup that is bad enough for this to be an issue, then the good thing is that there are ways to improve it. By understanding how tree objects are created and which files change / move more often in your repo, it's possible to optimize the structure of your repository to minimize the "blast radius" of any given change. For example, placing files that change very often closer to the root of the repo could reduce the number of trees that would have to be regenerated and their overall size.
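The tree deduplication discussed above is also easy to verify from scratch. This sketch creates a fresh repo with two folders holding identical contents and prints the root tree; the user.name / user.email flags are only there so the commit works without global git config:

```shell
# Two folders with identical contents end up sharing a single tree object
cd "$(mktemp -d)" && git init -q
mkdir folder1 folder2
echo 'Hello World!' > folder1/swiftrocks.txt
echo 'Hello World!' > folder2/swiftrocks.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -qm "Add folders"
# The root tree lists both folders with the SAME tree hash:
git cat-file -p "HEAD^{tree}"
# 040000 tree 350cef2a8054111568f82dc87bbd683ee14bb1a6	folder1
# 040000 tree 350cef2a8054111568f82dc87bbd683ee14bb1a6	folder2
```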
At the beginning of this article, I mentioned that there are cases where commits are not snapshots. While this is not particularly relevant for this article, I wanted to briefly cover this as it's an important aspect of how git works.
We've seen that git will make copies of your files when you change them, but this introduces a massive problem: If a particular file happens to be really big, then duplicating it for every small change could be disastrous.
When this is the case, git will pivot to calculating change deltas instead of making full copies of the file. This feature is called packfiles and is automatically managed by git for you. I recommend reading this great write-up by Aditya Mukerjee if you'd like to know more about it.
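If you'd like to see this mechanism for yourself, you can force git to pack a repository's loose objects and then inspect the result. This is only a sketch, but git gc and git verify-pack are the standard tools for it:

```shell
# Repack everything reachable into a packfile (deltas are computed here)
git gc --quiet
# Dump the pack's contents: object type, size, and delta info per object
git verify-pack -v .git/objects/pack/pack-*.idx
```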
The team then organizes itself and executes the project. After a certain amount of time, they deliver exactly what was asked of them. But then, one of these things happens (choose at least one):
What do you think happened here? Is this the Product Manager's fault for giving wrong requests, or the engineering team's fault for not understanding what was asked of them?
Some people will say this is the PM's fault, and in some cases it might be true. But the situation I want to cover here is the scenario where this is the engineers' fault, because it's something I've seen countless times throughout my career.
The reason the scenario above happened (when the PM is not the one at fault) is because the engineers focused too much on the immediate task they were given, when what they should've done is focus on the problem behind the task, by asking themselves questions such as:
It's only after understanding this context that you can consider yourself ready to come up with a technical solution to it. But what happens a lot is that some engineers tend to immediately jump into problem-solving as soon as they are presented with a task, leading to solutions that despite being "accurate" when evaluating the task in isolation, completely miss the mark when looking at the bigger picture. In short, the issue was that the engineers in that situation had too much tunnel vision.
Understanding the context behind tasks allows you to come up with a solution that fits the bigger picture, making it possible not only to solve your users' problems, but also to do so in a way that is clean, scalable, easy to maintain, and that benefits everyone (as opposed to being beneficial to you and your team, but a pain in the ass for everyone else in the company).
In The Staff Engineer's Path, Tanya Reilly describes this as the Local vs Global Maxima problem, where the Local maxima means focusing on what's good for you or your team in an individual sense (the tunnel vision situation above), and the Global maxima means focusing on what's good for the company as a whole, regardless of whether or not it would be ideal for you as an individual (the big picture situation). In the book, she presents this idea to argue that this focus and ability to gather context about the bigger picture is a core ability of Staff+ level engineers and a minimum requirement for those aspiring to reach that level.
Although I agree with her that only Staff+ engineers should be expected to be masters at this, I do also believe that this is something everyone should attempt to do, regardless of level. Doing so not only improves your knowledge of how your company works and is structured, but also teaches you how to better determine what does and doesn't matter when trying to solve a particular problem, enabling you to be more effective both at coding and at providing value for your company.
But one thing that I've learned in my career is that working at such companies is not for everyone. The experience of working at a large company is extremely different from that of a startup, so if you're not aware of those differences, you can end up with a big (negative) surprise down the road that makes you regret your choices.
In this article, I'd like to show you the difference between companies of different sizes so that you can determine which one better fits your personal style and interests.
Disclaimer: I haven't worked at every company to ever exist on this planet, so this is obviously not a 100% perfect model for every company out there. There are always exceptions; this is just a basic description of the average case.
Pros:
Cons:
Working at a startup is the most fun I've had in my career, but I think it takes a special kind of person to thrive in this environment.
I feel that working at a startup is ideal if you have an entrepreneurial mindset, because you not only get to be constantly exposed to the organizational side of things, but are likely also involved in it. This allows you to build a lot of experience with how companies work under the hood, which I've found to be really handy overall.
Another thing I like about startups is that the vibe is generally very positive. Since there aren't a lot of people in the company, there's basically no bureaucracy and chances are that everyone gets along well. This also makes it so that you can grow quite fast in the company, provided that the company itself is doing well in the first place.
In general, startups are a high-risk high-reward situation. While you can win big fast, you can lose big just as fast because any minor setback can destroy the entire company. This is another reason why I find them best for those with an entrepreneurship itch. The work itself also tends to be very chaotic and thus not something that someone looking for stability would enjoy.
Another important downside to mention is that the engineering side of things tends to be a bit dull. Since startups often prioritize speed, building things in-house tends to be down-prioritized in favor of easy out-of-the-box and / or plug-and-play open-source solutions, making a software engineering job feel more like assembling LEGO than anything else. Every job I've had as a mobile engineer at a startup was basically 100% building UI, which became really boring to me after a while.
Pros:
Cons:
The mid-level company is the company that is big enough to overcome the downsides attributed to startups, but nowhere near big enough to have the pros attributed to large companies. In general, the pros/cons of a mid-level company are essentially the averages of the other two cases in this article.
The primary problem with mid-level companies is that they try to mimic the processes and objectives of large companies while having nowhere near as many resources. This results, for example, in teams being tasked with solving massive engineering infrastructure challenges, because that's what large companies do, even though almost no one on the team is skilled enough to pull it off (likely because most of those who did have such skills ended up getting poached by the large companies). This puts giant pressure on those select few, which on one hand can be seen as a great growth opportunity, but on the other puts the company in a difficult position, as said people are likely to either burn out or leave in favor of an actual large company.
With that said, I find that mid-level companies still offer great growth opportunities. I think they are good choices for people who like the vibe and stability of large companies but can't stand the downsides of working at actual large companies.
Pros:
Cons:
By "large company", we're talking about tech giants like Google, Apple, Meta, and so on. Looking at the pros, it's easy to see why people dream of working at such places. But what a lot of people don't know is that there are strong downsides attached to working at such companies, and being able to tolerate them is critical to succeed there. I've met many folks who couldn't and ended up leaving.
The first and most critical downside is that everything is covered by a thousand layers of bureaucracy and politics. I cannot overstate how unbearable this is, but it's how things are at companies of this size.
When you work at a startup, if you want to do something, you just go there and do it. For a mid-level company, it might be slightly more annoying, but still doable. But when you work at a large company, if you want to do something, you're going to have to have a meeting about having a meeting about drafting a document about a meeting about drafting another document, which hopefully will be picked up by the planning season several months later, leading to more meetings and documents until hopefully you get to do some actual work around a year later, unless the company re-orgs sometime during this process, in which case you'll have to drop everything and start from scratch.
This boundless bureaucracy extends everywhere, including the promotion process. Growing at such companies can be extremely hard as the process involves considerable amounts of bureaucracy and things that are outside of your control, especially for Staff+ positions. Which team you're part of also plays a big role as some teams are bound to have more opportunities to drive impact than others in a company of this size.
I think that thriving at a large company is directly correlated to how much you can tolerate such politics. No one would look at this description and be happy, but if you look at it and feel that you could take it, then working at a large company might be for you.
As I mentioned in the beginning, this is just a basic description of the average case. There are thousands of exceptions that surely don't fit these descriptions. But the idea is to show that the concept of trade-offs also applies to companies and cultural fit. Just because one company is larger than another doesn't necessarily mean it's better for you; depending on what you value, you might find that smaller companies are a better fit.
I believe that learning a language is not a matter of talent, but that of dedication and following a good process. In this article, I would like to share the process I used more than once (and am still using) to tackle the challenge of learning a new language with great success. I currently speak three languages (native Portuguese, English, Swedish), and am in the process of learning a fourth (Japanese).
For me, learning a new language consists of three major steps:
My first step towards learning a new language consists of understanding the basics of the language. This includes things like learning how sentences are structured, how words should be pronounced, how to count, and any other language-specific basics that may apply (for example, for Swedish, learning the difference between en / ett, or the concept of soft / hard vowels).
The word "basic" here is very important. I want to have a good understanding of how things work in that language, but I don't want to waste time reasoning about complex grammar rules. Think of the sorts of things a mom would correct a child for; my mom would correct me if I used a word in the wrong place, but she wouldn't lecture me about the theory of participle clauses.
Knowledge of basic grammar massively pays off because later on it will simplify the process of expanding the vocabulary. Although at this point I will not know many words, my knowledge of basic grammar will allow me to more easily figure out how to pronounce any new words that I may encounter in the future, to properly classify them (subject? verb? noun? adjective? present? past tense?) based on their format and position in a sentence, and in some cases even accurately guess their meaning based on this information.
Although in the next section I'm going to complain about traditional language learning books / schools, I think they are one of the best resources for learning basic grammar. Basic grammar is also usually straightforward and can be mastered in just a couple of months.
But this is about as far as those language schools and books will help you out, because in my opinion they massively fuck up pretty much everything beyond this point.
In my experience, after covering basic grammar, schools and books usually follow up by teaching advanced grammar. I think this is a complete waste of time and is why, in my opinion, many adults struggle with learning a language despite attending classes for multiple years. Learning complex grammar rules will not help you learn the language; even natives don't know this stuff!
What natives know is vocabulary, and this is what I believe is the right focus at this stage. My goal then becomes to expand my vocabulary as much as possible by immersing myself in the language, consuming as much media as I can and as frequently as possible. This is something that is usually referred to as the immersion method.
Here are some examples of things I do in this step:
You might think that this doesn't make sense because you won't understand anything, but that's exactly the point. Children also don't understand anything at first, yet magically they seem to just "get it" after one point, simply by being exposed to the language. This is because our brains are big pattern-matching machines; the more you expose yourself to a language, the more patterns / words you discover, which leads to further discoveries until you eventually reach a point where everything just clicks. In other words, the purpose of this step is to try to replicate how a child would learn a language at home.
I find watching shows / YouTube videos to be particularly excellent for this because you can usually guess what a word means based on the context of the scene, meaning you don't need to spend as much time translating words as you would with other types of media.
Duolingo can also be a good tool to expand your vocabulary, as long as you don't use it in isolation. Although Duolingo is a good way to learn new words, it tends to be quite bad at everything else (e.g. grammar), so I think it's important to back it up with the other methods mentioned in this section. It's also important to note that the quality of Duolingo's exercises varies greatly between languages, so looking for reviews before getting started is a must here.
If you tend to quickly forget things like I do, a spaced repetition system can greatly assist you with retaining all of this new knowledge. In my case I quite enjoyed using Readwise for this, but I know many who have used Anki / traditional flashcards with great success.
This entire process is very painful at first, but gets easier with time as your vocabulary improves. The unfortunate part is that this is a lengthy process; it can take several years of doing this before reaching a point where reading / listening to the language becomes effortless, and I think there's no way around it.
Although the previous step is excellent at making me good at reading and listening, in my experience it doesn't necessarily help me become good at speaking. When I was learning Swedish for example, although I had an easy time understanding what people were saying and knew in theory what to answer back, I still had a very hard time doing so, mostly because I just wasn't used to it. Although I knew the vocabulary in theory, it would still take several seconds for the right words to emerge in my mind when having a conversation with someone.
Unlike the other steps, I don't think there is any special method that one can use to become good at speaking a new language. This is something you just have to keep doing until your brain gets used to it. It's a massive advantage if you actually live in the country in question, but this is doable even if you don't, as there are many online services designed around connecting you with native speakers of a particular language. I personally never used them though, so I cannot comment on how effective they are.
Another interesting thing to mention is that nowadays there are websites that connect LLMs to voice recognition models, allowing you to chat with something like ChatGPT with your voice for the purpose of language learning. I've tried one but personally didn't like the experience, as talking to a robot felt completely different from talking to a real person, but if you'd like to try it out, you can easily find them on Google (there are hundreds of websites for this as of writing).
I think the most important part here is to resist the urge to switch to another language when you start to struggle, especially if you live in a country like Sweden where the natives are good at English. (In fact I would say that the hardest thing about learning a language like Swedish is not the language itself, but rather that Swedes are so good at English that they automatically switch to it when they see you struggling with Swedish, making it almost impossible for you to improve!)
Although we can divide the process of learning a new language into logical steps, we're still talking about a multi-year effort. That's just how it works; there are no shortcuts. If you struggle with learning a new language, I hope this post was able to teach you something new that can help you in your journey!
This is true across the organization, including in mobile engineering. Early engineers don’t just write the code that builds the foundation for the app’s future success (while building up the tech debt that future engineers will pull their hair out over) but also the initial infrastructure both for how teams get things done and how they ship code out the door.
This doesn’t last forever though. As teams grow they make the transition out of generalist, jack-of-all-trades roles and begin focusing on specialization. But one part of the org where there is often a lag in making this transition — and sometimes a very long one — is on the mobile team. Very few mobile engineers will have time to support both their team's app features and the series of Ruby scripts that support their release process, yet the expectation that they do both often lingers far beyond when other teams in the organization have fully specialized.
In this article, we'll look at why this happens and how big tech companies follow up on this problem.
My approach is to pick a handful of sessions that are immediately useful / interesting to me and ignore the rest. I find this to be a good strategy because in my experience trying to keep up with things that you don't care about / don't have an immediate need for is a very easy way to burn yourself out, especially because Apple has this awful habit of announcing things and then proceeding to make them completely obsolete the next year. If in the future I happen to start working with something where one of the ignored sessions would be handy, I go back and watch it. Otherwise, it stays unwatched.
I think the Keynote and the State of the Union should always be watched, so I won't cover them here.
I really like this feature. I greatly recommend watching this, as it shows not only how it works but also many interesting tricks that you can do with it. Even better, it's open source and even works on VSCode, although I haven't tried the latter myself.
I watch StoreKit sessions because I have an app that has IAPs, so I'm always eager to see what's new in this regard. StoreKit Views aren't a new concept but I'm glad to see they made them more powerful. They also show how you can test IAPs directly in Xcode, which is pretty neat.
This session happens every year and it's always a good watch. As someone who has been working with build systems a lot recently, I'm particularly interested in the new explicit modules feature.
This is a great watch as Xcode 16 changed quite a bit in good ways. The time profiler now has a flame graph, and the new unified backtrace view looks awesome.
I haven't worked directly with UI for a very long time, but I enjoy watching these to see what they're improving. I like the improved interop between SwiftUI and UIKit and the new fluid animation type. The new UIUpdateLink type is also very interesting.
Similarly to "What's new in UIKit", I enjoy watching these to see what's up with the frameworks. Particularly great things this year are custom containers, how things like @Environment and state in Previews have been greatly simplified, and a much better integration with scroll views.
This is one of those things that I don't have a use for but watch anyway because it sounded cool, and indeed it was. I think this can also work as a hardcore app size optimization for your apps if you're fine with losing a bunch of Swift's features.
Watching sessions about new iOS features is always a good idea if they relate to something that you can potentially implement in your apps. I can think of many things that I can use App Controls for, and it also seems really easy to implement since it's all based on the existing Widgets infra.
This session is great. Xcode has a ton of tools and shortcuts that we're not aware of, and many of them are extremely useful if you can remember that they exist!
Apart from talking about the new Swift 6 feature of the same name, this session goes into great detail about how imports work in Swift/Obj-C and how to debug them. I learned a lot from this one.
I am not super interested in SwiftUI improvements, but the new custom container feature is a great addition. I believe this is something that will be used a lot, so it's worth it to check this session that shows how it works and what you can do with it. This session also shares many interesting details about how subviews work in SwiftUI.
Most people will probably never use this feature, but it's one of those things that are really cool in practice and worth a look. You might also want to check this out because generics involving the new ~Copyable type are really complicated, so watching this session will help you be less confused if you end up bumping into it.
Every year has a session on LLDB, and this year's one is especially good. It's hard to summarize this one because they show a ton of different things, so just go there and watch it! I was surprised to find out that you can open crash logs in Xcode (maybe it was always a thing?) and that you can create "manual" breakpoints by calling raise(SIGSTOP).
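The SIGSTOP trick mentioned above is simple enough to sketch. Here's a hypothetical helper (not code from the session) that shows the idea:

```swift
import Darwin

/// Suspends the process as if a breakpoint had been hit.
/// When running under LLDB, the debugger catches the signal,
/// letting you inspect state and `continue` as usual.
func manualBreakpoint() {
    raise(SIGSTOP)
}
```

This can be handy for pausing at conditions that are easier to express in code than in a breakpoint editor.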
This was not as actionable as I thought it would be, but I still found it to be interesting because it contains "official" evidence about how structs/protocols can be bad for performance / app size if you misuse them, which is something I've covered in a recent talk and that a lot of people wanted to know more about.
This session not only shows interesting examples of how to use the memory debugger and instrument, but also shares a lot of interesting pieces of info about the difference between weak and unowned that I believe weren't documented before, including how to debug their performance! It also now serves as an "official" source for the autoreleasepool trick I wrote an article about a long time ago, which is pretty neat.
Swift is known for its type safety, meaning the compiler (usually) doesn't allow you to reference or do things that might potentially not exist / be invalid; the complete opposite of languages like Obj-C, where the compiler allows you to do pretty much whatever you want at the cost of compile-time safety.
But here's an obscure fact about Swift: The language does support ObjC-like Selectors / forward declarations; it's just that we're not supposed to use them. If you know how a function is going to be named in the compiled binary, you can use the @_silgen_name attribute to craft a direct reference to that function, allowing a module to reference and call it regardless of whether or not it actually has "visibility" of it:
@_silgen_name("somePrivateFunctionSomewhereThatICantSee")
func refToThatFuncIReallyWantToCall()
func foo() {
refToThatFuncIReallyWantToCall() // Just called something I wasn't supposed to be able to!
}
This is used extensively by the Swift standard library to create something akin to the old-school forward declarations in Obj-C / C, allowing it to call functions that live deeper in the Swift Runtime even though it shouldn't be able to. As denoted by the underscore, this is not an official feature of Swift, but rather an internal detail of the compiler that is not meant to be used outside of this specific internal case. Nonetheless, you can use it in your regular Swift apps, so if you know what you're doing and are aware of the consequences / implications, you can do some pretty neat stuff with it.
@_silgen_name and symbol mangling

Since Swift has namespacing features, the names you give to your Swift functions are not actually what they will be called in the compiled binary. To prevent naming collisions, Swift injects a bunch of context-specific information into a function's symbol to differentiate it from other functions in the app that might have the same name, in a process referred to as symbol mangling:
// Module name: MyLib
func myFunc() { print("foo") }

$ swiftc -emit-library -module-name MyLib test.swift
$ nm libMyLib.dylib
// MyLib.myFunc()'s "real" name is:
$s5MyLib6myFuncyyF
What @_silgen_name does under the hood is override a function's mangled symbol with something of your choosing, giving us the ability to reference functions in ways that Swift generally wouldn't allow us to (which I'll show further below).
The attribute can be used in two ways: to override the symbol of a declaration, and to override the symbol of a reference to a declaration. When added to a declaration (i.e. a function with a body), the attribute overrides that function's mangled name with whatever you passed to the attribute:
@_silgen_name("myCustomMangledName")
func myFunc() { print("foo") }
$ swiftc -emit-library -module-name MyLib test.swift
$ nm libMyLib.dylib
// MyLib.myFunc()'s name is now:
myCustomMangledName
This is interesting, but what we truly care about here is what happens when you add it to a function that doesn't have a body. This would usually be invalid Swift code, but because we've added @_silgen_name to it, the compiler will treat it as valid and assume that this function is somehow defined somewhere else under the name we passed to the attribute, effectively allowing us to build forward declarations in pure Swift:
@_silgen_name("$s5MyLib6myFuncyyF")
func referenceToMyFunc()
func foo() {
// Successfully compiles and calls MyLib.myFunc(), even though
// this module doesn't actually import the MyLib module
// that defines myFunc()
referenceToMyFunc()
}
(This only works if the "target" is a free function, so for things like a class's static functions you'll need to first define a function that wraps them.)
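As a hypothetical sketch of that workaround (all names here are made up): the module that owns the type exposes a free-function wrapper with a symbol name under our control, and other modules forward-declare that wrapper instead:

```swift
// In the module that owns the type:
enum MyService {
    static func start() { print("started") }
}

// Free-function wrapper whose symbol name we control:
@_silgen_name("myServiceStartWrapper")
func myServiceStartWrapper() {
    MyService.start()
}

// In a module that wants to call it without importing MyService:
@_silgen_name("myServiceStartWrapper")
func startMyService()
```

The wrapper itself is a free function, so it satisfies the restriction while still reaching the static member.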
Now, it should be noted that knowing a Swift function's mangled name in advance ($s5MyLib6myFuncyyF, in the above example) is not straightforward, as the compiler doesn't expose an easy way of predicting what these values will be, but we can fix this by using @_silgen_name on the declaration itself to change it to something that we know and control, like in the previous example where we replaced it with "myCustomMangledName". Note that you only need to worry about this when referencing Swift functions; for Obj-C / C, a function's "mangled name" will be the function's actual name, as those languages have no namespacing features.
@_silgen_name("myCustomMangledName")
func referenceToMyFunc()
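As an aside, if you ever need to double-check a mangled name, the Swift toolchain ships with a demangler. A hypothetical session might look like this (the dylib name and symbol are from the earlier example):

```shell
$ nm libMyLib.dylib | grep myFunc
... T _$s5MyLib6myFuncyyF
$ xcrun swift-demangle '$s5MyLib6myFuncyyF'
$s5MyLib6myFuncyyF ---> MyLib.myFunc() -> ()
```

This only confirms a name after the fact; it doesn't help you predict one, which is why controlling the symbol yourself is the more reliable approach.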
It's critical to note that this is extremely unsafe by Swift compiler standards as it sidesteps any and every type safety check that would normally apply here. The compiler will not run any validations here; it will instead completely trust that you know exactly what you're doing, that somehow these functions will exist in runtime even though this doesn't seem to be the case during compilation, that any custom names you're using are unique and not causing any potential conflicts with other parts of the codebase, and that whatever parameters you're passing to the forward-declared functions are correct and managed properly memory-wise (if your target is a C function, you need to do manual memory management with Unmanaged<T>).
If everything is done correctly, you just got yourself a nice forward-declared function; if not, you'll experience undefined behavior. You do get a link-time error though if the functions don't exist at all, which is pretty handy, as I've noticed that in addition to all of the above concerns, the compiler may also sometimes accidentally tag these functions as "unused" depending on how you declare them, causing them to be stripped out of the compiled binary when they should not be. I am sure there are way more things that can go wrong here that I'm not aware of.
Lack of safety aside, there are two situations where I find this attribute useful outside the Swift standard library. The first one is being able to do C interop without having to define annoying headers and imports, similar to how the Swift standard library has been using it. It seems that a lot of people have been doing this, but I'll not cover this here because it's not the use case that led me to use this attribute. I'll just point out that this is something you also need to be very careful about, particularly because @_silgen_name functions use the Swift calling convention, which is incompatible with C (thanks Ole Begemann for pointing that out!).
The second one however, which is what I have been using this for, is that when applied strategically, you can use this attribute to greatly improve your app's incremental build times.
Let's assume that we're developers of a large modularized Swift app that has some sort of type safe dependency injection mechanism to pass values around. For this mechanism to work, we might end up with a "central" registry of dependencies that imports every module and configures every possible dependency these modules might request:
import MyDepAImplModule
import MyDepBImplModule
import MyDepCImplModule
...
func setupRegistry() {
myRegistry.register(MyDepA(), forType: MyDepAProtocol.self)
myRegistry.register(MyDepB(
depA: myRegistry.depA
), forType: MyDepBProtocol.self)
myRegistry.register(MyDepC(
depA: myRegistry.depA,
depB: myRegistry.depB
), forType: MyDepCProtocol.self)
}
Something like this allows us to have a nice and safe system where features are unable to declare dependencies that don't exist, but it will come at the cost of increased incremental build times. Importing all modules like this will cause this module to be constantly invalidated, and the bigger your project gets, the worse this problem will get. In my personal experience, projects with a setup like this and with several hundred modules can easily end up with a massive 10–60 second delay to incremental builds, depending on the number of modules and how slow your machine is.
However, by using forward-declared @_silgen_name references to a function that wraps the initializers instead of referencing these initializers directly, we can achieve the same injection behavior without having to import any of the modules that define said initializers!
@_silgen_name("myDepAInitializer") func makeMyDepA() -> MyDepAProtocol
@_silgen_name("myDepBInitializer") func makeMyDepB(_ depA: MyDepAProtocol) -> MyDepBProtocol
@_silgen_name("myDepCInitializer") func makeMyDepC(_ depA: MyDepAProtocol, _ depB: MyDepBProtocol) -> MyDepCProtocol
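To illustrate (a hypothetical sketch, assuming each wrapper returns its corresponding protocol type), the registry can then be written without importing any of the implementation modules:

```swift
func setupRegistry() {
    // No MyDep*ImplModule imports needed; the linker resolves the
    // forward-declared wrappers via their custom symbol names.
    let depA = makeMyDepA()
    let depB = makeMyDepB(depA)
    myRegistry.register(depA, forType: MyDepAProtocol.self)
    myRegistry.register(depB, forType: MyDepBProtocol.self)
    myRegistry.register(makeMyDepC(depA, depB), forType: MyDepCProtocol.self)
}
```

Touching an implementation module now no longer invalidates this registry module, which is where the build time savings come from.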
This allows projects like this to completely eliminate these build time bottlenecks, but it comes at the price of losing all type safety around this code. This might sound like a bad trade-off since type safety is the reason why a developer would want to have a dependency injection setup like this in the first place, but if you have other ways of validating those types and dependencies (such as a CLI that scans your app and automatically generates / validates this registry), you can abstract the dangerous bits away from your developers and effectively enjoy all the build time improvements without having to worry about any negatives other than having to be extra careful when making changes to this part of the code.
Forward-declaring Swift functions allows you to do all sorts of crazy things, but remember, this is not an official feature of the language. As mentioned in the beginning, my recommendation is that you should avoid messing with internal compiler features unless you're familiar with how Swift works under the hood and know exactly what you're doing.
But putting this aside, one thing that I tend to reflect on when learning about features like this is how the danger involved in using them is not so much about the features themselves, but rather that their behavior might change without warning.
Although I understand the Core team's vision of making Swift a safe and predictable language, I think there is a real demand for having poweruser-ish / "I know this is dangerous, I don't care" features like this officially supported in Swift, and it would be amazing if @_silgen_name could be recognized as one such feature. I like what you can achieve with it, and I would love to be able to use it without fear that it might change or stop existing in the future.
Some are about iOS development specifically, but most relate to general software engineering. If you strive to be a world-class developer, these books and resources will help you get there.
The following are not “books” in the traditional sense, but are nonetheless great software engineering resources that I think should be present here.
Over the last couple of years, the tech world has been witnessing many companies updating their products to add support for this new technology, including tech giants like Apple, who introduced several APIs and device features related to passkeys in iOS 16, and Google, who has been pushing passkeys hard since announcing passkey support for Chrome and Gmail back in 2022.
When I first saw Apple and Google’s announcements, I wasn't sure what I was looking at. The fact that the tech industry has been trying to move away from old-school passwords wasn't news to me, but I asked myself: didn't we already solve this problem when products started allowing users to register a "secondary" authentication requirement, such as receiving an SMS containing a PIN code?
I then decided to look deeper into this topic, and was so pleasantly surprised by what I found that I thought it would be cool to share it with you. In this article, we'll take a look at what passkeys are, why companies like Apple and Google are adding support for them, how they work under the hood, and whether or not you should convert your own accounts to use them and build support for them in your mobile apps!
Humans' obsession with dividing things into groups is not unknown to psychology. The ability to quickly classify information is a core contributor to humanity's evolution and is something we start doing as soon as we're born.
But we need to be careful with the fact that sometimes this classification "feature" goes wrong. Instead of categorizing something as "this or that", we sometimes go for something more in terms of "us vs them", which not only leads to a lot of unnecessary conflict, but is also bad for our lives in general.
As far as I understand, the exact reason why humans do this is not fully understood. You might have heard of the "tribalism" theory, which posits that humans are hardwired to divide themselves in this "us vs them" fashion; this is mentioned a lot in pop culture, but from my understanding, this theory is heavily criticized by experts and should not be considered true. But explanations aside, we know that this happens and is something everyone needs to understand and overcome at some point.
When it comes to software engineering, the reality is most things cannot be cleanly divided into "right or wrong" boxes like that. Yes, some things are concrete and indisputable. If an OS-level API states that you should never call it outside of the main thread, then that's what you should follow.
But most things are the opposite of that. When we talk about general problems and best practices, it's extremely rare for them to have a clear right or wrong way to go. Instead, they depend on what you're trying to achieve, and everyone is trying to achieve something different. There is no right or wrong in these situations, only different approaches.
To add another layer to the problem, a lot of stuff is also largely subjective! Many of our daily choices boil down to personal preference, making it even more senseless to attempt to categorize such things into clean "right" or "wrong" boxes. The industry even has a saying for this: there are no solutions, only trade-offs.
One's ability to understand this concept is the greatest indicator of seniority in my personal opinion. It's so common for intermediate-level engineers to fail to understand the subjectivity of software engineering that I find that you can accurately gauge someone's experience level by simply observing how well they grasp this concept. Those who don't understand it overengineer things and tend to get bogged down in details that are either subjective or outright pointless, while those who do understand it keep things simple and display a much better ability to prioritize important work and ignore less important details.
In this article, I'd like to shine a light on some of the topics that iOS developers tend to be divided on as a way to help developers who still haven't cleared that hurdle understand that these topics are not as straightforward as we might tend to think. Overcoming this barrier is part of the process of becoming a more experienced software engineer, and is something almost everyone goes through in their careers, so it's not something to be anxious about. I also had a period where I thought I had all the answers to the coding universe!
When a new framework, tool, or programming language is released, it's not uncommon for developers to divide themselves into groups and argue about which one of them is "better", claiming that theirs is the only option and everything else is a mistake.
The problem with this line of thought is that it assumes that one tool was created to completely replace another, and while sometimes this may very well be the case, in most cases it's not.
As a developer, it's important to understand that different tools solve different problems. While there may be some overlap between them, they were likely designed with different use cases in mind.
The biggest example here as of writing is the SwiftUI vs UIKit discussion. Despite the two being largely different from each other, social media is full of content about how one is "better" than the other.
Yes, SwiftUI and UIKit are both frameworks for building UI, but they solve different problems. As covered in my earlier "Thoughts on SwiftUI vs UIKit" article, SwiftUI is amazing for simple projects but quickly becomes inferior to UIKit as the project grows in complexity. Neither of these frameworks is better than the other, they are simply different tools for different jobs.
Discussions about Hybrid vs Native development also fall into this category. Hybrid development has a bad reputation because it generally results in apps of very low quality, but it saves companies a lot of time and money. Most companies reject this trade-off as they determine that quality is more important than saving a few bucks, but that doesn't mean that nobody should do so. If you're starting a company but don't have a lot of time or money, hybrid development can be a good way to bootstrap your business. It's not fair to compare these two in a "better/worse" fashion because they don't target the same set of problems.
I find that one example of how things can be subjective in this context is how a developer uses git. There is a lot of discussion about whether you should use it via the CLI or as a dedicated GUI app, but there isn't much to be discussed here because this is something that entirely boils down to your personal preference. There are pros and cons to each approach, and you will know which one is the right one for you because you will feel that it better suits your set of preferences. Neither approach is universally right or wrong.
Architecture is usually the first thing that an iOS developer fights about. Every year we get a new architecture with some fancy acronym, that architecture gets a bunch of loyal followers, and then the groups start arguing about which architecture has the coolest name and solves the biggest number of problems. The first thing you learn is that MVC is terrible and should be avoided at all costs.
One unfortunate consequence of these fights around architecture is that they lead developers to pick architectures that solve problems they don't really have (while failing to solve the problems they actually have), which is guaranteed to make a project harder to maintain in the long run.
It's important to understand that there is no architecture that solves all problems. Just like in the tools example, different architectures are meant to solve different problems, and the right architecture for your project is the one that solves your particular set of problems. MVC for example, which developers love to hate for some reason, can be a great choice for simple projects!
Architecture is not something that you pick once and stick with forever, but rather something that you continuously adjust as your project evolves and you start having to deal with different sets of problems. I have been told that my talk about how Spotify's iOS app is architected is great at demonstrating this, so I'm mentioning it here in case you want to check it out!
The same applies to general programming advice that you find on the web. We have a lot of content creators in our community, and I find that most of them present their content in the following format: "here's a thing, here's how it works, and here's what you can do with it". This is what I also strive to do when writing content for this blog, and I like this format because it doesn't claim that something is the best way of achieving something; it simply shows you one possible way and leaves it for you to decide whether or not that's the right solution for you.
But every once in a while, the algorithm recommends content to me that is more in line with "here's a thing, and here's why you should always use it and abandon everything else". It's not about teaching you something new; it's about telling you that you're wrong about something. There's usually a spike of this type of content during WWDC week, when new APIs are released.
The problem with content like this is that most best practices are highly subjective. Even if the content is referring to a very specific problem, it's hardly the case that the problem in question has one single viable solution. As we've already mentioned a couple of times in this article, personal preference plays a major role in this type of stuff. Something very helpful to you might be terrible for someone else, so they cannot be classified in a universal "right or wrong" fashion.
Another common discussion point for iOS developers is whether or not you should learn computer science theory as part of your career. This is usually brought up whenever someone mentions a company that runs old-school programming puzzles (LeetCode) as part of its interview process.
This topic however is complex enough that it deserves its own article, and conveniently enough, such an article already exists! You can find more information about this in my "How necessary are the programming fundamentals?" article, but as a quick summary, this is a very complicated topic that has no objective right or wrong.
I hope this was able to help you see that some things in software engineering are more complicated than they might seem at first glance. Realizing this is an important step in a software engineer's career, and while this article will certainly not stop those wars from popping up every once in a while, I do believe that as a community we can help others get through this phase faster.
I find code signing to be interesting not just because of what it does, but because it's one of those things that iOS developers kinda just take for granted. We know it's there and we know how to deal with it, but we don't really stop to think why it's there or what it's doing under the hood. We just follow Apple's convoluted steps on how to make it work and move on with our lives.
In practice, code signing is an incredibly important safety feature of Apple's ecosystems. Knowing what it is and why it exists makes debugging issues related to it considerably easier, so I've written this article to help you understand it.
In typical SwiftRocks fashion, we're going deep into the Swift compiler to answer these and other questions about how async/await works internally in Swift. This is not a tutorial on how to use async/await; we're going to take a deep dive into the feature's history and implementation so that we can understand how it works, why it works, what you can achieve with it, and most importantly, what are the gotchas that you must be aware of when working with it.
Disclaimer: I never worked at Apple and have nothing to do with the development of async/await. This is a result of my own research and reverse-engineering, so expect some of the information presented here to not be 100% accurate.
Swift's async/await brought to the language a brand new way of working with asynchronous code. But before trying to understand how it works, we must take a step back to understand why Swift introduced it in the first place.
The concept of undefined behavior in programming languages is something that surely haunts anyone who ever needed to work with the so-called “precursor programming languages” like C++ or Obj-C.
Programming languages historically provided you with 100% freedom. There were no real guardrails in place to prevent you from making horrible mistakes; you could do anything you wanted, and the compilers would always assume that you knew what you were doing.
On one hand, this behavior on behalf of the languages made them extremely powerful, but on the other hand, it essentially made any piece of software written with them a minefield. Consider the following C code where we attempt to read an array's index that doesn't exist:
#include <stdio.h>

// arr only has two elements
void readArray(int arr[]) {
    int i = arr[20]; // out-of-bounds read
    printf("%i", i);
}
We know that in Swift this will trigger an exception, but we’re not talking about Swift yet, we’re talking about a precursor language that had complete trust in the developer. What will happen here? Will this crash? Will it work?
The short answer is that we don't know. Sometimes you'll get 0, sometimes you'll get a random number, and sometimes you'll get a crash. It completely depends on what the contents of that specific memory address happen to be at that specific point in time; in other words, the behavior of that read is undefined.
But again, this was intentional. The language assumed that you knew what you were doing and allowed you to proceed with it, even though that turned out to almost always be a huge mistake.
Apple was one of the companies at the time that recognised the need for a safer, modern alternative to these languages. While no amount of compiler features can prevent you from introducing logic errors, they believed programming languages should be able to prevent undefined behavior, and this vision eventually led to the birth of Swift: a language that prioritized memory safety.
One of the main focuses of Swift is making undefined behavior impossible, and today this is achieved via a combination of compiler features (like explicit initialization, type-safety, and optional types) and runtime features (like throwing an exception when an array is accessed at an index that doesn’t exist. It’s still a crash, but it’s not undefined behavior anymore because now we know what’s supposed to happen!).
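To contrast with the C example above, here's a minimal Swift sketch of the same out-of-bounds scenario (the names are mine, but the behavior is guaranteed by the language):

```swift
let arr = [1, 2] // arr only has two elements

// Unlike the C version, this is fully defined behavior: the runtime
// checks the index and deterministically traps with "Index out of range".
// let i = arr[20]

// Swift also gives us the tools to check safely instead of crashing:
if arr.indices.contains(20) {
    print(arr[20])
} else {
    print("Index 20 doesn't exist") // this branch runs
}
```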
You could argue that this should come with the cost of making Swift an inferior language in terms of power and potential, but one interesting aspect of Swift is that it still allows you to tap into that raw power that came with the precursor languages when necessary. These are usually referred to within the language as “unsafe” operations, and you know when you’re dealing with one because they are literally prefixed with the “unsafe” keyword.
let ptr: UnsafeMutablePointer<Int> = ...
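For illustration, here's a small sketch of what opting into one of these unsafe operations looks like (a hypothetical example of mine, not from the original text):

```swift
var value = 42

// Within this closure we get a raw pointer to `value`. From here on,
// upholding memory safety is on us, not on the compiler.
withUnsafeMutablePointer(to: &value) { ptr in
    ptr.pointee += 1
}

print(value) // 43
```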
But despite being designed for memory safety, Swift was never truly 100% memory safe because concurrency was still a large source of undefined behavior in the language.
The primary reason why this was the case is because Grand Central Dispatch (GCD), Apple’s main concurrency solution for iOS apps, was not a feature of the Swift compiler itself, but rather a C library (libdispatch) that was shipped into iOS as part of Foundation. Just as expected of a C library, GCD gave you a lot of freedom in regard to concurrency work, making it challenging for Swift to prevent common concurrency issues like data races, race conditions, deadlocks, priority inversions, and thread explosion.
(If you’re not familiar with one or more of the terms above, here’s a quick glossary: a data race is when two threads access the same memory at the same time with at least one of them writing; a race condition is when the program’s outcome depends on the order in which threads happen to run; a deadlock is when two or more threads end up waiting on each other forever; a priority inversion is when a high-priority thread is stuck waiting on work held by a lower-priority one; and thread explosion is when the system spawns far more threads than it has CPU cores, wasting resources on memory and context switching.)
let semaphore = DispatchSemaphore(value: 0)
highPrioQueue.async {
semaphore.wait()
// …
}
lowPrioQueue.async {
semaphore.signal()
// …
}
The above is a classic example of a priority inversion in iOS. Although you as a developer know that the semaphore above will cause one queue to wait on the other, GCD would not necessarily realize this and could fail to properly escalate the lower-priority queue’s priority. To be clear, GCD can adjust itself in certain situations, but patterns like the above example were not covered by it.
Because the compiler was unable to assist you with such problems, concurrency (and thread safety specifically) was historically one of the hardest things to get right in iOS development, and Apple was well aware of it. In 2017, Chris Lattner, one of the driving forces behind Swift, laid out his vision for making concurrency safe in his Swift Concurrency Manifesto, and in 2020 a roadmap materialized that envisioned key new features for the language.
But although what the roadmap proposed was new to the language, it was not new to tech itself. The async/await pattern, which was first introduced in 2007 as a feature of F#, has been an industry standard since 2012 (when C# made it mainstream) due to its ability to allow asynchronous code to be written as traditional synchronous ones, making concurrency-related code easier to read.
For example, before you might write:
func loadWebResource(_ path: String, completionBlock: @escaping (Resource) -> Void) { ... }
func decodeImage(_ r1: Resource, _ r2: Resource, completionBlock: @escaping (Image) -> Void) { ... }
func dewarpAndCleanupImage(_ i: Image, completionBlock: @escaping (Image) -> Void) { ... }
func processImageData1(completionBlock: (result: Image) -> Void) {
loadWebResource("dataprofile.txt") { dataResource in
loadWebResource("imagedata.dat") { imageResource in
decodeImage(dataResource, imageResource) { imageTmp in
dewarpAndCleanupImage(imageTmp) { imageResult in
completionBlock(imageResult)
}
}
}
}
}
whereas now you can write:
func loadWebResource(_ path: String) async -> Resource
func decodeImage(_ r1: Resource, _ r2: Resource) async -> Image
func dewarpAndCleanupImage(_ i : Image) async -> Image
func processImageData1() async -> Image {
let dataResource = await loadWebResource("dataprofile.txt")
let imageResource = await loadWebResource("imagedata.dat")
let imageTmp = await decodeImage(dataResource, imageResource)
let imageResult = await dewarpAndCleanupImage(imageTmp)
return imageResult
}
Introducing this pattern in Swift specifically would not only improve the experience of working with completion handlers but also technically allow the compiler to detect and prevent common concurrency mistakes. It’s easy to see why Chris Lattner made the pattern the central piece of his manifesto, in which he declared, in his own words: “I suggest that we do the obvious thing and support this in Swift.”
Over the years these features were gradually integrated into Swift, culminating in the Swift 5.5 release and the “official” release of async/await in Swift.
Now that we understand why async/await became a part of Swift, we’re ready to take a look at how it works under the hood!
But first, I have to set some expectations with you. Because what we refer to as “async/await” is actually multiple different compiler features working in unison, and because each of these features is complicated enough to warrant its own separate article(s), there’s no way I can possibly cover every single detail of how it works in just one article. I would lose my sanity within the first section if I did that.
So instead of doing that, I’ve decided that a good plan would be to cover only what I believe to be async/await’s “core” functionality and to leave the remaining bits for future articles. But although we don’t go into details about those other bits here, I still made sure to mention some of them in the appropriate sections for you to know where they come into play. For simplicity, we’re also not going to cover compatibility modes and Obj-C bridging here.
With that said, let’s get started! When I reverse-engineer something to learn more about it, I always start from the bottom and move my way up. So when I decided that I wanted to understand how async/await works, my first question was: “Who is managing the program’s background threads?”
One of the most important aspects of how async/await works in Swift, and one that we must cover before anything else, is that while async/await technically uses GCD under the hood, it does not use the DispatchQueues that we are familiar with. Instead, what powers async/await is a completely new feature of libdispatch called The Cooperative Thread Pool.
Unlike traditional DispatchQueues, which create and terminate threads dynamically as they deem necessary, the Cooperative Thread Pool manages a fixed number of threads that are constantly helping each other with their tasks.
The “fixed number”, which in this case is equal to the system’s number of CPU cores, is an intentional move that aims to prevent thread explosion and improve system performance in general, something which DispatchQueues were notoriously not very good at.
In other words, the cooperative thread pool is similar to traditional GCD from an interface perspective (it’s a service that receives a job and arranges some thread to run it), but is more efficient and designed to better suit the Swift runtime’s special needs.
We can see exactly how the Cooperative Thread Pool works by exploring the open-source libdispatch repo, but I would like to reserve that for a future article. In general, I find that this WWDC session from 2021 provides great information about how the pool works internally.
It must be noted that the fact that the pool holds a fixed number of threads has a significant impact on how asynchronous code is supposed to be written in the language. There is now an expectation that threads should always make forward progress, which means you need to be really careful with when and how you do expensive operations in async/await in order to avoid starving the system's threads. Swift's async/await has many gotchas like this, and we'll uncover some of them as we proceed.
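As a hypothetical sketch of what “forward progress” means in practice: blocking one of the pool’s few threads (with sleeps, semaphores, or heavy synchronous work) takes it out of circulation, while suspending does not:

```swift
func refresh() async throws {
    // Bad: this would block one of the pool's threads for a full
    // second, preventing it from helping with other jobs.
    // Thread.sleep(forTimeInterval: 1)

    // Good: this *suspends*, giving the thread back to the pool
    // until the timer fires.
    try await Task.sleep(nanoseconds: 1_000_000_000)
}
```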
Let’s move up the abstraction layers. How does the compiler “speak” to the pool?
In Swift, you don't interact with the Cooperative Thread Pool directly. This is hidden by several layers of abstractions, and at the lowest of these layers, we can find the executors.
Executors, just like the pool itself, are services that accept jobs and arrange for some thread to run them. The core difference between them is that while the pool is just, well, the pool, executors come in many shapes and forms. They all end up forwarding jobs to the pool, but the way they do so may change depending on which type of executor you’re using. As of writing, executors can be either concurrent (jobs can run in parallel) or serial (one at a time), and the compiler provides built-in implementations for both of them.
The built-in concurrent executor is referred to internally as the Global Concurrent Executor.
The implementation of this executor in the Swift compiler is for the most part nothing more than an abstraction on top of the cooperative thread pool that we mentioned in the previous section. It starts by creating an instance of the new thread pool, which we can see is done by calling the good ol’ GCD API with a special new flag:
constexpr size_t dispatchQueueCooperativeFlag = 4;
queue = dispatch_get_global_queue((dispatch_qos_class_t)priority,
dispatchQueueCooperativeFlag);
Then, when the executor is asked to run a job, it forwards it to the pool via a special dispatch_async_swift_job API:
JobPriority priority = job->getPriority();
auto queue = getGlobalQueue(priority);
dispatch_async_swift_job(queue, job, (dispatch_qos_class_t)priority,
DISPATCH_QUEUE_GLOBAL_EXECUTOR);
I would like to leave the details of libdispatch and dispatch_async_swift_job for another time, but as mentioned in the previous section, this is supposed to be a special/more efficient variant of the regular dispatch_async API that iOS developers are familiar with that better suits the Swift runtime’s special needs.
Another aspect of this executor worth mentioning is that it's "global", meaning there is only one instance of it for the entire program. The reasoning for this is similar to why a serial DispatchQueue would deep-down forward its jobs to the global ones: while from a systems perspective, it makes sense for the responsibilities to appear to be divided, from a performance perspective it would be a nightmare for each component to have their own dedicated threads. It's sensible then to have a single, global executor that will ultimately schedule most of the work in the system, and have everyone else forward their jobs to it.
The Global Concurrent Executor is Swift’s default executor in general. If your async code is not explicitly requesting that it should go through a specific executor (we will see some examples of that as we continue to explore the abstractions), this is the executor that will handle it.
(Swift uses a different global executor in platforms that don’t support libdispatch, but I will not go into details of that as the primary focus of this article is iOS development.)
Unlike the concurrent executor, the purpose of the serial executor is to make sure jobs are executed one by one and in the order in which they were submitted.
The built-in serial executor is referred to internally as the "Default Actor" (spoiler alert!), and it is in essence an abstraction on top of the concurrent executor that keeps track of a linked list of jobs:
class DefaultActorImpl : public HeapObject {
public:
  void initialize();
  void destroy();
  void enqueue(Job *job);
  bool tryAssumeThread(RunningJobInfo runner);
  void giveUpThread(RunningJobInfo runner);

private:
  struct alignas(2 * sizeof(void*)) State {
    JobRef FirstJob;
    struct Flags Flags;
  };

  enum class Status {
    Idle,
    Scheduled,
    Running,
  };

  swift::atomic<State> CurrentState;
};
When a job is enqueued, instead of immediately forwarding it to the concurrent executor, it stores it in the linked list and waits until there’s no one in front of it before truly passing it forward.
static void setNextJobInQueue(Job *job, JobRef next) {
*reinterpret_cast<JobRef*>(job->SchedulerPrivate) = next;
}
The full extent of what happens when a job is enqueued to the serial executor is one of those things I said I’d have to skip in order to maintain my sanity, because this executor is also responsible for managing a lot of stuff relating to other async/await features (Actor.cpp alone has 2077 lines of code).
But one interesting thing worth mentioning is how it attempts to prevent priority inversions. When a high-priority job is enqueued to a list that previously only had low-priority jobs, the executor escalates the priority of all jobs that came before it.
if (priority > oldState.getMaxPriority()) {
newState = newState.withEscalatedPriority(priority);
}
As the name implies, the Default Actor serial executor comes into play when writing async code via the Actors feature. We still have a couple of things to understand before we can look into actors though, so let’s move on for now.
Besides the two built-in executors, it's also possible to build your own custom executor in Swift by creating a type that conforms to the Executor protocol:
public protocol Executor: AnyObject, Sendable {
func enqueue(_ job: consuming Job)
}
For serial executors specifically, Swift even provides a more specific SerialExecutor protocol:
public protocol SerialExecutor: Executor { ... }
The ability to do so was added in Swift 5.9 alongside the ability to pass custom executors to certain APIs, but there's little reason why you would do such a thing. This was added as a support tool for developers who use Swift on other platforms and is not something an iOS developer would usually have to deal with. With that said, we do have one very important feature to cover in this article that relies on this ability, but we need to answer a couple more questions before we can look into what that feature is.
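Just to visualize what this looks like, here's a sketch of a custom serial executor loosely based on the examples in the Swift Evolution proposal that introduced the feature (SE-0392). Note that the exact type names (Job vs. ExecutorJob) changed between Swift snapshots, so treat this as illustrative rather than definitive:

```swift
import Dispatch

// Illustrative sketch: a serial executor that funnels all of its
// jobs through a private DispatchQueue.
final class QueueBackedExecutor: SerialExecutor {
    private let queue = DispatchQueue(label: "queue.backed.executor")

    func enqueue(_ job: consuming ExecutorJob) {
        // Jobs are noncopyable, so we pass an unowned reference
        // into the queue's closure.
        let unownedJob = UnownedJob(job)
        queue.async {
            unownedJob.runSynchronously(on: self.asUnownedSerialExecutor())
        }
    }

    func asUnownedSerialExecutor() -> UnownedSerialExecutor {
        UnownedSerialExecutor(ordinary: self)
    }
}
```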
Let's keep moving up the abstraction layers. We now know that Swift's built-in executors are the ones passing jobs to the cooperative thread pool, but where do these jobs come from?
The next piece of the puzzle lies in the async/await pattern itself.
As you might know by now, the async/await pattern consists of two new keywords (async and await) that allow you to define an asynchronous function and wait for an asynchronous function to return, respectively:
func example() async {
let fooResult = await foo()
let barResult = await bar()
doSomething(fooResult, barResult)
}
func foo() async -> FooResult {
// Some async code
}
func bar() async -> BarResult {
// Some async code
}
One of the main purposes of the async/await pattern is to allow you to write asynchronous code as if it were straight-line, synchronous code, and this might give you the impression that deep down the feature is just a compiler pass that divides a function into a bunch of closures. That mental model is a useful starting point, but in reality, things are a lot more sophisticated than that!
Instead of thinking of an asynchronous function as just a syntax sugar for declaring a bunch of closures, think of it as an ordinary function that has the special power to give up its thread and wait for something to happen. When that thing is complete, the function bootstraps itself back up and resumes its execution.
This means that apart from how they wait for things to happen, asynchronous functions and synchronous ones are (sort of) the same thing in Swift! The only difference is that while the synchronous function gets to take full advantage of the thread and its stack, the asynchronous ones have the extra power of giving up that stack and maintaining their own, separate storage.
Although our main interest here is exploring memory safety, one interesting thing to mention is how this definition is important from a code architecture perspective; because asynchronous functions in Swift are effectively the same as synchronous ones, this means you can use them for things that you previously couldn’t do with completion handler closures, such as marking a function as throws:
func foo() async throws {
// …
throw MyError.failed // Can’t do this without async/await!
}
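And on the calling side, the familiar do/try/catch machinery composes with await just like it does in synchronous code (a small sketch of mine):

```swift
enum MyError: Error { case failed }

func foo() async throws {
    throw MyError.failed
}

func caller() async {
    do {
        try await foo()
    } catch {
        // Errors propagate out of the await just like a sync throw.
        print("foo failed with:", error)
    }
}
```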
But enough theory. How does it work?
We can start understanding how the pattern is implemented by looking at what the Swift compiler does when it processes a line of code marked with await. By compiling the above example code with the -emit-sil flag, we can see that the example’s Swift Intermediate Language output looks something like this (greatly simplified for readability):
// example()
sil hidden @$s4test7exampleyyYaF : $@convention(thin) @async () -> () {
bb0:
hop_to_executor foo
foo()
hop_to_executor example
hop_to_executor bar
bar()
hop_to_executor example
return void
} // end sil function '$s4test7exampleyyYaF'
The SIL of an async function looks almost exactly like what a regular synchronous function would produce, with the difference that Swift calls something called hop_to_executor before and after each awaited function is called. According to the compiler’s documentation, the purpose of this symbol is to make sure that the code is running on the right executor. Hmmmm.
One important memory safety feature of Swift’s async/await is what it refers to as execution contexts. As we briefly mentioned when we were talking about executors, whenever something runs asynchronously in Swift through async/await, it has to go through a specific executor; the majority of code will go through the default global concurrent one, but certain APIs may use different ones.
The reason why certain APIs may have specific executor requirements is to prevent data races. We’re not ready to explore this topic yet though, so for now just keep in mind that this is why different executors exist.
What hop_to_executor does in practice is check the current execution context. If the executor the function is currently running on is the same one the function we want to await expects, the code will run synchronously. But if it’s not, a suspension point is created: the function requests the necessary code to run in the correct context and gives up its thread while it waits for the result. This “request” is the job we were looking for, and the same happens when the job finishes in order to return to the original context and run the rest of the code.
func example() async {
(original executor)
let fooResult = await foo() // POTENTIAL job 1 (go to foo’s executor)
// POTENTIAL job 2 (back to original context)
let barResult = await bar() // POTENTIAL job 3 (go to bar’s executor)
// POTENTIAL job 4 (back to original context)
doSomething(fooResult, barResult)
}
The word potential here is very important: as just mentioned, a suspension point is only created if we’re in the wrong context; if no context hopping is needed, the code will run synchronously. This is something that DispatchQueues notoriously could not do, and is a very welcome ability that we will mention again later in this article.
In fact, since await only marks a potential suspension point, this has the interesting side-effect of allowing async protocol requirements to be fulfilled by regular, synchronous ones:
protocol MyProto {
func asyncFunction() async
}
struct MyType: MyProto {
func asyncFunction() {
// This is not an async function, but Swift is fine with it
// because `async` doesn’t mean that the function is
// _actually_ async, only that it _may_ be.
}
}
This is also why you can call synchronous functions from asynchronous ones but not vice-versa; asynchronous functions know how to synchronously wait for something, but synchronous ones don’t know how to create suspension points.
Suspension points are a major win for memory safety in Swift: because they result in the thread being released (as opposed to how a lock, semaphore, or DispatchQueue.sync would hold onto it until the result arrived), this means that deadlocks cannot happen in async/await! As long as you’re not mixing async/await code with other thread-safety mechanisms (which Apple says you shouldn’t in their 2021 session), your code will always have a thread in which it can run.
It must be noted though that this behavior has an important gotcha in terms of code architecture. Because suspension points may give up their thread while waiting for a result, the thread that originated the request can (and will) start running other jobs while the result is pending! In fact, unless you’re using the Main Actor (which we will explore in detail later on), there’s no guarantee that the thread that processes the result will even be the same one that originated the request!
func example() async {
doSomething() // Running in thread A
await somethingElse()
doSomethingAgain() // This COULD also be running in thread A, but it’s probably not!
// Also, thread A has likely moved on to do other things while we were waiting for somethingElse()!
}
This means that in order to implement thread-safe objects in async/await, you must structure your code in a way so that it’s never assuming or carrying state across suspension points because any assumptions that you made about the program’s state prior to the suspension point might not be true anymore after the suspension point. This behavior of async/await is called reentrancy and is something we’ll explain in more detail further below when we start speaking about race conditions specifically. In short, reentrancy in Swift’s async/await is intentional, and is something you must keep in mind at all times when working with async/await code in Swift.
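A minimal sketch of what “not carrying state across suspension points” means (hypothetical names, illustrative only):

```swift
final class Store {
    var items: [String] = []
}

let store = Store()

func process() async {
    let countBefore = store.items.count // snapshot taken here...
    await Task.yield()                  // ...suspension point: other jobs may run
    // `store.items.count` may no longer equal `countBefore` here.
    // Re-read the state after the await instead of reusing the snapshot.
    let countAfter = store.items.count
    print(countBefore, countAfter)
}
```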
I would like to show you how exactly these suspension points and the re-bootstrapping work in the compiler’s code, but as of writing, I was not able to properly understand it. I’d still like to do that though, so I’ll update this article once I figure that out.
We still have one important puzzle piece to investigate though. If synchronous functions are not allowed to call asynchronous ones because they don’t have the power to create a suspension point, what is the “entry point” for an asynchronous function?
In Swift’s async/await, the way you call an asynchronous function the first time is by creating a Task object:
Task {
await foo()
}
Because the closure of a task object is itself marked as async, you can use it to call other asynchronous functions. This is the “entry point” we were looking for.
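It's also worth noting that a Task doubles as a handle to its closure's eventual result, which other asynchronous contexts can retrieve through its value property:

```swift
let task = Task { () -> Int in
    // Some async work...
    return 42
}

func read() async {
    // `value` suspends until the task's closure finishes.
    let result = await task.value
    print(result) // 42
}
```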
Swift’s Task struct has a much bigger role than simply allowing you to call async code; they form a fundamental part of what Swift calls "Structured Concurrency," where asynchronous code is structured as a hierarchy of "tasks." This structuring allows parent tasks to manage their child tasks, sharing information like status, context, priority, and local values, as well as enabling the creation of child "task groups" that comprise multiple tasks that run in parallel. Structured Concurrency forms the backbone of Swift's async/await architecture, but is a topic large enough to warrant its own article. For the purposes of this article, we’re going to focus only on the core functionality of tasks.
Let’s get back to the original question. How is Task managing to create an async closure out of nowhere?
The key to understanding how Task bootstraps an async closure lies in its initializer. When a Task is created, the closure it captures is managed not by the Task struct itself, but by a function that lives deep within the Swift runtime:
extension Task where Failure == Never {
public init(
priority: TaskPriority? = nil,
@_inheritActorContext @_implicitSelfCapture operation: __owned @Sendable @escaping () async -> Success
) {
let flags = taskCreateFlags(
priority: priority, isChildTask: false, copyTaskLocals: true,
inheritContext: true, enqueueJob: true,
addPendingGroupTaskUnconditionally: false,
isDiscardingTask: false)
let (task, _) = Builtin.createAsyncTask(flags, operation)
self._task = task
}
}
The call to Builtin.createAsyncTask ultimately results in a call to swift_task_create in the Swift runtime, which creates a task based on a couple of flags that configure how the task should behave. The compiler conveniently takes care of that configuration automatically for you, and once the task is set up, it is immediately directed to the appropriate executor for execution.
static AsyncTaskAndContext swift_task_create_commonImpl(…) {
// The actual function is a lot more complicated than this.
// This is just a pseudo-coded simplification for learning purposes.
task.executor = task.parent.executor ?? globalConcurrentExecutor;
task.checkIfItsChildTask(flags);
task.checkIfItsTaskGroup(flags);
task.inheritPriorityFromParentIfNeeded(flags);
task.asJob.submitToExecutor();
}
Structured Concurrency is the reason why the compiler knows all of this information. Similarly to how the serial executor tracks a linked list of jobs, the Swift runtime tracks a graph of all tasks running concurrently in the program. This tracking, in combination with a secondary map connecting asynchronous functions to the tasks that invoked them, allows Swift to infer all the necessary information to bootstrap a task, including the ability to make adjustments such as escalating the priority of a child task based on their parent's priority.
Interestingly enough, Swift actually provides you with APIs that allow you to access these graphs in your Swift code, although the documentation makes it very clear that they should only be used in special cases. One example of this is withUnsafeCurrentTask, which allows functions to determine whether they were called as part of a task.
func synchronous() {
withUnsafeCurrentTask { maybeUnsafeCurrentTask in
if let unsafeCurrentTask = maybeUnsafeCurrentTask {
print("Seems I was invoked as part of a Task!")
} else {
print("Not part of a task.")
}
}
}
Because child tasks by default inherit the properties of their parent, and because the runtime handles that automatically for you, you might end up in situations where a task is inheriting things you didn't mean to:
func example() async {
Task {
// This is NOT a parentless task, as much as it looks like one!
}
}
In the example above, what looks like a "bland" task is actually a child task of whatever job led to example() being called! This means this task will inherit that parent's properties, which may include things you don't want this particular task to inherit, such as the executor. One example case where this can be a problem is when dealing with code that interacts with the MainActor, which we will explore in detail further below.
In order to avoid this, you must use alternate task initializers like Task.detached which define "unstructured" tasks with no parent, but it must be noted that they also have their own gotchas, so make sure to read their API documentation before using them.
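To make the difference concrete, here's a hypothetical side-by-side (doHeavyWork is a placeholder of mine):

```swift
func doHeavyWork() async { /* … */ }

func example() async {
    Task {
        // Child of example()'s current task: inherits its priority,
        // task-local values, and actor context.
        await doHeavyWork()
    }

    Task.detached(priority: .background) {
        // Unstructured: no parent, inherits nothing, runs with
        // exactly the configuration given here.
        await doHeavyWork()
    }
}
```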
We’ve now covered all the core mechanics of async/await, but we still have one question left to answer. We’ve seen how async/await is able to prevent thread explosion, priority inversions, and deadlocks, but what about data races? We know that the concept of “execution contexts” is what’s supposed to prevent it, but we haven’t seen that in practice yet.
We also haven’t even begun to talk about the infamous race conditions that plague every iOS app. What does Swift’s async/await do to protect you from those?
We have left Actors to last because they don’t relate to the core functionality of async/await, but when it comes to memory safety, they are just as important as the other features we’ve covered.
In Swift, an “actor” is a special type of class that is marked with the actor keyword:
actor MyExample {
    var fooInt = 0
}
Actors are mostly the same as classes, but they contain a special power: any mutable state managed by an actor can only be modified by the actor itself:
func foo() {
    let example = MyExample()
    example.fooInt = 1 // Error: Actor-isolated `fooInt`
                       // cannot be mutated from a non-isolated context
}
In the example above, in order to mutate fooInt, we must somehow abstract that action so that it happens within the bounds of the actor:
actor MyExample {
    var fooInt = 0

    func mutateFooInt() {
        fooInt = 1
    }
}
This looks like it would make no difference, but this is where the actors’ second special power comes into play: only the actor is allowed to synchronously reference its methods and properties; everyone else must do it asynchronously:
func foo() {
    let example = MyExample()
    Task {
        await example.mutateFooInt()
        // The actor itself is allowed to call mutateFooInt() synchronously,
        // but the example() function is not.
    }
}
This is a concept called actor isolation, and when combined with the concept of execution contexts we’ve seen above, Swift’s async/await is able to prevent you from introducing potential data races in your program. Better yet, those checks happen at compile time!
To be more specific, when you await on an actor, your code will be forwarded not to the default global concurrent executor, but a serial one that was created specifically for that actor instance. This has the effect of not allowing you to call two actor functions at the same time (one will end before the other one starts), and when combined with the fact that the compiler doesn’t allow you to “leak” an actor’s mutable state, you have essentially a situation where it’s not possible for your actor’s state to be mutated by two threads at the same time. But how does this work internally?
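As a concrete sketch of this guarantee (the Counter type is mine, not from the article), a thousand concurrent increments funneled through an actor never lose an update, because each call is enqueued on the actor's serial executor:

```swift
import Foundation

// Sketch of the serialization guarantee: 1,000 concurrent increments
// with zero lost updates, because each call to increment() runs on the
// actor's serial executor, one at a time.
actor Counter {
    private(set) var value = 0

    func increment() {
        value += 1
    }
}

let counter = Counter()
let finished = DispatchSemaphore(value: 0)

Task {
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<1000 {
            group.addTask { await counter.increment() }
        }
    }
    // With a plain class, racing increments could lose updates and
    // produce anything up to 1000; with an actor it is exactly 1000.
    print(await counter.value) // 1000
    finished.signal()
}

finished.wait()
```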
When it comes to their implementations, actors are surprisingly straightforward. In Swift, declaring an actor is just syntax sugar for declaring a class that conforms to the Actor protocol:
public protocol Actor: AnyObject, Sendable {
    nonisolated var unownedExecutor: UnownedSerialExecutor { get }
}
The only property of the protocol is unownedExecutor, which is a pointer to the serial executor that is supposed to manage the jobs related to that actor. The purpose of the UnownedSerialExecutor type is to wrap a type conforming to the SerialExecutor protocol we saw previously as an unowned reference, which the documentation describes as necessary for optimization reasons.
public struct UnownedSerialExecutor: Sendable {
    internal var executor: Builtin.Executor

    public init<E: SerialExecutor>(ordinary executor: __shared E) {
        self.executor = Builtin.buildOrdinarySerialExecutorRef(executor)
    }
}
When you declare an actor via the syntax sugar, Swift automatically generates this conformance for you:
// What you write:
actor MyActor {}

// What is compiled:
final class MyActor: Actor {
    nonisolated var unownedExecutor: UnownedSerialExecutor {
        return Builtin.buildDefaultActorExecutorRef(self)
    }

    init() {
        _defaultActorInitialize(self)
    }

    deinit {
        _defaultActorDestroy(self)
    }
}
We already know what this generated code is doing; it initializes the Default Actor serial executor that we’ve covered at the beginning. Since actors are deeply ingrained into Swift, the compiler knows that whenever someone references it, the eventual call to hop_to_executor should point to the actor’s unownedExecutor property and not the global one.
While actors naturally protect you from data races, it’s critical to remember that they cannot protect you from logic mistakes like race conditions / straight-up incorrect code. We have already covered why this is the case when we talked about suspension points and reentrancy, but I’d like to reiterate this because this is extra important when working with actors specifically.
When a suspension point is created, the actor will allow other jobs in the serial queue to run. This means that when the result for the original job finally arrives, it’s possible that the actor’s state may have changed in a way where whatever assumptions you made before the suspension point are no longer true!
In actors specifically, this is referred to as Actor Reentrancy, and is once again something you must keep in mind at all times when attempting to write thread-safe code with async/await. As suggested in the section about reentrancy in general, in order for your actors to be thread-safe, you must structure your code so that no state is assumed or carried over across suspension points.
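One way to structure such code is to record the relevant state *before* the first suspension point, so that reentrant calls observe it. The Downloader sketch below is mine (in the spirit of Apple's image-downloader de-duplication example), not the article's:

```swift
import Foundation

// Sketch of a reentrancy-safe actor: the in-flight task is recorded
// *before* the first suspension point, so a reentrant call for the
// same URL joins the existing work instead of relying on assumptions
// that may have gone stale across an await.
actor Downloader {
    private var cache = [String: Task<String, Never>]()
    private(set) var startedFetches = 0

    func download(_ url: String) async -> String {
        // The check and the insertion below happen with no suspension
        // in between, so no reentrant call can sneak past them.
        if let existing = cache[url] {
            return await existing.value // join the in-flight fetch
        }
        startedFetches += 1
        let task = Task<String, Never> {
            try? await Task.sleep(nanoseconds: 10_000_000) // simulated network call
            return "data for \(url)"
        }
        cache[url] = task
        return await task.value
    }
}

let downloader = Downloader()
let finished = DispatchSemaphore(value: 0)

Task {
    // Two overlapping requests for the same URL trigger a single fetch.
    async let a = downloader.download("https://example.com/cat.png")
    async let b = downloader.download("https://example.com/cat.png")
    _ = await (a, b)
    print(await downloader.startedFetches) // 1
    finished.signal()
}

finished.wait()
```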
Like in the case of deadlocks, an actor’s solution for data races has important consequences in terms of code architecture. If you cannot “leak” an actor’s mutable state, how does anything ever happen?
Swift’s async/await provides two features to address this. The first one is the Sendable protocol, which marks types that can safely leave an actor:
public protocol Sendable { }
This protocol has no actual code; it’s simply a marker used by the compiler to determine which types are allowed to leave the actors that created them. This doesn’t mean that you can mark anything as Sendable, though; Swift really doesn’t want you to introduce data races into your programs, so the compiler has very strict requirements for what can conform to it:
- Value types (structs and enums) whose members are all Sendable
- Actors (which are implicitly Sendable thanks to their isolation)
- Final classes that have no mutable properties
- Functions and closures (marked with @Sendable)

While Sendable solves this problem, it must be noted that this protocol has been the target of criticism in the Swift community. The necessity of tagging “safe” types, combined with the compiler’s tendency to behave like an overprotective mother (it will complain that a type must be Sendable even in situations where no data race could possibly happen), can quickly cause Sendable to “plague” your program’s entire architecture. There have been pitches on potential improvements in this area, but I believe as of writing no formal proposals have been submitted yet.
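As an illustration of those requirements, here's a sketch of what typically compiles and what doesn't (all type names below are mine, not from the article):

```swift
// Value types qualify when all of their members are Sendable.
struct Point: Sendable {
    var x: Double
    var y: Double
}

// Final classes qualify only when all of their stored properties are
// immutable (and themselves Sendable).
final class User: Sendable {
    let name: String
    init(name: String) { self.name = name }
}

// Actors are implicitly Sendable; their isolation makes them safe to share.
actor SafeCache {}

// Functions and closures can cross isolation boundaries when marked @Sendable.
let makeID: @Sendable () -> Int = { 42 }

// This, on the other hand, would NOT compile:
// final class Mutable: Sendable {
//     var count = 0 // error: stored property of a 'Sendable' class is mutable
// }

print(makeID()) // 42
```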
Aside from Sendable, the nonisolated keyword is also intended to assist with the problem of having to “leak” an actor’s state. As the name implies, this allows you to mark functions and properties that are allowed to ignore the actor’s isolation mechanism:
actor BankAccount {
    nonisolated let accountNumber: Int
}
When referenced, the compiler will pretend that the type didn’t originate from an actor and skip any and all protection mechanisms that would normally apply. However, similarly to Sendable, not everything can be marked as nonisolated. Only types that are Sendable can be marked as such.
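Expanding the article's BankAccount sketch with an assumed initializer and a mutable balance property of my own, the nonisolated constant can be read synchronously from anywhere, while the mutable state stays isolated:

```swift
// The initializer and `balance` property are assumptions added for
// illustration; only `accountNumber` appears in the article.
actor BankAccount {
    nonisolated let accountNumber: Int
    var balance: Double = 0

    init(accountNumber: Int) {
        self.accountNumber = accountNumber
    }
}

let account = BankAccount(accountNumber: 42)

// No await needed: nonisolated members skip the isolation machinery.
print(account.accountNumber) // 42

// The mutable state is still protected:
// print(account.balance)       // error: actor-isolated
// print(await account.balance) // OK, from an async context
```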
At this point, we’ve covered everything we needed regarding async/await in Swift, but there’s one more thing we need to cover regarding iOS development specifically. Where’s the main thread in all of this?
We’ve talked a lot about the new thread pool and how executors interact with them, but iOS developers will know that UI work always needs to run in the main thread. How can you do that if the cooperative thread pool has no concept of a “main” thread?
In Swift, this is where the ability to build custom executors that we’ve seen at the beginning of the article comes into play. Swift’s standard library ships a type called MainActor, which as the name implies, is a special type of actor that synchronizes all of its jobs to the main thread:
@globalActor public final actor MainActor: GlobalActor {
    public static let shared = MainActor()

    public nonisolated var unownedExecutor: UnownedSerialExecutor {
        return UnownedSerialExecutor(Builtin.buildMainActorExecutorRef())
    }

    public nonisolated func enqueue(_ job: UnownedJob) {
        _enqueueOnMain(job)
    }
}
The MainActor achieves this by overriding the default unownedExecutor with a custom Builtin.buildMainActorExecutorRef() one. Since we’re telling Swift that we don’t want to use the default serial executor for this actor, this will deep down cause the Swift runtime to call the MainActor’s custom-defined enqueue method instead.
In the case of MainActor, the call to _enqueueOnMain will cause the job to be forwarded to the global concurrent executor as usual, but this time via a special function that causes the job to be submitted to GCD’s main queue instead of the cooperative thread pool.
// The function where “regular” async/await jobs end up
static void swift_task_enqueueGlobalImpl(Job *job) {
    auto queue = getCooperativeThreadPool();
    dispatchEnqueue(queue, job);
}

// The function where MainActor jobs end up
static void swift_task_enqueueMainExecutorImpl(Job *job) {
    auto mainQueue = dispatch_get_main_queue();
    dispatchEnqueue(mainQueue, job);
}
In other words, code executed by the main actor is essentially the same thing as calling DispatchQueue.main.async, although not literally the same due to two facts that we have already covered: the fact that the Swift runtime uses a “special” version of DispatchQueue.async to submit its jobs, and the fact the dispatch will technically not happen if we’re already inside the main thread (MainActor’s “execution context”).
// What you write:
Task {
    await myMainActorMethod()
}

// What (sort of) actually happens:
// (Actual behavior explained above)
Task {
    DispatchQueue.main.async {
        myMainActorMethod()
    }
}
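A small sketch (the function and variable names are mine) demonstrating this effect in a command-line program: awaiting a @MainActor function from a detached task makes the runtime hop to the main executor, i.e. the main thread.

```swift
import Foundation

// Sketch: a @MainActor function awaited from a cooperative pool thread
// ends up running on the main thread, just as if it had been submitted
// via DispatchQueue.main.async.
var ranOnMainThread = false

@MainActor
func updateUI() {
    ranOnMainThread = Thread.isMainThread
    print("on main thread:", ranOnMainThread) // true
}

Task.detached {
    // This closure starts on a cooperative pool thread; the await below
    // is where the hop to the MainActor's executor happens.
    await updateUI()
}

// Spin the main run loop briefly so the main queue gets drained.
RunLoop.main.run(until: Date().addingTimeInterval(0.5))
```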
The final thing I’d like to show you is how actors like the MainActor are used in practice. We know that regular actors are created and passed around as normal objects, but doing so with the MainActor would not scale well. Even though the MainActor is available as a singleton, there’s a lot of stuff that needs to run in the main thread in iOS, so if we were treating it like a regular object, we would end up with a lot of code looking like this:
extension MainActor {
    func myMainActorMethod() {}
}

func example() {
    Task {
        await MainActor.shared.myMainActorMethod()
    }
}

///////////// or:

func example() {
    Task {
        await MainActor.run {
            myMainActorMethod()
        }
    }
}

func myMainActorMethod() {}
Although both solutions “work”, Swift saw potential for improvement and created the concept of “global actors”, which describe actors that can not only be referenced but also extended from anywhere in the program. Instead of forcing everyone to reference singletons everywhere, Swift's Global Actors feature allows you to easily indicate that a certain piece of code should be executed within the bounds of a specific global actor by marking it with a special annotation:
@MainActor
func myMainActorMethod() {}
This is essentially the same thing as the examples shown above, but with much less code. Instead of having to reference the MainActor’s singleton, we can now directly reference this method and be sure that it will be executed within the MainActor’s context.
func example() async {
    await myMainActorMethod() // This method is annotated as @MainActor,
                              // so it will run in the MainActor’s context.
}
In order to be able to do this, the actor in question must be marked with the @globalActor attribute, which is something you can also do for your own actors if you find this behavior useful for them. As one would expect, the MainActor is itself a global actor.
Marking an actor as @globalActor is deep down syntax sugar for declaring an actor that conforms to the GlobalActor protocol, which is essentially a variation of the regular Actor protocol that additionally defines a singleton that Swift can refer to when it finds one of those special annotations across the program.
public protocol GlobalActor {
    associatedtype ActorType: Actor
    static var shared: ActorType { get }
}
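For completeness, here's a sketch of declaring your own global actor (DatabaseActor is a made-up name, not from the article). The @globalActor attribute requires exactly what the GlobalActor protocol above describes: a static `shared` singleton.

```swift
import Foundation

// A custom global actor: the attribute demands a static `shared`
// instance, matching the GlobalActor protocol's requirement.
@globalActor
actor DatabaseActor {
    static let shared = DatabaseActor()
}

// Everything annotated with @DatabaseActor now runs on this actor's
// serial executor, the same way @MainActor works for the main thread.
@DatabaseActor var rowCount = 0

@DatabaseActor
func insertRow() {
    rowCount += 1
}
```

Just like with @MainActor, callers outside this actor's context reach these members asynchronously, e.g. `await insertRow()`.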
Then, at compile time, when Swift encounters one of those annotations, it follows up by emitting the underlying hop_to_executor call with a reference to that actor’s singleton.
func example() async {
    // SIL: hop_to_executor(MainActor.shared)
    await myMainActorMethod()
    // SIL: hop_to_executor(DefaultExecutor)
}
In general, I like async/await. I think this is a nice addition to Swift, and it makes working with concurrency a lot more interesting.
But you must not get this wrong. Although Swift prevents you from making memory-related mistakes, it does NOT prevent you from making logic mistakes / writing straight-up incorrect code, and the way the feature works today makes it very easy for you to introduce such mistakes. We've covered some of the pattern's gotchas in this article, but there are many more of them pertaining to features we didn't get to explore here.
Matt Massicotte's "The Bleeding Edge of Swift Concurrency" talk from Swift TO 2023 goes into more detail about gotchas in async/await, and I believe is a talk that anyone working with async/await in Swift should watch.
For more information on thread safety in Swift specifically, check out my article about it.
The "standard" way of debugging performance issues in iOS is to use Xcode's Time Profiler instrument, but I personally never had a good experience with it. While it contains all the information you need to understand a particular problem, that information is not exactly easy to make sense of. To make it worse, sometimes even getting the information to show up in the first place can be quite the challenge, as Instruments in iOS in general have been historically broken and plagued by bad UX.
Thankfully, you don't have to go through any of that! Today much better performance debugging tools are available (and for free), and in this article, I'll show you one of them.
ETTrace is an open-source performance measurement framework for iOS developed by the folks behind Emerge, and I can say that today this is my favorite tool for measuring and debugging performance problems in iOS.
As mentioned in the beginning, while the Time Profiler does technically provide you with all the information that you need, actually understanding this information or even getting it to show up in the first place can be a big challenge, even if you know exactly what you're doing.
For me, personally, there are three things that make the Time Profiler hard to use. The first one is that you need to compile a special Profile build for it to work, meaning you cannot run it ad-hoc on an existing build or device. The second is that the Time Profiler has a really annoying tendency to simply refuse to work every once in a while, mostly when it comes to symbolication. Finally, last but not least, when you do manage to get it to work, the way in which the data is presented to you is not very helpful when it comes to locating the source of a particular performance bottleneck. In other words, there are better ways to display this data.
ETTrace, on the other hand, has none of these problems. It doesn't require a special build, it automatically handles symbolication for you, and it displays the data in a much more readable way. It's basically the Time Profiler on steroids, and I have found it to be in most cases a complete replacement for it.
For instructions on how to install ETTrace, check out the official repo. As of writing, ETTrace is installed by linking a dynamic framework into your app and installing a special ettrace CLI tool on your Mac. You can trace any build of your app that links against this framework, which is why you don't need to compile a special Profile build like you would when using Xcode and the Time Profiler. In practice, you could even ship this framework alongside your App Store builds to directly debug issues found in production, but I would personally not do that and would keep it restricted to debug builds.
To see how ETTrace can help us debug performance issues better than the standard Time Profiler, let's pretend that we have a view controller called ExploreCardViewController, and that we have noticed that tapping a specific collection view cell in this VC is causing the app to freeze for a while.
To find out exactly why this is happening, we just need to run ETTrace. After following the usage steps as described on the repo, you'd be presented with something like this:
This way of displaying information is called a Flame Graph, and I find it to be a very efficient way of locating performance bottlenecks in your app's code. Each "entry" that you see here is a single method call in your app, with the X axis dictating when it was called (and how long it took to run), and the Y axis dictating where/who called it. In the example above, the first 3 frames (start/main/UIApplicationMain) represent functions internal to iOS that are responsible for launching and keeping the app alive, while everything else below it is actual code from our example app.
To find performance bottlenecks in a flame graph, all we need to do is look for the presence of a "chunky" stack trace and then go down the Y axis until we find which frame exactly is the source of the chunkiness.
Consider how ExploreCardViewController is shown in the report. It's very large, which means that this method is taking a really long time to run. But what exactly is causing it? Is it the literal call to didSelectItemAt, or is it something else further down the stack trace?
By going down the trace we can see that at its very bottom there's a very expensive call to usleep originating from ArticleViewController.viewDidLoad(), which is the reason why that entire stack trace is being reported as being expensive:
Oops, seems like we forgot some debug code in our class!
override func viewDidLoad() {
    super.viewDidLoad()
    sleep(1) // TODO: remove this!
}
After deleting the call, the bottleneck was gone!
You may find this to be a dumb example, but I find that debugging real performance issues doesn't stray too far from this. The difference is just that instead of a dumb call to sleep, you'd see some other expensive operation. Otherwise, the process to locate it and the different ways in which you could fix it are the same.
The example above showed a bottleneck that originated from a single very expensive call, but that's not the only source of performance issues. Sometimes the bottleneck may originate not from one large call, but multiple small ones in rapid sequence.
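As a contrived sketch of this kind of bottleneck (the code and names below are mine, not from the article): no single call is slow, but thousands of them in rapid sequence add up, and the real cost only becomes obvious once the small frames are merged together.

```swift
import Foundation

// "Death by a thousand cuts": each call is cheap in isolation, but the
// repeated cost of allocating and configuring a NumberFormatter on
// every call multiplies quickly. The fix would be reusing one shared
// formatter instance.
func formatPrice(_ value: Double) -> String {
    let formatter = NumberFormatter()
    formatter.numberStyle = .currency
    return formatter.string(from: NSNumber(value: value)) ?? ""
}

// One call is negligible; ten thousand in a scroll callback are not.
let prices = (0..<10_000).map { formatPrice(Double($0)) }
print(prices.count) // 10000
```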
ETTrace's Invert and Cluster Libraries features allow you to quickly debug issues like this by merging all those small calls together. To be clear, the Time Profiler can also do this; it's just that I personally find ETTrace's flame graphs much easier to understand than the Time Profiler's tree structure.
Another feature I find myself using a lot is the comparison view. By uploading a second trace file, ETTrace will show you the difference between the two traces, allowing you to quickly determine which methods became faster and which became slower. This is good for quickly checking whether or not a change improves things or introduces a bottleneck, but note that it's not a very reliable way of determining exactly how fast or slow a particular method is. If you need very accurate information, then I recommend using Attabench.
Alternatively, if your company happens to pay for Emerge's enterprise solutions, you can also use their performance analysis product, which is similar to ETTrace but with the difference that it can actually provide you with data that is statistically significant.
I have been using ETTrace for most of my performance debugging work, but there are still a couple of cases where you might need to use the Time Profiler.
The first case that comes to my mind is when you need to debug something that you cannot reproduce, which is something that I've covered previously here at SwiftRocks. For cases like this you'll find Apple's performance trace profiles to be the best solution, which currently require you to use Xcode and the Time Profiler.
Another case you might still need the Time Profiler for is when you're looking not just for performance data, but also other types of iOS-related information such as thread state, device temperature, battery level, os_logs, signposts, hangs, and so on. Nothing currently matches Xcode's Instruments when it comes to putting all this device information into one single place, so issues that require looking at multiple types of device information are still perfectly suited for it.