Puhomir

Code is the enemy

Code is the cost of developing, running and maintaining software.

To make this easier to understand, let's invent a new currency - the code dollar *$, where 1*$ = ~1 line of code (adjusted for inflation). Someone clever may instinctively suggest "just write massive one-liners" or "just write Python then". If this is you, it may be time to quit eating glue.

Anyways, every application costs N*$, where N is the sum of all lines of code involved: your own code, the libraries you use, and the Operating System underneath.

Ideally software would cost 0*$, but that's physically impossible. Instead, every piece of software has a secret best deal price B*$, and all we can do is get as close to that best deal price as possible.

Realistically we can't do anything about the Operating System code, which means every application is expensive by default, because all Operating Systems are gigantic. Even a trivial C application may cost millions of *$.

For that reason we will define the best possible price BP*$, which excludes the Operating System code. This is the price of the most useful, performant and reliable version of that software. A programmer's only job is to get as close to this price as possible, and the easiest strategy is to cut as much cost as possible.

0. Free yourself

10 years ago I used to be an OOP (Object Oriented Programming) advocate and I loved everything about it. This was during my Java 7 + Design Patterns days. If I met myself from that time right now, we would disagree on everything. We wouldn't even agree on the naming style, let alone anything important.

OOP is not for me. My life would be so much easier if I enjoyed OOP, because most of the work in the software industry involves OOP in some capacity. But it makes me miserable and I don't know how to write great software in OOP style. The advice I list here is not likely to be useful to the OOP crowd. If I'm being totally honest, the biggest thing you can do to reduce the cost of your software is to cut as much OOP as possible.

Obviously, this is just my opinion; people who can write good software with OOP are much smarter than me. If they ditched OOP I would have no way to compete with them. There's not enough space in my tiny brain for all those fancy abstractions, so I have no choice but to be on the lookout for ways to achieve more with less. The only leverage I have in this game is ignorance. Software industry rules and best practices do not apply to me because I do not understand them. I'm free like a rat, to do what I want.

1. Discover the problem first

Allow yourself to discover the problem you have to solve, instead of solving it ahead of time. The only purpose of your initial solutions is to figure out what problem you are trying to solve.

If you are making a game, you can of course implement a super-generic multi-threaded ECS ahead of time, and then end up with a game that has 100 entities and 4 component types. Instead, you could put all entities in 1 fixed-size array, add all the fields you need to the Entity type and iterate over that array to update and render.

It's simple, it's dumb, it allows you to explore your problem space quickly and easily. In many cases, this dumb solution will also end up being all you need. But if you find yourself needing something more complex, you can easily implement it, because now you have a much better understanding of your problem.
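A minimal sketch of that dumb approach in C++ (the names, fields and the MAX_ENTITIES value are mine, for illustration only):

```cpp
#include <cstddef>

constexpr size_t MAX_ENTITIES = 128;

// One flat Entity type holding every field you currently need.
struct Entity {
    float x = 0, y = 0;     // position
    float vx = 0, vy = 0;   // velocity
    bool  alive = false;    // is this slot in use?
};

// One fixed-size array holds everything; no ECS, no threads.
Entity entities[MAX_ENTITIES];

// One dumb loop updates everything.
void update_entities(float dt) {
    for (size_t i = 0; i < MAX_ENTITIES; ++i) {
        if (!entities[i].alive) continue;
        entities[i].x += entities[i].vx * dt;
        entities[i].y += entities[i].vy * dt;
    }
}
```

When a real need shows up (spatial partitioning, component reuse), you refactor from here, with actual requirements in hand.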

You should never future-proof; it will never work, and you are only creating more work for your future self. When the requirements change, you will address them by changing your code based on those very concrete requirements. You don't want to spend too much *$ on speculative solutions, and every solution is speculative at first. Pay less upfront, pay more when you need to.

2. Put a number on it

When you need to store many of something, just guess how many you need, create an array of that size and use that. Arrays are cheap, even if they are 2x bigger than what you need. A good strategy is to use some power of 2 and update it when it becomes too small. In a current 3D game I'm building, everything is stored in fixed-size arrays, like so:

Engine_Storage :: struct {
    textures: [MAX_TEXTURE_COUNT]Texture;
    meshes:   [MAX_MESH_COUNT]Mesh;
    fonts:    [MAX_FONT_COUNT]Font;
    // ...
}

MAX_TEXTURE_COUNT :: 128;
MAX_MESH_COUNT    :: 64;
MAX_FONT_COUNT    :: 16;

In my whole code-base, there isn't a single dynamically sized container; the capacity for textures, meshes, components, etc. is known at compile time. All are some power of 2. When I exceed a capacity my compiler will yell at me and I will know to resize that array or fix a bug. Another benefit is that most of my memory needs (~95%) are handled at startup with 1 memory allocation, and I never have to free this memory because it lives as long as the game runs.

Dynamically sized containers, on the other hand, are more expensive to implement, slower and more bug-prone, so they should only be used when you absolutely need them. You can easily get away with storing most of your data in static containers and only using dynamic containers for the remaining few cases. By using dynamic containers sparingly we save a lot of *$.
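In C++, the same fixed-capacity idea can be sketched as an "array + count" pair, where exceeding the capacity trips an assert instead of silently reallocating (Fixed_Array and push are my names, not from the post):

```cpp
#include <cassert>
#include <cstddef>

// A fixed-capacity container: storage is compile-time sized,
// only the count of used slots changes at runtime.
template <typename T, size_t N>
struct Fixed_Array {
    T      items[N];
    size_t count = 0;
};

template <typename T, size_t N>
void push(Fixed_Array<T, N>& a, const T& item) {
    // Hitting this means: bump N to the next power of 2, or fix the bug.
    assert(a.count < N && "Fixed_Array capacity exceeded");
    a.items[a.count++] = item;
}
```

No heap allocation, no reallocation, and the capacity limit is a loud, debuggable event rather than a hidden cost.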

3. Make it public

Access modifiers and encapsulation are, in my opinion, completely useless. Make all your data public and write functions that perform work on that data. This is faster, more flexible and less code - in other words, savings are through the roof!

// Bad code:
class My_Scary_Type {
public:
    Data* get_data() { return m_data; }
    void set_data(Data* data) { m_data = data; }

private:
    Data* m_data;
    bool is_scary = true;
};

This is so common it hurts. What's the point of getters and setters if all they do is assignment with extra steps? A user of this class can do whatever they want with m_data from the outside anyways, so instead you can do this:

struct My_Scary_Type {
    Data* data;
    bool is_scary = true;
};

This is much cheaper, and it accomplishes the same thing. In "user code" you will always call some "API" function to work with My_Scary_Type and almost never access its fields directly anyways. So there's no point in adding a protective layer which only makes implementing the "API" functions that work with that type harder and less flexible.

This only gets more annoying when functionality is concerned with multiple different types, because you have to start thinking about their relationships: which type owns the method, which type is a friend of which, and so on.

Instead of creating 10x more work for yourself for no reason at all, just make every type POD (Plain Old Data) and write separate functions that implement all the functionality. This is much simpler, more flexible and significantly cheaper.

struct My_Scary_Type {
    Data* data;
    bool is_scary = true;
};

struct Another_Type {
    int value;
};

void do_something_scary(My_Scary_Type& t, Another_Type& t2) {
    // ...
}

4. Write big functions

If you are splitting a function into pieces and those pieces are only ever called in the body of the original function, then you are practically setting precious *$ on fire. Functions are beautiful in all sizes; you should be comfortable writing a function that is 50 lines long as well as one that is 1000 lines long, as long as those functions perform useful work.

// Terrible:
do_something :: () {
    first_step_of_doing_something();
    second_step_of_doing_something();
    // ...
    last_step_of_doing_something();
}

You should aim to optimize the ratio of working code to organizational code. In the terrible example above, the do_something function performs 0 work; all work is delegated to the *_step_of_doing_something functions.

Functions like this should not exist. Unless something is used in many places, do not factor it out. People read from top to bottom, and this implementation requires the reader to jump around all the time. It's a horrible waste of *$ and it accomplishes nothing.

5. Cut the libraries

Do not add libraries to your projects unless you are making good use of them. In many cases you can swap out a massive, overly generic library for a small, focused library that has just the functionality you need. Getting rid of libraries also makes your code easier to build and port to different platforms.

Often you can also implement some functionality yourself. A good example is the C++ standard library: often you can implement some version of the data structure or algorithm you need for 10% of the original price in *$. Another benefit of rolling your own solutions is that they will be written in your style, which makes them fit nicer into your existing code-base.

The only counterpoint here is that code which isn't overly generic and built for every single use-case on the planet is somehow "primitive". As if that's a problem - I like primitive. Monkey go, hit stone, fire burn. All that matters is that my users have the opposite of a "primitive" experience when they use my software.

You do not need generic solutions, and computers don't like generic solutions anyways. Computers are dumb, simple and discrete above all else. You may think there's an infinite number of use-cases for some code, but in reality there are only 5, and only 2 of those are common. Stop overthinking, be practical.

6. Be stupid

C++ defines the size_t type. It's returned whenever you call .size() on any of the standard C++ container types or when you use sizeof. The problem is: size_t is an unsigned 32-bit or 64-bit integer, depending on your system, and the chances you will ever make use of that last bit are practically 0%.

You don't need that, but you pay for it anyways. Unsigned integers wrap around easily: a simple mistake will turn 0 into 4,294,967,295, and representing invalid values is harder. With signed integers you can just use -1 and be happy. With unsigned integers you pick either 0 or MAX_VALUE as the invalid value, and using either puts you at risk of wrap-around.
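The wrap-around in question takes exactly one subtraction to trigger. A tiny sketch (helper names are mine; the behavior itself is standard C++, since unsigned arithmetic is defined modulo 2^32):

```cpp
#include <cstdint>

// Subtracting 1 from an unsigned 0 wraps to the maximum value.
uint32_t unsigned_minus_one() {
    uint32_t n = 0;
    return n - 1;   // wraps to 4,294,967,295
}

// The signed version just gives -1, a natural "invalid" marker.
int32_t signed_minus_one() {
    int32_t n = 0;
    return n - 1;
}
```

The classic casualty is the backwards loop: `for (uint32_t i = count - 1; i >= 0; --i)` never terminates, because `i >= 0` is always true for an unsigned `i`.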

The moral of the size_t story is that computers are not real. In the real world it makes sense that an index or a size cannot be negative. However, that's not what unsigned integers exist for; unsigned integers exist ONLY so that you can use the last bit to represent a larger value. They are not supposed to be used to communicate bounds. That's what asserts and if statements are for, and using unsigned integers does not reduce the amount of bounds checking you have to do - it just makes bounds checking more annoying.

Many programmers feel the urge to write code which is correct in some real-world sense, and by doing so they often end up with code which is impractical in the computer world. There's a lot of pride involved, to the point where "simple" is often confused with dumb or bad. In reality, an academically correct and highly sophisticated solution is often much worse than the "dumbest" alternative. In fact, the "dumbest" solutions are often objectively the best.

For instance, if you need a pseudo-random number generator in C++ to generate random colors, you can use a sophisticated algorithm such as mt19937, a Mersenne Twister with a period of 2^19937-1. Or you can implement xorshift32 for ~5*$:

uint32_t xorshift32(uint32_t& state) {
    state ^= (state << 13);
    state ^= (state >> 17);
    state ^= (state << 5);
    return state;
}

While mt19937 is academically correct and more sophisticated, it's expensive, and in this case the super cheap xorshift32 is superior in every possible way.
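For the random-colors use-case, using it might look like this (the Color struct and random_color are my illustration, not from the post; note that xorshift32 must be seeded with a non-zero value, since a state of 0 maps to 0 forever):

```cpp
#include <cstdint>

// The post's xorshift32, reproduced so this sketch is self-contained.
uint32_t xorshift32(uint32_t& state) {
    state ^= (state << 13);
    state ^= (state >> 17);
    state ^= (state << 5);
    return state;
}

struct Color { uint8_t r, g, b; };

// Slice one 32-bit random value into three 8-bit channels.
Color random_color(uint32_t& state) {
    uint32_t n = xorshift32(state);
    return Color{ uint8_t(n & 0xFF),
                  uint8_t((n >> 8) & 0xFF),
                  uint8_t((n >> 16) & 0xFF) };
}
```

Four lines of state transition, no templates, no heap, and plenty of randomness for picking colors.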

Smart people tend to have a higher tolerance for complexity and a greater desire for "correctness". Their definition of "simple" is skewed, and for that reason they have to fight an uphill battle against themselves. The only way to get away with intelligence in this industry is with an enormous dose of wisdom. Intelligence without wisdom is more dangerous than anything; it's better to be a wise idiot than a smart fool.

Intelligence answers the easier question: "how?". Wisdom answers the difficult question: "why?". In most cases "how?" is not the problem; "why?" is what makes or breaks a project.

Additionally, a smart programmer can understand dumb code, but a dumb programmer cannot understand smart code. If you want more people to understand your code you should make it dumb. If more people understand your code, then more people are able to efficiently modify it without adding performance regressions or bugs.

7. Don't create errors

It's common for programmers to promote regular program state to errors for no reason at all and then make those errors someone else's problem. It's by far the worst way to handle errors, and the best example of this is exception-based error handling.

Needless to say, exceptions are by far the worst programming language feature of all time. Whoever came up with exceptions did so as a joke, but the feature somehow blew up in popularity and they decided to play along. Exceptions cost infinite *$, and any software built with them is fighting an uphill battle against an infinite horde of self-inflicted problems.

Back to the topic. Most errors are just normal program state that you are too lazy to handle at a given moment. Real errors are unrecoverable; any error that you can handle is not an error. For instance, if I try to initialize the graphics driver and it fails, that's an error. Without graphics there's no point in running a game, and I have no way to fix the driver.

On the other hand, if a user is trying to sign in and you can't find them in the database, that's not an error. It's expected that a user may not be in the database. You may display this as an error to the end user, but in your code it should be regular, expected state. It should not be promoted to an error.
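A sketch of "absent is normal state, not an exception", using a toy in-memory table (User, find_user and the data are hypothetical, for illustration):

```cpp
#include <cstring>

struct User { const char* name; int id; };

static User users[] = { {"alice", 1}, {"bob", 2} };

// Returns nullptr when the user is absent.
// The caller checks the result; nothing is thrown, nothing "fails".
User* find_user(const char* name) {
    for (User& u : users)
        if (std::strcmp(u.name, name) == 0) return &u;
    return nullptr;   // expected state, not an error
}
```

The caller writes `if (User* u = find_user(name)) { ... } else { /* show "not found" */ }` and the not-found path stays ordinary control flow.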

Errors are like rabbits: you let 2 go and they become 20. If your function returns or throws an error, then every caller of that function may also return or throw an error. Before you realize it, half of your code-base is just error handling. You must take control as early as possible: reduce the number of failing functions and decrease the contact area between regular and failing code.

You should also make good use of asserts, which will help you define the boundaries of your program. Knowing the program's boundaries helps reduce the number of failing functions, because you understand your program state better throughout development.
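One way this plays out in code (names and the texture example are mine): an assert turns "could fail, must be handled by every caller" into "cannot happen past this boundary", so the function itself needs no failure path.

```cpp
#include <cassert>

constexpr int MAX_TEXTURE_COUNT = 128;

struct Texture { int id; };
static Texture textures[MAX_TEXTURE_COUNT];

// The assert states the boundary: indices are always in range here.
// In a debug build, a bad index crashes loudly at the source of the bug;
// the function never returns an error for its callers to propagate.
Texture* get_texture(int index) {
    assert(index >= 0 && index < MAX_TEXTURE_COUNT);
    return &textures[index];
}
```

One assert at the boundary replaces an error return that would otherwise multiply through every caller.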

8. Think big

You can save a lot of *$ if you are able to understand how a technical decision affects the entire project in the long run. This is what many programmers suck at, myself included; I only started getting better at it a couple of years ago.

If I say, for example, that shared_ptr is slow, your instinct is to argue that shared_ptr is slower than a regular pointer, but that it's not a big deal. That's the problem. Of course shared_ptr is slower than a regular pointer - that's obvious, and you pay for it at scale - however, that is not my point at all. Even if shared_ptr had no overhead it would still be slow.

On modern computers performance is basically free; all you have to do is give your hardware something useful to do. The only real bottleneck of modern CPUs and GPUs is memory. A big part of optimizing memory is understanding memory lifetimes and being very careful about allocating and freeing memory. When used across a system, shared_ptr obfuscates memory lifetimes to the point where it's not possible to reason about them at all. If we can't reason about memory lifetimes, we have little hope of optimizing memory, and for that reason we end up with a slower system.

Every feature must be evaluated at the big-picture scale, because everything works in theory and at a small scale. For instance, we used to think inheritance was great just because Cat and Dog could implement their own eat() method from the base Animal class. It makes perfect sense in this case; we are easily blinded by such contrived examples.

People who add such features are not doing so maliciously; they are trying to make useful things. But they have no way to test their ideas at a large scale before releasing them into the wild and letting us idiots find all the flaws. If every feature could be made perfect without testing it in big real-life situations, then we would have solved programming already. It's a fact of life that we are very bad at predicting how something will play out in the future; all we can do is learn from our own mistakes.

9. Focus on the software

Finally, it's important to remember why you are writing code to begin with. Code is intoxicating; you can easily get sidetracked and write code which serves no purpose and only provides some temporary satisfaction. That's not necessarily bad to do from time to time - writing code is fun.

But when your goal is to create software for other people, you should do your best to remember that code is the enemy. Try to find joy in building the best software instead, and everything will fall into place with time.