When to Panic

TLDR: Never panic or throw an exception, unless your server is in an unrecoverable, corrupted state.

Exceptions are bad. If you have ever written enterprise code, exceptions are the reason you need to put try/catch statements everywhere. What’s worse, exceptions are considered idiomatic in Java, the prime enterprise language. The main issue with exceptions is that they are contagious. If a function can throw an exception, every caller of that function has to catch it. By itself this is fine: if a function can return an error, all of its callers must handle that error too. The difference is what happens when a caller does not handle it. An unhandled returned error stops at that caller, so the caller’s callers do not need to worry about it. An unhandled exception keeps going: the callers, the callers’ callers, the callers’ callers’ callers, etc. will each need a try/catch if the exception is not handled at a lower level.

# Config and parse_config are assumed to be defined elsewhere.
def get_config() -> Config:
    # open() raises FileNotFoundError if config.txt does not exist
    with open("config.txt", "r") as f:
        contents = f.read()
    return parse_config(contents)

def compute_interest(principal: float, annual_rate: float) -> float:
    is_monthly = get_config().period == "month"
    if is_monthly:
        # calculate monthly, not annual, interest
        r = (1 + annual_rate) ** (1.0/12.0)
        return (r - 1) * principal
    return annual_rate * principal

In the above, get_config will throw an exception if the file config.txt does not exist. In practice, get_config should catch this error and return some default configuration, but the issue with exceptions is that a reasonable reader might expect the function to do something like that when it does not. There is no type information indicating that an exception could be thrown. Any such information would have to live in a comment, which no type checker can read. Finally, the issue propagates to the callers of compute_interest. All of the issues with get_config now apply to compute_interest as well. There is no reason that a caller would expect an interest-calculating function to throw a file IO exception, but it could.
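As a rough sketch of the fix described above, get_config could catch the IO failure itself and fall back to a default. The Config fields, the "period=month" file format, and the choice of "year" as the default are all assumptions made for illustration; none of them come from a real codebase.

```python
from dataclasses import dataclass


@dataclass
class Config:
    # Assumed default: annual interest when no configuration is available.
    period: str = "year"


def parse_config(contents: str) -> Config:
    # Hypothetical format: a single line such as "period=month".
    key, _, value = contents.strip().partition("=")
    if key != "period" or value not in ("month", "year"):
        return Config()
    return Config(period=value)


def get_config(path: str = "config.txt") -> Config:
    try:
        with open(path, "r") as f:
            contents = f.read()
    except OSError:
        # Missing or unreadable file: fall back to the default configuration.
        return Config()
    return parse_config(contents)
```

Even with this fix, nothing in the signature tells a reader whether the fallback exists. The only way to find out is to read the body.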

Erlang encourages a philosophy called Let it Crash. The idea is to only code for the happy path. If there are any unexpected errors, do not catch them and let the program crash if need be. The proponents of this idea point to the fact that by only writing for the happy path, developers can be more productive without sacrificing code quality.

While this philosophy works well for small scripts or stateless microservices, it is no help when writing anything more complex. No programmer can afford to “let” their embedded program crash. Neither can a systems developer or a client-side programmer. In these cases, errors from IO are common and it is important to handle those errors gracefully. If a server crashes, then server-side automation can simply restart the server. But if an embedded application dies, the device will need to be restarted, which is an intolerable failure mode.

Rust and Go handle errors through functions returning error information. If you know the signature of the function, you know the errors that can come up. When you call the function, you get an object that contains all of the necessary success/failure information about the function call. Error handling can then be done at the level of type checking. For large codebases, this is a must. One cannot afford to have unanticipated exceptions thrown from deep in the stack.
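The same style can be approximated in Python by putting the error in the return type. This is only a sketch: ConfigError, the "period=month" file format, and the extra path parameter are hypothetical names introduced for the example, and a static checker such as mypy is assumed to be what enforces the Union.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Config:
    period: str


@dataclass
class ConfigError:
    # Hypothetical error value; carries a human-readable reason.
    reason: str


def parse_config(contents: str) -> Union[Config, ConfigError]:
    # Hypothetical format: a single line such as "period=month".
    key, _, value = contents.strip().partition("=")
    if key != "period" or value not in ("month", "year"):
        return ConfigError(f"bad config line: {contents.strip()!r}")
    return Config(period=value)


def get_config(path: str = "config.txt") -> Union[Config, ConfigError]:
    try:
        with open(path, "r") as f:
            contents = f.read()
    except OSError as e:
        # Convert the exception into a value at the IO boundary.
        return ConfigError(f"cannot read {path}: {e}")
    return parse_config(contents)


def compute_interest(
    principal: float, annual_rate: float, path: str = "config.txt"
) -> Union[float, ConfigError]:
    config = get_config(path)
    if isinstance(config, ConfigError):
        # The signature forces this caller to decide what to do with the error.
        return config
    if config.period == "month":
        r = (1 + annual_rate) ** (1.0 / 12.0)
        return (r - 1) * principal
    return annual_rate * principal
```

Now a caller of compute_interest can see from the signature alone that a configuration failure is possible, and a type checker will complain if the result is used as a float without checking.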

The only time exceptions are useful is when the server hits an unrecoverable error. This is when the server detects that some prerequisite is violated in a way where there is no correct way to proceed with some computation. For example, say some server detects that it is running on a corrupt CPU. The program should throw an exception when the corruption is detected since there is no way to recover. The idea is that if the computation should not continue and the error should not be handled elsewhere, then it makes sense to “handle” the error by killing the program’s execution.

The issue is that many programs simply return early when an error is encountered, passing the error value up to the caller. If that is true for the entire call stack, then returning an error and throwing an exception both cause the program’s execution to end early. How would a programmer know whether to return an error or throw an exception in this case? The answer is almost certainly that the program should return the error, not throw an exception. While the error handling may be trivial today, it could grow in complexity for any of the functions in the call stack. When that happens, having a function somewhere in the call stack that throws an exception instead of returning an error makes it much harder to add the more complex error handling. A function should leave it to its callers to decide whether an error it encountered is fatal.

Keeping this in mind, and returning to the case of a corrupt CPU, whatever function detected the corruption should return an error that notes that the CPU is corrupt. While it is almost certain that this will cause every function in the call stack to end its computation early and return a corruption error, execution should only end early if main returns early.
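A minimal sketch of that shape follows. The self-test is a stand-in (comparing an observed value to an expected one), and check_cpu, handle_request, and CorruptionError are hypothetical names, not part of any real system.

```python
import sys
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class CorruptionError:
    detail: str


def check_cpu(observed: int, expected: int) -> Optional[CorruptionError]:
    # Stand-in for a real hardware self-test: report the problem, do not raise.
    if observed != expected:
        return CorruptionError(f"self-test produced {observed}, expected {expected}")
    return None


def handle_request(observed: int, expected: int) -> Union[str, CorruptionError]:
    err = check_cpu(observed, expected)
    if err is not None:
        # Pass the error up; this function does not decide whether it is fatal.
        return err
    return "ok"


def main(observed: int = 4, expected: int = 4) -> int:
    result = handle_request(observed, expected)
    if isinstance(result, CorruptionError):
        # Only main decides that the program ends here.
        print(f"fatal: {result.detail}", file=sys.stderr)
        return 1
    print(result)
    return 0
```

A script would finish with sys.exit(main()). If some intermediate function later needs to log the corruption, retry on another machine, or flush state first, it can do so without anyone having to hunt for a raise buried deep in the stack.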

The above says that we should never panic because of an error, but should we panic because of a bug in the code? Maybe. This is the argument behind offensive programming, of which I am a fan. But offensive programming is not error handling. It uses panics to enforce the developer’s understanding of invariants in the codebase.
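For a sense of what that looks like, here is a small sketch in Python using assert. The monthly_rate helper and its invariant are made up for the example, and it assumes callers have already validated user input, so a violation can only mean a bug.

```python
def monthly_rate(annual_rate: float) -> float:
    # Invariant, not error handling: callers are assumed to have validated
    # the rate already, so a value at or below -100% can only be a bug.
    assert annual_rate > -1.0, f"caller bug: annual_rate={annual_rate}"
    return (1 + annual_rate) ** (1.0 / 12.0) - 1
```

Note that Python strips assert statements when run with the -O flag, so this only documents and checks the invariant in unoptimized runs.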