How To Code Go Part 4: Error Handling
Introduction
Programmers do not usually like thinking about errors. Early programming assignments tend to be silent about error handling, or at best dismissive. For many applications, the “best practice” for error handling is “exit the program as soon as an error is encountered”.
In contrast, in a distributed system, error definitions and error handling are a critical aspect of product quality.
Here are some of the important things that we care about:
- Errors should not cause running servers (e.g. a database node) to terminate immediately. Customers would consider this an unacceptable defect. Correct and deliberate error handling is a core part of product quality and stability.
- Users will read the text of error messages, but they cannot be assumed to understand the source code. If an error message is confusing, users will ask confused questions of our tech support. If an error message is misleading, users will ask the wrong questions. Error messages should be clear and accurate, and should avoid referring to source code internals.
- Errors are part of the API and thus error situations should be exercised in unit tests.
- Errors make their way into log files and crash reports, and can contain user-provided data. We want to separate customer confidential data from non-confidential data in log files and crash reports, so we need to distinguish sensitive data inside error objects too.
Error Naming
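For error values stored as global variables, use the prefix `Err` or `err` depending on whether they're exported. For custom error types, use the suffix `Error` instead.

Error variables:

    var (
        // Exported: callers can match these with errors.Is.
        ErrBrokenLink   = errors.New("link is broken")
        ErrCouldNotOpen = errors.New("could not open")

        // Unexported: can only be matched inside this package.
        errNotFound = errors.New("not found")
    )

Custom error types:

    // Exported: callers can match this type with errors.As.
    type NotFoundError struct {
        File string
    }

    func (e *NotFoundError) Error() string {
        return fmt.Sprintf("file %q not found", e.File)
    }

    // Unexported: can only be matched inside this package.
    type resolveError struct {
        Path string
    }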
Error Types
There are a few options for declaring errors. Consider the following before picking the option best suited to your use case.
- Does the caller need to match the error so that they can handle it? If yes, we must support the `errors.Is` or `errors.As` functions by declaring a top-level error variable or a custom type.
- Is the error message a static string, or is it a dynamic string that requires contextual information? For the former, we can use `errors.New`, but for the latter we must use `fmt.Errorf` or a custom error type.
- Are we propagating a new error returned by a downstream function? If so, see the section on error wrapping.
For example, use `errors.New` for an error with a static string. Export this error as a variable to support matching it with `errors.Is` if the caller needs to match and handle this error.
No error matching:

    // package foo
    func Open() error {
        return errors.New("could not open")
    }

    // caller, in another package
    if err := foo.Open(); err != nil {
        panic("unknown error")
    }

Error matching:

    // package foo
    var ErrCouldNotOpen = errors.New("could not open")

    func Open() error {
        return ErrCouldNotOpen
    }

    // caller, in another package
    if err := foo.Open(); err != nil {
        if errors.Is(err, foo.ErrCouldNotOpen) {
            // handle the known error
        } else {
            panic("unknown error")
        }
    }
For an error with a dynamic string, use `fmt.Errorf` if the caller does not need to match it, and a custom `error` if the caller does need to match it.
No error matching:

    // package foo
    func Open(file string) error {
        return fmt.Errorf("file %q not found", file)
    }

    // caller, in another package
    if err := foo.Open("testfile.txt"); err != nil {
        panic("unknown error")
    }

Error matching:

    // package foo
    type NotFoundError struct {
        File string
    }

    func (e *NotFoundError) Error() string {
        return fmt.Sprintf("file %q not found", e.File)
    }

    func Open(file string) error {
        return &NotFoundError{File: file}
    }

    // caller, in another package
    if err := foo.Open("testfile.txt"); err != nil {
        var notFound *foo.NotFoundError
        if errors.As(err, &notFound) {
            // handle NotFoundError
        } else {
            panic("unknown error")
        }
    }
Note that if you export error variables or types from a package, they will become part of the public API of the package.
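The choices in this section can be condensed into one runnable sketch. The `open` and `classify` functions and the file names are hypothetical, invented only to show a static error matched with `errors.Is` next to a dynamic error matched with `errors.As`.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCouldNotOpen has a static message; callers match it with errors.Is.
var ErrCouldNotOpen = errors.New("could not open")

// NotFoundError carries dynamic context; callers match it with errors.As.
type NotFoundError struct {
	File string
}

func (e *NotFoundError) Error() string {
	return fmt.Sprintf("file %q not found", e.File)
}

// open is a hypothetical stand-in for a real file-opening function.
func open(file string) error {
	switch file {
	case "":
		return ErrCouldNotOpen
	case "missing.txt":
		return &NotFoundError{File: file}
	}
	return nil
}

// classify shows how a caller can tell the two kinds of error apart.
func classify(err error) string {
	var notFound *NotFoundError
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrCouldNotOpen):
		return "could not open"
	case errors.As(err, &notFound):
		return "not found: " + notFound.File
	default:
		return "unknown error"
	}
}

func main() {
	for _, f := range []string{"", "missing.txt", "present.txt"} {
		fmt.Println(classify(open(f)))
	}
}
```

Both `ErrCouldNotOpen` and `NotFoundError` are exported here, so in a real package they would be part of its public API.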
Error Wrapping
There are three main options for propagating errors if a call fails:
- return the original error as-is
- add context with `fmt.Errorf` and the `%w` verb
- add context with `fmt.Errorf` and the `%v` verb
Return the original error as-is if there is no additional context to add. This maintains the original error type and message. This is well suited for cases when the underlying error message has sufficient information to track down where it came from.
Otherwise, add context to the error message where possible so that instead of a vague error such as "connection refused", you get more useful errors such as "call service foo: connection refused".
Use `fmt.Errorf` to add context to your errors, picking between the `%w` or `%v` verbs based on whether the caller should be able to match and extract the underlying cause.
- Use `%w` if the caller should have access to the underlying error. This is a good default for most wrapped errors, but be aware that callers may begin to rely on this behavior. So for cases where the wrapped error is a known `var` or type, document and test it as part of your function's contract.
- Use `%v` to obfuscate the underlying error. Callers will be unable to match it, but you can switch to `%w` in the future if needed.
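The difference between the two verbs is easiest to see side by side. The following is a minimal sketch; `errRefused`, `callWrapped`, `callOpaque` and `isRefused` are hypothetical names made up for the illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errRefused is a hypothetical low-level error used only for this illustration.
var errRefused = errors.New("connection refused")

// callWrapped adds context with %w, so the cause stays in the error chain.
func callWrapped() error {
	return fmt.Errorf("call service foo: %w", errRefused)
}

// callOpaque adds the same context with %v, which keeps only the text of the cause.
func callOpaque() error {
	return fmt.Errorf("call service foo: %v", errRefused)
}

// isRefused reports whether errRefused can be found in err's chain.
func isRefused(err error) bool {
	return errors.Is(err, errRefused)
}

func main() {
	w, o := callWrapped(), callOpaque()

	// Both errors produce the same message text.
	fmt.Println(w.Error() == o.Error()) // true

	// Only the %w version can be matched by the caller.
	fmt.Println(isRefused(w)) // true
	fmt.Println(isRefused(o)) // false
}
```

Because the two messages are identical, switching a call site from `%v` to `%w` later changes only what callers can match, not what users read.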
When adding context to returned errors, keep the context succinct by avoiding phrases like "failed to", which state the obvious and pile up as the error percolates up through the stack:
Bad:

    s, err := store.New()
    if err != nil {
        return fmt.Errorf("failed to create new store: %w", err)
    }

Resulting message once the error has moved up the stack:

    failed to x: failed to y: failed to create new store: the error

Good:

    s, err := store.New()
    if err != nil {
        return fmt.Errorf("new store: %w", err)
    }

Resulting message once the error has moved up the stack:

    x: y: new store: the error
However, once the error is sent to another system, it should be clear that the message is an error (e.g. an `err` tag or "Failed" prefix in logs).
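To see how short contexts accumulate, here is a small sketch in which every layer adds only a brief label and the "Failed" prefix is attached once, at the point where the message leaves the program. The functions `newStore`, `y`, `x`, `run` and `logLine` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// newStore is a hypothetical constructor that always fails.
func newStore() error {
	return errors.New("the error")
}

// y, x and run are hypothetical layers; each adds only a short context.
func y() error {
	if err := newStore(); err != nil {
		return fmt.Errorf("new store: %w", err)
	}
	return nil
}

func x() error {
	if err := y(); err != nil {
		return fmt.Errorf("y: %w", err)
	}
	return nil
}

func run() error {
	if err := x(); err != nil {
		return fmt.Errorf("x: %w", err)
	}
	return nil
}

// logLine marks the message as an error once it leaves the program,
// here with a "Failed:" prefix.
func logLine(err error) string {
	return fmt.Sprintf("Failed: %v", err)
}

func main() {
	if err := run(); err != nil {
		fmt.Println(logLine(err)) // Failed: x: y: new store: the error
	}
}
```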
Handle Type Assertion Failures
The single return value form of a type assertion will panic on an incorrect type. Therefore, always use the "comma ok" idiom.
Bad:

    t := i.(string)

Good:

    t, ok := i.(string)
    if !ok {
        // handle the error
    }
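One way to "handle the error" is to turn the failed assertion into an ordinary error value. The sketch below assumes a hypothetical helper named `asString` and requires Go 1.18 or later for the `any` alias.

```go
package main

import "fmt"

// asString is a hypothetical helper that converts a failed type assertion
// into an error instead of letting it panic.
func asString(i any) (string, error) {
	s, ok := i.(string)
	if !ok {
		return "", fmt.Errorf("expected string, got %T", i)
	}
	return s, nil
}

func main() {
	for _, v := range []any{"hello", 42} {
		s, err := asString(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("value:", s)
	}
}
```

The `%T` verb puts the actual dynamic type into the message, which is usually enough for the caller to diagnose the mismatch.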
Don't Panic
Code running in production must avoid panics. Panics are a major source of cascading failures. If an error occurs, the function must return an error and allow the caller to decide how to handle it.
Bad:

    func run(args []string) {
        if len(args) == 0 {
            panic("an argument is required")
        }
    }

    func main() {
        run(os.Args[1:])
    }

Good:

    func run(args []string) error {
        if len(args) == 0 {
            return errors.New("an argument is required")
        }
        return nil
    }

    func main() {
        if err := run(os.Args[1:]); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
    }

The same applies in tests: prefer `t.Fatal` to `panic`, so that the test is reported as failed.

Bad:

    f, err := os.CreateTemp("", "test")
    if err != nil {
        panic("failed to set up test")
    }

Good:

    f, err := os.CreateTemp("", "test")
    if err != nil {
        t.Fatal("failed to set up test")
    }
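Returning an error instead of panicking is what lets different callers choose different policies. The following sketch assumes a hypothetical `parsePort` helper with two callers: one falls back to a default, the other reports the error.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort is a hypothetical helper. It reports bad input as an error
// instead of panicking.
func parsePort(s string) (int, error) {
	p, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	if p < 1 || p > 65535 {
		return 0, fmt.Errorf("port %d out of range", p)
	}
	return p, nil
}

// portOrDefault is one possible caller policy: fall back to a default value.
func portOrDefault(s string, def int) int {
	p, err := parsePort(s)
	if err != nil {
		return def
	}
	return p
}

func main() {
	fmt.Println(portOrDefault("8080", 8000))
	fmt.Println(portOrDefault("not-a-port", 8000))

	// Another caller policy: surface the error to the operator.
	if _, err := parsePort("70000"); err != nil {
		fmt.Println("startup check:", err)
	}
}
```

Had `parsePort` panicked, neither policy would have been available to its callers.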
Data Desensitization
Many error objects are copied into logs or returned to clients. To preserve the confidentiality of our customer data, we take care to keep sensitive data out of error messages, such as a user's internal ID, secrets, or internal server information.
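One possible way to keep sensitive data out of the message is to store it in a separate field that `Error` never formats. This is only a sketch under that assumption; the `loginError` type, the `authenticate` function and the sample values are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// loginError is a hypothetical error type. Error returns only text that is
// safe to log or send to a client; the sensitive field stays out of the message.
type loginError struct {
	userID string // sensitive: never formatted into the message
	reason string // safe, static description
}

func (e *loginError) Error() string {
	return "login failed: " + e.reason
}

// authenticate is a hypothetical function that always fails, for illustration.
func authenticate(userID, secret string) error {
	return &loginError{userID: userID, reason: "invalid credentials"}
}

func main() {
	err := fmt.Errorf("handle request: %w", authenticate("u-12345", "hunter2"))

	// The message contains neither the user ID nor the secret.
	fmt.Println(err) // handle request: login failed: invalid credentials

	// Trusted code can still reach the sensitive field through errors.As.
	var le *loginError
	if errors.As(err, &le) {
		_ = le.userID // e.g. hand off to an access-controlled audit sink
	}
}
```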
Also be careful if your system returns error messages that contain user input to a web page. Remember to escape the original input to avoid cross-site scripting (XSS) attacks.
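As a minimal sketch of such escaping, the standard library's `html.EscapeString` can be applied to the message before it is placed in a page. The `userError` and `renderError` helpers and the markup are hypothetical.

```go
package main

import (
	"fmt"
	"html"
)

// userError is a hypothetical function whose error message echoes user input.
func userError(name string) error {
	return fmt.Errorf("unknown user %s", name)
}

// renderError is a hypothetical helper that builds an HTML fragment for an
// error message. The message may contain user input, so it is escaped first.
func renderError(err error) string {
	return `<p class="error">` + html.EscapeString(err.Error()) + `</p>`
}

func main() {
	input := `<script>alert(1)</script>`
	fmt.Println(renderError(userError(input)))
	// <p class="error">unknown user &lt;script&gt;alert(1)&lt;/script&gt;</p>
}
```

In a real web application, rendering through `html/template` is usually preferable, since it escapes values according to the context in which they appear.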