
How To Code Go Part 4: Error Handling

Introduction

Programmers do not usually like thinking about errors. Introductory programming assignments tend to be silent about error handling, or at best dismissive. For many applications, the “best practice” for error handling is simply to exit the program as soon as an error is encountered.

In contrast, in a distributed system, error definitions and error handling are a critical aspect of product quality.

Here are some of the important things that we care about:

  • Errors should not cause running servers (e.g. a database node) to terminate immediately. Customers would consider this an unacceptable defect. Correct and deliberate error handling is a core part of product quality and stability.
  • Users will read the text of error messages, but they cannot be assumed to understand the source code. If an error message is confusing, users will bring confused questions to our tech support. If an error message is misleading, users will ask the wrong questions. Error messages should be clear and accurate and avoid referring to source code internals.
  • Errors are part of the API and thus error situations should be exercised in unit tests.
  • Errors make their way to log files and crash reports and can contain user-provided data. We separate customer-confidential data from non-confidential data in log files and crash reports, so we need to distinguish sensitive data inside error objects too.

Error Naming

For error values stored as global variables, use the prefix Err or err depending on whether they're exported.

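    var (
        // Exported so that callers can match them with errors.Is.
        ErrBrokenLink   = errors.New("link is broken")
        ErrCouldNotOpen = errors.New("could not open")

        // Unexported: not part of the package's public API.
        errNotFound = errors.New("not found")
    )

For custom error types, use the suffix Error.

    // Exported: callers can match it with errors.As.
    type NotFoundError struct {
        File string
    }

    func (e *NotFoundError) Error() string {
        return fmt.Sprintf("file %q not found", e.File)
    }

    // Unexported: not part of the package's public API.
    type resolveError struct {
        Path string
    }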

Error Types

There are a few options for declaring errors. Consider the following before picking the option best suited to your use case.

  • Does the caller need to match the error so that they can handle it? If yes, we must support the errors.Is or errors.As functions by declaring a top-level error variable or a custom type.
  • Is the error message a static string, or is it a dynamic string that requires contextual information? For the former, we can use errors.New, but for the latter we must use fmt.Errorf or a custom error type.
  • Are we propagating a new error returned by a downstream function? If so, see the section on error wrapping.

For example, use errors.New for an error with a static string. Export this error as a variable to support matching it with errors.Is if the caller needs to match and handle this error.

No error matching:

    // In package foo:
    func Open() error {
        return errors.New("could not open")
    }

    // In the caller:
    if err := foo.Open(); err != nil {
        // The error cannot be identified, so it cannot be handled.
        panic("unknown error")
    }

Error matching:

    // In package foo:
    var ErrCouldNotOpen = errors.New("could not open")

    func Open() error {
        return ErrCouldNotOpen
    }

    // In the caller:
    if err := foo.Open(); err != nil {
        if errors.Is(err, foo.ErrCouldNotOpen) {
            // handle the known error
        } else {
            panic("unknown error")
        }
    }

For an error with a dynamic string, use fmt.Errorf if the caller does not need to match it, and a custom error if the caller does need to match it.

No error matching:

    // In package foo:
    func Open(file string) error {
        return fmt.Errorf("file %q not found", file)
    }

    // In the caller:
    if err := foo.Open("testfile.txt"); err != nil {
        // The error cannot be identified, so it cannot be handled.
        panic("unknown error")
    }

Error matching:

    // In package foo:
    type NotFoundError struct {
        File string
    }

    func (e *NotFoundError) Error() string {
        return fmt.Sprintf("file %q not found", e.File)
    }

    func Open(file string) error {
        return &NotFoundError{File: file}
    }

    // In the caller:
    if err := foo.Open("testfile.txt"); err != nil {
        var notFound *foo.NotFoundError
        if errors.As(err, &notFound) {
            // handle NotFoundError
        } else {
            panic("unknown error")
        }
    }

Note that if you export error variables or types from a package, they will become part of the public API of the package.

Error Wrapping

There are three main options for propagating errors if a call fails:

  • return the original error as-is
  • add context with fmt.Errorf and the %w verb
  • add context with fmt.Errorf and the %v verb

Return the original error as-is if there is no additional context to add. This maintains the original error type and message. This is well suited for cases when the underlying error message has sufficient information to track down where it came from.

Otherwise, add context to the error message where possible so that instead of a vague error such as "connection refused", you get more useful errors such as "call service foo: connection refused".

Use fmt.Errorf to add context to your errors, picking between the %w or %v verbs based on whether the caller should be able to match and extract the underlying cause.

  • Use %w if the caller should have access to the underlying error. This is a good default for most wrapped errors, but be aware that callers may begin to rely on this behavior. So for cases where the wrapped error is a known var or type, document and test it as part of your function's contract.
  • Use %v to obfuscate the underlying error. Callers will be unable to match it, but you can switch to %w in the future if needed.

When adding context to returned errors, keep the context succinct by avoiding phrases like "failed to", which state the obvious and pile up as the error percolates up through the stack:

Bad:

    s, err := store.New()
    if err != nil {
        return fmt.Errorf("failed to create new store: %w", err)
    }

    // Resulting message:
    // failed to x: failed to y: failed to create new store: the error

Good:

    s, err := store.New()
    if err != nil {
        return fmt.Errorf("new store: %w", err)
    }

    // Resulting message:
    // x: y: new store: the error

However, once the error is sent to another system, it should be clear that the message is an error (e.g. an err tag or a "Failed" prefix in logs).

Handle Type Assertion Failures

The single return value form of a type assertion will panic on an incorrect type. Therefore, always use the "comma ok" idiom.

Bad:

    t := i.(string)

Good:

    t, ok := i.(string)
    if !ok {
        // handle the error
    }

Don't Panic

Code running in production must avoid panics. Panics are a major source of cascading failures. If an error occurs, the function must return an error and allow the caller to decide how to handle it.

Bad:

    func run(args []string) {
        if len(args) == 0 {
            panic("an argument is required")
        }
    }

    func main() {
        run(os.Args[1:])
    }

Good:

    func run(args []string) error {
        if len(args) == 0 {
            return errors.New("an argument is required")
        }
        return nil
    }

    func main() {
        if err := run(os.Args[1:]); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
    }

The same applies in tests: prefer t.Fatal over panic so that the test is marked as failed.

Bad:

    f, err := os.CreateTemp("", "test")
    if err != nil {
        panic("failed to set up test")
    }

Good:

    f, err := os.CreateTemp("", "test")
    if err != nil {
        t.Fatal("failed to set up test")
    }

Data Desensitization

Many error objects are copied into logs or returned to clients. To preserve the confidentiality of customer data, keep sensitive data out of error messages: for example, a user's internal ID, secrets, or internal server information.

Also take care if your system returns error messages that include user input to a web page. Escape the original input to prevent cross-site scripting (XSS) attacks.
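As a minimal sketch of the escaping step, the hypothetical helper notFoundHTML below builds an error message that embeds user input and passes the whole message through html.EscapeString from the standard library before it is placed in an HTML page. Pages rendered with html/template get contextual escaping automatically, so manual escaping like this is mainly needed when the HTML is assembled by hand.

```go
package main

import (
	"fmt"
	"html"
)

// notFoundHTML is a hypothetical helper that builds an error message
// containing user input and escapes it for use in HTML text content.
func notFoundHTML(userInput string) string {
	msg := fmt.Sprintf("file %q not found", userInput)
	return html.EscapeString(msg)
}

func main() {
	// The angle brackets and quotes in the input are replaced by HTML
	// entities, so the browser does not interpret the input as markup.
	fmt.Println(notFoundHTML("<script>alert(1)</script>"))
}
```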