GoReplay: Traffic Mirroring and Replay for Stress Testing

GoReplay is a powerful open-source tool written in Go for capturing and replaying live HTTP traffic. It allows you to mirror real-user traffic from production to staging or testing environments, enabling realistic stress testing, performance analysis, and debugging without impacting live users.

Key Features

  • Real Traffic Replication: Captures and replays actual HTTP traffic, ensuring realistic test scenarios.
  • Minimal Overhead: Designed for low latency and minimal performance impact on production systems.
  • Traffic Filtering: Enables filtering of traffic based on various criteria (e.g., headers, URLs, methods).
  • Traffic Transformation: Allows modification or anonymization of captured traffic before replaying.
  • Multi-Target Replay: Replays traffic to multiple testing environments simultaneously.
  • Scalability: Handles high volumes of traffic with efficient resource utilization.
  • Open Source: The community edition is freely available and customizable under the LGPLv3 license.

Use Cases

GoReplay is valuable for various testing and development scenarios:

  • Stress Testing: Simulate production load on staging environments to identify performance bottlenecks.
  • Performance Monitoring: Analyze how specific code changes impact application performance under real-world traffic.
  • Regression Testing: Replay previously captured traffic to ensure that new code deployments do not introduce regressions.
  • Debugging: Recreate production issues in a controlled environment for easier debugging.
  • Security Testing: Evaluate the application's resilience against real-world attacks by replaying malicious traffic.
  • Traffic Shadowing: Test new features in production by shadowing live traffic without impacting user experience.

Basic Usage

Here's a basic example of capturing traffic from a production server and replaying it to a staging server:

1. Capture Traffic (on the production server):

sudo goreplay --input-raw :80 --output-file traffic.gor

This command captures all HTTP traffic on port 80 and saves it to a file named traffic.gor. Adjust the --input-raw port and add filters as needed. sudo is required because capturing raw packets from a network interface needs elevated privileges.

2. Replay Traffic (on the staging server):

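goreplay --input-file traffic.gor --output-http "http://staging.example.com"

This command replays the captured traffic from traffic.gor to the staging server at http://staging.example.com. Replaying from a file does not sniff the network, so sudo is normally not needed here. Depending on the version and options, GoReplay may write the capture in chunks with an index appended to the name (for example traffic_0.gor); if traffic.gor does not exist, point --input-file at the file that was actually written.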

Advanced Configuration Options

GoReplay provides a wide range of configuration options for controlling traffic capture and replay. Some common options include:

  • --input-raw <address>: Captures HTTP traffic arriving at the given address and port by sniffing packets on the network interface, without binding to the port itself (e.g., :80, 127.0.0.1:8080).
  • --output-file <filename>: Saves captured traffic to a file for later replay.
  • --output-http <url>: Specifies the URL to send replayed traffic to.
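  • --input-tcp <address>: Receives traffic forwarded over TCP by another GoReplay instance (the counterpart of --output-tcp), which is useful for aggregating traffic from several machines.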
  • --input-unix <path>: Captures traffic from a Unix socket, where supported. Confirm with goreplay --help, as not every release lists this option.
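  • --middleware <command>: Starts the given program as a long-running middleware process. GoReplay streams each message to its standard input and reads the modified (or filtered) message back from its standard output.
  • --split-output: Distributes traffic across multiple outputs in round-robin fashion instead of sending every request to every output.
  • --http-allow-header <header:regex>: Passes only requests whose named header matches the regex (e.g., api-version:^v1).
  • --http-disallow-header <header:regex>: Drops requests whose named header matches the regex.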
  • --http-allow-url <regex>: Only captures requests matching the specified URL regex.
  • --http-disallow-url <regex>: Excludes requests matching the specified URL regex.
  • --exit-after <duration>: Automatically exits after a specified duration (e.g., 1h, 30m).
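  • --output-http-track-response: Makes the responses returned by the replay target available to the other outputs and to middleware. The related --input-raw-track-response does the same for the original responses seen during capture.
  • --stats: Periodically reports queue statistics; combine it with --output-http-stats to report statistics for the HTTP output.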

Refer to the GoReplay documentation for a complete list of options and their descriptions.
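As a rough illustration of how several of these options combine, the sketch below replays a capture file to two targets. The host names staging-a.example.com and staging-b.example.com are placeholders, not part of the original example. Without --split-output each target would receive every request; with it, requests are distributed between the two.

```shell
# Replay a capture to two hypothetical staging hosts.
# --split-output=true distributes requests round-robin between the outputs;
# drop it to mirror every request to both targets instead.
goreplay --input-file traffic.gor \
  --output-http "http://staging-a.example.com" \
  --output-http "http://staging-b.example.com" \
  --split-output=true \
  --stats --output-http-stats
```

The boolean flag is written as --split-output=true so that the flags following it are still parsed.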

Filtering and Transformation

GoReplay also provides powerful filtering and transformation capabilities.

Filtering

You can selectively capture or replay traffic based on various criteria.

  • URL filtering: Only capture requests to specific URLs using --http-allow-url or --http-disallow-url.
  • Header filtering: Pass or drop requests based on the value of an HTTP header using --http-allow-header or --http-disallow-header.
  • Method filtering: Restrict traffic to specific HTTP methods using --http-allow-method (e.g., GET). More complex rules can be implemented in middleware.
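The filters above can be combined in a single command. The following is a minimal sketch, assuming the API lives under an /api path and that requests carrying a particular User-Agent value should be skipped; both values are illustrative.

```shell
# Replay only GET requests whose URL matches /api, and drop any request
# whose User-Agent header matches the given pattern (illustrative values).
goreplay --input-file traffic.gor \
  --http-allow-method GET \
  --http-allow-url /api \
  --http-disallow-header "User-Agent: Replayed by Gor" \
  --output-http "http://staging.example.com"
```

The same filter flags can be placed on the capture command instead, so that unwanted requests never reach the capture file.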

Transformation

GoReplay enables you to modify captured traffic before replaying it. One common use case is anonymizing sensitive data.

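  • Middleware Processing: Use the --middleware option to run a custom script or program alongside GoReplay. GoReplay passes each captured message to the program's standard input as a hex-encoded line, and the program writes the message back to standard output, optionally after modifying the request headers, body, or other attributes. Messages the program does not write back are not replayed.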
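To make the middleware idea concrete, here is a small bash sketch that masks the Authorization header before replay. It assumes the middleware protocol described in the GoReplay documentation: one hex-encoded message per line on standard input, where the decoded message starts with a metadata line followed by the raw HTTP payload. The helper names (hex_decode, hex_encode, mask_auth) are made up for this example, and the script is a starting point rather than a hardened anonymizer.

```shell
#!/usr/bin/env bash
# Hypothetical GoReplay middleware sketch: masks Authorization header values.

# Decode a hex string (first argument) to raw bytes on stdout.
# sed turns "4745..." into "\x47\x45...", which bash's printf %b expands.
hex_decode() {
  printf '%b' "$(printf '%s' "$1" | sed 's/../\\x&/g')"
}

# Encode raw bytes from stdin as one continuous lowercase hex string.
hex_encode() {
  od -An -v -tx1 | tr -d ' \n'
}

# Replace the value of every line starting with "Authorization:", keeping
# the CRLF line ending. Note: this also rewrites matching lines in a body.
mask_auth() {
  local cr=$'\r'
  sed -E "s/^([Aa]uthorization:).*\$/\\1 REDACTED${cr}/"
}

# Main loop: read one hex-encoded message per line, rewrite it, and emit
# the re-encoded message followed by a newline.
while IFS= read -r line; do
  hex_decode "$line" | mask_auth | hex_encode
  echo
done
```

Saved under a name of your choice (for example mask_auth.sh) and made executable, the script could be attached with something like goreplay --input-file traffic.gor --middleware "./mask_auth.sh" --output-http "http://staging.example.com". Working on byte streams through a pipeline, rather than storing the decoded message in a shell variable, keeps the trailing CRLF sequences of the HTTP message intact.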

Considerations

  • Data Sensitivity: Be mindful of capturing and replaying sensitive data (e.g., Personally Identifiable Information (PII)). Implement appropriate anonymization or filtering techniques to protect user privacy.
  • Database State: Replaying traffic can modify the state of your database. Ensure that your staging environment is properly isolated and that you have a mechanism for resetting the database before and after replaying traffic.
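  • Rate Limiting: Replayed traffic can overwhelm a staging environment that is smaller than production. Limit the replay rate in GoReplay itself (an output can be capped at a fixed number of requests per second or at a percentage of the traffic), or apply rate limiting on the staging side.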
  • Network Configuration: Ensure that your staging environment has the necessary network connectivity to receive replayed traffic from the machine running GoReplay.
  • Resource Utilization: Monitor the resource utilization of the machine running GoReplay to ensure that it has sufficient CPU, memory, and network bandwidth to handle the traffic volume. Consider running multiple GoReplay instances in parallel to scale performance.
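The rate-related considerations above can be handled on the command line. The sketch below shows the limiter syntax as described in the GoReplay documentation, where a value after the | character limits an output or scales an input; the numbers are arbitrary examples.

```shell
# Cap the replay at roughly 10 requests per second.
goreplay --input-file traffic.gor \
  --output-http "http://staging.example.com|10"

# Forward only about 10% of the captured requests.
goreplay --input-file traffic.gor \
  --output-http "http://staging.example.com|10%"

# Stress test: replay the capture at twice its recorded speed.
goreplay --input-file "traffic.gor|200%" \
  --output-http "http://staging.example.com"
```

Starting with a low limit and raising it step by step makes it easier to see at which load the staging environment begins to degrade.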

Installation

GoReplay can be installed using pre-built binaries or by compiling from source. Refer to the official documentation for detailed instructions.

Example (using go install):

go install github.com/buger/goreplay@latest

Make sure that $GOPATH/bin (or $GOBIN, if you have set it) is in your PATH environment variable.
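One way to do this for the current shell session is sketched below. It assumes the Go toolchain is installed and that the installed binary is named goreplay, as in the commands used throughout this page.

```shell
# Add Go's binary install directory to PATH for this shell session.
export PATH="$PATH:$(go env GOPATH)/bin"

# Confirm that the shell can now find the binary.
command -v goreplay
```

To make the change permanent, add the export line to your shell's startup file.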

Conclusion

GoReplay is a valuable tool for improving the quality, reliability, and performance of your applications by enabling realistic testing scenarios. By mirroring real-user traffic from production to testing environments, you can identify and resolve issues before they impact live users. Remember to handle sensitive data appropriately and consider the potential impact on database state and resource utilization.