ECS Container Debugging Guide: How to Access and Debug Running Containers

Overview

Debugging applications running in Amazon ECS containers can be challenging, especially in production environments. This guide shows how to safely open a shell in a running ECS container using the ECS Exec feature (the aws ecs execute-command command) together with a few interactive command-line tools.

Prerequisites

Before you can access ECS containers for debugging, ensure you have the following tools and permissions:

Required Tools

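  1. AWS CLI - Configured with appropriate credentials
  2. Session Manager plugin for the AWS CLI - Required by aws ecs execute-command to open the session
  3. fzf - Fuzzy finder for interactive task selection
  4. jq - JSON processor for parsing AWS CLI output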

Installation Commands

# Install fzf
brew install fzf           # on macOS
sudo apt install fzf       # on Ubuntu/Debian
sudo yum install fzf       # on CentOS/RHEL

# Install jq (preinstalled on some systems; check with: jq --version)
brew install jq            # on macOS
sudo apt install jq        # on Ubuntu/Debian

Required AWS Permissions

Your AWS credentials must have the following permissions:

  • ecs:ListTasks
  • ecs:DescribeTasks
  • ecs:ExecuteCommand
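  • ssm:StartSession (for ECS Execute Command)

The task itself also needs permissions: its task IAM role must allow ssmmessages:CreateControlChannel, ssmmessages:CreateDataChannel, ssmmessages:OpenControlChannel, and ssmmessages:OpenDataChannel, which the ECS Exec agent inside the container uses to open the session.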

How the Access Script Works

Step 1: Environment Validation

The script first validates that all required tools are installed:

# Check if fzf is installed
if ! command -v fzf &> /dev/null; then
    echo "fzf is not installed. Please install it first"
    exit 1
fi
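
The same check can be extended to every required tool. The sketch below is one way to do that, not the script's actual code: require_tools is a hypothetical helper name, and session-manager-plugin is included on the assumption that the AWS CLI needs it to open execute-command sessions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report every missing tool at once instead of
# stopping at the first one. Returns 1 if anything is missing.
require_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "$tool is not installed. Please install it first" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call (the tool list is an assumption based on the prerequisites):
# require_tools aws jq fzf session-manager-plugin || exit 1
```

Reporting all missing tools in one pass saves a round trip for each missing dependency on a fresh machine.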

Step 2: Fetch Running Tasks

The script queries AWS ECS to get all running tasks for a specific service:

TASKS=$(aws ecs list-tasks \
    --cluster my-app-cluster-prod \
    --service-name my-api-service-prod \
    --query 'taskArns' \
    --output json)

Key Components:

  • Cluster: my-app-cluster-prod - The ECS cluster name
  • Service: my-api-service-prod - The specific service running the API containers
  • Query: taskArns - Extracts only the task ARNs from the response
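
If the service has no running tasks, the query returns an empty JSON array, and it helps to stop before the selection step. A minimal sketch, assuming the cluster and service names above; fetch_tasks, CLUSTER, and SERVICE are illustrative names, not part of the original script:

```shell
#!/usr/bin/env bash
# Illustrative variables; override them to target another environment.
CLUSTER="${CLUSTER:-my-app-cluster-prod}"
SERVICE="${SERVICE:-my-api-service-prod}"

# Print the running task ARNs as a JSON array, or fail with a clear message.
fetch_tasks() {
    local tasks
    tasks=$(aws ecs list-tasks \
        --cluster "$CLUSTER" \
        --service-name "$SERVICE" \
        --desired-status RUNNING \
        --query 'taskArns' \
        --output json) || return 1

    # With --query 'taskArns' an empty service is printed as "[]".
    if [ -z "$tasks" ] || [ "$tasks" = "[]" ]; then
        echo "No tasks found for $SERVICE in $CLUSTER" >&2
        return 1
    fi
    echo "$tasks"
}

# Example call:
# TASKS=$(fetch_tasks) || exit 1
```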

Step 3: Interactive Task Selection

Using fzf, the script provides an interactive interface to select which task to access:

TASK_ARN=$(echo "$TASKS" | jq -r '.[]' | fzf --height 40% --reverse --border)

Benefits of fzf:

  • Visual Selection: Arrow key navigation through available tasks
  • Filtering: Type to filter task ARNs in real-time
  • Safe Selection: Prevents typos in long ARN strings
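
fzf exits with a non-zero status and prints nothing when the selection is cancelled (for example with Esc), so the result is worth checking before it is passed to later commands. A sketch of that guard, with select_task as a hypothetical function name:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the selection step. Takes the JSON array of
# task ARNs as its first argument and prints the chosen ARN.
select_task() {
    local tasks_json="$1" choice
    # "|| true" keeps a cancelled fzf from aborting scripts that use set -e.
    choice=$(echo "$tasks_json" | jq -r '.[]' \
        | fzf --height 40% --reverse --border) || true
    if [ -z "$choice" ]; then
        echo "No task selected" >&2
        return 1
    fi
    echo "$choice"
}

# Example call:
# TASK_ARN=$(select_task "$TASKS") || exit 1
```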

Step 4: Container Information Retrieval

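The script reads the name of the first container in the selected task. For tasks that run several containers (for example, an application plus a sidecar), the first entry is not necessarily the application container, so adjust the query if needed: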

CONTAINER_NAME=$(aws ecs describe-tasks \
    --cluster my-app-cluster-prod \
    --tasks "$TASK_ARN" \
    --query 'tasks[0].containers[0].name' \
    --output text)

Fallback Mechanism: If container name detection fails, it defaults to "api" as the container name.
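
The fallback could look like the sketch below. It is an assumption about how the script implements it, not a copy of the original; get_container_name is a hypothetical name. Note that with --output text the AWS CLI prints the literal string None when the query matches nothing, so an empty-string check alone is not enough.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first container name of a task,
# falling back to "api" when detection fails.
get_container_name() {
    local task_arn="$1" name
    name=$(aws ecs describe-tasks \
        --cluster "${CLUSTER:-my-app-cluster-prod}" \
        --tasks "$task_arn" \
        --query 'tasks[0].containers[0].name' \
        --output text 2>/dev/null) || name=""

    # Treat a failed call, empty output, and "None" as "not detected".
    if [ -z "$name" ] || [ "$name" = "None" ]; then
        name="api"
    fi
    echo "$name"
}

# Example call:
# CONTAINER_NAME=$(get_container_name "$TASK_ARN")
```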

Step 5: Shell Detection and Access

The script checks which shells exist in the container and prefers bash, falling back to sh:

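# Check available shells (execute-command only supports interactive mode,
# so --interactive is required even for this one-off check)
SHELL_CHECK=$(aws ecs execute-command \
    --cluster my-app-cluster-prod \
    --task "$TASK_ARN" \
    --container "$CONTAINER_NAME" \
    --interactive \
    --command "which bash sh" 2>/dev/null)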

# Use bash if available, otherwise fall back to sh
if echo "$SHELL_CHECK" | grep -q "/bin/bash"; then
    # Execute with bash
    aws ecs execute-command \
        --cluster my-app-cluster-prod \
        --task "$TASK_ARN" \
        --container "$CONTAINER_NAME" \
        --interactive \
        --command "/bin/bash"
else
    # Execute with sh
    aws ecs execute-command \
        --cluster my-app-cluster-prod \
        --task "$TASK_ARN" \
        --container "$CONTAINER_NAME" \
        --interactive \
        --command "/bin/sh"
fi
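
The two branches differ only in the shell path, so the decision can be separated from the call. The following is a possible refactoring under that observation, not the script's actual code; choose_shell and open_shell are hypothetical names.

```shell
#!/usr/bin/env bash
# Print the shell to launch, given the captured output of "which bash sh".
choose_shell() {
    if printf '%s\n' "$1" | grep -q '/bin/bash'; then
        echo "/bin/bash"
    else
        echo "/bin/sh"
    fi
}

# Open an interactive session in the given task and container.
open_shell() {
    local task_arn="$1" container="$2" shell_path="$3"
    aws ecs execute-command \
        --cluster "${CLUSTER:-my-app-cluster-prod}" \
        --task "$task_arn" \
        --container "$container" \
        --interactive \
        --command "$shell_path"
}

# Example call:
# open_shell "$TASK_ARN" "$CONTAINER_NAME" "$(choose_shell "$SHELL_CHECK")"
```

Keeping the decision in a function that only inspects a string also makes it easy to check without a live cluster.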

Usage Instructions

Running the Script

  1. Navigate to the appropriate environment directory:

    cd prod/  # for production access
    # or
    cd dev/   # for development access
    
  2. Execute the access script:

    ./access_api.sh
    
  3. Follow the interactive prompts:

    • The script will fetch all running tasks
    • Use arrow keys to navigate through available tasks
    • Press Enter to select a task
    • The script will automatically connect you to the container

What You'll See

$ ./access_api.sh

Fetching tasks...
Select a task (use arrow keys, type to filter, press Enter to select):
> arn:aws:ecs:us-east-1:123456789012:task/my-app-cluster-prod/abc123def456
  arn:aws:ecs:us-east-1:123456789012:task/my-app-cluster-prod/def456ghi789
  arn:aws:ecs:us-east-1:123456789012:task/my-app-cluster-prod/ghi789jkl012

Selected task ARN: arn:aws:ecs:us-east-1:123456789012:task/my-app-cluster-prod/abc123def456
Getting container information...
Container name: api
Accessing container...
Checking available shells in the container...
Available shells: /bin/bash /bin/sh
Using /bin/bash...

# You're now inside the container!
root@container-id:/app#

Common Debugging Tasks

Once connected to the container, you can perform various debugging activities:

1. Check Application Logs

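# Follow the application log file (the path depends on your image)
tail -f /var/log/app.log

# Check system logs (only works if the image runs systemd, which most containers do not)
journalctl -f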

2. Inspect Running Processes

# List all running processes
ps aux

# Check specific application processes
ps aux | grep node  # for Node.js apps
ps aux | grep python  # for Python apps

3. Check Network Connectivity

# Test external connectivity
curl -I https://api.external-service.com

# Check internal service connectivity
curl -I http://internal-service:8080/health

4. Monitor Resource Usage

# Check memory usage
free -h

# Check disk usage
df -h

# Monitor real-time resource usage
top

5. Inspect Application Configuration

# Check environment variables
env | grep -i app

# View configuration files
cat /app/config/production.json

6. Database Connectivity

# Test database connection (example for PostgreSQL)
pg_isready -h database-host -p 5432

# Connect to database
psql -h database-host -U username -d database_name

Security Best Practices

1. Limit Access Duration

  • Only stay connected as long as necessary
  • Exit the container session when debugging is complete

2. Read-Only Operations

  • Prefer read-only operations when possible
  • Avoid modifying files or configurations in production

3. Audit Trail

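  • The ExecuteCommand API call for each session is recorded in CloudTrail; the commands and output inside the session are captured only if ECS Exec logging to CloudWatch Logs or S3 is configured on the cluster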
  • Document any debugging actions taken

4. Environment Awareness

  • Always verify you're in the correct environment (dev/prod)
  • Use the appropriate access script for each environment
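
One way to enforce the environment check is an explicit confirmation before any production session. This is a sketch of a possible guard, not something the original script is known to contain; confirm_environment is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical guard: require the user to type "prod" before a production
# session is opened. Other environments pass without a prompt.
confirm_environment() {
    local env_name="$1" answer
    if [ "$env_name" != "prod" ]; then
        return 0
    fi
    printf 'You are about to open a shell in PRODUCTION. Type "prod" to continue: ' >&2
    read -r answer || return 1
    [ "$answer" = "prod" ]
}

# Example call at the top of the production script:
# confirm_environment prod || exit 1
```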

Troubleshooting Common Issues

Issue 1: "ExecuteCommandAgent not running"

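Solution: ECS Exec is enabled on the service (or on the run-task call), not in the task definition. Make sure enableExecuteCommand is set to true there. Tasks that were started before it was enabled must be replaced, for example by forcing a new deployment, before you can connect to them.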
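
To see whether a given task was started with ECS Exec enabled and whether its agent is running, the task description can be queried directly. The sketch below assumes the cluster names used in this guide; exec_status and enable_exec are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Show the enableExecuteCommand flag and the managed agent status of a task.
exec_status() {
    local task_arn="$1"
    aws ecs describe-tasks \
        --cluster "${CLUSTER:-my-app-cluster-prod}" \
        --tasks "$task_arn" \
        --query 'tasks[0].{enabled: enableExecuteCommand, agents: containers[].managedAgents[].{name: name, status: lastStatus}}' \
        --output json
}

# Enable ECS Exec on a service and replace its tasks so the setting applies.
# Arguments: cluster name, service name.
enable_exec() {
    aws ecs update-service \
        --cluster "$1" \
        --service "$2" \
        --enable-execute-command \
        --force-new-deployment
}

# Example calls:
# exec_status "$TASK_ARN"
# enable_exec my-app-cluster-dev my-api-service-dev
```

Forcing a new deployment restarts the service's tasks, so in production it is worth scheduling rather than running in the middle of an incident.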

Issue 2: "AccessDenied" Error

Solution: Verify your AWS credentials have the required ECS and SSM permissions.

Issue 3: "No tasks found"

Solution: Check that the service is running and has active tasks:

aws ecs describe-services \
    --cluster my-app-cluster-prod \
    --services my-api-service-prod
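
The full describe-services output is long. A narrower query, sketched below with service_counts as a hypothetical helper name, shows only the fields that answer whether any tasks are running:

```shell
#!/usr/bin/env bash
# Print status and task counts for one service.
# Arguments: cluster name, service name.
service_counts() {
    aws ecs describe-services \
        --cluster "$1" \
        --services "$2" \
        --query 'services[0].{status: status, desired: desiredCount, running: runningCount, pending: pendingCount}' \
        --output table
}

# Example call:
# service_counts my-app-cluster-prod my-api-service-prod
```

A desired count of zero means the service has been scaled down; a desired count above zero with nothing running usually means tasks are failing to start.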

Issue 4: fzf Not Found

Solution: Install fzf using the package manager for your system.

Environment Differences

Development vs Production

  • Development: Uses my-app-cluster-dev and my-api-service-dev
  • Production: Uses my-app-cluster-prod and my-api-service-prod
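
Since the two environments differ only in the name suffix, a single script could derive both names from one argument instead of keeping separate copies per directory. This is an alternative sketch, not how the guide's scripts are organized; set_ecs_names, CLUSTER, and SERVICE are illustrative names.

```shell
#!/usr/bin/env bash
# Set CLUSTER and SERVICE for the given environment ("dev" or "prod").
set_ecs_names() {
    case "$1" in
        dev|prod)
            CLUSTER="my-app-cluster-$1"
            SERVICE="my-api-service-$1"
            ;;
        *)
            echo "Unknown environment: $1 (expected dev or prod)" >&2
            return 1
            ;;
    esac
}

# Example call:
# set_ecs_names "${1:-dev}" || exit 1
```

Rejecting unknown values outright avoids silently querying a cluster that does not exist because of a typo.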

Conclusion

This ECS container access script provides a safe, interactive way to debug running containers in your AWS ECS environment. By combining AWS CLI commands with user-friendly tools like fzf, it streamlines the debugging process while maintaining security best practices.

Remember to always:

  • Use the appropriate environment-specific script
  • Limit your debugging session duration
  • Document any significant findings or actions taken
  • Exit cleanly when debugging is complete

For additional debugging capabilities, consider implementing application-specific health check endpoints, comprehensive logging, and monitoring solutions to reduce the need for direct container access.