AWS EC2 Security Log Monitoring: The Cloud-Native Way

How to utilize modern logging and Serverless technology to jump-start your security logging pipeline.

Forward Security Logs to S3 via Fluentd

To detect and prevent security breaches, security teams must understand everything that is happening in the environment. The primary way to accomplish this is by monitoring and analyzing log events, which provide information on activity within a system.

Traditionally, this was done with syslog, the standard Unix logging facility, which sends data to a set of aggregation points for storage, searching, and analysis.

However, collecting high-value security logs from a large fleet of machines can be a challenge. Over the years, new projects have emerged for performant and flexible log management, such as:

  1. syslog-ng
  2. rsyslog
  3. fluentd (and fluent-bit)
  4. logstash (and beats)

In this tutorial, we will walk through how to aggregate and store security logs the cloud-native way. We will use Fluentd to transport syslog data from AWS EC2 instances to Amazon S3 in a secure and performant manner. Syslog provides information on users connecting to systems, running sudo commands, installing applications, and more.

Getting Started with EC2 Monitoring

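Before starting, make sure you have the following set up:

  • The AWS CLI, installed and configured with credentials for your account
  • A local copy of the Panther Labs tutorials repository

You will use the AWS CLI to run CloudFormation with the predefined templates from the Panther Labs tutorials repository.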


Step 1: Set Up S3 Bucket, Instance Profile, and IAM Role

To centralize data from AWS EC2, we will use an S3 Bucket and an IAM Instance Profile that permits EC2 to send data to the bucket. Instance profiles provide temporary credentials that are generated and rotated automatically, which avoids the use of long-lived credentials.

Run the command below from the panther-labs/tutorials directory to set up all of the required infrastructure:

$ make deploy \
    tutorial=cloud-native-logging \
    stack=security-logging-infra \
    region=us-east-1 \
    parameters="--parameter-overrides OrgPrefix=<PrefixGoesHere>"

This will create the following:

  • S3 Data Bucket
  • Write-only IAM Role to send data to S3
  • An IAM Role to allow AWS EC2 to assume the write-only Role
  • IAM Instance Profile to attach to the instance
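
To make the relationship between these resources concrete, here is a minimal Python sketch of the kind of IAM policy documents involved. It is an illustration only, not the contents of the tutorial's CloudFormation template: the function names are hypothetical, and the exact actions and resources granted by the real template are assumptions.

```python
import json


def write_only_bucket_policy(bucket_name):
    """Permissions policy for the write-only role: upload objects, nothing else.

    Assumption: the real template may grant a different set of actions.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


def ec2_trust_policy():
    """Trust policy that lets the EC2 service assume the instance role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def assume_write_role_policy(write_role_arn):
    """Permissions policy for the instance role: it may only assume the write-only role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": write_role_arn,
            }
        ],
    }


def to_json(policy):
    """Render a policy document as indented JSON for review."""
    return json.dumps(policy, indent=2)
```

Calling `to_json(write_only_bucket_policy("<your-bucket-name>"))` prints a document you can compare against the role that CloudFormation actually created. Splitting the permissions this way means the instance itself never holds S3 permissions directly; it can only obtain short-lived credentials for the write-only role.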

Step 2: Launch AWS EC2 Instance and Configure Fluentd

Next, launch an Ubuntu instance with the IAM Instance Profile created above attached to it.

After a couple of moments, the instance will change to the running status in the AWS EC2 Console.

Connect to the instance with SSH:

$ ssh ubuntu@<public-dns-name> -i <path/to/keypair>

Follow the guide to install Fluentd.

Use the following Fluentd configuration (/etc/td-agent/td-agent.conf) to receive syslog messages on port 5140 and send them to S3:

<source>
  @type syslog
  port 5140
  bind 0.0.0.0
  tag system
</source>

<match system.**>
  @type s3

  <assume_role_credentials>
    duration_seconds 3600
    role_arn arn:aws:iam::<YOUR-AWS-ACCOUNT-ID>:role/<YOUR-ORG-PREFIX>S3WriteSecurityData
    role_session_name "#{Socket.gethostname}"
  </assume_role_credentials>

  s3_bucket <YOUR-AWS-ACCOUNT-ID>-security-data-us-<YOUR-REGION>
  s3_region <YOUR-REGION>

  path syslog/
  store_as gzip

  <format>
    @type json
  </format>

  <buffer tag,time>
    @type file
    path /var/log/td-agent/buffer/s3
    timekey 3600 # 1 hour partition
    timekey_wait 60m
    timekey_use_utc true # use utc
    chunk_limit_size 256m
  </buffer>
</match>
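
The buffer settings determine when data actually reaches S3. As a rough sketch of the timing, assuming that Fluentd flushes a time-keyed chunk once its window has closed and timekey_wait has elapsed, the following Python snippet computes the earliest upload time for an event. The function names are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta, timezone

TIMEKEY_SECONDS = 3600                 # timekey 3600 (one-hour partitions)
TIMEKEY_WAIT = timedelta(minutes=60)   # timekey_wait 60m


def chunk_window(event_time):
    """Return the (start, end) of the UTC timekey window containing event_time."""
    epoch = int(event_time.timestamp())
    start = epoch - (epoch % TIMEKEY_SECONDS)
    start_dt = datetime.fromtimestamp(start, tz=timezone.utc)
    return start_dt, start_dt + timedelta(seconds=TIMEKEY_SECONDS)


def earliest_flush(event_time):
    """Earliest time the chunk holding event_time is eligible for upload."""
    _, window_end = chunk_window(event_time)
    return window_end + TIMEKEY_WAIT
```

For example, an event received at 16:54:30 UTC falls into the 16:00 to 17:00 window, so under these assumptions its chunk becomes eligible for upload at 18:00 UTC. Lowering timekey_wait shortens this delay at the cost of a smaller grace period for late-arriving events.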

Next, configure rsyslog to forward messages to the local Fluentd daemon by adding these two lines to the bottom of /etc/rsyslog.d/50-default.conf:

# Send log messages to Fluentd
*.* @127.0.0.1:5140
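
The single @ in the rule above tells rsyslog to forward over UDP. To exercise the Fluentd listener without waiting for real system activity, a small Python sketch like the one below can send a hand-built test message to the same port. It assumes the listener accepts BSD-style (RFC 3164) syslog lines over UDP; the function names, the tag, and the default values are hypothetical.

```python
import socket
from datetime import datetime


def build_syslog_message(facility, severity, tag, text,
                         host="localhost", pid=1, when=None):
    """Build a BSD-style syslog line: <PRI>Mmm dd hh:mm:ss host tag[pid]: text."""
    when = when or datetime.now()
    pri = facility * 8 + severity  # priority = facility * 8 + severity
    # Day of month is space-padded to two characters in this format.
    timestamp = f"{when:%b} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{timestamp} {host} {tag}[{pid}]: {text}"


def send_udp(message, host="127.0.0.1", port=5140):
    """Send one syslog line as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (host, port))
```

Running `send_udp(build_syslog_message(4, 6, "tutorial-test", "hello from the tutorial"))` on the instance sends one message with facility 4 (auth) and severity 6 (info). If the pipeline is working, the message should eventually appear in an uploaded file with "ident" set to "tutorial-test".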

To enable this logging pipeline, start both services below:

$ sudo systemctl start td-agent.service
$ sudo systemctl restart rsyslog.service

To verify the Fluentd (td-agent) service is properly running:

$ sudo systemctl status td-agent.service
● td-agent.service - td-agent: Fluentd based data collector for Treasure Data
   Loaded: loaded (/lib/systemd/system/td-agent.service; disabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-12-03 16:47:04 UTC; 27min ago
...

$ sudo tail -f /var/log/td-agent/td-agent.log
2019-12-03 16:54:30 +0000 [info]: gem 'fluent-plugin-s3' version '1.2.0'
2019-12-03 16:54:30 +0000 [info]: gem 'fluent-plugin-td' version '1.0.0'
2019-12-03 16:54:30 +0000 [info]: gem 'fluent-plugin-td-monitoring' version '0.2.4'
2019-12-03 16:54:30 +0000 [info]: gem 'fluent-plugin-webhdfs' version '1.2.4'
2019-12-03 16:54:30 +0000 [info]: gem 'fluentd' version '1.7.4'
2019-12-03 16:54:30 +0000 [info]: adding match pattern="system.**" type="s3"
2019-12-03 16:54:30 +0000 [info]: adding source type="syslog"
2019-12-03 16:54:30 +0000 [info]: #0 starting fluentd worker pid=3272 ppid=1388 worker=0
2019-12-03 16:54:30 +0000 [info]: #0 listening syslog socket on 0.0.0.0:5140 with udp
2019-12-03 16:54:30 +0000 [info]: #0 fluentd worker is now running worker=0

If the service is unable to load or is throwing errors, verify the following:

  • The /etc/td-agent/td-agent.conf has no syntax errors
  • The IAM Role is properly attached to the instance

If no errors are present, continue onward!


Step 3: View Logs in S3

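Because the buffer uses an hourly timekey with a 60-minute timekey_wait, each chunk is uploaded about an hour after its hourly window closes. Expect the first data to land in your S3 Bucket one to two hours after the pipeline starts.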

Each file will have the following format:

{"host":"ip-172-31-92-150","ident":"systemd-timesyncd","pid":"538","message":"Synchronized to time server 91.189.94.4:123 (ntp.ubuntu.com)."}
{"host":"ip-172-31-92-150","ident":"CRON","pid":"32611","message":"(root) CMD (   cd / && run-parts --report /etc/cron.hourly)"}
{"host":"ip-172-31-92-150","ident":"systemd-timesyncd","pid":"538","message":"Network configuration changed, trying to establish connection."}
{"host":"ip-172-31-92-150","ident":"systemd-timesyncd","pid":"538","message":"Synchronized to time server 91.189.94.4:123 (ntp.ubuntu.com)."}
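
Since each object is a gzip-compressed file with one JSON record per line, it can also be inspected locally after download. The following Python sketch shows one way to do that using only the standard library; the function names are hypothetical, and the field names are taken from the sample records above.

```python
import gzip
import io
import json


def iter_events(gz_bytes):
    """Yield one dict per non-empty line of a gzip-compressed JSON Lines payload."""
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def filter_by_ident(events, ident):
    """Keep only the events emitted by the given program, for example 'sshd'."""
    return [event for event in events if event.get("ident") == ident]
```

Given the raw bytes of a downloaded object in a variable such as `data`, `filter_by_ident(iter_events(data), "CRON")` returns only the cron records.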

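To search through files, you can use S3 Select. Because the objects are gzip-compressed, newline-delimited JSON, set the file format to JSON, the JSON type to Lines, and the compression to GZIP: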

S3 Select File Settings

And then issue a SQL query to look for sshd events:

S3 Select Query and Result
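
The same search can be scripted. The sketch below builds the request parameters for an S3 Select call that looks for events from a given program. It assumes you would pass the result to the boto3 S3 client's select_object_content method; boto3 itself is not imported here, the helper name is hypothetical, and the query is an illustration rather than the exact query from the screenshot.

```python
def build_select_params(bucket, key, ident="sshd"):
    """Build keyword arguments for an S3 Select query over gzip JSON Lines.

    Single quotes in ident are doubled so the value stays a valid SQL string literal.
    """
    safe_ident = ident.replace("'", "''")
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": f"SELECT * FROM S3Object s WHERE s.ident = '{safe_ident}'",
        # Matches the Fluentd output settings: store_as gzip, JSON format.
        "InputSerialization": {"CompressionType": "GZIP", "JSON": {"Type": "LINES"}},
        "OutputSerialization": {"JSON": {}},
    }
```

With boto3 installed and credentials that allow reading the bucket, the parameters could be used as `boto3.client("s3").select_object_content(**build_select_params(bucket, key))`, where `bucket` and `key` identify one of the uploaded objects.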

If you are reading this, everything is working and you made it to the end!

Continuously Monitor EC2 Instances with Panther

This tutorial taught you how to configure secure and performant security log collection with Fluentd and send the logs directly to an S3 Bucket. This is a jumping-off point for more sophisticated collection and analysis of your choosing.

Monitoring EC2 is critical for tracking how your instances change over time and for detecting suspicious activity. Panther’s built-in policies support the continuous monitoring of EC2 instances, or you can write your own detections in Python to fit your internal business use cases.

Want to learn more about Panther? Book a demo today and find out why Panther is loved by cloud-first security teams. 
