Panther Cloud-Native SIEM: Moving Beyond Traditional SIEMs

May 5, 2020

Learn how to detect, investigate, and remediate cybersecurity threats in real time.

Panther is an open-source, cloud-native platform for security information and event management (SIEM). In this post, we’ll discuss its platform architecture and walk through a typical attacker scenario to demonstrate how Panther can be used to detect and remediate threats in real time.

Architecture

Panther is a cloud-native SIEM with a serverless architecture, built entirely on AWS services such as Lambda, ECS, DynamoDB, S3, and Cognito.

How Panther’s Cloud-Native SIEM Works

At a high level:

Panther receives security logs from clouds, networks, endpoints, and more
Panther also runs baseline scans of cloud infrastructure to capture its current state
All of this data is received, parsed, analyzed, and saved to the data warehouse
Alerts are generated and dispatched to your team
Optional remediations are applied to misconfigured infrastructure

Panther’s design provides a holistic approach to SIEM, where logs are contextually joined with standardized fields, and infrastructure context can be gained by looking up cloud resource attributes in a single pane.

Example Use Case

To better understand how Panther’s Cloud-Native SIEM can be helpful, let’s walk through a typical attacker scenario:

SSH credentials are stolen, giving an attacker access to a production machine. Once connected to the host, the attacker begins to enumerate access and establish a foothold.

How can we detect, investigate, and remediate these behaviors?

Step 1: Preparation

The first step is to collect the proper data to power detections. In most cloud-focused organizations, this involves a combination of logs across various layers:

Cloud: AWS CloudTrail, S3 Access, GuardDuty
Network: VPC Flow, Switches/Firewalls, NIDS
Endpoint: Osquery, Syslog, Auditd, CrowdStrike
Application: SSO, Productivity Tools, Sales Applications

For this exercise, let’s assume we are collecting logs from AWS CloudTrail, VPC Flow, and Osquery.

To find the suspicious login, we’ll write a rule that analyzes osquery data from the logged_in_users table:

$ sudo osqueryi

osquery> SELECT * FROM logged_in_users WHERE type = 'user';
+------+--------+-------+----------------+------------+------+
| type | user   | tty   | host           | time       | pid  |
+------+--------+-------+----------------+------------+------+
| user | ubuntu | pts/0 | 136.24.229.194 | 1584146846 | 9459 |
+------+--------+-------+----------------+------------+------+

The above information provides context about how users are logging into our systems. Using the osquery aws_firehose logger plugin, these results can be sent to S3 and analyzed by Panther.
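
The `time` column in the row above appears to be a Unix epoch timestamp in seconds. As a small sketch, it can be converted to a readable UTC time with Python's standard library. The helper name `epoch_to_utc` is ours for illustration and is not part of osquery or Panther:

```python
from datetime import datetime, timezone


def epoch_to_utc(epoch_seconds: int) -> str:
    """Format Unix epoch seconds as a UTC timestamp string."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime('%Y-%m-%d %H:%M:%S')


# The `time` value from the logged_in_users row above
print(epoch_to_utc(1584146846))  # 2020-03-14 00:47:26
```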

In the example rule below, let’s ensure users are only logging in from centralized egress points, such as offices or VPNs:

import ipaddress

# Monitor the office IP Network
OFFICE_NETWORK = ipaddress.ip_network('192.0.1.0/24')


def rule(event):
    # Only look for new entries
    if event['action'] != 'added':
        return False

    # Make sure we are analyzing the right osquery table
    if 'logged_in_users' not in event['name']:
        return False

    # Check that the host IP is present
    host_ip = event['columns'].get('host')
    if not host_ip:
        return False

    # Alert when the IP is outside the office network.
    # Testing membership on the network itself avoids enumerating every host.
    if ipaddress.ip_address(host_ip) not in OFFICE_NETWORK:
        return True

    return False


# Group logins by user to track lateral movement
def dedup(event):
    return event['columns'].get('user')

Panther rules contain metadata to assist with triage, such as severity, log types, unit tests, runbooks, and more. This rule can be written directly in the Panther UI or uploaded programmatically with a CLI.
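
Since rules are plain Python functions, they can be exercised locally before upload. The sketch below is one way to do that, not Panther's own unit-test format. It repeats a condensed variant of the rule so the snippet runs on its own, and `make_event` builds a hypothetical osquery differential result containing only the fields the rule reads (the pack name and the in-office address are made up for illustration):

```python
import ipaddress

OFFICE_NETWORK = ipaddress.ip_network('192.0.1.0/24')


def rule(event):
    # Condensed variant of the rule above so this snippet runs on its own
    if event['action'] != 'added' or 'logged_in_users' not in event['name']:
        return False
    host_ip = event['columns'].get('host')
    if not host_ip:
        return False
    return ipaddress.ip_address(host_ip) not in OFFICE_NETWORK


def make_event(host, action='added', name='pack_incident-response_logged_in_users'):
    # Hypothetical osquery differential result; only the fields the rule reads
    return {
        'name': name,
        'action': action,
        'columns': {'type': 'user', 'user': 'ubuntu', 'tty': 'pts/0', 'host': host},
    }


# (description, event, expected rule result)
TEST_CASES = [
    ('login from outside the office', make_event('136.24.229.194'), True),
    ('login from inside the office', make_event('192.0.1.25'), False),
    ('removed rows are ignored', make_event('136.24.229.194', action='removed'), False),
    ('local login without a host IP', make_event(''), False),
    ('other osquery tables are ignored', make_event('136.24.229.194', name='pack_processes'), False),
]

for description, event, expected in TEST_CASES:
    assert rule(event) is expected, description
print('all cases passed')
```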

Step 2: Detect

After our rules are uploaded, Panther will start analyzing new logs in real time.

When the following suspicious login activity occurs:

We’ll see the following messages in Slack (or via any other supported Alert Destination):

Following the link in the alert to the Panther UI, we can now begin reviewing context and event details for the notification:

From this alert, we know:

Time of the first event (2020-03-14 00:47:47)
The IP the attacker used to connect (136.24.229.194)
The host that was logged into (ip-172-31-84-73)

This is the starting point for our investigation. Using Panther’s standardized data fields, we can begin to pivot through all of our data to answer additional questions.

Let’s dive deeper.

Step 3: Investigate

After Panther parses and analyzes logs, it saves them to a data warehouse for long-term retention. During this process, common indicators (IPs, domains, etc.) are extracted to allow fast searches across the log corpus.
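
To illustrate the idea of indicator extraction, here is a minimal sketch that walks a parsed log record and collects every string value that is a valid IP address. It is not Panther's implementation. The function name `extract_ip_addresses` and the sample record are hypothetical; only the `p_any_ip_addresses` field name comes from the queries in this post:

```python
import ipaddress


def extract_ip_addresses(value, found=None):
    """Recursively collect every string in a parsed log that is a valid IP."""
    if found is None:
        found = set()
    if isinstance(value, dict):
        for nested in value.values():
            extract_ip_addresses(nested, found)
    elif isinstance(value, (list, tuple)):
        for nested in value:
            extract_ip_addresses(nested, found)
    elif isinstance(value, str):
        try:
            found.add(str(ipaddress.ip_address(value)))
        except ValueError:
            pass  # Not an IP address; ignore
    return found


# Hypothetical parsed VPC Flow record
record = {'srcaddr': '136.24.229.194', 'dstaddr': '172.31.84.73', 'dstport': 22}
record['p_any_ip_addresses'] = sorted(extract_ip_addresses(record))
print(record['p_any_ip_addresses'])  # ['136.24.229.194', '172.31.84.73']
```

Storing the extracted values in one standardized field is what lets a single query match an IP regardless of which log type or column it originally appeared in.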

In the Panther database (at the time of writing: Athena or Snowflake), we can use SQL to query all logs related to this IP:

SELECT DISTINCT p_log_type
FROM "panther_views"."all_logs" 
WHERE contains(p_any_ip_addresses, '136.24.229.194')
AND month=3 AND day=14

count 	p_log_type
1	AWS.CloudTrail
2	AWS.VPCFlow
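
During an investigation the same pivot is often repeated for several indicators. As a rough sketch, the query above could be generated by a small helper; `build_log_type_query` is a hypothetical name, and the table and column names are taken from the query in this post. The IP is validated before being placed in the SQL text so that arbitrary input is never interpolated:

```python
import ipaddress
from datetime import date


def build_log_type_query(ip: str, day: date) -> str:
    """Build the log-type pivot query for one IP on one day."""
    # Raises ValueError if `ip` is not a valid IPv4 or IPv6 address
    safe_ip = str(ipaddress.ip_address(ip))
    return (
        'SELECT DISTINCT p_log_type\n'
        'FROM "panther_views"."all_logs"\n'
        f"WHERE contains(p_any_ip_addresses, '{safe_ip}')\n"
        f'AND month={day.month} AND day={day.day}'
    )


print(build_log_type_query('136.24.229.194', date(2020, 3, 14)))
```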

From here, we can find every instance that this IP connected to over SSH in the same timeframe:

SELECT instanceid, COUNT(*) AS login_count
FROM "panther_logs"."aws_vpcflow" 
WHERE srcaddr = '136.24.229.194'
AND dstport=22 AND month=3 AND day=14
GROUP BY instanceid
ORDER BY login_count DESC

instanceid			login_count
i-016e2cb69ac58c2d5	1

We can then look at this instance in Panther’s resource search, which provides all attributes and associated policy successes and failures that could indicate security vulnerabilities.

Looking into the attributes, we find the following:

With this public IP, we can query CloudTrail to find related API calls from this host:

SELECT eventname, COUNT(*) AS event_count
FROM "panther_logs"."aws_cloudtrail"
WHERE sourceipaddress='54.164.105.138'
AND month=3 AND day=14 AND errormessage = ''
GROUP BY eventname
ORDER BY event_count DESC

We can also use osquery data to list the shell commands run on the host with the private DNS name ip-172-31-84-73:

SELECT * 
FROM "panther_logs"."osquery_differential"
WHERE hostidentifier='ip-172-31-84-73'
AND month=3 AND day>=14 
AND name LIKE '%shell_history'
AND decorations['username'] = 'ubuntu'

1	2020-03-12 06:58:06.000	aws sts get-caller-identity
2	2020-03-12 06:58:06.000	aws iam get-role --role-name TestDemoRole
3	2020-03-12 06:58:06.000	aws iam list-users --region us-east-2
4	2020-03-12 06:58:06.000	aws cloudformation describe-trails
5	2020-03-12 06:58:06.000	aws s3 ls --region us-east-1
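
The commands above are typical AWS enumeration, which suggests a follow-up detection. The sketch below uses the same `rule(event)` shape as the login rule earlier and flags `shell_history` rows that look like read-only AWS CLI reconnaissance. The prefix list and the `make_event` helper are assumptions for illustration, and the check expects the simple form `aws <service> <operation>`, so commands with global options before the service name would be missed:

```python
import shlex

# Hypothetical operation prefixes that suggest read-only reconnaissance
ENUMERATION_PREFIXES = ('get-', 'list-', 'describe-', 'ls')


def rule(event):
    # Only look at new rows from the shell_history table
    if event.get('action') != 'added' or 'shell_history' not in event.get('name', ''):
        return False

    command = event.get('columns', {}).get('command', '')
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # Unbalanced quotes; skip rather than guess

    # Expect: aws <service> <operation> [options]
    if len(tokens) < 3 or tokens[0] != 'aws':
        return False
    return tokens[2].startswith(ENUMERATION_PREFIXES)


def make_event(command):
    # Hypothetical osquery differential result for a shell_history row
    return {
        'name': 'pack_incident-response_shell_history',
        'action': 'added',
        'columns': {'command': command},
    }


print(rule(make_event('aws iam list-users --region us-east-2')))  # True
print(rule(make_event('ls -la /tmp')))  # False
```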

Step 4: Remediate and Post-Incident

Now that we’ve answered all of our investigation questions, it’s safe to terminate the instance, rotate credentials, and fix ACLs related to the root cause. Panther helps here by offering automatic remediation capabilities.

Navigating to the security group associated with the instance will show the following policies:

As illustrated above, Panther had detected an ACL failure that was never fixed and that was one of the causes of the compromise. Clicking REMEDIATE corrects the resource in the affected account. This functionality is generally used during incident response for containment.

Finally, your team can update and push new rules and policies to prevent this from happening again. Your infrastructure will be hardened, and the monitoring cycle will restart.
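
As one example of such a policy, the sketch below fails any security group that allows SSH from the entire internet. The post does not specify which ACL failure was involved, so this particular check is an assumption. It also assumes the resource has the same shape as the EC2 `DescribeSecurityGroups` response (`IpPermissions`, `IpRanges`, and so on) and that a policy is a `policy(resource)` function returning `True` when the resource is compliant:

```python
def policy(resource):
    """Return True when the security group does not expose SSH to the world."""
    for permission in resource.get('IpPermissions') or []:
        protocol = permission.get('IpProtocol')
        from_port = permission.get('FromPort')
        to_port = permission.get('ToPort')

        # Protocol '-1' means all protocols and all ports
        covers_ssh = protocol == '-1' or (
            protocol == 'tcp'
            and from_port is not None
            and to_port is not None
            and from_port <= 22 <= to_port
        )
        if not covers_ssh:
            continue

        open_v4 = any(r.get('CidrIp') == '0.0.0.0/0' for r in permission.get('IpRanges') or [])
        open_v6 = any(r.get('CidrIpv6') == '::/0' for r in permission.get('Ipv6Ranges') or [])
        if open_v4 or open_v6:
            return False
    return True


# Hypothetical security group that allows SSH from anywhere
open_group = {
    'IpPermissions': [
        {'IpProtocol': 'tcp', 'FromPort': 22, 'ToPort': 22, 'IpRanges': [{'CidrIp': '0.0.0.0/0'}]}
    ]
}
print(policy(open_group))  # False
```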