Learning Objectives
- Revisiting log files and their importance.
- Understanding what a proxy is and breaking down the contents of a proxy log.
- Building Linux command-line skills to parse log entries manually.
- Analysing a proxy log based on typical use cases.
Overview
Some malware (amusingly named CrypTOYminer) has been installed on workstations and servers. We are going to analyse the proxy logs to understand the suspicious network traffic.
These are my key takeaways from today:
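- Use `cut -d ' ' -f1,3` to extract the first and third fields of each line.
- `uniq -c` prefixes each unique line with the number of times it occurs (it only merges adjacent duplicates, so sort first).
- `sort` sorts the result and `sort -r` reverses the order. Combine numeric sorting (`-n`) with reverse order using `sort -nr`.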
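The takeaways above can be tried out on a tiny sample file. This is only a sketch: the log lines and domain names below are made up for illustration and are not from the room.

```shell
# Hypothetical sample data: three space-separated fields per line.
log=$(mktemp)
cat > "$log" <<'EOF'
10:00 GET alpha.example
10:01 GET beta.example
10:02 POST alpha.example
10:03 GET alpha.example
EOF

# Keep only the first and third fields (time and domain).
pairs=$(cut -d ' ' -f1,3 "$log")
echo "$pairs"

# Count requests per domain, most frequent first.
# uniq -c only merges adjacent duplicates, so the input is sorted first.
top=$(cut -d ' ' -f3 "$log" | sort | uniq -c | sort -nr)
echo "$top"

rm -f "$log"
```

The `-n` flag matters on larger files: without it, `sort` compares the counts as text, so a count of 9 would sort above a count of 10.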
Log Primer
This is an example from an Apache web server log.
158.32.51.188 - - [25/Oct/2023:09:11:14 +0000] "GET /robots.txt HTTP/1.1" 200 11173 "-" "curl/7.68.0"
Field | Value | Description |
---|---|---|
Source IP Address | 158.32.51.188 | The source that initiated the HTTP request. |
Timestamp | [25/Oct/2023:09:11:14 +0000] | The date and time when the event occurred. |
HTTP Request | GET /robots.txt HTTP/1.1 | The actual HTTP request made, including the request method, URI path and HTTP version. |
Status Code | 200 | The HTTP status code returned by the web application. |
User-Agent | curl/7.68.0 | The user-agent used by the source of the request. It’s typically tied to the application used to invoke the HTTP request. |
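As a rough illustration of the table above, the same fields can be pulled out of the example entry with `cut` and `awk`. The variable names are my own, and the approach assumes the quoted parts of the line contain no embedded double quotes.

```shell
line='158.32.51.188 - - [25/Oct/2023:09:11:14 +0000] "GET /robots.txt HTTP/1.1" 200 11173 "-" "curl/7.68.0"'

# Splitting on spaces: field 1 is the source IP address.
src_ip=$(printf '%s\n' "$line" | cut -d ' ' -f1)

# Splitting on double quotes: field 2 is the request, field 3 holds
# the status code and response size, field 6 is the user-agent.
request=$(printf '%s\n' "$line" | cut -d '"' -f2)
status=$(printf '%s\n' "$line" | cut -d '"' -f3 | awk '{print $1}')
user_agent=$(printf '%s\n' "$line" | cut -d '"' -f6)

echo "ip=$src_ip request=$request status=$status agent=$user_agent"
```

Splitting on the double quote is a convenient trick here because the request and user-agent contain spaces, which would break a plain space-delimited `cut`.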
Proxy Servers
A proxy server sits between you and the internet. When you request information or access a web page, your device connects to the proxy server instead of connecting directly to the target server. The proxy server then forwards your request to the internet, receives the response and sends it back to you.
A proxy server offers enhanced visibility into network traffic and user activities, since it logs all web requests and responses. This allows administrators and security analysts to monitor which websites users access, when, and how much bandwidth is used.
Common examples of malicious activities:
Attack Technique | Potential Indicator |
---|---|
Download attempt of a malicious binary | Connection to a known malicious URL binary (e.g. www[.]evil[.]com/malicious[.]exe) |
Data exfiltration | High outbound bandwidth due to file uploads (e.g. outbound connections to OneDrive) |
Continuous C2 connection | High count of outbound connections to a single domain in regular intervals (e.g. connections every 5 minutes to a single domain) |
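The "regular intervals" indicator in the last row can be checked roughly from the command line. The sketch below is not from the room: it assumes a simplified, made-up log with an epoch timestamp in field 1 and a domain in field 2, and it measures the gap between consecutive requests to each domain.

```shell
# Hypothetical sample: epoch seconds, then the requested domain.
log=$(mktemp)
cat > "$log" <<'EOF'
1000 c2.example
1120 news.example
1300 c2.example
1450 news.example
1600 c2.example
1900 c2.example
1910 news.example
EOF

# For each domain, print the gap in seconds since its previous request.
gaps=$(awk '{
    if ($2 in prev) print $2, $1 - prev[$2]
    prev[$2] = $1
}' "$log")

# Count how often each (domain, gap) pair occurs. One gap repeating
# many times for a single domain hints at beaconing on a fixed timer.
beacons=$(printf '%s\n' "$gaps" | sort | uniq -c | sort -nr)
echo "$beacons"

rm -f "$log"
```

Real beacons often add jitter, so in practice the gaps would need to be bucketed (for example rounded to the nearest minute) before counting.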
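Questions
The logs are from a Squid proxy configured to use the following log format (the dashes only separate the field names here; the commands below treat the fields of each entry as space-separated):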
timestamp - source_ip - domain:port - http_method - http_uri - status_code - response_size - user_agent
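To get a feel for the field positions, here is a small sketch that sums `response_size` per domain with `awk`. The sample entries are invented, and it assumes the fields are space-separated in the order listed, so that `domain:port` is field 3 and `response_size` is field 7 (the user-agent may span several fields after that).

```shell
# Hypothetical entries following the field order above.
log=$(mktemp)
cat > "$log" <<'EOF'
1698225074.123 10.0.0.5 files.example:443 POST /upload 200 52000 curl/7.68.0
1698225080.456 10.0.0.7 news.example:80 GET /index.html 200 1200 Mozilla/5.0 (X11; Linux x86_64)
1698225090.789 10.0.0.5 files.example:443 POST /upload 200 48000 curl/7.68.0
EOF

# Sum the response sizes per domain, largest total first.
totals=$(awk '{
    split($3, hostport, ":")     # hostport[1] is the domain without the port
    bytes[hostport[1]] += $7     # field 7 is response_size
} END {
    for (d in bytes) print bytes[d], d
}' "$log" | sort -nr)
echo "$totals"

rm -f "$log"
```

The same positional idea is what the `cut -d ' ' -fN` commands in the answers rely on; `awk` just makes it easier to aggregate.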
- How many unique IP addresses are connected to the proxy server?
$ cut -d ' ' -f2 access.log | sort | uniq | wc -l
- How many unique domains were accessed by all workstations?
$ cut -d ' ' -f3 access.log | sort | uniq | wc -l
- What status code is generated by the HTTP requests to the least accessed domain?
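List the domains by request count (the previous pipeline with `uniq -c | sort -nr` in place of `uniq | wc -l`). The least accessed domain is at the bottom of the output. Then filter on that domain: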
$ cut -d ' ' -f3 access.log | sort | uniq -c | sort -nr
$ # Take the least visited site from the code above and add it here:
$ cut -d ' ' -f3,6 access.log | grep [redacted] | cut -d ' ' -f2 | sort | uniq
- Based on the high count of connection attempts, what is the name of the suspicious domain?
$ cut -d ' ' -f3 access.log | cut -d ':' -f1 | sort | uniq -c | sort -n | tail -n10
- What is the source IP of the workstation that accessed the malicious domain?
$ grep [redacted] access.log | cut -d ' ' -f2 | sort | uniq
- How many requests were made on the malicious domain in total?
$ grep [redacted] access.log | cut -d ' ' -f2 | sort | uniq -c
- Having retrieved the exfiltrated data, what is the hidden flag?
$ grep [redacted] access.log | cut -d ' ' -f5 | cut -d '=' -f2 | base64 -d | grep -i 'THM{' | cut -d ',' -f3