Yes, this is provincial, but going from a Zeek log to a data plot in a few lines of code can be really handy. So, without further ado, here's a very small bit of code :)
from zat.log_to_dataframe import LogToDataFrame
from zat.utils import plot_utils
# Just some plotting defaults
%matplotlib inline
import matplotlib.pyplot as plt
plot_utils.plot_defaults()
# Convert it to a Pandas DataFrame
log_to_df = LogToDataFrame()
http_df = log_to_df.create_dataframe('../data/http.log')
http_df.head()
Successfully monitoring ../data/http.log...
ts | filename | host | id.orig_h | id.orig_p | id.resp_h | id.resp_p | info_code | info_msg | method | orig_fuids | ... | resp_mime_types | response_body_len | status_code | status_msg | tags | trans_depth | uid | uri | user_agent | username |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2013-09-15 17:44:27.668082 | - | guyspy.com | 192.168.33.10 | 1031 | 54.245.228.191 | 80 | - | - | GET | - | ... | text/html | 184 | 301 | Moved Permanently | (empty) | 1 | CyIaMO7IheOh38Zsi | / | Mozilla/4.0 (compatible; MSIE 8.0; Windows NT ... | - |
2013-09-15 17:44:27.731702 | - | www.guyspy.com | 192.168.33.10 | 1032 | 54.245.228.191 | 80 | - | - | GET | - | ... | text/html | 100631 | 200 | OK | (empty) | 1 | CoyZrY2g74UvMMgp4a | / | Mozilla/4.0 (compatible; MSIE 8.0; Windows NT ... | - |
2013-09-15 17:44:28.092922 | - | www.guyspy.com | 192.168.33.10 | 1032 | 54.245.228.191 | 80 | - | - | GET | - | ... | text/html | 55817 | 404 | Not Found | (empty) | 2 | CoyZrY2g74UvMMgp4a | /wp-content/plugins/slider-pro/css/advanced-sl... | Mozilla/4.0 (compatible; MSIE 8.0; Windows NT ... | - |
2013-09-15 17:44:28.150301 | - | www.guyspy.com | 192.168.33.10 | 1040 | 54.245.228.191 | 80 | - | - | GET | - | ... | text/plain | 887 | 200 | OK | (empty) | 1 | CiCKTz4e0fkYYazBS3 | /wp-content/plugins/contact-form-7/includes/cs... | Mozilla/4.0 (compatible; MSIE 8.0; Windows NT ... | - |
2013-09-15 17:44:28.150602 | - | www.guyspy.com | 192.168.33.10 | 1041 | 54.245.228.191 | 80 | - | - | GET | - | ... | text/plain | 10068 | 200 | OK | (empty) | 1 | C1YBkC1uuO9bzndRvh | /wp-content/plugins/slider-pro/css/slider/adva... | Mozilla/4.0 (compatible; MSIE 8.0; Windows NT ... | - |
5 rows × 26 columns
Above we used a ZAT utility method to set up nice plotting defaults and here we simply use the plotting provided by Pandas.
http_df[['request_body_len','response_body_len']].hist()
array([[<matplotlib.axes._subplots.AxesSubplot object at 0x114e2cfd0>, <matplotlib.axes._subplots.AxesSubplot object at 0x1150aed30>]], dtype=object)
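To see the numbers behind a histogram like this without drawing anything, here is a minimal sketch on a tiny stand-in DataFrame. The `response_body_len` values echo the five rows displayed above; the `request_body_len` values are made up for illustration, since that column is elided in the output. It uses `numpy.histogram` with 10 equal-width bins, which is also the default bin count for `DataFrame.hist()`.

```python
import numpy as np
import pandas as pd

# Stand-in data: response lengths echo the five rows shown above,
# request lengths are invented for illustration only
demo_df = pd.DataFrame({
    'request_body_len': [0, 0, 0, 0, 512],
    'response_body_len': [184, 100631, 55817, 887, 10068],
})

# 10 equal-width bins per column, matching the default of DataFrame.hist()
for column in demo_df.columns:
    counts, edges = np.histogram(demo_df[column], bins=10)
    print(column, counts.tolist())
```

Each list holds the number of rows falling into each bin, which is what the bar heights in the plot represent.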
Since ZAT automatically makes the timestamp the index, plotting request volume over time is straightforward.
http_df['uid'].resample('1S').count().plot()
plt.ylabel('HTTP Requests per Second')
<matplotlib.text.Text at 0x1151f2278>
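To make the resampling step concrete, here is a small sketch on a hand-built DataFrame with a timestamp index. The timestamps and `uid` placeholders are made up for illustration and are not read from http.log. It uses the lowercase `'1s'` alias, which recent pandas releases prefer over `'1S'`.

```python
import pandas as pd

# Made-up request times; the uid values are placeholders
times = pd.to_datetime([
    '2013-09-15 17:44:27.668082',
    '2013-09-15 17:44:27.731702',
    '2013-09-15 17:44:28.092922',
    '2013-09-15 17:44:30.150301',
])
demo_df = pd.DataFrame({'uid': ['a', 'b', 'c', 'd']}, index=times)

# Group rows into one-second windows and count the uids in each window
per_second = demo_df['uid'].resample('1s').count()
print(per_second)
```

Note that the second with no requests (17:44:29) still appears in the result with a count of 0, so the plotted line drops to zero there instead of skipping the gap.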