The data provider module of msticpy provides functions for defining data sources, connectors to them, and queries for them, as well as for returning query results from the defined data sources.
For more information on Data Providers, check the documentation.
In this notebook we demonstrate the Splunk data connector feature of msticpy. This feature is built on top of the [Splunk Enterprise SDK for Python](https://dev.splunk.com/enterprise/docs/devtools/python/sdk-python/) with some customizations and enhancements.
# Only run first time to install/upgrade msticpy to latest version
#!pip install --upgrade msticpy[splunk]
Authentication for the Splunk data provider is handled by specifying credentials directly in the connect call or in the msticpy configuration file.
For more information on how to create a new user with the appropriate roles and permissions, follow the Splunk Docs pages "Add and edit users" and "About users and roles". The user needs permission to run at least its own searches, or more depending on the actions to be performed.
Once you have created a user account with the appropriate roles, you will need the following details when connecting: host, port, username, and password.
Once you have these details, you can specify them in msticpyconfig.yaml as shown in the example below.
SplunkApp:
  Args:
    host: "{Splunk server FQDN or localhost}"
    port: "{default 8089}"
    username: "{username with search permissions to connect}"
    password: "{password of the user specified}"
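If you would rather not keep a password in a configuration file, one option is to read the connection details from environment variables and pass them to the connect call. The following is a minimal sketch under that assumption. The variable names (`SPLUNK_HOST`, `SPLUNK_PORT`, `SPLUNK_USERNAME`, `SPLUNK_PASSWORD`) and the helper `splunk_connect_args` are hypothetical and not part of msticpy.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical environment variable names; pick whatever fits your setup.
_ENV_TO_ARG = {
    "SPLUNK_HOST": "host",
    "SPLUNK_PORT": "port",
    "SPLUNK_USERNAME": "username",
    "SPLUNK_PASSWORD": "password",
}


def splunk_connect_args(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect connection arguments from environment variables.

    Only variables that are set and non-empty are used. The port falls
    back to 8089, the default shown in the config example.
    """
    env = os.environ if env is None else env
    args = {arg: env[name] for name, arg in _ENV_TO_ARG.items() if env.get(name)}
    args.setdefault("port", "8089")
    missing = [a for a in ("host", "username", "password") if a not in args]
    if missing:
        raise ValueError("Missing Splunk settings: " + ", ".join(missing))
    return args


# Example with an explicit mapping instead of the real environment.
example = splunk_connect_args(
    {"SPLUNK_HOST": "localhost", "SPLUNK_USERNAME": "analyst", "SPLUNK_PASSWORD": "secret"}
)
print(sorted(example))
```

Assuming the connect call accepts the same keys as the Args section of the config file, the resulting dictionary could be unpacked into it, for example `splunk_prov.connect(**splunk_connect_args())`.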
# Check we are running Python 3.6 or later
import sys
MIN_REQ_PYTHON = (3, 6)
if sys.version_info < MIN_REQ_PYTHON:
    print('Check the Kernel->Change Kernel menu and ensure that Python 3.6')
    print('or later is selected as the active kernel.')
    sys.exit("Python %s.%s or later is required.\n" % MIN_REQ_PYTHON)
#imports
import pandas as pd
import msticpy.nbtools as nbtools
#data library imports
from msticpy.data.data_providers import QueryProvider
print('Imports Complete')
Imports Complete
You can instantiate a data provider for Splunk by specifying the credentials in the connect call or in the msticpy configuration file.
If the details are correct and authentication succeeds, the output shows "connected".
splunk_prov = QueryProvider('Splunk')
splunk_prov.connect(host="<hostname>", username="<username>", password="<password>")
connected
After connecting to the Splunk environment, we can see which queries are available by running QUERY_PROVIDER.list_queries().
For more information, refer to the documentation: Listing available queries.
This displays all the saved searches from the connected Splunk instance, along with pre-built custom queries for common operations such as listing datatypes, saved searches, alerts, and audit trail information.
splunk_prov.list_queries()
['Alerts.list_all_alerts', 'SavedSearches.Errors_in_the_last_24_hours', 'SavedSearches.Errors_in_the_last_hour', 'SavedSearches.License_Usage_Data_Cube', 'SavedSearches.Load_sample_User_Agreements', 'SavedSearches.Messages_by_minute_last_3_hours', 'SavedSearches.Orphaned_scheduled_searches', 'SavedSearches.Score-Base', 'SavedSearches.Splunk_errors_last_24_hours', 'SavedSearches.Website_Performance_Problem', 'SavedSearches.inoperable_sites_rangemap', 'SavedSearches.slow_sites_avg_rangemap', 'SavedSearches.slow_sites_rangemap', 'SavedSearches.web_ping_inputs_lookup_gen', 'SavedSearches.website_availability_overview', 'SavedSearches.website_performance_problems', 'SplunkGeneral.get_events_parameterized', 'SplunkGeneral.list_all_datatypes', 'SplunkGeneral.list_all_savedsearches', 'audittrail.list_all_audittrail']
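Because each entry in this list has the form `Container.query_name`, the names can be grouped by container to make a long list easier to scan. The sketch below assumes only that `list_queries()` returns a list of such dotted strings, as in the output above; `group_queries` is a hypothetical helper, not a msticpy function.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_queries(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group 'Container.query_name' strings by their container prefix."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        # Split on the first dot only; the rest is the query name.
        container, _, query = name.partition(".")
        groups[container].append(query)
    return dict(groups)


# A few names copied from the output above; with a live connection you
# would pass splunk_prov.list_queries() instead.
sample = [
    "Alerts.list_all_alerts",
    "SavedSearches.Errors_in_the_last_hour",
    "SavedSearches.Score-Base",
    "SplunkGeneral.get_events_parameterized",
    "SplunkGeneral.list_all_datatypes",
    "audittrail.list_all_audittrail",
]
for container, queries in group_queries(sample).items():
    print(f"{container}: {queries}")
```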
To get help for a specific query, execute QUERY_PROVIDER.<QueryName>('?').
For more information, refer to the documentation: Getting Help for a query.
splunk_prov.SplunkGeneral.get_events_parameterized('?')
Query: get_events_parameterized
Data source: Splunk
Generic parameterized query from index/source

Parameters
----------
add_query_items: str (optional)
    Additional query clauses
    (default value is: | head 100)
end: datetime (optional)
    Query end time
    (default value is: 08/26/2017:00:00:00)
index: str (optional)
    Splunk index name
    (default value is: *)
project_fields: str (optional)
    Project Field names
    (default value is: | table TimeCreated, host, EventID, EventDescripti...)
source: str (optional)
    Splunk source type
    (default value is: *)
start: datetime (optional)
    Query start time
    (default value is: 08/25/2017:00:00:00)
timeformat: str (optional)
    Datetime format to use in Splunk query
    (default value is: "%Y-%m-%d %H:%M:%S.%6N")
Query:
 search index={index} source={source} timeformat={timeformat} earliest={start} latest={end} {project_fields} {add_query_items}
If you want to print the query before executing it, pass 'print' as an argument.
splunk_prov.SplunkGeneral.get_events_parameterized('print')
' search index=* source=* timeformat="%Y-%m-%d %H:%M:%S.%6N" earliest="2020-08-15 19:15:47.466710" latest="2020-08-15 19:15:47.466938" | table TimeCreated, host, EventID, EventDescription, User, process, cmdline, Image, parent_process, ParentCommandLine, dest, Hashes | head 100'
To validate the query after setting arguments, pass 'print' together with those arguments, as in the example below.
splunk_prov.SplunkGeneral.get_events_parameterized(
    'print',
    index="botsv2",
    source="WinEventLog:Microsoft-Windows-Sysmon/Operational",
    timeformat="%Y-%m-%d %H:%M:%S",
    start="2017-08-25 00:00:00",
    end="2017-08-25 10:00:00"
)
' search index=botsv2 source=WinEventLog:Microsoft-Windows-Sysmon/Operational timeformat=%Y-%m-%d %H:%M:%S earliest="2017-08-25 00:00:00" latest="2017-08-25 10:00:00" | table TimeCreated, host, EventID, EventDescription, User, process, cmdline, Image, parent_process, ParentCommandLine, dest, Hashes | head 100'
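To see how the parameters map onto the final search string, the substitution can be imitated with plain Python string formatting. This sketch only illustrates the idea using the template shown in the help output; it is not msticpy's implementation, and `render_query` is a hypothetical helper. The time values are quoted by hand here to match the printed query, and the default field list is deliberately shortened.

```python
# Template copied from the "Query:" line of the help output.
TEMPLATE = (
    "search index={index} source={source} timeformat={timeformat} "
    "earliest={start} latest={end} {project_fields} {add_query_items}"
)

# Defaults taken from the help output; project_fields is shortened here
# for readability rather than reproducing the full default field list.
DEFAULTS = {
    "index": "*",
    "source": "*",
    "timeformat": '"%Y-%m-%d %H:%M:%S.%6N"',
    "project_fields": "| table TimeCreated, host, EventID",
    "add_query_items": "| head 100",
}


def render_query(start: str, end: str, **overrides: str) -> str:
    """Fill the template with defaults, overrides, and quoted time bounds."""
    params = {**DEFAULTS, **overrides}
    params["start"] = f'"{start}"'
    params["end"] = f'"{end}"'
    # strip() drops the trailing space left when add_query_items is empty.
    return TEMPLATE.format(**params).strip()


query = render_query(
    "2017-08-25 00:00:00",
    "2017-08-25 10:00:00",
    index="botsv2",
    timeformat="%Y-%m-%d %H:%M:%S",
)
print(query)
```

Any parameter that is not overridden keeps its default, which is why the printed query still ends with `| head 100` unless `add_query_items` is set explicitly.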
To run a pre-defined query, call it by name, either setting values for its arguments or relying on the defaults.
For more information, refer to the documentation: Running a pre-defined query.
splunk_prov.SplunkGeneral.get_events_parameterized(
    index="botsv2",
    source="WinEventLog:Microsoft-Windows-Sysmon/Operational",
    start="2017-08-25 00:00:00.000000",
    end="2017-08-25 10:00:00.000000"
)
| | TimeCreated | host | EventID | EventDescription | User | process | Image | dest | cmdline | parent_process | ParentCommandLine | Hashes |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2017-08-25T04:57:45.512440700Z | venus | 3 | Network Connect | NT AUTHORITY\SYSTEM | powershell.exe | C:\Windows\System32\WindowsPowerShell\v1.0\pow... | 45.77.65.211.vultr.com | NaN | NaN | NaN | NaN |
1 | 2017-08-25T04:57:45.213738500Z | wrk-aturing | 5 | Process Terminate | NaN | conhost.exe | C:\Windows\System32\conhost.exe | NaN | NaN | NaN | NaN | NaN |
2 | 2017-08-25T04:57:45.213738500Z | wrk-aturing | 5 | Process Terminate | NaN | cscript.exe | C:\Windows\System32\cscript.exe | NaN | NaN | NaN | NaN | NaN |
3 | 2017-08-25T04:57:45.088941700Z | wrk-aturing | 1 | Process Create | NT AUTHORITY\SYSTEM | conhost.exe | C:\Windows\System32\conhost.exe | wrk-aturing.frothly.local | \??\C:\Windows\system32\conhost.exe | C:\Windows\System32\csrss.exe | %SystemRoot%\system32\csrss.exe ObjectDirector... | SHA1=680DEC0F8907F4B8911FBE2AA5F2FD25425BE0B0 |
4 | 2017-08-25T04:57:45.088941700Z | wrk-aturing | 1 | Process Create | NT AUTHORITY\SYSTEM | cscript.exe | C:\Windows\System32\cscript.exe | wrk-aturing.frothly.local | C:\Windows\system32\cscript.exe //Job:AgentHI... | C:\Program Files (x86)\Symantec\Symantec Endpo... | "C:\Program Files (x86)\Symantec\Symantec Endp... | SHA1=70096A77E202CF9F30C064956F36D14BCBD8F7BB |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
95 | 2017-08-25T04:57:02.003800000Z | wrk-ghoppy | 1 | Process Create | NT AUTHORITY\SYSTEM | splunk-powershell.exe | C:\Program Files\SplunkUniversalForwarder\bin\... | wrk-ghoppy.frothly.local | "C:\Program Files\SplunkUniversalForwarder\bin... | C:\Program Files\SplunkUniversalForwarder\bin\... | "C:\Program Files\SplunkUniversalForwarder\bin... | SHA1=50A428905F5BA8808464F8A8183DD3662D8157F6 |
96 | 2017-08-25T04:57:01.170335100Z | venus | 3 | Network Connect | NT AUTHORITY\SYSTEM | powershell.exe | C:\Windows\System32\WindowsPowerShell\v1.0\pow... | 45.77.65.211.vultr.com | NaN | NaN | NaN | NaN |
97 | 2017-08-25T04:57:01.941402000Z | wrk-ghoppy | 5 | Process Terminate | NaN | splunk-winprintmon.exe | C:\Program Files\SplunkUniversalForwarder\bin\... | NaN | NaN | NaN | NaN | NaN |
98 | 2017-08-25T04:57:01.863404500Z | wrk-ghoppy | 1 | Process Create | NT AUTHORITY\SYSTEM | splunk-netmon.exe | C:\Program Files\SplunkUniversalForwarder\bin\... | wrk-ghoppy.frothly.local | "C:\Program Files\SplunkUniversalForwarder\bin... | C:\Program Files\SplunkUniversalForwarder\bin\... | "C:\Program Files\SplunkUniversalForwarder\bin... | SHA1=0644F98A9874414C738A0B8841BB997FB9BFC274 |
99 | 2017-08-25T04:57:01.754208000Z | wrk-ghoppy | 5 | Process Terminate | NaN | splunk-powershell.exe | C:\Program Files\SplunkUniversalForwarder\bin\... | NaN | NaN | NaN | NaN | NaN |
100 rows × 12 columns
By default, Splunk query results are limited to 100. You can specify the count=0 argument to return all the results. The default value of the add_query_items argument is | head 100, which you can reset, as shown in the example below, when retrieving all results.
splunk_prov.SplunkGeneral.get_events_parameterized(
    index="botsv2",
    source="WinEventLog:Microsoft-Windows-Sysmon/Operational",
    start="2017-08-25 00:00:00.000000",
    end="2017-08-25 10:00:00.000000",
    add_query_items='',
    count=0
)
| | TimeCreated | host | EventID | EventDescription | User | process | Image | dest | cmdline | parent_process | ParentCommandLine | Hashes |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2017-08-25T04:57:45.512440700Z | venus | 3 | Network Connect | NT AUTHORITY\SYSTEM | powershell.exe | C:\Windows\System32\WindowsPowerShell\v1.0\pow... | 45.77.65.211.vultr.com | NaN | NaN | NaN | NaN |
1 | 2017-08-25T04:57:45.213738500Z | wrk-aturing | 5 | Process Terminate | NaN | conhost.exe | C:\Windows\System32\conhost.exe | NaN | NaN | NaN | NaN | NaN |
2 | 2017-08-25T04:57:45.213738500Z | wrk-aturing | 5 | Process Terminate | NaN | cscript.exe | C:\Windows\System32\cscript.exe | NaN | NaN | NaN | NaN | NaN |
3 | 2017-08-25T04:57:45.088941700Z | wrk-aturing | 1 | Process Create | NT AUTHORITY\SYSTEM | conhost.exe | C:\Windows\System32\conhost.exe | wrk-aturing.frothly.local | \??\C:\Windows\system32\conhost.exe | C:\Windows\System32\csrss.exe | %SystemRoot%\system32\csrss.exe ObjectDirector... | SHA1=680DEC0F8907F4B8911FBE2AA5F2FD25425BE0B0 |
4 | 2017-08-25T04:57:45.088941700Z | wrk-aturing | 1 | Process Create | NT AUTHORITY\SYSTEM | cscript.exe | C:\Windows\System32\cscript.exe | wrk-aturing.frothly.local | C:\Windows\system32\cscript.exe //Job:AgentHI... | C:\Program Files (x86)\Symantec\Symantec Endpo... | "C:\Program Files (x86)\Symantec\Symantec Endp... | SHA1=70096A77E202CF9F30C064956F36D14BCBD8F7BB |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
7923 | 2017-08-25T04:57:46.758125600Z | wrk-klagerf | 1 | Process Create | NT AUTHORITY\SYSTEM | splunk-admon.exe | C:\Program Files\SplunkUniversalForwarder\bin\... | wrk-klagerf.frothly.local | "C:\Program Files\SplunkUniversalForwarder\bin... | C:\Program Files\SplunkUniversalForwarder\bin\... | "C:\Program Files\SplunkUniversalForwarder\bin... | SHA1=1C0C7368C8B7B688CCF77D1062708E60D581B0AF |
7924 | 2017-08-25T04:57:46.695728800Z | wrk-klagerf | 5 | Process Terminate | NaN | splunk-MonitorNoHandle.exe | C:\Program Files\SplunkUniversalForwarder\bin\... | NaN | NaN | NaN | NaN | NaN |
7925 | 2017-08-25T04:57:46.570935200Z | wrk-klagerf | 1 | Process Create | NT AUTHORITY\SYSTEM | splunk-MonitorNoHandle.exe | C:\Program Files\SplunkUniversalForwarder\bin\... | wrk-klagerf.frothly.local | "C:\Program Files\SplunkUniversalForwarder\bin... | C:\Program Files\SplunkUniversalForwarder\bin\... | "C:\Program Files\SplunkUniversalForwarder\bin... | SHA1=F48EDD0FE4D013D690196572EA96A4FA6EB04E77 |
7926 | 2017-08-25T04:57:46.539736800Z | wrk-klagerf | 5 | Process Terminate | NaN | splunk-powershell.exe | C:\Program Files\SplunkUniversalForwarder\bin\... | NaN | NaN | NaN | NaN | NaN |
7927 | 2017-08-25T04:57:46.430542400Z | wrk-klagerf | 1 | Process Create | NT AUTHORITY\SYSTEM | splunk-powershell.exe | C:\Program Files\SplunkUniversalForwarder\bin\... | wrk-klagerf.frothly.local | "C:\Program Files\SplunkUniversalForwarder\bin... | C:\Program Files\SplunkUniversalForwarder\bin\... | "C:\Program Files\SplunkUniversalForwarder\bin... | SHA1=50A428905F5BA8808464F8A8183DD3662D8157F6 |
7928 rows × 12 columns
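The `start` and `end` values in the query calls above are strings laid out like the default `timeformat` (`%Y-%m-%d %H:%M:%S.%6N`). If your time range starts out as Python `datetime` objects, a small helper can produce matching strings. This is a sketch: `to_splunk_time` is a hypothetical helper, and it relies on Python's `%f` (six-digit microseconds) lining up with Splunk's `%6N`. The help output lists these parameters as `datetime`, so passing `datetime` objects directly may also work.

```python
from datetime import datetime, timedelta


def to_splunk_time(value: datetime) -> str:
    """Format a datetime like '2017-08-25 00:00:00.000000'."""
    return value.strftime("%Y-%m-%d %H:%M:%S.%f")


# Reproduce the ten-hour window used in the examples above.
start = datetime(2017, 8, 25)
end = start + timedelta(hours=10)
print(to_splunk_time(start))
print(to_splunk_time(end))
```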
You can also define your own Splunk query and run it through the Splunk provider with QUERY_PROVIDER.exec_query(<queryname>).
For more information, check the documentation: Running an Ad-hoc Query.
splunk_query = '''
search index="blackhat" sourcetype="network" earliest=0
| table TimeGenerated, TotalBytesSent
'''
df = splunk_prov.exec_query(splunk_query)
df.head()
| | TimeGenerated | TotalBytesSent |
---|---|---|
0 | 2020-07-02T10:00:00Z | 27055 |
1 | 2020-07-02T09:00:00Z | 33777 |
2 | 2020-07-02T08:00:00Z | 27355 |
3 | 2020-07-02T07:00:00Z | 25544 |
4 | 2020-07-02T06:00:00Z | 11771 |
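When you run several similar ad-hoc searches, it can help to assemble the query text from parameters. A minimal sketch follows. `build_search` is a hypothetical helper rather than part of msticpy, and it does no escaping beyond quoting the index and sourcetype values, so only pass trusted values.

```python
from typing import Sequence


def build_search(
    index: str, sourcetype: str, fields: Sequence[str], earliest: str = "0"
) -> str:
    """Build a Splunk search string that tabulates the given fields."""
    if not fields:
        raise ValueError("At least one field is required")
    return (
        f'search index="{index}" sourcetype="{sourcetype}" earliest={earliest}\n'
        f'| table {", ".join(fields)}'
    )


splunk_query = build_search("blackhat", "network", ["TimeGenerated", "TotalBytesSent"])
print(splunk_query)
# With a connected provider: df = splunk_prov.exec_query(splunk_query)
```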