GRR Colab¶

In [0]:

%load_ext grr_colab.ipython_extension

In [0]:

import grr_colab

Specifying GRR Colab flags:

In [0]:

grr_colab.flags.FLAGS.set_default('grr_http_api_endpoint', 'http://localhost:8000/')
grr_colab.flags.FLAGS.set_default('grr_admin_ui_url', 'http://localhost:8000/')
grr_colab.flags.FLAGS.set_default('grr_auth_api_user', 'admin')
grr_colab.flags.FLAGS.set_default('grr_auth_password', 'admin')

Magics API¶

GRR magics let you search for clients and then choose a single client to work with. Magics return their results as pandas dataframes unless the results are primitive values.

Searching clients¶

You can search for clients by specifying username, hostname, client labels etc. The results are sorted by the last seen column.

In [0]:

df = %grr_search_clients -u admin
df[['online', 'online.pretty', 'client_id', 'last_seen_ago', 'last_seen_at.pretty']]

Out[0]:

	online	online.pretty	client_id	last_seen_ago	last_seen_at.pretty
0	online	🌕	C.dc3782aeab2c5b4c	0 seconds ago	2019-08-30 09:53:28.039821

There is a shortcut for searching for online clients only, so you don't need to filter the dataframe yourself.

In [0]:

df = %grr_search_online_clients -u admin
df[['online', 'online.pretty', 'client_id', 'last_seen_ago', 'last_seen_at.pretty']]

Out[0]:

	online	online.pretty	client_id	last_seen_ago	last_seen_at.pretty
0	online	🌕	C.dc3782aeab2c5b4c	0 seconds ago	2019-08-30 09:53:38.331647

Every datetime field has two representations: the original one, in microseconds since the Unix epoch, and the pretty one, a pandas timestamp.

In [0]:

df[['last_seen_at', 'last_seen_at.pretty']]

Out[0]:

	last_seen_at	last_seen_at.pretty
0	1567158818331647	2019-08-30 09:53:38.331647
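The relationship between the two columns can be checked with the standard library alone. The sketch below assumes the raw value counts microseconds since the Unix epoch in UTC, which is consistent with the output above. micros_to_datetime is a hypothetical helper written for this illustration, not part of grr_colab.

```python
import datetime


def micros_to_datetime(micros: int) -> datetime.datetime:
    """Converts microseconds since the Unix epoch to an aware UTC datetime."""
    epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
    # Integer timedelta arithmetic avoids float rounding of the microseconds.
    return epoch + datetime.timedelta(microseconds=micros)


# Raw value taken from the last_seen_at column above.
last_seen = micros_to_datetime(1567158818331647)
print(last_seen.isoformat())  # 2019-08-30T09:53:38.331647+00:00
```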

Setting the current client¶

To work with a client, you first need to select it. This means that magic commands can work with only one client at a time (the Python API has no such restriction). To set a client you need either a hostname (this works only if exactly one client is set up for that hostname) or a client ID, which you can get from the client search dataframe.

In [0]:

client_id = df['client_id'][0]
%grr_set_client -c {client_id}

%grr_id

Out[0]:

'C.dc3782aeab2c5b4c'

An attempt to set a client with a hostname that has multiple clients will lead to an exception.

Requesting approvals¶

If you don't have valid approvals for the selected client, you will get an error while attempting to run a flow on it. You can request an approval with magic commands specifying the reason and list of approvers.

In [0]:

%grr_request_approval -r "For testing" -a admin

This function will not wait until the approval is granted. If you need your code to wait until it's granted, use grr_request_approval_and_wait instead.

Exploring filesystem¶

In addition to the selected client, the working directory is also saved, so you can use relative paths instead of absolute ones. Note that the existence of directories is not checked: you will not get an error if you cd into a directory that does not exist.

Initially you are in the root directory.

In [0]:

%grr_pwd

Out[0]:

'/'

In [0]:

%grr_cd tmp/foo/bar
%grr_pwd

Out[0]:

'/tmp/foo/bar'

In [0]:

%grr_cd ../baz
%grr_pwd

Out[0]:

'/tmp/foo/baz'
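The behaviour shown above matches purely lexical path resolution, where no filesystem is consulted. The following is a minimal sketch of that idea using the standard posixpath module. The resolve helper is hypothetical and is not GRR's actual implementation.

```python
import posixpath


def resolve(cwd: str, path: str) -> str:
    """Resolves path against cwd lexically, without touching any filesystem."""
    # posixpath.join discards cwd when path is absolute.
    return posixpath.normpath(posixpath.join(cwd, path))


cwd = "/"
cwd = resolve(cwd, "tmp/foo/bar")
print(cwd)  # /tmp/foo/bar
cwd = resolve(cwd, "../baz")
print(cwd)  # /tmp/foo/baz
```

Because nothing is checked against a real directory tree, resolving a path that does not exist succeeds just like any other.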

You can ls the current directory and any other directories specified by relative and absolute paths.

Note. Most file-related magics start flows and fetch live data from the client, so the client has to be online for them to work.

In [0]:

df = %grr_ls
df

Out[0]:

	st_mode	st_mode.pretty	st_ino	st_dev	st_nlink	st_uid	st_gid	st_size	st_atime	st_mtime	st_ctime	st_blocks	st_blksize	pathspec.pathtype	pathspec.path	pathspec.path_options	st_flags_linux
0	16877	drwxr-xr-x	17696532	65025	2	585945	89939	4096	1567157599	1567157599	1567157599	8	4096	OS	/tmp/foo/baz/dir1	CASE_LITERAL	524288
1	16877	drwxr-xr-x	17832583	65025	3	585945	89939	4096	1567157734	1567157599	1567157599	8	4096	OS	/tmp/foo/baz/dir2	CASE_LITERAL	524288
2	33188	-rw-r--r--	17696534	65025	1	585945	89939	70	1567158029	1567157649	1567157649	8	4096	OS	/tmp/foo/baz/file1	CASE_LITERAL	524288
3	33188	-rw-r--r--	17696533	65025	1	585945	89939	23	1567158209	1567157627	1567157627	8	4096	OS	/tmp/foo/baz/file2	CASE_LITERAL	524288

Stat mode has two representations: number and UNIX-style:

In [0]:

df[['st_mode', 'st_mode.pretty']]

Out[0]:

	st_mode	st_mode.pretty
0	16877	drwxr-xr-x
1	16877	drwxr-xr-x
2	33188	-rw-r--r--
3	33188	-rw-r--r--
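The same mapping from the numeric mode to the UNIX-style string is available in Python's standard stat module. This sketch only illustrates the two representations. It does not claim that grr_colab uses stat.filemode internally.

```python
import stat

# Numeric st_mode values taken from the listing above.
pretty = {mode: stat.filemode(mode) for mode in (16877, 33188)}

for mode, text in pretty.items():
    print(mode, oct(mode), text)
# 16877 0o40755 drwxr-xr-x
# 33188 0o100644 -rw-r--r--

# The file type can also be tested directly on the numeric value.
print(stat.S_ISDIR(16877), stat.S_ISREG(33188))  # True True
```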

In [0]:

%grr_ls ../baz/dir2

Out[0]:

	st_mode	st_mode.pretty	st_ino	st_dev	st_nlink	st_uid	st_gid	st_size	st_atime	st_mtime	st_ctime	st_blocks	st_blksize	st_rdev	pathspec.pathtype	pathspec.path	pathspec.path_options	st_flags_osx	st_flags_linux
0	16877	drwxr-xr-x	17835392	65025	2	585945	89939	4096	1567157599	1567157599	1567157599	8	4096	0	OS	/tmp/foo/baz/dir2/dir3	CASE_LITERAL	0	524288

In [0]:

%grr_ls /tmp/foo

Out[0]:

	st_mode	st_mode.pretty	st_ino	st_dev	st_nlink	st_uid	st_gid	st_size	st_atime	st_mtime	st_ctime	st_blocks	st_blksize	st_rdev	pathspec.pathtype	pathspec.path	pathspec.path_options	st_flags_osx	st_flags_linux
0	16877	drwxr-xr-x	17567410	65025	2	585945	89939	4096	1567157544	1567157544	1567157544	8	4096	0	OS	/tmp/foo/bar	CASE_LITERAL	0	524288
1	16877	drwxr-xr-x	17695802	65025	4	585945	89939	4096	1567157664	1567157631	1567157631	8	4096	0	OS	/tmp/foo/baz	CASE_LITERAL	0	524288

To see the metadata of a file, use the grr_stat magic.

In [0]:

%grr_stat file1

Out[0]:

	st_mode	st_mode.pretty	st_ino	st_dev	st_nlink	st_uid	st_gid	st_size	st_atime	st_mtime	st_ctime	st_blocks	st_blksize	st_rdev	pathspec.pathtype	pathspec.path	pathspec.path_options	st_flags_osx	st_flags_linux
0	33188	-rw-r--r--	17696534	65025	1	585945	89939	70	1567158029	1567157649	1567157649	8	4096	0	OS	/tmp/foo/baz/file1	CASE_LITERAL	0	524288

You can use globbing for stat:

In [0]:

%grr_stat "file*"

Out[0]:

	st_mode	st_mode.pretty	st_ino	st_dev	st_nlink	st_uid	st_gid	st_size	st_atime	st_mtime	st_ctime	st_blocks	st_blksize	st_rdev	pathspec.pathtype	pathspec.path	pathspec.path_options	st_flags_osx	st_flags_linux
0	33188	-rw-r--r--	17696534	65025	1	585945	89939	70	1567158029	1567157649	1567157649	8	4096	0	OS	/tmp/foo/baz/file1	CASE_LITERAL	0	524288
1	33188	-rw-r--r--	17696533	65025	1	585945	89939	23	1567158209	1567157627	1567157627	8	4096	0	OS	/tmp/foo/baz/file2	CASE_LITERAL	0	524288

You can print the first bytes of a file:

In [0]:

%grr_head file1 -c 30

Out[0]:

b'This is the first line\nThis is'

Although the standard head command has no offset option, grr_head lets you specify one:

In [0]:

%grr_head file1 -c 30 -o 20

Out[0]:

b'ne\nThis is the second line\nThi'
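In plain Python terms, the byte count and offset behave like a slice over the file contents. The sketch below reproduces both outputs locally. It assumes file1 holds the 70 bytes printed by the cached grr_head call in this notebook, and head is a hypothetical stand-in rather than the magic itself.

```python
# Contents of file1 as printed by the cached grr_head call in this notebook.
FILE1 = (
    b"This is the first line\n"
    b"This is the second line\n"
    b"This is the third LINE\n"
)


def head(data: bytes, count: int, offset: int = 0) -> bytes:
    """Returns up to count bytes of data, starting at offset."""
    return data[offset:offset + count]


print(head(FILE1, 30))             # b'This is the first line\nThis is'
print(head(FILE1, 30, offset=20))  # b'ne\nThis is the second line\nThi'
```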

Some magics, such as grr_head and grr_ls, accept a --cached option (-C for short) that prevents any calls to the client. The data is then read from the cache on the server. The cache is updated only when the client is actually called, so it may be stale, but reading it is much faster.

In [0]:

%grr_ls /tmp/foo/baz -C

Out[0]:

	st_mode	st_mode.pretty	st_ino	st_dev	st_nlink	st_uid	st_gid	st_size	st_atime	st_mtime	st_ctime	st_blocks	st_blksize	pathspec.pathtype	pathspec.path	pathspec.path_options	st_flags_linux
0	16877	drwxr-xr-x	17696532	65025	2	585945	89939	4096	1567157599	1567157599	1567157599	8	4096	OS	/tmp/foo/baz/dir1	CASE_LITERAL	524288
1	16877	drwxr-xr-x	17832583	65025	3	585945	89939	4096	1567157734	1567157599	1567157599	8	4096	OS	/tmp/foo/baz/dir2	CASE_LITERAL	524288
2	33188	-rw-r--r--	17696534	65025	1	585945	89939	70	1567158029	1567157649	1567157649	8	4096	OS	/tmp/foo/baz/file1	CASE_LITERAL	524288
3	33188	-rw-r--r--	17696533	65025	1	585945	89939	23	1567158209	1567157627	1567157627	8	4096	OS	/tmp/foo/baz/file2	CASE_LITERAL	524288

In [0]:

%grr_head file1 -C

Out[0]:

b'This is the first line\nThis is the second line\nThis is the third LINE\n'

Grepping files is also possible. The --fixed-strings option (-F for short) treats the pattern as a literal string rather than a regular expression. The --hex-string option (-X for short) lets you pass the pattern as a hex string.

In [0]:

%grr_grep "line" file1

Out[0]:

	offset	length	data	data.pretty	pathspec.pathtype	pathspec.path	pathspec.path_options
0	18	4	b'line'	b'line'	OS	/tmp/foo/baz/file1	CASE_LITERAL
1	42	4	b'line'	b'line'	OS	/tmp/foo/baz/file1	CASE_LITERAL
2	65	4	b'LINE'	b'LINE'	OS	/tmp/foo/baz/file1	CASE_LITERAL

In [0]:

%grr_grep -F "line" file1

Out[0]:

	offset	length	data	data.pretty	pathspec.pathtype	pathspec.path	pathspec.path_options
0	18	4	b'line'	b'line'	OS	/tmp/foo/baz/file1	CASE_LITERAL
1	42	4	b'line'	b'line'	OS	/tmp/foo/baz/file1	CASE_LITERAL

In [0]:

%grr_grep -X "6c696e65" file1

Out[0]:

	offset	length	data	data.pretty	pathspec.pathtype	pathspec.path	pathspec.path_options
0	18	4	b'line'	b'line'	OS	/tmp/foo/baz/file1	CASE_LITERAL
1	42	4	b'line'	b'line'	OS	/tmp/foo/baz/file1	CASE_LITERAL
2	65	4	b'LINE'	b'LINE'	OS	/tmp/foo/baz/file1	CASE_LITERAL

grr_fgrep is a shortcut for grr_grep with the --fixed-strings option. Globbing is also available here.

In [0]:

%grr_fgrep "line" "file*"

Out[0]:

	offset	length	data	data.pretty	pathspec.pathtype	pathspec.path	pathspec.path_options
0	18	4	b'line'	b'line'	OS	/tmp/foo/baz/file1	CASE_LITERAL
1	42	4	b'line'	b'line'	OS	/tmp/foo/baz/file1	CASE_LITERAL
2	18	4	b'line'	b'line'	OS	/tmp/foo/baz/file2	CASE_LITERAL

In [0]:

%grr_fgrep -X "6c696e65" file1

Out[0]:

	offset	length	data	data.pretty	pathspec.pathtype	pathspec.path	pathspec.path_options
0	18	4	b'line'	b'line'	OS	/tmp/foo/baz/file1	CASE_LITERAL
1	42	4	b'line'	b'line'	OS	/tmp/foo/baz/file1	CASE_LITERAL
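The difference between the regular-expression and fixed-string results can be reproduced locally with the re module. The outputs above suggest that regular-expression matching ignores case while literal matching does not. The sketch assumes exactly that by passing re.IGNORECASE. The grep and fgrep functions here are hypothetical local stand-ins, and FILE1 is the file content printed earlier in this notebook.

```python
import re

FILE1 = (
    b"This is the first line\n"
    b"This is the second line\n"
    b"This is the third LINE\n"
)


def grep(data: bytes, pattern: bytes, flags: int = 0):
    """Returns (offset, matched bytes) for every regex match in data."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, data, flags)]


def fgrep(data: bytes, literal: bytes):
    """Returns (offset, matched bytes) for every exact occurrence of literal."""
    return grep(data, re.escape(literal))


regex_hits = grep(FILE1, b"line", re.IGNORECASE)  # assumed case-insensitive
literal_hits = fgrep(FILE1, b"line")
hex_literal = bytes.fromhex("6c696e65")  # what a hex-string pattern decodes to

print(regex_hits)    # [(18, b'line'), (42, b'line'), (65, b'LINE')]
print(literal_hits)  # [(18, b'line'), (42, b'line')]
print(hex_literal)   # b'line'
```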

If the file is too large and you'd like to download it then use wget:

In [0]:

%grr_wget file1

Out[0]:

'http://localhost:8000//api/clients/C.dc3782aeab2c5b4c/vfs-blob/fs/os/tmp/foo/baz/file1'

You can also download a cached version:

In [0]:

%grr_wget file1 -C

Out[0]:

'http://localhost:8000//api/clients/C.dc3782aeab2c5b4c/vfs-blob/fs/os/tmp/foo/baz/file1'

You can specify the path type with the --path-type flag (-P for short) for all filesystem-related magics. The available values are os (default), tsk, ntfs and registry.

In [0]:

%grr_ls -P os -C

Out[0]:

	st_mode	st_mode.pretty	st_ino	st_dev	st_nlink	st_uid	st_gid	st_size	st_atime	st_mtime	st_ctime	st_blocks	st_blksize	pathspec.pathtype	pathspec.path	pathspec.path_options	st_flags_linux
0	16877	drwxr-xr-x	17696532	65025	2	585945	89939	4096	1567157599	1567157599	1567157599	8	4096	OS	/tmp/foo/baz/dir1	CASE_LITERAL	524288
1	16877	drwxr-xr-x	17832583	65025	3	585945	89939	4096	1567157734	1567157599	1567157599	8	4096	OS	/tmp/foo/baz/dir2	CASE_LITERAL	524288
2	33188	-rw-r--r--	17696534	65025	1	585945	89939	70	1567158029	1567157649	1567157649	8	4096	OS	/tmp/foo/baz/file1	CASE_LITERAL	524288
3	33188	-rw-r--r--	17696533	65025	1	585945	89939	23	1567158209	1567157627	1567157627	8	4096	OS	/tmp/foo/baz/file2	CASE_LITERAL	524288

System information¶

Names of the functions are the same as in bash for simplicity.

Printing hostname of the client:

In [0]:

%grr_hostname

Getting network interfaces info:

In [0]:

ifaces = %grr_ifconfig 

MAC address fields also have two columns: one holding the original bytes value, which is not human-readable, and a pretty one holding the string representation of the MAC address.

In [0]:

ifaces[['mac_address', 'mac_address.pretty']][1:]

Out[0]:

	mac_address	mac_address.pretty
1	b'\x00\x00\x00\x00\x00\x00'	00:00:00:00:00:00

If a field contains a collection then the cell in the dataframe is represented as another dataframe. IP address fields also have two representations.

In [0]:

ifaces['addresses'][1]

Out[0]:

	address_type	packed_bytes	packed_bytes.pretty
0	INET	b'\x7f\x00\x00\x01'	127.0.0.1
1	INET6	b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00...	::1
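Both pretty forms can be derived from the raw bytes with the standard library. This is a minimal sketch of the conversions, not the code grr_colab runs. pretty_mac is a hypothetical helper, and the IPv6 value used is the standard packed form of ::1 rather than a copy of the truncated cell above.

```python
import ipaddress


def pretty_mac(raw: bytes) -> str:
    """Formats raw MAC address bytes as colon-separated hex pairs."""
    return ":".join(f"{byte:02x}" for byte in raw)


mac = pretty_mac(b"\x00\x00\x00\x00\x00\x00")
ipv4 = str(ipaddress.ip_address(b"\x7f\x00\x00\x01"))
ipv6 = str(ipaddress.ip_address(b"\x00" * 15 + b"\x01"))

print(mac)   # 00:00:00:00:00:00
print(ipv4)  # 127.0.0.1
print(ipv6)  # ::1
```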

The uname magic supports only two options: --machine (-m), which prints the machine architecture, and --kernel-release (-r), which prints the kernel release.

In [0]:

%grr_uname -m

Out[0]:

'x86_64'

In [0]:

%grr_uname -r

Out[0]:

'4.19.37-5rodete4-amd64'

To get the client summary, run the interrogate flow.

In [0]:

df = %grr_interrogate
df[['client_id', 'system_info.system', 'system_info.machine']]

Out[0]:

	client_id	system_info.system	system_info.machine
0	aff4:/C.dc3782aeab2c5b4c	Linux	x86_64

It is also possible to get information about the processes running on the client machine:

In [0]:

ps = %grr_ps
ps[:5]

Out[0]:

	pid	ppid	name	exe	cmdline	ctime	real_uid	effective_uid	saved_uid	real_gid	...	status	cwd	num_threads	user_cpu_time	system_cpu_time	RSS_size	VMS_size	memory_percent	connections
0	1	0	systemd	/usr/lib/systemd/systemd	0 0 /lib/systemd/system...	1565017014530000	0	0	0	0	...	sleeping	/	1	78.779999	53.02000	9670656	230248448	0.014377	NaN
1	520	1	lvmetad	/usr/sbin/lvmetad	0 0 /sbin/lvmetad 1 ...	1565017041170000	0	0	0	0	...	sleeping	/	1	0.050000	0.05000	1937408	108138496	0.002880	NaN
2	759	1	rpc.svcgssd	/usr/sbin/rpc.svcgssd	0 0 /usr/sbin/rpc.svcgssd	1565017041590000	0	0	0	0	...	sleeping	/	1	0.000000	0.00000	3215360	31694848	0.004780	NaN
3	760	1	rpc.gssd	/usr/sbin/rpc.gssd	0 0 /usr/sbin/rpc.gssd 1 ...	1565017041600000	0	0	0	0	...	sleeping	/run/rpc_pipefs	1	0.000000	0.00000	299008	27766784	0.000445	NaN
4	848	1	mgagentxp_script_runner.par	/usr/bin/mgagentxp_script_runner.par	...	1565017042310000	65534	65534	65534	1001	...	sleeping	/	5	424.779999	490.51001	25403392	1131827200	0.037767	NaN

5 rows × 24 columns

To fetch some system information you can also use osquery. Osquery tables are also converted to dataframes.

In [0]:

%grr_osqueryi "SELECT pid, name, cmdline, state, nice, threads FROM processes WHERE pid >= 440 and pid < 600;"

Out[0]:

	cmdline	name	nice	pid	state	threads
0		kworker/4:1H-kblockd	-20	500	I	1
1		rpciod	-20	505	I	1
2		xprtiod	-20	506	I	1
3	/sbin/lvmetad -f	lvmetad	0	520	S	1

Running YARA for scanning processes is also available.

In [0]:

import os 

pid = os.getpid()
data = "dadasdasdasdjaskdakdaskdakjdkjadkjakjjdsgkngksfkjadsjnfandankjd"
rule = 'rule TextExample {{ strings: $text_string = "{data}" condition: $text_string }}'.format(data=data)

df = %grr_yara '{rule}' -p {pid}
df[['process.pid', 'process.name', 'process.exe']]

Out[0]:

	process.pid	process.name	process.exe
0	63438	python3	/opt/python/3.7/bin/python3.7
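The doubled braces in the rule template above are there because str.format treats single braces as placeholders. The snippet below shows the rule text that format actually produces. The "example" value is an arbitrary placeholder chosen for illustration.

```python
template = (
    'rule TextExample {{ strings: $text_string = "{data}" '
    'condition: $text_string }}'
)

# {{ and }} collapse to literal braces; {data} is substituted.
rule = template.format(data="example")
print(rule)
# rule TextExample { strings: $text_string = "example" condition: $text_string }
```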

Configuring flow timeout¶

The default flow timeout is 30 seconds. This is how long a function waits for a flow to complete. You can change it with grr_set_flow_timeout, passing the number of seconds to wait. For example, this sets the timeout to one minute:

In [0]:

%grr_set_flow_timeout 60

To tell functions to wait for the flows forever until they are completed:

In [0]:

%grr_set_no_flow_timeout

To set timeout to default value of 30 seconds:

In [0]:

%grr_set_default_flow_timeout

Setting timeout to 0 tells functions not to wait at all and exit immediately after the flow starts.

In [0]:

%grr_set_flow_timeout 0

If the timeout is exceeded (or you set the timeout to 0), you will see an error with a link to the Admin UI.
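The three timeout modes (a fixed number of seconds, no timeout, and zero) can be pictured as a polling loop with an optional deadline. The following is a generic sketch of that pattern, not GRR's implementation. wait_until, its parameters and the polling interval are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional


def wait_until(done: Callable[[], bool],
               timeout: Optional[float],
               poll_interval: float = 0.01) -> bool:
    """Polls done() until it returns True or the timeout expires.

    timeout=None waits forever; timeout=0 checks once and gives up.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        if done():
            return True
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


calls = {"n": 0}


def done_after_three_polls() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


print(wait_until(done_after_three_polls, timeout=None))  # True
print(wait_until(lambda: False, timeout=0))              # False
```

In this sketch a False result corresponds to the error case described above, where the caller is pointed to the Admin UI to follow the flow.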

Collecting artifacts¶

You can first list all the artifacts that you can collect:

In [0]:

df = %grr_list_artifacts
df[:2]

Out[0]:

	artifact.name	artifact.doc	artifact.supported_os	artifact.labels	artifact.urls	artifact.sources	is_custom	error_message	dependencies	artifact.provides	path_dependencies	processors	artifact.conditions
0	APTSources	APT package sources list	0 0 Linux	0 0 Configuration Files ...	...	type at...	False		NaN	NaN	NaN	NaN	NaN
1	APTTrustKeys	APT trusted keys	0 0 Linux	0 0 Configuration Files ...	0 0 https:...	type at...	False		NaN	NaN	NaN	NaN	NaN

To collect an artifact you just need to provide its name:

In [0]:

%grr_collect "DebianVersion"

Out[0]:

	st_mode	st_mode.pretty	st_ino	st_dev	st_nlink	st_uid	st_gid	st_size	st_atime	st_mtime	st_ctime	st_blocks	st_blksize	st_rdev	pathspec.pathtype	pathspec.path	pathspec.path_options	st_flags_osx	st_flags_linux
0	33188	-rw-r--r--	10094787	65025	1	0	0	7	1567107891	1559242439	1559242439	8	4096	0	OS	/etc/debian_version	CASE_LITERAL	0	524288

Python API¶

Getting a client¶

With the Python API you can work with multiple clients simultaneously. You don't need to select a client; instead you simply get a client object.

Use the search method to search for clients. You can specify ip, mac, host, version, user and labels as search criteria. The result is a list of client objects, from which you can pick one to work with.

In [0]:

clients = grr_colab.Client.search(user='admin')
clients

Out[0]:

🌕 C.dc3782aeab2c5b4c @ admin.example.com (0 seconds ago)

In [0]:

clients[0].id

Out[0]:

'C.dc3782aeab2c5b4c'

If you know a client ID or a hostname (in case there is one client installed for this hostname) you can get a client object using one of these values:

In [0]:

client = grr_colab.Client.with_id('C.dc3782aeab2c5b4c')

Client properties¶

There are several simple client properties that return information about the client. Unlike the magics API, this API returns objects rather than dataframes for non-primitive values.

Getting the client ID:

In [0]:

client.id

Out[0]:

'C.dc3782aeab2c5b4c'

Getting the client hostname:

In [0]:

client.hostname

Getting network interfaces info:

In [0]:

client.ifaces[1:]

Out[0]:

lo (MAC: 00:00:00:00:00:00):
    inet 127.0.0.1
    inet6 ::1

In [0]:

client.ifaces[1].ifname

Out[0]:

'lo'

This is a collection of interface objects so you can iterate over it and access interface object fields:

In [0]:

for iface in client.ifaces:
  print(iface.ifname)

enp0s31f6
lo

Getting the knowledge base for the client. You can also access its fields:

In [0]:

client.knowledgebase
client.knowledgebase.os_release

Out[0]:

'Debian GNU/Linux'

Getting the architecture of the machine the client runs on:

In [0]:

client.arch

Out[0]:

'x86_64'

Getting kernel version string:

In [0]:

client.kernel

Out[0]:

'4.19.37-5rodete4-amd64'

Getting a list of labels that are associated with this client:

In [0]:

client.labels

Out[0]:

[]

First seen and last seen times are saved as datetime objects:

In [0]:

client.first_seen

Out[0]:

datetime.datetime(2019, 8, 15, 11, 34, 17, 656692)

In [0]:

client.last_seen

Out[0]:

datetime.datetime(2019, 8, 30, 10, 5, 49, 102492)
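Because these are ordinary datetime objects, the usual arithmetic and formatting apply. The sketch below hard-codes the two values shown above so that it runs without a GRR server. With a live client you would use client.first_seen and client.last_seen instead.

```python
import datetime

# Values copied from the outputs above.
first_seen = datetime.datetime(2019, 8, 15, 11, 34, 17, 656692)
last_seen = datetime.datetime(2019, 8, 30, 10, 5, 49, 102492)

# Subtracting two datetimes yields a timedelta.
known_for = last_seen - first_seen
print(known_for.days)  # 14

print(last_seen.isoformat(sep=" ", timespec="seconds"))  # 2019-08-30 10:05:49
```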

Requesting approvals¶

As in the magics API, you need to request an approval before running flows on a client. To do this, call the request_approval method with a reason for the approval and a list of approvers.

In [0]:

client.request_approval(approvers=['admin'], reason='Test reason')

This method does not wait until the approval is granted. If you need to wait, use request_approval_and_wait method that has the same signature.

Running flows¶

To set the flow timeout, use the set_flow_timeout function. The default is 30 seconds, and 0 means exit immediately after the flow has started. You can also disable the timeout or reset it to the default value of 30 seconds.

In [0]:

# Wait forever
grr_colab.set_no_flow_timeout()

# Exit immediately
grr_colab.set_flow_timeout(0)

# Wait for one minute
grr_colab.set_flow_timeout(60)

# Wait for 30 seconds (the default)
grr_colab.set_default_flow_timeout()

Below are examples of flows that you can run.

Interrogating a client:

In [0]:

summary = client.interrogate()
summary.system_info.system

Out[0]:

'Linux'

Listing processes on a client:

In [0]:

ps = client.ps()
ps[:1]

Out[0]:

   PID USER       NI  VIRT   RES S CPU% MEM% Command
     1 root        0  220M    9M S  0.0  0.0 /usr/lib/systemd/systemd

In [0]:

ps[0]

Out[0]:

     1 root        0  220M    9M S  0.0  0.0 /usr/lib/systemd/systemd

In [0]:

ps[0].exe

Out[0]:

'/usr/lib/systemd/systemd'

Listing files in a directory. Here you need to provide the absolute path to the directory because there is no state.

In [0]:

files = client.ls('/tmp/foo/baz')
files

Out[0]:

/tmp/foo/baz
    📂 dir1 (drwxr-xr-x /tmp/foo/baz/dir1, 4.0 KiB)
    📂 dir2 (drwxr-xr-x /tmp/foo/baz/dir2, 4.0 KiB)
    📄 file1 (-rw-r--r-- /tmp/foo/baz/file1, 70 Bytes)
    📄 file2 (-rw-r--r-- /tmp/foo/baz/file2, 23 Bytes)

In [0]:

for f in files:
  print(f.pathspec.path)

/tmp/foo/baz/dir1
/tmp/foo/baz/dir2
/tmp/foo/baz/file1
/tmp/foo/baz/file2

Recursive listing of a directory is also possible. To do this specify the max depth of the recursion.

In [0]:

files = client.ls('/tmp/foo', max_depth=3)
files

Out[0]:

/tmp/foo
    📂 bar (drwxr-xr-x /tmp/foo/bar, 4.0 KiB)
    📂 baz (drwxr-xr-x /tmp/foo/baz, 4.0 KiB)
        📂 dir1 (drwxr-xr-x /tmp/foo/baz/dir1, 4.0 KiB)
        📂 dir2 (drwxr-xr-x /tmp/foo/baz/dir2, 4.0 KiB)
            📂 dir3 (drwxr-xr-x /tmp/foo/baz/dir2/dir3, 4.0 KiB)
        📄 file1 (-rw-r--r-- /tmp/foo/baz/file1, 70 Bytes)
        📄 file2 (-rw-r--r-- /tmp/foo/baz/file2, 23 Bytes)

In [0]:

for f in files:
  print(f.pathspec.path)

/tmp/foo/bar
/tmp/foo/baz
/tmp/foo/baz/dir1
/tmp/foo/baz/dir2
/tmp/foo/baz/file1
/tmp/foo/baz/file2
/tmp/foo/baz/dir2/dir3

Globbing files:

In [0]:

files = client.glob('/tmp/foo/baz/file*')
files

Out[0]:

/tmp/foo/baz
    📄 file1 (-rw-r--r-- /tmp/foo/baz/file1, 70 Bytes)
    📄 file2 (-rw-r--r-- /tmp/foo/baz/file2, 23 Bytes)

Grepping files with regular expressions:

In [0]:

matches = client.grep(path='/tmp/foo/baz/file*', pattern=b'line')
matches

Out[0]:

/tmp/foo/baz/file1:18-22: b'line'
/tmp/foo/baz/file1:42-46: b'line'
/tmp/foo/baz/file1:65-69: b'LINE'
/tmp/foo/baz/file2:18-22: b'line'

In [0]:

for match in matches:
  print(match.pathspec.path, match.offset, match.data)

/tmp/foo/baz/file1 18 b'line'
/tmp/foo/baz/file1 42 b'line'
/tmp/foo/baz/file1 65 b'LINE'
/tmp/foo/baz/file2 18 b'line'

In [0]:

matches = client.grep(path='/tmp/foo/baz/file*', pattern=b'\x6c\x69\x6e\x65')
matches

Out[0]:

/tmp/foo/baz/file1:18-22: b'line'
/tmp/foo/baz/file1:42-46: b'line'
/tmp/foo/baz/file1:65-69: b'LINE'
/tmp/foo/baz/file2:18-22: b'line'

Grepping files by exact match:

In [0]:

matches = client.fgrep(path='/tmp/foo/baz/file*', literal=b'line')
matches

Out[0]:

/tmp/foo/baz/file1:18-22: b'line'
/tmp/foo/baz/file1:42-46: b'line'
/tmp/foo/baz/file2:18-22: b'line'

Downloading files:

In [0]:

client.wget('/tmp/foo/baz/file1')

Out[0]:

'http://localhost:8000//api/clients/C.dc3782aeab2c5b4c/vfs-blob/fs/os/tmp/foo/baz/file1'

Osquerying a client:

In [0]:

table = client.osquery('SELECT pid, name, nice FROM processes WHERE pid < 5')
table

Out[0]:

         name nice pid
0     systemd    0   1
1    kthreadd    0   2
2      rcu_gp  -20   3
3  rcu_par_gp  -20   4

In [0]:

header = ' '.join(str(col.name).rjust(10) for col in table.header.columns)
print(header)
print('-' * len(header))
for row in table.rows:
  print(' '.join(value.rjust(10) for value in row.values))

      name       nice        pid
--------------------------------
   systemd          0          1
  kthreadd          0          2
    rcu_gp        -20          3
rcu_par_gp        -20          4

Listing artifacts:

In [0]:

artifacts = grr_colab.list_artifacts()
artifacts[0]

Out[0]:

artifact {
  name: "APTSources"
  doc: "APT package sources list"
  labels: "Configuration Files"
  labels: "System"
  supported_os: "Linux"
  urls: "http://manpages.ubuntu.com/manpages/trusty/en/man5/sources.list.5.html"
  sources {
    type: FILE
    attributes {
      dat {
        k {
          string: "paths"
        }
        v {
          list {
            content {
              string: "/etc/apt/sources.list"
            }
            content {
              string: "/etc/apt/sources.list.d/*.list"
            }
          }
        }
      }
    }
  }
}
is_custom: false
error_message: ""

To collect an artifact you just need to provide its name:

In [0]:

client.collect('DebianVersion')

Out[0]:

[📄 debian_version (-rw-r--r-- /etc/debian_version, 7 Bytes)]

Running YARA:

In [0]:

import os 

pid = os.getpid()
data = "dadasdasdasdjaskdakdaskdakjdkjadkjakjjdsgkngksfkjadsjnfandankjd"
rule = 'rule TextExample {{ strings: $text_string = "{data}" condition: $text_string }}'.format(data=data)

matches = client.yara(rule, pids=[pid])
print(matches[0].process.pid, matches[0].process.name)

63438 python3

Working with files¶

You can read and seek within files, interacting with them as with regular Python files.

In [0]:

with client.open('/tmp/foo/baz/file1') as f:
  print(f.readline())

b'This is the first line\n'

In [0]:

with client.open('/tmp/foo/baz/file1') as f:
  for line in f:
    print(line)

b'This is the first line\n'
b'This is the second line\n'
b'This is the third LINE\n'

In [0]:

with client.open('/tmp/foo/baz/file1') as f:
  print(f.read(22))
  f.seek(0)
  print(f.read(22))
  print(f.read())

b'This is the first line'
b'This is the first line'
b'\nThis is the second line\nThis is the third LINE\n'
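The same read, seek and iteration calls work on any Python binary file object, so the access pattern can be tried without a GRR server. The sketch below uses io.BytesIO as a local stand-in for the object returned by client.open. FILE1 is assumed to hold the content printed above.

```python
import io

FILE1 = (
    b"This is the first line\n"
    b"This is the second line\n"
    b"This is the third LINE\n"
)

with io.BytesIO(FILE1) as f:
    first = f.read(22)   # read 22 bytes from the start
    f.seek(0)            # rewind to the beginning
    again = f.read(22)   # the same 22 bytes
    rest = f.read()      # everything that is left

with io.BytesIO(FILE1) as f:
    lines = list(f)      # iterating yields one line at a time

print(first)       # b'This is the first line'
print(again)       # b'This is the first line'
print(rest)        # b'\nThis is the second line\nThis is the third LINE\n'
print(len(lines))  # 3
```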

Cached data¶

To fetch data cached on the server, use the cached property of a client object.

You can list files in a directory (also recursively) and read and download files as above:

In [0]:

files = client.cached.ls('/tmp/foo/baz')
files

Out[0]:

/tmp/foo/baz
    📂 dir1 (drwxr-xr-x /tmp/foo/baz/dir1, 4.0 KiB)
    📂 dir2 (drwxr-xr-x /tmp/foo/baz/dir2, 4.0 KiB)
    📄 file1 (-rw-r--r-- /tmp/foo/baz/file1, 70 Bytes)
    📄 file2 (-rw-r--r-- /tmp/foo/baz/file2, 23 Bytes)

In [0]:

files = client.cached.ls('/tmp/foo/baz', max_depth=2)
files

Out[0]:

/tmp/foo/baz
    📂 dir1 (drwxr-xr-x /tmp/foo/baz/dir1, 4.0 KiB)
    📂 dir2 (drwxr-xr-x /tmp/foo/baz/dir2, 4.0 KiB)
        📂 dir3 (drwxr-xr-x /tmp/foo/baz/dir2/dir3, 4.0 KiB)
    📄 file1 (-rw-r--r-- /tmp/foo/baz/file1, 70 Bytes)
    📄 file2 (-rw-r--r-- /tmp/foo/baz/file2, 23 Bytes)

In [0]:

with client.cached.open('/tmp/foo/baz/file1') as f:
  for line in f:
    print(line)

b'This is the first line\n'
b'This is the second line\n'
b'This is the third LINE\n'

In [0]:

client.cached.wget('/tmp/foo/baz/file1')

Out[0]:

'http://localhost:8000//api/clients/C.dc3782aeab2c5b4c/vfs-blob/fs/os/tmp/foo/baz/file1'

You can also refresh filesystem metadata that is cached on the server by calling refresh method (that will refresh the contents of the directory and not its subdirectories):

In [0]:

client.cached.refresh('/tmp/foo/baz')

To refresh a directory recursively specify max_depth parameter:

In [0]:

client.cached.refresh('/tmp/foo/baz', max_depth=2)

Path types¶

To specify path type, just use one of the client properties: client.os (the same as just using client), client.tsk, client.ntfs, client.registry.

In [0]:

client.os.ls('/tmp/foo')

Out[0]:

/tmp/foo
    📂 bar (drwxr-xr-x /tmp/foo/bar, 4.0 KiB)
    📂 baz (drwxr-xr-x /tmp/foo/baz, 4.0 KiB)

In [0]:

client.os.cached.ls('/tmp/foo')

Out[0]:

/tmp/foo
    📂 bar (drwxr-xr-x /tmp/foo/bar, 4.0 KiB)
    📂 baz (drwxr-xr-x /tmp/foo/baz, 4.0 KiB)