Interactive Query with Gremlin

Gremlin is one of the most popular query languages for graph databases, playing a role similar to that of SQL in relational databases. Here, we give some examples to illustrate how Gremlin helps navigate the vertices and edges of a graph.

Dataset

We use MODERN, a toy graph from TinkerPop that consists of 6 vertices and 6 edges. We extend it with two additional edges, (2,4) and (4,1), to create more complex relationships among the vertices. The 8 edges of the extended graph, written as (source, destination) pairs, are:

[(1,3),(1,2),(1,4),(4,5),(4,3),(6,3),(2,4),(4,1)]
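
As a quick sanity check on this edge list, the following sketch builds an adjacency map in plain Python, independently of GraphScope. The names `EDGES` and `build_adjacency` are illustrative only and not part of any library, and each pair is assumed to be (source, destination).

```python
# Edge list copied from the text above; each pair is assumed to be (source, destination).
EDGES = [(1, 3), (1, 2), (1, 4), (4, 5), (4, 3), (6, 3), (2, 4), (4, 1)]


def build_adjacency(edges):
    """Map every vertex id to the sorted list of its out-neighbors."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
        adjacency.setdefault(dst, [])  # make sure vertices without out-edges appear too
    return {v: sorted(adjacency[v]) for v in sorted(adjacency)}


adjacency = build_adjacency(EDGES)
for vertex, neighbors in adjacency.items():
    print(vertex, "->", neighbors)
```

Vertices 3 and 5 end up with empty neighbor lists, since no edge in the list starts from them.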

In [ ]:
import graphscope
from graphscope.framework.loader import Loader

k8s_volumes = {
    "data": {
        "type": "hostPath",
        "field": {
          "path": "/testingdata/modern_graph_2",  # Path in host
          "type": "Directory"
        },
        "mounts": {
          "mountPath": "/home/jovyan/datasets/modern_graph_2",  # Path in pods
          "readOnly": True
        }
    }
}
graphscope.set_option(show_log=True)  # enable logging
session = graphscope.session(k8s_volumes=k8s_volumes)  # create a session
modern_graph = session.load_from(
    vertices={
        "person": (Loader("/home/jovyan/datasets/modern_graph_2/person.csv", delimiter="|", header_row=True), ["name", ("age", "int")], "id"),
        "software": (Loader("/home/jovyan/datasets/modern_graph_2/software.csv", delimiter="|", header_row=True), ["name", "lang"], "id"),
    },
    edges={
        "knows": [Loader("/home/jovyan/datasets/modern_graph_2/knows.csv", delimiter="|"), [], (0, "person"), (1, "person")],
        "created": [Loader("/home/jovyan/datasets/modern_graph_2/created.csv", delimiter="|"), [], (0, "person"), (1, "software")],
    }, generate_eid=False)
interactive = session.gremlin(modern_graph)

Between Vertices

Traversing between two particular vertices is a common task in graph databases. For example, to find out how v1 is related to v2 and v3, we can write a Gremlin query that starts at v1, follows outgoing edges, and keeps only the neighbors whose id is 2 or 3:

In [ ]:
q1 = interactive.execute('g.V().has("id", 1).as("u").out().has("id", eq(2).or(eq(3))).as("v").select("u", "v").by("id")')
for p in q1:
    print(p)

Here is an example that is popular in social network scenarios: finding what two people have in common. The query below returns the vertices that both "marko" and "peter" point to through outgoing edges.

In [ ]:
q2 = interactive.execute('g.V().has("name", "marko").out().where(__.in().has("name", "peter")).valueMap()')
for p in q2:
    print(p)
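
The logic of this traversal can be mirrored in plain Python as the intersection of two out-neighbor sets. This is only a sketch on the edge list from the Dataset section: the helper names are hypothetical, and the ids 1 and 6 are assumed to belong to "marko" and "peter", as in the standard TinkerPop modern graph.

```python
# Edge list from the Dataset section; each pair is assumed to be (source, destination).
EDGES = [(1, 3), (1, 2), (1, 4), (4, 5), (4, 3), (6, 3), (2, 4), (4, 1)]


def out_neighbors(edges, vertex):
    """Return the set of vertices reachable from `vertex` over one outgoing edge."""
    return {dst for src, dst in edges if src == vertex}


def common_out_neighbors(edges, a, b):
    """Vertices that both `a` and `b` point to, like out().where(__.in()...) in the query."""
    return out_neighbors(edges, a) & out_neighbors(edges, b)


# Assumption: id 1 is "marko" and id 6 is "peter" (standard TinkerPop modern graph ids).
common = common_out_neighbors(EDGES, 1, 6)
print(sorted(common))
```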

Degree Centrality

Degree centrality measures the number of edges incident to each vertex, a basic statistic in large-scale graph processing. The first query below counts all edges of each vertex (incoming and outgoing), and the second counts only incoming edges:

In [ ]:
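# Total degree: group by vertex and count every incident edge
q3 = interactive.execute("g.V().group().by().by(bothE().count())")
for p in q3:
    print(p[0])
# In-degree: group by vertex and count only incoming edges
q4 = interactive.execute("g.V().group().by().by(inE().count())")
for p in q4:
    print(p[0])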
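
To cross-check the two degree queries without a running session, the same counts can be computed directly from the edge list in the Dataset section. This is a minimal sketch; `degree_counts` is a hypothetical helper, and the result only matches the Gremlin output if the loaded CSV files contain exactly those edges.

```python
from collections import Counter

# Edge list from the Dataset section; each pair is assumed to be (source, destination).
EDGES = [(1, 3), (1, 2), (1, 4), (4, 5), (4, 3), (6, 3), (2, 4), (4, 1)]


def degree_counts(edges):
    """Return (total degree, in-degree) dictionaries keyed by vertex id."""
    in_degree = Counter(dst for _, dst in edges)
    out_degree = Counter(src for src, _ in edges)
    vertices = sorted(set(in_degree) | set(out_degree))
    # A Counter returns 0 for missing keys, so vertices without edges in one direction are handled.
    both = {v: in_degree[v] + out_degree[v] for v in vertices}
    incoming = {v: in_degree[v] for v in vertices}
    return both, incoming


both_degree, in_degree = degree_counts(EDGES)
print("total degree:", both_degree)
print("in-degree:", in_degree)
```

The total degrees sum to twice the number of edges, because every edge is counted once at its source and once at its destination.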

Cycle Detection

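Cycle detection is another important application of graph queries, especially in commerce, where cycles are often treated as indicators of fraud. The example below uses Gremlin to count cycles of length 3: it walks two hops along outgoing edges without revisiting a vertex, then checks whether the end vertex has an edge back to the start.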

In [ ]:
q5 = interactive.execute('g.V().as("u").repeat(out().simplePath()).times(2).where(out().where(eq("u"))).count()')
print(q5.one())
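
The counting rule behind this traversal can be sketched in plain Python on the edge list from the Dataset section. The function name `count_three_cycles` is illustrative only; the sketch follows the same steps as the query (two simple-path hops, then a check for an edge back to the start), so a single directed triangle is counted once per starting vertex.

```python
# Edge list from the Dataset section; each pair is assumed to be (source, destination).
EDGES = [(1, 3), (1, 2), (1, 4), (4, 5), (4, 3), (6, 3), (2, 4), (4, 1)]


def count_three_cycles(edges):
    """Count paths u -> a -> b over distinct vertices where b has an edge back to u."""
    out = {}
    for src, dst in edges:
        out.setdefault(src, []).append(dst)
        out.setdefault(dst, [])
    count = 0
    for u in out:
        for a in out[u]:
            if a == u:  # simplePath(): do not revisit the start vertex
                continue
            for b in out[a]:
                if b in (u, a):  # simplePath(): all three vertices must be distinct
                    continue
                if u in out[b]:  # closing edge b -> u completes the cycle
                    count += 1
    return count


print(count_three_cycles(EDGES))
```

On this edge list the sketch counts 3, one for each starting vertex of the single triangle 1 -> 2 -> 4 -> 1. The two-vertex loop 1 -> 4 -> 1 is not counted, because it has length 2.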

Close session

In [ ]:
session.close()