.. _graph_transformation:

Graph Transformations
=====================

This section introduces a series of methods that append more labels to an existing graph and project an existing graph onto a subset of its labels and properties. It also shows how to make a complex property graph compatible with algorithms that can only run on simple graphs. Finally, it shows how to add the result of an algorithm back to the graph as a vertex property.

More specifically, :class:`Graph` provides two methods for appending labels and one method for projection.

.. code:: python

    def add_vertices(self, vertices, label="_", properties=[], vid_field=0):
        pass

    def add_edges(self, edges, label="_", properties=[], src_label=None, dst_label=None, src_field=0, dst_field=1):
        pass

    def project(self, vertices, edges):
        pass

We have already seen `add_vertices` and `add_edges` in :ref:`loading graphs`, where we used them to build a graph iteratively. They can also attach more vertex labels and edge labels to an existing graph. Doing so does not modify the source graph; instead, each call returns a new graph that is based on the source graph.

Attach new labels
-----------------

Take the LDBC-SNB property graph as an example. We first load a subset of its labels as the source graph.

.. code:: python

    import graphscope
    from graphscope.framework.loader import Loader

    sess = graphscope.session()
    # Build a directed graph.
    graph = sess.g(directed=True)
    graph = graph.add_vertices(Loader("person_0_0.csv", delimiter="|"), "person")
    graph = graph.add_edges(
        Loader("person_knows_person_0_0.csv", delimiter="|"),
        "knows",
        src_label="person",
        dst_label="person",
    )
    # graph has 1 vertex label "person" and 1 edge label "knows"
    print(graph.schema)

Now that we have a loaded graph, let's attach some new labels to it.

.. code:: python

    graph1 = graph.add_vertices(Loader("comment_0_0.csv", delimiter="|"), "comment")
    # Now graph1 has 2 vertex labels "person" and "comment"
    print(graph1.schema)

    graph2 = graph1.add_edges(
        Loader("comment_replyOf_comment_0_0.csv", delimiter="|"),
        "replyOf",
        src_label="comment",
        dst_label="comment",
    )
    # graph2 has 2 edge labels "knows" and "replyOf"
    print(graph2.schema)

Each `add` operation produces a new graph. Under the hood, the labels that the new graph has in common with the source graph share the same memory, so the source graph is not copied.

Projection
----------

In some scenarios, we need to extract a subgraph from a complex graph. We do that with `project`.

.. code:: python

    def project(
        self,
        vertices: Mapping[str, Union[List[str], None]],
        edges: Union[Mapping[str, Union[List[str], None]], None]
    ):
        pass

Both parameters are `dict` objects: each key is a label name, and each value is a `list` of `str` naming the properties to keep. If the value is `None`, all properties of that label are selected.

A graph produced by `project` behaves just like a normal property graph and can be projected further.

Here are some examples.

.. code:: python

    sub_graph = graph2.project(vertices={"person": ["firstName", "lastName"]}, edges={"knows": None})
    # contains 1 vertex label "person", and 1 edge label "knows", with selected properties.
    print(sub_graph.schema)

    sub_graph2 = sub_graph.project(vertices={"person": []}, edges={"knows": ["creationDate"]})
    # No properties on the vertex, and 1 property on the edge.
    print(sub_graph2.schema)

Transform to simple graph implicitly
------------------------------------

When an algorithm that only works on simple graphs queries a property graph, the property graph is converted to a simple graph implicitly. If the conversion cannot be performed (the graph does not have exactly one vertex label and exactly one edge label, or a vertex or edge label has more than one property), an exception is raised.

.. code:: python

    from graphscope import wcc

    ret = wcc(sub_graph2)
    # wcc(graph2)  # Error! More than 1 vertex label / edge label
    # wcc(sub_graph)  # Error! More than 1 property.

Add results back to graph as a property
---------------------------------------

The result `ret` produced in the previous step can be added to a graph as a vertex property. The result can be added not only to the graph that was queried directly, but also to any graph from which the queried graph was derived by `project`, as long as the vertex label to be mutated is the same in both graphs.

.. code:: python

    # The graph that was queried directly.
    new_graph = sub_graph2.add_column(ret, selector={'cc': 'r'})
    # The graph that sub_graph2 was projected from.
    new_graph = sub_graph.add_column(ret, selector={'cc': 'r'})
    # An earlier graph that shares the "person" vertex label.
    new_graph = graph.add_column(ret, selector={'cc': 'r'})
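
The conversion rule described under "Transform to simple graph implicitly" can be sketched in plain Python. The snippet below is not GraphScope code: `check_simple_graph` and `is_simple` are hypothetical helpers, and the property lists in the sample schemas are assumed for illustration only. It simply mirrors the stated condition of exactly one vertex label, exactly one edge label, and at most one property on each.

.. code:: python

    from typing import Dict, List


    def check_simple_graph(vertex_props: Dict[str, List[str]],
                           edge_props: Dict[str, List[str]]) -> None:
        """Raise ValueError unless the schema satisfies the simple-graph rule.

        Hypothetical helper: each argument maps a label name to the list of
        property names kept on that label.
        """
        if len(vertex_props) != 1:
            raise ValueError(f"expected 1 vertex label, got {len(vertex_props)}")
        if len(edge_props) != 1:
            raise ValueError(f"expected 1 edge label, got {len(edge_props)}")
        for kind, props_by_label in (("vertex", vertex_props), ("edge", edge_props)):
            for label, props in props_by_label.items():
                if len(props) > 1:
                    raise ValueError(
                        f"{kind} label {label!r} has {len(props)} properties, at most 1 allowed"
                    )


    def is_simple(vertex_props: Dict[str, List[str]],
                  edge_props: Dict[str, List[str]]) -> bool:
        """Return True when check_simple_graph accepts the schema."""
        try:
            check_simple_graph(vertex_props, edge_props)
        except ValueError:
            return False
        return True


    # Schemas loosely mirroring the graphs built above (property names assumed).
    graph2_schema = (
        {"person": ["firstName", "lastName"], "comment": ["content"]},
        {"knows": ["creationDate"], "replyOf": ["creationDate"]},
    )
    sub_graph_schema = ({"person": ["firstName", "lastName"]}, {"knows": ["creationDate"]})
    sub_graph2_schema = ({"person": []}, {"knows": ["creationDate"]})

    print(is_simple(*sub_graph2_schema))  # True: 1 label each, at most 1 property
    print(is_simple(*graph2_schema))      # False: 2 vertex labels, 2 edge labels
    print(is_simple(*sub_graph_schema))   # False: "person" keeps 2 properties

A check along these lines, run before calling an algorithm such as `wcc`, could indicate in advance which graphs need a further `project` call.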