Gremlin query language

Return intermediate vertices by creating aliases with the Gremlin language

In previous blog posts, I explained the use-case, how to load data into an Azure Cosmos DB graph database with the bulk executor library, how to write your first queries with the Gremlin language, how to manage vertices’ properties and the first steps to traverse a graph. This blog post explains how to traverse a graph and return vertices that are not at the end of the path.

Until now, we’ve always returned the last step of our query. But that’s not always what we want! In the previous query, we returned the karatekas who defeated the karatekas who defeated my daughter. One issue with this query was that we didn’t know who defeated whom.

We can fix this problem by returning tuples containing the two karatekas corresponding to the vertices in green on this drawing.

To achieve this, we’ll need to give an alias to the selection performed at the third intermediate step (and also to the last one). Gremlin has an operator named as to assign aliases.

The following query is just assigning aliases to the interesting steps:

g.V()
  .has("karateka", "fullName", "Charlier, Alice")
  .out("loses")
  .in("wins")
  .as("loser")
  .out("loses")
  .in("wins")
  .as("winner")

At this point, if you executed the query, the result would be identical to what it was previously. Just assigning aliases is not enough; you also have to instruct Gremlin to make use of them. The simplest usage is to select them. The select operation ensures that the aliased vertices are part of the result of the query.

g.V()
  .has("karateka", "fullName", "Charlier, Alice")
  .out("loses")
  .in("wins")
  .as("loser")
  .out("loses")
  .in("wins")
  .as("winner")
  .select("winner", "loser")

The result of this query is a list of tuples, each containing a loser and a winner:

[
  {
    "loser": {
      "id": "karateka.52",
      "label": "karateka",
      "type": "vertex",
      "properties": {
        ...
        "fullName": [
          {
            "id": "3dec83c1-5f2e-41d4-98e9-8371cf6017b4",
            "value": "B****, Rania"
          }
        ],
        ...
      }
    },
    "winner": {
      "id": "karateka.59",
      "label": "karateka",
      "type": "vertex",
      "properties": {
        ...
        "fullName": [
          {
            "id": "44fdc18a-a3e7-47ba-8ce5-d7599548c267",
            "value": "S****, Kiara"
          }
        ],
        ...
      }
    }
  },
  {
    "loser": {
      ...
    },
    "winner": {
      ...
    }
  },
  ...
]

Returning the whole list of properties is usually not what we want. Most of the time we’ll only be interested in a subset of the properties, in this case the fullName. To project the vertices to a single property, we can use the by operation provided by Gremlin.

g.V()
  .has("karateka", "fullName", "Charlier, Alice")
  .out("loses")
  .in("wins")
  .as("loser")
  .out("loses")
  .in("wins")
  .as("winner")
  .select("winner", "loser")
  .by("fullName")

This query is limiting the result to the full names of the losers and the winners:

[
{
"loser": "B****, Rania",
"winner": "S****, Kiara"
},
...
{
"loser": "P****, Denisa",
"winner": "D*****, Louise"
}
]
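
To make the roles of as, select and by more concrete, here is a minimal in-memory sketch in Python. It is not Gremlin and certainly not how Azure Cosmos DB evaluates the query: the toy graph, the vertex ids and the function names are all invented for this illustration. The idea it shows is that each traverser remembers the vertices it labelled on its way, and select reads them back at the end.

```python
# Hypothetical toy graph: (source, label, target) edges between karatekas and bouts.
EDGES = [
    ("alice", "loses", "bout1"), ("rania", "wins", "bout1"),
    ("rania", "loses", "bout2"), ("kiara", "wins", "bout2"),
]
FULL_NAME = {
    "alice": "Charlier, Alice",
    "rania": "B****, Rania",
    "kiara": "S****, Kiara",
}


def out(vertex, label):
    """Follow the outgoing edges having this label (like out(label))."""
    return [t for s, l, t in EDGES if s == vertex and l == label]


def in_(vertex, label):
    """Follow the incoming edges having this label (like in(label))."""
    return [s for s, l, t in EDGES if t == vertex and l == label]


def winners_with_losers(start):
    """Mimic out/in/as/out/in/as/select/by on the toy graph."""
    results = []
    for bout in out(start, "loses"):
        for loser in in_(bout, "wins"):            # .as("loser")
            for next_bout in out(loser, "loses"):
                for winner in in_(next_bout, "wins"):  # .as("winner")
                    labelled = {"loser": loser, "winner": winner}
                    # select("winner", "loser").by("fullName")
                    results.append({k: FULL_NAME[v] for k, v in labelled.items()})
    return results


print(winners_with_losers("alice"))
```

Without the dictionary of labelled vertices, only the innermost loop variable (the winner) would survive, which is exactly the limitation of the previous query.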

The next blog post will be about filtering during the traversal of a graph.

Basic traversing of a graph with the Gremlin language

In previous blog posts, I explained the use-case, how to load data into an Azure Cosmos DB graph database with the bulk executor library, how to write your first queries with the Gremlin language and how to manage vertices’ properties. This blog post explains how to jump from a vertex to an adjacent vertex using the existing edges.

You’ll have to learn a few additional operations to traverse a graph. The first set of operations is inE and outE. These two operations let you select all the edges respectively ending at and starting from the selected vertex. The example below shows the result (in green) of the operation outE for a given vertex.

The following query is returning all the edges having the label participates and starting from the vertex representing a given karateka.

g.V().has("karateka", "fullName", "Charlier, Alice").outE("participates")

The result of this query is a list of edges. From this result we can see that the starting node is always the given karateka and that each edge is linking to a bout.

[
{
"id": "4a33a619-593a-469e-88ff-983e951149ed",
"label": "participates",
"type": "edge",
"inVLabel": "bout",
"outVLabel": "karateka",
"inV": "bout.120",
"outV": "karateka.70",
"properties": {
"color": "aka"
}
},
{
"id": "517e4b17-95ea-4ad0-a74c-45c083587a46",
"label": "participates",
"type": "edge",
"inVLabel": "bout",
"outVLabel": "karateka",
"inV": "bout.116",
"outV": "karateka.70",
"properties": {
"color": "aka"
}
},
… (many more objects)

The same kind of operations exist for selecting vertices being the end or the start of a selected edge. These functions are named outV and inV.

The following query starts from a karateka, then jumps to the edges having the label participates and then jumps to all the vertices at the end of the previously selected edges.

g.V()
  .has("karateka", "fullName", "Charlier, Alice")
  .outE("participates")
  .inV()

The result of this query is a list of bouts (based on our previously defined graph schema).

[
{
"id": "bout.120",
"label": "bout",
"type": "vertex",
"properties": {
"pk": [
{
"id": "bout.120|pk",
"value": 1
}
]
}
},
{
"id": "bout.116",
"label": "bout",
"type": "vertex",
"properties": {
"pk": [
{
"id": "bout.116|pk",
"value": 1
}
]
}
},

Most of the time, you don’t really want to select the edges. They are just a means to go from one vertex to an adjacent vertex. For convenience, Gremlin supports two operations, out and in. They are respectively equivalent to outE followed by inV and to inE followed by outV.

The following drawing shows that, starting from the black vertex and using an in operation, you’ll directly select the three adjacent vertices.
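
If the relation between the four edge-level operations and the two shortcuts is not obvious, the following Python sketch may help. It is only an in-memory imitation: the edge list is made up (loosely modelled on the JSON shown earlier, where outV is the start of an edge and inV its end), and the function names are mine, not an actual Gremlin API.

```python
# Hypothetical edges: outV is the vertex the edge starts from, inV the one it ends at.
EDGES = [
    {"label": "participates", "outV": "karateka.70", "inV": "bout.120"},
    {"label": "participates", "outV": "karateka.70", "inV": "bout.116"},
    {"label": "wins", "outV": "karateka.52", "inV": "bout.120"},
]


def out_e(vertex, label):
    """Edges starting from the vertex (like outE)."""
    return [e for e in EDGES if e["outV"] == vertex and e["label"] == label]


def in_e(vertex, label):
    """Edges ending at the vertex (like inE)."""
    return [e for e in EDGES if e["inV"] == vertex and e["label"] == label]


def in_v(edges):
    """Vertices at the end of the edges (like inV)."""
    return [e["inV"] for e in edges]


def out_v(edges):
    """Vertices at the start of the edges (like outV)."""
    return [e["outV"] for e in edges]


def out(vertex, label):
    """Shortcut: outE followed by inV."""
    return in_v(out_e(vertex, label))


def in_(vertex, label):
    """Shortcut: inE followed by outV."""
    return out_v(in_e(vertex, label))


print(out("karateka.70", "participates"))
print(in_("bout.120", "wins"))
```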

A good use of graph traversal is to find the names of all the karatekas who defeated a given karateka. To write this query, we first select the initial karateka, then follow the edges labelled loses to reach all the bouts where this karateka was defeated. From this list of bouts, we just need to follow the edges labelled wins to find the winners.

The following query applies this pattern:

g.V()
  .has("karateka", "fullName", "Charlier, Alice")
  .out("loses")
  .in("wins")

The result of this query is a list of karatekas. If we just want to display their names, we can add a values operation.

g.V()
  .has("karateka", "fullName", "Charlier, Alice")
  .out("loses")
  .in("wins")
  .values("fullName")

The result will be a list of strings (results have been anonymized for GDPR reasons):

[
"B****, Rania",
"H****, Maëlie",
"P****, Denisa"
]

We can go a bit further and check if these karatekas have already been defeated by someone or not. To achieve this, I’ll apply the exact same pattern and use the edges loses and wins from the second karateka.

g.V()
  .has("karateka", "fullName", "Charlier, Alice")
  .out("loses")
  .in("wins")
  .out("loses")
  .in("wins")
  .values("fullName")

The result is also a list of strings:

[
"D****, Louise",
"F****, Julia",
"F****, Julia",
"S****, Kiara",
"K****, Lee-lou",
"Z****, Manon",
"E****, Lena",
"A****, Ezdina",
"B****, Margaux",
"B****, Zineb",
"B****, Rania",
"D****, Louise"
]

As you can see, the first and last names are identical, and so are the second and third. The reason is that each of these two karatekas defeated one of the three karatekas listed above twice (or two of them once). It’s really important to understand that Gremlin doesn’t automatically deduplicate the vertices. If you want to do so, just use the function dedup:

g.V()
  .has("karateka", "fullName", "Charlier, Alice")
  .out("loses")
  .in("wins")
  .out("loses")
  .in("wins")
  .dedup()
  .values("fullName")

Note that I applied the function dedup to the vertices and not to the property fullName. The reason is to avoid treating two karatekas who are just homonyms as duplicates.
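
The difference between deduplicating vertices and deduplicating names can be illustrated with a small Python sketch. The vertices below are invented (two distinct karatekas sharing the same name, one of them reached twice), and dedup here is my own helper, not the Gremlin step itself.

```python
from collections import namedtuple

# Hypothetical vertices: two different karatekas who happen to be homonyms.
Vertex = namedtuple("Vertex", ["id", "full_name"])


def dedup(items):
    """Keep the first occurrence of each item, preserving the order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


reached = [
    Vertex("karateka.1", "Dupont, Marie"),
    Vertex("karateka.2", "Dupont, Marie"),  # a homonym, not a duplicate
    Vertex("karateka.1", "Dupont, Marie"),  # the first karateka, reached again
]

# Equivalent in spirit to .dedup().values("fullName")
names_after_vertex_dedup = [v.full_name for v in dedup(reached)]
# Equivalent in spirit to .values("fullName").dedup()
names_after_name_dedup = dedup([v.full_name for v in reached])

print(names_after_vertex_dedup)
print(names_after_name_dedup)
```

Deduplicating the vertices first keeps both homonyms, while deduplicating the names merges them into a single entry.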

The next blog post will be about the step modulator … and how to return a result where the selected vertices are not the last vertices traversed by the query!

Querying, adding or dropping vertices’ properties with the Gremlin language

In previous blog posts, I explained the use-case, how to load data into an Azure Cosmos DB graph database with the bulk executor library but also how to write your first queries with the Gremlin language. This blog post explains how to add or remove properties on a vertex.

Properties are always attached to a vertex or an edge and give some information about this vertex or edge. Using Azure Cosmos DB, you’ll always have a property named pk that is mandatory and automatically generated by Azure Cosmos DB:

      "pk": [
        {
          "id": "karateka.70|pk",
          "value": 1
        }
      ],

pk stands for partition key. Partition keys are an important feature of Azure Cosmos DB but I’ll explain them in a future dedicated post.

Other properties always have the same structure. The name of the JSON field is the name of the Gremlin property. Each JSON field contains an array of values for this property. Indeed, in Gremlin each property is multi-valued; we’ll come back to this later in this post. Each property value has an id, a GUID automatically assigned by the Azure Cosmos DB engine, and a second field named value holding the actual value of the property.

      "fullName": [
        {
          "id": "5c4a9e2e-4c62-41ab-9732-9bd39c0a6837",
          "value": "Charlier, Alice"
        }
      ],
      "firstName": [
        {
          "id": "b5c99f14-2bd2-4700-a42f-3bf97042e351",
          "value": "Alice"
        }
      ],
      "lastName": [
        {
          "id": "5cef6756-8ddc-45bf-95b9-3ee858aa0a12",
          "value": "Charlier"
        }
      ]

If you want to add a property to a vertex, you don’t need to perform any explicit change to the underlying schema. Graph databases are schema-less, which means that you can quickly add a new property by just specifying it. Naturally, this has some drawbacks: any typo could create a new property instead of editing an existing one.

To add a property to a vertex, you first need to select this vertex as explained in the previous blog post and then apply the operator property. This operation expects the property’s name and the property’s value as parameters:

g.V()
  .has("karateka", "fullName", "Charlier, Alice")
  .property("nationality", "Belgian")

Executing this query will return the selected vertex and you’ll quickly see that the property has been added.

[
  {
    "id": "karateka.70",
    "label": "karateka",
    "type": "vertex",
    "properties": {
      "pk": [
        {
          "id": "karateka.70|pk",
          "value": 1
        }
      ],
      ...
      "nationality": [
        {
          "id": "f0387c84-42e4-43ea-badb-7bfecc75443e",
          "value": "Belgian"
        }
      ]
    }
  }
]

It’s possible to add a property to more than one vertex. To achieve this, just select multiple vertices and define the common property. In the example below, I’m assigning the same birth date to my twin daughters.

g.V()
  .has("karateka", "fullName", within(
      ["Charlier, Alice", "Charlier, Clémence"]))
  .property("birthDate", "2010-07-04")

As explained above, Gremlin natively supports multi-valued properties. If you want to define a property having more than one value, you’ll have to pass the keyword list as the first parameter of the operator property. The next query adds a few middle names to a karateka.

g.V()
  .has("karateka", "fullName", "Charlier, Alice")
  .property(list, "middleName", "Coralie")
  .property(list, "middleName", "Florence")

The result is the following:

     "middleName": [
        {
          "id": "a4ec9a1d-caf0-4ba5-9611-eae448f6c684",
          "value": "Coralie"
        },
        {
          "id": "ba1ce36a-1369-4c30-a4b2-08391c9b5a30",
          "value": "Florence"
        }
      ]
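
As a rough mental model of what happens with the list keyword, here is a small Python sketch. It only mimics the JSON structure shown above: the helper add_list_property is hypothetical, the ids are random UUIDs generated locally, and nothing here reflects how Azure Cosmos DB stores the values internally.

```python
import uuid


def add_list_property(vertex, name, value):
    """Append a value to a multi-valued property, each value getting its own id."""
    entry = {"id": str(uuid.uuid4()), "value": value}
    vertex["properties"].setdefault(name, []).append(entry)
    return vertex


# Hypothetical vertex, reduced to the fields needed for the illustration.
alice = {"id": "karateka.70", "label": "karateka", "properties": {}}
add_list_property(alice, "middleName", "Coralie")
add_list_property(alice, "middleName", "Florence")

middle_names = [entry["value"] for entry in alice["properties"]["middleName"]]
print(middle_names)
```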

To return the values of some properties for a given vertex, you’ll need to apply the operator values and specify the properties’ names.

g.V()
  .has("karateka", "fullName", "Charlier, Alice")
  .values("lastName", "firstName")

This query returns a really flat and compact JSON document with just the values of the properties.

[
  "Charlier",
  "Alice"
]

When selecting multiple vertices, this operator is less useful because the values are not grouped by vertex and you don’t know the mapping between the property and the value. If you want a more explicit view of the properties, you can use the operator valueMap.

g.V()
  .has("karateka", "lastName", "Charlier")
  .valueMap("nationality", "firstName", "lastName")

The result of this query is an explicit JSON document, listing all the requested properties and grouping them by vertex. If a property doesn’t exist for a given vertex, it’s not an issue: it is simply absent from the result for that vertex.

[
  {
    "nationality": [
      "Belgian"
    ],
    "firstName": [
      "Alice"
    ],
    "lastName": [
      "Charlier"
    ]
  },
  {
    "firstName": [
      "Clémence"
    ],
    "lastName": [
      "Charlier"
    ]
  }
]
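
To see why the flat output of values becomes ambiguous with several vertices, here is a short Python imitation of both operators. The two dictionaries reproduce the data of the result above, but the functions values and value_map are simplified stand-ins of my own; in particular, the order in which a real engine returns the values may differ.

```python
# The two vertices from the example, reduced to their property values.
VERTICES = [
    {"firstName": ["Alice"], "lastName": ["Charlier"], "nationality": ["Belgian"]},
    {"firstName": ["Clémence"], "lastName": ["Charlier"]},
]


def values(vertices, *names):
    """Flat list of values: no grouping by vertex and no property names."""
    return [v for vertex in vertices for name in names for v in vertex.get(name, [])]


def value_map(vertices, *names):
    """One dictionary per vertex, skipping the properties the vertex doesn't have."""
    return [{name: vertex[name] for name in names if name in vertex} for vertex in vertices]


flat = values(VERTICES, "nationality", "firstName", "lastName")
grouped = value_map(VERTICES, "nationality", "firstName", "lastName")
print(flat)
print(grouped)
```

In the flat list, nothing tells you that the only nationality belongs to Alice and not to Clémence; the grouped version makes that explicit.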

If you want to remove a property, you’ll have to select it and then drop it.

g.V()
  .has("karateka", "fullName", "Charlier, Alice")
  .properties("birthDate")
  .drop()

It’s also possible to drop several properties on several vertices. If the properties don’t exist, no exception is raised: once again, a benefit of schema-less databases.

g.V()
  .has("karateka", "fullName", 
     within(["Charlier, Alice", "Charlier, Clémence"]))
  .properties("middleName", "birthDate")
  .drop()
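
The tolerant behaviour of this query can be pictured with a few lines of Python. This is only an analogy on plain dictionaries (the vertices and the helper drop_properties are invented for the example), not a description of the engine.

```python
def drop_properties(vertices, *names):
    """Remove the named properties from every vertex, ignoring the missing ones."""
    for vertex in vertices:
        for name in names:
            vertex.pop(name, None)  # nothing to drop: no error raised
    return vertices


# Hypothetical vertices: only the first one has middle names.
alice = {"firstName": "Alice", "middleName": ["Coralie", "Florence"], "birthDate": "2010-07-04"}
clemence = {"firstName": "Clémence", "birthDate": "2010-07-04"}

drop_properties([alice, clemence], "middleName", "birthDate")
print(alice, clemence)
```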

Exactly the same approach can be applied to an edge to add or drop properties!

The next step in this series of blog posts is to traverse the graph, jumping from one vertex to another using the edges.

First steps with Gremlin to query a graph loaded in Azure Cosmos DB

In previous blog posts, I explained the use-case and also how to load data into an Azure Cosmos DB graph database with the bulk executor library. This blog post explains how to perform basic queries with Gremlin on this dataset.

You can use two tools to query a graph hosted by Azure Cosmos DB. The first tool is embedded in the Azure Portal and is named Data Explorer.

cosmos-data-explorer

The second tool is a freely available extension of Visual Studio Code named Azure Cosmos DB. This extension lets you browse and query your MongoDB databases both locally and in the cloud using scrapbooks, but also write queries in Gremlin and display the results as a graph or as JSON documents.

cosmos-visual-code-extension

Before we go further, a little bit of theory about the Gremlin query language.

  • Each query starts with g
  • V() stands for vertices and returns one or more vertices
  • E() stands for edges and returns one or more edges
  • hasLabel("label") filters vertices/edges based on their label (type)
  • hasId("id") filters vertices (or edges) based on their id (must be unique)
  • has("propertyName", "value") filters vertices based on the value of any property

With these few elements, it’s already possible to write some interesting queries. The first two queries respectively retrieve all the bouts and all the karatekas:

g.V().hasLabel("bout")
g.V().hasLabel("karateka")

If you want to retrieve a specific karateka and you know her id, you can use either of the two variants below. Both return a single result.

g.V().hasId("karateka.1") 
g.V("karateka.1")

The graph view doesn’t really add any value here:

cosmos-query-id-graph

But the JSON view offers the opportunity to confirm that, when returning a vertex, we’re also returning each of its properties!

cosmos-query-id-json

Most of the time, you don’t know the id of a vertex and you’ll need to perform a search through the graph to find it. In the first example, we’re looking for a karateka named Alice and in the second, we’re looking for two karatekas.

g.V().has("karateka", "fullName", "Charlier, Alice") 
g.V().has("karateka", "fullName", 
   within(["Charlier, Alice", "Charlier, Clémence"]))

The first query is identical in terms of result to the following query:

g.V().has("fullName", "Charlier, Alice")

But the first version performs better. Indeed, by specifying that you’re looking for a karateka, the engine can avoid searching through all the vertices that don’t have the label karateka. It’s a best practice to always specify this information when possible.

In the next blog post, we’ll see how to add or remove properties on a vertex or an edge.

Automating the testing of gremlin queries with NBi

For five years, you have been able to run automated tests on your SQL Server databases and SSAS cubes with the help of the open-source framework NBi. This framework was extended to support other relational databases (using the OleDb or ODBC drivers) in 2017. In 2018, I’m adding support for graph databases. Earlier this year, I released an extension for the Neo4j bolt driver supporting Cypher queries and another compatible with some of the Azure Cosmos DB APIs, notably the Graph API. This new extension supports all the Apache Tinkerpop-enabled databases and is tested on Azure Cosmos DB and Tinkerpop Server, but should also work with other graph databases supporting gremlin such as JanusGraph, Titan and OrientDB. The code for this extension is hosted on GitHub. Note that Azure Cosmos DB now recommends using the Gremlin driver in place of the Graph API dlls, so this extension should also be your first target if you want to connect to an Azure Cosmos DB database and run gremlin queries.

If you’re used to NBi, you can directly jump to the next paragraph. For newcomers, NBi is an open-source framework dedicated to the testing of BI solutions: it supports complex comparisons of query results, asserting conditions on each row of a result-set, running ETL (SSIS), validating queries from reports (SSRS) and many other possibilities. On top of this, it has an advanced feature to automate the process of creating test-suites.

To set up an environment for testing some gremlin queries, you must download version 1.18 of NBi and version 1.0 of its extension for Gremlin. Unzip the content of the NBi download and then partially override it with the content of the extension NBi.Gremlin (dlls from both packages must be deployed in a single directory). If NUnit 2.6.4 is not available on your computer, don’t forget to download it. If you need more info about how to set up an NBi extension, read the documentation.

When NBi and its extension are installed, create your test-suite with its config and NUnit project file as you’d normally do for any NBi test-suite (or, if you’re in a hurry, download the files with the examples here, but don’t forget to update the connection-string when trying to run it). If you need more info about how to set up a test-suite, read more here.

For this example, I started a Tinkerpop Server and created some vertices and edges with:

g.V().drop()
g.addV('person').property('id', 'thomas')
  .property('firstName', 'Thomas').property('age', 44)
g.addV('person').property('id', 'mary')
  .property('firstName', 'Mary')
  .property('lastName', 'Andersen')
  .property('age', 39)
g.addV('person').property('id', 'ben')
  .property('firstName', 'Ben')
  .property('lastName', 'Miller')
g.addV('person').property('id', 'robin')
  .property('firstName', 'Robin')
  .property('lastName', 'Wakefield')
g.V().has('firstName','Thomas').addE('knows')
  .to(g.V().has('firstName','Mary'))
g.V().has('firstName','Thomas').addE('knows')
  .to(g.V().has('firstName','Ben'))
g.V().has('firstName','Ben').addE('knows')
  .to(g.V().has('firstName','Robin'))

When the environment is correctly configured, you must edit your config file to reference the extension NBi.Core.Gremlin.

<configuration>
  <configSections>
    <section name="nbi" type="NBi.NUnit.Runtime.NBiSection, NBi.NUnit.Runtime"/>
  </configSections>
  <nbi testSuite="...">
    <extensions>
      <add assembly="NBi.Core.Gremlin"/>
    </extensions>
  </nbi>
</configuration>

You also need to add to your config file all the binding redirections that Azure Cosmos DB comes with. To achieve this, copy/paste the content of the runtime element of NBi.Core.CosmosDb.dll.config into a new runtime element in the config file of your test-suite, FoF.config.

<runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="System.Diagnostics.DiagnosticSource" publicKeyToken="cc7b13ffcd2ddd51" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-4.0.1.0" newVersion="4.0.1.0" />
      </dependentAssembly>
      <dependentAssembly>
        <assemblyIdentity name="System.Net.Http" publicKeyToken="b03f5f7f11d50a3a" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-4.2.0.0" newVersion="4.2.0.0" />
      </dependentAssembly>
      <dependentAssembly>
        <assemblyIdentity name="System.Net.WebSockets" publicKeyToken="b03f5f7f11d50a3a" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-4.0.2.0" newVersion="4.0.1.0" />
      </dependentAssembly>
      <dependentAssembly>
        <assemblyIdentity name="System.Net.WebSockets.Client" publicKeyToken="b03f5f7f11d50a3a" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-4.0.2.0" newVersion="4.0.1.0" />
      </dependentAssembly>
      <dependentAssembly>
        <assemblyIdentity name="Newtonsoft.Json" publicKeyToken="30ad4fe6b2a6aeed" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-11.0.0.0" newVersion="11.0.0.0" />
      </dependentAssembly>
    </assemblyBinding>
</runtime>

The concept of connection-string is not familiar to Tinkerpop or Gremlin but, to match relational databases, we use a concatenation of the usual parameters defined to connect to an instance supporting gremlin queries. You must provide the hostname, the port, the enableSsl flag, the username and the password. You must also add a hint for NBi to identify the correct type of connection-string by adding an api token with the value gremlin. If your database is an Azure Cosmos DB instance, then your username will be /dbs/{db}/colls/{coll} where {db} stands for the name of the database and {coll} for the name of the collection. Also, note that the hostname is a URL within the domain .gremlin.cosmosdb.azure.com.

hostname=https://gremlin-demo.gremlin.cosmosdb.azure.com;
port=443;
enableSsl=true;
username=database=/dbs/Friends/colls/FoF;
password=F...==;
api=gremlin;

The same connection-string for a Tinkerpop Server, running on your computer, will be

hostname=localhost;
port=8182;
enableSsl=false;
username=database=Anything;
password=Anything;
api=gremlin;

Note that username and password are not expected by a Tinkerpop Server. Nevertheless, they can’t be omitted, but any value will do.
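
To make the format of these connection-strings explicit, here is a minimal Python sketch that splits one into named tokens. It is not NBi’s actual parser (NBi is written in .NET and I’m not reproducing its code here); the function parse_connection_string is a hypothetical helper. The one detail worth noticing is that only the first equals sign separates the name from the value, so a value such as database=Anything stays intact.

```python
def parse_connection_string(text):
    """Split 'name=value;name=value;' into a dictionary of tokens."""
    tokens = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue  # ignore the empty piece after the trailing semicolon
        # Only the first '=' is a separator; the value may contain others.
        name, _, value = part.partition("=")
        tokens[name.strip()] = value.strip()
    return tokens


LOCAL = (
    "hostname=localhost;port=8182;enableSsl=false;"
    "username=database=Anything;password=Anything;api=gremlin;"
)
tokens = parse_connection_string(LOCAL)
print(tokens)
```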

In your test-suite, querying a relational database with SQL or a Tinkerpop-enabled database with a gremlin query is no different. You’re still using the result-set and query elements. The gremlin query is specified within the query element. It doesn’t support parameters at the moment but you’re still free to use template-tokens.

Gremlin queries can return complex results such as paths that are not easily mapped to a table (result-set). Currently, NBi’s support for Gremlin is limited to queries returning a list of vertices, a list of edges or a list of values.

In this first test, NBi asserts that the query returning Thomas’ friends of friends actually returns the correct result: in this case, a single vertex representing a person named Robin.

FoF-Thomas-Robin

<test name="Thomas' friends of friends" uid="0001">
    <system-under-test>
      <resultSet>
        <query>
          g.V().has('firstName', 'Thomas').outE('knows').inV().hasLabel('person').outE('knows').inV().hasLabel('person')
        </query>
      </resultSet>
    </system-under-test>
    <assert>
      <equalTo values-default-type="text">
        <resultSet>
          <row>
            <cell>robin</cell>
            <cell>person</cell>
            <cell>vertex</cell>
            <cell>Robin</cell>
            <cell>Wakefield</cell>
          </row>
        </resultSet>
      </equalTo>
    </assert>
</test>

In this second test, we’re asserting that all the vertices have a property age that is either greater than 20 or missing.

<test name="All of them are older than 20 (or unknown)" uid="0002">
    <system-under-test>
      <resultSet>
        <query>
          g.V()
        </query>
      </resultSet>
    </system-under-test>
    <assert>
      <all-rows>
        <combination operator="or">
          <predicate operand="age">
            <null/>
          </predicate>
          <predicate operand="age">
            <more-than>20</more-than>
          </predicate>
        </combination>
      </all-rows>
    </assert>
</test>
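
The logic of this assertion (every row must satisfy at least one of the two predicates) can be spelled out in a few lines of Python. This is only a paraphrase of the test on the sample data created above, with a helper name of my own; it doesn’t use NBi in any way.

```python
# The four persons created earlier; Ben and Robin have no age.
ROWS = [
    {"id": "thomas", "age": 44},
    {"id": "mary", "age": 39},
    {"id": "ben", "age": None},
    {"id": "robin", "age": None},
]


def age_is_null_or_more_than(row, threshold):
    """Combination with operator 'or': null predicate, or more-than predicate."""
    age = row.get("age")
    return age is None or age > threshold


# all-rows: the combined predicate must hold for every row.
all_rows_pass = all(age_is_null_or_more_than(row, 20) for row in ROWS)
print(all_rows_pass)
```

With a threshold of 40 instead of 20, the same check would fail because of Mary (39), while Ben and Robin would still pass thanks to the null predicate.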

And finally, in this last test, we’re checking that the result-set returned by a projection matches the expected result.

  <test name="Count of person's relationships" uid="0003">
    <system-under-test>
      <resultSet>
        <query>
          g.V().project('FirstName','KnowsCount').by('firstName').by(out().count())
        </query>
      </resultSet>
    </system-under-test>
    <assert>
      <equalTo>
        <resultSet>
          <row>
            <cell>Mary</cell>
            <cell>0</cell>
          </row>
          <row>
            <cell>Robin</cell>
            <cell>0</cell>
          </row>
          <row>
            <cell>Ben</cell>
            <cell>1</cell>
          </row>
          <row>
            <cell>Thomas</cell>
            <cell>2</cell>
          </row>
        </resultSet>
      </equalTo>
    </assert>
  </test>
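
To check the expected counts by hand, the projection can be replayed in Python on the three knows edges created at the beginning of this post. The data structures and the function below are mine and simply imitate project('FirstName','KnowsCount').by('firstName').by(out().count()) for this tiny graph.

```python
# Sample data from the setup script: four persons and three 'knows' edges.
FIRST_NAME = {"thomas": "Thomas", "mary": "Mary", "ben": "Ben", "robin": "Robin"}
KNOWS = [("thomas", "mary"), ("thomas", "ben"), ("ben", "robin")]


def project_knows_count():
    """For each person, return the first name and the number of outgoing edges."""
    return [
        {
            "FirstName": name,
            "KnowsCount": sum(1 for source, _ in KNOWS if source == vertex_id),
        }
        for vertex_id, name in FIRST_NAME.items()
    ]


rows = project_knows_count()
print(sorted(rows, key=lambda row: row["KnowsCount"]))
```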

FoF-running-test-suite

Our test-suite is green … time to write some additional tests!

To learn more about NBi, check the website at www.nbi.io, and if you have any questions or suggestions, feel free to ask on Twitter or on the GitHub repository (section “issues”). The code of this extension is also available on GitHub.

Automating the testing of an Azure Cosmos DB instance with NBi

For five years, you have been able to run automated tests on your SQL Server databases and SSAS cubes with the help of the open-source framework NBi. This framework was extended to support other relational databases (using the OleDb or ODBC drivers) in 2017. To give a really good start to the year 2018, NBi now supports Azure Cosmos DB databases using the Graph API (gremlin queries)!

If you’re used to NBi, you can directly jump to the next paragraph. For newcomers, NBi is an open-source framework dedicated to the testing of BI solutions: it supports complex comparisons of query results, asserting conditions on each row of a result-set, running ETL (SSIS), validating queries from reports (SSRS) and many other possibilities. On top of this, it has an advanced feature to automate the process of creating test-suites.

To set up an environment for testing some gremlin queries through the Graph API of Azure Cosmos DB, you must download version 1.18 of NBi and version 1.0 of its extension for Gremlin. Unzip the content of the NBi download and then partially override it with the content of the extension NBi.CosmosDb (dlls from both packages must be deployed in a single directory). If NUnit 2.6.4 is not available on your computer, don’t forget to download it. If you need more info about how to set up an NBi extension, read the documentation.

When NBi and its extension are installed, create your test-suite with its config and NUnit project file as you’d normally do for any NBi test-suite (or, if you’re in a hurry, download the files with the examples here, but don’t forget to update the connection-string when trying to run it). If you need more info about how to set up a test-suite, read more here.

For this example, I created a database on Azure Cosmos DB (named Friends) containing a graph (named FoF). I created some vertices and edges with:

g.V().drop()
g.addV('person').property('id', 'thomas')
  .property('firstName', 'Thomas')
  .property('age', 44)
g.addV('person').property('id', 'mary')
  .property('firstName', 'Mary')
  .property('lastName', 'Andersen')
  .property('age', 39)
g.addV('person').property('id', 'ben')
  .property('firstName', 'Ben')
  .property('lastName', 'Miller')
g.addV('person').property('id', 'robin')
  .property('firstName', 'Robin')
  .property('lastName', 'Wakefield')
g.V('thomas').addE('knows').to(g.V('mary'))
g.V('thomas').addE('knows').to(g.V('ben'))
g.V('ben').addE('knows').to(g.V('robin'))

When the environment is correctly configured, you must edit your config file to reference the extension NBi.Core.CosmosDb.

<configuration>
  <configSections>
    <section name="nbi" type="NBi.NUnit.Runtime.NBiSection, NBi.NUnit.Runtime"/>
  </configSections>
  <nbi testSuite="...">
    <extensions>
      <add assembly="NBi.Core.CosmosDb"/>
    </extensions>
  </nbi>
</configuration>

You also need to add to your config file all the binding redirections that Azure Cosmos DB comes with. To achieve this, copy/paste the content of the runtime element of NBi.Core.CosmosDb.dll.config into a new runtime element in the file FoF.config.

The concept of connection-string is not familiar to Azure Cosmos DB but, to match relational databases, we use a concatenation of the usual parameters defined to connect to an Azure Cosmos DB instance. You must provide the endpoint, the authkey, the database id and the collection id, concatenated into named tokens separated by semicolons. For the collection id, you must use the name graph and not collection, in anticipation of NBi’s compatibility with other APIs supported by Cosmos DB.

Endpoint=https://gremlin-demo.documents.azure.com:443;AuthKey=F...==;database=Friends;graph=FoF

In your test-suite, querying a relational database or your Azure Cosmos DB database with the Graph API is no different. You’re still using the result-set and query elements. The gremlin query is specified within the query element. It doesn’t support parameters at the moment but you’re still free to use template-tokens.

Graph queries can return complex results such as paths that are not easily mapped to a table (result-set). Currently, NBi’s support for Cosmos DB is limited to queries returning a list of vertices, a list of edges or a list of values.

In this first test, NBi asserts that the query returning Thomas’ friends of friends actually returns the correct result: in this case, a single vertex representing a person named Robin.

FoF-Thomas-Robin

<test name="Thomas' friends of friends" uid="0001">
  <system-under-test>
    <resultSet>
      <query>
        g.V('thomas').outE('knows').inV().hasLabel('person').outE('knows').inV().hasLabel('person')
      </query>
    </resultSet>
  </system-under-test>
  <assert>
    <equalTo values-default-type="text">
      <resultSet>
        <row>
          <cell>robin</cell>
          <cell>person</cell>
          <cell>vertex</cell>
          <cell>Robin</cell>
          <cell>Wakefield</cell>
        </row>
      </resultSet>
    </equalTo>
  </assert>
</test>

In this second test, we’re asserting that all the vertices have a correctly capitalized first name:

<test name="All first names are correctly capitalized" uid="0002">
  <system-under-test>
    <resultSet>
      <query>
        g.V()
      </query>
    </resultSet>
  </system-under-test>
  <assert>
    <all-rows>
      <predicate operand="firstName" type="text">
        <matches-regex>^([A-Z][a-z]+)$</matches-regex>
      </predicate>
    </all-rows>
  </assert>
</test>
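
If you want to try the regular expression of this test outside NBi, the following Python sketch applies it to the four first names of the sample graph. NBi relies on the .NET regular expression engine, not on Python’s re module; for a pattern as simple as this one I’m assuming both behave the same way.

```python
import re

# Same pattern as in the matches-regex element of the test.
CAPITALIZED = re.compile(r"^([A-Z][a-z]+)$")

FIRST_NAMES = ["Thomas", "Mary", "Ben", "Robin"]


def is_capitalized(name):
    """True when the name is one uppercase letter followed by lowercase letters."""
    return CAPITALIZED.match(name) is not None


all_capitalized = all(is_capitalized(name) for name in FIRST_NAMES)
print(all_capitalized)
```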

And finally, in this last test, we’re checking that the result-set returned by a projection matches the expected result.

<test name="Count of person's relationships" uid="0003">
  <system-under-test>
    <resultSet>
      <query>
        g.V().project('FirstName','KnowsCount').by('firstName').by(out().count())
      </query>
    </resultSet>
  </system-under-test>
  <assert>
    <equalTo>
      <resultSet>
        <row>
          <cell>Mary</cell>
          <cell>0</cell>
        </row>
        <row>
          <cell>Robin</cell>
          <cell>0</cell>
        </row>
        <row>
          <cell>Ben</cell>
          <cell>1</cell>
        </row>
        <row>
          <cell>Thomas</cell>
          <cell>2</cell>
        </row>
      </resultSet>
    </equalTo>
  </assert>
</test>

FoF-running-test-suite

Our test-suite is green … time to write new tests!

To learn more about NBi, check the website at www.nbi.io, and if you have any questions or suggestions, feel free to ask on Twitter or on GitHub. The code of this extension is also available on GitHub.