Index

Ultipa leverages an indexing mechanism to accelerate access to meta-data (nodes or edges), comparable to that of traditional databases. On top of that, the Ultipa index also accelerates A-B queries, k-hop queries and other path-based queries.

Ultipa provides a graph-native full-text search engine, which is much faster than Java-based Solr/Lucene or ES (Elasticsearch) frameworks. It searches long text data against given search strings in both meta-data-oriented and path-oriented queries; the latter is unprecedented.

Template-based full-text search is unique to the Ultipa Graph system and enables highly intelligent searches. See the examples at the end of this chapter.

This chapter introduces operations on the general index and the full-text index, all of which are case-insensitive.

Index vs. LTE

The Ultipa index and LTE (Load-To-Engine, introduced in chapter Instance, GraphSet and Property) both aim to accelerate queries, and both consume a certain amount of disk space so that their data are persistent and not lost over a reboot. They differ in the following aspects:

Aspect | Index | LTE
Scope | Metadata query | Path-based query
Principle | Create index trees in persistent storage | Load properties to in-RAM computing engine
Disk Usage | Mainly on disk | Both in RAM and on disk

Diff 1: Implementation Scope

The main operations that the general index accelerates are meta-data queries, while LTE, on the other hand, accelerates path-based queries with 1-step depth or deeper.

The general rule of thumb is: if both LTE and general index are created, path-based query will use LTE first, and the meta-data query will use index first.

Diff 2: Implementation Principle

An index creates index trees in persistent storage (that is, on disk); therefore, a query that leverages an index does NOT place a burden on memory.

LTE, on the other hand, loads properties into Ultipa's graph computing engine (in-RAM) to lower disk-based I/O dependency.

Diff 3: RAM/Disk Usage

Index is a disk-based data structure that mainly dwells on disk, with limited occupation of memory space.

As LTE loads data into memory, the LTE-ed data consume memory space while also keeping a persistent counterpart on disk, so that LTE information is not lost over a system reboot.

Basic Operations

Users with corresponding privileges can create, remove and list indexes based on their needs.

createIndex()

Type | Components
Command | createIndex()
Parameters | node_property(<>) or edge_property(<>), fulltext().name(<>)
Return | (operational status)

Values of parameter:

Name | Data Type | Specification | Description
node_property | string | node property | (1/2) The node property to be used for indexing; does not coexist with edge_property()
edge_property | string | edge property | (2/2) The edge property to be used for indexing; does not coexist with node_property()
fulltext | / | / | (Optional) To create fulltext index
name | string | (follows the naming convention of property) | (Used with fulltext()) The reference name of fulltext index

Example 1: Create general index for node property 'age'

createIndex().node_property(age)

Example 2: Create general index for edge property 'rank'

createIndex().edge_property(rank)

Example 3: Create fulltext index for node property 'age' and name it as 'nodeAge'

createIndex().fulltext().node_property(age).name("nodeAge")

Parameters fulltext() and name() must be used together when creating fulltext index.
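
The parameter rules above can be sketched as a small command builder. This is a minimal illustration only: build_create_index is a hypothetical helper, not part of any Ultipa SDK, and it merely composes the uQL strings shown in Examples 1 to 3 while enforcing that node_property() and edge_property() do not coexist and that fulltext() is always paired with name().

```python
def build_create_index(node_property=None, edge_property=None, fulltext_name=None):
    """Compose a createIndex() uQL string.

    Hypothetical helper for illustration; it only mirrors the syntax
    shown in this chapter and does not talk to a server.
    """
    # node_property() and edge_property() do not coexist: exactly one is required.
    if (node_property is None) == (edge_property is None):
        raise ValueError("specify exactly one of node_property or edge_property")

    parts = ["createIndex()"]

    # fulltext() and name() are always emitted together.
    if fulltext_name is not None:
        parts.append("fulltext()")

    if node_property is not None:
        parts.append(f"node_property({node_property})")
    else:
        parts.append(f"edge_property({edge_property})")

    if fulltext_name is not None:
        parts.append(f'name("{fulltext_name}")')

    return ".".join(parts)


print(build_create_index(node_property="age"))
print(build_create_index(edge_property="rank"))
print(build_create_index(node_property="age", fulltext_name="nodeAge"))
```

Passing both node_property and edge_property, or neither, raises a ValueError instead of producing an invalid command.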

dropIndex()

As indexes consume disk and memory space, remove any index that is no longer in use.

Type | Components
Command | dropIndex()
Parameters | node_property(<>) or edge_property(<>) or fulltext().name(<>)
Return | (operational status)

Values of parameter:

Name | Data Type | Specification | Description
node_property | string | node property | (1/3) To remove the general index of a particular node property; does not coexist with edge_property() or fulltext()
edge_property | string | edge property | (2/3) To remove the general index of a particular edge property; does not coexist with node_property() or fulltext()
fulltext | / | / | (3/3) To remove a fulltext index; does not coexist with node_property() or edge_property()
name | string | (follows the naming convention of property) | (Used with fulltext()) The reference name of fulltext index

Example 1: Delete general index for node property 'age'

dropIndex().node_property(age);

Example 2: Delete general index for edge property 'rank'

dropIndex().edge_property(rank);

Example 3: Delete fulltext index with name 'nodeAge'

dropIndex().fulltext().name("nodeAge")

Parameters fulltext() and name() must be used together when removing fulltext index.

showIndex()

Type | Components
Command | showIndex()
Parameters | fulltext()
Return | (a list of indexes used by the current graphset)

Values of parameter:

Name | Data Type | Specification | Description
fulltext | / | / | (Optional) To show the fulltext indexes only

Example 1: Show all indexes

showIndex()

Example 2: Show all fulltext indexes

showIndex().fulltext()

Ultipa Manager returns the index list with the following information:

Figure: Information in an Index List

Fulltext Filter

A fulltext filter allows quick and intelligent searches through a fulltext index against search strings. As an extension of the Ultipa filter, a fulltext filter is essentially a basic filtering condition that comprises a fulltext index item, an operator and search strings.

Three kinds of fulltext filtering logic are supported:

  1. Condition that the fulltext item contains any of the search strings:
{ ~<fulltext>: { $in: ["<string1>", "<string2>", ...] } }
  2. Condition that the fulltext item contains all the search strings:
{ ~<fulltext>: "<string1> <string2> ..." }
  3. Condition that the fulltext item contains all the search strings in at least one of the groups:
{ ~<fulltext>: { $in: ["<string1> <string2> ...", "<string1> <string2> ...", ...] } }

More sophisticated filtering logic should be expressed as a compound filtering condition that combines these three forms with logical operators.
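
The three forms can be read as "any", "all" and "any group". The sketch below models that logic in plain Python so the difference is easy to see. It is not Ultipa's implementation: the tokenize and matches functions are hypothetical, the tokenizer is a naive lowercase split (the real engine relies on language-specific word breaking), and only whole-word matching is modelled.

```python
import re


def tokenize(text):
    # Assumption: words are lowercase alphanumeric runs; the real engine
    # uses a language-specific word library instead.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(text, condition):
    """Evaluate one fulltext condition against a piece of text.

    condition is either a space-separated string (all words required)
    or a dict {"$in": [...]} (any string, or any group, may match).
    """
    words = tokenize(text)

    def contains_all(search_string):
        # A space-separated string means every word must be present.
        return all(term.lower() in words for term in search_string.split())

    if isinstance(condition, str):
        return contains_all(condition)          # form 2: all
    # forms 1 and 3: any single string, or any group of strings
    return any(contains_all(group) for group in condition["$in"])


text = "Ultipa Graph is a graph database"
print(matches(text, {"$in": ["Neo", "Ultipa"]}))                      # form 1
print(matches(text, "Graph Ultipa"))                                  # form 2
print(matches(text, {"$in": ["ULTIPA QUERY LANGUAGE", "ULTIPA GRAPH"]}))  # form 3
```

Form 1 is simply form 3 with one word per group, which is why a single code path handles both.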

To achieve different search precisions, the fulltext index supports both whole-word and prefix search, depending on whether a wildcard * is placed at the end of the search string.

Example 1: Search nodes against fulltext index 'nodeDesc' for words 'Graph' and 'Ultipa'

find()
  .nodes({ ~nodeDesc: "Graph Ultipa" })
  .limit(10).select(*)

Example 2: Search edges against fulltext index 'edgeInfo' for either 'Graph' or 'Ultipa'

find()
  .edges({ ~edgeInfo: { $in:["Graph", "Ultipa"] } })
  .limit(10).select(*)

Example 3: Search nodes against fulltext index 'nodeDesc' for either ULTIPA + GRAPH or ULTIPA + QUERY + LANGUAGE

find()
  .nodes({ 
    ~nodeDesc: { 
      $in:[
        "ULTIPA GRAPH",
        "ULTIPA QUERY LANGUAGE"
      ]
    }
  })
  .limit(10).select(*)

Prefix search is a form of fuzzy search: appending the wildcard * to a search string makes that string a prefix to be matched against the words in the full text.

Example 1: Search nodes against fulltext index 'nodeName' for words that start with 'ult'

find()
  .nodes({ ~nodeName : "ult*" })
  .limit(10).select(*)

With the wildcard * attached, the result may contain words like 'ultimate', 'Ultipa', 'ult234', etc.
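
To make the contrast with whole-word search concrete, here is a minimal Python sketch of the matching rule. The word_matches function is hypothetical and assumes case-insensitive comparison, which is consistent with 'Ultipa' being returned for 'ult*' above; it is not how the engine is implemented internally.

```python
def word_matches(term, word):
    """Return True if a single search term matches a single word.

    Assumption: matching is case-insensitive, and a trailing * turns
    the term into a prefix; without it the whole word must be equal.
    """
    term, word = term.lower(), word.lower()
    if term.endswith("*"):
        return word.startswith(term[:-1])
    return word == term


words = ["ultimate", "Ultipa", "ult234", "result"]

# Prefix search: 'result' contains 'ult' but does not start with it.
prefix_hits = [w for w in words if word_matches("ult*", w)]

# Whole-word search: none of the sample words is exactly 'ult'.
whole_hits = [w for w in words if word_matches("ult", w)]

print(prefix_hits)
print(whole_hits)
```

Dropping the wildcard narrows the match to exact words only, which explains why a whole-word search can return nothing where a prefix search returns several hits.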

Example 2: Search nodes against fulltext index 'nodeName' for content that contains both ult* and graph* like words

find()
  .nodes({ ~nodeName : "ult* graph*" })
  .limit(10).select(*)

Example 3: Search nodes against fulltext index 'nodeName' for content that contains either ult* or graph* like words

find()
  .nodes({ ~nodeName : { $in: ["ult*", "graph*"] } })
  .limit(10).select(name)

Running Example 3 in Ultipa Manager returns the following:

Figure: Find Node with Fulltext Search

Example 4: Search for paths from Sequoia* to Hillhouse* within 5 hops, returning up to 100 paths

t() .n({ ~name: "Sequoia*" })
    .e()[:5].n({ ~name: "Hillhouse*" })
    .limit(100).select(*)

Example 5: Starting from Rick*, search along paths of up to 3 hops for Ted*, who has an L* relationship with the next entity in the graph; return up to 10 paths

t() .n({ ~node_name: "Rick*" })
    .e()[:3].n({ ~node_name: "Ted*" })
    .e({ ~edge_name: "L*" }).n()
    .limit(10).select(*)

Running Example 5 in Ultipa Manager returns the following:

Figure: Template Query with Fulltext Search

Note: in Example 5, fulltext indexes on both nodes and edges are leveraged in a single template-based search, a capability introduced by Ultipa.

Given a GP/LP or AI-investment knowledge graph or analytics data network, the examples above can implement an otherwise highly sophisticated search in a single line of uQL, such as finding the network entities and relationships comprising Sequoia, Hillhouse and any of their branches, plus all the companies that they have invested in and that compete or collaborate with each other. Before Ultipa invented template-based fulltext search, a query like this was unthinkable. Now it can be done with ease, with elegance and in real time.

Note: The purpose of prefix search is to maximize the number of hits. Replacing prefix search with whole-word search may lead to either of the following:

  • An immediate return at the microsecond (or at most millisecond) level, because an exact match (100% match) is faster than the slightly slower prefix search;
  • Zero results caused by word breaking, when the search strings do not exist in the word library, which is language-specific.

Unless you know precisely what you are searching for and intend to use whole-word search, we recommend appending '*' to your search strings; after all, broad matching is the purpose of fulltext search.