Graph databases are schemaless, which allows flexible storage of interconnected data. Updating or otherwise manipulating the data can, however, lead to heterogeneity. This heterogeneity is caused by implicit structural changes through so-called evolution operations such as add, rename, delete, transform, merge, copy, split, or move.
To describe these implicit structural changes, we developed a domain-independent evolution language named GEO (Graph Evolution Operation), which specifies how each evolution operation works. Through its intuitive syntax, GEO is intended to be usable by both experts and non-experts. Consequently, the presence of GEO at all levels is of utmost importance for supporting potential users.
Nautilus benefits from the near-natural language GEO. Consequently, we plan on implementing the evolution language to widen the range of users for graph databases in general, aiming to ease the usage of graph databases, e.g. in interdisciplinary research projects. The novelty lies in the fact that an evolution language including graph-specific operations such as transform was lacking so far. As some evolution operations, such as splitting nodes or relationships at a specified property key, are available neither in Neo4j's query language Cypher nor in the APOC library, a workaround is needed. By means of Schema Modification Operations (SMO), which enable a precise translation into a workaround with Cypher, even such operations are available in Nautilus. Users therefore benefit from an easy-to-understand language for executing complex workarounds on their database.
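To illustrate what such a workaround can look like, the following minimal Python sketch builds a Cypher statement that splits nodes at one property key. It is not the SMO translation used by Nautilus: the function names `quote` and `split_node_cypher`, the choice to move the property to a newly created node, and the example labels and relationship type are all assumptions made for illustration.

```python
def quote(identifier: str) -> str:
    """Backtick-quote a Cypher identifier, doubling embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def split_node_cypher(label: str, key: str, new_label: str, rel_type: str) -> str:
    """Build a Cypher statement that splits nodes at one property key.

    Hypothetical workaround (an assumption, not the Nautilus SMO):
    every node with the given label that carries the property gets a
    new neighbour node, the property is copied to that neighbour and
    then removed from the original node.
    """
    l, k, nl, r = (quote(x) for x in (label, key, new_label, rel_type))
    return "\n".join([
        f"MATCH (n:{l})",
        f"WHERE n.{k} IS NOT NULL",
        f"CREATE (n)-[:{r}]->(m:{nl})",
        f"SET m.{k} = n.{k}",
        f"REMOVE n.{k}",
    ])


# Example: split the property "address" off Person nodes.
query = split_node_cypher("Person", "address", "Address", "HAS_ADDRESS")
print(query)
```

The sketch only generates the query text. Running it against a database, and handling relationships instead of nodes, would need additional steps that are not covered here.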
Nautilus represents a first implementation of an evolution approach, on the basis of which we plan an interactive interface to visualize the schema as well as structural profiles, so that users cannot overlook the impact of evolution in the sense of:
For this purpose, the user interface of Nautilus offers an interactive scatter plot that illustrates structural database statistics (SDS). SDS are a hybrid approach combining schema information with database statistics to visualize which data is currently stored in the database. In addition, the SDS before and after the evolution process are compared with one another.
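The comparison of SDS before and after an evolution step can be sketched as follows. The text does not specify how SDS are represented, so this example assumes, purely for illustration, that they can be reduced to counts per (label, property key) pair. The function name `sds_delta` and the sample numbers are hypothetical.

```python
from typing import Dict, Tuple

# Assumed SDS shape: (label, property key) -> number of nodes carrying it.
Sds = Dict[Tuple[str, str], int]


def sds_delta(before: Sds, after: Sds) -> Sds:
    """Return the count change for every entry that differs between snapshots."""
    delta = {}
    for entry in sorted(set(before) | set(after)):
        change = after.get(entry, 0) - before.get(entry, 0)
        if change != 0:
            delta[entry] = change
    return delta


# Hypothetical snapshots around a split of "address" off Person nodes.
before = {("Person", "name"): 100, ("Person", "address"): 80}
after = {("Person", "name"): 100, ("Address", "address"): 80}

for (label, key), change in sds_delta(before, after).items():
    print(f"{label}.{key}: {change:+d}")
```

Unchanged entries are dropped from the result, so only the structural impact of the evolution step remains visible.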
The next steps in the development are solutions to the following questions: