NoSQL and Polyglot Persistence Research Project

I have an upcoming research project for the BTR820 course at Seneca. Since NoSQL databases and polyglot persistence have taken off in recent years, this project is a perfect opportunity for me to learn about various NoSQL databases.

The ultimate goal of the research project is to explore the applicability of polyglot persistence. I plan to examine and compare the programming models imposed by different types of NoSQL databases, and contrast them with that of traditional relational databases. The idea is to look at how application code would be written for each database, and how application development would be affected. I may also explore the philosophies and algorithms behind the data storage engines. If scope permits, I may look at the management and security facilities each one provides, since these affect application development as well. Performance and horizontal scaling (e.g. clustering), however, will not be my focus.

As for the types of databases to examine, I currently plan to look at document stores, graph databases and key-value stores, as they are the ones I hear about most often. The implementations I have tentatively chosen are MongoDB, Neo4j and Cassandra. If time permits, I may give HBase or FlockDB [1] a look. The RDBMS baseline will be MySQL.

To compare the databases, my plan is to implement the same set of functionality in Java against each of them, and then look at how complex the code has to be to achieve the same effect. The code will be as low level as possible, meaning no JPA or similar higher-level frameworks will be used.
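As a purely illustrative sketch of how such a comparison could be organized: one small interface describes the task, and each database gets its own implementation behind it. Everything named here (ProfileStore, InMemoryProfileStore, ComparisonHarness, the "save and find a profile" task) is hypothetical and not part of the actual project; the in-memory class only stands in for where a MongoDB, Neo4j, Cassandra or MySQL implementation would go.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical task definition: every backend would implement the same operations.
interface ProfileStore {
    void save(String userId, Map<String, String> profile);
    Optional<Map<String, String>> find(String userId);
}

// In-memory stand-in (an assumption for illustration only). A real comparison
// would have one class per database, each written against that database's own API.
class InMemoryProfileStore implements ProfileStore {
    private final Map<String, Map<String, String>> data = new HashMap<>();

    @Override
    public void save(String userId, Map<String, String> profile) {
        // Copy the map so later changes by the caller do not leak into the store.
        data.put(userId, new HashMap<>(profile));
    }

    @Override
    public Optional<Map<String, String>> find(String userId) {
        Map<String, String> profile = data.get(userId);
        return profile == null
                ? Optional.empty()
                : Optional.of(new HashMap<>(profile));
    }
}

public class ComparisonHarness {
    // The same scenario is run against whichever backend is passed in.
    static String roundTrip(ProfileStore store) {
        store.save("u1", Map.of("name", "Alice"));
        return store.find("u1")
                .map(profile -> profile.get("name"))
                .orElse("<missing>");
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new InMemoryProfileStore()));
    }
}
```

With a layout like this, the scenario code stays identical across backends, so any difference in code size or complexity shows up only in the per-database implementation classes.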

The paper's conclusion will be based on how practical it is to implement a given task with each database, along with any baggage or overhead each one imposes. My hypothesis is that each type of database has its own primary strength and intended use case, so the goal is to find out which database suits which use case, and why.

If what I’m doing interests you, feel free to leave a comment.

[1] Suggested by Seneca Professor Jordan Anastasiade.