Mongo is a database, which is simply software that allows you to store, retrieve, filter, and sort information quickly and easily.  Databases have been around for decades.  The relational model was first proposed by IBM's E. F. Codd in 1970, in his seminal paper "A Relational Model of Data for Large Shared Data Banks", and he developed it further in 1972 in "Relational Completeness of Data Base Sublanguages".  In this work he outlined the foundations of relational database software, and handed them to IBM on a silver platter.  IBM shelved the idea because they were afraid of cannibalizing sales of their existing non-relational database products, and only after Oracle beat IBM to releasing a relational database product did they begin considering it a viable product.

IBM and Oracle are, today, giants in the world of relational databases.  They were later joined by Microsoft and by open source offerings like MySQL and Postgres, while newer non-relational systems like Apache Cassandra have widened the field.  Database software companies are raking in billions of dollars a year in sales.

So it’s big business, but what is a relational database, and how is a relational database different from Mongo, which is a NoSQL, non-relational database?

If you’ve ever used a spreadsheet before, you’ve seen a simple database.  You can record tabular data on a spreadsheet using rows and columns.  Each row is called a record.

The problem with simple tables is that eventually you get data duplication.  Let’s say you wanted to make a list of your favorite books on your favorite subject so you can share it.  The list has the book title, the author, the year of publication, maybe an ISBN, and the name and address of the publishing company.

Eventually we’re going to have duplicate data in our simple table.  As soon as two books come from the same publishing house, its name and address appear twice.  Duplicate data is bad because if the publishing house’s address ever changes, you have to update every copy of the information.  The duplicate information also takes up space in our table, and space on our computer, yet it serves no real, useful purpose.

It would be better if you could update the information in one place by recording the details just once, and then linking or relating the data together.  To achieve this, you could create another table which contains just the publishing house information.  From there you could create a unique number that identifies each publishing house, and you could then put that number in your list of books.  Now we have a relationship between books and publishers.  The key to the relationship is the unique number we created for each publisher.  Using this key, we can join the data together using a relational query.
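Here is a minimal sketch of that idea, using Python's built-in sqlite3 module as a stand-in for any relational database.  The table names, column names, and sample data are made up for illustration.  It stores the publisher once, links two books to it by a key, changes the address in one place, and then joins the two tables back together.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# Two tables: publishers are stored once, books point at them by key.
# (Table and column names are hypothetical, chosen for this example.)
conn.executescript("""
    CREATE TABLE publishers (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        address TEXT NOT NULL
    );
    CREATE TABLE books (
        title        TEXT NOT NULL,
        year         INTEGER,
        publisher_id INTEGER REFERENCES publishers(id)
    );
""")

conn.execute("INSERT INTO publishers VALUES (1, 'Example Press', '1 Old Street')")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [("First Book", 2001, 1), ("Second Book", 2005, 1)],
)

# The address lives in one row, so a single update fixes it for every book.
conn.execute("UPDATE publishers SET address = '2 New Avenue' WHERE id = 1")

# The relational query: join books to publishers on the key.
rows = conn.execute("""
    SELECT b.title, p.name, p.address
    FROM books AS b
    JOIN publishers AS p ON p.id = b.publisher_id
    ORDER BY b.title
""").fetchall()

for row in rows:
    print(row)
```

Both books come back with the new address even though it was only changed once, which is the whole point of relating the tables instead of copying the details.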

Relational databases are awesome at storing enormous amounts of data and allowing you to find individual records in under a second.  But each time you add another relation, by creating a separate table to hold information related to your main tables, you add another join to the queries that need that information, which tends to increase the query time, or the time it takes to assemble the information you want in the format you want to see it.  More relations mean more joins, which can mean slower query response times.

The other big problem with relational structures is that they are simple two-dimensional structures: rows and columns.  Object-oriented programming languages model data using classes or objects, and these structures can be made very rich owing to their hierarchical nature.  Objects can store data inside themselves, and we can then add them into other objects, which creates a tree-like structure called an object graph.

Given the disparity between how programmers work with objects containing data, and how a relational database stores its data, converting rich object structures into two-dimensional relations is very tedious work.  Object-relational mapper software, or ORMs, mitigates some of this for us, but ORMs come with their own penalties in performance, and in adding one more layer to your lasagna.  It’s one more thing that can break or go wrong.
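To make that tedium concrete, here is a small sketch of the kind of mapping code you end up writing by hand.  The classes (Author, Book, Publisher) and the flatten function are hypothetical, invented for this example, and the row layouts are just one possible table design.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Publisher:
    name: str
    address: str


@dataclass
class Book:
    title: str
    year: int
    publisher: Publisher  # an object nested inside another object


@dataclass
class Author:
    name: str
    books: List[Book] = field(default_factory=list)


def flatten(author):
    """Turn one object graph into flat rows for three separate tables."""
    author_id = 1  # a real system would let the database generate this key
    author_rows = [(author_id, author.name)]
    book_rows, publisher_rows = [], []
    publisher_ids = {}  # (name, address) -> generated key

    for book in author.books:
        key = (book.publisher.name, book.publisher.address)
        if key not in publisher_ids:
            # First time we see this publisher: give it a key and a row.
            publisher_ids[key] = len(publisher_ids) + 1
            publisher_rows.append((publisher_ids[key],) + key)
        book_rows.append((book.title, book.year, author_id, publisher_ids[key]))

    return author_rows, book_rows, publisher_rows


press = Publisher("Example Press", "1 Old Street")
author = Author("A. Writer", [Book("First Book", 2001, press),
                              Book("Second Book", 2005, press)])

author_rows, book_rows, publisher_rows = flatten(author)
print(author_rows)
print(book_rows)
print(publisher_rows)
```

One small tree of objects turns into three lists of rows plus the bookkeeping for keys, and you would need the reverse mapping as well to read the data back.  This is the work an ORM tries to take off your hands.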

Wouldn’t it be great if a database could just store data in the same format as programmers work with objects?

A number of years ago, programmers at large web companies, Facebook among them, started asking questions like that one, and the NoSQL revolution was born.  Mongo represents one of the most popular entries in a short but growing list of database systems that are non-relational.  Instead of tables with rows and columns, a document database like Mongo allows you to store, retrieve, search, sort, and filter data in much the same structure and format that exists in your program code.  You no longer need to deal with modeling relational structures and defining constraints and rules for relational tables, nor do you have to limit yourself to what an ORM can accomplish.  NoSQL databases are generally simple and easy to use, and depending on the workload they often perform as well as or slightly better than their relational counterparts.
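As a rough illustration of what "the same structure as your program code" means, here is a book written as a single nested document.  This sketch deliberately uses only plain Python dictionaries, a list standing in for a collection, and the json module.  It does not talk to a real Mongo server, and the field names and sample data are made up.

```python
import json

# One book as a single document: the publisher is nested inside it
# instead of living in a separate table. (Field names are hypothetical.)
first_book = {
    "title": "First Book",
    "year": 2001,
    "publisher": {"name": "Example Press", "address": "1 Old Street"},
    "tags": ["databases", "history"],
}
second_book = {
    "title": "Second Book",
    "year": 1995,
    "publisher": {"name": "Example Press", "address": "1 Old Street"},
    "tags": ["databases"],
}

# Stand-in for a collection: just a list of documents, not a real database.
collection = [first_book, second_book]

# Filter and sort directly on the nested structure.
recent = sorted(
    (doc for doc in collection if doc["year"] >= 2000),
    key=lambda doc: doc["year"],
)
titles = [doc["title"] for doc in recent]
print(titles)

# The document survives a round trip through JSON unchanged.
restored = json.loads(json.dumps(first_book))
print(restored["publisher"]["address"])
```

With a real server you would hand dictionaries shaped like these to a driver such as pymongo rather than keeping them in a list.  Notice that the publisher details are repeated in each document here, which is the trade-off you accept in exchange for skipping the join.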

Want to learn more about how to actually install and use MongoDB in your code projects?  Check out our MongoDB courses at