Why is Hadoop not listed in the DB-Engines Ranking?

by Paul Andlinger, 13 May 2013
Tags: Cassandra, Hadoop, HBase, Hive

We are often asked why we do not list Hadoop in the DB-Engines ranking. Here is how we see it.

To start with a platitude: yes, of course Apache Hadoop is a very powerful and popular tool, and it has gained special importance in the handling of Big Data.

However, we understand Hadoop as a system that provides a distributed file system (HDFS) together with a comprehensive ecosystem (MapReduce, YARN, ZooKeeper, Pig, Hive, etc.). Conceptually, Hadoop can be compared to a distributed file system like NFS or to file server software like Samba. (We don't dare to mention VSAM here...)

But what exactly is the difference between a file system and a database management system? For us, the main difference is this:

  • A file system stores data to be used by applications without knowing about the structure of the data. For example, a file system stores a spreadsheet as a sequence of bits, without knowing anything about cells or formulas.
  • A database management system stores data to be used by applications, and provides access to the data in a way that makes use of the structure and content of the data. For example, a DBMS is able to deliver all data that belongs to a person named "John". That type of access can be provided via SQL or via an API, for instance.
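The distinction can be illustrated with a small sketch. It is not taken from Hadoop or from any system in our ranking: the sample records, the file name `people.csv`, the `person` table, and all function names are assumptions made up for this example, and SQLite merely stands in for "a DBMS".

```python
import os
import sqlite3
import tempfile

# Hypothetical sample records, invented for this illustration.
PEOPLE = [("John", "Vienna"), ("Mary", "Berlin"), ("John", "Graz")]


def store_as_file(directory: str) -> str:
    """Write the records to a CSV file. The file system only ever sees bytes."""
    path = os.path.join(directory, "people.csv")
    # newline="\n" keeps the byte content identical on every platform.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for name, city in PEOPLE:
            f.write(f"{name},{city}\n")
    return path


def read_from_file(path: str) -> bytes:
    """A file system can hand the content back, but only as an opaque byte string."""
    with open(path, "rb") as f:
        return f.read()


def store_in_dbms() -> sqlite3.Connection:
    """Load the same records into an in-memory SQLite table with named columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT, city TEXT)")
    conn.executemany("INSERT INTO person VALUES (?, ?)", PEOPLE)
    return conn


def find_by_name(conn: sqlite3.Connection, name: str) -> list:
    """The DBMS knows the structure, so it can answer a query on a column."""
    cur = conn.execute(
        "SELECT name, city FROM person WHERE name = ? ORDER BY city", (name,)
    )
    return cur.fetchall()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(read_from_file(store_as_file(tmp)))  # raw bytes, no notion of "John"
    print(find_by_name(store_in_dbms(), "John"))   # only the rows for John
```

With the file, finding everything about "John" is the application's job: it has to read the bytes and parse them itself. With the DBMS, the same question is a single query.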

This criterion works well in most cases. The line blurs for the simplest key-value stores, which likewise provide no means of handling structured data. Any file system could be seen as a key-value store in which the file name (including the path) is the key and the content of the file is the value. We include very simple key-value stores in our ranking if they are perceived as a DBMS by their vendor and by their users, and otherwise not.
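To make the "file system as key-value store" view concrete, here is a minimal sketch. The class name `FileSystemKV` and its methods are hypothetical, chosen only for this illustration. It ignores concerns a real store would need to handle, such as concurrency or keys that escape the root directory.

```python
import tempfile
from pathlib import Path


class FileSystemKV:
    """Hypothetical key-value facade over a directory: key = relative path, value = file content."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def put(self, key: str, value: bytes) -> None:
        """Store the value under the key, creating parent directories as needed."""
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(value)

    def get(self, key: str) -> bytes:
        """Return the value for the key; raises FileNotFoundError if it is absent."""
        return (self.root / key).read_bytes()

    def keys(self) -> list:
        """List all keys, i.e. the relative paths of all regular files under the root."""
        return sorted(
            p.relative_to(self.root).as_posix()
            for p in self.root.rglob("*")
            if p.is_file()
        )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        kv = FileSystemKV(tmp)
        kv.put("reports/2013/may.txt", b"some opaque content")
        print(kv.keys())
        print(kv.get("reports/2013/may.txt"))
```

Note that the store can only look values up by key. It knows nothing about what is inside a value, which is exactly why such systems sit so close to the line between file system and DBMS.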

Hadoop is indeed close to that blurred line, but according to the criterion defined above, we decided to consider it a file system, although a very advanced one.

We do actually list a couple of interesting systems which are either built upon Hadoop (e.g. HBase and Hive) or can handle data stored in the Hadoop file system (e.g. Cassandra).




