Thekey differencebetween RDBMS and Hadoop is that theRDBMS stores structured data while the Hadoop stores structured, semi-structured, and unstructured data.
The RDBMS is a database management system based on the relational model. The Hadoop is a software for storing data and running applications on clusters of commodity hardware.
CONTENTS
1.Overview and Key Difference
2.RDBMS是什么
3.What is Hadoop
4.并排比较 - rdbms vs hadoop以表格形式
5.概括
RDBMS是什么?
RDBMS代表基于关系模型的关系数据库管理系统。在RDBMS中,表用于存储数据,键和索引有助于连接表。表是数据元素的集合,它们是实体。它包含行和列。行表示表中的一个条目。这些列表示属性。
For example, the sales database can have customer and product entities. The customer can have attributes such as customer_id, name, address, phone_no. The item can have attributes such as product_id, name etc. The primary key of customer table is customer_id while the primary key of product table is product_id. Placing the product_id in the customer table as a foreign key connects these two entities. Likewise, the tables are also related to each other. They provide data integrity, normalization, and many more. Few of the common RDBMS areMySQL, MSSQL and Oracle. They useSQLfor querying.
Hadoop是什么?
The Hadoop is an Apache open source framework written in Java. It helps to store and processes a large quantity of data across clusters of computers using simple programming models. The main objective of Hadoop is to store and processBig Data,which refers to a large quantity of complex data. The throughput of Hadoop, which is the capacity to process a volume of data within a particular period of time, is high.
There are four modules in Hadoop architecture. They are Hadoop common, YARN, Hadoop Distributed File System (HDFS), and Hadoop MapReduce. The common module contains the Java libraries and utilities. It also has the files to start Hadoop. Hadoop YARN performs the job scheduling and cluster resource management.
Furthermore, the Hadoop Distributed File System (HDFS) is the Hadoop storage system. It uses the master-slave architecture. The Master node is the NameNode, and it manages the file system meta data. Other computers are slave nodes or DataNodes. They store the actual data. On the other hand, Hadoop MapReduce does the distributed computation. It has the algorithms to process the data. In the HDFS, the Master node has a job tracker. It runs map reduce jobs on the slave nodes. There is a Task Tracker for each slave node to complete data processing and to send the result back to the master node. Overall, the Hadoop provides massive storage of data with a high processing power.
What is the Difference Between RDBMS and Hadoop?
RDBMS vs Hadoop |
|
RDBMS is a system software for creating and managing databases that based on the relational model. | Hadoop是开源软件的集合,该软件连接了许多计算机来解决涉及大量数据和计算的问题。 |
Data Variety | |
RDBMS stores structured data. | Hadoop商店结构化,半结构化和非结构化数据。 |
Data Storage | |
RDBMS stores average amount of data. | Hadoop商店比RDBMS大量数据。 |
速度 | |
In RDBMS, reads are fast. | 在Hadoop中,阅读和写作很快。 |
Scalability | |
RDBMS has vertical scalability. | Hadoop has horizontal scalability. |
硬件 | |
RDBMS use high-end servers. | Hadoop uses commodity hardware. |
Throughput | |
RDBMS吞吐量更高。 | Hadoop throughput is lower. |
概括– RDBMS vs Hadoop
This article discussed the difference between RDBMS and Hadoop. The key difference between RDBMS and Hadoop is that the RDBMS stores structured data while the Hadoop stores structured, semi-structured and unstructured data.
Reference:
1.Tutorials Point. “SQL RDBMS Concepts.” ,导师ials Point, 8 Jan. 2018.Available here
2.Tutorials Point. “Hadoop Tutorial.” ,导师ials Point, 8 Jan. 2018.Available here
Image Courtesy:
1.’8552968000’by Intel Free Press(CC BY-SA 2.0)viaFlickr
发表评论