Hadoop vs Teradata: what's the difference?

I have used Teradata. I have never touched Hadoop, but since yesterday I have been doing some research. Judging by their descriptions, the two seem completely interchangeable, yet some documents say they serve different purposes. Everything I have found is vague, and I'm confused.

Does anyone have experience with both of them? What is the difference between the two?

A simple example: I want to build an ETL pipeline that transforms billions of rows of raw data and organizes them into a DWH, and then run some resource-expensive analyses on them. Why use TD? Why Hadoop, or why not?

3 answers

I think the article entitled "MapReduce and Parallel DBMSs: Friends or Foes?" does a good job of describing the situations where each technology works best. In a nutshell, Hadoop is great for storing unstructured data and running parallel transformations to "cleanse" incoming data, while parallel DBMSs excel at complex analytical queries.
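To make the "parallel transformations to cleanse incoming data" idea concrete, here is a minimal sketch of the map/reduce pattern in plain Python (my own illustration, not code from the article or from Hadoop itself): the map step parses raw lines and discards malformed ones, and the reduce step aggregates per key.

```python
from collections import defaultdict

def map_clean(line):
    """Map step: parse a raw CSV line; drop malformed rows (the "cleansing")."""
    parts = line.strip().split(",")
    if len(parts) != 3:
        return []  # discard unstructured / garbled input
    user, _date, amount = parts
    try:
        return [(user, float(amount))]
    except ValueError:
        return []

def reduce_sum(pairs):
    """Reduce step: aggregate values per key, like a GROUP BY."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

raw = [
    "alice,2024-01-01,10.5",
    "garbage line",              # malformed row, dropped by the map step
    "alice,2024-01-02,4.5",
    "bob,2024-01-01,3.0",
]
pairs = [p for line in raw for p in map_clean(line)]
print(reduce_sum(pairs))  # {'alice': 15.0, 'bob': 3.0}
```

In Hadoop the same two functions would run distributed over HDFS blocks; the point of the paper's comparison is that this style tolerates messy, schemaless input, whereas a DBMS would reject it at load time.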


A feature comparison of Hadoop, Hadoop with extensions, and RDBMSs

I am not an expert in this field, but Coursera's "Introduction to Data Science" course has a lecture entitled "Comparing MapReduce and Databases", as well as a lecture on parallel databases in the MapReduce section of the course.

Here is a summary of those lectures comparing MapReduce and RDBMSs (not necessarily parallel RDBMSs). Keep in mind that the comparison changes if you include extensions to Hadoop such as Pig, Hive, etc. In parentheses I note the MapReduce extensions that add some of this functionality.

Some features/properties that RDBMSs have but that are not native to MapReduce:

  • Declarative query languages (Pig, Hive)
  • Schemas (Hive, Pig, DryadLINQ, Hadapt)
  • Logical data independence
  • Indexing (HBase)
  • Algebraic optimization (Pig, Dryad, Hive)
  • Caching / materialized views
  • ACID / transactions

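Several of the RDBMS-side properties in that list (declarative queries, schemas, indexing, ACID transactions) can be seen in a few lines using Python's built-in `sqlite3`; this is my own illustrative sketch, not from the lectures, and sqlite is of course a single-node database, not a parallel one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (user TEXT, amount REAL)")  # schema
conn.execute("CREATE INDEX idx_user ON sales (user)")        # indexing
with conn:  # ACID transaction: commits atomically, rolls back on error
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("alice", 10.5), ("alice", 4.5), ("bob", 3.0)],
    )
# Declarative query: you state *what* you want; the engine plans *how*.
rows = conn.execute(
    "SELECT user, SUM(amount) FROM sales GROUP BY user ORDER BY user"
).fetchall()
print(rows)  # [('alice', 15.0), ('bob', 3.0)]
```

With raw MapReduce you would hand-write the grouping and aggregation logic yourself; Pig and Hive exist precisely to give Hadoop a comparable declarative layer.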
Features MapReduce has over a regular RDBMS (not necessarily a parallel RDBMS):

  • High scalability
  • Fault tolerance
  • One-person deployment

For starters, vanilla Apache Hadoop is 100% open source. But if you need commercial support and consulting, there are companies such as Cloudera, MapR, HortonWorks, etc.

Hadoop is backed by a growing community that fixes bugs and continually improves the project. Hadoop's HDFS storage model is based on Google's GFS, which has been proven to handle large volumes of data. In addition, Hadoop's MapReduce analysis model is based on Google's MapReduce model.

Hadoop is used by tech giants such as Facebook, Yahoo, Twitter, eBay, etc. to store and analyze their large volumes of data, both in real time and in batch.

For your question about ETL systems, read these slides.

OK, now: why Hadoop?

  • Open source
  • Proven storage and analysis model for large volumes of data
  • Minimal hardware requirements to set up and run

OK, now: why TD?

  • Commercial support
