Introduction to Hadoop and BIG DATA

  • What is BIG DATA?
  • What are the challenges of processing BIG DATA?
  • What technologies support BIG DATA?
  • What is Hadoop?
  • Why Hadoop?
  • History of Hadoop
  • Use Cases of Hadoop
  • RDBMS vs Hadoop
  • When to use and when not to use Hadoop
  • Ecosystem Tour
  • Vendor Comparison
  • Hardware Recommendation & Statistics

HDFS: Hadoop Distributed File System

Significance of HDFS in Hadoop

  • Features of HDFS
  • 5 daemons of Hadoop
  1. Name Node and its functionality
  2. Data Node and its functionality
  3. Secondary Name Node and its functionality
  4. Job Tracker and its functionality
  5. Task Tracker and its functionality

Data Storage in HDFS

  • Introduction to Blocks
  • Data replication


Accessing HDFS

  • CLI (Command Line Interface) and admin commands
  • Java-Based Approach

Fault Tolerance
Download Hadoop
Installing and Setting Up Hadoop

  • Start-up & Shutdown process

HDFS Federation

Map Reduce

  • Map Reduce Story
  • Map Reduce Architecture
  • How Map Reduce works
  • Developing Map Reduce
  • Map Reduce Programming Model
  1. Different Data Types in Map Reduce
  2. Different Phases of Map Reduce Algorithm

How to write a basic Map Reduce Program

  • Driver Code
  • Mapper
  • Reducer

Creating Input and Output Formats in Map Reduce Jobs

  • Text Input Formats
  • Key Value Input Formats

Data Localization in Map Reduce

Combiner (Mini Reducer) and Partitioner

Hadoop I/O

Distributed cache


PIG

  • Introduction to Apache PIG
  • Map Reduce vs Apache PIG
  • SQL vs Apache PIG
  • Different Data Types in PIG
  • Modes of Execution in PIG
  • Grunt shell
  • Loading Data
  • Exploring PIG
  • Pig Latin Commands

HIVE

  • Hive Introduction
  • Hive Architecture
  • HIVE vs RDBMS
  • HiveQL and the shell
  • Managing tables (external vs managed)
  • Data types and schemas
  • Partitions and buckets

HBASE

  • Architecture and schema design
  • HBASE vs RDBMS
  • HMaster and Region Servers
  • Column Families and Regions
  • Write Pipeline
  • Read Pipeline
  • HBase Commands

FLUME
SQOOP