vmanubolu
Posted June 25, 2015

HADOOP AT VSBTECH

HADOOP BASICS
· Problems with traditional large-scale systems
· Data storage literature survey
· Data processing literature survey
· Network constraints
· Requirements for a new approach

Hadoop: Basic Concepts
· What is Hadoop?
· The Hadoop Distributed File System
· How Hadoop MapReduce works
· Anatomy of a Hadoop cluster
· Master daemons
  · Name node
  · Job tracker
  · Secondary name node
· Slave daemons
  · Data node
  · Task tracker

HDFS (Hadoop Distributed File System)
· Blocks and splits
  · Input splits
  · HDFS splits
· Data replication
· Hadoop rack awareness
· Data high availability
· Cluster architecture and block placement

CASE STUDIES

Programming Practices & Performance Tuning
· Developing MapReduce programs in
  · Local mode
    · Running without HDFS
  · Pseudo-distributed mode
    · Running all daemons in a single node
  · Fully distributed mode
    · Running daemons on dedicated nodes
· Installing an Apache single-node cluster
· Name node in safe mode

Writing a MapReduce Program
· Examining a sample MapReduce program
· With several examples
· Basic API concepts
· The driver code
· The Mapper
· The Reducer
· Hadoop's Streaming API

Performing several Hadoop jobs
· The configure and close methods
· Sequence files
· Record reader
· Record writer
· Role of Reporter
· Output collector
· Counters
· Directly accessing HDFS
· ToolRunner
· Using the distributed cache
· Killing a job

Several MapReduce jobs (in detail)
· Most effective search using MapReduce
· Generating recommendations using MapReduce
· Processing log files using MapReduce
· Image counters in MapReduce
· MRUnit testing
· Identity Mapper
· Identity Reducer
· Exploring well-known problems using MapReduce applications

Debugging MapReduce Programs
· Testing with MRUnit
· Logging
· Other debugging strategies
Advanced MapReduce Programming
· The secondary sort
· Customized input formats and output formats
· Joins in MapReduce
· Compression

Monitoring and debugging on a Production Cluster
· Skipping bad records
· Running in local mode

Tuning for Performance in MapReduce
· Reducing network traffic with a combiner
· Partitioners
· Reducing the amount of input data
· Speculative execution
· Other performance aspects

CASE STUDIES

CDH4 Enhancements
· Name Node High Availability
· Name Node federation
· Fencing
· MapReduce Version 2

HIVE
· Hive concepts
· Hive architecture
· Install and configure Hive on a cluster
· Different types of tables in Hive
· Hive library functions
· Buckets
· Partitions
· Joins in Hive
  · Inner joins
  · Outer joins
· Hive UDF
· Hive SerDe
· Processing JSON in Hive
· Compression in Hive

PIG
· Pig basics
· Install and configure Pig on a cluster
· Pig library functions
· Pig vs Hive
· Write sample Pig Latin scripts
· Modes of running Pig
· Running in Grunt shell
· Designing Pig scripts
· Using PiggyBank
· Running as Java program
· Pig UDFs
· Pig macros
· Debugging Pig

IMPALA
· Differences between Impala, Hive and Pig
· How Impala gives good performance
· Exclusive features of Impala
· Impala challenges
· Use cases of Impala

SQOOP
· Install and configure Sqoop on a cluster
· Connecting to RDBMS
· Installing MySQL
· Import data from Oracle/MySQL to Hive
· Export data to Oracle/MySQL
· Internal mechanism of import/export

FLUME
· Architecture
· Ingesting streaming tweets
· HDFS as sink

NOSQL HBase
· HBase concepts
· HBase architecture
· Region server architecture
· File storage architecture
· HBase basics
· Column access
· Scans
· HBase use cases
· Install and configure HBase on a multi-node cluster
· Create database, develop and run sample applications

OOZIE
· Oozie architecture
· XML file specifications
· Install and configure Oozie and Apache
· Specifying workflow
· Action nodes
· Control nodes
· Oozie job coordinator

Hadoop Challenges
· Hadoop disaster recovery
· Hadoop suitable cases

ELASTICSEARCH
· Get and Put API
· Java approach
· ElasticSearch with Kibana

SPARK
· Basics of in-memory computation
· RDD in Spark
· Installation
· Spark with Scala example
· Spark Java API
· Spark MLlib

STORM BASICS

KAFKA BASICS