Important Hadoop Configurations

I am a Tech Enthusiast having 13+ years of experience in IT as a Consultant, Corporate Trainer, Mentor, with 12+ years in training and mentoring in Software Engineering, Data Engineering, Test Automation and Data Science. I have trained more than 10,000+ IT Professionals and conducted more than 500+ training sessions in the areas of Software Development, Data Engineering, Cloud, Data Analysis, Data Visualizations, Artificial Intelligence and Machine Learning. I am interested in writing blogs, sharing technical knowledge, solving technical issues, reading and learning new subjects.

In this article, we will learn about the important Hadoop configuration files.

hadoop-env.sh

Sets the environment variables used by the scripts that run Hadoop, such as JAVA_HOME.

Exploring core-site.xml

Configuration settings for Hadoop core, such as I/O settings that are common to HDFS and MapReduce.

Reference Link: https://hadoop.apache.org/docs/r2.6.2/hadoop-project-dist/hadoop-common/core-default.xml
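
As a minimal sketch of what typically goes into core-site.xml, the fragment below sets fs.defaultFS, the URI of the default file system that clients and daemons use. The host name localhost and the port 9000 are assumptions for a single-node setup, not values taken from this article.

```xml
<configuration>
    <!-- URI of the default file system; host and port are assumed values -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
```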

Exploring hdfs-site.xml

  • Configuration settings for the HDFS daemons: the NameNode, the Secondary NameNode, and the DataNodes.

  • Configures the replication factor, block size, storage directories, permissions, and security level.

dfs.block.size

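This property changes the HDFS block size from its default of 128 MB. The value is given in bytes. In Hadoop 2.x and later the property is named dfs.blocksize, and dfs.block.size is kept as a deprecated alias.

<property>
    <name>dfs.block.size</name>
    <value>134217728</value>
</property>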
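
To use a different block size, convert the size to bytes first. As a sketch, assuming a target of 256 MB blocks: 256 × 1024 × 1024 = 268435456 bytes. The fragment uses dfs.blocksize, the Hadoop 2.x name of this property; the 256 MB target is only an illustrative choice.

```xml
<!-- 256 MB block size: 256 * 1024 * 1024 = 268435456 bytes (assumed target size) -->
<property>
    <name>dfs.blocksize</name>
    <value>268435456</value>
</property>
```

The setting applies to files written after the change. Existing files keep the block size they were written with.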

dfs.replication

This property sets the replication factor, that is, the number of copies HDFS keeps of each block. The default is 3.

<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
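
On a single-node (pseudo-distributed) cluster there is only one DataNode, so a replication factor of 3 can never be satisfied and every block is reported as under-replicated. A common sketch for that setup, assuming a one-machine learning environment, is:

```xml
<!-- Single-node setup (assumption): keep one copy of each block -->
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
```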

dfs.permissions

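If true, permission checking is enabled in HDFS. If false, permission checking is turned off. In Hadoop 2.x the property is named dfs.permissions.enabled, and dfs.permissions is the deprecated name.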

<property>
     <name>dfs.permissions</name>
     <value>false</value>
</property>

dfs.namenode.name.dir

This property determines where on the local file system the NameNode stores the name table (fsimage).

<property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/npntraining/dfs/namenode</value>
</property>
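
The value can also be a comma-separated list of directories, in which case the name table is replicated to all of them for redundancy. A sketch, where the second path /mnt/backup/dfs/namenode is a hypothetical location used only for illustration:

```xml
<!-- Two NameNode metadata directories; the second path is hypothetical -->
<property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/npntraining/dfs/namenode,/mnt/backup/dfs/namenode</value>
</property>
```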

dfs.datanode.data.dir

This property determines where on the local file system a DataNode stores its blocks. Use a directory different from the NameNode directory.

<property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/npntraining/dfs/datanode</value>
</property>

Exploring mapred-site.xml

Configuration settings for MapReduce jobs, such as the execution framework and the resources requested for map and reduce tasks.

mapreduce.framework.name

The runtime framework for executing MapReduce jobs. Can be one of local, classic, or yarn.

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

mapreduce.map.memory.mb

The amount of memory to request from the scheduler for each map task.

<property>
    <name>mapreduce.map.memory.mb</name>
    <value>1024</value>
</property>

mapreduce.map.cpu.vcores

The number of virtual cores to request from the scheduler for each map task.

<property>
    <name>mapreduce.map.cpu.vcores</name>
    <value>1</value>
</property>

mapreduce.reduce.memory.mb

The amount of memory to request from the scheduler for each reduce task.

<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>1024</value>
</property>

mapreduce.reduce.cpu.vcores

The number of virtual cores to request from the scheduler for each reduce task.

<property>
    <name>mapreduce.reduce.cpu.vcores</name>
    <value>1</value>
</property>
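
The memory.mb values above size the container that YARN allocates, not the Java heap of the task running inside it. The heap is set separately through mapreduce.map.java.opts and mapreduce.reduce.java.opts. A rough sketch, assuming the 1024 MB containers shown above and a rule of thumb of keeping the heap at about 80% of the container size (the exact ratio is an assumption, tune it for your jobs):

```xml
<!-- Heap for map tasks: about 80% of the 1024 MB container (assumed ratio) -->
<property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx819m</value>
</property>

<!-- Heap for reduce tasks: about 80% of the 1024 MB container (assumed ratio) -->
<property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx819m</value>
</property>
```

Leaving headroom between the heap and the container size gives the JVM room for non-heap memory, so the task is less likely to be killed for exceeding its container limit.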

Exploring yarn-site.xml

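Configuration settings for the YARN daemons: the ResourceManager and the NodeManager.

yarn.nodemanager.aux-services

The auxiliary services that each NodeManager runs. MapReduce jobs on YARN need the mapreduce_shuffle service for the shuffle phase.

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>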
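
yarn-site.xml usually also tells the NodeManagers where the ResourceManager runs and how much memory and CPU each node offers to containers. A sketch under stated assumptions: the host name localhost and the values 4096 MB and 4 vcores are illustrative, not recommendations from this article.

```xml
<!-- Host running the ResourceManager (assumed: single-node setup) -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
</property>

<!-- Memory each NodeManager offers to containers, in MB (assumed value) -->
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>

<!-- Virtual cores each NodeManager offers to containers (assumed value) -->
<property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
</property>
```

With these values, a node could host at most four of the 1024 MB map or reduce containers requested in mapred-site.xml at the same time.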

masters

A list of machines (one per line) that each run a secondary namenode.

slaves

A list of machines (one per line) that each run a DataNode and a NodeManager. In Hadoop 3.x this file is named workers.

...

Connect with me on LinkedIn. If you are looking for 1:1 mentorship for your career and interviews, connect with me on topmate.io/naveenpn
