# Important Hadoop Configurations

In this article, we will learn about the important Hadoop configuration files.

# hadoop-env.sh

This file holds the environment variables used by the scripts that run Hadoop, such as `JAVA_HOME`.

# Exploring core-site.xml

This file holds the configuration settings for Hadoop core, such as the I/O settings that are common to HDFS and MapReduce.

**Reference Link:** [https://hadoop.apache.org/docs/r2.6.2/hadoop-project-dist/hadoop-common/core-default.xml](https://hadoop.apache.org/docs/r2.6.2/hadoop-project-dist/hadoop-common/core-default.xml)
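
As a minimal sketch of what a `core-site.xml` might contain, the fragment below sets the default file system URI and the I/O buffer size. Every `*-site.xml` file wraps its properties in a single `<configuration>` root element. The host name `namenode.example.com` and port `8020` are placeholders assumed for illustration, not values from this article; `4096` is the documented default for `io.file.buffer.size`.

```xml
<configuration>
	<!-- URI of the default file system; host and port are placeholders -->
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://namenode.example.com:8020</value>
	</property>
	<!-- Buffer size in bytes used when reading and writing sequence files -->
	<property>
		<name>io.file.buffer.size</name>
		<value>4096</value>
	</property>
</configuration>
```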

# Exploring hdfs-site.xml

* Configuration settings for the HDFS daemons: the NameNode, the secondary NameNode, and the DataNodes.
    
* Typical settings include the replication factor, block size, storage directories, permissions, and security level.
    

## dfs.block.size

This property changes the block size from its default of 128 MB (134217728 bytes).

```xml
<property>
	<name>dfs.block.size</name>
	<value>134217728</value>
</property>
```
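
In Hadoop 2.x the same setting is also available under the newer name `dfs.blocksize`, with `dfs.block.size` kept as a deprecated alias, and the value may carry a size suffix such as `k`, `m`, or `g`. The sketch below doubles the default block size; 256 MB is chosen purely as an example value.

```xml
<property>
	<!-- Newer property name; 256m is equivalent to 268435456 bytes -->
	<name>dfs.blocksize</name>
	<value>256m</value>
</property>
```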

## dfs.replication

This property changes the replication factor from its default of 3.

```xml
<property>
	<name>dfs.replication</name>
	<value>3</value>
</property>
```
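
On a single-node (pseudo-distributed) setup there is only one DataNode, so blocks cannot be copied to three machines. A common choice in that case, shown here as a sketch and not as a recommendation for production clusters, is a replication factor of 1:

```xml
<property>
	<!-- One copy per block; suitable only when a single DataNode exists -->
	<name>dfs.replication</name>
	<value>1</value>
</property>
```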

## dfs.permissions

If `true`, permission checking is enabled in HDFS. If `false`, permission checking is turned off.

```xml
<property>
 	<name>dfs.permissions</name>
 	<value>false</value>
</property>
```
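
In Hadoop 2.x this setting is listed in `hdfs-default.xml` as `dfs.permissions.enabled`, with `dfs.permissions` kept as a deprecated alias. A sketch that keeps permission checking on (the default) under the newer name might look like this:

```xml
<property>
	<!-- Newer property name; true is the default -->
	<name>dfs.permissions.enabled</name>
	<value>true</value>
</property>
```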

## dfs.namenode.name.dir

This property determines where on the local file system the NameNode stores the name table (fsimage).

```xml
<property>
	<name>dfs.namenode.name.dir</name>
	<value>/home/npntraining/dfs/namenode</value>
</property>
```
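
The value may also be a comma-delimited list of directories, in which case the name table is replicated to all of them for redundancy. In the sketch below, the second path `/mnt/backup/npntraining/dfs/namenode` is a hypothetical location (for example, a separate disk or mount) assumed for illustration:

```xml
<property>
	<name>dfs.namenode.name.dir</name>
	<!-- Second directory is a hypothetical backup location -->
	<value>/home/npntraining/dfs/namenode,/mnt/backup/npntraining/dfs/namenode</value>
</property>
```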

## dfs.datanode.data.dir

This property determines where on the local file system the DataNode stores its blocks.

```xml
<property>
	<name>dfs.datanode.data.dir</name>
	<value>/home/npntraining/dfs/datanode</value>
</property>
```
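
This property also accepts a comma-delimited list, and the DataNode then spreads its blocks across all of the named directories, which typically sit on different devices. The two `/data/N` mount points below are hypothetical paths assumed for illustration:

```xml
<property>
	<name>dfs.datanode.data.dir</name>
	<!-- One directory per physical disk; both paths are hypothetical -->
	<value>/data/1/dfs/datanode,/data/2/dfs/datanode</value>
</property>
```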

# Exploring mapred-site.xml

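Configuration settings for MapReduce: which framework runs the jobs and how much memory and CPU each task requests. The ResourceManager and the NodeManager are YARN daemons, and they are the ones that grant these requests.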

## mapreduce.framework.name

The runtime framework for executing MapReduce jobs. Can be one of `local`, `classic`, or `yarn`.

```xml
<property>
	<name>mapreduce.framework.name</name>
	<value>yarn</value>
</property>
```

## mapreduce.map.memory.mb

The amount of memory to request from the scheduler for each map task.

```xml
<property>
	<name>mapreduce.map.memory.mb</name>
	<value>1024</value>
</property>
```

## mapreduce.map.cpu.vcores

The number of virtual cores to request from the scheduler for each map task.

```xml
<property>
	<name>mapreduce.map.cpu.vcores</name>
	<value>1</value>
</property>
```

## mapreduce.reduce.memory.mb

The amount of memory to request from the scheduler for each reduce task.

```xml
<property>
	<name>mapreduce.reduce.memory.mb</name>
	<value>1024</value>
</property>
```
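
The memory requested here is the size of the whole container, so the JVM heap of the task has to fit inside it. A sketch of a larger reduce container follows, assuming the common rule of thumb (an assumption, not something stated in this article) that the heap is set to roughly 80% of the container through `mapreduce.reduce.java.opts`. Map tasks follow the same pattern with `mapreduce.map.java.opts`.

```xml
<property>
	<!-- Container size requested for each reduce task, in MB -->
	<name>mapreduce.reduce.memory.mb</name>
	<value>2048</value>
</property>
<property>
	<!-- JVM heap kept below the container size (about 80% of 2048 MB) -->
	<name>mapreduce.reduce.java.opts</name>
	<value>-Xmx1638m</value>
</property>
```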

## mapreduce.reduce.cpu.vcores

The number of virtual cores to request from the scheduler for each reduce task.

```xml
<property>
	<name>mapreduce.reduce.cpu.vcores</name>
	<value>1</value>
</property>
```

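# Exploring yarn-site.xml

Configuration settings for the YARN daemons: the ResourceManager and the NodeManager.

## mapreduce.framework.name

This property tells MapReduce to run its jobs on YARN. It is conventionally set in `mapred-site.xml`, as shown in the previous section.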

```xml
<property>
	<name>mapreduce.framework.name</name>
	<value>yarn</value>
</property>
```
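
Properties that belong to YARN itself use the `yarn.` prefix. As a hedged sketch, a small cluster's `yarn-site.xml` might name the ResourceManager host, enable the shuffle service that MapReduce needs, and state how much memory and how many virtual cores each NodeManager can hand out. The host name `resourcemanager.example.com` is a placeholder assumed for illustration; `8192` and `8` are the documented defaults for the two resource properties.

```xml
<configuration>
	<!-- Host running the ResourceManager; placeholder name -->
	<property>
		<name>yarn.resourcemanager.hostname</name>
		<value>resourcemanager.example.com</value>
	</property>
	<!-- Auxiliary service required for the MapReduce shuffle phase -->
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
	<!-- Physical memory, in MB, that a NodeManager may allocate to containers -->
	<property>
		<name>yarn.nodemanager.resource.memory-mb</name>
		<value>8192</value>
	</property>
	<!-- Virtual cores that a NodeManager may allocate to containers -->
	<property>
		<name>yarn.nodemanager.resource.cpu-vcores</name>
		<value>8</value>
	</property>
</configuration>
```

The per-task requests in `mapred-site.xml`, such as `mapreduce.map.memory.mb`, are served out of these NodeManager totals.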

# masters

A list of machines (one per line) that each run a secondary namenode.

# slaves

A list of machines (one per line) that each run a DataNode and a NodeManager.

