For more information about MapReduce Service (MRS), refer to this Link.
This guide is validated with MRS 3.1.0-LTS and Anaconda3 (Anaconda3-2020.07-Linux-x86_64.sh, Link). Since MRS ships with Python 2.7 and 3.8, we choose an Anaconda version that also bundles Python 3.8.
- Install MRS Client
- Install Anaconda
- Integrate with Spark2x
Most of the details are described in Link
It is recommended to place the VM that runs the notebook in the same VPC as the MRS cluster. This way, MRS Manager can easily transfer the MRS client to the target VM.
Once the client has been copied to the target VM, configure an NTP server on the VM and then configure the MRS client.
For installing and configuring NTP:
sudo yum install ntp -y
Edit /etc/ntp.conf and replace the default servers with your master node's IP
service ntpd stop
ntpdate 192.168.1.151 # change to your own master ip
service ntpd start
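The /etc/ntp.conf edit above can be sketched as follows. The master IP is the placeholder used elsewhere in this guide, and the demo file path is an assumption; on the real VM you would operate on /etc/ntp.conf itself:

```shell
# Point ntpd at the MRS master node instead of the public pool servers.
MASTER_IP=192.168.1.151              # replace with your own master node IP
NTP_CONF=/tmp/ntp.conf.demo          # use /etc/ntp.conf on the real VM

# Start from a typical default config (public pool servers enabled).
cat > "$NTP_CONF" <<'EOF'
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org iburst
EOF

# Comment out the default servers and add the master node as the only source.
sed -i 's/^server /#server /' "$NTP_CONF"
echo "server $MASTER_IP iburst" >> "$NTP_CONF"

cat "$NTP_CONF"
```

After changing the file, restart ntpd as shown above so the new server takes effect.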
For configuring the MRS client:
./install.sh /opt/mrsclient
You can use wget to download the chosen version of Anaconda to the VM. For example:
wget https://repo.anaconda.com/archive/Anaconda3-2020.07-Linux-x86_64.sh
It is advised to install to a location other than the default one, for example /opt/anaconda3.
Once done, answer yes to initialize Anaconda3; the initialization code will be written to ~/.bashrc.
The problem is that if it stays in ~/.bashrc, Anaconda3 will be activated automatically on every login. To avoid this, copy the file to ~/.bashrc.anaconda:
cp ~/.bashrc ~/.bashrc.anaconda
Then edit ~/.bashrc to remove the conda initialize block:
vi ~/.bashrc
Finally, run source ~/.bashrc.anaconda whenever you need to load the Anaconda environment.
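The copy-then-strip steps above can be sketched end-to-end. The `# >>> conda initialize >>>` / `# <<< conda initialize <<<` markers are the ones the Anaconda installer writes into ~/.bashrc; the demo file path is an assumption so the real ~/.bashrc is left untouched:

```shell
# Keep a copy that still activates conda, then strip the auto-activation
# block (delimited by the installer's marker comments) from the login file.
BASHRC=/tmp/bashrc.demo              # use ~/.bashrc on the real VM

# A minimal stand-in for a ~/.bashrc after the Anaconda install.
cat > "$BASHRC" <<'EOF'
alias ll='ls -l'
# >>> conda initialize >>>
__conda_setup="..."
# <<< conda initialize <<<
EOF

cp "$BASHRC" "$BASHRC.anaconda"      # full copy, sourced on demand

# Delete everything between the conda markers, inclusive.
sed -i '/# >>> conda initialize >>>/,/# <<< conda initialize <<</d' "$BASHRC"
```

After this, logins stay fast, and `source ~/.bashrc.anaconda` activates Anaconda on demand.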
Then generate the configuration file:
jupyter notebook --generate-config --allow-root
Edit /root/.jupyter/jupyter_notebook_config.py to set the listen IP to the host IP:
vi /root/.jupyter/jupyter_notebook_config.py
Change the port as well if the default one is already in use, then save the file.
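The IP and port settings mentioned above correspond to the `c.NotebookApp.ip` and `c.NotebookApp.port` options in jupyter_notebook_config.py. A minimal sketch of the edit, where the IP and port values are example placeholders and the demo path stands in for /root/.jupyter/jupyter_notebook_config.py:

```shell
# Append the listen address and port to the generated config file.
CONF=/tmp/jupyter_notebook_config.py   # demo path; use the real config on the VM

cat >> "$CONF" <<'EOF'
c.NotebookApp.ip = '192.168.1.100'   # replace with the host IP
c.NotebookApp.port = 8889            # change if the default port is taken
EOF
```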
Once the MRS client and Anaconda are installed, you can launch Jupyter Notebook with the following commands:
source /opt/hadoopclient/bigdata_env # adjust to your MRS client installation path
kinit developuser
source ~/.bashrc.anaconda
export PYSPARK_DRIVER_PYTHON="ipython"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --allow-root"
Finally start the notebook:
pyspark --master yarn --deploy-mode client &
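The launch sequence above can be bundled into a small helper script so it does not have to be retyped after each login. The paths, the `developuser` principal, and the script name come from or extend the steps above and may differ in your environment:

```shell
# Write a start script that bundles environment setup and the notebook launch.
cat > start_notebook.sh <<'EOF'
#!/bin/bash
# Load the MRS client environment and authenticate with Kerberos.
source /opt/hadoopclient/bigdata_env   # adjust to your client install path
kinit developuser

# Load the Anaconda environment saved earlier.
source ~/.bashrc.anaconda

# Tell PySpark to use Jupyter Notebook as its driver frontend.
export PYSPARK_DRIVER_PYTHON="ipython"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --allow-root"

# Start the notebook on YARN in client mode, in the background.
pyspark --master yarn --deploy-mode client &
EOF
chmod +x start_notebook.sh
```

Running `./start_notebook.sh` then performs the whole sequence in one step (kinit will still prompt for the user's password).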