MRS-Labs

MRS Introduction

For more information about MapReduce Service (MRS), refer to this Link.

MRS version

This guide was validated with MRS version 3.1.0-LTS and Anaconda3 (Anaconda3-2020.07-Linux-x86_64.sh Link). Since MRS has Python 2.7 and 3.8 installed, we chose an Anaconda release that also comes with Python 3.8.

Steps

  1. Install MRS Client
  2. Install Anaconda
  3. Integrate with Spark2x

Install MRS Client

Most of the details are described in this Link.

It is recommended to create the VM that runs the notebook in the same VPC as the MRS cluster. This way, MRS Manager can easily transfer the MRS client to the target VM.

Once the client has been copied to the target VM, configure NTP on this VM first, then configure the MRS client.

To install and configure NTP:

sudo yum install ntp -y

Edit /etc/ntp.conf and set the server entry to the IP address of your master node(s), then resynchronize and restart ntpd:

service ntpd stop
ntpdate 192.168.1.151 # change to your own master ip
service ntpd start 
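
The /etc/ntp.conf edit can also be scripted. The sketch below assumes the goal is for the master node to be the only active time source, so it comments out every existing `server` or `pool` line and appends one `server` line per master IP. The function name `point_ntp_at` and the sample file content are illustrative, not part of the original guide.

```python
import re


def point_ntp_at(conf_text, master_ips):
    """Return ntp.conf text whose only active server entries are master_ips."""
    lines = []
    for line in conf_text.splitlines():
        # Disable any currently active time source instead of deleting it,
        # so the original setting stays visible in the file.
        if re.match(r"\s*(server|pool)\s", line):
            lines.append("# " + line)
        else:
            lines.append(line)
    # Append one server entry per master node IP.
    lines.extend(f"server {ip}" for ip in master_ips)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical stock configuration; a real /etc/ntp.conf will differ.
    sample = (
        "driftfile /var/lib/ntp/drift\n"
        "server 0.centos.pool.ntp.org iburst\n"
        "server 1.centos.pool.ntp.org iburst\n"
    )
    # 192.168.1.151 is the example master IP used in this guide.
    print(point_ntp_at(sample, ["192.168.1.151"]), end="")
```

To apply it for real, read /etc/ntp.conf, pass its content through the function, and write the result back as root before running the ntpd commands above.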

To configure the MRS client:

./install.sh /opt/mrsclient

Install Anaconda

You can use wget to download the chosen version of Anaconda to the VM. For example:

wget https://repo.anaconda.com/archive/Anaconda3-2020.07-Linux-x86_64.sh

It is advisable to install it somewhere other than the default location, for example /opt/anaconda3.

When the installer finishes, answer yes to initialize Anaconda3. The initialization snippet is written to ~/.bashrc.

The drawback is that, with the snippet in ~/.bashrc, Anaconda3 is activated automatically on every login. To avoid this, first copy the file to ~/.bashrc.anaconda:

cp ~/.bashrc ~/.bashrc.anaconda

Then run vi ~/.bashrc and remove the conda initialize part.
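
As an alternative to editing by hand, the removal can be sketched in Python. This assumes the block is delimited by the `# >>> conda initialize >>>` and `# <<< conda initialize <<<` marker comments that `conda init` writes. The function name `strip_conda_init` and the sample content are illustrative.

```python
START = "# >>> conda initialize >>>"
END = "# <<< conda initialize <<<"


def strip_conda_init(bashrc_text):
    """Return bashrc_text without the block between the conda init markers."""
    kept = []
    inside = False
    for line in bashrc_text.splitlines(keepends=True):
        if line.strip() == START:
            inside = True      # drop the opening marker and what follows
            continue
        if inside:
            if line.strip() == END:
                inside = False  # drop the closing marker, then resume keeping
            continue
        kept.append(line)
    return "".join(kept)


if __name__ == "__main__":
    # Hypothetical ~/.bashrc content after running the Anaconda installer.
    sample = (
        "alias ll='ls -l'\n"
        "# >>> conda initialize >>>\n"
        "__conda_setup=\"$('/opt/anaconda3/bin/conda' 'shell.bash' 'hook')\"\n"
        "# <<< conda initialize <<<\n"
        "export EDITOR=vi\n"
    )
    print(strip_conda_init(sample), end="")
```

Run it only after ~/.bashrc.anaconda has been created, since that copy is the one that must keep the conda block.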

Finally, run source ~/.bashrc.anaconda to load the Anaconda environment.

Then run jupyter notebook --generate-config --allow-root to generate the configuration file.

Edit /root/.jupyter/jupyter_notebook_config.py (for example with vi) and set c.NotebookApp.ip to the host IP.

If the default port is already in use, change c.NotebookApp.port as well, then save the file.
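
To decide whether the port needs changing, a small check like the one below may help. It is a sketch: it tries to bind each candidate port and prints the two configuration lines to put in jupyter_notebook_config.py. The host IP 192.168.1.100 is a placeholder, and the helper names `port_is_free` and `first_free_port` are illustrative.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Any bind failure (port in use, no permission) counts as not free.
            return False
    return True


def first_free_port(start=8888, tries=20, host="0.0.0.0"):
    """Return the first free port in [start, start + tries)."""
    for port in range(start, start + tries):
        if port_is_free(port, host):
            return port
    raise RuntimeError(f"no free port between {start} and {start + tries - 1}")


if __name__ == "__main__":
    host_ip = "192.168.1.100"  # placeholder: replace with the VM's own IP
    port = first_free_port()   # 8888 is Jupyter Notebook's default port
    # Lines to set in /root/.jupyter/jupyter_notebook_config.py:
    print(f"c.NotebookApp.ip = '{host_ip}'")
    print(f"c.NotebookApp.port = {port}")
```

The check only reflects the moment it runs, so another process could still take the port before the notebook starts.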

Integrate with Spark2x

Once the MRS client and Anaconda are installed, you can prepare the shell for Jupyter Notebook with the following commands:

source /opt/mrsclient/bigdata_env # use the directory passed to install.sh
kinit developuser
source ~/.bashrc.anaconda
export PYSPARK_DRIVER_PYTHON="ipython"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --allow-root"
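
Before launching, it can be useful to confirm that the shell is set up as the commands above intend. The following pre-flight check is a sketch, not part of the MRS client: it verifies that `kinit`, `ipython` and `pyspark` are on PATH and that the two `PYSPARK_DRIVER_*` variables hold the values exported above. The function name `preflight` is illustrative, and the check does not verify that a valid Kerberos ticket exists.

```python
import os
import shutil

# Commands the launch relies on after sourcing bigdata_env and ~/.bashrc.anaconda.
REQUIRED_COMMANDS = ("kinit", "ipython", "pyspark")

# Values exported in the steps above.
REQUIRED_VARS = {
    "PYSPARK_DRIVER_PYTHON": "ipython",
    "PYSPARK_DRIVER_PYTHON_OPTS": "notebook --allow-root",
}


def preflight(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    for cmd in REQUIRED_COMMANDS:
        if which(cmd) is None:
            problems.append(f"{cmd} not found on PATH")
    for name, expected in REQUIRED_VARS.items():
        actual = env.get(name)
        if actual != expected:
            problems.append(f"{name} should be {expected!r}, got {actual!r}")
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print("WARN:", issue)
    if not issues:
        print("Environment looks ready for pyspark.")
```

Run it with the Anaconda Python in the same shell session, since exported variables and PATH changes are only visible there.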

Finally start the notebook:

pyspark --master yarn --deploy-mode client &
