
GPU-notebook (Xin Li) #3

Open
XiaoranYan opened this issue Feb 28, 2019 · 2 comments


@XiaoranYan
Contributor

Hi Ying and Xin,

It is definitely possible. We are currently transitioning to a new AWS account and will be doing some research on AWS and Jetstream for potential solutions.

Please give us a month or so. We will get back to you when we have some working implementations.

Thank you!

Xiaoran

-----Original Message-----
From: Ding, Ying
Sent: Tuesday, January 29, 2019 9:23 PM
To: Pentchev, Valentin vpentche@iu.edu; Yan, Xiaoran yan30@iu.edu; Li, Xin xl60@iu.edu
Subject: biobert

Dear Val and Xiaoran,

It was great to meet you all today. We made great progress.

Xin checked BioBERT, which is really good but requires a 12 GB GPU. Do you think your group can provide this for Xin? We can help you extract bio entities from PubMed articles and make your dataset unique and available to end users (such as life scientists or biologists). The impact would be great.

Best,
Ying

--
Ying Ding
Professor of Informatics
School of Informatics and Computing
Indiana University
http://info.slis.indiana.edu/~dingying/

@XiaoranYan
Contributor Author

XiaoranYan commented Mar 3, 2019

Hi Xin,

Sorry for the extended delay. For administrative reasons, we are still waiting for our new AWS account. In the meantime, we went ahead and created a GPU-enabled container environment on Azure; you can access the Jupyter notebook at the following location:

http://13.66.251.29

Please log in with username XXX and password XXX. Since this is a product in active development, you will need to work with us closely on any issues. In particular:

  1. We have allocated 5 vCPUs, 48 GB RAM, and 1 NVIDIA Tesla K80 GPU (12 GB) for your account. As a cost-saving measure, the environment automatically starts at 9am and shuts down at 5pm on workdays. Please let us know if you prefer a different time slot.

  2. The current notebook image has TensorFlow 1.12 installed and has been tested with the BERT package. You can check out the notebook "GPUtesting.ipynb". I have not tested BioBERT; please set up your dependencies and work from there.

  3. Please try to keep your workflow inside the notebook. Upload local data and download results directly into the folders inside the notebook environment. You can create new notebooks, but remember there is only a single GPU, and it cannot be shared between notebooks. If you are not familiar with the notebook environment, feel free to ask questions.

  4. Access to the CADRE datasets is already set up through our Azure Blob Storage. Please check out the notebook "AzureBlobTest.ipynb" for examples of how to download from and upload to Azure. If you want to learn more about our dataset, feel free to ask questions. We are also planning a tutorial next month if you are interested.

  5. We will be using your use case as a demo. Once you have done some analysis, please build a notebook with clean, well-explained code. We will also build Docker images from your source code for reproducibility testing.

  6. We plan to migrate to AWS once our account is ready. We will also be rolling out new functionality over time, such as cluster auto-scaling, data versioning, and model serving. We will collaborate with you to test these new features.

  7. Keep in mind that the implementation is pre-production, so be prepared for bugs. We will do our best to address your issues. Please record your thoughts and feedback through our GitHub issues; details about the GitHub setup will be sent in a separate email.
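As a quick sanity check for item 2 above, here is a minimal sketch for confirming from a notebook cell that TensorFlow can see the K80. It assumes the image's TensorFlow 1.12 install; the `list_gpus` helper name is our own and not part of any library:

```python
def list_gpus():
    """Return the names of GPU devices TensorFlow can see,
    or [] if TensorFlow is not installed in this environment."""
    try:
        # device_lib is the TF 1.x way to enumerate local devices.
        from tensorflow.python.client import device_lib
    except ImportError:
        return []
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"]

print(list_gpus())  # e.g. ['/device:GPU:0'] on the GPU node
```

An empty list from inside the notebook would indicate the container was scheduled without the GPU attached, which is worth reporting as an issue.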

Thank you!

Xiaoran

@XiaoranYan XiaoranYan self-assigned this Mar 9, 2019
@XiaoranYan
Contributor Author

XiaoranYan commented Mar 21, 2019

To do:

  1. Turn off culling, and replace the custom image with the official Kubeflow image (done)
  2. Increase PV volume to 30 GB (done)
  3. Explore ways to allow a persistent user package environment
    Workaround: install packages under /home/jovyan/ (--user for pip; specify lib for CRAN); build paths still need to be re-added on each restart
  4. Explore ways to sync Azure credentials with CADRE login system
  5. Explore Docker integration with BinderHub; set up Kubernetes resource quotas
  6. Build custom images based on the official Kubeflow image for the GPU and Azure/R package environments for the demo
  7. Explore admin user controls vs Kubernetes controls
  8. Migrate to AWS
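The item-3 workaround above can be sketched as a small startup cell. This is a sketch only, assuming the standard Jupyter `jovyan` home directory and pip's `--user` install scheme; the `add_user_site_path` helper name is our own:

```python
import site
import sys

def add_user_site_path():
    """Re-add the per-user site-packages directory (where
    `pip install --user` puts packages, e.g. under
    /home/jovyan/.local) to sys.path, so user-installed
    packages are importable again after a container restart."""
    user_site = site.getusersitepackages()
    if user_site not in sys.path:
        sys.path.insert(0, user_site)
    return user_site

add_user_site_path()
```

Running this in the first cell of each notebook (or from a startup hook) restores access to packages installed under the persistent volume without reinstalling them.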

Administrative decisions:

  1. Decide the opening range (public, BTAA users upon request, or stage 2 collaborators only) for the 3 different products, i.e. the query GUI, Postgres, and notebooks. What should we communicate to the product owner console?
  2. For stage 2 collaborators, what are the resource limitation and cost estimates?
  3. Specify demo size and allocate AWS deployment resources and lifetime.
  4. Review storage costs and propose data versioning policies (layered storage options: <10 GB dockerizable working space, a permanent public repo, 3rd-party drives, a premium versioned data repo) with the product owner console.
