
GPU-notebook (Xin Li) #3

Open
XiaoranYan opened this issue Feb 28, 2019 · 2 comments


@XiaoranYan
Contributor

Hi Ying and Xin,

It is definitely possible. We are currently transitioning to a new AWS account and will be doing some research on AWS and Jetstream for potential solutions.

Please give us a month or so. We will get back to you when we have some working implementations.

Thank you!

Xiaoran

-----Original Message-----
From: Ding, Ying
Sent: Tuesday, January 29, 2019 9:23 PM
To: Pentchev, Valentin vpentche@iu.edu; Yan, Xiaoran yan30@iu.edu; Li, Xin xl60@iu.edu
Subject: biobert

Dear Val and Xiaoran,

It was great to meet you all today. We made great progress.

Xin checked BioBERT, which is really good but requires a 12 GB GPU. Do you think your group can provide this for Xin? We can help you extract bio entities from PubMed articles and make your dataset unique and available to end users (such as life scientists or biologists). The impact would be great.

Best,
Ying

--
Ying Ding
Professor of Informatics
School of Informatics and Computing
Indiana University
http://info.slis.indiana.edu/~dingying/

@XiaoranYan
Contributor Author

XiaoranYan commented Mar 3, 2019

Hi Xin,

Sorry for the extended delay. For administrative reasons, we are still waiting for our new AWS account. In the meantime, we went ahead and created a GPU-enabled container environment on Azure; you can access the Jupyter notebook at the following location:

http://13.66.251.29

Please log in with username XXX and password XXX. Since this is a product in active development, you will need to work with us closely on any issues. In particular:

  1. We have allocated 5 vCPUs, 48 GB RAM, and 1 NVIDIA Tesla K80 GPU (12 GB) for your account. As a cost-saving measure, the environment automatically starts at 9am and shuts down at 5pm on workdays. Please let us know if you prefer a different time slot.

  2. The current notebook image has TensorFlow 1.12 installed and has been tested with the BERT package. You can check out the notebook "GPUtesting.ipynb". I have not tested BioBERT; please set up your dependencies and work from there.

  3. Please try to keep your workflow inside the notebook. Upload local data and download results directly into the folders inside the notebook environment. You can create new notebooks, but remember there is only a single GPU, and it cannot be shared between notebooks. If you are not familiar with the notebook environment, feel free to ask questions.

  4. Access to the CADRE datasets is already set up through our Azure Blob Storage. Please check out the notebook "AzureBlobTest.ipynb" for examples of how to download from and upload to Azure. If you want to learn more about our dataset, feel free to ask questions. We are also planning a tutorial next month if you are interested.

  5. We will be using your use case as a demo. Once you have done some analysis, please build a notebook with clean, well-explained code. We will also build Docker images from your source code for reproducibility testing.

  6. We plan to migrate to AWS once our account is ready. We will also be rolling out new functionality over time, such as cluster auto-scaling, data versioning, and model serving. We will collaborate with you to test these new features.

  7. Keep in mind that the implementation is pre-production, so be prepared for bugs. We will do our best to address your issues. Please record your thoughts and feedback through our GitHub issues; details about the GitHub setup will be sent in a separate email.
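As a quick sanity check for item 2 above, here is a minimal sketch for confirming from a notebook cell that TensorFlow can see the K80. It assumes the image's TensorFlow 1.12 install; the `list_gpus` helper name is our own and not part of any library:

```python
def list_gpus():
    """Return the names of GPU devices TensorFlow can see,
    or [] if TensorFlow is not installed in this environment."""
    try:
        # device_lib is the TF 1.x way to enumerate local devices.
        from tensorflow.python.client import device_lib
    except ImportError:
        return []
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"]

print(list_gpus())  # e.g. ['/device:GPU:0'] on the GPU node
```

An empty list from inside the notebook would indicate the container was scheduled without the GPU attached, which is worth reporting as an issue.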

Thank you!

Xiaoran

@XiaoranYan XiaoranYan self-assigned this Mar 9, 2019
@XiaoranYan
Contributor Author

XiaoranYan commented Mar 21, 2019

To do:

  1. Turn off culling, and replace the custom image with the official Kubeflow image (done)
  2. Increase PV volume to 30 GB (done)
  3. Explore ways to allow a persistent user package environment
    Workaround: install packages under /home/jovyan/ (--user for pip; specify lib for CRAN); build paths still need to be re-added on each restart
  4. Explore ways to sync Azure credentials with CADRE login system
  5. Explore Docker integration with BinderHub; set up Kubernetes resource quotas
  6. Build custom images based on the official Kubeflow image for the GPU and Azure/R package environments for the demo
  7. Explore admin user controls vs Kubernetes controls
  8. Migrate to AWS
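The item-3 workaround above can be sketched as a small startup cell. This is a sketch only, assuming the standard Jupyter `jovyan` home directory and pip's `--user` install scheme; the `add_user_site_path` helper name is our own:

```python
import site
import sys

def add_user_site_path():
    """Re-add the per-user site-packages directory (where
    `pip install --user` puts packages, e.g. under
    /home/jovyan/.local) to sys.path, so user-installed
    packages are importable again after a container restart."""
    user_site = site.getusersitepackages()
    if user_site not in sys.path:
        sys.path.insert(0, user_site)
    return user_site

add_user_site_path()
```

Running this in the first cell of each notebook (or from a startup hook) restores access to packages installed under the persistent volume without reinstalling them.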

Administrative decisions:

  1. Decide the opening range (public, BTAA users upon request, or stage 2 collaborators only) for the 3 different products, i.e. the query GUI, Postgres, and notebooks. What should we communicate to the product owner console?
  2. For stage 2 collaborators, what are the resource limitation and cost estimates?
  3. Specify demo size and allocate AWS deployment resources and lifetime.
  4. Review storage costs and propose data versioning policies (layered storage options: <10 GB dockerizable working space, a permanent public repo, 3rd-party drives, a premium versioned data repo) with the product owner console.
