This page provides a summary list and contact links for key computing and storage hardware resources (physical and virtual) that are available to support the computing needs of MIT researchers. There is also a vibrant community of computational resources supporting research that goes beyond physical/virtual hardware; this page focuses on core physical/virtual hardware infrastructure that is readily available.

## Cloud credits programs

| Name | URL | Details |
| --- | --- | --- |
| Microsoft Azure Cloud | https://cloud.mit.edu/credits | The MIT cloud credits program currently has a very limited set of Microsoft Azure credits available. These can be used for any of the Azure virtual machines and Azure machine learning tools. The resources include clusters of recent-generation Volta GPUs with high-speed interconnects, suitable for large machine learning workflows. |
| Google Cloud Platform | https://cloud.mit.edu/credits | Currently all MIT credits for Google Cloud Platform have been assigned to projects. Google does provide extensive direct credits to researchers. Further information is available from abpadgett@google.com. |
| Amazon AWS | https://aws.amazon.com | Amazon has several programs that provide credits for computing and data storage for educational, research, and societal-good purposes. Further information is available from cnoonan@amazon.com. |
| Penguin POD | https://pod.penguincomputing.com | Penguin Computing is preparing an AMD GPU service that will be available to MIT researchers in the fall of 2020. Further information is available from cdeyoung@penguincomputing.com. |

## MIT campus-wide hardware

| Name | URL | Details |
| --- | --- | --- |
| Engaging | https://engaging-web.mit.edu | The Engaging cluster is open to everyone on campus. It has around 80,000 x86 CPU cores and 300 GPU cards ranging from the K80 generation to recent Voltas. Hardware access is through the Slurm resource scheduler, which supports batch and interactive workloads and allows dedicated reservations (a minimal job submission sketch appears below this table). The cluster has a large shared file system for working datasets. Additional compute and storage resources can be purchased by PIs. A wide range of standard software is available and the Docker-compatible Singularity container tool is supported. User-level tools such as Anaconda for Python, R libraries, and Julia packages are all supported. A range of PI-group-maintained custom software stacks is also available through the widely adopted environment modules toolkit. A standard, open-source, web-based portal supporting Jupyter notebooks, RStudio, Mathematica, and X graphics is available at https://engaging-ood.mit.edu. Further information and support is available from engaging-support@techsquare.com. |
| C3DDB | https://c3ddb01.mit.edu/request_account | The C3DDB cluster is open to everyone on campus for research in the general areas of life sciences, health sciences, computational biology, biochemistry, and biomechanics. It has around 8,000 x86 CPU cores and 100 K80-generation GPU cards. Hardware access is through the Slurm resource scheduler, which supports batch and interactive workloads and allows dedicated reservations. A wide range of standard software is available and the Docker-compatible Singularity container tool is supported. Further information and support is available from c3ddb-admin@techsquare.com. |
| Supercloud | https://supercloud.mit.edu | The Supercloud system is a collaboration with MIT Lincoln Laboratory on a shared facility that is optimized for streamlining open research collaborations with Lincoln Laboratory. The facility is open to everyone on campus. The latest Supercloud system has more than 16,000 x86 CPU cores and more than 850 NVIDIA Volta GPUs in total. Hardware access is through the Slurm resource scheduler, which supports batch and interactive workloads and allows dedicated reservations. A wide range of standard software is available and the Docker-compatible Singularity container tool is supported. A custom, web-based portal supporting Jupyter notebooks is available at https://txe1-portal.mit.edu/. Further information and support is available at supercloud@mit.edu. |
| Satori | https://mit-satori.github.io | Satori is an IBM Power 9 large-memory node system. It is open to everyone on campus and has optimized software stacks for machine learning and for image-stack post-processing for MIT.nano Cryo-EM facilities. The system has 256 NVIDIA Volta GPU cards attached in groups of four to 1 TB memory nodes and a total of 2,560 Power 9 CPU cores. Hardware access is through the Slurm resource scheduler, which supports batch and interactive workloads and allows dedicated reservations. A wide range of standard software is available and the Docker-compatible Singularity container tool is supported. A standard web-based portal with Jupyter notebook support is available at https://satori-portal.mit.edu. Additional compute and storage resources can be purchased by PIs and integrated into the system. Further information and support is available at satori-support@techsquare.com. |
| subMIT | http://submit04.mit.edu/submit-users-guide/index.html | [subMIT.mit.edu](http://submit04.mit.edu/submit-users-guide/index.html) is a batch submission interface to on- and off-campus computing resources including the Open Science Grid. It is operated by the MIT Laboratory for Nuclear Science and is intended to be available to anyone on campus. Further information is available from pra@mit.edu. |
| AMD GPU cluster | http://amdmit.mit.edu | AMD and MIT are partnering on a supercomputing resource for machine learning applications from the AMD HPC Fund for COVID-19 research. The system will provide 1 PFlop/s of floating-point capability through 160 AMD MI50 GPU cards in fall 2020. The AMD GPU cluster is integrated into the Satori cluster for access. |
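
Most of the clusters above are accessed through the Slurm scheduler, environment modules, and Singularity, so job submission looks much the same on each of them. The following is a minimal sketch only; the partition name, module name, and analysis script are hypothetical placeholders, so consult each cluster's own documentation for the actual values.

```python
# Minimal sketch: write a Slurm batch script and submit it with sbatch.
# The partition, module, and my_analysis.py names are hypothetical placeholders,
# not values taken from any specific MIT cluster.
import subprocess
from pathlib import Path

job_script = """#!/bin/bash
#SBATCH --job-name=example
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00
#SBATCH --partition=normal

# "normal" above and the module/script names below are placeholders;
# check `sinfo` and `module avail` on the cluster you are using.
module load anaconda
python my_analysis.py
"""

Path("example_job.sh").write_text(job_script)

# sbatch prints the assigned job ID on success; squeue/scancel manage the job afterwards.
result = subprocess.run(
    ["sbatch", "example_job.sh"], capture_output=True, text=True, check=True
)
print(result.stdout.strip())
```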

## DLC shared hardware

| Name | URL | Details |
| --- | --- | --- |
| TIG Shared Computing | https://tig.csail.mit.edu/shared-computing | The CSAIL infrastructure group (TIG) operates an OpenStack cluster and a Slurm cluster for general use by members of CSAIL. The OpenStack environment supports full virtual machines. The Slurm cluster supports Singularity as a container engine for Docker containers (a short Singularity usage sketch appears below this table). Additional compute and storage resources can be purchased by PIs to support group-specific needs. Further information and support is available at help@csail.mit.edu. |
| Openmind | https://openmind.mit.edu | Openmind is a shared cluster for Brain and Cognitive Science research at MIT. It has several hundred GPU cards and just under 2,000 CPU cores. The cluster resources are managed by the Slurm scheduler, which provides support for batch, interactive, and reservation-based use. The Singularity container system is available for executing custom Docker images. Further information and support is available from neuro-admin@techsquare.com. |
| LNS Computing | http://rc.lns.mit.edu | The Laboratory for Nuclear Science in Physics operates computing resources that are available to researchers within LNS. Further information and support is available from pra@mit.edu. |
| Kavli Computing | Kavli Computing | The MIT Kavli Institute operates a cluster for astrophysics research. The cluster uses the Slurm resource scheduler and is available for use by Kavli researchers. |
| Koch Bioinformatics | https://ki.mit.edu/sbc/bioinformatics | Koch operates a bioinformatics facility which specializes in the processing needs of computational biologists. |
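
Several of the clusters on this page run Docker images through Singularity rather than running Docker directly. A minimal sketch, assuming Singularity is installed on the login or compute node and using an arbitrary public Docker Hub image:

```python
# Minimal sketch: run a command inside a Docker Hub image via Singularity.
# The image (python:3.10) is an arbitrary public example, not a site-provided one.
import subprocess

# Singularity can pull and run Docker images directly through the docker:// URI.
subprocess.run(
    ["singularity", "exec", "docker://python:3.10", "python", "--version"],
    check=True,
)
```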

## COVID-19 research support programs

There are multiple programs providing accelerated access to resources for COVID-19 related activities; some of these are:

| Name | URL | Details |
| --- | --- | --- |
| OSTP HPC COVID-19 | https://covid19-hpc-consortium.org | Provides rapid access to large compute resources for favorably reviewed mini-proposals. Resources available span very large supercomputers to commercial cloud providers. Projects in need of at-scale compute resources for any work related to managing and mitigating the COVID-19 outbreak are eligible. |
| MGHPCC COVID-19 | https://www.mghpcc.org/mghpcc-resources-for-covid-19-research/ | University participants in the Massachusetts Green High Performance Computing Center are pooling capacity to provide accelerated access for COVID-19 projects that may need resources. |
| AWS | https://aws.amazon.com/data-exchange/covid-19 | AWS has assembled an open repository of COVID-19 data. |
| Azure | https://ai4hcovidgrants.microsoft.com | The Microsoft Azure organization is providing grants through its AI for Health program to support COVID-19 computational needs. |
| IBM | https://developer.ibm.com/callforcode/ | IBM is sponsoring a Call for Code program for COVID-19 projects. |
| GCP | https://edu.google.com/programs/credits/research | Google Cloud is providing a rapid cloud credits application process for COVID-19 related activities. |

## Major cloud provider standard credits programs

| Provider | URL | Details |
| --- | --- | --- |
| AWS | https://aws.amazon.com/grants | Provides cloud credit grants for both research and education projects. |
| AWS | https://aws.amazon.com/opendata | Hosts open data for sharing with others. AWS has a process for applying to have datasets considered for hosting. |
| Azure | https://azure.microsoft.com/en-us/education | Provides Azure cloud credits for education. |
| Azure | https://azure.microsoft.com/en-us/services/open-datasets | Hosts standard datasets for general use, including machine learning. Additional datasets can be nominated for inclusion. |
| Google | https://edu.google.com/programs/students, https://edu.google.com/programs/faculty, https://edu.google.com/programs/researchers | Offers free credits and technical resources for education and research. |

## Longer term storage

The compute resources listed above provide various storage options, especially for active data. For longer-term storage, some available general services are listed below. Many research communities also participate in domain-specific archival projects (e.g. EMPIAR, PDB, BCO, and others in the General List). Long-term archival of large digital artifacts is an evolving field; the MIT Libraries provide online and in-person services to assist researchers in identifying digital archival practices: https://libraries.mit.edu/data-management/.

| Provider | URL | Details |
| --- | --- | --- |
| Google | https://drive.google.com | All MIT accounts include access to Google Drive storage. The storage does not have any pre-set capacity limits. Files can be uploaded using web clients or command-line clients such as rclone (a transfer sketch appears below this table). rclone transfers to Google Drive officially support at least 1 TB per day, although the cap sometimes allows larger amounts. Transfer speeds are sufficient to move 1 TB within a few hours. |
| Code42 | Crashplan | MIT accounts can use the Code42 Crashplan service as a cloud backup service for desktop systems. This provides a self-managed way to mirror laptop and desktop system contents to a cloud recovery service. |
| TSM | TSM | For self-managed servers, MIT provides access to a backup service that supports all operating systems and hardware. This service is geared toward disaster recovery to backstop self-managed storage servers. |
| NESE | NESE | The Northeast Storage Exchange is an experimental very-large-capacity storage infrastructure that is jointly operated across several regional universities (BU, Harvard, MIT, UMass, Northeastern). It targets projects with storage needs that go beyond what is practical with commercial and/or open community services. The NESE storage service supports both unencrypted and encrypted data access. The service is planned to be a cost-effective, modest fee-for-service option for projects with needs that go beyond commercial services. It is in an early test mode with several projects. Further information is available from mtiernan@mit.edu. |
| Dataverse | Dataverse | Many groups make use of Dataverse services for archiving digital material with an attached DOI. |
| Zenodo | Zenodo | Zenodo is a CERN-based service that is openly available to all researchers and provides DOI handles for uploaded collections. |
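
As a concrete illustration of the rclone route to Google Drive mentioned above, the following is a minimal sketch; it assumes an rclone remote named `gdrive` has already been set up with `rclone config`, and both paths are hypothetical placeholders.

```python
# Minimal sketch: copy a local directory to MIT Google Drive via rclone.
# Assumes a remote named "gdrive" was already configured with `rclone config`;
# the local path and remote folder below are hypothetical placeholders.
import subprocess

local_dir = "/data/project_archive"
remote_dest = "gdrive:project_archive"

# `rclone copy` transfers new or changed files; --progress prints transfer status.
subprocess.run(["rclone", "copy", local_dir, remote_dest, "--progress"], check=True)
```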

## National facilities

Academic researchers and their collaborators can write proposals for access to numerous large-scale computing facilities both within the US and globally. These include some of the largest systems available. The programs include XSEDE, DOE INCITE, and the Open Science Grid.

| Provider | URL | Details |
| --- | --- | --- |
| NSF XSEDE | XSEDE | The US National Science Foundation supports a network of shared resources for workloads of all sizes and in all disciplines. The XSEDE program provides easy access to getting-started accounts and training, as well as a competitive proposal process for access to large resources. |
| DOE INCITE | INCITE | The US Department of Energy operates some of the world's largest high-performance computers. Access to these systems for large problems that align with DOE missions is available through competitive proposals to the INCITE program. |
| Open Science Grid | OSG | The Open Science Grid is a large-scale distributed computing platform that provides opportunistic cycles on all sorts of systems around the US and beyond. Its operations are supported by the US National Science Foundation, but many facilities allow opportunistic workloads from the Open Science Grid. Its primary area of application is massive ensembles of independent compute tasks of all sorts. |
| Cloud Bank | https://www.cloudbank.org | Cloud Bank is an NSF-sponsored initiative that aims to provide a unified portal to access multiple cloud providers. It supports several models, including incorporating cloud funding into proposals to certain NSF programs in a way that may be exclusive of MTDC F&A. The Cloud Bank effort also supports small educational grants of credits. |

## About this page

This page is maintained on GitHub. If you have something you want to add or change, please don't be shy and feel free to submit a PR.