[QUESTION] Creating a Prediction Service on Vertex AI (chapter 19: training_and_deploying_at_scale.ipynb) #135

Open
michabuehlmann opened this issue Jun 4, 2024 · 0 comments


I am trying the code from the section "Creating a Prediction Service on Vertex AI" in chapter 19. The first cells run normally, but the following code fails:

endpoint = aiplatform.Endpoint.create(display_name="michael-mnist-endpoint")

endpoint.deploy(
    mnist_model,
    min_replica_count=1,
    max_replica_count=5,
    machine_type="n1-standard-32",
    #accelerator_type='NVIDIA_TESLA_K80',
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=4
)

I get the following log output and stack trace:

INFO:google.cloud.aiplatform.models:Creating Endpoint
INFO:google.cloud.aiplatform.models:Create Endpoint backing LRO: projects/980799330137/locations/us-central1/endpoints/2848656506683916288/operations/8057826248975450112
INFO:google.cloud.aiplatform.models:Endpoint created. Resource name: projects/980799330137/locations/us-central1/endpoints/2848656506683916288
INFO:google.cloud.aiplatform.models:To use this Endpoint in another session:
INFO:google.cloud.aiplatform.models:endpoint = aiplatform.Endpoint('projects/980799330137/locations/us-central1/endpoints/2848656506683916288')
INFO:google.cloud.aiplatform.models:Deploying Model projects/980799330137/locations/us-central1/models/8350220166423904256 to Endpoint : projects/980799330137/locations/us-central1/endpoints/2848656506683916288
INFO:google.cloud.aiplatform.models:Deploy Endpoint model backing LRO: projects/980799330137/locations/us-central1/endpoints/2848656506683916288/operations/2642810647015849984
---------------------------------------------------------------------------
ResourceExhausted                         Traceback (most recent call last)
<ipython-input-12-edad841ee8d1> in <cell line: 3>()
      1 endpoint = aiplatform.Endpoint.create(display_name="michael-mnist-endpoint")
      2 
----> 3 endpoint.deploy(
      4     mnist_model,
      5     min_replica_count=1,

4 frames
/usr/local/lib/python3.10/dist-packages/google/api_core/future/polling.py in result(self, timeout, retry, polling)
    259             # pylint: disable=raising-bad-type
    260             # Pylint doesn't recognize that this is valid in this case.
--> 261             raise self._exception
    262 
    263         return self._result

ResourceExhausted: 429 The following quotas are exceeded: CustomModelServingCPUsPerProjectPerRegion,CustomModelServingT4GPUsPerProjectPerRegion 8: The following quotas are exceeded: CustomModelServingCPUsPerProjectPerRegion,CustomModelServingT4GPUsPerProjectPerRegion
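
As a rough check of what the deploy call above asks for, the sketch below multiplies the per-replica resources by the replica counts. It assumes an n1-standard-32 machine provides 32 vCPUs (the number in its name). The helper `requested_capacity` and the lookup table are hypothetical names used only for illustration. The project's actual quota limits are not shown in the error, and whether the quota check uses the minimum or the maximum replica count is not something this sketch settles.

```python
# Hypothetical helper: estimate the vCPUs and GPUs a deployment requests.
# Assumption: n1-standard-32 provides 32 vCPUs per machine.
VCPUS_PER_MACHINE = {"n1-standard-32": 32}


def requested_capacity(machine_type, accelerator_count, replicas):
    """Return (total_vcpus, total_gpus) for the given number of replicas."""
    vcpus = VCPUS_PER_MACHINE[machine_type] * replicas
    gpus = accelerator_count * replicas
    return vcpus, gpus


# Values taken from the endpoint.deploy(...) call above.
print(requested_capacity("n1-standard-32", 4, replicas=1))  # (32, 4)
print(requested_capacity("n1-standard-32", 4, replicas=5))  # (160, 20)
```

Even a single replica requests 32 serving vCPUs and 4 T4 GPUs in the region, which would be consistent with both quotas named in the error being exceeded.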

I'm not sure how to configure Google Cloud to resolve this quota error.
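
One option that might be worth trying, besides requesting a quota increase, is a deployment with a much smaller footprint. The following is only a sketch under assumptions: the wrapper `deploy_small` is a hypothetical helper, the machine type and counts are illustrative values rather than settings known to fit this project's quota, and it uses only the `endpoint.deploy` arguments already present in the call above.

```python
def deploy_small(endpoint, model):
    """Deploy `model` to `endpoint` with a reduced resource request.

    Assumption: one small machine with a single T4 GPU and no autoscaling
    is enough for testing; these values are illustrative, not verified
    against any particular project's quota.
    """
    small_config = dict(
        min_replica_count=1,
        max_replica_count=1,          # no autoscaling beyond one replica
        machine_type="n1-standard-4",  # 4 vCPUs instead of 32
        accelerator_type="NVIDIA_TESLA_T4",
        accelerator_count=1,           # 1 GPU instead of 4
    )
    endpoint.deploy(model, **small_config)
    return small_config
```

With the objects from the notebook, this would be called as `deploy_small(endpoint, mnist_model)`. If the same 429 error still appears, the regional quota for custom model serving is probably lower than even this request.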

Versions

  • OS: macOS 14.1.2
  • Python: 3.11.8
  • TensorFlow: 2.15.0
  • Scikit-Learn: 1.4.2