class moto.sagemakerruntime.models.SageMakerRuntimeBackend(region_name: str, account_id: str)

Implementation of SageMakerRuntime APIs.

Implemented features for this service

  • [X] invoke_endpoint

    This call will return static data by default.

    You can use a dedicated API to override this, by configuring a queue of expected results.

    A request to invoke_endpoint will take the first result from that queue. Subsequent requests using the same details will return the same result. Other requests using different details will take the next result from the queue, or return static data if the queue is empty.

    Configure this queue by making an HTTP POST request to /moto-api/static/sagemaker/endpoint-results. An example invocation looks like this:

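    import boto3
    import requests

    expected_results = {
        "account_id": "123456789012",  # This is the default - can be omitted
        "region": "us-east-1",  # This is the default - can be omitted
        "results": [
            {
                "Body": "first body",
                "ContentType": "text/xml",
                "InvokedProductionVariant": "prod",
                "CustomAttributes": "my_attr",
            },
            # other results as required
        ],
    }
    # Assumes moto's default API host; in server mode, use the server's own address
    requests.post(
        "http://motoapi.amazonaws.com/moto-api/static/sagemaker/endpoint-results",
        json=expected_results,
    )
    client = boto3.client("sagemaker-runtime", region_name="us-east-1")
    details = client.invoke_endpoint(EndpointName="asdf", Body="qwer")
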
  • [X] invoke_endpoint_async

  • [ ] invoke_endpoint_with_response_stream
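
The queue behaviour described for invoke_endpoint can be hard to picture from prose alone. The following is a minimal toy model of those semantics, not moto's actual implementation. The class name ResultQueueModel, the STATIC_RESULT placeholder, and the choice of (endpoint name, body) as the request "details" are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

# Hypothetical placeholder for the static data returned when the queue is empty.
STATIC_RESULT = {"Body": "static body"}


class ResultQueueModel:
    """Toy model of the documented queue semantics (not moto's code)."""

    def __init__(self) -> None:
        self._queue: Deque[dict] = deque()
        # Assumption: a request's "details" are its endpoint name and body.
        self._seen: Dict[Tuple[str, str], dict] = {}

    def configure(self, results: List[dict]) -> None:
        """Append expected results to the end of the queue."""
        self._queue.extend(results)

    def invoke(self, endpoint_name: str, body: str) -> dict:
        """Return the result for these details, consuming the queue on first use."""
        key = (endpoint_name, body)
        if key not in self._seen:
            # First request with these details: take the next queued result,
            # or fall back to static data if nothing is queued.
            self._seen[key] = self._queue.popleft() if self._queue else STATIC_RESULT
        return self._seen[key]


model = ResultQueueModel()
model.configure([{"Body": "first body"}, {"Body": "second body"}])

first = model.invoke("asdf", "qwer")      # takes the first queued result
repeat = model.invoke("asdf", "qwer")     # same details: same result again
second = model.invoke("asdf", "other")    # different details: next queued result
fallback = model.invoke("asdf", "third")  # queue is empty: static data
```

In this sketch, repeating a request never consumes a queue entry, so the number of queued results needed in a test equals the number of distinct requests the test makes.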