
Consider returning empty usage data for /v3/processes/:guid/stats #2227

Closed
@tcdowney

Description



Issue

When log-cache (or metric-proxy on cf-for-k8s) is unavailable or returns a bad response for a particular process, the entire /v3/processes/:guid/stats endpoint fails with an error.

Context

This behavior is unfortunate for clients who do not care about process metrics and instead are using the endpoint for other information (e.g. is my process running? what ports is my process listening on?). Ideally the endpoint could continue to respond with the information it is able to provide when the platform is degraded.

Steps to Reproduce

On cf-for-k8s you can easily reproduce this behavior by running cf restart on an app in a loop. It fails roughly every 10 restarts, since metric-proxy has difficulty fetching metrics from the Kubernetes API while pod containers are being churned.

Even more easily, you can reproduce this by explicitly raising an error in this method of the log-cache client (a sketch of the change follows below):

def container_metrics(source_guid:, envelope_limit: DEFAULT_LIMIT, start_time:, end_time:)

Expected result

Request:

curl "https://api.example.org/v3/processes/[guid]/stats" \
  -X GET \
  -H "Authorization: bearer [token]"

Response:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "resources": [
    {
      "type": "web",
      "index": 0,
      "state": "RUNNING",
      "usage": { },
      "host": "10.244.16.10",
      "instance_ports": [
        {
          "external": 64546,
          "internal": 8080,
          "external_tls_proxy_port": 61002,
          "internal_tls_proxy_port": 61003
        }
      ],
      "uptime": 9042,
      "mem_quota": 268435456,
      "disk_quota": 1073741824,
      "fds_quota": 16384,
      "isolation_segment": "example_iso_segment",
      "details": null
    }
  ]
}

or

HTTP/1.1 200 OK
Content-Type: application/json

{
  "resources": [
    {
      "type": "web",
      "index": 0,
      "state": "RUNNING",
      "usage": {
        "time": "2016-03-23T23:17:30.476314154Z",
        "cpu": null,
        "mem": null,
        "disk": null
      },
      "host": "10.244.16.10",
      "instance_ports": [
        {
          "external": 64546,
          "internal": 8080,
          "external_tls_proxy_port": 61002,
          "internal_tls_proxy_port": 61003
        }
      ],
      "uptime": 9042,
      "mem_quota": 268435456,
      "disk_quota": 1073741824,
      "fds_quota": 16384,
      "isolation_segment": "example_iso_segment",
      "details": null
    }
  ]
}

Current result

An error is returned from the CF API. Clients typically do not handle this well and it results in failed cf push and cf restart commands for users.

Possible Fix

One potential fix is to return 0s for the values, since this is what we do for empty envelopes. However, in a recent story (https://www.pivotaltracker.com/n/projects/966314/stories/177066349) @jenspinney pointed out that this is not ideal since 0 is a valid value for these metrics, and I agree. Alternatively, we could use a sentinel integer like -1, but clients making decisions based on this data might not handle that well, and it could result in odd bugs.

I think the safest and most semantic fix here is to return an empty usage object, as in the first expected response above. Some clients may not handle that correctly, but ideally it would manifest as a response-parsing error rather than a mathematical one.

Alternatively, the onus could be put on clients to retry when they get errors back from this endpoint. However, if log-cache is unavailable for long periods of time, retries will not help much. Since this endpoint is in the critical path for cf push and cf (re)start, I think it makes the most sense to improve the experience in Cloud Controller.

For good measure, I also opened an issue against the CLI:

cloudfoundry/cli#2160
