On the Cloud Foundry Runtime team, we're beginning to discuss monitoring
for deployments that may not have internet access. I was hoping to get
feedback from the community on this decision that will likely affect the
architecture of Cloud Foundry.

To briefly go over where we are and how we got here:

    - Prior to AWS, we were using OpenTSDB <http://opentsdb.net/>
    - We decided to use CloudWatch <http://aws.amazon.com/cloudwatch/> for
    Amazon AWS deployments of Cloud Foundry
    - We decided to use Datadog <http://www.datadoghq.com/> instead of
    CloudWatch
       - IIRC the reasons we used Datadog included but were not limited to:
          - Higher resolution when viewing charts
          - More charting functions
          - Out-of-the-box integration with PagerDuty
          - Costs money now, but lets us focus on higher priorities until
          appropriate to revisit

I really like Datadog. It's easy to use, the charts look great, and their
support has been fantastic. But Datadog just won't work in everyone's
circumstances (e.g. non-AWS deployments with no internet access). We're
about at the point where it's time to consider options for monitoring that
will not require external internet access.

We're thinking that statsd is our most obvious candidate to replace
Datadog. It seems to be a buzzworthy tool, and Datadog can optionally use a
customized statsd backend <http://docs.datadoghq.com/guides/dogstatsd/> (potentially
saving us some work).

However, many of our metrics currently rely on tags, which are specific to
Datadog. For instance, we would record the CPU load average of each component
and use tags to specify that a data point was for a job of "dea" and an
index of 0. Then we can filter the data by tags, so we could see the
average CPU load for all DEAs, or we could show a dashboard with a
chart just for each Cloud Controller. Once I learned that we were
accomplishing this through a non-standard statsd backend, I began to wonder
if needing tags at all was a statsd antipattern. Surely our use case is
very typical for users of statsd.
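
To make the tag-based filtering concrete, here is a minimal Python sketch of
the idea. It is not Datadog's API: the DataPoint type, the average_where
helper, the metric name, and the sample values are all made up for
illustration.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class DataPoint:
    """One recorded sample; `tags` holds labels such as job and index."""
    metric: str
    value: float
    tags: dict = field(default_factory=dict)


def average_where(points, metric, **wanted_tags):
    """Average every sample of `metric` whose tags include all of `wanted_tags`."""
    values = [
        p.value
        for p in points
        if p.metric == metric
        and all(p.tags.get(k) == v for k, v in wanted_tags.items())
    ]
    return mean(values) if values else None


# Hypothetical samples: load average reported by three component instances.
points = [
    DataPoint("cpu.load_avg", 0.5, {"job": "dea", "index": "0"}),
    DataPoint("cpu.load_avg", 1.5, {"job": "dea", "index": "1"}),
    DataPoint("cpu.load_avg", 3.0, {"job": "cloud_controller", "index": "0"}),
]

# "All DEAs" view: filter on the job tag only.
dea_average = average_where(points, "cpu.load_avg", job="dea")

# "One chart per Cloud Controller" view: filter on job and index.
cc0_average = average_where(
    points, "cpu.load_avg", job="cloud_controller", index="0"
)
```

The point is only that the same samples can be sliced along any tag after the
fact, which is the behavior a replacement would have to reproduce some other
way.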

So it's going to take some non-trivial effort to switch to statsd, and we
haven't even touched on frontends yet. I don't think I've talked to anyone
on the team who has had significant experience using statsd, so we're not
certain that statsd is the best choice.

Whichever tool we use, our major use cases include:

    - Record arbitrary time series data
    - View charts from the recorded data and from realtime data
    - Alert on-call support when the recorded data meets certain predefined
    criteria

What do you say, community? Should we go forward with statsd?

  • Mike Youngstrom at Aug 1, 2013 at 4:35 pm
    I'm not very experienced with monitoring tools and metric data collection
    and such.

    However, in the absence of a non-internet solution from you guys, we
    created a simple collector historian that just puts the data into a
    relational database. This seems to work well enough for us at the moment.
    It allows us to use our enterprise reporting solution to create charts and
    dashboards, and to use Nagios for component monitoring and alerts.

    I'd be happy to clean it up and contribute it if you thought it'd be useful
    to anyone else.

    The schema is rather normalized with the tags and such so queries are
    somewhat ugly but it seems to work.
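
The actual schema was not posted. As a rough illustration of what a
normalized tag layout can look like, and why queries against it get
join-heavy, here is a minimal sketch using Python's sqlite3 module. Every
table name, column name, and sample value below is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE metric (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE tag (
        id INTEGER PRIMARY KEY,
        tag_key TEXT NOT NULL,
        tag_value TEXT NOT NULL,
        UNIQUE (tag_key, tag_value)
    );
    CREATE TABLE sample (
        id INTEGER PRIMARY KEY,
        metric_id INTEGER NOT NULL REFERENCES metric(id),
        recorded_at INTEGER NOT NULL,
        value REAL NOT NULL
    );
    CREATE TABLE sample_tag (
        sample_id INTEGER NOT NULL REFERENCES sample(id),
        tag_id INTEGER NOT NULL REFERENCES tag(id),
        PRIMARY KEY (sample_id, tag_id)
    );
""")


def record(conn, name, recorded_at, value, tags):
    """Store one sample, creating metric and tag rows on first use."""
    conn.execute("INSERT OR IGNORE INTO metric (name) VALUES (?)", (name,))
    metric_id = conn.execute(
        "SELECT id FROM metric WHERE name = ?", (name,)
    ).fetchone()[0]
    sample_id = conn.execute(
        "INSERT INTO sample (metric_id, recorded_at, value) VALUES (?, ?, ?)",
        (metric_id, recorded_at, value),
    ).lastrowid
    for key, val in tags.items():
        conn.execute(
            "INSERT OR IGNORE INTO tag (tag_key, tag_value) VALUES (?, ?)",
            (key, val),
        )
        tag_id = conn.execute(
            "SELECT id FROM tag WHERE tag_key = ? AND tag_value = ?", (key, val)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO sample_tag (sample_id, tag_id) VALUES (?, ?)",
            (sample_id, tag_id),
        )


# Filtering on a single tag already needs three joins.
AVG_BY_TAG_SQL = """
    SELECT AVG(s.value)
    FROM sample s
    JOIN metric m ON m.id = s.metric_id
    JOIN sample_tag st ON st.sample_id = s.id
    JOIN tag t ON t.id = st.tag_id
    WHERE m.name = ? AND t.tag_key = ? AND t.tag_value = ?
"""

record(conn, "cpu.load_avg", 1375340340, 0.5, {"job": "dea", "index": "0"})
record(conn, "cpu.load_avg", 1375340340, 1.5, {"job": "dea", "index": "1"})
record(conn, "cpu.load_avg", 1375340340, 3.0, {"job": "cloud_controller", "index": "0"})

dea_avg = conn.execute(AVG_BY_TAG_SQL, ("cpu.load_avg", "job", "dea")).fetchone()[0]
```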

    I'm interested in seeing what you come up with for a true non-public-internet
    solution.

    Mike


  • Duglin at Aug 2, 2013 at 1:45 pm
    We've been working on what we call an "AdminUI". It started out as just a
    tool to nicely show all of the useful data that the /varz endpoints
    return. But, as we organized the data it quickly became the main
    monitoring tool our admins use to keep track of the system. Starting out
    as a read-only tool, we've started to add more things like the ability to
    kick off new DEAs and to send out email notifications when things go bad
    (like components are down). I was hoping to show it at the conference next
    month and get feedback on whether the community would be interested in
    seeing it as a new sub-project, but if there's interest before that we can
    look at doing it sooner.

    In full disclosure, we started on it back in the v1 days and are working
    hard to port it to v2.

    -Doug
  • David Laing at Aug 2, 2013 at 2:41 pm
    There is definitely interest -
    https://groups.google.com/a/cloudfoundry.org/forum/#!searchin/vcap-dev/cf-console/vcap-dev/-qaQqPWXlpM/4JdIkJ5WWUAJ

    Any chance you could release your AdminUI to
    github.com/cloudfoundry-community ?


    --
    David Laing
    Open source @ City Index - github.com/cityindex
    http://davidlaing.com
    Twitter: @davidlaing
  • Jamie Van Dyke at Aug 2, 2013 at 3:06 pm
    I hate to be a 'me too', but I'm also building a web dashboard. It
    covers BOSH and Cloud Foundry. Give me a few weeks and I'll have
    something you can all get your teeth into.

    In simple terms, it's a set of Rails workers, an API, and a JS front end.
    In its current state the workers poll the BOSH and CF APIs for
    information and put it in a Redis DB. The JS front end queries the
    Rails API, which pulls the information out of Redis. That way there are
    no blocking requests.
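
The dashboard code itself was not posted, and it is Rails rather than Python.
As a rough sketch of the pattern described (workers fill a cache, the API
only ever reads from it), here is a small Python version. The Cache class is
an in-memory stand-in for Redis, and the fetchers are hypothetical
placeholders for real BOSH and CF API calls.

```python
import json
import threading


class Cache:
    """In-memory stand-in for the Redis store; values are kept as JSON text."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = json.dumps(value)

    def get(self, key):
        with self._lock:
            raw = self._data.get(key)
        return None if raw is None else json.loads(raw)


def poll_once(cache, fetchers):
    """Worker step: call each (possibly slow) upstream API and cache the result."""
    for key, fetch in fetchers.items():
        cache.set(key, fetch())


def handle_request(cache, key):
    """API step: answer from the cache only, never calling upstream."""
    return cache.get(key)


# Hypothetical fetchers standing in for BOSH and CF API calls.
fetchers = {
    "bosh:vms": lambda: [{"job": "dea", "index": 0, "state": "running"}],
    "cf:apps": lambda: {"total": 2},
}

cache = Cache()
poll_once(cache, fetchers)            # would run periodically in a worker
apps = handle_request(cache, "cf:apps")
missing = handle_request(cache, "cf:orgs")
```

Because the request path never waits on BOSH or CF, a slow upstream only makes
the cached data older, not the dashboard slower.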

    Urgent issues at work have put it on hold for a week, but I'll be back
    on it soon.
    --
    *Jamie van Dyke*: Chief Science Officer at PharmMD Inc.
    <http://www.pharmmd.com>
    phone: 615-713-2020 <callto:415-526-2339>
    web: www.pharmmd.com <http://www.pharmmd.com>
    twitter: @fearoffish <http://twitter.com/fearoffish>
  • David Laing at Aug 2, 2013 at 4:47 pm
    Yay!
  • Mike Youngstrom at Oct 17, 2013 at 10:23 pm
    Have you considered adding support for vcOPS? Could be a good fit for a
    number of organizations, especially those using VMware.

    http://www.vmware.com/products/vcenter-operations-manager/

    Mike


    To unsubscribe from this group and stop receiving emails from it, send an email to vcap-dev+unsubscribe@cloudfoundry.org.
  • Wayne E. Seguin at Oct 18, 2013 at 2:19 am
    On Thu, Aug 1, 2013 at 12:39 AM, Mark Rushakoff <mrushakoff@pivotallabs.com> wrote:

    Whichever tool we use, our major use cases include:

    - Record arbitrary time series data
    - View charts from the recorded data and from realtime data
    - Alert on-call support when the recorded meets certain predefined
    criteria

    What do you say, community? Should we go forward with statsd?

    StatsD is an excellent choice for the first use case, collecting metrics.
    Use bucket namespacing schemes instead of tags.
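
As a minimal sketch of what bucket namespacing could look like for the
job/index example from the original post: the tag values are folded into a
dotted bucket name and sent in the plain StatsD line format
(<bucket>:<value>|g for a gauge). The "cf" prefix, the ordering of the name
parts, and the helper names are assumptions for illustration, not an agreed
scheme.

```python
import socket


def bucket_name(metric, job, index, prefix="cf"):
    """Fold the former tag values into a dotted StatsD bucket name."""
    return ".".join([prefix, job, str(index), metric])


def gauge_line(bucket, value):
    """Format one gauge in the plain StatsD line protocol: <bucket>:<value>|g"""
    return f"{bucket}:{value}|g"


def send(line, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; 8125 is the default StatsD port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# What used to be tags job:dea and index:0 becomes part of the name.
line = gauge_line(bucket_name("cpu.load_avg", "dea", 0), 0.5)
```

With Graphite behind StatsD, a wildcard such as cf.dea.*.cpu.load_avg could
then stand in for the old "all DEAs" tag filter; the exact stored path depends
on how StatsD is configured.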

    For the other use cases, charting and alerting, each space should be
    researched and discussed independently, and the results then tied together.

       ~Wayne


    --
       ~Wayne

    Wayne E. Seguin
    wayneeseguin@gmail.com
    wayneeseguin on irc.freenode.net
    http://twitter.com/wayneeseguin/
    https://github.com/wayneeseguin/

  • David Laing at Oct 18, 2013 at 7:17 am
    +1 for statsd as the collection mechanism.
  • Simon at Mar 13, 2014 at 12:34 pm
    Sorry to bring up an old topic like this.

    But we already have graphite/statsd infrastructure and would love to use it
    to collect stats for debugging and alerting.

    Please let me know if this is not on the roadmap; in that case I will
    probably pick it up since we need it. :)

  • David Lee at Mar 13, 2014 at 2:39 pm
    Hi Simon,

    On the Pivotal CF side of things we are just about to release our Ops
    Metrics add-on. This tool takes all the varz and BOSH health data and
    exposes them back out via JMX.

    We've considered releasing this on the OSS side but:
    1. We aren't sure if the community prefers some other protocol (such as
    statsd) and wouldn't find JMX that useful.
    2. We wanted to make more changes to the upstream (varz), which might make
    a better integration point.

    (Also, we do have plans to rework the generated metrics to make them easier
    to interpret.)

    Do you have any other constraints on your side? For example, if JMX were to
    become available soon, would using another project (say jmxtrans) to get
    the data over to statsd be okay? Do you have any debugging and alerting
    processes that would not be well solved by this?

    Thanks,

    -Dave



  • Simon Johansson at Mar 13, 2014 at 3:36 pm
    Well, JMX exporting is better than no exporting :) Also, the more open
    source stuff there is in the ecosystem, the more traction CF will get. I say
    release it!

    I've been bitten by jmxtrans before, where it basically ate up all my FDs
    because I had lots and lots of checks.
    I'm sure there are other JMX -> Statsd bridges out there.

    Also, I've poked around a bit in the collector code base, and it seems like
    it would be trivial to add Statsd support.



  • Simon Johansson at Mar 17, 2014 at 10:55 am
    I just sent a pull request for adding graphite as a historian for collector.
    I chose graphite directly over statsd since everything will be under an
    index key, so bucketing is not necessary.

    https://github.com/cloudfoundry/collector/pull/6/commits
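
For readers who have not fed Graphite directly: this is not the code from the
pull request, just a minimal Python sketch of Graphite's plaintext protocol,
where each sample is one line of the form "<path> <value> <timestamp>". The
path layout (deployment, job, index, key) and the helper names are
assumptions for illustration.

```python
import socket
import time


def metric_path(deployment, job, index, key):
    """Build a dotted Graphite path; the layout here is hypothetical."""
    return ".".join([deployment, job, str(index), key])


def graphite_line(path, value, timestamp=None):
    """Format one sample as '<path> <value> <timestamp>' plus a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_lines(lines, host="127.0.0.1", port=2003):
    """Send samples over TCP; 2003 is carbon's default plaintext port."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


# Each component instance gets its own path, so no StatsD-style
# aggregation step is needed before the data reaches Graphite.
line = graphite_line(metric_path("cf", "dea", 0, "cpu_load_avg"), 0.5, 1395053700)
```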


  • James Bayer at Mar 17, 2014 at 3:05 pm
    thanks for the PR simon! at pivotal, we use a customized hosted version of
    graphite for run.pivotal.io, but we use the DataDogHQ plugin which is
    specific to their SaaS.

    david lee is the PM for metrics and logging and he and a small team are
    working on overhauling the system and app metrics architecture to move from
    the varz/collector approach to something that looks conceptually like
    loggregator. instead of having separate collection and transport mechanisms
    and systems for system logs, app logs, system metrics and app metrics we
    want to have a unified multi-tenant system whereby the system logs/metrics
    are just another tenant (albeit with special configs perhaps). the team is
    about ready to share some material with the community for feedback and
    review including write-up and diagrams, etc.

    the reason why that is important is that there are tradeoffs for spending
    time in the collector code base vs delivering on this unified approach.
    i'll ask david and his team to review this submission and hopefully it's
    something we can accept easily even if we call it an experimental
    community-contributed feature.



    --
    Thank you,

    James Bayer

  • Simon Johansson at Mar 17, 2014 at 3:46 pm
    instead of having separate collection and transport mechanisms and
    systems for system logs, app logs, system metrics and app metrics we want
    to have a unified multi-tenant system whereby the system logs/metrics are
    just another tenant.

    Sounds really interesting. Looking forward to it :) Do you have any ETA?

    the reason why that is important is that there are tradeoffs for spending
    time in the collector code base vs delivering on this unified approach.
    i'll ask david and his team to review this submission and hopefully it's
    something we can accept easily even if we call it an experimental community
    contributed feature.

    As luck would have it, my code is 100% flawless, so a merge should be no
    problem ;).

    But in the highly unlikely event that my PR would be deemed unfit, no
    hard feelings, if it allows you guys to deliver shiny new cool stuff
    quicker!



Discussion Overview
group: vcap-dev
posted: Aug 1, '13 at 6:39a
active: Mar 17, '14 at 3:46p
posts: 15
users: 9