Greetings...

I have a BOSH director running (Ver: 0.5.1 (6fc33428)) in AWS, along with
a cloudfoundry and wordpress sample release. No matter what I try, I cannot
get either to deploy. I only receive "Error 450002: Timed out sending
`get_state' to xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx after 30 seconds". The only
things I see in the director logs are redis connections (D,
[2012-07-30T03:27:04.416443 #576] [0x1dab960] DEBUG -- : Acquired
connection: 22828760) and various SQL debug statements.

I'm running an AWS stemcell (v 0.6.3) built from Git rev ca34b9cc7.

Any tips on where to debug?

Thanks!
Evan


  • Ejhazlett at Jul 30, 2012 at 4:26 am
    I have managed to find the debug logs for the task:

    D, [2012-07-30T02:54:41.920840 #3417] [0x2618b48] DEBUG -- : Worker thread raised exception: Timed out sending `get_state' to 0f807d32-571b-4b62-8d85-a5abafd203a2 after 30 seconds - /var/vcap/deploy/bosh/director/current/director/lib/director/client.rb:96:in `block in handle_method'
    /var/vcap/deploy/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/monitor.rb:201:in `mon_synchronize'
    /var/vcap/deploy/bosh/director/current/director/lib/director/client.rb:91:in `handle_method'
    /var/vcap/deploy/bosh/director/current/director/lib/director/client.rb:27:in `method_missing'
    /var/vcap/deploy/bosh/director/current/director/lib/director/deployment_plan_compiler.rb:158:in `get_state'
    /var/vcap/deploy/bosh/director/current/director/lib/director/deployment_plan_compiler.rb:54:in `bind_existing_vm'
    /var/vcap/deploy/bosh/director/current/director/lib/director/deployment_plan_compiler.rb:43:in `block (4 levels) in bind_existing_deployment'
    /var/vcap/deploy/bosh/director/shared/gems/ruby/1.9.1/gems/bosh_common-0.4.0/lib/common/thread_formatter.rb:46:in `with_thread_name'
    /var/vcap/deploy/bosh/director/current/director/lib/director/deployment_plan_compiler.rb:42:in `block (3 levels) in bind_existing_deployment'
    /var/vcap/deploy/bosh/director/shared/gems/ruby/1.9.1/gems/bosh_common-0.4.0/lib/common/thread_pool.rb:83:in `call'
    /var/vcap/deploy/bosh/director/shared/gems/ruby/1.9.1/gems/bosh_common-0.4.0/lib/common/thread_pool.rb:83:in `block (2 levels) in create_thread'
    /var/vcap/deploy/bosh/director/shared/gems/ruby/1.9.1/gems/bosh_common-0.4.0/lib/common/thread_pool.rb:67:in `loop'
    /var/vcap/deploy/bosh/director/shared/gems/ruby/1.9.1/gems/bosh_common-0.4.0/lib/common/thread_pool.rb:67:in `block in create_thread'

    Furthermore, I never see any AWS instances actually get launched.

    Evan
  • Vadim Spivak at Jul 30, 2012 at 4:30 am
    In this case it shouldn't try to launch any instances since it's trying to
    find the ones it already launched previously.

    Were they ever launched in a previous task? Were they deleted manually?

    -Vadim
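To illustrate the point above, here is a minimal Python sketch (not the director's actual Ruby code) of why a redeploy can fail with a `get_state` timeout before any instance is launched. It assumes the director keeps a record of agent IDs from an earlier task and asks each one for its state first. The names `send_with_timeout`, `bind_existing_vms`, `responders`, and `AgentTimeout` are hypothetical and exist only for this example.

```python
import queue
import threading


class AgentTimeout(Exception):
    """Raised when a recorded agent does not answer in time (hypothetical)."""


def send_with_timeout(agent_id, method, responders, timeout=30.0):
    """Send `method` to an agent and wait for its reply.

    `responders` maps agent IDs to callables and stands in for the message
    bus; an agent whose VM no longer exists simply has no entry.
    """
    reply_box = queue.Queue(maxsize=1)
    responder = responders.get(agent_id)
    if responder is not None:
        # A live agent answers on its own thread.
        threading.Thread(
            target=lambda: reply_box.put(responder(method)), daemon=True
        ).start()
    try:
        return reply_box.get(timeout=timeout)
    except queue.Empty:
        raise AgentTimeout(
            f"Timed out sending `{method}' to {agent_id} after {timeout:g} seconds"
        ) from None


def bind_existing_vms(recorded_agent_ids, responders, timeout=30.0):
    """Ask every previously recorded VM for its state before planning changes."""
    return {
        agent_id: send_with_timeout(agent_id, "get_state", responders, timeout)
        for agent_id in recorded_agent_ids
    }
```

In this sketch, a VM that was recorded but never came up (or was deleted by hand) makes the binding step fail, so the flow never reaches the point where new instances would be created.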
  • Ejhazlett at Jul 30, 2012 at 4:36 am
    Ah thanks! That was it. I received the AWS error "The requested
    Availability Zone is currently constrained and we are no longer accepting
    new customer requests for t1/m1/c1/m2 instance types. Please retry your
    request by not specifying an Availability Zone or choosing us-east-1c,
    us-east-1b, us-east-1e, us-east-1d." I thought that re-running the
    deployment with an updated manifest would just re-launch them. I
    deleted the old deployment and re-ran it, and the instances are now running.

    Thanks for the quick reply!
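The AWS message suggests retrying in a different Availability Zone. A minimal Python sketch of that retry idea follows. It is not part of BOSH or of any AWS SDK: `launch` is a hypothetical callable supplied by the caller, and `ZoneConstrained` is a hypothetical stand-in for the provider's capacity error.

```python
class ZoneConstrained(Exception):
    """Hypothetical stand-in for a 'zone is currently constrained' error."""


def launch_in_first_available_zone(launch, zones):
    """Try each zone in order and return (zone, instance) for the first success.

    `launch` is a caller-supplied function taking a zone name; it is an
    assumption of this sketch, not a real SDK call.
    """
    errors = {}
    for zone in zones:
        try:
            return zone, launch(zone)
        except ZoneConstrained as exc:
            # Remember why this zone was rejected and move on to the next one.
            errors[zone] = str(exc)
    raise ZoneConstrained(f"no zone accepted the request: {errors}")
```

With BOSH itself the zone comes from the deployment manifest, so in practice the equivalent step is editing the manifest, as described above.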
  • Dr Nic Williams at Jul 30, 2012 at 11:11 am
    As a side conversation: couldn't a failure to provision a new instance in the middle of a "bosh deploy" cause serious grief to a system? You start with an m1.small and an EBS volume in us-east-1a and attempt to upsize. It detaches the us-east-1a EBS volume, deletes the m1.small, and then fails to create an m1.medium. At this point, AWS may not allow you (temporarily or permanently) to get your m1.small back.

    I think I started a conversation once about changing the sequence of a deploy to reduce the downtime.

    The same resequencing could help protect a system from failing to get a new VM:

    1. Create all new VMs
    2. Turn them into jobs (but don't start them)
    3. Create all new persistent disks & attach them
    4. Stop jobs that are to be migrated/deleted
    5. Move across any persistent disks from "migrated" jobs/VMs (probably mutually exclusive with steps 3 & 7)
    6. Start new jobs
    7. Delete old VMs & old persistent disks
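The seven steps above can be sketched in Python as follows. This is only an illustration of the proposed ordering, not how BOSH sequences a deploy: `create_vm` and `delete_vm` are hypothetical caller-supplied functions, `ProvisioningError` is a hypothetical error type, and steps 2 to 6 are recorded as log entries rather than implemented.

```python
class ProvisioningError(Exception):
    """Hypothetical error raised when the IaaS cannot supply a new VM."""


def resequenced_update(old_vms, new_specs, create_vm, delete_vm):
    """Create every new VM before touching the old ones.

    Returns (vms_now_in_service, steps). If any VM cannot be created, the
    partially created VMs are cleaned up and the old VMs stay in service.
    """
    steps = []
    new_vms = []
    try:
        for spec in new_specs:  # step 1: create all new VMs
            new_vms.append(create_vm(spec))
    except ProvisioningError as exc:
        for vm in new_vms:  # discard anything created so far
            delete_vm(vm)
        steps.append(f"aborted before touching old VMs: {exc}")
        return list(old_vms), steps
    steps.append("1. created new VMs")
    steps.append("2. turned new VMs into jobs (not started)")
    steps.append("3. created and attached new persistent disks")
    steps.append("4. stopped jobs to be migrated or deleted")
    steps.append("5. moved persistent disks from migrated VMs")
    steps.append("6. started new jobs")
    for vm in old_vms:  # step 7: only now delete the old VMs
        delete_vm(vm)
    steps.append("7. deleted old VMs and old persistent disks")
    return new_vms, steps
```

The point of the ordering is that the only step that depends on scarce IaaS capacity runs first, while the old VMs are still intact.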

    Would this sequence protect a system from any inability to get new resources from an IaaS, and reduce the net downtime of a job?

    Does it potentially introduce any badness into the system?

    Nic




Discussion Overview
group: bosh-users @
posted: Jul 30, '12 at 3:44a
active: Jul 30, '12 at 11:11a
posts: 5
users: 3
