I upgraded to CDM 4.7 from 4.6 (actually I'm not sure which version I had
before, but it was a 4.x release) and suddenly host monitoring is not
working. The subject line is the message I get:


    - Unable to issue query: the Host Monitor is not running


CDM was working fine before and I'm new to it, so I didn't look much into
the innards until I had this issue. I'm not clear on how host monitoring is
*supposed* to work: does CDM pull from the hosts, or do the hosts push to
CDM? I'm not even sure which daemon is supposed to do the host monitoring.
I assume it's cloudera-scm-agent, but that seems to be making its
connection properly. On the hosts, in /etc/cloudera-scm-agent, here's the
part of config.ini that is not commented out:

# Hostname of Cloudera SCM Server
server_host=10.35.130.85

# Port that server is listening on
server_port=7182


That IP address belongs to the machine running CDM, so that should be fine.
Plus, the connection from host ---> SCM seems to be established fine: port
7182 on the CDM server is open, and the connections from my 3 hosts are
established:

[[email protected] cloudera-scm-server]# lsof -i :7182
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 6031 cloudera-scm 269u IPv4 5075782 0t0 TCP *:7182 (LISTEN)
java 6031 cloudera-scm 270u IPv4 5282469 0t0 TCP 10.35.130.85:7182->rdrh61-srv3.dsone.3ds.com:36823 (ESTABLISHED)
java 6031 cloudera-scm 274u IPv4 5282477 0t0 TCP 10.35.130.85:7182->rdrh61-srv2.dsone.3ds.com:47722 (ESTABLISHED)
java 6031 cloudera-scm 278u IPv4 5282493 0t0 TCP 10.35.130.85:7182->rdrh61-srv1.dsone.3ds.com:40405 (ESTABLISHED)
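
For anyone repeating this check, here is a minimal Python sketch of reading the two uncommented agent settings quoted earlier, so they can be compared against what the server is listening on. It assumes only that config.ini uses plain key=value lines with # comments, as in the excerpt; the function name parse_agent_config is hypothetical and is not part of the Cloudera agent.

```python
def parse_agent_config(text):
    """Return the uncommented key=value settings from config.ini-style text."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and any section headers such as [Name].
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# The excerpt from the post, used here as sample input.
SAMPLE = """\
# Hostname of Cloudera SCM Server
server_host=10.35.130.85

# Port that server is listening on
server_port=7182
"""

settings = parse_agent_config(SAMPLE)
endpoint = (settings["server_host"], int(settings["server_port"]))
print(endpoint)
```

On a real host the text would come from the config.ini file in /etc/cloudera-scm-agent instead of the inline sample.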


Still, something isn't connecting properly from host ---> SCM, as shown in
the cloudera-scm-agent log on one of the hosts:

[17/Jan/2014 09:17:13 +0000] 20236 MonitorDaemon-Reporter throttling_logger ERROR (9 skipped) Error sending messages to firehose: mgmt1-HOSTMONITOR-b39f49f7653335eb63feffc2ff44f323
Traceback (most recent call last):
   File "/usr/lib64/cmf/agent/src/cmf/monitor/firehose.py", line 70, in _send
     self._port)
   File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 471, in __init__
     self.conn.connect()
   File "/usr/lib64/python2.6/httplib.py", line 720, in connect
     self.timeout)
   File "/usr/lib64/python2.6/socket.py", line 567, in create_connection
     raise error, msg
error: [Errno 111] Connection refused
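
On Linux, Errno 111 is ECONNREFUSED: the target machine was reachable, but nothing was accepting connections on the requested port. The traceback does not show which host and port the agent tried, so the sketch below takes them as parameters. It is a hypothetical helper (probe_tcp is not a Cloudera tool) that separates "refused" from "timeout", since the two usually point to different causes (no listener versus a firewall or routing problem).

```python
import errno
import socket


def probe_tcp(host, port, timeout=3.0):
    """Classify a TCP connection attempt: 'open', 'refused', 'timeout', or 'error: ...'."""
    try:
        # create_connection is the same standard-library call seen in the traceback.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer within the timeout: often a firewall dropping packets.
        return "timeout"
    except OSError as exc:
        if exc.errno == errno.ECONNREFUSED:
            # Host answered, but no process is listening on that port.
            return "refused"
        return "error: %s" % exc
```

Calling probe_tcp with the server address and port from config.ini, for example probe_tcp("10.35.130.85", 7182), should return "open" if the agent-to-server channel is healthy. Running it against whichever host and port the Host Monitor is expected to use would show whether that listener is up.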


So there's an error in cloudera-scm-agent going from host --> SCM. But
unless I'm reading the SCM logs wrong, there also appears to be an error in
some process trying to connect from SCM --> host. On the SCM server, here's
a snippet from the log:

2014-01-17 09:25:30,925 ERROR
[193404[email protected]:[email protected]] Exception occurred when
checking the host health of rdrh61-srv1.dsone.3ds.com
2014-01-17 09:26:00,908 WARN
[9471441[email protected]:[email protected]] (10 skipped) Exception
querying events
java.io.IOException: Error connecting to
*rdrh61-srv1.dsone.3ds.com/10.6.40.223:7184*
         at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:249)
         at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:147)
         at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:102)
         at com.cloudera.cmf.event.query.AvroEventStoreQueryProxy.checkSpecificRequestor(AvroEventStoreQueryProxy.java:75)
         at com.cloudera.cmf.event.query.AvroEventStoreQueryProxy.doQuery(AvroEventStoreQueryProxy.java:122)
         at com.cloudera.server.web.cmf.events.EventDao.findEvents(EventDao.java:333)
         at com.cloudera.server.web.cmf.EventsController.query(EventsController.java:169)
         at sun.reflect.GeneratedMethodAccessor367.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:176)
         at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:436)
         at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:424)
         at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:790)
         at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:719)
         at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:669)
         at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:574)
         at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
         at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
         at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
         at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
         at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:78)
         at org.mortbay.servlet.GzipFilter.doFilter(GzipFilter.java:131)
         at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
         at com.jamonapi.http.JAMonServletFilter.doFilter(JAMonServletFilter.java:48)
         at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
         at com.cloudera.enterprise.JavaMelodyFacade$MonitoringFilter.doFilter(JavaMelodyFacade.java:109)
         at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
         at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:311)
         at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:116)
         at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:83)
         at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:323)
         at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:113)
         at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:323)
         at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:101)
         at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:323)
         at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:113)
         at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:323)
         at org.springframework.security.web.authentication.rememberme.RememberMeAuthenticationFilter.doFilter(RememberMeAuthenticationFilter.java:146)
         at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:323)
         at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:54)
         at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:323)
         at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:45)
         at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:323)
         at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:182)
         at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:323)
         at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:105)
         at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:323)
         at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:87)
         at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:323)
         at org.springframework.security.web.session.ConcurrentSessionFilter.doFilter(ConcurrentSessionFilter.java:125)
         at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:323)
         at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:173)
         at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:237)
         at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:167)
         at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
         at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:88)
         at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:76)
         at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
         at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
         at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
         at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
         at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
         at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
         at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
         at org.mortbay.jetty.handler.StatisticsHandler.handle(StatisticsHandler.java:53)
         at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
         at org.mortbay.jetty.Server.handle(Server.java:326)
         at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
         at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
         at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
         at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
         at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
         at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
         at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.net.ConnectException: Connection refused
         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
         at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:404)
         at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:366)
         at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:282)
         at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:102)
         at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:662)
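
Most of that trace is servlet filter plumbing. The two useful facts are the target in the "Error connecting to" line and the final "Caused by" line. Below is a rough Python sketch for pulling those out of a log excerpt. The regular expressions are assumptions fitted to the lines quoted in this post, not a documented Cloudera log format, and summarize is a hypothetical helper name.

```python
import re

# Matches e.g. "Error connecting to host.example.com/10.0.0.1:7184",
# tolerating a line break and optional '*' emphasis before the host name.
TARGET_RE = re.compile(
    r"Error connecting to\s+\*?(?P<host>[^/\s*]+)/(?P<ip>[\d.]+):(?P<port>\d+)"
)
# Matches "Caused by: some.ExceptionClass: message" at the start of a line.
CAUSE_RE = re.compile(
    r"^Caused by:\s*(?P<cls>[\w.$]+)(?::\s*(?P<msg>.*))?$", re.MULTILINE
)


def summarize(log_text):
    """Return the connection target and the innermost cause found in log_text."""
    summary = {}
    target = TARGET_RE.search(log_text)
    if target:
        summary["host"] = target.group("host")
        summary["ip"] = target.group("ip")
        summary["port"] = int(target.group("port"))
    causes = list(CAUSE_RE.finditer(log_text))
    if causes:
        last = causes[-1]  # the last "Caused by" is the root cause
        message = last.group("msg")
        summary["root_cause"] = last.group("cls") + (": " + message if message else "")
    return summary


# A shortened excerpt of the trace above.
SAMPLE = """\
java.io.IOException: Error connecting to
rdrh61-srv1.dsone.3ds.com/10.6.40.223:7184
         at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:249)
Caused by: java.net.ConnectException: Connection refused
         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
"""

summary = summarize(SAMPLE)
print(summary)
```

For this trace the summary reduces to: rdrh61-srv1 refused a connection on port 7184.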


The port 7184 is *not* open so that's a problem. Here's the relevant part
of the port listing on that host server:

dnsmasq 3966 nobody 6u IPv4 12363 0t0 TCP 192.168.122.1:domain (LISTEN)
dnsmasq 3966 nobody 7u IPv4 12364 0t0 UDP 192.168.122.1:domain
python 4302 root 4u IPv4 14080 0t0 TCP localhost:etlservicemgr (LISTEN)
python 4302 root 6u IPv4 41124953 0t0 TCP localhost:etlservicemgr->localhost:37209 (ESTABLISHED)
sshd 15574 root 3r IPv4 41114540 0t0 TCP rdrh61-srv1.dsone.3ds.com:ssh->t3s780mon.dsone.3ds.com:51650 (ESTABLISHED)
python 20236 root 6u IPv4 41124952 0t0 TCP localhost:37209->localhost:etlservicemgr (ESTABLISHED)
python 20236 root 10u IPv4 41124959 0t0 TCP rdrh61-srv1.dsone.3ds.com:cslistener (LISTEN)
python 20236 root 11u IPv4 41142039 0t0 TCP rdrh61-srv1.dsone.3ds.com:41338->10.35.130.85:7182 (ESTABLISHED)
java 31140 hdfs 162u IPv4 40860823 0t0 TCP rdrh61-srv1.dsone.3ds.com:intu-ec-svcdisc (LISTEN)
java 31140 hdfs 178u IPv4 40861013 0t0 TCP rdrh61-srv1.dsone.3ds.com:50070 (LISTEN)
java 31140 hdfs 182u IPv4 40861018 0t0 TCP rdrh61-srv1.dsone.3ds.com:intu-ec-svcdisc->rdrh61-srv1.dsone.3ds.com:52925 (ESTABLISHED)
java 31140 hdfs 184u IPv4 40861020 0t0 TCP rdrh61-srv1.dsone.3ds.com:intu-ec-svcdisc->rdrh61-srv3.dsone.3ds.com:60047 (ESTABLISHED)
java 31191 hdfs 166u IPv4 40860812 0t0 TCP rdrh61-srv1.dsone.3ds.com:50090 (LISTEN)
java 31231 hdfs 161u IPv4 40860609 0t0 TCP rdrh61-srv1.dsone.3ds.com:50010 (LISTEN)
java 31231 hdfs 164u IPv4 40860615 0t0 TCP rdrh61-srv1.dsone.3ds.com:50075 (LISTEN)
java 31231 hdfs 171u IPv4 40860798 0t0 TCP rdrh61-srv1.dsone.3ds.com:50020 (LISTEN)
java 31231 hdfs 184u IPv4 40861012 0t0 TCP rdrh61-srv1.dsone.3ds.com:52925->rdrh61-srv1.dsone.3ds.com:intu-ec-svcdisc (ESTABLISHED)
java 31434 mapred 158u IPv4 40862429 0t0 TCP rdrh61-srv1.dsone.3ds.com:intu-ec-client (LISTEN)
java 31434 mapred 170u IPv4 40863275 0t0 TCP *:50030 (LISTEN)
java 31434 mapred 179u IPv4 40864414 0t0 TCP rdrh61-srv1.dsone.3ds.com:9290 (LISTEN)
java 31434 mapred 181u IPv4 40864425 0t0 TCP rdrh61-srv1.dsone.3ds.com:intu-ec-client->rdrh61-srv3.dsone.3ds.com:45297 (ESTABLISHED)
java 31434 mapred 182u IPv4 40864429 0t0 TCP rdrh61-srv1.dsone.3ds.com:intu-ec-client->rdrh61-srv1.dsone.3ds.com:51504 (ESTABLISHED)
java 31475 mapred 160u IPv4 40863277 0t0 TCP localhost:unify-debug (LISTEN)
java 31475 mapred 161u IPv4 40863283 0t0 TCP localhost:46281 (LISTEN)
java 31475 mapred 173u IPv4 40863824 0t0 TCP *:50060 (LISTEN)
java 31475 mapred 182u IPv4 40864428 0t0 TCP rdrh61-srv1.dsone.3ds.com:51504->rdrh61-srv1.dsone.3ds.com:intu-ec-client (ESTABLISHED)
java 31710 oozie 177u IPv4 40864404 0t0 TCP *:irisa (LISTEN)
java 31710 oozie 253u IPv4 40864847 0t0 TCP localhost:metasys (LISTEN)
java 31856 hive 252u IPv4 40866046 0t0 TCP *:9083 (LISTEN)
java 31856 hive 256u IPv4 40866064 0t0 TCP rdrh61-srv1.dsone.3ds.com:9083->rdrh61-srv3.dsone.3ds.com:48178 (ESTABLISHED)
java 31998 hue 265u IPv4 40866757 0t0 TCP *:teradataordbms (LISTEN)
python2.6 32022 hue 3u IPv4 40866771 0t0 TCP rdrh61-srv1.dsone.3ds.com:ddi-tcp-1 (LISTEN)
python2.6 32103 hue 3u IPv4 40866771 0t0 TCP rdrh61-srv1.dsone.3ds.com:ddi-tcp-1 (LISTEN)
python2.6 32103 hue 22u IPv4 40866771 0t0 TCP rdrh61-srv1.dsone.3ds.com:ddi-tcp-1 (LISTEN)
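
One caveat when reading a listing like this: lsof prints service names (cslistener, etlservicemgr, and so on) instead of port numbers unless it is run with -P, so a numeric port can hide behind a name. The Python sketch below collects the port tokens of the LISTEN entries so a specific port can be looked up. It assumes one entry per line, as lsof prints them, and listening_ports is a hypothetical helper name.

```python
def listening_ports(lsof_text):
    """Return the port tokens (numbers or service names) of LISTEN entries."""
    ports = set()
    for line in lsof_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[-1] != "(LISTEN)":
            continue
        name = fields[-2]  # e.g. "*:50030" or "host:cslistener"
        ports.add(name.rsplit(":", 1)[-1])
    return ports


# Three entries taken from the listing in the post.
SAMPLE = """\
python 20236 root 10u IPv4 41124959 0t0 TCP rdrh61-srv1.dsone.3ds.com:cslistener (LISTEN)
python 20236 root 11u IPv4 41142039 0t0 TCP rdrh61-srv1.dsone.3ds.com:41338->10.35.130.85:7182 (ESTABLISHED)
java 31434 mapred 170u IPv4 40863275 0t0 TCP *:50030 (LISTEN)
"""

ports = listening_ports(SAMPLE)
print(sorted(ports))    # ['50030', 'cslistener']
print("7184" in ports)  # False
```

Running lsof with -n and -P (numeric hosts and ports) before feeding the output to a check like this removes the service-name ambiguity.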


Are these issues related? 1) cloudera-scm-agent reports problems connecting
to CDM even though the port on SCM server appears to be open and 2) SCM
reports a problem connecting to the hosts and the port on the hosts is not
open.

I'm a developer, not a sysadmin, and I want to get on with developing and
stop messing with system administration (which I'm not particularly good
at). Ideally I'd like to manage my environment through the CDM UI. So, any
overall suggestions for how I can restore health via the CDM UI?
Configuration? Uninstall/reinstall? Rollback? If not, any suggestions for
what I can do at a low level to make things right?

Thanks for any help.






  • Misterblinky at Jan 17, 2014 at 3:04 pm

    I seem to have fixed the issue. I simply restarted the CDM *through the
    CDM UI*. I had restarted the daemon a bunch of times and it had no
    effect, but after restarting through the UI, host monitoring appears to
    be working. I've got other problems now, but hopefully I'll be able to
    work through them.
    It was a simple fix. Sometimes you have to travel a long, long road to
    return home. :-\

  • Hadoopdan at Feb 4, 2014 at 2:31 pm
    I have been having the same problem; however, the fix appears to be
    only temporary for me.

Discussion overview: scm-users group, hadoop category; posted Jan 17, '14 at 2:51p; last active Feb 4, '14 at 2:31p; 3 posts by 2 users (Misterblinky: 2 posts, Hadoopdan: 1 post).
