2014-02-03

Hi All!

I would like to believe that our infrastructure is that stable, but I know that isn't the case!

 

We are using Zenoss 4.2.0

 

Scenario:

Nothing appears to have changed in the environment, but all of a sudden we are not getting any events for any of our devices. We have also noticed that on the odd occasion one or two of the daemons stop and need restarting (not unusual for us, though).

I am a Windows admin, so please be gentle!

 

Looking at the daemons and their associated logs, I can see that we have problems, but I need a starting point. Some errors:

 

zenactiond

2014-02-03 12:58:12,678 INFO zen.maintenance: Performing periodic maintenance

2014-02-03 12:58:12,679 INFO zen.zenoss.protocols.amqp: amqp connection was closed [Errno 111] Connection refused

2014-02-03 12:58:12,680 INFO zen.zenoss.protocols.amqp: amqp connection was closed [Errno 111] Connection refused

2014-02-03 12:58:12,682 ERROR zen.maintenance: Maintenance failed. Message from hub: Could not publish message. Connection may be down

2014-02-03 12:59:12,683 INFO zen.maintenance: Performing periodic maintenance

2014-02-03 12:59:12,684 INFO zen.zenoss.protocols.amqp: amqp connection was closed [Errno 111] Connection refused

2014-02-03 12:59:12,684 INFO zen.zenoss.protocols.amqp: amqp connection was closed [Errno 111] Connection refused

2014-02-03 12:59:12,685 ERROR zen.maintenance: Maintenance failed. Message from hub: Could not publish message. Connection may be down

 

zeneventd

2014-02-03 12:57:41,361 ERROR zen.maintenance: Maintenance failed. Message from hub: Could not publish message. Connection may be down

2014-02-03 12:58:41,368 ERROR zen.maintenance: Maintenance failed. Message from hub: Could not publish message. Connection may be down

2014-02-03 12:59:41,377 ERROR zen.maintenance: Maintenance failed. Message from hub: Could not publish message. Connection may be down

2014-02-03 13:00:41,387 ERROR zen.maintenance: Maintenance failed. Message from hub: Could not publish message. Connection may be down

2014-02-03 13:01:41,394 ERROR zen.maintenance: Maintenance failed. Message from hub: Could not publish message. Connection may be down

 

zeneventlog

2014-02-03 13:01:49,939 INFO zen.maintenance: Performing periodic maintenance

2014-02-03 13:01:49,940 INFO zen.zeneventlog: Counter discardedEvents, value 148146279

2014-02-03 13:01:49,940 INFO zen.zeneventlog: Counter eventCount, value 177093947

2014-02-03 13:01:49,941 INFO zen.zeneventlog: 30 devices processed (0 datapoints)

2014-02-03 13:01:49,943 INFO zen.collector.scheduler: Tasks: 31 Successful_Runs: 4257 Failed_Runs: 0 Missed_Runs: 15 Queued_Tasks: 0 Running_Tasks: 1 

2014-02-03 13:01:50,067 ERROR zen.zeneventlog: Discarding oldest 51 events because maxqueuelen was exceeded: 5051/5000

2014-02-03 13:01:50,130 ERROR zen.zeneventlog: Discarding oldest 51 events because maxqueuelen was exceeded: 5051/5000

2014-02-03 13:02:07,906 ERROR zen.zeneventlog: Discarding oldest 51 events because maxqueuelen was exceeded: 5051/5000

2014-02-03 13:02:07,969 ERROR zen.zeneventlog: Discarding oldest 51 events because maxqueuelen was exceeded: 5051/5000

2014-02-03 13:02:08,140 ERROR zen.zeneventlog: Discarding oldest 51 events because maxqueuelen was exceeded: 5051/5000

2014-02-03 13:02:08,289 ERROR zen.zeneventlog: Discarding oldest 51 events because maxqueuelen was exceeded: 5051/5000

2014-02-03 13:02:08,486 ERROR zen.zeneventlog: Discarding oldest 51 events because maxqueuelen was exceeded: 5051/5000

2014-02-03 13:02:08,543 ERROR zen.zeneventlog: Discarding oldest 51 events because maxqueuelen was exceeded: 5051/5000

 

zenhub

2014-02-03 13:07:06,463 INFO zen.ZenHub: Worker (19051) reports

2014-02-03 13:07:06,462 INFO zen.zenoss.protocols.amqp: amqp connection was closed [Errno 111] Connection refused

2014-02-03 13:07:06,463 INFO zen.ZenHub: Worker (19051) reports

2014-02-03 13:07:06,463 INFO zen.zenoss.protocols.amqp: amqp connection was closed [Errno 111] Connection refused

2014-02-03 13:07:06,463 INFO zen.ZenHub: Worker (19051) reports

2014-02-03 13:07:06,463 ERROR zen.Events: Could not publish message. Connection may be down
Traceback (most recent call last):
  File "/opt/zenoss/Products/ZenEvents/MySqlSendEvent.py", line 42, in sendEvents
    self._publishEvent(event, publisher)
  File "/opt/zenoss/Products/ZenEvents/MySqlSendEvent.py", line 82, in _publishEvent
    publisher.publish(event)
  File "/opt/zenoss/Products/ZenMessaging/queuemessaging/publisher.py", line 283, in publish
    self._publish("$RawZenEvents", routing_key, event, mandatory=mandatory, immediate=immediate)
  File "/opt/zenoss/Products/ZenMessaging/queuemessaging/publisher.py", line 302, in _publish
    mandatory, immediate)
  File "/opt/zenoss/Products/ZenMessaging/queuemessaging/publisher.py", line 376, in publish
    headers=headers, declareExchange=declareExchange)
  File "/opt/zenoss/lib/python/zenoss/protocols/amqp.py", line 138, in publish
    raise Exception("Could not publish message. Connection may be down")
Exception: Could not publish message. Connection may be down

 

zenjobs

Traceback (most recent call last):
  File "/opt/zenoss/lib/python/celery/worker/consumer.py", line 349, in start
    self.reset_connection()
  File "/opt/zenoss/lib/python/celery/worker/consumer.py", line 592, in reset_connection
    self.connection = self._open_connection()
  File "/opt/zenoss/lib/python/celery/worker/consumer.py", line 657, in _open_connection
    self.app.conf.BROKER_CONNECTION_MAX_RETRIES)
  File "/opt/zenoss/lib/python/kombu/connection.py", line 223, in ensure_connection
    interval_start, interval_step, interval_max)
  File "/opt/zenoss/lib/python/kombu/utils/__init__.py", line 158, in retry_over_time
    return fun(*args, **kwargs)
  File "/opt/zenoss/lib/python/kombu/connection.py", line 146, in connect
    return self.connection
  File "/opt/zenoss/lib/python/kombu/connection.py", line 574, in connection
    self._connection = self._establish_connection()
  File "/opt/zenoss/lib/python/kombu/connection.py", line 533, in _establish_connection
    conn = self.transport.establish_connection()
  File "/opt/zenoss/lib/python/kombu/transport/amqplib.py", line 278, in establish_connection
    connect_timeout=conninfo.connect_timeout)
  File "/opt/zenoss/lib/python/kombu/transport/amqplib.py", line 88, in __init__
    super(Connection, self).__init__(*args, **kwargs)
  File "/opt/zenoss/lib/python/amqplib/client_0_8/connection.py", line 129, in __init__
    self.transport = create_transport(host, connect_timeout, ssl)
  File "/opt/zenoss/lib/python/amqplib/client_0_8/transport.py", line 281, in create_transport
    return TCPTransport(host, connect_timeout)
  File "/opt/zenoss/lib/python/amqplib/client_0_8/transport.py", line 85, in __init__
    raise socket.error, msg
error: [Errno 111] Connection refused

2014-02-03 13:08:25,216 ERROR celery.worker.consumer: Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 2 seconds...

2014-02-03 13:08:27,219 ERROR celery.worker.consumer: Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 4 seconds...

2014-02-03 13:08:31,224 ERROR celery.worker.consumer: Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 6 seconds...

2014-02-03 13:08:37,231 ERROR celery.worker.consumer: Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 8 seconds...

2014-02-03 13:08:45,240 ERROR celery.worker.consumer: Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 10 seconds...

2014-02-03 13:08:55,251 ERROR celery.worker.consumer: Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 12 seconds...

2014-02-03 13:09:07,264 ERROR celery.worker.consumer: Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 14 seconds...
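
The common thread in the zenactiond, zenhub and zenjobs logs above is "[Errno 111] Connection refused" from the AMQP client. On Linux that errno is ECONNREFUSED, meaning nothing accepted the TCP connection at the address the daemons tried. A minimal sketch of how that could be checked by hand is below. The host 127.0.0.1 and port 5672 are assumptions (5672 is the standard AMQP port; the logs do not show which address Zenoss is actually configured to use), and describe_port is a hypothetical helper name, not part of Zenoss.

```python
import errno
import socket


def describe_port(host, port, timeout=3.0):
    """Try a plain TCP connect and return "open", "refused", or "error: <name>"."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success, otherwise the errno of the failure
        code = sock.connect_ex((host, port))
    finally:
        sock.close()
    if code == 0:
        return "open"
    if code == errno.ECONNREFUSED:
        # Same condition the daemons log as "[Errno 111] Connection refused" on Linux
        return "refused"
    return "error: %s" % errno.errorcode.get(code, str(code))


if __name__ == "__main__":
    # Assumed broker address; adjust to whatever the Zenoss config actually points at.
    print(describe_port("127.0.0.1", 5672))
```

If this reports "refused" when run on the Zenoss server, the message broker is most likely not running or not listening on that address, which would be consistent with every daemon failing to publish at once.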

 

All other logs seem to be error free except zopectl, which just tells me there is a problem and to forward it to Zenoss. Here are the error details:

 

Type: <type 'exceptions.ValueError'>
Value: 'event' is not a valid daemon name

 

I am hoping that with the above information someone can give me a starting point. I apologise in advance if this is not in the correct forum.

 

Many thanks

Rob
