Hi,

I am running a Flume agent through Cloudera Manager, and the agent does not
roll data according to the configured settings. Instead, it always rolls
arbitrarily into very small files (about 1 KB). The same configuration works
fine in local mode, but it does not work when run through the manager. The
HDFS sink configuration is as follows:

dataplatform.sinks.sink1.hdfs.round = true
dataplatform.sinks.sink1.hdfs.roundValue = 1
dataplatform.sinks.sink1.hdfs.roundUnit = minute
dataplatform.sinks.sink1.hdfs.rollInterval = 60
dataplatform.sinks.sink1.hdfs.rollSize = 0
dataplatform.sinks.sink1.hdfs.rollCount = 0
dataplatform.sinks.sink1.hdfs.batchSize = 100
dataplatform.sinks.sink1.hdfs.txnEventMax = 40000
dataplatform.sinks.sink1.hdfs.fileType = DataStream
dataplatform.sinks.sink1.hdfs.maxOpenFiles = 50
dataplatform.sinks.sink1.hdfs.appendTimeout = 10000
dataplatform.sinks.sink1.hdfs.callTimeout = 10000
dataplatform.sinks.sink1.hdfs.threadsPoolSize = 100
dataplatform.sinks.sink1.hdfs.rollTimerPoolSize = 1


Please let me know if I am doing anything wrong that is causing this
issue.

Thanks
Amar
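
As a reading aid, the roll-related settings from the configuration above are annotated below. The comments follow the HDFS sink behaviour as described in the Flume User Guide. The final, commented-out line is an assumption that nobody in this thread confirms: on a real cluster the sink may also roll a file when it sees under-replicated blocks, and `hdfs.minBlockReplicas` (only present in Flume versions that support it) is one setting sometimes tried in that situation.

```properties
# Roll (close the current file and open a new one) every 60 seconds.
dataplatform.sinks.sink1.hdfs.rollInterval = 60

# 0 disables rolling based on file size (in bytes).
dataplatform.sinks.sink1.hdfs.rollSize = 0

# 0 disables rolling based on the number of events written.
dataplatform.sinks.sink1.hdfs.rollCount = 0

# round / roundValue / roundUnit round down the event timestamp used in
# hdfs.path escape sequences; they do not control when a file is rolled.
dataplatform.sinks.sink1.hdfs.round = true
dataplatform.sinks.sink1.hdfs.roundValue = 1
dataplatform.sinks.sink1.hdfs.roundUnit = minute

# Assumption, not from this thread: worth testing only if the small files
# turn out to be caused by block under-replication on the cluster.
# dataplatform.sinks.sink1.hdfs.minBlockReplicas = 1
```

With these values, only the 60-second timer should trigger a roll, so files much smaller than one minute of traffic suggest that something other than the three roll settings is closing them.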

  • Vikram Srivastava at Jul 11, 2013 at 3:28 pm
    Adding cdh-user.
