Hello all,
I have what I feel is a unique situation, which may not be resolved with
this inquiry. I have constructed the data set below to give an example of
what I'm doing. The example works perfectly and I have no issues with it.
My problem arises with my actual data, which includes another 11 columns
(used in later analysis) and about 7000 cases (rows) in total. I mention
the dimensions of the actual data because I'm wondering whether the
process below would run into problems with more data.
To be clear, the problem occurs in the last step. Is$NotTooSmall.X gives me
a logical (TRUE/FALSE) vector that is then used to index MultiLotBldgs..
(as shown in the example) to return the cases I want to keep.
In my actual data the logical values are correct, but MultiLotBldgs2..
still contains the cases that are FALSE in Is$NotTooSmall.X. As I said, my
sample data works fine but my actual implementation does not. Any
suggestions? I know this is not easy to answer without seeing the problem,
but this is the best I can do without sending you all of my data.

Cheers,
JR




# Sample data
Bldgid <- c(1000, 1000, 1001, 1002, 1003, 1003)
Maplot <- c(20000, 20001, 30000, 30001, 40000, 40001)
Area <- c(40, 170, 50, 100, 100, 4.9)

# Construct sample data frame
MultiLotBldgs.. <- data.frame(Bldgid, Maplot, Area)

# Get building areas
MultiLotBldgArea.X <- unlist(tapply(MultiLotBldgs..$Area,
                                    MultiLotBldgs..$Bldgid,
                                    function(x) x))

# Calculate the proportion of the total building area in each piece of
# the building
MultiLotBldgProp.X <- unlist(tapply(MultiLotBldgs..$Area,
                                    MultiLotBldgs..$Bldgid,
                                    function(x) x / sum(x)))

# Identify buildings that should be considered for joining
Is$NotTooSmall.X <- !((MultiLotBldgArea.X <= 45) |
                      ((MultiLotBldgArea.X > 45) & (MultiLotBldgProp.X < 0.05)))

MultiLotBldgs2.. <- MultiLotBldgs..[Is$NotTooSmall.X, ]

--
View this message in context: http://n4.nabble.com/Procedure-not-working-for-actual-data-tp1559492p1559492.html
Sent from the R help mailing list archive at Nabble.com.


  • Jim holtman at Feb 17, 2010 at 11:58 pm
    Your example does not work since "Is" is not defined. What is it supposed
    to be?
    On Wed, Feb 17, 2010 at 6:34 PM, LCOG1 wrote:



    ______________________________________________
    r-help@r-project.org mailing list
    https://stat.ethz.ch/mailman/listinfo/r-help
    PLEASE do read the posting guide
    http://www.R-project.org/posting-guide.html
    and provide commented, minimal, self-contained, reproducible code.


    --
    Jim Holtman
    Cincinnati, OH
    +1 513 646 9390

    What is the problem that you are trying to solve?
  • ROLL Josh F at Feb 17, 2010 at 11:58 pm
    Sorry, it is just a generic list:

    Is<-list()

    I forgot to copy that over from my actual code.
  • Jim holtman at Feb 18, 2010 at 12:04 am
    You might want to put some debugging statements to see what the values are.
    As you say the example seems to work, so look at the data:
    MultiLotBldgs2.. <- MultiLotBldgs..[Is$NotTooSmall.X, ]
    # some debugging statements
    str(MultiLotBldgs2..)
    'data.frame': 4 obs. of 3 variables:
    $ Bldgid: num 1000 1001 1002 1003
    $ Maplot: num 20001 30000 30001 40000
    $ Area : num 170 50 100 100
    str(Is)
    List of 1
    $ NotTooSmall.X: Named logi [1:6] FALSE TRUE TRUE TRUE TRUE FALSE
    ..- attr(*, "names")= chr [1:6] "10001" "10002" "1001" "1002" ...
    print(sum(Is$NotTooSmall))
    [1] 4
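The debugging idea above can be taken one step further with explicit checks. The following is only a sketch: it rebuilds the objects from the original post (with Is created as an empty list, as the follow-up message explains), and the names length_ok, no_na and order_ok are hypothetical helpers, not part of the original code. On the sample data all three checks pass; on the real data, a FALSE in any of them would suggest that the logical vector does not line up with the rows of the data frame.

```r
# Sample data from the original post
MultiLotBldgs.. <- data.frame(
  Bldgid = c(1000, 1000, 1001, 1002, 1003, 1003),
  Maplot = c(20000, 20001, 30000, 30001, 40000, 40001),
  Area   = c(40, 170, 50, 100, 100, 4.9)
)
Is <- list()

MultiLotBldgArea.X <- unlist(tapply(MultiLotBldgs..$Area,
                                    MultiLotBldgs..$Bldgid,
                                    function(x) x))
MultiLotBldgProp.X <- unlist(tapply(MultiLotBldgs..$Area,
                                    MultiLotBldgs..$Bldgid,
                                    function(x) x / sum(x)))
Is$NotTooSmall.X <- !((MultiLotBldgArea.X <= 45) |
                      ((MultiLotBldgArea.X > 45) & (MultiLotBldgProp.X < 0.05)))

# Check 1 (hypothetical helper): one logical value per row of the data frame
length_ok <- length(Is$NotTooSmall.X) == nrow(MultiLotBldgs..)

# Check 2 (hypothetical helper): no NA in the mask; an NA index yields an NA row
no_na <- !anyNA(Is$NotTooSmall.X)

# Check 3 (hypothetical helper): the unlisted areas are still in row order
order_ok <- all(unname(MultiLotBldgArea.X) == MultiLotBldgs..$Area)

print(c(length_ok = length_ok, no_na = no_na, order_ok = order_ok))
```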
  • ROLL Josh F at Feb 18, 2010 at 12:07 am
    Jim,
    Thanks for the input so far. Perhaps a solution would be to do what I'm doing in my example in a different way, try that with my actual data set, and see whether I have the same issues. To be honest, I am something of a novice R programmer, and my example code took me a substantial amount of time to construct. Do you have a suggestion for a different way of doing it using my sample data?

  • Jim holtman at Feb 18, 2010 at 1:09 am

    Try this on your real data:

    # Sample data
    Bldgid <- c(1000, 1000, 1001, 1002, 1003, 1003)
    Maplot <- c(20000, 20001, 30000, 30001, 40000, 40001)
    Area <- c(40, 170, 50, 100, 100, 4.9)
    # Construct sample data frame
    MultiLotBldgs.. <- data.frame(Bldgid, Maplot, Area)
    # Get building area proportions
    MultiLotBldgs..$Prop <- ave(MultiLotBldgs..$Area, MultiLotBldgs..$Bldgid,
                                FUN = function(x) x / sum(x))
    # Find pieces that are not too small
    notTooSmall <- !((MultiLotBldgs..$Area <= 45) |
                     ((MultiLotBldgs..$Area > 45) & (MultiLotBldgs..$Prop < 0.05)))
    MultiLotBldgs2.. <- MultiLotBldgs..[notTooSmall, ]
    # Print out results
    MultiLotBldgs2..
    #   Bldgid Maplot Area      Prop
    # 2   1000  20001  170 0.8095238
    # 3   1001  30000   50 1.0000000
    # 4   1002  30001  100 1.0000000
    # 5   1003  40000  100 0.9532888
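As a side note, the filter condition can be written more compactly. By De Morgan's rule, "not (Area <= 45, or Area > 45 and Prop < 0.05)" reduces to "Area > 45 and Prop >= 0.05". The sketch below checks this on the sample data; the names original and simplified are hypothetical, and the comparison is only demonstrated for this small data set.

```r
# Sample data with the per-building proportion added via ave()
MultiLotBldgs.. <- data.frame(
  Bldgid = c(1000, 1000, 1001, 1002, 1003, 1003),
  Maplot = c(20000, 20001, 30000, 30001, 40000, 40001),
  Area   = c(40, 170, 50, 100, 100, 4.9)
)
MultiLotBldgs..$Prop <- ave(MultiLotBldgs..$Area, MultiLotBldgs..$Bldgid,
                            FUN = function(x) x / sum(x))

# Condition as written in the thread
original <- !((MultiLotBldgs..$Area <= 45) |
              ((MultiLotBldgs..$Area > 45) & (MultiLotBldgs..$Prop < 0.05)))

# Equivalent, shorter form (hypothetical rewrite, not from the thread)
simplified <- MultiLotBldgs..$Area > 45 & MultiLotBldgs..$Prop >= 0.05

# Both masks select the same rows on the sample data
print(identical(original, simplified))
print(MultiLotBldgs..[simplified, ])
```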

  • ROLL Josh F at Feb 18, 2010 at 4:50 pm
    Hey Jim,
    That appears to work properly with my larger data set. That's really strange to me, though: why would my procedure fail on the real data even though the test works correctly? I have always coded under the assumption that the code doesn't do anything the user doesn't tell it to, but I can't see a problem with my code.

    I am looking at a similar problem with another piece of the code right now, where everything looks right but it just isn't giving me the right output, although I haven't constructed a test yet.

    Thanks for the help.

    JR


  • Jim holtman at Feb 18, 2010 at 6:17 pm
    Even though it may work for a small subset, it can still break on
    larger sets. Your code made several 'unlist' calls that tear the data
    apart, and it is possible that some of the resulting vectors were not
    aligned with the rows of the data in the way you thought they were.
    What you need to do in that case is break down what is happening and
    look at the data in each substep to make sure it is what you are
    expecting.
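A minimal sketch of the kind of misalignment described above, using made-up data in which Bldgid is not sorted. That is an assumption for illustration only: the thread never shows the real data, so this is one plausible cause, not a confirmed diagnosis. tapply() groups by the sorted values of Bldgid, so unlist() returns the pieces in group order, while the data frame keeps its original row order. ave() returns one value per row in the original order.

```r
# Hypothetical data: same kind of columns as the sample, but Bldgid is NOT sorted
bldgs <- data.frame(
  Bldgid = c(1003, 1000, 1003, 1001, 1000),
  Area   = c(100, 40, 4.9, 50, 170)
)

# tapply() + unlist(): values come back grouped by sorted Bldgid
# (1000, 1000, 1001, 1003, 1003), not in the row order of the data frame
area_grouped <- unlist(tapply(bldgs$Area, bldgs$Bldgid, function(x) x))
print(unname(area_grouped))   # 40 170 50 100 4.9

# ave(): one value per row, in the original row order
prop_by_row <- ave(bldgs$Area, bldgs$Bldgid, FUN = function(x) x / sum(x))

# A mask built from the grouped vector points at the wrong rows
mask_grouped <- unname(area_grouped > 45)
mask_by_row  <- bldgs$Area > 45

print(mask_grouped)                          # FALSE TRUE TRUE TRUE FALSE
print(mask_by_row)                           # TRUE FALSE FALSE TRUE TRUE
print(identical(mask_grouped, mask_by_row))  # FALSE

# A quick test for this situation on real data
print(is.unsorted(bldgs$Bldgid))             # TRUE
```

With the sample data from the original post the building IDs happen to be sorted already, so group order and row order coincide, which would explain why the example works.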
  • Bert Gunter at Feb 18, 2010 at 6:38 pm
    ?traceback may be useful.

    Bert Gunter
    Genentech Nonclinical Biostatistics



    -----Original Message-----
    From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
    Behalf Of jim holtman
    Sent: Thursday, February 18, 2010 10:17 AM
    To: ROLL Josh F
    Cc: r-help at r-project.org
    Subject: Re: [R] Procedure not working for actual data

    Even though it may work for a small subset, it can still break on
    larger sets. Your code was doing a number of 'unlist' and tearing
    apart the data and it is possible that some of the transformations
    were not aligned with the data in the way you thought them to be.
    What you need to do in that case is break down what is happening and
    look at the data in each substep to make sure it is what you are
    expecting.
    On Thu, Feb 18, 2010 at 11:50 AM, ROLL Josh F wrote:
    Hey Jim,
    That appears to work properly with my larger data set. That's really
    strange to me though: why would my procedure not work even though the test
    works correctly? I have always coded under the assumption that the code
    doesn't do anything the user doesn't tell it to, but I can't see a problem
    with my code.

    I am looking at a similar problem with another piece of the code right now
    where everything looks right but it just isn't giving me the right output,
    although I haven't constructed a test yet.

    Thanks for the help.

    JR

    ________________________________
    From: jim holtman [mailto:jholtman at gmail.com]
    Sent: Wednesday, February 17, 2010 5:09 PM
    To: ROLL Josh F
    Cc: r-help at r-project.org
    Subject: Re: [R] Procedure not working for actual data

    Try this on your real data:
    #Sample data
    Bldgid<-c(1000,1000,1001,1002,1003,1003)
    Maplot<-c(20000,20001,30000,30001,40000,40001)
    Area<-c(40,170,50,100,100,4.9)
    #Construct Sample dataframe
    MultiLotBldgs..<-data.frame(Bldgid,Maplot,Area)
    #Get Building Area Proportions
    MultiLotBldgs..$Prop <- ave(MultiLotBldgs..$Area, MultiLotBldgs..$Bldgid,
        FUN=function(x) x / sum(x))
    # find not too small
    notTooSmall <- !((MultiLotBldgs..$Area <= 45) | ((MultiLotBldgs..$Area > 45) &
        (MultiLotBldgs..$Prop < 0.05)))
    MultiLotBldgs2.. <- MultiLotBldgs..[notTooSmall,]
    # print out results
    MultiLotBldgs2..
      Bldgid Maplot Area      Prop
    2   1000  20001  170 0.8095238
    3   1001  30000   50 1.0000000
    4   1002  30001  100 1.0000000
    5   1003  40000  100 0.9532888
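
As a hedged sketch, the same `ave()`-based filter could be wrapped in a small function that checks its own intermediate results. The function name `keep_not_too_small` and its arguments are hypothetical, not part of the thread; the thresholds 45 and 0.05 are the ones used above:

```r
# Hypothetical helper; the name and arguments are illustrative only
keep_not_too_small <- function(df, min_area = 45, min_prop = 0.05) {
  # Proportion of each building's total area, one value per row, in row order
  prop <- ave(df$Area, df$Bldgid, FUN = function(x) x / sum(x))

  # Same condition as in the thread
  keep <- !((df$Area <= min_area) | ((df$Area > min_area) & (prop < min_prop)))

  # The mask must have exactly one non-missing value per row
  stopifnot(length(keep) == nrow(df), !anyNA(keep))

  out <- df[keep, ]

  # No piece at or below the area threshold should survive the filter
  stopifnot(all(out$Area > min_area))
  out
}

# Usage on hypothetical data that is not sorted by Bldgid
unsorted <- data.frame(Bldgid = c(1003, 1000, 1001, 1000, 1002, 1003),
                       Area   = c(100, 40, 50, 170, 100, 4.9))
res <- keep_not_too_small(unsorted)
print(res)
```

The `stopifnot()` calls are only a sanity check: they catch a mask of the wrong length or a surviving small piece, not every possible misalignment.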

    On Wed, Feb 17, 2010 at 6:58 PM, ROLL Josh F wrote:

    Sorry Just a generic list

    Is<-list()

    forgot to add that from my actual code
    ________________________________
    From: jim holtman [mailto:jholtman at gmail.com]
    Sent: Wednesday, February 17, 2010 3:58 PM
    To: ROLL Josh F
    Cc: r-help at r-project.org
    Subject: Re: [R] Procedure not working for actual data

    Your example does not work since "Is" is not defined. What is it supposed
    to be?
    On Wed, Feb 17, 2010 at 6:34 PM, LCOG1 wrote:

    Hello all,
    I have what I feel is a unique situation, which may not be resolved with
    this inquiry. I have constructed the data set below so that I can give an
    example of what I'm doing. The example works perfectly and I have no issues
    with it. My problem arises with my actual data, which includes another 11
    columns of data (used in later analysis) and a total of about 7000
    cases (rows). I mention the dimensions of the actual data because I'm
    wondering whether my process below would encounter problems with more data.
    To be sure, the problem occurs in the last step. Is$NotTooSmall gives me a
    binary output that is then put back into MultiLotBldgs.. (as shown in the
    example) to return the cases I want to keep.
    In my actual data the binary designation is correct, but when
    MultiLotBldgs2.. is returned it doesn't remove the cases that are FALSE in
    Is$NotTooSmall. As I said, my sample data works fine but my actual
    implementation does not. Any suggestions? I know this is not easy to
    answer without seeing the problem, but this is the best I can do without
    sending you all of my data.

    Cheers,
    JR




    #Sample data
    Bldgid<-c(1000,1000,1001,1002,1003,1003)
    Maplot<-c(20000,20001,30000,30001,40000,40001)
    Area<-c(40,170,50,100,100,4.9)
    #Construct Sample dataframe
    MultiLotBldgs..<-data.frame(Bldgid,Maplot,Area)
    #Get Building Areas
    MultiLotBldgArea.X <- unlist(tapply(MultiLotBldgs..$Area,
                                        MultiLotBldgs..$Bldgid,
                                        function(x) x))

    # Calculate the proportion of the total building area in each piece of
    # the building
    MultiLotBldgProp.X <- unlist(tapply(MultiLotBldgs..$Area,
                                        MultiLotBldgs..$Bldgid,
                                        function(x) x/sum(x)))

    #Identify buildings that should be considered for joining
    Is$NotTooSmall.X <- !(((MultiLotBldgArea.X <= 45) |
                           ((MultiLotBldgArea.X > 45) &
                            (MultiLotBldgProp.X < 0.05))))

    MultiLotBldgs2.. <- MultiLotBldgs..[Is$NotTooSmall.X, ]


Discussion Overview
group: r-help @
categories: r
posted: Feb 17, '10 at 11:34p
active: Feb 18, '10 at 6:38p
posts: 9
users: 3
website: r-project.org
irc: #r
