Full_Name: Don Allen
Version: 1.6.2
OS: Solaris
Submission from: (NULL) (140.186.148.11)


If you use '\246' to separate fields in a csv-like file, read.table fails if you
have more than 5 lines in the file (in the following, the separators in junk.csv
are really '\246's, despite the way they printed):

Fails:

read.table("/tmp/junk.csv",as.is=T,header=T,sep="\246")
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
        line 5 did not have 5 elements

junk.csv
----------------
x¦a¦b¦c¦d
1¦7¦13¦19¦25
2¦8¦14¦20¦26
3¦9¦15¦21¦27
4¦10¦16¦22¦28
5¦11¦17¦23¦29
6¦12¦18¦24¦30
----------------

The same call works if you delete the last two lines:

read.table("/tmp/junk.csv",as.is=T,header=T,sep="\246")
  x  a  b  c  d
1 1  7 13 19 25
2 2  8 14 20 26
3 3  9 15 21 27
4 4 10 16 22 28

When using tabs or vertical bars as separators, you do not encounter this
problem. The suspicion, of course, is that this has something to do with using a
separator that has the high-order bit set (Insightful introduced just such a bug
in Splus 6.1 that completely breaks their read.table for such separators).


  • Prof Brian Ripley at May 17, 2003 at 6:12 pm
    It transpires this is not to do with read.table: scan fails on your
    example and it is in scan that a character is being compared with an
    unsigned char after each has been coerced to int. It's of long standing
    (but 1.6.2 is not current, and please do check on the current version).

    It will be fixed for 1.7.1.

    ______________________________________________
    r-devel@stat.math.ethz.ch mailing list
    https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
    --
    Brian D. Ripley, ripley@stats.ox.ac.uk
    Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
    University of Oxford, Tel: +44 1865 272861 (self)
    1 South Parks Road, +44 1865 272866 (PA)
    Oxford OX1 3TG, UK Fax: +44 1865 272595
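
The comparison described in the reply above can be illustrated with a minimal C sketch. This is not R's actual scan source, which is not shown in this thread; the function names are hypothetical, and `signed char` is used explicitly so the behaviour matches platforms where plain `char` is signed. When both operands are promoted to int, the byte '\246' (0xA6) becomes -90 on the signed side and 166 on the unsigned side, so the equality test never succeeds for a separator with the high-order bit set.

```c
/* Illustrative only: hypothetical helpers, not taken from R's sources. */

/* Problem pattern: the separator is held in a signed char, the byte read
 * from the file in an unsigned char. Both are promoted to int before ==,
 * so 0xA6 is -90 on one side and 166 on the other. */
int matches_sep_buggy(signed char sep, unsigned char byte_read)
{
    return sep == byte_read;
}

/* One possible repair (an assumption, not necessarily the change made in
 * R 1.7.1): convert the separator to unsigned char before comparing. */
int matches_sep_fixed(signed char sep, unsigned char byte_read)
{
    return (unsigned char) sep == byte_read;
}
```

With a 7-bit separator such as '|' both versions agree, which is consistent with the report that tabs and vertical bars are unaffected.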

Discussion Overview
group: r-devel
categories: r
posted: May 17, '03 at 5:03 pm
active: May 17, '03 at 6:12 pm
posts: 2
users: 2
website: r-project.org
irc: #r