Hi.
Recently I made a small script to do some file transferring (among other
things). I wanted to monitor the progress of the file transfer, so I needed
to know the size of the files I was transferring. Finding out how to get
this information took some time (reading the manuals - googling did not
prove worthwhile). Anyway, I did eventually figure out how to do it (there
are a few ways, including os.path.getsize(filename)).
My question is this: Is there a reason why file objects could not have a
size method or property? So that you could then just ask the file how big it
is using fd.size or fd.size(). I'm just curious, because, well, it seems to
have obvious utility, and the way to find it is less than obvious (at least,
it was to me).
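For reference, the "few ways" mentioned above can be sketched roughly as follows. This is a minimal sketch assuming a current Python 3; the helper name `sizes_for` is made up for illustration and is not part of any library.

```python
import os


def sizes_for(filename):
    """Return the size in bytes of `filename`, obtained three different ways."""
    by_getsize = os.path.getsize(filename)        # the call named in the post
    by_stat = os.stat(filename).st_size           # same number, via os.stat
    with open(filename, "rb") as f:               # via an already-open file
        by_fstat = os.fstat(f.fileno()).st_size
    return by_getsize, by_stat, by_fstat
```

All three values should agree for an ordinary file on disk; the third form is the closest existing equivalent to asking an open file how big it is.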
Thanks,
Sean
[Python] finding file size
-
David M. Wilson at Jan 2, 2004 at 6:46 am ⇧
"Sean Ross" <sross at connectmail.carleton.ca> wrote...
> My question is this: Is there a reason why file objects could not have a
> size method or property? So that you could then just ask the file how big it
> is using fd.size or fd.size(). I'm just curious, because, well, it seems to
> have obvious utility, and the way to find it is less than obvious (at least,
> it was to me).
Hey!
1) Using 'fd' as a name for a file object is a bad idea - you can get
fds from os.open. If you insist on C-ish names, how about 'fp'
instead? :)
2) There's nothing to stop the file object from having a size method,
except that file-like objects then have more to implement.
How about something like:
py> class SizedFile(file):
...     def __len__(self):
...         oldpos = self.tell()
...         self.seek(0, 2)
...         length = self.tell()
...         self.seek(oldpos)
...         return length
...
py> bleh = SizedFile("/etc/passwd")
py> len(bleh)
1520
py> len([ x for x in bleh ])
33
As I wrote this I realised it's wrong - size() would be better, since
the length of the sequence is not the number of bytes. Maybe it is in
binary mode? Dunno, me sleepy, goodnight..
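The size()-rather-than-len() idea can be sketched as a plain function. This is only a sketch under stated assumptions: it targets a current Python 3 (where the old `file` builtin no longer exists), the name `stream_size` is made up here, and it is meant for seekable files opened in binary mode, because in text mode tell() returns an opaque number rather than a byte count.

```python
import io
import os


def stream_size(f):
    """Return the size in bytes of a seekable binary file, keeping its position."""
    oldpos = f.tell()
    try:
        # seek() returns the new absolute position, i.e. the end of the stream.
        return f.seek(0, os.SEEK_END)
    finally:
        f.seek(oldpos)          # restore the caller's read/write position
```

With `buf = io.BytesIO(b"a\nbb\nccc\n")`, `stream_size(buf)` gives the byte count while `len(list(buf))` gives the line count, which is the mismatch described above.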
David.
-
Sean Ross at Jan 2, 2004 at 5:44 pm ⇧
"David M. Wilson" <dw-google.com at botanicus.net> wrote in message
news:99dce321.0401012246.712dd2fb at posting.google.com...
> 1) Using 'fd' as a name for a file object is a bad idea - you can get
> fds from os.open. If you insist on C-ish names, how about 'fp'
> instead? :)
or just f would work ...
> 2) There's nothing to stop the file object from having a size method,
> except that file-like objects then have more to implement.
See Martin v. Loewis' post for some other rationale.
> How about something like:
> py> class SizedFile(file):
> ...     def __len__(self):
> ...         oldpos = self.tell()
> ...         self.seek(0, 2)
> ...         length = self.tell()
> ...         self.seek(oldpos)
> ...         return length
> ...
> py> bleh = SizedFile("/etc/passwd")
> py> len(bleh)
> 1520
> py> len([ x for x in bleh ])
> 33
> As I wrote this I realised it's wrong - size() would be better, since
> the length of the sequence is not the number of bytes. Maybe it is in
> binary mode? Dunno, me sleepy, goodnight..
> David.
Right. size() is more apt. Also, while I appreciate the effort of
subclassing file, what I was looking for was to have the builtin file (or
file-like) objects expose this operation, not just custom implementations.
Thanks for your response,
Sean
-
Martin v. Loewis at Jan 2, 2004 at 11:39 am ⇧
Sean Ross wrote:
> My question is this: Is there a reason why file objects could not have a
> size method or property?
Yes. In Python, file objects belong to the larger category of "file-like
objects", and not all file-like objects have the inherent notion of a
size. E.g. what would you think sys.stdin.size should return (which
actually is a proper file object - not just file-like)?
Other examples include the things returned from os.popen or socket.socket.
Regards,
Martin
-
Sean Ross at Jan 2, 2004 at 5:34 pm ⇧
"Martin v. Loewis" <martin at v.loewis.de> wrote in message
news:bt3l9i$pog$07$3 at news.t-online.com...
> Sean Ross wrote:
>> My question is this: Is there a reason why file objects could not have a
>> size method or property?
> Yes. In Python, file objects belong to the larger category of "file-like
> objects", and not all file-like objects have the inherent notion of a
> size. E.g. what would you think sys.stdin.size should return (which
> actually is a proper file object - not just file-like)?
> Other examples include the things returned from os.popen or socket.socket.
> Regards,
> Martin
I see what you mean. I suppose the only option I could think of for
sys.stdin, os.popen, and socket.socket would be to return the number of
bytes written to these objects so far. But, then, those objects, or
something else, would have to track that information. Also, pipes and
sockets could be written to from two directions, so is the size the total
number of bytes written from both sides, or would you prefer to know how
much you'd written as the size, or how much the other side had written?
(Perhaps all three would be nice.) Another option would be to return '-1',
or 'None', to let people know that the request is unsupported for this
file-like object. Still another option would be to raise an exception. And,
of course, there's the ever-popular leave-well-enough-alone option.
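Two of the options listed above (return None, or raise an exception) can be sketched as follows. This is a rough sketch assuming Python 3's io objects; the names `size_or_none` and `size_or_raise` are hypothetical, and "has no size" is approximated here by "is not seekable".

```python
import io
import os


def size_or_none(f):
    """Return the size in bytes of `f`, or None if `f` is not seekable."""
    try:
        if not f.seekable():
            return None
    except AttributeError:      # object does not offer seekable() at all
        return None
    oldpos = f.tell()
    try:
        return f.seek(0, os.SEEK_END)
    finally:
        f.seek(oldpos)          # leave the position where we found it


def size_or_raise(f):
    """Like size_or_none, but raise instead of returning None."""
    size = size_or_none(f)
    if size is None:
        raise io.UnsupportedOperation("object has no meaningful size")
    return size
```

A pipe or socket wrapper reports itself as not seekable, so it would take the None (or exception) branch, while a regular file opened in binary mode would report its byte count.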
Anyway, thank you for your response. I see its merit.
Sean
-
Gerrit Holl at Jan 3, 2004 at 10:17 am ⇧
Hi,
I propose to add a "filename" type to Python.
Martin v. Loewis wrote:
> Sean Ross wrote:
>> My question is this: Is there a reason why file objects could not have a
>> size method or property?
> Yes. In Python, file objects belong to the larger category of "file-like
> objects", and not all file-like objects have the inherent notion of a
> size. E.g. what would you think sys.stdin.size should return (which
> actually is a proper file object - not just file-like)?
A different solution to this problem would be to introduce a "filename"
type to Python, a subclass of str. The "name" attribute of file would be of this
type. This type would inherit a lot of os.path stuff: getsize becomes
simpler, more readable, and more object oriented, as do other os.path
functions. I think the alternatives look a lot prettier:
OLD                                        NEW
os.path.realpath(fn)                       fn.realpath()
os.path.getmtime(fp.name)                  fp.name.getmtime()
os.path.ismount(os.path.dirname(fp.name))  fp.name.dirname().ismount()
It's more beautiful, simpler, flatter (#3), practical, obvious, easy.
problem: what to do with os.path constants?
solution: make them class attributes
problem: how to handle posixpath, ntpath, macpath?
solution: abstract Path class with NTPath, MacPath, PosixPath subclasses, one of which is the actual type of e.g. fn.name on a certain platform
problem: backwards compatibility
solution: same as string methods
problem: "/dev/null" reads as a Path but is a str
solution: path("/dev/null") is a little more typing for a lot more luxury
problem: what to do with commonprefix?
solution: don't know
problem: what to do with os.path.walk?
solution: use os.walk instead
problem: what to do with sameopenfile?
solution: make it a file method
problem: what to do with join, split?
solution: rename to joinpath, splitpath.
Any comments?
yours,
Gerrit.
--
158. If any one be surprised after his father with his chief wife, who
has borne children, he shall be driven out of his father's house.
-- 1780 BC, Hammurabi, Code of Law
--
Asperger's Syndrome - a personal approach:
http://people.nl.linux.org/~gerrit/english/
-
Martin v. Loewis at Jan 3, 2004 at 12:11 pm ⇧
-
Gerrit Holl at Jan 3, 2004 at 1:09 pm ⇧
Martin v. Loewis wrote:
> Gerrit Holl wrote:
>> Any comments?
> It should be possible to implement that type without modifying
> Python proper.
It should indeed. But it isn't what I had in mind, and it's not exactly
the same as a filename type in the language: for example, the name
attribute of a file will still be a string, just as the contents of
os.listdir, glob.glob, etc. (it seems glob follows listdir).
> It might make a good recipe for the cookbook.
If the type would be created without changing python proper, the type
would probably just call os.path.foo for the filename.foo method. It
would be the other way around if the type would become part of the
language: os.path would only be there for backward compatibility, like
string. But in order for os.listdir (and probably more functions) to
return Path objects rather than strings, a C implementation would be
preferable (necessary?). On the other hand, would this type ever be
added, a python implementation would of course be a must.
> Any volunteers?
I may have a look at it.
When thinking about it, a lot more issues than raised in my first post
need to be resolved, like what to do when the initializer is empty...
curdir? root?
I guess there would be a base class with all os-independent stuff, or stuff
that can be coded independently, e.g:
import os

class Path(str):
    # sep and extsep are supplied by the platform-specific subclasses.
    def split(self):
        # Note: this shadows str.split.
        return self.rsplit(self.sep, 1)
    def splitext(self):
        return self.rsplit(self.extsep, 1)
    def basename(self):
        return self.split()[1]
    def dirname(self):
        return self.split()[0]
    def getsize(self):
        return os.stat(self).st_size
    def getmtime(self):
        return os.stat(self).st_mtime
    def getatime(self):
        return os.stat(self).st_atime
    def getctime(self):
        return os.stat(self).st_ctime
where the subclasses define sep, extsep, etc.
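A runnable variant of this idea, for a current Python 3, might look like the sketch below. The class names `SketchPath`, `SketchPosixPath` and `SketchNTPath` are invented for illustration, and the method is called `splitpath` (the renaming floated earlier in the thread) so that str.split is not shadowed. As a simplification, the directory part of a name like "/file" comes out as an empty string, unlike os.path.dirname.

```python
import os


class SketchPath(str):
    """Hypothetical str subclass; subclasses supply the separator."""
    sep = None

    def splitpath(self):
        # Split at the last separator; the head is "" when there is none.
        head, _, tail = self.rpartition(self.sep)
        return type(self)(head), type(self)(tail)

    def basename(self):
        return self.splitpath()[1]

    def dirname(self):
        return self.splitpath()[0]

    def getsize(self):
        return os.stat(self).st_size


class SketchPosixPath(SketchPath):
    sep = "/"


class SketchNTPath(SketchPath):
    sep = "\\"
```

For example, `SketchPosixPath("/etc/passwd").dirname()` yields another `SketchPosixPath`, so calls can be chained in the style of the OLD/NEW comparison from the proposal.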
yours,
Gerrit.
--
168. If a man wish to put his son out of his house, and declare before
the judge: "I want to put my son out," then the judge shall examine into
his reasons. If the son be guilty of no great fault, for which he can be
rightfully put out, the father shall not put him out.
-- 1780 BC, Hammurabi, Code of Law
--
Asperger's Syndrome - a personal approach:
http://people.nl.linux.org/~gerrit/english/
-
Peter Otten at Jan 3, 2004 at 12:14 pm ⇧
Gerrit Holl wrote:
> I propose to add a "filename" type to Python.
> A different solution to this problem would be to introduce "filename"
> type to Python, a subclass of str. The "name" attribute of file would be
> of this type. This type would inherit a lot of os.path stuff: getsize
> becomes simpler, more readable, and more object oriented, as do other
> os.path functions. I think the alternatives look a lot prettier:
> OLD                                        NEW
> os.path.realpath(fn)                       fn.realpath()
> os.path.getmtime(fp.name)                  fp.name.getmtime()
> os.path.ismount(os.path.dirname(fp.name))  fp.name.dirname().ismount()
> It's more beautiful, simpler, flatter (#3), practical, obvious, easy.
You might have a look at
http://mail.python.org/pipermail/python-list/2002-June/108425.html
http://members.rogers.com/mcfletch/programming/filepath.py
has an implementation of your proposal by Mike C. Fletcher. I think both
filename class and os.path functions can peacefully coexist.
Peter
-
Gerrit Holl at Jan 3, 2004 at 4:50 pm ⇧
Peter Otten wrote:
> http://mail.python.org/pipermail/python-list/2002-June/108425.html
> http://members.rogers.com/mcfletch/programming/filepath.py
> has an implementation of your proposal by Mike C. Fletcher. I think both
> filename class and os.path functions can peacefully coexist.
Thanks for the links.
(I think they don't, by the way)
yours,
Gerrit.
--
19. If he hold the slaves in his house, and they are caught there, he
shall be put to death.
-- 1780 BC, Hammurabi, Code of Law
--
Asperger's Syndrome - a personal approach:
http://people.nl.linux.org/~gerrit/english/
-
Mike C. Fletcher at Jan 3, 2004 at 10:23 pm ⇧
Gerrit Holl wrote:
> Peter Otten wrote:
>> http://mail.python.org/pipermail/python-list/2002-June/108425.html
>> http://members.rogers.com/mcfletch/programming/filepath.py
>> has an implementation of your proposal by Mike C. Fletcher. I think both
>> filename class and os.path functions can peacefully coexist.
> Thanks for the links.
> (I think they don't, by the way)
You hawks, always seeing war where we see peace :) ;) .
Seriously, though, a path type would eventually have ~ the same relation
as the str type now does to the string module. Initial implementations
of a path type are going to use the os.path stuff, but to avoid code
duplication, the os.path module would eventually become a set of trivial
wrappers that dispatch on their first argument's method(s) (after
coercion to path type).
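The "trivial wrappers that dispatch on their first argument's method(s)" arrangement could look roughly like this. It is only a sketch of the end state being described, not code from the referenced filepath.py: the `path` class, the `_dispatch` helper and the module-level `exists`/`getsize` names are all hypothetical.

```python
import os


class path(str):
    """Hypothetical path type; the methods hold the real logic."""

    def exists(self):
        try:
            os.stat(self)
        except OSError:
            return False
        return True

    def getsize(self):
        return os.stat(self).st_size


def _dispatch(name):
    """Build a function that coerces its first argument and calls a method."""
    def wrapper(first, *args):
        return getattr(path(first), name)(*args)
    wrapper.__name__ = name
    return wrapper


# Function-style entry points, as the old module-level API would offer them.
exists = _dispatch("exists")
getsize = _dispatch("getsize")
```

Existing code that calls `exists(some_string)` keeps working, while new code can write `path(some_string).exists()`; both end up in the same method.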
Is that peaceful? I don't know. If there's a war, let's be honest,
os.path is going to take a good long while to defeat because it's there
and embedded directly into thousands upon thousands of scripts and
applications. We can fight a decent campaign, making a common module,
then getting it blessed into a standard module, encouraging newbies to
shun the dark old os.path way, encouraging maintainers to use the new
module throughout their code-base, et cetera, but os.path is going to
survive a good long while, and I'd imagine that being friendly toward it
would keep a few of our comrades off the floor.
Just as a note, however, we haven't had a *huge* outpouring of glee for
the current spike-tests/implementations. So it may be that we need to
get our own little army in shape before attacking the citadel :) .
Have fun,
Mike
_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/
-
Gerrit Holl at Jan 4, 2004 at 8:52 am ⇧
[Peter Otten]
> I think both filename class and os.path functions can peacefully coexist.
[Gerrit Holl (me)]
> Thanks for the links.
> (I think they don't, by the way)
[Mike C. Fletcher]
> Is that peaceful? I don't know. If there's a war, let's be honest,
> os.path is going to take a good long while to defeat because it's there
> and embedded directly into thousands upon thousands of scripts and
> applications. We can fight a decent campaign, making a common module,
> then getting it blessed into a standard module, encouraging newbies to
> shun the dark old os.path way, encouraging maintainers to use the new
> module throughout their code-base, etceteras, but os.path is going to
> survive a good long while, and I'd imagine that being friendly toward it
> would keep a few of our comrades off the floor.
Sure, I don't think os.path would die soon, it will surely take longer
than the string module to die. But I think there are a number of places
where Python could be more object-oriented than it is, and this is one
of them. The first step in making those modules more object-oriented is
providing an OO alternative; the second step is deprecating the old way,
and the third step is providing only the OO way. The third step will
surely not be made until Python 3.0.
The string module has made the first two steps. In my view, the time
module has made the first step, although I'm not sure whether that's
true. I would like to see a datetime module that makes the time module
totally redundant, because I never liked the time module: it doesn't fit
into my brain properly, because it's not object oriented. Now, I try to
use the datetime module whenever I can, but something like strptime
isn't there. PEP 321 solves this, so I'd like time to become eventually
deprecated after something DateUtil-like is included as well, but it
probably won't.
Hmm, the Zen of Python is not very clear about this:
    Now is better than never.
    Although never is often better than *right* now.
...so there must be a difference between 'now' and 'right now' :)
> Just as a note, however, we haven't had a *huge* outpouring of glee for
> the current spike-tests/implementations. So it may be that we need to
> get our own little army in shape before attacking the citadel :) .
Sure :)
yours,
Gerrit.
--
147. If she have not borne him children, then her mistress may sell her
for money.
-- 1780 BC, Hammurabi, Code of Law
--
Asperger's Syndrome - a personal approach:
http://people.nl.linux.org/~gerrit/english/
-
Peter Otten at Jan 4, 2004 at 9:54 am ⇧
[Gerrit Holl]
> [Peter Otten]
>> I think both filename class and os.path functions can peacefully
>> coexist.
> [Gerrit Holl (me)]
>> Thanks for the links.
>> (I think they don't, by the way)
> [Mike C. Fletcher]
>> Is that peaceful? I don't know. If there's a war, let's be honest,
>> os.path is going to take a good long while to defeat because it's there
>> and embedded directly into thousands upon thousands of scripts and
> Sure, I don't think os.path would die soon, it will surely take longer
> than the string module to die. But I think there is a number of places
> where Python could be more object-oriented than it is, and this is one
> of them. The first step in making those modules more object-oriented is
> providing a OO-alternative: the second step is deprecating the old way,
> and the third step is providing only the OO-way. The third step will
> surely not be made until Python 3.0.
I don't think OO is a goal in itself. In addition to the os.path functions'
ubiquity there are practical differences between a path and the general str
class.
While a string is the default that you read from files and GUI widgets, a
filename will never be. So expect to replace e. g.
os.path.exists(somestring)
with
os.filename(somestring).exists()
which is slightly less compelling than somefile.exists().
Are unicode filenames something we should care about?
Should filename really be a subclass of str? I think somepath[-1] could
return the name as well.
Should files and directories really be of the same class?
These to me all seem real questions and at that point I'm not sure whether a
filename class that looks like a light wrapper around os.path (even if you
expect os.path to be implemented in terms of filename later) is the best
possible answer.
Peter
-
Gerrit Holl at Jan 4, 2004 at 1:43 pm ⇧
Peter Otten wrote:
> While a string is the default that you read from files and GUI widgets, a
> filename will never be.
I'm not so sure about that. A GUI where a file is selected from the list
could very well return a Path object - it won't for a while, of course,
but that's a different issue. But I agree that it often isn't. Just as
an integer is not something you read from a file, etc.
> So expect to replace e. g.
> os.path.exists(somestring)
> with
> os.filename(somestring).exists()
> which is slightly less compelling than somefile.exists().
I would rather read:
path(somestring).exists()
which is better than os.filename(somestring).exists() and, IMO, better
than os.path.exists(somestring). I think path should be a builtin.
> Are unicode filenames something we should care about?
That's a difficult issue. I don't know how to solve that.
> Should filename really be a subclass of str? I think somepath[-1] could
> return the name as well.
It could. But I don't think it should. This would mean that the index of
a path returns the respective directories. Explicit is better than
implicit: somepath[-1] is not very explicit as being a basename.
> Should files and directories really be of the same class?
Directories could be a subclass, with some more features. But...
> These to me all seem real questions and at that point I'm not sure whether a
> filename class that looks like a light wrapper around os.path (even if you
> expect os.path to be implemented in terms of filename later) is the best
> possible answer.
...questions exist to be answered. I don't claim to know all answers,
but I think OO-ifying os.path is a good thing. How - that's another
issue, which is PEP-worthy.
From earlier discussions, I get the impression that most people are
sympathetic about OO-ifying os.path but that people don't agree on how to
do it. If we can agree on that, the only thing we need to do is
upgrading the BDFL's judgement from lukewarm to liking :)
I've written a Pre-PEP at: http://tinyurl.com/2578q
It is very unfinished but it is a rough draft. Comments welcome.
yours,
Gerrit.
--
132. If the "finger is pointed" at a man's wife about another man, but
she is not caught sleeping with the other man, she shall jump into the
river for her husband.
-- 1780 BC, Hammurabi, Code of Law
--
Asperger's Syndrome - a personal approach:
http://people.nl.linux.org/~gerrit/english/
-
Martin v. Loewis at Jan 4, 2004 at 3:19 pm ⇧
Gerrit Holl wrote:
>> Are unicode filenames something we should care about?
> That's a difficult issue. I don't know how to solve that.
It depends on the platform. There are:
1. platforms on which Unicode is the natural string type
for file names, with byte strings obtained by conversion
only. On these platforms, all filenames can be represented
by a Unicode string, but some file names cannot
be represented by a byte string.
Windows NT+ is the class of such systems.
2. platforms on which Unicode and byte string filenames
work equally well; they can be converted back and
forth without any loss of accuracy or expressiveness.
OS X is one such example; probably Plan 9 as well.
3. platforms on which byte strings are the natural string
type for filenames. They often have only a weak notion
of file name encoding, causing
a) not all Unicode strings being available as filenames
b) not all byte string filenames being convertible to
Unicode
c) the conversion may depend on user settings, so for
the same file, Unicode conversion may give different
results for different users.
POSIX systems fall in this category.
So if filenames were a datatype, I think they should be
able to use both Unicode strings and byte strings as their
own internal representation, and declare one of the two
as "accurate". Conversion of filenames to both Unicode
strings and byte strings should be supported, but may
fail at runtime (unless conversion into the "accurate"
type is attempted).
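The datatype described in this last paragraph could be sketched along these lines. It is an illustration only: the class name `FileName`, its method names and the explicit `encoding` parameter are assumptions made for the sketch, and a real design would take the encoding from the platform rather than from the caller.

```python
class FileName:
    """Hypothetical filename holding either str or bytes as its accurate form."""

    def __init__(self, raw, encoding="utf-8"):
        if not isinstance(raw, (str, bytes)):
            raise TypeError("filename must be str or bytes")
        self._raw = raw                 # stored unchanged: the "accurate" form
        self._encoding = encoding

    @property
    def accurate_type(self):
        return type(self._raw)

    def as_text(self):
        # Never fails when str is the accurate type.
        if isinstance(self._raw, str):
            return self._raw
        return self._raw.decode(self._encoding)   # may raise UnicodeDecodeError

    def as_bytes(self):
        # Never fails when bytes is the accurate type.
        if isinstance(self._raw, bytes):
            return self._raw
        return self._raw.encode(self._encoding)   # may raise UnicodeEncodeError
```

Converting to the accurate type always succeeds, while the other direction can fail at runtime, for example `FileName(b"\xff").as_text()` with the default UTF-8 encoding.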
Regards,
Martin
-
Irmen de Jong at Jan 3, 2004 at 12:23 pm ⇧
Gerrit Holl wrote:
> Any comments?
Are you aware of Jason Orendorff's path module?
(haven't tried it myself though)
See this thread: http://tinyurl.com/3gq8r (google link)
--Irmen
-
Just at Jan 3, 2004 at 12:27 pm ⇧
In article <mailman.45.1073125105.12720.python-list at python.org>,
Gerrit Holl wrote:
> I propose to add a "filename" type to Python. [ ... ]
> A different solution to this problem would be to introduce "filename"
> type to Python, a subclass of str. The "name" attribute of file would be of
> this type. This type would inherit a lot of os.path stuff: getsize becomes
> simpler, more readable, and more object oriented, as do other os.path
> functions. I think the alternatives look a lot prettier:
> OLD                                        NEW
> os.path.realpath(fn)                       fn.realpath()
> os.path.getmtime(fp.name)                  fp.name.getmtime()
> os.path.ismount(os.path.dirname(fp.name))  fp.name.dirname().ismount()
> It's more beautiful, simpler, flatter (#3), practical, obvious, easy.
This has been proposed a few times, and even implemented at least once:
http://www.jorendorff.com/articles/python/path/
I'm very much in favor of adding such an object, but I don't like Jason
Orendorff's design all that much. There has been a discussion about it
in the past:
http://groups.google.com/groups?q=g:thl1422628736d&dq=&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&selm=mailman.1057651032.22842.python-list%40python.org
Just
-
John Roth at Jan 3, 2004 at 12:46 pm ⇧
"Martin v. Loewis" <martin at v.loewis.de> wrote in message
news:bt3l9i$pog$07$3 at news.t-online.com...
> Sean Ross wrote:
>> My question is this: Is there a reason why file objects could not have a
>> size method or property?
> Yes. In Python, file objects belong to the larger category of "file-like
> objects", and not all file-like objects have the inherent notion of a
> size. E.g. what would you think sys.stdin.size should return (which
> actually is a proper file object - not just file-like)?
> Other examples include the things returned from os.popen or socket.socket.
> Regards,
> Martin
I think the issue here is that the abstract concept behind a "file-like
object" is that of something external that can be opened, read, written to
and closed. As you say, this does not include the notion of basic
implementation: a file on a file system is a different animal than a
network socket, which is different from a pipe, etc.
I think we need an object that encapsulates the notion of a file (or
directory) as a file system object. That object shouldn't support
"file-like" activities: it should have a method that returns a standard
file object to do that.
I like Gerrit Holl's filename suggestion as well, but it's not the same
as this suggestion.
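One possible reading of this suggestion is sketched below. The class name `FileSystemEntry` and its methods are invented for illustration; the point is only that the object answers questions about the entry on disk and hands out a standard file object for the actual reading and writing.

```python
import os


class FileSystemEntry:
    """Hypothetical object for a file or directory as a file system object."""

    def __init__(self, name):
        self.name = name

    def size(self):
        return os.stat(self.name).st_size

    def is_directory(self):
        return os.path.isdir(self.name)

    def open(self, mode="rb"):
        # No read()/write() here: return a standard file object instead.
        return open(self.name, mode)
```

Because size() lives on the file system entry and not on the file-like object, the sys.stdin and socket cases raised earlier in the thread never come up.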
John Roth
Discussion Overview
| group | python-list |
| categories | python |
| posted | Jan 2, '04 at 2:45a |
| active | Jan 4, '04 at 3:19p |
| posts | 18 |
| users | 9 |
| website | python.org |
