At 4:37 PM -0500 4/25/10, Harry Putnam wrote:
> I do have another question that was only in the background of my first
>
> Is there a canonical way to read a bunch of -f type files into a hash?
I take it you mean add the file names to a hash, not the file contents.
> I want the end name `$_' on one side and full name `File::Find::name'
> on the other...
The "end name" is called the "file name". What comes before the file
name is referred to as the "directory" or "directory path". The whole
string is referred to as the "path" or "full path".
> what happens is the keys start slaughtering each other if you get it
> the wrong way round... and even when it ends up right... I wonder if
> there may still be some chance of names canceling
Hash keys must be unique. If you are worried about key collision (two
keys the same), always test whether a key already exists before
inserting it into a hash.
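A minimal sketch of that guard (the hash and file names here are made up for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %seen;
for my $name ( 'foo.txt', 'bar.txt', 'foo.txt' ) {
    if ( exists $seen{$name} ) {
        warn "duplicate key: $name\n";   # collision detected; do not overwrite
        next;
    }
    $seen{$name} = 1;
}
# %seen now holds each name exactly once
```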
> Doing it like this:
>
>   $h1{$File::Find::name} = $_;
>
> So far, it has agreed with the count I see from `wc -l'. I'd like to
> know for sure though if that is a reliable way to do it?

That is the reliable way to generate a hash of all files in a
directory tree. Since full paths must be unique on a system (else how
could the operating system find the file?), a full path specification
must be unique as a hash key. The converse is not true: because
of links and aliases, two different full path strings can refer to the
same file.
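A runnable sketch of that approach (it builds a small throwaway temp tree so the example runs anywhere; in practice you would pass your own start directory to find):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Throwaway tree standing in for a real start directory
my $dir = tempdir( CLEANUP => 1 );
for my $f ( 'a.txt', 'b.txt' ) {
    open my $fh, '>', "$dir/$f" or die "open: $!";
    close $fh;
}

my %h1;
find( sub {
    return unless -f;               # plain files only (the -f test)
    $h1{$File::Find::name} = $_;    # key: full path, value: end name
}, $dir );

print scalar( keys %h1 ), " files found\n";
```

Because the full path is the key, no two files in the tree can collide.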
> And is there some kind of handy way to turn a hash into a scalar like
> can be done to arrays with File::Slurp
Arrays can be transformed to scalars by the join function.
File::Slurp can either return the contents of a file as a single
scalar or as an array, one line per array element. It doesn't really
turn an array into a scalar.
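A quick sketch of join applied to a hash's keys (the hash contents here are illustrative):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %h = ( '/a/foo.txt' => 'foo.txt', '/a/bar.txt' => 'bar.txt' );

# Sort for a deterministic result; one full path per line
my $as_string = join "\n", sort keys %h;
print "$as_string\n";
```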
> What I'm after is a way to grep the list of full names using the
> endnames of a similar but not identical list, in order to discover
> which names are in the longer list, but not the shorter list.

Hashes are the best data structure to use for this purpose.
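One way to sketch that: index the shorter list by end name, then keep the full paths whose end name is absent (the file lists here are made up):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

# Longer list: full paths.  Shorter list: end names already accounted for.
my @long  = ( '/a/foo.txt', '/a/bar.txt', '/b/baz.txt' );
my @short = ( 'foo.txt', 'baz.txt' );

my %have = map { $_ => 1 } @short;    # O(1) lookup per end name

my @missing = grep { !$have{ basename($_) } } @long;
print "@missing\n";                   # /a/bar.txt
```

One hash lookup per file replaces a substring scan over the whole list, so it stays fast even when the lists are long.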
> Writing it to file is one way. And it seems likely to be the better
> way really since the lists can be pretty long.
>
> I wondered if this can all be done in the script with hashes somehow.

I suggest you try implementing an algorithm using hashes. Your method
(looking for substrings in a string containing all file names) is
needlessly inefficient and prone to error.

Jim Gibson

Discussion Overview
group: beginners
posted: Apr 25, '10 at 7:25p
active: Apr 25, '10 at 11:46p