On 24/09/2013 00:17, David Christensen wrote:

I'm looking for a hash function and a related function or operator such
that:

H(string1 . string2) = f(H(string1), H(string2))
H(string1 . string2) = H(string1) op H(string2)

where:

H() is the hash function
string1 is a string
string2 is a string
. is the string concatenation operator
f() is a function
op is a binary operator

On 09/23/13 15:29, Rob Dixon wrote:

Could you explain the problem you're trying to solve?

Writing scripts that look for duplicate, similar, and/or
missing files.

I assume this is about paths and filenames. Have you considered an rsync
dry-run?

I also assume that you want to communicate as little as possible, so you
don't have supersets of all strings on all sides (otherwise it would become
a simple indexing problem).

I also assume that you are more interested in missing items, so
hash-value collisions are not a problem.

I also assume that the set of string1 is smaller than that of string2,
let's say 100 vs. 10000 different values.

For local deduplication, you would store paths as a directory name and a
parent-index:

#table=path
#columns=id,name,pid
1,"",0
2,"usr",1
3,"local",2

Then keep a list of filenames and, per filename, a record of each path in which it exists.

#table=file
#columns=id,name

#table=detail
#columns=file_id,path_id,size,md5
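
The three tables above can be sketched in memory roughly as follows. This is only an illustration in Python, not code from the thread: the helper names (full_path, locations, duplicates), the sample file name and the sample contents are all made up, and pid 0 is assumed to mean "no parent".

```python
import hashlib
from collections import defaultdict

# "path" table: id -> (name, pid). pid 0 is assumed to mean "no parent".
paths = {
    1: ("", 0),
    2: ("usr", 1),
    3: ("local", 2),
}

# "file" table: id -> name (made-up sample entry).
files = {1: "README"}

# "detail" table rows: (file_id, path_id, size, md5).
# The contents are invented so the digest can be computed rather than faked.
sample = b"sample contents"
sample_md5 = hashlib.md5(sample).hexdigest()
details = [
    (1, 2, len(sample), sample_md5),
    (1, 3, len(sample), sample_md5),
]

def full_path(path_id):
    """Follow the parent links up to the root and join the names with '/'."""
    parts = []
    while path_id != 0:
        name, parent_id = paths[path_id]
        parts.append(name)
        path_id = parent_id
    return "/".join(reversed(parts)) or "/"

def locations(file_id):
    """All directories in which the given file name occurs."""
    return [full_path(p) for f, p, _size, _md5 in details if f == file_id]

def duplicates():
    """Group detail rows by (size, md5); groups with more than one row are duplicates."""
    groups = defaultdict(list)
    for file_id, path_id, size, md5 in details:
        groups[(size, md5)].append(full_path(path_id) + "/" + files[file_id])
    return [names for names in groups.values() if len(names) > 1]
```

Each directory name is stored once, and a full path is only assembled when it is needed for display.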

To combine index values, use something like ( i1 << N ) | i2,
where N is the number of bits needed to hold the largest i2.
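
A minimal sketch of that packing, in Python and assuming non-negative integer indexes. The function names bits_needed, combine and split are hypothetical, chosen only for this illustration.

```python
def bits_needed(max_value):
    """Number of bits needed to represent every integer in 0..max_value."""
    return max(1, max_value.bit_length())

def combine(i1, i2, n):
    """Pack two non-negative indexes into one integer: (i1 << n) | i2."""
    if i1 < 0 or i2 < 0:
        raise ValueError("indexes must be non-negative")
    if i2 >> n:
        raise ValueError("i2 does not fit in n bits")
    return (i1 << n) | i2

def split(key, n):
    """Recover (i1, i2) from a key produced by combine()."""
    return key >> n, key & ((1 << n) - 1)

# With 10000 different values for i2 (indexes 0..9999), N is 14.
N = bits_needed(9999)
key = combine(99, 9999, N)
```

Because i2 always fits in the low N bits, split(key, N) gives back the original pair, so nothing is lost by combining.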

I would not involve string concatenation: keep things separate once
separated. Use arrays.

Use (parts of) the MD5 digests of the strings if you need to compare against remote locations.
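
Taking part of an MD5 digest could look like the sketch below, written in Python with the standard hashlib module. The name short_digest and the default of 8 hex digits are assumptions for illustration. A shorter prefix means less to transmit but a higher chance of collisions.

```python
import hashlib

def short_digest(text, hex_chars=8):
    """Return the first hex_chars hex digits of the MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()[:hex_chars]

# The same string always gives the same prefix, so two sides can compare
# prefixes instead of exchanging the full strings.
local_id = short_digest("usr/local")
```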

So it would be best to first explain *more* about what you are trying to solve.
A single computer or multiple computers, connected or not?

Suppose one computer sends a concise email about what it has, so that
the other computer can reply with an even more concise email about what it
has and what it needs. IOW: diff+patch.
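
One way to picture that exchange is as set differences over short digests. The sketch below is in Python and invents everything not stated above: the helper names (digest, summarize, reply), the 8-digit prefix and the idea of sending only digests. It also assumes that collisions between prefixes are acceptable.

```python
import hashlib

def digest(name, hex_chars=8):
    """Short MD5 prefix used as a compact identifier for a name."""
    return hashlib.md5(name.encode("utf-8")).hexdigest()[:hex_chars]

def summarize(names):
    """Map each short digest to its name; only the digests need to be sent."""
    return {digest(n): n for n in names}

def reply(received_digests, local_names):
    """Work out what the replying side needs and what it has in addition."""
    local = summarize(local_names)
    received = set(received_digests)
    needs = sorted(received - set(local))                # sender has, we lack
    extra = sorted(local[d] for d in set(local) - received)  # we have, sender lacks
    return needs, extra

# Computer A announces what it has; computer B answers with the differences.
message = set(summarize(["usr/a", "usr/b"]))
needs, extra = reply(message, ["usr/b", "usr/c"])
```

Here needs holds digests rather than names, because the replying side cannot know the names of items it has never seen.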

--
Greetings, Ruud

Discussion Overview
group: beginners; categories: perl; posted: Sep 23, '13 at 10:17p; active: Sep 27, '13 at 4:32p; posts: 11; users: 5; website: perl.org