I'm looking for a way to calculate a hashcode or fingerprint for a row /
tuple of arbitrary width. The goal is to be able to store, in an audit
table, a fingerprint which could be compared against a later fingerprint
to detect changes (this application does not call for cryptographic
level security, so a simple checksum would be adequate for now).
Something like:

SELECT id, row_hashcode(*) FROM my_table;

id | row_hashcode
1 a3843587d893589ef321d31
2 934983497f9083298cb0843


I've considered implementing it in a procedural language, but the
inability of passing in ROW datatypes seems to make it impossible for
dynamic tuples (I suppose for static tables it could be done by a
process of iteration over the system catalog column lists for a given
table and PK). Also, for static tables, it seems like for a 1 .. 32
column table, creating a multicolumn index would have this effect, but I
can't seem to find a function to get at the calculated value of the index.

It seems like implementing this in C would be feasible, but I don't want
to reimplement it if it exists. Merely concatenating the values would
fail for nulls, so it seems to me the best thing to do would be to take
the effectively-serialized "COPY" output and run MD5 (or your
less-collision-prone hash of choice) on that.

I've done a fairly thorough search of the docs, contrib, and mailing
list archives (although the frequency in other contexts of the terms
"hash," "fingerprint," and "hashcode" are lowering the S/N). Does
anybody have any pointers or ideas either on existing implementations,
or writing a new C function (such as generating COPY output from an
arbitrary tuple)?



Search Discussions

Discussion Posts

Follow ups

Related Discussions

Discussion Navigation
viewthread | post
posts ‹ prev | 1 of 2 | next ›
Discussion Overview
grouppgsql-general @
postedApr 11, '06 at 3:02a
activeApr 11, '06 at 3:22a

2 users in discussion

Randall Lucas: 1 post Tom Lane: 1 post



site design / logo © 2022 Grokbase