FAQ
Hi.

I wrote a small hashlib for C. Because I'm new to hashes I looked at pythons implementation and
reused *some* of the code... or more the mathematical "hash-function", not really the code.

In particular I looked at pythons hash and lookup functions, so I came up with this (see the code
underneath).

So, can this code be considered as derived and do I have to put my code under the GPL? I'd like to
publish it under something less restrictive, like a BSD style license. But if GPL is the only way,
then GPL it is. :)

-panzi


#define PERTURB_SHIFT 5

static ENTRY_T * lookup(CONTAINER_T * table, hash_const_str_t key,
int (*loop_cond)(ENTRY_T * ptr, hash_const_str_t key)) {
register ENTRY_T * entries = table->entries;
register hash_size_t size = table->size;
register hash_size_t pos = hash(key) % size;
register hash_size_t perturb = pos;

register ENTRY_T * ptr = entries + pos;

while(loop_cond(ptr,key)) {
pos = ((pos << 2) + pos + perturb + 1) % size;
perturb >>= PERTURB_SHIFT;

ptr = entries + pos;

#ifdef HASH_COUNT_COLLISIONS
++ table->collisions;
#endif
}

return ptr;
}

hash_t hash(register hash_const_str_t str) {
/* python 2.5's hash function: */
register hash_size_t len;
register hash_t h;

assert(str);

len = strlen(str);
h = *str << 7;

while (*str)
h = (1000003 * h) ^ *str ++;
h ^= len;

return h;
}

From http Sat Nov 25 22:03:25 2006
From: http (Paul Rubin)
Date: 25 Nov 2006 13:03:25 -0800
Subject: my small hashlib - using pythons hash-functions
References: <45689fae$0$11868$3b214f66@tunews.univie.ac.at>
Message-ID: <7x7ixjp7c2.fsf@ruckus.brouhaha.com>

Mathias Panzenboeck <e0427417 at student.tuwien.ac.at> writes:
So, can this code be considered as derived and do I have to put my
code under the GPL? I'd like to publish it under something less
restrictive, like a BSD style license. But if GPL is the only way,
then GPL it is. :)
Python is not GPL'd but has its own license. You can make the derived
work GPL if you want, but it is not required. You'll have to check
its terms to see whether you can specifically use the BSD license.
Simplest might be to just stay with the Python license.

Search Discussions

  • Mathias Panzenboeck at Nov 25, 2006 at 9:13 pm

    Paul Rubin wrote:
    Mathias Panzenboeck <e0427417 at student.tuwien.ac.at> writes:
    So, can this code be considered as derived and do I have to put my
    code under the GPL? I'd like to publish it under something less
    restrictive, like a BSD style license. But if GPL is the only way,
    then GPL it is. :)
    Python is not GPL'd but has its own license. You can make the derived
    work GPL if you want, but it is not required. You'll have to check
    its terms to see whether you can specifically use the BSD license.
    Simplest might be to just stay with the Python license.
    Yes, right. It's PSF.

    But the question is: *IS* this derived work? I mean, it's not copied code.
    It's the same hashing-logic, which I learned by watching pythons code.

    -panzi
  • Fredrik Lundh at Nov 25, 2006 at 9:25 pm

    Mathias Panzenboeck wrote:

    But the question is: *IS* this derived work? I mean, it's not copied code.
    It's the same hashing-logic, which I learned by watching pythons code.
    given that it's only a few lines of code, and there's hardly any other
    way to write those lines if you want to implement the same algorithm,
    I'd say it falls under "fair use". crediting the Python developers in
    the source code should be good enough.

    </F>
  • Robert Kern at Nov 26, 2006 at 12:18 am

    Fredrik Lundh wrote:
    Mathias Panzenboeck wrote:
    But the question is: *IS* this derived work? I mean, it's not copied code.
    It's the same hashing-logic, which I learned by watching pythons code.
    given that it's only a few lines of code, and there's hardly any other
    way to write those lines if you want to implement the same algorithm,
    I'd say it falls under "fair use". crediting the Python developers in
    the source code should be good enough.
    If you'll forgive my pedantry, that's not "fair use," but simply the use of
    material that cannot be copyrighted by itself (in many jursidictions, at least)
    because it is so short, obvious, non-creative, etc.

    A good overview of fair use is here:

    http://fairuse.stanford.edu/Copyright_and_Fair_Use_Overview/chapter9/index.html

    IANAL. TINLA.

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
  • Mathias Panzenboeck at Nov 25, 2006 at 10:17 pm

    Fredrik Lundh wrote:
    Mathias Panzenboeck wrote:
    But the question is: *IS* this derived work? I mean, it's not copied
    code.
    It's the same hashing-logic, which I learned by watching pythons code.
    given that it's only a few lines of code, and there's hardly any other
    way to write those lines if you want to implement the same algorithm,
    I'd say it falls under "fair use". crediting the Python developers in
    the source code should be good enough.

    </F>
    ic. I'll write a mail to the python developers then.

    The code in python looks like this (see code underneath), just if someone what's to know.
    I only used the logic of the parts I understand. ;)

    Objects/stringobject.c:

    static long
    string_hash(PyStringObject *a)
    {
    register Py_ssize_t len;
    register unsigned char *p;
    register long x;

    if (a->ob_shash != -1)
    return a->ob_shash;
    len = a->ob_size;
    p = (unsigned char *) a->ob_sval;
    x = *p << 7;
    while (--len >= 0)
    x = (1000003*x) ^ *p++;
    x ^= a->ob_size;
    if (x == -1)
    x = -2;
    a->ob_shash = x;
    return x;
    }

    Objects/dictobject.c:

    static dictentry *
    lookdict_string(dictobject *mp, PyObject *key, register long hash)
    {
    register size_t i;
    register size_t perturb;
    register dictentry *freeslot;
    register size_t mask = (size_t)mp->ma_mask;
    dictentry *ep0 = mp->ma_table;
    register dictentry *ep;

    /* Make sure this function doesn't have to handle non-string keys,
    including subclasses of str; e.g., one reason to subclass
    strings is to override __eq__, and for speed we don't cater to
    that here. */
    if (!PyString_CheckExact(key)) {
    #ifdef SHOW_CONVERSION_COUNTS
    ++converted;
    #endif
    mp->ma_lookup = lookdict;
    return lookdict(mp, key, hash);
    }
    i = hash & mask;
    ep = &ep0[i];
    if (ep->me_key == NULL || ep->me_key == key)
    return ep;
    if (ep->me_key == dummy)
    freeslot = ep;
    else {
    if (ep->me_hash == hash && _PyString_Eq(ep->me_key, key))
    return ep;
    freeslot = NULL;
    }

    /* In the loop, me_key == dummy is by far (factor of 100s) the
    least likely outcome, so test for that last. */
    for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
    i = (i << 2) + i + perturb + 1;
    ep = &ep0[i & mask];
    if (ep->me_key == NULL)
    return freeslot == NULL ? ep : freeslot;
    if (ep->me_key == key
    (ep->me_hash == hash
    && ep->me_key != dummy
    && _PyString_Eq(ep->me_key, key)))
    return ep;
    if (ep->me_key == dummy && freeslot == NULL)
    freeslot = ep;
    }
    }

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppython-list @
categoriespython
postedNov 25, '06 at 7:57p
activeNov 26, '06 at 12:18a
posts5
users3
websitepython.org

People

Translate

site design / logo © 2022 Grokbase