I have always been curious about why a construct like:

a = "the quick brown fox jumps over the lazy dog"

is called a string. I have always assumed that it was a reference to a
necklace type construct as in (and I can't resist) a "string of perls".

My curiosity has been rekindled due to recent reports (to laypeople like
myself) of the existence of a Khipu code in which the Incans used
knotted strings as a type of binary code to store information. From
"FreeRepublic.com":

--
http://209.157.64.200/focus/f-news/928058/posts

Researchers take a fresh look at Incan knotted strings and suggest that
they may have been a written language, one that used a binary code to
store information.

In the late 16th century, Spanish travelers in central
Peru ran into an old Indian man, probably a former official of the Incan
empire, which Francisco Pizarro had conquered in 1532. The Spaniards saw
the Indian try to hide something he was carrying, according to the
account of one traveler, Diego Avalos y Figueroa, so they searched him
and found several bunches of the cryptic knotted strings known as khipu.
Many khipu simply recorded columns of numbers for accounting or census
purposes, but the conquistadors believed that some contained historical
narratives, religious myths, even poems. In this case, the Indian
claimed that his khipu recorded everything the conquerors had done in
the area, "both the good and evil." The leader of the Spanish party,
Avalos y Figueroa reported, immediately "took and burned these accounts
and punished the Indian" for having them. But although the Spanish
considered khipu dangerous, idolatrous objects and destroyed as many as
they could, scholars have long dismissed the notion that khipu (or
quipu, as the term is often spelled) were written documents.
--

The rest of the article is worth a read. The analogies are obviously too
delicious to pass up. Is anyone interested in creating a mechanical
"knotted string" parser?

That being said, what is the programming-centric etymology of "string"?
Having been familiar with buffer overflows, I am familiar with the
entomology of said construct :)

--
Brian Kelley bkelley at wi.mit.edu
Whitehead Institute for Biomedical Research 617 258-6191


  • Richard Brodie at Aug 21, 2003 at 2:11 pm
    "Brian Kelley" <bkelley at wi.mit.edu> wrote in message
    news:3f44ce77$0$556$b45e6eb0 at senator-bedfellow.mit.edu...
    That being said, what is the programming-centric etymology of "string"?
    The usage of string as "a sequence of similar objects" appears to be very
    old. Dropping the 'character' part is probably much more recent (probably
    around the classic Unix/C era): old Pascal documentation tends to say
    "character string" explicitly.
  • Brian Kelley at Aug 21, 2003 at 2:28 pm

    Richard Brodie wrote:

    "Brian Kelley" <bkelley at wi.mit.edu> wrote in message
    news:3f44ce77$0$556$b45e6eb0 at senator-bedfellow.mit.edu...

    That being said, what is the programming-centric etymology of "string"?

    The usage of string as "a sequence of similar objects", appears to be very
    old. Dropping the 'character' part is probably much more recent (probably
    around the classic Unix/C era: old Pascal documentation tends to say
    "character string" explicitly.
    Isn't this relatively unique to characters, though? I haven't seen
    "integer string" used, apart from some old Borland documentation (old
    memory here) that had some code to convert from a String of Integers into
    an Array of Byte. That usage of string specifically meant the ASCII
    equivalent of an integer, such as:

    a = "1234567890"

    More often I have seen "Array of Integers" or "List of Integers" so the
    term "string" does appear to mean "human readable" or imply character
    based, i.e. EBCDIC/ASCII or the appropriate encoding.

    --
    Brian Kelley bkelley at wi.mit.edu
    Whitehead Institute for Biomedical Research 617 258-6191
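To make the distinction in the post above concrete, here is a minimal sketch of the two readings of "a string of integers": the ASCII codes behind the characters, and the digit values those characters spell. The variable names are hypothetical, and this is not a reconstruction of the Borland code mentioned, which is only recalled from memory in the post.

```python
# Hypothetical illustration; names are not taken from the thread.
digits = "1234567890"

# The character string viewed as its underlying ASCII byte codes.
ascii_codes = list(digits.encode("ascii"))

# The same text converted to an array of small integers, one per digit.
digit_values = [int(ch) for ch in digits]

print(ascii_codes)   # [49, 50, 51, 52, 53, 54, 55, 56, 57, 48]
print(digit_values)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
```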
  • Brian Kelley at Aug 21, 2003 at 3:03 pm
    This just goes to show that what is visible is usually the tip of the
    iceberg.

    It turns out that the term "linguistic string" has been in use for a
    long time in the linguistics community to describe language syntax.
    Zellig Harris apparently used the term in some of his work (published
    around 1936). In 1965 Naomi Sager used "linguistic string" theory to
    form the Linguistic String Project mainly used as an application toward
    medical documents using a controlled medical vocabulary.

    linguistic string
    n : a linear sequence of words as spoken or written [syn: string,
    string of words, word string]

    The earliest paper I can find directly related to Linguistic strings is:

    Linguistic String Analysis (1960)
    Naomi Sager, NYU

    Although I'm fairly sure the term was widely used in the linguistic
    community in the 1950s.

    So it seems that, indeed, the Incans were first :)

    --
    Brian Kelley bkelley at wi.mit.edu
    Whitehead Institute for Biomedical Research 617 258-6191
  • Roy Smith at Aug 21, 2003 at 5:32 pm

    Brian Kelley wrote:
    The usage of string as "a sequence of similar objects", appears
    to be very old.
    The old way to identify newborn babies in hospitals was to make up a
    name bracelet with letter beads on a string. This would then be
    attached to the baby's wrist or ankle.
  • David Opstad at Aug 21, 2003 at 2:25 pm
    In article <3f44ce77$0$556$b45e6eb0 at senator-bedfellow.mit.edu>,
    Brian Kelley wrote:
    That being said, what is the programming-centric etymology of "string"?
    It goes back at least to 1962, because that's when SNOBOL was invented,
    and the 'S' stands for "String" in that acronym.

    Dave
  • Duncan Booth at Aug 21, 2003 at 2:29 pm
    Brian Kelley <bkelley at wi.mit.edu> wrote in
    news:3f44ce77$0$556$b45e6eb0 at senator-bedfellow.mit.edu:
    I have always been curious about why a construct like:

    a = "the quick brown fox jumps over the lazy dog"

    is called a string. I have always assumed that it was a reference to a
    necklace type construct as in (and I can't resist) a "string of perls".
    I think you have that about right. It's a "string of characters" rather
    than "pearls".

    I could be wrong though, we get so used to our terminology that we don't
    ever need to think where it comes from. I was watching an episode of
    Mastermind (a UK quiz show) the other day, and one of the questions totally
    threw me, it was: "In computing, the word 'bit' is an abbreviation of what
    two other words?", and I was sitting there thinking 'is it really an
    abbreviation?' long after the contestant had passed on that answer and gone
    on to other questions. Yes, I worked it out eventually, and I think the
    contestant had as well by the time he got told the answers to the questions
    he had passed on.

    --
    Duncan Booth duncan at rcp.co.uk
    int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
    "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?
  • Travis Whitton at Aug 21, 2003 at 2:30 pm
    "In computing, the word 'bit' is an abbreviation of what two other words?"
    Wow, I had no idea it was an abbreviation either. Luckily, dict had the answer
    waiting for me:
  • P at Aug 21, 2003 at 2:49 pm

    Travis Whitton wrote:
    "In computing, the word 'bit' is an abbreviation of what two other words?"
    Wow, I had no idea it was an abbreviation either. Luckily, dict had the answer
    waiting for me:

    From Jargon File (4.3.0, 30 APR 2001) [jargon]:

    bit n. [from the mainstream meaning and `Binary digIT']

    It's nice to see the occasional thread on computing history rather than the
    endless stream of coding issues.
    I'm always surprised by the number of people
    in computing that don't know this.

    bit is a contraction of "Binary digIT"
    byte is a pun on the word bit (8 bits)
    nibble is a pun on the word byte (4 bits)

    Pádraig.
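As a small illustration of the bit/byte/nibble relationship listed above, the sketch below splits a byte into its two nibbles and joins them again. It assumes the 8-bit byte and 4-bit nibble given in the post; the function names are hypothetical.

```python
def split_nibbles(byte):
    """Return the (high, low) 4-bit halves of an 8-bit value."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value that fits in 8 bits")
    return (byte >> 4) & 0xF, byte & 0xF


def join_nibbles(high, low):
    """Rebuild an 8-bit value from two 4-bit halves."""
    return ((high & 0xF) << 4) | (low & 0xF)


high, low = split_nibbles(0xA7)
print(high, low)                      # 10 7
print(hex(join_nibbles(high, low)))   # 0xa7
```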
  • Peter Hansen at Aug 21, 2003 at 3:18 pm

    P at draigBrady.com wrote:
    Travis Whitton wrote:
    "In computing, the word 'bit' is an abbreviation of what two other words?"
    Wow, I had no idea it was an abbreviation either. Luckily, dict had the answer
    waiting for me:

    From Jargon File (4.3.0, 30 APR 2001) [jargon]:

    bit n. [from the mainstream meaning and `Binary digIT']

    It's nice to see the occasional thread on computing history rather than the
    endless stream of coding issues.
    I'm always surprised by the number of people
    in computing that don't know this.

    bit is a contraction of "Binary digIT"
    byte is a pun on the word bit (8 bits)
    nibble is a pun on the word byte (4 bits)
    And nibble is spelled "nybble", at least around here, continuing
    the pun started by byte...
  • Istvan Albert at Aug 21, 2003 at 4:52 pm

    P at draigBrady.com wrote:

    I'm always surprised by the number of people
    in computing that don't know this.
    In all fairness, knowing what bit stands for
    does not mean much.

    I would guess that for people that
    lived through the computing revolution the notion of a
    binary digit made the concept more accessible.

    For those born into the whole thing, the fact
    that the word bit is an acronym might feel
    like an oddity, since they are so accustomed
    to it that it needs no explanation.

    Istvan.
  • Brian Kelley at Aug 21, 2003 at 5:04 pm

    Istvan Albert wrote:
    P at draigBrady.com wrote:
    I'm always surprised by the number of people
    in computing that don't know this.
    Colloquially (in American English):

    A "bit" used to be half a quarter and therefore there are also 8 bits to
    a dollar. Coincidence or a play on words? Some of us here are making
    our livings using "pieces of eight", so to speak. Anyway, this brings
    up this rather mildly amusing chart for pricing memory (from
    http://www.thehumanitarian.org/faq.php)

    1 Megabyte               1024 Kilobytes
    1 Kilobyte               1024 Bytes
    1 Byte                   8 Bits
    1 Shave and a Haircut    2 Bits
    1 Bit                    $0.125
    $1                       8 Bits
    1 Byte                   $1
    1 Kilobyte               $1024
    1 Megabyte               $1048576

    --
    Brian Kelley bkelley at wi.mit.edu
    Whitehead Institute for Biomedical Research 617 258-6191
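The arithmetic behind the chart can be checked with a short sketch. It assumes, as the chart does, 8 bits to the byte and $0.125 to the bit; the constant and function names are hypothetical.

```python
from fractions import Fraction

DOLLARS_PER_BIT = Fraction(1, 8)   # "two bits" is a quarter, so one bit is $0.125
BITS_PER_BYTE = 8                  # assumption taken from the chart


def price_in_dollars(num_bytes):
    """Tongue-in-cheek price of num_bytes of memory at one 'bit' per bit."""
    return num_bytes * BITS_PER_BYTE * DOLLARS_PER_BIT


print(price_in_dollars(1))            # 1
print(price_in_dollars(1024))         # 1024
print(price_in_dollars(1024 * 1024))  # 1048576
print(float(2 * DOLLARS_PER_BIT))     # 0.25, a shave and a haircut
```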
  • Mel Wilson at Aug 21, 2003 at 10:57 pm
    In article <3f44fb74$0$571$b45e6eb0 at senator-bedfellow.mit.edu>,
    Brian Kelley wrote:
    A "bit" used to be half a quarter and therefore there are also 8 bits to
    a dollar. Coincidence or a play on words? Some of us here are making
    our livings using "pieces of eight", so to speak.
    Synchronicity, IMHO, since early bits were more likely to
    bunch up into half-dozens than eights.

    Regards. Mel.
  • Greg Krohn at Aug 21, 2003 at 5:06 pm
    <P at draigBrady.com> wrote in message news:3F44DBF2.1010403 at draigBrady.com...
    ...
    bit is a contraction of "Binary digIT"
    byte is a pun on the word bit (8 bits)
    nibble is a pun on the word byte (4 bits)
    I could swear there were more of these. Isn't there one for 2 bits and
    32 bits, etc?

    While using [http://labs.google.com/sets] to see if I could find any others,
    I ran across [http://www.intuitor.com/counting/HandCounter.html]. Look at
    the representation of 4 in binary. I hope that's just a joke.
  • Sean Ross at Aug 21, 2003 at 5:29 pm
    "Greg Krohn" <ask at me.com> wrote in message
    news:q_61b.1904$Ej6.268 at newsread4.news.pas.earthlink.net...
    <P at draigBrady.com> wrote in message
    news:3F44DBF2.1010403 at draigBrady.com...
    ...
    bit is a contraction of "Binary digIT"
    byte is a pun on the word bit (8 bits)
    nibble is a pun on the word byte (4 bits)
    I could swear there where more of these. Isn't there one for 2 bits and
    32bits, etc?

    This lists several, but there are others: for instance, I seem to recall
    seeing trio, quartet, quintet, etc. used somewhere...

    2 bits: crumb, quad, quarter, tayste, tydbit, morsel
    4 bits: nybble, nibble
    5 bits: nickle
    10 bits: deckle
    16 bits: playte, chawmp (on a 32-bit machine), word (on a 16-bit machine),
    half-word (on a 32-bit machine).
    18 bits: chawmp (on a 36-bit machine), half-word (on a 36-bit machine)
    32 bits: dynner, gawble (on a 32-bit machine), word (on a 32-bit machine),
    longword (on a 16-bit machine).
    36 bits: word (on a 36-bit machine)
    48 bits: gawble (under circumstances that remain obscure)
    64 bits: double word (on a 32-bit machine) quad (on a 16-bit machine)
    128 bits: quad (on a 32-bit machine)

    from: http://developer.syndetic.org/query_jargon.pl?term=nybble


    Sean
  • Mikael Olofsson at Aug 22, 2003 at 6:24 am

    On Thu, 21 Aug 2003 17:06:30 GMT "Greg Krohn" wrote:
    I could swear there where more of these. Isn't there one for 2 bits and
    32bits, etc?
    I call 2 bits a dibit.

    /Mikael Olofsson
    Universitetslektor (Associate professor)
    Linköpings universitet

    -----------------------------------------------------------------------
    E-Mail: mikael at isy.liu.se
    WWW: http://www.dtr.isy.liu.se/en/staff/mikael
    Phone: +46 - (0)13 - 28 1343
    Telefax: +46 - (0)13 - 28 1339
    -----------------------------------------------------------------------
    Linköpings kammarkör: www.kammarkoren.com
  • Paul Watson at Aug 21, 2003 at 9:31 pm
    <P at draigBrady.com> wrote in message news:3F44DBF2.1010403 at draigBrady.com...
    Travis Whitton wrote:
    "In computing, the word 'bit' is an abbreviation of what two other
    words?"
    Wow, I had no idea it was an abbreviation either. Luckily, dict had the
    answer
    waiting for me:

    From Jargon File (4.3.0, 30 APR 2001) [jargon]:

    bit n. [from the mainstream meaning and `Binary digIT']

    It's nice to see the occasional thread on computing history rather than
    the
    endless stream of coding issues.
    I'm always surprised by the number of people
    in computing that don't know this.

    bit is a contraction of "Binary digIT"
    byte is a pun on the word bit (8 bits)
    nibble is a pun on the word byte (4 bits)

    Pádraig.
    My CS professor would insist that a byte is a collection of bits, and not
    necessarily eight. There are machines which do not have 8-bit addressable
    bytes.
  • Andrew Dalke at Aug 21, 2003 at 10:51 pm
    Paul Watson
    My CS professor would insist that a byte is a collection of bits, and not
    necessarily eight. There are machines which do not have 8-bit addressable
    bytes.
    I thought that was called a word, as in "the CDC 6400 had a 60
    bit word size"

    Andrew
    dalke at dalkescientific.com
  • Thomas Bellman at Aug 22, 2003 at 11:19 pm

    "Andrew Dalke" wrote:

    Paul Watson
    My CS professor would insist that a byte is a collection of bits, and not
    necessarily eight. There are machines which do not have 8-bit addressable
    bytes.
    I thought that was called a word, as in "the CDC 6400 had a 60
    bit word size"
    A word is the natural unit for operation in a machine. The
    registers are word sized (if there are registers, that is), reads
    and writes to memory use words (at least from the machine code
    programmer's point of view), and so on. A byte is a, typically
    smaller, unit of bits that is semiconvenient to operate on, but
    usually requires a bit more work by the hardware to extract and
    deposit a single byte in a whole word.

    Today many CPUs have a word size of 32 bits, and a byte size of
    8 bits; there are instructions for operating on a quarter of a
    word. But on a 36 bit machine, 8 bit bytes would not be very
    convenient - you'd probably prefer bytes being 6 or 9 bits in
    size.

    The DEC PDP-10 was a 36 bit machine. Bytes, however, were
    variable sized, and could be anywhere between 1 and 36 bits.
    The instructions LDB (LoaD Byte) and DPB (DePosit Byte) took,
    in addition to the destination/source register and the word
    address to load/store from, also an offset (measured in bits)
    within the word, and a byte size (also measured in bits).


    --
    Thomas Bellman, Lysator Computer Club, Linköping University, Sweden
    "We don't understand the software, and ! bellman @ lysator.liu.se
    sometimes we don't understand the hardware, !
    but we can *see* the blinking lights!" ! Make Love -- Nicht Wahr!
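The variable-size byte access described in the post above can be sketched in a few lines. This is a simplified model, not the PDP-10 instruction semantics: it assumes the offset counts bits up from the low-order end of a 36-bit word, and the function names are hypothetical stand-ins for what LDB and DPB do.

```python
WORD_BITS = 36
WORD_MASK = (1 << WORD_BITS) - 1


def _check(offset, size):
    """Reject bytes that would not fit inside one 36-bit word."""
    if not (1 <= size <= WORD_BITS and 0 <= offset <= WORD_BITS - size):
        raise ValueError("byte does not fit in a 36-bit word")


def load_byte(word, offset, size):
    """Extract `size` bits of `word`, starting `offset` bits above the low end."""
    _check(offset, size)
    return (word >> offset) & ((1 << size) - 1)


def deposit_byte(word, offset, size, value):
    """Return `word` with the selected byte replaced by the low bits of `value`."""
    _check(offset, size)
    mask = ((1 << size) - 1) << offset
    return ((word & ~mask) | ((value << offset) & mask)) & WORD_MASK


word = 0o123456701234                       # twelve octal digits = 36 bits
print(oct(load_byte(word, 0, 6)))           # 0o34
print(oct(load_byte(word, 27, 9)))          # 0o123
print(oct(deposit_byte(word, 0, 6, 0o77)))  # 0o123456701277
```

With a 6-bit size a word holds six such bytes, and with a 9-bit size it holds four, which is why those sizes are the convenient ones on a 36-bit machine.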
  • Jarek Zgoda at Aug 22, 2003 at 10:31 pm

    Paul Watson <pwatson at knightsbridge.com> wrote:

    My CS professor would insist that a byte is a collection of bits, and not
    necessarily eight. There are machines which do not have 8-bit addressable
    bytes.
    I remember that "byte" in French is expressed as "octet", even if it has
    only 7 bits...

    --
    Jarek Zgoda
    Registered Linux User #-1
    http://www.zgoda.biz/ JID:zgoda at chrome.pl http://zgoda.jogger.pl/
  • Piet van Oostrum at Aug 25, 2003 at 6:49 pm
    Jarek Zgoda (JZ) wrote:
    JZ> Paul Watson <pwatson at knightsbridge.com> pisze:
    My CS professor would insist that a byte is a collection of bits, and not
    necessarily eight. There are machines which do not have 8-bit addressable
    bytes.
    JZ> I remember that "byte" in French is expressed as "octet", even if it has
    JZ> only 7 bits...

    Are you sure? Several international organisations use the word 'octet' in
    their official specifications, to make sure that an 8-bit byte is meant.
    --
    Piet van Oostrum <piet at cs.uu.nl>
    URL: http://www.cs.uu.nl/~piet [PGP]
    Private email: P.van.Oostrum at hccnet.nl
  • Jarek Zgoda at Aug 25, 2003 at 7:50 pm

    Piet van Oostrum <piet at cs.uu.nl> wrote:

    My CS professor would insist that a byte is a collection of bits, and not
    necessarily eight. There are machines which do not have 8-bit addressable
    bytes.
    JZ> I remember that "byte" in French is expressed as "octet", even if it has
    JZ> only 7 bits...

    Are you sure. Several international organisations use the word 'octet' in
    their official specifications, to make sure that an 8-bit byte is meant.
    I can't recall the machine I'm referring to (it was a product of Bull),
    but I am sure that the lecturer called 7-bit units "octets".

    --
    Jarek Zgoda
    Registered Linux User #-1
    http://www.zgoda.biz/ JID:jarek at jabberpl.org http://zgoda.jogger.pl/
  • JanC at Aug 26, 2003 at 2:15 am

    Jarek Zgoda <jzgoda at gazeta.usun.pl> wrote:

    JZ> I remember that "byte" in French is expressed as "octet", even if
    JZ> it has only 7 bits...

    Are you sure. Several international organisations use the word
    'octet' in their official specifications, to make sure that an 8-bit
    byte is meant.
    I cann't recall the machine I'm referring to (it was product of Bull),
    but I am sure that lecturer called 7 bit units as "octets".
    Most of the time, in French "octet(s)" is used like "byte(s)" in English.

    On French sites you'll see "Ko", "Mo", etc. to indicate download-sizes.
    Look here for an example:
    <http://zdnet.fr/telecharger/windows/categorie/0,39021356,10010014r,00.htm>
    (télécharger = to download)

    --
    JanC

    "Be strict when sending and tolerant when receiving."
    RFC 1958 - Architectural Principles of the Internet - section 3.9
  • John Baxter at Aug 26, 2003 at 11:52 pm
    In article <bidpaf$fm5$1 at atlantis.news.tpi.pl>,
    Jarek Zgoda wrote:
    Piet van Oostrum <piet at cs.uu.nl> pisze:
    My CS professor would insist that a byte is a collection of bits, and not
    necessarily eight. There are machines which do not have 8-bit addressable
    bytes.
    JZ> I remember that "byte" in French is expressed as "octet", even if it has
    JZ> only 7 bits...

    Are you sure. Several international organisations use the word 'octet' in
    their official specifications, to make sure that an 8-bit byte is meant.
    I cann't recall the machine I'm referring to (it was product of Bull),
    but I am sure that lecturer called 7 bit units as "octets".
    The "oct" prefix, which reasonable people might expect to mean "8" has
    been abused for several years. One evening at WTBS (the MIT version,
    before the sailor guy from Atlanta bought the callsign from my
    successors), I soldered up an 11-pin octal plug. Then I unwired it,
    installed the plug cover, and soldered it up again.

    Then, I unwired it, turned the plug cover around the proper way, and
    wired it up yet again. [Pretty good lesson: I haven't forgotten a plug
    cover or gotten one backwards since.]

    --John

    --
    Email to above address discarded by provider's server. Don't bother sending.
  • John Baxter at Aug 22, 2003 at 11:33 pm

    In article <3F44DBF2.1010403 at draigBrady.com>, P at draigBrady.com wrote:

    Travis Whitton wrote:
    "In computing, the word 'bit' is an abbreviation of what two other words?"
    Wow, I had no idea it was an abbreviation either. Luckily, dict had the
    answer
    waiting for me:

    From Jargon File (4.3.0, 30 APR 2001) [jargon]:

    bit n. [from the mainstream meaning and `Binary digIT']

    It's nice to see the occasional thread on computing history rather than the
    endless stream of coding issues.
    I'm always surprised by the number of people
    in computing that don't know this.

    bit is a contraction of "Binary digIT"
    byte is a pun on the word bit (8 bits)
    nibble is a pun on the word byte (4 bits)
    And for the NCR C-315 (early 1960s), the 12-bit word was called a
    "slab". This was a (near) abbreviation of "syllable" since someone felt
    that 12 bits was too small to call a "word". [Compared with IBM's
    36-bit words (704, et seq) that made sense.]

    Data storage and the accumulator were variable in length, up to a
    few--8, I think--slabs. The accumulator grew and shrank as needed to
    hold its contents, with a register named T@ ("tally of the
    accumulator") keeping track.

    A slab could be considered to hold 3 digits or two characters. (The
    machine was decimal, not binary, in the face it presented to the
    programmer. Including in memory addressing.)

    And that's my yarn for the day, at least in this thread.

    --John

    --
    Email to above address discarded by provider's server. Don't bother sending.
  • U. N. Owen at Aug 27, 2003 at 7:42 am
    I don't know if this helps very much, but...

    In French we *almost always* use "octet" for
    8-bit bytes. Sometimes for 7-bit.
    And for 4-bit, we use "quartet", which is mostly
    seen in the HP48 world.
    It concerns only software, since I don't know
    very well "hardware words".
