Hi,

A database I'm handling is becoming a bit large'ish (~ 30 tables), and
I'd like to break them down into their natural units. Schemas for each
of these natural units seems logical, but are they really meant for
this? I'm also worried about how this would affect programs like
Libreoffice (the sdbc driver in particular)?

Thanks,

--
Seb

Search Discussions

  • John R Pierce at Apr 29, 2011 at 1:15 am

    On 04/28/11 5:51 PM, Seb wrote:
    Hi,

    A database I'm handling is becoming a bit large'ish (~ 30 tables), and
    I'd like to break them down into their natural units. Schemas for each
    of these natural units seems logical, but are they really meant for
    this? I'm also worried about how this would affect programs like
    Libreoffice (the sdbc driver in particular)?
    30 tables is really not that many. we have schemas with 200 tables in
    databases with 6 or 10 different schemas.

    we use schemas to separate out data that is rarely used together,
    typically each schema is for different application classes.

    as far as liberoffice goes, I have no idea, but you should be able to
    refer to schemaname.tablename regardless.
  • Seb at Apr 29, 2011 at 1:42 am

    On Thu, 28 Apr 2011 18:15:05 -0700, John R Pierce wrote:
    On 04/28/11 5:51 PM, Seb wrote:
    Hi,
    A database I'm handling is becoming a bit large'ish (~ 30 tables),
    and I'd like to break them down into their natural units. Schemas
    for each of these natural units seems logical, but are they really
    meant for this? I'm also worried about how this would affect
    programs like Libreoffice (the sdbc driver in particular)?
    30 tables is really not that many. we have schemas with 200 tables in
    databases with 6 or 10 different schemas.
    we use schemas to separate out data that is rarely used together,
    typically each schema is for different application classes.
    Thanks, this is exactly what I have in mind. The last paragraph in
    section 5.7.7 of the documentation is somewhat ominous regarding the use
    of schemas if at some point one has to work with other DBMS that don't
    support schemas. I suppose this means trouble if at some point one
    needs to move the database to a DBMS not supporting schemas or that does
    so differently.

    as far as liberoffice goes, I have no idea, but you should be able to
    refer to schemaname.tablename regardless.
    I'll look into this.

    Thanks,

    --
    Seb
  • Darren Duncan at Apr 29, 2011 at 2:29 am

    Seb wrote:
    A database I'm handling is becoming a bit large'ish (~ 30 tables), and
    I'd like to break them down into their natural units. Schemas for each
    of these natural units seems logical, but are they really meant for
    this? I'm also worried about how this would affect programs like
    Libreoffice (the sdbc driver in particular)?
    I think that 30 tables is too few to be splitting into schemas based just on
    their number; you should have other reasons for splitting them. Also, 30 is
    quite small, or at least medium-small; some databases have hundreds, thousands
    or tens of thousands of tables.

    Think of a schema like a programming namespace. In a program, you may have
    different libraries or classes that each contain functions and such. You would
    logically group functions together at some times and separate them at other
    times. You can let similar concerns organize your schemas, where as schema is
    like a library as a function is to a table ... or some other database object
    like a stored function.

    -- Darren Duncan
  • Seb at Apr 29, 2011 at 3:03 am

    On Thu, 28 Apr 2011 19:29:11 -0700, Darren Duncan wrote:

    Seb wrote:
    A database I'm handling is becoming a bit large'ish (~ 30 tables),
    and I'd like to break them down into their natural units. Schemas
    for each of these natural units seems logical, but are they really
    meant for this? I'm also worried about how this would affect
    programs like Libreoffice (the sdbc driver in particular)?
    I think that 30 tables is too few to be splitting into schemas based
    just on their number; you should have other reasons for splitting
    them. Also, 30 is quite small, or at least medium-small; some
    databases have hundreds, thousands or tens of thousands of tables.
    Think of a schema like a programming namespace. In a program, you may
    have different libraries or classes that each contain functions and
    such. You would logically group functions together at some times and
    separate them at other times. You can let similar concerns organize
    your schemas, where as schema is like a library as a function is to a
    table ... or some other database object like a stored function.
    Thanks for these thoughts. Perhaps I can describe a cartoon of this
    database to explain what I'm trying to accomplish.

    The database stores information related to biological research. The
    bulk of the tables describe things like individual ID, morphometrics,
    and behavioural data on all the individuals in several studies.
    However, there are a few tables that do not relate to the research
    itself (the main use of the DB), but to logistics. For example, a group
    of tables store information on purchases and inventory of material
    required for the overall project. These tables would never (or almost)
    be queried together with the others.

    So typically we have two types of uses: research and
    preparation/logistics for the project. We wouldn't want to even see the
    logistcs tables for research work, whereas we would like to see only
    these ones for preparation/planning. As the project and number of
    tables grow, we could see a similar divergence within the group of
    tables related to research. Are these good reasons for separating these
    units into schemas?

    Cheers,

    --
    Seb
  • Darren Duncan at Apr 29, 2011 at 6:39 am

    Seb wrote:
    Thanks for these thoughts. Perhaps I can describe a cartoon of this
    database to explain what I'm trying to accomplish.

    The database stores information related to biological research. The
    bulk of the tables describe things like individual ID, morphometrics,
    and behavioural data on all the individuals in several studies.
    However, there are a few tables that do not relate to the research
    itself (the main use of the DB), but to logistics. For example, a group
    of tables store information on purchases and inventory of material
    required for the overall project. These tables would never (or almost)
    be queried together with the others.

    So typically we have two types of uses: research and
    preparation/logistics for the project. We wouldn't want to even see the
    logistcs tables for research work, whereas we would like to see only
    these ones for preparation/planning. As the project and number of
    tables grow, we could see a similar divergence within the group of
    tables related to research. Are these good reasons for separating these
    units into schemas?
    If you mean using just 2 schemas for those 2 distinct tasks, then yes, that may
    be a reasonable organization strategy. However, it is important to keep
    yourself flexible to a degree such as to change your mind on this decision or
    reorganize now and then. Similar to how you should expect you may refactor or
    reorganize your program now and then, though you could design now for how you
    reasonably expect to be for awhile. This all said, I could go either way. Your
    proposal is a more reasonable purpose for using 2 schemas, but at the same time,
    because you only have 30 tables for now, it may not hurt to keep them together
    in 1 schema anyway.

    Okay, bottom line, for simplicity, stick to a single schema for everything until
    you can point to a clear benefit for splitting. If you're on the fence and
    could go either way with no clear benefit of one way over the other, then just
    use 1 schema for everything by default as that is a simpler approach.

    As for the possibility of supporting another DBMS later without schema support,
    well ... Postgres is free and is generally superior to all other free DBMSs, and
    probably most of the DBMSs you would have to choose from also support schemas
    anyway ... or if you need to use the others you could fake it with common
    prefixes for your table names.

    -- Darren Duncan
  • Alban Hertroys at Apr 29, 2011 at 6:48 am

    On 29 Apr 2011, at 5:03, Seb wrote:

    So typically we have two types of uses: research and
    preparation/logistics for the project. We wouldn't want to even see the
    logistcs tables for research work, whereas we would like to see only
    these ones for preparation/planning. As the project and number of
    tables grow, we could see a similar divergence within the group of
    tables related to research. Are these good reasons for separating these
    units into schemas?

    I suppose the two different schema's are interesting to similarly different users of the database as well? If so, you could also create two different types of users with schema search_paths defaulting to their respective schema.
    That way logistics-type users don't get confused by research tables and researchers don't get confused by logistics tables. You could even restrict access to the schema the user isn't supposed to access.

    Alban Hertroys

    --
    Screwing up is an excellent way to attach something to the ceiling.


    !DSPAM:737,4dba5f4212121823650944!
  • Fork at Apr 29, 2011 at 4:40 pm

    Seb <spluque <at> gmail.com> writes:

    The database stores information related to biological research. The
    bulk of the tables describe things like individual ID, morphometrics,
    and behavioural data on all the individuals in several studies.
    However, there are a few tables that do not relate to the research
    itself (the main use of the DB), but to logistics. For example, a group
    of tables store information on purchases and inventory of material
    required for the overall project. These tables would never (or almost)
    be queried together with the others.

    So typically we have two types of uses: research and
    preparation/logistics for the project. We wouldn't want to even see the
    logistcs tables for research work, whereas we would like to see only
    these ones for preparation/planning. As the project and number of
    tables grow, we could see a similar divergence within the group of
    tables related to research. Are these good reasons for separating these
    units into schemas?
    One thing that splitting into schemas would give you is the ability to play
    games with "SEARCH_PATH". Users who handle logistics would have their
    search_path only set to "LOGISTICS,public" and would need to fully qualify a
    research table to even see it (like "select * from research.foo"); any table
    creation would also happen in the schema RESEARCH. Running an unadorned "\d"
    from the command line will only show tables in the search path, which might be
    useful too. Vice versa for RESEARCH.

    That being said, ..., less complexity is always better, so unless this is really
    compelling because people are complaining, I wouldn't bother. My rule is always
    "pull" technology within an organization, never "push"; on your own experimental
    box, though, it is well worth hacking for understanding.
    Cheers,

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-general @
categoriespostgresql
postedApr 29, '11 at 12:51a
activeApr 29, '11 at 4:40p
posts8
users5
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase