FAQ
Hi Evgeny,

As always, thanks for the patch. A few comments:

1. A refactoring suggestion: move the new static methods introduced in
ColumnDescriptor to non-static methods of RowDescriptorBuilder.
2. In 3.0 unit test classes are using *Test suffix instead of *Tst,
and also we no longer put @author tags in the code.

Both items above are minor, and I can change them myself. Here is one
potentially bigger issue with the patch though:

3. The patch merges 2 strategies for creation of columns descriptor
into one, that requires access to ResultSetMetadata. I am not sure
that ResultSet.getMetaData() is a trivial operation for all JDBC
drivers that we support. I suspect there may be overhead accessing
this data at least on some DBs. While we can verify this theory, I am
wondering if we should, as in most cases we control the query syntax
and don't need to consult ResultSetMetadata at all.

So maybe preserve the flow in RowDescriptorBuilder, and use your
strategy merging explicit and implicit column descriptors *only* when
an optional metadata object is initialized?

Andrus


On Oct 9, 2009, at 2:08 PM, Evgeny Ryabitskiy (JIRA) wrote:


[ https://issues.apache.org/jira/browse/CAY-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Evgeny Ryabitskiy updated CAY-1282:
-----------------------------------

Attachment: CAY-1282-3.patch

Done!
Use #result as optional directive for only few columns (not all)
----------------------------------------------------------------

Key: CAY-1282
URL: https://issues.apache.org/jira/browse/CAY-1282
Project: Cayenne
Issue Type: Improvement
Components: Cayenne Core Library
Affects Versions: 1.2.5, 2.0.5, 3.0
Reporter: Evgeny Ryabitskiy
Assignee: Andrus Adamchik
Fix For: 3.0 beta 1

Attachments: CAY-1282-3.patch, CAY-1282-3.patch,
CAY-1282-3.patch, CAY-1282-3.patch, CAY-1282-3.patch, CAY-1282.patch


Here is few queries to show the problem:
SELECT ARTIST_ID, ARTIST_NAME FROM ARTIST
- working properly
SELECT #result('ARTIST_ID' 'java.lang.Integer'),
#result('ARTIST_NAME' 'java.lang.String') FROM ARTIST
- also working properly
SELECT ARTIST_ID, #result('ARTIST_NAME' 'java.lang.String') FROM
ARTIST
- first column is returned as null!!! Not nice...
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Search Discussions

  • Evgeny Ryabitskiy at Oct 11, 2009 at 9:38 pm
    If we talking about re-factor... I really wish to rewrite there
    everything to use some Collection.. like ArrayList... or maybe even
    HashSet.
    In that case I can use just really fast hashed contains() method
    instead going through all array of columns.
    So.. collections is more safe, is much more flexible in code. And in
    some cases even faster. I think usage of simple arrays is deprecated
    since Java 1.5
    But it's huge patch.. to change every ColumnDescriptor[] on some
    Collection<ColumnDescriptor>. So.. what do you think?
    3. The patch merges 2 strategies for creation of columns descriptor into
    one, that requires access to ResultSetMetadata. I am not sure that
    ResultSet.getMetaData() is a trivial operation for all JDBC drivers that we
    support. I suspect there may be overhead accessing this data at least on
    some DBs. While we can verify this theory.
    Yeah.. we should verify it, but I think ResultSetMetadata is formed
    when result is returned.. anyway.. It's just a getter... do you think
    ResultSetMetadata is "lazy" initialized? No sure.... I'm going to look
    at JTDS (one that we are using) drivers tomorrow.

    And.. one more idea.. any time when we don't use #result detective...
    we use ResultSetMetadata to get columns names and labels...
    So... doesn't seems that it's a big problem for JDBC drivers...
    It can be really cool if we perform some load testing and compare both
    strategies performance.
    I am wondering if we should, as
    in most cases we control the query syntax and don't need to consult
    ResultSetMetadata at all.

    So maybe preserve the flow in RowDescriptorBuilder, and use your strategy
    merging explicit and implicit column descriptors *only* when an optional
    metadata object is initialized?
    The problem that if we don't set ResultSetMetadata like in current
    (trunk) version, without ResultSetMetadata we don't know all
    columns.. only columns that was explicitly set. That is why I need to
    have result set every time... and without it you can get old
    behavior... not presented column values in result... That is why I
    think Builder should get ResultSet every time.


    Evgeny Ryabitskiy.
  • Andrus Adamchik at Oct 12, 2009 at 7:30 am

    On Oct 12, 2009, at 12:38 AM, Evgeny Ryabitskiy wrote:

    If we talking about re-factor... I really wish to rewrite there
    everything to use some Collection.. like ArrayList... or maybe even
    HashSet.
    In that case I can use just really fast hashed contains() method
    instead going through all array of columns.
    So.. collections is more safe, is much more flexible in code. And in
    some cases even faster. I think usage of simple arrays is deprecated
    since Java 1.5
    But it's huge patch.. to change every ColumnDescriptor[] on some
    Collection<ColumnDescriptor>. So.. what do you think?
    It doesn't matter how this represented *inside* the builder class, as
    builder is used only once per query. On the other hand, coming out of
    the builder it must be optimized, as access to the column descriptors
    array is performed N*M times during each result set processing, where
    N is the width of the result set, and M is its length. I.e. it can be
    a very large number (up to tens or hundreds of millions calls). Every
    small optimization matters here.
    I think usage of simple arrays is deprecated since Java 1.5
    This is most certainly NOT true.

    3. The patch merges 2 strategies for creation of columns descriptor
    into
    one, that requires access to ResultSetMetadata. I am not sure that
    ResultSet.getMetaData() is a trivial operation for all JDBC drivers
    that we
    support. I suspect there may be overhead accessing this data at
    least on
    some DBs. While we can verify this theory.
    Yeah.. we should verify it, but I think ResultSetMetadata is formed
    when result is returned.. anyway.. It's just a getter... do you think
    ResultSetMetadata is "lazy" initialized?
    This is something I don't know. We need to check about a dozen of
    drivers from different vendors that we support to verify that. This is
    just a getter in the interface. Implementors could've made it anything.
    So maybe preserve the flow in RowDescriptorBuilder, and use your
    strategy
    merging explicit and implicit column descriptors *only* when an
    optional
    metadata object is initialized?
    The problem that if we don't set ResultSetMetadata like in current
    (trunk) version, without ResultSetMetadata we don't know all
    columns..
    Not true. We don't know all the columns for SQLTemplate queries. For
    all other types of queries we DO know all the columns, as Cayenne
    generates SQL from scratch for those queries. I think this one place
    is where we have the biggest mismatch in our views of the
    implementation.

    Andrus
  • Andrus Adamchik at Oct 12, 2009 at 7:45 am

    On Oct 12, 2009, at 10:30 AM, Andrus Adamchik wrote:

    Yeah.. we should verify it, but I think ResultSetMetadata is formed
    when result is returned.. anyway.. It's just a getter... do you think
    ResultSetMetadata is "lazy" initialized?
    This is something I don't know. We need to check about a dozen of
    drivers from different vendors that we support to verify that. This
    is just a getter in the interface. Implementors could've made it
    anything.
    Another thing to check here is actually reading column data from
    returned ResultSetMetadata, as lazy resolving of it can be postponed a
    step further.

    Andrus
  • Evgeny Ryabitskiy at Oct 12, 2009 at 9:39 am

    It doesn't matter how this represented *inside* the builder class, as
    builder is used only once per query. On the other hand, coming out of the
    builder it must be optimized, as access to the column descriptors array is
    performed N*M times during each result set processing, where N is the width
    of the result set, and M is its length. I.e. it can be a very large number
    (up to tens or hundreds of millions calls). Every small optimization matters
    here.
    So.. I was talking exactly about optimization... HashedSet array can
    be faster cause we perform several scans over whole array of
    ColumnDescriptors. And safety cause we don't get duplicates for
    Columns. And.. I didn't get you position about this idea
    This is something I don't know. We need to check about a dozen of drivers
    from different vendors that we support to verify that. This is just a getter
    in the interface. Implementors could've made it anything.

    I have looked through JTDS drivers (not a dozen but a least one).
    ResultSet has all information about columns (just private final
    ColInfo[] columns).
    When getMetaData performed - constructs new Object that has reference
    to array of columns from ResultSet .
    Looks like there is no problem with JTDS.

    The problem that if we don't set ResultSetMetadata like in current
    (trunk) version, without ResultSetMetadata  we don't know all
    columns..
    Not true. We don't know all the columns for SQLTemplate queries. For all
    other types of queries we DO know all the columns, as Cayenne generates SQL
    from scratch for those queries. I think this one place is where we have the
    biggest mismatch in our views of the implementation.
    ah... now I see. You are right that was a mismatch in our views. I
    will work on it in the evening.
    Another thing to check here is actually reading column data from returned ResultSetMetadata, as lazy
    resolving of it can be postponed a step further.
    Again in JTDS it's just a array of ColInfo (like our
    ColumnDescriptor), it's passed to RowSet through constructor from
    protocol implementation.


    Evgeny Ryabitskiy.
  • Andrus Adamchik at Oct 12, 2009 at 10:59 am
    Excellent. Looks like we are on the same page now.

    Re: HashSet (HashMap?) vs. []. The map is definitely performing better
    when you are looking up a value by its key. This may be the case when
    we are assembling column descriptors inside the builder. This
    operation is done 1 time (ok maybe it is done N times, where N is the
    number of columns).

    However processing the ResultSet is a different story. There's no
    lookup by key. For each ResultSet row we need to apply ALL column
    descriptors one by one to get the values out. So with a HashSet/
    HashMap we'd have this:

    Iterator<ColumnDescriptor> it = map.values().iterator();
    while(it.hasNext()) {
    ColumnDescriptor column = it.next();
    ....
    }

    With a ColumnDescriptor[] we have this:

    for(int i = 0; i < length; i++) {
    column = columns[i];
    }

    Both loops are done M times, where M is the number of rows in the
    ResultSet. In the worst case scenario, M is much larger than N. In the
    first case, we call three extra methods (iterator, hasNext, next) and
    create at least one extra object (Iterator). So the secon case is
    marginally faster. Now if you multiply that nanosecond or whatever
    difference by a few millions, it can become more significant.

    So essentially when talking about this refactoring we need to separate
    the first step of preparing the columns, and the second step of using
    them.

    Andrus

    On Oct 12, 2009, at 12:38 PM, Evgeny Ryabitskiy wrote:

    It doesn't matter how this represented *inside* the builder class, as
    builder is used only once per query. On the other hand, coming out
    of the
    builder it must be optimized, as access to the column descriptors
    array is
    performed N*M times during each result set processing, where N is
    the width
    of the result set, and M is its length. I.e. it can be a very large
    number
    (up to tens or hundreds of millions calls). Every small
    optimization matters
    here.
    So.. I was talking exactly about optimization... HashedSet array can
    be faster cause we perform several scans over whole array of
    ColumnDescriptors. And safety cause we don't get duplicates for
    Columns. And.. I didn't get you position about this idea
    This is something I don't know. We need to check about a dozen of
    drivers
    from different vendors that we support to verify that. This is just
    a getter
    in the interface. Implementors could've made it anything.

    I have looked through JTDS drivers (not a dozen but a least one).
    ResultSet has all information about columns (just private final
    ColInfo[] columns).
    When getMetaData performed - constructs new Object that has reference
    to array of columns from ResultSet .
    Looks like there is no problem with JTDS.

    The problem that if we don't set ResultSetMetadata like in current
    (trunk) version, without ResultSetMetadata we don't know all
    columns..
    Not true. We don't know all the columns for SQLTemplate queries.
    For all
    other types of queries we DO know all the columns, as Cayenne
    generates SQL
    from scratch for those queries. I think this one place is where we
    have the
    biggest mismatch in our views of the implementation.
    ah... now I see. You are right that was a mismatch in our views. I
    will work on it in the evening.
    Another thing to check here is actually reading column data from
    returned ResultSetMetadata, as lazy
    resolving of it can be postponed a step further.
    Again in JTDS it's just a array of ColInfo (like our
    ColumnDescriptor), it's passed to RowSet through constructor from
    protocol implementation.


    Evgeny Ryabitskiy.
  • Andrus Adamchik at Oct 15, 2009 at 6:39 pm
    Hi Evgeny,

    First of all, thanks for your work on this patch. This closes arguably
    the biggest usability hole in SQLTemplate.

    I Just committed the patch, with the only change being removal of the
    public case transform method from ColumnDescriptor and inlining it
    into the builder class. Also added this comment to the builder code:

    // TODO: andrus, 10/14/2009 - 'equalsIgnoreCase' check can result in
    // subtle bugs in DBs with case-sensitive column names (or when quotes
    are
    // used to force case sensitivity). Alternatively using 'equals' may
    miss
    // columns in case-insensitive situations.

    In a couple of days we'd see if this problem pops up during the tests
    with DBs other than HSQLDB. Hopefully not.

    2. Also with this and CAY-1293 patch by Olga, we are very close to
    freezing the code for the 3.0 Beta. Are you planning to work on a
    patch for CAY-1280 in the nearest future, or are you ok to postpone it
    till 3.1 release (In practical terms, 3.1 choice means that if you'd
    like to use this in your work, you'd have to start using trunk builds
    with your apps, or maintain the internal patched version of 3.0).

    Cheers,
    Andrus
  • Evgeny Ryabitskiy at Oct 16, 2009 at 9:44 am

    First of all, thanks for your work on this patch. This closes arguably the
    biggest usability hole in SQLTemplate.
    Glad that you liked it :)
    class. Also added this comment to the builder code:

    // TODO: andrus, 10/14/2009 - 'equalsIgnoreCase' check can result in
    // subtle bugs in DBs with case-sensitive column names (or when quotes are
    // used to force case sensitivity). Alternatively using 'equals' may miss
    // columns in case-insensitive situations.
    Yeah.. good idea. We should think about how we can get information
    about "case-sensibility" of DB.
    If you can give me some hint about it, I can just parametrize this
    comparing for 2 columns.
    2. Also with this and CAY-1293 patch by Olga, we are very close to freezing
    the code for the 3.0 Beta. Are you planning to work on a patch for CAY-1280
    in the nearest future, or are you ok to postpone it till 3.1 release (In
    practical terms, 3.1 choice means that if you'd like to use this in your
    work, you'd have to start using trunk builds with your apps, or maintain the
    internal patched version of 3.0).
    Yes, I'm going to work on it on weekends.When we are going to freeze 3.0 Beta?

    Evgeny Ryabitskiy.
  • Andrus Adamchik at Oct 16, 2009 at 12:57 pm

    On Oct 16, 2009, at 12:44 PM, Evgeny Ryabitskiy wrote:

    Yeah.. good idea. We should think about how we can get information
    about "case-sensibility" of DB.
    If you can give me some hint about it, I can just parametrize this
    comparing for 2 columns.
    Auto-guessing column capitalization has been challenging for us so
    far. Till now we sort of pushed that on the users (e.g. see a second
    patch for CAY-1293). Something to check is whether
    ResultSetMetaData.html.isCaseSensitive(..) works reliably across the
    board. Factors that can affect that (and that need to be tested) are:

    * the type of the DB
    * the way tables are created in that DB and other DB-specific
    configuration parameters that may affect case sensitivity
    * whether Cayenne is configured to quote column names or not (this can
    be done in the Modeler since 3.0)

    2. Also with this and CAY-1293 patch by Olga, we are very close to
    freezing
    the code for the 3.0 Beta. Are you planning to work on a patch for
    CAY-1280
    in the nearest future, or are you ok to postpone it till 3.1
    release (In
    practical terms, 3.1 choice means that if you'd like to use this in
    your
    work, you'd have to start using trunk builds with your apps, or
    maintain the
    internal patched version of 3.0).
    Yes, I'm going to work on it on weekends.When we are going to freeze
    3.0 Beta?
    Cool. A decision when to create a beta release tag will be a result of
    an informal consensus among Cayenne developers. IMO, the only thing
    left to finish is CAY-1280, so I am asking whether this is going to
    happen soon.

    Andrus
  • Evgeny Ryabitskiy at Oct 16, 2009 at 3:17 pm
    If I'm the only person who is blocking release then let's delay my
    CAY-1280 till 3.1
    We found some solution to handle our problems without this issue, so
    we can wait. But it's still very useful thing :)

    Evgeny

    2009/10/16 Andrus Adamchik <andrus@objectstyle.org>:
    On Oct 16, 2009, at 12:44 PM, Evgeny Ryabitskiy wrote:

    Yeah.. good idea. We should think about how we can get information
    about "case-sensibility" of DB.
    If you can give me some hint about it, I can just parametrize this
    comparing for 2 columns.
    Auto-guessing column capitalization has been challenging for us so far. Till
    now we sort of pushed that on the users (e.g. see a second patch for
    CAY-1293). Something to check is whether
    ResultSetMetaData.html.isCaseSensitive(..) works reliably across the board.
    Factors that can affect that (and that need to be tested) are:

    * the type of the DB
    * the way tables are created in that DB and other DB-specific configuration
    parameters that may affect case sensitivity
    * whether Cayenne is configured to quote column names or not (this can be
    done in the Modeler since 3.0)

    2. Also with this and CAY-1293 patch by Olga, we are very close to
    freezing
    the code for the 3.0 Beta. Are you planning to work on a patch for
    CAY-1280
    in the nearest future, or are you ok to postpone it till 3.1 release (In
    practical terms, 3.1 choice means that if you'd like to use this in your
    work, you'd have to start using trunk builds with your apps, or maintain
    the
    internal patched version of 3.0).
    Yes, I'm going to work on it on weekends.When we are going to freeze 3.0
    Beta?
    Cool. A decision when to create a beta release tag will be a result of an
    informal consensus among Cayenne developers. IMO, the only thing left to
    finish is CAY-1280, so I am asking whether this is going to happen soon.

    Andrus
  • Andrey Razumovsky at Oct 16, 2009 at 5:04 pm
    Actually, as already mentioned, we need to do something around CAY-942
    (query generics). This is very critical and desired API change

    2009/10/16 Evgeny Ryabitskiy <evgeny.ryabitskiy@gmail.com>
    If I'm the only person who is blocking release then let's delay my
    CAY-1280 till 3.1
    We found some solution to handle our problems without this issue, so
    we can wait. But it's still very useful thing :)

    Evgeny

    2009/10/16 Andrus Adamchik <andrus@objectstyle.org>:
    On Oct 16, 2009, at 12:44 PM, Evgeny Ryabitskiy wrote:

    Yeah.. good idea. We should think about how we can get information
    about "case-sensibility" of DB.
    If you can give me some hint about it, I can just parametrize this
    comparing for 2 columns.
    Auto-guessing column capitalization has been challenging for us so far. Till
    now we sort of pushed that on the users (e.g. see a second patch for
    CAY-1293). Something to check is whether
    ResultSetMetaData.html.isCaseSensitive(..) works reliably across the board.
    Factors that can affect that (and that need to be tested) are:

    * the type of the DB
    * the way tables are created in that DB and other DB-specific
    configuration
    parameters that may affect case sensitivity
    * whether Cayenne is configured to quote column names or not (this can be
    done in the Modeler since 3.0)

    2. Also with this and CAY-1293 patch by Olga, we are very close to
    freezing
    the code for the 3.0 Beta. Are you planning to work on a patch for
    CAY-1280
    in the nearest future, or are you ok to postpone it till 3.1 release
    (In
    practical terms, 3.1 choice means that if you'd like to use this in
    your
    work, you'd have to start using trunk builds with your apps, or
    maintain
    the
    internal patched version of 3.0).
    Yes, I'm going to work on it on weekends.When we are going to freeze 3.0
    Beta?
    Cool. A decision when to create a beta release tag will be a result of an
    informal consensus among Cayenne developers. IMO, the only thing left to
    finish is CAY-1280, so I am asking whether this is going to happen soon.

    Andrus


    --
    Andrey
  • Andrus Adamchik at Oct 16, 2009 at 5:55 pm

    On Oct 16, 2009, at 8:04 PM, Andrey Razumovsky wrote:

    Actually, as already mentioned, we need to do something around CAY-942
    (query generics). This is very critical and desired API change
    Not sure if we have another more precise Jira (since generics in
    relationships mentioned in this Jira have already been added).

    Anyways, if we are talking about query generics, I did some research
    and prototyping of that in the past. My conclusion is that to do it
    right it has to be a rather dramatic change in Cayenne, affecting not
    simply the code, but a number design concepts. Essentially we have too
    many variables to squeeze into a rather rigid Java generics engine. To
    start, here is possible query result types:

    ? extends Persistent
    ? extends Object (unfinished POJO implementation)
    CayenneDataObject (as in "generic persistence" [1])
    DataRow
    scalar
    Object[] (a mix of scalars and any of the above)

    I don't see how we can easily parameterize that in a meaningful way.

    So my vote is to postpone this till 3.1 and make it a 3.1 priority to
    create an appropriate generics based query design, and maybe reduce
    the number of options. I don't think significantly delaying and
    radically changing 3.0 (that has been de-facto stable for some time)
    is a good idea.

    Andrus

    [1] http://cayenne.apache.org/doc/generic-persistent-class.html
  • Robert Zeigler at Oct 17, 2009 at 3:28 pm
    +1.

    Although having truly generified queries would be useful in many
    situations, I think it's too late in the dev. cycle for 3.0 to add
    this now.


    Robert
    On Oct 16, 2009, at 10/1612:55 PM , Andrus Adamchik wrote:

    On Oct 16, 2009, at 8:04 PM, Andrey Razumovsky wrote:

    Actually, as already mentioned, we need to do something around
    CAY-942
    (query generics). This is very critical and desired API change
    Not sure if we have another more precise Jira (since generics in
    relationships mentioned in this Jira have already been added).

    Anyways, if we are talking about query generics, I did some research
    and prototyping of that in the past. My conclusion is that to do it
    right it has to be a rather dramatic change in Cayenne, affecting
    not simply the code, but a number design concepts. Essentially we
    have too many variables to squeeze into a rather rigid Java generics
    engine. To start, here is possible query result types:

    ? extends Persistent
    ? extends Object (unfinished POJO implementation)
    CayenneDataObject (as in "generic persistence" [1])
    DataRow
    scalar
    Object[] (a mix of scalars and any of the above)

    I don't see how we can easily parameterize that in a meaningful way.

    So my vote is to postpone this till 3.1 and make it a 3.1 priority
    to create an appropriate generics based query design, and maybe
    reduce the number of options. I don't think significantly delaying
    and radically changing 3.0 (that has been de-facto stable for some
    time) is a good idea.

    Andrus

    [1] http://cayenne.apache.org/doc/generic-persistent-class.html
  • Michael Gentry at Oct 19, 2009 at 1:09 pm
    A cleaner API would be nice, but an official Cayenne 3.0 would be even
    nicer (especially since some are hesitant to use 3.0 because it isn't
    final yet).

    mrg
    On Fri, Oct 16, 2009 at 1:55 PM, Andrus Adamchik wrote:
    So my vote is to postpone this till 3.1 and make it a 3.1 priority to create
    an appropriate generics based query design, and maybe reduce the number of
    options. I don't think significantly delaying and radically changing 3.0
    (that has been de-facto stable for some time) is a good idea.

    Andrus
  • Andrus Adamchik at Oct 19, 2009 at 5:47 am

    On Oct 16, 2009, at 6:17 PM, Evgeny Ryabitskiy wrote:

    If I'm the only person who is blocking release then let's delay my
    CAY-1280 till 3.1
    We found some solution to handle our problems without this issue, so
    we can wait.
    Great, we'll go ahead with the release then.
    But it's still very useful thing :)
    Agreed.

    Andrus

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupdev @
categoriescayenne
postedOct 11, '09 at 2:22p
activeOct 19, '09 at 1:09p
posts15
users5
websitecayenne.apache.org

People

Translate

site design / logo © 2021 Grokbase