| 1) Oliver Jowett Some of the threads on this are getting a bit bogged down, I thought I'd summarize the viable... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Some of the threads on this are getting a bit bogged down, I thought I'd summarize the viable options I've seen so far (well, from my point of view anyway)
setObject() currently allows the user to bypass parameter escaping via setObject(n, string, Types.NUMERIC) (and variants on that). This needs to be plugged as it's a potential security hole.
However the same functionality lets you do the (nonstandard) trick of providing an IN clause to a PreparedStatement like "SELECT * FROM table WHERE pk IN ?". It'd be good to still allow this functionality somehow after setObject is fixed. This is going to be a postgresql-specific extension however we do it.
Here are the permutations I can remember:
Option 1: add a method to PGStatement that explicitly sets an IN clause, taking either a java.sql.Array, java.util.Collection + component type, array + component type, or a custom postgresql object
+ there's no confusion as to what it means + using a custom object allows access via setObject(..., Types.OTHER) consistently, as well as via the extension method. - java.sql.Array and java.util.Collection have problems as PGStatement is compiled for all JDKs and JDBC versions and those types may not be present (we could do a PGJDBC2Statement or something, but that's getting messy) - have to downcast to a PGStatement to use it
Option 2: make setArray() expand to an IN clause when the parameter follows " IN".
+ no new methods or types needed - setArray() behaves differently depending on query context - user has to wrap the underlying array in a java.sql.Array
Option 3: make setObject(n, Collection [, type]) expand to an IN clause.
+ no new methods or types needed - must assume that the contents of the collection use the default type mapping if a type is not provided - if a type is provided and we apply it to the *components* of the collection, this breaks the general getObject() interface of "bind this object interpreting it as this particular type". - not obvious what to do with objects that are both Collections and some other SQL-relevant type; solutions make setObject's behaviour complex and/or query-context-dependent
Option 4: as 3, but use java arrays (native arrays, not java.sql.Array) instead of java.util.Collection
+ as 3, but the ambiguity of "object is both Collection and SQL type X" goes away.
Option 5: don't provide an extension at all i.e. do away with setting IN clauses in this way.
+ no issues with server-side prepare - obviously, you can't set IN clauses in one go any more.
1-4 all need to disable server-side prepare when used.
Did I miss anything? My personal order of preference is 1-2-4-5-3. I have a partial implementation of 2 written but it's easy to adapt that to whatever external interface.
setArray() needs fixing regardless of what happens here. I hope to have a patch for that ready later today.
-O
|
|
|
| 2) Felipe Schnack I also prefer number one. Maybe we should do a poll? :-) > (send "unregister YourEmailAddressHere"... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
I also prefer number one. Maybe we should do a poll? :-)
On Tue, 22 Jul 2003 16:11:19 +1200 Oliver Jowett <oliver@opencloud.com> wrote:
> Some of the threads on this are getting a bit bogged down, I thought I'd > summarize the viable options I've seen so far (well, from my point of view > anyway) > > setObject() currently allows the user to bypass parameter escaping via > setObject(n, string, Types.NUMERIC) (and variants on that). This needs to be > plugged as it's a potential security hole. > > However the same functionality lets you do the (nonstandard) trick of > providing an IN clause to a PreparedStatement like "SELECT * FROM table > WHERE pk IN ?". It'd be good to still allow this functionality somehow after > setObject is fixed. This is going to be a postgresql-specific extension > however we do it. > > Here are the permutations I can remember: > > Option 1: add a method to PGStatement that explicitly sets an IN clause, > taking either a java.sql.Array, java.util.Collection + component type, > array + component type, or a custom postgresql object > > + there's no confusion as to what it means > + using a custom object allows access via setObject(..., Types.OTHER) > consistently, as well as via the extension method. > - java.sql.Array and java.util.Collection have problems as PGStatement is > compiled for all JDKs and JDBC versions and those types may not be present > (we could do a PGJDBC2Statement or something, but that's getting messy) > - have to downcast to a PGStatement to use it > > Option 2: make setArray() expand to an IN clause when the parameter follows " IN". > > + no new methods or types needed > - setArray() behaves differently depending on query context > - user has to wrap the underlying array in a java.sql.Array > > Option 3: make setObject(n, Collection [, type]) expand to an IN clause. > > + no new methods or types needed > - must assume that the contents of the collection use the default type mapping > if a type is not provided > - if a type is provided and we apply it to the *components* of the > collection, this breaks the general getObject() interface of "bind this > object interpreting it as this particular type". > - not obvious what to do with objects that are both Collections and some > other SQL-relevant type; solutions make setObject's behaviour complex > and/or query-context-dependent > > Option 4: as 3, but use java arrays (native arrays, not java.sql.Array) instead of > java.util.Collection > > + as 3, but the ambiguity of "object is both Collection and SQL type X" > goes away. > > Option 5: don't provide an extension at all i.e. do away with setting IN clauses > in this way. > > + no issues with server-side prepare > - obviously, you can't set IN clauses in one go any more. > > 1-4 all need to disable server-side prepare when used. > > Did I miss anything? My personal order of preference is 1-2-4-5-3. I have a > partial implementation of 2 written but it's easy to adapt that to whatever > external interface. > > setArray() needs fixing regardless of what happens here. I hope to have a > patch for that ready later today. > > -O > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to [email protected: majo...@postgresql.org])
--
/~\ The ASCII Felipe Schnack ([email protected: fe...@ritterdosreis.br]) \ / Ribbon Campaign Analista de Sistemas X Against HTML Cel.: 51-91287530 / \ Email! Linux Counter #281893
Centro Universitário Ritter dos Reis http://www.ritterdosreis.br [email protected: r...@ritterdosreis.br] Fone: 51-32303341
|
|
|
| 3) Fernando Nasser Thanks for summarizing it Oliver. I've asked Tom Lane about the backend behavior and he informed me... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Thanks for summarizing it Oliver. I've asked Tom Lane about the backend behavior and he informed me that: 1) 7.4 backends do support parameters in the IN predicate, as ($1, $2, $3) (i.e., our (?, ?, ?) syntax). 2) 7.4 backends have a PostgreSQL specific extension that allows you to fill the IN predicate with a list: ($1) (i.e., our (?) ). One has to pass a PostgreSQL array, like integer[] to fill the list. Note that the parenthesis is already in place, it is not generated by the ? expansion. The feature 2 in 7.4 backends is of limited use as the planner does not know about the list, so the generated plan will not be as good as if you pass the list with fixed values since the beginning. But an improvement for this can be attempted for 7.5. Regards, Fernando
Oliver Jowett wrote: > Some of the threads on this are getting a bit bogged down, I thought I'd > summarize the viable options I've seen so far (well, from my point of view > anyway) > > setObject() currently allows the user to bypass parameter escaping via > setObject(n, string, Types.NUMERIC) (and variants on that). This needs to be > plugged as it's a potential security hole. > > However the same functionality lets you do the (nonstandard) trick of > providing an IN clause to a PreparedStatement like "SELECT * FROM table > WHERE pk IN ?". It'd be good to still allow this functionality somehow after > setObject is fixed. This is going to be a postgresql-specific extension > however we do it. > > Here are the permutations I can remember: > > Option 1: add a method to PGStatement that explicitly sets an IN clause, > taking either a java.sql.Array, java.util.Collection + component type, > array + component type, or a custom postgresql object > > + there's no confusion as to what it means > + using a custom object allows access via setObject(..., Types.OTHER) > consistently, as well as via the extension method. > - java.sql.Array and java.util.Collection have problems as PGStatement is > compiled for all JDKs and JDBC versions and those types may not be present > (we could do a PGJDBC2Statement or something, but that's getting messy) > - have to downcast to a PGStatement to use it > > Option 2: make setArray() expand to an IN clause when the parameter follows " IN". > > + no new methods or types needed > - setArray() behaves differently depending on query context > - user has to wrap the underlying array in a java.sql.Array > > Option 3: make setObject(n, Collection [, type]) expand to an IN clause. > > + no new methods or types needed > - must assume that the contents of the collection use the default type mapping > if a type is not provided > - if a type is provided and we apply it to the *components* of the > collection, this breaks the general getObject() interface of "bind this > object interpreting it as this particular type". > - not obvious what to do with objects that are both Collections and some > other SQL-relevant type; solutions make setObject's behaviour complex > and/or query-context-dependent > > Option 4: as 3, but use java arrays (native arrays, not java.sql.Array) instead of > java.util.Collection > > + as 3, but the ambiguity of "object is both Collection and SQL type X" > goes away. > > Option 5: don't provide an extension at all i.e. do away with setting IN clauses > in this way. > > + no issues with server-side prepare > - obviously, you can't set IN clauses in one go any more. > > 1-4 all need to disable server-side prepare when used. > > Did I miss anything? My personal order of preference is 1-2-4-5-3. I have a > partial implementation of 2 written but it's easy to adapt that to whatever > external interface. > > setArray() needs fixing regardless of what happens here. I hope to have a > patch for that ready later today. > > -O > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to [email protected: majo...@postgresql.org]) >
-- Fernando Nasser Red Hat Canada Ltd. E-Mail: [email protected: fn...@redhat.com] 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9
|
|
|
| 4) Dmitry Tkach it doesn't (at least, not in the current implementation) - Types.OTHER ends up calling setString(),... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
> > >Here are the permutations I can remember: > >Option 1: add a method to PGStatement that explicitly sets an IN clause, > taking either a java.sql.Array, java.util.Collection + component type, > array + component type, or a custom postgresql object > > + there's no confusion as to what it means > + using a custom object allows access via setObject(..., Types.OTHER) > consistently, as well as via the extension method. > it doesn't (at least, not in the current implementation) - Types.OTHER ends up calling setString(), that makes it useless for IN parameters > - java.sql.Array and java.util.Collection have problems as PGStatement is> compiled for all JDKs and JDBC versions and those types may not be present> (we could do a PGJDBC2Statement or something, but that's getting messy)>you could declare it to take Object, I suppose (that would be the only way anyway if you wanted to support arrays of primitive types anyway) > - have to downcast to a PGStatement to use it>> >>>Option 2: make setArray() expand to an IN clause when the parameter follows " IN".>> + no new methods or types needed> - setArray() behaves differently depending on query context> - user has to wrap the underlying array in a java.sql.Array>>Option 3: make setObject(n, Collection [, type]) expand to an IN clause.>> + no new methods or types needed> - must assume that the contents of the collection use the default type mapping> if a type is not provided>You can require the type to be provided. > - if a type is provided and we apply it to the *components* of the> collection, this breaks the general getObject() interface of "bind this> object interpreting it as this particular type".> - not obvious what to do with objects that are both Collections and some> other SQL-relevant type; solutions make setObject's behaviour complex> and/or query-context-dependent> >>Option 4: as 3, but use java arrays (native arrays, not java.sql.Array) instead of> java.util.Collection>> + as 3, but the ambiguity of "object is both Collection and SQL type X"> goes away.>>Option 5: don't provide an extension at all i.e. do away with setting IN clauses> in this way.>> + no issues with server-side prepare> - obviously, you can't set IN clauses in one go any more.>>1-4 all need to disable server-side prepare when used.>>Did I miss anything? My personal order of preference is 1-2-4-5-3.>For what it's worth, mine is 3-4-1,2,5 (commas meaning that the last three seem equally useless). Dima
> I have a >partial implementation of 2 written but it's easy to adapt that to whatever >external interface. > >setArray() needs fixing regardless of what happens here. I hope to have a >patch for that ready later today. > >-O > >---------------------------(end of broadcast)--------------------------- >TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to [email protected: majo...@postgresql.org]) > >
|
|
|
| 5) Fernando Nasser Oliver has to update his summary first. There are some new info from the backend side. Note that... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Felipe Schnack wrote: > I also prefer number one.> Maybe we should do a poll? :-)> Oliver has to update his summary first. There are some new info from the backend side. Note that option 2 now should read "when inside the parenthesis that define an <in value list> of the IN <predicate>. (I am using the SQL standard clause names here). I would go with number 2 because that is exactly what the backend does in its PREPARE statement. Cheers, Fernando
> On Tue, 22 Jul 2003 16:11:19 +1200 > Oliver Jowett <oliver@opencloud.com> wrote: > > >>Some of the threads on this are getting a bit bogged down, I thought I'd >>summarize the viable options I've seen so far (well, from my point of view >>anyway) >> >>setObject() currently allows the user to bypass parameter escaping via >>setObject(n, string, Types.NUMERIC) (and variants on that). This needs to be >>plugged as it's a potential security hole. >> >>However the same functionality lets you do the (nonstandard) trick of >>providing an IN clause to a PreparedStatement like "SELECT * FROM table >>WHERE pk IN ?". It'd be good to still allow this functionality somehow after >>setObject is fixed. This is going to be a postgresql-specific extension >>however we do it. >> >>Here are the permutations I can remember: >> >>Option 1: add a method to PGStatement that explicitly sets an IN clause, >> taking either a java.sql.Array, java.util.Collection + component type, >> array + component type, or a custom postgresql object >> >> + there's no confusion as to what it means >> + using a custom object allows access via setObject(..., Types.OTHER) >> consistently, as well as via the extension method. >> - java.sql.Array and java.util.Collection have problems as PGStatement is >> compiled for all JDKs and JDBC versions and those types may not be present >> (we could do a PGJDBC2Statement or something, but that's getting messy) >> - have to downcast to a PGStatement to use it >> >>Option 2: make setArray() expand to an IN clause when the parameter follows " IN". >> >> + no new methods or types needed >> - setArray() behaves differently depending on query context >> - user has to wrap the underlying array in a java.sql.Array >> >>Option 3: make setObject(n, Collection [, type]) expand to an IN clause. >> >> + no new methods or types needed >> - must assume that the contents of the collection use the default type mapping >> if a type is not provided >> - if a type is provided and we apply it to the *components* of the >> collection, this breaks the general getObject() interface of "bind this >> object interpreting it as this particular type". >> - not obvious what to do with objects that are both Collections and some >> other SQL-relevant type; solutions make setObject's behaviour complex >> and/or query-context-dependent >> >>Option 4: as 3, but use java arrays (native arrays, not java.sql.Array) instead of >> java.util.Collection >> >> + as 3, but the ambiguity of "object is both Collection and SQL type X" >> goes away. >> >>Option 5: don't provide an extension at all i.e. do away with setting IN clauses >> in this way. >> >> + no issues with server-side prepare >> - obviously, you can't set IN clauses in one go any more. >> >>1-4 all need to disable server-side prepare when used. >> >>Did I miss anything? My personal order of preference is 1-2-4-5-3. I have a >>partial implementation of 2 written but it's easy to adapt that to whatever >>external interface. >> >>setArray() needs fixing regardless of what happens here. I hope to have a >>patch for that ready later today. >> >>-O >> >>---------------------------(end of broadcast)--------------------------- >>TIP 2: you can get off all lists at once with the unregister command >> (send "unregister YourEmailAddressHere" to [email protected: majo...@postgresql.org]) > > >
-- Fernando Nasser Red Hat Canada Ltd. E-Mail: [email protected: fn...@redhat.com] 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9
|
|
|
| 6) Dmitry Tkach This is the same problem, as it generally exists with x=? - the query plan is generally not as good... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Fernando Nasser wrote:
> Thanks for summarizing it Oliver. > > I've asked Tom Lane about the backend behavior and he informed me that: > > 1) 7.4 backends do support parameters in the IN predicate, as ($1, $2, > $3) (i.e., our (?, ?, ?) syntax). > > 2) 7.4 backends have a PostgreSQL specific extension that allows you > to fill the IN predicate with a list: ($1) (i.e., our (?) ). One has > to pass a PostgreSQL array, like integer[] to fill the list. Note > that the parenthesis is already in place, it is not generated by the ? > expansion. > > The feature 2 in 7.4 backends is of limited use as the planner does > not know about the list, so the generated plan will not be as good as > if you pass the list with fixed values since the beginning.
This is the same problem, as it generally exists with x=? - the query plan is generally not as good as x=1, because the planner doesn't know the value to use with statistics. Are you saying that #2 only works with integers? Or can you give it any array? Dima
|
|
|
| 7) Dmitry Tkach If I read this correctly, there is no need for any special handling from the driver side - just... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Fernando Nasser wrote:
> Thanks for summarizing it Oliver. > > I've asked Tom Lane about the backend behavior and he informed me that: > > 1) 7.4 backends do support parameters in the IN predicate, as ($1, $2, > $3) (i.e., our (?, ?, ?) syntax). > > 2) 7.4 backends have a PostgreSQL specific extension that allows you > to fill the IN predicate with a list: ($1) (i.e., our (?) ). One has > to pass a PostgreSQL array, like integer[] to fill the list. Note > that the parenthesis is already in place, it is not generated by the ? > expansion.
If I read this correctly, there is no need for any special handling from the driver side - just setArray() should work. Or am I missing something? Dima
|
|
|
| 8) Oliver Jowett Well, certainly, we'd need to change setObject to understand this new type. Good point. Hmm, so... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
On Tue, Jul 22, 2003 at 10:27:17AM -0400, Dmitry Tkach wrote: > > > > > >Here are the permutations I can remember: > > > >Option 1: add a method to PGStatement that explicitly sets an IN clause, > > taking either a java.sql.Array, java.util.Collection + component type, > > array + component type, or a custom postgresql object > > > > + there's no confusion as to what it means > > + using a custom object allows access via setObject(..., Types.OTHER) > > consistently, as well as via the extension method. > > > it doesn't (at least, not in the current implementation) - Types.OTHER > ends up calling setString(), that makes it useless for IN parameters
Well, certainly, we'd need to change setObject to understand this new type. > > - java.sql.Array and java.util.Collection have problems as PGStatement is> > compiled for all JDKs and JDBC versions and those types may not be > > present> > (we could do a PGJDBC2Statement or something, but that's getting messy)> >> you could declare it to take Object, I suppose (that would be the only > way anyway if you wanted to support arrays of primitive types anyway)Good point. > >Option 3: make setObject(n, Collection [, type]) expand to an IN clause.> >> > + no new methods or types needed> > - must assume that the contents of the collection use the default type > > mapping> > if a type is not provided> >> You can require the type to be provided.Hmm, so what does setObject with no type do in that case? Also see the next point. > > - if a type is provided and we apply it to the *components* of the> > collection, this breaks the general getObject() interface of "bind this> > object interpreting it as this particular type".-O
|
|
|
| 9) Fernando Nasser With the new V3 protocol this is probably true (7.4 will support V3). |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Dmitry Tkach wrote: > Fernando Nasser wrote: > >> Thanks for summarizing it Oliver. >> >> I've asked Tom Lane about the backend behavior and he informed me that: >> >> 1) 7.4 backends do support parameters in the IN predicate, as ($1, $2, >> $3) (i.e., our (?, ?, ?) syntax). >> >> 2) 7.4 backends have a PostgreSQL specific extension that allows you >> to fill the IN predicate with a list: ($1) (i.e., our (?) ). One has >> to pass a PostgreSQL array, like integer[] to fill the list. Note >> that the parenthesis is already in place, it is not generated by the ? >> expansion. > > > If I read this correctly, there is no need for any special handling > from the driver side - just setArray() should work. Or am I missing > something? >
With the new V3 protocol this is probably true (7.4 will support V3). -- Fernando Nasser Red Hat Canada Ltd. E-Mail: [email protected: fn...@redhat.com] 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9
|
|
|
| 10) Dmitry Tkach It will throw an exception - "Unrecognized parameter type: " + Object.getClass().getName () Well...... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
> > >>>Option 3: make setObject(n, Collection [, type]) expand to an IN clause. >>> >>>+ no new methods or types needed >>>- must assume that the contents of the collection use the default type >>>mapping >>> if a type is not provided >>> >>> >>> >>You can require the type to be provided. >> >> > >Hmm, so what does setObject with no type do in that case? Also see the next >point. > > It will throw an exception - "Unrecognized parameter type: " + Object.getClass().getName () > >>>>- if a type is provided and we apply it to the *components* of the>>> collection, this breaks the general getObject() interface of "bind this>>> object interpreting it as this particular type".>>> >>>Well... this "general interface" is *by implication* only. It is not defined this way in the spec, it is not documented to always work this way. So, you just *assume*, that this is the general interface... It doesn't have to be like that... Certainly not at the cost of valuable functionality... Dima. P.S. Actually, in light of that previous message about 7.4 support for arrays in the in clause, this whole discussion seems to be moot :-) It seems to me that just setArray () should then work, without any special handling by the driver... Isn't it the case?
|
|
|
| 11) Oliver Jowett I assume this is only when you're doing a PREPARE/EXECUTE? Hm, then it sounds like the right... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
On Tue, Jul 22, 2003 at 09:05:45AM -0400, Fernando Nasser wrote: > Thanks for summarizing it Oliver. > > I've asked Tom Lane about the backend behavior and he informed me that: > > 1) 7.4 backends do support parameters in the IN predicate, as ($1, $2, > $3) (i.e., our (?, ?, ?) syntax). > > 2) 7.4 backends have a PostgreSQL specific extension that allows you to > fill the IN predicate with a list: ($1) (i.e., our (?) ). One has to > pass a PostgreSQL array, like integer[] to fill the list. Note that the > parenthesis is already in place, it is not generated by the ? expansion.
I assume this is only when you're doing a PREPARE/EXECUTE? > The feature 2 in 7.4 backends is of limited use as the planner does not > know about the list, so the generated plan will not be as good as if you > pass the list with fixed values since the beginning. But an improvement > for this can be attempted for 7.5.Hm, then it sounds like the right solution is to have setArray() expand as the guts of an IN clause when the backend is <7.4 or server prepares are off, and the parameter is in a query of the form "... IN (?)", and as a normal array otherwise. -O
|
|
|
| 12) Fernando Nasser yes. That is _exactly_ what I am proposing (option 2 of your summary)... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Oliver Jowett wrote: > On Tue, Jul 22, 2003 at 09:05:45AM -0400, Fernando Nasser wrote: > >>Thanks for summarizing it Oliver. >> >>I've asked Tom Lane about the backend behavior and he informed me that: >> >>1) 7.4 backends do support parameters in the IN predicate, as ($1, $2, >>$3) (i.e., our (?, ?, ?) syntax). >> >>2) 7.4 backends have a PostgreSQL specific extension that allows you to >>fill the IN predicate with a list: ($1) (i.e., our (?) ). One has to >>pass a PostgreSQL array, like integer[] to fill the list. Note that the >>parenthesis is already in place, it is not generated by the ? expansion. > > > I assume this is only when you're doing a PREPARE/EXECUTE? >
yes. > >>The feature 2 in 7.4 backends is of limited use as the planner does not >>know about the list, so the generated plan will not be as good as if you >>pass the list with fixed values since the beginning. But an improvement >>for this can be attempted for 7.5.> > > Hm, then it sounds like the right solution is to have setArray() expand as> the guts of an IN clause when the backend is <7.4 or server prepares are> off, and the parameter is in a query of the form "... IN (?)", and as a> normal array otherwise.> That is _exactly_ what I am proposing (option 2 of your summary)
-- Fernando Nasser Red Hat Canada Ltd. E-Mail: [email protected: fn...@redhat.com] 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9
|
|
|
| 13) Darin Ohashi I'm not sure if this makes sense, but could you have a conflict between a set containing a single... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
> Oliver has to update his summary first. There are some new info from > the backend side. > > Note that option 2 now should read "when inside the parenthesis that > define an <in value list> of the IN <predicate>. (I am using the SQL > standard clause names here). > > I would go with number 2 because that is exactly what the > backend does > in its PREPARE statement.
I'm not sure if this makes sense, but could you have a conflict between a set containing a single element that is the array and a set containing the elements of the array. Darin
|
|
|
| 14) Tom Lane No, it's not the same thing --- the planner can generate an indexscan plan when scalar params are... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Dmitry Tkach <dmitry@openratings.com> writes: > Fernando Nasser wrote: >> The feature 2 in 7.4 backends is of limited use as the planner does >> not know about the list, so the generated plan will not be as good as >> if you pass the list with fixed values since the beginning.
> This is the same problem, as it generally exists with x=? - the query > plan is generally not as good as x=1, because the planner doesn't know > the value to use with statistics.
No, it's not the same thing --- the planner can generate an indexscan plan when scalar params are involved, although it might choose not to. The planner is simply not aware that any comparable optimization might be possible when using the new array syntax. Let me attach the example I sent Fernando last night ... Fernando Nasser <fnasser@redhat.com> writes: > PREPARE tststmt (integer[]) AS SELECT * from testmetadata where id IN (?);> PREPARE tststmt (integer, integer) AS SELECT * from testmetadata where id IN (?, ?);> all give parsing errors.The second case works fine from the command line: regression=# prepare z(int,int) as select * from tenk1 where unique1 in ($1,$2); PREPARE regression=# execute z(42,66); unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+--------- 42 | 5530 | 0 | 2 | 2 | 2 | 42 | 42 | 42 | 42 | 42 | 84 | 85 | QBAAAA | SEIAAA | OOOOxx 66 | 6723 | 0 | 2 | 6 | 6 | 66 | 66 | 66 | 66 | 66 | 132 | 133 | OCAAAA | PYJAAA | VVVVxx (2 rows) Perhaps JDBC has some problem with it? I would not expect the first case to work, since it violates the plain meaning of IN. But Joe Conway has implemented some non-SQL syntax to handle that in 7.4: regression=# prepare zz(int[]) as select * from tenk1 where unique1 = ANY ($1); PREPARE regression=# execute zz(ARRAY[42,66]); unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+--------- 42 | 5530 | 0 | 2 | 2 | 2 | 42 | 42 | 42 | 42 | 42 | 84 | 85 | QBAAAA | SEIAAA | OOOOxx 66 | 6723 | 0 | 2 | 6 | 6 | 66 | 66 | 66 | 66 | 66 | 132 | 133 | OCAAAA | PYJAAA | VVVVxx (2 rows) I should warn you though that this is not yet executed efficiently; the planner has no idea about reducing it to a set of indexscans. Compare regression=# explain analyze execute z(42,66); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Index Scan using tenk1_unique1, tenk1_unique1 on tenk1 (cost=0.00..12.02 rows=2 width=244) (actual time=0.28..0.48 rows=2 loops=1) Index Cond: ((unique1 = $1) OR (unique1 = $2)) Total runtime: 1.35 msec (3 rows) regression=# explain analyze execute zz(ARRAY[42,66]); QUERY PLAN ------------------------------------------------------------------------------------------------------- Seq Scan on tenk1 (cost=0.00..708.00 rows=5000 width=244) (actual time=70.03..126.16 rows=2 loops=1) Filter: (unique1 = ANY ($1)) Total runtime: 126.78 msec (3 rows) Perhaps we can make it work better in 7.5. regards, tom lane
|
|
|
| 15) Dmitry Tkach Oops :-( I see... that is a world of a differnce :-( Dima ... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Oops :-( I see... that is a world of a differnce :-( Dima
Tom Lane wrote:
>Dmitry Tkach <dmitry@openratings.com> writes: > > >>Fernando Nasser wrote: >> >> >>>The feature 2 in 7.4 backends is of limited use as the planner does >>>not know about the list, so the generated plan will not be as good as >>>if you pass the list with fixed values since the beginning. >>> >>> > > > >>This is the same problem, as it generally exists with x=? - the query >>plan is generally not as good as x=1, because the planner doesn't know >>the value to use with statistics. >> >> > >No, it's not the same thing --- the planner can generate an indexscan >plan when scalar params are involved, although it might choose not to. >The planner is simply not aware that any comparable optimization might >be possible when using the new array syntax. Let me attach the example >I sent Fernando last night ... > > >Fernando Nasser <fnasser@redhat.com> writes: > > >>PREPARE tststmt (integer[]) AS SELECT * from testmetadata where id IN (?); >>PREPARE tststmt (integer, integer) AS SELECT * from testmetadata where id IN (?, ?); >>all give parsing errors. >> >> > >The second case works fine from the command line: > >regression=# prepare z(int,int) as select * from tenk1 where unique1 in ($1,$2); >PREPARE >regression=# execute z(42,66); > unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 >---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+--------- > 42 | 5530 | 0 | 2 | 2 | 2 | 42 | 42 | 42 | 42 | 42 | 84 | 85 | QBAAAA | SEIAAA | OOOOxx > 66 | 6723 | 0 | 2 | 6 | 6 | 66 | 66 | 66 | 66 | 66 | 132 | 133 | OCAAAA | PYJAAA | VVVVxx >(2 rows) > >Perhaps JDBC has some problem with it? > >I would not expect the first case to work, since it violates the plain >meaning of IN. But Joe Conway has implemented some non-SQL syntax to >handle that in 7.4: > >regression=# prepare zz(int[]) as select * from tenk1 where unique1 = ANY ($1); >PREPARE >regression=# execute zz(ARRAY[42,66]); > unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 >---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+--------- > 42 | 5530 | 0 | 2 | 2 | 2 | 42 | 42 | 42 | 42 | 42 | 84 | 85 | QBAAAA | SEIAAA | OOOOxx > 66 | 6723 | 0 | 2 | 6 | 6 | 66 | 66 | 66 | 66 | 66 | 132 | 133 | OCAAAA | PYJAAA | VVVVxx >(2 rows) > >I should warn you though that this is not yet executed efficiently; the >planner has no idea about reducing it to a set of indexscans. Compare > >regression=# explain analyze execute z(42,66); > QUERY PLAN >------------------------------------------------------------------------------------------------------------------------------------- > Index Scan using tenk1_unique1, tenk1_unique1 on tenk1 (cost=0.00..12.02 rows=2 width=244) (actual time=0.28..0.48 rows=2 loops=1) > Index Cond: ((unique1 = $1) OR (unique1 = $2)) > Total runtime: 1.35 msec >(3 rows) > >regression=# explain analyze execute zz(ARRAY[42,66]); > QUERY PLAN >------------------------------------------------------------------------------------------------------- > Seq Scan on tenk1 (cost=0.00..708.00 rows=5000 width=244) (actual time=70.03..126.16 rows=2 loops=1) > Filter: (unique1 = ANY ($1)) > Total runtime: 126.78 msec >(3 rows) > > >Perhaps we can make it work better in 7.5. > > regards, tom lane > >
|
|
|
| 16) Fernando Nasser You will need to have an Array of Arrays in that case. |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Darin Ohashi wrote: >>Oliver has to update his summary first. There are some new info from >>the backend side. >> >>Note that option 2 now should read "when inside the parenthesis that >>define an <in value list> of the IN <predicate>. (I am using the SQL >>standard clause names here). >> >>I would go with number 2 because that is exactly what the >>backend does >>in its PREPARE statement. > > > I'm not sure if this makes sense, but could you have a conflict between a set > containing a single element that is the array and a set containing the elements > of the array. >
You will need to have an Array of Arrays in that case.
-- Fernando Nasser Red Hat Canada Ltd. E-Mail: [email protected: fn...@redhat.com] 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9
|
|
|
| 17) Oliver Jowett Ouch. That syntax is going to be messy to transform into an IN clause for <7.4 backends. -O |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
On Tue, Jul 22, 2003 at 10:51:51AM -0400, Tom Lane wrote:
> Fernando Nasser <fnasser@redhat.com> writes: > > PREPARE tststmt (integer[]) AS SELECT * from testmetadata where id IN (?); > > PREPARE tststmt (integer, integer) AS SELECT * from testmetadata where id IN (?, ?); > > all give parsing errors.
> I would not expect the first case to work, since it violates the plain > meaning of IN. But Joe Conway has implemented some non-SQL syntax to > handle that in 7.4: > > regression=# prepare zz(int[]) as select * from tenk1 where unique1 = ANY ($1); > PREPARE > regression=# execute zz(ARRAY[42,66]);
Ouch. That syntax is going to be messy to transform into an IN clause for <7.4 backends. -O
|
|
|
| 18) Felipe Schnack > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Am I the only the only one who doesn't like the idea of the driver parsing SQL statements (to check if there is a IN clause) On Tue, 22 Jul 2003 10:41:22 -0400 Fernando Nasser <fnasser@redhat.com> wrote: > Oliver Jowett wrote:> > On Tue, Jul 22, 2003 at 09:05:45AM -0400, Fernando Nasser wrote:> > > >>Thanks for summarizing it Oliver.> >>> >>I've asked Tom Lane about the backend behavior and he informed me that:> >>> >>1) 7.4 backends do support parameters in the IN predicate, as ($1, $2, > >>$3) (i.e., our (?, ?, ?) syntax).> >>> >>2) 7.4 backends have a PostgreSQL specific extension that allows you to > >>fill the IN predicate with a list: ($1) (i.e., our (?) ). One has to > >>pass a PostgreSQL array, like integer[] to fill the list. Note that the > >>parenthesis is already in place, it is not generated by the ? expansion.> > > > > > I assume this is only when you're doing a PREPARE/EXECUTE?> > > > yes.> > > > >>The feature 2 in 7.4 backends is of limited use as the planner does not > >>know about the list, so the generated plan will not be as good as if you > >>pass the list with fixed values since the beginning. But an improvement > >>for this can be attempted for 7.5.> > > > > > Hm, then it sounds like the right solution is to have setArray() expand as> > the guts of an IN clause when the backend is <7.4 or server prepares are> > off, and the parameter is in a query of the form "... IN (?)", and as a> > normal array otherwise.> > > > That is _exactly_ what I am proposing (option 2 of your summary)> > > > -- > Fernando Nasser> Red Hat Canada Ltd. E-Mail: [email protected: fn...@redhat.com]> 2323 Yonge Street, Suite #300> Toronto, Ontario M4P 2C9> > > ---------------------------(end of broadcast)---------------------------> TIP 2: you can get off all lists at once with the unregister command> (send "unregister YourEmailAddressHere" to [email protected: majo...@postgresql.org])
--
/~\ The ASCII Felipe Schnack ([email protected: fe...@ritterdosreis.br]) \ / Ribbon Campaign Analista de Sistemas X Against HTML Cel.: 51-91287530 / \ Email! Linux Counter #281893
Centro Universitário Ritter dos Reis http://www.ritterdosreis.br [email protected: r...@ritterdosreis.br] Fone: 51-32303341
|
|
|
| 19) Fernando Nasser Mind that this is only necessary for backward compatibility. With 7.4 and the V3 protocol you just... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Felipe Schnack wrote: > Am I the only the only one who doesn't like the idea of the driver parsing SQL statements (to check if there is a IN clause)> Mind that this is only necessary for backward compatibility. With 7.4 and the V3 protocol you just send an array and the backend sorts it out.
> On Tue, 22 Jul 2003 10:41:22 -0400 > Fernando Nasser <fnasser@redhat.com> wrote: > > >>Oliver Jowett wrote: >> >>>On Tue, Jul 22, 2003 at 09:05:45AM -0400, Fernando Nasser wrote: >>> >>> >>>>Thanks for summarizing it Oliver. >>>> >>>>I've asked Tom Lane about the backend behavior and he informed me that: >>>> >>>>1) 7.4 backends do support parameters in the IN predicate, as ($1, $2, >>>>$3) (i.e., our (?, ?, ?) syntax). >>>> >>>>2) 7.4 backends have a PostgreSQL specific extension that allows you to >>>>fill the IN predicate with a list: ($1) (i.e., our (?) ). One has to >>>>pass a PostgreSQL array, like integer[] to fill the list. Note that the >>>>parenthesis is already in place, it is not generated by the ? expansion. >>> >>> >>>I assume this is only when you're doing a PREPARE/EXECUTE? >>> >> >>yes. >> >> >>>>The feature 2 in 7.4 backends is of limited use as the planner does not >>>>know about the list, so the generated plan will not be as good as if you >>>>pass the list with fixed values since the beginning. But an improvement >>>>for this can be attempted for 7.5. >>> >>> >>>Hm, then it sounds like the right solution is to have setArray() expand as >>>the guts of an IN clause when the backend is <7.4 or server prepares are >>>off, and the parameter is in a query of the form "... IN (?)", and as a >>>normal array otherwise. >>> >> >>That is _exactly_ what I am proposing (option 2 of your summary) >> >> >> >>-- >>Fernando Nasser >>Red Hat Canada Ltd. E-Mail: [email protected: fn...@redhat.com] >>2323 Yonge Street, Suite #300 >>Toronto, Ontario M4P 2C9 >> >> >>---------------------------(end of broadcast)--------------------------- >>TIP 2: you can get off all lists at once with the unregister command >> (send "unregister YourEmailAddressHere" to [email protected: majo...@postgresql.org]) > > >
-- Fernando Nasser Red Hat Canada Ltd. E-Mail: [email protected: fn...@redhat.com] 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9
|
|
|
| 20) Fernando Nasser Remember we will already have to know that we are handling the <in values list> clause (i.e. it is... |
|
|
| |
+1 vote
|
|
 |
|
|
|
|
|
|
Oliver Jowett wrote: > On Tue, Jul 22, 2003 at 10:51:51AM -0400, Tom Lane wrote: > > >>Fernando Nasser <fnasser@redhat.com> writes: >> >>>PREPARE tststmt (integer[]) AS SELECT * from testmetadata where id IN (?); >>>PREPARE tststmt (integer, integer) AS SELECT * from testmetadata where id IN (?, ?); >>>all give parsing errors. >> > >>I would not expect the first case to work, since it violates the plain >>meaning of IN. But Joe Conway has implemented some non-SQL syntax to >>handle that in 7.4: >> >>regression=# prepare zz(int[]) as select * from tenk1 where unique1 = ANY ($1); >>PREPARE >>regression=# execute zz(ARRAY[42,66]); > > > Ouch. That syntax is going to be messy to transform into an IN clause for > <7.4 backends. >
Remember we will already have to know that we are handling the <in values list> clause (i.e. it is a " IN (?)'), so we can very well special case the expansion of the array. But it will be much better on 7.4 and V3, I agree. -- Fernando Nasser Red Hat Canada Ltd. E-Mail: [email protected: fn...@redhat.com] 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9
|
|
|
|
 | |