Hi Michael,

The problem is that I do not even get an iterator, because executing a query like the following results in a Java heap space error:

ResultIterator it = dataContext.performIteratedQuery(query);

The answers to your questions are:
1) How many records are you talking about?
It's about half a million records
2) Are you updating your object with a flag/etc you can query on again later (to exclude objects you've already processed)?
I already exclude objects by setting them to a different state, but it may happen that I have to process half a million records despite this.
3) What version of Cayenne are you using and what database?
Cayenne 3.0.2, Postgres 9.1
4) When you convert your Map (from the iterated query) into a DataObject, are you creating a new DataContext or using the old one over and over again?
At the moment I am using just one DataContext and unregistering the processed objects. But as mentioned above, execution does not even get to this point.

Simon

Hi Simon, some questions:

1) How many records are you talking about?
2) Are you updating your object with a flag/etc you can query on again later (to exclude objects you've already processed)?
3) What version of Cayenne are you using and what database?
4) When you convert your Map (from the iterated query) into a DataObject, are you creating a new DataContext or using the old one over and over again?

For #4, if you are using the same DataContext repeatedly, try changing your logic to something more like:

while (iterator.hasNextRow()) {
    DataContext context = DataContext.createDataContext();
    Map row = (Map) iterator.nextRow();
    CayenneObject object = (CayenneObject) context.objectFromDataRow("CayenneObject", row);
    ...
    object.doStuff();
    ...
    context.commitChanges();
}

This way you won't build up a ton of objects in a single DataContext and possibly run out of memory.

mrg


  • Michael Gentry at Dec 17, 2012 at 2:51 pm
    Hi Simon,

    I don't know why your performIteratedQuery() would fail with a heap error.
    Based upon your answer to #2, it sounds like you can set a fetch limit on
    your query (call query.setFetchLimit(limit), then do a normal
    performQuery() and you'll get back real Cayenne objects) and only pull back
    100 or 1000 records, process them (setting them to a different state), then
    commit. Do this in a new DataContext each time so the GC can reclaim the
    memory.

    mrg
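
A minimal sketch of the batch loop described above, written against plain Java collections instead of Cayenne so that it runs on its own. Every name in it (`BatchLoopSketch`, `Rec`, `processAll`, `fetchNew`) is hypothetical. In a real application the fetch step would presumably be a query with a fetch limit executed in a fresh DataContext, and the process step would end with a commit; neither is shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntFunction;

/** Sketch of "fetch a limited batch, process it, commit, repeat". All names are hypothetical. */
final class BatchLoopSketch {

    /** Stand-in for a persistent record with a state column. */
    static final class Rec {
        final int id;
        String state = "NEW";
        Rec(int id) { this.id = id; }
    }

    /**
     * Repeatedly asks fetchBatch for at most `limit` unprocessed records and
     * hands each batch to processAndCommit. Stops when a fetch returns nothing
     * and returns the total number of records handled.
     *
     * Caveat: processAndCommit must move every record out of the "unprocessed"
     * state, otherwise the same batch is fetched forever.
     */
    static int processAll(int limit,
                          IntFunction<List<Rec>> fetchBatch,
                          Consumer<List<Rec>> processAndCommit) {
        int total = 0;
        while (true) {
            List<Rec> batch = fetchBatch.apply(limit);
            if (batch.isEmpty()) {
                return total;
            }
            processAndCommit.accept(batch);
            total += batch.size();
            // The batch goes out of scope here, so it can be garbage collected.
        }
    }

    /** In-memory stand-in for a query on state = 'NEW' with a fetch limit. */
    static List<Rec> fetchNew(List<Rec> table, int limit) {
        List<Rec> out = new ArrayList<>();
        for (Rec r : table) {
            if (out.size() == limit) {
                break;
            }
            if ("NEW".equals(r.state)) {
                out.add(r);
            }
        }
        return out;
    }
}
```

Only one batch is referenced at a time, so memory use depends on the limit and not on the total number of records.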


  • Simon Schneider at Dec 17, 2012 at 3:42 pm
    Hi Michael,

    I understand your approach of using a flag to identify already processed objects. But introducing a flag, or in my case another state, just for processing my records was something I wanted to avoid. I thought Cayenne might have another way of fetching objects in a memory-preserving manner, perhaps an iterator that fetches only the primary keys on creation and then fetches batches of data rows in the background while iterating.

    Simon


  • Michael Gentry at Dec 17, 2012 at 4:40 pm
    Hi Simon,

    I think I misunderstood something you said earlier because I thought you
    already had a "processed" flag you could query against. Given that you
    don't, and that I'm not sure why your performIteratedQuery() is failing,
    perhaps you could combine data rows with paginated queries:

    http://cayenne.apache.org/docs/3.0/data-rows.html
    http://cayenne.apache.org/docs/3.0/paginated-queries.html

    I suspect, however, this will not scale as much as you need (I think the
    paginated query will still fetch ~500k data rows). You may end up
    having to do an SQLTemplate query and fetch only the primary keys (which is
    what a paginated query does), and then do a loop fetching batches of your
    records based upon the primary keys (using new DataContexts, of course).
    This is a bit more work, but shouldn't have issues.

    mrg
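
The "fetch the primary keys first, then load records in batches" idea can be sketched roughly as below. This is a plain-Java illustration, not Cayenne API: `KeyBatchIterator` and its `loadBatch` callback are hypothetical names, and in practice the callback would run a query restricted to the given keys in a new DataContext.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

/**
 * Iterates over records by loading them batch by batch from a list of keys
 * that was fetched up front. Hypothetical sketch, not part of any library.
 */
final class KeyBatchIterator<K, V> implements Iterator<V> {
    private final List<K> keys;
    private final int batchSize;
    private final Function<List<K>, List<V>> loadBatch;
    private int nextKeyIndex = 0;
    private Iterator<V> current = Collections.emptyIterator();

    KeyBatchIterator(List<K> keys, int batchSize, Function<List<K>, List<V>> loadBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.keys = keys;
        this.batchSize = batchSize;
        this.loadBatch = loadBatch;
    }

    @Override
    public boolean hasNext() {
        // Keep loading batches until one yields a record or the keys run out.
        while (!current.hasNext() && nextKeyIndex < keys.size()) {
            int end = Math.min(nextKeyIndex + batchSize, keys.size());
            // Copy the slice so the loader does not hold a view of the full key list.
            List<K> slice = new ArrayList<>(keys.subList(nextKeyIndex, end));
            nextKeyIndex = end;
            // Replacing `current` drops the reference to the previous batch.
            current = loadBatch.apply(slice).iterator();
        }
        return current.hasNext();
    }

    @Override
    public V next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

With half a million keys and a batch size of 1000, only the key list and one batch of records are held at any moment, at the cost of one extra query per batch.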



Discussion Overview
group: user @
categories: cayenne
posted: Dec 17, '12 at 1:39p
active: Dec 17, '12 at 4:40p
posts: 4
users: 2
website: cayenne.apache.org

2 users in discussion

Simon Schneider: 2 posts Michael Gentry: 2 posts
