We would like to create an Impala table that we can supply updated data to.
Currently we are using inserts with a version timestamp, which lets us get
the latest version of each row as well as run queries "as of" a specific time.
The solution we have joins the table against a subquery of the form
"max(version_timestamp), pkey ... group by pkey",
which has the obvious performance cost of a join.
Q) What other options can you recommend?
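Our current query:
select m.version_timestamp as rev, m.pkey, m.field1, m.field2
from my_table m
inner join (
  -- latest version timestamp for each primary key
  select pkey, max(version_timestamp) as rev
  from my_table
  group by pkey
) v on v.pkey = m.pkey and v.rev = m.version_timestamp;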
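
For the "as of" case mentioned above, the same join can be restricted to versions written at or before a cutoff. This is only a sketch: it assumes version_timestamp is a TIMESTAMP column, and the cutoff value is made up for illustration. The subquery returns pkey and the join matches on it, so each key pairs only with its own latest version.

```sql
-- "As of" sketch: latest version of each row at or before a cutoff.
-- Assumption: version_timestamp is a TIMESTAMP; the cutoff is an example value.
select m.version_timestamp as rev, m.pkey, m.field1, m.field2
from my_table m
inner join (
  -- latest version per key among rows written at or before the cutoff
  select pkey, max(version_timestamp) as rev
  from my_table
  where version_timestamp <= cast('2015-01-01 00:00:00' as timestamp)
  group by pkey
) v on v.pkey = m.pkey and v.rev = m.version_timestamp;
```

Keys whose first version was written after the cutoff do not appear in the result, which is usually what an "as of" query should return.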