I've been reading through this code lately while working on a merge tool for our tables, and it looks like there's an off-by-one bug in the OfflineMerger constructor in util.HMerge:

    InternalScanner rootScanner =
        root.getScanner(scan);
    try {
      List<KeyValue> results = new ArrayList<KeyValue>();
      while(rootScanner.next(results)) {
        for(KeyValue kv: results) {
          HRegionInfo info = Writables.getHRegionInfoOrNull(kv.getValue());
          if (info != null) {
            metaRegions.add(info);
          }
        }
      }
    } finally {
      ...
    }

That call to InternalScanner.next() in the while condition returns true only if there's another result *after* the one it just loaded into the out param. So when it reads the last row into the 'results' collection, it returns false and the loop exits without ever processing that last row. It probably wants to be structured more like this:

    final InternalScanner metaScanner = meta.getScanner(scan);
    List<KeyValue> results = Lists.newArrayList();
    while (true) {
      boolean hasMore = metaScanner.next(results);
      for (KeyValue kv : results) {
        HRegionInfo hri = Writables.getHRegionInfoOrNull(kv.getValue());
        if (hri != null) {
          regionInfo.add(hri);
        }
      }
      if (!hasMore) {
        break;
      }
    }

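To make the difference between the two loop shapes above concrete, here is a small self-contained sketch that doesn't depend on HBase at all. FakeScanner, ScannerLoopDemo and the row names are all made up for illustration. The only thing it borrows from the real code is the assumed contract that next() loads a row into the out param and returns true only if more rows remain after it. The results.clear() calls are there because this fake scanner appends to the list; I haven't checked whether the real scanner appends or replaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScannerLoopDemo {
    // Hypothetical stand-in for InternalScanner: next() loads one row into
    // the out param and returns true only if more rows remain after it.
    static class FakeScanner {
        private final List<String> rows;
        private int pos = 0;

        FakeScanner(List<String> rows) {
            this.rows = rows;
        }

        boolean next(List<String> results) {
            if (pos < rows.size()) {
                results.add(rows.get(pos++));
            }
            return pos < rows.size();
        }
    }

    // Same shape as the current constructor: the row loaded by the final
    // next() call is never processed, because the loop body is skipped.
    static List<String> readWithWhileNext(FakeScanner scanner) {
        List<String> seen = new ArrayList<String>();
        List<String> results = new ArrayList<String>();
        while (scanner.next(results)) {
            seen.addAll(results);
            results.clear();
        }
        return seen;
    }

    // Restructured shape: process whatever was loaded, then check the flag.
    static List<String> readAll(FakeScanner scanner) {
        List<String> seen = new ArrayList<String>();
        List<String> results = new ArrayList<String>();
        boolean hasMore;
        do {
            hasMore = scanner.next(results);
            seen.addAll(results);
            results.clear(); // this fake scanner appends, so drop the batch
        } while (hasMore);
        return seen;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("meta-1", "meta-2", "meta-3");
        // prints [meta-1, meta-2]
        System.out.println(readWithWhileNext(new FakeScanner(rows)));
        // prints [meta-1, meta-2, meta-3]
        System.out.println(readAll(new FakeScanner(rows)));
    }
}
```

With three rows, the first loop only sees two of them. With a single row it sees none, which is probably the more common case for a small table.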
I get the impression this class doesn't get used much, but just thought I'd point it out.
Thanks,
Sandy