We have a few projects that do this on Hadoop, but I don't see any
reason why it couldn't be done in Pig.
As Alan and Ashutosh mentioned, the image itself will be just a bytearray
(so you need your own loader or, as in our case, a sequence file
loader). But you can extract and populate metadata about the image
through UDF primitives, which can then be used in the Pig workflow to
control how it is processed in a scale-out fashion on top of Hadoop.
On Tuesday 27 July 2010 12:26 AM, Ifeanyichukwu Osuji wrote:
I was wondering if it would be possible to process images on a
low level using PIG. I want to be able to write a pig script
that can differentiate between two images.