I am trying to crawl several thousand RSS feeds every 30 minutes,
and I thought I could use Hadoop and HBase as my platform.
However, I am not familiar with the HBase architecture, and I was wondering
whether I can insert crawled news articles directly into HBase without first
saving them to HDFS myself.
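To make the question concrete, this is roughly what I have in mind: writing each article straight to HBase through the Java client API, with no explicit HDFS step on my side. The table name `articles`, the column family `content`, and the row-key scheme here are just placeholders I made up for illustration (this is a sketch, not working production code, and it assumes a reachable HBase cluster configured in `hbase-site.xml`):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ArticleWriter {
    public static void main(String[] args) throws Exception {
        // Picks up cluster settings (ZooKeeper quorum etc.) from hbase-site.xml
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("articles"))) {
            // Placeholder row key: feed URL plus fetch timestamp
            Put put = new Put(Bytes.toBytes("http://example.com/feed|20240101T0930"));
            put.addColumn(Bytes.toBytes("content"), Bytes.toBytes("title"),
                          Bytes.toBytes("Some article title"));
            put.addColumn(Bytes.toBytes("content"), Bytes.toBytes("body"),
                          Bytes.toBytes("Full article text..."));
            // Sent to the RegionServer; I never touch HDFS directly here
            table.put(put);
        }
    }
}
```

Is this kind of direct write the normal way to load data, or am I expected to stage the articles in HDFS first?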
I'm asking this (possibly dumb) question because every HBase example I have
seen in reference books starts by saving the data to HDFS first.
Also, if I have two machines, A running HDFS and B running HBase,
what happens when I insert data directly into HBase?
Is the data stored on B, with a pointer back to A?
Or is the data stored on A, with B keeping a pointer to it?
I really have no idea how HBase operates :(