PS, yes that was coming from master
On 3 August 2010 14:22, Jamie Cockrill wrote:
Hi JD,

The cluster is on a separate network; I'll see if any of the traces
remain. As for the ulimit and xceivers settings, those are set up
correctly as per the API doc you mention.


On 2 August 2010 19:18, Jean-Daniel Cryans wrote:
Is that coming from the master? If so, it means that it was trying to
write recovered data from a failed region server and wasn't able to do
so. It sounds bad.

- Can we get full stack traces of that error?
- Did you check the datanode logs for any exceptions? Very often
(strong emphasis on "very") it's an issue with either ulimit or
xcievers. Is your cluster configured per the last bullet on that page?
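For reference, both settings JD asks about can be checked from a shell on a datanode. This is a minimal sketch: the property name dfs.datanode.max.xcievers is the real (and genuinely misspelled) Hadoop property, but the config path below is a typical default and may differ on your install.

```shell
# Open-file limit for the user running the DataNode / RegionServer.
# The HBase docs recommend raising this well above the usual 1024 default.
ulimit -n

# The xcievers setting lives in hdfs-site.xml on every datanode.
# Note the property name really is spelled "xcievers" in Hadoop.
grep -A1 "dfs.datanode.max.xcievers" /etc/hadoop/conf/hdfs-site.xml 2>/dev/null \
  || echo "hdfs-site.xml not found at the default path; check your install"
```

If either value is at its default, that matches the "very often" failure mode JD describes.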


On Mon, Aug 2, 2010 at 6:16 AM, Jamie Cockrill wrote:
Hi All,

I set off a long-running loading job over the weekend and it seems to
have rather destroyed my hbase cluster. Most of the nodes were down
this morning and upon restarting them, I'm now persistently getting
the following message every few ms in the master logs:

DFSClient: Could not complete file
/hbase/.logs/compute17.cluster1.lan,60020,1280518716613/a filename

That file is a zero-byte file on HDFS. The datanodes all look fine
and don't seem to have had any trouble. I'm not especially fussed
about having to rebuild and reload that table, but the trouble is that
I now can't start the cluster properly, so I can't drop the table.

Does anyone know how I can remove the table or fix these errors
manually? As I said, I'm not fussed about data loss.
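One manual recovery route, given that data loss is acceptable here, is to move the stale write-ahead-log directory aside so the master stops trying to recover it on startup. This is a sketch, not a definitive procedure: the path comes from the error message above, the `hadoop` binary is assumed to be on $PATH, and 'mytable' is a placeholder name. Stop the HBase master before running it.

```shell
# WARNING: this discards the recovered-log data, which is only safe
# because data loss is acceptable in this case. Stop HBase first.
if command -v hadoop >/dev/null 2>&1; then
  # Move the stale region server log directory out of the way so the
  # master no longer attempts log recovery on it at startup:
  hadoop fs -mv \
    /hbase/.logs/compute17.cluster1.lan,60020,1280518716613 \
    /hbase/.logs-corrupt-$(date +%s)
else
  echo "hadoop not on PATH; run this on a cluster node"
fi

# After restarting the master, the table can then be dropped from the
# HBase shell as usual:
#   hbase shell
#   > disable 'mytable'
#   > drop 'mytable'
```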



Discussion Overview
group: user @
categories: hbase, hadoop
posted: Aug 2, '10 at 1:17p
active: Aug 3, '10 at 5:15p
