If you’re running Hadoop 0.20 with Hive 0.7, here are a couple of
bugs that are useful to know about:
NullPointerException
If you have an external partitioned table, this could mean you
forgot to recover the partitions before running the query:
ALTER TABLE sample RECOVER PARTITIONS;
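As far as I know, RECOVER PARTITIONS is an Amazon Elastic MapReduce extension to Hive. On a stock Apache Hive install the closest equivalent is MSCK REPAIR TABLE. A sketch, reusing the sample table from above and assuming your Hive version supports the statement:

```sql
-- Scan the table's location and add any partition directories
-- that are missing from the metastore (Apache Hive syntax).
MSCK REPAIR TABLE sample;

-- Check which partitions the metastore now knows about.
SHOW PARTITIONS sample;
```

If SHOW PARTITIONS comes back empty for a table whose location clearly contains partition directories, the partitions have not been registered yet.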
MR jobs hanging on 0/0 completed map tasks
Creating an external table that points to an empty location will
cause Hive to generate MapReduce jobs that hang *forever*, because the map
tasks stay at 0% complete (0/0 completed).
There is a Hadoop patch for this (so long as you have the ability to patch
your cluster), and it should already be integrated into Hadoop version 0.21.
Bonus:
If you have some sort of delimited data (e.g., tab-delimited) in a Hive
external table, and you want to find all records where a particular
string field is missing, you need to test for an empty string and not
NULL:
select * from events where venue IS NULL   <= Won’t work
select * from events where venue = ''      <= Will work
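If you'd rather not depend on which of the two a given row contains, a query along these lines catches both cases. This is only a sketch against the same events table. As far as I know, Hive's default text format reads a string field as NULL only when it contains the literal \N, which would explain why an empty field comes back as an empty string.

```sql
-- Match rows where venue is either a real NULL or an empty string.
SELECT *
FROM events
WHERE venue IS NULL   -- field stored as \N
   OR venue = '';     -- field present but empty
```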
