Issues / learnings during the SkyNet POC
Learning 1: BZip2Codec configuration for Hadoop Cluster
In Ambari, edit core-site.xml and append org.apache.hadoop.io.compress.BZip2Codec to the io.compression.codecs property.
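The resulting property in core-site.xml should look roughly like this; the exact codec list depends on what is already configured on the cluster, so the other codec entries here are illustrative:

```xml
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
```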
Learning 2: Enable Tez
Use the following instructions to enable Tez for Hive queries.
Copy the hive-exec-0.13.0.jar to HDFS at the following location: /apps/hive/install/hive-exec-0.13.0.jar.
$ su - hive
$ hadoop fs -mkdir /apps/hive/install
$ hadoop fs -copyFromLocal /usr/lib/hive/lib/hive-exec-* /apps/hive/install/hive-exec-0.13.0.jar
Enable Hive to use the Tez DAG APIs. On the Hive client machine, add the following to your Hive script or execute it in the Hive shell:
set hive.execution.engine=tez;
Disabling Tez for Hive queries: on the Hive client machine, add the following to your Hive script or execute it in the Hive shell:
set hive.execution.engine=mr;
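Put together, a Hive script can switch engines per query; a minimal sketch (the table and query here are hypothetical, not from the POC):

```sql
-- run this query on Tez
set hive.execution.engine=tez;
SELECT flight_id, COUNT(*) FROM adsb_messages GROUP BY flight_id;

-- fall back to MapReduce for the rest of the script
set hive.execution.engine=mr;
```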
Learning 3: Flume from Ambari directly
In the latest Ambari, we can start Flume directly, supplying the configuration file content through the Ambari UI.
Learning 4: Flume-Ng command to start service from CLI in debug mode
$ bin/flume-ng agent --conf ./conf/ -f conf/flumeSkyNet.conf -Dflume.root.logger=DEBUG,console -n agentADSB
Learning 5: Finding Hadoop ecosystem component versions from Ambari
Admin → Cluster (in the Hue interface, About is the other way to find them)
Issue 1: Could not resolve org.apache.hcatalog.pig.HCatStorer - error in HDP 2.2.
At the end of the "pig script" section, there is a text box that says "pig arguments". Type -useHCatalog and then press Enter; it should highlight in gray and add a new empty text box.
Use org.apache.hive.hcatalog.pig.HCatLoader(); instead of org.apache.hcatalog.pig.HCatLoader();
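A minimal Pig sketch with the corrected package names; the table, relation, and column names below are hypothetical:

```pig
-- invoked with the -useHCatalog argument, e.g.: pig -useHCatalog skynet.pig
raw   = LOAD 'default.adsb_raw' USING org.apache.hive.hcatalog.pig.HCatLoader();
clean = FILTER raw BY icao IS NOT NULL;  -- hypothetical column
STORE clean INTO 'default.adsb_clean' USING org.apache.hive.hcatalog.pig.HCatStorer();
```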
Issue 2: cannot access /usr/lib/hive/lib/slf4j-api-*.jar: No such file or directory
1. find / -name 'slf4j-api-*.jar'
2. cp the jar found in step 1 into /usr/lib/hive/lib/
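The two steps can be wrapped in a small shell helper; the function name and the parameterized paths are my own sketch, not from the original fix:

```shell
# copy_slf4j SEARCH_ROOT DEST_DIR
# Finds the first slf4j-api jar under SEARCH_ROOT and copies it into DEST_DIR.
copy_slf4j() {
  search_root="$1"
  dest_dir="$2"
  jar=$(find "$search_root" -name 'slf4j-api-*.jar' 2>/dev/null | head -n 1)
  if [ -z "$jar" ]; then
    echo "no slf4j-api jar found under $search_root" >&2
    return 1
  fi
  mkdir -p "$dest_dir"
  cp "$jar" "$dest_dir/"
  echo "copied $jar to $dest_dir"
}

# On the cluster this would be: copy_slf4j / /usr/lib/hive/lib
```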
Issue 3: Pig 'bytearray' type in column 0 (0-based) cannot map to HCat 'STRING' type. Target field must be of HCat type {BINARY}
Change the Pig type to chararray.
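For example, declare the field as chararray when loading, or cast an untyped (bytearray) field explicitly; the file, relation, and field names here are hypothetical:

```pig
-- either declare the type in the load schema ...
raw = LOAD 'adsb.txt' USING PigStorage(',') AS (icao:chararray, alt:int);

-- ... or cast a bytearray field before storing to HCatalog
untyped = LOAD 'adsb.txt' USING PigStorage(',');
typed   = FOREACH untyped GENERATE (chararray)$0 AS icao, (int)$1 AS alt;
```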
Issue 4: Column names should all be in lowercase. Invalid name found: airGround
Change the column names to lowercase in the Pig script.
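In practice that just means using lowercase names in the Pig schema (the other field here is hypothetical):

```pig
-- 'airground' instead of the rejected mixed-case 'airGround'
raw = LOAD 'adsb.txt' USING PigStorage(',') AS (airground:chararray, icao:chararray);
```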
Issue 5: Permission issues with Linux scripts and HDFS permissions due to the root id and hue id
Make sure the automated process has the correct permissions to read, write, and execute.
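A sketch of the Linux-side fix; the temp file here stands in for the real automation script, and the HDFS paths in the comments are hypothetical:

```shell
# Stand-in for the real automation script.
script=$(mktemp)
printf '#!/bin/sh\necho ok\n' > "$script"
chmod 755 "$script"   # read + execute for everyone, write for owner only
"$script"             # prints: ok

# The HDFS side is analogous (cluster commands, shown as comments only):
#   hadoop fs -chown -R hue:hadoop /user/hue/skynet
#   hadoop fs -chmod -R 775 /user/hue/skynet
```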
Issue 6: Flume: “[ERROR - org.apache.flume.source.NetcatSource.start(NetcatSource.java:167)] Unable to bind to socket.” during source connectivity
In progress…
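A common cause of "Unable to bind to socket" is that another process already listens on the netcat source's port, or the configured bind address is not available on the host. A quick check (the port number is hypothetical; in practice it comes from flumeSkyNet.conf):

```shell
# check_port PORT -> reports whether a TCP listener already occupies PORT
check_port() {
  port="$1"
  if command -v ss >/dev/null 2>&1; then
    if ss -tln 2>/dev/null | grep -q ":$port "; then
      echo "port $port is already in use"
    else
      echo "port $port looks free"
    fi
  else
    echo "ss not available; try: netstat -tln | grep $port"
  fi
}

check_port 44444   # hypothetical netcat source port
```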