Hadoop: Integrating Hive with HBase, plus Thrift

s19930811 · 2015-04-13 22:52 · Linux Tips, Big Data Operations, System Operations

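1. Introduction

Hive is a data warehouse tool built on Hadoop. It maps structured data files onto database tables and provides SQL-style querying, translating the SQL statements into MapReduce jobs for execution. Its advantage is a low learning curve: simple MapReduce statistics can be implemented quickly with SQL-like statements, with no need to develop dedicated MapReduce applications, which makes it well suited to statistical analysis in a data warehouse.

Hive and HBase are integrated by having the two communicate through their public APIs. This communication relies mainly on the hive_hbase-handler.jar utility classes, roughly as shown in the figure: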

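2. Overview of the Hive project

Project structure

Hive configuration files
- hive-site.xml: Hive's configuration file
- hive-env.sh: Hive's runtime environment file
- hive-default.xml.template: default template
- hive-env.sh.template: default configuration for hive-env.sh
- hive-exec-log4j.properties.template: default exec logging configuration
- hive-log4j.properties.template: default logging configuration
hive-site.xml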
<property>
<name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
   <description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
   <value>test</value>
   <description>password to use against metastore database</description>
</property>
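
Every entry in hive-site.xml follows the same name/value/description pattern. As a minimal sketch of that structure, the Python snippet below reads such a file into a dictionary. The sample document and the function name read_properties are illustrative assumptions, not part of Hive.

```python
import xml.etree.ElementTree as ET

# Sample document (an assumption for illustration); the values mirror
# the metastore settings listed above.
HIVE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return {name: value} for every <property> under <configuration>."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value", default="")
        if name:
            props[name.strip()] = value.strip()
    return props


if __name__ == "__main__":
    for name, value in read_properties(HIVE_SITE).items():
        print(f"{name} = {value}")
```

A check like this can catch a malformed tag or a broken value before Hive is started, because ET.fromstring raises a ParseError on XML that is not well-formed.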

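hive-env.sh
- Set the path to Hive's configuration directory:
  export HIVE_CONF_DIR=<your-hive-conf-dir>
- Set the Hadoop installation path:
  HADOOP_HOME=<your-hadoop-home>

The installation procedure depends on where the metadata is stored, so the two variants are described separately.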

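3. Installing with the Derby database

What is the Derby installation mode?
- Apache Derby is a database written entirely in Java, so it is cross-platform, but it has to run inside a JVM.
- Derby is an open source product distributed under the Apache License 2.0.
- In this mode the metadata is stored in a Derby database; it is also Hive's default installation mode.

1. Hadoop and HBase are already installed successfully

Hadoop cluster configuration: http://blog.csdn.net/hguisu/article/details/723739

HBase installation and configuration: http://blog.csdn.net/hguisu/article/details/7244413

2. Download Hive

The latest Hive release at the time of writing is 0.12. Download hive-0.12.0.tar.gz from http://mirror.bit.edu.cn/apache/hive/hive-0.12.0/ . Note that this release is built against Hadoop 1.x and HBase 0.94, so some things have to be changed when running Hadoop 2.X.

3. Install: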

tar zxvf hive-0.12.0.tar.gz

cd hive-0.12.0

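4. Replace the jars so that they match HBase 0.96 and Hadoop 2.2.

The Hive release we downloaded is built against Hadoop 1.x and HBase 0.94, so its jars must be replaced. Because our HBase 0.96 is built on Hadoop 2.2, the Hadoop version mismatch in Hive has to be solved first. The Hive binaries on the official site are all compiled against Hadoop 1.x, so we download the source and rebuild Hive against Hadoop 2.X ourselves. The process is simple:

  1. Get the source from http://svn.apache.org/repos/asf/hive/branches/branch-0.12 or http://svn.apache.org/repos/asf/hive/trunk ; here it is downloaded to /home/hadoop/branch-0.12.

  2. branch-0.12 is built with ant, while trunk is built with maven. If maven is not installed, download it from http://maven.apache.org/download.cgi or run yum install maven. Unpack it and add $maven_home/bin to PATH, or create a symlink (ln -s $maven_home/bin/mvn /usr/local/bin/mvn). Running mvn -v shows whether maven is set up correctly.

  3. With maven configured, cd into the downloaded source directory (branch-0.12) and start the build:

     mvn clean package -DskipTests -Phadoop-2

  4. The newly built jars end up in the target directory of each module. They are all named hive-***-0.13.0-SNAPSHOT.jar, where *** is the Hive module name. Copy them into hive-0.12.0/lib:

     find /home/hadoop/branch-0.12 -name "hive*SNAPSHOT.jar" | xargs -I {} cp {} /home/hadoop/hive-0.12.0/lib/

     After copying, delete the corresponding 0.12 jars from the original lib directory.

  5. Next, align the HBase version. cd into hive-0.12.0/lib, delete the two jars whose names start with hbase-0.94, and copy over every jar starting with hbase from /home/hadoop/hbase-0.96.0-hadoop2/lib:

     find /home/hadoop/hbase-0.96.0-hadoop2/lib -name "hbase*.jar" | xargs -I {} cp {} ./

  6. That completes the basic alignment. Double-check that the zookeeper and protobuf jars match the ones HBase uses; if they do not, copy protobuf.**.jar and zookeeper-3.4.5.jar into hive/lib.

  7. If MySQL is used as the metastore database, remember to copy a MySQL JDBC driver jar, mysql-connector-java-3.1.12-bin.jar, into hive-0.12.0/lib as well.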
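
After this much copying and deleting it is easy to leave two versions of the same jar in lib. The sketch below shows one way to spot that. It assumes jar files are named artifact-version.jar with the version starting with a digit, and the function name find_version_conflicts is made up for this example.

```python
import re
from collections import defaultdict

# <artifact>-<version>.jar, where the version starts with a digit
# (a naming assumption that holds for the jars mentioned in this post).
JAR_PATTERN = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def find_version_conflicts(jar_names):
    """Return {artifact: [versions]} for artifacts present in more than one version."""
    versions = defaultdict(set)
    for jar in jar_names:
        match = JAR_PATTERN.match(jar)
        if match:  # files that do not look like versioned jars are skipped
            versions[match.group("name")].add(match.group("version"))
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


if __name__ == "__main__":
    sample = [
        "hive-exec-0.12.0.jar",
        "hive-exec-0.13.0-SNAPSHOT.jar",
        "zookeeper-3.4.5.jar",
        "protobuf-java-2.4.1.jar",
    ]
    print(find_version_conflicts(sample))
```

In practice the input would be os.listdir("/home/hadoop/hive-0.12.0/lib"). The check only catches duplicates of the same artifact name, so a monolithic hbase-0.94 jar sitting next to the split hbase-client and hbase-common 0.96 jars still has to be removed by hand.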

5. Configure Hive

- Go to the hive-0.12.0/conf directory.
- Create hive-env.sh from hive-env.sh.template:
  cp hive-env.sh.template hive-env.sh
- Edit hive-env.sh.
- Set the path to Hive's configuration directory:
  export HIVE_CONF_DIR=/home/hadoop/hive-0.12.0/conf
- Set the Hadoop path:
  HADOOP_HOME=/home/hadoop/hadoop-2.2.0

hive-site.xml
<?xml version="1.0"?>  
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>  
<configuration>  
  
<!-- Hive Execution Parameters -->  
  
<property>  
  <name>hive.exec.reducers.bytes.per.reducer</name>  
  <value>1000000000</value>  
  <description>size per reducer.The default is 1G, i.e if the input size is 10G, it will use 10 reducers.</description>  
</property>  
  
<property>  
  <name>hive.exec.reducers.max</name>  
  <value>999</value>  
  <description>max number of reducers will be used. If the one  
        specified in the configuration parameter mapred.reduce.tasks is  
        negative, hive will use this one as the max number of reducers when  
        automatically determine number of reducers.</description>  
</property>  
  
<property>  
  <name>hive.exec.scratchdir</name>  
  <value>/hive/scratchdir</value>  
  <description>Scratch space for Hive jobs</description>  
</property>  
  
<property>  
  <name>hive.exec.local.scratchdir</name>  
  <value>/tmp/${user.name}</value>  
  <description>Local scratch space for Hive jobs</description>  
</property>  
<property>  
  <name>javax.jdo.option.ConnectionURL</name>  
  <value>jdbc:derby:;databaseName=metastore_db;create=true</value>  
  <description>JDBC connect string for a JDBC metastore</description>  
</property>  
  
<property>  
  <name>javax.jdo.option.ConnectionDriverName</name>  
  <value>org.apache.derby.jdbc.EmbeddedDriver</value>  
  <description>Driver class name for a JDBC metastore</description>  
</property>  
  
<property>  
  <name>javax.jdo.PersistenceManagerFactoryClass</name>  
  <value>org.datanucleus.api.jdo.JDOPersistenceManagerFactory</value>  
  <description>class implementing the jdo persistence</description>  
</property>  
  
<property>  
  <name>javax.jdo.option.DetachAllOnCommit</name>  
  <value>true</value>  
  <description>detaches all objects from session so that they can be used after transaction is committed</description>  
</property>  
  
<property>  
  <name>javax.jdo.option.ConnectionUserName</name>  
  <value>APP</value>  
  <description>username to use against metastore database</description>  
</property>  
  
<property>  
  <name>javax.jdo.option.ConnectionPassword</name>  
  <value>mine</value>  
  <description>password to use against metastore database</description>  
</property>  
  
<property>  
  <name>hive.metastore.warehouse.dir</name>  
  <value>/hive/warehousedir</value>  
  <description>location of default database for the warehouse</description>  
</property>  
  
  
<property>  
 <name>hive.aux.jars.path</name>  
  <value>  
  file:///home/hadoop/hive-0.12.0/lib/hive-ant-0.13.0-SNAPSHOT.jar,  
  file:///home/hadoop/hive-0.12.0/lib/protobuf-java-2.4.1.jar,  
  file:///home/hadoop/hive-0.12.0/lib/hbase-client-0.96.0-hadoop2.jar,  
  file:///home/hadoop/hive-0.12.0/lib/hbase-common-0.96.0-hadoop2.jar,  
  file:///home/hadoop/hive-0.12.0/lib/zookeeper-3.4.5.jar,  
  file:///home/hadoop/hive-0.12.0/lib/guava-11.0.2.jar  
  </value>  
</property>

</configuration>

Hive uses Hadoop, which means hadoop must be on your PATH, or alternatively you can export HADOOP_HOME=<hadoop-install-dir>. In addition, before creating any Hive tables you must create /tmp and /hive/warehousedir (the value of hive.metastore.warehouse.dir) in HDFS and make them group-writable with chmod g+w. The commands are:
$ $HADOOP_HOME/bin/hadoop fs -mkdir -p /tmp
$ $HADOOP_HOME/bin/hadoop fs -mkdir -p /hive/warehousedir
$ $HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
$ $HADOOP_HOME/bin/hadoop fs -chmod g+w /hive/warehousedir
I also found it useful, though not mandatory, to set HIVE_HOME.
$ export HIVE_HOME=<hive-install-dir>
To use the Hive command line (CLI) from the shell:
$ $HIVE_HOME/bin/hive

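6. Start Hive

1) Single-node startup (hbase.master takes the host:port of your HBase master)

# bin/hive -hiveconf hbase.master=master:490001

2) Cluster startup:

# bin/hive -hiveconf hbase.zookeeper.quorum=node1,node2,node3

If hive.aux.jars.path is not configured in hive-site.xml, Hive can be started as follows. The jar list passed to --auxpath must be a single comma-separated argument with no spaces:

bin/hive --auxpath /usr/local/hive/lib/hive-hbase-handler-0.96.0.jar,/usr/local/hive/lib/hbase-0.96.jar,/usr/local/hive/lib/zookeeper-3.3.2.jar -hiveconf hbase.zookeeper.quorum=node1,node2,node3

Starting with a plain # bin/hive also works.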
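
A stray space after a comma in the --auxpath list splits it into several shell arguments. One way to avoid that is to assemble the command from lists, as in this small sketch. The helper build_hive_command is hypothetical; only the --auxpath and -hiveconf options themselves come from the commands above.

```python
import shlex


def build_hive_command(aux_jars, zookeeper_quorum, hive_bin="bin/hive"):
    """Build the argument list for starting the Hive CLI against HBase."""
    jars = [jar.strip() for jar in aux_jars if jar.strip()]
    hosts = [host.strip() for host in zookeeper_quorum if host.strip()]
    cmd = [hive_bin]
    if jars:
        # One comma-separated argument, no spaces between entries.
        cmd += ["--auxpath", ",".join(jars)]
    cmd += ["-hiveconf", "hbase.zookeeper.quorum=" + ",".join(hosts)]
    return cmd


if __name__ == "__main__":
    command = build_hive_command(
        ["/usr/local/hive/lib/hive-hbase-handler-0.96.0.jar",
         "/usr/local/hive/lib/zookeeper-3.3.2.jar"],
        ["node1", "node2", "node3"],
    )
    # shlex.join (Python 3.8+) renders the list as a copy-pasteable command line.
    print(shlex.join(command))
```

The resulting list can be handed to subprocess.run as is, which sidesteps shell quoting altogether.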

7. Test Hive

- Create the test table pokes
hive> CREATE TABLE pokes (foo INT, bar STRING);
OK
Time taken: 1.842 seconds
hive> show tables;
OK
pokes
Time taken: 0.182 seconds, Fetched: 1 row(s)
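- Load data into pokes

hive> LOAD DATA LOCAL INPATH './examples/files/kv1.txt' OVERWRITE INTO TABLE pokes;

Then list the files in Hadoop: bin/hadoop dfs -ls /hive/warehousedir
A new directory shows up: drwxr-xr-x - hadoop supergroup 0 09:06 /hive/warehousedir/pokes

Note: with Derby as the metastore, running hive creates a derby.log file and a metastore_db directory in the current directory. The drawback of this mode is that only one Hive client at a time can use the database from a given directory; a second one fails with an error.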

4. Installing with a MySQL database

Install MySQL
- On Ubuntu, install with apt-get:
  sudo apt-get install mysql-server
- Create the metastore database:
  create database hiveMeta;
- Create the hive user and grant it privileges:
  grant all on hiveMeta.* to hive@'%' identified by 'hive';
  flush privileges;

After that, only hive-site.xml needs to change.

Edit hive-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/hive/scratchdir</value>
  <description>Scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/${user.name}</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://192.168.1.214:3306/hiveMeta?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
  <description>password to use against metastore database</description>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/hive/warehousedir</value>
  <description>location of default database for the warehouse</description>
</property>
<property>
 <name>hive.aux.jars.path</name>
  <value>
  file:///home/hadoop/hive-0.12.0/lib/hive-ant-0.13.0-SNAPSHOT.jar,
  file:///home/hadoop/hive-0.12.0/lib/protobuf-java-2.4.1.jar,
  file:///home/hadoop/hive-0.12.0/lib/hbase-client-0.96.0-hadoop2.jar,
  file:///home/hadoop/hive-0.12.0/lib/hbase-common-0.96.0-hadoop2.jar,
  file:///home/hadoop/hive-0.12.0/lib/zookeeper-3.4.5.jar,
  file:///home/hadoop/hive-0.12.0/lib/guava-11.0.2.jar
  </value>
</property>
</configuration>

In jdbc:mysql://192.168.1.214:3306/hiveMeta?createDatabaseIfNotExist=true, hiveMeta is the name of the MySQL database, and createDatabaseIfNotExist makes the driver create it automatically if it does not exist.
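
To make the parts of that connection string explicit, here is a small sketch that splits a MySQL-style JDBC URL into host, port, database and options using Python's standard library. The function name parse_jdbc_url is an assumption for illustration, and the approach does not apply to the Derby URL, which has a different shape.

```python
from urllib.parse import parse_qs, urlsplit


def parse_jdbc_url(jdbc_url):
    """Split a jdbc:<driver>://host:port/database?options URL into its parts."""
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix):
        raise ValueError("not a JDBC URL: " + jdbc_url)
    # Dropping "jdbc:" leaves an ordinary URL that urlsplit understands.
    parts = urlsplit(jdbc_url[len(prefix):])
    return {
        "driver": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
        "options": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


if __name__ == "__main__":
    url = "jdbc:mysql://192.168.1.214:3306/hiveMeta?createDatabaseIfNotExist=true"
    print(parse_jdbc_url(url))
```

Note that the database name is case-sensitive on a typical Linux MySQL installation, so the name in the URL has to match the created database exactly.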

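Starting Hive with a local MySQL:

Simply run # bin/hive.

Starting Hive with a remote MySQL:

Server side (on 192.168.1.214, the master machine):

Start a MetaStoreServer on the server; clients then reach the metadata database through the MetaStoreServer over the Thrift protocol.

Starting Hive here splits into starting the metastore and starting hiveserver. The metastore handles communication with MySQL when table structures are created or updated, and hiveserver accepts client connections. Both must be started, with the following commands.

Start the metastore (required for remote MySQL):

hive --service metastore -hiveconf hbase.zookeeper.quorum=node1,node2,node3 -hiveconf hbase.zookeeper.property.clientPort=2222

Start hiveserver:

hive --service hiveserver -hiveconf hbase.zookeeper.quorum=node1,node2,node3 -hiveconf hbase.zookeeper.property.clientPort=2222

This starts the service that jdbc:hive connects to, on port 10000 by default. The trailing -hiveconf options must be included, otherwise connections from eclipse fail. Once it is up, jdbc:hive can be used from eclipse, for example:

Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
Connection conn = DriverManager.getConnection("jdbc:hive://server1:10000/hiveMeta", "root", "111111");
return conn;

In use it is very similar to an ordinary database, apart from some differences in the table creation statements.

You can of course also run the following from hive-0.12.0/bin:

hive -hiveconf hive.root.logger=DEBUG,console -hiveconf hbase.zookeeper.quorum=server2,server3 -hiveconf hbase.zookeeper.property.clientPort=2222

Here hbase.zookeeper.property.clientPort is the zookeeper port configured in hbase-site.xml.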

hive-site.xml on the Hive client:

<?xml version="1.0"?>  
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>  
   
<configuration>  
  
<property>  
  <name>hive.metastore.warehouse.dir</name>  
  <value>/hive/warehousedir</value>  
</property>  
   
<property>  
  <name>hive.metastore.local</name>  
  <value>false</value>  
</property>  
  
<property>  
  <name>hive.metastore.uris</name>  
  <value>thrift://192.168.1.214:9083</value>  
</property>  
  
</configuration>

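This configures the port used for Thrift access: thrift://192.168.1.214:9083 is the address through which the Hive metadata is reached.

Enter the Hive client and run show tables;
At this point you can test from any shell on Linux, or connect to Hive from eclipse and test there, just as you would connect to an ordinary database over JDBC.

The Hive server and client can also live on the same machine:
hive-site.xml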

<?xml version="1.0"?>  
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>  
<configuration>  
  
  
<property>  
  <name>hive.exec.scratchdir</name>  
  <value>/hive/scratchdir</value>  
  <description>Scratch space for Hive jobs</description>  
</property>  
  
  
<property>  
  <name>hive.exec.local.scratchdir</name>  
  <value>/tmp/${user.name}</value>  
  <description>Local scratch space for Hive jobs</description>  
</property>  
<property>  
  <name>javax.jdo.option.ConnectionURL</name>  
  <value>jdbc:mysql://192.168.1.214:3306/hiveMeta?createDatabaseIfNotExist=true</value>  
  <description>JDBC connect string for a JDBC metastore</description>  
</property>  
  
  
<property>  
  <name>javax.jdo.option.ConnectionDriverName</name>  
  <value>com.mysql.jdbc.Driver</value>  
  <description>Driver class name for a JDBC metastore</description>  
</property>  
  
  
<property>  
  <name>javax.jdo.option.ConnectionUserName</name>  
  <value>hive</value>  
  <description>username to use against metastore database</description>  
</property>  
  
  
<property>  
  <name>javax.jdo.option.ConnectionPassword</name>  
  <value>hive</value>  
  <description>password to use against metastore database</description>  
</property>  
  
  
<property>  
  <name>hive.metastore.warehouse.dir</name>  
  <value>/hive/warehousedir</value>  
  <description>location of default database for the warehouse</description>  
</property>  
  
<property>  
 <name>hive.aux.jars.path</name>  
  <value>  
  file:///home/hadoop/hive-0.12.0/lib/hive-ant-0.13.0-SNAPSHOT.jar,  
  file:///home/hadoop/hive-0.12.0/lib/protobuf-java-2.4.1.jar,  
  file:///home/hadoop/hive-0.12.0/lib/hbase-client-0.96.0-hadoop2.jar,  
  file:///home/hadoop/hive-0.12.0/lib/hbase-common-0.96.0-hadoop2.jar,  
  file:///home/hadoop/hive-0.12.0/lib/zookeeper-3.4.5.jar,  
  file:///home/hadoop/hive-0.12.0/lib/guava-11.0.2.jar  
  </value>
</property>

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://192.168.1.214:9083</value>
</property>

</configuration>

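5. Integrating with HBase

The tables created in the tests so far were all native Hive tables, not tables backed by HBase. Now we come back to the HBase integration.

1. Create a table that HBase recognizes: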

CREATE TABLE hbase_table_1(key int, value string)  
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'  
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")  
TBLPROPERTIES ("hbase.table.name" = "xyz");

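hbase.table.name sets the name of the table in HBase.

hbase.columns.mapping maps the Hive columns, in order, to the HBase row key (:key) and to column-family:qualifier pairs (here cf1:val).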
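
The mapping is positional: the first Hive column goes with the first mapping entry, the second with the second, and so on. The sketch below illustrates that pairing only. It is not Hive's actual parser, the function name pair_columns is invented, and as a simplifying assumption it requires an explicit :key entry, as in the hbase_table_1 statement above.

```python
def pair_columns(hive_columns, columns_mapping):
    """Pair Hive column names with hbase.columns.mapping entries by position."""
    entries = [entry.strip() for entry in columns_mapping.split(",")]
    if len(entries) != len(hive_columns):
        raise ValueError(
            f"{len(hive_columns)} Hive columns but {len(entries)} mapping entries"
        )
    if entries.count(":key") != 1:
        raise ValueError("expected exactly one :key entry in the mapping")
    pairs = {}
    for column, entry in zip(hive_columns, entries):
        if entry == ":key":
            pairs[column] = "row key"
        else:
            # "cf1:val" -> column family "cf1", qualifier "val"
            family, _, qualifier = entry.partition(":")
            pairs[column] = (family, qualifier)
    return pairs


if __name__ == "__main__":
    print(pair_columns(["key", "value"], ":key,cf1:val"))
```

For hbase_table_1 this pairs the Hive column key with the HBase row key and value with cf1:val, which is why the put on cf1:val shown later appears in the value column on the Hive side.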

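The table now shows up in HBase too, and rows added on either side are visible on the other in real time.

You can log in to HBase to look at the data:

# bin/hbase shell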
hbase(main):001:0> describe 'xyz'  
hbase(main):002:0> scan 'xyz'  
hbase(main):003:0> put 'xyz','100','cf1:val','www.360buy.com'

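The row just inserted through HBase is now visible in Hive.

2. Importing data with SQL

A table integrated with HBase cannot be filled with LOAD the way a native table can; the data has to be inserted from an existing table, e.g. insert overwrite table hbase_table_1 select * from pokes. Note that the column types on both sides must match, otherwise the insert overwrite will not import any data.

Import into hbase_table_1 with SQL: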

hive> INSERT OVERWRITE TABLE hbase_table_1 SELECT * FROM pokes WHERE foo=86;

3. Accessing an existing HBase table from Hive

Use CREATE EXTERNAL TABLE:

CREATE EXTERNAL TABLE hbase_table_2(key int, value string)        
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'  
WITH SERDEPROPERTIES ("hbase.columns.mapping" = "cf1:val")  
TBLPROPERTIES("hbase.table.name" = "some_existing_table");

Reference: http://wiki.apache.org/hadoop/Hive/HBaseIntegration

6. Troubleshooting

Running show tables in bin/hive fails with:

Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

If you installed with the Derby database, check whether

<property>
<name>hive.metastore.warehouse.dir</name>
<value>/hive/warehousedir</value>
<description>location of default database for the warehouse</description>
</property>

is configured correctly,

or whether the database in

<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>

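can be accessed with sufficient permissions.

If the metastore is configured to use MySQL, check the permissions by starting Hive with debug logging: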

bin/hive -hiveconf hive.root.logger=DEBUG,console

show tables will then print an error message such as java.sql.SQLException: Access denied for user 'hive'@'××××8' (using password: YES).

Running

CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");

fails with:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:org.apache.hadoop.hbase.MasterNotRunningException: Retried 10 times

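This error is caused by a conflict between the HBase jars that were copied in and the HBase jars bundled with Hive. Deleting hbase-0.94.×××.jar from hive/lib fixes it.

The hive-0.12**.jar files have to be moved out as well.

Running

hive> select uid from user limit 100;

fails with: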

java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.

Fix: edit $HIVE_HOME/conf/hive-env.sh and add

export HADOOP_HOME=<hadoop-install-dir>

7. Accessing Hive through Thrift (with PHP as the client)

Connecting to Hive from PHP to run SQL queries

Prerequisites for connecting to Hive from PHP:

1. Download Thrift

wget http://mirror.bjtu.edu.cn/apache//thrift/0.9.1/thrift-0.9.1.tar.gz

2. Unpack

tar -xzf thrift-0.9.1.tar.gz

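3. Build and install:

When building from a source checkout, ./bootstrap.sh has to be run first to generate ./configure. The tarball downloaded here already ships with a configure script (see the README file):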

If you are building from the first time out of the source repository, you will
need to generate the configure scripts.  (This is not necessary if you
downloaded a tarball.)  From the top directory, do:
./bootstrap.sh

./configure

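Thrift installation steps:

# ./configure --without-ruby

This builds without Ruby support. Then run:

make && make install

If libevent and libevent-devel are not installed, install these two dependencies first: yum -y install libevent libevent-devel

Strictly speaking, Thrift is a tool for generating client and server code, and it is not actually used for that here.

Once installed, start the Hive Thrift service:

# ./hive --service hiveserver >/dev/null 2>&1 &

Check whether HiveServer's default port 10000 is open; if it is, the service started successfully. The official wiki has an article about it: https://cwiki.apache.org/confluence/display/Hive/HiveServer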
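
The port check can be scripted. The following is a minimal sketch using only Python's standard socket module; the function name is_port_open and the 3-second timeout are assumptions chosen for this example. It only tells you that something accepts TCP connections on the port, not that it is actually HiveServer.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call (host and port taken from the setup in this post):
#   is_port_open("192.168.1.214", 10000)
```

The same function works for the metastore Thrift port 9083 used in hive.metastore.uris.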

Thrift Hive Server

HiveServer is an optional service that allows a remote client to submit requests to Hive, using a variety of programming languages, and retrieve results. HiveServer is built on Apache Thrift^TM(http://thrift.apache.org/), therefore it is sometimes called the Thrift server although this can lead to confusion because a newer service named HiveServer2 is also built on Thrift.

Thrift's interface definition language (IDL) file for HiveServer is hive_service.thrift, which is installed in $HIVE_HOME/service/if/.

WARNING!

HiveServer cannot handle concurrent requests from more than one client. This is actually a limitation imposed by the Thrift interface that HiveServer exports, and can't be resolved by modifying the HiveServer code.
HiveServer2 is a rewrite of HiveServer that addresses these problems, starting with Hive 0.11.0. See HIVE-2935.

Once Hive has been built using steps in Getting Started, the Thrift server can be started by running the following:

0.8 and Later

$ build/dist/bin/hive --service hiveserver --help
usage: hiveserver
 -h,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --maxWorkerThreads <arg>      maximum number of worker threads,
                                  default:2147483647
    --minWorkerThreads <arg>      minimum number of worker threads,
                                  default:100
 -p <port>                        Hive Server port number, default:10000
 -v,--verbose                     Verbose mode
 
$ bin/hive --service hiveserver

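Download the PHP client package:

The hive-0.12 package does ship with a PHP lib, but in testing it raised PHP syntax errors: the namespace name was actually empty.

I uploaded a PHP client package: http://download.csdn.net/detail/hguisu/6913673 (original source: http://download.csdn.net/detail/jiedushi/3409880)

PHP client code for connecting to Hive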

<?php  
// Path to the Thrift packages that the PHP Hive client depends on
ini_set('display_errors', 1);  
error_reporting(E_ALL);  
$GLOBALS['THRIFT_ROOT'] = dirname(__FILE__). "/";  
// load the required files for connecting to Hive  
require_once $GLOBALS['THRIFT_ROOT'] . 'packages/hive_service/ThriftHive.php';  
require_once $GLOBALS['THRIFT_ROOT'] . 'transport/TSocket.php';  
require_once $GLOBALS['THRIFT_ROOT'] . 'protocol/TBinaryProtocol.php';  
// Set up the transport/protocol/client  
$transport = new TSocket('192.168.1.214', 10000);  
$protocol = new TBinaryProtocol($transport);  
  
//$protocol = new TBinaryProtocolAccelerated($transport);  
  
$client = new ThriftHiveClient($protocol);  
$transport->open();  
  
// run queries, metadata calls etc  
  
$client->execute('show tables');  
var_dump($client->fetchAll());  
$transport->close();  
  
?>

Reposted from: http://blog.csdn.net/hguisu/article/details/7282050

Original article by s19930811. If you repost it, please credit the source: http://www.www58058.com/3079
