Hive Learning 01 - Environment Setup

Environment preparation:

After the preparation, my local directory layout is as follows:

D:\hadoop
├─ hadoop-2.7.7
├─ jdk1.8.0_171
└─ apache-hive-2.1.1-bin

Hadoop environment setup

Step 0: Set the environment variable: HADOOP_HOME=D:\hadoop\hadoop-2.7.7
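
The later steps assume this variable is visible in whatever command prompt you use (it also helps to add %HADOOP_HOME%\bin to PATH); a minimal check from a freshly opened prompt:

echo %HADOOP_HOME%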

Step 1: Edit the configuration files (a sanity check follows the list):

  • core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>    
</configuration>
  • hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>    
        <name>dfs.namenode.name.dir</name>    
        <value>file:/hadoop/data/dfs/namenode</value>    
    </property>    
    <property>    
        <name>dfs.datanode.data.dir</name>    
        <value>file:/hadoop/data/dfs/datanode</value>  
    </property>
</configuration>
  • mapred-site.xml (renamed from mapred-site.xml.template)
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
  • yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>
  • hadoop-env.cmd
set JAVA_HOME=D:\hadoop\jdk1.8.0_171
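
With all five files edited, a quick sanity check is to print the Hadoop version; if the JAVA_HOME set in hadoop-env.cmd is wrong, this command fails immediately:

D:\hadoop\hadoop-2.7.7\bin>hadoop version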

Step 2: Replace some files in the bin directory

Download the hadoop-2.8.1 folder from https://github.com/steveloughran/winutils and copy everything inside it into D:\hadoop\hadoop-2.7.7\bin, overwriting any duplicates.
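
To confirm the replacement took effect, you can run winutils.exe with no arguments; it should print its usage text rather than a missing-DLL error (a common symptom when the Microsoft Visual C++ runtime is not installed):

D:\hadoop\hadoop-2.7.7\bin>winutils.exe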

Step 3: Go to the D:\hadoop\hadoop-2.7.7\bin directory and run the HDFS format command: hdfs namenode -format (this wipes any existing NameNode metadata, so only run it for a fresh install)
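
On success, the format log should end with a line similar to the following (exact wording varies across versions), confirming that the dfs.namenode.name.dir configured in hdfs-site.xml was initialized:

INFO common.Storage: Storage directory \hadoop\data\dfs\namenode has been successfully formatted.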

Step 4: Go to the D:\hadoop\hadoop-2.7.7\sbin directory and run the start command: start-all.cmd

Under normal circumstances, four command-prompt windows will now open, one per Hadoop daemon; run jps to check that everything started (the four daemons are marked below; the RunJar entries are other jar-launched Java processes that happened to be running):

D:\hadoop>jps
4912 RunJar
14532 Jps
23508 RunJar
6036 DataNode           ## 1
19800 ResourceManager   ## 2
21436 NameNode          ## 3
6556 NodeManager        ## 4

That completes the Hadoop setup.
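
Before moving on to Hive, it is worth confirming that HDFS actually accepts writes; a minimal smoke test (the path is an arbitrary example):

D:\hadoop\hadoop-2.7.7\bin>hdfs dfs -mkdir -p /smoketest
D:\hadoop\hadoop-2.7.7\bin>hdfs dfs -ls /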

Hive environment setup

Step 1: Preparation

  • Download mysql-connector-java-5.1.37.jar (or another version of the connector) from Maven and place it in the lib folder under the Hive directory
  • Set the Hive environment variable: HIVE_HOME=D:\hadoop\apache-hive-2.1.1-bin (a quick check follows this list)
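
A quick way to confirm both preparation steps took effect, from a fresh command prompt:

echo %HIVE_HOME%
dir %HIVE_HOME%\lib | findstr mysql-connector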

Step 2: Edit the configuration files:

  • Edit hive-env.sh
export HADOOP_HOME=D:\hadoop\hadoop-2.7.7
export HIVE_CONF_DIR=D:\hadoop\apache-hive-2.1.1-bin\conf
export HIVE_AUX_JARS_PATH=D:\hadoop\apache-hive-2.1.1-bin\lib
  • Edit hive-site.xml (a note on the HDFS directories follows the file)
<!-- Settings changed from the defaults -->
<property>
    <name>hive.metastore.warehouse.dir</name>
    <!-- Hive data warehouse directory; this path lives on HDFS -->
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
</property>
<property>
    <name>hive.exec.scratchdir</name>
    <!-- Hive scratch (temporary) directory; this path lives on HDFS -->
    <value>/tmp/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
    <name>hive.exec.local.scratchdir</name>
    <!-- Local directory -->
    <value>D:/hadoop/apache-hive-2.1.1-bin/hive/iotmp</value>
    <description>Local scratch space for Hive jobs</description>
</property>
<property>
    <name>hive.downloaded.resources.dir</name>
    <!-- Local directory -->
    <value>D:/hadoop/apache-hive-2.1.1-bin/hive/iotmp</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
    <name>hive.querylog.location</name>
    <!-- Local directory -->
    <value>D:/hadoop/apache-hive-2.1.1-bin/hive/iotmp</value>
    <description>Location of Hive run time structured log file</description>
</property>
<property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>D:/hadoop/apache-hive-2.1.1-bin/hive/iotmp/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>

<!-- Newly added settings -->
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?characterEncoding=UTF-8</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>***</value>
</property>

<!-- Fixes: Required table missing : "`VERSION`" in Catalog "" Schema "". DataNucleus requires this table to perform its persistence operations. Either your MetaData is incorrect, or you need to enable "datanucleus.autoCreateTables" -->
<property>
    <name>datanucleus.autoCreateSchema</name>
    <value>true</value>
</property>
<property>
    <name>datanucleus.autoCreateTables</name>
    <value>true</value>
</property>
<property>
    <name>datanucleus.autoCreateColumns</name>
    <value>true</value>
</property>

<!-- Fixes: Caused by: MetaException(message:Version information not found in metastore.) -->
<property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
    <description>
        Enforce metastore schema version consistency.
        True: Verify that version information stored in metastore matches with one from Hive jars. Also disable automatic
              schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
              proper metastore schema migration. (Default)
        False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
    </description>
</property>
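
Note that hive.metastore.warehouse.dir and hive.exec.scratchdir point at HDFS, so those directories should exist and be group-writable before Hive first runs; one common preparation, along the lines of the stock Hive setup instructions:

D:\hadoop\hadoop-2.7.7\bin>hdfs dfs -mkdir -p /user/hive/warehouse
D:\hadoop\hadoop-2.7.7\bin>hdfs dfs -mkdir -p /tmp/hive
D:\hadoop\hadoop-2.7.7\bin>hdfs dfs -chmod g+w /user/hive/warehouse
D:\hadoop\hadoop-2.7.7\bin>hdfs dfs -chmod g+w /tmp/hive

Also, the datanucleus.autoCreate* settings above are a workaround; Hive 2.x ships a schematool that creates the metastore tables explicitly (hive --service schematool -dbType mysql -initSchema, run once after the MySQL step below), which is the documented alternative, though on Windows this may depend on the wrapper scripts bundled with your distribution.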

Step 3: MySQL setup

create database hive default character set latin1;

Note: if you create the database as UTF-8 instead, schema creation fails with "Index column size too large. The maximum column size is 767 bytes." InnoDB limits index key prefixes to 767 bytes, and utf8 reserves up to 3 bytes per character, so the metastore's long indexed varchar columns exceed the limit; latin1, at 1 byte per character, stays under it.

Step 4: Start the metastore service: hive --service metastore
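
The metastore listens on port 9083 by default; from a second command prompt you can confirm it is up before starting the CLI:

netstat -ano | findstr 9083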

Step 5: Start Hive: hive

That completes the Hive installation and deployment.
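
As a final smoke test, a throwaway table created from the Hive CLI should show up under the warehouse directory configured earlier (the table name is an arbitrary example):

hive> create table smoke_test (id int);
hive> show tables;
hive> drop table smoke_test;

While the table exists, hdfs dfs -ls /user/hive/warehouse should list a smoke_test subdirectory.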