These notes are for HCatalog 0.4.0 release. Highlights ========== HCatalog is a table management and storage management layer for Hadoop that enables users with different data processing tools – Pig, MapReduce, and Hive, to easily share data on the grid. HCatalog’s table abstraction presents users with a relational view of data in the Hadoop distributed file system (HDFS) and ensures that users need not worry about where or in what format their data is stored – RCFile format, text files, sequence files. System Requirements =================== 1. Java 1.6.x or newer, preferably from Sun. Set JAVA_HOME to the root of your Java installation 2. Ant build tool, version 1.8 or higher: http://ant.apache.org - to build source only 3. This release is compatible with Hadoop 1.0 4. This release is compatible with Pig 0.8.1, Pig 0.9.1, and 0.9.2. 5. This release is compatible with Hive 0.9.0. Trying the Release ================== 1. Download hcatalog-src-0.4.0-incubating.tar.gz 2. Follow the directions at http://incubator.apache.org/hcatalog/docs/r0.4.0/install.html 3. If you are upgrading from HCatalog 0.2.0 you will need to upgrade your metastore. The upgrade script can be found in server installation at share/hcatalog/hive/external/metastore/scripts/upgrade/mysql/upgrade-0.7.0-to-0.8.0.mysql.sql This should be done after you install the server and before you start it. Relevant Documentation ====================== See http://incubator.apache.org/hcatalog/docs/r0.4.0 These notes are for HCatalog 0.4.0 release. Changes Since Last Release ========================== HCatalog 0.3 was not released. HCatalog 0.2 was the last release of HCatalog. Major changes since the last release include: - Full support for reading from and writing to Hive. - Support for deeply nested maps, arrays, and structs. - Switch from StorageDrivers to SerDes. See "Backward Incompatibilities" below. - Addition of JSonSerDe to support reading and writing JSON data. - Rather than releasing rpms from HCatalog, we rely on the rpms provided by Apache Bigtop. - The HCatalog binary distribution no longer includes Apache Hive. We now require that Hive first be installed. - The HCatalog source distribution no longer includes Apache Hive source. It now pulls the required jars via maven. For a full list of changes see CHANGES.txt located in the same directory as this file. Backward Incompatibilities ========================== - HCatalog no longer supports its own StorageDriver classes for data (de)serialization. Instead it uses Hive's SerDe classes. - Rather than releasing rpms from HCatalog, we rely on the rpms provided by Apache Bigtop. - The HCatalog binary distribution no longer includes Apache Hive. We now require that Hive first be installed. - The HCatalog source distribution no longer includes Apache Hive source. It now pulls the required jars via maven. Notes ===== HBase integration with HCatalog is experimental and not yet ready for production use.