Installing Apache Thrift On Windows


The following works on version 0.8.0 of Thrift. To solve this problem on Windows 7 x64 with msys + mingw64, I installed the latest OpenSSL from source, configuring for both static and shared libraries:

./configure mingw64 shared
make
make test
make install

Installing just the static version of the library fixed libcrypto not being found; installing the shared version as well fixed libssl too. There were further errors when compiling OpenSSL v1.0.1c's tests: about three of the C test files contained only a file name, which the compiler could not process. Copying the code from dummytest.c (in the same directory) into those problematic files solved the issue.


Note that by default OpenSSL installs into /usr/local/ssl, so you will have to set LDFLAGS and CPPFLAGS to point at those directories when configuring Thrift. From a build directory under mingw64, that was:

./thrift-0.8.0/configure CPPFLAGS=-I/usr/local/ssl/include LDFLAGS=-L/usr/local/ssl/lib CXXFLAGS=-DMINGW

Apache Thrift's compiler is written in C++ and designed to be portable, but there are some system requirements which must be installed prior to use. Select your OS below for a guide on setting up your system: CentOS 6.5 Install, Debian/Ubuntu Install, OS X Install, Windows Install.

The linked example appears to be outdated. When I add this to /etc/profile:

export PYTHONPATH=$PYTHONPATH:/usr/lib/hive/lib/py

I can then do the imports as listed in the link, except that

from hive import ThriftHive

actually needs to be:

from hive_service import ThriftHive

Next, the port in the example was 10000, which caused the program to hang when I tried it. The default Hive Thrift port is 9083, which stopped the hanging.
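If editing /etc/profile isn't convenient, the same PYTHONPATH tweak can be sketched per-script instead; /usr/lib/hive/lib/py is the path from the note above, so adjust it for your own Hive installation:

```python
import sys

# Path from the note above; adjust for your Hive installation.
HIVE_PY = "/usr/lib/hive/lib/py"

if HIVE_PY not in sys.path:
    sys.path.append(HIVE_PY)

# With the path in place, the corrected import becomes possible:
# from hive_service import ThriftHive
```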

I believe the easiest way is to use PyHive. To install it you'll need these libraries:

pip install sasl
pip install thrift
pip install thrift-sasl
pip install PyHive

Please note that although you install the library as PyHive, you import the module as pyhive, all lower-case. If you're on Linux, you may need to install SASL separately before running the above: install the libsasl2-dev package using apt-get, yum, or whatever package manager your distribution uses. For Windows there are some options on GNU.org, and you can download a binary installer.
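As a quick sanity check after the pip installs, a small helper can report which of the four modules actually import (check_modules is just an illustrative name, not part of PyHive); note that pip's thrift-sasl package is imported as thrift_sasl:

```python
import importlib

def check_modules(names):
    """Return {module_name: True/False} depending on whether it imports."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status

# Install names vs. import names: "PyHive" -> pyhive, "thrift-sasl" -> thrift_sasl.
print(check_modules(["sasl", "thrift", "thrift_sasl", "pyhive"]))
```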

On a Mac, SASL should be available if you've installed the Xcode developer tools (xcode-select --install in Terminal). After installation, you can connect to Hive like this:

from pyhive import hive
conn = hive.Connection(host='YOUR_HIVE_HOST', port=PORT, username='YOU')

Now that you have the Hive connection, you have options for how to use it. You can query directly:

cursor = conn.cursor()
cursor.execute('SELECT cool_stuff FROM hive_table')
for result in cursor.fetchall():
    use_result(result)

or use the connection to make a pandas DataFrame:

import pandas as pd
df = pd.read_sql('SELECT cool_stuff FROM hive_table', conn)
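Since pyhive follows the Python DB-API interface, the direct-query pattern above can be wrapped in one small helper; this is a sketch, and run_query is a hypothetical name rather than part of PyHive:

```python
def run_query(conn, sql):
    """Execute sql on any DB-API connection (pyhive.hive.Connection
    included) and return all rows, closing the cursor afterwards."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        cursor.close()
```

Usage would then be, e.g., rows = run_query(conn, 'SELECT cool_stuff FROM hive_table').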

I assume you are using HiveServer2, which is why the code doesn't work. You can use pyhs2 to access Hive correctly, with example code like this:

import pyhs2

with pyhs2.connect(host='localhost',
                   port=10000,
                   authMechanism='PLAIN',
                   user='root',
                   password='test',
                   database='default') as conn:
    with conn.cursor() as cur:
        # Show databases
        print(cur.getDatabases())
        # Execute query
        cur.execute('select * from table')
        # Return column info from query
        print(cur.getSchema())
        # Fetch table results
        for i in cur.fetch():
            print(i)

Note that you may need to install python-devel.x86_64 and cyrus-sasl-devel.x86_64 before installing pyhs2 with pip. I hope this helps.

The examples above are a bit out of date: pyhs2 is no longer maintained. A better alternative is impyla. It has many more features than pyhs2; for example, it has Kerberos authentication, which is a must for us.

from impala.dbapi import connect

conn = connect(host='my.host.com', port=10000)
cursor = conn.cursor()
cursor.execute('SELECT * FROM mytable LIMIT 100')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()

cursor.execute('SELECT * FROM mytable LIMIT 100')
for row in cursor:
    process(row)

Cloudera is now putting more effort into hs2client, a C/C++ HiveServer2/Impala client.

It might be a better option if you push a lot of data to or from Python (it has a Python binding too - ). Don't be confused that some of the above examples talk about Impala; just change the port to 10000 (the default) for HiveServer2, and it'll work the same way as in the Impala examples. It's the same protocol (Thrift) that is used for both Impala and Hive.
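That interchangeability can be sketched as a tiny helper; DEFAULT_PORTS and conn_params are hypothetical names, 21050 is Impala's default HiveServer2-protocol port, and 10000 is HiveServer2's own default:

```python
# Hypothetical helper: build keyword arguments for impala.dbapi.connect().
DEFAULT_PORTS = {"impala": 21050, "hive": 10000}

def conn_params(engine, host):
    """Same Thrift protocol either way; only the default port differs."""
    return {"host": host, "port": DEFAULT_PORTS[engine]}

# Usage sketch (needs a live server):
# from impala.dbapi import connect
# conn = connect(**conn_params("hive", "my.host.com"))
```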

You don't have to do a global INVALIDATE METADATA; you can just do a table-level one: INVALIDATE METADATA schema.table. Even then, I don't understand the downvote, because my code above connects to port 10000, which is a Thrift service of HiveServer2, so you don't have to do any invalidates at all: your SQL commands run directly in Hive.
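The table-level variant can be sketched against any DB-API cursor; invalidate_table is a hypothetical helper name, and the statement only applies when talking to Impala (HiveServer2 on port 10000 needs no invalidation):

```python
def invalidate_table(cursor, schema, table):
    """Refresh Impala's cached metadata for one table rather than
    the whole catalog (a global INVALIDATE METADATA)."""
    cursor.execute("INVALIDATE METADATA {}.{}".format(schema, table))
```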