PyExasol
PyExasol is the Python driver for Exasol. It was originally developed by Vitaly Markov and is now officially supported by Exasol. It is designed to handle the massive volumes of data commonly associated with this database.
You can expect a significant performance improvement over existing ODBC / JDBC solutions in single-process scenarios involving pandas. It is also possible to split a data set across multiple processes and servers to achieve linear scalability, so with PyExasol you are not limited by a single CPU core.
Prerequisites
To run PyExasol, you need:
- An Exasol installation.
- An environment with a Python 3 installation, version 3.6 or above.
- Pip to install additional modules.
- Network access to the Exasol instance from the computer with the Python installation. For example, check that you can ping the instance.
Note: The procedure in this document uses Jupyter Notebook to run the commands. However, Jupyter Notebook is not mandatory for using the PyExasol package.
Procedure
- Launch Anaconda Navigator on your system.
- Launch Jupyter Notebook from the navigator home screen.
When Jupyter Notebook launches, the Jupyter Notebook Dashboard opens in your default web browser.
- Click New > Python 3 on the Jupyter Notebook Dashboard.
The Jupyter Notebook Editor opens in a new tab.
- In a cell in edit mode, enter the command that installs the pyexasol package with pip, and click Run.
The pyexasol package is installed.
You may need to restart the kernel to use the updated package. You can do that by selecting Kernel > Restart in the Jupyter Notebook Editor menu.
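The exact installation command is not reproduced in this document. As a minimal sketch, assuming the package is installed from PyPI under the name pyexasol, the step can be run from a notebook cell in plain Python. The helper names `pip_install_command` and `install` are hypothetical and used only for illustration. In Jupyter, the `%pip install pyexasol` magic is a common shorter alternative.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command that targets the interpreter running this notebook."""
    # Using sys.executable avoids installing into a different Python environment.
    return [sys.executable, "-m", "pip", "install", package]


def install(package):
    """Run pip in a subprocess; raises CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(package))


# In a notebook cell, call (assumption: the PyPI package name is "pyexasol"):
# install("pyexasol")
```

Calling pip through `sys.executable` ties the installation to the kernel's own Python, which is usually what you want when several environments exist on the same machine.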
- In a cell in edit mode, enter the command that imports the pyexasol package, and click Run.
All the required files are loaded into the notebook.
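The import step above can be sketched as follows. The availability check and the helper name `pyexasol_available` are additions for illustration, not part of the original procedure. Once the package is installed, a plain `import pyexasol` is enough.

```python
import importlib.util


def pyexasol_available():
    """Return True if this interpreter can find the pyexasol package."""
    return importlib.util.find_spec("pyexasol") is not None


if pyexasol_available():
    import pyexasol  # makes pyexasol.connect() available in the notebook
    print("pyexasol imported")
else:
    print("pyexasol not found; install it and restart the kernel")
```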
- In a cell in edit mode, enter the command that connects to the Exasol database.
The connect command uses the following parameters:
- dsn: Data source name containing the IP address and port of the Exasol database.
- user: Database username to log in.
- password: Password for the database user.
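The connection step can be sketched with `pyexasol.connect` and the three parameters listed above. This is a minimal example under stated assumptions: the helper names `build_dsn` and `connect_to_exasol` are hypothetical, the host and credentials are placeholders, and 8563 is assumed as the port because it is the usual Exasol default.

```python
def build_dsn(host, port=8563):
    """Join host and port into the 'host:port' form passed as the dsn parameter."""
    # Assumption: 8563 is the port your Exasol instance listens on.
    return f"{host}:{port}"


def connect_to_exasol(dsn, user, password):
    """Open a connection using the dsn, user, and password parameters."""
    # Imported here so this cell can be defined before the package is installed.
    import pyexasol

    return pyexasol.connect(dsn=dsn, user=user, password=password)


# Example call with placeholder values (replace them with your own):
# connection = connect_to_exasol(build_dsn("192.0.2.10"), "my_user", "my_password")
```

Keeping the DSN construction in its own function makes it easy to check the host and port before attempting a connection.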
- Enter the command that loads data from Exasol into a pandas.DataFrame.
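A minimal sketch of this step, assuming an open connection object returned by `pyexasol.connect` and that pandas is installed alongside pyexasol. It uses the connection's `export_to_pandas` method. The helper name `load_dataframe` and the example query are illustrative choices, not taken from the original procedure.

```python
def load_dataframe(connection, query):
    """Run a query through an open pyexasol connection and return the result.

    With a real pyexasol connection, the return value is a pandas.DataFrame.
    """
    return connection.export_to_pandas(query)


# Example call (assumes `connection` was created with pyexasol.connect):
# df = load_dataframe(connection, "SELECT * FROM EXA_ALL_USERS")
# df.head()
```

The query can be replaced by any SELECT statement that the database user is allowed to run.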
Additional Information
For any additional information (examples, reference, best practices), refer to the PyExasol GitHub repository.