BucketFS

This section describes how to use the BucketFS file system in Exasol.

What is BucketFS?

BucketFS is a synchronous file system (also known as a replicated file system) available in the Exasol cluster. Each cluster node can connect to this service through its HTTPS interface and sees exactly the same content.

A BucketFS service contains a number of buckets, and every bucket stores a number of files. Each bucket can have different access privileges. Folders are not supported directly, but if you specify an upload path including folders, these will be created. If all files from a folder are deleted, the folder will be dropped automatically.
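As a rough illustration of how an upload path with folders maps onto a bucket, the following Python sketch builds a file URL and prepares an HTTP PUT request. It is a minimal sketch, not an official client. The host name, bucket name, and password are placeholders, the helper names `bucket_file_url` and `build_upload_request` are hypothetical, and the use of the user name `w` with the bucket's write password in HTTP basic authentication is an assumption to verify against your installation.

```python
import base64
import urllib.request
from urllib.parse import quote


def bucket_file_url(host: str, port: int, bucket: str, path: str, tls: bool = True) -> str:
    """Build the URL of a file inside a bucket (hypothetical helper).

    Folder names in `path` are simply part of the URL; no separate
    "create folder" step is modeled here.
    """
    scheme = "https" if tls else "http"
    clean_path = "/".join(quote(part) for part in path.split("/") if part)
    return f"{scheme}://{host}:{port}/{quote(bucket)}/{clean_path}"


def build_upload_request(url: str, data: bytes, write_password: str) -> urllib.request.Request:
    """Prepare (but do not send) a PUT request for the given file URL.

    Assumption: write access uses basic authentication with user name "w".
    """
    token = base64.b64encode(f"w:{write_password}".encode()).decode()
    request = urllib.request.Request(url, data=data, method="PUT")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Placeholder values for illustration only.
example_url = bucket_file_url("node1.example.com", 2580, "default", "models/weather.csv", tls=False)
example_request = build_upload_request(example_url, b"city,temp\nBerlin,21\n", "write-password")
# To actually send the file: urllib.request.urlopen(example_request)
```

The request is only constructed here, so the sketch can be inspected without a running cluster. Sending it requires a reachable BucketFS service and valid credentials.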

Writing data to BucketFS is an atomic operation. There is no lock on files, so the latest write operation overwrites the file. In contrast to the database itself, BucketFS is a purely file-based system and has no transactional semantics.
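The combination of atomic writes, no locking, and "last write wins" can be pictured with a local analogy. The Python sketch below is not how BucketFS is implemented; it only mimics the described behavior on an ordinary local file system by writing to a temporary file and renaming it into place. The function name `atomic_write` is hypothetical.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` in one step (local analogy only).

    Readers see either the old or the new content, never a partial file.
    No lock is taken, so whichever call finishes last determines the content.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, path)  # atomic rename within the same directory
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Because nothing coordinates concurrent writers, two uploads of the same file name do not merge or conflict: the later one simply replaces the earlier one.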

When scripts are executed in parallel on the Exasol cluster, all instances sometimes have to access the same external data, for example a statistical model or weather data. An external service such as a file server can meet this requirement, but keeping the data locally on the cluster nodes performs better. BucketFS was developed for such use cases, where data is stored synchronously and replicated across the cluster.
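Because every node holds a local copy, a script can read shared data like an ordinary file. The following Python sketch shows the idea. It assumes that bucket contents are visible to scripts under a local path of the form `/buckets/<service>/<bucket>/<file>`; that layout, the service name `bfsdefault`, the bucket name `default`, and the helper names are assumptions for illustration and should be checked against your installation.

```python
import posixpath

# Assumption: local root directory under which buckets are visible to scripts.
BUCKETS_ROOT = "/buckets"


def local_bucket_path(service: str, bucket: str, file_path: str, root: str = BUCKETS_ROOT) -> str:
    """Return the assumed local path of a bucket file (hypothetical helper)."""
    return posixpath.join(root, service, bucket, file_path.lstrip("/"))


def read_bucket_file(service: str, bucket: str, file_path: str, root: str = BUCKETS_ROOT) -> bytes:
    """Read a bucket file through the local file system, without any network call."""
    with open(local_bucket_path(service, bucket, file_path, root), "rb") as handle:
        return handle.read()


# Illustrative names only; prints the path a script instance would open.
print(local_bucket_path("bfsdefault", "default", "models/weather.csv"))
```

Each parallel script instance would open the same path on its own node and, given the replication described above, read the same content.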

For detailed steps on how to create a new BucketFS service and create new buckets, see BucketFS Setup.

For additional information on BucketFS, including how to expand the script languages (for example, by installing additional R packages) or integrate completely new languages into the script framework using BucketFS, see Adding New Packages to Existing Script Languages.

Usage Notes

  • BucketFS provides a default bucket that contains pre-installed script languages (Java, Python, R). For storing larger amounts of user data, we recommend that you create a separate BucketFS instance on a separate partition.
  • The data in BucketFS is replicated locally on every server and automatically synchronized. Therefore, you should not store very large amounts of data in BucketFS.
  • The data in BucketFS is not part of the database backups and has to be backed up manually if required.
  • In a fresh installation of Exasol, the default BucketFS service does not have a TLS port defined. As an admin, you need to add the HTTP or HTTPS port number. The recommended default port for the BucketFS service is 2580.
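Since BucketFS data is not included in database backups, a manual backup could be scripted along the following lines. This Python sketch rests on several assumptions: the bucket is publicly readable (otherwise read credentials would have to be added), a GET request on the bucket URL returns a plain-text listing with one file path per line, and the base URL and bucket name are placeholders. The function names are hypothetical.

```python
import os
import urllib.request
from urllib.parse import quote


def parse_listing(listing: str) -> list[str]:
    """Split a plain-text listing into file paths (assumed: one path per line)."""
    return [line.strip() for line in listing.splitlines() if line.strip()]


def backup_target(backup_dir: str, remote_path: str) -> str:
    """Map a remote file path to a local path inside `backup_dir`.

    Empty, "." and ".." segments are dropped so files stay inside the backup folder.
    """
    parts = [p for p in remote_path.split("/") if p not in ("", ".", "..")]
    return os.path.join(backup_dir, *parts)


def backup_bucket(base_url: str, bucket: str, backup_dir: str) -> list[str]:
    """Download every listed file of a bucket into `backup_dir`."""
    bucket_url = f"{base_url.rstrip('/')}/{quote(bucket)}"
    with urllib.request.urlopen(bucket_url) as response:
        names = parse_listing(response.read().decode("utf-8"))
    saved = []
    for name in names:
        target = backup_target(backup_dir, name)
        if target == backup_dir:
            continue  # nothing usable left in the path
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        with urllib.request.urlopen(f"{bucket_url}/{quote(name)}") as response:
            with open(target, "wb") as out:
                out.write(response.read())
        saved.append(target)
    return saved


# Example call with placeholder values (requires a reachable service):
# backup_bucket("http://node1.example.com:2580", "default", "bucketfs-backup")
```

Each file is read fully into memory before it is written, which fits the recommendation above to keep BucketFS data small. Larger files would call for streamed copying instead.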