Parquet

Connection Details

Storage

  • Azure Data Lake: An enterprise-wide, hyperscale repository for big data analytics workloads. Azure Data Lake lets you capture data of any size, type, and ingestion speed in a single place for operational and exploratory analytics. Choosing it for storage lets you take advantage of Azure’s scalability and security capabilities.

  • Azure Blob: Blob storage is optimized for storing massive amounts of unstructured data. This can include text or binary data, such as documents, media files, or application installers. It is well suited to serving images or documents directly to a browser, storing files for distributed access, and streaming video and audio.

  • Amazon S3: Amazon Simple Storage Service (S3) is a scalable, high-speed, web-based cloud storage service designed for online backup and archiving of data and application programs. It is a good choice if you want a widely adopted, secure cloud storage service that integrates with a wide range of AWS services.

  • Local: The Parquet files are stored on a local drive or network-attached storage that is directly accessible to the system on which the connection is being configured. This may be the preferred option for smaller datasets or when operating within a secure, fast local network.

Authentication

  • AAD Interactive: Authenticates with Azure Active Directory (AAD) through an interactive sign-in by the user.

  • AAD Password: Authenticates non-interactively with a username and password through Azure Active Directory.

  • AAD Managed Identity: Authenticates automatically using a managed identity for Azure resources.

Connection URL

Enter the endpoint URL for your Azure Data Lake storage account. It typically looks like https://<account_name>.dfs.core.windows.net.
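As an illustration of the URL pattern above, the following is a minimal sketch that builds the endpoint from an account name. The function name adls_endpoint and the account name "examplelake" are hypothetical; the check reflects Azure's naming rule that storage account names are 3 to 24 characters long and use only lowercase letters and digits.

```python
import re


def adls_endpoint(account_name: str) -> str:
    """Return the Data Lake (dfs) endpoint URL for a storage account name."""
    # Azure storage account names: 3-24 characters, lowercase letters and digits only.
    if not re.fullmatch(r"[a-z0-9]{3,24}", account_name):
        raise ValueError(f"invalid storage account name: {account_name!r}")
    return f"https://{account_name}.dfs.core.windows.net"


# "examplelake" is a placeholder account name used only for illustration.
print(adls_endpoint("examplelake"))
```

Validating the name before building the URL catches a common mistake (uppercase letters or hyphens in the account name) before any connection attempt is made.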

SAS Token

Enter the generated Shared Access Signature (SAS) token, which grants specific permissions on the storage account.
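A SAS token is a URL query string, so it can be inspected before it is pasted into this field. The sketch below is only an illustration: describe_sas is a hypothetical helper, and the sample token is a made-up placeholder, not a working credential. It reads the standard sp (permissions), se (expiry), and sig (signature) fields.

```python
from urllib.parse import parse_qs


def describe_sas(token: str) -> dict:
    """Summarise the permission and expiry fields of a SAS token string."""
    # A leading "?" is often copied along with the token; drop it before parsing.
    fields = parse_qs(token.lstrip("?"))
    return {
        "permissions": fields.get("sp", [""])[0],  # e.g. "rl" = read + list
        "expiry": fields.get("se", [""])[0],       # UTC expiry timestamp
        "has_signature": "sig" in fields,          # a token without sig is unusable
    }


# Placeholder token for illustration only; the signature is not real.
sample = "?sv=2022-11-02&sp=rl&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"
print(describe_sas(sample))
```

Checking the permissions and expiry this way can help explain a failed connection, for example when the token lacks list permission or has already expired.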

Container

Specify the name of the Azure storage container to use as the root folder.

Database Folder

Enter the path within the container where your Parquet files are stored.
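To show how the connection URL, container, and database folder relate, here is a minimal sketch that joins them into one location. It assumes the usual Data Lake layout of endpoint, then container, then path; database_folder_url is a hypothetical helper and all values are placeholders. How the product itself combines these fields is not stated in this document.

```python
def database_folder_url(endpoint: str, container: str, database_folder: str) -> str:
    """Join endpoint, container, and folder with exactly one slash between parts."""
    # Strip stray slashes so inputs like "/sales/2024/" do not produce "//".
    parts = [endpoint.rstrip("/"), container.strip("/"), database_folder.strip("/")]
    # Skip empty parts, e.g. when the Parquet files sit at the container root.
    return "/".join(part for part in parts if part)


# Placeholder values for illustration only.
print(database_folder_url(
    "https://examplelake.dfs.core.windows.net/", "raw", "/sales/2024/"
))
```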

Delta table

  • Disabled: The system only converts the files and does not create a log.

  • Enabled: The system converts the files and also creates a Delta table log for transactional control.
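For context, in the open Delta Lake format the transaction log is a directory named _delta_log stored alongside the Parquet files. The sketch below assumes that the Enabled option produces this standard layout, which this document does not confirm; is_delta_table is a hypothetical helper that checks a local folder for the log directory.

```python
import tempfile
from pathlib import Path


def is_delta_table(folder: str) -> bool:
    """Return True if the folder contains a Delta Lake transaction log directory."""
    # Delta Lake keeps its commit files under a "_delta_log" subdirectory.
    return (Path(folder) / "_delta_log").is_dir()


# Demonstration on a temporary folder, so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    before = is_delta_table(tmp)         # plain folder, no log directory yet
    (Path(tmp) / "_delta_log").mkdir()
    after = is_delta_table(tmp)          # log directory now present
    print(before, after)
```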

Include subdirectories

Select this option to include Parquet files located in subdirectories of the specified database folder.
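The effect of this option can be pictured with a short sketch for local storage. It is an illustration only, not the product's implementation: find_parquet_files is a hypothetical helper, and the file names are placeholders created in a temporary folder.

```python
import tempfile
from pathlib import Path


def find_parquet_files(database_folder: str, include_subdirectories: bool) -> list[Path]:
    """List .parquet files in a folder, optionally descending into subdirectories."""
    root = Path(database_folder)
    # rglob walks the whole tree; glob looks only at the top level of the folder.
    matches = root.rglob("*.parquet") if include_subdirectories else root.glob("*.parquet")
    return sorted(p for p in matches if p.is_file())


# Build a small placeholder layout: one file at the top level, one in a subfolder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "orders.parquet").touch()
    (root / "2024").mkdir()
    (root / "2024" / "orders_jan.parquet").touch()

    top_only = find_parquet_files(tmp, include_subdirectories=False)
    everything = find_parquet_files(tmp, include_subdirectories=True)
    print(len(top_only), len(everything))
```

With the option off, only orders.parquet is found; with it on, the file under the 2024 subfolder is picked up as well.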
