Fabric Warehouse
Microsoft Fabric Warehouse is the data warehousing service within the Microsoft Fabric ecosystem. Built on an enterprise-grade distributed processing engine, it integrates with Power BI and gives both novice and professional users a unified SaaS experience across Database, Analytics, Messaging, Data Integration, and Business Intelligence workloads.
Server is the SQL connection string of your warehouse. To find it, open your warehouse, click the Settings gear icon in the top-left corner, navigate to the About section, and copy the SQL connection string.
Database name - click Select to pick a database from the list of all databases available on the server, or type the name in by hand.
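As a rough illustration of how the Server and Database name values fit together, the sketch below assembles an ODBC connection string of the kind a generic SQL client could use against the warehouse. The function name, the example endpoint, and the database name are hypothetical placeholders, and the choice of interactive sign-in is an assumption; this is not a description of what Omni Loader does internally.

```python
def build_odbc_connection_string(server: str, database: str) -> str:
    """Assemble an ODBC connection string for a warehouse SQL endpoint."""
    parts = {
        "Driver": "{ODBC Driver 18 for SQL Server}",
        "Server": server,        # the SQL connection string copied from the About section
        "Database": database,    # the database picked or typed in
        "Encrypt": "yes",
        # Assumption: interactive Azure AD sign-in; other modes could be used instead.
        "Authentication": "ActiveDirectoryInteractive",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Placeholder values, not a real endpoint.
conn_str = build_odbc_connection_string(
    "example.datawarehouse.fabric.microsoft.com", "SalesWarehouse"
)
print(conn_str)
```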
Data is loaded into the warehouse by uploading data files (Parquet or CSV) to storage and then having the warehouse ingest them. This means that both Omni Loader and Fabric Warehouse need access to the storage. Because Omni Loader can run both on premises and in the cloud, while Fabric Warehouse is cloud-only, you might have to configure their access differently.
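The ingestion step can be pictured with the T-SQL COPY INTO statement, which Fabric Warehouse provides for bulk-loading Parquet or CSV files from storage. This page does not state which statement Omni Loader actually issues, so treat the following as a sketch under that assumption. The helper name, table name, URL, and token are hypothetical.

```python
def build_copy_into(table: str, file_url: str, file_type: str, sas_token: str) -> str:
    """Build a COPY INTO statement that ingests staged files into a table."""
    file_type = file_type.upper()
    if file_type not in {"PARQUET", "CSV"}:
        raise ValueError(f"Unsupported file type: {file_type}")

    def quote(value: str) -> str:
        # Escape single quotes for a T-SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    # The table name is inserted as-is and is assumed to come from a trusted source.
    return (
        f"COPY INTO {table}\n"
        f"FROM {quote(file_url)}\n"
        "WITH (\n"
        f"    FILE_TYPE = {quote(file_type)},\n"
        "    CREDENTIAL = (IDENTITY = 'Shared Access Signature', "
        f"SECRET = {quote(sas_token)})\n"
        ")"
    )


statement = build_copy_into(
    "dbo.orders",
    "https://examplestorage.blob.core.windows.net/staging/orders/*.parquet",
    "parquet",
    "sv=2022-11-02&sp=rl&sig=PLACEHOLDER",
)
print(statement)
```

The statement only works if the warehouse can reach the storage URL with the supplied credential, which is why both sides need access to the same storage.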
Azure Active Directory (AAD) Managed Identity A feature that provides Azure services with an automatically managed identity in AAD, so developers do not have to manage credentials explicitly. Connection Method: To connect using AAD Managed Identity, grant your application the necessary permissions in AAD. The application then uses its managed identity to authenticate and access Azure resources.
Shared Access Signature (SAS) A secure way to share access to your Azure resources without sharing the actual account key. It is a URI that grants restricted access rights to resources for a specific time. Connection Method: To connect using a SAS, generate a SAS token that includes the specific permissions and constraints you want. This token is then appended to the resource URI when making requests.
Connection string A string that contains the information required to connect to a resource, such as the endpoint and authentication details. When it includes an account key, that key is a shared key that grants access to the resource. Connection Method: To connect using a connection string with an account key, include the connection string in your application configuration. The string contains everything needed to authenticate and connect to the resource, including the account key.
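To make the structure of a connection string concrete, here is a minimal sketch that splits an Azure Storage style connection string into its key/value settings. The account name and key are placeholders, and the parser is an illustrative helper rather than part of any SDK. Note that the value is split on the first "=" only, because base64-encoded account keys often end with "=" padding.

```python
def parse_storage_connection_string(connection_string: str) -> dict[str, str]:
    """Split 'Key=Value;Key=Value' pairs into a dictionary."""
    settings: dict[str, str] = {}
    for segment in connection_string.split(";"):
        if not segment.strip():
            continue  # tolerate a trailing semicolon
        key, separator, value = segment.partition("=")
        if not separator:
            raise ValueError(f"Malformed segment: {segment!r}")
        settings[key.strip()] = value.strip()
    return settings


# Placeholder credentials, not a real account.
example = (
    "DefaultEndpointsProtocol=https;"
    "AccountName=examplestorage;"
    "AccountKey=ZmFrZS1rZXk=;"
    "EndpointSuffix=core.windows.net"
)
settings = parse_storage_connection_string(example)
print(settings["AccountName"], settings["EndpointSuffix"])
```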
Type in your Fabric Connection URL.
A SAS token is a key-based credential used to grant limited access to resources in Azure, such as storage accounts, Service Bus, or other Azure services. The token encapsulates specific permissions and constraints, allowing secure and time-limited access to the resource without exposing the account key.
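A SAS token is a query string, so using it amounts to attaching it to the resource URL. The sketch below shows one way to do that with the Python standard library. The storage URL and the token values are placeholders, and the helper name is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def append_sas_token(resource_url: str, sas_token: str) -> str:
    """Attach a SAS token to a resource URL as its query string."""
    token = sas_token.lstrip("?")  # tokens are sometimes copied with a leading '?'
    parts = urlsplit(resource_url)
    query = f"{parts.query}&{token}" if parts.query else token
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


signed_url = append_sas_token(
    "https://examplestorage.blob.core.windows.net/staging/orders/part-0001.parquet",
    "?sv=2022-11-02&sp=rl&se=2030-01-01T00%3A00%3A00Z&sig=PLACEHOLDER",
)
print(signed_url)
```

In the placeholder token, sp carries the permissions and se the expiry time, which is how the token expresses its restricted, time-limited access.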
Azure Data Lake This storage option is designed for big data analytics and large-scale database migration. It provides a highly scalable and secure repository that can handle massive volumes of data and a wide variety of data types. The connection string to an Azure Data Lake would include detailed endpoint information and utilize secure authentication methods, such as a shared access signature or a service principal, to authorize data transfers. It is an optimal choice when dealing with analytical workloads that require complex transformations and batch processing of data.
Azure Blob Azure Blob Storage is suitable for storing unstructured data in the cloud as blobs (objects). It is ideal for migrating databases that contain large amounts of unstructured data, like multimedia files, documents, or backups. The connection string to Azure Blob Storage will contain necessary endpoint details and employ authentication mechanisms, typically a shared access signature or an account key, to maintain the security and integrity of data during the migration process. This option is preferred for its ease of access, high availability, and performance when working with unstructured data at scale.
Clear before run Empties the destination folder before starting the migration, ensuring no residual data affects the new transfer.
Timestamped Generates a new folder with a unique timestamp for each migration run, keeping historical data intact.
Overwrite Replaces existing files with new ones during the migration, without deleting any files that are not being replaced.
Folder Per Table Organizes data into separate folders for each table, clearing each folder before the migration begins.
Clear after run Removes all data from the destination folder after the migration is completed.
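The differences between these folder options are easier to see side by side. The following sketch simulates them against an in-memory dictionary that stands in for the storage folder. The mode identifiers, file names, and folder layout are all hypothetical and chosen only for illustration; the actual layout Omni Loader produces may differ.

```python
from datetime import datetime, timezone

MODES = {"clear_before_run", "timestamped", "overwrite", "folder_per_table", "clear_after_run"}


def simulate_run(storage, folder, mode, tables, now=None):
    """Return the storage contents (path -> bytes) after one simulated run."""
    if mode not in MODES:
        raise ValueError(f"Unknown mode: {mode}")
    now = now or datetime.now(timezone.utc)
    result = dict(storage)

    def clear(prefix):
        for path in [p for p in result if p.startswith(prefix + "/")]:
            del result[path]

    if mode == "clear_before_run":
        clear(folder)
    # Timestamped runs write into a fresh subfolder, leaving earlier runs intact.
    target = f"{folder}/{now:%Y%m%dT%H%M%SZ}" if mode == "timestamped" else folder
    for table, content in tables.items():
        if mode == "folder_per_table":
            clear(f"{folder}/{table}")
            path = f"{folder}/{table}/data.parquet"
        else:
            path = f"{target}/{table}.parquet"
        result[path] = content  # an existing file at this path is replaced
    if mode == "clear_after_run":
        clear(folder)
    return result


existing = {"staging/old.parquet": b"old", "staging/orders.parquet": b"v1"}
new_data = {"orders": b"v2"}
for mode in sorted(MODES):
    print(mode, sorted(simulate_run(existing, "staging", mode, new_data)))
```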
Parquet A columnar storage file format optimized for use with Big Data frameworks, offering efficient data compression and encoding schemes.
CSV A comma-separated values format that is widely used for representing tabular data and is compatible with most data processing applications.
Snappy A fast compression and decompression library that provides a balance between speed and compression ratio, often used for real-time data processing.
Gzip A widely used compression format that offers a good trade-off between compression ratio and the speed of decompression, suitable for network data transmission.
None No compression is applied, which can be suitable for scenarios where processing speed is more critical than saving storage space or when the data is already compressed.
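The trade-off between compressed and uncompressed output can be tried out with the Python standard library alone. This sketch writes a small synthetic CSV, compresses it with Gzip, and compares the sizes. The sample rows are made up, and Parquet and Snappy are left out because they require third-party packages.

```python
import csv
import gzip
import io

# Synthetic sample data: a header row plus 1000 data rows.
rows = [["id", "name"]] + [[str(i), f"customer-{i}"] for i in range(1000)]

# Serialize the rows as CSV text, then encode to bytes ("None" compression).
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
raw_csv = buffer.getvalue().encode("utf-8")

# Apply Gzip compression to the same bytes.
gzipped_csv = gzip.compress(raw_csv)

print(f"uncompressed: {len(raw_csv)} bytes")
print(f"gzip:         {len(gzipped_csv)} bytes")

# Decompression restores the original bytes exactly.
restored = gzip.decompress(gzipped_csv)
```

Compression costs CPU time on both the writing and the reading side, so the smaller files pay off mainly when storage or network transfer is the bottleneck.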
AAD Password Utilizes Azure Active Directory credentials for authentication, requiring a username and password.
AAD Managed Identity Leverages the managed identity assigned to the Azure resource for a secure, credential-less authentication method.
AAD Interactive Employs an interactive sign-in process that supports multi-factor authentication for enhanced security.
Shared Access Signature Provides delegated access to Azure services using a special token included in the connection string.
Connection string Uses a pre-defined, comprehensive string that typically includes an account key for resource access.
Container The specific storage container within the account that holds the data. A container can be thought of as a sub-directory within a storage account that groups a set of blobs (files) together. The name of the container is typically included in the Connection URL to direct the migration tool to the location where data is written or read during the migration.
Folder A directory within the storage container where the data files are stored during migration. It keeps the exported data organized. For instance, if you are migrating multiple databases, each can have its own folder within the container to keep the data separate. This is particularly useful when dealing with large datasets or when you need to isolate data for different environments or purposes.
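To show how the account, container, and folder combine into a single location, the sketch below composes a storage URL from its parts. The blob.core.windows.net and dfs.core.windows.net suffixes are the standard public endpoints for Azure Blob Storage and Azure Data Lake Storage; the account, container, folder, and file names are placeholders, and the helper itself is hypothetical.

```python
from urllib.parse import quote

ENDPOINT_SUFFIX = {
    "blob": "blob.core.windows.net",     # Azure Blob Storage
    "datalake": "dfs.core.windows.net",  # Azure Data Lake Storage
}


def build_staging_url(account, storage_type, container, folder, file_name=""):
    """Compose https://<account>.<suffix>/<container>/<folder>/<file>."""
    host = f"{account}.{ENDPOINT_SUFFIX[storage_type]}"
    # Split the folder into segments so stray slashes do not produce empty parts.
    segments = [container, *folder.strip("/").split("/"), file_name]
    path = "/".join(quote(segment) for segment in segments if segment)
    return f"https://{host}/{path}"


url = build_staging_url("examplestorage", "blob", "staging", "/sales/2024/", "orders.parquet")
print(url)
# → https://examplestorage.blob.core.windows.net/staging/sales/2024/orders.parquet
```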