
This post examines strategies to reduce data duplication and enhance performance by enabling shared access to data at rest across Snowflake, Databricks, and Microsoft Fabric. These platforms are widely adopted in modern data ecosystems, and each offers lakehouse capabilities that decouple compute from storage. In environments where multiple platforms coexist, sharing storage while orchestrating compute workloads across different engines not only streamlines data management but also drives cost efficiency and can improve query performance.
This post is the first in a three-part series focusing on interoperability amongst Snowflake, Databricks, and Microsoft Fabric. The following list will be updated with URLs as the posts are published:
- Unify data sharing across Snowflake, Databricks, and Fabric for a lakehouse trifecta (this article)
- Snowflake and Microsoft Fabric integration connectivity options (coming soon)
- Databricks and Microsoft Fabric integration connectivity options (coming soon)
While consolidating data solutions onto a single platform is often ideal, many large organizations operate across multiple cloud environments due to team autonomy, legacy investments, or strategic diversification. Consider a scenario where Team B initiates a project to analyze the impact of supply chain disruptions on product sales. They manage and fund their analytics workloads on Cloud Platform B, which houses the sales data. However, the curated supply chain data resides on Cloud Platform A, owned by a separate team. With lakehouse architectures that support cross-platform data sharing, Team B can query data from Cloud Platform A using compute resources exclusively on Cloud Platform B. This model avoids unnecessary data replication, isolates compute to the platform using the data, reduces storage costs, and enables efficient cross-cloud analytics without compromising performance:

When using a modern lakehouse cross-platform architecture as in Figure 1.1 above, the following benefits are possible:
- Cost containment by platform – Data from Cloud Platform A can be used by Cloud Platform B without incurring compute costs in Cloud Platform A. In practice, a team with in-demand data on Cloud Platform A can share large volumes of data with many other teams without incurring additional compute costs.
- Minimize data duplication – With a lakehouse architecture that shares data amongst compatible platforms, fewer redundant copies of the same data are needed across the organization.
- Performance benefits – Sharing data amongst lakehouse cloud platforms can potentially reduce latency by eliminating unnecessary data copy steps.
- Platform flexibility – Large organizations don’t need to standardize on a single platform: existing teams with diverse platforms can realize value faster, data can be integrated sooner after mergers and acquisitions, and vendor lock-in risk is reduced.
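To make the Team B scenario above concrete, below is a minimal sketch of the pattern, assuming Team B runs a Spark engine and Team A has granted storage-level read access to its curated Delta table. The storage paths, table names, and join column are hypothetical placeholders rather than any specific platform’s API:

```python
# Minimal sketch: Team B's compute reads Team A's curated Delta table in
# place, joins it with local sales data, and never copies the remote files.
# All paths and column names below are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # the ambient session in most notebooks

# Team A's supply chain data, read directly from its home storage account
disruptions = spark.read.format("delta").load(
    "abfss://curated@teama-storage.dfs.core.windows.net/supply_chain/disruptions"
)

# Team B's own sales data on Cloud Platform B
sales = spark.read.format("delta").load(
    "abfss://sales@teamb-storage.dfs.core.windows.net/curated/orders"
)

# The cross-platform analysis runs entirely on Team B's compute
impact = sales.join(disruptions, on="product_id", how="inner")
impact.groupBy("disruption_type").sum("order_amount").show()
```

The key point is that only query results move between engines; the underlying data files stay where Team A wrote them.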
Before listing out the options for data sharing amongst Snowflake, Databricks, and Microsoft Fabric, please note the following:
- I limited the options in this article to data sharing using lakehouse architectures. SQL endpoints (all three platforms have them), third-party connectivity options, and other compute-to-compute options were left off this list. Fabric mirroring of Snowflake was included because it creates a lakehouse table as a carbon-copy mirror of a Snowflake table. The upcoming posts listed above will cover other options beyond lakehouse storage.
- I focused on lakehouse integration where the files are interoperable Delta Parquet or Apache Iceberg formats. Other file formats can move across platforms, but metadata compatibility is key to minimizing data duplication for analytic scenarios.
- At the time of writing this article, some of the options are still in Preview. I’ll try to update these articles as the status changes.
- I left off options that are not “out of the box” for the three cloud platforms. For example, some customers will write Delta Parquet files to Azure Data Lake using Fabric and then reuse the files with Databricks (see the sketch after this list). Other customers have successfully used Apache Iceberg change data capture tooling to shift Snowflake Iceberg tables to Fabric.
- There are important details about the connectivity options left out of this article for the sake of simplicity. For example, options are affected if Snowflake or Databricks run in a cloud other than Azure, or when private endpoint and Private Link capabilities are enabled on the platforms. If I covered every nuanced scenario, this article would become a book.
- I consulted colleagues to confirm the accuracy of this list, but if anything is misstated or missing, please let me know and I will make corrections.
- These articles are not an attempt to compare or rank these three cloud platforms. At the time of writing this article I am a Microsoft employee, and all three products are fully supported on Microsoft Azure.
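As an aside, the “write once, reuse elsewhere” pattern mentioned in the notes above is worth illustrating even though it is out of scope for the option list. Below is a minimal sketch, assuming a Fabric Spark notebook writes Delta files to an ADLS Gen2 container that a Databricks workspace can also reach; the storage account, container, and table contents are hypothetical:

```python
# Hedged sketch of the pattern noted above: Fabric writes Delta Parquet files
# to Azure Data Lake, and Databricks later reads the same files in place.
# The abfss path is a hypothetical placeholder; `spark` is the ambient
# session provided by a Fabric or Databricks notebook.
shared_path = "abfss://shared@contosodatalake.dfs.core.windows.net/curated/orders"

# Writer side (e.g., a Fabric Spark notebook):
orders = spark.createDataFrame(
    [(1, "widget", 19.99), (2, "gadget", 4.50)],
    ["order_id", "product", "amount"],
)
orders.write.format("delta").mode("overwrite").save(shared_path)

# Reader side (e.g., a Databricks cluster with access to the same container):
reloaded = spark.read.format("delta").load(shared_path)
reloaded.show()
```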
The diagram below in Figure 1.2 may not initially be easy on the eyes, but if you follow each of the Lakehouse Connectivity Options one at a time, you can walk through the different ways to share lakehouse data amongst Snowflake, Databricks, and Microsoft Fabric:

Figure 1.3 below lists each of the nine options above with availability status, potential use case scenarios, and details about where the data physically resides:

For options 1-7 in Figure 1.3 above, the following table pairs each feature with a reference URL where you can learn more about the capability:
| Feature | Reference URL |
| --- | --- |
| Fabric mirroring of Snowflake DB (Copies Metadata & Data) | Microsoft Fabric Mirrored Databases From Snowflake – Microsoft Fabric \| Microsoft Learn |
| Snowflake write Iceberg to Fabric | CREATE EXTERNAL VOLUME \| Snowflake Documentation |
| Fabric shortcut to Snowflake Iceberg | Use Iceberg tables with OneLake – Microsoft Fabric \| Microsoft Learn |
| Fabric shortcut to Databricks unmanaged Delta Parquet | Unify data sources with OneLake shortcuts – Microsoft Fabric \| Microsoft Learn |
| Fabric mirroring of Databricks Unity Catalog (just Metadata) | Microsoft Fabric Mirrored Catalog From Azure Databricks – Microsoft Fabric \| Microsoft Learn |
| Snowflake read table from Fabric as Iceberg | New in OneLake: Access your Delta Lake tables as Iceberg automatically (Preview) \| Microsoft Fabric Blog \| Microsoft Fabric |
| Databricks read Fabric Delta Parquet via Managed Identity | Integrate OneLake with Azure Databricks – Microsoft Fabric \| Microsoft Learn |
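To give a flavor of what these options look like in practice, here is a hedged sketch of option 2 (Snowflake write Iceberg to Fabric), using the Snowflake Python connector to run the relevant DDL. The account, credential, workspace, and lakehouse values are hypothetical placeholders; consult the CREATE EXTERNAL VOLUME documentation linked above for the exact URL format and authentication steps:

```python
# Hedged sketch: point a Snowflake external volume at OneLake, then create an
# Iceberg table whose data files land in the Fabric lakehouse.
# Every <placeholder> is hypothetical; see the linked docs for real values.
import snowflake.connector

conn = snowflake.connector.connect(
    account="<account>", user="<user>", password="<password>",
    warehouse="<warehouse>", database="<database>", schema="<schema>",
)
cur = conn.cursor()

# External volume targeting a Fabric lakehouse path in OneLake
cur.execute("""
    CREATE EXTERNAL VOLUME IF NOT EXISTS onelake_vol
      STORAGE_LOCATIONS = ((
        NAME = 'fabric_lakehouse'
        STORAGE_PROVIDER = 'AZURE'
        STORAGE_BASE_URL = 'azure://onelake.blob.fabric.microsoft.com/<workspace>/<lakehouse>.Lakehouse/Files/'
        AZURE_TENANT_ID = '<tenant-id>'
      ))
""")

# Iceberg table managed by Snowflake, with data files stored on the volume above
cur.execute("""
    CREATE ICEBERG TABLE IF NOT EXISTS sales_iceberg (id INT, amount NUMBER(10, 2))
      CATALOG = 'SNOWFLAKE'
      EXTERNAL_VOLUME = 'onelake_vol'
      BASE_LOCATION = 'sales_iceberg'
""")
```

On the other side, a hedged sketch of option 7 (Databricks read Fabric Delta Parquet) is even shorter, assuming the cluster is already configured to authenticate to OneLake as described in the linked doc; the workspace, lakehouse, and table names are placeholders:

```python
# Hedged sketch: a Databricks notebook reads a Delta table directly from
# OneLake without copying it. <workspace> and <lakehouse> are placeholders,
# and `spark` is the notebook's ambient SparkSession.
onelake_path = (
    "abfss://<workspace>@onelake.dfs.fabric.microsoft.com/"
    "<lakehouse>.Lakehouse/Tables/sales"
)
sales = spark.read.format("delta").load(onelake_path)
sales.limit(10).show()
```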
Notice that I left options 8-9 out of the URL reference table in Figure 1.4; I was unable to locate official documentation pages from Databricks or Snowflake regarding those capabilities. If anyone has those links, let me know and I’ll add them to the table.
Per the links at the beginning of this article, I will follow up this post with two more posts about 1) Fabric/Snowflake and 2) Fabric/Databricks integration, covering connectivity options beyond shared lakehouse storage.
In summary, solutions built on Snowflake, Databricks, or Fabric can share data across the platforms using lakehouse architecture tools to minimize data duplication, reduce data latency, and optimize costs without consolidating on a single platform.
