Reposted from the Microsoft Healthcare & Life Sciences Blog at this link: https://techcommunity.microsoft.com/blog/healthcareandlifesciencesblog/ingest-healthcare-open-data-into-azure-and-power-bi-using-new-github-repository/2278523

Numerous government agencies make Healthcare Open Data available to the public at no cost. The CDC, CMS, FDA, World Bank, US Census, USDA, and many others publish rich and valuable data sets. These sources are free to use, but they differ in file format, table structure, context, and data granularity. Ingesting all of this data into a common place where it can be used and shared is often time-consuming and challenging. I’ve put together a GitHub repository called Power Pop Health to help with these challenges.

Power Pop Health is a collection of content intended to simplify the process of ingesting and prepping Healthcare Open Data for Analytics, Business Intelligence, Data Science, and more. Power Pop Health has a simple mission: Make it easy for you to ingest, transform and format Healthcare Open Data and common reference tables so that you can achieve more. The GitHub repository can be viewed at this link.

How does Power Pop Health work? I’ve tried to make it simple with low code/no code/no PowerShell deployment so that anyone can use it with nothing more than an Azure subscription and Power BI. Where code is necessary, there are cut-and-paste scripts with tutorial videos for the deployment:

  • Step 1 – Ingest Raw Data into an Azure Data Lake
  • Step 2 – Make the Data usable in Azure and/or Power BI
  • Step 3 – You take it from here! The data is ready to blend with your Organizational data, use for training, create demos, analyze to find trends, etc.

What data is currently available in the first release of Power Pop Health?

Over the last few years I have accumulated examples and tutorials that leverage public Healthcare data. This first release is a repository to share those examples in a unified format, and in one place. Future additions to this repository will be based on feedback from the community, with an initial plan to focus primarily on Population Health data such as Social Determinants of Health. Below is a chart of the data available in this first release:

Here’s a quick summary of each data set in the initial release. Before using these data sources, I’d also recommend reading the licensing terms from the data providers to ensure that you are using the data appropriately:

1. CDC Daily PM 2.5 Concentrations – Air Quality measurements at the level of States and Counties for 2001-2016.
2. CDC Population Weighted UV Irradiance – Ultraviolet Radiation measurements at the level of States and Counties for 2004-2015.
3. CMS DRG / MDC / Surgical Class v38.1 – Diagnosis Related Groups (DRGs), Major Diagnostic Categories (MDCs), and Surgical Class, version 38.1.
4. CMS ICD10 CM 2021 – 2021 ICD-10-CM diagnosis codes for the US.
5. CMS ICD10 PCS 2021 – 2021 ICD-10-PCS procedure codes for the US.
6. Date Table (DataFlows) – A custom Date Table that can be deployed to Power BI DataFlows.
7. Date Table (Power Query) – A custom Date Table that can be deployed to Power BI Power Query.
8. Time Table (DataFlows) – A custom Time Table that can be deployed to Power BI DataFlows.
9. Time Table (Power Query) – A custom Time Table that can be deployed to Power BI Power Query.
10. FCC State & County FIPS – A reference table for State and County FIPS geographical mapping codes provided by the FCC.
11. FDA Food Recall Enforcement Reports – Foods that have been recalled.
12. FDA CAERS Reports (Food Events) – Adverse events attributed to Foods.

13. Medicare Part D Provider Utilization and Payment Data 2013-2018 – I’ll have this data available in the next release, but for now it is available in an end-to-end Azure Synapse and Power BI solution at this link: https://github.com/kunal333/E2ESynapseDemo 
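
To illustrate how the reference tables in the list above might be blended with county-level data once everything is loaded, here is a minimal Python sketch. It is not taken from the Power Pop Health repository, which uses Azure and Power BI tooling. The function names `county_fips` and `build_date_table`, the column names, and the measurement values are all hypothetical. The only facts it relies on are that a county FIPS code is a 2-digit state code followed by a 3-digit county code, and that leading zeros are easily lost when codes are read as numbers.

```python
from datetime import date, timedelta


def county_fips(state_code, county_code):
    """Build a 5-character county key: 2-digit state + 3-digit county, zero-padded."""
    return f"{int(state_code):02d}{int(county_code):03d}"


def build_date_table(start, end):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            "date": current.isoformat(),
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "iso_weekday": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
        })
        current += timedelta(days=1)
    return rows


# Hypothetical reference rows, shaped like a State & County FIPS table.
fips_reference = {
    "01001": {"state": "Alabama", "county": "Autauga County"},
    "06037": {"state": "California", "county": "Los Angeles County"},
}

# Hypothetical county-level measurements; the values are made up for illustration.
# The codes are stored as integers, so their leading zeros have been lost.
measurements = [
    {"state_fips": 1, "county_fips": 1, "date": "2016-01-01", "value": 10.5},
    {"state_fips": 6, "county_fips": 37, "date": "2016-01-02", "value": 12.0},
]

# Index the date table by ISO date string so each measurement can look up its row.
date_table = {
    row["date"]: row
    for row in build_date_table(date(2016, 1, 1), date(2016, 1, 3))
}

enriched = []
for m in measurements:
    key = county_fips(m["state_fips"], m["county_fips"])
    place = fips_reference.get(key, {})
    calendar = date_table.get(m["date"], {})
    enriched.append({
        **m,
        "fips": key,
        "state": place.get("state"),
        "county": place.get("county"),
        "quarter": calendar.get("quarter"),
        "iso_weekday": calendar.get("iso_weekday"),
    })

for row in enriched:
    print(row["fips"], row["county"], row["date"], row["value"])
```

Rebuilding the padded key before joining is the design choice that matters here: without it, a state code such as 1 or 6 would not match the reference table. In Power BI the same idea applies by keeping FIPS columns typed as text.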
