I am looking for best practices for how to share data from a SQL Server database that contains PHI/PII to individuals that cannot view PHI/PII. In short, we maintain a SQL server database that contains 30+ columns of PHI/PII. We need to provide datasets for certain individuals that cannot access the PHI/PII columns, but can access the other fields to conduct different types of analyses.
Current structure: The database is 100GBs and is updated 4 times per day. All data resides on Azure SQL Server. The tables should maintain metadata so blob storage is not ideal (unless someone has an idea to maintain metadata in blob storage). The data will be accessed through PowerBI or Azure Databricks.
Several options come to mind:
- Create a DB Role and deny access to PHI/PII columns
- Create a new SQL server database and an ETL that copies non-PHI/non-PII data from one database to the new database
- Create a new schema and then create views of the tables which do not contain PII/PHI columns and then restrict access to users for this schema only
- Build an script/ leverage Azure Data Factory to copy data from the database to Azure table.
Security is number one priority. Any advice is much appreciated.