I access quite a number of data sources and want to keep track of their properties, such as source path and last update, as well as the metadata of their fields.
I found a similar question, "Best practice for storing record metadata", but could not find a satisfying solution there.
For instance, I downloaded shape data from Natural Earth yesterday from this URL: https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/10m_cultural.zip
Some of the fields in the zip file are useful to me, and I intend to combine them with my World Bank data.
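To make the question concrete, here is a minimal sketch of the kind of record I have in mind for each source. Everything in it is an assumption for illustration: the class names `DataSource` and `FieldInfo`, their attributes, the field name `iso_code`, and the date are placeholders, not an existing schema or the real values.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date
from typing import List


@dataclass
class FieldInfo:
    """Metadata for one field of a source (hypothetical structure)."""
    name: str
    dtype: str
    description: str = ""


@dataclass
class DataSource:
    """Metadata for one data source (attribute names are placeholders)."""
    name: str
    source_path: str
    last_update: date
    fields: List[FieldInfo] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() converts nested dataclasses to plain dicts;
        # the date is turned into an ISO string so json can serialize it.
        record = asdict(self)
        record["last_update"] = self.last_update.isoformat()
        return json.dumps(record)


natural_earth = DataSource(
    name="Natural Earth 10m cultural",
    source_path=(
        "https://www.naturalearthdata.com/http//www.naturalearthdata.com"
        "/download/10m/cultural/10m_cultural.zip"
    ),
    last_update=date(2021, 1, 1),  # placeholder, not the real download date
    fields=[
        # placeholder field; the fields I actually need are not listed here
        FieldInfo(name="iso_code", dtype="text", description="join key"),
    ],
)

print(natural_earth.to_json())
```

A record like this could be written to a JSON file next to the download or to a table in PostgreSQL, which is exactly the part I am unsure about.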
- How would you keep track of the data sources?
- Where would you best store the metadata?
- Can we call this method a Data Catalog?
PS: my setup is Debian 10, PostgreSQL 12, Python 3.9