I am building a database and have implemented a transaction log system to ensure that writes are guaranteed, atomic, and result in a consistent state.
A transaction is only considered ‘committed’ once it has been appended to a log file. While the database is running, a boolean flag stored in memory marks a log file as “full”, and once a log file is full it is eventually persisted to the disk file.
The log contents are also stored in memory, so if an updated record is requested before it has been persisted, the in-memory layer is checked before the disk layer.
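To make the read path concrete, here is a minimal sketch of the lookup order described above. The names `LayeredStore`, `pending`, and `disk` are hypothetical, and a plain dict stands in for the on-disk layer, so this only illustrates the idea and is not the actual implementation.

```python
class LayeredStore:
    """Read-path sketch: unpersisted log entries shadow the disk layer."""

    def __init__(self, disk):
        self.disk = disk      # assumption: a dict standing in for the disk file
        self.pending = {}     # committed but not yet persisted records

    def commit(self, key, value):
        # In a real system the transaction would be appended to the log file
        # before this in-memory copy is updated.
        self.pending[key] = value

    def get(self, key):
        # Check the in-memory layer first, then fall back to the disk layer.
        if key in self.pending:
            return self.pending[key]
        return self.disk.get(key)


store = LayeredStore(disk={"a": 1})
store.commit("a", 2)
print(store.get("a"))  # prints 2, the unpersisted value wins
```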
Each log file stores up to 250 transactions. When it is full, a new log file is created with its number incremented by 1.
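The rotation rule could be sketched roughly as below. The `LogWriter` class and the `<number>.log` naming scheme are assumptions made purely for illustration; only the limit of 250 transactions per file comes from the description above.

```python
import os
import tempfile

MAX_TXNS_PER_LOG = 250  # limit stated in the description


class LogWriter:
    """Appends records to numbered log files, rotating when one is full."""

    def __init__(self, directory):
        self.directory = directory
        self.file_number = 0   # number of the log file currently written to
        self.count = 0         # transactions written to the current file

    def current_path(self):
        # Hypothetical naming scheme: 0.log, 1.log, 2.log, ...
        return os.path.join(self.directory, f"{self.file_number}.log")

    def append(self, record):
        # Start a new log file once the current one holds 250 transactions.
        if self.count == MAX_TXNS_PER_LOG:
            self.file_number += 1
            self.count = 0
        with open(self.current_path(), "a", encoding="utf-8") as f:
            f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the append to disk
        self.count += 1


demo_dir = tempfile.mkdtemp()
writer = LogWriter(demo_dir)
for i in range(251):
    writer.append(f"txn-{i}")
print(sorted(os.listdir(demo_dir)))  # ['0.log', '1.log']
```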
I also want to keep logs for backup/auditing, so each log file name is numbered from 0 to infinity – like
I want a good way to keep track of which log files have been marked as “persisted” and which have not. When the database restarts, it should load the committed but not yet persisted changes into memory, without loading every log file, because scanning all logs would take O(n) time in the total number of log files.
There may be a number of log files at any given time that have not been persisted to disk.
Essentially, what I’m asking is:
- What’s the best way to find out which log files have not been persisted, so that they can be loaded into memory on restart and eventually persisted?
These are the only ideas I currently have:
- Keeping a separate file for the database in which the number of the most recently persisted transaction log file is stored.
  - Problem: this file would need to be updated each time a new log file is generated, which could slow down writes.
- Reading the log files backwards and checking a flag in each one to see whether it was persisted.
  - Problem: very large folders, with hundreds of thousands or millions of log files, could slow things down (I am not sure whether this would be significant).
- Moving persisted files to a separate folder.
  - I am not sure whether this would work, or whether such a move can be done in a way that leaves the database in a consistent state.
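As a rough illustration of the first idea, the sketch below stores the number of the last persisted log in a small checkpoint file and, on restart, probes forward from that number instead of listing the whole folder. It assumes log files are persisted strictly in numeric order and are named `<number>.log`; the file name `checkpoint` and all function names are hypothetical. The checkpoint is replaced with `os.replace`, which swaps the file in a single rename rather than overwriting it in place.

```python
import os
import tempfile

CHECKPOINT_NAME = "checkpoint"  # hypothetical file name


def write_checkpoint(directory, last_persisted):
    """Record the highest log number that has already been persisted."""
    final_path = os.path.join(directory, CHECKPOINT_NAME)
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(str(last_persisted))
        f.flush()
        os.fsync(f.fileno())
    # Rename over the old checkpoint so a reader sees either the old value
    # or the new one, never a half-written file.
    os.replace(tmp_path, final_path)


def read_checkpoint(directory):
    """Return the last persisted log number, or -1 if none is recorded."""
    try:
        with open(os.path.join(directory, CHECKPOINT_NAME), encoding="utf-8") as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return -1


def unpersisted_logs(directory):
    """Probe forward from the checkpoint; cost depends only on the backlog."""
    n = read_checkpoint(directory) + 1
    paths = []
    while os.path.exists(os.path.join(directory, f"{n}.log")):
        paths.append(os.path.join(directory, f"{n}.log"))
        n += 1
    return paths


demo_dir = tempfile.mkdtemp()
for i in range(5):  # create empty stand-ins for 0.log .. 4.log
    open(os.path.join(demo_dir, f"{i}.log"), "w").close()
write_checkpoint(demo_dir, 2)  # logs 0, 1 and 2 are already persisted
print([os.path.basename(p) for p in unpersisted_logs(demo_dir)])  # ['3.log', '4.log']
```

In this sketch the checkpoint only has to be rewritten when a log file finishes being persisted, not on every transaction. A crash between persisting a log and updating the checkpoint would cause that log to be replayed on restart, so replay would need to be idempotent under this design.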