That goal makes sense to me, but depending on how we solve it, it will come with tradeoffs. Tor's current state file has a combination of client-side info (e.g. CircuitBuildTimeBin, Guard, TotalBuildTimes), relay-side info (e.g. BWHistory*, LastRotatedOnionKey, Accounting*, TransportProxy) and items that apply to both or more (e.g. Dormant, MinutesSinceUserActivity). Putting it all in one place reflects Tor's peer-to-peer design where a single Tor instance can play multiple roles, and you lose features if you try to partition state into too few roles -- for example I've seen use cases where a Tor client offering an onion service relies on AccountingMax.
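To make the role groupings above concrete, here is a minimal Python sketch that sorts the keywords of a state file into those buckets. It assumes the state file holds one "Keyword value" entry per line, and the pattern table only contains the examples named above, so it is deliberately incomplete and anything else lands in "unclassified". The names role_of and group_by_role are made up for illustration and are not part of tor.

```python
from fnmatch import fnmatchcase

# Role buckets copied from the examples in the paragraph above.
# This table is incomplete on purpose; it is not an official classification.
ROLE_PATTERNS = {
    "client": ["CircuitBuildTimeBin", "Guard", "TotalBuildTimes"],
    "relay": ["BWHistory*", "LastRotatedOnionKey", "Accounting*", "TransportProxy"],
    "both": ["Dormant", "MinutesSinceUserActivity"],
}


def role_of(keyword):
    """Return the first role whose patterns match the keyword."""
    for role, patterns in ROLE_PATTERNS.items():
        if any(fnmatchcase(keyword, pattern) for pattern in patterns):
            return role
    return "unclassified"


def group_by_role(state_text):
    """Map each role to the keywords seen in the state text, in file order."""
    groups = {}
    for line in state_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        keyword = line.split(None, 1)[0]
        groups.setdefault(role_of(keyword), []).append(keyword)
    return groups
```

Calling group_by_role(open("state").read()) on a copy of a state file would show which entries a strict client/relay split would have to assign somewhere, which is where the tradeoff shows up.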
More generally, all of the entries in the state file really are for persistence, sometimes with security implications (like Guard), sometimes with network health implications (like CircuitBuildTimeBin). You can read more about the current state lines in doc/state-contents.txt in your tor git checkout. And you can read more about the files in your DataDirectory in the FILES section at the bottom of 'man tor'.
Thinking more about it... I think we've already done much of what you requested, in that we've consolidated everything you should want for persistence (besides the keys/ directory) in the state file. (Exceptions are if you're an onion service then you want to keep your onion service keys, and if you offer or use a pluggable transport you'll want to consider your pt_state directory, and if you're a v3 directory authority then there are a bunch more files but there are only 9 of those.)
So: do you want better documentation of state entries, or better partitioning of them by roles, or maybe this is more "can you just make them use less total space"? :)
Thank you for the detailed response, I'll start from your conclusions. I hadn't found the documentation about the state file and it looks excellent. I think the point is that currently it's not intuitive to understand what needs to be saved and in which context. As you rightly pointed out, for us it might be convenient to script the generation of the file from some data we save periodically, but in light of this, I think it would be useful to document:
- which fields are mandatory (if absent, could they prevent tor from starting)
- which fields, if duplicated, can be shortened to the last value (for example CircuitBuildTimeBin)

Basically, if I take a state file from one of our relays, I have a ~9k file and I need to reduce it to <5k net of compression, while trying to lose the least amount of functionality possible. What are the effects of deleting the various CircuitBuildTimeBin and Guard entries that take up 90% of the file?
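For reference, this is roughly how the per-entry share can be measured. It is a minimal Python sketch, assuming one "Keyword value" entry per line in the state file. zlib is only a stand-in here, since the compressor we would actually use may differ, and the function names are made up for illustration.

```python
import zlib
from collections import Counter


def bytes_per_keyword(state_text):
    """Sum the bytes used by each keyword, counting one newline per line."""
    sizes = Counter()
    for line in state_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        keyword = stripped.split(None, 1)[0]
        sizes[keyword] += len(line.encode("utf-8")) + 1
    return sizes


def compressed_size(state_text):
    """Size in bytes after zlib compression (assumed compressor, level 9)."""
    return len(zlib.compress(state_text.encode("utf-8"), 9))


def report(state_text):
    """Return (keyword, bytes, percent of counted bytes), largest first."""
    sizes = bytes_per_keyword(state_text)
    total = sum(sizes.values())
    return [
        (keyword, size, round(100 * size / total, 1))
        for keyword, size in sizes.most_common()
    ]
```

Running report() on a copy of the state file, and compressed_size() before and after dropping a given keyword, would show how much each group of entries really costs against the 5k budget.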