Merge bitcoin/bitcoin#27866: blockstorage: Return on fatal flush errors

d8041d4e04 blockstorage: Return on fatal undo file flush error (TheCharlatan)
f0207e0030 blockstorage: Return on fatal block file flush error (TheCharlatan)
5671c15f45 blockstorage: Mark FindBlockPos as nodiscard (TheCharlatan)

Pull request description:

  The goal of this PR is to establish that fatal blockstorage flush errors should be treated as errors at their call site.

  Prior to this patch `FlushBlockFile` may have failed without returning in `Chainstate::FlushStateToDisk`, leading to a potential write from `WriteBlockIndexDB` that may refer to a block that is not fully flushed to disk yet. By returning if either `FlushUndoFile` or `FlushBlockFile` fail, we ensure that no further write operations take place that may lead to an inconsistent database when crashing. Add `[[nodiscard]]` annotations to them such that they are not ignored in future.

  Functions that call either `FlushUndoFile` or `FlushBlockFile`, need to handle these extra abort cases properly. Since `Chainstate::FlushStateToDisk` already produces an abort error in case of `WriteBlockIndexDB` failing, no extra logic for functions calling `Chainstate::FlushStateToDisk` is required.

  Besides `Chainstate::FlushStateToDisk`, `FlushBlockFile` is also called by `FindBlockPos`, while `FlushUndoFile` is only called by `FlushBlockFile` and `WriteUndoDataForBlock`. For both these cases, the flush error is not further bubbled up. Instead, the error is logged and a comment is provided why bubbling up an error would be less desirable in these cases.

  ---

  This pull request is part of a larger effort towards improving the shutdown / abort / fatal error handling in validation code. It is a first step towards implementing proper fatal error return type enforcement similar as proposed by theuni in this pull request [comment](https://github.com/bitcoin/bitcoin/pull/27711#issuecomment-1563561502). For ease of review of these critical changes, a first step would be checking that `AbortNode` leads to early and error-conveying returns at its call site. Further work for enforcing returns when `AbortNode` is called is done in https://github.com/bitcoin/bitcoin/pull/27862.

ACKs for top commit:
  stickies-v:
    re-ACK d8041d4
  ryanofsky:
    Code review ACK d8041d4e04

Tree-SHA512: 47ade9b873b15e567c8f60ca538d5a0daf32163e1031be3212a3a45eb492b866664b225f2787c9e40f3e0c089140157d8fd1039abc00c7bdfeec1b52ecd7e219
This commit is contained in:
Ryan Ofsky
2023-09-29 12:50:15 -04:00
3 changed files with 47 additions and 10 deletions

View File

@@ -2594,7 +2594,11 @@ bool Chainstate::FlushStateToDisk(
LOG_TIME_MILLIS_WITH_CATEGORY("write block and undo data to disk", BCLog::BENCH);
// First make sure all block and undo data is flushed to disk.
m_blockman.FlushBlockFile();
// TODO: Handle return error, or add detailed comment why it is
// safe to not return an error upon failure.
if (!m_blockman.FlushBlockFile()) {
LogPrintLevel(BCLog::VALIDATION, BCLog::Level::Warning, "%s: Failed to flush block file.\n", __func__);
}
}
// Then update all block file information (which may refer to block and undo files).