-------------------------------- MODULE dbft -------------------------------- EXTENDS Integers, FiniteSets CONSTANTS \* RM is the set of consensus node indexes starting from 0. \* Example: {0, 1, 2, 3} RM, \* RMFault is a set of consensus node indexes that are allowed to become \* FAULT in the middle of every considered behavior and to send any \* consensus message afterwards. RMFault must be a subset of RM. An empty \* set means that all nodes are good in every possible behaviour. \* Examples: {0} \* {1, 3} \* {} RMFault, \* RMDead is a set of consensus node indexes that are allowed to die in the \* middle of every behaviour and do not send any message afterwards. RMDead \* must be a subset of RM. An empty set means that all nodes are alive and \* responding in in every possible behaviour. RMDead may intersect the \* RMFault set which means that node which is in both RMDead and RMFault \* may become FAULT and send any message starting from some step of the \* particular behaviour and may also die in the same behaviour which will \* prevent it from sending any message. \* Examples: {0} \* {3, 2} \* {} RMDead, \* MaxView is the maximum allowed view to be considered (starting from 0, \* including the MaxView itself). This constraint was introduced to reduce \* the number of possible model states to be checked. It is recommended to \* keep this setting not too high (< N is highly recommended). \* Example: 2 MaxView VARIABLES \* rmState is a set of consensus node states. It is represented by the \* mapping (function) with domain RM and range RMStates. I.e. rmState[r] is \* the state of the r-th consensus node at the current step. rmState, \* msgs is the shared pool of messages sent to the network by consensus nodes. \* It is represented by a subset of Messages set. msgs \* vars is a tuple of all variables used in the specification. It is needed to \* simplify fairness conditions definition. vars == <> \* N is the number of validators. N == Cardinality(RM) \* F is the number of validators that are allowed to be malicious. F == (N - 1) \div 3 \* M is the number of validators that must function correctly. M == N - F \* These assumptions are checked by the TLC model checker once at the start of \* the model checking process. All the input data (declared constants) specified \* in the "Model Overview" section must satisfy these constraints. ASSUME /\ RM \subseteq Nat /\ N >= 4 /\ 0 \in RM /\ RMFault \subseteq RM /\ RMDead \subseteq RM /\ Cardinality(RMFault) <= F /\ Cardinality(RMDead) <= F /\ Cardinality(RMFault \cup RMDead) <= F /\ MaxView \in Nat /\ MaxView <= 2 \* RMStates is a set of records where each record holds the node state and \* the node current view. RMStates == [ type: {"initialized", "prepareSent", "commitSent", "cv", "commitAckSent", "blockAccepted", "bad", "dead"}, view : Nat ] \* Messages is a set of records where each record holds the message type, \* the message sender and sender's view by the moment when message was sent. Messages == [type : {"PrepareRequest", "PrepareResponse", "Commit", "CommitAck", "ChangeView"}, rm : RM, view : Nat] \* -------------- Useful operators -------------- \* IsPrimary is an operator defining whether provided node r is primary \* for the current round from the r's point of view. It is a mapping \* from RM to the set of {TRUE, FALSE}. IsPrimary(r) == rmState[r].view % N = r \* GetPrimary is an operator defining mapping from round index to the RM that \* is primary in this round. GetPrimary(view) == CHOOSE r \in RM : view % N = r \* GetNewView returns new view number based on the previous node view value. \* Current specifications only allows to increment view. GetNewView(oldView) == oldView + 1 \* CountCommitted returns the number of nodes that have sent the Commit message \* in the current round or in some other round. CountCommitted(r) == Cardinality({rm \in RM : Cardinality({msg \in msgs : msg.rm = rm /\ msg.type = "Commit"}) /= 0}) \* MoreThanFNodesCommitted returns whether more than F nodes have been committed \* in the current round (as the node r sees it). \* \* IMPORTANT NOTE: we intentionally do not add the "lost" nodes calculation to the specification, and here's \* the reason: from the node's point of view we can't reliably check that some neighbour is completely \* out of the network. It is possible that the node doesn't receive consensus messages from some other member \* due to network delays. On the other hand, real nodes can go down at any time. The absence of the \* member's message doesn't mean that the member is out of the network, we never can be sure about \* that, thus, this information is unreliable and can't be trusted during the consensus process. \* What can be trusted is whether there's a Commit message from some member was received by the node. MoreThanFNodesCommitted(r) == CountCommitted(r) > F \* PrepareRequestSentOrReceived denotes whether there's a PrepareRequest \* message received from the current round's speaker (as the node r sees it). PrepareRequestSentOrReceived(r) == [type |-> "PrepareRequest", rm |-> GetPrimary(rmState[r].view), view |-> rmState[r].view] \in msgs \* -------------- Safety temporal formula -------------- \* Init is the initial predicate initializing values at the start of every \* behaviour. Init == /\ rmState = [r \in RM |-> [type |-> "initialized", view |-> 0]] /\ msgs = {} \* RMSendPrepareRequest describes the primary node r broadcasting PrepareRequest. RMSendPrepareRequest(r) == /\ rmState[r].type = "initialized" /\ IsPrimary(r) /\ rmState' = [rmState EXCEPT ![r].type = "prepareSent"] /\ msgs' = msgs \cup {[type |-> "PrepareRequest", rm |-> r, view |-> rmState[r].view]} /\ UNCHANGED <<>> \* RMSendPrepareResponse describes non-primary node r receiving PrepareRequest from \* the primary node of the current round (view) and broadcasting PrepareResponse. \* This step assumes that PrepareRequest always contains valid transactions and \* signatures. RMSendPrepareResponse(r) == /\ \/ rmState[r].type = "initialized" \* We do allow the transition from the "cv" state to the "prepareSent" or "commitSent" stage \* as it is done in the code-level dBFT implementation by checking the NotAcceptingPayloadsDueToViewChanging \* condition (see \* https://github.com/nspcc-dev/dbft/blob/31c1bbdc74f2faa32ec9025062e3a4e2ccfd4214/dbft.go#L419 \* and \* https://github.com/neo-project/neo-modules/blob/d00d90b9c27b3d0c3c57e9ca1f560a09975df241/src/DBFTPlugin/Consensus/ConsensusService.OnMessage.cs#L79). \* However, we can't easily count the number of "lost" nodes in this specification to match precisely \* the implementation. Moreover, we don't need it to be counted as the RMSendPrepareResponse enabling \* condition specifies only the thing that may happen given some particular set of enabling conditions. \* Thus, we've extended the NotAcceptingPayloadsDueToViewChanging condition to consider only MoreThanFNodesCommitted. \* It should be noted that the logic of MoreThanFNodesCommittedOrLost can't be reliable in detecting lost nodes \* (even with neo-project/neo#2057), because real nodes can go down at any time. See the comment above the MoreThanFNodesCommitted. \/ /\ rmState[r].type = "cv" /\ MoreThanFNodesCommitted(r) /\ \neg IsPrimary(r) /\ PrepareRequestSentOrReceived(r) /\ rmState' = [rmState EXCEPT ![r].type = "prepareSent"] /\ msgs' = msgs \cup {[type |-> "PrepareResponse", rm |-> r, view |-> rmState[r].view]} /\ UNCHANGED <<>> \* RMSendCommit describes node r sending Commit if there's enough PrepareResponse \* messages. RMSendCommit(r) == /\ \/ rmState[r].type = "prepareSent" \* We do allow the transition from the "cv" state to the "prepareSent" or "commitSent" stage, \* see the related comment inside the RMSendPrepareResponse definition. \/ /\ rmState[r].type = "cv" /\ MoreThanFNodesCommitted(r) /\ Cardinality({ msg \in msgs : /\ (msg.type = "PrepareResponse" \/ msg.type = "PrepareRequest") /\ msg.view = rmState[r].view }) >= M /\ PrepareRequestSentOrReceived(r) /\ rmState' = [rmState EXCEPT ![r].type = "commitSent"] /\ msgs' = msgs \cup {[type |-> "Commit", rm |-> r, view |-> rmState[r].view]} /\ UNCHANGED <<>> \* RMSendCommitAck describes node r collecting enough Commit messages and sending \* the CommitAck message. RMSendCommitAck(r) == /\ rmState[r].type /= "bad" /\ rmState[r].type /= "dead" /\ rmState[r].type /= "commitAckSent" /\ rmState[r].type /= "blockAccepted" /\ PrepareRequestSentOrReceived(r) /\ Cardinality({msg \in msgs : msg.type = "Commit" /\ msg.view = rmState[r].view}) >= M /\ rmState' = [rmState EXCEPT ![r].type = "commitAckSent"] /\ msgs' = msgs \cup {[type |-> "CommitAck", rm |-> r, view |-> rmState[r].view]} /\ UNCHANGED <<>> \* RMAcceptBlock describes node r collecting enough CommitAck messages and accepting \* the block. RMAcceptBlock(r) == /\ rmState[r].type = "commitAckSent" /\ Cardinality({msg \in msgs : msg.type = "CommitAck" /\ msg.view = rmState[r].view}) >= M /\ rmState' = [rmState EXCEPT ![r].type = "blockAccepted"] /\ UNCHANGED <> \* RMSendChangeView describes node r sending ChangeView message on timeout. RMSendChangeView(r) == /\ \/ (rmState[r].type = "initialized" /\ \neg IsPrimary(r)) \/ rmState[r].type = "prepareSent" /\ LET cv == [type |-> "ChangeView", rm |-> r, view |-> rmState[r].view] IN /\ cv \notin msgs /\ rmState' = [rmState EXCEPT ![r].type = "cv"] /\ msgs' = msgs \cup {[type |-> "ChangeView", rm |-> r, view |-> rmState[r].view]} \* RMReceiveChangeView describes node r receiving enough ChangeView messages for \* view changing. RMReceiveChangeView(r) == /\ rmState[r].type /= "bad" /\ rmState[r].type /= "dead" /\ rmState[r].type /= "blockAccepted" /\ rmState[r].type /= "commitSent" /\ rmState[r].type /= "commitAckSent" /\ Cardinality({ rm \in RM : Cardinality({ msg \in msgs : /\ msg.type = "ChangeView" /\ msg.rm = rm /\ GetNewView(msg.view) >= GetNewView(rmState[r].view) }) /= 0 }) >= M /\ rmState' = [rmState EXCEPT ![r].type = "initialized", ![r].view = GetNewView(rmState[r].view)] /\ UNCHANGED <> \* RMBeBad describes the faulty node r that will send any kind of consensus message starting \* from the step it's gone wild. This step is enabled only when RMFault is non-empty set. RMBeBad(r) == /\ r \in RMFault /\ Cardinality({rm \in RM : rmState[rm].type = "bad"}) < F /\ rmState' = [rmState EXCEPT ![r].type = "bad"] /\ UNCHANGED <> \* RMFaultySendCV describes sending CV message by the faulty node r. RMFaultySendCV(r) == /\ rmState[r].type = "bad" /\ LET cv == [type |-> "ChangeView", rm |-> r, view |-> rmState[r].view] IN /\ cv \notin msgs /\ msgs' = msgs \cup {cv} /\ UNCHANGED <> \* RMFaultyDoCV describes view changing by the faulty node r. RMFaultyDoCV(r) == /\ rmState[r].type = "bad" /\ rmState' = [rmState EXCEPT ![r].view = GetNewView(rmState[r].view)] /\ UNCHANGED <> \* RMFaultySendPReq describes sending PrepareRequest message by the primary faulty node r. RMFaultySendPReq(r) == /\ rmState[r].type = "bad" /\ IsPrimary(r) /\ LET pReq == [type |-> "PrepareRequest", rm |-> r, view |-> rmState[r].view] IN /\ pReq \notin msgs /\ msgs' = msgs \cup {pReq} /\ UNCHANGED <> \* RMFaultySendPResp describes sending PrepareResponse message by the non-primary faulty node r. RMFaultySendPResp(r) == /\ rmState[r].type = "bad" /\ \neg IsPrimary(r) /\ LET pResp == [type |-> "PrepareResponse", rm |-> r, view |-> rmState[r].view] IN /\ pResp \notin msgs /\ msgs' = msgs \cup {pResp} /\ UNCHANGED <> \* RMFaultySendCommit describes sending Commit message by the faulty node r. RMFaultySendCommit(r) == /\ rmState[r].type = "bad" /\ LET commit == [type |-> "Commit", rm |-> r, view |-> rmState[r].view] IN /\ commit \notin msgs /\ msgs' = msgs \cup {commit} /\ UNCHANGED <> \* RMFaultySendCommitAck describes sending CommitAck message by the faulty node r. RMFaultySendCommitAck(r) == /\ rmState[r].type = "bad" /\ LET ack == [type |-> "CommitAck", rm |-> r, view |-> rmState[r].view] IN /\ ack \notin msgs /\ msgs' = msgs \cup {ack} /\ UNCHANGED <> \* RMDie describes node r that was removed from the network at the particular step \* of the behaviour. After this node r can't change its state and accept/send messages. RMDie(r) == /\ r \in RMDead /\ Cardinality({rm \in RM : rmState[rm].type = "dead"}) < F /\ rmState' = [rmState EXCEPT ![r].type = "dead"] /\ UNCHANGED <> \* Terminating is an action that allows infinite stuttering to prevent deadlock on \* behaviour termination. We consider termination to be valid if at least M nodes \* has the block being accepted. Terminating == /\ Cardinality({rm \in RM : rmState[rm].type = "blockAccepted"}) >= M /\ UNCHANGED <> \* TerminatingFourNodesDeadlock describes node r that is in the state where dBFT \* stucks in a four nodes scenario with one dead node allowed. Allow infinite \* stuttering to prevent TLC deadlock recognition. \* Note that this step is unused in the current specification, however, it may be \* used for further investigations of this deadlock. TerminatingFourNodesDeadlockSameView(r) == /\ Cardinality({rm \in RM : rmState[rm].type = "dead" /\ rmState[rm].view = rmState[r].view}) = 1 /\ Cardinality({rm \in RM : rmState[rm].type = "cv" /\ rmState[rm].view = rmState[r].view}) = 2 /\ Cardinality({rm \in RM : rmState[rm].type = "commitSent" /\ rmState[rm].view = rmState[r].view}) = 1 /\ UNCHANGED <> \* TerminatingFourNodesDeadlock describes node r that is in the state where dBFT \* stucks in a four nodes scenario and the same view. Allow infinite stuttering \* to prevent TLC deadlock recognition. \* Note that this step is unused in the current specification, however, it may be \* used for further investigations of this deadlock. TerminatingFourNodesDeadlock == /\ Cardinality({rm \in RM : rmState[rm].type = "dead"}) = 1 /\ Cardinality({rm \in RM : rmState[rm].type = "cv"}) = 2 /\ Cardinality({rm \in RM : rmState[rm].type = "commitSent"}) = 1 /\ UNCHANGED <> \* Next is the next-state action describing the transition from the current state \* to the next state of the behaviour. Next == \/ Terminating \/ \E r \in RM: RMSendPrepareRequest(r) \/ RMSendPrepareResponse(r) \/ RMSendCommit(r) \/ RMSendCommitAck(r) \/ RMAcceptBlock(r) \/ RMSendChangeView(r) \/ RMReceiveChangeView(r) \/ RMDie(r) \/ RMBeBad(r) \/ RMFaultySendCV(r) \/ RMFaultyDoCV(r) \/ RMFaultySendCommit(r) \/ RMFaultySendCommitAck(r) \/ RMFaultySendPReq(r) \/ RMFaultySendPResp(r) \* Safety is a temporal formula that describes the whole set of allowed \* behaviours. It specifies only what the system MAY do (i.e. the set of \* possible allowed behaviours for the system). It asserts only what may \* happen; any behaviour that violates it does so at some point and \* nothing past that point makes difference. \* \* E.g. this safety formula (applied standalone) allows the behaviour to end \* with an infinite set of stuttering steps (those steps that DO NOT change \* neither msgs nor rmState) and never reach the state where at least one \* node is committed or accepted the block. \* \* To forbid such behaviours we must specify what the system MUST \* do. It will be specified below with the help of fairness conditions in \* the Fairness formula. Safety == Init /\ [][Next]_vars \* -------------- Fairness temporal formula -------------- \* Fairness is a temporal assumptions under which the model is working. \* Usually it specifies different kind of assumptions for each/some \* subactions of the Next's state action, but the only think that bothers \* us is preventing infinite stuttering at those steps where some of Next's \* subactions are enabled. Thus, the only thing that we require from the \* system is to keep take the steps until it's impossible to take them. \* That's exactly how the weak fairness condition works: if some action \* remains continuously enabled, it must eventually happen. Fairness == WF_vars(Next) \* -------------- Specification -------------- \* The complete specification of the protocol written as a temporal formula. Spec == Safety /\ Fairness \* -------------- Liveness temporal formula -------------- \* For every possible behaviour it's true that eventually (i.e. at least once \* through the behaviour) block will be accepted. It is something that dBFT \* must guarantee (an in practice this condition is violated). TerminationRequirement == <>(Cardinality({r \in RM : rmState[r].type = "blockAccepted"}) >= M) \* A liveness temporal formula asserts only what must happen (i.e. specifies \* what the system MUST do). Any behaviour can NOT violate it at ANY point; \* there's always the rest of the behaviour that can always make the liveness \* formula true; if there's no such behaviour than the liveness formula is \* violated. The liveness formula is supposed to be checked as a property \* by the TLC model checker. Liveness == TerminationRequirement \* -------------- ModelConstraints -------------- \* MaxViewConstraint is a state predicate restricting the number of possible \* behaviour states. It is needed to reduce model checking time and prevent \* the model graph size explosion. This formulae must be specified at the \* "State constraint" section of the "Additional Spec Options" section inside \* the model overview. MaxViewConstraint == /\ \A r \in RM : rmState[r].view <= MaxView /\ \A msg \in msgs : msg.view <= MaxView \* -------------- Invariants of the specification -------------- \* Model invariant is a state predicate (statement) that must be true for \* every step of every reachable behaviour. Model invariant is supposed to \* be checked as an Invariant by the TLC Model Checker. \* TypeOK is a type-correctness invariant. It states that all elements of \* specification variables must have the proper type throughout the behaviour. TypeOK == /\ rmState \in [RM -> RMStates] /\ msgs \subseteq Messages \* InvTwoBlocksAccepted states that there can't be two different blocks accepted in \* the two different views, i.e. dBFT must not allow forks. InvTwoBlocksAccepted == \A r1 \in RM: \A r2 \in RM \ {r1}: \/ rmState[r1].type /= "blockAccepted" \/ rmState[r2].type /= "blockAccepted" \/ rmState[r1].view = rmState[r2].view \* InvFaultNodesCount states that there can be F faulty or dead nodes at max. InvFaultNodesCount == Cardinality({ r \in RM : rmState[r].type = "bad" \/ rmState[r].type = "dead" }) <= F \* This theorem asserts the truth of the temporal formula whose meaning is that \* the state predicates TypeOK, InvTwoBlocksAccepted and InvFaultNodesCount are \* the invariants of the specification Spec. This theorem is not supposed to be \* checked by the TLC model checker, it's here for the reader's understanding of \* the purpose of TypeOK, InvTwoBlocksAccepted and InvFaultNodesCount. THEOREM Spec => [](TypeOK /\ InvTwoBlocksAccepted /\ InvFaultNodesCount) ============================================================================= \* Modification History \* Last modified Wed Jun 19 17:51:15 MSK 2024 by anna \* Last modified Mon Mar 06 15:36:57 MSK 2023 by root \* Last modified Sat Jan 21 01:26:16 MSK 2023 by rik \* Created Thu Dec 15 16:06:17 MSK 2022 by anna