Thermodynamic Inference in Partially Accessible Markov Networks: A Unifying Perspective from Transition-Based Waiting Time Distributions

Jann van der Meer, Benjamin Ertel, and Udo Seifert
II. Institut für Theoretische Physik, Universität Stuttgart, 70550 Stuttgart, Germany

(Received 22 March 2022; revised 2 June 2022; accepted 1 July 2022; published 12 August 2022)

The inference of thermodynamic quantities from the description of an only partially accessible physical system is a central challenge in stochastic thermodynamics. A common approach is coarse-graining, which maps the dynamics of such a system to a reduced effective one. While coarse-graining states of the system into compound ones is a well-studied concept, recent evidence hints at a complementary description by considering observable transitions and waiting times. In this work, we consider waiting time distributions between two consecutive transitions of a partially observable Markov network. We formulate an entropy estimator using their ratios to quantify irreversibility. Depending on the complexity of the underlying network, we formulate criteria to infer whether the entropy estimator recovers the full physical entropy production or whether it provides only a lower bound that improves on established results. This conceptual approach, which is based on the irreversibility of underlying cycles, additionally enables us to derive estimators for the topology of the network, i.e., the presence of a hidden cycle, its number of states, and its driving affinity. Adopting an equivalent semi-Markov description, our results can be condensed into a fluctuation theorem for the corresponding semi-Markov process. This mathematical perspective provides a unifying framework for the entropy estimators considered here and established earlier ones. The crucial role of the correct version of time reversal helps to clarify a recent debate on the meaning of formal versus physical irreversibility. Extensive numerical calculations based on a direct evaluation of waiting time distributions illustrate our exact results and provide an estimate of the quality of the bounds for affinities of hidden cycles.

DOI: 10.1103/PhysRevX.12.031025        Subject Areas: Statistical Physics

I. INTRODUCTION

Over the past two decades, stochastic thermodynamics has emerged as a comprehensive universal framework for describing small driven systems [1–5]. One major paradigm comprises a Markovian, i.e., memoryless, dynamics on a set of discrete states, which arises from integrating out fast microscopic degrees of freedom under the assumption of a timescale separation. Such a fairly general Markov network model is of widespread use in the description of chemical and biophysical processes, ranging from chemical reaction networks [6–10] to protein folding [11–13], molecular motors [14–19], and molecular dynamics in general [20–22].

There is, however, a difference between identifying an effective description of a complex system and actually having full access to it in practice. On the arguably coarsest level of description, one is interested in estimation methods for crucial quantities like the entropy production. As a prominent result, the thermodynamic uncertainty relation (TUR) [23–25] provides thermodynamic bounds that can be used in estimation techniques for entropy [26–31] or topology [32,33] if it is possible to measure currents of the underlying system.
These currents are a trace of the fundamental time-reversal asymmetry in dissipative systems [34,35] that can also be utilized directly as an entropy estimator [36–38]. Furthermore, entropy estimators that incorporate or are even based on waiting times between measurable events have been discussed more recently [39–42]. For a partially visible Markov network, entropy production can be estimated through the fraction that is visible in the subsystem through passive observation [43] or by controlling adjustable parameters [44,45].

These methods raise the general issue of how an underlying, only partially accessible system is related to a reduced effective model, a topic known as coarse-graining in stochastic thermodynamics. Earlier interest in the field mainly considered coarse-graining as a mapping in which unresolved Markov states are lumped into compound states, for example, via schemes described in Refs. [46–50]. In general, the resulting system is no longer Markovian, so that a description of the dynamics or the entropy production is formulated in terms of phenomenological, apparent equations [27,51–55]. While particular symmetric systems can be described as semi-Markov processes in this coarse-graining approach [56–58], a general framework to describe situations with incomplete information remains an open issue. To give a recent example [59,60], allowing states that are not contained in any compound state breaks with the well-studied paradigm of state lumping as a coarse-graining scheme. This novel scheme extends our ability to formulate thermodynamically consistent models while also exhibiting new effects such as kinetic hysteresis that require a refined understanding of the relationship between time reversal and coarse-graining.

In this work, we discuss thermodynamic inference based on the observation of a few transitions and their waiting time distributions rather than on the observation of a few states. This strategy has been proposed independently in the very recent Ref. [61], where the corresponding estimator for entropy production is introduced and its properties derived using mainly concepts from information theory, in particular, the Kullback-Leibler divergence. In our complementary approach, which is based on the analysis of cycles, we show that the underlying trajectory-dependent quantity obeys a fluctuation theorem. Our analysis reveals that this estimator is the entropy production of a semi-Markov process. In particular, we show that the description discussed in the present work and in Ref.
[61] shows kinetic hysteresis [59]. Mathematically, this effect is the consequence of a time-reversal operation that differs from the one that is usually employed for semi-Markov processes. In this context, higher-order semi-Markov processes [39] fit into the picture naturally as semi-Markov processes with yet another time-reversal operation. Thus, our mathematical perspective establishes semi-Markov processes as an underlying common model while also highlighting the subtleties involved in identifying the correct time-reversal operation.

Thermodynamic inference is not limited to estimating entropy production. We show that the waiting time distributions allow us to infer topological properties and further thermodynamic quantities like the number of states in cycles and their driving affinity. Furthermore, we propose an inductive scheme to detect the presence of hidden cycles in a complex network.

The paper is structured as follows. In Sec. II, we describe the setup and present our key results qualitatively. The fundamental concepts of our effective description are introduced in Sec. III for the paradigmatic model of a single observed link in a unicyclic Markov network. By generalizing these concepts to multicyclic Markov networks in Sec. IV, we propose and discuss an entropy estimator and inference methods theoretically and numerically. The general framework of multiple observed links in a multicyclic Markov network is discussed in Sec. V. In Sec. VI, we discuss our and related work from the perspective of semi-Markov processes. We conclude with a summary and an outlook on further work in Sec. VII.

II. SETUP AND KEY QUALITATIVE RESULTS

We start with a general Markov network of N interconnected states, e.g., the one shown in Fig. 1(a). At time t, a state i(t) = k is assigned to the physical system, with k = 1, ..., N. The time evolution follows a stochastic description by allowing transitions between two states k and l that are connected by a link (equivalently, an edge) in the network. Quantitatively, these transitions from k to l and their reverse happen instantaneously with transition rates $k_{kl}$ and $k_{lk}$, respectively. We assume that $k_{kl} > 0$ implies $k_{lk} > 0$ to ensure thermodynamic consistency. In the long-time limit $t \to \infty$, the probability $p_k(t)$ to observe the system in a particular state k at time t approaches a constant value $p_k^s$, which characterizes the stationary state of the network.

In general networks, it is possible to walk along closed loops. These are accessed systematically from the network by identifying its cycles C, which are defined as closed, directed loops without self-crossings. From a thermodynamic perspective, cycles are a crucial concept due to their possibility to break time-reversal symmetry by favoring the forward direction over the reverse or vice versa. This preference is quantified by the cycle affinity $A_C$, defined as the logarithm of the product over all forward rates in C divided by the corresponding backward rates,

    A_C = \ln \prod_{(kl) \in C} \frac{k_{kl}}{k_{lk}}.    (1)

As shown in Fig. 1(b), the network from Fig. 1(a) has three different cycles with different affinities. The affinity $A_C$ is also related to the entropy production associated with the cycle C [62,63]. For biochemical reactions or driving along a periodic track by a force, the affinity is given by the free energy change or the dissipated work, respectively [3].
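For readers who want to evaluate Eq. (1) numerically, the following minimal Python sketch (not part of the original paper) computes the affinity of a given cycle from a matrix of transition rates; the rate values and the state labeling are purely illustrative.

```python
import numpy as np

# Hypothetical transition rates k[i][j] for a four-state network as in Fig. 1(a);
# indices 0..3 stand for states 1..4. The numerical values are illustrative only.
k = np.array([
    [0.0, 2.0, 0.0, 1.0],
    [1.0, 0.0, 3.0, 0.0],
    [0.0, 1.0, 0.0, 2.0],
    [2.0, 0.0, 1.0, 0.0],
])

def cycle_affinity(k, cycle):
    """Affinity A_C = ln prod_{(kl) in C} k_kl / k_lk of a closed cycle,
    given as a sequence of states with cycle[0] == cycle[-1]."""
    A = 0.0
    for a, b in zip(cycle[:-1], cycle[1:]):
        A += np.log(k[a, b] / k[b, a])
    return A

# Cycle C0 = (1 2 3 4 1) in the labels of Fig. 1(b), i.e. states 0, 1, 2, 3, 0 here.
print(cycle_affinity(k, [0, 1, 2, 3, 0]))
```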
Cycles C with nonvanishing affinities give rise to macroscopic, sustained flows along their constituent links, even in the limit of large observation times T. These circular flows are the cause of the mean entropy production rate

    \langle \sigma \rangle = \sum_C j_C A_C,    (2)

where $j_C$ is the expected net number of completed cycles C divided by the observation time T in the limit $T \to \infty$ [63–65]. If $\langle \sigma \rangle > 0$, there is a constant rate of dissipation in the stationary state, which is then referred to as a nonequilibrium stationary state (NESS).

Calculating the entropy production via Eq. (2) requires the ideal case of knowing all cycles and all cycle currents, which is not practically feasible, in general. In our setup, we assume that an external observer measures individual transitions along a limited number of edges connecting neighboring states in the Markov network. Conceptually, this approach coincides with the transition-based effective description proposed in Ref. [61]. Notationally, we discern transitions from states by utilizing capital letters I, J, ... and write I = (kl) to express that I is a transition from the Markov state k to the Markov state l. An example illustrating this effective description for observable transitions (23) and (32) in the Markov network from Fig. 1(a) is shown in Figs. 1(c) and 1(d).

The central objects of interest for this effective description are waiting time distributions of the form

    \psi_{I \to J}(t) \equiv P(J, T_J - T_I = t \mid I),    (3)

which quantify the probability density that the transition J is measured at time $T_J = T_I + t$ given that the previous transition I is registered at time $T_I$. With transitions I, J replacing states k, l, waiting time distributions $\psi_{I \to J}(t)$ are the time-resolved analog of transition rates $k_{kl}$. Figures 1(e)–1(g) illustrate the concept of waiting time distributions for the effective description in Figs. 1(c) and 1(d).

FIG. 1. Key concepts of the effective description for an exemplary Markov network. (a) Markov network including four different states. Every link between state i and state j allows for transitions in both directions with respective transition rates $k_{ij}$ and $k_{ji}$. (b) Different cycles within the network. The three different cycles in the network are numbered incrementally starting with cycle $C_0 = (12341)$, drawn as a green dashed curve, cycle $C_1 = (1231)$, drawn as a blue dash-dotted curve, and cycle $C_2 = (1341)$, drawn as an orange dotted curve. By definition in Eq. (1), the affinity of $C_0$ is given by $A_{C_0} = \ln(k_{12}k_{23}k_{34}k_{41}/k_{21}k_{32}k_{43}k_{14})$; $A_{C_1}$ and $A_{C_2}$ are defined analogously. Furthermore, these affinities coincide with $A_C = \ln P(\circlearrowleft)/P(\circlearrowright)$, the quotient of probabilities to observe a completed cycle in the forward and backward direction, respectively [cf. Eq. (7)]. (c) Effective description of the network if only the link between 2 and 3 is observable. Observing this link gives information about transitions between 2 and 3, i.e., (23) and its reverse (32), and intermediate waiting times. (d) Observable cycles in the effective description. Two successive transitions along the observable link indicate the completion of a cycle. As indicated with gray color, only completions of $C_0$ or $C_1$ can be registered, since $C_2$ does not include the observed link. Additionally, $C_0$ and $C_1$ are drawn as curves with the same color, because, by counting transitions without temporal resolution, we cannot distinguish between both cycles. (e) A trajectory and its effective description. The observable parts of a trajectory of the underlying network are transitions (23) and (32) at corresponding transition times. By conditioning the observed transitions on the previous ones, four different waiting time distributions for the different combinations of subsequent transitions can be defined. (f),(g) Waiting time distributions for the observable link for fixed transition rates. The four different waiting time distributions of the observed link are illustrated; they are calculated with the method introduced in Appendix A 3. The particular choice of transition rates is given in Appendix E.

In the following, we derive several remarkable results centered around these waiting time distributions and their underlying semi-Markov description, which are summarized here on a qualitative level.

(1) For a unicyclic network, it is sufficient to determine the $\psi_{I \to J}(t)$ from just one edge in order to infer the affinity of the cycle C and the exact mean entropy production rate $\langle \sigma \rangle$ from the ratio of these distributions. We recover this result of Ref. [61] independently, here based on a microscopic fluctuation theorem from the perspective of network cycles. Since the full entropy production is inferred by this estimator, it beats the TUR, which, in general, does not recover the full entropy production even in a unicyclic network.

(2) For a multicyclic network, the same information from just one edge yields the affinity of the shortest cycle, its length, and the length of the second-shortest cycle this edge is a part of. Second, it yields a lower bound on the largest cycle affinity contributing to the current through this edge. Finally, it provides a lower bound on the overall entropy production of the network that coincides with the bound proposed in Ref. [61]. This bound is shown to be tighter than the entropy estimator in Ref. [44] while also omitting any assumptions of physical control over system parameters at the observed edge.
(3) If several edges can be observed, the estimator of the total entropy production becomes successively tighter. Based on the ratios of the $\psi_{I \to J}(t)$, we establish operational criteria to infer the presence of hidden cycles and of hidden entropy production not accounted for by the estimator.

(4) From a mathematical perspective, observing transitions results in a semi-Markov process. The cycle-based approach of this work and the information-theoretical approach of Ref. [61] can be seen as equivalent strategies to establish the entropy production of the corresponding semi-Markov process. From this point of view, we relate the proposed entropy estimator to the semi-Markov entropy estimator proposed and discussed in Refs. [39,66,67] and highlight the crucial role of the different time-reversal operations.

III. UNICYCLIC NETWORK AS PARADIGM

As an introductory example, we consider a Markov network with only a single cycle C in its NESS. In this network, we observe a single edge between neighboring states k and l that is part of the cycle. We assume that forward and backward transitions along this edge can be distinguished and denote forward transitions (kl) by $I_+$ and backward transitions (lk) by $I_-$, respectively. On the microscopic level, a waiting time distribution of the form $\psi_{I \to J}(t)$ has contributions only from microscopic trajectories $\gamma^t_{I \to J}$ that start with a transition I and end with another one, J, after time t without any other observed transition in between.
With a microscopic path weight $P[\gamma]$ for microscopic trajectories $\gamma$, the waiting time distribution can be expressed as

    \psi_{I \to J}(t) = \sum_{\gamma^t_{I \to J}} P[\gamma^t_{I \to J} \mid I],    (4)

which sums only trajectory snippets of the form $\gamma = \gamma^t_{I \to J}$ with a path weight that is conditioned on the first jump I at time $T_I$. For example, the waiting time distribution $\psi_{I_+ \to I_+}(t)$ originates from a trajectory snippet $\gamma^t_{I_+ \to I_+}$ of length t with the jump sequence $\gamma^t_{I_+ \to I_+} = k \to l \to \cdots \to k \to l$. Likewise, $\psi_{I_- \to I_-}(t)$ arises from $\gamma^t_{I_- \to I_-} = l \to k \to \cdots \to l \to k$. Although the identification in Eq. (4) is reasonable from a practical point of view, its derivation contains some subtleties that are explained in the full proof of Eq. (4) in Appendix A.

Since $\gamma^t_{I_- \to I_-}$ is the reverse of $\gamma^t_{I_+ \to I_+}$, the logarithmic ratio of the corresponding waiting time distributions,

    a(t) \equiv a_{I_+ \to I_+}(t) \equiv \ln \frac{\psi_{I_+ \to I_+}(t)}{\psi_{I_- \to I_-}(t)},    (5)

is a natural, antisymmetric measure of irreversibility of the underlying trajectory. As a first main result, we show that a(t) is independent of t and, in particular, can be identified with the cycle affinity $A_C$:

    a_{I_+ \to I_+}(t) \equiv a = -a_{I_- \to I_-}(t) = A_C.    (6)

This relation can be seen as a fluctuation theorem applied to sections of the underlying trajectory on the Markov network that give rise to a waiting time distribution $\psi_{I_+ \to I_+}(t)$. These sections are trajectory snippets $\gamma^t_{I_+ \to I_+}$ of the form given above, where the time difference between both jumps $k \to l$ is exactly t. To observe the genuine time reverse $\psi_{I_- \to I_-}(t)$, the underlying trajectory must complete the cycle in the reverse direction, which means

    P[\gamma^t_{I_- \to I_-} \mid I_-] = P[\gamma^t_{I_+ \to I_+} \mid I_+] e^{-A_C}    (7)

for the path weights of every possible trajectory snippet $\gamma^t_{I_\pm \to I_\pm}$. Since this argument holds true for all trajectories contributing to the waiting time distribution $\psi_{I_+ \to I_+}(t)$, we can sum the left side of Eq. (7) over all $\gamma^t_{I_- \to I_-}$ and the right side of Eq. (7) over all $\gamma^t_{I_+ \to I_+}$ to conclude that

    \psi_{I_- \to I_-}(t) = \psi_{I_+ \to I_+}(t) e^{-A_C}    (8)

using Eq. (4). Inserting Eq. (8) into Eq. (5) proves Eq. (6).

Since $a(t) = a = A_C$ is time independent, we obtain from Eq. (5)

    A_C = a = \ln \frac{\int_0^\infty dt\, \psi_{I_+ \to I_+}(t)}{\int_0^\infty dt\, \psi_{I_- \to I_-}(t)} = \ln \frac{P(I_+ \mid I_+)}{P(I_- \mid I_-)}    (9)

with an integration over the time t. The last equality follows from the definition of $\psi_{I \to J}(t)$ as a joint distribution in J and t in Eq. (3). Thus, the cycle affinity is encoded in conditional probabilities $P(J \mid I)$ to observe transition J after transition I irrespective of the intermediate waiting time.

The relationship between cycle affinities and a time-antisymmetric probability ratio, given by Eq. (6) [or, equivalently, Eq. (9)], indicates that a(t) can be used as an estimator for the mean entropy production rate $\langle \sigma \rangle$ in the steady state via

    \langle \sigma \rangle = j_C A_C = j_C a,    (10)

which is exact even for finite observation times T, because the average is taken in the NESS. This noninvasive estimator is directly accessible from an operational point of view, as by definition $j_C$ can be calculated by counting transitions along the observed link and $a(t) = a$ can be calculated either directly from histogram data for the waiting time distributions using Eq. (5) or from conditional probabilities deduced from observed transitions using Eq. (9). This unicyclic result also recovers one of the main results in Ref. [61], here using a technique based on the microscopic cycle fluctuation theorem Eq. (7).
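As a concrete illustration of this operational recipe, the following Python sketch (ours, not from the paper) estimates a via the conditional probabilities of Eq. (9) and the entropy production rate via Eq. (10) from a recorded sequence of forward and backward transitions through the observed link; it assumes a long stationary record in which both kinds of consecutive pairs occur.

```python
import numpy as np

def unicyclic_entropy_estimate(directions, T):
    """Estimate the cycle affinity a = ln P(+|+)/P(-|-) [Eq. (9)] and the
    entropy production rate j_C * a [Eq. (10)] from a record of observed
    transitions through a single link of a unicyclic network.

    directions : sequence of +1 (forward) / -1 (backward) observed transitions
    T          : total observation time
    """
    d = np.asarray(directions)
    # conditional probabilities P(+|+) and P(-|-) from consecutive pairs
    n_pp = np.sum((d[:-1] == +1) & (d[1:] == +1))
    n_mm = np.sum((d[:-1] == -1) & (d[1:] == -1))
    n_p = np.sum(d[:-1] == +1)
    n_m = np.sum(d[:-1] == -1)
    a = np.log((n_pp / n_p) / (n_mm / n_m))           # Eq. (9)
    j_C = (np.sum(d == +1) - np.sum(d == -1)) / T     # net current through the link
    return a, j_C * a                                 # affinity and entropy rate, Eq. (10)
```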
Thus, the result additionally addresses the conceptual issue of relating entropy production, cycles, and fluctuation theorems that is raised at the end of Ref. [61].

Conceptually, the identification $A_C = a(t)$ relies crucially on the observation of transitions rather than states. Two subsequent transitions in the same direction imply a completed cycle with associated entropy production, whereas two visits of the same compound state emerging from state lumping in typical coarse-graining strategies do not. As all transitions except for one are invisible in the present partially accessible system, previous state-based coarse-graining approaches would yield a trivial model containing only a single compound state. Note that alternating observed transitions, i.e., observing a forward transition after a backward transition or vice versa, can never imply the completion of an underlying cycle. Therefore, it is not surprising that the estimator of the entropy production of a unicyclic network contains only the statistics of two subsequent transitions in the same direction, as observed in Ref. [61].

IV. MULTICYCLIC NETWORKS WITH ONE OBSERVED TRANSITION

For a general network topology, we cannot reconstruct a unique underlying path contributing to the waiting time distributions $\psi_{I_+ \to I_+}(t)$ and $\psi_{I_- \to I_-}(t)$ as in the unicyclic case. Topologically distinct hidden pathways may result in the same pair of consecutive observed transitions. Nevertheless, bounds for the affinities of those cycles that include the observable link can be derived from the ratio a(t). In addition, the cycle lengths of specific cycles can be inferred from the short-time limit of the waiting time distributions. Furthermore, the entropy estimator for unicyclic networks can be generalized to the multicyclic case.

A. Bounds on cycle affinities

For each possible underlying cycle C with $I_+ \in C$, Eq. (7) is valid with the corresponding cycle affinity $A_C$, if $\gamma^t_{I_+ \to I_+}$ completes the cycle once in the forward direction without taking detours and $\gamma^t_{I_- \to I_-}$ denotes the corresponding reverse path. Thus, the bound

    \min_{C, I_+ \in C} A_C \le \ln \frac{P[\gamma^t_{I_+ \to I_+} \mid I_+]}{P[\gamma^t_{I_- \to I_-} \mid I_-]} \le \max_{C, I_+ \in C} A_C    (11)

is an immediate consequence for these trajectories $\gamma^t_{I_+ \to I_+}$ by comparing with the smallest and largest possible affinity, respectively. Remarkably, the inequality in Eq. (11) holds true for general $\gamma^t_{I_+ \to I_+}$, if the corresponding $\gamma^t_{I_- \to I_-}$ is defined appropriately by the following algorithm; a code sketch implementing these steps follows below.

(1) Consider the sequence of states in $\gamma^t_{I_+ \to I_+}$. For $I_+ = (kl)$, this is $(kl \cdots kl)$.
(2) Remove the first and last state: $(kl \cdots kl) \mapsto (l \cdots k)$.
(3) Reading from left to right, remove all closed loops; i.e., as soon as a state m appears twice, remove the intermediate part: $(\cdots a m b \cdots c m d \cdots) \mapsto (\cdots a m d \cdots)$.
(4) The remaining trimmed path visits each state at most once. This trimmed path completed with $I_+$ gives rise to a contributing cycle.
(5) Reverse the trimmed path and reintegrate the first and last state: $(k \cdots l) \mapsto (l k \cdots l k)$.
(6) Reintegrate the closed loops from step 3 without reversing: $(\cdots d m a \cdots) \mapsto (\cdots d m b \cdots c m a \cdots)$.

The resulting sequence of states determines the partial reverse $\mathcal{R}\gamma^t_{I_+ \to I_+}$, which is of the form $\gamma^t_{I_- \to I_-}$. This procedure identifies a trimmed path of $\gamma^t_{I_+ \to I_+}$ that visits each state at most once. By reversing only this trimmed path, one obtains the partial reverse of $\gamma^t_{I_+ \to I_+}$, which is denoted by $\mathcal{R}\gamma^t_{I_+ \to I_+}$. The associated cycle containing the transition $I_+$ that is reversed by $\mathcal{R}$ has to be one of the possible C in Eq. (11).
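The following Python sketch (ours, not part of the paper) implements steps (1)-(6) for a trajectory snippet given as its sequence of visited states; the example sequence is the one of Fig. 2(b). The way several loops attached to the same state are handled is one possible convention consistent with the rules above.

```python
def partial_reverse(states):
    """Partial reverse R of a trajectory snippet gamma^t_{I+ -> I+}, given as
    its sequence of visited states, e.g. [7, 1, 3, 2, 7, 6, 5, 7, 1] for the
    example of Fig. 2(b). Returns (reversed_states, contributing_cycle)."""
    k, l = states[0], states[1]                 # observed transition I+ = (kl)
    inner = list(states[1:-1])                  # step 2: drop first and last state

    trimmed, loops = [], {}                     # loops[i]: loops cut out after trimmed[i]
    for s in inner:                             # step 3: remove closed loops
        if s in trimmed:
            i = trimmed.index(s)
            removed = []                        # removed segment, in forward order,
            for pos in range(i + 1, len(trimmed)):   # with earlier loops spliced back in
                removed.append(trimmed[pos])
                for lp in loops.pop(pos, []):
                    removed.extend(lp)
            removed.append(s)
            loops.setdefault(i, []).append(removed)
            trimmed = trimmed[:i + 1]
        else:
            trimmed.append(s)

    cycle = [k] + trimmed                       # step 4: contributing cycle, closed by I+

    rev, n = [l], len(trimmed)                  # steps 5 and 6: reverse the trimmed path,
    for j, s in enumerate(reversed(trimmed)):   # reattach endpoints and unreversed loops
        rev.append(s)
        for lp in loops.get(n - 1 - j, []):
            rev.extend(lp)
    rev.append(k)
    return rev, cycle


print(partial_reverse([7, 1, 3, 2, 7, 6, 5, 7, 1]))
# ([1, 7, 6, 5, 7, 2, 3, 1, 7], [7, 1, 3, 2, 7]), as in Fig. 2(b)
```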
For an example of this procedure, see Fig. 2(b). Thus, inverting only the trimmed part of $\gamma^t_{I_+ \to I_+}$ while maintaining the original direction of the remaining transitions restores the inequality in Eq. (7) and, hence, also the bound in Eq. (11) for every possible microscopic trajectory $\gamma^t_{I_+ \to I_+}$ with the corresponding partner $\gamma^t_{I_- \to I_-} = \mathcal{R}\gamma^t_{I_+ \to I_+}$ defined in this way.

By averaging over all possible trajectory snippets of length t, we can combine Eq. (4) with Eq. (11), which is now valid for all $\gamma^t_{I_+ \to I_+}$ with corresponding partner $\gamma^t_{I_- \to I_-}$, to conclude

    \min_{C, I_+ \in C} A_C \le \ln \frac{\psi_{I_+ \to I_+}(t)}{\psi_{I_- \to I_-}(t)} \le \max_{C, I_+ \in C} A_C    (12)

for arbitrary $0 < t < \infty$. For this step, it is important to note that the algorithm provides a bijective mapping $\mathcal{R}$ between trajectories of the form $\gamma^t_{I_+ \to I_+}$ and trajectories of the form $\gamma^t_{I_- \to I_-}$. The inverse mapping is given by applying the same algorithm to $\gamma^t_{I_- \to I_-}$, except for reading right to left in step 3, to recover the correct sequence of states for $\gamma^t_{I_+ \to I_+}$.

The quotient in Eq. (12) can be identified as a(t) via Eq. (5). Thus, the extremal values of a(t) can be identified as bounds on the actual cycle affinities in the form

    A_{C_+} \equiv \max_C A_C \ge \sup_{0 \le t < \infty} a(t) \equiv a^*_+,    (13)

    A_{C_-} \equiv \min_C A_C \le \inf_{0 \le t < \infty} a(t) \equiv a^*_-.    (14)

Here, the maximum and minimum of the affinities are taken over all cycles C contributing to the observed link. Strong driving along or against the observed link manifests itself in a high positive or negative affinity for a given cycle, respectively. The inequalities (13) and (14) allow us to infer such a source of strong driving from its impact on a(t) from the viewpoint of the observed link.

The derived bounds for the cycle affinities are illustrated in Fig. 2. Figures 2(c) and 2(d) show that the extremal affinities $A_{C_+}$ and $A_{C_-}$ of the contributing cycles are indeed bounded by the maximum value $a^*_+$ and the minimum value $a^*_-$ of a(t). Furthermore, the affinity $A_{C_0}$ of the shortest contributing cycle is always equal to the initial value $a^*_0 \equiv a(t = 0)$, as we prove in the following section.

To quantify the quality of the bounds in Eqs. (13) and (14) for the network from Fig. 2(a), we distinguish two different classes of network realizations. A network with a particular configuration of transition rates belongs to class I if the initial value $a^*_0$ of a(t) is a global maximum or minimum. An exemplary a(t) of a realization of the network belonging to this class is shown in Fig. 3(a), case (I). For this class of network realizations, Eqs. (13) and (14) provide only a single bound for either the maximal or the minimal affinity of the cycles contributing to the observed link. The other bound is saturated by the shortest cycle with affinity $a^*_0 = A_{C_0}$. Class II contains the remaining realizations of the network in which $a^*_0 = a(t = 0)$ is not the global maximum or minimum. An example of an a(t) sorted into class II is given in Fig. 3(a), case (II); another one is already shown in Figs. 2(c) and 2(d). For this class of network realizations, Eqs. (13) and (14) provide bounds for both the maximal and the minimal affinity of the cycles contributing to the observed link, respectively.

For both classes of rate configurations, quality factors Q can be defined such that for Q = 1 equality holds in Eqs. (13) and (14) and the value of the bound equals the actual affinity of the cycle. For Q < 1, the quality factor quantifies the ratio between the value of the bound and the actual affinity of the corresponding cycle.
FIG. 2. Illustrative example for a partially accessible multicyclic network. (a) Effective description for a seven-state multicyclic network in which the link between state 1 and state 7 is observable, leading to five different contributing cycles $C_i$ numbered incrementally. The corresponding transition rates are given in Appendix E. For cycle $C_0 = (1271)$, the affinity $A_{C_0}$ vanishes; the affinity of cycle $C_1 = (13271)$ is $A_{C_1} = 3.18$; the affinity of cycle $C_2 = (134571)$ is $A_{C_2} = -1.43$; the affinity of cycle $C_3 = (1345671)$ is $A_{C_3} = 7.27$; and the affinity of cycle $C_4 = (1234571)$ is given by $A_{C_4} = -5.61$. (b) Example for a trimmed path. For the snippet $\gamma^t_{I_+ \to I_+}$ depicted with blue arrows, the sequence of visited states is (713276571). The trimmed path for this snippet is (713271) (cf. the algorithm in the main text). The corresponding $\gamma^t_{I_- \to I_-}$ is not the reversed sequence but rather (176572317) and is depicted with dashed orange arrows. Thus, the associated cycle is $C_1$, i.e., $\ln P[\gamma^t_{I_+ \to I_+} \mid I_+]/P[\gamma^t_{I_- \to I_-} \mid I_-] = A_{C_1}$. Terms due to the extra loop (7567) cancel in this path weight quotient. (c),(d) Estimation of the cycle affinities of the contributing cycles based on the extreme values of $a_{(71) \to (71)}(t)$. The maximal value $a^*_+ \simeq 0.13$ and the minimal value $a^*_- \simeq -0.66$ of $a_{(71) \to (71)}(t)$ are lower and upper bounds for the maximal affinity $A_{C_3} = 7.27$ and the minimal affinity $A_{C_4} = -5.61$, respectively. The initial value $a_{(71) \to (71)}(0) = a^*_0 = 0$ equals the affinity $A_{C_0} = 0$ of the shortest network cycle. The local maximum $a^*_1 \simeq 0.03$ and the local minimum $a^*_2 \simeq -0.05$ can be identified as lower and upper bounds for the affinities $A_{C_1} = 3.18$ and $A_{C_2} = -1.43$ of the remaining contributing cycles $C_1$ and $C_2$, respectively.

Using the affinity $A_{C_0}$ of the shortest cycle given by $a^*_0$ as a baseline, we introduce the relative distance

    \Delta a(t) \equiv |a(t) - A_{C_0}| = |a(t) - a^*_0|.    (15)

The quality factors are defined by comparing the maximal value

    \Delta a_+ = |a^*_+ - A_{C_0}|    (16)

and the minimal value

    \Delta a_- = |a^*_- - A_{C_0}|    (17)

of Eq. (15) with the respective actual distance between the true cycle affinities, given by $|A_{C_\pm} - A_{C_0}|$.

For network realizations belonging to class I, either Eq. (13) or Eq. (14) is a bound for the affinity of a single cycle. If the initial value $a^*_0$ is a global minimum, the maximal affinity $A_{C_+}$ of the cycles contributing to the observed link is bounded by Eq. (13). Thus, the quality factor $Q_I$ for this network realization is defined as

    Q_I \equiv \frac{\Delta a_+}{|A_{C_+} - A_{C_0}|}.    (18)

If the initial value $a^*_0$ is a global maximum, the minimal affinity $A_{C_-}$ of the cycles contributing to the observed link is bounded by Eq. (14), and the quality factor $Q_I$ for this network realization is given by

    Q_I \equiv \frac{\Delta a_-}{|A_{C_-} - A_{C_0}|}.    (19)

A graphical illustration of the quantities entering the definition of $Q_I$ is shown in Fig. 3(a), case (I).

For network configurations belonging to class II, both Eqs. (13) and (14) provide nontrivial bounds for the extremal affinities of the contributing cycles. To distinguish both bounds, two quality factors $Q^+_{II}$ and $Q^-_{II}$, defined similarly to Eqs. (18) and (19), are needed. The quality factor

    Q^+_{II} \equiv \frac{\Delta a_+}{|A_{C_+} - A_{C_0}|}    (20)

quantifies the quality of the bound Eq. (13) for the maximal affinity $A_{C_+}$ of the contributing cycles. The quality of the bound Eq. (14) for the minimal affinity $A_{C_-}$ of the contributing cycles is quantified analogously by
    Q^-_{II} \equiv \frac{\Delta a_-}{|A_{C_-} - A_{C_0}|}.    (21)

The quantities entering the definitions of $Q^+_{II}$ and $Q^-_{II}$ are illustrated in Fig. 3(a), case (II).

The quality factors for a total of 2 063 495 randomly drawn realizations of the multicyclic network from Fig. 2 are shown in Figs. 3(b)–3(d) as a function of the affinity $A_{C_0}$ of the smallest contributing cycle. The different structure and mean value of the quality factors $Q_I$ for network realizations from class I, shown in Fig. 3(b), when contrasted with the structures and mean values of the quality factors for network realizations from class II, shown in Figs. 3(c) and 3(d), indicate that the partition into two different classes of network realizations corresponds to distinct features of the network that are reflected in these affinity bounds.

The mean value of the quality factors for network realizations belonging to class I is given by $Q_I \simeq 0.4$, which means that the maximal or minimal affinity of the contributing cycles can be estimated based on Eq. (13) or Eq. (14) with an average accuracy of 0.4. This result is remarkable because, on the one hand, the estimation is based on a noninvasive observation of a single link of the network only and, on the other hand, to our knowledge, no coarse-graining inference scheme exists that bounds affinities of a partially accessible network to this degree of precision. The mean values of the quality factors for network realizations belonging to class II are given by $Q^+_{II} \simeq 0.2$ and $Q^-_{II} \simeq 0.1$, respectively. Compared to the bounds for realizations belonging to class I, realizations belonging to class II tend to yield quantitatively weaker bounds. However, local maxima and minima of a(t) seem to provide further, loose bounds for the affinities of other, nonextremal cycles contributing to the observed link. This numerical finding, illustrated for a given network realization in Fig. 2(c), indicates that each successive maximal and minimal value of a(t) corresponds to a contributing cycle. Therefore, the number of successive maximal and minimal values of a(t) can be interpreted as a lower bound for the total number of contributing cycles for networks from class II.

FIG. 3. Quality of the affinity bounds for the seven-state multicyclic network from Fig. 2. (a) Illustration of the quantities entering the definition of the quality factors for the two classes of network realizations. Case (I) shows a(t) for a network realization belonging to class I. Since $a^*_0$ is the global minimum of a(t), the quality factor $Q_I$ for this realization is defined according to Eq. (18) with $\Delta a_+ = |a^*_+ - a^*_0|$ from Eq. (16). Case (II) shows a(t) for a network realization belonging to class II. The quality factors $Q^+_{II}$ and $Q^-_{II}$ for this realization are defined according to Eqs. (20) and (21) with $\Delta a_+ = |a^*_+ - a^*_0|$ and $\Delta a_- = |a^*_- - a^*_0|$ from Eqs. (16) and (17), respectively. (b)–(d) Quality factors $Q_I$, $Q^+_{II}$, and $Q^-_{II}$ for 2 063 495 randomly drawn rate configurations of the multicyclic network as a function of the affinity $A_{C_0}$ of the smallest contributing cycle. The mean value of the quality factors $Q_I$ in (b) is given by $Q_I \simeq 0.4$, whereas the mean values of the quality factors $Q^+_{II}$ in (c) and $Q^-_{II}$ in (d) are given by $Q^+_{II} \simeq 0.2$ and $Q^-_{II} \simeq 0.1$, respectively. The difference between the quality factors in (c) and (d) for the same class of network realizations is caused by the ensemble for the transition rates, which is biased towards positive affinities, as explained in detail in Appendix E. All quality factors are determined from the corresponding waiting time distributions derived with the method explained in Appendix A 3.
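Operationally, the bounds of Eqs. (13) and (14) and the short-time value a(0) of Eq. (23) below can be read off directly from estimated waiting time distributions. A minimal sketch (ours, not from the paper), assuming the distributions have already been sampled on a common time grid whose first point lies in the short-time regime:

```python
import numpy as np

def affinity_bounds(psi_pp, psi_mm):
    """Affinity information extracted from sampled waiting time distributions
    psi_{I+ -> I+}(t) and psi_{I- -> I-}(t) on a common time grid whose first
    entry lies in the short-time regime.

    Returns (a0, a_max, a_min): the short-time value a(0), which equals the
    affinity of the shortest contributing cycle [Eq. (23)], and the extrema of
    a(t), which bound the largest and smallest contributing-cycle affinity
    from below and above, respectively [Eqs. (13) and (14)]."""
    a = np.log(psi_pp / psi_mm)        # a(t), Eq. (5)
    return a[0], np.max(a), np.min(a)
```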
B. Short-time limit and inference of cycle lengths

Additional information about the network can be obtained from the time dependence of the waiting time distributions $\psi_{I_+ \to I_+}(t)$ and $\psi_{I_- \to I_-}(t)$. In the limit $t \to 0$, only the shortest cycle(s) including the link with forward transition $I_+$ and backward transition $I_-$ contribute(s) to the waiting time distribution, as longer paths lead to effects of higher order in t. Thus, we can extract the number of hidden transitions $N_1$ needed to complete the smallest cycle and, if unique, its corresponding affinity $A_{C_0}$ from the waiting time distributions via

    \lim_{t \to 0} \left[ t \frac{d}{dt} \ln \psi_{I_\pm \to I_\pm}(t) \right] = N_1    (22)

and

    \lim_{t \to 0} a_{I_+ \to I_+}(t) = -\lim_{t \to 0} a_{I_- \to I_-}(t) = A_{C_0},    (23)

respectively, as proven in Appendix C 1. Note that $N_1 + 1$ is equal to the length of the smallest cycle, because, after $N_1$ hidden transitions, an additional observed transition is needed to complete the full cycle.

As an illustration for the identification of $N_1$, we consider the ratio of waiting time distributions for the observable link of the two-cycle network shown in Fig. 4(a). Figure 4(b) illustrates that the evaluation of Eq. (22) for $I_+ = (32)$ coincides with $N_1 = 2$, the minimal number of hidden transitions needed to observe (32) after (32) in the smallest cycle of the network. For the multicyclic network in Fig. 2, the identification of the affinity in Eq. (23) is illustrated in Fig. 2(c) together with the previously discussed affinity bounds, as the affinity $A_{C_0}$ of the shortest cycle is reflected in the initial value $a_{(71) \to (71)}(0) = 0$.

Terms of higher order around t = 0 of the form $t^N$ encode similar information about cycles of increasing size contributing to the observable link. Qualitatively, we can extract information about the number of hidden transitions $N_2$ needed to complete the second-shortest cycle from a(t), since

    a(t) - a(0) \sim t^{N_2 - N_1}.    (24)

More quantitatively, and as proven in Appendix C 2, the absolute value of the relative distance introduced in Eq. (15) can be seen as the lowest-order perturbation to the shortest cycle. Typically, e.g., if the affinities of the two shortest cycles do not coincide, this effect is due to the second-shortest cycle. In this case, $N_2$ can be extracted from Eq. (15) via

    \lim_{t \to 0} \left[ t \frac{d}{dt} \ln |\Delta a_{I \to J}(t)| \right] = N_2 - N_1    (25)

if $N_2 > N_1$, i.e., if the shortest cycle is unique. By combining the results from Eqs. (22) and (25), we can infer $N_2$ from observable waiting time distributions. Similar to the length of the shortest network cycle, the length of the second-shortest network cycle is given by $N_2 + 1$.
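A minimal sketch of this inference scheme (ours, not from the paper), assuming sampled waiting time distributions whose first few grid points lie deep in the short-time regime where the leading powers of t dominate; the finite-difference slopes are a crude stand-in for the limits in Eqs. (22) and (25):

```python
import numpy as np

def infer_cycle_lengths(t, psi_pp, psi_mm):
    """Estimate N1 and N2 from the short-time behavior of the waiting time
    distributions via the logarithmic slopes of Eqs. (22) and (25). The grid
    points t[0] < t[1] < t[2] are assumed to lie deep in the short-time
    regime, where psi ~ t^N1 and |a(t) - a(0)| ~ t^(N2 - N1)."""
    # slope of ln psi_{+->+}(t) vs ln t approximates N1 [Eq. (22)]
    N1 = (np.log(psi_pp[1]) - np.log(psi_pp[0])) / (np.log(t[1]) - np.log(t[0]))
    # slope of ln |a(t) - a(0)| vs ln t approximates N2 - N1 [Eq. (25)]
    da = np.abs(np.log(psi_pp / psi_mm) - np.log(psi_pp[0] / psi_mm[0]))
    dN = (np.log(da[2]) - np.log(da[1])) / (np.log(t[2]) - np.log(t[1]))
    # the lengths of the shortest and second-shortest contributing cycles
    # are N1 + 1 and N2 + 1, respectively
    return round(N1), round(N1 + dN)
```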
FIG. 4. Inference of cycle lengths and entropy estimation for a partially accessible two-cycle network. (a) Effective description of a four-state network with two cycles in which transitions along the link between states 2 and 3, i.e., (23) and (32), are observable. F is a dimensionless force applied to the observable link between states 2 and 3; all transition rates of the network are given in Appendix E. (b) Inference of the number of hidden transitions $N_1$ of the smallest network cycle $C_0$ based on waiting time distributions calculated with the method from Appendix A 3 for fixed $F = \ln 3$. $N_1 = 2$ corresponds to the slope of the short-time limit of $\ln \psi(t)$, resulting in $|C_0| = 3$. (c) Inference of the number of hidden transitions $N_2$ of the second-smallest network cycle $C_1$ based on waiting time distributions calculated with the method from Appendix A 3 for fixed $F = \ln 3$. $N_2 - N_1 = 1$ corresponds to the slope of the short-time limit of $\ln |\Delta a(t)|$, resulting in $N_2 = 3$ and $|C_1| = 4$. (d) The estimator $\langle \hat{\sigma} \rangle$ from Eq. (28) for the mean entropy production $\langle \sigma \rangle$ of the full network as a function of F. The details of the simulations of $\nu_{+|+}(t)$ and $\nu_{-|-}(t)$ are given in Appendix E. The method from Appendix A 3 is used to calculate a(t).

Figure 4(c) illustrates the evaluation of Eq. (25) for a(t) for $I_+ = (32)$, leading to $N_2 - N_1 = 1$. This result is consistent with $N_2 = 3$, the number of hidden transitions needed to observe (32) as the next observable transition after (32) along the second-smallest cycle of the network.

C. Entropy estimator

1. Definition

A time-dependent a(t) implies the presence of a second cycle, as longer waiting times between subsequent transitions hint at the completion of longer pathways. Exploiting this time dependence leads to an entropy estimator that generalizes the estimator of the unicyclic case. To quantify this notion, we let T be the length of a long trajectory with N + 1 transitions $I_k$ located at $T_{k-1}$. The observation starts with the transition $I_1$ at $T_0 = 0$ and ends with $I_{N+1}$ at time $T_N = T$. Then, the number of subsequent forward or backward transitions with waiting time t in between is given by the time-resolved conditional jump counters defined as

    \nu_{+|+}(t) \equiv \frac{1}{T} \sum_{m=1}^{N} \delta(T_m - T_{m-1} - t)\, \delta_{I_{m+1}, I_+} \delta_{I_m, I_+},    (26)

with $\nu_{-|-}(t)$ defined accordingly. These time-resolved conditional jump counters are used together with the logarithmic ratio of waiting time distributions a(t) defined in Eq. (5) to define a trajectory-dependent entropy estimator

    \hat{\sigma} \equiv \int_0^\infty dt\, a(t) \left[ \nu_{+|+}(t) - \nu_{-|-}(t) \right].    (27)

Operationally, $\nu_{+|+}(t)$ and $\nu_{-|-}(t)$ can be obtained by counting conditional transitions up to time t, and a(t) can be obtained from histograms for the waiting time distributions based on waiting times between observed transitions. As proven in Appendix B, in the limit of long trajectories, i.e., observation times $T \to \infty$, Eq. (27) defines an entropy estimator respecting time-reversal symmetry in thermodynamic equilibrium whose mean additionally satisfies

    \langle \hat{\sigma} \rangle \le \langle \sigma \rangle.    (28)

This property can be deduced from a fluctuation theorem

    \hat{\sigma} = \lim_{T \to \infty} \frac{1}{T} \ln \frac{P(\Gamma)}{P(\tilde{\Gamma})}    (29)

for the trajectory $\Gamma$ and its time reverse $\tilde{\Gamma}$, both emerging from trajectories of the underlying network by a mapping defined by the effective description of the system. An interpretation of $\Gamma$ from a mathematical point of view is given in Sec. VI.

2. Illustration and comparison to existing methods

A numerical illustration of the estimator [Eq. (27)] applied to the partially accessible two-cycle network is depicted in Fig. 4(d). The mean entropy production $\langle \sigma \rangle$ and the entropy estimator $\langle \hat{\sigma} \rangle$ are simulated for long, stationary trajectories and different values of a parameter F, which can be interpreted as a driving force applied to the observed link between the states 2 and 3. An external observer who is able to tune the force parameter F can find a value for which the net stationary current $j = \int_0^\infty dt \langle \nu_{+|+}(t) - \nu_{-|-}(t) \rangle$ vanishes. This setup and the particular value of F are referred to as stalling conditions and the stalling force, respectively [39,44,45].
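To make the operational character of Eq. (27) explicit, the following Python sketch (ours, not from the paper) estimates $\hat{\sigma}$ from a recorded sequence of transition directions and times through a single link; the uniform binning and the handling of empty bins are illustrative choices, not part of the method itself.

```python
import numpy as np

def entropy_estimator_single_link(directions, times, T, bins=100):
    """Estimate sigma_hat of Eq. (27) from an observed record of transitions
    through a single link.

    directions : +1 / -1 for forward / backward transitions
    times      : the corresponding transition times
    T          : total observation time
    """
    d, tau = np.asarray(directions), np.asarray(times)
    dt = np.diff(tau)                                  # waiting times between transitions
    pp = dt[(d[:-1] == +1) & (d[1:] == +1)]            # waits between consecutive + jumps
    mm = dt[(d[:-1] == -1) & (d[1:] == -1)]            # waits between consecutive - jumps
    edges = np.linspace(0.0, dt.max(), bins + 1)
    width = np.diff(edges)

    # histogram estimates of psi_{+->+}(t) and psi_{-->-}(t), cf. Eq. (3)
    h_pp, _ = np.histogram(pp, bins=edges)
    h_mm, _ = np.histogram(mm, bins=edges)
    psi_pp = h_pp / np.sum(d[:-1] == +1) / width
    psi_mm = h_mm / np.sum(d[:-1] == -1) / width

    # binned estimates of the conditional jump counters of Eq. (26)
    nu_pp = h_pp / (T * width)
    nu_mm = h_mm / (T * width)

    ok = (psi_pp > 0) & (psi_mm > 0)                   # skip empty bins
    a = np.log(psi_pp[ok] / psi_mm[ok])                # a(t), Eq. (5)
    return np.sum(a * (nu_pp[ok] - nu_mm[ok]) * width[ok])   # Eq. (27)
```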
Knowing this stalling force, through either measurement or calculation, amounts to knowing the effective "pressure" that the remaining network exerts on the link (23) against the force F. This information is incorporated in the so-called informed partial entropy estimator $\langle \sigma_{IP} \rangle$ introduced in Ref. [44]. Since the remaining network is taken into account through the effective pressure, $\langle \sigma_{IP} \rangle$ surpasses the estimator obtained by merely measuring the passive partial entropy production $\langle \sigma_{PP} \rangle$ that can be attributed to the transitions in an observed subset [43], i.e.,

    \langle \sigma_{PP} \rangle \le \langle \sigma_{IP} \rangle \le \langle \sigma \rangle,    (30)

as proven in the context of the informed partial estimator in Ref. [45]. Under stalling conditions, both estimators $\langle \sigma_{PP} \rangle$ and $\langle \sigma_{IP} \rangle$ become trivial, because they cannot rule out the possibility that the underlying system is at equilibrium if j = 0. The time-resolved estimator $\langle \hat{\sigma} \rangle$ introduced here, however, is able to infer nonequilibrium, since $\langle \hat{\sigma} \rangle > 0$ even if j = 0, as additional information enters its definition in Eq. (27). Intuitively, the waiting time distributions encode information about the hidden cycle in their time dependence through a nonconstant a(t).

More quantitatively, the estimator $\langle \hat{\sigma} \rangle$ defined by Eq. (27) numerically reproduces the bound of the waiting-time-distribution-based estimator proposed in Ref. [39] for the network in Fig. 4. Both the estimator in Ref. [39] and $\langle \hat{\sigma} \rangle$ share the features of considering successive transitions and adding a time resolution through waiting time distributions. However, $\langle \hat{\sigma} \rangle$ is formulated without the framework of a higher-order semi-Markov process or a Markov chain decimation scheme. While these differences render a general quantitative comparison with our estimator difficult, $\langle \hat{\sigma} \rangle$ beats the informed partial estimator $\langle \sigma_{IP} \rangle$ for long, stationary trajectories,

    \langle \sigma_{IP} \rangle \le \langle \hat{\sigma} \rangle \le \langle \sigma \rangle,    (31)

as we prove in Appendix B 4. Note that the expectation values are still taken in the limit of large observation times, in which finite-time effects at the initial and final transition can be neglected. It is also evident from the proof that equality is achieved in the first relation if and only if a(t) is time independent. Equality in the second relation is achieved if and only if removing the observed edge results in a network in which detailed balance is satisfied. To give a less formal interpretation of Eq. (31), observational access to the waiting time distributions contains more information than operational access to the observed link via the stalling force F. In particular, it is possible to measure F via

    -F = \ln \frac{P(I_+ \mid I_+)}{P(I_- \mid I_-)} = \ln \frac{\langle \int_0^\infty dt\, \nu_{+|+}(t) \rangle / \langle n_+ \rangle}{\langle \int_0^\infty dt\, \nu_{-|-}(t) \rangle / \langle n_- \rangle},    (32)

without perturbing the system at all, as we prove in Appendix B 4.

V. MULTIPLE OBSERVED LINKS IN A MULTICYCLIC NETWORK

Access to additional observable transitions provides further information about the underlying network, which allows us to infer topology qualitatively, by identifying allowed and forbidden sequences of transitions, and quantitatively, by sharpening our entropy estimator for multicyclic networks.

A. Entropy estimator

For M observed links, there are 2M possible transitions and a 2M × 2M matrix of quotients

    a_{IJ}(t) \equiv \ln \frac{\psi_{I \to J}(t)}{\psi_{\tilde{J} \to \tilde{I}}(t)}    (33)

with $I, J \in \{I^{(1)}_+, I^{(1)}_-, \ldots, I^{(M)}_-\}$. Here, $\tilde{I}$ is defined as the reverse transition $\tilde{I}^{(m)}_\pm \equiv I^{(m)}_\mp$, which yields a skew symmetry $a_{IJ} = -a_{\tilde{J}\tilde{I}}$.
Intuitively, the ratio in Eq. (33) encodes the entropy production term of an effective two-step trajectory $\Gamma^t_{IJ} = I \to J$ of length t. This term is related to the path weights of microscopic trajectory snippets $\gamma^t_{I \to J} = k \to l \to \cdots \to o \to p$ of the same length t between two observed transitions I = (kl) and J = (op) in the form

    a_{IJ}(t) = \ln \frac{P[\Gamma^t_{IJ} \mid I]}{P[\Gamma^t_{\tilde{J}\tilde{I}} \mid \tilde{J}]} = \ln \frac{\sum_{\gamma^t_{I \to J}} P[\gamma^t_{I \to J} \mid I]}{\sum_{\gamma^t_{\tilde{J} \to \tilde{I}}} P[\gamma^t_{\tilde{J} \to \tilde{I}} \mid \tilde{J}]}.    (34)

Similar to the unicyclic case in Eq. (5), unobserved degrees of freedom in the microscopic path $\gamma^t_{I \to J}$ are integrated out by the summation over the path weights.

The ratios in Eq. (33) allow us to generalize $\hat{\sigma}$, defined in Eq. (27), to multiple observed transitions. We define the conditional counters as

    \nu_{J|I}(t) \equiv \frac{1}{T} \sum_{m=1}^{N} \delta(T_m - T_{m-1} - t)\, \delta_{I_{m+1}, J} \delta_{I_m, I},    (35)

where we adopt the same notation as in Eq. (26); i.e., the mth transition $I_m$ is located at $T_{m-1}$. The sum over all $a_{IJ}(t)$ in a trajectory constitutes the entropy estimator

    \hat{\sigma} \equiv \sum_{IJ} \int_0^\infty dt\, a_{IJ}(t)\, \nu_{J|I}(t),    (36)

which reduces to Eq. (27) in the case of a single link, i.e., two possible transitions $I_\pm = \pm$. Thus, registering a jump J after a previous jump I during an observation of a long trajectory increases $\hat{\sigma}$ by $a_{IJ}(t)$, an antisymmetric increment in which inaccessible data beyond the registered observable ones are integrated out. The entropy estimator is thermodynamically consistent in the sense of Eq. (28) and satisfies the fluctuation theorem from Eq. (29) in the long-time limit $T \to \infty$. Moreover, the definition (36) provides the fluctuating counterpart of the entropy estimator for multicyclic networks introduced in Ref. [61], which is given by $\langle \hat{\sigma} \rangle$ in our notation.

B. Network topology

When we consider multiple transitions, their relative position in the network has a crucial impact on the observed data. For a given network, the waiting time distribution $\psi_{I \to J}(t)$ depends not only on the pair of transitions I, J, but on the entire set of observed links. For example, in the effective description of the network in Fig. 4(a), $a_{(23)(23)}(t)$ is time dependent but becomes time independent if, in addition, the transitions (13) and (31) are observed. The reason is that the fluctuation-theorem-like argument for the affinity can be restored, since observing $\psi_{(23) \to (23)}(t)$ then necessarily implies completion of the cycle C = (23412). Formulated differently, we can retrace the arguments underlying Eq. (11) to deduce an equality

    A_C = \ln \frac{P[\gamma^t_{(23) \to (23)} \mid (23)]}{P[\gamma^t_{(32) \to (32)} \mid (32)]},    (37)

because the only possible completed cycle is C. Based on this observation, we can conclude in more general terms that increasing the number of observed links in a network decreases the number of possible pathways in the remaining, hidden part of the underlying Markov network. This subnetwork, which is obtained by removing all observed links from the Markov network, is denoted the hidden subnetwork. While the hidden subnetwork is made up of the same states as the Markov network, it contains fewer links and, therefore, may be disconnected.

We can make a few technical but far-reaching observations, which are here formulated for long, stationary trajectories; i.e., expectation values are taken in the NESS and in the limit $T \to \infty$, as before. Let I = (kl) and J = (op) be two arbitrary observed transitions in the network.

(1) If the hidden subnetwork is topologically trivial, i.e., does not contain any cycles, then $\langle \hat{\sigma} \rangle = \langle \sigma \rangle$. Moreover, all $a_{IJ}(t)$ are time independent.
(2) A time-dependent $a_{IJ}(t)$ implies the presence of a cycle in the hidden subnetwork. More precisely, if $a_{IJ}(t)$ is nonconstant in time, then there is a cycle with nonvanishing affinity in the hidden subnetwork that connects the Markov states l and o. In particular, $\langle \hat{\sigma} \rangle < \langle \sigma \rangle$.

(3) If J cannot be an immediate successor of I, i.e., if $\psi_{I \to J}(t) = 0$, the Markov states l and o are not connected in the hidden subnetwork. In particular, we can leave out at least one observed transition without decreasing $\langle \hat{\sigma} \rangle$.

(4) The converse of 2 is not true. It is possible that $a_{IJ}(t)$ is constant in time despite a cycle with nontrivial affinity containing both l and o. However, this behavior is not the generic case but rather requires high symmetry. An explicit example containing such an invisible cycle is provided in Appendix E 5.

These four results are based on the microscopic origin of $a_{IJ}(t)$ as a ratio of path weights as indicated in Eq. (34). The crucial argument is an extension of the reasoning used in the unicyclic case to relate ratios of path weights to the cycle affinity A [cf. Eq. (7)]. We consider two consecutive transitions I = (kl) and J = (op) and two arbitrary paths $\gamma_1$ and $\gamma_2$ starting and ending in the Markov states l and o, respectively. Their path weights satisfy

    \ln \frac{P[\gamma_1 \mid l]}{P[\tilde{\gamma}_1 \mid o]} - \ln \frac{P[\gamma_2 \mid l]}{P[\tilde{\gamma}_2 \mid o]} = A_{12},    (38)

where $A_{12}$ is the affinity of the closed loop obtained by appending $\tilde{\gamma}_2$ to $\gamma_1$. If the hidden subnetwork does not contain any cycles, $A_{12} = 0$ follows trivially. Since $\gamma_1$ and $\gamma_2$ are arbitrary, Eq. (38) then implies the existence of a specific number $a_{IJ}$ satisfying

    P[\gamma^t_{I \to J} \mid I]\, e^{-a_{IJ}} = P[\gamma^t_{\tilde{J} \to \tilde{I}} \mid \tilde{J}]    (39)

for paths $\gamma^t_{I \to J}$ of arbitrary length t with time reverse $\gamma^t_{\tilde{J} \to \tilde{I}}$. By summing the previous equation over all possible trajectories of the form $\gamma^t_{I \to J}$, we conclude

    a_{IJ}(t) = \ln \frac{\sum_{\gamma^t_{I \to J}} P[\gamma^t_{I \to J} \mid I]}{\sum_{\gamma^t_{\tilde{J} \to \tilde{I}}} P[\gamma^t_{\tilde{J} \to \tilde{I}} \mid \tilde{J}]} = \ln \frac{\psi_{I \to J}(t)}{\psi_{\tilde{J} \to \tilde{I}}(t)}.    (40)

In particular, $a_{IJ}(t)$ is time independent if the hidden subnetwork does not contain any cycles or if it satisfies detailed balance, i.e., if any cycles in the hidden subnetwork have vanishing affinity. This argument establishes rule 1. To emphasize the relation to our previous results, we note that Eq. (40) can be seen as a special case of the affinity bounds from Eq. (11), which collapse to equalities if the set of possible $A_C$ contains only one element. If the hidden subnetwork is a spanning tree, the diagonal element $a_{II} = A_C$ is the affinity of the cycle C in the unicyclic network obtained by adding the link I back to the hidden subnetwork. In particular, every cycle then passes through at least one observed link and is, therefore, registered. Since NESS entropy production stems from cycle currents, it seems plausible to conjecture $\langle \hat{\sigma} \rangle = \langle \sigma \rangle$. Up to contributions from the first and last transition of the trajectory, the statement even holds on the level of individual trajectories in the form

    \hat{\sigma} = \sigma,    (41)

as is proven in Appendix D.

Rule 2 is obtained from Eq. (38) by reversing the argument above. Since a nontrivial time dependence of $a_{IJ}(t)$ is impossible if $A_{12}$ vanishes for all $\gamma_1$ and $\gamma_2$, there must be at least one cycle with nonvanishing affinity.
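A possible operational reading of rules 2 and 3, sketched in Python (ours, not from the paper) under the assumption that the waiting time distributions for all pairs of observed transitions have been sampled on a common time grid; the tolerances used to decide "vanishing" and "nonconstant" are arbitrary illustrative choices.

```python
import numpy as np

def hidden_subnetwork_diagnostics(psi, tol=1e-9, var_tol=1e-3):
    """Operational check of rules 2 and 3 on sampled waiting time distributions.

    psi : dict mapping a pair of transitions ((k, l), (o, p)) to an array of
          samples of psi_{I->J}(t) on a common time grid
    Returns (driven, forbidden): pairs (I, J) whose a_IJ(t) is nonconstant,
    signaling a driven hidden cycle connecting l and o (rule 2), and pairs
    that never occur in succession, i.e. psi_{I->J} = 0, signaling that l and
    o are disconnected in the hidden subnetwork (rule 3)."""
    rev = lambda I: (I[1], I[0])
    driven, forbidden = [], []
    for (I, J), p in psi.items():
        if np.all(p < tol):
            forbidden.append((I, J))
            continue
        q = psi.get((rev(J), rev(I)))
        if q is None or np.all(q < tol):
            continue
        mask = (p > tol) & (q > tol)
        a_IJ = np.log(p[mask] / q[mask])            # a_IJ(t), Eq. (33)
        if a_IJ.size > 1 and np.ptp(a_IJ) > var_tol:   # nonconstant in time
            driven.append((I, J))
    return driven, forbidden
```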
We now argue that, despite the counterexample given in Appendix E 5, the converse of rule 2 is usually satisfied in a generic setup. If $a_{IJ}(t)$ is constant in time, it equals its limit $a_{IJ}(0)$ as $t \to 0$. By a timescale separation argument similar to Eq. (23), only the shortest connection between the corresponding Markov states l and o contributes in the short-time limit, whereas longer connections are suppressed and lead to higher-order effects. A hidden cycle containing l and o can be split along these states, giving rise to two topologically distinct pathways $\gamma_1$ and $\gamma_2$. Unless both pathways contain the exact same number of states, one class of paths is suppressed by the other in the short-time limit. Thus, the hidden cycle must contain an even number of states to avoid this timescale separation argument. In addition to this purely qualitative argument, generic choices of transition rates generally lead to different first-passage times from l to o depending on the topology of the path, which would also lead to a nontrivial time dependence in $a_{IJ}(t)$.

While the derivation of rule 3 is straightforward from a mathematical point of view, it is of high value operationally, as it can be used to infer the connected components of the hidden subnetwork. In addition, this rule describes a scheme to identify the transitions needed to recover the full entropy production. While rule 2 gives a simple criterion for when a particular set of observed transitions is insufficient to conclude $\langle \sigma \rangle = \langle \hat{\sigma} \rangle$, rule 3 formulates a complementary criterion about transitions that are redundant for the entropy estimate. On the level of the Markov network, restoring the minimal number n of observed links $I_1, \ldots, I_n$ needed to connect l and o does not create any cycles in the hidden subnetwork. Since entropy production in the steady state is always due to cycle currents, the entropy production in the hidden subnetwork is not increased by not observing $I_1, \ldots, I_n$, i.e., by adding $I_1, \ldots, I_n$ to the hidden subnetwork.

The interplay of statement 2, working "bottom up," and statement 3, coming "top down," is not limited to assessing the quality of the discussed estimator $\hat{\sigma}$. It also yields an algorithm for inferring topological aspects of the Markov network by identifying underlying spanning trees, connected components, the position of hidden cycles, and, lastly, their affinities and lengths by combining these rules with the methods introduced in Sec. IV.

VI. UNIFYING SEMI-MARKOV PERSPECTIVE

A. Identification of the semi-Markov description

In the transition-based description, each trajectory $\zeta$ of the underlying Markov network is mapped to a trajectory $\Gamma$ that includes only the observable transitions and the waiting times in between, i.e., symbolically,

    \zeta \mapsto \Gamma[\zeta].    (42)

Clearly, this mapping from $\zeta$ to $\Gamma$ is well defined and many-to-one. Adopting a different yet equivalent perspective, this kind of mapping for the underlying trajectory can be seen as a type of milestoning using the space of observable transitions for partitioning. Milestoning is a particular coarse-graining scheme from molecular dynamics simulations [68] introduced to stochastic thermodynamics in Refs. [59,60]. In short, the milestones represent certain events, whose occurrence indicates the crossing of a milestone that updates the coarse-grained state of the system. In practice, this approach results in a semi-Markov description for the coarse-grained system defined on the space of observable transitions. In other words, each observed transition I is identified as a state in the semi-Markov model.
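The mapping of Eq. (42) is straightforward to state in code. A minimal sketch (ours, not from the paper), assuming the underlying trajectory zeta is available as a list of (state, jump time) pairs and the observed links are given as ordered state pairs:

```python
def observed_trajectory(zeta, observed_links):
    """Map a full Markov trajectory zeta to its transition-based (semi-Markov)
    description Gamma [Eq. (42)].

    zeta           : list of (state, jump_time) pairs, one entry per jump,
                     i.e. the system enters `state` at `jump_time`
    observed_links : set of observable transitions, e.g. {(2, 3), (3, 2)}
    Returns the sequence [(I_1, T_1), (I_2, T_2), ...] of observed transitions
    and their times, as in Eq. (43)."""
    gamma = []
    for (s_prev, _), (s_next, t_jump) in zip(zeta[:-1], zeta[1:]):
        if (s_prev, s_next) in observed_links:
            gamma.append(((s_prev, s_next), t_jump))
    return gamma
```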
The following discussion includes the key concepts of semi-Markov processes in the context of stochastic thermodynamics; see Refs. [56,58,69] for details. The equivalence of the transition-based description to a semi-Markov model becomes evident on the level of single trajectories emerging from the mapping in Eq. (42). An effective trajectory $\Gamma$ containing N + 1 transitions, starting and ending with the registered transitions $I_1$ at time $T_0 = 0$ and $I_{N+1}$ at time $T_N = T$, respectively, is fully characterized by the sequence

    \Gamma = \{(I_1, T_1), (I_2, T_2), \ldots, (I_N, T_N)\}    (43)

for $0 \le t < T_N$. From a mathematical point of view, the sequence in Eq. (43) precisely defines a particular realization of a semi-Markov trajectory [56], in which the $\{I_k\}$ take the role of the states. Compared to a Markov process, in which the system is fully described by specifying the state i, a full semi-Markov description of the system requires knowing the state I and the waiting time t that has elapsed since I has been entered.

B. Semi-Markov kernels and embedded Markov chain

Since the theory of semi-Markov processes provides the mathematical framework of the effective description, quantities defined for the latter can be expressed in the language of the corresponding semi-Markov processes. The waiting time distribution $\psi_{I \to J}(t)$ assigned to each transition I, dubbed the intertransition time density in Ref. [61], is called the semi-Markov kernel in this framework. A semi-Markov kernel $\psi_{I \to J}(t)$ is defined as the joint distribution of the waiting time t and the transition destination J if the actual state is I with age zero, which coincides precisely with the definition of the waiting time distributions in Eq. (3). Integrating out the waiting time t of a semi-Markov kernel results in conditional probabilities

    p_{IJ} \equiv P(J \mid I) = \int_0^\infty dt\, \psi_{I \to J}(t)    (44)

for a transition between two semi-Markov states irrespective of the waiting time in I. These probabilities, whose ratios are already used in Eq. (9), can now be placed in a mathematical context. Based on the transition probabilities $p_{IJ}$ defined by Eq. (44), the concept of the embedded Markov chain (EMC) can be established for every semi-Markov process by integrating out its time variable [56]. The embedded Markov chain of the effective trajectory in Eq. (43) is given by the sequence

    \Gamma_{\mathrm{EMC}} = (I_1, I_2, \ldots, I_{N+1})    (45)

of observed transitions. The transition probabilities of the corresponding discrete-time Markov process are given by Eq. (44).

C. Path weight and time-reversal operation

According to the semi-Markov description, the path weight $P[\Gamma \mid I_1, 0]$ of the effective trajectory $\Gamma(t)$ conditioned on the first transition is simply given by

    P[\Gamma \mid I_1, 0] = \prod_{i=1}^{N} \psi_{I_i \to I_{i+1}}(t_i),    (46)

with $t_i = T_i - T_{i-1}$, where we follow the conventional definition [56,69–71]. Equation (46) coincides with the effective path weight defined for trajectories of the transition-based description in Ref. [61]. Note that the first and last transitions do not need to be treated differently [56,58,69,72], since the trajectory starts and ends with a transition by construction.

The time-reversal operation for the present semi-Markov process is not given by the conventional time-reversal operation for semi-Markov processes. Instead of simply reversing $\Gamma$ in time, as proposed in Refs. [56,70], two peculiarities emerging from the time reversal of the underlying trajectory $\zeta$ have to be taken into account. First, $\Gamma$ contains observed transitions that are odd under time reversal, similar to momenta, and therefore need to be reversed [39,59,73].
Thus, it is natural to define the reversed transition $\tilde I$ for a transition $I$ as

$$ I = (kl) \;\to\; \tilde I \equiv (lk). \qquad (47) $$

Second, we observe an effect introduced as kinetic hysteresis in Ref. [59]. After registering a transition $I = (ij)$ at time $t_I$, it would be misleading to treat $I$ as a compound state and conclude that the underlying system remains in $I$ until the next transition $J$ is observed at $t_J$. At some time $t$ with $t_I \le t \le t_J$, the state of the coarse-grained system is described completely by knowing the last transition $I$ and the time $t - t_I$ that has passed since then. However, the same point in time on the reversed trajectory is described by knowing that $t_J - t$ has passed since the last transition $\tilde J$. Thus, $\tilde J$ replaces $I$ as the latest registered transition. Combining both effects allows us to formulate the time reversal of a semi-Markov kernel $\psi_{I\to J}(t_J - t_I)$ as

$$ \tilde\psi_{I\to J}(t_J - t_I) \equiv \psi_{\tilde J\to\tilde I}(t_J - t_I), \qquad (48) $$

resulting in

$$ P[\tilde\Gamma|\tilde I_{N+1}, T] = \prod_{i=1}^{N} \psi_{\tilde I_{i+1}\to\tilde I_i}(t_i) \qquad (49) $$

for the conditioned path weight $P[\tilde\Gamma|\tilde I_{N+1}, T]$ of the time-reversed trajectory $\tilde\Gamma$. Clearly, the time reversal in Eq. (48) is identical to the time reversal proposed in Ref. [61], since the shift of intertransition times discussed there is precisely the effect of kinetic hysteresis described above. Note that the modifications to the time-reversal operation of the semi-Markov process arise naturally, in accordance with the paradigm that time reversal does not commute with coarse-graining of the form of Eq. (42), in general [59].

In the common conception of semi-Markov processes, the direction-time independence criterion is a necessary condition to ensure time-reversal symmetry in equilibrium [56,70]. Remarkably, the semi-Markov process as introduced here breaks this condition, in general. This apparent contradiction is resolved by noting that the derivation of direction-time independence relies crucially on the conventional time-reversal operation for semi-Markov processes, which does not apply here, as discussed above.

D. Interpretation of the entropy estimators

The entropy estimator $\langle\hat\sigma\rangle$ is established for unicyclic networks in Eq. (10). It is based on the microscopic fluctuation theorem in Eq. (8) valid for the ratio of waiting time distributions. The generalization of $\langle\hat\sigma\rangle$ to multicyclic networks with multiple observed links in Eq. (36), which includes the estimator for a single observed link, Eq. (27), as a special case, relies on the same fluctuation theorem generalized to the multicyclic case. From the semi-Markov perspective, these fluctuation theorems can be interpreted as the consequence of an actual fluctuation theorem of the semi-Markov process. We define the semi-Markov entropy production rate $\sigma_{\mathrm{SM}}$ as the limit

$$ \sigma_{\mathrm{SM}} \equiv \lim_{T\to\infty} \frac{1}{T} \ln\frac{P(\Gamma)}{P(\tilde\Gamma)}, \qquad (50) $$

which differs from the known expressions, e.g., in Refs. [69,72,74], because of the modified time-reversal operation. Comparing Eq. (50) to Eq. (29), we conclude that $\sigma_{\mathrm{SM}}$, in fact, equals $\hat\sigma$, which is established as a thermodynamically consistent coarse-grained entropy production term in the previous sections. In hindsight, the fluctuation theorem in Eq. (8) can be derived from Eq. (50) by specializing to semi-Markov trajectories with only a single transition. The underlying Markov description does not enter explicitly anymore; instead, it is incorporated implicitly by ensuring that $\sigma_{\mathrm{SM}}$ is the correct physical entropy production. The affinity estimators derived in Sec. IV can also be seen as consequences of Eq. (50), tracing back the entropy production to the level of contributing cycles.
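As a concrete illustration of Eq. (50), the following sketch evaluates the trajectory-level quantity $\ln P(\Gamma)/P(\tilde\Gamma)$ for a recorded sequence of observed transitions and waiting times under the modified time reversal of Eqs. (47) and (48). It is a minimal, hypothetical sketch rather than the analysis of this paper: it assumes the semi-Markov kernels $\psi_{I\to J}(t)$ are available as callables, neglects the non-time-extensive boundary term, and all function and variable names are illustrative.

```python
import numpy as np

def reverse_transition(I):
    """Time reversal of a single observed transition I = (k, l) -> (l, k), cf. Eq. (47)."""
    k, l = I
    return (l, k)

def log_path_weight_ratio(transitions, waits, psi):
    """
    Evaluate ln P[Gamma]/P[Gamma~] for one effective trajectory under the modified
    time reversal, i.e., the time-extensive part entering Eq. (50).

    transitions : list of N+1 observed transitions I_1, ..., I_{N+1}, each a tuple (k, l)
    waits       : list of N waiting times t_i between consecutive transitions
    psi         : dict mapping a pair (I, J) to a callable t -> psi_{I->J}(t)
    """
    log_ratio = 0.0
    for i, t in enumerate(waits):
        I, J = transitions[i], transitions[i + 1]
        forward = psi[(I, J)](t)
        # kinetic hysteresis: the reversed kernel runs from J~ to I~ with the same waiting time
        backward = psi[(reverse_transition(J), reverse_transition(I))](t)
        log_ratio += np.log(forward / backward)
    return log_ratio

# Dividing the returned value by the total duration T and letting T grow large
# approximates sigma_SM = sigma_hat as defined in Eq. (50).
```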
From the unifying semi-Markov perspective, we can give three complementary interpretations of the estimator $\langle\hat\sigma\rangle$. First, the derivation presented in Ref. [61] relies on the information-theoretical identification of the expected entropy production of a stochastic process as a Kullback-Leibler divergence between the path weights of a forward and backward process [36,37]. Second, contributions to the fluctuating quantity $\hat\sigma$ can be attributed to the completion of cycles in the underlying Markov network, which are partially observed by an external observer. Third, $\hat\sigma = \sigma_{\mathrm{SM}}$ can be interpreted as the entropy production rate of a semi-Markov process with a particular time-reversal operation. Thermodynamic consistency of $\hat\sigma$ is then coupled to the applicability of the time-reversal operation, which has to be established from the underlying network.

By interpreting $\hat\sigma$ as the entropy production $\sigma_{\mathrm{SM}}$ of the equivalent semi-Markov process, the decomposition proposed in Ref. [61] can be identified as a decomposition of $\langle\sigma_{\mathrm{SM}}\rangle$ into the entropy production $\langle\sigma_{\mathrm{EMC}}\rangle$ of the EMC and the remaining entropy production $\langle\sigma_{\mathrm{WTD}}\rangle$ caused by the waiting times:

$$ \langle\sigma_{\mathrm{SM}}\rangle = \langle\sigma_{\mathrm{EMC}}\rangle + \langle\sigma_{\mathrm{WTD}}\rangle. \qquad (51) $$

Up to a time conversion factor, $\langle\sigma_{\mathrm{EMC}}\rangle$ is the mean entropy production of the EMC, which is given by

$$ \langle\sigma_{\mathrm{EMC}}\rangle = \frac{1}{\langle t\rangle} \sum_{I,J} p^s_I\, p_{IJ} \ln\frac{p_{IJ}}{p_{\tilde J\tilde I}}, \qquad (52) $$

where $p^s_I$ is the steady state of the EMC as a discrete-time Markov chain. The factor $\langle t\rangle$, the average waiting time between two transitions, is needed because entropy production of a discrete-time Markov chain is naturally measured per step rather than per unit time. In terms of the application to observed links, $p^s_I$ quantifies the relative frequency of a particular transition $I$ in a long sequence of observed transitions as given by Eq. (45). Equivalently, Eq. (52) can be derived as the mean of

$$ \sigma_{\mathrm{EMC}} \equiv \lim_{T\to\infty} \frac{1}{T} \ln\frac{P[\Gamma_{\mathrm{EMC}}]}{P[\tilde\Gamma_{\mathrm{EMC}}]}, \qquad (53) $$

defined on the level of single trajectories $\Gamma_{\mathrm{EMC}}$, based on the arguments presented in Appendix B 3. Note that Eq. (52) coincides with Eq. (49) in Ref. [61], dubbed there the transition sequence contribution to the entropy estimator.

Since the EMC emerges from integrating out the temporal resolution of the semi-Markov process, $\langle\sigma_{\mathrm{EMC}}\rangle$ vanishes in situations with no observable net current. In other words, the contribution of a particular pair of transitions $I, J$ to $\sigma_{\mathrm{EMC}}$ vanishes if and only if the net number of transitions $J$ after a previous $I$ matches the number of transitions $\tilde I$ after a previous $\tilde J$ on average, i.e., if $P[\Gamma_{\mathrm{EMC}}] = P[\tilde\Gamma_{\mathrm{EMC}}]$. The condition of vanishing $\langle\sigma_{\mathrm{EMC}}\rangle$ can also be related to the stalling conditions. In fact, the entropy production associated with the embedded Markov chain coincides with the informed partial entropy estimator $\langle\sigma_{\mathrm{IP}}\rangle$ formulated for the case of one accessible transition [44,45], i.e.,

$$ \langle\sigma_{\mathrm{IP}}\rangle = \langle\sigma_{\mathrm{EMC}}\rangle, \qquad (54) $$

as proven in Appendix B 4. In particular, the force $F$ can be determined as

$$ \ln\frac{p_{I_+ I_+}}{p_{I_- I_-}} = \ln\frac{P(I_+|I_+)}{P(I_-|I_-)} = -F \qquad (55) $$

by virtue of Eq. (32) without referring to waiting times at all. This result is not surprising, since both estimators measure the affinity $A_C$ of a single, averaged "effective cycle" either through the applied force $F$ or through the ratio $\ln P(+|+)/P(-|-)$. Without the time resolution, the estimator $\langle\hat\sigma\rangle$ loses the ability to distinguish between longer or shorter hidden cycles. Thus, we can reformulate a conjecture proposed in Ref.
[61] that states that $\langle\sigma_{\mathrm{EMC}}\rangle$ exceeds an analogous expression based on the TUR, $\langle\sigma_{\mathrm{TUR}}\rangle$, since $\langle\sigma_{\mathrm{EMC}}\rangle \ge \langle\sigma_{\mathrm{TUR}}\rangle$ is equivalent to $\langle\sigma_{\mathrm{IP}}\rangle \ge \langle\sigma_{\mathrm{TUR}}\rangle$. As another consequence of Eq. (54), the fluctuation theorem proven in Ref. [45] for $\sigma_{\mathrm{IP}}$, the fluctuating counterpart of the estimator $\langle\sigma_{\mathrm{IP}}\rangle$, is related to its counterpart for the EMC, Eq. (52).

The second term in Eq. (51), $\langle\sigma_{\mathrm{WTD}}\rangle$, can be deduced by transferring the splitting of the entropy production into contributions from the EMC and remaining contributions from the waiting times to the individual semi-Markov kernels in the path weights. In more practical terms, a single semi-Markov kernel $\psi_{I\to J}(t)$ can be decomposed as

$$ \psi_{I\to J}(t) = p_{IJ}\, \psi(t|IJ), \qquad (56) $$

separating the contribution from the EMC from a conditional waiting-time kernel $\psi(t|IJ) = \psi_{I\to J}(t)/p_{IJ}$. By decomposing all kernels in the path weights using Eq. (56), we can identify $\langle\sigma_{\mathrm{WTD}}\rangle$ as a Kullback-Leibler divergence between the normalized probability densities $\psi(t|IJ)$ and their reverse $\psi(t|\tilde J\tilde I)$. Thus, the derivation in Ref. [61] amounts to factorizing out the EMC according to Eq. (56) in the context of semi-Markov processes. Using Eq. (40), we see that $\langle\sigma_{\mathrm{WTD}}\rangle$ vanishes if and only if all $a_{IJ}(t)$ are constant in time. In particular, all $a_{IJ}(t)$ are constant in time if detailed balance is satisfied in the hidden subnetwork.

The decomposition of the semi-Markov entropy production in Eq. (51) additionally clarifies the relation between the estimator $\langle\hat\sigma\rangle$ and the entropy estimator $\langle\sigma_{\mathrm{KLD}}\rangle$ introduced in Ref. [39], which is also decomposed in the form

$$ \langle\sigma_{\mathrm{KLD}}\rangle = \langle\sigma_{\mathrm{aff}}\rangle + \langle\tilde\sigma_{\mathrm{WTD}}\rangle. \qquad (57) $$

Similar to Eq. (51), this decomposition into contributions from waiting time distributions and affinities is obtained by splitting off the EMC. The analogy is further strengthened by noting that

$$ \langle\sigma_{\mathrm{aff}}\rangle = \langle\sigma_{\mathrm{IP}}\rangle = \langle\sigma_{\mathrm{EMC}}\rangle, \qquad (58) $$

with the first equality proven in Ref. [39]. Note that the respective embedded Markov chains are different objects, as $\langle\sigma_{\mathrm{aff}}\rangle$ refers to a coarse-grained unicyclic three-state model, whereas $\langle\sigma_{\mathrm{EMC}}\rangle$ observes only a single transition of this model. Nevertheless, the result is not entirely surprising in hindsight, since $\langle\sigma_{\mathrm{EMC}}\rangle$ recovers the full entropy production of a unicyclic model by virtue of Eq. (9).

The difference between the estimators $\langle\sigma_{\mathrm{WTD}}\rangle$ and $\langle\tilde\sigma_{\mathrm{WTD}}\rangle$, or $\langle\hat\sigma\rangle$ and $\langle\sigma_{\mathrm{KLD}}\rangle$, respectively, emerges from different rationales underlying the respective semi-Markov processes. Describing a physical system with a semi-Markov process is not sufficient to determine its entropy production uniquely, since the correct time-reversal operation needs to be discussed separately [39,66,67]. In total, three different time-reversal operations for semi-Markov processes are implicitly used to define entropy estimators for partially accessible Markov networks.

(1) Conventional time reversal, $\tilde\Gamma(t) = \Gamma(T-t)$: In this case, physically consistent semi-Markov processes satisfy direction-time independence [70], which causes $\langle\sigma_{\mathrm{WTD}}\rangle$ to vanish [56]. This time-reversal operation is applicable to particular settings of coarse-graining [56,58]. States do not change, i.e., they are even under time reversal.

(2) Modified time reversal, introduced above: This operation includes the kinetic hysteresis effect introduced in Ref. [59], which is natural for coarse-graining based on milestoning [60]. In our case, semi-Markov states model transitions, which are odd under time reversal.

(3) Time reversal for second-order semi-Markov processes, introduced in Ref.
[39]: States in a second-order semi-Markov process are doublets containing the previous and the current state by construction. Because of this memory effect, states are neither even nor odd under time reversal.

Any of these operations can be used to define an entropy via Eq. (50). This entropy can always be split according to Eq. (56), where the resulting waiting-time contributions are given by $0$, $\langle\sigma_{\mathrm{WTD}}\rangle$, and $\langle\tilde\sigma_{\mathrm{WTD}}\rangle$, respectively. In addition, all of the discussed operations are involutions, each giving rise to a dual dynamics for which an appropriate fluctuation theorem holds for the corresponding entropy production [3]. At this level, any nonvanishing entropy production quantifies a different mathematical notion of irreversibility, which becomes a thermodynamic quantity only if the time reversal is known to be justified physically [59].

VII. CONCLUSION

A. Summary and discussion

In this paper, we have introduced an effective description for partially accessible Markov networks based on the observation of transitions along individual links and of waiting times between successive observed transitions. The corresponding waiting time distributions yield an entropy estimator $\langle\hat\sigma\rangle$. Its fluctuating counterpart $\hat\sigma$ additionally obeys a fluctuation theorem and was shown to have a natural interpretation as a semi-Markov entropy production. On a microscopic level, we have discussed, using cycle fluctuation theorem arguments, why observing one link suffices to recover the full entropy production in a unicyclic network. More generally, we have derived an operational criterion that indicates the absence of hidden cycles, which guarantees $\langle\hat\sigma\rangle = \langle\sigma\rangle$. If the hidden part of the network contains hidden cycles, we have shown that the estimator $\langle\hat\sigma\rangle$ yields a lower bound on the entropy production, which improves on established estimation methods. Additionally, we have shown that the waiting time distributions contain information about the topology and cycle affinities of the hidden network. To extract this information, we have derived exact results and estimation methods, whose quality has been assessed numerically. Both the entropy estimator and the affinity estimators are built upon the generalized microscopic cycle fluctuation theorem argument, which is, as we have shown, the signature of a fluctuation theorem valid for an effective semi-Markov process. From the perspective of this semi-Markov process, we have unified extant entropy estimators by providing a mathematical interpretation.

Different inference methods can be compared based on the required input data and the significance of their predictions. In the case of a single link, $\langle\hat\sigma\rangle$ relies on the measurement of statistical data contributing to a single current. While the amount of input data is comparable to methods based on the TUR, the predictions generally are much stronger, at least in the unicyclic case. While the TUR provides lower bounds on entropy production and cycle affinity in this case [23], we recover exact values for both quantities even without access to the waiting times. When the waiting time distributions are available, exact cycle lengths can be deduced, which improves significantly on a known TUR-based trade-off relation between affinity and cycle length [32,33]. In terms of predictive significance, the entropy estimator is comparable to the method introduced in Ref. [39], which is based on knowing a coarse-grained subnetwork, but it requires substantially less information.
Calculating $\hat\sigma$ is possible without any knowledge about the underlying network beyond a single observed link. In particular, the issue of decimation schemes for coarse-graining is circumvented completely. Rather, the entropy estimator $\hat\sigma$ combines current measurements with information-theoretical notions via conditional counting, since our expectation on the next transition depends explicitly on the previous one [36]. Thus, the sequence of transitions forms a Markov chain, which is identified as the EMC in the corresponding semi-Markov description. A mathematical discussion of semi-Markov processes allows us to clarify physically distinct categories of semi-Markov descriptions depending on the correct underlying time-reversal operation. Although different entropylike quantities satisfy fluctuation theorems and provide a mathematical notion of irreversibility, the thermodynamically consistent entropy production must be identified by more fundamental means. If measuring the entropy production is feasible operationally, this knowledge can be used to decide which time-reversal operation recovers the correct entropy production. In this sense, identifying the correct time-reversal operation is a task of thermodynamic inference.

B. Perspectives

The transition-based effective description for partially accessible Markov networks and the derived estimators for entropy and topology open a wide range of possible subsequent research topics. First of all, it will be promising to generalize the estimators for affinity and cycle length to networks with multiple observable links. Based on such a generalization, it would become possible to apply the estimators to a broader range of networks. The combined observation of different links would additionally make it possible to infer more information about the network, because different affinities and cycle lengths would be accessible. With the macroscopic limit of large, complex systems in mind, it is an obvious, albeit ambitious, challenge to transfer thermodynamic inference methods to Markov networks whose cycles outnumber the observed links by far. Conceptually, the ratio of waiting time distributions separates the time-resolved notion of irreversibility from other time-dependent effects entering a waiting time distribution. The estimation techniques for topology and affinity that are based on the short-time limit and, hence, on short pathways infer local properties of Markov networks that may even be large. Passing from local to global methods would require a different approach. The dominant parts of the large-scale network structure might become manifest in patterns of particular transition sequences or waiting times in long trajectories. Splitting these into smaller snippets as proposed here is a first step toward a future study of self-correlations in a long trajectory to extract more complex structures.

To gain more insight into the effective description from the established perspective of coarse-graining, one should investigate how existing coarse-graining strategies for observable states [43,44,48–55,57] are related to the approach introduced here. By combining these complementary approaches and by taking into account conclusions on milestoning [59,60], the concept of coarse-graining can potentially be generalized to a more fundamental level.
From a practical perspective, we may ask how the method can be generalized to less ideal situations, e.g., if the observer cannot distinguish between different transitions or registers only particular patterns or sequences of transitions. This class of situations also includes the complementary problem in which particular states rather than particular transitions can be observed, because observing the arrival in a state is equivalent to observing all transitions into this state without the ability to distinguish between them.

The potential of waiting time distributions and their role in inference schemes is certainly not exhausted by the results presented here. Combining the estimators for entropy production and network topology with existing numerical methods may increase the usefulness of waiting time distributions in thermodynamic inference schemes. Fitting rates of the underlying Markov network to the recorded waiting time distribution [42] or using minimization methods [40,41] are promising tools to obtain tighter, more specialized bounds for the discussed estimators or even to reconstruct the transition rates of a small network from sufficient data. These methods will gain particular practical relevance, since topological aspects of the underlying network can then be deduced rather than having to be assumed. Furthermore, even though the effective description has been introduced and discussed for observable transitions of a partially accessible Markov network in the NESS, it is, in principle, not limited to this setting. For example, the description could be applied beyond the steady state to analyze transient dynamics. Finally, it would be interesting to apply the approach to a Langevin dynamics to explore the adjustments needed for systems with continuous degrees of freedom.

APPENDIX A: WAITING TIME DISTRIBUTIONS FROM PATH WEIGHTS AND TRAJECTORY SNIPPETS

1. Markovian path weights and master equation

FIG. 5. Example for the analytical calculation of waiting time distributions based on effective absorbing dynamics. (a) Effective description for the partially accessible two-cycle network from the main text. Only transitions from state 2 to state 3 and in the reversed direction are observable. (b) Underlying Markov network. On the fundamental level of description, the network is Markovian, and transitions from state $k$ to state $l$ are governed by the transition rate $k_{kl}$. (c) Effective absorbing Markov network. Between two observed transitions, the system can be described with an absorbing master equation. This intermediate hidden dynamics is terminated by either a transition (32) or a transition (23). (d) Exemplary waiting time distribution derived from the numerical solution of the absorbing master equation for the effective dynamics and the corresponding distribution determined from a histogram of the waiting times within a trajectory of length $T = 10^7$ generated with a Gillespie simulation [75] of the network. The transition rates of the network are drawn randomly.

We consider the effective description of a given, only partially accessible system in which transitions are observed, e.g., the effective two-cycle network from the main text based on the observation of transitions between states 2 and 3, shown in Fig. 5. We assume that there is an underlying, more fundamental network to which a discrete Markovian description from the perspective of stochastic thermodynamics, as described in detail in Ref.
[3], can be applied. For the effective description in Fig. 5(a), this full Markov network with two fundamental cycles is shown in Fig. 5(b). Transitions from state $k$ to state $l$ are governed by a transition rate $k_{kl}$, which is independent of the time already spent in state $k$ due to the Markov property of the description. Thus, the waiting time distribution in a particular state must be memoryless and, therefore, exponentially distributed. In formulas, the probability density for surviving in state $k$ until exactly time $t$ is given by $\Gamma_k \exp(-\Gamma_k t)$, where $\Gamma_k = \sum_l k_{kl}$ denotes the escape rate of state $k$. Given that state $k$ is exited, a transition to state $l$ is weighted with the transition rate and, therefore, happens with the transition probability $k_{kl}/\sum_l k_{kl}$.

Based on the discussed survival and transition probabilities, a path weight quantifying the probability of a trajectory $\zeta$ of the Markov network can be introduced. We assume that the network has $N$ states, is fully connected, and contains no unidirectional links; i.e., $k_{kl} > 0$ implies $k_{lk} > 0$. The path weight $P[\zeta(t)]$ for a generic trajectory $\zeta(t)$ conditioned on the initial state $k_0$ at time $t = 0$ is given by

$$ P[\zeta(t)|k_0, 0] = \prod_{k=1}^{N} \exp(-\Gamma_k \tau_k) \prod_{(kl)} k_{kl}^{n_{kl}}, \qquad (A1) $$

where the second product runs over all possible transitions $(kl)$ in the network. The trajectory-dependent quantities $\tau_k$ and $n_{kl}$ denote the total time spent in state $k$ and the total number of transitions $(kl)$ in $\zeta(t)$, respectively. In principle, a trajectory-dependent observable can be obtained by a path integral over all trajectories $\zeta$, which in practice means summing over the number $L$ of possible jumps and integrating over all transition times $t_1, \ldots, t_L$. An important consequence is that the probability to observe $L$ jumps in a short trajectory $\zeta$ of length $\Delta t$ scales as $P(L\ \text{jumps}) \sim \Delta t^L$ for $\Delta t \to 0$, since

$$ P(\zeta\ \text{contains exactly}\ L\ \text{jumps}\,|\,k_0, 0) = \prod_{l=1}^{L} \left[\int_0^{\Delta t} dt_l\right] P[\zeta(t)|k_0, 0] \sim \Delta t^L\, [1 + \mathcal{O}(\Delta t)], \qquad (A2) $$

because the path weight as given in Eq. (A1) is of order $1$ in $\Delta t$. Thus, a first-order differential equation governing the time evolution of the state probabilities can be derived by calculating the path weights for constant and one-jump trajectories, which are the only contributions containing terms of first order in $\Delta t$. The resulting differential equation

$$ \partial_t p_k(t) = \sum_{l\neq k} \left[p_l(t)\, k_{lk} - p_k(t)\, k_{kl}\right] \qquad (A3) $$

is known as the master equation and can be solved to obtain $p_k(t)$, the probability of finding the system in state $k$ at time $t$. Since the master equation description, Eq. (A3), is equivalent to the path weight description, solving the initial value problem for $p_k(0) = \delta_{k_0 k}$ amounts to calculating

$$ p_k(t) = P[\zeta(t) = k\,|\,\zeta(0) = k_0] \qquad (A4) $$
$$ \phantom{p_k(t)} = \sum_{\zeta(t)=k} P[\zeta(t)|k_0, 0]. \qquad (A5) $$

The symbolic notation of a sum over paths is used repeatedly in the following calculations.

2. From fully accessible networks to partially accessible networks

On a coarse-grained level of description, the trajectories of the network are only partially accessible. Thus, a complete analytical description by solving the master equation (A3) is generally impossible, because even the underlying fundamental network may be unknown. In the following, we assume that transitions along a single link connecting the Markov states $k$ and $l$ can be observed, but not the states themselves. This transition-based description coincides with the description proposed in Ref. [61]. Adopting the notation from the main text, a transition along this link, $k \to l$, and its reverse, $l \to k$, are abbreviated as $I_+ = (kl)$ and $I_- = (lk)$, respectively.
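This setting is straightforward to emulate numerically. The following minimal sketch, a hypothetical illustration rather than the code used for this paper, simulates the underlying Markov dynamics described above with the Gillespie algorithm [75] (exponential dwell times with escape rate $\Gamma_k$, jump probabilities $k_{kl}/\Gamma_k$) for a randomly drawn rate matrix and records only the transitions along a single observable link together with the waiting times in between. States are indexed from 0, and all function and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def gillespie_observed(K, observed_links, k0, T_max):
    """
    Simulate a Markov jump process with rate matrix K (K[k, l] = k_{kl}, zero diagonal)
    and record only the transitions along `observed_links`, a set of ordered pairs (k, l),
    together with the waiting times between consecutive observed transitions.
    """
    k, t = k0, 0.0
    last_obs_time = None
    sequence, waits = [], []
    while t < T_max:
        rates = K[k]
        escape = rates.sum()                                # escape rate Gamma_k = sum_l k_{kl}
        t += rng.exponential(1.0 / escape)                  # exponentially distributed dwell time
        l = int(rng.choice(len(rates), p=rates / escape))   # jump with probability k_{kl}/Gamma_k
        if (k, l) in observed_links:                        # register the transition only if the link is observable
            if last_obs_time is not None:
                waits.append(t - last_obs_time)
            sequence.append((k, l))
            last_obs_time = t
        k = l
    return sequence, waits

# Example: a four-state network with randomly drawn rates and the observable
# transitions I+ = (2, 3) and I- = (3, 2), mimicking the role of the observed link in Fig. 5.
K = rng.uniform(0.5, 2.0, size=(4, 4))
np.fill_diagonal(K, 0.0)
seq, waits = gillespie_observed(K, {(2, 3), (3, 2)}, k0=0, T_max=1.0e4)
```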
Since the sequence of observed jumps and the waiting times in between are the only accessible information about the system in our effective description, a typical example of an observed effective trajectory $\Gamma$ may look like

$$ \Gamma =\ ? \to I_+ \to I_+ \to I_- \to I_+ \to \cdots \quad \text{at jump times} \quad ?, T_0, T_1, T_2, T_3, \ldots, \qquad (A6) $$

where $?$ represents the unknown transition of the system in the past prior to the first observed transition. For simplicity, we assume from now on that the process starts and ends immediately after the observation of an observable transition, $I_1$ at time $T_0 = 0$ and $I_{N+1}$ at time $T_N = T$, to address the core of our argumentation without worrying about non-time-extensive initial and final terms of the trajectory. Moreover, the scheme indicated in Eq. (A6) can be generalized to any number of observable links. We write $I_n = (k_n l_n)$ for the $n$th observed transition between the underlying states $k_n$ and $l_n$, where we note that $l_n \neq k_{n+1}$, in general, as hidden dynamics cannot be excluded. Schematically, a coarse-grained trajectory $\Gamma$ takes the form

$$ \Gamma = I_1 \to I_2 \to I_3 \to \cdots \to I_N \to I_{N+1} \quad \text{at jump times} \quad T_0 = 0, T_1, T_2, \ldots, T_{N-1}, T_N = T. \qquad (A7) $$

Similar to the Markov case, the probability for $\Gamma$ can, in principle, be quantified by a path weight description. First of all, it is important to note that no memory effects ranging over multiple observed transitions need to be considered. The path weight for the future of the trajectory, i.e., the path weight for the trajectory after a transition $I_n$ is registered, is unaffected by the previous block $I_{n-1} \to I_n$, since knowing the transition $I_n = (k_n l_n)$ at $T_n$ implies knowing the state of the underlying Markovian system immediately after $T_n$. Thus, the path weight can be split into parts belonging to observed transitions:

$$ P[\Gamma(t)|I_1, 0] = P(I_2, T_1|I_1, 0)\, P(I_3, T_2|I_2, T_1) \cdots P(I_{N+1}, T_N|I_N, T_{N-1}), \qquad (A8) $$

with $P(J, T_J|I, T_I)$ denoting the probability of observing transition $J$ at time $T_J$ if transition $I$ is observed at time $T_I$. Constituting the elementary building blocks of the coarse-grained picture, the objects $P(J, T_J|I, T_I)$ quantify the probability of observing $J$ after a given $I$ with waiting time $T_J - T_I$ in between. Thus, Eq. (A8) can also be written as

$$ P[\Gamma(t)|I_1, 0] = \psi_{I_1\to I_2}(t_1)\, \psi_{I_2\to I_3}(t_2) \cdots \psi_{I_N\to I_{N+1}}(t_N), \qquad (A9) $$

with $t_i = T_i - T_{i-1}$ and $t_1 = T_1$, in terms of the waiting time distribution

$$ \psi_{I\to J}(t) = P(J, t|I, 0), \qquad (A10) $$

according to the definition of $\psi_{I\to J}(t)$ in Eq. (3). The waiting time distributions are normalized in the form

$$ \sum_J \int_0^\infty dt\, \psi_{I\to J}(t) = 1, \qquad (A11) $$

whereas integrating out the time variable gives the marginal distribution

$$ p_{IJ} \equiv \int_0^\infty dt\, \psi_{I\to J}(t) = P(\text{next observed transition is}\ J\,|\,\text{last observed transition is}\ I). \qquad (A12) $$

3. Effective absorbing dynamics

On a fundamental level, we are interested in how the path weights of the effective description, Eq. (A9), and their elementary building blocks, Eq. (A10), are linked to the path weights, Eq. (A1), of the corresponding microscopic trajectories of the full network. As a first step, we note that the way in which the effective trajectory $\Gamma$ is split carries over to a splitting of the microscopic trajectory $\zeta$ on the fundamental level, because not only the coarse-grained but the entire microscopic state is known at the observed transition events.
Symbolically, this can be denoted as

$$ \zeta \;\hat{=}\; \gamma^{t_1}_{I_1\to I_2} \to \gamma^{t_2}_{I_2\to I_3} \to \cdots \to \gamma^{t_N}_{I_N\to I_{N+1}}, \qquad (A13) $$

where $\gamma^{t}_{I\to J}$ is the snippet of the full trajectory between two subsequent observable transitions $I$ and $J$ with waiting time $t$ in between. This snippet starts in the destination state of $I$ and ends immediately after the transition event $J$ in the corresponding destination state. Since a given snippet is completed immediately after an observed transition $J$ is registered for the first time, each trajectory snippet can be interpreted as a trajectory of an effective Markovian absorbing dynamics defined on the network obtained from the full network by removing all observed links. As soon as the original trajectory $\zeta$ completes an observed transition, the absorbing dynamics for $\gamma$ is terminated immediately. The corresponding first-passage time is precisely the temporal length of $\gamma$ and corresponds to the waiting time $t$ between two transitions in the effective description.

Practically, the effective absorbing Markov network is obtained from the corresponding original network by treating all observable links as absorbing, i.e., by redirecting the observed transitions into absorbing states. An example of such an effective absorbing Markov network is shown in Fig. 5(c), which depicts the absorbing network for the effective description of the two-cycle network in Fig. 5(a). The possible transitions along the observed link are represented by the states (32) and (23), which are absorbing states in the associated first-passage problem. If the considered snippet begins with (23) or (32), the corresponding absorbing dynamics starts in state 3 or 2, respectively.

The effective trajectory $\Gamma$ originates from a mapping of microscopic trajectories, $\zeta \to \Gamma[\zeta]$, to the effective description of the system. The path weight of $\Gamma$ is obtained by summing over microscopic path weights,

$$ P[\Gamma(t)|I_1, 0] = \sum_{\zeta\in\Gamma} P[\zeta(t)|l_1, 0], \qquad (A14) $$

where $P[\zeta|l_1, 0]$ is conditioned on $l_1$ at time $t = 0$ for $I_1 = (k_1 l_1)$. While integrating out the Markov path weight $P[\zeta(t)]$ directly to obtain the coarse-grained path weight $P[\Gamma(t)|I_1, 0]$ is not feasible, in general, the decomposition of $\Gamma$ in Eq. (A9) and of $\zeta$ in Eq. (A13) reduces the problem to the level of the elementary building blocks $\psi_{I\to J}(t)$ and $\gamma$, respectively. Thus, the decomposition in Eq. (A9) can be combined with the summation in Eq. (A14) to obtain

$$ P[\Gamma(t)|I_1, 0] = \psi_{I_1\to I_2}(t_1)\, \psi_{I_2\to I_3}(t_2) \cdots \psi_{I_N\to I_{N+1}}(t_N) \qquad (A15) $$
$$ = \sum_{\gamma\in\psi_{I_1\to I_2}(t_1)} P[\gamma|l_1, 0] \sum_{\gamma\in\psi_{I_2\to I_3}(t_2)} P[\gamma|l_2, 0] \cdots \sum_{\gamma\in\psi_{I_N\to I_{N+1}}(t_N)} P[\gamma|l_N, 0]. \qquad (A16) $$

The path weights $P[\gamma|l_n, 0]$ of the individual snippets $\gamma = \gamma^{t}_{I\to J}$ are seamlessly conditioned on the final state of their predecessor, since $I_n = (k_n l_n)$. The only type of summation that needs to be performed is the calculation of the waiting time distributions $\psi_{I\to J}(t)$ conditioned on $I = (kl)$, as introduced in Eq. (A10), by summing over all possible $\gamma = \gamma^{t}_{I\to J}$:

$$ \psi_{I\to J}(t) = \sum_{\gamma\in\psi_{I\to J}(t)} P[\gamma^{t}_{I\to J}|l, 0]. \qquad (A17) $$

This equation identifies the waiting time distributions of the effective description as summations over unobservable trajectory snippets and, therefore, proves Eq. (4) in the main text. For $I = (kl)$ and $J = (mn)$, $\gamma^{t}_{I\to J}$ starts at $l$ and ends with a jump $(mn)$ exactly at time $t$. Since the system is in $m$ immediately before the jump at $t$, we can use the Markov property to calculate

$$ \psi_{I\to J}(t) = P[\text{jump}\ (mn)\ \text{at time}\ t\,|\,l, 0] \qquad (A18) $$
$$ = P[\text{jump}\ (mn)\ \text{at time}\ t\,|\,m, t]\, P(m, t|l, 0) \qquad (A19) $$
$$ = k_{mn}\, p_m(t), \qquad (A20) $$

where Eq. (A4) is used for the last equality. The result in Eq.
(A20) makes it possible to calculate waiting time distributions analytically by solving the master equation of the effective absorbing dynamics defined on the hidden subnetwork. Note that this procedure is, in principle, equivalent to calculating the first-passage time distributions of the associated first-passage problem with the method introduced in Ref. [76]. Conceptually, the reasoning used to derive Eq. (A17) and, therefore, Eq. (A20) is identical to the reasoning used in Ref. [61] to derive the intertransition time densities. For both derivations, the partially accessible Markov network considered in the transition-based description is mapped to an effective first-passage time problem, and the waiting time distributions are identified as the corresponding first-passage time distributions. In our derivation, this mapping is motivated by an effective splitting emerging on the level of single trajectories, whereas in the derivation in Ref. [61], the mapping is deduced mathematically.

Operationally, the proposed calculation method for waiting time distributions differs from the method proposed in Ref. [61]. Instead of carrying out the summation in Eq. (A17) explicitly, the waiting time distributions can be calculated from the solution of the effective absorbing master equation for different initial configurations using Eq. (A20). In addition, our calculation method is efficient, since collecting histogram data from a Gillespie simulation [75] is unnecessary to reconstruct the waiting time distributions, as they can be calculated directly. To give an explicit example, the proposed method is used to calculate the waiting time distributions for the effective description of the two-cycle network in Fig. 5(a). Solving the corresponding effective absorbing master equation for fixed, randomly drawn transition rates results in four different waiting time distributions; one of them is shown in Fig. 5(d). Additionally, the figure shows that this waiting time distribution based on Eq. (A20) coincides with the corresponding waiting time distribution calculated from histogram data simulated with a Gillespie algorithm of the full network for long trajectories.

APPENDIX B: ENTROPY ESTIMATOR

1. Coarse-grained and full entropy production

Our effective description loses information about irreversibility and entropy production. From an abstract point of view, a well-defined many-to-one mapping of trajectories $\zeta \mapsto \Gamma[\zeta]$ of length $T$ suffices to bound the mean coarse-grained entropy production rate $\langle\hat\sigma\rangle$ by the physical entropy production rate $\langle\sigma\rangle$:

$$ \langle\hat\sigma\rangle \equiv \frac{1}{T}\left\langle \ln\frac{P[\Gamma]}{P[\tilde\Gamma]}\right\rangle = \frac{1}{T}\sum_{\Gamma} P[\Gamma]\ln\frac{P[\Gamma]}{P[\tilde\Gamma]} \le \langle\sigma\rangle, \qquad (B1) $$

provided that $\Gamma \mapsto \tilde\Gamma$ is the correct, physical time-reversal operation. Technically, the bound relies on the log-sum inequality, a standard tool in information theory [77], stating

$$ \sum_i a_i \ln\frac{\sum_i a_i}{\sum_i b_i} \le \sum_i a_i \ln\frac{a_i}{b_i} \qquad (B2) $$

for $a_i \ge 0$, $b_i \ge 0$. We apply this inequality in the form [27,78]

$$ T\langle\hat\sigma\rangle = \sum_{\zeta,\Gamma} P[\Gamma|\zeta]\,P[\zeta] \ln\frac{\sum_{\zeta} P[\Gamma|\zeta]\,P[\zeta]}{\sum_{\tilde\zeta} P[\tilde\Gamma|\tilde\zeta]\,P[\tilde\zeta]} \le \sum_{\zeta,\Gamma} P[\Gamma|\zeta]\,P[\zeta] \ln\frac{P[\Gamma|\zeta]\,P[\zeta]}{P[\tilde\Gamma|\tilde\zeta]\,P[\tilde\zeta]} \qquad (B3) $$
$$ = \sum_{\zeta,\Gamma} P[\Gamma|\zeta]\,P[\zeta]\left[\ln\frac{P[\Gamma|\zeta]}{P[\tilde\Gamma|\tilde\zeta]} + \ln\frac{P[\zeta]}{P[\tilde\zeta]}\right] = T\langle\sigma\rangle. \qquad (B4) $$

The last equality follows since $P[\Gamma|\zeta] = 1$ is satisfied only if $\Gamma$ matches the correct effective trajectory $\Gamma[\zeta]$ and vanishes otherwise, $P[\Gamma|\zeta] = 0$. Moreover, the equality requires that the first term in the sum vanishes, i.e., requires $P[\tilde\Gamma|\tilde\zeta] = 1$ when $P[\Gamma|\zeta] = 1$.
This condition defines the modified time-reversal operation $\Gamma \mapsto \tilde\Gamma$ uniquely, since the correct $\tilde\Gamma$ is identified as the trajectory obtained by using $\xi = \tilde\zeta$ in the mapping $\xi \mapsto \Gamma[\xi]$. In other words, we first have to time reverse $\zeta$, which is then followed by the coarse-graining operation, as discussed in Ref. [59].

2. Time reversal and conditional counting entropy estimator

The previous section identifies the correct time-reversal operation $\Gamma \mapsto \tilde\Gamma$ as the coarse-graining applied to the microscopic time reverse $\tilde\zeta(t) = \zeta(T - t)$. An effective trajectory $\Gamma$ consists of a series of transitions $I_n = (k_n l_n)$ at times $T_n$, which is schematically denoted as

$$ \Gamma = (k_1 l_1) \xrightarrow{t_1} (k_2 l_2) \xrightarrow{t_2} (k_3 l_3) \xrightarrow{t_3} \cdots \xrightarrow{t_{N-1}} (k_N l_N) \xrightarrow{t_N} (k_{N+1} l_{N+1}). \qquad (B5) $$

Compared to Eq. (A7), the jump times $T_i$ are replaced by the waiting times $t_i = T_i - T_{i-1}$ with $T_0 = 0$. Reversing the corresponding microscopic trajectory $\zeta$ in accordance with the previous discussion gives a well-defined effective trajectory of the form

$$ \tilde\Gamma = (l_{N+1} k_{N+1}) \xrightarrow{t_N} (l_N k_N) \xrightarrow{t_{N-1}} \cdots \xrightarrow{t_3} (l_3 k_3) \xrightarrow{t_2} (l_2 k_2) \xrightarrow{t_1} (l_1 k_1) = \tilde I_{N+1} \xrightarrow{t_N} \tilde I_N \xrightarrow{t_{N-1}} \cdots \xrightarrow{t_3} \tilde I_3 \xrightarrow{t_2} \tilde I_2 \xrightarrow{t_1} \tilde I_1, \qquad (B6) $$

where we introduce the reversal operation on individual transitions, $\tilde I_n \equiv (l_n k_n)$ for $I_n = (k_n l_n)$. The reverse transition happens along the same link and is, therefore, also observable in the effective description by construction. The path weight for the backward trajectory, Eq. (B6), can be decomposed into a product of single waiting time distribution objects as in Eq. (A9):

$$ P[\tilde\Gamma(t)|\tilde I_{N+1}, 0] = P[\tilde I_N, T_N - T_{N-1}|\tilde I_{N+1}, 0]\, P[\tilde I_{N-1}, T_N - T_{N-2}|\tilde I_N, T_N - T_{N-1}] \cdots P[\tilde I_1, T_N|\tilde I_2, T_N - T_1] \qquad (B7) $$
$$ = \psi_{\tilde I_{N+1}\to\tilde I_N}(t_N)\, \psi_{\tilde I_N\to\tilde I_{N-1}}(t_{N-1}) \cdots \psi_{\tilde I_2\to\tilde I_1}(t_1). \qquad (B8) $$

After the proper time reverse $\tilde\Gamma$ is identified, the entropy production of a particular trajectory $\Gamma$ can be calculated explicitly as

$$ T\hat\sigma = \ln\frac{P[\Gamma]}{P[\tilde\Gamma]} = \ln\frac{P(I_1)}{P(\tilde I_{N+1})} + \ln\frac{P[\Gamma|I_1, 0]}{P[\tilde\Gamma|\tilde I_{N+1}, 0]} \qquad (B9) $$
$$ = \ln\frac{P(I_1)}{P(\tilde I_{N+1})} + \sum_{j=1}^{N} \ln\frac{\psi_{I_j\to I_{j+1}}(t_j)}{\psi_{\tilde I_{j+1}\to\tilde I_j}(t_j)} \qquad (B10) $$
$$ = \ln\frac{P(I_1)}{P(\tilde I_{N+1})} + T\sum_{I,J}\int_0^\infty dt\, \nu_{J|I}(t) \ln\frac{\psi_{I\to J}(t)}{\psi_{\tilde J\to\tilde I}(t)}, \qquad (B11) $$

where the conditional counters $\nu_{J|I}(t)$ are introduced as

$$ \nu_{J|I}(t) \equiv \frac{1}{T}\sum_{j=1}^{N} \delta(t_j - t)\, \delta_{I_{j+1},J}\, \delta_{I_j,I}. \qquad (B12) $$

In the limit $T \to \infty$, contributions from the initial and final states can be neglected, which yields the fluctuation theorem

$$ \hat\sigma = \lim_{T\to\infty} \frac{1}{T} \ln\frac{P[\Gamma]}{P[\tilde\Gamma]} = \sum_{I,J}\int_0^\infty dt\, \nu_{J|I}(t) \ln\frac{\psi_{I\to J}(t)}{\psi_{\tilde J\to\tilde I}(t)} \qquad (B13) $$

and an explicit formula for the expected coarse-grained entropy production rate,

$$ \langle\hat\sigma\rangle = \sum_{I,J}\int_0^\infty dt\, \langle\nu_{J|I}(t)\rangle \ln\frac{\psi_{I\to J}(t)}{\psi_{\tilde J\to\tilde I}(t)}. \qquad (B14) $$

3. Expectation values and entropy production for semi-Markov processes

We calculate the expectation value $\langle\nu_{J|I}(t)\rangle$ in Eq. (B14) using an appropriate technique known for semi-Markov processes. Note that the transitions $I, J, \ldots$ are the "states" of the semi-Markov process and that the waiting time $t$ "in state $I$" is interpreted as the time elapsed since transition $I$. As defined in the main text, the conditional counter $\nu_{I\to J}(t)$ measures the number of transitions $J$ after a preceding transition $I$:

$$ \nu_{I\to J}(t)\,\Delta t = \frac{\text{No. of}\ (IJ)\ \text{jumps after waiting time} \in [t, t + \Delta t]}{T}, \qquad (B15) $$
$$ \langle\nu_{I\to J}(t)\rangle = P(\text{jump to}\ J\ \text{after waiting time}\ t\,|\,I)\, \frac{\langle\text{No. of jumps from}\ I\rangle}{T} \qquad (B16) $$
$$ = \psi_{I\to J}(t)\,\langle n_I\rangle = \psi_{I\to J}(t)\, \frac{p^s_I}{\langle t\rangle}. \qquad (B17) $$

In the last line, we use

$$ \langle n_I\rangle = \frac{\langle\text{No. of jumps from}\ I\rangle}{T} = \frac{\langle\text{No. of jumps from}\ I\rangle}{\langle\text{total No. of jumps}\rangle} \cdot \frac{\langle\text{total No. of jumps}\rangle}{T} = p^s_I \cdot \frac{1}{\langle t\rangle}, \qquad (B18) $$

where $\langle t\rangle$ is defined as the average waiting time between two semi-Markov transitions.
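To illustrate how Eq. (B14) can be evaluated operationally, the following sketch estimates $\langle\hat\sigma\rangle$ from a recorded sequence of observed transitions and waiting times, for instance the output of the Gillespie sketch in Appendix A 2, by histogramming the conditional counters $\nu_{J|I}(t)$ and the kernels $\psi_{I\to J}(t)$. It is a hypothetical, minimal implementation with illustrative names and a simple uniform binning, not the numerical procedure used for the figures of this paper.

```python
import numpy as np

def estimate_sigma_hat(sequence, waits, T, bins=50):
    """
    Estimate <sigma_hat> via Eq. (B14) from a sequence of N+1 observed transitions
    (tuples (k, l)) and the N waiting times in between; T is the trajectory duration.
    """
    dt = max(waits) / bins
    counts, total_from = {}, {}
    for i, t in enumerate(waits):
        I, J = sequence[i], sequence[i + 1]
        hist = counts.setdefault((I, J), np.zeros(bins))
        hist[min(int(t / dt), bins - 1)] += 1.0          # conditional counting of (I, J) pairs
        total_from[I] = total_from.get(I, 0) + 1
    rev = lambda I: (I[1], I[0])                         # time reversal of a single transition
    sigma = 0.0
    for (I, J), hist in counts.items():
        back = counts.get((rev(J), rev(I)))
        if back is None:
            continue                                     # no reverse statistics yet; needs more data
        nu = hist / (T * dt)                             # empirical nu_{J|I}(t), cf. Eq. (B12)
        psi_f = hist / (total_from[I] * dt)              # empirical psi_{I->J}(t)
        psi_b = back / (total_from[rev(J)] * dt)         # empirical psi_{J~->I~}(t)
        mask = (hist > 0) & (back > 0)                   # avoid log(0) in sparsely sampled bins
        sigma += np.sum(nu[mask] * np.log(psi_f[mask] / psi_b[mask])) * dt
    return sigma

# Example (with seq and waits from the Gillespie sketch above):
#   sigma_hat_estimate = estimate_sigma_hat(seq, waits, T=1.0e4)
```

Such a plug-in estimate converges only for long trajectories and is sensitive to the binning of rarely sampled waiting times; when the underlying network is known, the absorbing-master-equation route of Eq. (A20) avoids this sampling issue by providing the kernels directly.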
The identification of the stationary distribution $p^s_I$ in Eq. (B18) is based on elementary results for discrete-time Markov chains, as the number of visits to a particular state $I$ in a long trajectory $(I_1 I_2 \ldots I_N)$, divided by $N$, tends toward $p^s_I$ as $N \to \infty$. Note that, although this distribution is related to the stationary distribution of the semi-Markov process itself, the two are different even in the Markovian case [56]. Since the $\psi_{I\to J}(t)$ are normalized by virtue of Eq. (A12), we can integrate over $t$ to obtain the expected flux $\langle n_{J|I}\rangle$ from a semi-Markov state $I$ to $J$ as

$$ \langle n_{J|I}\rangle = \int_0^\infty dt\, \langle\nu_{I\to J}(t)\rangle = p_{IJ}\, \frac{p^s_I}{\langle t\rangle} = \langle n_I\rangle\, p_{IJ}. \qquad (B19) $$

The semi-Markov entropy production $\sigma_{\mathrm{SM}}$ is defined by Eq. (50) as the probability ratio of forward and backward trajectory under the time-reversal operation $\Gamma \mapsto \tilde\Gamma$. Thus, the calculations of the previous Sec. B 2 starting from Eq. (B11) actually apply to the semi-Markov entropy production $\sigma_{\mathrm{SM}} = \hat\sigma$. Substituting Eq. (B17) into Eq. (B14), we obtain

$$ \langle\sigma_{\mathrm{SM}}\rangle = \langle\hat\sigma\rangle = \sum_{I,J}\int_0^\infty dt\, \langle n_I\rangle\, \psi_{I\to J}(t) \ln\frac{\psi_{I\to J}(t)}{\psi_{\tilde J\to\tilde I}(t)} \qquad (B20) $$

for the semi-Markov entropy production. To relate this expression to the entropy production of the EMC, we apply the log-sum inequality after using Eq. (B17) to obtain

$$ \langle\hat\sigma\rangle \ge \frac{1}{\langle t\rangle}\sum_{I,J} p^s_I \left[\int_0^\infty dt\, \psi_{I\to J}(t)\right] \ln\frac{\int_0^\infty dt\, \psi_{I\to J}(t)}{\int_0^\infty dt\, \psi_{\tilde J\to\tilde I}(t)} = \frac{1}{\langle t\rangle}\sum_{I,J} p^s_I\, p_{IJ} \ln\frac{p_{IJ}}{p_{\tilde J\tilde I}} = \langle\sigma_{\mathrm{EMC}}\rangle \qquad (B21) $$

in accordance with Eq. (52).

4. Comparison to informed partial entropy production

a. Entropy estimators: Embedded Markov chain versus informed partial

In this section, we prove that

$$ \langle\sigma_{\mathrm{EMC}}\rangle = \langle\sigma_{\mathrm{IP}}\rangle \qquad (B22) $$

in the one-link case, which implies $\langle\sigma_{\mathrm{IP}}\rangle \le \langle\hat\sigma\rangle$ by virtue of Eq. (B21). We prove the case of one observable link between the Markov states $k$ and $l$, since the crucial relation, Eq. (B27), and its proof can be generalized to multiple observed links following an analogous approach. For the two states $+ = (kl)$ and $- = (lk)$, Eq. (B21) simplifies to

$$ \langle\sigma_{\mathrm{EMC}}\rangle = \frac{1}{\langle t\rangle}\big(p^s_+ p_{++} - p^s_