Thermodynamic Inference in Partially Accessible Markov Networks: A Unifying Perspective from Transition-Based Waiting Time Distributions

Jann van der Meer, Benjamin Ertel, and Udo Seifert
II. Institut für Theoretische Physik, Universität Stuttgart, 70550 Stuttgart, Germany

(Received 22 March 2022; revised 2 June 2022; accepted 1 July 2022; published 12 August 2022)

The inference of thermodynamic quantities from the description of an only partially accessible physical system is a central challenge in stochastic thermodynamics. A common approach is coarse-graining, which maps the dynamics of such a system to a reduced effective one. While coarse-graining states of the system into compound ones is a well-studied concept, recent evidence hints at a complementary description by considering observable transitions and waiting times. In this work, we consider waiting time distributions between two consecutive transitions of a partially observable Markov network. We formulate an entropy estimator using their ratios to quantify irreversibility. Depending on the complexity of the underlying network, we formulate criteria to infer whether the entropy estimator recovers the full physical entropy production or whether it provides only a lower bound that improves on established results. This conceptual approach, which is based on the irreversibility of underlying cycles, additionally enables us to derive estimators for the topology of the network, i.e., the presence of a hidden cycle, its number of states, and its driving affinity. Adopting an equivalent semi-Markov description, our results can be condensed into a fluctuation theorem for the corresponding semi-Markov process. This mathematical perspective provides a unifying framework for the entropy estimators considered here and established earlier ones. The crucial role of the correct version of time reversal helps to clarify a recent debate on the meaning of formal versus physical irreversibility. Extensive numerical calculations based on a direct evaluation of waiting time distributions illustrate our exact results and provide an estimate of the quality of the bounds for affinities of hidden cycles.

DOI: 10.1103/PhysRevX.12.031025        Subject Areas: Statistical Physics

I. INTRODUCTION

Over the past two decades, stochastic thermodynamics has emerged as a comprehensive universal framework for describing small driven systems [1–5]. One major paradigm comprises a Markovian, i.e., memoryless, dynamics on a set of discrete states, which arises from integrating out fast microscopic degrees of freedom under the assumption of a timescale separation. Such a fairly general Markov network model is of widespread use in the description of chemical and biophysical processes, ranging from chemical reaction networks [6–10] to protein folding [11–13], molecular motors [14–19], and molecular dynamics in general [20–22].

There is, however, a difference between identifying an effective description of a complex system and actually having full access to it in practice. On the arguably coarsest level of description, one is interested in estimation methods for crucial quantities like the entropy production. As a prominent result, the thermodynamic uncertainty relation (TUR) [23–25] provides thermodynamic bounds that can be used in estimation techniques for entropy [26–31] or topology [32,33] if it is possible to measure currents of the underlying system.
These currents are a trace of the fundamental time-reversal asymmetry in dissipative systems [34,35] that can also be utilized directly as an entropy estimator [36–38]. Furthermore, entropy estimators that incorporate or are even based on waiting times between measurable events have been discussed more recently [39–42]. For a partially visible Markov network, entropy production can be estimated through the fraction that is visible in the subsystem through passive observation [43] or by controlling adjustable parameters [44,45].

These methods raise the general issue of how an underlying, only partially accessible system is related to a reduced effective model, a topic known as coarse-graining in stochastic thermodynamics. Earlier interest in the field mainly considered coarse-graining as a mapping in which unresolved Markov states are lumped into compound states, for example, via schemes described in Refs. [46–50]. In general, the resulting system is no longer Markovian, so that a description of the dynamics or the entropy production is formulated in terms of phenomenological, apparent equations [27,51–55]. While particular symmetric systems can be described as semi-Markov processes in this coarse-graining approach [56–58], a general framework to describe situations with incomplete information remains an open issue. To give a recent example [59,60], allowing states that are not contained in any compound state breaks with the well-studied paradigm of state lumping as a coarse-graining scheme. This novel scheme extends our ability to formulate thermodynamically consistent models while also exhibiting new effects such as kinetic hysteresis that require a refined understanding of the relationship between time reversal and coarse-graining.

In this work, we discuss thermodynamic inference based on the observation of a few transitions and their waiting time distributions rather than on the observation of a few states. This strategy has been proposed independently in the very recent Ref. [61], where the corresponding estimator for entropy production is introduced and its properties derived using mainly concepts from information theory, in particular, the Kullback-Leibler divergence. In our complementary approach, which is based on the analysis of cycles, we show that the underlying trajectory-dependent quantity obeys a fluctuation theorem. Our analysis reveals that this estimator is the entropy production of a semi-Markov process. In particular, we show that the description discussed in the present work and in Ref.
[61] shows kinetic hysteresis [59]. Mathematically, this effect is the consequence of a time-reversal operation that differs from the one that is usually employed for semi-Markov processes. In this context, higher-order semi-Markov processes [39] fit into the picture naturally as semi-Markov processes with yet another time-reversal operation. Thus, our mathematical perspective establishes semi-Markov processes as an underlying common model while also highlighting the subtleties involved in identifying the correct time-reversal operation.

Thermodynamic inference is not limited to estimating entropy production. We show that the waiting time distributions allow us to infer topological properties and further thermodynamic quantities like the number of states in cycles and their driving affinity. Furthermore, we propose an inductive scheme to detect the presence of hidden cycles in a complex network.

The paper is structured as follows. In Sec. II, we describe the setup and present our key results qualitatively. The fundamental concepts of our effective description are introduced in Sec. III for the paradigmatic model of a single observed link in a unicyclic Markov network. By generalizing these concepts to multicyclic Markov networks in Sec. IV, we propose and discuss an entropy estimator and inference methods theoretically and numerically. The general framework of multiple observed links in a multicyclic Markov network is discussed in Sec. V. In Sec. VI, we discuss our and related work from the perspective of semi-Markov processes. We conclude with a summary and an outlook on further work in Sec. VII.

II. SETUP AND KEY QUALITATIVE RESULTS

We start with a general Markov network of N interconnected states, e.g., the one shown in Fig. 1(a). At time t, a state i(t) = k is assigned to the physical system, with k = 1, ..., N. The time evolution follows a stochastic description by allowing transitions between two states k and l that are connected by a link (equivalently, an edge) in the network. Quantitatively, these transitions from k to l and their reverse happen instantaneously with transition rates $k_{kl}$ and $k_{lk}$, respectively. We assume that $k_{kl} > 0$ implies $k_{lk} > 0$ to ensure thermodynamic consistency. In the long-time limit $t \to \infty$, the probability $p_k(t)$ to observe the system in a particular state k at time t approaches a constant value $p_k^s$, which characterizes the stationary state of the network.

In general networks, it is possible to walk along closed loops. These are accessed systematically from the network by identifying its cycles C, which are defined as closed, directed loops without self-crossings. From a thermodynamic perspective, cycles are a crucial concept due to their possibility to break time-reversal symmetry by favoring the forward direction over the reverse or vice versa. This preference is quantified by the cycle affinity $A_C$, defined as the logarithm of the product over all forward rates in C divided by the corresponding backward rates,

    A_C = \ln \prod_{(kl) \in C} \frac{k_{kl}}{k_{lk}}.    (1)

As shown in Fig. 1(b), the network from Fig. 1(a) has three different cycles with different affinities. The affinity $A_C$ is also related to the entropy production associated with the cycle C [62,63]. For biochemical reactions or driving along a periodic track by a force, the affinity is given by the free energy change or the dissipated work, respectively [3].
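For readers who want to evaluate Eq. (1) numerically, the following minimal Python sketch (not part of the original paper) computes the affinity of a given cycle from a matrix of transition rates; the rate values and the state labeling are purely illustrative.

```python
import numpy as np

# Hypothetical transition rates k[i][j] for a four-state network as in Fig. 1(a);
# indices 0..3 stand for states 1..4. The numerical values are illustrative only.
k = np.array([
    [0.0, 2.0, 0.0, 1.0],
    [1.0, 0.0, 3.0, 0.0],
    [0.0, 1.0, 0.0, 2.0],
    [2.0, 0.0, 1.0, 0.0],
])

def cycle_affinity(k, cycle):
    """Affinity A_C = ln prod_{(kl) in C} k_kl / k_lk of a closed cycle,
    given as a sequence of states with cycle[0] == cycle[-1]."""
    A = 0.0
    for a, b in zip(cycle[:-1], cycle[1:]):
        A += np.log(k[a, b] / k[b, a])
    return A

# Cycle C0 = (1 2 3 4 1) in the labels of Fig. 1(b), i.e. states 0, 1, 2, 3, 0 here.
print(cycle_affinity(k, [0, 1, 2, 3, 0]))
```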
Cycles C with nonvanishing affinities give rise to macroscopic, sustained flows along their constituent links, even in the limit of large observation times T. These circular flows are the cause of the mean entropy production rate

    \langle \sigma \rangle = \sum_C j_C A_C,    (2)

where $j_C$ is the expected net number of completed cycles C divided by the observation time T in the limit $T \to \infty$ [63–65]. If $\langle \sigma \rangle > 0$, there is a constant rate of dissipation in the stationary state, which is then referred to as a nonequilibrium stationary state (NESS).

Calculating the entropy production via Eq. (2) requires the ideal case of knowing all cycles and all cycle currents, which is not practically feasible, in general. In our setup, we assume that an external observer measures individual transitions along a limited number of edges connecting neighboring states in the Markov network. Conceptually, this approach coincides with the transition-based effective description proposed in Ref. [61]. Notationally, we discern transitions from states by utilizing capital letters I, J, ... and write I = (kl) to express that I is a transition from the Markov state k to the Markov state l. An example illustrating this effective description for observable transitions (23) and (32) in the Markov network from Fig. 1(a) is shown in Figs. 1(c) and 1(d).

The central objects of interest for this effective description are waiting time distributions of the form

    \psi_{I \to J}(t) \equiv P(J, T_J - T_I = t \mid I),    (3)

which quantify the probability density that the transition J is measured at time $T_J = T_I + t$ given that the previous transition I is registered at time $T_I$. With transitions I, J replacing states k, l, waiting time distributions $\psi_{I \to J}(t)$ are the time-resolved analog of transition rates $k_{kl}$. Figures 1(e)–1(g) illustrate the concept of waiting time distributions for the effective description in Figs. 1(c) and 1(d).

FIG. 1. Key concepts of the effective description for an exemplary Markov network. (a) Markov network including four different states. Every link between state i and state j allows for transitions in both directions with respective transition rates $k_{ij}$ and $k_{ji}$. (b) Different cycles within the network. The three different cycles in the network are numbered incrementally starting with cycle $C_0 = (12341)$, drawn as a green dashed curve, cycle $C_1 = (1231)$, drawn as a blue dash-dotted curve, and cycle $C_2 = (1341)$, drawn as an orange dotted curve. By definition in Eq. (1), the affinity of $C_0$ is given by $A_{C_0} = \ln(k_{12}k_{23}k_{34}k_{41}/k_{21}k_{32}k_{43}k_{14})$; $A_{C_1}$ and $A_{C_2}$ are defined analogously. Furthermore, these affinities coincide with $A_C = \ln P(\circlearrowleft)/P(\circlearrowright)$, the quotient of probabilities to observe a completed cycle in the forward and backward direction, respectively [cf. Eq. (7)]. (c) Effective description of the network if only the link between 2 and 3 is observable. Observing this link gives information about transitions between 2 and 3, i.e., (23) and its reverse (32), and intermediate waiting times. (d) Observable cycles in the effective description. Two successive transitions along the observable link indicate the completion of a cycle. As indicated with gray color, only completions of $C_0$ or $C_1$ can be registered, since $C_2$ does not include the observed link. Additionally, $C_0$ and $C_1$ are drawn as curves with the same color, because, by counting transitions without temporal resolution, we cannot distinguish between both cycles. (e) A trajectory and its effective description. The observable parts of a trajectory of the underlying network are transitions (23) and (32) at corresponding transition times. By conditioning the observed transitions on the previous ones, four different waiting time distributions for the different combinations of subsequent transitions can be defined. (f),(g) Waiting time distributions for the observable link for fixed transition rates. The four different waiting time distributions of the observed link are illustrated; they are calculated with the method introduced in Appendix A 3. The particular choice of transition rates is given in Appendix E.

In the following, we derive several remarkable results centered around these waiting time distributions and their underlying semi-Markov description, which are summarized here on a qualitative level.

(1) For a unicyclic network, it is sufficient to determine the $\psi_{I \to J}(t)$ from just one edge in order to infer the affinity of the cycle C and the exact mean entropy production rate $\langle \sigma \rangle$ from the ratio of these distributions. We recover this result of Ref. [61] independently, here based on a microscopic fluctuation theorem from the perspective of network cycles. Since the full entropy production is inferred by this estimator, it beats the TUR, which, in general, does not recover the full entropy production even in a unicyclic network.

(2) For a multicyclic network, the same information from just one edge yields the affinity of the shortest cycle, its length, and the length of the second-shortest cycle this edge is a part of. Second, it yields a lower bound on the largest cycle affinity contributing to the current through this edge. Finally, it provides a lower bound on the overall entropy production of the network that coincides with the bound proposed in Ref. [61]. This bound is shown to be tighter than the entropy estimator in Ref. [44] while also omitting any assumptions of physical control over system parameters at the observed edge.
(3) If several edges can be observed, the estimator of the total entropy production becomes successively tighter. Based on the ratios of the $\psi_{I \to J}(t)$, we establish operational criteria to infer the presence of hidden cycles and of hidden entropy production not accounted for by the estimator.

(4) From a mathematical perspective, observing transitions results in a semi-Markov process. The cycle-based approach of this work and the information-theoretical approach of Ref. [61] can be seen as equivalent strategies to establish the entropy production of the corresponding semi-Markov process. From this point of view, we relate the proposed entropy estimator to the semi-Markov entropy estimator proposed and discussed in Refs. [39,66,67] and highlight the crucial role of the different time-reversal operations.

III. UNICYCLIC NETWORK AS PARADIGM

As an introductory example, we consider a Markov network with only a single cycle C in its NESS. In this network, we observe a single edge between neighboring states k and l that is part of the cycle. We assume that forward and backward transitions along this edge can be distinguished and denote forward transitions (kl) by $I_+$ and backward transitions (lk) by $I_-$, respectively. On the microscopic level, a waiting time distribution of the form $\psi_{I \to J}(t)$ has contributions only from microscopic trajectories $\gamma^t_{I \to J}$ that start with a transition I and end with another one, J, after time t without any other observed transition in between.
With a microscopic path weight $P[\gamma]$ for microscopic trajectories $\gamma$, the waiting time distribution can be expressed as

    \psi_{I \to J}(t) = \sum_{\gamma^t_{I \to J}} P[\gamma^t_{I \to J} \mid I],    (4)

which sums only trajectory snippets of the form $\gamma = \gamma^t_{I \to J}$ with a path weight that is conditioned on the first jump I at time $T_I$. For example, the waiting time distribution $\psi_{I_+ \to I_+}(t)$ originates from a trajectory snippet $\gamma^t_{I_+ \to I_+}$ of length t with the jump sequence $\gamma^t_{I_+ \to I_+} = k \to l \to \cdots \to k \to l$. Likewise, $\psi_{I_- \to I_-}(t)$ arises from $\gamma^t_{I_- \to I_-} = l \to k \to \cdots \to l \to k$. Although the identification in Eq. (4) is reasonable from a practical point of view, its derivation contains some subtleties that are explained in the full proof of Eq. (4) in Appendix A.

Since $\gamma^t_{I_- \to I_-}$ is the reverse of $\gamma^t_{I_+ \to I_+}$, the logarithmic ratio of the corresponding waiting time distributions,

    a(t) \equiv a_{I_+ \to I_+}(t) \equiv \ln \frac{\psi_{I_+ \to I_+}(t)}{\psi_{I_- \to I_-}(t)},    (5)

is a natural, antisymmetric measure of irreversibility of the underlying trajectory. As a first main result, we show that a(t) is independent of t and, in particular, can be identified with the cycle affinity $A_C$:

    a_{I_+ \to I_+}(t) \equiv a = -a_{I_- \to I_-}(t) = A_C.    (6)

This relation can be seen as a fluctuation theorem applied to sections of the underlying trajectory on the Markov network that give rise to a waiting time distribution $\psi_{I_+ \to I_+}(t)$. These sections are trajectory snippets $\gamma^t_{I_+ \to I_+}$ of the form given above, where the time difference between both jumps $k \to l$ is exactly t. To observe the genuine time reverse $\psi_{I_- \to I_-}(t)$, the underlying trajectory must complete the cycle in the reverse direction, which means

    P[\gamma^t_{I_- \to I_-} \mid I_-] = P[\gamma^t_{I_+ \to I_+} \mid I_+] e^{-A_C}    (7)

for the path weights of every possible trajectory snippet $\gamma^t_{I_\pm \to I_\pm}$. Since this argument holds true for all trajectories contributing to the waiting time distribution $\psi_{I_+ \to I_+}(t)$, we can sum the left side of Eq. (7) over all $\gamma^t_{I_- \to I_-}$ and the right side of Eq. (7) over all $\gamma^t_{I_+ \to I_+}$ to conclude that

    \psi_{I_- \to I_-}(t) = \psi_{I_+ \to I_+}(t) e^{-A_C}    (8)

using Eq. (4). Inserting Eq. (8) into Eq. (5) proves Eq. (6).

Since $a(t) = a = A_C$ is time independent, we obtain from Eq. (5)

    A_C = a = \ln \frac{\int_0^\infty dt\, \psi_{I_+ \to I_+}(t)}{\int_0^\infty dt\, \psi_{I_- \to I_-}(t)} = \ln \frac{P(I_+ \mid I_+)}{P(I_- \mid I_-)}    (9)

with an integration over the time t. The last equality follows from the definition of $\psi_{I \to J}(t)$ as a joint distribution in J and t in Eq. (3). Thus, the cycle affinity is encoded in conditional probabilities $P(J \mid I)$ to observe transition J after transition I irrespective of the intermediate waiting time.

The relationship between cycle affinities and a time-antisymmetric probability ratio, given by Eq. (6) [or, equivalently, Eq. (9)], indicates that a(t) can be used as an estimator for the mean entropy production rate $\langle \sigma \rangle$ in the steady state via

    \langle \sigma \rangle = j_C A_C = j_C a,    (10)

which is exact even for finite observation times T, because the average is taken in the NESS. This noninvasive estimator is directly accessible from an operational point of view, as by definition $j_C$ can be calculated by counting transitions along the observed link and $a(t) = a$ can be calculated either directly from histogram data for the waiting time distributions using Eq. (5) or from conditional probabilities deduced from observed transitions using Eq. (9). This unicyclic result also recovers one of the main results in Ref. [61], here using a technique based on the microscopic cycle fluctuation theorem Eq. (7).
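As a concrete illustration of this operational recipe, the following Python sketch (ours, not from the paper) estimates a via the conditional probabilities of Eq. (9) and the entropy production rate via Eq. (10) from a recorded sequence of forward and backward transitions through the observed link; it assumes a long stationary record in which both kinds of consecutive pairs occur.

```python
import numpy as np

def unicyclic_entropy_estimate(directions, T):
    """Estimate the cycle affinity a = ln P(+|+)/P(-|-) [Eq. (9)] and the
    entropy production rate j_C * a [Eq. (10)] from a record of observed
    transitions through a single link of a unicyclic network.

    directions : sequence of +1 (forward) / -1 (backward) observed transitions
    T          : total observation time
    """
    d = np.asarray(directions)
    # conditional probabilities P(+|+) and P(-|-) from consecutive pairs
    n_pp = np.sum((d[:-1] == +1) & (d[1:] == +1))
    n_mm = np.sum((d[:-1] == -1) & (d[1:] == -1))
    n_p = np.sum(d[:-1] == +1)
    n_m = np.sum(d[:-1] == -1)
    a = np.log((n_pp / n_p) / (n_mm / n_m))           # Eq. (9)
    j_C = (np.sum(d == +1) - np.sum(d == -1)) / T     # net current through the link
    return a, j_C * a                                 # affinity and entropy rate, Eq. (10)
```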
Thus, the result additionally addresses the conceptual issue of relating entropy production, cycles, and fluctuation theorems that is raised at the end of Ref. [61].

Conceptually, the identification $A_C = a(t)$ relies crucially on the observation of transitions rather than states. Two subsequent transitions in the same direction imply a completed cycle with associated entropy production, whereas two visits of the same compound state emerging from state lumping in typical coarse-graining strategies do not. As all transitions except for one are invisible in the present partially accessible system, previous state-based coarse-graining approaches would yield a trivial model containing only a single compound state. Note that alternating observed transitions, i.e., observing a forward transition after a backward transition or vice versa, can never imply the completion of an underlying cycle. Therefore, it is not surprising that the estimator of the entropy production of a unicyclic network contains only the statistics of two subsequent transitions in the same direction, as observed in Ref. [61].

IV. MULTICYCLIC NETWORKS WITH ONE OBSERVED TRANSITION

For a general network topology, we cannot reconstruct a unique underlying path contributing to the waiting time distributions $\psi_{I_+ \to I_+}(t)$ and $\psi_{I_- \to I_-}(t)$ as in the unicyclic case. Topologically distinct hidden pathways may result in the same pair of consecutive observed transitions. Nevertheless, bounds for the affinities of those cycles that include the observable link can be derived from the ratio a(t). In addition, the cycle lengths of specific cycles can be inferred from the short-time limit of the waiting time distributions. Furthermore, the entropy estimator for unicyclic networks can be generalized to the multicyclic case.

A. Bounds on cycle affinities

For each possible underlying cycle C with $I_+ \in C$, Eq. (7) is valid with the corresponding cycle affinity $A_C$, if $\gamma^t_{I_+ \to I_+}$ completes the cycle once in the forward direction without taking detours and $\gamma^t_{I_- \to I_-}$ denotes the corresponding reverse path. Thus, the bound

    \min_{C, I_+ \in C} A_C \le \ln \frac{P[\gamma^t_{I_+ \to I_+} \mid I_+]}{P[\gamma^t_{I_- \to I_-} \mid I_-]} \le \max_{C, I_+ \in C} A_C    (11)

is an immediate consequence for these trajectories $\gamma^t_{I_+ \to I_+}$ by comparing with the smallest and largest possible affinity, respectively. Remarkably, the inequality in Eq. (11) holds true for general $\gamma^t_{I_+ \to I_+}$, if the corresponding $\gamma^t_{I_- \to I_-}$ is defined appropriately by the following algorithm; a code sketch implementing these steps follows below.

(1) Consider the sequence of states in $\gamma^t_{I_+ \to I_+}$. For $I_+ = (kl)$, this is $(kl \cdots kl)$.
(2) Remove the first and last state: $(kl \cdots kl) \mapsto (l \cdots k)$.
(3) Reading from left to right, remove all closed loops; i.e., as soon as a state m appears twice, remove the intermediate part: $(\cdots a m b \cdots c m d \cdots) \mapsto (\cdots a m d \cdots)$.
(4) The remaining trimmed path visits each state at most once. This trimmed path completed with $I_+$ gives rise to a contributing cycle.
(5) Reverse the trimmed path and reintegrate the first and last state: $(k \cdots l) \mapsto (l k \cdots l k)$.
(6) Reintegrate the closed loops from step 3 without reversing: $(\cdots d m a \cdots) \mapsto (\cdots d m b \cdots c m a \cdots)$.

The resulting sequence of states determines the partial reverse $\mathcal{R}\gamma^t_{I_+ \to I_+}$, which is of the form $\gamma^t_{I_- \to I_-}$. This procedure identifies a trimmed path of $\gamma^t_{I_+ \to I_+}$ that visits each state at most once. By reversing only this trimmed path, one obtains the partial reverse of $\gamma^t_{I_+ \to I_+}$, which is denoted by $\mathcal{R}\gamma^t_{I_+ \to I_+}$. The associated cycle containing the transition $I_+$ that is reversed by $\mathcal{R}$ has to be one of the possible C in Eq. (11).
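The following Python sketch (ours, not part of the paper) implements steps (1)-(6) for a trajectory snippet given as its sequence of visited states; the example sequence is the one of Fig. 2(b). The way several loops attached to the same state are handled is one possible convention consistent with the rules above.

```python
def partial_reverse(states):
    """Partial reverse R of a trajectory snippet gamma^t_{I+ -> I+}, given as
    its sequence of visited states, e.g. [7, 1, 3, 2, 7, 6, 5, 7, 1] for the
    example of Fig. 2(b). Returns (reversed_states, contributing_cycle)."""
    k, l = states[0], states[1]                 # observed transition I+ = (kl)
    inner = list(states[1:-1])                  # step 2: drop first and last state

    trimmed, loops = [], {}                     # loops[i]: loops cut out after trimmed[i]
    for s in inner:                             # step 3: remove closed loops
        if s in trimmed:
            i = trimmed.index(s)
            removed = []                        # removed segment, in forward order,
            for pos in range(i + 1, len(trimmed)):   # with earlier loops spliced back in
                removed.append(trimmed[pos])
                for lp in loops.pop(pos, []):
                    removed.extend(lp)
            removed.append(s)
            loops.setdefault(i, []).append(removed)
            trimmed = trimmed[:i + 1]
        else:
            trimmed.append(s)

    cycle = [k] + trimmed                       # step 4: contributing cycle, closed by I+

    rev, n = [l], len(trimmed)                  # steps 5 and 6: reverse the trimmed path,
    for j, s in enumerate(reversed(trimmed)):   # reattach endpoints and unreversed loops
        rev.append(s)
        for lp in loops.get(n - 1 - j, []):
            rev.extend(lp)
    rev.append(k)
    return rev, cycle


print(partial_reverse([7, 1, 3, 2, 7, 6, 5, 7, 1]))
# ([1, 7, 6, 5, 7, 2, 3, 1, 7], [7, 1, 3, 2, 7]), as in Fig. 2(b)
```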
For an example of this procedure, see Fig. 2(b). Thus, inverting only the trimmed part of $\gamma^t_{I_+ \to I_+}$ while maintaining the original direction of the remaining transitions restores the inequality in Eq. (7) and, hence, also the bound in Eq. (11) for every possible microscopic trajectory $\gamma^t_{I_+ \to I_+}$ with the corresponding partner $\gamma^t_{I_- \to I_-} = \mathcal{R}\gamma^t_{I_+ \to I_+}$ defined in this way.

By averaging over all possible trajectory snippets of length t, we can combine Eq. (4) with Eq. (11), which is now valid for all $\gamma^t_{I_+ \to I_+}$ with corresponding partner $\gamma^t_{I_- \to I_-}$, to conclude

    \min_{C, I_+ \in C} A_C \le \ln \frac{\psi_{I_+ \to I_+}(t)}{\psi_{I_- \to I_-}(t)} \le \max_{C, I_+ \in C} A_C    (12)

for arbitrary $0 < t < \infty$. For this step, it is important to note that the algorithm provides a bijective mapping $\mathcal{R}$ between trajectories of the form $\gamma^t_{I_+ \to I_+}$ and trajectories of the form $\gamma^t_{I_- \to I_-}$. The inverse mapping is given by applying the same algorithm to $\gamma^t_{I_- \to I_-}$, except for reading right to left in step 3, to recover the correct sequence of states for $\gamma^t_{I_+ \to I_+}$.

The quotient in Eq. (12) can be identified as a(t) via Eq. (5). Thus, the extremal values of a(t) can be identified as bounds on the actual cycle affinities in the form

    A_{C_+} \equiv \max_C A_C \ge \sup_{0 \le t < \infty} a(t) \equiv a^*_+,    (13)

    A_{C_-} \equiv \min_C A_C \le \inf_{0 \le t < \infty} a(t) \equiv a^*_-.    (14)

Here, the maximum and minimum of the affinities are taken over all cycles C contributing to the observed link. Strong driving along or against the observed link manifests itself in a high positive or negative affinity for a given cycle, respectively. The inequalities (13) and (14) allow us to infer such a source of strong driving from its impact on a(t) from the viewpoint of the observed link.

The derived bounds for the cycle affinities are illustrated in Fig. 2. Figures 2(c) and 2(d) show that the extremal affinities $A_{C_+}$ and $A_{C_-}$ of the contributing cycles are indeed bounded by the maximum value $a^*_+$ and the minimum value $a^*_-$ of a(t). Furthermore, the affinity $A_{C_0}$ of the shortest contributing cycle is always equal to the initial value $a^*_0 \equiv a(t = 0)$, as we prove in the following section.

To quantify the quality of the bounds in Eqs. (13) and (14) for the network from Fig. 2(a), we distinguish two different classes of network realizations. A network with a particular configuration of transition rates belongs to class I if the initial value $a^*_0$ of a(t) is a global maximum or minimum. An exemplary a(t) of a realization of the network belonging to this class is shown in Fig. 3(a), case (I). For this class of network realizations, Eqs. (13) and (14) provide only a single bound for either the maximal or the minimal affinity of the cycles contributing to the observed link. The other bound is saturated by the shortest cycle with affinity $a^*_0 = A_{C_0}$. Class II contains the remaining realizations of the network in which $a^*_0 = a(t = 0)$ is not the global maximum or minimum. An example of an a(t) sorted into class II is given in Fig. 3(a), case (II); another one is already shown in Figs. 2(c) and 2(d). For this class of network realizations, Eqs. (13) and (14) provide bounds for both the maximal and the minimal affinity of the cycles contributing to the observed link, respectively.

For both classes of rate configurations, quality factors Q can be defined such that for Q = 1 equality holds in Eqs. (13) and (14) and the value of the bound equals the actual affinity of the cycle. For Q < 1, the quality factor quantifies the ratio between the value of the bound and the actual affinity of the corresponding cycle.
FIG. 2. Illustrative example for a partially accessible multicyclic network. (a) Effective description for a seven-state multicyclic network in which the link between state 1 and state 7 is observable, leading to five different contributing cycles $C_i$ numbered incrementally. The corresponding transition rates are given in Appendix E. For cycle $C_0 = (1271)$, the affinity $A_{C_0}$ vanishes; the affinity of cycle $C_1 = (13271)$ is $A_{C_1} = 3.18$; the affinity of cycle $C_2 = (134571)$ is $A_{C_2} = -1.43$; the affinity of cycle $C_3 = (1345671)$ is $A_{C_3} = 7.27$; and the affinity of cycle $C_4 = (1234571)$ is given by $A_{C_4} = -5.61$. (b) Example for a trimmed path. For the snippet $\gamma^t_{I_+ \to I_+}$ depicted with blue arrows, the sequence of visited states is (713276571). The trimmed path for this snippet is (713271) (cf. the algorithm in the main text). The corresponding $\gamma^t_{I_- \to I_-}$ is not the reversed sequence but rather (176572317) and is depicted with dashed orange arrows. Thus, the associated cycle is $C_1$, i.e., $\ln P[\gamma^t_{I_+ \to I_+} \mid I_+]/P[\gamma^t_{I_- \to I_-} \mid I_-] = A_{C_1}$. Terms due to the extra loop (7567) cancel in this path weight quotient. (c),(d) Estimation of the cycle affinities of the contributing cycles based on the extreme values of $a_{(71) \to (71)}(t)$. The maximal value $a^*_+ \simeq 0.13$ and the minimal value $a^*_- \simeq -0.66$ of $a_{(71) \to (71)}(t)$ are lower and upper bounds for the maximal affinity $A_{C_3} = 7.27$ and the minimal affinity $A_{C_4} = -5.61$, respectively. The initial value $a_{(71) \to (71)}(0) = a^*_0 = 0$ equals the affinity $A_{C_0} = 0$ of the shortest network cycle. The local maximum $a^*_1 \simeq 0.03$ and the local minimum $a^*_2 \simeq -0.05$ can be identified as lower and upper bounds for the affinities $A_{C_1} = 3.18$ and $A_{C_2} = -1.43$ of the remaining contributing cycles $C_1$ and $C_2$, respectively.

Using the affinity $A_{C_0}$ of the shortest cycle given by $a^*_0$ as a baseline, we introduce the relative distance

    \Delta a(t) \equiv |a(t) - A_{C_0}| = |a(t) - a^*_0|.    (15)

The quality factors are defined by comparing the maximal value

    \Delta a_+ = |a^*_+ - A_{C_0}|    (16)

and the minimal value

    \Delta a_- = |a^*_- - A_{C_0}|    (17)

of Eq. (15) with the respective actual distance between the true cycle affinities, given by $|A_{C_\pm} - A_{C_0}|$.

For network realizations belonging to class I, either Eq. (13) or Eq. (14) is a bound for the affinity of a single cycle. If the initial value $a^*_0$ is a global minimum, the maximal affinity $A_{C_+}$ of the cycles contributing to the observed link is bounded by Eq. (13). Thus, the quality factor $Q_I$ for this network realization is defined as

    Q_I \equiv \frac{\Delta a_+}{|A_{C_+} - A_{C_0}|}.    (18)

If the initial value $a^*_0$ is a global maximum, the minimal affinity $A_{C_-}$ of the cycles contributing to the observed link is bounded by Eq. (14), and the quality factor $Q_I$ for this network realization is given by

    Q_I \equiv \frac{\Delta a_-}{|A_{C_-} - A_{C_0}|}.    (19)

A graphical illustration of the quantities entering the definition of $Q_I$ is shown in Fig. 3(a), case (I).

For network configurations belonging to class II, both Eqs. (13) and (14) provide nontrivial bounds for the extremal affinities of the contributing cycles. To distinguish both bounds, two quality factors $Q^+_{II}$ and $Q^-_{II}$, defined similarly to Eqs. (18) and (19), are needed. The quality factor

    Q^+_{II} \equiv \frac{\Delta a_+}{|A_{C_+} - A_{C_0}|}    (20)

quantifies the quality of the bound Eq. (13) for the maximal affinity $A_{C_+}$ of the contributing cycles. The quality of the bound Eq. (14) for the minimal affinity $A_{C_-}$ of the contributing cycles is quantified analogously by
    Q^-_{II} \equiv \frac{\Delta a_-}{|A_{C_-} - A_{C_0}|}.    (21)

The quantities entering the definitions of $Q^+_{II}$ and $Q^-_{II}$ are illustrated in Fig. 3(a), case (II).

The quality factors for a total of 2 063 495 randomly drawn realizations of the multicyclic network from Fig. 2 are shown in Figs. 3(b)–3(d) as a function of the affinity $A_{C_0}$ of the smallest contributing cycle. The different structure and mean value of the quality factors $Q_I$ for network realizations from class I, shown in Fig. 3(b), when contrasted with the structures and mean values of the quality factors for network realizations from class II, shown in Figs. 3(c) and 3(d), indicate that the partition into two different classes of network realizations corresponds to distinct features of the network that are reflected in these affinity bounds.

The mean value of the quality factors for network realizations belonging to class I is given by $Q_I \simeq 0.4$, which means that the maximal or minimal affinity of the contributing cycles can be estimated based on Eq. (13) or Eq. (14) with an average accuracy of 0.4. This result is remarkable because, on the one hand, the estimation is based on a noninvasive observation of a single link of the network only and, on the other hand, to our knowledge, no coarse-graining inference scheme exists that bounds affinities of a partially accessible network to this degree of precision. The mean values of the quality factors for network realizations belonging to class II are given by $Q^+_{II} \simeq 0.2$ and $Q^-_{II} \simeq 0.1$, respectively. Compared to the bounds for realizations belonging to class I, realizations belonging to class II tend to yield quantitatively weaker bounds. However, local maxima and minima of a(t) seem to provide further, loose bounds for the affinities of other, nonextremal cycles contributing to the observed link. This numerical finding, illustrated for a given network realization in Fig. 2(c), indicates that each successive maximal and minimal value of a(t) corresponds to a contributing cycle. Therefore, the number of successive maximal and minimal values of a(t) can be interpreted as a lower bound for the total number of contributing cycles for networks from class II.

FIG. 3. Quality of the affinity bounds for the seven-state multicyclic network from Fig. 2. (a) Illustration of the quantities entering the definition of the quality factors for the two classes of network realizations. Case (I) shows a(t) for a network realization belonging to class I. Since $a^*_0$ is the global minimum of a(t), the quality factor $Q_I$ for this realization is defined according to Eq. (18) with $\Delta a_+ = |a^*_+ - a^*_0|$ from Eq. (16). Case (II) shows a(t) for a network realization belonging to class II. The quality factors $Q^+_{II}$ and $Q^-_{II}$ for this realization are defined according to Eqs. (20) and (21) with $\Delta a_+ = |a^*_+ - a^*_0|$ and $\Delta a_- = |a^*_- - a^*_0|$ from Eqs. (16) and (17), respectively. (b)–(d) Quality factors $Q_I$, $Q^+_{II}$, and $Q^-_{II}$ for 2 063 495 randomly drawn rate configurations of the multicyclic network as a function of the affinity $A_{C_0}$ of the smallest contributing cycle. The mean value of the quality factors $Q_I$ in (b) is given by $Q_I \simeq 0.4$, whereas the mean values of the quality factors $Q^+_{II}$ in (c) and $Q^-_{II}$ in (d) are given by $Q^+_{II} \simeq 0.2$ and $Q^-_{II} \simeq 0.1$, respectively. The difference between the quality factors in (c) and (d) for the same class of network realizations is caused by the ensemble for the transition rates, which is biased towards positive affinities, as explained in detail in Appendix E. All quality factors are determined from the corresponding waiting time distributions derived with the method explained in Appendix A 3.
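Operationally, the bounds of Eqs. (13) and (14) and the short-time value a(0) of Eq. (23) below can be read off directly from estimated waiting time distributions. A minimal sketch (ours, not from the paper), assuming the distributions have already been sampled on a common time grid whose first point lies in the short-time regime:

```python
import numpy as np

def affinity_bounds(psi_pp, psi_mm):
    """Affinity information extracted from sampled waiting time distributions
    psi_{I+ -> I+}(t) and psi_{I- -> I-}(t) on a common time grid whose first
    entry lies in the short-time regime.

    Returns (a0, a_max, a_min): the short-time value a(0), which equals the
    affinity of the shortest contributing cycle [Eq. (23)], and the extrema of
    a(t), which bound the largest and smallest contributing-cycle affinity
    from below and above, respectively [Eqs. (13) and (14)]."""
    a = np.log(psi_pp / psi_mm)        # a(t), Eq. (5)
    return a[0], np.max(a), np.min(a)
```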
B. Short-time limit and inference of cycle lengths

Additional information about the network can be obtained from the time dependence of the waiting time distributions $\psi_{I_+ \to I_+}(t)$ and $\psi_{I_- \to I_-}(t)$. In the limit $t \to 0$, only the shortest cycle(s) including the link with forward transition $I_+$ and backward transition $I_-$ contribute(s) to the waiting time distribution, as longer paths lead to effects of higher order in t. Thus, we can extract the number of hidden transitions $N_1$ needed to complete the smallest cycle and, if unique, its corresponding affinity $A_{C_0}$ from the waiting time distributions via

    \lim_{t \to 0} \left[ t \frac{d}{dt} \ln \psi_{I_\pm \to I_\pm}(t) \right] = N_1    (22)

and

    \lim_{t \to 0} a_{I_+ \to I_+}(t) = -\lim_{t \to 0} a_{I_- \to I_-}(t) = A_{C_0},    (23)

respectively, as proven in Appendix C 1. Note that $N_1 + 1$ is equal to the length of the smallest cycle, because, after $N_1$ hidden transitions, an additional observed transition is needed to complete the full cycle.

As an illustration for the identification of $N_1$, we consider the ratio of waiting time distributions for the observable link of the two-cycle network shown in Fig. 4(a). Figure 4(b) illustrates that the evaluation of Eq. (22) for $I_+ = (32)$ coincides with $N_1 = 2$, the minimal number of hidden transitions needed to observe (32) after (32) in the smallest cycle of the network. For the multicyclic network in Fig. 2, the identification of the affinity in Eq. (23) is illustrated in Fig. 2(c) together with the previously discussed affinity bounds, as the affinity $A_{C_0}$ of the shortest cycle is reflected in the initial value $a_{(71) \to (71)}(0) = 0$.

Terms of higher order around t = 0 of the form $t^N$ encode similar information about cycles of increasing size contributing to the observable link. Qualitatively, we can extract information about the number of hidden transitions $N_2$ needed to complete the second-shortest cycle from a(t), since

    a(t) - a(0) \sim t^{N_2 - N_1}.    (24)

More quantitatively, and as proven in Appendix C 2, the absolute value of the relative distance introduced in Eq. (15) can be seen as the lowest-order perturbation to the shortest cycle. Typically, e.g., if the affinities of the two shortest cycles do not coincide, this effect is due to the second-shortest cycle. In this case, $N_2$ can be extracted from Eq. (15) via

    \lim_{t \to 0} \left[ t \frac{d}{dt} \ln |\Delta a_{I \to J}(t)| \right] = N_2 - N_1    (25)

if $N_2 > N_1$, i.e., if the shortest cycle is unique. By combining the results from Eqs. (22) and (25), we can infer $N_2$ from observable waiting time distributions. Similar to the length of the shortest network cycle, the length of the second-shortest network cycle is given by $N_2 + 1$.
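A minimal sketch of this inference scheme (ours, not from the paper), assuming sampled waiting time distributions whose first few grid points lie deep in the short-time regime where the leading powers of t dominate; the finite-difference slopes are a crude stand-in for the limits in Eqs. (22) and (25):

```python
import numpy as np

def infer_cycle_lengths(t, psi_pp, psi_mm):
    """Estimate N1 and N2 from the short-time behavior of the waiting time
    distributions via the logarithmic slopes of Eqs. (22) and (25). The grid
    points t[0] < t[1] < t[2] are assumed to lie deep in the short-time
    regime, where psi ~ t^N1 and |a(t) - a(0)| ~ t^(N2 - N1)."""
    # slope of ln psi_{+->+}(t) vs ln t approximates N1 [Eq. (22)]
    N1 = (np.log(psi_pp[1]) - np.log(psi_pp[0])) / (np.log(t[1]) - np.log(t[0]))
    # slope of ln |a(t) - a(0)| vs ln t approximates N2 - N1 [Eq. (25)]
    da = np.abs(np.log(psi_pp / psi_mm) - np.log(psi_pp[0] / psi_mm[0]))
    dN = (np.log(da[2]) - np.log(da[1])) / (np.log(t[2]) - np.log(t[1]))
    # the lengths of the shortest and second-shortest contributing cycles
    # are N1 + 1 and N2 + 1, respectively
    return round(N1), round(N1 + dN)
```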
FIG. 4. Inference of cycle lengths and entropy estimation for a partially accessible two-cycle network. (a) Effective description of a four-state network with two cycles in which transitions along the link between states 2 and 3, i.e., (23) and (32), are observable. F is a dimensionless force applied to the observable link between states 2 and 3; all transition rates of the network are given in Appendix E. (b) Inference of the number of hidden transitions $N_1$ of the smallest network cycle $C_0$ based on waiting time distributions calculated with the method from Appendix A 3 for fixed $F = \ln 3$. $N_1 = 2$ corresponds to the slope of the short-time limit of $\ln \psi(t)$, resulting in $|C_0| = 3$. (c) Inference of the number of hidden transitions $N_2$ of the second-smallest network cycle $C_1$ based on waiting time distributions calculated with the method from Appendix A 3 for fixed $F = \ln 3$. $N_2 - N_1 = 1$ corresponds to the slope of the short-time limit of $\ln |\Delta a(t)|$, resulting in $N_2 = 3$ and $|C_1| = 4$. (d) The estimator $\langle \hat{\sigma} \rangle$ from Eq. (28) for the mean entropy production $\langle \sigma \rangle$ of the full network as a function of F. The details of the simulations of $\nu_{+|+}(t)$ and $\nu_{-|-}(t)$ are given in Appendix E. The method from Appendix A 3 is used to calculate a(t).

Figure 4(c) illustrates the evaluation of Eq. (25) for a(t) for $I_+ = (32)$, leading to $N_2 - N_1 = 1$. This result is consistent with $N_2 = 3$, the number of hidden transitions needed to observe (32) as the next observable transition after (32) along the second-smallest cycle of the network.

C. Entropy estimator

1. Definition

A time-dependent a(t) implies the presence of a second cycle, as longer waiting times between subsequent transitions hint at the completion of longer pathways. Exploiting this time dependence leads to an entropy estimator that generalizes the estimator of the unicyclic case. To quantify this notion, we let T be the length of a long trajectory with N + 1 transitions $I_k$ located at $T_{k-1}$. The observation starts with the transition $I_1$ at $T_0 = 0$ and ends with $I_{N+1}$ at time $T_N = T$. Then, the number of subsequent forward or backward transitions with waiting time t in between is given by the time-resolved conditional jump counters defined as

    \nu_{+|+}(t) \equiv \frac{1}{T} \sum_{m=1}^{N} \delta(T_m - T_{m-1} - t)\, \delta_{I_{m+1}, I_+} \delta_{I_m, I_+},    (26)

with $\nu_{-|-}(t)$ defined accordingly. These time-resolved conditional jump counters are used together with the logarithmic ratio of waiting time distributions a(t) defined in Eq. (5) to define a trajectory-dependent entropy estimator

    \hat{\sigma} \equiv \int_0^\infty dt\, a(t) \left[ \nu_{+|+}(t) - \nu_{-|-}(t) \right].    (27)

Operationally, $\nu_{+|+}(t)$ and $\nu_{-|-}(t)$ can be obtained by counting conditional transitions up to time t, and a(t) can be obtained from histograms for the waiting time distributions based on waiting times between observed transitions. As proven in Appendix B, in the limit of long trajectories, i.e., observation times $T \to \infty$, Eq. (27) defines an entropy estimator respecting time-reversal symmetry in thermodynamic equilibrium whose mean additionally satisfies

    \langle \hat{\sigma} \rangle \le \langle \sigma \rangle.    (28)

This property can be deduced from a fluctuation theorem

    \hat{\sigma} = \lim_{T \to \infty} \frac{1}{T} \ln \frac{P(\Gamma)}{P(\tilde{\Gamma})}    (29)

for the trajectory $\Gamma$ and its time reverse $\tilde{\Gamma}$, both emerging from trajectories of the underlying network by a mapping defined by the effective description of the system. An interpretation of $\Gamma$ from a mathematical point of view is given in Sec. VI.

2. Illustration and comparison to existing methods

A numerical illustration of the estimator [Eq. (27)] applied to the partially accessible two-cycle network is depicted in Fig. 4(d). The mean entropy production $\langle \sigma \rangle$ and the entropy estimator $\langle \hat{\sigma} \rangle$ are simulated for long, stationary trajectories and different values of a parameter F, which can be interpreted as a driving force applied to the observed link between the states 2 and 3. An external observer who is able to tune the force parameter F can find a value for which the net stationary current $j = \int_0^\infty dt \langle \nu_{+|+}(t) - \nu_{-|-}(t) \rangle$ vanishes. This setup and the particular value of F are referred to as stalling conditions and the stalling force, respectively [39,44,45].
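To make the operational character of Eq. (27) explicit, the following Python sketch (ours, not from the paper) estimates $\hat{\sigma}$ from a recorded sequence of transition directions and times through a single link; the uniform binning and the handling of empty bins are illustrative choices, not part of the method itself.

```python
import numpy as np

def entropy_estimator_single_link(directions, times, T, bins=100):
    """Estimate sigma_hat of Eq. (27) from an observed record of transitions
    through a single link.

    directions : +1 / -1 for forward / backward transitions
    times      : the corresponding transition times
    T          : total observation time
    """
    d, tau = np.asarray(directions), np.asarray(times)
    dt = np.diff(tau)                                  # waiting times between transitions
    pp = dt[(d[:-1] == +1) & (d[1:] == +1)]            # waits between consecutive + jumps
    mm = dt[(d[:-1] == -1) & (d[1:] == -1)]            # waits between consecutive - jumps
    edges = np.linspace(0.0, dt.max(), bins + 1)
    width = np.diff(edges)

    # histogram estimates of psi_{+->+}(t) and psi_{-->-}(t), cf. Eq. (3)
    h_pp, _ = np.histogram(pp, bins=edges)
    h_mm, _ = np.histogram(mm, bins=edges)
    psi_pp = h_pp / np.sum(d[:-1] == +1) / width
    psi_mm = h_mm / np.sum(d[:-1] == -1) / width

    # binned estimates of the conditional jump counters of Eq. (26)
    nu_pp = h_pp / (T * width)
    nu_mm = h_mm / (T * width)

    ok = (psi_pp > 0) & (psi_mm > 0)                   # skip empty bins
    a = np.log(psi_pp[ok] / psi_mm[ok])                # a(t), Eq. (5)
    return np.sum(a * (nu_pp[ok] - nu_mm[ok]) * width[ok])   # Eq. (27)
```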
Knowing this stalling force, through either measurement or calculation, amounts to knowing the effective "pressure" that the remaining network exerts on the link (23) against the force F. This information is incorporated in the so-called informed partial entropy estimator $\langle \sigma_{IP} \rangle$ introduced in Ref. [44]. Since the remaining network is taken into account through the effective pressure, $\langle \sigma_{IP} \rangle$ surpasses the estimator obtained by merely measuring the passive partial entropy production $\langle \sigma_{PP} \rangle$ that can be attributed to the transitions in an observed subset [43], i.e.,

    \langle \sigma_{PP} \rangle \le \langle \sigma_{IP} \rangle \le \langle \sigma \rangle,    (30)

as proven in the context of the informed partial estimator in Ref. [45]. Under stalling conditions, both estimators $\langle \sigma_{PP} \rangle$ and $\langle \sigma_{IP} \rangle$ become trivial, because they cannot rule out the possibility that the underlying system is at equilibrium if j = 0. The time-resolved estimator $\langle \hat{\sigma} \rangle$ introduced here, however, is able to infer nonequilibrium, since $\langle \hat{\sigma} \rangle > 0$ even if j = 0, as additional information enters its definition in Eq. (27). Intuitively, the waiting time distributions encode information about the hidden cycle in their time dependence through a nonconstant a(t).

More quantitatively, the estimator $\langle \hat{\sigma} \rangle$ defined by Eq. (27) numerically reproduces the bound of the waiting-time-distribution-based estimator proposed in Ref. [39] for the network in Fig. 4. Both the estimator in Ref. [39] and $\langle \hat{\sigma} \rangle$ share the features of considering successive transitions and adding a time resolution through waiting time distributions. However, $\langle \hat{\sigma} \rangle$ is formulated without the framework of a higher-order semi-Markov process or a Markov chain decimation scheme. While these differences render a general quantitative comparison with our estimator difficult, $\langle \hat{\sigma} \rangle$ beats the informed partial estimator $\langle \sigma_{IP} \rangle$ for long, stationary trajectories,

    \langle \sigma_{IP} \rangle \le \langle \hat{\sigma} \rangle \le \langle \sigma \rangle,    (31)

as we prove in Appendix B 4. Note that the expectation values are still taken in the limit of large observation times, in which finite-time effects at the initial and final transition can be neglected. It is also evident from the proof that equality is achieved in the first relation if and only if a(t) is time independent. Equality in the second relation is achieved if and only if removing the observed edge results in a network in which detailed balance is satisfied. To give a less formal interpretation of Eq. (31), observational access to the waiting time distributions contains more information than operational access to the observed link via the stalling force F. In particular, it is possible to measure F via

    -F = \ln \frac{P(I_+ \mid I_+)}{P(I_- \mid I_-)} = \ln \frac{\langle \int_0^\infty dt\, \nu_{+|+}(t) \rangle / \langle n_+ \rangle}{\langle \int_0^\infty dt\, \nu_{-|-}(t) \rangle / \langle n_- \rangle},    (32)

without perturbing the system at all, as we prove in Appendix B 4.

V. MULTIPLE OBSERVED LINKS IN A MULTICYCLIC NETWORK

Access to additional observable transitions provides further information about the underlying network, which allows us to infer topology qualitatively, by identifying allowed and forbidden sequences of transitions, and quantitatively, by sharpening our entropy estimator for multicyclic networks.

A. Entropy estimator

For M observed links, there are 2M possible transitions and a 2M × 2M matrix of quotients

    a_{IJ}(t) \equiv \ln \frac{\psi_{I \to J}(t)}{\psi_{\tilde{J} \to \tilde{I}}(t)}    (33)

with $I, J \in \{I^{(1)}_+, I^{(1)}_-, \ldots, I^{(M)}_-\}$. Here, $\tilde{I}$ is defined as the reverse transition $\tilde{I}^{(m)}_\pm \equiv I^{(m)}_\mp$, which yields a skew symmetry $a_{IJ} = -a_{\tilde{J}\tilde{I}}$.
Intuitively, the ratio in Eq. (33) encodes the entropy production term of an effective two-step trajectory $\Gamma^t_{IJ} = I \to J$ of length t. This term is related to the path weights of microscopic trajectory snippets $\gamma^t_{I \to J} = k \to l \to \cdots \to o \to p$ of the same length t between two observed transitions I = (kl) and J = (op) in the form

    a_{IJ}(t) = \ln \frac{P[\Gamma^t_{IJ} \mid I]}{P[\Gamma^t_{\tilde{J}\tilde{I}} \mid \tilde{J}]} = \ln \frac{\sum_{\gamma^t_{I \to J}} P[\gamma^t_{I \to J} \mid I]}{\sum_{\gamma^t_{\tilde{J} \to \tilde{I}}} P[\gamma^t_{\tilde{J} \to \tilde{I}} \mid \tilde{J}]}.    (34)

Similar to the unicyclic case in Eq. (5), unobserved degrees of freedom in the microscopic path $\gamma^t_{I \to J}$ are integrated out by the summation over the path weights.

The ratios in Eq. (33) allow us to generalize $\hat{\sigma}$, defined in Eq. (27), to multiple observed transitions. We define the conditional counters as

    \nu_{J|I}(t) \equiv \frac{1}{T} \sum_{m=1}^{N} \delta(T_m - T_{m-1} - t)\, \delta_{I_{m+1}, J} \delta_{I_m, I},    (35)

where we adopt the same notation as in Eq. (26); i.e., the mth transition $I_m$ is located at $T_{m-1}$. The sum over all $a_{IJ}(t)$ in a trajectory constitutes the entropy estimator

    \hat{\sigma} \equiv \sum_{IJ} \int_0^\infty dt\, a_{IJ}(t)\, \nu_{J|I}(t),    (36)

which reduces to Eq. (27) in the case of a single link, i.e., two possible transitions $I_\pm = \pm$. Thus, registering a jump J after a previous jump I during an observation of a long trajectory increases $\hat{\sigma}$ by $a_{IJ}(t)$, an antisymmetric increment in which inaccessible data beyond the registered observable ones are integrated out. The entropy estimator is thermodynamically consistent in the sense of Eq. (28) and satisfies the fluctuation theorem from Eq. (29) in the long-time limit $T \to \infty$. Moreover, the definition (36) provides the fluctuating counterpart of the entropy estimator for multicyclic networks introduced in Ref. [61], which is given by $\langle \hat{\sigma} \rangle$ in our notation.

B. Network topology

When we consider multiple transitions, their relative position in the network has a crucial impact on the observed data. For a given network, the waiting time distribution $\psi_{I \to J}(t)$ depends not only on the pair of transitions I, J, but on the entire set of observed links. For example, in the effective description of the network in Fig. 4(a), $a_{(23)(23)}(t)$ is time dependent but becomes time independent if, in addition, the transitions (13) and (31) are observed. The reason is that the fluctuation-theorem-like argument for the affinity can be restored, since observing $\psi_{(23) \to (23)}(t)$ then necessarily implies completion of the cycle C = (23412). Formulated differently, we can retrace the arguments underlying Eq. (11) to deduce an equality

    A_C = \ln \frac{P[\gamma^t_{(23) \to (23)} \mid (23)]}{P[\gamma^t_{(32) \to (32)} \mid (32)]},    (37)

because the only possible completed cycle is C. Based on this observation, we can conclude in more general terms that increasing the number of observed links in a network decreases the number of possible pathways in the remaining, hidden part of the underlying Markov network. This subnetwork, which is obtained by removing all observed links from the Markov network, is denoted the hidden subnetwork. While the hidden subnetwork is made up of the same states as the Markov network, it contains fewer links and, therefore, may be disconnected.

We can make a few technical but far-reaching observations, which are here formulated for long, stationary trajectories; i.e., expectation values are taken in the NESS and in the limit $T \to \infty$, as before. Let I = (kl) and J = (op) be two arbitrary observed transitions in the network.

(1) If the hidden subnetwork is topologically trivial, i.e., does not contain any cycles, then $\langle \hat{\sigma} \rangle = \langle \sigma \rangle$. Moreover, all $a_{IJ}(t)$ are time independent.
(2) A time-dependent $a_{IJ}(t)$ implies the presence of a cycle in the hidden subnetwork. More precisely, if $a_{IJ}(t)$ is nonconstant in time, then there is a cycle with nonvanishing affinity in the hidden subnetwork that connects the Markov states l and o. In particular, $\langle \hat{\sigma} \rangle < \langle \sigma \rangle$.

(3) If J cannot be an immediate successor of I, i.e., if $\psi_{I \to J}(t) = 0$, the Markov states l and o are not connected in the hidden subnetwork. In particular, we can leave out at least one observed transition without decreasing $\langle \hat{\sigma} \rangle$.

(4) The converse of 2 is not true. It is possible that $a_{IJ}(t)$ is constant in time despite a cycle with nontrivial affinity containing both l and o. However, this behavior is not the generic case but rather requires high symmetry. An explicit example containing such an invisible cycle is provided in Appendix E 5.

These four results are based on the microscopic origin of $a_{IJ}(t)$ as a ratio of path weights as indicated in Eq. (34). The crucial argument is an extension of the reasoning used in the unicyclic case to relate ratios of path weights to the cycle affinity A [cf. Eq. (7)]. We consider two consecutive transitions I = (kl) and J = (op) and two arbitrary paths $\gamma_1$ and $\gamma_2$ starting and ending in the Markov states l and o, respectively. Their path weights satisfy

    \ln \frac{P[\gamma_1 \mid l]}{P[\tilde{\gamma}_1 \mid o]} - \ln \frac{P[\gamma_2 \mid l]}{P[\tilde{\gamma}_2 \mid o]} = A_{12},    (38)

where $A_{12}$ is the affinity of the closed loop obtained by appending $\tilde{\gamma}_2$ to $\gamma_1$. If the hidden subnetwork does not contain any cycles, $A_{12} = 0$ follows trivially. Since $\gamma_1$ and $\gamma_2$ are arbitrary, Eq. (38) then implies the existence of a specific number $a_{IJ}$ satisfying

    P[\gamma^t_{I \to J} \mid I]\, e^{-a_{IJ}} = P[\gamma^t_{\tilde{J} \to \tilde{I}} \mid \tilde{J}]    (39)

for paths $\gamma^t_{I \to J}$ of arbitrary length t with time reverse $\gamma^t_{\tilde{J} \to \tilde{I}}$. By summing the previous equation over all possible trajectories of the form $\gamma^t_{I \to J}$, we conclude

    a_{IJ}(t) = \ln \frac{\sum_{\gamma^t_{I \to J}} P[\gamma^t_{I \to J} \mid I]}{\sum_{\gamma^t_{\tilde{J} \to \tilde{I}}} P[\gamma^t_{\tilde{J} \to \tilde{I}} \mid \tilde{J}]} = \ln \frac{\psi_{I \to J}(t)}{\psi_{\tilde{J} \to \tilde{I}}(t)}.    (40)

In particular, $a_{IJ}(t)$ is time independent if the hidden subnetwork does not contain any cycles or if it satisfies detailed balance, i.e., if any cycles in the hidden subnetwork have vanishing affinity. This argument establishes rule 1. To emphasize the relation to our previous results, we note that Eq. (40) can be seen as a special case of the affinity bounds from Eq. (11), which collapse to equalities if the set of possible $A_C$ contains only one element. If the hidden subnetwork is a spanning tree, the diagonal element $a_{II} = A_C$ is the affinity of the cycle C in the unicyclic network obtained by adding the link I back to the hidden subnetwork. In particular, every cycle then passes through at least one observed link and is, therefore, registered. Since NESS entropy production stems from cycle currents, it seems plausible to conjecture $\langle \hat{\sigma} \rangle = \langle \sigma \rangle$. Up to contributions from the first and last transition of the trajectory, the statement even holds on the level of individual trajectories in the form

    \hat{\sigma} = \sigma,    (41)

as is proven in Appendix D.

Rule 2 is obtained from Eq. (38) by reversing the argument above. Since a nontrivial time dependence of $a_{IJ}(t)$ is impossible if $A_{12}$ vanishes for all $\gamma_1$ and $\gamma_2$, there must be at least one cycle with nonvanishing affinity.
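A possible operational reading of rules 2 and 3, sketched in Python (ours, not from the paper) under the assumption that the waiting time distributions for all pairs of observed transitions have been sampled on a common time grid; the tolerances used to decide "vanishing" and "nonconstant" are arbitrary illustrative choices.

```python
import numpy as np

def hidden_subnetwork_diagnostics(psi, tol=1e-9, var_tol=1e-3):
    """Operational check of rules 2 and 3 on sampled waiting time distributions.

    psi : dict mapping a pair of transitions ((k, l), (o, p)) to an array of
          samples of psi_{I->J}(t) on a common time grid
    Returns (driven, forbidden): pairs (I, J) whose a_IJ(t) is nonconstant,
    signaling a driven hidden cycle connecting l and o (rule 2), and pairs
    that never occur in succession, i.e. psi_{I->J} = 0, signaling that l and
    o are disconnected in the hidden subnetwork (rule 3)."""
    rev = lambda I: (I[1], I[0])
    driven, forbidden = [], []
    for (I, J), p in psi.items():
        if np.all(p < tol):
            forbidden.append((I, J))
            continue
        q = psi.get((rev(J), rev(I)))
        if q is None or np.all(q < tol):
            continue
        mask = (p > tol) & (q > tol)
        a_IJ = np.log(p[mask] / q[mask])            # a_IJ(t), Eq. (33)
        if a_IJ.size > 1 and np.ptp(a_IJ) > var_tol:   # nonconstant in time
            driven.append((I, J))
    return driven, forbidden
```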
We now argue that, despite the counterexample given in Appendix E 5, the converse of rule 2 is usually satisfied in a generic setup. If $a_{IJ}(t)$ is constant in time, it equals its limit $a_{IJ}(0)$ as $t \to 0$. By a timescale separation argument similar to Eq. (23), only the shortest connection between the corresponding Markov states l and o contributes in the short-time limit, whereas longer connections are suppressed and lead to higher-order effects. A hidden cycle containing l and o can be split along these states, giving rise to two topologically distinct pathways $\gamma_1$ and $\gamma_2$. Unless both pathways contain the exact same number of states, one class of paths is suppressed by the other in the short-time limit. Thus, the hidden cycle must contain an even number of states to avoid this timescale separation argument. In addition to this purely qualitative argument, generic choices of transition rates generally lead to different first-passage times from l to o depending on the topology of the path, which would also lead to a nontrivial time dependence in $a_{IJ}(t)$.

While the derivation of rule 3 is straightforward from a mathematical point of view, it is of high value operationally, as it can be used to infer the connected components of the hidden subnetwork. In addition, this rule describes a scheme to identify the transitions needed to recover the full entropy production. While rule 2 gives a simple criterion for when a particular set of observed transitions is insufficient to conclude $\langle \sigma \rangle = \langle \hat{\sigma} \rangle$, rule 3 formulates a complementary criterion about transitions that are redundant for the entropy estimate. On the level of the Markov network, restoring the minimal number n of observed links $I_1, \ldots, I_n$ needed to connect l and o does not create any cycles in the hidden subnetwork. Since entropy production in the steady state is always due to cycle currents, the entropy production in the hidden subnetwork is not increased by not observing $I_1, \ldots, I_n$, i.e., by adding $I_1, \ldots, I_n$ to the hidden subnetwork.

The interplay of statement 2, working "bottom up," and statement 3, coming "top down," is not limited to assessing the quality of the discussed estimator $\hat{\sigma}$. It also yields an algorithm for inferring topological aspects of the Markov network by identifying underlying spanning trees, connected components, the position of hidden cycles, and, lastly, their affinities and lengths by combining these rules with the methods introduced in Sec. IV.

VI. UNIFYING SEMI-MARKOV PERSPECTIVE

A. Identification of the semi-Markov description

In the transition-based description, each trajectory $\zeta$ of the underlying Markov network is mapped to a trajectory $\Gamma$ that includes only the observable transitions and the waiting times in between, i.e., symbolically,

    \zeta \mapsto \Gamma[\zeta].    (42)

Clearly, this mapping from $\zeta$ to $\Gamma$ is well defined and many-to-one. Adopting a different yet equivalent perspective, this kind of mapping for the underlying trajectory can be seen as a type of milestoning using the space of observable transitions for partitioning. Milestoning is a particular coarse-graining scheme from molecular dynamics simulations [68] introduced to stochastic thermodynamics in Refs. [59,60]. In short, the milestones represent certain events, whose occurrence indicates the crossing of a milestone that updates the coarse-grained state of the system. In practice, this approach results in a semi-Markov description for the coarse-grained system defined on the space of observable transitions. In other words, each observed transition I is identified as a state in the semi-Markov model.
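The mapping of Eq. (42) is straightforward to state in code. A minimal sketch (ours, not from the paper), assuming the underlying trajectory zeta is available as a list of (state, jump time) pairs and the observed links are given as ordered state pairs:

```python
def observed_trajectory(zeta, observed_links):
    """Map a full Markov trajectory zeta to its transition-based (semi-Markov)
    description Gamma [Eq. (42)].

    zeta           : list of (state, jump_time) pairs, one entry per jump,
                     i.e. the system enters `state` at `jump_time`
    observed_links : set of observable transitions, e.g. {(2, 3), (3, 2)}
    Returns the sequence [(I_1, T_1), (I_2, T_2), ...] of observed transitions
    and their times, as in Eq. (43)."""
    gamma = []
    for (s_prev, _), (s_next, t_jump) in zip(zeta[:-1], zeta[1:]):
        if (s_prev, s_next) in observed_links:
            gamma.append(((s_prev, s_next), t_jump))
    return gamma
```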
The following discussion includes the key concepts of semi-Markov processes in the context of stochastic thermodynamics; see Refs. [56,58,69] for details. The equivalence of the transition-based description to a semi-Markov model becomes evident on the level of single trajectories emerging from the mapping in Eq. (42). An effective trajectory $\Gamma$ containing N + 1 transitions, starting and ending with the registered transitions $I_1$ at time $T_0 = 0$ and $I_{N+1}$ at time $T_N = T$, respectively, is fully characterized by the sequence

    \Gamma = \{(I_1, T_1), (I_2, T_2), \ldots, (I_N, T_N)\}    (43)

for $0 \le t < T_N$. From a mathematical point of view, the sequence in Eq. (43) precisely defines a particular realization of a semi-Markov trajectory [56], in which the $\{I_k\}$ take the role of the states. Compared to a Markov process, in which the system is fully described by specifying the state i, a full semi-Markov description of the system requires knowing the state I and the waiting time t that has elapsed since I has been entered.

B. Semi-Markov kernels and embedded Markov chain

Since the theory of semi-Markov processes provides the mathematical framework of the effective description, quantities defined for the latter can be expressed in the language of the corresponding semi-Markov processes. The waiting time distribution $\psi_{I \to J}(t)$ assigned to each transition I, dubbed the intertransition time density in Ref. [61], is called the semi-Markov kernel in this framework. A semi-Markov kernel $\psi_{I \to J}(t)$ is defined as the joint distribution of the waiting time t and the transition destination J if the actual state is I with age zero, which coincides precisely with the definition of the waiting time distributions in Eq. (3). Integrating out the waiting time t of a semi-Markov kernel results in conditional probabilities

    p_{IJ} \equiv P(J \mid I) = \int_0^\infty dt\, \psi_{I \to J}(t)    (44)

for a transition between two semi-Markov states irrespective of the waiting time in I. These probabilities, whose ratios are already used in Eq. (9), can now be placed in a mathematical context. Based on the transition probabilities $p_{IJ}$ defined by Eq. (44), the concept of the embedded Markov chain (EMC) can be established for every semi-Markov process by integrating out its time variable [56]. The embedded Markov chain of the effective trajectory in Eq. (43) is given by the sequence

    \Gamma_{\mathrm{EMC}} = (I_1, I_2, \ldots, I_{N+1})    (45)

of observed transitions. The transition probabilities of the corresponding discrete-time Markov process are given by Eq. (44).

C. Path weight and time-reversal operation

According to the semi-Markov description, the path weight $P[\Gamma \mid I_1, 0]$ of the effective trajectory $\Gamma(t)$ conditioned on the first transition is simply given by

    P[\Gamma \mid I_1, 0] = \prod_{i=1}^{N} \psi_{I_i \to I_{i+1}}(t_i),    (46)

with $t_i = T_i - T_{i-1}$, where we follow the conventional definition [56,69–71]. Equation (46) coincides with the effective path weight defined for trajectories of the transition-based description in Ref. [61]. Note that the first and last transitions do not need to be treated differently [56,58,69,72], since the trajectory starts and ends with a transition by construction.

The time-reversal operation for the present semi-Markov process is not given by the conventional time-reversal operation for semi-Markov processes. Instead of simply reversing $\Gamma$ in time, as proposed in Refs. [56,70], two peculiarities emerging from the time reversal of the underlying trajectory $\zeta$ have to be taken into account. First, $\Gamma$ contains observed transitions that are odd under time reversal, similar to momenta, and therefore need to be reversed [39,59,73].
Thus, it is natural to define the reversed transition $\tilde I$ for a transition $I$ as

$$ I = (kl) \;\to\; \tilde I \equiv (lk). \qquad (47) $$

Second, we observe an effect introduced as kinetic hysteresis in Ref. [59]. After registering a transition $I = (ij)$ at time $t_I$, it would be misleading to treat $I$ as a compound state and conclude that the underlying system remains in $I$ until the next transition $J$ is observed at $t_J$. At some time $t$ with $t_I \le t \le t_J$, the state of the coarse-grained system is described completely by knowing the last transition $I$ and the time $t - t_I$ that has passed since then. However, the same point in time on the reversed trajectory is described by knowing that $t_J - t$ has passed since the last transition $\tilde J$. Thus, $\tilde J$ replaces $I$ as the latest registered transition. Combining both effects allows us to formulate the time reversal of a semi-Markov kernel $\psi_{I\to J}(t_J - t_I)$ as

$$ \tilde\psi_{I\to J}(t_J - t_I) \equiv \psi_{\tilde J\to\tilde I}(t_J - t_I), \qquad (48) $$

resulting in

$$ P[\tilde\Gamma|\tilde I_{N+1}, T] = \prod_{i=1}^{N} \psi_{\tilde I_{i+1}\to\tilde I_i}(t_i) \qquad (49) $$

for the conditioned path weight $P[\tilde\Gamma|\tilde I_{N+1}, T]$ of the time-reversed trajectory $\tilde\Gamma$. Clearly, the time reversal in Eq. (48) is identical to the time reversal proposed in Ref. [61], since the shift of intertransition times discussed there is precisely the effect of kinetic hysteresis described above. Note that the modifications to the time-reversal operation of the semi-Markov process arise naturally, in accordance with the paradigm that time reversal does not commute with coarse-graining of the form of Eq. (42), in general [59].

In the common conception of semi-Markov processes, the direction-time independence criterion is a necessary condition to ensure time-reversal symmetry in equilibrium [56,70]. Remarkably, the semi-Markov process as introduced here breaks this condition, in general. This apparent contradiction is resolved by noting that the derivation of direction-time independence relies crucially on the conventional time-reversal operation for semi-Markov processes, which does not apply here, as discussed above.

D. Interpretation of the entropy estimators

The entropy estimator $\langle\hat\sigma\rangle$ is established for unicyclic networks in Eq. (10). It is based on the microscopic fluctuation theorem in Eq. (8) valid for the ratio of waiting time distributions. The generalization of $\langle\hat\sigma\rangle$ to multicyclic networks with multiple observed links in Eq. (36), which includes the estimator for a single observed link, Eq. (27), as a special case, relies on the same fluctuation theorem generalized to the multicyclic case. From the semi-Markov perspective, these fluctuation theorems can be interpreted as the consequence of an actual fluctuation theorem of the semi-Markov process. We define the semi-Markov entropy production rate $\sigma_{\mathrm{SM}}$ as the limit

$$ \sigma_{\mathrm{SM}} \equiv \lim_{T\to\infty} \frac{1}{T} \ln\frac{P(\Gamma)}{P(\tilde\Gamma)}, \qquad (50) $$

which differs from the known expressions, e.g., in Refs. [69,72,74], because of the modified time-reversal operation. Comparing Eq. (50) to Eq. (29), we conclude that $\sigma_{\mathrm{SM}}$, in fact, equals $\hat\sigma$, which is established as a thermodynamically consistent coarse-grained entropy production term in the previous sections. In hindsight, the fluctuation theorem in Eq. (8) can be derived from Eq. (50) by specializing to semi-Markov trajectories with only a single transition. The underlying Markov description does not enter explicitly anymore; instead, it is incorporated implicitly by ensuring that $\sigma_{\mathrm{SM}}$ is the correct physical entropy production. The affinity estimators derived in Sec. IV can also be seen as consequences of Eq. (50), tracing back the entropy production to the level of contributing cycles.
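As a concrete illustration of Eq. (50), the following sketch evaluates the trajectory-level quantity $\ln P(\Gamma)/P(\tilde\Gamma)$ for a recorded sequence of observed transitions and waiting times under the modified time reversal of Eqs. (47) and (48). It is a minimal, hypothetical sketch rather than the analysis of this paper: it assumes the semi-Markov kernels $\psi_{I\to J}(t)$ are available as callables, neglects the non-time-extensive boundary term, and all function and variable names are illustrative.

```python
import numpy as np

def reverse_transition(I):
    """Time reversal of a single observed transition I = (k, l) -> (l, k), cf. Eq. (47)."""
    k, l = I
    return (l, k)

def log_path_weight_ratio(transitions, waits, psi):
    """
    Evaluate ln P[Gamma]/P[Gamma~] for one effective trajectory under the modified
    time reversal, i.e., the time-extensive part entering Eq. (50).

    transitions : list of N+1 observed transitions I_1, ..., I_{N+1}, each a tuple (k, l)
    waits       : list of N waiting times t_i between consecutive transitions
    psi         : dict mapping a pair (I, J) to a callable t -> psi_{I->J}(t)
    """
    log_ratio = 0.0
    for i, t in enumerate(waits):
        I, J = transitions[i], transitions[i + 1]
        forward = psi[(I, J)](t)
        # kinetic hysteresis: the reversed kernel runs from J~ to I~ with the same waiting time
        backward = psi[(reverse_transition(J), reverse_transition(I))](t)
        log_ratio += np.log(forward / backward)
    return log_ratio

# Dividing the returned value by the total duration T and letting T grow large
# approximates sigma_SM = sigma_hat as defined in Eq. (50).
```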
From the unifying semi-Markov perspective, we can give three complementary interpretations of the estimator $\langle\hat\sigma\rangle$. First, the derivation presented in Ref. [61] relies on the information-theoretical identification of the expected entropy production of a stochastic process as a Kullback-Leibler divergence between the path weights of a forward and backward process [36,37]. Second, contributions to the fluctuating quantity $\hat\sigma$ can be attributed to the completion of cycles in the underlying Markov network, which are partially observed by an external observer. Third, $\hat\sigma = \sigma_{\mathrm{SM}}$ can be interpreted as the entropy production rate of a semi-Markov process with a particular time-reversal operation. Thermodynamic consistency of $\hat\sigma$ is then coupled to the applicability of the time-reversal operation, which has to be established from the underlying network.

By interpreting $\hat\sigma$ as the entropy production $\sigma_{\mathrm{SM}}$ of the equivalent semi-Markov process, the decomposition proposed in Ref. [61] can be identified as a decomposition of $\langle\sigma_{\mathrm{SM}}\rangle$ into the entropy production $\langle\sigma_{\mathrm{EMC}}\rangle$ of the EMC and the remaining entropy production $\langle\sigma_{\mathrm{WTD}}\rangle$ caused by the waiting times:

$$ \langle\sigma_{\mathrm{SM}}\rangle = \langle\sigma_{\mathrm{EMC}}\rangle + \langle\sigma_{\mathrm{WTD}}\rangle. \qquad (51) $$

Up to a time conversion factor, $\langle\sigma_{\mathrm{EMC}}\rangle$ is the mean entropy production of the EMC, which is given by

$$ \langle\sigma_{\mathrm{EMC}}\rangle = \frac{1}{\langle t\rangle} \sum_{I,J} p^s_I\, p_{IJ} \ln\frac{p_{IJ}}{p_{\tilde J\tilde I}}, \qquad (52) $$

where $p^s_I$ is the steady state of the EMC as a discrete-time Markov chain. The factor $\langle t\rangle$, the average waiting time between two transitions, is needed because entropy production of a discrete-time Markov chain is naturally measured per step rather than per unit time. In terms of the application to observed links, $p^s_I$ quantifies the relative frequency of a particular transition $I$ in a long sequence of observed transitions as given by Eq. (45). Equivalently, Eq. (52) can be derived as the mean of

$$ \sigma_{\mathrm{EMC}} \equiv \lim_{T\to\infty} \frac{1}{T} \ln\frac{P[\Gamma_{\mathrm{EMC}}]}{P[\tilde\Gamma_{\mathrm{EMC}}]}, \qquad (53) $$

defined on the level of single trajectories $\Gamma_{\mathrm{EMC}}$, based on the arguments presented in Appendix B 3. Note that Eq. (52) coincides with Eq. (49) in Ref. [61], dubbed there the transition sequence contribution to the entropy estimator.

Since the EMC emerges from integrating out the temporal resolution of the semi-Markov process, $\langle\sigma_{\mathrm{EMC}}\rangle$ vanishes in situations with no observable net current. In other words, the contribution of a particular pair of transitions $I, J$ to $\sigma_{\mathrm{EMC}}$ vanishes if and only if the net number of transitions $J$ after a previous $I$ matches the number of transitions $\tilde I$ after a previous $\tilde J$ on average, i.e., if $P[\Gamma_{\mathrm{EMC}}] = P[\tilde\Gamma_{\mathrm{EMC}}]$. The condition of vanishing $\langle\sigma_{\mathrm{EMC}}\rangle$ can also be related to the stalling conditions. In fact, the entropy production associated with the embedded Markov chain coincides with the informed partial entropy estimator $\langle\sigma_{\mathrm{IP}}\rangle$ formulated for the case of one accessible transition [44,45], i.e.,

$$ \langle\sigma_{\mathrm{IP}}\rangle = \langle\sigma_{\mathrm{EMC}}\rangle, \qquad (54) $$

as proven in Appendix B 4. In particular, the force $F$ can be determined as

$$ \ln\frac{p_{I_+ I_+}}{p_{I_- I_-}} = \ln\frac{P(I_+|I_+)}{P(I_-|I_-)} = -F \qquad (55) $$

by virtue of Eq. (32) without referring to waiting times at all. This result is not surprising, since both estimators measure the affinity $A_C$ of a single, averaged "effective cycle" either through the applied force $F$ or through the ratio $\ln P(+|+)/P(-|-)$. Without the time resolution, the estimator $\langle\hat\sigma\rangle$ loses the ability to distinguish between longer or shorter hidden cycles. Thus, we can reformulate a conjecture proposed in Ref.
[61] that states that $\langle\sigma_{\mathrm{EMC}}\rangle$ exceeds an analogous expression based on the TUR, $\langle\sigma_{\mathrm{TUR}}\rangle$, since $\langle\sigma_{\mathrm{EMC}}\rangle \ge \langle\sigma_{\mathrm{TUR}}\rangle$ is equivalent to $\langle\sigma_{\mathrm{IP}}\rangle \ge \langle\sigma_{\mathrm{TUR}}\rangle$. As another consequence of Eq. (54), the fluctuation theorem proven in Ref. [45] for $\sigma_{\mathrm{IP}}$, the fluctuating counterpart of the estimator $\langle\sigma_{\mathrm{IP}}\rangle$, is related to its counterpart for the EMC, Eq. (52).

The second term in Eq. (51), $\langle\sigma_{\mathrm{WTD}}\rangle$, can be deduced by transferring the splitting of the entropy production into contributions from the EMC and remaining contributions from the waiting times to the individual semi-Markov kernels in the path weights. In more practical terms, a single semi-Markov kernel $\psi_{I\to J}(t)$ can be decomposed as

$$ \psi_{I\to J}(t) = p_{IJ}\, \psi(t|IJ), \qquad (56) $$

separating the contribution from the EMC from a conditional waiting-time kernel $\psi(t|IJ) = \psi_{I\to J}(t)/p_{IJ}$. By decomposing all kernels in the path weights using Eq. (56), we can identify $\langle\sigma_{\mathrm{WTD}}\rangle$ as a Kullback-Leibler divergence between the normalized probability densities $\psi(t|IJ)$ and their reverse $\psi(t|\tilde J\tilde I)$. Thus, the derivation in Ref. [61] amounts to factorizing out the EMC according to Eq. (56) in the context of semi-Markov processes. Using Eq. (40), we see that $\langle\sigma_{\mathrm{WTD}}\rangle$ vanishes if and only if all $a_{IJ}(t)$ are constant in time. In particular, all $a_{IJ}(t)$ are constant in time if detailed balance is satisfied in the hidden subnetwork.

The decomposition of the semi-Markov entropy production in Eq. (51) additionally clarifies the relation between the estimator $\langle\hat\sigma\rangle$ and the entropy estimator $\langle\sigma_{\mathrm{KLD}}\rangle$ introduced in Ref. [39], which is also decomposed in the form

$$ \langle\sigma_{\mathrm{KLD}}\rangle = \langle\sigma_{\mathrm{aff}}\rangle + \langle\tilde\sigma_{\mathrm{WTD}}\rangle. \qquad (57) $$

Similar to Eq. (51), this decomposition into contributions from waiting time distributions and affinities is obtained by splitting off the EMC. The analogy is further strengthened by noting that

$$ \langle\sigma_{\mathrm{aff}}\rangle = \langle\sigma_{\mathrm{IP}}\rangle = \langle\sigma_{\mathrm{EMC}}\rangle, \qquad (58) $$

with the first equality proven in Ref. [39]. Note that the respective embedded Markov chains are different objects, as $\langle\sigma_{\mathrm{aff}}\rangle$ refers to a coarse-grained unicyclic three-state model, whereas $\langle\sigma_{\mathrm{EMC}}\rangle$ observes only a single transition of this model. Nevertheless, the result is not entirely surprising in hindsight, since $\langle\sigma_{\mathrm{EMC}}\rangle$ recovers the full entropy production of a unicyclic model by virtue of Eq. (9).

The difference between the estimators $\langle\sigma_{\mathrm{WTD}}\rangle$ and $\langle\tilde\sigma_{\mathrm{WTD}}\rangle$, or $\langle\hat\sigma\rangle$ and $\langle\sigma_{\mathrm{KLD}}\rangle$, respectively, emerges from different rationales underlying the respective semi-Markov processes. Describing a physical system with a semi-Markov process is not sufficient to determine its entropy production uniquely, since the correct time-reversal operation needs to be discussed separately [39,66,67]. In total, three different time-reversal operations for semi-Markov processes are implicitly used to define entropy estimators for partially accessible Markov networks.

(1) Conventional time reversal, $\tilde\Gamma(t) = \Gamma(T-t)$: In this case, physically consistent semi-Markov processes satisfy direction-time independence [70], which causes $\langle\sigma_{\mathrm{WTD}}\rangle$ to vanish [56]. This time-reversal operation is applicable to particular settings of coarse-graining [56,58]. States do not change, i.e., they are even under time reversal.

(2) Modified time reversal, introduced above: This operation includes the kinetic hysteresis effect introduced in Ref. [59], which is natural for coarse-graining based on milestoning [60]. In our case, semi-Markov states model transitions, which are odd under time reversal.

(3) Time reversal for second-order semi-Markov processes, introduced in Ref.
[39]: States in a second-order semi-Markov process are doublets containing the previous and the current state by construction. Because of this memory effect, states are neither even nor odd under time reversal.

Any of these operations can be used to define an entropy via Eq. (50). This entropy can always be split according to Eq. (56), where the resulting waiting-time contributions are given by $0$, $\langle\sigma_{\mathrm{WTD}}\rangle$, and $\langle\tilde\sigma_{\mathrm{WTD}}\rangle$, respectively. In addition, all of the discussed operations are involutions, each giving rise to a dual dynamics for which an appropriate fluctuation theorem holds for the corresponding entropy production [3]. At this level, any nonvanishing entropy production quantifies a different mathematical notion of irreversibility, which becomes a thermodynamic quantity only if the time reversal is known to be justified physically [59].

VII. CONCLUSION

A. Summary and discussion

In this paper, we have introduced an effective description for partially accessible Markov networks based on the observation of transitions along individual links and of waiting times between successive observed transitions. The corresponding waiting time distributions yield an entropy estimator $\langle\hat\sigma\rangle$. Its fluctuating counterpart $\hat\sigma$ additionally obeys a fluctuation theorem and was shown to have a natural interpretation as a semi-Markov entropy production. On a microscopic level, we have discussed, using cycle fluctuation theorem arguments, why observing one link suffices to recover the full entropy production in a unicyclic network. More generally, we have derived an operational criterion that indicates the absence of hidden cycles, which guarantees $\langle\hat\sigma\rangle = \langle\sigma\rangle$. If the hidden part of the network contains hidden cycles, we have shown that the estimator $\langle\hat\sigma\rangle$ yields a lower bound on the entropy production, which improves on established estimation methods. Additionally, we have shown that the waiting time distributions contain information about the topology and cycle affinities of the hidden network. To extract this information, we have derived exact results and estimation methods, whose quality has been assessed numerically. Both the entropy estimator and the affinity estimators are built upon the generalized microscopic cycle fluctuation theorem argument, which is, as we have shown, the signature of a fluctuation theorem valid for an effective semi-Markov process. From the perspective of this semi-Markov process, we have unified extant entropy estimators by providing a mathematical interpretation.

Different inference methods can be compared based on the required input data and the significance of their predictions. In the case of a single link, $\langle\hat\sigma\rangle$ relies on the measurement of statistical data contributing to a single current. While the amount of input data is comparable to methods based on the TUR, the predictions generally are much stronger, at least in the unicyclic case. While the TUR provides lower bounds on entropy production and cycle affinity in this case [23], we recover exact values for both quantities even without access to the waiting times. When the waiting time distributions are available, exact cycle lengths can be deduced, which improves significantly on a known TUR-based trade-off relation between affinity and cycle length [32,33]. In terms of predictive significance, the entropy estimator is comparable to the method introduced in Ref. [39], which is based on knowing a coarse-grained subnetwork, but it requires substantially less information.
Calculating $\hat\sigma$ is possible without any knowledge about the underlying network beyond a single observed link. In particular, the issue of decimation schemes for coarse-graining is circumvented completely. Rather, the entropy estimator $\hat\sigma$ combines current measurements with information-theoretical notions via conditional counting, since our expectation on the next transition depends explicitly on the previous one [36]. Thus, the sequence of transitions forms a Markov chain, which is identified as the EMC in the corresponding semi-Markov description. A mathematical discussion of semi-Markov processes allows us to clarify physically distinct categories of semi-Markov descriptions depending on the correct underlying time-reversal operation. Although different entropylike quantities satisfy fluctuation theorems and provide a mathematical notion of irreversibility, the thermodynamically consistent entropy production must be identified by more fundamental means. If measuring the entropy production is feasible operationally, this knowledge can be used to decide which time-reversal operation recovers the correct entropy production. In this sense, identifying the correct time-reversal operation is a task of thermodynamic inference.

B. Perspectives

The transition-based effective description for partially accessible Markov networks and the derived estimators for entropy and topology open a wide range of possible subsequent research topics. First of all, it will be promising to generalize the estimators for affinity and cycle length to networks with multiple observable links. Based on such a generalization, it would become possible to apply the estimators to a broader range of networks. The combined observation of different links would additionally make it possible to infer more information about the network, because different affinities and cycle lengths would be accessible. With the macroscopic limit of large, complex systems in mind, it is an obvious, albeit ambitious, challenge to transfer thermodynamic inference methods to Markov networks whose cycles outnumber the observed links by far. Conceptually, the ratio of waiting time distributions separates the time-resolved notion of irreversibility from other time-dependent effects entering a waiting time distribution. The estimation techniques for topology and affinity that are based on the short-time limit and, hence, on short pathways infer local properties of Markov networks that may even be large. Passing from local to global methods would require a different approach. The dominant parts of the large-scale network structure might become manifest in patterns of particular transition sequences or waiting times in long trajectories. Splitting these into smaller snippets as proposed here is a first step toward a future study of self-correlations in a long trajectory to extract more complex structures.

To gain more insight into the effective description from the established perspective of coarse-graining, one should investigate how existing coarse-graining strategies for observable states [43,44,48–55,57] are related to the approach introduced here. By combining these complementary approaches and by taking into account conclusions on milestoning [59,60], the concept of coarse-graining can potentially be generalized to a more fundamental level.
From a practical perspective, we may ask how the method can be generalized to less ideal situations, e.g., if the observer cannot distinguish between different transitions or registers only particular patterns or sequences of transitions. This class of situations also includes the complementary problem in which particular states rather than particular transitions can be observed, because observing the arrival in a state is equivalent to observing all transitions into this state without the ability to distinguish between them.

The potential of waiting time distributions and their role in inference schemes is certainly not exhausted by the results presented here. Combining the estimators for entropy production and network topology with existing numerical methods may increase the usefulness of waiting time distributions in thermodynamic inference schemes. Fitting rates of the underlying Markov network to the recorded waiting time distribution [42] or using minimization methods [40,41] are promising tools to obtain tighter, more specialized bounds for the discussed estimators or even to reconstruct the transition rates of a small network from sufficient data. These methods will gain particular practical relevance, since topological aspects of the underlying network can then be deduced rather than having to be assumed. Furthermore, even though the effective description has been introduced and discussed for observable transitions of a partially accessible Markov network in the NESS, it is, in principle, not limited to this setting. For example, the description could be applied beyond the steady state to analyze transient dynamics. Finally, it would be interesting to apply the approach to a Langevin dynamics to explore the adjustments needed for systems with continuous degrees of freedom.

APPENDIX A: WAITING TIME DISTRIBUTIONS FROM PATH WEIGHTS AND TRAJECTORY SNIPPETS

1. Markovian path weights and master equation

FIG. 5. Example for the analytical calculation of waiting time distributions based on effective absorbing dynamics. (a) Effective description for the partially accessible two-cycle network from the main text. Only transitions from state 2 to state 3 and in the reversed direction are observable. (b) Underlying Markov network. On the fundamental level of description, the network is Markovian, and transitions from state $k$ to state $l$ are governed by the transition rate $k_{kl}$. (c) Effective absorbing Markov network. Between two observed transitions, the system can be described with an absorbing master equation. This intermediate hidden dynamics is terminated by either a transition (32) or a transition (23). (d) Exemplary waiting time distribution derived from the numerical solution of the absorbing master equation for the effective dynamics and the corresponding distribution determined from a histogram of the waiting times within a trajectory of length $T = 10^7$ generated with a Gillespie simulation [75] of the network. The transition rates of the network are drawn randomly.

We consider the effective description of a given, only partially accessible system in which transitions are observed, e.g., the effective two-cycle network from the main text based on the observation of transitions between states 2 and 3, shown in Fig. 5. We assume that there is an underlying, more fundamental network to which a discrete Markovian description from the perspective of stochastic thermodynamics, as described in detail in Ref.
[3], can be applied. For the effective description in Fig. 5(a), this full Markov network with two fundamental cycles is shown in Fig. 5(b). Transitions from state $k$ to state $l$ are governed by a transition rate $k_{kl}$, which is independent of the time already spent in state $k$ due to the Markov property of the description. Thus, the waiting time distribution in a particular state must be memoryless and, therefore, exponentially distributed. In formulas, the probability density for surviving in state $k$ until exactly time $t$ is given by $\Gamma_k \exp(-\Gamma_k t)$, where $\Gamma_k = \sum_l k_{kl}$ denotes the escape rate of state $k$. Given that state $k$ is exited, a transition to state $l$ is weighted with the transition rate and, therefore, happens with the transition probability $k_{kl}/\sum_l k_{kl}$.

Based on the discussed survival and transition probabilities, a path weight quantifying the probability of a trajectory $\zeta$ of the Markov network can be introduced. We assume that the network has $N$ states, is fully connected, and contains no unidirectional links; i.e., $k_{kl} > 0$ implies $k_{lk} > 0$. The path weight $P[\zeta(t)]$ for a generic trajectory $\zeta(t)$ conditioned on the initial state $k_0$ at time $t = 0$ is given by

$$ P[\zeta(t)|k_0, 0] = \prod_{k=1}^{N} \exp(-\Gamma_k \tau_k) \prod_{(kl)} k_{kl}^{n_{kl}}, \qquad (A1) $$

where the second product runs over all possible transitions $(kl)$ in the network. The trajectory-dependent quantities $\tau_k$ and $n_{kl}$ denote the total time spent in state $k$ and the total number of transitions $(kl)$ in $\zeta(t)$, respectively. In principle, a trajectory-dependent observable can be obtained by a path integral over all trajectories $\zeta$, which in practice means summing over the number $L$ of possible jumps and integrating over all transition times $t_1, \ldots, t_L$. An important consequence is that the probability to observe $L$ jumps in a short trajectory $\zeta$ of length $\Delta t$ scales as $P(L\ \text{jumps}) \sim \Delta t^L$ for $\Delta t \to 0$, since

$$ P(\zeta\ \text{contains exactly}\ L\ \text{jumps}\,|\,k_0, 0) = \prod_{l=1}^{L} \left[\int_0^{\Delta t} dt_l\right] P[\zeta(t)|k_0, 0] \sim \Delta t^L\, [1 + \mathcal{O}(\Delta t)], \qquad (A2) $$

because the path weight as given in Eq. (A1) is of order $1$ in $\Delta t$. Thus, a first-order differential equation governing the time evolution of the state probabilities can be derived by calculating the path weights for constant and one-jump trajectories, which are the only contributions containing terms of first order in $\Delta t$. The resulting differential equation

$$ \partial_t p_k(t) = \sum_{l\neq k} \left[p_l(t)\, k_{lk} - p_k(t)\, k_{kl}\right] \qquad (A3) $$

is known as the master equation and can be solved to obtain $p_k(t)$, the probability of finding the system in state $k$ at time $t$. Since the master equation description, Eq. (A3), is equivalent to the path weight description, solving the initial value problem for $p_k(0) = \delta_{k_0 k}$ amounts to calculating

$$ p_k(t) = P[\zeta(t) = k\,|\,\zeta(0) = k_0] \qquad (A4) $$
$$ \phantom{p_k(t)} = \sum_{\zeta(t)=k} P[\zeta(t)|k_0, 0]. \qquad (A5) $$

The symbolic notation of a sum over paths is used repeatedly in the following calculations.

2. From fully accessible networks to partially accessible networks

On a coarse-grained level of description, the trajectories of the network are only partially accessible. Thus, a complete analytical description by solving the master equation (A3) is generally impossible, because even the underlying fundamental network may be unknown. In the following, we assume that transitions along a single link connecting the Markov states $k$ and $l$ can be observed, but not the states themselves. This transition-based description coincides with the description proposed in Ref. [61]. Adopting the notation from the main text, a transition along this link, $k \to l$, and its reverse, $l \to k$, are abbreviated as $I_+ = (kl)$ and $I_- = (lk)$, respectively.
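This setting is straightforward to emulate numerically. The following minimal sketch, a hypothetical illustration rather than the code used for this paper, simulates the underlying Markov dynamics described above with the Gillespie algorithm [75] (exponential dwell times with escape rate $\Gamma_k$, jump probabilities $k_{kl}/\Gamma_k$) for a randomly drawn rate matrix and records only the transitions along a single observable link together with the waiting times in between. States are indexed from 0, and all function and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def gillespie_observed(K, observed_links, k0, T_max):
    """
    Simulate a Markov jump process with rate matrix K (K[k, l] = k_{kl}, zero diagonal)
    and record only the transitions along `observed_links`, a set of ordered pairs (k, l),
    together with the waiting times between consecutive observed transitions.
    """
    k, t = k0, 0.0
    last_obs_time = None
    sequence, waits = [], []
    while t < T_max:
        rates = K[k]
        escape = rates.sum()                                # escape rate Gamma_k = sum_l k_{kl}
        t += rng.exponential(1.0 / escape)                  # exponentially distributed dwell time
        l = int(rng.choice(len(rates), p=rates / escape))   # jump with probability k_{kl}/Gamma_k
        if (k, l) in observed_links:                        # register the transition only if the link is observable
            if last_obs_time is not None:
                waits.append(t - last_obs_time)
            sequence.append((k, l))
            last_obs_time = t
        k = l
    return sequence, waits

# Example: a four-state network with randomly drawn rates and the observable
# transitions I+ = (2, 3) and I- = (3, 2), mimicking the role of the observed link in Fig. 5.
K = rng.uniform(0.5, 2.0, size=(4, 4))
np.fill_diagonal(K, 0.0)
seq, waits = gillespie_observed(K, {(2, 3), (3, 2)}, k0=0, T_max=1.0e4)
```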
Since the sequence of observed jumps and the waiting times in between are the only accessible information about the system in our effective description, a typical example of an observed effective trajectory $\Gamma$ may look like

$$ \Gamma =\ ? \to I_+ \to I_+ \to I_- \to I_+ \to \cdots \quad \text{at jump times} \quad ?, T_0, T_1, T_2, T_3, \ldots, \qquad (A6) $$

where $?$ represents the unknown transition of the system in the past prior to the first observed transition. For simplicity, we assume from now on that the process starts and ends immediately after the observation of an observable transition, $I_1$ at time $T_0 = 0$ and $I_{N+1}$ at time $T_N = T$, to address the core of our argumentation without worrying about non-time-extensive initial and final terms of the trajectory. Moreover, the scheme indicated in Eq. (A6) can be generalized to any number of observable links. We write $I_n = (k_n l_n)$ for the $n$th observed transition between the underlying states $k_n$ and $l_n$, where we note that $l_n \neq k_{n+1}$, in general, as hidden dynamics cannot be excluded. Schematically, a coarse-grained trajectory $\Gamma$ takes the form

$$ \Gamma = I_1 \to I_2 \to I_3 \to \cdots \to I_N \to I_{N+1} \quad \text{at jump times} \quad T_0 = 0, T_1, T_2, \ldots, T_{N-1}, T_N = T. \qquad (A7) $$

Similar to the Markov case, the probability for $\Gamma$ can, in principle, be quantified by a path weight description. First of all, it is important to note that no memory effects ranging over multiple observed transitions need to be considered. The path weight for the future of the trajectory, i.e., the path weight for the trajectory after a transition $I_n$ is registered, is unaffected by the previous block $I_{n-1} \to I_n$, since knowing the transition $I_n = (k_n l_n)$ at $T_n$ implies knowing the state of the underlying Markovian system immediately after $T_n$. Thus, the path weight can be split into parts belonging to observed transitions:

$$ P[\Gamma(t)|I_1, 0] = P(I_2, T_1|I_1, 0)\, P(I_3, T_2|I_2, T_1) \cdots P(I_{N+1}, T_N|I_N, T_{N-1}), \qquad (A8) $$

with $P(J, T_J|I, T_I)$ denoting the probability of observing transition $J$ at time $T_J$ if transition $I$ is observed at time $T_I$. Constituting the elementary building blocks of the coarse-grained picture, the objects $P(J, T_J|I, T_I)$ quantify the probability of observing $J$ after a given $I$ with waiting time $T_J - T_I$ in between. Thus, Eq. (A8) can also be written as

$$ P[\Gamma(t)|I_1, 0] = \psi_{I_1\to I_2}(t_1)\, \psi_{I_2\to I_3}(t_2) \cdots \psi_{I_N\to I_{N+1}}(t_N), \qquad (A9) $$

with $t_i = T_i - T_{i-1}$ and $t_1 = T_1$, in terms of the waiting time distribution

$$ \psi_{I\to J}(t) = P(J, t|I, 0), \qquad (A10) $$

according to the definition of $\psi_{I\to J}(t)$ in Eq. (3). The waiting time distributions are normalized in the form

$$ \sum_J \int_0^\infty dt\, \psi_{I\to J}(t) = 1, \qquad (A11) $$

whereas integrating out the time variable gives the marginal distribution

$$ p_{IJ} \equiv \int_0^\infty dt\, \psi_{I\to J}(t) = P(\text{next observed transition is}\ J\,|\,\text{last observed transition is}\ I). \qquad (A12) $$

3. Effective absorbing dynamics

On a fundamental level, we are interested in how the path weights of the effective description, Eq. (A9), and their elementary building blocks, Eq. (A10), are linked to the path weights, Eq. (A1), of the corresponding microscopic trajectories of the full network. As a first step, we note that the way in which the effective trajectory $\Gamma$ is split carries over to a splitting of the microscopic trajectory $\zeta$ on the fundamental level, because not only the coarse-grained but the entire microscopic state is known at the observed transition events.
Symbolically, this can be denoted as

$$ \zeta \;\hat{=}\; \gamma^{t_1}_{I_1\to I_2} \to \gamma^{t_2}_{I_2\to I_3} \to \cdots \to \gamma^{t_N}_{I_N\to I_{N+1}}, \qquad (A13) $$

where $\gamma^{t}_{I\to J}$ is the snippet of the full trajectory between two subsequent observable transitions $I$ and $J$ with waiting time $t$ in between. This snippet starts in the destination state of $I$ and ends immediately after the transition event $J$ in the corresponding destination state. Since a given snippet is completed immediately after an observed transition $J$ is registered for the first time, each trajectory snippet can be interpreted as a trajectory of an effective Markovian absorbing dynamics defined on the network obtained from the full network by removing all observed links. As soon as the original trajectory $\zeta$ completes an observed transition, the absorbing dynamics for $\gamma$ is terminated immediately. The corresponding first-passage time is precisely the temporal length of $\gamma$ and corresponds to the waiting time $t$ between two transitions in the effective description.

Practically, the effective absorbing Markov network is obtained from the corresponding original network by treating all observable links as absorbing, i.e., by redirecting the observed transitions into absorbing states. An example of such an effective absorbing Markov network is shown in Fig. 5(c), which depicts the absorbing network for the effective description of the two-cycle network in Fig. 5(a). The possible transitions along the observed link are represented by the states (32) and (23), which are absorbing states in the associated first-passage problem. If the considered snippet begins with (23) or (32), the corresponding absorbing dynamics starts in state 3 or 2, respectively.

The effective trajectory $\Gamma$ originates from a mapping of microscopic trajectories, $\zeta \to \Gamma[\zeta]$, to the effective description of the system. The path weight of $\Gamma$ is obtained by summing over microscopic path weights,

$$ P[\Gamma(t)|I_1, 0] = \sum_{\zeta\in\Gamma} P[\zeta(t)|l_1, 0], \qquad (A14) $$

where $P[\zeta|l_1, 0]$ is conditioned on $l_1$ at time $t = 0$ for $I_1 = (k_1 l_1)$. While integrating out the Markov path weight $P[\zeta(t)]$ directly to obtain the coarse-grained path weight $P[\Gamma(t)|I_1, 0]$ is not feasible, in general, the decomposition of $\Gamma$ in Eq. (A9) and of $\zeta$ in Eq. (A13) reduces the problem to the level of the elementary building blocks $\psi_{I\to J}(t)$ and $\gamma$, respectively. Thus, the decomposition in Eq. (A9) can be combined with the summation in Eq. (A14) to obtain

$$ P[\Gamma(t)|I_1, 0] = \psi_{I_1\to I_2}(t_1)\, \psi_{I_2\to I_3}(t_2) \cdots \psi_{I_N\to I_{N+1}}(t_N) \qquad (A15) $$
$$ = \sum_{\gamma\in\psi_{I_1\to I_2}(t_1)} P[\gamma|l_1, 0] \sum_{\gamma\in\psi_{I_2\to I_3}(t_2)} P[\gamma|l_2, 0] \cdots \sum_{\gamma\in\psi_{I_N\to I_{N+1}}(t_N)} P[\gamma|l_N, 0]. \qquad (A16) $$

The path weights $P[\gamma|l_n, 0]$ of the individual snippets $\gamma = \gamma^{t}_{I\to J}$ are seamlessly conditioned on the final state of their predecessor, since $I_n = (k_n l_n)$. The only type of summation that needs to be performed is the calculation of the waiting time distributions $\psi_{I\to J}(t)$ conditioned on $I = (kl)$, as introduced in Eq. (A10), by summing over all possible $\gamma = \gamma^{t}_{I\to J}$:

$$ \psi_{I\to J}(t) = \sum_{\gamma\in\psi_{I\to J}(t)} P[\gamma^{t}_{I\to J}|l, 0]. \qquad (A17) $$

This equation identifies the waiting time distributions of the effective description as summations over unobservable trajectory snippets and, therefore, proves Eq. (4) in the main text. For $I = (kl)$ and $J = (mn)$, $\gamma^{t}_{I\to J}$ starts at $l$ and ends with a jump $(mn)$ exactly at time $t$. Since the system is in $m$ immediately before the jump at $t$, we can use the Markov property to calculate

$$ \psi_{I\to J}(t) = P[\text{jump}\ (mn)\ \text{at time}\ t\,|\,l, 0] \qquad (A18) $$
$$ = P[\text{jump}\ (mn)\ \text{at time}\ t\,|\,m, t]\, P(m, t|l, 0) \qquad (A19) $$
$$ = k_{mn}\, p_m(t), \qquad (A20) $$

where Eq. (A4) is used for the last equality. The result in Eq.
(A20) makes it possible to calculate waiting time distributions analytically by solving the master equation of the effective absorbing dynamics defined on the hidden subnetwork. Note that this procedure is, in principle, equivalent to calculating the first-passage time distributions of the associated first-passage problem with the method introduced in Ref. [76]. Conceptually, the reasoning used to derive Eq. (A17) and, therefore, Eq. (A20) is identical to the reasoning used in Ref. [61] to derive the intertransition time densities. For both derivations, the partially accessible Markov network considered in the transition-based description is mapped to an effective first-passage time problem, and the waiting time distributions are identified as the corresponding first-passage time distributions. In our derivation, this mapping is motivated by an effective splitting emerging on the level of single trajectories, whereas in the derivation in Ref. [61], the mapping is deduced mathematically.

Operationally, the proposed calculation method for waiting time distributions differs from the method proposed in Ref. [61]. Instead of carrying out the summation in Eq. (A17) explicitly, the waiting time distributions can be calculated from the solution of the effective absorbing master equation for different initial configurations using Eq. (A20). In addition, our calculation method is efficient, since collecting histogram data from a Gillespie simulation [75] is unnecessary to reconstruct the waiting time distributions, as they can be calculated directly. To give an explicit example, the proposed method is used to calculate the waiting time distributions for the effective description of the two-cycle network in Fig. 5(a). Solving the corresponding effective absorbing master equation for fixed, randomly drawn transition rates results in four different waiting time distributions; one of them is shown in Fig. 5(d). Additionally, the figure shows that this waiting time distribution based on Eq. (A20) coincides with the corresponding waiting time distribution calculated from histogram data simulated with a Gillespie algorithm of the full network for long trajectories.

APPENDIX B: ENTROPY ESTIMATOR

1. Coarse-grained and full entropy production

Our effective description loses information about irreversibility and entropy production. From an abstract point of view, a well-defined many-to-one mapping of trajectories $\zeta \mapsto \Gamma[\zeta]$ of length $T$ suffices to bound the mean coarse-grained entropy production rate $\langle\hat\sigma\rangle$ by the physical entropy production rate $\langle\sigma\rangle$:

$$ \langle\hat\sigma\rangle \equiv \frac{1}{T}\left\langle \ln\frac{P[\Gamma]}{P[\tilde\Gamma]}\right\rangle = \frac{1}{T}\sum_{\Gamma} P[\Gamma]\ln\frac{P[\Gamma]}{P[\tilde\Gamma]} \le \langle\sigma\rangle, \qquad (B1) $$

provided that $\Gamma \mapsto \tilde\Gamma$ is the correct, physical time-reversal operation. Technically, the bound relies on the log-sum inequality, a standard tool in information theory [77], stating

$$ \sum_i a_i \ln\frac{\sum_i a_i}{\sum_i b_i} \le \sum_i a_i \ln\frac{a_i}{b_i} \qquad (B2) $$

for $a_i \ge 0$, $b_i \ge 0$. We apply this inequality in the form [27,78]

$$ T\langle\hat\sigma\rangle = \sum_{\zeta,\Gamma} P[\Gamma|\zeta]\,P[\zeta] \ln\frac{\sum_{\zeta} P[\Gamma|\zeta]\,P[\zeta]}{\sum_{\tilde\zeta} P[\tilde\Gamma|\tilde\zeta]\,P[\tilde\zeta]} \le \sum_{\zeta,\Gamma} P[\Gamma|\zeta]\,P[\zeta] \ln\frac{P[\Gamma|\zeta]\,P[\zeta]}{P[\tilde\Gamma|\tilde\zeta]\,P[\tilde\zeta]} \qquad (B3) $$
$$ = \sum_{\zeta,\Gamma} P[\Gamma|\zeta]\,P[\zeta]\left[\ln\frac{P[\Gamma|\zeta]}{P[\tilde\Gamma|\tilde\zeta]} + \ln\frac{P[\zeta]}{P[\tilde\zeta]}\right] = T\langle\sigma\rangle. \qquad (B4) $$

The last equality follows since $P[\Gamma|\zeta] = 1$ is satisfied only if $\Gamma$ matches the correct effective trajectory $\Gamma[\zeta]$ and vanishes otherwise, $P[\Gamma|\zeta] = 0$. Moreover, the equality requires that the first term in the sum vanishes, i.e., requires $P[\tilde\Gamma|\tilde\zeta] = 1$ when $P[\Gamma|\zeta] = 1$.
This condition defines the modified time-reversal operation $\Gamma \mapsto \tilde\Gamma$ uniquely, since the correct $\tilde\Gamma$ is identified as the trajectory obtained by using $\xi = \tilde\zeta$ in the mapping $\xi \mapsto \Gamma[\xi]$. In other words, we first have to time reverse $\zeta$, which is then followed by the coarse-graining operation, as discussed in Ref. [59].

2. Time reversal and conditional counting entropy estimator

The previous section identifies the correct time-reversal operation $\Gamma \mapsto \tilde\Gamma$ as the coarse-graining applied to the microscopic time reverse $\tilde\zeta(t) = \zeta(T - t)$. An effective trajectory $\Gamma$ consists of a series of transitions $I_n = (k_n l_n)$ at times $T_n$, which is schematically denoted as

$$ \Gamma = (k_1 l_1) \xrightarrow{t_1} (k_2 l_2) \xrightarrow{t_2} (k_3 l_3) \xrightarrow{t_3} \cdots \xrightarrow{t_{N-1}} (k_N l_N) \xrightarrow{t_N} (k_{N+1} l_{N+1}). \qquad (B5) $$

Compared to Eq. (A7), the jump times $T_i$ are replaced by the waiting times $t_i = T_i - T_{i-1}$ with $T_0 = 0$. Reversing the corresponding microscopic trajectory $\zeta$ in accordance with the previous discussion gives a well-defined effective trajectory of the form

$$ \tilde\Gamma = (l_{N+1} k_{N+1}) \xrightarrow{t_N} (l_N k_N) \xrightarrow{t_{N-1}} \cdots \xrightarrow{t_3} (l_3 k_3) \xrightarrow{t_2} (l_2 k_2) \xrightarrow{t_1} (l_1 k_1) = \tilde I_{N+1} \xrightarrow{t_N} \tilde I_N \xrightarrow{t_{N-1}} \cdots \xrightarrow{t_3} \tilde I_3 \xrightarrow{t_2} \tilde I_2 \xrightarrow{t_1} \tilde I_1, \qquad (B6) $$

where we introduce the reversal operation on individual transitions, $\tilde I_n \equiv (l_n k_n)$ for $I_n = (k_n l_n)$. The reverse transition happens along the same link and is, therefore, also observable in the effective description by construction. The path weight for the backward trajectory, Eq. (B6), can be decomposed into a product of single waiting time distribution objects as in Eq. (A9):

$$ P[\tilde\Gamma(t)|\tilde I_{N+1}, 0] = P[\tilde I_N, T_N - T_{N-1}|\tilde I_{N+1}, 0]\, P[\tilde I_{N-1}, T_N - T_{N-2}|\tilde I_N, T_N - T_{N-1}] \cdots P[\tilde I_1, T_N|\tilde I_2, T_N - T_1] \qquad (B7) $$
$$ = \psi_{\tilde I_{N+1}\to\tilde I_N}(t_N)\, \psi_{\tilde I_N\to\tilde I_{N-1}}(t_{N-1}) \cdots \psi_{\tilde I_2\to\tilde I_1}(t_1). \qquad (B8) $$

After the proper time reverse $\tilde\Gamma$ is identified, the entropy production of a particular trajectory $\Gamma$ can be calculated explicitly as

$$ T\hat\sigma = \ln\frac{P[\Gamma]}{P[\tilde\Gamma]} = \ln\frac{P(I_1)}{P(\tilde I_{N+1})} + \ln\frac{P[\Gamma|I_1, 0]}{P[\tilde\Gamma|\tilde I_{N+1}, 0]} \qquad (B9) $$
$$ = \ln\frac{P(I_1)}{P(\tilde I_{N+1})} + \sum_{j=1}^{N} \ln\frac{\psi_{I_j\to I_{j+1}}(t_j)}{\psi_{\tilde I_{j+1}\to\tilde I_j}(t_j)} \qquad (B10) $$
$$ = \ln\frac{P(I_1)}{P(\tilde I_{N+1})} + T\sum_{I,J}\int_0^\infty dt\, \nu_{J|I}(t) \ln\frac{\psi_{I\to J}(t)}{\psi_{\tilde J\to\tilde I}(t)}, \qquad (B11) $$

where the conditional counters $\nu_{J|I}(t)$ are introduced as

$$ \nu_{J|I}(t) \equiv \frac{1}{T}\sum_{j=1}^{N} \delta(t_j - t)\, \delta_{I_{j+1},J}\, \delta_{I_j,I}. \qquad (B12) $$

In the limit $T \to \infty$, contributions from the initial and final states can be neglected, which yields the fluctuation theorem

$$ \hat\sigma = \lim_{T\to\infty} \frac{1}{T} \ln\frac{P[\Gamma]}{P[\tilde\Gamma]} = \sum_{I,J}\int_0^\infty dt\, \nu_{J|I}(t) \ln\frac{\psi_{I\to J}(t)}{\psi_{\tilde J\to\tilde I}(t)} \qquad (B13) $$

and an explicit formula for the expected coarse-grained entropy production rate,

$$ \langle\hat\sigma\rangle = \sum_{I,J}\int_0^\infty dt\, \langle\nu_{J|I}(t)\rangle \ln\frac{\psi_{I\to J}(t)}{\psi_{\tilde J\to\tilde I}(t)}. \qquad (B14) $$

3. Expectation values and entropy production for semi-Markov processes

We calculate the expectation value $\langle\nu_{J|I}(t)\rangle$ in Eq. (B14) using an appropriate technique known for semi-Markov processes. Note that the transitions $I, J, \ldots$ are the "states" of the semi-Markov process and that the waiting time $t$ "in state $I$" is interpreted as the time elapsed since transition $I$. As defined in the main text, the conditional counter $\nu_{I\to J}(t)$ measures the number of transitions $J$ after a preceding transition $I$:

$$ \nu_{I\to J}(t)\,\Delta t = \frac{\text{No. of}\ (IJ)\ \text{jumps after waiting time} \in [t, t + \Delta t]}{T}, \qquad (B15) $$
$$ \langle\nu_{I\to J}(t)\rangle = P(\text{jump to}\ J\ \text{after waiting time}\ t\,|\,I)\, \frac{\langle\text{No. of jumps from}\ I\rangle}{T} \qquad (B16) $$
$$ = \psi_{I\to J}(t)\,\langle n_I\rangle = \psi_{I\to J}(t)\, \frac{p^s_I}{\langle t\rangle}. \qquad (B17) $$

In the last line, we use

$$ \langle n_I\rangle = \frac{\langle\text{No. of jumps from}\ I\rangle}{T} = \frac{\langle\text{No. of jumps from}\ I\rangle}{\langle\text{total No. of jumps}\rangle} \cdot \frac{\langle\text{total No. of jumps}\rangle}{T} = p^s_I \cdot \frac{1}{\langle t\rangle}, \qquad (B18) $$

where $\langle t\rangle$ is defined as the average waiting time between two semi-Markov transitions.
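To illustrate how Eq. (B14) can be evaluated operationally, the following sketch estimates $\langle\hat\sigma\rangle$ from a recorded sequence of observed transitions and waiting times, for instance the output of the Gillespie sketch in Appendix A 2, by histogramming the conditional counters $\nu_{J|I}(t)$ and the kernels $\psi_{I\to J}(t)$. It is a hypothetical, minimal implementation with illustrative names and a simple uniform binning, not the numerical procedure used for the figures of this paper.

```python
import numpy as np

def estimate_sigma_hat(sequence, waits, T, bins=50):
    """
    Estimate <sigma_hat> via Eq. (B14) from a sequence of N+1 observed transitions
    (tuples (k, l)) and the N waiting times in between; T is the trajectory duration.
    """
    dt = max(waits) / bins
    counts, total_from = {}, {}
    for i, t in enumerate(waits):
        I, J = sequence[i], sequence[i + 1]
        hist = counts.setdefault((I, J), np.zeros(bins))
        hist[min(int(t / dt), bins - 1)] += 1.0          # conditional counting of (I, J) pairs
        total_from[I] = total_from.get(I, 0) + 1
    rev = lambda I: (I[1], I[0])                         # time reversal of a single transition
    sigma = 0.0
    for (I, J), hist in counts.items():
        back = counts.get((rev(J), rev(I)))
        if back is None:
            continue                                     # no reverse statistics yet; needs more data
        nu = hist / (T * dt)                             # empirical nu_{J|I}(t), cf. Eq. (B12)
        psi_f = hist / (total_from[I] * dt)              # empirical psi_{I->J}(t)
        psi_b = back / (total_from[rev(J)] * dt)         # empirical psi_{J~->I~}(t)
        mask = (hist > 0) & (back > 0)                   # avoid log(0) in sparsely sampled bins
        sigma += np.sum(nu[mask] * np.log(psi_f[mask] / psi_b[mask])) * dt
    return sigma

# Example (with seq and waits from the Gillespie sketch above):
#   sigma_hat_estimate = estimate_sigma_hat(seq, waits, T=1.0e4)
```

Such a plug-in estimate converges only for long trajectories and is sensitive to the binning of rarely sampled waiting times; when the underlying network is known, the absorbing-master-equation route of Eq. (A20) avoids this sampling issue by providing the kernels directly.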
The identification of the stationary distribution $p^s_I$ in Eq. (B18) is based on elementary results for discrete-time Markov chains, as the number of visits to a particular state $I$ in a long trajectory $(I_1 I_2 \ldots I_N)$, divided by $N$, tends toward $p^s_I$ as $N \to \infty$. Note that, although this distribution is related to the stationary distribution of the semi-Markov process itself, the two are different even in the Markovian case [56]. Since the $\psi_{I\to J}(t)$ are normalized by virtue of Eq. (A12), we can integrate over $t$ to obtain the expected flux $\langle n_{J|I}\rangle$ from a semi-Markov state $I$ to $J$ as

$$ \langle n_{J|I}\rangle = \int_0^\infty dt\, \langle\nu_{I\to J}(t)\rangle = p_{IJ}\, \frac{p^s_I}{\langle t\rangle} = \langle n_I\rangle\, p_{IJ}. \qquad (B19) $$

The semi-Markov entropy production $\sigma_{\mathrm{SM}}$ is defined by Eq. (50) as the probability ratio of forward and backward trajectory under the time-reversal operation $\Gamma \mapsto \tilde\Gamma$. Thus, the calculations of the previous Sec. B 2 starting from Eq. (B11) actually apply to the semi-Markov entropy production $\sigma_{\mathrm{SM}} = \hat\sigma$. Substituting Eq. (B17) into Eq. (B14), we obtain

$$ \langle\sigma_{\mathrm{SM}}\rangle = \langle\hat\sigma\rangle = \sum_{I,J}\int_0^\infty dt\, \langle n_I\rangle\, \psi_{I\to J}(t) \ln\frac{\psi_{I\to J}(t)}{\psi_{\tilde J\to\tilde I}(t)} \qquad (B20) $$

for the semi-Markov entropy production. To relate this expression to the entropy production of the EMC, we apply the log-sum inequality after using Eq. (B17) to obtain

$$ \langle\hat\sigma\rangle \ge \frac{1}{\langle t\rangle}\sum_{I,J} p^s_I \left[\int_0^\infty dt\, \psi_{I\to J}(t)\right] \ln\frac{\int_0^\infty dt\, \psi_{I\to J}(t)}{\int_0^\infty dt\, \psi_{\tilde J\to\tilde I}(t)} = \frac{1}{\langle t\rangle}\sum_{I,J} p^s_I\, p_{IJ} \ln\frac{p_{IJ}}{p_{\tilde J\tilde I}} = \langle\sigma_{\mathrm{EMC}}\rangle \qquad (B21) $$

in accordance with Eq. (52).

4. Comparison to informed partial entropy production

a. Entropy estimators: Embedded Markov chain versus informed partial

In this section, we prove that

$$ \langle\sigma_{\mathrm{EMC}}\rangle = \langle\sigma_{\mathrm{IP}}\rangle \qquad (B22) $$

in the one-link case, which implies $\langle\sigma_{\mathrm{IP}}\rangle \le \langle\hat\sigma\rangle$ by virtue of Eq. (B21). We prove the case of one observable link between the Markov states $k$ and $l$, since the crucial relation, Eq. (B27), and its proof can be generalized to multiple observed links following an analogous approach. For the two states $+ = (kl)$ and $- = (lk)$, Eq. (B21) simplifies to

$$ \langle\sigma_{\mathrm{EMC}}\rangle = \frac{1}{\langle t\rangle}\big(p^s_+ p_{++} - p^s_