Ulrich Kohlenbach
Abstract. The contemporary “proof mining” paradigm has its historical roots in Georg Kreisel’s program of “unwinding of proofs”.In this paper we elaborate on the tremendous influence Kreisel’s ideas have had and still have on this applied reorientation of proof theory and discuss some logical aspects of“proof mining”.
The story above began and ended with shifts away from traditional logical aims.([106],p.401)
A price paid by those(of us)later pursuing proof theory has been to discover suitable ideas for adapting consistency proofs to aims rewarding for a broader view,naturally,not an unduly high price for those(of us)attracted by such activity.([108],pp.43–44)
Starting in the early 50’s,Kreisel argued that proof theoretic methods which had been developed in the course of Hilbert’s consistency program could be used for con crete mathematical goals different and independent from their original foundational motivation.Kreisel’s writings on this topic,in particular in connection with his con cept of“unwinding of proofs”,span over a period of 60 years and contain numerous important insights and suggestions on how such a reorientation of traditional proof theory from foundational aims to applications in core mathematics could be achieved.In this paper,we revisit some of Kreisel’s main themes from the perspective of the current“proof mining”program which has been pursued,e.g.,by the present author and many collaborators as a modern and more systematic version of Kreisel’s original“unwinding”program and which has the area of nonlinear analysis as its prominent field for applications (see [70,72]for surveys).As we will see,the importance of Kreisel’s insights for the development of the proof mining paradigm can hardly be overstated.
Behind Kreisel’s quest for unwinding mathematically useful but hidden content by the logical analysis of proofs is his general conviction that
the validity of a theorem,in fact,the validity of a proof,is only a small part of the significant knowledge contained in the proof:it just happens to be the part which is most easily put into words.([101],p.164)
Kreisel did not reject the significance of foundational research in principle.In fact,he writes:
Nobody denies that,at various periods of history,people experiment ed with dubious principles…It cannot be repeated too often that there can be and have been realistic doubts…In particular,certain precursors of the concepts mentioned above(sets,lawless sequences)were indeed dubious in a realistic sense,and their foundational analyseswereby no means superfluous.([97],pp.171–172)
However,on the same pages of that paper he states:
1.There are no realistic doubts concerning the concepts used in cur rent mathematical practice and concerning their basic properties formu lated in the usual axiomatic systems.…
2.There is,in my opinion,no hint of evidence for the assumption that anyanalysisof the usual concepts or of their basic properties could im prove in anygeneralway on the mere recognition of theirvalidity.([97],pp.171–172)
Beginning in the early 50’s,he already aimed at clear mathematical uses of proof theoretic methods for which he expected relevant success.In [90,91,92],Kreisel stressed themathematicalsignificance of the actual proof reductions achieved in the context of
1.Gentzen’s consistency proof for Peano Arithmetic PA based on cut reductions and subsequently by Ackermann([1])using Hilbert’sεsubstitution method and
2.Herbrand’s reduction of predicate logic(resp.open theoriesT)to propositional logic(resp.the quantifier free fragment ofT)by Herbrand’s theorem.
The first correct syntactic proof of Herbrand’s theorem had also been obtained by theεsubstitution method in[48].
Later,e.g.,in[95]he also discusses the potential of G?del’s functional interpre tation(used by G?del again in the context of a consistency proof for PA)to unwind proofs while in his own work in this direction he always gave preference to Herbrand’s theorem and theεsubstitution method(note though that functional interpretation may be viewed as a modular“high level”generalization of Herbrand’s theorem,see[29,38,129]).
With regard to all these different methods which have been developed in the course of the consistency program,Kreisel’s novel approach was to
describepragmaticuses of“consistency”proofs independent of(du bious)epistemological claims.([98],p.113)
Already in 1958,Kreisel states on a similar note:
There is a different general program which does not seem to suffer the defects of the consistency program:To determine the constructive(recursive)content or the constructive equivalent of the non constructive concepts and theorems used in mathematics,particularly arithmetic and analysis.([92],p.155)
At the same time he stresses already in the opening sentence of[92]that themethodsdeveloped in the course of the consistency program can be used for this:
The principal aim of the present paper is to sketch some mathemat ical applications of the work of the Hilbert school in the foundations of mathematics.([92],p.155)
One such use,made already in [91],is to characterize the provably total func tions (and—in the presence of function parameters—also the provably total type 2 functionals) of PA as theα(<ε0) recursive ones using Ackermann’s method.In the course of this,Kreisel developed his so called no counterexample interpretation(n.c.i) and showed how this could be used to unwind the computational content of proofs of theorems in mathematics.As a first—pedagogical—example,he treats the Cauchy property of bounded monotone sequences of reals to extract a bound on the n.c.i.of this statement.In [132],Tao rediscovered the n.c.i.in special cases and incidentally treated exactly the same example where he uses the term“metastability”instead of “n.c.i.”.In fact,from Kreisel’s analysis one immediately obtains Tao’s“finite convergence principle”which can be seen as a constructive and even finitary equivalent to the usual monotone convergence principle.This is also true for the advanced quantitative metastability results that have been obtained by proof mining in the areas of ergodic theory,fixed point theory and convex optimization,see[70],Section 2.6,for a discussion of this point.Sometimes,the metastable version is“en riched”,i.e.,formulated with additional bounding data.The latter guarantee that the finitary version not only(classically)implies back the convergence of the sequence in question but also that the limit has the properties stated in the original theorem.We refer the interested reader to,e.g.,Remarks 3.6 and 3.11 in[85].
After treating the monotone convergence principle,Kreisel shows how one can—using the n.c.i.—“give a systematic discussion” ([91],p.52) on why from Little wood’s proof that there are infinitely many sign changes ofπ(n)?li(n)it is possible to extract a bound on such a sign change≥n.
In[90,91],the n.c.i.is presented as an instance of the concept of an interpre tation of a formal system (in the situation at hand that being PA).Kreisel lists four conditions to be satisfied by such an interpretation of which the condition(δ)implies that the interpretation should respect the modus ponens rule in a certain technical sense.However,as discussed in detail in[59],the n.c.i.respects modus ponens only if one uses the full class ofα(<ε0) recursive functionals.By contrast,while for the fragment PAn+1of Peano arithmetic withinduction only,the n.c.i.inter pretation can be solved byα(<ωn) recursive functionals (as can be seen,e.g.,by using G?del’s functional(“Dialectica”)interpretation),the n.c.i.as an interpretation of PAn+1in this class of functionals in its own right does not satisfy(δ)as is shown by the Corollary 2.4 in[59].1Obviously,I was very pleased when Kreisel in a letter from 25.1.98 commented that he considered[59]as“not grotesque”(“nicht grotesk”).Interestingly,in[92],Kreisel gives a different defini tion of“interpretation”in which(δ)is replaced by a condition which is satisfied by the n.c.i.of PAn+1(see[59]for details on all this).
In contrast to the n.c.i.,the functional interpretation(combined with a negative translation)allows for an essentially trivial treatment of the modus ponens rule.This is related to the fact that this interpretation stays much closer to the original formula,to which it is equivalent using justquantifier freechoice though in higher types,see[93],whereas the equivalence with the n.c.i.requires choice forarbitrary arithmetic predicates(though only for type N).This difference is emphasized by Kreisel in[95],where he writes:
from the point of recursiveness the present interpretation [G?del’s]is more satisfactory than the no counterexample interpretation since,as observed above,thestepsin the proof ofA ??s?tA0(s,t)do not rela tivize to the relevant classes of recursive functions.([95],p.380)
in the case of the n.c.i..
The most elaborate proposal for a change of emphasis in Hilbert’s original pro gram away from the consistency problem towards mathematical applications is given by Kreisel in[98].There he writes:
There is no doubt that,in particular parts of mathematics or at least of mathematical practice at the present stage of development,there are candidates for the followingvariantof Hilbert’s program:There is an accepted classPof proofs and there are extensionsP′which are either genuinely problematic or appear logically unnecessary.One looks for a classFPof propositions (“corresponding” to Hilbert’s f.c.[finitary character]propositions)such that,forF ∈FP,if there is a proof inP′ofFthen there is a proof ofFinPtoo.([98],pp.99–100)
Somewhat later he says
it is a principal problem to analyze relative consistencyproofsand reformulate their content.This explains why some of the formulations below are unfamiliar.([98],p.100)
Referring to G?del’s first incompleteness theorem he states the central theme of such a re orientation of the goals in Hilbert’s program:
if not all true propositions about the ring of integers can be formally derived by means of given formal rules,we expect to formulatewhat more we know about a formally derived theorem F than if we merely know that F is true.([98],p.110)
He then indicates how functional interpretations can be used to extract witnessing functions based on the restricted operations implicitly used in the proof at hand.Later he summarizes the quest for a shift of emphasis as follows:
The shift of interest,from the“results”stated in proof theory to what were—considered to be—“auxiliary” constructions,is to be expected from general experience with “grand” theories (here:the foundational theory behind Hilbert’s program).When such grand theories are refuted,incidentally quite often by easy general arguments as in II.3,some of the methods developed in work on the grand theory may retain an interest?but the most essential step is to make the interest independent of the dis carded assumptions,that is,toreinterpret old proofsandformulate new theorems.([98],pp.127–128)
In[99],Kreisel takes up the topic of using proof theoretic reductions for the extraction of new mathematical content again:
it is quite clear that the“reductions”involve transformations of proofs:π →πε.And even if one has no doubts about(the validity of)πor less doubts aboutπthan aboutπε(for example,becauseπεis more involved thanπand so has a higher chance of containing copying errors),there remains the possibility thatπε tells us something we want to know that π doesn’t.Finding that“something”becomes a principal problem:it may need more imagination than the step fromπtoπε.([99],p.64)
On numerous occasions,Kreisel emphasizes that the“shift of interest”he asks for may have some drastic consequences on the issues one focuses at:whereas in Hilbert’s consistency program,the interest is exclusively on proofs ofsentences,namely consistency statements,in applied uses of proof transformations,sub proofs oflemmas can be disregarded altogether:
For the kind of unwinding mentioned in§1,most details of a proof arenotrelevant? for example,none of the details involved in proving so called identities,that is,axioms,and if the latter are true then the transformed proof will again use only true()axioms.As a corollary,when we have the job of unwinding a proofπ,we shall look for chunks of the proof that are used only for provingtheorems,and suppress them altogether from the representationπ′to which the proof theoretical transformation is applied?cf.:a physicist who is given data including the spectral lines of the light coming from a planet,will ignore this optical information if his job is to determine the motion of the planet.The fact that proof theoretic methods are occasionally of use,is not in doubt.As documented in[100,pp.113–116],even without a computer they have been applied to unwind proofs,and to extract information which the dis coverers of those proofs wanted to know and did not find by themselves.Spotting relevant features of those proofs was not a major obstacle.([99],pp.66–67)
This theme is also formulated most clearly in[104]:
Warningsabout using the proof theoretic literature.(i) The rule of thumb in 3d,about suppressing proofs of?lemmas,is stated in logical terms.Nevertheless it conflicts with the very essence of Hilbert’s pro gram since consistency concernsonly ?theorems? for instance,adding the negation of a G?del sentence,a false?sentence,leaves a system con sistent.Thus practically no metatheorem contributes positively both to algorithmic concerns and to Hilbert’s program(still used to make books on proof theory systematic).2The additional warnings(ii)–(iv),not reproduced here,are also recommended to the reader.([104],p.143)
Kreisel championed Herbrand’s theorem and related methods (such as theεsubstitution method)for his unwinding program while
Herbrand himself did not pin point the interest of HT in connection with weaknesses of Hilbert’s program,for example,in terms of some notion as:interpretation.([102],p.47)
In particular,Kreisel proposed to use it to extract bounds on the number of solutions in Roth’s theorem([96,102])which subsequently was achieved by Luckhardt in[116]providing the first polynomial such bounds.The topic of extracting bounds on the number of solutions for finiteness statements has been pursued in the context of strong convergence theorems in ergodic theory under the name of bounds onεfluctuations,see[4],e.g.,which in turn uses[78]and[88].For a recent generalization of[78]see[127]and for nonlinear ergodic theorems see[33,68,121].
In the late 80’s,I began investigating potential uses of proof theoretic methods in the area of analysis([52]).I was particularly attracted by G?del’s functional inter pretation for the following reasons:
1.The interpretation has a somewhat semantic flavor as it translates proofs into functional equations which can then be studied in terms of the mathematical properties(such as continuity,monotonicity or majorizability)of the relevant functionals in a way closer to ordinary mathematics than the syntactic manipu lations of proof trees.
2.While continuity was the property of the functionals emphasized originally(e.g.,in [93]and later in connection with Spector’s bar recursion),I came to the conclusion that the hereditary monotonicity property in terms of “ma jorizability”,not only of G?del’s primitive recursive functionals of finite type([49])but also of the bar recursive functionals([12]),was particularly promis ing to exploit([52,53]).Note that already in[105],Kreisel foresaw the poten tial of Bezem’s model:
Recently,several more models[for bar recursion]have appeared,for example by Bezem([12])that seem rewarding,even if not neces sarily for the properties generally emphasized(by logicians).([105],p.116)
3.When focusing on majorizing functionals the complexity of the functional in terpretation in treating contractions disappears as one can simply take the max imum of the terms provided by the different cases.This later resulted in the so called“monotone functional interpretation”of[56]and is systematically uti lized in the bounded functional interpretation due to[30].
4.The modularity of the functional interpretation(i.e.,its good behavior w.r.t.the modus ponens rule)leaves the overall structure of the analyzed proof intact and makes it possible to re use analyzed lemmas when analyzing different proofs that use these lemmas.This point is acknowledged by Kreisel,e.g.,in[104]:
Functional interpretations versus cut elimination…(i)If thedat hand are sensible,interpretations are generally superior because the definitionsαof the algorithms are read offwithout rebuilding d:the proofs given in logical texts showing correctness(of the defini tion principles corresponding to inference rules)are ignored because they involve only?statements.([104],p.141)
The importance of the property of majorizability in analyzing proofs in analysis stems from the fact that here most?statements(about numbersn ∈N representing someδ=1/n)are naturally monotone so that any bound in fact is a realizer.Ma jorizability has been particularly effective in producing strikinglyuniformbounds in contexts involving abstract classes of(not assumed to be separable)metric or normed spacesX,where—in the normed case—one takes as a majorant ofx ∈Xanyn ∈N s.t.n ≥∥x∥.
The monotonicity of many quantifiers occurring in analysis has always been one reason why I considered analysis as a particularly fruitful area of mathematics for “proof mining” whereas Kreisel’s original “unwinding” program mainly aimed at number theory and algebra.Another reason was that in the area of analysis,logi cal interpretations can already be very helpful to determine the correct“enrichments of data”(see also Section 3)and representations of analytical objects to be used—a crucial step to detect the relevant logical structure of a theorem.Experience in col laborations with analysts over the years has repeatedly shown that in the absence of logical guidance this can be a first major obstacle towards a quantitative form of an analytic statement.
An important feature of the monotone functional interpretation and the subse quently developed bounded functional interpretation ([30]),also based on majoriz ability,is that rather large classes of sentences do not contribute to the extractable uniform bounds.A general example is the following:
for arbitrary typesρ,τ,γ;the associated class is much more general than the univer sal sentences and includes many theorems/lemmas occurring in practice.Sometimes,by modifying a given proof in such a way that large parts of the proof establish sen tences of this form ?,one can drastically facilitate the extraction process.This may even make use of certain classically false axiomsFof the form ?,which neverthe less hold in the type structure of all(strongly)majorizable functionals([12])and its extensions to abstract spaces ([44,65]).Such axioms can be used to derive strong uniform boundedness principles(see[56]for types over N and[65]for extensions to abstract spaces).In fact,the aforementioned bounded functional interpretation([26,30])has such principles systematically built into the interpretation itself under the la bel of“bounded(counter )collection”.AxiomsFof this form may state the existence of an ideal object (e.g.,a fixed point of some nonexpansive mappingT:C →C,whereCis a nonempty bounded convex subset of a normed space,which in gen eral will not exist even ifXis complete andCis closed).However,Fcan only be used in an approximate form in proofs of??theorems and so can be eliminated by its reduction to the(always true in the situation above)existence of approximate fixed points(in fact this approach can be seen as an effective proof theoretic version of ar guments which model theoretically are performed using ultraproducts,see[44]).In the paper[31],it is shown how a certain type of use of sequential weak compactness in Hilbert spaces can be circumvented using uniform boundedness which nicely ex plains in general proof theoretic terms why the preceding analysis of such a use in[67]did not require any use of bar recursion but had turned out to be computationally empty.
In[103],Kreisel already discusses uses of“incompletely defined objects”:
Even if we prove theorems about ordinary propositions,not involving any incompletely defined terms,the proof will generally use only partial information about the latter.So it will apply to all objects about which this information is available,including incompletely defined objects(if they are considered at all).Much the same idea was behind Hilbert’s stress on the finiteness of proofs,(even) of propositions about infinite objects….([103],p.80)
In his discussion of Hilbert’s program,Kreisel usually emphazises a broad un derstanding of it as a general program to exhibit the finitary content of proofs which are based on the use of ideal principles,as distinct from the narrow “presentation”of the program occasionally done by Hilbert himself and,in particular,in the logic literature after Hilbert.Kreisel is mainly critical of
the logical atrocities committed in Hilbert’s presentation of his pro gram.([102],p.47)
Nonetheless,Kreisel’s work on unwinding proofs and,in particular,the systematic de velopment in“proof mining”in more recent years demonstrate strikingly that Hilbert was largely correct in claiming that ideal elements in proofs could in principle be re placed by finitary approximations.Moreover,the computational content hidden,e.g.,in the use of abstract algebra was studied in the Hilbert school itself,e.g.,in Grete Hermann’s PhD thesis ([45],written under the direction of Emmy Noether).Even when“finitary content”is interpreted so as to refer to a reduction to primitive recur sive arithmetic PRA(as in the“orthodox”readings of Hilbert’s program),Hilbert was largely correct as far as ordinary mathematics is concerned:all but two bounds we have been involved in extracting from proofs in nonlinear analysis,ergodic theory,fixed point theory,convex optimization and approximation theory have been at most primitive recursive(and the only two exceptions,[68]and[89],are definable in the primitive recursive functionals of higher type introduced in[47]as finitistic in an ex tended sense and subsequently used by [41]in G?del’s functional interpretation of PA).Related to this,the proof theoretic strength of the tools used on ordinary math ematics largely does not go beyond PRA(see[60,124])and very rarely beyond PA(see[27]).
In connection to this,one should mention the program of“reverse mathematics”which has contributed much to the discovery of this status of ordinary mathematics([34,125]).But again,in order to be relevant for proof mining,one has to make a shift of direction:the foundational significance of,e.g.,the equivalence of some math ematical theoremAto the binary (“weak”) K?nig’s lemma WKL is commensurate with the frugality of the base system(and its language),over which this equivalence is established.For proof mining,the relevant information is that a proof of,e.g.,a??theoremBwhose main non elementary tool isAis likely to allow for the extraction of a reasonably good bound as WKL not only over RCA0but even over systems of polynomially bounded analysis(see[57]),does not create any additional growth.To utilize this,however,it is beneficial to have systems which are mathematically as rich and w.r.t.to their language as flexible as possible over which the addition of WKL or related principles still does not create new growth.Most importantly,moreover,is that the focus now is on proofs which prima facie seem to make an essential use of WKL but whose conclusionBis of a logical form such that a WKL elimination is possible.Reverse mathematics might,incorrectly,suggest that such proofs do not exist since if some theorem is naturally proved using WKL this theorem might be expected to imply WKL and if it cannot(due to its logic form)it might be expected that WKL will not help in proving it.An early counterexample to this is the use of WKL in Artin’s solution of Hilbert’s 17th problem from which (at least in the caseK=Q where the statement is) primitive recursive bounds can be extracted as was first observed by Kreisel([94])although at this time there was not yet a metatheorem on the conservation of WKL over PRA.See[22],in particular 5.13,for an extremely thorough discussion of[94]and comments on Kreisel’s unwinding program in general.
Counterexamples in analysis were found in[52],namely the standard proofs for the uniqueness of best polynomial Chebycheff andL1approximations to functionsf ∈C[0,1],which could successfully be mined(see[52,54,84]as well as[65]for a thorough presentation).In[52]it was also stated that the uniqueness of best Cheby cheff approximations by polynomials with bounded coefficients can be proven using WKL and that,therefore,the extractability of an explicit rate of strong uniqueness is guaranteed by the logical metatheorems from [52].This was finally carried out re cently by Sipo? in[128].A different unwinding of a WKL based proof in the context of nonlinear semigroups of operators is given in[76].
Let us conclude this section with a citation from Kreisel which clearly shows his distinction between the general motivation behind Hilbert’s program and the way this program has been occasionally presented(by Hilbert himself and others):
Though the aims of Hilbert’s program and of the theory of proofs are usually unrelated and often in conflict,it should not be assumed that the aims of that theory are in conflict with Hilbert’s own interests! Both from his writings in[46],and from his conversations quoted in[11],it appears that at least at one time,his principal interest was to attract the attention of mathematicians to the topic ofmathematical proofs as an object of(mathematical)study:and the “grand” program was to have been the bait.([98],p.128)
While there are certainly connections between the unwinding program(or proof mining)and constructive mathematics,there are also important differences as we dis cuss in this section.
As mentioned already above,various novel forms and extensions of G?del’s functional interpretation play a crucial role in contemporary“proof mining”,in partic ular in the area of nonlinear analysis.In order to apply those methods to proofs based on full classical logic one (at least in principle) carries out some appropriate nega tive translation into an approximately intuitionistic formal system.“Approximately”,since,e.g.,the negative translation of the schema of quantifier free choice requires the Markov principle(in all types).As first emphasized by Shoenfield([123]),it is possible to neatly define the final product of the composition of some negative trans lation and the original Dialectica interpretation directly as the so called Shoenfield interpretation.The latter has been shown in[131]to—indeed—exactly coincide with such a composition if a negative translation due to Krivine is used.While for a de tailed understanding of the potential of the functional interpretation(and,e.g.,for a locally optimized use of negative translations),expertise w.r.t.intuitionistic reasoning is necessary,the subject as such is not directly connected to intuitionism:
G?del([41])is very simply adapted to classical systems.As it hap pens,the pedagogic value of this adaptation can be documented by an(impressive) example.Girard had seen various expositions which fol lowed G?del([41])closely,before reading Shoenfield([123]),but learnt to use the ideas effectively only from the latter.([109],p.229)
Most importantly,however,even in situations where some statementAis already intuitionistically provable,the functional interpretation (possibly combined with a negative translation)may give a stronger information than the too weak direct con structive interpretation.
In[109],the“weakness of the constructive meaning of”([109],p.225)is discussed:the constructive reading of an implication
provides a program which translates any Skolem function for the premise into one for the conclusion.However,in general there may be no computable Skolem function for the premise.As explained in [109],more rewarding is what the application of G?del’s functional interpretation provides,namely programs realizing
(in connection with this,we mention that already in[14]Bishop noticed that the func tional interpretation of implications is more meaningful as a“numerical implication”than any standard BHK style constructive interpretation).
As shown first in[61](see the discussion in[66]),the reward becomes particu larly striking for sentenceswhen one treats a constructive proof of such an implication
as a classical one to which then first the negative translation is applied:again the standard constructive interpretation of a constructive proof of(1),e.g.,by the use of Kreisel’s modified realizability interpretation,extracts a programφwhich transforms any Skolem function for the premise into one for the conclusion,i.e.,
Consider now,by contrast,(1)as proven classically.Then negative translation yields
and so by Markov’s principle
Since the functional interpretation interprets the Markov principle,the former can be applied to(4).This yields a program Φ which transforms any solutionφof the n.c.i.of the premise into a Skolem function for the conclusion,i.e.,
The functional interpretation also provides witnessing functionals for“?x,g”in terms ofφ,u.
Combined with the aforementioned majorizability of the closed terms in suit able term calculi one can see that actually a (self majorizing) bound?x,g ?y ≤φ(x,g)Aqf(x,y,g(y)) is sufficient to get a bound?u ?v ≤Φ(φ,u)Bqf(u,v) for some slightly modified Φ.Hence,if the conclusion is monotone w.r.t.?v(which typically is the case in analysis),we even get a realizer for?vwithout the need of any bounded search.The crucial benefit of this,compared to(2),is that subrecursive and highly uniform such boundsφcan usually be extracted from a classical proof of?x ?y ?z Aqf(x,y,z)while,as mentioned already,in general anyfsatisfying the premise of (2) will be noncomputable.This approach is already useful when the premise isinstead of(whereas the situation discussed in[109]only applies to thecase).Then a negative translation is not needed,assuming the original proof was constructive,while one crucially uses that the functional interpretation provides witnessing information on“?z”.For concrete applications of this in nonlinear analy sis,see the discussion in[66]as well as Section 4 below.
While—as discussed above—the constructive interpretation of statements of the form→may be too weak to be useful and the situation may improve by looking at it from a classical point of view,the constructive interpretation of astatement certainly does provide interesting information,namely a witnessing func tion,which from a noneffective proof in general may not be extractable(but only a functional satisfying its n.c.i.).However,in order to guarantee the extractability of an effectiveboundingfunction
the proof does not have to be fully constructive but may either(but not simultaneously)use
(i) the law of excluded middle schema for arbitrary negated formulas (and even full impredicative comprehension for negated formulas)or
(ii) the Markov principle(even in all types)plus the so called lesser limited prin ciple of omniscience LLPO(and even K?nig’s lemma KL),see[58,65].This,in particular,yields that for the important class of Cauchy and convergence statements one even gets effective Cauchy and convergence rates since,due to the monotonicity w.r.t.“?y”,any bound here in fact is a witness.Also for the closure under the so called fan rule to hold,a system by no means has to be fully constructive and may again contain the aforementioned classical principles(see[58,62,65]).Of course,to diagnose that a proof uses only this amount of classical logic (and not,e.g.,LEM)requires us to study the proof from the point of view of intuitionistic logic.In concrete proofs,some parts of the proof may make use ofLEM (or even more complicated instances of the law of excluded middle),and so require the combination of negative translation and functional interpretation for their analysis,while other parts may be constructive(or only use,e.g.,LEM for negated formulas)and can be analyzed by modified realizability (resp.its monotone version).This can be utilized to get in some cases full rates of convergence even from prima facie substantially nonconstructive proofs(see,e.g.,[113,126]for uses of such a“hybrid”proof analysis).
Corollaryon extracting an algorithmα:n →msatisfying?n A[n,α(n)].This job is essentially equally difficult whether one starts with an intuitionistic proof of?n?m Aor one in the corresponding classical system.…The unwinding of relatively simple(classical)proofs can be tedious.Thus theintuitionistic restrictions are too weak? so weak that a reduction to intuitionistic rules is not even a usefulintermediatestep towards unwinding classical proofs(oftheorem).([109],p.223)
Now while indeed experience shows that constructive proofs of??statements produced from classical proofs by the combination of negative translation and Fried man’sAtranslation are often even harder to analyze by,say,modified realizabil ity than the original classical proofs when analyzed by the combination of negative translation and functional interpretation(see,e.g.,[65,Ch.14]and[110]),ordinaryconstructive proofs are usually easier to analyze than classical ones.The difficulty of proofs resulting from theAtranslation is due to the fact that the latter creates for all double negated statements implications (usually of existential formulas) that are nested to the left as(A →B)→Cwhich hardly occur in ordinary proofs and which are difficult to analyze by modified realizability due to the(compared to the functional interpretation)weaker interpretation of the implicationA →B.
However,finding an ordinary constructive proof (rather than finding the con structive content of a given classical proof)may shift attention away from getting best possible witnessing terms or bounds(allowed to be verified classically)to obtaining a constructive proof of the original non quantitative statement.This may suggest re placing,e.g.,noneffective analytical principles such as WKL which do not contribute to the complexity of extractable bounds by constructive induction arguments which do.In fact,this is the reason why the moduli and constants of strong unicity for best Chebycheff approximation extracted in[54,55]from standard classical proofs turned out to be better than the ones produced in[16,17,18]as a byproduct of giving a fully constructive approach to Chebycheff approximation.
This illustrates also the point Kreisel made already in[98]:
The popular literature on socalled constructive mathematics uses a highly idealized notion of“applicability”depending on a quite problem atic view of the nature of mathematical reasoning?cf.II.5(b).As a result it rarely goes into the possibility of aconflictbetween actual applica bility and the demand for constructive proofs.Foractualapplications(which requires efficiency and intelligibility)the difference between lin ear and exponential speed up is more significant than the so to speak non archimedean difference between exponential and non recursive speed up.([98],p.123)
Related to this account of“constructivity in principle”is Kreisel’s critique of“mere definability”:
whenadditional informationis required—above,in the case of Π2theorems,about bounding functions—that discovery shifts the emphasisfrom,say,mere definability…of sometopurposeful selection from the set of all choice functions.Roughly,with mere definability,one falls between two stools:not enough for the purpose in hand,not needed for showing that the choice set in question is not empty.([107],p.249)
Another difference between constructive mathematics(in the sense of Bishop),but also predicative foundations of mathematics (as well as reverse mathematics),and contemporary proof mining is given by the fact that the former usually only con siders separable Banach spaces“to avoid pseudo generality(Separability hypotheses are freely employed…)”([13],p.x)whereas in the latter most applications concern proofs which involve general classes of abstract metric and normed structures(e.g.,arbitraryHilbert spaces).To reflect faithfully the fact that a given proof does not use a separability assumption is crucial to obtain bounds which are independent from parameters in metrically bounded substructures(rather than only compact subspaces?see[39,64]).This uniformity can then be utilized even when the bound obtained is only applied to separable spaces.When I mentioned these facts in a letter to Kreisel,his reply(in a letter dated from 28.I.09)was that Serre had pointed out to him already in the 50’s and 60’s that separability assumptions may yield convoluted (“unüber sichtliche”)bounds and that it,therefore,can be mathematically rewarding to avoid such assumptions in proofs(even if one is only interested in Banach spaces whichareseparable).
To deal with abstract classes of spaces one cannot use the usual formal systems developed to formalize constructive or predicative mathematics nor the ones used in reverse mathematics but has to add abstract structures as atoms via extra primitive types as done in[39,64].
Let us conclude this section by mentioning an aspect where the specific approach of Bishop ([13]) to constructive analysis has influenced the proof mining program positively:this concerns the frequent use made by Bishop of appropriate enrichments of concepts by data which usually are available in practice (e.g.,a modulus of uni form continuity for a continuous functionf:[a,b]→R etc.).Needless to say,this aspect of Bishop’s work had been emphasized by Kreisel long before on a number of occasions,e.g.,as related to“the most successful parts of Bishop(1967)”([109],p.238).
It will not have escaped the reader’s notice that the interests of both Littlewood and Baker were so to speak contemplative,not activist.Theyhadbounds—and (as mentioned) there arebetterones than those sup plied by the original proofs considered.Contrary to the activist propa ganda,those interests are no less permanent than the search for better bounds,which,after all,will be superseded,too.([100],p.115)
When applying proof theoretic methods to specific,prima facie noneffective,proofs with the aim of extracting explicit effective bounds,the quality of these bounds and how they compete with existing ones,of course,is an issue,but,as hinted at already in the citation from Kreisel at the beginning of this section,it is not always the numerical quality of bounds which is the main interest.In this section we will discuss both aspects.
4.1.1 Polynomial bounds in nonlinear analysis
In many applications of proof mining to nonlinear analysis during the last 25 years,the bounds extracted have been the only ones known so far(except occasion ally in special cases) and in some other cases (as discussed in Section 3) they im proved known bounds.One exception where in a special case a better bound than that obtained by proof mining was known already will be discussed further below.In the cases where the bounds obtained by proof mining are the only ones known,often the general form of the bounds makes it plausible to conjecture that they are also essentially optimal and in a number of cases and at least for some of the data the essential optimality can be shown.In the case of moduli of uniqueness in Chebycheff approximation as well as in the case ofL1approximation of functionsf ∈C[0,1],the moduli extracted in[54]and[84]have the(known to be)optimal dependence on the errorε(linear in the case of Chebycheff approximation and of orderε·ωf(ε),for a modulus of uniform continuityωfoff,in theL1case)and are simple exponential in the degreenof the approximating polynomials.For fixedn,the moduli are second order polynomials in all the data.Also in metric fixed point theory,ergodic theory and convex optimization rates of asymptotic regularity (or related moduli) for uni formly convex normed orWhyperbolic space often are w.r.t.the dependence onεof the same order as the modulus of uniform convexity and so quadratic in the Hilbert space case and and even for CAT(0) spaces (X,d) for theorems which hold in this metric generalization of Hilbert spaces(see also[73]for a discussion of this“proof theoretic tameness” of ordinary mathematics).In some cases,though,an iteration of moduli,such as moduli of uniform convexity,are used depending on some norm boundbor the errorεis needed(see,e.g.,[32,83]).By a rate of asymptotic regularity we mean a rate of convergence ford(xn,Txn)0 for some iterative algorithm(xn)n∈Nsolving the approximate fixed point problem for an operatorT:X →X.E.g.this is case for the respective rates of asymptotic regularity in[2,63,67,79,111]and the moduli in [5].The rates for the approximate solution of convex feasibility problems for Mann type schemes in Hilbert space in[51]and for cyclic projections in CAT(0) spaces in[3]are both of order(1/ε)4.As an exception to these results,the rate of asymptotic regularity for the composition of two mappings in CAT(0) spaces extracted in [81]is of order 21/ε.While a polynomial bound does not seem to be ruled out(and is known for the special case of resolvents of proper lower semicon tinuous convex functions),this would probably require an essentially new proof for the asymptotic regularity in the first place.In general,obviously,a bound extractable from a given proof can only be as good as the optimized numerical content which that proof provides.So the choice of promising proofs as well as optimizations along its analysis(making use of any available knowledge of the subject area at hand,see Section 5)is a central issue to get good bounds.
In[71],we succeeded to extract a polynomial rate from Bauschke’s proof of the so called zero displacement conjecture,more precisely for the asymptotic regularity of iterations of compositions of projections in Hilbert spaces(without any assumption on the existence of fixed points of the composition?[71]also covers the more general case of firmly nonexpansive mappings treated in [8]).Although Bauschke’s proof([7])uses a large arsenal of prima facie noneffective tools,e.g.,from the general the ory of maximal monotone set valued operators(theorems of Minty,Rockafellar and Brezis Haraux as well as the theory of strongly nonexpansive operators as developed by Bruck and Reich),the analysis of the proof exhibits a rather explicit quantitative numerical content.The rate extracted is a polynomial in all data which—viewed as a polynomial in the errorε—is of degree 8.We do not know what the lowest degree possible is.Recently,this line of research was generalized from the context of firmly nonexpansive mappings (such as metric projections in Hilbert space) to the larger class of averaged mappings,see[130]which analyzes the proof from[9].
While it was shown already in[69]that the necessary tools from the theory of strongly nonexpansive operators are available in the formal systems used in the meta mathematical bound extraction theorems,it was only recently established that one can extend these systems to cover maximally set valued operators and their resolvents in such a way that,e.g.,the proof from[7]formalizes in an appropriate such extensions([86,118]).As a consequence of this,the concrete minings carried out prior in[71]and[130]can indeed be interpreted as instances of general bound extraction metathe orems.
4.1.2 Convergence and metastability
Let us now move to the second important quantitative form of convergence state ments in nonlinear analysis,namely the convergence of (xn) itself (e.g.,to a fixed point ofTor some zero of an operatorF:X →R).IfT(resp.F) has a unique fixed point (resp.zero) one can usually extract a modulus of uniqueness ([54,65])which often can be combined with a rate of asymptotic regularity to get a rate of con vergence for(xn).This has been applied,e.g.,in approximation theory([54,55,84,128]),convex optimization([87,114]),and fixed point theory([85,Sec.4]).In rare cases one has—even in the absence of uniqueness—moduli of so called metric reg ularity which in cases of Fejér monotone situations also lead to rates of convergence([82]).In most situations,however,the existence of several solutions can typically be transformed into an argument that there is no computable rate of convergence pos sible (see,e.g.,[117]).The latter corresponds to what is referred to in numerical analysis as“arbitrary slow convergence”.So one aims for the next best thing(which alwaysispossible),namely a solution of the n.c.i.of the Cauchy property,called—as mentioned above—metastability by Tao ([132]).The rates of metastability (for strong convergence)extracted so far have in all but one case(see[89]for the latter)been primitive recursive and in almost all cases(with notable exceptions in[23,74,121]where the number of iterations depends on the outcome of a previous iteration depending on the counterfunctiong)of the form
for some transformationL??gof ?g(n):=max{g(i):i ≤n}+n(whereLmay depend onk)and some simple functionB(k)which does not depend ong(herekrepresents the errorε=2?kandgis the counterfunction in the n.c.i.of the Cauchy statement at hand?see[88]for a general logical metatheorem explaining this).As shown in[88],such a metastability bound can be interpreted in terms of the learnability of a rate of convergence using a learning strategyLandB(k) many“mind changes”.
Usually,this type of bound results from the use of the monotone convergence principle for monotone bounded sequences of reals and the Cauchy statement at hand can often been shown(for easy choices of the data)to conversely imply the Cauchy property of monotone bounded sequences of reals for which a rate of metastability of the form above is known to be optimal (though here onlyL=Idis needed)and so—modulo the precise numerical quality ofBandLwhich usually are of low complexity—also the extracted rates of metastability are optimal w.r.t.their gen eral form (see [80]for more on this).If certain gap conditions are satisfied,rates of the form (?) sometimes can be strengthened to obtain bounds on the number ofεfluctuations as discussed already further above(see[88]).
We like to stress in this section that bounds extracted by proof mining have re ceived interest by analysts even in cases where either better bounds have been known(at least in special cases)or are expected to exist.The main reasons for this are:
1.the bounds have been extracted from proofs which have been considered(e.g.,by the authors of the proofs themselves even in print)as essentially noneffective(while known bounds have only been obtained using completely new ideas and much more complicated techniques)3As stressed in a number of places by Kreisel(see also the citation at the beginning of this section),it is this aspect—namely that by proof theoretic reasoning one can conclude that Littlewood’s original proof contains an effective(and clearly subrecursive)bound(in contrast to Littlewood’s own assessment of his proof as being essentially noneffective)— which is the contribution of Kreisel’s discussion of Littlewood’s proof and not to produce any new or better bound than the ones obtained already by Skewes who used new ideas.and/or
2.the bounds extracted show a striking uniformity(independence from most data)and/or
3.the bounds—being obtained from a prima facie nonconstructive but very gener ally applicable argument—easily generalize to all situations to which the same argument can be adapted and/or
4.in the course of the extraction certain assumptions have turned out to be su perfluous making a generalization possible even to situations in which these assumptions are no longer valid.
An early example of an “unwinding” of a highly impure (not “methodenrein”,see Section 5)proof illustrating point 1(and to some extent point 2)is Girard’s analy sis([40])of the proof by Furstenberg and Weiss([36])of van der Waerden’s theorem using topological dynamics (see also [37]for “proof mining” based on monotone functional interpretation in this context).That the Ackermann type bound extracted from this proof soon was superseded by Shelah’s primitive recursive bound and—much later—Gowers’Kalmar elementary bound does not reduce the general logical significance of this unwinding which has been discussed by Kreisel on many occa sions.
As a“proof mining” example in nonlinear analysis which illustrates well points 1–4 let us briefly recall an early success of proof mining in metric fixed point theory:consider a bounded convex subsetC ?Xof some normed spaceXand a nonexpan sive(i.e.,1 Lipschitz)mappingT:C →C.Let(λn)be a sequence in[0,1].In 1976,Ishikawa([50])proved that under suitable conditions on(λn)the sequence(xn)of so called Krasnoselski Mann iterations
is asymptotically regular,i.e.,
If,moreover,Cis compact,then (xn) converges to a fixed point ofT.One eas ily shows that the sequence (∥xn ?Txn∥)n∈Nis nonincreasing so that (?)∈??.Since∥xn ?Txn∥∈[0,∥x0?Tx0∥]?[0,D]withD ≥diam(C),the sequence(∥xn ?Txn∥)converges by the monotone convergence principle.What Ishikawa’s proof establishes is that the limit is 0.Here one is precisely in the situation discussed in Section 3 (note that the Cauchy property of (∥xn ?Txn∥) is in):both from a classical as well as from a constructive point of view the proof does not seem to provide a rate of convergence to 0 since the convergence as such is derived from the noneffective monotone convergence principle.However,as explained in Section 3,in the light of proof theory,the proof clearly does contain an explicit rate of conver gence which was extracted in [61].In fact,[61]analyzes a more general theorem due to Borwein,Reich and Shafrir([15])which states that even for unboundedCand unbounded sequences(xn)n∈None has that(∥xn ?Txn∥)n∈Nconverges to the so called minimal displacement infx∈C{∥x ?Tx∥}ofT(which happens to be 0 in the bounded case).
The rate extracted is extremely uniform:in the bounded case it only depends on a boundD ≥diam(C),the errorεand some general moduli witnessing the assump tions on(λn).In particular,the bound is independent ofT,the starting pointx0∈Cof the iteration and(except forD)even ofCandX.The(qualitative without any rate)independence fromTandx0has been the main result in[42]and the independence fromCwas conjectured by the same authors “unlikely” to be true in [43,p.101].When specialized to the case of constantλn=λ ∈(0,1)the bound extracted in[61]is simple exponential while an optimal quadratic rate of convergence in this case has been known since 1996 by an extremely difficult proof due to[6].
An extension of the result to general (λn) was only conjectured in [6]and re mained open until 2014 when it was verified in[21](by a completely new approach).
Hence between 2001 and 2014,the rate of asymptotic regularity extracted in[61]was the only rate known for general(λn)(satisfying the conditions in Ishikawa’s theorem)and general Banach spaces.Even for constantλ,the rate from[61]is still the only one known for the case of unbounded sequences (xn)n∈N.As conjectured already in[61],the proof theoretic analysis could be generalized to general so calledWhyperbolic metric spaces(X,d,W)and largely also to directionally nonexpansive mappings(which can be discontinuous)as was verified in[77].Both in the case of directionally nonexpansive mappings in bounded convex subsets of Banach spaces as well as even in the case of nonexpansive mappings in bounded generalWhyperbolic spaces the rates extracted in[77]as a generalization of[61]are the only ones known today.
In another case,a rate of asymptotic regularity for so called Halpern iterations was obtained for the first time by proof mining for general sequences of scalars in[112].For a special choice of the parameters the exponential rate resulting from[112]was improved to a quadratic one in [67],again in the context of proof mining.Six years years later,for a related special choice of the scalars,a linear rate was estab lished (not using proof mining) in [120]and—with the optimal constant but in the case of Hilbert spaces only—in [115].The paper [120]used a new proof of an el ementary lemma for sequences of reals adapted to this particular choice of scalars.As this lemma was also used in various proofs that had been logically mined already,the improvement of that lemma obtained in [120]in the special case at hand could easily be incorporated into these existing minings,resulting in new linear rates in a number of other situations such as the more general modified Halpern iterations and the Tikhonov Mann iterations and even in general geodesic settings instead the linear normed case only(treated in[115,120],see[19]).This nicely illustrates the modular nature of the proof mining approach.
Let us comment more on the topic of “generalizing proofs” by proof mining:from the very beginning of uses of proof mining in nonlinear analysis,it became clear that often the explicit finitary character of the interpreted proof made it obvious that the result(as well as its new quantitative version obtained by proof mining)could be generalized from the setting of normed spaces(or Hilbert spaces)to general geodesic settings such asWhyperbolic spaces(or CAT(0)spaces and even CAT(κ)spaces withκ >0.Another type of generalization was recently obtained in [75]where results on the proximal point algorithm PPA and its Halpern type variant HPPA could be easily generalized on the level of“mined”proofs obtained in the setting of monotone operators to operators which are allowed to be nonmonotone in a controlled way(so called comonotone operators,see[10])since the property of monotonicity was only used in some approximate way.Finally,the mining of a proof in the context of pursuit evasion games in[83]made it possible to eliminate a compactness assumption which was crucially used in the original proof.Yet another generalization of a given proof on the level of the finitary analysis of an original proof can be found in[24]where a previous mining of a strong convergence proof in the context of Hilbert spaces,which proceeded via by passing a sequential weak compactness argument,was generalized to the setting of CAT(0)spaces where even the qualitative strong convergence result was new.
Legalistically,G?del’s papers only settle questions about the possi bility of purity of method.But inspection of the arguments suggests quite strongly that thewhole ideal of purity of method is suspect,even when it can be achieved.” ([101],p.164)
Kreisel repeatedly criticized the logical ideal of “Methodenreinheit” (“purity of method”) as a central component in Hilbert’s foundational program.While—as Kreisel argues in[101]—G?del’s completeness theorem states that in first order logic,purity of method is achievable in principle,G?del’s incompleteness theorem is pre sented by Kreisel to have established the opposite for number theory.As(the proof theoretic form of)Herbrand’s theorem and subsequently Gentzen’s Hauptsatz show,to establish a logic validity one not only does not have to use tools outside the logical calculus but one can even avoid the use of lemmas(“direct proofs”).The latter,how ever,comes at the price of a nonelementary increase in the size of the proof and the loss of a conceptual understanding of a proof.For number theory(and other parts of core mathematics),in practice,the G?del incompleteness phenomenon almost never occurs and so—in principle—one can also provide“pure”proofs.But as indicated in the citation from Kreisel at the beginning of this section,it is problematic to strive for pure proofs as a general ideal since what is lost is
knowledge of relations between the objects referred to in the theo rem stated and other things?in short,the kind of knowledge obtained by looking at those objects from a broader point of view.In the case of the prime number theorem,by looking at the natural numbers as embedded in the complex plane.([103],pp.74 75)
Now,while methods such asεsubstitution,Herbrand analyses and cut elimina tion,which have a close connection to the quest for direct “pure” proofs,do play a major role also in Kreisel’s unwinding program,the emphasis in the latter—as already indicated in Section 2—is quite different,namely on extracting witnessing data for?statements rather than on verifying these data by pure methods.For functional interpretations,the primary techniques in contemporary proof mining,the distinction between these different goals is even more apparent as these interpretations clearly distinguish between them.
As an early example,let us recall the treatment of WKL based proofs by the combination of functional interpretation and majorization (subsequently phrased as“monotone functional interpretation”)as developed in[52,53]and applied to proofs in approximation theory in[52,54,55]:WKL is first re written as a sentence of the form ?considered already in Section 2 (withρ=τ=1,γ=0 ands(x)=11).Consider now a proof relative to extensional Peano arithmetic over all finite types E PAω
with quantifier freeBqf(containing onlyf,xfree).By elimination of extensionality,one first reduces things the weakly extensional setting of WE PAω(needed to apply the functional interpretation).The monotone functional interpretation(combined with negative translation)now asks for a majorant Φ?for the Skolemized form
of ?,which—using the truth of(?)—is simply given by 11→1,to compute a bound and so(by bounded search or—due to monotonicity—immediately)to produce a wit nessing termt
for the conclusion provably in WE HAω+(?),i.e.,using a uniform version UWKL even of WKL.Instead of focusing on the extraction oftone can also use this ap proach to eliminate WKL from the original proof:by the deduction theorem(which applies to E PAωbut not to WE PAω)one obtains(using subsequently elimination of extensionality)
and so
The monotone functional interpretation can now be used to extract a uniform bounds(f)on“?z”with
Since one can construct(uniformly inz)a Φz(mì) ≤1 such that
One now can use either functional interpretation or modified realizability to extract a witnessing termt.
Of course,the elimination of WKL via the extraction ofsand the extraction oftare closely related and in some cases may happen almost simultaneously.But in general,experience in proof mining shows that it is usually very helpful to focus on the extraction oftand to freely use WKL and even UWKL for its verification.In practice,moreover,proofs will not use WKL as such but specific theoremsT(e.g.,in the case of Chebycheff approximation the fact that there is a best approximating polynomial which—by the Chebycheff alternation theorem—has an alternant)which can directly be seen to have the form ?so that there is not even the need to proveTfrom WKL(and then to eliminate WKL from the proof).
Even when dealing with nonstandard axioms of the form ?as mentioned in Section 2 (e.g.,the axiomsFandFXin [56]and [65]which are not true in the respective full set theoretic modelsSω,Sω,X) there is (modulo some technicalities related to the failure of the deduction theorem) the possibility for a proof theoretic elimination of these axioms from proofs(of suitable classes of theoremsA)but there is no need to carry this out:one can simply reason in Bezem’s modelMω(resp.its extensionMω,X) which satisfiesF(resp.FX) to get the validity of the extracted bound inMωand from there(using mild type restrictions inA)inSω.
As indicated above,in many cases,by additional proof theoretic arguments,one can actually guarantee that a constructive verification of the extracted term can in principle be achieved but even when one might be interested in this due to founda tional reasons there is no point in actually carrying this out.Strangely,constructive proofs which are guaranteed to be possible by logical metatheorems are sometimes in the constructive literature dubbed as being only“formal constructive proofs”(see e.g.[122])as if for a proof to count as genuinely constructive it had to be produced so to speak“im Schwei?e des Angesichts”(“by the sweat of the brow”).
The hybrid(partly proof theoretic and partly model theoretic)way of reasoning sketched above is even more useful when one works in theories with abstract spacesXadded as base types for which one does not have an elimination of extensionality procedure and for which one has to use the weakly extensional formulation which does not satisfy the deduction theorem.To have the deduction theorem is considered in general to be a crucial requirement for a logically well behaved formal system and systems such as WE HAωare,therefore,sometimes dismissed as“not very attractive:the deduction theorem does not hold for this theory”([133],p.231).For proof mining purposes the lack of the deduction theorem,however,is a blessing as it allows one to distinguish between assumptions treated as axioms Γ(i.e.,written to the left of“?”)from implicative assumptions ?!鶤to which proof mining could be applied as well asking,e.g.,how much of Γ is actually needed to prove a certain instance ofA.For example,if Γ is some identitys=X tbetween objects of typeXone may ask for aδ >0 such thatdX(s,t)≤δsuffices to prove someεinstance ofA.
In the presence of mappingsT:X →X,such an analysis will in general only be possible ifTis uniformly continuous(on the set of data in question)since uniform continuity(with modulus)is the quantitative form of extensionality(+)s=X t →T(s)=X T(t).There are,however,special uses of extensionality,such ass=X t ∧T(s)=X s →T(t)=X t,whose monotone functional interpretation requires strictly less information than(a modulus of)uniform continuity which can be provided also for classes of discontinuous functions such as functions satisfying the so called conditionEof Suzuki(see[70]for more on this).
The issue of extensionality has recently been studied in the context of set valued operatorsA:X →2Xon Banach spacesX.Then,already the proofs to be analyzed sometimes require a continuity assumption which is expressed in terms of the Haus dorff distance.In[87]it is shown how this can in certain situations be replaced by a condition(together with a modulus witnessing this condition)which can be written as an axioms ?(see the axiom(UC)in[119])whereas the Hausdorff distance cannot even be defined in the formal systems used in proof mining.In fact,uniform continu ity w.r.t.the Hausdorff distance is not even appropriate to express the extensionality of set valued operators (see [118,119]).In [119]an interesting characterization of the extensionality of a monotone operatorAis given in terms the maximality ofAwhenX:=His a Hilbert space(and with“accretivity”in the Banach space case).Maximal monotonicity usually is only used to show—by means of Minty’s theorem—thatId+Ahas full rangeHand that fact can be formalized in a way which permits metatheorems on bound extraction without involving extensionality(see[119]).This seems to be the reason why often extensionality issues do not arise in the context of maximally monotone operators.
In general,nevertheless,to be able to deal with discontinuous mappings we only allow a rule version of(+)which then requires us to takes=X tas an axiom which cannot be replaced by some approximate weakening.This has many applications in nonlinear analysis and the issue is extensively discussed,e.g.,in [39,64,65](note that in the model theoretic approach to metric and normed structures as in positive bounded or continuous logic,the uniform continuity of all function symbols has al ways been assumed except for the[20]which aims at generalizing this prompted by the proof theoretic treatment of discontinuous situations).
Yet another manifestation of the ideal of“purity of methods”is the view some times formulated by logicians,that for a proof theoretic analysis of a proofpto count as a genuine application of logic,that analysis must not use any knowledge of the subject matter in addition to what is used inp.It certainly is of both general logi cal interest as well as practical use(to systematically search for promising proofs)to formulate metatheorems which guarantee that a bound can be extracted in principle frompalone provided thatpcan be formalized in a suitable formal system.But the purpose of actually carrying out such an analysis is to get a much better numerical information than the kind of complexity upper bound provided the metatheorem and for this—as explained already in Section 3—one,of course,will use all available additional knowledge,optimization tricks etc.,that are helpful.As Kreisel used to say:
I do not conceive of mathematical logic as the mathematics for mo rons.([92],p.171)
Moreover,even in cases where there is no general logical metatheorem which a pri ori guarantees the extractability of an effective bound,as this may depend on some conditions that can only be checked a posteriori when certain witnessing data have been extracted(as,e.g.,is the case for finiteness theorems where the terms extracted have to satisfy a suitable gap condition,see [88,102,116]),it seems completely in line with the usual concept of“application”in mathematics to call this an application of logic despite the criticism raised in[28]by which many applications of ideas and concepts of an areaAof mathematics in a different areaBcould not be counted as“applications”as well.