Tuesday, 1 July 2014

Philosophy of Mind and Psychology Reading Group -- The Predictive Mind chapter 5

Jona Vance
Welcome to the fifth post of the online reading group in the Philosophy of Mind and Psychology hosted by the Philosophy@Birmingham blog. This month, Jona Vance (Northern Arizona University) presents chapter 5 of The Predictive Mind by Jakob Hohwy (OUP 2013).

Chapter 5 - Binding is Inference
Presented by Jona Vance

Part 1 of the book (Chs 1-4) sets out the prediction error minimization (PEM) framework. Part 2 (Chs 5-8) applies the PEM framework to a number of specific problems and phenomena in cognitive science and the philosophy of mind. This post is on Ch 5, which addresses the binding problem (or problems). 

Hohwy has two main stated aims in Ch 5. First, he aims to use PEM to give a “reasonably detailed answer” to the binding problem. Second, he aims to use the debate about binding issues and the phenomena it centers on to illustrate how the PEM framework can be applied to various interesting cases. So the chapter aims to use PEM to illuminate how binding works and aims to use binding to illuminate how PEM works.

Hohwy glosses the binding problem in a few ways. On one gloss it concerns “how the brain manages to discern properties of objects in the world and correctly bind these properties together in perception” (p. 101). A second gloss adds that part of the problem is to explain how the brain correctly binds properties “in spite of processing them in different regions throughout the brain” (p. 101). For example, if visual receptors receive information as of something red and as of something round and the olfactory system receives information as of something sweet, the brain still has to figure out whether the redness, roundness, and sweetness are properties of the same object or not. And it has to do so despite processing some of the information in different regions.

Hohwy notes that there are numerous approaches to the binding problem (p. 102). It’s also important to note that there is not just one binding problem; there are numerous related binding problems. Binding issues arise in perceptual processing for information across space, across feature types, and across sensory modalities, and for binding neural signals across cortical regions. They arise for single percepts and for multiple percepts, at a single time and across time. This is worth emphasizing in PEM’s favor. On Hohwy’s account, PEM solves the binding problems through a very general mechanism: causal inference via prediction error minimization. PEM promises to offer an elegant solution to the full range of binding problems.

Hohwy’s own PEM solution to perceptual binding problems begins by noting that the ambiguity that must be resolved for accurate binding is in principle no different from other ambiguity problems the perceptual system faces. As a result, we can appeal to the same solution for binding problems as for the more general underdetermination problems that motivate constructivist approaches to perception generally, of which PEM is an example. Regarding perceptual binding, Hohwy appeals to the same Bayesian story as is used for other ambiguity resolution: properties will be bound together and to objects represented in perception just in case a proposition expressing them as bound is the mean or maximum a posteriori (MAP) of the Bayesian perceptual inference. For example, whether the perceptual system represents some bit of redness and roundness as properties of the same object or different ones depends on whether the binding hypothesis (a red, round object) is the MAP of the perceptual inference or not. If it is, the properties are represented as bound in the percept; otherwise not. Researchers working on perceptual binding problems have noted that spatiotemporally overlapping properties tend to get bound in perception. This makes sense from a Bayesian perspective, since there is a high prior that spatiotemporally overlapping properties are properties of the same distal object.
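The MAP story can be illustrated with a toy calculation. (All probabilities below are invented for illustration; this is a generic sketch of Bayesian model selection, not Hohwy’s own model.) Spatiotemporal overlap gives the one-object binding hypothesis a high prior, and whichever hypothesis has the highest posterior determines whether the properties are represented as bound in the percept:

```python
# Toy MAP inference over two binding hypotheses.
# Hypotheses: "one_object"  = one object that is red and round (properties bound),
#             "two_objects" = two objects, one red, one round (properties unbound).
# Spatiotemporal overlap of the cues raises the prior on the binding hypothesis.
priors = {"one_object": 0.8, "two_objects": 0.2}        # overlap-informed priors

# Likelihood of the sensory data (redness + roundness signals) under each hypothesis.
likelihoods = {"one_object": 0.6, "two_objects": 0.5}

# Unnormalized posteriors via Bayes' theorem; normalization doesn't affect the argmax.
posteriors = {h: priors[h] * likelihoods[h] for h in priors}
map_hypothesis = max(posteriors, key=posteriors.get)

# The properties are represented as bound just in case the binding hypothesis is MAP.
print(map_hypothesis)  # → one_object
```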

Not all Bayesian approaches to perception adopt the PEM framework. (It would have been good, I think, if the book had made this point clearer.) However, at first blush PEM seems to provide a particularly elegant solution to the binding problem. PEM posits that hypotheses about the world are represented at various levels of a perceptual hierarchy (where the levels according to Hohwy’s version of PEM are individuated according to causal regularities at different time scales). The hypotheses already bind together relevant properties. So the framework provides part of a solution to the binding problem almost by default: the system simply builds in that properties are represented as bound at the various levels. In addition, a full solution to the binding problems requires that the properties be bound *accurately*. This aspect of the problem is solved in PEM because the hypotheses are constantly supervised by feedback from prediction error signals up through the hierarchy. According to the model, inaccurate hypotheses are revised in response to error signals generated by the sensory data.
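The error-driven revision just described can be sketched in miniature: a hypothesis (here, a single scalar estimate of some feature value) is nudged by precision-weighted prediction error until the error is minimized. The function name, the precision value, and the numbers are all invented for illustration; real PEM models are hierarchical and far richer than this.

```python
# Minimal sketch of prediction-error-driven hypothesis revision for one scalar.
def revise(hypothesis, sensory_data, precision=0.5, steps=20):
    """Iteratively revise a hypothesis toward the sensory data it predicts."""
    for _ in range(steps):
        error = sensory_data - hypothesis   # prediction error signal
        hypothesis += precision * error     # precision-weighted revision
    return hypothesis

# An inaccurate initial hypothesis (0.0) is corrected by the data (1.0).
estimate = revise(hypothesis=0.0, sensory_data=1.0)
print(estimate)  # converges toward 1.0 as prediction error is minimized
```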


Hohwy addresses binding both within and across modalities. He illustrates how cross-modal binding works according to PEM using the rubber hand illusion (pp. 104-106).

I now want to use the rubber hand illusion to raise a worry for PEM. There seem to be inconsistencies in the representations across modalities in the rubber hand illusion. In the illusion, the following propositions are all simultaneously represented in perception, though not all in the same modality.

Visually, it looks as if

(1) the rubber hand is rubbery.

Haptically, it feels as if

(2) the subject’s hand is not rubbery.

But haptically again, it feels as if

(3) the subject’s hand is the rubber hand.

(1), (2), and (3) are inconsistent. So it looks like one’s perceptual system simultaneously represents a set of inconsistent propositions across two modalities (sight and touch).

The rubber hand illusion is not unique in having inconsistent propositions that are represented simultaneously across modalities for a single subject. To see another example, consider the haptic and visual contents that are simultaneously represented when one both looks at and holds a straight stick partly submerged in water. The represented claims include:

Represented visually:

(4) The stick is bent.

Represented haptically:

(5) The stick is unbent.

(4) and (5) are inconsistent: the stick can’t be both bent and unbent. So again it looks like there are actual cases in which inconsistent claims are simultaneously represented across different perceptual modalities. (Note that the claim is about inconsistencies across modalities. There is no claim here that a single percept has inconsistent contents, as in some objections to classical sense data theory.)

Here’s how these cases might provide an objection to PEM. If PEM is true, percepts are generated via Bayesian inference from a set of hypotheses represented in the perceptual hierarchy. The hierarchy is unified in important ways, which is essential to Hohwy’s PEM-based explanation of how binding occurs within and across modalities. So the contradictory hypotheses would have to simultaneously derive from a hierarchy of hypotheses that is in an important sense unified. This seems implausible. So on PEM we should expect that these cross-modal contradictions in perceptual representation do not occur. But such cases do occur. So PEM is not true.

One might reply that in some cases of perceptual binding, only hypotheses that bear on representation in one modality are engaged in the perceptual inference. Even if PEM posits a perceptual hierarchy that is unified in important ways, not every perceptual inference draws from the whole hierarchy. So, for example, one might argue that PEM does not entail that the perceptual inference that yields the haptic representations in the rubber hand illusion utilizes the same hypotheses that are utilized for the perceptual inference that yields the visual representations.

However, this reply seems unpromising. Even if it is true that some perceptual inferences are confined to representations that bear on some narrow perceptual task, that isn’t the case in the rubber hand illusion. The point Hohwy makes in the chapter is precisely that in such cases, the perceptual system draws from a core hierarchy of hypotheses and coordinates information across both modalities to generate the relevant percepts in each modality. The problem for PEM that I am raising is that even when such cross-modal binding occurs, the perceptual system still yields inconsistent representational outputs simultaneously. On a Bayesian PEM account, that arguably should not happen.


  1. Thanks a lot for the post, Jona. A question about your objection to PEM from cases of cross-modal contradictory contents: why do you see these cases (like the bent stick and rubber hand illusions) as particularly problematic for PEM to explain, more so than ordinary (non-cross-modal) cases of perceptual error? It seems like the story that the PEM theorist should offer in response to these takes pretty much the same shape as the one they would offer whenever the content of perception is non-veridical: that although the system was engaging in a good inference-like process, given its priors it simply reached a different conclusion than what actually happened to be the case in the external world (at least with respect to the contents represented in one modality, in the cross-modal cases you describe). Just because the system has access to some information with content X doesn't mean that the priors as a whole, when funneled through the Bayesian process, will generate a perceptual representation that doesn't contradict X, because lots of information will ultimately bear on the hypothesis that is selected to become the content of perception. For example, in the bent stick case, if you visually perceive the stick as bent first, and then touch it and feel it as unbent, the information that the stick appears bent in vision may to some degree influence the processing that goes into generating a tactile representation as of the stick as unbent, but (1) it will be outweighed by other priors, such as the prior that sticks tend not to be bent, and (2) that hypothesis will be rejected due to checks from touch.
Maybe over the long run we'd expect the visual illusion to disappear due to cross-modal checks from other senses, but there may be strongly held priors about the way light reflectance works in general that make this special case of representing an object that is partially underwater particularly difficult to get right, without revising the rest of our visual representations of standard (non-watery) cases such that they aren't totally off instead. I'm curious to hear from Jona and Jakob and anyone else whether they think this sort of line is an adequate response for PEM.

  2. Hi Zoe,
    Thanks for responding. You may be right that PEM theorists can reply to the worry I raised along the lines you suggest. I’m not yet convinced. Also, even if the worry turns out not to be a problem for PEM, working it out may help me better understand the PEM account of binding.

    Let me say a bit more about why I focused on the kind of cases I did. I was focusing on inconsistent contents represented simultaneously across two modalities. Simultaneous because I wanted each of the relevant contents to be the upshot of a closely related set of common cause inferences on the same set of sensory data, along the lines Jakob sketches in Ch 5, not separate inferences on temporally distinct sets of sensory data. Cross-modal because I’m not sure that there can be inconsistent contents within the same percept. Some people think that, e.g. the waterfall illusion is such a case, but that’s controversial, and I wanted to avoid that controversy.

    In re-raising the worry, I’ll highlight two aspects of the PEM account of binding that Jakob gives in Ch 5. First, according to PEM there is a single perceptual hierarchy with a unified set of hypotheses about the world. Second, the perceptual system coordinates its representations in one modality with sensory data received in other modalities. As I understand Jakob’s account, the perceptual representation of the location of the touch in the rubber hand illusion takes into account visual data received (e.g. whether the rubber item looks hand-shaped). The cross modal coordination helps explain how properties are represented as bound using information from various modalities.

    Okay, here’s another try at the worry: when these two aspects of the account are combined (i.e. unified perceptual hierarchy plus coordinated cross-modal integration of sensory data), the account seems to predict that simultaneous perceptual representations will be consistent. It does so because the account seems to say that a single perceptual hierarchy integrates data from various modalities to generate percepts in each modality, in rough accord with Bayes’ theorem. The probability of an inconsistent set of hypotheses would likely be low (at or near zero). So it seems implausible that each of the hypotheses in the inconsistent set would be the max of the posterior of the relevant inference, for each of the specific perceptual representations in question. So it seems like the Bayesian, coordinated, PEM account would predict that the perceptual system can’t generate sets of simultaneously represented, inconsistent hypotheses. However, there do seem to be cases in which there are simultaneously perceptually represented inconsistent hypotheses.

    To clarify a bit more about how this might be a special problem for PEM, consider the following. There is some empirical evidence for coordination of sensory data across modalities. So the coordination aspect is not the part of Jakob’s PEM account to which the worry applies. Rather, I’m thinking the worry seems especially to concern the PEM claim that there’s a unified perceptual hierarchy which drives perceptual inference in all the modalities. Other accounts need not entail that perceptual inferences are generated by a single hierarchy of hypotheses as PEM seems to. So other accounts could partly explain the existence of simultaneous, inconsistent perceptual representations by appeal to the different hypotheses that are utilized for the inferences in the distinct modalities.

  3. Thanks to Jona for this excellent discussion of Ch 5. I think it captures very nicely not only the flow of the argument in that chapter, but also the intent behind it. I am sure not every aspect of the binding problem has been covered in that chapter, and so it also works as an invitation to think further about binding in the light of PEM.

    Jona raises a problem about the use of the hierarchy for these kinds of common cause inferences, in multisensory cases. For example, how can the stick both be represented as bent and as straight if the representations in the hierarchy are meant to be unified and coherent?

    This is a nice challenge to think about. It relates not only to the binding problem but also to cognitive penetrability, discussed in Ch 6, where the issue is how the system can accommodate both the experience of, say, the Müller-Lyer illusion and the belief that the experience is not veridical. It also relates to Ch 7, where I discuss the competing representations in vision for perception and vision for action (e.g., the Ebbinghaus illusion). This in turn speaks to the issue of misrepresentation, which Zoe mentions, and which relates to Ch 8. Further, it relates to the later chapters on unity of consciousness and self.

    I think I can see Jona’s point that the issue concerns simultaneous representations of contradictory attributes with respect to a given perceptual inference. This I agree prevents a straightforward appeal to misrepresentation along Zoe’s lines.

    As a preparatory remark, I think we should allow contradictory representations in the system, as long as they are associated with unrequited prediction error (as in the case of cognitive penetrability). But Jona’s point is that this route is closed in the case where crossmodal binding occurs and prediction error therefore is minimized (i.e., not unrequited) under the contradictory representations.

    The answer here has to do with the final perceptual inference. A major plot in the book is the brain’s proclivity for ridding itself of prediction error even at the cost of rather gross misrepresentation, which may fly in the face of long-held priors.

    Binocular rivalry is one example, where coherence is maintained and prediction error thus minimized even if there is apparently gross contradiction in the system (a face which is a house). The brain solves this by reverting to the temporal dimension: it creates a temporal illusion of there being a face, then a house, then a face, etc. Voilà: no contradiction.

    Similarly with the stick which is seen to be bent and felt to be straight. My guess is that the system treats this as two objects in different places. That is, this situation would give rise to a spatial illusion (similarly for the Ebbinghaus case, cf. Ch.7).

    These two cases have something in common because the inference is not to a common cause. The contradictory representations force an inference to distinct causes.

    There are a few other things in play here. If we consider the ventriloquist, we cannot say that there is no inference to a common cause (since then there would be no illusion). In a true ventriloquist case, there is cue veto of the auditory representation. That is, the visual cue is considered so strong relative to the auditory cue that it wins outright. This gets rid of contradiction between the multisensory representations.

    The RHI is in all likelihood a mixture of these mechanisms. There may be cue veto of the spatial aspect of the tactile sensation, and perhaps of proprioception. And there is illusory inference about the perceived situation: people will experience their real hand as being somehow rubbery itself (they often report the hand has numbed skin, is paralysed, cold). So they are rummaging around in supernatural/bizarre hypothesis space in an attempt to preserve their prediction error minimization. It is a probabilistic divide and conquer strategy for dealing with prediction error.

  4. (Just adding two more comments)
    Slightly more formally, hypotheses that are independent but become dependent conditional on some evidence must lead to one hypothesis explaining away the other. Explaining away doesn’t ensue if there is reason to believe the evidence pertains to different hypotheses. For example, upon observing my wet lawn and my neighbor’s wet lawn, I may avoid the conclusion that it’s been raining (a conclusion that would explain away the hypothesis that the sprinkler has been on), namely if I accept the unlikely coincidence that both my and my neighbor’s sprinklers have been on, which in turn explains away the rain hypothesis.
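    The sprinkler example can be run as a toy Bayes net. (The probabilities and the noiseless-OR assumption below are mine, purely for illustration.) Conditioning on both lawns being wet raises the posterior on rain, since rain explains both observations at once; additionally learning that both sprinklers ran explains the evidence away, returning rain to its prior:

```python
from itertools import product

# Toy Bayes net with invented numbers: rain plus two independent sprinklers;
# each lawn is wet iff it rained or its own sprinkler ran (noiseless OR).
P_rain, P_s1, P_s2 = 0.2, 0.1, 0.1

def joint(r, s1, s2):
    """Joint probability of the three independent root causes."""
    p = P_rain if r else 1 - P_rain
    p *= P_s1 if s1 else 1 - P_s1
    p *= P_s2 if s2 else 1 - P_s2
    return p

def posterior_rain(condition):
    """P(rain = True | condition), by enumerating the joint distribution."""
    num = den = 0.0
    for r, s1, s2 in product([True, False], repeat=3):
        w1, w2 = r or s1, r or s2  # wet-lawn variables
        if condition(r, s1, s2, w1, w2):
            den += joint(r, s1, s2)
            if r:
                num += joint(r, s1, s2)
    return num / den

# Both lawns wet: rain is strongly supported (one cause explains both lawns).
p1 = posterior_rain(lambda r, s1, s2, w1, w2: w1 and w2)
# Also learning both sprinklers ran: rain is explained away, back to its prior.
p2 = posterior_rain(lambda r, s1, s2, w1, w2: w1 and w2 and s1 and s2)
print(round(p1, 3), round(p2, 3))  # → 0.962 0.2
```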

    One last point about Jona’s comments about the varieties of Bayesianism about perception (many of which I ignore in the book). It is true that not every Bayesian approach to perception subscribes to PEM. However, there is something to be said for the notion that they all are subsumed under the free energy principle, at least to the extent they lend themselves to self-supervised systems (whereas if they require supervision, then they are not so interesting).

  5. Hi Jakob,
    Thanks for your reply. Let me see if I can summarize some of your main points, to see if I’ve understood your response.

    In my post I suggested that the perceptual system sometimes simultaneously represents contradictory contents across modalities. I take it that you reject that claim, though you do leave open the possibility of contradictions across levels in the perceptual hierarchy--e.g. in cases of recalcitrant illusion such as the Müller-Lyer.

    In the rubber hand illusion (RHI), you accept that there’s a visual representation as of the hand being rubbery (or something close to that). And you accept that the perceptual system treats the rubber hand and the subject’s hand (which she can feel being touched) as being the same object. The PEM explanation is that the proprioceptive information is imprecise, since the hand’s position is held fixed by restraints, so the system can’t engage in active inference by moving the hand around. But, you deny that the perceptual system represents the hand as being unrubbery (or something close to that). You point out that subjects report that their hands feel numbed, cold, or otherwise weird. That suggests that the system does integrate its visual and tactile representations of the hand in the RHI, in ways that PEM predicts.

    In the bent stick illusion, you again deny that there is a simultaneous contradictory set of representations by the perceptual system. You agree with my characterization that vision represents the stick as bent and that touch simultaneously represents the stick as straight/unbent. But you rightly point out that that is not enough for a contradiction if we’re careful about the contents. To get a contradiction, the perceptual system would also need to represent that there is just one stick in the perceptual field. But, you plausibly suggest, the system doesn’t do that; rather, it treats the stick as if it were two objects.

    I find your response to my worry about the rubber hand illusion to be very plausible. I didn’t realize (or had forgotten) that the perceptual system represents one’s own hand as feeling really weird. That seems like good evidence that the system is trying its best to integrate the visual, haptic, and proprioceptive information to generate a consistent set of representations. I take it that this is what PEM predicts (was I right about that?). If so, rather than posing a challenge, the RHI actually provides support for PEM.

    I’m less persuaded initially by your response to the bent stick illusion. I agree that there’s no contradiction between existentially quantified claims such as the visual representation that (1) there exists a bent stick and the haptic representation that (2) there exists a straight stick. So, you’re quite right to point out that we need some kind of explicit representation that (3) there’s just one stick (in the perceptual field). Your initial response was to deny that the system represents (3). But I think we can make it plausible that the system represents (3) if we add a few details to the case. Suppose that the subject actively moves the stick around in the water for a while and attentively watches its path throughout. This gives the subject a lot of additional visual, haptic, and proprioceptive information about the presence of just one stick. On the Bayesian PEM account, it would be very implausible that there are two sticks in the perceptual field which just happen to remain co-located throughout the relevant time. That suggests that the perceptual system would then soon treat the stick as just one object and thereby represent (3). This version of the case avoids your initial response. Additionally, if one does move the stick around, I think it’s plausible that it still looks bent and feels straight. So this version of the case looks like it involves simultaneous representation of (1), (2), and (3). How does PEM handle this version of the case?

  6. Hi Jona,
    About the rubber hand illusion, we see a lot of different kinds of responses, as the PEM wheels frantically try to minimize prediction error. Some people in fact override the rubberiness and instead hallucinate that the rubber hand looks like their own, for example with prominent veins and distinguishing marks. Quite a few people also report fairly inchoate, uncertain experiences of tactile touch that is vaguely spread out between the real and the rubber hand (something weird is afoot but no confident inference is made).

    In the case with the stick that looks bent but feels straight, there are several avenues for PEM, some of which I explore in Ch 7 and elsewhere in the book. In the discussion of the Ebbinghaus-type cases, which are somewhat similar, I suggest that the system might accept some isolated islands of misperception in the service of long-term prediction error minimization. This would translate to some expected uncertainty in the overall perceptual inferences people engage in (and here there would probably be some individual variability). From the perspective of PEM this is fine, since the system should not expect perfect minimization of error. There is also a possibility that some low-level perceptual priors are updated as the system soaks up new statistical regularities, as can be seen in the light-from-above prior, and in motor learning in the Müller-Lyer illusion (see Skewes et al., DOI: 10.1007/s00221-011-2542-1).
    I also still believe that this kind of situation can lead to supernatural-type experiences, à la the way people responded when Bryan and I introduced conflict into the rubber hand illusion (DOI: 10.1371/journal.pone.0009416). Perhaps, if the conflict began to interfere with action, people would experience being bilocated or having supernumerary limbs...

    It would be nice to see more examples of this kind of perceptual conflict. One thing that comes to mind is what happens when wearing inverting spectacles over long periods of time; Linden et al. (Perception, 1999) seemed to show that in that case we just learn new sensorimotor regularities.