How visual perceptual grouping influences foot placement
Everybody would agree that vision guides locomotion; but how does vision influence choice when there are different solutions for possible foot placement? We addressed this question by investigating the impact of perceptual grouping on foot placement in humans. Participants performed a stepping stone task in which pathways consisted of target stones in a spatially regular path of foot falls and visual distractor stones in their proximity. Target and distractor stones differed in shape and colour so that each subset of stones could be easily grouped perceptually. In half of the trials, one target stone swapped shape and colour with a distractor in its close proximity. We show that in these ‘swapped’ conditions, participants chose the perceptually groupable, instead of the spatially regular, stepping location in over 40% of trials, even if the distance between perceptually groupable steps was substantially larger than normal step width/length. This reveals that the existence of a pathway that could be traversed without spatial disruption to periodic stepping is not sufficient to guarantee participants will select it and suggests competition between different types of visual input when choosing foot placement. We propose that a bias in foot placement choice in favour of visual grouping exists as, in nature, sudden changes in visual characteristics of the ground increase the uncertainty for stability.
1. Introduction
Locomotion control results from large scale fusion of proprioceptive and other sensory information. In particular, vision plays a crucial role in locomotion, and how it does so has been extensively studied. Research on the visual impact on foot placement started with an influential paper by Lee et al. [1] on visual control of hitting the take-off board in long jump. Since then, the use of visual information to control step length has been investigated for running and walking alike [2–4]. Moreover, much is known about how vision is used to step over obstacles (for recent reviews see [5,6]), avoid collisions (e.g. [7]), step off kerbs (e.g. [8]) and walk on complex terrain [9–12]. Recently, it has been suggested that during visually guided walking, humans make active use of the mechanisms underlying bipedal gait by selecting the most energetically efficient footholds available, based on visual information two steps ahead [13]. Here, we were interested in whether a largely automated sensory process in vision, perceptual grouping, would affect foot placement. Would perceptual grouping be able to bias foot placement choice away from the spatially regular placement locations expected for people walking on hard flat-level ground?
Perceptual grouping is a term used to cover a number of factors that produce well-known effects on visual perception (for a review see e.g. [14]): our visual environment is perceived as consisting of organized wholes or patterns rather than individual items. Patterns are grouped on the basis of similar visual properties, such as shape or colour; proximity; continuity and symmetry (see figure 1 for an illustration). Originally investigated by the Gestalt psychologists, perceptual grouping phenomena are nowadays thought to be based on largely automatic mid-level core sensory processes involved in figure-ground segmentation and object recognition outside conscious awareness (but see [15] for the possibility of incremental grouping processes in vision). Therefore, for the purposes of the present experiment, we define locations that conform to some form of perceptually groupable properties or features as ‘visually congruent’ and distinguish them from stepping locations that are spatially regular. The latter, henceforth referred to as ‘spatially regular’, should support a stereotyped walking action at a comfortable pace for the individual, as kinematics of locomotion on a continuous hard flat-level ground are known to show low inter-stride variance [16,17]. We reasoned that spatially regular stepping locations should be sufficient to explain participants' foot placement choices. Note that the terms ‘spatially regular’ and ‘visually congruent’ refer to different types of visual information that participants can use to choose where to place their feet.
In this experiment, we asked participants to follow a ‘stepping stone’ pathway that was projected onto the laboratory floor. Each projected pathway consisted of targets and distractors. Targets were considered to be spatially regular stepping locations. Distractors provided visual clutter, or noise, on the surrounding floor and were projected in a different shape and colour, in locations inconsistent with spatially regular stepping stones; distractors might be considered spatially irregular stepping locations (see figure 2 for an example). When a visual target is displayed with the same visual characteristics as the other, perceptually grouped targets, we consider the corresponding step to be visually congruent or unswapped. However, when a target is displayed with the visual characteristics of a distractor, and a distractor is displayed in the shape and colour of an otherwise visually grouped target, we consider the corresponding step to be visually incongruent or swapped. Accordingly, note that for a visually incongruent trial, a distractor could also be described as visually similar to the remaining stepping stones of interest while the target would be visually dissimilar. The distance between targets was calculated using the formula step distance=0.7×leg length in line with earlier studies [10,13] in order to provide a comfortable stepping distance for self-paced walking. Participants were asked to walk at a normal walking speed along the projected pathway, taking the most direct and most comfortable route to the other end of the laboratory.
There were two experimental conditions. In one condition, the visually congruent or unswapped condition according to the definition above, all target locations were shown as the same type of visual element (e.g. a light blue/cyan circle); a second type of visual element (e.g. a pink/magenta rhombus) was consistently placed in distractor locations. In the second condition, the visually incongruent or swapped condition, all target locations were shown as the same type of visual element except for one target location, which was presented with the visual features otherwise assigned to distractors, while the distractor location in its proximity exhibited the visual features of the normal target location. In other words, this exception step created a visual conflict between a perceptually groupable but spatially irregular foot placement choice (from now on referred to as incongruent visually similar or swapped visual) and a visually non-groupable but spatially regular foot placement choice (from now on referred to as incongruent visually dissimilar or swapped spatially regular). The distance between target and distractor was parametrically varied. The question we posed was to what extent, if any, participants would follow the visual information they had been primed with by the visual grouping in the remainder of the path. In other words, would participants deviate from the spatially regular stepping locations towards the perceptually groupable stepping locations? Thus, would automatic perceptual grouping affect foot placement during walking?
Based on a substantial body of research on how vision affects locomotion during walking (for reviews see [5,9,18,19]), we predicted that the parametric variation in distance between target and distractor in the visually incongruent condition should bias participants towards preferring visually similar stepping locations (that is, distractors projected with the visual features otherwise used for targets within the path) for smaller distances, but towards spatially regular targets for larger distances. Furthermore, if participants stepped on a perceptually groupable distractor instead of a non-groupable but spatially regular target (a swapped step), in particular for larger distances between target and distractor, we expected to find a larger error in stepping accuracy (interpretable as a cost related to the irregular stepping action required for stepping on a perceptually groupable distractor), both for the step of interest as well as for the following step, analogous to increased errors observed for eye movement landing positions in the decision making literature (e.g. [20–23]).
2. Material and method
2.1 Participants
Forty-seven participants took part in the experiment (male=9). The age of the participants ranged from 18 to 66 years (M=31, s.d.=15.28). All the participants reported no neurological conditions that could affect mobility and walking, and all had normal or corrected-to-normal vision. None wore multi- or vari-focal glasses.
2.2 Materials
The experiment took place in the Bristol Vision Institute (BVI) Movement Laboratory at the University of Bristol. The laboratory is equipped with multiple projectors (Optoma EW536, resolution 1280×800, frequency 60 Hz) and a state-of-the-art motion capture system (Qualisys Motion Capture Systems, operating at 128 Hz, with a spatial resolution of approx. 1 mm). The projectors used to display the stimuli and the cameras of the motion capture system were mounted on metal racks surrounding the laboratory to ensure comprehensive coverage. The floor area covered by the projectors was 2 m wide×12 m long; the motion capture system covered a volume 2 m wide×10 m long×2 m high, spanning from 1 m to 11 m along the projected path. Side walls were covered with black curtain material. The mean luminance of uniform white projected onto this floor area was 4.74 cd m−2, and participants were dark adapted.
Each experimental trial consisted of a projected pathway comprising 16 target stepping stones together with 20 distractors. The stimuli representing the target stepping stones and distractors were circles or diamonds, coloured either light blue (cyan) (CIE xy:0.258/0.331; 1.96 cd m−2) or pink (magenta) (CIE xy:0.297/0.178; 0.488 cd m−2). Colour and shape combinations were counterbalanced across pathways, with each participant undertaking 40 trials, one trial per condition. Distractors were placed at random, at a distance of at least eight times the radius of a target from a target, in order to minimize visual interference. However, for one of the stepping stone locations (the location of interest), the distractor was displayed (pseudo-)randomly either to the north, south, east or west of the target at a predefined distance. The predefined distance was either 25%, 30%, 35%, 40% or 45% of the participant's step length and was pseudo-randomly varied between trials. The target location of interest was placed randomly between the 9th and 13th step. A photograph from an experimental trial is shown in figure 2.
To produce walking pathways that were visually comparable between participants and in which stepping stone locations corresponded as closely as possible to estimated footfall locations for walking at a self-selected comfortable speed, we calculated the distance between successive stepping stones for each participant as 70% of their leg length [10,13] (measured from the greater trochanter of the hip to the floor; mean leg length±s.d. 88.98±4.87 cm). The lateral distance between the centres of successive stimuli was fixed at 25% of leg length. Note that while this distance is wider than the average step width of 13% of leg length found by Donelan et al. [24], it means that, owing to the radius of the stimulus, the distance between the closest lateral points of successive stimuli was 15% of leg length. As a consequence, the landing area for each step has a diameter of 10% of leg length, between 15 and 25% of leg length from the nearest point of a preceding or successive stimulus. The radius of individual stimuli (target and distractor) was scaled for each participant to 5% of their leg length (in the case of diamond-shaped images, the radius was averaged).
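The scaling rules described above can be sketched in a few lines of Python. This is only an illustration of the stated proportions (70%, 25% and 5% of leg length); the names `StoneGeometry` and `stone_geometry` are hypothetical and are not taken from the software used in the experiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoneGeometry:
    """Pathway dimensions for one participant, all in centimetres."""
    step_distance_cm: float   # distance between successive stepping stones
    lateral_offset_cm: float  # lateral distance between stone centres
    stone_radius_cm: float    # radius of each projected stone
    lateral_gap_cm: float     # gap between the closest lateral edges


def stone_geometry(leg_length_cm: float) -> StoneGeometry:
    """Scale the pathway to a participant's leg length (hypothetical helper)."""
    if leg_length_cm <= 0:
        raise ValueError("leg length must be positive")
    step_distance = 0.70 * leg_length_cm
    lateral_offset = 0.25 * leg_length_cm
    radius = 0.05 * leg_length_cm
    # Centre-to-centre offset minus one radius on each side: 25% - 2 x 5% = 15%.
    lateral_gap = lateral_offset - 2 * radius
    return StoneGeometry(step_distance, lateral_offset, radius, lateral_gap)


# Example using the mean leg length reported in the text.
geometry = stone_geometry(88.98)
print(geometry)
```

For the mean leg length, the lateral gap works out to 15% of leg length, in agreement with the value stated in the text.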
Three 20 mm diameter spherical infrared reflectors were used per foot to enable the motion capture cameras to record foot trajectory and foot landing point. The reflectors were attached by double-sided tape to the talus, and the heads of the 1st and the 5th metatarsal. The spatial location (landing point) of the foot was measured relative to the centre of the target stepping location. The experiment used a repeated measures design with independent variables step type (congruent; incongruent), distractor location (north, south, east and west of the target location) and distance (25%, 30%, 35%, 40%, and 45% step length) as described above. Note that in subsequent analyses, we split trials for the incongruent step type into two groups, depending on whether participants stepped onto the non-groupable but spatially regular target (incongruent dissimilar/swapped spatially regular) or on the groupable distractor (incongruent similar/swapped visual).
2.3 Procedure
Before arrival of the participants, both visual projection and three-dimensional motion capture system were calibrated and aligned with each other. On arrival, participants provided written consent, and their leg length was measured and entered into the computer controlling the construction of the trial pathways. Then, the infrared markers were attached to participants' feet.
At the beginning of the experimental session, participants were asked to stand on a fixed starting point, and the following instructions about the experimental procedure were read out by the experimenter.
‘For this task, I would like you to walk normally across the room, using the stepping stones projected onto the floor, and get to the end of the path as directly as you can—which stepping stones you use, that is, how they look, doesn't matter.’
Note that we explicitly included the comment that the visual characteristics of the stepping stones did not matter to try to address issues of demand characteristics [25]; in other words, we tried to avoid situations in which, if participants were good at picking up the experimenter's intentions and conforming to them, they simply stuck to the perceptually grouped path because they thought that that was the main task demand.
Two practice trials and 40 experimental trials (2 [swapped/unswapped]×5 [distractor distances]×4 [distractor locations]) took place consecutively, with the participant immediately returning to the fixed starting point for the next trial unless a refreshment break was requested. On completion of the experiment, participants sat down for the removal of the infrared reflectors and were asked to explain what they thought the experiment had been about. After this, they were given a debriefing sheet and the opportunity to ask any questions. Participants were then thanked and escorted from the laboratory.
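As an illustration of the 2×5×4 design, the following Python sketch enumerates the 40 experimental conditions. The shuffling of trial order, the inclusive reading of ‘between the 9th and 13th step’ and all names (`build_trials`, the dictionary keys) are assumptions made for this example, not details of the original trial-generation software.

```python
import itertools
import random

STEP_TYPES = ("unswapped", "swapped")
DISTANCE_FRACTIONS = (0.25, 0.30, 0.35, 0.40, 0.45)  # fraction of step length
DIRECTIONS = ("north", "south", "east", "west")


def build_trials(seed):
    """Return one trial per condition in pseudo-random order (assumed scheme)."""
    rng = random.Random(seed)
    trials = [
        {
            "step_type": step_type,
            "distance_fraction": fraction,
            "direction": direction,
            # Assumption: the step of interest is drawn uniformly from 9..13.
            "step_of_interest": rng.randint(9, 13),
        }
        for step_type, fraction, direction in itertools.product(
            STEP_TYPES, DISTANCE_FRACTIONS, DIRECTIONS
        )
    ]
    rng.shuffle(trials)
    return trials


trials = build_trials(seed=1)
print(len(trials))
```

Seeding the generator makes a participant's trial order reproducible, which may be convenient when motion capture recordings are later matched back to conditions.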
2.3.1 Preparation of motion capture data
Data from the motion capture system were provided as three-dimensional coordinates (x=transverse, y=direction of travel and z=vertical) sampled every 1/128th of a second for every trial. These data had to be processed in order to identify the actual landing points and to calculate the angle and distance of the landing points from the target points. Landing points were identified as the moments at which the forward movement of the foot was zero (or, in practice, less than a threshold value; in the present case 1 mm). To this end, the frame-to-frame differences were smoothed using a simple moving average, and the landing point was taken as the point of least change at which the mean was below the threshold.
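One possible reading of this landing-point procedure is sketched below in Python with NumPy. The window length, the interpretation of the 1 mm threshold as a per-frame displacement and the function name `find_landing_frame` are assumptions for illustration; the sketch handles a single foot contact and is not the original analysis code.

```python
import numpy as np


def find_landing_frame(y_mm, window=5, threshold_mm=1.0):
    """Return the frame index at which forward foot motion is smallest.

    y_mm: forward (direction-of-travel) marker positions, one per frame.
    Returns None when the smoothed motion never drops below the threshold.
    """
    y = np.asarray(y_mm, dtype=float)
    if y.size < window + 1:
        raise ValueError("trace is shorter than the smoothing window")
    # Absolute frame-to-frame displacement along the direction of travel.
    diffs = np.abs(np.diff(y))
    # Simple moving average of the differences.
    smoothed = np.convolve(diffs, np.ones(window) / window, mode="valid")
    below = np.flatnonzero(smoothed < threshold_mm)
    if below.size == 0:
        return None
    # Point of least change among the candidates below the threshold.
    best = below[np.argmin(smoothed[below])]
    # Shift to the approximate centre of the averaging window.
    return int(best + window // 2)


# Synthetic trace: foot moves 10 mm per frame, rests for 20 frames, moves on.
trace = ([10.0 * i for i in range(20)]
         + [190.0] * 20
         + [190.0 + 10.0 * i for i in range(1, 21)])
landing = find_landing_frame(trace)
print(landing)
```

In a full trial the trace would first need to be split into individual steps, for example by searching each swing-stance cycle separately.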
3. Results
Following data preparation as described above, and accounting for practice steps and data not being correctly recorded, the maximum number of location-of-interest steps (1974=47 participants×42 trials, i.e. 40 experimental and 2 practice trials) was reduced to 1740 (872 congruent/unswapped and 868 incongruent/swapped), corresponding to a loss of 11.8% of data for further analysis. The distance of the landing points from the centre of the chosen stepping stone was calculated for: (i) the location of interest, (ii) some control steps (steps 1–3 of the recorded path), (iii) one step before the target step (before step), and (iv) the step immediately following the location of interest (after step). A representation of the landing points (foot placements) for the target conditions is shown in the polar plot, figure 3.
From visual inspection of the top polar plot in figure 3, it is clear that a substantial proportion of footfalls to target steps for incongruent/swapped conditions were made to the visually similar distractor position and not to the spatially regular target position, in line with our hypothesis that low-level visual factors such as perceptual grouping might be able to bias foot placement decisions. However, taken together with the multiple distances between location-of-interest targets, the distribution of steps meant that we needed an objective way to quantify which steps were to targets (incongruent dissimilar/swapped spatially regular) and which were to distractors (incongruent similar/swapped visual). While we could potentially use a very simple algorithm to decide where people step, namely estimating which stepping stone, the spatially regular target or the visual distractor, was closest to the foot placement, we reasoned that such a measure might misclassify steps because of a step-to-step variability in length, width and timing of human gait that occurs even for normal obstacle-free walking on even ground [17,26]. Also, we expected inter-individual differences in foot placement on the stones that were unrelated to our question. Therefore, we quantified foot locations instead by fitting a two-dimensional Gaussian mixture model (GMM) to our data.
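For comparison, the simple nearest-stone rule that was considered and set aside can be sketched as follows. Coordinates are expressed relative to the centre of the target, as in the landing-point measurements; mapping north to the direction of travel (+y) and east to +x is an assumption of this example, as are the function names.

```python
import math

# Assumed mapping of compass directions onto (x=transverse, y=travel) axes.
OFFSETS = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}


def distractor_position(direction, distance_fraction, step_length_cm):
    """Distractor centre relative to the target centre, in centimetres."""
    dx, dy = OFFSETS[direction]
    distance = distance_fraction * step_length_cm
    return (dx * distance, dy * distance)


def nearest_stone(landing_xy, distractor_xy):
    """Label a landing point by whichever stone centre is closer."""
    to_target = math.hypot(landing_xy[0], landing_xy[1])
    to_distractor = math.hypot(landing_xy[0] - distractor_xy[0],
                               landing_xy[1] - distractor_xy[1])
    return "target" if to_target <= to_distractor else "distractor"


distractor = distractor_position("east", 0.25, 60.0)
print(nearest_stone((2.0, 1.0), distractor))   # landing close to the target
print(nearest_stone((12.0, 1.0), distractor))  # landing close to the distractor
```

A hard boundary of this kind assigns a label even to landings that fall roughly midway between the two stones, which is where step-to-step variability would make misclassification most likely.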
A GMM is a probabilistic model which assumes that data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters. Fitting the best mixture of Gaussians for a given dataset (as measured by the log likelihood) results in a probability distribution of classes that can be used to predict the (posterior) probability of new data points belonging to those classes. Fitting GMMs is an example of an unsupervised learning method; while the fitting procedure does not guarantee the globally optimal solution, models do converge quickly to a ‘local’ optimum. To improve the quality of the model, it is common practice to fit many of these models and then choose the one that best fits the data, often on the basis of log likelihood or a similar criterion. In the present case, using GMM functions provided by Matlab, a mixture of two Gaussians was fitted to the data for each stepping distance. The results of this process are shown in table 1 and figures 4 and 5.
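The general approach (expectation-maximization fitting of a two-component, two-dimensional GMM, repeated from several random starts and selected by log likelihood) can be illustrated with a minimal NumPy sketch. This is not the Matlab code used for the analysis: the function names, the number of restarts, the small regularization term and the synthetic landing points are all assumptions made for the example.

```python
import numpy as np


def component_densities(x, weights, means, covs):
    """Weighted Gaussian density of every point under every component."""
    n, dim = x.shape
    dens = np.empty((n, len(weights)))
    for k, (w, m, c) in enumerate(zip(weights, means, covs)):
        d = x - m
        maha = np.einsum("ni,ij,nj->n", d, np.linalg.inv(c), d)
        norm = (2 * np.pi) ** (-dim / 2) / np.sqrt(np.linalg.det(c))
        dens[:, k] = w * norm * np.exp(-0.5 * maha)
    return dens


def posteriors(x, model):
    """Posterior probability of each component for each point."""
    dens = component_densities(x, *model[:3])
    return dens / dens.sum(axis=1, keepdims=True)


def fit_gmm_once(x, rng, n_components=2, n_iter=200, tol=1e-6):
    """One EM run from a random start; converges to a local optimum."""
    n, dim = x.shape
    weights = np.full(n_components, 1.0 / n_components)
    means = x[rng.choice(n, size=n_components, replace=False)]
    covs = np.array([np.cov(x.T) + 1e-6 * np.eye(dim)
                     for _ in range(n_components)])
    prev = -np.inf
    for _ in range(n_iter):
        dens = component_densities(x, weights, means, covs)
        total = dens.sum(axis=1, keepdims=True)
        loglik = float(np.log(total).sum())
        if loglik - prev < tol:  # stop once the likelihood no longer improves
            break
        prev = loglik
        resp = dens / total                      # E-step: responsibilities
        nk = resp.sum(axis=0)                    # M-step: update parameters
        weights = nk / n
        means = resp.T @ x / nk[:, None]
        for k in range(n_components):
            d = x - means[k]
            covs[k] = (resp[:, k, None] * d).T @ d / nk[k] + 1e-6 * np.eye(dim)
    return weights, means, covs, loglik


def fit_gmm(x, n_restarts=10, seed=0):
    """Fit several models and keep the one with the highest log likelihood."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    runs = [fit_gmm_once(x, rng) for _ in range(n_restarts)]
    return max((r for r in runs if np.isfinite(r[3])), key=lambda r: r[3])


# Synthetic landing points (cm): one cluster on the target, one on a distractor.
data_rng = np.random.default_rng(1)
landings = np.vstack([data_rng.normal([0.0, 0.0], 2.0, size=(100, 2)),
                      data_rng.normal([20.0, 0.0], 2.0, size=(100, 2))])
model = fit_gmm(landings)
post = posteriors(np.array([[0.0, 0.0]]), model)
print(model[1])  # fitted component means
print(post)      # posterior for a landing on the target centre
```

Unlike a nearest-stone rule, the fitted model returns a posterior probability for each landing point, so steps that fall between the two stones can be identified as ambiguous rather than forced into one class.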