S-STRUCTURE



Introduction to S-Structure and Its Definition

S-Structure is the standard abbreviation for Surface Structure in generative linguistics. It is one of the two primary levels of syntactic representation posited in the theories developed by Noam Chomsky, particularly the Standard Theory (ST) and the Extended Standard Theory (EST). The S-Structure is the syntactic representation that most directly underlies the observable sentence, reflecting the order in which elements are spoken or written. It is the output of the transformational component of the grammar, which takes the more abstract Deep Structure (D-Structure) as its input and applies various movement and rearrangement rules. The S-Structure is thus the level that interfaces with the phonetic component of the language system, determining how the sentence will be pronounced and perceived acoustically.

While the D-Structure captures the core semantic and grammatical relations, often termed the thematic structure, the S-Structure captures the effects of operations necessary for grammatical realization in natural language. For instance, in many languages, certain elements must move from their underlying thematic positions to satisfy syntactic requirements, such as subject agreement or focusing. It is at the S-Structure level that these movements—including aspects like passive constructions, auxiliary inversion, and wh-movement—are fully realized and visible. Therefore, the S-Structure serves as a bridge, ensuring that the abstract meaning derived from the D-Structure is converted into a form that adheres to the specific constraints and linear ordering principles of a given language, making it pronounceable and interpretable by the listener.

Understanding the properties of S-Structure is crucial because it is the primary level at which various grammatical filters and conditions apply. For example, specific constraints governing the relationships between moved elements and their original positions (traces) are often evaluated at this level. The final structure must be well-formed according to the universal principles and parameters set by the grammar. If a sentence violates these conditions at the level of S-Structure, it is predicted to be ungrammatical, even if its underlying meaning (D-Structure) is coherent. This dual function—providing the interface to pronunciation and serving as a critical checkpoint for grammatical well-formedness—highlights the central role of S-Structure in the architecture of the human language faculty.

Historical Context: S-Structure in Transformational Grammar

The concept of S-Structure has its roots in the foundational work of Transformational Generative Grammar (TGG). Chomsky’s Syntactic Structures (1957) introduced transformations, and Aspects of the Theory of Syntax (1965) formalized the distinction between deep and surface structure. In the earliest models, the grammar was conceived as a system that generated an infinite number of sentences from a finite set of rules. Crucially, it became evident that sentences which were superficially very different often shared a fundamental logical or semantic organization, while sentences that looked similar superficially might have vastly different meanings. This realization necessitated the distinction between two levels of representation.

Initially, the focus was heavily placed on the D-Structure, which was generated directly by Phrase Structure Rules and where lexical items were inserted. The D-Structure was considered the input to the semantic component. The Surface Structure (later S-Structure) was derived from the D-Structure solely through obligatory and optional transformations. In the Standard Theory, the S-Structure was primarily responsible for providing the input to the Phonological Component. However, as the theory evolved into the Extended Standard Theory (EST) and subsequently Government and Binding Theory (GB), the role of S-Structure became far more complex and central to the overall framework.

A significant shift occurred when linguists recognized that the meaning of a sentence was not solely determined by the D-Structure. Phenomena such as scope ambiguities (e.g., “Everyone didn’t leave”) required that semantic interpretation be sensitive to features of the derived structure. Consequently, the S-Structure gained interpretive weight. By the time of the EST, the S-Structure was defined as the level where certain interpretative rules, particularly those related to focus, scope, and binding relationships, began to apply, thus sharing the interpretive burden with the D-Structure. This evolution demonstrated the growing recognition that the overt arrangement of elements is not merely a cosmetic consequence of grammar but a crucial determinant of the final semantic output.

The Distinction: S-Structure versus D-Structure

The fundamental contrast in TGG lies between S-Structure and D-Structure. The D-Structure, or Deep Structure, is the representation where all lexical items are base-generated in their thematic positions—that is, the positions where they receive their Theta Roles (e.g., agent, theme, patient). The D-Structure is highly abstract and is designed to capture the core predicate-argument relationships of the sentence; it is often described as the input to the semantic component that determines meaning based on thematic relations. For example, in the sentence “The cake was eaten by the child,” the D-Structure would place ‘the child’ in the subject position of ‘eat’ (the agent position), reflecting the underlying action.

The S-Structure, conversely, represents the sentence after the transformations of the overt syntax have applied. It is the structure that most closely corresponds to what is seen or heard. Continuing the example, to generate the passive “The cake was eaten by the child,” the object ‘the cake’ moves from its initial position in the D-Structure to the subject position in the S-Structure. The vacated position is marked by a silent element called a ‘trace’ (t). The crucial difference is that D-Structure directly encodes thematic relations, while S-Structure reflects both those relations (through traces) and the requirements of the grammatical system (e.g., Case assignment, the requirement for a surface subject).

This bifurcation is essential for explaining ambiguity and relatedness between sentences. Consider the sentences “John is easy to please” and “John is eager to please.” Both have the same superficial form (Subject-Copula-Adjective-Infinitive phrase). However, their D-Structures are radically different: in the first, John is the object of ‘please’ (roughly, [it is easy [to please John]]), while in the second, John is the subject of ‘eager’ and the subject of ‘please’ (John is eager [John to please]). The generative framework uses the distinction between the underlying D-Structure and the derived S-Structure to account for the native speaker’s intuition that these superficially similar sentences have fundamentally different grammatical properties and interpretations, thereby demonstrating the explanatory power of the two-level model.

The Role of Transformations

Transformations are the formal operations that mediate the transition from D-Structure to S-Structure. In Government and Binding Theory, the many construction-specific transformations of the Standard Theory are reduced to a single general operation, Move-Alpha, which states that any constituent (Alpha) can move anywhere, subject to overarching constraints. These movement operations are not random; they are typically triggered by specific morphological or syntactic requirements, such as the need for a noun phrase to move into a position where it can receive structural Case (Nominative or Accusative). The application of these transformations is what distinguishes the abstract structure from the visible one.

A classic example of a transformation that maps a D-Structure onto a distinct S-Structure is Wh-movement. In the D-Structure for the question “What did Mary buy?”, the wh-phrase ‘what’ originates in the object position of the verb ‘buy.’ However, English requires wh-phrases to move to the sentence-initial position (the Specifier of the Complementizer Phrase, or Spec-CP) to form a question. The movement process generates the observable S-Structure, leaving behind a trace (t) in the original position. The resulting S-Structure is crucial because it is the level at which the moved element and its trace must satisfy structural conditions, such as the Empty Category Principle (ECP), which governs the proper licensing of traces.
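The wh-movement step described above can be sketched in Python. This is only an illustrative toy, not a formalization from the literature: the `Node` class, the `bracket` and `move_to_spec` helpers, and the simplified tree (with ‘did’ placed directly in C) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str                                  # category label, e.g. "CP", "NP"
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None                  # terminal string; None for phrases
    index: Optional[int] = None                 # links a moved phrase to its trace

def bracket(node: Node) -> str:
    """Render a tree as a labelled bracketing."""
    idx = f"_{node.index}" if node.index is not None else ""
    if node.word is not None:
        return f"[{node.label}{idx} {node.word}]"
    inner = " ".join(bracket(child) for child in node.children)
    return f"[{node.label}{idx} {inner}]"

def move_to_spec(root: Node, parent: Node, position: int, index: int = 1) -> Node:
    """Move parent.children[position] to the left edge of root, leaving a coindexed trace."""
    moved = parent.children[position]
    moved.index = index
    parent.children[position] = Node(moved.label, word="t", index=index)
    root.children.insert(0, moved)
    return root

# Simplified D-Structure: 'what' sits in the object position of 'buy'.
vp = Node("VP", [Node("V", word="buy"), Node("NP", word="what")])
ip = Node("IP", [Node("NP", word="Mary"), vp])
cp = Node("CP", [Node("C", word="did"), ip])

d_structure = bracket(cp)
# [CP [C did] [IP [NP Mary] [VP [V buy] [NP what]]]]

move_to_spec(cp, vp, position=1)
s_structure = bracket(cp)
# [CP [NP_1 what] [C did] [IP [NP Mary] [VP [V buy] [NP_1 t]]]]
```

The shared index on the fronted phrase and the trace stands in for the chain that later conditions would inspect.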

The transformational component can be conceptualized as a computational engine that takes a well-formed D-Structure tree and applies a series of operations to achieve the necessary surface configuration. These operations include not only movement (displacement) but also operations like deletion and insertion in earlier models, though modern generative frameworks tend to minimize non-movement operations. The sequence of derivation is critical, as the application of one transformation can create the necessary structural environment for a subsequent rule to apply. The final product, the S-Structure, must structurally encode all the effects of these movements, including the chain relationship between the moved phrase and its trace, which is vital for later interpretation at the Logical Form (LF) component.

S-Structure and Phonetic Form (PF)

Within the Y-model of grammar, which characterizes frameworks like Government and Binding (GB), the S-Structure occupies a pivotal position at the point of bifurcation. It is the structure that feeds two major interpretive components: the Phonetic Form (PF) and the Logical Form (LF). While LF handles the subtle semantic interpretations (like scope and quantifier binding), PF is concerned exclusively with the physical realization of the sentence. The S-Structure acts as the primary input to the Phonological Component, which translates the linear, bracketed, and terminal string of the S-Structure into a set of instructions for articulation.

The relationship between S-Structure and PF is relatively direct. The S-Structure provides the exact sequence of morphemes and words, along with their grouping into phrases. The PF component then applies phonological rules, including stress assignment, intonation contours, and morphological adjustments (e.g., pronunciation of plural endings or past tense markers). Crucially, movements that occur between D-Structure and S-Structure are visible to PF; the words are physically located where the S-Structure dictates. However, movements that occur after the S-Structure (covert movement, often feeding LF) are invisible to PF, meaning they do not affect pronunciation.

The effectiveness of the S-Structure as the input to PF lies in its linear organization. Since language is spoken sequentially, the S-Structure must accurately reflect the temporal ordering of the elements. It must also contain the information necessary for prosodic phrasing. The boundaries established by the syntactic phrases in the S-Structure often correspond directly to the boundaries used by the phonological system to group words for intonation and pause structure. Thus, the S-Structure is the final syntactic blueprint before the sentence is converted into an audible acoustic signal, ensuring that the syntactic requirements are met before physical articulation takes place.

Constraints and Well-Formedness

A core function of the S-Structure level is to serve as a checkpoint where specific constraints on movement and structure are enforced. These constraints, often developed within the GB framework, filter out derivations that, while semantically plausible, are syntactically ill-formed in a given language. If a transformation violates one of these principles, the resulting S-Structure is marked as ungrammatical, and the sentence fails to be generated. This mechanism is central to explaining why certain movements or structural arrangements are universally disallowed or restricted by parameters.

Key constraints that apply specifically at the level of S-Structure include elements of Binding Theory, which governs the distribution of referential expressions, pronouns, and anaphors. For example, Principle A of Binding Theory requires that an anaphor (like ‘himself’) be bound within its local domain, and this binding relationship is structurally evaluated at the S-Structure. Furthermore, constraints on movement, such as the Subjacency Condition of earlier models, require that a single movement step not cross more than one bounding node, which rules out extraction from syntactic islands such as complex noun phrases.
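A simplified check for Principle A can be sketched as follows. Everything here is an assumption made for illustration: noun phrases are recorded by their path from the root, every node is taken to branch, and the local domain is approximated as the smallest clause (IP) containing the anaphor, which is coarser than the definitions used in GB proper.

```python
from typing import List, NamedTuple, Tuple

class NPInfo(NamedTuple):
    form: str
    path: Tuple[int, ...]     # child positions from the root down to this NP
    index: int                # referential index
    clause: Tuple[int, ...]   # path of the smallest IP dominating this NP

def is_prefix(a: Tuple[int, ...], b: Tuple[int, ...]) -> bool:
    return len(a) <= len(b) and b[:len(a)] == a

def c_commands(a: NPInfo, b: NPInfo) -> bool:
    # With all nodes branching, a's parent is the first branching node above a.
    return is_prefix(a.path[:-1], b.path) and not is_prefix(a.path, b.path)

def satisfies_principle_a(anaphor: NPInfo, nps: List[NPInfo]) -> bool:
    """True if some coindexed NP c-commands the anaphor from inside its clause."""
    return any(
        other.index == anaphor.index
        and c_commands(other, anaphor)
        and is_prefix(anaphor.clause, other.path)
        for other in nps
    )

# Tree shape: [IP John [VP said [IP Mary [VP likes ANAPHOR]]]]
john = NPInfo("John", (0,), 1, ())
mary = NPInfo("Mary", (1, 1, 0), 2, (1, 1))
himself = NPInfo("himself", (1, 1, 1, 1), 1, (1, 1))   # coindexed with John
herself = NPInfo("herself", (1, 1, 1, 1), 2, (1, 1))   # coindexed with Mary

bad = satisfies_principle_a(himself, [john, mary, himself])    # False
good = satisfies_principle_a(herself, [john, mary, herself])   # True
```

In the first case the only coindexed antecedent, ‘John’, lies outside the embedded clause, so the anaphor is not locally bound; in the second, ‘Mary’ binds the anaphor within that clause.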

The filtering capacity of the S-Structure is vital for maintaining the strictness of the grammatical model. The grammar generates a vast number of potential structures, but only those that satisfy all constraints at the S-Structure level (and subsequently the LF level) are accepted as valid sentences of the language. These constraints ensure that the resulting linear arrangement is not only pronounceable but also respects the underlying hierarchical structure derived from the D-Structure. Key constraints evaluated at this level include:

  • Binding Principles: Governing anaphors, pronouns, and referential expressions.
  • Case Filter: Ensuring all overt noun phrases have received appropriate structural Case.
  • Theta Criterion: While largely satisfied at D-Structure, the effects of movement must not violate thematic requirements.
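The Case Filter in the list above lends itself to a small sketch. The `NP` record and the `case_filter_violations` function are hypothetical names, and Case values are simply stipulated here rather than computed from the structure.

```python
from typing import List, NamedTuple, Optional

class NP(NamedTuple):
    form: str
    overt: bool              # traces and other empty categories are not overt
    case: Optional[str]      # structural Case received at S-Structure, if any

def case_filter_violations(nps: List[NP]) -> List[str]:
    """Return the overt noun phrases that have received no Case."""
    return [phrase.form for phrase in nps if phrase.overt and phrase.case is None]

# "*It seems John to be happy": 'John' is the subject of an infinitive,
# a position in which no Case is assigned.
unraised = [NP("it", True, "nominative"), NP("John", True, None)]

# "John seems t to be happy": 'John' has raised to the matrix subject
# position; the trace it leaves behind is not overt.
raised = [NP("John", True, "nominative"), NP("t", False, None)]

unraised_result = case_filter_violations(unraised)   # ["John"]
raised_result = case_filter_violations(raised)       # []
```

The contrast shows how a filter applied to the S-Structure can rule out one derivation while admitting another built from the same thematic relations.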

S-Structure in the Minimalist Program (MP)

The Minimalist Program (MP), initiated in the early 1990s, aimed to radically simplify the architecture of generative grammar, reducing complexity and eliminating theoretically redundant levels. While the MP does not use the term S-Structure in the same formal sense as GB theory, the function previously attributed to S-Structure remains crucial and is maintained through the operation called Spell-Out. Spell-Out is the point in the derivation where the structure is handed off to the sensorimotor interface (the Phonetic Form, PF).

Under the MP, the grammar is viewed as a purely derivational system, where structures are built up phase by phase. The structure immediately prior to the application of Spell-Out serves the exact functional purpose of the traditional S-Structure: it is the point where movement ceases to be overt and where the structure is linearized for pronunciation. In this refined framework, the goal is to show that the structures generated are necessary solely to satisfy interface conditions (PF and LF). The MP argues that if movement occurs before Spell-Out, it is overt movement (visible at the surface, like S-Structure); if it occurs after Spell-Out, it is covert movement (only visible to LF).
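The overt/covert split relative to Spell-Out can be illustrated with a toy derivation. The list encoding, the "SPELL-OUT" marker, and the `classify_movements` function are assumptions made for this sketch, and the derivation shown is schematic rather than a worked analysis of any sentence.

```python
def classify_movements(derivation):
    """Label each movement step as overt or covert relative to Spell-Out."""
    labels = {}
    spelled_out = False
    for step in derivation:
        if step == "SPELL-OUT":
            spelled_out = True
        else:
            labels[step] = "covert" if spelled_out else "overt"
    return labels

# Schematic ordering: wh-movement applies before Spell-Out,
# quantifier raising applies after it.
derivation = ["wh-movement", "SPELL-OUT", "quantifier raising"]
labels = classify_movements(derivation)
# {"wh-movement": "overt", "quantifier raising": "covert"}
```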

This re-conceptualization integrates the function of S-Structure into the ongoing derivation rather than treating it as a static, distinct representational level. By defining the structure at the point of Spell-Out as the input to the PF component, the MP maintains the crucial divide between overt and covert syntax, which was a central insight of the earlier distinction between S-Structure and LF. The principles of economy and efficiency inherent in the MP dictate that movement only happens when necessary, so that the resulting surface structure (the structure handed to PF at Spell-Out) is the most economical way to satisfy all relevant features and constraints.

Psychological Reality and Processing

The theoretical distinction between D-Structure and S-Structure has profound implications for psycholinguistics, the study of how humans process language in real time. If the generative model accurately reflects the cognitive architecture of language, then language processing (parsing and production) should ideally involve stages that mirror the grammatical levels. Specifically, psycholinguists have investigated whether the human parser first accesses a deep, thematic representation (D-Structure) and subsequently processes the transformational movements to arrive at the surface form (S-Structure).

Early psycholinguistic models, particularly those influenced by TGG, sometimes proposed a Derivational Theory of Complexity (DTC), suggesting that sentences requiring more transformations to move from D-Structure to S-Structure would take longer for speakers to process. While the strong version of the DTC was largely refuted by experimental evidence, more subtle evidence supports the psychological reality of movement chains and abstract representations. For instance, studies on filler-gap processing show that when a wh-phrase moves to the front of a sentence, listeners actively anticipate the ‘gap’ (the trace position) where the phrase originated, suggesting that the parser reconstructs the underlying D-Structure or the movement chain linking the S-Structure position to the trace.

The S-Structure remains undeniably relevant to real-time comprehension because it is the structure that provides the immediate input for linear word-by-word processing. Features such as morphological agreement, prosody, and local dependencies are visible and immediately available at the S-Structure level. Therefore, while the abstract D-Structure may reflect the meaning component, the S-Structure provides the necessary framework for the highly demanding task of real-time speech perception and syntactic analysis. The dynamic interplay between these two levels in the theoretical model provides a powerful hypothesis about the cognitive mechanisms underlying language comprehension.