Relationships Implicitify Research Team

Millonian Matchmaking: Interpersonal Complementarity and the Dovetails Compatibility Engine

Compatibility is one of the oldest questions in psychology and one of the youngest as a measurement problem. The popular literature has spent decades oscillating between two folk theories (opposites attract, and birds of a feather flock together), rarely noticing that the empirical research on dyadic fit had quietly worked out, by the late 1970s, that both theories are partly right, and partly right in a specific and predictable way: each one applies to a different axis of interpersonal behavior. The interpersonal complementarity tradition, running from Timothy Leary's early circumplex work through Robert Carson, Donald Kiesler, Jerry Wiggins, and Aaron Pincus, gives us the cleanest available description of what "fit" between two people actually looks like at the level of interpersonal style. Theodore Millon's mature personality theory, in turn, gives that descriptive framework a richer set of clinical types to work with. The Dovetails compatibility engine on this site is, in effect, an attempt to operationalize the union of those two traditions from a single informant's report, in which the partner is read indirectly through the user's own reactions, and to return a defensible estimate of fit without overclaiming what one informant can know.

This article walks through the theoretical case for that approach, the evidence base in honest detail, and the actual mechanics of the engine.

Interpersonal complementarity: why opposites and similars both predict fit

The Interpersonal Circumplex (IPC) organizes interpersonal behavior on two orthogonal dimensions: an agentic axis (dominance versus submission, sometimes called control or status) and a communal axis (warmth versus coldness, sometimes called affiliation or love). Eight octants tile the circle — Assured-Dominant (PA), Arrogant-Calculating (BC), Cold-Quarrelsome (DE), Aloof-Introverted (FG), Unassured-Submissive (HI), Unassuming-Ingenuous (JK), Warm-Agreeable (LM), and Gregarious-Extraverted (NO) — and they have been replicated, with minor variations of label, across decades of factor-analytic work on interpersonal trait adjectives, peer ratings, and behavior in laboratory dyads.
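The octant layout can be made concrete with a small sketch. It assumes the conventional placement of the eight octants at 45-degree intervals, with warmth on the horizontal axis and dominance on the vertical; the function names are illustrative and are not taken from the Dovetails code.

```python
import math

# Octants in counterclockwise order, starting at the warm pole (LM = 0 degrees).
# Assumption: conventional 45-degree spacing, with PA (pure dominance) at 90 degrees.
OCTANTS = ["LM", "NO", "PA", "BC", "DE", "FG", "HI", "JK"]


def octant_to_coords(octant):
    """Return the (warmth, dominance) unit-circle coordinates of an octant midpoint."""
    theta = math.radians(45 * OCTANTS.index(octant))
    return (round(math.cos(theta), 6), round(math.sin(theta), 6))


def coords_to_octant(warmth, dominance):
    """Return the octant whose midpoint is nearest in angle to the given point."""
    angle = math.degrees(math.atan2(dominance, warmth)) % 360
    return OCTANTS[int((angle + 22.5) // 45) % 8]


print(octant_to_coords("PA"))        # high dominance, neutral warmth
print(coords_to_octant(-0.7, -0.7))  # low warmth and low dominance
```

Under these assumptions, any continuous (warmth, dominance) pair can be assigned to an octant, and any octant label can be turned back into a prototype point on the circle.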

The complementarity principle, formalized by Carson (1969) and developed in detail by Kiesler (1983, 1996), is a strong and counterintuitive claim about how two interpersonal styles fit together. It says that on the control axis, fit is reciprocal: dominance pulls for submission and submission pulls for dominance. On the affiliation axis, fit is correspondent: warmth pulls for warmth and hostility pulls for hostility. A confident, take-charge person and a person who naturally defers to leadership are complementary on agency; two warm, accepting people are complementary on communion. The same logic explains why two people who both want to lead the room often grate on each other (anti-complementary on agency), and why a warm person paired with a cold one is a chronically uncomfortable match (anti-complementary on communion).
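As a rough sketch, the rule can be written as a reflection across the warmth axis: keep the warmth coordinate, flip the dominance coordinate. The deviation index below is a hypothetical illustration of how distance from that target could be measured; it is not a published statistic and not necessarily what Dovetails computes.

```python
import math


def complement(warmth, dominance):
    """Complementary position: same warmth (correspondence), opposite dominance (reciprocity)."""
    return (warmth, -dominance)


def deviation_from_complementarity(a, b):
    """Euclidean distance between b and the complement of a; 0.0 means a perfect fit.

    Each argument is a (warmth, dominance) pair. Illustrative index only.
    """
    return math.dist(complement(*a), b)


leader, follower = (0.0, 1.0), (0.0, -1.0)
warm, cold = (1.0, 0.0), (-1.0, 0.0)

print(deviation_from_complementarity(leader, follower))  # complementary on agency
print(deviation_from_complementarity(leader, leader))    # two leaders: anti-complementary
print(deviation_from_complementarity(warm, cold))        # warm with cold: anti-complementary
```

On unit-circle prototypes the index runs from 0 (perfect complement) to 2 (the anti-complementary extreme), which matches the two failure cases described above.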

The empirical basis for this rule is unusually clean for an interpersonal proposition. Orford (1986) showed that observed behavior in dyads tends to follow the complementarity prediction in agentic interactions and especially clearly in affiliative ones. Sadler and Woody (2003) demonstrated, in carefully measured laboratory dyads, that an individual's behavior shifts in real time to complement the partner's, with the affiliation axis showing the strongest pull. Markey, Funder, and Ozer (2003) replicated the basic asymmetry in naturalistic settings. Wiggins and Pincus's later programmatic work tied the same complementarity principle to the broader literature on interpersonal problems and personality dysfunction. The pattern is not universal — and the limits matter, which we will return to — but it is one of the better-supported regularities in the social-personality literature.

What Millon adds on top of the circumplex

Millon's contribution is to take the IPC's eight regions, which are descriptive trait clusters, and overlay onto them a set of personality prototypes with a developmental and motivational logic. Millon's polarities — pleasure/pain, self/other, active/passive — generate a typology of personality styles whose extreme expressions correspond to the DSM-style personality disorders, but whose moderate expressions describe ordinary recognizable people. Crucially, Millon's prototypes inherit the geometry of the circumplex: the Narcissistic prototype lives at the high-agency, neutral-affiliation pole (PA), the Antisocial-ish "active-independent" style at high agency and low warmth (BC), the Schizoid pattern at low agency and low warmth (FG), the Dependent at low agency and moderate warmth (JK), the Histrionic at moderate-to-high agency and high warmth (LM/NO), and so on around the circle.

Two things follow from layering Millon onto Wiggins-Kiesler in this way. First, the IPC stops being a purely descriptive trait map and becomes something a clinician can actually formulate from. A person at FG is not just "low on warmth and low on dominance"; in Millon's reading, they are organized around solitude and withdrawal in a way that has a recognizable inner architecture, and that architecture predicts what they will need from a partner and what will reliably overwhelm them. Second, complementarity acquires content. To say that someone at PA is complementary to someone at HI on the circumplex is geometrically true; to say that the assured-dominant Narcissistic-style person is reliably soothed by an unassured-submissive Avoidant-style partner — and that the Avoidant-style partner experiences the directive presence of the PA partner as containing rather than threatening — is a clinical claim that one can interrogate against case material and against the dyadic literature.

The trade-off, relative to a pure trait model, is that the richer reading exposes itself to challenge. A pure trait model lets you predict only that warm people will get along with warm people; it cannot tell you why a particular warm pairing falls apart. The Millon-on-IPC reading lets you say something more specific: that two LM-style partners may share warmth but, lacking the agency differential that lets one of them lead and the other defer, will struggle when decisions need to be made. That is a useful prediction, and a falsifiable one.

The empirical evidence base, honestly stated

It would be misleading to present complementarity as a settled law. Several caveats are real and worth foregrounding before describing what an engine built on this theory can claim.

First, the affiliation prediction is much better supported than the agency prediction. Sadler and Woody's data, and the review by Sadler, Ethier, and Woody (2011), show that within-dyad correspondence on warmth is robust and large; reciprocity on dominance is real but smaller and more context-dependent. A person who is highly dominant in a work meeting may not be dominant at home; the agency dimension is more situationally elastic than the affiliation one.

Second, hostile-dominant pairings are a well-documented exception to the rule that complementarity predicts satisfaction. Two cold-dominant partners can sometimes lock into a stable but mutually corrosive pattern; the geometry says they are anti-complementary on agency and complementary on hostility, and the relationship behaves accordingly. Complementarity predicts behavioral fit; it does not, by itself, predict happiness. A pattern can be stable and recognizably "complementary" in Carson's sense and still be a bad place to live.

Third, dyadic prediction from one partner's report has intrinsic limits. Most of the research on which complementarity rests was done with both partners measured directly, often with peer ratings or behavioral observation in addition to self-report. Predicting fit from a single informant — especially one who is in the early stages of a relationship and whose impressions of the partner are filtered through their own attachment history — is a harder problem and a weaker inference. A serious tool built on this tradition has to acknowledge that limit and structure its outputs accordingly.

Finally, complementarity predicts initial attraction and short-term comfort substantially better than it predicts long-term satisfaction. The literature on long-term marital outcomes points to a different set of variables (conflict regulation, perceived responsiveness, attachment security) that are only partially captured by interpersonal style. An interpersonal-style match contributes to durable satisfaction but is not sufficient for it, and a tool that conflates "predicts attraction" with "predicts a good marriage" is overclaiming.

With those caveats on the table, the affirmative case is still substantial. Interpersonal complementarity, properly stated, is one of the most reliable findings in dyadic personality research; Millon's typology gives that finding clinical content; and a single-informant read of a partner is, while imperfect, vastly better than no read at all.

How Dovetails operationalizes this

The Dovetails engine implements the Millon-on-IPC framework directly, with three deliberate design choices that follow from the caveats above.

The first design choice concerns how the partner is measured. Dovetails does not ask you to give us your partner's responses to a personality inventory — for the obvious reason that early-dating partners do not, in general, sit down and complete one. Instead, the engine measures the partner indirectly through what is called a Countertransference Questionnaire (CTQ): twenty-four short statements about how you feel when you are with this person. The construct is borrowed from the clinical literature on countertransference, where Westen and colleagues (notably the Countertransference Questionnaire of Betan, Heim, Conklin, & Westen, 2005) have shown that systematic patterns in a clinician's emotional response to a patient track the patient's underlying personality organization with meaningful reliability. The same logic applies, with appropriate humility, to romantic dyads. Items like "I feel like I need to earn their approval" (PA), "I feel like I'm being sized up or evaluated" (BC), "I have trouble reaching them emotionally" (FG), "I feel protective toward them" (HI), "There's an easy warmth to our interactions" (LM), and "I feel energized by being around them" (NO) tap the complementary pulls each octant tends to evoke in a partner. Three items per octant, summed, place the partner on the IPC. The engine then takes the highest-scoring octant as the inferred type.

This is an indirect read by design. We are not asking you to diagnose your date. We are reading the evoked countertransference — the pattern of feeling-states a particular interpersonal style reliably elicits — and inferring the style from the pattern.
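The scoring step described above (three items per octant, summed, highest octant wins) can be sketched as follows. Several details are assumptions made for illustration: that items are stored grouped by octant in the order shown, that responses are on a 1-to-5 scale, and that ties go to the first octant in that order. The article does not specify any of these.

```python
# Assumed item layout: responses[0:3] belong to PA, responses[3:6] to BC, and so on.
OCTANTS = ["PA", "BC", "DE", "FG", "HI", "JK", "LM", "NO"]
ITEMS_PER_OCTANT = 3


def score_ctq(responses):
    """Sum the three item responses for each octant and return a dict of octant scores."""
    expected = len(OCTANTS) * ITEMS_PER_OCTANT
    if len(responses) != expected:
        raise ValueError(f"expected {expected} responses, got {len(responses)}")
    scores = {}
    for i, octant in enumerate(OCTANTS):
        start = i * ITEMS_PER_OCTANT
        scores[octant] = sum(responses[start:start + ITEMS_PER_OCTANT])
    return scores


def infer_octant(scores):
    """Return the highest-scoring octant (assumed tie-break: first in OCTANTS order)."""
    return max(scores, key=scores.get)


# Hypothetical respondent: low agreement everywhere except the three FG items.
responses = [1] * 24
responses[9:12] = [5, 4, 5]
scores = score_ctq(responses)
print(scores)
print(infer_octant(scores))
```

Returning the full score dictionary alongside the winning octant keeps the inference inspectable, since a near-tie between two octants is visible in the raw sums.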

The second design choice concerns how the inferred type is enriched. The CTQ alone gives the engine a discrete octant. To turn that octant into something usable by the compatibility step, Dovetails passes the octant's prototypical description through a free-text profiling layer we call the Lexicaine client. Lexicaine returns continuous IPC coordinates (a dominance value and a warmth value) and a higher-dimensional factor representation derived from a personality factor space trained on a large corpus of natural-language descriptions. This step buys the engine two things. First, it converts a categorical inference into continuous coordinates, which the geometry needs in order to compute distances rather than just compare labels. Second, it lets the engine compute compatibility in a richer factor space than the two IPC dimensions, when that richer signal is available. The result is a compatibility estimate that degrades gracefully: if the Lexicaine signal is unavailable, the engine falls back to two-dimensional cosine similarity on the IPC plane; if it is available, it uses the higher-dimensional factor space.
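A minimal sketch of that fallback logic, assuming each profile is a plain dictionary with an "ipc" pair and an optional "factors" vector. The field names and the function `profile_similarity` are hypothetical, and nothing here reflects the actual Lexicaine interface.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def profile_similarity(a, b):
    """Prefer the factor space when both profiles have it; otherwise use the IPC plane.

    Returns (similarity, name_of_space_used). Field names are hypothetical.
    """
    fa, fb = a.get("factors"), b.get("factors")
    if fa and fb and len(fa) == len(fb):
        return cosine_similarity(fa, fb), "factors"
    return cosine_similarity(a["ipc"], b["ipc"]), "ipc"


date = {"ipc": (-0.7, -0.7), "factors": None}             # richer signal unavailable
reference = {"ipc": (0.7, 0.7), "factors": [0.2, 0.1, 0.9]}
print(profile_similarity(date, reference))                 # falls back to the IPC plane
```

Returning the name of the space alongside the value lets downstream code report which signal the estimate rests on.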

The third design choice concerns what counts as compatible. Dovetails uses a fixed Millonian complement table, derived from the circumplex geometry and Millon's clinical writing, to identify each user's ideal and nightmare match types. For a user whose own IPC-32 results put them at PA (Assured-Dominant / Narcissistic-style), the ideal is HI (Unassured-Submissive / Avoidant-style) and the nightmares are BC and DE. For a user at FG (Aloof-Introverted / Schizoid-style), the ideal is NO (Gregarious-Extraverted) and the nightmares are HI and JK. The engine then compares the inferred date type against these reference points in the chosen feature space and returns two numbers: an overlap-with-ideal percentage and an overlap-with-nightmare percentage.

These are reported as two separate numbers, rather than collapsed into a single score, for a theoretical reason. A high ideal-overlap and a high nightmare-overlap carry different information, and a pairing can be moderately close to both when the inferred date sits in a region of the circumplex roughly equidistant from the two reference points. Showing both numbers preserves information that a single composite would destroy.
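The lookup-and-compare step can be sketched on octant prototypes alone. The table below contains only the three rows stated in this article; the remaining rows are not reproduced. The rescaling of cosine similarity to a 0-to-100 percentage, and the choice to report the closest nightmare when there are several, are assumptions made for illustration, so the absolute numbers need not match what the engine reports.

```python
import math

# Assumed conventional octant angles in degrees (LM at 0, PA at 90).
ANGLES = {"LM": 0, "NO": 45, "PA": 90, "BC": 135, "DE": 180, "FG": 225, "HI": 270, "JK": 315}

# Only the rows given in the article; the full table is not reproduced here.
COMPLEMENT_TABLE = {
    "PA": {"ideal": "HI", "nightmares": ["BC", "DE"]},
    "FG": {"ideal": "NO", "nightmares": ["HI", "JK"]},
    "LM": {"ideal": "NO", "nightmares": ["DE"]},
}


def overlap_percent(octant_a, octant_b):
    """Assumed mapping: cosine between octant prototypes, rescaled from [-1, 1] to [0, 100]."""
    delta = math.radians(ANGLES[octant_a] - ANGLES[octant_b])
    return round(50 * (1 + math.cos(delta)), 1)


def compatibility(user_octant, date_octant):
    """Return the two overlaps separately rather than a single composite score."""
    entry = COMPLEMENT_TABLE[user_octant]
    ideal = overlap_percent(date_octant, entry["ideal"])
    # Assumption: with several nightmare types, report the closest one.
    nightmare = max(overlap_percent(date_octant, n) for n in entry["nightmares"])
    return {"ideal_overlap": ideal, "nightmare_overlap": nightmare}


print(compatibility("PA", "HI"))  # date sits exactly on the PA user's ideal
print(compatibility("LM", "FG"))  # date sits opposite the LM user's ideal
```

Even in this toy version, a date who sits exactly on the PA user's ideal still shows a nonzero nightmare overlap, which is the kind of information a single composite would hide.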

A worked example

Consider a user whose own IPC-32 places her at LM (Warm-Agreeable, Histrionic-adjacent style): high on warmth, neutral on dominance, organized around emotional availability and the maintenance of positive affective ties. The Millon complement table specifies that her ideal is NO (Gregarious-Extraverted) and her nightmare is DE (Cold-Quarrelsome).

She runs the CTQ on a recent date. Her responses load most heavily on the FG items: she found him hard to reach emotionally, felt strangely alone in his company, and found that she was doing most of the relational work. The engine infers FG (Aloof-Introverted, Schizoid-style). Lexicaine, given the FG prototype description, returns coordinates well into the low-warmth, low-dominance quadrant. The engine compares these coordinates to the LM-user's reference ideal (NO) and reference nightmare (DE).

The output shows a low overlap-with-ideal score (FG sits diametrically opposite NO on the circumplex; her date is far from the gregarious-warm partner the model says she would do best with) and a moderate overlap-with-nightmare score (FG and DE share the cold half of the circumplex but differ on dominance, so the overlap is not maximal). The engine frames this straightforwardly: the date pulls strongly toward the kind of one-sided emotional labor that LM-style partners often find exhausting in the long run; he is not the user's worst match in the complement table, but he is closer to it than to the ideal, and the user should weight her own observed reactions accordingly.

The right way to read this is not as a verdict. It is as a structured restatement of what her CTQ responses already suggested, geometrically located against the predictions Millon's theory makes from her own type. If she has reason to think the engine is wrong — if, for example, she suspects that what she read as withdrawal was situational anxiety rather than schizoid distance — that is exactly the kind of qualitative information the engine cannot incorporate, and it should outweigh the percentage.
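The geometric claims in this example are easy to check by hand. The sketch below assumes the conventional 45-degree octant spacing and computes the smallest angle between two octant midpoints; the helper name is illustrative.

```python
# Assumed conventional octant angles in degrees (LM at 0, PA at 90).
ANGLES = {"LM": 0, "NO": 45, "PA": 90, "BC": 135, "DE": 180, "FG": 225, "HI": 270, "JK": 315}


def angular_separation(a, b):
    """Smallest angle in degrees between two octant midpoints (0 to 180)."""
    diff = abs(ANGLES[a] - ANGLES[b]) % 360
    return min(diff, 360 - diff)


print(angular_separation("FG", "NO"))  # inferred date versus the LM user's ideal
print(angular_separation("FG", "DE"))  # inferred date versus the LM user's nightmare
```

Under that spacing, the inferred date is as far from the ideal as the circle allows and one octant away from the nightmare, which is why the overlap with the nightmare is substantial without being maximal.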

Honest caveats

Dovetails is a decision aid, not a verdict. The CTQ is a single-informant instrument and inherits the well-documented limits of single-informant personality assessment: projection, recency effects, the influence of the perceiver's own attachment style on what they experience as warmth or coldness, and the basic fact that one's reactions to a person stabilize over time and are particularly noisy in the first few encounters. The Millon complement table encodes a strong theoretical prior, and that prior is not always right; some users find their best long-term partners in regions of the circumplex the table would not have predicted. The compatibility numbers the engine returns are estimates of interpersonal-style fit, which is a real and important component of relational satisfaction but is not the whole of it.

These limits are why the engine reports two separate compatibility numbers rather than a single composite, why it returns the inferred octant alongside the user's own raw CTQ scores so the inference can be inspected, and why the interpretive copy on the results page consistently frames the output as a structured read of the user's own impressions rather than as a measurement of the partner. A tool built on this literature should be candid about the difference between predicting attraction (which complementarity does well), predicting short-term comfort (which it does moderately well), and predicting durable satisfaction (which depends on variables this engine does not measure).