Observatory live · 1594 signals · 55 consilience · 139 confirmed · 289 killed
Long-cycle structural research
The Consiliences Institute

Unity of knowledge in an age of fragmentation

Killed Signals

Hypotheses the Observatory tested and formally retired.

A research programme that never kills anything is not doing science. It is collecting confirmations.

The clear failures documented below are the easy part. The harder editorial judgement — what to do with an r = 0.18, p = 0.03, surrogate-passing but small-effect signal — is not a kill and so does not appear in any list. That is where the effect-size weighting and the consilience-upgrade requirements do most of their work, and it is where the remaining epistemic risk lives. Read the kill list with that in mind: it documents the unambiguous retirements, not the marginal calls.

Kill rates by signal class

Signal class | Kills / total | Kill rate
Corpus-wide (all evaluated hypotheses) | 289 of 1594 | 18.1%
Trading signals (April 2026 battery) | 7 of 16 | 44%
Ancient-knowledge subset | 26 of ~75 | 34.6%
Session 6 quantitative-measurement batch | 1 of 19 | 5.3%

The trading-signal and ancient-knowledge rates run higher than the corpus-wide figure because they apply a stricter multi-test battery; the Session 6 batch runs lower because quantitative physical measurements survive translation. Each rate is discussed in context below.
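The rates in the table follow directly from the counts. A minimal sketch of the arithmetic, using only the figures above; the dictionary keys are illustrative names, and the ancient-knowledge total is taken as exactly 75 even though the source gives it only as approximate:

```python
# (kills, total) per signal class, as stated in the table above.
COUNTS = {
    "corpus_wide": (289, 1594),
    "trading_signals": (7, 16),
    "ancient_knowledge": (26, 75),  # assumption: "~75" treated as exactly 75
    "session_6": (1, 19),
}


def kill_rate(kills: int, total: int) -> float:
    """Kill rate as a percentage, rounded to one decimal place."""
    return round(100.0 * kills / total, 1)


rates = {name: kill_rate(k, n) for name, (k, n) in COUNTS.items()}

for name, rate in rates.items():
    print(f"{name}: {rate}%")
```

With the denominator fixed at 75 the ancient-knowledge rate computes to 34.7%, so the published 34.6% implies the approximate total is not exactly 75.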


The Observatory has formally evaluated 1594 independent hypotheses. Of these, 289 have been permanently retired: killed after failing the primary test, succumbing to a fatal confound, or failing a formal multi-test battery that caught what single-pass validation did not. What follows is a selection of the more scientifically instructive kills.

Not all kills are equal. The cases here are tagged by failure mode:

  • PURE NOISE — no signal at any threshold, mechanism physically incoherent
  • DIRECTIONAL INVERSION — real signal, but the effect runs opposite to the original hypothesis
  • CONFOUNDED — real correlation, but explained by a third variable, not the proposed driver
  • ADAPTIVE RESPONSE — mechanism real, market or biological system neutralised the effect
  • ERA-SPECIFIC — held in one historical regime, did not generalise
  • UNDERPOWERED — plausible direction, insufficient n to confirm

The corpus-wide kill rate across all evaluated hypotheses is 18.1% — 289 of 1594 signals retired. We consider that kill rate a quality signal, not a problem.

A selection note: the cases shown here are not the easiest kills. We highlight the hard cases — plausible mechanisms, coherent hypotheses, null results — because they better illustrate what the process is actually doing.


Geomagnetic activity → financial market volatility CONFOUNDED

Hypothesis: Elevated geomagnetic storm activity (Kp index, Dst index) drives measurable increases in equity market volatility (VIX) through physiological stress pathways in traders.

Why it seemed plausible: Laboratory studies show geomagnetic storms affect melatonin secretion and autonomic nervous system function. If this influences decision-making under stress, a market signal might follow.

What the tests showed:

  • Dst vs VIX: r = 0.024, p = 0.62. No relationship.
  • Bz component vs VIX: r = 0.025, p = 0.61. No relationship.
  • Solar wind velocity (Vsw) partially mediates the Kp vs VIX relationship, but the effect is temporally unstable and the solar/business-cycle confound cannot be separated.
  • Five independent tests (Dst, Kp, Bz, solar wind, sunspot count) all returned null against financial outcomes across monthly data from 1990 to 2023.

A note on the literature: Some published papers - including a widely cited 2003 study - report geomagnetic effects on equity returns. The Observatory’s verdict diverges from these because our tests controlled explicitly for the solar/business-cycle confound (geomagnetic activity and economic volatility share a common driver in the 11-year solar cycle) and applied surrogate significance testing. The positive papers did not. We are not dismissing that literature; we are reporting what happens when you add those controls. The effect disappears.

Verdict: KILLED. The EM-to-financial pathway is one of the clearest cognitive-bias artifacts in the Observatory: humans intuitively reach for a “cosmic forces affect human behaviour” explanation. The data consistently refuses it.
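One common way to control for a shared driver is a first-order partial correlation. The sketch below is illustrative only: the source does not state which control method the Observatory used, and the function names and the synthetic series are assumptions, not Observatory data.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic illustration: a common driver z feeds both x and y.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(2000)]
x = [v + rng.gauss(0, 1) for v in z]
y = [v + rng.gauss(0, 1) for v in z]

raw = pearson(x, y)                 # substantial, because both series share z
controlled = partial_corr(x, y, z)  # close to zero once z is accounted for
print(f"raw r = {raw:.3f}, partial r = {controlled:.3f}")
```

The design point is the one made in the note on the literature: a raw correlation between two series can be entirely inherited from a third, and it vanishes once that third series is held fixed.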


Moon phase → equity returns PURE NOISE

The claim that the lunar cycle leaves a footprint in equity returns is one of the most persistent in alternative finance, and on the surface it is not absurd. A respectable body of academic papers has reported the effect; the proposed mechanism — moonlight disrupting sleep, sleep disrupting investor mood — has at least biological plausibility; and the 29.5-day cycle is short enough that decades of market data yield hundreds of observations to test against. None of that survives contact with the right control.

The decisive test is a surrogate one. When the observed return series is compared against randomised surrogate series carrying the same autocorrelation structure, the lunar signal returns a p-value of 0.531 — indistinguishable from noise. The published positive findings collapse for a related reason: they do not survive Bonferroni correction once you account for the number of markets and time windows that were searched before a “significant” pairing was found. And the mechanism itself does not hold up either, because modern indoor lighting has decoupled human melatonin production from actual moonlight for the better part of a century.

Verdict: KILLED. The lunar-equity literature is a multiple-comparisons artifact, and the surrogate control is what makes that diagnosis unambiguous rather than merely suspected.
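The multiple-comparisons point can be made concrete with a Bonferroni threshold. A minimal sketch, assuming a hypothetical search over 20 market and time-window pairings; the p-values are invented for illustration and are not taken from the lunar literature:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def survivors(p_values, alpha=0.05):
    """Return the p-values that remain significant after correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p for p in p_values if p < threshold]


# Hypothetical search: 20 pairings tested, one of which looks "significant".
p_values = [0.01] + [0.30] * 19

uncorrected = [p for p in p_values if p < 0.05]  # one apparent hit
corrected = survivors(p_values)                  # threshold is 0.05 / 20 = 0.0025

print(len(uncorrected), len(corrected))
```

A single p = 0.01 result looks convincing on its own, but it does not clear the corrected threshold once the other 19 searches are counted.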


Pollinator decline → crop yield reduction ADAPTIVE RESPONSE

Hypothesis: Documented declines in wild pollinator populations produce measurable crop yield reductions across insect-pollinated commodity crops.

Why it seemed plausible: The biological mechanism is unambiguous - roughly 75% of flowering crops depend on animal pollination. Wild bee decline is well-documented. The causal pathway from colony collapse to yield reduction should be detectable.

What the tests showed:

  • Managed honeybee colony numbers have increased 45% globally since 1961, directly substituting for wild pollinator decline.
  • Crop yield data shows no detectable signal attributable to pollinator decline after controlling for managed hive availability.
  • The mechanism is real; the market signal is neutralised by commercial beekeeping’s adaptive response.

Verdict: KILLED as a commodity signal. The ecology is correct; the economics neutralises it. This is a useful lesson in the difference between a confirmed biological mechanism and a tradeable signal: the market can adapt faster than the mechanism propagates.


Schumann resonance → biological effects PURE NOISE

Hypothesis: Earth’s Schumann resonance frequencies (7.83 Hz fundamental) affect human neurological function through electromagnetic coupling, producing measurable physiological or cognitive effects.

Why it seemed plausible: The Schumann resonance frequency overlaps with the human alpha brain wave range. Some researchers proposed resonant entrainment as a mechanism.

What the tests showed:

  • Atmospheric Schumann fields at ground level are approximately 0.3 picotesla. The Earth’s static geomagnetic field is approximately 50,000,000 picotesla - a ratio of 1 to 167 million.
  • No biophysical mechanism can explain selective coupling to the Schumann frequency at this field strength when the static field is eight orders of magnitude stronger.
  • All published positive findings failed to survive blinded replication under controlled shielding conditions.

Verdict: KILLED on mechanism grounds. The field strength is physically insufficient by many orders of magnitude. This is a case where the mechanism review (Validation Dimension 8) is decisive before statistical testing is even necessary.
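The field-strength comparison is simple arithmetic. A minimal sketch using the two figures quoted above; the variable names are illustrative:

```python
import math

schumann_field_pt = 0.3        # atmospheric Schumann field at ground level, picotesla
static_field_pt = 50_000_000   # Earth's static geomagnetic field, picotesla

ratio = static_field_pt / schumann_field_pt  # roughly 1.67e8, i.e. 1 to 167 million
orders_of_magnitude = math.log10(ratio)      # a little over 8

print(f"ratio = {ratio:.3g}, orders of magnitude = {orders_of_magnitude:.2f}")
```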


Gleissberg solar cycle → epidemic periodicity CONFOUNDED

Hypothesis: The 87-year Gleissberg solar cycle modulates epidemic outbreak frequency in Chinese historical records through solar - climate - immune pathway interactions.

Why it seemed plausible: Chinese dynastic records contain unusually detailed epidemic documentation spanning multiple centuries. The Gleissberg cycle has documented climate effects. An immune system link had theoretical support.

What the tests showed:

  • Spectral analysis of the Chinese epidemic record returned no significant Gleissberg periodicity after controlling for population density and documentation intensity biases.
  • The apparent periodicity dissolved under surrogate testing.
  • The documentation intensity bias is severe: epidemic recording correlates strongly with dynastic administrative capacity, not epidemic frequency.

Verdict: KILLED (NULL). The source data has a systematic bias that mimics cycles. Any periodicity in the record reflects the cycles of Chinese bureaucratic capacity as much as epidemic biology.


What the kill list tells you

These five kills share a common structure: plausible mechanism, coherent hypothesis, null result. In each case, something specific ended the inquiry - a surrogate control, a field strength calculation, an adaptive market response, a data provenance problem.

The Observatory’s 8-step validation framework is designed to surface these failure modes before publication, not after. The devil’s advocate pass (Dimension 6) and the surrogate significance test (Dimension 4) between them account for four of the five kills above.

One thing this list cannot show: how we would have reported a case where the test returned a weak positive and we chose to proceed anyway. The kill list documents the clear failures, not the marginal calls — as the preamble at the top of this page notes, that is where the remaining epistemic risk lives.

The confirmed signals survived this process. The killed signals did not. Both facts matter.


April 2026 battery retest — trading signals

In April 2026 a formal five-test finding-validator battery was applied to every active trading signal on the public dashboard. Of sixteen signals tested, seven failed all four pass-fail criteria: a 44% kill rate, well above the corpus-wide rate (16.3% on the 1,542-signal count verified 15 May 2026; 18.1% on the 1594-signal count in the table above). Four COT contrarian signals failed (corn, wheat, and crude oil outright; EUR/USD as a directional kill), two macro signals inverted directionally (Sahm Rule, SIPRI military spending), and one spectral-transmission signal failed (market-microstructure Kitchin).

The full per-signal writeups are on the dedicated Trading Battery Kills page.


April 2026 long-wave battery retest

Extending the battery to long-wave and transmission-chain hypotheses produced two more consequential kills.

Kondratieff 55-year wave in Bank of England 800-year CPI data PURE NOISE

The Kondratieff long wave — a roughly 55-year cycle in inflation and prices, supposedly driven by successive clusters of technological innovation, from steam and textiles through rail and steel, electricity and chemicals, automobiles and petrochemicals, to ICT and digital — has a long pedigree in heterodox economics. We took it seriously. The Bank of England’s Millennium-scale CPI series (1270–2016) is one of the longest price records available anywhere, prior analyses had reported spectral significance across several long-run datasets, and our own earlier validation had given the signal a CONSILIENCE verdict on the strength of cross-dataset replication. The retest exists because that earlier verdict was reached before phase-randomised surrogate testing was applied at this scale.

When the surrogate test was applied, the cycle dissolved. Five hundred phase-randomised iterations put the observed spectral peak in the 40–65 year band at 0.0035 against a surrogate null mean of 0.0034 — a z-score of 0.12, p = 0.36. The peak is exactly what random data of the same amplitude spectrum produces. The supporting tests tell the same story: split the record at 1900 and the pre-1900 half is marginal (p = 0.13) while the post-1900 half shows nothing (p = 0.67); widen or shift the band and only the narrow 45–55 year window clears p < 0.05 while every neighbouring band fails; and the one additional dataset tested, Shiller’s S&P 500 annual returns, is a single corroboration rather than the independent replication a CONSILIENCE verdict requires.

Verdict: KILLED on surrogate grounds. The 55-year cycle reported in 800 years of price history is indistinguishable from phase-randomised noise, not a periodic signal, and the earlier CONSILIENCE verdict was an overstatement traceable to a test that had not yet been run. We are not claiming the cycle cannot exist, only that the best 800-year dataset available cannot distinguish it from noise. The methodological lesson is the durable part: a spectral peak in a single dataset is not evidence of a cycle, and the surrogate comparison is what separates the two.
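Phase randomisation itself can be sketched in a few lines. What follows is a generic illustration, not the Observatory's code: it uses a naive pure-Python DFT (adequate for short annual series), and the function names are assumptions. The surrogate keeps the amplitude spectrum of the input and scrambles only the phases.

```python
import cmath
import math
import random


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft_real(coeffs):
    """Inverse DFT; returns the real part, which is the whole signal when the
    coefficients are conjugate-symmetric."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def phase_randomised_surrogate(x, rng):
    """Series with the same amplitude spectrum as x but uniformly random phases."""
    n = len(x)
    coeffs = dft(x)
    out = list(coeffs)  # keeps the mean term (k = 0) and, for even n, the Nyquist term
    for k in range(1, (n - 1) // 2 + 1):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        out[k] = abs(coeffs[k]) * cmath.exp(1j * theta)
        out[n - k] = out[k].conjugate()  # conjugate symmetry keeps the result real
    return inverse_dft_real(out)


# Usage on a short synthetic series (illustrative data only).
series = [math.sin(2 * math.pi * t / 8) + 0.05 * t for t in range(32)]
surrogate = phase_randomised_surrogate(series, random.Random(0))
```

Because the amplitude spectrum is preserved by construction, the statistic compared against the surrogates has to depend on phase; the source does not specify which spectral estimator was used, so that step is left out of the sketch.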

Kitchin causal chain: a specific mediator hypothesis CONFOUNDED

Hypothesis: A four-step causal chain from inventory-cycle indicators to equity volatility, mediated specifically by a credit-sentiment intermediary, propagates the Kitchin inventory cycle through to financial-market outcomes.

Why it seemed plausible: Each pairwise link in the chain is statistically significant on the full sample. The chain has a coherent economic story.

What the tests showed:

  • Chain-level surrogate testing returns a real signal — shuffling the intermediary series breaks the chain decisively.
  • Era split (pre-2007 vs post-2010) finds the intermediary-specific links fragile while a downstream end-point link remains stable in both eras.
  • An alternative intermediary fits the chain at least as well as the one specified by the hypothesis.

Verdict: PARTIAL. The specific-mediator framing does not survive multi-test review; intermediary specificity is not supported. The claim is retained as a downgraded statistical observation. Operational implications and reframed-claim details are withheld from public reporting.


Updated kill/caveat state

The April 2026 long-wave retest adds one full kill (Kondratieff 55-year wave) and one major caveat on a credit-mediator hypothesis. The total Observatory kill rate rises marginally; more importantly, two widely-cited claims have been reframed.

Taking the trading-signal battery and the long-wave battery together, the revised pattern is clear: claims that survive single-test validation often do not survive a multi-test battery, and the distinction is decisive for commercial application.


April 2026 ancient-knowledge battery

One more battery worth publishing here, because the result runs against the narrative we had grown comfortable with.

Ancient knowledge as a privileged validation source CONFOUNDED

The hypothesis under test here is not about any single practice but about a whole class of them. Traditional-knowledge systems — Moerman ethnobotany, Ayurveda, TCM, Ifa divination pharmacology, waru waru Andean engineering, Nubian tetracycline, Vedic constitutional frameworks, Mesoamerican calendrical astronomy, and their kin — were proposed to encode correct-on-modern-science practices more often than random hypothesis-generation would. If that held, the accumulated pattern-recognition of pre-modern empirical cultures would amount to a privileged source of research-worthy hypotheses. It was a comfortable idea to hold. The Observatory carries case-level confirmed examples across several of these traditions, the published literature contains independent cross-cultural convergence findings, and the narrative that ancient cultures encoded durable truths fit the data we had looked at.

It does not fit the data once you look at the whole batch. The figures that follow are within the ancient-knowledge signal batch specifically — 26 killed of roughly 75 ancient-knowledge signals — and are not corpus-wide rates. Inside that subset, the confirmed rate is 57.7 per cent against 81.0 per cent for all other signals: a gap of 23.3 percentage points, with z = −2.98 against a permutation null. The kill rate runs the same direction — 34.6 per cent inside the ancient-knowledge subset versus 10.4 per cent across the comparison set. Far from outperforming, ancient-source signals as a class do worse on both confirmation and kill rate than the rest of the Observatory at this scope.

The denominator is what gives the number its weight. The corpus-wide kill rate is 16.3 per cent (251 signals killed on NOISE and NULL verdicts of 1,542 in the corpus, verified 15 May 2026), so the 34.6 per cent within-subset figure sits at about 2.1 times that baseline, and at about 3.3 times the 10.4 per cent rate in the comparison set. That gap is the finding. But the picture is not uniform: a separate 19-signal Session 6 batch covering calendars, agronomy, surveying, and time-keeping recorded just one kill of nineteen (5.3 per cent). Where the underlying observation is a quantitative physical measurement that survives translation, the pattern reverses. Both results stand together: the broad-population claim fails, the narrow specific-subset claim holds.

Verdict: KILLED at the population level. What is retired here is the broad framing — ancient knowledge as a privileged source of research hypotheses, full stop — not the confirmed cases inside it. We publish this kill because a killed signal is an output of the research process, not a failure of it, and a programme that retires hypotheses publicly when the evidence demands it is structurally different from one that does not. That this particular kill cuts against a theme we had grown comfortable with is the point, not an embarrassment.
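The ratios between the rates in this section can be checked from the quoted percentages and counts. A minimal sketch; the variable names are illustrative:

```python
kill_rate_subset = 34.6      # ancient-knowledge subset, per cent
kill_rate_comparison = 10.4  # all other signals, per cent
kill_rate_corpus = round(100 * 251 / 1542, 1)  # 16.3, from the counts in the text

# How many times higher the subset kill rate is than each reference rate.
ratio_vs_corpus = round(kill_rate_subset / kill_rate_corpus, 1)          # 2.1
ratio_vs_comparison = round(kill_rate_subset / kill_rate_comparison, 1)  # 3.3

# Gap in confirmation rate, in percentage points.
confirmed_gap = round(81.0 - 57.7, 1)  # 23.3

print(ratio_vs_corpus, ratio_vs_comparison, confirmed_gap)
```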

Reconciling the ancient-knowledge corpus counts

Several different counts of “ancient-knowledge signals” appear across the papers — 19, 26, 75, and 119. These are not contradictory. They are different cohorts, defined at different times and at different scopes, as the ancient-knowledge work grew. The table below names each one so a reader meeting two figures in two papers can see why they differ.

Cohort | When | n | Notes
Earliest ancient-corpus reference | April 2026 | 119 | The first broad inventory: the count of traditional-knowledge claims surveyed in the Ancient Precision Archive paper, before the standard validation framework had been applied to all of them as Observatory signals. Wider than the later signal batches because it includes claims that were never promoted to formal signals.
Session-6 quantitative-measurement batch | April 2026 | 19 | A focused batch of calendar, agronomy, surveying, and time-keeping signals: claims whose core is a quantitative physical measurement that survives translation. One kill (5.3%).
Full ancient-knowledge battery | April 2026 | ~75 | The full set of ancient-source signals carried in the Observatory at the time of the meta-battery. 26 killed (34.6%).
Ancient Precision Archive corpus | April–May 2026 | 26 | The ancient-source signals examined as a class in the population-level reconciliation above: the cohort that yielded the 57.7% confirmed vs 81.0% comparison figure. A subset of the ~75-signal battery, scoped to the class-level confirmation test.

The numbers move because the cohort moves: 119 is the widest (surveyed claims), ~75 is the formal signal battery, 26 is the class-level reconciliation subset, and 19 is the narrow quantitative-measurement batch. Reading any one figure as the “true” corpus size, or treating two of them as a contradiction, mistakes a sequence of nested cohorts for a single fixed population.


April 2026 cosmic ray chain retest

Neutron monitor flux leading corn prices UNDERPOWERED

Hypothesis: Galactic cosmic ray flux, indexed by neutron monitor counts or the inverted sunspot number, leads corn prices by approximately 24 months. A previously-cited figure of r=+0.475 at the 24-month lag had entered the Observatory forecast scenarios as support for the 2031-2032 agricultural-price prediction.

Why it seemed plausible: The Svensmark cosmic-ray-cloud hypothesis has a coherent mechanism (GCR flux modulates low cloud nucleation, which modulates temperature and precipitation, which modulates crop yields). The 2029-2030 solar minimum is predictable astronomy, so if the chain held, the forward prediction would be crisp.

What the tests showed:

  • 64 years of annual data (1961-2024), inverted SILSO sunspot number as GCR proxy, CBOT corn monthly prices resampled to annual.
  • Permutation null: best observed correlation r=0.11 at 1-year lag, against a permutation null mean of r=0.16. The observed value is below the null mean (z = -0.68).
  • Lag-by-lag: r(0)=0.10, r(1)=0.11, r(2)=0.09, r(3)=0.02, r(4)=-0.09, r(5)=-0.13. No peak near the claimed 24-month lag.
  • Era split: pre-1990 r=0.04, post-1990 r=0.30. The recent era shows a meaningful correlation that the full record does not.

Verdict: WEAK, 1/4 PASS. The previously-cited r=+0.475 at 24 months is not supported by the data. The best observed correlation (r=0.11) is roughly a quarter of the claimed value, and it falls at a different lag. The magnitude does not support the 2031-2032 agricultural-price prediction at the strength originally attached to it.
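A best-lag correlation has to be judged against a null that also takes the best of several lags, which is why a null mean (0.16) can sit above the observed value (0.11). The sketch below illustrates that logic under stated assumptions: the source does not specify its permutation scheme, so a plain shuffle of the driver series is used here (this discards autocorrelation), and all function names are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_r(driver, target, lag):
    """Correlation of driver[t] with target[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(driver, target)
    return pearson(driver[:-lag], target[lag:])


def best_lag_r(driver, target, max_lag):
    """Largest correlation over lags 0..max_lag."""
    return max(lagged_r(driver, target, k) for k in range(max_lag + 1))


def best_lag_permutation_test(driver, target, max_lag, n_perm, rng):
    """Return (observed best-lag r, null mean, z-score) using shuffled drivers."""
    observed = best_lag_r(driver, target, max_lag)
    null = []
    for _ in range(n_perm):
        shuffled = list(driver)
        rng.shuffle(shuffled)  # assumption: plain shuffle, autocorrelation not preserved
        null.append(best_lag_r(shuffled, target, max_lag))
    mean = sum(null) / n_perm
    sd = math.sqrt(sum((v - mean) ** 2 for v in null) / (n_perm - 1))
    return observed, mean, (observed - mean) / sd
```

Called with two 64-point annual series and `max_lag=5`, the function returns the three quantities reported in the permutation bullet above: the best observed r, the null mean, and the z-score.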