Improving Activation Steering in Language Models with Mean-Centring

Jul 26, 2023 · Ole Jorgensen, Dylan Cope, Nandi Schoots, Murray Shanahan
Last updated on Jul 21, 2025


© 2025 Nandi Schoots. This work is licensed under CC BY-NC-ND 4.0.