Who Wrote Bitcoin? — Code Stylometry Analysis

Comparing Bitcoin v0.1 (2009) against open-source code from 11 Satoshi candidates

Satoshi Nakamoto's Coding DNA

Based on analysis of Bitcoin v0.1.0 (January 2009) — main.cpp, net.cpp, ui.cpp, serialize.h, bignum.h

Language | C++ (10 .cpp files, 0 .c files)
Comments | // line comments exclusively (295 instances, zero /* */ block comments)
Class naming | CamelCase with C prefix: CBlock, CTransaction, CWalletTx, CKey, CAddress
Member/global vars | Hungarian notation: fFound, nBestHeight, mapWallet, vWalletUpdated, csMapKeys
Indentation | 4 spaces (zero tab lines in entire codebase)
if spacing | if (x): space before paren, no spaces inside
Return style | return true; (no parentheses)
Brace style | Allman (opening brace on next line for functions)
Function decls | Return type on same line as function name
Platform | Windows-first: 44 Win32/wxMSW refs, 0 Unix refs, wxWidgets GUI
Headers | Single monolithic headers.h (precompiled header pattern)
Pointers | Raw pointers, NULL (not nullptr), no smart pointers
STL | Heavy use: map<>, vector<>, multimap<>, string
Error handling | C++ throw, custom error() function, printf

The C class prefix and Hungarian notation (f/n/map/v/cs prefixes) are specifically from Microsoft's MFC/COM programming tradition of the 1990s–2000s. The monolithic headers.h is a Windows precompiled header pattern. These traits collectively point to someone who spent years writing Windows desktop applications in Visual C++.
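Several of the surface traits listed above can be counted mechanically. The following is a minimal sketch in Python, not the tooling used for this analysis: the function name measure_style, the regular expressions, and the SAMPLE fragment are all illustrative assumptions, and the counting is deliberately naive (comment markers inside string literals are not excluded).

```python
import re

# Hypothetical C++ fragment written to imitate the conventions listed above;
# it is not taken from the Bitcoin source.
SAMPLE = '''\
class CExample
{
public:
    int nHeight;
    bool fFound;

    // Return true when the height is positive.
    bool IsValid() const
    {
        if (nHeight > 0)
            return true;
        return false;
    }
};
'''


def measure_style(source: str) -> dict:
    """Count a handful of surface style features in C/C++ source text."""
    lines = source.splitlines()
    return {
        # Naive counts: markers inside string literals are not excluded.
        "line_comments": sum(1 for ln in lines if "//" in ln),
        "block_comments": source.count("/*"),
        "tab_indented_lines": sum(1 for ln in lines if ln.startswith("\t")),
        "space_indented_lines": sum(1 for ln in lines if ln.startswith("    ")),
        # "if (x)": one space before the paren, none directly inside it.
        "if_space_before_paren": len(re.findall(r"\bif \((?! )", source)),
        "if_no_space": len(re.findall(r"\bif\(", source)),
        # Class names with a capital C followed by another capital letter.
        "c_prefixed_classes": re.findall(r"\bclass\s+(C[A-Z]\w*)", source),
        # Allman style: a line consisting only of an opening brace.
        "brace_only_lines": sum(1 for ln in lines if ln.strip() == "{"),
    }


if __name__ == "__main__":
    for name, value in measure_style(SAMPLE).items():
        print(f"{name}: {value}")
```

Running the same function over each candidate's files would give comparable raw counts, though a real comparison would need a proper tokenizer to avoid false positives.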
Stylometric Comparison — 12 Traits

Trait-by-Trait Analysis

Trait | Satoshi | M. Stokes | V. Falco | Wei Dai | P. Le Roux | J. McCaleb | L. Sassaman | H. Finney | P. Gutmann | A. Back | G. Andresen
Language | C++ | C++ | C++ | C++ | C (some .cpp) | C++ | C | C | C | C | C++
Comments | // only | // (819 vs 8) | // mostly | // mostly | /* */ primary | // mostly | /* */ only | /* */ only | /* */ only | /* */ only | // mostly
Naming | CamelCase | CamelCase | CamelCase | CamelCase | C style | CamelCase | snake_case | snake_case | snake_case | snake_case | CamelCase
C class prefix | Yes (CBlock) | Yes (CNetwork) | No | No | N/A | No | N/A | N/A | N/A | N/A | Yes (CScheduler)
Hungarian vars | f/n/map/v | m_b/n/p full | m_ prefix | simple | b/sz/n full | m prefix | N/A | N/A | N/A | N/A | Partial (n prefix)
Indentation | 4 spaces | tabs | 2 spaces | tabs | tabs | tabs | 2 spaces | tabs | tabs | tabs | 4 spaces
if spacing if (x) if ( x ) if (x) if (x) mixed if(x) if (x) if (x) if ( x ) if (x)
Return style return x return x return x return x return x return(x) return x return x Mixed
Brace style Allman Allman Allman Allman K&R-ish K&R Allman K&R K&R Mixed Allman/K&R
Ret type placement same line same line same line same line same line same line same line separate same line same line
Platform Windows-first Windows MFC Windows Cross+MSVC Windows Cross-plat Unix+Win32 Unix-first Cross-plat Unix-first Unix/Linux
Raw ptrs / NULL Yes Yes Some No smart Yes (BOOL) Some smart Mixed
No namespaces Yes Yes
Final Ranking by Stylometric Match

Candidate Scorecard

Rank | Candidate | Score | Code (Era) | Best Match | Key Gap
1 | Michael Stokes | 11/12 | Shareaza (2002–2017) | ONLY candidate matching C prefix + Hungarian + MFC + P2P protocol design | Tabs not spaces; if ( x ) has extra interior spaces
2 | Vincent Falco | 9/11 | DSPFilters (2009) | Windows P2P developer (BearShare), exact profile | BearShare source closed; no C prefix in available code
3 | Gavin Andresen | 8.5/12 | scheduler, IBLT (2015) | C prefix (CScheduler) + 4-space indent + line comments | Unix not Windows; mixed brace/return style. Style may reflect Satoshi's influence.
4 | Wei Dai | 8.5/11 | Crypto++ (1995–present) | Most general C++ traits match; sophisticated codebase | Not Windows-first; no C prefix or Hungarian notation
5 | Paul Le Roux | 5.5/12 | E4M/TrueCrypt (1998–2000) | Hungarian notation + Windows. Criminal = anonymity motive | Writes C not C++, block comments, tabs
6 | Jed McCaleb | 4.5/11 | NewCoin (2011) | Writes C++, CamelCase naming | Tabs, return(x), if(x), not Windows-first
6 | Len Sassaman | 4.5/9 | Mixmaster | Allman braces, if (x) spacing | Writes C not C++, block comments
8 | Hal Finney | 2/10 | RPOW (2004) | if (x) spacing, return x style | Writes C, tabs, block comments, separate-line return type
8 | Peter Gutmann | ~2/12 | cryptlib | Writing stylometry #2 match (ULC Legal study) | Writes C, block comments, tabs
10 | Adam Back | 1/10 | hashcash | Same-line return type | Writes C, tabs, block comments, spaces inside parens, snake_case
- | Nick Szabo | N/A | (Java only) | - | Cannot evaluate: no public C/C++ code
- | Craig Wright | N/A | (none) | - | Cannot evaluate: 0 public repos

Note on Gavin Andresen (#3): Gavin took over Bitcoin development directly from Satoshi in 2010. His C prefix usage (CScheduler) and n-prefix variables could reflect Satoshi's influence on his coding style rather than shared authorship. He is based in Amherst, MA (US Eastern timezone), which fits Satoshi's posting pattern.
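The scorecard does not spell out its scoring rule. Fractional scores such as 8.5/12 and the varying denominators suggest half credit for partial matches and the exclusion of traits with no data. The sketch below works under exactly those assumptions; the function name score_candidate, the PARTIAL_MARKERS list, and the toy profiles are hypothetical and are not the method used to produce the table.

```python
from typing import Optional

# Assumed rule (not stated in the analysis): 1 point for an exact match,
# 0.5 for a value flagged as mixed or partial, and traits with no data
# are left out of the denominator.
PARTIAL_MARKERS = ("mixed", "partial")


def score_candidate(reference: dict, candidate: dict) -> tuple:
    """Return (points, traits_evaluated) for one candidate."""
    points = 0.0
    evaluated = 0
    for trait, ref_value in reference.items():
        value: Optional[str] = candidate.get(trait)
        if value is None:          # no data for this trait: skip it
            continue
        evaluated += 1
        if value == ref_value:
            points += 1.0
        elif any(marker in value.lower() for marker in PARTIAL_MARKERS):
            points += 0.5
    return points, evaluated


# Toy profiles with a few hand-abbreviated traits; they only illustrate
# the mechanics and do not reproduce any row of the scorecard.
satoshi = {"language": "C++", "indentation": "4 spaces",
           "if_spacing": "if (x)", "braces": "Allman"}
example = {"language": "C++", "indentation": "tabs",
           "if_spacing": None, "braces": "Mixed Allman/K&R"}

points, evaluated = score_candidate(satoshi, example)
print(f"{points}/{evaluated}")  # 1.5/3
```

Under this rule, a candidate evaluated on fewer traits can tie on points while having a higher match ratio, which is worth keeping in mind when reading ranks 3 and 4.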
The MFC P2P Developer Profile

Convergence on a Specific Developer Archetype

Across all 10 candidates, the stylometric analysis has converged on a very specific profile for Satoshi: someone who wrote Windows MFC/ATL C++ peer-to-peer applications in the 2000–2008 era. The C class prefix, Hungarian notation, Allman braces, precompiled headers, and heavy Win32/wxMSW references are not generic C++ habits; they are the specific fingerprint of a Visual C++ application developer working on desktop software.

The three candidates who best match this profile:

All three are Windows developers. Two of them — Stokes and Falco — are P2P protocol designers who built decentralized file-sharing networks, the exact same domain as Bitcoin's peer-to-peer protocol. This is a remarkably narrow intersection: Windows MFC C++ developer AND decentralized P2P protocol designer. Very few people in the world fit both criteria simultaneously.
The C Prefix and Hungarian Notation — Solved

The MFC/COM Signature

Two of Satoshi's most distinctive traits — the C class prefix (CBlock, CTransaction) and Hungarian notation (fFound, nBestHeight, mapWallet) — were previously unmatched by any candidate. These conventions come specifically from Microsoft's MFC (Microsoft Foundation Classes) and COM programming traditions.

Michael Stokes' Shareaza is the first and only codebase to match both traits. Shareaza uses CamelCase with C prefix (CCollectionFile, CNetwork, CBTClient, CDownload, CBuffer, CAlbumFolder) and full Hungarian notation (m_bActive, nLength, nCount, pFile, pPacket, pHost). The Shareaza codebase also uses StdAfx.h (precompiled headers), the same pattern as Satoshi's monolithic headers.h. And Stokes designed a decentralized P2P protocol (Gnutella2), matching Satoshi's domain expertise in peer-to-peer network design.
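The two naming conventions can be checked identifier by identifier. Below is a small sketch, assuming a simplified reading of the prefixes quoted above (fFound, nBestHeight, mapWallet, vWalletUpdated, csMapKeys, m_bActive, pFile). The prefix-to-meaning mapping and the function names are illustrative assumptions, not a complete description of either codebase's rules.

```python
import re

# Prefix meanings are assumptions based on the identifiers quoted in the
# text; longer prefixes are listed first so "map" wins over "m".
PREFIXES = [
    ("map", "map"),
    ("cs", "critical section"),
    ("f", "flag"),
    ("n", "number"),
    ("v", "vector"),
    ("p", "pointer"),
    ("b", "boolean"),
]


def hungarian_prefix(identifier: str):
    """Return (prefix, meaning) if the name looks Hungarian-style, else None."""
    scope = ""
    name = identifier
    if name.startswith("m_"):      # MFC-style member marker, e.g. m_bActive
        scope = "m_"
        name = name[2:]
    for prefix, meaning in PREFIXES:
        rest = name[len(prefix):]
        # A type prefix must be followed directly by a capitalized word.
        if name.startswith(prefix) and rest[:1].isupper():
            return scope + prefix, meaning
    return None


def is_c_prefixed_class(name: str) -> bool:
    """True for names such as CBlock or CNetwork (capital C, then a capital)."""
    return re.fullmatch(r"C[A-Z]\w*", name) is not None


if __name__ == "__main__":
    for ident in ["fFound", "nBestHeight", "mapWallet", "m_bActive", "value"]:
        print(ident, hungarian_prefix(ident))
    print(is_c_prefixed_class("CBlock"), is_c_prefixed_class("Class"))
```

A check like this will misfire on ordinary words that happen to fit the pattern, so the counts it produces are best treated as a rough signal.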

The remaining gap between Stokes and Satoshi — tabs vs. 4 spaces for indentation, and if ( x ) with extra interior spaces vs. Satoshi's if (x) — is the smallest discrepancy of any candidate. Indentation style is also the trait most easily changed by a developer (via editor settings or project conventions), making it a weaker signal than naming conventions or comment style.

Caveat: There is no known connection between Michael Stokes and the cypherpunk or cryptography community. Being a stylometric match does not confirm identity — it means the code was written by someone with the same development background: a Windows MFC C++ peer-to-peer application developer. Satoshi could be Stokes, someone who worked alongside him, or any developer from the same narrow ecosystem of Windows P2P application builders in the 2000s. Code stylometry identifies the type of developer, not the specific person.
Linguistic Evidence: British/Commonwealth English

Satoshi Used Commonwealth English — and So Does Stokes

Satoshi's writings contain well-documented Commonwealth English patterns: "colour," "favour," "grey," "defence," "analyse," and the colloquialism "bloody hard." The Bitcoin whitepaper uses "favour." Bitcoin v0.1's UI code uses SetBackgroundColour and GetColour, although these are method names defined by the wxWidgets API rather than spellings Satoshi chose, so they are weaker evidence than his prose. These spellings are standard in the UK, Australia, New Zealand, and Canada.

Michael Stokes resides in Australia (confirmed on the Shareaza Wiki). Australian English uses British spellings as standard — "colour," "favour," "analyse," "defence." The word "bloody" is quintessentially Australian slang.

Shareaza's source code confirms Stokes writes in British/Australian English:

Word | British (Stokes) | American | Ratio
colour vs color | 302 | 61 | 83% British
initialise vs initialize | 2 | 48 | 4% British
serialise vs serialize | 0 | 403 | 0% British
Total British spellings | 354 instances across codebase

The pattern is telling: Stokes defaults to British spelling for words he chose himself (his own function names like CalculateColour(), GetColour(), OnSysColourChange()) but uses American spelling where forced by the Windows API (which uses "Color" in its own type names like COLORREF). This mixed British/American pattern is exactly what Satoshi's writing exhibits — and exactly what you'd expect from an Australian developer working with American APIs.

Satoshi's corpus shows the same mix: "favour" and "colour" (British) alongside "characterized" and "optimize" (American). This is natural for a Commonwealth English speaker immersed in American technology — you spell your own words in your native dialect but absorb American spellings from the tools you use daily.
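The ratio column in the Shareaza spelling table is simply the British count divided by the combined count for each word pair. A minimal sketch of that arithmetic, using the counts from the table (the function name british_share is an assumption made for illustration):

```python
def british_share(british: int, american: int) -> float:
    """Fraction of occurrences that use the British spelling."""
    total = british + american
    return british / total if total else 0.0


# Counts copied from the Shareaza spelling table.
counts = {
    "colour vs color": (302, 61),
    "initialise vs initialize": (2, 48),
    "serialise vs serialize": (0, 403),
}

for pair, (british, american) in counts.items():
    print(f"{pair}: {british_share(british, american):.0%} British")
```

The same function could be applied to counts taken from Satoshi's posts to compare the two mixes word by word.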

Key forum evidence still needed: Stokes was active on the Gnutella Developer Forum (GDF), which was hosted on Yahoo Groups (groups.yahoo.com/group/the_gdf/). Yahoo Groups was shut down in 2020, but archives may have been preserved. Stokes' GDF forum posts would allow a direct writing style comparison to Satoshi's bitcointalk.org posts — looking at sentence structure, vocabulary, punctuation habits (double-spacing after periods), and the mixed British/American spelling pattern.
Satoshi's Timezone — Who Was Awake?

Posting & Commit Timestamp Analysis

Multiple studies have analyzed 742+ timestamps from Satoshi's bitcointalk posts, SourceForge commits, and emails (Oct 2008 – Dec 2010). The findings are consistent:

Sleep window (near-zero activity): ~05:00 – 11:00 UTC
Peak activity: ~14:00 – 23:00 UTC

Period | UTC | London (GMT) | US Eastern (EST) | US Pacific (PST) | Australia AEST (UTC+10)
Sleep | 05:00–11:00 | 5 AM – 11 AM | 12 AM – 6 AM | 9 PM – 3 AM | 3 PM – 9 PM
Peak | 14:00–23:00 | 2 PM – 11 PM | 9 AM – 6 PM | 6 AM – 3 PM | 12 AM – 9 AM
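The local-time columns follow from adding a fixed offset to the UTC hour. The sketch below reproduces that conversion under the table's own simplification of standard-time offsets only (daylight saving time is ignored). The ZONES mapping and function names are illustrative assumptions.

```python
# Fixed standard-time offsets from UTC, in hours, matching the table's
# column labels. Daylight saving time is deliberately ignored here.
ZONES = {
    "London (GMT)": 0,
    "US Eastern (EST)": -5,
    "US Pacific (PST)": -8,
    "Australia AEST": 10,
}


def local_hour(utc_hour: int, offset_hours: int) -> str:
    """Convert a UTC hour (0-23) to a 12-hour local clock label."""
    hour = (utc_hour + offset_hours) % 24
    suffix = "AM" if hour < 12 else "PM"
    return f"{hour % 12 or 12} {suffix}"


def local_window(start_utc: int, end_utc: int, offset_hours: int) -> str:
    """Render a UTC window as a local-time range."""
    return f"{local_hour(start_utc, offset_hours)} - {local_hour(end_utc, offset_hours)}"


if __name__ == "__main__":
    for zone, offset in ZONES.items():
        print(f"{zone}: sleep {local_window(5, 11, offset)}, "
              f"peak {local_window(14, 23, offset)}")
```

Because Satoshi's activity spans both winter and summer months, a fuller treatment would use real timezone rules rather than fixed offsets.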

Timezone Assessment

Note on Michael Stokes: Stokes' 2004–2009 whereabouts are poorly documented. He was CTO at Mercora, Inc. in Sunnyvale, California (~2003–2004). If he remained in the US through 2008–2009, US Pacific or Eastern time would apply, and both are compatible with Satoshi's posting pattern. He is confirmed back in Australia by 2010, which coincides with Satoshi's withdrawal from active development (declining activity from mid-2010, final forum post in December 2010, last known emails in April 2011). The timeline gap is the most intriguing aspect of the Stokes case.

Methodology

Source code analyzed:

Project | Author | Source
Bitcoin v0.1.0 | Satoshi Nakamoto (Jan 2009) | github.com/trottier/original-bitcoin
Shareaza | Michael Stokes (2002–2017) | github.com/jason-jxc/Shareaza
DSPFilters | Vincent Falco (2009) | github.com/vinniefalco/DSPFilters
Crypto++ | Wei Dai (1995–present) | github.com/weidai11/cryptopp
E4M / TrueCrypt | Paul Le Roux (1998–2000) | github.com/FreeApophis/TrueCrypt
NewCoin | Jed McCaleb (Oct 2011) | github.com/XRPLF/rippled first commit
Mixmaster | Len Sassaman et al. | github.com/crooks/mixmaster
RPOW | Hal Finney (2004) | github.com/NakamotoInstitute/RPOW
cryptlib | Peter Gutmann | github.com/cryptlib/cryptlib
hashcash | Adam Back (~2002–2005) | github.com/hashcash-org/hashcash
scheduler | Gavin Andresen (2015) | github.com/gavinandresen/scheduler
IBLT_Cplusplus | Gavin Andresen | github.com/gavinandresen/IBLT_Cplusplus

Analysis performed by comparing coding conventions across 12 stylometric traits: language choice, comment style, naming conventions, class prefix patterns, variable naming conventions, indentation, control flow spacing, return statement style, brace placement, function declaration format, platform orientation, and pointer/memory management patterns.

Shareaza corpus: 527 .cpp files, 675 .h files. E4M/TrueCrypt corpus: Mount.c, Dlgcode.c (derived from E4M 2.02a). cryptlib corpus: .c files with 159 block comment instances and 0 line comment instances.

Several additional candidates were also investigated: Gavin Andresen, Ray Dillinger, Phil Wilson/Scronty, Ian Grigg, Dustin Trammell, Mike Hearn, Martti Malmi, and David Chaum. Of these, only Gavin Andresen had substantial public C++ code suitable for comparison. The others write in C, Java, Ruby, Assembly, or TypeScript, or have no public code.

Generated April 2026. Code stylometry is one analytical lens — it can rule candidates out but cannot definitively confirm identity. Writing style can be deliberately altered.