Bitcoin Technical Architecture: Complete Component Breakdown

Bitcoin represents a sophisticated distributed system combining cryptographic primitives, peer-to-peer networking, economic incentives, and consensus mechanisms into the world's first successful decentralized digital currency. This comprehensive technical breakdown examines every component of the Bitcoin system at the protocol level.

Core blockchain architecture and data structures

Bitcoin's blockchain operates as an ordered, back-linked list of blocks stored in flat files or databases. Each block references its predecessor through a 32-byte double-SHA-256 hash in its header, creating a tamper-evident chain back to the genesis block (hash: 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f). Bitcoin Core has used Google's LevelDB for its databases since version 0.8, organizing data across multiple databases: the chainstate database maintains the UTXO set, while the block index database maps block hashes to disk locations.

The block structure consists of an 80-byte header containing six fields: a 4-byte version number, 32-byte previous block hash, 32-byte merkle root, 4-byte Unix timestamp, 4-byte difficulty target (nBits), and 4-byte nonce. Following the header, blocks contain a variable-length list of transactions preceded by a CompactSize count. With SegWit activation, blocks separate witness data from the base transaction data. The block limit is expressed as 4,000,000 weight units, so the total serialized size can approach 4MB in the extreme case, while legacy nodes still see at most 1MB of base data, which preserves backward compatibility.
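As a minimal sketch of the header layout described above, the following serializes the six fields and double-hashes the result. The genesis field values (merkle root, timestamp, bits, nonce) are the widely published ones and are not taken from this text; the function names are illustrative.

```python
import hashlib
import struct


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, as used for block and transaction hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def serialize_header(version, prev_hash_hex, merkle_root_hex, timestamp, bits, nonce):
    """Pack the six header fields into 80 bytes.

    Hashes are displayed big-endian by convention but serialized
    byte-reversed, hence the [::-1].
    """
    return (
        struct.pack("<i", version)
        + bytes.fromhex(prev_hash_hex)[::-1]
        + bytes.fromhex(merkle_root_hex)[::-1]
        + struct.pack("<III", timestamp, bits, nonce)
    )


# Widely published genesis block header fields (assumed, not from the text).
GENESIS_HEADER = serialize_header(
    version=1,
    prev_hash_hex="00" * 32,
    merkle_root_hex="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    timestamp=1231006505,
    bits=0x1D00FFFF,
    nonce=2083236893,
)

# Block hashes are shown byte-reversed relative to the raw digest.
GENESIS_HASH = sha256d(GENESIS_HEADER)[::-1].hex()
```

If the field values are right, `GENESIS_HASH` should reproduce the genesis hash quoted earlier in this section.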

Merkle trees provide efficient transaction commitment and verification. Bitcoin constructs binary hash trees by recursively hashing pairs of transaction hashes with SHA-256(SHA-256(data)), duplicating the last hash at any level that has an odd number of entries. This enables O(log n) inclusion proofs: proving a transaction exists in a block requires only the transaction hash plus the sibling hashes along the path to the root. SegWit introduced a parallel witness merkle tree committed via the coinbase transaction.
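The construction and the O(log n) proof can be sketched as follows. This is an illustrative implementation operating on raw 32-byte hashes; the function names and the (sibling, is_left) proof format are assumptions made for this example, not a wire format.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level):
    """Hash adjacent pairs, duplicating the last hash if the count is odd."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Return [(sibling_hash, sibling_is_left), ...] from leaf to root."""
    proof = []
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1  # the paired node at this level
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_proof(leaf, proof, root):
    current = leaf
    for sibling, sibling_is_left in proof:
        pair = sibling + current if sibling_is_left else current + sibling
        current = sha256d(pair)
    return current == root
```

For a block with five transactions the proof contains three sibling hashes, one per tree level.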

The UTXO (Unspent Transaction Output) model fundamentally differs from account-based systems. Bitcoin tracks discrete, indivisible outputs that must be consumed entirely when spent. The chainstate database maintains approximately 84 million UTXOs totaling ~5GB serialized data. Each UTXO entry uses the key format 0x43 + little-endian TXID + varint(vout) and stores the block height, coinbase flag, amount in satoshis, and compressed script data. Custom compression reduces storage by ~80%, particularly for common script types like P2PKH.
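A toy model may help make the "consumed entirely" rule concrete. The sketch below keeps the UTXO set as a plain dictionary keyed by (txid, vout) with amounts in satoshis; it ignores scripts and signatures entirely, and all names are hypothetical.

```python
def apply_transaction(utxos, txid, inputs, outputs):
    """Spend `inputs` and create `outputs` in a toy UTXO set.

    utxos:   dict mapping (txid, vout) -> amount in satoshis (mutated in place)
    inputs:  list of (txid, vout) outpoints being spent
    outputs: list of output amounts in satoshis
    Returns the implied fee (inputs minus outputs).
    """
    if len(set(inputs)) != len(inputs):
        raise ValueError("an outpoint is spent twice in the same transaction")
    for outpoint in inputs:
        if outpoint not in utxos:
            raise ValueError(f"missing or already spent: {outpoint}")

    total_in = sum(utxos[outpoint] for outpoint in inputs)
    total_out = sum(outputs)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs")

    # Each input is consumed whole; any change must be an explicit output.
    for outpoint in inputs:
        del utxos[outpoint]
    for vout, amount in enumerate(outputs):
        utxos[(txid, vout)] = amount
    return total_in - total_out
```

Paying 30,000 satoshis from a 100,000-satoshi output therefore needs a second output returning the change; whatever is left unassigned becomes the fee.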

Transaction structure, types, and scripting system

Bitcoin transactions follow a precise byte-level format. Legacy transactions contain version (4 bytes), input count (CompactSize), inputs, output count, outputs, and locktime (4 bytes). Each input specifies a previous transaction output via TXID and index, provides an unlocking script (scriptSig), and includes a sequence number. Outputs contain an 8-byte amount and locking script (scriptPubKey). SegWit transactions add a marker (0x00) and flag (0x01) after the version, moving signatures to a separate witness section that doesn't affect the TXID calculation.
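The CompactSize counts used for input and output lists are variable-length integers. A minimal encoder and decoder, with illustrative function names, might look like this:

```python
import struct


def encode_compact_size(n: int) -> bytes:
    """Encode a non-negative integer as a CompactSize."""
    if n < 0:
        raise ValueError("CompactSize cannot encode negative numbers")
    if n < 0xFD:
        return bytes([n])                      # 1 byte, value stored directly
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)  # marker + 2 bytes little-endian
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)  # marker + 4 bytes little-endian
    if n <= 0xFFFFFFFFFFFFFFFF:
        return b"\xff" + struct.pack("<Q", n)  # marker + 8 bytes little-endian
    raise ValueError("value too large for CompactSize")


def decode_compact_size(data: bytes, offset: int = 0):
    """Return (value, next_offset) for the CompactSize starting at `offset`."""
    first = data[offset]
    if first < 0xFD:
        return first, offset + 1
    width, fmt = {0xFD: (2, "<H"), 0xFE: (4, "<I"), 0xFF: (8, "<Q")}[first]
    (value,) = struct.unpack_from(fmt, data, offset + 1)
    return value, offset + 1 + width
```

Most transactions have fewer than 253 inputs or outputs, so the count usually occupies a single byte.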

The script system implements a Forth-like, stack-based language deliberately designed to be Turing-incomplete. Scripts execute left-to-right using approximately 100 opcodes covering constants, flow control, stack operations, arithmetic, and cryptography. Notable opcodes include OP_DUP (0x76), OP_HASH160 (0xa9), OP_EQUALVERIFY (0x88), OP_CHECKSIG (0xac), OP_CHECKLOCKTIMEVERIFY (0xb1), and OP_CHECKSEQUENCEVERIFY (0xb2). Many opcodes like OP_CAT and OP_MUL were disabled due to DoS vulnerabilities.

Standard transaction types follow specific templates:

  • P2PKH (Pay-to-Public-Key-Hash): 25-byte script OP_DUP OP_HASH160 <20-byte hash> OP_EQUALVERIFY OP_CHECKSIG
  • P2SH (Pay-to-Script-Hash): 23-byte script OP_HASH160 <20-byte hash> OP_EQUAL enabling arbitrary scripts
  • P2WPKH (Pay-to-Witness-Public-Key-Hash): 22-byte SegWit script OP_0 <20-byte hash>
  • P2WSH (Pay-to-Witness-Script-Hash): 34-byte script OP_0 <32-byte hash>
  • P2TR (Pay-to-Taproot): 34-byte script OP_1 <32-byte output> with dual spending paths
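The byte lengths quoted for these templates follow directly from the opcode and push sizes. The sketch below builds each scriptPubKey from an already-computed hash or key; the opcode values for OP_0 (0x00), OP_1 (0x51), and OP_EQUAL (0x87) are not listed above and are supplied here, and the helper names are illustrative.

```python
OP_0, OP_1 = 0x00, 0x51
OP_DUP, OP_EQUAL, OP_EQUALVERIFY = 0x76, 0x87, 0x88
OP_HASH160, OP_CHECKSIG = 0xA9, 0xAC


def push(data: bytes, expected_len: int) -> bytes:
    """Direct data push: a single length byte followed by the data."""
    if len(data) != expected_len:
        raise ValueError(f"expected {expected_len} bytes, got {len(data)}")
    return bytes([len(data)]) + data


def p2pkh(pubkey_hash: bytes) -> bytes:
    return (bytes([OP_DUP, OP_HASH160]) + push(pubkey_hash, 20)
            + bytes([OP_EQUALVERIFY, OP_CHECKSIG]))


def p2sh(script_hash: bytes) -> bytes:
    return bytes([OP_HASH160]) + push(script_hash, 20) + bytes([OP_EQUAL])


def p2wpkh(pubkey_hash: bytes) -> bytes:
    return bytes([OP_0]) + push(pubkey_hash, 20)


def p2wsh(script_hash: bytes) -> bytes:
    return bytes([OP_0]) + push(script_hash, 32)


def p2tr(output_key: bytes) -> bytes:
    return bytes([OP_1]) + push(output_key, 32)
```

With a 20-byte placeholder hash, `p2pkh(bytes(20)).hex()` begins with `76a914` and ends with `88ac`, the familiar pattern seen in raw legacy outputs.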

Transaction malleability plagued Bitcoin until SegWit. Third parties could modify signatures without invalidating them, changing TXIDs and breaking dependent transactions. SegWit solved this for transactions that spend only SegWit outputs by excluding witness data from TXID calculations. Beyond that, a transaction can still be altered where the signature does not commit to all of it, for example with SIGHASH flags like SIGHASH_NONE or SIGHASH_SINGLE, and legacy inputs remain malleable as before.

Cryptographic foundations

Bitcoin employs multiple cryptographic primitives. SHA-256 serves as the primary hash function, typically applied twice (HASH256), a choice commonly explained as protection against length extension attacks. Mining uses double SHA-256 for proof-of-work, and transactions and blocks are identified by their double-SHA-256 hashes. RIPEMD-160 combines with SHA-256 (HASH160) for address generation, producing shorter 160-bit outputs.

ECDSA signatures use the secp256k1 elliptic curve with parameters:

  • Field prime p = 2^256 - 2^32 - 977
  • Curve equation: y² = x³ + 7
  • Generator order n ≈ 2^256
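These parameters can be checked numerically. The sketch below uses the standard published generator coordinates and group order (not given in the text above) and a deliberately simple, non-constant-time double-and-add routine; it is meant for illustration and is not suitable for handling real keys.

```python
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def is_on_curve(point) -> bool:
    """Check y^2 = x^3 + 7 (mod p); None represents the point at infinity."""
    if point is None:
        return True
    x, y = point
    return (y * y - x * x * x - 7) % P == 0


def point_add(a, b):
    if a is None:
        return b
    if b is None:
        return a
    x1, y1 = a
    x2, y2 = b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a) = infinity
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord line
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mul(k: int, point):
    """Double-and-add scalar multiplication (not constant time)."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result
```

A public key is `scalar_mul(private_key, G)`; multiplying G by the group order n yields the point at infinity, which is what "generator order" means. The modular inverse via `pow(x, -1, P)` requires Python 3.8 or later.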

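Signatures consist of (r, s) values encoded in DER format (typically 70-72 bytes, plus one sighash-type byte). The signing process requires an unpredictable nonce k, either drawn from a cryptographically secure random source or derived deterministically (as in RFC 6979); reusing k across two messages enables private key recovery. To limit malleability, strict DER encoding is enforced by consensus (BIP-66), while low-S values are enforced by standardness policy.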
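The nonce-reuse risk follows from simple algebra on the signing equation s = k⁻¹(z + r·d) mod n, where z is the message hash and d the private key. The sketch below demonstrates the recovery using modular arithmetic only. As a simplifying assumption, r is passed in as an arbitrary number rather than derived from k·G, which does not affect the algebra; all function names are illustrative.

```python
# secp256k1 group order (standard published value).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def ecdsa_s(d: int, z: int, k: int, r: int) -> int:
    """s = k^-1 * (z + r*d) mod n. In real ECDSA, r = (k*G).x mod n."""
    return pow(k, -1, N) * (z + r * d) % N


def recover_key_from_reused_nonce(r, s1, z1, s2, z2):
    """Recover (k, d) from two signatures that share the same nonce.

    s1 - s2 = k^-1 * (z1 - z2)   =>  k = (z1 - z2) / (s1 - s2)
    s1 * k  = z1 + r * d         =>  d = (s1 * k - z1) / r
    """
    k = (z1 - z2) * pow((s1 - s2) % N, -1, N) % N
    d = (s1 * k - z1) * pow(r, -1, N) % N
    return k, d


def normalize_low_s(s: int) -> int:
    """Map s to the lower half of the range; (r, s) and (r, n - s) both verify."""
    return s if s <= N // 2 else N - s
```

Note that if one of the two signatures had been low-S normalized, an attacker would simply try both s and n − s; normalization addresses malleability, not nonce reuse.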

Schnorr signatures (BIP-340) activated with Taproot provide significant improvements: fixed 64-byte encoding, provable security, non-malleability, and signature aggregation capability. Schnorr enables advanced protocols like MuSig2 for n-of-n multisignatures that appear as single signatures on-chain.

Address generation varies by type:

  • P2PKH: Base58Check(0x00 || HASH160(pubkey)) produces "1" addresses
  • P2SH: Base58Check(0x05 || HASH160(script)) produces "3" addresses
  • Native SegWit: Bech32 encoding with "bc1" prefix
  • Taproot: Bech32m encoding with "bc1p" prefix using x-only public keys
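The Base58Check step for the first two address types can be sketched as below. The function takes an already-computed 20-byte HASH160 as input, because RIPEMD-160 is not reliably available through Python's `hashlib` on every system; the function names are illustrative.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def base58check(version: int, payload: bytes) -> str:
    """Encode version byte + payload + 4-byte checksum in Base58."""
    data = bytes([version]) + payload
    data += sha256d(data)[:4]  # checksum: first 4 bytes of double SHA-256

    # Convert the byte string to a base-58 number.
    num = int.from_bytes(data, "big")
    encoded = ""
    while num:
        num, rem = divmod(num, 58)
        encoded = BASE58_ALPHABET[rem] + encoded

    # Each leading zero byte is represented by a literal "1".
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def p2pkh_address(pubkey_hash160: bytes) -> str:
    return base58check(0x00, pubkey_hash160)


def p2sh_address(script_hash160: bytes) -> str:
    return base58check(0x05, script_hash160)
```

The version byte determines the leading character: 0x00 is itself a leading zero byte and so encodes as "1", while 0x05 places the resulting number in the range whose Base58 form starts with "3".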

HD wallets (BIP-32) derive hierarchical key trees from a master seed using HMAC-SHA512. Standard derivation paths include BIP-44 (m/44'/0'/account'/change/index) for P2PKH and BIP-84 (m/84'/0'/account'/change/index) for native SegWit.

Network protocol and peer-to-peer communication

The Bitcoin network operates on a peer-to-peer protocol using TCP connections on port 8333 (mainnet). All messages follow a standard 24-byte header format: 4-byte network magic (0xf9beb4d9 for mainnet), 12-byte ASCII command, 4-byte payload size, and 4-byte checksum (first 4 bytes of double-SHA256).
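A short sketch of how the 24-byte header is assembled, using the mainnet magic quoted above. The function name is illustrative, and only framing is shown, not payload serialization.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def build_message(command: str, payload: bytes = b"") -> bytes:
    """Frame a payload with the 24-byte P2P message header."""
    name = command.encode("ascii")
    if len(name) > 12:
        raise ValueError("command name is limited to 12 bytes")
    header = (
        MAINNET_MAGIC
        + name.ljust(12, b"\x00")          # NUL-padded ASCII command
        + struct.pack("<I", len(payload))  # payload size, little-endian
        + sha256d(payload)[:4]             # checksum over the payload only
    )
    return header + payload
```

Messages with an empty payload, such as verack, always carry the checksum `5df6e0e2`, the first four bytes of the double SHA-256 of the empty string.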

Core message types include:

  • version/verack: Handshake establishing protocol version and capabilities
  • inv/getdata/tx/block: Transaction and block relay
  • headers/getheaders: Headers-first synchronization
  • addr/getaddr: Peer discovery and address relay
  • ping/pong: Connection keepalive with 8-byte nonce
  • sendcmpct/cmpctblock: Compact block relay (BIP-152)

Peer discovery begins with DNS seeds (hardcoded hostnames returning active node IPs) or fallback hardcoded seed nodes. Nodes exchange addresses via addr messages, maintaining tables of ~3000 addresses organized in anti-Sybil buckets. By default, Bitcoin Core opens 10 outbound connections (8 full-relay plus 2 block-relay-only) and caps total connections at 125, leaving the remaining slots for inbound peers.

Initial Block Download (IBD) uses headers-first synchronization. Nodes request headers via getheaders messages containing block locators (exponentially spaced block hashes). After validating the header chain, nodes download full blocks in parallel from multiple peers using a 1024-block sliding window. The assumevalid feature skips signature verification before a specified block to accelerate syncing.
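The "exponentially spaced" locator can be illustrated with block heights. The sketch below follows the commonly described scheme of listing the 10 most recent blocks and then doubling the step back to genesis; treat the exact constants as an assumption rather than a specification, and note that a real locator contains block hashes, not heights.

```python
def block_locator_heights(tip_height: int):
    """Heights whose hashes would go into a getheaders block locator."""
    heights = []
    step = 1
    height = tip_height
    while height > 0:
        heights.append(height)
        # After the first 10 entries, double the gap on every step.
        if len(heights) >= 10:
            step *= 2
        height -= step
    heights.append(0)  # always end with the genesis block
    return heights
```

Because the gaps double, the locator grows only logarithmically with chain height, yet a peer on a different fork can still find a recent common ancestor.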

Compact blocks (BIP-152) reduce bandwidth by ~90%. Instead of full transactions, nodes send 6-byte short IDs calculated via SipHash-2-4. High-bandwidth mode sends compact blocks immediately for 0.5*RTT propagation, while low-bandwidth mode waits for requests. Version 2 uses WTXIDs for SegWit compatibility.

Consensus mechanism and mining

Bitcoin's Proof-of-Work consensus requires miners to find a block header whose hash, interpreted as a 256-bit number, does not exceed the current target. Miners iterate through the 32-bit nonce field, and when it is exhausted, modify the coinbase transaction's extra nonce to change the merkle root. The double-SHA256 hash must satisfy: SHA256(SHA256(header)) ≤ target.

The difficulty adjustment algorithm recalculates the target every 2,016 blocks (~2 weeks). The formula new_target = old_target × (actual_time / expected_time) maintains 10-minute average block times. The measured time is clamped to between one quarter and four times the expected two weeks, so the target can change by at most a factor of 4 in either direction per adjustment. The compact "bits" representation uses a 1-byte exponent and a 3-byte mantissa: target = mantissa × 256^(exponent-3).
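The compact encoding and the retarget formula translate into a few lines of integer arithmetic. This is a simplified sketch: it ignores the sign bit of the compact format and details of how the elapsed time is measured, and the function names are illustrative.

```python
EXPECTED_TIMESPAN = 2016 * 10 * 60  # two weeks in seconds


def bits_to_target(bits: int) -> int:
    """Expand compact 'bits' into the full 256-bit target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF  # low 23 bits; the sign bit is ignored here
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


MAX_TARGET = bits_to_target(0x1D00FFFF)  # easiest allowed target (difficulty 1)


def retarget(old_target: int, actual_timespan: int) -> int:
    """new_target = old_target * actual / expected, clamped to a factor of 4."""
    clamped = min(max(actual_timespan, EXPECTED_TIMESPAN // 4),
                  EXPECTED_TIMESPAN * 4)
    return min(old_target * clamped // EXPECTED_TIMESPAN, MAX_TARGET)


def meets_target(block_hash_hex: str, bits: int) -> bool:
    """Proof-of-work check: the hash as an integer must not exceed the target."""
    return int(block_hash_hex, 16) <= bits_to_target(bits)
```

If blocks arrived twice as fast as intended, `actual_timespan` is one week and the target halves, which doubles the difficulty.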

Mining evolved from CPU mining (on the order of a few MH/s) through GPUs (~300 MH/s) and FPGAs (~800 MH/s) to modern ASICs achieving 100+ TH/s. Current-generation ASICs like the Antminer S21 reach about 200 TH/s at roughly 17.5 J/TH. The network hashrate exceeds 500 EH/s, requiring specialized facilities with megawatt-scale power infrastructure.

Block validation enforces numerous consensus rules:

  • Maximum 4,000,000 weight units per block
  • Valid proof-of-work below target difficulty
  • Timestamp greater than the median of the previous 11 blocks and no more than 2 hours in the future
  • Coinbase transaction as first transaction with correct subsidy
  • All transactions valid with no double-spending
  • Merkle root matches transaction tree

The halving schedule reduces block rewards every 210,000 blocks:

  • 2009-2012: 50 BTC
  • 2012-2016: 25 BTC
  • 2016-2020: 12.5 BTC
  • 2020-2024: 6.25 BTC
  • 2024-2028: 3.125 BTC

Total supply converges to just under 21 million BTC, with the last satoshi expected to be mined around 2140.
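The schedule above reduces to an integer right shift per halving. The following sketch, with illustrative names, computes the subsidy at a given height and sums the whole series in satoshis.

```python
COIN = 100_000_000          # satoshis per BTC
HALVING_INTERVAL = 210_000  # blocks between halvings


def block_subsidy(height: int) -> int:
    """Block subsidy in satoshis at the given height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0  # guard so the shift below stays within 64 bits
    return (50 * COIN) >> halvings  # integer halving truncates fractions


def total_supply() -> int:
    """Sum of all subsidies ever issued, in satoshis."""
    total = 0
    era = 0
    while True:
        subsidy = block_subsidy(era * HALVING_INTERVAL)
        if subsidy == 0:
            return total
        total += subsidy * HALVING_INTERVAL
        era += 1
```

Because each halving truncates to whole satoshis, the sum comes to 20,999,999.9769 BTC rather than exactly 21 million.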

Economic incentives and fee market mechanics

Bitcoin's fee market operates through competitive bidding for limited block space. Fees are measured in satoshis per virtual byte (sat/vB), where witness data receives a 75% discount. The mempool maintains transactions sorted by fee rate using ancestor scores that consider dependent transaction packages.
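The 75% witness discount comes from the weight formula introduced with SegWit: non-witness bytes count four weight units each, witness bytes one. A small sketch with illustrative function names:

```python
def tx_weight(base_size: int, total_size: int) -> int:
    """Weight units: base_size * 3 + total_size.

    base_size:  serialized size without witness data, in bytes
    total_size: serialized size including witness data, in bytes
    """
    return base_size * 3 + total_size


def tx_vsize(base_size: int, total_size: int) -> int:
    """Virtual size in vbytes: weight / 4, rounded up."""
    return -(-tx_weight(base_size, total_size) // 4)


def fee_rate(fee_sats: int, base_size: int, total_size: int) -> float:
    """Fee rate in sat/vB."""
    return fee_sats / tx_vsize(base_size, total_size)
```

A transaction with 100 base bytes and 200 witness bytes has a weight of 600 and a virtual size of 150 vB, so the 200 witness bytes cost the same as 50 non-witness bytes.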

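Replace-by-Fee (RBF) allows transaction replacement with higher fees if the original signals replaceability under BIP-125 (at least one input with nSequence < 0xfffffffe). The replacement must pay at least the absolute fees of the transactions it replaces, plus enough additional fee to cover its own size at the incremental relay fee rate (1 sat/vB by default).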
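The signaling test and a fee check can be sketched as follows. The fee check follows BIP-125 rules 3 and 4 as they are commonly summarized (pay at least the replaced fees, plus the incremental relay rate times the replacement's own size); it omits the other BIP-125 rules, and the function and parameter names are hypothetical.

```python
MAX_NON_SIGNALING_SEQUENCE = 0xFFFFFFFE


def signals_rbf(input_sequences) -> bool:
    """True if any input's nSequence opts in to replacement."""
    return any(seq < MAX_NON_SIGNALING_SEQUENCE for seq in input_sequences)


def replacement_fee_ok(replaced_fees: int, new_fee: int, new_vsize: int,
                       incremental_rate: int = 1) -> bool:
    """Simplified fee check for a replacement transaction.

    replaced_fees:    total fees (sats) of the transactions being replaced
    new_fee:          fee (sats) of the replacement
    new_vsize:        virtual size (vB) of the replacement
    incremental_rate: incremental relay fee in sat/vB (assumed default of 1)
    """
    return new_fee >= replaced_fees + incremental_rate * new_vsize
```

For example, replacing a transaction that paid 1,000 sats with a 150 vB replacement would need at least 1,150 sats under this simplified check.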

Child-Pays-for-Parent (CPFP) enables fee bumping by spending unconfirmed outputs with high fees. Miners evaluate the package fee rate: (parent_fee + child_fee) / (parent_size + child_size). Ancestor/descendant limits (25 transactions, 101KB) prevent abuse.
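The package fee rate formula, and the fee a child needs in order to lift a stuck parent to a target rate, can be worked through in a few lines. The names are illustrative and sizes are in virtual bytes.

```python
import math


def package_fee_rate(txs) -> float:
    """Fee rate of a package given [(fee_sats, vsize), ...]."""
    total_fee = sum(fee for fee, _ in txs)
    total_size = sum(size for _, size in txs)
    return total_fee / total_size


def child_fee_needed(parent_fee: int, parent_vsize: int,
                     child_vsize: int, target_rate: float) -> int:
    """Minimum child fee so that parent + child reach `target_rate` sat/vB."""
    required_total = math.ceil(target_rate * (parent_vsize + child_vsize))
    return max(0, required_total - parent_fee)
```

A 200 vB parent paying 1 sat/vB needs a 100 vB child paying 2,800 sats for the pair to reach 10 sat/vB; the child alone pays 28 sat/vB to make up for the parent.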

The mempool uses a multi-index container tracking transaction relationships. When exceeding the 300MB default limit, nodes evict lowest fee-rate packages. The minimum relay fee dynamically adjusts based on mempool congestion with a 12-hour half-life decay.

Fee estimation algorithms track confirmation times across fee rate buckets using exponentially weighted moving averages. Bitcoin Core's estimatesmartfee returns rates achieving 85-95% confirmation probability within target blocks.

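Game theory is intended to keep honest mining more profitable than attacking. A 51% attack is estimated to require ~$25 million in daily operational costs plus billions in hardware investment. Under the attacker model in the Bitcoin whitepaper, an attacker with 10% of the hashrate has roughly a 0.02% chance of reversing a transaction buried under 6 confirmations. As block subsidies decline, transaction fees must make up a growing share of miner revenue to maintain the security budget.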
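The confirmation-depth security figure can be reproduced from the attacker-success calculation in section 11 of the Bitcoin whitepaper. The sketch below is a direct transcription of that formula into Python; the function name is illustrative.

```python
import math


def attacker_success_probability(q: float, z: int) -> float:
    """Probability that an attacker with hashrate share q ever catches up
    from z blocks behind (Nakamoto's Poisson approximation)."""
    p = 1.0 - q
    if q >= p:
        return 1.0  # a majority attacker always catches up eventually
    lam = z * (q / p)  # expected attacker progress while z blocks are mined
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total
```

For q = 0.1 the result falls from about 0.2 at one confirmation to about 0.0002 at six, which is the basis for the customary 6-confirmation wait.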

Major protocol upgrades and activation mechanisms

Segregated Witness (BIP-141/143/144) activated August 2017 after contentious community debates. SegWit separated signature data into a witness structure, fixing transaction malleability for SegWit spends and replacing the 1MB size limit with a 4,000,000 weight-unit limit (up to roughly 4MB of serialized data in the extreme case). The upgrade introduced native witness outputs (P2WPKH/P2WSH) and enabled Lightning Network development.

Taproot (BIP-340/341/342) activated November 2021 via Speedy Trial. The upgrade introduced Schnorr signatures, Taproot outputs with dual key/script paths, and Tapscript improvements. All P2TR outputs appear identical on-chain, enhancing privacy while enabling complex smart contracts through MAST (Merkelized Abstract Syntax Trees).

Activation mechanisms evolved from simple flag days to sophisticated signaling:

  • BIP-9: Version bits allowing parallel soft fork proposals with 95% miner thresholds
  • BIP-8: Height-based activation with mandatory signaling option (LOT parameter)
  • UASF: User-activated soft forks like BIP-148 forcing SegWit activation
  • Speedy Trial: Short 3-month signaling periods with 90% thresholds

Notable BIP implementations include:

  • BIP-32: Hierarchical Deterministic wallets
  • BIP-39: Mnemonic seed phrases
  • BIP-44/49/84: Standard derivation paths
  • BIP-65: CHECKLOCKTIMEVERIFY for absolute timelocks
  • BIP-112: CHECKSEQUENCEVERIFY for relative timelocks
  • BIP-125: Replace-by-Fee signaling
  • BIP-152: Compact block relay
  • BIP-174: Partially Signed Bitcoin Transactions (PSBT)

Lightning Network integration

The Lightning Network enables instant, low-fee Bitcoin payments through bidirectional payment channels. Channels use 2-of-2 multisig addresses with pre-signed commitment transactions that can be broadcast if either party disappears.

Commitment transactions maintain asymmetric states with revocation keys preventing old state broadcasts. Each commitment includes to_local and to_remote outputs plus HTLC outputs for in-flight payments. The to_self_delay (typically 144 blocks) allows the counterparty time to claim funds using penalty transactions if old states are broadcast.

HTLCs (Hash Time-Locked Contracts) enable trustless multi-hop payments. HTLCs lock funds that can be claimed with a preimage (payment secret) before timeout or refunded after. The onion routing protocol ensures payment privacy across the network.
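The claim-or-refund logic of a single HTLC can be modeled as a small decision function. This is a toy model of the rule as stated above, with a block height standing in for the timeout; it ignores signatures, revocation, and on-chain races, and all names are hypothetical.

```python
import hashlib


def htlc_resolution(payment_hash: bytes, timeout_height: int,
                    current_height: int, preimage: bytes = None) -> str:
    """Return 'claimed', 'refundable', or 'pending' for a single HTLC."""
    if current_height >= timeout_height:
        return "refundable"  # timeout reached: funds return to the sender
    if preimage is not None and hashlib.sha256(preimage).digest() == payment_hash:
        return "claimed"     # correct payment secret revealed before timeout
    return "pending"
```

In a multi-hop payment every hop uses the same payment hash, so the preimage revealed by the final recipient lets each intermediate node claim its incoming HTLC in turn.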

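Anchor outputs (option_anchors) address the problem of committing to a fee in advance by allowing CPFP fee bumping of commitment transactions. Each party gets a 330-satoshi anchor output that only it can spend immediately for fee adjustment; once the commitment transaction has confirmed and a short delay has passed, anyone may sweep an unspent anchor.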

Advanced features and optimizations

Bloom filters (BIP-37) allow SPV clients to request filtered blockchain data. Clients create probabilistic filters of their addresses, receiving only matching transactions. However, implementation flaws reduce the effective false positive rate, compromising privacy.

Compact block filters (BIP-157/158) provide better SPV privacy using Golomb-Rice coded sets. Unlike bloom filters, servers cannot determine client interests, though bandwidth requirements increase.

PSBT (BIP-174) standardizes partially signed transaction formats for multi-party signing workflows. The format separates creator, updater, signer, finalizer, and extractor roles, enabling hardware wallet integration and complex signing schemes.

AssumeUTXO accelerates initial sync by loading a hardcoded UTXO snapshot at a recent block height. Nodes begin operating immediately while background-validating the full history. The snapshot hash undergoes extensive review before inclusion in releases.

CoinJoin implementations like WabiSabi enable privacy through collaborative transactions where multiple users combine inputs/outputs with uniform amounts. PayJoin (P2EP) provides two-party mixing that appears as normal payments.

Node operations and database management

Bitcoin Core uses LevelDB for the chainstate (~5GB UTXO set) and block index databases. Block data is stored in flat blk*.dat files with corresponding rev*.dat undo files enabling reorganizations. The chainstate uses custom compression and XOR obfuscation.

Chain reorganizations occur when a valid chain with more cumulative proof-of-work appears. Nodes disconnect blocks back to the fork point using undo data, restore spent UTXOs, return valid transactions to the mempool, then connect the new chain. While no hard limit exists, transactions from reorgs deeper than 10 blocks aren't re-added to the mempool.

Resource management varies by node type:

  • Full nodes: ~500GB storage, growing ~50GB/year
  • Pruned nodes: Configurable from 550MB minimum
  • SPV clients: Headers only (~60MB)
  • Neutrino clients: Headers plus compact filters

The RPC interface provides JSON-RPC 2.0 access for blockchain queries, network management, mining operations, and wallet functions. Multi-wallet support allows independent wallet operations via named endpoints.

Performance optimizations include parallel script validation across CPU cores, signature/script caching, UTXO cache tuning (450MB default, benefits up to 8GB), and hardware SHA-256 acceleration. Nodes preferentially store the chainstate on SSDs while blocks can use cheaper HDDs.

Security model and attack vectors

Bitcoin's security relies on proof-of-work making attacks economically infeasible. A 51% attack requires ~$5-20 billion in hardware plus ~$25 million daily operating costs. The standard 6-confirmation wait provides strong probabilistic security against reorganization.

Network attacks include:

  • Eclipse attacks: Isolating nodes from honest peers (mitigated by connection limits and peer diversity)
  • Sybil attacks: Creating multiple identities (ineffective due to PoW requirements)
  • BGP hijacking: Redirecting network traffic (addressed by encryption proposals)

Transaction attacks encompass:

  • Transaction malleability: Modified signatures changing TXIDs (fixed by SegWit)
  • Fee sniping: Re-mining recent blocks for fees (discouraged by wallets setting nLockTime to the current block height)
  • Transaction pinning: Blocking fee bumps via RBF abuse (addressed by package relay development)

Mining attacks include:

  • Selfish mining: Withholding blocks for advantage (profitable above ~25% hashrate)
  • Block withholding: Reducing pool rewards while claiming shares
  • Time warp: Manipulating timestamps for easier difficulty

Privacy attacks use chain analysis combining transaction graph analysis, address clustering heuristics, and network traffic correlation. Commercial firms like Chainalysis provide blockchain surveillance tools. Defenses include CoinJoin, Lightning Network, and address reuse avoidance.

Quantum resistance remains a long-term concern. Shor's algorithm threatens ECDSA signatures while Grover's algorithm weakens SHA-256 to 128-bit security. Migration would require new address types using post-quantum signatures like SPHINCS+ and coordinated UTXO migration before quantum computers become viable.

Future developments and conclusion

Bitcoin continues evolving through conservative upgrades balancing innovation with stability. Active development areas include cross-input signature aggregation for transaction size reduction, SIGHASH_ANYPREVOUT enabling eltoo payment channels, covenant proposals like OP_CHECKTEMPLATEVERIFY for smart contracts, and privacy improvements through confidential transactions research.

The system's remarkable resilience over 16 years demonstrates the robustness of combining cryptographic security, economic incentives, and decentralized consensus. Bitcoin's layered architecture enables continuous improvement while maintaining backward compatibility and the core properties of digital scarcity, censorship resistance, and permissionless participation that define its value proposition as humanity's first successful decentralized digital currency.