Introduction

Supply chains are networks: companies source from suppliers, sell to customers, and share board members and investors. These relationships, represented as graphs, can contain useful signals for trading. Graph databases such as Neo4j are built to represent and query these networks. This article explores using graph databases to model and trade on entity relationships in finance.

Why Graphs for Financial Data

Relational and document databases represent data as tables or documents. This forces relationships to be secondary, encoded as foreign keys or embedded references. In contrast, graph databases treat relationships as first-class citizens. Queries like "find all companies two hops away from Apple in the supply chain" or "find insiders connected to multiple public companies" are natural in graphs but cumbersome in tables, where each hop becomes another join.

Financial Relationship Types

  • Supply chain: company A supplies components to company B
  • Ownership: investor X owns Y% of company C
  • Board relationships: person P serves on boards of companies Q and R
  • Business relationships: company A has contracts with company B
  • Geographic: companies in same region, supply chains spanning countries
  • Personnel: executives move between companies, carrying relationships

Data Model: Entities and Relationships

Design a financial graph with three types of nodes: companies (properties: ticker, name, sector, market cap), people (properties: name, title, insider status), and institutions (properties: name, type, such as PE firm or hedge fund). Relationships connect them: Person works_at Company, Company supplies_to Company, Person invested_in Company, Institution invested_in Company.
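The node and relationship types above can be sketched in plain Python before committing to a database. This is a minimal in-memory illustration, not a Neo4j schema: the class names, the `node_id` key, and the `neighbors` helper are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str   # illustrative unique key, e.g. a ticker or a person slug
    label: str     # "Company", "Person", or "Institution"
    props: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str    # node_id of the start node
    rel_type: str  # e.g. "works_at", "supplies_to", "invested_in"
    target: str    # node_id of the end node
    props: dict = field(default_factory=dict)


class FinancialGraph:
    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, edge):
        # Reject edges that point at unknown nodes so bad data fails early.
        if edge.source not in self.nodes or edge.target not in self.nodes:
            raise KeyError(f"unknown endpoint in {edge}")
        self.edges.append(edge)

    def neighbors(self, node_id, rel_type):
        """Targets reachable from node_id over one outgoing rel_type edge."""
        return [e.target for e in self.edges
                if e.source == node_id and e.rel_type == rel_type]


graph = FinancialGraph()
graph.add_node(Node("AAPL", "Company", {"name": "Apple"}))
graph.add_node(Node("TSM", "Company", {"name": "Taiwan Semiconductor"}))
graph.add_node(Node("tim_cook", "Person", {"name": "Tim Cook", "title": "CEO"}))
graph.add_edge(Edge("tim_cook", "works_at", "AAPL"))
graph.add_edge(Edge("TSM", "supplies_to", "AAPL"))

print(graph.neighbors("TSM", "supplies_to"))
```

Keeping properties in a free-form dict mirrors how a property graph stores them; a production model would likely validate required fields per label.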

Example Graph Structure

Apple (company) --CEO--> Tim Cook (person) --previously_worked_at--> Compaq (company). Apple --supplier--> Taiwan Semiconductor (company) --investor--> Warren Buffett (person) --investment--> Berkshire Hathaway (company). This graph enables queries like "which executives at Apple's suppliers have prior tech company experience?" or "which semiconductor suppliers have overlapping board members?"

Data Sources for Entity Relationships

SEC Filings (Insider and Director Data)

SEC EDGAR contains comprehensive insider and director information: Form 4 filings (officer/director trades), proxy statements (board composition, executive compensation), 8-K filings (company announcements). Extract relationships: officer_name works_at company_ticker, with fields: title, compensation, ownership percentage.

Supply Chain Data

Company 10-Ks and 10-Qs typically disclose major customers (those accounting for 10% or more of revenue) and sometimes name critical suppliers, so a supply edge is often easiest to find in the supplier's own filing. Extract: company_A supplies_to company_B with properties: revenue share, contract type, criticality. Combine with logistics data (shipping manifests, port data) for more timely supply chain visibility.

Investment Data

Institutional investor filings (Form 13F), venture capital databases, and private equity databases provide investor-company relationships. Create investor--invests_in-->company relationships with properties: investment amount, stake percentage, investment date.

Graph Queries for Trading Signals

Supply Chain Disruption Detection

When a major supplier announces bankruptcy, query the graph: "find all companies that depend on this supplier." Traverse supplier relationships to identify affected customers. Priority: companies with high supplier concentration (few alternative suppliers) are more vulnerable.

Trading signal: short affected companies (anticipate margin pressure or revenue disruption), long alternative suppliers (benefit from volume shift).
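The traversal described above can be sketched as a breadth-first walk over supplier relationships. The company names, input shares, and the `max_hops` cutoff below are hypothetical, and the ranking rule (direct exposure first, then larger input share) is only one possible way to express "high supplier concentration".

```python
from collections import deque

# Hypothetical data: supplier -> [(customer, share of the customer's inputs)]
SUPPLIES_TO = {
    "ChipCo": [("PhoneCo", 0.70), ("LaptopCo", 0.20)],
    "PhoneCo": [("RetailCo", 0.40)],
    "GlassCo": [("PhoneCo", 0.30)],
}


def affected_customers(failed, supplies_to, max_hops=2):
    """Walk downstream from a failed supplier.

    Returns {customer: (hops, share)}, where share is the customer's
    dependence on the company through which it was first reached.
    """
    seen = {}
    queue = deque([(failed, 0)])
    while queue:
        company, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for customer, share in supplies_to.get(company, []):
            if customer != failed and customer not in seen:
                seen[customer] = (hops + 1, share)
                queue.append((customer, hops + 1))
    return seen


def rank_by_vulnerability(affected):
    # Fewer hops first, then the larger input share.
    return sorted(affected, key=lambda c: (affected[c][0], -affected[c][1]))


hits = affected_customers("ChipCo", SUPPLIES_TO)
ranking = rank_by_vulnerability(hits)
print(ranking)
```

The ranking is a candidate list for further research, not a trade list; whether a customer can switch suppliers depends on information the graph may not hold.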

Insider Network Clustering

Find clusters of companies connected by shared board members or executives. When multiple companies in a cluster announce positive results, it suggests broader network strength. Conversely, distress in one network member might cascade.
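One simple reading of "cluster" is a connected component: two companies belong together if a chain of shared directors links them. The sketch below uses a union-find structure for this; the director and company names are made up, and a real system might prefer a community-detection algorithm that tolerates weak links.

```python
from collections import defaultdict

# Hypothetical board seats: person -> companies whose boards they sit on
BOARD_SEATS = {
    "director_a": ["AlphaCo", "BetaCo"],
    "director_b": ["BetaCo", "GammaCo"],
    "director_c": ["DeltaCo"],
    "director_d": ["DeltaCo", "EpsilonCo"],
    "director_e": ["ZetaCo"],
}


def board_clusters(board_seats):
    """Group companies into components linked by shared directors."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for companies in board_seats.values():
        if not companies:
            continue
        root = find(companies[0])
        for other in companies[1:]:
            parent[find(other)] = root  # merge the two components

    groups = defaultdict(set)
    for company in list(parent):
        groups[find(company)].add(company)
    # Largest clusters first; ties broken alphabetically for stable output.
    return sorted(groups.values(), key=lambda g: (-len(g), sorted(g)))


clusters = board_clusters(BOARD_SEATS)
print(clusters)
```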

Query: "find all companies with executives who've previously worked at Apple" (Apple alumni network). If many are outperforming, it suggests strength in Apple's talent; if underperforming, it suggests Apple's advantage comes from something other than people (brand, IP, culture).

Contagion and Spillover

Model financial contagion through graphs. When company A defaults, the shock reaches its creditors (upstream suppliers and banks) and its downstream customers. Graph traversal identifies these ripple effects. This is useful for risk management: it identifies systemically important companies whose failure would cascade through the ecosystem.
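A common way to make the cascade idea concrete is a threshold model: a company fails once enough of its exposure sits with counterparties that have already failed. The sketch below assumes that model; the exposure figures, company names, and the 0.5 threshold are illustrative assumptions, not calibrated values.

```python
# Hypothetical exposures: company -> {counterparty: fraction of the company's
# revenue or receivables tied to that counterparty}
EXPOSURE = {
    "SupplierA": {"AnchorCo": 0.6, "OtherCo": 0.4},
    "SupplierB": {"AnchorCo": 0.2, "SupplierA": 0.4},
    "BankC": {"SupplierA": 0.3, "SupplierB": 0.3},
    "OtherCo": {},
}


def simulate_cascade(initial_failures, exposure, threshold=0.5):
    """Return the list of failure rounds, starting with the initial set.

    In each round, a surviving company fails if its summed exposure to
    already-failed counterparties is at least `threshold`.
    """
    failed = set(initial_failures)
    rounds = [set(initial_failures)]
    while True:
        newly_failed = set()
        for company, links in exposure.items():
            if company in failed:
                continue
            lost = sum(share for cp, share in links.items() if cp in failed)
            if lost >= threshold:
                newly_failed.add(company)
        if not newly_failed:
            return rounds
        failed |= newly_failed
        rounds.append(newly_failed)


rounds = simulate_cascade({"AnchorCo"}, EXPOSURE)
print(rounds)
```

Running the simulation once per candidate company and comparing the total number of failures gives a rough ranking of systemic importance under these assumptions.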

Temporal Dynamics in Graphs

Relationships change: companies acquire or divest suppliers, executives change jobs, investors enter and exit stakes. Models treating graphs as static miss important signals. Temporal graphs encode relationship start/end dates.

Query: "find companies that recently added suppliers from a competitor's supply chain" (suggests technology or supplier access shift). Temporal queries identify relationship changes signaling strategic shifts.
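Encoding start and end dates on each relationship makes both point-in-time and change queries straightforward. The following sketch assumes edges stored as `(supplier, customer, start, end)` tuples with `end=None` for active relationships; the companies and dates are invented for illustration.

```python
from datetime import date

# Hypothetical supplier edges: (supplier, customer, start, end)
SUPPLY_EDGES = [
    ("ChipCo", "RivalCo", date(2019, 1, 1), None),
    ("LensCo", "RivalCo", date(2020, 6, 1), None),
    ("ChipCo", "OurCo", date(2024, 3, 1), None),
    ("GlassCo", "OurCo", date(2018, 1, 1), date(2023, 12, 31)),
]


def suppliers_as_of(customer, as_of, edges):
    """Suppliers whose relationship with `customer` is active on `as_of`."""
    return {
        supplier
        for supplier, cust, start, end in edges
        if cust == customer and start <= as_of and (end is None or as_of <= end)
    }


def newly_shared_suppliers(company, competitor, since, as_of, edges):
    """Suppliers the company added between `since` and `as_of` that also
    serve the competitor on `as_of`."""
    added = suppliers_as_of(company, as_of, edges) - suppliers_as_of(company, since, edges)
    return added & suppliers_as_of(competitor, as_of, edges)


shared = newly_shared_suppliers(
    "OurCo", "RivalCo", since=date(2024, 1, 1), as_of=date(2024, 6, 30),
    edges=SUPPLY_EDGES,
)
print(shared)
```

Querying with an explicit `as_of` date also helps avoid look-ahead bias in backtests, since the graph is reconstructed as it was known at that time.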

Graph Machine Learning

Graph neural networks (GNNs) extend machine learning to graph-structured data, learning end to end from nodes and their connections. Train GNNs to predict company performance or stock returns from graph features such as node centrality (how many connections does a company have?), community membership (does the company sit in a cohesive cluster?), and local neighborhood features (what is the profile of the connected companies?).

Because GNNs take relationship information as input, they can outperform models that ignore structure when those relationships carry real signal.
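Two of the graph features named above, centrality and neighborhood profile, can be computed by hand before reaching for a GNN library. The sketch below is not a trained GNN: it computes degree centrality and a one-step mean over neighbors, which is the kind of aggregation many GNN layers apply with learned weights. The graph and the `TRAILING_RETURN` values are hypothetical.

```python
# Hypothetical undirected relationship graph and one feature per company
ADJACENCY = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A", "D"],
    "D": ["A", "C"],
}
TRAILING_RETURN = {"A": 0.10, "B": -0.20, "C": 0.30, "D": 0.50}


def degree_centrality(adjacency):
    """Fraction of the other nodes each node is directly connected to."""
    denom = max(len(adjacency) - 1, 1)
    return {node: len(nbrs) / denom for node, nbrs in adjacency.items()}


def neighbor_mean(adjacency, feature):
    """Average of a feature over each node's direct neighbors."""
    return {
        node: (sum(feature[n] for n in nbrs) / len(nbrs)) if nbrs else 0.0
        for node, nbrs in adjacency.items()
    }


def feature_rows(adjacency, feature):
    """One row per company: own feature, centrality, neighborhood average."""
    centrality = degree_centrality(adjacency)
    nbr_avg = neighbor_mean(adjacency, feature)
    return {
        node: (feature[node], centrality[node], nbr_avg[node])
        for node in adjacency
    }


rows = feature_rows(ADJACENCY, TRAILING_RETURN)
for company, row in rows.items():
    print(company, row)
```

Rows like these can feed any tabular model as a baseline; a GNN becomes worth its complexity only if it beats that baseline out of sample.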

Implementation with Neo4j

Neo4j is a widely used graph database and a common choice for financial applications. Create nodes for companies, people, and institutions. Create relationships connecting them with properties (dates, amounts, type). Use the Cypher query language to express relationship queries as path patterns.

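Example Cypher query: "MATCH (c:Company)-[:supplied_by]->(s:Company)-[:investor_is]->(i:Person) WHERE c.sector = 'Semiconductors' WITH i, count(DISTINCT s) AS supplier_count WHERE supplier_count > 1 RETURN i.name, supplier_count", which finds people invested in more than one supplier of semiconductor companies. Counting distinct suppliers, rather than matched rows, keeps a single supplier with many customers from being counted repeatedly.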
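Loading relationships into Neo4j usually means generating parameterized Cypher rather than pasting values into query strings. The helper below is a sketch under stated assumptions: the function name, the allowlist, and the tickers are hypothetical, and it only builds the statement and its parameters without connecting to a database. The relationship type is interpolated into the query text, so it is checked against an allowlist first.

```python
# Hypothetical allowlist of company-to-company relationship types
ALLOWED_REL_TYPES = {"supplies_to", "supplied_by"}


def merge_company_edge(source_ticker, rel_type, target_ticker, props):
    """Build an idempotent upsert for one company-to-company edge.

    Returns (cypher, parameters). MERGE creates each node and the
    relationship only if they do not already exist.
    """
    if rel_type not in ALLOWED_REL_TYPES:
        raise ValueError(f"relationship type not allowed: {rel_type!r}")
    cypher = (
        "MERGE (a:Company {ticker: $source}) "
        "MERGE (b:Company {ticker: $target}) "
        f"MERGE (a)-[r:{rel_type}]->(b) "
        "SET r += $props"
    )
    parameters = {
        "source": source_ticker,
        "target": target_ticker,
        "props": dict(props),
    }
    return cypher, parameters


# Illustrative tickers and revenue share, not real data.
statement, parameters = merge_company_edge(
    "SUPP", "supplies_to", "CUST", {"revenue_share": 0.25}
)
print(statement)
print(parameters)
```

With the official Neo4j Python driver, the pair would typically be passed to a session as `session.run(statement, parameters)`; connection setup is omitted here.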

Challenges and Limitations

Graph databases excel at relationship traversal but can struggle with large-scale analytics. Aggregating properties across millions of nodes requires careful optimization. Data quality is critical: incorrect relationships propagate through graph queries. Continuously validate graph data against source data.
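Validation against source data can start as a plain set comparison between the edges currently in the graph and the edges extracted from the latest filings. The sketch below assumes edges are represented as `(source, relationship, target)` tuples; the names are hypothetical.

```python
def reconcile(graph_edges, source_edges):
    """Compare graph edges with freshly extracted source edges.

    'stale' edges exist in the graph but no longer in the source;
    'missing' edges exist in the source but not yet in the graph.
    """
    graph_set, source_set = set(graph_edges), set(source_edges)
    return {
        "stale": sorted(graph_set - source_set),
        "missing": sorted(source_set - graph_set),
    }


# Hypothetical edges for illustration
in_graph = [
    ("ChipCo", "supplies_to", "PhoneCo"),
    ("GlassCo", "supplies_to", "PhoneCo"),
]
from_filings = [
    ("ChipCo", "supplies_to", "PhoneCo"),
    ("LensCo", "supplies_to", "PhoneCo"),
]

report = reconcile(in_graph, from_filings)
print(report)
```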

Conclusion

Financial networks (supply chains, executive relationships, investor webs) contain potentially predictive signals that relational databases cannot easily exploit. Graph databases enable natural querying of these networks and the discovery of patterns of contagion, concentration, and disruption. For quant firms integrating alternative data, a graph database can serve as core infrastructure, surfacing signals that are hard to see with traditional database approaches.