1. ☁️ Cloud Infrastructure: The Forgotten Archaeology
AWS Config automated discovery. 27 active services tracked: EC2 instances (remember that t2.micro from the 2019 demo?), Lambda functions (contractor deployed 47, left 3 years ago, all still running with admin IAM), S3 buckets (TEMP-DELETE-LATER from 2018, still public, still leaking), RDS databases (test database with production data copy, nobody knows root password), VPCs (five different VPCs because each team created their own), security groups (438 rules, 127 allowing 0.0.0.0/0, "temporarily" from 2017). AWS Config continuously monitors before amnesia becomes CVE.
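Counting the rules that allow 0.0.0.0/0 is easy to automate. The sketch below is a minimal illustration, not the tooling described above: it assumes the input is shaped like the `SecurityGroups` list returned by the EC2 DescribeSecurityGroups API, it also treats the IPv6 catch-all `::/0` as world-open (an assumption added here), and the function name and sample group IDs are hypothetical.

```python
from __future__ import annotations

from typing import Any

# Assumption: "open to the world" means the IPv4 or IPv6 catch-all CIDR.
WORLD_CIDRS = {"0.0.0.0/0", "::/0"}


def find_world_open_rules(security_groups: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return one finding per inbound rule that allows traffic from anywhere."""
    findings = []
    for group in security_groups:
        for perm in group.get("IpPermissions", []):
            cidrs = [r.get("CidrIp") for r in perm.get("IpRanges", [])]
            cidrs += [r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])]
            for cidr in cidrs:
                if cidr in WORLD_CIDRS:
                    findings.append({
                        "group_id": group.get("GroupId"),
                        "protocol": perm.get("IpProtocol"),
                        # Port keys are absent for "all traffic" rules, hence .get().
                        "from_port": perm.get("FromPort"),
                        "to_port": perm.get("ToPort"),
                        "cidr": cidr,
                    })
    return findings


# Made-up sample data in the DescribeSecurityGroups response shape.
SAMPLE_GROUPS = [
    {
        "GroupId": "sg-0example1",
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}], "Ipv6Ranges": []},
            {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
             "IpRanges": [{"CidrIp": "10.0.0.0/16"}], "Ipv6Ranges": []},
        ],
    },
    {
        "GroupId": "sg-0example2",
        "IpPermissions": [
            {"IpProtocol": "-1", "IpRanges": [], "Ipv6Ranges": [{"CidrIpv6": "::/0"}]},
        ],
    },
]

if __name__ == "__main__":
    for finding in find_world_open_rules(SAMPLE_GROUPS):
        print(finding)
```

In practice the list would come from a paginated API call or from AWS Config. The function only needs the parsed response, so it can be exercised offline against saved snapshots.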
Infrastructure Archaeology Reality: CloudFormation IaC ensures version-controlled infrastructure (if people use it—manual EC2 launches still happen). Multi-account organization with centralized logging (proving someone launched that instance, even if they've forgotten). Cost anomaly detection (that's how we discovered the crypto-mining instance nobody knew existed).
Horror Story: Organization discovers during breach that attacker had been operating from EC2 instance launched by intern during "learning week" 4 years prior. Never tracked. Never patched. Never decommissioned. Default credentials still worked. Attacker had root access to the instance, and from it a foothold inside the VPC, for 18 months before breach detection. Cost to organization: $4.2M in fines, $18M in remediation. Cost to run that EC2 instance: $0.012/hour. Asset inventory failure math doesn't favor you.
AWS Config means real-time inventory, not annual spreadsheets that become archaeological artifacts themselves. Every forgotten instance is running vulnerabilities you haven't patched because you don't know it exists. FNORD. The instance you forgot about is already compromised. Question: Which one?
2. 📝 Code & Repositories: Ghost Repos Haunt You
GitHub repository inventory via API automation. 40+ repositories tracked in Hack23 organization: CIA (Citizen Intelligence Agency OSINT platform), Black Trigram (Korean martial arts combat simulator), CIA Compliance Manager (CIA Triad assessment tool), Lambda in Private VPC (AWS resilience architecture), Sonar-CloudFormation Plugin (IaC security scanning). GitHub API provides automated discovery of organization repositories. It cannot see the shadow repos developers create in personal accounts (it happens, always happens).
Repository Archaeology: SECURITY_ARCHITECTURE.md mandatory in all repos (enforced via branch protection). Public ISMS repository demonstrating transparency (70% of policies public). Archived repositories tracked (abandoned but not deleted, because deletion is data loss). Forked repositories monitored (security patches upstream propagate how, exactly?). Private repositories in public organizations (free tier limits create shadow infrastructure).
Ghost Repository Horror: Security researcher discovers credentials in public repository created 6 years prior during hackathon. Repository archived. Never audited. AWS root account keys committed in 2018. Still valid (nobody rotated). Researcher reports via responsible disclosure. Company discovers they've been cryptomining victims for 3 years. Terraform state files in repo contained database passwords (still current). S3 bucket URLs revealed internal architecture. One forgotten repository = complete infrastructure compromise.
Shadow Repository Reality: Developers create personal GitHub accounts for "testing" (with company code). Contractors push to personal repos "for backup" (still accessible after contract ends). Acquisitions bring repositories nobody inventories. Open source forks contain company customizations (and secrets). Your repository inventory is incomplete. Always. Question: By how much?
Code repositories are assets. Abandoned repos are forgotten attack surfaces containing passwords that are still valid (because who rotates credentials for repos they've forgotten existed?). Systematic inventory prevents repository sprawl before sprawl becomes breach. But only if inventory includes repositories you didn't know you had. How do you inventory what you don't know exists? Automation. GitHub API. Daily scans. Accept that discovery is ongoing archaeology, not one-time audit. FNORD.
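What a daily scan might compare can be sketched as follows. This is an illustration under stated assumptions, not the actual Hack23 automation: the repository records use the `name`, `archived`, `fork` and `pushed_at` fields that the GitHub REST API returns when listing organization repositories, while the function name, the 365-day staleness threshold and the sample repository names are all hypothetical.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def audit_repos(discovered: list[dict], inventory: set[str],
                now: datetime, stale_after_days: int = 365) -> dict[str, list[str]]:
    """Compare discovered repositories against the asset register."""
    report: dict[str, list[str]] = {
        "untracked": [], "archived": [], "forks": [], "stale": [], "missing": [],
    }
    cutoff = now - timedelta(days=stale_after_days)
    seen = set()
    for repo in discovered:
        name = repo["name"]
        seen.add(name)
        if name not in inventory:
            report["untracked"].append(name)   # exists, but nobody registered it
        if repo.get("archived"):
            report["archived"].append(name)    # abandoned, yet secrets may still be valid
        if repo.get("fork"):
            report["forks"].append(name)       # upstream patches need manual syncing
        pushed_at = repo.get("pushed_at")      # may be null for empty repositories
        if pushed_at:
            pushed = datetime.fromisoformat(pushed_at.replace("Z", "+00:00"))
            if pushed < cutoff:
                report["stale"].append(name)
    # Registered but no longer discoverable: renamed, transferred, or deleted.
    report["missing"] = sorted(inventory - seen)
    return report


# Made-up sample data; names are illustrative only.
SAMPLE_NOW = datetime(2024, 1, 1, tzinfo=timezone.utc)
SAMPLE_INVENTORY = {"platform-api", "old-docs"}
SAMPLE_DISCOVERED = [
    {"name": "platform-api", "archived": False, "fork": False,
     "pushed_at": "2023-12-15T09:00:00Z"},
    {"name": "hackathon-2018", "archived": True, "fork": False,
     "pushed_at": "2018-06-01T12:00:00Z"},
    {"name": "vendor-fork", "archived": False, "fork": True,
     "pushed_at": "2022-03-10T08:30:00Z"},
]

if __name__ == "__main__":
    for category, names in audit_repos(SAMPLE_DISCOVERED, SAMPLE_INVENTORY, SAMPLE_NOW).items():
        print(f"{category}: {names}")
```

A scan like this only covers repositories the organization token can list. Repos in personal accounts stay invisible to it, which is the gap the paragraph above describes.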
3. 👤 Identity & Access: Zombie Accounts Hunt You
AWS Identity Center + GitHub access reviews revealing zombie privileges. IAM users (deprecated—using SSO now), IAM roles (427 tracked, 89 unused >180 days), IAM policies (custom policies proliferate like rabbits), GitHub organization members (quarterly reviews), AWS permission sets (AWSAdministratorAccess, AWSPowerUserAccess, AWSReadOnlyAccess, AWSServiceCatalogAdminFullAccess). 90-day dormant account detection per Access Control Policy. Quarterly access reviews ensure privilege hygiene before privileges become persistent access for departed employees.
Zombie Account Archaeology: IAM Access Analyzer reveals cross-account access (that external account still has S3 read? Since when?). AWS Organizations tracks member accounts (when did we add this account? Who owns it?). MFA enforcement via Identity Center (humans forget, automation enforces). Access keys actively used (those keys from 2019 API integration? Still valid. Still used. By whom? Nobody knows.). People are assets. Departed employees with active access are vulnerabilities with legs.
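Those keys from 2019 can be found without guessing. The following sketch parses an IAM credential report (the CSV that AWS generates per account) and flags active access keys that have not been rotated within a threshold. It assumes the report's `user`, `access_key_N_active` and `access_key_N_last_rotated` columns. The 90-day default is borrowed from the dormant-account threshold above and is an assumption when applied to key age. The function name and sample users are hypothetical.

```python
from __future__ import annotations

import csv
import io
from datetime import datetime, timezone


def stale_active_keys(report_csv: str, now: datetime,
                      max_age_days: int = 90) -> list[tuple[str, int, int]]:
    """Return (user, key_slot, age_in_days) for active keys older than the limit."""
    findings = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        for slot in (1, 2):  # each IAM user can hold at most two access keys
            if row.get(f"access_key_{slot}_active") != "true":
                continue
            rotated = row.get(f"access_key_{slot}_last_rotated") or "N/A"
            if rotated == "N/A":
                continue
            rotated_at = datetime.fromisoformat(rotated.replace("Z", "+00:00"))
            age_days = (now - rotated_at).days
            if age_days > max_age_days:
                findings.append((row["user"], slot, age_days))
    return findings


# Made-up sample, trimmed to the columns used here; real reports have many more.
SAMPLE_REPORT = "\n".join([
    "user,access_key_1_active,access_key_1_last_rotated,"
    "access_key_2_active,access_key_2_last_rotated",
    "api-integration-2019,true,2019-03-01T00:00:00+00:00,false,N/A",
    "ci-deployer,true,2023-12-01T00:00:00+00:00,false,N/A",
])
SAMPLE_NOW = datetime(2024, 1, 1, tzinfo=timezone.utc)

if __name__ == "__main__":
    for user, slot, age in stale_active_keys(SAMPLE_REPORT, SAMPLE_NOW):
        print(f"{user}: access key {slot} is {age} days old and still active")
```

Key age answers "still valid?". It does not answer "used by whom?", which needs the last-used columns and CloudTrail.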
Zombie Account Horror: Quarterly access review discovers contractor from 2020 still has AWS AdministratorAccess. Contract ended November 2020. Access never revoked. Contractor hasn't logged in (or have they?—logging gaps during CloudTrail migration). Investigation reveals: Access key created 2 weeks before contract end. Never rotated. Used sporadically from Eastern European IPs. Contractor sold access to ransomware group. Group used access for reconnaissance (9 months). Data exfiltration (3 months). Ransomware deployment (1 day). Total breach cost: $12M. Asset management failure: Priceless. Access is asset. Forgotten access is persistent vulnerability.
Third-Party Access Reality: SaaS integrations create OAuth tokens (remember that analytics tool you evaluated in 2021? Still has read access). Vendor support accounts (opened for emergency, never closed). Shared credentials in Slack DMs (rotated when exactly?). Service accounts proliferate (each automation creates new IAM role). Question: How many identities have access to your infrastructure right now? Count. You'll be wrong. AWS tells truth.
People are assets. Dormant accounts are time-delayed privilege escalations waiting to activate. Quarterly reviews prevent forgotten privileges from becoming persistent backdoors. But only if reviews are real (checking logs, validating access patterns, questioning anomalies) not checkbox compliance theater (confirming everyone looks familiar on the list). Departed employees hunt you from abandoned accounts. FNORD. How many accounts from departed employees still exist? Check now. You'll be surprised. Or horrified. Probably both.
4. 🏷️ Data Assets: Unclassified Means Unprotected
Classification-driven data inventory. Databases (PostgreSQL for CIA application, RDS with automated backups, point-in-time recovery enabled), S3 buckets (versioning enabled, lifecycle policies configured, but that TEMP bucket from 2018?), file storage (WorkMail attachments, CloudWatch logs, Glacier archives), classified per Classification Framework: Extreme assets (customer credentials, encryption keys) quarterly reviewed, Very High/High (financial data, PII) quarterly, Moderate (internal docs) semi-annually, Public (marketing materials) annually. Classification drives protection. Unclassified data gets generic controls—or no controls. Classification-driven inventory means risk-appropriate protection.
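The review cadence above translates directly into a schedule check. This is a minimal sketch, assuming quarterly, semi-annual and annual map to roughly 90, 180 and 365 days (an approximation chosen here, not a value from the Classification Framework). The asset records, field names and function name are hypothetical.

```python
from __future__ import annotations

from datetime import date, timedelta

# Assumption: calendar cadences approximated in days.
REVIEW_INTERVAL_DAYS = {
    "Extreme": 90,     # quarterly
    "Very High": 90,   # quarterly
    "High": 90,        # quarterly
    "Moderate": 180,   # semi-annually
    "Public": 365,     # annually
}


def reviews_due(assets: list[dict], today: date) -> list[tuple[str, str]]:
    """Return (asset_name, reason) for every asset that needs attention."""
    due = []
    for asset in assets:
        level = asset.get("classification")
        interval = REVIEW_INTERVAL_DAYS.get(level)
        last = asset.get("last_reviewed")
        if interval is None or last is None:
            # Unclassified assets never get the benefit of the doubt.
            due.append((asset["name"], "unclassified or never reviewed"))
        elif today - last > timedelta(days=interval):
            due.append((asset["name"], f"{level} review overdue"))
    return due


# Made-up sample register entries.
SAMPLE_ASSETS = [
    {"name": "customer-credentials-db", "classification": "Extreme",
     "last_reviewed": date(2023, 9, 1)},
    {"name": "marketing-site", "classification": "Public",
     "last_reviewed": date(2023, 6, 1)},
    {"name": "temp-delete-later", "classification": None, "last_reviewed": None},
    {"name": "internal-wiki", "classification": "Moderate",
     "last_reviewed": date(2023, 8, 1)},
]

if __name__ == "__main__":
    for name, reason in reviews_due(SAMPLE_ASSETS, date(2024, 1, 1)):
        print(f"{name}: {reason}")
```

Treating a missing classification as immediately due is a design choice that mirrors the point above: unclassified means unprotected, so it goes to the front of the queue.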
Data Classification Archaeology: S3 Intelligent-Tiering automatically moves data (but to where? Frequent Access, Infrequent Access, Archive Instant Access, Archive Access, or Deep Archive Access?). S3 versioning preserves deleted files (that sensitive doc you thought you deleted? 47 versions still exist). RDS snapshots proliferate (automated daily, retained 30 days, except when retention changed to 180 days, forgot to change back). CloudWatch Logs Insights reveals data flows (logs contain more sensitive data than databases). Data classification requires knowing data exists. Forgot the data? Forgot the classification. Forgot the protection. Breach imminent.
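The "deleted but 47 versions still exist" case can be detected mechanically. In a versioned bucket, a delete places a delete marker on top of the object and leaves the older versions in place. The sketch below assumes input shaped like an S3 ListObjectVersions response, with `Versions` and `DeleteMarkers` lists whose entries carry `Key` and `IsLatest`. The function name and sample keys are hypothetical, and pagination is left out.

```python
from __future__ import annotations

from collections import Counter


def lingering_versions(response: dict) -> dict[str, int]:
    """Count surviving versions of keys whose latest entry is a delete marker."""
    # Keys that look deleted to anyone listing the bucket normally.
    deleted_keys = {
        marker["Key"]
        for marker in response.get("DeleteMarkers", [])
        if marker.get("IsLatest")
    }
    counts = Counter(
        version["Key"]
        for version in response.get("Versions", [])
        if version["Key"] in deleted_keys
    )
    return dict(counts)


# Made-up sample in the ListObjectVersions response shape.
SAMPLE_RESPONSE = {
    "DeleteMarkers": [
        {"Key": "hr/salaries.xlsx", "VersionId": "d1", "IsLatest": True},
    ],
    "Versions": [
        {"Key": "hr/salaries.xlsx", "VersionId": "v3", "IsLatest": False},
        {"Key": "hr/salaries.xlsx", "VersionId": "v2", "IsLatest": False},
        {"Key": "hr/salaries.xlsx", "VersionId": "v1", "IsLatest": False},
        {"Key": "readme.txt", "VersionId": "v1", "IsLatest": True},
    ],
}

if __name__ == "__main__":
    for key, count in lingering_versions(SAMPLE_RESPONSE).items():
        print(f"{key}: deleted, but {count} versions remain")
```

Real buckets need every page of the listing merged before counting. A lifecycle rule that expires noncurrent versions is the usual way to make "deleted" mean deleted.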
Data Asset Horror: GDPR right-to-erasure request reveals organization cannot locate all user data. RDS database (obviously). S3 buckets (checked). CloudWatch Logs (oh, right). EBS snapshots (didn't think of those). AMI backups (contained user data?). DynamoDB (thought we migrated off that). Glacier archives (forgot those existed). Athena query results (cached in S3, forgot about those). Elasticsearch indices (thought we decommissioned that). Total data locations: 23. Data locations in asset register: 4. GDPR fine: €2.4M. Asset management failure: Actually enforced this time. Can't delete data you don't know exists.
Shadow Data Reality: Developers create S3 buckets for testing (contain production data copies). Analysts export data to local machines (still there when they leave). Contractors receive data shares (via unencrypted email—yes, really). API responses cached (Redis keys containing PII, expired when?). Logs contain sensitive data (structured logging prevented how?). Your data inventory is fiction. Your actual data is everywhere. Including places you've never inventoried. Question: What data exists that you've forgotten? Answer: The data that breaches you.
Data classification enables appropriate protection. Unclassified data receives lowest protection tier (because you didn't classify it, not because it's not sensitive). Classification-driven inventory means knowing data exists first, then classifying, then protecting. But most organizations skip step one—knowing data exists. They classify databases (easy, obvious, inventory says so). They forget EBS snapshots, CloudWatch Logs, Athena results, Redis caches, Lambda /tmp, container layers, CI/CD artifacts. Data proliferates. Inventory doesn't. Gap widens daily. Breach discovers gap. FNORD. How much data exists that you haven't classified? Answer: All of it. Classification is fiction. Data is everywhere. Protection is theater.
5. 🤝 Third-Party Services: Shadow SaaS Sprawl
SaaS inventory and vendor management exposing shadow subscriptions. 18 integrated services tracked: AWS (infrastructure), GitHub (code), SEB (banking), Bokio (accounting), SonarCloud (quality), FOSSA (license compliance), Stripe (payments), OpenAI (AI services), Google Workspace (IdP), Search Console, Bing Webmaster, YouTube, Product Hunt, TikTok, X (Twitter), LinkedIn, Suno, ElevenLabs. Vendor assessments per Third Party Management. Annual reviews ensure continued compliance. Third-party services are assets you don't control. Vendor inventory enables risk management. Shadow SaaS is shadow vulnerability.
SaaS Archaeology: Credit card statements reveal subscriptions nobody remembers authorizing (expensed as "marketing" or "tools" or "research"). OAuth app lists reveal integrations nobody uses (GitHub Apps last accessed 2019). Google Workspace admin console shows accounts nobody recognizes (that contractor's account still active?). DNS records point to SaaS providers nobody remembers contracting (that analytics subdomain—what service was that?). SaaS sprawl is real. SaaS inventory is fictional. Gap is your exposed API surface.
Shadow SaaS Horror: Security breach traced to compromised SaaS vendor nobody knew company used. Marketing manager subscribed to "free trial" social media management tool 2 years prior. Trial expired. Manager forgot. Account remained active (payment failed but service continued—poor vendor collection process). Manager's credentials compromised (phishing). Attacker accessed SaaS tool (still had OAuth access to company systems). Tool had GitHub integration (read access to private repos). Twitter integration (post access). Google Drive integration (read/write access). Slack integration (post to all channels). One forgotten SaaS trial = complete infrastructure access. Shadow SaaS is shadow infrastructure owned by vendors with worse security than yours.
Third-Party Reality: Every department subscribes to tools (marketing, sales, ops, dev). Every employee expenses SaaS subscriptions (accounting tracks the spend, not the access). Every integration creates OAuth tokens (revoked when subscription ends? Never.). Every vendor claims "bank-level security" (AES-256 encryption! SOC 2! GDPR compliant! Enforcement questionable). Your third-party inventory lists 20 vendors. Your credit card statements show 47. Your OAuth app list shows 89. Your actual vendor count: Unknown. Probably 200+. Maybe 500. Question: How many third parties can access your data right now? Answer: More than you think. Way more.
Third-party services are assets you don't control but trust implicitly. Every SaaS integration is persistent access you've granted (OAuth grants and refresh tokens often outlive the subscription unless you revoke them, and vendors don't remind you). Every vendor is potential breach vector (their security is now your security, so hope they're paranoid enough). Vendor inventory enables third-party risk management. But only if inventory is real (including shadow SaaS nobody remembers subscribing to). Shadow SaaS is reality. Official SaaS inventory is fiction. Breach discovers truth. FNORD. Count your SaaS vendors. Check credit cards. Check OAuth. Check DNS. Multiply estimate by 3. That's closer to reality. Still probably low. SaaS sprawl is exponential. Inventory is linear. Math doesn't favor you.
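Checking credit cards, OAuth and DNS against the official list is a reconciliation problem. The following is a minimal sketch, assuming each source has already been exported to a plain list of vendor names. The source labels, the lowercase-and-trim normalization and the function names are hypothetical, and real vendor names need fuzzier matching than this.

```python
from __future__ import annotations


def normalize(name: str) -> str:
    """Crude normalization so 'Stripe ' and 'stripe' count as the same vendor."""
    return name.strip().lower()


def reconcile_vendors(official: list[str],
                      sources: dict[str, list[str]]) -> tuple[dict[str, list[str]], list[str]]:
    """Return (shadow vendors with where they were seen, official vendors seen nowhere)."""
    official_norm = {normalize(v) for v in official}
    seen_in: dict[str, set[str]] = {}
    for source_name, names in sources.items():
        for name in names:
            seen_in.setdefault(normalize(name), set()).add(source_name)
    # Shadow SaaS: evidence of use exists, but the register has no entry.
    shadow = {
        vendor: sorted(where)
        for vendor, where in seen_in.items()
        if vendor not in official_norm
    }
    # Registered vendors with no evidence in any source: stale entry or missing data.
    unverified = sorted(official_norm - set(seen_in))
    return shadow, unverified


# Made-up sample exports; vendor names other than GitHub and Stripe are invented.
SAMPLE_OFFICIAL = ["GitHub", "Stripe"]
SAMPLE_SOURCES = {
    "card_statements": ["stripe", "SocialToolX"],
    "oauth_apps": ["GitHub", "socialtoolx ", "OldAnalytics"],
}

if __name__ == "__main__":
    shadow, unverified = reconcile_vendors(SAMPLE_OFFICIAL, SAMPLE_SOURCES)
    for vendor, where in shadow.items():
        print(f"shadow vendor {vendor!r} seen in: {', '.join(where)}")
    print(f"official vendors with no evidence: {unverified}")
```

Recording where each shadow vendor was seen matters: a vendor that shows up in both card statements and OAuth apps is paid for and holds access, which makes it the first one to review.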