<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Damien's Developer Diary]]></title><description><![CDATA[Damien's Developer Diary]]></description><link>https://blog.alleyne.dev</link><generator>RSS for Node</generator><lastBuildDate>Wed, 15 Apr 2026 07:45:24 GMT</lastBuildDate><atom:link href="https://blog.alleyne.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[The Job Isn't Writing Code. It's Knowing When the AI Is Wrong.]]></title><description><![CDATA[I use an AI coding agent for almost everything on my job board GlobalRemote. It writes my scrapers, builds my CI pipelines, architects my database schemas. It's written the vast majority of the codeba]]></description><link>https://blog.alleyne.dev/the-job-isn-t-writing-code-it-s-knowing-when-the-ai-is-wrong</link><guid isPermaLink="true">https://blog.alleyne.dev/the-job-isn-t-writing-code-it-s-knowing-when-the-ai-is-wrong</guid><category><![CDATA[AI]]></category><category><![CDATA[llm]]></category><category><![CDATA[GitHub Actions]]></category><dc:creator><![CDATA[Damien Alleyne]]></dc:creator><pubDate>Thu, 19 Feb 2026 13:06:31 GMT</pubDate><content:encoded><![CDATA[<p>I use an AI coding agent for almost everything on my job board <a href="https://jobs.alleyne.dev/">GlobalRemote</a>. It writes my scrapers, builds my CI pipelines, architects my database schemas. It's written the vast majority of the codebase.</p>
<p>After a few months of building this way, I've noticed a pattern: the most valuable thing I do isn't writing code. It's catching where the AI gets it wrong — specifically the cases where the output looks correct but doesn't hold up once you think about it.</p>
<p>Here are three recent examples.</p>
<hr />
<h2>1. The Wrong Tool for the Job</h2>
<p>My pipeline extracts tech stack requirements from job postings using regex. A role showed up on the board with no tech stack listed. The AI investigated, found the regex wasn't matching that posting's format, and proposed expanding the regex pattern.</p>
<p>Fair enough. But we already had LLMs classifying and extracting other fields from these same job descriptions. Why maintain a brittle regex when we could use the LLM we're already paying for?</p>
<p>The agent agreed and built the LLM-based extraction instead. More resilient, handles edge cases the regex never would have caught.</p>
<p>The AI optimized within the current approach. I questioned whether the approach itself was right. That's a pattern I keep seeing — AI agents are excellent at solving the problem you give them, but they don't question whether you're solving the right problem. That's still on you.</p>
<hr />
<h2>2. Technically Correct, Actually Misleading</h2>
<p>My pipeline extracted geographic data from a GitLab job posting — a role open in the US, Canada, France, Germany, Ireland, Netherlands, Spain, and the UK — and tagged it as <code>multi-region</code> with regions <code>Americas</code> and <code>Europe</code>. I asked the agent to verify. It confirmed the data was accurate — the posting listed countries across both regions.</p>
<p>The problem: if a user from Brazil sees "Americas", they'll assume they can apply. Someone in Hungary sees "Europe", same thing. But this job is only open in 8 specific countries.</p>
<p>The agent hadn't considered this. It checked my existing data, found I already had a <code>select-countries</code> badge for this situation, updated the job, and then updated the LLM extraction prompt so the system would get this distinction right on future runs.</p>
<p>I caught this because I've been the person in a non-obvious country getting excluded from roles that say "Americas" or "Global Remote." I've had Zapier, Outliant, and others reject me on location after their postings implied I was eligible.</p>
<hr />
<h2>3. The Silent Failure</h2>
<p>My pipeline ran on schedule. Scraped 39 jobs. Processed them. Reported: "No new entries to add." No errors, clean exit.</p>
<p>Zero new jobs from 39 listings didn't seem right. I pulled the raw data and asked the agent to audit its own pipeline's decisions.</p>
<p>It found two bugs. One was a dedup rule incorrectly matching a new job against a discontinued listing with a similar title — different posting, different job ID, valid salary data, silently dropped. The other was a salary field that the pipeline never parsed, so jobs with visible salary data were being dropped for "no salary transparency."</p>
<p>The pipeline didn't error or warn. It reported success while quietly dropping valid jobs.</p>
<p>I didn't catch this by reading code. I caught it because the output didn't pass a gut check.</p>
<hr />
<h2>Why This Matters</h2>
<p>Ben Shoemaker wrote a <a href="https://www.benshoemaker.us/writing/in-defense-of-not-reading-the-code/">piece recently</a> arguing that engineers should stop reading code line-by-line and invest in the "harness" — specs, tests, verification layers, trust boundaries. OpenAI calls this <a href="https://openai.com/index/harness-engineering/">Harness Engineering</a>.</p>
<p>Looking at these three examples through that lens, that's what I've been doing without realizing it. The AI handles production. I handle specification, trust boundaries, and the "does this actually make sense for my users?" layer.</p>
<p>If you're an engineer building with AI tools right now, I'd suggest paying attention to the moments where you override the AI's suggestions. Those moments aren't interruptions to your workflow — they're the most valuable part of it. That's the skill set the market is shifting toward, and it's worth documenting for yourself even if you never publish it.</p>
<hr />
<p><em>I'm a Senior Software Engineer with over a decade of experience, including building internationalization systems serving 50M+ users. I write about building with AI at</em> <a href="http://blog.alleyne.dev"><em>blog.alleyne.dev</em></a><em>.</em></p>
]]></content:encoded></item><item><title><![CDATA[I Benchmarked 6 LLMs to Automate My Job Board for $0.35/Month]]></title><description><![CDATA[Update (April 11, 2026): I re-ran benchmarks with March 2026 models. GPT-5.4 Nano ($0.20/$1.25 per M tokens) matched GPT-5 Mini on classification (96% F1) and beat Claude Haiku on extraction (100% s]]></description><link>https://blog.alleyne.dev/i-benchmarked-6-llms-to-automate-my-job-board-for-035month</link><guid isPermaLink="true">https://blog.alleyne.dev/i-benchmarked-6-llms-to-automate-my-job-board-for-035month</guid><category><![CDATA[llm]]></category><category><![CDATA[AI]]></category><category><![CDATA[github-actions]]></category><category><![CDATA[web scraping]]></category><dc:creator><![CDATA[Damien Alleyne]]></dc:creator><pubDate>Tue, 10 Feb 2026 13:32:43 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1770729820474/a0576dce-5409-442f-8716-8ed2e7479dde.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Update (April 11, 2026):</strong> I re-ran benchmarks with March 2026 models. <strong>GPT-5.4 Nano</strong> ($0.20/$1.25 per M tokens) matched GPT-5 Mini on classification (96% F1) and beat Claude Haiku on extraction (100% salary, 92.3% geo) — at 664ms avg latency and $0.001/run. We've consolidated from two models down to one. Monthly cost dropped from ~$0.35 to ~$0.01.</p>
<p>GPT-5 Nano remains unusable (20% F1). The 5.4 generation is a different beast entirely. Gemini Flash still can't produce reliable JSON.</p>
<h2>Background</h2>
<p>I run a curated remote job board (<a href="https://jobs.alleyne.dev">GlobalRemote</a>) focused on established remote-first companies with transparent salaries that hire globally — companies like GitLab, Automattic, Buffer, and Zapier. I <a href="https://blog.alleyne.dev/from-interview-surprise-to-mvp-testing-developer-job-transparency">started it last September</a> to scratch my own itch, manually curating every listing — researching interview processes, verifying geographic restrictions, and cross-referencing salary data. That worked for a small board, but didn't scale.</p>
<p>So I <a href="https://blog.alleyne.dev/i-built-2-job-scrapers-in-one-weekend-to-avoid-paying-for-data">built custom Apify scrapers</a> with department-level filtering to pull only engineering, product, design, and data roles from Greenhouse and Ashby boards. That cut the noise by 80%, but I still needed to automatically:</p>
<ol>
<li><p><strong>Classify</strong> whether a scraped job is a relevant tech role (engineer, designer, data scientist, PM) — department filtering catches the obvious non-tech roles, but borderline titles like "Integrations Consultant" or "Senior Sales Engineer" still slip through</p>
</li>
<li><p><strong>Extract</strong> details that aren't in the job board's structured fields — geographic eligibility, regional variants, and salary data when it's buried in the description text</p>
</li>
</ol>
<p>Previously, this pipeline ran locally using Ollama with qwen3:8b. I wanted to move it entirely to the cloud (GitHub Actions) using cheap API models, so it runs automatically twice a week without my local machine.</p>
<h2>The Question</h2>
<p>Which cloud LLM models give the best accuracy for classification and extraction, at the lowest cost? Should we use the same model for both tasks, or different models for each?</p>
<h2>Methodology</h2>
<h3>Ground Truth Dataset</h3>
<p>I built a test set from my own production data:</p>
<ul>
<li><p><strong>Classification (50 tests):</strong> 25 jobs that ARE on my board (known relevant, with expected categories) + 25 jobs that are NOT relevant (sales, marketing, HR, finance, legal titles from the same companies)</p>
</li>
<li><p><strong>Extraction (5 tests):</strong> Jobs with known geographic badges and salary ranges, covering <code>open-globally</code>, <code>multi-region</code>, <code>us-canada-only</code>, <code>americas-only</code>, and <code>null</code> salary cases</p>
</li>
</ul>
<h3>Models Tested</h3>
<table>
<thead>
<tr>
<th>Model</th>
<th>Provider</th>
<th>Type</th>
<th>Pricing (input/output per 1M tokens)</th>
</tr>
</thead>
<tbody><tr>
<td>Claude Haiku 4.5</td>
<td>Anthropic</td>
<td>Fast inference</td>
<td>$0.80 / $4.00</td>
</tr>
<tr>
<td>GPT-5 Mini</td>
<td>OpenAI</td>
<td>Reasoning</td>
<td>~$0.15 / ~$0.60</td>
</tr>
<tr>
<td>GPT-5 Nano</td>
<td>OpenAI</td>
<td>Reasoning (smallest)</td>
<td>~$0.05 / ~$0.20</td>
</tr>
<tr>
<td>Gemini 2.5 Flash</td>
<td>Google</td>
<td>Fast inference</td>
<td>$0.15 / $0.60</td>
</tr>
<tr>
<td>Gemini 3 Flash (preview)</td>
<td>Google</td>
<td>Fast inference (preview)</td>
<td>~$0.15 / ~$0.60</td>
</tr>
<tr>
<td>Qwen3 8B</td>
<td>Alibaba (via Ollama)</td>
<td>Open-source</td>
<td>Free</td>
</tr>
</tbody></table>
<p><strong>Note:</strong> GPT-4o-mini and Gemini 2.0 Flash were also tested initially but replaced with their successors (GPT-5 Mini, Gemini 2.5 Flash) for the final benchmarks.</p>
<h3>Prompts</h3>
<p>Same prompts used across all models — the exact prompts from my production pipeline:</p>
<p><strong>Classification prompt:</strong></p>
<pre><code class="language-plaintext">Classify this job title for a tech job board. Respond with JSON only.

Title: {title}
Company: {company}

{"isRelevant": true/false, "category": "engineering|product|design|data|other", "reason": "1-3 words"}

Rules:
- engineering: Software Engineer, Platform Engineer, SRE, DevOps, QA, Solutions Engineer, Design Engineer, Developer Advocate, Security Engineer
- product: Product Manager ONLY (not Growth Manager, not Renewals Manager)
- design: Product Designer, UX Designer, Visual Designer, Brand Designer
- data: Data Scientist, Data Engineer, ML Engineer, AI Engineer, Research Scientist
- ALL other roles (sales, marketing, HR, support, finance, legal) → isRelevant: false, category: "other"
</code></pre>
<p><strong>Extraction prompt:</strong></p>
<pre><code class="language-plaintext">Extract job details from this posting. Respond with JSON only.

Title: {title}
Company: {company}
Text: {description}

{
  "geoBadge": "open-globally|americas-only|emea-only|...|multi-region|...",
  "regionalVariants": ["Americas", "EMEA", "APAC"] or null,
  "salaryMin": number or null,
  "salaryMax": number or null,
  "salaryCurrency": "USD" or "EUR" or "GBP" or "CAD" or null,
  "reasoning": "brief explanation"
}
</code></pre>
<h2>Results</h2>
<h3>Full Comparison Table</h3>
<table>
<thead>
<tr>
<th>Model</th>
<th>F1 (overall score)</th>
<th>Precision (% flagged that were correct)</th>
<th>Recall (% of relevant jobs caught)</th>
<th>Category</th>
<th>Geography</th>
<th>Salary</th>
<th>Cost/run</th>
</tr>
</thead>
<tbody><tr>
<td><strong>GPT-5 Mini</strong></td>
<td><strong>94.1%</strong></td>
<td>92.3%</td>
<td><strong>96.0%</strong></td>
<td><strong>96.0%</strong></td>
<td>80.0%</td>
<td><strong>100%</strong></td>
<td>$0.008</td>
</tr>
<tr>
<td><strong>Claude Haiku 4.5</strong></td>
<td>91.7%</td>
<td><strong>95.7%</strong></td>
<td>88.0%</td>
<td>88.0%</td>
<td><strong>100%</strong></td>
<td><strong>100%</strong></td>
<td>$0.019</td>
</tr>
<tr>
<td>Gemini 2.5 Flash</td>
<td>89.8%</td>
<td>91.7%</td>
<td>88.0%</td>
<td>88.0%</td>
<td>80.0%</td>
<td>80.0%</td>
<td>$0.003</td>
</tr>
<tr>
<td>Gemini 3 Flash (preview)</td>
<td>89.4%</td>
<td>95.5%</td>
<td>84.0%</td>
<td>84.0%</td>
<td>40.0%</td>
<td>40.0%</td>
<td>$0.003</td>
</tr>
<tr>
<td>Qwen3 8B</td>
<td>85.7%</td>
<td>77.4%</td>
<td>96.0%</td>
<td>92.0%</td>
<td>60.0%</td>
<td>100%</td>
<td>free</td>
</tr>
<tr>
<td>GPT-5 Nano</td>
<td>23.3%</td>
<td>27.8%</td>
<td>20.0%</td>
<td>20.0%</td>
<td>0.0%</td>
<td>0.0%</td>
<td>$0.006</td>
</tr>
</tbody></table>
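<p>For readers less familiar with the column headers: F1 is the harmonic mean of precision and recall. A minimal sketch of the computation, with illustrative counts roughly matching GPT-5 Mini's row:</p>

```javascript
// Precision, recall, and F1 from classification counts.
// tp: relevant jobs correctly flagged
// fp: irrelevant jobs incorrectly flagged as relevant
// fn: relevant jobs missed
function scoreClassifier({ tp, fp, fn }) {
  const precision = tp / (tp + fp);
  const recall = tp / (tp + fn);
  const f1 = (2 * precision * recall) / (precision + recall);
  return { precision, recall, f1 };
}

// Illustrative counts: 24 of 25 relevant jobs caught, 2 false positives.
const { precision, recall, f1 } = scoreClassifier({ tp: 24, fp: 2, fn: 1 });
// recall 0.96, precision ~0.923, f1 ~0.941
```

<p>High recall with slightly lower precision matches how GPT-5 Mini behaved: it catches nearly everything relevant at the cost of a couple of borderline false positives.</p>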
<h3>Key Findings</h3>
<h4>1. GPT-5 Mini is the best classifier</h4>
<ul>
<li><p><strong>F1: 94.1%</strong> with 96% recall — it catches nearly every relevant job</p>
</li>
<li><p>Only 1 false negative: "Senior Marketing Data Analyst" (ambiguous title)</p>
</li>
<li><p>2 false positives: "Integrations Consultant" and "Senior Sales Engineer" (borderline roles)</p>
</li>
<li><p>96% category accuracy — correctly distinguishes engineering vs. design vs. data</p>
</li>
</ul>
<h4>2. Claude Haiku 4.5 is the best extractor</h4>
<ul>
<li><p><strong>100% geography badge accuracy</strong> — correctly identifies open-globally, multi-region, us-canada-only, americas-only</p>
</li>
<li><p><strong>100% salary accuracy</strong> — extracts exact numbers and currency, handles null correctly</p>
</li>
<li><p>Classification is good (91.7% F1) but misses some edge cases like "Growth Designer"</p>
</li>
</ul>
<h4>3. GPT-5 Nano is unusable</h4>
<ul>
<li><p>23.3% F1 and 0% extraction accuracy, with rampant JSON parse errors</p>
</li>
<li><p>Classified most jobs as irrelevant, couldn't extract structured data</p>
</li>
<li><p>Despite having the lowest per-token pricing, it costs MORE per run than Gemini 2.5 Flash while being terrible</p>
</li>
<li><p><strong>Verdict: Do not use GPT-5 Nano for structured extraction tasks</strong></p>
</li>
</ul>
<h4>4. Gemini 3 Flash (preview) has JSON reliability issues</h4>
<ul>
<li><p>Classification is decent (89.4% F1) but extraction fails 60% of the time with parse errors</p>
</li>
<li><p>The preview model wraps JSON in markdown code blocks or adds commentary</p>
</li>
<li><p><strong>Not production-ready yet</strong> — wait for GA</p>
</li>
</ul>
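<p>A defensive parser can recover the most common failure here: valid JSON wrapped in a markdown fence or surrounded by commentary. This is a sketch of that idea, not the exact code in my pipeline:</p>

```javascript
// Defensive JSON parsing for models that wrap output in markdown
// fences or add prose around the JSON object.
function parseModelJSON(raw) {
  // Strip ```json ... ``` (or bare ```) fences if present.
  const fenced = raw.match(/```(?:json)?\s*([\s\S]*?)```/);
  const candidate = fenced ? fenced[1] : raw;
  // Fall back to grabbing the first {...} span if there is extra prose.
  const start = candidate.indexOf('{');
  const end = candidate.lastIndexOf('}');
  if (start === -1 || end === -1) throw new Error('No JSON object found');
  return JSON.parse(candidate.slice(start, end + 1));
}
```

<p>This papers over formatting quirks but not genuinely malformed JSON, so it helps with the preview model's wrapping habit, not its deeper reliability issues.</p>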
<h4>5. Gemini 2.5 Flash is the budget option</h4>
<ul>
<li><p>Cheapest cloud model at $0.003/run</p>
</li>
<li><p>Decent classification (89.8% F1) but weaker extraction (80% geography, 80% salary)</p>
</li>
<li><p>One parse error on a EUR salary extraction test</p>
</li>
</ul>
<h4>6. Qwen3 8B is surprisingly capable</h4>
<ul>
<li><p>Free and runs locally</p>
</li>
<li><p>85.7% F1 classification — decent but too many false positives (7)</p>
</li>
<li><p>100% salary extraction but only 60% geography badge accuracy</p>
</li>
<li><p>Misclassifies "multi-region" as "open-globally" consistently</p>
</li>
</ul>
<h3>Error Pattern Analysis</h3>
<p><strong>Common false negatives across models:</strong></p>
<ul>
<li><p>"Senior Marketing Data Analyst" — every model flagged this as marketing, not data. Ambiguous title.</p>
</li>
<li><p>"Data Analyst, Customer Intelligence" — "Customer" in title triggers exclusion</p>
</li>
<li><p>"Senior Growth Designer" — "Growth" confuses classification</p>
</li>
</ul>
<p><strong>Common false positives:</strong></p>
<ul>
<li><p>"Integrations Consultant - Americas" — every model said "engineering" (borderline role)</p>
</li>
<li><p>"Senior Sales Engineer" — GPT-5 Mini incorrectly treated as engineering</p>
</li>
</ul>
<p><strong>Extraction patterns:</strong></p>
<ul>
<li><p>"Americas" region consistently extracted correctly by Claude Haiku</p>
</li>
<li><p>"US and Europe" → "multi-region" was the hardest badge to get right</p>
</li>
<li><p>EUR salary with different format than USD tripped up Gemini models</p>
</li>
</ul>
<h2>Decision: Hybrid Model Strategy</h2>
<p>Based on benchmarks, I implemented <strong>task-based model routing</strong>:</p>
<table>
<thead>
<tr>
<th>Task</th>
<th>Primary Model</th>
<th>Fallback</th>
<th>Why</th>
</tr>
</thead>
<tbody><tr>
<td>Classification</td>
<td>GPT-5 Mini</td>
<td>Claude Haiku → Ollama</td>
<td>Best F1 (94.1%), best recall (96%), best category accuracy (96%)</td>
</tr>
<tr>
<td>Extraction</td>
<td>Claude Haiku</td>
<td>GPT-5 Mini → Ollama</td>
<td>Perfect geography + salary (100%)</td>
</tr>
</tbody></table>
<h3>Cost Estimate</h3>
<p>Per twice-weekly ingestion run (~50-100 jobs to classify, ~10-20 to extract):</p>
<table>
<thead>
<tr>
<th>Task</th>
<th>Model</th>
<th>Estimated tokens</th>
<th>Cost</th>
</tr>
</thead>
<tbody><tr>
<td>Classification</td>
<td>GPT-5 Mini</td>
<td>~30K input, ~5K output</td>
<td>~$0.008</td>
</tr>
<tr>
<td>Extraction</td>
<td>Claude Haiku</td>
<td>~20K input, ~5K output</td>
<td>~$0.036</td>
</tr>
<tr>
<td><strong>Total per run</strong></td>
<td></td>
<td></td>
<td><strong>~$0.044</strong></td>
</tr>
<tr>
<td><strong>Monthly (8 runs)</strong></td>
<td></td>
<td></td>
<td><strong>~$0.35</strong></td>
</tr>
</tbody></table>
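<p>The per-run figures fall straight out of the token estimates and the per-million-token prices; a quick sanity check of the arithmetic (the token counts are rough estimates, as above):</p>

```javascript
// cost per run = (inputTokens * inputPrice + outputTokens * outputPrice) / 1e6
const perRun = (tokensIn, tokensOut, priceIn, priceOut) =>
  (tokensIn * priceIn + tokensOut * priceOut) / 1e6;

const classify = perRun(30_000, 5_000, 0.15, 0.60);  // GPT-5 Mini: ~$0.0075
const extract  = perRun(20_000, 5_000, 0.80, 4.00);  // Claude Haiku: ~$0.036
const monthly  = (classify + extract) * 8;           // 8 runs/month: ~$0.35
```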
<h3>Implementation</h3>
<p>The routing is handled in a small LLM abstraction layer:</p>
<pre><code class="language-javascript">const TASK_ROUTING = {
  classify: ['openai', 'claude', 'ollama'],   // GPT-5 Mini first
  extract:  ['claude', 'openai', 'ollama'],   // Claude Haiku first
  default:  ['claude', 'openai', 'ollama'],
};

// Pipeline calls specify the task:
await batchGenerateJSON(classificationPrompts, { task: 'classify' });
await batchGenerateJSON(extractionPrompts, { task: 'extract' });
</code></pre>
<p>Each task resolves to the best available model in priority order. If OpenAI is down, classification falls back to Claude Haiku. If Anthropic is down, extraction falls back to GPT-5 Mini. Ollama is available as a fallback for local development, but isn't used in the cloud pipeline.</p>
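<p>The resolution step behind that abstraction can be sketched like this. The routing table is repeated for self-containment; <code>resolveProvider</code> and <code>isAvailable</code> are illustrative names, not my actual implementation:</p>

```javascript
// Routing table from above, repeated so this sketch is self-contained.
const TASK_ROUTING = {
  classify: ['openai', 'claude', 'ollama'],   // GPT-5 Mini first
  extract:  ['claude', 'openai', 'ollama'],   // Claude Haiku first
  default:  ['claude', 'openai', 'ollama'],
};

// Resolve a task to the first available provider in its priority list.
// `isAvailable` is a hypothetical health check (API key present,
// no recent failures, etc.).
function resolveProvider(task, routing, isAvailable) {
  const order = routing[task] ?? routing.default;
  for (const provider of order) {
    if (isAvailable(provider)) return provider;
  }
  throw new Error(`No available provider for task "${task}"`);
}

// Example: if Anthropic is down, extraction falls back to OpenAI.
const anthropicDown = (provider) => provider !== 'claude';
// resolveProvider('extract', TASK_ROUTING, anthropicDown) returns 'openai'
```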
<h2>GPT-5 API Gotchas</h2>
<p>If you're migrating from GPT-4o-mini to GPT-5 models, watch out for:</p>
<ol>
<li><p><strong>No</strong> <code>temperature</code> parameter — GPT-5 models are reasoning models (like o1/o3). They don't accept <code>temperature</code>. Remove it entirely.</p>
</li>
<li><p><code>max_completion_tokens</code> not <code>max_tokens</code> — The parameter name changed for reasoning models.</p>
</li>
<li><p><code>response_format: { type: 'json_object' }</code> still works — JSON mode is supported via Chat Completions.</p>
</li>
<li><p><strong>Chat Completions API still works</strong> — Despite OpenAI promoting the new Responses API, Chat Completions hasn't been deprecated. For simple single-turn JSON extraction, Chat Completions is fine.</p>
</li>
</ol>
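<p>Putting those together, a classification request body might look like the sketch below. The parameter names reflect my reading of the current API; verify against OpenAI's documentation before relying on them:</p>

```javascript
// Request body for GPT-5 Mini via Chat Completions. Reasoning-model
// differences: no `temperature`, and `max_completion_tokens` replaces
// `max_tokens`. JSON mode via `response_format` still works.
function buildClassifyRequest(title, company) {
  return {
    model: 'gpt-5-mini',
    messages: [{
      role: 'user',
      content: `Classify this job title for a tech job board. ` +
        `Respond with JSON only.\n\nTitle: ${title}\nCompany: ${company}`,
    }],
    response_format: { type: 'json_object' },
    max_completion_tokens: 500,  // NOT max_tokens
    // temperature: 0,  // rejected: reasoning models do not accept it
  };
}

// Usage sketch: POST this body to
// https://api.openai.com/v1/chat/completions with an
// `Authorization: Bearer $KEY` header, then
// JSON.parse(response.choices[0].message.content).
```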
<h2>Timeline</h2>
<p>I ran a local pipeline with Ollama for a while, which worked but required my Mac to be on. Moving everything to the cloud was another weekend:</p>
<ol>
<li><p>Benchmarked 6 cloud models to find the best fit for classification and extraction</p>
</li>
<li><p>Built the cloud pipeline on GitHub Actions with hybrid model routing</p>
</li>
<li><p>Hardened data quality — HTML-based requirements extraction, fuzzy title dedup, paid-trial badge detection</p>
</li>
<li><p><strong>Result:</strong> Fully automated, runs twice a week, creates PRs for review, costs ~$0.35/month</p>
</li>
</ol>
]]></content:encoded></item><item><title><![CDATA[I Built 2 Job Scrapers in One Weekend to Avoid Paying for Data]]></title><description><![CDATA[I run GlobalRemote, a curated job board that shows interview processes and hiring transparency upfront. To keep it relevant, I needed to update it 2x per week with fresh jobs from Greenhouse and Ashby boards.
The problem? The scraper I was using fetc...]]></description><link>https://blog.alleyne.dev/i-built-2-job-scrapers-in-one-weekend-to-avoid-paying-for-data</link><guid isPermaLink="true">https://blog.alleyne.dev/i-built-2-job-scrapers-in-one-weekend-to-avoid-paying-for-data</guid><category><![CDATA[AI]]></category><category><![CDATA[apify]]></category><category><![CDATA[web scraping]]></category><category><![CDATA[Open Source]]></category><dc:creator><![CDATA[Damien Alleyne]]></dc:creator><pubDate>Mon, 02 Feb 2026 17:09:40 GMT</pubDate><content:encoded><![CDATA[<p>I run <a target="_blank" href="https://jobs.alleyne.dev">GlobalRemote</a>, a curated job board that shows interview processes and hiring transparency upfront. To keep it relevant, I needed to update it <strong>2x per week</strong> with fresh jobs from Greenhouse and Ashby boards.</p>
<p>The problem? The scraper I was using fetched <em>every</em> job from each company — Sales, HR, Support, everything — and stored it all in my Apify dataset. With 6-8 companies, that was 300-400 jobs per scrape, of which only 5-10 were actually relevant.</p>
<p><strong>I was burning through my Apify free tier ($5/month, ~2000 dataset operations) on irrelevant data.</strong> Two scrapes per week would blow past my quota. I wasn't ready to pay for a higher tier just to subsidize wasteful scraping.</p>
<p>So my options were:</p>
<ol>
<li><p>Update infrequently (once every 2-3 weeks) and let the board go stale</p>
</li>
<li><p>Pay for a higher Apify tier to subsidize wasteful scraping</p>
</li>
<li><p>Build my own scrapers with department filtering</p>
</li>
</ol>
<p>I chose #3.</p>
<p>The scrapers are now <a target="_blank" href="https://apify.com/dalleyne">live on Apify Store</a>, open-source, and I'm dogfooding them on GlobalRemote right now.</p>
<h2 id="heading-the-problem-i-couldnt-update-frequently-enough">The Problem: I Couldn't Update Frequently Enough</h2>
<p>The scraper I was using worked like this:</p>
<ol>
<li><p>Fetch all jobs from a company's job board</p>
</li>
<li><p>Store everything in an Apify dataset</p>
</li>
<li><p>I filter locally for the jobs I actually want</p>
</li>
</ol>
<p>This makes sense if you want <em>all</em> the jobs. But for a curated board like GlobalRemote, I only wanted:</p>
<ul>
<li><p>Engineering roles (not Sales, Marketing, HR)</p>
</li>
<li><p>From specific departments (e.g., "Code Wrangling" at Automattic, "Engineering" at GitLab)</p>
</li>
<li><p>Recent postings (not 6-month-old listings)</p>
</li>
</ul>
<p>With 300-400 jobs stored per scrape and only 5-10 relevant, I was wasting my dataset quota. <strong>Two scrapes per week would exceed my free tier limit.</strong> The choice was: pay for a higher tier or update less frequently. Neither was ideal.</p>
<h2 id="heading-the-solution-per-url-department-filtering">The Solution: Per-URL Department Filtering</h2>
<p>I built two Apify actors:</p>
<ul>
<li><p><a target="_blank" href="https://apify.com/dalleyne/greenhouse-job-scraper"><strong>Greenhouse Job Scraper</strong></a> (Automattic, GitLab, Speechify, etc.)</p>
</li>
<li><p><a target="_blank" href="https://apify.com/dalleyne/ashby-job-scraper"><strong>Ashby Job Scraper</strong></a> (Buffer, Zapier, RevenueCat, etc.)</p>
</li>
</ul>
<p>Both support <strong>per-URL configuration</strong>, meaning each company can have different filters:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"urls"</span>: [
    {
      <span class="hljs-attr">"url"</span>: <span class="hljs-string">"https://job-boards.greenhouse.io/automatticcareers"</span>,
      <span class="hljs-attr">"departments"</span>: [<span class="hljs-number">307170</span>],
      <span class="hljs-attr">"maxJobs"</span>: <span class="hljs-number">50</span>,
      <span class="hljs-attr">"daysBack"</span>: <span class="hljs-number">7</span>
    },
    {
      <span class="hljs-attr">"url"</span>: <span class="hljs-string">"https://job-boards.greenhouse.io/gitlab"</span>,
      <span class="hljs-attr">"departments"</span>: [<span class="hljs-number">4011044002</span>],
      <span class="hljs-attr">"maxJobs"</span>: <span class="hljs-number">20</span>
    }
  ]
}
</code></pre>
<p>The scraper:</p>
<ol>
<li><p>Fetches department metadata</p>
</li>
<li><p>Filters jobs by department ID <em>before</em> storing them</p>
</li>
<li><p>Only stores jobs that match your criteria</p>
</li>
<li><p>You only pay for the jobs you actually get (not the ones filtered out)</p>
</li>
</ol>
<p><strong>Result:</strong> I went from storing 300-400 jobs per scrape to 30-50 jobs — an 80% reduction in dataset usage.</p>
<h2 id="heading-how-i-built-it">How I Built It</h2>
<h3 id="heading-tech-stack">Tech Stack</h3>
<ul>
<li><p><strong>Apify platform</strong> — handles hosting, scheduling, dataset storage</p>
</li>
<li><p><strong>Greenhouse + Ashby APIs</strong> — public APIs for job boards</p>
</li>
<li><p><strong>AI (Claude)</strong> — for rapid development</p>
</li>
</ul>
<h3 id="heading-how-the-apis-work">How the APIs Work</h3>
<p>Both platforms expose public APIs for their job boards. This meant I could:</p>
<ul>
<li><p>Fetch departments/teams programmatically</p>
</li>
<li><p>Filter by department/team ID before fetching job details</p>
</li>
<li><p>Only pull full job data for matches</p>
</li>
<li><p>No browser automation or HTML scraping needed</p>
</li>
</ul>
<p>This is key: I'm filtering <em>before</em> fetching details, not after. Most scrapers fetch everything, then you filter locally. Mine filters first, then only fetches what you need.</p>
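<p>As a sketch, filter-before-fetch against Greenhouse's public job board API looks roughly like this. The endpoint shapes are my reading of the docs, and the helper names are illustrative:</p>

```javascript
// Keep only jobs in the wanted departments. This runs on department
// metadata BEFORE any per-job detail fetch or dataset write.
function filterByDepartment(departments, departmentIds) {
  return departments
    .filter((dept) => departmentIds.includes(dept.id))
    .flatMap((dept) => dept.jobs);
}

// Sketch of the full flow (endpoint shapes are assumptions; verify
// against the Greenhouse Job Board API docs before use).
async function fetchFilteredJobs(boardToken, departmentIds) {
  const base = `https://boards-api.greenhouse.io/v1/boards/${boardToken}`;

  // 1. Department metadata lists jobs per department.
  const { departments } = await (await fetch(`${base}/departments`)).json();

  // 2. Filter before fetching details.
  const wanted = filterByDepartment(departments, departmentIds);

  // 3. Only the matches get a detail fetch, and only they get stored.
  return Promise.all(
    wanted.map((job) => fetch(`${base}/jobs/${job.id}`).then((r) => r.json()))
  );
}
```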
<h3 id="heading-development-process">Development Process</h3>
<p>I built both scrapers over one weekend using AI (Claude).</p>
<p><strong>Saturday (Jan 31):</strong> Greenhouse scraper</p>
<ul>
<li><p>Prompt: "Build an Apify actor that scrapes Greenhouse job boards with department filtering"</p>
</li>
<li><p>AI figured out the API structure</p>
</li>
<li><p>I tested on Automattic and GitLab job boards</p>
</li>
</ul>
<p><strong>Sunday (Feb 1):</strong> Ashby scraper</p>
<ul>
<li><p>Prompt: "Build an Apify actor for Ashby job boards with department filtering (similar structure to the existing Greenhouse scraper)"</p>
</li>
<li><p>AI figured out Ashby's API</p>
</li>
<li><p>Tested on Buffer, Zapier, RevenueCat</p>
</li>
</ul>
<p><strong>What AI handled:</strong></p>
<ul>
<li><p>Reading API documentation (Greenhouse, Ashby, Apify actor structure)</p>
</li>
<li><p>Writing the scraper logic and Apify boilerplate</p>
</li>
<li><p>Handling edge cases (null departments, missing dates)</p>
</li>
<li><p>Generating input/output schemas</p>
</li>
</ul>
<p><strong>What I did:</strong></p>
<ul>
<li><p>Product decisions (per-URL config vs global config)</p>
</li>
<li><p>Testing on real job boards</p>
</li>
<li><p>Iterating when things didn't work</p>
</li>
<li><p>Catching issues (e.g., updated Node 20 → 22 in Dockerfile)</p>
</li>
</ul>
<p><strong>I never opened:</strong></p>
<ul>
<li><p><a target="_blank" href="https://developers.greenhouse.io/job-board.html">Greenhouse API documentation</a></p>
</li>
<li><p><a target="_blank" href="https://developers.ashbyhq.com/docs/public-job-posting-api">Ashby API documentation</a></p>
</li>
<li><p><a target="_blank" href="https://docs.apify.com/platform/actors">Apify's actor documentation</a></p>
</li>
</ul>
<p><strong>Total development time: One weekend.</strong></p>
<p>AI is a co-pilot, not an autopilot, but it handled all the research and boilerplate so I could focus on testing and product decisions.</p>
<h2 id="heading-dogfooding-on-globalremote">Dogfooding on GlobalRemote</h2>
<p>I'm using both scrapers to populate <a target="_blank" href="https://jobs.alleyne.dev">GlobalRemote</a> right now.</p>
<p>When I need fresh data, I trigger both scrapers. They return 30-50 relevant jobs instead of 300-400, keeping me well within my Apify free tier.</p>
<p><strong>What I've learned from dogfooding:</strong></p>
<ul>
<li><p>Department filtering reduced dataset usage by ~80%</p>
</li>
<li><p>I can now update regularly without exceeding my quota</p>
</li>
</ul>
<p>If the scrapers break, GlobalRemote breaks. That's a strong incentive to keep them working.</p>
<h2 id="heading-what-i-learned">What I Learned</h2>
<h3 id="heading-1-filter-before-storing-not-after">1. Filter before storing, not after</h3>
<p>For curated job boards, filtering <em>before</em> storage is way more cost-effective. The scraper I was using didn't do this.</p>
<h3 id="heading-2-per-url-config-beats-global-config">2. Per-URL config beats global config</h3>
<p>My first version had global department filters (same filter for all companies). That was a mistake. Different companies organize departments differently. Per-URL config gives users way more flexibility.</p>
<h3 id="heading-3-real-examples-gt-fake-examples">3. Real examples &gt; Fake examples</h3>
<p>In my README, I used <em>real</em> companies (Automattic, GitLab) and <em>real</em> department IDs (307170 = "Code Wrangling" at Automattic). Fake examples would've been useless for someone trying to replicate this.</p>
<h3 id="heading-4-ai-accelerates-weekend-projects-into-production-tools">4. AI accelerates weekend projects into production tools</h3>
<p>I shipped two working scrapers in one weekend without reading a single API doc. AI handled research and implementation; I handled product decisions and testing. That's the real power of AI in 2026.</p>
<h3 id="heading-5-open-sourcing-on-apify-was-easy">5. Open-sourcing on Apify was easy</h3>
<p>Publishing to Apify Store took ~10 minutes:</p>
<ul>
<li><p>Add README</p>
</li>
<li><p>Set pricing</p>
</li>
<li><p>Add input/output schemas</p>
</li>
<li><p>Add banking information (they prefer PayPal)</p>
</li>
<li><p>Click "Publish"</p>
</li>
</ul>
<h2 id="heading-whats-next">What's Next</h2>
<p>Both scrapers are live and stable. I will be using them on GlobalRemote twice a week, well within my free tier.</p>
<p><strong>Potential improvements:</strong></p>
<ul>
<li><p>Add automated tests (right now it's just manual verification)</p>
</li>
<li><p>Add salary parsing to Ashby scraper (Greenhouse already extracts salary ranges)</p>
</li>
<li><p>Build a Lever scraper (if there's demand)</p>
</li>
</ul>
<p><strong>But honestly?</strong> I built these to solve my own problem. If other people find them useful, great. If not, I'm still updating GlobalRemote 2x/week without blowing my budget.</p>
<hr />
<h2 id="heading-links">Links</h2>
<ul>
<li><p><strong>Greenhouse scraper:</strong> <a target="_blank" href="http://apify.com/dalleyne/greenhouse-job-scraper">apify.com/dalleyne/greenhouse-job-scraper</a></p>
</li>
<li><p><strong>Ashby scraper:</strong> <a target="_blank" href="http://apify.com/dalleyne/ashby-job-scraper">apify.com/dalleyne/ashby-job-scraper</a></p>
</li>
<li><p><strong>GlobalRemote:</strong> <a target="_blank" href="http://jobs.alleyne.dev">jobs.alleyne.dev</a></p>
</li>
<li><p><strong>GitHub:</strong> Both scrapers are MIT licensed - <a target="_blank" href="https://github.com/d-alleyne/greenhouse-job-scraper">Greenhouse</a> | <a target="_blank" href="https://github.com/d-alleyne/ashby-job-scraper">Ashby</a></p>
</li>
</ul>
<p>If you're building a job board or need ATS data, feel free to use them. And if you have feedback or find bugs, I'm on <a target="_blank" href="https://linkedin.com/in/damienalleyne">LinkedIn</a> or reachable via Apify.</p>
]]></content:encoded></item><item><title><![CDATA[What If We Could Fix Barbados Traffic With Data?]]></title><description><![CDATA[Every morning during my commute, I deal with the same roundabout problem. A car is coming around, and I need to figure out: are they exiting, or staying in?
No indicator light. So I watch their trajectory, their speed, try to read their intentions. I...]]></description><link>https://blog.alleyne.dev/what-if-we-could-fix-barbados-traffic-with-data</link><guid isPermaLink="true">https://blog.alleyne.dev/what-if-we-could-fix-barbados-traffic-with-data</guid><category><![CDATA[Machine Learning]]></category><category><![CDATA[Data Science]]></category><category><![CDATA[transportation ]]></category><category><![CDATA[Tech community]]></category><category><![CDATA[#social good]]></category><dc:creator><![CDATA[Damien Alleyne]]></dc:creator><pubDate>Sun, 23 Nov 2025 21:13:02 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/PmH8eyFjP_w/upload/a53cc4483dad60124893cf0f802be11f.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Every morning during my commute, I deal with the same roundabout problem. A car is coming around, and I need to figure out: are they exiting, or staying in?</p>
<p>No indicator light. So I watch their trajectory, their speed, and try to read their intentions. If I guess right and they exit, I can enter and save a few seconds. If I guess wrong, it's dangerous, so most of the time I just wait.</p>
<p>I'd always figured this was just how roundabouts work in Barbados. Poor signaling, unpredictable flow, daily frustration. That's just the reality here.</p>
<p>Then I went to a meetup at <a target="_blank" href="https://luma.com/p67mhuz8">Pelican Village</a> on Saturday, November 22nd, 2025.</p>
<h2 id="heading-the-challenge"><strong>The Challenge</strong></h2>
<p>The turnout was smaller than usual, since attendees had the option to join via Zoom. We were there to learn about the <a target="_blank" href="https://zindi.africa/competitions/barbados-traffic-analysis-challenge">Barbados Traffic Analysis Challenge</a>, a machine learning competition organized by <a target="_blank" href="https://govtech.bb/">GovTech</a> and Keleya Labs and hosted on <a target="_blank" href="https://zindi.africa/">Zindi</a> to help the <a target="_blank" href="https://mtw.gov.bb/">Ministry of Transport and Works</a> (MTW) understand what's slowing down traffic.</p>
<p>The presentation covered the technical side: using Machine Learning (ML) to analyze video footage, extracting features from data, different classification models. A lot of it was new territory for me.</p>
<p>During the networking session afterward, <a target="_blank" href="https://www.linkedin.com/in/conrad-brits-30565823/">Conrad Brits</a>, founder of Keleya Labs, explained how the competition came together. He's been working on various projects in Barbados since moving here during COVID. He said our highway infrastructure is great, yet for some reason traffic moves through our roundabouts more slowly than in other places with comparable infrastructure. When new cameras were installed at the Norman Niles roundabout, he realized that footage could be used to predict traffic conditions with machine learning.</p>
<p>Then he shared his theory about what he thinks is actually causing delays.</p>
<h2 id="heading-the-signaling-problem"><strong>The Signaling Problem</strong></h2>
<p>Conrad shared his hypothesis: people don't signal when exiting roundabouts, and he thinks that's measurably slowing down how quickly cars can enter.</p>
<p>Hearing him say it out loud made me realize — I've experienced this. I've seen the rare driver who <em>does</em> signal their exit, and I've tried it myself a few times. When someone signals, the person waiting to enter the roundabout gets those crucial milliseconds of certainty. They can go instead of hesitating.</p>
<p>But it's uncommon here. And without signals, every driver faces the same choice: wait to be sure, or guess and risk it.</p>
<p>Conrad's idea is to use machine learning to verify this with actual data. If the analysis shows that signaling improves entry rates, he can take that evidence to the MTW. Not assumptions — data. And maybe they'd update driver education to emphasize signaling at exits.</p>
<p>It reminded me of how <strong>jambusting</strong> became accepted practice. What started as an unofficial technique — minivans using the outside lane to go straight through roundabouts in the early 2000s — eventually became allowed at ABC Highway roundabouts as a way to ease traffic flow. Sometimes the practices that work in reality get officially recognized. Maybe signaling could be similar.</p>
<h2 id="heading-the-competition-details"><strong>The Competition Details</strong></h2>
<p>The challenge itself: analyze 15 minutes of traffic video, then predict traffic conditions 5 minutes into the future. You work with four video streams from the Norman Niles roundabout, each with congestion ratings. The goal is identifying what's actually causing the delays.</p>
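<p>To make the task concrete, here's a toy sketch of how the prediction problem could be framed, using a made-up per-minute congestion series and a naive "persistence" baseline. None of this is the competition's actual data or format:</p>

```python
def make_examples(ratings, window=15, horizon=5):
    """Turn a per-minute congestion series into (features, label) pairs:
    features = a 15-minute observation window, label = congestion 5 minutes
    after the window ends."""
    examples = []
    for start in range(len(ratings) - window - horizon + 1):
        features = ratings[start:start + window]
        label = ratings[start + window + horizon - 1]
        examples.append((features, label))
    return examples

def persistence_baseline(features):
    """Naive baseline: predict congestion stays at its last observed value."""
    return features[-1]

# A toy congestion series (0 = free flow, 3 = heavy), one value per minute.
series = [0, 0, 1, 1, 2, 2, 3, 3, 3, 2, 2, 1, 1, 0, 0, 0, 1, 2, 3, 3, 2, 1, 0, 0, 1]
examples = make_examples(series)
correct = sum(persistence_baseline(f) == y for f, y in examples)
print(f"persistence baseline: {correct}/{len(examples)} correct")
```

<p>The interesting work in the challenge is everything this sketch skips: extracting usable features (vehicle counts, speeds, signaling behavior) from four video streams, and beating the do-nothing baseline.</p>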
<p>There are cash prizes and exposure to international companies that recruit from these competitions. Deadline is January 26th, 2026 — two months away.</p>
<p>Conrad emphasized the social good that could come from this. Less time in traffic has ripple effects — people getting home to their families sooner, reduced stress, lower fuel costs, better quality of life. The improvements might seem small on an individual level, but multiply that across thousands of daily commutes and it becomes significant.</p>
<h2 id="heading-why-this-matters"><strong>Why This Matters</strong></h2>
<p>There's something compelling about this challenge. The daily frustrations we accept as unchangeable might not be. Data could turn assumptions into evidence. A small group at Pelican Village on a Saturday might improve commutes for everyone on the island.</p>
<p>If you have any interest in data science, ML, or solving civic problems with code, check out the <a target="_blank" href="https://zindi.africa/competitions/barbados-traffic-analysis-challenge">Barbados Traffic Analysis Challenge</a>. It closes January 26th, 2026.</p>
<p>Maybe someone will prove that a simple indicator light could save thousands of Barbadians time every day.</p>
]]></content:encoded></item><item><title><![CDATA[I Built a Job Board for Transparency. But Is Applying Even the Right Strategy?]]></title><description><![CDATA[Five weeks ago, I launched GlobalRemote — a curated job board focused on interview transparency for remote developers. The feedback was encouraging: developers loved knowing what to expect before applying. I expanded from 7 manually verified jobs to ...]]></description><link>https://blog.alleyne.dev/i-built-a-job-board-for-transparency-but-is-applying-even-the-right-strategy</link><guid isPermaLink="true">https://blog.alleyne.dev/i-built-a-job-board-for-transparency-but-is-applying-even-the-right-strategy</guid><category><![CDATA[remote work]]></category><category><![CDATA[career advice]]></category><category><![CDATA[job search]]></category><dc:creator><![CDATA[Damien Alleyne]]></dc:creator><pubDate>Fri, 10 Oct 2025 10:00:52 GMT</pubDate><content:encoded><![CDATA[<p>Five weeks ago, I launched <a target="_blank" href="https://jobs.alleyne.dev/">GlobalRemote</a> — a curated job board focused on interview transparency for remote developers. The feedback was encouraging: developers loved knowing what to expect before applying. I expanded from 7 manually verified jobs to 26 across 11 companies. People engaged with the concept.</p>
<p>Then I <a target="_blank" href="https://newsletter.pragmaticengineer.com/p/state-of-the-tech-market-in-2025-hiring-managers">read research</a> from 30+ hiring managers that made me question everything I'd built.</p>
<blockquote>
<p><strong>TL;DR:</strong> I built a job board for interview transparency. Then research showed that applications don't work — companies hire 90% of engineers through referrals and direct outreach, not inbound applications. I'm rethinking whether job boards should help people apply, or help them research companies for targeted outreach instead.</p>
</blockquote>
<h2 id="heading-the-problem-i-thought-i-was-solving">The Problem I Thought I Was Solving</h2>
<p>After experiencing surprise interview formats — expecting a conversation about my experience but getting handed a whiteboard problem instead — I realized most job boards tell you the company’s tech stack and <em>may</em> tell you the salary, but rarely the interview process. I built GlobalRemote to fix that: <strong>show developers what to expect before they invest time applying</strong>.</p>
<p>The early validation seemed strong. Developers in my network immediately understood the value. "I would absolutely use this," several told me. The geographic restriction transparency resonated even more — people were tired of discovering location restrictions deep in the job description or application process.</p>
<p>I thought I was solving a real problem. And I was. Just not in the way I expected.</p>
<p>The interview transparency still matters — developers still need to know what they're walking into. But maybe it matters for a different reason than I originally thought.</p>
<h2 id="heading-why-job-applications-dont-work-in-2025">Why Job Applications Don't Work in 2025</h2>
<p>The Pragmatic Engineer recently published findings from conversations with 30+ tech hiring managers and recruiters. The numbers are stark: companies regularly receive <strong>1,000+ applications for a single role</strong>, yet only about 10% of applicants are even minimally qualified.</p>
<p>One startup founder reported 23,000 applications in 30 days for 8 roles. A Spotify engineering manager saw 1,700 applicants in 15 hours. A Swiss startup stopped accepting applications at 600 in just 2 days.</p>
<p>But here's what shook me: despite this flood of applications, most companies hire <strong>fewer than 10% of their engineers through inbound applications</strong>. The majority come from direct outreach, referrals, and recruiter sourcing.</p>
<p>A now-deleted Reddit post (removed as potential advertising, though the pattern is very real) illustrated this perfectly: a senior backend engineer sent <strong>1,147 applications</strong> over five months, which generated 47 phone screenings. Meanwhile, <strong>400 targeted emails</strong> to hiring managers and recruiters resulted in 62 responses and 16 technical interviews. He eventually received 3 offers.</p>
<p>The math is brutal. Traditional job applications have become a numbers game where even qualified candidates get lost in the noise.</p>
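<p>Put side by side, the per-contact rates from that anecdote tell the story (figures as reported above):</p>

```python
# Figures from the anecdote above: five months of mass applications
# versus a targeted email campaign by the same engineer.
applications, phone_screens = 1147, 47
emails, responses, interviews = 400, 62, 16

screen_rate = phone_screens / applications   # chance an application leads to a phone screen
reply_rate = responses / emails              # chance an email gets any reply
interview_rate = interviews / emails         # chance an email leads to a technical interview

print(f"applications -> phone screen:  {screen_rate:.1%}")
print(f"emails -> reply:               {reply_rate:.1%}")
print(f"emails -> technical interview: {interview_rate:.1%}")
```

<p>Per contact, a targeted email was about as likely to yield a technical interview (4.0%) as an application was to yield a phone screen (4.1%), and a technical interview sits much deeper in the funnel than a screening call.</p>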
<h2 id="heading-ai-is-making-it-worse">AI Is Making It Worse</h2>
<p>Hiring managers consistently reported the same frustration: <strong>application quality has never been lower</strong>. Many applicants don't meet basic job requirements — frontend roles flooded with backend engineers, location requirements completely ignored, resumes clearly tailored by AI to mirror job descriptions when the person lacks the claimed experience.</p>
<p>I'll even admit: I've used AI to tailor my resume to job descriptions. I feed AI the job posting along with my real experience, and it generates a resume that caters to the job posting’s requirements. The problem? It often lists experiences that I've never had — just because they appeared in the job description. Every time, I have to manually edit out the hallucinated experience.</p>
<p>If I'm doing this — and I know enough to catch the fabrications — imagine how many desperate job seekers are submitting AI-generated resumes without reviewing them carefully.</p>
<p>One hiring manager described LinkedIn as "<strong>an irrecoverable hellscape for inbound applications</strong>." Multiple companies have stopped posting jobs on LinkedIn entirely, turning to smaller job boards or relying exclusively on outbound recruiting.</p>
<p>The Atlantic recently ran a headline: "<a target="_blank" href="https://www.theatlantic.com/ideas/archive/2025/09/job-market-hell/684133/">The Job Market Is Hell</a>—Young people are using ChatGPT to write their applications; HR is using AI to read them; no one is getting hired."</p>
<p>The system is broken on both sides. Applicants are desperate, so they use AI to spam applications. Companies are overwhelmed, so they use AI to filter them out. Everyone loses.</p>
<h2 id="heading-what-actually-works-direct-outreach">What Actually Works: Direct Outreach</h2>
<p>The hiring managers' data revealed a clear pattern: <strong>the best hires come from people they already know or people who reach out directly</strong>.</p>
<p>When a recruiter sources a candidate — proactively reaching out on LinkedIn — those candidates pass interviews at significantly higher rates than inbound applicants. When an engineer refers someone they've worked with, the quality is exponentially better. When a candidate identifies a company worth targeting and emails the hiring manager directly, they bypass automated filtering and ensure their application actually gets reviewed.</p>
<p>One hiring manager admitted: "The only way I've gotten truly amazing people to apply is through reaching out to my network."</p>
<p>This isn’t because good engineers don’t apply to jobs. It’s because in a market with 1,000+ applications per role, even strong candidates get filtered out by volume. The signal-to-noise ratio has collapsed.</p>
<p><strong>I experienced this firsthand.</strong> When I applied to Automattic in late 2021, my first application went nowhere. Then I attended a <a target="_blank" href="https://www.youtube.com/live/ktckNt1R0Kw">YouTube livestream</a> where they were partnering with Tech Beach Retreat, a Caribbean tech ecosystem that connects regional talent with global companies. In the chat, I was told to reapply because they were swamped with applications.</p>
<p>My second application - same resume, same qualifications - got reviewed. I still went through their full interview process, including the paid trial.</p>
<p>But something changed between attempt one and attempt two. Maybe someone noticed I’d engaged with their Caribbean recruitment initiative. Maybe they were being more deliberate about reviewing applications from that region. I don’t know exactly what happened.</p>
<p><strong>What I do know:</strong> the same application that was ignored the first time succeeded the second time, not because my credentials improved, but because of context that existed outside the application itself.</p>
<h2 id="heading-the-realization-i-built-for-the-wrong-workflow">The Realization: I Built for the Wrong Workflow</h2>
<p>Here's what hit me while reading this research: <strong>I built a platform to help developers apply better. But what if applying isn't the strategy anymore</strong>?</p>
<p>Interview transparency still matters. Geographic restrictions still matter. Salary ranges still matter. But maybe they matter for a different reason than I thought.</p>
<p>Instead of: "Here's a job you should apply to, and here's what the interview will be like."</p>
<p>It should be: "Here's a company worth targeting, and here's the research you need to reach out directly."</p>
<p>Let me show you what I mean. Take Automattic's <a target="_blank" href="https://jobs.alleyne.dev/job/experienced-engineer-automattic">Experienced Software Engineer role on GlobalRemote</a>. Right now, it shows:</p>
<ul>
<li><p>Salary: $70K-$170K USD (transparent range based on location and experience)</p>
</li>
<li><p>Location: Fully remote</p>
</li>
<li><p>Interview: Take-home test, paid trial project with actual team (6-8 weeks total)</p>
</li>
</ul>
<p>That's useful for applications. But for direct outreach, you'd want to know:</p>
<ul>
<li><p>Automattic is fully async with 1400+ employees distributed globally</p>
</li>
<li><p>They publish company culture and processes openly</p>
</li>
<li><p>They contribute actively to <a target="_blank" href="http://WordPress.org">WordPress.org</a> open source</p>
</li>
<li><p>Their paid trial process proves they value your time and judge real work</p>
</li>
<li><p>Hiring managers are identifiable on LinkedIn</p>
</li>
</ul>
<p>That's not application information. That's research ammunition for a targeted outreach campaign.</p>
<h2 id="heading-what-job-seekers-actually-need">What Job Seekers Actually Need</h2>
<p>The developer who sent 1,147 applications eventually figured this out. His successful strategy combined:</p>
<ul>
<li><p>Researching companies worth targeting (not just any open role)</p>
</li>
<li><p>Finding recruiter and hiring manager contacts</p>
</li>
<li><p>Sending personalized outreach emails</p>
</li>
<li><p>Negotiating with competing offers</p>
</li>
</ul>
<p>Applications alone weren't enough — 1,147 applications got him 47 phone screenings. But direct outreach changed his results: 400 emails generated 16 interviews from just 62 responses.</p>
<h2 id="heading-rethinking-globalremote">Rethinking GlobalRemote</h2>
<p>This research has me questioning some fundamental assumptions about what I'm building.</p>
<p>What if the value isn't helping people apply better, but helping them research companies worth targeting? The interview transparency wouldn't just be nice-to-have information — it could be proof that a company respects candidates enough to be transparent. That's a signal worth targeting.</p>
<p>The geographic restrictions wouldn't just be filters—they could be indicators of company culture. A truly distributed company that hires globally is fundamentally different from one that requires US timezone overlap. That distinction might matter when you're choosing where to invest your networking energy.</p>
<p>I don't know yet if this reframing makes sense. I've only been at this for five weeks.</p>
<h2 id="heading-the-bigger-question">The Bigger Question</h2>
<p>This research raises uncomfortable questions about what I'm building. If applications don't work, why optimize application experiences? If most hires come from referrals and outreach, what role should a job board play?</p>
<p>I don't have answers yet. I'm five weeks into this journey, with 26 jobs across 11 companies and minimal traffic. The validation from developers suggests the problem I identified is real. But maybe I'm solving it from the wrong angle.</p>
<p>Maybe the future isn't better job boards. Maybe it's better research tools. Or maybe there's a way to do both — help people find opportunities while acknowledging that "apply" might not be the best next step.</p>
<h2 id="heading-what-this-means-for-developers-job-hunting-right-now">What This Means for Developers Job Hunting Right Now</h2>
<p>If you're sending hundreds of applications and hearing nothing back, you're not alone. The system is genuinely broken. But here's what seems to be working:</p>
<p><strong>Stop spraying applications.</strong> Pick 10-20 companies that genuinely interest you and research them deeply. Understand their culture, read their blogs, identify their pain points.</p>
<p><strong>Find the humans.</strong> Bypass automated filters. Find engineering managers, recruiters, or senior engineers at these companies on LinkedIn. Send personalized messages that show you understand what they're building.</p>
<p><strong>Use every tool available.</strong> This isn't cheating — it's survival. If applications worked, you'd use them. Since they don't, use LinkedIn, use email, use your network, use whatever gets you in front of actual humans.</p>
<p><strong>Think like a researcher, not an applicant.</strong> Your goal isn't to apply to more jobs. It's to identify companies worth targeting and figure out how to reach them directly.</p>
<p><strong>Key Takeaways for Job Seekers:</strong></p>
<ul>
<li><p>Companies receive 1,000+ applications per role but only 10% are qualified</p>
</li>
<li><p>Most hires come from direct outreach and referrals, not applications</p>
</li>
<li><p>Research 10-20 target companies deeply instead of mass-applying</p>
</li>
<li><p>Reach out directly to hiring managers on LinkedIn</p>
</li>
<li><p>Interview transparency matters — but for researching companies, not just applying</p>
</li>
</ul>
<h2 id="heading-where-i-go-from-here">Where I Go From Here</h2>
<p>I'm sitting with this tension. The 26 jobs I've curated show companies that are transparent about interviews, clear about geographic policies, and honest about compensation. That information has value — I'm just not sure yet if it's most valuable as "here's where to apply" or "here's what to research for direct outreach."</p>
<p>Maybe it's both. Maybe some developers still want to apply, and transparency helps them choose wisely. Maybe others want to skip applications entirely, and the same information helps them identify targets for outreach.</p>
<p>I'm going to keep curating companies and watching how people actually use GlobalRemote. The data from hiring managers is clear, but I need to see how job seekers respond before making any dramatic changes.</p>
<p>If you're job hunting right now, I'd encourage you to think about your job search differently. Don't just ask "How many applications should I send?" Also ask "Which 10 companies are worth my focused attention, and how do I reach the people who make hiring decisions there?"</p>
<p>The application black hole is real. Whether the solution is better applications, direct outreach, or some combination of both — I'm still figuring that out.</p>
<hr />
<p><em>I'm working through these questions in real-time. If you've had success with direct outreach, or if you have thoughts on what job seekers actually need right now, I'd genuinely love to hear about it. You can check out GlobalRemote at</em> <a target="_blank" href="https://jobs.alleyne.dev"><em>jobs.alleyne.dev</em></a> <em>to see what I've built so far, or reach out to me on</em> <a target="_blank" href="https://linkedin.com/in/damienalleyne"><em>LinkedIn</em></a><em>.</em></p>
]]></content:encoded></item><item><title><![CDATA[From Interview Surprise to MVP: Testing Developer Job Transparency]]></title><description><![CDATA[A few weeks ago, I applied for a Backend Engineering role, expecting a technical interview similar to what I had experienced before—like the collaborative pair programming session I had when joining Automattic, where a developer and I worked together...]]></description><link>https://blog.alleyne.dev/from-interview-surprise-to-mvp-testing-developer-job-transparency</link><guid isPermaLink="true">https://blog.alleyne.dev/from-interview-surprise-to-mvp-testing-developer-job-transparency</guid><category><![CDATA[remote work]]></category><category><![CDATA[job search]]></category><category><![CDATA[developer experience]]></category><category><![CDATA[Build In Public]]></category><dc:creator><![CDATA[Damien Alleyne]]></dc:creator><pubDate>Mon, 15 Sep 2025 11:23:43 GMT</pubDate><content:encoded><![CDATA[<p>A few weeks ago, I applied for a Backend Engineering role, expecting a technical interview similar to what I had experienced before—like the collaborative pair programming session I had when joining Automattic, where a developer and I worked together to add features to a WordPress plugin. Instead, I received a 90-minute coding assessment with a mix of LeetCode-style and domain-specific questions, scheduled within a 5-day window. A few days later, another interview surprised me again—I showed up expecting a discussion about my experience but was handed a piece of paper and asked to design an API response.</p>
<p>This made me wonder: <strong>Why are developers going into interview processes without knowing what to expect?</strong></p>
<h2 id="heading-the-problem-i-experienced">The Problem I Experienced</h2>
<p>Most job boards tell you the salary, tech stack, and basic requirements. Company culture information is usually found on their career pages or sites like Glassdoor. But they rarely mention what the interview process actually looks like. Will there be <a target="_blank" href="https://they.whiteboarded.me/">whiteboarding</a>? Take-home projects? Multiple rounds? System design sessions?</p>
<p>As a senior developer with options, knowing the interview format helps me:</p>
<ul>
<li><p>Allocate preparation time effectively</p>
</li>
<li><p>Choose opportunities that align with my strengths</p>
</li>
<li><p>Avoid processes that feel misaligned with my experience</p>
</li>
</ul>
<p>But this information is typically discovered during the interview process itself — too late to make informed decisions about time investment. Some companies do provide excellent transparency (detailed assessment instructions, preparation resources, and clear reapplication timelines), but they're the exception. Most companies leave you guessing about format, expectations, and preparation requirements until you're already committed to their process.</p>
<h2 id="heading-building-an-mvp-to-test-the-idea">Building an MVP to Test the Idea</h2>
<p>I decided to build something to test whether other developers want this transparency. Enter <a target="_blank" href="https://jobs.alleyne.dev">GlobalRemote</a> — a job board that shows interview processes upfront.</p>
<p>The initial concept was simple: curate remote jobs that include detailed interview process information alongside the usual job details.</p>
<h3 id="heading-what-i-built">What I Built</h3>
<p>Rather than scraping existing job boards, I chose manual curation to ensure data quality:</p>
<p><strong>Data Sources:</strong></p>
<ul>
<li><p>Company career pages and hiring documentation</p>
</li>
<li><p>Direct outreach to recruiters for process clarification</p>
</li>
<li><p>Cross-referencing with resources like the <a target="_blank" href="https://github.com/poteto/hiring-without-whiteboards">hiring-without-whiteboards GitHub repo</a></p>
</li>
<li><p>Employee posts on platforms discussing company interview experiences</p>
</li>
</ul>
<p><strong>Information Collected:</strong></p>
<ul>
<li><p>Number of interview rounds and types (technical, behavioral, system design)</p>
</li>
<li><p>Time commitment expectations for assessments</p>
</li>
<li><p>Specific technologies or frameworks tested</p>
</li>
<li><p>Any unique aspects of their process</p>
</li>
</ul>
<p><strong>Current Status:</strong> 7 manually verified listings with complete interview transparency</p>
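<p>Behind the scenes, each curated listing boils down to a small record. Here's a sketch of the kind of structure involved, with illustrative field names rather than GlobalRemote's actual schema:</p>

```python
from dataclasses import dataclass, field

@dataclass
class InterviewStage:
    kind: str             # "technical", "behavioral", "system design", ...
    time_commitment: str  # e.g. "90-minute assessment", "take-home, ~4 hours"

@dataclass
class JobListing:
    company: str
    title: str
    salary_range: str
    geographic_restriction: str
    stages: list[InterviewStage] = field(default_factory=list)
    technologies_tested: list[str] = field(default_factory=list)
    notes: str = ""

# A made-up listing showing how the collected information fits together.
listing = JobListing(
    company="Example Co",
    title="Senior Backend Engineer",
    salary_range="$120k+ USD",
    geographic_restriction="UTC-5 to UTC+2 overlap required",
    stages=[InterviewStage("technical", "90-minute assessment")],
    technologies_tested=["Python", "PostgreSQL"],
)
print(f"{listing.company}: {len(listing.stages)} interview stage(s)")
```

<p>The hard part isn't the data model; it's filling those fields accurately from the inconsistent sources listed above, which is why each of the 7 listings is manually verified.</p>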
<h3 id="heading-problems-i-discovered-while-building">Problems I Discovered While Building</h3>
<p>As I started sharing the concept with other developers, two additional pain points kept coming up:</p>
<p><strong>Geographic Restrictions:</strong> Multiple people mentioned getting excited about "remote" jobs only to discover hidden location requirements during the application process. Some jobs require specific timezone overlap, others have legal restrictions, and some companies prefer certain regions for operational reasons.</p>
<p><strong>Regional Salary Context:</strong> A developer friend in New Zealand pointed out that my original $120k+ USD threshold excluded competitive senior roles in his market. A $90k USD role in Auckland might be excellent locally, even if it wouldn't meet US market expectations.</p>
<p>These insights taught me that the problem space was broader than just interview transparency.</p>
<h2 id="heading-early-feedback-and-response">Early Feedback and Response</h2>
<p>I've been testing this concept in developer communities to see if it resonates:</p>
<p><strong>Developer Community (WhatsApp):</strong></p>
<ul>
<li><p>11 positive reactions</p>
</li>
<li><p>Discussion focused primarily on geographic restriction frustrations</p>
</li>
<li><p>One business-oriented suggestion about data reporting potential</p>
</li>
<li><p>Strong engagement suggesting the concept resonates</p>
</li>
</ul>
<p><strong>Tech Alumni Network (Slack):</strong></p>
<ul>
<li><p>8 positive reactions</p>
</li>
<li><p>One person confirmed experiencing the geographic restriction problem and mentioned facing similar challenges during their recent job search</p>
</li>
<li><p>Another mentioned that LeetCode-style processes feel "lazy" from companies</p>
</li>
<li><p>Generally positive response to the transparency concept</p>
</li>
</ul>
<h2 id="heading-technical-challenges-im-learning-from">Technical Challenges I'm Learning From</h2>
<p>Building this MVP has revealed several interesting technical challenges:</p>
<p><strong>Data Verification Complexity:</strong> Interview processes are described inconsistently across companies. "Technical interview" could mean live coding, whiteboarding, take-home projects, or system design discussions. This required careful manual interpretation to understand what each company actually does.</p>
<p><strong>Geographic Restriction Nuance:</strong> "Remote work" has many different meanings depending on company policy. Some companies require timezone overlap (e.g., "must work EST hours"), others have legal restrictions (e.g., "US/Canada only"), and some have operational preferences. Parsing and categorizing these restrictions requires understanding both company policies and legal frameworks.</p>
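<p>To illustrate the categorization problem, here's a toy keyword-based classifier for location notes. It's a deliberately naive sketch (the category names and keywords are mine, and a real pipeline needs far more nuance than keyword matching):</p>

```python
import re

# Hypothetical categories for "remote" restrictions, per the paragraph above.
TIMEZONE = "timezone overlap"
LEGAL = "legal restriction"
PREFERENCE = "operational preference"
UNRESTRICTED = "fully remote"

def categorize_restriction(text: str) -> str:
    """Rough keyword-based categorization of a posting's location note."""
    t = text.lower()
    if re.search(r"\b(est|pst|cet|utc|timezone|time zone|overlap|hours)\b", t):
        return TIMEZONE
    if re.search(r"\b(only|must reside|legal|work authorization|visa)\b", t):
        return LEGAL
    if re.search(r"\b(prefer|preferred|ideally)\b", t):
        return PREFERENCE
    return UNRESTRICTED

print(categorize_restriction("Must work EST hours"))           # timezone overlap
print(categorize_restriction("US/Canada only"))                # legal restriction
print(categorize_restriction("We prefer candidates in EMEA"))  # operational preference
```

<p>Even this toy version shows why the distinction matters to job seekers: a timezone requirement is negotiable for night owls, a legal restriction usually isn't, and a preference is worth a conversation.</p>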
<p><strong>Quality vs. Scale Trade-offs:</strong> I could scrape thousands of job listings quickly, but most would lack the interview process details that make the platform valuable. Manual curation ensures quality but limits scale — at least initially.</p>
<h2 id="heading-what-im-learning-about-developer-pain-points">What I'm Learning About Developer Pain Points</h2>
<p><strong>Developers immediately understand the value proposition.</strong> The feedback suggests this is a real pain point, not just my personal frustration.</p>
<p><strong>Geographic restrictions are more frustrating than I initially thought.</strong> The distinction between timezone requirements, legal restrictions, and operational preferences matters to job seekers.</p>
<p><strong>Interview format transparency could change application behavior.</strong> Some developers mentioned they would self-select out of processes that don't align with their strengths or preferences.</p>
<h2 id="heading-is-this-problem-worth-solving">Is This Problem Worth Solving?</h2>
<p>The early feedback suggests there's genuine demand for this transparency, but I need more data points to be confident. The real test is whether developers would actually change their job search behavior based on this information being available upfront.</p>
<p>If interview transparency and geographic clarity would genuinely help you make better decisions about where to spend your time applying and preparing, then this is worth building. If it's just "nice to have" information that wouldn't actually change your behavior, then it's probably not worth the effort.</p>
<h2 id="heading-what-i-need-from-you">What I Need from You</h2>
<p>I'm at the point where I need broader feedback from developers to decide whether to continue developing this or pivot to other opportunities.</p>
<p><strong>If you're a developer who's job hunting or has recently job hunted:</strong></p>
<ul>
<li><p>Have you experienced similar interview process surprises?</p>
</li>
<li><p>Would knowing the interview format upfront change how you apply to jobs?</p>
</li>
<li><p>What other information would be valuable to know before applying?</p>
</li>
<li><p>Does the geographic restriction transparency resonate with your experience?</p>
</li>
</ul>
<p>The engagement so far suggests there's something here, but I want to validate this with more developers before investing significant time in scaling the platform.</p>
<p><strong>Check out the current MVP:</strong> <a target="_blank" href="https://jobs.alleyne.dev">jobs.alleyne.dev</a></p>
<p>Right now, I'm genuinely uncertain whether this problem is significant enough to warrant a dedicated solution, or if it's better solved through incremental improvements to existing platforms. Your feedback as developers who experience this problem firsthand will help me make that decision.</p>
<hr />
<p><em>What's been your experience with interview process transparency during job searches? I'd love to hear your stories and thoughts in the comments.</em></p>
]]></content:encoded></item><item><title><![CDATA[Essential Course Notes on Cloud Native Application Architecture]]></title><description><![CDATA[A few months ago, I signed up for a course to help me understand the responsibilities of a Software Architect, and the projects mentioned in the syllabus were exactly what I was looking for. Refactoring a Monolith is the most difficult Udacity projec...]]></description><link>https://blog.alleyne.dev/essential-course-notes-on-cloud-native-application-architecture</link><guid isPermaLink="true">https://blog.alleyne.dev/essential-course-notes-on-cloud-native-application-architecture</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[vagrant]]></category><category><![CDATA[Cloud Computing]]></category><dc:creator><![CDATA[Damien Alleyne]]></dc:creator><pubDate>Mon, 25 Oct 2021 06:27:55 GMT</pubDate><content:encoded><![CDATA[<p>A few months ago, I signed up for a <a target="_blank" href="https://www.udacity.com/course/cloud-native-application-architecture-nanodegree--nd064">course</a> to help me understand the responsibilities of a Software Architect, and the projects mentioned in the <a target="_blank" href="https://d20vrrgs8k4bvw.cloudfront.net/documents/en-US/Cloud+Native+Application+Architecture+Nanodegree+Program+Syllabus.pdf">syllabus</a> were exactly what I was looking for. <a target="_blank" href="https://github.com/udacity/nd064-c2-message-passing-projects-starter/">Refactoring a Monolith</a> is the most difficult Udacity project that I've completed, so I'm making a record for myself and future students.</p>
<h3 id="heading-setting-up-the-developer-environment">Setting up the developer environment</h3>
<h4 id="heading-kubernetes">Kubernetes</h4>
<p>The project's setup instructions call for using Vagrant to create a virtual machine that automatically installs <code>k3s</code> and forwards the necessary ports for accessing the cluster. If you already have a local Kubernetes instance installed, this step can be skipped. I already had Docker Desktop installed, which includes Kubernetes; it simply needed to be <a target="_blank" href="https://docs.docker.com/desktop/kubernetes/#enable-kubernetes">enabled</a>. Once enabled, the project's deployment folder can be applied to the local cluster:</p>
<pre><code class="lang-bash">kubectl apply -f deployment/
</code></pre>
<p>Using an existing cluster avoids having to <a target="_blank" href="https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/">manage access to multiple clusters</a>. Once the project <strong>works</strong>, you can use Vagrant to create your VM and debug with confidence, knowing that your installation was sound. The issues I typically encountered involved incompatibilities between <code>k3s</code> and the VirtualBox OS images.</p>
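<p>If you do end up with more than one cluster in your kubeconfig, switching between them is a matter of selecting the right context. A quick sketch (the context name <code>docker-desktop</code> is what Docker Desktop registers by default; yours may differ):</p>
<pre><code class="lang-bash"># List every context defined in your kubeconfig
kubectl config get-contexts

# Point kubectl at the Docker Desktop cluster
kubectl config use-context docker-desktop

# Confirm which cluster subsequent commands will target
kubectl config current-context
</code></pre>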
<h4 id="heading-skaffold">Skaffold</h4>
<p>Any project that involves local kubernetes development should use <a target="_blank" href="https://skaffold.dev/">skaffold</a>. From the documentation:</p>
<blockquote>
<p>Skaffold is a command line tool that facilitates continuous development for Kubernetes-native applications. Skaffold handles the workflow for building, pushing, and deploying your application, and provides building blocks for creating CI/CD pipelines.</p>
</blockquote>
<div class="embed-wrapper"><a class="embed-card" href="https://youtu.be/8_Ozfa7JLEs">https://youtu.be/8_Ozfa7JLEs</a></div>
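<p>The day-to-day loop is essentially one command. A minimal sketch of the workflow, assuming a <code>skaffold.yaml</code> already exists at the project root:</p>
<pre><code class="lang-bash"># Generate a skaffold.yaml by inspecting the project (one time)
skaffold init

# Build, deploy, and redeploy on every file change;
# tears everything down again on Ctrl-C
skaffold dev

# Build and deploy once, without watching for changes
skaffold run
</code></pre>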
<p>Near the end of your development, you can then use skaffold to push the latest images to your docker hub account. One issue I encountered was that the images pushed to docker hub had tags <em>other than</em> <code>latest</code>. I had to include the following in my <code>skaffold.yaml</code> build section. </p>
<pre><code class="lang-yaml">  <span class="hljs-attr">tagPolicy:</span>
    <span class="hljs-attr">sha256:</span> {}
</code></pre>
<p>An example of how it looks can be seen <a target="_blank" href="https://skaffold.dev/docs/pipeline-stages/taggers/#example-5">here</a>. </p>
<p><strong>Very important</strong>: set the <strong>imagePullPolicy</strong> in your deployment files to <a target="_blank" href="https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy">IfNotPresent</a>. This allows your local cluster to use the images directly built by skaffold, rather than pulling them from Docker Hub.</p>
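<p>In practice that means each container spec in your deployment manifests looks something like the sketch below (the deployment and image names here are placeholders, not the project's actual names):</p>
<pre><code class="lang-yaml">apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-api
  template:
    metadata:
      labels:
        app: example-api
    spec:
      containers:
        - name: example-api
          image: yourdockerhubuser/example-api
          # Use the image skaffold just built locally if it exists,
          # instead of always pulling from Docker Hub
          imagePullPolicy: IfNotPresent
</code></pre>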
]]></content:encoded></item></channel></rss>