Role at a Glance

Tech Lead for a distributed team of 2–10 devs at Cartrack Singapore — a SaaS fleet telematics platform under Karooooo Ltd (NASDAQ: KARO) with 2.5M+ active subscribers across 20+ countries. ~80% leadership and coordination, ~20% hands-on coding. You own technical scope, timelines, code reviews, and cross-functional alignment with Product, Design, and QA.

GPS Live Tracking · Driver Behaviour · Fuel Analytics · Geofencing · Route Optimisation · Asset Tracking · Delivery Management · Multi-tenant SaaS
Must-Have
  • Team management (5+ yrs, 2–10 devs)
  • Language & stack agnostic (15+ yrs)
  • SQL — writing, optimising, DB design
  • Linux — scripting, system management
  • Git — branching, merging, workflows
Nice-to-Have
  • PHP / C# backend
  • TypeScript + React frontend
  • PostgreSQL query optimisation
  • Docker / Kubernetes
  • CI/CD pipelines (GitLab)
Interview Readiness by Topic
  • Team leadership & delivery: highest weight
  • SQL & database design: high weight
  • Linux & scripting: medium-high
  • Git workflows: medium
  • Fleet/telematics domain knowledge: differentiator
  • Stack nice-to-haves (PHP, React, Docker): bonus

Leadership

Mentoring, delivery management, code review, removing blockers.

8 Q&As

SQL & Databases

Query writing, EXPLAIN ANALYZE, index types, schema design, multi-tenancy, migrations.

10 Q&As + code

Linux

System diagnostics, scripting, log management, service troubleshooting.

5 Q&As + code

Git & Workflows

Branching strategies, rebase vs merge, hotfixes, PR standards, bisect.

8 Q&As + code

Fleet / Telematics

GPS ingestion pipeline, geofencing at scale, driver scoring, multi-tenancy.

4 system design Qs

Scenarios

Incident response, scope negotiation, legacy codebases, questions to ask.

7 scenarios

Docker / K8s

Images, layers, K8s architecture, Deployments vs StatefulSets, probes.

6 Q&As + code

CI/CD & GitLab

Pipeline design, GitLab CI yaml, secrets, runners, environments.

4 Q&As + full YAML

Agile & Scrum

Roles, all five ceremonies, DoD vs acceptance criteria, Kanban vs Scrum.

6 Q&As

If You Don't Know

4-step framework, ready-to-use phrases, what never to say.

5 response templates
Mentoring & managing developers
Q1

Walk me through how you structured 1-on-1s and growth plans for junior developers.

Strong answer structure: Describe a regular cadence (weekly or bi-weekly), what you cover (blockers, growth goals, project feedback), and how you adapted it per individual.
Example: "I ran weekly 30-minute 1-on-1s. First half was a safe space for whatever was on their mind. Second half was structured — I maintained a shared doc with each developer tracking their current goal, blockers, and one skill we were actively building. For a junior dev who struggled writing testable code, we'd spend 10 minutes reviewing a PR together and I'd explain the reasoning. Within three months, their review-to-merge ratio improved significantly and they started flagging design issues before I did."
Tip: Show you adapted your style per person — not one-size-fits-all. Mention you tracked progress over time, not just had the meetings.
Q2

Describe a situation where a developer was underperforming. How did you approach it?

Key elements: Early detection, private direct feedback, a clear improvement plan with measurable goals, follow-through.
"I noticed a mid-level developer consistently delivering features with high defect rates and missing estimates. I didn't wait for a formal review — I scheduled a private conversation, framed it around patterns I observed (not personal criticism), and asked if something was blocking them. Turns out they'd been guessing at requirements rather than asking questions. We agreed on a 30-day plan: they would confirm requirements in writing before coding, and I'd review their understanding at task start. After two sprints, defect rate dropped by 60%."
Never say you "let HR handle it." Show you took ownership. Also show empathy — ask what's blocking them before concluding it's capability.
Q3

How do you handle a team member who strongly disagrees with a technical decision you've made?

"I welcome pushback — it's a signal the person cares and is thinking. I'd ask them to walk me through their concern in detail and engage with the actual argument. If they raised a valid point I'd missed, I'd change the decision and acknowledge it publicly. If I still disagreed after hearing them out, I'd explain my reasoning clearly, acknowledge their concern was legitimate, and ask them to commit to the decision for one iteration so we can evaluate the results together. People accept a decision they disagreed with much more readily when they feel genuinely heard."
Tip: Don't position yourself as always right. Interviewers want intellectual humility paired with decisiveness.
Q4

How do you balance giving autonomy to experienced devs while keeping visibility?

"For senior developers, I'd agree upfront on what I needed to see: major design decisions before implementation, blockers immediately, a brief status note at end of week. Everything else was theirs. In code review I'd focus more on architecture and knowledge-sharing rather than style nitpicks. The key is making it easy for them to pull you in rather than you pushing into their work."
Technical leadership & delivery
Q5

How do you define technical scope at the start of a project?

Framework: Requirements → constraints → design → risks → resource estimate → sign-off
"I'd start with a requirements session with Product — not just what we're building but why, who the user is, and what success looks like. Then I'd run a technical design meeting with the team to map the system — data flows, APIs, DB schema, external dependencies. I'd document this in a lightweight technical spec: problem, proposed solution, alternatives considered, timeline, dependencies, and what's explicitly out of scope. Out of scope is as important as in scope."
Tip: Mention you involve the team in scoping — not just you dictating. This shows you build shared ownership of commitments.
Q6

With less than 20% time for hands-on coding, how do you stay technically credible?

"Credibility comes from depth in code reviews, quality of technical questions in design discussions, and occasionally unblocking someone with a targeted PR rather than owning large features. I stay current by reviewing architecture decisions closely, pairing with developers on hard problems, and deliberately picking up small but complex tasks — things that touch the core of the system. I also read broadly: technical blogs, postmortems, and adjacent team architecture decisions."
Q7

What does a constructive code review look like for you?

"I look at correctness first, then readability, then edge cases, then performance where it matters. I label my comments: 'blocking' means it must change, 'suggestion' means I'd do it differently but it's not a blocker, 'question' means I want to understand their choice. I avoid vague feedback like 'this is confusing' — I say what specifically confuses me and why. For junior developers I'll often include a code snippet showing the alternative. I also try to call out what's done well — reviews shouldn't just be a list of problems."
Q8

How do you coordinate between engineering, QA, and product when requirements change mid-sprint?

"First, I'd assess impact: is this a clarification or a scope change? If it's genuine scope creep, I'd document the delta and the cost — additional days, testing time, risk — and bring it to the product owner for a decision: do we cut something else, push the timeline, or reduce quality? I'd never silently absorb scope changes. For QA, I'd keep a shared test plan updated and flag any new acceptance criteria as soon as requirements shift. Communication is the fix — not heroics."
Query writing & optimisation
SQL

JOINs — how they actually work

INNER, LEFT, RIGHT — and the algorithms behind them

When you write a JOIN, Postgres doesn't magically combine tables; it chooses a join algorithm based on table sizes and available indexes. Nested Loop Join: for each row in the outer table, scan the inner table for matches; efficient when the outer side is small and the inner join key is indexed. Hash Join: build a hash table from the smaller input, then probe it with the larger one; great for large tables with no useful index. Merge Join: both inputs must be sorted on the join key first; very efficient when an index or an earlier sort already provides that order.
SELECT e.name, d.dept_name, COUNT(p.id) AS projects
FROM employees e
LEFT JOIN departments d  ON e.dept_id = d.id
LEFT JOIN projects p     ON e.id = p.owner_id
GROUP BY e.name, d.dept_name
HAVING COUNT(p.id) > 2;
-- LEFT JOIN keeps ALL rows from employees even with no match (NULLs)
-- INNER JOIN would silently drop employees with no dept or projects
Interviewer questions
QWhat's the difference between WHERE and HAVING? Why can't you use aggregates in WHERE?
WHERE filters individual rows before they are grouped. HAVING filters groups after GROUP BY has been applied. Aggregates like COUNT and SUM don't exist yet at the WHERE stage — rows haven't been grouped so there's nothing to aggregate. The SQL execution order is: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY. So if you want to filter based on an aggregate (e.g. "only departments with more than 5 employees"), you must use HAVING.
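A minimal pair showing that execution order, reusing the employees/departments tables from the JOIN example above (the status column is illustrative):

```sql
-- WHERE filters rows before grouping; HAVING filters the groups after
SELECT d.dept_name,
       COUNT(e.id) AS headcount
FROM   departments d
JOIN   employees   e ON e.dept_id = d.id
WHERE  e.status = 'active'         -- row-level filter, runs before GROUP BY
GROUP  BY d.dept_name
HAVING COUNT(e.id) > 5;            -- group-level: "departments with > 5 employees"
```

Moving the HAVING condition into WHERE fails with an error, because COUNT doesn't exist yet at that stage.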
CTE

CTEs and window functions

WITH clauses, RANK, LAG, LEAD, SUM OVER

CTEs let you write a named subquery at the top of your statement. Improves readability and lets you reference the result multiple times. Window functions compute a value for each row based on a "window" of related rows — without collapsing rows like GROUP BY would. OVER() defines the window: PARTITION BY divides into groups, ORDER BY determines frame order.
-- Find top earner per department
WITH top_earners AS (
  SELECT *, RANK() OVER (
    PARTITION BY dept_id ORDER BY salary DESC
  ) AS rnk
  FROM employees
)
SELECT name, dept_id, salary FROM top_earners WHERE rnk = 1;

-- Rank each driver's worst speeding event per month
SELECT * FROM (
  SELECT driver_id,
         DATE_TRUNC('month', event_time) AS month,
         speed - speed_limit AS overspeed,
         RANK() OVER (
           PARTITION BY driver_id, DATE_TRUNC('month', event_time)
           ORDER BY speed - speed_limit DESC
         ) AS rnk
  FROM speeding_events
) t WHERE rnk = 1;
Interviewer questions
QWhat are window functions and how are they different from GROUP BY?
GROUP BY collapses multiple rows into a single summary row per group. Window functions compute a value for each row while still keeping all rows intact. For example, AVG(salary) GROUP BY dept_id gives one row per department. AVG(salary) OVER (PARTITION BY dept_id) keeps every employee row but adds the department average alongside — letting you compare each employee to their department average in the same query. Window functions are perfect for rankings, running totals, moving averages, and comparing consecutive rows with LAG/LEAD.
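Side by side, with the same employees table used throughout this section:

```sql
-- GROUP BY collapses: one summary row per department
SELECT dept_id, AVG(salary) AS dept_avg
FROM   employees
GROUP  BY dept_id;

-- Window version keeps every employee row and adds the average alongside,
-- so each row can be compared to its department in a single pass
SELECT name, dept_id, salary,
       AVG(salary) OVER (PARTITION BY dept_id)          AS dept_avg,
       salary - AVG(salary) OVER (PARTITION BY dept_id) AS vs_dept_avg
FROM   employees;
```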
EXP

EXPLAIN ANALYZE — reading query plans

How to identify slow queries and understand what Postgres is doing

EXPLAIN shows the query plan Postgres intends to use. EXPLAIN ANALYZE actually executes the query and shows actual timings alongside estimates. Key things to look for: Seq Scan on a large table (usually bad), large mismatch between estimated and actual rows (stale statistics — run ANALYZE), and high cost nodes at the top of the plan tree.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;

-- BAD output: Seq Scan walks the whole table just to return 12 rows
-- Seq Scan on orders  (cost=0.00..3421.00 rows=12)
--                     (actual time=0.021..45.2 rows=12)

-- GOOD after index:
-- Index Scan using idx_orders_customer on orders
--   (actual time=0.018..0.031 rows=12)

-- Slow: function prevents index use
WHERE DATE_TRUNC('month', created_at) = '2024-01-01'
-- Fast: range scan uses index
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01'
Interviewer questions
QWalk me through diagnosing and fixing a slow query in Postgres.
I'd start with EXPLAIN ANALYZE to see the actual execution plan. I look for three things: first, whether there's a Seq Scan on a large table — that usually means a missing index. Second, I compare estimated rows to actual rows — a large mismatch means the planner's statistics are stale, so I'd run ANALYZE. Third, I check which node is most expensive.

If it's a missing index, I'd create one on the filtered/joined column. I'm careful not to apply a function on an indexed column in WHERE — things like DATE_TRUNC() prevent the index from being used; rewrite those as range conditions. For complex queries I'd consider a materialised view to pre-compute expensive joins. Finally, I'd re-run EXPLAIN ANALYZE to confirm the improvement.
Bonus: In production, use pg_stat_statements to find top slow queries without instrumenting every call manually.
QFind the top 10 vehicles by average speed in the last 30 days from a 50M-row GPS events table.
SELECT vehicle_id,
       ROUND(AVG(speed), 2) AS avg_speed
FROM   gps_events
WHERE  recorded_at >= NOW() - INTERVAL '30 days'
  AND  speed > 0
GROUP  BY vehicle_id
ORDER  BY avg_speed DESC
LIMIT  10;
Performance at scale: Partition gps_events by month on recorded_at — the 30-day filter hits only 1–2 partitions. Add a composite index on (vehicle_id, recorded_at). Consider a materialised view refreshed nightly for dashboards that don't need real-time data.
Database design
IDX

Index types — B-Tree, GIN, GiST, BRIN, partial, expression

How each one works and when to use it

B-Tree (default)

Balanced tree. Supports =, <, >, BETWEEN, LIKE 'prefix%'. Right choice for almost everything.

GIN

Generalised Inverted Index. For JSONB, arrays, full-text search. Stores each element separately.

GiST

Generalised Search Tree. For geometric types, range types. Used heavily with PostGIS for spatial queries.

BRIN

Block Range Index. Tiny footprint. Only useful for large, physically-ordered tables like append-only time-series.

-- Partial: only index what you query (smaller, faster)
CREATE INDEX idx_active_orders ON orders(status) WHERE status != 'completed';

-- Covering: include extra cols → avoid heap lookup entirely
CREATE INDEX idx_orders_cover ON orders(customer_id) INCLUDE (total, created_at);

-- GIN for JSONB / spatial with PostGIS for geofencing
-- GiST for spatial geofencing with PostGIS (use GIN for JSONB / full-text)
CREATE INDEX idx_geofences_geom ON geofences USING GiST(boundary);
Interviewer questions
QYou have a large orders table. Queries filter by status='pending' frequently but most orders are completed. How would you index this?
I'd use a partial index: CREATE INDEX ON orders(status) WHERE status = 'pending'. A regular index on status would be large because it indexes every row including millions of completed orders. A partial index only indexes the subset I actually query — it's smaller, fits in cache better, and is faster to scan.
QExplain the difference between clustered and non-clustered indexes. When would you avoid adding one?
Clustered index: Determines the physical order of data on disk. One per table (the primary key in most databases). In PostgreSQL, you can use CLUSTER to physically reorder but it's not automatic.

Non-clustered index: A separate structure pointing back to the heap rows. Multiple allowed per table.

Avoid adding an index when: the table is write-heavy (every insert/update maintains all indexes), the column has low cardinality (e.g. a boolean flag), the table is small enough that a full scan is faster, or the column is rarely used in WHERE, JOIN, or ORDER BY.
NRM

Normalization — 1NF to 3NF + when to break the rules

1
1NF — Atomic values
Every column holds a single indivisible value. No arrays, no comma-separated lists, no repeating column groups.
2
2NF — No partial dependencies
Every non-key column must depend on the entire primary key. Only matters with composite PKs. If product_name depends only on product_id (not the full PK), move it to a products table.
3
3NF — No transitive dependencies
Non-key columns must depend only on the primary key, not on other non-key columns. dept_name depending on dept_id (not employee_id) is a transitive dependency — extract it.
When to denormalize deliberately: For read-heavy reporting tables, joins become the bottleneck. Denormalize by duplicating data or use CREATE MATERIALIZED VIEW and refresh with REFRESH MATERIALIZED VIEW CONCURRENTLY to avoid locking readers.
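A sketch of that materialised-view approach against the gps_events table used elsewhere in this guide (the aggregate choices are illustrative):

```sql
CREATE MATERIALIZED VIEW vehicle_daily_summary AS
SELECT vehicle_id,
       DATE_TRUNC('day', recorded_at) AS day,
       AVG(speed)                     AS avg_speed,
       MAX(odometer) - MIN(odometer)  AS distance
FROM   gps_events
GROUP  BY vehicle_id, DATE_TRUNC('day', recorded_at);

-- REFRESH ... CONCURRENTLY requires a unique index on the view
CREATE UNIQUE INDEX ON vehicle_daily_summary (vehicle_id, day);

REFRESH MATERIALIZED VIEW CONCURRENTLY vehicle_daily_summary;
```

Dashboards query the view; the base table is only touched at refresh time.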
SCH

Fleet schema design + multi-tenancy

Design the DB for a fleet management SaaS serving 125,000+ customers

-- Core entities
fleets      (id, customer_id, name)          -- tenancy boundary
vehicles    (id, fleet_id, plate, make, model, status)
drivers     (id, fleet_id, name, licence_no, status)

-- Assignment (many-to-many with time periods)
driver_vehicle_assignments
  (id, driver_id, vehicle_id, assigned_at, unassigned_at)

-- Trips (reconstructed from GPS events)
trips       (id, vehicle_id, driver_id, start_time, end_time,
             distance_km, start_lat, start_lng, end_lat, end_lng)

-- GPS events — PARTITION BY RANGE(recorded_at)
gps_events  (id, vehicle_id, trip_id, recorded_at,
             lat, lng, speed, heading, odometer)

-- Alerts (JSONB for flexible payload)
alerts      (id, vehicle_id, driver_id, alert_type,
             severity, triggered_at, resolved_at, payload jsonb)
Interviewer questions
QHow would you design a multi-tenant database for a SaaS fleet platform with 125,000+ customers?
Three strategies:

1. Shared schema (discriminator column): All customers in same tables with a customer_id. Simple to operate. Risk: one bad query can leak cross-tenant data.

2. Shared DB, separate schemas: Each customer gets their own schema. Better isolation but migrations across 125k schemas are unmanageable.

3. Separate DBs per tenant group: Best isolation, enables per-region data residency. Most operationally complex.

At Cartrack's scale: Strategy 1 (shared schema) with customer_id enforced at the application layer and Row-Level Security (RLS) in PostgreSQL as a safety net. Partition by customer segment for the largest enterprise accounts.
Mention RLS: It enforces tenant isolation at the DB engine level even if application code has a bug — a strong signal of production-grade thinking.
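A minimal RLS sketch on the fleets table from the schema above (the app.current_customer setting name is illustrative):

```sql
ALTER TABLE fleets ENABLE ROW LEVEL SECURITY;
ALTER TABLE fleets FORCE ROW LEVEL SECURITY;   -- apply even to the table owner

CREATE POLICY tenant_isolation ON fleets
  USING (customer_id = current_setting('app.current_customer')::bigint);

-- The application sets the tenant once per connection/transaction:
SET app.current_customer = '42';
-- Every later query on fleets is now silently scoped to customer 42
```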
QHow do you approach zero-downtime database migrations?
Expand-contract pattern:

Step 1 — Expand: Add the new column/table alongside the old one. Deploy code that writes to both.
Step 2 — Backfill: Background job populates the new structure from existing data in batches, not one giant UPDATE.
Step 3 — Migrate reads: Deploy code that reads from the new structure.
Step 4 — Contract: Remove the old column/table and dual-write code.

Also: use CREATE INDEX CONCURRENTLY (never table-locking), never rename columns directly (add new, migrate, drop old), and always test migration rollback before applying in production.
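The expand-contract steps above, sketched for a hypothetical phone-number normalisation (table, column, and helper names are illustrative):

```sql
-- Expand: add the new column; deploy code that writes to both
ALTER TABLE drivers ADD COLUMN phone_e164 TEXT;

-- Backfill in batches, never one table-locking UPDATE
UPDATE drivers
SET    phone_e164 = normalise_phone(phone)   -- hypothetical helper function
WHERE  id IN (SELECT id FROM drivers
              WHERE phone_e164 IS NULL
              LIMIT 10000);

-- Migrate reads: deploy code that reads phone_e164
-- Contract: once nothing references the old column, drop it
ALTER TABLE drivers DROP COLUMN phone;
```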
PostgreSQL-specific features
TXN

Transactions, MVCC & ACID

How Postgres handles concurrent reads and writes without locking everything

Postgres uses MVCC (Multi-Version Concurrency Control) to let readers and writers work simultaneously. Instead of locking a row on update, Postgres creates a new version of the row and marks the old one as expired. Readers keep seeing the version that was current when their snapshot was taken (per statement at READ COMMITTED, per transaction at REPEATABLE READ), which is why reads don't block writes. The downside: expired row versions accumulate as dead tuples. VACUUM reclaims that space; autovacuum does it automatically.
BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT; -- both succeed together, or ROLLBACK to undo both

-- Monitor dead tuples
SELECT relname, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables ORDER BY n_dead_tup DESC;
PTN

Table partitioning

Range/list/hash partitioning and when it helps over just adding an index

Partitioning splits a large logical table into smaller physical sub-tables. Postgres routes queries to the relevant partitions automatically (partition pruning). If you query WHERE recorded_at BETWEEN '2024-01-01' AND '2024-03-31', Postgres scans only the 2024-Q1 partition instead of the full table.
CREATE TABLE gps_events (
  id         BIGSERIAL,
  vehicle_id INT,
  recorded_at TIMESTAMPTZ NOT NULL,
  lat NUMERIC, lng NUMERIC, speed NUMERIC
) PARTITION BY RANGE(recorded_at);

CREATE TABLE gps_events_2024_q1 PARTITION OF gps_events
  FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');

-- Drop old data cheaply — instant, no vacuum needed
DROP TABLE gps_events_2023_q1;
System troubleshooting
CPU

A production server is showing high CPU usage. Walk me through your diagnosis step-by-step.

# Step 1: Identify the offending process
top          # sort by %CPU with 'P'
htop         # more visual; filter by process name
ps aux --sort=-%cpu | head -10

# Step 2: Inspect the process
pidstat -u 1 5    # per-process CPU usage over 5 seconds
systemctl status php-fpm
journalctl -u php-fpm --since "10 minutes ago"

# Step 3: Identify runaway threads or zombie processes
ps -eLf | grep php    # list all threads
ps aux | awk '$8 ~ /^Z/'      # zombie processes (STAT column)

# Step 4: Check for CPU-heavy system calls
strace -p <PID> -c    # summarise system calls

# Also check: cron jobs, sustained vs spike, I/O waits
Commands to know cold: top/htop, iostat, netstat/ss, lsof, strace, journalctl, systemctl, awk, sed, grep, find, crontab, df/du, chmod/chown, scp/rsync
SVC

A service fails silently after a restart. How do you investigate using systemd?

# Check current status and recent logs
systemctl status my-service
journalctl -u my-service -b         # logs since last boot
journalctl -u my-service -f         # follow live
journalctl -u my-service -n 100     # last 100 lines
systemctl cat my-service            # view unit file
systemctl list-dependencies my-service
Common culprits: Missing environment variables (check EnvironmentFile=), wrong file permissions, port already in use (ss -tlnp | grep :8080), or a dependency service not fully started.
LOG

Set up log rotation for a high-volume telemetry service.

# /etc/logrotate.d/telemetry
/var/log/telemetry/*.log {
    daily
    rotate 14
    compress
    delaycompress      # compress yesterday's log, not today's (safe for open handles)
    missingok
    notifempty
    sharedscripts
    postrotate
        systemctl reload telemetry-service
    endscript
}
For containerised workloads: Centralise logs via stdout → log shipper (Filebeat, Fluentd) rather than file-based rotation.
Shell scripting
SH

Write a bash script that monitors a directory for new CSV files and logs results with timestamps.

#!/bin/bash
set -euo pipefail   # exit on error, unbound var, pipe fail

WATCH_DIR="/data/incoming"
LOG_FILE="/var/log/csv-processor.log"
DONE_DIR="/data/processed"

mkdir -p "$DONE_DIR"

log() { echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$LOG_FILE"; }

process_file() {
  local file="$1"
  local rows
  rows=$(tail -n +2 "$file" | wc -l)   # skip header
  log "Processed: $(basename "$file") | Rows: $rows"
  mv "$file" "$DONE_DIR/"
}

log "=== Watcher started ==="

inotifywait -m -e close_write "$WATCH_DIR" --format '%f' |
while read -r filename; do
  if [[ "$filename" == *.csv ]]; then
    process_file "$WATCH_DIR/$filename"
  fi
done
Key points: set -euo pipefail for robust error handling. Use inotifywait instead of polling with sleep loops. Move processed files to avoid double-processing.
AWK

Extract all unique vehicle IDs from a large log file efficiently.

Assuming log lines like: 2024-01-15 GPS vehicle_id=TRK-2041 lat=1.34 lng=103.82
# Option 1: grep + sort + uniq (portable)
grep -oP 'vehicle_id=\K[^\s]+' fleet.log | sort -u

# Option 2: gawk (match() with a capture array is a GNU awk extension)
awk 'match($0, /vehicle_id=([^ ]+)/, a) {print a[1]}' fleet.log | sort -u

# For 10GB+ files: LC_ALL=C forces byte-wise ASCII collation, much faster
grep -oP 'vehicle_id=\K\S+' fleet.log | LC_ALL=C sort -u
Core concepts & commands
HOW

How Git stores data

Commits, trees, blobs — snapshots not diffs

Git doesn't store diffs — it stores snapshots. Each commit is a pointer to the full state of the project at that moment. Internally: blob (file content), tree (directory listing), commit (snapshot + author + parent), tag. Everything is content-addressed by SHA-1. This is why Git history is immutable — changing any commit changes its hash and all subsequent hashes.
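These internals are easy to verify first-hand with Git's plumbing commands in a throwaway repo:

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"; git init -q
git config user.email a@b; git config user.name tester

echo "hello" > readme.txt
git add readme.txt
git commit -qm "first commit"

git cat-file -t HEAD               # object type of the commit
git cat-file -p HEAD               # tree <sha>, author, committer, message
git cat-file -p 'HEAD^{tree}'      # the tree: 100644 blob <sha>  readme.txt
git cat-file -p HEAD:readme.txt    # the blob's content: hello
```

Change one byte of readme.txt and recommit: the blob, tree, and commit hashes all change, which is exactly why history is immutable.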
Interviewer questions
QWhat's the difference between git reset, git revert, and git restore?
git revert: Creates a new commit that undoes a previous commit. Safe for shared branches — doesn't rewrite history. Use this when you need to undo something already pushed.

git reset: Moves the branch pointer backward, rewriting history. --soft keeps changes staged, --mixed keeps them unstaged, --hard discards changes. Only use locally before pushing — breaks teammates' branches if used on shared branches.

git restore: For working directory and staging area only — doesn't touch commits. git restore file.txt discards uncommitted changes. git restore --staged file.txt unstages a file.
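A throwaway-repo walkthrough of all three (self-contained, never touches an existing repo):

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"; git init -q
git config user.email a@b; git config user.name tester

echo v1 > app.txt; git add app.txt; git commit -qm "good: add app.txt"
echo v2 > app.txt; git add app.txt; git commit -qm "bad: break app.txt"

# revert: a NEW commit that undoes "bad"; safe on shared branches
git revert --no-edit HEAD
cat app.txt                   # back to v1; history keeps all three commits

# reset --soft: rewrite local history by dropping the revert commit,
# keeping its changes staged (only do this before pushing)
git reset --soft HEAD~1

# restore --staged: unstage app.txt without touching the file itself
git restore --staged app.txt
```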
Branching strategies
GF

Git Flow vs Trunk-Based Development

Long-lived branches vs always-deployable main

main     ●────────────────────────────────────●──   (production)
           ╲                                 ╱
hotfix      ●───────────────────────────────●       (urgent prod fix)
release          ●──────────────────────●           (stabilise, no new features)
                ╱                        ╲
develop  ●────────────────────────────────────●     (integration branch)
           ╲              ╱
feature     ●────────────●                          (feature/login, feature/api)
Interviewer questions
QCompare Git Flow and trunk-based development. Which would you recommend for Cartrack?
Git Flow: Long-lived branches (main, develop, feature/*, release/*, hotfix/*). Good for versioned software with scheduled releases. Downsides: merge conflicts accumulate, high ceremony, slow feedback loops.

Trunk-Based Development: Everyone integrates to main frequently via short-lived feature branches (1–3 days max). Feature flags control in-progress work. Faster feedback, less merge pain.

For Cartrack (SaaS, continuous deployment): I'd recommend a lightweight trunk-based approach with short-lived feature branches and mandatory CI passing before merge. Feature flags let us deploy dark features safely. Git Flow adds ceremony without benefit when you're deploying to production multiple times a week.
QExplain rebase vs merge. When would you use each with a team of 8 developers?
Merge: Creates a merge commit, preserving the full history of both branches. Safe for shared branches.

Rebase: Replays your commits on top of another branch — creates a linear history. Rewrites commit hashes. Never rebase shared/public branches.

Team policy: For feature branches before opening a MR, rebase onto main to get a clean linear diff that's easier to review. For merging into main, use a merge or squash commit so history is preserved and rollback is clear.
GitLab tip: Use GitLab's Squash Commits option on MR merge — all feature commits collapse into one clean commit on main.
QHow do you handle hotfixes when main has moved significantly ahead of the release branch?
git checkout v2.4.1               # branch from the release tag
git checkout -b hotfix/gps-null-fix
git commit -m "fix: handle null GPS coordinates"

git checkout release/2.4
git merge hotfix/gps-null-fix
git tag v2.4.2

git checkout main
git cherry-pick <commit-hash>     # cherry-pick the fix, don't merge the whole branch
Key point: Cherry-pick the specific commit(s) onto main rather than merging the hotfix branch — you don't want stale release-branch code landing in main. Always tag the hotfix release.
Collaboration & standards
STD

How do you enforce commit and PR standards across a distributed team?

"I'd adopt Conventional Commits as the standard: feat:, fix:, chore:, refactor:, docs: prefixes. This enables automated changelogs. I'd enforce it with a commit-msg hook (commitlint) that CI runs — the pipeline fails if the format is wrong. For MRs, I'd use a GitLab MR template with a checklist: what changed, how it was tested, screenshots for UI changes, and a link to the ticket. Branch protection rules require at least one approval and passing CI before merge."
BST

How do you use git bisect to locate a regression?

git bisect start
git bisect bad                 # current commit has the bug
git bisect good v2.3.0         # this commit was known-good

# Git checks out the midpoint. Test it, then:
git bisect bad   # or
git bisect good

# Automate with a test script (exit 0 = good, non-zero = bad)
git bisect run ./test_script.sh

git bisect reset               # clean up when done
Pro tip: On a repo with 1,000 commits between good and bad, bisect narrows it down in ~10 steps (binary search).
System design for telematics
Q1

How would you design the data pipeline for ingesting real-time GPS telemetry from millions of IoT devices?

1
Device → Ingestion Layer
IoT devices send GPS pings (lat/lng, speed, heading, odometer) every few seconds via MQTT or lightweight TCP. Stateless, horizontally scalable ingestion service receives them.
2
Message Queue (Kafka)
Events go into Kafka. Decouples ingestion from processing, handles traffic spikes, and provides replay capability for reprocessing.
3
Stream Processing
Kafka consumers or Flink jobs process events in real-time: geofence checks, speed violation detection, trip boundary detection.
4
Storage
Raw events → PostgreSQL with monthly partitions. Computed aggregates (trip summaries, driver scores) → relational tables for fast dashboard queries.
5
API Layer
Fleet managers query pre-aggregated data. Real-time tracking uses WebSocket pushing latest vehicle positions.
Key NFRs to mention: Out-of-order event handling (GPS packets can arrive late due to connectivity), idempotency on the processing side, and horizontal scaling on the ingestion and consumer layers.
Q2

How would you design geofencing alerts — detecting when a vehicle enters or exits a zone — at low latency for millions of vehicles?

The challenge: For each incoming GPS event, check if the vehicle crossed any of potentially thousands of geofence boundaries.
Approach:
Spatial indexing: Store geofences in PostgreSQL with the PostGIS extension. Use GiST indexes on the geometry column. A point-in-polygon check with proper indexing is fast even for millions of geofences.
In-memory cache: Load each customer's active geofences into Redis keyed by geohash/grid cell. On incoming GPS event, look up only the geofences relevant to that grid cell.
State tracking: Per vehicle, track "was inside geofence X at last ping?" — store this state to detect transitions (enter/exit) rather than re-evaluating absolute position on every event.
Event-driven alerts: Transition events publish to an alert queue, which then triggers notifications (push, SMS, email) asynchronously.
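The point-in-polygon check might look like this with PostGIS (the geofences table and its GiST-indexed boundary column appear earlier in this guide; the customer_id column is assumed):

```sql
-- Which zones contain this GPS point? ST_MakePoint takes (lng, lat);
-- SRID 4326 is standard WGS84 GPS coordinates
SELECT id
FROM   geofences
WHERE  customer_id = 42
  AND  ST_Contains(boundary,
         ST_SetSRID(ST_MakePoint(103.82, 1.35), 4326));
-- The GiST index filters candidates by bounding box first, so only a
-- handful of polygons ever get the exact containment test
```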
Q3

GPS data can arrive out of order or as duplicates. How do you handle this on ingestion?

Out-of-order events: Each GPS event carries a device timestamp (not server arrival time). Always store and query by device timestamp. Use a short watermark window (e.g. 30 seconds) in the stream processor — hold events briefly to allow late arrivals to catch up. For trip reconstruction, sort events by device timestamp in the computation step regardless of ingestion order.
Duplicate packets: Each event should carry a unique event ID generated on the device (sequence number + device ID). Use an idempotency key on insert — PostgreSQL's ON CONFLICT DO NOTHING or upsert on the event ID. A short-lived Redis set of recently seen event IDs can block duplicates before they even hit the database.
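In SQL terms, the idempotency key might look like this (device_seq is an assumed device-side sequence column; note that on a partitioned table the unique key must include the partition column, here recorded_at):

```sql
CREATE UNIQUE INDEX IF NOT EXISTS uq_gps_event
  ON gps_events (vehicle_id, recorded_at, device_seq);

INSERT INTO gps_events (vehicle_id, recorded_at, device_seq, lat, lng, speed)
VALUES (2041, '2024-01-15 08:00:07+08', 91823, 1.34, 103.82, 62.5)
ON CONFLICT (vehicle_id, recorded_at, device_seq) DO NOTHING;
-- A replayed packet hits the conflict and is silently dropped
```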
Q4

How would you build a driver behaviour scoring system?

Raw events to capture: Speeding (vs road speed limit), harsh braking (deceleration threshold), harsh acceleration, sharp cornering (lateral g-force), excessive idling, seatbelt (from OBD-II).
Scoring model: Start each driver at 100. Each event deducts points based on severity and frequency. Weight events differently: speeding 30+ km/h over limit deducts more than mild speeding. Calculate a rolling score (last 30 days) for operational dashboards and a cumulative score for insurance/compliance.
Storage strategy: Raw events in partitioned driving_events table. Nightly batch job computes the 30-day score and writes to a driver_scores summary table. Dashboards query the summary table — never aggregate raw events on every page load.
Real-time consideration: Provide a live preview score in the driver mobile app computed in the stream processor, separate from the authoritative batch-computed score.
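The deduct-from-100 model above can be sketched like this. The event types mirror the list above, but the weights and severity bands are purely illustrative, not any real scoring model:

```python
# Deductions per event type (illustrative weights).
BASE_DEDUCTIONS = {
    "harsh_braking": 3,
    "harsh_acceleration": 2,
    "sharp_cornering": 2,
    "excessive_idling": 1,
}

def speeding_deduction(kmh_over_limit: float) -> float:
    """Scale the penalty by how far over the limit the driver was."""
    if kmh_over_limit >= 30:
        return 8   # severe
    if kmh_over_limit >= 15:
        return 4
    if kmh_over_limit > 0:
        return 1   # mild
    return 0

def rolling_score(events: list[dict]) -> float:
    """Score over a window of events (e.g. the last 30 days), floored at 0."""
    total = 0.0
    for e in events:
        if e["type"] == "speeding":
            total += speeding_deduction(e["kmh_over"])
        else:
            total += BASE_DEDUCTIONS.get(e["type"], 0)
    return max(0.0, 100.0 - total)
```

In the design described above, this function runs in the nightly batch job over the 30-day window and writes its result to `driver_scores`.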
High-pressure situations
INC

A critical bug causes incorrect GPS locations for 10,000 vehicles. You're the tech lead on call.

1
Acknowledge & communicate (0–5 min)
Confirm the incident, set up a war room (Slack channel), immediately notify stakeholders — support, customer success, engineering leadership. Don't investigate in silence.
2
Contain (5–15 min)
Can we roll back the recent deployment? Can we disable the affected feature for impacted customers? Stop the bleeding before root-causing.
3
Investigate (parallel)
Check recent deployments, config changes, and DB changes. Pull logs for the affected time window. Identify scope — is it all vehicles or a subset?
4
Fix & deploy
Make the smallest safe fix. Don't refactor under pressure. Get a second pair of eyes before deploying.
5
Post-mortem (24–48 hrs later)
Blameless post-mortem. Document timeline, root cause, contributing factors, and action items to prevent recurrence. Share with engineering and stakeholders.
Key phrase to include: "Communication is as important as the fix. Customers and stakeholders can handle bad news — they can't handle silence."
SCP

Product wants to ship a feature in 2 weeks that the team estimates at 6 weeks. How do you handle it?

"I'd never just say 'no, it takes 6 weeks' without engaging in the problem. First I'd ask: what's driving the 2-week deadline — is it a customer commitment, competitive pressure, a conference? Understanding the real constraint often reveals options.

Then I'd come back with options rather than a single answer: (a) Full feature in 6 weeks at current scope. (b) An MVP in 2 weeks that delivers the core value — here's specifically what we'd cut. (c) Full feature in 4 weeks if we add a developer or drop another priority. (d) Feature in 2 weeks but with known technical debt we address in the next sprint.

I'd make the tradeoffs explicit, let product make the call, and document whatever decision is made so there's shared accountability."
Never say you'll just work weekends to make it happen. That signals you'll absorb scope without pushback — a red flag for a lead role.
LGC

You inherit a legacy codebase with no documentation and high technical debt. What do you do?

Phase 1 — Understand before judging (weeks 1–3): Read the code and run it. Understand what it does before deciding it's "bad." Talk to anyone who knows the history. Map the critical paths — what are the highest-risk areas?
Phase 2 — Stabilise: Add tests around the highest-risk and most-changed areas first. You can't safely refactor without tests. Document as you learn — write the README, architecture notes, and runbook you wish existed.
Phase 3 — Incremental improvement: Don't do a big-bang rewrite. Strangle the old code with new patterns over multiple sprints. Budget 20–30% of each sprint for debt reduction. Frame it to stakeholders as reducing the cost of future features, not cleanup for its own sake.
Questions to ask Cartrack

What does the current tech stack look like, and where is the biggest technical debt?

Why ask this: Shows you're thinking about what you're walking into. Their answer signals how honest and self-aware the engineering organisation is. Also confirms whether the stack matches the JD (PHP/C#, React, PostgreSQL, Docker).

What does success look like for this role in the first 90 days?

Why ask this: Gets them to articulate concrete expectations rather than vague desires. Also helps you evaluate if their definition of success is realistic. Listen for: are they expecting you to immediately own delivery, or is there an onboarding period?

How do you balance shipping new features vs maintaining platform reliability?

Why ask this: For a fleet telematics SaaS, uptime is a contractual obligation — a vehicle tracking system going down has immediate business impact. Their answer reveals engineering culture: do they have SLOs? Error budgets? Incident response maturity?
Backend — PHP / C#
API

How would you structure a REST API for a fleet tracking service?

# Resource-oriented URLs
GET    /v1/fleets/{fleet_id}/vehicles
GET    /v1/vehicles/{vehicle_id}/trips?from=2024-01-01&to=2024-01-31
GET    /v1/vehicles/{vehicle_id}/position    # latest GPS fix
POST   /v1/alerts
PATCH  /v1/drivers/{driver_id}
Key design points: OAuth2 / JWT with fleet-scoped tokens. Version in URL (/v1/). Cursor-based pagination for GPS event endpoints — offset pagination breaks on large tables. Rate limiting per customer, not per IP.
ASY

(C#) Explain async/await and where you'd apply it in a high-throughput telematics API.

// Blocking (bad — thread parked while DB responds)
public Vehicle GetVehicle(int id) {
    return _db.Vehicles.Find(id);
}

// Non-blocking (good — thread returns to pool during DB wait)
public async Task<Vehicle> GetVehicleAsync(int id) {
    return await _db.Vehicles.FindAsync(id);
}
async/await allows the thread to be returned to the thread pool while waiting for I/O. For a telematics API where an endpoint fetches 30 days of GPS events from PostgreSQL, making it async means a single server can handle thousands of concurrent fleet dashboards refreshing in parallel without running out of threads.
Frontend — TypeScript + React
MAP

A live fleet map re-renders too frequently as vehicle positions update. How do you optimise this?

// Bad — new array reference every render triggers re-render
const markers = vehicles.map(v => ({ lat: v.lat, lng: v.lng }));

// Good — memoised, only recalculates when vehicles changes
const markers = useMemo(
  () => vehicles.map(v => ({ lat: v.lat, lng: v.lng })),
  [vehicles]
);
Additional optimisations: Wrap the map component in React.memo(). Throttle WebSocket position updates — GPS devices ping every few seconds but the map doesn't need to re-render at the full telemetry rate. Consider mutating marker positions directly on the map instance (Mapbox GL JS / Leaflet) rather than updating React state.
Docker
IMG

Images, containers & layers — why layer order matters

A Docker image is a read-only layered filesystem snapshot. Each Dockerfile instruction creates a new layer. Docker caches each layer — the moment any layer changes, all subsequent layers are invalidated. This is why you copy dependency files and install them before copying source code.
# WRONG: cache busted every time source code changes
COPY . .
RUN npm ci     # reinstalls on every code change!

# CORRECT: deps layer cached unless package.json changes
COPY package*.json ./
RUN npm ci --production   # cached if deps unchanged
COPY . .                  # source invalidates cache only here
Interviewer questions
QWhat's the difference between CMD and ENTRYPOINT in a Dockerfile?
ENTRYPOINT defines the executable that always runs — can't be overridden without --entrypoint. CMD provides default arguments that can be replaced at runtime.

Common pattern: ENTRYPOINT ["node"] and CMD ["server.js"]. Running normally starts node server.js. Running with docker run myimage debug.js starts node debug.js — CMD was overridden. If you only use CMD, someone can override the entire command including the executable.
QContainerise a PHP backend service wired to a PostgreSQL database.
FROM php:8.3-fpm AS base
WORKDIR /app
RUN docker-php-ext-install pdo pdo_pgsql

FROM base AS deps
COPY --from=composer:2 /usr/bin/composer /usr/local/bin/composer
COPY composer.json composer.lock ./
RUN composer install --no-dev --optimize-autoloader

FROM base AS final
COPY --from=deps /app/vendor ./vendor
COPY . .
EXPOSE 9000

---
# docker-compose.yml
services:
  api:
    build: .
    environment:
      DB_HOST: postgres
      DB_PASS: ${DB_PASS}   # from .env — never hardcode
    depends_on:
      postgres:
        condition: service_healthy
  postgres:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app"]
      interval: 5s
Key points: Multi-stage builds reduce image size. Use depends_on with healthcheck — not just depends_on (which doesn't wait for readiness). Never embed secrets in the image.
Kubernetes
K8S

Cluster architecture — control plane, worker nodes, and rolling deployments

Control plane

API Server: Front door. All kubectl commands go here.
etcd: Stores all cluster state.
Scheduler: Picks which node a new pod runs on.
Controller Manager: Watches for divergence and fixes it.

Worker nodes

kubelet: Node agent. Manages pods on the node.
kube-proxy: Manages network rules for Services.
Container runtime: Actually runs containers (containerd).

Interviewer questions
QWhat's the difference between a Deployment and a StatefulSet?
A Deployment manages stateless pods — the pods are interchangeable replicas. If pod-1 dies, a replacement is scheduled and it doesn't matter that it has a different name or IP.

A StatefulSet is for stateful applications. Pods get predictable names (app-0, app-1, app-2), a stable network DNS hostname, and their own PersistentVolumeClaim that survives pod restarts. Critical for databases where a pod needs to know "I am replica number 1."

I'd use a Deployment for web API servers and workers. I'd use a StatefulSet for PostgreSQL, Redis, Kafka — anything that maintains its own state that must persist.
QExplain liveness vs readiness probes. What happens if each one fails?
Liveness probe: "Is this container still alive?" If it fails, Kubernetes kills and restarts the container. Use this to detect deadlocked or stuck processes.

Readiness probe: "Is this container ready to accept traffic?" If it fails, Kubernetes removes the pod from the Service's endpoints — traffic stops going to it, but the container is NOT restarted. Use this for startup warmup or temporarily overloaded pods.

Common mistake: Setting liveness probe thresholds too aggressively — Kubernetes will keep restarting a pod that just needs time to warm up, causing a crash loop. Use readiness for traffic control, liveness as a last resort for truly stuck processes.
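A probe spec worth being able to write from memory — the paths, port, and timings below are assumptions, but the shape (patient liveness, responsive readiness) is the point:

```yaml
livenessProbe:
  httpGet: {path: /healthz, port: 8080}
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 6      # ~1 min of failures before a restart
readinessProbe:
  httpGet: {path: /ready, port: 8080}
  periodSeconds: 5
  failureThreshold: 2      # pulled from traffic quickly, restored quickly
```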
QHow does Kubernetes handle rolling deployments and how do you minimise downtime?
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0   # never take a pod down before a new one is ready
    maxSurge: 1         # allow one extra pod above desired count during update
To minimise downtime: Configure readinessProbe properly — K8s only routes traffic to a pod once the probe passes. Handle SIGTERM gracefully — finish in-flight requests before exiting. Set terminationGracePeriodSeconds to match your longest expected request time. Use PodDisruptionBudgets to guarantee a minimum number of healthy pods.
Core concepts
CI

CI vs CD (Delivery) vs CD (Deployment)

Three stages, three different levels of automation

Continuous Integration

Developers merge code to main frequently. Each merge triggers automated build + test run. Goal: detect integration problems early.

Continuous Delivery

Every passing change can be released to production at any time — but requires a human to press the deploy button. Software is always releasable.

Continuous Deployment

Every passing change is automatically deployed to production without human intervention. Requires very high confidence in tests and monitoring.

Interviewer questions
QContinuous Delivery vs Continuous Deployment — which would you recommend for Cartrack?
Continuous Delivery means every build is ready to deploy, but a human makes the final decision. Continuous Deployment removes that gate — every passing build goes to production automatically.

For Cartrack, I'd start with Continuous Delivery. Fleet telematics is a mission-critical system — unexpected production changes can directly impact 125,000+ commercial customers tracking vehicles in real time. The manual gate also gives product managers visibility into what's going out. I'd graduate to Continuous Deployment as confidence in the test suite and monitoring grows, potentially starting with lower-risk services first.
GitLab CI in practice
YML

Full GitLab CI pipeline for a fleet tracking backend

stages: [lint, test, build, deploy]

lint-code:
  stage: lint
  image: php:8.3
  script:
    - composer run phpcs       # PSR-12 coding standards
    - composer run phpstan     # static analysis
  cache:
    paths: [vendor/]
    key: $CI_COMMIT_REF_SLUG

unit-tests:
  stage: test
  services: [postgres:16]    # real DB for tests
  variables: {POSTGRES_DB: test_db}
  script:
    - composer run phpunit -- --coverage-text   # "--" passes args through to phpunit
  coverage: '/Lines:\s+(\d+\.\d+)%/'

build-image:
  stage: build
  image: docker:24
  services: [docker:dind]
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

deploy-staging:
  stage: deploy
  script: [kubectl set image deployment/api api=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA]
  environment: staging
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

deploy-production:
  stage: deploy
  when: manual         # human approval gate
  environment: production
  script: [kubectl set image deployment/api api=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA]
  rules:
    - if: $CI_COMMIT_TAG    # only on tagged releases
Interviewer questions
QHow do you handle secrets in a CI/CD pipeline? What would you never do?
Secrets go in GitLab CI/CD Variables (Settings → CI/CD → Variables), marked as masked so they don't appear in job logs. They're injected as environment variables at runtime — never hardcoded in the YAML file.

For production deployments I'd integrate with a secrets manager like HashiCorp Vault or AWS Secrets Manager for rotation and audit trails. Variables scoped to specific environments prevent staging pipelines from accidentally accessing production credentials.

What I'd never do: Put a secret in a CI variable that isn't masked, echo a secret to the job log, store credentials in a Docker image layer (they can be extracted from the image history), or commit a .env file with secrets (even .gitignored — they end up in git history).
Scrum roles

Product Owner

  • Owns and prioritises the product backlog
  • Defines acceptance criteria for stories
  • Single voice of the customer
  • Decides what gets built and in what order
  • Approves completed work against the DoD

Scrum Master

  • Facilitates all Scrum ceremonies
  • Removes impediments and blockers
  • Protects team from scope creep
  • Coaches the team on Agile practices
  • No direct authority over developers

Development team

  • Self-organising and cross-functional
  • Estimates effort and owns sprint commitments
  • Pulls work from the sprint backlog
  • Collectively responsible for quality
  • Defines and upholds the DoD
Sprint ceremonies
Planning · 2–4 hrs

Sprint planning

Team selects items from the top of the prioritised backlog and commits to delivering them. PO clarifies requirements. Team estimates in story points, breaks stories into tasks. Output: a clear, committed sprint goal.

Daily · 15 min

Daily standup

Each member answers: what did I complete yesterday, what will I do today, any blockers? Goal is team coordination and early blocker detection — not a status report to management.

End of sprint · 1–2 hrs

Sprint review (demo)

Team demonstrates completed work to stakeholders. Only "done" work — that meets the DoD — is presented. Incomplete work is returned to the backlog. This is about the product.

After review · 1–1.5 hrs

Sprint retrospective

Team reflects on its own working process: what went well, what didn't, what specific actions to take next sprint. This is about the process — separate from the review. Output: actionable improvements with owners assigned.

Mid-sprint · ongoing

Backlog refinement

PO and team review upcoming items: clarify requirements, write acceptance criteria, split large epics, estimate effort. Goal: ensure the top of the backlog is always sprint-ready so planning meetings don't stall.

Interviewer questions
QWhat is the difference between the sprint review and the sprint retrospective?
The sprint review is about the product — the team demonstrates what was built to stakeholders, gets feedback, and adjusts the backlog. External people attend. The sprint retrospective is about the process — it's an internal team meeting to discuss how the team worked and identify specific improvements. Stakeholders don't attend retros.

Simple: review = "what did we build?", retro = "how did we work?" Mixing them up in an interview is a red flag — it suggests you haven't worked in Scrum.
QWhat's the Definition of Done and how does it differ from acceptance criteria?
The Definition of Done is the team's universal quality checklist — it applies to every story in every sprint. A typical DoD: code reviewed by at least one peer, all unit tests passing, CI pipeline green, deployed to staging, documentation updated. Until all of these are true, the story is not done regardless of whether the feature works.

Acceptance criteria are story-specific — they define what "correct" means for that particular feature. For a login story, AC might be: "given a valid email and password, the user is redirected to the dashboard; given an invalid password, an error is shown."

A story must meet both: AC confirms it does the right thing, DoD confirms it was done to the team's quality standard.
The 4-step framework
1
Acknowledge honestly
Say clearly and briefly that you haven't worked with that specific thing. Don't pad it. "I haven't used that directly" is better than "hmm, well, it depends, let me think..."
2
Bridge to what you know
Connect the unknown topic to something you do understand. "That's similar to X which I've used extensively. My understanding is that [concept] applies here because..."
3
Reason out loud
Even without the answer, show your thinking. "Based on what I know about [related area], I'd guess [logical inference]... but I'd want to verify that." Interviewers love people who reason systematically under uncertainty.
4
Show eagerness to learn
End forward: "That's something I haven't used yet but I'd pick it up quickly — could you tell me more about how your team uses it?" Turns a knowledge gap into genuine curiosity.
Ready-to-use response templates
S1

"Have you used [X technology]?" — when you haven't

Say this: "I haven't used [X] directly, but I'm familiar with the concept — it's similar to [Y] which I've worked with extensively. From what I understand, [X] solves [problem] by [approach]. I'd be comfortable picking it up quickly given my background in [related area]. Is it something your team uses heavily?"
The key is to immediately show related knowledge rather than just saying "no." You're demonstrating you can learn it — not pretending you already know it.
S2

"Explain how [deep concept] works" — you know it partially

Say this: "I'm not fully confident on all the internals of that. What I know is [related foundation]. Based on that, I'd reason that [logical inference] — but I'd want to verify that assumption rather than state it as fact. Am I in the right direction?"
Asking "am I in the right direction?" turns the question into a dialogue. The interviewer may correct you — which shows you're open to learning, not defensive.
S3

Complete blank — no starting point at all

Say this: "That's not something I've encountered yet, so I'd rather be upfront than guess. In practice, I'd check the official docs / spin up a quick test / ask a teammate who's used it. Could you give me a brief overview? I'm genuinely curious how it works."
Saying how you'd find the answer is almost as valuable as knowing the answer. It shows you're self-sufficient, resourceful, and not afraid to admit a gap.
S4

You know it but forgot the exact syntax

Say this: "I know exactly what it does and when to use it — [explain the concept and use case]. I can't recall the exact syntax right now, but in practice I'd check the docs. I find the important thing is understanding why a tool exists, not memorising its flags."
S5

The interviewer digs deeper than you can go

Say this: "I've reached the edge of what I know confidently here. I know [what you said], but beyond that I'd be speculating rather than speaking from experience. I'm curious — what does happen at that level? I'd like to understand it properly."
Knowing your own knowledge boundary is a sign of maturity. Interviewers often probe to find this boundary deliberately — they're testing self-awareness, not just depth.
Never say these things

What not to do

  • "I know this, let me think..." — then fill time with unrelated waffle
  • "It's basically the same as [completely wrong analogy]"
  • Giving a long confident-sounding answer that's actually incorrect
  • Saying "that's a great question!" — sounds hollow every time
  • Apologising excessively: "I'm so sorry, I should know this..."
  • Pretending a vague answer was complete

What to do instead

  • Be brief: "I haven't used that yet" — one sentence, then pivot
  • Show adjacent knowledge immediately after admitting the gap
  • Ask the interviewer to explain — turns a gap into a conversation
  • State your reasoning process even if you don't know the conclusion
  • Stay calm — a pause to think looks confident, not nervous
  • Be genuinely curious, not performatively so