Cartrack Tech Lead
Interview Prep
Everything in one place — must-have topics with Q&As, fleet domain knowledge, behavioural scenarios, and techniques for when you don't know the answer.
Tech Lead for a distributed team of 2–10 devs at Cartrack Singapore — a SaaS fleet telematics platform under Karooooo Ltd (NASDAQ: KARO) with 2.5M+ active subscribers across 20+ countries. ~80% leadership and coordination, ~20% hands-on coding. You own technical scope, timelines, code reviews, and cross-functional alignment with Product, Design, and QA.
- Team management (5+ yrs, 2–10 devs)
- Language & stack agnostic (15+ yrs)
- SQL — writing, optimising, DB design
- Linux — scripting, system management
- Git — branching, merging, workflows
- PHP / C# backend
- TypeScript + React frontend
- PostgreSQL query optimisation
- Docker / Kubernetes
- CI/CD pipelines (GitLab)
Leadership
Mentoring, delivery management, code review, removing blockers.
SQL & Databases
Query writing, EXPLAIN ANALYZE, index types, schema design, multi-tenancy, migrations.
Linux
System diagnostics, scripting, log management, service troubleshooting.
Git & Workflows
Branching strategies, rebase vs merge, hotfixes, PR standards, bisect.
Fleet / Telematics
GPS ingestion pipeline, geofencing at scale, driver scoring, multi-tenancy.
Scenarios
Incident response, scope negotiation, legacy codebases, questions to ask.
Docker / K8s
Images, layers, K8s architecture, Deployments vs StatefulSets, probes.
CI/CD & GitLab
Pipeline design, GitLab CI yaml, secrets, runners, environments.
Agile & Scrum
Roles, all five ceremonies, DoD vs acceptance criteria, Kanban vs Scrum.
If You Don't Know
4-step framework, ready-to-use phrases, what never to say.
Team Management & Leadership
The role is ~80% leadership. Expect deep probing here. Have 3–4 concrete STAR stories ready covering mentoring, delivery, conflict, and code quality.
Walk me through how you structured 1-on-1s and growth plans for junior developers.
Describe a situation where a developer was underperforming. How did you approach it?
How do you handle a team member who strongly disagrees with a technical decision you've made?
How do you balance giving autonomy to experienced devs while keeping visibility?
How do you define technical scope at the start of a project?
With less than 20% time for hands-on coding, how do you stay technically credible?
What does a constructive code review look like for you?
How do you coordinate between engineering, QA, and product when requirements change mid-sprint?
SQL & Database Design
Expect a practical component — live query writing or whiteboard schema design. Cartrack handles massive time-series GPS event data. Think partitioning, indexing strategy, and multi-tenancy.
JOINs — how they actually work
INNER, LEFT, RIGHT — and the algorithms behind them
SELECT e.name, d.dept_name, COUNT(p.id) AS projects
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.id
LEFT JOIN projects p ON e.id = p.owner_id
GROUP BY e.name, d.dept_name
HAVING COUNT(p.id) > 2;
-- LEFT JOIN keeps ALL rows from employees even with no match (NULLs)
-- INNER JOIN would silently drop employees with no dept or projects
CTEs and window functions
WITH clauses, RANK, LAG, LEAD, SUM OVER
-- Find top earner per department
WITH top_earners AS (
  SELECT *,
         RANK() OVER (
           PARTITION BY dept_id
           ORDER BY salary DESC
         ) AS rnk
  FROM employees
)
SELECT name, dept_id, salary
FROM top_earners
WHERE rnk = 1;

-- Rank each driver's worst speeding event per month
SELECT *
FROM (
  SELECT driver_id,
         DATE_TRUNC('month', event_time) AS month,
         speed - speed_limit AS overspeed,
         RANK() OVER (
           PARTITION BY driver_id, DATE_TRUNC('month', event_time)
           ORDER BY speed - speed_limit DESC
         ) AS rnk
  FROM speeding_events
) t
WHERE rnk = 1;
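To see the GROUP BY vs window-function contrast concretely, here is a small runnable sketch using SQLite (3.25+ for window functions) as a stand-in for Postgres; the toy employees data is invented for illustration:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employees (name TEXT, dept_id INT, salary INT)")
db.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("ana", 1, 100), ("ben", 1, 200), ("cara", 2, 300)],
)

# GROUP BY collapses to one row per department
grouped = db.execute(
    "SELECT dept_id, AVG(salary) FROM employees GROUP BY dept_id ORDER BY dept_id"
).fetchall()
print(grouped)   # [(1, 150.0), (2, 300.0)]

# The window function keeps every employee row, with the department
# average alongside for comparison
windowed = db.execute(
    "SELECT name, salary, AVG(salary) OVER (PARTITION BY dept_id) "
    "FROM employees ORDER BY name"
).fetchall()
print(windowed)  # [('ana', 100, 150.0), ('ben', 200, 150.0), ('cara', 300, 300.0)]
```

Same aggregate, two shapes: one row per group, or every row annotated with its group's value.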
AVG(salary) GROUP BY dept_id gives one row per department. AVG(salary) OVER (PARTITION BY dept_id) keeps every employee row but adds the department average alongside — letting you compare each employee to their department average in the same query. Window functions are perfect for rankings, running totals, moving averages, and comparing consecutive rows with LAG/LEAD.
EXPLAIN ANALYZE — reading query plans
How to identify slow queries and understand what Postgres is doing
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

-- BAD output: reading 3421 rows to find 12
-- Seq Scan on orders (cost=0.00..3421.00 rows=12)
--   (actual time=0.021..45.2 rows=12)

-- GOOD after index:
-- Index Scan using idx_orders_customer on orders
--   (actual time=0.018..0.031 rows=12)

-- Slow: function prevents index use
WHERE DATE_TRUNC('month', created_at) = '2024-01-01'

-- Fast: range scan uses index
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01'
I'd run EXPLAIN ANALYZE to see the actual execution plan. I look for three things: first, whether there's a Seq Scan on a large table — that usually means a missing index. Second, I compare estimated rows to actual rows — a large mismatch means the planner's statistics are stale, so I'd run ANALYZE. Third, I check which node is most expensive.
If it's a missing index, I'd create one on the filtered/joined column. I'm careful not to apply a function on an indexed column in WHERE — things like DATE_TRUNC() prevent the index from being used; rewrite those as range conditions. For complex queries I'd consider a materialised view to pre-compute expensive joins. Finally, I'd re-run EXPLAIN ANALYZE to confirm the improvement.
I'd also enable pg_stat_statements to find top slow queries without instrumenting every call manually.
SELECT vehicle_id,
ROUND(AVG(speed), 2) AS avg_speed
FROM gps_events
WHERE recorded_at >= NOW() - INTERVAL '30 days'
AND speed > 0
GROUP BY vehicle_id
ORDER BY avg_speed DESC
LIMIT 10;
Performance at scale: Partition gps_events by month on recorded_at — the 30-day filter hits only 1–2 partitions. Add a composite index on (vehicle_id, recorded_at). Consider a materialised view refreshed nightly for dashboards that don't need real-time data.
Index types — B-Tree, GIN, GiST, BRIN, partial, expression
How each one works and when to use it
B-Tree (default)
Balanced tree. Supports =, <, >, BETWEEN, LIKE 'prefix%'. Right choice for almost everything.
GIN
Generalised Inverted Index. For JSONB, arrays, full-text search. Stores each element separately.
GiST
Generalised Search Tree. For geometric types, range types. Used heavily with PostGIS for spatial queries.
BRIN
Block Range Index. Tiny footprint. Only useful for large, physically-ordered tables like append-only time-series.
-- Partial: only index what you query (smaller, faster)
CREATE INDEX idx_active_orders ON orders(status)
WHERE status != 'completed';

-- Covering: include extra cols → avoid heap lookup entirely
CREATE INDEX idx_orders_cover ON orders(customer_id)
INCLUDE (total, created_at);

-- GIN for JSONB / spatial with PostGIS for geofencing
CREATE INDEX idx_geofences_geom ON geofences USING GiST(boundary);
I'd use a partial index: CREATE INDEX ON orders(status) WHERE status = 'pending'. A regular index on status would be large because it indexes every row including millions of completed orders. A partial index only indexes the subset I actually query — it's smaller, fits in cache better, and is faster to scan.
Clustered index: Determines the physical order of rows in the table; one per table. Postgres has no persistent clustered index — the CLUSTER command reorders a table once but doesn't maintain that order.
Non-clustered index: A separate structure pointing back to the heap rows. Multiple allowed per table.
Avoid adding an index when: the table is write-heavy (every insert/update maintains all indexes), the column has low cardinality (e.g. a boolean flag), the table is small enough that a full scan is faster, or the column is rarely used in WHERE, JOIN, or ORDER BY.
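A quick way to sanity-check whether a query can use an index at all: inspect the plan. Sketched here with SQLite's EXPLAIN QUERY PLAN as a lightweight stand-in for Postgres's EXPLAIN (table and index names invented):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INT, status TEXT)")
db.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
db.executemany("INSERT INTO orders(customer_id, status) VALUES (?, ?)",
               [(i % 50, "completed") for i in range(500)])

def plan(sql):
    # Each plan row's last column is a human-readable step description
    return " | ".join(row[-1] for row in db.execute("EXPLAIN QUERY PLAN " + sql))

# Indexed column in WHERE: the planner can SEARCH via the index
print(plan("SELECT * FROM orders WHERE customer_id = 42"))

# Wrapping the column in a function defeats the index: full table SCAN
print(plan("SELECT * FROM orders WHERE abs(customer_id) = 42"))
```

The same function-on-column trap applies in Postgres, e.g. DATE_TRUNC() in a WHERE clause.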
Normalization — 1NF to 3NF + when to break the rules
For read-heavy denormalisation, use CREATE MATERIALIZED VIEW and refresh with REFRESH MATERIALIZED VIEW CONCURRENTLY to avoid locking readers.
Fleet schema design + multi-tenancy
Design the DB for a fleet management SaaS serving 125,000+ customers
-- Core entities
fleets   (id, customer_id, name)   -- tenancy boundary
vehicles (id, fleet_id, plate, make, model, status)
drivers  (id, fleet_id, name, licence_no, status)

-- Assignment (many-to-many with time periods)
driver_vehicle_assignments (id, driver_id, vehicle_id, assigned_at, unassigned_at)

-- Trips (reconstructed from GPS events)
trips (id, vehicle_id, driver_id, start_time, end_time, distance_km,
       start_lat, start_lng, end_lat, end_lng)

-- GPS events — PARTITION BY RANGE(recorded_at)
gps_events (id, vehicle_id, trip_id, recorded_at, lat, lng, speed, heading, odometer)

-- Alerts (JSONB for flexible payload)
alerts (id, vehicle_id, driver_id, alert_type, severity, triggered_at, resolved_at, payload jsonb)
1. Shared schema (discriminator column): All customers in same tables with a customer_id. Simple to operate. Risk: one bad query can leak cross-tenant data.
2. Shared DB, separate schemas: Each customer gets their own schema. Better isolation but migrations across 125k schemas are unmanageable.
3. Separate DBs per tenant group: Best isolation, enables per-region data residency. Most operationally complex.
At Cartrack's scale: Strategy 1 (shared schema) with customer_id enforced at the application layer and Row-Level Security (RLS) in PostgreSQL as a safety net. Partition by customer segment for the largest enterprise accounts.
Zero-downtime schema migrations — expand and contract
Step 1 — Expand: Add the new column/table alongside the old one. Deploy code that writes to both.
Step 2 — Backfill: Background job populates the new structure from existing data in batches, not one giant UPDATE.
Step 3 — Migrate reads: Deploy code that reads from the new structure.
Step 4 — Contract: Remove the old column/table and dual-write code.
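The expand and batched-backfill steps can be sketched like this (SQLite in place of Postgres; table names and the tiny batch size are invented for illustration — real batches would be far larger):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE vehicles (id INTEGER PRIMARY KEY, status TEXT)")
db.executemany("INSERT INTO vehicles(status) VALUES (?)",
               [("active",)] * 5 + [("retired",)] * 3)

# Step 1: expand - add the new column alongside the old one
db.execute("ALTER TABLE vehicles ADD COLUMN status_code INTEGER")

# Step 2: backfill in small batches, never one giant UPDATE
BATCH = 2
while True:
    ids = [r[0] for r in db.execute(
        "SELECT id FROM vehicles WHERE status_code IS NULL LIMIT ?", (BATCH,))]
    if not ids:
        break
    marks = ",".join("?" * len(ids))
    db.execute(
        f"UPDATE vehicles SET status_code = "
        f"CASE status WHEN 'active' THEN 1 ELSE 0 END WHERE id IN ({marks})",
        ids)
    db.commit()  # short transactions keep lock windows small

remaining = db.execute(
    "SELECT COUNT(*) FROM vehicles WHERE status_code IS NULL").fetchone()[0]
print(remaining)  # 0, backfill complete; reads can now migrate
```

Once reads are migrated and verified, the old column and the dual-write code can be dropped (contract).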
Also: use CREATE INDEX CONCURRENTLY (never table-locking), never rename columns directly (add new, migrate, drop old), and always test migration rollback before applying in production.
Transactions, MVCC & ACID
How Postgres handles concurrent reads and writes without locking everything
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT; -- both succeed together, or ROLLBACK to undo both

-- Monitor dead tuples
SELECT relname, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
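The all-or-nothing behaviour is easy to demonstrate (SQLite as a stand-in for Postgres; the accounts data is invented):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts "
           "(id INTEGER PRIMARY KEY, balance INT NOT NULL CHECK (balance >= 0))")
db.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 500), (2, 0)])

def transfer(amount):
    try:
        with db:  # opens a transaction; commits on success, rolls back on error
            db.execute("UPDATE accounts SET balance = balance - ? WHERE id = 1",
                       (amount,))
            db.execute("UPDATE accounts SET balance = balance + ? WHERE id = 2",
                       (amount,))
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK violated: the whole transaction rolled back

ok = transfer(100)        # both updates committed together
overdraft = transfer(9999)  # would go negative; neither update survives
balances = db.execute("SELECT balance FROM accounts ORDER BY id").fetchall()
print(ok, overdraft, balances)  # True False [(400,), (100,)]
```

The failed transfer leaves no partial state behind, which is exactly the atomicity guarantee the BEGIN/COMMIT example above relies on.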
Table partitioning
Range/list/hash partitioning and when it helps over just adding an index
With a filter like WHERE recorded_at BETWEEN '2024-01-01' AND '2024-03-31', Postgres scans only the 2024-Q1 partition instead of the full table.
CREATE TABLE gps_events (
  id BIGSERIAL,
  vehicle_id INT,
  recorded_at TIMESTAMPTZ NOT NULL,
  lat NUMERIC,
  lng NUMERIC,
  speed NUMERIC
) PARTITION BY RANGE(recorded_at);

CREATE TABLE gps_events_2024_q1 PARTITION OF gps_events
  FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');

-- Drop old data cheaply — instant, no vacuum needed
DROP TABLE gps_events_2023_q1;
Linux & Scripting
Expect practical scenario questions. Know your diagnostic commands cold. The platform runs on Linux servers handling continuous IoT data ingestion.
A production server is showing high CPU usage. Walk me through your diagnosis step-by-step.
# Step 1: Identify the offending process
top                          # sort by %CPU with 'P'
htop                         # more visual; filter by process name
ps aux --sort=-%cpu | head -10

# Step 2: Inspect the process
pidstat -u 1 5               # per-process CPU usage over 5 seconds
systemctl status php-fpm
journalctl -u php-fpm --since "10 minutes ago"

# Step 3: Identify runaway threads or zombie processes
ps -eLf | grep php           # list all threads
ps aux | grep Z              # zombie processes

# Step 4: Check for CPU-heavy system calls
strace -p <PID> -c           # summarise system calls

# Also check: cron jobs, sustained vs spike, I/O waits
A service fails silently after a restart. How do you investigate using systemd?
# Check current status and recent logs
systemctl status my-service
journalctl -u my-service -b        # logs since last boot
journalctl -u my-service -f        # follow live
journalctl -u my-service -n 100    # last 100 lines
systemctl cat my-service           # view unit file
systemctl list-dependencies my-service
Common silent-failure causes: a missing environment file (EnvironmentFile=), wrong file permissions, port already in use (ss -tlnp | grep :8080), or a dependency service not fully started.
Set up log rotation for a high-volume telemetry service.
# /etc/logrotate.d/telemetry
/var/log/telemetry/*.log {
    daily
    rotate 14
    compress
    delaycompress      # compress yesterday's log, not today's (safe for open handles)
    missingok
    notifempty
    sharedscripts
    postrotate
        systemctl reload telemetry-service
    endscript
}
Write a bash script that monitors a directory for new CSV files and logs results with timestamps.
#!/bin/bash
set -euo pipefail   # exit on error, unbound var, pipe fail

WATCH_DIR="/data/incoming"
LOG_FILE="/var/log/csv-processor.log"
DONE_DIR="/data/processed"
mkdir -p "$DONE_DIR"

log() { echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$LOG_FILE"; }

process_file() {
    local file="$1"
    local rows
    rows=$(tail -n +2 "$file" | wc -l)   # skip header
    log "Processed: $(basename "$file") | Rows: $rows"
    mv "$file" "$DONE_DIR/"
}

log "=== Watcher started ==="
inotifywait -m -e close_write "$WATCH_DIR" --format '%f' |
while read -r filename; do
    if [[ "$filename" == *.csv ]]; then
        process_file "$WATCH_DIR/$filename"
    fi
done
Key points: use set -euo pipefail for robust error handling. Use inotifywait instead of polling with sleep loops. Move processed files to avoid double-processing.
Extract all unique vehicle IDs from a large log file efficiently.
# Sample line: 2024-01-15 GPS vehicle_id=TRK-2041 lat=1.34 lng=103.82

# Option 1: grep + sort + uniq (portable)
grep -oP 'vehicle_id=\K[^\s]+' fleet.log | sort -u

# Option 2: awk (faster for very large files; the 3-arg match() needs gawk)
awk 'match($0, /vehicle_id=([^ ]+)/, a) {print a[1]}' fleet.log | sort -u

# For 10GB+ files: use LC_ALL=C sort for much faster ASCII sorting
grep -oP 'vehicle_id=\K[^\s]+' fleet.log | LC_ALL=C sort -u
Git & Collaborative Workflows
Cartrack uses GitLab. Know branching strategies, merge request workflows, and how to enforce standards across a distributed team.
How Git stores data
Commits, trees, blobs — snapshots not diffs
git reset: Moves the branch pointer backward, rewriting history.
--soft keeps changes staged, --mixed keeps them unstaged, --hard discards changes. Only use locally before pushing — breaks teammates' branches if used on shared branches.
git restore: For working directory and staging area only — doesn't touch commits. git restore file.txt discards uncommitted changes. git restore --staged file.txt unstages a file.
Git Flow vs Trunk-Based Development
Long-lived branches vs always-deployable main
Git Flow: Long-lived main, develop, release, and hotfix branches. Structured, but heavyweight; suited to versioned, scheduled releases rather than continuous deployment.
Trunk-Based Development: Everyone integrates to main frequently via short-lived feature branches (1–3 days max). Feature flags control in-progress work. Faster feedback, less merge pain.
For Cartrack (SaaS, continuous deployment): I'd recommend a lightweight trunk-based approach with short-lived feature branches and mandatory CI passing before merge. Feature flags let us deploy dark features safely. Git Flow adds ceremony without benefit when you're deploying to production multiple times a week.
Rebase: Replays your commits on top of another branch — creates a linear history. Rewrites commit hashes. Never rebase shared/public branches.
Team policy: For feature branches before opening a MR, rebase onto main to get a clean linear diff that's easier to review. For merging into main, use a merge or squash commit so history is preserved and rollback is clear.
git checkout v2.4.1                  # branch from the release tag
git checkout -b hotfix/gps-null-fix
git commit -m "fix: handle null GPS coordinates"
git checkout release/2.4
git merge hotfix/gps-null-fix
git tag v2.4.2
git checkout main
git cherry-pick <commit-hash>        # cherry-pick the fix, don't merge the whole branch
Key point: Cherry-pick the specific commit(s) onto main rather than merging the hotfix branch — you don't want stale release-branch code landing in main. Always tag the hotfix release.
How do you enforce commit and PR standards across a distributed team?
I'd adopt Conventional Commits: feat:, fix:, chore:, refactor:, docs: prefixes. This enables automated changelogs. I'd enforce it with a commit-msg hook (commitlint) that CI runs — the pipeline fails if the format is wrong. For MRs, I'd use a GitLab MR template with a checklist: what changed, how it was tested, screenshots for UI changes, and a link to the ticket. Branch protection rules require at least one approval and passing CI before merge.
How do you use git bisect to locate a regression?
git bisect start
git bisect bad               # current commit has the bug
git bisect good v2.3.0       # this commit was known-good
# Git checks out the midpoint. Test it, then:
git bisect bad               # or
git bisect good
# Automate with a test script (exit 0 = good, non-zero = bad)
git bisect run ./test_script.sh
git bisect reset             # clean up when done
Fleet Telematics & Cartrack Product
Most candidates won't know the domain. Showing you've thought about the product and its technical challenges signals initiative and seniority.
How would you design the data pipeline for ingesting real-time GPS telemetry from millions of IoT devices?
How would you design geofencing alerts — detecting when a vehicle enters or exits a zone — at low latency for millions of vehicles?
• Spatial indexing: Store geofences in PostgreSQL with the PostGIS extension. Use GiST indexes on the geometry column. A point-in-polygon check with proper indexing is fast even for millions of geofences.
• In-memory cache: Load each customer's active geofences into Redis keyed by geohash/grid cell. On incoming GPS event, look up only the geofences relevant to that grid cell.
• State tracking: Per vehicle, track "was inside geofence X at last ping?" — store this state to detect transitions (enter/exit) rather than re-evaluating absolute position on every event.
• Event-driven alerts: Transition events publish to an alert queue, which then triggers notifications (push, SMS, email) asynchronously.
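The state-tracking idea can be sketched in a few lines, with axis-aligned rectangular zones standing in for real PostGIS polygons (all names and coordinates invented):

```python
# fence_id -> (min_lat, min_lng, max_lat, max_lng); rectangles stand in
# for real polygon geometry
FENCES = {"depot": (1.30, 103.80, 1.35, 103.85)}

last_inside = {}  # (vehicle_id, fence_id) -> was the vehicle inside at last ping?

def process_ping(vehicle_id, lat, lng):
    """Return enter/exit transition events for one GPS ping."""
    events = []
    for fence_id, (lo_lat, lo_lng, hi_lat, hi_lng) in FENCES.items():
        inside = lo_lat <= lat <= hi_lat and lo_lng <= lng <= hi_lng
        key = (vehicle_id, fence_id)
        was_inside = last_inside.get(key, False)
        if inside and not was_inside:
            events.append(("enter", fence_id))
        elif was_inside and not inside:
            events.append(("exit", fence_id))
        last_inside[key] = inside  # store state for the next ping
    return events

first = process_ping("TRK-1", 1.32, 103.82)
second = process_ping("TRK-1", 1.33, 103.83)
third = process_ping("TRK-1", 1.40, 103.82)
print(first, second, third)
# [('enter', 'depot')] [] [('exit', 'depot')]
```

Only transitions produce events; a vehicle sitting inside a zone generates nothing, which is what keeps alert volume manageable at fleet scale.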
GPS data can arrive out of order or as duplicates. How do you handle this on ingestion?
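A minimal sketch of the idempotent-insert approach, using a device-generated unique event ID with ON CONFLICT DO NOTHING (SQLite 3.24+ as a stand-in for Postgres; the schema is invented):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE gps_events (
    event_id    TEXT PRIMARY KEY,   -- device-generated unique ID
    vehicle_id  TEXT,
    recorded_at TEXT,
    speed       REAL)""")

batch = [
    ("evt-1", "TRK-1", "2024-01-15T08:00:00Z", 62.0),
    ("evt-2", "TRK-1", "2024-01-15T08:00:05Z", 64.0),
    ("evt-1", "TRK-1", "2024-01-15T08:00:00Z", 62.0),  # duplicate delivery
]

# Duplicates are silently skipped instead of failing the whole batch;
# out-of-order arrival is harmless because recorded_at is stored per event
db.executemany(
    "INSERT INTO gps_events VALUES (?, ?, ?, ?) "
    "ON CONFLICT(event_id) DO NOTHING", batch)

stored = db.execute("SELECT COUNT(*) FROM gps_events").fetchone()[0]
print(stored)  # 2
```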
Make ingestion idempotent: insert with ON CONFLICT DO NOTHING or upsert on the event ID. A short-lived Redis set of recently seen event IDs can block duplicates before they even hit the database.
How would you build a driver behaviour scoring system?
Store raw events in a driving_events table. Nightly batch job computes the 30-day score and writes to a driver_scores summary table. Dashboards query the summary table — never aggregate raw events on every page load.
Situational Questions
Use STAR: Situation, Task, Action, Result. Have real examples ready — don't invent scenarios.
A critical bug causes incorrect GPS locations for 10,000 vehicles. You're the tech lead on call.
Product wants to ship a feature in 2 weeks that the team estimates at 6 weeks. How do you handle it?
Then I'd come back with options rather than a single answer: (a) Full feature in 6 weeks at current scope. (b) An MVP in 2 weeks that delivers the core value — here's specifically what we'd cut. (c) Full feature in 4 weeks if we add a developer or drop another priority. (d) Feature in 2 weeks but with known technical debt we address in the next sprint.
I'd make the tradeoffs explicit, let product make the call, and document whatever decision is made so there's shared accountability.
You inherit a legacy codebase with no documentation and high technical debt. What do you do?
What does the current tech stack look like, and where is the biggest technical debt?
What does success look like for this role in the first 90 days?
How do you balance shipping new features vs maintaining platform reliability?
PHP / C# & React / TypeScript
Frame your experience as stack-agnostic. Depth in one backend language and a modern frontend framework is enough — show breadth across 15 years to carry the language-agnostic requirement.
How would you structure a REST API for a fleet tracking service?
# Resource-oriented URLs
GET   /v1/fleets/{fleet_id}/vehicles
GET   /v1/vehicles/{vehicle_id}/trips?from=2024-01-01&to=2024-01-31
GET   /v1/vehicles/{vehicle_id}/position    # latest GPS fix
POST  /v1/alerts
PATCH /v1/drivers/{driver_id}
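For the event-listing endpoints, cursor (keyset) pagination might be sketched like this (SQLite as a stand-in; the schema is invented; Postgres supports the same row-value comparison natively):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE gps_events "
           "(id INTEGER PRIMARY KEY, vehicle_id TEXT, recorded_at TEXT)")
db.executemany("INSERT INTO gps_events(vehicle_id, recorded_at) VALUES (?, ?)",
               [("TRK-1", f"2024-01-15T08:00:{s:02d}Z") for s in range(10)])

def fetch_page(vehicle_id, cursor=None, limit=4):
    """cursor is the (recorded_at, id) of the last row on the previous page."""
    if cursor is None:
        cursor = ("", 0)  # sorts before any real timestamp
    rows = db.execute(
        "SELECT recorded_at, id FROM gps_events "
        "WHERE vehicle_id = ? AND (recorded_at, id) > (?, ?) "
        "ORDER BY recorded_at, id LIMIT ?",
        (vehicle_id, *cursor, limit)).fetchall()
    next_cursor = rows[-1] if rows else None
    return rows, next_cursor

page1, cur = fetch_page("TRK-1")
page2, cur = fetch_page("TRK-1", cur)
print(len(page1), len(page2))  # 4 4
```

Unlike OFFSET, each page is a cheap indexed range scan, and rows inserted between requests can't cause skips or duplicates.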
Version the API from day one (/v1/). Cursor-based pagination for GPS event endpoints — offset pagination breaks on large tables. Rate limiting per customer, not per IP.
(C#) Explain async/await and where you'd apply it in a high-throughput telematics API.
// Blocking (bad — thread parked while DB responds)
public Vehicle GetVehicle(int id) {
    return _db.Vehicles.Find(id);
}

// Non-blocking (good — thread returns to pool during DB wait)
public async Task<Vehicle> GetVehicleAsync(int id) {
    return await _db.Vehicles.FindAsync(id);
}
async/await allows the thread to be returned to the thread pool while waiting for I/O. For a telematics API where an endpoint fetches 30 days of GPS events from PostgreSQL, making it async means a single server can handle thousands of concurrent fleet dashboards refreshing in parallel without running out of threads.
A live fleet map re-renders too frequently as vehicle positions update. How do you optimise this?
// Bad — new array reference every render triggers re-render
const markers = vehicles.map(v => ({ lat: v.lat, lng: v.lng }));

// Good — memoised, only recalculates when vehicles changes
const markers = useMemo(
  () => vehicles.map(v => ({ lat: v.lat, lng: v.lng })),
  [vehicles]
);
Wrap marker components in React.memo(). Throttle WebSocket position updates — GPS devices ping every few seconds but the map doesn't need to re-render at the full telemetry rate. Consider mutating marker positions directly on the map instance (Mapbox GL JS / Leaflet) rather than updating React state.
Docker & Kubernetes
How containers work, how to build images properly, and how Kubernetes orchestrates them at scale.
Images, containers & layers — why layer order matters
# WRONG: cache busted every time source code changes
COPY . .
RUN npm ci              # reinstalls on every code change!

# CORRECT: deps layer cached unless package.json changes
COPY package*.json ./
RUN npm ci --production # cached if deps unchanged
COPY . .                # source invalidates cache only here
ENTRYPOINT defines the fixed executable; it can only be overridden with --entrypoint. CMD provides default arguments that can be replaced at runtime.
Common pattern: ENTRYPOINT ["node"] and CMD ["server.js"]. Running normally starts node server.js. Running with docker run myimage debug.js starts node debug.js — CMD was overridden. If you only use CMD, someone can override the entire command including the executable.
FROM php:8.3-fpm AS base
WORKDIR /app
RUN docker-php-ext-install pdo pdo_pgsql
FROM base AS deps
# composer isn't bundled in the php-fpm image — copy the binary in
COPY --from=composer:2 /usr/bin/composer /usr/bin/composer
COPY composer.json composer.lock ./
RUN composer install --no-dev --optimize-autoloader
FROM base AS final
COPY --from=deps /app/vendor ./vendor
COPY . .
EXPOSE 9000
---
# docker-compose.yml
services:
  api:
    build: .
    environment:
      DB_HOST: postgres
      DB_PASS: ${DB_PASS}   # from .env — never hardcode
    depends_on:
      postgres:
        condition: service_healthy
  postgres:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app"]
      interval: 5s
Key points: Multi-stage builds reduce image size. Use depends_on with healthcheck — not just depends_on (which doesn't wait for readiness). Never embed secrets in the image.
Cluster architecture — control plane, worker nodes, and rolling deployments
Control plane
API Server: Front door. All kubectl commands go here.
etcd: Stores all cluster state.
Scheduler: Picks which node a new pod runs on.
Controller Manager: Watches for divergence and fixes it.
Worker nodes
kubelet: Node agent. Manages pods on the node.
kube-proxy: Manages network rules for Services.
Container runtime: Actually runs containers (containerd).
A Deployment is for stateless applications — pods are interchangeable, and Kubernetes can create, kill, or replace them freely. A StatefulSet is for stateful applications. Pods get predictable names (app-0, app-1, app-2), a stable network DNS hostname, and their own PersistentVolumeClaim that survives pod restarts. Critical for databases where a pod needs to know "I am replica number 1."
I'd use a Deployment for web API servers and workers. I'd use a StatefulSet for PostgreSQL, Redis, Kafka — anything that maintains its own state that must persist.
Liveness probe: "Is this container still alive?" If it fails repeatedly, Kubernetes kills and restarts the container.
Readiness probe: "Is this container ready to accept traffic?" If it fails, Kubernetes removes the pod from the Service's endpoints — traffic stops going to it, but the container is NOT restarted. Use this for startup warmup or temporarily overloaded pods.
Common mistake: Setting liveness probe thresholds too aggressively — Kubernetes will keep restarting a pod that just needs time to warm up, causing a crash loop. Use readiness for traffic control, liveness as a last resort for truly stuck processes.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0   # never take a pod down before a new one is ready
    maxSurge: 1         # allow one extra pod above desired count during update
To minimise downtime: Configure readinessProbe properly — K8s only routes traffic to a pod once the probe passes. Handle SIGTERM gracefully — finish in-flight requests before exiting. Set terminationGracePeriodSeconds to match your longest expected request time. Use PodDisruptionBudgets to guarantee a minimum number of healthy pods.
CI/CD & GitLab Pipelines
How automated pipelines work, how GitLab CI is configured, and how to design pipelines that are fast, secure, and reliable.
CI vs CD (Delivery) vs CD (Deployment)
Three stages, three different levels of automation
Continuous Integration
Developers merge code to main frequently. Each merge triggers automated build + test run. Goal: detect integration problems early.
Continuous Delivery
Every passing change can be released to production at any time — but requires a human to press the deploy button. Software is always releasable.
Continuous Deployment
Every passing change is automatically deployed to production without human intervention. Requires very high confidence in tests and monitoring.
For Cartrack, I'd start with Continuous Delivery. Fleet telematics is a mission-critical system — unexpected production changes can directly impact 125,000+ commercial customers tracking vehicles in real time. The manual gate also gives product managers visibility into what's going out. I'd graduate to Continuous Deployment as confidence in the test suite and monitoring grows, potentially starting with lower-risk services first.
Full GitLab CI pipeline for a fleet tracking backend
stages: [lint, test, build, deploy]

lint-code:
  stage: lint
  image: php:8.3
  script:
    - composer run phpcs     # PSR-12 coding standards
    - composer run phpstan   # static analysis
  cache:
    paths: [vendor/]
    key: $CI_COMMIT_REF_SLUG

unit-tests:
  stage: test
  services: [postgres:16]    # real DB for tests
  variables: {POSTGRES_DB: test_db}
  script:
    - composer run phpunit --coverage-text
  coverage: '/Lines:\s+(\d+\.\d+)%/'

build-image:
  stage: build
  image: docker:24
  services: [docker:dind]
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

deploy-staging:
  stage: deploy
  script:
    - kubectl set image deployment/api api=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  environment: staging
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

deploy-production:
  stage: deploy
  when: manual               # human approval gate
  environment: production
  script:
    - kubectl set image deployment/api api=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  rules:
    - if: $CI_COMMIT_TAG     # only on tagged releases
For production deployments I'd integrate with a secrets manager like HashiCorp Vault or AWS Secrets Manager for rotation and audit trails. Variables scoped to specific environments prevent staging pipelines from accidentally accessing production credentials.
What I'd never do: Put a secret in a CI variable that isn't masked, echo a secret to the job log, store credentials in a Docker image layer (they can be extracted from the image history), or commit a .env file with secrets (even .gitignored — they end up in git history).
Agile & Scrum
How Scrum works in practice — roles, all five ceremonies, and the mechanics of sprint-based development. Often tested as a leadership question.
Product Owner
- Owns and prioritises the product backlog
- Defines acceptance criteria for stories
- Single voice of the customer
- Decides what gets built and in what order
- Approves completed work against the DoD
Scrum Master
- Facilitates all Scrum ceremonies
- Removes impediments and blockers
- Protects team from scope creep
- Coaches the team on Agile practices
- No direct authority over developers
Development team
- Self-organising and cross-functional
- Estimates effort and owns sprint commitments
- Pulls work from the sprint backlog
- Collectively responsible for quality
- Defines and upholds the DoD
Sprint planning
Team selects items from the top of the prioritised backlog and commits to delivering them. PO clarifies requirements. Team estimates in story points, breaks stories into tasks. Output: a clear, committed sprint goal.
Daily standup
Each member answers: what did I complete yesterday, what will I do today, any blockers? Goal is team coordination and early blocker detection — not a status report to management.
Sprint review (demo)
Team demonstrates completed work to stakeholders. Only "done" work — that meets the DoD — is presented. Incomplete work is returned to the backlog. This is about the product.
Sprint retrospective
Team reflects on its own working process: what went well, what didn't, what specific actions to take next sprint. This is about the process — separate from the review. Output: actionable improvements with owners assigned.
Backlog refinement
PO and team review upcoming items: clarify requirements, write acceptance criteria, split large epics, estimate effort. Goal: ensure the top of the backlog is always sprint-ready so planning meetings don't stall.
Simple: review = "what did we build?", retro = "how did we work?" Mixing them up in an interview is a red flag — it suggests you haven't worked in Scrum.
Acceptance criteria are story-specific — they define what "correct" means for that particular feature. For a login story, AC might be: "given a valid email and password, the user is redirected to the dashboard; given an invalid password, an error is shown."
A story must meet both: AC confirms it does the right thing, DoD confirms it was done to the team's quality standard.
When You Don't Know the Answer
Interviewers are testing honesty, thinking process, and whether you'd be safe to work with. Saying "I don't know" confidently and constructively signals high self-awareness. Bluffing, or delivering a confident answer that turns out wrong, signals poor judgment; both can be disqualifying.
"Have you used [X technology]?" — when you haven't
"Explain how [deep concept] works" — you know it partially
Complete blank — no starting point at all
You know it but forgot the exact syntax
The interviewer digs deeper than you can go
What not to do
- "I know this, let me think..." — then fill time with unrelated waffle
- "It's basically the same as [completely wrong analogy]"
- Giving a long confident-sounding answer that's actually incorrect
- Saying "that's a great question!" — sounds hollow every time
- Apologising excessively: "I'm so sorry, I should know this..."
- Pretending a vague answer was complete
What to do instead
- Be brief: "I haven't used that yet" — one sentence, then pivot
- Show adjacent knowledge immediately after admitting the gap
- Ask the interviewer to explain — turns a gap into a conversation
- State your reasoning process even if you don't know the conclusion
- Stay calm — a pause to think looks confident, not nervous
- Be genuinely curious, not performatively so