Rafael demonstrated strong system design skills and a clear, structured approach throughout the mock interview, which focused on designing a ChatGPT-style system. The session reflected not only technical depth but also an ability to think proactively and communicate effectively under time constraints.
Areas of strength included:
- Crystal-clear articulation of requirements, both functional and non-functional. The inclusion of numeric metrics and scenario-driven thinking for NFRs was especially strong.
- Quickly identified and applied sound assumptions, such as using a priority queue for LLM inference requests, which set a solid foundation for downstream design decisions.
- Excellent time allocation: requirements and core entities were completed efficiently within the first 8 minutes, enabling deeper discussions later.
- Thoughtfully modeled snapshot operations in phases: asynchronous message preparation (copying) when a message is still editable, versus storing a message pointer when the message is immutable.
- Designed a search component supporting both semantic and keyword search over snapshots and messages.
- Provided a step-by-step walkthrough of the high-level architecture — clear, logical, and well-scoped.
- Maintained a strong, self-directed pace, keeping the conversation focused and structured.
Areas for improvement:
- Instead of directly referencing how ChatGPT works, proactively propose alternative solutions and perform trade-off analysis.
- Get familiar with prompt assembly: system + instructions + citations + user prompt + snippets.
- When you type something in the ChatGPT UI and hit send, the frontend sends just your new message (plus some lightweight metadata such as model selection, stream flag, conversation id) to the OpenAI API.
- The backend service (the chat orchestration layer) is responsible for:
- Fetching previous turns in the conversation,
- Merging them into a single messages[] array (system + user + assistant turns),
- Adding any hidden instructions (safety rules, memory, tool-calling context, etc.),
- Passing that full prompt to the model.
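The orchestration steps above can be sketched in a few lines. This is a hypothetical illustration, not OpenAI's actual implementation: `assemble_messages` and `HIDDEN_INSTRUCTIONS` are invented names, and the history is assumed to already be an ordered list of role/content dicts.

```python
# Hypothetical sketch of the chat orchestration layer's prompt assembly.
HIDDEN_INSTRUCTIONS = "Follow safety rules. Use available tools when helpful."

def assemble_messages(conversation_history, new_user_message):
    """Merge hidden instructions, prior turns, and the new user turn
    into the single messages[] array passed to the model."""
    messages = [{"role": "system", "content": HIDDEN_INSTRUCTIONS}]
    # Previous turns are alternating user/assistant entries, already in order.
    messages.extend(conversation_history)
    messages.append({"role": "user", "content": new_user_message})
    return messages

history = [
    {"role": "user", "content": "What is SSE?"},
    {"role": "assistant", "content": "Server-Sent Events is a one-way streaming protocol."},
]
prompt = assemble_messages(history, "How does it differ from WebSockets?")
```

Note the design point this makes concrete: the frontend only ever sends the new message; the full prompt exists only server-side.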
- Get familiar with fault tolerance for the client–backend connection:
- Option A (SSE): the service emits `text/event-stream` frames (`event: token\ndata: "..."\n\n`) until `event: done`.
- Option B (fetch streaming): the service sends raw chunked text and the client concatenates the chunks.
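For Option A, it helps to know the exact wire format. A minimal frame formatter, following the `text/event-stream` conventions (`id:`, `event:`, `data:` lines, blank line terminator) — the function name is an assumption for illustration:

```python
def sse_frame(event, data, event_id=None):
    """Format one text/event-stream frame; an optional id enables
    Last-Event-ID-based resumption on reconnect."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    lines.append(f"data: {data}")
    return "\n".join(lines) + "\n\n"  # blank line terminates the frame
```

The server would emit `sse_frame("token", ...)` repeatedly and finish with `sse_frame("done", "")`.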
- Backpressure:
- Server writes small SSE frames (e.g., every 20–50 tokens) to avoid Nagle/buffering.
- If client is slow, coalesce frames or drop to tail-only mode (UI stays responsive).
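The coalescing idea can be sketched as a small buffer that flushes one frame per batch of tokens. This is a simplified model under the assumption that `frames` stands in for the network write; a real server would also flush on a timer so slow generation does not stall the UI:

```python
class CoalescingStream:
    """Buffer tokens and emit one frame per `batch` tokens,
    reducing per-frame overhead when the client is slow."""
    def __init__(self, batch=20):
        self.batch = batch
        self.buffer = []
        self.frames = []  # stand-in for writes to the SSE connection

    def push(self, token):
        self.buffer.append(token)
        if len(self.buffer) >= self.batch:
            self.flush()

    def flush(self):
        if self.buffer:
            self.frames.append("".join(self.buffer))
            self.buffer.clear()

s = CoalescingStream(batch=20)
for t in ["tok"] * 45:
    s.push(t)
s.flush()  # emit the trailing partial batch
```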
- Cancellation:
- Client-side Cancel closes the SSE connection / aborts the fetch; the server then cancels the upstream provider request.
- Idempotent cleanup: mark turn as cancelled and persist partial transcript.
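Idempotent cleanup can be sketched as below. The `store` dict stands in for a persistence layer, and `cancel_turn` is a hypothetical handler name; the key property is that calling it twice (e.g. on a racing abort and timeout) is a no-op the second time:

```python
def cancel_turn(store, turn_id, partial_text):
    """Mark the turn cancelled and persist the partial transcript.
    Safe to call more than once (idempotent)."""
    turn = store.setdefault(turn_id, {"status": "streaming", "text": ""})
    if turn["status"] == "cancelled":
        return turn  # already cleaned up: second call changes nothing
    turn["status"] = "cancelled"
    turn["text"] = partial_text
    # a real server would also abort the upstream provider request here
    return turn

store = {}
cancel_turn(store, "turn-1", "The answer so far is")
cancel_turn(store, "turn-1", "different text")  # no-op: transcript unchanged
```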
- Retries:
- UI auto-retry on transient network errors with jitter (max 1–2 retries).
- SSE can use Last-Event-ID to resume from last token id (optional).
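The retry bullet can be sketched as follows. `retry_with_jitter` and `flaky` are illustrative names; "full jitter" over exponential backoff is one common jitter scheme, assumed here rather than prescribed by the design:

```python
import random
import time

def retry_with_jitter(fn, max_retries=2, base_delay=0.2):
    """Call fn, retrying up to max_retries times on transient
    connection errors, sleeping a random ("full jitter") fraction
    of an exponentially growing backoff between attempts."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the UI
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

result = retry_with_jitter(flaky, max_retries=2, base_delay=0.01)
```

On reconnect, the client could additionally send the `Last-Event-ID` header so the server resumes from the last delivered token id instead of replaying the whole turn.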