Renhao Xie demonstrated strong system design intuition and problem-solving abilities during our mock interview. His approach to designing an Ads Click Aggregation System showed an understanding of both streaming infrastructure and data consistency trade-offs.
Areas of strength:
- Clearly listed all functional requirements with good alignment to the business context.
- Thoughtful scalability estimation, including discussion of peak values and throughput.
- Prioritized consistency over availability, aligning well with the importance of accuracy in click aggregation systems.
- Presented a clear and correct high-level design.
- Proposed the use of Flink for streaming ingestion and OLAP systems for downstream querying and aggregation.
- Appropriately brought up reconciliation processes to guard against data loss, including asynchronous reconciliation strategies.
- Discussed sharding strategies based on ads_id to scale both the Flink job and the OLAP store.
- Proposed Redis-based deduplication with awareness of durability and clustering implications.
- Mentioned watermarking in Flink to handle late events and discussed Redis clustering with partitioning keys like <client_id, ads_id> and impression_id; a brief sketch of the watermarking and keyed aggregation follows this list.
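To make the watermarking and ads_id-keyed aggregation concrete, here is a minimal Flink DataStream sketch in Java. The ClickEvent POJO, in-memory source, one-minute window, and 30-second lateness bound are illustrative assumptions; the real job would read from the ingestion pipeline (e.g., Kafka) and write to the OLAP store rather than printing.

```java
import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.functions.AggregateFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class ClickAggregationJob {

    /** Minimal click event; real events would also carry impression_id, client_id, etc. */
    public static class ClickEvent {
        public String adsId;
        public String userId;
        public long eventTimeMillis;

        public ClickEvent() {}

        public ClickEvent(String adsId, String userId, long eventTimeMillis) {
            this.adsId = adsId;
            this.userId = userId;
            this.eventTimeMillis = eventTimeMillis;
        }
    }

    /** Counts clicks per window; the result would be written to the OLAP store. */
    public static class CountClicks implements AggregateFunction<ClickEvent, Long, Long> {
        @Override public Long createAccumulator() { return 0L; }
        @Override public Long add(ClickEvent e, Long acc) { return acc + 1; }
        @Override public Long getResult(Long acc) { return acc; }
        @Override public Long merge(Long a, Long b) { return a + b; }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder source for illustration; the real job would consume the ingestion topic.
        DataStream<ClickEvent> clicks = env.fromElements(
                new ClickEvent("ads-1", "user-1", System.currentTimeMillis()),
                new ClickEvent("ads-1", "user-2", System.currentTimeMillis()));

        clicks
                // Bounded-out-of-orderness watermarks let the job tolerate events up to 30s late.
                .assignTimestampsAndWatermarks(
                        WatermarkStrategy.<ClickEvent>forBoundedOutOfOrderness(Duration.ofSeconds(30))
                                .withTimestampAssigner((event, ts) -> event.eventTimeMillis))
                // Key by ads_id so aggregation state is sharded the same way as the OLAP store.
                .keyBy(e -> e.adsId)
                .window(TumblingEventTimeWindows.of(Time.minutes(1)))
                // Events later than the watermark but within this bound still update the window.
                .allowedLateness(Time.minutes(5))
                .aggregate(new CountClicks())
                .print(); // Stand-in for a sink to the OLAP system.

        env.execute("ads-click-aggregation-sketch");
    }
}
```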
Areas for improvement:
- Start by targeting a core non-functional requirement like accuracy (e.g., precision types, time sensitivity, or legal/privacy implications). This creates a clearer problem frame and can be expanded throughout the discussion.
- Explore the design implications of URL redirection (e.g., attribution and accuracy vs. complexity trade-offs).
- Develop a hot vs. cold data reconciliation strategy, especially for long-tail or delayed data.
- When discussing deduplication, mention the use of idempotency keys or hash-based strategies.
- For multi-device tracking, consider using composite keys like <user_id, ads_id> instead of relying solely on impression_id, so that deduplication aligns with the logical ad interaction rather than with device-specific impressions (a minimal sketch follows below).
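To illustrate an idempotency-style guard built on a composite <user_id, ads_id> key, here is a small sketch against Redis using SET NX EX. The key format, TTL, and the Jedis client are illustrative assumptions rather than part of the proposed design; in a Redis Cluster deployment the same key would also determine the partition (hash slot).

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class ClickDeduplicator {
    private final Jedis redis;
    private final long ttlSeconds;

    public ClickDeduplicator(Jedis redis, long ttlSeconds) {
        this.redis = redis;
        this.ttlSeconds = ttlSeconds;
    }

    /** Returns true only for the first click seen for this (user, ad) pair within the TTL. */
    public boolean isFirstClick(String userId, String adsId) {
        // Composite key ties dedup to the logical interaction, not a device-specific impression_id.
        String key = "dedup:" + userId + ":" + adsId;
        // SET NX EX is atomic: the first writer gets "OK"; later duplicates get null back.
        String reply = redis.set(key, "1", SetParams.setParams().nx().ex(ttlSeconds));
        return "OK".equals(reply);
    }

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            ClickDeduplicator dedup = new ClickDeduplicator(jedis, 3600);
            System.out.println(dedup.isFirstClick("user-42", "ads-7")); // true on the first call
            System.out.println(dedup.isFirstClick("user-42", "ads-7")); // false within the TTL
        }
    }
}
```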