Real-time feature computation, which calculates features from raw data on demand, is a crucial component in the machine learning (ML) application process. These real-time features are vital for various real-world ML applications, such as anti-fraud management, risk control, and personalized recommendations. In these cases, low latency (milliseconds) in computing fresh data features is crucial for accurate and high-quality online inference.
As illustrated in the accompanying figure, a data scientist typically begins an ML application by developing feature computation scripts (for example, using Python or SparkSQL) for offline training. However, these scripts cannot meet the demands of online serving, including low latency, high throughput, and high availability. Hence, it is necessary to transform these scripts into performance-optimized code (for example, using C++) that can be developed by an engineering team with system and production knowledge. This transformation process is time-consuming and requires significant development, deployment, and double system maintenance efforts.
No entries found