JPMorganChase
Case Likelihood
Developed a hybrid Transformer-based risk-ranking model (downstream of Comment Monitoring) on behalf of Employee Relations to predict the likelihood that a Code of Conduct incident raised by an employee will be investigated further (with potential action taken against the subject). The hybrid model combines incident description embeddings with historical features of the subject and complainant, such as allegation and performance history, salary grade, LOB, and manager hierarchy information.
-
Hybrid Modeling ~ 89% Case Recall
Combining incident description embeddings from a Sentence Transformer model with historical features as input to an XGBoost classification head, we achieve about 89% case recall. We further tune this pipeline with an extensive grid search to select the best hyperparameters.
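The feature assembly step described above can be sketched as follows. This is a minimal illustration using only the standard library, not the production code: the vocabulary `LOB_VOCAB`, the feature names `prior_allegations` and `salary_grade`, and the helper `build_hybrid_row` are all hypothetical, and the three-element embedding stands in for a real Sentence Transformer output.

```python
from typing import Dict, List, Sequence

# Hypothetical categorical vocabulary; the real LOB values are not given here.
LOB_VOCAB = ["LOB_A", "LOB_B", "LOB_C"]


def one_hot(value: str, vocab: Sequence[str]) -> List[float]:
    """One-hot encode a categorical value; unknown values map to all zeros."""
    return [1.0 if value == v else 0.0 for v in vocab]


def build_hybrid_row(
    embedding: Sequence[float],
    history: Dict[str, float],
    lob: str,
) -> List[float]:
    """Concatenate a description embedding with tabular features into one row."""
    # Sort keys so numeric columns keep the same order at training and inference.
    numeric = [float(history[name]) for name in sorted(history)]
    return list(embedding) + numeric + one_hot(lob, LOB_VOCAB)


row = build_hybrid_row(
    embedding=[0.12, -0.40, 0.33],  # stand-in for a sentence embedding
    history={"prior_allegations": 2, "salary_grade": 5},  # assumed feature names
    lob="LOB_B",
)
print(len(row))  # 3 embedding dims + 2 numeric + 3 one-hot columns
```

Rows built this way would then form the input matrix for the classification head; keeping the column order fixed is what lets the same function serve both training and inference.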
-
Unstructured Data - Multiprocessing
Using multiprocessing, we import the latest data from multiple systems, reading more than 1,000 Parquet file partitions across multiple CPU cores and building features from large unstructured data (more than a million records) within a few minutes.
-
Real Time Inference
Features are recreated for every incoming case item (subject, complainant, and incident description features only) at inference time, producing a Case Likelihood score within about 1.87 seconds.
-
Path to Production - FastAPI
Deployed this model using Jules Pipeline and FastAPI, with daily data refreshes. The API posts a Case Likelihood score from 0 to 4 (lowest to highest) alongside the most impactful features/domains, based on SHAP local explainability.
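One way the response described above could be assembled is sketched below. It rests on stated assumptions rather than the deployed logic: the equal-width binning of a probability into the 0 to 4 score is assumed (the actual mapping is not described), the per-feature contributions are hard-coded stand-ins for SHAP local values, and the names `likelihood_to_score`, `top_features`, and `build_response` are hypothetical.

```python
from typing import Dict, List, Tuple


def likelihood_to_score(probability: float, n_levels: int = 5) -> int:
    """Map a probability in [0, 1] to an integer score in 0..n_levels-1.

    Assumes equal-width bins; the top edge (1.0) falls in the highest level.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return min(int(probability * n_levels), n_levels - 1)


def top_features(
    contributions: Dict[str, float], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k features with the largest absolute contribution."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]


def build_response(
    probability: float, contributions: Dict[str, float], k: int = 3
) -> dict:
    """Bundle the score and its most impactful features into one payload."""
    return {
        "case_likelihood_score": likelihood_to_score(probability),
        "top_features": [
            {"feature": name, "contribution": value}
            for name, value in top_features(contributions, k)
        ],
    }


# Stand-in values; real contributions would come from a SHAP explainer.
example = build_response(
    probability=0.9,
    contributions={"prior_allegations": 0.31, "salary_grade": -0.05, "lob": 0.12},
    k=2,
)
print(example)
```

Ranking by absolute value keeps features that push the score down as visible as those that push it up, which is usually what a reviewer reading a local explanation wants to see.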