JPMorganChase

Case Likelihood

Developed a Hybrid Transformer-based Risk-Ranking Model (downstream of Comment Monitoring) on behalf of Employee Relations to predict the likelihood that a Code of Conduct Incident raised by an employee will be investigated further (with potential action taken against the Subject). The hybrid model combines incident description embeddings with historical features of the subject and complainant, such as Allegation & Performance History, Salary Grade, LOB, and Manager Hierarchy Information.

Hybrid Modeling ~ 89% Case Recall

By combining incident description embeddings generated by a Sentence Transformer model with historical features and feeding them to an XGBoost Classification Head, we reach roughly 89% case recall when predicting case likelihood. We further tune this pipeline with an extensive Grid Search over hyperparameters.
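
The pieces described above can be sketched in plain Python. This is a minimal illustration, not the production pipeline: `hybrid_row`, `recall`, `expand_grid`, and `grid_search` are hypothetical helper names, and `fit_and_score` stands in for whatever routine trains the XGBoost classifier on the hybrid rows and returns validation recall.

```python
from itertools import product


def hybrid_row(embedding, tabular):
    """Concatenate a description embedding with numeric tabular features."""
    return list(embedding) + list(tabular)


def recall(y_true, y_pred):
    """Case recall: share of true cases (label 1) that the model flags."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def expand_grid(grid):
    """Yield every hyperparameter combination in the grid as a dict."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def grid_search(grid, fit_and_score):
    """Return (best_params, best_score) for a caller-supplied scorer.

    fit_and_score is a hypothetical callback: it would train the classifier
    with the given params and return a validation metric such as recall.
    """
    best_params, best_score = None, float("-inf")
    for params in expand_grid(grid):
        score = fit_and_score(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

A grid such as `{"max_depth": [3, 6], "learning_rate": [0.1, 0.3]}` (illustrative values only) expands to four candidate configurations, each scored once.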

Unstructured Data - Multiprocessing

Using Multiprocessing, we import the latest data from multiple systems, reading more than 1,000 Parquet file partitions across multiple CPU cores and building features from large unstructured data (more than a million records) in a few minutes.
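
A rough sketch of this fan-out pattern follows, assuming pandas with a Parquet engine is installed. The `subject_id` column and the per-subject incident count are hypothetical stand-ins for the real schema and features; only the structure (one worker call per partition, partial results merged at the end) is the point.

```python
from multiprocessing import Pool


def build_features(records):
    """Toy feature: count incident records per subject."""
    counts = {}
    for rec in records:
        subject = rec["subject_id"]  # hypothetical column name
        counts[subject] = counts.get(subject, 0) + 1
    return counts


def process_partition(path):
    """Read one Parquet partition and build features from it."""
    import pandas as pd  # assumption: pandas plus a Parquet engine

    frame = pd.read_parquet(path, columns=["subject_id"])
    return build_features(frame.to_dict("records"))


def merge_counts(partials):
    """Combine per-partition counts into one dictionary."""
    merged = {}
    for partial in partials:
        for subject, n in partial.items():
            merged[subject] = merged.get(subject, 0) + n
    return merged


def load_all(paths, workers=4):
    """Process partitions in parallel across CPU cores and merge results."""
    with Pool(processes=workers) as pool:
        partials = pool.imap_unordered(process_partition, paths, chunksize=16)
        return merge_counts(partials)
```

On platforms that start worker processes with spawn, `load_all` should be called from under an `if __name__ == "__main__":` guard.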

Real Time Inference

During inference, features are recreated for every incoming case item (subject, complainant, and incident description features only), and a Case Likelihood score is returned within ~1.87 seconds.

Path to Production - FastAPI

Deployed this model using a Jules Pipeline and FastAPI, with daily data refreshes. The API posts a Case Likelihood Score from 0 to 4 (lowest to highest) alongside the most impactful features/domains, based on SHAP local explainability.
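
One way the response payload could be assembled is sketched below. The source does not say how the model output maps to the 0-4 score, so the equal-width probability cutoffs here are an assumption, as are the function names and response keys; the SHAP values are taken as an already computed mapping of feature name to local contribution.

```python
def likelihood_band(probability, cutoffs=(0.2, 0.4, 0.6, 0.8)):
    """Map a probability in [0, 1] to a band from 0 to 4.

    The cutoffs are assumed for illustration, not taken from the model.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    # Each cutoff reached raises the band by one.
    return sum(probability >= c for c in cutoffs)


def top_features(shap_values, k=3):
    """Return the k feature names with the largest absolute SHAP value."""
    ranked = sorted(shap_values.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


def build_response(probability, shap_values, k=3):
    """Assemble a payload with the score band and most impactful features."""
    return {
        "case_likelihood": likelihood_band(probability),
        "top_features": top_features(shap_values, k),
    }
```

In a FastAPI service, a function like `build_response` would be called from the request handler after the model has produced a probability and local SHAP values for the incoming case.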
