AI-driven Silicon Prediction for Steel Manufacturing
TL;DR: During a 10-month research internship, I developed an Artificial Neural Network model to predict silicon impurity in hot metal for the IISCo Steel Plant. My work involved deep domain research, complex time-series data matching, and rigorous model benchmarking, resulting in a solution with 89.67% accuracy.
This project was completed at the Center of Excellence in Advanced Manufacturing Technology (CoEAMT) under the mentorship of Prof. Surjya K. Pal and Ms. Pooja Sarkar for their client, the IISCo Steel Plant in Burnpur.
Disclaimer: The data and findings presented are from a research project with the IISCo Steel Plant. Visuals are derived from my academic presentation to illustrate the project's methodology and outcomes.
1. The Business Challenge: The High Cost of Impurity
In steel manufacturing, controlling the silicon (Si) impurity in hot metal is critical. High silicon content leads to significant operational inefficiencies, including increased consumption of expensive resources like oxygen and lime, and higher production costs.
The goal for the IISCo plant was to move from reactive adjustments to proactive control. I was tasked with developing a predictive model to forecast silicon content, enabling the plant to optimize processes and reduce costs.

2. Understanding the Domain: The Metallurgy Behind the Model
Before diving into the data, the first step was to understand the complex chemical and physical processes inside a blast furnace. My research focused on answering a critical question: why is low silicon a priority? I found that high silicon content directly increases operational costs by requiring more oxygen and lime and by creating excess slag. This domain knowledge was essential for identifying the most influential variables in the dataset.

3. The Process: From Raw Data to a Predictive Engine
Data Matching, Cleaning & Feature Engineering
The first major hurdle was handling the raw plant data. A change in the input variables took 330 to 380 minutes to show up in the final output, so inputs and outputs recorded at the same moment did not describe the same material. I developed Python scripts to create uniform timestamps, match each output with the inputs from its lag window, and systematically clean the dataset. The result was a consistent, time-aligned dataset ready for modeling.
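A minimal sketch of how this kind of lag matching can be done with pandas. It is not the plant script itself: the column names (`timestamp`, `blast_temp`, `si_pct`), the rule of averaging all inputs inside the lag window, and the toy data are assumptions made for illustration. Only the 330 to 380 minute window comes from the project.

```python
import pandas as pd


def match_lagged_inputs(inputs, outputs, min_lag_min=330, max_lag_min=380):
    """Pair each output sample with the mean of the inputs in its lag window.

    For an output measured at time t, the window covers
    [t - max_lag_min, t - min_lag_min], both ends inclusive.
    Column names are hypothetical.
    """
    indexed = inputs.sort_values("timestamp").set_index("timestamp")
    rows = []
    for ts, si in zip(outputs["timestamp"], outputs["si_pct"]):
        start = ts - pd.Timedelta(minutes=max_lag_min)
        end = ts - pd.Timedelta(minutes=min_lag_min)
        window = indexed.loc[start:end]
        if window.empty:
            continue  # no inputs fall in the window, so drop this output
        row = window.mean().to_dict()
        row["timestamp"] = ts
        row["si_pct"] = si
        rows.append(row)
    return pd.DataFrame(rows)


# Toy data: one input reading every 10 minutes, two silicon measurements.
inputs = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01 00:00", periods=60, freq="10min"),
    "blast_temp": [float(i) for i in range(60)],
})
outputs = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-01 02:00", "2024-01-01 06:30"]),
    "si_pct": [0.62, 0.71],
})

matched = match_lagged_inputs(inputs, outputs)
print(matched)
```

In this toy run the 02:00 measurement is dropped because its lag window starts before the first input reading, while the 06:30 measurement is paired with the average of the inputs logged between 00:10 and 01:00. Averaging over the window is only one option; picking the single nearest reading at a fixed lag would be an equally plausible design.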



Model Benchmarking
With a clean dataset, I benchmarked several machine learning algorithms to find the most effective model for this regression task. My evaluation included Partial Least Squares (PLS) Regression, Random Forest, XGBoost, and an Artificial Neural Network (ANN). This methodical approach ensured that the final model choice was backed by comparative performance data.
4. The Outcome: An 89.67% Accurate Prediction
The final Artificial Neural Network model, built with Keras, consistently delivered the best performance, significantly outperforming other benchmarked models as shown in the comparison.
Trained on 49 distinct process parameters, the model achieved an **accuracy of 89.67%** (based on Mean Absolute Error). This provides a powerful tool for the plant's process control team to proactively optimize raw material mix, reduce resource consumption, and enhance blast furnace efficiency.
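For readers wondering how an error metric turns into a percentage: this write-up does not spell out the exact conversion used, so the snippet below is only one common convention, shown as an assumption. It treats accuracy as 100 × (1 − MAE / mean of the actual values). The function names and the sample silicon values are hypothetical.

```python
def mae(actual, predicted):
    """Mean Absolute Error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def accuracy_from_mae(actual, predicted):
    """Assumed convention: 100 * (1 - MAE / mean(actual)).

    This is NOT confirmed to be the formula behind the reported 89.67%.
    """
    mean_actual = sum(actual) / len(actual)
    return 100.0 * (1.0 - mae(actual, predicted) / mean_actual)


# Hypothetical silicon contents (in %) and model predictions.
actual = [0.50, 0.60, 0.70, 0.80]
predicted = [0.55, 0.55, 0.75, 0.75]

score = accuracy_from_mae(actual, predicted)
print(f"MAE = {mae(actual, predicted):.3f}, accuracy = {score:.2f}%")
```

With these toy values every prediction is off by 0.05 against a mean of 0.65, which gives an accuracy of about 92.3% under this convention. Other normalizations (for example, dividing by the range of the target) would give a different percentage from the same MAE.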

