As part of our final-year engineering capstone, we developed ParSight, an autonomous drone system designed to assist senior golfers by tracking golf balls in real time and hovering over their final position to provide an enhanced visual reference. The project addresses a growing accessibility issue in recreational sports—specifically age-related vision decline—by offering a real-time robotic solution that enhances visibility without interfering with gameplay.
ParSight follows a Sense–Plan–Act architecture. The sensing module uses a downward-facing RGB camera mounted on a quadrotor drone to continuously capture video at 30 FPS. Frames are processed onboard using an NVIDIA Jetson Nano running ROS2, with real-time detection achieved through HSV colour filtering and OpenCV-based contour detection. The system scores contours by size and circularity, isolating the red golf ball even in noisy environments.
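As a rough illustration of that detection pipeline, the sketch below thresholds a frame in HSV space, extracts contours with OpenCV, and scores them by circularity and size. The threshold values, scoring weights, and function name are placeholders for illustration, not the tuned ParSight parameters.

```python
import cv2
import numpy as np

def detect_red_ball(frame_bgr):
    """Illustrative HSV threshold + contour scoring for a red ball.
    Thresholds and weights are placeholders, not the tuned ParSight values."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)

    # Red wraps around the hue axis, so combine two hue ranges.
    mask = cv2.bitwise_or(
        cv2.inRange(hsv, (0, 120, 70), (10, 255, 255)),
        cv2.inRange(hsv, (170, 120, 70), (180, 255, 255)),
    )

    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    best, best_score = None, 0.0
    for c in contours:
        area = cv2.contourArea(c)
        perimeter = cv2.arcLength(c, True)
        if area < 20 or perimeter == 0:
            continue
        circularity = 4 * np.pi * area / (perimeter ** 2)  # 1.0 for a perfect circle
        score = circularity * np.log1p(area)                # favour large, round blobs
        if score > best_score:
            best, best_score = c, score

    if best is None:
        return None
    (u, v), radius = cv2.minEnclosingCircle(best)
    return (u, v), radius  # pixel centre and radius of the detected ball
```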
Tracking is handled through Image-Based Visual Servoing (IBVS), where the pixel offset between the detected ball and the center of the image is converted into a physical setpoint using a Proportional-Derivative (PD) controller. These setpoints are published to the drone’s flight controller, which adjusts motor output to maintain lock on the moving ball and ultimately hover over its final resting position.
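A minimal sketch of that PD mapping is shown below. The gains, image resolution, and pixel-to-metre scaling are assumed values; in the real system the resulting correction is published as a setpoint to the flight controller over ROS2.

```python
class BallTrackingPD:
    """Minimal PD loop mapping pixel error to a lateral position correction.
    Gains, image size, and the implied metres-per-pixel scale are illustrative only."""

    def __init__(self, kp=0.004, kd=0.001, img_w=640, img_h=480):
        self.kp, self.kd = kp, kd
        self.cx, self.cy = img_w / 2, img_h / 2
        self.prev_err = (0.0, 0.0)

    def update(self, ball_px, dt):
        # Pixel offset of the detected ball from the image centre.
        ex = ball_px[0] - self.cx
        ey = ball_px[1] - self.cy

        # PD terms convert pixel error into a body-frame position correction (metres).
        dx = self.kp * ex + self.kd * (ex - self.prev_err[0]) / dt
        dy = self.kp * ey + self.kd * (ey - self.prev_err[1]) / dt
        self.prev_err = (ex, ey)

        # In practice this correction would be packaged as a setpoint message
        # (e.g. a ROS2 pose) for the flight controller to track.
        return dx, dy
```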
ParSight was implemented on a Minion Mini H-Quad drone platform equipped with a Jetson Nano and Orange Cube+ flight controller. The system achieved 95% detection accuracy with only 3% false positives, maintained real-time inference at 30 Hz, and responded with a total system latency of ~350 ms. In MVP testing, the drone successfully hovered within 6 cm of the ball’s final location, even during dynamic launch scenarios using an RC car and catapult.
The project met 8 out of 10 key performance benchmarks, validating the effectiveness of lightweight, vision-based autonomy for assistive applications. Future improvements include incorporating higher frame-rate cameras, upgraded compute (e.g., Jetson Orin), and deploying more robust deep learning models like YOLOv8 with TensorRT for enhanced outdoor tracking.
My thesis presents a comprehensive framework for improving the reliability and quality of Laser Directed Energy Deposition (LDED), an advanced metal additive manufacturing process, by integrating high-speed thermal imaging with machine learning-based predictive modeling.
LDED enables the fabrication and repair of complex, high-value metal components but remains susceptible to defects due to the highly dynamic nature of the melt pool. To address this, I designed a high-throughput experimental setup that systematically varied over 360 combinations of laser power, scan speed, and powder feed rate, while capturing real-time melt pool dynamics using a 2000 fps infrared camera. This setup allowed me to analyze both in situ thermal behavior and post-process geometric features, offering a full-spectrum view of deposition quality.
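As a simple illustration of how such a full-factorial sweep can be enumerated, the snippet below builds a parameter grid; the specific levels shown are placeholders, not the values used in the experiments.

```python
from itertools import product

# Illustrative parameter levels; the actual values from the thesis are not listed here.
laser_power_W   = [200, 300, 400, 500, 600, 700]        # 6 levels
scan_speed_mm_s = [2, 4, 6, 8, 10, 12]                   # 6 levels
feed_rate_g_min = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]        # 10 levels

experiments = list(product(laser_power_W, scan_speed_mm_s, feed_rate_g_min))
print(len(experiments))  # 6 * 6 * 10 = 360 parameter combinations
```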
Dynamic features such as melt pool stability, morphology, and sputter density were extracted from the thermal videos, while static characteristics like track height, volume, and surface roughness were derived from 3D surface scans. Among all features, melt pool stability, quantified using the steady-state duration and coefficient of variation, emerged as the most reliable real-time predictor of print quality. Morphology and sputter density, while visually distinct, showed no consistent correlation with print quality and were excluded from further modeling.
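A minimal sketch of how such stability metrics can be computed from a per-frame melt pool area signal is shown below; the tolerance band and the exact definition of steady-state duration are assumptions for illustration, not the definitions used in the thesis.

```python
import numpy as np

def stability_metrics(melt_pool_area, tol=0.10):
    """Illustrative stability metrics from a per-frame melt pool area signal.
    'tol' (band around the mean counted as steady) is an assumed value."""
    area = np.asarray(melt_pool_area, dtype=float)

    mean = area.mean()
    cv = area.std() / mean                      # coefficient of variation

    # Fraction of frames where the area stays within +/- tol of the mean,
    # used here as a proxy for steady-state duration.
    steady_fraction = np.mean(np.abs(area - mean) <= tol * mean)

    return cv, steady_fraction
```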
I then built four regression models to predict key print outcomes, using both dynamic and static features as inputs. These models included:
Linear Regression
Decision Trees
Extra Trees Ensemble
Neural Networks
The models were evaluated on their ability to predict melt track height, melt pool area, melt pool stability, and a hybrid quality score combining stability and surface roughness. The hybrid quality score was predicted most accurately, with the neural network and tree-based models achieving R² values over 0.84, demonstrating that combining real-time and post-process features significantly improves predictive accuracy.
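A condensed sketch of this model comparison, using scikit-learn with default or placeholder hyperparameters rather than the tuned thesis settings, might look like the following.

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score

def compare_models(X, y):
    """Fit the four model families on the same feature matrix and report test R².
    Hyperparameters are defaults / placeholders, not the tuned thesis settings."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    models = {
        "Linear Regression": LinearRegression(),
        "Decision Tree":     DecisionTreeRegressor(random_state=0),
        "Extra Trees":       ExtraTreesRegressor(n_estimators=200, random_state=0),
        "Neural Network":    MLPRegressor(hidden_layer_sizes=(64, 64),
                                          max_iter=2000, random_state=0),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        scores[name] = r2_score(y_te, model.predict(X_te))
    return scores
```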
Key findings include:
Scan speed and feed rate were the most influential parameters; laser power had a comparatively minor effect.
The most stable prints occurred at high power and low scan speed combinations.
Dynamic metrics like melt pool stability serve as better early indicators of defects than traditional post-process geometry alone.
This work not only enables a deeper understanding of LDED physics but also lays the foundation for real-time quality control and closed-loop adaptive manufacturing. The integration of high-speed IR sensing and interpretable machine learning paves the way for smarter additive manufacturing systems, reducing trial-and-error and improving efficiency in aerospace, biomedical, and automotive applications.
IngrAIdients is an ongoing deep learning project aimed at revolutionizing meal preparation, dietary tracking, and nutrition management by identifying ingredients from images of prepared dishes. The goal is to provide users with a tool that allows them to upload a photo of a dish and instantly receive a breakdown of its ingredients. This information helps track nutrition, avoid allergens, and recreate dishes with ease.
Currently, we are utilizing large-scale datasets like Recipe1M+ and the Food Ingredients and Recipe Dataset for training the model. These datasets offer millions of food images and corresponding recipes, providing a solid foundation for accurate ingredient detection. Alongside this, we are actively building our own dataset by web scraping various food websites. This approach allows us to gather diverse images and ingredient lists, ensuring that the model performs well across a wide range of cuisines and dishes. By continuously expanding the dataset, we aim to keep the model adaptable to evolving food trends and underrepresented cuisines.
We are currently experimenting with multiple deep learning architectures to find the most effective solution for ingredient detection:
Convolutional Neural Networks (CNNs):
CNNs are being used as the backbone for image processing, extracting key features that represent the ingredients. We are testing architectures like ResNet-50 and VGG-16 to produce vector embeddings that capture the visual characteristics of the dish.
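A minimal sketch of such an embedding extractor, using a pretrained torchvision ResNet-50 with its classifier head removed and standard ImageNet preprocessing (not project-specific settings), is shown below.

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Illustrative embedding extractor; preprocessing uses the standard ImageNet statistics.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # drop the classifier, keep the 2048-d embedding
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def embed_dish(image_path):
    """Return a 2048-dimensional embedding for one dish photo."""
    img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return backbone(img).squeeze(0)   # shape: (2048,)
```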
Recurrent Neural Networks (RNNs):
To handle the sequential nature of ingredient lists, we are working with RNNs such as GRUs and bidirectional LSTMs. These models are designed to predict multiple ingredients from an image, allowing for accurate recognition even with complex dishes.
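A rough sketch of a GRU decoder conditioned on the dish embedding is shown below; the vocabulary size, hidden dimensions, and class name are placeholders, not an architecture we have settled on.

```python
import torch
import torch.nn as nn

class IngredientGRUDecoder(nn.Module):
    """Sketch of a GRU decoder that emits an ingredient sequence conditioned on a
    dish embedding. Vocabulary size and hidden dimensions are placeholders."""

    def __init__(self, vocab_size=5000, embed_dim=2048, hidden_dim=512, token_dim=256):
        super().__init__()
        self.init_hidden = nn.Linear(embed_dim, hidden_dim)   # image -> initial state
        self.token_embed = nn.Embedding(vocab_size, token_dim)
        self.gru = nn.GRU(token_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, image_embedding, ingredient_tokens):
        # image_embedding: (B, 2048); ingredient_tokens: (B, T) token ids
        h0 = torch.tanh(self.init_hidden(image_embedding)).unsqueeze(0)  # (1, B, H)
        x = self.token_embed(ingredient_tokens)                          # (B, T, E)
        out, _ = self.gru(x, h0)                                         # (B, T, H)
        return self.out(out)   # per-step logits over the ingredient vocabulary
```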
Transformer Models:
We are also exploring transformer architectures due to their superior ability to capture complex patterns in data. Transformers allow for a more flexible relationship between image features and ingredient labels, enhancing the model’s ability to predict ingredients accurately.
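As an illustration, the sketch below cross-attends from ingredient tokens to image patch features with a standard PyTorch transformer decoder; all dimensions and the class name are placeholders rather than a finalized design.

```python
import torch
import torch.nn as nn

class IngredientTransformer(nn.Module):
    """Sketch of a transformer decoder relating image patch features to
    ingredient tokens via cross-attention. All dimensions are placeholders."""

    def __init__(self, vocab_size=5000, d_model=256, nhead=8, num_layers=4):
        super().__init__()
        self.token_embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, patch_features, ingredient_tokens):
        # patch_features: (B, N, d_model) image features; ingredient_tokens: (B, T)
        tgt = self.token_embed(ingredient_tokens)
        T = tgt.size(1)
        # Causal mask so each position only attends to earlier ingredient tokens.
        causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
        decoded = self.decoder(tgt, patch_features, tgt_mask=causal)
        return self.out(decoded)   # logits over the ingredient vocabulary
```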
Joint Embedding Space:
We are working on creating a joint embedding space that aligns both the visual and textual representations of ingredients. This shared space helps the model better match the image features with ingredient lists, improving overall prediction accuracy.
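One common way to train such an alignment is an InfoNCE-style contrastive loss over matched image and ingredient-list embeddings; a minimal sketch is shown below, where the temperature value is an assumed default and not necessarily the objective we will ultimately use.

```python
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(image_emb, text_emb, temperature=0.07):
    """Sketch of an InfoNCE-style loss aligning image and ingredient-list
    embeddings in a shared space. The temperature is an assumed default."""
    # L2-normalise both modalities so the dot product is cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)   # (B, D)
    text_emb = F.normalize(text_emb, dim=-1)     # (B, D)

    logits = image_emb @ text_emb.t() / temperature                  # (B, B) similarities
    targets = torch.arange(image_emb.size(0), device=image_emb.device)  # diagonal pairs match

    # Symmetric cross-entropy: image-to-text and text-to-image retrieval.
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2
```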
IngrAIdients is actively evolving, with continuous testing and fine-tuning of neural networks to enhance ingredient detection. By leveraging a combination of existing datasets and custom data gathered through web scraping, we are ensuring that the model remains accurate, scalable, and adaptable to various cuisines and dishes. Our ongoing work demonstrates the potential for AI to transform how users interact with food, making meal planning and nutritional tracking easier and more intuitive.
During the Praxis 2 course, I worked on a project aimed at improving the lived experience of powered wheelchair users by helping them retrieve fallen objects. Initially, we were provided with a broad project scope and a request for a proposal from another team. After reviewing it, my team and I decided to reframe the opportunity to focus on mitigation solutions (solutions to retrieve dropped objects) rather than prevention.
We engaged closely with the stakeholders to refine the project’s objectives and criteria, ensuring our design would effectively address the specific needs. This phase of the project significantly developed my communication and teamwork skills as we collaborated with stakeholders and aligned on the best approach forward.
For the final design phase, our solution—called the “Helping Hand”—was selected after thorough exploration of multiple concepts. I led the technical design work, spending several hours refining the CAD model and creating animations to visualize the solution. I also produced detailed engineering drawings to help showcase the concept.
This project not only enhanced my CAD skills but also reinforced the importance of effective stakeholder communication and teamwork in engineering design.
In this design project, my team and I were tasked with developing two matboard bridge concepts: one was a box girder supported only at the ends, and the other incorporated intermediate support. The goal was to span a 950 mm valley with 30 mm × 100 mm supports at each end. We were responsible for delivering a comprehensive report, detailed calculations, and engineering drawings for both designs.
My group focused on performing the engineering calculations and creating the designs. We conducted extensive analyses to identify potential failure points and optimize the maximum load each bridge could carry. Using Autodesk Fusion, I developed CAD models for both bridges, simulated their load capacities, and refined the designs based on the results. I also created detailed engineering drawings to showcase each component.
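As a simplified example of the kind of hand calculation behind those analyses, the sketch below checks midspan flexural stress for a box girder under a central point load; the section dimensions, wall thickness, and allowable matboard stress are assumed values, not our actual design numbers.

```python
def box_girder_check(P_N, span_mm=950, b_mm=100, h_mm=75, t_mm=1.27,
                     sigma_allow_MPa=30.0):
    """Illustrative midspan bending check for a thin-walled box girder under a
    central point load. Section dimensions, wall thickness, and the matboard
    strength are assumed values."""
    # Maximum bending moment for a simply supported beam with a midspan point load.
    M_max = P_N * span_mm / 4.0                                            # N*mm

    # Second moment of area of a hollow rectangular (box) section.
    I = (b_mm * h_mm**3 - (b_mm - 2 * t_mm) * (h_mm - 2 * t_mm)**3) / 12.0  # mm^4

    # Flexural stress at the extreme fibre.
    sigma = M_max * (h_mm / 2.0) / I                                        # MPa (N/mm^2)
    return sigma, sigma <= sigma_allow_MPa
```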
This project helped me enhance my engineering analysis skills, deepen my understanding of structural calculations, and refine my CAD and simulation abilities using Autodesk Fusion.