A workflow for my second Kaggle competition
[ kaggle ]

August 15, 2025

Last week I completed the Advanced House Prices practice Kaggle competition. My next challenge is the NeurIPS - Open Polymer Prediction 2025.

Objective: Use text data about polymer structure to predict five metrics that determine the performance of a given polymer.

Goal: Place in the top half of the leaderboard

I looked through the tutorial notebook posted by the competition hosts, then used Claude to draft a workflow that will keep me on track. I expect the competition to take me some time, so I’ll post what I’ve learned at each major step. Check back soon to see where I am. The workflow is below:

Open Polymer Prediction Competition Workflow

Understanding the Problem

What does the data look like?

The input consists of SMILES strings. SMILES stands for Simplified Molecular Input Line Entry System. If you studied a science-related subject in college like I did, you may have taken organic chemistry. And if you took organic chemistry, you may have been exposed to the IUPAC system for chemical nomenclature. IUPAC is a standardized set of rules for naming chemical structures. SMILES follows the same idea, except its goal is to represent molecular structures as short lines of text that a computer can read.
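To make "computer-readable" concrete, here is a minimal sketch that splits a SMILES string into its tokens (atoms, branches, ring-closure digits). The regular expression is a simplified pattern I'm assuming for illustration, not a full SMILES grammar, and the function name tokenize_smiles is hypothetical. The second example string is also an assumption: a styrene-like repeat unit where "*" marks the points at which the unit would connect to its neighbors.

```python
import re

# Simplified token pattern (an assumption, not a complete SMILES grammar):
# bracket atoms, two-letter halogens, single-letter atoms,
# ring-closure digits, and common bond/branch symbols.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|[A-Za-z]|\d|[()=#*+\-/\\.%@:]")


def tokenize_smiles(smiles):
    """Split a SMILES string into a list of tokens."""
    return TOKEN_PATTERN.findall(smiles)


# Ethanol: two carbons and an oxygen in a chain.
ethanol = tokenize_smiles("CCO")

# Hypothetical polymer repeat unit; "*" marks attachment points.
styrene_unit = tokenize_smiles("*CC(*)c1ccccc1")

print(ethanol)
print(styrene_unit)
```

For real feature extraction, a chemistry toolkit that actually parses the molecule would be a better fit than a regex. This sketch only shows that a structure is stored as an ordinary string.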

What’s the evaluation metric?

Weighted Mean Absolute Error (wMAE)
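As a rough sketch of the idea, a weighted MAE computes the mean absolute error for each target separately and then combines those errors with a weight per target. The competition defines its own weights on its evaluation page. The weights, target names (prop_a, prop_b), and the normalization by the sum of weights below are placeholders I'm assuming purely for illustration.

```python
def weighted_mae(y_true, y_pred, weights):
    """Weighted mean absolute error across several targets.

    y_true, y_pred: dicts mapping target name -> list of floats.
    weights: dict mapping target name -> non-negative weight.
    """
    total = 0.0
    weight_sum = 0.0
    for name, weight in weights.items():
        # Plain MAE for this one target.
        errors = [abs(t - p) for t, p in zip(y_true[name], y_pred[name])]
        mae = sum(errors) / len(errors)
        total += weight * mae
        weight_sum += weight
    # Normalizing by the weight sum is an assumption of this sketch.
    return total / weight_sum


# Hypothetical targets and placeholder weights.
y_true = {"prop_a": [1.0, 2.0], "prop_b": [10.0, 20.0]}
y_pred = {"prop_a": [1.5, 2.5], "prop_b": [12.0, 16.0]}
weights = {"prop_a": 3.0, "prop_b": 1.0}

score = weighted_mae(y_true, y_pred, weights)
print(score)
```

Here prop_a has an MAE of 0.5 and prop_b has an MAE of 3.0, so the weights decide how much each one pulls on the final score.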

Baseline Implementation

Systematic Parameter Tuning

Model Architecture Exploration

Advanced Techniques