Vibe coded my dumb hat app
Playing around with Rork, an AI tool that builds apps from text descriptions
Remember when I said I wanted to make an app that would tell me if a hat would fit my head? Well, I think I made one.
Technically it’s just a working prototype. But I made it in a couple minutes!!! Vibe coding!
Before I jumped into Rork, the AI tool that builds mobile apps based on text descriptions (“vibe coding”), I first asked ChatGPT how I should go about making this “will hat fit?” app. I don’t know how to code but I’d like to at least understand what kind of information is needed in order to build something like this.
ChatGPT’s response:
To create an app that takes a product URL (e.g., a hat) and compares it against your specific head shape to predict how good it would look, you’re essentially building a personalized visual AI style advisor. Here's a breakdown of what you'd need across five critical layers: technical infrastructure, user modeling, data ingestion, visual analysis, and output logic — followed by some strategic challenges to consider.
🔧 1. Core Technical Infrastructure
a. Front-End (User Interface)
Cross-platform app (React Native / Flutter)
Simple interface to:
Input product URLs
Upload/scan head images
View match results (visual + score/feedback)
b. Back-End (Processing Engine)
Python (e.g. FastAPI) or Node.js service
Image processing pipeline (OpenCV / PyTorch / TensorFlow)
Integration with external APIs (e.g. Shopify, Amazon, image scrapers)
c. Database
Store user head data/profile securely
Cache processed product images & scores
PostgreSQL / Firebase / MongoDB
🧠 2. User Modeling (Your “Head Profile”)
You’ll need a 3D model or key facial landmarks to analyze how hats fit/look on you.
Options:
Use TrueDepth front-facing camera data (iPhone)
Build a 3D face mesh from a 2D photo (MediaPipe + Meshroom or custom ML model)
Store:
Head shape classification (from earlier discussion)
Key proportions (forehead width, jawline, face length)
Angle of tilt, volume (hair), etc.
👉 Optional: Let users train their avatar by uploading multiple images (front, side, 3/4 view).
🌐 3. Product URL Parsing & Data Ingestion
Challenge: Product URLs vary widely. You need to extract:
Main product image
Metadata (e.g., title, brand, dimensions, material, type)
How:
Custom scraper or use APIs (where available)
Use BeautifulSoup/Puppeteer for HTML scraping
Use ML (CLIP or BLIP) to classify the hat type if product metadata is poor
Bonus: Automatically crop/normalize product image to isolate hat.
👁️ 4. Visual Simulation & Fit Evaluation
This is the hard but magical part. You’re trying to estimate aesthetic harmony + visual proportion fit.
Two potential methods:
A. Overlay Simulation
Superimpose the hat image on user’s head model
Use pose estimation to tilt/resize accordingly
Can use MediaPipe FaceMesh + Dlib for landmarks
B. Embedding-based Similarity
Use vision-language models (e.g., CLIP or DINO) to convert both hat and face into embeddings
Train a scoring model on crowd-rated “hat-face pair attractiveness”
Predict aesthetic compatibility (not just geometry)
👉 Output: A composite image + confidence score + explanation ("This wide brim softens your jawline").
🧾 5. Scoring Logic & Recommendation Layer
After simulation, present:
Fit score (based on alignment with head shape)
Style score (based on harmony with facial features)
Optionally: aesthetic classifier (e.g., “artsy”, “minimal”, “bold”)
CTA: Save to wardrobe / compare styles / try another
🚧 Strategic Challenges
1. How will you build ground truth?
You need training data: images of different people in different hats, labeled with style/fit ratings
Could crowdsource (via user feedback) or partner with stylists/fashion schools
2. How will you handle hats with no clear image angle or multiple variants?
Build logic to request user confirmation or filter based on pose availability
3. How personal do you want this to be?
Do you want to factor in cultural cues, wardrobe matching, or even personality traits? If so, you’re moving from "fit" to "fashion alignment" — a much richer, but also riskier, space.
Adjacent Inspiration
YouCam Makeup and ModiFace (L’Oréal) use similar tech for face+makeup overlays.
Zeekit (Walmart) and Amazon Virtual Try-On for clothes.
MetaHuman Creator (Unreal Engine) for 3D faces.
Do You Want Help Prototyping?
Would you prefer to:
Start lean with a simplified MVP using selfies and manual hat tagging?
Go full AI-first, training a matching model with real feedback data?
Explore a browser-based demo first to test demand and collect user head shapes?
Happy to help sketch an MVP roadmap or design a technical build plan. Just say the word.
This surprisingly made a lot of sense to me, a person who isn’t incredibly technical. I asked it to compare the lean MVP and the AI-first roadmaps out of curiosity. I was not actually planning on investing more than “dicking around” time on this educational journey.
Since I don’t have $2K of silly money, I filed this information away and went to a third option: Rork.
The free version of Rork gives you 5 messages a month. You technically only need one to get a functioning prototype up and running. I used a couple of messages to fix errors that were flagged. Within 3 messages I was able to play with the prototype, which is displayed next to the chat window. I could either interact with the “preview” in Rork’s platform or scan the QR code, which prompted me to download the Expo Go app and test it out on my phone.
Features included:
Create user profile
Manually enter/edit head measurements
Paste product URL
Receive a recommendation score
See saved items and recent analysis
Things that took me more time than Rork took to build it:
Finding a measuring tape
Creating the logo in Canva (I couldn’t decide what I wanted it to be), which wasn’t even necessary
While I was blown away, I noticed something strange. I pasted a URL for a linen baseball hat. Where did they get this North Face beret from?
So I used up one of my free messages.
Then it schooled me.
Looking at the current implementation in utils/hat-analyzer.ts, I can see that we're not actually scraping real metadata from URLs. Instead, we're using a mock implementation that generates random hat data based on a hash of the URL. This is why different URLs might produce similar results - the mock data generation isn't sophisticated enough to create truly unique analyses.
In a real production app, we would need to:
1. Actually fetch the product page
2. Parse the HTML to extract real product information
3. Use that information to generate an accurate analysis
However, since we're building a demo and don't have a real backend service to scrape websites (which would also have legal implications regarding scraping commercial sites), I should improve the mock data generation to at least create more varied and realistic-looking results.
I'll enhance the mock data generation to:
1. Create more varied hat types, sizes, and descriptions based on the URL
2. Make the fit notes more specific to the generated hat data
3. Ensure different URLs produce noticeably different analyses
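For the curious, here is a minimal sketch of what a hash-based mock like that might look like. This is not Rork's actual code from utils/hat-analyzer.ts. The function names, the brand and style lists, and the size range are all made up for illustration. The point is that the URL is only used as a seed for picking canned values, so the page itself is never read.

```typescript
// Hypothetical sketch only: none of these names or values come from
// the real utils/hat-analyzer.ts.
type MockHat = { brand: string; style: string; sizeCm: number };

// Assumed canned values the mock picks from.
const BRANDS = ["North Face", "Patagonia", "Carhartt", "Kangol"];
const STYLES = ["beret", "baseball cap", "bucket hat", "beanie"];

// Simple djb2-style string hash, kept as an unsigned 32-bit integer.
// The same input string always produces the same number.
function hashString(input: string): number {
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    hash = ((hash * 33) ^ input.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Builds "product data" from the URL text alone. Nothing is fetched,
// so the words in the URL have no real say in which hat comes back.
function mockHatFromUrl(url: string): MockHat {
  const h = hashString(url);
  return {
    brand: BRANDS[h % BRANDS.length],
    style: STYLES[Math.floor(h / BRANDS.length) % STYLES.length],
    sizeCm: 54 + (Math.floor(h / 16) % 9), // somewhere from 54 to 62 cm
  };
}
```

Under these assumptions, pasting the same URL twice gives the same hat every time, but a URL with "linen baseball hat" in it is just as likely to come back as a beret as anything else, which would explain my mystery North Face beret.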
Good to know in case you’re playing with a concept that’s scraping data from the internet. Don’t waste one of your messages!
Since it took me some time to find a measuring tape, I wondered if I could create a feature that provides an alternative way to get my head measurements. ChatGPT outlined a few options, including estimating them from a set of photos. So Rork got to work and added a “Photo Measurement” feature for me.


It’s not perfect. But I only spent $20 to upgrade to the “Junior Tier” paid subscription (you get 100 messages a month) so I could keep playing. They’re sneaky because they don’t tell you how many messages you’ve used and I saw some people complaining about it on Reddit.
As a non-coder, I can see why this is exciting. Getting to this point, even if it’s shoddy, would have taken me a long time and some initial investment. But I can also see why people shit on vibe coding. I have no idea what I’m doing! I don’t really know how to fix anything, besides using up 1 message to say “fix it please.”
And if I publish it to the app store, what next? Do I vibe maintain?
It’s already 1:46am and I finished my glass of sauvignon blanc. I want to try an idea that doesn’t depend on metadata scraped from the internet. So I start on a second app idea around remembering birthdays. A quick search tells me there are a handful of birthday countdown apps but they’re all ugly so here I go designing my own. Rork allows you to upload photos to your conversation so that’s what I’ll be trying next. How accurately can Rork pull from a mood board?
But now it’s time to vibe sleep.