After Repair Value Prediction Model
Prediction Tool
This project builds a patent-pending model that predicts the After-Repair Value (ARV) of properties using data processing and machine learning.
- Data Sourcing: Utilizing real estate data from a remote SQL database, with the flexibility to also load from CSVs.
- Census Data Integration: Enhancing model accuracy by acquiring census tracts for each property via the Census Data API.
- Descriptive Text Analysis: Transforming descriptive text into quantifiable data using a Term Frequency-Inverse Document Frequency (TF-IDF) matrix, enriching our binary renovation classification model.
- Advanced Modeling: Employing LightGBM, a gradient-boosted decision tree framework, to predict ARVs.
- Innovation and Impact: A development currently under patent consideration, addressing a significant need in the real estate market for accurate, data-driven property valuation.
Note: The full codebase for this project is currently not publicly available due to pending patent approval. A detailed overview and selective code snippets are provided to illustrate the project's methodology and scope.
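To illustrate the Descriptive Text Analysis step, here is a minimal from-scratch sketch of how a TF-IDF matrix turns listing text into numbers. It is not the project's code: the example descriptions, the whitespace tokenizer, and the function name `tfidf_matrix` are all assumptions made for illustration, and a library vectorizer would likely be used in practice.

```python
import math
from collections import Counter

# Hypothetical listing descriptions; the project's real data is not public.
descriptions = [
    "fully renovated kitchen with new appliances",
    "needs work sold as is investor special",
    "renovated bathroom and new roof",
]


def tfidf_matrix(docs):
    """Return (vocab, rows), where rows[i][j] is the TF-IDF weight of vocab[j] in docs[i]."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: how many documents contain each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        row = []
        for term in vocab:
            tf = counts[term] / len(tokens)          # term frequency within this document
            idf = math.log(n_docs / doc_freq[term])  # rarer terms get a larger weight
            row.append(tf * idf)
        rows.append(row)
    return vocab, rows


vocab, matrix = tfidf_matrix(descriptions)
renovated_col = vocab.index("renovated")
for doc, row in zip(descriptions, matrix):
    print(f"{row[renovated_col]:.4f}  {doc}")
```

Each row of `matrix` is a numeric feature vector for one listing. A classifier such as the binary renovation model described above could take these vectors as input.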
Offer Evaluator Grid for Investor Deals
The Offer Evaluator Grid is a tool for real estate investors to calculate the Maximum Allowable Offer (MAO) for properties that need repairs. The grid adjusts property prices based on specific attributes and provides a straightforward interface for quick decision-making, helping investors determine offer prices efficiently. Comparative analysis, together with investor reviews, was used to set the percentage price adjustment for each property attribute.
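The attribute-based adjustment can be sketched roughly as follows. The attribute names, the percentages, and the choice to sum adjustments before applying them are hypothetical placeholders, not the values or logic of the actual grid.

```python
# Hypothetical adjustment grid: attribute -> percentage change applied to the base price.
ADJUSTMENTS_PCT = {
    "busy_street": -5.0,
    "no_parking": -3.0,
    "corner_lot": 2.0,
}


def max_allowable_offer(base_price, attributes, adjustments=ADJUSTMENTS_PCT):
    """Apply the summed percentage adjustments for the listed attributes to base_price.

    Assumes adjustments are additive; an attribute missing from the grid raises KeyError.
    """
    total_pct = sum(adjustments[attr] for attr in attributes)
    return round(base_price * (1 + total_pct / 100), 2)


# -5% for the busy street and +2% for the corner lot give a net -3%.
offer = max_allowable_offer(100_000, ["busy_street", "corner_lot"])
print(offer)
```

Keeping the percentages in a single lookup table mirrors the grid idea: updating one adjustment after investor feedback changes one entry, not the calculation.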
Baltimore Real Estate Investor Tracts
This Tableau presentation is the culmination of an extensive data analysis project on individual Maryland property records aggregated from Redfin and supplemented with other public sources such as Zillow (Zillow API, now defunct), government census data (Census Data API), and local crime metrics. The goal of the analysis was to quantitatively identify the census tracts in Maryland that were optimal for property flipping (green tracts) as well as tracts poorly suited to real estate investing, such as high-crime areas (red tracts) or affluent areas (yellow tracts). Thresholds were tuned and validated with the help of local investors. Metrics requested by investors are displayed on mouseover for each census tract. Cash flow estimates were calculated for each property by pulling rent estimates from Zillow's API, quantifying known hard costs (mortgage, taxes, insurance, HOA), and estimating likely soft costs (maintenance, capex, vacancy, and property management) using investor feedback.
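As a rough illustration of the cash flow estimate, the sketch below subtracts hard and soft costs from monthly rent. The soft-cost rates, the decision to express them as a fraction of rent, and all dollar figures are assumptions for the example and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class HardCosts:
    """Known monthly costs in dollars."""
    mortgage: float
    taxes: float
    insurance: float
    hoa: float = 0.0


# Hypothetical soft-cost rates, each expressed as a fraction of monthly rent.
SOFT_COST_RATES = {
    "maintenance": 0.05,
    "capex": 0.05,
    "vacancy": 0.08,
    "property_management": 0.10,
}


def monthly_cashflow(rent, hard, soft_rates=SOFT_COST_RATES):
    """Estimate monthly cash flow as rent minus hard costs minus rent-proportional soft costs."""
    hard_total = hard.mortgage + hard.taxes + hard.insurance + hard.hoa
    soft_total = rent * sum(soft_rates.values())
    return round(rent - hard_total - soft_total, 2)


# $1,500 rent, $910 hard costs, 28% of rent ($420) in soft costs.
cashflow = monthly_cashflow(1500.0, HardCosts(mortgage=700.0, taxes=150.0, insurance=60.0))
print(cashflow)
```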
Baltimore Citations Web Scraper
This web scraper loops through the property citations for each neighborhood in Baltimore City and stores them in an Excel file. The scraped results were used to conduct a 6-month marketing campaign to the investor owners as part of the wholesaling business that my partner and I conducted from September 2021 to February 2022. We found that owners of vacant properties with multiple citations were more likely to sell to investors. More information is available in the repo README file.
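The targeting rule suggested by that finding can be sketched as a simple filter over scraped rows. The record layout, the addresses, and the `min_citations=2` cutoff are hypothetical; the scraper's actual output format is described in the repo README.

```python
from collections import Counter

# Hypothetical scraped rows: one dict per citation.
citations = [
    {"address": "100 Example St", "vacant": True},
    {"address": "100 Example St", "vacant": True},
    {"address": "200 Sample Ave", "vacant": True},
    {"address": "300 Placeholder Rd", "vacant": False},
    {"address": "300 Placeholder Rd", "vacant": False},
]


def marketing_targets(rows, min_citations=2):
    """Return addresses of vacant properties with at least min_citations citations."""
    counts = Counter(row["address"] for row in rows if row["vacant"])
    return sorted(addr for addr, n in counts.items() if n >= min_citations)


targets = marketing_targets(citations)
print(targets)
```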
Kingdom Investors Club (Cohost)
My partner and I hosted the Kingdom Investors Club starting in January 2022. Together, we used his background in the property flipping business and my background in data analytics to build a framework of investing principles and document it in a curriculum format. We shared this framework with our local church community, where we presented our findings and supported them with complementary biblical principles.
Air Force Clustering Analysis Project
This is one of the few unclassified projects from my day job. I combined an Exploratory Factor Analysis with a Gaussian Mixture cluster model to group United States Air Force (USAF) survey responses and demographic data into clusters. These clusters allowed me to identify key patterns that led to significant changes in group morale across different USAF units. The Jupyter notebook is available via the "GitHub Repo" button below, and a recreation of the PowerPoint I presented to USAF leadership is available via the "PowerPoint" button. All data, code, and presentation results were scrubbed, scrambled, or removed to preserve privacy.
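The two-stage approach (factor analysis to reduce the survey items, then a Gaussian mixture to cluster respondents) might look roughly like the sketch below. It runs on synthetic data and uses scikit-learn's `FactorAnalysis` as a stand-in for the exploratory factor analysis. The library choice, the two-factor and two-cluster settings, and the data-generating setup are assumptions and do not reproduce the notebook.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for survey data: six items driven by two latent factors.
loadings = np.array([
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],  # items 0-2 load on factor 1
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],  # items 3-5 load on factor 2
])
latent_a = rng.normal(loc=[-1.5, 1.5], scale=0.5, size=(100, 2))  # respondent group A
latent_b = rng.normal(loc=[1.5, -1.5], scale=0.5, size=(100, 2))  # respondent group B
latent = np.vstack([latent_a, latent_b])
responses = latent @ loadings + rng.normal(scale=0.3, size=(200, 6))

# Step 1: standardize the items, then reduce them to two factor scores per respondent.
scaled = StandardScaler().fit_transform(responses)
factor_scores = FactorAnalysis(n_components=2, random_state=0).fit_transform(scaled)

# Step 2: fit a two-component Gaussian mixture on the factor scores and assign clusters.
gmm = GaussianMixture(n_components=2, random_state=0).fit(factor_scores)
labels = gmm.predict(factor_scores)

print(np.bincount(labels))
```

Clustering on factor scores instead of raw items keeps the mixture model low-dimensional, which tends to make the resulting clusters easier to interpret.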